Imagine that I have a nice Deck class, in the best OO fashion. It has Cards, which have a Suit and a Rank, it has a Shuffle method, and so on. Now, I'm going to have a lot of concurrent instances of Deck (say this is a casino). The question is: Should there be a different instance of every card across all the decks?
Card objects would probably be best implemented as immutable objects. In order to create a card, you must pass in a Suit and a Rank, and this Suit and Rank will never be changed at a later time.
From that point of view, since these objects don't change, and since there are a set number to begin with, it makes sense to implement a single static collection containing all 52 possible Card objects, and access these cards from other classes (make the constructor on Card private so that it is impossible to create a card outside of the Card class).
The real distinction here is that Cards themselves don't perform any operations; other operations act upon cards, so it is just fine to have a single instance of each card.
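A minimal sketch of that idea in Python (the `Card.get` factory and the sealing flag are purely illustrative, not from any particular library): the 52 instances are built once into a shared table, and the constructor is effectively made private by refusing to run after the table is sealed.

```python
from itertools import product

SUITS = ("clubs", "diamonds", "hearts", "spades")
RANKS = tuple(range(2, 15))          # 11..14 stand for J, Q, K, A

class Card:
    """Immutable card; only the 52 instances built below ever exist."""
    __slots__ = ("suit", "rank")
    _sealed = False                  # becomes True once the pool is built
    _pool = {}

    def __init__(self, suit, rank):
        if Card._sealed:             # emulate a private constructor
            raise TypeError("use Card.get(suit, rank)")
        object.__setattr__(self, "suit", suit)
        object.__setattr__(self, "rank", rank)

    def __setattr__(self, name, value):
        raise AttributeError("Card is immutable")

    @classmethod
    def get(cls, suit, rank):
        """Return the single shared instance for (suit, rank)."""
        return cls._pool[(suit, rank)]

# Build the shared pool of 52 cards exactly once, then seal the class.
for s, r in product(SUITS, RANKS):
    Card._pool[(s, r)] = Card(s, r)
Card._sealed = True
```

A Deck then just holds references into this shared table (e.g. `[Card.get(s, r) for s in SUITS for r in RANKS]`) and shuffles its own list; the Card objects themselves are never copied or modified.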
This is officially called the Flyweight pattern and was first presented in the GoF book "Design Patterns". It should be very useful in your case. Since cards never change, you might even consider implementing them as enums.
http://en.wikipedia.org/wiki/Flyweight_pattern
It depends on how you're going to use the cards. Probably, though, any extra memory usage from extra instances of Card will be trivial - after all, each card is storing only two bytes of data.
You ask: "Should there be a different instance of every card across all the decks?" The answer is no: you can use a single instance of each card and share it across all the decks, even if they're running on different threads. The reason is that cards are immutable, so even if two threads call, say, card.getSuit() on the same card, their computations won't interfere.
This is only true, of course, if you write the card class to really be immutable. As soon as you write to some mutable field of a card, you expose yourself to data races. But I can't think of a reason to do that, so you should be safe.
I have found the keras-rl/examples/cem_cartpole.py example and I would like to understand it, but I can't find any documentation.
What does the line
memory = EpisodeParameterMemory(limit=1000, window_length=1)
do? What are limit and window_length? What effect does increasing either or both parameters have?
EpisodeParameterMemory is a special class that is used for CEM. In essence it stores the parameters of a policy network that were used for an entire episode (hence the name).
Regarding your questions: The limit parameter simply specifies how many entries the memory can hold. After exceeding this limit, older entries will be replaced by newer ones.
The second parameter is not used in this specific type of memory (CEM is somewhat of an edge case in Keras-RL and mostly there as a simple baseline). Typically, however, the window_length parameter controls how many observations are concatenated to form a "state". This may be necessary if the environment is not fully observable (think of it as transforming a POMDP into an MDP, or at least approximately). DQN on Atari uses this since a single frame is clearly not enough to infer the velocity of a ball with a FF network, for example.
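To make the window_length idea concrete, here is a small sketch in plain Python (this is not keras-rl's internal implementation, just the concept of stacking the last N observations into one state):

```python
from collections import deque
import numpy as np

class ObservationWindow:
    """Concatenate the last `window_length` observations into one state.

    Mimics what window_length does conceptually (e.g. stacking 4 Atari
    frames so velocity can be inferred); it is not keras-rl code.
    """
    def __init__(self, window_length, obs_shape):
        self.window = deque(
            [np.zeros(obs_shape) for _ in range(window_length)],
            maxlen=window_length,
        )

    def push(self, observation):
        self.window.append(observation)
        # State = the recent observations stacked along a new leading axis.
        return np.stack(self.window)

win = ObservationWindow(window_length=4, obs_shape=(84, 84))
state = win.push(np.random.rand(84, 84))   # shape (4, 84, 84)
```

With window_length=1 the "state" is simply the latest observation, which is why the CEM cartpole example can leave it at 1.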
Generally, I recommend reading the relevant paper (again, CEM is somewhat of an exception). It should then become relatively clear what each parameter means. I agree that Keras-RL desperately needs documentation but I don't have time to work on it right now, unfortunately. Contributions to improve the situation are of course always welcome ;).
A little late to the party, but I feel like the answer doesn't really answer the question.
I found this description online (https://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html#replay-memory):
We’ll be using experience replay memory for training our DQN. It stores the transitions that the agent observes, allowing us to reuse this data later. By sampling from it randomly, the transitions that build up a batch are decorrelated. It has been shown that this greatly stabilizes and improves the DQN training procedure.
Basically you observe and save all of your state transitions so that you can train your network on them later on (instead of having to make observations from the environment all the time).
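A minimal sketch of such a replay memory (not the exact class from the PyTorch tutorial, but the same idea): a bounded buffer of transitions plus uniform random sampling.

```python
import random
from collections import deque, namedtuple

Transition = namedtuple("Transition", "state action reward next_state done")

class ReplayMemory:
    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)   # old transitions fall out

    def push(self, *args):
        self.buffer.append(Transition(*args))

    def sample(self, batch_size):
        # Uniform random sampling decorrelates the transitions in a batch.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```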
I am working on a graphic design / vector drawing application that needs to re-render the data every frame when there is a change. The issue is that if the user is moving nodes, there will be changes during every single frame. This is not a problem with a tiny amount of data, but it becomes a major slowdown with anything more than a modest amount.
The reason is that in order to render, I perform calculations and store data inside arrays. When the function responsible for the computation is done, the GC simply discards the data, and the next time the function is called, we create new arrays and new data.
In C++ I would probably allocate space in memory once and write to that space over and over, which would likely improve performance. In languages that use GC I cannot allocate space that way; I have to do an ugly hack where I define an array as a class member and then write to that array from the function repeatedly, even though the array is only used in that one function and not by other methods of the class.
My question is: what is the best way to reuse memory space in a language that uses GC?
Object pooling would be the major one, see here: Gotoandplay Tutorial. Also: 10 Top Tips around GC.
I would also suggest you read through Grant's explanation of the garbage collection system in the Flash Player; it's quite unique, and understanding how Flash handles data is quite important for data-intensive scripts.
This presentation
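To show what object pooling looks like in code, here is a small language-agnostic sketch in Python (the original context is ActionScript/Flash, so all names here are purely illustrative): instances are allocated once and handed back out each frame instead of being collected and re-created.

```python
class ObjectPool:
    """Reuse instances instead of allocating (and collecting) them per frame."""
    def __init__(self, factory, size):
        self._factory = factory
        self._free = [factory() for _ in range(size)]   # pre-allocate up front

    def acquire(self):
        # Reuse a pooled instance if available, otherwise grow the pool.
        return self._free.pop() if self._free else self._factory()

    def release(self, obj):
        self._free.append(obj)     # return the instance for later reuse

# Usage: a render loop acquires scratch buffers instead of re-creating them.
pool = ObjectPool(factory=lambda: [0.0] * 1024, size=8)
buf = pool.acquire()
# ... fill buf with per-frame vertex data, render, then ...
pool.release(buf)
```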
Let's assume we have a simple generational GC with only two generations: the "old" generation (objects that survived at least one collection) and the "young" generation (newly allocated objects). How exactly would the GC determine that a "young" object is garbage without tracing the whole reference graph from the very roots? Or, to put it a different way: what does the GC choose as roots for the trace when it intends to collect only the "young" generation?
I'm interested in the general method but in specific examples from existing implementations as well.
Thanks!
There are a few techniques, which all boil down to maintaining knowledge of which old-gen objects (or ranges of old-gen memory) may contain references to young objects.
Pretty much all implementations I can think of maintain this knowledge by adding write barriers. Those write barriers trigger when a young-gen reference is stored into an old-gen object, and thereby cause execution of a small code snippet that remembers the new reference.
To store that knowledge, some GCs use card marking, where a compact bitmap marks small-ish memory blocks as "contains references to younger generations". Others maintain explicit "remembered sets", which do something similar for individual objects. In both cases, young-gen collections then add the objects in the remembered set (or in the memory blocks marked by the card table) to the roots.
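As a toy illustration of the card-marking variant (conceptual Python, not taken from any real collector): every store of a young-gen reference into an old-gen object dirties the card covering that object, and a young-gen collection treats objects on dirty cards as extra roots.

```python
CARD_SIZE = 512                        # bytes of heap covered by one card
YOUNG_GEN = range(0, 1 << 20)          # toy address range for the young gen

def is_young(addr):
    return addr in YOUNG_GEN

class Heap:
    def __init__(self, heap_size):
        # One byte per card: 0 = clean, 1 = "may hold refs into the young gen".
        self.card_table = bytearray(heap_size // CARD_SIZE)

    def write_barrier(self, old_obj_addr, stored_ref_addr):
        # Run on every pointer store into an old-gen object.
        if is_young(stored_ref_addr):
            self.card_table[old_obj_addr // CARD_SIZE] = 1

    def young_roots(self, normal_roots, objects_on_card):
        # A young-gen collection starts from the usual roots plus every
        # old-gen object sitting on a dirty card.
        roots = list(normal_roots)
        for card_index, dirty in enumerate(self.card_table):
            if dirty:
                roots.extend(objects_on_card(card_index))
        return roots
```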
As for specific implementations:
Mono uses remembered sets.
PyPy has several GCs, the newest and shiniest (Minimark) uses remembered sets, with the addition of card marking for individual large arrays.
.NET uses card marking.
I don't have much experience in machine learning, pattern recognition, data mining, etc. and in their underlying theory and systems.
I would like to develop an artificial model of the time it takes a human to make a move in a given Sudoku puzzle.
So what I'm looking for as an output from the machine learning process is a model that can predict how long it takes a target human to make a move in a given Sudoku situation.
The same input doesn't always map to the same outcome. It takes the human different amounts of time to make a move in the same situation, but my hypothesis is that there's a tendency in the resulting probability distribution. (My educated guess is that it is approximately normal.)
I have ideas about the factors that influence the distribution (like the number of empty slots), but I would prefer to leave it to the system to figure these patterns out. Please note that I'm not interested in the patterns themselves, just the model.
I can generate sample and test data easily by running sudoku puzzles and measuring the times it takes to make the moves.
What kind of learning algorithm would you suggest to use for this?
I was thinking NNs, but I'm not sure if they can have the desired property of giving weighted random outcomes for the same input.
If I understand this correctly you have an input vector of length 81, which contains 1 if the square is filled in and 0 otherwise. You want to learn a function which returns a probability distribution which models the response time of a human to that board position.
My first response would be that this is a regression problem and you should try straightforward linear regression. This will not provide you with a distribution of response times, but a single 'best-guess' response time.
I'm not clear on why you want to model a distribution of response times. However, if you really do want to output a distribution, then it sounds like you want to look at Bayesian methods. I'm not really an expert on Bayesian inference, so I can't help you much further here.
However, I don't really think your approach is going to work because I agree with your intuition about features such as the number of empty slots being important. There are also other obvious features, such as the number of empty slots per row/column that are likely to be important. Explicitly putting these features in your representation will probably be much more successful than expecting that the learning algorithm will infer something similar on its own.
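As a minimal starting point along those lines, assuming you have already turned each board into a feature vector and logged the observed move times (scikit-learn is used here purely as an example, and the features and numbers are made up):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# X: one row of hand-crafted features per observed move
# (e.g. total empty cells, empty cells in the row/column/box of the move).
# y: measured response time in seconds for that move.
X = np.array([[45, 5, 6, 7],
              [30, 3, 4, 2],
              [12, 1, 2, 1]], dtype=float)
y = np.array([18.2, 9.5, 4.1])

model = LinearRegression().fit(X, y)
predicted_time = model.predict([[40, 4, 5, 6]])   # single best-guess estimate
```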
The Monte Carlo method seems like it would work well here, but it would require a stack of solutions the size of the moon to really do it. And it wouldn't give you the time per person, just the average time.
My understanding of it, tenuous as it is, is that you have a database with a board position and the time it took a human to make the next move. At the very least you have a starting point for most moves. Even if a position is not in the database, you could start to calculate how long it would take to make a move based on some algorithm. Though I know you specified you wanted machine learning to do this, it might be worth segmenting the problem into something a little smaller and then building on it.
If you have some guesstimate as to what influences the function (# of empty cell, etc), try to train a classifier on a vector of features, and not on the 81 cells vector (0/1 or 0..9, doesn't really matter for my argument).
I think that your claim:
we wouldn't have to necessary know the underlying patterns, the "trained patterns" in a learning system automatically encodes these sometimes quite delicate and subtle patterns inside them -- that's one of their great power
is wrong. You do have to give the network the right domain. For example, when trying to detect objects in an image, working in the pixel domain is pointless; you'll only get results if you first run some feature detection to find edges, corners, etc.
Theoretically, with enough non-linearity (in an NN, enough layers in the network) it could detect such things, but in practice I have never seen that work without giving the classifier the right features to work with.
I was thinking NNs, but I'm not sure if they can have the desired property of giving weighted random outcomes for the same input.
You're just trying to learn a function from 2^81 or 10^81 (or a much smaller feature space as I suggest) to R (response time between 0 and Inf) or some discretization of that. So NN and other classifiers can do that.
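For example, instead of feeding the raw 81-cell vector, you could hand the learner a few explicit features like these (a rough sketch; which features actually matter is exactly the judgement call being discussed above):

```python
import numpy as np

def board_features(board):
    """board: 81 values or a 9x9 array, 0 = empty, 1..9 = filled digit."""
    board = np.asarray(board).reshape(9, 9)
    empty = board == 0
    return np.array([
        empty.sum(),                     # total empty cells
        empty.sum(axis=1).max(),         # most empty cells in any row
        empty.sum(axis=0).max(),         # most empty cells in any column
        (empty.sum(axis=1) == 1).sum(),  # rows with exactly one empty cell
    ], dtype=float)
```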
The fixnum question brought to mind another question I've wondered about for a long time.
Much of the online material about garbage collection does not explain how runtime type information can be implemented. Therefore I know a lot about all sorts of garbage collectors, but not really how I could implement them.
The fixnum solution is actually quite nice: it's very clear which value is a pointer and which isn't. What other commonly used solutions for storing type information are there?
Also, I wonder about the fixnum thing. Doesn't that mean you are limited to fixnums for every array index? Or is there some sort of workaround for getting full 64-bit integers?
Basically to achieve accurate marking you need meta-data indicating which words are used as pointers and which are not.
This meta-data could be stored per reference, as Emacs does. If for your language/implementation you don't care much about memory use, you could even make references bigger than words (perhaps twice as big), so that every reference can carry type information as well as its one-word data. That way you could have a fixnum the full size of a 32-bit pointer, at the cost of references all being 64 bits.
Alternatively, the meta-data could be stored along with other type information. So for example a class could contain, as well as the usual function pointer table, one bit per word of the data layout indicating whether or not the word contains a reference that should be followed by the garbage collector. If your language has virtual calls then you must already have a means of working out from an object what function addresses to use, so the same mechanism will allow you to work out what marking data to use - typically you add an extra, secret pointer at the start of every single object, pointing to the class which constitutes its runtime type. Obviously with certain dynamic languages the type data pointed to would need to be copy-on-write, since it is modifiable.
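To make that second option concrete, here is a toy sketch (not any real VM): each class carries one boolean per word of its instance layout, every object carries a hidden pointer to its class, and the marker follows only the words the map flags as references.

```python
class ClassInfo:
    """Per-class metadata: which words of an instance hold references."""
    def __init__(self, name, pointer_map):
        self.name = name
        self.pointer_map = pointer_map        # e.g. [True, False, True]

class Obj:
    """Every object carries a hidden pointer to its ClassInfo (the 'secret' header word)."""
    def __init__(self, class_info, fields):
        self.class_info = class_info
        self.fields = fields                  # raw words: ints or other Obj refs

def mark(obj, marked):
    if obj in marked:
        return
    marked.add(obj)
    # Follow only the words that the class's pointer map says are references.
    for word, is_ref in zip(obj.fields, obj.class_info.pointer_map):
        if is_ref and word is not None:
            mark(word, marked)

# Usage: a Point with (x, y, next) where only 'next' is a reference.
point_cls = ClassInfo("Point", [False, False, True])
a = Obj(point_cls, [1, 2, None])
b = Obj(point_cls, [3, 4, a])
marked = set()
mark(b, marked)          # marks b, then follows the 'next' word to a
```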
The stack can be handled similarly: store the accurate marking information in data sections of the code itself, and have the garbage collector examine the stored program counter, and/or link pointers on the stack, and/or other information placed on the stack by the code for the purpose, to determine which code each bit of stack relates to and hence which words are pointers. Lightweight exception mechanisms tend to do a similar thing to store information about where try/catch occurs in the code, and of course debuggers need to be able to interpret the stack too, so this can quite possibly be folded in with a bunch of other stuff you'd already be doing to implement any language, including ones with built-in garbage collection.
Note that garbage collection doesn't necessarily need accurate marking. You could treat every word as a pointer, regardless of whether it really is or not, look it up in your garbage collector's "big list of everything" to decide whether it plausibly could refer to an object that has not yet been marked, and if so treat it as a reference to that object. This is simple, but the cost of course is that it's somewhere between "quite slow" and "very slow", depending on what data structures your gc uses for the lookup. Furthermore, sometimes an integer just so happens to have the same value as the address of an unreferenced object, and causes you to keep a whole bunch of objects which should have been collected. So such a garbage collector cannot offer strong guarantees about unreferenced objects ever being collected. This might be fine for a toy implementation or first working version, but is unlikely to be popular with users.
A mixed approach might, say, do accurate marking of objects, but not of regions of the stack where things get particularly hairy. For example if you write a JIT which can create code where a referenced object address appears only in registers, not in your usual stack slots, then you might need to non-accurately follow the region of the stack where the OS stored the registers when it descheduled the thread in question to run the garbage collector. Which is probably quite fiddly, so a reasonable approach (potentially resulting in slower code) would be to require the JIT to always keep a copy of all pointer values it's using on the accurately marked stack.
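A toy illustration of that conservative approach (plain Python integers standing in for machine words): every word found on the stack is looked up in the collector's table of object addresses, and any hit is treated as a reference, even if the word was really just an integer.

```python
# Addresses of live heap objects: the GC's "big list of everything".
heap_objects = {0x1000: "objA", 0x1040: "objB", 0x2000: "objC"}

# Raw words found on a thread's stack: some are pointers, some are plain ints.
stack_words = [7, 0x1040, 123456, 0x2000, 0x1040]

def conservative_roots(words, objects):
    roots = set()
    for word in words:
        # Any word that happens to equal an object address is kept alive,
        # so an unlucky integer can retain garbage (the stated downside).
        if word in objects:
            roots.add(word)
    return roots

print(conservative_roots(stack_words, heap_objects))   # keeps 0x1040 and 0x2000
```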
In Squeak (and also Scheme and many other dynamic languages, I guess) you have SmallInteger, the class of signed 31-bit integers, and classes for arbitrarily big integers, e.g. LargePositiveInteger. There could very well be other representations: 64-something-bit integers either as full objects or with a couple of bits as "I'm not a pointer" flags.
But the arithmetic methods are coded to handle over/under-flows, such that if you add one to SmallInteger maxVal you get 2^30 as an instance of LargePositiveInteger, and if you subtract one back from it you get 2^30 - 1 back as a SmallInteger.
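A tiny sketch of that promotion rule (conceptual only; a real VM does this with tag bits and primitive failure code rather than a Python function, and the Squeak-like 31-bit bound is hard-coded here):

```python
SMALL_MAX = 2**30 - 1        # largest signed 31-bit "SmallInteger"
SMALL_MIN = -(2**30)

def add(a, b):
    """Add two integers, 'boxing' the result when it leaves the fixnum range."""
    result = a + b
    if SMALL_MIN <= result <= SMALL_MAX:
        return ("SmallInteger", result)        # fits in the tagged immediate
    # Overflow: promote to a boxed bignum class.
    return ("LargePositiveInteger" if result > 0 else "LargeNegativeInteger",
            result)

print(add(SMALL_MAX, 1))   # ('LargePositiveInteger', 1073741824), i.e. 2**30
print(add(2**30, -1))      # ('SmallInteger', 1073741823), i.e. 2**30 - 1
```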