Not sure of the name.. (bit field) - terminology

I'm not sure what this is called, so I had no idea what to title the question.
It's when people use a number to represent the type of something, but the values are binary based (powers of two): 1, 2, 4, 8, 16, 32, etc.
For example: 1 = apples, 2 = oranges, 4 = bananas; then 3 = apples and oranges, and 7 = apples, oranges and bananas.
What is this called?

There are many different terms, but I would call that a bit field. If you're using Java there is a class for this called BitSet.
I've also heard it called bit flags.

Bit fields aka Bit flags?
http://en.wikipedia.org/wiki/Bit_field

I think it is (or is similar to) a bit array: each bit stores the state of something, in this case whether a given fruit is available or not.
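As a rough illustration of the fruit example, here is a minimal Python sketch using the standard library's enum.IntFlag. The names Fruit, basket and everything are made up for this example; only the values 1, 2 and 4 come from the question.

```python
from enum import IntFlag


class Fruit(IntFlag):
    """Each member occupies one bit, so members can be combined freely."""
    APPLES = 1   # binary 001
    ORANGES = 2  # binary 010
    BANANAS = 4  # binary 100


# Combine flags with bitwise OR: 1 | 2 == 3
basket = Fruit.APPLES | Fruit.ORANGES

# Test for a single flag with bitwise AND
has_oranges = bool(basket & Fruit.ORANGES)
has_bananas = bool(basket & Fruit.BANANAS)

# All three fruits together give 7
everything = Fruit.APPLES | Fruit.ORANGES | Fruit.BANANAS
```

Because every flag is a distinct power of two, any combination maps to a unique integer, which is why 3 can only mean apples and oranges.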

That is called a bit flag, or bit field.


What is the best way to model an environment to force an agent to select `x out of n` choices?

I have an RL problem where I want the agent to make a selection of x out of an array of size n.
I.e. if I have [0, 1, 2, 3, 4, 5] then n = 6 and if x = 3 a valid action could be
[2, 3, 5].
So far I have tried two approaches.
The first is to output n continuous scores and select the x highest ones. This works reasonably well.
The second is to iteratively replace duplicates in a Multi Discrete action, where we have x values that can each be anything from 0 to n-1.
Is there some other optimal action space I am missing that would force the agent to make unique choices?
Many thanks for your valuable insights and tips in advance! I am happy to try all!
Since reinforcement learning is mostly about interacting with the environment, you can approach it like this:
Your agent starts choosing actions. After it chooses the first action, you can either update the choices it has left by removing the last choice (using a temporary action list), or update the value of the chosen action (by giving it a negative reward, i.e. punishing it). I think this could solve your problem.
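A minimal Python sketch of the first option (the temporary action list) might look like this. The function name pick_unique, the positive scores used as sampling weights, and the fixed seed are all assumptions made for illustration; they are not taken from any particular RL library.

```python
import random


def pick_unique(scores, x, rng=None):
    """Pick x distinct indices one at a time, masking out earlier picks.

    scores: positive weights, one per choice (a hypothetical policy output).
    """
    rng = rng or random.Random()
    available = list(range(len(scores)))  # the temporary action list
    chosen = []
    for _ in range(x):
        weights = [scores[i] for i in available]
        pick = rng.choices(available, weights=weights, k=1)[0]
        chosen.append(pick)
        available.remove(pick)  # this index can no longer be selected
    return chosen


# n = 6 choices, x = 3 picks
selection = pick_unique([0.1, 0.3, 0.9, 0.8, 0.2, 0.7], x=3, rng=random.Random(0))
```

Since each pick is removed from the available list before the next one is drawn, duplicates cannot occur, so no penalty term is needed for this variant.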

reinforcement learning model design - how to add up to 5

I am experimenting with reinforcement learning in python using Keras. Most of the tutorials available use OpenAI Gym library to create the environment, state, and action sets.
After practicing with many good examples written by others, I decided that I want to create my own reinforcement learning environment, state, and action sets.
This is what I think will be fun to teach the machine to do.
An array of integers from 1 to 4. I will call these targets.
targets = [[1, 2, 3, 4]]
Additional numbers list (at random) from 1 to 4. I will call these bullets.
bullets = [1, 2, 3, 4]
When I shoot a bullet at a target, the target's number becomes the sum of the original target number and the bullet number.
I want to shoot a bullet (one at a time) at one of the targets to make that target's number equal 5.
For example, given targets [1 2 3 4] and bullet 1, I want the machine to predict the correct index to shoot at.
In this case, it should be index 3, because 4 + 1 = 5
curr_state = [[1, 2, 3, 4]]
bullet = 1
action = 3 (<-- index of the curr_state)
next_state = [[1, 2, 3, 5]]
I have been racking my brain to think of the best way to turn this into a reinforcement learning design. I tried a few, but the model's results are not very good (meaning it usually fails to make the number 5).
This is mostly because the state has two parts: (1) the targets; (2) the bullet at that time. The method I have employed so far is to convert the state as follows:
State = 5 - targets - bullet
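For reference, this encoding can be sketched in Python as below. The helper names encode_state and correct_action are hypothetical; the idea is only that a 0 in the encoded state marks the index where the bullet makes exactly 5.

```python
def encode_state(targets, bullet, goal=5):
    """Element-wise goal - target - bullet, as in the formula above."""
    return [goal - t - bullet for t in targets]


def correct_action(targets, bullet, goal=5):
    """Index whose encoded value is 0, or None if no target can reach the goal."""
    encoded = encode_state(targets, bullet, goal)
    return encoded.index(0) if 0 in encoded else None


encoded = encode_state([1, 2, 3, 4], bullet=1)   # [3, 2, 1, 0]
action = correct_action([1, 2, 3, 4], bullet=1)  # 3
```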
I was wondering if anyone can think of a better way to design this model?
Thanks in advance!
Alright, it looks like no one is helping you out, so I wrote a Python environment file matching your description. I made it as close to OpenAI style as possible. It is in my GitHub repository at the link below; you can copy the code or fork it. I will explain it below:
https://github.com/RuiNian7319/Miscellaneous/blob/master/ShootingRange.py
States = [0, 1, 2, ..., 10]
Actions = [-2, -1, 0, 1, 2]
The game starts at a random number between 0 and 10 (you can change this easily if you want), and that random number is the "target" you described above. Given this target, your AI agent can fire the gun, and it shoots a bullet corresponding to one of the action values above. The objective is for your bullet and the target to add up to 5. The negative actions are there in case your AI agent overshoots 5, or the target is a number above 5.
To get a positive reward, the agent has to reach 5. So if the current value is 3 and the agent shoots 2, the agent gets a reward of 1 because it reached a total of 5, and that episode ends.
There are 3 ways for the game to end:
1) Agent gets 5
2) Agent fails to get 5 in 15 tries
3) The number is above 10. In this case, we say the target is too far
Sometimes, you need to shoot multiple times to get 5. So, if your agent shoots, its current bullet will be added to the state, and the agent tries again from that new state.
Example:
Current state = 2. Agent shoots 2. New state is 4. And the agent starts at 4 at the next time step. This "sequential decision making" creates a reinforcement learning environment, rather than a contextual bandit.
I hope this makes sense, let me know if you have any questions.
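For readers who cannot open the link, here is a minimal sketch of an environment with the rules described above. It is not the code from the linked repository: the class name ShootingRangeSketch, the reset/step signatures, the reward of 0 for every non-winning step, and the lack of any handling for values below 0 or a start value of 5 are all assumptions.

```python
import random


class ShootingRangeSketch:
    """Toy environment: shoot bullets until the running total equals 5."""

    ACTIONS = (-2, -1, 0, 1, 2)  # bullet values from the description
    GOAL = 5
    MAX_STATE = 10
    MAX_STEPS = 15

    def __init__(self, seed=None):
        self.rng = random.Random(seed)
        self.state = 0
        self.steps = 0

    def reset(self, start=None):
        """Start at a random target in 0..10, or at a given value."""
        self.state = self.rng.randint(0, self.MAX_STATE) if start is None else start
        self.steps = 0
        return self.state

    def step(self, action_index):
        """Add the chosen bullet to the state; return (state, reward, done)."""
        self.state += self.ACTIONS[action_index]
        self.steps += 1
        if self.state == self.GOAL:
            return self.state, 1.0, True  # ending 1: the agent gets 5
        # ending 2: out of tries; ending 3: target too far (above 10)
        done = self.steps >= self.MAX_STEPS or self.state > self.MAX_STATE
        return self.state, 0.0, done  # assumed: zero reward otherwise


env = ShootingRangeSketch(seed=0)
env.reset(start=2)
first = env.step(4)   # shoot 2 from state 2: (4, 0.0, False)
second = env.step(3)  # shoot 1 from state 4: (5, 1.0, True)
```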

AS3 function producing combinations of array, no duplicates

This sounds like a duplicate question, as there are several questions similar to this, but they don't specifically ask this (or I just haven't found it! :) )
I have an array, this one has two distinct elements, "a" and "b", and a length of four total elements:
var list:Array = ["a","a","b","b"];
I'm looking for all combinations, using all elements, no duplicates.
This should yield:
aabb
abab
abba
bbaa
baba
baab
Searching for a solution for this has given me results similar to these:
a,b,ab,ba,aab,abb,aba, etc
or
a a b b, a a b b, a a b b, etc
Mind you, the application that would ultimately use this function would have two distinct elements, "a" and "b", and a length of 50 total elements:
var list:Array = ["a","a","a","a","a","a","a","a","a","a",
"a","a","a","a","a","a","a","a","a","a",
"a","a","a","a","a",
"b","b","b","b","b","b","b","b","b","b",
"b","b","b","b","b","b","b","b","b","b",
"b","b","b","b","b"]
...so a brute force solution like I used with aabb wouldn't be feasible.
Any help, especially using AS3 code, would be appreciated, even if it is simply pointing me to the right google search :)
Here is a JavaScript answer that might get you started: Permutations in JavaScript? (both languages are ECMAScript implementations, so converting to ActionScript should only require minor changes)
It doesn't handle the uniqueness requirement, but it might point you in the right direction.
However, there are a few things you might need to consider first. I don't think it will be feasible to pre-compute all unique permutations upfront.
Based on this answer about unique permutations, it looks like there are 50! / (25! * 25!) = 126,410,606,437,752 unique permutations for 25 a's and 25 b's.
To give an idea how large that number is: if each combination was 1 byte in memory (in practice it will be more than this) then that would be: 126410606437752 bytes = 126,410.6 gigabytes in memory.
Plus, the algorithm for generating the permutations has complexity O(n!) - so it might take far too long, separate to memory constraints, to generate the list of permutations.
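The count can be checked, and the unique arrangements produced lazily rather than stored, with a sketch like the one below. It is written in Python rather than AS3, and it uses the classic "next lexicographic permutation" step, which is a different technique from the linked JavaScript answer. The function name unique_permutations is made up for this example.

```python
from math import comb


def unique_permutations(items):
    """Yield each distinct arrangement of items once, in sorted order."""
    seq = sorted(items)
    n = len(seq)
    while True:
        yield tuple(seq)
        # Find the rightmost position i where seq[i] < seq[i + 1]
        i = n - 2
        while i >= 0 and seq[i] >= seq[i + 1]:
            i -= 1
        if i < 0:
            return  # seq is in descending order: this was the last one
        # Find the rightmost element larger than seq[i] and swap them
        j = n - 1
        while seq[j] <= seq[i]:
            j -= 1
        seq[i], seq[j] = seq[j], seq[i]
        # Reverse the tail so it becomes the smallest possible suffix
        seq[i + 1:] = reversed(seq[i + 1:])


small = ["".join(p) for p in unique_permutations(["a", "a", "b", "b"])]
# small == ['aabb', 'abab', 'abba', 'baab', 'baba', 'bbaa']

total_for_50 = comb(50, 25)  # 50! / (25! * 25!)
```

Because it is a generator, only one arrangement is held in memory at a time, although walking through all of them for 50 elements would still take far too long.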

Real world examples of a kind of "sorted" data

Consider a sorted list of numbers which is "cut," so that it is increasing except for one jump. For instance the order might be,
11, 12, 13, 14, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
What kinds of data naturally have this representation, with one or possibly many "cuts" obscuring the default ordering? The only one I can think of is a deck of cards, but I was asked to produce examples of data that might look like this in an interview. Weeks later, and I still can't think of any, but my curiosity prevails.
Is there a special name for this kind of data? I tried googling "cut data" but that obviously didn't work.
All insight is appreciated.
[Edit] From the discussions below this appears to have some interesting relationships with symmetry groups, and what sorts of rearrangements are possible with just the cut operation. I may have to ask my local mathematicians what I can do with this.
I can think of a few off the top of my head.
The first is the hour of the day as it rolls into a new day: ... 22 23 0 1 2 ....
The second is the alpha ordering on file names: pax1 pax10 pax11 ... pax19 pax2 pax20 ....
Yet another is the months of the financial year (in Australia, most companies close off their financial year at the end of June): 7 8 9 10 11 12 1 2 3 4 5 6.
After a quick analysis, it's clear that any sequence of "cuts" is equivalent to a single cut at a different index. In fact, only the most recent cut point matters: that value ends up at the front of the list, which is the same as cutting the original data at that element's original index.
So not so interesting.
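The claim that repeated cuts collapse into a single cut can be checked with a short Python sketch. The function name cut is made up here, and a cut is modelled as moving the first k items to the end.

```python
def cut(seq, k):
    """Cut the list at index k: items from k onward move to the front."""
    k %= len(seq)
    return seq[k:] + seq[:k]


data = list(range(1, 15))  # 1..14 in sorted order

once = cut(data, 10)           # [11, 12, 13, 14, 1, 2, ..., 10]
twice = cut(cut(data, 10), 7)  # two cuts in a row
single = cut(data, (10 + 7) % len(data))  # one equivalent cut
```

Here twice and single are the same list, since cutting at a and then at b is just a rotation by (a + b) modulo the length.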

What is the name of this data structure or technique of using relative difference between sequence members

Let's say I have a sequence of values (e.g., 3, 5, 8, 12, 15) and I want to occasionally decrease all of them by a certain value.
If I store them as the sequence (0, 2, 3, 4, 3) and keep a variable as a base of 3, I now only have to change the base (and check the first items) whenever I want to decrease them instead of actually going over all the values.
I know there's an official term for this, but when I literally translate from my native language to English it doesn't come out right.
Differential Coding / Delta Encoding?
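A minimal Python sketch of delta encoding on the numbers from the question, assuming the stored sequence holds differences between neighbours with a leading 0. The function names delta_encode and delta_decode are made up for this example.

```python
from itertools import accumulate


def delta_encode(values):
    """Return (base, deltas): the first value plus neighbour differences."""
    base = values[0]
    deltas = [0] + [b - a for a, b in zip(values, values[1:])]
    return base, deltas


def delta_decode(base, deltas):
    """Rebuild the values by adding running sums of the deltas to the base."""
    return [base + s for s in accumulate(deltas)]


base, deltas = delta_encode([3, 5, 8, 12, 15])  # base 3, deltas [0, 2, 3, 4, 3]
restored = delta_decode(base, deltas)

# Decreasing every value by 2 only touches the base
shifted = delta_decode(base - 2, deltas)  # [1, 3, 6, 10, 13]
```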
I don't know a name for the data structure, but it's basically just base+offset :-)
An offset?
If I understand your question right, you're rebasing. That's normally used in reference to patching up addresses in DLLs from a load address.
I'm not sure that's what you're doing, because your example seems to be incorrect. In order to come out with { 3, 5, 8, 12, 15 }, with a base of 3, you'd need { 0, 2, 5, 9, 12 }.
I'm not sure. If you imagine your first array as providing the results of some function of an index value f(i) where f(0) is 3, f(1) is 5, and so forth, then your second array is describing the function f'(i) where f(i+1) = f(i) + f'(i) given f(0) = 3.
I'd call it something like a derivative function, where the process of retrieving your original data is simply the summation function.
What will happen more often, will you be changing f(0) or retrieving values from f(i)? Is this technique rooted in a desire to optimize?
Perhaps you're looking for a term like "Inductive Sequence" or "Induction Sequence." (I just made that up.)