I have an RL problem where I want the agent to make a selection of x out of an array of size n.
I.e. if I have [0, 1, 2, 3, 4, 5] then n = 6 and if x = 3 a valid action could be
[2, 3, 5].
So far I have tried two things. The first is to have n scores:
Output n continuous numbers, and select the x highest ones. This works reasonably well.
The second is a Multi Discrete action with x values, each anywhere from 0 to n-1, where I iteratively replace duplicates.
Is there some other, better-suited action space I am missing that would force the agent to make unique choices?
Many thanks for your valuable insights and tips in advance! I am happy to try all!
Since reinforcement learning is mostly about interacting with the environment, you can approach it like this:
Your agent chooses its actions one at a time. After each choice, you can either remove that choice from the set of available actions (using a temporary action list) or penalize choosing it again (for example with a negative reward). I think this could solve your problem.
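The first option (shrinking the list of available actions after each pick) could look roughly like the sketch below. It assumes the policy produces one score per item and that picks are sampled from a softmax over the items still available; the function name `sample_unique` and the softmax choice are illustrative assumptions, not part of any RL library.

```python
import math
import random

def sample_unique(scores, x, rng=random):
    """Pick x distinct indices one at a time, masking out earlier picks.

    Hypothetical helper: `scores` stands in for whatever the policy outputs.
    """
    if not 0 <= x <= len(scores):
        raise ValueError("x must be between 0 and len(scores)")
    available = list(range(len(scores)))  # the temporary action list
    chosen = []
    for _ in range(x):
        # Softmax weights over the indices that are still available.
        top = max(scores[i] for i in available)
        weights = [math.exp(scores[i] - top) for i in available]
        pick = rng.choices(available, weights=weights, k=1)[0]
        chosen.append(pick)
        available.remove(pick)  # this index can no longer be selected
    return chosen

picked = sample_unique([0.1, 2.0, -1.0, 0.5, 0.0, 1.5], 3)
print(sorted(picked))  # three distinct indices out of 0..5
```

Because a chosen index is removed before the next draw, duplicates are impossible by construction, so no repair step is needed afterwards.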
From the nested sets reference document written by Mike Hillyer and other blogs, I could understand how hierarchies are managed in an RDBMS. I was also able to successfully implement the model for one of my projects. I am currently working on a problem which also has a hierarchy, but the nodes are built from the bottom. I am using MySQL.
Say I have 10 objects. I initially create rows for them in a table. Then there is a table which holds the left and right values required for the nested sets model. In this table, I group these 10 objects into two sets, say two bags, with 5 objects in one bag and the other 5 in the second bag (based on some logic). Now these two bags are grouped together to form a bigger bag. Likewise, such bags are grouped together to form a big container.
I hope the example is clear to you to get an idea of what I am trying to achieve here. This is the opposite of applying the traditional nested set model where I build the sets from the top.
Can you please tell me whether nested sets can be applied here? If yes, will changing the update query during insertion be sufficient to form the entire hierarchy? If not, what other techniques can be used to tackle such problems?
The nested sets model works for any hierarchy, as long as it's non-overlapping (i.e. one child can have at most one parent).
Your model seems to have a predefined hierarchy ("objects", "bags" and "containers" being different entities with different properties). If it's the case indeed, you don't need nested sets at all, a simple set of foreign key constraints will suffice.
If it's not though (say, if a "bag" can be promoted to a "container", or there can be "containers" containing other "containers" etc.), you will need to have some kind of a hierarchy model indeed, and nested sets can serve as one as well.
One way to implement one would be to add references to your "bags" or "containers" or whatever to the table which holds your left and right values for your "objects":
CREATE TABLE nested_sets
(
    ref BIGINT NOT NULL,
    type INT NOT NULL, -- 1 = object, 2 = bag, 3 = container
    `left` BIGINT,     -- LEFT and RIGHT are reserved words in MySQL, hence the backticks
    `right` BIGINT
);
INSERT
INTO nested_sets
VALUES (1, 1, 1, 1),
       (2, 1, 2, 2),
       (3, 1, 3, 3), -- 3 objects in bag 1
       (4, 1, 4, 4),
       (5, 1, 5, 5),
       (6, 1, 6, 6), -- 3 objects in bag 2
       (1, 2, 1, 3), -- bag 1, containing objects 1 to 3
       (2, 2, 4, 6), -- bag 2, containing objects 4 to 6
       (1, 3, 1, 6); -- container 1, containing bags 1 and 2 and, by extension, objects 1 to 6
You may also want to move left and right fields from the nested_sets table to the main tables describing the entities, or, alternatively, you may want to move all entities into a single table. This depends on how rigid your definitions of "bag", "container" and "object" are.
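To illustrate how such a table can be filled from the bottom up, here is a minimal sketch using Python's built-in sqlite3 module in place of MySQL. The column names `lft` and `rgt`, the helpers `group` and `contents`, and the rule that a higher `type` code means a higher level are all assumptions made for this example. It also assumes the members being grouped occupy adjacent intervals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nested_sets ("
    " ref INTEGER NOT NULL, type INTEGER NOT NULL, lft INTEGER, rgt INTEGER)"
)

OBJECT, BAG, CONTAINER = 1, 2, 3  # assumed type codes, ordered by level

# Leaves first: object i occupies the single-point interval [i, i].
conn.executemany(
    "INSERT INTO nested_sets VALUES (?, ?, ?, ?)",
    [(i, OBJECT, i, i) for i in range(1, 7)],
)

def group(ref, type_, members):
    """Add a parent row spanning its members; members are (ref, type) pairs."""
    bounds = [
        conn.execute(
            "SELECT lft, rgt FROM nested_sets WHERE ref = ? AND type = ?", m
        ).fetchone()
        for m in members
    ]
    lft = min(b[0] for b in bounds)
    rgt = max(b[1] for b in bounds)
    conn.execute(
        "INSERT INTO nested_sets VALUES (?, ?, ?, ?)", (ref, type_, lft, rgt)
    )

def contents(ref, type_):
    """Everything nested inside the given row, as (type, ref) pairs."""
    return conn.execute(
        "SELECT c.type, c.ref FROM nested_sets AS p"
        " JOIN nested_sets AS c"
        "   ON c.lft >= p.lft AND c.rgt <= p.rgt AND c.type < p.type"
        " WHERE p.ref = ? AND p.type = ?"
        " ORDER BY c.type, c.ref",
        (ref, type_),
    ).fetchall()

# Build upwards: objects into bags, then bags into a container.
group(1, BAG, [(1, OBJECT), (2, OBJECT), (3, OBJECT)])
group(2, BAG, [(4, OBJECT), (5, OBJECT), (6, OBJECT)])
group(1, CONTAINER, [(1, BAG), (2, BAG)])

print(contents(2, BAG))  # [(1, 4), (1, 5), (1, 6)]
```

With this layout, grouping existing rows never requires renumbering them: a new parent only needs the smallest left value and the largest right value of its members.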
This is the query for a gaming application to get a list of targets for the enemy that excludes locations that the enemy can't see. This is a simplified version of my query to target my specific question.
SELECT * FROM `game_moblist` WHERE (posx!=0 AND posy!=0) AND (posx!=1100 AND posy!=220)
posx is the x coordinate and posy is the y coordinate.
I'm writing a loop to exclude any tiles that cannot be seen.
The issue I see is that it's treated as if the parentheses aren't there. All rows with posx=1100 are excluded, not just the ordered pair (1100,220). What is the proper syntax for what I'm trying to do? The only solution I thought of is to combine the two numbers into a unique single number, but I'd rather learn something new.
I think you mean:
WHERE NOT (posx=0 AND posy=0)
AND NOT (posx=1100 AND posy=220)
which can be rewritten also as:
WHERE (posx, posy) NOT IN ((0, 0), (1100, 220))
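The difference between the two conditions can be checked on a small sample. The sketch below uses Python's built-in sqlite3 module rather than MySQL and a handful of made-up rows; only the table and column names come from the question. It runs the original condition and the NOT (... AND ...) form side by side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE game_moblist (posx INTEGER, posy INTEGER)")
# Made-up sample positions, including the two pairs that should be excluded.
conn.executemany(
    "INSERT INTO game_moblist VALUES (?, ?)",
    [(0, 0), (0, 5), (1100, 220), (1100, 300), (7, 220), (3, 4)],
)

def run(where):
    """Return the (posx, posy) rows matching the given WHERE clause."""
    sql = "SELECT posx, posy FROM game_moblist WHERE " + where + " ORDER BY posx, posy"
    return conn.execute(sql).fetchall()

# Original condition: drops every row that shares an x OR a y with an excluded pair.
original = run("(posx != 0 AND posy != 0) AND (posx != 1100 AND posy != 220)")

# Corrected condition: drops only the exact pairs (0, 0) and (1100, 220).
fixed = run("NOT (posx = 0 AND posy = 0) AND NOT (posx = 1100 AND posy = 220)")

print(original)  # [(3, 4)]
print(fixed)     # [(0, 5), (3, 4), (7, 220), (1100, 300)]
```

The original query behaves this way because `posx != 1100 AND posy != 220` requires both inequalities to hold, whereas excluding a pair only requires at least one coordinate to differ.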
Here is a function I would like to write but am unable to do so. Even if you
don't / can't give a solution I would be grateful for tips. For example,
I know that there is a correlation between the ordered representations of the
sum of an integer and ordered set partitions but that alone does not help me in
finding the solution. So here is the description of the function I need:
The Task
Create an efficient* function
List<int[]> createOrderedPartitions(int n_1, int n_2,..., int n_k)
that returns a list of arrays of all set partitions of the set
{0,...,n_1+n_2+...+n_k-1} into k blocks (one per argument) of sizes, in this
order, n_1,n_2,...,n_k (e.g. n_1=2, n_2=1, n_3=1 -> ({0,1},{3},{2}),...).
Here is a usage example:
int[] partition = createOrderedPartitions(2,1,1).get(0);
partition[0]; // -> 0
partition[1]; // -> 1
partition[2]; // -> 3
partition[3]; // -> 2
Note that the number of elements in the list is
(n_1+n_2+...+n_k choose n_1) * (n_2+n_3+...+n_k choose n_2) * ... *
(n_k choose n_k). Also, createOrderedPartitions(1,1,1) would create the
permutations of {0,1,2} and thus there would be 3! = 6 elements in the
list.
* by efficient I mean that you should not initially create a bigger list
like all partitions and then filter out results. You should do it directly.
Extra Requirements
If an argument is 0 treat it as if it was not there, e.g.
createOrderedPartitions(2,0,1,1) should yield the same result as
createOrderedPartitions(2,1,1). But at least one argument must not be 0.
Of course all arguments must be >= 0.
Remarks
The provided pseudo code is quasi Java but the language of the solution
doesn't matter. In fact, as long as the solution is fairly general and can
be reproduced in other languages it is ideal.
Actually, even better would be a return type of List<Tuple<Set>> (e.g. when
creating such a function in Python). However, then the arguments which have
a value of 0 must not be ignored. createOrderedPartitions(2,0,2) would then
create
[({0,1},{},{2,3}),({0,2},{},{1,3}),({0,3},{},{1,2}),({1,2},{},{0,3}),...]
Background
I need this function to make my mastermind-variation bot more efficient and
most of all the code more "beautiful". Take a look at the filterCandidates
function in my source code. There are unnecessary
/ duplicate queries because I'm simply using permutations instead of
specifically ordered partitions. Also, I'm just interested in how to write
this function.
My ideas for (ugly) "solutions"
Create the powerset of {0,...,n_1+...+n_k-1}, keep only the subsets of size
n_1, n_2 etc. and create the cartesian product of these subsets. However
this won't actually work because there would be duplicates, e.g.
({1,2},{1})...
First choose n_1 of x = {0,...,n_1+n_2+...+n_k-1} and put them in the
first set. Then choose n_2 of x without the n_1 chosen elements
beforehand and so on. You then get for example ({0,2},{},{1,3},{4}). Of
course, every possible combination must be created so ({0,4},{},{1,3},{2}),
too, and so on. Seems rather hard to implement but might be possible.
Research
I guess this
goes in the direction I want however I don't see how I can utilize it for my
specific scenario.
http://rosettacode.org/wiki/Combinations
You know, it often helps to phrase your thoughts in order to come up with a solution. It seems that then the subconscious just starts working on the task and notifies you when it found the solution. So here is the solution to my problem in Python:
from itertools import combinations

def partitions(*args):
    def helper(s, *args):
        # No block sizes left: exactly one way to finish, the empty partition.
        if not args:
            return [[]]
        res = []
        for c in combinations(s, args[0]):
            # Elements still available after choosing block c.
            s0 = [x for x in s if x not in c]
            for r in helper(s0, *args[1:]):
                res.append([c] + r)
        return res
    s = range(sum(args))
    return helper(s, *args)

print(partitions(2, 0, 2))
The output is:
[[(0, 1), (), (2, 3)], [(0, 2), (), (1, 3)], [(0, 3), (), (1, 2)], [(1, 2), (), (0, 3)], [(1, 3), (), (0, 2)], [(2, 3), (), (0, 1)]]
It is adequate for translating the algorithm to Lua/Java. It is basically the second idea I had.
The Algorithm
As I already mentioned in the question, the basic idea is as follows:
First choose n_1 elements of the set s := {0,...,n_1+n_2+...+n_k-1} and put them in the
first set of the first tuple in the resulting list (e.g. [({0,1,2},... if the chosen elements are 0,1,2). Then choose n_2 elements of the set s_0 := s without the n_1 previously chosen elements, and so on. One such tuple might be ({0,2},{},{1,3},{4}). Of
course, every possible combination is created, so ({0,4},{},{1,3},{2}) is another such tuple, and so on.
The Realization
At first the set to work with is created (s = range(sum(args))). Then this set and the arguments are passed to the recursive helper function helper.
helper does one of the following things: If all the arguments are processed return "some kind of empty value" to stop the recursion. Otherwise iterate through all the combinations of the passed set s of the length args[0] (the first argument after s in helper). In each iteration create the set s0 := s without the elements in c (the elements in c are the chosen elements from s), which is then used for the recursive call of helper.
So what happens with the arguments in helper is that they are processed one by one. helper may first start with helper([0,1,2,3], 2, 1, 1) and in the next invocation it is for example helper([2,3], 1, 1) and then helper([3], 1) and lastly helper([]). Of course another "tree-path" would be helper([0,1,2,3], 2, 1, 1), helper([1,2], 1, 1), helper([2], 1), helper([]). All these "tree-paths" are created and thus the required solution is generated.
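The same recursion can also be written lazily, which avoids building the whole list in memory. The following is a sketch of that variation and not the code I actually use: the names `ordered_partitions` and `expected_count` are made up for this example, and blocks are returned as frozensets to match the List<Tuple<Set>> shape mentioned in the question. The second function evaluates the product of binomial coefficients from the question so the number of results can be checked.

```python
from itertools import combinations
from math import comb

def ordered_partitions(*sizes):
    """Lazily yield tuples of frozensets whose sizes follow `sizes`, in order."""
    def helper(remaining, sizes):
        if not sizes:
            yield ()  # nothing left to choose: one empty tail
            return
        for block in combinations(remaining, sizes[0]):
            chosen = set(block)
            rest = [x for x in remaining if x not in chosen]
            for tail in helper(rest, sizes[1:]):
                yield (frozenset(block),) + tail
    return helper(list(range(sum(sizes))), sizes)

def expected_count(*sizes):
    """C(n_1+...+n_k, n_1) * C(n_2+...+n_k, n_2) * ... * C(n_k, n_k)."""
    total, left = 1, sum(sizes)
    for n in sizes:
        total *= comb(left, n)
        left -= n
    return total

results = list(ordered_partitions(2, 0, 2))
print(len(results), expected_count(2, 0, 2))  # 6 6
print(results[0])  # (frozenset({0, 1}), frozenset(), frozenset({2, 3}))
```

Arguments equal to 0 simply produce an empty block here, which is the behaviour asked for in the set-based variant of the question.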
Let's say I have a sequence of values (e.g., 3, 5, 8, 12, 15) and I want to occasionally decrease all of them by a certain value.
If I store them as the sequence (0, 2, 3, 4, 3) and keep a variable as a base of 3, I now only have to change the base (and check the first items) whenever I want to decrease them instead of actually going over all the values.
I know there's an official term for this, but when I literally translate from my native language to English it doesn't come out right.
Differential Coding / Delta Encoding?
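Read as delta encoding, the example from the question works out as in the sketch below: each stored number is the gap to the previous value, and the original values come back as a running sum on top of the base. The function names `encode` and `decode` are made up for this illustration, and the input is assumed to be non-empty.

```python
from itertools import accumulate

def encode(values):
    """Split a non-empty sequence into a base and the gaps between neighbours."""
    base = values[0]
    deltas = [0] + [b - a for a, b in zip(values, values[1:])]
    return base, deltas

def decode(base, deltas):
    """Rebuild the values: base plus the running sum of the deltas."""
    return [base + s for s in accumulate(deltas)]

base, deltas = encode([3, 5, 8, 12, 15])
print(base, deltas)              # 3 [0, 2, 3, 4, 3]
print(decode(base, deltas))      # [3, 5, 8, 12, 15]

# Decreasing every value by 2 only touches the base.
print(decode(base - 2, deltas))  # [1, 3, 6, 10, 13]
```

The trade-off is that reading a single element requires summing all deltas before it, whereas shifting every value is a constant-time change to the base.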
I don't know a name for the data structure, but it's basically just base+offset :-)
An offset?
If I understand your question right, you're rebasing. That's normally used in reference to patching up addresses in DLLs from a load address.
I'm not sure that's what you're doing, because your example seems to be incorrect. In order to come out with { 3, 5, 8, 12, 15 }, with a base of 3, you'd need { 0, 2, 5, 9, 12 }.
I'm not sure. If you imagine your first array as providing the results of some function of an index value f(i) where f(0) is 3, f(1) is 5, and so forth, then your second array is describing the function f'(i) where f(i+1) = f(i) + f'(i) given f(0) = 3.
I'd call it something like a derivative function, where the process of retrieving your original data is simply the summation function.
What will happen more often, will you be changing f(0) or retrieving values from f(i)? Is this technique rooted in a desire to optimize?
Perhaps you're looking for a term like "Inductive Sequence" or "Induction Sequence." (I just made that up.)