Defining state and action for Q-learning in the code - deep-learning

I am trying to understand the following code for the simulator to avoid collision with the help of Q-learning. The examples and tutorials which I followed had the space divided into blocks such as taxiv3, so it was rather easier to define state and action spaces. But in this code, the simulator produces random sectors and I am trying to figure out a way to define the state space and action space.
Please let me know if you need more information on my question if it is too vague.
https://github.com/ramondalmau/atcenv/blob/main/atcenv/env.py

Related

do i always need labeled data for ann?

i put a small request on upwork where i am requesting help for a topic which is right now out of my skill zone.
The problem is a fitting problem of small rectangles in a big rectangle via a ANN.
Problem is the first freelancer baffled me a little bit with a comment.
So my thinking was, because the solution is easy verified and rewardable, that you can simply throw a ANN on this problem and with enough time it will perform better and better.
The freelancer requested labeled data first before he can tackle the problem(thats the comment which confuses me).
I was thinking that unlabeled random Input data is enough for the start.
Do I think wrong?
here the link to the job post.
https://www.upwork.com/jobs/~01e040711c31ac0979
edit: directly the original job description
I want python code for training a ANN and using it in a productive enviroment.
The problem it needs to solve is a rectangle fitting problem.
Input are
1000 small Rectangles(groupid,width,heigth,Oriantion(free,restricted,hor or ver), value) --sRect
1 big Rectangles(width, heigth)--bRect
Layout(bool,bool,bool,xpos,ypos,Oriantaion(hor or ver))--Layout
Output
Layout
The bRect will be duplicated to 3 Rectangles where the sRects need to be fitted into.
The Worth of the solution is determined by the sum of the value of sRect inside the bRect.
Further is the value decreased if the sRect is placed in the second bRect or third bRect.
sum(sRect(value))*0.98^nth bRect
Not all sRect needs to be placed.
Layout is structered that the three bool at the start represent at which bRect the sRect is placed. If a sRect is placed at one of the bRect, then the Solution Layout muss stay for this sRect the same.
Restricted Ori means all of the sRect with the same group need to be Oriantated the same way. Hor means the sRect is not turned, ver the sRect is turned by 90degrees.
Other then that normal rules apply, like all sRect needs to be inside the bRect and not Overlapp between sRect.
Looking forward to replys and i am avaible for further explanations.
edit: example picture
important i dont want to optimise for maximum plate usage, because it can happen that a smaller sRect can have a higher value then a bigger sRect.
example fitting problem
Without expected output for each input you cannot use the most standard training methodology - supervised learning. If you only have a way to verify the solution (e.g. in a game of chess you can tell me if I won but you cant tell me how to win) then the most standard approach is reinforcement learning. That being said, it is much more complex problem, not something that say a newcomer to the field of ML will be capable of doing (while supervised learning is something that one can do essentially by following basic tutorials online)

How to handle changing input element numbers and multiple action in Reinforcement Learning?

Hi Respected group members. I have query related to RL. Please help me in pointing me to the right direction. I am fairly new to RL and hence my question may sound silly so please bear with me.
Suppose e.g. task is to arrange n elements on a canvas. Action that can be applied on each element is two dimensional [move up/down, move left/right]. Agent has time limit to finish the task and once time is up it will be given reward if arrangement is right. Next task again will be same but number of elements and canvas dimensions can change. How to handle this scenario using RL as number of actions will change as number of elements will change from one task to another
One method you could consider depending on the detail of your game. If each element has the same goal and same actions you could train an agent that solves for a single element getting to the goal. Once trained you can add more elements and pass each element through the network to get an action for each element.
We have implemented something very similar. The beauty of it is that you only have to train with one element making it much quicker. Also once trained you can have any number of elements and the agent will be able to solve it as easily as if there was one element. All depends on the detail of your game and what you want to achieve.

mxgraph - selfloop and context-dependant shape

I'm currently modifying the editor tool to accomplish some simple tasks in a toy project. My knowledge of the library is quite limited and I've found it really hard to adjust the code for my needs. Thus, I'm writing here in the forum in search for help.
I need to implement self loops. Despite setting correctly the "allowselfloop" flag in the code, the editor still does not accept self-loops. Is there a specific method to be overriden? Where? How?
Another important feature I need to implement is the context-dependant shape creation: when an arc is stroke from a shape the editor automatically create a shape identical to the source of the arc itself. How it is possible to generate different shape depending on the arc source (or other parameters)? Is there a (series of) specific method(s) which must be overriden?
Any help is really appreciated. Thanks in advance.
your first question I don't know,
the second, try catching the event create_cell or cell_created or something like that.
or check out the "contexticons" example, the addGestureListeners function can have more arguments to capture the "mouse up" where you create the cell manually

Vector graphics flood fill algorithms?

I am working on a simple drawing application, and i need an algorithm to make flood fills.
The user workflow will look like this (similar to Flash CS, just more simpler):
the user draws straight lines on the workspace. These are treated as vectors, and can be selected and moved after they are drawn.
user selects the fill tool, and clicks on the drawing area. If the area is surrounded by lines in every direction a fill is applied to the area.
if the lines are moved after the fill is applied, the area of fill is changed accordingly.
Anyone has a nice idea, how to implement such algorithm? The main task is basically to determine the line segments surrounding a point. (and storing this information somehow, incase the lines are moved)
EDIT: an explanation image: (there can be other lines of course in the canvas, that do not matter for the fill algorithm)
EDIT2: a more difficult situation:
EDIT3: I have found a way to fill polygons with holes http://alienryderflex.com/polygon_fill/ , now the main question is, how do i find my polygons?
You're looking for a point location algorithm. It's not overly complex, but it's not simple enough to explain here. There's a good chapter on it in this book: http://www.cs.uu.nl/geobook/
When I get home I'll get my copy of the book and see if I can try anyway. There's just a lot of details you need to know about. It all boils down to building a DCEL of the input and maintain a datastructure as lines are added or removed. Any query with a mouse coord will simply return an inner halfedge of the component, and those in particular contain pointers to all of the inner components, which is exactly what you're asking for.
One thing though, is that you need to know the intersections in the input (because you cannot build the trapezoidal map if you have intersecting lines) , and if you can get away with it (i.e. input is few enough segments) I strongly suggest that you just use the naive O(n²) algorithm (simple, codeable and testable in less than 1 hour). The O(n log n) algorithm takes a few days to code and use a clever and very non-trivial data structure for the status. It is however also mentioned in the book, so if you feel up to the task you have 2 reasons to buy it. It is a really good book on geometric problems in general, so for that reason alone any programmer with interest in algorithms and datastructures should have a copy.
Try this:
http://keith-hair.net/blog/2008/08/04/find-intersection-point-of-two-lines-in-as3/
The function returns the intersection (if any) between two lines in ActionScript. You'll need to loop through all your lines against each other to get all of them.
Of course the order of the points will be significant if you're planning on filling them - that could be harder!
With ActionScript you can use beginFill and endFill, e.g.
pen_mc.beginFill(0x000000,100);
pen_mc.lineTo(400,100);
pen_mc.lineTo(400,200);
pen_mc.lineTo(300,200);
pen_mc.lineTo(300,100);
pen_mc.endFill();
http://www.actionscript.org/resources/articles/212/1/Dynamic-Drawing-Using-ActionScript/Page1.html
Flash CS4 also introduces support for paths:
http://www.flashandmath.com/basic/drawpathCS4/index.html
If you want to get crazy and code your own flood fill then Wikipedia has a decent primer, but I think that would be reinventing the atom for these purposes.

Optical character recognition

Hey everyone,
I'm trying to create a program in Java that can read numbers of the screen, and also recognise images on the screen. I was wondering how i can achieve this?
The font of the numbers will always be the same. I have never programmed anything like this before, but my idea of how it works is to have the program take a screenshot, then overlay the image of the numbers with the section of the screenshot image and check if they match, repeating this for each numbers. If this is the correct way to do this, how would i put that in code.
Thanks in advance for any help.
You could always train a neural net to do it for you. They can get pretty accurate sometimes. If you use something like Matlab it actually has capabilities for that already. Apparently there's a neural network library for java (http://neuroph.sourceforge.net/) although I've never used it personally.
Here's a tutorial about using neuroph: http://www.certpal.com/blogs/2010/04/java-neural-networks-and-neuroph-a-tutorial/
You can use a neural network, support vector machine, or other machine learning construct for this. But it will not do the entire job. If you do a screen shot, you are going to be left with a very large image that you will need to find the individual characters on. You also need to deal with the fact that the camera might not be pointed straight at the text that you want to read. You will likely need to use a series of algorithms to lock onto the right parts of the image and then downsample it in a way that size becomes neutral.
Here is a simple Java applet I wrote that does some of this.
http://www.heatonresearch.com/articles/42/page1.html
It lets you draw on a relatively large area and locks in on your char. Then it recognizes it. I am using the alphabet, but digits should be easier. The complete Java source code is included.
One simpler approach could be to use template matching. If the fonts are same, and/or the size (in pixels)is known, then simple template matching can do the job for you. ifsize of input is unknown, you might have to create copies of images at different scales and do the matching at each scale.
One with the extreme value(highest or lowest depending on the method you follow for template matching) is your result.
Follow this link for details