How are convolutional networks used in AlphaGo?

How are convolutional networks used in AlphaGo? - deep-learning

If I understand go game correctly, there is a board of 19x19. In the AlphaGo Nature paper, http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html, it mentioned convolutional network. My understanding of convolutional networks are examples in image recognitions. Then how could a convolutional network be applied to this problem? Isn't it an overkill to transform the board into a 19x19 image?

Go is influenced a lot by patterns, and as you might have noticed in image classification, convolutional networks are good at those.
You ask if it is an overkill to change a go board into a 19*19 image, i have to admit i have not tried to create an image of it with say 0 for black stone, 0.5 for no stone and 1 for a white stone and train a network with it but i am pretty sure it will work to some extend.
Things are more extreme than that! the 19*19 go board is converted into a 19*19*48 input tensor. (as an rgb image it would only be 19*19*3)
one plane for black stones
one plane for white stones
one plane for empty plaves
and 45 other planes encoding several values whom are helpful for the network to know. (things like, liberties, atari, liberties after move, they are all in the paper but you have to know a little more about go to understand them)
is this an overkill, definitely not! convolutional networks are good at recognizing patterns but they need the right information to do so. for example a ladder is impossible for this network to detect as it is not possible to get that information from one side of the board to the other one and back within the 13convolutional layers used, so some of the 48input planes are used to tell the network if a certain move is a ladder capture or a ladder escape move.

Related

Keypoint detection when target appears multiple times

I am implementing a keypoint detection algorithm to recognize biomedical landmarks on images. I only have one type of landmark to detect. But in a single image, 1-10 of these landmarks can be present. I'm wondering what's the best way to organize the ground truth to maximize learning.
I considered creating 10 landmark coordinates per image and associate them with flags that are either 0 (not present) or 1 (present). But this doesn't seem ideal. Since the multiple landmarks in a single picture are actually the same type of biomedical element, the neural network shouldn't be trying to learn them as separate entities.
Any suggestions?

One landmark that can appear everywhere sounds like a typical CNN problem. Your CNN filters should learn which features make up the landmark, but they don't care where it appears. That would be the responsibility of the next layers. Hence, for training the CNN layers you can use a monochrome image as the target: 1 is "landmark at this pixel", 0 if not.
The next layers are basically processing the CNN-detected features. To train those, your ground truth should be basically the desired outcome. Do you just need a binary output (count>0)? A somewhat accurate estimate of the count? Coordinates? Orientation? NN's don't care that much what they learn, so just give it in training what it should produce in inference.

Unet Segmentation Model

I am trying to solve Baby detection with unet segmentation model. I already collected baby images, baby segments and also passed the adult images as negative (created black segments for this).
So if I will do in this way is unet model can differentiate the adults and babies? if not what I have to do next?

It really depends on your dataset.
During the training, Unet will try to learn specific features in the images, such as baby's shape, body size, color, etc. If your dataset is good enough (e.g. contains lots of babies examples and lots of adults with a separate color and the image dimensions are not that high) then You probably won't have any problems at all.
There is a possibility however, that your model misses some babies or adults in an image. To tackle this issue, There are a couple of things you can do:
Add Data Augmentation techniques during the training (e.g. random crop, padding, brightness, contrast, etc.)
You can make your model stronger by replacing Unet model with a new approach, such as Unet++ or Unet3+. According to Unet3+ paper, it seems that it is able to outperform both Unet & Unet++ in medical image segmentation tasks:
https://arxiv.org/ftp/arxiv/papers/2004/2004.08790.pdf
Also, I have found this repository, which contains a clean implementation of Unet3+, which might help you get started:
https://github.com/kochlisGit/Unet3-Plus

Dealing with false positives in binary image segmentation

I'm working on a model to identify bodies of water in satellite imagery. I've modified this example a bit to use a set of ~600 images I've labeled, and it's working quite well for true positives - it produces an accurate mask for imagery tiles with water in it. However, it produces some false-positives as well, generating masks for tiles that have no water in them whatsoever - tiles containing fields, buildings or parking lots, for instance. I'm not sure how to provide this sort of negative feedback to the model - adding false-positive images to the training set with an empty mask is having no effect, and I tried a training set made up of only false-positives, which just produces random noise, making me think that empty masks have no effect on this particular network.
I also tried training a binary classification network from a couple of examples I found to classify tiles as water/notwater. It doesn't seem to be working with a good enough accuracy to use a first-pass filter, with about 5k images per class. I used OSM label-maker for this, and the image sets aren't perfect - there are some water tiles in the non-water set and vice-versa, but even the training set isn't getting good accuracy (~.85 at best).
Is there a way to provide negative feedback to the binary image segmentation model? Should I use a larger training set? I'm kinda stuck here without an ability to provide negative feedback, and would appreciate any pointers on how to handle this.
Thanks!

Are the ,middle layers in Resnet even learning?

The skip connections allows our gradient all the way from 152nd layers and feed through the initial 1st or 2nd layers of the CNN. But what about the middle layers? Backpropagations in these middle layers are totally irrelevant so are we even learning in resnet?

Backpropagation in these middle layers aren't totally irrelevant. The basic idea of the relevance of the middle layers is that ResNet keeps improving its error-rate when adding new layers (from 5.71 top5 error with 34 layer to 4.49 top5 error with 152). Images have a lot of singularities and complexities, and the folks at Microsoft found out that, when you take care of the vanishing gradient problem (with the feed through) you can gain more knowledge throughout the network with more layers.
The ideia of adding the residual block, it's to prevent the vanishing gradient problem, when you are getting too many layers... But the middle layers are also updated on each training step, and they are also learning (usually high-level features).
Convolutional Neural Networks with lots of layers tend to overfit if the problem isn't too much complex, since its 152 layers have a capacity of learning a lot of different patterns.

Angular Momentum Transfer equations

Does anyone have any good references for equations which can be implemented relatively easily for how to compute the transfer of angular momentum between two rigid bodies?
I've been searching for this sort of thing for a while, and I haven't found any particularly comprehensible explanations of the problem.
To be precise, the question comes about as this; two rigid bodies are moving on a frictionless (well, nearly) surface; think of it as air hockey. The two rigid bodies come into contact, and then move away. Now, without considering angular momentum, the equations are relatively simple; the problem becomes, what happens with the transfer of angular momentum between the bodies?
As an example, assume the two bodies have no angular momentum whatsoever; they're not rotating. When they interact at an oblique angle (vector of travel does not align with the line of their centers of mass), obviously a certain amount of their momentum gets transferred into angular momentum (i.e. they each get a certain amount of spin), but how much and what are the equations for such?
This can probably be solved by using a many-body rigid system to calculate, but I want to get a much more optimized calculation going, so I can calculate this stuff in real-time. Does anyone have any ideas on the equations, or pointers to open-source implementations of these calculations for inclusion in a project? To be precise, I need this to be a rather well-optimized calculation, because of the number of interactions that need to be simulated within a single "tick" of the simulation.
Edit: Okay, it looks like there's not a lot of precise information about this topic out there. And I find the "Physics for Programmers" type of books to be a bit too... dumbed down to really get; I don't want code implementation of an algorithm; I want to figure out (or at least have sketched out for me) the algorithm. Only in that way can I properly optimize it for my needs. Does anyone have any mathematic references on this sort of topic?

If you're interested in rotating non-spherical bodies then http://www.myphysicslab.com/collision.html shows how to do it. The asymmetry of the bodies means that the normal contact force during the collision can create a torque about their respective CGs, and thus cause the bodies to start spinning.
In the case of a billiard ball or air hockey puck, things are a bit more subtle. Since the body is spherical/circular, the normal force is always right through the CG, so there's no torque. However, the normal force is not the only force. There's also a friction force that is tangential to the contact normal which will create a torque about the CG. The magnitude of the friction force is proportional to the normal force and the coefficient of friction, and opposite the direction of relative motion. Its direction is opposing the relative motion of the objects at their contact point.

Well, my favorite physics book is Halliday and Resnick. I never ever feel like that book is dumbing down anything for me (the dumb is inside the skull, not on the page...).
If you set up a thought problem, you can start to get a feeling for how this would play out.
Imagine that your two rigid air hockey pucks are frictionless on the bottom but have a maximal coefficient of friction around the edges. Clearly, if the two pucks head towards each other with identical kinetic energy, they will collide perfectly elastically and head back in opposite directions.
However, if their centers are offset by 2*radius - epsilon, they'll just barely touch at one point on the perimeter. If they had an incredibly high coefficient of friction around the edge, you can imagine that all of their energy would be transferred into rotation. There would have to be a separation after the impact, of course, or they'd immediately stop their own rotations as they stuck together.
So, if you're just looking for something plausible and interesting looking (ala game physics), I'd say that you could normalize the coefficient of friction to account for the tiny contact area between the two bodies (pick something that looks interesting) and use the sin of the angle between the path of the bodies and the impact point. Straight on, you'd get a bounce, 45 degrees would give you bounce and spin, 90 degrees offset would give you maximal spin and least bounce.
Obviously, none of the above is an accurate simulation. It should be a simple enough framework to cause interesting behaviors to happen, though.
EDIT: Okay, I came up with another interesting example that is perhaps more telling.
Imagine a single disk (as above) moving towards a motionless, rigid, near one-dimensional pin tip that provides the previous high friction but low stickiness. If the disk passes at a distance that it just kisses the edge, you can imagine that a fraction of its linear energy will be converted to rotational energy.
However, one thing you know for certain is that there is a maximum rotational energy after this touch: the disk cannot end up spinning at such a speed that it's outer edge is moving at a speed higher than the original linear speed. So, if the disk was moving at one meter per second, it can't end up in a situation where its outer edge is moving at more than one meter per second.
So, now that we have a long essay, there are a few straightforward concepts that should aid intuition:
The sine of the angle of the impact will affect the resulting rotation.
The linear energy will determine the maximum possible rotational energy.
A single parameter can simulate the relevant coefficients of friction to the point of being interesting to look at in simulation.

You should have a look at Physics for Game Developers - it's hard to go wrong with an O'Reilly book.

Unless you have an excellent reason for reinventing the wheel,
I'd suggest taking a good look at the source code of some open source physics engines, like Open Dynamics Engine or Bullet. Efficient algorithms in this area are an artform, and the best implementations no doubt are found in the wild, in throroughly peer-reviewed projects like these.

Please have a look at this references!
If you want to go really into Mecanics, this is the way to go, and its the correct and mathematically proper way!
Glocker Ch., Set-Valued Force Laws: Dynamics of Non-Smooth Systems. Lecture Notes in Applied Mechanics 1, Springer Verlag, Berlin, Heidelberg 2001, 222 pages. PDF (Contents, 149 kB)
Pfeiffer F., Glocker Ch., Multibody Dynamics with Unilateral Contacts. JohnWiley & Sons, New York 1996, 317 pages. PDF (Contents, 398 kB)
Glocker Ch., Dynamik von Starrkörpersystemen mit Reibung und Stößen. VDI-Fortschrittberichte Mechanik/Bruchmechanik, Reihe 18, Nr. 182, VDI-Verlag, Düsseldorf, 1995, 220 pages. PDF (4094 kB)

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008