I would like to crossbreed two monsters in my game to make a simple hybrid. Each monster is a 2D image, which I could implement as a composite sprite (so I know more about each body part). The problem is that not all monsters are similar types; not all of them are humanoid or any kind of animal. For example, if we have a lion with 4 legs and a spider with 8 (and, say, the spider genes dominate), the result could be an 8-legged lion with a blended color (a hybrid between the two). But if I have a humanoid and a frog, what should the algorithm do? Any idea or any useful algorithm that could help me?
You could implement a simple genetic system. E.g. represent each creature by an array of values instructing your script what to do in order to build the character: [add_red_human_torso, add_blue_frog_left_leg, ...]. Then you could take a random mix of the two creatures' arrays and build a new creature from that.
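A minimal sketch of that random mix, assuming each creature's "genome" is just an ordered list of hypothetical part-building instructions (the instruction names and the build step are made up for illustration):

import random

# Hypothetical genomes: ordered lists of part-building instructions.
lion = ["add_yellow_lion_head", "add_yellow_lion_torso",
        "add_yellow_lion_leg", "add_yellow_lion_leg",
        "add_yellow_lion_leg", "add_yellow_lion_leg"]
spider = (["add_black_spider_head", "add_black_spider_torso"]
          + ["add_black_spider_leg"] * 8)

def crossover(parent_a, parent_b):
    """Pick each gene from one of the parents; the extra genes of the longer
    parent (e.g. the spider's additional legs) are inherited with 50% chance,
    which is how a dominant trait like leg count can carry over."""
    child = [random.choice(pair) for pair in zip(parent_a, parent_b)]
    longer = parent_a if len(parent_a) > len(parent_b) else parent_b
    child += [gene for gene in longer[len(child):] if random.random() < 0.5]
    return child

hybrid = crossover(lion, spider)
# build_creature(hybrid)  # your sprite-composition code would interpret the list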
How about giving attributes to each animal's limbs (strength for the lion's legs, jumping for the frog's legs, height for the human torso, camouflage for the spider's color, etc.) and then running a multi-objective optimization algorithm?
I have trained a deep neural network for speaker recognition (trained on 64 different speakers). Next I want to add or delete a speaker from the model. Can anyone help me out with the coding part on how to do it, as I am new to voice recognition? Even any research paper that someone knows of would be helpful.
P.S. If I use a new dataset on the pre-trained model, then I need to train the model again on the new 64 speakers. Considering I just want to add or delete 1 or 2 speakers, how can that be achieved?
One way you can achieve this is to measure similarity (as is often done in speaker verification) instead of using the logits layer you trained on the initial dataset (with 64 speakers).
When feeding an input audio clip to the speaker recognition model, you can take the hidden-layer values right before the logits layer and use them as an utterance-level feature (let's call this the speaker embedding vector).
Let us say that you have a new dataset with M utterances and N speakers (disjoint from the initial training set).
From this dataset, you can extract M embedding vectors using your pre-trained network.
If you average the embedding vectors belonging to the same speaker, you will get N speaker-specific embedding vectors. We will call these enrolled vectors.
Then, to test a new speech sample, you simply extract the embedding vector from the test speech and compare its similarity with the N enrolled vectors (cosine similarity is usually used for speaker verification).
For P test utterances, this will give you a [P x N] matrix. For each test utterance, you can select the speaker with the highest similarity to perform speaker identification.
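A minimal numpy sketch of the enrollment and identification steps above; the embedding extraction itself is assumed to come from your pre-trained network:

import numpy as np

def enroll(embeddings, speaker_ids):
    """Average the M utterance embeddings per speaker -> N enrolled vectors."""
    speakers = sorted(set(speaker_ids))
    ids = np.asarray(speaker_ids)
    enrolled = np.stack([embeddings[ids == s].mean(axis=0) for s in speakers])
    return speakers, enrolled

def identify(test_embeddings, enrolled):
    """Cosine similarity of P test embeddings vs. N enrolled vectors -> [P x N],
    then pick the most similar enrolled speaker for each test utterance."""
    a = test_embeddings / np.linalg.norm(test_embeddings, axis=1, keepdims=True)
    b = enrolled / np.linalg.norm(enrolled, axis=1, keepdims=True)
    similarities = a @ b.T          # [P x N]
    return similarities.argmax(axis=1)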
By doing this, you can perform speaker identification without re-training the network you have trained for speakers not included in the test set.
If you wish to learn some classical/popular methods used for speaker recognition, you can check the following paper out:
J. H. L. Hansen and T. Hasan, "Speaker recognition by machines and humans: a tutorial review," IEEE Signal Processing Magazine, vol. 32, no. 6, 2015.
I am new to deep learning and Semantic segmentation.
I have a dataset of medical images (CT) in DICOM format, in which I need to segment tumours and the involved organs. I have labelled organs contoured by our physician, which we call the RT structure, also stored in DICOM format.
As far as I know, people usually use a "mask". Does that mean I need to convert all the contoured structures in the RT structure to masks, or can I use the information from the RT structure (.dcm) directly as my input?
Thanks for your help.
There is a special library called pydicom that you need to install before you can actually decode and later visualise the DICOM images.
Now, since you want to apply semantic segmentation and you want to segment the tumours, the solution to this is to create a neural network which accepts as input a pair of [image,mask], where, say, all the locations in the mask are 0 except for the zones where the tumour is, which are marked with 1; practically your ground truth is the mask.
Of course, for this you will have to implement your own CustomDataGenerator() which must yield at every step a batch of [image, mask] pairs as stated above.
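So yes, the usual workflow is to rasterise the RT structure contours into binary masks. A rough sketch, assuming the RTSTRUCT and the CT slice share the same frame of reference (the file names are placeholders, and slice matching / ROI selection are heavily simplified):

import numpy as np
import pydicom
from skimage.draw import polygon

ct = pydicom.dcmread("ct_slice.dcm")          # placeholder paths
rt = pydicom.dcmread("rtstruct.dcm")

mask = np.zeros((ct.Rows, ct.Columns), dtype=np.uint8)
origin = np.array(ct.ImagePositionPatient, dtype=float)    # patient coords of pixel (0, 0)
spacing = np.array(ct.PixelSpacing, dtype=float)           # [row, col] spacing in mm

# Take the first ROI as an example; in practice pick the tumour ROI by name via
# rt.StructureSetROISequence and match each contour to its slice using the z coordinate.
for contour in rt.ROIContourSequence[0].ContourSequence:
    pts = np.array(contour.ContourData, dtype=float).reshape(-1, 3)   # x, y, z in mm
    cols = (pts[:, 0] - origin[0]) / spacing[1]
    rows = (pts[:, 1] - origin[1]) / spacing[0]
    rr, cc = polygon(rows, cols, mask.shape)
    mask[rr, cc] = 1

image = ct.pixel_array   # (image, mask) is now one training pair for the network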
I have 3 different datasets; all of them are blood smear images stained with the same chemical substance. Blood smear images are images of your blood, which include the red and white blood cells.
The first dataset contains 2 classes: normal vs blood cancer
The second dataset contains 2 classes: normal vs blood infection
The third dataset contains 2 classes: normal vs sickle cell disease
So, what I want to do is: when I input a blood smear image, the AI system will tell me whether it is normal, blood cancer, blood infection, or sickle cell disease (a 4-class classification task).
What should I do?
Should I mix these 3 datasets and train only 1 model to detect the 4 classes?
Or should I train 3 different models and then combine them? If yes, what method should I use to combine them?
Update: I searched for a while. Can this task be called "learning without forgetting"?
I think it depends on the data.
You may use three different models and make three binary predictions on each image. That way you get a vote (probability) for each "x vs. normal". If the binary classifications are accurate, this should deliver okay results, but you do accumulate the misclassification errors of the individual models this way.
If you can afford it, you can train a four-class model and compare its test error to that of the series of binary classifications. I understand that you already have three models, so training one more may not be too expensive.
If ONLY one of the classes can occur, a four-class model might be the way to go. If in fact two (or more) classes can occur jointly, a series of binary classifications would make sense.
As @Peter said, it is totally data dependent. If the images of the 4 classes (normal, blood cancer, blood infection, sickle cell disease) are easily distinguishable with the naked eye and there is no scope for confusion among the classes, then you should simply go for 1 model which outputs probabilities for all 4 classes (as mentioned by @maxi marufo). If there is confusion between classes and the images are NOT distinguishable with the naked eye, then you should use the 3 different models and combine their outputs. You simply get the predicted probabilities from all 3 models, say p1(normal) and p1(c1), p2(normal) and p2(c2), p3(normal) and p3(c3). Now you can average p1(normal), p2(normal), p3(normal) and then apply a softmax over p(normal), p1(c1), p2(c2), p3(c3). Out of the many ways you could try, the above is one.
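A small numpy sketch of that combination scheme, assuming each binary model returns [p(normal), p(disease)] for an image (the model objects and their predict calls are placeholders):

import numpy as np

def combine(p1, p2, p3):
    """p1, p2, p3: [p(normal), p(disease)] from the cancer, infection and
    sickle-cell models. Returns probabilities for the 4 classes."""
    p_normal = np.mean([p1[0], p2[0], p3[0]])
    scores = np.array([p_normal, p1[1], p2[1], p3[1]])
    return np.exp(scores) / np.exp(scores).sum()   # softmax over [normal, c1, c2, c3]

# e.g. combine(cancer_model.predict(x), infection_model.predict(x), sickle_model.predict(x))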
This is a multiclass classification problem. You can train just one model, with the final layer being a fully connected (dense) layer of 4 units (i.e. the output dimension) and a softmax activation function.
I'm trying to develop a model to recognize new gestures with the Myo Armband. (It's an armband that possesses 8 electrical sensors and can recognize 5 hand gestures). I'd like to record the sensors' raw data for a new gesture and feed it to a model so it can recognize it.
I'm new to machine/deep learning and I'm using CNTK. I'm wondering what would be the best way to do it.
I'm struggling to understand how to create the trainer. For the input data, I'm thinking about using 20 sets of these 8 raw values (they're between -127 and 127), so one label corresponds to 20 sets of values.
I don't really know how to do that; I've seen tutorials where images are linked with their labels, but it's not the same idea. And even after the training is done, how can I prevent the model from recognizing this one gesture no matter what I do, since it's the only one it has been trained for?
An easy way to get you started would be to create 161 columns (8 columns for each of the 20 time steps + the designated label). You would rearrange the columns like
emg1_t01, emg2_t01, emg3_t01, ..., emg8_t20, gesture_id
This will give you the right 2D format to use different algorithms in sklearn as well as a feed-forward neural network in CNTK. You would use the first 160 columns to predict the 161st.
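A small sketch of that flattening step, assuming your recordings are already windowed into an array of shape (samples, 20 time steps, 8 sensors); the data here is random placeholder data:

import numpy as np
from sklearn.ensemble import RandomForestClassifier

raw_windows = np.random.randint(-127, 128, size=(100, 20, 8))   # placeholder EMG windows
labels = np.random.randint(0, 2, size=100)                      # placeholder gesture ids

X = raw_windows.reshape(len(raw_windows), -1)    # (samples, 160) -> the first 160 columns
clf = RandomForestClassifier().fit(X, labels)    # any sklearn classifier works on this shape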
Once you have that working you can model your data to better represent the natural time series order it contains. You would move away from a 2D shape and instead create a 3D array to represent your data.
The first axis shows the number of samples
The second axis shows the number of time steps (20)
The third axis shows the number of sensors (8)
With this shape you're all set to use a 1D convolutional model (CNN) in CNTK that traverses the time axis to learn local patterns from one step to the next.
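As an illustration of that 3D layout, here is a minimal 1D-CNN sketch written in Keras rather than CNTK (purely because the API is compact; the structure carries over, and the layer sizes are arbitrary):

import numpy as np
from tensorflow import keras

X = np.random.randint(-127, 128, size=(100, 20, 8)).astype("float32")   # placeholder windows
y = np.random.randint(0, 2, size=100)        # e.g. target gesture vs. "rest"/other

model = keras.Sequential([
    keras.layers.Input(shape=(20, 8)),                          # 20 time steps x 8 sensors
    keras.layers.Conv1D(16, kernel_size=3, activation="relu"),  # slides along the time axis
    keras.layers.GlobalMaxPooling1D(),
    keras.layers.Dense(2, activation="softmax"),                # one unit per class
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, epochs=5)

Including a "rest"/other class (or additional gestures) in the training data, as sketched above, is also the usual way to keep the model from labelling everything as the single gesture it was trained on.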
You might also want to look into RNNs which are often used to work with time series data. However, RNNs are sometimes hard to train and a recent paper suggests that CNNs should be the natural starting point to work with sequence data.
I am working on an AI bot for the game Defcon. The game has cities, with varying populations, and defensive structures with limited range. I'm trying to work out a good algorithm for placing defence towers.
Cities with higher populations are more important to defend
Losing a defence tower is a blow, so towers should be placed reasonably close together
Towers and cities can only be placed on land
So, with these three rules, we see that the best kind of placement is towers placed in a ring around the largest population areas (although I don't want an algorithm that just blindly places a ring around the highest-population area; sometimes there might be 2 sets of cities far apart, in which case the algorithm should make 2 rings, each with half of my total towers).
I'm wondering what kind of algorithms might be used for determining placement of towers?
I would define a function that determines the value of a tower placed at a given position, then search for maxima of that function and place a tower there.
A sketch for the function could look like this:
# is_water() and distance() are assumed game helpers
def tower_value(pos, cities, towers, f):
    if is_water(pos):
        return 0
    popsum = sum(c.population / distance(pos, c.pos) for c in cities)   # it's better to have towers close by
    towersum = -sum(1 / distance(pos, t.pos) for t in towers)           # you want your towers spread somewhat evenly
    return popsum + towersum * f  # f adjusts the relative importance of spreading towers equally vs. protecting the population centers
That should give you a reasonable algorithm to start with. As an improvement, you might change the 1/distance function to something different to get a faster or slower drop-off.
I'd start by implementing a fitness function that calculates the expected protection provided by a set of towers on a given map.
You'd calculate the amount of population inside the "protected" area, where areas covered by two towers are rated a bit higher than areas covered by only one tower (the exact scaling factor depends a lot on the game mechanics, though).
Then you could use a genetic algorithm to experiment with different sets of placements and let that run for several (hundred?) iterations.
If your fitness function is a good proxy for the real quality of the placement and your implementation of the genetic algorithm is correct, you should get a reasonable result.
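A compressed sketch of that loop, assuming a hypothetical fitness(placements) function that scores the protected population and a fixed number of towers (selection and mutation are deliberately naive here):

import random

def evolve_placements(land_positions, fitness, num_towers=10,
                      pop_size=50, generations=300, mutation_rate=0.1):
    """Genetic search over sets of tower positions.

    land_positions: candidate (x, y) build spots on land
    fitness: maps a list of tower positions -> expected protection score
    """
    population = [random.sample(land_positions, num_towers) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]                # keep the better half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, num_towers)
            child = a[:cut] + b[cut:]                   # one-point crossover
            if random.random() < mutation_rate:         # occasionally relocate one tower
                child[random.randrange(num_towers)] = random.choice(land_positions)
            children.append(child)
        population = parents + children
    return max(population, key=fitness)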
And once you've done all that, you can start developing an attack planner that tries to maximize casualties for any given set of defense tower placements. Once you have that, you can set the two populations against each other and reach even better defense plans this way (that is one of the basic ideas of artificial life).
I don't know the game, but from your description it seems that you need an algorithm similar to the one for solving the (weighted) k-centers problem. Unfortunately, this is an NP-hard problem, so in the best case you'll get an approximation upper-bounded by some factor.
Take a look here: http://algo2.iti.kit.edu/vanstee/courses/kcenter.pdf
Just define a utility function that takes a potential build position as input and returns a "rating" for that position. I imagine it would look something like:
utility(position p) = k1 * population_of_city_at_p +
k2 * new_area_covered_if_placed_at_p +
k3 * number_of_nearby_defences
(k1, k2, and k3 are arbitrary constants that you'll need to tune)
Then, just randomly sample a bunch of different points p and choose the one with the highest utility.
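A minimal sketch of that sampling loop; the sub-scores behind the utility are stubbed out as hypothetical game-specific helpers:

import random

# Placeholder game hooks -- in the real bot these would query the map state.
def population_of_city_at(p): return 0.0
def new_area_covered_if_placed_at(p): return 0.0
def number_of_nearby_defences(p): return 0.0
def random_land_point(): return (random.uniform(0, 100), random.uniform(0, 100))

K1, K2, K3 = 1.0, 0.5, 0.2   # weights you will need to tune

def utility(p):
    return (K1 * population_of_city_at(p)
            + K2 * new_area_covered_if_placed_at(p)
            + K3 * number_of_nearby_defences(p))

def pick_build_position(samples=500):
    candidates = [random_land_point() for _ in range(samples)]
    return max(candidates, key=utility)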