Deep Learning CNN MNIST TensorFlow applied to my own images - deep-learning

I'm new to deep learning and I started with the TensorFlow tutorials (the beginner one and the expert one). In both of them, the data is imported at the beginning with these two lines:
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
I would like to use this neural network on my own images. I have 100,000 images and a fileLabel.txt listing the label of each image, in order, as a column. Is there a way to change these two lines, or a few others, to import my images without breaking all the code? I really don't see how to do that; I have the impression that the mnist structure is specific to the images of the tutorial.
Thanks in advance for your help

The short answer to your question is yes, it's possible. You don't need to break any code if your data is similar to the MNIST data, with 10 labels and well organized.
Assuming that is not the case, then you need to organize your input data so that you can define (create) your model.
Organizing your input data includes:
Having consistent image sizes (for example, MNIST uses 28x28 pixel images)
Labeling your images (for example, MNIST has 10 labels, 0 to 9)
Deciding how you intend to split your data (for example, the MNIST data is split into three parts: 55,000 data points of training data (mnist.train), 10,000 points of test data (mnist.test), and 5,000 points of validation data (mnist.validation))
Once you organize your input data, then you read your data by writing a small function like read_images that does something like
reader = tf.WholeFileReader()
key, value = reader.read(filename_queue)
....
Then assuming you want to "label" ahead of time similar to MNIST data, you can store them in a file and read them in your program.
After that, you would have to populate the tf.train.string_input_producer() with a list of strings containing the filename and the label.
....
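As a rough sketch of the organizing steps above, the block below builds an MNIST-like container from arrays of images and labels. Everything named here (`DataSet`, `one_hot`, `split_dataset`, the split sizes) is hypothetical and not part of TensorFlow. Random arrays stand in for images you would load from disk, and the labels are generated rather than read from fileLabel.txt.

```python
import numpy as np

def one_hot(labels, n_classes):
    """Turn integer labels of shape (n,) into a one-hot matrix of shape (n, n_classes)."""
    encoded = np.zeros((len(labels), n_classes), dtype=np.float32)
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

class DataSet:
    """Hypothetical stand-in for mnist.train / mnist.test: holds images and labels."""
    def __init__(self, images, labels, seed=0):
        self.images = images
        self.labels = labels
        self._rng = np.random.default_rng(seed)

    def next_batch(self, batch_size):
        # Sample a random mini-batch without replacement.
        idx = self._rng.choice(len(self.images), size=batch_size, replace=False)
        return self.images[idx], self.labels[idx]

def split_dataset(images, labels, n_validation, n_test):
    """Split into train / validation / test parts, in that order."""
    n_train = len(images) - n_validation - n_test
    val_end = n_train + n_validation
    train = DataSet(images[:n_train], labels[:n_train])
    validation = DataSet(images[n_train:val_end], labels[n_train:val_end])
    test = DataSet(images[val_end:], labels[val_end:])
    return train, validation, test

# Stand-in data: 1000 flattened 28x28 "images" and integer labels in 0..9.
rng = np.random.default_rng(42)
images = rng.random((1000, 28 * 28), dtype=np.float32)
int_labels = rng.integers(0, 10, size=1000)

train, validation, test = split_dataset(
    images, one_hot(int_labels, 10), n_validation=100, n_test=200
)
batch_x, batch_y = train.next_batch(50)
```

With real data you would replace the random arrays with your decoded, resized images and the labels parsed from your label file; the rest of the tutorial code would then need `mnist.train.next_batch(...)` swapped for the equivalent call on your own object.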

Related

Is there a faster way to convert sentences to TFHUB embeddings?

So I am involved in a project that involves feeding a combination of text embeddings and image vectors into a DNN to arrive at the result. Now for the word embedding part, I am using TFHUB's Electra while for the image part I am using a NASNet Mobile network.
However, the issue I am facing is that while running the word embedding part, using the code shown below, the code just keeps running nonstop. It has been over 2 hours now and my training dataset has just 14900 rows of tweets.
Note - The input to the function is just a list of 14900 tweets.
tfhub_handle_encoder="https://tfhub.dev/google/electra_small/2"
tfhub_handle_preprocess="https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3"
# Load Models
bert_preprocess_model = hub.KerasLayer(tfhub_handle_preprocess)
bert_model = hub.KerasLayer(tfhub_handle_encoder)
def get_text_embedding(text):
    preprocessing_layer = hub.KerasLayer(tfhub_handle_preprocess, name='Preprocessor')
    encoder_inputs = preprocessing_layer(text)
    encoder = hub.KerasLayer(tfhub_handle_encoder, trainable=True, name='Embeddings')
    outputs = encoder(encoder_inputs)
    text_repr = outputs['pooled_output']
    text_repr = tf.keras.layers.Dense(128, activation='relu')(text_repr)
    return text_repr
text_repr = get_text_embedding(train_text)
Is there a faster way to get text representation using these models?
Thanks for the help!
The operation performed in the code is quadratic in nature. While I managed to execute your snippet with 10000 samples within a few minutes, an input of 14900 samples ran out of memory on a runtime with 32GB of RAM. Is it possible that your runtime is swapping?
It is not clear what the snippet is trying to achieve. Do you intend to train the model? In that case you can define text_input as an Input layer and use fit to train. Here is an example: https://www.tensorflow.org/text/tutorials/classify_text_with_bert#define_your_model
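If memory is the bottleneck, as the out-of-memory observation above suggests, one possible workaround (not something the answer above proposes) is to run the encoder over the tweets in small batches rather than all at once. The sketch below shows only the batching logic; `embed_in_batches` and `fake_embed` are hypothetical names, and `fake_embed` is a toy stand-in for the real preprocessing and encoder layers.

```python
import numpy as np

def embed_in_batches(texts, embed_fn, batch_size=32):
    """Apply embed_fn to successive slices of texts and stack the results."""
    chunks = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        chunks.append(np.asarray(embed_fn(batch)))
    return np.concatenate(chunks, axis=0)

def fake_embed(batch):
    """Hypothetical stand-in for the real encoder: one 4-dim vector per text."""
    return np.array(
        [[len(t), t.count(" "), t.count("#"), t.count("@")] for t in batch],
        dtype=np.float32,
    )

tweets = ["hello world", "#tensorflow is slow today", "@friend see this"] * 50
embeddings = embed_in_batches(tweets, fake_embed, batch_size=32)
```

With the real models, `embed_fn` would wrap the preprocessing layer and the encoder and return `outputs['pooled_output']` for each batch; whether that is fast enough depends on your hardware.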

Anomaly Detection with Autoencoder using unlabelled Dataset (How to construct the input data)

I am new to the deep learning field, and I would like to ask about using an unlabeled dataset for anomaly detection with an autoencoder. My confusion comes down to the questions below:
1) Some posts say to separate anomalies from non-anomalies in the original dataset (which assumes it is labelled) and train the AE only on the non-anomalous data (which usually dominates). So how am I going to separate my dataset if it is unlabeled?
2) If I train on the original unlabeled dataset, how do I detect the anomalous data?
The labels of the data don't go into the autoencoder.
An autoencoder consists of two parts: an encoder and a decoder.
Encoder: it encodes the input data, say a sample with 784 features, into 50 features.
Decoder: from those 50 features it reconstructs the original 784 features.
Now, to detect an anomaly:
if you pass in an unknown sample, it should be converted back to its original form without much loss.
But if there is a lot of error in converting it back, then it could be an anomaly.
I think you already answered part of the question yourself: the definition of an anomaly is that it should be considered "a rare event". So even if you don't know the labels, your training data will contain only very few such samples, and the model will predominantly learn what the data usually looks like. So both during training and at prediction time, your error will be large for an anomaly. But since such examples should come up only very seldom, this will not influence your embedding much.
In the end, if you can really justify that the anomaly you are checking for is rare, you might not need much pre-processing or labelling. If it occurs more often (a threshold is hard to give for that, but I'd say it should be <<1%), your AE might pick up on that signal and you would really have to get the labels in order to split the data. But then again: this would not be an anomaly any more, right? Then you could go ahead and train a (balanced) classifier with this data.
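The reconstruction-error idea in both answers can be sketched without a deep learning framework. The block below is only an illustration under loud assumptions: the data is synthetic, and a linear autoencoder fitted with an SVD (equivalent to PCA) stands in for a trained neural network. The 99th-percentile threshold is an arbitrary choice, not something the answers prescribe.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 500 "normal" samples that live near a 3-dimensional subspace
# of a 20-dimensional feature space, plus 5 rare anomalies that do not.
basis = rng.normal(size=(3, 20))
normal = rng.normal(size=(500, 3)) @ basis + 0.05 * rng.normal(size=(500, 20))
anomalies = 3.0 * rng.normal(size=(5, 20))
data = np.vstack([normal, anomalies])  # unlabeled: anomalies are mixed in

# "Train" a linear autoencoder with a 3-unit bottleneck (assumption: PCA via SVD
# replaces gradient-based training of a neural autoencoder).
mean = data.mean(axis=0)
_, _, vt = np.linalg.svd(data - mean, full_matrices=False)
components = vt[:3]  # shape (3, 20)

def reconstruction_error(x):
    """Encode to 3 features, decode back to 20, return mean squared error per row."""
    code = (x - mean) @ components.T      # encoder
    decoded = code @ components + mean    # decoder
    return np.mean((x - decoded) ** 2, axis=1)

errors = reconstruction_error(data)
threshold = np.percentile(errors, 99)     # flag roughly the top 1% of errors
flagged = np.where(errors > threshold)[0]
```

Because the anomalies are rare, they barely affect the fitted subspace, so their reconstruction error stands out even though no labels were used. In practice the threshold would have to be tuned for your data.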

Machine Learning for gesture recognition with Myo Armband

I'm trying to develop a model to recognize new gestures with the Myo Armband. (It's an armband that possesses 8 electrical sensors and can recognize 5 hand gestures). I'd like to record the sensors' raw data for a new gesture and feed it to a model so it can recognize it.
I'm new to machine/deep learning and I'm using CNTK. I'm wondering what would be the best way to do it.
I'm struggling to understand how to create the trainer. The input data consists of sets of 8 values, one per sensor, and I'm thinking about using 20 sets of these 8 values (each between -127 and 127). So one label corresponds to 20 sets of values.
I don't really know how to do that. I've seen tutorials where images are linked with their label, but it's not the same idea. And even after the training is done, how can I keep the model from recognizing this one gesture whatever I do, since it's the only one it has been trained on?
An easy way to get you started would be to create 161 columns (8 columns for each of the 20 time steps + the designated label). You would rearrange the columns like
emg1_t01, emg2_t01, emg3_t01, ..., emg8_t20, gesture_id
This will give you the right 2D format to use different algorithms in sklearn as well as a feed-forward neural network in CNTK. You would use the first 160 columns to predict the 161st one.
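As a small illustration of that layout, the sketch below cuts a recording into windows of 20 time steps and flattens each window into 160 feature columns plus a label. The recording is random stand-in data, and `gesture_id = 3` is a hypothetical label; non-overlapping windows are also an assumption.

```python
import numpy as np

n_steps, n_sensors = 20, 8

# Stand-in recording: 100 consecutive readings of the 8 sensors, values in -127..127.
rng = np.random.default_rng(0)
recording = rng.integers(-127, 128, size=(100, n_sensors))
gesture_id = 3  # hypothetical label for the gesture performed in this recording

# Column names in the order emg1_t01, emg2_t01, ..., emg8_t20, gesture_id.
columns = [
    f"emg{s}_t{t:02d}"
    for t in range(1, n_steps + 1)
    for s in range(1, n_sensors + 1)
]
columns.append("gesture_id")

# Cut the recording into non-overlapping windows of 20 time steps, flatten each one.
n_windows = len(recording) // n_steps
windows = recording[:n_windows * n_steps].reshape(n_windows, n_steps * n_sensors)
table = np.column_stack([windows, np.full(n_windows, gesture_id)])

# First 160 columns are the features, the last one is the label.
X, y = table[:, :160], table[:, 160]
```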
Once you have that working you can model your data to better represent the natural time series order it contains. You would move away from a 2D shape and instead create a 3D array to represent your data.
The first axis shows the number of samples
The second axis shows the number of time steps (20)
The third axis shows the number of sensors (8)
With this shape you're all set to use a 1D convolutional model (CNN) in CNTK that traverses the time axis to learn local patterns from one step to the next.
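To make the 3D shape and the idea of traversing the time axis concrete, here is a plain NumPy sketch. It does not use the CNTK API; `conv1d_over_time` is a hypothetical helper that applies a single filter by hand, and the data and kernel are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_steps, n_sensors = 5, 20, 8

# Stand-in for the flat table: 5 samples with 160 feature columns each.
flat = rng.integers(-127, 128, size=(n_samples, n_steps * n_sensors)).astype(np.float32)

# 2D -> 3D: (samples, time steps, sensors).
cube = flat.reshape(n_samples, n_steps, n_sensors)

def conv1d_over_time(x, kernel):
    """Slide a (width, n_sensors) kernel along the time axis without padding."""
    width = kernel.shape[0]
    n_out = x.shape[1] - width + 1
    out = np.empty((x.shape[0], n_out), dtype=x.dtype)
    for t in range(n_out):
        # Each output value sees `width` consecutive time steps of all sensors.
        out[:, t] = np.sum(x[:, t:t + width, :] * kernel, axis=(1, 2))
    return out

# One filter spanning 3 consecutive time steps.
kernel = rng.normal(size=(3, n_sensors)).astype(np.float32)
feature_map = conv1d_over_time(cube, kernel)
```

A real 1D convolutional layer learns many such kernels and adds a bias and a nonlinearity; this only shows how the filter moves along the 20 time steps.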
You might also want to look into RNNs which are often used to work with time series data. However, RNNs are sometimes hard to train and a recent paper suggests that CNNs should be the natural starting point to work with sequence data.

Can I use autoencoder for clustering?

In the code below, they use an autoencoder for supervised clustering or classification because they have data labels.
http://amunategui.github.io/anomaly-detection-h2o/
But can I use an autoencoder to cluster data if I do not have its labels?
Regards
The deep-learning autoencoder is always unsupervised learning. The "supervised" part of the article you link to is to evaluate how well it did.
The following example (taken from ch.7 of my book, Practical Machine Learning with H2O, where I try all the H2O unsupervised algorithms on the same data set - please excuse the plug) takes 563 features, and tries to encode them into just two hidden nodes.
m <- h2o.deeplearning(
  2:564, training_frame = tfidf,
  hidden = c(2), autoencoder = TRUE, activation = "Tanh"
)
f <- h2o.deepfeatures(m, tfidf, layer = 1)
The second command there extracts the hidden node values (the deep features) for each row. f is a data frame with two numeric columns and one row for every row in the tfidf source data. I chose just two hidden nodes so that I could plot the clusters:
Results will change on each run. You can (maybe) get better results with stacked auto-encoders, or using more hidden nodes (but then you cannot plot them). Here I felt the results were limited by the data.
BTW, I made the above plot with this code:
d <- as.matrix(f[1:30,]) #Just first 30, to avoid over-cluttering
labels <- as.vector(tfidf[1:30, 1])
plot(d, pch = 17) #Triangle
text(d, labels, pos = 3) #pos=3 means above
(P.S. The original data came from Brandon Rose's excellent article on using NLTK.)
In some respects, encoding data and clustering data share overlapping theory. As a result, you can use autoencoders to cluster (encode) data.
A simple example to visualize: suppose you have a set of training data that you suspect has two primary classes, such as voter history data for Republicans and Democrats. If you take an autoencoder, encode the data to two dimensions, and plot the result on a scatter plot, the clustering becomes clearer. Below is a sample result from one of my models. You can see a noticeable split between the two classes as well as a bit of expected overlap.
The code can be found here
This method does not require only two binary classes, you could also train on as many different classes as you wish. Two polarized classes is just easier to visualize.
This method is not limited to two output dimensions, that was just for plotting convenience. In fact, you may find it difficult to meaningfully map certain, large dimension spaces to such a small space.
In cases where the encoded (clustered) layer is larger in dimension it is not as clear to "visualize" feature clusters. This is where it gets a bit more difficult, as you'll have to use some form of supervised learning to map the encoded(clustered) features to your training labels.
A couple of ways to determine which class the encoded features belong to: feed them into a k-nearest-neighbours (kNN) algorithm or, as I prefer, pass the encoded vectors to a standard back-propagation neural network. Note that, depending on your data, you may find that just feeding the data straight into your back-propagation neural network is sufficient.
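As a minimal sketch of the kNN option, the block below labels new encoded vectors by majority vote among their nearest labelled neighbours. Note that kNN needs labels for the reference points, so this is the supervised mapping step rather than unsupervised clustering. The 2-dimensional codes are synthetic stand-ins for encoder output, and `knn_predict` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Stand-in for encoder output: 2-dimensional codes for two well-separated groups.
codes_a = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(50, 2))
codes_b = rng.normal(loc=[2.0, 0.0], scale=0.5, size=(50, 2))
train_codes = np.vstack([codes_a, codes_b])
train_labels = np.array([0] * 50 + [1] * 50)

def knn_predict(query, codes, labels, k=5):
    """Label each query code by majority vote among its k nearest training codes."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query[:, None, :] - codes[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = labels[nearest]  # shape (n_query, k)
    return np.array([np.bincount(row).argmax() for row in votes])

new_codes = np.array([[-2.1, 0.3], [1.8, -0.2]])
predicted = knn_predict(new_codes, train_codes, train_labels)
```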

How to massage inputs into Keras framework?

I am new to keras and despite reading the documentation and the examples folder in keras, I'm still struggling with how to fit everything together.
In particular, I want to start with a simple task: I have a sequence of tokens, where each token has exactly one label. I have a lot of training data like this (practically infinite, as I can generate more (token, label) training pairs as needed).
I want to build a network to predict labels given tokens. The number of tokens must always be the same as the number of labels (one token = one label).
And I want this to be based on all surrounding tokens, say within the same line or sentence or window -- not just on the preceding tokens.
How far I got on my own:
created the training numpy vectors, where I converted each sentence into a token-vector and label-vector (of same length), using a token-to-int and label-to-int mappings
wrote a model using categorical_crossentropy and one LSTM layer, based on https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py.
Now struggling with:
All the input_dim and input_shape parameters... since each sentence has a different length (different number of tokens and labels in it), what should I put as input_dim for the input layer?
How to tell the network to use the entire token sentence for prediction, not just one token? How to predict a whole sequence of labels given a sequence of tokens, rather than just label based on previous tokens?
Does splitting the text into sentences or windows make any sense? Or can I just pass a vector for the entire text as a single sequence? What is a "sequence"?
What are "time slices" and "time steps"? The documentation keeps mentioning that and I have no idea how that relates to my problem. What is "time" in keras?
Basically I have trouble connecting the concepts from the documentation like "time" or "sequence" to my problem. Issues like Keras#40 didn't make me any wiser.
Pointing to relevant examples on the web or code samples would be much appreciated. Not looking for academic articles.
Thanks!
If you have sequences of different length you can either pad them or use a stateful RNN implementation in which the activations are saved between batches. The former is the easiest and most used.
If you want to use future information when using RNNs, you want a bidirectional model, in which you concatenate two RNNs moving in opposite directions. Each RNN uses a representation of everything it has seen so far when, e.g., predicting.
If you have very long sentences, it might be useful to sample a random sub-sequence and train on that, for example 100 characters. This also helps with overfitting.
Time steps are your tokens. A sentence is a sequence of characters/tokens.
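To make padding, sub-sequence sampling and "time steps" concrete, here is a small NumPy sketch. `pad_to_length` and `random_window` are hypothetical helpers written for illustration; this one pads on the right, whereas Keras's own `pad_sequences` pads at the front by default.

```python
import numpy as np

def pad_to_length(sequences, maxlen, pad_value=0):
    """Right-pad (or truncate) each list of ints so that all have length maxlen."""
    padded = np.full((len(sequences), maxlen), pad_value, dtype=np.int64)
    for i, seq in enumerate(sequences):
        trimmed = seq[:maxlen]
        padded[i, :len(trimmed)] = trimmed
    return padded

def random_window(seq, length, rng):
    """Pick a random contiguous sub-sequence of the given length (whole seq if shorter)."""
    if len(seq) <= length:
        return list(seq)
    start = rng.integers(0, len(seq) - length + 1)
    return list(seq[start:start + length])

# Three "sentences" of token ids with different lengths (0 is reserved for padding).
token_ids = [[4, 7, 2], [9, 1], [3, 3, 8, 5, 6]]
X = pad_to_length(token_ids, maxlen=4)

# Add a feature axis: (samples, time steps, features per time step).
X_3d = X[..., np.newaxis]

rng = np.random.default_rng(0)
window = random_window(list(range(10)), 4, rng)
```

Here each row of `X` is one sequence and each column position is one time step, so `X_3d` has the `(samples, maxlen, 1)` shape that a recurrent layer with one feature per token expects.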
I've written an example of how I understand your problem, but it's not tested so it might not run. Instead of using integers to represent your data, I suggest one-hot encoding if possible, and then using categorical_crossentropy instead of mse.
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense, Bidirectional, TimeDistributed
from keras.preprocessing import sequence
# Make sure all sequences (tokens and labels) are of same length
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
y_train = sequence.pad_sequences(y_train, maxlen=maxlen)
# Add a trailing feature axis: (n_samples, maxlen) -> (n_samples, maxlen, 1)
X_train = np.expand_dims(X_train, -1)
y_train = np.expand_dims(y_train, -1)
# The input shape is your sequence length and your token embedding size (which is 1)
inputs = Input(shape=(maxlen, 1))
# Build a bidirectional RNN; return_sequences=True keeps one output per time step,
# and the forward and backward outputs are concatenated
bidirectional_lstm = Bidirectional(LSTM(128, return_sequences=True), merge_mode='concat')(inputs)
# Output each timestep into a fully connected layer with linear
# output to map to an integer
sequence_output = TimeDistributed(Dense(1, activation='linear'))(bidirectional_lstm)
# Dense(n_classes, activation='softmax') if you want to classify
model = Model(inputs, sequence_output)
model.compile('adam', 'mse')
model.fit(X_train, y_train)