I'm totally new to Caffe and I'm trying to convert a TensorFlow model to Caffe.
I have a tuple whose shape is a little complex because it stores word vectors.
This is the shape of the tuple data:
data[0]: a list, [684, 84], stores the sentence vector;
data[1]: a list, [684, 84], stores the position vector;
data[2]: a matrix, [684, 10], stores the aspects of the sentence;
data[3]: a matrix, [1, 684], stores the label of each sentence;
data[4]: a number, stores the max length of sentences;
Each row represents a sentence, which is also a sample of the dataset.
In tf, I return the whole tuple from a function that I wrote myself.
train_data = read_data(FLAGS.train_data, source_count, source_word2idx)
I noticed that Caffe always requires a data layer before training, but I have no idea how to convert my data to LMDB or how to just send it into the model as a tuple or matrix.
By the way, I'm using pycaffe.
Could anyone help?
Thanks a lot!
There's no particular magic; all you need to do is write an input routine that reads the file and returns the data in the format expected for train_data. You do not need to pre-convert your data to LMDB or any other format; just write read_data to accept your current input format and give the model the format it requires.
We can't help you much beyond that: you haven't specified the model's format at all, and you've given us only the shape of the input data (no internal structure or semantics). Simply treat this as the problem of reorganizing your input data into a given output format.
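As a minimal sketch of such an input routine, assuming the tuple layout described in the question and list-like fields: the function below slices the five fields into mini-batches. The name iter_batches and the dictionary keys are hypothetical, and how each batch is handed to the net depends on the input layer your Caffe model declares.

```python
def iter_batches(data, batch_size):
    """Yield mini-batches from the tuple described in the question.

    Assumed layout: data[0] sentence vectors, data[1] position vectors,
    data[2] aspects (one row per sentence each), data[3] labels stored as a
    single row [1, n_samples], data[4] the maximum sentence length.
    """
    sentences, positions, aspects, labels, max_len = data
    labels = labels[0]  # flatten the [1, n_samples] label row
    n_samples = len(sentences)
    for start in range(0, n_samples, batch_size):
        end = min(start + batch_size, n_samples)
        yield {
            "sentence": sentences[start:end],
            "position": positions[start:end],
            "aspect": aspects[start:end],
            "label": labels[start:end],
            "max_len": max_len,
        }


# Tiny stand-in for the real data: 5 sentences instead of 684.
toy = (
    [[i] * 4 for i in range(5)],   # sentence vectors
    [[0, 1, 2, 3]] * 5,            # position vectors
    [[i, i] for i in range(5)],    # aspects
    [[0, 1, 0, 1, 1]],             # labels, shape [1, 5]
    4,                             # max sentence length
)
batches = list(iter_batches(toy, batch_size=2))
```

The last batch is simply shorter when the number of samples is not a multiple of the batch size; whether that is acceptable depends on the model.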
I have a dataset of 78 variables, all of which are binary (0 and 1). I want to plot the data in one graph. Originally I planned to use PCA, but I think it won't work since PCA requires numerical input data (does it?). Any suggestions for what kind of data visualization to use for this type of data? Thank you very much.
I use Python and R.
I am writing an encoder/decoder model very similar to https://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html
The only difference is that there the words are represented by indices. I want to represent them by another metric, which is expressed in float numbers.
The loss function, criterion = nn.NLLLoss(), seems to work only when we are dealing with classes.
If my output array is not an array of integers but an array of float numbers, what kind of loss function can I use, assuming all other parts are similar to the tutorial?
Thanks in advance.
As I learn more and more about ML (I am a mobile DEV) I'm starting to form an analogy in my head. I would like the community's opinion / validation.
As a front end DEV you have a backend and an API that you can make requests to. The standard format for the inputs and outputs to the API is JSON.
I'm running into a problem with ML Models that I am trying to use where I don't know how to read the expected input (API) and I don't know how to decode the expected output.
So far my experience has been fragmented because some models say "Give me an image of [1,2,120,120]" or something like that.
To analogize, is there a unified way to define inputs and outputs for an ML model, like JSON unifies the inputs and outputs for a backend API?
If so, what are some rules one must follow to encode and decode data into this format?
Assuming this "ML Model" is in the context of running an input through, say, a trained pytorch model's forward pass to get an output, the unified way to define inputs and outputs for an ML model is through tensors. A tensor is essentially a multi-dimensional array containing elements of a single data type. Think multi-dimensional lists with a single data type.
Tensors:MLModels::JSON:WebAPI
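To make the "multi-dimensional lists" picture concrete, here is a small pure-Python sketch that reads the shape off a regular nested list. The helper shape_of is hypothetical and only for illustration; tensor libraries expose the same information directly (for example as a shape attribute).

```python
def shape_of(nested):
    """Return the shape of a regular (non-ragged) nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        # Descend into the first element; stop on an empty list.
        nested = nested[0] if nested else None
    return shape


# A batch of 1 image with 2 channels of 3x3 pixels.
image_batch = [[[[0.0] * 3 for _ in range(3)] for _ in range(2)]]
print(shape_of(image_batch))
```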
An Example using an Object Detector
Model
Let's say the model in your image example is an object detector that takes in an image as input and outputs either dog or cat.
The input would usually be:
A tensor representation of an image with the shape [1, 2, 120, 120], where 1 is the batch size, 2 is the number of channels (an RGB image would have 3), and 120x120 is the height and width of the image.
The output would usually be:
A normalized tensor with two elements, like [0.7, 0.3], where index 0 represents the probability of the image depicting a dog and index 1 represents the probability it's a cat.
Encoding and Decoding
Decoding the output to a string like "dog" or "cat" is straightforward: take the index with the highest probability and map it to its label.
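A minimal sketch of that decoding step, assuming the label order from the example above (index 0 is dog, index 1 is cat). The names LABELS and decode are hypothetical.

```python
LABELS = ["dog", "cat"]  # assumed order: index 0 = dog, index 1 = cat


def decode(probabilities, labels=LABELS):
    """Return the label whose probability is highest."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index]


print(decode([0.7, 0.3]))
```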
Encoding an image is slightly less obvious. At its heart, the format of an image is that of a tensor: a multi-dimensional matrix containing a single datatype. So it is still intuitive to encode an image stored as a JPEG or PNG into a tensor representation, using the rgb channel dimensions and the pixel values for each channel. Typically image files are loaded using libraries and methods like the python imaging library and pytorch's torchvision.transforms.ToTensor().
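The layout change involved can be sketched in pure Python. This is only an imitation of the idea, not the torchvision implementation: it takes an image given as rows of (r, g, b) pixels with values 0 to 255, rearranges it to channels-first order, scales the values to the range 0 to 1, and adds a batch dimension of size 1. The function name to_chw_batch is hypothetical.

```python
def to_chw_batch(pixels):
    """Convert an H x W x C image with 0-255 values to [1, C, H, W] floats in [0, 1]."""
    height, width = len(pixels), len(pixels[0])
    channels = len(pixels[0][0])
    chw = [
        [[pixels[y][x][c] / 255.0 for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]
    return [chw]  # leading batch dimension of size 1


# A 2x2 RGB image: red, green on the first row; blue, white on the second.
image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
batch = to_chw_batch(image)
```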
This example is specific to an object-detector type of model, but most supervised ML models will output a tensor like the above or a one-hot label. In general, the data inputs and outputs of most ML models can be represented as tensors.
I'm new here and I really want some help. I have a dataset including geographical information (longitude, latitude, ...) and I want to predict some aspects from this dataset with Support Vector Regression, but I don't know how to perform this task. I have the following questions:
Is there any specific preprocessing I need to go through?
Does SVR treat a geographic dataset as a normal dataset, or are there some specificities in terms of tools and treatment?
Are there any recommended predictive analytics tools (including SVR) for geographical data?
The solution given here is for the situation where you want to extract the independent variable from a raster at the locations of your dependent variable.
But if you already have all your dependent and independent data with their corresponding locations, you can simply use the svm function in R and then pass new raster or vector data to your predict function for prediction. Alternatively, you can take the estimated coefficients into the raster calculator in GIS, multiply them by the corresponding independent variables, and you will get your predicted raster.
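The raster-calculator route amounts to a cell-by-cell weighted sum. Here is a hedged pure-Python sketch of that arithmetic, under the assumption that the fitted model is linear in its inputs (for instance a linear kernel); a kernel SVR in general cannot be reduced to one coefficient per variable. The function predict_raster, the layer names, and all numbers are made up for illustration.

```python
def predict_raster(intercept, coefficients, rasters):
    """Cell-wise linear prediction: intercept + sum(coef_k * raster_k)."""
    rows, cols = len(rasters[0]), len(rasters[0][0])
    return [
        [
            intercept + sum(c * r[i][j] for c, r in zip(coefficients, rasters))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Two hypothetical 2x2 predictor layers on the same grid.
elevation = [[100.0, 200.0], [300.0, 400.0]]
rainfall = [[10.0, 20.0], [30.0, 40.0]]
predicted = predict_raster(1.0, [0.5, 2.0], [elevation, rainfall])
```

All predictor rasters must share the same grid (extent and resolution) for the cell-wise sum to make sense.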
You can simply do the following for spatial data in R.
First of all, support vector regression can be used to predict real values, and you can use library("e1071") in R to run this algorithm.
You can import your dataset as a CSV along with lat and long columns.
Transform your data.frame into a spatial data frame.
# Load the packages used below
library(sp)
library(raster)
# Read data (file.choose() opens a file picker on any platform)
dat <- read.csv(file.choose())
# Convert the data to SpatialPoints (x and y are the coordinate columns)
dat_sp <- SpatialPoints(cbind(dat$x, dat$y))
# Add your geographical reference system
dat_crs <- CRS("+proj=utm +zone=39 +datum=WGS84")
# Create a SpatialPointsDataFrame for dat
dat_spdf <- SpatialPointsDataFrame(coords = dat_sp, data = dat, proj4string = dat_crs)
plot(dat_spdf, col = 'blue', cex = 1, pch = 16, axes = TRUE)
# Extract the raster value at each point
# (r is your own raster layer holding the independent variable)
dat_spdf$ref <- extract(r, dat_spdf)
Then you can extract your data from a raster or from whatever holds your independent variable.
Finally, you can use the following code in R, where independent is the data frame that holds the dependent column together with the predictors.
svm(dependent ~ ., data = independent)
But you really need to have an intuition about what SVR is and how to evaluate the result.
You can also show your result as a final raster map.
You can use the toolbox package or the raster package.
I am new to keras and despite reading the documentation and the examples folder in keras, I'm still struggling with how to fit everything together.
In particular, I want to start with a simple task: I have a sequence of tokens, where each token has exactly one label. I have a lot of training data like this - practically infinite, as I can generate more (token, label) training pairs as needed.
I want to build a network to predict labels given tokens. The number of tokens must always be the same as the number of labels (one token = one label).
And I want this to be based on all surrounding tokens, say within the same line or sentence or window -- not just on the preceding tokens.
How far I got on my own:
created the training numpy vectors, where I converted each sentence into a token-vector and label-vector (of the same length), using token-to-int and label-to-int mappings
wrote a model using categorical_crossentropy and one LSTM layer, based on https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py.
Now struggling with:
All the input_dim and input_shape parameters... since each sentence has a different length (different number of tokens and labels in it), what should I put as input_dim for the input layer?
How to tell the network to use the entire token sentence for prediction, not just one token? How to predict a whole sequence of labels given a sequence of tokens, rather than just label based on previous tokens?
Does splitting the text into sentences or windows make any sense? Or can I just pass a vector for the entire text as a single sequence? What is a "sequence"?
What are "time slices" and "time steps"? The documentation keeps mentioning that and I have no idea how that relates to my problem. What is "time" in keras?
Basically I have trouble connecting the concepts from the documentation like "time" or "sequence" to my problem. Issues like Keras#40 didn't make me any wiser.
Pointing to relevant examples on the web or code samples would be much appreciated. Not looking for academic articles.
Thanks!
If you have sequences of different length you can either pad them or use a stateful RNN implementation in which the activations are saved between batches. The former is the easiest and most used.
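Padding itself is simple. Below is a minimal pure-Python sketch that truncates or right-pads every sequence to a fixed length; the helper name pad_sequences_post and the pad value 0 are assumptions for illustration. Note that Keras' own pad_sequences pads and truncates at the start of the sequence by default, so it would place the zeros on the other side unless told otherwise.

```python
def pad_sequences_post(sequences, maxlen, pad_value=0):
    """Truncate or right-pad each sequence so that all have length maxlen."""
    padded = []
    for seq in sequences:
        trimmed = list(seq[:maxlen])
        padded.append(trimmed + [pad_value] * (maxlen - len(trimmed)))
    return padded


print(pad_sequences_post([[1, 2, 3], [4], [5, 6, 7, 8, 9]], maxlen=4))
```

Token and label sequences need the same treatment so that they stay aligned position by position.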
If you want to use future information when using RNNs, you want a bidirectional model, where you concatenate two RNNs moving in opposite directions. A single RNN only uses a representation of all previous information when predicting.
If you have very long sentences it might be useful to sample a random sub-sequence and train on that, for example 100 characters. This also helps with overfitting.
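A small sketch of sampling such a sub-sequence while keeping tokens and labels aligned. The function name sample_window and its parameters are hypothetical; rng can be any object with a randrange method, such as a seeded random.Random.

```python
import random


def sample_window(tokens, labels, window, rng=random):
    """Return an aligned random slice of at most `window` tokens and labels."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    if len(tokens) <= window:
        return list(tokens), list(labels)
    start = rng.randrange(len(tokens) - window + 1)
    end = start + window
    return list(tokens[start:end]), list(labels[start:end])


tokens = list(range(50))
labels = [t % 2 for t in tokens]
win_tokens, win_labels = sample_window(tokens, labels, window=10, rng=random.Random(0))
```

Drawing a fresh window each epoch means the network sees different slices of the same long text over time.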
Time steps are your tokens. A sentence is a sequence of characters/tokens.
I've written an example of how I understand your problem, but it's not tested so it might not run. Instead of using integers to represent your labels I suggest one-hot encoding them if possible and then using categorical_crossentropy (or binary_crossentropy if there are only two classes) instead of mse.
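For reference, one-hot encoding the label sequences could look like the pure-Python sketch below. The helper one_hot and the toy label ids are assumptions; the result has the shape (samples, time steps, n_classes) that a per-time-step classifier would be trained against.

```python
def one_hot(label_ids, n_classes):
    """Turn a sequence of integer labels into a list of one-hot vectors."""
    return [[1 if k == label else 0 for k in range(n_classes)] for label in label_ids]


# Two already-padded label sequences of length 3, with 3 possible classes.
padded_labels = [[0, 2, 1], [1, 0, 0]]
y = [one_hot(seq, n_classes=3) for seq in padded_labels]
```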
import numpy as np
from keras.models import Model
from keras.layers import Input, LSTM, Dense, TimeDistributed, Bidirectional
from keras.preprocessing import sequence
# Make sure all sequences are of same length, then add a feature axis
# so the arrays have shape (samples, maxlen, 1)
X_train = np.expand_dims(sequence.pad_sequences(X_train, maxlen=maxlen), -1)
y_train = np.expand_dims(sequence.pad_sequences(y_train, maxlen=maxlen), -1)
# The input shape is your sequence length and your token embedding size (which is 1)
inputs = Input(shape=(maxlen, 1))
# Build a bidirectional RNN; return_sequences=True keeps one output per
# time step, and the forward and backward outputs are concatenated
bidirectional_lstm = Bidirectional(LSTM(128, return_sequences=True), merge_mode='concat')(inputs)
# Output each timestep into a fully connected layer with linear
# output to map to an integer
sequence_output = TimeDistributed(Dense(1, activation='linear'))(bidirectional_lstm)
# Dense(n_classes, activation='softmax') with categorical_crossentropy
# if you want to classify
model = Model(inputs, sequence_output)
model.compile('adam', 'mse')
model.fit(X_train, y_train)