How do I read variable length 1D inputs in Tensorflow? - csv

I'm trying to read variable length 1-D inputs into a Tensorflow CNN.
I have previously implemented reading fixed length inputs by first constructing a CSV file (where the first column is the label and the remaining columns are the input values - flattened spectrogram data all padded/truncated to the same length) using tf.TextLineReader().
This time I have a directory full of files each one containing a line of data I want to use as input (flattened spectrogram data again but I do not want to force them to the same dimensions), and the line lengths are not fixed. I'm getting an error trying to use the previous approach of compiling a CSV first. I looked into the documentation of tf.TextLineReader() and it specifies that all CSV rows must be the same shape, so I am stuck! Any help would be much appreciated, thanks :)

I'm assuming that the data isn't changing shape when you have a longer or shorter sample right? By that I mean that if you trained your network on arrays of 1000 pixels for example, with a kernel of say [5,1] size. That [5,1] kernel needs to see the same patterns in the variable length data as it did in the training data. If your data is stretched or shrunk, then the correct solution is to interpolate the data to the same size as the training data so the shapes/patterns match.
Assuming you just want variable length inputs, then in theory you should be able to do this by setting your batch size to 1 and varying the 1st dimension of the data.
So your input placeholder would look like:
X = tf.placeholder(dtype, shape=[1,None,1,1])
The 4 shape arguments are: 1=batch size; None=unknown first dimension size; 1=unused because it's a 1D dataset, 1=one channel images, again unused but necessary for tf.conv2d to receive the expected 4D image.
This is not very different from configuring tensorflow to support variable batch sizes. So you should review this link below and understand that process.
get the size of a variable batch dimension
Note that you can't use a batch size more than 1 here because you wouldn't be able to construct a matrix with missing values in the 2nd dimension. I expect the convolution operations to work with this variable dimension (though I haven't actually tried this).
Another option to deal with this problem would be to pad your inputs with 0's so they all have a common length, but that will need to have been trained into the model up front.

Related

How to make a wavetable with Inverse FFT in web-audio api or any other tool

I would like to know how one could generate a wavetable out of a wav file for example.
I know a wavetable can be used in web audio api with setPerdiodic wave and I know how to use it.
But what do I need to do to create my own wavetables? I read about inverse FFT, but I did find nearly nothing. I don't need any code just an idea or a formula of how to get the wavetable from an wav file to a Buffer.
There are a few constraints here and I'm not sure how good the result will be.
Your wav file source can't be too long; the PeriodicWave object
only supports arrays up to size 8192 or so.
I'm going to assume your waveform is intended to be periodic. If the
last sample and the first aren't reasonably close to each other,
there will be a hard-to-reproduce jump.
The waveform must have zero mean, so if it doesn't you should remove
the mean.
With that taken care of, select a power of two greater than the length
of your wave file (not strictly needed, but most FFTs expect powers of
two). Zero-pad the wave file if the length is not a power of two.
Then compute the the FFT. You'll either get an array of complex
numbers or two arrays. Separate these out to real and imaginary
arrays and use them for contructing the PeriodicWave.

Does PyTorch support variable with dynamic dimension?

I've updated my question based upon the variable dimension of variables.
Suppose the input tensor stores the 3d points with dimension 10x3, 10 means the #points and 3 is the feature dimension (say x,y,z coordinates). The dimension of the variable depends on the input tensor, say its dimension is 10x10. When the input tensor changes its dimension to 50x3, then the dimension of the variable will also have to change to 50x50.
I know in Tensorflow, if the input dimension is changing/unknown, we can declare it as tf.placeholder(None,3). However, I never meet the situation where the size of variable is changing/unknown, it seems that the variable will always have the fixed dimension.
I am currently learning PyTorch and don't know whether PyTorch supports this function. Any information would be appreciated!
========= Original question ========
I have a variable in which the size is changeable when input dimension changes. For example, if input is 10x2, then the variable should be 10x10. If input is 25x2, then the variable should be 25x25. As my understanding, the variable is used to store weights, which normally has fixed dimension. However in my case, the dimension of the variable depends on input data, which can change. Does PyTorch currently supports this kind of function?
Thanks!
Your question is little ambiguous. When you say, your input is say, 10x2, you need to define what the input tensor contains.
I am assuming you are talking about torch.autograd.Variable. If you want to use PyTorch's functionality, what you need to do is to provide your input through a tensor in the desired shape of the target function.
For example, if you want to use RNN implemented in PyTorch for an input sentence of length 10 where each word is represented by a 300 dimensional vector (e.g., word embedding), then you can do as follows.
rnn = nn.RNN(300, 256, 2) # emb_size=300,hidden_size=256,num_layers=2
input = Variable(torch.randn(10, 1, 300)) # sent_length=10,batch_size=1,emb_size=300
h0 = Variable(torch.randn(2, 1, 256)) # num_layers=2,batch_size=1,hidden_size=256
output, hn = rnn(input, h0)
If you have more than 1 sentence, then you can provide them in batch. In that case, you need to pad them to handle variable lengths. As you can see, RNN doesn't care about the sentence length, it can handle variable lengths but to provide many sentences in batch, you need padding. You can explore related functionalities in the official documentation.
Since you didn't mention what is your input actually, I am assuming you need variables with variable number of timesteps, in that case PyTorch can serve your purpose. Actually, PyTorch is developed to meet all basic functionalities that are required to build deep neural network architectures.

Should the learning rate be changed when the input/output has a different range?

I have an existing Caffe CNN which gets a 224*224 image as input, each pixel having a single value in the range [0,100]. The output has size 56*56, each being one of 313 classes.
Now I want to change the input/output type and train a model on this new network. The input values are in the range [0,1], and the output consists of 324 classes.
Should I change the base learning rate because of the changed input data range?
No, your learning rate has got nothing to do with your data. It is simply a factor by which the loss gradients are multiplied before updating the weights. You may consider it as a percentage, not absolute.

What is the format of the Qpdeltamap used for ROI in NVENC?

I am trying to get started with ROI encoding with the Nvidia Encoder NVENC.
As a first step I am trying to get the Nvidia demos to encode using ROI. I know that the switch -qpDeltaMapFile enables the flag enableExtQPDeltaMap. This allows me to send a file with a qp map that the encoder uses to tweak the values obtained by the rate control algorithm.
However there is absolutely no documentation on the format of this file. I tried to use one value per byte, and one byte per value assuming fixed size macroblocks of 16x16. It doesn't seem to work as I would expect.
Any guidance or references would help a lot.
There was a bug in my code. It actually works almost as I described.
Assume your screen is divided equally in 16x16 blocks, then each value will be added to the qp that the rate control algorithm chose. Each value passed is a signed integer, therefore a negative value will improve the quality while a positive value will decrease it. A value of 0 will stay with whatever the rate control algorithm decided.

Possible to call subfunction in S-function level-2

I have been trying to convert my level-1 S-function to level-2 but I got stuck at calling another subfunction at function Output(block) trying to look for other threads but to no avail, do you mind to provide related links?
My output depends on a lot processing with the inputs, this is the reason I need to call the sub-function in order to calculate and then return output values, all the examples that I can see are calculating their outputs directly in "function Output(block)", in my case I thought it is not possible.
I then tried to use Interpreted Matlab Function block but failed due to the output dimension is NOT the same as input dimension, also it does not support the return of more than ONE output................
Dear Sir/Madam,
I read in S-function documentation that "S-function level-1 supports vector inputs and outputs. DOES NOT support multiple input and output ports".
Does the second sentence mean the input and output dimension MUST BE SAME?
I have been using S-function level-1 to do the following:
[a1, b1] = choose_cells(c, d);
where a1 and b1 are outputs, c and d are inputs. All the variables are having a single value, except d is an array with 6 values.
Referring to the image attached, we all know that in S-function block, the input dimension must be SAME as output dimension, else we will get error, in this case, the input dimension is 7 while the output dimension is 2, so I have to include the "Terminator" blocks in the diagram for it to work perfectly, otherwise, I will get an error.
My problem is, when the system gets bigger, the array d could contain hundreds of variables, using this method, it means I would have to add hundreds of "Terminator" blocks in order to get this work, this definitely does not sound practical.
Could you please suggest me a wise way to implement this?
Thanks in advance.
http://imgur.com/ib6BTTp
http://imageshack.us/content_round.php?page=done&id=4tHclZ2klaGtl66S36zY2KfO5co