I'm trying to understand how data is interpreted in Caffe.
For that I've taken a look at the Minst Tutorial
Looking at the input data definition:
layers {
name: "mnist"
type: DATA
data_param {
source: "mnist_train_lmdb"
backend: LMDB
batch_size: 64
scale: 0.00390625
}
top: "data"
top: "label"
}
I've now looked at the mnist_train_lmdb and taken one of the entries (shown in hex):
0801101C181C229006
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
00000000000054B99F973C2400000000000000000000000000000000
000000000000DEFEFEFEFEF1C6C6C6C6C6C6C6C6AA34000000000000
00000000000043724872A3E3FEE1FEFEFEFAE5FEFE8C000000000000
000000000000000000000011420E4343433B15ECFE6A000000000000
00000000000000000000000000000000000053FDD112000000000000
000000000000000000000000000000000016E9FF5300000000000000
000000000000000000000000000000000081FEEE2C00000000000000
000000000000000000000000000000003BF9FE3E0000000000000000
0000000000000000000000000000000085FEBB050000000000000000
00000000000000000000000000000009CDF83A000000000000000000
0000000000000000000000000000007EFEB600000000000000000000
00000000000000000000000000004BFBF03900000000000000000000
0000000000000000000000000013DDFEA60000000000000000000000
00000000000000000000000003CBFEDB230000000000000000000000
00000000000000000000000026FEFE4D000000000000000000000000
00000000000000000000001FE0FE7301000000000000000000000000
000000000000000000000085FEFE3400000000000000000000000000
000000000000000000003DF2FEFE3400000000000000000000000000
0000000000000000000079FEFEDB2800000000000000000000000000
0000000000000000000079FECF120000000000000000000000000000
00000000000000000000000000000000000000000000000000000000
2807
(I've added the line breaks here to be able to see the '7' digit.)
Now my question is, where this format is described? Or put differently where is defined that the first 36 bytes are some sort of header and the last 8 bytes have some label correspondence?
How would I go about constructing my own data?
Neither Blob Tutorial nor Layers Definition give much away about required formats. My intention is not to use image data, but time series
Thanks!
I realized that protocol buffers must come into play here. So I tried to deserialize it against some of the types defined in caffe.proto.
Datum seems to be the perfect fit:
{Caffe.Datum}
Channels: 1
Data: {byte[784]}
Encoded: false
FloatData: Count = 0
Height: 28
Label: 7
Width: 28
So the answer is simply: It's a serialized representation of a 'Datum' typed instance as defined per caffe.proto
Btw. since english is not my native language I had to first realize that "Datum" is a singular form of "data"
When it comes to using your own data, it's structured as follows:
The conventional blob dimensions for data are number N x channel
K x height H x width W. Blob memory is row-major in layout so the last
/ rightmost dimension changes fastest. For example, the value at index
(n, k, h, w) is physically located at index ((n * K + k) * H + h) * W
+ w.
See Blobs, Layers, and Nets: anatomy of a Caffe model for reference
I can try to answer your second question. Since Caffe only takes data in a bunch of selected formats like lmdb, hdf5 etc., it is best to convert (or generate - in case of synthetic data) your data to these formats. Following links can help you in this. If you have trouble with import hdf5in Python, then you may refer to this page.
Creating an LMDB file in Python
Writing an HDF5 file in Python
HDF5 more examples
Related
I am generating some summaries using a fine-tuned BART model, and I've noticed something strange. If I feed the labels to the model, it will always generate summaries of the same length of the label, whereas if I do not pass the labels to the model, it generates outputs of length 1024 (max BART seq length). This is unexpected, so I'm trying to understand if there is any problem / bug with the reproducible example below
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
model=AutoModelForSeq2SeqLM.from_pretrained('facebook/bart-large-cnn')
tokenizer=AutoTokenizer.from_pretrained('facebook/bart-large-cnn')
sentence_to_summarize = ['This is a text to summarise. I just went for a walk in the park and saw very large crowds gathering to watch an impromptu football match']
encoded_dict = tokenizer.batch_encode_plus(sentence_to_summarize, return_tensors='pt', max_length=1024, padding='max_length')
input_ids = encoded_dict['input_ids']
attention_mask = encoded_dict['attention_mask']
label = tokenizer.encode('I went to the park', return_tensors='pt')
Notice the following two cases.
Case 1:
output = model(input_ids=input_ids, attention_mask=attention_mask)
print(output['logits'].shape)
shape printed is torch.Size([1, 1024, 50264])
Case 2
output = model(input_ids=input_ids, attention_mask=attention_mask, labels=label)
print(output['logits'].shape)
shape printed is torch.Size([1, 7, 50264]) where 7 is the length of the label 'I went to the park' (including start and end tokens).
Ideally the summarization model would learn when to generate the EOS token, but this should not always lead to summaries of identical length of the gold output (i.e. the label). Why is the label length influencing the model output in this way?
I would expect the only difference between cases 1 and 2 being that in the second case the output also contains the loss value, but I wouldn't expect this to influence the logits in any way
Original example not use label parameter
https://huggingface.co/docs/transformers/v4.22.1/en/model_doc/bart#transformers.BartForConditionalGeneration.forward.example
label parameter is optional and i think not used for summerizing
https://huggingface.co/docs/transformers/v4.22.1/en/model_doc/bart#transformers.BartForConditionalGeneration.forward.labels
In caffe, the convolution layer takes one bottom blob, and convolves it with learned filters (which are initialized using the weight type - "Xavier", "MSRA" etc.). However, my question is whether we can simply convolve two bottom blobs and produce a top blob. What would be the most elegant way of doing this? The purpose of this is: one of the bottom blob will be data and the other one will be a dynamic filter (changing depending on the data) produced by previous layers (I am trying to implement dynamic convolution).
My attempt:
One way which came to my mind was to modify the filler.hpp and assign a bottom blob as a filler matrix itself (instead of "Xavier", "MSRA" etc.). Then I thought the convolution layer would pick up from there. We can set lr = 0 to indicate that the weight initialized by our custom filler should not be changed. However, after I looked at the source code, I still don't know how to do it. On the other hand, I don't want to break the workflow of caffe. I still want conv layers to function normally, if I want them to.
Obviously a more tedious way is to use a combination of Slice, tile and/or Scale layer to literally implement convolution. I think it would work, but it will turn out to be messy. Any other thoughts?
Edit 1:
I wrote a new layer by modifying the convolution layer of caffe. In particular, in src/caffe/layers/conv_layer.cpp, on line 27, it takes the weight defined by the filler and convolves it with the bottom blob. So instead of populating that blob from the filler, I modified the layer such that it now takes two bottoms. One of the bottom directly gets assigned to the filler. Now I had to make some other changes such as:
weight blob has the same value for all the samples. Here it will have a different value for different samples. So I changed line 32 from:
this->forward_cpu_gemm(
bottom_data + n * this->bottom_dim_,
weight,
top_data + n * this->top_dim_);
to:
this->forward_cpu_gemm(
bottom_data + n * bottom[1]->count(1),
bottom[0]->cpu_data() + n * bottom[0]->count(1),
top_data + n * this->top_dim_);
To make things easier, I assumed that there is no bias term involved, stride is always 1, padding can always be 0, group will always be 1 etc. However, when I tested the forward pass, it gave me some weird answer (with a simple convolution kernel = np.ones((1,1,3,3)). The learning rates were set to zero for this kernel so that it doesn't change. However, I can't get a right answer. Any suggestions will be appreciated.
Please do not propose solutions using existing layers such as Slice, Eltwise, Crop. I have already implemented - it works - but it is unbelievably complex and memory inefficient.
I think you are on the right way as a whole.
For the "weird" convolution results, I guess the bug most possibly is:
Consider 2D convolution
and suppose bottom[1]'s shape is (num, channels, height, width),
since convolution in caffe is performed as a multiplication of 2 matrix, weight(representing convolution kernels) and col_buffer(reorganized from data to be convolved), and weight is of num_out rows and channels / this->group_ * kernel_h * kernel_w columns, col_buffer is of channels / this->group_ * kernel_h * kernel_w rows and height_out * width_out columns, so as a weight blob of dynamic convolution layer, bottom[0]'s shape should better be (num, num_out, channels/group, kernel_h, kernel_w) to satisfy
bottom[0]->count(1) == num_out * channels / this->group_ * kernel_h * kernel_w
, in which num_out is the number of the dynamic convolution layer's output feature maps.
That means, to make the convolution function
this->forward_cpu_gemm(bottom_data + n * bottom[1]->count(1)
, bottom[0]->cpu_data() + n * bottom[0]->count(1)
, top_data + n * this->top_dim_);
work properly, you must make sure that
bottom[0]->shape(0) == bottom[1]->shape(0) == num
bottom[0]->count(1) == num_out * channels / this->group_ * kernel_h * kernel_w
So most possibly the simple convolution kernel of 4-dimension np.ones((1,1,3,3)) you used may not satify the above condition and result in the wrong convolution results.
Hope it's clear and will help you.
########## Update 1, Oct 10th,2016,Beijing time ##########
I add a dynamic convolution layer here but with no unit test yet. This layer doesn't break the workflow of caffe and only change some private members of BaseConvolution class to be protected.
The files involved are:
include/caffe/layers/dyn_conv_layer.hpp,base_conv_layer.hpp
src/caffe/layers/dyn_conv_layer.cpp(cu)
It grows almost the same with the convolution layer in caffe, and the differences mainly are:
Override the function LayerSetUp() to initialize this->kernel_dim_, this->weight_offset_ etc properly for convolution and ignore initializing this->blobs_ used by Convolution layer routinely to contain weight and bias;
Override the function Reshape() to check that the bottom[1] as a kernel container has proper shape for convolution.
Because I have no time to test it, there may be bugs and I will be very glad to see your feedbacks.
########## Update 2, Oct 12th,2016,Beijing time ##########
I updated test case for dynamic convolution just now. The involved file is src/caffe/test/test_dyn_convolution_layer.cpp. It seems to work fine, but maybe need more thorough tests.
You can build this caffe by cd $CAFFE_ROOT/build && ccmake .., cmake -DBUILD_only_tests="dyn_convolution_layer" .. and make runtest to check it.
I started with Caffe and the mnist example ran well.
I have the train and label data as data.mat. (I have 300 training data with 30 features and labels are (-1, +1) that have saved in data.mat).
However, I don't quite understand how I can use caffe to implement my own dataset?
Is there a step by step tutorial can teach me?
Many thanks!!!! Any advice would be appreciated!
I think the most straight forward way to transfer data from Matlab to caffe is via HDF5 file.
First, save your data in Matlab in an HDF5 file using hdf5write. I assume your training data is stored in a variable name X of size 300-by-30 and the labels are stored in y a 300-by-1 vector:
hdf5write('my_data.h5', '/X',
single( permute(reshape(X,[300, 30, 1, 1]),[4:-1:1]) ) );
hdf5write('my_data.h5', '/label',
single( permute(reshape(y,[300, 1, 1, 1]),[4:-1:1]) ),
'WriteMode', 'append' );
Note that the data is saved as a 4D array: the first dimension is the number of features, second one is the feature's dimension and the last two are 1 (representing no spatial dimensions). Also note that the names given to the data in the HDF5 are "X" and "label" - these names should be used as the "top" blobs of the input data layer.
Why permute? please see this answer for an explanation.
You also need to prepare a text file listing the names of all hdf5 files you are using (in your case, only my_data.h5). File /path/to/list/file.txt should have a single line
/path/to/my_data.h5
Now you can add an input data layer to your train_val.prototxt
layer {
type: "HDF5Data"
name: "data"
top: "X" # note: same name as in HDF5
top: "label" #
hdf5_data_param {
source: "/path/to/list/file.txt"
batch_size: 20
}
include { phase: TRAIN }
}
For more information regarding hdf5 input layer, you can see in this answer.
I'm working with some binary waveform files from various early to mid-90's HP scopes. I am trying to do a bulk conversion (we have over 5000) of the files to CSV's and then upload them into a database. I've tried hexdump, xxd, od, strings, etc. and none of them seem to work. I did hunt down a programmers manual but it's not making a whole lot of sense.
The files have a preamble line as ascii text but then the data points are in binary and for some reason nothing I try can decode them. The preamble gives the data necessary to use the binary values and calculate the correct values. It also states that the data is in WORD format.
:WAV:PRE 2,1,32768,1,+4.000000E-08,-4.9722700001108E-06,0,+2.460630E-04,+2.500000E+00,16384;:WAV:DATA #800065536^W�^W�^W�^
I'm pretty confused.
Have a look at
http://www.naic.edu/~phil/hardware/oscilloscopes/9000A_Programmer_Reference.pdf
specifically page 1-21. After ":WAV:DATA", I think the rest of the chunk above will have 65536 8-bit data bytes (the start of which is represented above by �) . The ^W is probably a delimiter, so you would have to parse that out. Just a thought.
UPDATE: I'm new to oscilloscope data collection and am trying to figure the whole thing out from scratch. So, on further digging, it looks like the data you have provided shows this:
PREamble:
- WORD format (16-bit signed integers split into 2 8-bit bytes)
- If there is a WAV:BYT section, that would specify byte order for each pair
- RAW data
- 32768 data points
- COUNT = 1 (I'm not clear on the meaning of this)
- Next 3 should be X increment, origin, reference
- Next 3 should be Y increment, origin, reference, although the manual that I pointed you at above has many more fields than just these, so you might want to consult your specific scope manual.
DATA:
- On closer examination, I don't think the ^W is a delimiter, I think it is the first byte of the pair (0010111). The � character is apparently a standard "I don't know how to represent this character" web representation. You would need to look at that character as 8 bits also.
- 65536 byte pairs of data
I'm not finding a utility that will do this for you. I think you're going to have to write or acquire some code (Perl, C, Java, Python, VB, etc.) to get this done.
Is there a program or source code for data generation?
I want a data generator for Java. (Language does not matter, if I can get the result file)
I want a correlated data, anti-correlated data, independent data.
I want a data generator program that has
input : min, max, data-distribution (ex., independent, anti-correlated, correlated, Gaussian, Poisson ... ), dimension, # of points (n)
output : n points that follows given data-distribution.
Thank you :)
You can change the interval of the generated numbers with some simple math:
Random r=new Random();
floatx=(r.nextFloat()%(max+min))-min;
The java random class also has an option to return gaussian distributed values.