Amazon SageMaker Factorization Machine inconsistent prediction results - JSON

Amazon SageMaker's Factorization Machine model's inference results differ based on the input data format: I am receiving different prediction results depending on whether the inference data is JSON or protobuf.
My JSON input data is sparse.
Protobuf RecordIO input data is also sparse.
I had assumed that, no matter the input data format, the Factorization Machine's predictions should be identical. Is that not the case?
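For reference, a minimal sketch of the two request payloads being compared (the feature indices, values, and dimension below are made-up examples; the JSON layout follows SageMaker's documented sparse inference format for Factorization Machines, and the protobuf body is built with the SageMaker Python SDK):

import io
import json
import scipy.sparse as sp
from sagemaker.amazon.common import write_spmatrix_to_sparse_tensor

# Sparse JSON payload (ContentType: application/json); keys, values and
# shape are hypothetical stand-ins for the real features.
payload = {"instances": [{"data": {"features": {
    "keys": [26, 182, 232], "shape": [178600], "values": [1.0, 1.0, 1.0]}}}]}
json_body = json.dumps(payload)

# The same record as RecordIO-protobuf (ContentType: application/x-recordio-protobuf).
x = sp.csr_matrix(([1.0, 1.0, 1.0], ([0, 0, 0], [26, 182, 232])), shape=(1, 178600))
buf = io.BytesIO()
write_spmatrix_to_sparse_tensor(buf, x)
protobuf_body = buf.getvalue()

If both payloads encode exactly the same indices and values, the predictions should match; a mismatch in feature indexing between the two encodings is one mundane explanation worth ruling out.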

Related

Is it possible to use a transformer network architecture as an autoencoder to perform anomaly detection?

I would like to use the efficiency of the transformer architecture to do anomaly detection on time series. I am wondering:
Can we slightly modify the architecture to create a bottleneck in the transformer network (similar to a fully connected autoencoder, or an AE with LSTMs)?
Does it actually make sense to try to do that?
I would like the transformer to learn to reconstruct the input sequence at its output, via some intermediate latent space of lower dimensionality (a bottleneck).
My idea was to reduce d_model (the number of variables in the time series, or the embedding dimension in NLP), but according to `torch.nn.Transformer` it must be the same size as the input series (see here).
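For illustration, a minimal sketch of the kind of bottleneck I have in mind (an untested idea, not an established architecture: instead of shrinking d_model itself, linear layers project the d_model-sized representation down to a small latent size and back):

import torch
import torch.nn as nn

class TransformerBottleneckAE(nn.Module):
    def __init__(self, d_model=64, d_latent=8, nhead=4, num_layers=2):
        super().__init__()
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)
        self.to_latent = nn.Linear(d_model, d_latent)    # bottleneck down
        self.from_latent = nn.Linear(d_latent, d_model)  # bottleneck up
        dec_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.decoder = nn.TransformerEncoder(dec_layer, num_layers)

    def forward(self, x):                        # x: (batch, seq_len, d_model)
        z = self.to_latent(self.encoder(x))      # (batch, seq_len, d_latent)
        return self.decoder(self.from_latent(z))

model = TransformerBottleneckAE()
x = torch.randn(32, 100, 64)                     # toy multivariate series
loss = nn.functional.mse_loss(model(x), x)       # reconstruction error as anomaly score

Note the latent here is still per time step; pooling over the sequence before the bottleneck would instead give a single latent vector per window.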

Number of parameters and FLOPs in ONNX and TensorRT models

Do the number of parameters and FLOPs (floating-point operations) change when converting a model from PyTorch to ONNX or TensorRT format?
I don't think Anvar's post answered OP's question thoroughly, so I did a bit of research. Some general info before the answers to the questions, as I believe OP hasn't fully understood what ONNX and TensorRT optimizations happen during the conversion from PyTorch format.
Both conversions, PyTorch to ONNX and ONNX to TensorRT, increase the performance of the model by applying several different optimizations. The tools actually print information about what they do if you enable their verbose flags.
The preferred way to convert a PyTorch model to TensorRT is to use Torch-TensorRT, as explained here.
TensorRT fuses layers and tensors in the model graph; it then uses a large kernel library to select implementations that perform best on the target GPU.
ONNX Runtime offers mostly graph optimizations, such as graph simplifications and node fusions, to improve performance.
1. Does the number of parameters change when converting a PyTorch model to ONNX or TensorRT?
No: even though the layers are fused, the number of parameters does not decrease unless there are some redundant branches in the model.
I tested this by downloading the yolov5s.onnx model here. The original model has 7.2M parameters according to the repository authors. Then I used this tool to count the number of parameters in the yolov5s.onnx model and got 7225917 as a result. Thus, the ONNX conversion did not reduce the number of parameters.
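If you want to reproduce this without a third-party tool, here is a small sketch that counts parameters by summing the initializer tensors in the ONNX graph (this assumes the weights are stored as graph initializers, which is the usual case for exported models):

import onnx
from onnx import numpy_helper

model = onnx.load("yolov5s.onnx")
# Sum the element counts of all weight/bias tensors stored in the graph.
total = sum(numpy_helper.to_array(t).size for t in model.graph.initializer)
print(f"total parameters: {total}")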
I was not able to get equally detailed information for the TensorRT model, but you can get layer information using trtexec. There is a recent question about this, but there are no answers yet.
2. Does the number of FLOPs change when converting a PyTorch model to ONNX or TensorRT?
According to this post, no.
I know that since some of the newer versions of PyTorch (I used 1.8 and it worked for me), batch-norm layers and convolutions are fused while saving the model. I'm not sure about ONNX, but TensorRT actively uses horizontal and vertical fusion of different layers, so the final model is computationally cheaper than the model you initialized.
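To make the batch-norm/convolution fusion mentioned above concrete, here is a minimal sketch using PyTorch's manual fuse_modules utility (not necessarily what the exporters do internally, but the same folding idea): the BN statistics are folded into the conv weights, so the fused model computes the same result with fewer operations.

import torch
import torch.nn as nn

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3)
        self.bn = nn.BatchNorm2d(16)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

m = Block().eval()                                    # fusion requires eval mode
fused = torch.quantization.fuse_modules(m, [["conv", "bn", "relu"]])
x = torch.randn(1, 3, 32, 32)
assert torch.allclose(m(x), fused(x), atol=1e-5)      # same outputs, fewer ops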

When to use the TensorFlow Dataset API versus pandas or numpy

There are a number of guides I've seen on using LSTMs for time series in TensorFlow, but I am still unsure about the current best practices for reading and processing data - in particular, when one is supposed to use the tf.data.Dataset API.
In my situation I have a file data.csv with my features, and would like to do the following two tasks:
Compute targets - the target at time t is the percent change of some column at some horizon, i.e.,
labels[i] = features[i + h, -1] / features[i, -1] - 1
I would like h to be a parameter here, so I can experiment with different horizons.
Get rolling windows - for training purposes, I need to roll my features into windows of length window:
train_features[i] = features[i: i + window]
I am perfectly comfortable constructing these objects using pandas or numpy, so I'm not asking how to achieve this in general - my question is specifically what such a pipeline ought to look like in TensorFlow.
Edit: I guess I'd also like to know whether the two tasks I listed are suited to the Dataset API, or if I'm better off using other libraries to deal with them.
First off, note that you can use the Dataset API with pandas or numpy arrays, as described in the tutorial:
If all of your input data fit in memory, the simplest way to create a
Dataset from them is to convert them to tf.Tensor objects and use
Dataset.from_tensor_slices()
A more interesting question is whether you should organize the data pipeline with session feed_dict or via Dataset methods. As already stated in the comments, the Dataset API is more efficient because the data flows directly to the device, bypassing the client. From the "Performance Guide":
While feeding data using a feed_dict offers a high level of
flexibility, in most instances using feed_dict does not scale
optimally. However, in instances where only a single GPU is being used
the difference can be negligible. Using the Dataset API is still
strongly recommended. Try to avoid the following:
# feed_dict often results in suboptimal performance when using large inputs
sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
But, as they say themselves, the difference may be negligible, and the GPU can still be fully utilized with ordinary feed_dict input. When training speed is not critical, there's no difference; use whichever pipeline you feel comfortable with. When speed is important and you have a large training set, the Dataset API seems the better choice, especially if you plan distributed computation.
The Dataset API also works nicely with text data such as CSV files; check out this section of the dataset tutorial.
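As for the two concrete tasks in the question, here is a minimal sketch of how they might look with tf.data (assuming features is a 2-D array already loaded from data.csv, h and window are the horizon and window length, and pairing each window with the label of its last row is one plausible alignment):

import numpy as np
import tensorflow as tf

features = np.random.rand(1000, 5).astype("float32")   # stand-in for data.csv
h, window = 5, 30

# Task 1: percent-change targets at horizon h.
labels = features[h:, -1] / features[:-h, -1] - 1

# Task 2: rolling windows of length `window` over the rows that have a label.
ds = tf.data.Dataset.from_tensor_slices(features[:-h])
ds = ds.window(window, shift=1, drop_remainder=True)
ds = ds.flat_map(lambda w: w.batch(window))            # each element: (window, n_features)

# Pair each window with the label of its last row.
label_ds = tf.data.Dataset.from_tensor_slices(labels[window - 1:])
train = tf.data.Dataset.zip((ds, label_ds)).batch(32)

Both tasks are one-liners in pandas/numpy as well, so for data that fits in memory the main gain from tf.data here is the batching and prefetching machinery rather than the windowing itself.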

How to feed the weights of the neurons to the blobs in caffe and vice versa?

I am a complete newbie to Caffe.
I have a huge weight vector, which contains the weights connecting the neurons of a neural network, written in C++. I want to use this weight vector to define a neural network in Caffe, with these values as the initial weights of the connections. How do I feed these weights into Caffe blobs, which are the fundamental way to hold parameter values like weights and biases in Caffe?
After every iteration, when the weights get updated, I also want to read their values back from the blobs and put them into this huge weight vector, which I access from the remaining C++ code.
Please tell me how to code this in Caffe. It is essentially serialization and deserialization of the weight vector to and from blobs; a rough sketch of the round trip I am after is below.
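For illustration, a minimal pycaffe sketch of that round trip (file names are hypothetical; the C++ side would use the analogous Net::params() and Blob::mutable_cpu_data() accessors):

import numpy as np
import caffe

net = caffe.Net("model.prototxt", caffe.TEST)   # hypothetical network definition
flat = np.load("weights.npy")                   # the external flat weight vector

# Deserialize: copy slices of the flat vector into each layer's parameter blobs.
offset = 0
for name, blobs in net.params.items():
    for blob in blobs:                          # typically [weights, biases]
        n = blob.data.size
        blob.data[...] = flat[offset:offset + n].reshape(blob.data.shape)
        offset += n

# Serialize: after a training step, copy the updated values back out.
flat_out = np.concatenate([b.data.ravel()
                           for blobs in net.params.values() for b in blobs])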
Any help will be greatly appreciated

Couchbase: is there a base64-encoding overhead on storing binary data?

It's known (see this answer here) that Couchbase provides binary data as a base64-encoded document when using MapReduce queries.
However, does it store it as base64 too? From libcouchbase's perspective, it takes a byte array + length; does it get converted to base64 later?
The Couchbase storage engine stores your data exactly as-is (i.e. the stream of bytes of the length you specify) internally. When reading that data using the CRUD key/value API, at the protocol level you get back the exact same stream of bytes.
This is possible because the low-level key-value protocol is binary on the wire, and so there are no issues with using all 8 bits per byte.
Different client SDKs expose that to you in different ways. For example:
The C SDK (being low-level) directly gives you back a char* buffer and length.
The Python SDK provides a transcoding feature, where it uses a flag in the document's metadata to encode the type of the document, so it can automatically convert it back to the original type, for example a Python serialised object or a JSON object.
On the other hand, the Views API is done over HTTP with JSON response objects. JSON cannot directly encode 8-bit binary data, so Couchbase needs to use base64 encoding for the view response objects if they contain binary data.
(As an aside, this is one of the reasons why it is recommended to have an index emit the minimum amount of data needed, for example just the key of the document(s) of interest, and then use the CRUD key/value interface to actually get the document - the key/value interface doesn't have the base64 overhead when transmitting data back.)
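For a sense of scale, a quick Python illustration of the overhead base64 adds on the view path (4 output bytes for every 3 input bytes, roughly 33%):

import base64
import os

blob = os.urandom(3000)            # arbitrary binary document body
encoded = base64.b64encode(blob)
print(len(blob), len(encoded))     # 3000 4000 -> ~33% larger on the wire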