What is adaptation data? - terminology

I came across the term "adaptation data" while preparing a document about a software release. The guidelines require a description of the adaptation data. So:
What is adaptation data?
What does adaptation data contain?
How can I describe the adaptation data?

Adaptation data are configuration data that "adapt" a program to a particular installation site or to a given set of conditions and operating environment. Changing the max dB in an adaptation file for a car stereo firmware, based on the amplification that the speakers are capable of, would be an adaptation data change. Calibrating two cameras for stereoscopic vision based on their physical configuration at a site would be adaptation data. Changing the image filtering algorithm of a camera per environment is also adaptation. Changing what is logged from a program per user preference is probably not adaptation data.
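As a rough illustration of the car stereo example, the sketch below keeps the site-specific values in a separate adaptation file, so the program itself stays the same between installations. The file layout, the key names (max_output_db, speaker_model) and the function names are hypothetical and not taken from any real firmware.

```python
import json

# Defaults built into the program; adaptation data overrides them per site.
DEFAULTS = {"max_output_db": 80.0, "speaker_model": "generic"}


def load_adaptation(text):
    """Merge site-specific adaptation values over the built-in defaults."""
    site_values = json.loads(text)
    unknown = set(site_values) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown adaptation keys: {sorted(unknown)}")
    return {**DEFAULTS, **site_values}


def clamp_volume(requested_db, adaptation):
    """Limit the requested level to what this installation's speakers allow."""
    return min(requested_db, adaptation["max_output_db"])


# Contents of the adaptation file for one particular installation (hypothetical).
site_file = '{"max_output_db": 92.5, "speaker_model": "XS-200"}'
adaptation = load_adaptation(site_file)
print(clamp_volume(100.0, adaptation))
```

When describing adaptation data in a release document, a table of such keys, their meaning, their defaults and who is expected to change them is usually what is being asked for.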

Best practices to fine-tune a model?

I have a few questions regarding the fine-tuning process.
I'm building an app that is able to recognize data from the following documents:
ID Card
Driving license
Passport
Receipts
All of them have different fonts (especially receipts) and it is hard to match exactly the same font and I will have to train the model on a lot of similar fonts.
So my questions are:
Should I train a separate model for each of the document types for better performance and accuracy, or is it fine to train a single eng model on a bunch of fonts that are similar to the fonts used on these types of documents?
How many pages of training data should I generate per font? By default, I think tesstrain.sh generates around 4k pages.
Maybe any suggestions on how I can generate training data that is closest to real input data
How many iterations should be used?
For example, if I'm using some font that has a high error rate and I want to target 98% - 99% accuracy rate.
Also, perhaps some of you have experience working with this type of document and know some common fonts that are used for them?
I know that MRZ in passport and id cards is using OCR-B font, but what about the rest of the document?
Thanks in advance!
Ans 1
you can train a single model to achieve the same but if you want to detect different languages then I think you will need different models.
Ans 2
If you are looking for some datasets then have a look at this Mnist Png Dataset which has digits as well as alphabets from various computer-based fonts. Here is a link to some starter code to use the data set implemented in Pytorch.
Ans 3
You can use optuna to find the best set of params for your model, but you will need some of the
using-optuna-to-optimize-pytorch-hyperparameters
Have a look at these
PAN-Card-OCR
document-details-parsing-using-ocr
They are trying to achieve similar task.
Hope it answers your Question...!
I would train a classifier on the 4 different types to classify an ID, license, passport, or receipt. Basically, so you know that a passport is a passport vs. a driver's license, etc. Then I would have 4 more models that are used for translating each specific type (passport, driver's license, ID, and receipts). It should be noted that if you are working with multiple languages, this will likely mean making 4 models for each specific language, meaning that if you have L languages you may need 4*L models for translating those.
Likely a lot. I don’t think that font is really an issue. Maybe what you should do is try and define some templates for things like drivers license and then generate based on that template?
This is the least of your problems, just test for this.
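The classify-then-dispatch idea from this answer can be sketched as follows. Both the classifier and the per-type readers are hypothetical placeholders (plain functions with made-up rules), standing in for trained models that a real system would load.

```python
from typing import Callable, Dict

DOC_TYPES = ("id_card", "driving_license", "passport", "receipt")


def classify_document(image: bytes) -> str:
    """Placeholder classifier: a real one would be a trained image model."""
    return "passport" if image.startswith(b"P<") else "receipt"


def make_reader(doc_type: str) -> Callable[[bytes], str]:
    """Build a placeholder per-type reader that tags its output."""
    def read(image: bytes) -> str:
        return f"[{doc_type}] {image.decode('ascii', errors='replace')}"
    return read


# One reader per document type; with L languages this table would be
# keyed by (doc_type, language) and hold 4 * L entries.
READERS: Dict[str, Callable[[bytes], str]] = {t: make_reader(t) for t in DOC_TYPES}


def recognize(image: bytes) -> str:
    """Stage 1: classify the document. Stage 2: dispatch to that type's model."""
    doc_type = classify_document(image)
    return READERS[doc_type](image)


print(recognize(b"P<UTOERIKSSON<<ANNA"))
```

Keeping the dispatch table separate from the models makes it straightforward to retrain or replace one document type without touching the others.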
Assuming you are referring to a ML data model that might be used to perform ocr using computer vision I'd recommend to:
Setup your taxonomy as required by your application requirements.
This means categorizing the expected font sets per type of scanned document (PNG, JPG, TIFF, etc.) to include inside the appropriate dataset. Select the fonts closest to the ones being used, as well as the type of information you need to gather (digits only, alphabetic characters).
Perform data cleanup on your dataset and make sure you have homogeneous data for the OCR functionality. For example, all document images should be of PNG type, with max dimensions of 46x46, to have an appropriate training model. Note that higher resolution images and smaller scale mean higher accuracy.
Cater for handwriting as well, if you have damaged or non-visible font images. This might improve character conversion options in cases where fonts on paper are not clearly visible or are worn out.
In case you are using keras module with TF on mnist provided datasets, setup a cancellation rule for ML model training when you reach 98%-99% accuracy for more control in case you expect your fonts in images to be error-prone (as stated above). This helps avoid higher margin of errors when you have bad images in your training dataset. For a dataset of 1000+ images, a good setting would be using TF Dense of 256 and 5 epochs.
A sample training dataset can be found here.
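The "cancellation rule" described above amounts to stopping training as soon as a target accuracy is reached. Here is a framework-free sketch of that rule; train_until, run_epoch and the accuracy values are illustrative assumptions, not part of Keras or TensorFlow.

```python
def train_until(target_accuracy, max_epochs, run_epoch):
    """Run epochs until accuracy reaches the target or max_epochs is hit.

    run_epoch(epoch) must return the accuracy measured after that epoch.
    Returns (epochs_run, last_accuracy).
    """
    accuracy = 0.0
    for epoch in range(1, max_epochs + 1):
        accuracy = run_epoch(epoch)
        if accuracy >= target_accuracy:
            return epoch, accuracy  # stop early: target reached
    return max_epochs, accuracy


# Made-up accuracy curve standing in for a real training step.
fake_history = [0.91, 0.955, 0.983, 0.99, 0.992]
epochs_run, final_accuracy = train_until(0.98, 5, lambda e: fake_history[e - 1])
print(epochs_run, final_accuracy)
```

In Keras the same rule is usually written as a custom callback whose on_epoch_end checks the logged accuracy and sets self.model.stop_training = True.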
If you just need to do some automation with your application or do data entry that requires OCR conversion from images, a good open source solution would be to gather the information automatically via the PSImaging module (PowerShell), then take the degrees of confidence it returns (from PNG images) and run them against your current datasets to improve your character match accuracy.
You can find the relevant link here

Is there an open source solution for Multiple camera multiple object (people) tracking system?

I have been trying to tackle a problem where I need to track multiple people through multiple camera viewpoints on a real-time basis.
I found a solution DeepCC (https://github.com/daiwc/DeepCC) on DukeMTMC dataset but unfortunately, this solution has been taken down because of data confidentiality issues. They were using Fast R-CNN for object detection, triplet loss for Re-identification and DeepSort for real-time multiple object tracking.
Questions:
1. Can someone share some other resources regarding the same problem?
2. Is there a way to download and still use the DukeMTMC database for multiple tracking problem?
3. Is anyone aware when the official website (http://vision.cs.duke.edu/DukeMTMC/) will be available again?
Please feel free to provide different variations of the question :)
The Intel OpenVINO framework has all the parts of this task:
Object detection with pretrained Faster RCNN, SSD or YOLO.
Reidentification models.
A complete demo application.
You can also use other models. Or, if you want to run detection on a GPU, take opencv_dnn_cuda for detection and OpenVINO for reidentification.
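To show how a reidentification step can link detections across cameras, here is a minimal sketch that matches appearance embeddings by cosine similarity. The embeddings, the 0.8 threshold and the function names are illustrative assumptions. A real system would take the embeddings from a reidentification model and would usually also use timing and camera layout.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_identity(embedding, gallery, threshold=0.8):
    """Return the best-matching known identity, or None if below threshold."""
    best_id, best_score = None, threshold
    for person_id, known in gallery.items():
        score = cosine_similarity(embedding, known)
        if score >= best_score:
            best_id, best_score = person_id, score
    return best_id


# Gallery built from camera A; queries come from camera B (toy values).
gallery = {"person_1": [0.9, 0.1, 0.0], "person_2": [0.0, 0.2, 0.95]}
print(match_identity([0.85, 0.15, 0.05], gallery))  # close to person_1
print(match_identity([0.5, 0.5, 0.5], gallery))     # no confident match
```

A query that matches nobody would typically be added to the gallery as a new identity.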
A good deep learning library that I have used in the past for my work is called Mask R-CNN, or Mask Region-based Convolutional Neural Network. Although I have only used this algorithm on images and not on videos, the same principles apply, and it's very easy to make the transition to detecting objects in a video. The algorithm uses TensorFlow and Keras, and you split your input data, i.e. images of people, into two sets: training and validation.
For training, use third-party software like VIA to annotate the people in the images. After the annotations have been drawn, you export a JSON file with all the annotations, which will be used for the training process. Do the same thing for the validation phase, BUT make sure the images in the validation set have not been seen before by the algorithm.
Once you have annotated both groups and generated JSON files, you can start training the algorithm. Mask R-CNN makes it very easy to train: all you need to do is pass one command line to start it. If you want to train on your GPU instead of your CPU, then install Nvidia's CUDA, which works very well with supported GPUs and requires no coding after the installation.
During the training stage, you will be generating weights files, which are stored in the .h5 format. Depending on the number of epochs you choose, there will be a weights file generated per epoch. Once the training has finished, you then will just have to reference that weights file anytime you want to detect relevant objects, i.e. in your video feed.
Some important info:
Mask R-CNN is a somewhat older algorithm, but it still works flawlessly today. Although some people have updated the algorithm to TensorFlow 2.0+, to get the best use out of it, use the following.
Tensorflow-gpu 1.13.2+
Keras 2.0.0+
CUDA 9.0 to 10.0
Honestly, the hardest part for me in the past was not using the algorithm, but finding the right versions of Tensorflow, Keras, and CUDA, that all play well with each other, and don't error out. Although the above-mentioned versions will work, try and see if you can upgrade or downgrade certain libraries to see if you can get better results.
Article about Mask R-CNN with video, I find it to be very useful and resourceful.
https://www.pyimagesearch.com/2018/11/19/mask-r-cnn-with-opencv/
The GitHub repo can be found below.
https://github.com/matterport/Mask_RCNN
EDIT
You can use this method across multiple cameras: just set up multiple video captures within a computer vision library like OpenCV. I assume this would be done with Python, which both Mask R-CNN and OpenCV are primarily based in.

Downloading Quotes in CSV format from Yahoo Finance - Beta symbol?

By using http://finance.yahoo.com/d/quotes.csv?s=STOCKNAME&f= I am able to download a CSV file. Does anyone know what the symbol for beta is? It should go after &f=, e.g. the symbol for the stock name is n and it goes in as such: http://finance.yahoo.com/d/quotes.csv?s=STOCKNAME&f=n
Thanks in advance for your help!
Unfortunately you can't.
There is no beta 'symbol' that lets you download beta using Yahoo's CSV API.
With that being said, it may be important to note
Though plenty of financial sites provide them, what risks are you
taking by using one of the betas provided by an outside source? Betas
provided for you by online services have unknown variable inputs,
which in all likelihood are not adaptive to your unique portfolio.
Crucially:
Provided betas are calculated with time frames unknown to their
consumers.
Another problem may be the index used to calculate beta.
Another unknown factor of pre-made betas is the method used to
calculate them.
Yahoo may therefore not provide beta due to it being liable to misinterpretation based on the above (though this is purely speculative).
So then what?
It's actually pretty straightforward to calculate yourself. All you need to do is:
Decide your time horizon for measurement
Decide an appropriate market to measure against
Ensure your chosen investment and market share matching data points across the chosen period (for ease of calculation)
Decide an appropriate risk free rate of return
Decide your model of calculation (e.g. regression or the capital asset pricing model, 'CAPM')
The methodology to then perform the calculation is dependent on what you're trying to accomplish and within what (programming) environment.
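As one possible instance of the regression approach, beta can be estimated as the covariance of the asset's returns with the market's returns, divided by the variance of the market's returns. The sketch below assumes two price series with matching data points; the function names and the toy prices are illustrative. A constant per-period risk-free rate shifts both return series equally and does not change this slope, so it is left out here. It matters for the CAPM intercept (alpha) or when the rate varies over the period.

```python
def simple_returns(prices):
    """Period-over-period returns from a price series."""
    return [(b - a) / a for a, b in zip(prices, prices[1:])]


def beta(asset_prices, market_prices):
    """Estimate beta as cov(asset, market) / var(market) of simple returns."""
    if len(asset_prices) != len(market_prices):
        raise ValueError("price series must cover the same data points")
    asset = simple_returns(asset_prices)
    market = simple_returns(market_prices)
    n = len(market)
    if n < 2:
        raise ValueError("need at least three prices per series")
    mean_a = sum(asset) / n
    mean_m = sum(market) / n
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(asset, market)) / (n - 1)
    var = sum((m - mean_m) ** 2 for m in market) / (n - 1)
    return cov / var


# Toy data: the asset moves twice as much as the market in every period.
market_prices = [100.0, 110.0, 99.0, 108.9]
asset_prices = [100.0, 120.0, 96.0, 115.2]
print(round(beta(asset_prices, market_prices), 6))
```

Because the toy asset's returns are exactly double the market's, the estimate comes out at about 2. With real data the result depends heavily on the time horizon, sampling frequency and index chosen, which is the point made in the quoted passage above.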

MetroEEG Data acquisition rate

I have used MetroEEG for a Windows Phone 8 app which connects the portable Mindwave Mobile to the Windows Phone via Bluetooth. By default the MetroEEG library sets the data acquisition rate for the Mindwave EEG to 1Hz, whereas the Mindwave can support a refresh rate of up to 100Hz.
I've looked at the source code for MetroEEG but have had no luck. How do I set the data acquisition rate from the Mindwave to 100Hz?
So Mindwave Mobile does provide 512 updates every second (per the documentation, page 8). However, the update that's provided that often is the "raw wave sample" and not the "Data Payload Structure". The Data Payload Structure is provided once every second, after 512 raw wave samples.
From what I could tell when doing testing the raw wave samples are completely identical to each other in rapid intervals. You can see that here in the following print out:
MetroEEG filters out the raw wave sample as noise and focuses in on the data payload structure using simple string actions. To see how the actual formatting is done feel free to read Explaining how MetroEEG works.
You can pull down the MetroEEG source and, instead of filtering for "UsefulDataPacket", just do whatever string manipulations you'd like to get any piece of data starting with a [170,170,4] array. Be aware, though, that C#/WP won't support that much string processing, so you'll still have to filter out some results.
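To make the [170,170,4] filtering concrete, here is a sketch in Python rather than C# that pulls raw wave samples out of a byte stream. The packet layout used here (two sync bytes of 170, a payload length of 4, a code byte 0x80, a value length of 2, a signed big-endian 16-bit sample, then a checksum that is the inverted low byte of the payload sum) is an assumption based on NeuroSky's published ThinkGear packet format and should be verified against the documentation mentioned above. The function name is hypothetical and is not part of MetroEEG.

```python
def parse_raw_samples(data):
    """Extract raw wave samples from a byte stream.

    Assumed layout per packet: 170, 170, 4, 0x80, 2, high, low, checksum.
    """
    samples = []
    i = 0
    while i + 8 <= len(data):
        if data[i] == 170 and data[i + 1] == 170 and data[i + 2] == 4:
            payload = data[i + 3:i + 7]
            checksum_ok = (~sum(payload) & 0xFF) == data[i + 7]
            if payload[0] == 0x80 and payload[1] == 2 and checksum_ok:
                samples.append(int.from_bytes(payload[2:4], "big", signed=True))
                i += 8
                continue
        i += 1  # not a valid packet start; resynchronise on the next byte
    return samples


# Two hand-built packets (samples 100 and -100) surrounded by stray bytes.
stream = bytes([1, 2,
                170, 170, 4, 0x80, 2, 0x00, 0x64, 25,
                170, 170, 4, 0x80, 2, 0xFF, 0x9C, 226,
                170, 170])
print(parse_raw_samples(stream))
```

Working on bytes instead of strings avoids most of the string-processing cost mentioned above.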

How would you go about reverse engineering a set of binary data pulled from a device?

A friend of mine brought up this question the other day. He recently bought a Garmin heart rate monitor, which keeps track of his heart rate and allows him to upload his heart rate stats for a day to his computer.
The only problem is that there are no Linux drivers for the Garmin USB device. He's managed to interpret some of the data, such as the model number and his user details, and has identified what are essentially some binary data tables, which we assume represent a series of recordings of his heart rate and the time each recording was taken.
Where does one start when reverse engineering data when you know nothing about the structure?
I had the same problem and initially found this project at Google Code that aims to complete a cross-platform version of tools for the Garmin devices ... see: http://code.google.com/p/garmintools/. There's a link on the front page of that project to the protocols you need, which Garmin was thoughtful enough to release publicly.
And here's a direct link to the Garmin I/O specification: http://www.garmin.com/support/pdf/IOSDK.zip
I'd start looking at the data in a hexadecimal editor, hopefully a good one which knows the most common encodings (ASCII, Unicode, etc.) and then try to make sense of it out of the data you know it has stored.
As another poster mentioned, reverse engineering can be hairy, not in practice but in legality.
That being said, you may be able to find everything related to your root question by checking out this project and its code. They handle the runner's heart rate/GPS combo data as well:
http://www.gpsbabel.org/
I'd suggest you start with checking the legality of reverse engineering in your country of origin. Most countries have very strict laws about what is allowed and what isn't regarding reverse engineering devices and code.
I would start by seeing what data is being sent by the device, then consider how such data could be represented and packed.
I would first capture many samples, and see if any pattern presents itself, since heart beat is something which is regular and that would suggest it is measurement related to the heart itself. I would also look for bit fields which are monotonically increasing, as that would suggest some sort of time stamp.
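The search for monotonically increasing fields can be automated once the dump has been split into records. This sketch assumes fixed-size records and a 32-bit little-endian candidate field; both are guesses that would need to be varied when working with real data. The function name and the sample record layout are made up for illustration.

```python
import struct


def monotonic_offsets(records, width=4, fmt="<I"):
    """Return byte offsets whose values strictly increase across records."""
    size = min(len(r) for r in records)
    hits = []
    for offset in range(size - width + 1):
        values = [struct.unpack_from(fmt, r, offset)[0] for r in records]
        if all(a < b for a, b in zip(values, values[1:])):
            hits.append(offset)
    return hits


# Made-up records: a 32-bit timestamp followed by a 16-bit heart rate.
records = [struct.pack("<IH", ts, hr)
           for ts, hr in [(1000, 72), (1005, 75), (1010, 71), (1015, 74)]]
print(monotonic_offsets(records))
```

With only a few records, unrelated fields can look monotonic by chance, so any hit is a hypothesis to test against more captures, not a conclusion.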
Having formed a hypothesis for what is where, I would write a program to test it and graph the results and see if it makes sense. If it does but not quite, then closer inspection would probably reveal you need some scaling factors here or there. It is also entirely possible I need to process the data first before it looks anything like what their program is showing, i.e. might need to integrate the data points. If I get garbage, then it is back to the drawing board :-)
I would also check the manufacturer's website, or maybe run strings on their binaries. Finding someone who works in the field of biomedical engineering would also be on my list, as they would probably know what protocols are typically used, if any. I would also look for these protocols and see if any could be applied to the data I am seeing.
I'd start by creating a hex dump of the data. Figure it's probably blocked in some power-of-two-sized chunks. Start looking for repeating patterns. Think about what kind of data they're probably sending. Either they're recording each heart beat individually, or they're recording whatever the sensor is sending at fixed intervals. If it's individual beats, then there's going to be a time delta (since the last beat), a duration, and a max or avg strength of some sort. If it's fixed intervals, then it'll probably be a simple vector of readings. There'll probably be a preamble of some sort, with a start timestamp and the sampling rate. You can try decoding the timestamp yourself, or you might try simply feeding it to ctime() and see if they're using standard absolute time format.
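A hex dump and a quick timestamp check can both be done in a few lines. The sketch below assumes the candidate timestamp is a 32-bit little-endian count of seconds since the Unix epoch, which is only one of several possibilities, since devices often use their own epoch. The sample blob is made up.

```python
import struct
import time


def hex_dump(data, width=16):
    """Format bytes as offset, hex columns, and printable ASCII."""
    pad = width * 3 - 1  # width of a full row of hex columns
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{pad}}  {text}")
    return "\n".join(lines)


def try_unix_timestamp(data, offset=0):
    """Interpret 4 bytes as little-endian Unix time; return (seconds, text)."""
    (seconds,) = struct.unpack_from("<I", data, offset)
    return seconds, time.ctime(seconds)


# Made-up blob: timestamp, sample count, a tag, then three readings.
blob = struct.pack("<IH", 1_200_000_000, 3) + b"HR" + bytes([72, 75, 71])
print(hex_dump(blob))
print(try_unix_timestamp(blob))
```

If the decoded date is plausible (close to when the recording was made), the guess about the format is probably right. If it is decades off, try big-endian, a different width, or a different epoch.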
Keep in mind that lots of cheap A/D converters only produce 12-bit outputs, so your readings are unlikely to be larger than 16 bits (and the high-order 4 bits may be used for flags). I'd recommend resetting the device so that it's "blank", dumping and storing the contents, then take a set of readings, record the results (whatever the device normally reports), then dump the contents again and try to correlate the recorded results with whatever data appeared after the "blank" dump.
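The blank-versus-recorded comparison can be sketched as a byte-level diff, followed by splitting each 16-bit word into a 12-bit reading and 4 flag bits. The flag/reading split follows the guess made above and is not a known device layout. The function names and sample bytes are illustrative.

```python
def changed_ranges(before, after):
    """Return (start, end) byte ranges where two dumps differ (end exclusive)."""
    ranges = []
    start = None
    length = max(len(before), len(after))
    for i in range(length):
        a = before[i] if i < len(before) else None
        b = after[i] if i < len(after) else None
        if a != b and start is None:
            start = i
        elif a == b and start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, length))
    return ranges


def split_reading(word):
    """Split a 16-bit word into (high 4 flag bits, low 12-bit reading)."""
    return word >> 12, word & 0x0FFF


blank = bytes(8)
recorded = bytes([0, 0, 0x48, 0x10, 0, 0, 0x4B, 0x10])
print(changed_ranges(blank, recorded))
print(split_reading(0x1048))
```

Comparing the changed ranges against the values the device reported for the same session narrows down which bytes hold the readings.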
Unsure if this is what you're looking for but Garmin has created an API that runs with your browser. It seems OSX is supported, as well as Windows browsers... I would try it from Google Chromium to see if it can be used instead of this reverse engineering...
http://developer.garmin.com/web-device/garmin-communicator-plugin/
API Features
Auto-detection of devices connected to a computer
Access to device product information like product name and software version
Read tracks, routes and waypoints from supported recreational, fitness and navigation devices
Write tracks, routes and waypoints to supported recreational, fitness and navigation devices
Read fitness data from supported fitness devices
Geo-code address and save to a device as a waypoint or favorite
Read and write Garmin XML files (GPX and TCX) as well as binary files.
Support for most Garmin devices (USB, USB mass-storage, most serial devices)
Support for Internet Explorer, Firefox and Chrome on Microsoft Windows.
Support for Safari, Firefox and Chrome on Mac OS X.
Can you synthesize a heart beat using something like a computer speaker? (I have no idea how such devices actually work). Watch how the binary results change based on different inputs.
Ripping apart the device and checking out what's inside would probably help too.