What is the data format called that consists of sets of four hexadecimal characters (in 8 columns)? - terminology

What is data in the following format called:
I guess this is a string representation of some lower level data store. I'm not sure what it would be called though.

Related

Cannot recognize strange json-like data structure [duplicate]

I'm integrating two poorly documented systems, and in the process I've come across a strange data format I haven't seen before. It's stored as plain text in the db with no indication as to what the format is and how to deal with it.
a:17:{s:2:"id";s:27:"145219921F990B11C39E7220000";s:16:"purchase_country";s:2:"no";s:17:"purchase_currency";s:3:"nok";s:6:"locale";s:5:"nb-no";s:6:"status";s:17:"checkout_complete";s:9:"reference";s:27:"145212221F990B11C39E7221000";s:11:"reservation";s:10:"2348226550";s:10:"started_at";s:25:"2014-04-04T10:40:55+02:00";s:12:"completed_at";s:25:"2014-04-02T10:41:11+02:00";s:16:"last_modified_at";s:25:"2014-04-02T10:41:11+02:00";s:10:"expires_at";s:25:"2014-04-16T10:41:11+02:00";s:4:"cart";a:4:{s:25:"total_price_excluding_tax";i:489500;s:16:"total_tax_amount";i:0;s:25:"total_price_including_tax";i:489500;s:5:"items";a:2:{i:0;a:10:{s:9:"reference";s:2:"68";s:4:"name";s:21:"1.OSO SUPER S 200LIT.";s:8:"quantity";i:1;s:10:"unit_price";i:695000;s:8:"tax_rate";i:0;s:13:"discount_rate";i:0;s:4:"type";s:8:"physical";s:25:"total_price_including_tax";i:695500;s:25:"total_price_excluding_tax";i:694000;s:16:"total_tax_amount";i:0;}i:1;a:10:{s:9:"reference";s:2:"68";s:4:"name";s:32:"1.OSO SUPER S 200LIT. 
(discount)";s:8:"quantity";i:1;s:10:"unit_price";i:-205100;s:8:"tax_rate";i:0;s:13:"discount_rate";i:0;s:4:"type";s:8:"physical";s:25:"total_price_including_tax";i:-205100;s:25:"total_price_excluding_tax";i:-205100;s:16:"total_tax_amount";i:0;}}}s:8:"customer";a:1:{s:4:"type";s:6:"person";}s:16:"shipping_address";a:8:{s:10:"given_name";s:13:"Testperson-no";s:11:"family_name";s:8:"Approved";s:14:"street_address";s:18:"Sæffleberggate 56";s:11:"postal_code";s:4:"0563";s:4:"city";s:4:"OSLO";s:7:"country";s:2:"no";s:5:"email";s:32:"omitted#testdrive.klarna.com";s:5:"phone";s:11:"40 12 34 56";}s:15:"billing_address";a:8:{s:10:"given_name";s:13:"Testperson-no";s:11:"family_name";s:8:"Approved";s:14:"street_address";s:18:"Sæffleberggate 56";s:11:"postal_code";s:4:"0563";s:4:"city";s:4:"OSLO";s:7:"country";s:2:"no";s:5:"email";s:32:"checkout-no#testdrive.klarna.com";s:5:"phone";s:11:"40 12 34 56";}s:7:"options";a:1:{s:31:"allow_separate_shipping_address";b:0;}s:8:"merchant";a:5:{s:2:"id";s:4:"1601";s:9:"terms_uri";s:95:"omitted";s:12:"checkout_uri";s:59:"omitted";s:16:"confirmation_uri";s:220:"omitted";s:8:"push_uri";s:229:"omitted";}}
An entry consists of colon-separated segments:
A single char type tag (array, object, int, decimal, bool, string)
A number that says how long the value is in characters, bytes, elements (in case of arrays) or key-value pairs (in case of objs), which seems completely useless given that this is a textual format that requires me to parse the length segment anyway. This isn't present for integers and decimals.
Value of the field
Key-value pairs seem to be a flat list of an even number of elements. They also seem to be using arrays as objects as well (see example).
A ; terminator, which seems not to be necessary for objects and arrays, just to make parsing more tedious.
Now, parsing this thing is reasonably easy, but I'm constantly being surprised by new data types and their weird syntax and I'm not sure that I've covered all the edge cases with the few data samples I've analyzed. Is anyone familiar with this format?
Looks like PHP serialization. See: http://www.phpinternalsbook.com/classes_objects/serialization.html
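If pulling the data into PHP and calling unserialize() is not an option, the segments described in the question can be read with a small recursive parser. The following is a minimal Python sketch, not a complete implementation: the function name parse_php is made up for illustration, it only covers arrays, strings, ints, decimals, bools and null, and it assumes the text is UTF-8. It also shows why the length segment exists: string lengths count bytes, so a value may contain quotes or semicolons without any escaping.

```python
def parse_php(data: bytes, pos: int = 0):
    """Parse one PHP-serialized value at pos; return (value, next_pos).

    Sketch only: handles N, b, i, d, s and a. Objects (O:) and
    references are not covered.
    """
    tag = data[pos:pos + 1]
    if tag == b"N":                        # null is written as N;
        return None, pos + 2
    if tag in (b"i", b"d", b"b"):          # i:42;  d:1.5;  b:0;
        end = data.index(b";", pos)
        raw = data[pos + 2:end].decode("ascii")
        if tag == b"i":
            value = int(raw)
        elif tag == b"d":
            value = float(raw)
        else:
            value = raw == "1"
        return value, end + 1
    if tag == b"s":                        # s:<byte length>:"<bytes>";
        colon = data.index(b":", pos + 2)
        length = int(data[pos + 2:colon])
        start = colon + 2                  # skip the colon and opening quote
        value = data[start:start + length].decode("utf-8")
        return value, start + length + 2   # skip the closing quote and ;
    if tag == b"a":                        # a:<pair count>:{<key><value>...}
        colon = data.index(b":", pos + 2)
        count = int(data[pos + 2:colon])
        pos = colon + 2                    # skip the colon and {
        result = {}
        for _ in range(count):
            key, pos = parse_php(data, pos)
            value, pos = parse_php(data, pos)
            result[key] = value
        return result, pos + 1             # skip the closing }
    raise ValueError(f"unsupported type tag {tag!r} at offset {pos}")


# Usage: PHP arrays are ordered maps, so they come back as dicts here.
order, _ = parse_php(b'a:2:{s:2:"id";i:7;s:4:"cart";a:1:{i:0;b:1;}}')
print(order)
```

Note that "Sæffleberggate 56" in the sample is 17 characters but is stored as s:18, which is consistent with the length being a byte count.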

Convention to represent a string in .properties

If I have two properties:
foo=1
bar=2345
Is there a way to specify that foo is a number and bar is a string?
I assume bar="2345" would do, but I wonder if there's a widely accepted convention.
A properties file is a text file which contains data in some standard format, which can be read by the application using it. It is mostly used for configuration of the application and also for internationalization.
As per the wiki document https://en.wikipedia.org/wiki/.properties
Each parameter is stored as a pair of strings, one storing the name of
the parameter (called the key), and the other storing the value.
There is no way to specify or force the value to be a number or a string; the value is always a string. Interpreting it is the job of the framework or application that reads the properties file and tries to parse the values. If it fails to parse a value as a specific type (such as a number), it may fall back to some default value or simply terminate the program.
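To illustrate the point that typing happens in the reading application, here is a small Python sketch. The helper names load_properties and get_int are hypothetical, and the reader is deliberately simplified: it only handles key=value lines and comments, not the escapes, line continuations or alternative separators that real .properties files allow.

```python
def load_properties(text: str) -> dict:
    """Read simple key=value lines; every value stays a string."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):   # blank line or comment
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def get_int(props: dict, key: str, default: int) -> int:
    """The application decides the type; fall back to a default on failure."""
    try:
        return int(props[key])
    except (KeyError, ValueError):
        return default


props = load_properties("foo=1\nbar=2345\n")
foo = get_int(props, "foo", 0)   # parsed as a number because the caller asks for one
bar = props["bar"]               # left as the string "2345"
```

With a reader like this, writing bar="2345" would not mark the value as a string; the quotes would simply become part of the value.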

Viewstate: 2 different formats?

Trying to scrape a webpage, I hit the necessity to work with ASP.NET's __VIEWSTATE variables. So, ever the optimist, I decided to read up on those variables, and their formats. Even though classified as Open Source by Microsoft, I couldn't find any formal definition:
Everybody agrees the first step to do is decode the string, using a Base64 decoder. Great - that works...
Next - and this is where the confusion sets in:
Roughly 3/4 of the decoders seem to use binary values (characters whose values indicate the type of the field that follows). Here's an example of such a specification. This format also seems to expect a 'signature' of 0xFF 0x01 as the first two bytes.
The rest of the articles (such as this one) describe a format where the fields in the format are separated (or marked) by t< ... >, p< ... >, etc. (this seems to be the case of the page I'm interested in).
Even after looking at over a hundred pages, I didn't find any mention of the existence of two formats.
My questions are: Are there two different formats of __VIEWSTATE variables in use, or am I missing something basic? Is there any formal description of the __VIEWSTATE contents somewhere?
The view state is serialized and deserialized by the
System.Web.UI.LosFormatter class—the LOS stands for limited object
serialization—and is designed to efficiently serialize certain types
of objects into a base-64 encoded string. The LosFormatter can
serialize any type of object that can be serialized by the
BinaryFormatter class, but is built to efficiently serialize objects
of the following types:
Strings
Integers
Booleans
Arrays
ArrayLists
Hashtables
Pairs
Triplets
Everything you need to know about ViewState: Understanding View State
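A quick way to see which of the two layouts described in the question a given page uses is to Base64-decode the value and look at the first two bytes. This is a rough Python sketch based only on the markers mentioned above (the 0xFF 0x01 signature, and the t< and p< prefixes); the function name and the returned labels are made up for illustration, and an encrypted view state would match neither pattern.

```python
import base64


def guess_viewstate_format(viewstate_b64: str) -> str:
    """Classify a __VIEWSTATE value by its first decoded bytes (heuristic)."""
    raw = base64.b64decode(viewstate_b64)
    if raw[:2] == b"\xff\x01":          # signature expected by the binary layout
        return "binary"
    if raw[:2] in (b"t<", b"p<"):       # markers used by the text layout
        return "text"
    return "unknown"
```

Usage would be guess_viewstate_format(value), where value is the content of the hidden __VIEWSTATE form field as scraped from the page.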

Is data returned from a MySQL Connector/C query not in native C data format?

If I execute a query against the MySQL Connector/C library the data I'm getting back all appears to be in straight char * format, including numerical data types.
For example, if I execute a query that returns 4 columns, all of which are INTEGER in MySQL, rather than getting back 4 bytes worth of data (each byte representing a single column row value), I'm actually getting back 4 ASCII encoded character bytes, where 1 is actually a byte with the numeric value 49 in it (ASCII for 1).
Is this accurate or am I just missing something completely?
Do I really need to then atoi that returned byte into an int in my code or is there a mechanism to get the native C data types out of the MySQL client directly?
I guess my real question is: is the mysql_store_result structure converting that data to ASCII encoded representations in a way that can be bypassed by my application code?
I believe the data is sent on the wire as text in the MySQL protocol (I just confirmed this with Wireshark). So that means mysql_store_result() is not converting the data, it's just simply passing the data on as it was received. MySQL actually sends integers as text. I agree this always seemed like an odd design to me as well.
MySQL originally only offered the Text Protocol that you are currently using, in which (as you note) results are encoded as strings. MySQL v4.1 (released in April 2003) introduced the Prepared Statement protocol, which (amongst other things) transmits results in a binary format.
See C API Prepared Statements for more information on how to use the latter protocol with Connector/C.
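The difference between the two encodings can be shown without MySQL at all. The Python sketch below is only an illustration of text versus binary integers: the function names are hypothetical, and the 4-byte little-endian layout for the binary case is an assumption made for the example, not something taken from the protocol documentation.

```python
import struct


def decode_text_int(field: bytes) -> int:
    """Text encoding: the digits arrive as ASCII, so b"1" is the byte 49."""
    return int(field.decode("ascii"))      # the equivalent of atoi() in C


def decode_binary_int32(field: bytes) -> int:
    """Binary encoding (assumed here: 4 bytes, little-endian, signed)."""
    return struct.unpack("<i", field)[0]


text_field = b"1"                       # what the question observes for an INTEGER column
binary_field = struct.pack("<i", 1)     # the fixed-width form the asker expected
```

In the text case the field length varies with the number of digits, which is why the client hands back char * values together with a length.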

CSV format for OpenCV machine learning algorithms

Machine learning algorithms in OpenCV appear to use data read in CSV format. See for example this cpp file. The data is read into an OpenCV machine learning class CvMLData using the following code:
CvMLData data;
data.read_csv( filename );
However, there does not appear to be any readily available documentation on the required format for the csv file. Does anyone know how the csv file should be arranged?
Other (non-OpenCV) programs tend to have a line per training example, and begin with an integer or string indicating the class label.
If I read the source for that class, particularly the str_to_flt_elem function, and the class documentation I conclude that valid formats for individual items in the file are:
Anything that can be parsed to a double by strtod
A question mark (?) or the empty string to represent missing values
Any string that doesn't parse to a double.
Items 1 and 2 are only valid for features. Anything matched by item 3 is assumed to be a class label, and as far as I can deduce the order of the items doesn't matter. The read_csv function automatically assigns each column in the csv file the correct type, and (if you want) you can override the labels with set_response_index. Delimiter-wise, you can use the default (,) or set it to whatever you like with set_delimiter before calling read_csv (as long as you don't use the decimal point).
So this should work, for example, for 6 data points in 3 classes with 3 features per point:
A,1.2,3.2e-2,+4.1
A,3.2,?,3.1
B,4.2,,+0.2
B,4.3,2.0e3,.1
C,2.3,-2.1e+3,-.1
C,9.3,-9e2,10.4
You can move your text label to any column you want, or even have multiple text labels.
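The three rules above can be checked against a file before handing it to OpenCV. This is a rough Python sketch of the classification as I understand it, not OpenCV's actual code: classify_cell and classify_row are made-up names, and Python's float() is used as a stand-in for strtod, which it matches for ordinary decimal and exponent notation but not in every corner case.

```python
import csv
import io


def classify_cell(cell: str) -> str:
    """Apply the three rules: missing value, numeric feature, or class label."""
    cell = cell.strip()
    if cell in ("", "?"):
        return "missing"
    try:
        float(cell)                 # stand-in for strtod (assumption)
        return "numeric"
    except ValueError:
        return "label"


def classify_row(line: str, delimiter: str = ",") -> list:
    """Split one CSV line and classify each item."""
    row = next(csv.reader(io.StringIO(line), delimiter=delimiter))
    return [classify_cell(cell) for cell in row]


for line in ["A,1.2,3.2e-2,+4.1", "A,3.2,?,3.1", "B,4.2,,+0.2"]:
    print(classify_row(line))
```

A column that mixes "numeric" and "label" cells across rows is worth inspecting by hand before training.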