QR code limits - JSON

I have to generate codes with custom fields: the id of the field + the name of the field + the values of the field.
How much data can I encode inside the QR code? I need to know how many fields/values I can insert.
Should I use XML, JSON, or CSV? Which is the most generic and efficient?

XML / JSON will not qualify for a QR code's alphanumeric mode, since that mode only covers digits, upper-case letters, and a handful of symbols; lower-case letters and JSON punctuation force byte mode. The maximum in byte mode is 2,953 bytes (a version 40 code at the lowest error-correction level), but the practical limit is far less -- perhaps a few hundred characters.
It is far better to encode a hyperlink to data if you can.
As Terence says, no reader will do anything with XML/JSON except show it. You need a custom reader anyway to do something useful with that data. (Which suggests this is not a good use case for QR codes.) But if you're making your own reader, you can use gzip compression to make the payload much smaller. Your reader would know to unzip it.
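If you do go down that route, here is a rough Python sketch (the field layout is invented for the example, and the 2,953-byte figure is the byte-mode ceiling mentioned above):

import gzip
import json

# Invented example fields: id, name, and values for each custom field.
fields = [{"id": 1, "name": "colour", "values": ["red", "green", "blue"]},
          {"id": 2, "name": "size", "values": ["S", "M", "L"]}]

payload = gzip.compress(json.dumps(fields, separators=(",", ":")).encode("utf-8"))
print(len(payload), "bytes")                 # must stay well under the 2,953-byte ceiling
print(json.loads(gzip.decompress(payload)))  # what your custom reader would do

Keep in mind that many QR libraries expect text rather than raw bytes, so you may need to Base64-encode the compressed payload, which adds roughly a third to its size.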
You might get away with something workable, but this is not a good approach in general.

The maximum number of alphanumeric characters you can have is 4,296, although this requires the lowest level of error correction and produces a code that is very hard to scan.
JSON is generally more compact than XML for the same data.
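A quick way to see the difference (the record layout and the XML shape are just an invented example):

import json

record = {"id": 7, "name": "colour", "values": ["red", "green", "blue"]}

as_json = json.dumps(record, separators=(",", ":"))
as_xml = ('<field id="7"><name>colour</name>'
          '<values><v>red</v><v>green</v><v>blue</v></values></field>')

# The JSON form is noticeably shorter than an equivalent XML encoding.
print(len(as_json), len(as_xml))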
However, you will need to write your own app to scan the code - I don't know of any which will process raw JSON or XML. All the scanners will show you the text, though.

Convert comma to dot in Python or MySQL

I have a Python script which collects data and sends it to my MySQL table.
I noticed that the "Cost" is sometimes 0,95, which ends up as 0 in my table since the column expects "0.95" rather than "0,95".
I assume the best solution is to convert the , to . in my Python script by using:
variable.replace(",", ".")
However, couldn't one solution be to change the format in my MySQL table, so that I store numbers in this format:
1100
0,95
0,1
150000
My Django Model
cost = models.DecimalField(max_digits=10, decimal_places=4, default=None)
Any feedback on how to best solve this issue?
Thanks
Your first instinct is correct: convert the "unusual" (comma-decimal) input into the standard format that MySQL uses by default (dot-decimal) at the first point where you receive it.
There are lots of ways to write numbers
Be careful, though, that you don't get stung by people using commas as thousands separators, like "3,203,907.23", the European form "3.203.907,23", the Swiss "3'203'907,23", or even this form, which is widely used in India: "32,03,907.71" (yes, I did mean to type only two digits there!).
To make your life easier, the rule for currencies is relatively simple:
where a dot or comma is followed by only two digits at the end of the string, that character is acting as the decimal separator.
Once you know which is the decimal separator, you can safely remove all other non-digits from the string, change the decimal separator you found to . then use any standard library string-to-number conversion.
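A minimal Python sketch of that rule (the function name and regexes are just for illustration, and it ignores details like negative amounts):

import re
from decimal import Decimal

def parse_amount(text):
    text = text.strip()
    # A dot or comma followed by exactly two digits at the end marks the decimal separator.
    match = re.search(r"[.,](\d{2})$", text)
    if match:
        integer_part = re.sub(r"\D", "", text[:match.start()])  # drop grouping characters
        return Decimal(integer_part + "." + match.group(1))
    # No decimal separator: strip grouping characters and treat it as a whole number.
    return Decimal(re.sub(r"\D", "", text) or "0")

for s in ["3,203,907.23", "3.203.907,23", "3'203'907,23", "32,03,907.71", "1100", "0,95"]:
    print(s, "->", parse_amount(s))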
Storage format isn't presentation format
Yes, you can tell MySQL to use comma as its decimal separator, but doing that will break so much of your code - including the parts of the framework that read from the database and expect dot-decimal numbers - that you'll regret doing it that way very quickly...
There's a general principle at work here: you should do your data storage and processing using a format that is easy to process, interchangeable with other systems, and understood by other software developers.
Consider what happens if you need to allow a different framework to access your MySQL database to generate reports... whoever develops that software (and it may be you) will be glad that the numbers are all stored the way numbers are "always" stored in databases.
Convert on the way in, re-convert on the way out
Where you need to accept input in a different format, convert that input into your standardised format as early as possible.
When you need to use an output format, do the conversion to that format as late as possible.
The idea is to keep as much of your system "unexceptional" as possible. A programmer who has to remember what numeric format will be in force at the time a given method is called is not a happy programmer.
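On the output side, a trivial sketch, assuming all you need is a comma-decimal display string (for real localisation you would lean on Django's locale-aware formatting instead):

from decimal import Decimal

def display_cost(value):
    # Render a stored dot-decimal value with a comma for display only.
    return str(value).replace(".", ",")

print(display_cost(Decimal("0.95")))  # -> 0,95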
P.S.
The option you're talking about in MySQL is an example of this pattern: it doesn't change how numeric data is stored. All that changes is how you pass numbers to MySQL and how it presents them back to you.

strategy for returning JSON from XML canonical model

I am using the envelope pattern, and my canonical model part is in XML format. I usually return the model in full or in a summary version. Retrieval of documents is pretty quick, but when returning the result as part of my REST call, where I need to return JSON to the browser, the json:transform-to-json call takes double the time of the call that just returns the XML.
Is it a reasonable strategy to also keep the canonical model in JSON format in the envelope, or to keep pre-rendered JSON in full and summary versions in other documents outside of the envelope, which don't get searched but are mainly used when returning results? That way I don't have to incur the hit of transforming the canonical model to JSON all the time.
Are there any other ways that this has been done?
Conversion from XML to JSON should be relatively light, but the mere fact that it has to do anything at all adds overhead. Doing that work upfront will definitely save time. You can put both formats in the same envelope (though the JSON will have to be stored as a string then), or in a different document as you suggest. You could also store it in document-properties; unfortunately, that only takes XML as well, so you would be storing your JSON as a string in there too.
Alternatively, have you profiled the transform to see if there is a particular reason why it slows down so much? Using XSLT versus XQuery for the transform could make a difference too.
HTH!
json:transform-to-json has three algorithms optimized for different purposes, with different trade-offs of flexibility, fidelity, and performance:
"basic" (the default) - useful only to reverse json:transform-from-json()
"full" - preserves as much information fidelity as possible, in exchange for a non-'pretty' format in many cases
"custom" - is ... custom ... designed for cases where the JSON format is fixed or where you want control over the JSON output, at the expense of accurately handling only a subset of XML
Basic and full are the most efficient. However, all variants are fairly involved and require completely traversing the XML node tree and building a JSON object tree bottom-up. In ML version 8 this is then translated into the native JSON node structure. In a REST call it would then be serialized as text.
Compared to a direct return of an XML document via fn:doc("file.xml"), there are at least two orders of magnitude more operations involved in the transform case.
For small documents in a REST call that is still a small fraction of the total request time, especially if the REST call performs a complex operation itself and then returns a small result. Your use case seems to be the opposite - returning an XML document directly bypasses almost all of the XQuery processing and is sent straight from internal storage to the output, or assigned to a variable.
If that is an important use case to optimize, especially if the documents can be large, then saving them as text or binary will be much faster -- at the expense of more storage used. If this is only a variant representation of the XML, try storing the JSON text as binary, as it will not incur any indexing overhead.
Otherwise, if you need to query over the JSON: in ML7, storing it as text gives you simple word queries; in ML8, storing it as native JSON gives you structured queries -- both with efficient text serialization.

Is there a Go Language equivalent to Perls' Dumper() method in Data::Dumper?

I've looked at the very similarly titled post (Is there a C equivalent to Perls' Dumper() method in Data::Dumper?), regarding a C equivalent to Data::Dumper::Dumper(). I have a similar question for the Go language.
I'm a Perl zealot by trade and a programming hobbyist, and I make use of Data::Dumper and similar offspring literally hundreds of times a day. I've taken up learning Go because it looks like a fun and interesting language, something that will get me out of the Perl rut I'm in while opening my eyes to new ways of doing stuffz... One of the things I really want is something like:
fmt.Println(dump.Dumper(decoded_json))
to see the resulting data structure, like Data::Dumper would turn the JSON into an Array of Hashes. Seeing this in Go, will help me to understand how to construct and work with the data. Something like this would be considered a major lightbulb moment in my learning of Go.
Contrary to the statements made in the C counterpart post, I believe we can write this, and since I'll be passing Dumper to Println, whatever JSON string or XML page I pass in and decode, I should be able to see the result of the decoding in a Dumper-like state... So, does anyone know of anything like this that exists? Or maybe some pointers to getting something like this done?
Hi, and welcome to Go - I'm a former Perl hacker myself.
As to your question, the encoding/json package is probably the closest you will find to a Go data pretty-printer. I'm not sure you really need it, though. One of the reasons Data::Dumper was awesome in Perl is that many times you really didn't know the structure of the data you were consuming without visually inspecting it. With Go, though, everything is a specific type, and every specific type has a specific structure. If you want to know what the data will look like, you probably just need to look at its definition.
Some other tools you should look at include:
fmt.Println("%#v", data) will print the data in go-syntax form.
fmt.Println("%T", data) will print the data's type in go-syntax
form.
More fmt format string options are documented here: http://golang.org/pkg/fmt/
I found a couple packages to help visualize data in Go.
My personal favourite - https://github.com/davecgh/go-spew
There's also - https://github.com/tonnerre/golang-pretty
I'm not familiar with Perl and Dumper, but from what I understand of your post and the related C post (and the very name of the function!), it outputs the content of the data structure.
You can do this using the %v verb of the fmt package. I assume your JSON data is decoded into a struct or a map. Using fmt.Printf("%v", json_obj) will output the values, while %+v will add field names (for a struct - no difference if it's a map, where %v already outputs both keys and values), and %#v will output type information too.

compact binary representation of json [closed]

Are there any compact binary representations of JSON out there? I know there is BSON, but even that webpage says "in many cases is not much more efficient than JSON. In some cases BSON uses even more space than JSON".
I'm looking for a format that's as compact as possible, preferably some kind of open standard?
You could take a look at the Universal Binary JSON specification. It won't be as compact as Smile, because it doesn't do name references, but it is 100% compatible with JSON (whereas BSON and BJSON define data structures that don't exist in JSON, so there is no standard conversion to/from them).
It is also (intentionally) criminally simple to read and write with a standard format of:
[type, 1-byte char]([length, 4-byte int32])([data])
So simple data types begin with an ASCII marker code like 'I' for a 32-bit int, 'T' for true, 'Z' for null, 'S' for string and so on.
The format is by design engineered to be fast-to-read as all data structures are prefixed with their size so there is no scanning for null-terminated sequences.
For example, a string might be demarcated like this (the []-chars are just for illustration purposes; they are not written in the format):
[S][512][this is a really long 512-byte UTF-8 string....]
You would see the 'S', switch on it to processing a string, see the 4-byte integer that follows it of "512" and know that you can just grab in one chunk the next 512 bytes and decode them back to a string.
Similarly, numeric values are written out without a length value to be more compact, because their type (byte, int32, int64, double) already defines their length in bytes (1, 4, 8 and 8 respectively). There is also support for arbitrarily long numbers that is extremely portable, even on platforms that don't support them natively.
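A tiny Python sketch of reading and writing one string record in the simplified layout described above (the helper names are invented, and the real spec has more detail than this):

import struct

def write_string_record(text):
    # 'S' marker, 4-byte big-endian length, then the UTF-8 bytes.
    data = text.encode("utf-8")
    return b"S" + struct.pack(">i", len(data)) + data

def read_string_record(buf):
    # Read back one record produced by write_string_record.
    assert buf[0:1] == b"S", "expected a string marker"
    (length,) = struct.unpack(">i", buf[1:5])
    return buf[5:5 + length].decode("utf-8")

print(read_string_record(write_string_record("this is a really long UTF-8 string....")))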
On average you should see a size reduction of roughly 30% with a well balanced JSON object (lots of mixed types). If you want to know exactly how certain structures compress or don't compress you can check the Size Requirements section to get an idea.
On the bright side, regardless of compression, the data will be written in a more optimized format and be faster to work with.
I checked the core InputStream/OutputStream implementations for reading/writing the format into GitHub today; I'll check in the general reflection-based object mapping later this week.
You can just look at those two classes to see how to read and write the format, I think the core logic is something like 20 lines of code. The classes are longer because of abstractions to the methods and some structuring around checking the marker bytes to make sure the data file is a valid format; things like that.
If you have really specific questions like the endianness (Big) of the spec or numeric format for doubles (IEEE 754) all of that is covered in the spec doc or just ask me.
Hope that helps!
Yes: the Smile data format (see the Wikipedia entry). It has a public Java implementation, and a C version is in the works at GitHub (libsmile). It has the benefit of being (reliably) more compact than JSON while keeping a 100% compatible logical data model, so it is easy to convert back and forth with textual JSON.
For performance, you can look at the jvm-serializers benchmark, where Smile competes well with other binary formats (Thrift, Avro, protobuf); size-wise it is not the most compact (since it does retain field names), but it does much better with data streams where names are repeated.
It is used by projects like Elasticsearch and Solr (optionally), and Protostuff-rpc supports it, although it is not as widely used as, say, Thrift or protobuf.
EDIT (Dec 2011) -- there are now also libsmile bindings for PHP, Ruby and Python, so language support is improving. In addition, there are measurements on data size; and although for single-record data the alternatives (Avro, protobuf) are more compact, for data streams Smile is often more compact thanks to its key and String-value back-reference option.
gzipping JSON data is going to get you good compression ratios with very little effort because of its universal support. Also, if you're in a browser environment, you may end up paying a greater byte cost in the size of the dependency from a new library than you would in actual payload savings.
If your data has additional constraints (such as lots of redundant field values), you may be able to optimize by looking at a different serialization protocol rather than sticking to JSON. Example: a column-based serialization such as Avro's upcoming columnar store may get you better ratios (for on-disk storage). If your payloads contain lots of constant values (such as columns that represent enums), a dictionary compression approach may be useful too.
Another alternative that should be considered these days is CBOR (RFC 7049), which has an explicitly JSON-compatible model with a lot of flexibility. It is both stable and meets your open-standard qualification, and has obviously had a lot of thought put into it.
Have you tried BJSON?
Try using js-inflate to make and unmake blobs:
https://github.com/augustl/js-inflate
It works well, and I use it a lot.
You might also want to take a look at a library I wrote. It's called minijson, and it was designed for this very purpose.
It's Python:
https://github.com/Dronehub/minijson
Have you tried AVRO? Apache Avro
https://avro.apache.org/

html special characters into outfile.txt

I've got a database where, for efficiency, I've put the data into the DB in HTML-encoded form.
I do maintenance on the data, and then move it into production via an 'into outfile', so it ends up in a text file.
The special characters don't make it across cleanly, and they come out as messed-up text.
Is there a way to maintain the format for the txt file?
Or should I be using another format?
I find the 'outfile' and 'import' approach very efficient for doing a bulk transfer.
If I can't use that, any suggestions on the best way to find special characters in MySQL?
The only thing I've found seems to match only fields that contain no alphanumeric characters at all:
SELECT * FROM tableName WHERE NOT columnToCheck REGEXP '[A-Za-z0-9]';
Is there a reason you're storing HTML-encoded text in the database? As discussed in episode 58 of the Stack Overflow podcast, you should always try to store raw data at the highest level of precision possible.
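If you decide to convert the existing rows back to raw text, Python's standard library can undo the HTML encoding for you (the sample string is just an illustration):

import html

encoded = "Caf&eacute; &amp; Bar &pound;5.99"
raw = html.unescape(encoded)   # decode the HTML entities back to raw characters
print(raw)                     # Café & Bar £5.99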