What is the most mature JSON library for Erlang?

I wanted to use YAML, but there isn't a single mature YAML library for Erlang. I know there are a few JSON libraries, but I was wondering which is the most mature?

Have a look at the one from mochiweb: mochijson.erl
1> mochijson:decode("{\"Name\":\"Tom\",\"Age\":10}").
{struct,[{"Name","Tom"},{"Age",10}]}

I prefer Jiffy. It works with binaries and is really fast.
1> jiffy:decode(<<"{\"Name\":\"Tom\",\"Age\":10}">>).
{[{<<"Name">>,<<"Tom">>},{<<"Age">>,10}]}
Can encode as well:
2> jiffy:encode({[{<<"Name">>,<<"Tom">>},{<<"Age">>,10}]}).
<<"{\"Name\":\"Tom\",\"Age\":10}">>

Also check out jsx. "An erlang application for consuming, producing and manipulating json. Inspired by Yajl." I haven't tried it myself yet, but it looks promising.
As a side note: I found this library through Jesse, a JSON schema validator by Klarna.

I use the JSON library provided by Yaws.
Edit: I actually switched over to Jiffy, see Konstantin's answer.

Trapexit offers a really cool search feature for Erlang projects.
Look up JSON there and you'll find about 13 results. Check the dates of the latest revisions, the user ratings, and the project activity status.
UPDATE: I've just found a similar question on Stack Overflow. Apparently, they are quite happy with the erlang-json-eep-parser parser.

My favourite is mochijson2. The API is straightforward, it's fast enough for me (I never actually bothered to benchmark it, to be honest; I'm mostly encoding and decoding small packets), and I've been using it in a stable "production server" for about a year now. Just remember to install mochinum as well; mochijson2 uses it to encode large numbers, and if it's missing when you try to encode one, you'll get an exception.
See also: mochijson2 examples (stackoverflow)

Related

CSV to JSON benchmarks

I'm working on a project that uses parallel methods to convert text from one form to another. We're going to implement a CSV to JSON converter to demonstrate the speedups that are possible using our parallel framework.
We want to benchmark our converter once it's finished. What are the fastest libraries/stand-alone programs/etc. out there that are capable of doing CSV-to-JSON conversion? I found a list of potential candidates here: Large CSV to JSON/Object in Node.js, but I'm not sure how fast the listed options are. In the worst case I'll benchmark them myself, but if someone already knows what the "best in class" converters are, it'd save me some time.
Looks like the maintainer of csvtojson has developed a benchmark application. I think I can add my CSV-to-JSON converter to his benchmark project to test it.
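In the meantime, if you want a trivial single-threaded baseline to calibrate the benchmark against, a short Python sketch using only the standard library's csv and json modules will do (the file names are just placeholders):
import csv
import json

def csv_to_json(csv_path, json_path):
    # Read every row as a dict keyed by the header row.
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    # Write the whole list out as a single JSON array.
    with open(json_path, "w") as f:
        json.dump(rows, f)

csv_to_json("input.csv", "output.json")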
If your project can target in-browser apps, I suggest csvtojson, as it is by far the fastest converter on the market as of 2017.
I created it myself, so I may be a bit biased, but I developed it specifically for a bigger project that required heavy CSV-to-JSON crunching.
Tell me if it serves you well.

Is there anything wrong with the YAML format that keeps it from joining the web standards?

Well, I think YAML is really fantastic...
It's beautiful and easy to read, with a clever syntax compared to any other data serialization format.
As a superset of JSON, you could say it's more elaborate, a further evolution of the format.
But I see some contrary opinions out there, such as:
YAML is dead,
don't use YAML, and so on...
I simply can't understand what this is based on, because it seems so nice :)
If we take a few successful examples from the web, such as Ruby on Rails, we know they use YAML for simple configuration, but one thing that makes me curious is why YAML is not among the most-used formats on the web, like XML and JSON.
Take Twitter, for example: why not offer the data in YAML format from the API as well?
Is there something wrong with doing that?
We can see the rise of NoSQL databases like CouchDB and MongoDB, all JSON-based, and even a neat project called jsondb, which looks very lightweight and can definitely do the job.
But when writing data structures in JSON, I really can't understand why YAML isn't used instead.
So one of my concerns is: is there something wrong with YAML?
People may say it's complex, but if you intend to use only the features you would get in JSON, it's definitely not. You will get a more beautiful file for sure, with no hassle. It would indeed be more complex if you decided to use more of its features, but that's how things are; at least you have the option to use them if you want to.
The possibility of choosing whether or not to use double quotes for strings is fantastic and makes everything cleaner and easier to read... well, you see my point :)
So my question would be: why isn't YAML widely used in place of JSON?
Why doesn't it seem that it will be used for transferring data structures within the online community?
All I can see is people using it for simple configuration files and nothing else...
Please bear with me, since I might be completely wrong; very big projects might be happening that my ignorance of the subject hasn't let me be part of :)
If there is any big project based on YAML out there, I would be very happy to know about it.
Thanks in advance
It's not that there's something wrong with YAML; it's just that it doesn't offer any compelling benefits in many cases. YAML is basically a superset of JSON. For most purposes, JSON is quite sufficient (people wouldn't be using advanced YAML features even if they had a full YAML parser), and its close ties to JavaScript make it fit in well with the technologies that Web developers are using anyway.
TLDR: People are already using as much YAML as they need. In most cases, that's JSON.
YAML uses more data than non-prettified JSON. It's great for files that humans might want to edit themselves but when all you're doing is passing data around, you're wasting bandwidth if you're using YAML.
If you need an explanation: each space in UTF-16 is two bytes. YAML uses spaces for indentation, and newline characters for nesting.
Take this example:
foo:
    bar:
        - foo
        - bar
This requires 41 characters (including the newlines). The equivalent JSON would be only 29 characters:
{"foo":{"bar":["foo","bar"]}}
Then just imagine what happens if you URL-encode the YAML. It becomes 95 characters:
foo%3A%0A%20%20%20%20bar%3A%0A%20%20%20%20%20%20%20%20-%20foo%0A%20%20%20%20%20%20%20%20-%20bar
Meanwhile the JSON becomes just 63 characters:
%7B%22foo%22%3A%7B%22bar%22%3A%5B%22foo%22%2C%22bar%22%5D%7D%7D
After URL-encoding, the YAML version in the example above is roughly 50% larger than the JSON version, and you can imagine that the longer your YAML file gets, the more this difference will grow.
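You can check these counts yourself; here's a quick sketch in Python, using urllib.parse.quote to URL-encode both documents:
from urllib.parse import quote

yaml_doc = "foo:\n    bar:\n        - foo\n        - bar"
json_doc = '{"foo":{"bar":["foo","bar"]}}'

for label, doc in (("YAML", yaml_doc), ("JSON", json_doc)):
    # Length before and after URL-encoding.
    print(label, len(doc), len(quote(doc)))
# Prints: YAML 41 95
#         JSON 29 63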
Oh, and one other reason not to use YAML: stackoverflow.com does not support YAML syntax highlighting! (Of course, I would argue that YAML is so beautiful that it doesn't need syntax highlighting. That's kind of the point of YAML, I think.)
In Ruby, many people argue that configuration should be Ruby rather than YAML. This saves the parsing stage, means you don't have to learn a new syntax, and keeps you from ending up with ERB tags everywhere when you dynamically generate YAML content (as in Rails fixtures).
Personally I have to agree, and can't see what YAML would offer to network transfers that would make it a worthwhile consideration over JSON.
YAML has a number of problems; there is a good article about them:
YAML: probably not so great after all.
Short summary (in addition to problems already listed in other answers):
Unreadable except for simple and short things
Insecure by default
Has portability issues
Very complex, with a number of surprising behaviors
I considered using YAML a few times and never did. The reason always had to do with whitespace used for indentation. While I personally love it, even to me it sounded like asking for trouble, because:
Someone will surely make a mistake, not expecting that changing whitespace will break the file. Sometimes a person who has no idea about the language / format has to go into the file to change one number or string.
You can't guarantee that everybody everywhere will have their comparison / merging / source-control software configured properly to catch whitespace or empty-line differences.

How can I marshal JSON to/from a POJO for BlackBerry Java?

I'm writing a RIM BlackBerry client app. BlackBerry uses a simplified version of Java (no generics, no annotations, limited collections support, etc.; roughly a Java 1.3 dialect). My client will be speaking JSON to a server. We have a bunch of JAXB-generated POJOs, but they're heavily annotated, and they use various classes that aren't available on this platform (ArrayList, BigDecimal, XMLGregorianCalendar). We also have the XSD used by the JAXB-XJC compiler to generate those source files.
Being the lazy programmer that I am, I'd really rather not manually translate the existing source files to Java 1.3-compatible JSON-marshalling classes. I already tried JAXB 1.0.6 xjc. Unfortunately, it doesn't understand the XSD file well enough to emit proper classes.
Do you know of a tool that will take JAXB 2.0 XSD files and emit Java 1.3 classes? And do you know of a JSON marshalling library that works with old Java?
I think I am doomed because JSON arrived around 2006, and Java 5 was released in late 2004, meaning that people probably wouldn't be writing JSON-parsing code for old versions of Java.
However, it seems that there must be good JSON libraries for J2ME, which is why I'm holding out hope.
For the first part, good luck, but I really don't think you're going to find a better solution than modifying the code yourself. However, there is a good J2ME JSON library; you can find a link to the mirror here.
I ended up using apt (the annotation processing tool) to run over the 1.5 sources and emit new 1.3-friendly source. It actually turned out to be a pretty nice solution!
I still haven't figured out an elegant way to do the actual JSON marshalling, but the apt tool can probably help write the rote code that interfaces with a JSON library like the one Jonathan pointed out.

How can I analyze a closed format (e.g. doc or vce)?

I want to study the .vce format. It's a binary format and it seems more complicated than simple object serialization. Is there any tool or technique for analyzing a binary format?
You might need to "Reverse-Code-Engineer" a program that uses this file format (http://www.openrce.org/). The tools used for this kind of analysis are: your brain, a disassembler (IDA Pro, for example) and a debugger (OllyDbg, for example). But beware: the road to successfully reverse engineering a file format is veeeeeerrry hard.
And reversing an application might be illegal depending on where you live!
You'll have to get a library that can read the format (or create one yourself).
Here are some of the Microsoft Office binary format specifications.
I believe it would only be possible through some nasty reverse engineering. It would be very useful to have access to an application that uses the format in question, so that you can generate a few simple files and compare them in a hex editor. You won't get far with this method alone, but you might be able to figure out the header.
It would also be useful to study common binary format mechanisms, such as encryption and compression. If you're talking about the Visual CertExam file format, then it is likely that the useful data will be strongly encrypted.
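If you'd rather script that comparison than eyeball it in a hex editor, a short Python sketch along these lines can list the byte offsets at which two generated files differ:
import sys
from itertools import zip_longest

def diff_bytes(path_a, path_b):
    # Load both files fully; fine for the small probe files suggested above.
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    for offset, (x, y) in enumerate(zip_longest(a, b)):
        if x != y:
            # None means one file ended before the other.
            print(f"0x{offset:08x}: {x} != {y}")

diff_bytes(sys.argv[1], sys.argv[2])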
My 2 cents:
Start by reversing the applications that read the files themselves. Android applications in particular are helpful, as the resulting Java source is easier to read (you might want to try the A+ VCE reader for Android, for example). That program indicates that VCE uses/embeds SQLite in the file (in line with what is hinted here: Reverse Engineer a File Format).
Where to go from here? You might want to explore SQLite file-carving tools to see if there is a way to programmatically identify the patterns in the file. Good luck!

Is there a streaming API for JSON? [closed]

Is DOM the only way to parse JSON?
Some JSON parsers do offer an incremental ("streaming") interface; for Java, at least the following parsers from the json.org page offer one:
Jackson (pull interface)
Json-simple (SAX-style push interface)
(in addition to Software Monkey's parser referred to by another answer)
Actually, it is kind of odd that so many JSON parsers do NOT offer this simple low-level interface; after all, they already need to implement low-level parsing, so why not expose it?
EDIT (June 2011): Gson too has its own streaming API (with gson 1.6)
By DOM, I assume you mean that the parser reads an entire document at once before you can work with it. Note that saying DOM tends to imply XML, these days, but IMO that is not really an accurate inference.
So, in answer to your questions: "yes", there are streaming APIs, and "no", DOM is not the only way. That said, processing a JSON document as a stream is often problematic, in that many objects are not simple field/value pairs but contain other objects as values, which you need to parse in order to process, and this tends to end up recursive. But for simple messages you can do useful things with a stream/event-based parser.
I have written a pull-event parser for JSON (it was one class, about 700 lines). But most of the others I have seen are document oriented. One of the layers I have built on top of my parser is a document reader, which took about 30 LOC. I have only ever used my parser in practice as a document loader (for the above reason).
I am sure if you search the net you will find pull and push based parsers for JSON.
EDIT: I have posted the parser to my site for download. A working compilable class and a complete example are included.
EDIT2: You'll also want to look at the JSON website.
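As a small illustration of pull-style, document-at-a-time processing in a widely available language: Python's standard json module exposes JSONDecoder.raw_decode, which pulls one complete document at a time out of a buffer. It isn't a true event parser, but a sketch shows the incremental flavor:
import json

def iter_documents(buffer):
    # Yield successive JSON documents from one string of concatenated documents.
    decoder = json.JSONDecoder()
    idx = 0
    while idx < len(buffer):
        # Skip the whitespace separating documents.
        while idx < len(buffer) and buffer[idx].isspace():
            idx += 1
        if idx == len(buffer):
            break
        obj, idx = decoder.raw_decode(buffer, idx)
        yield obj

for doc in iter_documents('{"a": 1} {"b": 2} [3, 4]'):
    print(doc)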
As stefanB mentioned, http://lloyd.github.com/yajl/ is a C library for stream parsing JSON. There are also many wrappers mentioned on that page for other languages:
yajl-ruby - ruby bindings for YAJL
yajl-objc - Objective-C bindings for YAJL
YAJL IO bindings (for the Io language)
Python bindings come in two flavors, py-yajl OR yajl-py
yajl-js - node.js bindings (mirrored to github).
lua-yajl - lua bindings
ooc-yajl - ooc bindings
yajl-tcl - tcl bindings
Some of them may not allow streaming, but many of them certainly do.
If you want to use pure javascript and a library that runs both in node.js and in the browser you can try clarinet:
https://github.com/dscape/clarinet
The parser is event-based, and since it’s streaming it makes dealing with huge files possible. The API is very close to sax and the code is forked from sax-js.
Disclaimer: I'm suggesting my own project.
I maintain a streaming JSON parser in Javascript which combines some of the features of SAX and DOM:
Oboe.js website
The idea is to allow streaming parsing, but not require the programmer to listen to lots of different events like with raw SAX. I like SAX but it tends to be quite low level for what most people need. You can listen for any interesting node from the JSON stream by registering JSONPath patterns.
The code is on Github here:
Oboe.js Github page
LitJSON supports a streaming-style API. Quoting from the manual:
"An alternative interface to handling JSON data that might be familiar to some developers is through classes that make it possible to read and write data in a stream-like fashion. These classes are JsonReader and JsonWriter.
"These two types are in fact the foundation of this library, and the JsonMapper type is built on top of them, so in a way, the developer can think of the reader and writer classes as the low-level programming interface for LitJSON."
If you are looking specifically for Python, then ijson claims to support streaming. However, it is only a parser; I didn't come across anything for Python that can generate JSON as a stream.
For C++ there is rapidjson that claims to support both parsing and generation in a streaming manner.
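To give a flavor of ijson's low-level interface: ijson.parse yields (prefix, event, value) triples as tokens are read, so the whole document never has to fit in memory. A minimal sketch (the file name is just a placeholder):
import ijson

with open("big.json", "rb") as f:
    # Events arrive one token at a time: start_map, map_key, string, number, ...
    for prefix, event, value in ijson.parse(f):
        if event == "number":
            print(prefix, value)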
Here's a NodeJS NPM library for parsing and handling streams of JSON:
https://npmjs.org/package/JSONStream
For Python, an alternative (apparently lighter and more efficient) to ijson is jsaone (see that link for rough benchmarks, showing that jsaone is approximately 3x faster).
DISCLAIMER: I'm the author of jsaone, and the tests I made are very basic... I'll be happy to be proven wrong!
Answering the question title: YAJL, a JSON parser library in C:
"YAJL remembers all state required to support restarting parsing. This allows parsing to occur incrementally as data is read off a disk or network."
So I guess using yajl to parse JSON can be considered as processing stream of data.
In reply to your 2nd question, no, many languages have JSON parsers. PHP, Java, C, Ruby and many others. Just Google for the language of your choice plus "JSON parser".