I am a beginner in Hadoop. Can anyone help me with reading JSON in a MapReduce job?
I have googled and found that Jaql is suitable for reading JSON, but I did not find any documentation on how it could be used in our MapReduce job.
Is there any other framework that supports reading JSON in MapReduce?
Any suggestions on this?
Thanks in advance.
I would rather trust the MapReduce framework itself to handle this. MapReduce allows us to write custom Input/Output Formats to handle data it does not support OOTB, like JSON. See this question for an example. I would prefer this approach since it doesn't require any third-party stuff; it's just a matter of extending the MapReduce API (but that's just my choice; others may find something else more suitable).
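As a rough sketch of the simpler variant of that idea (using the stock TextInputFormat and parsing each line's JSON inside the mapper, rather than writing a full custom InputFormat), something like the following should work. The Jackson calls are standard, but the "user" field and the per-user counting are made-up assumptions for illustration:

import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

// Expects one complete JSON object per input line (the usual layout for JSON logs).
public class JsonLineMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final ObjectMapper JSON = new ObjectMapper();
    private static final IntWritable ONE = new IntWritable(1);
    private final Text outKey = new Text();

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Parse the line; in a real job you would probably catch parse errors
        // and increment a counter instead of failing the task.
        JsonNode record = JSON.readTree(line.toString());
        JsonNode user = record.get("user");   // hypothetical field name
        if (user != null) {
            outKey.set(user.asText());
            context.write(outKey, ONE);        // e.g. count records per user
        }
    }
}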
But the easiest way, IMHO, would be to use Hive or Pig to handle JSON data. You don't have to do much to make it work, as both of these projects have OOTB JSON support. See this for the Hive JSON SerDe and this for Pig's JsonLoader and JsonStorage.
HTH
I need to build a template for data exchange between two web services using JSON. I believe I need to build a JSON template, similar to an XSD used for XML. Is that right?
Please guide me to a good tutorial or a good app to generate a JSON template.
Thanks in advance.
What you are looking for is called JSON Schema. Have a look at Wikipedia's page.
If you want a schema generator, look at the answers to this question.
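To give a feel for it, here is a minimal hand-written JSON Schema (draft-04 syntax); the property names are made up, but it plays roughly the role an XSD plays for XML:

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "Order",
  "type": "object",
  "properties": {
    "orderId":  { "type": "integer" },
    "customer": { "type": "string" },
    "items":    { "type": "array", "items": { "type": "string" } }
  },
  "required": ["orderId", "customer"]
}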
Do you have any hints on the best way to deal with files containing JSON entries in Hadoop?
There's a nice article on this from the Hadoop in Practice book:
http://java.dzone.com/articles/hadoop-practice
Twitter's elephant-bird library has a JsonStringToMap class which you can use with Pig.
Try this
You can also use Jaql. It's the easiest way to deal with JSON in MapReduce. The bad thing is that you will have to learn Jaql (unless you know it already)!
MongoDB is a good option when you are dealing with JSON. MongoDB and Hadoop are a powerful combination and can be used together to deliver complex analytics and data processing for data stored in MongoDB. http://www.mongodb.org/
This seems like a silly question, but I haven't really seen a good guide or best practices for getting Tomcat to respond to JSON requests. Does anyone know of one? We'll be handling a lot of requests, so I need to consider redundancy and efficiency.
Thanks.
Consider using Jackson or JSON-Lib to transform your output from within your servlets/controllers.
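As a minimal sketch of the Jackson route (the Person class and the /person mapping are illustrative assumptions; this uses javax.servlet, so Tomcat 10+ would need the jakarta.servlet packages instead):

import java.io.IOException;

import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.fasterxml.jackson.databind.ObjectMapper;

@WebServlet("/person")
public class PersonServlet extends HttpServlet {

    // ObjectMapper is thread-safe once configured, so a single shared instance is fine.
    private static final ObjectMapper JSON = new ObjectMapper();

    static class Person {
        public String name = "Tom";
        public int age = 10;
    }

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {
        resp.setContentType("application/json");
        resp.setCharacterEncoding("UTF-8");
        JSON.writeValue(resp.getWriter(), new Person()); // writes {"name":"Tom","age":10}
    }
}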
I wanted to use YAML, but there isn't a single mature YAML library for Erlang. I know there are a few JSON libraries, but I was wondering which one is the most mature.
Have a look at the one from mochiweb: mochijson.erl
1> mochijson:decode("{\"Name\":\"Tom\",\"Age\":10}").
{struct,[{"Name","Tom"},{"Age",10}]}
I prefer Jiffy. It works with binaries and is really fast.
1> jiffy:decode(<<"{\"Name\":\"Tom\",\"Age\":10}">>).
{[{<<"Name">>,<<"Tom">>},{<<"Age">>,10}]}
Can encode as well:
2> jiffy:encode({[{<<"Name">>,<<"Tom">>},{<<"Age">>,10}]}).
<<"{\"Name\":\"Tom\",\"Age\":10}">>
Also check out jsx. "An erlang application for consuming, producing and manipulating json. Inspired by Yajl." I haven't tried it myself yet, but it looks promising.
As a side note; I found this library through Jesse, a json schema validator by Klarna.
I use the json library provided by yaws.
Edit: I actually switched over to Jiffy, see Konstantin's answer.
Trapexit offers a really cool search feature for Erlang projects.
Search for JSON there and you'll find about 13 results. Check the dates of the latest revisions, the user ratings, and the project activity status.
UPDATE: I've just found a similar question on StackOverflow. Apparently, they are quite happy with the erlang-json-eep-parser parser.
My favourite is mochijson2. The API is straightforward, it's fast enough for me (I never actually bothered to benchmark it, to be honest; I'm mostly encoding and decoding small packets), and I've been using it in a stable "production server" for a year or so now. Just remember to install mochinum as well; mochijson2 uses it to encode large numbers, and if it's missing and you try to encode a large number, it will throw an exception.
See also: mochijson2 examples (stackoverflow)