I'm aware that there are python and powershell methods to convert plain text files, csv's etc.... into json format for upload into NoSQL DBs such as CouchDB.
But according to the CouchDB definitive guide, it makes it seems like there is a native built in way of doing this kind of conversion, without the need for a 3rd party tool.
This older thread appears to hint at this:
Filter and update functions in CouchDB?
This part in particular:
There are other design document functions that are being introduced at the >time of this writing, including _update and _filter that we aren’t covering in >depth here. Filter functions are covered in Chapter 20, Change Notifications. >Imagine a web service that POSTs an XML blob at a URL of your choosing when >particular events occur. PayPal’s instant payment notification is one of >these. With an _update handler, you can POST these directly in CouchDB and it >can parse the XML into a JSON document and save it. The same goes for CSV, >multi-part form, or any other format.
But when I dig deeper I don't find anything concrete.
The supporting wiki link is not clear to me (a beginner with json/NoSQL/curl stuff: http://wiki.apache.org/couchdb/Document_Update_Handlers
Hopefully this is a simple yes/no. And any links to help on this that is better than the above link also appreciated, thank you!
CouchDB supports transforming the internal documents/views into many other formats through the use of show and list functions. It's not a "native" transformation, as you define the transformation yourself, it's not magical.
That being said, there is not a similar mechanism for the reverse (ie: converting some arbitrary format into JSON documents) but you're much better off scripting that with a full-featured language/script and using the bulk docs API to do your imports in batch.
Related
I need to understand the difference between
- message pack
- protocol buffers
- JSON
Without having jumped in deeply into the matter I'd say the following:
All three are data formats that help you serialize information in a structured form so you can easily exchange it between software components (for example client and server).
While I'm not too familiar with the other two, JSON is currently a quasi-standard due to the fact that it is practically built into JavaScript - it's not a coincidence it is called JavaScript Object Notation. The other two seem to require additional libraries on both ends to create the required format.
So when to use which? Use JSON for REST services, for example if you want to publish your API or need different clients to access it. JSON seems to have the broadest acceptance.
I was reading this article about protobuf and I wondered where to use it in the projects. I read some articles that said google created protobuf to replace XML, but as far as I know in 2008 (the first release) JSON was already there.
I searched more and I found an article that the writer suggested to use it instead of JSON, but I still don't get the idea completely.
So where shall I use it? Any special scenario, or like JSON whenever that I want to transport data? Any other scenarios?
It is useful whenever you want to serialize/deserialize your data. Typical situations include sending your data to someone else over the network, storing it to disk or keeping it in context while performing asynchronous processes.
Here is a brief explanation about the main differences between protocol buffer, json and XML: https://stackoverflow.com/a/14029040/6681872
According to a post on Stackflow.com called “what’s is JSOn and why would I use it? “web services used XML as their primary data format for transmitting back data, but since JSON appeared, it is preferred method.” Why do must web services use JSON over XML, is because it’s a better method for interchanging?
XML was designed primarily for document formats, e.g. papers in scientific journals. It contains many features that aren't needed for simple data interchange, and these features can get in the way when you are processing XML because they can't be easily represented in Javascript. So the code for processing the XML ends up a lot more complicated than it could be. By contrasts, JSON has an exact match to the data structures Javascript can handle natively. Of course, that problem could in principle be solved by using a language with better XML support than JavaScript - XSLT, for example - but unfortunately XSLT in the browser has never had the same level of investment put into it.
Additionally, for reasons I have never understood, the browser security folks decided that reading JSON from alien web sites (i.e. from a different domain from your HTML page) is safe, but reading XML from alien sites isn't. So if you switch from XML to JSON, you get rid of a lot of cross-site-scripting hassle.
JSON is less verbose and it is sufficient for simple data transmission, i.e. if you do not need any transformations (XSLT).
I would love to use protocol buffers, but I am not sure if they fit my use case. Here it is:
I have a Quiz app. This requires a bunch of data, like categories, questions, a list of answers (and which ones are correct). I do not want to be responsible for entering this data - I would prefer to pass it off to a non-programmer to serialize all this data for me, in either XML or JSON. Then my app would just read in the data file.
Does Google's Protocol Buffers fit my use case? Or should I stick to a more traditional format like XML or JSON?
I think not: Protobuf is a binary format. So then you would need to support a text format like XML or JSON and Protobuf.
Also it does not seem you would benefit from Protobufs better berformance at all.
I'm creating a web apllication and i want to load a json file to a visualization library. the thing is the json file needs to be in a certain format.
I'm using jena to get data in a json file that is in the TALIS format. How can i get the data writen in a custom format?Is it easier to first get them in talis and then transform them or get them in the desired form from the beginning?
I'd appreciate every possible help!
You don't say how you are serving your data to the client-side JavaScript application. I'm going to take a guess, and assume you are using Jena Fuseki to serve the data. If that's not a correct guess, you'll need to update the question to be more precise about your setup.
I don't think that Fuseki currently supports pluggable writers. So your best solution would be to apply a transformation in the client-side JavaScript to turn the JSON you get from the server into the format that's needed by the visualisation library. I've done this myself in a number of rich-client applications that consume RDF data. I usually find that I would need to apply client-side transform code in any case - often it's not just a difference in the format of the JSON, but also that you need to project some slice or aggregation of the data that's just easier to express in JavaScript rather than in SPARQL or equivalent.