I'm currently evaluating contentful as a potential cms for a project. I've been playing around with the json api, which is great, but I'm having trouble representing anything more complex than a flat object data structure as a content type.
The workaround I've found is to create a separate entity and reference it, which works, but makes things quite a bit more complicated (far more entities, requires additional publishing, etc.).
As discussed by contentful here, this approach works great for relating content, but that's a different use case. I simply want to create a piece of content like the following:
{
"item": "value",
"subitem": {
"item": "value"
}
}
Is there another approach to handle this?
So what you are talking about is exactly the same issues we had when building one of our applications.
To get around this, we wrote a small npm module that parses these complex content types pretty easily.
Check it out here: https://github.com/remedyhealth/contentpull
If you want to see the parts talking specifically about the parsing, we wrote a simple tonic notebook to show this: https://tonicdev.com/mrsteele/contentpull
(the parser section is towards the bottom)
Let me know if that helps at all, and please feel free to fork and improve if you have any good recommendations.
Related
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 1 year ago.
Improve this question
I am currently exploring the usages of JSON. I have read many posts, articles, YouTube videos. Yet, I still don't understand the purpose (it's been a month), practicality of JSON. No definition is suiting my logic to understand this concept in order to comfortably implement it.
What I understand (brief overall understanding): JSON provides an easier way to format data and send across networks.
My question: Could someone provide me a comprehensible storyline with JSON in action as I am struggling to understand its practicality. I hope this question makes sense, if not, I can try and re-word it.
Edit for #Philipp: Yes, I do have experience with reading Text-based files with Java (Mainly with assignments at Uni). No, I do not have experience with any competing technologies such as XML or YAML. Consciously, I believe JSON to be 'Cookies' in a sort of way but this most likely is wrong. I hope this helps and look forward to your explanation? Maybe it might help me understanding it.
JSON is, overly simplified, a standard for how to structure your own file formats. "File" does not necessarily mean a file stored in a filesystem. It can also be an ephemeral file which is created on one computer, sent to a different computer via network, gets processed and then discarded without ever storing it. But thinking of it as a file format makes things easier.
A JSON-based file format includes a document in a key-value structure. Every value has a key. Every value can either be a string, a number, another key-value structure or a list of the things mentioned before. Here is an example based on the one from the wikipedia article on JSON:
{
"firstName": "John",
"lastName": "Smith",
"address": {
"streetAddress": "21 2nd Street",
"city": "New York",
"state": "NY",
"postalCode": "10021-3100"
},
"phoneNumbers": [
{
"type": "home",
"number": "212 555-1234"
},
{
"type": "office",
"number": "646 555-4567"
}
]
}
This file describes a person who has a first name, a last name, one address consisting of a street address, city, state and postal code, and a list of phone numbers, with each phone number having a type and a number.
OK, but there are certainly other ways to store that kind of information. Ways which might be more concise. So why would you choose to invent a file format based on JSON instead of just starting from scratch?
Library support. There are lots of libraries available for parsing and writing JSON. If you ever wrote a file parsing routine yourself, then you know how much of a PITA those can be. There are a ton of edge-cases you have to keep in mind to prevent your program from crashing or reading garbage data. A JSON library takes care of all of these edge-cases for you. This makes it a lot easier for you to create programs working with JSON data than when you invent your own file format.
Tool support. There are editors available which can edit any form of JSON data in a handy UI. For example, did you notice that Stackoverflow automatically added syntax highlighting to the JSON code above? I didn't do anything to make that happen. Stackoverflow just automatically recognized that it is JSON and colored it accordingly. That would not be possible with a homebrewed file format.
Good compromise between machine-readability and human-readability. The format above is not just easy to read for programs (thanks to the aforementioned library support) but also pretty readable and editable for humans. People can intuitively understand the format and edit it in a text editor without breaking stuff. Especially when they worked with JSON-based file formats before.
Forward- and backward compatibility of file formats. This is something you could technically achieve in your own file format, but JSON makes it a lot easier. Imagine you create version 2.0 of your program, which comes with a version 2.0 of the file format. Your documents now have some additional fields. Handling this in homebrewed text-based formats can be really difficult. But the key-value structure of JSON makes it pretty easy to recognize that certain keys are missing and then replace their values with reasonable defaults. Similarly, the 1.0 version of your program might make limited sense of 2.0 documents by simply ignoring any keys it doesn't understand yet.
Interoperability with JavaScript. This might be kind of situational, but the reason why you see JSON being used a lot in the context of web applications is that JSON is actually valid JavaScript. That means that when you have a browser-based application, converting to and from JavaScript Objects to JSON text and vice versa is trivial. That makes it a preferred choice for exchanging data between browser-based applications and servers. The result is that you see a lot of JSON in cookies or webservice requests (although none of these mandate the use of JSON).
JSON or (JavaScript Object Notation) is simply a lightweight, semi-structured way of representing a set of data.
One sample Storyline:
Let's say you are creating an application that needs to communicate with another application and you want to make it easier for other applications to consume the data your application provides.
There are a lot of ways to do this, but by using JSON you make the process more simple (the applications that consume your data can figure out how to read your data on their own - if they want to) AND you cut down on the amount of raw data that is being passed around.
To answer your question is very simple and lightweight compared to other communication methods like SOAP or connecting straight to the database where you hold your data.
I'm currently working on a project that has front-end components (Jira) written in groovy, and backend processes written in powershell. We're using json to pass information back and forth. One of the biggest problems we've encountered is coming up with a standardized "template" for the json that is being used on both ends. What we have works, but it is a frankenstein mess.
We are using json libraries for both json and powershell -- the json that is being constructed on either end is legit json. We are also encoding it to base64 to get around interpolation issues we've run into.
My main question is this: what is the best practice for passing data between different tools in json? I'm relatively new to it. Is there some sort of standard template we should be adhering to? I develop the groovy side, my friend the powershell side -- I was hoping to come up with something that would minimize problems if someone messed up how the json was constructed on either side. Something to check against. Something akin to an xsd.
Was curious if people have dealt with this type of thing, and what the best approach was. As I mentioned before, we have something that works now -- with error handling and whatnot, but it was very organic... and not standardized at all. I saw mention of jasonp, jsend, etc., but having some difficulty groking the options.
Tips/guidance appreciated.
Does anyone know a good way to to draw the exact structure of input data for a method? In my case I have to specify the correct input data for a server application. The server gets an http post with data. Because this data is a very complex json data structure, I want to draw this, so next developer can easily check the drawing and is able to understand, what data is needed for the http post. It would be nice if I can also draw http headers mark data as mandatory or nice to have.
I dont need a data flow diagramm or sth. like that. What I need is a drawing, how to build a valid json for the server method.
Please if anyone have an idea, just answer or comment this question, even if you just have ideas for buzz words, I can google myself.
In order to describe data structure consider (1) using the UML class diagram with multiplicities and ownership and "named association ends". Kirill Fakhroutdinov's examples uml-diagrams.org: Online Shopping and uml-diagrams.org: Sentinel HASP Licensing Domain illustrate what your drawing might look like.
As you need to specifically describe json structure then (2) Google: "json schema" to see how others approached the same problem.
Personally, besides providing the UML diagram I'd (3) consider writing a TypeScript definition file which actually can describe json structure including simple types, nested structures, optional parts etc. and moreover the next developer can validate examples of data structures (unit tests) against the definition by writing a simple TypeScript script and trying to compile it
I wrote a JSON compiler and decompiler in October. After running a bunch of tests against other people's JSON, I was satisfied that it worked, and moved on. I mostly focused on the compiler, because that's usually the hard part, trying to understand all the variables that people can throw at you. JSON was, as advertised, pretty easy to work with (though not as easy, imho, as it could have been). No matter.
Now we've got a format that's starting to gain traction, a JSONification of stream of news displayed by River2. A bunch of Javascript devs are working on renderings of this data, some of which are now already nicer than the one I use, but none are yet functional enough for me to switch to.
But there's a problem with the JSON.
Each group of news bits is organized as a bunch of scalar data, like feed name, url, when the feed was last read, etc. Then there are one or more news items. If there's one item I just include a struct named item. If there's more than one I include a list of structs. The list is named item. I understood this is the convention for repeating elements in JSON.
http://scripting.com/images/2010/12/17/jsonShot.gif
In the screen shot above, there are two "updatedFeed" elements. The first has only one item, the second has more than one.
This causes problems for people in some languages because (apparently) it's hard for them to deal with an object without, in advance, knowing its type. So they say the solution is simple, always make it a list. Simple for them, but... :-)
But this is not so simple on my end. Because I'm using a generic JSON serializer, and it would have no way of knowing that "item" should always be a list. Unless...
One way of dealing with this (that I don't like and won't do) is to make everything a list.
I was just wondering what other JSON-producing environments do in situations like this.
JSON is a serialization format. It generally works best, if the same (expected) object has the same schema each time, or if the receiver is built to be flexible or ignore the parts that change.
In this case, it sounds to me that the news stream always should have the same format, so it sounds like you should tweak the object you are "compiling" to JSON, so that it always has the same structure, or use something like JSON Schema.
The problem:
You have some data and your program needs specified input. For example strings which are numbers. You are searching for a way to transform the original data in a format you need.
And the problem is: The source can be anything. It can be XML, property lists, binary which
contains the needed data deeply embedded in binary junk. And your output format may vary
also: It can be number strings, float, doubles....
You don't want to program. You want routines which gives you commands capable to transform the data in a form you wish. Surely it contains regular expressions, but it is very good designed and it offers capabilities which are sometimes much more easier and more powerful.
ADDITION:
Many users have this problem and hope that their programs can convert, read and write data which is given by other sources. If it can't, they are doomed or use programs like business
intelligence. That is NOT the problem.
I am talking of a tool for a developer who knows what is he doing, but who is also dissatisfied to write every time routines in a regular language. A professional data manipulation tool, something like a hex editor, regex, vi, grep, parser melted together
accessible by routines or a REPL.
If you have the spec of the data format, you can access and transform the data at once. No need to debug or plan meticulously how to program the transformation. I am searching for a solution because I don't believe the problem is new.
It allows:
joining/grouping/merging of results
inserting/deleting/finding/replacing
write macros which allows to execute a command chain repeatedly
meta-grouping (lists->tables->n-dimensional tables)
Example (No, I am not looking for a solution to this, it is just an example):
You want to read xml strings embedded in a binary file with variable length records. Your
tool reads the record length and deletes the junk surrounding your text. Now it splits open
the xml and extracts the strings. Being Indian number glyphs and containing decimal commas instead of decimal points, your tool transforms it into ASCII and replaces commas with points. Now the results must be stored into matrices of variable length....etc. etc.
I am searching for a good language / language-design and if possible, an implementation.
Which design do you like or even, if it does not fulfill the conditions, wouldn't you want to miss ?
EDIT: The question is if a solution for the problem exists and if yes, which implementations are available. You DO NOT implement your own sorting algorithm if Quicksort, Mergesort and Heapsort is available. You DO NOT invent your own text parsing
method if you have regular expressions. You DO NOT invent your own 3D language for graphics if OpenGL/Direct3D is available. There are existing solutions or at least papers describing the problem and giving suggestions. And there are people who may have worked and experienced such problems and who can give ideas and suggestions. The idea that this problem is totally new and I should work out and implement it myself without background
knowledge seems for me, I must admit, totally off the mark.
UPDATE:
Unfortunately I had less time than anticipated to delve in the subject because our development team is currently in a hot phase. But I have contacted the author of TextTransformer and he kindly answered my questions.
I have investigated TextTransformer (http://www.texttransformer.de) in the meantime and so far I can see it offers a complete and efficient solution if you are going to parse character data.
For anyone who will give it a try to implement a good parsing language, the smallest set of operators to directly transform any input data to any output data if (!) they were powerful enough seems to be:
Insert/Remove: Self-explaining
Group/Ungroup: Split the input data into a set of tokens and organize them into groups
and supergroups (datastructures, lists, tables etc.)
Transform
Substituition: Change the content of the tokens (special operation: replace)
Transposition: Change the order of tokens (swap,merge etc.)
Have you investigated TextTransformer?
I have no experience with this, but it sounds pretty good and the author makes quite competent posts in the comp.compilers newsgroup.
You still have to some programming work.
For a programmer, I would suggest:
Perl against a SQL backend.
For a non-programmer, what it sounds like you're looking for is some sort of business intelligence suite.
This suggestion may broaden the scope of your search too much... but here it is:
You could either reuse, as-is, or otherwise get "inspiration" from the [open source] code of the SnapLogic framework.
Edit (answering the comment on SnapLogic documentation etc.)
I agree, the SnapLogic documentation leaves a bit to be desired, in particular for people in your situation, i.e. when just trying to quickly get an overview of what SnapLogic can do, and if it would generally meet their needs, without investing much time and learn the system in earnest.
Also, I realize that the scope and typical uses of of SnapLogic differ, somewhat, from the requirements expressed in the question, and I should have taken the time to better articulate the possible connection.
So here goes...
A salient and powerful feature of SnapLogic is its ability to [virtually] codelessly create "pipelines" i.e. processes made from pre-built components;
Components addressing the most common needs of Data Integration tasks at-large are supplied with the SnapLogic framework. For example, there are components to
read in and/or write to files in CSV or XML or fixed length format
connect to various SQL backends (for either input, output or both)
transform/format [readily parsed] data fields
sort records
join records for lookup and general "denormalized" record building (akin to SQL joins but applicable to any input [of reasonnable size])
merge sources
Filter records within a source (to select and, at a later step, work on say only records with attribute "State" equal to "NY")
see this list of available components for more details
A relatively weak area of functionality of SnapLogic (for the described purpose of the OP) is with regards to parsing. Standard components will only read generic file formats (XML, RSS, CSV, Fixed Len, DBMSes...) therefore structured (or semi-structured?) files such as the one described in the question, with mixed binary and text and such are unlikely to ever be a standard component.
You'd therefore need to write your own parsing logic, in Python or Java, respecting the SnapLogic API of course so the module can later "play nice" with the other ones.
BTW, the task of parsing the files described could be done in one of two ways, with a "monolithic" reader component (i.e. one which takes in the whole file and produces an array of readily parsed records), or with a multi-component approach, whereby an input component reads in and parse the file at "record" level (or line level or block level whatever this may be), and other standard or custom SnapLogic components are used to create a pipeline which effectively expresses the logic of parsing a record (or block or...) into its individual fields/attributes.
The second approach is of course more modular and may be applicable if the goal is to process many different files format, whereby each new format requires piecing together components with no or little coding. Whatever the approach used for the input / parsing of the file(s), the SnapLogic framework remains available to create pipelines to then process the parsed input in various fashion.
My understanding of the question therefore prompted me to suggest SnapLogic as a possible framework for the problem at hand, because I understood the gap in feature concerning the "codeless" parsing of odd-formatted files, but also saw some commonality of features with regards to creating various processing pipelines.
I also edged my suggestion, with an expression like "inspire onself from", because of the possible feature gap, but also because of the relative lack of maturity of the SnapLogic offering and its apparent commercial/open-source ambivalence.
(Note: this statement is neither a critique of the technical maturity/value of the framework per-se, nor a critique of business-oriented use of open-source, but rather a warning that business/commercial pressures may shape the offering in various direction)
To summarize:
Depending on the specific details of the vision expressed in the question, SnapLogic may be worthy of consideration, provided one understands that "some-assembly-required" will apply, in particular in the area of file parsing, and that the specific shape and nature of the product may evolve (but then again it is open source so one can freeze it or bend it as needed).
A more generic remark is that SnapLogic is based on Python which is a very swell language for coding various connectors, convertion logic etc.
In reply to Paul Nathan you mentioned writing throwaway code as something rather unpleasant. I don't see why it should be so. After all, all of our code will be thrown away and replaced eventually, no matter how perfect we wrote it. So my opinion is that writing throwaway code is pretty much ok, if you don't spend too much time writing it.
So, it seems that there are two approaches to solving your solution: either a) find some specific tool intended for the purpose (parse data, perform some basic operations on it and storing it in some specific structure) or b) use some general purpose language with lots of libraries and code it yourself.
I don't think that approach a) is viable because sooner or later you'll bump into an obstacle not covered by the tool and you'll spend your time and nerves hacking the tool, or mailing the authors and waiting for them to implement what you need. I might as well be wrong, so please if you find a perfect tool, drop here a link (I myself am doing lots of data processing in my day job and I can't swear that I couldn't do it more efficiently).
Approach b) may at first seem "unpleasant", but given a nice high-level expressive language with bunch of useful libraries (regexps, XML manipulation, creating parsers...) it shouldn't be too hard, and may be gradually turned into a DSL for the very purpose. Beside Perl which was already mentioned, Python and Ruby sound like good candidates for these languages (I bet some Lisp derivatives too, but I have no experience there).
You might find AntlrWorks useful if you go so far as defining formal grammars for what you're parsing.