Should JSON null be preserved?

I tried to convert the JSON data
{
"a": {
"b": null
}
}
to XML using an online converter. The response was
<a>
<b />
</a>
Converting this back to JSON using the same converter gave me
{
"a": {
}
}
This made me wonder – if you're explicitly given a null value, are you required to preserve it when dealing with JSON? I'm fairly sure that the XML <a><b /></a> is not equivalent to <a></a>, and especially not <a /> (which happens to be what I get when I continue with the same exercise).
In other words, if I'm handed JSON of unknown origin and am supposed to hand it over to an unknown recipient, am I required to preserve the nulls or can I safely remove them? Conversely, can I rely on my nulls arriving exactly as I output them when they are delivered by third-party software?
Here's a similar question: Should JSON include null values – However, the question there is whether the code should output nulls if you define the format yourself, not what you should do if you don't know anything about the original format.
EDIT – Clarification: The way I asked the question was bad and apparently caused confusion. To rephrase it: I do understand that XML and JSON are different formats and are able to carry different kinds of (meta)data. I do know that null is a valid value, as defined by RFC4627. I do understand that there are different ways to convert between XML and JSON since the formats don't have a 1-to-1 relationship. I do understand that the converter I found might be buggy. However, the fact that the same converter didn't provide the same conversion in both directions (no information was lost when converting from "b": null to <b /> and a similar translation in the opposite direction would have been possible) made me wonder something that I couldn't find an answer to despite attempts:
Is it legal, according to the JSON standard, to treat {"a":{"b":null}} and {"a":{}} as one and the same object when transferring them on behalf of other software?
Note that I'm here assuming that it's legal to add or remove whitespace as I see fit (e.g. pretty-printing, which is okay according to RFC4627), and even to rearrange the name/value pairs in a collection (again according to RFC4627). I simply don't know if a null must be preserved in the same way as significant data, or can be dropped in the same way as insignificant whitespace.

Yes, null is a separate value in JSON, and is distinct from not having an attribute, obviously. Also, you can see this question about nulls in XML. The thing to conclude here isn't that there is something wrong with JSON or XML, but simply that the tools you use aren't coded to handle these cases.
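To make the distinction concrete, here is a minimal sketch using Python's standard json module (the choice of Python and the variable names are mine, not part of the question). It shows that a parser sees a member with a null value and a missing member as different objects, whereas whitespace and member order do not change the parsed value:

```python
import json

with_null = json.loads('{"a": {"b": null}}')
without_key = json.loads('{"a": {}}')

# One inner object has a member "b" (whose value is null), the other has none.
print(with_null == without_key)   # False
print("b" in with_null["a"])      # True
print("b" in without_key["a"])    # False

# Whitespace and member order, by contrast, do not affect the parsed value.
reordered = json.loads('{ "x": 1, "y": null }') == json.loads('{"y":null,"x":1}')
print(reordered)                  # True

# Serializing again keeps the null unless the code deliberately drops it.
print(json.dumps(with_null))      # {"a": {"b": null}}
```

So an intermediary that strips nulls changes the data, while one that only re-indents or reorders members does not.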

One of the problems in converting JSON to XML is that if you try and make the conversion lossless, you end up with somewhat "unnatural" XML, whereas if you try to create the most natural XML representation, it ends up losing information. That's why there are lots of different converters that all do it in slightly different ways. Choose the one that meets your requirements.

Related

How do you represent a list in a CSV?

What is the standard way to represent a list/array value in CSV? For example, given this source data in JSON:
[
{
"name": "Harry",
"subjects": ["math", "english", "history"]
}
]
My guess as to a CSV representation would be:
name,subjects
Harry,["math","english","history"]
However that doesn't get parsed correctly (with the standard Python CSV parser).
One option, though this is almost always a hack and should be avoided unless truly necessary, is to choose a delimiter that you know will never show up in your data. For example:
name,subject
Harry,math|english|history
Of course you will have to manually handle splitting this string and turning it back into a list. Existing CSV parsers should not support this, because this concept fundamentally does not make sense in CSV.
And of course, this does not generalize well - what happens in the future when you need to store a 2D list, or a dict, or you realize you do need that delimiter character after all?
The root problem here is that CSV is a tabular format, whereas JSON is a hierarchical format. Rather than trying to "squeeze" one format to fit into a fundamentally incompatible format, you should instead normalize your data into a tabular representation. One example of how this could look:
name,subject
Harry,math
Harry,english
Harry,history
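The normalization above can be sketched in Python with the standard csv module. The helper name to_rows and the in-memory buffer are my own choices for illustration; this is one possible way to do it, not the only one:

```python
import csv
import io

records = [{"name": "Harry", "subjects": ["math", "english", "history"]}]

def to_rows(records):
    """Yield one (name, subject) row per element of each 'subjects' list."""
    for record in records:
        for subject in record["subjects"]:
            yield record["name"], subject

# Write the normalized rows as CSV into an in-memory buffer.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["name", "subject"])
writer.writerows(to_rows(records))
csv_text = buffer.getvalue()
print(csv_text)

# Reading back: group the rows by name to rebuild the lists.
rebuilt = {}
for row in csv.DictReader(io.StringIO(csv_text)):
    rebuilt.setdefault(row["name"], []).append(row["subject"])
print(rebuilt)
```

Because every cell holds a single scalar, any standard CSV reader can parse the file, and the grouping step recovers the original list.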

Strict JSON parsing in Go

The encoding/json package exposes a forgiving parser. Every property that is not present is simply set to its default value. Is there a better way to make a field required than using bulky switch statements and checking every field for its default value? Another problem is that not all default values are nil. Is there another way to distinguish between an unset field and, e.g., 0 other than using pointers so that I can check against nil?
You may look at what is available to implement so-called "JSON schema validation".
You may start with this search, which yields github.com/juju/gojsonschema among others; while I have no idea about its quality, it's used as part of Ubuntu's Juju cloud orchestration solution, so I'd expect it to be battle-tested. Still, caveat emptor.

Specify type when converting from XML to JSON in MarkLogic

Using MarkLogic 8, I'm using a custom XML to JSON conversion for json:transform-to-json, and I've got it working just about right except the conversion is outputting numbers as strings.
Is there a way to specify that the value of a particular element should be a number value, not a string?
I don't see anything in the doc for json:config, but just in case there's something I've missed, or if you have a neat post-processing trick, I'd love to hear about how to solve this problem.
You can do that by defining an XML Schema for the non-string type elements. Just make sure it is available in the context (by loading it into xdmp:schemas-database()), and that it is recognized (your XML needs to have a namespace that matches the XML Schema, and you might want to use import schema).
HTH!

Which word do you use to describe a JSON-like object?

I have a naming issue.
If I read an object x from some JSON I can call my variable xJson (or some variation). However sometimes it is possible that the data could have come from a number of different sources amongst which JSON is not special (e.g. XMLRPC, programmatically constructed from Maps,Lists & primitives ... etc).
In this situation, what is a good name for the variable? The best I have come up with so far is something like 'DynamicData', which is OK in some situations, but is a bit long and probably not very clear to people unfamiliar with the convention.
SerializedData?
A hierarchical collection of hashes and lists of data is often referred to as a document no matter what serialization format is used. Another useful description might be payload or body in the sense of a message body for transmission or a value string written to a key/value store.
I tend to call the object hierarchy a "doc" myself, and the serialized format a "document." Thus a RequestDocument is parsed into a RequestDoc, and upon further identification it might become an OrderDoc, or a CustomerUpdateDoc, etc. An InvoiceDoc might become known generically as a ResponseDoc eventually serialized to a ResponseDocument.
The longer form is awkward, but such serialized strings are typically short-lived and localized in the code anyway.
If your data is the model, name it after the model it's representing, i.e., name it after the purpose of the contents, not the format of the contents. So if it's a list of customer information, name it "customers", or "customerModel", or something like that.
If you don't know what the contents are, the name isn't important, unless you want to differentiate the format. "responseData", "jsonResponse", etc...
And "DynamicData" isn't a long name, unless there is absolutely nothing descriptive to be said about the data. "data" might be just fine.

JSON posting, am I pushing JSON too far?

I'm just wondering if I am pushing JSON too far, and whether anyone has hit this before?
I have an XML file:
<?xml version="1.0" encoding="UTF-8"?>
<customermodel:Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:customermodel="http://customermodel" xmlns:personal="http://customermodel/personal" id="1" age="1" name="Joe">
<bankAccounts xsi:type="customermodel:BankAccount" accountNo="10" bankName="HSBC" testBoolean="true" testDate="2006-10-23" testDateTime="2006-10-23T22:15:01+08:00" testDecimal="20.2" testTime="22:15:01+08:00">
<count>0</count>
<bankAddressLine>HSBC</bankAddressLine>
<bankAddressLine>London</bankAddressLine>
<bankAddressLine>31 florence</bankAddressLine>
<bankAddressLine>Swindon</bankAddressLine>
</bankAccounts>
</customermodel:Customer>
It contains both elements and attributes.
When I convert it to JSON, I get:
{"customermodel:Customer":{"id":"1","name":"Joe","age":"1","xmlns:xsi":"http://www.w3.org/2001/XMLSchema-instance","bankAccounts":{"testDate":"2006-10-23","testDecimal":"20.2","count":"0","testDateTime":"2006-10-23T22:15:01+08:00","bankAddressLine":["HSBC","London","31 florence","Swindon"],"testBoolean":"true","bankName":"HSBC","accountNo":"10","xsi:type":"customermodel:BankAccount","testTime":"22:15:01+08:00"},"xmlns:personal":"http://customermodel/personal","xmlns:customermodel":"http://customermodel"}}
I then send this to the client, which converts it to a JS object (or whatever), edits some values (the elements), and sends it back to the server.
So I get the JSON string and convert it back into XML:
<customermodel:Customer>
<id>1</id>
<age>1</age>
<name>Joe</name>
<xmlns:xsi>http://www.w3.org/2001/XMLSchema-instance</xmlns:xsi>
<bankAccounts>
<testDate>2006-10-23</testDate>
<testDecimal>20.2</testDecimal>
<testDateTime>2006-10-23T22:15:01+08:00</testDateTime>
<count>0</count>
<bankAddressLine>HSBC</bankAddressLine>
<bankAddressLine>London</bankAddressLine>
<bankAddressLine>31 florence</bankAddressLine>
<bankAddressLine>Swindon</bankAddressLine>
<accountNo>10</accountNo>
<bankName>HSBC</bankName>
<testBoolean>true</testBoolean>
<xsi:type>customermodel:BankAccount</xsi:type>
<testTime>22:15:01+08:00</testTime>
</bankAccounts>
<xmlns:personal>http://customermodel/personal</xmlns:personal>
<xmlns:customermodel>http://customermodel</xmlns:customermodel>
</customermodel:Customer>
And there is the problem: the conversion doesn't seem to know the difference between elements and attributes, so I cannot validate the result against an XSD.
Is there a solution to this?
I cannot be the first to hit this problem?
JSON does not make sense as an XML encoding, no. If you want to be working with and manipulating XML, then work with and manipulate XML.
JSON is for when you need something that's lighter weight, easier to parse, and easier to write and read. It has a fairly simple structure, that is neither better nor worse than XML, just different. It has lists, associations, strings, and numbers, while XML has nested elements, attributes, and entities. While you could encode each one in the other precisely, you have to ask yourself why you're doing that; if you want JSON use JSON, and if you want XML use XML.
JsonML provides a well thought out standard mapping from XML<->JSON. If you use it, you'll get the benefit of ease-of-manipulation you're looking for on the client with no loss of fidelity in elements/attributes.
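As a rough illustration of the idea, here is a sketch that turns an XML element into a JsonML-style array of the form [tag, {attributes}, ...children], using Python's xml.etree.ElementTree. The function name to_jsonml is hypothetical, and the whitespace stripping and lack of namespace handling are simplifications of mine; consult the JsonML documentation for the exact rules before relying on this:

```python
import json
import xml.etree.ElementTree as ET

def to_jsonml(elem):
    """Convert an Element into a JsonML-style list: [tag, {attrs}?, *children]."""
    node = [elem.tag]
    if elem.attrib:
        # Attributes go into their own object, so they stay distinct from child elements.
        node.append(dict(elem.attrib))
    if elem.text and elem.text.strip():
        node.append(elem.text.strip())
    for child in elem:
        node.append(to_jsonml(child))
        # Text following a child element (mixed content) is kept as a sibling string.
        if child.tail and child.tail.strip():
            node.append(child.tail.strip())
    return node

sample = ET.fromstring('<a id="1"><b>x</b><c/></a>')
jsonml = to_jsonml(sample)
print(json.dumps(jsonml))  # ["a", {"id": "1"}, ["b", "x"], ["c"]]
```

Since attributes always sit in the object at index 1 and child elements are nested arrays, the reverse conversion can tell the two apart.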
I wouldn't encode the XML schema information in the JSON string; that seems a little backwards. If you're going to send them JSON, they shouldn't have any inkling that this is anything but JSON. The extra XML will only confuse and make your interface look "leaky".
You might even consider just using xml and avoid the additional layer of abstraction. JSON makes the most sense when you know at least one party is actually using javascript. If this isn't the case it'll still work as well as any other transport format. But if you already have an xml representation it's a little excessive.
On the other hand, if your customer is really using javascript it will make it easier for them to use the data. The only concern is the return trip, and once it's in JSON who do you trust more to do the conversion back to xml correctly? You're probably better qualified for that, since it's your schema.
For this to work you would need to build additional logic/data into your serialize/unserialize methods - probably create something like "attributes" and "data" to hold the different parts:
{"customermodel:Customer":
{
"attributes": {"xmlns:xsi":"...", "xmlns:customermodel":"..."},
"data":
{
"bankAccounts":
{
"attributes": { ... },
"data" :
{
"count":0,
"bankAddressLine":"..."
}
}
}
}
}
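A minimal sketch of such serialize/unserialize logic, written in Python with xml.etree.ElementTree, might look like the following. The function names element_to_dict and dict_to_element are hypothetical, and the layout is a variation on the structure above: children are grouped into lists per tag so repeated elements such as bankAddressLine survive, and element text goes under a separate "text" key. It assumes no namespaces and no mixed content, and it does not preserve the relative order of differently named siblings:

```python
import json
import xml.etree.ElementTree as ET

def element_to_dict(elem):
    """Keep attributes and child elements under separate keys."""
    data = {}
    for child in elem:
        # Group children by tag so repeated elements become a list.
        data.setdefault(child.tag, []).append(element_to_dict(child))
    node = {"attributes": dict(elem.attrib), "data": data}
    text = (elem.text or "").strip()
    if text:
        node["text"] = text
    return node

def dict_to_element(tag, node):
    """Rebuild an Element, restoring attributes as attributes."""
    elem = ET.Element(tag, node["attributes"])
    elem.text = node.get("text")
    for child_tag, children in node["data"].items():
        for child in children:
            elem.append(dict_to_element(child_tag, child))
    return elem

xml_in = (
    '<Customer id="1" name="Joe"><bankAccounts accountNo="10">'
    "<count>0</count><bankAddressLine>HSBC</bankAddressLine>"
    "<bankAddressLine>London</bankAddressLine></bankAccounts></Customer>"
)
root = ET.fromstring(xml_in)
as_json = json.dumps({root.tag: element_to_dict(root)})

# Round trip: JSON text back to XML, with attributes still attributes.
(tag, node), = json.loads(as_json).items()
xml_out = ET.tostring(dict_to_element(tag, node), encoding="unicode")
print(xml_out)
```

The client can edit values under "data" and "text" freely; as long as it leaves the "attributes"/"data" layout intact, the server can rebuild XML that still distinguishes elements from attributes and can be checked against the XSD.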