Generic JSON to XML transformation by templating - json

I'm trying to devise a service to convert a generic JSON data representation into XML data representation.
The first idea that came to my mind (and that I found on the internet) takes advantage of the Go templating utilities.
If I have a JSON data representation like the following:
{
"user": {
"name": "Luca",
"surname": "Rossi"
}
}
I could devise a template like the following:
<xml>
<user name="{{.user.name}}" surname="{{.user.surname}}" />
</xml>
to generate:
<xml>
<user name="Luca" surname="Rossi" />
</xml>
The problem is: Go requires the definition of a struct which declares how to marshal and unmarshal a JSON data representation; at the same time, however, I'd like to provide the template to generate the XML as a service configuration available at runtime.
The question is: "is it possible"?
I know (thanks tothis question) that I can take do something like this:
var anyJson map[string]interface{}
json.Unmarshal(bytes, &anyJson)
The problem comes when I have to access values: I'd be asked to do a type assertion, like
anyJson["id"].(string)
Now, I might be able to know the type of anyJson["id"] by means of JSON schema, for example, but I don't know if I can do a parametric type assertion, something like
anyJson["id"].(typeForIDFromJSONSchema)

When you unmarshal into map[string]interface{}, every nested JSON object will also be map[string]interface{}. So type assertions of the contained elements to string may typically work, but not to any struct type - the unmarshaller would always be unaware of your structs.
So the two options I suggest are
to do it 'the hard way' using type switches and type assertions - this is workable and fast, but not all that nice; or
to use a different tool such as jsoniter or gjson - these might be a little slower but they do allow you to traverse arbitrary JSON graphs
I have used both GJson and Jsoniter. GJson works by reading byte by byte through the input, using buffering to keep its speed up, providing an API that allows inspection of the current element and assertions to convert the values.
Jsoniter looks to me like a cleaner implementation along the lines of successful parsers in Java, but I haven't used it for parsing in this manner yet. (It can also be used simply as a fast drop-in replacement for the standard Go encoding/json.) I suggest you focus on using its Iterator and its WhatIsNext method.

Related

Prevent parsing a JSON node with common-lisp YASON library

I am using the Yason library in common-lisp, I want to parse a json string but would like the parser to keep one a its node unparsed.
Typically with an example like that:
{
"metadata1" : "mydata1",
"metadata2" : "mydata2",
"payload" : {...my long payload object},
"otherNodesToParse" : {...}
}
How can I set the yason parser to parse my json but skip the payload node and keep it as a string in the json format.
Use: let's say I just want the envelope data (everything that's not the payload), and to forward the payload as-is (as json string) to another system.
If I parse the whole json (so including payload) and then re-encode the payload to json, it is inefficient. The payload size could also be pretty big.
How do you know where the end of the payload object is in the stream? You do so by parsing the stream: if you don't parse the stream you simply can't know where the end of the object is: that's the nature of JSON's syntax (as it is the nature of CL's default syntax). For instance the only way you can know the difference between where to continue after
{x:1}
and after
{x:1.2}
is by parsing the two things.
So you must necessarily parse the whole thing.
So the answer to your question is: you can't do this.
You could (but not, I think, with YASON) decide that you did not want to build an object as a result of the parse. And perhaps, if the stream you are parsing corresponds to something with random access like a string or a file, you could note the start and end positions in the stream to later extract a string from it corresponding to the unparsed data (or you could perhaps build it up as you go).
It looks as if some or all of this might be possible with CL-JSON, but you'd have to work at it.
Unless the objects you are reading are vast the benefit of this seems questionable-to-none. If you really do want to do something like this efficiently you need a serialisation scheme which tells you how long things are.

Excessive use of map[string]interface{} in go development?

The majority of my development experience has been from dynamically typed languages like PHP and Javascript. I've been practicing with Golang for about a month now by re-creating some of my old PHP/Javascript REST APIs in Golang. I feel like I'm not doing things the Golang way most of the time. Or more generally, I'm not use to working with strongly typed languages. I feel like I'm making excessive use of map[string]interface{} and slices of them to box up data as it comes in from http requests or when it gets shipped out as json http output. So what I'd like to know is if what I'm about to describe goes against the philosophy of golang development? Or if I'm breaking the principles of developing with strongly typed languages?
Right now, about 90% of the program flow for REST Apis I've rewritten with Golang can be described by these 5 steps.
STEP 1 - Receive Data
I receive http form data from http.Request.ParseForm() as formvals := map[string][]string. Sometimes I will store serialized JSON objects that need to be unmarshaled like jsonUserInfo := json.Unmarshal(formvals["user_information"][0]) /* gives some complex json object */.
STEP 2 - Validate Data
I do validation on formvals to make sure all the data values are what I expect before using it in SQL queries. I treat everyting as a string, then use Regex to determine if the string format and business logic is valid (eg. IsEmail, IsNumeric, IsFloat, IsCASLCompliant, IsEligibleForVoting,IsLibraryCardExpired etc...). I've written my own Regex and custom functions for these types of validations
STEP 3 - Bind Data to SQL Queries
I use golang's database/sql.DB to take my formvals and bind them to my Query and Exec functions like this Query("SELECT * FROM tblUser WHERE user_id = ?, user_birthday > ? ",formvals["user_id"][0], jsonUserInfo["birthday"]). I never care about the data types I'm supplying as arguments to be bound, so they're all probably strings. I trust the validation in the step immediately above has determined they are acceptable for SQL use.
STEP 4 - Bind SQL results to []map[string]interface{}{}
I Scan() the results of my queries into a sqlResult := []map[string]interface{}{} because I don't care if the value types are null, strings, float, ints or whatever. So the schema of an sqlResult might look like:
sqlResult =>
[0] {
"user_id":"1"
"user_name":"Bob Smith"
"age":"45"
"weight":"34.22"
},
[1] {
"user_id":"2"
"user_name":"Jane Do"
"age":nil
"weight":"22.22"
}
I wrote my own eager load function so that I can bind more information like so EagerLoad("tblAddress", "JOIN ON tblAddress.user_id",&sqlResult) which then populates sqlResult with more information of the type []map[string]interface{}{} such that it looks like this:
sqlResult =>
[0] {
"user_id":"1"
"user_name":"Bob Smith"
"age":"45"
"weight":"34.22"
"addresses"=>
[0] {
"type":"home"
"address1":"56 Front Street West"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
[1] {
"type":"work"
"address1":"5 Kennedy Avenue"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
},
[1] {
"user_id":"2"
"user_name":"Jane Do"
"age":nil
"weight":"22.22"
"addresses"=>
[0] {
"type":"home"
"address1":"56 Front Street West"
"postal":"L3L3L3"
"lat":"34.3422242"
"lng":"34.5523422"
}
}
STEP 5 - JSON Marshal and send HTTP Response
then I do a http.ResponseWriter.Write(json.Marshal(sqlResult)) and output data for my REST API
Recently, I've been revisiting articles with code samples that use structs in places I would have used map[string]interface{}. For example, I wanted to refactor Step 2 with a more standard approach that other golang developers would use. So I found this https://godoc.org/gopkg.in/go-playground/validator.v9, except all it's examples are with structs . I also noticed that most blogs that talk about database/sql scan their SQL results into typed variables or structs with typed properties, as opposed to my Step 4 which just puts everything into map[string]interface{}
Hence, i started writing this question. I feel the map[string]interface{} is so useful because majority of the time,I don't really care what the data is and it gives me to the freedom in Step 4 to construct any data schema on the fly before I dump it as JSON http response. I do all this with as little code verbosity as possible. But this means my code is not as ready to leverage Go's validation tools, and it doesn't seem to comply with the golang community's way of doing things.
So my question is, what do other golang developers do with regards to Step 2 and Step 4? Especially in Step 4...do Golang developers really encourage specifying the schema of the data through structs and strongly typed properties? Do they also specify structs with strongly typed properties along with every eager loading call they make? Doesn't that seem like so much more code verbosity?
It really depends on the requirements just like you have said you don't require to process the json it comes from the request or from the sql results. Then you can easily unmarshal into interface{}. And marshal the json coming from sql results.
For Step 2
Golang has library which works on validation of structs used to unmarshal json with tags for the fields inside.
https://github.com/go-playground/validator
type Test struct {
Field `validate:"max=10,min=1"`
}
// max will be checked then min
you can also go to godoc for validation library. It is very good implementation of validation for json values using struct tags.
For STEP 4
Most of the times, We use structs if we know the format and data of our JSON. Because it provides us more control over the data types and other functionality. For example if you wants to empty a JSON feild if you don't require it in your JSON. You should use struct with _ json tag.
Now you have said that you don't care if the result coming from sql is empty or not. But if you do it again comes to using struct. You can scan the result into struct with sql.NullTypes. With that also you can provide json tag for omitempty if you wants to omit the json object when marshaling the data when sending a response.
Struct values encode as JSON objects. Each exported struct field
becomes a member of the object, using the field name as the object
key, unless the field is omitted for one of the reasons given below.
The encoding of each struct field can be customized by the format
string stored under the "json" key in the struct field's tag. The
format string gives the name of the field, possibly followed by a
comma-separated list of options. The name may be empty in order to
specify options without overriding the default field name.
The "omitempty" option specifies that the field should be omitted from
the encoding if the field has an empty value, defined as false, 0, a
nil pointer, a nil interface value, and any empty array, slice, map,
or string.
As a special case, if the field tag is "-", the field is always
omitted. Note that a field with name "-" can still be generated using
the tag "-,".
Example of json tags
// Field appears in JSON as key "myName".
Field int `json:"myName"`
// Field appears in JSON as key "myName" and
// the field is omitted from the object if its value is empty,
// as defined above.
Field int `json:"myName,omitempty"`
// Field appears in JSON as key "Field" (the default), but
// the field is skipped if empty.
// Note the leading comma.
Field int `json:",omitempty"`
// Field is ignored by this package.
Field int `json:"-"`
// Field appears in JSON as key "-".
Field int `json:"-,"`
As you can analyze from above information given in Golang spec for json marshal. Struct provide so much control over json. That's why Golang developer most probably use structs.
Now on using map[string]interface{} you should use it when you don't the structure of your json coming from the server or the types of fields. Most Golang developers stick to structs wherever they can.

name json variable and jsonString variable convention?

JSON could mean JSON type or json string.
It starts confuse me when different library use json in two different meanings.
I wonder how other people name those variables differently.
For all practical purposes, "JSON" has exactly one meaning, which is a string representing a JavaScript object following certain specific syntax.
JSON is parsed into a JavaScript object using JSON.parse, and an JavaScript object is converted into a JSON string using JSON.stringify.
The problem is that all too many people have gotten into the bad habit of referring to plain old JavaScript objects as JSON. That is either confused or sloppy or both. {a: 1} is a JS object. '{"a": 1}' is a JSON string.
In the same vein, many people use variable names like json to refer to JavaScript objects derived from JSON. For example:
$.getJSON('foo.php') . then(function(json) { ...
In the above case, the variable name json is ill-advised. The actual payload returned from the server is a JSON string, but internally $.getJSON has already transformed that into a plain old JavaScript object, which is what is being passed to the then handler. Therefore, it would be preferable to use the variable name data, for example.
If a library uses the term "json" to refer to things which are not JSON, but actually are JavaScript objects, it is a mark of poor design, and I'd suggest looking around for a different library.

Is it valid to define functions in JSON results?

Part of a website's JSON response had this (... added for context):
{..., now:function(){return(new Date).getTime()}, ...}
Is adding anonymous functions to JSON valid? I would expect each time you access 'time' to return a different value.
No.
JSON is purely meant to be a data description language. As noted on http://www.json.org, it is a "lightweight data-interchange format." - not a programming language.
Per http://en.wikipedia.org/wiki/JSON, the "basic types" supported are:
Number (integer, real, or floating
point)
String (double-quoted Unicode
with backslash escaping)
Boolean
(true and false)
Array (an ordered
sequence of values, comma-separated
and enclosed in square brackets)
Object (collection of key:value
pairs, comma-separated and enclosed
in curly braces)
null
The problem is that JSON as a data definition language evolved out of JSON as a JavaScript Object Notation. Since Javascript supports eval on JSON, it is legitimate to put JSON code inside JSON (in that use-case). If you're using JSON to pass data remotely, then I would say it is bad practice to put methods in the JSON because you may not have modeled your client-server interaction well. And, further, when wishing to use JSON as a data description language I would say you could get yourself into trouble by embedding methods because some JSON parsers were written with only data description in mind and may not support method definitions in the structure.
Wikipedia JSON entry makes a good case for not including methods in JSON, citing security concerns:
Unless you absolutely trust the source of the text, and you have a need to parse and accept text that is not strictly JSON compliant, you should avoid eval() and use JSON.parse() or another JSON specific parser instead. A JSON parser will recognize only JSON text and will reject other text, which could contain malevolent JavaScript. In browsers that provide native JSON support, JSON parsers are also much faster than eval. It is expected that native JSON support will be included in the next ECMAScript standard.
Let's quote one of the spec's - https://www.rfc-editor.org/rfc/rfc7159#section-12
The The JavaScript Object Notation (JSON) Data Interchange Format Specification states:
JSON is a subset of JavaScript but excludes assignment and invocation.
Since JSON's syntax is borrowed from JavaScript, it is possible to
use that language's "eval()" function to parse JSON texts. This
generally constitutes an unacceptable security risk, since the text
could contain executable code along with data declarations. The same
consideration applies to the use of eval()-like functions in any
other programming language in which JSON texts conform to that
language's syntax.
So all answers which state, that functions are not part of the JSON standard are correct.
The official answer is: No, it is not valid to define functions in JSON results!
The answer could be yes, because "code is data" and "data is code".
Even if JSON is used as a language independent data serialization format, a tunneling of "code" through other types will work.
A JSON string might be used to pass a JS function to the client-side browser for execution.
[{"data":[["1","2"],["3","4"]],"aFunction":"function(){return \"foo bar\";}"}]
This leads to question's like: How to "https://stackoverflow.com/questions/939326/execute-javascript-code-stored-as-a-string".
Be prepared, to raise your "eval() is evil" flag and stick your "do not tunnel functions through JSON" flag next to it.
It is not standard as far as I know. A quick look at http://json.org/ confirms this.
Nope, definitely not.
If you use a decent JSON serializer, it won't let you serialize a function like that. It's a valid OBJECT, but not valid JSON. Whatever that website's intent, it's not sending valid JSON.
JSON explicitly excludes functions because it isn't meant to be a JavaScript-only data
structure (despite the JS in the name).
A short answer is NO...
JSON is a text format that is completely language independent but uses
conventions that are familiar to programmers of the C-family of
languages, including C, C++, C#, Java, JavaScript, Perl, Python, and
many others. These properties make JSON an ideal data-interchange
language.
Look at the reason why:
When exchanging data between a browser and a server, the data can only
be text.
JSON is text, and we can convert any JavaScript object into JSON, and
send JSON to the server.
We can also convert any JSON received from the server into JavaScript
objects.
This way we can work with the data as JavaScript objects, with no
complicated parsing and translations.
But wait...
There is still ways to store your function, it's widely not recommended to that, but still possible:
We said, you can save a string... how about converting your function to a string then?
const data = {func: '()=>"a FUNC"'};
Then you can stringify data using JSON.stringify(data) and then using JSON.parse to parse it (if this step needed)...
And eval to execute a string function (before doing that, just let you know using eval widely not recommended):
eval(data.func)(); //return "a FUNC"
Via using NodeJS (commonJS syntax) I was able to get this type of functionality working, I originally had just a JSON structure inside some external JS file, but I wanted that structure to be more of a Class, with methods that could be decided at run time.
The declaration of 'Executor' in myJSON is not required.
var myJSON = {
"Hello": "World",
"Executor": ""
}
module.exports = {
init: () => { return { ...myJSON, "Executor": (first, last) => { return first + last } } }
}
Function expressions in the JSON are completely possible, just do not forget to wrap it in double quotes. Here is an example taken from noSQL database design:
{
"_id": "_design/testdb",
"views": {
"byName": {
"map": "function(doc){if(doc.name){emit(doc.name,doc.code)}}"
}
}
}
although eval is not recommended, this works:
<!DOCTYPE html>
<html>
<body>
<h2>Convert a string written in JSON format, into a JavaScript function.</h2>
<p id="demo"></p>
<script>
function test(val){return val + " it's OK;}
var someVar = "yup";
var myObj = { "func": "test(someVar);" };
document.getElementById("demo").innerHTML = eval(myObj.func);
</script>
</body>
</html>
Leave the quotes off...
var a = {"b":function(){alert('hello world');} };
a.b();

JSON posting, am i pushing JSON too far?

Im just wondering if I am pushing JSON too far? and if anyone has hit this before?
I have a xml file:
<?xml version="1.0" encoding="UTF-8"?>
<customermodel:Customer xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:customermodel="http://customermodel" xmlns:personal="http://customermodel/personal" id="1" age="1" name="Joe">
<bankAccounts xsi:type="customermodel:BankAccount" accountNo="10" bankName="HSBC" testBoolean="true" testDate="2006-10-23" testDateTime="2006-10-23T22:15:01+08:00" testDecimal="20.2" testTime="22:15:01+08:00">
<count>0</count>
<bankAddressLine>HSBC</bankAddressLine>
<bankAddressLine>London</bankAddressLine>
<bankAddressLine>31 florence</bankAddressLine>
<bankAddressLine>Swindon</bankAddressLine>
</bankAccounts>
</customermodel:Customer>
Which contains elements and attributes....
Which when i convert to JSON gives me:
{"customermodel:Customer":{"id":"1","name":"Joe","age":"1","xmlns:xsi":"http://www.w3.org/2001/XMLSchema-instance","bankAccounts":{"testDate":"2006-10-23","testDecimal":"20.2","count":"0","testDateTime":"2006-10-23T22:15:01+08:00","bankAddressLine":["HSBC","London","31 florence","Swindon"],"testBoolean":"true","bankName":"HSBC","accountNo":"10","xsi:type":"customermodel:BankAccount","testTime":"22:15:01+08:00"},"xmlns:personal":"http://customermodel/personal","xmlns:customermodel":"http://customermodel"}}
So then i send this too the client.. which coverts to a js object (or whatever) edits some values (the elements) and then sends it back to the server.
So i get the JSON string, and convert this back into XML:
<customermodel:Customer>
<id>1</id>
<age>1</age>
<name>Joe</name>
<xmlns:xsi>http://www.w3.org/2001/XMLSchema-instance</xmlns:xsi>
<bankAccounts>
<testDate>2006-10-23</testDate>
<testDecimal>20.2</testDecimal>
<testDateTime>2006-10-23T22:15:01+08:00</testDateTime>
<count>0</count>
<bankAddressLine>HSBC</bankAddressLine>
<bankAddressLine>London</bankAddressLine>
<bankAddressLine>31 florence</bankAddressLine>
<bankAddressLine>Swindon</bankAddressLine>
<accountNo>10</accountNo>
<bankName>HSBC</bankName>
<testBoolean>true</testBoolean>
<xsi:type>customermodel:BankAccount</xsi:type>
<testTime>22:15:01+08:00</testTime>
</bankAccounts>
<xmlns:personal>http://customermodel/personal</xmlns:personal>
<xmlns:customermodel>http://customermodel</xmlns:customermodel>
</customermodel:Customer>
And there is the problem, is doesn't seem to know the difference between elements/attributes so i can not check against a XSD to check this is now valid?
Is there a solution to this?
I cannot be the first to hit this problem?
JSON does not make sense as an XML encoding, no. If you want to be working with and manipulating XML, then work with and manipulate XML.
JSON is for when you need something that's lighter weight, easier to parse, and easier to write and read. It has a fairly simple structure, that is neither better nor worse than XML, just different. It has lists, associations, strings, and numbers, while XML has nested elements, attributes, and entities. While you could encode each one in the other precisely, you have to ask yourself why you're doing that; if you want JSON use JSON, and if you want XML use XML.
JsonML provides a well thought out standard mapping from XML<->JSON. If you use it, you'll get the benefit of ease-of-manipulation you're looking for on the client with no loss of fidelity in elements/attributes.
I wouldn't encode the xml schema information in the json string-- that seems a little backwards. If you're going to send them JSON, they shouldn't have any inkling that this is anything but JSON. The extra xml will serve to confuse and make the your interface look "leaky".
You might even consider just using xml and avoid the additional layer of abstraction. JSON makes the most sense when you know at least one party is actually using javascript. If this isn't the case it'll still work as well as any other transport format. But if you already have an xml representation it's a little excessive.
On the other hand, if your customer is really using javascript it will make it easier for them to use the data. The only concern is the return trip, and once it's in JSON who do you trust more to do the conversion back to xml correctly? You're probably better qualified for that, since it's your schema.
For this to work you would need to build additional logic/data into your serialize/unserialize methods - probably create something like "attributes" and "data" to hold the different parts:
{"customermodel:Customer":
{
"attributes": {"xmlns:xsi":"...", "xmlns:customermodel":"..."},
"data":
{
"bankAccounts":
{
"attributes": { ... }
"data" :
{
"count":0,
"bankAddressLine":"..."
}
}
}
}