Strict JSON parsing in Go - json

The encoding/json exposes a forgiving parser. Every not present property is simply set to its default value. Is there a better way to make a field required than using bulky switch statments and check every field for its default value? Another problem is that not all default types are nil. Is there another way to distinguish between than a not set field and e.g. 0 other than using pointers to be able to check against nil?

You may look at what there is available to implement
so-called "JSON schema validation".
You may start with this search
which yields github.com/juju/gojsonschema among others;
while I have no idea about its quality, it's used as part of
Ubuntu's Juju cloud orchestration solution so I'd expect it
to be battle tested. Still, caveat emptor.

Related

Mapping JSON to Apama Events

I am trying to use the Mapper codec in my connectivity chain to convert a JSON object that looks like this:
{"test2":[
["column1","column2","column3"],
["16091", "449", "05302018"],
["16092", "705", "05302018"]
]}
to an EPL type. To me it looks like a sequence of sequences, so I used
event test1 {
sequence<string> values;
}
event test2 {
sequence<test1> tests;
}
But this gives me the error
Unable to parse event test.1: Incorrect type in get (you asked for map but its' actually list)
Any ideas how I should be using the Mapper codec to this end?
Unless explicitly remapped, that won't quite work. You have to consider the entire structure of the document from top to bottom. It's not a sequence of strings - it's a JSON object/dictionary at top-level, with a value that is a sequence of sequences of string.
A JSON object/dictionary can map to an event type based on field names. So as Matt's answer said, a JSON document like yours would need an event type like
event SomeEventType {
sequence<sequence<string > > test2;
}
If it's not appropriate to create an event type that exactly corresponds to the JSON document's structure, then you'll need to use the mapping codec to rearrange the fields in the JSON document to match the fields and sub-fields in an event type. Or possibly a custom codec; I think Matt's right that the mapper can't do exactly what you want.
Further, because JSON documents are type-less at the top-level, you'll need to make sure that the event type is defined somehow. There are multiple ways of doing that.
(1) If this particular connectivity will only send you events of one type, you can use the 'defaultEventType' configuration option of the apama.eventMap host plug-in at the top of your chain e.g.
apama.eventMap:
defaultEventMap: SomeEventType
(2) If it depends on the structure of the document, you'll need to use the classifier codec. That can take a message going towards the correlator, and assign it an event type based on the content of fields (or simply their presence). You can learn about it in the documentation.
(3) The transport will sometimes define it on messages being sent towards the correlator. For example, in the case of the Universal Messaging transport, then the 'tag' of the UM event will be used as the type. This may or may not be appropriate.
If you do end up doing anything non-trivial with the classifier or mapper, I'd strongly recommend use of the 'diagnostic codec' to help in developing the classifier or mapper rules. This is a codec you can put anywhere in the chain of codecs that will log every event going through it, so you can see how your rules are operating by seeing what happens before and after classification/mapping. You can read about it in the documentation, but it's usually as simple as putting '- diagnosticCodec' somewhere in your chain. I've found it absolutely invaluable when debugging connectivity chains.
you want your event type to look like:
event type1 {
sequence<sequence<string> > data;
}
it's not possible in the mapper directly to convert to your type2/type1 schema, but you'd be able to write your own codec to do that or do post-filtering in EPL.
HTH,
Matt

Gemfire pdxInstance datatype

I am writing pdxInstances to GemFire using the sequence: rabbitmq => springxd => gemfire.
If I put this JSON into rabbitmq {'ID':11,'value':5}, value appears as a byte value in GemFire. If I put {'ID':11,'value':500}, value appears as a word and if I put {'ID':11,'value':50000} it appears as an Integer.
A problem arises when I query data from GemFire and order them. For example, if I use a query such as select * from /my_region order by value it fails, saying it cannot compare a byte with a word (or byte with an integer).
Is there any way to declare the data type in JSON? Or any other method to get rid of this problem?
To add a bit of insight into this problem... in reviewing GemFire/Geode source code, it would seem it is not possible to configure the desired value type and override GemFire/Geode's default behavior, which can be seen in JSONFormatter.setNumberField(..).
I will not explain how GemFire/Geode involves the JSONFormatter during a Region.put(key, value) operation as it is rather involved and beyond the scope of this discussion.
However, one could argue that the problem is not necessarily with the JSONFormatter class, since storing a numeric value in a byte is more efficient than storing the value in an integer, especially when the value would indeed fit into a byte. Therefore, the problem is really that the Comparator used in the Query processor should be able to compare numeric values in the same type family (byte, short, int, long), upcasting where appropriate.
If you feel so inclined, feel free to file a JIRA ticket in the Apache Geode JIRA repository at https://issues.apache.org/jira/browse/GEODE-72?jql=project%20%3D%20GEODE
Note, Apache Geode is the open source "core" of Pivotal GemFire now. See the Apache Geode website for more details.
Cheers!
Your best bet would be to take care of this with a custom module or a groovy script. You can either write a custom module in Java to do the conversion and then upload the custom module into SpringXD, then you could reference your custom module like any other processor. Or you could write a script in Groovy and pass the incoming data through a transform processor.
http://docs.spring.io/spring-xd/docs/current/reference/html/#processors
The actual conversion probably won't be too tricky, but will vary depending on which method you use. The stream creation would look something like this when you're done.
stream create --name myRabbitStream --definition "rabbit | my-custom-module | gemfire-json-server etc....."
stream create --name myRabbitStream --definition "rabbit | transform --script=file:/transform.groovy | gemfire-json-server etc...."
It seems like you have your source and sink modules set up just fine, so all you need to do is get your processor module setup to do the conversion and you should be all set.

JSON output to the browser -> providing an order

So I've read that you cannot expect a default order when requesting json. I've seen this in action making a call to a little api that I built, that will return a jumbled, random order of elements each time I make a different call.
How does a site like ticketfly's api ( call it here http://www.ticketfly.com/api/events/upcoming.json?venueId=57 ) always ensure that the json returned is in a specific order?
The event ids always first, etc.
Thanks for shedding some light on the situation.
If you are in control of the endpoint API then you can hardcode the order in which you render the properties. Though I have to ask why exactly do you need the JSON properties in a particular order? You will finally be accessing the properties via there property names so the order in which they appear in the JSON should not ideally matter.
EDIT : Since your bosses insist on this (what can one say now?):
You can try and see if any of the following suits your needs:
Try hardcoding the display order in the view's representation. This means you will need to echo/print each property name explicitly in the view script. In PHP it could be something like echo $variable_representing_json["id"]; and so forth. Note that with this approach you needn't change the original JSON representation.
If you want the original JSON representation to be changed then depending on how you are doing the process it varies in difficulty:
If it's string concatenation that you are using to represent the json then hard-code the order in which the json properties get concatenated in the string.
In some languages the display order of properties is actually a representation of the order in which the properties were defined. In simple words if $var is an empty json representation then you should define $var["id"] = {some_val} first to display it first.
If you are using a framework for processing the JSON data it may have its own quirks irrespective of how you define your representation. In such cases you will have to try and see if you can work around the issue or if it gives any helper methods.

Should JSON null be preserved?

I tried to convert the JSON data
{
"a": {
"b": null
}
}
to XML using an online converter. The response was
<a>
<b />
</a>
Converting this back to JSON using the same converter gave me
{
"a": {
}
}
This made me wonder – if you're explicitly given a null value, are you required to preserve it when dealing with JSON? I'm fairly sure that the XML <a><b /></a> is not equivalent to <a></a>, and especially not <a /> (which happens to be what I get when I continue with the same exercise).
In other words, if I'm handed JSON of unknown origin and am supposed to hand it over to an unknown recepient, am I required to preserve the nulls or can I safely remove them? Conversely, can I rely on my nulls to end up in the same way I outputted them when delivered by third-party software?
Here's a similar question: Should JSON include null values – However, the question there is whether the code should output nulls if you define the format yourself, not what you should do if you don't know anything about the original format.
EDIT – Clarification: The way I asked the question was bad and apparently caused confusion. To rephrase it: I do understand that XML and JSON are different formats and are able to carry different kinds of (meta)data. I do know that null is a valid value, as defined by RFC4627. I do understand that there are different ways to convert between XML and JSON since the formats don't have a 1-to-1 relationship. I do understand that the converter I found might be buggy. However, the fact that the same converter didn't provide the same conversion in both directions (no information was lost when converting from "b": null to <b /> and a similar translation in the opposite direction would have been possible) made me wonder something that I couldn't find an answer to despite attempts:
Is it legal, according to the JSON standard, to treat {"a":{"b":null}} and {"a":{}} as one and the same object when transferring them on behalf of other software?
Note that I'm here assuming that it's legal to add or remove whitespace as I see fit (e.g. pretty-printing, which is okay according to RFC4627), and even to rearrange the name/value pairs in a collection (again according to RFC4627). I simply don't know if a null must be preserved in the same way as significant data, or can be dropped in the same way as insignificant whitespace.
Yes, null is a separate value in JSON, and is distinct from not having an attribute, obviously. Also, you can see this question about nulls in XML. The thing to conclude here isn't that there is something wrong with JSON or XML, but simply that the tools you use aren't coded to handle these cases.
One of the problems in converting JSON to XML is that if you try and make the conversion lossless, you end up with somewhat "unnatural" XML, whereas if you try to create the most natural XML representation, it ends up losing information. That's why there are lots of different converters that all do it in slightly different ways. Choose the one that meets your requirements.

Where Do I Store Hash Table or Dictionary Key Names

When I'm working with Hash Tables/Dictionaries I sometimes struggle with how to specify keys.
For example: if I create a simple Dictionary (using Python for this example),
foo = {'bar': 'baz', 'foobar': 'foobaz' }
I can access values (in other modules) with the key values: (foo['bar']) and get baz back.
In the words of Dr. Evil, "pretty standard, really."
Unfortunately, using static strings for keys tightly couples any modules using this Dictionary to its implementation. Of course, this can also apply when using other key types (e.g. Enums, Objects, etc.); anyway you slice it, all modules which access the Dictionary need to know the values for the keys.
To resolve this, I typically use static constant string values (or Enums if available in the language) for keys, and either store them publicly in the local class/module, or in a separate module/class. Therefore any changes to the dictionary keys themselves are kept in a single location.
This usually looks like this:
BAR_KEY = 'bar'
foo[BAR_KEY] = 'foobar'
Are there better ways of specifying keys such that the use of the Dictionary doesn't necessarily couple a module/class to its implementation?
Note: I've seen a few responses in SO which address this (e.g. property-to-reference-a-key-value-pair-in-a-dictionary), but the topics didn't seem to address this issue specifically. The answers were helpful, but I'd like a wider range of experience.
Why not make a class for this, which only contains properties? This is done nicely with Python (from what I know), and works well with other languages, too. Refactoring the names is trivial with today's tools, too.
In cases where I'm passing the object around and I've got known keys, I'd always prefer adding an attribute to an object. IMO, the use case of dictionaries is when you don't know what the keys are.
Python is trivial:
foo.bar=baz
Java is pretty much the same:
class Foo { public String bar="baz"; }
The Python performance would be pretty much identical, since a property lookup is just a dictionary lookup and the Java performance would be better.
I sometimes create a separate class to hold the dictionary keys. That gives them their own namespace, as well as the regular benefits of having the keys be in const strings, namely that you don't have the risk of typos, you get code completion, and the string value is easy to change. If you don't want to create a separate class, you get all of the benefits except a namespace just from have const strings.
That said, I think you're getting close to soft coding territory. If the keys in the dictionary change, it's OK for code using the dictionary to change.
Personally, I use your method. It's pretty sensible, simple, and solves the actual problem.
I usually do the same thing; if the key is always going to be the same, make a 'constant static' in whatever language to hold the key.