Get the field name that errors in Go JSON unmarshal

I have some big JSON files whose fields differ slightly in the types they contain:
{ "a":"1" }
vs.
{ "a":1 }
When I unmarshal the second one I get:
cannot unmarshal number into Go value of type string
However, since these JSON files are large, I would like to know which field is actually in error so I can fix it. The UnmarshalTypeError does not hold the struct's field name.
Does anybody know of a way to get the field name? (Debugging by hand is not an option: many different fields produce errors.)
[EDIT]
I know how to solve the type conversion. What I need is a way to see which fields I need to apply that conversion to.

The short answer is that you can't.
However, there are several ways to work around the problem:
Dive into the json.Unmarshal source code and add the information you need: copy the function into a local package, make your edits, and use that copy.
Use a third-party tool, for example a JSON validator compatible with JSON Schema: here is an online example, though there is probably a better-suited tool.

Update: UnmarshalTypeError now contains the field name, in its Field member (added in Go 1.8).

Related

Swift unable to preserve order in String made from JSON for hash verification

We receive a JSON object from the network along with a hash value of the object. To verify the hash we need to turn that JSON into a string and hash it, while preserving the order of the elements as they appear in the JSON.
Say we have:
[
{"site1":
{"url":"https://this.is.site.com/",
"logoutURL":"",
"loadStart":[],
"loadStop":[{"someMore":"smthelse"}],
"there's_more": ... }
},
{"site2":
....
}
]
The Android app is able to get the same hash value, and while debugging we fed the same simple string into both algorithms and got the same hash out of each.
The difference arises because dictionaries are an unordered structure.
While debugging we see that, just before it is fed into the hash algorithm, the string looks like the original JSON without the indentation, which means it preserves the order of the items (on Android, that is):
[{"site1":{"url":"https://this.is.site.com/", ...
I have tried many approaches by now and have not been able to achieve the same in Swift: the string I get has a different order and therefore results in a different hash. Is there a way to achieve this?
UPDATE
It appears the problem is slightly different, thanks to @Rob Napier's answer below: I need a hash of only a part of the incoming string (which has JSON in it). To get that part I first need to parse it into JSON or a struct, and after that, when getting its string value, the order of the items is lost.
Using JSONSerialization and JSONDecoder (which uses JSONSerialization), it's not possible to reproduce the input data. But this isn't needed. What you're receiving is a string in the first place (as an NSData). Just don't get rid of it. You can parse the data into JSON without throwing away the data.
It is possible to create JSON parsers from scratch in Swift that maintain round-trip support (I have a sketch of such a thing at RNJSON). JSON isn't really that hard to parse. But what you're describing is a hash of "the thing you received." Not a hash of "the re-serialized JSON."

How does Spark infer numeric types from JSON?

I'm trying to create a DataFrame from a JSON file, but when I load the data, Spark automatically infers that the numeric values are of type Long, although they are actually Integers, and that is also how I parse the data in my code.
Since I'm loading the data in a test env, I don't mind using a few workarounds to fix the schema. I've tried more than a few, such as:
Changing the schema manually
Casting the data using a UDF
Defining the entire schema manually
The issue is that the schema is quite complex and the fields I'm after are nested, which makes all of the options above irrelevant or too complex to write from scratch.
My main question is: how does Spark decide whether a numeric value is an Integer or a Long? And is there anything I can do to enforce that all/some numerics are of a specific type?
Thanks!
It's always LongType by default.
From the source code:
// For Integer values, use LongType by default.
case INT | LONG => LongType
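So you cannot change how the types are inferred. You can, however, iterate over the columns and cast them afterwards. Note that withColumn returns a new DataFrame, so the result has to be kept:
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{IntegerType, LongType}
// Cast every top-level LongType column to IntegerType; nested fields are not touched.
val casted = df.schema.fields.filter(_.dataType == LongType)
  .foldLeft(df)((acc, c) => acc.withColumn(c.name, col(c.name).cast(IntegerType)))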
It's only a snippet, but something like this should help you :)

Deserialize an anonymous JSON array?

I have an anonymous array that I want to deserialize. Here is an example showing the first array objects:
[
{ "time":"08:55:54",
"date":"2016-05-27",
"timestamp":1464332154807,
"level":3,
"message":"registerResourcePath ('', '/sap/bc/ui5_ui5/ui2/ushell/resources/')",
"details":"","component":"sap.ui.ModuleSystem"},
{"time":"08:55:54","date":"2016-05-27","timestamp":1464332154808,"level":3,"message":"URL prefixes set to:","details":"","component":"sap.ui.ModuleSystem"},
{"time":"08:55:54","date":"2016-05-27","timestamp":1464332154808,"level":3,"message":" (default) : /sap/bc/ui5_ui5/ui2/ushell/resources/","details":"","component":"sap.ui.ModuleSystem"}
]
I tried deserializing using CL_TREX_JSON_SERIALIZER, but it is broken and does not work with my JSON, here is why
Then I tried /UI2/CL_JSON, but it needs a "structure" that perfectly fits the object given by the JSON. "Structure" means in my case an internal table of objects with the attributes time, date, timestamp, level, message and details. And there was the problem: it does not properly handle references and uses the class description to describe the field assigned to the field-symbol. Since I cannot have a list of objects but only a list of references to objects, that solution doesn't work either.
As a third attempt I tried CALL TRANSFORMATION as described by Horst Keller, but with this method I was not able to read in an anonymous array, and here is why
My major points:
I do not want to change the JSON, since that is what I get from sap.ui.log
I prefer to use built-in functionality rather than a third-party framework
Your problem stems not from the array being anonymous but from a quirk of the SAP JSON (de)serializer, which does not accept the double quotes that enclose JSON attribute names. The issue is thoroughly described in this answer.
If you don't want to change your JSON on the fly, the only option is to modify the CL_TREX_JSON_DESERIALIZER class like this.
/UI5/CL_JSON_PARSER parses JSON of unknown format.
Note that it has "for internal use" written on it so many times that you should probably take it seriously and clone its code so that it cannot change underneath you.

How to fix invalid/truncated JSON?

I have a problem with JSON. Before we discovered the error, the column 'data' in our MySQL database, where serialized JSON is stored, had the type VARCHAR(255). It worked well for about two months, but when the project started to grow, 255 characters became insufficient, and we forgot to change the type to TEXT. Now our serialized JSON is truncated to 255 characters and is invalid. I don't care about the lost data, but I need to turn what is left into minimal parsable/valid JSON.
For example:
data = '{"state_id":[null,20],"dispatcher_id":[null,6057525],"uir":[null,{"level":"2"'
I need to make it valid, like this:
data = '{"state_id":[null,20],"dispatcher_id":[null,6057525],"uir":[null,{"level":"2"}]}'
That is, add }]} at the end of the JSON.
Is there a quick way to do it, or should I write my own parser/fixer?
You can use jsonlint.
It's a pure JavaScript implementation for validating JSON.
From the validation result you'll know what is missing at the end and can add the missing characters.
You can also check a demo online.
I've done this task, so I want to share my solution

How to make a WCF DataMember dynamic?

Example: a Customer class has 100 data members (id, name, age, address, etc.) to be serialized to JSON.
In a config file such as Web.config, I would like to set a list of the members to include in the JSON output.
If the list contains only id and name, then the JSON has only id and name.
My question: does a DataContract support dynamic DataMembers?
You mean optional data members, I guess. If so, check this question.
You will certainly need null values for the ones you don't want to send over the wire.
Another, dirtier solution would be to use a dictionary as a data member and put the fields you want to send in it as elements. There may be type conversion issues, but maybe it serves you better.
Edit:
You probably want to go with a dictionary serialized as an associative array in JS, as this question specifies. Check the answers and the links in there. That should get you going.
Still, I'd go with optional data members, since that is more of a "contract" thing. Beyond that, a better description of what you want to do would help.