Parse a large file of non-schematized JSON using Jackson?

I have a very large .json file on disk. I want to instantiate this as a Java object using the Jackson parser.
The file looks like this:
[ { "prop1": "some_value",
"prop2": "some_other_value",
"something_random": [
// ... arbitrary list of objects containing key/value
// pairs of differing amounts and types ...
]
},
// ... repeated many times ...
{
}
]
Basically it's a big array of objects. Each object has two string properties that identify it, plus an inner array of objects, each of which is an arbitrary collection of properties whose values are mostly strings and ints but may also be arrays.
Due to this object layout, there isn't a set schema I can use to easily instantiate these objects. Using the org.json processor requires attempting to allocate a string for the entire file, which often fails due to its size. So I'd like to use the streaming parser, but I am completely unfamiliar with it.
What I want in the end is a Map<String, SomeObject> where the String is the value of prop1 and SomeObject is something that holds the data for the whole object (top-level array entry). Perhaps just the JSON, which can then be parsed later when it is needed?
Anyway, ideas on how to go about writing the code for this are welcome.

Since you do not want to bind the whole thing as a single object, you probably want to use the readValues() method of ObjectReader. And if the structure of the individual values is fairly generic, you may want to bind them either as java.util.Maps or as JsonNodes (Jackson's tree model). So you would do something like:
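import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectReader;
import java.io.File;
import java.util.Map;

ObjectMapper mapper = new ObjectMapper();
// or JsonNode.class; older Jackson versions use mapper.reader(Map.class)
ObjectReader reader = mapper.readerFor(Map.class);
MappingIterator<Map<String, Object>> it = reader.readValues(new File("stuff.json"));
while (it.hasNextValue()) {
    Map<String, Object> m = it.nextValue();
    // do something; like determine real type to use and:
    OtherType value = mapper.convertValue(m, OtherType.class);
}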
to iterate over the whole thing.
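For the Map keyed by prop1 that the question asks for, a minimal sketch along the same lines could bind each top-level array entry as a JsonNode and index it as it is read. The class name Prop1Index and the method indexByProp1 are hypothetical, and the sketch assumes every entry carries a textual prop1 field as in the sample above:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.MappingIterator;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.io.Reader;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Jackson.
public class Prop1Index {

    // Reads a top-level JSON array one element at a time and indexes
    // each element by the text of its "prop1" field.
    public static Map<String, JsonNode> indexByProp1(Reader json) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        Map<String, JsonNode> index = new LinkedHashMap<>();
        try (MappingIterator<JsonNode> it =
                 mapper.readerFor(JsonNode.class).readValues(json)) {
            while (it.hasNextValue()) {
                JsonNode entry = it.nextValue();
                // path() yields a "missing" node (asText() == "") rather than
                // null when the field is absent; assumed not to happen here.
                String key = entry.path("prop1").asText();
                index.put(key, entry);
            }
        }
        return index;
    }
}
```

This avoids allocating one string for the whole file, but the map still holds every entry in memory as a tree. If that is too much, the loop body could instead store something smaller per entry, such as only the fields that are needed.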

Related

GSON | Extract JSON's Root Name | JsonPath Or JsonPointer

I am looking to extract the name of the root element of a JSON document. It looks like this is possible with neither JsonPointer nor JsonPath, as my attempts to find such an expression have been unsuccessful. Any tips would be appreciated. TIA.
Sample document:
{
"MESSAGE1_ROOT_INPUT": {
"CTRL_SEG": "test"
}
}
The below using gson 2.9.0:
$.*~
produces:
{"CTRL_SEG": "test"}
while JSONPath Online produces this:
[
"MESSAGE1_ROOT_INPUT"
]
The goal is to get the text "MESSAGE1_ROOT_INPUT" using a JsonPath/JsonPointer expression. Note that extracting it the traditional way (substring or regex on the stringified JSON text) would be my last resort.
Background: We are building an API service that accepts JSON documents with different roots, such as MESSAGE2_ROOT_INPUT, MESSAGE3_ROOT_INPUT, etc. The further routing of a message is based on this root name.
Supported/Employed Languages: Java/GSON Library/RegEx
Gson does not natively support JSONPath or JSON Pointer. However, you can quite efficiently obtain the name of the first property using JsonReader:
public static String getFirstPropertyName(Reader reader) throws IOException {
// Don't have to call JsonReader.close(); that would just close the provided reader
JsonReader jsonReader = new JsonReader(reader);
jsonReader.beginObject();
return jsonReader.nextName();
}
There are however two things to keep in mind:
This only reads the beginning of the JSON document; it neither verifies that the complete JSON document has valid syntax, nor checks whether there are more top-level properties.
This consumes some data from the Reader; to process the data further you have to buffer it so that it can be re-read (you can also first store the JSON in a String and pass a StringReader to JsonReader).
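As a rough sketch of the String-plus-StringReader variant mentioned in the second point: the document is held in a String, the root name is read with a JsonReader, and the same String is then parsed again to get at the payload. The class name RootNameRouter and the method payloadOf are hypothetical, and JsonParser.parseString assumes Gson 2.8.6 or later (the question uses 2.9.0):

```java
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import com.google.gson.stream.JsonReader;

import java.io.IOException;
import java.io.StringReader;

// Hypothetical helper, not part of Gson.
public class RootNameRouter {

    // Reads only the start of the document and returns the first property name.
    public static String getFirstPropertyName(String json) throws IOException {
        JsonReader jsonReader = new JsonReader(new StringReader(json));
        jsonReader.beginObject();
        return jsonReader.nextName();
    }

    // Parses the same String a second time, in full, and returns the object
    // stored under the root name. Assumes the root value is a JSON object.
    public static JsonObject payloadOf(String json) throws IOException {
        String root = getFirstPropertyName(json);
        return JsonParser.parseString(json).getAsJsonObject().getAsJsonObject(root);
    }
}
```

The second pass parses the whole document, so syntax errors that the first pass skipped over surface there.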

How to parse nested json-array in streaming fashion with zio-json

For a json array like this:
[
my-json-obj1,
my-json-obj2,
my-json-obj3,
....
my-json-objN
]
With a MyJsonObj class that represents the mapping of a single object in the array, I can say:
val myJson = """[...]"""
ZStream
.fromIterable(myJson.toSeq)
.via(JsonDecoder[MyJsonObj].decodeJsonPipeline(JsonStreamDelimiter.Array))
to parse that array in a "streaming" way, i.e. emit mapped objects as they are parsed from the input, as opposed to reading all the input first and then extracting the objects.
How can I do the same if the array is nested inside a json object say like this?:
{
"hugeArray":
[
my-json-obj1,
my-json-obj2,
my-json-obj3,
....
my-json-objN
]
}
I trawled through the zio-json source code, but I can't find any foothold there for this use case. I guess I could carve that array out of the JSON document and feed it to decodeJsonPipeline. Is there any better, JSON-syntax-aware way of doing this? If not directly in zio-json, perhaps with the help of some other open source JSON library?

CPP REST SDK JSON - How to create JSON w/ Array and write to file

I'm having trouble with the JSON classes of the CPP REST SDK. I can't figure out when to use json::value, json::object and json::array. The latter two especially seem very alike. The usage of json::array is also rather unintuitive to me. Finally, I want to write the JSON to a file, or at least to std::cout, so I can check that it is correct.
It was way easier for me to use json-spirit, but since I want to make REST requests later on I thought I'd save myself the string/wstring madness and use the JSON classes of the CPP REST SDK.
What I want to achieve is a JSON file like this:
{
"foo-list" : [
{
"bar" : "value1",
"bob" : "value2"
}
]
}
This is the code I tried:
json::value arr;
int i{0};
for(auto& thing : things)
{
json::value obj;
obj[L"bar"] = json::value::string(thing.first);
obj[L"bob"] = json::value::string(thing.second);
arr[i++] = obj;
}
json::value result;
result[L"foo-list"] = arr;
Do I really need this extra counter variable i? Seems rather inelegant. Would using json::array/json::object make things nicer? And how do I write my JSON to a file?
This could help you:
json::value output;
output[L"foo-list"][L"bar"] = json::value::string(utility::conversions::to_utf16string("value1"));
output[L"foo-list"][L"bob"] = json::value::string(utility::conversions::to_utf16string("value2"));
output[L"foo-list"][L"bobList"][0] = json::value::string(utility::conversions::to_utf16string("bobValue1"));
output[L"foo-list"][L"bobList"][1] = json::value::string(utility::conversions::to_utf16string("bobValue1"));
output[L"foo-list"][L"bobList"][2] = json::value::string(utility::conversions::to_utf16string("bobValue1"));
If you want to create a list, like bobList, you really do need some index variable.
Otherwise you will only get a bunch of separate values.
For output to the console use
// serialize() returns a wide string here (the keys use L"..."), so print it with std::wcout
std::wcout << output.serialize() << std::endl;
And finally, this will lead to
{
"foo-list":{
"bar":"value1",
"bob":"value2",
"bobList":[
"bobValue1",
"bobValue1",
"bobValue1"
]
}
}
-- To answer your first question: in a json::array, values are stored at ordered indices, whereas a json::object maps string keys to values. Arrays can therefore be traversed efficiently, while an object is more like a hashmap, where each key has to be looked up to reach its value. This makes arrays the efficient choice when traversing a large JSON document.
-- To answer your second question. As you mentioned you need to create something like this.
{
"foo-list" : [
{
"bar" : "value1",
"bob" : "value2"
}
]
}
-- If the value of the key foo-list were a json::object with the two members "bar":"value1" and "bob":"value2" (that is, curly braces instead of the square brackets above), it could be implemented as
result[U("foo-list")][U("bar")] = json::value::string(U("value1"));
result[U("foo-list")][U("bob")] = json::value::string(U("value2"));
-- But here the json::object with the two members "bar":"value1" and "bob":"value2" sits at index 0 of a json::array, and this array is the value of the key foo-list. Thus you need the index to implement something like
result[U("foo-list")][0][U("bar")] = json::value::string(U("value1"));
result[U("foo-list")][0][U("bob")] = json::value::string(U("value2"));
-- To answer your third question: as correctly pointed out by @Zdeno, you can use serialize() to convert the json::value to a string and write that string to the file.

Changing an immutable object F#

I think the title of this is wrong but can't create a title that reflects, in the abstract, what I want to achieve.
I am writing a function which calls a service and retrieves data as a JSON string. The function parses the string with a JSON type provider. Under certain conditions I want to amend properties on that JSON object and then return the string of the amended object. So if the response from the call was
{"property1" : "value1","property2" : "value2", "property3": "value3" }
I want to change property3 to a new value and then return the JSON string.
If the JsonProvider was mutable this would be an exercise like:
type JsonResponse =
JsonProvider<""" {"property1" : "value1",
"property2" : "value2",
"property3": "value3" } """>
let jsonResponse = JsonResponse.Parse(response)
jsonResponse.Property3 <- "new value"
jsonResponse.ToString()
However, this does not work as the property cannot be set. I am trying to ascertain the best way to resolve this. I am quite happy to initialise a new object based on the original response but with amended parameters but I am not sure if there is an easy way to achieve this.
For reference, the JSON object is much more involved than the flat example given and contains a deep hierarchy.
Yes, you would need to create a new object, changing the bits you want and using the existing object's values for the rest. We added write APIs for both the XML and JSON type providers a while back. You will notice the types representing your JSON have constructors on them. You can see an example of this in use at the bottom of this link

Mapping unpredictable keys from JSON to POJO

I am using the Jackson JSON library to map JSON streams into POJOs.
My JSON's keys have unpredictable names.
For example:
{
"Random_ID":
{
"Another_Random_ID":
{
"some_key": "value",
"some_key1": "value1"
}
}
...
}
I would like to map this request to a POJO (with the same structure); however, the mapper will fail since there is no such setXXX (where XXX is a random ID, since I cannot predict the name).
What would be the best way to map this request to the corresponding object without manually parsing it with createJsonParser?
If names are unpredictable, POJOs are not the way to go.
But you can use the Tree Model, like:
JsonNode root = objectMapper.readTree(jsonSource);
and access it as a logical tree. Also, if you do want to convert the tree (or any sub-tree, identified by the node at its root) to a POJO, you can do:
MyPOJO pojo = objectMapper.treeToValue(node, MyPOJO.class);
and back to a tree:
JsonNode node = objectMapper.valueToTree(pojo);
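Putting the two together for the JSON in the question, a minimal sketch might walk the two levels of random IDs in the tree and bind only the innermost objects to a POJO. The names RandomKeyWalker, Leaf and collectLeaves are hypothetical; the sketch assumes exactly two levels of unpredictable keys and leaf objects with the some_key and some_key1 properties from the sample. It uses JsonNode.fields() from Jackson 2.x:

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.io.IOException;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Jackson.
public class RandomKeyWalker {

    // POJO for the innermost object, whose property names are known.
    public static class Leaf {
        public String some_key;
        public String some_key1;
    }

    // Returns the leaves keyed by "outerId/innerId".
    public static Map<String, Leaf> collectLeaves(String json) throws IOException {
        ObjectMapper mapper = new ObjectMapper();
        JsonNode root = mapper.readTree(json);
        Map<String, Leaf> leaves = new LinkedHashMap<>();
        Iterator<Map.Entry<String, JsonNode>> outer = root.fields();
        while (outer.hasNext()) {
            Map.Entry<String, JsonNode> outerEntry = outer.next();
            Iterator<Map.Entry<String, JsonNode>> inner = outerEntry.getValue().fields();
            while (inner.hasNext()) {
                Map.Entry<String, JsonNode> innerEntry = inner.next();
                // Only the part with predictable names is bound to a POJO.
                Leaf leaf = mapper.treeToValue(innerEntry.getValue(), Leaf.class);
                leaves.put(outerEntry.getKey() + "/" + innerEntry.getKey(), leaf);
            }
        }
        return leaves;
    }
}
```

The random IDs end up as map keys rather than as POJO property names, which is why no setXXX method is needed for them.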