JDT transform to modify N-th array element

I am trying to apply a JDT transform to a JSON document in order to modify a property of the N-th array element. Is that possible without having to replace the entire element or even the entire array?
{
"array": [
{
"name": "A",
"value": 0
},
{
"name": "B",
"value": 3.14
}
]
}
Is there a transform that gets me to the following? I want to alter the 2nd array element and only its "value" property. I don't want to search for it by "name" but rather access by index.
{
"array": [
{
"name": "A",
"value": 0
},
{
"name": "B",
"value": 12345678
}
]
}

The challenge
This transform is easy with a general-purpose JSON library. If your parsed object is called foo, you mainly want to do something like foo.array[1].value = 12345678, without any kind of looping.
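For comparison, here is a minimal sketch of that indexed update outside JDT, written in Python with only the standard json module. The variable names are illustrative, and the sample keys are quoted so the text parses as strict JSON.

```python
import json

# Sample document from the question (keys quoted so it is strict JSON).
source = """
{
  "array": [
    {"name": "A", "value": 0},
    {"name": "B", "value": 3.14}
  ]
}
"""

foo = json.loads(source)

# Index 1 is the 2nd element; only its "value" property is changed.
foo["array"][1]["value"] = 12345678

print(json.dumps(foo, indent=2))
```

The first element and the "name" of the second element are left untouched, which is exactly what the question asks JDT to do.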
The JDT way
I found the question "How to use SlowCheetah to transform an array elements in Json config file?", which asks:
For example, if my base config file has this setting:
{
"Settings" : [1, 2, 3]
}
and I want to transfer it to:
{
"Settings" : [4, 5, 6]
}
The solution by Collin K was:
{
"@jdt.replace": {
"@jdt.path": "$.Settings",
"@jdt.value": [4,5,6]
}
}
This suggests that you actually need to replace the whole array.
Digging further led me to an open issue in the JDT repository that seems to confirm this assumption.
Disclaimer
I have not used JDT myself, but I have been struggling with nested JSONs of various kinds e.g. with Elasticsearch.
Further references
https://github.com/microsoft/json-document-transforms/wiki/Replace-Transformation
Using jq to update objects within a JSON document if the value corresponding to a given key starts with a specified string - the JQ way I would use

AWS Glue Crawler - DynamoDB Export - Get attribute names in schema instead of struct

I've defined a default crawler on the data directory of an export from DynamoDB. I'm trying to get it to give me a structured table instead of a table with a single column of type struct. What do I have to do to get the actual column names in there? I've tried adding custom classifiers and different path expressions, but nothing seems to work, and I feel like I'm missing something really obvious.
I'm using the crawler builder inside of glue, which doesn't seem to offer much customization.
Here's the schema from the table generated by the default crawler:
And here's one of the items that I've exported from DynamoDB:
{
"Item": {
"the_url": {
"S": "/2021/07/06/****redacted****.html"
},
"as_of_when": {
"S": "2021-09-01"
},
"user_hashes": {
"SS": [
"****redacted*****"
]
},
"user_id_hashes": {
"SS": [
"u3MeXDcpQm0ACYuUv6TMrg=="
]
},
"accumulated_count": {
"N": "1"
},
"today_count": {
"N": "1"
}
}
}
The way Athena interprets JSON data means that your data has only a single column, Item. Athena doesn't have any mechanism to map arbitrary parts of a JSON object to columns; it can only map top-level attributes to columns.
If you want other parts of the objects as columns you will either have to create a new table with transformed data, or create a view with the attributes as columns, e.g.
CREATE OR REPLACE VIEW attributes_as_top_level_columns AS
SELECT
item.the_url.S AS the_url,
CAST(item.as_of_when.S AS DATE) AS as_of_when,
item.user_hashes.SS AS user_hashes,
item.user_id_hashes.SS AS user_id_hashes,
item.accumulated_count.N AS accumulated_count,
item.today_count.N AS today_count
FROM items
In the example above I've also flattened the data type keys (S, SS, N) and I converted the date string to a date.
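The other option mentioned above, creating a new table from transformed data, needs the same flattening applied to each exported record. Here is a minimal sketch in Python of that per-record step; the function name flatten_item is hypothetical, and like the view it only strips the type keys and parses the date string, leaving N values as strings.

```python
from datetime import date


def flatten_item(record):
    """Turn {"Item": {attr: {type_key: value}}} into {attr: value}."""
    flat = {}
    for name, typed in record["Item"].items():
        # Each attribute wraps its value in exactly one type key (S, SS, N).
        (_type_key, value), = typed.items()
        flat[name] = value
    return flat


# One exported item, shaped like the example in the question.
record = {
    "Item": {
        "the_url": {"S": "/2021/07/06/****redacted****.html"},
        "as_of_when": {"S": "2021-09-01"},
        "user_hashes": {"SS": ["****redacted*****"]},
        "user_id_hashes": {"SS": ["u3MeXDcpQm0ACYuUv6TMrg=="]},
        "accumulated_count": {"N": "1"},
        "today_count": {"N": "1"},
    }
}

row = flatten_item(record)
# Mirror the CAST(... AS DATE) in the view.
row["as_of_when"] = date.fromisoformat(row["as_of_when"])
print(row)
```

Writing rows like this back out as JSON lines would give the crawler top-level attributes to map to columns.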

Stream analytics parse json, same key can be array or not

An XML document is converted to JSON and sent to an Event Hub, and then a Stream Analytics job processes it.
The problem is that when the XML repeats the same tag name, the tag is converted to a list on the JSON side, but when the tag occurs only once it is not converted to a list. So the same tag can be an array or not.
Ex:
I can receive either:
{
"k1": 123,
"k2": {
"l1": 2,
"l2": 12
}
}
or:
{
"k1": 123,
"k2": [
{
"l1": 2,
"l2": 12
},
{
"l1": 3,
"l2": 34
}
]
}
I can easily deal with the first scenario and the second scenario independently, but I don't know how to deal with both at the same time, is this possible?
Yes, it is. If you know how to deal with each of the cases individually, I will just suggest an idea of how you can make the distinction between these two cases, before you treat them individually.
Essentially, the idea is to check whether the field is an array. I wrote a JavaScript UDF that returns "true" or "false" depending on whether the passed object is an array:
function UDFSample(arg1) {
'use strict';
var isArray = Array.isArray(arg1);
return isArray.toString();
}
Here is how you can use it in the query, assuming the function is registered under the alias IsArray:
with test as (SELECT Document from input where UDF.IsArray(k2) = 'true')
Now "test" contains the items that you can treat as an array. You can do the same for the case where k2 is just an object, by comparing against 'false'.
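The underlying branching logic can be shown outside Stream Analytics as well. This is a minimal sketch in Python, not Stream Analytics code: the helper as_list is a hypothetical name, and it wraps a single object in a list so both payload shapes go through one code path.

```python
def as_list(value):
    """Wrap a single object in a list; return lists unchanged."""
    return value if isinstance(value, list) else [value]


# The two payload shapes from the question.
single = {"k1": 123, "k2": {"l1": 2, "l2": 12}}
multiple = {
    "k1": 123,
    "k2": [{"l1": 2, "l2": 12}, {"l1": 3, "l2": 34}],
}

for event in (single, multiple):
    for entry in as_list(event["k2"]):
        print(event["k1"], entry["l1"], entry["l2"])
```

Normalizing to a list first means the array-handling logic is the only one that has to be written.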

Translating JSON values using io.circe

I have a function in scala that translates a value and produces a string.
strOut = translate(strIn)
Suppose the following JSON object:
{
"id": "c730433b-082c-4984-3d56-855c243265f0",
"standard": "stda",
"timestamp": "tsx000",
"stdparms" : {
"stdparam1": "a",
"stdparam2": "b"
}
}
and the following mapping provided by the translation function:
"stda" -> "stdb"
"tsx000" -> "tsy000"
"a" -> "f"
"b" -> "g"
What is the best way to translate the whole JSON object using the translate function? My goal is to obtain the following result:
{
"id": "c730433b-082c-4984-3d56-855c243265f0",
"standard": "stdb",
"timestamp": "tsy000",
"stdparms" : {
"stdparam1": "f",
"stdparam2": "g"
}
}
I must use the io.circe library due to project related matters.
If you know beforehand which fields you want to translate, or what translations apply to that field, you can use Cursors to traverse the JSON tree. Or if the fields themselves are fixed (you always know what fields to expect) Optics may require less code.
When you get to the right leaf, you apply the translation.
However, when you don't know in advance which translation applies to which field, it might be easier to find and replace on the serialized string using string methods.
Note that the JSON you provided as an example is not valid JSON by the way.
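The "walk to the right leaf, then translate" idea can be sketched independently of circe. The following is plain Python, not circe code: translate is a hypothetical stand-in built from the mapping in the question, and PATHS is an assumed list of the fields known beforehand.

```python
import copy

# Hypothetical stand-in for the question's translate function.
MAPPING = {"stda": "stdb", "tsx000": "tsy000", "a": "f", "b": "g"}


def translate(text):
    return MAPPING.get(text, text)


# Paths of the fields to translate, assumed to be known beforehand.
PATHS = [
    ("standard",),
    ("timestamp",),
    ("stdparms", "stdparam1"),
    ("stdparms", "stdparam2"),
]


def translate_paths(doc, paths):
    """Return a copy of doc with the leaf at each path translated."""
    result = copy.deepcopy(doc)
    for path in paths:
        node = result
        for key in path[:-1]:  # descend to the parent of the leaf
            node = node[key]
        node[path[-1]] = translate(node[path[-1]])
    return result


doc = {
    "id": "c730433b-082c-4984-3d56-855c243265f0",
    "standard": "stda",
    "timestamp": "tsx000",
    "stdparms": {"stdparam1": "a", "stdparam2": "b"},
}

print(translate_paths(doc, PATHS))
```

Listing the paths explicitly keeps fields such as "id" out of the translation, which a blanket find/replace would not guarantee.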

What is the JsonPath syntax used by Json.NET?

I'm trying to select some nodes with Json.NET's SelectTokens, but it doesn't seem to support the same syntax that the original JSONPath supports. Given this input:
{
"a": [
{
"id": 1
}
],
"b": [
{
"id": 2
},
{
"id": 3,
"c": {
"id": 4
}
}
],
"d": [
{
"id": 5
}
]
}
I want the ids of all top-level objects inside a and b only, but not of inner objects. Using Goessner's parser I'm able to do it with $.[a,b].*.id, which returns [1, 2, 3].
Json.NET doesn't seem to support either the comma or the *. How can this be achieved with Json.NET, and is there any reference for what is supported by Json.NET's JSONPath selectors?
The following path will work with Json.NET 10.0.2:
var path = @"$.['a','b'][*].id";
This path seems consistent with the original JSONPath article, which states:
JSONPath expressions can use the dot–notation
$.store.book[0].title
or the bracket–notation
$['store']['book'][0]['title']
Specifically:
Names inside brackets are shown to be quoted. Presumably doing so distinguishes between indices and numeric names.
Array indices are always shown to be in brackets rather than between periods.
Sample fiddle.
(Honestly, the original article is somewhat vague and allows for variation in implementation. For instance, exactly what does script expression, using the underlying script engine mean?)
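To make the intended selection concrete, here is what the path is expected to pick out, written as a plain Python traversal over the question's input. It does not use any JSONPath library; it only spells out "for each of the keys a and b, take every array element's id".

```python
doc = {
    "a": [{"id": 1}],
    "b": [{"id": 2}, {"id": 3, "c": {"id": 4}}],
    "d": [{"id": 5}],
}

# Only keys a and b, every element of each array, and only the
# element's own id (the nested c.id is not visited).
ids = [obj["id"] for key in ("a", "b") for obj in doc[key]]
print(ids)  # [1, 2, 3]
```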

Groovy compare two json with unknown nodes names and values

I have a REST API to test and I have to compare two JSON responses. Below you can find the structure of the file. Both files to compare should contain the same elements, but the order might be different. Unfortunately the names, the types (simple, array) and the number of keys (root, nodeXYZ) are also not known.
{"root": [{
"node1": "value1",
"node2": "value1",
"node3": [
{
"node311": "value311",
"node312": "value312"
},
{
"node321": "value321",
"node322": "value322"
}
],
"node4": [
{
"node411": "value411",
"node412": "value413",
"node413": [ {
"node4131": "value4131",
"node4132": "value4131"
}],
"node414": []
},
{
"node421": "value421",
"node422": "value422",
"node423": [ {
"node4231": "value4231",
"node4232": "value4231"
}],
"node424": []
}],
"node5": [
{"node51": "value51"},
{"node52": "value52"}
]
}]}
I have found some useful information in
Groovy - compare two JSON objects (same structure) and return ArrayList containing differences
Getting node from Json Response
Groovy : how do i search json with key's value and find its children in groovy
but I could not combine it into a solution.
I thought the solution might look like this:
take root
get root children names
check if child has children and get their names
do it down to the lowest-level child
With all names in place comparing should be easy (I guess)
Unfortunately I did not manage to get keys under root
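Just compare the slurped maps (map equality ignores key order, but list elements are compared in order):
import groovy.json.JsonSlurper

// document1 and document2 hold the two JSON response strings
def map1 = new JsonSlurper().parseText(document1)
def map2 = new JsonSlurper().parseText(document2)
assert map1 == map2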
Try the JSONassert library: https://github.com/skyscreamer/JSONassert. Then you can use:
JSONAssert.assertEquals(expectedJson, actualJson, JSONCompareMode.STRICT)
And you will get nicely formatted deltas like:
java.lang.AssertionError: Resources.DbRdsLiferayInstance.Properties.KmsKeyId
Expected: kms-key-2
got: kms-key
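Since the question says the order of elements may differ, a comparison that ignores array order can also be sketched by hand. The following is a minimal Python illustration, not Groovy or JSONassert code: the helper names canonical and same_json are hypothetical, and arrays are treated as unordered collections, which is an assumption that only fits when element order carries no meaning.

```python
import json


def canonical(node):
    """Recursively sort every list so element order no longer matters."""
    if isinstance(node, dict):
        return {key: canonical(value) for key, value in node.items()}
    if isinstance(node, list):
        items = [canonical(item) for item in node]
        # Sort by a stable text form so mixed element types stay comparable.
        return sorted(items, key=lambda item: json.dumps(item, sort_keys=True))
    return node


def same_json(text1, text2):
    """Compare two JSON strings, ignoring key order and array order."""
    return canonical(json.loads(text1)) == canonical(json.loads(text2))


first = '{"root": [{"n1": "v1", "n2": [{"x": 1}, {"y": 2}]}]}'
second = '{"root": [{"n2": [{"y": 2}, {"x": 1}], "n1": "v1"}]}'
print(same_json(first, second))
```

No key names need to be known in advance, because the walk simply follows whatever dictionaries and lists the parsed documents contain.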