jOOQ JSON formatting as array of objects

I have the following (simplified) jOOQ query:
val result = context.select(
jsonObject(
key("id").value(ITEM.ID),
key("title").value(ITEM.NAAM),
key("resources").value(
jsonArrayAgg(ITEM_INHOUD.RESOURCE_ID).absentOnNull()
)
)
).from(ITEM).fetch()
Now the output that I want is:
[
{
"id": "0da04cc5-f70c-4fb3-b5c7-dc645d342631",
"title": "Title1",
"resources": [
"8b0f6d5c-67fc-47ca-be77-d1735e7721ce",
"ea0316db-1cfd-46d7-8260-5c1a4e65a0cd"
]
},
{
"id": "0f7e67e6-5187-47e2-9f1d-dab08feba38b",
"title": "Title2"
}
]
result.formatJSON() gives the following output:
{
"fields": [
{
"name": "json_object",
"type": "JSON"
}
],
"records": [
[
{
"id": "0da04cc5-f70c-4fb3-b5c7-dc645d342631",
"title": "Title 1"
}
]
]
}
Disabling the headers with result.formatJSON(JSONFormat.DEFAULT_FOR_RECORDS) will get me:
[
[
{
"id": "0da04cc5-f70c-4fb3-b5c7-dc645d342631",
"title": "Title1",
"resources": [
"8b0f6d5c-67fc-47ca-be77-d1735e7721ce",
"ea0316db-1cfd-46d7-8260-5c1a4e65a0cd"
]
}
],
[
{
"id": "0f7e67e6-5187-47e2-9f1d-dab08feba38b",
"title": "Title2"
}
]
]
where I don't want the extra array.
Further customizing the JSONformatter with result.formatJSON(JSONFormat().header(false).recordFormat(JSONFormat.RecordFormat.OBJECT)) I get:
[
{
"json_object": {
"id": "0da04cc5-f70c-4fb3-b5c7-dc645d342631",
"title": "Title1",
"resources": [
"8b0f6d5c-67fc-47ca-be77-d1735e7721ce",
"ea0316db-1cfd-46d7-8260-5c1a4e65a0cd"
]
}
},
{
"json_object": {
"id": "0f7e67e6-5187-47e2-9f1d-dab08feba38b",
"title": "Title2"
}
}
]
where I don't want the object wrapped in json_object.
Is there a way to get the output I want?

Doing it with Result.formatJSON()
This is clearly a flaw in the jOOQ 3.14.0 implementation of Result.formatJSON(). In the special case where there is only one column, and that column is of type JSON or JSONB, the column name may not really matter, and thus its contents should be flattened into the object describing the row. I've created a feature request for this: https://github.com/jOOQ/jOOQ/issues/10953. It will be available in jOOQ 3.15.0 and 3.14.4. You will be able to do this:
result.formatJSON(JSONFormat().header(false).wrapSingleColumnRecords(false));
The RecordFormat is irrelevant here. This works the same way for RecordFormat.ARRAY and RecordFormat.OBJECT.
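On jOOQ versions that lack this flag, the same flattening could be applied on the client after formatting. The following is a minimal sketch in Python, not jOOQ API: it assumes the nested-array shape shown in the question for JSONFormat.DEFAULT_FOR_RECORDS, and the function name unwrap_single_column and the shortened sample data are hypothetical.

```python
import json

# Hypothetical sample mirroring the shape from the question:
# one inner array per record, each holding a single JSON column.
formatted = """
[
  [{"id": "0da04cc5-f70c-4fb3-b5c7-dc645d342631", "title": "Title1",
    "resources": ["8b0f6d5c-67fc-47ca-be77-d1735e7721ce"]}],
  [{"id": "0f7e67e6-5187-47e2-9f1d-dab08feba38b", "title": "Title2"}]
]
"""


def unwrap_single_column(records):
    """Replace each one-element record array with its only element."""
    unwrapped = []
    for record in records:
        if len(record) != 1:
            raise ValueError("expected exactly one column per record")
        unwrapped.append(record[0])
    return unwrapped


flat = unwrap_single_column(json.loads(formatted))
print(json.dumps(flat, indent=2))
```

The result is a plain array of objects, which is the shape asked for in the question.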
Doing it directly with SQL
Of course, you can always work around this by moving all the logic into SQL. You probably simplified your query by omitting a JOIN and GROUP BY. I'm assuming this is equivalent to what you want:
JSON result = context.select(
jsonArrayAgg(jsonObject(
key("id").value(ITEM.ID),
key("title").value(ITEM.NAAM),
key("resources").value(
select(jsonArrayAgg(ITEM_INHOUD.RESOURCE_ID).absentOnNull())
.from(ITEM_INHOUD)
.where(ITEM_INHOUD.ITEM_ID.eq(ITEM.ID))
)
))
).from(ITEM).fetchSingle().value1()
Note that JSON_ARRAYAGG() aggregates empty sets into NULL, not into an empty []. If that's a problem, use COALESCE() to substitute an empty array.

Using Recursive feature while Flattening in Snowflake

I have a JSON string that needs to be parsed in order to retrieve particular values. Here is an example I am working with:
{
"assignable_type": "SHIPMENT",
"rule": {
"rules": [
{
"meta_data": {},
"rules": [
{
"op": "IN",
"target": "CLIENT_FID",
"type": "ARRAY_VALUE_ASSERTION",
"values": [
"flx::core:client:dbid/64171",
"flx::core:client:dbid/76049",
"flx::core:client:dbid/34040",
"flx::core:client:dbid/61806"
]
}
],
"type": "AND"
}
],
"type": "OR"
},
"type": "USER_DEFINED"
}
The goal is to get the values where "target" is "CLIENT_FID".
The expected output for this JSON is:
["flx::core:client:dbid/64171",
"flx::core:client:dbid/76049",
"flx::core:client:dbid/34040",
"flx::core:client:dbid/61806"]
Here, as we can see, rules is a list of dictionaries, and the lists can be nested, as in the example.
Similarly, we have other JSON documents of the following type:
{
"assignable_type": "SHIPMENT",
"rule": {
"rules": [
{
"meta_data": {},
"rules": [
{
"op": "IN",
"target": "PORT_OF_ENTRY_FID",
"type": "ARRAY_VALUE_ASSERTION",
"values": [
"flx::core:port:dbid/566788",
"flx::core:port:dbid/566931",
"flx::core:port:dbid/561482"
]
}
],
"type": "AND"
},
{
"meta_data": {},
"rules": [
{
"op": "IN",
"target": "PORT_OF_LOADING_FID",
"type": "ARRAY_VALUE_ASSERTION",
"values": [
"flx::core:port:dbid/561465"
]
},
{
"op": "IN",
"target": "SHIPMENT_MODE",
"type": "ARRAY_VALUE_ASSERTION",
"values": [
0
]
},
{
"op": "IN",
"target": "CLIENT_FID",
"type": "ARRAY_VALUE_ASSERTION",
"values": [
"flx::core:client:dbid/28169"
]
}
],
"type": "AND"
}
],
"type": "OR"
},
"type": "USER_DEFINED"
}
For the second example, the expected output should be:
["flx::core:client:dbid/28169"]
As seen, we may need to read the values at different depths in the document. To address this, I used the following code:
/* first convert the string to a JSON object in cte1 */
with cte1 as (
select to_json(json_string) as json_rep,
parse_json(json_extract_path_text(json_rep, 'rule.rules')) as list_elem
from table 1),
cte2 as (select split_array,
json_extract_path_text(split_array, 'target') as target_client
from (
select json_rep,
list_elem,
t.value as split_array,
typeof(split_array) as obj_type,
index
from cte1,
table(flatten(cte1.list_elem, recursive=>true)) as t) temp /* use recursive feature */
where split_array ilike '%"target":"client_fid"%' /* filter for those rows containing this string */
and obj_type='OBJECT')
select
split_array,
json_extract_path_text(split_array, 'values') as client_values
from cte2
where target_client='CLIENT_FID'; /* filter the rows where we have the dictionary containing client fid */
To handle the varying depth at which CLIENT_FID is found, the query recurses while flattening the JSON into rows. The output obtained for both of the inputs above is shown below.
For the first string, the actual output in client_values is
["flx::core:client:dbid/64171",
"flx::core:client:dbid/76049",
"flx::core:client:dbid/34040",
"flx::core:client:dbid/61806"]
Similarly, for the second string we get the actual output as
["flx::core:client:dbid/28169"]
As seen, the code produces the correct output, but filtering on target_client='CLIENT_FID' in the final query feels very hacky. Is there a better approach to retrieving the client FID values when the depth at which they appear in the input can vary?
Help is appreciated.
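For reference, the traversal that the recursive flatten performs can be sketched outside of Snowflake. This is a minimal Python illustration of the intended logic, not a Snowflake solution: the function name find_values is hypothetical, and the sample is a shortened version of the second input above.

```python
import json


def find_values(node, target="CLIENT_FID"):
    """Collect the "values" of every object whose "target" equals `target`, at any depth."""
    found = []
    if isinstance(node, dict):
        if node.get("target") == target:
            found.extend(node.get("values", []))
        # Keep descending: nested "rules" lists may hold further matches.
        for child in node.values():
            found.extend(find_values(child, target))
    elif isinstance(node, list):
        for child in node:
            found.extend(find_values(child, target))
    return found


# Shortened version of the second example from the question.
doc = json.loads("""
{"rule": {"type": "OR", "rules": [
  {"type": "AND", "rules": [
    {"op": "IN", "target": "PORT_OF_ENTRY_FID",
     "values": ["flx::core:port:dbid/566788"]}]},
  {"type": "AND", "rules": [
    {"op": "IN", "target": "SHIPMENT_MODE", "values": [0]},
    {"op": "IN", "target": "CLIENT_FID",
     "values": ["flx::core:client:dbid/28169"]}]}
]}}
""")

client_values = find_values(doc)
print(client_values)  # ['flx::core:client:dbid/28169']
```

The match is made on the parsed "target" key itself, so no string pattern such as ilike is needed before the exact comparison.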

Delete duplications in JSON file

I am trying to filter a JSON file so that it prints only the subgroups that have any attribute marked as "change": false.
JSON below:
{"group":{
"subgroup1":{
"attributes":[
{
"change":false,
"name":"Name"},
{
"change":false,
"name":"SecondName"}
],
"id":1,
"name":"MasterTest"},
"subgroup2":{
"attributes":[
{
"change":true,
"name":"Name"
},
{
"change":false,
"name":"Newname"
}
],
"id":2,
"name":"MasterSet"}
}}
I was trying to use this command:
cat test.json | jq '.group[] | select (.attributes[].change==false)'
which produces the needed output, but with duplicates. Can anyone help here? Or should I use a different command to achieve that result?
.attributes[] iterates over the attributes, and each iteration step produces its own result. Use the any filter which aggregates multiple values into one, in this case a boolean with the meaning of "at least one":
.group[] | select(any(.attributes[]; .change==false))
{
"attributes": [
{
"change": false,
"name": "Name"
},
{
"change": false,
"name": "SecondName"
}
],
"id": 1,
"name": "MasterTest"
}
{
"attributes": [
{
"change": true,
"name": "Name"
},
{
"change": false,
"name": "Newname"
}
],
"id": 2,
"name": "MasterSet"
}
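The difference between the two filters can be mimicked in Python. This is only an analogy to the jq semantics described above, with the sample data taken from the question; the variable names are hypothetical.

```python
group = {
    "subgroup1": {"id": 1, "name": "MasterTest", "attributes": [
        {"change": False, "name": "Name"},
        {"change": False, "name": "SecondName"}]},
    "subgroup2": {"id": 2, "name": "MasterSet", "attributes": [
        {"change": True, "name": "Name"},
        {"change": False, "name": "Newname"}]},
}

# Analogous to select(.attributes[].change == false):
# the subgroup is emitted once per matching attribute.
per_attribute = [sub["name"] for sub in group.values()
                 for attr in sub["attributes"] if attr["change"] is False]

# Analogous to select(any(.attributes[]; .change == false)):
# the matches are reduced to one boolean, so each subgroup appears at most once.
per_subgroup = [sub["name"] for sub in group.values()
                if any(attr["change"] is False for attr in sub["attributes"])]

print(per_attribute)  # ['MasterTest', 'MasterTest', 'MasterSet']
print(per_subgroup)   # ['MasterTest', 'MasterSet']
```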
Looks to me like the duplicate is NOT a duplicate, but a condition arising from a nested sub-grouping, which gives the appearance of a duplicate. You should look to see if there is a switch to skip processing sub-groups when the upper-level meets the condition, thereby avoiding the perceived duplication.

JMESPath: select JSON object element based on another (array) element in the object

I have this JSON
{
"srv_config": [{
"name": "db1",
"servers": ["srv1", "srv2"],
"prop": [{"source":"aa"},"destination":"bb"},{"source":"cc"},"destination":"cc"},]
}, {
"name": "db2",
"servers": ["srv2", "srv2"],
"prop": [{"source":"dd"},"destination":"dd"},{"source":"ee"},"destination":"ee"},]
}
]
}
I try to build a JMESPath expression to select the prop application in each object in the main array, but based on the existence of a string in the servers element.
To select all props, I can do:
*.props [*]
But how do I add a condition that says "select only if srv1 is in the servers list"?
You can use the contains function to filter based on an array containing something.
Given the query:
*[?contains(servers, `srv1`)].prop | [][]
This gives us:
[
{
"source": "aa",
"destination": "bb"
},
{
"source": "cc",
"destination": "cc"
}
]
Please mind that I am also using a bit of flattening here.
All of this was run against a corrected version of your JSON:
{
"srv_config":[
{
"name":"db1",
"servers":[
"srv1",
"srv2"
],
"prop":[
{
"source":"aa",
"destination":"bb"
},
{
"source":"cc",
"destination":"cc"
}
]
},
{
"name":"db2",
"servers":[
"srv2",
"srv2"
],
"prop":[
{
"source":"dd",
"destination":"dd"
},
{
"source":"ee",
"destination":"ee"
}
]
}
]
}
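The two steps of the query, filtering on servers and then flattening prop, can also be written out in plain Python. This is a sketch of the equivalent logic only and does not use a JMESPath library; the variable names are hypothetical and the data is the corrected JSON above.

```python
srv_config = [
    {"name": "db1", "servers": ["srv1", "srv2"],
     "prop": [{"source": "aa", "destination": "bb"},
              {"source": "cc", "destination": "cc"}]},
    {"name": "db2", "servers": ["srv2", "srv2"],
     "prop": [{"source": "dd", "destination": "dd"},
              {"source": "ee", "destination": "ee"}]},
]

# Filter step: keep the prop list of entries whose servers contain "srv1".
matching = [entry["prop"] for entry in srv_config if "srv1" in entry["servers"]]

# Flatten step: merge the remaining prop lists into one list of objects.
props = [prop for prop_list in matching for prop in prop_list]

print(props)
```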

Issue with cts.jsonPropertyScopeQuery and cts.jsonPropertyValueQuery with data types and field order

I am running MarkLogic 9.
I have created the following documents in my database:
test1.json
{
"users": [
{
"userId": "A",
"value": 0
}
]
}
test2.json
{
"users": [
{
"userId": "A",
"value": "0"
}
]
}
test3.json
{
"users": [
{
"value": 0,
"userId": "A"
}
]
}
test4.json
{
"users": [
{
"value": "0",
"userId": "A"
}
]
}
I ran the following queries and recorded the results:
cts.uris("", null, cts.jsonPropertyScopeQuery(
"users",
cts.andQuery(
[
cts.jsonPropertyValueQuery('userId', "A"),
cts.jsonPropertyValueQuery('value', "0"),
]
)
))
Result: test2.json, test4.json
cts.uris("", null, cts.jsonPropertyScopeQuery(
"users",
cts.andQuery(
[
cts.jsonPropertyValueQuery('userId', "A"),
cts.jsonPropertyValueQuery('value', 0),
]
)
))
Result: test3.json
I was wondering why test1.json was not returned by the second query while test3.json was. Both have the same field values, just in a different order. The field order also differs between test2.json and test4.json, yet the first query returned both documents. The only difference I can see between the two pairs is the data type of the field "value": integer versus string.
How would I go about resolving this issue?
https://docs.marklogic.com/cts.jsonPropertyValueQuery shows that the value to match can be an array.
If you want to keep the variants in the data, maybe you can try something on the query side like cts.jsonPropertyValueQuery('value', ["0", 0]).

Hive Sql Query To get Json Object from Json Array

I have JSON in a 'content' column in the following format:
{ "identifier": [
{
"type": {
"coding": [
{
"code": "MRN"
}
]
},
"value": "181"
},
{
"type": {
"coding": [
{
"code": "PID"
}
]
},
"value": "5d3669b0"
},
{
"type": {
"coding": [
{
"code": "IPN"
}
]
},
"value": "41806"
}
]}
I need to run a Hive query to get the "value" of the entry whose code equals "MRN".
I have written the following query, but it does not return the expected value:
select get_json_object(content,'$.identifier.value') as Mrn from Doctor where get_json_object(content,'$.identifier.type.coding.code') like '%MRN%'
I don't want to hard-code a particular array position, like:
select get_json_object(content,'$.identifier[0].value') as Mrn from Doctor where get_json_object(content,'$.identifier[0].type.coding.code') like '%MRN%'
because the JSON is generated with the entries in no fixed order.
Use [*] instead of a fixed position:
select get_json_object(content,'$.identifier[*].value') as Mrn from Doctor where get_json_object(content,'$.identifier[*].type.coding.code') like '%MRN%'
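To make the intended selection explicit, here is a minimal Python sketch of "the value of the identifier whose coding contains the code MRN, regardless of position". It illustrates the goal rather than Hive behaviour; the function name value_for_code is hypothetical and the data is the JSON from the question.

```python
import json

content = json.loads("""
{"identifier": [
  {"type": {"coding": [{"code": "MRN"}]}, "value": "181"},
  {"type": {"coding": [{"code": "PID"}]}, "value": "5d3669b0"},
  {"type": {"coding": [{"code": "IPN"}]}, "value": "41806"}
]}
""")


def value_for_code(doc, code):
    """Return the value of the first identifier that carries `code`, or None."""
    for identifier in doc.get("identifier", []):
        codings = identifier.get("type", {}).get("coding", [])
        if any(coding.get("code") == code for coding in codings):
            return identifier.get("value")
    return None


print(value_for_code(content, "MRN"))  # 181
```

Because the lookup scans every entry, the result does not depend on where the MRN identifier sits in the array.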