Conditionally merging two separate JSON objects in JQ


This is how my input looks:
{
"text" : "Some text here"
}
{
"usage": {
"text_units": 1,
"text_characters": 101,
"features": 1
},
"language": "en",
"categories": [
{
"score": 0.655041,
"label": "/technology law, govt and politics/espionage and intelligence/surveillance"
},
{
"score": 0.639809,
"label": "/technology and computing/computer security/network security"
},
{
"score": 0.624533,
"label": "/business and industrial/business operations"
}
]
}
Using jq, if the label of the first element of the categories array in the second object contains /technology, I want to add a new field named relevant with the value 1 (which I managed) and copy the text field from the first object.
So, the expected output is:
{
"usage": {
"text_units": 1,
"text_characters": 101,
"features": 1
},
"language": "en",
"categories": [
{
"score": 0.655041,
"label": "/technology law, govt and politics/espionage and intelligence/surveillance"
},
{
"score": 0.639809,
"label": "/technology and computing/computer security/network security"
},
{
"score": 0.624533,
"label": "/business and industrial/business operations"
}
],
"relevant": 1,
"text": "Some text here"
}
And this is what I have done so far:
if .categories[0].label | test("/technology"; "i") then . |=( . + {"relevant": 1} + {"text": .text}) else . |= . + {"relevant": 0} end

Your input consists of two separate objects. In order to be able to access the first while processing the second, you could save the first into a variable.
. as {$text} | input | if .categories[0].label | test("/technology"; "i") then . + {relevant: 1, $text} else . + {relevant: 0} end
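For readers who want to cross-check the logic outside jq, here is a minimal Python sketch of the same idea: read the two JSON values from one stream, keep the first, and decide what to add to the second. The helper names `read_stream` and `merge_if_relevant` are made up for this illustration, and the sample input is shortened. Unlike the jq filter, the sketch treats a missing label as "no match" instead of raising an error.

```python
import json


def read_stream(raw: str) -> list:
    """Decode several JSON values that follow each other in one string."""
    decoder = json.JSONDecoder()
    values, pos = [], 0
    while pos < len(raw):
        if raw[pos].isspace():  # skip whitespace between values
            pos += 1
            continue
        value, pos = decoder.raw_decode(raw, pos)
        values.append(value)
    return values


def merge_if_relevant(first: dict, second: dict) -> dict:
    """Flag the second object and copy `text` from the first on a match."""
    categories = second.get("categories") or [{}]
    label = categories[0].get("label") or ""
    # Case-insensitive substring check, comparable to test("/technology"; "i")
    if "/technology" in label.lower():
        return {**second, "relevant": 1, "text": first.get("text")}
    return {**second, "relevant": 0}


raw = (
    '{"text": "Some text here"}\n'
    '{"language": "en", "categories": [{"score": 0.655041, '
    '"label": "/technology law, govt and politics"}]}'
)
first, second = read_stream(raw)
result = merge_if_relevant(first, second)
print(json.dumps(result, indent=2))
```

The key point is the same as in the jq answer: the first object has to be held somewhere (a variable) before the second one is processed.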

Related

Append JSON file after specific array index by using shell script

I want to append some content by using shell script.
I have a JSON file test.json as below.
{
"reference": "Json Test",
"title": {
"a": "Json Test"
},
"components": [
{
"reference": "Json Test",
"type": "panel",
"content": [
{
"link": "abc/123",
"label": {
"a": "for test 123 - a",
"b": "for test 123 - b"
}
},
{
"link": "abc/456",
"label": {
"a": "for test 456 - a",
"b": "for test 456 - b"
}
},
{
"link": "abc/789",
"label": {
"a": "for test 789 - a",
"b": "for test 789 - b"
}
}
]
}
]
}
I want to insert a new entry after the first element of the content array, using a shell script (*.sh), so that the output looks as follows. How can I achieve this?
{
"reference": "Json Test",
"title": {
"a": "Json Test"
},
"components": [
{
"reference": "Json Test",
"type": "panel",
"content": [
{
"link": "abc/123",
"label": {
"a": "for test 123 - a",
"b": "for test 123 - b"
}
},
{
"link": "abc/101112",
"label": {
"a": "for test 101112 - a",
"b": "for test 101112 - b"
}
},
{
"link": "abc/456",
"label": {
"a": "for test 456 - a",
"b": "for test 456 - b"
}
},
{
"link": "abc/789",
"label": {
"a": "for test 789 - a",
"b": "for test 789 - b"
}
}
]
}
]
}
I tried to access the element by index and add a test object, but the command below overwrites the link of the existing element (and prints only that element) instead of inserting a new one:
jq '.components[].content[1] + { "link" : "test" } ' test.json

You can use the slice filter to extract the head and the tail of the array, then use + to concatenate head + the new object + the tail. Finally, use update-assignment |= to modify the array:
.components[].content |= .[0:1] + [{ link: "test" }] + .[1:]
If you are planning on using this more often, consider defining a reusable function:
def splice($at; $obj): .[0:$at] + [$obj] + .[$at:];
.components[].content |= splice(1; {link: "test"})
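The head + new element + tail idea is not specific to jq. As a rough cross-check, the same splice can be sketched in Python with list slices; the function name `splice` simply mirrors the jq definition above, and the sample list is a shortened stand-in for the content array.

```python
def splice(items: list, at: int, obj) -> list:
    """Return a new list with obj inserted before index `at`."""
    # head (everything before `at`) + new element + tail (the rest)
    return items[:at] + [obj] + items[at:]


content = [{"link": "abc/123"}, {"link": "abc/456"}, {"link": "abc/789"}]
updated = splice(content, 1, {"link": "abc/101112"})
print([item["link"] for item in updated])
```

As in jq, the original list is left unchanged and a new one is returned.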

Grab the empty sub-array at position 1 (slicing either by start and end position .[1:1], or by start position and length .[1:][:0]), and assign to it your insert value formatted as (single-element) array [{"link": "test"}] (as you are assigning to an array after all - add more items to it if you want to add all of them at once). This looks almost like your original attempt:
jq '.components[].content[1:1] = [{"link": "test"}]' test.json
For convenience, you can also turn this into an insertAt function:
def insertAt($pos; $val): .[$pos:$pos] = [$val];
.components[].content |= insertAt(1; {"link": "test"})
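Assigning to an empty slice has a close analogue in Python, where `lst[pos:pos] = [value]` also inserts without replacing anything. The sketch below applies that to every component of a document; `insert_at` and the shortened sample document are assumptions made for illustration only.

```python
import copy


def insert_at(doc: dict, pos: int, value: dict) -> dict:
    """Insert value at index pos of every components[].content array."""
    result = copy.deepcopy(doc)  # leave the caller's data untouched
    for component in result.get("components", []):
        # Assigning to the empty slice [pos:pos] inserts instead of replacing
        component["content"][pos:pos] = [value]
    return result


doc = {
    "reference": "Json Test",
    "components": [
        {"type": "panel", "content": [{"link": "abc/123"}, {"link": "abc/456"}]}
    ],
}
patched = insert_at(doc, 1, {"link": "test"})
print([entry["link"] for entry in patched["components"][0]["content"]])
```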

Using jq to fetch and show key value with quotes

I have a file that looks as below:
{
"Job": {
"Name": "sample_job",
"Description": "",
"Role": "arn:aws:iam::00000000000:role/sample_role",
"CreatedOn": "2021-10-21T23:35:23.660000-03:00",
"LastModifiedOn": "2021-10-21T23:45:41.771000-03:00",
"ExecutionProperty": {
"MaxConcurrentRuns": 1
},
"Command": {
"Name": "glueetl",
"ScriptLocation": "s3://aws-sample-s3/scripts/sample.py",
"PythonVersion": "3"
},
"DefaultArguments": {
"--TempDir": "s3://aws-sample-s3/temporary/",
"--class": "GlueApp",
"--enable-continuous-cloudwatch-log": "true",
"--enable-glue-datacatalog": "true",
"--enable-metrics": "true",
"--enable-spark-ui": "true",
"--job-bookmark-option": "job-bookmark-enable",
"--job-insights-byo-rules": "",
"--job-language": "python",
"--spark-event-logs-path": "s3://aws-sample-s3/logs"
},
"MaxRetries": 0,
"AllocatedCapacity": 100,
"Timeout": 2880,
"MaxCapacity": 100.0,
"WorkerType": "G.1X",
"NumberOfWorkers": 100,
"GlueVersion": "2.0"
}
}
I want to print the key and value for "Name", "--enable-continuous-cloudwatch-log" and "--enable-metrics". So, I need to show the info like this:
"Name" "sample_job"
"--enable-continuous-cloudwatch-log" ""
"--enable-metrics" ""
UPDATE
Following the tips from @Inian and @0stone0, I came close to it:
jq -r '(.Job ) + (.Job.DefaultArguments | { "--enable-continuous-cloudwatch-log", "--enable-metrics"}) | to_entries[] | "\"\(.key)\" \"\(.value)\""'
This extracts the values I need, but it also shows all the other key/value pairs.

Since your JSON isn't valid, I've converted it into:
{
"Job": {
"Name": "sample_job",
"Role": "sample_role_job"
},
"DefaultArguments": {
"--enable-continuous-cloudwatch-log": "test_1",
"--enable-metrics": ""
},
"Timeout": 2880,
"NumberOfWorkers": 10
}
Using the following filter:
"Name \(.Job.Name)\n--enable-continuous-cloudwatch-log \(.DefaultArguments."--enable-continuous-cloudwatch-log")\n--enable-metrics \(.DefaultArguments."--enable-metrics")"
We use string interpolation to show the desired output:
Name sample_job
--enable-continuous-cloudwatch-log test_1
--enable-metrics
jq --raw-output '"Name \(.Job.Name)\n--enable-continuous-cloudwatch-log \(.DefaultArguments."--enable-continuous-cloudwatch-log")\n--enable-metrics \(.DefaultArguments."--enable-metrics")"'
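The question asked for both keys and values wrapped in double quotes, which the filter above does not add. As a hedged illustration of one way to get that format, here is a small Python sketch that assumes the nesting from the question (DefaultArguments inside Job) and uses `json.dumps` to do the quoting. The function name `quoted_pairs` and the trimmed sample data are made up for this example.

```python
import json

job_doc = {
    "Job": {
        "Name": "sample_job",
        "DefaultArguments": {
            "--enable-continuous-cloudwatch-log": "true",
            "--enable-metrics": "true",
            "--job-language": "python",
        },
    }
}


def quoted_pairs(doc: dict, arg_keys: list) -> list:
    """Build '"key" "value"' lines for Name and the selected arguments."""
    job = doc["Job"]
    pairs = [("Name", job.get("Name"))]
    args = job.get("DefaultArguments", {})
    pairs += [(key, args.get(key)) for key in arg_keys]
    # json.dumps adds the double quotes and escapes special characters
    return [f"{json.dumps(key)} {json.dumps(value)}" for key, value in pairs]


lines = quoted_pairs(
    job_doc, ["--enable-continuous-cloudwatch-log", "--enable-metrics"]
)
print("\n".join(lines))
```

A key that is missing from DefaultArguments is printed as an unquoted null here, which makes absent settings easy to spot.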

Fill arrays in the first input with elements from the second based on common field

I have two files and I would need to merge the elements of the second file into an object array in the first file based on searching the reference field.
The first file:
[
{
"reference": 25422,
"order_number": "10_1",
"details" : []
},
{
"reference": 25423,
"order_number": "10_2",
"details" : []
}
]
The second file:
[
{
"record_id" : 1,
"reference": 25422,
"row_description": "descr_1_0"
},
{
"record_id" : 2,
"reference": 25422,
"row_description": "descr_1_1"
},
{
"record_id" : 3,
"reference": 25423,
"row_description": "descr_2_0"
}
]
I would like to get:
[
{
"reference": 25422,
"order_number": "10_1",
"details" : [
{
"record_id" : 1,
"reference": 25422,
"row_description": "descr_1_0"
},
{
"record_id" : 2,
"reference": 25422,
"row_description": "descr_1_1"
}
]
},
{
"reference": 25423,
"order_number": "10_2",
"details" :[
{
"record_id" : 3,
"reference": 25423,
"row_description": "descr_2_0"
}
]
}
]
Below is the command I run, followed by the contents of es_func.jq:
jq -n --argfile f1 es_file1.json --argfile f2 es_file2.json -f es_func.jq
INDEX($f2[] ; .reference) as $details
| $f1
| map( ($details[.reference|tostring]| .row_description) as $vn
| if $vn then .details = [{"row_description" : $vn}] else . end)
I get a result only for the last record with reference 25422 ("row_description": "descr_1_1"); the entry with "row_description": "descr_1_0" is missing.
[
{
"reference": 25422,
"order_number": "10_1",
"details": [
{
"row_description": "descr_1_1"
}
]
},
{
"reference": 25423,
"order_number": "10_2",
"details": [
{
"row_description": "descr_2_0"
}
]
}
]
I think I'm close to the solution but something is still missing. Thank you

INDEX keeps only one entry per key, so rows that share a reference overwrite each other. This would be way easier if you used reduce instead.
jq 'reduce inputs[] as $rec (INDEX(.reference);
.[$rec.reference | tostring].details += [$rec]
) | map(.)' es_file1.json es_file2.json

Here's a straightforward, reduce-free solution:
jq '
group_by(.reference)
| INDEX(.[]; .[0]|.reference|tostring) as $dict
| input
| map_values(. + {details: $dict[.reference|tostring]})
' 2.json 1.json
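Both answers boil down to grouping the rows of the second file by reference and attaching each group to the matching order. For comparison, here is a minimal Python sketch of that grouping, using the sample data from the question; `fill_details` is a hypothetical helper name. Orders without matching rows keep their existing (empty) details list in this version.

```python
from collections import defaultdict


def fill_details(orders: list, rows: list) -> list:
    """Attach every row to the order that shares its reference."""
    by_ref = defaultdict(list)
    for row in rows:
        by_ref[row["reference"]].append(row)  # keeps all rows per reference
    return [
        {**order, "details": order.get("details", []) + by_ref.get(order["reference"], [])}
        for order in orders
    ]


orders = [
    {"reference": 25422, "order_number": "10_1", "details": []},
    {"reference": 25423, "order_number": "10_2", "details": []},
]
rows = [
    {"record_id": 1, "reference": 25422, "row_description": "descr_1_0"},
    {"record_id": 2, "reference": 25422, "row_description": "descr_1_1"},
    {"record_id": 3, "reference": 25423, "row_description": "descr_2_0"},
]
merged = fill_details(orders, rows)
for order in merged:
    print(order["order_number"], [d["row_description"] for d in order["details"]])
```

Collecting rows into a list per reference is what avoids the "only the last record survives" problem from the question.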

Leveling select fields

I am fetching a json response of following structure:
{
"data": {
"children": [
{
"data": {
"id": "abcdef",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_1.jpg"
}
}
]
},
"title": "Boring Title One"
}
},
{
"data": {
"id": "ghijkl",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_2.jpg"
}
}
]
},
"title": "Boring Title Two"
}
},
{
"data": {
"id": "mnopqr",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_3.jpg"
}
}
]
},
"title": "Boring Title Three"
}
},
{
"data": {
"id": "stuvwx",
"preview": {
"images": [
{
"source": {
"url": "https://example.com/somefiles_4.jpg"
}
}
]
},
"title": "Boring Title Four"
}
}
]
}
}
Ideally I would like to have a shortened json like this:
{
"data": [
{
"id": "abcdef",
"title": "Boring Title One",
"url": "https://example.com/somefiles_1.jpg"
},
{
"id": "ghijkl",
"title": "Boring Title Two",
"url": "https://example.com/somefiles_2.jpg"
},
{
"id": "mnopqr",
"title": "Boring Title Three",
"url": "https://example.com/somefiles_3.jpg"
},
{
"id": "stuvwx",
"title": "Boring Title Four",
"url": "https://example.com/somefiles_4.jpg"
}
]
}
If this is not possible, I can work with joining those three values into a single string and splitting it later when necessary, like this:
abcdef#Boring Title One#https://example.com/somefiles_1.jpg
ghijkl#Boring Title Two#https://example.com/somefiles_2.jpg
mnopqr#Boring Title Three#https://example.com/somefiles_3.jpg
stuvwx#Boring Title Four#https://example.com/somefiles_4.jpg
This is where I am. I was using jq with select() and then piping the results to to_entries like this:
jq -r '.data.children[] | select(.data.post_type|test("image")?) | .data | to_entries[] | [ .value.title , .value.preview.images[0].source.url ] | join("#")' ~/Documents/json/sample.json
I don't understand what goes after to_entries[]; I have tried multiple variations of .key and .values; Mostly I don't get any result but sometimes I get key pairs I do not intend to select. How to learn the proper syntax for it?
Is creating a flat json out of a nested json like this good or is it better to create the string outputs? I feel the string might be error prone especially with the presence of spaces or special characters.

Apparently what you're looking for is the {field} syntax. You don't need to resort to string outputs.
{ data: [
.data.children[].data
| select(has("post_type") and (.post_type | index("image")))
| {id, title} + (.preview.images[].source | {url})
# or, if images array always contains one element:
# | {id, title, url: .preview.images[0].source.url}
]
}

A simple solution to the main question is:
{data: [.data.children[]
| .data
| {id, title, url: .preview.images[0].source.url} ]}
(The "post_type" seems to have disappeared, but hopefully if it's relevant, you will be able to adapt the above as required. Likewise if .images[1] and beyond are relevant.)
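To make the shape of the transformation explicit, here is a rough Python equivalent of the flattening: walk the children, take id and title from each inner data object, and pull the URL of the first image. The function name `flatten_listing` is invented for this sketch, the sample response is cut down to two entries, and an entry without images gets a url of None, which is an assumption rather than something the answers specify.

```python
def flatten_listing(response: dict) -> dict:
    """Reduce each child to id, title, and the URL of its first image."""
    rows = []
    for child in response["data"]["children"]:
        item = child["data"]
        images = item.get("preview", {}).get("images", [])
        rows.append(
            {
                "id": item.get("id"),
                "title": item.get("title"),
                # Assumption: only the first image matters; None if there is none
                "url": images[0]["source"]["url"] if images else None,
            }
        )
    return {"data": rows}


response = {
    "data": {
        "children": [
            {
                "data": {
                    "id": "abcdef",
                    "preview": {
                        "images": [
                            {"source": {"url": "https://example.com/somefiles_1.jpg"}}
                        ]
                    },
                    "title": "Boring Title One",
                }
            },
            {"data": {"id": "ghijkl", "title": "Boring Title Two"}},
        ]
    }
}
flat = flatten_listing(response)
print(flat)
```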
String Output
If you want linear output, you should probably consider CSV or TSV, both of which are supported by jq via @csv and @tsv respectively.
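The reason to prefer a real CSV/TSV writer over joining with "#" is that the writer handles delimiters and quotes inside the values. As an illustration of that point in Python (not of jq's own formatters), the sketch below writes tab-separated rows with the standard csv module. The sample rows are taken from the desired output in the question. Note that Python's csv module quotes a field containing a tab, while jq escapes it, so the two outputs can differ for such values.

```python
import csv
import io

rows = [
    {"id": "abcdef", "title": "Boring Title One",
     "url": "https://example.com/somefiles_1.jpg"},
    {"id": "ghijkl", "title": "Boring Title Two",
     "url": "https://example.com/somefiles_2.jpg"},
]

buffer = io.StringIO()
# Tab as delimiter; the writer quotes fields that contain tabs or quotes
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
for row in rows:
    writer.writerow([row["id"], row["title"], row["url"]])

tsv_text = buffer.getvalue()
print(tsv_text, end="")
```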

Split a string and trim a known prefix from each part in a complex JSON structure

I'm dealing with a fairly complex JSON-structure in which a single entry needs to be edited in several places. For example:
[
{
"name": "test 1",
"stuff": {
"properties": {
"id": 0,
"stuff_list": [
{
"entryId": 1,
"description": "- item 1\n- item 2\n- item 3"
},
{
"entryId": 2,
"description": "- item 1\n- item 2\n- item 3"
}
]
}
}
},
{
"name": "test 2",
"stuff": {
"properties": {
"id": 1,
"stuff_list": [
{
"entryId": 1,
"description": null
},
{
"entryId": 2,
"description": "- item 1\n- item 2\n- item 3"
}
]
}
}
}
]
Here I would like to edit each "description"-element: The string needs to be split at each \n and the substrings "^\n?-\s" of each resulting array element need to be removed. So it should result in:
{
"entryId": 1,
"description": ["item 1", "item 2", "item 3"]
}
My first approach is:
jq '.[].stuff.properties.stuff_list[].description | split("\n")' the_file.json
but that's not working in the first place because of the null values that can occur in some places. So now I wonder: how can I achieve what I want?

An alternative is to split() on \n and strip the leading "- " from each part with ltrimstr, leaving null descriptions untouched:
.[].stuff.properties.stuff_list[].description |=
if . != null then
split("\n") | map(ltrimstr("- "))
else
.
end
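The null check is the part that fixes the original attempt, so it may help to see it in isolation. Below is a small Python sketch of the same update, using a shortened version of the sample data; `split_description` and `convert` are hypothetical names, and the code assumes every entry has a description key, as in the question.

```python
def split_description(description):
    """Split on newlines and drop a leading '- ' from each part; keep None."""
    if description is None:
        return None
    return [
        part[2:] if part.startswith("- ") else part
        for part in description.split("\n")
    ]


def convert(entries: list) -> list:
    """Apply split_description to every stuff_list item, in place."""
    for entry in entries:
        for item in entry["stuff"]["properties"]["stuff_list"]:
            item["description"] = split_description(item["description"])
    return entries


data = [
    {
        "name": "test 2",
        "stuff": {
            "properties": {
                "id": 1,
                "stuff_list": [
                    {"entryId": 1, "description": None},
                    {"entryId": 2, "description": "- item 1\n- item 2\n- item 3"},
                ],
            }
        },
    }
]
converted = convert(data)
print(converted[0]["stuff"]["properties"]["stuff_list"])
```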
