Combine two Json's by key using JQ - json

I have two json's that are a list of objects that share the same key and I am trying to combine them into one json using jq. The expected output is a single json that contains a list of the combined objects in list form. For example:
Json 1:
[
{"Id":"1", "FirstName":"firstName1", "LastName":"lastName1"},
{"Id":"2", "FirstName":"firstName2", "LastName":"lastName2"},
{"Id":"3", "FirstName":"firstName2", "LastName":"lastName3"}
]
Json 2:
[
{"School":"School1", "Id":"1", "Degree":"Degree1"},
{"School":"School2", "Id":"2", "Degree":"Degree2"},
{"School":"School3", "Id":"3", "Degree":"Degree3"}
]
Combined Json Based on Id
[
{"Id":"1", "FirstName":"firstName1", "LastName":"lastName1",
"School":"School1", "Degree":"Degree1"},
{"Id":"2", "FirstName":"firstName2", "LastName":"lastName2",
"School":"School2", "Degree":"Degree2"},
{"Id":"3", "FirstName":"firstName2", "LastName":"lastName3",
"School":"School3", "Degree":"Degree3"}
]
I have already tried a few ways to merge these jsons I found in this thread such as:
jq -s '.[0] * .[1]' file1 file2
I am still a novice in jq, so any help would be appreciated!

Use the SQL-Style Operators JOIN and INDEX
jq 'JOIN(INDEX(inputs[];.Id);.[];.Id;add)' json1 json2
[
{
"Id": "1",
"FirstName": "firstName1",
"LastName": "lastName1",
"School": "School1",
"Degree": "Degree1"
},
{
"Id": "2",
"FirstName": "firstName2",
"LastName": "lastName2",
"School": "School2",
"Degree": "Degree2"
},
{
"Id": "3",
"FirstName": "firstName2",
"LastName": "lastName3",
"School": "School3",
"Degree": "Degree3"
}
]
Demo

Related

How do I return the id associated with the test json element?

I return the following json from a curl. I've simplified the ID's however my goal is to return the id related to the test object. The desired return value is 55555.
I'm using jq to parse the string.
https://stedolan.github.io/jq
[
{
"attributes": null,
"id": "44444",
"name": "production",
"type": "system_group"
},
{
"attributes": null,
"id": "55555",
"name": "test",
"type": "system_group"
},
{
"attributes": null,
"id": "66666",
"name": "uat",
"type": "system_group"
}
]
It should be pretty straightforward filter, use the select and contain constructs.
jq '.[] | select( .name| contains("test")) | .id'
Call the filter with -r to remove the quotes and print the raw value.
jqplay-URL

parsing JSON with jq to return value of element where another element has a certain value

I have some JSON output I am trying to parse with jq. I read some examples on filtering but I don't really understand it and my output it more complicated than the examples. I have no idea where to even begin beyond jq '.[]' as I don't understand the syntax of jq beyond that and the hierarchy and terminology are challenging as well. My JSON output is below. I want to return the value for Valid where the ItemName equals Item_2. How can I do this?
"1"
[
{
"GroupId": "1569",
"Title": "My_title",
"Logo": "logo.jpg",
"Tags": [
"tag1",
"tag2",
"tag3"
],
"Owner": [
{
"Name": "John Doe",
"Id": "53335"
}
],
"ItemId": "209766",
"Item": [
{
"Id": 47744,
"ItemName": "Item_1",
"Valid": false
},
{
"Id": 47872,
"ItemName": "Item_2",
"Valid": true
},
{
"Id": 47872,
"ItemName": "Item_3",
"Valid": false
}
]
}
]
"Browse"
"8fj9438jgge9hdfv0jj0en34ijnd9nnf"
"v9er84n9ogjuwheofn9gerinneorheoj"
Except for the initial and trailing JSON scalars, you'd simply write:
.[] | .Item[] | select( .ItemName == "Item_2" ) | .Valid
In your particular case, to ensure the top-level JSON scalars are ignored, you could prefix the above with:
arrays |

Creating a CSV from json using jq, based on elements in array

I have the following json format that I need to convert to CSV
[{
"name": "joe",
"age": 21,
"skills": [{
"lang": "spanish",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
},
{
"name": "sarah",
"age": 34,
"skills": [{
"lang": "french",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
}, {
"name": "jim",
"age": 26,
"skills": [{
"lang": "spanish",
"grade": "60"
}, {
"lang": "english",
"grade": "66",
"school": {
"name": "eg school",
"url": "eg-school.com"
}
}]
}
]
to convert to csv
name,age,grade,school,url,file,line_number
joe,21,47,"my school","example.com/sp-school",sample.json,1
jim,26,60,"","",sample.json,3
So add the top level fields and the object from the skills array if lang=spanish and the school hash from the skills object for spanish if it exists
I'd also like to add the file and line number it came from.
I would like to use jq for the job, but can't figure out the syntax , anyone help me out ?
With your data in input.json, and the following jq program in tocsv.jq:
.[]
| [.name, .age] +
(.skills[]
| select(.lang == "spanish")
| [.grade, .school.name, .school.url, input_filename, input_line_number] )
| #csv
the invocation:
jq -r -f tocsv.jq input.json
yields:
"joe",21,"47","my school","example.com/sp-school","input.json",51
"jim",26,"60",,,"input.json",51
If you want the number-valued strings converted to numbers, you could use the "tonumber" filter. If you want the null-valued fields replaced by strings, use e.g. .school.name // ""
Of course this approach doesn't yield a very useful line number. One approach that would yield higher granularity would be to stream the individual objects into jq, but then you'd lose the filename. To recover the filename you could pass it in as an argument. So you would have a pipeline like so:
jq -c '.[]' input.json | jq -r --arg file input.json -f tocsv2.jq
where tocsv2.jq would be like tscsv.jq above but without the initial .[] |, and with $file instead of input_filename.
Finally, please also consider using the TSV format (#tsv) rather than the rather messy CSV format (#csv).

Select or exclude multiples object with an array of IDs

I have the following JSON :
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
},
{
"id": "10",
"foo": "bar-c",
"hello": "world-c"
},
{
"id": "42",
"foo": "bar-d",
"hello": "world-d"
}
]
And I have the following array store in a variable: ["1", "2", "56", "1337"] (note the IDs are string, and may contain any regular character).
So, thanks to this SO, I found a way to filter my original data. jq 'jq '[.[] | select(.id == ("1", "2", "56", "1337"))]' ./data.json (note the array is surrounded by parentheses and not brackets) produces :
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
}
]
But I would also liked to do the opposite (basically excluding IDs instead of selecting them). Using select(.id != ("1", "2", "56", "1337")) doesn't work and using jq '[. - [.[] | select(.id == ("1", "2", "56", "1337"))]]' ./data.json seems very ugly and it doesn't work with my actual data (an output of aws ec2 describe-instances).
So have you any idea to do that? Thank you!
To include them, you need to verify that the id is any of the values in the keep set.
$ jq --argjson include '["1", "2", "56", "1337"]' 'map(select(.id == $include[]))' ...
To exclude them, you need to verify that all values are not in your excluded set. But it might just be easier to take the original set and remove the items that are in the excluded set.
$ jq --argjson exclude '["1", "2", "56", "1337"]' '. - map(select(.id == $exclude[]))' ...
Here is a solution that uses inside. Assuming you run jq as
jq -M --argjson IDS '["1","2","56","1337"]' -f filter.jq data.json
This filter.jq
map( select([.id] | inside($IDS)) )
produces the ids from data.json that are in the $IDS array:
[
{
"id": "1",
"foo": "bar-a",
"hello": "world-a"
},
{
"id": "2",
"foo": "bar-b",
"hello": "world-b"
}
]
and this filter.jq
map( select([.id] | inside($IDS) | not) )
produces the ids from data.json that are not in the $IDS array:
[
{
"id": "10",
"foo": "bar-c",
"hello": "world-c"
},
{
"id": "42",
"foo": "bar-d",
"hello": "world-d"
}
]

How to convert complex JSON to CSV using JQ 1.4

I am using JQ 1.4 on Windows 64 bit machine.
Below are the contents of input file IP.txt
{
"results": [
{
"name": "Google",
"employees": [
{
"name": "Michael",
"division": "Engineering"
},
{
"name": "Laura",
"division": "HR"
},
{
"name": "Elise",
"division": "Marketing"
}
]
},
{
"name": "Microsoft",
"employees": [
{
"name": "Brett",
"division": "Engineering"
},
{
"name": "David",
"division": "HR"
}
]
}
]
}
{
"results": [
{
"name": "Amazon",
"employees": [
{
"name": "Watson",
"division": "Marketing"
}
]
}
]
}
File contains two "results". 1st result containts information for 2 companies: Google and Microsoft. 2nd result contains information for Amazon.
I want to convert this JSON into csv file with company name and employee name.
"Google","Michael"
"Google","Laura"
"Google","Elise"
"Microsoft","Brett"
"Microsoft","David"
"Amazon","Watson"
I am able to write below script:
jq -r "[.results[0].name,.results[0].employees[0].name]|#csv" IP.txt
"Google","Michael"
"Amazon","Watson"
Can someone guide me to write the script without hardcoding the index values?
Script should be able generate output for any number results and each cotaining information of any number of companies.
I tried using below script which didn't generate expected output:
jq -r "[.results[].name,.results[].employees[].name]|#csv" IP.txt
"Google","Microsoft","Michael","Laura","Elise","Brett","David"
"Amazon","Watson"
You need to flatten down the results first to rows of company and employee names. Then with that, you can convert to csv rows.
map(.results | map({ cn: .name, en: .employees[].name } | [ .cn, .en ])) | add[] | #csv
Since you have a stream of inputs, you'll have to slurp (-s) it in. Since you want to output csv, you'll want to use raw output (-r).