I have two JSON files, each a list of objects that share the same key (Id), and I am trying to combine them into one JSON document using jq. The expected output is a single JSON array containing the merged objects. For example:
Json 1:
[
{"Id":"1", "FirstName":"firstName1", "LastName":"lastName1"},
{"Id":"2", "FirstName":"firstName2", "LastName":"lastName2"},
{"Id":"3", "FirstName":"firstName2", "LastName":"lastName3"}
]
Json 2:
[
{"School":"School1", "Id":"1", "Degree":"Degree1"},
{"School":"School2", "Id":"2", "Degree":"Degree2"},
{"School":"School3", "Id":"3", "Degree":"Degree3"}
]
Combined Json Based on Id
[
{"Id":"1", "FirstName":"firstName1", "LastName":"lastName1",
"School":"School1", "Degree":"Degree1"},
{"Id":"2", "FirstName":"firstName2", "LastName":"lastName2",
"School":"School2", "Degree":"Degree2"},
{"Id":"3", "FirstName":"firstName2", "LastName":"lastName3",
"School":"School3", "Degree":"Degree3"}
]
I have already tried a few ways to merge these jsons I found in this thread such as:
jq -s '.[0] * .[1]' file1 file2
I am still a novice in jq, so any help would be appreciated!
Use the SQL-Style Operators JOIN and INDEX
jq 'JOIN(INDEX(inputs[];.Id);.[];.Id;add)' json1 json2
[
{
"Id": "1",
"FirstName": "firstName1",
"LastName": "lastName1",
"School": "School1",
"Degree": "Degree1"
},
{
"Id": "2",
"FirstName": "firstName2",
"LastName": "lastName2",
"School": "School2",
"Degree": "Degree2"
},
{
"Id": "3",
"FirstName": "firstName2",
"LastName": "lastName3",
"School": "School3",
"Degree": "Degree3"
}
]
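For readers whose jq predates the SQL-style operators (they arrived around jq 1.6, as far as I know), a rough alternative is to slurp both files, concatenate the two arrays, group by Id, and add each group together. This is a different technique from the JOIN/INDEX answer, sketched on a shortened copy of the question's data; the temporary file names are placeholders. Unlike the JOIN version, it also keeps objects whose Id appears in only one file.

```shell
# Sketch only: merge two arrays of objects on "Id" via slurp + group_by + add.
tmp=$(mktemp -d)
cat > "$tmp/json1" <<'EOF'
[
  {"Id":"1", "FirstName":"firstName1", "LastName":"lastName1"},
  {"Id":"2", "FirstName":"firstName2", "LastName":"lastName2"}
]
EOF
cat > "$tmp/json2" <<'EOF'
[
  {"School":"School1", "Id":"1", "Degree":"Degree1"},
  {"School":"School2", "Id":"2", "Degree":"Degree2"}
]
EOF

# -s wraps both files in one array; add concatenates the two inner arrays;
# group_by(.Id) collects objects with the same Id; map(add) merges each group.
merged=$(jq -c -s 'add | group_by(.Id) | map(add)' "$tmp/json1" "$tmp/json2")
printf '%s\n' "$merged"
rm -rf "$tmp"
```

The original attempt with .[0] * .[1] fails because * merges objects recursively but is not defined for two arrays.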
I have a JSON:
{
"Country": "USA",
"State": "TX",
"Employees": [
{
"Name": "Name1",
"address": "SomeAdress1"
}
]
}
{
"Country": "USA",
"State": "FL",
"Employees": [
{
"Name": "Name2",
"address": "SomeAdress2"
},
{
"Name": "Name3",
"address": "SomeAdress3"
}
]
}
{
"Country": "USA",
"State": "CA",
"Employees": [
{
"Name": "Name4",
"address": "SomeAdress4"
}
]
}
I want to use jq to get the following result in csv format:
Country, State, Name, Address
USA, TX, Name1, SomeAdress1
USA, FL, Name2, SomeAdress2
USA, FL, Name3, SomeAdress3
USA, CA, Name4, SomeAdress4
I have got the following jq:
jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'
And I get the following with 2nd line having more columns than required. I want these extra columns in a separate row:
"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"
And I want the following result:
"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2"
"USA","FL","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"
You need to generate a separate array for each employee.
[.Country, .State] + (.Employees[] | [.Name, .address]) | @csv
You can store root object in a variable, and then expand the Employees arrays:
$ jq -r '. as $root | .Employees[]|[$root.Country, $root.State, .Name, .address] | @csv'
"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2"
"USA","FL","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"
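If the header row from the question is also wanted, one possible approach is to run jq with -n and read the objects through inputs, so the header is emitted once rather than once per input object. This is a sketch on shortened sample data; the header names are taken from the question, and inputs assumes jq 1.5 or later.

```shell
# Two concatenated JSON objects, shortened from the question's data.
input='{"Country":"USA","State":"TX","Employees":[{"Name":"Name1","address":"SomeAdress1"}]}
{"Country":"USA","State":"FL","Employees":[{"Name":"Name2","address":"SomeAdress2"},{"Name":"Name3","address":"SomeAdress3"}]}'

# -n stops jq from consuming an input implicitly; inputs then yields each object.
# The header array comes first, followed by one array per employee.
csv=$(printf '%s\n' "$input" | jq -r -n '
  ["Country", "State", "Name", "Address"],
  (inputs | [.Country, .State] + (.Employees[] | [.Name, .address]))
  | @csv')
printf '%s\n' "$csv"
```

Without -n, the whole filter would run once per input object and the header would repeat before every group of rows.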
The other answers are good, but I want to talk about why your attempt doesn't work, as well as why it seems like it should.
You are wondering why this:
jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'
produces this:
"USA","TX","Name1","SomeAdress1"
"USA","FL","Name2","SomeAdress2","Name3","SomeAdress3"
"USA","CA","Name4","SomeAdress4"
perhaps because this:
jq '{Country:.Country,State:.State,Name:(.Employees[]|.Name)}'
produces this:
{
"Country": "USA",
"State": "TX",
"Name": "Name1"
}
{
"Country": "USA",
"State": "FL",
"Name": "Name2"
}
{
"Country": "USA",
"State": "FL",
"Name": "Name3"
}
{
"Country": "USA",
"State": "CA",
"Name": "Name4"
}
It turns out the difference is in what exactly [...] and {...} do in a jq filter. In the array constructor [...], the entire contents of the square brackets, commas and all, is a single filter, which is fully evaluated and all the results combined into one array. Each comma inside is simply the sequencing operator, which means generate all the values from the filter on its left, then all the values from the filter on its right. In contrast, the commas in the {...} object constructor are part of the syntax and just separate the fields of the object. If any of the field expressions yields multiple values, then multiple whole objects are produced. If several field expressions each yield multiple values, then you get a whole object for every combination of yielded values.
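The distinction can be seen in isolation with two tiny filters that take no input. This is a minimal sketch; the field names a and b are made up for illustration.

```shell
# Array constructor: the commas concatenate the streams, so exactly one array results.
arr=$(jq -n -c '[1, (2, 3)]')
printf '%s\n' "$arr"

# Object constructor: every combination of yielded values produces a whole object.
# Wrapping the objects in [...] and counting them shows 2 x 2 = 4 objects.
count=$(jq -n '[{a: (1, 2), b: (3, 4)}] | length')
printf '%s\n' "$count"
```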
When you do this:
jq -r '.|[.Country,.State,(.Employees[]|.Name,.address)] | @csv'
                  ^      ^                   ^
                  1      2                   3
the problem is that the commas labelled "1", "2" and "3" are all doing the same thing, evaluating all the values for the filter on the left, then all the values for the filter on the right. Then the array constructor catches all of them and produces a single array. The array constructor will never create more than one array for one input.
So with that in mind, you need to make sure that where you're expanding out .Employees[] isn't inside your array constructor. Here's another option to add to the answers you already have:
jq -r '.Employee=.Employees[]|[.Country,.State,.Employee.Name,.Employee.address]|@csv'
or indeed:
jq -r '.Employees[] as $e|[.Country,.State,$e.Name,$e.address]|@csv'
I'm trying to reshape a JSON document and I assumed it would be easy to do using jq, but I have been trying for several hours now with no success...
(Please note that I'm not a jq jedi and the doc did not help)
I want to go from this :
{
"results": [
{
"profile": {
"birthYear": 1900,
"locale": "en_EN",
"city": "Somewhere, Around",
"timezone": "2",
"age": 52,
"gender": "m"
},
"UID": "SQSQSQerl7XSQSqSsqSQ"
}
]
}
to this :
{
"birthYear": 1900,
"locale": "en_EN",
"city": "Somewhere, Around",
"timezone": "2",
"age": 52,
"gender": "m",
"UID": "SQSQSQerl7XSQSqSsqSQ"
}
I got the output below using this filter: .results[].profile , .results[].UID
{
"birthYear": 1900,
"locale": "en_EN",
"city": "Somewhere, Around",
"timezone": "2",
"age": 52,
"gender": "m"
}
"UID": "SQSQSQerl7XSQSqSsqSQ"
Thanks in advance for your help..
You can combine two objects with the addition operator.
jq '.results[] | .profile + {UID}'
.profile is already an object.
The other object is created with {}. {UID} is shorthand for {"UID" : .UID}
There are probably better ways, but here you go:
jq '.results[0].profile * .results[0] | del(.profile)'
Explanation:
Recursively merge the nested profile object with its containing object by means of A * B, then pipe the result to del(.profile) to remove the nested copy.
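Both answers give the same result for this document because the profile object and its parent share no keys. The two operators only differ when a key holding an object appears on both sides, which this small sketch with made-up keys illustrates.

```shell
# + is a shallow merge: the right-hand value for "a" replaces the left-hand one.
shallow=$(jq -n -c '{a: {x: 1}} + {a: {y: 2}}')

# * is a recursive merge: nested objects under the same key are combined.
deep=$(jq -n -c '{a: {x: 1}} * {a: {y: 2}}')

printf '%s\n%s\n' "$shallow" "$deep"
```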
I have the following json format that I need to convert to CSV
[{
"name": "joe",
"age": 21,
"skills": [{
"lang": "spanish",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
},
{
"name": "sarah",
"age": 34,
"skills": [{
"lang": "french",
"grade": "47",
"school": {
"name": "my school",
"url": "example.com/sp-school"
}
}, {
"lang": "english",
"grade": "87"
}]
}, {
"name": "jim",
"age": 26,
"skills": [{
"lang": "spanish",
"grade": "60"
}, {
"lang": "english",
"grade": "66",
"school": {
"name": "eg school",
"url": "eg-school.com"
}
}]
}
]
to convert to csv
name,age,grade,school,url,file,line_number
joe,21,47,"my school","example.com/sp-school",sample.json,1
jim,26,60,"","",sample.json,3
So each row should contain the top-level fields, the entry from the skills array where lang is spanish, and the school object from that entry if it exists.
I'd also like to add the file and line number it came from.
I would like to use jq for the job, but I can't figure out the syntax. Can anyone help me out?
With your data in input.json, and the following jq program in tocsv.jq:
.[]
| [.name, .age] +
(.skills[]
| select(.lang == "spanish")
| [.grade, .school.name, .school.url, input_filename, input_line_number] )
| @csv
the invocation:
jq -r -f tocsv.jq input.json
yields:
"joe",21,"47","my school","example.com/sp-school","input.json",51
"jim",26,"60",,,"input.json",51
If you want the number-valued strings converted to numbers, you could use the "tonumber" filter. If you want the null-valued fields replaced by strings, use e.g. .school.name // ""
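Applied to the program above, those two suggestions might look like the following. This is a sketch on a shortened copy of the question's data, with the filename and line-number columns left out to keep it small.

```shell
# Shortened sample: joe has a school for spanish, jim does not.
input='[{"name":"joe","age":21,"skills":[{"lang":"spanish","grade":"47","school":{"name":"my school","url":"example.com/sp-school"}},{"lang":"english","grade":"87"}]},
{"name":"jim","age":26,"skills":[{"lang":"spanish","grade":"60"}]}]'

# tonumber turns the grade string into a number;
# // "" substitutes an empty string when the school fields are null.
csv=$(printf '%s\n' "$input" | jq -r '
  .[]
  | [.name, .age] +
    (.skills[]
     | select(.lang == "spanish")
     | [(.grade | tonumber), (.school.name // ""), (.school.url // "")])
  | @csv')
printf '%s\n' "$csv"
```

With these changes the grade is printed without quotes, and the missing school fields appear as "" rather than as empty columns.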
Of course this approach doesn't yield a very useful line number. One approach that would yield higher granularity would be to stream the individual objects into jq, but then you'd lose the filename. To recover the filename you could pass it in as an argument. So you would have a pipeline like so:
jq -c '.[]' input.json | jq -r --arg file input.json -f tocsv2.jq
where tocsv2.jq would be like tocsv.jq above but without the initial .[] |, and with $file instead of input_filename.
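Spelled out with the filter inline instead of in tocsv2.jq, the pipeline could look roughly like this. The sample data is a cut-down version of the question's, and the exact line numbers reported by input_line_number depend on how jq reads its input, so treat them as approximate.

```shell
tmp=$(mktemp -d)
cat > "$tmp/input.json" <<'EOF'
[{"name":"joe","age":21,"skills":[{"lang":"spanish","grade":"47"}]},
 {"name":"sarah","age":34,"skills":[{"lang":"french","grade":"47"}]},
 {"name":"jim","age":26,"skills":[{"lang":"spanish","grade":"60"}]}]
EOF

# Stage 1 emits one object per line. Stage 2 reads that stream, so
# input_line_number advances per object instead of once for the whole file.
# The filename is passed in as $file because stage 2 reads from a pipe.
rows=$(jq -c '.[]' "$tmp/input.json" |
  jq -r --arg file input.json '
    [.name, .age] +
    (.skills[]
     | select(.lang == "spanish")
     | [.grade, .school.name, .school.url, $file, input_line_number])
    | @csv')
printf '%s\n' "$rows"
rm -rf "$tmp"
```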
Finally, please also consider using the TSV format (@tsv) rather than the rather messy CSV format (@csv).
I am using JQ 1.4 on Windows 64 bit machine.
Below are the contents of input file IP.txt
{
"results": [
{
"name": "Google",
"employees": [
{
"name": "Michael",
"division": "Engineering"
},
{
"name": "Laura",
"division": "HR"
},
{
"name": "Elise",
"division": "Marketing"
}
]
},
{
"name": "Microsoft",
"employees": [
{
"name": "Brett",
"division": "Engineering"
},
{
"name": "David",
"division": "HR"
}
]
}
]
}
{
"results": [
{
"name": "Amazon",
"employees": [
{
"name": "Watson",
"division": "Marketing"
}
]
}
]
}
The file contains two "results" objects. The first contains information for two companies, Google and Microsoft. The second contains information for Amazon.
I want to convert this JSON into csv file with company name and employee name.
"Google","Michael"
"Google","Laura"
"Google","Elise"
"Microsoft","Brett"
"Microsoft","David"
"Amazon","Watson"
I am able to write below script:
jq -r "[.results[0].name,.results[0].employees[0].name]|@csv" IP.txt
"Google","Michael"
"Amazon","Watson"
Can someone guide me to write the script without hardcoding the index values?
The script should be able to generate output for any number of results, each containing information for any number of companies.
I tried using below script which didn't generate expected output:
jq -r "[.results[].name,.results[].employees[].name]|@csv" IP.txt
"Google","Microsoft","Michael","Laura","Elise","Brett","David"
"Amazon","Watson"
You need to flatten down the results first to rows of company and employee names. Then with that, you can convert to csv rows.
map(.results | map({ cn: .name, en: .employees[].name } | [ .cn, .en ])) | add[] | @csv
Since you have a stream of inputs, you'll have to slurp (-s) it in. Since you want to output csv, you'll want to use raw output (-r).
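Put together, the full invocation might look like the sketch below, run here on a shortened copy of IP.txt. The second command is an alternative of my own, not part of the answer above: it avoids -s by binding the company name to a variable before expanding the employees. On Windows cmd the outer quotes would need to be double quotes, as in the question.

```shell
tmp=$(mktemp -d)
cat > "$tmp/IP.txt" <<'EOF'
{"results":[{"name":"Google","employees":[{"name":"Michael"},{"name":"Laura"}]},
            {"name":"Microsoft","employees":[{"name":"Brett"}]}]}
{"results":[{"name":"Amazon","employees":[{"name":"Watson"}]}]}
EOF

# The answer's filter, run with slurp (-s) and raw output (-r).
slurped=$(jq -r -s 'map(.results | map({cn: .name, en: .employees[].name} | [.cn, .en])) | add[] | @csv' "$tmp/IP.txt")

# Alternative without -s: keep the company name in $cn, then expand employees.
streamed=$(jq -r '.results[] | .name as $cn | .employees[] | [$cn, .name] | @csv' "$tmp/IP.txt")

printf '%s\n' "$slurped"
rm -rf "$tmp"
```

Both commands should print one "company","employee" row per employee, in input order.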