How to join JSON objects on particular fields using jq?

I have the following JSON structure:
{
"Reservations": [
{
"Id": "R-1",
"CustomerId": "1"
},
{
"Id": "R-2",
"CustomerId": "2"
}
],
"Customers": [
{
"Id": "1",
"Name": "customer 1"
},
{
"Id": "2",
"Name": "customer 2"
},
{
"Id": "3",
"Name": "customer 3"
}
]
}
I want to join Reservations with Customers and get something like this:
{
"ReservationId": "R-1",
"CustomerName": "customer 1"
}
{
"ReservationId": "R-2",
"CustomerName": "customer 2"
}
I've played with jq extensively, tried using multiple filters separated by commas, tried using variables, and read the docs, but it seems that such a simple task is impossible with jq. Or am I missing something?

Here's a simple solution using INDEX/2:
INDEX(.Customers[]; .Id) as $c
| .Reservations[]
| { ReservationId: .Id,
CustomerName: $c[.CustomerId].Name }
If your jq does not have INDEX/2 then now would be a good time to upgrade; otherwise, you can copy-and-paste its def from https://github.com/stedolan/jq/blob/master/src/builtin.jq, or you could use INDEX/3 as defined below.
INDEX/3
def INDEX(s; k; v):
reduce s as $x ({}; .[$x|k] = ($x|v));
INDEX(.Customers[]; .Id; .Name) as $c
| .Reservations[]
| { ReservationId: .Id,
CustomerName: $c[.CustomerId] }
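For readers who want to cross-check the join outside jq, here is a minimal Python sketch of the same idea: build a lookup table keyed by customer Id, then emit one record per reservation. The function name join_reservations is hypothetical and not part of jq, and the sketch assumes every reservation and customer carries the fields shown in the question.

```python
def join_reservations(doc):
    """Join Reservations to Customers on CustomerId == Id."""
    # Lookup table keyed by customer Id, like INDEX(.Customers[]; .Id).
    customers_by_id = {c["Id"]: c for c in doc["Customers"]}
    # One output record per reservation, like .Reservations[] | {...}.
    return [
        {
            "ReservationId": r["Id"],
            # .get() yields None for an unknown customer, similar to jq's null.
            "CustomerName": customers_by_id.get(r["CustomerId"], {}).get("Name"),
        }
        for r in doc["Reservations"]
    ]


doc = {
    "Reservations": [
        {"Id": "R-1", "CustomerId": "1"},
        {"Id": "R-2", "CustomerId": "2"},
    ],
    "Customers": [
        {"Id": "1", "Name": "customer 1"},
        {"Id": "2", "Name": "customer 2"},
        {"Id": "3", "Name": "customer 3"},
    ],
}
joined = join_reservations(doc)
```

A reservation whose CustomerId has no match gets a CustomerName of None, which mirrors the null the jq filter would produce in that case.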

Related

jq update json document to alter an array element

I know this has to be simple, but for some reason it's eluding me how to find an element given a condition and modify one of its fields. The doc should be fully output (sed style) with the edit made.
{
"state": "wait",
"steps": {
"step1": [
{ "name":"Foo", "state":"wait" },
{ "name":"Bar", "state":"wait" }
],
"step2": [
{ "name":"Foo", "state":"wait" },
{ "name":"Zoinks", "state":"ready" }
],
"step3": [
{ "name":"Foo", "state":"cancel" }
]
}
}
I'm expecting something like this should be workable.
jq '. | (select(.steps[][].name=="Foo" and .steps[][].state=="wait") |= . + {.state:"Ready"}'
or
jq '. | (select(.steps[][]) | if (.name=="Foo" and .state=="wait") then (.state="Ready") else . end)
The desired output, of course, would be
{
"state": "wait",
"steps": {
"step1": [
{ "name":"Foo", "state":"ready" },
{ "name":"Bar", "state":"wait" }
],
"step2": [
{ "name":"Foo", "state":"ready" },
{ "name":"Zoinks", "state":"ready" }
],
"step3": [
{ "name":"Foo", "state":"cancel" }
]
}
}
Instead, when I'm not getting cryptic errors, I'm either modifying a top-level field in the document, modifying the field for all the elements, or repeating the entire doc multiple times.
Any insights greatly appreciated.
Thanks.
p.s. is there a better syntax than [] to wildcard the named-elements under steps? Or after the pipe to identify the indices discovered by the select?
Pipe the output of .steps[][] into a select call that chooses the objects with the desired name and state values, then set the state value on the result.
$ jq '(.steps[][] | select(.name == "Foo" and .state == "wait")).state = "ready"' tmp.json
{
"state": "wait",
"steps": {
"step1": [
{
"name": "Foo",
"state": "ready"
},
{
"name": "Bar",
"state": "wait"
}
],
"step2": [
{
"name": "Foo",
"state": "ready"
},
{
"name": "Zoinks",
"state": "ready"
}
],
"step3": [
{
"name": "Foo",
"state": "cancel"
}
]
}
}
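The same "select the matching elements, then assign" logic can be sketched in Python for comparison. This is only an illustration under the assumption that steps maps names to lists of objects, as in the question; mark_ready is a hypothetical helper name.

```python
import copy


def mark_ready(doc, name="Foo", old_state="wait", new_state="ready"):
    """Return a copy of doc with matching step entries updated."""
    result = copy.deepcopy(doc)               # leave the input untouched
    for entries in result["steps"].values():  # like .steps[]
        for entry in entries:                 # like the second []
            # like select(.name == "Foo" and .state == "wait")
            if entry.get("name") == name and entry.get("state") == old_state:
                entry["state"] = new_state    # like .state = "ready"
    return result


doc = {
    "state": "wait",
    "steps": {
        "step1": [{"name": "Foo", "state": "wait"}, {"name": "Bar", "state": "wait"}],
        "step2": [{"name": "Foo", "state": "wait"}, {"name": "Zoinks", "state": "ready"}],
        "step3": [{"name": "Foo", "state": "cancel"}],
    },
}
updated = mark_ready(doc)
```

As with the jq filter, the whole document comes back and only the selected entries differ; the top-level state field is never touched because the loop only visits entries under steps.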
You can confirm the jq result using diff (the first jq just normalizes the formatting so that only the changes made by the second one show up in the diff):
$ diff <(jq . tmp.json) <(jq '...' tmp.json)
7c7
< "state": "wait"
---
> "state": "ready"
17c17
< "state": "wait"
---
> "state": "ready"

Add or Update a field in one JSON file from another JSON file based on matching field

I have two JSON files, a.json and b.json. The content of a.json is a JSON object, and the content of b.json is an array. I want to add or update the status field in each entry under mappings in a.json by retrieving the value from b.json.
a.json:
{
"title": 25886,
"data": {
"request": {
"c": 46369,
"t1": 1562050127.376641
}
},
"rs": {
"mappings": {
"12345": {
"id": "12345",
"name": "test",
"customer_id": "11228"
},
"45678": {
"id": "45678",
"name": "abc",
"customer_id": "11206"
}
}
}}
b.json:
[
{
"status": "pending",
"extra": {
"name": "test"
},
"enabled": true,
"id": "12345"
},
{
"status": "not_started",
"extra": {
"name": "abc"
},
"enabled": true,
"id": "45678"
}
]
Below is my expected output:
{
"title": 25886,
"data": {
"request": {
"c": 46369,
"t1": 1562050127.376641
}
},
"rs": {
"mappings": {
"12345": {
"id": "12345",
"name": "test",
"customer_id": "11228",
"status":"pending"
},
"45678": {
"id": "45678",
"name": "abc",
"customer_id": "11206",
"status":"not_started"
}
}
}}
In this expected JSON file, the value of the status field is retrieved from b.json based on a matching id value. How can I do this using jq?
For the purposes of this problem, b.json essentially defines a dictionary, so for simplicity, efficiency and perhaps elegance, it makes sense to start by using the builtin function INDEX to create the relevant dictionary:
INDEX( $b[] | {id, status}; .id )
This assumes an invocation of jq along the lines of:
jq --argfile b b.json -f update.jq a.json
(Yes, I know --argfile has been deprecated. Feel free to choose another way to set $b to the contents of b.json.)
Now, to perform the update, it will be simplest to use the "update" operator, |=, in conjunction with map_values. (Feel free to check the jq manual :-)
Putting everything together:
INDEX( $b[] | {id, status}; .id ) as $dict
| .rs.mappings |= map_values( .status = $dict[.id].status )
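The dictionary-then-update approach can also be sketched in Python, which may help when checking what the filter is expected to produce. The helper name add_status is hypothetical, and the sample data is a trimmed version of a.json and b.json from the question.

```python
import copy


def add_status(a, b):
    """Copy a and set .status on each mapping from the matching entry in b."""
    # Dictionary from id to status, like INDEX($b[] | {id, status}; .id).
    status_by_id = {entry["id"]: entry.get("status") for entry in b}
    result = copy.deepcopy(a)
    # Like .rs.mappings |= map_values(.status = $dict[.id].status).
    for mapping in result["rs"]["mappings"].values():
        mapping["status"] = status_by_id.get(mapping["id"])
    return result


a = {
    "title": 25886,
    "rs": {
        "mappings": {
            "12345": {"id": "12345", "name": "test", "customer_id": "11228"},
            "45678": {"id": "45678", "name": "abc", "customer_id": "11206"},
        }
    },
}
b = [
    {"status": "pending", "extra": {"name": "test"}, "enabled": True, "id": "12345"},
    {"status": "not_started", "extra": {"name": "abc"}, "enabled": True, "id": "45678"},
]
merged = add_status(a, b)
```

A mapping whose id is missing from b ends up with a status of None here, just as the jq version would assign null.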

Modifying array of key value in JSON jq

Suppose I have an original JSON document that looks like the following:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "user"
}
]
}
]
}
}
And I would like to modify the value for the matched key in place, like so:
jq '.taskDefinition.containerDefinitions[0].environment[] | select(.name=="DB_USERNAME") | .value="new"' json
I got the output
{
"name": "DB_USERNAME",
"value": "new"
}
But what I want is closer to an in-place modification: the whole original JSON with only the new value changed, like this:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "new"
}
]
}
]
}
}
Is it possible to do with jq or any known workaround?
Thank you.
Updated
For anyone looking to edit multiple values, here is the approach I use:
JQ=""
for e in DB_HOST=rds DB_USERNAME=xxx; do
k=${e%=*}
v=${e##*=}
JQ+="(.taskDefinition.containerDefinitions[0].environment[] | select(.name==\"$k\") | .value) |= \"$v\" | "
done
jq "${JQ%??}" json
I think there should be a more concise way, but this seems to work fine.
It is enough to assign to the path, if you are using |=, e.g.
jq '
(.taskDefinition.containerDefinitions[0].environment[] |
select(.name=="DB_USERNAME") | .value) |= "new"
' infile.json
Output:
{
"taskDefinition": {
"containerDefinitions": [
{
"name": "web",
"image": "my-image",
"environment": [
{
"name": "DB_HOST",
"value": "localhost"
},
{
"name": "DB_USERNAME",
"value": "new"
}
]
}
]
}
}
Here is a select-free solution using |=:
.taskDefinition.containerDefinitions[0].environment |=
map(if .name=="DB_USERNAME" then .value = "new"
else . end)
Avoiding select within the expression on the LHS of |= makes the solution more robust w.r.t. the version of jq being used.
You might like to consider this alternative to using |=:
walk( if type=="object" and .name=="DB_USERNAME"
then .value="new" else . end)
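To make the behaviour of the walk-based alternative concrete, here is a rough Python sketch of a bottom-up traversal that applies the same rule. The function walk_set is a hypothetical name used only for this illustration; it is not a jq or Python builtin.

```python
def walk_set(node, name, new_value):
    """Rebuild node bottom-up, setting "value" on objects whose "name" matches."""
    if isinstance(node, dict):
        # Visit the children first, as jq's walk does.
        updated = {k: walk_set(v, name, new_value) for k, v in node.items()}
        if updated.get("name") == name:
            updated["value"] = new_value
        return updated
    if isinstance(node, list):
        return [walk_set(item, name, new_value) for item in node]
    return node  # scalars are returned unchanged


doc = {
    "taskDefinition": {
        "containerDefinitions": [
            {
                "name": "web",
                "image": "my-image",
                "environment": [
                    {"name": "DB_HOST", "value": "localhost"},
                    {"name": "DB_USERNAME", "value": "user"},
                ],
            }
        ]
    }
}
result = walk_set(doc, "DB_USERNAME", "new")
```

Note the trade-off this makes visible: the traversal touches every object in the document whose name matches, not only those under environment, so the path-based |= forms are the safer choice when the same name can appear elsewhere.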

How to format a csv file using json data?

I have a json file that I need to convert to a csv file, but I am a little wary of trusting a json-to-csv converter site as the outputted data seems to be incorrect... so I was hoping to get some help here!
I have the following json file structure:
{
"GroupName": "GrpName13",
"Number": 3,
"Notes": "Test Group ",
"Units": [
{
"UnitNumber": "TestUnit13",
"DataSource": "Factory",
"ContractNumber": "TestContract13",
"CarNumber": "2",
"ControllerTypeMessageId" : 4,
"NumberOfLandings": 4,
"CreatedBy": "user1",
"CommissionModeMessageId": 2,
"Details": [
{
"DetailName": "TestFloor13",
"DetailNumber": "5"
}
],
"UnitDevices": [
{
"DeviceTypeMessageId": 1,
"CreatedBy": "user1"
}
]
}
]
}
The issue I think I'm seeing is that the converters cannot handle the many nested data values. The reason I think the converters are wrong is that when I try to convert back to JSON using them, I don't get the same structure.
Does anyone know how to manually format this JSON into CSV format, or know of a reliable converter that can handle nested values?
Try
www.json-buddy.com/convert-json-csv-xml.htm
If that does not work for you, you can try this tool:
http://download.cnet.com/JSON-to-CSV/3000-2383_4-76680683.html
It should be helpful!
I tried your JSON with this converter:
http://www.convertcsv.com/json-to-csv.htm
As a result:
UnitNumber,DataSource,ContractNumber,CarNumber,ControllerTypeMessageId,NumberOfLandings,CreatedBy,CommissionModeMessageId,Details/0/DetailName,Details/0/DetailNumber,UnitDevices/0/DeviceTypeMessageId,UnitDevices/0/CreatedBy
TestUnit13,Factory,TestContract13,2,4,4,user1,2,TestFloor13,5,1,user1
This converter keeps the path of each key in the column name. For example, the 'DeviceTypeMessageId' key in the 'UnitDevices' list becomes the column 'UnitDevices/0/DeviceTypeMessageId'. This avoids clashes between keys that share a name, and it lets you recover each column's origin from the converter's naming rules.
Hope this helps.
Here is a solution using jq
If the file filter.jq contains
def denormalize:
def headers($p):
keys_unsorted[] as $k
| if .[$k]|type == "array" then (.[$k]|first|headers("\($p)\($k)_"))
else "\($p)\($k)"
end
;
def setup:
[
keys_unsorted[] as $k
| if .[$k]|type == "array" then [ .[$k][]| setup ]
else .[$k]
end
]
;
def iter:
if length == 0 then []
elif .[0]|type != "array" then
[.[0]] + (.[1:] | iter)
else
(.[0][] | iter) as $x
| (.[1:] | iter) as $y
| [$x[]] + $y
end
;
[ headers("") ], (setup | iter)
;
denormalize | @csv
and data.json contains (note extra samples added)
{
"GroupName": "GrpName13",
"Notes": "Test Group ",
"Number": 3,
"Units": [
{
"CarNumber": "2",
"CommissionModeMessageId": 2,
"ContractNumber": "TestContract13",
"ControllerTypeMessageId": 4,
"CreatedBy": "user1",
"DataSource": "Factory",
"Details": [
{
"DetailName": "TestFloor13",
"DetailNumber": "5"
}
],
"NumberOfLandings": 4,
"UnitDevices": [
{
"CreatedBy": "user1",
"DeviceTypeMessageId": 1
},
{
"CreatedBy": "user10",
"DeviceTypeMessageId": 10
}
],
"UnitNumber": "TestUnit13"
},
{
"CarNumber": "99",
"CommissionModeMessageId": 99,
"ContractNumber": "Contract99",
"ControllerTypeMessageId": 99,
"CreatedBy": "user99",
"DataSource": "Another Factory",
"Details": [
{
"DetailName": "TestFloor99",
"DetailNumber": "99"
}
],
"NumberOfLandings": 99,
"UnitDevices": [
{
"CreatedBy": "user99",
"DeviceTypeMessageId": 99
}
],
"UnitNumber": "Unit99"
}
]
}
then the command
jq -M -r -f filter.jq data.json
will produce
"GroupName","Notes","Number","Units_CarNumber","Units_CommissionModeMessageId","Units_ContractNumber","Units_ControllerTypeMessageId","Units_CreatedBy","Units_DataSource","Units_Details_DetailName","Units_Details_DetailNumber","Units_NumberOfLandings","Units_UnitDevices_CreatedBy","Units_UnitDevices_DeviceTypeMessageId","Units_UnitNumber"
"GrpName13","Test Group ",3,"2",2,"TestContract13",4,"user1","Factory","TestFloor13","5",4,"user1",1,"TestUnit13"
"GrpName13","Test Group ",3,"2",2,"TestContract13",4,"user1","Factory","TestFloor13","5",4,"user10",10,"TestUnit13"
"GrpName13","Test Group ",3,"99",99,"Contract99",99,"user99","Another Factory","TestFloor99","99",99,"user99",99,"Unit99"

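The denormalization used in the jq answer above (one CSV row per combination of nested array elements, with column names built from the key path) can be sketched in Python as follows. This is an illustration rather than a port of the filter: the function names are hypothetical, the sample is a trimmed version of data.json, and it assumes every array is non-empty and holds objects with the same keys.

```python
import csv
import io
from itertools import product


def headers(obj, prefix=""):
    """Column names; nested arrays contribute prefixed names from their first item."""
    names = []
    for key, value in obj.items():
        if isinstance(value, list):
            names.extend(headers(value[0], f"{prefix}{key}_"))
        else:
            names.append(f"{prefix}{key}")
    return names


def rows(obj):
    """Yield one flat row per combination of nested array elements."""
    parts = []
    for value in obj.values():
        if isinstance(value, list):
            # Each array element contributes its own partial rows.
            parts.append([row for item in value for row in rows(item)])
        else:
            parts.append([[value]])  # a scalar is a single one-cell choice
    for combo in product(*parts):
        yield [cell for part in combo for cell in part]


def to_csv(obj):
    buf = io.StringIO()
    # Quote strings but not numbers, similar in spirit to jq's @csv.
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(headers(obj))
    writer.writerows(rows(obj))
    return buf.getvalue()


sample = {
    "GroupName": "GrpName13",
    "Number": 3,
    "Units": [
        {"UnitNumber": "TestUnit13",
         "UnitDevices": [{"CreatedBy": "user1"}, {"CreatedBy": "user10"}]},
        {"UnitNumber": "Unit99",
         "UnitDevices": [{"CreatedBy": "user99"}]},
    ],
}
csv_text = to_csv(sample)
```

Because rows are expanded per combination, a unit with two devices appears twice, which is why such a CSV cannot be converted back into the original nested structure without extra knowledge of the layout.
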
jq get the value of x based on y in a complex json file

jq strikes again. Trying to get the value of DATABASES_DEFAULT based on the name in a json file that has a whole lot of names and I'm completely lost.
My file looks like the following (output of an aws ecs describe-task-definition) only much more complex; I've stripped this to the most basic example I can where the structure is still intact.
{
"taskDefinition": {
"status": "bar",
"family": "bar2",
"volumes": [],
"taskDefinitionArn": "bar3",
"containerDefinitions": [
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo"
}
],
"name": "baz",
"links": []
},
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo2"
}
],
"name": "boo",
"links": []
}
],
"revision": 1
}
}
I need the value of DATABASES_DEFAULT where the name is baz. Note that there are a lot of key/value pairs with name; I'm specifically talking about the one outside of environment.
I've been tinkering with this but only got this far before realizing that I don't understand how to access nested values.
jq '.[] | select(.name==DATABASES_DEFAULT) | .value'
which is returning
jq: error: DATABASES_DEFAULT/0 is not defined at <top-level>, line 1:
.[] | select(.name==DATABASES_DEFAULT) | .value
jq: 1 compile error
Obviously this a) doesn't work, and b) even if it did, it's independent of the name value. My thought was to return all the db defaults and then identify the one with baz, but I don't know if that's the right approach.
I like to think of it as digging down into the structure, so first you open the outer layers:
.taskDefinition.containerDefinitions[]
Now select the one you want:
select(.name =="baz")
Open the inner structure:
.environment[]
Select the desired object:
select(.name == "DATABASES_DEFAULT")
Choose the key you want:
.value
Taken together:
parse.jq
.taskDefinition.containerDefinitions[] |
select(.name =="baz") |
.environment[] |
select(.name == "DATABASES_DEFAULT") |
.value
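The "dig down one layer at a time" idea maps directly onto nested loops, so a Python sketch of the same lookup may be a useful mental model. The helper name database_default is hypothetical, and the sample is a trimmed version of the task definition from the question.

```python
def database_default(doc, container_name, variable="DATABASES_DEFAULT"):
    """Yield the value of `variable` for each container named `container_name`."""
    # Open the outer layers: .taskDefinition.containerDefinitions[]
    for container in doc["taskDefinition"]["containerDefinitions"]:
        # select(.name == "baz")
        if container.get("name") != container_name:
            continue
        # Open the inner structure: .environment[]
        for env in container.get("environment", []):
            # select(.name == "DATABASES_DEFAULT") | .value
            if env.get("name") == variable:
                yield env.get("value")


doc = {
    "taskDefinition": {
        "containerDefinitions": [
            {
                "environment": [
                    {"name": "bar4", "value": "bar5"},
                    {"name": "DATABASES_DEFAULT", "value": "foo"},
                ],
                "name": "baz",
            },
            {
                "environment": [
                    {"name": "bar4", "value": "bar5"},
                    {"name": "DATABASES_DEFAULT", "value": "foo2"},
                ],
                "name": "boo",
            },
        ]
    }
}
values = list(database_default(doc, "baz"))
```

Like the jq pipeline, this produces a stream of results rather than a single value, so zero or several matches are handled without special cases.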
Run it like this:
<infile jq -f parse.jq
Output:
"foo"
The following seems to work:
.taskDefinition.containerDefinitions[] |
select(
select(
.environment[] | .name == "DATABASES_DEFAULT"
).name == "baz"
)
The output is the object with the name key mapped to "baz".
$ jq '.taskDefinition.containerDefinitions[] | select(select(.environment[]|.name == "DATABASES_DEFAULT").name=="baz")' tmp.json
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo"
}
],
"name": "baz",
"links": []
}