Nested filtering with jq - json

I'm a first-time jq user trying to filter objects based on a value inside them, and I'm struggling to figure it out.
I have a big JSON file with lots of product data like the sample below. I want to keep only the products that have a particular value in website_ids.
Example Input:
[{
"product_id": "2",
"sku": "PROD2",
"name": "Product Name 2",
"set": "4",
"type": "simple",
"category_ids": {
"item": "15"
},
"website_ids": {
"item": [
"1",
"4"
]}
},{
"product_id": "3",
"sku": "PROD3",
"name": "Product Name 3",
"set": "4",
"type": "simple",
"category_ids": {
"item": "15"
},
"website_ids": {
"item": [
"1",
"2"
]}
}]
Desired output:
[{
"product_id": "2",
"sku": "PROD2",
"name": "Product Name 2",
"set": "4",
"type": "simple",
"category_ids": {
"item": "15"
},
"website_ids": {
"item": [
"1",
"4"
]}
}]
I've tried a few different things but I'm clearly just not getting it.
jq 'map(.website_ids.item[] | contains("4"))'
Gives me:
[
false,
true,
false,
false
]
Which seems to match the website_ids items I want, but I'm not sure how to get the full JSON object from that.
Any help would be super appreciated! Thanks.
EDIT:
I've used this and it works with my example:
map(select(.website_ids.item[] | contains("4")))
What I've realised is that my example and the file I was actually testing on have some differences.
Sometimes a product has this for the website_id items:
"website_ids": {
"item": "2"
}
Which results in the error:
Cannot iterate over string ("2")
Is there a way around this?

All you need to do is add a select call in your map function, like so:
jq 'map(select(.website_ids.item[] | contains("4")))'
After your edit it's a bit more complicated, but you can work around it by checking the type of .website_ids.item and then, depending on that type, doing either a contains check or a simple equality check:
map((select((.website_ids.item | type) == "array") | select(.website_ids.item[] | contains("4"))), (select((.website_ids.item | type) == "string") | select (.website_ids.item == "4")))
Here it is formatted to be a bit more readable:
map(
(select((.website_ids.item | type) == "array") | select(.website_ids.item[] | contains("4"))),
(select((.website_ids.item | type) == "string") | select (.website_ids.item == "4"))
)
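One caveat: contains on strings is a substring test, so contains("4") would also match an ID such as "14", and iterating .item[] inside select can emit the same product more than once when several items match. Here is a minimal sketch of an alternative, assuming the IDs should be compared for exact equality. The sample input (including the PROD4 to PROD6 entries) is made up for illustration, and any/2 needs jq 1.5 or later:

```shell
# Hypothetical sample: "item" may be an array or a bare string.
input='[
  {"sku": "PROD2", "website_ids": {"item": ["1", "4"]}},
  {"sku": "PROD3", "website_ids": {"item": ["1", "2"]}},
  {"sku": "PROD4", "website_ids": {"item": "4"}},
  {"sku": "PROD5", "website_ids": {"item": ["14", "4", "4"]}},
  {"sku": "PROD6", "website_ids": {"item": "14"}}
]'

# Normalise "item" to a stream of strings, then keep a product if any of
# them equals "4" exactly. any/2 yields a single boolean, so each product
# is emitted at most once. The final map(.sku) only shortens the output.
result=$(printf '%s' "$input" | jq -c '
  map(select(
    any(.website_ids.item | if type == "array" then .[] else . end; . == "4")
  ))
  | map(.sku)
')
echo "$result"
```

Drop the trailing map(.sku) to get the full product objects back.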

Related

How to filter an array of json with jq in linux?

I have the following JSON input:
{
"paging": {
"count": 0,
"total": 0,
"offset": 0,
"max": 0
},
"executions": [
{
"id": 5,
"href": "https://localhost.com.br",
"permalink": "https://localhost.com.br",
"status": "succeeded",
"project": "PROJETO",
"executionType": "scheduled",
"date-started": {
"unixtime": 1660793400012,
"date": "2022-08-18T03:30:00Z"
},
"date-ended": {
"unixtime": 1660793409694,
"date": "2022-08-18T03:30:09Z"
},
"job": {
"id": "cdkwednweoi-8745bjdf-kcjkjr8745",
"averageDuration": 0,
"name": "routine",
"group": "",
"project": "PROJECT",
"description": "",
"href": "https://localhost.com.br",
"permalink": "https://localhost.com.br"
},
"description": "runner",
"argstring": null,
"serverUUID": "jdnsdnasldnaje382nf5ubv",
"successfulNodes": [
"84jsk937nf"
]
}
]
}
First I want to select an array by its property name. Then I want to select an object of the array by the value of its properties.
Example of the desired information in the output:
"href"
"status"
"project"
"date-started":
"unixtime": 48298437239847,
"date": "2022-07-17"
"date-ended":
"unixtime": 48298437239847,
"date": "2022-07-17"
"job":
"name": "cleaner"
I know how to get the first values:
jq -r '.executions[] | [.href, .status, .project]'
But I don't know how to get the others. I've tried:
jq '.executions[] | with_entries( select(.value | has("date-started") ) )'
But it doesn't work.
Your first query produces a JSON array, so in this response, I'll assume it will suffice to produce an array of the eight values of interest in the order you've specified.
With your input, the following invocation produces the eight values as shown below:
jq '.executions[]
| [.href, .status, .project,
(."date-started" | (.unixtime, .date)),
(."date-ended" | (.unixtime, .date)),
.job.name]'
Output:
[
"https://localhost.com.br",
"succeeded",
"PROJETO",
1660793400012,
"2022-08-18T03:30:00Z",
1660793409694,
"2022-08-18T03:30:09Z",
"routine"
]
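If you would rather keep the key names, as in the desired output in the question, one possible variation is to build an object instead of an array. This is only a sketch: the input is a trimmed copy of the sample from the question, and the output shape is an assumption about what is wanted.

```shell
# Trimmed copy of the question's sample input.
input='{"executions":[{"href":"https://localhost.com.br","status":"succeeded","project":"PROJETO","date-started":{"unixtime":1660793400012,"date":"2022-08-18T03:30:00Z"},"date-ended":{"unixtime":1660793409694,"date":"2022-08-18T03:30:09Z"},"job":{"name":"routine","group":""}}]}'

# {href, status, project} is shorthand for {href: .href, ...}.
# Keys containing a dash must be quoted, both as object keys and in paths.
result=$(printf '%s' "$input" | jq -c '
  .executions[]
  | { href, status, project,
      "date-started": ."date-started",
      "date-ended": ."date-ended",
      job: { name: .job.name } }
')
echo "$result"
```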

Retrieve value based on contents of another value

I have this JSON and I am trying to get just the id out of it, based on a contains match against another value. I am able to do the contains part in jq, but when I add | .id I cannot get a result.
{
"restrictions": [
{
"id": 1,
"database": {
"match": "exact",
"value": "db_contoso"
},
"measurement": {},
"permissions": [
"write"
]
},
{
"id": 2,
"database": {
"match": "exact",
"value": "db2_contoso"
},
"measurement": {},
"permissions": [
"write"
]
}
]
}
When I run
jq -r '.restrictions[] | .database.value | select(contains("conto")?)'
I get the values db_contoso and db2_contoso, but I am trying to pull just the id based on that. When I add | .id to the end of that command I get nothing.
Select the whole object that matches the condition, then get its .id value:
jq '.restrictions[] | select(.database.value | contains("conto")).id'
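The reason the original attempt cannot reach .id is that after `| .database.value` the stream holds only the strings, so the enclosing object is gone. Putting that path inside select keeps the object as the output. A small sketch of this, using a trimmed copy of the sample plus a made-up third entry (id 3, db_other) to show a non-match:

```shell
# Trimmed sample; the entry with id 3 is hypothetical.
input='{"restrictions":[
  {"id":1,"database":{"match":"exact","value":"db_contoso"}},
  {"id":2,"database":{"match":"exact","value":"db2_contoso"}},
  {"id":3,"database":{"match":"exact","value":"db_other"}}
]}'

# The test runs on .database.value, but select passes the whole object
# through, so .id is still available afterwards.
ids=$(printf '%s' "$input" | jq -c '
  [ .restrictions[] | select(.database.value | contains("conto")) | .id ]
')
echo "$ids"
```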

select value from subfield that is inside an array

I have a JSON object that looks something like this:
{
"a": [{
"name": "x",
"group": [{
"name": "tom",
"publish": true
},{
"name": "joe",
"publish": true
}]
}, {
"name": "y",
"group": [{
"name": "tom",
"publish": false
},{
"name": "joe",
"publish": true
}]
}]
}
I want to select all the entries where publish=true and create a simplified JSON array of objects like this:
[
{
"name": "x",
"groupName": "tom"
},
{
"name": "x",
"groupName": "joe"
},
{
"name": "y",
"groupName": "joe"
}
]
I've tried many combinations, but the fact that group is an array seems to prevent each of them from working. Both in this specific case and in general, how do you do a deep select without losing the full hierarchy?
Using <expression> as $varname lets you store a value in a variable before going deeper into the hierarchy.
jq -n '
[inputs[][]
| .name as $name
| .group[]
| select(.publish == true)
| {name: $name, groupName: .name}
]' <input.json
You can use this:
jq '.[]|map(
.name as $n
| .group[]
| select(.publish==true)
| {name:$n,groupName:.name}
)' file.json
A shorter, effective alternative:
.a | map({name, groupName: (.group[] | select(.publish) .name)})

jq get the value of x based on y in a complex json file

jq strikes again. Trying to get the value of DATABASES_DEFAULT based on the name in a json file that has a whole lot of names and I'm completely lost.
My file looks like the following (output of an aws ecs describe-task-definition) only much more complex; I've stripped this to the most basic example I can where the structure is still intact.
{
"taskDefinition": {
"status": "bar",
"family": "bar2",
"volumes": [],
"taskDefinitionArn": "bar3",
"containerDefinitions": [
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo"
}
],
"name": "baz",
"links": []
},
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo2"
}
],
"name": "boo",
"links": []
}
],
"revision": 1
}
}
I need the value of DATABASES_DEFAULT where the name is baz. Note that there are a lot of keypairs with name, I'm specifically talking about the one outside of environment.
I've been tinkering with this but only got this far before realizing that I don't understand how to access nested values.
jq '.[] | select(.name==DATABASES_DEFAULT) | .value'
which is returning
jq: error: DATABASES_DEFAULT/0 is not defined at <top-level>, line 1:
.[] | select(.name==DATABASES_DEFAULT) | .value
jq: 1 compile error
Obviously this a) doesn't work, and b) even if it did, it's independent of the name value. My thought was to return all the db defaults and then identify the one with baz, but I don't know if that's the right approach.
I like to think of it as digging down into the structure, so first you open the outer layers:
.taskDefinition.containerDefinitions[]
Now select the one you want:
select(.name == "baz")
Open the inner structure:
.environment[]
Select the desired object:
select(.name == "DATABASES_DEFAULT")
Choose the key you want:
.value
Taken together:
parse.jq
.taskDefinition.containerDefinitions[] |
select(.name == "baz") |
.environment[] |
select(.name == "DATABASES_DEFAULT") |
.value
Run it like this:
<infile jq -f parse.jq
Output:
"foo"
The following seems to work:
.taskDefinition.containerDefinitions[] |
select(
select(
.environment[] | .name == "DATABASES_DEFAULT"
).name == "baz"
)
The output is the object with the name key mapped to "baz".
$ jq '.taskDefinition.containerDefinitions[] | select(select(.environment[]|.name == "DATABASES_DEFAULT").name=="baz")' tmp.json
{
"dnsSearchDomains": [],
"environment": [
{
"name": "bar4",
"value": "bar5"
},
{
"name": "bar6",
"value": "bar7"
},
{
"name": "DATABASES_DEFAULT",
"value": "foo"
}
],
"name": "baz",
"links": []
}
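If the container and variable names change between runs, the step-by-step filter can take them as parameters through jq's --arg option instead of hard-coding them. This is a sketch only: the variable names $container and $envname are made up for illustration, and the input is a trimmed copy of the sample.

```shell
# Trimmed copy of the question's sample input.
input='{"taskDefinition":{"containerDefinitions":[
  {"environment":[{"name":"bar4","value":"bar5"},
                  {"name":"DATABASES_DEFAULT","value":"foo"}],
   "name":"baz"},
  {"environment":[{"name":"DATABASES_DEFAULT","value":"foo2"}],
   "name":"boo"}
]}}'

# --arg binds a shell string to a jq variable; -r prints the raw string
# without JSON quotes.
value=$(printf '%s' "$input" | jq -r \
  --arg container baz --arg envname DATABASES_DEFAULT '
  .taskDefinition.containerDefinitions[]
  | select(.name == $container)
  | .environment[]
  | select(.name == $envname)
  | .value
')
echo "$value"
```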

How to filter an array of objects based on the value of an element in the JSON in jq

I am trying to get a value from JSON using jq.
I have to get an ID from the input JSON (activeitem) and use that ID to get the name of the matching element from the list of items below.
Can this be done in a single query?
{
"amazon": {
"activeitem" : 2,
"items": [
{
"id" : 1,
"name": "harry potter",
"state": "sold"
},
{
"id" : 2,
"name": "adidas shoes",
"state": "in inventory"
},
{
"id" : 3,
"name": "watch",
"state": "returned"
}
]
}
}
Right now I get the value first and then filter in a second step; instead, I want to do it in a single query.
With your data, the filter:
.amazon
| .activeitem as $id
| .items[]
| select(.id == $id)
| .name
produces:
"adidas shoes"
(Use the -r command-line option if you want the raw string.)