jq slurp and add key / value pairs - json

$DATA is a long string containing some email addresses.
echo "$DATA" | grep -Eo "\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" | sort | uniq | jq --slurp --raw-input 'split("\n")[:-1]'
Output:
[
"email1@mydomain.com",
"email2@mydomain.com",
"email3@mydomain.com",
"email4@mydomain.com"
]
Desired Output:
[
{
"email": "email1@mydomain.com",
"free": "0",
"used": "0"
},
{
"email": "email2@mydomain.com",
"free": "0",
"used": "0"
},
{
"email": "email3@mydomain.com",
"free": "0",
"used": "0"
},
{
"email": "email4@mydomain.com",
"free": "0",
"used": "0"
}
]
I guess it should be something like += {"free": "0"}

You can replace your current jq command with the following (the values are quoted so that they come out as the strings "0" shown in the desired output):
jq --slurp --raw-input 'split("\n")[:-1] | map({email: ., free: "0", used: "0"})'
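As a minimal end-to-end sketch of the same idea (the sample text in DATA is made up for illustration, and the values are written as "0" strings to match the desired output), the lines can also be read one at a time with -n/-R and inputs, which avoids splitting on newlines. This assumes jq 1.5 or later:

```shell
# Hypothetical sample input; the real $DATA is the long string from the question.
DATA='Write to email1@mydomain.com or email2@mydomain.com; again email1@mydomain.com'

# grep extracts the addresses, sort -u de-duplicates them, and jq reads each
# raw line (-R) through inputs (-n) and wraps it in an object.
result=$(printf '%s\n' "$DATA" \
  | grep -Eo '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,6}' \
  | sort -u \
  | jq -cnR '[inputs | {email: ., free: "0", used: "0"}]')

printf '%s\n' "$result"
```

Because inputs yields one string per line, there is no trailing empty element to trim with [:-1].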

Related

Nested Json parsing using jq in bash

I need to parse a JSON file. I tried a bash script with jq, but I am not getting the expected output.
JSON file:
[
{
"fqdn": "my-created-lb",
"status": "Active",
"members": {
"10.45.78.9:80": {
"dc": "NA",
"state": "enabled",
"port": 80
},
"10.45.78.10:80": {
"dc": "NA",
"state": "enabled",
"port": 80
},
"10.45.78.11:80": {
"dc": "NA",
"state": "enabled",
"port": 80
},
"10.45.78.12:80": {
"dc": "NA",
"state": "enabled",
"port": 80
}
}
}
]
I need output like this:
"my-created-lb"
"10.45.78.9:80","enabled"
"10.45.78.10:80","enabled"
"10.45.78.11:80","enabled"
"10.45.78.12:80","enabled"
I tried the jq command below, but it does not produce the expected output:
jq '.[] | .fqdn,.members,.members[].state'
But I am getting below output :
"my-created-lb"
{
"10.45.78.9:80": {
"dc": "NA",
"state": "enabled",
"port": 80
},
"10.45.78.10:80": {
"dc": "NA",
"state": "enabled",
"port": 80
},
"10.45.78.11:80": {
"dc": "NA",
"state": "enabled",
"port": 80
},
"10.45.78.12:80": {
"dc": "NA",
"state": "enabled",
"port": 80
}
}
"enabled"
"enabled"
"enabled"
"enabled"
Because you want the key's name, you could use to_entries like this:
jq -r '.[] | .fqdn, ( .members | to_entries | .[] | [ .key, .value.state ] | @tsv )'
Output:
my-created-lb
10.45.78.9:80 enabled
10.45.78.10:80 enabled
10.45.78.11:80 enabled
10.45.78.12:80 enabled
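To see why this works, it can help to look at what to_entries produces: it turns an object into an array of {key, value} pairs, so the member name becomes ordinary data. A small sketch on a made-up two-member object (the "disabled" state is invented for illustration, and jq 1.5 or later is assumed for @tsv):

```shell
# Hypothetical, shortened members object.
members='{"10.45.78.10:80":{"state":"enabled"},"10.45.78.9:80":{"state":"disabled"}}'

# Step 1: each member becomes {"key": <name>, "value": <original object>}.
entries=$(printf '%s' "$members" | jq -c 'to_entries')

# Step 2: pick the key and the nested state, then format as tab-separated text.
rows=$(printf '%s' "$members" | jq -r 'to_entries[] | [.key, .value.state] | @tsv')

printf '%s\n%s\n' "$entries" "$rows"
```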
Edit
To get the exact output listed in the original post, modify the jq script to be:
jq -r '.[] | [.fqdn], ( .members | to_entries | .[] | [ .key, .value.state ] ) | @csv'
Output:
"my-created-lb"
"10.45.78.9:80","enabled"
"10.45.78.10:80","enabled"
"10.45.78.11:80","enabled"
"10.45.78.12:80","enabled"
To get your desired output (apart from row order, since keys returns the member names sorted; keys_unsorted keeps the input order):
jq -r '.[] | ([.fqdn]|@csv), (.members | keys[] as $k | [$k, .[$k].state] | @csv)' file.json
"my-created-lb"
"10.45.78.10:80","enabled"
"10.45.78.11:80","enabled"
"10.45.78.12:80","enabled"
"10.45.78.9:80","enabled"

how to parse json with various arrays

I have a json file that looks like:
[
{
"id": "aaa",
"idMembers": [
"David",
"Mary"
],
"actions": [
{
"id": "1",
"date": "2019-08-28"
},
{
"id": "2",
"date": "2019-08-29"
},
{
"id": "3",
"date": "2019-08-30"
}
]
},
{
"id": "bbb",
"idMembers": [
"Mar",
"Alex"
],
"actions": [
{
"id": "1",
"date": "2019-07-28"
},
{
"id": "2",
"date": "2019-07-29"
}
]
}
]
I would like to obtain a result like:
["David", "Mary", "1", "2019-08-28"]
["David", "Mary", "2", "2019-08-29"]
["David", "Mary", "3", "2019-08-30"]
["Mar", "Alex", "1", "2019-07-28"]
["Mar", "Alex", "2", "2019-07-29"]
I tried:
jq -c '.[] | [ .idMembers[], .actions[].id, .actions[].date] '
But results are:
["David", "Mary", "1", "2", "3", "2019-08-28", "2019-08-29", "2019-08-30"]
["Mar", "Alex", "1", "2", "2019-07-28", "2019-07-29"]
I would like to do something like:
jq -c '.[] | .idMembers[], .actions[] | [ .id, .date] '
but it returns:
jq: error (at :1268): Cannot index string with string "id"
Is it possible to do something similar to this?
jq -c '.[] | .actions[] | [.idMembers[], .id, .date] '
Make an array out of each object under actions and add it to idMembers.
.[] | .idMembers + (.actions[] | map(.))
map(.) can also be written as [.[]]. For clarification, the above is the same as:
.[] | .idMembers + (.actions[0] | map(.)),
.idMembers + (.actions[1] | map(.)),
.idMembers + (.actions[2] | map(.)),
...
.idMembers + (.actions[n] | map(.))
where n is the index of the last element in actions.
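A runnable sketch of this idea on a shortened copy of the sample data (trimmed here for brevity). It spells the fields out as [.id, .date] rather than map(.), a deliberate substitution that makes the column order explicit instead of relying on the key order inside each action object:

```shell
# Shortened, illustrative copy of the input from the question.
json='[
  {"id":"aaa","idMembers":["David","Mary"],
   "actions":[{"id":"1","date":"2019-08-28"},{"id":"2","date":"2019-08-29"}]},
  {"id":"bbb","idMembers":["Mar","Alex"],
   "actions":[{"id":"1","date":"2019-07-28"}]}
]'

# One output array per action: the member names followed by that action's fields.
result=$(printf '%s' "$json" | jq -c '.[] | .idMembers + (.actions[] | [.id, .date])')

printf '%s\n' "$result"
```

Because .actions[] is a generator, the addition is evaluated once per action, which is what produces one row per action.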

Output object of two json files with two filters using jq

I have these two json files :
File 0.json
{
"Feline": [
{
"Name": "Leo",
"Race": "Bengal",
"Weight": "12"
},
{
"Name": "Diego",
"Race": "Toyger",
"Weight": "24"
}
]
}
File 1.json
{
"Feline": [
{
"Name": "Lynx",
"Race": "Bengal",
"Weight": "15"
},
{
"Name": "Simba",
"Race": "Ussuri",
"Weight": "14"
}
]
}
With jq, I would like to get the heaviest Feline whose Race is Bengal across these two JSON files.
So the output will be
{
"Feline": [
{
"Name": "Lynx",
"Race": "Bengal",
"Weight": "15"
}
]
}
I tried combining --slurp and --arg and piping into max, without success.
If someone knows how to do this, I would appreciate the help.
$ jq -n '{Feline: [
[inputs.Feline[] | select(.Race=="Bengal")] | max_by(.Weight)
]}' 0.json 1.json
{
"Feline": [
{
"Name": "Lynx",
"Race": "Bengal",
"Weight": "15"
}
]
}
With the following pipeline:
$ jq -s '[.[] | .Feline[] | select(.Race == "Bengal")] | max_by(.Weight) | {"Feline": [.]}' 0.json 1.json
{
"Feline": [
{
"Name": "Lynx",
"Race": "Bengal",
"Weight": "15"
}
]
}
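One caveat with both answers: Weight is stored as a string, so max_by(.Weight) compares strings rather than numbers, and "9" would rank above "15". A hedged sketch that converts with tonumber first (the two inline documents are hypothetical stand-ins for 0.json and 1.json, with Leo's weight changed to "9" to show the difference):

```shell
# Hypothetical stand-ins for the two files; Leo's weight is changed to "9".
f0='{"Feline":[{"Name":"Leo","Race":"Bengal","Weight":"9"},{"Name":"Diego","Race":"Toyger","Weight":"24"}]}'
f1='{"Feline":[{"Name":"Lynx","Race":"Bengal","Weight":"15"},{"Name":"Simba","Race":"Ussuri","Weight":"14"}]}'

# -n plus inputs reads both documents; tonumber makes the comparison numeric.
result=$(printf '%s\n' "$f0" "$f1" | jq -cn '
  {Feline: [[inputs | .Feline[] | select(.Race == "Bengal")]
            | max_by(.Weight | tonumber)]}
')

printf '%s\n' "$result"
```

With real files, the same filter can be run as jq -n '...' 0.json 1.json instead of piping the documents in.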

JQ, how to count depending on conditions?

Using jq, I need to count the elements of an array that meet two criteria: status === 'skipped' && ref.includes(version)
[
{
"id": 15484,
"sha": "52606c8da57984d1243f436e5d12e275db29a6e0",
"ref": "v1.4.15",
"status": "canceled"
},
{
"id": 15483,
"sha": "52606c8da57984d1243f436e5d12e275db29a6e0",
"ref": "v1.4.15",
"status": "canceled"
},
{
"id": 15482,
"sha": "1b4ccc1dc17e9b8ddb24550c5566d2be6b03465e",
"ref": "dev",
"status": "success"
},
{
"id": 15481,
"sha": "5b6ec939739c5a1513634f3b58bf96522917571d",
"ref": "dev",
"status": "failed"
},
{
"id": 15480,
"sha": "ec18d46f491a4645c68388df91fc41455b421e71",
"ref": "dev",
"status": "failed"
},
{
"id": 15479,
"sha": "dd83a6d6e58cc5114aed8016341ab3c5b3ebb702",
"ref": "dev",
"status": "failed"
},
{
"id": 15478,
"sha": "18ccaf4bc37bf65470b2c6ddaa69e5b4018354a7",
"ref": "dev",
"status": "success"
},
{
"id": 15477,
"sha": "f90900d733bce2be3d9ba9db25f8b51296bc6f3f",
"ref": "dev",
"status": "failed"
},
{
"id": 15476,
"sha": "3cf0431a161e6c9ca90e8248af7b4ec39c54bfb1",
"ref": "dev",
"status": "failed"
},
{
"id": 15285,
"sha": "d24b46edc75d8f7308dbef37d7b27625ef70c845",
"ref": "dev",
"status": "success"
},
{
"id": 15265,
"sha": "52606c8da57984d1243f436e5d12e275db29a6e0",
"ref": "v1.4.15",
"status": "success"
},
{
"id": 15264,
"sha": "9a15f8d4c950047f88c642abda506110b9b0bbd7",
"ref": "v1.4.15-static",
"status": "skipped"
},
{
"id": 15263,
"sha": "9a15f8d4c950047f88c642abda506110b9b0bbd7",
"ref": "v1.4.15-static",
"status": "skipped"
},
{
"id": 15262,
"sha": "76451d2401001c4c51b9800d3cdf62e4cdcc86ba",
"ref": "v1.4.15-no-js",
"status": "skipped"
},
{
"id": 15261,
"sha": "76451d2401001c4c51b9800d3cdf62e4cdcc86ba",
"ref": "v1.4.15-no-js",
"status": "skipped"
},
{
"id": 15260,
"sha": "515cd1b00062e9cbce05420036f5ecc7a898a4bd",
"ref": "v1.4.15-cli",
"status": "skipped"
},
{
"id": 15259,
"sha": "515cd1b00062e9cbce05420036f5ecc7a898a4bd",
"ref": "v1.4.15-cli",
"status": "skipped"
},
{
"id": 15258,
"sha": "b67acd3082da795f022fafc304d267d3afd6b736",
"ref": "v1.4.15-node",
"status": "skipped"
},
{
"id": 15257,
"sha": "b67acd3082da795f022fafc304d267d3afd6b736",
"ref": "v1.4.15-node",
"status": "skipped"
},
{
"id": 15256,
"sha": "4da4a788a85d82527ea568fed4f03da193842a80",
"ref": "v1.4.15-bs-redux-saga-router-dom-intl",
"status": "skipped"
}
]
We would also like to use environment variables for the query:
status=skipped
ref=v1.4.15
This works, but without the environment variables:
cat test.json | jq '[.[] | select(.status=="skipped") | select(.ref | startswith("v1.4.15"))] | length'
How is this possible?
Answer:
status=skipped; ref=v1.4.15; cat test.json | jq --arg REF "$ref" --arg STATUS "$status" -r '[.[] | select(.status==$STATUS) | select(.ref | startswith($REF))] | length'
Use length at the end of the filter, after collecting the matching objects into an array:
jq '[.[] | select(.status == "skipped") | select(.ref | test("1\\.4\\.15"))] | length'
To return just the objects, leave out the length step:
jq '[.[] | select(.status == "skipped") | select(.ref | test("1\\.4\\.15"))]'
test() is a more flexible way to match a regex against JSON strings; startswith() and endswith() cannot match text in the middle of a string.
Using variables,
ref="1\.4\.15"
jq --arg status "$status" --arg ref "$ref" \
'[.[] | select(.status == $status) | select(.ref | test($ref))]|length' json
By using map(select(...)) or equivalent, you could use length, but it is generally more efficient to use a generic counting function, such as:
def sigma(s): reduce s as $s (null; .+$s);
sigma(.[] | select(.status=="skipped" and (.ref | startswith("v1.4.15") )) | 1)
Using shell and environment variables
Using shell and environment variables is covered in the jq manual, but in brief, one way to pass in string values is using the command-line option --arg, e.g. along the lines of:
jq --arg status "$status" --arg ref "$ref" -f program.jq test.json
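Putting the pieces together, a small self-contained sketch (the four-element array is a made-up excerpt, not the full data set). The first command passes the shell variables with --arg and counts with reduce; the second places them in jq's process environment and reads them through the env builtin, which assumes jq 1.5 or later:

```shell
# Made-up excerpt of the array from the question.
json='[
  {"ref":"v1.4.15","status":"canceled"},
  {"ref":"v1.4.15-static","status":"skipped"},
  {"ref":"v1.4.15-cli","status":"skipped"},
  {"ref":"dev","status":"skipped"}
]'
status=skipped
ref=v1.4.15

# Variant 1: shell variables handed over as jq variables $status and $ref.
count_arg=$(printf '%s' "$json" | jq --arg status "$status" --arg ref "$ref" '
  reduce (.[] | select(.status == $status and (.ref | startswith($ref)))) as $x (0; . + 1)
')

# Variant 2: the same values placed in the environment of jq and read via env.
count_env=$(printf '%s' "$json" | status="$status" ref="$ref" jq '
  [.[] | select(.status == env.status and (.ref | startswith(env.ref)))] | length
')

printf '%s %s\n' "$count_arg" "$count_env"
```

Variant 1 keeps the values out of the environment and starts the count at 0, so an empty match yields 0 rather than null.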
I know jq is popular around here, but may I suggest xidel? See http://videlibri.sourceforge.net/xidel.html.
Just like jq, it's a JSON interpreter, but besides JSONiq you can also use XPath/XQuery functions to do all sorts of cool stuff.
This would list all objects with the 2 criteria:
xidel -s test.json -e '$json()[status="skipped" and starts-with(ref,"v1.4.15")]'
To count them, simply enclose the query with the count() function:
xidel -s test.json -e 'count($json()[status="skipped" and starts-with(ref,"v1.4.15")])'
This returns 9.
With variables:
status=skipped
ref=v1.4.15
xidel -s test.json -e 'count($json()[status="'$status'" and starts-with(ref,"'$ref'")])'
For the sake of completeness, this would be an equivalent JSONiq query:
let $a := [
(: copy-paste the entire array here in plain JSON syntax --
omitted for the sake of brevity :)
]
return count(
for $obj in $a[]
where $obj.status eq "skipped"
and
matches($obj.ref, "^v")
return $obj
)

jq streaming of large json files to get only objects whose properties have a specific value

I have some rather large JSON files (~500 MB to 4 GB compressed) which I cannot load into memory for manipulation, so I am using the --stream option with jq.
For example my json might look like this - only bigger:
[{
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batters": {
"batter": [{
"id": "1001",
"type": "Regular"
}, {
"id": "1002",
"type": "Chocolate"
}, {
"id": "1003",
"type": "Blueberry"
}, {
"id": "1004",
"type": "Devil's Food"
}]
},
"topping": [{
"id": "5001",
"type": "None"
}, {
"id": "5002",
"type": "Glazed"
}, {
"id": "5005",
"type": "Sugar"
}, {
"id": "5007",
"type": "Powdered Sugar"
}, {
"id": "5006",
"type": "Chocolate with Sprinkles"
}, {
"id": "5003",
"type": "Chocolate"
}, {
"id": "5004",
"type": "Maple"
}]
}, {
"id": "0002",
"type": "donut",
"name": "Raised",
"ppu": 0.55,
"batters": {
"batter": [{
"id": "1001",
"type": "Regular"
}]
},
"topping": [{
"id": "5001",
"type": "None"
}, {
"id": "5002",
"type": "Glazed"
}, {
"id": "5005",
"type": "Sugar"
}, {
"id": "5003",
"type": "Chocolate"
}, {
"id": "5004",
"type": "Maple"
}]
}, {
"id": "0003",
"type": "donut",
"name": "Old Fashioned",
"ppu": 0.55,
"batters": {
"batter": [{
"id": "1001",
"type": "Regular"
}, {
"id": "1002",
"type": "Chocolate"
}]
},
"topping": [{
"id": "5001",
"type": "None"
}, {
"id": "5002",
"type": "Glazed"
}, {
"id": "5003",
"type": "Chocolate"
}, {
"id": "5004",
"type": "Maple"
}]
}]
If this were the type of file I could hold in memory, and I wanted to select objects that only have batter type "Chocolate", I could use:
cat sample.json | jq '.[] | select(.batters.batter[].type == "Chocolate")'
And I would only get back the full objects with ids "0001" and "0003"
But with streaming I know it's different.
I have been reading the jq documentation on streaming, but I am still quite confused, as the examples don't really demonstrate real-world problems with JSON.
Namely, is it even possible to select whole objects after streaming through their paths and identifying a notable event, or in this case a property value that matches a certain string?
I know that I can use:
cat sample.json | jq --stream 'select(.[0][1] == "batters" and .[0][2] == "batter" and .[0][4] == "type") | .[1]'
to give me all of the batter types. But is there a way to say: "If it's Chocolate, grab the object this leaf is a part of"?
Command:
$ jq -cn --stream 'fromstream(1|truncate_stream(inputs))' array_of_objects.json |
jq 'select(.batters.batter[].type == "Chocolate") | .id'
Output:
"0001"
"0003"
The first invocation of jq converts the array of objects into a stream of objects. The second is based on your invocation and can be tailored further to your needs.
Of course the two invocations can (and probably should) be combined into one, but you might want to use the first invocation to save the big file as a file containing the stream of objects.
By the way, it would probably be better to use the following select:
select( any(.batters.batter[]; .type == "Chocolate") )
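Combined into a single invocation, the two steps might look like the sketch below (the inline sample is a trimmed, hypothetical version of the data; for a real file, pass the file name to jq instead of piping):

```shell
# Trimmed, hypothetical version of the donut data.
json='[
  {"id":"0001","batters":{"batter":[{"id":"1001","type":"Regular"},{"id":"1002","type":"Chocolate"}]}},
  {"id":"0002","batters":{"batter":[{"id":"1001","type":"Regular"}]}},
  {"id":"0003","batters":{"batter":[{"id":"1002","type":"Chocolate"}]}}
]'

# fromstream(1|truncate_stream(inputs)) rebuilds one top-level array element
# at a time, so select only ever sees a single reconstructed object.
result=$(printf '%s' "$json" | jq -rn --stream '
  fromstream(1 | truncate_stream(inputs))
  | select(any(.batters.batter[]; .type == "Chocolate"))
  | .id
')

printf '%s\n' "$result"
```

Using any(...) here also means an object with several Chocolate batters is emitted once rather than once per match.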
Here is another approach. Start with a streaming filter filter1.jq that extracts the record number and the minimum set of attributes you need to process. E.g.
select(length==2)
| . as [$p, $v]
| {r:$p[0]}
| if $p[1] == "id" then .id = $v
elif $p[1] == "batters" and $p[-1] == "type" then .type = $v
else empty
end
Running this with
jq -M -c --stream -f filter1.jq bigdata.json
produces values like
{"r":0,"id":"0001"}
{"r":0,"type":"Regular"}
{"r":0,"type":"Chocolate"}
{"r":0,"type":"Blueberry"}
{"r":0,"type":"Devil's Food"}
{"r":1,"id":"0002"}
{"r":1,"type":"Regular"}
{"r":2,"id":"0003"}
{"r":2,"type":"Regular"}
{"r":2,"type":"Chocolate"}
Now pipe this into a second filter, filter2.jq, which does the processing you want on those attributes for each record:
foreach .[] as $i (
{c: null, r:null, id:null, type:null}
; .c = $i
| if .r != .c.r then .id=null | .type=null | .r=.c.r else . end # control break
| .id = if .c.id == null then .id else .c.id end
| .type = if .c.type == null then .type else .c.type end
; if ([.id, .type] | contains([null])) then empty else . end
)
| select(.type == "Chocolate").id
with a command like
jq -M -c --stream -f filter1.jq bigdata.json | jq -M -s -r -f filter2.jq
to produce
0001
0003
filter1.jq and filter2.jq do a little more than what you need for this specific problem but they can be generalized easily.
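For reference, the select(length==2) test in filter1.jq relies on the shape of the events that --stream emits: leaf values arrive as [path, value] pairs, while closing events carry only a path. A tiny illustration on a made-up one-object document:

```shell
# Made-up one-object document.
doc='{"id":"0001","type":"donut"}'

# All events: two [path, value] pairs and one closing event with just a path.
all_events=$(printf '%s' "$doc" | jq -c --stream '.')

# Keeping only events of length 2 drops the closing event.
leaf_events=$(printf '%s' "$doc" | jq -c --stream 'select(length == 2)')

printf '%s\n---\n%s\n' "$all_events" "$leaf_events"
```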