jq - Looping through JSON and concatenating the output into a single string

I am currently learning the usage of jq. I have a JSON file, and I am able to loop through it and filter out the values I need. However, I am running into an issue when I try to combine the output into a single string instead of having it span multiple lines.
File svcs.json:
[
  {
    "name": "svc-A",
    "run": "True"
  },
  {
    "name": "svc-B",
    "run": "False"
  },
  {
    "name": "svc-C",
    "run": "True"
  }
]
I used the following jq filter to output the names of the services whose run value is "True":
jq -r '.[] | select(.run=="True") | .name ' svcs.json
I was getting the output as follows:
svc-A
svc-C
I was looking to get the output as a single string separated by commas.
Expected Output:
"svc-A,svc-C"
I tried using join, but have been unable to get it to work so far.

The .[] expression explodes the array into a stream of its elements. You'll need to collect the transformed stream (the names) back into an array. Then you can use the @csv filter for the final output:
$ jq -r '[ .[] | select(.run=="True") | .name ] | @csv' svcs.json
"svc-A","svc-C"
But here's where map comes in handy to operate on an array's elements:
$ jq -r 'map(select(.run=="True") | .name) | @csv' svcs.json
"svc-A","svc-C"

Keep the array using map instead of decomposing it with .[], then join with a glue string:
jq -r 'map(select(.run=="True") | .name) | join(",")' svcs.json
svc-A,svc-C
If your goal is to create CSV output, there is a special @csv filter that takes care of quoting, escaping, etc.
jq -r 'map(select(.run=="True") | .name) | @csv' svcs.json
"svc-A","svc-C"


Can this jq map be simplified?

Given this JSON:
{
  "key": "/books/OL1000072M",
  "source_records": [
    "ia:daywithtroubadou00pern",
    "bwb:9780822519157",
    "marc:marc_loc_2016/BooksAll.2016.part25.utf8:103836014:1267"
  ]
}
Can the following jq code be simplified?
jq -r '.key as $olid | .source_records | map([$olid, .])[] | @tsv'
The use of variable assignment feels like cheating and I'm wondering if it can be eliminated. The goal is to map the key value onto each of the source_records values and output a two column TSV.
Instead of mapping into an array and then iterating over it (map(…)[]), just create an array and collect its items ([…]). Also, you can get rid of the variable binding (as) by moving the second part into its own context using parens.
jq -r '[.key] + (.source_records[] | [.]) | @tsv'
Alternatively, instead of using @tsv you could build your tab-separated output string yourself, either by concatenation (… + …) or by string interpolation ("\(…)"):
jq -r '.key + "\t" + .source_records[]'
jq -r '"\(.key)\t\(.source_records[])"'
Output:
/books/OL1000072M ia:daywithtroubadou00pern
/books/OL1000072M bwb:9780822519157
/books/OL1000072M marc:marc_loc_2016/BooksAll.2016.part25.utf8:103836014:1267
It's not much shorter, but I think it's clearer than the original and clearer than the other shorter answers.
jq -r '.key as $olid | .source_records[] | [ $olid, . ] | @tsv'

Convert value of json from int to string using jq

Given a json that looks something like:
[{"id":1,"firstName":"firstName1","lastName":"lastName1"},
{"id":2,"firstName":"firstName2","lastName":"lastName2"},
{"id":3,"firstName":"firstName3","lastName":"lastName3"}]
What would be the best way to convert the id value from an int to a string and then save the file?
I have tried:
echo "$(jq -r '[.[] | .id = .id|tostring]' test.json)" > test.json
But that seems to put each entry into a string and add backslashes:
[
  "{\"id\":1,\"firstName\":\"firstName1\",\"lastName\":\"lastName1\"}",
  "{\"id\":2,\"firstName\":\"firstName2\",\"lastName\":\"lastName2\"}",
  "{\"id\":3,\"firstName\":\"firstName3\",\"lastName\":\"lastName3\"}"
]
| has a lower priority than the assignment (=). The expression .id = .id | tostring is interpreted as (.id = .id) | tostring.
The assignment doesn't change anything and can be removed, so the script becomes [ .[] | tostring ], which explains the output (each object is serialized as JSON into a string).
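To see the precedence difference in isolation (a minimal sketch using null input and compact output):
$ jq -cn '{id: 1} | .id = .id | tostring'
"{\"id\":1}"
$ jq -cn '{id: 1} | .id = (.id | tostring)'
{"id":"1"}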
The solution is to use parentheses to enforce the desired order of execution.
The command is:
jq '[ .[] | .id = (.id | tostring) ]' test.json
Do not use command substitution ($(...)) to compose an echo command line. It is inefficient and not needed.
Redirect the output of jq directly to a file, and use a different file than the input (the shell truncates the redirection target before jq reads it, so writing to the input file destroys your data).
jq '[ .[] | .id = (.id | tostring) ]' test.json > output.json
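If you really do need to update the input file in place, a common shell pattern (a sketch; assumes mktemp and mv are available) is to write to a temporary file and rename it over the original:
tmp=$(mktemp) &&
jq '[ .[] | .id = (.id | tostring) ]' test.json > "$tmp" &&
mv "$tmp" test.json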

jq how to pass json keys from a shell variable

I have a json file I am parsing with jq. This is a sample of the file
[{
  "key1": {...},
  "key2": {...}
}]
[{
  "key1": {...},
  "key2": {...}
}]
...
Each line is an array containing an object (I know the file as a whole is not technically valid JSON, but it is a stream of JSON values, which jq still works on).
The below jq command works:
cat file.json | jq -r '.[] | [.key1,.key2]'
The above correctly shows:
[
<value_of_key1>,<value_of_key2>
]
[
<value_of_key1>,<value_of_key2>
]
However, I want .key1,.key2 to be dynamic since these keys can change. So I want to pass a variable to jq. Something like:
$KEYS=.key1,.key2
cat file.json | jq -r --arg var "$KEYS" '.[] | [$var]'
But the above is returning the keys themselves:
[
".key1,.key2"
]
[
".key1,.key2"
]
Why is this happening? What is the correct command to make this work?
This answer does not help me. I am not getting any errors like the OP in that question.
Fetching the value of a jq variable doesn't cause it to be executed as jq code.
Furthermore, jq lacks the facility to take a string, compile it as jq code, and evaluate the result. (This is commonly known as eval.)
So, short of writing a jq parser and evaluator in jq, you will need to impose limits and/or accept a different format.
For example,
keys='[ [ "key1", "childkey" ], [ "key2", "childkey2" ] ]' # JSON
jq --argjson keys "$keys" '.[] | [ getpath( $keys[] ) ]' file.json
or
keys='key1.childkey,key2.childkey2'
jq --arg keys "$keys" '
  ( ( $keys / "," ) | map( . / "." ) ) as $keys |
  .[] | [ getpath( $keys[] ) ]
' file.json
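As a quick sanity check (using a hypothetical file matching the nested shape the question implies):
$ cat file.json
[{"key1":{"childkey":1},"key2":{"childkey2":2}}]
$ keys='key1.childkey,key2.childkey2'
$ jq -c --arg keys "$keys" '(($keys / ",") | map(. / ".")) as $keys | .[] | [getpath($keys[])]' file.json
[1,2]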
Suppose you have:
cat file
[{
  "key1": 1,
  "key2": 2
}]
[{
  "key1": 1,
  "key2": 2
}]
You can use a jq command like so:
jq '.[] | [.key1,.key2]' file
[
1,
2
]
[
1,
2
]
You can use -f to execute a filter from a file and nothing keeps you from creating the file separately from the shell variables.
Example:
keys=".key1"
echo ".[] | [${keys}]" >jqf
jq -f jqf file
[
1
]
[
1
]
Or just build the string directly into jq:
# note: double quotes let the shell expand ${keys}
jq ".[] | [${keys}]" file
You can use the --argjson option together with destructuring.
file.json
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"c":1},"key2":{"d":2}}]
$ in='["key1","key2"]'; jq -c --argjson keys "$in" '$keys as [$key1,$key2] | .[] | [.[$key1,$key2]]' file.json
output:
[{"a":1},{"b":2}]
[{"c":1},{"d":2}]
Elaborating on ikegami's answer.
To start with, here's my version of the answer:
$ in='key1.a,key2.b'; jq -c --arg keys "$in" '($keys/","|map(./".")) as $paths | .[] | [getpath($paths[])]' <<<$'[{"key1":{"a":1},"key2":{"b":2}}] [{"key1":{"a":3},"key2":{"b":4}}]'
This gives output
[1,2]
[3,4]
Let's try it.
We have input
[{"key1":{"a":1},"key2":{"b":2}}]
[{"key1":{"a":3},"key2":{"b":4}}]
And we want to construct the array
[["key1","a"],["key2","b"]]
and then pass it to the getpath(PATHS) builtin to extract values out of our input.
To start with, we are given a shell variable in with the string value key1.a,key2.b. Let's call this $keys.
Then $keys/"," gives (once per input document):
["key1.a","key2.b"]
["key1.a","key2.b"]
After that $keys/","|map(./".") gives what we want.
[["key1","a"],["key2","b"]]
[["key1","a"],["key2","b"]]
Let's call this $paths.
Now if we do .[]|[getpath($paths[])] we get the values from our input, equivalent to
[.[] | .key1.a, .key2.b]
which is
[1,2]
[3,4]

How do I write a jq query to convert a JSON file to CSV?

The JSON files look like:
{
  "name": "My Collection",
  "description": "This is a great collection.",
  "date": 1639717379161,
  "attributes": [
    {
      "trait_type": "Background",
      "value": "Sand"
    },
    {
      "trait_type": "Skin",
      "value": "Dark Brown"
    },
    {
      "trait_type": "Mouth",
      "value": "Smile Basic"
    },
    {
      "trait_type": "Eyes",
      "value": "Confused"
    }
  ]
}
I found a shell script that uses jq and has this code:
i=1
for eachFile in *.json; do
cat $i.json | jq -r '.[] | {column1: .name, column2: .description} | [.[] | tostring] | @csv' > extract-$i.csv
echo "converted $i of many json files..."
((i=i+1))
done
But its output is:
jq: error (at <stdin>:34): Cannot index string with string "name"
converted 1 of many json files...
Any suggestions on how I can make this work? Thank you!
Quick jq lesson
===========
jq filters are applied like this:
jq -r '.name_of_json_field_0 <optional filter>, .name_of_json_field_1 <optional filter>'
and so on and so forth. A single dot is the simplest filter; it leaves the data field untouched.
jq -r '.name_of_field | .'
You may also leave the trailing filter off entirely for the same effect.
In your case:
jq -r '.name, .description'
will extract the values of both those fields.
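For example, run against the sample document above (file name assumed):
$ jq -r '.name, .description' file.json
My Collection
This is a great collection.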
.[] will unwrap an array to have the next piped filter applied to each unwrapped value. Example:
jq -r '.attributes | .[]'
extracts each object in the attributes array.
You may sometimes want to repackage values in an array by surrounding the filter in brackets:
jq -r '[.name, .description, .date]'
You may sometimes want to repackage data in an object by surrounding the filter in curly braces:
jq -r '{new_field_name: .name, super_new_field_name: .description}'
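Against the sample document, that last filter would yield:
{
  "new_field_name": "My Collection",
  "super_new_field_name": "This is a great collection."
}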
Playing around with these, I was able to get:
jq -r '[.name, .description, .date, (.attributes | [.[] | .trait_type] | @csv | gsub(",";";") | gsub("\"";"")), (.attributes | [.[] | .value] | @csv | gsub(",";";") | gsub("\"";""))] | @csv'
to give us:
"My Collection","This is a great collection.",1639717379161,"Background;Skin;Mouth;Eyes","Sand;Dark Brown;Smile Basic;Confused"
Name, description, and date were left as is, so let's break down the weird parts, one step at a time.
.attributes | [.[] | .trait_type]
.[] extracts each element of the attributes array and pipes it into the next filter, which extracts trait_type; the surrounding brackets re-package the results into an array.
.attributes | [.[] | .trait_type] | @csv
turns the array into a CSV-formatted string.
(.attributes | [.[] | .trait_type] | @csv | gsub(",";";") | gsub("\"";""))
Parens separate this from the rest of the evaluations. The first gsub replaces commas with semicolons so they don't get interpreted as separate fields; the second removes the extra double quotes.
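For what it's worth, a somewhat cleaner sketch of the same idea replaces the inner @csv-then-gsub dance with map and join (untested beyond the sample document, where it produces the same output):
jq -r '[.name, .description, .date, (.attributes | map(.trait_type) | join(";")), (.attributes | map(.value) | join(";"))] | @csv'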

Can't put JSON output into CSV format with jq

I'm building a list of AWS EBS volume attributes so I can store it as CSV in a variable, using jq. I'm going to output the variable to a spreadsheet.
The first command gives the values I'm looking for using jq:
aws ec2 describe-volumes | jq -r '.Volumes[] | .VolumeId, .AvailabilityZone, .Attachments[].InstanceId, .Attachments[].State, (.Tags // [] | from_entries.Name)'
It gives output that I want, like this:
MIAPRBcdm0002_test_instance
vol-0105a1678373ae440
us-east-1c
i-0403bef9c0f6062e6
attached
MIAPRBcdwb00000_app1_vpc
vol-0d6048ec6b2b6f1a4
us-east-1c
MIAPRBcdwb00001 /carbon
vol-0cfcc6e164d91f42f
us-east-1c
i-0403bef9c0f6062e6
attached
However, if I put it into CSV format so I can output the variable to a spreadsheet, the command blows up and doesn't work:
aws ec2 describe-volumes | jq -r '.Volumes[] | .VolumeId, .AvailabilityZone, .Attachments[].InstanceId, .Attachments[].State, (.Tags // [] | from_entries.Name) | @csv'
jq: error (at <stdin>:4418): string ("vol-743d1234") cannot be csv-formatted, only array
Even putting the top level of the JSON into CSV format fails for EBS volumes:
aws ec2 describe-volumes | jq -r '.Volumes[].VolumeId | @csv'
jq: error (at <stdin>:4418): string ("vol-743d1234") cannot be csv-formatted, only array
Here is the AWS EBS volumes JSON file that I am working with, using these commands (the file has been cleaned of company identifiers, but is valid JSON).
How can I get this json into CSV format using jq?
You can only apply @csv to an array, so just enclose your filter within [..] as below:
jq -r '[.Volumes[] | .VolumeId, .AvailabilityZone, .Attachments[].InstanceId, .Attachments[].State, (.Tags // [] | from_entries.Name)] | @csv'
Using the above might still retain the quotes, so using join() would also be appropriate here:
jq -r '[.Volumes[] | .VolumeId, .AvailabilityZone, .Attachments[].InstanceId, .Attachments[].State, (.Tags // [] | from_entries.Name)] | join(",")'
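Note that both commands above pack every value into one long line. If the goal is one spreadsheet row per volume, a sketch of the same idea moves the brackets inside .Volumes[] (the // "" fallbacks for unattached volumes are my assumption about the data):
jq -r '.Volumes[] | [.VolumeId, .AvailabilityZone, (.Attachments[0].InstanceId // ""), (.Attachments[0].State // ""), (.Tags // [] | from_entries.Name // "")] | @csv'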
The accepted answer resolves another obscure jq error:
string ("xxx") cannot be csv-formatted, only array
In my case I did not want the entire output of jq, but rather each Elasticsearch document I supplied to jq to be printed as a CSV string on a line of its own. To accomplish this I simply moved the brackets to enclose only the items to be included on each line.
First, by placing my brackets only around items to be included on each line of output, I produced:
jq -r '.hits.hits[]._source | [.syscheck.path, .syscheck.size_after]'
[
  "/etc/group-",
  "783"
]
[
  "/etc/gshadow-",
  "640"
]
[
  "/etc/group",
  "795"
]
[
  "/etc/gshadow",
  "652"
]
[
  "/etc/ssh/sshd_config",
  "3940"
]
Piping this to | @csv prints each document's values of .syscheck.path and .syscheck.size_after, quoted and comma-separated, on a separate line:
$ jq -r '.hits.hits[]._source | [.syscheck.path, .syscheck.size_after] | @csv'
"/etc/group-","783"
"/etc/gshadow-","640"
"/etc/group","795"
"/etc/gshadow","652"
"/etc/ssh/sshd_config","3940"
Or to omit quotation marks, following the pattern noted in the accepted answer:
$ jq -r '.hits.hits[]._source | [.syscheck.path, .syscheck.size_after] | join(",")'
/etc/group-,783
/etc/gshadow-,640
/etc/group,795
/etc/gshadow,652
/etc/ssh/sshd_config,3940