jq variable seems to hide original input - json

I'm trying to parse through some json and put certain sections into variables. I think I'm not understanding something about how variables work though.
Json:
{
"resources": [
{
"type": "Microsoft.ApiManagement/service/apis"
},
{
"type": "Microsoft.ApiManagement/service/apis/schemas"
}
]
}
Then using this jq:
.resources[] | select(.type == "Microsoft.ApiManagement/service/apis") as $apis | { types: [.type], apis: $apis}
I get this:
{
"types": [
"Microsoft.ApiManagement/service/apis"
],
"apis": {
"type": "Microsoft.ApiManagement/service/apis"
}
}
When I expected this:
{
"types": [
"Microsoft.ApiManagement/service/apis",
"Microsoft.ApiManagement/service/apis/schemas"
],
"apis": {
"type": "Microsoft.ApiManagement/service/apis"
}
}
https://jqplay.org/s/4aeBOY9x6q
According to the variables section of the jq manual
The expression exp as $x | ... means: for each value of expression
exp, run the rest of the pipeline with the entire original input, and
with $x set to that value. Thus as functions as something of a foreach
loop.
Which makes me think that .type should return from the original set not the filtered result I stored in $apis. Where is the disconnect?

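It's the select that is "hiding" some of the input.
In your filter, everything after .resources[] runs once per array element, so inside that part of the pipeline . is a single resource, not the whole document. The "entire original input" the manual mentions is the input of the exp as $x expression itself, which here is that one resource. select then lets only the matching resource through, so the object is built once, and .type can only see that resource.
If all you need is the list of types, the simplest approach is not to use variables at all. You could, for example, simply write: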
.resources[].type
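If you do want the exact object shown in the question, one option is to parenthesize the whole expression that feeds the variable, so that the object after the | is built from the full document. This is only a sketch: the shell variable names are made up for the example, and the JSON is the sample from the question.

```shell
# Sample document from the question, held in a shell variable (name is arbitrary).
doc='{"resources":[{"type":"Microsoft.ApiManagement/service/apis"},{"type":"Microsoft.ApiManagement/service/apis/schemas"}]}'

# The parentheses make ".resources[] | select(...)" the source of $apis.
# The object constructor then receives the original document as ".",
# so .resources[].type still sees both entries.
result=$(printf '%s' "$doc" | jq -c '
  (.resources[] | select(.type == "Microsoft.ApiManagement/service/apis")) as $apis
  | {types: [.resources[].type], apis: $apis}
')
echo "$result"
```

Note that if several resources matched the select, this would emit one object per match.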

Related

jq output is empty when tag name does not exist

When I run the jq command to parse a json document from the amazon cli I have the following problem.
I’m extracting the IP address and a tag called "Environment". When the Environment tag does not exist on an instance, I get no result at all for that instance.
Here's an example of the relevant output returned by the AWS CLI
{
"Reservations": [
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.1",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
},
{
"Key": "Environment",
"Value": "alpha"
}
]
}
]
},
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.2",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
}
]
}
]
}
]
}
I’m running the following command
aws ec2 describe-instances --filters "Name=tag:Name,Values=Balance-OTA-SS_a" | jq -c '.Reservations[].Instances[] | ({IP: .PrivateIpAddress, Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)})'
## output
empty
How do I show the IP address in the output of the command even if the Environment tag does not exist?
Regards,
Let's assume this input:
{
"Reservations": [
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.1",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
},
{
"Key": "Environment",
"Value": "alpha"
}
]
}
]
},
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.2",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
}
]
}
]
}
]
}
This is the format returned by describe-instances, but with all the irrelevant fields removed.
Note that Tags is always a list of objects, each of which has a Key and a Value. This format is perfect for from_entries, which can transform this list of tags into a convenient mapping object. Try this:
.Reservations[].Instances[] |
{
IP: .PrivateIpAddress,
Ambiente: (.Tags|from_entries.Environment)
}
Output (with the -c option):
{"IP":"10.0.0.1","Ambiente":"alpha"}
{"IP":"10.0.0.2","Ambiente":null}
That answers how to do it. But you probably want to understand why your approach didn't work.
.Reservations[].Instances[] |
{
IP: .PrivateIpAddress,
Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)
}
The .[] filter you're using on the tags can return zero or multiple results. Similarly, the select filter can eliminate some or all items. When you apply this inside an object constructor (the expression from { to }), you're causing that whole object to be created a variable number of times. You need to be very careful where you use these filters, because often that's not what you want at all. Often you instead want to do one of the following:
Wrap the expression that returns multiple results in an array constructor [ ... ]. That way instead of outputting the parent object potentially zero or multiple times, you output it once containing an array that potentially has zero or multiple items. E.g.
[.Tags[]|select(.Key=="Environment")]
Apply map to the array to keep it an array but process its contents, e.g.
.Tags|map(select(.Key=="Environment"))
Apply first(expr) to capture only the first value emitted by the expression. If the expression might emit zero items, you can use the comma operator to provide a default, e.g.
first((.Tags[]|select(.Key=="Environment")),null)
Apply some other array-level function, such as from_entries.
.Tags|from_entries.Environment
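To see the difference concretely, here is a small sketch that runs three of these variants against the instance that has no Environment tag. The shell variable names are arbitrary, and first(...) assumes jq 1.5 or later.

```shell
# The instance from the sample input that lacks an Environment tag.
instance='{"PrivateIpAddress":"10.0.0.2","Tags":[{"Key":"Name","Value":"Balance-OTA-SS_a"}]}'

# Stream inside the object constructor: zero values, so zero objects are emitted.
streamed=$(printf '%s' "$instance" | jq -c '
  {IP: .PrivateIpAddress, Ambiente: (.Tags[] | select(.Key=="Environment") | .Value)}')

# Array constructor: the object is emitted once, holding an empty array.
wrapped=$(printf '%s' "$instance" | jq -c '
  {IP: .PrivateIpAddress, Ambiente: [.Tags[] | select(.Key=="Environment") | .Value]}')

# first(...) with a null fallback: the object is emitted once, holding null.
defaulted=$(printf '%s' "$instance" | jq -c '
  {IP: .PrivateIpAddress, Ambiente: first((.Tags[] | select(.Key=="Environment") | .Value), null)}')

printf 'streamed:  [%s]\nwrapped:   %s\ndefaulted: %s\n' "$streamed" "$wrapped" "$defaulted"
```

The first variant prints nothing for this instance, which is exactly the behaviour described in the question.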
Alternatively, you can use either an if ... then ... else ... end construct or the alternative operator //. The following omits the Ambiente key when the tag is missing:
.Reservations[].Instances[]
| {IP: .PrivateIpAddress} +
({Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)}
// null)

Update deeply nested field with value from higher-level object in JQ

Given the following input JSON:
{
"version": 2,
"models": [
{
"name": "first_table",
"tests": [
{
"dbt_utils.equal_rowcount": {
"compare_model": null
}
}
]
},
{
"name": "second_table",
"tests": [
{
"dbt_utils.equal_rowcount": {
"compare_model": null
}
}
]
}
]
}
How would I, using jq, replace the null (i.e., the value of "compare_model") with the value from the "name" key? Note that the key-value pairs in question here are not at the same level in the hierarchy: the former is nested in an object in an array, and it is this array that is at the same level as the latter.
For example, the output file should read:
{
"version": 2,
"models": [
{
"name": "first_table",
"tests": [
{
"dbt_utils.equal_rowcount": {
"compare_model": "first_table"
}
}
]
},
{
"name": "second_table",
"tests": [
{
"dbt_utils.equal_rowcount": {
"compare_model": "second_table"
}
}
]
}
]
}
FWIW, this is an intermediate step in some YAML (via yq, the Python wrapper variety of jq as opposed to the go variant) wrangling I'm doing on DBT config files.
(Bonus points if you can wrap the replacement text with parentheses and/or prefix it without breaking out of jq. :D If not, no worries -- this step I can do with another program.)
Needless to say, but your help is very much appreciated!
The key to a simple solution is to use |=, e.g.
.models |=
map(.name as $name
| (.tests[]."dbt_utils.equal_rowcount".compare_model =
$name))
To wrap the replacement value in parentheses, just add them:
.models |=
map("(\(.name))" as $name
| (.tests[]."dbt_utils.equal_rowcount".compare_model =
$name))
If you want the replacement to be conditional on the existing value being null, you could perhaps (depending on the exact requirements) use //=.
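As a rough sketch of that conditional variant: the input below is a trimmed version of the question's document, and the value "keep_me" is a made-up placeholder added only to show that an existing non-null value is left alone.

```shell
# Trimmed sample; "keep_me" is a hypothetical pre-existing value.
doc='{"models":[{"name":"first_table","tests":[{"dbt_utils.equal_rowcount":{"compare_model":null}}]},{"name":"second_table","tests":[{"dbt_utils.equal_rowcount":{"compare_model":"keep_me"}}]}]}'

# "a //= b" assigns b only where a is currently null or false.
result=$(printf '%s' "$doc" | jq -c '
  .models |= map(.name as $name
    | .tests[]."dbt_utils.equal_rowcount".compare_model //= $name)
')
echo "$result"
```

Only the first model's compare_model is filled in; the second keeps its existing value.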
Using //= and walk
Here's another take on the problem:
.models
|= map("(\(.name))" as $name
| walk(if type=="object" and has("compare_model")
then .compare_model //= $name
else . end))
That the fields are not at the same level doesn't really matter here.
.models[] |= (.tests[]."dbt_utils.equal_rowcount".compare_model = "(\(.name))")

jq - retrieve values from json table on one line for specific columns

I'm trying to get cell values from a json formatted table but only for specific columns and have it output into its own object.
json example -
{
"rows":[
{
"id":409363222161284,
"rowNumber":1,
"cells":[
{
"columnId":"nameColumn",
"value":"name1"
},
{
"columnId":"infoColumn",
"value":"info1"
},
{
"columnId":"excessColumn",
"value":"excess1"
}
]
},
{
"id":11312541213,
"rowNumber":2,
"cells":[
{
"columnId":"nameColumn",
"value":"name2"
},
{
"columnId":"infoColumn",
"value":"info2"
},
{
"columnId":"excessColumn",
"value":"excess2"
}
]
},
{
"id":11312541213,
"rowNumber":3,
"cells":[
{
"columnId":"nameColumn",
"value":"name3"
},
{
"columnId":"infoColumn",
"value":"info3"
},
{
"columnId":"excessColumn",
"value":"excess3"
}
]
}
]
}
Ideal output would be filtered by two columns - nameColumn, infoColumn - with each row being a single line of the values.
Output example -
{
"name": "name1",
"info": "info1"
}
{
"name": "name2",
"info": "info2"
}
{
"name": "name3",
"info": "info3"
}
I've tried quite a few different combinations of things with select statements and this is the closest I've come but it only uses one.
jq '.rows[].cells[] | {name: (select(.columnId=="nameColumn") .value), info: "infoHere"}'
{
"name": "name1",
"info": "infoHere"
}
{
"name": "name2",
"info": "infoHere"
}
{
"name": "name3",
"info": "infoHere"
}
If I try to combine another one, it's not so happy.
jq -j '.rows[].cells[] | {name: (select(.columnId=="nameColumn") .value), info: (select(.columnId=="infoColumn") .value)}'
Nothing is output.
** Edit **
Apologies for being unclear with this. The final output would ideally be a csv for the selected columns values
name1,info1
name2,info2
Presumably you would want the output to be grouped by row, so let's first consider:
.rows[].cells
| map(select(.columnId=="nameColumn" or .columnId=="infoColumn"))
This produces a stream of JSON arrays, the first of which using your main example would be:
[
{
"columnId": "nameColumn",
"value": "name1"
},
{
"columnId": "infoColumn",
"value": "info1"
}
]
If you want the output in some alternative format, then you could tweak the above jq program accordingly.
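For instance, to get the {name, info} objects shown in the question, one possible tweak is to turn each row's cells into a single object keyed by columnId and then pick the two columns. This is a sketch on a shortened copy of the sample data (two rows, ids omitted); the variable names are arbitrary.

```shell
# Shortened sample table: two rows, three cells each.
table='{"rows":[{"cells":[{"columnId":"nameColumn","value":"name1"},{"columnId":"infoColumn","value":"info1"},{"columnId":"excessColumn","value":"excess1"}]},{"cells":[{"columnId":"nameColumn","value":"name2"},{"columnId":"infoColumn","value":"info2"},{"columnId":"excessColumn","value":"excess2"}]}]}'

# Work per row (not per cell), so both columns are visible at the same time.
objects=$(printf '%s' "$table" | jq -c '
  .rows[].cells
  | map({key: .columnId, value: .value})   # shape expected by from_entries
  | from_entries                           # {"nameColumn": ..., "infoColumn": ..., ...}
  | {name: .nameColumn, info: .infoColumn}
')
echo "$objects"
```

The attempt in the question printed nothing because it iterated over individual cells with .cells[], and no single cell can match both select conditions at once.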
If you wanted to select a large number of columns, a long "or" expression might become unwieldy, so you might also want to consider using a "whitelist". See e.g. "Whitelisting objects using select".
Or you might want to use del to delete the unwanted columns.
Producing CSV
One way would be to use @csv with the -r command-line option, e.g. with:
.rows[].cells
| map(select(.columnId=="nameColumn" or .columnId=="infoColumn")
| {(.columnId): .value} )
| add
| [.nameColumn, .infoColumn]
| @csv
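A quick run on a shortened copy of the sample data (an illustrative sketch; the variable names are arbitrary) shows that @csv wraps string values in double quotes. If you need the bare name1,info1 form from the question, join(",") is one alternative, on the assumption that the values never contain commas or quotes.

```shell
# Shortened sample table: two rows, three cells each.
table='{"rows":[{"cells":[{"columnId":"nameColumn","value":"name1"},{"columnId":"infoColumn","value":"info1"},{"columnId":"excessColumn","value":"excess1"}]},{"cells":[{"columnId":"nameColumn","value":"name2"},{"columnId":"infoColumn","value":"info2"},{"columnId":"excessColumn","value":"excess2"}]}]}'

# Shared prefix: one {columnId: value} object per row, reduced to [name, info].
pick='.rows[].cells
  | map(select(.columnId=="nameColumn" or .columnId=="infoColumn") | {(.columnId): .value})
  | add
  | [.nameColumn, .infoColumn]'

quoted=$(printf '%s' "$table" | jq -r "$pick | @csv")       # proper CSV quoting
plain=$(printf '%s' "$table" | jq -r "$pick | join(\",\")") # no quoting or escaping

printf '%s\n---\n%s\n' "$quoted" "$plain"
```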

Perform string manipulation on a value and return the original JSON document with jq

In my JSON document I have a string that I need manipulated and then have the entire document returned with the 'fixed' values.
The input document is:
{
"records" : [
{
"time": "123456789000"
},
{
"time": "123456789000"
}
]
}
I want to find the "time" key and replace the string by dropping off the last 3 chars. The resulting document would be:
{
"records" : [
{
"time": "123456789"
},
{
"time": "123456789"
}
]
}
I've been trying to understand the jq query syntax but I'm not coming right. I'm still struggling to return the whole document when filtering on a specific value. All I have so far is:
.records[] | select(.time | contains("123456789000"))
Here is a solution using |= and string slicing
.records[].time |= .[:-3]
Sample Run (assuming data in data.json)
$ jq -M '.records[].time |= .[:-3]' data.json
{
"records": [
{
"time": "123456789"
},
{
"time": "123456789"
}
]
}
With jq sub() function:
jq '.records[].time |= sub("[0-9]{3}$";"")' file
The output:
{
"records": [
{
"time": "123456789"
},
{
"time": "123456789"
}
]
}
Or, if the last three digits are always zeros, even simpler, by dividing the time value by 1000:
jq '.records[].time |= (tonumber / 1000 | tostring)' file
The following works with jq version 1.4 or later:
jq '.records[].time |= .[:-3]' file.json
(The expression .[:-3] is short for .[0:-3]; the negative integer here counts from the right.)
With jq 1.3, the following filter would work in your particular case:
.records[].time |= (tonumber | ./1000 | tostring)
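One caveat worth checking: the slicing approach simply drops three characters, whereas plain division only gives the same result when the value ends in 000. The sketch below uses a second, made-up timestamp ending in 123 and adds floor to the numeric variant; that floor is my assumption about the intended behaviour (truncation), not part of the answers above.

```shell
# Second record uses a hypothetical value that does not end in 000.
doc='{"records":[{"time":"123456789000"},{"time":"123456789123"}]}'

# String slicing: drop the last three characters, whatever they are.
sliced=$(printf '%s' "$doc" | jq -c '.records[].time |= .[:-3]')

# Numeric variant: floor discards the fractional part left by the division.
divided=$(printf '%s' "$doc" | jq -c '.records[].time |= (tonumber / 1000 | floor | tostring)')

printf '%s\n%s\n' "$sliced" "$divided"
```

Both commands produce the same document here. The numeric variant also depends on the value fitting in jq's double-precision numbers, so the string-based forms are the safer default.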

Using jq to list keys in a JSON object

I have a hierarchically deep JSON object created by a scientific instrument, so the file is somewhat large (1.3MB) and not readily readable by people. I would like to get a list of keys, up to a certain depth, for the JSON object. For example, given an input object like this
{
"acquisition_parameters": {
"laser": {
"wavelength": {
"value": 632,
"units": "nm"
}
},
"date": "02/03/2525",
"camera": {}
},
"software": {
"repo": "github.com/username/repo",
"commit": "a7642f",
"branch": "develop"
},
"data": [{},{},{}]
}
I would like output like this:
{
"acquisition_parameters": [
"laser",
"date",
"camera"
],
"software": [
"repo",
"commit",
"branch"
]
}
This is mainly for the purpose of being able to enumerate what is in a JSON object. After processing, the JSON objects from the instrument begin to diverge: for example, some may have a field like .frame.cross_section.stats.fwhm, while others may have .sample.species, so it would be convenient to be able to interrogate the JSON object on the command line.
The following should do exactly what you want
jq '[(keys - ["data"])[] as $key | { ($key): .[$key] | keys }] | add'
This will give the following output, using the input you described above:
{
"acquisition_parameters": [
"camera",
"date",
"laser"
],
"software": [
"branch",
"commit",
"repo"
]
}
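The keys come out alphabetically because keys sorts its result. If you would rather keep the order in which the keys appear in the file, as in the question's desired output, a possible variant uses keys_unsorted (jq 1.5 or later, which returns keys roughly in insertion order) and skips every top-level value that is not an object instead of naming "data" explicitly. Treat it as a sketch; the variable names are arbitrary.

```shell
# Sample document from the question, compacted onto one line.
doc='{"acquisition_parameters":{"laser":{"wavelength":{"value":632,"units":"nm"}},"date":"02/03/2525","camera":{}},"software":{"repo":"github.com/username/repo","commit":"a7642f","branch":"develop"},"data":[{},{},{}]}'

# Keep only top-level entries whose value is an object,
# then replace each value with its list of keys in original order.
result=$(printf '%s' "$doc" | jq -c '
  with_entries(select(.value | type == "object") | .value |= keys_unsorted)
')
echo "$result"
```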
Given your purpose you might have an easier time using the paths builtin to list all the paths in the input and then truncate at the desired depth:
$ echo '{"a":{"b":{"c":{"d":true}}}}' | jq -c '[paths|.[0:2]]|unique'
[["a"],["a","b"]]
Here is another variation using reduce and setpath, which assumes you have a specific set of top-level keys you want to examine:
. as $v
| reduce ("acquisition_parameters", "software") as $k (
{}; setpath([$k]; $v[$k] | keys)
)