I have this JSON document:
{
"1": {
"a": "G1"
},
"2": {
"a": "GM1"
}
}
My expected result should be:
1,G1
2,GM1
With *.a i get
[
"G1",
"GM1"
]
but I am absolutely stuck for the rest.
Sadly there is not much you can do that would be totally matching your use case and that would scale properly.
This is because JMESPath does not have a way to reference its parent, although this has been requested before, to allow you something like
*.[join(',', [keys($), a])]
You can definitely extract a list of keys and values, thanks to the function keys:
#.{keys: keys(#), values: *.a}
That gives
{
"keys": [
"1",
"2"
],
"values": [
"G1",
"GM1"
]
}
But then you just fall under the same case as this other question, because keys will give you a list of keys.
You can also end with a list of lists:
#.[keys(#), *.a]
Will give you:
[
[
"1",
"2"
],
[
"G1",
"GM1"
]
]
And you can even go further and flatten it if needed:
#.[keys(#), *.a] []
Gives:
[
"1",
"2",
"G1",
"GM1"
]
With all this if you do happen to have a list of exactly two items, then a solution would be to use a combination of join and slice:
#.[join(',',[keys(#),*.a][] | [::2]), join(',',[keys(#),*.a][] | [1::2])]
That would give the expected:
[
"1,G1",
"2,GM1"
]
But, sadly, as soon as you have more than two items to consider you would end up with a buggy:
[
"1,3,G1,GM3",
"2,4,GM1,GM4"
]
With a data set of
{
"1": {
"a": "G1"
},
"2": {
"a": "GM1"
},
"3": {
"a": "GM3"
},
"4": {
"a": "GM4"
}
}
And then, of course, the same can be achieved hardcoding indexes:
#.[join(',', [keys(#)[0], *.a | [0]]), join(',', [keys(#)[1], *.a | [1]])]
That also gives the expected:
[
"1,G1",
"2,GM1"
]
But, sadly, this only works if you know in advance the number of rows that are going to be returned to you.
And if you want a single string, given that were you want to feed the data accepts \n as a new line, you can join he whole array again:
#.[join(',', [keys(#)[0], *.a | [0]]), join(',', [keys(#)[1], *.a | [1]])].join(`\n`,#)
Will give:
"1,G1\n2,GM1"
Finally this expression worked 100% for me:
[{key1:keys(#)[0],a:*.a| [0]},{key1:keys(#)[1],a:*.a| [1]}]
Related
I am new to jq. I have a json file that looks like this
[
{
"k1":"a",
"k2":"aa",
"k3":["sk1":"a","sk2":"cc","sk3":"cc"],
"k4":["sk6":"zs","sk8":"we",...],
...
},
{
"k1":"b",
"k2":"ba",
"k3":["sk1":"a","sk3":"cc",...],
"k4":["sk6":"zs","sk8":"we",...],
...
},
{
"k1":"b",
"k2":"ba",
"k4":["sk6":"zs","sk8":"we",...],
...
}
...
]
I would like to get all the entries in the array such that the key 3 ("k3") doesnt have the subkey "sk2". Note that some of the elements of the array dont have "k3" (so I would like to remove those) and then those that have "k3" sometimes dont have "sk2" (thats the ones I want).
How to accomplish this in jq?
You can use select to filter, and has to check the keys:
jq 'map(select(has("k3") and (.k3 | has("sk2") | not)))' file.json
[
{
"k1": "b",
"k2": "ba",
"k3": {
"sk1": "a",
"sk3": "cc"
},
"k4": {
"sk6": "zs",
"sk8": "we"
}
}
]
Demo
When I run the jq command to parse a json document from the amazon cli I have the following problem.
I’m parsing through the IP address and a tag called "Enviroment". The enviroment tag in the instance does not exist therefore it does not throw me any result.
Here's an example of the relevant output returned by the AWS CLI
{
"Reservations": [
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.1",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
},
{
"Key": "Environment",
"Value": "alpha"
}
]
}
]
},
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.2",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
}
]
}
]
}
]
}
I’m running the following command
aws ec2 describe-instances --filters "Name=tag:Name,Values=Balance-OTA-SS_a" | jq -c '.Reservations[].Instances[] | ({IP: .PrivateIpAddress, Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)})'
## output
empty
How do I show the IP address in the output of the command even if the enviroment tag does not exist?
Regards,
Let's assume this input:
{
"Reservations": [
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.1",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
},
{
"Key": "Environment",
"Value": "alpha"
}
]
}
]
},
{
"Instances": [
{
"PrivateIpAddress": "10.0.0.2",
"Tags": [
{
"Key": "Name",
"Value": "Balance-OTA-SS_a"
}
]
}
]
}
]
}
This is the format returned by describe-instances, but with all the irrelevant fields removed.
Note that tags is always a list of objects, each of which has a Key and a Value. This format is perfect for from_entries, which can transform this list of tags into a convenient mapping object. Try this:
.Reservations[].Instances[] |
{
IP: .PrivateIpAddress,
Ambiente: (.Tags|from_entries.Environment)
}
{"IP":"10.0.0.1","Ambiente":"alpha"}
{"IP":"10.0.0.2","Ambiente":null}
That answers how to do it. But you probably want to understand why your approach didn't work.
.Reservations[].Instances[] |
{
IP: .PrivateIpAddress,
Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)
}
The .[] filter you're using on the tags can return zero or multiple results. Similarly, the select filter can eliminate some or all items. When you apply this inside an object constructor (the expression from { to }), you're causing that whole object to be created a variable number of times. You need to be very careful where you use these filters, because often that's not what you want at all. Often you instead want to do one of the following:
Wrap the expression that returns multiple results in an array constructor [ ... ]. That way instead of outputting the parent object potentially zero or multiple times, you output it once containing an array that potentially has zero or multiple items. E.g.
[.Tags[]|select(.Key=="Environment")]
Apply map to the array to keep it an array but process its contents, e.g.
.Tags|map(select(.Key=="Environment"))
Apply first(expr) to capture only the first value emitted by the expression. If the expression might emit zero items, you can use the comma operator to provide a default, e.g.
first((.Tags[]|select(.Key=="Environment")),null)
Apply some other array-level function, such as from_entries.
.Tags|from_entries.Environment
You can either use an if ... then ... else ... end construct, or //. For example:
.Reservations[].Instances[]
| {IP: .PrivateIpAddress} +
({Ambiente: (.Tags[]|select(.Key=="Environment")|.Value)}
// null)
Here is a simplified sample of the JSON data I'm working with:
[
{ "certname": "one.example.com",
"name": "fact1",
"value": "value1"
},
{ "certname": "one.example.com",
"name": "fact2",
"value": 42
},
{ "certname": "two.example.com",
"name": "fact1",
"value": "value3"
},
{ "certname": "two.example.com",
"name": "fact2",
"value": 10000
},
{ "certname": "two.example.com",
"name": "fact3",
"value": { "anotherkey": "anothervalue" }
}
]
The result I want to achieve, using jq preferably, is the following:
[
{
"certname": "one.example.com",
"fact1": "value1",
"fact2": 42
},
{
"certname": "two.example.com",
"fact1": "value3",
"fact2": 10000,
"fact3": { "anotherkey": "anothervalue" }
}
]
Its worth pointing out that not all elements have the same name/value pairs, by any means. Also, values are often complex objects in their own right.
If I was doing this in Python, it wouldn't be a big deal (and yes, I can hear the chorus of "do it in Python" ringing in my ears now). I would like to understand how to do this in jq, and it's escaping me at the moment.
... using jq preferably ...
That's the spirit! And in that spirit, here's a concise solution:
map( {certname, (.name): .value} )
| group_by(.certname)
| map(add)
Of course there are other reasonable solutions. If the above is at first puzzling, you might like to add a debug statement here or there, or you might like to explore the pipeline by executing the first line by itself, etc.
Is it possible to collect recursive-descent results into a single array with jq?
Would flatten help? Looks so to me, but I just cannot get it working. Take a look how far I am now at https://jqplay.org/s/6bxD-Wq0QE, anyone can make it working?
BTW,
.data.search.edges[].node | {name, topics: ..|.topics?} works, but I want all topics from the same node to be in one array, instead of having same name in all different returned results.
flatten alone will give me Cannot iterate over null, and
that's why I'm trying to use map(select(.? != null)) to filter the nulls out. However, I'd get Cannot iterate over null as well for my map-select.
So now it all comes down to how to filter out those nulls?
UPDATE:, by "collect into a single array" I meant to get something like this:
[
{
"name": "leumi-leumicard-bank-data-scraper",
"topics": ["banking", "leumi", "api", "puppeteer", "scraper", "open-api"]
}
]
instead of having same name duplicated in all different returned results. Thus recursively descends seems to me to be the option, but I'm open to any solution as long as I can get result like above. Is that possible? Thx.
One way to collect the non-falsey values:
.data.search.edges[].node
| {name, topics: [.. | .topics? | select(.)]}
The result would be:
{
"name": "leumi-leumicard-bank-data-scraper",
"topics": [
"banking",
"leumi",
"api",
"puppeteer",
"scraper",
"open-api"
]
}
{
"name": "echarts-scrappeteer",
"topics": []
}
Not sure what you're expecting to get in your results... but it seems like you're trying to get all the repositories and their topics in a flat array. I don't see any reason why you should use recurse here, you're only selecting from one class of objects. Just reference them directly.
[.data.search.edges[].node | {name,topic:(.repositoryTopics.nodes[].topic.topics)}]
For your particular input produces:
[
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "banking"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "leumi"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "api"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "puppeteer"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "scraper"
},
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": "open-api"
}
]
https://jqplay.org/s/G2inYAJNLS
If you wanted to have an array of topics within the nodes instead, just collect them in an array by putting the filter that selects the topics within [].
[.data.search.edges[].node | {name,topic:[.repositoryTopics.nodes[].topic.topics]}]
[
{
"name": "leumi-leumicard-bank-data-scraper",
"topic": [
"banking",
"leumi",
"api",
"puppeteer",
"scraper",
"open-api"
]
},
{
"name": "echarts-scrappeteer",
"topic": []
}
]
https://jqplay.org/s/0AFneNK89i
All the examples I've seen so far "reduce" the output (filter out) some part. I understand how to operate on the part of the input I want to, but I haven't figured out how to output the rest of the content "untouched".
The particular example would be an input file with several high level entries "array1", "field1", "array2", "array3" say. Each array contents is different. The specific processing I want to do is to sort "array1" entries by a "name" field which is doable by:
jq '.array1 | sort_by(.name)' test.json
but I also want this output as "array1" as well as all the other data to be preserved.
Example input:
{
"field1": "value1",
"array1":
[
{ "name": "B", "otherdata": "Bstuff" },
{ "name": "A", "otherdata": "Astuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
Expected output:
{
"field1": "value1",
"array1":
[
{ "name": "A", "otherdata": "Astuff" },
{ "name": "B", "otherdata": "Bstuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
I've tried using map but I can't seem to get the syntax correct to be able to handle any type of input other than the array I want to be sorted by name.
Whenever you use the assignment operators (=, |=, +=, etc.), the context of the expression is kept unchanged. So as long as your top-level filter(s) are assignments, in the end, you'll get the rest of the data (with your changes applied).
In this case, you're just sorting the array1 array so you could just update the array.
.array1 |= sort_by(.name)