Create a new JSON string from jq output elements

My jq command returns a series of arrays, each in its own brackets, with no comma separators between them. I would like to build a new JSON document from them.
This call finds all elements of arr that have a FooItem in them and then returns texts from the nested array at index 3:
jq '.arr[] | select(index("FooItem")) | .[3].texts'
on this JSON (the original has more elements):
{
"arr": [
[
"create",
"w199",
"FooItem",
{
"index": 0,
"texts": [
"aBarfoo",
"avalue"
]
}
],
[
"create",
"w200",
"NoItem",
{
"index": 1,
"val": 5,
"hearts": 5
}
],
[
"create",
"w200",
"FooItem",
{
"index": 1,
"texts": [
"mybarfoo",
"bValue"
]
}
]
]
}
returns this output:
[
"aBarfoo",
"avalue"
]
[
"mybarfoo",
"bValue"
]
But I'd like to create a new json from these objects that looks like this:
{
"arr": [
[
"aBarfoo",
"avalue"
],
[
"mybarfoo",
"bValue"
]
]
}
Can jq do this?
EDIT
One more addition: Considering that texts also has strings of zero length, how would you delete those/not have them in the result?
"texts": ["",
"mybarfoo",
"bValue",
""
]

You can always embed a stream of (zero or more) JSON entities within some other JSON structure by decorating the stream, that is, in the present case, by wrapping the STREAM as follows:
{ arr: [ STREAM ] }
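As a minimal sketch of the wrapping approach, the filter from the question can be substituted for STREAM. The input below is a single-line copy of the question's sample, and the shell variable names (input, wrapped) are only illustrative.

```shell
# Sample input from the question, collapsed onto one line.
input='{"arr":[["create","w199","FooItem",{"index":0,"texts":["aBarfoo","avalue"]}],["create","w200","NoItem",{"index":1,"val":5,"hearts":5}],["create","w200","FooItem",{"index":1,"texts":["mybarfoo","bValue"]}]]}'

# The question's filter takes the place of STREAM inside { arr: [ ... ] }.
wrapped=$(printf '%s' "$input" |
  jq -c '{ arr: [ .arr[] | select(index("FooItem")) | .[3].texts ] }')

echo "$wrapped"
# prints {"arr":[["aBarfoo","avalue"],["mybarfoo","bValue"]]}
```

Note that this builds a brand-new object, so any other top-level keys of the input would not be carried over.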
In the present case, however, we can also take the view that we are simply editing the original document, and accordingly use a variation of the map(select(...)) idiom:
.arr |= map( select(index("FooItem")) | .[3].texts)
This latter approach ensures that the context of the "arr" key is preserved.
Addendum
To filter out the empty strings, simply add another map(select(...)):
.arr |= map( select(index("FooItem"))
| .[3].texts | map(select(length>0)))
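The update-in-place variant can be tried out as follows. This is only a sketch: the input is a shortened version of the sample, and the "other" key is a hypothetical addition used to show that sibling keys survive the update.

```shell
# Shortened sample; "other" is a made-up key added only for illustration.
input='{"arr":[["create","w200","NoItem",{"index":1}],["create","w200","FooItem",{"index":1,"texts":["","mybarfoo","bValue",""]}]],"other":"kept"}'

# Rewrite .arr in place and drop zero-length strings from each texts array.
updated=$(printf '%s' "$input" |
  jq -c '.arr |= map( select(index("FooItem"))
                      | .[3].texts | map(select(length > 0)) )')

echo "$updated"
# prints {"arr":[["mybarfoo","bValue"]],"other":"kept"}
```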

Related

Updating Nested JSON Array with new key and value from another key

I have a JSON file where I have IDs with tasks. Some tasks can be empty. I want to put the ID into the tasks where tasks are not empty.
[
{
"id": 1961126,
"tasks": [
{
"id": 70340700,
"title": "Test1",
},
{
"id": 69801130,
"title": "Test15A",
}
]
},
{
"id": 1961126,
"tasks": []
}
]
I would like to get the tasks array updated to look like
[
{
"id": 1961126,
"tasks": [
{
**"sId":1961126,**
"id": 70340700,
"title": "Test1",
},
{
**"sId":1961126,**
"id": 69801130,
"title": "Test15A",
}
]
},
{
"id": 1961126,
"tasks": []
}
]
I can't figure out how to get the id from the object into the nested array. Here is what I have come up with
jq 'map(.tasks[0]|select( . != null )|.sId = .id)' file.json
This is only pulling in the same id. I have tried to put in [].id but I get an error: Cannot index number with string "id". I am still learning how to deal with nested arrays and objects.
Save the ID in a variable and add it as a new field to each array member.
jq 'map(.id as $sId | .tasks[] += {$sId})' file.json
Note #1: Remove the trailing , at the end of each task object in your input, as it's not valid JSON.
Note #2: Object fields generally have no order, but if you want the propagated ID shown first, as in your expected output, you could replace += {$sId} (which is itself shorthand for |= . + {$sId}) with |= {$sId} + . to flip the order in which the fields are generated. There is no guarantee, though, that the order stays that way with further processing.
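To make the difference between the two forms concrete, here is a small sketch that runs both on the question's data with the trailing commas removed. The shell variable names (input, appended, prepended) are illustrative only.

```shell
# The question's data, with the invalid trailing commas removed.
input='[{"id":1961126,"tasks":[{"id":70340700,"title":"Test1"},{"id":69801130,"title":"Test15A"}]},{"id":1961126,"tasks":[]}]'

# += appends sId after the existing fields of every task.
appended=$(printf '%s' "$input" |
  jq -c 'map(.id as $sId | .tasks[] += {$sId})')

# |= {$sId} + . puts sId first; the other fields follow in their original order.
prepended=$(printf '%s' "$input" |
  jq -c 'map(.id as $sId | .tasks[] |= {$sId} + .)')

echo "$appended"
echo "$prepended"
```

In both cases the element with an empty tasks array passes through unchanged, because .tasks[] produces no paths to update.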

jq with multiple select statements and an array

I've got some JSON like the following (I've filtered the output here):
[
{
"Tags": [
{
"Key": "Name",
"Value": "example1"
},
{
"Key": "Irrelevant",
"Value": "irrelevant"
}
],
"c7n:MatchedFilters": [
"tag: example_tag_rule"
],
"another_key": "another_value_I_dont_want"
},
{
"Tags": [
{
"Key": "Name",
"Value": "example2"
}
],
"c7n:MatchedFilters": [
"tag:example_tag_rule",
"tag: example_tag_rule2"
]
}
]
I'd like to create a csv file with the value within the Name key and all of the "c7n:MatchedFilters" in the array. I've made a few attempts but still can't get quite the output I expect. There's some example code and the output below:
#Prints the key that I'm after.
cat new.jq | jq '.[] | [.Tags[], {"c7n:MatchedFilters"}] | .[] | select(.Key=="Name")|.Value'
"example1"
"example2"
#Prints all the filters in an array I'm after.
cat new.jq | jq -r '.[] | [.Tags[], {"c7n:MatchedFilters"}] | .[] | select(."c7n:MatchedFilters") | .[]'
[
"tag: example_tag_rule"
]
[
"tag:example_tag_rule",
"tag: example_tag_rule2"
]
#Prints *all* the tags (including ones I don't want) and all the filters in the array I'm after.
cat new.jq | jq '.[] | [.Tags[], {"c7n:MatchedFilters"}] | select((.[].Key=="Name") and (.[]."c7n:MatchedFilters"))'
[
{
"Key": "Name",
"Value": "example1"
},
{
"Key": "Irrelevant",
"Value": "irrelevant"
},
{
"c7n:MatchedFilters": [
"tag: example_tag_rule"
]
}
]
[
{
"Key": "Name",
"Value": "example2"
},
{
"c7n:MatchedFilters": [
"tag:example_tag_rule",
"tag: example_tag_rule2"
]
}
]
I hope this makes sense, let me know if I've missed anything.
Your attempts are not working because you start out with [.Tags[], {"c7n:MatchedFilters"}] to construct one array containing all the tags and an object containing the filters. You are then struggling to find a way to process this entire array at once because it jumbles together these unrelated things without any distinction. You will find it much easier if you don't combine them in the first place!
You want to find the single tag with a Key of "Name". Here's one way to find that:
first(
.Tags[]|
select(.Key=="Name")
).Value as $name
By using a variable binding we can save it for later and worry about constructing the array separately.
You say (in the comments) that you just want to concatenate the filters with spaces. You can do that easily enough:
(
."c7n:MatchedFilters"|
join(" ")
) as $filters
You can combine all of this as follows. Note that each variable binding leaves the input stream unchanged, so it's easy to compose everything.
jq --raw-output '
.[]|
first(
.Tags[]|
select(.Key=="Name")
).Value as $name|
(
."c7n:MatchedFilters"|
join(" ")
) as $filters|
[$name, $filters]|
@csv
'
Hopefully that's easy enough to read and separates out each concept. We break up the array into a stream of objects. For each object, we find the name and bind it to $name, we concatenate the filters and bind them to $filters, then we construct an array containing both, then we convert the array to a CSV string.
We don't need to use variables. We could just have a big array constructor wrapped around the expression to find the name and the expression to find the filters. But I hope you can see the variables make things a bit flatter and easier to understand.
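For comparison, the variable-free form mentioned above might look like the sketch below. The input is the question's sample collapsed onto one line, and the shell variable names (input, csv) are illustrative.

```shell
# The question's sample data on one line.
input='[{"Tags":[{"Key":"Name","Value":"example1"},{"Key":"Irrelevant","Value":"irrelevant"}],"c7n:MatchedFilters":["tag: example_tag_rule"],"another_key":"another_value_I_dont_want"},{"Tags":[{"Key":"Name","Value":"example2"}],"c7n:MatchedFilters":["tag:example_tag_rule","tag: example_tag_rule2"]}]'

# One array constructor wraps both expressions; no variable bindings needed.
csv=$(printf '%s' "$input" | jq --raw-output '
  .[]
  | [ first(.Tags[] | select(.Key == "Name")).Value,
      (."c7n:MatchedFilters" | join(" ")) ]
  | @csv
')

echo "$csv"
# prints:
# "example1","tag: example_tag_rule"
# "example2","tag:example_tag_rule tag: example_tag_rule2"
```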

How to use jq to reconstruct complete contents of json file, operating only on part of interest?

All the examples I've seen so far "reduce" the output (filter out) some part. I understand how to operate on the part of the input I want to, but I haven't figured out how to output the rest of the content "untouched".
The particular example would be an input file with several high level entries "array1", "field1", "array2", "array3" say. Each array contents is different. The specific processing I want to do is to sort "array1" entries by a "name" field which is doable by:
jq '.array1 | sort_by(.name)' test.json
but I also want this output as "array1" as well as all the other data to be preserved.
Example input:
{
"field1": "value1",
"array1":
[
{ "name": "B", "otherdata": "Bstuff" },
{ "name": "A", "otherdata": "Astuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
Expected output:
{
"field1": "value1",
"array1":
[
{ "name": "A", "otherdata": "Astuff" },
{ "name": "B", "otherdata": "Bstuff" }
],
"array2" :
[
array2 stuff
],
"array3" :
[
array3 stuff
]
}
I've tried using map but I can't seem to get the syntax correct to be able to handle any type of input other than the array I want to be sorted by name.
Whenever you use the assignment operators (=, |=, +=, etc.), the context of the expression is kept unchanged. So as long as your top-level filter(s) are assignments, in the end, you'll get the rest of the data (with your changes applied).
In this case, you're just sorting the array1 array so you could just update the array.
.array1 |= sort_by(.name)
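A quick way to see this is to run the update on a small document. In the sketch below the contents of array2 and array3 are made-up placeholders, since the question elides them.

```shell
# array2/array3 contents are hypothetical stand-ins for the elided data.
input='{"field1":"value1","array1":[{"name":"B","otherdata":"Bstuff"},{"name":"A","otherdata":"Astuff"}],"array2":[1,2],"array3":["x"]}'

# Only array1 is rewritten; every other key is passed through untouched.
sorted=$(printf '%s' "$input" | jq -c '.array1 |= sort_by(.name)')

echo "$sorted"
# prints {"field1":"value1","array1":[{"name":"A","otherdata":"Astuff"},{"name":"B","otherdata":"Bstuff"}],"array2":[1,2],"array3":["x"]}
```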

How to remove an array element with jq?

I'm trying to figure out how to remove an array element from some JSON using jq.
Below is the input and desired output.
jq .Array[0]
outputs the array element I want.
{
"blah1": [
"key1:val1"
],
"foobar0": "barfoo0",
"foobar1": "barfoo1"
}
But how do I re-wrap this with:
{
"blah0": "zeroblah",
"Array": [
and
]
}
Input:
{
"blah0": "zeroblah",
"Array": [
{
"blah1": [
"key1:val1"
],
"foobar0": "barfoo0",
"foobar1": "barfoo1"
},
{
"blah2": [
"key2:val2"
],
"foobar2": "barfoo2",
"foobar3": "barfoo3"
}
]
}
Desired output:
{
"blah0": "zeroblah",
"Array": [
{
"blah1": [
"key1:val1"
],
"foobar0": "barfoo0",
"foobar1": "barfoo1"
}
]
}
Regarding the second part of Paul Ericson's question:
"But more generically, I'm trying to understand how jq would allow for selective array element control. Maybe next time I want to delete array elements 1,3,5 and 11."
To delete elements 1, 3, 5 and 11, just use
del(
.Array[1,3,5,11]
)
but in general you can use a more sophisticated filter as the argument to del. For example, this filter removes the elements within .Array whose .foobar2 key is "barfoo2":
del(
.Array[]
| select(.foobar2 == "barfoo2")
)
producing in this example
{
"blah0": "zeroblah",
"Array": [
{
"blah1": [
"key1:val1"
],
"foobar0": "barfoo0",
"foobar1": "barfoo1"
}
]
}
In this particular case, the simplest would be:
del(.Array[1])
More generally, if you wanted to delete all items in the array except for the first:
.Array |= [.[0]]
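The deletion forms above can be checked on small inputs. This is a sketch only: the twelve-number array is a hypothetical stand-in chosen so that indices 1, 3, 5 and 11 all exist, and the second document is a shortened version of the question's input.

```shell
# Hypothetical array of twelve numbers, so each value equals its own index.
nums='{"Array":[0,1,2,3,4,5,6,7,8,9,10,11]}'
by_index=$(printf '%s' "$nums" | jq -c 'del(.Array[1,3,5,11])')
echo "$by_index"
# prints {"Array":[0,2,4,6,7,8,9,10]}

# Shortened version of the question's document.
doc='{"blah0":"zeroblah","Array":[{"foobar0":"barfoo0"},{"foobar2":"barfoo2"}]}'

# Delete by condition rather than by position.
by_filter=$(printf '%s' "$doc" |
  jq -c 'del(.Array[] | select(.foobar2 == "barfoo2"))')
echo "$by_filter"
# prints {"blah0":"zeroblah","Array":[{"foobar0":"barfoo0"}]}

# Keep only the first element.
first_only=$(printf '%s' "$doc" | jq -c '.Array |= [.[0]]')
echo "$first_only"
# prints {"blah0":"zeroblah","Array":[{"foobar0":"barfoo0"}]}
```

All three leave the sibling keys of "Array" in place, because del and |= both return the whole input with only the targeted paths changed.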

Using jq to list keys in a JSON object

I have a hierarchically deep JSON object created by a scientific instrument, so the file is somewhat large (1.3MB) and not readily readable by people. I would like to get a list of keys, up to a certain depth, for the JSON object. For example, given an input object like this
{
"acquisition_parameters": {
"laser": {
"wavelength": {
"value": 632,
"units": "nm"
}
},
"date": "02/03/2525",
"camera": {}
},
"software": {
"repo": "github.com/username/repo",
"commit": "a7642f",
"branch": "develop"
},
"data": [{},{},{}]
}
I would like output like this:
{
"acquisition_parameters": [
"laser",
"date",
"camera"
],
"software": [
"repo",
"commit",
"branch"
]
}
This is mainly for the purpose of being able to enumerate what is in a JSON object. After processing the JSON objects from the instrument begin to diverge: for example, some may have a field like .frame.cross_section.stats.fwhm, while others may have .sample.species, so it would be convenient to be able to interrogate the JSON object on the command line.
The following should do exactly what you want
jq '[(keys - ["data"])[] as $key | { ($key): .[$key] | keys }] | add'
This will give the following output, using the input you described above:
{
"acquisition_parameters": [
"camera",
"date",
"laser"
],
"software": [
"branch",
"commit",
"repo"
]
}
Given your purpose you might have an easier time using the paths builtin to list all the paths in the input and then truncate at the desired depth:
$ echo '{"a":{"b":{"c":{"d":true}}}}' | jq -c '[paths|.[0:2]]|unique'
[["a"],["a","b"]]
Here is another variation using reduce and setpath, which assumes you have a specific set of top-level keys you want to examine:
. as $v
| reduce ("acquisition_parameters", "software") as $k (
{}; setpath([$k]; $v[$k] | keys)
)
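If the top-level keys are not known in advance, one further possibility (not taken from the answers above, so treat it as an untested-in-production sketch) is to use with_entries to keep only the values that are objects, and keys_unsorted to list their keys in the order they appear in the input rather than sorted.

```shell
# The question's sample object on one line.
input='{"acquisition_parameters":{"laser":{"wavelength":{"value":632,"units":"nm"}},"date":"02/03/2525","camera":{}},"software":{"repo":"github.com/username/repo","commit":"a7642f","branch":"develop"},"data":[{},{},{}]}'

# Skip non-object values such as the "data" array, then replace each
# remaining value with the list of its keys in input order.
listing=$(printf '%s' "$input" |
  jq -c 'with_entries(select(.value | type == "object") | .value |= keys_unsorted)')

echo "$listing"
# prints {"acquisition_parameters":["laser","date","camera"],"software":["repo","commit","branch"]}
```

Swapping keys_unsorted for keys would give the alphabetically sorted lists shown in the first answer's output.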