Use jq to find array element by containing string

I have an array "operations" from which I would like to return all the elements that contain a matching string like "w51". Until now, all the samples I have found dealt with key value pairs. I am using jq '.operations[]' < file to retrieve the elements.
{
"operations": [
[
"create",
"w51",
"rwt.widgets.Label",
{
"parent": "w41",
"style": [
"NONE"
],
"bounds": [
101,
0,
49,
42
],
"tabIndex": -1,
"customVariant": "variant_pufferLabelLogout"
}
],
[
"create",
"w39",
"rwt.widgets.Composite",
{
"parent": "w34",
"style": [
"NONE"
],
"children": [
"w52"
],
"bounds": [
0,
42,
762,
868
],
"tabIndex": -1,
"clientArea": [
0,
0,
762,
868
]
}
]
]
}
My expected output when searching for an array element that contains "w51" would be this:
[
"create",
"w51",
"rwt.widgets.Label",
{
"parent": "w41",
"style": [
"NONE"
],
"bounds": [
101,
0,
49,
42
],
"tabIndex": -1,
"customVariant": "variant_pufferLabelLogout"
}
]

If you use jq version 1.4 or later, the following should produce the desired output:
.operations[]
| select( index("w51") )
Alternatives
There are many alternatives, depending on which version of jq you have. If your jq has any/1 (jq 1.5 or later), the following is an efficient option:
.operations[] | select( any(. == "w51" ) )
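Note that both filters above look for an element that is exactly equal to "w51". If a substring match is wanted instead, as the title suggests, a sketch along the following lines may help. It assumes jq 1.5 or later (for any/2 and strings) and runs on a trimmed-down, hypothetical copy of the input; the shell variable names are illustrative only.

```shell
# Trimmed-down, hypothetical version of the input from the question
json='{"operations":[["create","w51","rwt.widgets.Label",{"parent":"w41"}],["create","w39","rwt.widgets.Composite",{"parent":"w34"}]]}'

# Exact match: keep inner arrays that have an element equal to $needle
exact=$(printf '%s' "$json" |
  jq -c --arg needle w51 '.operations[] | select(any(.[]; . == $needle))')

# Substring match: only top-level string elements are tested, so the
# nested object is skipped rather than passed to contains/1
partial=$(printf '%s' "$json" |
  jq -c --arg needle w5 '.operations[] | select(any(.[]; strings | contains($needle)))')

printf 'exact:   %s\npartial: %s\n' "$exact" "$partial"
```

The strings guard matters because calling contains on the nested object with a string argument would raise an error.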

Related

How to perform Jolt Spec operation in nested Json

I am trying to write a Jolt spec for the nested JSON input record below. The challenge is that "objectname" does not always contain multiple objects as shown here; in some cases it contains a single object, and that case has to be parsed too.
Input
{
"Status": "Green",
"objectname": {
"LED TV": {
"values": [
[
"one",
"two",
"three",
"four"
],
[
0,
0,
0,
"one"
],
[
0,
"one",
0,
"two"
]
],
"time": [
1663331241000,
1663330155000,
1663328545000
]
},
"LED Bulb": {
"values": [
[
"one",
"two",
"three",
"four"
],
[
0,
0,
0,
"one"
],
[
0,
"one",
0,
"two"
]
],
"time": [
1663331241000,
1663330155000,
1663328545000
]
},
"LED LAMP": {
"values": [
[
"one",
"two",
"three",
"four"
],
[
0,
0,
0,
"one"
],
[
0,
"one",
0,
"two"
]
],
"time": [
1663331541000,
1663330555000,
1663328545000
]
}
},
"Source": "LED EQUIPS"
}
Expected Output
[
[
"Status",
"Green",
"objectname",
"LED TV",
"values",
[
"one",
"two",
"three",
"four"
],
[
0,
0,
0,
"one"
],
[
0,
"one",
0,
"two"
]
],
[
"Status",
"Green",
"objectname",
"LED Bulb",
"values",
[
"one",
"two",
"three",
"four"
],
[
0,
0,
0,
"one"
],
[
0,
"one",
0,
"two"
]
],
[
"Status",
"Green",
"objectname",
"LED LAMP",
"values",
[
"one",
"two",
"three",
"four"
],
[
0,
0,
0,
"one"
],
[
0,
"one",
0,
"two"
]
],
"Source",
"LED EQUIPS"
]
PS: "objectname" has to be parsed whether it contains one entry or several.
You can use $ wildcards to get the keys of the attributes or objects at several levels, and the [#n] notation to build the respective arrays from the values derived from those nodes, such as
[
{
"operation": "shift",
"spec": {
"o*": { // filter for "objectname"
"*": {
"v*": { // filter for "values"
"#Status": "[#3]", // fixed value "Status" in order to repeat within the objects
"@(3,Status)": "[#3]", // pick the value of the Status attribute after going three levels up the tree
"$2": "[#3]", // collect the key names for each object under related "objectname"
"$1": "[#3]",
"$": "[#3]",
"*": {
"@": "[#4]"
}
}
}
}
}
}
]

Iterate over array and output TSV report

I have a file with 30,000 JSON records, one per line, and I am using jq to process it.
Below is the schema of each line (new.json).
{
"indexed": {
"date-parts": [
[
2020,
8,
13
]
],
"date-time": "2020-08-13T06:27:26Z",
"timestamp": 1597300046660
},
"reference-count": 42,
"publisher": "American Chemical Society (ACS)",
"issue": "3",
"content-domain": {
"domain": [],
"crossmark-restriction": false
},
"short-container-title": [
"Org. Lett."
],
"published-print": {
"date-parts": [
[
2005,
2
]
]
},
"DOI": "10.1021/ol047829t",
"type": "journal-article",
"created": {
"date-parts": [
[
2005,
1,
27
]
],
"date-time": "2005-01-27T05:53:29Z",
"timestamp": 1106805209000
},
"page": "383-386",
"source": "Crossref",
"is-referenced-by-count": 38,
"title": [
"Liquid-Crystalline [60]Fullerene-TTF Dyads"
],
"prefix": "10.1021",
"volume": "7",
"author": [
{
"given": "Emmanuel",
"family": "Allard",
"affiliation": []
},
{
"given": "Frédéric",
"family": "Oswald",
"affiliation": []
},
{
"given": "Bertrand",
"family": "Donnio",
"affiliation": []
},
{
"given": "Daniel",
"family": "Guillon",
"affiliation": []
}
],
"member": "316",
"container-title": [
"Organic Letters"
],
"original-title": [],
"link": [
{
"URL": "https://pubs.acs.org/doi/pdf/10.1021/ol047829t",
"content-type": "unspecified",
"content-version": "vor",
"intended-application": "similarity-checking"
}
],
"deposited": {
"date-parts": [
[
2020,
4,
7
]
],
"date-time": "2020-04-07T13:39:55Z",
"timestamp": 1586266795000
},
"score": null,
"subtitle": [],
"short-title": [],
"issued": {
"date-parts": [
[
2005,
2
]
]
},
"references-count": 42,
"alternative-id": [
"10.1021/ol047829t"
],
"URL": "http://dx.doi.org/10.1021/ol047829t",
"relation": {},
"ISSN": [
"1523-7060",
"1523-7052"
],
"issn-type": [
{
"value": "1523-7060",
"type": "print"
},
{
"value": "1523-7052",
"type": "electronic"
}
],
"subject": [
"Physical and Theoretical Chemistry",
"Organic Chemistry",
"Biochemistry"
]
}
For every DOI, I need the values of the given and family keys, each combined into a single cell on the same row as that DOI, in CSV/TSV format.
The expected output for the above JSON is (in CSV/TSV format):
|DOI| givenName|familyName|
|10.1021/ol047829t|Emmanuel; Frédéric; Bertrand; Daniel;|Allard; Oswald; Donnio; Guillon|
I am using the command line below, but it throws an error, and when I try to alter it I get no CSV/TSV output at all.
cat new.json | jq -r "[.DOI, .publisher, .author[] | .given] | @tsv" > manage.tsv
The same logic applies to the subject key. I am using the command line below to write the values of the subject key to CSV, but it outputs only the first element (in this case "Physical and Theoretical Chemistry"):
cat new.json | jq -c -r "[.DOI, .publisher, .subject[0]] | @csv" > manage.csv
Any pointers to the right jq command line would be a great help.
Join given and family names by semicolons separately, then pass resulting strings as fields to the TSV filter.
["DOI", "givenName", "familyName"],
(inputs | [.DOI, (.author | map(.given), map(.family) | join("; "))])
| @tsv
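Note that you need to invoke jq with the -r and -n flags for this to work and produce valid TSV output: -n stops jq from consuming the first record before inputs runs, and -r prints each row as raw text rather than as a JSON string.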
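For completeness, the full invocation might look like the sketch below. The two records are hypothetical and cut down to the fields used, and the extra subject column is an assumption about how the same join could cover the subject key from the question.

```shell
# Two hypothetical records, one JSON object per line
records='{"DOI":"10.1021/ol047829t","author":[{"given":"Emmanuel","family":"Allard"},{"given":"Daniel","family":"Guillon"}],"subject":["Organic Chemistry","Biochemistry"]}
{"DOI":"10.0000/example","author":[{"given":"Ada","family":"Lovelace"}],"subject":["Mathematics"]}'

# -n: do not read input implicitly, so `inputs` sees every line
# -r: emit the @tsv rows as raw text instead of JSON strings
tsv=$(printf '%s\n' "$records" | jq -n -r '
  ["DOI", "givenName", "familyName", "subject"],
  (inputs | [.DOI,
             (.author | map(.given), map(.family) | join("; ")),
             (.subject | join("; "))])
  | @tsv')

printf '%s\n' "$tsv"
```

With a real file, the same filter can read new.json directly instead of standard input.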

Filtering out one object from a list of objects based on a field using jq

We have the following JSON file, which lists partitions and their partition ids.
The file contains 6 partitions, and the topic name is the same for all of them.
more file.json
{
"version": 1,
"partitions": [
{
"topic": "list_of_cars",
"partition": 2,
"replicas": [
1003,
1004,
1005
],
"log_dirs": [
"any",
"any",
"any"
]
},
{
"topic": "list_of_cars",
"partition": 4,
"replicas": [
1005,
1006,
1001
],
"log_dirs": [
"any",
"any",
"any"
]
},
{
"topic": "list_of_cars",
"partition": 0,
"replicas": [
1001,
1002,
1003
],
"log_dirs": [
"any",
"any",
"any"
]
},
{
"topic": "list_of_cars",
"partition": 1,
"replicas": [
1002,
1003,
1004
],
"log_dirs": [
"any",
"any",
"any"
]
},
{
"topic": "list_of_cars",
"partition": 5,
"replicas": [
1006,
1001,
1002
],
"log_dirs": [
"any",
"any",
"any"
]
},
{
"topic": "list_of_cars",
"partition": 3,
"replicas": [
1004,
1005,
1006
],
"log_dirs": [
"any",
"any",
"any"
]
}
]
}
Is it possible to print an entry according to its partition id?
For example, say we want to print the JSON part for partition id 4.
The expected result would then look like this:
{
"topic": "list_of_cars",
"partition": 4,
"replicas": [
1005,
1006,
1001
],
"log_dirs": [
"any",
"any",
"any"
]
}
Ideally, the output would be in the following valid format (if possible):
{
"version": 1,
"partitions": [{
"topic": "list_of_cars",
"partition": 4,
"replicas": [
1005,
1006,
1001
],
"log_dirs": [
"any",
"any",
"any"
]
}]
}
This is a job for a simple jq filter that selects the required object from the list of objects.
jq --arg part_id "4" '.partitions[] | select(.partition == ($part_id|tonumber))'
Wrapping the same select(..) in map() over .partitions would return the match inside an array instead.
You can pass the required partition id in as an argument and use it in the select(..) expression. Since values passed with --arg are always strings and the filter needs a number to compare against, we do a string-to-number conversion using tonumber, so that .partition is compared against a numeric value.
To answer the follow-up question about retaining only the needed object and removing the others, use the |= operator with select:
jq --arg part_id "4" '.partitions |= map(select(.partition == ($part_id|tonumber)))'
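As a variation, --argjson (available in jq 1.5 and later) passes the id as a JSON number, so tonumber is not needed. A small sketch on a cut-down, hypothetical copy of the input, with illustrative variable names:

```shell
# Cut-down, hypothetical copy of file.json with two partitions
json='{"version":1,"partitions":[{"topic":"list_of_cars","partition":2,"replicas":[1003,1004,1005]},{"topic":"list_of_cars","partition":4,"replicas":[1005,1006,1001]}]}'

# --arg passes a string, hence the tonumber conversion
single=$(printf '%s' "$json" |
  jq -c --arg part_id 4 '.partitions[] | select(.partition == ($part_id | tonumber))')

# --argjson passes a number; |= keeps the surrounding document intact
wrapped=$(printf '%s' "$json" |
  jq -c --argjson part_id 4 '.partitions |= map(select(.partition == $part_id))')

printf '%s\n%s\n' "$single" "$wrapped"
```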

JQ query output in csv format

I have been trying to extract a CSV from the JSON file below using jq, but have not managed it so far. Can any experts here help?
{
"values": [
{
"resourceId": "xxxx-xxxx-xxx-8b16-xxxxxx",
"property-contents": {
"property-content": [
{
"statKey": "config|name",
"timestamps": [
1517591034069
],
"values": [
"somebname.UNIVERSE.test.com"
]
},
{
"statKey": "summary|guest|ipAddress",
"timestamps": [
1517591034069
],
"values": [
"100.xx.5.xx"
]
},
{
"statKey": "summary|parentCluster",
"timestamps": [
1551120506024
],
"values": [
"UFO-UFO"
]
},
{
"statKey": "summary|parentDatacenter",
"timestamps": [
1551120806021
],
"values": [
"GALAXY-D123"
]
},
{
"statKey": "summary|parentVcenter",
"timestamps": [
1517591334271
],
"values": [
"X-RAY123"
]
},
{
"statKey": "summary|runtime|powerState",
"timestamps": [
1517591034069
],
"values": [
"Powered On"
]
}
]
}
},
..
...
Expected output is:
xxx-xxxx-xxx-8b16-xxxxxx,somebname.UNIVERSE.test.com,100.xx.5.xx,UFO-UFO,GALAXY-D123,X-RAY123,Powered On
Your expected output leaves some things unclear:
The second CSV column contains somebname.UNIVERSE.test.com, which was presumably derived from the section "property-content": [ { ..., "values": [ "somebname.UNIVERSE.test.com" ], ... }. How do you determine which element in the "property-content" list to pick for the second column? Is it because it's the first element? Is it because of its "statKey": "config|name"?
What if the "property-content" list is empty? What if it doesn't have the "statKey" entry you're looking for? What if the "values" list has zero or more than one element? The CSV row can only contain one scalar value. The same question applies for subsequent columns.
Making a wild guess here,
$ jq -r '.values[] | [ .resourceId, (."property-contents"."property-content"[] | .values[]) ] | join(",")' your.json
xxxx-xxxx-xxx-8b16-xxxxxx,somebname.UNIVERSE.test.com,100.xx.5.xx,UFO-UFO,GALAXY-D123,X-RAY123,Powered On
I cannot guarantee (and somewhat doubt) that this works in the general case, but I've been unable to extract a general case from your one example.
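If the columns are meant to be picked by "statKey" rather than by position, something like the following sketch might be closer to the general case. It is still a guess at the intent: the choice of keys, the use of only the first entry of each "values" list, and the sample data are all assumptions.

```shell
# Hypothetical sample with one resource and two of the properties
json='{"values":[{"resourceId":"res-1","property-contents":{"property-content":[{"statKey":"config|name","values":["host.example.com"]},{"statKey":"summary|runtime|powerState","values":["Powered On"]}]}}]}'

# Build a statKey -> first value lookup per resource, then pick columns by name.
# A statKey that is absent yields null, which @csv renders as an empty field.
csv=$(printf '%s' "$json" | jq -r '
  .values[]
  | (."property-contents"."property-content"
     | map({key: .statKey, value: .values[0]})
     | from_entries) as $props
  | [.resourceId,
     $props["config|name"],
     $props["summary|guest|ipAddress"],
     $props["summary|runtime|powerState"]]
  | @csv')

printf '%s\n' "$csv"
```

Unlike join(","), @csv quotes string fields, so the output differs slightly from the unquoted form shown in the question but stays valid if a value contains a comma.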

Convert a complex JSON file into a simple JSON file using JQ without getting cartesian product

I want to convert a complex JSON file into a simple JSON file using JQ. However, the query I'm using generates an incorrect output.
My (cut down) JSON file:
[
{
"id": 100,
"foo": [
{
"bar": [
{"type": "read"},
{"type": "write"}
],
"users": ["admin_1"],
"groups": []
},
{
"bar": [
{"type": "execute"},
{ "type": "read"}
],
"users": [],
"groups": ["admin_2"]
}
]
},
{
"id": 101,
"foo": [
{
"bar": [
{"type": "read"}
],
"users": [
"admin_3"
],
"groups": []
}
]
}
]
I need to generate a flatter JSON file and combine the users and groups into one field, similar to this:
[
{
"id": 100,
"users_groups": [
"admin_1",
"admin_2"
],
"bar": ["read"]
},
{
"id": 100,
"users_groups": ["admin_1"],
"bar": ["write"]
},
{
"id": 100,
"users_groups": ["admin_2"],
"bar": ["execute"]
},
{
"id": 101,
"users_groups": ["admin_3"],
"bar": ["read"]
}
]
Everything I try in JQ results in me getting an incorrect output (where admin_1 incorrectly has bar=execute and admin_2 incorrectly has bar=write), similar to the following:
[
{
"id": 100,
"users_groups": [
"admin_1",
"admin_2"
],
"bar": ["read", "write", "execute"]
},
{
"id": 101,
"users_groups": ["admin_3"],
"bar": ["read"]
}
]
I have tried many variants of this query. Any idea what I should be doing instead?
cat file.json | jq -r '[.[] | select(has("foo")) |{"id", "users":(.foo[] | .users), "groups":(.foo[] | .groups), "bar":([.foo[].bar[] | .type])} ] '
The following filter groups by "type" as the question seems to require:
map(.id as $id
| [.foo[]
| {id: $id, bar: .bar[].type} +
{"users_groups": (.users + .groups)[]} ]
| group_by(.bar)
| map(.[0] + {"users_groups": [.[].users_groups]}) )
Output
[
[
{
"id": 100,
"bar": "execute",
"users_groups": [
"admin_2"
]
},
{
"id": 100,
"bar": "read",
"users_groups": [
"admin_1",
"admin_2"
]
},
{
"id": 100,
"bar": "write",
"users_groups": [
"admin_1"
]
}
],
[
{
"id": 101,
"bar": "read",
"users_groups": [
"admin_3"
]
}
]
]
Variations
To achieve the array-of-objects output format, simply tack on | [.[][]];
it would similarly be trivially easy to ensure that .bar is array-valued, though that might be pointless given that the grouping is by .type.
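A sketch of both variations applied to the sample input from the question (held here in a shell variable with the illustrative name json); the bar: [.[0].bar] part is just one possible way to make .bar array-valued:

```shell
# Sample input from the question, compacted onto one line
json='[{"id":100,"foo":[{"bar":[{"type":"read"},{"type":"write"}],"users":["admin_1"],"groups":[]},{"bar":[{"type":"execute"},{"type":"read"}],"users":[],"groups":["admin_2"]}]},{"id":101,"foo":[{"bar":[{"type":"read"}],"users":["admin_3"],"groups":[]}]}]'

# Same grouping filter, with .bar wrapped in an array and the
# per-id arrays flattened into a single array of objects
flat=$(printf '%s' "$json" | jq -c '
  map(.id as $id
      | [.foo[]
         | {id: $id, bar: .bar[].type} +
           {"users_groups": (.users + .groups)[]}]
      | group_by(.bar)
      | map(.[0] + {bar: [.[0].bar], "users_groups": [.[].users_groups]}))
  | [.[][]]')

printf '%s\n' "$flat"
```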