I have this JSON:
{
"time": "2022-02-28T22:00:55.196Z",
"severity": "INFO",
"params": [
{"key": "state", "value": "pending"},
{"key": "options", "value": "request"},
{"key": "description", "value": "[FILTERED]"}
],
"content_length": "231"
}
I want to print key value pairs of where the key matches to state and options and also the time and its value. I am able to print the time and all key value pairs by using below command, but not sure how to extract specific key value pairs.
jq '"time:\(.time)" ,[.params[] | "key:\(.key)" ,"value:\(.value)"]' test.json
This gives the output:
"time:2022-02-28T22:00:55.196Z"
[
"key:state",
"value:pending",
"key:options",
"value:request",
"key:description",
"value:[FILTERED]"
]
But my desired output is:
"time:2022-02-28T22:00:55.196Z"
"key:state",
"value:pending",
"key:options",
"value:request"
One solution to the stated problem would be:
< test.json jq '
"time:\(.time)",
[.params[] | select(.key|IN("state","options"))
| "key:\(.key)" ,"value:\(.value)"]
' | sed '/^[][]$/d'
However, it would almost certainly be better to modify the requirements slightly so that the output format is less idiosyncratic. This should also make it easier to formulate a cleaner (e.g. only-jq) solution.
You can use #csv (comma separated values).
Filter
"time:\(.time)",
(.params |
[map(select(.key=="state" or .key=="options"))[]
| "key:\(.key)", "value:\(.value)"]
| #csv)
Input
{
"time": "2022-02-28T22:00:55.196Z",
"severity": "INFO",
"params": [
{"key": "state", "value": "pending"},
{"key": "options", "value": "request"},
{"key": "description", "value": "[FILTERED]"}
],
"content_length": "231"
}
Output
time:2022-02-28T22:00:55.196Z
"key:state","value:pending","key:options","value:request"
Demo
https://jqplay.org/s/F_3QP6-EvK
Related
I want to parse a nested json to csv. The data looks similar to this.
{"tables":[{"name":"PrimaryResult","columns":[{"name":"name","type":"string"},{"name":"id","type":"string"},{"name":"custom","type":"dynamic"}]"rows":[["Alpha","1","{\"age\":\"23\",\"number\":\"xyz\"}]]]}
I want csv file as:
name id age number
alpha 1 23 xyz
I tried:
jq -r ".tables | .[] | .columns | map(.name)|#csv" demo.json > demo.csv
jq -r ".tables | .[] | .rows |.[]|#csv" demo.json >> demo.csv
But I am not getting expected result.
Output:
name id custom
alpha 1 {"age":"23","number":"xyz}
Expected:
name id age number
alpha 1 23 xyz
Assuming valid JSON input:
{
"tables": [
{
"name": "PrimaryResult",
"columns": [
{ "name": "name", "type": "string" },
{ "name": "id", "type": "string" },
{ "name": "custom", "type": "dynamic" }
],
"rows": [
"Alpha",
"1",
"{\"age\":\"23\",\"number\":\"xyz\"}"
]
}
]
}
And assuming fixed headers:
jq -r '["name", "id", "age", "number"],
(.tables[].rows | [.[0,1], (.[2] | fromjson | .age, .number)])
| #csv' input.json
Output:
"name","id","age","number"
"Alpha","1","23","xyz"
If any of the assumptions is wrong, you need to clarify your requirements, e.g.
How are column names determined?
What happens if the input contains multiple tables?
As the "dynamic" object always of the same shape? Or can it sometimes contain fewer, more, or different columns?
Assuming that the .rows array is a 2D array of rows and fields, and that a column of type "dynamic" always expects a JSON-encoded object whose fields represent further columns but may or may not always be present in every row.
Then you could go with transposing the headers array and the rows array in order to integratively process each column by their type, especially collecting all keys from the "dynamic" type on the fly, and then transpose it back to get the row-based CSV output.
Input (I have added another row for illustration):
{
"tables": [
{
"name": "PrimaryResult",
"columns": [
{
"name": "name",
"type": "string"
},
{
"name": "id",
"type": "string"
},
{
"name": "custom",
"type": "dynamic"
}
],
"rows": [
[
"Alpha",
"1",
"{\"age\":\"23\",\"number\":\"123\"}"
],
[
"Beta",
"2",
"{\"age\":\"45\",\"word\":\"xyz\"}"
]
]
}
]
}
Filter:
jq -r '
.tables[] | [.columns, .rows[]] | transpose | map(
if first.type == "string" then first |= .name
elif first.type == "dynamic" then
.[1:] | map(fromjson)
| (map(keys[]) | unique) as $keys
| [$keys, (.[] | [.[$keys[]]])] | transpose[]
else empty end
)
| transpose[] | #csv
'
Output:
"name","id","age","number","word"
"Alpha","1","23","123",
"Beta","2","45",,"xyz"
Demo
Lets say I have an I/P json file as below. And I want to extract the O/P in a CSV format with the below fields. Specifically, I want to get the value of the key "Gamma" in the o/p if the key "Gamma" exists in "tags" map. If the key doesn't exists, it should just print a NULL value. The expected o/p is below.
generated_time,platform,id,,
2021-09-09:12:03:12,earth,2eeee67748,Ray,2021-08-25 09:41:06
2021-09-09:12:03:12,sun,xxxxx12334,NULL,2021-08-25 10:11:31
[
{
"generated_time": "generated_time",
"platform": "platform",
"id": "id"
},
{
"generated_time": "2021-09-09:12:03:12",
"platform": "earth",
"id": "2eeee67748",
"tags": {
"app": "map",
"Gamma": "Ray",
"null": [
"allow-all-humans"
]
},
"created": "2021-08-25 09:41:06"
},
{
"generated_time": "2021-09-09:12:03:12",
"platform": "sun",
"id": "xxxxx12334",
"tags": {
"component": "machine",
"environment": "hot",
"null": [
"aallow-all-humans"
]
},
"created": "2021-08-25 10:11:31"
}
]
jq has a builtin #csv which renders an array
as CSV with double quotes for strings, and quotes escaped by repetition.
If the additional quoting (as compared to your expected output) isn't an issue, the following
jq --raw-output '
# produce an array for each element in the input array
.[] | [
# containing the first three columns unchanged
.generated_time, .platform, .id,
# if the input element has a field named "tags"
if has("tags")
# then add two more columns and replace an inexistant Gamma with "NULL"
then (.tags.Gamma // "NULL", .created)
# otherwise add two empty columns instead
else (null, null) end
# and convert the array into CSV format
] | #csv
' input.json
will produce
"generated_time","platform","id",,
"2021-09-09:12:03:12","earth","2eeee67748","Ray","2021-08-25 09:41:06"
"2021-09-09:12:03:12","sun","xxxxx12334","NULL","2021-08-25 10:11:31"
Preface: If the following is not possible with jq, then I completely accept that as an answer and will try to force this with bash.
I have two files that contain some IDs that, with some massaging, should be able to be combined into a single file. I have some content that I'll add to that as well (as seen in output). Essentially "mitre_test" should get compared to "sys_id". When compared, the "mitreid" from in2.json becomes technique_ID in the output (and is generally the unifying field of each output object).
Caveats:
There are some junk "desc" values placed in the in1.json that are there to make sure this is as programmatic as possible, and there are actually numerous junk inputs on the true input file I am using.
some of the mitre_test values have pairs and are not in a real array. I can split on those and break them out, but find myself losing the other information from in1.json.
Notice in the "metadata" for the output that is contains the "number" values from in1.json, and stored in a weird way (but the way that the receiving tool requires).
in1.json
[
{
"test": "Execution",
"mitreid": "T1204.001",
"mitre_test": "90b"
},
{
"test": "Defense Evasion",
"mitreid": "T1070.001",
"mitre_test": "afa"
},
{
"test": "Credential Access",
"mitreid": "T1556.004",
"mitre_test": "14b"
},
{
"test": "Initial Access",
"mitreid": "T1200",
"mitre_test": "f22"
},
{
"test": "Impact",
"mitreid": "T1489",
"mitre_test": "fa2"
}
]
in2.json
[
{
"number": "REL0001346",
"desc": "apple",
"mitre_test": "afa"
},
{
"number": "REL0001343",
"desc": "pear",
"mitre_test": "90b"
},
{
"number": "REL0001366",
"desc": "orange",
"mitre_test": "14b,f22"
},
{
"number": "REL0001378",
"desc": "pineapple",
"mitre_test": "90b"
}
]
The output:
[{
"techniqueID": "T1070.001",
"tactic": "defense-evasion",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001346"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1204.001",
"tactic": "execution",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001343"
},
{
"name": "DET_ID",
"value": "REL0001378"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1556.004",
"tactic": "credential-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1200",
"tactic": "initial-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
}
]
I'm assuming I have some splitting to do on mitre_test with something like .mitre_test |= split(",")), and there are some joins I'm assuming, but doing so causes data loss or mixing up of the data. You'll notice the static data in the output exists as well, but is likely easy to place in and as such isn't as much of an issue.
Edit: reduced some of the match IDs so that it is easier to look at while analyzing the in1 and in2 files. Also simplified the two inputs to have a similar structure so that the answer is easier to understand later.
The requirements are somewhat opaque but it's fairly clear that if the task can be done by computer, it can be done using jq.
From the description, it would appear that one of the unusual aspects of the problem is that the "dictionary" defined by in1.json must be derived by splitting the key names that are CSV (comma-separated values). Here therefore is a jq def that will do that:
# Input: a JSON dictionary for which some keys are CSV,
# Output: a JSON dictionary with the CSV keys split on the commas
def refine:
. as $in
| reduce keys_unsorted[] as $k ({};
if ($k|index(","))
then ($k/",") as $keys
| . + ($keys | map( {(.): $in[$k]}) | add)
else .[$k] = $in[$k]
end );
You can see how this works by running:
INDEX($mitre.records[]; .mitre_test) | refine
using an invocation of jq such as:
jq --argfile mitre in1.json -f program.jq in2.json
For the joining part of the problem, there are many relevant Q&As on SO, e.g.
How to join JSON objects on particular fields using jq?
There is probably a much more elegant way to do this, but I ended up manually walking around things and piping to new output.
Explanation:
Read in both files, pull the fields I need.
Break out the mitre_test values that were previously just a comma separated set of values with map and try.
Store the none-changing fields as a variable and then manipulate mitre_test to become an appropriately split array, removing nulls.
Group by mitre_test values, since they are the common thing that the output is based on.
Cleanup more nulls.
Sort output to look like I want it.
jq . in1.json in2.json | \
jq '.[] |{number: .number, test: .test, mitreid: .mitreid, mitre_test: .mitre_test}' |\
jq -s '[. |map(try(.mitre_test |= split(",")) // .)|\
.[] | [.number,.test,.mitreid] as $h | .mitre_test[] |$h + [.] | \
{DET_ID: .[0], tactic: .[1], techniqueID: .[2], mitre_test: .[3]}] |\
del(.[][] | nulls)' |jq '[group_by(.mitre_test)[]|{mitre_test: .[0].mitre_test, techniqueID: [.[].techniqueID],tactic: [.[].tactic], DET_ID: [.[].DET_ID]}]|\
del(.[].techniqueID[] | nulls) | del(.[].tactic[] | nulls) | del(.[].DET_ID[] | nulls)' | \
jq '.[]| [{techniqueID: .techniqueID[0],tactic: .tactic[0], metadata: [{name: "DET_ID",value: .DET_ID[]}]}] | .[] | \
select((.metadata|length)>0)'
It was a long line, so I split it among some of the basic ideas.
I have some JSON output I am trying to parse with jq. I read some examples on filtering but I don't really understand it and my output it more complicated than the examples. I have no idea where to even begin beyond jq '.[]' as I don't understand the syntax of jq beyond that and the hierarchy and terminology are challenging as well. My JSON output is below. I want to return the value for Valid where the ItemName equals Item_2. How can I do this?
"1"
[
{
"GroupId": "1569",
"Title": "My_title",
"Logo": "logo.jpg",
"Tags": [
"tag1",
"tag2",
"tag3"
],
"Owner": [
{
"Name": "John Doe",
"Id": "53335"
}
],
"ItemId": "209766",
"Item": [
{
"Id": 47744,
"ItemName": "Item_1",
"Valid": false
},
{
"Id": 47872,
"ItemName": "Item_2",
"Valid": true
},
{
"Id": 47872,
"ItemName": "Item_3",
"Valid": false
}
]
}
]
"Browse"
"8fj9438jgge9hdfv0jj0en34ijnd9nnf"
"v9er84n9ogjuwheofn9gerinneorheoj"
Except for the initial and trailing JSON scalars, you'd simply write:
.[] | .Item[] | select( .ItemName == "Item_2" ) | .Valid
In your particular case, to ensure the top-level JSON scalars are ignored, you could prefix the above with:
arrays |
I have a JSON formatted stream, full of objects. Each object looks like this:
{
"object": "alpha",
"attributes": [
{
"type": "A",
"description": "a",
"value": 1271129046.9144535
},
{
"type": "B",
"description": "b",
"value": 6738889338.63777
},
{
"type": "C",
"description": "c",
"value": 214918692.38456276
},
{
"type": "D",
"description": "d",
"value": 140222346.75136077
},
{
"type": "E",
"description": "e",
"value": 2085635554.8128803
}
]
}
I'd like to get data out as:
alpha,A,a,1271129046.9144535
alpha,B,b,6738889338.63777
alpha,C,c,214918692.38456276
alpha,D,d,140222346.75136077
alpha,E,e,2085635554.8128803
The next object may be "beta" instead of "alpha", hence I don't want to just strip the "object" key.
My restrictions are that I want to process this stream in a bash pipeline. I'm hoping I can just use "jq" for this, rather than piping through python/ruby/perl etc which I'd rather not depend on if I can help it.
Any ideas would be most grateful!
It looks like you're building up CSV data, the #csv filter was made for this. You just need to collect an array of the values you want to write out and pass it in to the filter. You could do this:
$ jq -r '.attributes[] as $attr | [.object, $attr.type, $attr.description, $attr.value] | #csv' input.json
Which produces this:
"alpha","A","a",1271129046.9144535
"alpha","B","b",6738889338.63777
"alpha","C","c",214918692.38456276
"alpha","D","d",140222346.75136077
"alpha","E","e",2085635554.8128803
(1) Slightly briefer than the accepted answer:
jq -r '[.object] + (.attributes[] | [.type, .description, .value]) | #csv'
(2) If you don't want the quotation marks, then one possibility would be:
jq -r '"\(.object)," + (.attributes[] | "\(.type),\(.description),\(.value)")'