why parsing with jq returns null? - json

i am trying to extract values of 3 fields (status, id, name) from my json file by using jq tool, here is my json:
cat parse.json
{
"stream": {
"_id": 65675798730520654496,
"broadcast_platform": "live",
"community_id": "",
"community_ids": [],
"average_fps": 60.0247524752,
"delay": 0,
"created_at": "2018-09-26T07:25:38Z",
"is_playlist": false,
"stream_type": "live",
"preview": {
"small": "https://static-cdn.jtvnw.net/previews-ttv/live_user_versuta-80x4512wdfqf.jpg",
},
"channel": {
"mature": true,
"status": "status",
"broadcaster_language": "ru",
"broadcaster_software": "",
"_id": 218025408945123423423445,
"name": "djvbsdhvsdvasdv",
"created_at": "2011-04-17T17:31:36.091604Z",
"updated_at": "2018-09-26T09:49:04.434245Z",
"partner": true,
"video_banner": null,
"profile_banner": "https://static-cdn.jtvnw.net/jtv_user_pictures/25c2bec3-95b8-4347-aba0-128b3b913b0d-profile_banner-480.png",
"profile_banner_background_color": "",
"views": 103911737,
"followers": 446198,
"broadcaster_type": "",
"description": "",
"private_video": false,
"privacy_options_enabled": false
}
}
}
online json validators say that it is valid, when i try to get some field it return null
cat parse.json | jq '.channel'
null
cat parse.json | jq '.channel.status'
null
what am i doing wrong guys ?

Your JSON object has a top-level field "stream" You need to access "stream" to access the other sub-properties, e.g. channel:
jq '.stream.channel.status' parse.json
You can also do cat parse.json | jq '.stream.channel.status'.
Your example JSON is also invalid because the stream.preview.small property has a trailing comma. Simply removing that comma will make it valid, though.

To deal with the invalid JSON, you could use a JSON rectifier such as hjson; to avoid any hassles associated with identifying the relevant paths, you could use ..|objects. Thus for example:
$ hjson -j parse.json | jq '..|objects|select(.status) | .status, ._id, .name'
"status"
218025408945123440000000
"djvbsdhvsdvasdv"

Related

How to parse this boolean contained JSON output with jq?

The JSON output I am trying to parse:
{
"success": true,
"data": {
"aa": [
{
"timestamp": 123456,
"price": 1
},
{
"timestamp": 123457,
"price": 2
],
"bb": [
{
"timestamp": 123456,
"price": 3
},
{
"timestamp": 123457,
"price": 4
}
]
}
}
So after banging my head against the wall a million times, I just removed the "success": true", line from the output and I could easily do jq stuff with it. Otherwise if I ran for example:
cat jsonfile.json | jq -c .[].aa
I would get:
Cannot index boolean with string "aa"
Which makes sense, since the first key is boolean. But I have no clue how to skip it while processing with jq.
Goal is to filter only timestamp and price of "aa", without giving any care about the "success": true key/value pair.
You need to select the data field first: jq .data.aa[]

Combine files in jq based on similar ID object and reform data

Preface: If the following is not possible with jq, then I completely accept that as an answer and will try to force this with bash.
I have two files that contain some IDs that, with some massaging, should be able to be combined into a single file. I have some content that I'll add to that as well (as seen in output). Essentially "mitre_test" should get compared to "sys_id". When compared, the "mitreid" from in2.json becomes technique_ID in the output (and is generally the unifying field of each output object).
Caveats:
There are some junk "desc" values placed in the in1.json that are there to make sure this is as programmatic as possible, and there are actually numerous junk inputs on the true input file I am using.
some of the mitre_test values have pairs and are not in a real array. I can split on those and break them out, but find myself losing the other information from in1.json.
Notice in the "metadata" for the output that is contains the "number" values from in1.json, and stored in a weird way (but the way that the receiving tool requires).
in1.json
[
{
"test": "Execution",
"mitreid": "T1204.001",
"mitre_test": "90b"
},
{
"test": "Defense Evasion",
"mitreid": "T1070.001",
"mitre_test": "afa"
},
{
"test": "Credential Access",
"mitreid": "T1556.004",
"mitre_test": "14b"
},
{
"test": "Initial Access",
"mitreid": "T1200",
"mitre_test": "f22"
},
{
"test": "Impact",
"mitreid": "T1489",
"mitre_test": "fa2"
}
]
in2.json
[
{
"number": "REL0001346",
"desc": "apple",
"mitre_test": "afa"
},
{
"number": "REL0001343",
"desc": "pear",
"mitre_test": "90b"
},
{
"number": "REL0001366",
"desc": "orange",
"mitre_test": "14b,f22"
},
{
"number": "REL0001378",
"desc": "pineapple",
"mitre_test": "90b"
}
]
The output:
[{
"techniqueID": "T1070.001",
"tactic": "defense-evasion",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001346"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1204.001",
"tactic": "execution",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001343"
},
{
"name": "DET_ID",
"value": "REL0001378"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1556.004",
"tactic": "credential-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1200",
"tactic": "initial-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
}
]
I'm assuming I have some splitting to do on mitre_test with something like .mitre_test |= split(",")), and there are some joins I'm assuming, but doing so causes data loss or mixing up of the data. You'll notice the static data in the output exists as well, but is likely easy to place in and as such isn't as much of an issue.
Edit: reduced some of the match IDs so that it is easier to look at while analyzing the in1 and in2 files. Also simplified the two inputs to have a similar structure so that the answer is easier to understand later.
The requirements are somewhat opaque but it's fairly clear that if the task can be done by computer, it can be done using jq.
From the description, it would appear that one of the unusual aspects of the problem is that the "dictionary" defined by in1.json must be derived by splitting the key names that are CSV (comma-separated values). Here therefore is a jq def that will do that:
# Input: a JSON dictionary for which some keys are CSV,
# Output: a JSON dictionary with the CSV keys split on the commas
def refine:
. as $in
| reduce keys_unsorted[] as $k ({};
if ($k|index(","))
then ($k/",") as $keys
| . + ($keys | map( {(.): $in[$k]}) | add)
else .[$k] = $in[$k]
end );
You can see how this works by running:
INDEX($mitre.records[]; .mitre_test) | refine
using an invocation of jq such as:
jq --argfile mitre in1.json -f program.jq in2.json
For the joining part of the problem, there are many relevant Q&As on SO, e.g.
How to join JSON objects on particular fields using jq?
There is probably a much more elegant way to do this, but I ended up manually walking around things and piping to new output.
Explanation:
Read in both files, pull the fields I need.
Break out the mitre_test values that were previously just a comma separated set of values with map and try.
Store the none-changing fields as a variable and then manipulate mitre_test to become an appropriately split array, removing nulls.
Group by mitre_test values, since they are the common thing that the output is based on.
Cleanup more nulls.
Sort output to look like I want it.
jq . in1.json in2.json | \
jq '.[] |{number: .number, test: .test, mitreid: .mitreid, mitre_test: .mitre_test}' |\
jq -s '[. |map(try(.mitre_test |= split(",")) // .)|\
.[] | [.number,.test,.mitreid] as $h | .mitre_test[] |$h + [.] | \
{DET_ID: .[0], tactic: .[1], techniqueID: .[2], mitre_test: .[3]}] |\
del(.[][] | nulls)' |jq '[group_by(.mitre_test)[]|{mitre_test: .[0].mitre_test, techniqueID: [.[].techniqueID],tactic: [.[].tactic], DET_ID: [.[].DET_ID]}]|\
del(.[].techniqueID[] | nulls) | del(.[].tactic[] | nulls) | del(.[].DET_ID[] | nulls)' | \
jq '.[]| [{techniqueID: .techniqueID[0],tactic: .tactic[0], metadata: [{name: "DET_ID",value: .DET_ID[]}]}] | .[] | \
select((.metadata|length)>0)'
It was a long line, so I split it among some of the basic ideas.

How to loop through JSON objects to extract specific key values with jq

I am new to Bash scripting and jq. I am trying to extract the key value pairs name and transcription.normalized from a JSON object.
I have learned how to get a list of all the values from name and normalized separately but it is not really what I am looking for.
cat submission.json | jq '.documents[] .pages[] .fields[] .name, .documents[] .pages[] .fields[] .transcription.normalized'
I am wondering if I need to perform some sort of loop but just not sure. I really want a single script that pulls those 2 fields in a format that I can easily dump to a CSV file.
This is the example of what the JSON looks like.
{
"id": 1,
"state": "complete",
"substate": null,
"exceptions": [],
"name": "Sender Account Number",
"output_name": null,
"field_definition_attributes": {
"required": false,
"data_type": "Account Number",
"multiline": false,
"consensus_required": false,
"supervision_override": null
},
"transcription": {
"raw": "1685-0441-1",
"normalized": "168504411",
"source": "machine_transcription",
"data_deleted": false,
"user_transcribed": null,
"row_index": null
},
"field_image_url": "/api/v4/image/be167a88-9d1d-43bc-82b2-3d96d8c06656?start_x=0.3110429607297866&start_y=0.1052441592299208&end_x=0.5696909842243418&end_y=0.16043316955780607"
}
You don't need a loop. And jq can produce CSV out of arrays.
jq -r '.documents[].pages[].fields[] | [.name, .transcription.normalized] | #csv' file

How do I return the id associated with the test json element?

I return the following json from a curl. I've simplified the ID's however my goal is to return the id related to the test object. The desired return value is 55555.
I'm using jq to parse the string.
https://stedolan.github.io/jq
[
{
"attributes": null,
"id": "44444",
"name": "production",
"type": "system_group"
},
{
"attributes": null,
"id": "55555",
"name": "test",
"type": "system_group"
},
{
"attributes": null,
"id": "66666",
"name": "uat",
"type": "system_group"
}
]
It should be pretty straightforward filter, use the select and contain constructs.
jq '.[] | select( .name| contains("test")) | .id'
Call the filter with -r to remove the quotes and print the raw value.
jqplay-URL

Get outputs from jq on a single line

I got below output using: https://stackoverflow.com/a/40330344
(.issues[] | {key, status: .fields.status.name, assignee: .fields.assignee.emailAddress})
Output:
{
"key": "SEA-739",
"status": "Open",
"assignee": null
}
{
"key": "SEA-738",
"status": "Resolved",
"assignee": "user2#mycompany.com"
}
But I need to parse each and every line but it's tough to identify which assignee is for which key as far as key group is concerned. Is this possible to make one bunch in one row using jq?
Expected output:
{ "key": "SEA-739", "status": "Open", "assignee": null }
{ "key": "SEA-738", "status": "Resolved", "assignee": "user2#mycompany.com"}
OR
{ "SEA-739", "Open", null }
{ "SEA-738", "Resolved", user2#mycompany.com }
-c is what you likely need
Using the output you posted above, you can process it further:
jq -c . input
To Give;
{"key":"SEA-739","status":"Open","assignee":null}
{"key":"SEA-738","status":"Resolved","assignee":"user2#mycompany.com"}
Or you can just change your original command
FROM
jq -r '(.issues[] | {key, status: .fields.status.name, assignee: .fields.assignee.emailAddress})'
TO
jq -c '(.issues[] | {key, status: .fields.status.name, assignee: .fields.assignee.emailAddress})'
Not precisely an answer to the long version of the question, but for people who Googled this looking for other single line output formats from jq:
$ jq -r '[.key, .status, .assignee]|#tsv' <<<'
{
"key": "SEA-739",
"status": "Open",
"assignee": null
}
{
"key": "SEA-738",
"status": "Resolved",
"assignee": "user2#mycompany.com"
}'
outputs:
SEA-739 Open
SEA-738 Resolved user2#mycompany.com
#sh rather than #tsv returns:
'SEA-739' 'Open' null
'SEA-738' 'Resolved' 'user2#mycompany.com'
Additionally, there are other output formats to do things such as escape the output, like #html, or encode it, as with #base64. The list is available in the Format strings and escaping section of either the jq(1) man page or online at stedolan.github.io/jq/manual.