Using jq, how can I extract the host, name, and date_happened fields from the JSON output below? I read another thread about this here on Stack Overflow, but I still can't get it to work.
Thanks
{
"events": [
{
"alert_type": "success",
"children": [
{
"alert_type": "error",
"date_happened": 1573502725,
"id": "5188183926379101887"
},
{
"alert_type": "success",
"date_happened": 1573503145,
"id": "5188190972457497744"
}
],
"comments": [],
"date_happened": 1573502725,
"device_name": null,
"host": "i-0e4b192579a9b423b",
"id": 5188183933173874377,
"is_aggregate": true,
"priority": "normal",
"resource": "/api/v1/events/5188183933173874377",
"source": "Monitor Alert",
"tags": [
"autoscaling_group:app2_backend-asg-prod",
"availability-zone:us-east-1b",
"datadog-agent:true",
"environment:prod",
"host:i-0e6b192579a9b423b",
"iam_profile:app2_backend_instance_profile",
"image:ami-2769055d",
"instance-type:m4.large",
"monitor",
"name:app2_backend-prod",
"region:us-east-1",
"role:app2_backend",
The jq filter:
.events[]
| [.host,
(.tags[] | select( test("^name:") ) | sub("name:";"")),
.date_happened]
produces
["i-0e4b192579a9b423b","app2_backend-prod",1573502725]
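If you want to check the logic outside jq, here is a rough Python equivalent of the filter above. It is only a sketch: the sample is a trimmed copy of the question's JSON (the tag list in the question is cut off, so just a few tags are used), and the helper name extract_row is made up for illustration.

```python
import json

# Trimmed copy of the question's input; only the fields the filter reads are kept.
SAMPLE = """
{
  "events": [
    {
      "date_happened": 1573502725,
      "host": "i-0e4b192579a9b423b",
      "tags": ["environment:prod", "monitor", "name:app2_backend-prod"]
    }
  ]
}
"""

PREFIX = "name:"


def extract_row(event):
    """Build [host, <name tags with the prefix stripped>, date_happened] for one event."""
    # Mirrors: .tags[] | select(test("^name:")) | sub("name:"; "")
    names = [tag[len(PREFIX):] for tag in event["tags"] if tag.startswith(PREFIX)]
    # Like the jq array constructor, every matching tag lands in the same row.
    return [event["host"], *names, event["date_happened"]]


rows = [extract_row(event) for event in json.loads(SAMPLE)["events"]]
print(rows)
```

As in the jq version, an event with no name: tag would simply give a two-element row.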
I am trying to extract values from JSON that I obtained with curl for API testing. My JSON looks like the sample below. How can I extract the value "20456" from it?
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.076+0000"
},
"links": {},
"data": {
"id": 24843,
"username": "abcd",
"firstName": "abc",
"lastName": "xyz",
"email": "abc#abc.com",
"phone": "",
"title": "",
"location": "",
"licenseType": "FLOATING",
"active": true,
"uid": "u24843",
"type": "users"
}
}
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.282+0000",
"pageInfo": {
"startIndex": 0,
"resultCount": 1,
"totalResults": 1
}
},
"links": {
"data.createdBy": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.createdBy}"
},
"data.fields.user1": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.fields.user1}"
},
"data.modifiedBy": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.modifiedBy}"
},
"data.fields.projectManager": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.fields.projectManager}"
},
"data.parent": {
"type": "projects",
"href": "https://abc#abc.com/rest/v1/projects/{data.parent}"
}
},
"data": [
{
"id": 20456,
"projectKey": "Stratus",
"parent": 20303,
"isFolder": false,
"createdDate": "2018-03-12T23:46:59.000+0000",
"modifiedDate": "2020-04-28T22:14:35.000+0000",
"createdBy": 18994,
"modifiedBy": 18865,
"fields": {
"projectManager": 18373,
"user1": 18628,
"projectKey": "Stratus",
"text1": "",
"name": "Stratus",
"description": "",
"date2": "2019-03-12",
"date1": "2018-03-12"
},
"type": "projects"
}
]
}
I have tried the following, but I end up getting an error:
▶ cat jqTrial.txt | jq '.data[].id'
jq: error (at <stdin>:21): Cannot index number with string "id"
20456
I also tried this, but I get strings outside the object that I am not sure how to remove:
cat jqTrial.txt | jq '.data[]'
Assuming you want the project id not the user id:
jq '
.data
| if type == "object" then . else .[] end
| select(.type == "projects")
| .id
' file.json
There's probably a better way to write the second expression.
Indeed, thanks to @pmf:
.data | objects // arrays[] | select(.type == "projects").id
Your input consists of two JSON documents; both have a data field at the top level. In the first document, .data is itself an object with an .id field; in the second, .data is an array containing one object, which also has an .id field.
To retrieve both, you could use the --slurp (or -s) option, which wraps both top-level objects into an array so that you can address them separately by index:
jq --slurp '.[0].data.id, .[1].data[].id' jqTrial.txt
24843
20456
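The same two points (the file holds two documents, and .data is an object in one and an array in the other) can be sketched in Python. This is an illustration only, not part of either answer: iter_documents and project_ids are hypothetical helper names, and the sample string is a cut-down version of the question's input.

```python
import json


def iter_documents(text):
    """Yield each top-level JSON value from a string holding several back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so step over it first.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


def project_ids(text):
    """Collect .id from every .data item whose type is "projects"."""
    ids = []
    for doc in iter_documents(text):
        data = doc["data"]
        # Normalise: a lone object is treated as a one-item list.
        items = [data] if isinstance(data, dict) else data
        ids.extend(item["id"] for item in items if item.get("type") == "projects")
    return ids


SAMPLE = (
    '{"data": {"id": 24843, "type": "users"}}\n'
    '{"data": [{"id": 20456, "type": "projects"}]}'
)
print(project_ids(SAMPLE))
```

Normalising .data to a list up front plays the same role as the if type == "object" branch in the first filter.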
Info
I have a Terraform state file (JSON) with some deprecated attributes.
I would like to remove these deprecated attributes.
I tried using jq with select() and del(), but could not get back my full JSON without the deprecated attribute timeouts.
Problem
How can I get my full JSON without the timeouts attribute, for only one resource type, google_dns_record_set?
Data
{
"version": 4,
"terraform_version": "1.0.6",
"serial": 635,
"lineage": "6a9c2392-fdae-2b54-adcc-7366f262ffa4",
"outputs": {"test":"test1"},
"resources": [
{
"module": "module.resources",
"mode": "data",
"type": "google_client_config"
},
{
"module": "module.xxx.module.module1[\"cluster\"]",
"mode": "managed",
"type": "google_dns_record_set",
"name": "public_ip_ic_dns",
"provider": "module.xxx.provider[\"registry.terraform.io/hashicorp/google\"]",
"instances": [
{
"schema_version": 0,
"attributes": {
"id": "projects/xxx-xxx/managedZones/xxx--public/rrsets/*.net1.cluster.xxx--public.net.com./A",
"managed_zone": "xxx--public",
"name": "*.net1.cluster.xxx--public.net.com.",
"project": "xxx-xxx",
"rrdatas": [
"11.22.33.44"
],
"timeouts": null,
"ttl": 300,
"type": "A"
},
"sensitive_attributes": [],
"private": "xxx",
"dependencies": [
"xxx"
]
}
]
}
]
}
Command
jq -r '.resources[] | select(.type=="google_dns_record_set").instances[].attributes | del(.timeouts)' data.json
Pull the del command up front so that the whole selection becomes its path argument. del then removes only that path and outputs the rest of the document unchanged:
del(.resources[] | select(.type=="google_dns_record_set").instances[].attributes.timeouts)
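To make the difference concrete, here is a small Python sketch of what the del(...) form does: it walks to the selected attributes, drops one key, and returns the whole document. The function name drop_timeouts and the trimmed STATE sample are assumptions for illustration, not anything from Terraform or jq.

```python
import copy


def drop_timeouts(state, resource_type="google_dns_record_set"):
    """Return a copy of the state with attributes.timeouts removed for one resource type."""
    result = copy.deepcopy(state)  # leave the caller's document untouched
    for resource in result.get("resources", []):
        if resource.get("type") != resource_type:
            continue  # other resource types are kept exactly as they are
        for instance in resource.get("instances", []):
            attributes = instance.get("attributes") or {}
            attributes.pop("timeouts", None)  # no error if the key is absent
    return result


# Trimmed version of the question's state file.
STATE = {
    "version": 4,
    "resources": [
        {"mode": "data", "type": "google_client_config"},
        {
            "type": "google_dns_record_set",
            "instances": [{"attributes": {"timeouts": None, "ttl": 300, "type": "A"}}],
        },
    ],
}

cleaned = drop_timeouts(STATE)
print(cleaned["resources"][1]["instances"][0]["attributes"])
```

The original command in the question printed only the selected attributes because the pipeline had already narrowed the output before del ran.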
I have a Kafka message like the one below, and I'm trying to read data from it using JSONPath. However, I'm having trouble reading some of the attributes. Here is the sample message.
sample1:
{
"header": {
"bu": "google",
"id": "12345",
"bum": "google",
"originTimestamp": "2021-10-09T15:17:09.842+00:00",
"batchSize": "0",
"jobType": "Batch"
},
"payload": {
"derivationdetails": {
"Id": "6783jhvvh897u31y283y",
"itemid": "1234567",
"batchid": 107,
"attributes": {
"itemid": "1234567",
"lineNbr": "1498",
"cat": "5929",
"Id": "6783jhvvh897u31y283y",
"indicator": "false",
"subcat": "3514"
},
"Exception": {
"values": [
{
"type": "PICK",
"value": "blocked",
"Reason": [
"RULE"
],
"rules": [
"439"
]
}
],
"rulesBagInfo": [
{
"Idtype": "XXXX",
"uniqueid": "7889423rbhevfhjaufdyeuiryeukjbdafvjd",
"rulesMatch": [
"439"
]
}
]
}
}
}
}
sample2: the same message, but note the difference in "payload"
{
"header": {
"bu": "google",
"id": "12345",
"bum": "google",
"originTimestamp": "2021-10-09T15:17:09.842+00:00",
"batchSize": "0",
"jobType": "Batch"
},
"payload": {
"Id": "6783jhvvh897u31y283y",
"itemid": "1234567",
"batchid": 107,
"attributes": {
"itemid": "1234567",
"lineNbr": "1498",
"cat": "5929",
"Id": "6783jhvvh897u31y283y",
"indicator": "false",
"subcat": "3514"
},
"Exception": {
"values": [
{
"type": "PICK",
"value": "blocked",
"Reason": [
"RULE"
],
"rules": [
"439"
]
}
],
"rulesBagInfo": [
{
"Idtype": "XXXX",
"uniqueid": "7889423rbhevfhjaufdyeuiryeukjbdafvjd",
"rulesMatch": [
"439"
]
}
]
}
}
}
As you can see, the message sometimes has "derivationdetails" and sometimes doesn't. Regardless of whether it exists, I need to read the values of id, itemid, and batchid. I tried using
$.payload[*].id
$.payload[*].itemid
$.payload[*].batchid
But batchid returns null even though it has a value in the message, and the fields under "attributes" also return null with the expressions above. For fields under "attributes" I'm using this (example):
$.payload.attributes.itemId
And I'm completely blank on how to read the part below.
"Exception": {
"values": [
{
"type": "PICK",
"value": "blocked",
"Reason": [
"RULE"
],
"rules": [
"439"
]
}
],
"rulesBagInfo": [
{
"Idtype": "XXXX",
"uniqueid": "7889423rbhevfhjaufdyeuiryeukjbdafvjd",
"rulesMatch": [
"439"
]
I'm new to this and need some suggestions on how to read the attributes properly. Any help would be much appreciated. Thanks
Use .. (recursive descent, also called deep scan; JSONPath borrows this syntax from E4X) to get the values. Note that it returns a list, with several items if the same key is nested at more than one depth.
The JSONPath expressions below return a list with one item each for both sample1 and sample2:
$.payload..attributes.Id
$.payload..attributes.itemid
$.payload..batchid
$.payload..Exception
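If it helps to see what the .. operator is doing, the following Python sketch implements a plain deep scan by hand. It is not a JSONPath library, deep_scan is a made-up name, and the two payloads are cut-down versions of sample1 and sample2.

```python
def deep_scan(node, key):
    """Return every value stored under `key` at any depth, in document order."""
    found = []
    if isinstance(node, dict):
        for k, v in node.items():
            if k == key:
                found.append(v)
            found.extend(deep_scan(v, key))  # keep descending into the value
    elif isinstance(node, list):
        for item in node:
            found.extend(deep_scan(item, key))
    return found


# sample1 shape: fields wrapped in "derivationdetails"
payload1 = {"derivationdetails": {"batchid": 107, "attributes": {"itemid": "1234567"}}}
# sample2 shape: the same fields directly under "payload"
payload2 = {"batchid": 107, "attributes": {"itemid": "1234567"}}

for payload in (payload1, payload2):
    batch_ids = deep_scan(payload, "batchid")
    item_ids = [attrs["itemid"] for attrs in deep_scan(payload, "attributes")]
    print(batch_ids, item_ids)
```

Both shapes give the same result, which is why the expressions above work whether or not "derivationdetails" is present. The result is always a list, so take the first item if you expect exactly one match.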
My original JSON is given below.
[
{
"id": "1",
"name": "AA_1",
"total": "100002",
"files": [
{
"filename": "8665b987ab48511eda9e458046fbc42e.csv",
"filename_original": "some.csv",
"status": "3",
"total": "100002",
"time": "2020-08-24 23:25:49"
}
],
"status": "3",
"created": "2020-08-24 23:25:49",
"filenames": "8665b987ab48511eda9e458046fbc42e.csv",
"is_append": "0",
"is_deleted": "0",
"comment": null
},
{
"id": "4",
"name": "AA_2",
"total": "43806503",
"files": [
{
"filename": "1b4812fe634938928953dd40db1f70b2.csv",
"filename_original": "other.csv",
"status": "3",
"total": "21903252",
"time": "2020-08-24 23:33:43"
},
{
"filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"filename_original": "some.csv",
"status": "2",
"total": 0,
"time": "2020-08-24 23:29:30"
}
],
"status": "2",
"created": "2020-08-24 23:35:51",
"filenames": "1b4812fe634938928953dd40db1f70b2.csv&&63ab85fef2412ce80ae8bd018497d8bf.csv",
"is_append": "0",
"is_deleted": "0",
"comment": null
}
]
From this JSON I want to create new objects by combining fields from objects which have status: 2 and their files which also have the same pair, status: 2.
So, I am expecting a JSON array as below.
[
{
"id": "4",
"name": "AA_2",
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": 2
}
]
So far I tried with this JQ filter:
.[]|select(.status=="2")|[{id:.id,file_filename:.files[].filename,file_status:.files[].status}]
But this produces some invalid data.
[
{
"id": "4", # want to remove this as file.status != 2
"file_filename": "1b4812fe634938928953dd40db1f70b2.csv",
"file_status": "3"
},
{
"id": "4",
"file_filename": "1b4812fe634938928953dd40db1f70b2.csv",
"file_status": "2"
},
{
"id": "4", # Repeat
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": "3"
},
{
"id": "4", # Repeat
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": "2"
}
]
How do I filter the new JSON using JQ and remove these duplicate objects?
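By applying the [] operator to .files twice, you iterate over the files once for the filename and again, independently, for the status, so you get every combination of the two (a combinatorial explosion). Iterate over the files only once instead, for example: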
[ .[] | select(.status == "2") | {id, name} + (.files[] | select(.status == "2") | {file_filename: .filename, file_status: .status}) ]
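The structure of that filter is easier to see as nested loops. Below is a rough Python rendering, with a hypothetical function name (flatten_status) and a trimmed copy of the question's data. Note that file_status stays the string "2", as it does in the jq filter.

```python
def flatten_status(groups, status="2"):
    """One row per (group, file) pair in which both carry the given status."""
    rows = []
    for group in groups:
        if group["status"] != status:
            continue
        # Iterate over the files exactly once, so filename and status
        # always come from the same file object.
        for f in group["files"]:
            if f["status"] == status:
                rows.append({
                    "id": group["id"],
                    "name": group["name"],
                    "file_filename": f["filename"],
                    "file_status": f["status"],
                })
    return rows


# Trimmed copy of the question's input.
DATA = [
    {"id": "1", "name": "AA_1", "status": "3",
     "files": [{"filename": "8665b987ab48511eda9e458046fbc42e.csv", "status": "3"}]},
    {"id": "4", "name": "AA_2", "status": "2",
     "files": [
         {"filename": "1b4812fe634938928953dd40db1f70b2.csv", "status": "3"},
         {"filename": "63ab85fef2412ce80ae8bd018497d8bf.csv", "status": "2"},
     ]},
]

print(flatten_status(DATA))
```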
I want to parse the individual elements of the inner JSON objects so I can load them into a database.
The following is the JSON object. How can I parse elements like id, name, queue, etc.? I will iterate over them in a loop and build the insert query.
{
"apps": {
"app": [
{
"id": "application_1540378900448_18838",
"user": "hive",
"name": "insert overwrite tabl...summary_view_stg_etl(Stage-2)",
"queue": "Data_Ingestion",
"state": "FINISHED",
"finalStatus": "SUCCEEDED",
"progress": 100
},
{
"id": "application_1540378900448_18833",
"user": "hive",
"name": "insert into SNOW_WORK...metric_definitions')(Stage-13)",
"queue": "Data_Ingestion",
"state": "FINISHED",
"finalStatus": "SUCCEEDED",
"progress": 100
}
]
}
}
You're better off converting the data to a format that database tools consume easily, such as CSV, and loading it from there.
$ jq -r '(.apps.app[0] | keys_unsorted) as $k
| $k, (.apps.app[] | [.[$k[]]])
| @csv
' input.json
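If jq is not available where the load runs, roughly the same conversion can be sketched with Python's csv module. Treat this as an approximation: apps_to_csv is a made-up name, the sample keeps only three of the fields, and QUOTE_NONNUMERIC is used to mimic the way @csv quotes strings but not numbers.

```python
import csv
import io


def apps_to_csv(doc):
    """Render .apps.app as CSV: a header row from the first app's keys, then one row per app."""
    apps = doc["apps"]["app"]
    header = list(apps[0].keys())  # insertion order, like keys_unsorted
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    writer.writerow(header)
    for app in apps:
        # Missing keys come back as None, which the csv module writes as an empty field.
        writer.writerow([app.get(key) for key in header])
    return buf.getvalue()


# Cut-down version of the question's input.
DOC = {
    "apps": {
        "app": [
            {"id": "application_1540378900448_18838", "user": "hive", "progress": 100},
            {"id": "application_1540378900448_18833", "user": "hive", "progress": 100},
        ]
    }
}

print(apps_to_csv(DOC), end="")
```

Like the jq filter, this takes the column list from the first app only, so it assumes every app has the same keys.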
It's pretty simple: just fetch the element that holds the array of values and iterate over it.
var JSONOBJ={
"apps": {
"app": [
{
"id": "application_1540378900448_18838",
"user": "hive",
"name": "insert overwrite tabl...summary_view_stg_etl(Stage-2)",
"queue": "Data_Ingestion",
"state": "FINISHED",
"finalStatus": "SUCCEEDED",
"progress": 100
},
{
"id": "application_1540378900448_18833",
"user": "hive",
"name": "insert into SNOW_WORK...metric_definitions')(Stage-13)",
"queue": "Data_Ingestion",
"state": "FINISHED",
"finalStatus": "SUCCEEDED",
"progress": 100
}
]
}
}
JSONOBJ.apps.app.forEach(function(o){console.log(o.id);console.log(o.user);console.log(o.name);})