How to retrieve deep-linked kv pair using jq - json

got to this section of json with jq -r '.update[]' now need to find, operation:restrictions:user:results:username,userKey
"update": {
"operation": "update",
"restrictions": {
"user": {
"results": [
{
"type": "known",
"username": "xxxx",
"userKey": "yyyyy",
"profilePicture": {
"path": "/download/attachments/49710215/user-avatar",
"width": 48,
"height": 48,
"isDefault": false
},
"displayName": "xxxyyyzzz",
"_links": {
"self": "aa"
},
"_expandable": {
"status": ""
}
},

Assuming the input as shown has been tidied a bit to make it valid JSON, running
jq -r '
.update
| [.operation] + (.restrictions.user.results[] | [.username, .userKey])
| #csv
'
against it would yield:
"update","xxxx","yyyyy"

Related

How can I convert nested JSON to CSV on command line?

I have a JSON file who I am trying to convert to CSV using jq, but I've been having a lot of problems, this is the JSON:
{
"nhits": 2,
"parameters": {
"dataset": "real-time-bezettingen-fietsenstallingen-gent",
"rows": 10,
"start": 0,
"facet": [
"facilityname"
],
"format": "json",
"timezone": "UTC"
},
"records": [
{
"datasetid": "real-time-bezettingen-fietsenstallingen-gent",
"recordid": "d471594688a931ba8d81f8d883874a08cee84775",
"fields": {
"id": "48-2",
"freeplaces": 71,
"facilityname": "Braunplein",
"geo_point_2d": [
51.05406845807926,
3.723722319130363
],
"time": "2022-11-10T12:18:01+00:00",
"totalplaces": 116,
"occupiedplaces": 45,
"bezetting": 38
},
"geometry": {
"type": "Point",
"coordinates": [
3.723722319130363,
51.05406845807926
]
},
"record_timestamp": "2022-11-10T12:18:04.838Z"
},
{
"datasetid": "real-time-bezettingen-fietsenstallingen-gent",
"recordid": "d0121748cf31c7e1c02d99712bdf07cb33156689",
"fields": {
"id": "48-1",
"freeplaces": 65,
"facilityname": "Korenmarkt",
"geo_point_2d": [
51.05388288288933,
3.7214177570400473
],
"time": "2022-11-10T12:18:01+00:00",
"totalplaces": 235,
"occupiedplaces": 170,
"bezetting": 72
},
"geometry": {
"type": "Point",
"coordinates": [
3.7214177570400473,
51.05388288288933
]
},
"record_timestamp": "2022-11-10T12:18:04.838Z"
}
],
"facet_groups": [
{
"name": "facilityname",
"facets": [
{
"name": "Braunplein",
"count": 1,
"state": "displayed",
"path": "Braunplein"
},
{
"name": "Korenmarkt",
"count": 1,
"state": "displayed",
"path": "Korenmarkt"
}
]
}
]
}
I only want to have the columns facilityname, time, totalplaces, occupiedplaces and bezetting, I tried converting using the following command:
jq -r '["naam", "tijd", "totaalAantalPlaatsen", "bezettePlaatsen", "bezetting"] , .records[] | (.fields[] | [.facilityname, .time, .totalplaces, .occupiedplaces, .bezetting]) | #csv' data.json
But I get the error:
jq: error (at data.json:0): Cannot index array with string "fields"
Does anyone know what I'm doing wrong?
You just need some parentheses around the .records[] ... part
jq -r '
["name", "tijd", "totaalAantalPlaatsen", "bezettePlaatsen", "bezetting"],
(.records[].fields | [.facilityname, .time, .totalplaces, .occupiedplaces, .bezetting])
| #csv
' file.json

How to extract a paticular key from the json

I am trying to extract values from a json that I obtained using the curl command for api testing. My json looks as below. I need some help extracting the value "20456" from here?
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.076+0000"
},
"links": {},
"data": {
"id": 24843,
"username": "abcd",
"firstName": "abc",
"lastName": "xyz",
"email": "abc#abc.com",
"phone": "",
"title": "",
"location": "",
"licenseType": "FLOATING",
"active": true,
"uid": "u24843",
"type": "users"
}
}
{
"meta": {
"status": "OK",
"timestamp": "2022-09-16T14:45:55.282+0000",
"pageInfo": {
"startIndex": 0,
"resultCount": 1,
"totalResults": 1
}
},
"links": {
"data.createdBy": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.createdBy}"
},
"data.fields.user1": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.fields.user1}"
},
"data.modifiedBy": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.modifiedBy}"
},
"data.fields.projectManager": {
"type": "users",
"href": "https://abc#abc.com/rest/v1/users/{data.fields.projectManager}"
},
"data.parent": {
"type": "projects",
"href": "https://abc#abc.com/rest/v1/projects/{data.parent}"
}
},
"data": [
{
"id": 20456,
"projectKey": "Stratus",
"parent": 20303,
"isFolder": false,
"createdDate": "2018-03-12T23:46:59.000+0000",
"modifiedDate": "2020-04-28T22:14:35.000+0000",
"createdBy": 18994,
"modifiedBy": 18865,
"fields": {
"projectManager": 18373,
"user1": 18628,
"projectKey": "Stratus",
"text1": "",
"name": "Stratus",
"description": "",
"date2": "2019-03-12",
"date1": "2018-03-12"
},
"type": "projects"
}
]
}
I have tried the following, but end up getting error:
▶ cat jqTrial.txt | jq '.data[].id'
jq: error (at <stdin>:21): Cannot index number with string "id"
20456
Also tried this but I get strings outside the object that I am not sure how to remove:
cat jqTrial.txt | jq '.data[]'
Assuming you want the project id not the user id:
jq '
.data
| if type == "object" then . else .[] end
| select(.type == "projects")
| .id
' file.json
There's probably a better way to write the 2nd expression
Indeed, thanks to #pmf
.data | objects // arrays[] | select(.type == "projects").id
Your input consists of two JSON documents; both have a data field on top level. But while the first one is itself an object which has an .id field, the second one is an array with one object item, which also has an .id field.
To retrieve both, you could use the --slurp (or -s) option which wraps both top-level objects into an array, then you can address them separately by index:
jq --slurp '.[0].data.id, .[1].data[].id' jqTrial.txt
24843
20456
Demo

need to extract specific string with JQ

I have a JSON file (see below) and with JQ I need to extract the resourceName value for value = mail#mail1.com
So in my case, the result should be name_1
Any idea to do that ?
Because this does not work :
jq '.connections[] | select(.emailAddresses.value | test("mail#mail1.com"; "i")) | .resourceName' file.json
{
"connections": [
{
"resourceName": "name_1",
"etag": "123456789",
"emailAddresses": [
{
"metadata": {
"primary": true,
"source": {
"type": "CONTACT",
"id": "123456"
}
},
"value": "mail#mail1.com",
}
]
},
{
"resourceName": "name_2",
"etag": "987654321",
"emailAddresses": [
{
"metadata": {
"primary": true,
"source": {
"type": "CONTACT",
"id": "654321"
},
"sourcePrimary": true
},
"value": "mail#mail2.com"
}
]
}
],
"totalPeople": 187,
"totalItems": 187
}
One solution is to store the parent object while selecting on the child array:
jq '.connections[] | . as $parent | .emailAddresses // empty | .[] | select(.value == "mail#mail1.com") | $parent.resourceName' file.json
emailAddresses is an array. Use any if finding one element that matches will suffice.
.connections[] | select(any(.emailAddresses[];.value == "mail#mail1.com")).resourceName

Building json path from JQ using some keyword

I have a deep json. Sometimes, I need to look for the json path for a key containing certain word.
{
"apiVersion": "v1",
"kind": "Pod",
"metadata": {
"creationTimestamp": "2019-03-28T21:09:42Z",
"labels": {
"bu": "finance",
"env": "prod"
},
"name": "auth",
"namespace": "default",
"resourceVersion": "2786",
"selfLink": "/api/v1/namespaces/default/pods/auth",
"uid": "ce73565a-519d-11e9-bcb7-0242ac110009"
},
"spec": {
"containers": [
{
"command": [
"sleep",
"4800"
],
"image": "busybox",
"imagePullPolicy": "Always",
"name": "busybox",
"resources": {},
"terminationMessagePath": "/dev/termination-log",
"terminationMessagePolicy": "File",
"volumeMounts": [
{
"mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
"name": "default-token-dbpcm",
"readOnly": true
}
]
}
],
"dnsPolicy": "ClusterFirst",
"nodeName": "node01",
"priority": 0,
"restartPolicy": "Always",
"schedulerName": "default-scheduler",
"securityContext": {},
"serviceAccount": "default",
"serviceAccountName": "default",
"terminationGracePeriodSeconds": 30,
"tolerations": [
{
"effect": "NoExecute",
"key": "node.kubernetes.io/not-ready",
"operator": "Exists",
"tolerationSeconds": 300
},
{
"effect": "NoExecute",
"key": "node.kubernetes.io/unreachable",
"operator": "Exists",
"tolerationSeconds": 300
}
],
"volumes": [
{
"name": "default-token-dbpcm",
"secret": {
"defaultMode": 420,
"secretName": "default-token-dbpcm"
}
}
]
},
"status": {
"conditions": [
{
"lastProbeTime": null,
"lastTransitionTime": "2019-03-28T21:09:42Z",
"status": "True",
"type": "Initialized"
},
{
"lastProbeTime": null,
"lastTransitionTime": "2019-03-28T21:09:50Z",
"status": "True",
"type": "Ready"
},
{
"lastProbeTime": null,
"lastTransitionTime": null,
"status": "True",
"type": "ContainersReady"
},
{
"lastProbeTime": null,
"lastTransitionTime": "2019-03-28T21:09:42Z",
"status": "True",
"type": "PodScheduled"
}
],
"containerStatuses": [
{
"containerID": "docker://b5be8275555ad70939401d658bb4e504b52215b70618ad43c2d0d02c35e1ae27",
"image": "busybox:latest",
"imageID": "docker-pullable://busybox#sha256:061ca9704a714ee3e8b80523ec720c64f6209ad3f97c0ff7cb9ec7d19f15149f",
"lastState": {},
"name": "busybox",
"ready": true,
"restartCount": 0,
"state": {
"running": {
"startedAt": "2019-03-28T21:09:49Z"
}
}
}
],
"hostIP": "172.17.0.37",
"phase": "Running",
"podIP": "10.32.0.4",
"qosClass": "BestEffort",
"startTime": "2019-03-28T21:09:42Z"
}
}
Currently If i need the podIP, then I do that this way to find the object which has the search keyword and then I build the path
curl myson | jq "[paths]" | grep "IP" --context=10
Is there any nice shortcut to simplify this? What I really need is - all the paths which could have the matching key.
spec.podIP
spec.hostIP
select paths containing keyword in their last element, and use join(".") to generate your desired output.
paths
| select(.[-1] | type == "string" and contains("keyword"))
| join(".")
.[-1] returns the last element of an array,
type == "string" is required because an array index is a number and numbers and strings can't be checked for their containment.
You may want to specify -r option.
As #JeffMercado implicitly suggested you can set the query from command line without touching the script:
jq -r 'paths
| select(.[-1] | type == "string" and contains($q))
| join(".")' file.json --arg q 'keyword'
You can stream the input in, which provides paths and values. You could then inspect the paths and optionally output the values.
$ jq --stream --arg pattern 'IP' '
select(length == 2 and any(.[0][] | strings; test($pattern)))
| "\(.[0] | join(".")): \(.[1])"
' input.json
"status.hostIP: 172.17.0.37"
"status.podIP: 10.32.0.4"
shameless plug
https://github.com/TomConlin/json_to_paths
because sometime you do not even know the component you want to filter for before you see what is there.
json2jqpath.jq file.json
.
.apiVersion
.kind
.metadata
.metadata|.creationTimestamp
.metadata|.labels
.metadata|.labels|.bu
.metadata|.labels|.env
.metadata|.name
.metadata|.namespace
.metadata|.resourceVersion
.metadata|.selfLink
.metadata|.uid
.spec
.spec|.containers
.spec|.containers|.[]
.spec|.containers|.[]|.command
.spec|.containers|.[]|.command|.[]
.spec|.containers|.[]|.image
.spec|.containers|.[]|.imagePullPolicy
.spec|.containers|.[]|.name
.spec|.containers|.[]|.resources
.spec|.containers|.[]|.terminationMessagePath
.spec|.containers|.[]|.terminationMessagePolicy
.spec|.containers|.[]|.volumeMounts
.spec|.containers|.[]|.volumeMounts|.[]
.spec|.containers|.[]|.volumeMounts|.[]|.mountPath
.spec|.containers|.[]|.volumeMounts|.[]|.name
.spec|.containers|.[]|.volumeMounts|.[]|.readOnly
.spec|.dnsPolicy
.spec|.nodeName
.spec|.priority
.spec|.restartPolicy
.spec|.schedulerName
.spec|.securityContext
.spec|.serviceAccount
.spec|.serviceAccountName
.spec|.terminationGracePeriodSeconds
.spec|.tolerations
.spec|.tolerations|.[]
.spec|.tolerations|.[]|.effect
.spec|.tolerations|.[]|.key
.spec|.tolerations|.[]|.operator
.spec|.tolerations|.[]|.tolerationSeconds
.spec|.volumes
.spec|.volumes|.[]
.spec|.volumes|.[]|.name
.spec|.volumes|.[]|.secret
.spec|.volumes|.[]|.secret|.defaultMode
.spec|.volumes|.[]|.secret|.secretName
.status
.status|.conditions
.status|.conditions|.[]
.status|.conditions|.[]|.lastProbeTime
.status|.conditions|.[]|.lastTransitionTime
.status|.conditions|.[]|.status
.status|.conditions|.[]|.type
.status|.containerStatuses
.status|.containerStatuses|.[]
.status|.containerStatuses|.[]|.containerID
.status|.containerStatuses|.[]|.image
.status|.containerStatuses|.[]|.imageID
.status|.containerStatuses|.[]|.lastState
.status|.containerStatuses|.[]|.name
.status|.containerStatuses|.[]|.ready
.status|.containerStatuses|.[]|.restartCount
.status|.containerStatuses|.[]|.state
.status|.containerStatuses|.[]|.state|.running
.status|.containerStatuses|.[]|.state|.running|.startedAt
.status|.hostIP
.status|.phase
.status|.podIP
.status|.qosClass
.status|.startTime

Query parent data (multi-level) based on a child value, on a json file, using jq

I have a ksh script that retrives (using curl) a json file similar to the one bellow:
{
"Type1": {
"dev": {
"server": [
{ "group": "APP1", "name": "DAPP1002", "ip": "10.1.1.1" },
{ "group": "APP2", "name": "DAPP2001", "ip": "10.1.1.2" }
]
},
"qa": {
"server": [
{ "group": "APP1", "name": "QAPP1002", "ip": "10.1.2.1" },
{ "group": "APP2", "name": "QAPP2001", "ip": "10.1.2.2" }
]
},
"prod": {
"proxy": "type1.prod.proxy.mydomain.com",
"server": [
{ "group": "APP1", "name": "PAPP1001", "ip": "10.1.3.1" },
{ "group": "APP1", "name": "PAPP1002", "ip": "10.1.3.2" },
{ "group": "APP2", "name": "PAPP2001", "ip": "10.1.3.3" }
]
}
},
"Type2": {
"dev": {
"server": [
{ "group": "APP8", "name": "DAPP8002", "ip": "10.2.1.1" },
{ "group": "APP9", "name": "DAPP9001", "ip": "10.2.1.2" }
]
},
"qa": {
"server": [
{ "group": "APP8", "name": "QAPP8002", "ip": "10.2.2.1" },
{ "group": "APP9", "name": "QAPP9001", "ip": "10.2.2.2" }
]
},
"prod": {
"proxy": "type2.prod.proxy.mydomain.com",
"server": [
{ "group": "APP8", "name": "PAPP8001", "ip": "10.2.3.1" },
{ "group": "APP9", "name": "PAPP9001", "ip": "10.2.3.2" },
{ "group": "APP9", "name": "PAPP9002", "ip": "10.2.3.3" }
]
}
}
}
... based on a server name (field "name") I would have to collect the following info, to pass to a function:
"Type", "name", "ip", "proxy"
(Note that the "proxy" info is optional)
I am new to json, and I am trying to get this filtered with jq but so far, I am out of lucky.
What I acomplished so far is the following jq query, when searching for "PAPP9001" :
jq '.[] | .[] | select(.server[].name=="PAPP9001") | .proxy as $proxy | .server[] | {proxy: $proxy, name: .name, ip: .ip} | select(.name=="PAPP9001")' curlreturn.json
which returns me:
{
"proxy": "type2.prod.proxy.mydomain.com",
"name": "PAPP9001",
"ip": "10.2.3.2"
}
but:
I could not get the "Type" info, at the top level
Considering the number of pipes and the 2 selects, I doubt that this is the most efficient way.
One way to retrieve the key names programmatically is using to_entries. For example, given your input, this jq filter:
to_entries[]
| .key as $type
| .value[]
| .proxy as $proxy
| .server[]
| select(.name == "PAPP9001")
| { Type: $type, name, ip, proxy: $proxy }
yields:
{
"Type": "Type2",
"name": "PAPP9001",
"ip": "10.2.3.2",
"proxy": "type2.prod.proxy.mydomain.com"
}
Variations
If, for example, you wanted these four fields as a CSV row, then you could replace the last line of the filter above with:
| [$type, .name, .ip, $proxy] | #csv
See the jq manual for how to use string interpolation.