Find the value of key from JSON - json

I'd like to extract the "id" key from this single line of JSON.
I believe this can be accomplished with grep, but I am not sure of the correct way.
If there is a better way that does not have dependencies, I would be interested.
Here is my example output:
{
"data": {
"name": "test",
"id": "4dCYd4W9i6gHQHvd",
"domains": ["www.test.domain.com", "test.domain.com"],
"serverid": "bbBdbbHF8PajW221",
"ssl": null,
"runtime": "php5.6",
"sysuserid": "4gm4K3lUerbSPfxz",
"datecreated": 1474597357
},
"actionid": "WXVAAHQDCSILMYTV"
}

If you have a grep that can do Perl compatible regular expressions (PCRE):
$ grep -Po '"id": *\K"[^"]*"' infile.json
"4dCYd4W9i6gHQHvd"
-P enables PCRE
-o retains nothing but the match
"id": * matches "id" and an arbitrary amount of spaces
\K throws away everything to its left ("variable size positive look-behind")
"[^"]*" matches two quotes and all the non-quotes between them
If your grep can't do that, you can use
$ grep -o '"id": *"[^"]*"' infile.json | grep -o '"[^"]*"$'
"4dCYd4W9i6gHQHvd"
This uses grep twice. The result of the first command is "id": "4dCYd4W9i6gHQHvd"; the second command removes everything but a pair of quotes and the non-quotes between them, anchored at the end of the string ($).
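For comparison, here is a minimal Python sketch of the same pattern. Python's built-in re module has no \K, so a capture group stands in for it. The sample string is the question's JSON collapsed to one line, and the names ID_PATTERN and grep_id are made up for this illustration. Like the grep versions, this is only text matching and does not understand JSON structure.

```python
import re

# The question's JSON collapsed to a single line (shortened for the example).
text = (
    '{"data": {"name": "test", "id": "4dCYd4W9i6gHQHvd", '
    '"serverid": "bbBdbbHF8PajW221"}, "actionid": "WXVAAHQDCSILMYTV"}'
)

# "id": *    the literal key, including its opening quote, then optional spaces
# "([^"]*)"  a quoted value; the capture group plays the role of \K
ID_PATTERN = re.compile(r'"id": *"([^"]*)"')


def grep_id(line):
    """Return the first "id" value found in line, or None if there is none."""
    match = ID_PATTERN.search(line)
    return match.group(1) if match else None


print(grep_id(text))  # 4dCYd4W9i6gHQHvd
```

Because the pattern starts with the opening quote of "id", keys such as "serverid" and "actionid" are not matched.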
But, as pointed out, you shouldn't use grep for this, but a tool that can parse JSON – for example jq:
$ jq '.data.id' infile.json
"4dCYd4W9i6gHQHvd"
This is just a simple filter for the id key in the data object. To get rid of the double quotes, you can use the -r ("raw output") option:
$ jq -r '.data.id' infile.json
4dCYd4W9i6gHQHvd
jq can also neatly pretty print your JSON:
$ jq . infile.json
{
"data": {
"name": "test",
"id": "4dCYd4W9i6gHQHvd",
"domains": [
"www.test.domain.com",
"test.domain.com"
],
"serverid": "bbBdbbHF8PajW221",
"ssl": null,
"runtime": "php5.6",
"sysuserid": "4gm4K3lUerbSPfxz",
"datecreated": 1474597357
},
"actionid": "WXVAAHQDCSILMYTV"
}

Just pipe your data to jq and select by keys (-r prints the value without quotes):
$ echo '{
"data": {
"name": "test",
"id": "4dCYd4W9i6gHQHvd",
"domains": [
"www.test.domain.com",
"test.domain.com"
],
"serverid": "bbBdbbHF8PajW221",
"ssl": null,
"runtime": "php5.6",
"sysuserid": "4gm4K3lUerbSPfxz",
"datecreated": 1474597357
},
"actionid": "WXVAAHQDCSILMYTV"
}' | jq -r '.data.id'
# 4dCYd4W9i6gHQHvd
Tutorial Here

I have found that the best way is to use Python, as it handles JSON natively and is preinstalled on most systems these days, unlike jq:
$ python -c 'import sys, json; print(json.load(sys.stdin)["data"]["id"])' < infile.json
4dCYd4W9i6gHQHvd
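If the one-liner grows into a script, a small helper that tolerates missing keys may be more comfortable. This is only a sketch: get_path and extract are hypothetical names, and the sample string is a shortened copy of the question's JSON.

```python
import json


def get_path(document, *keys):
    """Walk nested dicts/lists by key or index; return None if a step is missing."""
    current = document
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return None
    return current


def extract(text, *keys):
    """Parse a JSON string and look up the given path in it."""
    return get_path(json.loads(text), *keys)


sample = '{"data": {"name": "test", "id": "4dCYd4W9i6gHQHvd"}, "actionid": "WXVAAHQDCSILMYTV"}'

print(extract(sample, "data", "id"))       # 4dCYd4W9i6gHQHvd
print(extract(sample, "data", "missing"))  # None
```

To read from a pipe instead of a literal string, sys.stdin.read() could be passed as the text argument.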

No python, jq, awk, or sed; just GNU grep:
#!/bin/bash
json='{"data": {"name": "test", "id": "4dCYd4W9i6gHQHvd", "domains": ["www.test.domain.com", "test.domain.com"], "serverid": "bbBdbbHF8PajW221", "ssl": null, "runtime": "php5.6", "sysuserid": "4gm4K3lUerbSPfxz", "datecreated": 1474597357}, "actionid": "WXVAAHQDCSILMYTV"}'
echo "$json" | grep -o '"id": "[^"]*' | grep -o '[^"]*$'
Tested & working here: https://ideone.com/EG7fv7
source: https://brianchildress.co/parse-json-using-grep

$ grep -oP '"id": *"\K[^"]*' infile.json
4dCYd4W9i6gHQHvd
This works for me and prints the value without quotes; hopefully it works for everyone.

Related

Extract a particular field from json output using jq

I'm working on a bash script to extract a specific field from JSON output using jq.
USERNAME=$(echo "$OUTPUT" | jq -r '.[] | .name')
jq always fails with the error parse error: Invalid numeric literal at line 1, column 2.
My REST API returns the output below.
[
{
"url": "#/systemadm/groups/uuid-d6e4e05",
"options": {},
"group_id": 313,
"owner": "abc-123-mec",
"owner_id": "ad1337884",
"id": "c258d7b330",
"name": "abc-group"
},
{
"options": {},
"id": "global%3Regmebers",
"name": "Udata-123"
},
{
"url": "#/systemadm/groups/uuid-38943000",
"options": {},
"group_id": 910,
"owner": "framework-abcc",
"owner_id": "78d4472b738bc",
"id": "38943000057a",
"name": "def-group"
},
........................
............................
......................................
So what is wrong with this jq command for getting "name"?
jq can only process valid JSON.
If the value of OUTPUT is literally "id": "c258d7b330","name": "abc-group", then you could enclose it in curly braces to make it valid JSON. No guarantees though; this depends on the exact format of your input.
OUTPUT='"id": "c258d7b330",
"name": "abc-group"'
USERNAME="$(printf '%s\n' "{$OUTPUT}" | jq -r '.name')"
printf '%s\n' "$USERNAME" # abc-group
If it cannot be converted to valid JSON, maybe a simple solution using grep+cut or awk would suffice?
OUTPUT='"id": "c258d7b330",
"name": "abc-group"'
USERNAME="$(printf '%s\n' "$OUTPUT" | grep '^"name":' | cut -d'"' -f4)"
awk:
printf '%s\n' "$OUTPUT" | awk -F'"' '/^"name":/{print $4}'
Or even use jq to read the input as raw lines (one string per line) and then filter for the line you are interested in:
jq -Rr '(select(startswith("\"name\":")) / "\"")[3]'
All of these options are really fragile, and I recommend fixing your input so that it is actual, valid JSON.
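The "enclose it in curly braces" idea can also be sketched in Python. This assumes, as the answer does, that the value really is a bare list of "key": value members; parse_fragment is a hypothetical helper name.

```python
import json

# The fragment from the answer: members of an object, without the braces.
output = '"id": "c258d7b330",\n"name": "abc-group"'


def parse_fragment(fragment):
    """Wrap a bare member list in braces and parse it as a JSON object."""
    return json.loads("{" + fragment + "}")


record = parse_fragment(output)
print(record["name"])  # abc-group
```

If the fragment is not a clean member list, json.loads raises json.JSONDecodeError, which is at least a clearer failure than a silently wrong grep match.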

Replace a keyword with the content of the file

I have a templatized json file called template.json as below:
{
"subject": "Some subject line",
"content": $CONTENT,
}
I have another file called sample.json with the json content as below:
{
"status": "ACTIVE",
"id": 217,
"type": "TEXT",
"name": "string",
"subject": "string",
"url": "contenttemplates/217",
"content": {
"text": "hello ${user_name}",
"variables": [{
"key": "${user_name}",
"value": null
}]
},
"content_footer": null,
"audit": {
"creator": "1000",
"timestamp": 1548613800000,
"product": "2",
"channel": "10",
"party": null,
"event": {
"type": null,
"type_id": "0",
"txn_id": "0"
},
"client_key": "pk6781gsfr5"
}
}
I want to replace $CONTENT in template.json with the value of the "content" key from the sample.json file. I have tried the sed command below:
sed -i 's/$CONTENT/'$(jq -c '.content' sample.json)'/' template.json
I am getting below error:
sed: -e expression #1, char 15: unterminated `s' command
Can someone please help me to get the right sed command (or any other alternative)?
The jq Cookbook has a section on using jq with templates: https://github.com/stedolan/jq/wiki/Cookbook#using-jq-as-a-template-engine
In the present case, the first technique ("Using jq variables as template variables") matches the already-defined template file (except for the dangling comma), so you could for example write:
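jq -n --argjson CONTENT "$(jq -c .content sample.json)" '
{"subject": "Some subject line", "content": $CONTENT}'
or, with that filter saved in a file such as template.jq:
jq -n --argjson CONTENT "$(jq -c .content sample.json)" -f template.jq
(--argjson passes the value as JSON, so "content" ends up as an object; --arg would pass it as a string.)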
(I'd only use the .json suffix for files that hold JSON or JSON streams.)
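The same substitution can be sketched in Python by building the result as data rather than editing text, which sidesteps both the quoting problem and the dangling comma. The sample below is a shortened copy of sample.json, and render is a hypothetical helper name.

```python
import json

# Shortened copy of sample.json from the question.
sample_json = '''{
  "status": "ACTIVE",
  "id": 217,
  "content": {
    "text": "hello ${user_name}",
    "variables": [{"key": "${user_name}", "value": null}]
  },
  "content_footer": null
}'''


def render(sample_doc):
    """Build the templated document with "content" taken from the sample."""
    return {"subject": "Some subject line", "content": sample_doc["content"]}


rendered = render(json.loads(sample_json))
print(json.dumps(rendered, indent=2))
```

Because the value is inserted as a parsed object, characters such as spaces, quotes, or slashes in the content need no escaping.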
The output from jq contains spaces; you need to quote it to prevent the shell from splitting it into separate words.
sed -i 's/$CONTENT/'"$(jq -c '.content' sample.json)/" template.json
See further When to wrap quotes around a shell variable?
With GNU sed:
sed '/$CONTENT/{s/.*/jq -c ".content" sample.json/e}' template.json
This replaces the entire matching line with your command; the e flag (GNU only) then executes the command and replaces sed's pattern space with its output.

Why is my jq failing on my JSON?

Haven't used jq before but I'm wanting to build a shell script that will get a JSON response and extract just the values. To learn I thought I would try on my blog's WP API but for some reason I'm getting an error of:
jq: error (at :322): Cannot index array with string "slug"
While researching, I read and tested these previous questions:
jq: Cannot index array with string
jq is sed for JSON
JSON array to bash variables using jq
How to use jq in a shell pipeline?
How to extract data from a JSON file
Based on the above reading, I tried this code:
URL="http://foobar.com"
RESPONSE=$(curl -so /dev/null -w "%{http_code}" $URL)
WPAPI="/wp-json/wp/v2"
IDENTIFIER="categories"
if (("$RESPONSE" == 200)); then
curl -s {$URL$WPAPI"/"$IDENTIFIER"/"} | jq '.' >> $IDENTIFIER.json
result=$(jq .slug $IDENTIFIER.json)
echo $result
else
echo "Not returned status 200";
fi
An additional attempt changing the jq after the curl:
curl -s {$URL$WPAPI"/"$IDENTIFIER"/"} | jq '.' | $IDENTIFIER.json
result=(jq -r '.slug' $IDENTIFIER.json)
echo $result
I can pretty-print the response with the Python JSON tool:
result=(curl -s {$URL$WPAPI"/"$IDENTIFIER"/"} | python -m json.tool > $IDENTIFIER.json)
I can save the JSON to a file, but when I use jq I cannot get just the slug. Here is another attempt:
catCalled=$(curl -s {$URL$WPAPI"/"$IDENTIFIER"/"} | python -m json.tool | ./jq -r '.slug')
echo $catCalled
My end goal is to try to use jq in a shell script and build a slug array with jq. What am I doing wrong in my jq and can I use jq on a string without creating a file?
Output from curl after pretty-printing, per comment request:
[
{
"id": 4,
"count": 18,
"description": "",
"link": "http://foobar.com/category/foo/",
"name": "Foo",
"slug": "foo",
"taxonomy": "category",
},
{
"id": 8,
"count": 9,
"description": "",
"link": "http://foobar.com/category/bar/",
"name": "Bar",
"slug": "bar",
"taxonomy": "category",
},
{
"id": 5,
"count": 1,
"description": "",
"link": "http://foobar.com/category/mon/",
"name": "Mon",
"slug": "mon",
"taxonomy": "category",
},
{
"id": 11,
"count": 8,
"description": "",
"link": "http://foobar.com/category/fort/",
"name": "Fort",
"slug": "fort",
"taxonomy": "category",
}
]
Eventually, my goal is to get the slugs into an array like:
catArray=('foo','bar','mon', 'fort')
There are 2 issues here:
slug is not a root level element in your example json. The root level element is an array. If you want to access the slug property of each element of the array, you can do so like this:
jq '.[].slug' $IDENTIFIER.json
Your example json has trailing commas after the last property of each array element. Remove the commas after "taxonomy": "category".
If I take your sample json, remove the errant commas, save it to a plain text file called test.json and run the following command:
jq '.[].slug' test.json
I get the following output:
"foo"
"bar"
"mon"
"fort"
Preprocessing
Unfortunately, the JSON-like data shown as having been produced by curl is not strictly JSON. jq does not have a "relaxed JSON" mode, so in order to use jq, you will have to preprocess the JSON-like data, e.g. using hjson (see http://hjson.org/):
$ hjson -j input.qjson > input.json
jq
With the JSON in input.json:
$ jq -c 'map(.slug)' input.json
["foo","bar","mon","fort"]
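To see why .slug fails at the root while .[].slug and map(.slug) work, here is a small Python sketch of the same data shape. It assumes the trailing commas have already been removed; the entries are shortened copies of the question's categories.

```python
import json

# The root value is an array of objects, as in the curl output.
categories = json.loads('''[
  {"id": 4, "name": "Foo", "slug": "foo"},
  {"id": 8, "name": "Bar", "slug": "bar"},
  {"id": 5, "name": "Mon", "slug": "mon"},
  {"id": 11, "name": "Fort", "slug": "fort"}
]''')

# Rough equivalent of jq '.slug': indexing the array itself with a string fails.
try:
    categories["slug"]
except TypeError as exc:
    print("cannot index the list with a string:", exc)

# Rough equivalent of jq 'map(.slug)': index each element instead.
slugs = [category["slug"] for category in categories]
print(slugs)  # ['foo', 'bar', 'mon', 'fort']
```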
Your string is not JSON. Notice how the last member of each object ends with a comma:
{foo:"bar",baz:9,}
This is legal in JavaScript but illegal in JSON. If you are supposed to be receiving JSON from that endpoint, contact the people behind it and ask them to fix the bug, since the JSON spec does not allow a comma after an object's last member. Until it is fixed, you can patch the data with a small regex. It is a dirty quick fix and probably not very reliable, but running the data through
perl -p -0777 -e 's/\"\,\s*}/\"}/g;' makes it legal JSON.
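The same quick fix can be sketched in Python, with the same caveat that it is a text-level patch and not a parser. The function name strip_trailing_commas and the two-entry sample are made up for this example; the regex mirrors the perl one, so it only handles a comma that follows a closing quote and precedes a closing brace.

```python
import json
import re

# JSON-like input with a trailing comma after the last member of each object.
raw = '''[
  {"id": 4, "slug": "foo", "taxonomy": "category",},
  {"id": 8, "slug": "bar", "taxonomy": "category",}
]'''


def strip_trailing_commas(text):
    """Drop a comma that sits between a closing quote and a closing brace."""
    return re.sub(r'",\s*}', '"}', text)


fixed = strip_trailing_commas(raw)
print([item["slug"] for item in json.loads(fixed)])  # ['foo', 'bar']
```

A string value that itself contains the sequence ",} would be corrupted by this substitution, which is one reason fixing the endpoint is the better long-term answer.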

Getting output using jq

I have the following JSON output:
{
"environment": {
"reg": "abc"
},
"system": {
"svcs": {
"upsvcs": [
{
"name": "monitor",
"tags": [],
"vmnts": [],
"label": "upsvcs",
"credentials": {
"Date": "Feb152018",
"time": "1330"
}
},
{
"name": "application",
"tags": [],
"vmnts": [],
"label": "upsvcs",
"credentials": {
"lastViewed": "2018-02-07"
}
}
]
}
}
and I want to retrieve the Date value (from credentials). I have tried curl xxx | jq -r '. | select (.Date)',
which does not return any value. Can someone please tell me the correct syntax, and explain how to retrieve elements (or point to any articles that do so)?
TIA
The (or at least a) short answer is:
.system.svcs.upsvcs[0].credentials.Date
Without a schema, it is often a bit tricky to get the exact path correctly, so you might want to consider using the jq filter paths. In your case:
$ jq -c paths input.json | grep Date
["system","svcs","upsvcs",0,"credentials","Date"]
You could also use this path (i.e., the above array) directly:
$ jq 'getpath(["system","svcs","upsvcs",0,"credentials","Date"])' input.json
"Feb152018"
and thus you could use a "path-free" query:
$ jq --argjson p "$(jq -c paths input.json | grep -m 1 Date)" 'getpath($p)' input.json
"Feb152018"
Another approach would be just to retrieve all “truthy” .Date values, no matter where they appear:
$ jq -c '.. | .Date? // empty' input.json
"Feb152018"
select
Since you mentioned select, please note that:
$ jq -c '.. | select(.Date?)' input.json
{"Date":"Feb152018","time":"1330"}
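For readers who think more easily in a general-purpose language, the recursive search can be sketched in Python. The function find_key is a hypothetical name, and the document is a shortened copy of the question's JSON. Unlike .Date? // empty, this sketch also yields falsy values such as null or false, and it reports the path to each match.

```python
import json


def find_key(node, key, path=()):
    """Yield (path, value) for every occurrence of key at any depth."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                yield path + (name,), value
            yield from find_key(value, key, path + (name,))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_key(item, key, path + (index,))


doc = json.loads('''{
  "environment": {"reg": "abc"},
  "system": {"svcs": {"upsvcs": [
    {"name": "monitor", "credentials": {"Date": "Feb152018", "time": "1330"}},
    {"name": "application", "credentials": {"lastViewed": "2018-02-07"}}
  ]}}
}''')

for path, value in find_key(doc, "Date"):
    print(list(path), value)
```

The printed path has the same shape as the array produced by jq's paths filter.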
Further information
For further information about jq in general and the topic of retrieval in particular, see the online tutorial, manual, and FAQ:
https://stedolan.github.io/jq/tutorial/
https://stedolan.github.io/jq/manual/v1.5/
https://github.com/stedolan/jq/wiki/FAQ

How to use `jq` to obtain the keys

My json looks like this :
{
"20160522201409-jobsv1-1": {
"vmStateDisplayName": "Ready",
"servers": {
"20160522201409 jobs_v1 1": {
"serverStateDisplayName": "Ready",
"creationDate": "2016-05-22T20:14:22.000+0000",
"state": "READY",
"provisionStatus": "PENDING",
"serverRole": "ROLE",
"serverType": "SERVER",
"serverName": "20160522201409 jobs_v1 1",
"serverId": 2902
}
},
"isAdminNode": true,
"creationDate": "2016-05-22T20:14:23.000+0000",
"totalStorage": 15360,
"shapeId": "ot1",
"state": "READY",
"vmId": 4353,
"hostName": "20160522201409-jobsv1-1",
"label": "20160522201409 jobs_v1 ADMIN_SERVER 1",
"ipAddress": "10.252.159.39",
"publicIpAddress": "10.252.159.39",
"usageType": "ADMIN_SERVER",
"role": "ADMIN_SERVER",
"componentType": "jobs_v1"
}
}
My key keeps changing from time to time. So for example 20160522201409-jobsv1-1 may be something else tomorrow. Also, I may have more than one such entry in the JSON payload.
I want to echo $KEYS and I am trying to do it using jq.
Things I have tried :
| jq .KEYS is the command I use frequently.
Is there a jq command to display all the primary keys in the json?
I only care about the hostname field. And I would like to extract that out. I know how to do it using grep but it is NOT a clean approach.
You can simply use keys:
% jq 'keys' my.json
[
"20160522201409-jobsv1-1"
]
And to get the first:
% jq -r 'keys[0]' my.json
20160522201409-jobsv1-1
-r is for raw output:
--raw-output / -r: With this option, if the filter’s result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. This can be useful for making jq filters talk to non-JSON-based systems.
Source
If you want a known value below an unknown property, e.g. xxx.hostName:
% jq -r '.[].hostName' my.json
20160522201409-jobsv1-1
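As a cross-check, here is a rough Python counterpart of keys and .[].hostName. The document is a shortened copy of the question's JSON with a second, invented entry added to show the multi-entry case; that second entry is an assumption, not data from the question.

```python
import json

doc = json.loads('''{
  "20160522201409-jobsv1-1": {"hostName": "20160522201409-jobsv1-1", "state": "READY"},
  "hypothetical-second-entry": {"hostName": "hypothetical-host-2", "state": "READY"}
}''')

# jq 'keys' returns the top-level keys in sorted order.
top_level_keys = sorted(doc)
print(top_level_keys)

# jq '.[].hostName' visits every top-level value, whatever its key is called.
host_names = [entry["hostName"] for entry in doc.values()]
print(host_names)
```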