How to find value of a key in a json response trace file using shell script

I have a response trace file containing the response below:
#RESPONSE BODY
#--------------------
{"totalItems":1,"member":[{"name":"name","title":"PatchedT","description":"My des_","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product","modified":"2019 Jul 23 10:22:15","created":"2019 Jul 23 10:21:54",}]}
I need to store the value of the "id" key in a variable that I can use later in my script.
Expected result:
echo "$id" should print 70EA96FB313349279EB089BA9DE2EC3B

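With valid JSON (the sample as posted has a trailing comma before the closing }, which a strict parser such as jq rejects), delete the first two lines with sed and parse the rest with jq: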
id=$(sed '1,2d' file | jq -r '.member[]|.id')
Output to variable id:
70EA96FB313349279EB089BA9DE2EC3B

I would strongly suggest using jq to parse JSON.
But given that JSON is mostly compatible with Python dictionary and list literals, this HACK would work too (the print statement needs Python 2):
$ cat resp
#RESPONSE BODY
#--------------------
{"totalItems":1,"member":[{"name":"name","title":"PatchedT","description":"My des_","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product","modified":"2019 Jul 23 10:22:15","created":"2019 Jul 23 10:21:54",}]}
$ awk 'NR==3{print "a="$0;print "print a[\"member\"][0][\"id\"]"}' resp | python
70EA96FB313349279EB089BA9DE2EC3B
$ sed -n '3s|.*|a=\0\nprint a["member"][0]["id"]|p' resp | python
70EA96FB313349279EB089BA9DE2EC3B
Note that this code is
1. a dirty hack, for when your system does not have the right tool, jq
2. susceptible to code injection, because the response is executed as Python code. Hence use it ONLY IF you trust the response received from your service.
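If Python is what you have instead of jq, a less fragile variant is to let the json module parse the body rather than executing it. The following is only a minimal sketch. It assumes Python 3, that every line starting with # is a trace header, and that the body is valid JSON (the trailing comma before } in the sample as posted would be rejected, so it is removed here). The function name extract_ids is made up for illustration.

```python
import json

def extract_ids(trace_text):
    """Return the "id" of every entry in "member", ignoring '#' header lines."""
    # Keep only the lines that are not trace headers such as "#RESPONSE BODY".
    body = "\n".join(
        line for line in trace_text.splitlines()
        if line.strip() and not line.startswith("#")
    )
    data = json.loads(body)  # raises json.JSONDecodeError on invalid JSON
    return [member["id"] for member in data["member"]]

# Sample trace from the question, with the trailing comma removed (assumption).
sample = """#RESPONSE BODY
#--------------------
{"totalItems":1,"member":[{"name":"name","title":"PatchedT","description":"My des_","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product","modified":"2019 Jul 23 10:22:15","created":"2019 Jul 23 10:21:54"}]}
"""

print(extract_ids(sample)[0])
```

Nothing from the response is evaluated as code here, so an untrusted response can at worst raise a parse error. In a shell script the printed value could be captured with command substitution, in the same way as the jq answer does.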

Quick and dirty (don't use eval):
eval $(cat response_file | tail -1 | awk -F , '{ print $5 }' | sed -e 's/"//g' -e 's/:/=/')
It relies on the exact structure you gave and assumes there is no comma in any value before "id".
Or assign it yourself:
id=$(cat response_file | tail -1 | awk -F , '{ print $5 }' | cut -d: -f2 | sed -e 's/"//g')
Note that you can't access the name field with that trick, because it is the first item of the member array and ends up in field $2 glued to the "member":[{ prefix. You can use an even uglier hack to retrieve it though:
name=$(cat response_file | tail -1 | sed -e 's/:\[/,/g' -e 's/}\]//g' | awk -F , '{ print $3 }' | cut -d: -f2 | sed -e 's/"//g')
But if you can, use jq: it is the right tool for this job, rather than ugly hacks like these (but if it works...).

When you can't use jq, you can consider
id=$(grep -Eo "[0-9A-F]{32}" file)
This only works when the file looks the way I expect, so you might need to add extra checks like
id=$(grep "My des_" file | grep -Eo "[0-9A-F]{32}" | head -1)

Related

Data extraction for specific string

I have a long list of JSON data, with repeated blocks of content similar to the following.
Because the original JSON file is too long, I will just share the link here. This is a result generated from a database called RegulomeDB.
Direct link to the JSON file
I would like to extract specific data (eQTLs) from "method": "eQTLs" and "value": "xxxx", and put them into 2 columns (tab delimited) exactly like below.
Note: "value":"xxxx" is extracted right after "method": "eQTLs" is detected.
eQTLs firstResult, secondResult, thirdResult, ...
In this example, the desired output is:
eQTLs EIF3S8, EIF3CL
I've tried using a python script but was unsuccessful.
import json
with open('file.json') as f:
    f_json = json.load(f)
print 'f_json[0]['"method": "eQTLs"'] + "\t" + f_json[0]["value"]
Thank you for your kind help.
Maybe you'll find the JSON-parser xidel useful. It can open urls and can manipulate strings any way you want:
$ xidel -s "https://regulomedb.org/regulome-search/?regions=chr16:28539847-28539848&genome=GRCh37&format=json" \
-e '"eQTLs "||join($json("@graph")()[method="eQTLs"]/value,", ")'
eQTLs EIF3S8, EIF3CL
Or with the XPath/XQuery 3.1 syntax:
-e '"eQTLs "||join($json?"@graph"?*[method="eQTLs"]?value,", ")'
Try this:
cat file.json | grep -iE '"method":\s*"eQTLs"[^}]*' -o | cut -d ',' -f 1,5 | sed -r 's/"|:|method|value//gi' | sed 's/\s*eqtls,\s*//gi' | tr '\n' ',' | sed 's/,$/\n/g' | sed 's/,/, /g' | xargs echo -e 'eQTLs\x09'
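Since the question started from a Python script, here is a hedged sketch of the same extraction with the json module. It assumes Python 3 and that the records sit in a top-level "@graph" array whose entries carry "method" and "value" keys. That layout is inferred from the xidel query above, not checked against the live data, and the inline sample is a made-up stand-in for the downloaded file.

```python
import json

def eqtl_line(doc):
    """Build 'eQTLs<TAB>v1, v2, ...' from the entries whose method is eQTLs."""
    values = [
        entry["value"]
        for entry in doc.get("@graph", [])
        if entry.get("method") == "eQTLs"
    ]
    return "eQTLs\t" + ", ".join(values)

# Hypothetical, heavily trimmed stand-in for the RegulomeDB response.
sample = json.loads("""
{"@graph": [
  {"method": "eQTLs", "value": "EIF3S8"},
  {"method": "ChIP-seq", "value": "POLR2A"},
  {"method": "eQTLs", "value": "EIF3CL"}
]}
""")

print(eqtl_line(sample))
```

With the real file, sample would instead come from json.load on the opened file, as in the original attempt.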

sed unterminated `s' command with jq json string

Piggybacking off of this question I have a command (running in a Docker container) where I am trying to sed to replace an expression with a JSON string generated by jq.
Tiny backstory:
I have a whitelist of env vars in a file tmp.txt:
ENV_VAR_A
ENV_VAR_B
ENV_VAR_C
I use jq using the answer in the previous thread to generate a JSON string like this:
jq -Rn '[inputs | {(.): env[.]}] | add' ./tmp.txt
# GENERATES { "ENV_VAR_A": "a val", "ENV_VAR_B": "a val", "ENV_VAR_C": "a val"}
Amazing! Now I am trying to use sed (as a Docker CMD) to replace something:
# CMD sed -i 's#{{SOME_PATTERN}}#'$( jq -Rn '[inputs | {(.): env[.]}] | add' ./etc/nginx/conf.d/env)'#' ./somefile
But I am getting:
sed: -e expression #1, char 22: unterminated `s' command
So something went wrong with the substitution, but I am not nearly knowledgeable enough in shell to figure out how to fix it. I feel like I have to move some quotes/delimiters around, or maybe pipe my jq output to something to "clean up" the JSON string before I substitute, but I'm not sure what.
Looking for some sed-fu, can anyone help?
This is a bit tricky since the replacement string spans multiple lines. You can try this sed with a process substitution:
sed -i -e '/{{SOME_PATTERN}}/r '<( jq -Rn '[inputs | {(.): env[.]}] | add' /etc/nginx/conf.d/env) -e '//d' somefile
Make sure you're using bash.
With a slightly modified jq command that produces single-line output (-c), you can just do:
sed -i 's/{{SOME_PATTERN}}/'"$(jq -nRc '[inputs | {(.): env[.]}] | add' /etc/nginx/conf.d/env)"'/' somefile
@Adam asked:
what would that look like?
If your jq has the --rawfile option, there should be no need to juggle jq and sed:
< somefile jq -R --rawfile text tmp.txt '
($text
| split("\n")
| map(select(length>0)
| {(.): env[.]}) | add) as $json
| sub("{{SOME_PATTERN}}"; $json|tostring)'
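If the quoting between jq and sed keeps getting in the way, the same substitution can also be sketched in Python, where the replacement is a plain string operation with no delimiter to escape. This is only an illustration under assumptions: Python 3 is available in the image, and the function name render and the one-line template are hypothetical.

```python
import json
import os

def render(template, whitelist, placeholder="{{SOME_PATTERN}}"):
    """Replace placeholder with a compact JSON object of whitelisted env vars."""
    names = [line.strip() for line in whitelist.splitlines() if line.strip()]
    # Unset variables become null, mirroring env[.] in the jq filter.
    values = {name: os.environ.get(name) for name in names}
    payload = json.dumps(values, separators=(",", ":"))
    # str.replace is literal: no delimiter or metacharacter needs escaping.
    return template.replace(placeholder, payload)

# Illustrative values only.
os.environ["ENV_VAR_A"] = "a/val & more"
os.environ.pop("ENV_VAR_B", None)
print(render("window.env = {{SOME_PATTERN}};", "ENV_VAR_A\nENV_VAR_B\n"))
```

The sample value deliberately contains / and &, two characters that have a special meaning in the replacement part of a sed s/// command.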

How to parse JSON in shell script?

I run the curl command $(curl -i -o - --silent -X GET --cert "${CERT}" --key "${KEY}" "$some_url") and save the response in the variable response. ${response} is as shown below
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Content-Length: 34
Connection: keep-alive
Keep-Alive: timeout=5
X-XSS-Protection: 1;
{"status":"running","details":"0"}
I want to parse the JSON {"status":"running","details":"0"} and assign the values of status and details to two different variables so that I can print both. Also, if the status is equal to error, the script should exit. I am doing the following to achieve this:
status1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.status')
details1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.details')
echo "Status: ${status1}"
echo "Details: ${details1}"
if [[ $status1 == 'error' ]]; then
    exit 1
fi
Instead of parsing the JSON twice, I want to do it only once. Hence I want to combine the following lines but still assign the status and details to two separate variables -
status1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.status')
details1=$(echo "${response}" | awk '/^{.*}$/' | jq -r '.details')
First, stop using the -i argument to curl. That takes away the need for awk (or any other pruning of the header after-the-fact).
Second:
{
IFS= read -r -d '' status1
IFS= read -r -d '' details1
} < <(jq -r '.status + "\u0000" + .details + "\u0000"' <<<"$response")
The advantage of using a NUL as a delimiter is that it's the sole character that can't be present in the value of a C-style string (which is how shell variables' values are stored).
You can use a construction like:
read status1 details1 < <(jq -r '.status + " " + .details' <<< "${response}")
You use read to assign the different inputs to two variables (or an array, if you want), and use jq to print the data you need separated by whitespace.
As Benjamin already suggested, only retrieving the json is a better way to go. Poshi's solution is solid.
However, if you're looking for the most compact way to do this, there is no need to save the response in a variable if the only thing you're going to do with it is extract other variables from it once. Just pipe curl directly into jq:
curl "whatever" | jq -r '[.status, .details] | @tsv'
or
curl "whatever" | jq -r '[.status, .details] |join("\t")'
and you'll get your values fielded for you.
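For comparison, the "parse once, read both fields" idea can be sketched in Python as well. This assumes Python 3 and that the response was captured with curl -i, so that the headers are separated from the body by a blank line as HTTP requires. The name parse_response and the shortened header list are illustrative.

```python
import json

def parse_response(raw):
    """Split a `curl -i` style response and return (status, details)."""
    # Headers end at the first blank line; normalize CRLF line endings first.
    _headers, _, body = raw.replace("\r\n", "\n").partition("\n\n")
    data = json.loads(body)  # the JSON is parsed exactly once
    return data["status"], data["details"]

raw = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: application/json; charset=utf-8\r\n"
    "\r\n"
    '{"status":"running","details":"0"}'
)

status, details = parse_response(raw)
print(f"Status: {status}")
print(f"Details: {details}")
if status == "error":
    raise SystemExit(1)  # same effect as `exit 1` in the shell version
```

If curl is run without -i, as suggested above, the header split is unnecessary and the body can be passed to json.loads directly.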

Nested while loop with input from main while in shell script

I have a JSON file and I am trying to parse it using the script below:
#!/bin/ksh
while read rec
do
    while read line
    do
        firstname=`echo $line | sed -n -e 's/^.*\(full-name\)/\1/p' | cut -f3 -d'"'`
        id=`echo $line | sed -n -e 's/^.*\(id\)/\1/p' | cut -f3 -d'"'`
        echo "${firstname}'|'${id}"
    done < `echo $rec | nawk 'gsub("}}}}", "\n")' | sed 's/{"results"//g'`
done < /var/tmp/Cloud_test.txt
My sample file is :
{"results":[{"general-info":{"full-name":"TELOS MANAGEMENT","body":{"party":{"xrefs":{"xref":[{"id":"66666"}]}}}},"_id":"91002551"},{"_id":"222222","body":{"party":{"general-info":{"full-name":"DO REUSE"},"xrefs":{"xref":[{"id":"777777"}]}}}}]}
Expected Result:
TELOS MANAGEMENT|66666
DO REUSE|777777
I am facing a problem passing input to the inner while loop. It is not passed line by line; the complete line is passed at once, so the result is not as expected. Please help me fix it.
As pointed out by @l0b0, this kind of problem is best solved using a JSON-aware tool such as jq. Here, then, is a jq solution.
It must be pointed out, however, that the sample input is strangely irregular, so the requirements are not so clear. If the JSON were more regular, the jq solution would be simpler.
In any case, the following jq filter does produce the result as described:
.results[]
| ..
| objects
| select(has("general-info"))
| [(.["general-info"]|.["full-name"]), (.. | .id? // empty)]
| join("|")
Simplification
The second last line above could be simplified to:
[."general-info"."full-name", (.. | .id? // empty)]
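To make the logic of the filter easier to follow, here is a rough Python 3 equivalent. It is a sketch, not a replacement for the jq filter: walk plays the role of jq's recursive descent operator (..), and the function names are made up for illustration. The sample is the one from the question.

```python
import json

def walk(node):
    """Yield node and every value nested inside it, like jq's `..`."""
    yield node
    if isinstance(node, dict):
        for child in node.values():
            yield from walk(child)
    elif isinstance(node, list):
        for child in node:
            yield from walk(child)

def name_id_lines(doc):
    """Return one 'full-name|id...' line per object that has "general-info"."""
    lines = []
    for result in doc["results"]:
        for obj in walk(result):
            if isinstance(obj, dict) and "general-info" in obj:
                name = obj["general-info"]["full-name"]
                # Collect every "id" found anywhere below this object.
                ids = [n["id"] for n in walk(obj)
                       if isinstance(n, dict) and n.get("id") is not None]
                lines.append("|".join([name, *ids]))
    return lines

sample = json.loads(
    '{"results":[{"general-info":{"full-name":"TELOS MANAGEMENT","body":'
    '{"party":{"xrefs":{"xref":[{"id":"66666"}]}}}},"_id":"91002551"},'
    '{"_id":"222222","body":{"party":{"general-info":{"full-name":"DO REUSE"},'
    '"xrefs":{"xref":[{"id":"777777"}]}}}}]}'
)

for line in name_id_lines(sample):
    print(line)
```

Note that the "_id" keys are not picked up, because only keys named exactly "id" are collected, which matches the expected result in the question.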
This is a more complicated (double) case of this question.
The following works for me:
cat sample.json |
sed -e 's/"full-name"/\n&/g' |
tail -n+2 |
sed -e 's/"full-name":"\([^"]*\).*{"id":"\([^"]*\).*/\1\|\2/'

How to iterate through json in bash script

I have the JSON below, and I need to get only the mail values from it in a bash script:
value={"count":5,"users":[{"username":"asa","name":"asa Tran","mail":"asa@xyz.com"},{"username":"qq","name":"qq Morris","mail":"qq@xyz.com"},{"username":"qwe","name":"qwe Org","mail":"qwe@xyz.com"}]}
The output can be:
mail=asa@xyz.com,qq@xyz.com,qwe@xyz.com
All of the above needs to be done in a bash script (.sh).
I have already tried iterating over it as an array, but to no avail:
for key in "${!value[@]}"
do
#echo "key = $key"
echo "value = ${value[$key]}"
done
I have even tried converting it to an array with:
alias json-decode="php -r
'print_r(json_decode(file_get_contents(\"php://stdin\"),1));'"
value=$(curl --user $credentials -k $endPoint | json-decode)
Still, I was not able to get the specific output.
jq is the tool to iterate through JSON. In your case:
while IFS= read -r user; do
    jq -r '.mail' <<< "$user"
done < <(jq -c '.users[]' users.json)
would give:
asa@xyz.com
qq@xyz.com
qwe@xyz.com
NOTE: I removed "value=" because that is not valid JSON. users.json contains:
{"count":5,"users":[{"username":"asa","name":"asa Tran","mail":"asa@xyz.com"},{"username":"qq","name":"qq Morris","mail":"qq@xyz.com"},{"username":"qwe","name":"qwe Org","mail":"qwe@xyz.com"}]}
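The question asks for a single comma-separated mail= line, not one address per line. As a hedged sketch of that last step, assuming Python 3 is acceptable alongside the bash script, the joining could look like this (mail_line is a made-up name):

```python
import json

def mail_line(doc):
    """Return 'mail=a,b,c' built from the mail field of every user."""
    return "mail=" + ",".join(user["mail"] for user in doc["users"])

# Same document as users.json above.
value = (
    '{"count":5,"users":['
    '{"username":"asa","name":"asa Tran","mail":"asa@xyz.com"},'
    '{"username":"qq","name":"qq Morris","mail":"qq@xyz.com"},'
    '{"username":"qwe","name":"qwe Org","mail":"qwe@xyz.com"}]}'
)

print(mail_line(json.loads(value)))
```

Because the document is parsed as JSON, names or other fields that happen to contain commas or quotes do not affect the result.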
If this is valid JSON and the mail field is the only one containing an @ character, you can do something like this:
echo "$value" | tr '"' '\n' | grep @
It replaces double quotes with newline characters and keeps only the lines containing @. It is not really JSON parsing, but it works.
You can store the result in a bash array
emails=($(echo "$value" | tr '"' '\n' | grep @))
and iterate over it
for email in "${emails[@]}"
do
    echo "$email"
done
You should use json_pp tool (in debian, it is part of the libjson-pp-perl package)
One would use it like this :
cat file.json | json_pp
And get a pretty print for your json.
So in your case, you could do :
#!/bin/bash
MAILS=""
LINES=`cat test.json | json_pp | grep '"mail"' | sed 's/.* : "\(.*\)".*/\1/'`
for LINE in $LINES ; do
MAILS="$LINE,$MAILS"
done
echo $MAILS | sed 's/.$//'
Output :
qwe@xyz.com,qq@xyz.com,asa@xyz.com
Using standard unix toolbox : sed command
cat so.json | sed "s/},/\n/g" | sed 's/.*"mail":"\([^"]*\)".*/\1/'
With R you could do this as follows (the value is single-quoted so that the shell leaves the braces and double quotes alone):
$ value='{"count":5,"users":[{"username":"asa","name":"asa Tran","mail":"asa@xyz.com"},{"username":"qq","name":"qq Morris","mail":"qq@xyz.com"},{"username":"qwe","name":"qwe Org","mail":"qwe@xyz.com"}]}'
$ echo "$value" | R path users | R map path mail
["asa@xyz.com", "qq@xyz.com", "qwe@xyz.com"]