Extract key-value from JSON having JSON list?

I need to extract the id of "name":"ConsumeKafka" from the JSON file textuploader.com/1dchq, so that it gives me the result:
"id":"772658d2-8510-3834-856b-6cfd7e8871f6".
I cannot use any third-party tool due to restrictions. How can I do this using sed/awk?

$ awk -F '[":,]+' '$2=="id" {id=$3} $2=="name" && $3=="ConsumeKafka" {print id}' file
772658d2-8510-3834-856b-6cfd7e8871f6
772658d2-8510-3834-856b-6cfd7e8871f6
772658d2-8510-3834-856b-6cfd7e8871f6
-F '[":,]+' - Use any number of double-quotes, colons, or commas as Field Separator.
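$2=="id" {id=$3} - If the second field is exactly id, save the third field.
$2=="name" && $3=="ConsumeKafka" {print id} - If the second field is name and the third is ConsumeKafka, print the id saved last. This relies on each "id" line appearing before its "name" line.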
If you only need the first match, do {print id; exit}
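For anyone who would rather see the same line-by-line logic outside awk, here is a minimal Python sketch. It assumes, as the awk command does, that the file is pretty-printed with one key per line and that each "id" line precedes the matching "name" line. The function name find_ids and the sample lines are made up for illustration.

```python
import re

# Same separator class as awk's -F '[":,]+'
FIELD_SEP = re.compile(r'[":,]+')

def find_ids(lines, wanted_name):
    """Return the most recently seen id each time a matching name line appears."""
    found = []
    last_id = None
    for line in lines:
        fields = FIELD_SEP.split(line.strip())
        # fields[0] is the (empty) text before the first quote, like awk's $1
        if len(fields) < 3:
            continue
        if fields[1] == "id":
            last_id = fields[2]
        elif fields[1] == "name" and fields[2] == wanted_name:
            found.append(last_id)
    return found

# Hypothetical sample resembling the pretty-printed input
sample = [
    '  "id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",',
    '  "name":"LogAttribute",',
    '  "id":"772658d2-8510-3834-856b-6cfd7e8871f6",',
    '  "name":"ConsumeKafka",',
]
print(find_ids(sample, "ConsumeKafka"))
```

Like the awk one-liner, this is text matching rather than JSON parsing, so it shares the same limits: a different key order or layout would break it.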

You can use the EvaluateJsonPath processor to extract JSON values to flowfile attributes. Use the JSONPath expression $.processors[?(@.component.name=="ConsumeKafka")].component.id to extract the ConsumeKafka id to an attribute on the flowfile.
As an aside, I think the API response you're using is too generic & large to be helpful. You can restrict the information returned in the JSON response to be more specific by making a more specific API call.
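If Python's standard library is allowed under the restrictions, the same selection can be sketched with the json module. This assumes the response has the processors[].component layout that the JSONPath expression above implies; the trimmed sample document and the name ids_for_name are hypothetical.

```python
import json

def ids_for_name(document, wanted_name):
    """Collect component ids whose component name equals wanted_name."""
    return [
        proc["component"]["id"]
        for proc in document.get("processors", [])
        if proc.get("component", {}).get("name") == wanted_name
    ]

# Hypothetical, heavily trimmed response; the real one has many more keys
raw = '''
{"processors": [
  {"component": {"id": "a1c4b268-3a6f-3b4c-bf12-0ded10f5d767", "name": "LogAttribute"}},
  {"component": {"id": "772658d2-8510-3834-856b-6cfd7e8871f6", "name": "ConsumeKafka"}}
]}
'''
print(ids_for_name(json.loads(raw), "ConsumeKafka"))
```

Because this follows the structure instead of line positions, it returns one id per matching processor even if the same id is repeated elsewhere in the response.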

From the text file you can do:
awk '/"id"/ {print $1}' file
"id":"67e21117-891e-3019-8926-7571b3b0317f",
"id":"67e21117-891e-3019-8926-7571b3b0317f",
"id":"67e21117-891e-3019-8926-7571b3b0317f",
"id":"67e21117-891e-3019-8926-7571b3b0317f",
"id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",
"id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",
"id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",
"id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",

Related

Fetch json data using Regex on linux

I know we should use jq for parsing JSON data, but I want to parse it using a regex. I want to fetch the value of a JSON key into a variable in my shell script. As of now, I am using jq for parsing.
So my abc.json is
{"value1":5.0,"value2":2.5,"value3":"2019-10-24T15:26:00.000Z","modifier":[],"value4":{"value41":{"value411":5}}}
Currently, my XYZ.sh has these lines to fetch the data
data1 =$(cat abc.json | jq -r '.value4.value41.value411')
I want data1 to have value of value411. How can I achieve this?
ps- The JSON is mutable. The above JSON is just a part of the JSON file that I want to fetch.
Is your JSON structure fixed (immutable)? If you have to use a regex, consider the following, which relies on the wanted value being the last one on the line:
┌──[root@vms83.liruilongs.github.io]-[~]
└─$cat abc.json | awk -F: '{print $NF}' | grep -o '[[:digit:]]'
5
I think your problem was that you had a space between data1 and =. There can't be a space there.
This works as you want it to (I removed the unnecessary cat):
data1=$(jq -r '.value4.value41.value411' abc.json)
echo "$data1"
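Since the question mentions that the JSON may change, a lookup that tolerates missing keys can be handy. Below is a rough Python sketch of that idea using only the standard library; the helper name dig is invented here, and the sample string is the abc.json content from the question.

```python
import json

raw = '{"value1":5.0,"value2":2.5,"value3":"2019-10-24T15:26:00.000Z","modifier":[],"value4":{"value41":{"value411":5}}}'

def dig(data, *keys, default=None):
    """Follow keys through nested dicts, returning default if any is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data

doc = json.loads(raw)
print(dig(doc, "value4", "value41", "value411"))
```

A real parser keeps working when keys are reordered or new ones are added, which is where a regex over the raw text tends to fail.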

Convert bash output to JSON

I am running the following command:
sudo clustat | grep primary | awk 'NF{print $1",""server:"$2 ",""status:"$3}'
Results are:
service:servicename,server:servername,status:started
service:servicename,server:servername,status:started
service:servicename,server:servername,status:started
service:servicename,server:servername,status:started
service:servicename,server:servername,status:started
My desired result is:
{"service":"servicename","server":"servername","status":"started"}
{"service":"servicename","server":"servername","status":"started"}
{"service":"servicename","server":"servername","status":"started"}
{"service":"servicename","server":"servername","status":"started"}
{"service":"servicename","server":"servername","status":"started"}
I can't seem to add the quotation marks without screwing up my output.
Use jq:
sudo clustat | grep primary |
jq -R 'split(" ")|{service:.[0], server:.[1], status:.[2]}'
The input is read as raw text, not JSON. Each line is split on a space (the argument to split may need to be adjusted depending on the actual input). jq ensures that values are properly quoted when constructing the output objects.
Don't do this. Instead, use @chepner's answer, which is guaranteed to generate valid JSON as output with all possible inputs (or fail with a nonzero exit status if no JSON representation is possible).
The code below is only tested to generate valid JSON with the specific inputs shown in the question, and will quite certainly generate output that is not valid JSON with numerous possible inputs (strings with literal quotes, strings ending in literal backslashes, etc.).
sudo clustat |
awk '/primary/ {
print "{\"service\":\"" $1 "\",\"server\":\"" $2 "\",\"status\":\""$3"\"}"
}'
For JSON conversion of common shell commands, a good option is jc (JSON Convert)
There is no parser for clustat yet though.
clustat output does look table-like, so you may be able to use the --asciitable parser with jc.
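Where neither jq nor jc is available, the same conversion can be sketched in Python, letting json.dumps handle the quoting. The column layout (a "service:NAME" token followed by server and status) is inferred from the output shown in the question and is an assumption; line_to_json is a made-up name.

```python
import json

def line_to_json(line):
    """Turn 'service:NAME SERVER STATUS' into a compact JSON object string."""
    service, server, status = line.split()[:3]
    record = {
        "service": service.split(":", 1)[-1],  # drop the leading "service:" label
        "server": server,
        "status": status,
    }
    # separators=(",", ":") removes the spaces json.dumps adds by default
    return json.dumps(record, separators=(",", ":"))

print(line_to_json("service:servicename servername started"))
```

In a pipeline this could read sys.stdin line by line; values containing quotes or backslashes are escaped by json.dumps rather than by hand.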

Get pair of value from json file by sed

I want to get value from JSON file:
Example:
{"name":"ghprbActualCommitAuthorEmail","value":"test@gmail.com"},{"name":"ghprbPullId","value":"226"},{"name":"ghprbTargetBranch","value":"master"},
My expected result:
I want to get test@gmail.com, 226 and master.
sed is the wrong tool for processing JSON.
Assuming you have a file tmp.json with valid JSON like
[{"name":"ghprbActualCommitAuthorEmail","value":"test@gmail.com"},
{"name":"ghprbPullId","value":"226"},
{"name":"ghprbTargetBranch","value":"master"}]
you can use jq '.[].value' tmp.json.
If the file instead contains
{"name":"ghprbActualCommitAuthorEmail","value":"test@gmail.com"}
{"name":"ghprbPullId","value":"226"}
{"name":"ghprbTargetBranch","value":"master"}
(i.e., just a stream of 3 separate JSON objects), you could use jq '.value' tmp.json, as jq will apply the filter to each object in succession. You can also use jq -s '.[].value' tmp.json, where the -s flag tells jq to read the entire input into an array first. This lets you use the same filter in both cases.
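The "one filter for both layouts" idea can also be sketched without jq. The Python below is a minimal illustration, not a replacement for the jq answer: it accepts either a single array or a run of concatenated objects and yields each "value" field. The function name values and the cut-down sample data are assumptions for the example.

```python
import json

def values(text):
    """Yield every "value" field from an array of objects or from concatenated objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it by hand
        if text[pos].isspace():
            pos += 1
            continue
        parsed, pos = decoder.raw_decode(text, pos)
        items = parsed if isinstance(parsed, list) else [parsed]
        for item in items:
            yield item["value"]

# Cut-down samples in both layouts
as_array = '[{"name":"ghprbPullId","value":"226"},\n{"name":"ghprbTargetBranch","value":"master"}]'
as_stream = '{"name":"ghprbPullId","value":"226"}\n{"name":"ghprbTargetBranch","value":"master"}\n'

print(list(values(as_array)))
print(list(values(as_stream)))
```

Note that the comma-separated, unbracketed form in the question is not valid JSON in either layout, so it would still need to be wrapped in [ ] (minus the trailing comma) first.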

Efficiently get the first record of a JSONL file

Is it possible to efficiently get the first record of a JSONL file without consuming the entire stream / file? One way I have been able to inefficiently do so is the following:
curl -s http://example.org/file.jsonl | jq -s '.[0]'
I realize that head could be used here to extract the first line, but assume that the file may not use a newline as the record separator and may simply be concatenated objects or arrays.
If I'm understanding correctly, the JSONL format is just a stream of JSON objects, which jq handles quite nicely. If all you want is the first item, you can use the input filter to grab it.
I think you could just do this:
$ curl -s http://example.org/file.jsonl | jq -n 'input'
You need the null-input option -n so that jq does not process the input on its own; input then reads just one value from the stream. There is no need to go through the rest of the input stream.
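The same "read one value, then stop" behaviour can be approximated in Python with json.JSONDecoder.raw_decode, reading the stream in small chunks until the first value parses. This is only a sketch: it assumes the top-level records are objects or arrays (a bare number split across chunks could be cut short), and first_record and chunk_size are names chosen for the example.

```python
import io
import json

def first_record(stream, chunk_size=4096):
    """Return the first JSON value in a text stream, reading only as much as needed."""
    decoder = json.JSONDecoder()
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        buffer += chunk
        try:
            # Retry the parse each time more text arrives
            value, _end = decoder.raw_decode(buffer.lstrip())
            return value
        except json.JSONDecodeError:
            if not chunk:  # end of stream and still not decodable
                raise

# Concatenated records with no newline separator, as in the question
demo = io.StringIO('{"a": 1}{"b": 2}[3]')
print(first_record(demo, chunk_size=4))
```

Re-parsing the growing buffer is wasteful for a very large first record, but the rest of the stream is never read, which is the point of the question.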

Extract dates from a specific json format with sed

I have a json file including the sample lines of code below:
[{"tarih":"20130824","tarihView":"24-08-2013"},{"tarih":"20130817","tarihView":"17-08-2013"},{"tarih":"20130810","tarihView":"10-08-2013"},{"tarih":"20130803","tarihView":"03-08-2013"},{"tarih":"20130727","tarihView":"27-07-2013"},{"tarih":"20130720","tarihView":"20-07-2013"},{"tarih":"20130713","tarihView":"13-07-2013"},{"tarih":"20130706","tarihView":"06-07-2013"}]
I need to extract all the dates in the yyyymmdd format as plain text, one per line, with proper line endings:
20130824
20130817
20130810
20130803
...
20130706
How can I do this by using sed or similar console utility?
Many thanks for your help.
This line works for your example (-P needs GNU grep with PCRE support):
grep -Po '\d{8}' file
or with BRE:
grep -o '[0-9]\{8\}' file
it outputs:
20130824
20130817
20130810
20130803
20130727
20130720
20130713
20130706
If you only want the digits that follow "tarih":", you could use:
grep -Po '"tarih":"\K\d{8}' file
It gives the same output.
Note that regex won't do date string validation.
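If validation matters, one option is to let a date parser reject impossible values after the regex has picked out the candidates. The Python sketch below does that with datetime.strptime; the function name valid_dates and the deliberately bad second record in the sample are made up for illustration.

```python
import re
from datetime import datetime

def valid_dates(text):
    """Return the "tarih" values that are real calendar dates in YYYYMMDD form."""
    dates = []
    for candidate in re.findall(r'"tarih":"(\d{8})"', text):
        try:
            datetime.strptime(candidate, "%Y%m%d")
        except ValueError:
            continue  # eight digits, but not a real date (e.g. month 13)
        dates.append(candidate)
    return dates

# Second record is intentionally invalid
sample = '[{"tarih":"20130824","tarihView":"24-08-2013"},{"tarih":"20131340","tarihView":"40-13-2013"}]'
print(valid_dates(sample))
```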
This is very easy in Python (Python 3 here, using the standard json module):
#!/bin/bash
python3 -c "import json
for cur_val in json.load(open('jsonfile')):
    print(cur_val['tarih'])"
If I paste your example into jsonfile I get this output:
20130824
20130817
20130810
20130803
20130727
20130720
20130713
20130706
Which is exactly what you need, right?
This works because a JSON array maps onto a Python list and each JSON object onto a dictionary, so it is easy to get any data out of that structure. It is also more robust than pattern matching, because it won't fail if some field in your data contains {, " or any other character that sed would probably look for. It does not depend on the field position or the number of fields either.