Extract dates from a specific json format with sed - json

I have a JSON file containing sample lines like the following:
[{"tarih":"20130824","tarihView":"24-08-2013"},{"tarih":"20130817","tarihView":"17-08-2013"},{"tarih":"20130810","tarihView":"10-08-2013"},{"tarih":"20130803","tarihView":"03-08-2013"},{"tarih":"20130727","tarihView":"27-07-2013"},{"tarih":"20130720","tarihView":"20-07-2013"},{"tarih":"20130713","tarihView":"13-07-2013"},{"tarih":"20130706","tarihView":"06-07-2013"}]
I need to extract all the dates in the yyyymmdd format into a text file with proper line endings:
20130824
20130817
20130810
20130803
...
20130706
How can I do this using sed or a similar console utility?
Many thanks for your help.

This line works for your example:
grep -Po '\d{8}' file
Or with a BRE (basic regular expression):
grep -o '[0-9]\{8\}' file
It outputs:
20130824
20130817
20130810
20130803
20130727
20130720
20130713
20130706
If you want to extract only the string after "tarih":", you could use:
grep -Po '"tarih":"\K\d{8}' file
It gives the same output.
Note that regex won't do date string validation.
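Since the input is valid JSON, a parser-based variant (a sketch, assuming Python 3 is available) avoids regex pitfalls entirely:

```shell
# Parser-based alternative: read the JSON array and print each "tarih".
# 'file' holds a shortened version of the array from the question.
printf '%s' '[{"tarih":"20130824","tarihView":"24-08-2013"},{"tarih":"20130706","tarihView":"06-07-2013"}]' > file
python3 -c 'import json, sys
for row in json.load(sys.stdin):
    print(row["tarih"])' < file
```

Unlike the grep approach, this still works if other eight-digit numbers appear elsewhere in the data.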

This is very easy in Python:
#!/bin/bash
python3 -c "vals = $(cat jsonfile)
for curVal in vals: print(curVal['tarih'])"
If I paste your example into jsonfile, I get this output:
20130824
20130817
20130810
20130803
20130727
20130720
20130713
20130706
Which is exactly what you need, right?
This works because in Python [] is a list literal and {} is a dictionary literal, so the pasted JSON happens to be a valid Python expression, and it is easy to pull any value out of the resulting structure. Unlike a sed approach, it won't fail if some field in your data contains {, ", or any other character a regex would trip over, and it does not depend on the field position or the number of fields. Be aware, though, that this executes the file contents as Python code, so only use it on trusted input; for untrusted data, parse it with the json module instead.

Related

Calling Imagemagick from awk?

I have a CSV of image details I want to loop over in a bash script. awk seems like an obvious choice to loop over the data.
For each row, I want to take the values, and use them to do Imagemagick stuff. The following isn't working (obviously):
awk -F, '{ magick "source.png" "$1.jpg" }' images.csv
GNU AWK excels at processing structured text data. Although it can invoke external commands via the system() function, it is less handy for that than some other languages; Python, for example, has a standard-library module called subprocess which is more feature-rich.
If you wish to use awk for this task anyway, then I suggest preparing output to be fed into bash. Say you have file.txt with the following content:
file1.jpg,file1.bmp
file2.png,file2.bmp
file3.webp,file3.bmp
and the files listed in the first column exist in the current working directory, you wish to convert them to the files shown in the second column, and you have access to the convert command. Then you might do:
awk 'BEGIN{FS=","}{print "convert \"" $1 "\" \"" $2 "\""}' file.txt | bash
which is equivalent to starting bash and running:
convert "file1.jpg" "file1.bmp"
convert "file2.png" "file2.bmp"
convert "file3.webp" "file3.bmp"
Observe that I have used literal " characters to enclose the filenames, so it should work with names containing spaces. Disclaimer: it might still fail if a name contains a special character, e.g. " itself.
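An alternative sketch (not from the original answer): loop over the CSV directly in bash, so filenames are never re-parsed by a second shell and the quoting problem disappears. Here echo stands in for the real command so the sketch runs anywhere; drop the echo to actually invoke ImageMagick.

```shell
# Read each comma-separated row into two variables; no re-quoting needed
# because "$src" and "$dst" are passed as single arguments.
printf 'file1.jpg,file1.bmp\nfile2.png,file2.bmp\n' > file.txt
while IFS=, read -r src dst; do
    echo convert "$src" "$dst"   # replace 'echo convert' with 'convert'
done < file.txt
```

This handles spaces and embedded quotes in filenames, since the shell never re-interprets the values.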

Extract a value from a single line of JSON

How can I extract the value using a sed or awk command, given that jq or any other JSON parser is not allowed?
{"test":{"components":[{"metric":"complexity","value":"90"}]}}
Output:
90
I tried the command below:
def value=sh (script: 'grep -Po \'(?<="value":")[^"\\]*(?:\\.[^"\\]*)*\' sample.txt')
But I get the following error from the Jenkins script:
grep: missing terminating ] for character class
Though parsing JSON should really be done by a JSON parser, since the OP said the jq tool is not allowed, adding this solution here; it is strictly written and tested with the shown samples only.
awk 'match($0, /"value":"[0-9]+/){print substr($0, RSTART+9, RLENGTH-9)}' Input_file
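A portable sed alternative (a sketch; like the awk answer, it assumes the value is purely numeric and occurs once per line):

```shell
# Capture the digits after "value":" and print only the captured group.
printf '%s\n' '{"test":{"components":[{"metric":"complexity","value":"90"}]}}' > sample.txt
sed -n 's/.*"value":"\([0-9]*\)".*/\1/p' sample.txt
```

The -n plus trailing p means only lines where the substitution succeeded are printed, so non-matching lines are silently skipped.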

Extract key-value from JSON having JSON list?

I need to extract the id of "name":"ConsumeKafka" from the JSON file textuploader.com/1dchq, so that it gives me the result:
"id":"772658d2-8510-3834-856b-6cfd7e8871f6".
I cannot use any third party tool due to restrictions. How can I do this using sed/awk?
$ awk -F '[":,]+' '$2=="id" {id=$3} $2=="name" && $3=="ConsumeKafka" {print id}' file
772658d2-8510-3834-856b-6cfd7e8871f6
772658d2-8510-3834-856b-6cfd7e8871f6
772658d2-8510-3834-856b-6cfd7e8871f6
-F '[":,]+' - Use any number of double-quotes, colons, or commas as Field Separator.
$2=="id" {id=$3} - If the second field is exactly id, save the next field.
$2=="name" && $3=="ConsumeKafka" {print id} - Print the saved id according to fields 2 and 3.
If you only need the first match, do {print id; exit}
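The command can be checked against a minimal, made-up stand-in for the linked file (the real file is not reproduced here):

```shell
# Hypothetical two-line excerpt mimicking the structure of the real file.
cat > file <<'EOF'
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",
"name":"ConsumeKafka",
EOF
awk -F '[":,]+' '$2=="id" {id=$3} $2=="name" && $3=="ConsumeKafka" {print id; exit}' file
```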
You can use the EvaluateJsonPath processor to extract JSON values into flowfile attributes. Use the JSONPath expression $.processors[?(@.component.name=="ConsumeKafka")].component.id to extract the ConsumeKafka id to an attribute on the flowfile.
As an aside, I think the API response you're using is too generic and large to be helpful. You can restrict the information returned in the JSON response by making a more specific API call.
From the text file you can do:
awk '/"id"/ {print $1}' file
"id":"67e21117-891e-3019-8926-7571b3b0317f",
"id":"67e21117-891e-3019-8926-7571b3b0317f",
"id":"67e21117-891e-3019-8926-7571b3b0317f",
"id":"67e21117-891e-3019-8926-7571b3b0317f",
"id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",
"id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",
"id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",
"id":"a1c4b268-3a6f-3b4c-bf12-0ded10f5d767",
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",
"id":"772658d2-8510-3834-856b-6cfd7e8871f6",

Convert bash output to JSON

I am running the following command:
sudo clustat | grep primary | awk 'NF{print $1",""server:"$2 ",""status:"$3}'
Results are:
service:servicename,server:servername,status:started
service:servicename,server:servername,status:started
service:servicename,server:servername,status:started
service:servicename,server:servername,status:started
service:servicename,server:servername,status:started
My desired result is:
{"service":"servicename","server":"servername","status":"started"}
{"service":"servicename","server":"servername","status":"started"}
{"service":"servicename","server":"servername","status":"started"}
{"service":"servicename","server":"servername","status":"started"}
{"service":"servicename","server":"servername","status":"started"}
I can't seem to add the quotation marks without screwing up my output.
Use jq:
sudo clustat | grep primary |
jq -R 'split(" ")|{service:.[0], server:.[1], status:.[2]}'
The input is read as raw text, not JSON. Each line is split on a space (the argument to split may need to be adjusted depending on the actual input). jq ensures that values are properly quoted when constructing the output objects.
Don't do this: instead, use @chepner's answer, which is guaranteed to generate valid JSON as output with all possible inputs (or fail with a nonzero exit status if no JSON representation is possible).
The below is only tested to generate valid JSON with the specific inputs shown in the question, and will quite certainly generate output that is not valid JSON with numerous possible inputs (strings with literal quotes, strings ending in literal backslashes, etc).
sudo clustat |
awk '/primary/ {
print "{\"service\":\"" $1 "\",\"server\":\"" $2 "\",\"status\":\""$3"\"}"
}'
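If jq is unavailable, a small Python filter (a sketch, not from the original answers, assuming python3 is installed) guarantees valid JSON because json.dumps handles all quoting; printf here simulates the clustat lines:

```shell
# Simulated input; in practice: sudo clustat | grep primary | python3 ...
printf 'svc1 node1 started\n' |
python3 -c 'import sys, json
for line in sys.stdin:
    f = line.split()
    print(json.dumps({"service": f[0], "server": f[1], "status": f[2]}))'
```

Strings containing quotes or backslashes are escaped correctly, which the hand-rolled awk version cannot promise.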
For JSON conversion of common shell commands, a good option is jc (JSON Convert)
There is no parser for clustat yet, though.
clustat output does look table-like, so you may be able to use the --asciitable parser with jc.

Alter log file date with the command sed?

I have the following line multiple times in a log file, along with other data.
I'd like to analyze this data by first importing the JSON part into a MongoDB and then running selected queries over it.
DEBUG 2015-04-18 23:13:23,374 [TEXT] (Class.java:19) - {"a":"1", "b":"2", ...}
To strip the data down to just the JSON part, I use:
cat mylog.log | sed "s/DEBUG.*19) - //g" > mylog.json
The main problem here is that I'd like to keep the date and time part as well, as additional JSON values, to get something like this:
{"date": "2015-04-18", "time":"23:13:26,374", "a":"1", "b":"2", ...}
Here is the main question: how can I do this using the Linux console and the command sed? Or with an alternative console command?
Thanks in advance.
Since this appears to be a very rigid format, you could probably use sed like so:
sed 's/DEBUG \([^ ]*\) \([^ ]*\).*19) - {/{ "date": "\1", "time": "\2", /' mylog.log
Where [^ ]* matches a sequence of non-space characters and \(regex\) is a capturing group that makes a matched string available for use in the replacement as \1, \2, and so forth depending on its position. You can see these used in the replacement part.
If it were me, though, I'd use Perl for its ability to split a line into fields and match non-greedily:
perl -ape 's/.*?{/{ "date": "$F[1]", "time": "$F[2]", /' mylog.log
The latter replaces everything up to the first { (because .*? matches non-greedily) with the string you want. $F[1] and $F[2] are the second and third whitespace-delimited fields in the line; -a makes Perl split the line into the @F array this way.
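A quick check of the sed version against one sample line from the question:

```shell
# One log line as in the question; the two capture groups pick up the
# date and time fields after DEBUG.
printf '%s\n' 'DEBUG 2015-04-18 23:13:23,374 [TEXT] (Class.java:19) - {"a":"1", "b":"2"}' > mylog.log
sed 's/DEBUG \([^ ]*\) \([^ ]*\).*19) - {/{ "date": "\1", "time": "\2", /' mylog.log
```

Note that the pattern hard-codes the "19) - " tail, so it only works while the logging call stays on line 19 of Class.java; anchoring on " - {" instead would be more robust.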