Linux: convert a table from a bash command into JSON

While working with JSON on Windows is very easy, on Linux I'm having trouble.
I found a way to convert a list into JSON using jq.
For example:
ls | jq -R -s -c 'split("\n")'
output:
["bin","boot","dev","etc","home","lib","lib64","media","mnt","opt","proc","root","run","sbin","srv","sys","tmp","usr","var"]
However, I'm having trouble converting a table into JSON.
I'm looking for a way to convert a table produced by a bash command into JSON. I have already searched for many tools, but none of them are generic: you need to adjust the commands for each different table.
Do you know how I can convert a table produced by a bash command into JSON in a generic way?
Table output, for example:
rpm -qai
output:
Name        : gnome-session
Version     : 3.8.4
Release     : 11.el7
Architecture: x86_64
Install Date: Mon 21 Dec 2015 04:12:41 PM EST
Group       : User Interface/Desktops
Size        : 1898331
License     : GPLv2+
Signature   : RSA/SHA256, Thu 03 Jul 2014 09:39:10 PM EDT, Key ID 24c6a8a7f4a80eb5
Source RPM  : gnome-session-3.8.4-11.el7.src.rpm
Build Date  : Mon 09 Jun 2014 09:12:26 PM EDT
Build Host  : worker1.bsys.centos.org
Relocations : (not relocatable)
Packager    : CentOS BuildSystem <http://bugs.centos.org>
Vendor      : CentOS
URL         : http://www.gnome.org
Summary     : GNOME session manager
Description : gnome-session manages a GNOME desktop or GDM login session. It
starts up the other core GNOME components and handles logout and saving the
session.
Thanks!

There are too many poorly-specified textual formats to create a single tool for what you are asking, but Unix is well-equipped for the task. Usually, you would create a simple shell or Awk script to convert from one format to another. Here's one example:
printf '"%s", ' * | sed 's/, $//;s/.*/[ & ]/'
The printf will produce a comma-separated, double-quoted list of wildcard matches. The sed will trim the final comma and add a pair of square brackets around the entire output. The results will be incorrect if a file name contains a double quote, for example, but in the name of simplicity, let's not embellish this any further.
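If jq is available, the quoting problem goes away, because jq escapes each string itself. A minimal sketch of the same idea (it still assumes no newlines in file names):
printf '%s\n' * | jq -nRc '[inputs]'
Here -R reads each line as a raw string, -n with [inputs] collects every line into an array, and -c keeps the output on one line.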
Here's another:
rpm -qai | awk -F ' *: ' 'BEGIN { print "{\n"; }
{ printf "%s\"%s\": \"%s\"", delim, $1, substr($0, 15); delim="\n," }
END { print "\n}"; }'
The --queryformat (-qf) output format is probably better, but this shows how you can extract fields from a reasonably free-form line-oriented format using a simple Awk script. The first field before the colon is extracted as the key, and everything from the 15th column onwards is extracted as the value. Again, we ignore the possible complications (double quotes in the values would need to be escaped, for example) to keep the example simple.
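For comparison, a jq-only sketch of the same extraction, assuming the output of a single package query (e.g. rpm -qi gnome-session); lines without a colon, such as wrapped Description text, are silently dropped:
rpm -qi gnome-session | jq -Rn '[inputs | capture("^(?<key>[^:]+?) *: *(?<value>.*)")?] | from_entries'
The trailing ? discards non-matching lines, and from_entries assembles the captured key/value pairs into a single object.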
If your needs are serious, you will need to spend more time on creating a robust parser; but then, you will usually want to work with tools which have a well-defined output format in the first place (XML, JSON, etc.) and spend as little time as possible on ad-hoc parsers. Unfortunately, there is still a plethora of tools out there which do not support an --xml or --json output option out of the box, but JSON output is fortunately becoming more common.

You can convert a table from a bash command into JSON using jq.
This command will return a detailed report on the system's disk space usage:
df -h
The output looks something like this:
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk3s1s1 926Gi 20Gi 803Gi 3% 502068 4293294021 0% /
devfs 205Ki 205Ki 0Bi 100% 710 0 100% /dev
/dev/disk3s6 926Gi 7.0Gi 803Gi 1% 7 8418661400 0% /System/Volumes/VM
/dev/disk3s2 926Gi 857Mi 803Gi 1% 1811 8418661400 0% /System/Volumes/Preboot
/dev/disk3s4 926Gi 623Mi 803Gi 1% 267 8418661400 0% /System/Volumes/Update
Now we can convert the output of this command into JSON with jq:
json=$(df -h | tr -s ' ' | jq -c -Rn 'input | split(" ") as $head | inputs | split(" ") | to_entries | map(.key = $head[.key]) | from_entries')
echo "$json" | jq .
The first input line becomes the list of keys ($head); each following line is split on spaces and paired with those keys:
{
"Filesystem": "/dev/disk3s1s1",
"Size": "926Gi",
"Used": "20Gi",
"Avail": "803Gi",
"Capacity": "3%",
"iused": "502068",
"ifree": "4293294021",
"%iused": "0%",
"Mounted": "/"
}
{
"Filesystem": "devfs",
"Size": "205Ki",
"Used": "205Ki",
"Avail": "0Bi",
"Capacity": "100%",
"iused": "710",
"ifree": "0",
"%iused": "100%",
"Mounted": "/dev"
}
{
"Filesystem": "/dev/disk3s6",
"Size": "926Gi",
"Used": "7.0Gi",
"Avail": "803Gi",
"Capacity": "1%",
"iused": "7",
"ifree": "8418536520",
"%iused": "0%",
"Mounted": "/System/Volumes/VM"
}
{
"Filesystem": "/dev/disk3s2",
"Size": "926Gi",
"Used": "857Mi",
"Avail": "803Gi",
"Capacity": "1%",
"iused": "1811",
"ifree": "8418536520",
"%iused": "0%",
"Mounted": "/System/Volumes/Preboot"
}
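If a single JSON array is more convenient than a stream of objects (say, for feeding another tool), a small variation of the same approach collects the rows first:
df -h | tr -s ' ' | jq -Rn '[inputs | split(" ")] | .[0] as $head | .[1:] | map(to_entries | map(.key = $head[.key]) | from_entries)'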


How to properly chain multiple jq statements together when processing json in the shell such as with curl?

I am new to jq, so if this is not a jq question or a JSON question, please point me in the right direction. I am not sure of the correct terminology, which is making it hard for me to properly articulate the problem.
I am using curl to pull some JSON that I want to filter by specific key values. Here is some of the sample JSON:
{
"id": "593f468c81aaa30001960e16",
"name": "Name 1",
"channels": [
"593f38398481bc00019632e5"
],
"geofenceProfileId": null
}
{
"id": "58e464585180ac000a748b57",
"name": "Name 2",
"channels": [
"58b480097f04f20007f3cdca",
"580ea26616de060006000001"
],
"geofenceProfileId": null
}
{
"id": "58b4d6db7f04f20007f3cdd2",
"name": "Name 3",
"channels": [
"58b8a25cf9f6e19cf671872f"
],
"geofenceProfileId": "57f53018271c810006000001"
}
When I run the following command:
curl -X GET -H 'authorization: Basic somestring=' "https://myserver/myjson" |
jq '.[] | {id: .id, name: .name, channels: .channels, geofenceProfileId: .geofenceProfileId}' |
jq '.[] | select(.channels == 58b8a25cf9f6e19cf671872f)'
I get the following error:
jq: error: syntax error, unexpected IDENT, expecting ';' or ')' (Unix shell quoting issues?) at , line 1:
.[] | select(.channels == 58b8a25cf9f6e19cf671872f)
jq: 1 compile error
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 351k 0 351k 0 0 1109k 0 --:--:-- --:--:-- --:--:-- 1110k
Is this error because jq pretty-prints the output of the first statement and the second statement expects it to be in one code block? If so, how do I convert it back to non-pretty-printed format, or how can I use jq to run a new filter on the output?
Basically I am trying to parse hundreds of records and filter out all of the records that are in a specific channel number or have a specific geofenceProfileId.
I'd suggest you start with:
jq 'select(.channels | index("58b8a25cf9f6e19cf671872f"))'
In fact, this might even be exactly the filter you want. If you want to remove the "channels" once you've made the selection, you could augment the filter above as follows:
select(.channels | index("58b8a25cf9f6e19cf671872f")) | del(.channels)
The main thing to note is that one can create a pipeline WITHIN a single invocation of jq. So most likely you'll end up with: curl ... | jq ...
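For example, folding everything into one jq invocation (reusing the question's placeholder URL and credentials, and assuming the endpoint returns an array of the objects shown):
curl -s -H 'authorization: Basic somestring=' "https://myserver/myjson" |
jq '.[] | select(.channels | index("58b8a25cf9f6e19cf671872f"))'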
Btw
The jq expression {"id": .id} can be abbreviated to {id}, so instead of:
{id: .id, name: .name, channels: .channels, geofenceProfileId: .geofenceProfileId}
you could write:
{id, name, channels, geofenceProfileId}
Probably not related to your case, but I managed to transform my command
npm pkg get version -ws | jq "select(to_entries | min_by(.value) | .value)"
to
npm pkg get version -ws | jq "to_entries | min_by(.value) | .value"
and the result is the same. Maybe it helps. So the idea is to pipe inside the jq statement rather than between jq invocations.

How do I parse this supposedly JSON format

I have a huge file with data in the below format. (It's the response from an API call I made to one of Twitter's APIs.) I want to extract the value of the field "followers_count" from it. Ordinarily, I would do this with jq, with a command like: cat <file> | jq -r '.followers_count'
But this content contains special characters, so jq cannot handle it. Can someone help by telling me how to convert it to JSON (e.g. using a shell script), or alternatively how to get the followers_count field without conversion? If this format has a specific name, I would be interested to know about it.
Thanks.
SAMPLE LINE IN FILE:
b'[{"id":2361407554,"id_str":"2361407554","name":"hakimo ait","screen_name":"hakimo_ait","location":"","description":"","url":null,"entities":{"description":{"urls":[]}},"protected":false,"followers_count":0,"friends_count":6,"listed_count":0,"created_at":"Sun Feb 23 19:08:04 +0000 2014","favourites_count":0,"utc_offset":null,"time_zone":null,"geo_enabled":false,"verified":false,"statuses_count":1,"lang":"fr","status":{"created_at":"Sun Feb 23 19:09:21 +0000 2014","id":437665498961293312,"id_str":"437665498961293312","text":"c.ronaldo","truncated":false,"entities":{"hashtags":[],"symbols":[],"user_mentions":[],"urls":[]},"source":"\u003ca href=\"https:\/\/mobile.twitter.com\" rel=\"nofollow\"\u003eMobile Web (M2)\u003c\/a\u003e","in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"retweet_count":0,"favorite_count":0,"favorited":false,"retweeted":false,"lang":"es"},"contributors_enabled":false,"is_translator":false,"is_translation_enabled":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_image_url":"http:\/\/abs.twimg.com\/sticky\/default_profile_images\/default_profile_normal.png","profile_image_url_https":"https:\/\/abs.twimg.com\/sticky\/default_profile_images\/default_profile_normal.png","profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"has_extended_profile":false,"default_profile":true,"default_profile_image":true,"following":false,"follow_request_sent":false,"notifications":false,"translator_type":"none"}]'
This is not valid JSON. If you want to grab a certain part of this response, you can dump the result to a file and then iterate over it to extract the text you want.
Otherwise, once the response is valid JSON, it can easily be parsed with jq; you can also dump the record to a file, convert it to JSON, and then parse it.
There are multiple ways: grep, awk, sed... take your pick!
Remove the b' from the beginning and the ' from the end, and it will become valid JSON!
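For example, a minimal sketch (assuming the sample line is saved in a hypothetical file response.txt; the two sed expressions strip the leading b' and the trailing '):
sed "s/^b'//; s/'\$//" response.txt | jq '.[0].followers_count'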
Well, I have removed the b' from the beginning and the ' from the end, and look: it is valid JSON, and now we can easily use jq with it!
Here is how I am doing it with my own file:
jq -r '.accounts | keys[]' ../saadaccounts.json | while read -r key
do
    DATA=$(jq ".accounts[$key]" ../saadaccounts.json)
    FNAME=$(echo "$DATA" | jq -r '.first_name')
    LNAME=$(echo "$DATA" | jq -r '.Last_name')
done
*** YOUR JSON FILE ***
[
{
"id":2361393867,
"id_str":"2361393867",
"name":"graam a7bab",
"screen_name":"bedoo691",
"location":"",
"description":"\u0627\u0633\u062a\u063a\u0641\u0631\u0627\u0644\u0644\u0647 \u0648\u0627\u062a\u0648\u0628 \u0627\u0644\u064a\u0647\u0647 ..!*",
"url":null,
"entities":{
"description":{
"urls":[
]
}
},
"protected":false,
"followers_count":1,
"friends_count":6,
"listed_count":0,
"created_at":"Sun Feb 23 19:03:21 +0000 2014",
"favourites_count":1,
"utc_offset":null,
"time_zone":null,
"geo_enabled":false,
"verified":false,
"statuses_count":7,
"lang":"ar",
"status":{
"created_at":"Tue Mar 04 16:07:44 +0000 2014",
"id":440881284383256576,
"id_str":"440881284383256576",
"text":"#Naif8989",
"truncated":false,
"entities":{
"hashtags":[
],
"symbols":[
],
"user_mentions":[
{
"screen_name":"Naif8989",
"name":"\u200f naif alharbi",
"id":540343286,
"id_str":"540343286",
"indices":[
0,
9
]
}
],
"urls":[
]
},
"source":"\u003ca href=\"http:\/\/twitter.com\/download\/android\" rel=\"nofollow\"\u003eTwitter for Android\u003c\/a\u003e",
"in_reply_to_status_id":437675858485321728,
"in_reply_to_status_id_str":"437675858485321728",
"in_reply_to_user_id":2361393867,
"in_reply_to_user_id_str":"2361393867",
"in_reply_to_screen_name":"bedoo691",
"geo":null,
"coordinates":null,
"place":null,
"contributors":null,
"is_quote_status":false,
"retweet_count":0,
"favorite_count":0,
"favorited":false,
"retweeted":false,
"lang":"und"
},
"contributors_enabled":false,
"is_translator":false,
"is_translation_enabled":false,
"profile_background_color":"C0DEED",
"profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png",
"profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png",
"profile_background_tile":false,
"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/437664693373911040\/ydODsIeh_normal.jpeg",
"profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/437664693373911040\/ydODsIeh_normal.jpeg",
"profile_link_color":"1DA1F2",
"profile_sidebar_border_color":"C0DEED",
"profile_sidebar_fill_color":"DDEEF6",
"profile_text_color":"333333",
"profile_use_background_image":true,
"has_extended_profile":false,
"default_profile":true,
"default_profile_image":false,
"following":false,
"follow_request_sent":false,
"notifications":false,
"translator_type":"none"
}
]

jq: extract value based on different (calculated) value

I am trying to filter down a very large json file (AWS output from aws rds describe-db-snapshots) into just a list of snapshots for deletion.
The final list of snapshots should be older than 60 days. I can discern their age via their SnapshotCreateTime, but I need their DBSnapshotIdentifier value to be able to delete them.
Greatly stripped down for SO purposes, below is the input.json file.
{
"Engine": "postgres",
"SnapshotCreateTime": "2017-08-22T16:35:42.302Z",
"AvailabilityZone": "us-east-1b",
"DBSnapshotIdentifier": "alex2-20170822-0108-bkup",
"AllocatedStorage": 5
}
{
"Engine": "postgres",
"SnapshotCreateTime": "2017-06-02T16:35:42.302Z",
"AvailabilityZone": "us-east-1a",
"DBSnapshotIdentifier": "alex-dbs-16opfr84gq4h9-snapshot-rtsmdbinstance-fr84gq4h9",
"AllocatedStorage": 5
}
{
"Engine": "postgres",
"SnapshotCreateTime": "2017-04-22T16:35:42.302Z",
"AvailabilityZone": "us-east-1a",
"DBSnapshotIdentifier": "alex3-20170422-update",
"AllocatedStorage": 5
}
I know about select but from what I can tell it can't handle the math needed for the time comparison in a one-liner. I figured I'd need to branch out to bash, so I've been messing with the following (clunky) workaround. It's not working, but I figured I'd include it as proof of effort.
THEN=$(date +'%Y%m%d' -d '60 days ago')
while IFS= read -r i
do
    awsDate=$(jq -r '.SnapshotCreateTime' <<< "$i")  # get time
    snapDate=$(date -d "$awsDate" +'%Y%m%d')         # convert to correct format
    if [ "$snapDate" -gt "$THEN" ]                   # compare times
    then
        :                                            # something to copy the ID
    fi
done < input.json
In this case I'd be looking for an output of
alex-dbs-16opfr84gq4h9-snapshot-rtsmdbinstance-fr84gq4h9
alex3-20170422-update
Here is an all-jq solution (i.e. one that does not depend on calling the date command). You might like to try a variation, e.g. passing some form of the date in, using one of the command-line options such as --arg.
jq currently does not quite understand the SnapshotCreateTime format; that's where the call to sub comes in:
def ago(days): now - (days*24*3600);
select(.SnapshotCreateTime | sub("\\.[0-9]*";"") < (ago(60) | todate))
| .DBSnapshotIdentifier
After fixing the sample input so that it is valid JSON, the output would be:
"alex-dbs-16opfr84gq4h9-snapshot-rtsmdbinstance-fr84gq4h9"
"alex3-20170422-update"
To strip the quotation marks, use the -r command-line option.
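For instance, the --arg variation might look like this (a sketch assuming GNU date for the "60 days ago" arithmetic):
cutoff=$(date -u -d '60 days ago' '+%Y-%m-%dT%H:%M:%SZ')
jq -r --arg cutoff "$cutoff" 'select(.SnapshotCreateTime | sub("\\.[0-9]*";"") < $cutoff) | .DBSnapshotIdentifier' input.json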
Here is a solution which defines a filter function which uses select, sub, fromdate and now.
def too_old:
  select( .SnapshotCreateTime
        | sub("[.][0-9]+Z"; "Z")  # remove fractional seconds
        | fromdate                # convert to unix time
        | now - .                 # convert to age in seconds
        | . > (86400 * 60)        # true if older than 60 days in seconds
  )
;
too_old
| .DBSnapshotIdentifier
If you place this in a file filter.jq and run jq with the -r option e.g
jq -M -r -f filter.jq input.json
it will produce the output you requested:
alex-dbs-16opfr84gq4h9-snapshot-rtsmdbinstance-fr84gq4h9
alex3-20170422-update

Find and edit a Json file using bash

I have multiple files in the following format with different categories like:
{
"id": 1,
"flags": ["a", "b", "c"],
"name": "test",
"category": "video",
"notes": ""
}
Now I want to append the string d to the flags of all files whose category is video. So my final file should look like the file below:
{
"id": 1,
"flags": ["a", "b", "c", "d"],
"name": "test",
"category": "video",
"notes": ""
}
Using the following command I am able to find the files of interest, but I am stuck on the editing part, which I am unable to do by hand as there are hundreds of files to edit:
find . -name '*' | xargs grep '"category": "video"' | awk '{print $1}' | sed 's/://g'
You can do this:
find . -type f | xargs grep -l '"category": "video"' | xargs sed -i -e '/flags/ s/]/, "d"]/'
This will find all the filenames which contain a line with "category": "video", and then add the "d" flag.
Details:
find . -type f
=> Will get all the filenames in your directory
xargs grep -l '"category": "video"'
=> Will get those filenames which contain the line "category": "video"
xargs sed -i -e '/flags/ s/]/, "d"]/'
=> Will add the "d" letter to the flags line.
"TWEET!!" ... (yellow flag thown to the ground) ... Time Out!
What you have, here, is "a JSON file." You also have, at your #!shebang command, your choice of(!) full-featured programming languages ... with intimate and thoroughly-knowledgeale support for JSON ... with which you can very-speedily write your command-file.
Even if it is "theoretically possible" to do this using "bash scripts," this is roughly equivalent to "putting a beautiful stone archway over the front-entrance to a supermarket." Therefore, "waste ye no time" in such an utterly-profitless pursuit. Write a script, using a language that "honest-to-goodness knows about(!) JSON," to decode the contents of the file, then manipulate it (as a data-structure), then re-encode it again.
Here is a more appropriate approach using PHP in shell:
FILE=foo2.json php -r '$file = $_SERVER["FILE"]; $arr = json_decode(file_get_contents($file)); if ($arr->category == "video") { $arr->flags[] = "d"; file_put_contents($file,json_encode($arr)); }'
This will load the file, decode it into an object, add "d" to the flags property only when the category is video, then write it back to the file in JSON format.
To run this for every JSON file, you can use the find command, e.g.:
find . -name "*.json" -print0 | while IFS= read -r -d '' file; do
FILE=$file
# run above PHP command in here
done
If the files are in the same format, this command may help (version for a single file):
ex +':/category.*video/norm kkf]i, "d"' -scwq file1.json
or:
ex +':/flags/,/category/s/"c"/"c", "d"/' -scwq file1.json
which is basically using Ex editor (now part of Vim).
Explanation:
+ - executes Vim command (man ex)
:/pattern_or_range/cmd - find pattern, if successful execute another Vim commands (:h :/)
norm kkf]i - executes keystrokes in normal mode
kk - move cursor up twice
f] - find ]
i, "d" - insert , "d"
-s - silent mode
-cwq - executes wq (write & quit)
For multiple files, use find and -execdir or extend above ex command to:
ex +'bufdo!:/category.*video/norm kkf]i, "d"' -scxa *.json
Where bufdo! executes command for every file, and -cxa saves every file. Add -V1 for extra verbose messages.
If the flags line is not 2 lines above the category line, you may perform a backward search instead, or use an approach similar to sps's answer above by replacing ] with , "d"].
See also: How to change previous line when the pattern is found? at Vim.SE.
Using jq:
find . -type f | xargs cat | jq 'select(.category=="video") | .flags |= . + ["d"]'
Explanation:
jq 'select(.category=="video") | .flags |= . + ["d"]'
# select(.category=="video") => filters by category field
# .flags |= . + ["d"] => Updates the flags array
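Note that this prints the transformed documents to stdout rather than editing the files. A minimal sketch of an in-place variant (assuming each file holds a single JSON object; it writes through a temporary file since jq has no in-place option):
find . -name '*.json' | while IFS= read -r f; do
    tmp=$(mktemp)
    jq 'if .category == "video" then .flags += ["d"] else . end' "$f" > "$tmp" && mv "$tmp" "$f"
done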

Monitoring disk usage in CentOS using 'du' to output JSON

I want to write a line of code which will take the results of:
du -sh -c --time /00-httpdocs/*
and output it in JSON format. The goal is to get three pieces of information for each project file in a site: directory path, date last modified, and disk space usage in human readable format. This command will output that data in tab-delimited format with each entry on a new line in the terminal:
4.6G 2014-08-22 12:26 /00-httpdocs/00
1.1G 2014-08-22 13:32 /00-httpdocs/01
711M 2014-02-14 23:39 /00-httpdocs/02
The goal is to get it to export to a JSON file so it would need to be formatted something like this:
{"httpdocs": [
{
"size": "4.6G",
"modified": "2014-08-22 12:26",
"path": "/00-httpdocs/00-PREVIEW"}
{
"size": "1.1G",
"modified": "2014-08-22 13:32",
"path": "/00-httpdocs/8oclock"}
{
"size": "711M",
"modified": "2014-02-14 23:39",
"path": "/00-httpdocs/8oclock.new"}
]}
(I know that's not quite proper JSON, I just wrote it as an example. Apologies to the pedantic among us.)
I need size to return as an integer (so maybe remove '-sh' and handle conversion later?).
I've tried using awk and sed but I'm a total novice and can't quite get the formatting right.
I've made it about this far:
du -sh -c --time /00-httpdocs/* | awk ' BEGIN {print "\"httpdocs:\": [";} {print "{"$0"},\n";} END {print "]";}'
The goal is to have this trigger twice a day so that we can get the data and use it inside of a JavaScript application.
sed '1 i\
{"httpdocs": [
s/\([^[:space:]]*\)[[:space:]]*\([^[:space:]]*[[:space:]][^[:space:]]*\)[[:space:]]*\([^[:space:]]*\)/ {\
"size" : "\1",\
"modified": "\2",\
"path": "\3"}/
$ a\
]}' YourFile
Quick and dirty (POSIX version, so use --posix on GNU sed).
Take the 3 fields and place them (s/../../) into a 'template' using groups (\( ... \) and \1).
Include the header at the 1st line (1 i\...) and append the footer at the last (\$ a\...).
[:space:] may be [:blank:].
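If jq is an option, a similar result can be had without the sed template. Here is a sketch assuming GNU du, whose output fields are tab-separated; -h is dropped so that size stays a plain integer (1K blocks), as the question requests:
du -s --time /00-httpdocs/* | jq -Rn '{httpdocs: [inputs | split("\t") | {size: (.[0] | tonumber), modified: .[1], path: .[2]}]}'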