I have a huge file with data in the format below (it's the response from a call I made to one of Twitter's APIs). I want to extract the value of the field "followers_count" from it. Ordinarily, I would do this with jq using a command like: cat | jq -r '.followers_count'
But this contains special characters, so jq cannot handle it. Can someone tell me how to convert it to JSON (e.g. using a shell script), or alternatively how to get the followers_count field without conversion? If this format has a specific name, I would be interested to know it.
Thanks.
SAMPLE LINE IN FILE:
b'[{"id":2361407554,"id_str":"2361407554","name":"hakimo ait","screen_name":"hakimo_ait","location":"","description":"","url":null,"entities":{"description":{"urls":[]}},"protected":false,"followers_count":0,"friends_count":6,"listed_count":0,"created_at":"Sun Feb 23 19:08:04 +0000 2014","favourites_count":0,"utc_offset":null,"time_zone":null,"geo_enabled":false,"verified":false,"statuses_count":1,"lang":"fr","status":{"created_at":"Sun Feb 23 19:09:21 +0000 2014","id":437665498961293312,"id_str":"437665498961293312","text":"c.ronaldo","truncated":false,"entities":{"hashtags":[],"symbols":[],"user_mentions":[],"urls":[]},"source":"\u003ca href=\"https:\/\/mobile.twitter.com\" rel=\"nofollow\"\u003eMobile Web (M2)\u003c\/a\u003e","in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"geo":null,"coordinates":null,"place":null,"contributors":null,"is_quote_status":false,"retweet_count":0,"favorite_count":0,"favorited":false,"retweeted":false,"lang":"es"},"contributors_enabled":false,"is_translator":false,"is_translation_enabled":false,"profile_background_color":"C0DEED","profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png","profile_background_tile":false,"profile_image_url":"http:\/\/abs.twimg.com\/sticky\/default_profile_images\/default_profile_normal.png","profile_image_url_https":"https:\/\/abs.twimg.com\/sticky\/default_profile_images\/default_profile_normal.png","profile_link_color":"1DA1F2","profile_sidebar_border_color":"C0DEED","profile_sidebar_fill_color":"DDEEF6","profile_text_color":"333333","profile_use_background_image":true,"has_extended_profile":false,"default_profile":true,"default_profile_image":true,"following":false,"follow_request_sent":false,"notifications":false,"translator_type":"none"}]'
This is not valid JSON. If you want to grab a certain part of this response, you can dump the result into a file and then iterate over it to pull out the text you want.
Otherwise, once the response is valid JSON, it can easily be parsed with jq; you can also dump the record to a file, convert it to JSON, and then parse it.
There are multiple tools for the text extraction ('grep', 'awk', 'sed'); you can go with any of them.
Remove the b' from the beginning and the ' from the end, and it will become valid JSON.
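For example, a minimal sketch of that idea, assuming the response is saved in a file called response.txt (a hypothetical name):
# Strip the leading b' and the trailing ' so the remainder is valid JSON,
# then take followers_count from the first element of the top-level array.
sed "s/^b'//; s/'$//" response.txt | jq -r '.[0].followers_count'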
Well, I have removed the b' from the beginning and the ' from the end, and now it is valid JSON, so we can easily use jq with it.
Here I am doing it with my own file:
jq -r '.accounts | keys[]' ../saadaccounts.json | while read -r key
do
  # Pull out one account object per key, then read individual fields from it.
  DATA="$(jq ".accounts[$key]" ../saadaccounts.json)"
  FNAME=$(echo "$DATA" | jq -r '.first_name')
  LNAME=$(echo "$DATA" | jq -r '.Last_name')
done
*** YOUR JSON FILE ***
[
{
"id":2361393867,
"id_str":"2361393867",
"name":"graam a7bab",
"screen_name":"bedoo691",
"location":"",
"description":"\u0627\u0633\u062a\u063a\u0641\u0631\u0627\u0644\u0644\u0647 \u0648\u0627\u062a\u0648\u0628 \u0627\u0644\u064a\u0647\u0647 ..!*",
"url":null,
"entities":{
"description":{
"urls":[
]
}
},
"protected":false,
"followers_count":1,
"friends_count":6,
"listed_count":0,
"created_at":"Sun Feb 23 19:03:21 +0000 2014",
"favourites_count":1,
"utc_offset":null,
"time_zone":null,
"geo_enabled":false,
"verified":false,
"statuses_count":7,
"lang":"ar",
"status":{
"created_at":"Tue Mar 04 16:07:44 +0000 2014",
"id":440881284383256576,
"id_str":"440881284383256576",
"text":"#Naif8989",
"truncated":false,
"entities":{
"hashtags":[
],
"symbols":[
],
"user_mentions":[
{
"screen_name":"Naif8989",
"name":"\u200f naif alharbi",
"id":540343286,
"id_str":"540343286",
"indices":[
0,
9
]
}
],
"urls":[
]
},
"source":"\u003ca href=\"http:\/\/twitter.com\/download\/android\" rel=\"nofollow\"\u003eTwitter for Android\u003c\/a\u003e",
"in_reply_to_status_id":437675858485321728,
"in_reply_to_status_id_str":"437675858485321728",
"in_reply_to_user_id":2361393867,
"in_reply_to_user_id_str":"2361393867",
"in_reply_to_screen_name":"bedoo691",
"geo":null,
"coordinates":null,
"place":null,
"contributors":null,
"is_quote_status":false,
"retweet_count":0,
"favorite_count":0,
"favorited":false,
"retweeted":false,
"lang":"und"
},
"contributors_enabled":false,
"is_translator":false,
"is_translation_enabled":false,
"profile_background_color":"C0DEED",
"profile_background_image_url":"http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png",
"profile_background_image_url_https":"https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png",
"profile_background_tile":false,
"profile_image_url":"http:\/\/pbs.twimg.com\/profile_images\/437664693373911040\/ydODsIeh_normal.jpeg",
"profile_image_url_https":"https:\/\/pbs.twimg.com\/profile_images\/437664693373911040\/ydODsIeh_normal.jpeg",
"profile_link_color":"1DA1F2",
"profile_sidebar_border_color":"C0DEED",
"profile_sidebar_fill_color":"DDEEF6",
"profile_text_color":"333333",
"profile_use_background_image":true,
"has_extended_profile":false,
"default_profile":true,
"default_profile_image":false,
"following":false,
"follow_request_sent":false,
"notifications":false,
"translator_type":"none"
}
]
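For the Twitter-style array above, though, extracting the field is a one-liner; as a sketch, assuming the cleaned-up JSON is saved as twitter.json (a hypothetical name):
# The top level is an array of user objects, so iterate over it
# and print each followers_count.
jq -r '.[].followers_count' twitter.json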
All,
This is my first time submitting a Stack Overflow question, so thanks in advance for taking the time to read and consider it. I'm currently using the 'utmpdump' utility to dump Linux authentication log results each hour from a bash script, using the syntax shown below:
dateLastHour=$(date +"%a %b %d %H:" -d '1 hour ago')
dateNow=$(date +"%a %b %d %H:")
utmpdump /var/log/wtmp* | awk "/$dateLastHour/,/$dateNow/"
What I'm now trying to accomplish, and the subject of this question, is this: how can I take these results, delimit them by new line for each authentication log entry, and then convert each authentication event into its own JSON file to be exported to an external syslog collector for additional analysis and long-term storage?
As an example, here's some of the test results I've been using:
[7] [08579] [ts/0] [egecko] [pts/0 ] [10.0.2.6 ] [1.1.1.1 ] [Fri Nov 04 23:40:29 2022 EDT]
[8] [08579] [ ] [ ] [pts/0 ] [ ] [0.0.0.0 ] [Fri Nov 04 23:55:16 2022 EDT]
[2] [00000] [~~ ] [reboot ] [~ ] [3.10.0-1160.80.1.el7.x86_64] [0.0.0.0 ] [Sat Dec 03 12:28:05 2022 EST]
[5] [00811] [tty1] [ ] [tty1 ] [ ] [0.0.0.0 ] [Sat Dec 03 12:28:12 2022 EST]
[6] [00811] [tty1] [LOGIN ] [tty1 ] [ ] [0.0.0.0 ] [Sat Dec 03 12:28:12 2022 EST]
[1] [00051] [~~ ] [runlevel] [~ ] [3.10.0-1160.80.1.el7.x86_64] [0.0.0.0 ] [Sat Dec 03 12:28:58 2022 EST]
[7] [02118] [ts/0] [egecko] [pts/0 ] [1.1.1.1 ] [1.1.1.1 ] [Sat Dec 03 12:51:22 2022 EST]
Any assistance or pointers here is greatly appreciated!
I've been using the following sed command to trim out unnecessary whitespace, and I know that what I probably should do is use IFS to split the results string into lines before using brackets as the delimiter:
utmpResults=$(echo "$utmpResults" | sed 's/ */ /g')
IFS="\n" read -a array <<< "$utmpResults"
echo $array
But when I echo $array it only returns the first line...?
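As a side note on that attempt: read -a only consumes one line of its input, and echo $array prints only element 0 of a bash array, which is why just the first line shows up. A minimal sketch of splitting the output into one array element per line (bash 4+, reusing the variable names from the question):
# Read $utmpResults line by line into the array, one log entry per element.
readarray -t array <<< "$utmpResults"
# Print every element on its own line to verify the split.
printf '%s\n' "${array[@]}"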
With the help of jq (a kind of sed for JSON), it's an easy task:
#!/bin/bash
jq -R -c '
select(length > 0) | # remove empty lines
[match("\\[(.*?)\\]"; "g").captures[].string # find content within square brackets
| sub("^\\s+";"") | sub("\\s+$";"")] # trim content
| { # convert to json object
"type" : .[0],
"pid" : .[1],
"terminal_name_suffix" : .[2],
"user" : .[3],
"tty" : .[4],
"remote_hostname" : .[5],
"remote_host" : .[6],
"datetime" : .[7],
"timestamp" : (.[7] | strptime("%a %b %d %T %Y %Z") | mktime)
}' input.txt
Output
{"type":"7","pid":"08579","terminal_name_suffix":"ts/0","user":"egecko","tty":"pts/0","remote_hostname":"10.0.2.6","remote_host":"1.1.1.1","datetime":"Fri Nov 04 23:40:29 2022 EDT","timestamp":1667605229}
{"type":"8","pid":"08579","terminal_name_suffix":"","user":"","tty":"pts/0","remote_hostname":"","remote_host":"0.0.0.0","datetime":"Fri Nov 04 23:55:16 2022 EDT","timestamp":1667606116}
{"type":"2","pid":"00000","terminal_name_suffix":"~~","user":"reboot","tty":"~","remote_hostname":"3.10.0-1160.80.1.el7.x86_64","remote_host":"0.0.0.0","datetime":"Sat Dec 03 12:28:05 2022 EST","timestamp":1670070485}
{"type":"5","pid":"00811","terminal_name_suffix":"tty1","user":"","tty":"tty1","remote_hostname":"","remote_host":"0.0.0.0","datetime":"Sat Dec 03 12:28:12 2022 EST","timestamp":1670070492}
{"type":"6","pid":"00811","terminal_name_suffix":"tty1","user":"LOGIN","tty":"tty1","remote_hostname":"","remote_host":"0.0.0.0","datetime":"Sat Dec 03 12:28:12 2022 EST","timestamp":1670070492}
{"type":"1","pid":"00051","terminal_name_suffix":"~~","user":"runlevel","tty":"~","remote_hostname":"3.10.0-1160.80.1.el7.x86_64","remote_host":"0.0.0.0","datetime":"Sat Dec 03 12:28:58 2022 EST","timestamp":1670070538}
{"type":"7","pid":"02118","terminal_name_suffix":"ts/0","user":"egecko","tty":"pts/0","remote_hostname":"1.1.1.1","remote_host":"1.1.1.1","datetime":"Sat Dec 03 12:51:22 2022 EST","timestamp":1670071882}
Without the -c option you get pretty-printed, multi-line output instead.
To save each record to its own file, you can do it like this in bash.
I have chosen the timestamp as the file name.
INPUT_AS_JSON_LINES=$(
jq -R -c '
select(length > 0) | # remove empty lines
[match("\\[(.*?)\\]"; "g").captures[].string # find content within square brackets
| sub("^\\s+";"") | sub("\\s+$";"")] # trim content
| { # convert to json object
"type" : .[0],
"pid" : .[1],
"terminal_name_suffix" : .[2],
"user" : .[3],
"tty" : .[4],
"remote_hostname" : .[5],
"remote_host" : .[6],
"datetime" : .[7],
"timestamp" : (.[7] | strptime("%a %b %d %T %Y %Z") | mktime)
}' input.txt
)
while IFS= read -r line
do
  FILENAME="$(jq '.timestamp' <<< "$line").json"
  CONTENT=$(jq '.' <<< "$line") # pretty-print the JSON
  echo "writing file '$FILENAME'"
  echo "$CONTENT" > "$FILENAME"
done <<< "$INPUT_AS_JSON_LINES"
Output
writing file '1667605229.json'
writing file '1667606116.json'
writing file '1670070485.json'
writing file '1670070492.json'
writing file '1670070492.json'
writing file '1670070538.json'
writing file '1670071882.json'
I have a JSON endpoint whose value I can fetch with curl, and a local yml file. I want to get the difference between the two and delete the differing entry, using the id that belongs to its name on the JSON endpoint.
JSON endpoint
[
{
"hosts": [
"server1"
],
"id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_toto_a"
},
{
"hosts": [
"server2"
],
"id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_tata_b"
},
{
"hosts": [
"server3"
],
"id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "V1_titi_c"
}
]
files.yml
---
instance:
toto:
name: "toto"
tata:
name: "tata"
Between the JSON endpoint and the local file, I want to delete the entry for titi by its id, because it is the difference between the two sources.
declare -a arr=(_a _b _c)
ar=$(cat files.yml | grep name | cut -d '"' -f2 | tr "\n" " ")
fileItemArray=($ar)
ARR_PRE=("${fileItemArray[@]/#/V1_}")
for i in "${arr[@]}"; do local_var+=("${ARR_PRE[@]/%/$i}"); done
remote_var=$(curl -sX GET "XXXX" | jq -r '.[].name | @sh' | tr -d \'\")
diff_=$(echo ${local_var[@]} ${remote_var[@]} | tr ' ' '\n' | sort | uniq -u)
output = titi
The code works, but now I want to delete titi by its id dynamically:
curl -X DELETE "XXXX" $id_titi
I am trying to do the deletion from a bash script, but I have no idea how to continue...
Your endpoint is not proper JSON, as it has:
- commas after the .name field but no following field
- no commas between the elements of the top-level array
If this is not just a typo from pasting your example into this question, then you'd need to address it first before proceeding. This is how it should look:
[
{
"hosts": [
"server1"
],
"id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "toto"
},
{
"hosts": [
"server2"
],
"id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "tata"
},
{
"hosts": [
"server3"
],
"id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
"name": "titi"
}
]
If your endpoint is proper JSON, try the following. It extracts the names from your .yml file (just as you do; there are plenty of more efficient and less error-prone ways, but I'm trying to adapt your approach as much as possible), but instead of a Bash array it generates a JSON array using jq, which for Bash is just a string. For your curl output it's basically the same thing: extracting a (JSON) array of names into a Bash string. Note that in both cases I use quotes, <var>="$(…)", to capture strings that may include spaces (although I also use the -c option for jq to compact its output to a single line). For the difference between the two, everything is handed over to jq, as it can easily be fed the JSON arrays as variables, perform the subtraction, and produce output in your preferred format:
fromyml="$(cat files.yml | grep name | cut -d '"' -f2 | jq -Rnc '[inputs]')"
fromcurl="$(curl -sX GET "XXXX" | jq -c 'map(.name)')"
diff="$(jq -nr --argjson fromyml "$fromyml" --argjson fromcurl "$fromcurl" '
$fromcurl - $fromyml | .[]
')"
The Bash variable diff now contains a list of names only present in the curl output ($fromcurl - $fromyml), one per line (if, other than in your example, there happens to be more than one). If the curl output had duplicates, they will still be included (use $fromcurl - $fromyml | unique | .[] to get rid of them):
titi
As you can see, this solution has three calls to jq. I'll leave it to you to further reduce that number as it fits your general workflow (basically, it can be put together into one).
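From there, going on to the DELETE calls could look roughly like this (a sketch only; it assumes the DELETE endpoint accepts the resource id as the final path segment, which your question does not state, and XXXX again stands in for your URL):
# Fetch the endpoint once, then for each name only present there,
# look up its id and delete that resource.
endpoint_json="$(curl -sX GET "XXXX")"
while IFS= read -r name; do
  # Find the id that belongs to this name in the endpoint response.
  id="$(jq -r --arg name "$name" '.[] | select(.name == $name) | .id' <<< "$endpoint_json")"
  # Assumption: the id goes at the end of the DELETE URL.
  curl -X DELETE "XXXX/$id"
done <<< "$diff"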
Getting the output of a program into a variable can be done using read.
perl -M5.010 -MYAML -MJSON::PP -e'
sub get_next_file { local $/; "".<> }
my %filter = map { $_->{name} => 1 } values %{ Load(get_next_file)->{instance} };
say for grep !$filter{$_}, map $_->{name}, @{ decode_json(get_next_file) };
' b.yaml a.json |
while IFS= read -r id; do
curl -X DELETE ..."$id"...
done
I used Perl here because what you had before was no real way of parsing a YAML file. The snippet requires the YAML Perl module to be installed.
I am a newbie to jq and am very excited to use it. What I am trying to achieve is possible with Python, but the intention is to learn jq. I am trying to process the JSON output of a curl command.
Below is the response from my curl command:
{
"results": [{
"name": "smith Jones",
"DOB": "1992-03-26",
"Enrollmentdate": "2013-08-24"
},
{
"name": "Jacob Mathew",
"DOB": "1993-03-26",
"Enrollmentdate": "2014-10-02"
},
{
"name": "Anita Rodrigues",
"DOB": "1994-03-26",
"Enrollmentdate": "2015-02-19"
}
]
}
I was able to get the desired output to some extent, but I am unable to print the key itself in the output. I need this information later as the column header when I export this CSV file (file.csv) into Excel. I am planning to write a bash script to handle the CSV-to-Excel step.
<curl-command> | jq '.results | map(.name), map(.DOB), map(.Enrollmentdate) | @csv' >file.csv
I was able to get the output as below
smith jones, jacob Mathew, Anita Rodrigues
1992-03-26, 1993-03-26, 1994-03-26
2013-08-24, 2014-10-02, 2015-02-19
What i am trying to achieve is as below
name:smith jones, name:jacob Mathew, name:Anita Rodrigues
DOB:1992-03-26, DOB:1993-03-26, DOB:1994-03-26
Enrollmentdate:2013-08-24, Enrollmentdate:2014-10-02, Enrollmentdate:2015-02-19
Since you want the key names as well as their values, then adapting your approach, you could use the following, in conjunction with the -r command-line option, to produce CSV:
.results
| map(to_entries[] | select(.key=="name")),
map(to_entries[] | select(.key=="DOB")),
map(to_entries[] | select(.key=="Enrollmentdate"))
| map("\(.key):\(.value)" )
| @csv
If you want CSV, then stick with the above; if you are confident that quoting the strings
is never necessary, change @csv to join(", "); if you want to remove the quotation
marks only when they are not necessary, you could add a def for a simple filter to do just that.
The repetition of to_entries in the above is a bit of an eye-sore. You might want to think about how to avoid it.
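For instance, one way to drop the repetition is to iterate over the wanted keys instead; a sketch (used with the -r option just like the filter above, and still quoting each cell because of @csv):
.results
| ["name", "DOB", "Enrollmentdate"][] as $key   # one output row per key
| [ .[] | "\($key):\(.[$key])" ]                # build the key:value cells
| @csv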
While working with JSON in Windows is very easy, in Linux I'm having trouble.
I found a way to convert a list into JSON using jq.
For example:
ls | jq -R -s -c 'split("\n")'
output:
["bin","boot","dev","etc","home","lib","lib64","media","mnt","opt","proc","root","run","sbin","srv","sys","tmp","usr","var"]
I'm having trouble converting a table into JSON.
I'm looking for a way to convert a table that I get from a bash command into JSON. I have already searched for many tools, but none of them are generic; you need to adjust the commands for each different table.
Do you know how I can convert a table that I get from a bash command into JSON in a generic way?
Table output, for example:
rpm -qai
output:
Name : gnome-session
Version : 3.8.4
Release : 11.el7
Architecture: x86_64
Install Date: Mon 21 Dec 2015 04:12:41 PM EST
Group : User Interface/Desktops
Size : 1898331
License : GPLv2+
Signature : RSA/SHA256, Thu 03 Jul 2014 09:39:10 PM EDT,
Key ID 24c6a8a7f4a80eb5
Source RPM : gnome-session-3.8.4-11.el7.src.rpm
Build Date : Mon 09 Jun 2014 09:12:26 PM EDT
Build Host : worker1.bsys.centos.org
Relocations : (not relocatable)
Packager : CentOS BuildSystem <http://bugs.centos.org>
Vendor : CentOS
URL : http://www.gnome.org
Summary : GNOME session manager
Description : nome-session manages a GNOME desktop or GDM login session. It starts up the other core GNOME components and handles logout and saving the
session.
Thanks!
There are too many poorly-specified textual formats to create a single tool for what you are asking for, but Unix is well-equipped for the task. Usually, you would create a simple shell or Awk script to convert from one container format to another. Here's one example:
printf '"%s", ' * | sed 's/, $//;s/.*/[ & ]/'
The printf will produce a comma-separated, double-quoted list of wildcard matches. The sed will trim the final comma and add a pair of square brackets around the entire output. The results will be incorrect if a file name contains a double quote, for example, but in the name of simplicity, let's not embellish this any further.
Here's another:
rpm -qai | awk -F ' *: ' 'BEGIN { print "{\n"; }
{ printf "%s\"%s\": \"%s\"", delim, $1, substr($0, 15); delim="\n," }
END { print "\n}"; }'
The --qf/--queryformat output format is probably better, but this shows how you can extract fields from a reasonably free-form, line-oriented format using a simple Awk script. The first field before the colon is extracted as the key, and everything from the 15th column onwards is extracted as the value. Again, we ignore the possible complications (double quotes in the values would need to be escaped, for example) to keep the example simple.
If your needs are serious, you will need to spend more time on creating a robust parser; but then, you will usually want to work with tools which have a well-defined output format in the first place (XML, JSON, etc) and spend as little time as possible on ad-hoc parsers. Unfortunately, there is still a plethora of tools out there which do not support an --xml or --json output option out of the box, but JSON support is fortunately becoming more widely supported.
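jq itself can also handle simple "Field : value" listings, if you prefer to stay in one tool. Here is a rough sketch for a single package (rpm -qi rather than -qai, so only one record is involved; lines without a colon, such as the wrapped Description text, are silently skipped):
rpm -qi gnome-session | jq -Rn '
  [ inputs                                                 # one raw line at a time
    | capture("^(?<key>[^:]+?)\\s*:\\s*(?<value>.*)$")?    # keep only "Field : value" lines
    | { (.key): .value } ]
  | add                                                    # merge the pairs into one object
'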
You can convert a table from a bash command into JSON using jq.
This command will return a detailed report on the system’s disk space usage
df -h
The output is something like this
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk3s1s1 926Gi 20Gi 803Gi 3% 502068 4293294021 0% /
devfs 205Ki 205Ki 0Bi 100% 710 0 100% /dev
/dev/disk3s6 926Gi 7.0Gi 803Gi 1% 7 8418661400 0% /System/Volumes/VM
/dev/disk3s2 926Gi 857Mi 803Gi 1% 1811 8418661400 0% /System/Volumes/Preboot
/dev/disk3s4 926Gi 623Mi 803Gi 1% 267 8418661400 0% /System/Volumes/Update
Now we can convert the output of this command into JSON with jq:
command=$(df -h | tr -s ' ' | jq -c -Rn 'input | split(" ") as $head | inputs | split(" ") | to_entries | map(.key = $head[.key]) | from_entries')
echo "$command" | jq .
{
"Filesystem": "/dev/disk3s1s1",
"Size": "926Gi",
"Used": "20Gi",
"Avail": "803Gi",
"Capacity": "3%",
"iused": "502068",
"ifree": "4293294021",
"%iused": "0%",
"Mounted": "/"
}
{
"Filesystem": "devfs",
"Size": "205Ki",
"Used": "205Ki",
"Avail": "0Bi",
"Capacity": "100%",
"iused": "710",
"ifree": "0",
"%iused": "100%",
"Mounted": "/dev"
}
{
"Filesystem": "/dev/disk3s6",
"Size": "926Gi",
"Used": "7.0Gi",
"Avail": "803Gi",
"Capacity": "1%",
"iused": "7",
"ifree": "8418536520",
"%iused": "0%",
"Mounted": "/System/Volumes/VM"
}
{
"Filesystem": "/dev/disk3s2",
"Size": "926Gi",
"Used": "857Mi",
"Avail": "803Gi",
"Capacity": "1%",
"iused": "1811",
"ifree": "8418536520",
"%iused": "0%",
"Mounted": "/System/Volumes/Preboot"
}
I want to write a line of code which will take the results of:
du -sh -c --time /00-httpdocs/*
and output it in JSON format. The goal is to get three pieces of information for each project file in a site: directory path, date last modified, and disk space usage in human readable format. This command will output that data in tab-delimited format with each entry on a new line in the terminal:
4.6G 2014-08-22 12:26 /00-httpdocs/00
1.1G 2014-08-22 13:32 /00-httpdocs/01
711M 2014-02-14 23:39 /00-httpdocs/02
The goal is to get it to export to a JSON file so it would need to be formatted something like this:
{"httpdocs": [
{
"size": "4.6G",
"modified": "2014-08-22 12:26",
"path": "/00-httpdocs/00-PREVIEW"}
{
"size": "1.1G",
"modified": "2014-08-22 13:32",
"path": "/00-httpdocs/8oclock"}
{
"size": "711M",
"modified": "2014-02-14 23:39",
"path": "/00-httpdocs/8oclock.new"}
]}
(I know that's not quite proper JSON, I just wrote it as an example. Apologies to the pedantic among us.)
I need size to return as an integer (so maybe remove '-sh' and handle conversion later?).
I've tried using awk and sed but I'm a total novice and can't quite get the formatting right.
I've made it about this far:
du -sh -c --time /00-httpdocs/* | awk ' BEGIN {print "\"httpdocs:\": [";} {print "{"$0"},\n";} END {print "]";}'
The goal is to have this trigger twice a day so that we can get the data and use it inside of a JavaScript application.
sed '1 i\
{"httpdocs": [
s/\([^[:space:]]*\)[[:space:]]*\([^[:space:]]*[[:space:]]*[^[:space:]]*\)[[:space:]]*\([^[:space:]]*\)/ {\
"size" : "\1",\
"modified": "\2",\
"path": "\3"}/
$ a\
]}' YourFile
Quick and dirty (POSIX version, so use --posix on GNU sed).
Take the 3 fields and place them (s/../../) into a 'template' using groups (\( ...\) and \1).
Insert the header at the 1st line (1 i\...) and append the footer after the last line ($ a\...).
Note that [:space:] may be replaced by [:blank:].
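If jq is available, a sketch of the same idea is shown below. It assumes GNU du (the -b flag reports plain byte counts, which covers the integer requirement; the -c total line is left out) and that du separates its columns with tabs, so it is worth verifying the output format on your system first:
du -sb --time /00-httpdocs/* | jq -Rn '
  { httpdocs:
      [ inputs                 # one raw line per directory
        | split("\t")          # size, modification time, path
        | { size: (.[0] | tonumber), modified: .[1], path: .[2] } ] }'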