Parsing JSON with BusyBox tools

I'm working on a blog theme for Hugo installable on Android (BusyBox via Termux) and plan to create a BusyBox Docker image and copy my theme and the hugo binary to it for use on ARM.
Theme releases are archived and made available on NPM, and the tools available in BusyBox have allowed me to reliably parse the version from the JSON metadata:
meta=$(wget -qO - https://registry.npmjs.org/package/latest)
vers=$(echo "$meta" | egrep -o "\"version\".*[^,]*," | cut -d ',' -f1 | cut -d ':' -f2 | tr -d '" ')
Now I would like to copy the dist value from the meta into a text file for use in Hugo:
"dist": {
"integrity": "sha512-3MH2/UKYPjr+CTC85hWGg/N3GZmSlgBWXzdXHroDfJRnEmcBKkvt1oiadN8gzCCppqCQhwtmengZzg0imm1mtg==",
"shasum": "a159699b1c5fb006a84457fcdf0eb98d72c2eb75",
"tarball": "https://registry.npmjs.org/after-dark/-/after-dark-6.4.1.tgz",
"fileCount": 98,
"unpackedSize": 5338189
},
The above is pretty-printed for clarity; the actual metadata is minified onto a single line.
Is there a way I can reuse the version parsing logic above to also pull the dist field value?

Proper, robust parsing requires a tool like jq, where it could be as simple as jq '.version' ip.txt and jq '.dist' ip.txt.
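A minimal sketch of that approach, assuming the metadata has been saved to ip.txt (ip.txt and dist.txt are just placeholder names, and jq is not part of BusyBox, so this is only for comparison):
jq -r '.version' ip.txt      # -r strips the JSON quotes from the value
jq '.dist' ip.txt > dist.txt # writes the whole dist object to a file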
You could use sed, but use it at your own risk:
$ sed -n 's/.*"version":"\([^"]*\).*/\1/p' ip.txt
6.4.1
$ sed -n 's/.*\("dist":{[^}]*}\).*/\1/p' ip.txt
"dist":{"integrity":....
....}
The -n option disables automatic printing.
The p flag on the s command prints only when the substitution succeeds, which means the output is empty, rather than the entire input line, when something goes wrong.
.*"version":"\([^"]*\).* matches the entire line, capturing the data between the double quotes after the version key; you'll have to adjust the regex if whitespace is allowed or other valid JSON formatting is used.
.*\("dist":{[^}]*}\).* matches the entire line, capturing the data starting at "dist":{ up to the first } after it, so it is not suited to cases where the value itself can contain a }.

Related

How can I format a json file into a bash environment variable?

I'm trying to take the contents of a config file (JSON format), strip out extraneous new lines and spaces to be concise and then assign it to an environment variable before starting my application.
This is where I've got so far:
pwr_config=`echo "console.log(JSON.stringify(JSON.parse(require('fs').readFileSync(process.argv[2], 'utf-8'))));" | node - config.json | xargs -0 printf '%q\n'` npm run start
This pipes a short node.js app into the node runtime taking an argument of the file name and it parses and stringifies the JSON file to validate it and remove any unnecessary whitespace. So far so good.
The result of this is then piped to printf, or at least it would be, but printf apparently doesn't support input in this way, so I'm using xargs to pass it in a way it supports.
I'm using the %q formatter to format the string, escaping any characters that would be a problem as part of a command, but when calling printf through xargs, printf claims it doesn't support %q. I think this is perhaps because there is more than one version of printf, but I'm not exactly sure how to resolve that.
Any help would be appreciated, even if the solution is completely different from what I've started :) Thanks!
Update
Here's the output I get on MacOS:
$ cat config.json | xargs -0 printf %q
printf: illegal format character q
My JSON file looks like this:
{
"hue_host": "192.168.1.2",
"hue_username": "myUsername",
"port": 12000,
"player_group_config": [
{
"name": "Family Room",
"player_uuid": "ATVUID",
"hue_group": "3",
"on_events": ["media.play", "media.resume"],
"off_events": ["media.stop", "media.pause"]
},
{
"name": "Lounge",
"player_uuid": "STVUID",
"hue_group": "1",
"on_events": ["media.play", "media.resume"],
"off_events": ["media.stop", "media.pause"]
}
]
}
Three ways:
Use xargs to pick up bash's printf builtin instead of the printf(1) executable, probably in /usr/bin/printf (thanks to @GordonDavisson):
pwr_config=`echo "console.log(JSON.stringify(JSON.parse(require('fs').readFileSync(process.argv[2], 'utf-8'))));" | node - config.json | xargs -0 bash -c 'printf "%q\n" "$0"'` npm run start
Simpler: you don't have to escape the output of a command if you quote it. In the same way that echo "<|>" is OK in bash, this should also work:
pwr_config="$(echo "console.log(JSON.stringify(JSON.parse(require('fs').readFileSync(process.argv[2], 'utf-8'))));" | node - config.json )" npm run start
This uses the newer $(...) form instead of `...`, and so the result of the command is a single word stored as-is into the pwr_config variable.*
Even simpler: if your npm run start script cares about the whitespace in your JSON, it's fundamentally broken :) . Just do:
pwr_config="$(< config.json)" npm run start
The $(<...) returns the contents of config.json. They are all stored as a single word ("") into pwr_config, newlines and all.* If something breaks, either config.json has an error and should be fixed, or the code you're running has an error and needs to be fixed.
* You actually don't need the "" around $(). E.g., foo=$(echo a b c) and foo="$(echo a b c)" have the same effect. However, I like to include the "" to remind myself that I am specifically asking for all the text to be kept together.
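A quick way to convince yourself the variable survives intact is to run a throwaway child process that prints it back out (the sh -c command here is just for illustration):
pwr_config="$(< config.json)" sh -c 'printf "%s\n" "$pwr_config"'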

not able to store sed output to variable

I am new to bash script.
I am getting some json response and i get only one property from the response. I want to save it to a variable but it is not working
token=$result |sed -n -e 's/^.*access_token":"//p' | cut -d'"' -f1
echo $token
it returns blank line.
I cannot use jq or any third party tools.
Please let me know what I am missing.
Your command should be:
token=$(echo "$result" | sed -n -e 's/^.*access_token":"//p' | cut -d'"' -f1)
You need to use echo to print the contents of the variable over standard output, and you need to use a command substitution $( ) to assign the output of the pipeline to token.
Quoting your variables is always encouraged, to avoid problems with white space and glob characters like *.
As an aside, note that you can probably obtain the output using something like:
token=$(jq -r .access_token <<<"$result")
I know you've said that you can't use jq but it's a standalone binary (no need to install it) and treats your JSON in the correct way, not as arbitrary text.
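For illustration, with a made-up response (the field values here are hypothetical):
result='{"access_token":"abc123","token_type":"bearer"}'
token=$(jq -r .access_token <<<"$result")
echo "$token"    # prints: abc123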
Give this a try:
token="$(sed -E -n -e 's/^.*access_token": ?"//p' <<<"$result" | cut -d'"' -f1)"
Explanation:
token="$( script here )" means that $token is set to the output/result of the script run inside the subshell through a process known as command substituion
-E in sed allows Extended Regular Expressions. We want this because JSON generally contains a space after the : and before the next ". We use the ? after the space to tell sed that the space may or may not be present.
<<<"$result" is a herestring that feeds the data into sed as stdin in place of a file.

How do I get the latest tag value from the github API for a given repository

I can get the latest commit from the GitHub api using :
$ curl 'https://api.github.com/repos/dwkns/test/commits?per_page=1'
However the resulting JSON doesn't contain any reference to the tag I created when I did that commit.
I can get a list of tags using :
$ curl 'https://api.github.com/repos/dwkns/test/tags'
However the resulting JSON, while it contains the names of tags I want, is not in the order in which they were created - there is no way of telling which tag is the latest one.
EDIT : The latest tag created was LatestLatestLatest
My question then is what API call(s) do I need to do to get the name of the latest tag in my repository?
Semantic Versioning Example
NOTE: If you're in a hurry and don't need all the fine details explained, just jump down to "The Solution" and execute the command.
This solution uses curl and grep to match the LATEST semantically versioned release number. An example will be demonstrated using my own Github repo "pi-ap" (a pile of bash scripts which automates config of a Raspberry Pi into a wireless AP).
You can test the example I give you on the CLI and after you're satisfied it works as intended, you can tweak it to your own use-case.
Versioning Format Construction:
Since we're using grep to match the version number, I need to explain its construction: three two-digit fields separated by 2 dots and prefaced by a "v":
vXX.XX.XX
 ^  ^  ^
 |  |  |
 |  |  Patch
 |  Minor
 Major
NOTE: If a field only has a single digit, I'll pad it with a zero to ensure the resulting format is predictable: always three two-digit fields separated by 2 dots.
The Solution:
Github Username: F1Linux
Github Repo Name: pi-ap (NOTE: exclude the ".git" suffix)
curl -s 'https://github.com/f1linux/pi-ap/tags/'|grep -Eo "$Version v[0-9]{1,2}\.[0-9]{1,2}\.[0-9]{1,2}"|sort -r|head -n1
Validate the Result Correct:
In your browser, go to:
https://github.com/f1linux/pi-ap/tags
And validate that the latest tag was returned from the command.
The above is fairly extensible for most use-cases. Just need to change the user & repo names and remove/replace the "v" if you don't use this convention in tagging your repos.
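A hedged generalization of that command, with the username and repo pulled out into variables (the values here are just placeholders):
user='f1linux'   # placeholder: your GitHub username
repo='pi-ap'     # placeholder: your repo name (without the ".git" suffix)
curl -s "https://github.com/$user/$repo/tags/" | grep -Eo 'v[0-9]{1,2}\.[0-9]{1,2}\.[0-9]{1,2}' | sort -r | head -n1
Note that the reverse lexical sort only picks the highest version reliably because the fields are zero-padded, as described above.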
Using jq in combination with curl you can have a pretty straightforward command:
curl -s \
-H "Accept: application/vnd.github.v3+json" \
https://api.github.com/repos/dwkns/test/tags \
| jq -r '.[0].name'
Output (as of today):
v56
Explanation on jq command:
-r is for "raw", avoid json quotes on jq's output
.[0] selects the first (latest) tag object in json array we got from github
.name selects the name property in this latest json object
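If you want the tag in a shell variable for later use, the same command can be wrapped in a command substitution (the variable name here is arbitrary):
latest_tag=$(curl -s -H "Accept: application/vnd.github.v3+json" https://api.github.com/repos/dwkns/test/tags | jq -r '.[0].name')
echo "$latest_tag"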
#!/bin/sh
curl -s https://github.com/dwkns/test/tags |
awk '/tag-name/{print $3;exit}' FS='[<>]'
Or
#!/bin/awk -f
BEGIN {
FS = "[<>]"
while ("curl -s https://github.com/dwkns/test/tags" | getline) {
if(/tag-name/){print $3;exit}
}
}

Retrieve Dropbox personal path from ~/.dropbox/info.json in Bash Script

In Dropbox since version 2.8, the path to your dropbox folder can be found in the file ~/.dropbox/info.json
In my case, I'm seeking my personal path, not the business path, which is not in the typical Dropbox location ~/Dropbox but on a separate volume.
My ~/.dropbox/info.json:
{"business": {"path": "/Users/ChristopherA/ReOrient Media", "host": 123456789}, "personal": {"path": "/Volumes/Cloud/Dropbox", "host": 123456789}}
I have tried using grep/awk, but can't quite reliably get just the path /Volumes/Cloud/Dropbox, as there may be only one first level entry (i.e. no business dropbox), and the order might be different for other users (i.e. I can't always rely on the last path being the personal one).
Some people suggested using jsawk, but I wasn't able to figure out how to make it work, and I'd prefer no dependencies as this script will be used on multiple computers.
Ideas?
-- Christopher Allen
A solution using a json-specific tool would be much more robust.
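For comparison only, since you'd rather avoid dependencies (the solutions below don't need any), the jq equivalent would be a one-liner:
jq -r '.personal.path' ~/.dropbox/info.json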
Using sed
Using just sed, and assuming that your json data is in a file called json, try:
$ sed -n 's/.*"personal":[^}]*"path": "\([^"]*\)",.*/\1\n/p' json
/Volumes/Cloud/Dropbox
Your sample json data was all on a single line. If that is not the case in general, then it would be better to remove the newlines before passing it to sed:
$ tr '\n' ' ' <json | sed -n 's/.*"personal":[^}]*"path": "\([^"]*\)",.*/\1\n/p'
/Volumes/Cloud/Dropbox
Using awk
$ awk -F'"' -v RS='"personal"[^}]*path":' 'NR==2 {print $2}' json
/Volumes/Cloud/Dropbox
The above uses a regular expression for the record separator. GNU awk supports this. Others may or may not.
Mac OSX Version
From Christopher Allen, the following works on a Mac:
tr '\n' ' ' <json | sed -nE 's/.*"personal":[^}]*"path": "([^"]*)",.*/\1/p'
Using bash
#!/bin/bash
data=$(cat json)                # read the whole file into a variable
data=${data#*\"personal\":}     # strip everything up to and including "personal":
data=${data#*path\":}           # strip up to and including the following path":
data=${data#*\"}                # strip up to the opening quote of the value
data=${data%%\"*}               # strip the closing quote and everything after it
echo "$data"

Shell: Replacing each New Line "\n" character with "\\n"

I'm inserting a git diff of changed files into a JSON object to send using a curl request.
The problem is it doesn't like the newline characters being inserted into the JSON, but I'm not sure how to get around that. The translate tool (tr) didn't work, and this perl solution I'm using is close but just replaces them with spaces:
changedfiles=$(git diff --name-only $3..$4 | perl -p -e 's/\n/ /')
and changing it to this didn't help:
changedfiles=$(git diff --name-only $3..$4 | perl -p -e 's/\n/\\n/')
Can anyone point me in the right direction? It doesn't need to use perl, it just needs to work
(...being simple would be nice too)
Instead of trying to do ad-hoc escaping for characters that your immediate testing finds problematic, how about using an actual JSON library that handles all of them in a solid way?
Here's an example in bash using inlined python:
python -c '
import json
import sys
print(json.dumps({"data": sys.argv[1]}))
' "$(git diff --name-only $3..$4)"
It prints the json object { "data": "your command output here" } with standards-compliant escaping.
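Captured into a shell variable the way the question's changedfiles is (payload is just a hypothetical name; $3 and $4 are the same positional parameters as above):
payload=$(python -c 'import json, sys; print(json.dumps({"data": sys.argv[1]}))' "$(git diff --name-only $3..$4)")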
This is what I think you want to do to get a quoted list of files separated by commas (i.e. for inserting into a JSON string):
git diff --name-only $3..$4 | perl -p -e 's/(.*)/"$1",/;s/\n//;s/""/","/'
This works if your files don't contain double quotes or special characters that need to be JSON escaped.
First, we put the files in quotes followed by a comma, then remove newlines, then change the "" between files to ",". Although, this is kind of a hack. Somewhat better might be:
git diff --name-only $3..$4 | perl -p -e '$/="";s/(.*)\n/"$1",/g;s/,$//'
Here we read in the whole input, newlines and all, do our substitution and remove the final comma.
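If jq happens to be available, it can also do the escaping for you: -R reads raw text instead of JSON, -s slurps the whole input into a single string, and . emits it as a JSON-encoded string with newlines escaped as \n. A minimal sketch:
changedfiles=$(git diff --name-only $3..$4 | jq -Rs .)
Note the result already includes the surrounding double quotes, so it drops straight into a JSON document as a string value.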