Extract UTF-encoded binary data from JSON using jq

Say I have a JSON document with a 0xb7 byte encoded as a Unicode escape:
{"key":"_\u00b7_"}
If I extract the value of "key" with jq, the output keeps the UTF-8 encoding of that code point, which is "c2 b7":
$ echo '{"key":"_\u00b7_"}' | ./jq '.key' -r | xxd
0000000: 5fc2 b75f 0a _.._.
Is there any jq command that extracts the decoded "5f b7 5f" byte sequence out of this JSON?
I can solve this with extra tools like iconv but it's a bit ugly:
$ echo '{"key":"_\u00b7_"}' | ./jq '.key' -r \
| iconv -f utf8 -t utf32le \
| xxd -ps | sed -e 's/000000//g' | xxd -ps -r \
| xxd
0000000: 5fb7 5f0a _._.

def hx:
  def hex: [if . < 10 then 48 + . else 55 + . end] | implode;
  tonumber | "\(./16 | floor | hex)\(. % 16 | hex)";
{"key":"_\u00b7_"} | .key | explode | map(hx)
produces:
["5F","B7","5F"]
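For comparison, the same per-code-point hex conversion can be sketched in Python. This is only an illustration, not part of the jq answer: the function name hex_codepoints is made up here, and, like hx above, the result is only meaningful as a byte when every code point is below 256.

```python
import json


def hex_codepoints(text):
    """Return each code point of text as an uppercase hex string."""
    # Mirrors `explode | map(hx)`: ord() yields the code point, as explode does.
    return [format(ord(ch), "02X") for ch in text]


if __name__ == "__main__":
    value = json.loads('{"key":"_\\u00b7_"}')["key"]
    print(hex_codepoints(value))
```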
"Raw Bytes" (caveat emptor)
Since jq only supports UTF-8 strings, you would have to use some external tool to obtain the "raw bytes". Maybe this is closer to what you want:
jq -nrj '{"key":"_\u00b7_"} | .key' | iconv -f utf-8 -t ISO8859-1
This produces the three bytes.
And here's an iconv-free solution (note that utf8_decode is deprecated as of PHP 8.2):
jq -nrj '{"key":"_\u00b7_"} | .key' | php -r 'print utf8_decode(readline());'
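If Python is at hand, the same conversion can be sketched without iconv or PHP. This is a minimal sketch under the same assumption as the iconv command: every character of the value fits in ISO-8859-1, which is what lets each code point become exactly one byte. The helper name latin1_bytes is hypothetical.

```python
import json
import sys


def latin1_bytes(json_text, key):
    """Decode json_text and return the string at key as ISO-8859-1 bytes."""
    value = json.loads(json_text)[key]
    # ISO-8859-1 maps code points 0..255 one-to-one onto single bytes;
    # any character above U+00FF raises UnicodeEncodeError.
    return value.encode("latin-1")


if __name__ == "__main__":
    # Write the bytes unmodified, bypassing the text layer of stdout.
    sys.stdout.buffer.write(latin1_bytes('{"key":"_\\u00b7_"}', "key"))
```

Piping the script's output through xxd should show the three bytes 5f b7 5f with no trailing newline.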

Alternate
Addressing the character-encoding problem outside of jq:
Although you didn't want extra tools, iconv and hexdump are readily available. I often lean on iconv when I need to know exactly which encoding a stage of a pipeline produces, and on hexdump when I want control over how the bytes are displayed.
So an alternative is:
jq -njr '{"key":"_\u00b7_"} | .key' | iconv -f utf8 -t UTF-32LE | hexdump -ve '1/1 "%.X"'
Result (the zero padding bytes of UTF-32LE print as nothing, because "%.X" has a precision of zero):
5FB75F

Related

unescape backslash in jq output

https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/cid/5317139/property/IsomericSMILES/JSON
For the JSON at the above URL, the following jq filter prints 5317139 CCC/C=C\\1/C2=C(C3C(O3)CC2)C(=O)O1.
.PropertyTable.Properties
| .[]
| [.CID, .IsomericSMILES]
| @tsv
But there are two \ before the first 1. Is that wrong? Should there be just one \? How do I get the correct number of backslashes?

The extra backslash in the output is the result of the request to produce TSV, since "\" has a special role to play in jq's TSV (e.g. "\t" signifies the tab character).
By contrast, consider:
jq -r '
.PropertyTable.Properties
| .[]
| [.CID, .IsomericSMILES]
| join("\t")' smiles.json
5317139 CCC/C=C\1/C2=C(C3C(O3)CC2)C(=O)O1
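The difference between the two filters can be illustrated with a small Python sketch. It assumes the escaping rules documented for jq's @tsv (backslash, tab, newline and carriage return become \\, \t, \n and \r); the function names are invented for this example, and the sketch ignores other details of @tsv such as how null is rendered.

```python
def tsv_escape(field):
    """Escape one field following the rules documented for jq's @tsv."""
    # Backslash is replaced first so the escapes added below are not doubled.
    return (
        str(field)
        .replace("\\", "\\\\")
        .replace("\t", "\\t")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def tsv_row(fields):
    """Join escaped fields with tabs (roughly what @tsv does)."""
    return "\t".join(tsv_escape(f) for f in fields)


def plain_row(fields):
    """Join fields with tabs without escaping (roughly jq's join with a tab)."""
    return "\t".join(str(f) for f in fields)


if __name__ == "__main__":
    row = [5317139, "CCC/C=C\\1/C2=C(C3C(O3)CC2)C(=O)O1"]
    print(tsv_row(row))    # backslash doubled
    print(plain_row(row))  # backslash left alone
```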

How to find value of a key in a json response trace file using shell script

I have a response trace file containing the response below:
#RESPONSE BODY
#--------------------
{"totalItems":1,"member":[{"name":"name","title":"PatchedT","description":"My des_","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product","modified":"2019 Jul 23 10:22:15","created":"2019 Jul 23 10:21:54",}]}
I need to fetch the value of the "id" key into a variable that I can use later in my script.
Expected result:
echo $id should print 70EA96FB313349279EB089BA9DE2EC3B

With valid JSON (the sample as posted has a trailing comma before the closing }, which jq rejects), delete the first two lines with sed and parse the rest with jq:
id=$(sed '1,2d' file | jq -r '.member[]|.id')
Output to variable id:
70EA96FB313349279EB089BA9DE2EC3B
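The same two steps (skip the header, then parse) can be sketched in Python where jq is not installed. This is a rough equivalent under the same assumption of valid JSON; the function name member_ids is hypothetical, and the header is recognised by its leading "#" rather than by line position.

```python
import json


def member_ids(trace_text):
    """Return the id of every entry in "member" from a response trace."""
    # Drop the '#'-prefixed header lines, which `sed '1,2d'` removes by position.
    body = "\n".join(
        line for line in trace_text.splitlines() if not line.startswith("#")
    )
    # json.loads is strict: a trailing comma raises json.JSONDecodeError.
    return [member["id"] for member in json.loads(body)["member"]]
```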

I would strongly suggest using jq to parse JSON.
But given that JSON is mostly compatible with Python dictionaries and lists, this HACK would work too:
$ cat resp
#RESPONSE BODY
#--------------------
{"totalItems":1,"member":[{"name":"name","title":"PatchedT","description":"My des_","id":"70EA96FB313349279EB089BA9DE2EC3B","type":"Product","modified":"2019 Jul 23 10:22:15","created":"2019 Jul 23 10:21:54",}]}
$ awk 'NR==3{print "a="$0;print "print(a[\"member\"][0][\"id\"])"}' resp | python
70EA96FB313349279EB089BA9DE2EC3B
$ sed -n '3s|.*|a=&\nprint(a["member"][0]["id"])|p' resp | python
70EA96FB313349279EB089BA9DE2EC3B
Note that this code is
1. a dirty hack, for use only when your system lacks the right tool, jq
2. susceptible to code injection, because the response is executed as Python. Hence use it ONLY IF you trust the response received from your service.
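One way to keep the "JSON looks like a Python literal" idea while avoiding the injection risk noted above is ast.literal_eval, which parses literals without executing anything. The sketch below is an assumption-laden illustration, not the answerer's method: the function name is invented, the body is assumed to sit on the third line, and JSON's true, false and null would be rejected.

```python
import ast


def first_member_id(trace_text):
    """Return member[0]["id"] from the third line of a response trace."""
    body = trace_text.splitlines()[2]
    # literal_eval accepts only literals (dicts, lists, strings, numbers),
    # so nothing in the response is executed. It tolerates the trailing
    # comma in the sample but raises ValueError on true, false or null.
    data = ast.literal_eval(body)
    return data["member"][0]["id"]
```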

Quick and dirty (don't use eval):
eval $(cat response_file | tail -1 | awk -F , '{ print $5 }' | sed -e 's/"//g' -e 's/:/=/')
It relies on the exact structure you gave, and assumes there is no "," in any value before "id".
Or assign it yourself:
id=$(cat response_file | tail -1 | awk -F , '{ print $5 }' | cut -d: -f2 | sed -e 's/"//g')
Note that you can't access the name field with that trick, as it is the first item of the member array and will be "swallowed" by the { print $2 }. You can use an even uglier hack to retrieve it though:
name=$(cat response_file | tail -1 | sed -e 's/:\[/,/g' -e 's/}\]//g' | awk -F , '{ print $3 }' | cut -d: -f2 | sed -e 's/"//g')
But, if you can, jq is the right tool for that work instead of ugly hacks like that (but if it works...).

When you can't use jq, you can consider
id=$(grep -Eo "[0-9A-F]{32}" file)
This works only when the file looks the way I expect, so you might need to add extra checks like
id=$(grep "My des_" file | grep -Eo "[0-9A-F]{32}" | head -1)

jq to convert two text strings into separate json objects

How do I convert these two text strings into separate JSON objects?
Text strings:
start process: Mon May 15 03:14:09 UTC 2017
logfilename: log_download_2017
JSON output:
{
"start process": "Mon May 15 03:14:09 UTC 2017"
}
{
"logfilename": "log_download_2017"
}
Shell script:
logfilename="log_download_2017"
echo "start process: $(date -u)" | tee -a $logfilename.txt | jq -R split(:) >> $logfilename.json
echo "logfilename:" $logfilename | tee -a $logfilename.txt | jq -R split(:) >> $logfilename.json

One approach would be to use index/1, e.g. along these lines:
jq -R 'index(":") as $ix | {(.[:$ix]) : .[$ix+1:]}'
Or, if your jq supports regex, you might like to consider:
jq -R 'match( "([^:]*):(.*)" ) | .captures | {(.[0].string): .[1].string}'
or:
jq -R '[capture( "(?<key>[^:]*):(?<value>.*)" )] | from_entries'
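The key point in all three filters is that the split happens only at the first colon, so the colons inside the timestamp stay in the value. A minimal Python sketch of that idea follows; line_to_object is a made-up name, and the example is an illustration rather than a replacement for the jq filters.

```python
import json


def line_to_object(line):
    """Turn "key: value" into a one-entry dict, splitting at the first colon."""
    # partition splits only at the first ':', so "03:14:09" survives intact.
    key, sep, value = line.partition(":")
    if not sep:
        raise ValueError("no ':' in line: %r" % line)
    # Like the jq filters above, the value keeps its leading space.
    return {key: value}


if __name__ == "__main__":
    for text in ("start process: Mon May 15 03:14:09 UTC 2017",
                 "logfilename: log_download_2017"):
        print(json.dumps(line_to_object(text)))
```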

jq to convert two text strings into a single json object

How do I convert these two text strings into a single JSON object?
Text strings:
start process: Mon May 15 03:14:09 UTC 2017
logfilename: log_download_2017
JSON output:
{
"start process": "Mon May 15 03:14:09 UTC 2017",
"logfilename": "log_download_2017"
}
Shell script:
logfilename="log_download_2017"
echo "start process: $(date -u)" | tee -a $logfilename.txt | jq -R . >> $logfilename.json
echo "logfilename:" $logfilename | tee -a $logfilename.txt | jq -R . >> $logfilename.json

As mentioned e.g. at "Use jq to turn x=y pairs into key/value pairs", the basic task of converting a key:value string can be accomplished in a number of ways. For example, you could start with:
index(":") as $ix | {(.[:$ix]) : .[$ix+1:]}
You evidently want to trim some spaces, which can be done using sub/2.
To combine the objects, you could use add. To do this in a single pass, you would use jq -R -s, which reads the whole input as one raw string.
Putting it all together, you could do worse than:
def trim: sub("^ +";"") | sub(" +$";"");
def s2o:
  (index(":") // empty) as $ix
  | {(.[:$ix]): (.[$ix+1:]|trim)};
split("\n") | map(s2o) | add
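The same pipeline (split into lines, convert each line, merge) can be sketched in Python to make the three steps explicit. This is only an illustration of the logic in s2o and trim above; the name lines_to_object is hypothetical.

```python
def lines_to_object(text):
    """Merge every "key: value" line of text into a single dict."""
    result = {}
    for line in text.split("\n"):
        key, sep, value = line.partition(":")
        if not sep:
            # Lines without ':' (such as the trailing empty one) are skipped,
            # like `index(":") // empty` in s2o.
            continue
        # strip(" ") removes spaces only, matching the trim helper.
        result[key] = value.strip(" ")
    return result
```

As with add in jq, a key that appears on more than one line keeps the value from its last occurrence.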

I am converting my Json data to .CSV format using Jq but unable to run it

Input:
{"Timestamp":140,
"DateTime":"2014-06-02 14:32:34.440 PDT",
"CustomerId":"01",
"VisitorId":"78"}
Desired output:
Timestamp; DateTime; CustomerId; VisitorId
140; 2014-06-02 14:32:34.440 PDT; 01; 78
I tried the following code:
results.txt
| (map(keys) | add | unique) as $cols
| map(. as $row | $cols | map($row[.])) as $rows
| $cols, $rows[] | @csv
Error:
'add' is not recognized as an internal or external command,
operable program or batch file.
I don't know what is wrong. I am using the Windows platform with Cygwin.

With your input, and the following program in tocsv.jq:
(keys_unsorted | join(",")),
([.[]] | @csv)
the command:
$ jq -r -f tocsv.jq input.json
produces:
Timestamp,DateTime,CustomerId,VisitorId
140,"2014-06-02 14:32:34.440 PDT","01","78"
Eliminating the quotation marks in the second line is left as an exercise for the interested reader :-) [Hint: use join(",") again.]
WARNING: the above program is intended only for jq version 1.5 or later. When using an earlier version of jq, using to_entries or explicitly specifying the key names may be required.
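For readers who want the semicolon-separated layout from the question, here is a rough Python sketch using the standard csv module. It is an alternative illustration, not the jq answer: the function name and the default delimiter are assumptions made for this example, and it emits ";" without the space that follows it in the desired output.

```python
import csv
import io
import json


def object_to_csv(json_text, delimiter=";"):
    """Render one JSON object as a header row plus a value row."""
    record = json.loads(json_text)  # dicts keep the key order of the input
    out = io.StringIO()
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    writer.writerow(record.keys())
    writer.writerow(record.values())
    return out.getvalue()
```

With the default minimal quoting, a field is quoted only when it contains the delimiter, a quote character or a line break, so the values in the sample come out bare.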
