Pulling data out of a JSON object via jq to CSV in Bash

I'm working on a Bash script (running via Git Bash on Windows, technically, but I don't think that matters) that will convert some JSON API data into CSV files. Most of it has gone fairly well, especially considering I'm not particularly familiar with jq; this is my first time using it.
I've got some JSON data that looks like the array below. What I'm trying to do is select the cardType, maskedPan, amount and dateTime out of the data.
This is probably the first time in my life that my Google searching has failed me. I know (or should I say think) that this is actually an object and not just a simple array.
I've not really found anything that helps me grab the data I need and export it into a CSV file. I've had no issue grabbing the other data that I need, but these few pieces are proving to be a big problem for me.
The script I'm trying basically can be boiled down to this:
jq='/c/jq-win64.exe -r';
header='("cardType")';
fields='[.TransactionDetails[0].Value[0].cardType]';
$jq ''$header',(.[] | '$fields' | @csv)' < /t/API_Data/JSON/GetByDate-082719.json >
/t/API_Data/CSV/test.csv;
If I do .TransactionDetails[0].Value I can get that whole chunk of data. But that is problematic in a CSV as it contains commas.
I suppose I could make this a TSV, import it into the database as one big string, and substring it out. But that isn't the "right" solution. I'm sure there is a way jq can give me what I need.
"TransactionDetails": [
  {
    "TransactionId": 123456789,
    "Name": "BlacklinePaymentDetail",
    "Value": "{\"cardType\":\"Visa\",\"maskedPan\":\"1234\",\"paymentDetails\":{\"reference\":\"123456789012\",\"amount\":99.99,\"dateTime\":\"2019/08/27 08:41:09\"}}",
    "ShowOnTill": false,
    "PrintOnOrder": false,
    "PrintOnReceipt": false
  }
]
Ideally I'd be able to just have a field in the CSV for each of cardType, maskedPan, amount and dateTime instead of pulling the "Value" that contains all of them.
Any advice would be appreciated.

The ingredient you're missing is fromjson, which converts a stringified JSON to JSON. Adding enclosing braces around your sample input,
the invocation:
jq -r -f program.jq input.json
produces:
"Visa","1234",99.99,"2019/08/27 08:41:09"
where program.jq is:
.TransactionDetails[0].Value
| fromjson
| [.cardType, .maskedPan] + (.paymentDetails | [.amount, .dateTime])
| @csv
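For completeness, here is a hedged end-to-end sketch (the file name input.json and the header column list are illustrative, chosen to match what the question asks for). It wraps the sample fragment in enclosing braces, emits a header row, and formats both rows with @csv:

```shell
# Illustrative input: the question's fragment wrapped in enclosing braces.
cat > input.json <<'EOF'
{"TransactionDetails":[{"TransactionId":123456789,"Name":"BlacklinePaymentDetail","Value":"{\"cardType\":\"Visa\",\"maskedPan\":\"1234\",\"paymentDetails\":{\"reference\":\"123456789012\",\"amount\":99.99,\"dateTime\":\"2019/08/27 08:41:09\"}}","ShowOnTill":false,"PrintOnOrder":false,"PrintOnReceipt":false}]}
EOF

# Header row first, then the data row: parse the stringified "Value"
# with fromjson and format both arrays with @csv.
jq -r '["cardType","maskedPan","amount","dateTime"],
       (.TransactionDetails[0].Value | fromjson
        | [.cardType, .maskedPan] + (.paymentDetails | [.amount, .dateTime]))
       | @csv' input.json
```

Note that in jq, the comma binds more tightly than the pipe, so both the header array and the data array flow through @csv.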

Related

iterating json to store key value pairs using shell script

I have a JSON file that is created at runtime by a sh script within Groovy code. The JSON file has the below contents.
cat.json
{
  "user1": "pass1",
  "user2": "pass2",
  "user3": "pass3"
}
Now I want to create a file at runtime which stores the key:value pairs in the below format
test
user1:pass1
user2:pass2
user3:pass3
Can someone help me with shell code for writing this?
You have literally a dozen ways to convert that JSON document to a tabular data file (pretty much like CSV/colon-SV), since you mentioned Java and Groovy, including the Java-driven scripting engines (BeanShell, JavaScript, Groovy itself). But if you can use jq, then you can extract key/value pairs, at least for simple values that do not require any escaping:
#!/bin/sh
jq -r 'to_entries[] | "\(.key):\(.value)"' \
< cat.json
This answer is inspired by searching for how to extract entries using jq (or convert a JSON file to a CSV file), and especially by the answer https://stackoverflow.com/a/50496145/12232870 by @peak.
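A quick demonstration with the sample file (jq preserves the key order of the input object, so the lines come out in the original order):

```shell
# Recreate the sample cat.json from the question.
printf '%s\n' '{"user1":"pass1","user2":"pass2","user3":"pass3"}' > cat.json

# One key:value line per entry, written to the "test" file.
jq -r 'to_entries[] | "\(.key):\(.value)"' cat.json > test

cat test
```

This prints the three user:pass lines exactly as shown in the desired output above.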

How to replace values in a JSON dictionary with their respective shell variables in jq?

I have the following JSON structure:
{
  "host1": "$PROJECT1",
  "host2": "$PROJECT2",
  "host3": "xyz",
  "host4": "$PROJECT4"
}
And the following environment variables in the shell:
PROJECT1="randomtext1"
PROJECT2="randomtext2"
PROJECT4="randomtext3"
I want to check the value for each key; if it has a "$" character in it, replace it with the respective environment variable (which is already present in the shell), so that my JSON template is rendered with the correct environment variables.
I can use the --args option of jq but there are quite a lot of variables in my actual JSON template that I want to render.
I have been trying the following:
jq 'with_entries(.values as v | env.$v)'
Basically, I'm making each value a variable, then updating its value with the variable from the env object, but it seems like I am missing some understanding. Is there a straightforward way of doing this?
EDIT
Thanks to the answers on this question, I was able to achieve my larger goal for a part of which this question was asked
- iterating over each value in an object,
- checking its value,
- if it's a string and starts with the character "$":
  - use the value to update it with an environment variable of the same name;
- if it's an array:
  - use the value to retrieve an environment variable of the same name,
  - split the string with "," as delimiter, which returns an array of strings,
  - update the value with the array of strings.
jq 'with_entries(.value |= (if (type=="array") then (env[.[0][1:]] | split(",")) elif (type=="string" and startswith("$")) then (env[.[1:]]) else . end))'
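A quick check of that filter (the input and the HOSTLIST variable here are hypothetical; the shell variables must be exported so jq's env object can see them):

```shell
export PROJECT1="randomtext1"
export HOSTLIST="alpha,beta,gamma"   # hypothetical array-valued variable

printf '%s\n' '{"host1":"$PROJECT1","hosts":["$HOSTLIST"],"host3":"xyz"}' |
jq -c 'with_entries(.value |= (if (type=="array") then (env[.[0][1:]] | split(","))
                               elif (type=="string" and startswith("$")) then (env[.[1:]])
                               else . end))'
```

String values keep or substitute as expected, and the array value is expanded by splitting the looked-up variable on commas.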
You need to export the Bash variables to be seen by jq:
export PROJECT1="randomtext1"
export PROJECT2="randomtext2"
export PROJECT4="randomtext3"
Then you can go with:
jq 'with_entries((.value | select(startswith("$"))) |= env[.[1:]])'
and get:
{
  "host1": "randomtext1",
  "host2": "randomtext2",
  "host3": "xyz",
  "host4": "randomtext3"
}
Exporting a large number of shell variables might not be such a good idea and does not address the problem of array-valued variables. It might therefore be a good idea to think along the lines of printing the variable=value details to a file, and then combining that file with the template. It’s easy to do and examples on the internet abound and probably here on SO as well. You could, for example, use printf like so:
printf "%s\t" ${BASH_VERSINFO[@]}
3 2 57 1
You might also find declare -p helpful.
See also https://github.com/stedolan/jq/wiki/Cookbook#arbitrary-strings-as-template-variables
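One hedged sketch of that file-based idea (file names are illustrative): dump the variables you care about into a small JSON file with --arg, then combine that file with the template using --slurpfile, so nothing needs to be exported:

```shell
PROJECT1="randomtext1"    # a plain shell variable, not exported

# Illustrative template with one substitutable value and one literal.
printf '%s\n' '{"host1":"$PROJECT1","host3":"xyz"}' > template.json

# Dump the variables we care about as a JSON object ...
jq -n --arg PROJECT1 "$PROJECT1" '{PROJECT1: $PROJECT1}' > vars.json

# ... and substitute them into the template.
jq -c --slurpfile v vars.json \
   'with_entries((.value | select(type=="string" and startswith("$"))) |= $v[0][.[1:]])' \
   template.json
```

The slurped file arrives as the array $v, so $v[0] is the object of variable values; the rest mirrors the env-based filter above.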

Convert CSV to Grouped JSON

I have several large CSVs which I would like to export to a particular JSON format, but I'm not really sure how to convert them over. It's a list of usernames and URLs.
b00nw33,harrypotter788.flv
b00nw33,harrypotter788.mov
b00nw33,levitation271.avi
b01spider,schimbvalutar109.avi
...
I want to export them to JSON grouped by the username like the following
{
  "b00nw33": [
    "harrypotter788.flv",
    "harrypotter788.mov",
    "levitation271.avi"
  ],
  "b01spider": [
    "schimbvalutar109.avi"
  ]
}
What is the jq filter to do this? Thank you!
The key to a simple solution is the generic function aggregate_by:
# In this formulation, f must either always evaluate to a string or
# always to an integer, it being understood that negative integers
# might be problematic
def aggregate_by(s; f; g):
  reduce s as $x (null; .[$x|f] += [$x|g]);
If the CSV can be accurately parsed by simply splitting on commas, then the desired transformation can be accomplished using the following jq filter:
aggregate_by(inputs | split(","); .[0]; .[1])
This assumes jq is invoked with the -R (raw input) and -n (null input) options.
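Concretely, the invocation might look like this (the file name input.csv is illustrative, and the filter is written inline rather than in a separate file):

```shell
# Recreate the sample CSV from the question.
cat > input.csv <<'EOF'
b00nw33,harrypotter788.flv
b00nw33,harrypotter788.mov
b00nw33,levitation271.avi
b01spider,schimbvalutar109.avi
EOF

# -n: null input so `inputs` drives the iteration; -R: read raw lines.
jq -nR '
  def aggregate_by(s; f; g):
    reduce s as $x (null; .[$x|f] += [$x|g]);
  aggregate_by(inputs | split(","); .[0]; .[1])
' input.csv
```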
Output
With the given CSV input, the output would be:
{
  "b00nw33": [
    "harrypotter788.flv",
    "harrypotter788.mov",
    "levitation271.avi"
  ],
  "b01spider": [
    "schimbvalutar109.avi"
  ]
}
Handling non-trivial CSV
The above solution assumes that the CSV is as uncomplicated as the sample. If, on the contrary, the CSV cannot be accurately parsed by simply splitting at commas, a more general parser will be needed.
One approach would be to use the very robust and fast csv2json parser at https://github.com/fadado/CSV
Alternatively, you could use one of the many available "csv2tsv" parsers to generate TSV, which jq can handle directly (by splitting on tabs, i.e. split("\t") rather than split(",")).
In any case, once the CSV has been converted to JSON, the filter aggregate_by defined above can be used.
If you are interested in a jq parser for CSV, you might want to look at fromcsvfile (https://gist.github.com/pkoppstein/bbbbdf7489c8c515680beb1c75fa59f2); see also
the definitions for fromcsv being proposed at https://github.com/stedolan/jq/issues/1650#issuecomment-448050902

Oracle SQLcl: Spool to json, only include content in items array?

I'm making a query via Oracle SQLcl and spooling the output into a .json file.
The query returns the correct data, but the format is strange.
Starting off as:
SET ENCODING UTF-8
SET SQLFORMAT JSON
SPOOL content.json
Followed by a query, this produces a JSON file as requested.
However, how do I remove the outer structure, meaning this part:
{"results":[{"columns":[{"name":"ID","type":"NUMBER"},
                        {"name":"LANGUAGE","type":"VARCHAR2"},
                        {"name":"LOCATION","type":"VARCHAR2"},
                        {"name":"NAME","type":"VARCHAR2"}],
             "items": [
               // Here is the actual data I want to see in the file exclusively
             ]}]}
I only want to spool everything in the items array, not including that key itself.
Is it possible to set this as a parameter before querying? Reading the Oracle docs has not yielded any answers, hence asking here.
Here's how I handle this.
After spooling to a file, I use jq to recreate the file with only the items:
cat file.json | jq --compact-output --raw-output '.results[0].items' > items.json
Using this tool: https://stedolan.github.io/jq/
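For example, with a minimal file that has the same outer structure (the column and item contents here are made up purely for illustration):

```shell
# Minimal stand-in for SQLcl's spooled output.
printf '%s' '{"results":[{"columns":[{"name":"ID","type":"NUMBER"}],"items":[{"id":1,"name":"x"}]}]}' > file.json

# Keep only the items array, dropping the outer results/columns wrapper.
jq --compact-output '.results[0].items' file.json > items.json
cat items.json
```

items.json then contains just the items array, with no "items" key or surrounding metadata.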

Efficiently get the first record of a JSONL file

Is it possible to efficiently get the first record of a JSONL file without consuming the entire stream / file? One way I have been able to inefficiently do so is the following:
curl -s http://example.org/file.jsonl | jq -s '.[0]'
I realize that head could be used here to extract the first line, but assume that the file may not use a newline as the record separator and may simply be concatenated objects or arrays.
If I'm understanding correctly, the JSONL format is just a stream of JSON objects, which jq handles quite nicely. Since you only want the first item, you can use the input filter to grab it.
I think you could just do this:
$ curl -s http://example.org/file.jsonl | jq -n 'input'
You need the null-input option -n so that jq doesn't consume the input automatically; then input reads just one entity from the stream. There is no need to go through the rest of the input stream.
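This works even when the records are not newline-separated, which addresses the concern above; here is a small local check with three concatenated objects (sample data made up for illustration):

```shell
# Three concatenated JSON objects with no newlines between them.
printf '%s' '{"id":1}{"id":2}{"id":3}' | jq -nc 'input'
```

jq parses the concatenated stream entity by entity, and input stops after the first one.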