jq construct with value strings spanning multiple lines - json

I am trying to form a JSON construct using jq that should ideally look like below:-
{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": [
"event"
],
"traffic_including": [
"unattributed_traffic"
],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
"attribution_campaign_id",
"attribution_creative",
"attribution_date_adjusted",
"attribution_date_utc",
"attribution_matched_by",
"attribution_matched_to",
"attribution_network",
"attribution_network_id",
"attribution_seconds_since",
"attribution_site_id",
"attribution_site_id",
"attribution_tier",
"attribution_timestamp",
"attribution_timestamp_adjusted",
"attribution_tracker",
"attribution_tracker_id",
"attribution_tracker_name",
"count",
"custom_dimensions",
"device_id_adid",
"device_id_android_id",
"device_id_custom",
"device_id_idfa",
"device_id_idfv",
"device_id_kochava",
"device_os",
"device_type",
"device_version",
"dimension_count",
"dimension_data",
"dimension_sum",
"event_name",
"event_time_registered",
"geo_city",
"geo_country",
"geo_lat",
"geo_lon",
"geo_region",
"identity_link",
"install_date_adjusted",
"install_date_utc",
"install_device_version",
"install_devices_adid",
"install_devices_android_id",
"install_devices_custom",
"install_devices_email_0",
"install_devices_email_1",
"install_devices_idfa",
"install_devices_ids",
"install_devices_ip",
"install_devices_waid",
"install_matched_by",
"install_matched_on",
"install_receipt_status",
"install_san_original",
"install_status",
"request_ip",
"request_ua",
"timestamp_adjusted",
"timestamp_utc"
]
}
What I have tried unsuccessfully thus far is below:-
json_construct=$(cat <<EOF
{
"api_key": "6AEC90B5-4169-59AF-7AC9-D655F83B4825",
"app_guid": "komacca-s-rewards-app-au-ios-production-cv8tx71",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv"
"columns_order": ["attribution_attribution_action","attribution_campaign","attribution_campaign_id","attribution_creative","attribution_date_adjusted","attribution_date_utc","attribution_matched_by","attribution_matched_to","attributio
network","attribution_network_id","attribution_seconds_since","attribution_site_id","attribution_tier","attribution_timestamp","attribution_timestamp_adjusted","attribution_tracker","attribution_tracker_id","attribution_tracker_name","
unt","custom_dimensions","device_id_adid","device_id_android_id","device_id_custom","device_id_idfa","device_id_idfv","device_id_kochava","device_os","device_type","device_version","dimension_count","dimension_data","dimension_sum","ev
t_name","event_time_registered","geo_city","geo_country","geo_lat","geo_lon","geo_region","identity_link","install_date_adjusted","install_date_utc","install_device_version","install_devices_adid","install_devices_android_id","install_
vices_custom","install_devices_email_0","install_devices_email_1","install_devices_idfa","install_devices_ids","install_devices_ip","install_devices_waid","install_matched_by","install_matched_on","install_receipt_status","install_san_
iginal","install_status","request_ip","request_ua","timestamp_adjusted","timestamp_utc"]
}
EOF)
followed by:-
echo "$json_construct" | jq '.'
I get the following error:-
parse error: Expected separator between values at line 10, column 15
I am guessing it is because of the string literal which spans to multiple lines that jq is unable to parse it.

Use jq itself:
my_formatted_json=$(jq -n '{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
...,
"timestamp_utc"
]
}')

Your input "JSON" is not valid JSON, as indicated by the error message.
The first error is that a comma is missing after the key/value pair: "delivery_format": "csv", but there are others -- notably, JSON strings cannot be split across lines. Once you fix the key/value pair problem and the JSON strings that are split incorrectly, jq . will work with your text. (Note that once your input is corrected, the longest JSON string is quite short -- 50 characters or so -- whereas jq has no problems processing strings of length 10^8 quite speedily ...)
Generally, jq is rather permissive when it comes to JSON-like input, but if you're ever in doubt, it would make sense to use a validator such as the online validator at jsonlint.com
By the way, the jq FAQ does suggest various ways for handling input that isn't strictly JSON -- see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json

Along the lines of chepner's suggestion since jq can read raw text data you could just use a jq filter to generate a legal json object from your script variables. For example:
#!/bin/bash
# whatever logic you have to obtain bash variables goes here
key=XXXXXXXXXX-7AC9-D655F83B4825
guid=XXXXXXXXXXXXXX
# now use jq filter to read raw text and construct legal json object
json_construct=$(jq -MRn '[inputs]|map(split(" ")|{(.[0]):.[1]})|add' <<EOF
api_key $key
app_guid $guid
EOF)
echo $json_construct
Sample Run (assumes executable script is in script.sh)
$ ./script.sh
{ "api_key": "XXXXXXXXXX-7AC9-D655F83B4825", "app_guid": "XXXXXXXXXXXXXX" }
Try it online!

Related

Non json output from gcloud ai-platform predict. Parsing non-json outputs

I am using gcloud ai-platform predict to call an endpoint and get predictions as below using json-request and not json-response
gcloud ai-platform predict --json-request instances.json
The response is however not json and hense cannot be read further causing other complications. Below is the response.
VAL HS
0.5 {'hs_1': [[-0.134501, -0.307326, -0.151994, -0.065352, -0.14138]], 'hs_2' : [[-0.134501, -0.307326, -0.151994, -0.065352, 0.020759]]}
Can gcloud ai-platform predict return a json instead or may be parse it differently. ?
Thanks for your help.
Apparently, your output is a table with headers and two columns: a score and the (alleged) JSON content. You should extract the second column of any preferred data row (your example only has one but in general you might receive several score-JSON pairs). Maybe your API already offers functionality to extract a certain 'state', e.g. the one with the highest score. If not, a simple awk or sed script can get this job done easily.
Then, the only remaining issue before having proper JSON (which can then be queried by jq) is with the quoting style. Your output encloses field names with ' instead of " ('lstm_1' instead of "lstm_1"). Correcting thin, unfortunately, is a not-so-easy task if you can expect to receive arbitrarily complex JSON data (such as strings containing quotation marks etc.). However, if your JSON will always look as simple as in the example provided, simply substituting the wrong for the right one becomes an easy task again for tools like awk or sed.
For instance, using sed on your example output to select the second line (which is the first data row), drop everything from the beginning until but not including the first opening curly brace (which marks the beginning of the second column), make said substitutions and pipe the result into jq:
... | sed -n "2{s/^[^{]\+//;s/'/\"/g;p;q}" | jq .
{
"lstm_1": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
-0.1413831114768982
]
],
"lstm_2": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
0.02075939252972603
]
]
}
[Edited to reflect upon a comment]
If you want to utilize the score as well, let jq handle it. For instance:
... | sed -n "2{s/'/\"/g;p;q}" | jq -s '{score:first,status:last}'
{
"score": 0.548,
"status": {
"lstm_1": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
-0.1413831114768982
]
],
"lstm_2": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
0.02075939252972603
]
]
}
}
[Edited to reflect upon changes in the OP]
As changes affected only names and values but no structure, the hitherto valid approach still holds:
... | sed -n "2{s/'/\"/g;p;q}" | jq -s '{val:first,hs:last}'
{
"val": 0.5,
"hs": {
"hs_1": [
[
-0.134501,
-0.307326,
-0.151994,
-0.065352,
-0.14138
]
],
"hs_2": [
[
-0.134501,
-0.307326,
-0.151994,
-0.065352,
0.020759
]
]
}
}

transform json to add array objects

I need to transform an array by adding additional objects -
I have:
"user_id":"testuser"
"auth_token":"abcd"
I need:
"key":"user_id"
"value":"testuser"
"key":"auth_token"
"value":"abcd"
I have been using jq but cant figure out how to do it. Do i need to transform this into a multi-dimensional array first?
I have tried multiple jq queries but cant find the most suitable
When i try using jq i get
jq: error: syntax error, unexpected $end, expecting QQSTRING_TEXT or QQSTRING_INTERP_START or QQSTRING_END (Unix shell quoting issues?) at , line 1
Your input is not json, it's just a bunch of what could be thought of as key/value pairs. Assuming your json input actually looked like this:
{
"user_id": "testuser",
"auth_token": "abcd"
}
You could get an array of key/value pair objects using to_entries.
$ jq 'to_entries' input.json
[
{
"key": "user_id",
"value": "testuser"
},
{
"key": "auth_token",
"value": "abcd"
}
]
If on the other hand your input was actually that, you would need to convert it to a format that can be processed. Fortunately you could read it in as a raw string and probably parse using regular expressions or basic string manipulation.
$ jq -Rn '[inputs|capture("\"(?<key>[^\"]+)\":\"(?<value>[^\"]*)\"")]' input.txt
$ jq -Rn '[inputs|split(":")|map(fromjson)|{key:.[0],value:.[1]}]' input.txt
You can use to_entries filter for that.
Here is jqplay example
Robust conversion of key:value lines to JSON.
If the key:value specifications would be valid JSON except for the
missing punctuation (opening and closing braces etc), then a simple and quite robust approach to converting these key:value pairs to a single valid JSON object is illustrated by the following:
cat <<EOF | jq -nc -R '["{" + inputs + "}" | fromjson] | add'
"user_id": "testuser"
"auth_token" : "abcd"
EOF
Output
{
"user_id": "testuser",
"auth_token": "abcd"
}

Retrieve one (last) value from influxdb

I'm trying to retrieve the last value inserted into a table in influxdb. What I need to do is then post it to another system via HTTP.
I'd like to do all this in a bash script, but I'm open to Python also.
$ curl -sG 'https://influx.server:8086/query' --data-urlencode "db=iotaWatt" --data-urlencode "q=SELECT LAST(\"value\") FROM \"grid\" ORDER BY time DESC" | jq -r
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "grid",
"columns": [
"time",
"last"
],
"values": [
[
"2018-01-17T04:15:30Z",
690.1
]
]
}
]
}
]
}
What I'm struggling with is getting this value into a clean format I can use. I don't really want to use sed, and I've tried jq but it complains the data is a string and not an index:
jq: error (at <stdin>:1): Cannot index array with string "series"
Anyone have a good suggestion?
Pipe that curl to the jq below
$ your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]'
"grid"
"2018-01-17T04:15:30Z"
690.1
The results could be stored into a bash array and used later.
$ results=( $(your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]') )
$ echo "${results[#]}"
"grid" "2018-01-17T04:15:30Z" 690.1
# Individual values could be accessed using "${results[0]}" and so, mind quotes
All good :-)
Given the JSON shown, the jq query:
.results[].series[].values[]
produces:
[
"2018-01-17T04:15:30Z",
690.1
]
This seems to be the output you want, but from the point of view of someone who is not familiar with influxdb, the requirements seem very opaque, so you might want to consider a variant, such as:
.results[-1].series[-1].values[-1]
which in this case produces the same result, as it happens.
If you just want the atomic values, you could simply append [] to either of the queries above.

Pass bash variable (string) as jq argument to walk JSON object

This is seemingly a very basic question. I am new to JSON, so I am prepared for the incoming facepalm.
I have the following JSON file (test-app-1.json):
"application.name": "test-app-1",
"environments": {
"development": [
"server1"
],
"stage": [
"server2",
"server3"
],
"production": [
"server4",
"server5"
]
}
The intent is to use this as a configuration file and reference for input validation.
I am using bash 3.2 and will be using jq 1.4 (not the latest) to read the JSON.
The problem:
I need to return all values in the specified JSON array based on an argument.
Example: (how the documentation and other resources show that it should work)
APPLICATION_ENVIRONMENT="developement"
jq --arg appenv "$APPLICATION_ENVIRONMENT" '.environments."$env[]"' test-app-1.json
If executed, this returns null. This should return server1.
Obviously, if I specify text explicitly matching the JSON arrays under environment, it works just fine:
jq 'environments.development[]' test-app-1.json returns: server1.
Limitations: I am stuck to jq 1.4 for this project. I have tried the same actions in 1.5 on a different machine with the same null results.
What am I doing wrong here?
jq documentation: https://stedolan.github.io/jq/manual/
You have three issues - two typos and one jq filter usage issue:
set APPLICATION_ENVIRONMENT to development instead of developement
use variable name consistently: if you define appenv, use $appenv, not $env
address with .environments[$appenv]
When fixed, looks like this:
$ APPLICATION_ENVIRONMENT="development"
$ jq --arg appenv "$APPLICATION_ENVIRONMENT" '.environments[$appenv][]' test-app-1.json
"server1"

Need help! - Unable to load JSON using COPY command

Need your expertise here!
I am trying to load a JSON file (generated by JSON dumps) into redshift using copy command which is in the following format,
[
{
"cookieId": "cb2278",
"environment": "STAGE",
"errorMessages": [
"70460"
]
}
,
{
"cookieId": "cb2271",
"environment": "STG",
"errorMessages": [
"70460"
]
}
]
We ran into the error - "Invalid JSONPath format: Member is not an object."
when I tried to get rid of square braces - [] and remove the "," comma separator between JSON dicts then it loads perfectly fine.
{
"cookieId": "cb2278",
"environment": "STAGE",
"errorMessages": [
"70460"
]
}
{
"cookieId": "cb2271",
"environment": "STG",
"errorMessages": [
"70460"
]
}
But in reality most JSON files from API s have this formatting.
I could do string replace or reg ex to get rid of , and [] but I am wondering if there is a better way to load into redshift seamlessly with out modifying the file.
One way to convert a JSON array into a stream of the array's elements is to pipe the former into jq '.[]'. The output is sent to stdout.
If the JSON array is in a file named input.json, then the following command will produce a stream of the array's elements on stdout:
$ jq ".[]" input.json
If you want the output in jsonlines format, then use the -c switch (i.e. jq -c ......).
For more on jq, see https://stedolan.github.io/jq