JMESPath join output and strip unwanted chars

JMESPath join output and strip unwanted chars - json

I have the following json;
[
{
"compartment-id": "CompartmentID-123",
"defined-tags": {},
"display-name": "Test-123",
"freeform-tags": {},
"id": "ID-ABC",
"kms-key-id": "",
"lifecycle-state": "ACTIVE",
}
]
I am looking to join the id and display-name params into a comma seperate string with " and [] stripped, like so:
ID-ABC,Test-123
Closest I've managed to get so far is:
oci fs file-system list -c $compart --availability-domain $ad --query 'data[].[id,"display-name"][]' | tr -d '" '
[
ID-ABC,
Test-123
]
Wondering if there's a cleaner way of doing this all within JMESPath without piping output to tr, jq, sed or awk etc etc
UPDATE based on input from β.εηοιτ.βε
So close...
oci fs file-system list -c $compart --availability-domain $ad3 --query 'data[0].join(',', [id, "display-name"])'
Returns
ParseError: invalid token: Parse error at column 13, token "," (COMMA), for expression:
"data[0].join(,, [id, "display-name"])"
However playing around with quotes, best I can get is by using;
oci fs file-system list -c $compart --availability-domain $ad3 --query "data[0].join(',', [id, 'display-name'])"
Private key passphrase:
"ID-ABC,display-name"
I'm beginning to wonder if there's something wrong with my local settings or shell, whereby it's getting confused by the quotes marks?

Since you do have an array, you will first need to target the first element of the array, to get rid of it. Since an array is 0-based, this can easily be achieved via:
data[0]
which gives
{
"compartment-id": "CompartmentID-123",
"defined-tags": {},
"display-name": "Test-123",
"freeform-tags": {},
"id": "ID-ABC",
"kms-key-id": "",
"lifecycle-state": "ACTIVE"
}
Then, in order to create the string you are looking for, you can use the join function, that accepts, as parameter, a glue character and array.
You can get the array you look for pretty easily, using filters and multiselect lists:
data[0].[id, "display-name"]
which gives
[
"ID-ABC",
"Test-123"
]
Now we just need to apply the join function on top of all this:
data[0].join(',', [id, "display-name"])
which finally gives:
"ID-ABC,Test-123"
Additional notes on quoting in JMESPath:
'some-string'
is a raw literal string and won't be interpreted
`some-string`
is another form of raw literal string and won't be interpreted
"some-string"
is a literal expression and will be interpreted
So:
data[0].join(',', [id, "display-name"])
and
data[0].join(`,`, [id, "display-name"])
are two strictly equals queries.
While
data[0].join(',', [id, 'display-name'])
has a totally different meaning and would end you with the string display-name being the second element of your array, so it will result in
"ID-ABC,display-name"
All of this based on the JSON structure:
{
"data":[
{
"compartment-id":"CompartmentID-123",
"defined-tags":{},
"display-name":"Test-123",
"freeform-tags":{},
"id":"ID-ABC",
"kms-key-id":"",
"lifecycle-state":"ACTIVE"
}
]
}

Related

Non json output from gcloud ai-platform predict. Parsing non-json outputs

I am using gcloud ai-platform predict to call an endpoint and get predictions as below using json-request and not json-response
gcloud ai-platform predict --json-request instances.json
The response is however not json and hense cannot be read further causing other complications. Below is the response.
VAL HS
0.5 {'hs_1': [[-0.134501, -0.307326, -0.151994, -0.065352, -0.14138]], 'hs_2' : [[-0.134501, -0.307326, -0.151994, -0.065352, 0.020759]]}
Can gcloud ai-platform predict return a json instead or may be parse it differently. ?
Thanks for your help.

Apparently, your output is a table with headers and two columns: a score and the (alleged) JSON content. You should extract the second column of any preferred data row (your example only has one but in general you might receive several score-JSON pairs). Maybe your API already offers functionality to extract a certain 'state', e.g. the one with the highest score. If not, a simple awk or sed script can get this job done easily.
Then, the only remaining issue before having proper JSON (which can then be queried by jq) is with the quoting style. Your output encloses field names with ' instead of " ('lstm_1' instead of "lstm_1"). Correcting thin, unfortunately, is a not-so-easy task if you can expect to receive arbitrarily complex JSON data (such as strings containing quotation marks etc.). However, if your JSON will always look as simple as in the example provided, simply substituting the wrong for the right one becomes an easy task again for tools like awk or sed.
For instance, using sed on your example output to select the second line (which is the first data row), drop everything from the beginning until but not including the first opening curly brace (which marks the beginning of the second column), make said substitutions and pipe the result into jq:
... | sed -n "2{s/^[^{]\+//;s/'/\"/g;p;q}" | jq .
{
"lstm_1": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
-0.1413831114768982
]
],
"lstm_2": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
0.02075939252972603
]
]
}
[Edited to reflect upon a comment]
If you want to utilize the score as well, let jq handle it. For instance:
... | sed -n "2{s/'/\"/g;p;q}" | jq -s '{score:first,status:last}'
{
"score": 0.548,
"status": {
"lstm_1": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
-0.1413831114768982
]
],
"lstm_2": [
[
-0.13450142741203308,
-0.3073260486125946,
-0.15199440717697144,
-0.06535257399082184,
0.02075939252972603
]
]
}
}
[Edited to reflect upon changes in the OP]
As changes affected only names and values but no structure, the hitherto valid approach still holds:
... | sed -n "2{s/'/\"/g;p;q}" | jq -s '{val:first,hs:last}'
{
"val": 0.5,
"hs": {
"hs_1": [
[
-0.134501,
-0.307326,
-0.151994,
-0.065352,
-0.14138
]
],
"hs_2": [
[
-0.134501,
-0.307326,
-0.151994,
-0.065352,
0.020759
]
]
}
}

Parsing JSON file in powershell with specific characters

I am trying to get the values in powershell within specific characters. Basically I have a json with thousands of objects like this
"Name": "AllZones_APOPreface_GeographyMatch_FromBRE_ToSTR",
"Sequence": 0,
"Condition": "this.TripOriginLocationCode==\"BRE\"&&this.TripDestinationLocationCode==\"STR\"",
"Action": "this.FeesRate=0.19m;this.ZoneCode=\"Zone1\";halt",
"ElseAction": ""
I want everything within \" \"
IE here I would see that BRE and STR is Zone1
All I need is those 3 things outputted.
I have been searching how to do it with ConvertFrom-Json but no success, maybe I just havent found a good article on this.
Thanks

Start by representing your JSON as a string:
$myjson = #'
{
"Name": "AllZones_APOPreface_GeographyMatch_FromBRE_ToSTR",
"Sequence": 0,
"Condition": "this.TripOriginLocationCode==\"BRE\"&&this.TripDestinationLocationCode==\"STR\"",
"Action": "this.FeesRate=0.19m;this.ZoneCode=\"Zone1\";halt",
"ElseAction": ""
}
'#
Next, create a regular expression that matches everything in between \" and \", that's under 10 characters long (else it'll match unwanted results).
$regex = [regex]::new('\\"(?<content>.{1,10})\\"')
Next, perform the regular expression comparison, by calling the Matches() method on the regular expression. Pass your JSON string into the method parameters, as the text that you want to perform the comparison against.
$matchlist = $regex.Matches($myjson)
Finally, grab the content match group that was defined in the regular expression, and extract the values from it.
$matchlist.Groups.Where({ $PSItem.Name -eq 'content' }).Value
Result
BRE
STR
Zone1
Approach #2: Use Regex Look-behinds for more accurate matching
Here's a more specific regular expression that uses look-behinds to validate each field appropriately. Then we assign each match to a developer-friendly variable name.
$regex = [regex]::new('(?<=TripOriginLocationCode==\\")(?<OriginCode>\w+)|(?<=TripDestinationLocationCode==\\")(?<DestinationCode>\w+)|(?<=ZoneCode=\\")(?<ZoneCode>\w+)')
$matchlist = $regex.Matches($myjson)
### Assign each component to its own friendly variable name
$OriginCode, $DestinationCode, $ZoneCode = $matchlist[0].Value, $matchlist[1].Value, $matchlist[2].Value
### Construct a string from the individual components
'Your origin code is {0}, your destination code is {1}, and your zone code is {2}' -f $OriginCode, $DestinationCode, $ZoneCode
Result
Your origin code is BRE, your destination code is STR, and your zone code is Zone1

jq split string by a pattern

I have a json object with one of field having values for example "countries-sapi-1.0", "inventory-list-api-1.0-snapshot"
Note that the first one has sapi and the other one has api.
Using jq, how can i get countries-sapi or inventory-list-api I mean whatever is there before the version. the version can be as simple as 1.0 or 1.0.1-snapshot etc..

I got here searching for how to split by regex instead of substring in jq, but I found out that you have to give two arguments to the split function (where the second argument contains flags for the regex, but it can be just an empty string).
$ jq -n '"axxbxxxc"|split("x+";"")'
[
"a",
"b",
"c"
]
From the manual:
split
Splits an input string on the separator argument.
jq 'split(", ")'
"a, b,c,d, e, "
=> ["a","b,c,d","e",""]
[...]
split(regex; flags)
For backwards compatibility, split splits on a string, not a regex.

It looks like you need to study up on regular expressions (regex); see for example https://regexone.com/ or https://github.com/zeeshanu/learn-regex or dozens of others.
Using jq, in your particular case, you could start with:
sub(" *- *[0-9]+\\.[0-9]+.*$"; "")
Note that two backslashes are required here because the "from" expression must be (or evaluate to) a valid JSON string.

For input "countries-sapi-1.0", use: .[] | match( "\\d"; "ig") which will give you the following output:
{
"offset": 15,
"length": 1,
"string": "1",
"captures": []
}
{
"offset": 17,
"length": 1,
"string": "0",
"captures": []
}
This uses the first object's offset value and tries to slice it from the starting position to the received offset.
Slice from beginning: $ jq -c '.[:15]'
In our case we got 15 as the offset for the first object so we used :15 for the slice.

Retrieve one (last) value from influxdb

I'm trying to retrieve the last value inserted into a table in influxdb. What I need to do is then post it to another system via HTTP.
I'd like to do all this in a bash script, but I'm open to Python also.
$ curl -sG 'https://influx.server:8086/query' --data-urlencode "db=iotaWatt" --data-urlencode "q=SELECT LAST(\"value\") FROM \"grid\" ORDER BY time DESC" | jq -r
{
"results": [
{
"statement_id": 0,
"series": [
{
"name": "grid",
"columns": [
"time",
"last"
],
"values": [
[
"2018-01-17T04:15:30Z",
690.1
]
]
}
]
}
]
}
What I'm struggling with is getting this value into a clean format I can use. I don't really want to use sed, and I've tried jq but it complains the data is a string and not an index:
jq: error (at <stdin>:1): Cannot index array with string "series"
Anyone have a good suggestion?

Pipe that curl to the jq below
$ your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]'
"grid"
"2018-01-17T04:15:30Z"
690.1
The results could be stored into a bash array and used later.
$ results=( $(your_curl_stuff_here | jq '.results[].series[]|.name,.values[0][]') )
$ echo "${results[#]}"
"grid" "2018-01-17T04:15:30Z" 690.1
# Individual values could be accessed using "${results[0]}" and so, mind quotes
All good :-)

Given the JSON shown, the jq query:
.results[].series[].values[]
produces:
[
"2018-01-17T04:15:30Z",
690.1
]
This seems to be the output you want, but from the point of view of someone who is not familiar with influxdb, the requirements seem very opaque, so you might want to consider a variant, such as:
.results[-1].series[-1].values[-1]
which in this case produces the same result, as it happens.
If you just want the atomic values, you could simply append [] to either of the queries above.

jq construct with value strings spanning multiple lines

I am trying to form a JSON construct using jq that should ideally look like below:-
{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": [
"event"
],
"traffic_including": [
"unattributed_traffic"
],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
"attribution_campaign_id",
"attribution_creative",
"attribution_date_adjusted",
"attribution_date_utc",
"attribution_matched_by",
"attribution_matched_to",
"attribution_network",
"attribution_network_id",
"attribution_seconds_since",
"attribution_site_id",
"attribution_site_id",
"attribution_tier",
"attribution_timestamp",
"attribution_timestamp_adjusted",
"attribution_tracker",
"attribution_tracker_id",
"attribution_tracker_name",
"count",
"custom_dimensions",
"device_id_adid",
"device_id_android_id",
"device_id_custom",
"device_id_idfa",
"device_id_idfv",
"device_id_kochava",
"device_os",
"device_type",
"device_version",
"dimension_count",
"dimension_data",
"dimension_sum",
"event_name",
"event_time_registered",
"geo_city",
"geo_country",
"geo_lat",
"geo_lon",
"geo_region",
"identity_link",
"install_date_adjusted",
"install_date_utc",
"install_device_version",
"install_devices_adid",
"install_devices_android_id",
"install_devices_custom",
"install_devices_email_0",
"install_devices_email_1",
"install_devices_idfa",
"install_devices_ids",
"install_devices_ip",
"install_devices_waid",
"install_matched_by",
"install_matched_on",
"install_receipt_status",
"install_san_original",
"install_status",
"request_ip",
"request_ua",
"timestamp_adjusted",
"timestamp_utc"
]
}
What I have tried unsuccessfully thus far is below:-
json_construct=$(cat <<EOF
{
"api_key": "6AEC90B5-4169-59AF-7AC9-D655F83B4825",
"app_guid": "komacca-s-rewards-app-au-ios-production-cv8tx71",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv"
"columns_order": ["attribution_attribution_action","attribution_campaign","attribution_campaign_id","attribution_creative","attribution_date_adjusted","attribution_date_utc","attribution_matched_by","attribution_matched_to","attributio
network","attribution_network_id","attribution_seconds_since","attribution_site_id","attribution_tier","attribution_timestamp","attribution_timestamp_adjusted","attribution_tracker","attribution_tracker_id","attribution_tracker_name","
unt","custom_dimensions","device_id_adid","device_id_android_id","device_id_custom","device_id_idfa","device_id_idfv","device_id_kochava","device_os","device_type","device_version","dimension_count","dimension_data","dimension_sum","ev
t_name","event_time_registered","geo_city","geo_country","geo_lat","geo_lon","geo_region","identity_link","install_date_adjusted","install_date_utc","install_device_version","install_devices_adid","install_devices_android_id","install_
vices_custom","install_devices_email_0","install_devices_email_1","install_devices_idfa","install_devices_ids","install_devices_ip","install_devices_waid","install_matched_by","install_matched_on","install_receipt_status","install_san_
iginal","install_status","request_ip","request_ua","timestamp_adjusted","timestamp_utc"]
}
EOF)
followed by:-
echo "$json_construct" | jq '.'
I get the following error:-
parse error: Expected separator between values at line 10, column 15
I am guessing it is because of the string literal which spans to multiple lines that jq is unable to parse it.

Use jq itself:
my_formatted_json=$(jq -n '{
"api_key": "XXXXXXXXXX-7AC9-D655F83B4825",
"app_guid": "XXXXXXXXXXXXXX",
"time_start": 1508677200,
"time_end": 1508763600,
"traffic": ["event"],
"traffic_including": ["unattributed_traffic"],
"time_zone": "Australia/NSW",
"delivery_format": "csv",
"columns_order": [
"attribution_attribution_action",
"attribution_campaign",
...,
"timestamp_utc"
]
}')

Your input "JSON" is not valid JSON, as indicated by the error message.
The first error is that a comma is missing after the key/value pair: "delivery_format": "csv", but there are others -- notably, JSON strings cannot be split across lines. Once you fix the key/value pair problem and the JSON strings that are split incorrectly, jq . will work with your text. (Note that once your input is corrected, the longest JSON string is quite short -- 50 characters or so -- whereas jq has no problems processing strings of length 10^8 quite speedily ...)
Generally, jq is rather permissive when it comes to JSON-like input, but if you're ever in doubt, it would make sense to use a validator such as the online validator at jsonlint.com
By the way, the jq FAQ does suggest various ways for handling input that isn't strictly JSON -- see https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json

Along the lines of chepner's suggestion since jq can read raw text data you could just use a jq filter to generate a legal json object from your script variables. For example:
#!/bin/bash
# whatever logic you have to obtain bash variables goes here
key=XXXXXXXXXX-7AC9-D655F83B4825
guid=XXXXXXXXXXXXXX
# now use jq filter to read raw text and construct legal json object
json_construct=$(jq -MRn '[inputs]|map(split(" ")|{(.[0]):.[1]})|add' <<EOF
api_key $key
app_guid $guid
EOF)
echo $json_construct
Sample Run (assumes executable script is in script.sh)
$ ./script.sh
{ "api_key": "XXXXXXXXXX-7AC9-D655F83B4825", "app_guid": "XXXXXXXXXXXXXX" }
Try it online!

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

JMESPath join output and strip unwanted chars - json

Related

Non json output from gcloud ai-platform predict. Parsing non-json outputs

Parsing JSON file in powershell with specific characters

jq split string by a pattern

Retrieve one (last) value from influxdb

jq construct with value strings spanning multiple lines

Categories

Resources