I'm cURLing a web API for an application/json response. The response is a set of key/value pairs, like this:
{"id":89,"name":"User saved 2018-07-03 12:01:47.337483","create_time":1530644507337,"auto":false,"recovered":false}
{"id":49,"name":"User saved 2018-05-24 12:33:53.927798","create_time":1527190433927,"auto":false,"recovered":false}
{"id":199,"name":"Daily backup 2018-10-22 02:37:37.332271","create_time":1540201057332,"auto":true,"recovered":false}
etc, etc...
I'd like to iterate through this response, find the highest integer value for the "id" key, and save that as a variable. If the above were my whole JSON, I'd want to end up with variable=199.
Doing something like this:
MY_VARIABLE=$(curl -k -X GET --header "Accept: application/json" \
  --header "Authorization: MyAPITarget apikey=${MY_APIKEY}" \
  "https://targetserver/api/methodImCalling")
The output of that curl call is the JSON above. Can I pipe that output into an array, iterate through it looking only at the value of "id", and then do something similar to:
for (i = 0; i < id.length; i++)
I've only been working with code a short while, and most of my background at this point is JS, so I'm trying to make the connection to bash here. I'm trying to avoid using any "installed" tools whatsoever, which is why I'm using bash; I'd like this script to run "out of the box" on any Linux/Unix platform. Any tips? Thanks!
It's probably a separate installation, but the tool you want is jq:
max_id=$(curl ... | jq -s 'map(.id) | max')
The standard tools that one can expect to be pre-installed simply aren't suitable for working with JSON.
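Fleshed out with the curl invocation from the question (the URL, header, and MY_APIKEY are the asker's placeholders), the whole pipeline might look like:

max_id=$(
  curl -k -X GET --header "Accept: application/json" \
    --header "Authorization: MyAPITarget apikey=${MY_APIKEY}" \
    "https://targetserver/api/methodImCalling" |
  jq -s 'map(.id) | max'
)
echo "$max_id"   # 199 for the sample data above

The -s (slurp) flag collects the stream of objects into a single array, so map(.id) | max can scan all of them at once.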
While not standard either, any machine that has curl installed is likely to have Python installed as well, and you can use its standard json module to process the JSON properly. Here's a somewhat ungainly one-liner (use python3 if there is no python command):
curl ... |
python -c 'import json,sys; x="[%s]"%(",".join(sys.stdin),); print(max(y["id"] for y in json.loads(x)))'
Other non-standard but common languages (Perl, Ruby, etc) probably also have built-in ways to consume JSON.
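For instance, Perl has shipped the JSON::PP module in its core distribution since 5.14, so an untested sketch along these lines should work without installing anything:

curl ... |
perl -MJSON::PP -ne 'chomp; my $id = decode_json($_)->{id}; $max = $id if !defined $max || $id > $max; END { print "$max\n" }'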
I'm running into a problem as I attempt to automate an API process into BigQuery.
The issue is that the data needs to be in newline-delimited JSON format to go into my BigQuery database, but the data I'm pulling doesn't come that way, so I need to parse it out.
Here is the data so you can get an idea of what it looks like:
{"type":"user.list","users":[{"type":"user","id":"581c13632f25960e6e3dc89a","user_id":"ieo2e6dtsqhiyhtr","anonymous":false,"email":"test#gmail.com","name":"Joe Martinez","pseudonym":null,"avatar":{"type":"avatar","image_url":null},"app_id":"b5vkxvop","companies":{"type":"company.list","companies":[]},"location_data":{"type":"location_data","city_name":"Houston","continent_code":"NA","country_name":"United States","latitude":29.7633,"longitude":-95.3633,"postal_code":"77002","region_name":"Texas","timezone":"America/Chicago","country_code":"USA"},"last_request_at":1478235114,"last_seen_ip":"66.87.120.30","created_at":1478234979,"remote_created_at":1478234944,"signed_up_at":1478234944,"updated_at":1478235145,"session_count":1,"social_profiles":{"type":"social_profile.list","social_profiles":[]},"unsubscribed_from_emails":false,"user_agent_data":"Mozilla/5.0 (Linux; Android 6.0.1; SM-G920P Build/MMB29K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.68 Mobile Safari/537.36","tags":{"type":"tag.list","tags":[]},"segments":{"type":"segment.list","segments":[{"type":"segment","id":"57d2ea275bfcebabd516d963"},{"type":"segment","id":"57d2ea265bfcebabd516d962"}]},"custom_attributes":{"claimCount":"1","memberType":"claimant"}},{"type":"user","id":"581c22a19a1dc02c460541df","user_id":"1o3helrdv58cxm7jf","anonymous":false,"email":"test#mail.com","name":"Joe Coleman","pseudonym":null,"avatar":{"type":"avatar","image_url":null},"app_id":"b5vkxvop","companies":{"type":"company.list","companies":[]},"location_data":{"type":"location_data","city_name":"San Jose","continent_code":"NA","country_name":"United States","latitude":37.3394,"longitude":-121.895,"postal_code":"95141","region_name":"California","timezone":"America/Los_Angeles","country_code":"USA"},"last_request_at":1478239113,"last_seen_ip":"216.151.183.47","created_at":1478238881,"remote_created_at":1478238744,"signed_up_at":1478238744,"updated_at":1478239113,"session_count":1,"social_profiles":{"type":"social_profile.list","social_profiles":[]},"unsubscribed_from_emails":false,"user_agent_data":"Mozilla/5.0 (Windows NT 6.3; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0","tags":{"type":"tag.list","tags":[]},"segments":{"type":"segment.list","segments":[{"type":"segment","id":"57d2ea275bfcebabd516d963"},{"type":"segment","id":"57d2ea265bfcebabd516d962"}]},"custom_attributes":{"claimCount":"2","memberType":"claimant"}}],"scroll_param":"24ba0fac-b8f9-46b2-944a-9bb523dcd1b1"}
The two problems are the first line:
{"type":"user.list","users":
And the final piece at the bottom:
,"scroll_param":"24bd0rac-b2f9-46b2-944a-9zz543dcd1b1"}
If you eliminate those two pieces, you are left with just the data you need, and I know what filter is needed to parse it out into newline-delimited format.
You can see for yourself by playing around with jq: if you copy and paste everything from that first open bracket to the closing bracket on the final line, set it to "Compact Output", and apply the filter:
.[]
The result will be in a nice, neat newline-delimited format:
{"type":"user","id":"581c13632f25960e6e3dc89a","user_id":"ieo2e6dtsqhiyhtr","anonymous":false,"email":"test#gmail.com","name":"Joe Martinez","pseudonym":null,"avatar":{"type":"avatar","image_url":null},"app_id":"b5vkxvop","companies":{"type":"company.list","companies":[]},"location_data":{"type":"location_data","city_name":"Houston","continent_code":"NA","country_name":"United States","latitude":29.7633,"longitude":-95.3633,"postal_code":"77002","region_name":"Texas","timezone":"America/Chicago","country_code":"USA"},"last_request_at":1478235114,"last_seen_ip":"66.87.120.30","created_at":1478234979,"remote_created_at":1478234944,"signed_up_at":1478234944,"updated_at":1478235145,"session_count":1,"social_profiles":{"type":"social_profile.list","social_profiles":[]},"unsubscribed_from_emails":false,"user_agent_data":"Mozilla/5.0 (Linux; Android 6.0.1; SM-G920P Build/MMB29K) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.68 Mobile Safari/537.36","tags":{"type":"tag.list","tags":[]},"segments":{"type":"segment.list","segments":[{"type":"segment","id":"57d2ea275bfcebabd516d963"},{"type":"segment","id":"57d2ea265bfcebabd516d962"}]},"custom_attributes":{"claimCount":"1","memberType":"claimant"}}
{"type":"user","id":"581c22a19a1dc02c460541df","user_id":"1o3helrdv58cxm7jf","anonymous":false,"email":"test#mail.com","name":"Joe Coleman","pseudonym":null,"avatar":{"type":"avatar","image_url":null},"app_id":"b5vkxvop","companies":{"type":"company.list","companies":[]},"location_data":{"type":"location_data","city_name":"San Jose","continent_code":"NA","country_name":"United States","latitude":37.3394,"longitude":-121.895,"postal_code":"95141","region_name":"California","timezone":"America/Los_Angeles","country_code":"USA"},"last_request_at":1478239113,"last_seen_ip":"216.151.183.47","created_at":1478238881,"remote_created_at":1478238744,"signed_up_at":1478238744,"updated_at":1478239113,"session_count":1,"social_profiles":{"type":"social_profile.list","social_profiles":[]},"unsubscribed_from_emails":false,"user_agent_data":"Mozilla/5.0 (Windows NT 6.3; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0","tags":{"type":"tag.list","tags":[]},"segments":{"type":"segment.list","segments":[{"type":"segment","id":"57d2ea275bfcebabd516d963"},{"type":"segment","id":"57d2ea265bfcebabd516d962"}]},"custom_attributes":{"claimCount":"2","memberType":"claimant"}}
So what I need is a filter, applied in the same manner as .[], that strips out all the text before the first open bracket (as highlighted above) as well as the trailing text through the final closing bracket.
But here's where the final problem comes in. While I need that final piece of text out of the equation, I still need that string of letters and numbers known as the scroll parameter. In order to fully capture all the data from the API, I have to keep feeding the new scroll parameter each call generates back into the next command-line call until all the data is in.
The initial call looks as such:
$ curl -s https://api.program.io/users/scroll -u 'dG9rOmU5NGFjYTkwXzliNDFfNGIyMF9iYzA0XzU0NDg3MjE5ZWJkZDoxOjA=': -H 'Accept:application/json'
But in order to get all the info, I need that scroll parameter for a separate call that looks like:
curl -s https://api.intercom.io/users/scroll?scroll_param=foo -u 'dG9rOmU5NGFjYTkwXzliNDFfNGIyMF9iYzA0XzU0NDg3MjE5ZWJkZDoxOjA=': -H 'Accept:application/json' >scroll.json
So while I need to remove the text in the blob that contains the parameter in order to get newline-delimited format, I still need to extract whatever that parameter is and loop it back into another script that will keep running until the parameter comes back empty.
Would love to hear any advice in working around this!
Like others who have posted comments, I won't pretend to understand the details of the specific question, but if the general question is how to use jq to emit newline-delimited JSON (that is, ensure that each JSON text is followed by a newline and that no other raw newlines are added), the answer is simple: use jq with the -c option and without the -r option.
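For example:

$ echo '{"a": 1}' | jq .
{
  "a": 1
}
$ echo '{"a": 1}' | jq -c .
{"a":1}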
From a cursory examination of your data, the filter
.users[]
will give you just the user data to load and the filter
.scroll_param
will return just the scroll parameter. If you put your data in a file, you could invoke jq once for each filter, but if you have to stream the data, you can simply use the , operator to return one value after another, e.g.
.scroll_param
, .users[]
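Invoked against the response saved to a file (scroll.json, from the question's redirect), that would be:

jq -c '.scroll_param, .users[]' scroll.json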
With the -c option keeping each value on a single line, the output looks like:
"24ba0fac-b8f9-46b2-944a-9bb523dcd1b1"
{"type":"user","id":"581c13632f25960e6e3dc89a","user_id":"ieo2e6dtsqhiyhtr",...
{"type":"user","id":"581c22a19a1dc02c460541df","user_id":"1o3helrdv58cxm7jf",...
Presumably the script that reads the output from jq could capture the first line for use in the next curl invocation and put the rest of the data into the file you load.
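A rough sketch of such a loop, using the token from the question and the endpoint from its second call (the question shows both api.program.io and api.intercom.io, so adjust as needed; the users.ndjson filename and the stop-when-empty test are my assumptions about how the scroll ends):

token='dG9rOmU5NGFjYTkwXzliNDFfNGIyMF9iYzA0XzU0NDg3MjE5ZWJkZDoxOjA='
url='https://api.intercom.io/users/scroll'
scroll=''
> users.ndjson
while :; do
  if [ -z "$scroll" ]; then
    page=$(curl -s "$url" -u "${token}": -H 'Accept:application/json')
  else
    page=$(curl -s "${url}?scroll_param=${scroll}" -u "${token}": -H 'Accept:application/json')
  fi
  # stop once a page comes back with no users
  [ "$(jq '.users | length' <<<"$page")" -eq 0 ] && break
  jq -c '.users[]' <<<"$page" >>users.ndjson   # newline-delimited rows for BigQuery
  scroll=$(jq -r '.scroll_param' <<<"$page")   # feed into the next request
done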
Hope this helps.
I would like to read the JSON file from http://freifunk.in-kiel.de/alfred.json in bash and separate it into files named by the hostname of each element in that JSON document.
How do I read JSON with bash?
You can use jq for that. First extract the list of hostnames; then, looping over that list, run another jq query for each hostname to extract its element and save the data, through redirection, to a file named after that hostname.
The easiest way to do this is with two instances of jq -- one listing hostnames, and another (inside the loop) extracting individual entries.
This is, alas, a bit inefficient, since it means rereading the file from the top for each record extracted. (A single-pass alternative is sketched after the loop.)
while read -r hostname; do
  [[ $hostname = */* ]] && continue  # paranoia: skip hostnames that could point outside the current directory; see comments
  # second pass per hostname: pull out the matching record
  jq --arg hostname "$hostname" \
    '.[] | select(.hostname == $hostname)' <alfred.json >"out-${hostname}.json"
done < <(jq -r '.[] | .hostname' <alfred.json)  # first pass: list every hostname
(The out- prefix prevents alfred.json from being overwritten if it includes an entry for a host named alfred).
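If the rereads matter, a single-pass variant is possible: have one jq invocation emit each record as its hostname, a tab, and the record's compact JSON, and let the shell do the splitting. This is only a sketch; it relies on jq's tojson escaping any tabs and newlines inside the data, so the first tab on each line is a safe delimiter:

while IFS=$'\t' read -r hostname record; do
  [[ $hostname = */* ]] && continue  # same paranoia as above
  printf '%s\n' "$record" >"out-${hostname}.json"
done < <(jq -r '.[] | "\(.hostname)\t\(tojson)"' <alfred.json)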
You can use a Python one-liner in a similar way, something like this (I haven't checked):
curl -s http://freifunk.in-kiel.de/alfred.json | python -c '
import json, sys
tbl = json.load(sys.stdin)           # top-level object keyed by node id
for t in tbl:
    # open in text mode: json.dump writes str, so "wb" would fail on Python 3
    with open(tbl[t]["hostname"], "w") as fp:
        json.dump(tbl[t], fp)
'