jq-win64.exe: Parsing data from a JSON file in Windows Batch File

I have the following JSON file (song.json) that contains:
{
"Result": [
{
"ItemTitle": "Sometimes It Hurts",
"Artists": [
"Voost"
],
"MediaEnd": "00:02:15.8490000",
"Extro": "00:02:12.8200000",
"MediaId": 9551,
"ActualLength": "00:02:12.8200000",
"ItemType": "Song"
},
{
"ItemTitle": "Been a Long Time (Full Intention 2021 Remix)",
"Artists": [
"The Fog"
],
"MediaEnd": "00:03:11.3170000",
"IntroEnd": "00:00:07.4700000",
"Extro": "00:03:08.6300000",
"MediaId": 9489,
"ActualLength": "00:03:08.6300000",
"ItemType": "Song"
}
],
"ExceptionMessage": null,
"FailMessage": null,
"ExceptionTypeName": null
}
I want to extract the first “ItemTitle” and the first “Artist” and save them as variables.
In this example the result I am looking for would be:
ItemTitle=Sometimes It Hurts
Artist=Voost
I have been trying to use jq-win64.exe as this needs to run in a Windows Batch File, but I can’t get the syntax right. I have tried various examples that I have found on here but none of them appears to work as required. Can anyone suggest a solution?

First things first.
Since you seem to be having trouble on several fronts, I would suggest that you first get your jq query working the way you want
by using jq with the -f command-line option. That way, you can write your query without having to worry about
the Windows rules for escaping characters on the command line. Once the results are as you wish, you might even decide to
leave well enough alone.
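For example, with the query saved in a file (query.jq is a hypothetical name here), the command line reduces to:
jq-win64.exe -f query.jq song.json
Since the query lives in the file, none of cmd.exe's quoting rules apply to it.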
Next, to obtain the values you want, it would seem you will want a jq query like this:
.Result | first(.[].ItemTitle), first(.[].Artists[])
With your JSON, this produces:
Sometimes It Hurts
Voost
But you say you want these values in the KEY=VALUE form. This can be achieved by modifying the basic query, e.g. as follows:
.Result|"ItemTitle=\(first(.[].ItemTitle))", "Artist=\(first(.[].Artists[]))"
Somehow I doubt this is really what you want, but the rest should be clear sailing.
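To round things out, here is a minimal batch sketch, assuming the KEY=VALUE query above is saved as query.jq next to song.json; the -r option makes jq emit raw lines rather than JSON strings, and FOR /F splits each line at the first "=":
@echo off
rem Hypothetical sketch: turn each KEY=VALUE line from jq into a variable.
for /f "usebackq tokens=1,* delims==" %%A in (`jq-win64.exe -r -f query.jq song.json`) do set "%%A=%%B"
echo %ItemTitle%
echo %Artist%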

Related

Using jq to edit key:value in a .conf file from shell

I'm trying to write a shell script that passes an env variable into a .conf file so that I can manipulate the log_file and log_level keys programmatically.
The actual file, station.conf:
{
"SX1301_conf": {
"lorawan_public": true,
"clksrc": 1,
"radio_0": {
"type": "SX1257",
"rssi_offset": -166.0,
"tx_enable": true,
"antenna_gain": 0
},
"radio_1": {
"type": "SX1257",
"rssi_offset": -166.0,
"tx_enable": false
}
},
"station_conf": {
"log_file": "stderr",
"log_level": "DEBUG",
/* XDEBUG,DEBUG,VERBOSE,INFO,NOTICE,WARNING,ERROR,CRITICAL */
"log_size": 10000000,
"log_rotate": 3,
"CUPS_RESYNC_INTV": "1s"
}
}
I wanted to test manually before passing shell variables, so I tried jq '".station_conf.log_level="ERROR"' station.conf, but I keep getting errors, including shell quoting errors and invalid numeric literal errors (which, btw, seems to be an open bug: https://github.com/stedolan/jq/issues/501)
Any tips on how to do this? Ideally I'd be able to replace log_level value with a $LOG_LEVEL from my env. Thanks!
Assuming the input is valid JSON, for robustness, you could start with:
jq '.station_conf.log_level="ERROR"' station.conf
To pass in a shell variable, consider:
jq --arg v "$LOG_LEVEL" '
.station_conf.log_level=$v' station.conf
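Note that jq cannot edit a file in place; a common pattern (a sketch, assuming a POSIX shell) is to write to a temporary file and move it over the original:
jq --arg v "$LOG_LEVEL" '.station_conf.log_level = $v' station.conf > station.conf.tmp && mv station.conf.tmp station.conf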
You are getting the invalid numeric literal error because your example input, at least, is not valid JSON. As you can see, it contains /* comment */, which jq does not support. You have several options here:
keep using jq and make your input files valid JSON;
use another tool instead of jq that supports comments and/or other non-standard features.
If you choose the second way, i.e. a different tool, you can find some alternatives on the jq wiki (https://github.com/stedolan/jq/wiki/FAQ#processing-not-quite-valid-json); there is also scout (https://github.com/ABridoux/scout).
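If you stay with jq, a minimal sketch for stripping the comments first (assuming they never span lines and never occur inside string values) is:
sed 's|/\*.*\*/||' station.conf | jq --arg v "$LOG_LEVEL" '.station_conf.log_level = $v'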

Retrieving the first entity out of several ones

I am a rank beginner with jq, and I've been going through the tutorial, but I think there is a conceptual difference I don't understand. A common problem I encounter is that a large JSON file will contain many objects, each of which is quite big, and I'd like to view the first complete object, to see which fields exist, what types, how much nesting, etc.
In the tutorial, they do this:
# We can use jq to extract just the first commit.
$ curl 'https://api.github.com/repos/stedolan/jq/commits?per_page=5' | jq '.[0]'
Here is an example with one object; here, I'd like to return the whole object (just like my_array=['foo']; my_array[0] would return 'foo' in Python).
wget https://hacker-news.firebaseio.com/v0/item/8863.json
I can access and pretty-print the whole thing with .
$ cat 8863.json | jq '.'
{
"by": "dhouston",
"descendants": 71,
"id": 8863,
"kids": [
9224,
...
8876
],
"score": 104,
"time": 1175714200,
"title": "My YC app: Dropbox - Throw away your USB drive",
"type": "story",
"url": "http://www.getdropbox.com/u/2/screencast.html"
}
But trying to get the first element fails:
$ cat 8863.json | jq '.[0]'
jq: error (at <stdin>:0): Cannot index object with number
I get the same error with jq '.[0]' 8863.json, but strangely echo 8863.json | jq '.[0]' gives me parse error: Invalid numeric literal at line 2, column 0. What is the difference? Also, is this not the correct way to get the zeroth member of the JSON?
I've looked at other SO posts with this error message and at the manual, but I'm still confused. I think of the file as an array of JSON objects, and I'd like to get the first. But it looks like jq works with something called a "stream", and does operations on all of it (say, return one given field from every object).
Clarification:
Let's say I have 2 objects in my JSON:
{
"by": "pg",
"id": 160705,
"poll": 160704,
"score": 335,
"text": "Yes, ban them; I'm tired of seeing Valleywag stories on News.YC.",
"time": 1207886576,
"type": "pollopt"
}
{
"by": "dpapathanasiou",
"id": 16070,
"kids": [
16078
],
"parent": 16069,
"text": "Dividends don't mean that much: Microsoft in its dominant years (when they had 40%-plus margins and were raking in the cash) never paid a dividend (they did so only recently).",
"time": 1177355133,
"type": "comment"
}
How would I get the entire first object (lines 1-9) with jq?
Cannot index object with number
This error message says it all, you can't index objects with numbers. If you want to get the value of by field, you need to do
jq '.by' file
Wrt
echo 8863.json | jq '.[0]' gives me parse error: Invalid numeric literal at line 2, column 0.
That's expected: echo passes the literal text 8863.json (the file name, not the file's contents) to jq, and that text is not valid JSON; 8863 parses as a number, and the trailing .json triggers the error. With the -R/--raw-input flag jq would instead read the input as a JSON string, but one cannot apply array indexing to JSON strings either. (To get the first character as a string, you'd write .[0:1].)
If your input file consists of several separate entities, to get the first one:
jq -n 'input' file
or,
jq -n 'first(inputs)' file
To get the nth entity (0-based; e.g. n=5):
jq -n 'nth(5; inputs)' file
a large JSON file will contain many objects, each of which is quite big, and I'd like to view the first complete object, to see which fields exist, what types, how much nesting, etc.
As implied in @OguzIsmail's response, there are important differences between:
- a JSON file (i.e, a file containing exactly one JSON entity);
- a file containing a sequence (i.e., stream) of JSON entities;
- a file containing an array of JSON entities.
In the first two cases, you can write jq -n input to select the first entity, and in the case of an array of entities, jq .[0] will suffice.
(In JSON-speak, a "JSON object" is a kind of dictionary, and is not to be confused with JSON entities in general.)
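As a quick illustration of the two commands just mentioned (hypothetical file names), assuming stream.json holds a stream of JSON entities and array.json holds the same entities wrapped in an array:
jq -n 'input' stream.json    # first entity of a stream (or of a one-entity file)
jq '.[0]' array.json         # first element of a top-level array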
If you have a bunch of JSON objects (whether as a stream or array or whatever), just looking at the first often doesn't give an accurate picture of them all. For getting a bird's-eye view of a bunch of objects, using a "schema inference engine" is often the way to go. For this purpose, you might like to consider my schema.jq schema inference engine. It's usually very simple to use, though how you use it will of course depend on whether you have a stream or an array of JSON entities. For basic details, see https://gist.github.com/pkoppstein/a5abb4ebef3b0f72a6ed; for related topics (e.g. verification), see the entry for JESS at https://github.com/stedolan/jq/wiki/Modules
Please note that schema.jq infers a structural schema that mirrors the entities under consideration. Such structural schemas have little in common with JSON Schema schemas, which you might also like to consider.

Using jq to combine multiple JSON files

First off, I am not an expert with JSON files or with JQ. But here's my problem:
I am simply trying to download card data (for the MtG card game) through an API, so I can use it in my own spreadsheets etc.
The card data from the API comes in pages, since there is so much of it, and I am trying to find a nice command line method in Windows to combine the files into one. That will make it nice and easy for me to use the information as external data in my workbooks.
The data from the API looks like this:
{
"object": "list",
"total_cards": 290,
"has_more": true,
"next_page": "https://api.scryfall.com/cards/search?format=json&include_extras=false&order=set&page=2&q=e%3Alea&unique=cards",
"data": [
{
"object": "card",
"id": "d5c83259-9b90-47c2-b48e-c7d78519e792",
"oracle_id": "c7a6a165-b709-46e0-ae42-6f69a17c0621",
"multiverse_ids": [
232
],
"name": "Animate Wall",
......
},
{
"object": "card",
......
}
]
}
Basically I need to take what's inside the "data" part from each file after the first, and merge it into the first file.
I have tried a few examples I found online using jq, but I can't get it to work. I think it might be because in this case the data is sort of under an extra level, since there is some basic information, then the "data" category is beneath it. I don't know.
Anyway, any help on how to get this going would be appreciated. I don't know much about this, but I can learn quickly so even any pointers would be great.
Thanks!
To merge the .data elements of all the responses into the first response, you could run:
jq 'reduce inputs.data as $s (.; .data += $s)' page1.json page2.json ...
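(Note that this is run without the -n option, so jq reads the first file into . and inputs then yields only the remaining files; that is why the first response serves as the base.)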
Alternatives
You could use the following filter in conjunction with the -n command-line option:
reduce inputs as $s (input; .data += ($s.data))
Or if you simply want an object of the form {"data": [ ... ]} then (again assuming you invoke jq with the -n command-line option) the following jq filter would suffice:
{data: [inputs.data] | add}
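For instance, a sample invocation of that last filter (hypothetical file names) would be:
jq -n '{data: [inputs.data] | add}' page1.json page2.json page3.json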
Just to provide closure, @peak provided the solution. I am using it in conjunction with the method found here for using wildcards in batch files to address multiple files. The code looks like this now:
set expanded_list=
for /f "tokens=*" %%F in ('dir /b /a:-d "All Cards\!setname!_*.json"') do call set expanded_list=!expanded_list! "All Cards\%%F"
jq-win32 "reduce inputs.data as $s (.; .data += $s)" !expanded_list! > "All Cards\!setname!.json"
All the individual pages for each card set are named "setname"_"pagenumber".json
The code finds all the pages for each set and combines them into one variable which I can pass into jq.
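One caveat worth noting: the !expanded_list! and !setname! syntax only works if delayed expansion is enabled earlier in the batch file (i.e., setlocal enabledelayedexpansion), which the snippet above assumes.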
Thanks again!

Can I use one packer builder with many provisioners and still run parallel builds?

I have what seems to be like a valid use case for an unsupported - afaik - scenario, using packer.io and I'm worried I might be missing something...
So, in packer, I can add:
many builders,
have a different name per builder,
use the builder name in the only section of the provisioners and finally
run packer build -only=<builder_name> to effectively limit my build to only the provisioners combined with the specific builder.
This is all fine.
What I am now trying to do, is use the same base image to create 3 different builds (and resulting AMIs). Obviously, I could just copy-paste the same builder config 3 times and then use 3 different provisioners, linking each to the respective builder, using the only parameter.
This feels totally wasteful and very error prone though... It sounds like I should be able to use the same builder and just limit which provisioners are applied .. ?
Is my only solution to use 3 copy-pasted builders? Is there any better solution?
I had the same issue, where I wanted to build 2 different AMIs (one for staging, one for production) and the only difference between them is the ansible group to apply during provisioning. Building off the answer by @Rickard ov Essen, I wrote a bash script using jq to duplicate the builder section of the config.
Here's my packer.json file:
{
"builders": [
{
"type": "amazon-ebs",
"name": "staging",
"region": "ap-southeast-2",
"source_ami_filter": {
"filters": {
"virtualization-type": "hvm",
"name": "ubuntu/images/hvm-ssd/ubuntu-xenial-16.04-amd64-server-*",
"root-device-type": "ebs"
},
"owners": ["099720109477"],
"most_recent": true
},
"instance_type": "t2.nano",
"ssh_username": "ubuntu",
"force_deregister": true,
"force_delete_snapshot": true,
"ami_name": "my-ami-{{ build_name }}"
}
],
"provisioners": [
{
"type": "ansible",
"playbook_file": "provisioning/site.yml",
"groups": ["{{ build_name }}"]
}
]
}
The ansible provisioner uses the variable build_name to choose which ansible group to run.
Then I have a bash script build.sh which runs the packer build:
#!/bin/bash
jq '.builders += [.builders[0] | .name = "production"]' packer.json > packer_temp.json
packer build packer_temp.json
rm packer_temp.json
You can see what the packer_temp.json file looks like on this jqplay.
If you need to add more AMIs you can just keep adding more jq filters:
jq '.builders += [.builders[0] | .name = "production"] | .builders += [.builders[0] | .name = "test"]'
This will add another AMI for test.
The only parameter filters on builder names, so that is not an option.
You could solve this with any of these approaches:
Preprocess a json and create 3 templates from one.
Use a template with a user variable defining which build it is and build 3 times. Use conditions on the variable in your scripts to run the correct ones (see the sketch after this list).
Build a base AMI with the common parts of the template and then run 3 different builds on that provisioning the differences.
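As a minimal sketch of the second approach (flavor is a hypothetical variable name, assumed to be declared in the template's variables section; packer's -var flag and the user template function are standard), the same template can be built three times:
packer build -var 'flavor=staging' template.json
packer build -var 'flavor=production' template.json
packer build -var 'flavor=test' template.json
with the provisioner scripts branching on the value of {{user `flavor`}}.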
In general, Packer tries to solve one thing well; by not including an advanced DSL for describing different build flavours, the scope decreases. It's easy to preprocess and create JSON for more advanced use cases.

Format for writing a JSON log file?

Are there any format standards for writing and parsing JSON log files?
The problem I see is that you can't have a "pure" JSON log file since you need matching brackets and trailing commas are forbidden. So while the following may be written by an application, it can't be parsed by standard JSON parsers:
[{"date": "2012-01-01 02:00:01", "severity": "ERROR", "msg": "Foo failed"},
{"date": "2012-01-01 02:04:02", "severity": "INFO", "msg": "Bar was successful"},
{"date": "2012-01-01 02:10:12", "severity": "DEBUG", "msg": "Baz was notified"},
So you must have some conventions on how to structure your log files in a way that a parser can process them. The easiest thing would be "one log message object per line, newlines in string values are escaped". Are there any existing standards and tools?
You're not going to write a single JSON object per FILE, you're going to write a JSON object per LINE. Each line can then be parsed individually. You don't need to worry about trailing commas and have the whole set of objects enclosed by brackets, etc. See http://blog.nodejs.org/2012/03/28/service-logging-in-json-with-bunyan/ for a more detailed explanation of what this can look like.
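For illustration, the messages from the question in one-object-per-line form, where each line is a complete, independently parseable JSON document:
{"date": "2012-01-01 02:00:01", "severity": "ERROR", "msg": "Foo failed"}
{"date": "2012-01-01 02:04:02", "severity": "INFO", "msg": "Bar was successful"}
{"date": "2012-01-01 02:10:12", "severity": "DEBUG", "msg": "Baz was notified"}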
Also check out Fluentd http://fluentd.org/ for a neat toolset to work with.
Edit: this format is now called JSON Lines or jsonl, as pointed out by @Mnebuerquo below; see http://jsonlines.org/
The log_formatter gem is the Ruby choice: as a formatter group it now supports JSON formatters for both plain Ruby logging and Log4r.
It's simple to get started with in Ruby:
gem 'log_formatter'
require 'log_formatter'
require 'log_formatter/ruby_json_formatter'
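# `logger` below is assumed to be an existing Logger instance (its setup is not shown in this snippet)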
logger.debug({data: "test data", author: 'chad'})
result
{
"source": "examples",
"data": "test data",
"author": "chad",
"log_level": "DEBUG",
"log_type": null,
"log_app": "app",
"log_timestamp": "2016-08-25T15:34:25+08:00"
}
for log4r:
require 'log4r'
require 'log_formatter'
require 'log_formatter/log4r_json_formatter'
logger = Log4r::Logger.new('Log4RTest')
outputter = Log4r::StdoutOutputter.new(
"console",
:formatter => Log4r::JSONFormatter::Base.new
)
logger.add(outputter)
logger.debug( {data: "test data", author: 'chad'} )
Advanced usage: README
Full Example Code: examples