read raw input lines and output single array - json

I have a directory with files in it. I would like to create an array from that list of files. I thought it would be pretty easy, like:
ls mydir | jq -R '[.]'
[
"file1"
]
[
"file2"
]
[
"file3"
]
The only thing I could figure out is this:
ls mydir | jq -sR '[split("\n")[]|select(.|length>0)]'
[
"file1",
"file2",
"file3"
]
Is there a better way?

Be careful when dealing with Unix filenames in general. A filename can contain almost any character, including whitespace, newlines, commas, pipe symbols, and pretty much anything else you might try to use as a delimiter, except NUL. Your best bet is to separate the names with the NUL character, which can never appear in a filename or path, and split on it with jq.
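Use the shell's built-in printf to terminate each entry with \0, then split the slurped input on that character. The -s option matters here: without it, jq would still break the raw input at newlines. The final \0 leaves an empty string at the end of the array, so drop it:
printf '%s\0' * | jq -Rs 'split("\u0000")[:-1]'
or, for regular files only:
for file in *; do
  [ -f "$file" ] && printf '%s\0' "$file"
done | jq -Rs 'split("\u0000")[:-1]'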

Using find opens up other possibilities:
find . -maxdepth 1 -type f -print0 |
  jq -Rs 'split("\u0000")[:-1] | map(sub("^\\./"; ""))'
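If jq 1.6 or later is available, another option is to skip delimiters entirely and pass the names as positional arguments with --args, which jq exposes as $ARGS.positional. This is a different technique from the NUL-splitting shown above, sketched here under that version assumption; the temporary directory and the file names are made up for the demo.

```shell
#!/usr/bin/env bash
# Demo directory with awkward names (hypothetical, for illustration only).
dir=$(mktemp -d)
touch "$dir/a b.txt" "$dir/c"$'\n'"d.txt"

# Everything after --args is passed to jq as a string, so no delimiter is needed.
# Globbing "$dir"/* keeps names from starting with "-"; the prefix is stripped in jq.
result=$(jq -cn --arg prefix "$dir/" \
  '$ARGS.positional | map(ltrimstr($prefix))' --args "$dir"/*)

echo "$result"
rm -rf "$dir"
```

The embedded newline comes out as an escaped \n inside a single JSON string, which is exactly what a newline-based split cannot guarantee.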

Related

Parse multiple json files and output the match/hits against the regex with associated file names

Currently I pipe cat into jq to parse multiple JSON files in my working directory and search them for values matching an email-address regex. However, I would also like to identify the file name in which each match occurs.
cat *.json | jq '. as $data | [path(..| select(scalars and (tostring | test("^[a-zA-Z0-9+_.-]+#[a-zA-Z0-9.-]+$", "ixn")))) ] | map({ (.|join(".")): (. as $path | .=$data | getpath($path)) }) | reduce .[] as $item ({}; . * $item)'
Could you help me tweak the command so that it also prints the file name? Thanks!
input_filename evaluates to the input file name of the file currently being read (after it has been opened). For STDIN, it evaluates to "<stdin>":
jq 'input_filename, input_filename' <<< 1
"<stdin>"
"<stdin>"
It works with the -n command-line option, but only after an input or inputs function has been called:
jq -n 'input_filename, (input | input_filename)' <<< 1
null
"<stdin>"
For a jq-internal solution, use input_filename as @peak suggested. Here's an external solution that iterates over your input files and passes each file name into jq as a variable. This approach, however, calls jq once per input file (as opposed to your cat *.json | jq ... approach, which makes just one call), so you might run into performance issues with a large number of input files.
for f in *.json
do jq --arg f "$f" '. as $data | ... (use $f here) ...' "$f"
done
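For the jq-internal route, a minimal sketch might look like the following. The two sample files, their contents, and the example\.com pattern are made up for the demo; the point is only that input_filename can be bound to a variable and printed next to each match when jq is given the files directly instead of through cat.

```shell
#!/usr/bin/env bash
# Hypothetical sample files, created only for this demo.
dir=$(mktemp -d)
printf '%s\n' '{"id": 1, "owner": {"email": "ann@example.com"}}' > "$dir/a.json"
printf '%s\n' '{"contacts": ["bob@example.com", "n/a"]}' > "$dir/b.json"

# For every string leaf matching the pattern, print file name, path and value.
result=$(cd "$dir" && jq -r '
  input_filename as $file
  | paths(scalars) as $p
  | getpath($p)
  | select(type == "string" and test("example\\.com$"))
  | "\($file): \($p | map(tostring) | join(".")) = \(.)"
' a.json b.json)

echo "$result"
rm -rf "$dir"
```

Because jq opens the files itself here, a single jq process handles all of them and still knows which file each value came from.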

Create a json from given list of filenames in unix script

Hello, I am trying to write a Unix script/command that lists all file names in a given directory that follow the format string-{number}.txt (e.g. filename-1.txt, filename-2.txt) and builds a JSON array from them, like the one below. Any pointers would be helpful.
[{
"filenumber": "1",
"name": "filename-1.txt"
},
{
"filenumber": "2",
"name": "filename-2.txt"
}
]
In the above JSON, filenumber should be taken from the {number} part of each file name.
A single call to jq should suffice:
shopt -s extglob
printf "%s\0" *-+([0-9]).txt | \
jq -sR 'split("\u0000") |
map({filenumber:capture(".*-(?<n>.*)\\.txt").n,
name:.})'
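As a rough variant of the same idea, assuming jq 1.6 or later, the names can also be handed to jq as arguments instead of being split on NUL. The directory and file names below are invented for the demo, and the glob is run from inside the directory so that no name starts with a dash.

```shell
#!/usr/bin/env bash
# Hypothetical demo files; notes-draft.txt has no number and should be skipped.
dir=$(mktemp -d)
touch "$dir/filename-1.txt" "$dir/filename-2.txt" "$dir/notes-draft.txt"

# capture emits nothing when the regex does not match, so no object is
# built for names that lack digits before ".txt".
result=$(cd "$dir" && jq -cn '
  [ $ARGS.positional[]
    | {filenumber: capture("-(?<n>[0-9]+)\\.txt$").n, name: .} ]
' --args *-*.txt)

echo "$result"
rm -rf "$dir"
```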
This is very easy with the command-line tool xidel and its integrated EXPath File Module:
$ xidel -se '
array{
for $x in file:list(.,false(),"*.txt")
return {
"filenumber":extract($x,"(\d+)\.txt",1),
"name":$x
}
}
'
Intuitively, I'd say you can do this with jq. However, in practice I've rarely been able to achieve what I wanted with jq :-)
With some lunch break puzzling, I've come up with this beauty:
ls | jq -R '{filenumber:input_line_number, name:.}' | jq -s .
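Instead of ls you could use any other command that produces a newline-separated list of strings. Note that filenumber here is the input line number (a JSON number), not the number taken from the file name, so it only matches the requested output when the listing happens to line up with the numbering.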
After trying several variants, I finally found one that does exactly what I wanted for my use case (the result is wrapped in a urls key). Thanks!
for file in *.txt; do
  number=$(printf '%s\n' "$file" | sed 's/^.*-\(.*\)\.txt$/\1/')
  jq -n --arg number "$number" --arg name "$file" '{filenumber: $number, name: $name}'
done | jq -n '.urls |= [inputs]'

jq raw json output carriage return?

Feel free to edit the title; I'm not sure how to word it. I'm trying to turn shell output into JSON data for a reporting system I'm writing for work. No matter what I do, when I take raw input in slurp mode and output the JSON, the last item in the array is an empty string (""). I feel like this is some sort of rookie jq issue, but I can't figure out how to describe it. It seems to happen no matter which command I run in the shell and pipe to jq:
# rpm -qa | grep kernel | jq -R -s 'split("\n")'
[
"kernel-2.6.32-504.8.1.el6.x86_64",
"kernel-firmware-2.6.32-696.20.1.el6.noarch",
"kernel-headers-2.6.32-696.20.1.el6.x86_64",
"dracut-kernel-004-409.el6_8.2.noarch",
"abrt-addon-kerneloops-2.0.8-43.el6.x86_64",
"kernel-devel-2.6.32-358.11.1.el6.x86_64",
"kernel-2.6.32-131.4.1.el6.x86_64",
"kernel-devel-2.6.32-696.20.1.el6.x86_64",
"kernel-2.6.32-696.20.1.el6.x86_64",
"kernel-devel-2.6.32-504.8.1.el6.x86_64",
"libreport-plugin-kerneloops-2.0.9-33.el6.x86_64",
""
]
Any help is appreciated.
Every line ends with a newline. Either remove the final newline, or omit the empty element at the end of the array.
vnix$ printf 'foo\nbar\n' |
> jq -R -s '.[:-1] | split("\n")'
[
"foo",
"bar"
]
vnix$ printf 'foo\nbar\n' |
> jq -R -s 'split("\n")[:-1]'
[
"foo",
"bar"
]
The notation x[:-1] retrieves the value of a string or array x with the last element removed. This is called "slice notation".
Just to spell this out, if you take the string "foo\n" and split on newline, you get "foo" from before the newline and "" after it.
To make this really robust, maybe trim the last character only if it really is a newline.
vnix$ printf 'foo\nbar\n' |
> jq -R -s 'sub("\n$";"") | split("\n")'
[
"foo",
"bar"
]
vnix$ printf 'foo\nbar' |
> # notice, no final ^ newline
> jq -R -s 'sub("\n$";"") | split("\n")'
[
"foo",
"bar"
]
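To see why the conditional trim is the safer of the two approaches, here is a small comparison on input that does not end in a newline. It is only an illustration; the variable names are arbitrary.

```shell
#!/usr/bin/env bash
# Input WITHOUT a trailing newline.
# Unconditional slice: always drops the last element, even when it holds data.
sliced=$(printf 'foo\nbar' | jq -cRs 'split("\n")[:-1]')
# Conditional trim: removes a final newline only if one is present.
trimmed=$(printf 'foo\nbar' | jq -cRs 'sub("\n$"; "") | split("\n")')

echo "$sliced"    # ["foo"]
echo "$trimmed"   # ["foo","bar"]
```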
Assuming you have access to jq 1.5 or later, you can circumvent the problem entirely and economically using inputs:
jq -nR '[inputs]'
Just be sure to include the -n option, otherwise the first line will go missing.
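A quick way to convince yourself of the -n requirement is to run the same filter with and without it (illustrative input only):

```shell
#!/usr/bin/env bash
# With -n, jq does not consume the first line itself, so inputs sees every line.
with_n=$(printf 'a\nb\n' | jq -cnR '[inputs]')
# Without -n, the first line becomes "." and inputs only sees the rest.
without_n=$(printf 'a\nb\n' | jq -cR '[inputs]')
# A missing final newline makes no difference to this approach.
no_newline=$(printf 'a\nb' | jq -cnR '[inputs]')

echo "$with_n"      # ["a","b"]
echo "$without_n"   # ["b"]
echo "$no_newline"  # ["a","b"]
```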
You can also use
rpm -qa | grep kernel | jq -R . | jq -s .
to get the desired result.
Please see https://github.com/stedolan/jq/issues/563

Reading and Looping Through A JSON File in BASH

I've got a JSON file (see below) called department_groups.json.
Essentially if I gave an argument of commercial I'd like it to return:
commercial-team#domain.com
commercial-updates#domain.com
Can anyone guide/help me with doing this?
{
"legal": {
"google_groups":[
["Legal", "legal#domain.com"],
["Legal Team", "legal-team#domain.com"],
["Compliance Checks", "compliance#domain.com"]
],
"samba_groups": ""
},
"commercial":{
"google_groups":[
["Commercial Team", "commercial-team#domain.com"],
["Commercial Updates", "commercial-updates#domain.com"]
],
"samba_groups": ""
},
"technology":{
"google_groups":[
["Technology", "technology#domain.com"],
["Incidents", "incidents#domain.com"]
],
"samba_groups": ""
}
}
This returns the second element in each array in the google_groups property of the commercial property:
jq --arg key commercial '.[$key].google_groups | .[] | .[1]' file
Use jq -r to output in "raw" format (lose the double quotes).
$ key=commercial
$ jq -r --arg key "$key" '.[$key].google_groups | .[] | .[1]' file
commercial-team#domain.com
commercial-updates#domain.com
I used --arg in these examples to show how it is used, optionally with a shell variable. If, on the other hand, commercial was just a fixed string, then you could simplify:
jq -r '.commercial.google_groups | .[] | .[1]' file
To process each line of the output, you can just use a shell while read loop:
key=commercial
while read -r email; do
echo "$email"
# process each email individually here
done < <(jq -r --arg key "$key" '.[$key].google_groups | .[] | .[1]' file)
Here I am using a process substitution <(), which acts like a file that can be processed by the shell. One advantage of doing this, over using a pipe, is that no subshell is created. Among other things, this means that the variables used within the loop remain in scope after the while block, so you can use them later.
If you prefer to use a pipe, just remove the part after done and move the command up to the first line:
jq ... | while read -r email; do # etc.
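The scoping difference between the two forms can be seen with a counter. This sketch assumes bash with default settings (in particular, the lastpipe option is off); printf stands in for the jq command so the example runs on its own.

```shell
#!/usr/bin/env bash
# Process substitution: the loop runs in the current shell.
count_ps=0
while read -r line; do
  count_ps=$((count_ps + 1))
done < <(printf '%s\n' one two three)

# Pipe: the loop runs in a subshell, so its changes are lost afterwards.
count_pipe=0
printf '%s\n' one two three | while read -r line; do
  count_pipe=$((count_pipe + 1))
done

echo "process substitution: $count_ps"   # 3
echo "pipe: $count_pipe"                 # 0
```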
As @TomFenech noted, the requirements are somewhat unclear, but if it's the email addresses you want, the following variant of his answer may be of interest:
$ key=commercial
$ jq -r --arg key "$key" '.[$key].google_groups[][] | select(test("#"))' department_groups.json
commercial-team#domain.com
commercial-updates#domain.com
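If the addresses are needed again later in the script, one possibility is to collect them into a bash array first. This is only a sketch and assumes bash 4 or later for mapfile; the inline JSON is a cut-down copy of the commercial entry from the question so that the example is self-contained.

```shell
#!/usr/bin/env bash
# Cut-down copy of the sample data (only the "commercial" entry).
json='{"commercial":{"google_groups":[["Commercial Team","commercial-team#domain.com"],["Commercial Updates","commercial-updates#domain.com"]],"samba_groups":""}}'

key=commercial
# mapfile -t reads one array element per output line, without the newline.
mapfile -t emails < <(jq -r --arg key "$key" '.[$key].google_groups[][1]' <<< "$json")

echo "found ${#emails[@]} addresses"
for email in "${emails[@]}"; do
  echo "-> $email"
done
```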

Filter only specific keys from an external file in jq

I have a JSON file with the following format:
[
{
"id": "00001",
"attr": {
"a": "foo",
"b": "bar",
...
}
},
{
"id": "00002",
"attr": {
...
},
...
},
...
]
and a text file with a list of ids, one per line. I'd like to use jq to filter only the records whose ids are mentioned in the text file. I.e. if the list contains "00001", only the first one should be printed.
Note, that I can't simply grep since each record may have an arbitrary number of attributes and sub-attributes.
There are basically two ways to proceed:
(1) read the file of ids from STDIN
(2) read the JSON from STDIN
Both are feasible, but here we illustrate (2) as it leads to a simple but efficient solution.
Suppose the JSON file is named in.json and the list of ids is in a file named ids.txt like so:
00001
00010
Notice that this file has no quotation marks. If it does, then the following can be significantly simplified as shown in the postscript.
The trick is to convert ids.txt into a JSON array. With the above assumption about quotation marks, this can be done by:
jq -R . ids.txt | jq -s .
Assuming a reasonable shell, a simple solution is now at hand:
jq --argjson ids "$(jq -R . ids.txt | jq -s .)" '
map( select( .id as $id | $ids | index($id) ))' in.json
Faster
Assuming your jq has any/2, a simpler and more efficient solution can be obtained by defining:
def isin($a): . as $in | any($a[]; $in == .);
The required jq filter is then just:
map( select( .id | isin($ids) ) )
If these two lines of jq are put into a file named select.jq, the required incantation is simply:
jq --argjson ids "$(jq -R . ids.txt | jq -s .)" -f select.jq in.json
Postscript
If the index file consists of a stream of valid JSON texts (e.g., strings with quotation marks) and if your jq supports the --slurpfile option, the invocation can be further simplified to:
jq --slurpfile ids ids.txt -f select.jq in.json
Or if you want everything as a one-liner:
jq --slurpfile ids ids.txt 'map(select(.id as $id|any($ids[];$id==.)))' in.json
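For completeness, the other option mentioned earlier (reading the ids from STDIN) might be sketched roughly as follows. It assumes a jq with --slurpfile; the sample files are created only for the demo, and the final map(.id) is there just to keep the printed result short.

```shell
#!/usr/bin/env bash
# Hypothetical sample data matching the shapes described in the question.
dir=$(mktemp -d)
printf '%s\n' 00001 00010 > "$dir/ids.txt"
printf '%s\n' '[{"id":"00001","attr":{"a":"foo"}},{"id":"00002","attr":{}},{"id":"00010","attr":{"b":"bar"}}]' > "$dir/in.json"

# -R turns each line of STDIN into a string; -n plus [inputs] collects them.
# --slurpfile still parses in.json as JSON and wraps it in an array, hence $data[0].
result=$(jq -cnR --slurpfile data "$dir/in.json" '
  [inputs] as $ids
  | $data[0]
  | map(select(.id as $id | $ids | index($id)))
  | map(.id)
' < "$dir/ids.txt")

echo "$result"
rm -rf "$dir"
```

With this arrangement the unquoted ids need no separate conversion step, at the cost of loading the whole JSON file through --slurpfile.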