I have a json file with this data:
{
"data": [
{
"name": "table",
"values": [
"This is old data",
"that needs to be",
"replaced."
]
}
]
}
But my challenge here is that I need to replace that values array with the words in a text or CSV file:
this
this
this
is
is
an
an
array
My output needs to have (although I could probably get away with the words all on one line...):
"values": [
"this this this",
"is is",
"an an",
"array"
],
Is this possible with only jq? Or would I have to get awk to help out?
I already started down the awk road with:
awk -F, 'BEGIN{ORS=" "; {print "["}} {print $2} END{{print "]"}}' filename
But I know there is still some work here...
And then I came across jq -Rn inputs. But I haven't figured out how or if I can get the desired result.
Thanks for any pointers.
Assuming the words are in a plain-text file named file and the JSON is in a file named json, you could do
jq --rawfile txt file '.data[].values |= ( $txt | split("\n")[:-1] | group_by(.) | map(join(" ")) )' json
which produces the following (group_by sorts its groups, so the order differs from the input order):
{
"data": [
{
"name": "table",
"values": [
"an an",
"array",
"is is",
"this this this"
]
}
]
}
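If the input order has to be preserved, as in the desired output of the question, one possible variation is to collapse only runs of adjacent identical words with reduce instead of group_by. This is a minimal sketch, assuming jq 1.6 or later (for --rawfile); words.txt and data.json are placeholder names created in a temporary directory just for the demo.

```shell
#!/bin/bash
# Sketch: collapse runs of identical adjacent words while keeping input order.
# words.txt and data.json are placeholder names created just for this demo.
tmp=$(mktemp -d)
printf '%s\n' this this this is is an an array > "$tmp/words.txt"
printf '%s\n' '{"data":[{"name":"table","values":["old"]}]}' > "$tmp/data.json"

result=$(jq -c --rawfile txt "$tmp/words.txt" '
  .data[].values |= (
    [$txt | split("\n")[] | select(length > 0)]    # drop empty lines
    | reduce .[] as $w ([];
        # extend the last group if it starts with the same word,
        # otherwise open a new group
        if ((.[-1] // "") | split(" ")[0]) == $w
        then .[-1] += " " + $w
        else . + [$w]
        end)
  )' "$tmp/data.json")

echo "$result"
rm -rf "$tmp"
```

The sketch assumes each line holds a single word without spaces; lines containing spaces would need a different way to remember the current group's word.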
You can use jq and awk.
Given:
$ cat file
{
"data": [
{
"name": "table",
"values": [
"This is old data",
"that needs to be",
"replaced."
]
}
]
}
$ cat replacement
this
this
this
is
is
an
an
array
First create a string for the replacement array (awk is easy to use here):
ins=$(awk '!s {s=last=$1; next}
$1==last{s=s " " $1; next}
{print s; s=last=$1}
END{print s}' replacement | tr '\n' '\t')
Then use jq to insert into the JSON:
jq --rawfile txt <(echo "$ins") '.data[].values |= ( $txt | split("\t")[:-1] )' file
{
"data": [
{
"name": "table",
"values": [
"this this this",
"is is",
"an an",
"array"
]
}
]
}
You can also use ruby to process both files:
ruby -r json -e '
BEGIN{ ar=File.readlines(ARGV[0])
.map{|l| l.rstrip}
.group_by{|e| e}
.values
.map{|v| v.join(" ")}
j=JSON.parse(File.read(ARGV[1]))
}
j["data"][0]["values"]=ar
puts JSON.pretty_generate(j)' txt file
# same output...
I have several text files, each one has a title inside. For example:
echo 'title: hello' > 1.txt
echo 'title: world' > 2.txt
echo 'title: good' > 3.txt
And I have a JSON file called abc.json generated by a shell script like this:
{
"": [
{
"title": "",
"file": "1"
},
{
"title": "",
"file": "2"
},
{
"title": "",
"file": "3"
}
]
}
What I want is to update the title value in the abc.json by the title in the respective text file, like this:
{
"": [
{
"title": "hello",
"file": "1"
},
{
"title": "world",
"file": "2"
},
{
"title": "good",
"file": "3"
}
]
}
The text files and the JSON files are in the same directory like this:
➜ tmp.uFtH6hMC ls
1.txt 2.txt 3.txt abc.json
Thank you very much!
Update requirement
Sorry, guys. All your answers are perfect for the above requirement.
But I missed some important details:
The filename of text files may contain space, so the current directory should be like this:
➜ $ gfind . -maxdepth 1 -type f -printf '%P\n'
The text file contain one title line and more content.txt
The title identifier in the text file is fixed.txt
The filename of text file may contain space.txt
abc.json
Each text file includes one title line containing the title value that should be extracted into abc.json; i.e., ## hello means that "hello" needs to be put into the title field in abc.json. The title line can be any line in the file and looks like ## <title-value>. The title identifier ## is fixed and is separated from the title value by a single space, which is the first whitespace in the title line. So the text files' content could look like this:
The text file contain one title line and more content.txt:
## hello world
some more content below...
...
The title identifier in the text file is fixed.txt:
## How are you?
some more content below...
...
The filename of text file may contain space.txt:
some pre-content...
...
## I'm fine, thank you.
some more content below...
...
Before updating, the abc.json looks like this:
{
"": [
{
"title": "",
"file": "The filename of text file may contain space"
},
{
"title": "",
"file": "The text file contain one title line and more content"
},
{
"title": "",
"file": "The title identifier in the text file is fixed"
}
]
}
After updating, the abc.json should be like this:
{
"": [
{
"title": "I'm fine, thank you.",
"file": "The filename of text file may contain space"
},
{
"title": "hello world",
"file": "The text file contain one title line and more content"
},
{
"title": "How are you?",
"file": "The title identifier in the text file is fixed"
}
]
}
Sorry again...thank you for your patience and great help!
You can use a shell loop to iterate over your files, extract the second column, create each array element and then transform the stream of array elements into your final object:
for f in *.txt; do
cut -d' ' -f2- "$f" | jq -R --arg file "$f" '{title:.,file:($file/"."|first)}';
done | jq -s '{"":.}'
It is also possible to remove the file extension in shell directly, which makes the jq filter a little bit simpler:
for f in *.txt; do
cut -d' ' -f2- "$f" | jq -R --arg file "${f%.txt}" '{title:.,$file}';
done | jq -s '{"":.}'
cut extracts the title value and must be adapted if the files are structured differently, e.g. by using grep, sed, or awk to extract the title and then feed it to jq.
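For the updated requirement (a title line of the form ## <title-value> anywhere in the file, and file names containing spaces), the same loop could be adapted with sed in place of cut. This is only a sketch: the temporary directory and the two sample files are made up for the demo, and it assumes the first line starting with "## " is the title line.

```shell
#!/bin/bash
# Sketch: rebuild the JSON from files whose title line looks like "## <title>".
# The temp directory and sample files below are made up for this demo.
tmp=$(mktemp -d)
printf '%s\n' '## hello world' 'more content' > "$tmp/first file.txt"
printf '%s\n' 'pre-content' '## How are you?' 'more' > "$tmp/second file.txt"

result=$(
  cd "$tmp" || exit 1
  for f in *.txt; do
    # print only the first line starting with "## ", minus the marker
    sed -n 's/^## //p' "$f" | head -n 1 |
      jq -R --arg file "${f%.txt}" '{title: ., file: $file}'
  done | jq -s -c '{"": .}'
)

echo "$result"
rm -rf "$tmp"
```

Quoting "$f" keeps file names with spaces intact. A file without any title line produces no object at all in this sketch, so it would be missing from the result rather than getting an empty title.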
Since each .file value is the number of the matching text file, we can use it to index into the list of titles read from the input.
Using cut, we can read all the *.txt files, split each line on spaces, and keep the second through the last field. This gives:
cat *.txt | cut -d ' ' -f 2-
hello
world
good
(titles with spaces will work due to the -f 2-)
Using --arg we pass that to jq, which we then parse into an array:
($inputs | split("\n")) as $parsed
Now that $parsed looks like:
[
"hello",
"world",
"good"
]
To update the value, loop over each object in the "" array, then get the matching value from $parsed by using .file | tonumber - 1 (since arrays are 0-indexed):
jq --arg inputs "$(cat *.txt | cut -d ' ' -f 2-)" \
'($inputs | split("\n")) as $parsed
| .""[]
|= (.title = $parsed[.file | tonumber - 1])' \
abc.json
Output:
{
"": [
{
"title": "hello",
"file": "1"
},
{
"title": "world",
"file": "2"
},
{
"title": "good",
"file": "3"
}
]
}
Use input_filename to get the input files' names, read their raw content with the -R flag set, and use select to find the right item to update; all in one go:
jq -Rn --argfile base abc.json '
reduce (inputs | [
ltrimstr("title: "),
(input_filename | rtrimstr(".txt"))
]) as [$title, $file] ($base;
(.[""][] | select(.file == $file)).title = $title
)
' *.txt
If the left part of the text files' contents ("title" in the samples) should be a dynamic field name, you could capture it as well:
jq -Rn --argfile base abc.json '
reduce (inputs | [
capture("^(?<key>.*): (?<value>.*)$"),
(input_filename | rtrimstr(".txt"))
]) as [{$key, $value}, $file] ($base;
(.[""][] | select(.file == $file))[$key] = $value
)
' *.txt
Output:
{
"": [
{
"title": "hello",
"file": "1"
},
{
"title": "world",
"file": "2"
},
{
"title": "good",
"file": "3"
}
]
}
Could you please help me merge two JSON variables in bash to get the desired output shown below (without manually looping over the .data[] array)? I tried echo "${firstJsonoObj} ${SecondJsonoObj}" | jq -s add, but it didn't descend into the array.
firstJsonoObj='{"data" :[{"id": "123"},{"id": "124"}]}'
SecondJsonoObj='{"etag" :" 234324"}'
desired output
{"data" :[{"id": "123", "etag" :" 234324"},{"id": "124", "etag" :" 234324"}]}
Thanks in advance!
You can append to each data element using +=:
#!/bin/bash
firstJsonoObj='{"data" :[{"id": "123"},{"id": "124"}]}'
SecondJsonoObj='{"etag" :" 234324"}'
jq -c ".data[] += $SecondJsonoObj" <<< "$firstJsonoObj"
Output:
{"data":[{"id":"123","etag":" 234324"},{"id":"124","etag":" 234324"}]}
Please don't use double quotes to inject data from shell into code. jq provides the --arg and --argjson options to do that safely:
#!/bin/bash
firstJsonoObj='{"data" :[{"id": "123"},{"id": "124"}]}'
SecondJsonoObj='{"etag" :" 234324"}'
jq --argjson x "$SecondJsonoObj" '.data[] += $x' <<< "$firstJsonoObj"
# or
jq --argjson a "$firstJsonoObj" --argjson b "$SecondJsonoObj" -n '$a | .data[] += $b'
{
"data": [
{
"id": "123",
"etag": " 234324"
},
{
"id": "124",
"etag": " 234324"
}
]
}
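The same point applies when the value to inject is a plain string rather than a JSON document: --arg passes it as a JSON string, so embedded quotes cannot break the filter. A small sketch, where the etag value is a hypothetical one chosen to contain double quotes:

```shell
#!/bin/bash
# Sketch: the etag arrives as a plain shell string (hypothetical value),
# so --arg passes it as a JSON string with the inner quotes escaped.
firstJsonoObj='{"data":[{"id":"123"},{"id":"124"}]}'
etag='a "quoted" tag'

result=$(jq -c --arg etag "$etag" '.data[].etag = $etag' <<< "$firstJsonoObj")
echo "$result"
```

Interpolating the same value into a double-quoted filter would yield a jq syntax error or, worse, a different program.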
jq -s add will not work because you want to add the second document at a deeper level within the first. Use .data[] += input (without -s), with . accessing the first and input accessing the second input:
echo "${firstJsonoObj} ${SecondJsonoObj}" | jq '.data[] += input'
Or, as bash is tagged, use a here-string:
jq '.data[] += input' <<< "${firstJsonoObj} ${SecondJsonoObj}"
Output:
{
"data": [
{
"id": "123",
"etag": " 234324"
},
{
"id": "124",
"etag": " 234324"
}
]
}
The output from the tool I am using is creating an element in the json that is an object when there is only 1 item but an array when there is more than 1.
How do I parse this with jq to return the full list of names only from within content?
{
"data": [
{
"name": "data block1",
"content": {
"name": "1 bit of data"
}
},
{
"name": "data block2",
"content": [
{
"name": "first bit"
},
{
"name": "another bit"
},
{
"name": "last bit"
}
]
}
]
}
What I can't work out is how to switch depending on the type of content.
# jq '.data[].content.name' test.json
"1 bit of data"
jq: error (at test.json:22): Cannot index array with string "name"
# jq '.data[].content[].name' test.json
jq: error (at test.json:22): Cannot index string with string "name"
I am sure I should be able to use type but my jq-fu is not strong enough!
# jq '.data[].content | type=="array"' test.json
false
true
jq version 1.5
jq '.data[].content | if type == "array" then .[] else . end | .name?'
(The trailing ? is there just in case.)
More succinctly:
jq '.data[].content | .name? // .[].name?'
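Another way to express the same switch is with the arrays and objects builtins, which pass through only inputs of the matching type. The sketch below uses a shortened version of the sample data and wraps the result in [] to get the names as one list:

```shell
#!/bin/bash
# Sketch: collect every content name into one array, whether "content"
# is a single object or an array of objects.
json='{"data":[{"name":"b1","content":{"name":"1 bit of data"}},
               {"name":"b2","content":[{"name":"first bit"},{"name":"last bit"}]}]}'

# arrays | .[]  unpacks array contents; objects passes single objects through
result=$(jq -c '[.data[].content | (arrays | .[]), objects | .name]' <<< "$json")
echo "$result"
```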
I have a result.json:
{
"Msg": "This is output",
"output": {}
}
and a output.json:
{
"type": "string",
"value": "result is here"
}
I want to replace output field in result.json with whole file output.json as
{
"Msg": "This is output",
"output": {
"type": "string",
"value": "result is here"
}
}
Any idea how to do this with the jq command line? Thank you in advance.
You can use --argfile to process multiple files:
jq --argfile f1 result.json --argfile f2 output.json -n '$f1 | .output = $f2'
Basically the same as Bertrand Martel's answer, but using a different (and shorter) approach to reading the two files.
jq -n 'input | .output = input' result.json output.json
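A third option is --slurpfile, which some versions of the jq manual recommend over --argfile. It binds the variable to an array of all JSON values found in the file, so a single document is at index 0. A sketch, with the two files recreated in a temporary directory for the demo:

```shell
#!/bin/bash
# Sketch: result.json and output.json are recreated in a temp directory.
tmp=$(mktemp -d)
printf '%s\n' '{"Msg":"This is output","output":{}}' > "$tmp/result.json"
printf '%s\n' '{"type":"string","value":"result is here"}' > "$tmp/output.json"

# --slurpfile binds $out to an array of all JSON values in the file,
# so the single document is $out[0].
result=$(jq -c --slurpfile out "$tmp/output.json" '.output = $out[0]' "$tmp/result.json")
echo "$result"
rm -rf "$tmp"
```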
I have a while loop with two variables I have to merge into a single piece of JSON, like so:
#!/bin/bash
while read -r from to
do
# BONUS: Solution would ideally require no quoting at this point
echo { \"From\": \"$from\", \"To\": \"$to\" }
done << EOF
foo bar
what ever
EOF
Which currently outputs invalid JSON:
{ "From": "foo", "To": "bar" }
{ "From": "what", "To": "ever" }
What's the simplest way I can create valid JSON like:
[
{ "From": "foo", "To": "bar" },
{ "From": "what", "To": "ever" }
]
I looked at jq but I couldn't figure out how to do it. I'm not looking to do it in shell ideally because I feel adding commas and such is a bit ugly.
With jq:
$ jq -nR '[inputs | split(" ") | {"From": .[0], "To": .[1]}]' <<EOF
foo bar
what ever
EOF
[
{
"From": "foo",
"To": "bar"
},
{
"From": "what",
"To": "ever"
}
]
-n tells jq to not read any input; -R is for raw input so it doesn't expect JSON.
The input is read with inputs, resulting in one string per input line:
$ jq -nR 'inputs' <<EOF
foo bar
what ever
EOF
"foo bar"
"what ever"
These are then split into arrays of words:
$ jq -nR 'inputs | split(" ")' <<EOF
foo bar
what ever
EOF
[
"foo",
"bar"
]
[
"what",
"ever"
]
From this, we construct the objects:
$ jq -nR 'inputs | split(" ") | {"From": .[0], "To": .[1]}' <<EOF
foo bar
what ever
EOF
{
"From": "foo",
"To": "bar"
}
{
"From": "what",
"To": "ever"
}
And finally, we wrap everything in [] to get the final output shown first.
The more intuitive approach of splitting input directly fails because wrapping everything in [] results in one array per input line:
$ jq -R '[split(" ") | { "From": .[0], "To": .[1] }]' <<EOF
foo bar
what ever
EOF
[
{
"From": "foo",
"To": "bar"
}
]
[
{
"From": "what",
"To": "ever"
}
]
Hence the somewhat cumbersome -n/inputs. Notice that inputs requires jq version 1.5 or later.
Here's an all-jq solution that assumes the "From" and "To" values are presented exactly as in your example:
jq -R -n '[inputs | split(" ") | {From: .[0], To: .[1]}]'
You might want to handle additional spaces using gsub/2.
If your jq does not have inputs then you can use this incantation:
jq -R -s 'split("\n")
| map(select(length>1) | split(" ") | {From: .[0], To: .[1]})'
Or you could just pipe the output from your while-loop into jq -s.
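That last suggestion could look roughly like the sketch below, which also lets jq do the quoting inside the loop by building each object with --arg. The printf input stands in for the heredoc of the question.

```shell
#!/bin/bash
# Sketch: build one object per line inside the loop, then slurp them
# into a single array with jq -s.
result=$(
  printf '%s\n' 'foo bar' 'what ever' |
  while read -r from to; do
    # -n: build the object from the arguments alone, without reading stdin,
    # so the loop's remaining input is left untouched
    jq -n --arg from "$from" --arg to "$to" '{From: $from, To: $to}'
  done | jq -s -c .
)
echo "$result"
```

This starts one jq process per input line, so for large inputs the single-process -R/inputs variants above are likely the better fit.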