Format a number with thousands separators with the jq JSON CLI

Given {"a": 1234567890}, I want 1,234,567,890 in the result, how this can be done with jq
echo '{"a": 1234567890}' | jq '.a | FORMAT?'
Thanks to peak's answer, the solution is:
echo '{"a": 1234567890}' | jq -r 'def h: [while(length>0; .[:-3]) | .[-3:]] | reverse | join(","); .a | tostring | h'
# -> 1,234,567,890

Here's an idiomatic one-liner definition:
def h: tostring | [while(length>0; .[:-3]) | .[-3:]] | reverse | join(",");
Example
12, 123, 1234, 12345678 | h
Output (using -r option):
12
123
1,234
12,345,678

jq doesn't (yet) have a printf function that formats numbers according to locale settings.
If that's an option for you, you can pass the number to the shell and use printf:
echo '{"a": 12345}' | jq '.a' | xargs printf "%'.f\n"
12,345
Note that the printf conversion relies on the %'.f format, which is explained in man 3 printf.
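The %' flag only inserts separators if the current locale defines a thousands-grouping character; under a plain C/POSIX locale nothing is grouped. A small sketch, assuming an en_US.UTF-8 locale is installed, is to set LC_NUMERIC explicitly:
echo '{"a": 12345}' | jq '.a' | LC_NUMERIC=en_US.UTF-8 xargs printf "%'.f\n"
12,345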

Here's a generic solution for integers or integer-valued strings:
# "h" for "human-readable"
def h:
def hh: .[0] as $s | .[1] as $answer
| if ($s|length) == 0 then $answer
else ((if $answer == "" then "" else "," end) + $answer ) as $a
| [$s[0:-3], $s[-3:] + $a] | hh
end;
[ tostring, ""] | hh;
Example
12, 123, 1234, 12345678 | h
Result (using -r option):
12
123
1,234
12,345,678

JQ - Groupby and concatenate text objects

I'm not quite getting it: I can produce multiple lines but cannot get multiple entries to combine. I'm looking to take the source JSON and output CSV as shown:
Source JSON:
[{"State": "NewYork","Drivers": [
{"Car": "Jetta","Users": [{"Name": "Steve","Details": {"Location": "Home","Time": "9a-7p"}}]},
{"Car": "Jetta","Users": [{"Name": "Roger","Details": {"Location": "Office","Time": "3p-6p"}}]},
{"Car": "Ford","Users": [{"Name": "John","Details": {"Location": "Home","Time": "12p-5p"}}]}
]}]
Desired CSV:
"NewYork","Jetta","Steve;Roger","Home;Office","9a-7p;3p-6p"
"NewYork","Ford","John","Home","12p-5p"
JQ code that does not work:
.[] | .Drivers[] | .Car as $car |
.Users[] |
[$car, .Name] | @csv
You're looking for something like this:
.[] | [.State] + (
  .Drivers | group_by(.Car)[] | [.[0].Car] + (
    map(.Users) | add | [
      map(.Name),
      map(.Details.Location),
      map(.Details.Time)
    ] | map(join(";"))
  )
) | @csv
$ jq -r -f tst.jq file
"NewYork","Ford","John","Home","12p-5p"
"NewYork","Jetta","Steve;Roger","Home;Office","9a-7p;3p-6p"
$
Not quite optimised, but I thought I'd share the general idea:
jq -r 'map(.State as $s |
  (.Drivers | group_by(.Car))[]
  | [
      $s,
      (map(.Users[].Name) | join(";")),
      (map(.Users[].Details.Location) | join(";")),
      (map(.Users[].Details.Time) | join(";"))
    ])
  [] | @csv' b
map() over each state, remembering the name (map(.State as $s | ...))
group_by(.Car)
Create an array containing all your fields that is passed to @csv
Use map() and join() to create the fields for Name, Location and Time
This part could be improved so you don't need the duplicated map(.Users[]...) expressions (see the sketch below)
Output (with --raw-output):
"NewYork","John","Home","12p-5p"
"NewYork","Steve;Roger","Home;Office","9a-7p;3p-6p"
jqplay seems to be down, so I'm still searching for another way of sharing a public demo.
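As a rough sketch of how the duplication could be factored out (the helper name cols is made up here, and this version also adds the Car column back in), something like this should work:
jq -r '
  def cols(f): map(.Users[] | f) | join(";");
  map(.State as $s
      | (.Drivers | group_by(.Car))[]
      | [$s, .[0].Car, cols(.Name), cols(.Details.Location), cols(.Details.Time)])[]
  | @csv' b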
Far from perfect, but it builds the result incrementally so it should be easily debuggable and extensible:
map({State} + (.Drivers[] | {Car} + (.Users[] | {Name} + (.Details | {Location, Time}))))
| group_by(.Car)
| map(reduce .[] as $item (
    {State: null, Car: null, Name: [], Location: [], Time: []};
    . + ($item | {State, Car}) | .Name += [$item.Name] | .Location += [$item.Location] | .Time += [$item.Time]))
| .[]
| [.State, .Car, (.Name, .Location, .Time | join(";"))]
| @csv

Output semicolon-separated string

Let's say we have this file:
{
  "persons": [
    {
      "friends": 4,
      "phoneNumber": 123456,
      "personID": 11111
    },
    {
      "friends": 2057,
      "phoneNumber": 432100,
      "personID": 22222
    },
    {
      "friends": 50,
      "phoneNumber": 147258,
      "personID": 55555
    }
  ]
}
I now want to extract the phone numbers of the persons 11111, 22222, 33333, 44444 and 55555 as a semicolon-separated string:
123456;432100;;;147258
While running
cat persons.txt | jq ".persons[] | select(.personID==<ID>) | .phoneNumber"
once for each <ID> and gluing the results together with ; afterwards works, it is terribly slow, because the file has to be reloaded for each of the IDs (and for the other fields I want to extract).
Concatenating it in a single query:
cat persons.txt | jq "(.persons[] | select(.personID==11111) | .phoneNumber), (.persons[] | select(.personID==22222) | .phoneNumber), (.persons[] | select(.personID==33333) | .phoneNumber), (.persons[] | select(.personID==44444) | .phoneNumber), (.persons[] | select(.personID==55555) | .phoneNumber)"
This also works, but it gives
123456
432100
147258
so I do not know which of the fields are missing and how many ; I have to insert.
With your sample input in input.json, and using jq 1.6 (or a jq with INDEX/2), the following invocation of jq produces the desired output:
jq -r --argjson ids '[11111, 22222, 33333, 44444, 55555]' -f tossv.jq input.json
assuming tossv.jq contains the program:
INDEX(.persons[]; .personID) as $dict
| $ids
| map( $dict[tostring] | .phoneNumber)
| join(";")
Program notes
INDEX/2 produces a JSON object that serves as a dictionary. Since JSON keys must be strings, tostring must be used in line 3 above.
When using join(";"), null values effectively become empty strings.
If your jq does not have INDEX/2, then now might be a good time to upgrade. Otherwise you can snarf its definition by googling: jq "def INDEX" builtin.jq
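For reference, the two-argument definition in builtin.jq is essentially the following (quoted from memory, so double-check against the source); prepending it to tossv.jq should make the program above work on jq 1.5:
def INDEX(stream; idx_expr):
  reduce stream as $row ({}; .[$row | idx_expr | tostring] = $row);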
Unfortunately I couldn't test whether peak's answer works, since I only have jq 1.5. Here's what I came up with yesterday evening:
For each semicolon, add the following query
(\";\" as \$a | \$a)
Resulting command (abstract):
cat persons.txt | jq "(<1's phone number>), (\";\" as \$a | \$a),
(<2's phone number>), (\";\" as \$a | \$a), ..."
Resulting command (concrete):
cat persons.txt | jq "(.persons[] | select(.personID==11111) | .phoneNumber), (\";\" as \$a | \$a),
(.persons[] | select(.personID==22222) | .phoneNumber), (\";\" as \$a | \$a),
(.persons[] | select(.personID==33333) | .phoneNumber), (\";\" as \$a | \$a),
(.persons[] | select(.personID==44444) | .phoneNumber), (\";\" as \$a | \$a),
(.persons[] | select(.personID==55555) | .phoneNumber)"
Result:
123456
";"
432100
";"
";"
";"
147258
Delete the newlines and the double quotes:
<commandAsAbove> | tr --delete "\n\""
Result:
123456;432100;;;147258
Do not get me wrong, this is far uglier than peak's answer, but it worked for me yesterday.
A solution without jq:
for i in $(seq 11111 11111 55555)
do
    string=$(grep -B1 "$i" persons.txt | head -1 | sed 's/.* \(.*\),/\1/g')
    echo "$string;" >> output
done
cat output | tr -d '\n' | rev | cut -d';' -f2- | rev > tmp && mv tmp output
This little script will yield the result you want, and you can adapt it quickly if the input data varies:
cat output
123456;432100;;;147258

jq - How to filter JSON that does not contain

I have an aws query that I want to filter in jq.
I want to filter all the imageTags that don't end with "latest"
So far I did this, but it selects entries containing "latest", while I want entries not containing "latest" (or not ending with "latest"):
aws ecr describe-images --repository-name <repo> --output json | jq '.[]' | jq '.[]' | jq "select ((.imagePushedAt < 14893094695) and (.imageTags[] | contains(\"latest\")))"
Thanks
You can use not to reverse the logic
(.imageTags[] | contains(\"latest\") | not)
Also, I'd imagine you can simplify your pipeline into a single jq call.
All you have to do is | not within your jq
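For example, a single-call version might look like this; it is only a sketch, assuming the describe-images output is wrapped in an imageDetails array, and it keeps the original timestamp comparison as-is:
aws ecr describe-images --repository-name <repo> --output json \
  | jq '.imageDetails[]
        | select(.imagePushedAt < 14893094695
                 and (.imageTags[] | contains("latest") | not))'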
A useful example, in particular for macOS Homebrew users:
List all bottled formulae
by querying the JSON and parsing the output
brew info --json=v1 --installed | jq -r 'map(
select(.installed[].poured_from_bottle)|.name) | unique | .[]' | tr '\n' ' '
List all non-bottled formulae
by querying the JSON, parsing the output, and using | not
brew info --json=v1 --installed | jq -r 'map(
select(.installed[].poured_from_bottle | not) | .name) | unique | .[]'
In this case contains() doesn't work properly; it is better to use index() together with not:
select(.imageTags | index("latest") | not)
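A quick illustration of the difference: contains() does substring matching against the array elements, whereas index() looks for an exact element, so a tag such as "not-latest" is treated differently:
$ jq --null-input '["not-latest"] | contains(["latest"])'
true
$ jq --null-input '["not-latest"] | index("latest")'
null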
This .[] | .[] can be shortened to .[][], e.g.,
$ jq --null-input '[[1,2],[3,4]] | .[] | .[]'
1
2
3
4
$ jq --null-input '[[1,2],[3,4]] | .[][]'
1
2
3
4
To check whether a string does not contain another string, you can combine contains and not e.g.,
$ jq --null-input '"foobar" | contains("foo") | not'
false
$ jq --null-input '"barbaz" | contains("foo") | not'
true
You can do something similar with an array of strings with either any or all e.g.,
$ jq --null-input '["foobar","barbaz"] | any(.[]; contains("foo"))'
true
$ jq --null-input '["foobar","barbaz"] | any(.[]; contains("qux"))'
false
$ jq --null-input '["foobar","barbaz"] | all(.[]; contains("ba"))'
true
$ jq --null-input '["foobar","barbaz"] | all(.[]; contains("qux"))'
false
Say you had file.json:
[ [["foo", "foo"],["foo", "bat"]]
, [["foo", "bar"],["foo", "bat"]]
, [["foo", "baz"],["foo", "bat"]]
]
And you only want to keep the nested arrays that don't have any strings containing "bat":
$ jq --compact-output '.[][] | select(all(.[]; contains("bat") | not))' file.json
["foo","foo"]
["foo","bar"]
["foo","baz"]

JSON parsing with jq and converting to CSV

I need to get some values from a JSON file with jq and turn them into a CSV (Time, Data.key, Lat, Lng, Qline).
Input:
{
  "Time": "14:16:23",
  "Data": {
    "101043": {
      "Lat": 49,
      "Lng": 15,
      "Qline": 420
    },
    "101044": {
      "Lat": 48,
      "Lng": 15,
      "Qline": 421
    }
  }
}
Example output of csv:
"14:16:23", 101043, 49, 15, 420
"14:16:23", 101044, 48, 15, 421
Thanks a lot.
So far I only tried:
cat test.json | jq '.Data[] | [ .Lat, .Lng, .Qline ] | @csv'
Try this:
{ Time } + (.Data | to_entries[] | { key: .key | tonumber } + .value)
| [ .Time, .key, .Lat, .Lng, .Qline ]
| @csv
Make sure you get the raw output by using the -r switch.
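For instance, assuming the sample above is saved as test.json, the full invocation would be something like the following and should print the two CSV rows:
jq -r '{ Time } + (.Data | to_entries[] | { key: .key | tonumber } + .value)
       | [ .Time, .key, .Lat, .Lng, .Qline ]
       | @csv' test.json
"14:16:23",101043,49,15,420
"14:16:23",101044,48,15,421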
Here's another solution that doesn't involve the +'s.
{Time, Data: (.Data | to_entries)[]}
| [.Time, (.Data.key | tonumber), .Data.value.Lat, .Data.value.Lng, .Data.value.Qline]
| @csv
Here is another solution. If data.json contains the sample data then
jq -M -r '(.Data|keys[]) as $k | {Time,k:$k}+.Data[$k] | [.[]] | @csv' data.json
will produce
"14:16:23","101043",49,15,420
"14:16:23","101044",48,15,421

Unable to fix the logic of the bash script

I have a table (say UserInputDetails) with the following entries:
+------------+-----------+----------+
| screenId | userInput | numInput |
+------------+-----------+----------+
| 13_1_2_1 | 2 | 9 |
| 13_1_2_2 | 2 | 9 |
| 13_1_2_2 | 3 | 2 |
| 13_1_2_2 | 9 | 2 |
| 13_1_2_2_2 | 3 | 3 |
| 13_1_2_2_2 | 5 | 2 |
| 13_2_2_2 | 4 | 4 |
| 13_2_2_2 | 5 | 4 |
| 13_2_2_2 | 7 | 2 |
+------------+-----------+----------+
I need to write a shell script which produces the following expected output:
13_1_2_1,0,0,9,0,0,0,0,0,0,0
13_1_2_2,0,0,9,2,0,0,0,0,0,2
13_1_2_2_2,0,0,0,3,0,2,0,0,0,0
13_2_2_2,0,0,0,0,4,4,0,2,0,0
Explanation of the output:
The first line of the output corresponds to screenId '13_1_2_1': it prints the screenId followed by the numInput for each userInput from 0 to 9. Since the numInput for userInput '2' is 9 and it is 0 for the rest, the line is 13_1_2_1,0,0,9,0,0,0,0,0,0,0.
The bash script I have written for this is:
#!/bin/bash
MYSQL="mysql -uroot -proot -N Database1"
yesterday=""
if [ $# -ge 1 ]
then
    yesterday="$1"
else
    yesterday=`$MYSQL -sBe "select date_sub(date(now()), interval 1 day);"`
fi
echo "DATE: $yesterday"
PREVSCREENID=''
SCREENID=
ABC=tempSqlDataFile
$MYSQL -sBe "select screenId, userInput, numInput from userInputDetails group by screenID, userInput" > $ABC
for i in {0..9}
do
    arr[$i]='0'
done
while read line
do
    SCREENID=`echo $line | awk '{ print $1 }'`
    i=`echo $line | awk '{print $2 }'`
    arr[$i]=`echo $line | awk '{print $3}'`
    if [[ $SCREENID != $PREVSCREENID ]]
    then
        echo "$SCREENID ${arr[*]}" | tr ' ' ','
        for i in {0..9}
        do
            arr[$i]='0'
        done
    else
        i=`echo $line | awk '{print $2 + 1}'`
        arr[$i]=`echo $line | awk '{print $3}'`
    fi
    PREVSCREENID=$SCREENID
done < $ABC
The logic is going wrong somewhere and I am unable to get it right. The output from the above shell script is:
13_1_2_1,0,0,9,0,0,0,0,0,0,0,
13_1_2_2,0,0,9,0,0,0,0,0,0,0,
13_1_2_2_2,0,0,9,3,0,0,0,0,0,2,
13_2_2_2,0,0,0,3,4,2,0,0,0,0,
Can you please help me fix the logic in my script? Also, since I am new to scripting and programming, this may not be an efficient way to perform this task; please suggest a more efficient approach if there is one.
There are a number of errors in your script: each row's value is stored before the group check (so the previous group is printed under the new screenId, with one stray value mixed in), the else branch shifts the userInput index by 1, and the last group is never printed. Here's a rewrite of the latter part:
while read SCREENID i n; do
    if [[ "$SCREENID" != "$PREVSCREENID" ]]; then
        [ "$PREVSCREENID" ] && echo "$PREVSCREENID ${arr[*]}" | tr ' ' ,
        for j in {0..9}; do arr[$j]=0; done
    fi
    arr[$i]="$n"
    PREVSCREENID="$SCREENID"
done < "$ABC"
echo "$PREVSCREENID ${arr[*]}" | tr ' ' ,
You can avoid calling tr like this:
print_arr() { IFS=,; echo $PREVSCREENID,"${arr[*]}"; unset IFS; }
while read SCREENID i n; do
    if [[ "$SCREENID" != "$PREVSCREENID" ]]; then
        [ "$PREVSCREENID" ] && print_arr
        for j in {0..9}; do arr[$j]=0; done
    fi
    arr[$i]="$n"
    PREVSCREENID="$SCREENID"
done < "$ABC"
print_arr
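If you want to check the loop without MySQL, you can feed it the sample rows from the question directly; this is just a hypothetical test harness (the file name sample.txt is made up) that duplicates the second loop above:
cat > sample.txt <<'EOF'
13_1_2_1 2 9
13_1_2_2 2 9
13_1_2_2 3 2
13_1_2_2 9 2
13_1_2_2_2 3 3
13_1_2_2_2 5 2
13_2_2_2 4 4
13_2_2_2 5 4
13_2_2_2 7 2
EOF
PREVSCREENID=''
print_arr() { IFS=,; echo "$PREVSCREENID,${arr[*]}"; unset IFS; }
while read SCREENID i n; do
    if [[ "$SCREENID" != "$PREVSCREENID" ]]; then
        [ "$PREVSCREENID" ] && print_arr
        for j in {0..9}; do arr[$j]=0; done
    fi
    arr[$i]="$n"
    PREVSCREENID="$SCREENID"
done < sample.txt
print_arr
# should print the four expected lines:
# 13_1_2_1,0,0,9,0,0,0,0,0,0,0
# 13_1_2_2,0,0,9,2,0,0,0,0,0,2
# 13_1_2_2_2,0,0,0,3,0,2,0,0,0,0
# 13_2_2_2,0,0,0,0,4,4,0,2,0,0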