Building new JSON with JQ and bash

I am trying to create JSON from scratch using bash.
The final structure needs to be like:
{
  "hosts": {
    "a_hostname": {
      "ips": [
        1,
        2,
        3
      ]
    },
    {...}
  }
}
First I'm creating an input file with the format:
hostname ["1.1.1.1","2.2.2.2"]
host-name2 ["3.3.3.3","4.4.4.4"]
This is being created by:
for host in $( ansible -i hosts all --list-hosts ); do
  echo -n "${host} "
  ansible -i hosts $host -m setup | sed '1c {' | jq -r -c '.ansible_facts.ansible_all_ipv4_addresses'
done > hosts.txt
The key point here is that the IP list/array is coming from a JSON file and being extracted by jq. The extraction outputs an already valid, quoted JSON array, but it lands in the text file as a plain string.
Next I'm using jq to parse the whole text file into the desired JSON:
jq -Rn '
  { "hosts": [ inputs
    | split("\\s+"; "g")
    | select(length > 0 and .[0] != "")
    | { (.[0]): { ips: .[1] } }
  ] | add }
' < ~/hosts.txt
This is almost correct; everything works except the IPs value, which is treated as a string and quoted again, leading to:
{
  "hosts": {
    "hostname1": {
      "ips": "[\"1.1.1.1\",\"2.2.2.2\"]"
    },
    "host-name2": {
      "ips": "[\"3.3.3.3\",\"4.4.4.4\"]"
    }
  }
}
I'm now stuck at this final hurdle - how to insert the IPs without causing them to be quoted again.
Edit: the quoting is solved by using {ips: .[1] | fromjson} instead of {ips: .[1]}.
However, this was made unnecessary by @CharlesDuffy's suggestion to convert the input to TSV instead.
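For the record, the full working filter with fromjson looks roughly like this (a sketch against the hosts.txt format above):
jq -Rn '
  { hosts: [ inputs
    | split("\\s+"; "g")
    | select(length > 0 and .[0] != "")
    | { (.[0]): { ips: (.[1] | fromjson) } }
  ] | add }
' < ~/hosts.txt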
Original Q body:
So far I've got to
jq -n {hosts:{}} |
for host in $( ansible -i hosts all --list-hosts ); do
  jq ".hosts += {$host:{}}" |
  jq ".hosts.$host += {ips:[1,2,3]}"
done
([1,2,3] is actually coming from a subshell, but including that seemed unnecessary since it works, and it made the example harder to read.)
This sort of works, but there seem to be two problems.
1) The final output only has a single host in it, containing data from the first host in the list (this persists even if the second problem is bypassed):
{
  "hosts": {
    "host_1": {
      "ips": [
        1,
        2,
        3
      ]
    }
  }
}
2) One of the hostnames has a - in it, which causes syntax and compile errors from jq. I'm stuck in quoting hell trying to get it interpolated but also quoted. Help!
Thanks for any input.

Let's say your input format is:
host_1 1 2 3
host_2 2 3 4
host-with-dashes 3 4 5
host-with-no-addresses
...re: the edit specifying a different format: add @tsv onto the jq command producing the existing format to generate this one instead.
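For instance, the question's extraction loop might become (a sketch; -r makes jq emit the raw tab-separated line rather than a JSON string):
for host in $( ansible -i hosts all --list-hosts ); do
  echo -n "${host} "
  ansible -i hosts "$host" -m setup | sed '1c {' \
    | jq -r '.ansible_facts.ansible_all_ipv4_addresses | @tsv'
done > hosts.txt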
If you want to transform that to the format in question, it might look like:
jq -Rn '
  { "hosts": [ inputs
    | split("\\s+"; "g")
    | select(length > 0 and .[0] != "")
    | { (.[0]): .[1:] }
  ] | add }
' <input.txt
Which yields as output:
{
  "hosts": {
    "host_1": [
      "1",
      "2",
      "3"
    ],
    "host_2": [
      "2",
      "3",
      "4"
    ],
    "host-with-dashes": [
      "3",
      "4",
      "5"
    ],
    "host-with-no-addresses": []
  }
}

Related

Bash JSON: compare two lists and delete by id

I have a JSON endpoint that I can fetch with curl, and a local .yml file. I want to find the difference between the two and delete the differing entry from the endpoint, using the id that goes with its name.
JSON's endpoint
[
  {
    "hosts": [
      "server1"
    ],
    "id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "V1_toto_a"
  },
  {
    "hosts": [
      "server2"
    ],
    "id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "V1_tata_b"
  },
  {
    "hosts": [
      "server3"
    ],
    "id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "V1_titi_c"
  }
]
files.yml
---
instance:
  toto:
    name: "toto"
  tata:
    name: "tata"
Between the JSON endpoint and the local file, I want to delete the differing entry by its id, because it is only present in one of the sources.
declare -a arr=(_a _b _c)
ar=$(cat files.yml | grep name | cut -d '"' -f2 | tr "\n" " ")
fileItemArray=($ar)
ARR_PRE=("${fileItemArray[@]/#/V1_}")
for i in "${arr[@]}"; do local_var+=("${ARR_PRE[@]/%/$i}"); done
remote_var=$(curl -sX GET "XXXX" | jq -r '.[].name | @sh' | tr -d \'\")
diff_=$(echo ${local_var[@]} ${remote_var[@]} | tr ' ' '\n' | sort | uniq -u)
output = titi
The code works, but now I want to delete the titi entry by its id dynamically:
curl -X DELETE "XXXX" $id_titi
I am trying to do the deletion from the bash script, but I have no idea how to continue...
Your endpoint is not proper JSON, as it has:
- commas after the .name field but no following field
- no commas between the elements of the top-level array
If this is not just a typo from pasting your example into this question, then you'd need to address this first before proceeding. This is what it should look like:
[
  {
    "hosts": [
      "server1"
    ],
    "id": "qz9o847b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "toto"
  },
  {
    "hosts": [
      "server2"
    ],
    "id": "a6aa847b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "tata"
  },
  {
    "hosts": [
      "server3"
    ],
    "id": "a6d9ee7b-f07c-49d1-b1fa-e5ed0b2f0519",
    "name": "titi"
  }
]
If your endpoint is proper JSON, try the following. It extracts the names from your .yml file (just as you do - there are plenty of more efficient and less error-prone ways, but I'm trying to stay as close to your approach as possible), but instead of a Bash array it generates a JSON array using jq, which to Bash is just a string. For your curl output it's basically the same thing: extracting a (JSON) array of names into a Bash string. Note that in both cases I use quotes, <var>="$(…)", to capture strings that may include spaces (although I also use jq's -c option to compact its output to a single line). For the difference between the two, everything is handed over to jq, as it can easily be fed the JSON arrays as variables, perform the subtraction, and output in your preferred format:
fromyml="$(cat files.yml | grep name | cut -d '"' -f2 | jq -Rnc '[inputs]')"
fromcurl="$(curl -sX GET "XXXX" | jq -c 'map(.name)')"
diff="$(jq -nr --argjson fromyml "$fromyml" --argjson fromcurl "$fromcurl" '
$fromcurl - $fromyml | .[]
')"
The Bash variable diff now contains a list of names only present in the curl output ($fromcurl - $fromyml), one per line (if, unlike in your example, there happens to be more than one). If the curl output had duplicates, they will still be included (use $fromcurl - $fromyml | unique | .[] to get rid of them):
titi
As you can see, this solution has three calls to jq. I'll leave it to you to further reduce that number as it fits your general workflow (basically, it can be put together into one).
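As a sketch of that consolidation (keeping the question's XXXX placeholder), the subtraction can move into the jq call that processes the curl output, dropping one of the three invocations:
fromyml="$(grep name files.yml | cut -d '"' -f2 | jq -Rnc '[inputs]')"
# the curl-side jq now extracts the names and performs the subtraction in one pass
diff="$(curl -sX GET "XXXX" | jq -r --argjson fromyml "$fromyml" 'map(.name) - $fromyml | .[]')"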
Getting the output of a program into a variable can be done using read.
perl -M5.010 -MYAML -MJSON::PP -e'
  sub get_next_file { local $/; "".<> }
  my %filter = map { $_->{name} => 1 } values %{ Load(get_next_file)->{instance} };
  say for grep !$filter{$_}, map $_->{name}, @{ decode_json(get_next_file) };
' b.yaml a.json |
while IFS= read -r id; do
  curl -X DELETE ..."$id"...
done
I used Perl here because what you had (grepping the file) is no proper way to parse YAML. The snippet requires the YAML Perl module to be installed.

How can jq be used to insert dynamic field names recursively for all objects in an array?

I'm new to jq, and hoping to convert the JSON below so that, for each object in the records array, the "Account" object is deleted and replaced with an "AccountID" field which has the value of Account.Id.
Assume I don't know the name of the field (e.g. Account) prior to executing, so it has to be passed in dynamically as an argument via --arg.
Contacts.json:
{
  "records": [
    {
      "attributes": {
        "type": "Contact",
        "referenceId": "ContactRef1"
      },
      "Account": {
        "attributes": {
          "type": "Account",
          "url": "/services/data/v51.0/sobjects/Account/asdf"
        },
        "Id": "asdf"
      }
    },
    {
      "attributes": {
        "type": "Contact",
        "referenceId": "ContactRef2"
      },
      "Account": {
        "attributes": {
          "type": "Account",
          "url": "/services/data/v51.0/sobjects/Account/qwer"
        },
        "Id": "qwer"
      }
    }
  ]
}
to
{
  "records": [
    {
      "attributes": {
        "type": "Contact",
        "referenceId": "ContactRef1"
      },
      "AccountID": "asdf"
    },
    {
      "attributes": {
        "type": "Contact",
        "referenceId": "ContactRef2"
      },
      "AccountID": "qwer"
    }
  ]
}
The example above is a little contrived because, in actuality, I need to be able to dynamically name the ID field in order to port the new JSON structure into the destination system. For my use case it's not always valid to tack "ID" onto the field name (e.g. Account becomes AccountID), so I pass the field names via --arg.
This is as close as I got, but it's not quite there, and I suspect there is a better way.
jq -c --arg field "Account" --arg field_name_id "AccountID" '. |= . + if .records?[]?[$field] != null then { "\($field_name_id)" : .records[][$field].Id } else empty end | if .records?[]?[$field] != null then del(.records[][$field]) else empty end' Contacts.json
I've wrestled with this for quite a while, but this is as far as I'm able to get without running into tons of syntax errors. I'd really appreciate any help adding an AccountID field to each object in the records array.
Here's the actual bash script where jq is run (the relevant parts are where FIELD(S) is used):
#!/bin/bash
# This script takes a soql file as its first and only argument.
# The main purpose is to tweak the json results from an sfdx:data:tree:export so the json is compatible with sfdx:data:tree:import.
# This is needed because sfdx export & import are inadequate when relationships are more than 2 levels deep in the export query.

# grab all unique object names within the soql file for any objects where the ID field is being SELECTed (eg. "Account Iteration__r Profile UserRole")
FIELDS=`grep -oe '\([A-Za-z_]\+\)\.[iI][dD]' $1 | cut -f 1 -d . - | sort -u`

# find all json files and rewrite the relationship FIELD blocks into something sfdx can import
for FIELD in $FIELDS; do
  if [[ $FIELD =~ __r ]]; then
    FIELD_NAME_ID=`sed 's/__r/__c/' <<< $FIELD`
  else
    FIELD_NAME_ID="${FIELD}ID"
  fi
  JSON_FILES=`ls *.json`
  # loop over all json files in the directory
  for DATA_FILE in $JSON_FILES; do
    # replace any email addresses left in custom data (just in case)
    # using gsed because Mac sed lacks the plain -i flag for in-place substitution
    gsed -i 's/[^@ "]*@[^@]*\.[^@ ,"]*/fake@test.com/g' $DATA_FILE
    # make a temporary file to hold the rewritten json
    TEMP_FILE="temp-${DATA_FILE}.bk"
    echo $DATA_FILE $FIELD $FIELD_NAME_ID
    # for custom relationships, change __r to __c to get the name of the Id field; otherwise just add "ID"
    jq -c --arg field $FIELD --arg field_name_id $FIELD_NAME_ID '. |= . + if .records?[]?[$field] != null then { "\($field_name_id)" : .records[][$field].Id } else empty end | if .records?[]?[$field] != null then del(.records[][$field]) else empty end' $DATA_FILE 1> ./$TEMP_FILE 2> modify-json.errors
    # if TEMP_FILE is not empty, then jq revised it, so replace the contents of the original JSON DATA_FILE
    if [[ -s ./$TEMP_FILE ]]; then
      # restore JSON spacing/line-breaks
      jq '.' $TEMP_FILE > $DATA_FILE
    fi
    rm $TEMP_FILE
  done
done
The key to a simple solution is |=. Here's one using map:
.records |= map( .Account.Id as $x
                 | del(.Account)
                 | . + {AccountID: $x} )
which can be simplified to:
.records |= map( . + {AccountID: .Account.Id}
                 | del(.Account) )
Either of these can easily be adapted to the case where the two field names are passed in as arguments, or if they must be inferred from the "owner" of "Id".
Adapting peak's answer to use the dynamic field name:
jq -c --arg field "Account" \
--arg field_name_id "AccountID" '
.records |= map(.[$field].Id as $x
| del(.[$field])
| . + {($field_name_id): $x})
'
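In the script above, this parameterized filter would slot into the inner loop roughly as follows (a sketch reusing the script's own variables and redirections):
jq -c --arg field "$FIELD" --arg field_name_id "$FIELD_NAME_ID" '
  .records |= map(.[$field].Id as $x
    | del(.[$field])
    | . + {($field_name_id): $x})
' "$DATA_FILE" 1> "./$TEMP_FILE" 2> modify-json.errors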

How to use jq to find all paths to a certain key

In a very large nested json structure I'm trying to find all of the paths that end in a key.
ex:
{
  "A": {
    "A1": {
      "foo": {
        "_": "_"
      }
    },
    "A2": {
      "_": "_"
    }
  },
  "B": {
    "B1": {}
  },
  "foo": {
    "_": "_"
  }
}
would print something along the lines of:
["A","A1","foo"], ["foo"]
Unfortunately I don't know at what level of nesting the keys will appear, so I haven't been able to figure it out with a simple select. I've gotten close with jq '[paths] | .[] | select(contains(["foo"]))', but the output contains all the permutations of any tree that contains foo.
output: ["A","A1","foo"] ["A","A1","foo","_"] ["foo"] ["foo","_"]
Bonus points if I could keep the original data structure format but simply filter out all paths that don't contain the key (in this case the sub trees under "foo" wouldn't need to be hidden).
With your input:
$ jq -c 'paths | select(.[-1] == "foo")'
["A","A1","foo"]
["foo"]
Bonus points:
(1) If your jq has tostream:
$ jq 'fromstream(tostream| select(.[0]|index("foo")))'
Or better yet, since your input is large, you can use the streaming parser (jq -n --stream) with this filter:
fromstream( inputs|select( (.[0]|index("foo"))))
(2) Whether or not your jq has tostream:
. as $in
| reduce (paths(scalars) | select(index("foo"))) as $p
    (null; setpath($p; $in|getpath($p)))
In all three cases, the output is:
{
  "A": {
    "A1": {
      "foo": {
        "_": "_"
      }
    }
  },
  "foo": {
    "_": "_"
  }
}
I had the same fundamental problem.
With (yaml) input like:
developer:
  android:
    members:
      - alice
      - bob
    oncall:
      - bob
hr:
  members:
    - charlie
    - doug
this:
  is:
    really:
      deep:
        nesting:
          members:
            - example deep nesting
I wanted to find all arbitrarily nested groups and get their members.
Using this:
yq . |  # convert yaml to json using python-yq
jq '
  . as $input |  # Save the input for later
  . | paths |    # Get the list of paths
  select(.[-1] | tostring | test("^(members|oncall|priv)$"; "ix")) |  # Only keep paths which end with members, oncall, or priv
  . as $path |   # save each path in the $path variable
  ( $input | getpath($path) ) as $members |  # Get the value at each path from the original input
  {
    "key": ( $path | join("-") ),  # The key is the join of all path keys
    "value": $members              # The value is the list of members
  }
' |
jq -s 'from_entries' |  # collect kv pairs into a full object using slurp
yq --sort-keys -y .     # Convert back to yaml using python-yq
I get output like this:
developer-android-members:
- alice
- bob
developer-android-oncall:
- bob
hr-members:
- charlie
- doug
this-is-really-deep-nesting-members:
- example deep nesting

Create JSON using jq from pipe-separated keys and values in bash

I am trying to create a json object from a string in bash. The string is as follows.
CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0
The output is from the docker stats command, and my end goal is to publish custom metrics to AWS CloudWatch. I would like to format this string as JSON:
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%",
  ....
}
I have used the jq command before and it seems like it should work well in this case, but I have not been able to come up with a good solution yet, other than hardcoding variable names, indexing with sed or awk, and then creating the JSON from scratch. Any suggestions would be appreciated. Thanks.
Prerequisite
For all of the below, it's assumed that your content is in a shell variable named s:
s='CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0'
What (modern jq)
# thanks to @JeffMercado and @chepner for refinements, see comments
jq -Rn '
  ( input  | split("|") ) as $keys |
  ( inputs | split("|") ) as $vals |
  [[$keys, $vals] | transpose[] | {key: .[0], value: .[1]}] | from_entries
' <<<"$s"
How (modern jq)
This requires very new (probably 1.5?) jq to work, and is a dense chunk of code. To break it down:
Using -n prevents jq from reading stdin on its own, leaving the entirety of the input stream available to be read by input and inputs -- the former to read a single line, and the latter to read all remaining lines. (-R, for raw input, causes textual lines rather than JSON objects to be read).
With [$keys, $vals] | transpose[], we're generating [key, value] pairs (in Python terms, zipping the two lists).
With {key:.[0],value:.[1]}, we're making each [key, value] pair into an object of the form {"key": key, "value": value}
With from_entries, we're combining those pairs into objects containing those keys and values.
What (shell-assisted)
This will work with a significantly older jq than the above, and is an easily adopted approach for scenarios where a native-jq solution can be harder to wrangle:
{
  IFS='|' read -r -a keys # read first line into an array of strings
  ## read each subsequent line into an array named "values"
  while IFS='|' read -r -a values; do
    # setup: positional arguments to pass in literal variables, query with code
    jq_args=( )
    jq_query='.'
    # copy values into the arguments, reference them from the generated code
    for idx in "${!values[@]}"; do
      [[ ${keys[$idx]} ]] || continue # skip values with no corresponding key
      jq_args+=( --arg "key$idx" "${keys[$idx]}" )
      jq_args+=( --arg "value$idx" "${values[$idx]}" )
      jq_query+=" | .[\$key${idx}]=\$value${idx}"
    done
    # run the generated command
    jq "${jq_args[@]}" "$jq_query" <<<'{}'
  done
} <<<"$s"
How (shell-assisted)
The invoked jq command from the above is similar to:
jq --arg key0   'CONTAINER' \
   --arg value0 'nginx_container' \
   --arg key1   'CPU%' \
   --arg value1 '0.02%' \
   --arg key2   'MEMUSAGE/LIMIT' \
   --arg value2 '25.09MiB/15.26GiB' \
   '. | .[$key0]=$value0 | .[$key1]=$value1 | .[$key2]=$value2' \
   <<<'{}'
...passing each key and value out-of-band (such that it's treated as a literal string rather than parsed as JSON), then referring to them individually.
Result
Either of the above will emit:
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%",
  "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
  "MEM%": "0.16%",
  "NETI/O": "0B/0B",
  "BLOCKI/O": "22.09MB/4.096kB",
  "PIDS": "0"
}
Why
In short: Because it's guaranteed to generate valid JSON as output.
Consider the following as an example that would break more naive approaches:
s='key ending in a backslash\
value "with quotes"'
Sure, these are unexpected scenarios, but jq knows how to deal with them:
{
  "key ending in a backslash\\": "value \"with quotes\""
}
...whereas an implementation that didn't understand JSON strings could easily end up emitting:
{
  "key ending in a backslash\": "value "with quotes""
}
I know this is an old post, but the tool you seek is called jo: https://github.com/jpmens/jo
A quick and easy example:
$ jo my_variable="simple"
{"my_variable":"simple"}
A little more complex
$ jo -p name=jo n=17 parser=false
{
  "name": "jo",
  "n": 17,
  "parser": false
}
Add an array
$ jo -p name=jo n=17 parser=false my_array=$(jo -a {1..5})
{
  "name": "jo",
  "n": 17,
  "parser": false,
  "my_array": [
    1,
    2,
    3,
    4,
    5
  ]
}
I've made some pretty complex stuff with jo, and the nice thing is that you don't have to worry about rolling your own solution and risking producing invalid JSON.
You can ask docker to give you JSON data in the first place
docker stats --format "{{json .}}"
For more on this, see: https://docs.docker.com/config/formatting/
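For example, a one-liner sketch that takes a single sample (--no-stream, so the command exits) and lets jq collect the per-container lines into one array:
docker stats --no-stream --format '{{json .}}' | jq -s '.'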
JSONSTR=""
declare -a JSONNAMES=()
declare -A JSONARRAY=()
LOOPNUM=0
cat ~/newfile | while IFS='|' read CONTAINER CPU MEMUSE MEMPC NETIO BLKIO PIDS; do
  if [[ "$LOOPNUM" = 0 ]]; then
    JSONNAMES=("$CONTAINER" "$CPU" "$MEMUSE" "$MEMPC" "$NETIO" "$BLKIO" "$PIDS")
    LOOPNUM=$(( LOOPNUM+1 ))
  else
    echo "{ \"${JSONNAMES[0]}\": \"${CONTAINER}\", \"${JSONNAMES[1]}\": \"${CPU}\", \"${JSONNAMES[2]}\": \"${MEMUSE}\", \"${JSONNAMES[3]}\": \"${MEMPC}\", \"${JSONNAMES[4]}\": \"${NETIO}\", \"${JSONNAMES[5]}\": \"${BLKIO}\", \"${JSONNAMES[6]}\": \"${PIDS}\" }"
  fi
done
Returns:
{ "CONTAINER": "nginx_container", "CPU%": "0.02%", "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB", "MEM%": "0.16%", "NETI/O": "0B/0B", "BLOCKI/O": "22.09MB/4.096kB", "PIDS": "0" }
Here is a solution which uses the -R and -s options along with transpose:
split("\n") # [ "CONTAINER...", "nginx_container|0.02%...", ...]
| (.[0] | split("|")) as $keys # [ "CONTAINER", "CPU%", "MEMUSAGE/LIMIT", ... ]
| (.[1:][] | split("|")) # [ "nginx_container", "0.02%", ... ] [ ... ] ...
| select(length > 0) # (remove empty [] caused by trailing newline)
| [$keys, .] # [ ["CONTAINER", ...], ["nginx_container", ...] ] ...
| [ transpose[] | {(.[0]):.[1]} ] # [ {"CONTAINER": "nginx_container"}, ... ] ...
| add # {"CONTAINER": "nginx_container", "CPU%": "0.02%" ...
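Assuming the filter above is saved as filter.jq (a hypothetical file name), it would be invoked with -R and -s roughly like this:
jq -Rs -f filter.jq <<<"$s"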
# note: %% produces a literal % under printf
json_template='{"CONTAINER":"%s","CPU%%":"%s","MEMUSAGE/LIMIT":"%s","MEM%%":"%s","NETI/O":"%s","BLOCKI/O":"%s","PIDS":"%s"}'
json_string=$(printf "$json_template" "nginx_container" "0.02%" "25.09MiB/15.26GiB" "0.16%" "0B/0B" "22.09MB/4.096kB" "0")
echo "$json_string"
This doesn't use jq, but it's possible to use arguments and environment variables in the values:
CONTAINER=nginx_container
json_template='{"CONTAINER":"%s","CPU%%":"%s","MEMUSAGE/LIMIT":"%s","MEM%%":"%s","NETI/O":"%s","BLOCKI/O":"%s","PIDS":"%s"}'
json_string=$(printf "$json_template" "$CONTAINER" "$1" "25.09MiB/15.26GiB" "0.16%" "0B/0B" "22.09MB/4.096kB" "0")
echo "$json_string"
If you're starting with tabular data, I think it makes more sense to use something that works with tabular data natively, like sqawk, to turn it into JSON, and then use jq to work with it further:
echo 'CONTAINER|CPU%|MEMUSAGE/LIMIT|MEM%|NETI/O|BLOCKI/O|PIDS
nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0' \
| sqawk -FS '[|]' -RS '\n' -output json 'select * from a' header=1 \
| jq '.[] | with_entries(select(.key|test("^a.*")|not))'
{
  "CONTAINER": "nginx_container",
  "CPU%": "0.02%",
  "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
  "MEM%": "0.16%",
  "NETI/O": "0B/0B",
  "BLOCKI/O": "22.09MB/4.096kB",
  "PIDS": "0"
}
Without jq, sqawk gives a bit too much:
[
  {
    "anr": "1",
    "anf": "7",
    "a0": "nginx_container|0.02%|25.09MiB/15.26GiB|0.16%|0B/0B|22.09MB/4.096kB|0",
    "CONTAINER": "nginx_container",
    "CPU%": "0.02%",
    "MEMUSAGE/LIMIT": "25.09MiB/15.26GiB",
    "MEM%": "0.16%",
    "NETI/O": "0B/0B",
    "BLOCKI/O": "22.09MB/4.096kB",
    "PIDS": "0",
    "a8": "",
    "a9": "",
    "a10": ""
  }
]

Getting the contents of an unknown key

I am trying to get the value of the 'url' name which sits underneath a name that I do not know up front, e.g. it's not 'name' or 'size', just a string that another tool generates. For example, "x1234" is not known to me by name:
"foo": {
"bar": {
"x1234": {
"url": "http://example.com"
}
}
}
so jq ".foo.bar" returns the "x1234" fragment but what I need is the "url" value underneath it. I've tried many things after reading the docs but I wasn't able to figure out the right syntax.
Can anyone tell me where I'm going wrong?
One approach is to use the recursive-descent operator, "..". For example, provided the input is valid JSON:
$ jq '.. | .url? | select(.)' input.json
"http://example.com"
Or equivalently (and easier to type):
$ jq '.. | .url? // empty' input.json
Assuming foo and bar are known, you could just do
.foo.bar[].url
Let's say if bar is also unknown, then do the following
.foo[][].url
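For instance, assuming the fragment is wrapped in an outer { } to make it valid JSON and saved as input.json:
$ jq '.foo[][].url' input.json
"http://example.com"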
Here is a solution which uses tostream. If filter.jq contains
tostream
| select(length==2) as [$p,$v]
| if $p[-1] == "url" and ($v|endswith(".zip")) then $v else empty end
and if data.json contains the following (note: outer { } added to make the example legal JSON, and a second entry added to demonstrate excluding values not ending in .zip, as asked in a follow-up comment to peak's answer)
{
  "foo": {
    "bar": {
      "x1234": {
        "url": "http://example.com"
      },
      "x1234xxx": {
        "url": "http://example.com/file.zip"
      }
    }
  }
}
then the command
jq -M -f filter.jq data.json
produces
"http://example.com/file.zip"