Add element outside of array to csv output - json

I have a JSON structure similar to this one:
{
"results": [
{
"title": "Page 1",
"content": {
"id": "11111"
}
},
{
"title": "Page 2",
"content": {
"id": "22222"
}
},
{
"title": "Page 3",
"content": {
"id": "33333"
}
}
],
"start": 0,
"limit": 25,
"size": 3,
"totalSize": 3,
"query": "Hello World"
}
The output I need is:
"Page 1","11111","Hello World"
"Page 2","22222","Hello World"
"Page 3","33333","Hello World"
I can get the elements from the array with:
cat my.json | jq -r '.results[] | [.title, .content.id] | @csv'
But how do I add the "query" element which is outside of the array to each line in the output?
I tried a bunch of options but I can't get it to work.

Bind the value of .query to a variable and reference it when building each row:
.query as $q | .results[] | [.title, .content.id, $q] | @csv
Or build the two arrays separately and concatenate them. The parentheses around the .results[] expression limit its scope, so the [.query] that follows is still evaluated against the top-level object:
(.results[] | [.title, .content.id]) + [.query] | @csv
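Both variants can be checked side by side. The following is a minimal sketch, assuming jq 1.5 or later is on the PATH; the trimmed input and the shell variable names (json, out_var, out_paren) are made up for illustration.

```shell
# Trimmed-down version of the input from the question (two results only).
json='{"results":[{"title":"Page 1","content":{"id":"11111"}},{"title":"Page 2","content":{"id":"22222"}}],"query":"Hello World"}'

# Variant 1: bind .query to $q before descending into .results[].
out_var=$(printf '%s' "$json" | jq -r '.query as $q | .results[] | [.title, .content.id, $q] | @csv')

# Variant 2: keep .results[] inside parentheses so that [.query] still sees the top-level object.
out_paren=$(printf '%s' "$json" | jq -r '(.results[] | [.title, .content.id]) + [.query] | @csv')

printf '%s\n' "$out_var"
```

Both variables should hold the same two CSV rows, each ending in "Hello World".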

Related

Remove last character from json output using JQ

I have a json that looks like this:
{
"HostedZones": [
{
"ResourceRecordSetCount": 2,
"CallerReference": "test20150527-2",
"Config": {
"Comment": "test2",
"PrivateZone": true
},
"Id": "/hostedzone/Z119WBBTVP5WFX",
"Name": "dev.devx.company.services."
},
{
"ResourceRecordSetCount": 2,
"CallerReference": "test20150527-1",
"Config": {
"Comment": "test",
"PrivateZone": true
},
"Id": "/hostedzone/Z3P5QSUBK4POTI",
"Name": "test.devx.company.services."
}
],
"IsTruncated": false,
"MaxItems": "100"
}
And my goal is to fetch a specific Name (in my case it's the test.devx.company.services), however the Name field contains an extra "." at the end that I'd like to remove from the output.
This is what I have so far:
jq --raw-output '.HostedZones[] | select(.Name | test("test")?) | (.Name[:-1] | sub("."; ""))'
The problem is that it also removes the first character from the output.
So the output currently is: est.devx.company.services
Not sure what I'm doing wrong :/
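In sub("."; ""), the unescaped . is a regex that matches any character, so it removes the first character of the string. To always remove the last character of any Name that contains "test":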
jq '(.HostedZones[].Name | select(contains("test"))) |= .[:-1]'
To remove it only if it is a dot:
jq '(.HostedZones[].Name | select(contains("test"))) |= sub("[.]$"; "")'
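The two update filters above print the whole document with the modified Name. If only the name itself is wanted, as in the question, something along these lines may be closer. It is a sketch that assumes jq 1.5 or later (for sub) and uses a trimmed, hypothetical version of the input; the variable names buggy and name are illustrative only.

```shell
# Only the Name fields are kept from the original input.
json='{"HostedZones":[{"Name":"dev.devx.company.services."},{"Name":"test.devx.company.services."}]}'

# The question's approach: the slice drops the trailing dot, then sub(".") drops one more character.
buggy=$(printf '%s' "$json" | jq -r '.HostedZones[].Name | select(contains("test")) | .[:-1] | sub("."; "")')

# "[.]$" matches only a literal dot at the very end of the string.
name=$(printf '%s' "$json" | jq -r '.HostedZones[].Name | select(contains("test")) | sub("[.]$"; "")')

printf '%s\n' "$buggy" "$name"
```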

Print only one property of an object that is within an array attribute as well as a property that is a sibling to the array property in jq

I have a json file that looks like so:
[
{
"code": "1234",
"files": [
{
"fileType": "pdf",
"url": "http://.../a.pdf"
},
{
"fileType": "video",
"url": "http://.../b.mp4"
}
]
},
{
"code": "4321",
"files": [
{
"fileType": "pdf",
"url": "http://.../c.pdf"
},
{
"fileType": "video",
"url": "http://.../d.mp4"
}
]
},
{
"code": "9999",
"files": [
{
"fileType": "pdf",
"url": "http://.../e.pdf"
}
]
}
]
I would like to print out only the files that are of fileType == video in the files array such that I end up with output that looks like so:
1234, "http://.../b.mp4"
4321, "http://.../d.mp4"
So far I am only able to output something that looks like this:
1234, "http://.../a.pdf", "http://.../b.mp4",
4321, "http://.../c.pdf", "http://.../d.mp4"
Using the following:
jq -r '.[] | select(.files[]?.fileType == "video") | [.code, .files[].url] | @csv'
I was wondering how I can filter the .files[] based on the fileType as I am outputting them?
The following pipeline makes the solution fairly self-explanatory, assuming one understands the basic syntax and the -r command-line option:
< input.json jq -r '
.[]
| .code as $code
| .files[]
| select(.fileType == "video")
| "\($code), \"\(.url)\""
'
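In the question's attempt, select passes the whole object through, so the later .files[].url still lists every URL. Filtering after descending into .files[] avoids that. If quoted CSV rows are acceptable instead of the hand-built string, a variant such as the following might work. It is a sketch assuming jq is installed, with a shortened hypothetical input and an illustrative variable name (video_rows).

```shell
# Two of the entries from the question; "9999" has no video file.
json='[{"code":"1234","files":[{"fileType":"pdf","url":"http://.../a.pdf"},{"fileType":"video","url":"http://.../b.mp4"}]},{"code":"9999","files":[{"fileType":"pdf","url":"http://.../e.pdf"}]}]'

# Remember the code, descend into the files, keep only videos, then format as CSV.
video_rows=$(printf '%s' "$json" | jq -r '.[] | .code as $code | .files[] | select(.fileType == "video") | [$code, .url] | @csv')

printf '%s\n' "$video_rows"
```

Note that @csv quotes both fields, so the code appears as "1234" rather than as a bare 1234.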

How can I filter by a numeric field using jq?

I am writing a script to query the Bitbucket API and delete SNAPSHOT artifacts that have never been downloaded. The script is failing because it gets ALL snapshot artifacts; the select for the number of downloads does not appear to be working.
What is wrong with my select statement to filter objects by the number of downloads?
Of course the more direct solution here would be if I could just query the Bitbucket API with a filter. To the best of my knowledge the API does not support filtering by downloads.
My script is:
#!/usr/bin/env bash
curl -X GET --user "me:mykey" "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads?pagelen=100" > downloads.json
# get all values | reduce the set to just be name and downloads | select entries where downloads is zero | select entries where name contains SNAPSHOT | just get the name
#TODO i screwed up the selection somewhere its returning files that contain SNAPSHOT regardless of number of downloads
jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads==0) | select(.name | contains("SNAPSHOT")) | .name' downloads.json > snapshots_without_any_downloads.js
#unique sort, not sure why jq gives me multiple values
sort -u snapshots_without_any_downloads.js | tr -d '"' > unique_snapshots_without_downloads.js
cat unique_snapshots_without_downloads.js | xargs -t -I % curl -Ss -X DELETE --user "me:mykey" "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/%" > deleted_files.txt
A deidentified sample of the raw input from the API is:
{
"pagelen": 10,
"size": 40,
"values": [
{
"name": "myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip"
}
},
"downloads": 2,
"created_on": "2018-03-15T17:50:00.157310+00:00",
"user": {
"username": "me",
"display_name": "me",
"type": "user",
"uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/users/me"
},
"html": {
"href": "https://bitbucket.org/me/"
},
"avatar": {
"href": "https://bitbucket.org/account/me/avatar/32/"
}
}
},
"type": "download",
"size": 430894
},
{
"name": "myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.0.zip"
}
},
"downloads": 0,
"created_on": "2018-03-15T17:50:00.157310+00:00",
"user": {
"username": "me",
"display_name": "me",
"type": "user",
"uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/users/me"
},
"html": {
"href": "https://bitbucket.org/me/"
},
"avatar": {
"href": "https://bitbucket.org/account/me/avatar/32/"
}
}
},
"type": "download",
"size": 430894
},
{
"name": "myproject_1.0_mc_3.5.1.zip",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads/myproject_1.1-SNAPSHOT_0210f77_mc_3.5.1.zip"
}
},
"downloads": 5,
"created_on": "2018-03-15T17:49:14.885544+00:00",
"user": {
"username": "me",
"display_name": "me",
"type": "user",
"uuid": "{3051ec5f-cc92-4bc3-b291-38189a490a89}",
"links": {
"self": {
"href": "https://api.bitbucket.org/2.0/users/me"
},
"html": {
"href": "https://bitbucket.org/me/"
},
"avatar": {
"href": "https://bitbucket.org/account/me/avatar/32/"
}
}
},
"type": "download",
"size": 430934
}
],
"page": 1,
"next": "https://api.bitbucket.org/2.0/repositories/myemployer/myproject/downloads?pagelen=10&page=2"
}
The output I want from this snippet is myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip - that artifact is a SNAPSHOT and has zero downloads.
I have used this intermediate step to do some debugging:
jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads>0) | select(.name | contains("SNAPSHOT")) | unique' downloads.json > snapshots_with_downloads.js
jq '.values | {name: .[].name, downloads: .[].downloads} | select(.downloads==0) | select(.name | contains("SNAPSHOT")) | .name' downloads.json > snapshots_without_any_downloads.js
#this returns the same values for each list!
diff unique_snapshots_with_downloads.js unique_snapshots_without_downloads.js
This adjustment gives a cleaner, unique structure. It suggests that there's some sort of splitting or streaming aspect of jq that I do not fully understand:
#this returns a "unique" array like I expect, adding select to this still does not produce the desired outcome
jq '.values | [{name: .[].name, downloads: .[].downloads}] | unique' downloads.json
The data after this step looks like this. It just removed the cruft I didn't need from the raw API response:
[
{
"name": "myproject_1.0_2400a51_mc_3.4.0.zip",
"downloads": 0
},
{
"name": "myproject_1.0_2400a51_mc_3.4.1.zip",
"downloads": 2
},
{
"name": "myproject_1.1-SNAPSHOT_391f4d5_mc_3.5.0.zip",
"downloads": 0
},
{
"name": "myproject_1.1-SNAPSHOT_391f4d5_mc_3.5.1.zip",
"downloads": 2
}
]
As I understand it:
You want globally unique outputs
You want only items with downloads==0
You want only items whose name contains "SNAPSHOT"
The following will accomplish that:
jq -r '
[.values[] | {(.name): .downloads}]
| add
| to_entries[]
| select(.value == 0)
| .key | select(contains("SNAPSHOT"))'
Rather than making unique an explicit step, this version generates a map from names to download counters by merging the per-item objects with add (which means that in case of conflicts, the last one wins), thereby ensuring that the outputs are unique.
Given your test JSON, output is:
myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip
The same strategy can be used to simplify the script as a whole:
jq -r '[.values[] | {(.links.self.href): .downloads}] | add | to_entries[] | select(.value == 0) | .key | select(contains("SNAPSHOT"))'
Acting on the URL of the file rather than on its name alone simplifies the subsequent DELETE call, and the sort and tr calls can be removed.
Here's a solution which sums up the .downloads values per .name before making the selection based on the total number of downloads:
reduce (.values[] | select(.name | contains("SNAPSHOT"))) as $v
({}; .[$v.name] += $v.downloads)
| with_entries(select(.value == 0))
| keys_unsorted[]
Example:
$ jq -r -f program.jq input.json
myproject_1.1-SNAPSHOT_thanks_for_the_reminder_charles_duffy_mc_3.5.0.zip
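The two answers only differ when the same name appears more than once. The sketch below uses a made-up input with a duplicated name to show the difference, assuming jq 1.5 or later (for keys_unsorted); the file names and the variables last_wins and summed are hypothetical.

```shell
# "x-SNAPSHOT.zip" appears twice: once with 3 downloads, once with 0.
json='{"values":[{"name":"x-SNAPSHOT.zip","downloads":3},{"name":"x-SNAPSHOT.zip","downloads":0},{"name":"y-SNAPSHOT.zip","downloads":0}]}'

# Merging with add: the later entry (0 downloads) overwrites the earlier one (3 downloads).
last_wins=$(printf '%s' "$json" | jq -r '[.values[] | {(.name): .downloads}] | add | to_entries[] | select(.value == 0) | .key | select(contains("SNAPSHOT"))')

# Summing with reduce: "x-SNAPSHOT.zip" totals 3 downloads and is therefore excluded.
summed=$(printf '%s' "$json" | jq -r 'reduce (.values[] | select(.name | contains("SNAPSHOT"))) as $v ({}; .[$v.name] += $v.downloads) | with_entries(select(.value == 0)) | keys_unsorted[]')

printf 'last wins:\n%s\nsummed:\n%s\n' "$last_wins" "$summed"
```

Which behavior is preferable depends on whether duplicate names in the API response should be treated as one artifact.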
p.s.
What is wrong with my select statement ...?
The problem that jumps out is the bit of the pipeline just before the "select" filter:
.values | {name: .[].name, downloads: .[].downloads}
The use of .[] in this manner results in the Cartesian product being formed -- that is, the above expression will emit n*n JSON objects, where n is the length of .values. You evidently intended to write:
.values[] | {name: .name, downloads: .downloads}
which can be abbreviated to:
.values[] | {name, downloads}
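The Cartesian product is easy to see on a tiny input. This is a sketch with hypothetical data (two entries, names invented for illustration), assuming jq is installed; product_count and fixed are illustrative variable names.

```shell
# Two entries, so the faulty expression should produce 2*2 = 4 objects.
json='{"values":[{"name":"a-SNAPSHOT","downloads":2},{"name":"b-SNAPSHOT","downloads":0}]}'

# Each .[] iterates independently, pairing every name with every download count.
product_count=$(printf '%s' "$json" | jq '[.values | {name: .[].name, downloads: .[].downloads}] | length')

# Iterating once with .values[] keeps each name with its own download count.
fixed=$(printf '%s' "$json" | jq -r '.values[] | {name, downloads} | select(.downloads == 0) | select(.name | contains("SNAPSHOT")) | .name')

printf '%s\n%s\n' "$product_count" "$fixed"
```

With the faulty expression, "a-SNAPSHOT" gets paired with downloads 0 as well, which is why names with downloads showed up in both lists.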

How do I use jq to pull the newest HUE Scene from the JSON dump?

I am trying to use jq to find the ID of a Hue scene when I pass the scene name. The problem is that updating a scene creates another scene with a new ID, so as I make changes to the scene, more than one result is returned. How do I find the newest scene? I see there is a field named lastupdated.
Here is what I have so far:
curl -s ${BASEURL}/scenes/ | /usr/local/bin/jq -r -e --arg SCENENAME "${SCENENAME}" '. as $object | keys[] | select($object[.].name == $SCENENAME)'
Here is what the json output looks like:
"FUX9A2m4LcuF6YG": {
"name": "KitchenDay",
"lights": [
"6",
"7",
"10",
"11"
],
"owner": "43594f081bb6d23e9ccd254927fa47",
"recycle": true,
"locked": false,
"appdata": {},
"picture": "",
"lastupdated": "2018-02-25T03:35:57",
"version": 2 }
In this response, I'll assume the following input, which seems to capture the essence of the problem:
{
"FUX9A2m4LcuF6YG": {
"name": "KitchenDay",
"lastupdated": "2018-02-25T03:35:57"
},
"later": {
"name": "KitchenNight",
"lastupdated": "2018-02-25T23:35:57"
}
}
With that as input, the following filter:
to_entries | [max_by(.value.lastupdated)] | from_entries
produces:
{
"later": {
"name": "KitchenNight",
"lastupdated": "2018-02-25T23:35:57"
}
}
The key here is that the max of the "lastupdated" field corresponds to the most recent time.
If you just want the key name, you could instead write:
to_entries | max_by(.value.lastupdated) | .key
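To tie this back to the question, the name filter can be applied before taking the maximum. The following is a sketch under assumptions: the scene IDs ("old", "new", "other") and timestamps are invented, SCENENAME is passed with --arg as in the question, and jq is assumed to be installed.

```shell
# Two scenes share the name "KitchenDay"; "new" has the later lastupdated value.
json='{"old":{"name":"KitchenDay","lastupdated":"2018-02-25T03:35:57"},"new":{"name":"KitchenDay","lastupdated":"2018-03-01T10:00:00"},"other":{"name":"KitchenNight","lastupdated":"2018-04-01T00:00:00"}}'

# Keep only entries with the requested name, then pick the most recently updated one.
scene_id=$(printf '%s' "$json" | jq -r --arg SCENENAME "KitchenDay" 'to_entries | map(select(.value.name == $SCENENAME)) | max_by(.value.lastupdated) | .key')

printf '%s\n' "$scene_id"
```

This relies on the timestamps being in a format that sorts correctly as plain strings, as the ISO-style values in the sample do. If no scene matches, max_by receives an empty array and the filter prints null.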

create an object from an existing json file using 'jq'

I have a messages.json file
[
{
"id": "title",
"description": "This is the Title",
"defaultMessage": "title",
"filepath": "src/title.js"
},
{
"id": "title1",
"description": "This is the Title1",
"defaultMessage": "title1",
"filepath": "src/title1.js"
},
{
"id": "title2",
"description": "This is the Title2",
"defaultMessage": "title2",
"filepath": "src/title2.js"
},
{
"id": "title2",
"description": "This is the Title2",
"defaultMessage": "title2",
"filepath": "src/title2.js"
}
]
I want to create an object
{
"title": "Dummy1",
"title1": "Dummy2",
"title2": "Dummy3",
"title3": "Dummy4"
}
from the top one.
So far I have
jq '.[] | .id' src/messages.json;
And it does give me the IDs
How do I add some random text and make the new object as above?
Can we also create a new JSON file and write the newly created object onto it using jq?
Your output included "title3" so I'll assume that you intended that the second occurrence of "title2" in the input was supposed to refer to "title3".
With this assumption, the following jq program seems to do what you want:
map( .id )
| . as $in
| reduce range(0;length) as $i ({};
. + {($in[$i]): "dummy\(1+$i)"})
In words, extract the values of .id, and then turn each into an object of the form: {(.id) : "dummy\(1+$i)"}
This uses string interpolation, and produces:
{
"title": "dummy1",
"title1": "dummy2",
"title2": "dummy3",
"title3": "dummy4"
}
reduce-free solution
map(.id )
| [., [range(0;length)]]
| transpose
| map( {(.[0]): "dummy\(.[1]+1)"})
| add
Output
Can we also create a new json file and write the newly created object onto it using jq?
Yes, just use output redirection:
jq -f program.jq messages.json > output.json
Addendum
I want a parent object "de" to the already created json file objects
You could just pipe either of the above solutions to: {de: .}
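Put together, the reduce-based program with the extra wrapper might look like the sketch below. It assumes jq is installed and uses a trimmed input in which the fourth id has already been corrected to "title3"; the variable name wrapped is illustrative.

```shell
# Only the id fields are kept; the last id is assumed to be "title3".
json='[{"id":"title"},{"id":"title1"},{"id":"title2"},{"id":"title3"}]'

# Build the id-to-dummy map, then wrap it in a parent object keyed "de".
wrapped=$(printf '%s' "$json" | jq -c 'map(.id) | . as $in | reduce range(0;length) as $i ({}; . + {($in[$i]): "dummy\(1+$i)"}) | {de: .}')

printf '%s\n' "$wrapped"
```

Redirecting the jq output to a file, as shown earlier, would write the wrapped object to disk.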