I am currently learning how to use jq. I have a JSON object, and I am able to loop over it and read values with jq, e.g. cat test.json | jq -r '.test'. However, I am running into some complexity: I only want to display values that have outdated or deprecated = true, but omit entries from the result if release = never-update-app or never-upgrade-app. I tried cat test.json | jq '.test | select(.[].outdated==true)', but this does not return the desired results. Is this possible with jq, and can it display in the desired format below?
My desired output is shown below:
Release Name Installed Latest Old Deprecated
test-app 1.0.0 2.0.0 true false
json:
{
"test": [{
"release": "myapp1",
"Installed": {
"version": "0.3.0",
"appVersion": "v1.2.6"
},
"Latest": {
"version": "",
"appVersion": ""
},
"outdated": false,
"deprecated": false
}, {
"release": "myapp2",
"Installed": {
"version": "6.5.13",
"appVersion": "1.9.1"
},
"Latest": {
"version": "",
"appVersion": ""
},
"outdated": false,
"deprecated": false
}, {
"release": "test-app",
"Installed": {
"version": "1.0.0",
"appVersion": ""
},
"Latest": {
"version": "2.0.0",
"appVersion": ""
},
"outdated": true,
"deprecated": false
}, {
"release": "never-update-app",
"Installed": {
"version": "1.0.0",
"appVersion": ""
},
"Latest": {
"version": "3.0.0",
"appVersion": ""
},
"outdated": true,
"deprecated": false
}, {
"release": "never-upgrade-app",
"Installed": {
"version": "2.0.0",
"appVersion": ""
},
"Latest": {
"version": "2.0.0",
"appVersion": ""
},
"outdated": false,
"deprecated": true
}]
}
How to act on each element in an array
You have an array in .test. If you want to extract the individual elements of that array to use in the rest of your filter, use .test[]. If you want to process them but keep the results in an array, use .test|map(...) where ... is the filter you want to apply to the elements of the array.
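For example, with the sample data above, the two forms differ in the shape of the output:

jq '.test[].release' test.json          # a stream of five strings, one per line
jq '.test | map(.release)' test.json    # a single five-element array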
How to filter objects with multiple conditions
As described in this question you can use the select() filter with multiple conditions. So, for example:
jq '.test[]|
select(
(.outdated or .deprecated) and
.release != "never-update-app" and
.release != "never-upgrade-app"
)'
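Against the sample data above, this selects only the test-app entry:

{
  "release": "test-app",
  "Installed": {
    "version": "1.0.0",
    "appVersion": ""
  },
  "Latest": {
    "version": "2.0.0",
    "appVersion": ""
  },
  "outdated": true,
  "deprecated": false
}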
How to combine non-trivial filters
Use parentheses! As an example, suppose you want to replace the two "never-" tests with a startswith test. By surrounding the piped steps with parentheses, you isolate the calculation and get just the true/false answer to combine in a boolean expression.
jq '.test[]|
select(
(.outdated or .deprecated) and
(.release|startswith("never-")|not)
)'
How to format results in a table
I think this answer on formatting results in a table is probably sufficient; if you have more specific needs, you might want to spin this off into a separate question.
Filter
(["ReleaseName","Installed","Latest","Old","Deprecated"] | #tsv), (.test[] | select( (.outdated==true or .deprecated==true) and ((.release=="never-update-app" or .release=="never-upgrade-app") | not) ) | "\(.release) \(.Installed.version) \(.Latest.version) \(.outdated) \(.deprecated)" / " " | #tsv)
Output
ReleaseName Installed Latest Old Deprecated
test-app 1.0.0 2.0.0 true false
Demo
https://jqplay.org/s/B4a5cxlL28
Preface: If the following is not possible with jq, then I completely accept that as an answer and will try to force this with bash.
I have two files that contain some IDs that, with some massaging, should be able to be combined into a single file. I have some static content that I'll add to that as well (as seen in the output). Essentially, "mitre_test" in in1.json should be compared to "mitre_test" in in2.json. When they match, the "mitreid" from in1.json becomes techniqueID in the output (and is generally the unifying field of each output object).
Caveats:
There are some junk "desc" values in in2.json that are there to make sure this is as programmatic as possible; there are actually numerous junk inputs in the true input file I am using.
Some of the mitre_test values hold comma-separated pairs rather than a real array. I can split on those and break them out, but I find myself losing the other information from in1.json.
Notice that the "metadata" in the output contains the "number" values from in2.json, stored in a somewhat odd way (but the way that the receiving tool requires).
in1.json
[
{
"test": "Execution",
"mitreid": "T1204.001",
"mitre_test": "90b"
},
{
"test": "Defense Evasion",
"mitreid": "T1070.001",
"mitre_test": "afa"
},
{
"test": "Credential Access",
"mitreid": "T1556.004",
"mitre_test": "14b"
},
{
"test": "Initial Access",
"mitreid": "T1200",
"mitre_test": "f22"
},
{
"test": "Impact",
"mitreid": "T1489",
"mitre_test": "fa2"
}
]
in2.json
[
{
"number": "REL0001346",
"desc": "apple",
"mitre_test": "afa"
},
{
"number": "REL0001343",
"desc": "pear",
"mitre_test": "90b"
},
{
"number": "REL0001366",
"desc": "orange",
"mitre_test": "14b,f22"
},
{
"number": "REL0001378",
"desc": "pineapple",
"mitre_test": "90b"
}
]
The output:
[{
"techniqueID": "T1070.001",
"tactic": "defense-evasion",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001346"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1204.001",
"tactic": "execution",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001343"
},
{
"name": "DET_ID",
"value": "REL0001378"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1556.004",
"tactic": "credential-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
},
{
"techniqueID": "T1200",
"tactic": "initial-access",
"score": 1,
"color": "",
"comment": "",
"enabled": true,
"metadata": [{
"name": "DET_ID",
"value": "REL0001366"
}],
"showSubtechniques": true
}
]
I'm assuming I have some splitting to do on mitre_test with something like .mitre_test |= split(","), and some joins as well, but my attempts cause data loss or mix up the data. You'll notice the static data in the output as well; it's easy to put in place and isn't much of an issue.
Edit: reduced some of the match IDs so that it is easier to look at while analyzing the in1 and in2 files. Also simplified the two inputs to have a similar structure so that the answer is easier to understand later.
The requirements are somewhat opaque, but it's fairly clear that if the task can be done by computer, it can be done using jq.
From the description, it would appear that one of the unusual aspects of the problem is that the "dictionary" used for the lookup must be derived from in2.json by splitting the key names that are CSV (comma-separated values). Here therefore is a jq def that will do that:
# Input: a JSON dictionary for which some keys are CSV,
# Output: a JSON dictionary with the CSV keys split on the commas
def refine:
. as $in
| reduce keys_unsorted[] as $k ({};
if ($k|index(","))
then ($k/",") as $keys
| . + ($keys | map( {(.): $in[$k]}) | add)
else .[$k] = $in[$k]
end );
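To see what refine does in isolation: a made-up dictionary such as {"a,b": 1, "c": 2} refines to

{
  "a": 1,
  "b": 1,
  "c": 2
}

with the value duplicated under each of the split keys.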
You can see how this works by running:
INDEX($mitre[]; .mitre_test) | refine
using an invocation of jq such as:
jq --argfile mitre in2.json -f program.jq in1.json
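With the sample data above, this produces a dictionary mapping each individual mitre_test value to an in2.json record. Note that INDEX keeps only the last record for a duplicated key such as "90b":

{
  "afa": { "number": "REL0001346", "desc": "apple", "mitre_test": "afa" },
  "90b": { "number": "REL0001378", "desc": "pineapple", "mitre_test": "90b" },
  "14b": { "number": "REL0001366", "desc": "orange", "mitre_test": "14b,f22" },
  "f22": { "number": "REL0001366", "desc": "orange", "mitre_test": "14b,f22" }
}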
For the joining part of the problem, there are many relevant Q&As on SO, e.g.
How to join JSON objects on particular fields using jq?
There is probably a much more elegant way to do this, but I ended up manually walking through the data and piping to new output.
Explanation:
Read in both files, pull the fields I need.
Break out the mitre_test values that were previously just a comma-separated set of values, using map and try.
Store the non-changing fields in a variable, and then manipulate mitre_test to become an appropriately split array, removing nulls.
Group by mitre_test values, since they are the common thing that the output is based on.
Cleanup more nulls.
Sort output to look like I want it.
jq '.[] | {number: .number, test: .test, mitreid: .mitreid, mitre_test: .mitre_test}' in1.json in2.json |
jq -s '[ map(try(.mitre_test |= split(",")) // .)
         | .[]
         | [.number, .test, .mitreid] as $h
         | .mitre_test[]
         | $h + [.]
         | {DET_ID: .[0], tactic: .[1], techniqueID: .[2], mitre_test: .[3]} ]
       | del(.[][] | nulls)' |
jq '[ group_by(.mitre_test)[]
      | {mitre_test: .[0].mitre_test,
         techniqueID: [.[].techniqueID],
         tactic: [.[].tactic],
         DET_ID: [.[].DET_ID]} ]
    | del(.[].techniqueID[] | nulls)
    | del(.[].tactic[] | nulls)
    | del(.[].DET_ID[] | nulls)' |
jq '.[]
    | [{techniqueID: .techniqueID[0],
        tactic: .tactic[0],
        metadata: [{name: "DET_ID", value: .DET_ID[]}]}]
    | .[]
    | select((.metadata | length) > 0)'
It was one long command line, so I split it among the basic steps.
Here is a simplified JSON file of a Terraform state file (let's call it dev.tfstate):
{
"version": 4,
"terraform_version": "0.12.9",
"serial": 2,
"lineage": "ba56cc3e-71fd-1488-e6fb-3136f4630e70",
"outputs": {},
"resources": [
{
"module": "module.rds.module.reports_cpu_warning",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds.module.reports_lag_warning",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds.module.cross_region_replica_lag_alert",
"mode": "managed",
"type": "datadog_monitor",
"name": "alert",
"each": "list",
"provider": "module.rds.provider.datadog",
"instances": []
},
{
"module": "module.rds",
"mode": "managed",
"type": "aws_db_instance",
"name": "master",
"provider": "provider.aws",
"instances": [
{
"schema_version": 0,
"attributes": {
"address": "dev-database.123456.us-east-8.rds.amazonaws.com",
"allocated_storage": 10,
"password": "",
"performance_insights_enabled": false,
"tags": {
"env": "development"
},
"timeouts": {
"create": "6h",
"delete": "6h",
"update": "6h"
},
"timezone": "",
"username": "admin",
"vpc_security_group_ids": [
"sg-1234"
]
},
"private": ""
}
]
}
]
}
There are many modules at the same level as module.rds inside resources. I took many of them out to create this simplified version of the raw data. The key takeaway: do not assume the array index will be constant in all cases.
I wanted to extract the password field in the above example.
My first attempt was to use an equality check to extract the relevant modules:
jq '.resources[].module == "module.rds"' dev.tfstate
but that just produced a list of boolean values; I don't see any mention of builtin functions like filter in jq's manual.
Then I tried to just access the field:
jq '.resources[].module[].attributes[].password?' dev.tfstate
which throws the following error:
jq: error (at dev.tfstate:1116): Cannot iterate over string ("module.rds")
So what is the best way to extract the value? Ideally it would focus on the password attribute in the module.rds module only.
Edit:
My purpose is to detect whether a password has been left inside a state file. I want to ensure that passwords are stored exclusively in AWS Secrets Manager.
You can extract the module you want like this.
jq '.resources[] | select(.module == "module.rds")'
I'm not confident that I understand the requirements for the rest of the solution. So this might not only not be the best way of doing what you want; it might not do what you want at all!
If you know where password will be, you can do this.
jq '.resources[] | select(.module == "module.rds") | .instances[].attributes.password'
If you don't know exactly where password will be, this is a way of finding it.
jq '.resources[] | select(.module == "module.rds") | .. | .password? | values'
According to the manual under the heading "Recursive Descent," ..|.a? will "find all the values of object keys “a” in any object found “below” ."
values filters out the null results.
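As a quick sanity check against the simplified state file above:

jq '.resources[] | select(.module == "module.rds") | .. | .password? | values' dev.tfstate
""

The empty string is expected here because password is blank in the sample; against a real state file this would print any stored password.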
You could also get the password value out of the state file without jq by using Terraform outputs. Your module should define an output with the value you want to output and you should also output this at the root module.
Without seeing your Terraform code you'd want something like this:
modules/rds/main.tf
resource "aws_db_instance" "master" {
# ...
}
output "password" {
value = aws_db_instance.master.password
sensitive = true
}
example/main.tf
module "rds" {
source = "../modules/rds"
# ...
}
output "rds_password" {
value = module.rds.password
sensitive = true
}
The sensitive = true parameter means that Terraform won't print the output to stdout when running terraform apply but it's still held in plain text in the state file.
To then access this value without jq you can use the terraform output command which will retrieve the output from the state file and print it to stdout. From there you can use it however you want.
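For example, with the rds_password output defined above:

terraform output rds_password

On newer Terraform versions, terraform output -raw rds_password prints the bare string without quotes.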
Here is my JSON file test.json:
[
{
"name": "nodejs",
"version": "0.1.21",
"apiVersion": "v1"
},
{
"name": "nodejs",
"version": "0.1.20",
"apiVersion": "v1"
},
{
"name": "nodejs",
"version": "0.1.11",
"apiVersion": "v1"
},
{
"name": "nodejs",
"version": "0.1.9",
"apiVersion": "v1"
},
{
"name": "nodejs",
"version": "0.1.8",
"apiVersion": "v1"
}
]
When I use max_by, jq returns 0.1.9 instead of 0.1.21, probably because the quoted values are compared as strings:
cat test.json | jq 'max_by(.version)'
{
"name": "nodejs",
"version": "0.1.9",
"apiVersion": "v1"
}
How can I get the element with version=0.1.21 ?
Semantic version comparison is not supported out of the box in jq; max_by compares the version strings lexicographically, which is why "0.1.9" sorts above "0.1.21". You need to split the version string on "." and compare the numeric fields.
jq 'sort_by(.version | split(".") | map(tonumber))[-1]'
The split(".") takes the string from .version and creates an array of fields i.e. 0.1.21 becomes an array of [ "0", "1", "21"] and map(tonumber) takes an input array and transforms the string elements to an array of digits.
The sort_by() function does a index wise comparison for each of the elements in the array generated from last step and sorts in the ascending order with the object containing the version 0.1.21 at the last. The notation [-1] is to get the last object from this sorted array.
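For the sample file, the full invocation and its result:

jq 'sort_by(.version | split(".") | map(tonumber))[-1]' test.json
{
  "name": "nodejs",
  "version": "0.1.21",
  "apiVersion": "v1"
}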
Here's an adaptation of the more general answer using jq at
How to sort Artifactory package search result by version number with JFrog CLI?
def parse:
[splits("[-.]")]
| map(tonumber? // .) ;
max_by(.version|parse)
As a less robust one-liner:
max_by(.version | [splits("[.]")] | map(tonumber))
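The tonumber? // . fallback in parse is what makes it more robust than the one-liner: non-numeric segments are kept as strings instead of causing an error. For example, a hypothetical pre-release tag:

jq -n '"1.2.3-alpha" | [splits("[-.]")] | map(tonumber? // .)'
[
  1,
  2,
  3,
  "alpha"
]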
I'm looking for a way to parse raw JSON into CSV, and I'm a total novice with anything related to coding or programming. I've found a site, https://json-csv.com/, that does exactly what I need, but the data sets I'm parsing are bigger than its free limit, so I'm paying $10 a month for something I believe could be done with a macro or something I could figure out.
I'm essentially looking for a quick way to parse the chunk below into structured, column-based detail. The columns would be: Key, Value, Context_Geography, Context_CompanyID, Context_ProductID, Description, Created by, Updated by, updated date.
{"policies":[{"key":"viaPayEnabledRates","value":"","context":{"geography":"","companyID":"","productID":""},"created_by":"0","updated_by":"0","updated_date":"2014-03-24T21:22:25.420+0000"},{"key":"viaPayEnabledRates","value":"[\"WSPNConsortia\",\"WSPNNegotiated\",\"WSPNPublished\"]","context":{"geography":"","companyID":"*","productID":"60003"},"description":"Central Payment Pilot","created_by":"10130590","updated_by":"10130590","updated_date":"2016-04-05T07:51:29.043+0000"}
Here is a solution using jq
If the file filter.jq contains
def headers:
[
"Key", "Value", "Context_Geography", "Context_CompanyID", "Context_ProductID",
"Description", "Created by", "Updated by", "updated date"
]
;
def fields:
[
.key, .value, .context.geography, .context.companyID, .context.productID,
.description, .created_by, .updated_by, .updated_date
]
;
headers, (.policies[] | fields)
| @csv
and the file data.json contains your sample data
{
"policies": [
{
"key": "viaPayEnabledRates",
"value": "",
"context": {
"geography": "",
"companyID": "",
"productID": ""
},
"created_by": "0",
"updated_by": "0",
"updated_date": "2014-03-24T21:22:25.420+0000"
},
{
"key": "viaPayEnabledRates",
"value": "[\"WSPNConsortia\",\"WSPNNegotiated\",\"WSPNPublished\"]",
"context": {
"geography": "",
"companyID": "*",
"productID": "60003"
},
"description": "Central Payment Pilot",
"created_by": "10130590",
"updated_by": "10130590",
"updated_date": "2016-04-05T07:51:29.043+0000"
}
]
}
then running jq as
jq -M -r -f filter.jq data.json
produces the output (note the bare ,, in the second line: the first policy has no description, and @csv renders null as an empty, unquoted field)
"Key","Value","Context_Geography","Context_CompanyID","Context_ProductID","Description","Created by","Updated by","updated date"
"viaPayEnabledRates","","","","",,"0","0","2014-03-24T21:22:25.420+0000"
"viaPayEnabledRates","[""WSPNConsortia"",""WSPNNegotiated"",""WSPNPublished""]","","*","60003","Central Payment Pilot","10130590","10130590","2016-04-05T07:51:29.043+0000"
I am writing a shell script to automatically get a list of name, current version, and latest available version from raw JSON data.
I am trying to format JSON data stored in a file using a shell script. I tried using jq, the command-line JSON processor.
I want to get formatted JSON data in the script. jq provides advanced options for this scenario, but I am not able to use them properly.
Example: a file containing the following JSON:
{
"endpoint": {
"name": "test-plugin",
"version": "0.0.1"
},
"dependencies": {
"plugin1": {
"main": {
"name": "plugin1name",
"description": "Dummy text"
},
"pkgMeta": {
"name": "plugin1name",
"version": "0.0.1"
},
"dependencies": {},
"versions": [
"0.0.5",
"0.0.4",
"0.0.3",
"0.0.2",
"0.0.1"
],
"update": {
"latest": "0.0.5"
}
},
"plugin2": {
"main": {
"name": "plugin2name",
"description": "Dummy text"
},
"pkgMeta": {
"name": "plugin2name",
"version": "0.1.1"
},
"dependencies": {},
"versions": [
"0.1.5",
"0.1.4",
"0.1.3",
"0.1.2",
"0.1.1"
],
"update": {
"latest": "0.1.5"
}
}
}
}
I am trying to get the result in this format:
[{name: "plugin1name",
c_version: "0.0.1",
n_version: "0.0.5"
},
{name: "plugin2name",
c_version: "0.1.1",
n_version: "0.1.5"}]
Can someone suggest anything?
Your JSON file is not valid at .dependencies.pkgMeta.version.
After fixing your JSON file, try this command:
jq '
.dependencies |
to_entries |
map(.value |
{
name: .main.name,
c_version: .pkgMeta.version,
n_version: .update.latest
}
)' input.json
The result is:
[
{
"name": "plugin1name",
"c_version": "0.0.1",
"n_version": "0.0.5"
},
{
"name": "plugin2name",
"c_version": "0.1.1",
"n_version": "0.1.5"
}
]
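Since the goal is a shell script, a tab-separated variant along the same lines (a sketch using jq's @tsv) may be easier to consume than JSON:

jq -r '.dependencies
       | to_entries[]
       | .value
       | [.main.name, .pkgMeta.version, .update.latest]
       | @tsv' input.json

which prints:

plugin1name	0.0.1	0.0.5
plugin2name	0.1.1	0.1.5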