I have a text file with data like below:
6111119268639|22|65024:3|2000225350|Samsung|ADD|234534643645|REMOVE|5645657|65067:3|Apple|ADD|234534643645|REMOVE|3432523|65023:3
6111119268639|22|65024:3|2000225350|Apple|ADD|234534643645|REMOVE|3432523|65023:3
6111119268639|22|65024:3|2000225350|Samsung|ADD|234534643645|REMOVE|3432523|65023:3
and so on ...
I want JSON output like this:
[{
"ExternalId": "6111119268639",
"ExternalIdType": "22",
"RPPI": "65024:3",
"NewPrimaryOfferId": "2000225350",
"Samsung": [{
"Action": "ADD",
"NewSecondaryOfferId": "234534643645"
},
{
"Action": "REMOVE",
"SecondaryProductOfferId": "5645657",
"RemoveSecondaryProductInstance": "65067:3"
}
],
"Apple": [
{
"Action": "ADD",
"NewComponentOfferId": "234534643645"
},
{
"Action": "REMOVE",
"ComponentOfferId": "3432523",
"RemoveAddOnProductInstance": "65023:3"
}
]
},
{
"ExternalId": "6111119268639",
"ExternalIdType": "22",
"RPPI": "65024:3",
"NewPrimaryOfferId": "2000225350",
"Apple": [{
"Action": "ADD",
"NewComponentOfferId": "234534643645"
},
{
"Action": "REMOVE",
"ComponentOfferId": "3432523",
"RemoveAddOnProductInstance": "65023:3"
}
]
},
{
"ExternalId": "6111119268639",
"ExternalIdType": "22",
"RPPI": "65024:3",
"NewPrimaryOfferId": "2000225350",
"Samsung": [{
"Action": "ADD",
"NewComponentOfferId": "234534643645"
},
{
"Action": "REMOVE",
"ComponentOfferId": "3432523",
"RemoveAddOnProductInstance": "65023:3"
}
]
}
]
Here ExternalId, ExternalIdType, RPPI and NewPrimaryOfferId are constant and will be present in every line. Samsung and Apple can vary, however: a line may contain only 'Samsung', only 'Apple', or both, as shown in the sample text.
I have written a jq command for this:
jq -Rn '[inputs / "|" | [[
["ExternalId"],["ExternalIdType"],["RPPI"],["NewPrimaryOfferId"],
(("Samsung", "Apple") as $p |
[$p, 0] + (["Action"], ["NewSecondaryOfferId"]),
[$p, 1] + (["Action"], ["SecondaryProductOfferId"], ["RemoveSecondaryProductInstance"])
)
],.] | transpose | reduce .[] as $k ({}; setpath($k[0];$k[1]))]' data.txt
But it does not give me the desired output. Please suggest how I can write the jq command for this, using an if-else condition for the products, or any shell script that produces the desired JSON output. Thanks in advance!
Another approach:
jq -Rn '
[
inputs / "|" | reduce (.[4:] | while(. != [];.[6:])) as $prod (
.[:4] | with_entries(.key |= ["ExternalId","ExternalIdType","RPPI","NewPrimaryOfferId"][.]);
.[$prod[0]] = [
{Action:"ADD", NewComponentOfferId:$prod[2]},
{Action:"REMOVE", ComponentOfferId:$prod[4], RemoveAddOnProductInstance:$prod[5]}
]
)
]
' data.txt
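To see what the `while(. != []; .[6:])` step in the approach above does, here is a minimal sketch on a made-up array. The short ids `1` to `6` are placeholders, not real data, and each product is assumed to occupy exactly six fields, as in the sample lines. The `.[:6]` is added here only to make the chunks visible.

```shell
# Split a flat field list into 6-field chunks, one chunk per product.
chunks=$(jq -cn '
  ["Samsung","ADD","1","REMOVE","2","3","Apple","ADD","4","REMOVE","5","6"]
  | [ while(. != []; .[6:]) | .[:6] ]
')
echo "$chunks"
```

If a line can carry a different number of ADD or REMOVE entries per product, the fixed-width assumption no longer holds and the fields have to be scanned instead.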
This seems to work on your test data:
jq -nR '
def offer:
. as $data |
[[], 0] | until([$data[.[1]]] | inside(["ADD", "REMOVE"]) | not;
if $data[.[1]] == "ADD" then
[ .[0] + [{ Action: "ADD", NewComponentOfferId: $data[.[1] + 1] }], .[1] + 2 ]
else
[ .[0] + [{ Action: "REMOVE", ComponentOfferId: $data[.[1] + 1],
RemoveAddOnProductInstance: $data[.[1] + 2] }], .[1] + 3 ]
end);
def build:
(. / "|") as $data | ($data | length) as $len |
[ { ExternalId: $data[0], ExternalIdType: $data[1], RPPI: $data[2],
NewPrimaryOfferId: $data[3] }, 4 ] |
until(.[1] >= $len;
($data[.[1]+1:] | offer) as $off |
[ .[0] + { ($data[.[1]]): $off[0] }, .[1] + 1 + $off[1] ]) |
.[0];
[ inputs | build ]' data.txt
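Both answers above emit the same key names for every product, whereas the desired output in the question uses different names under `Samsung`. Here is a minimal sketch of one way to handle that, assuming a fixed six-field layout per product. The `names` lookup table is hypothetical and simply copies the key names from the question.

```shell
line='6111119268639|22|65024:3|2000225350|Samsung|ADD|234534643645|REMOVE|5645657|65067:3'
result=$(printf '%s\n' "$line" | jq -Rc '
  # Hypothetical lookup table: key names per product, copied from the desired output.
  def names: {
    "Samsung": ["NewSecondaryOfferId", "SecondaryProductOfferId", "RemoveSecondaryProductInstance"],
    "Apple":   ["NewComponentOfferId", "ComponentOfferId", "RemoveAddOnProductInstance"]
  };
  . / "|"
  | reduce (.[4:] | while(. != []; .[6:])) as $p (
      # The four constant fields come first.
      {ExternalId: .[0], ExternalIdType: .[1], RPPI: .[2], NewPrimaryOfferId: .[3]};
      names[$p[0]] as $n
      | .[$p[0]] = [
          {Action: $p[1], ($n[0]): $p[2]},
          {Action: $p[3], ($n[1]): $p[4], ($n[2]): $p[5]}
        ])
')
echo "$result"
```

A product that is missing from `names` would fail here with a null key, so in practice the table needs an entry for every product that can appear.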
I'm trying to create the following JSON structure through bash. There will be a maximum of four environments, which I want shown even if they have no content. The input and an example of the desired output are shown below.
Input Text File:
DEV,Middleware,Mqwerty,Mqwerty
DEV,Middleware,Mqwerty,Mqwerty
DEV,Middleware,Mqwerty,Mqwerty
DEV,System,Sqwerty,Sqwerty
DEV,Application,Aqwerty,Aqwerty,Aqwerty
UAT,Application,Aqwerty,Aqwerty,Aqwerty
DEV,Utility,Uqwerty,Uqwerty,Uqwerty
PROD,Middleware,Mqwerty,Mqwerty
DEV,Middleware,Mqwerty,Mqwerty
Desired JSON Structure:
{
"ENV": {
"DEV": {
"Middleware": [
{
"name": "Mqwerty",
"release": "Mqwerty"
},
{
"name": "Mqwerty",
"release": "Mqwerty"
},
{
"name": "Mqwerty",
"release": "Mqwerty"
}
],
"System": [
{
"name": "Sqwerty",
"tag": "Sqwerty"
}
],
"Application": [
{
"domain": "Aqwerty",
"host": "Aqwerty",
"user": "Aqwerty"
},
{
"domain": "Aqwerty",
"host": "Aqwerty",
"user": "Aqwerty"
}
],
"Utility": [
{
"domain": "Uqwerty",
"health": "Uqwerty",
"version": "Uqwerty"
}
]
},
"SIT": {
"Middleware": [],
"System": [],
"Application": [],
"Utility": []
},
"UAT": {
"Middleware": [
{
"name": "Mqwerty",
"release": "Mqwerty"
},
{
"name": "Mqwerty",
"release": "Mqwerty"
}
],
"System": [],
"Application": [],
"Utility": []
},
"PROD": {
"Middleware": [],
"System": [],
"Application": [],
"Utility": []
}
}
}
Some key notes: even in environments that have no information, the 'template' of Middleware, System, Application and Utility (let's call these categories) is still there. Each category also has a predefined set of keys:
Application (keys): domain, host, user
Utility: domain, health, version
Middleware: name, release
System: name, tag
This is the code I've been able to get so far; however, it is unable to add the particular set of keys for each category (Application, Utility, Middleware and System), and it does not add all the values either.
#!/usr/bin/jq -Rnf
reduce inputs as $line
( .ENV
["DEV", "SIT", "UAT", "PROD"]
["Middleware", "System", "Application", "Utility"] = []
; ($line | split(",")) as $elements
| .ENV [$elements[0]] [$elements[1]] +=
[ $elements[2:]
| with_entries(.key |= "value\(.+1)")
]
)
I really do appreciate any help, and thank you for taking the time to read this question; apologies for it being a long one. Any good resources regarding jq would also be appreciated.
Here's one way to build up a solution from easily understood pieces. In this case, jq would be invoked with -nR.
def initial:
null
| .["DEV", "SIT", "UAT", "PROD"]["Middleware", "System", "Application", "Utility"] = [];
def objectify($keys):
. as $in
| reduce range(0; $keys|length) as $i ({}; .[$keys[$i]] = ($in[$i]) );
def object:
.[0] as $top
| .[1:]
| if $top == "Middleware" then objectify(["name", "release"])
elif $top == "System" then objectify(["name", "tag"])
elif $top == "Application" then objectify(["domain", "host", "user"])
elif $top == "Utility" then objectify(["domain", "health", "version"])
else objectify( map(tostring) ) # or raise an error, or ...
end;
reduce (inputs | split(",")) as $line (initial;
getpath($line[0:2]) as $v
| setpath($line[0:2]; $v + [$line[1:] | object] ))
| {ENV: .}
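The `initial` definition above relies on the fact that an index expression producing several values yields several paths, and the assignment is applied to each of them. A minimal sketch with just two environments and two categories, to keep the output short:

```shell
# Every combination of the listed keys is set to an empty array.
template=$(jq -cn 'null | .["DEV", "SIT"]["Middleware", "System"] = []')
echo "$template"
```

Starting from this template means that environments with no input lines still appear in the result with empty categories.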
Here's a DRYer and more declarative version of my other solution on this page. It also handles the anomalous case slightly differently.
< input.txt jq -nR '
def categories:
{ "Middleware": ["name", "release"],
"System": ["name", "tag"],
"Application": ["domain", "host", "user"],
"Utility": ["domain", "health", "version"] };
def initial:
null
| .["DEV", "SIT", "UAT", "PROD"][ categories | keys[]] = [];
def objectify($keys):
. as $in
| reduce range(0; $keys|length) as $i ({}; .[$keys[$i]] = ($in[$i]) );
def object:
categories[.[0]] as $keys
| .[1:]
| objectify($keys // [range(0;length) | tostring]);
reduce (inputs | split(",")) as $line (initial;
getpath($line[0:2]) as $v
| setpath($line[0:2]; $v + [$line[1:] | object] ))
| {ENV: .}
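To illustrate how `objectify` behaves, including the fallback used for an unknown category, here is a small sketch. The value `"1.0"` is made up for illustration, and the second call mimics the `[range(0;length) | tostring]` fallback, which uses the positions as keys.

```shell
out=$(jq -cn '
  def objectify($keys):
    . as $in
    | reduce range(0; $keys|length) as $i ({}; .[$keys[$i]] = $in[$i]);
  ["Mqwerty", "1.0"]
  | objectify(["name", "release"]),             # known category
    objectify([range(0; length) | tostring])    # fallback: positional keys
')
echo "$out"
```

The first result has the keys `name` and `release`; the second has the keys `"0"` and `"1"`.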
Say I have the following CSV data in input.txt:
broker,client,contract_id,task_type,doc_names
alice#company.com,John Doe,33333,prove-employment,important-doc-pdf
alice#company.com,John Doe,33333,prove-employment,paperwork-pdf
alice#company.com,John Doe,33333,submit-application,blah-pdf
alice#company.com,John Doe,00000,prove-employment,test-pdf
alice#company.com,John Doe,00000,submit-application,test-pdf
alice#company.com,Jane Smith,11111,prove-employment,important-doc-pdf
alice#company.com,Jane Smith,11111,submit-application,paperwork-pdf
alice#company.com,Jane Smith,11111,submit-application,unimportant-pdf
bob#company.com,John Doe,66666,submit-application,pdf-I-pdf
bob#company.com,John Doe,77777,submit-application,pdf-J-pdf
And I'd like to transform it into the following JSON:
[
{"broker": "alice#company.com",
"clients": [
{
"client": "John Doe",
"contracts": [
{
"contract_id": 33333,
"documents": [
{
"task_type": "prove-employment",
"doc_names": ["important-doc-pdf", "paperwork-pdf"]
},
{
"task_type": "submit-application",
"doc_names": ["blah-pdf"]
}
]
},
{
"contract_id": 00000,
"documents": [
{
"task_type": "prove-employment",
"doc_names": ["test-pdf"]
},
{
"task_type": "submit-application",
"doc_names": ["test-pdf"]
}
]
}
]
},
{
"client": "Jane Smith",
"contracts": [
{
"contract_id": 11111,
"documents": [
{
"task_type": "prove-employment",
"doc_names": ["important-doc-pdf"]
},
{
"task_type": "submit-application",
"doc_names": ["paperwork-pdf", "unimportant-pdf"]
}
]
}
]
}
]
},
{"broker": "bob#company.com",
"clients": [
{
"client": "John Doe",
"contracts": [
{
"contract_id": 66666,
"documents": [
{
"task_type": "submit-application",
"doc_names": ["pdf-I-pdf"]
}
]
},
{
"contract_id": 77777,
"documents": [
{
"task_type": "submit-application",
"doc_names": ["pdf-J-pdf"]
}
]
}
]
}
]
}
]
Based on a quick search, it seems like people recommend jq for this type of task. I read some of the manual and played around with it for a bit, and I understand that it's meant to be used by composing its filters together to produce the desired output.
So far, I've been able to transform each line of the CSV into a list of strings for example with jq -Rs '. / "\n" | .[] | . / ","'.
But I'm having trouble with something even a bit more complex, like assigning a key to each value on a line (not even the final JSON form I'm looking to get). This is what I tried: jq -Rs '[inputs | . / "\n" | .[] | . / "," as $line | {"broker": $line[0], "client": $line[1], "contract_id": $line[2], "task_type": $line[3], "doc_name": $line[4]}]', and it gives back [].
Maybe jq isn't the best tool for the job here? Perhaps I should be using awk? If all else fails, I'd probably just parse this using Python.
Any help is appreciated.
Here's a jq solution that assumes the CSV input is very simple (e.g., no field has embedded commas), followed by a brief explanation.
To handle arbitrary CSV, you could use a CSV-to-TSV conversion tool in conjunction with the jq program given below with trivial modifications.
A Solution
The following jq program assumes jq is invoked with the -R option.
(The -n option should not be used as the header row is read without using input.)
# sort-free plug-in replacement for the built-in group_by/1
def GROUP_BY(f):
reduce .[] as $x ({};
($x|f) as $s
| ($s|type) as $t
| (if $t == "string" then $s else ($s|tojson) end) as $y
| .[$t][$y] += [$x] )
| [.[][]]
;
# input: an array
def obj($keys):
. as $in | reduce range(0; $keys|length) as $i ({}; .[$keys[$i]] = $in[$i]);
# input: an array to be grouped by $keyname
# output: an object
def gather_by($keyname; $newkey):
($keyname + "s") as $plural
| GROUP_BY(.[$keyname])
| {($plural): map({($keyname): .[0][$keyname],
($newkey) : map(del(.[$keyname])) } ) }
;
split(",") as $headers
| [inputs
| split(",")
| obj($headers)
]
| gather_by("broker"; "clients")
| .brokers[].clients |= (gather_by("client"; "contracts") | .clients)
| .brokers[].clients[].contracts |= (gather_by("contract_id"; "documents") | .contract_ids)
| .brokers[].clients[].contracts[].documents |= (gather_by("task_type"; "doc_names") | .task_types)
| .brokers[].clients[].contracts[].documents[].doc_names |= map(.doc_names)
| .brokers
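The header handling in the program above can be seen in isolation with a tiny sketch. Without `-n`, jq reads the first line as `.`, and `inputs` then consumes the remaining lines. The column names `a` and `b` are placeholders.

```shell
# First line becomes the header array; the rest are collected via inputs.
rows=$(printf 'a,b\n1,2\n3,4\n' \
  | jq -Rc 'split(",") as $h | [inputs | split(",")] | {headers: $h, rows: .}')
echo "$rows"
```

With `-n` the first line would not be read automatically, and `split(",")` would be applied to `null` instead of the header row.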
Explanation
The expected output as shown respects the ordering of the input lines, and so jq's built-in group_by may not be appropriate; hence GROUP_BY is defined above as a plug-in replacement for group_by. It's a bit complicated because it is completely generic in the same way as group_by.
The obj filter converts an array into an object with keys $keys.
The gather_by filter groups together items in the input array as appropriate for the present problem.
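The difference in ordering between the built-in `group_by` and `GROUP_BY` can be checked with a small sketch. The keys `k` and `v` are made up for illustration. The built-in sorts the groups by key, while `GROUP_BY` keeps them in order of first appearance.

```shell
cmp=$(jq -cn '
  def GROUP_BY(f):
    reduce .[] as $x ({};
      ($x|f) as $s
      | ($s|type) as $t
      | (if $t == "string" then $s else ($s|tojson) end) as $y
      | .[$t][$y] += [$x] )
    | [.[][]];
  [{k:"b",v:1}, {k:"a",v:2}, {k:"b",v:3}]
  | { "builtin":   (group_by(.k) | map(map(.v))),
      "sort_free": (GROUP_BY(.k) | map(map(.v))) }
')
echo "$cmp"
```

Here the built-in puts the group for `"a"` first, whereas `GROUP_BY` starts with the group for `"b"`, because `"b"` appears first in the input.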
gather_by/2 example
To get a feel for what gather_by does, here's an example:
[ {a:1,b:1}, {a:2, b:2}, {a:1,b:0}] | gather_by("a"; "objects")
produces:
{
"as": [
{
"a": 1,
"objects": [
{
"b": 1
},
{
"b": 0
}
]
},
{
"a": 2,
"objects": [
{
"b": 2
}
]
}
]
}
Output
[
{
"broker": "alice#company.com",
"clients": [
{
"client": "John Doe",
"contracts": [
{
"contract_id": "33333",
"documents": [
{
"task_type": "prove-employment",
"doc_names": [
"important-doc-pdf",
"paperwork-pdf"
]
},
{
"task_type": "submit-application",
"doc_names": [
"blah-pdf"
]
}
]
},
{
"contract_id": "00000",
"documents": [
{
"task_type": "prove-employment",
"doc_names": [
"test-pdf"
]
},
{
"task_type": "submit-application",
"doc_names": [
"test-pdf"
]
}
]
}
]
},
{
"client": "Jane Smith",
"contracts": [
{
"contract_id": "11111",
"documents": [
{
"task_type": "prove-employment",
"doc_names": [
"important-doc-pdf"
]
},
{
"task_type": "submit-application",
"doc_names": [
"paperwork-pdf",
"unimportant-pdf"
]
}
]
}
]
}
]
},
{
"broker": "bob#company.com",
"clients": [
{
"client": "John Doe",
"contracts": [
{
"contract_id": "66666",
"documents": [
{
"task_type": "submit-application",
"doc_names": [
"pdf-I-pdf"
]
}
]
},
{
"contract_id": "77777",
"documents": [
{
"task_type": "submit-application",
"doc_names": [
"pdf-J-pdf"
]
}
]
}
]
}
]
}
]
Here's a jq solution which uses a generic approach that makes no reference to specific header names except for the specification of certain plural forms.
The generic approach is encapsulated in the recursively defined filter nested_group_by($headers; $plural).
The main assumptions are:
The CSV input can be parsed by splitting on commas;
jq is invoked with the -R command-line option.
# Sort-free plug-in replacement for the built-in group_by/1.
# Input: an array. Output: an array of groups (arrays), one per distinct value of f,
# where f can be any jq filter that produces exactly one value for each item.
def GROUP_BY(f):
reduce .[] as $x ({};
($x|f) as $s
| ($s|type) as $t
| (if $t == "string" then $s else ($s|tojson) end) as $y
| .[$t][$y] += [$x] )
| [.[][]]
;
def obj($headers):
. as $in | reduce range(0; $headers|length) as $i ({}; .[$headers[$i]] = $in[$i]);
def nested_group_by($array; $plural):
def plural: $plural[.] // (. + "s");
if $array == [] then .
elif $array|length == 1 then GROUP_BY(.[$array[0]]) | map(map(.[])[])
else ($array[1] | plural) as $groupkey
| $array[0] as $a0
| GROUP_BY(.[$a0])
| map( { ($a0): .[0][$a0], ($groupkey): map(del( .[$a0] )) } )
| map( .[$groupkey] |= nested_group_by($array[1:]; $plural) )
end
;
split(",") as $headers
| {contract_id: "contracts",
task_type: "documents",
doc_names: "doc_names" } as $plural
| [inputs
| split(",")
| obj($headers)
]
| nested_group_by($headers; $plural)
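The role of the `$plural` object can be shown separately. In this sketch the inner `plural` helper is rewritten to take the lookup object as an argument, which is an adaptation for illustration only. A header listed in the object gets the given name, and any other header simply gets an "s" appended.

```shell
group_keys=$(jq -cn '
  def plural($plural): $plural[.] // (. + "s");
  {contract_id: "contracts", task_type: "documents", doc_names: "doc_names"} as $p
  | ["client", "contract_id", "task_type", "doc_names"]
  | map(plural($p))
')
echo "$group_keys"
```

So `client` needs no entry in the object, because the default rule already turns it into `clients`.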
Here is output.json: https://1drv.ms/u/s!AizscpxS0QM4hJo5SnYOHAcjng-jww
I have issues with the sts:AssumeRole Principal.Service part when there are multiple services:
"Principal": {
"Service": [
"ssm.amazonaws.com",
"ec2.amazonaws.com"
]
}
In my code below, it's the .Principal.Service field.
If there is only one service, there are no issues.
"InstanceProfileList": [
{
"InstanceProfileId": "AIPAJMMLWIVZ2IXTOC3RO",
"Roles": [
{
"AssumeRolePolicyDocument": {
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"AWS": "*"
}
}
]
},
"RoleId": "AROAJPHJ4EDQG3G5ZQZT2",
"CreateDate": "2017-04-04T23:46:47Z",
"RoleName": "dev-instance-role",
"Path": "/",
"Arn": "arn:aws:iam::279052847476:role/dev-instance-role"
}
],
"CreateDate": "2017-04-04T23:46:47Z",
"InstanceProfileName": "bastionServerInstanceProfile",
"Path": "/",
"Arn": "arn:aws:iam::279052847476:instance-profile/bastionServerInstanceProfile"
}
],
"RoleName": "dev-instance-role",
"Path": "/",
"AttachedManagedPolicies": [
{
"PolicyName": "dev-instance-role-policy",
"PolicyArn": "arn:aws:iam::279052847476:policy/dev-instance-role-policy"
}
],
"RolePolicyList": [],
"Arn": "arn:aws:iam::279052847476:role/dev-instance-role"
},
{
"AssumeRolePolicyDocument": {
"Version": "2012-10-17",
"Statement": [
{
"Action": "sts:AssumeRole",
"Effect": "Allow",
"Principal": {
"Service": [
"ssm.amazonaws.com",
"ec2.amazonaws.com"
]
}
}
]
},
If only one service exists, there are no issues, but if there is more than one I get the error: string ("") and array (["ssm.amazonaws.com) cannot be added
How can I get all values of Principal.Service in one row?
My code:
jq -rc '.RoleDetailList
| map(select((.AssumeRolePolicyDocument.Statement | length > 0) and
(.AssumeRolePolicyDocument.Statement[].Principal.Service) or
(.AssumeRolePolicyDocument.Statement[].Principal.AWS) or
(.AssumeRolePolicyDocument.Statement[].Principal.Federated) or
(.AttachedManagedPolicies | length >0) or
(.RolePolicyList | length > 0)) )[]
| [.RoleName,
([.RolePolicyList[].PolicyName,
([.AttachedManagedPolicies[].PolicyName] | join("--"))]
| join(" ")),
(.AssumeRolePolicyDocument.Statement[]
| .Principal.Federated + "" + .Principal.Service + ""+.Principal.AWS)]
| @csv' ./output.json
Desired output:
"dev-instance-role","dev-instance-role-policy","ssm.amazonaws.com--ec2.amazonaws.com--*"
Current output:
"dev-instance-role","dev-instance-role-policy","*"
Consider adding a condition that checks whether .Principal.Service is an array or a string:
jq -rc '.RoleDetailList
| map(select((.AssumeRolePolicyDocument.Statement | length > 0) and
(.AssumeRolePolicyDocument.Statement[].Principal.Service) or
(.AssumeRolePolicyDocument.Statement[].Principal.AWS) or
(.AssumeRolePolicyDocument.Statement[].Principal.Federated) or
(.AttachedManagedPolicies | length >0) or
(.RolePolicyList | length > 0)) )[]
| [.RoleName,
([.RolePolicyList[].PolicyName,
([.AttachedManagedPolicies[].PolicyName] | join("--"))]
| join(" ")),
(.AssumeRolePolicyDocument.Statement[]
| .Principal.Federated + ""
+ (.Principal.Service | if type == "array" then join("--") else . end)
+ "" + .Principal.AWS)]
| @csv' ./output.json
The output:
"ADFS-Administrators","Administrator-Access ","arn:aws:iam::279052847476:saml-provider/companyADFS"
"ADFS-amtest-ro","pol-amtest-ro","arn:aws:iam::279052847476:saml-provider/companyADFS"
"adfs-host-role","pol-amtest-ro","ec2.amazonaws.com"
"aws-elasticbeanstalk-ec2-role","AWSElasticBeanstalkWebTier--AWSElasticBeanstalkMulticontainerDocker--AWSElasticBeanstalkWorkerTier","ec2.amazonaws.com"
"aws-elasticbeanstalk-service-role","AWSElasticBeanstalkEnhancedHealth--AWSElasticBeanstalkService","elasticbeanstalk.amazonaws.com"
"AWSAccCorpAdmin","AdministratorAccess","arn:aws:iam::279052847476:saml-provider/LastPass"
"AWScompanyCorpAdmin","AdministratorAccess","arn:aws:iam::279052847476:saml-provider/LastPass"
"AWScompanyCorpPowerUser","PowerUserAccess","arn:aws:iam::279052847476:saml-provider/LastPass"
"AWSServiceRoleForAutoScaling","AutoScalingServiceRolePolicy","autoscaling.amazonaws.com"
"AWSServiceRoleForElasticBeanstalk","AWSElasticBeanstalkServiceRolePolicy","elasticbeanstalk.amazonaws.com"
"AWSServiceRoleForElasticLoadBalancing","AWSElasticLoadBalancingServiceRolePolicy","elasticloadbalancing.amazonaws.com"
"AWSServiceRoleForOrganizations","AWSOrganizationsServiceTrustPolicy","organizations.amazonaws.com"
"AWSServiceRoleForRDS","AmazonRDSServiceRolePolicy","rds.amazonaws.com"
"Cloudyn","ReadOnlyAccess","arn:aws:iam::432263259397:root"
"DatadogAWSIntegrationRole","DatadogAWSIntegrationPolicy","arn:aws:iam::464622532012:root"
"datadog_alert_metrics_role","AWSLambdaBasicExecutionRole-66abe1f2-cee8-4a90-a026-061b24db1b02","lambda.amazonaws.com"
"dev-instance-role","dev-instance-role-policy","*"
"ec2ssmRole","AmazonEC2RoleforSSM","ssm.amazonaws.com--ec2.amazonaws.com"
"ecsInstanceRole","AmazonEC2ContainerServiceforEC2Role","ec2.amazonaws.com"
"ecsServiceRole","AmazonEC2ContainerServiceRole","ecs.amazonaws.com"
"flowlogsRole","oneClick_flowlogsRole_1495032428381 ","vpc-flow-logs.amazonaws.com"
"companyDevShutdownEC2Instaces","oneClick_lambda_basic_execution_1516271285849 ","lambda.amazonaws.com"
"companySAMLUser","AdministratorAccess","arn:aws:iam::279052847476:saml-provider/companyAzureAD"
"irole-matlabscheduler","pol-marketdata-rw","ec2.amazonaws.com"
"jira_role","","*"
"lambda-ec2-ami-role","lambda-ec2-ami-policy","lambda.amazonaws.com"
"lambda_api_gateway_twilio_processor","AWSLambdaBasicExecutionRole-f47a6b57-b716-4740-b2c6-a02fa6480153--AWSLambdaSNSPublishPolicyExecutionRole-d31a9f16-80e7-47c9-868a-f162396cccf6","lambda.amazonaws.com"
"lambda_stop_rundeck_instance","oneClick_lambda_basic_execution_1519651160794 ","lambda.amazonaws.com"
"OneLoginAdmin","AdministratorAccess","arn:aws:iam::279052847476:saml-provider/OneLoginAdmin"
"OneLoginDev","PowerUserAccess","arn:aws:iam::279052847476:saml-provider/OneLoginDev"
"rds-host-role","","ec2.amazonaws.com"
"rds-monitoring-role","AmazonRDSEnhancedMonitoringRole","monitoring.rds.amazonaws.com"
"role-amtest-ro","pol-amtest-ro","ec2.amazonaws.com"
"role-amtest-rw","pol-amtest-rw","ec2.amazonaws.com"
"Stackdriver","ReadOnlyAccess","arn:aws:iam::314658760392:root"
"vmimport","vmimport ","vmie.amazonaws.com"
"workspaces_DefaultRole","SkyLightServiceAccess ","workspaces.amazonaws.com"
It appears that .Principal.Service is either a string or an array of strings, so you need to handle both cases. Consider therefore:
def to_s: if type == "string" then . else join("--") end;
You might want to make this more generic to make it more robust or for other reasons.
You might also want to streamline your jq filter to make it more intelligible and maintainable, e.g. by using jq variables. Note also that
.x.a + .x.b + .x.c
can be written as:
.x | (.a + .b + .c)
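As a sketch of what a more generic `to_s` might look like, the variant below also maps `null` to an empty string, so that missing principals can be filtered out before joining. This is one possible design, not the only one. The sample object is a cut-down, made-up version of one Statement entry.

```shell
principals=$(jq -rn '
  # Arrays are joined, null becomes "", anything else is converted to a string.
  def to_s:
    if type == "array" then join("--")
    elif . == null then ""
    else tostring end;
  {Principal: {Service: ["ssm.amazonaws.com", "ec2.amazonaws.com"], AWS: "*"}}
  | .Principal
  | [.Federated, .Service, .AWS]
  | map(to_s)
  | map(select(. != ""))
  | join("--")
')
echo "$principals"
```

Dropping the empty strings before the final `join` avoids stray separators when only some of the three fields are present.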
I need to aggregate values by key. Example JSON input is:
$ cat json | jq
[
{
"key": "john",
"value": "ontario"
},
{
"key": "ryan",
"value": "chicago"
},
{
"key": "ryan",
"value": "illinois"
},
{
"key": "john",
"value": "toronto"
}
]
Is it possible, and if so how, to merge/join/concat the values that share the same key so that the result is:
[
{
"key": "john",
"value": "toronto ontario"
},
{
"key": "ryan",
"value": "illinois chicago"
}
]
I am targeting jq specifically because of its ease of use from cfengine.
Group the pairs by key, then combine the values.
group_by(.key) | map({key:.[0].key,value:(map(.value) | join(" "))})
For this type of problem, I prefer to avoid the overhead of sorting, and to guarantee that the ordering of the objects in the input is respected.
Here's one approach that assumes the values associated with "key" and "value" are all strings (as is the case in the example). This assumption makes it easy to avoid an inefficient lookup:
def merge_by_key(separator):
reduce .[] as $o
({}; $o["key"] as $k
| if .[$k] then .[$k] += (separator + $o["value"])
else .[$k] = $o["value"] end);
merge_by_key(" ") | to_entries
Output:
[{"key":"john","value":"ontario toronto"},
{"key":"ryan","value":"chicago illinois"}]
Generic solution
def merge_at_key(separator):
reduce .[] as $o
([];
$o["key"] as $k
| (map(.key) | index($k)) as $i
| if $i then (.[$i] | .value) += (separator + $o["value"])
else . + [$o] end);
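The generic definition above is given without an invocation. Here is a minimal usage sketch on the sample data from the question, written inline so that it is self-contained:

```shell
merged=$(jq -cn '
  def merge_at_key(separator):
    reduce .[] as $o
      ([];
       $o["key"] as $k
       | (map(.key) | index($k)) as $i
       | if $i then (.[$i] | .value) += (separator + $o["value"])
         else . + [$o] end);
  [ {key: "john", value: "ontario"},
    {key: "ryan", value: "chicago"},
    {key: "ryan", value: "illinois"},
    {key: "john", value: "toronto"} ]
  | merge_at_key(" ")
')
echo "$merged"
```

Note that an index of `0` still counts as true in jq, since only `null` and `false` are falsy, so the `if $i` test also works for the first element.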
I have the following JSON:
[
{
"name": "InstanceA",
"tags": [
{
"key": "environment",
"value": "production"
},
{
"key": "group",
"value": "group1"
}
]
},
{
"name": "InstanceB",
"tags": [
{
"key": "group",
"value": "group2"
},
{
"key": "environment",
"value": "staging"
}
]
}
]
I'm trying to get a flat output of value based on the condition key == 'environment'. I already tried select(boolean_expression), but I cannot get the desired output, like:
"InstanceA, production"
"InstanceB, staging"
Does jq support this kind of output? If so, how to do it?
Yes.
For example:
$ jq '.[] | "\(.name), \(.tags | from_entries | .environment)"' input.json
Output:
"InstanceA, production"
"InstanceB, staging"
jq '.[] | .name + ", " + (.tags[] | select(.key == "environment").value)' f.json
Here is a solution using join
.[]
| [.name, (.tags[] | if .key == "environment" then .value else empty end)]
| join(", ")