JQ - unique count of each value in an array - json

I'm needing to solve this with JQ. I have a large lists of arrays in my json file and am needing to do some sort | uniq -c types of stuff on them. Specifically I have a relatively nasty looking fruit array that needs to break down what is inside. I'm aware of unique and things like that, and imagine there is likely a simple way to do this, but I've been trying run down assigning things as variables and appending and whatnot, but I can't get the most basic part of counting the unique values per that fruit array, and especially not without breaking the rest of the content (hence the variable ideas). Please tell me I'm overthinking this.
I'd like to turn this;
[
{
"uid": "123abc",
"tID": [
"T19"
],
"fruit": [
"Kiwi",
"Apple",
"",
"",
"",
"Kiwi",
"",
"Kiwi",
"",
"",
"Mango",
"Kiwi"
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"fruit": [
"",
"Orange"
]
}
]
Into this;
[
{
"uid": "123abc",
"tID": [
"T19"
],
"metadata": [
{
"name": "fruit",
"value": "Kiwi - 3"
},
{
"name": "fruit",
"value": "Mango - 1"
},
{
"name": "fruit",
"value": "Apple - 1"
}
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"metadata": [
{
"name": "fruit",
"value": "Orange - 1"
}
]
}
]

Using group_by and length would be one way:
jq '
map(with_entries(select(.key == "fruit") |= (
.value |= (group_by(.) | map(
{name: "fruit", value: "\(.[0] | select(. != "")) - \(length)"}
))
| .key = "metadata"
)))
'
[
{
"uid": "123abc",
"tID": [
"T19"
],
"metadata": [
{
"name": "fruit",
"value": "Apple - 1"
},
{
"name": "fruit",
"value": "Kiwi - 4"
},
{
"name": "fruit",
"value": "Mango - 1"
}
]
},
{
"uid": "456xyz",
"tID": [
"T15"
],
"metadata": [
{
"name": "fruit",
"value": "Orange - 1"
}
]
}
]
Demo

Related

How to avoid generating all combinations of selected data while constructing an object?

My original JSON is given below.
[
{
"id": "1",
"name": "AA_1",
"total": "100002",
"files": [
{
"filename": "8665b987ab48511eda9e458046fbc42e.csv",
"filename_original": "some.csv",
"status": "3",
"total": "100002",
"time": "2020-08-24 23:25:49"
}
],
"status": "3",
"created": "2020-08-24 23:25:49",
"filenames": "8665b987ab48511eda9e458046fbc42e.csv",
"is_append": "0",
"is_deleted": "0",
"comment": null
},
{
"id": "4",
"name": "AA_2",
"total": "43806503",
"files": [
{
"filename": "1b4812fe634938928953dd40db1f70b2.csv",
"filename_original": "other.csv",
"status": "3",
"total": "21903252",
"time": "2020-08-24 23:33:43"
},
{
"filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"filename_original": "some.csv",
"status": "2",
"total": 0,
"time": "2020-08-24 23:29:30"
}
],
"status": "2",
"created": "2020-08-24 23:35:51",
"filenames": "1b4812fe634938928953dd40db1f70b2.csv&&63ab85fef2412ce80ae8bd018497d8bf.csv",
"is_append": "0",
"is_deleted": "0",
"comment": null
}
]
From this JSON I want to create new objects by combining fields from objects which have status: 2 and their files which also have the same pair, status: 2.
So, I am expecting a JSON array as below.
[
{
"id": "4",
"name": "AA_2",
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": 2
}
]
So far I tried with this JQ filter:
.[]|select(.status=="2")|[{id:.id,file_filename:.files[].filename,file_status:.files[].status}]
But this produces some invalid data.
[
{
"id": "4", # want to remove this as file.status != 2
"file_filename": "1b4812fe634938928953dd40db1f70b2.csv",
"file_status": "3"
},
{
"id": "4",
"file_filename": "1b4812fe634938928953dd40db1f70b2.csv",
"file_status": "2"
},
{
"id": "4", # Repeat
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": "3"
},
{
"id": "4", # Repeat
"file_filename": "63ab85fef2412ce80ae8bd018497d8bf.csv",
"file_status": "2"
}
]
How do I filter the new JSON using JQ and remove these duplicate objects?
By applying [] operator to files twice, you're running into a combinatorial explosion. That needs to be avoided, for example:
[ .[] | select(.status == "2") | {id, name} + (.files[] | select(.status == "2") | {file_filename: .filename, file_status: .status}) ]
Online demo

POWERSHELL - How to access multilevel child elements in JSON file with condtion

can someone please send me solution or link for PowerShell 5 and 7 how can I access child elements if specific condition is fulfilled for JSON file which I have as output.json. I haven't find it on the net.
I want to retrieve value of the "children" elements if type element has value FILE and to put that into some list. So final result should be [test1.txt,test2.txt]
Thank you!!!
{
"path": {
"components": [
"Packages"
],
"parent": "",
"name": "Packages",
},
"children": {
"values": [
{
"path": {
"components": [
"test1.txt"
],
"parent": "",
"name": "test1.txt",
},
"type": "FILE",
"size": 405
},
{
"path": {
"components": [
"test2.txt"
],
"parent": "",
"name": "test2.txt",
},
"type": "FILE",
"size": 409
},
{
"path": {
"components": [
"FOLDER"
],
"parent": "",
"name": "FOLDER",
},
"type": "DIRECTORY",
"size": 1625
}
]
"start": 0
}
}
1.) The json is incorrect, I assumt that this one is the correct one:
{
"path": {
"components": [
"Packages"
],
"parent": "",
"name": "Packages"
},
"children": {
"values": [
{
"path": {
"components": [
"test1.txt"
],
"parent": "",
"name": "test1.txt"
},
"type": "FILE",
"size": 405
},
{
"path": {
"components": [
"test2.txt"
],
"parent": "",
"name": "test2.txt"
},
"type": "FILE",
"size": 409
},
{
"path": {
"components": [
"FOLDER"
],
"parent": "",
"name": "FOLDER"
},
"type": "DIRECTORY",
"size": 1625
}
],
"start": 0
}
}
2.) The structure is not absolute clear, but for your example this seems to me to be the correct solution:
$element = $json | ConvertFrom-Json
$result = #()
$element.children.values | foreach {
if ($_.type -eq 'FILE') { $result += $_.path.name }
}
$result | ConvertTo-Json
Be aware, that the used construct $result += $_.path.name is fine if you have up to ~10k items, but for very large items its getting very slow and you need to use an arraylist. https://adamtheautomator.com/powershell-arraylist/

Using jq to convert object to key with values

I have been playing around with jq to format a json file but I am having some issues trying to solve a particular transformation. Given a test.json file in this format:
[
{
"name": "A", // This would be the first key
"number": 1,
"type": "apple",
"city": "NYC" // This would be the second key
},
{
"name": "A",
"number": "5",
"type": "apple",
"city": "LA"
},
{
"name": "A",
"number": 2,
"type": "apple",
"city": "NYC"
},
{
"name": "B",
"number": 3,
"type": "apple",
"city": "NYC"
}
]
I was wondering, how can I format it this way using jq?
[
{
"key": "A",
"values": [
{
"key": "NYC",
"values": [
{
"number": 1,
"type": "a"
},
{
"number": 2,
"type": "b"
}
]
},
{
"key": "LA",
"values": [
{
"number": 5,
"type": "b"
}
]
}
]
},
{
"key": "B",
"values": [
{
"key": "NYC",
"values": [
{
"number": 3,
"type": "apple"
}
]
}
]
}
]
I have followed this thread Using jq, convert array of name/value pairs to object with named keys and tried to group the json using this expression
jq '. | group_by(.name) | group_by(.city) ' ./test.json
but I have not been able to add the keys in the output.
You'll want to group the items at the different levels and building out your result objects as you want.
group_by(.name) | map({
key: .[0].name,
values: (group_by(.city) | map({
key: .[0].city,
values: map({number,type})
}))
})
Just keep in mind that group_by/1 yields groups in a sorted order. You'll probably want an implementation that preserves that order.
def group_by_unsorted(key_selector):
reduce .[] as $i ({};
.["\($i|key_selector)"] += [$i]
)|[.[]];

Rename JSON key field with value in object

Have the following json output:
[
{
"id": "47",
"canUpdate": true,
"canDelete": true,
"canArchive": true,
"info": [
{
"key": "problem_type",
"value": "PAN",
"valueCaption": "PAN",
"keyCaption": "Category"
},
{
"key": "status",
"value": 3,
"valueCaption": "Closed",
"keyCaption": "Status"
},
{
"key": "insert_time",
"value": 1466446314000,
"valueCaption": "2016-06-20 14:11:54.0",
"keyCaption": "Request time"
}
As you can see under "info" they actually label the key:value pair as "key": "problem_type" and "value": "PAN" and then "valueCaption": "PAN" "keyCaption": "Category". What I need to do is remap the file so that, in this example, it shows as "problem_type": "PAN" and "Category": "PAN". What would be the best method to iterate through the output to remap the key:value pairs in this manner?
How it needs to be:
[
{
"id": "47",
"canUpdate": true,
"canDelete": true,
"canArchive": true,
"info": [
{
"problem_type": "PAN",
"Category": "PAN"
},
{
"status": 3,
"Status": "Closed"
},
{
"insert_time": 1466446314000,
"Request time": "2016-06-20 14:11:54.0"
}
Here is a jq solution which uses Update assignment |=
.[].info[] |= {(.key):.value, (.keyCaption):.valueCaption}
Sample Run (assumes data in data.json)
$ jq -M '.[].info[] |= {(.key):.value, (.keyCaption):.valueCaption}' data.json
[
{
"id": "47",
"canUpdate": true,
"canDelete": true,
"canArchive": true,
"info": [
{
"problem_type": "PAN",
"Category": "PAN"
},
{
"status": 3,
"Status": "Closed"
},
{
"insert_time": 1466446314000,
"Request time": "2016-06-20 14:11:54.0"
}
]
}
]
Try it online at jqplay.org

How to convert flat array of objects with "categories" and "subcategories" to tree with JQ?

I have a JSON array of objects with groups and subgroups that looks like this:
[
{
"Group name": "Elevation",
"Subgroup": "Contours",
"name": "Contours - Labels"
},
{
"Group name": "Elevation",
"Subgroup": "Contours",
"name": "Contours"
},
{
"Group name": "Elevation",
"Subgroup": "Cuttings",
"name": "Cuttings"
},
{
"Group name": "Framework",
"Subgroup": "Reserves",
"name": "Reserves"
},
{
"Group name": "Framework",
"Subgroup": "Indigenous Reserves",
"name": "Reserves"
},
{
"Group name": "Framework",
"Subgroup": "Land Borders",
"name": "Mainland"
}
]
I'd like to convert it to a nested structure:
[ { type: group, name: Elevation, Items: [ ... ] } ]
How do you do this in JQ?
I'm not sure what your expected output is exactly, but this could be done more efficiently. By parameterizing your grouping function, you could build up your trees in a reusable fashion.
def regroup(keyfilter; itemfilter):
group_by(keyfilter) | map({
type: "group",
name: (.[0] | keyfilter),
items: itemfilter
})
;
regroup(."Group name";
regroup(.Subgroup;
map({ name })
)
)
This yields the following results:
[
{
"type": "group",
"name": "Elevation",
"items": [
{
"type": "group",
"name": "Contours",
"items": [
{ "name": "Contours - Labels" },
{ "name": "Contours" }
]
},
{
"type": "group",
"name": "Cuttings",
"items": [
{ "name": "Cuttings" }
]
}
]
},
{
"type": "group",
"name": "Framework",
"items": [
{
"type": "group",
"name": "Indigenous Reserves",
"items": [
{ "name": "Reserves" }
]
},
{
"type": "group",
"name": "Land Borders",
"items": [
{ "name": "Mainland" }
]
},
{
"type": "group",
"name": "Reserves",
"items": [
{ "name": "Reserves" }
]
}
]
}
]
Functions are very powerful, you should try to utilize it more. Hopefully this shows just how powerful it can be.
This did the job:
def subgroups:
group_by(."Subgroup")|
map({
"type": "group",
"name": .[0]."Subgroup",
"items": (map(del(.Subgroup)))
});
group_by(."Group name")|
map({
"type": "group",
"name": .[0]."Group name",
"items": (subgroups2|map(del(."Group name")))
})
It's just the same process twice:
group_by to sort the outer items on "group name"
For each of those groups, construct an object whose name is taken from the first item, and whose items are...
Passed through to a function that repeats 1 and 2, and removes the grouping field.