jq add element to each object inside array of objects - json

If I have an array of objects like this:
[
{
"remote" : [
{
"id" : 1
},
{
"id" : 2
},
{
"id" : 3
}
],
"text_id" : 1
},
{
"remote" : [
{
"id" : 4
},
{
"id" : 5
},
{
"id" : 6
}
],
"text_id" : 2
}
]
How would you add "text_id" field to every object inside .[].remote[] array so it would become
[
{
"remote" : [
{
"id" : 1,
"text_id" : 1
},
{
"id" : 2,
"text_id" : 1
},
{
"id" : 3,
"text_id" : 1
}
]
},
{
"remote" : [
{
"id" : 4,
"text_id" : 2
},
{
"id" : 5,
"text_id" : 2
},
{
"id" : 6,
"text_id" : 2
}
]
}
]
I have already spent several hours trying to figure this out. It looks like there should be a way to do this with the foreach directive, but the manual's description of it seemed pretty obscure to me, so I thought maybe someone could give an example.
Thanks.

jq 'map( .text_id as $t
| .remote |= map( . + {text_id : $t} )
| del(.text_id)
)'

You don't need map for that.
.[] |= (.remote[] += {text_id} | del(.text_id))

I think you should use Array.prototype.map().
You have two levels of arrays, so it should look like this:
const fatArray = [
{
remote : [
{
id : 1
},
{
id : 2
},
{
id : 3
}
],
text_id : 1
},
{
remote : [
{
id : 4
},
{
id : 5
},
{
id : 6
}
],
text_id : 2
}
];
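// Copy text_id into every object inside remote and drop it from the outer object.
const finalArray = fatArray.map(({ remote, text_id }) => ({
  remote: remote.map(elem => ({ ...elem, text_id }))
}));
console.log(JSON.stringify(finalArray, null, 2));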

Related

How to find all the json key-value pair by matching the value using json query

I have below JSON structure :
{
"key" : "value",
"array" : [
{ "key" : 1 },
{ "key" : 2, "misc": {
"a": "Apple",
"b": "Butterfly",
"c": "Cat",
"d": "Dog"
} },
{ "key" : 3 }
],
"tokenize" : {
"firstkey" : {
"token" : 0
},
"secondkey" : {
"token" : 1
},
"thirdkey" : {
"token" : 0
}
}
}
I am able to traverse the above structure down to array -> misc -> b with the syntax below:
$.array[?(@.key==2)].misc.b
Now I need to print all the tokens whose value is 0. In the same way as shown above, I can traverse down to $.array[?(@.key==2)].tokenize.
How can I query it to print all entries that have token: 0?
To be precise, I want the output to look like this:
[
"tokenize" : {
"firstkey" : {
"token" : 0
},
"thirdkey" : {
"token" : 0
}
}
]
The following query already returns something close to what I want, but it does not show the keys ("firstkey" and "thirdkey" in this case).
$.tokenize[?(@.token == 0)]
Please help me get the keys as well.
Thanks.
You can try this script.
$.tokenize[?(@.token == 0)].token
Result:
[
0,
0
]
$.tokenize[?(@.token == 0)]~
will output
[
"firstkey",
"thirdkey"
]
This is for the OP's sample JSON; use https://jsonpath-plus.github.io/JSONPath/demo/ to verify against your own data.
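If the evaluator in use cannot return keys and values together, the same selection can be done in plain JavaScript. This is a minimal sketch, not a JSONPath feature: the helper name filterEntries is made up for illustration, and the data is the tokenize object from the question.

```javascript
// Hypothetical helper: keep only the properties of `obj` whose value passes `predicate`.
function filterEntries(obj, predicate) {
  return Object.fromEntries(
    Object.entries(obj).filter(([, value]) => predicate(value))
  );
}

// The "tokenize" object from the question.
const tokenize = {
  firstkey: { token: 0 },
  secondkey: { token: 1 },
  thirdkey: { token: 0 }
};

// Keep the entries whose token is 0, keys included.
const zeroTokens = { tokenize: filterEntries(tokenize, v => v.token === 0) };
console.log(JSON.stringify(zeroTokens, null, 2));
```

Because the filter works on key/value pairs, "firstkey" and "thirdkey" survive alongside their values.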

MongoDB aggregate and count json paths

I have a MongoDB Collection which contains data elements like this:
{
"_id" : "9878jr23geg",
"element" : {
"name" : "element7",
"Set" : [
{
"SubListA" : [
{
"name" : "AlbertEinstein",
"value" : "45"
},
{
"name" : "JohnDoe",
"value" : "34"
},
]
},
{
"MoreNames" : [
{
"name" : "TimMcGraw",
"value" : "39"
}
]
}
]
}
}
{
"_id" : "275678hfvd",
"element" : {
"name" : "element8",
"Set" : [
{
"SubListA" : [
{
"name" : "AlbertEinstein",
"value" : "45"
},
{
"name" : "JimmyKimmel",
"value" : "41"
}
]
}
]
}
}
I'm trying to count the occurrences of each unique name, grouped by the element of Set to which they belong. For example, both documents in my example above have an object with name: "AlbertEinstein" inside element.Set.SubListA; therefore I'd expect a return value something along the lines of:
element.Set.SubListA.AlbertEinstein | 2
Essentially, I'd like a count for each of the distinct names when the data is grouped by objects within element.Set.
Ideally, for the example given, I'd like all of:
element.Set.SubListA.AlbertEinstein | 2
element.Set.SubListA.JohnDoe | 1
element.Set.MoreNames.TimMcGraw | 1
element.Set.SubListA.JimmyKimmel | 1
I've tried several aggregate queries but none seems to achieve what I'm trying to do.
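The counting itself can be sketched in plain JavaScript to pin down the expected result before writing a pipeline. This is only an in-memory illustration of the grouping described above, not a MongoDB query; the function name countNamesByPath is hypothetical, and the sample documents are the two from the question.

```javascript
// Hypothetical helper: count each name, keyed by "element.Set.<listName>.<name>".
function countNamesByPath(docs) {
  const counts = {};
  for (const doc of docs) {
    for (const setEntry of doc.element.Set) {
      // Each entry of Set is an object like { SubListA: [ { name, value }, ... ] }.
      for (const [listName, items] of Object.entries(setEntry)) {
        for (const item of items) {
          const path = `element.Set.${listName}.${item.name}`;
          counts[path] = (counts[path] || 0) + 1;
        }
      }
    }
  }
  return counts;
}

const sampleDocs = [
  { _id: "9878jr23geg", element: { name: "element7", Set: [
    { SubListA: [ { name: "AlbertEinstein", value: "45" }, { name: "JohnDoe", value: "34" } ] },
    { MoreNames: [ { name: "TimMcGraw", value: "39" } ] }
  ] } },
  { _id: "275678hfvd", element: { name: "element8", Set: [
    { SubListA: [ { name: "AlbertEinstein", value: "45" }, { name: "JimmyKimmel", value: "41" } ] }
  ] } }
];

const nameCounts = countNamesByPath(sampleDocs);
console.log(nameCounts);
```

An aggregation pipeline would need to reproduce the same two levels of iteration: first over the entries of Set, then over the names inside each list.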

Filtering JSONPath with given string value

If I have a JSON like so:
{
"data": [
{
"service" : { "id" : 1 }
},
{
"service" : { "id" : 2 }
},
{
"service" : {}
}
]
}
This query works:
$..service[?(@.id==2)]
And gives expected result:
[
{
"id" : 2
}
]
However, if I had strings as id's:
{
"data": [
{
"service" : { "id" : "a" }
},
{
"service" : { "id" : "b" }
},
{
"service" : {}
}
]
}
Running similar query:
$..service[?(@.id == "a")]
Gives no results (empty array).
I am using this evaluator.
I was looking at docs here but could not find anything to point me in the right direction... Any help if someone knows how to write such query? Thanks :)
It works without the quotes:
$..service[?(@.id == b)]
which gives this result:
[
{
"id" : "b"
}
]
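To confirm which objects the filter is meant to select, independent of how a particular evaluator parses string literals, the same test can be written in plain JavaScript. A minimal sketch; the variable names are illustrative only.

```javascript
// The second sample document from the question (string ids).
const serviceDoc = {
  data: [
    { service: { id: "a" } },
    { service: { id: "b" } },
    { service: {} }
  ]
};

// Collect every "service" object whose id equals the given string.
const wantedId = "a";
const matches = serviceDoc.data
  .map(entry => entry.service)
  .filter(service => service.id === wantedId);

console.log(JSON.stringify(matches));
```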

Finding JSON objects in mongoDB

I'm trying to find objects using the built-in queries and it just doesn't work.
My JSON file is something like this:
{ "Text1":
{
"id":"2"
},
"Text2":
{
"id":"2,3"
},
"Text3":
{
"id":"1"
}
}
And I run this: db.myCollection.find({"id":2})
It doesn't find anything.
When I run db.myCollection.find() it shows all the data as it should.
Does anyone know how to do this correctly?
It's hard to change the data structure, but since you want only the matching sub-document and you don't know in advance where it is (for example, whether the query should be on Text1 or Text2, ...), a better data structure for this is:
{
"_id" : ObjectId("548dd9261a01c68fab8d67d7"),
"pair" : [
{
"id" : "2",
"key" : "Text1"
},
{
"id" : [
"2",
"3"
],
"key" : "Text2"
},
{
"id" : "1",
"key" : "Text3"
}
]
}
and your query is:
db.myCollection.findOne({'pair.id' : "2"} , {'pair.$':1, _id : 0}).pair // there are better ways (such as aggregation) than the query above
As a result you will have:
{
"0" : {
"id" : "2",
"key" : "Text1"
}
}
Update 1 (newbie way)
If you want all the matching documents, not just one, use this:
var result = [];
db.myCollection.find({'pair.id' : "2"} , {'pair.$':1, _id : 0}).forEach(function(item)
{
result.push(item.pair);
});
// the output will be in result
Update 2
Use this query to get all sub-documents
db.myCollection.aggregate
(
{ $unwind: '$pair' },
{ $match : {'pair.id' : "2"} }
).result
It produces output like this:
{
"0" : {
"_id" : ObjectId("548deb511a01c68fab8d67db"),
"pair" : {
"id" : "2",
"key" : "Text1"
}
},
"1" : {
"_id" : ObjectId("548deb511a01c68fab8d67db"),
"pair" : {
"id" : [
"2",
"3"
],
"key" : "Text2"
}
}
}
Since your query needs to specify a field in a subdocument, this is what will work (see the .find() documentation):
db.myCollection.find({"Text1.id" : "2"}, {"Text1.id": true})
{ "_id" : ObjectId("548dd798e2fa652e675af11d"), "Text1" : { "id" : "2" } }
If the query could be on "Text1" or "Text2", the best thing to do, as mentioned in the accepted answer, is to change your document structure. This can be done easily using the "Bulk" API.
var bulk = db.mycollection.initializeOrderedBulkOp(),
count = 0;
db.mycollection.find().forEach(function(doc) {
var pair = [];
for(var key in doc) {
if(key !== "_id") {
var id = doc[key]["id"].split(/[, ]/);
pair.push({"key": key, "id": id});
}
}
bulk.find({"_id": doc._id}).replaceOne({ "pair": pair });
count++; if (count % 300 == 0){
// Execute per 300 operations and re-Init
bulk.execute();
bulk = db.mycollection.initializeOrderedBulkOp();
}
})
// Clean up queues
if (count % 300 != 0 )
bulk.execute();
Your documents now look like this:
{
"_id" : ObjectId("55edddc6602d0b4fd53a48d8"),
"pair" : [
{
"key" : "Text1",
"id" : [
"2"
]
},
{
"key" : "Text2",
"id" : [
"2",
"3"
]
},
{
"key" : "Text3",
"id" : [
"1"
]
}
]
}
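The per-document reshaping done inside the forEach loop can be tried without a database. The sketch below isolates that step as a plain function; toPairDocument is a hypothetical name, and the input is the sample document from the question without its _id.

```javascript
// Hypothetical helper mirroring the loop body above: turn
// { Text1: { id: "2" }, Text2: { id: "2,3" }, ... } into { pair: [ { key, id: [...] }, ... ] }.
function toPairDocument(doc) {
  const pair = [];
  for (const key of Object.keys(doc)) {
    if (key !== "_id") {
      // Split "2,3" into ["2", "3"]; a single id becomes a one-element array.
      pair.push({ key, id: doc[key].id.split(/[, ]/) });
    }
  }
  return { pair };
}

const originalDoc = {
  Text1: { id: "2" },
  Text2: { id: "2,3" },
  Text3: { id: "1" }
};

const reshaped = toPairDocument(originalDoc);
console.log(JSON.stringify(reshaped, null, 2));
```

Note that the regular expression splits on a single comma or a single space, so an id written as "2, 3" would yield an empty string between the two values; if the real data contains such values, the split pattern may need adjusting.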
Running the following query:
db.mycollection.aggregate([
{ "$project": {
"pair": {
"$setDifference": [
{ "$map": {
"input": "$pair",
"as": "pr",
"in": {
"$cond": [
{ "$setIsSubset": [ ["2"], "$$pr.id" ]},
"$$pr",
false
]
}
}},
[false]
]
}
}}
])
returns:
{
"_id" : ObjectId("55edddc6602d0b4fd53a48d8"),
"pair" : [
{
"key" : "Text1",
"id" : [
"2"
]
},
{
"key" : "Text2",
"id" : [
"2",
"3"
]
}
]
}
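What the $project stage computes for a single document can be mimicked in plain JavaScript: $map replaces each non-matching element with false, and $setDifference then removes false. This is a rough sketch of that logic only. It ignores the set semantics of $setDifference, which also removes duplicate elements, and filterPairs is a hypothetical name.

```javascript
// Hypothetical helper: keep the pair entries whose id array contains every wanted id.
function filterPairs(pairs, wantedIds) {
  // Step 1 (like $map + $cond): keep the element if it matches, otherwise put false.
  const mapped = pairs.map(pr =>
    wantedIds.every(id => pr.id.includes(id)) ? pr : false
  );
  // Step 2 (like $setDifference with [false]): drop the false placeholders.
  return mapped.filter(entry => entry !== false);
}

const pairEntries = [
  { key: "Text1", id: ["2"] },
  { key: "Text2", id: ["2", "3"] },
  { key: "Text3", id: ["1"] }
];

const selected = filterPairs(pairEntries, ["2"]);
console.log(selected.map(p => p.key));
```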

Count links in Arrays in MongoDB collection

I have a collection of objects, each of which links to other objects through an array:
{
"_id" : ObjectId("53f75bedc5489f86666d305e"),
"id" : "2",
"links_to" : [
{
"id" : 1,
"label" : null,
},
{
"id" : 3,
"label" : null,
},
{
"id" : 60,
"label" : null,
},
{
"id" : 23,
"label" : null,
}
]
},
{
"_id" : ObjectId("53f75bedc5489f86666d305e"),
"id" : "3",
"links_to" : [
{
"id" : 4,
"label" : null,
},
{
"id" : 8,
"label" : null,
},
{
"id" : 23,
"label" : null,
},
{
"id" : 2,
"label" : null,
}
]
},
...
Now I would like to write a query that outputs the number of links for each id, e.g.:
{"id": 1, "numberOfLinks": 21},
{"id": 2, "numberOfLinks": 15},
...
Thanks in advance.
The best approach is to keep the count on the document and update it when you either $push or $pull elements of the array using the $inc operator. In this way the field is maintained on the document itself:
{
"links_to": [],
"linkCount": 0
}
When you "push":
db.collection.update(
{},
{ "$push": { "links_to": newLink }, "$inc": { "linkCount": 1 } }
)
And "pull":
db.collection.update(
{},
{ "$pull": { "links_to": newLink }, "$inc": { "linkCount": -1 } }
)
Without doing this, you can use the $size operator from the aggregation framework in modern MongoDB (2.6 and later) to get the array length:
db.collection.aggregate([
{ "$project": {
"numberOfLinks": { "$size": "$links_to" }
}}
])
Or in versions prior to MongoDB 2.6 you can count the array members after $unwind and $group:
db.collection.aggregate([
{ "$unwind": "$links_to" },
{ "$group": {
"_id": "$id",
"numberOfLinks": { "$sum": 1 }
}}
])
So usually, unless you need something specifically "dynamic", just maintain the count on the document. This avoids the overhead of calculating it at query time.
Actually this is fairly simple to achieve using aggregation:
db.foo.aggregate([
{$unwind: "$links_to" },
{$group: { _id: {"lti":"$links_to.id"}, numberOfLinks: {$sum: 1} } },
{$project: { _id:0, id: "$_id.lti", numberOfLinks: "$numberOfLinks" } }
])
produces the desired output, though in reversed order of fields, at least in the shell output:
{ "numberOfLinks" : 3, "id" : 3 }
{ "numberOfLinks" : 3, "id" : 2 }
{ "numberOfLinks" : 1, "id" : 5 }
{ "numberOfLinks" : 2, "id" : 4 }
{ "numberOfLinks" : 3, "id" : 1 }
If you can live with an output like:
{ "_id" : { "linksToId" : 3 }, "numberOfLinks" : 3 }
{ "_id" : { "linksToId" : 2 }, "numberOfLinks" : 3 }
{ "_id" : { "linksToId" : 5 }, "numberOfLinks" : 1 }
{ "_id" : { "linksToId" : 4 }, "numberOfLinks" : 2 }
{ "_id" : { "linksToId" : 1 }, "numberOfLinks" : 3 }
you can skip the $project step of the aggregation pipeline.
This is extremely efficient. I ran a test doing basically the same thing over a collection of 5M documents with roughly 17M relations; it took 18 seconds on a not exactly high-performance server.
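The three pipeline stages map onto a simple loop, which can be useful for checking expected counts on a small sample. The following is an in-memory sketch in plain JavaScript, not a replacement for the aggregation; countLinks is a hypothetical name, and the data is the two sample documents from the question.

```javascript
// Hypothetical helper: count how many times each id appears as a link target.
function countLinks(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // "Unwind": visit every element of links_to individually.
    for (const link of doc.links_to) {
      // "Group": increment the counter for the linked id.
      counts.set(link.id, (counts.get(link.id) || 0) + 1);
    }
  }
  // "Project": shape the result like the desired output.
  return [...counts].map(([id, numberOfLinks]) => ({ id, numberOfLinks }));
}

const linkDocs = [
  { id: "2", links_to: [
    { id: 1, label: null }, { id: 3, label: null },
    { id: 60, label: null }, { id: 23, label: null }
  ] },
  { id: "3", links_to: [
    { id: 4, label: null }, { id: 8, label: null },
    { id: 23, label: null }, { id: 2, label: null }
  ] }
];

const linkCounts = countLinks(linkDocs);
console.log(linkCounts);
```

In this sample only id 23 is linked from both documents, so it is the only id with a count above 1.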