I'm using Postgres 10.12, and I have a table (reels_data) that has a jsonb column called blocks, which is an array of objects, each with its own type and data object. Example:
[
  {
    "type" : "LOGO",
    "data" : {
      "imageId" : 399
    }
  },
  {
    "type" : "CONTACT_INFO",
    "data" : {
      "email" : "",
      "phone" : "",
      "url" : "",
      "name" : "Bob",
      "jobTitle" : "Developer"
    }
  },
  {
    "type" : "MEDIA",
    "data" : {
      "playlists" : [
        {
          "id" : "134e3b49-fe08-43b9-b13a-dc886ec0af61",
          "name" : "Untitled Playlist",
          "media" : [
            {
              "id" : 265,
              "fileUuid" : "8a7519b8-92dc-4978-a239-5b25d66caf45",
              "itemType" : "TRACK",
              "name" : "Test",
              "duration" : "104.749"
            },
            {
              "id" : 266,
              "fileUuid" : "7409bbd5-f8a0-46f2-a077-78c14a4dcd80",
              "itemType" : "TRACK",
              "name" : "Test 2",
              "duration" : "144.163"
            },
            {
              "id" : 267,
              "fileUuid" : "14c0d325-bfce-4ac5-a4f6-3edaa0e86ac5",
              "itemType" : "TRACK",
              "name" : "Test 3",
              "duration" : "143.871"
            }
          ]
        }
      ]
    }
  }
]
My challenge is, if a user deletes media with ID 265, it has to be pulled from all the blocks of type "MEDIA", and to make it more complicated, from all of the playlists in the playlists array.
These blocks can be in any order, so I can't assume an index of 2. And there could be one playlist or 10, and the media to remove could exist in none or several of these playlists.
Is there a single Postgres query I could write to remove all media of ID x? Or is this better done by retrieving the above data with a SQL query, doing some data processing in JavaScript, and then building and committing a SQL transaction to update several rows with new data? Efficiency is the top priority (not taxing the DB server).
Assuming your structure is fixed, try the query below:
with cte as (
    -- Explode blocks -> playlists -> media, drop the media item being
    -- deleted (id 265 here), and rebuild each playlist without it.
    -- Note: a playlist whose media all match the deleted id is dropped.
    select
        t1.id,
        x.data->>'type' as "type",
        x.data->'data'  as "data",
        jsonb_build_object(
            'id',    y.playlists->>'id',
            'name',  y.playlists->>'name',
            'media', jsonb_agg(z.media)
        ) as playlists
    from reels_data t1
    left join lateral jsonb_array_elements(t1.blocks) x(data) on true
    left join lateral jsonb_array_elements(x.data->'data'->'playlists') y(playlists) on true
    left join lateral jsonb_array_elements(y.playlists->'media') z(media) on true
    where z.media->>'id' is null or z.media->>'id' <> '265'
    group by 1, 2, 3, y.playlists->>'id', y.playlists->>'name'
),
cte1 as (
    -- Rebuild each block (re-aggregating all playlists of a MEDIA block),
    -- then re-assemble the blocks array for each row. Note: the order of
    -- blocks in the rebuilt array is not guaranteed.
    select id, jsonb_agg(final) as final_data
    from (
        select
            id,
            jsonb_build_object(
                'type', "type",
                'data', case when "type" = 'MEDIA'
                             then jsonb_build_object('playlists', jsonb_agg(playlists))
                             else "data"
                        end
            ) as final
        from cte
        group by 1, "type", "data"
    ) t1
    group by 1
)
update reels_data t1
set blocks = t2.final_data
from cte1 t2
where t1.id = t2.id;
This removes every media object with the given id from all MEDIA playlists and writes the rebuilt blocks array back.
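For comparison, a single-statement alternative is sketched below (assuming the same reels_data(id, blocks) layout). It rebuilds only the MEDIA blocks with jsonb_set, preserves block order via WITH ORDINALITY, and again hardcodes '265' as the id being deleted:

update reels_data r
set blocks = (
    select jsonb_agg(
        case when b->>'type' = 'MEDIA' then
            -- rewrite only the playlists of MEDIA blocks
            jsonb_set(b, '{data,playlists}', (
                select coalesce(jsonb_agg(
                    jsonb_set(p, '{media}', (
                        -- keep every media item except the deleted id
                        select coalesce(jsonb_agg(m), '[]'::jsonb)
                        from jsonb_array_elements(p->'media') m
                        where m->>'id' <> '265'
                    ))
                ), '[]'::jsonb)
                from jsonb_array_elements(b->'data'->'playlists') p
            ))
        else b end
        order by ord
    )
    from jsonb_array_elements(r.blocks) with ordinality t(b, ord)
)
where blocks @> '[{"type": "MEDIA"}]';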
Given the following documents:
"a" : {
  "id" : "1",
  "arr" : [
    {"id" : "b1"}, {"id" : "b2"}
  ]
}
"b1" : {
  "id" : "b1",
  "innerArr" : [{"id" : "c1"}, {"id" : "c2"}]
}
"b2" : {
  "id" : "b2",
  "innerArr" : [{"id" : "c3"}]
}
"c1" : {
  "name" : "c1"
}...
Right now I'm able to make a join with NEST over an array like this.
SELECT *
FROM bucket AS a
NEST bucket AS bs
    ON META(bs).id IN a.arr[*].id
{
  "id" : "1",
  "arr" : [
    {"id" : "b1"}, {"id" : "b2"}
  ],
  "bs" : [
    {
      "id" : "b1",
      "innerArr" : [{"id" : "c1"}, {"id" : "c2"}]
    },
    {
      "id" : "b2",
      "innerArr" : [{"id" : "c3"}]
    }
  ]
}
Now I want to NEST the c documents for each item in bs.
Adding this NEST doesn't work:
NEST bucket AS c
ON META(c).id IN bs[*].innerArr[*].id
I'm looking for this result:
{
  "id" : "1",
  "arr" : [
    {"id" : "b1"}, {"id" : "b2"}
  ],
  "bs" : [
    {
      "id" : "b1",
      "innerArr" : [{"id" : "c1"}, {"id" : "c2"}],
      "cs" : [{"name" : "c1"}, {"name" : "c2"}]
    },
    {
      "id" : "b2",
      "innerArr" : [{"id" : "c3"}],
      "cs" : [{"name" : "c3"}]
    }
  ]
}
I was able to solve it by iterating over every element of bs in a subquery. Since the subquery is in the SELECT part of the query, it must use USE KEYS instead of ON META().id =. Finally, I add the subquery result to each item.
SELECT a.*,
       ARRAY OBJECT_ADD(item, "cs",
           (SELECT c.* FROM bucket AS c USE KEYS item.innerArr[*].id)
       ) FOR item IN bs END AS bs
FROM bucket AS a
NEST bucket AS bs
    ON META(bs).id IN a.arr[*].id
That is, for each element in bs it queries every element of innerArr, and then adds the result to that element of bs.
With NEST bucket AS bs, bs starts as a document (during Scan, Fetch, and ON processing); once the NEST completes, bs becomes an ARRAY for filtering, grouping, projections, etc. The same applies to chained JOINs and NESTs; see Example 17 in https://blog.couchbase.com/ansi-join-support-n1ql/.
In those situations, use JOIN plus GROUP BY on the left document with ARRAY_AGG on the right document, or use the ARRAY ... FOR ... syntax, as in the sketch below.
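For illustration, a minimal sketch of that JOIN + GROUP BY + ARRAY_AGG shape, assuming the same bucket and documents as in the question:

SELECT a.*, ARRAY_AGG(b) AS bs
FROM bucket AS a
LEFT JOIN bucket AS b
    ON META(b).id IN a.arr[*].id
GROUP BY a;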
The desired result can be achieved with the following query, which uses correlated subqueries (a LEFT OUTER NEST):
SELECT a.*,
       (SELECT b.*,
               (SELECT c.*
                FROM bucket AS c USE KEYS b.innerArr[*].id) AS cs
        FROM bucket AS b USE KEYS a.arr[*].id) AS bs
FROM bucket AS a
WHERE ..........;
I'm trying to find a way to join two nodes in Firebase (a JSON-based structure).
Example data structure:
{
  "users" : {
    "1" : {
      "name" : "Example Name",
      "contacts" : {
        "2" : true,
        "3" : true
      },
      "posts" : {
        "15" : true,
        "28" : true
      }
    }
  },
  "posts" : {
    "5" : {
      "user" : "2",
      "date_time" : "11.11.2016",
      "text" : "example text"
    },
    "15" : {
      "user" : "1",
      "date_time" : "25.11.2016",
      "text" : "example text"
    }
  }
}
The user should now have a newsfeed screen, where all posts of all of his contacts are listed. Therefore a join of the two nodes would make the query much more efficient.
Right now I would execute a query for each contact to get the post IDs, and then have to run a final query to get the actual posts.
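That client-side fan-out might look like this (a sketch using the Firebase web SDK v8 API; the loadNewsfeed name, the happy-path-only error handling, and the lack of ordering are illustrative):

// Joins contacts -> their posts on the client, given the structure above.
const db = firebase.database();

async function loadNewsfeed(userId) {
  // 1. Read the user's contact ids from the index node.
  const contactsSnap = await db.ref(`users/${userId}/contacts`).once('value');
  const contactIds = Object.keys(contactsSnap.val() || {});

  // 2. For each contact, read their post-id index, then fetch each post.
  const postLists = await Promise.all(contactIds.map(async (contactId) => {
    const indexSnap = await db.ref(`users/${contactId}/posts`).once('value');
    const postIds = Object.keys(indexSnap.val() || {});
    return Promise.all(postIds.map(async (postId) => {
      const postSnap = await db.ref(`posts/${postId}`).once('value');
      return { id: postId, ...postSnap.val() };
    }));
  }));

  // 3. Flatten into a single newsfeed list (sorting/pagination left out).
  return postLists.flat();
}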
I want to update an array value that is nested within another array value, i.e. set
status = enabled
where alerts.id = 2
{
    "_id" : ObjectId("5496a8ed49847b6cd7c7b350"),
    "name" : "joe",
    "locations" : [
        {
            "name" : "my location",
            "alerts" : [
                {
                    "id" : 1,
                    "status" : null
                },
                {
                    "id" : 2,
                    "status" : null
                }
            ]
        }
    ]
}
I would have used the positional $ operator, but it cannot be used twice in one statement; multi-positional operators are not supported yet: https://jira.mongodb.org/browse/SERVER-831
How do I issue a statement to only update the status field of an alert matching an id of 2?
UPDATE
If I change the schema as follows:
{
    "_id" : ObjectId("5496ab2149847b6cd7c7b352"),
    "name" : "joe",
    "locations" : {
        "my location" : {
            "alerts" : [
                {
                    "id" : 1,
                    "status" : "enabled"
                },
                {
                    "id" : 2,
                    "status" : "enabled"
                }
            ]
        },
        "my other location" : {
            "alerts" : [
                {
                    "id" : 3,
                    "status" : null
                },
                {
                    "id" : 4,
                    "status" : null
                }
            ]
        }
    }
}
I can then use:
update(
    { "locations.my location.alerts.id" : 1 },
    { $set : { "locations.my location.alerts.$.status" : "enabled" } }
);
The problem is that I then cannot create indexes on the alert id :-(
It may be better modelled as follows, especially if an index on location and/or alerts.id is needed:
{
    "_id" : ObjectId("5496a8ed49847b6cd7c7b350"),
    "name" : "joe",
    "location" : "myLocation",
    "alerts" : [
        {
            "id" : 1,
            "status" : null
        },
        {
            "id" : 2,
            "status" : null
        }
    ]
}
{
    "_id" : ObjectId("5496a8ed49847b6cd7c7b351"),
    "name" : "joe",
    "location" : "otherLocation",
    "alerts" : [
        {
            "id" : 1,
            "status" : null
        },
        {
            "id" : 2,
            "status" : null
        }
    ]
}
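With that layout, the single positional operator suffices, and the index becomes possible (a sketch; the users collection name is assumed):

// Update one alert's status; $ refers to the array element
// matched by the "alerts.id" condition.
db.users.update(
    { "location" : "myLocation", "alerts.id" : 2 },
    { $set : { "alerts.$.status" : "enabled" } }
);

// The index the nested schema could not support:
db.users.createIndex({ "location" : 1, "alerts.id" : 1 });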
I think you have the wrong tool for the job. What you have in your example is relational data, and it's much easier to handle with a relational database, so I would suggest using a SQL database instead of Mongo.
But if you really want to do it with Mongo, then I guess the only option is to fetch the document, modify it, and put it back.
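A sketch of that fetch-modify-save round trip in the shell (the users collection name is assumed):

// Fetch the document, flip the matching alert in memory, save it back.
var doc = db.users.findOne({ "name" : "joe" });
doc.locations.forEach(function (loc) {
    loc.alerts.forEach(function (alert) {
        if (alert.id === 2) { alert.status = "enabled"; }
    });
});
db.users.save(doc);

Worth noting: SERVER-831 has since been resolved. MongoDB 3.6 added filtered positional operators, so on newer servers a single in-place update works against the original schema: db.users.updateOne({ "locations.alerts.id" : 2 }, { $set : { "locations.$[].alerts.$[a].status" : "enabled" } }, { arrayFilters : [ { "a.id" : 2 } ] }).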
I have been using the following MySQL command to construct a heatmap from log data. However, I have a new data set that is stored in a Mongo database and I need to run the same command.
select concat(a.packages, '&', b.packages) as "Concurrent Packages",
       count(*) as "Count"
from data a
cross join data b
where a.packages < b.packages and a.jobID = b.jobID
group by a.packages, b.packages
order by a.packages, b.packages;
Keep in mind that the tables a and b do not exist prior to the query; they are both aliases of the data table, which has a packages column and a jobID field that I want to check for matches. In other words, if two packages appear within the same job, I want to add an entry to the concurrent-usage count. How can I write a similar query in Mongo?
This is not a "join" of different documents; it is an operation within one document, and can be done in MongoDB.
You have a SQL TABLE "data" like this:
JobID TEXT,
package TEXT
The best way to store this in MongoDB will be a collection called "data", containing one document per JobID that contains an array of packages:
{
    _id: <JobID>,
    packages: [
        "packageA",
        "packageB",
        ....
    ]
}
[ Note: you could also implement your data table as just one document in MongoDB, containing an array of jobs which each contain an array of packages. This is not recommended, because you might hit the 16MB document size limit, and nested arrays are not (yet) well supported by the various query operators, should you want to use the data for other purposes as well. ]
Now, how do we get a result like this?
{ pair: [ "packageA", "packageB" ], count: 20 },
{ pair: [ "packageA", "packageC" ], count: 11 },
...
As there is no built-in "cross join" of two arrays in MongoDB, you'll have to program it out in the map function of a mapReduce(), emitting each pair of packages as a key:
mapf = function () {
    var that = this;
    // Emit each unordered pair of packages within this job once.
    this.packages.forEach( function( p1 ) {
        that.packages.forEach( function( p2 ) {
            if ( p1 < p2 ) {
                emit( { "pair": [ p1, p2 ] }, 1 );
            }
        });
    });
};
[ Note: this could be optimized, if the packages arrays were sorted ]
The reduce function is nothing more than summing up the counters for each key:
reducef = function( key, values ) {
    var count = 0;
    values.forEach( function( value ) { count += value } );
    return count;
};
So, for this example collection:
> db.data.find()
{ "_id" : "Job01", "packages" : [ "pA", "pB", "pC" ] }
{ "_id" : "Job02", "packages" : [ "pA", "pC" ] }
{ "_id" : "Job03", "packages" : [ "pA", "pB", "pD", "pE" ] }
we get the following result:
> db.data.mapReduce(
... mapf,
... reducef,
... { out: 'pairs' }
... );
{
    "result" : "pairs",
    "timeMillis" : 443,
    "counts" : {
        "input" : 3,
        "emit" : 10,
        "reduce" : 2,
        "output" : 8
    },
    "ok" : 1
}
> db.pairs.find()
{ "_id" : { "pair" : [ "pA", "pB" ] }, "value" : 2 }
{ "_id" : { "pair" : [ "pA", "pC" ] }, "value" : 2 }
{ "_id" : { "pair" : [ "pA", "pD" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pA", "pE" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pB", "pC" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pB", "pD" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pB", "pE" ] }, "value" : 1 }
{ "_id" : { "pair" : [ "pD", "pE" ] }, "value" : 1 }
For more information on mapReduce consult: http://docs.mongodb.org/manual/reference/method/db.collection.mapReduce/ and http://docs.mongodb.org/manual/applications/map-reduce/
You can't. Mongo doesn't do joins. Switching from SQL to Mongo is a lot more involved than migrating your queries.
Typically, you would include all the pertinent information in the same record (rather than normalize the information and select it with a join). Denormalize!
I want to figure out the most active users on my site.
I have records of the form
{
    "_id" : "db1855b0-f2f4-44eb-9dbb-81e27780c796",
    "createdAt" : 1360497266621,
    "profile" : { "name" : "test" },
    "services" : {
        "resume" : {
            "loginTokens" : [
                {
                    "token" : "82c01cb8-796a-4765-9366-d07c98c64f4d",
                    "when" : 1360497266624
                },
                {
                    "token" : "0e4bc0a4-e139-4804-8527-c416fb20f6b1",
                    "when" : 1360497474037
                }
            ]
        },
        "twitter" : {
            "accessToken" : "9314Sj9kKvSyosxTWPY5r57851C2ScZBCe",
            "accessTokenSecret" : "UiDcJfOfjH7g9UiBEOBs",
            "id" : 2933049,
            "screenName" : "testname"
        }
    }
}
I want to be able to select users and order by the number of loginTokens.
In MySQL it would be something like:
SELECT id, COUNT(logins) AS logins
FROM users
GROUP BY id ORDER BY logins DESC
I've tried this on querymongo.com and I got an error (can't work with aliases / can't order by non-column names).
What's the Mongo way to do this?
Thanks!
I just converted:
SELECT id, COUNT(logins)
FROM users
GROUP BY id
To:
db.users.group({
    "key": { "id": true },
    "initial": { "countlogins": 0 },
    "reduce": function(obj, prev) {
        // increments once per document in the group
        prev.countlogins++;
    }
});
Hope this helps
Here is an example of what you described, using the aggregation framework:
db.users.aggregate([
    { $unwind: '$services.resume.loginTokens' },
    { $group: { _id: '$_id', logins: { $sum: 1 } } },
    { $sort: { logins: -1 } }
])
This should do the trick.
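For the sample document above (two loginTokens), this would return something like:

    { "_id" : "db1855b0-f2f4-44eb-9dbb-81e27780c796", "logins" : 2 }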