MongoDB issue with findOne query

I maintain a rooms collection whose records are associated with conversations. I want to get the conversation id shared by two users, so I am using a findOne query, but it returns a different record than the one I need.
Please suggest where the query goes wrong.
If I run the query:
rooms.findOne({ "userId" :"800", "userId" :"600"});
I am expecting a conversation id of fsny11z742kpgb9, but it returns 6puebew70kke29.
{
"_id": ObjectId("571c5724db62826826d28d08"),
"conversationId": "6puebew70kke29",
"userId": "600",
"firstName": "Test",
"profileImagePath": "",
"created": ISODate("2016-04-24T05:18:28.753Z"),
"__v": 0
}
{
"_id": ObjectId("571c5724db62826826d28d09"),
"conversationId": "6puebew70kke29",
"userId": "900",
"firstName": "User",
"profileImagePath": "",
"created": ISODate("2016-04-24T05:18:28.754Z"),
"__v": 0
}
{
"_id": ObjectId("571c574edb62826826d28d0b"),
"conversationId": "fsny11z742kpgb9",
"userId": "600",
"firstName": "FitTest",
"profileImagePath": "",
"created": ISODate("2016-04-24T05:19:10.192Z"),
"__v": 0
}
{
"_id": ObjectId("571c574edb62826826d28d0c"),
"conversationId": "fsny11z742kpgb9",
"userId": "800",
"firstName": "Dev",
"profileImagePath": "",
"created": ISODate("2016-04-24T05:19:10.193Z"),
"__v": 0
}

You have to use aggregation to do so.
rooms.aggregate([
{ $group: { _id: '$conversationId', users: { $push: '$userId' } } },
{ $match: { users: { $all: ['800', '600'] } } }
])

First, findOne() returns the first matching document according to natural order, which reflects the order of the documents on disk (see the MongoDB docs).
Second, the query document you pass to findOne() contains the key userId twice. This is not the same as using the $in operator: in a JavaScript object literal the later value overrides the earlier one, so the query is effectively { "userId": "600" }, and the first document with that userId belongs to conversation 6puebew70kke29.
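The override is plain JavaScript behaviour and can be checked without a database. A minimal sketch (the variable name query is only for illustration):

```javascript
// The object literal from the question: the key "userId" appears twice.
const query = { "userId": "800", "userId": "600" };

// Only one key survives, and it holds the later value.
console.log(Object.keys(query)); // [ 'userId' ]
console.log(query.userId);       // 600
```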
As Mathieu suggested, a proper lookup would use an aggregation pipeline with two stages:
rooms.aggregate([
{ $group: { _id: '$conversationId', users: { $push: '$userId' } } },
{ $match: { users: { $all: ['800', '600'] } } }
])
The $group stage creates one entry per conversation id, with an array field containing all of its userIds.
The $match stage keeps only the entries whose users array contains the ids of both users you are looking for.
Bear in mind that this returns every conversation that includes both users.
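To see what the two stages do, here is a plain JavaScript sketch that mimics them in memory on the four sample records. The helper name conversationsWithAll is hypothetical and is not part of any MongoDB API; it only illustrates the grouping and the $all check.

```javascript
// Sample rows from the question, reduced to the two relevant fields.
const rooms = [
  { conversationId: "6puebew70kke29", userId: "600" },
  { conversationId: "6puebew70kke29", userId: "900" },
  { conversationId: "fsny11z742kpgb9", userId: "600" },
  { conversationId: "fsny11z742kpgb9", userId: "800" },
];

// Hypothetical helper: mimics $group (collect userIds per conversation)
// followed by $match with $all (keep groups containing every wanted id).
function conversationsWithAll(rows, wantedUserIds) {
  const usersByConversation = new Map();
  for (const { conversationId, userId } of rows) {
    if (!usersByConversation.has(conversationId)) {
      usersByConversation.set(conversationId, []);
    }
    usersByConversation.get(conversationId).push(userId);
  }
  return [...usersByConversation]
    .filter(([, users]) => wantedUserIds.every((id) => users.includes(id)))
    .map(([conversationId]) => conversationId);
}

console.log(conversationsWithAll(rooms, ["800", "600"])); // [ 'fsny11z742kpgb9' ]
```

If the same pair of users can share several conversations, the result contains all of them, just as the pipeline would.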

Elasticsearch - How to get the latest record in each group with filter?

I have a few records in Elasticsearch. I want to group the records by user_id and fetch the latest record for each user, but only if its event_type is 1.
If the event_type of a user's latest record is not 1, that record should not be fetched. I did this with a MySQL query; please let me know how I can do the same in Elasticsearch.
The MySQL query:
SELECT * FROM user_events
WHERE id IN( SELECT max(id) FROM `user_events` group by user_id ) AND event_type=1;
I need the same output from Elasticsearch aggregations.
Elasticsearch query:
GET test_analytic_report/_search
{
"from": 0,
"size": 0,
"query": {
"bool": {
"must": [
{
"range": {
"event_date": {
"gte": "2022-10-01",
"lte": "2023-02-06"
}
}
}
]
}
},
"sort": {
"event_date": {
"order": "desc"
}
},
"aggs": {
"group": {
"terms": {
"field": "user_id"
},
"aggs": {
"group_docs": {
"top_hits": {
"size": 1,
"_source": ["user_id", "event_date", "event_type"],
"sort": {
"user_id": "desc"
}
}
}
}
}
}
}
That is the query I have so far. I have two users, with user_id 55 and 56. The query also returns records with other event_type values, but I want only each user's latest record, and only when its event_type is 1. If a user's last record does not have event_type=1, that user should not appear in the results.
In the table above, the latest record for user_id 56 has event_type 2, so it should not appear in the aggregations.
I tried, but it does not return the exact result I want.
Note: event_date is the current date and time. I inserted the records in the image above manually, which is why the dates differ.
GET user_events/_search
{
"size": 1,
"query": {
"term": {
"event_type": 1
}
},
"sort": [
{
"id": {
"order": "desc"
}
}
]
}
Explanation: This is an Elasticsearch API request in JSON format. It retrieves the latest event of type 1 (specified by "event_type": 1 in the query) from the "user_events" index, with a size of 1 (specified by "size": 1) and sorts the results in descending order by the "id" field (specified by "order": "desc" in the sort).
If your ES version supports it, you can do this with the field collapsing feature. Here is an example query:
{
"_source": false,
"query": {
"bool": {
"filter": {
"term": {
"event_type": 1
}
}
}
},
"collapse": {
"field": "user_id",
"inner_hits": {
"name": "the_record",
"size": 1,
"sort": [
{
"id": "desc"
}
]
}
},
"sort": [
{
"id": {
"order": "desc"
}
}
]
}
In the response, you will see that the document you want is in inner_hits under the name you give. In my example it is the_record. You can change the size of the inner hits if you want more records in each group and sort them.
TL;DR
There are many ways to go about it:
Sorting
Collapsing
Latest Transform
All of those solutions only approximate what you could get with SQL.
My personal favourite is the transform.
Solution - transform jobs
Set up
We create 2 users, with 2 events each.
PUT 75324839/_bulk
{"create":{}}
{"user_id": 1, "type": 2, "date": "2015-01-01T00:00:00.000Z"}
{"create":{}}
{"user_id": 1, "type": 1, "date": "2016-01-01T00:00:00.000Z"}
{"create":{}}
{"user_id": 2, "type": 1, "date": "2015-01-01T00:00:00.000Z"}
{"create":{}}
{"user_id": 2, "type": 2, "date": "2016-01-01T00:00:00.000Z"}
Transform job
This transform job is going to run against the index 75324839.
It finds the latest document for each user_id, based on the value of the date field.
The results are stored in latest_75324839.
PUT _transform/75324839
{
"source": {
"index": [
"75324839"
]
},
"latest": {
"unique_key": [
"user_id"
],
"sort": "date"
},
"dest": {
"index": "latest_75324839"
}
}
If you query latest_75324839 after the transform has run, you will find:
{
"hits": [
{
"_index": "latest_75324839",
"_id": "AGvuZWuqqz7c5ytICzX5Z74AAAAAAAAA",
"_score": 1,
"_source": {
"date": "2016-01-01T00:00:00.000Z",
"user_id": 1,
"type": 1
}
},
{
"_index": "latest_75324839",
"_id": "AA3tqz9zEwuio1D73_EArycAAAAAAAAA",
"_score": 1,
"_source": {
"date": "2016-01-01T00:00:00.000Z",
"user_id": 2,
"type": 2
}
}
]
}
Get the final results
To get the number of users whose latest event has type=1, use a simple search query such as:
GET latest_75324839/_search
{
"query": {
"term": {
"type": {
"value": 1
}
}
},
"aggs": {
"number_of_user": {
"cardinality": {
"field": "user_id"
}
}
}
}
Side notes
This transform job ran in batch mode, which means it runs only once.
It is also possible to run it continuously, so that the latest event for each user_id is always up to date.
Here are some examples.
You are looking for an SQL HAVING clause, which would allow you to filter results after grouping. Sadly, there is no direct equivalent in Elasticsearch.
So it is not possible to:
sort, collapse and filter afterwards (even post_filter does not help here)
use a top_hits aggregation with custom sorting and then filter
use any map/reduce scripted aggregations, as they do not support sorting
work with subqueries
Put simply, Elasticsearch is not a relational database. Any sorting or relation to other documents should be based on scoring, and the score is calculated independently for each document, distributed across shards.
But there is a tiny loophole, which might be the solution for your use case. It is based on a top_metrics aggregation followed by bucket selector to eliminate the unwanted event types:
GET test_analytic_report/_search
{
"size": 0,
"aggs": {
"by_id": {
"terms": {
"field": "user_id",
"size": 100
},
"aggs": {
"tm": {
"top_metrics": {
"metrics": {
"field": "event_type"
},
"sort": [
{
"id": {
"order": "desc"
}
}
]
}
},
"event_type_filter": {
"bucket_selector": {
"buckets_path": {
"event_type": "tm.event_type"
},
"script": "params.event_type == 1"
}
}
}
}
}
}
If you require more fields from the source document you can add them to the top_metrics.
It is sorted by id now, but you can also use event_date.
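To check the aggregation's output against the intent of the original SQL, a small in-memory reference can help. The sketch below is plain JavaScript, not an Elasticsearch API; the sample events and the helper name latestWithType are hypothetical.

```javascript
// Hypothetical sample events (not the asker's data).
const events = [
  { id: 1, user_id: 1, event_type: 2 },
  { id: 2, user_id: 1, event_type: 1 },
  { id: 3, user_id: 2, event_type: 1 },
  { id: 4, user_id: 2, event_type: 2 },
];

// Keep each user's row with the highest id (the "latest" one),
// then keep that row only if it has the wanted event type.
function latestWithType(rows, wantedType) {
  const latestByUser = new Map();
  for (const row of rows) {
    const current = latestByUser.get(row.user_id);
    if (current === undefined || row.id > current.id) {
      latestByUser.set(row.user_id, row);
    }
  }
  return [...latestByUser.values()].filter((row) => row.event_type === wantedType);
}

console.log(latestWithType(events, 1)); // [ { id: 2, user_id: 1, event_type: 1 } ]
```

User 2 is dropped because its latest event has event_type 2, which is the behaviour the bucket_selector is meant to reproduce.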

IF statement in Couchbase Map of view - I'm sure I'm missing something simple

I'm trying to limit the map in my view to a specific set of documents, either by having the id start with a given string or based on a specific node being present in the JSON. I can't seem to get a result set once I add an IF statement. The reduce is a simple _count:
function(doc, meta) {
if (doc.metricType == "Limit_Exceeded") {
emit([doc.ownedByCustomerNumber, doc.componentProduct.category], meta.id);
}
}
I've also tried if (doc.metricType) and if (meta.id.startsWith("Turnaway:")).
Example Doc:
{
"OvidUserId": 26105400,
"id": "Turnaway:00005792:10562440",
"ipAddress": "111187081038",
"journalTurnawayNumber": 10562440,
"metricType": "Limit_Exceeded",
"oaCode": "OA_Gold",
"orderNumber": 683980,
"ovidGroupID": 3113900,
"ovidGroupName": "tnu999",
"ovidUserName": "tnu999",
"ownedByCustomerNumber": 59310,
"platform": "Lippincott",
"samlString": "",
"serialName": "00005792",
"sessionID": "857616ee-dab7-43d0-a08b-abb2482297dd",
"soldProduct": {
"category": "Multidisciplinary Subjects",
"name": "Custom Collection For CALIS - LWW TA 2020",
"productCode": "CCFCCSI20",
"productNumber": 33410,
"subCategory": "",
"subject": "Multidisciplinary Subjects"
},
"soldToCustomer": {
"customerNumber": 59310,
"keyAccount": false,
"name": "Tongji University"
},
"turnawayDateTime": "2022-05-04T03:01:44.600",
"usedByCustomer": {
"customerNumber": 59310,
"keyAccount": false,
"name": "Tongji University"
},
"usedByCustomerNumber": 59310,
"yearMonth": "202205"
}
Thanks,
Gerry
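Found it (of course, right after posting the question). The second component of the key in the emit has to exist. I had entered doc.componentProduct.category instead of doc.soldProduct.category. Because the example document has no componentProduct node, that expression fails for every document and nothing is emitted.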
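For reference, here is a sketch of the corrected map function with an extra guard, so that documents without a soldProduct node are skipped rather than failing. The emit stub and the named function map exist only to make the sketch runnable outside the view engine; in the view itself the function would be anonymous and emit is provided by the server.

```javascript
// Stub for the emit() the view engine provides; it only records calls here.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// Map function with the corrected path and a guard for missing nodes.
function map(doc, meta) {
  if (doc.metricType === "Limit_Exceeded" && doc.soldProduct) {
    emit([doc.ownedByCustomerNumber, doc.soldProduct.category], meta.id);
  }
}

// Trimmed-down version of the example document from the question.
map(
  {
    metricType: "Limit_Exceeded",
    ownedByCustomerNumber: 59310,
    soldProduct: { category: "Multidisciplinary Subjects" },
  },
  { id: "Turnaway:00005792:10562440" }
);
// A document without soldProduct is skipped instead of raising an error.
map({ metricType: "Limit_Exceeded", ownedByCustomerNumber: 1 }, { id: "x" });

console.log(JSON.stringify(emitted));
```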

Query Nested Mongodb info with Variable Nested Field names

I have a MongoDB that is structured as below:
[
{
"subject_id": "1",
"name": "Maria",
"dob": "1/1/00",
"gender": "F",
"visits": {
"1/1/18": {
"date_entered": "1/2/18",
"entered_by": "Sally"
},
"1/2/18": {
"date_entered": "1/2/18",
"entered_by": "Tim",
}
},
"samples": {
"XXX123": {
"collected_by": "Sally",
"collection_date": "1/3/18"
}
}
},
{
"subject_id": "2",
"name": "Bob",
"dob": "1/2/00",
"gender": "M",
"visits": {
"1/3/18": {
"date_entered": "1/4/18",
"entered_by": "Tim"
}
},
"samples": {
"YYY456": {
"collected_by": "Sally",
"collection_date": "1/5/18"
},
"ZZZ789": {
"collected_by": "Tim",
"collection_date": "1/6/18"
},
"AAA123": {
"collected_by": "Sally",
"collection_date": "1/7/18"
}
}
}
]
If I wanted to query the database to find all samples collected by Sally or all visits entered by Tim, what would be the best way of doing that?
I'm new to MongoDB and my attempts with various regex's haven't produced results. Any advice would be greatly appreciated.
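I first used $project with $objectToArray on the required fields, followed by $unwind to create a separate record for each element of the arrays created in the $project stage. $objectToArray turns each sub-document into an array of { k, v } pairs, which is why the $match paths go through v.
The results are then filtered using $match.
This works for the data provided in the question: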
db.so.aggregate([
{$project: {visits: {$objectToArray: "$visits"}, samples: {$objectToArray: "$samples"}}},
{$unwind: "$visits"},
{$unwind: "$samples"},
{ $match: {
$or : [
{ "visits.v.entered_by" : "Tim" },
{ "samples.v.collected_by" : "Sally" }
]
}
}
])
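The same idea can be tried out in plain JavaScript, where Object.entries plays the role of $objectToArray. This is only an in-memory sketch on a trimmed-down copy of the sample data; the helper name samplesCollectedBy is hypothetical.

```javascript
// Trimmed-down copy of the two subjects from the question.
const subjects = [
  { subject_id: "1",
    samples: { XXX123: { collected_by: "Sally", collection_date: "1/3/18" } } },
  { subject_id: "2",
    samples: {
      YYY456: { collected_by: "Sally", collection_date: "1/5/18" },
      ZZZ789: { collected_by: "Tim", collection_date: "1/6/18" },
      AAA123: { collected_by: "Sally", collection_date: "1/7/18" },
    } },
];

// Hypothetical helper: Object.entries plays the role of $objectToArray,
// flatMap the role of $unwind, and filter the role of $match.
function samplesCollectedBy(docs, collector) {
  return docs.flatMap((doc) =>
    Object.entries(doc.samples || {})
      .filter(([, sample]) => sample.collected_by === collector)
      .map(([sampleId]) => ({ subject_id: doc.subject_id, sample_id: sampleId }))
  );
}

console.log(samplesCollectedBy(subjects, "Sally"));
```

Because the variable sample ids become values (the k of each pair) instead of field names, the filter no longer needs to know them in advance.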

CouchDB: how to get documents directly at the first level of the JSON, not grouped inside value - viewWithList

I have a design document, 'accounts', and a view, 'accounts-view'.
The view's content is:
function (doc) {
emit( doc._id, doc);
}
And my code in Express is:
db.view('accounts', 'accounts-view', function(err, body) {
if (err) throw err;
res.json(body.rows);
});
Result is:
[
{
"id": "8767d3474a0e80dd0ab7d0b0580065af",
"key": "8767d3474a0e80dd0ab7d0b0580065af",
"value": {
"_id": "8767d3474a0e80dd0ab7d0b0580065af",
"_rev": "1-37eb3e76e4715e9a4fc8930470cc4ca3",
"type": "accounts",
"lastname": "Kitchen",
"firstname": "Peter"
}
},
{
"id": "8767d3474a0e80dd0ab7d0b058006e3c",
"key": "8767d3474a0e80dd0ab7d0b058006e3c",
"value": {
"_id": "8767d3474a0e80dd0ab7d0b058006e3c",
"_rev": "1-bcab94bb253c83b4951a787c253896f5",
"type": "accounts",
"lastname": "Kolner",
"firstname": "John"
}
}
]
How can I get just something like this (printing only what is inside value for every row)?
[
{
"_id": "8767d3474a0e80dd0ab7d0b0580065af",
"_rev": "1-37eb3e76e4715e9a4fc8930470cc4ca3",
"type": "accounts",
"lastname": "Kitchen",
"firstname": "Peter"
},
{
"_id": "8767d3474a0e80dd0ab7d0b058006e3c",
"_rev": "1-bcab94bb253c83b4951a787c253896f5",
"type": "accounts",
"lastname": "Kolner",
"firstname": "John"
}
]
UPDATE:
I've followed Domonique's suggestions, and now I have a new view that emits just the id (so I can save space on disk and retrieve the doc with the parameter "include_docs=true" on the view):
function(doc) {
if (doc.type && doc.type=='accounts') {
emit( doc._id);
}
}
and a new list:
function(head, req) {
provides('json', function() {
var results = [];
var row; // declared explicitly to avoid an implicit global
while ((row = getRow())) {
//results.push(row.value);
results.push(row.doc);
}
send(JSON.stringify(results));
});
}
Finally, I get the records with:
http://127.0.0.1:5984/crm/_design/crmapp/_list/accounts-list/accounts-view?include_docs=true
and the result is:
[
{
"_id": "8767d3474a0e80dd0ab7d0b0580065af",
"_rev": "1-37eb3e76e4715e9a4fc8930470cc4ca3",
"type": "accounts",
"lastname": "Kitchen",
"firstname": "Peter"
},
{
"_id": "8767d3474a0e80dd0ab7d0b058006e3c",
"_rev": "1-bcab94bb253c83b4951a787c253896f5",
"type": "accounts",
"lastname": "Kolner",
"firstname": "John"
},
{
"_id": "8767d3474a0e80dd0ab7d0b058008e9a",
"_rev": "1-86078f00be82b97499a0f52488cefbbf",
"lastname": "Tower",
"firstname": "George",
"type": "accounts"
}
]
My updated Node/Express app:
db.viewWithList('crmapp', 'accounts-view','accounts-list', {"include_docs":"true"} , function(err, body) {
if (err) throw err;
res.json(body);
});
With this list I no longer need to reduce the result in the Express project. Is that OK?
How do I update my list or view to get a document by id? Just adding the id to the URL does not work, like this:
http://127.0.0.1:5984/crm/_design/crmapp/_list/accounts-list/accounts-view?include_docs=true&_id=8767d3474a0e80dd0ab7d0b058006e3c
I get all the records, not only the one with that id.
To answer your question here, you should simply map the array and only include the value portion:
db.view('accounts', 'accounts-view', function(err, body) {
if (err) throw err;
res.json(body.rows.map(function (row) {
return row.value;
}));
});
Since it's apparent you are new to CouchDB, I'll also give you some advice regarding views. First, the view you've created is just a duplicate of the built-in _all_docs view, so you should use that rather than creating your own (especially since your view also duplicates the data on disk).
However, as you get further along in your application, you will likely use real views that partition documents differently depending on the query. In those views, you should not emit the entire document (i.e. doc) as the value. Doing so effectively duplicates the document on disk, since it is stored both in the database and in the view index.
The recommended starting point is to simply leave out the 2nd argument of your emit:
function (doc) {
emit(doc._id);
}
When you query the view, you can simply add include_docs=true to the URL and the rows will look something like this:
[
{
"id": "8767d3474a0e80dd0ab7d0b0580065af",
"key": "8767d3474a0e80dd0ab7d0b0580065af",
"value": null,
"doc": {
"_id": "8767d3474a0e80dd0ab7d0b0580065af",
"_rev": "1-37eb3e76e4715e9a4fc8930470cc4ca3",
"type": "accounts",
"lastname": "Kitchen",
"firstname": "Peter"
}
}
// ...
]
Then, you can retrieve the doc instead of value to achieve the same result much more efficiently.
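A minimal sketch of that last step, assuming the response body has the rows shape shown above. The helper name docsFromRows is hypothetical; in the Express handler it would replace the row.value mapping, provided the view is queried with include_docs set to true.

```javascript
// Hypothetical helper: extracts the documents from a view response body
// that was requested with include_docs=true.
function docsFromRows(body) {
  return body.rows.map(function (row) {
    return row.doc;
  });
}

// Response shaped like the example above (one row shown).
const body = {
  rows: [
    {
      id: "8767d3474a0e80dd0ab7d0b0580065af",
      key: "8767d3474a0e80dd0ab7d0b0580065af",
      value: null,
      doc: {
        _id: "8767d3474a0e80dd0ab7d0b0580065af",
        type: "accounts",
        lastname: "Kitchen",
        firstname: "Peter",
      },
    },
  ],
};

console.log(docsFromRows(body));
```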

finding document in mongodb with specific id and username

{
"data": [
{
"_id": 555,
"username": "jackson",
"status": "i am coding",
"comments": [
{
"user": "bob",
"comment": "bob me "
},
{
"user": "daniel",
"comment": "bob the builder"
},
{
"user": "jesus",
"comment": "bob the builder"
},
{
"user": "hunter",
"comment": "bob the builder"
},
{
"user": "jeo",
"comment": "bob the builder"
},
{
"user": "jill",
"comment": "bob the builder"
}
]
}
]
}
So I want to get the document with _id: 555 and only the comment with user: bob. I tried the code below, but I can't make it work; it returns an empty array.
app.get('/all',function(req , res){
db.facebook.find({_id:555},{comments:[{user:"bob"}]},function(err,docs){res.send(docs,{data:docs});});
} );
I want the result to look like the one below, containing only the comment with user: bob:
{
"_id": 555,
"username": "jackson",
"status": "i am coding",
"comments": [
{
"user": "bob",
"comment": "bob me "
}
]
}
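Only aggregate or mapReduce can exclude arbitrary items from a sub-array in the output (a find() projection with $elemMatch returns at most the first matching element). The shortest way is to use $redact: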
db.facebook.aggregate([
{ $match: { _id: 555 } },
{ $redact:{
$cond:{
if:{$and:[{$not:"$username"},{$ne:["$user","bob"]}]},
then: "$$PRUNE",
else: "$$DESCEND"
}
}
}
])
Explanation:
$redact is applied to each subdocument, starting from the whole document. For each subdocument, $redact either prunes it or descends into it. We want to keep the top-level document, which is why we have the {$not:"$username"} condition: it is false for the top-level document, so that document is never pruned. On the next level we have the comments array. $redact applies the condition to each item of the comments array. The first condition, {$not:"$username"}, is true for all comments, and the second condition, {$ne:["$user","bob"]}, is true for all subdocuments where user != "bob", so those subdocuments are pruned.
Update: in Node.js with the MongoDB native driver:
db.facebook.aggregate([{ $match: { _id: 555 } }, {
$redact:{
$cond:{
if:{$and:[{$not:"$username"},{$ne:["$user","bob"]}]},
then: "$$PRUNE",
else: "$$DESCEND"
}
}
}], function(err,docs){})
One more thing: $redact is a new operator, available only since MongoDB 2.6.
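To make the prune/descend walk concrete, here is a plain JavaScript emulation of it. It is only a sketch and not MongoDB code: the helpers redact and isPlainObject are hypothetical, and the predicate approximates {$not:"$username"} as "username is missing or null".

```javascript
function isPlainObject(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

// Hypothetical helper emulating $redact with $$PRUNE / $$DESCEND.
function redact(node, shouldPrune) {
  if (shouldPrune(node)) return undefined; // $$PRUNE: drop this subdocument
  const out = {};
  for (const [key, value] of Object.entries(node)) { // $$DESCEND
    if (Array.isArray(value)) {
      out[key] = value
        .map((item) => (isPlainObject(item) ? redact(item, shouldPrune) : item))
        .filter((item) => item !== undefined);
    } else if (isPlainObject(value)) {
      const kept = redact(value, shouldPrune);
      if (kept !== undefined) out[key] = kept;
    } else {
      out[key] = value;
    }
  }
  return out;
}

// Shortened copy of the document from the question.
const post = {
  _id: 555,
  username: "jackson",
  status: "i am coding",
  comments: [
    { user: "bob", comment: "bob me " },
    { user: "daniel", comment: "bob the builder" },
  ],
};

// Same condition as the pipeline: no username and user is not "bob".
const redacted = redact(post, (node) => node.username == null && node.user !== "bob");
console.log(JSON.stringify(redacted, null, 2));
```

The top-level document survives because it has a username, and among the comments only the one written by bob does.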