MySQL query conversion to MongoDB - mysql

I am new to MongoDB. In MySQL I wrote a query that returns, for each country, the highest deaths value together with its reporting date. As a first step in MongoDB, I am trying to get the maximum of the deaths column, but it returns a value that is not the highest. Here is my MySQL query:
SELECT
d.country_name, s.dt, MAX(s.deaths)
FROM
Demographics d
inner JOIN statistics s
where d.country_id=s.country_id
GROUP BY country_name
ORDER BY MAX(s.deaths) DESC;
It is returning:
Germany
2022-01-29
118335
Bangladesh
2022-01-30
34
What is the equivalent MongoDB query to get the same result?
To get the max value of the deaths column in MongoDB, I used:
db.statistics.aggregate([
{
$group: {
_id: "$country_id",
maxQuantity: {
$max: "$deaths"
}
}
}
])
Here is my sample input:
Demographics
{"country_id":"BGD","country_name":"Bangladesh","population":"164700000","area":"148460","density":"1265"}, {"country_id":"DEU","country_name":"Germany","population":"83200000","area":"357386","density":"232"}
statistics
{"country_id":"DEU","dt":"2022-01-29", "confirmed_cases":"2016684", "deaths":"118335"},
{"country_id":"DEU","dt":"2022-01-17", "confirmed_cases":"53916", "deaths":"143"},
{"country_id":"BGD","dt":"2022-01-30", "confirmed_cases":"12183", "deaths":"34"},
{"country_id":"BGD","dt":"2022-01-29", "confirmed_cases":"10378", "deaths":"21"},

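Updated: the original poster asked for the maximum deaths for each country.
1. $lookup - Join the statistics and Demographics collections on country_id.
2. $set - Convert the deaths field to an integer.
3. $sort - Sort by deaths in descending order.
4. $group - Group by country_id. Take the first value ($first) of each field, since the documents were sorted by deaths in stage 3.
5. $sort - Sort by maxQuantity in descending order.
6. $project - Shape the output document.
Side note: confirmed_cases and deaths are stored as strings, which is why the original $max returned an unexpected value: strings are compared lexicographically, so "143" ranks above "118335". Storing these fields as numbers would avoid the conversion step.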
db.statistics.aggregate([
{
"$lookup": {
"from": "Demographics",
"localField": "country_id",
"foreignField": "country_id",
"as": "demographics"
}
},
{
"$set": {
deaths: {
$toInt: "$deaths"
}
}
},
{
$sort: {
deaths: -1
}
},
{
$group: {
_id: {
country_id: "$country_id"
},
country: {
$first: "$demographics"
},
dt: {
$first: "$dt"
},
maxQuantity: {
$first: "$deaths"
}
}
},
{
$sort: {
maxQuantity: -1
}
},
{
$project: {
_id: 0,
country_name: {
$first: "$country.country_name"
},
dt: "$dt",
maxQuantity: "$maxQuantity"
}
}
])
Sample Mongo Playground
For the MySQL query, the INNER JOIN should be written as:
INNER JOIN statistics s ON d.country_id=s.country_id
so that the WHERE clause is no longer needed.
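To make the intent of the pipeline easier to check, here is a minimal plain-JavaScript sketch of what it computes, run against the sample rows from the question (with the first date written as 2022-01-29, as in the question's output). The helper name maxDeathsPerCountry is hypothetical and not a MongoDB API; it only mirrors the join, the integer conversion, the per-country maximum and the final sort.

```javascript
// Sample documents adapted from the question (deaths are stored as strings there).
const demographics = [
  { country_id: "BGD", country_name: "Bangladesh" },
  { country_id: "DEU", country_name: "Germany" },
];
const statistics = [
  { country_id: "DEU", dt: "2022-01-29", deaths: "118335" },
  { country_id: "DEU", dt: "2022-01-17", deaths: "143" },
  { country_id: "BGD", dt: "2022-01-30", deaths: "34" },
  { country_id: "BGD", dt: "2022-01-29", deaths: "21" },
];

// Hypothetical helper: in-memory equivalent of $lookup + $toInt + max per group + $sort.
function maxDeathsPerCountry(stats, demo) {
  const nameById = new Map(demo.map((d) => [d.country_id, d.country_name]));
  const best = new Map();
  for (const row of stats) {
    const deaths = parseInt(row.deaths, 10); // plays the role of $toInt
    const current = best.get(row.country_id);
    if (!current || deaths > current.maxQuantity) {
      best.set(row.country_id, {
        country_name: nameById.get(row.country_id),
        dt: row.dt,
        maxQuantity: deaths,
      });
    }
  }
  // Highest maximum first, like the final $sort stage.
  return [...best.values()].sort((a, b) => b.maxQuantity - a.maxQuantity);
}

const maxPerCountry = maxDeathsPerCountry(statistics, demographics);
console.log(maxPerCountry);
```

Because the date is taken from the same row as the maximum, dt always belongs to the reported deaths value.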

Related

mongosh: How to create ascending date object for every record in a collection?

I have a small collection with records of the format:
db.presentations =
[
{
"_id": "1",
"student": "A",
"presentationDate": {
"$date": "2023-01-17T00:00:00Z"
}
},
{
"_id": "2",
"student": "B",
"presentationDate": {
"$date": "2023-01-17T00:00:00Z"
}
},
...
,
{
"_id": "26",
"student": "Z",
"presentationDate": {
"$date": "2023-01-17T00:00:00Z"
}
},
]
Instead of all the presentationDates being the same, I want to set them to an ascending order. So, student A's presentationDate is 2023-01-17, student B's is 2023-01-18, student C's is 2023-01-19, and so on.
I've been exploring some functions that could do this, but none really seem to fit what I'm trying to do, eg:
$dateAdd: allows specification of the unit and amount (eg, day, 3) by which to increase a date object, but it must be used as part of an aggregation pipeline. I don't see how to increment by variable amount for each document.
forEach() / map(): allows flexibility in function applied to each record, but again, I don't see how to increment by variable (uniformly increasing) amount for each document. I'm also not sure it's possible to edit documents within a forEach?
Put another way, I'm basically trying to iterate through my cursor/collection and update each document, incrementing a global variable on each iteration.
I'm new to mongosh, so any ideas, feedback are appreciated!
Of course you could select the data, iterate over all documents, change the value and save back. You can also do it with an aggregation pipeline like this:
db.collection.aggregate([
{
$setWindowFields: {
sortBy: { student: 1 },
output: {
pos: { $documentNumber: {} }
}
}
},
{
$set: {
presentationDate: {
$dateAdd: {
startDate: "$presentationDate",
unit: "day",
amount: { $subtract: ["$pos", 1] } // $documentNumber starts at 1, so the first student keeps the original date
}
}
}
}
])
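If you want to modify the data, note that an update pipeline (updateMany with an array) only accepts $set/$addFields, $unset/$project and $replaceRoot/$replaceWith, so $setWindowFields cannot be used there. Instead, append a $merge stage to the aggregation to write the result back to the same collection:
db.collection.aggregate([
{
$setWindowFields: {
sortBy: { student: 1 },
output: {
pos: { $documentNumber: {} }
}
}
},
{
$set: {
presentationDate: {
$dateAdd: {
startDate: "$presentationDate",
unit: "day",
amount: { $subtract: ["$pos", 1] } // $documentNumber starts at 1
}
}
}
},
{ $unset: "pos" }, // drop the helper field before writing back
{ $merge: { into: "collection", on: "_id", whenMatched: "replace" } }
])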
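The other route mentioned in the answer, iterating over the documents and saving each one back, can be sketched in plain JavaScript as follows. buildDateUpdates is a hypothetical helper, and the three documents are a shortened stand-in for the collection in the question; the helper only builds the list of updateOne operations, so it runs without a database.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: sorts by student and returns bulkWrite-style operations
// that assign startDate + index days to each document.
function buildDateUpdates(docs, startDate) {
  return [...docs]
    .sort((a, b) => a.student.localeCompare(b.student))
    .map((doc, index) => ({
      updateOne: {
        filter: { _id: doc._id },
        update: {
          $set: {
            presentationDate: new Date(startDate.getTime() + index * MS_PER_DAY),
          },
        },
      },
    }));
}

// Shortened sample; in mongosh this array could come from
// db.presentations.find().toArray() (assuming that collection name).
const sampleDocs = [
  { _id: "2", student: "B" },
  { _id: "1", student: "A" },
  { _id: "3", student: "C" },
];

const ops = buildDateUpdates(sampleDocs, new Date("2023-01-17T00:00:00Z"));
console.log(JSON.stringify(ops, null, 2));

// In mongosh the operations could then be applied with:
//   db.presentations.bulkWrite(ops)
```

The index of each document in the sorted array plays the role of the "global variable" from the question, so no shared counter is needed.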

MongoDB nested array query how to

I am trying to query a document in my MongoDB
Document:
{
_id: '111',
subEntities: [
{
subId: '999',
dateOfStart: '2098-01-01',
dateOfTermination: '2099-12-31'
},
{
subId: '998',
dateOfStart: '2088-01-01',
dateOfTermination: '2089-12-31'
}
]
}
My Query:
{"$and": [
{"subEntities.dateOfStart": {"$lte": "2098-01-02"}},
{"subEntities.dateOfTermination": {"$gte": "2099-12-30"}},
{"subEntities.subId": {"$in": ["998"]}}
]}
As you can see, I am trying to apply a date value and an ID to the subentities.
The date value should be between dateOfStart and dateOfTermination.
The query returns a match, although the date value only matches the first subentity and the ID query matches the second subquery.
How can I make it so that there is only one match when both queries match the same subentity?
Can I aggregate the subentities?
Thanks a lot!
When you query array fields with dot notation, MongoDB evaluates each condition independently: a document matches as soon as some element satisfies each condition, and different elements may satisfy different conditions.
Use $elemMatch instead; it requires a single array element to satisfy all of the conditions, like so:
db.collection.find({
subEntities: {
$elemMatch: {
dateOfStart: {
"$lte": "2098-01-02"
},
dateOfTermination: {
"$gte": "2099-12-30"
},
subId: {
"$in": [
"998"
]
}
}
}
})
Mongo Playground
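The difference between the two matching rules can be illustrated with a small plain-JavaScript sketch on the document from the question. This is only an in-memory imitation of the semantics, not how MongoDB evaluates queries internally; the variable names are made up for the illustration.

```javascript
// Document from the question.
const doc = {
  _id: "111",
  subEntities: [
    { subId: "999", dateOfStart: "2098-01-01", dateOfTermination: "2099-12-31" },
    { subId: "998", dateOfStart: "2088-01-01", dateOfTermination: "2089-12-31" },
  ],
};

// The three conditions of the query, as predicates on one subentity.
const conditions = [
  (s) => s.dateOfStart <= "2098-01-02",
  (s) => s.dateOfTermination >= "2099-12-30",
  (s) => ["998"].includes(s.subId),
];

// Dot notation: every condition must hold for SOME element,
// but not necessarily the same one.
const dotNotationMatches = conditions.every((cond) => doc.subEntities.some(cond));

// $elemMatch: ONE element must satisfy every condition.
const elemMatchMatches = doc.subEntities.some((s) =>
  conditions.every((cond) => cond(s))
);

console.log(dotNotationMatches); // true: dates match subId 999, the id matches 998
console.log(elemMatchMatches);   // false: no single subentity satisfies all three
```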
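If you instead want subentities whose dateOfStart and dateOfTermination both lie inside the given range, invert the $gte and $lte conditions (they still need to be wrapped in $elemMatch to apply to the same subentity):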
{
"$and": [
{ "subEntities.dateOfStart": { "$gte": "2098-01-02" } },
{ "subEntities.dateOfTermination": { "$lte": "2099-12-30" } },
{ "subEntities.subId": { "$in": ["998"] } }
]
}

ARRAY_SUM with condition

I have a document like this:
{
bills: [
{
id: "1",
total: 20.0
},
{
id: "1",
total: 20.0
},
{
id: "2",
total: 10.0
}
]
}
I would like to compute the DISTINCT SUM of the total values, with the distinction based on the id property, but could not find an instruction for this case.
For the example above, the expected total is 30.0.
Use the ARRAY operator to select which elements of the array you want to use.
https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/collectionops.html#array
Then use ARRAY_DISTINCT() and ARRAY_SUM() to compute the total.
https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/arrayfun.html#fn-array-distinct
https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/arrayfun.html#fn-array-sum
Which total would you choose when id "1" appears with different values?
{
bills: [
{
id: "1",
total: 20.0
},
{
id: "1",
total: 30.0
},
{
id: "2",
total: 10.0
}
]
}
With a subquery expression (https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/subqueries.html) you get the complete SELECT functionality over arrays.
When the same id has different totals, the following query takes the MAX per id. The sum is computed per document:
SELECT ARRAY_SUM((SELECT RAW MAX(b.total) FROM d.bills AS b GROUP BY b.id)) AS s
FROM default AS d
WHERE ......
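As a cross-check of what the query above computes for one document, here is a minimal plain-JavaScript sketch. sumMaxTotalPerId is a hypothetical helper that mirrors the GROUP BY b.id, MAX(b.total) and ARRAY_SUM steps in memory; it is not part of Couchbase.

```javascript
// Hypothetical helper: keep the largest total per id, then add them up.
function sumMaxTotalPerId(bills) {
  const maxById = new Map();
  for (const { id, total } of bills) {
    maxById.set(id, Math.max(maxById.get(id) ?? -Infinity, total));
  }
  let sum = 0;
  for (const value of maxById.values()) {
    sum += value;
  }
  return sum;
}

// First example from the question: duplicate id "1" with equal totals.
const equalTotals = sumMaxTotalPerId([
  { id: "1", total: 20.0 },
  { id: "1", total: 20.0 },
  { id: "2", total: 10.0 },
]);

// Second example: id "1" with different totals, so MAX picks 30.
const differentTotals = sumMaxTotalPerId([
  { id: "1", total: 20.0 },
  { id: "1", total: 30.0 },
  { id: "2", total: 10.0 },
]);

console.log(equalTotals);     // 30
console.log(differentTotals); // 40
```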

sql subqueries to mongodb

I am new to MongoDB and I am trying to turn SQL queries into MongoDB queries, but I can't seem to find a way to turn a SQL query with a subquery into a MongoDB query.
For example:
SELECT article, dealer, price
FROM shop
WHERE price=(SELECT MAX(price) FROM shop);
I tried the following, but it doesn't seem to work.
db.shop.group({
"initial": {},
"reduce": function(obj, prev) {
prev.maximumvalueprice = isNaN(prev.maximumvalueprice) ? obj.price :
Math.max(prev.maximumvalueprice, obj.price);
}}).forEach(
function(data){
db.shop.find({
"price": data
},
{
"article": 1,
"dealer": 1,
"price": 1
})
})
How do I convert this SQL query into a MongoDB query?
If you are using MongoDB 3.2 or newer, you can use $lookup (the $replaceRoot stage used below requires 3.4).
Try an aggregation:
$sort the collection by price in descending order;
set $limit to 1 (this keeps the first document, which has the highest price);
then use $lookup to select the documents from the same collection that have the max price and store them in the tmpCollection field;
$unwind tmpCollection;
$replaceRoot - change the document root to $tmpCollection
Example:
db.getCollection("shop").aggregate([
{$sort: {"price":-1}},
{$limit: 1},
{$lookup: {
from: "shop",
localField: "price",
foreignField: "price",
as: "tmpCollection"
}},
{$unwind: "$tmpCollection"},
{$replaceRoot: {newRoot:"$tmpCollection"}}
]);
Looks like you need the aggregation framework for this task, using $first within a $group pipeline stage on ordered documents. The initial pipeline step, which orders the documents in the collection, is $sort. Note that this returns a single document even if several documents share the maximum price:
db.shop.aggregate([
{ "$sort": { "price": -1 } }, // <-- sort the documents first in descending order
{
"$group": {
"_id": null,
"article": { "$first": "$article" },
"dealer": { "$first": "$dealer" },
"price": { "$first": "$price" }
}
}
])
or using $last
db.shop.aggregate([
{ "$sort": { "price": 1 } }, // <-- note the sort direction
{
"$group": {
"_id": null,
"article": { "$last": "$article" },
"dealer": { "$last": "$dealer" },
"price": { "$last": "$price" }
}
}
])
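The two answers differ in how they treat ties, which a small plain-JavaScript sketch can make visible. The shop documents below are invented for the illustration (the question gives no sample data), and the variables only imitate the results in memory.

```javascript
// Hypothetical sample data: two articles share the highest price.
const shop = [
  { article: 1, dealer: "A", price: 3.45 },
  { article: 2, dealer: "B", price: 19.95 },
  { article: 3, dealer: "C", price: 19.95 },
  { article: 4, dealer: "D", price: 10.99 },
];

// SQL semantics (WHERE price = (SELECT MAX(price) ...)), which the
// $sort + $limit + $lookup pipeline reproduces: every row with the max price.
const maxPrice = Math.max(...shop.map((d) => d.price));
const allWithMaxPrice = shop
  .filter((d) => d.price === maxPrice)
  .map(({ article, dealer, price }) => ({ article, dealer, price }));

// $sort + $group with $first (or $last): exactly one document survives.
const singleTopDocument = [...shop].sort((a, b) => b.price - a.price)[0];

console.log(allWithMaxPrice.length);  // 2
console.log(singleTopDocument.price); // 19.95
```

If ties cannot occur in the data, both approaches return the same document and the $group variant is the shorter one.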

Elastic Search Complex Scenario

I am quite new to Elasticsearch. I have a complex scenario and I am unable to find the right solution (Elasticsearch queries/params) for it. Any help would be highly appreciated.
My fields:
Product Name (String)
Price Min
Price Max
Availability Status (Available/Unavailable)
Besides these, a search will always be filtered on a unique user. So the MySQL query looks like:
SELECT * FROM product WHERE product_name LIKE '%xxx%' AND price >= price_min AND price <= price_max AND availability = availability_status AND user = 1;
I would like the exact Elasticsearch params to solve this scenario; a close solution would also be appreciated.
You will need to use a filter here, since you need an exact match and not a full-text match. This way it's even faster.
{
"query": {
"filtered": {
"query": {
"match": { "name": "YourName" }
},
"filter": {
"bool": {
"must": [
{ "range": { "price_min": { "gte": 20 }}},
{ "range": { "price_max": { "lte": 170 }}},
{ "term" : { "availability" : "false" }} ]
}
}
}
}
}
What you provide in "YourName" is a full-text match (similar names will be retrieved if the name field is analyzed). I'm assuming that you have left the mapping unchanged, so the field is analyzed (stemmed + stop words removal) by default.
Since the result must match all the criteria, it's an AND combination. Hence, use a bool filter to AND all the terms.
So given your SQL query
SELECT *
FROM product
WHERE product_name LIKE '%xxx%'
AND price >= price_min
AND price <= price_max
AND availability = availability_status
AND user = 1;
We need:
a match query for the product_name
a range filter for the price field
and a term filter for the availability and user fields
All of these are wrapped inside a filtered query: the filters exclude all non-matching documents, and the match query then runs on the remaining documents to match the product_name.
The correct query would then look like this:
{
"query": {
"filtered": {
"query": {
"match": {
"product_name": "xxx"
}
},
"filter": {
"bool": {
"must": [
{
"range": {
"price": {
"gte": 20,
"lte": 170
}
}
},
{
"term": {
"availability": "availability_status"
}
},
{
"term": {
"user": 1
}
}
]
}
}
}
}
}
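The filtered query used in both answers was deprecated in Elasticsearch 2.0 and removed in 5.0. On newer versions the same combination is usually written as a bool query with must and filter clauses. The sketch below builds such a request body in plain JavaScript; buildProductQuery is a hypothetical helper, the field names follow the answer above, and the availability value "available" is an assumption about how the status is stored.

```javascript
// Hypothetical helper: builds the request body only, it does not call Elasticsearch.
function buildProductQuery({ name, priceMin, priceMax, availability, user }) {
  return {
    query: {
      bool: {
        // Full-text part: contributes to the relevance score.
        must: [{ match: { product_name: name } }],
        // Exact-match parts: run in filter context, without scoring.
        filter: [
          { range: { price: { gte: priceMin, lte: priceMax } } },
          { term: { availability: availability } },
          { term: { user: user } },
        ],
      },
    },
  };
}

const searchBody = buildProductQuery({
  name: "xxx",
  priceMin: 20,
  priceMax: 170,
  availability: "available", // assumed stored value
  user: 1,
});

console.log(JSON.stringify(searchBody, null, 2));
```

The resulting object can be sent as the body of a search request with whatever client is in use.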