ARRAY_SUM with condition - couchbase

I have a document like this:
{
bills: [
{
id: "1",
total: 20.0
},
{
id: "1",
total: 20.0
},
{
id: "2",
total: 10.0
}
]
}
I would like to compute a DISTINCT SUM of the total values, where distinctness is based on the id property, but I could not find any instructions for this case.
For the example case, the expected total is 30.0.

Use the ARRAY operator to select which elements of the array you want to use.
https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/collectionops.html#array
Then use ARRAY_DISTINCT() and ARRAY_SUM() to compute the total.
https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/arrayfun.html#fn-array-distinct
https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/arrayfun.html#fn-array-sum
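To make the semantics of that combination concrete, here is a minimal sketch in plain JavaScript of what an expression like ARRAY_SUM(ARRAY_DISTINCT(ARRAY b.total FOR b IN bills END)) would compute. The helper name sumDistinctTotals is hypothetical and is not part of Couchbase; it only mirrors the three steps.

```javascript
// Hypothetical helper mirroring ARRAY ... END, ARRAY_DISTINCT() and ARRAY_SUM().
function sumDistinctTotals(bills) {
  const totals = bills.map((b) => b.total);        // ARRAY b.total FOR b IN bills END
  const distinct = [...new Set(totals)];           // ARRAY_DISTINCT(...)
  return distinct.reduce((acc, t) => acc + t, 0);  // ARRAY_SUM(...)
}

const doc = {
  bills: [
    { id: "1", total: 20.0 },
    { id: "1", total: 20.0 },
    { id: "2", total: 10.0 },
  ],
};

console.log(sumDistinctTotals(doc.bills)); // 30
```

Note that this deduplicates by the total value, not by id, so two bills with different ids but the same total would be counted once.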

Which value should be chosen when "id": "1" appears with different totals?
{
bills: [
{
id: "1",
total: 20.0
},
{
id: "1",
total: 30.0
},
{
id: "2",
total: 10.0
}
]
}
By using a subquery expression (https://docs.couchbase.com/server/6.0/n1ql/n1ql-language-reference/subqueries.html) you can apply the complete SELECT functionality to arrays.
When the same id has different totals, the following query takes the MAX for each id. The sum is computed per document.
SELECT ARRAY_SUM((SELECT RAW MAX(b.total) FROM d.bills AS b GROUP BY b.id)) AS s
FROM default AS d
WHERE ......
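As a rough illustration of what the GROUP BY b.id / MAX(b.total) subquery computes for a single document, here is a plain JavaScript sketch. The function name sumMaxPerId is an assumption made for this example only.

```javascript
// Hypothetical helper: keep the largest total seen for each id, then sum those maxima.
function sumMaxPerId(bills) {
  const maxById = new Map();
  for (const { id, total } of bills) {
    const seen = maxById.get(id);
    if (seen === undefined || total > seen) {
      maxById.set(id, total); // MAX(b.total) ... GROUP BY b.id
    }
  }
  let sum = 0;
  for (const value of maxById.values()) {
    sum += value; // ARRAY_SUM(...)
  }
  return sum;
}

const sameTotals = [
  { id: "1", total: 20.0 },
  { id: "1", total: 20.0 },
  { id: "2", total: 10.0 },
];
const differentTotals = [
  { id: "1", total: 20.0 },
  { id: "1", total: 30.0 },
  { id: "2", total: 10.0 },
];

console.log(sumMaxPerId(sameTotals));      // 30
console.log(sumMaxPerId(differentTotals)); // 40
```

Unlike deduplicating the totals themselves, this groups by id, so two different ids that happen to share a total are both counted.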

Related

mongosh: How to create ascending date object for every record in a collection?

I have a small collection with records of the format:
db.presentations =
[
{
"_id": "1",
"student": "A",
"presentationDate": {
"$date": "2023-01-17T00:00:00Z"
}
},
{
"_id": "2",
"student": "B",
"presentationDate": {
"$date": "2023-01-17T00:00:00Z"
}
},
...
,
{
"_id": "26",
"student": "Z",
"presentationDate": {
"$date": "2023-01-17T00:00:00Z"
}
},
]
Instead of all the presentationDates being the same, I want to set them to an ascending order. So, student A's presentationDate is 2023-01-17, student B's is 2023-01-18, student C's is 2023-01-19, and so on.
I've been exploring some functions that could do this, but none really seem to fit what I'm trying to do, eg:
$dateAdd: allows specification of the unit and amount (eg, day, 3) by which to increase a date object, but it must be used as part of an aggregation pipeline. I don't see how to increment by variable amount for each document.
forEach() / map(): allows flexibility in function applied to each record, but again, I don't see how to increment by variable (uniformly increasing) amount for each document. I'm also not sure it's possible to edit documents within a forEach?
Put another way, I'm basically trying to iterate through my cursor/collection and update each document, incrementing a global variable on each iteration.
I'm new to mongosh, so any ideas, feedback are appreciated!
Of course you could select the data, iterate over all documents, change the value and save back. You can also do it with an aggregation pipeline like this:
db.collection.aggregate([
{
$setWindowFields: {
sortBy: { student: 1 },
output: {
pos: { $documentNumber: {} }
}
}
},
{
$set: {
presentationDate: {
$dateAdd: {
startDate: "$presentationDate",
unit: "day",
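// $documentNumber starts at 1, so subtract 1 to leave the first student's date unchanged
amount: { $subtract: ["$pos", 1] }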
}
}
}
}
])
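If you want to modify the stored data, note that an update pipeline (updateMany with an array of stages) does not accept $setWindowFields. Run the aggregation and write the result back with $merge instead:
db.collection.aggregate([
{
$setWindowFields: {
sortBy: { student: 1 },
output: {
pos: { $documentNumber: {} }
}
}
},
{
$set: {
presentationDate: {
$dateAdd: {
startDate: "$presentationDate",
unit: "day",
// $documentNumber starts at 1, so subtract 1 to leave the first student's date unchanged
amount: { $subtract: ["$pos", 1] }
}
}
}
},
// drop the helper field before writing back
{ $unset: "pos" },
// overwrite each existing document, matched on _id
{ $merge: { into: "collection", on: "_id", whenMatched: "replace", whenNotMatched: "discard" } }
])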
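The "select the data, iterate, change the value and save back" route mentioned at the start of this answer can be sketched in plain JavaScript as follows. This is only an illustration on in-memory objects: assignAscendingDates is a hypothetical helper, and in mongosh each resulting document would still have to be written back, for example with db.collection.updateOne().

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: sort by student, then give document i the start date plus i days.
function assignAscendingDates(docs, start) {
  const sorted = [...docs].sort((a, b) => a.student.localeCompare(b.student));
  return sorted.map((doc, i) => ({
    ...doc,
    presentationDate: new Date(start.getTime() + i * DAY_MS),
  }));
}

const docs = [
  { _id: "2", student: "B" },
  { _id: "1", student: "A" },
  { _id: "3", student: "C" },
];

const updated = assignAscendingDates(docs, new Date("2023-01-17T00:00:00Z"));
console.log(updated.map((d) => `${d.student}: ${d.presentationDate.toISOString().slice(0, 10)}`));
// [ 'A: 2023-01-17', 'B: 2023-01-18', 'C: 2023-01-19' ]
```

The index i plays the role of the "global variable" from the question: it grows by one per document, so no shared counter is needed.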

MySQL query conversion to MongoDB

I am new to MongoDB. I have written a MySQL query that gets the highest deaths value per country together with its reporting date. As a first step in MongoDB, I am trying to get the max value of the deaths column, but it returns a value that is not the highest. Here is my MySQL code:
SELECT
d.country_name, s.dt, MAX(s.deaths)
FROM
Demographics d
inner JOIN statistics s
where d.country_id=s.country_id
GROUP BY country_name
ORDER BY MAX(s.deaths) DESC;
It is returning:
country_name  dt          MAX(s.deaths)
Germany       2022-01-29  118335
Bangladesh    2022-01-30  34
What would be the equivalent MongoDB code to get the same result?
To get the max value of the deaths column in MongoDB I used:
db.statistics.aggregate([
{
$group: {
_id: "$country_id",
maxQuantity: {
$max: "$deaths"
}
}
}
])
Here is my sample input:
Demographics
{"country_id":"BGD","country_name":"Bangladesh","population":"164700000","area":"148460","density":"1265"}, {"country_id":"DEU","country_name":"Germany","population":"83200000","area":"357386","density":"232"}
statistics
{"country_id":"DEU","dt":"2022-01-29", "confirmed_cases":"2016684", "deaths":"118335"},
{"country_id":"DEU","dt":"2022-01-17", "confirmed_cases":"53916", "deaths":"143"},
{"country_id":"BGD","dt":"2022-01-30", "confirmed_cases":"12183", "deaths":"34"},
{"country_id":"BGD","dt":"2022-01-29", "confirmed_cases":"10378", "deaths":"21"},
Updated: the post owner asked for the max deaths for each country.
1. $lookup - Join the statistics and Demographics collections by country_id.
2. $set - Convert the deaths field to an integer.
3. $sort - Sort by deaths DESC.
4. $group - Group by country_id. Take the first value ($first) of each field, since the documents were sorted in stage 3.
5. $sort - Sort by maxQuantity DESC.
6. $project - Decorate the output document.
Side note: It's weird to store confirmed_cases and deaths as string type.
db.statistics.aggregate([
{
"$lookup": {
"from": "Demographics",
"localField": "country_id",
"foreignField": "country_id",
"as": "demographics"
}
},
{
"$set": {
deaths: {
$toInt: "$deaths"
}
}
},
{
$sort: {
deaths: -1
}
},
{
$group: {
_id: {
country_id: "$country_id"
},
country: {
$first: "$demographics"
},
dt: {
$first: "$dt"
},
maxQuantity: {
$first: "$deaths"
}
}
},
{
$sort: {
maxQuantity: -1
}
},
{
$project: {
_id: 0,
country_name: {
$first: "$country.country_name"
},
dt: "$dt",
maxQuantity: "$maxQuantity"
}
}
])
In the MySQL query, the join condition belongs in the INNER JOIN itself:
INNER JOIN statistics s ON d.country_id=s.country_id
which removes the need for the WHERE clause.
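The string-typed deaths field mentioned in the side note is a plausible reason why the asker's plain $max returned an unexpected value: strings are compared character by character, so "143" sorts above "118335". The sketch below reproduces that effect in plain JavaScript on the sample data; maxPerCountry is a hypothetical helper, and JavaScript's string comparison is used here only as a stand-in for how a database orders strings.

```javascript
const statistics = [
  { country_id: "DEU", dt: "2022-01-29", deaths: "118335" },
  { country_id: "DEU", dt: "2022-01-17", deaths: "143" },
  { country_id: "BGD", dt: "2022-01-30", deaths: "34" },
  { country_id: "BGD", dt: "2022-01-29", deaths: "21" },
];

// Hypothetical helper: per country, keep the row whose converted deaths value is largest.
function maxPerCountry(rows, toValue) {
  const best = new Map();
  for (const row of rows) {
    const value = toValue(row.deaths);
    const current = best.get(row.country_id);
    if (current === undefined || value > current.value) {
      best.set(row.country_id, { value, dt: row.dt });
    }
  }
  return Object.fromEntries(best);
}

const asStrings = maxPerCountry(statistics, (s) => s);          // compares text
const asNumbers = maxPerCountry(statistics, (s) => Number(s));  // like $toInt before comparing

console.log(asStrings.DEU); // { value: '143', dt: '2022-01-17' }
console.log(asNumbers.DEU); // { value: 118335, dt: '2022-01-29' }
```

Converting before comparing, as the $set stage with $toInt does in the pipeline above, gives the numeric maximum and the matching date.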

Temporary Tables (CTE) in MongoDB

I know that common table expressions (CTE) a.k.a. "temporary named result sets" can be used in SQL to generate a temporary table, but can this be done in MongoDB? I want a document, but it's only for temporary use in my query.
Can you create a temporary table in MongoDB without creating a new collection?
For example, if I were to try to recreate the code below in Mongo...
Example CTE Table in SQL:
n  f1  f2
1  20  12
2  40  0.632
3  60  0.647
WITH RECURSIVE example (n, f1, f2) AS
( SELECT 1, 20, 12
UNION ALL SELECT
n + 1,
n * 20,
least(6*n, $globalVar * 100)
FROM example WHERE n < 3
) SELECT * FROM example
It seems that there is no general equivalent for CTE in MongoDB. However, for OP's example, it is possible to wrangle the output of $range to produce a similar effect.
// it doesn't matter which collection is used, as long as it has at least 1 document
db.collection.aggregate([
{
// just take 1 document
"$limit": 1
},
{
// use $range to generate iterator [1, 2, 3]
"$addFields": {
"rg": {
"$range": [
1,
4
]
},
globalVar: 0.001
}
},
{
// do the mapping according to logic
"$addFields": {
"cte": {
"$map": {
"input": "$rg",
"as": "n",
"in": {
n: "$$n",
f1: {
"$multiply": [
"$$n",
20
]
},
f2: {
"$cond": {
"if": {
$lt: [
{
"$multiply": [
"$$n",
6
]
},
{
"$multiply": [
"$globalVar",
100
]
}
]
},
"then": {
"$multiply": [
"$$n",
6
]
},
"else": {
"$multiply": [
"$globalVar",
100
]
}
}
}
}
}
}
}
},
{
// wrangle back to expected form
"$unwind": "$cte"
},
{
"$replaceRoot": {
"newRoot": "$cte"
}
}
])
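For readers who find the nested $map and $cond hard to follow, the same row-generation logic can be sketched in plain JavaScript. The function buildRows and its parameters are hypothetical names chosen for this illustration. The $cond in the pipeline is a two-way minimum, which is what LEAST() does in the SQL.

```javascript
// Hypothetical helper mirroring the pipeline: $range -> $map -> one row per n.
function buildRows(globalVar, count = 3) {
  // like $range: [1, 4], which yields [1, 2, 3]
  const range = Array.from({ length: count }, (_, i) => i + 1);
  return range.map((n) => ({
    n,
    f1: n * 20,                            // $multiply: ["$$n", 20]
    f2: Math.min(6 * n, globalVar * 100),  // the $cond picks the smaller of the two values
  }));
}

// 0.5 is an arbitrary example value for globalVar (the pipeline above uses 0.001).
console.log(buildRows(0.5));
// [ { n: 1, f1: 20, f2: 6 }, { n: 2, f1: 40, f2: 12 }, { n: 3, f1: 60, f2: 18 } ]
```

Each array element corresponds to one document after the $unwind and $replaceRoot stages.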

sql subqueries to mongodb

I am new to MongoDB and I am trying to turn SQL queries into MongoDB queries, but I can't seem to find a way to convert a SQL query that contains a subquery.
For example:
SELECT article, dealer, price
FROM shop
WHERE price=(SELECT MAX(price) FROM shop);
I tried the following, but it doesn't seem to work.
db.shop.group({
"initial": {},
"reduce": function(obj, prev) {
prev.maximumvalueprice = isNaN(prev.maximumvalueprice) ? obj.price :
Math.max(prev.maximumvalueprice, obj.price);
}}).forEach(
function(data){
db.shop.find({
"price": data
},
{
"article": 1,
"dealer": 1,
"price": 1
})
})
How do I convert this SQL query into a MongoDB query?
If you are using MongoDB 3.2 or newer you can use $lookup (the $replaceRoot stage used below additionally requires 3.4 or newer).
Use an aggregation pipeline:
$sort the collection by price in descending order;
set $limit to 1 (this keeps the first document, which has the highest price);
then use $lookup to select the documents from the same collection that have that maximum price, and store them in the tmpCollection field;
$unwind tmpCollection;
$replaceRoot - make each tmpCollection element the document root.
Example:
db.getCollection("shop").aggregate([
{$sort: {"price":-1}},
{$limit: 1},
{$lookup: {
from: "shop",
localField: "price",
foreignField: "price",
as: "tmpCollection"
}},
{$unwind: "$tmpCollection"},
{$replaceRoot: {newRoot:"$tmpCollection"}}
]);
Looks like you need the aggregation framework for this task using $first within a $group pipeline stage on ordered documents. The initial pipeline step for ordering the documents in the collection is $sort:
db.shop.aggregate([
{ "$sort": { "price": -1 } }, // <-- sort the documents first in descending order
{
"$group": {
"_id": null,
"article": { "$first": "$article" },
"dealer": { "$first": "$dealer" },
"price": { "$first": "$price" }
}
}
])
Or, using $last with an ascending sort:
db.shop.aggregate([
{ "$sort": { "price": 1 } }, // <-- note the sort direction
{
"$group": {
"_id": null,
"article": { "$last": "$article" },
"dealer": { "$last": "$dealer" },
"price": { "$last": "$price" }
}
}
])
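One difference between the answers is worth spelling out: the SQL query returns every row whose price equals the maximum, while a $group with $first or $last produces a single document. The $lookup pipeline keeps ties because it joins back on price. A plain JavaScript sketch of the SQL semantics follows; rowsWithMaxPrice and the sample rows are made up for illustration.

```javascript
// Hypothetical helper mirroring: WHERE price = (SELECT MAX(price) FROM shop)
function rowsWithMaxPrice(shop) {
  if (shop.length === 0) return [];
  const maxPrice = Math.max(...shop.map((row) => row.price)); // the subquery
  return shop.filter((row) => row.price === maxPrice);        // the outer WHERE
}

// Made-up sample data with a tie on the highest price.
const shop = [
  { article: 1, dealer: "A", price: 3.45 },
  { article: 2, dealer: "B", price: 19.95 },
  { article: 3, dealer: "C", price: 19.95 },
];

console.log(rowsWithMaxPrice(shop));
// [ { article: 2, dealer: 'B', price: 19.95 }, { article: 3, dealer: 'C', price: 19.95 } ]
```

If ties cannot occur in the data, or only one top row is wanted, the $sort plus $group variants are sufficient.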

How to search nested JSON in MySQL

I am using MySQL 5.7+ with the native JSON data type. Sample data:
[
{
"code": 2,
"stores": [
{
"code": 100,
"quantity": 2
},
{
"code": 200,
"quantity": 3
}
]
},
{
"code": 4,
"stores": [
{
"code": 300,
"quantity": 4
},
{
"code": 400,
"quantity": 5
}
]
}
]
Question: how do I extract an array where code = 4?
The following (working) query has the position of the data I want to extract and the search criterion hardcoded:
SELECT JSON_EXTRACT(data_column, '$[0]')
FROM json_data_table
WHERE data_column->'$[1].code' = 4
I tried using a wildcard (data_column->'$[*].code' = 4) but I get no results in return.
SELECT row FROM
(
SELECT data_column->"[*]" as row
FROM json_data_table
WHERE 4 IN JSON_EXTRACT(data_column, '$[*].code')
)
WHERE row->".code" = 4
... though this would be much easier to work with if the top level were not an unindexed array of objects. You may want to consider some adjustments to the schema.
Note:
A path such as "$[i]" picks the single array element at index i, not an aggregate over the whole array. With your dataset, "$[1].code" always evaluates to the value of code in that one element.
Essentially, you were saying:
$ : the JSON document
[1] : the second object in the array
.code : the attribute labeled "code"
... since there is only ever one match for that path, it always evaluates to 4, and the condition becomes:
WHERE 4 = 4
Alternate data structure if possible
Since the entire purpose of "code" is as a key, make it the key.
{
"code2": {
"stores": [
{
"code": 100,
"quantity": 2
},
{
"code": 200,
"quantity": 3
}
]
},
"code4": {
"stores": [
{
"code": 300,
"quantity": 4
},
{
"code": 400,
"quantity": 5
}
]
}
}
Then, all it would require would be:
SELECT data_column->'$.code4' AS code4
FROM json_data_table
This is what you are looking for:
SELECT data_column->'$[*]' FROM json_data_table WHERE data_column->'$[*].code' LIKE '%4%'
A wildcard path returns its matches wrapped in [] (for example [2, 4]), so the comparison data_column->'$[*].code' = 4 cannot match.
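The point about the surrounding [] can be illustrated outside MySQL. The plain JavaScript sketch below mimics what a wildcard path collects and shows why a substring test is only an approximation; the names codes, elementsWithCode and looseMatch are hypothetical and exist only for this example.

```javascript
const data = [
  { code: 2, stores: [{ code: 100, quantity: 2 }, { code: 200, quantity: 3 }] },
  { code: 4, stores: [{ code: 300, quantity: 4 }, { code: 400, quantity: 5 }] },
];

// Roughly what '$[*].code' collects: an array of values, not a single value.
const codes = data.map((element) => element.code);
console.log(codes); // [ 2, 4 ]

// Exact match on each element, which is what the question is after.
function elementsWithCode(elements, wanted) {
  return elements.filter((element) => element.code === wanted);
}
console.log(elementsWithCode(data, 4).length); // 1

// A substring test on the serialized array, similar in spirit to LIKE '%4%'.
function looseMatch(codeList, wanted) {
  return JSON.stringify(codeList).includes(String(wanted));
}
console.log(looseMatch([2, 40], 4)); // true, although no code equals 4
```

So the LIKE-based query selects rows that contain a 4 somewhere in the list of codes, which can include values such as 14 or 40.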