How to find the date when a running sum reaches a specific value in MySQL

I'd like to find the date that satisfies a condition on this table:
date value
2022-01-01 5
2022-01-02 1
2022-01-03 3
2022-01-04 0
2022-01-05 2
2022-01-06 2
On which date does the running sum of the values first exceed 10?
The answer is '2022-01-05', because the sum of the values from '2022-01-01' to '2022-01-05' is 11. That is easy for a human to see.
But how do I express it in MySQL? Please let me know.

If you are using MySQL 8+, then window functions make your requirement easy:
WITH cte AS (
SELECT *, SUM(value) OVER (ORDER BY date) sum_value
FROM yourTable
)
SELECT date
FROM cte
WHERE sum_value > 10
ORDER BY date
LIMIT 1;
On earlier versions of MySQL we can express the rolling sum with a correlated subquery:
SELECT date
FROM yourTable t1
WHERE (SELECT SUM(t2.value)
FROM yourTable t2
WHERE t2.date <= t1.date) > 10
ORDER BY date
LIMIT 1;

Another approach for MySQL < 8, using a user variable to store the rolling sum:
SELECT `date`
FROM (
SELECT t.*, @sum_value := @sum_value + `value` AS `sum_value`
FROM t, (SELECT @sum_value := 0) z
ORDER BY `date` ASC
) y
WHERE `sum_value` > 10
ORDER BY `date` ASC
LIMIT 1;
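To sanity-check the expected answer independently of MySQL, the rolling-sum logic that all three queries express can be sketched in Python. This is only an illustration: the function name `first_date_exceeding` is hypothetical, and the rows are the sample data from the question.

```python
from itertools import accumulate

# Sample rows from the question: (date, value), already sorted by date.
rows = [
    ("2022-01-01", 5),
    ("2022-01-02", 1),
    ("2022-01-03", 3),
    ("2022-01-04", 0),
    ("2022-01-05", 2),
    ("2022-01-06", 2),
]

def first_date_exceeding(rows, threshold):
    """Return the first date whose running sum is strictly greater
    than threshold, or None if the sum never exceeds it."""
    running = accumulate(value for _, value in rows)
    for (date, _), total in zip(rows, running):
        if total > threshold:
            return date
    return None

print(first_date_exceeding(rows, 10))
```

The strict comparison mirrors `sum_value > 10` in the queries, since the question asks when the sum exceeds 10.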

Related

Get sum of previous records in query and add or subtract the following results

Case:
I select a start date and an end date, and the query should return the movements of all products in that date range. If a product had movements before the start date (earlier records in the table), I want their sum as an opening balance (prevData).
For example, if the first movement in the range is an exit of 5 and the second is an income of 2,
the first row should show (prevData-5) and the second row (prevData-5 + 2), giving a cumulative balance.
prevData is the sum of all earlier movements with the same product id as the record. I wrote the query below, but if a product has 10 movements the subquery runs 10 times, and I am not sure how to keep the sums of different product_id values apart.
SELECT
ik.id,
ik.quantity,
ik.date,
ik.product_id,
@balance = (SELECT SUM(quantity) FROM table_kardex WHERE product_id = ik.product_id AND id < ik.id)
from table_kardex ik
where ik.date between '2021-11-01' and '2021-11-15'
order by ik.product_id,ik.id asc
I hope I have made myself clear; I will be happy to answer any questions.
#table_kardex
id|quantity|date|product_id
1 8 2020-10-12 2
2 15 2020-10-12 1
3 5 2021-11-01 1
4 10 2021-11-01 2
5 -2 2021-11-02 1
6 -4 2021-11-02 2
#result
id|quantity|date|product_id|saldo
3 5 2021-11-01 1 20 (15+5)
5 -2 2021-11-02 1 18 (15+5-2)
4 10 2021-11-01 2 18 (8+10)
6 -4 2021-11-02 2 14 (8+10-4)
I am using MySQL 5.7.
If you're using MySQL 8+, then analytic functions can be used here:
WITH cte AS (
SELECT *,
SUM(quantity) OVER (PARTITION BY product_id ORDER BY date, id) saldo
FROM table_kardex
WHERE date <= '2021-11-15'
)
SELECT id, quantity, date, product_id, saldo
FROM cte
WHERE date >= '2021-11-01'
ORDER BY product_id, date, id;
MySQL 5.7
Try this:
SELECT *
FROM (
SELECT product_id,
t1.`date`,
SUM(t2.quantity) - t1.quantity cumulative_quantity_before,
SUM(t2.quantity) cumulative_quantity_after
FROM table_kardex t1
JOIN table_kardex t2 USING (product_id)
WHERE t1.`date` >= t2.`date`
AND t1.`date` <= @period_end
GROUP BY product_id, t1.`date`, t1.quantity
) prepare_data
WHERE `date` >= @period_start;
The easiest solution is to use the window function SUM OVER to compute the running total over all rows up to the end date. Then, in a second step, keep only the rows from the start date onward:
SELECT id, quantity, date, product_id, balance
FROM
(
SELECT
id,
quantity,
date,
product_id,
SUM(quantity) OVER (PARTITION BY product_id ORDER BY id) AS balance
from table_kardex ik
where date < DATE '2021-11-16'
) cumulated
WHERE date >= DATE '2021-11-01'
ORDER BY product_id, id;
UPDATE: You have changed your request to mention that you are using an old MySQL version (5.7), which doesn't support window functions. In that case use your original query. If I am not mistaken, though, @balance = (...) is invalid syntax for MySQL. And according to your explanation you want id <= ik.id, not id < ik.id:
SELECT
ik.id,
ik.quantity,
ik.date,
ik.product_id,
(
SELECT SUM(quantity)
FROM table_kardex
WHERE product_id = ik.product_id AND id <= ik.id
) AS balance
FROM table_kardex ik
WHERE ik.date >= DATE '2021-11-01' AND ik.date < DATE '2021-11-16'
ORDER BY ik.product_id, ik.id;
The appropriate indexes for this query are:
create index idx1 on table_kardex (date, product_id, id);
create index idx2 on table_kardex (product_id, id, quantity);
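The correlated-subquery version can be checked against the sample data from the question. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for MySQL 5.7 (an assumption made so the example is self-contained). Dates are stored as ISO text and the `DATE '...'` literals are replaced by plain strings, because SQLite does not accept that literal syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_kardex ("
    "id INTEGER PRIMARY KEY, quantity INTEGER, date TEXT, product_id INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO table_kardex VALUES (?, ?, ?, ?)",
    [
        (1, 8, "2020-10-12", 2),
        (2, 15, "2020-10-12", 1),
        (3, 5, "2021-11-01", 1),
        (4, 10, "2021-11-01", 2),
        (5, -2, "2021-11-02", 1),
        (6, -4, "2021-11-02", 2),
    ],
)

# Same shape as the MySQL 5.7 query: the subquery sums every movement of
# the same product up to and including the current row (id <= ik.id).
query = """
SELECT ik.id, ik.quantity, ik.date, ik.product_id,
       (SELECT SUM(quantity)
        FROM table_kardex
        WHERE product_id = ik.product_id AND id <= ik.id) AS balance
FROM table_kardex ik
WHERE ik.date >= '2021-11-01' AND ik.date < '2021-11-16'
ORDER BY ik.product_id, ik.id
"""
balances = conn.execute(query).fetchall()
for row in balances:
    print(row)
```

The balances come out as 20 and 18 for product 1, and 18 and 14 for product 2, which matches the saldo column in the question.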

Getting all previous records of table by date MySQL

My table currently has 21000 records. It is updated daily, with almost 300 entries inserted per day. I want a query that returns the number of rows my table had on each of the previous 10 days, so it returns:
26000
21300
21000
etc
Right now, I wrote this:
"SELECT COUNT(*) from tbl_task where `task_start_time` < '2020-12-01'"
It returns 21000, but only for one day. I want my query to return the counts for each of the 10 days.
Edit: the database is MySQL, and the date column is a DATE, not a DATETIME.
The most efficient method may be aggregation and cumulative sums:
select date(task_start_time) as dte, count(*) as cnt_on_day,
sum(count(*)) over (order by date(task_start_time)) as running_cnt
from tbl_task
group by dte
order by dte desc
limit 10;
This returns the last 10 days in the data. You can easily adjust to more days if you like -- in fact all of them -- without much trouble.
I don't know if I'm wrong, but could you not simply add a GROUP BY clause? Like:
"SELECT COUNT(*) from tbl_task where `task_start_time` < '2020-12-01' GROUP BY task_start_time"
EDIT:
This only works if task_start_time is a date, not a datetime.
EDIT2:
If it is a datetime, you could use the DATE() function:
SELECT COUNT(*) from tbl_task where `task_start_time` < '2020-12-01' GROUP BY DATE(task_start_time)
You can use UNION ALL and date arithmetic.
SELECT count(*)
FROM tbl_task
WHERE task_start_time < current_date
UNION ALL
SELECT count(*)
FROM tbl_task
WHERE task_start_time < date_sub(current_date, INTERVAL 1 DAY)
...
UNION ALL
SELECT count(*)
FROM tbl_task
WHERE task_start_time < date_sub(current_date, INTERVAL 9 DAY);
Edit:
You might also join a derived table that uses FROM-less SELECTs and UNION ALL to get the days to look back and then aggregate. This might be a little easier to construct dynamically. (But it may be slower I suspect.)
SELECT count(*)
FROM (SELECT 0 x
UNION ALL
SELECT 1
...
UNION ALL
SELECT 9) x
INNER JOIN tbl_task t
ON t.task_start_time < date_sub(current_date, INTERVAL x.x DAY)
GROUP BY x.x;
In MySQL version 8+ you can even use a recursive CTE to construct the table with the days.
WITH RECURSIVE x
AS
(
SELECT 0 x
UNION ALL
SELECT x + 1
FROM x
WHERE x + 1 < 10
)
SELECT count(*)
FROM x
INNER JOIN tbl_task t
ON t.task_start_time < date_sub(current_date, INTERVAL x.x DAY)
GROUP BY x.x;
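What the UNION ALL and recursive CTE queries compute is, for each look-back offset x from 0 to 9, the number of rows dated strictly before current_date minus x days. A minimal Python sketch of that logic follows; the function name and the sample data are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left
from datetime import date, timedelta

def counts_before_each_day(task_dates, today, days=10):
    """For x = 0 .. days-1, count the rows whose date is strictly
    before (today - x days). Returns (cutoff, count) pairs."""
    ordered = sorted(task_dates)
    result = []
    for x in range(days):
        cutoff = today - timedelta(days=x)
        # bisect_left gives the number of elements strictly less than cutoff.
        result.append((cutoff, bisect_left(ordered, cutoff)))
    return result

# Hypothetical data: 3 tasks on each of four consecutive days.
sample = [date(2020, 11, d) for d in (27, 28, 29, 30) for _ in range(3)]
result = counts_before_each_day(sample, today=date(2020, 12, 1), days=3)
for cutoff, count in result:
    print(cutoff, count)
```

Sorting once and using a binary search per cutoff avoids rescanning the data for every day, which is similar in spirit to the cumulative-sum answer.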

comparing between rows

How do I compare the sum value of every row in my table to that of the first row (the row with the earliest date)?
For example:
ID Date Sum
1 01-01-2020 60
2 01-02-2020 70
3 01-05-2020 80
4 01-06-2020 25
I want all the IDs whose sum is greater than 60 (the sum on the first date).
I tried to get the first date with min(date), but I can't work out how to compare the sums against it.
The result should be:
ID Date Sum
2 01-02-2020 70
3 01-05-2020 80
Select * from mytable where sum > (select sum from mytable order by date limit 1);
In MySQL 8+, you can use first_value():
select t.*, first_value(sum) over (order by date) as first_sum
from t;
You can then incorporate this into a subquery to get the rows that exceed the first value:
select t.*
from (select t.*, first_value(sum) over (order by date) as first_sum
from t
) t
where sum > first_sum;
You can also do this with a cross join:
select t.*
from t cross join
(select t.*
from t
order by date asc
limit 1
) t1
where t.sum > t1.sum;
Or do the same thing with a subquery in the where clause:
select t.*
from t
where t.sum > (select t2.sum
from t t2
order by t2.date
limit 1
);
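The subquery form can be tried out on the sample data. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, to keep the example self-contained). It also assumes the dates are stored in ISO format so that ordering by the column is chronological, and it quotes the `date` and `sum` column names to avoid any ambiguity with functions of the same name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE mytable (id INTEGER, "date" TEXT, "sum" INTEGER)')
# Sample rows from the question, with dates rewritten as ISO strings.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", 60),
        (2, "2020-01-02", 70),
        (3, "2020-01-05", 80),
        (4, "2020-01-06", 25),
    ],
)

# Compare every row's sum with the sum on the earliest date.
above_first = conn.execute(
    """
    SELECT id, "date", "sum"
    FROM mytable
    WHERE "sum" > (SELECT "sum" FROM mytable ORDER BY "date" LIMIT 1)
    ORDER BY id
    """
).fetchall()
print(above_first)
```

Only ids 2 and 3 are returned, as in the expected result from the question.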

MySQL - get min/max of consecutive events in a series of rows

I have a table that looks like this:
http://sqlfiddle.com/#!9/152d2/1/0
CREATE TABLE Table1 (
id int,
value decimal(10,5),
dt datetime,
threshold_id int
);
Current Query:
SELECT sensors_id, DATE_FORMAT(datetime, '%Y-%m-%d'), MIN(value), MAX(value)
FROM Readings
WHERE datetime < "2015-11-18 00:00:00"
AND datetime > "2015-10-18 00:00:00"
AND sensors_id = 9
GROUP BY DATE_FORMAT(datetime, '%Y-%m-%d')
ORDER BY datetime DESC
What I'm trying to do is to return the min/max value in each group, where threshold_id IS NOT NULL. Therefore, the example should return something like:
min_value | max_value | start_date | end_date
9 | 10.5 | 2015-07-29 10:52:31 | 2015-07-29 10:57:31
8.5 | 9.5 | 2015-07-29 11:03:31 | 2015-07-29 11:05:31
I can't work out how to do this grouping. I need to return the min/max for each group of consecutive rows where the threshold_id IS NOT NULL.
Use user variables to compare each row's threshold_id with the previous row's and increment a counter whenever it changes; you can then group by that counter. Tested on my machine.
SELECT MIN(value),MAX(value),MIN(dt),MAX(dt)
FROM (
SELECT id,value,dt,
CASE WHEN COALESCE(threshold_id,'')=@last_ci THEN @n ELSE @n:=@n+1 END AS g,
@last_ci := COALESCE(threshold_id,'') As th
FROM
Table1, (SELECT @n:=0) r
ORDER BY
id
) s
WHERE th!=''
GROUP BY
g
For MySQL 8 this could be rewritten as below. Use a CTE to compute two row-number sequences and GROUP BY the difference between them.
WITH cte as (
SELECT *,
ROW_NUMBER() OVER (ORDER BY id)as rn,
ROW_NUMBER() OVER (PARTITION BY threshold_id ORDER BY id)as rnn
FROM Table1
ORDER BY id
)
SELECT MIN(value),MAX(value),MIN(dt),MAX(dt) FROM cte WHERE threshold_id IS NOT NULL GROUP BY rn-rnn
Your sample data only includes a single day's worth, so you only get a single row back (assuming you want to group by day):
SELECT DAYOFYEAR(dt) `day`, MIN(`value`) min_value, MAX(`value`) max_value
FROM Table1
GROUP BY `day`
ORDER BY `day` ASC
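The grouping of consecutive rows (often called "gaps and islands") can be sketched in Python with itertools.groupby, which starts a new group whenever the key changes, much like the user-variable counter in the first answer. The rows below are hypothetical, since the question's data lives in the linked fiddle, and the function name `islands` is made up for this illustration.

```python
from itertools import groupby

# Hypothetical rows: (id, value, dt, threshold_id)
rows = [
    (1, 8.0, "2015-07-29 10:50:31", None),
    (2, 9.0, "2015-07-29 10:52:31", 1),
    (3, 10.5, "2015-07-29 10:55:31", 1),
    (4, 10.0, "2015-07-29 10:57:31", 1),
    (5, 7.0, "2015-07-29 11:00:31", None),
    (6, 8.5, "2015-07-29 11:03:31", 2),
    (7, 9.5, "2015-07-29 11:05:31", 2),
]

def islands(rows):
    """Return (min_value, max_value, start_dt, end_dt) for each run of
    consecutive rows (by id) that share the same non-NULL threshold_id."""
    result = []
    ordered = sorted(rows, key=lambda r: r[0])
    for threshold_id, group in groupby(ordered, key=lambda r: r[3]):
        if threshold_id is None:
            continue  # rows without a threshold are the gaps
        group = list(group)
        values = [r[1] for r in group]
        dts = [r[2] for r in group]
        result.append((min(values), max(values), min(dts), max(dts)))
    return result

for island in islands(rows):
    print(island)
```

Note that this splits on any change of threshold_id, not only on NULL versus non-NULL; keying on `r[3] is not None` instead would merge adjacent runs with different ids.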

Calculating a Moving Average MySQL?

Good Day,
I am using the following code to calculate the 9 Day Moving average.
SELECT SUM(close)
FROM tbl
WHERE date <= '2002-07-05'
AND name_id = 2
ORDER BY date DESC
LIMIT 9
But it does not work because it first calculates all of the returned fields before the limit is called. In other words it will calculate all the closes before or equal to that date, and not just the last 9.
So I need to calculate the SUM from the returned select, rather than calculate it straight.
IE. Select the SUM from the SELECT...
Now how would I go about doing this and is it very costly or is there a better way?
If you want the moving average for each date, then try this:
SELECT date, SUM(close),
(select avg(close) from tbl t2 where t2.name_id = t.name_id and datediff(t.date, t2.date) between 0 and 8
) as mvgAvg
FROM tbl t
WHERE date <= '2002-07-05' and
name_id = 2
GROUP BY date
ORDER BY date DESC
It uses a correlated subquery to calculate the average over the 9 calendar days ending at each date, which is 9 values when there is exactly one row per day.
Starting from MySQL 8, you should use window functions for this. Using the window RANGE clause, you can create a logical window over an interval, which is very powerful. Something like this:
SELECT
date,
close,
AVG(close) OVER (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
FROM tbl
WHERE date <= DATE '2002-07-05'
AND name_id = 2
ORDER BY date DESC
For example:
WITH t (date, `close`) AS (
SELECT DATE '2020-01-01', 50 UNION ALL
SELECT DATE '2020-01-03', 54 UNION ALL
SELECT DATE '2020-01-05', 51 UNION ALL
SELECT DATE '2020-01-12', 49 UNION ALL
SELECT DATE '2020-01-13', 59 UNION ALL
SELECT DATE '2020-01-15', 30 UNION ALL
SELECT DATE '2020-01-17', 35 UNION ALL
SELECT DATE '2020-01-18', 39 UNION ALL
SELECT DATE '2020-01-19', 47 UNION ALL
SELECT DATE '2020-01-26', 50
)
SELECT
date,
`close`,
COUNT(*) OVER w AS c,
SUM(`close`) OVER w AS s,
AVG(`close`) OVER w AS a
FROM t
WINDOW w AS (ORDER BY date DESC RANGE INTERVAL 9 DAY PRECEDING)
ORDER BY date DESC
Leading to:
date |close|c|s |a |
----------|-----|-|---|-------|
2020-01-26| 50|1| 50|50.0000|
2020-01-19| 47|2| 97|48.5000|
2020-01-18| 39|3|136|45.3333|
2020-01-17| 35|4|171|42.7500|
2020-01-15| 30|4|151|37.7500|
2020-01-13| 59|5|210|42.0000|
2020-01-12| 49|6|259|43.1667|
2020-01-05| 51|3|159|53.0000|
2020-01-03| 54|3|154|51.3333|
2020-01-01| 50|3|155|51.6667|
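The output above can be reproduced outside the database to see which rows fall into each window. Because the window is ordered by date DESC, "9 days preceding" covers rows dated from the current row's date up to 9 days later. The Python sketch below recomputes the c, s and a columns under that reading; the function name is hypothetical.

```python
from datetime import date, timedelta

# Same sample data as the WITH t (...) example.
data = [
    (date(2020, 1, 1), 50), (date(2020, 1, 3), 54), (date(2020, 1, 5), 51),
    (date(2020, 1, 12), 49), (date(2020, 1, 13), 59), (date(2020, 1, 15), 30),
    (date(2020, 1, 17), 35), (date(2020, 1, 18), 39), (date(2020, 1, 19), 47),
    (date(2020, 1, 26), 50),
]

def range_window_stats(data, days=9):
    """For each row (newest first), aggregate the rows whose date lies
    between that row's date and `days` days after it, inclusive."""
    stats = []
    for current, _ in sorted(data, reverse=True):
        upper = current + timedelta(days=days)
        window = [close for d, close in data if current <= d <= upper]
        stats.append((current.isoformat(), len(window), sum(window),
                      round(sum(window) / len(window), 4)))
    return stats

stats = range_window_stats(data)
for row in stats:
    print(row)
```

For a trailing average that looks back in time instead, the window would be ordered by date ascending.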
Use something like
SELECT
sum(close) as sum,
avg(close) as average
FROM (
SELECT
(close)
FROM
tbl
WHERE
date <= '2002-07-05'
AND name_id = 2
ORDER BY
date DESC
LIMIT 9 ) temp
The inner query returns only the 9 most recent filtered rows (descending date order, limited to 9), and the outer query then computes the average and sum over just those rows.
The query you gave doesn't work because the SUM is calculated over all matching rows first and the LIMIT clause is applied afterwards, to the single row of output, so you get the sum of all the rows.
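The difference between the two queries is easy to observe on a small table. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption) with 12 hypothetical rows whose close values are 1 to 12. The aggregate with LIMIT sums all 12 rows, while the derived-table version sums only the 9 most recent.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (date TEXT, name_id INTEGER, close INTEGER)")
# Hypothetical data: 12 consecutive days ending 2002-07-01, close = 1..12.
start = date(2002, 6, 20)
conn.executemany(
    "INSERT INTO tbl VALUES (?, 2, ?)",
    [((start + timedelta(days=i)).isoformat(), i + 1) for i in range(12)],
)

# LIMIT applies to the single aggregated output row, not to the input rows.
naive_sum = conn.execute(
    "SELECT SUM(close) FROM tbl "
    "WHERE date <= '2002-07-05' AND name_id = 2 "
    "ORDER BY date DESC LIMIT 9"
).fetchone()[0]

# Limiting inside a derived table restricts the rows before aggregating.
last9_sum, last9_avg = conn.execute(
    "SELECT SUM(close), AVG(close) FROM ("
    "  SELECT close FROM tbl"
    "  WHERE date <= '2002-07-05' AND name_id = 2"
    "  ORDER BY date DESC LIMIT 9"
    ") temp"
).fetchone()

print(naive_sum, last9_sum, last9_avg)
```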
Another technique is to create a helper table of integers:
CREATE TABLE `tinyint_asc` (
`value` tinyint(3) unsigned NOT NULL default '0',
PRIMARY KEY (value)
);
INSERT INTO `tinyint_asc` VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12),(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23),(24),(25),(26),(27),(28),(29),(30),(31),(32),(33),(34),(35),(36),(37),(38),(39),(40),(41),(42),(43),(44),(45),(46),(47),(48),(49),(50),(51),(52),(53),(54),(55),(56),(57),(58),(59),(60),(61),(62),(63),(64),(65),(66),(67),(68),(69),(70),(71),(72),(73),(74),(75),(76),(77),(78),(79),(80),(81),(82),(83),(84),(85),(86),(87),(88),(89),(90),(91),(92),(93),(94),(95),(96),(97),(98),(99),(100),(101),(102),(103),(104),(105),(106),(107),(108),(109),(110),(111),(112),(113),(114),(115),(116),(117),(118),(119),(120),(121),(122),(123),(124),(125),(126),(127),(128),(129),(130),(131),(132),(133),(134),(135),(136),(137),(138),(139),(140),(141),(142),(143),(144),(145),(146),(147),(148),(149),(150),(151),(152),(153),(154),(155),(156),(157),(158),(159),(160),(161),(162),(163),(164),(165),(166),(167),(168),(169),(170),(171),(172),(173),(174),(175),(176),(177),(178),(179),(180),(181),(182),(183),(184),(185),(186),(187),(188),(189),(190),(191),(192),(193),(194),(195),(196),(197),(198),(199),(200),(201),(202),(203),(204),(205),(206),(207),(208),(209),(210),(211),(212),(213),(214),(215),(216),(217),(218),(219),(220),(221),(222),(223),(224),(225),(226),(227),(228),(229),(230),(231),(232),(233),(234),(235),(236),(237),(238),(239),(240),(241),(242),(243),(244),(245),(246),(247),(248),(249),(250),(251),(252),(253),(254),(255);
Afterwards you can use it like this:
select
date_add(tbl.date, interval tinyint_asc.value day) as mydate,
count(*),
sum(myvalue)
from tbl
inner join tinyint_asc on tinyint_asc.value <= 30 -- for a 30 day moving average
where date_add(tbl.date, interval tinyint_asc.value day) between '2016-01-01' and current_date()
group by mydate
This query is fast:
select date, name_id,
case @i when name_id then @i:=name_id else (@i:=name_id)
and (@n:=0)
and (@a0:=0) and (@a1:=0) and (@a2:=0) and (@a3:=0) and (@a4:=0) and (@a5:=0) and (@a6:=0) and (@a7:=0) and (@a8:=0)
end as a,
case @n when 9 then @n:=9 else @n:=@n+1 end as n,
@a0:=@a1,@a1:=@a2,@a2:=@a3,@a3:=@a4,@a4:=@a5,@a5:=@a6,@a6:=@a7,@a7:=@a8,@a8:=close,
(@a0+@a1+@a2+@a3+@a4+@a5+@a6+@a7+@a8)/@n as av
from tbl,
(select @i:=0, @n:=0,
@a0:=0, @a1:=0, @a2:=0, @a3:=0, @a4:=0, @a5:=0, @a6:=0, @a7:=0, @a8:=0) a
where name_id=2
order by name_id, date
If you need an average over 50 or 100 values, it's tedious to write, but
worth the effort. The speed is close to the ordered select.
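The nine user variables act as a fixed-size buffer that is shifted by one slot per row. The same idea, sketched in Python with collections.deque so that the window size becomes a parameter rather than something written out by hand (the function name is hypothetical, and this is an illustration of the technique, not a replacement for the query):

```python
from collections import deque

def moving_averages(values, window=9):
    """Average of the last `window` values at each position. Before the
    buffer is full, the average covers the values seen so far, like the
    n counter in the query above."""
    buf = deque(maxlen=window)
    total = 0
    averages = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # the oldest value is about to be evicted
        buf.append(v)
        total += v
        averages.append(total / len(buf))
    return averages

print(moving_averages([1, 2, 3, 4], window=2))
```

Keeping a running total means each row costs constant work regardless of the window size, which is why this single-pass approach is close in speed to a plain ordered select.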