Minimum / maximum of "sum" with relative values - mysql

I have the following table test:
+----+-------+
| id | value |
+----+-------+
| 1 | -3 |
| 2 | -5 |
| 3 | 10 |
| 4 | -1 |
+----+-------+
For MIN(value) I get -5, for MAX(value) I get 10, and for SUM(value) I get 1. However, I would like to get the minimum and maximum value when progressing through the table step by step.
Example 1: SELECT AWESOME_FUNCTION_SUM_MIN(value) FROM test ORDER BY id ASC
This should return -8 (first row is -3, plus the second row -5 results in the lowest value over the course of all values).
Example 2: SELECT AWESOME_FUNCTION_SUM_MAX(value) FROM test ORDER BY id ASC
This should return 2 (first row -3, second -5, and third row +10 leads to the highest value over the course of all values).
Obviously, ORDER BY does not really make sense here, since it orders the results of a query, but I used it anyway for demonstration purposes. This seems like such basic functionality that I was surprised to find nothing about it. Perhaps I am using the wrong keywords. Can somebody help me out? Or do I have to extract all values and do the analysis externally (i.e. not with MySQL)?

Create table/insert data.
CREATE TABLE test
(`id` INT, `value` INT)
;
INSERT INTO test
(`id`, `value`)
VALUES
(1, -3),
(2, -5),
(3, 10),
(4, -1)
;
MySQL doesn't have those functions, but you can simulate them using a self join.
Query SUM_MIN
SELECT
SUM(test.value)
FROM
test
INNER JOIN (
SELECT
id
FROM
test
WHERE
test.value > 0
ORDER BY
id ASC
LIMIT 1
)
AS
positive_number
ON
test.id < positive_number.id
ORDER BY
test.id
Result
sum(test.value)
-----------------
-8
Query SUM_MAX
SELECT
SUM(test.value)
FROM
test
INNER JOIN (
SELECT
id
FROM
test
WHERE
test.value > 0
ORDER BY
id ASC
LIMIT 1
)
AS
positive_number
ON
test.id <= positive_number.id
ORDER BY
test.id
Result
sum(test.value)
-----------------
2

Here's one way:
SELECT x.*
, @least:=LEAST(@least,value) least
, @greatest:=GREATEST(@greatest,value) greatest
, @i:=@i+value running
FROM my_table x
, (SELECT @least:=1000,@greatest:=-1000,@i:=0) vars
ORDER
BY id;
+----+-------+-------+----------+---------+
| id | value | least | greatest | running |
+----+-------+-------+----------+---------+
| 1 | -3 | -3 | -3 | -3 |
| 2 | -5 | -5 | -3 | -8 |
| 3 | 10 | -5 | 10 | 2 |
| 4 | -1 | -5 | 10 | 1 |
+----+-------+-------+----------+---------+

To get a cumulative sum, you can join a table to itself.
select min(val)
from (select sum(a.value) as val from test a join test b
on a.id<=b.id group by b.id) t1;
/* answer: -8 */
select max(val)
from (select sum(a.value) as val from test a join test b
on a.id<=b.id group by b.id) t1;
/* answer: 2 */
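On MySQL 8.0 or later, the same running total can probably be computed with a window function instead of a self join. This is only a sketch: the CTE names test_rows and running and the aliases running_total, sum_min and sum_max are made up for illustration, and the inline rows simply repeat the sample data from the question.

```sql
-- Inline copy of the question's sample rows, so the query runs without any table.
WITH test_rows (id, value) AS (
    SELECT 1, -3 UNION ALL
    SELECT 2, -5 UNION ALL
    SELECT 3, 10 UNION ALL
    SELECT 4, -1
),
-- Running total in id order: -3, -8, 2, 1.
running AS (
    SELECT id,
           SUM(value) OVER (ORDER BY id) AS running_total
    FROM test_rows
)
-- Lowest and highest point reached by the running total.
SELECT MIN(running_total) AS sum_min,   -- -8
       MAX(running_total) AS sum_max    --  2
FROM running;
```

To run it against the real table, replace test_rows in the second CTE with test and drop the first CTE.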

Related

How can I create a column with incremented values from a column with cumulated values in MySQL?

Table1 contains a column with cumulated values (all positive integers):
id ValuesCum
1 5
2 8
3 20
I would like to write a statement that returns an extra column with the incremented values for each row. The output should read something like:
id ValuesCum ValuesInc
1 5 (5)
2 8 3
3 20 12
Does anyone have a solution for this?
If you are running MySQL 8.0, you can use window function lag() for this:
select
t.*,
ValuesCum - lag(ValuesCum, 1, 0) over(order by id) ValuesInc
from mytable t
In earlier versions, an alternative is a correlated subquery:
select
t.*,
ValuesCum - (
select coalesce(max(t1.ValuesCum), 0)
from mytable t1
where t1.id < t.id
) ValuesInc
from mytable t
You can use a correlated subquery to get the value of ValuesCum of the previous id:
select t.*,
t.ValuesCum -
coalesce((select ValuesCum from tablename where id < t.id order by id desc limit 1), 0) ValuesInc
from tablename t
Results:
| id | ValuesCum | ValuesInc |
| --- | --------- | --------- |
| 1 | 5 | 5 |
| 2 | 8 | 3 |
| 3 | 20 | 12 |

What optimal selection statements are possible to select previous and next rows (in one statement)?

How to select previous AND next rows from an ordered table, ordered by an order column?
This is a simple example of such a table (e.g. test_table):
+--------+-----------+----------+
| id | name | order |
+--------+-----------+----------+
| 126 | Test 0 | 0 |
+--------+-----------+----------+
| 73 | Test 1 | 1 | >
+--------+-----------+----------+
| 801 | Test 5 | 5 | <<<
+--------+-----------+----------+
| 3 | Test 8 | 8 | >
+--------+-----------+----------+
| 45 | Test 12 | 12 |
+--------+-----------+----------+
This is an example statement that does what I need (in this example I have the order value 5, and I need the previous and next rows by order):
SELECT * FROM
(
SELECT * FROM test_table
WHERE test_table.order < 5
ORDER BY test_table.order DESC LIMIT 1
) AS a
UNION
SELECT * FROM
(
SELECT * FROM test_table
WHERE test_table.order > 5
ORDER BY test_table.order LIMIT 1
) AS b
However, I think it is too complicated. Is there another way to do it, using fewer selects (and/or without a union)? In short: are there more efficient statements, or some best practices?
To be clearer, I expect the following result set:
+--------+-----------+----------+
| id | name | order |
+--------+-----------+----------+
| 73 | Test 1 | 1 |
+--------+-----------+----------+
| 3 | Test 8 | 8 |
+--------+-----------+----------+
P.S. Please, do not use any procedures or custom functions. Assume that there are no appropriate administrator rights for it.
As far as I can see, your solution is already optimal. I would, though, write it a bit shorter and use UNION ALL instead of UNION (which is shorthand for UNION DISTINCT):
(
SELECT * FROM test_table
WHERE test_table.order < 5
ORDER BY test_table.order DESC LIMIT 1
) UNION ALL (
SELECT * FROM test_table
WHERE test_table.order > 5
ORDER BY test_table.order LIMIT 1
)
Given an index on the order column, it should also have the best possible performance.
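A sketch of what such an index might look like. The index name idx_test_table_order is made up, and the table definition is only guessed from the sample rows in the question. EXPLAIN can then be used to check whether the optimizer actually uses the index.

```sql
-- Table definition assumed from the sample rows in the question.
CREATE TABLE test_table (
    id      INT PRIMARY KEY,
    name    VARCHAR(50),
    `order` INT
);

INSERT INTO test_table (id, name, `order`) VALUES
    (126, 'Test 0', 0),
    (73,  'Test 1', 1),
    (801, 'Test 5', 5),
    (3,   'Test 8', 8),
    (45,  'Test 12', 12);

-- Hypothetical index name. `order` is quoted because ORDER is a reserved word.
CREATE INDEX idx_test_table_order ON test_table (`order`);

-- Inspect the plan for one half of the UNION ALL.
EXPLAIN
SELECT * FROM test_table
WHERE `order` < 5
ORDER BY `order` DESC
LIMIT 1;
```

With only five rows the optimizer may still prefer a full scan, so the plan is more meaningful on a table of realistic size.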
One alternative way to do this would be to use the LEAD and LAG analytic functions:
WITH cte1 AS (
SELECT id, `order`,
LAG(id) OVER (ORDER BY `order`) id_lag,
LEAD(id) OVER (ORDER BY `order`) id_lead
FROM test_table
),
cte2 AS (
SELECT * FROM cte1 WHERE `order` = 5
)
SELECT id, name, `order` FROM test_table WHERE id = (SELECT id_lag FROM cte2)
UNION ALL
SELECT id, name, `order` FROM test_table WHERE id = (SELECT id_lead FROM cte2);

Update value based on value from nearest smaller neighbour

I have a table with a column A that is INT(11) (it's a timestamp, but for now I just use small numbers)
id | A | diff |
---+----+------+
1 | 12 | |
2 | 7 | |
3 | 23 | |
4 | 9 | |
5 | 2 | |
6 | 30 | |
I would like to update diff with the difference between A and its nearest smaller neighbour. So if A=12, its nearest smaller neighbour is A=9; if A=30, it is A=23. I should end up with a table like this (sorted on A):
id | A | diff |
---+----+------+
5 | 2 | - |
2 | 7 | 5 | (7-2)
4 | 9 | 2 | (9-7)
1 | 12 | 3 | (12-9)
3 | 23 | 11 | (23-12)
6 | 30 | 7 | (30-23)
I can calculate the difference at the moment of insertion, as I know A then (here: A=15):
INSERT INTO `table` (`A`,`diff`)
(SELECT 15 , 15-`A` FROM `table` WHERE `A` < 15 ORDER BY `A` DESC LIMIT 1)
This results in a new record:
id | A | diff |
---+----+------+
7 | 15 | 3 | (3 being the difference between A=12 and A=15)
(NOTE: This fails miserably when A=1, being the new smallest value and having no smaller neighbour, so no value of diff)
But now the value of diff in record 3 is wrong, because it is still based on 23 - 12, whereas it should now be 23 - 15.
So I just want to insert the A value and then run an update on the table, refreshing diff where necessary. But that's where my knowledge of MySQL ends...
I crafted this query, but it fails with "You can't specify table 't1' for update in FROM clause":
UPDATE `table` AS t1
SET
t1.`diff` = t1.`A` - (SELECT `A` FROM `table`
WHERE `A` < t1.`A`
ORDER BY `A` DESC LIMIT 1
)
Here's a query:
SELECT x.*
, x.a-MAX(y.a) diff
FROM my_table x
LEFT
JOIN my_table y
ON y.a < x.a
GROUP
BY x.id
ORDER
BY a;
I'm not sure why you would want to store derived data, but you can I guess...
UPDATE my_table m
JOIN
( SELECT x.*
, x.a-MAX(y.a) q
FROM my_table x
JOIN my_table y
ON y.a < x.a
GROUP
BY x.id
) n
ON n.id = m.id
SET m.diff = q;
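On MySQL 8.0 or later, the self join inside the derived table could probably be replaced by LAG(). This is a sketch under assumptions: the table name neighbour_demo, the column types and the alias new_diff are invented for the demo, and the rows repeat the sample data from the question.

```sql
-- Demo table; the name and column types are assumptions.
CREATE TABLE neighbour_demo (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    a    INT NOT NULL,
    diff INT NULL
);

INSERT INTO neighbour_demo (a) VALUES (12), (7), (23), (9), (2), (30);

-- LAG() returns the previous a when rows are ordered by a,
-- i.e. the nearest smaller neighbour (or NULL for the smallest row).
-- Reading the target table through a materialized derived table avoids
-- the "can't specify target table for update" error.
UPDATE neighbour_demo m
JOIN (
    SELECT id,
           a - LAG(a) OVER (ORDER BY a) AS new_diff
    FROM neighbour_demo
) n ON n.id = m.id
SET m.diff = n.new_diff;

SELECT id, a, diff FROM neighbour_demo ORDER BY a;
```

With the sample rows this should give diffs of 5, 2, 3, 11 and 7, and NULL for the smallest value. If A can contain duplicates, the second of two equal values gets a diff of 0, which may or may not be what is wanted.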
You may try this after inserting the new value:
UPDATE x
SET
x.diff = iq2.new_diff
FROM
#t x
INNER JOIN
(SELECT id,A,diff , new_diff
FROM
(select id,A,15 as new_number,
CASE WHEN (A-15) < 0 THEN NULL ELSE (A-15) END as new_diff,diff
from #t
) iq
WHERE
iq.new_diff <= iq.diff
AND iq.new_diff <> 0
)iq2
on x.A = iq2.A
The inner query compares the previous difference with the current one and then updates the relevant rows.

Fetch Unit consumption date-wise

I am struggling to get the following result from MySQL. I have 10 records in a table with date and unit fields, and I need to get the units used on each date.
The table structure is as follows; each record holds a cumulative total, i.e. that day's units added to the previous total:
Date Units
---------- ---------
10/10/2012 101
11/10/2012 111
12/10/2012 121
13/10/2012 140
14/10/2012 150
15/10/2012 155
16/10/2012 170
17/10/2012 180
18/10/2012 185
19/10/2012 200
Desired output will be :
Date Units
---------- ---------
10/10/2012 101
11/10/2012 10
12/10/2012 10
13/10/2012 19
14/10/2012 10
15/10/2012 5
16/10/2012 15
17/10/2012 10
18/10/2012 5
19/10/2012 15
Any help will be appreciated. Thanks
There's a couple of ways to get the resultset. If you can live with an extra column in the resultset, and the order of the columns, then something like this is a workable approach.
using user variables
SELECT d.Date
, IF(@prev_units IS NULL
,@diff := 0
,@diff := d.units - @prev_units
) AS `Units_used`
, @prev_units := d.units AS `Units`
FROM ( SELECT @prev_units := NULL ) i
JOIN (
SELECT t.Date, t.Units
FROM mytable t
ORDER BY t.Date, t.Units
) d
This returns the specified resultset, but it includes the Units column as well. It's possible to filter that column out, but it's more expensive because of the way MySQL processes an inline view (MySQL calls it a "derived table").
To remove that extra column, you can wrap that in another query...
SELECT f.Date
, f.Units_used
FROM (
query from above goes here
) f
ORDER BY f.Date
but again, removing that column comes with the extra cost of materializing that result set a second time.
using a semi-join
If you are guaranteed to have a single row for each Date value, with no gaps in the Date values, and Date is stored either as a DATE or as a DATETIME with the time component set to a constant such as midnight, then another query will return the specified result set:
SELECT t.Date
, t.Units - s.Units AS Units_Used
FROM mytable t
LEFT
JOIN mytable s
ON s.Date = t.Date + INTERVAL -1 DAY
ORDER BY t.Date
If there's a missing Date value (a gap) such that there is no matching previous row, then Units_used will have a NULL value.
using a correlated subquery
If you don't have a guarantee of no "missing dates", but you have a guarantee that there is no more than one row for a particular Date, then another approach (usually more expensive in terms of performance) is to use a correlated subquery:
SELECT t.Date
, ( t.Units - (SELECT s.Units
FROM mytable s
WHERE s.Date < t.Date
ORDER BY s.Date DESC
LIMIT 1)
) AS Units_used
FROM mytable t
ORDER BY t.Date, t.Units
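On MySQL 8.0 or later, LAG() may be a simpler alternative to the approaches above, since it tolerates gaps in the dates. This is a sketch only: the CTE name mytable_rows is made up, only the first four sample rows are inlined, and the dates are assumed to be stored as real DATE values rather than as dd/mm/yyyy strings.

```sql
-- First four sample rows from the question, as DATE values (an assumption).
WITH mytable_rows (`Date`, Units) AS (
    SELECT DATE '2012-10-10', 101 UNION ALL
    SELECT DATE '2012-10-11', 111 UNION ALL
    SELECT DATE '2012-10-12', 121 UNION ALL
    SELECT DATE '2012-10-13', 140
)
-- The third LAG() argument (0) is used when there is no previous row,
-- so the first date reports its full reading.
SELECT `Date`,
       Units - LAG(Units, 1, 0) OVER (ORDER BY `Date`) AS Units_used
FROM mytable_rows
ORDER BY `Date`;
-- Units_used: 101, 10, 10, 19
```

Unlike the user-variable query, this returns 101 rather than 0 for the first date, which matches the desired output in the question.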
spencer7593's solution will be faster, but you can also do something like this...
SELECT * FROM rolling;
+----+-------+
| id | units |
+----+-------+
| 1 | 101 |
| 2 | 111 |
| 3 | 121 |
| 4 | 140 |
| 5 | 150 |
| 6 | 155 |
| 7 | 170 |
| 8 | 180 |
| 9 | 185 |
| 10 | 200 |
+----+-------+
SELECT a.id,COALESCE(a.units - b.units,a.units) units
FROM
( SELECT x.*
, COUNT(*) `rank`
FROM rolling x
JOIN rolling y
ON y.id <= x.id
GROUP
BY x.id
) a
LEFT
JOIN
( SELECT x.*
, COUNT(*) `rank`
FROM rolling x
JOIN rolling y
ON y.id <= x.id
GROUP
BY x.id
) b
ON b.rank= a.rank -1;
+----+-------+
| id | units |
+----+-------+
| 1 | 101 |
| 2 | 10 |
| 3 | 10 |
| 4 | 19 |
| 5 | 10 |
| 6 | 5 |
| 7 | 15 |
| 8 | 10 |
| 9 | 5 |
| 10 | 15 |
+----+-------+
This should give the desired result. I don't know what your table is called, so I named it "tbltest".
Naming a column date is generally a bad idea, as the word also refers to other things (functions, data types, ...), so I renamed it "fdate". Using uppercase characters in field names or table names is also a bad idea, as it makes your statements less portable between databases (some databases are case sensitive and some are not).
SELECT
A.fdate,
A.units - coalesce(B.units, 0) AS units
FROM
tbltest A left join tbltest B ON A.fdate = B.fdate + INTERVAL 1 DAY

What is the proper way to use a "WHERE" clause to fetch rows by an approximate value of some field?

+-------+
| value |
+-------+
| 13.00 |
| 15.00 |
| 17.50 |
| 18.00 |
| 18.10 |
| 18.30 |
| 19.90 |
| 20.00 |
| 20.30 |
| 20.60 |
+-------+
SELECT * FROM `table` WHERE `value` = 19;
I want to retrieve the rows that contain a value from 18.00 to 20.60 (plus or minus 2).
I am getting the number 19 via POST.
You can use BETWEEN:
SELECT * FROM `table`
WHERE `value` BETWEEN $posted_value - 2 AND $posted_value + 2
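Because the number arrives via POST, it is usually safer to pass it as a parameter than to paste it into the SQL string. In an application this would normally go through the client library's prepared statements. The sketch below uses MySQL's server-side PREPARE just to show the shape. The table name readings, the statement name nearby_values and the variable @posted_value are all made up for the demo.

```sql
-- Demo table; the name readings is an assumption, the values come from the question.
CREATE TABLE readings (
    `value` DECIMAL(6,2) NOT NULL
);

INSERT INTO readings (`value`) VALUES
    (13.00), (15.00), (17.50), (18.00), (18.10),
    (18.30), (19.90), (20.00), (20.30), (20.60);

-- Stand-in for the number received from the client.
SET @posted_value = 19;

-- Each ? is bound to a value at EXECUTE time, so the posted number
-- never becomes part of the statement text.
PREPARE nearby_values FROM
    'SELECT `value` FROM readings WHERE `value` BETWEEN ? - 2 AND ? + 2 ORDER BY `value`';
EXECUTE nearby_values USING @posted_value, @posted_value;
DEALLOCATE PREPARE nearby_values;
```

BETWEEN is inclusive on both ends, so for 19 this should return the eight rows from 17.50 to 20.60. Note that 17.50 qualifies because it lies within 2 of 19.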
EDIT
If you want a range of +2 or -2, the most efficient way to do that would be:
SELECT t.*
FROM `table` t
WHERE t.value >= 19 - 2.0
AND t.value <= 19 + 2.0
ORDER BY t.value
original
To get just the value column from the seven rows with values "closest" to 19, calculate the difference between 19 and the value in the value column, take the absolute value of the difference, and then sort by that. Then limit the number of rows returned:
SELECT s.value
FROM (
SELECT s.value
FROM `table` s
ORDER BY ABS(19.0-s.value)
LIMIT 7
) s
ORDER BY s.value
To get the entire row, for the rows with the values "closest" to 19, you could do the same query, but also retrieve a unique identifier from the row, and then perform a join to the original table, for example:
SELECT t.*
FROM (
SELECT r.id
, r.value
, ABS(19.0-r.value) AS `absdiff`
FROM `table` r
ORDER BY ABS(19.0-r.value)
LIMIT 7
) s
JOIN `table` t
ON t.id = s.id