How to get the difference between consecutive rows in MySQL? - mysql

I have a table in a MySQL database with this data:
id date number qty
114 07-10-2018 200 5
120 01-12-2018 300 10
123 03-02-2019 700 12
1126 07-03-2019 1000 15
I want to calculate the difference between consecutive rows, and I need the output format to be like this:
id date number diff qty avg
114 07-10-2018 200 0 5 0
120 01-12-2018 300 100 10 10
123 03-02-2019 700 400 12 33.33
1126 07-03-2019 1000 300 15 20
Does anyone know how to do this in a MySQL query? I want the first value of the diff and avg columns to be 0 and the rest to be the difference.

For MySQL 8, use the LAG window function.
SELECT
test.id,
test.date,
test.number,
test.qty,
IFNULL(test.number - LAG(test.number) OVER w, 0) AS diff,
ROUND(IFNULL(test.number - LAG(test.number) OVER w, 0)/ test.qty, 2) AS 'Avg'
FROM purchases test
WINDOW w AS (ORDER BY test.`date` ASC);
For MySQL 5.7 or earlier versions
We can use a MySQL user variable to do this job. Assume your table name is test.
SELECT
test.id,
test.date,
test.number,
test.qty,
@diff:= IF(@prev_number = 0, 0, test.number - @prev_number) AS diff,
ROUND(@diff / qty, 2) 'avg',
@prev_number:= test.number as dummy
FROM
test,
(SELECT @prev_number:= 0 AS num) AS b
ORDER BY test.`date` ASC;
-------------------------------------------------------------------------------
Output:
| id | date | number| qty | diff | avg | dummy |
-----------------------------------------------------------------
| 114 | 2018-10-07 | 200 | 5 | 0 | 0.00 | 200 |
| 120 | 2018-12-01 | 300 | 10 | 100 | 10.00 | 300 |
| 123 | 2019-02-03 | 700 | 12 | 400 | 33.33 | 700 |
| 1126 | 2019-03-07 | 1000 | 15 | 300 | 20.00 | 1000 |
Explanation:
(SELECT @prev_number:= 0 AS num) AS b
We initialize the variable @prev_number to zero in the FROM clause and join it with each row of the test table.
@diff:= IF(@prev_number = 0, 0, test.number - @prev_number) AS diff First we compute the difference and store it in another variable, @diff, so it can be reused for the average calculation. The condition makes the diff for the first row zero.
@prev_number:= test.number as dummy We set this variable to the current number so that it can be used by the next row.
Note: We have to read this variable first, in both the difference and the average, and only then set it to the new value, so that the next row can access the value from the previous row.
You can skip/modify order by clause as per your requirements.
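To make the "read the previous value first, then overwrite it" order concrete, here is a minimal Python sketch of the same logic outside SQL. It is only an illustration: the function name with_diff_and_avg is hypothetical, the rows are assumed to be sorted by date already, and it uses None rather than 0 as the "no previous row" marker.

```python
def with_diff_and_avg(rows):
    """rows: (id, date, number, qty) tuples, assumed already sorted by date."""
    result = []
    prev_number = None  # stands in for @prev_number before the first row
    for row_id, day, number, qty in rows:
        # Read the previous value first ...
        diff = 0 if prev_number is None else number - prev_number
        avg = round(diff / qty, 2)
        result.append((row_id, day, number, diff, qty, avg))
        # ... and only then overwrite it for the next row.
        prev_number = number
    return result


rows = [
    (114, "2018-10-07", 200, 5),
    (120, "2018-12-01", 300, 10),
    (123, "2019-02-03", 700, 12),
    (1126, "2019-03-07", 1000, 15),
]
for line in with_diff_and_avg(rows):
    print(line)
```

One difference from the query is worth noting: the SQL treats @prev_number = 0 as "first row", so a genuine value of 0 in the number column would also produce a diff of 0 on the following row. The None marker in the sketch avoids that ambiguity.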

There could be better ways to do this, but try this:
SELECT A.id,
A.date,
A.number,
A.qty,
A.diff,
B.avg
FROM
(SELECT *, abs(LAG(number, 1, number) OVER (ORDER BY id) - number) AS 'diff'
FROM table) AS A
JOIN
(SELECT *, abs(LAG(number, 1, number) OVER (ORDER BY id) - number)/qty AS 'avg' FROM table) AS B
ON A.id = B.id;

Related

How do I get results of a MySQL JOIN where records meet a value criteria in joined table?

This may be simple but I can't figure it out...
I have two tables:
tbl_results:
runID | balance |
1 | 3432
2 | 5348
3 | 384
tbl_phases:
runID_fk | pc |
1 | 34
1 | 2
1 | 18
2 | 15
2 | 18
2 | 20
3 | -20
3 | 10
3 | 60
I want to get a recordset of: runID, balance, min(pc), max(pc) only where pc>10 and pc<50 for each runID as a group, excluding runIDs where any associated pc value is outside of value range.
I would want the following results from what's described above:
runID | balance | min_pc | max_pc
2 | 5348 | 15 | 20
... because runID=1&3 have pc values that fall outside the numeric range for pc noted above.
Thanks in advance!
You may apply filters based on your requirements in your having clause. You may try the following.
Query #1
SELECT
r.runID,
MAX(r.balance) as balance,
MIN(p.pc) as min_pc,
MAX(p.pc) as max_pc
FROM
tbl_results r
INNER JOIN
tbl_phases p ON p.runID_fk = r.runID
GROUP BY
r.runID
HAVING
MIN(p.pc)>10 AND MAX(p.pc) < 50;
runID | balance | min_pc | max_pc
2 | 5348 | 15 | 20
Query #2
SELECT
r.runID,
MAX(r.balance) as balance,
MIN(p.pc) as min_pc,
MAX(p.pc) as max_pc
FROM
tbl_results r
INNER JOIN
tbl_phases p ON p.runID_fk = r.runID
GROUP BY
r.runID
HAVING
COUNT(CASE WHEN p.pc <= 10 or p.pc >= 50 THEN 1 END) =0;
runID | balance | min_pc | max_pc
2 | 5348 | 15 | 20
View working demo on DB Fiddle
Updated with comments from Rahul Biswas
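For readers who want to check the group-level filter by hand, here is a small Python sketch of what the HAVING clause in Query #1 does. The function name runs_within_range and its low/high parameters are hypothetical; the data is the sample from the question.

```python
from collections import defaultdict


def runs_within_range(results, phases, low=10, high=50):
    """Keep only runs whose pc values ALL lie strictly between low and high."""
    pcs_by_run = defaultdict(list)
    for run_id, pc in phases:
        pcs_by_run[run_id].append(pc)

    out = []
    for run_id, balance in results:
        pcs = pcs_by_run.get(run_id)
        if not pcs:
            continue  # like the INNER JOIN, runs without phases are dropped
        # Testing the group's min and max is enough to test every member.
        if min(pcs) > low and max(pcs) < high:
            out.append((run_id, balance, min(pcs), max(pcs)))
    return out


results = [(1, 3432), (2, 5348), (3, 384)]
phases = [(1, 34), (1, 2), (1, 18), (2, 15), (2, 18), (2, 20),
          (3, -20), (3, 10), (3, 60)]
print(runs_within_range(results, phases))
```

The point of filtering after grouping is that a WHERE clause on pc would only drop the offending rows and still return runs 1 and 3 with their remaining values.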

How to compute the max, min, average and median of per-buyer order counts in MySQL

I have one table called order_match, which contains order_buyer_id as the ID of the transaction, createdby as the ID of the buyer, and createdAt as the date when the transaction happened.
In this case, I want to count the orders (order_buyer_id) for each buyer (createdby) and then find the maximum and the minimum of those counts.
this is the example data:
+----------------+-----------+------------+
| order_buyer_id | createdby | createdAt |
+----------------+-----------+------------+
| 19123 | 19 | 2017-02-02 |
| 193241 | 19 | 2017-02-02 |
| 123123 | 20 | 2017-02-02 |
| 32242 | 20 | 2017-02-02 |
| 32434 | 20 | 2017-02-02 |
+----------------+-----------+------------+
When the query is run, the expected result is:
+-----+-----+---------+--------+
| max | min | average | median |
+-----+-----+---------+--------+
| 3 | 2 | 2,5 | 3 |
+-----+-----+---------+--------+
I've tried this query:
select max(count(order_buyer_id)), min(count(order_buyer_id)), avg(count(order_buyer_id)), median(count(order_buyer_Id)) from order_match where createdby = 19 and 20 and createdAt = '2017-02-02' group by createdby;
Most of what you want to do is straightforward, but to compute median values you need a ROW_NUMBER function, which you have to simulate with variables in MySQL 5.7. Having computed the row number (based on ordering the counts), you can then take either the middle count (if there is an odd number of values) or the average of the two middle values (if there is an even number of values) to get the median. By using conditional aggregation, we can then compute the median at the same time as the other values:
SELECT MAX(count) AS max,
MIN(count) AS min,
AVG(count) AS average,
AVG(CASE WHEN rn IN (FLOOR((@tr+1)/2), FLOOR((@tr+2)/2)) THEN count END) AS median
FROM (
SELECT count,
@rn := @rn + 1 AS rn,
@tr := @rn AS tr
FROM (
SELECT COUNT(*) AS count
FROM order_match
GROUP BY createdby
ORDER BY count
) o
CROSS JOIN (SELECT @rn := 0) init
) c
Output (for your sample data):
max min average median
3 2 2.5 2.5
Demo on SQLFiddle
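The middle-row arithmetic in the query can be checked with a short Python sketch. This is an illustration only; the function name order_count_stats is hypothetical, and the input is just the createdby column from the sample data.

```python
from collections import Counter


def order_count_stats(created_by):
    """Return (max, min, average, median) of the per-buyer order counts."""
    counts = sorted(Counter(created_by).values())
    n = len(counts)
    # 1-based positions of the middle row(s), as in FLOOR((n+1)/2) and
    # FLOOR((n+2)/2); both give the same position when n is odd.
    lo, hi = (n + 1) // 2, (n + 2) // 2
    median = (counts[lo - 1] + counts[hi - 1]) / 2
    return max(counts), min(counts), sum(counts) / n, median


created_by = [19, 19, 20, 20, 20]
print(order_count_stats(created_by))
```

With two buyers the two middle positions are 1 and 2, so the median is the average of both counts, which is why the answer returns 2.5 rather than the 3 shown in the question.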

Grouping only specific columns and FIFO calculation in SQL Query

I have the following sample data:
order_id receipt_id receipt_amount total_already_invoiced_amount
14 36 30 150
14 37 30 150
15 42 30 30
16 45 30 60
16 46 40 60
17 50 40 60
17 51 40 60
17 52 40 60
The column receipt_amount is the amount of an order received for that specific line.
The column total_already_invoiced_amount is the total amount invoiced for an order.
I want to transform this table into a new one which retains only the lines where there is a received amount which is remaining after deducting the total invoiced amount (first in first out).
For example, if I have 3 receipt lines, each of 40, and my total invoiced is 60, then I can figure out that the first receipt line is fully invoiced, the second receipt line has 20 remaining to be invoiced and the third one has not been invoiced at all. I cannot aggregate, I must keep the receipt_id as an index as these can have different dates and I need to be able to distinguish according to that.
The result of such query would be the following:
order_id receipt_id received_not_invoiced_amount
16 46 10
17 51 20
17 52 40
I understand I can select group by order_id to get the aggregated receipt_amount, but it will also aggregate the total_already_invoiced_amount, which is not what I want. I am trying the following but that will not perform the FIFO calculation....
SELECT order_id,
receipt_id,
(total_already_invoiced_amount -
(SELECT receipt_amount FROM X GROUP BY order_id)
) total_already_invoiced_amount
FROM X
WHERE (total_already_invoiced_amount -
(SELECT receipt_amount FROM X GROUP BY order_id)) < 0
I'm a bit lost of where to start with to make this work.
In the absence of window functions (not available in MySQL 5.7), one approach is to do a self-join and compute the sum of all the receipts for the order up to and including the receipt row of the first table. We can then use conditional statements to determine the differences accordingly:
Query #1 View on DB Fiddle
SELECT t1.order_id,
t1.receipt_id,
CASE
WHEN Coalesce(Sum(t2.receipt_amount), 0) <=
t1.total_already_invoiced_amount
THEN 0
ELSE Least(Coalesce(Sum(t2.receipt_amount), 0) -
t1.total_already_invoiced_amount,
t1.receipt_amount)
end AS received_not_invoiced_amount
FROM X t1
LEFT JOIN X t2
ON t2.order_id = t1.order_id
AND t2.receipt_id <= t1.receipt_id
GROUP BY t1.order_id,
t1.receipt_id,
t1.receipt_amount,
t1.total_already_invoiced_amount
HAVING received_not_invoiced_amount > 0;
| order_id | receipt_id | received_not_invoiced_amount |
| -------- | ---------- | ---------------------------- |
| 16 | 46 | 10 |
| 17 | 51 | 20 |
| 17 | 52 | 40 |
For good performance, you can define the following composite index: (order_id, receipt_id).
Another approach is to use user-defined variables. It works like a loop: we calculate a rolling (cumulative) sum per order_id as we move down the receipts, and based on that sum we determine whether more has been received than invoiced. For a more detailed explanation of this technique, you may check this answer: https://stackoverflow.com/a/53465139
Query #2 View on DB Fiddle
SELECT order_id,
receipt_id,
received_not_invoiced_amount
FROM (SELECT @s := IF(@o = order_id, @s + receipt_amount, receipt_amount) AS
cum_receipt_amount,
IF(@s <= total_already_invoiced_amount, 0,
Least(@s - total_already_invoiced_amount, receipt_amount)) AS
received_not_invoiced_amount,
@o := order_id AS
order_id
,
receipt_id
FROM (SELECT *
FROM X
ORDER BY order_id,
receipt_id) t1
CROSS JOIN (SELECT @o := 0,
@s := 0) vars) t2
WHERE received_not_invoiced_amount > 0;
| order_id | receipt_id | received_not_invoiced_amount |
| -------- | ---------- | ---------------------------- |
| 16 | 46 | 10 |
| 17 | 51 | 20 |
| 17 | 52 | 40 |
For good performance, you can define the same composite index: (order_id, receipt_id).
You may benchmark both the approaches for best performance.
You want a cumulative sum:
select order_id, receipt_id,
least(running_ra - total_already_invoiced_amount, receipt_amount) as received_not_invoiced_amount
from (select x.*,
sum(receipt_amount) over (partition by order_id order by receipt_id) as running_ra
from x
) x
where running_ra > total_already_invoiced_amount
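The FIFO rule behind all of these queries (running sum per order, then cap the excess at the current receipt's amount) can be sketched in Python as follows. This is only an illustration under the assumption that rows arrive sorted by order_id and receipt_id; the function name received_not_invoiced is hypothetical.

```python
def received_not_invoiced(rows):
    """rows: (order_id, receipt_id, receipt_amount, total_already_invoiced_amount),
    assumed sorted by order_id, then receipt_id."""
    out = []
    current_order = None
    running = 0
    for order_id, receipt_id, amount, invoiced in rows:
        if order_id != current_order:
            # New order: restart the cumulative sum.
            current_order = order_id
            running = 0
        running += amount
        if running > invoiced:
            # The excess can never be larger than this receipt itself.
            out.append((order_id, receipt_id, min(running - invoiced, amount)))
    return out


rows = [
    (14, 36, 30, 150), (14, 37, 30, 150),
    (15, 42, 30, 30),
    (16, 45, 30, 60), (16, 46, 40, 60),
    (17, 50, 40, 60), (17, 51, 40, 60), (17, 52, 40, 60),
]
print(received_not_invoiced(rows))
```

For order 17 the running sums are 40, 80 and 120 against 60 invoiced, so receipt 51 keeps 20 and receipt 52 keeps its full 40.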

Fetch Unit consumption date-wise

I am struggling to get a result from MySQL in the following way. I have 10 records in a MySQL table with date and unit fields. I need to get the units used on every date.
The table structure is as follows; each record holds a running total (today's units added to the previous total):
Date Units
---------- ---------
10/10/2012 101
11/10/2012 111
12/10/2012 121
13/10/2012 140
14/10/2012 150
15/10/2012 155
16/10/2012 170
17/10/2012 180
18/10/2012 185
19/10/2012 200
Desired output will be :
Date Units
---------- ---------
10/10/2012 101
11/10/2012 10
12/10/2012 10
13/10/2012 19
14/10/2012 10
15/10/2012 5
16/10/2012 15
17/10/2012 10
18/10/2012 5
19/10/2012 15
Any help will be appreciated. Thanks
There's a couple of ways to get the resultset. If you can live with an extra column in the resultset, and the order of the columns, then something like this is a workable approach.
using user variables
SELECT d.Date
, IF(@prev_units IS NULL
,@diff := 0
,@diff := d.units - @prev_units
) AS `Units_used`
, @prev_units := d.units AS `Units`
FROM ( SELECT @prev_units := NULL ) i
JOIN (
SELECT t.Date, t.Units
FROM mytable t
ORDER BY t.Date, t.Units
) d
This returns the specified resultset, but it includes the Units column as well. It's possible to have that column filtered out, but it's more expensive, because of the way MySQL processes an inline view (MySQL calls it a "derived table")
To remove that extra column, you can wrap that in another query...
SELECT f.Date
, f.Units_used
FROM (
query from above goes here
) f
ORDER BY f.Date
but again, removing that column comes with the extra cost of materializing that result set a second time.
using a semi-join
If you are guaranteed to have a single row for each Date value, either stored as a DATE or as a DATETIME with the time component set to a constant such as midnight, with no gaps in the Date values, and Date is defined with the DATE or DATETIME datatype, then another query that will return the specified result set is:
SELECT t.Date
, t.Units - s.Units AS Units_Used
FROM mytable t
LEFT
JOIN mytable s
ON s.Date = t.Date + INTERVAL -1 DAY
ORDER BY t.Date
If there's a missing Date value (a gap) such that there is no matching previous row, then Units_used will have a NULL value.
using a correlated subquery
If you don't have a guarantee of no "missing dates", but you have a guarantee that there is no more than one row for a particular Date, then another approach (usually more expensive in terms of performance) is to use a correlated subquery:
SELECT t.Date
, ( t.Units - (SELECT s.Units
FROM mytable s
WHERE s.Date < t.Date
ORDER BY s.Date DESC
LIMIT 1)
) AS Units_used
FROM mytable t
ORDER BY t.Date, t.Units
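The practical difference between the date-minus-one-day join and the correlated subquery only shows up when a date is missing. Here is a small Python sketch of both lookups; the function names are hypothetical and the readings deliberately leave out 12/10/2012 to create a gap, so this is not the question's full data.

```python
from datetime import date, timedelta


def used_by_previous_day(readings):
    """Join style: look up exactly (day - 1 day); None when that day is missing."""
    out = {}
    for day, units in readings.items():
        prev = readings.get(day - timedelta(days=1))
        out[day] = None if prev is None else units - prev
    return out


def used_by_latest_earlier(readings):
    """Correlated-subquery style: use the most recent earlier day, whatever the gap."""
    out = {}
    prev = None
    for day in sorted(readings):
        out[day] = None if prev is None else readings[day] - prev
        prev = readings[day]
    return out


# Hypothetical readings with 2012-10-12 missing.
readings = {
    date(2012, 10, 10): 101,
    date(2012, 10, 11): 111,
    date(2012, 10, 13): 140,
}
print(used_by_previous_day(readings))
print(used_by_latest_earlier(readings))
```

In both versions the first date has no previous row and yields None (NULL in SQL), so a COALESCE is still needed if the first row should show its own Units value as in the desired output.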
spencer7593's solution will be faster, but you can also do something like this...
SELECT * FROM rolling;
+----+-------+
| id | units |
+----+-------+
| 1 | 101 |
| 2 | 111 |
| 3 | 121 |
| 4 | 140 |
| 5 | 150 |
| 6 | 155 |
| 7 | 170 |
| 8 | 180 |
| 9 | 185 |
| 10 | 200 |
+----+-------+
SELECT a.id,COALESCE(a.units - b.units,a.units) units
FROM
( SELECT x.*
, COUNT(*) rank
FROM rolling x
JOIN rolling y
ON y.id <= x.id
GROUP
BY x.id
) a
LEFT
JOIN
( SELECT x.*
, COUNT(*) rank
FROM rolling x
JOIN rolling y
ON y.id <= x.id
GROUP
BY x.id
) b
ON b.rank= a.rank -1;
+----+-------+
| id | units |
+----+-------+
| 1 | 101 |
| 2 | 10 |
| 3 | 10 |
| 4 | 19 |
| 5 | 10 |
| 6 | 5 |
| 7 | 15 |
| 8 | 10 |
| 9 | 5 |
| 10 | 15 |
+----+-------+
This should give the desired result. I don't know what your table is called, so I named it "tbltest".
Naming a column date is generally a bad idea as it also refers to other things (functions, data types,...), so I renamed it "fdate". Using uppercase characters in field names or table names is also a bad idea as it makes your statements less database independent (some databases are case sensitive and some are not).
SELECT
A.fdate,
A.units - coalesce(B.units, 0) AS units
FROM
tbltest A left join tbltest B ON A.fdate = B.fdate + INTERVAL 1 DAY

Retrieve detail rows of a group based on grand total

I have a table that looks like this one :
+------+------+------------------+
| item | val | timestamp |
+------+------+------------------+
| 1 | 3.66 | 16-05-2011 09:17 |
| 1 | 2.56 | 16-05-2011 09:47 |
| 2 | 4.23 | 16-05-2011 09:37 |
| 3 | 6.89 | 16-05-2011 11:26 |
| 3 | 1.12 | 16-05-2011 12:11 |
| 3 | 4.56 | 16-05-2011 13:23 |
| 4 | 1.10 | 16-05-2011 14:11 |
| 4 | 9.79 | 16-05-2011 14:23 |
| 5 | 1.58 | 16-05-2011 15:27 |
| 5 | 0.80 | 16-05-2011 15:29 |
| 6 | 3.80 | 16-05-2011 15:29 |
+------+------+------------------+
So the grand total of all items for the day 16 May 2011 is 40.09.
Now I want to retrieve which items of this list make up 80% of the grand total.
Let me make an example :
Grand Total : 40.09
80% of the Grand Total : 32.07
Starting from the item with the greatest percentage weight in the total amount, I want to retrieve the grouped list of items that make up 80% of the grand total:
+------+------+
| item | val |
+------+------+
| 3 | 12.57|
| 4 | 10.89|
| 1 | 6.22|
+------+------+
As you can see the elements in the result set are the elements grouped by item code and ordered from the element with greater percentage weight on the grand total descending until reaching the 80% threshold.
From item 2 onward the items are discarded from the result set because including them would exceed the 80% threshold:
12.57 + 10.89 + 6.22 + 4.23 > 32.07 (80 % of the grand total )
This is not homework; this is a real context where I am stuck, and I need to achieve the result with a single query ...
The query should run unmodified or with few changes on MySQL, SQL Server, PostgreSQL .
You can do this with a single query:
WITH Total_Sum(overallTotal) as (SELECT SUM(val)
FROM dataTable),
Summed_Items(id, total) as (SELECT id, SUM(val)
FROM dataTable
GROUP BY id),
Ordered_Sums(id, total, ord) as (SELECT id, total,
ROW_NUMBER() OVER(ORDER BY total DESC)
FROM Summed_Items),
Percent_List(id, itemTotal, ord, overallTotal) as (
SELECT id, total, ord, total
FROM Ordered_Sums
WHERE ord = 1
UNION ALL
SELECT b.id, b.total, b.ord, b.total + a.overallTotal
FROM Percent_List as a
JOIN Ordered_Sums as b
ON b.ord = a.ord + 1
JOIN Total_Sum as c
ON (c.overallTotal * .8) > (a.overallTotal + b.total))
SELECT id, itemTotal
FROM Percent_List
Which will yield the following:
id itemTotal
3 12.57
4 10.89
1 6.22
Please note that this will not work in MySQL before version 8.0 (no CTEs), and will require a more recent version of PostgreSQL to work (otherwise OLAP functions are not supported); both of those also expect the keyword WITH RECURSIVE for a self-referencing CTE such as Percent_List. SQL Server should be able to run the statement as-is (I think; this was written and tested on DB2). Otherwise, you may attempt to translate this into correlated table joins, etc., but it will not be pretty, if it's even possible (a stored procedure or re-assembly in a higher-level language may then be your only option).
I don't know of any way this can be done with a single query; you'll probably have to create a stored procedure. The steps of the proc would be something like this:
Calculate the grand total for that day by using a SUM
Get the individual records for that day ordered by val DESC
Keep a running total as you loop through the individual records; as long as the running total is < 0.8 * grandtotal, add the current record to your list
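The steps above can be sketched in Python to show the loop a stored procedure would run. This is an illustration under a few assumptions: values are summed per item first (as the question's expected output does), Decimal is used to avoid floating-point drift, and the function name items_within_share and its share parameter are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


def items_within_share(rows, share=Decimal("0.8")):
    """Return (item, total) pairs, largest first, while the running total
    stays below share * grand total."""
    totals = defaultdict(Decimal)  # Decimal() is 0
    for item, val in rows:
        totals[item] += val

    threshold = sum(totals.values()) * share
    picked = []
    running = Decimal(0)
    for item, total in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        if running + total >= threshold:
            break  # adding this item would reach or pass the threshold
        running += total
        picked.append((item, total))
    return picked


rows = [
    (1, Decimal("3.66")), (1, Decimal("2.56")), (2, Decimal("4.23")),
    (3, Decimal("6.89")), (3, Decimal("1.12")), (3, Decimal("4.56")),
    (4, Decimal("1.10")), (4, Decimal("9.79")), (5, Decimal("1.58")),
    (5, Decimal("0.80")), (6, Decimal("3.80")),
]
print(items_within_share(rows))
```

The strict comparison mirrors the join condition in the CTE answer, where a row is added only while 80% of the grand total is greater than the running total plus the next item's total.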