MySQL GROUP BY query with average calculation

id  originator  revenue  date
--  ----------  -------  ----------
1   acme        1        2013-09-15
2   acme        0        2013-09-15
3   acme        4        2013-09-14
4   acme        6        2013-09-13
5   acme        -6       2013-09-13
6   hello       1        2013-09-15
7   hello       0        2013-09-14
8   hello       2        2013-09-13
9   hello       5        2013-09-14
I have the table above, and I would like to add a ranking column based on the total revenue each originator generated over the last 3 days.
The fields should be displayed as below:
originator  revenue  toprank
----------  -------  -------
hello       8        1
acme        5        2
2) Based on the data above, I would also like to calculate the average revenue using the following rule:
If the total revenue for a given date sums to 0 (zero), that date should not be counted when calculating the average.
a) The average for originator acme should be the sum of revenue / count (number of dates where the revenue is non-zero), so (4+1)/2, i.e. 2.5
b) The average for originator hello should be the sum of revenue / count (number of dates where the revenue is non-zero), so (5+2+1)/3, i.e. 2.6666
originator  revenue  toprank  avg(3 days)
----------  -------  -------  -----------
hello       8        1        2.6666
acme        5        2        2.5

To ignore a row when averaging, give AVG a NULL value. The NULLIF function is good for this.
Ranking is problematic in MySQL before version 8.0, which doesn't support the analytic functions that make this a bit easier in Oracle, SQL Server, Teradata, etc. The most common workaround is a counter variable. That requires an ordered set of rows, which means the total revenue must be calculated in an inner query.
SELECT originator, TotalRev, Avg3Days, @rnk := @rnk + 1 AS TopRank
FROM (
  SELECT
    originator,
    SUM(revenue) AS TotalRev,
    AVG(NULLIF(revenue, 0)) AS Avg3Days
  FROM myTable
  GROUP BY originator
  ORDER BY TotalRev DESC
) Tots, (SELECT @rnk := 0) Ranks
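Note that AVG(NULLIF(revenue, 0)) skips individual rows whose revenue is 0, whereas the question asks to skip dates whose total is 0 (acme's 6 and -6 on 2013-09-13 cancel out). One way to get the expected 2.5 and 2.6666 is to sum per originator and date first and average those daily totals. The following is a minimal sketch of that idea, run against an in-memory SQLite database as a stand-in for MySQL. The table name revenue_log and the column alias avg_3_days are assumptions, RANK() needs MySQL 8.0+ (or SQLite 3.25+), and the 3-day date filter is left out because every sample row already falls inside the window.

```python
import sqlite3

# Sample rows from the question: (id, originator, revenue, date).
ROWS = [
    (1, "acme", 1, "2013-09-15"),
    (2, "acme", 0, "2013-09-15"),
    (3, "acme", 4, "2013-09-14"),
    (4, "acme", 6, "2013-09-13"),
    (5, "acme", -6, "2013-09-13"),
    (6, "hello", 1, "2013-09-15"),
    (7, "hello", 0, "2013-09-14"),
    (8, "hello", 2, "2013-09-13"),
    (9, "hello", 5, "2013-09-14"),
]

QUERY = """
WITH daily AS (
    -- one row per originator per day
    SELECT originator, date, SUM(revenue) AS day_total
    FROM revenue_log
    GROUP BY originator, date
),
totals AS (
    -- days whose total is 0 become NULL and are ignored by AVG
    SELECT originator,
           SUM(day_total)            AS revenue,
           AVG(NULLIF(day_total, 0)) AS avg_3_days
    FROM daily
    GROUP BY originator
)
SELECT originator, revenue,
       RANK() OVER (ORDER BY revenue DESC) AS toprank,
       avg_3_days
FROM totals
ORDER BY toprank
"""


def ranked_averages(rows):
    """Load the rows into a throwaway table and run the ranking query."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE revenue_log "
        "(id INTEGER, originator TEXT, revenue INTEGER, date TEXT)"
    )
    con.executemany("INSERT INTO revenue_log VALUES (?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


result = ranked_averages(ROWS)
for originator, revenue, toprank, avg_3_days in result:
    print(originator, revenue, toprank, round(avg_3_days, 4))
```

On the sample data this should reproduce the table requested in the question: hello first with total 8, acme second with total 5.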

If you want to get the values for the last 3 days from today's date, try something like this:
SET @rank=0;
-- rank is a reserved word in MySQL 8.0+, hence the backticks
select originator, rev, @rank:=@rank+1 AS `rank`, avg
FROM
  (select originator, sum(revenue) as rev,
          AVG(NULLIF(revenue, 0)) as avg
   FROM t1
   WHERE date >= DATE_ADD(CURDATE(), INTERVAL -3 DAY)
   group by originator
   order by 2 desc) as t2;
EDITED:
If you want to get the values for the last 3 days from the nearest date, try this:
SET @rank=0;
-- rank is a reserved word in MySQL 8.0+, hence the backticks
select originator, rev, @rank:=@rank+1 AS `rank`, avg
from
  (select originator, sum(revenue) as rev,
          AVG(NULLIF(revenue, 0)) as avg
   from t1
   WHERE date >= DATE_ADD((select max(date) from t1), INTERVAL -3 DAY)
   group by originator
   order by 2 desc) as t2;

Related

Get all transaction details of a user for their 2nd month of transactions

I am trying to get the details of the 2nd transaction month for all customers.
Date        User_id  amount
2021-11-01  1        100
2021-11-21  1        200
2021-12-20  2        110
2022-01-20  2        200
2022-02-04  1        50
2022-02-21  1        100
2022-03-22  2        200
For every customer, get all the records in the month of their 2nd transaction (a user can have multiple transactions in a month, or even on a single day).
Expected Output
Date        User_id  amount
2022-02-04  1        50
2022-02-21  1        100
2022-01-20  2        200
You can use dense_rank:
select Date, User_id, amount from
(select *, dense_rank() over(partition by User_id order by year(Date), month(date)) r
from table_name) t
where r = 2;
If dense_rank is an option you can:
with cte1 as (
select *, extract(year_month from date) as yyyymm
from t
), cte2 as (
select *, dense_rank() over (partition by user_id order by yyyymm) as dr
from cte1
)
select *
from cte2
where dr = 2
Note that it is possible to write the above using one cte.
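To see the DENSE_RANK approach produce the expected output, here is a small sketch that runs it on the question's sample rows in an in-memory SQLite database. It is a stand-in for MySQL, not the answerers' exact queries: SQLite has no YEAR()/MONTH()/EXTRACT, so the month key is built with strftime('%Y-%m', date) instead, and the table name transactions is an assumption. Window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question: (date, user_id, amount).
ROWS = [
    ("2021-11-01", 1, 100),
    ("2021-11-21", 1, 200),
    ("2021-12-20", 2, 110),
    ("2022-01-20", 2, 200),
    ("2022-02-04", 1, 50),
    ("2022-02-21", 1, 100),
    ("2022-03-22", 2, 200),
]

QUERY = """
SELECT date, user_id, amount
FROM (
    SELECT date, user_id, amount,
           -- all rows in the same calendar month share one rank
           DENSE_RANK() OVER (
               PARTITION BY user_id
               ORDER BY strftime('%Y-%m', date)
           ) AS month_rank
    FROM transactions
) t
WHERE month_rank = 2
ORDER BY user_id, date
"""


def second_month_rows(rows):
    """Return every transaction that falls in each user's 2nd active month."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE transactions (date TEXT, user_id INTEGER, amount INTEGER)"
    )
    con.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


second_month = second_month_rows(ROWS)
for row in second_month:
    print(row)
```

DENSE_RANK is used rather than ROW_NUMBER because several transactions in the same month must all receive the same rank.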

Display N most sold items per day

I have no idea how to write a (MySQL) query for this, although I am sure it is very simple for an experienced person.
I have a table which summarizes sold items per day, like:
date        item    quantity
----------  ------  --------
2020-01-15  apple   3
2020-01-15  pear    2
2020-01-15  potato  1
2020-01-14  orange  3
2020-01-14  apple   2
2020-01-14  potato  2
2020-01-13  lemon   5
2020-01-13  kiwi    2
2020-01-13  apple   1
I would like to query the top N sellers for every day, sorted by date DESC and then by quantity DESC. For N = 2 the result would look like:
date        item    quantity
----------  ------  --------
2020-01-15  apple   3
2020-01-15  pear    2
2020-01-14  orange  3
2020-01-14  apple   2
2020-01-13  lemon   5
2020-01-13  kiwi    2
Please tell me how I can limit the number of items returned per date.
First of all, it is not a good idea to use DATE as the name of a column.
You can use @rank := IF(@current = date, @rank + 1, 1) to number your rows within each date. The expression checks on every row whether the date has changed and, if it has, restarts the count at 1.
Select date, item, quantity
from
(
  SELECT item, date, sum(quantity) as quantity,
         @rank := IF(@current = date, @rank + 1, 1) as ranking,
         @current := date
  FROM yourtable
  GROUP BY item, date
  order by date, sum(quantity) desc
) t
where t.ranking < 3
You can do this if you are using MySQL 8.0+ (shown here for N = 2):
SELECT * FROM
(SELECT DATE, ITEM, QUANTITY, ROW_NUMBER() OVER (PARTITION BY DATE ORDER BY QUANTITY DESC) as order_rank FROM TABLE_NAME) as R
WHERE order_rank <= 2
I think you can use:
select t.*
from t
where (quantity, item) >= (select t2.quantity, t2.item
from t t2
where t2.date = t.date
order by t2.quantity desc, t2.item
limit 1 offset 1
);
The only caveat is that you need to have at least "n" items available on the day (although that condition can be added as well).
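The ROW_NUMBER approach can be tried end to end on the question's data. Below is a minimal sketch using an in-memory SQLite database as a stand-in for MySQL 8.0 (window functions need SQLite 3.25+). The table name sales is an assumption, and item is added to the window ordering as a tiebreaker, which is also an assumption: on 2020-01-14 apple and potato both sold 2, so without a tiebreaker either could take the second slot.

```python
import sqlite3

# Sample rows from the question: (date, item, quantity).
ROWS = [
    ("2020-01-15", "apple", 3),
    ("2020-01-15", "pear", 2),
    ("2020-01-15", "potato", 1),
    ("2020-01-14", "orange", 3),
    ("2020-01-14", "apple", 2),
    ("2020-01-14", "potato", 2),
    ("2020-01-13", "lemon", 5),
    ("2020-01-13", "kiwi", 2),
    ("2020-01-13", "apple", 1),
]

QUERY = """
SELECT date, item, quantity
FROM (
    SELECT date, item, quantity,
           -- number the rows of each day, best seller first;
           -- item breaks ties so the result is deterministic
           ROW_NUMBER() OVER (
               PARTITION BY date
               ORDER BY quantity DESC, item
           ) AS rn
    FROM sales
) ranked
WHERE rn <= ?
ORDER BY date DESC, quantity DESC, item
"""


def top_n_per_day(rows, n):
    """Return the n best-selling items for every date, newest date first."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE sales (date TEXT, item TEXT, quantity INTEGER)")
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY, (n,)).fetchall()
    con.close()
    return result


top_two = top_n_per_day(ROWS, 2)
for row in top_two:
    print(row)
```

Unlike the offset-based subquery, this form also works on days that have fewer than N items: those days simply return all of their rows.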

Grouping by to find average differences for specific indexes in SQL

I have the following table:
person_index score year
3 76 2003
3 86 2004
3 86 2005
3 87 2006
4 55 2005
4 91 2006
I want to group by person_index, getting the average score difference between consecutive years, such that I end up with one row per person, indicating the average increase/decrease:
person_index avg(score_diff)
3 3.67
4 36
So for person with index 3 - there were changes over 3 years, one was 10pt, one was 0, and one was 1pt. Therefore, their average score_diff is 3.67.
EDIT: to clarify, scores can also decrease. And years aren't necessarily consecutive (one person might not get a score at a certain year, so could be 2013 followed by 2015).
The simplest way is to use LAG (MySQL 8.0+):
WITH cte AS (
SELECT *, score - LAG(score) OVER(PARTITION BY person_index ORDER BY year) AS diff
FROM tab
)
SELECT person_index, AVG(diff) AS avg_diff
FROM cte
GROUP BY person_index;
Output:
+---------------+----------+
| person_index | avg_diff |
+---------------+----------+
| 3 | 3.6667 |
| 4 | 36.0000 |
+---------------+----------+
If the scores only increase -- as in your example -- you can simply do:
select person_index,
       -- n rows give n - 1 differences between consecutive rows
       ( max(score) - min(score) ) / nullif(count(*) - 1, 0)
from t
group by person_index;
If they do not only increase, it is a bit trickier because you have to calculate the first and last scores:
select p.person_index,
       (tmax.score - tmin.score) / nullif(p.cnt - 1, 0)
from (select person_index, min(year) as miny, max(year) as maxy, count(*) as cnt
      from t
      group by person_index
     ) p join
     t tmin
     on tmin.person_index = p.person_index and tmin.year = p.miny join
     t tmax
     on tmax.person_index = p.person_index and tmax.year = p.maxy;
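Both styles of answer rest on the same arithmetic: the differences between consecutive scores telescope, so their sum is the last score minus the first, and n scores yield n - 1 differences. The sketch below checks this on the question's data. It runs the LAG query in an in-memory SQLite database (a stand-in for MySQL 8.0, window functions need SQLite 3.25+) and compares it with (last - first) / (n - 1) computed in plain Python. The table name scores and the helper names are assumptions.

```python
import sqlite3

# Sample rows from the question: (person_index, score, year).
ROWS = [
    (3, 76, 2003),
    (3, 86, 2004),
    (3, 86, 2005),
    (3, 87, 2006),
    (4, 55, 2005),
    (4, 91, 2006),
]

LAG_QUERY = """
SELECT person_index, AVG(diff) AS avg_diff
FROM (
    SELECT person_index,
           -- NULL for each person's first year, so AVG ignores it
           score - LAG(score) OVER (
               PARTITION BY person_index ORDER BY year
           ) AS diff
    FROM scores
) d
GROUP BY person_index
ORDER BY person_index
"""


def lag_average(rows):
    """Average year-over-year difference per person, using LAG."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE scores (person_index INTEGER, score INTEGER, year INTEGER)")
    con.executemany("INSERT INTO scores VALUES (?, ?, ?)", rows)
    result = dict(con.execute(LAG_QUERY).fetchall())
    con.close()
    return result


def telescoped_average(rows):
    """Same quantity computed as (last score - first score) / (n - 1)."""
    by_person = {}
    for person, score, _year in sorted(rows, key=lambda r: (r[0], r[2])):
        by_person.setdefault(person, []).append(score)
    return {
        person: (scores[-1] - scores[0]) / (len(scores) - 1)
        for person, scores in by_person.items()
        if len(scores) > 1  # a single score has no difference to average
    }


lag_avg = lag_average(ROWS)
tele_avg = telescoped_average(ROWS)
for person in sorted(lag_avg):
    print(person, round(lag_avg[person], 4), round(tele_avg[person], 4))
```

Dividing by the number of rows minus one, rather than by the span of years, keeps the two results equal even when a person has gaps between years.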

How to sum up records from starting month to current per month

I've searched for this topic, but all I found were questions about grouping results by month. I need to retrieve rows grouped by month, where each row holds the cost summed from the start date through the end of that month.
Here is an example table
Date | Val
----------- | -----
2017-01-20 | 10
----------- | -----
2017-02-15 | 5
----------- | -----
2017-02-24 | 15
----------- | -----
2017-03-14 | 20
I need to get the following output (the date format does not matter):
2017-01-20 | 10
2017-02-24 | 30
2017-03-14 | 50
When I run
SELECT SUM(`val`) as `sum`, DATE(`date`) as `date` FROM table
WHERE `date` BETWEEN :startDate
AND :endDate GROUP BY year(`date`), month(`date`)
I get the sum per month, of course.
I can't think of a neat way to do this in one query. I will probably need some nested queries, but maybe you know a better solution.
Something like this should work (untested). You could also solve this with subqueries, but I guess that would be more costly. If you want to sort the result by the total value, the subquery variant might be faster.
SET @total:=0;
SELECT
  (@total := @total + q.sum) AS total, q.date
FROM
  (SELECT SUM(`val`) as `sum`, DATE(`date`) as `date` FROM table
   WHERE `date` BETWEEN :startDate
   AND :endDate GROUP BY year(`date`), month(`date`)) AS q
You can use the DATE_FORMAT function both to format the output and to group by month.
DATE_FORMAT(date,format)
Formats the date value according to the format string.
SELECT Date, @total := @total + val as total
FROM
  (select @total := 0) x,
  (select Sum(Val) as Val, DATE_FORMAT(Date, '%m-%Y') as Date
   FROM st where Date >= '2017-01-01' and Date <= '2017-12-31'
   GROUP BY DATE_FORMAT(Date, '%m-%Y')) y
;
+---------+-------+
| Date | total |
+---------+-------+
| 01-2017 | 10 |
+---------+-------+
| 02-2017 | 30 |
+---------+-------+
| 03-2017 | 50 |
+---------+-------+
You can check it here: http://rextester.com/FOQO81166
Try this.
I use yearmonth as an integer (the year of the date multiplied by 100 plus the month of the date) . If you want to re-format, your call, but integers are always a bit faster.
It's the complete scenario, including input data.
CREATE TABLE tab (
dt DATE
, qty INT
);
INSERT INTO tab(dt,qty) VALUES( '2017-01-20',10);
INSERT INTO tab(dt,qty) VALUES( '2017-02-15', 5);
INSERT INTO tab(dt,qty) VALUES( '2017-02-24',15);
INSERT INTO tab(dt,qty) VALUES( '2017-03-14',20);
SELECT
yearmonths.yearmonth
, SUM(by_month.month_qty) AS running_qty
FROM (
SELECT DISTINCT
YEAR(dt) * 100 + MONTH(dt) AS yearmonth
FROM tab
) yearmonths
INNER JOIN (
SELECT
YEAR(dt) * 100 + MONTH(dt) AS yearmonth
, SUM(qty) AS month_qty
FROM tab
GROUP BY YEAR(dt) * 100 + MONTH(dt)
) by_month
ON yearmonths.yearmonth >= by_month.yearmonth
GROUP BY yearmonths.yearmonth
ORDER BY 1;
yearmonth|running_qty
201,701| 10.0
201,702| 30.0
201,703| 50.0
select succeeded; 3 rows fetched
Need explanations?
My solution has the advantage over the others that it will be re-usable without change when you move it to a more modern database - and you can convert it to using analytic functions when you have time.
Marco the Sane
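As a rough illustration of the analytic-function conversion that the last answer alludes to, the running total can be written with SUM(...) OVER (ORDER BY ...). The sketch below runs it on the same tab data in an in-memory SQLite database, which is a stand-in and not MySQL (window functions need SQLite 3.25+). SQLite has no YEAR()/MONTH(), so the integer year-month key is built with strftime('%Y%m', dt); in MySQL 8.0+ the YEAR(dt) * 100 + MONTH(dt) expression from the answer could presumably be kept as is.

```python
import sqlite3

# Same input data as the CREATE TABLE / INSERT script above: (dt, qty).
ROWS = [
    ("2017-01-20", 10),
    ("2017-02-15", 5),
    ("2017-02-24", 15),
    ("2017-03-14", 20),
]

QUERY = """
WITH by_month AS (
    -- monthly totals keyed by an integer such as 201701
    SELECT CAST(strftime('%Y%m', dt) AS INTEGER) AS yearmonth,
           SUM(qty) AS month_qty
    FROM tab
    GROUP BY strftime('%Y%m', dt)
)
SELECT yearmonth,
       -- default window frame: everything up to and including this month
       SUM(month_qty) OVER (ORDER BY yearmonth) AS running_qty
FROM by_month
ORDER BY yearmonth
"""


def running_totals(rows):
    """Cumulative quantity per month, from the first month onward."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE tab (dt TEXT, qty INTEGER)")
    con.executemany("INSERT INTO tab VALUES (?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


running = running_totals(ROWS)
for yearmonth, running_qty in running:
    print(yearmonth, running_qty)
```

Compared with the self-join, the window version reads each monthly total once instead of pairing every month with all earlier months.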

Compute outstanding amounts in MySQL

I am having an issue with a SELECT command in MySQL. I have a database of securities exchanged daily, with maturities from 1 to 1000 days (more than 1 million rows). I would like to get the outstanding amount per day (and possibly per category). To give an example, suppose this is my initial dataset:
DATE VALUE MATURITY
1 10 3
1 15 2
2 10 1
3 5 1
I would like to get the following output
DATE OUTSTANDING_AMOUNT
1 25
2 35
3 15
Outstanding amount is calculated as the total of securities exchanged still 'alive'. That means, in day 2 there is a new exchange for 10 and two old exchanges (10 and 15) still outstanding as their maturity is longer than one day, for a total outstanding amount of 35 on day 2. In day 3 instead there is a new exchange for 5 and an old exchange from day 1 of 10. That is, 15 of outstanding amount.
Here's a more visual explanation:
Monday  Tuesday  Wednesday
10      10       10         (Day 1, Value 10, matures in 3 days)
15      15                  (Day 1, 15, 2 days)
        10                  (Day 2, 10, 1 day)
                 5          (Day 3, 5, 3 days with remainder not shown)
-------------------------------------
25      35       15         (Outstanding amount on each day)
Is there a simple way to get this result?
First, the main subquery finds the SUM of all values for the current date. The second (correlated) subquery then adds the values from earlier dates that are still outstanding according to their MATURITY.
select T1.Date, T1.SumValue +
       IFNULL((select SUM(VALUE)
               from T
               where T1.Date between T.Date+1 and T.Date+Maturity-1), 0)
FROM
(
  select Date,
         sum(Value) as SumValue
  from T
  group by Date
) T1
order by DATE
I'm not sure if this is what you are looking for; perhaps you could give more detail.
select
DATE
,sum(VALUE) as OUTSTANDING_AMOUNT
from
NameOfYourTable
group by
DATE
Order by
DATE
I hope this helps
Each date considers each row for inclusion in the summation of value
SELECT d.DATE, SUM(m.VALUE) AS OUTSTANDING_AMOUNT
FROM yourTable AS d JOIN yourtable AS m ON d.DATE >= m.MATURITY
GROUP BY d.DATE
ORDER BY d.DATE
A possible solution with a tally (numbers) table
SELECT date, SUM(value) outstanding_amount
FROM
(
SELECT date + maturity - n.n date, value, maturity
FROM table1 t JOIN
(
SELECT 1 n UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) n ON n.n <= maturity
) q
GROUP BY date
Output:
| DATE | OUTSTANDING_AMOUNT |
-----------------------------
| 1 | 25 |
| 2 | 35 |
| 3 | 15 |
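The rule from the question, that an exchange made on day D with maturity M is still outstanding on days D through D + M - 1, can also be written directly as a range join between the distinct dates and the exchanges. Here is a minimal sketch of that formulation, run in an in-memory SQLite database as a stand-in for MySQL. The table name exchanges is an assumption, and dates are plain integers as in the sample data. Unlike the tally-table version, it does not need to know the largest maturity in advance, although it only reports days on which at least one exchange took place.

```python
import sqlite3

# Sample rows from the question: (date, value, maturity).
ROWS = [
    (1, 10, 3),
    (1, 15, 2),
    (2, 10, 1),
    (3, 5, 1),
]

QUERY = """
SELECT d.date, SUM(m.value) AS outstanding_amount
FROM (SELECT DISTINCT date FROM exchanges) AS d
JOIN exchanges AS m
  -- m is alive on d.date if it started on or before that day
  -- and has not yet matured
  ON d.date BETWEEN m.date AND m.date + m.maturity - 1
GROUP BY d.date
ORDER BY d.date
"""


def outstanding_per_day(rows):
    """Total value of all exchanges still outstanding on each trading day."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE exchanges (date INTEGER, value INTEGER, maturity INTEGER)")
    con.executemany("INSERT INTO exchanges VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


outstanding = outstanding_per_day(ROWS)
for date, amount in outstanding:
    print(date, amount)
```

The DISTINCT subquery matters: joining the table to itself without it would count every exchange once per row that shares the same date.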