Calculating month over month numbers - mysql

I searched around and found solutions, but they didn't work with MySQL because they used functions from other software.
I'm trying to show month-over-month growth for the current year (starting January), though knowing how to check within the past year might come in handy in the future as well.
What the "orders" table might look like:
+-----------+-------+
| Month | Sales |
+-----------+-------+
| 1-1-2017 | 3 |
| 1-5-2017 | 9 |
| 2-16-2017 | 10 |
| 2-16-2017 | 13 |
| 3-7-2017 | 25 |
| 4-29-2017 | 22 |
+-----------+-------+
What I want the query result to look like:
+----------+-------+--------+
| Month | Sales | Growth |
+----------+-------+--------+
| January | 12 | |
| February | 23 | 91.66% |
| March | 25 | 8.69% |
| April | 22 | -12% |
+----------+-------+--------+
Is there a simple way to do this?

You can do something like this:
SELECT
    thisMonth.MonthOnly,
    SUM(thisMonth.Sales) AS ThisMonthSales,
    (SUM(thisMonth.Sales) / SUM(lastMonth.Sales) - 1) * 100 AS Growth
FROM
(
    -- Total sales per calendar month, keyed by the first day of the month
    SELECT STR_TO_DATE(DATE_FORMAT(Month, '%Y%m01'), '%Y%m%d') AS MonthOnly,
           SUM(Sales) AS Sales
    FROM orders
    GROUP BY MonthOnly
) thisMonth
LEFT OUTER JOIN
(
    -- Same totals shifted forward one month, so each month lines up with the previous month's sales
    SELECT STR_TO_DATE(DATE_FORMAT(DATE_ADD(Month, INTERVAL 1 MONTH), '%Y%m01'), '%Y%m%d') AS MonthOnly,
           SUM(Sales) AS Sales
    FROM orders
    GROUP BY MonthOnly
) lastMonth
ON thisMonth.MonthOnly = lastMonth.MonthOnly
GROUP BY thisMonth.MonthOnly
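To sanity-check the arithmetic against the sample rows, here is a minimal sketch that runs the same this-month/previous-month join from Python. It uses SQLite (through the standard sqlite3 module) as a stand-in for MySQL, so date(..., 'start of month') replaces the DATE_FORMAT/STR_TO_DATE truncation. It also assumes the Month column holds ISO dates (YYYY-MM-DD) rather than the 1-5-2017 style shown in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (Month TEXT, Sales INTEGER)")
# Sample rows from the question, rewritten as ISO dates (an assumption).
conn.executemany("INSERT INTO orders VALUES (?, ?)", [
    ("2017-01-01", 3), ("2017-01-05", 9), ("2017-02-16", 10),
    ("2017-02-16", 13), ("2017-03-07", 25), ("2017-04-29", 22),
])

query = """
WITH monthly AS (
    -- one row per calendar month, keyed by the first day of the month
    SELECT date(Month, 'start of month') AS MonthOnly, SUM(Sales) AS Sales
    FROM orders
    GROUP BY MonthOnly
)
SELECT cur.MonthOnly,
       cur.Sales,
       ROUND((cur.Sales * 1.0 / prev.Sales - 1) * 100, 2) AS Growth
FROM monthly AS cur
LEFT JOIN monthly AS prev
       ON prev.MonthOnly = date(cur.MonthOnly, '-1 month')
ORDER BY cur.MonthOnly
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # Growth is None for the first month (no previous month to compare)
```

Rounded to two decimals, February comes out at 91.67 and March at 8.7, so the 91.66% and 8.69% in the question look truncated rather than rounded.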

Related

MySQL: Tagging 1 to each unique occurrence during SELECT query

Sample table tbl_name:
| ID | Name | Month | Quarter | Year |
| 1 | A | Jan | 1 | 2019 |
| 1 | A | Feb | 1 | 2019 |
| 2 | B | May | 2 | 2019 |
| 3 | C | May | 2 | 2018 |
Hi, this is the table I extract with a SELECT query. I can find the distinct names per year using SELECT distinct name, year FROM tbl_name; but I'm trying to add a column in the SELECT query that flags (or counts) the first occurrence of each name within a year.
Expected:
| ID | Name | Month | Quarter | Year | Unique Count |
| 1 | A | Jan | 1 | 2019 | 1 |
| 1 | A | Feb | 1 | 2019 | 0 |
| 2 | B | May | 2 | 2019 | 1 |
| 3 | C | May | 2 | 2018 | 1 |
I tried splitting this into two queries, one selecting everything and the other selecting just the distinct rows, and joining them together, but that introduces duplicates. Is there a way to do this using SQL?
If you are running MySQL 8.0, you can use row_number() to flag the appearance of a name in a year:
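select
    t.*,
    (row_number() over(
        partition by name, year
        -- include a day so the date is complete: with NO_ZERO_DATE / NO_ZERO_IN_DATE
        -- enabled, a year-month without a day can parse to NULL and break the ordering
        order by str_to_date(concat(year, '-', month, '-01'), '%Y-%b-%d')
    ) = 1) unique_count
from mytable t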
Note: do consider fixing the storage strategy of your date columns. Rather than splitting the information over several columns, you would be better off with a single column of the DATE datatype. That would save you the pain of recomposing the date whenever you need it.
Demo on DB Fiddle:
ID | Name | Month | Quarter | Year | unique_count
-: | :--- | :---- | ------: | ---: | -----------:
1 | A | Feb | 1 | 2019 | 1
1 | A | Jan | 1 | 2019 | 0
2 | B | May | 2 | 2019 | 1
3 | C | May | 2 | 2018 | 1
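As a quick check of the flagging logic on the question's four rows, here is a minimal sketch run through Python's sqlite3 module, with SQLite (3.25 or later, for window functions) standing in for MySQL 8.0. SQLite has no STR_TO_DATE, so the month ordering via instr() on a calendar-ordered string is my own substitution, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_name (ID INTEGER, Name TEXT, Month TEXT, Quarter INTEGER, Year INTEGER)"
)
conn.executemany("INSERT INTO tbl_name VALUES (?, ?, ?, ?, ?)", [
    (1, "A", "Jan", 1, 2019), (1, "A", "Feb", 1, 2019),
    (2, "B", "May", 2, 2019), (3, "C", "May", 2, 2018),
])

# instr() returns the position of the abbreviation in a calendar-ordered
# string (Jan -> 1, Feb -> 4, ...), which sorts months chronologically.
query = """
SELECT ID, Name, Month, Quarter, Year,
       ROW_NUMBER() OVER (
           PARTITION BY Name, Year
           ORDER BY instr('JanFebMarAprMayJunJulAugSepOctNovDec', Month)
       ) = 1 AS unique_count
FROM tbl_name
ORDER BY ID, instr('JanFebMarAprMayJunJulAugSepOctNovDec', Month)
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # last column is 1 for the first month of each (Name, Year), else 0
```

With the months ordered chronologically, the Jan row for A is the one flagged, which matches the expected table in the question.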
You can try the logic below:
WITH your_table(ID,Name,Month,Quarter,Year)
AS
(
    SELECT 1,'A','Jan',1,2019 UNION ALL
    SELECT 1,'A','Feb',1,2019 UNION ALL
    SELECT 2,'B','May',2,2019 UNION ALL
    SELECT 3,'C','May',2,2018
)
,CTE AS
(
    -- Number the rows of each name within each year, in calendar order
    SELECT *,ROW_NUMBER() OVER(
        PARTITION BY Name, Year
        ORDER BY FIELD(Month,'Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec')
    ) RN
    FROM your_table
)
SELECT ID,Name,Month,Quarter,Year,
    CASE WHEN RN = 1 THEN 1 ELSE 0 END Unique_Count
FROM CTE
Output:
ID Name Month Quarter Year Unique_Count
1 A Jan 1 2019 1
1 A Feb 1 2019 0
2 B May 2 2019 1
3 C May 2 2018 1

Get Top 5 For Each Metric For Each Month For Each Item

I'm having trouble writing a query that rolls my daily table up into a monthly top-N summary.
I have a table with the following structure (note the headers are actually all caps):
Start_Date | Month| Item | Location | ... | Quantity | Sales |
-----------|------|------|----------|-----|----------|-------|
8/6/19 | 08 | A | USE | ... | | |
8/6/19 | 08 | B | USE | ... | | |
8/6/19 | 08 | C | USW | ... | | |
8/6/19 | 08 | D | USW | ... | | |
8/5/19 | 08 | A | USE | ... | | |
8/5/19 | 08 | B | USE | ... | | |
8/5/19 | 08 | C | USW | ... | | |
8/5/19 | 08 | D | USW | ... | | |
.....
7/1/19 | 07 | D | USW | ... | | |
Every date has the metrics above. There are 4 rows per day, one per item, which is what I want since I'm comparing by item. My goal is now to roll this up into a monthly table: for each category (Quantity, Sales, etc.), take the top 5 daily values for that month and compute their AVG.
Example: Item A
8/6/19: Quantity = 500 | Sales = 100
8/5/19: Quantity = 478 | Sales = 130
8/4/19: Quantity = 366 | Sales = 113
8/3/19: Quantity = 678 | Sales = 90
8/2/19: Quantity = 594 | Sales = 92
8/1/19: Quantity = 500 | Sales = 105
Note: There's data for the other items B, C and D respectively.
My goal was to take the top 5 for each category and present that at the monthly level:
Results:
| Month| Item | Location | ... | Quantity | Sales |
|------|------|----------|-----|----------|-------|
| 08 | A | USE | ... | 550 | 108 |
| 08 | B | USE | ... | | |
| 08 | C | USW | ... | | |
| 08 | D | USW | ... | | |
| 07 | A | USE | ... | | |
| 07 | B | USE | ... | | |
| 07 | C | USW | ... | | |
| 07 | D | USW | ... | | |
Quantity = 550 was taken from 8/1 -> 8/6 Quantity adding the top 5 (1, 2, 3, 5, 6) and dividing by 5 (AVG of top 5). Then Sales was 1, 2, 4, 5, 6.
So obviously I need to query each category (Quantity, Sales, etc.) separately and then UNION the results together. I'm just struggling with how to even get the TOP 5 of a specific category.
I've done some searching on Stack and Google for how to obtain the Top5 in a query. I see some threads that suggest you can actually use TOP(5) which doesn't work for me. LIMIT 5 only limits the results to 5, and I'm unable to use LIMIT in a subquery w/ the most recent version of SQL. Ordering and using a simple statement like "number <= 5" won't work since the months are different in the later rows of the dataset... I'm able to get the AVG for all the data for a specific month by simply just SELECTing the AVG() and GROUPing by ITEM but I want to project a top 5/10 which I've been unable to figure out.
Thanks for your assistance. I could throw out some of the queries I've tried but none are even close as I've been trying to use LIMIT.
You can use window functions:
select month, item,
       avg(case when seqnum_q <= 5 then quantity end) as quantity_top5,
       avg(case when seqnum_s <= 5 then sales end) as sales_top5
from (select t.*,
             row_number() over (partition by item, month order by quantity desc) as seqnum_q,
             row_number() over (partition by item, month order by sales desc) as seqnum_s
      from t
     ) t
group by month, item;
Note: Using month without year is discomfiting. Either year should be included in the logic, or you should be clear that the data is only for one year.
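To check this against the Item A figures in the question (550 and 108), here is a minimal sketch run through Python's sqlite3 module, with SQLite (3.25 or later) standing in for MySQL 8.0. The table layout (start_date as an ISO date, plus item, quantity, sales) is an assumption, and the grouping key is derived as year-month via strftime() so that the year is part of the logic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (start_date TEXT, item TEXT, quantity INTEGER, sales INTEGER)")
# Item A's six August days from the question, as ISO dates (an assumption).
conn.executemany("INSERT INTO t VALUES (?, 'A', ?, ?)", [
    ("2019-08-06", 500, 100), ("2019-08-05", 478, 130), ("2019-08-04", 366, 113),
    ("2019-08-03", 678, 90), ("2019-08-02", 594, 92), ("2019-08-01", 500, 105),
])

query = """
SELECT ym, item,
       AVG(CASE WHEN seqnum_q <= 5 THEN quantity END) AS quantity_top5,
       AVG(CASE WHEN seqnum_s <= 5 THEN sales END) AS sales_top5
FROM (
    -- rank each day within its item and year-month, once per metric
    SELECT strftime('%Y-%m', start_date) AS ym, item, quantity, sales,
           ROW_NUMBER() OVER (PARTITION BY item, strftime('%Y-%m', start_date)
                              ORDER BY quantity DESC) AS seqnum_q,
           ROW_NUMBER() OVER (PARTITION BY item, strftime('%Y-%m', start_date)
                              ORDER BY sales DESC) AS seqnum_s
    FROM t
) AS ranked
GROUP BY ym, item
"""
rows = conn.execute(query).fetchall()
print(rows)  # one row per (year-month, item) with the two top-5 averages
```

Because each metric gets its own row number, the two averages can drop different days (8/4 for Quantity, 8/3 for Sales) in a single pass, with no UNION needed.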

How to find elements on one column for values in other columns having no more than 3 gap in SQL

I have an sql view say emp_table which looks like the following:
+----------+----------+------+
| actor_id | movie_id | year |
+----------+----------+------+
| 2 | 280088 | 2002 |
| 2 | 396232 | 2000 |
| 3 | 376687 | 2000 |
| 4 | 336265 | 2001 |
| 5 | 135644 | 1953 |
| 6 | 12083 | 1996 |
| 7 | 252053 | 1993 |
| 7 | 402635 | 1992 |
| 7 | 409592 | 1995 |
| 8 | 101866 | 2000 |
| 9 | 336265 | 2001 |
| 10 | 12148 | 2000 |
| 11 | 80189 | 2001 |
| 12 | 12148 | 2000 |
| 13 | 80189 | 2001 |
| 14 | 70079 | 1982 |
| 15 | 12148 | 2000 |
| 16 | 242675 | 1991 |
| 17 | 105231 | 1993 |
| 17 | 242453 | 1988 |
+----------+----------+------+
... and so on. I need to find every actor_id that never had a career gap of more than 3 years. That is, if I take the distinct years in which an actor appeared in a movie and sort them, the largest difference between consecutive years must never exceed 3.
Please help me with this SQL query. I tried a self join but couldn't get any further.
All the SQL code is for MySQL only.
Note You can consider that there is only one combination of actor_id and movie_id.
Expected Result
+----------+----------+
| actor_id | max_gap |
+----------+----------+
| 2 | 2 |
| 3 | 0 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| 7 | 2 |
| . | . |
| . | . |
| . | . |
| 17 | 5 |
+----------+----------+
And so on
Note 2: Sorry for so many changes in the output. This is the final version and no more change after this.
With MySQL 8.0+ and MariaDB 10.2+ you can use the window function LEAD() to get the next year in which an actor played (or LAG() for the previous one). Then you just need to take the max difference in the outer query.
with tmp as (
select
actor_id,
year,
lead(year) over (partition by actor_id order by year) as year_lead
from emp_table e
)
select actor_id, coalesce(max(year_lead - year), 0) as max_gap
from tmp
group by actor_id
having max_gap <= 3;
Demo: https://www.db-fiddle.com/f/cWChT2TqLuRT8bW1zcM9G2/0
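For a quick local check of the LEAD() approach, here is a minimal sketch run through Python's sqlite3 module, with SQLite (3.25 or later) standing in for MySQL 8. It loads only a few of the question's rows (actors 2, 3, 7 and 17), and the HAVING clause repeats the expression instead of relying on the max_gap alias, which is a portability choice of mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp_table (actor_id INTEGER, movie_id INTEGER, year INTEGER)")
# A subset of the sample rows from the question.
conn.executemany("INSERT INTO emp_table VALUES (?, ?, ?)", [
    (2, 280088, 2002), (2, 396232, 2000), (3, 376687, 2000),
    (7, 252053, 1993), (7, 402635, 1992), (7, 409592, 1995),
    (17, 105231, 1993), (17, 242453, 1988),
])

query = """
WITH tmp AS (
    SELECT actor_id, year,
           -- next year in which the same actor appears; NULL on the last row
           LEAD(year) OVER (PARTITION BY actor_id ORDER BY year) AS year_lead
    FROM emp_table
)
SELECT actor_id, COALESCE(MAX(year_lead - year), 0) AS max_gap
FROM tmp
GROUP BY actor_id
HAVING COALESCE(MAX(year_lead - year), 0) <= 3
ORDER BY actor_id
"""
rows = conn.execute(query).fetchall()
print(rows)  # actors whose largest gap between consecutive years is at most 3
```

Actor 17 (1988, then 1993) has a gap of 5 and is filtered out by the HAVING clause, while an actor with a single movie gets a max_gap of 0 through COALESCE.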
I initially started with an anti-join approach, but then changed it upon seeing your requirement for the max gap.
The approach below begins with a subquery which itself uses a correlated subquery to compute the forward looking year gap, for every actor record and year. It then aggregates by actor and asserts that a gap greater than 3 years never occurs.
SELECT actor_id, MAX(gap) AS max_gap
FROM
(
SELECT
e1.actor_id,
ABS(e1.year - COALESCE((SELECT e2.year FROM emp_table e2
WHERE e2.actor_id = e1.actor_id AND e2.year > e1.year
ORDER BY e2.year LIMIT 1), e1.year)) AS gap
FROM emp_table e1
) t
GROUP BY
actor_id
HAVING
MAX(gap) <= 3;
Note that the call to COALESCE is necessary because of the edge case of an actor's most recent year. In that case there is no later year to compare against, so the subquery returns NULL and the gap for that row should count as 0.
A self join of the table and the group by actor_id:
select
e1.actor_id, max(coalesce(e2.year, e1.year) - e1.year) max_gap
from emp_table e1 left join emp_table e2
on
e2.actor_id = e1.actor_id
and
e2.year = (
select min(year) from emp_table where actor_id = e1.actor_id and year > e1.year
)
group by e1.actor_id
having max_gap <= 3

Displaying groups having max number of occurrences

t_loan looks like:
+-----------+----------+---------------+------------------+-----------------------+------------+--------------+------+
| pk_IdLoan | fk_IdCar | fk_IdCustomer | fk_Source_Agency | fk_Destination_Agency | RentalDate | DeliveryDate | Cost |
+-----------+----------+---------------+------------------+-----------------------+------------+--------------+------+
I wrote a query:
(SELECT fk_IdCustomer, MONTHNAME(RentalDate) AS Month, YEAR(RentalDate) As Year, COUNT(*)
FROM t_loan
GROUP BY fk_IdCustomer, Month, Year);
which results in
+---------------+-------------+------+----------+
| fk_IdCustomer | Month | Year | COUNT(*) |
+---------------+-------------+------+----------+
| 1 | July | 2016 | 3 |
| 1 | November | 2017 | 1 |
| 1 | September | 2016 | 7 |
| 5 | May | 2016 | 1 |
| 6 | January | 2016 | 1 |
| 6 | September | 2017 | 2 |
+---------------+-------------+------+----------+
Now I want to get, for each customer, the month and year with the highest COUNT(*), for example:
+---------------+-------------+------+----------+
| fk_IdCustomer | Month | Year | COUNT(*) |
+---------------+-------------+------+----------+
| 1 | September | 2016 | 7 |
| 5 | May | 2016 | 1 |
| 6 | September | 2017 | 2 |
+---------------+-------------+------+----------+
How to achieve this?
This is a bit painful in MySQL versions before 8.0, which don't support CTEs or window functions. One method is:
SELECT fk_IdCustomer, MONTHNAME(RentalDate) AS Month,
       YEAR(RentalDate) As Year, COUNT(*) as cnt
FROM t_loan l
GROUP BY fk_IdCustomer, Month, Year
HAVING cnt = (SELECT COUNT(*)
              FROM t_loan l2
              WHERE l2.fk_IdCustomer = l.fk_IdCustomer
              GROUP BY MONTHNAME(RentalDate), YEAR(RentalDate)
              ORDER BY COUNT(*) DESC
              LIMIT 1
             );
Note: If several months tie for a customer's highest count, you will get all of them.
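On MySQL 8.0 or later the same result can be had with a window function. Below is a minimal sketch of that alternative (not the method above), run through Python's sqlite3 module with SQLite (3.25 or later) as a stand-in. The loan rows are synthesized to reproduce the counts shown in the question, the day of month (15) is arbitrary, and strftime('%Y-%m', ...) replaces MONTHNAME()/YEAR() because SQLite lacks them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_loan (fk_IdCustomer INTEGER, RentalDate TEXT)")

# (customer, year-month) -> number of loans, taken from the question's output.
counts = {
    (1, "2016-07"): 3, (1, "2017-11"): 1, (1, "2016-09"): 7,
    (5, "2016-05"): 1, (6, "2016-01"): 1, (6, "2017-09"): 2,
}
loans = [(cust, f"{ym}-15") for (cust, ym), n in counts.items() for _ in range(n)]
conn.executemany("INSERT INTO t_loan VALUES (?, ?)", loans)

query = """
WITH counts AS (
    SELECT fk_IdCustomer, strftime('%Y-%m', RentalDate) AS ym, COUNT(*) AS cnt
    FROM t_loan
    GROUP BY fk_IdCustomer, ym
),
ranked AS (
    -- RANK() gives every tied month rank 1, so ties are all returned
    SELECT fk_IdCustomer, ym, cnt,
           RANK() OVER (PARTITION BY fk_IdCustomer ORDER BY cnt DESC) AS rnk
    FROM counts
)
SELECT fk_IdCustomer, ym, cnt
FROM ranked
WHERE rnk = 1
ORDER BY fk_IdCustomer
"""
rows = conn.execute(query).fetchall()
print(rows)  # busiest year-month per customer
```

Using RANK() keeps the tie behaviour of the HAVING version; ROW_NUMBER() would instead return exactly one month per customer.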

Getting average of one column in a nested SQL query

I'll try to simplify this for simplicity's sake. I need to get the average value of a returned column of a query. Make sense? I'll try to elaborate.(Sample results borrowed from another question)
| Plant_ID | Year | Quarter | MR | Range
| CCAR | 2009 | 1 | 706 | Null
| CCAR | 2009 | 2 | 626 | 0,08
| CCAR | 2009 | 2 | 637 | 0,11
| CCAR | 2009 | 2 | 737 | 0,1
| CCAR | 2009 | 1 | 552 | 0,19
| CCAR | 2009 | 4 | 418 | 0,137
| CCAR | 2009 | 1 | 503 | 0,085
| CCAR | 2009 | 2 | 645 | 0,058
| CCAR | 2009 | 4 | 743 | 0,098
| CCAR | 2009 | 3 | 556 | 0,187
| CCAR | 2009 | 1 | 298 | 0,258
| CCAR | 2009 | 2 | 339 | 0,041
| CCAR | 2010 | 1 | 381 | 0,042
I would get this result when I run a query like this
Select PlantID, Year, Quarter, MR, Range FROM TestTable WHERE PlantID in('CCAR')
I want the average MR for each quarter. Preliminarily I would try something like this.
Select Quarter, AVG(MR) FROM (Select PlantID, Year, Quarter, MR, Range FROM TestTable WHERE PlantID in ('CCAR')) GROUP BY Quarter ORDER BY Quarter
The issue is that I don't know where to nest the query to accomplish this. Any help?
Thanks!
Add an alias to the subquery, e.g. hallo:
Select Quarter, AVG(MR) FROM
(Select PlantID, Year, Quarter, MR, Range FROM TestTable WHERE PlantID in ('CCAR')) hallo
GROUP BY Quarter ORDER BY Quarter
However, the subquery is not necessary in this case; the following would be enough:
Select Quarter, AVG(MR)
FROM TestTable
WHERE PlantID in ('CCAR')
GROUP BY Quarter
ORDER BY Quarter
Just use aggregation:
select Year, Quarter, avg(MR)
from TestTable
where PlantID in ('CCAR')
group by Year, Quarter;
I presume you want the year with the quarter. If not, just remove year from the select and group by.
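The difference between grouping by Quarter alone and by Year and Quarter shows up in the sample data, since Q1 appears in both 2009 and 2010. Here is a minimal sketch through Python's sqlite3 module (SQLite as a stand-in database); the Range column is left out because the queries don't use it, and the table layout is otherwise assumed from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TestTable (PlantID TEXT, Year INTEGER, Quarter INTEGER, MR INTEGER)")
# (Year, Quarter, MR) for plant CCAR, copied from the sample rows.
sample = [
    (2009, 1, 706), (2009, 2, 626), (2009, 2, 637), (2009, 2, 737),
    (2009, 1, 552), (2009, 4, 418), (2009, 1, 503), (2009, 2, 645),
    (2009, 4, 743), (2009, 3, 556), (2009, 1, 298), (2009, 2, 339),
    (2010, 1, 381),
]
conn.executemany("INSERT INTO TestTable VALUES ('CCAR', ?, ?, ?)", sample)

# Quarter only: 2009 Q1 and 2010 Q1 are averaged together.
by_quarter = conn.execute("""
    SELECT Quarter, AVG(MR)
    FROM TestTable
    WHERE PlantID IN ('CCAR')
    GROUP BY Quarter
    ORDER BY Quarter
""").fetchall()

# Year and quarter: each year's Q1 gets its own average.
by_year_quarter = conn.execute("""
    SELECT Year, Quarter, AVG(MR)
    FROM TestTable
    WHERE PlantID IN ('CCAR')
    GROUP BY Year, Quarter
    ORDER BY Year, Quarter
""").fetchall()

print(by_quarter)
print(by_year_quarter)
```

Grouping by Quarter alone gives four rows, while adding Year gives five, because 2010 Q1 is reported separately from 2009 Q1.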