I have a table with the following columns -
ID, Year, Month, Sales
The data is in long format. So, for 10 unique IDs and 5 years of data I will have 10(IDs) * 5(Years) * 12(Months) = 600 rows
I want to extract the following information -
Find the Year and Month in which sales were highest for each ID
Find the Year in which sales were highest for each ID
What should the SQL query be? I use MySQL 5.6.
Since you are using MySQL 5.6, window functions are not available. You can use a correlated subquery in the WHERE clause to get the desired result:
create table yourtable (ID int, Year int, Month int, Sales int);
insert into yourtable values(1,2020,1,119);
insert into yourtable values(2,2020,1,105);
insert into yourtable values(1,2020,2,110);
insert into yourtable values(1,2021,1,120);
Query#1
select id, year, month, sales from yourtable a
where sales= (select max(sales) from yourtable b where a.id=b.id)
Output:
id year month sales
2 2020 1 105
1 2021 1 120
Query#2:
select id, year, sum(sales) from yourtable a
group by id,year
having sum(sales)=(
select sum(sales) from yourtable b where a.id=b.id
group by id,year
order by sum(sales)
desc limit 1
)
Output:
id year sum(sales)
1 2020 229
2 2020 105
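To double-check the two outputs above, here is a small sketch that runs both queries against the same sample rows. It uses SQLite through Python's built-in sqlite3 module as a stand-in for MySQL 5.6 (an assumption on my part; neither query needs window functions, so the SQL is unchanged apart from an added ORDER BY to make the row order predictable).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table yourtable (ID int, Year int, Month int, Sales int);
    insert into yourtable values (1, 2020, 1, 119);
    insert into yourtable values (2, 2020, 1, 105);
    insert into yourtable values (1, 2020, 2, 110);
    insert into yourtable values (1, 2021, 1, 120);
""")

# Query#1: the row(s) holding each ID's highest monthly sales.
best_month = conn.execute("""
    select id, year, month, sales from yourtable a
    where sales = (select max(sales) from yourtable b where a.id = b.id)
    order by id
""").fetchall()

# Query#2: the year(s) whose yearly total is the highest for each ID.
best_year = conn.execute("""
    select id, year, sum(sales) from yourtable a
    group by id, year
    having sum(sales) = (
        select sum(sales) from yourtable b where a.id = b.id
        group by id, year
        order by sum(sales) desc limit 1
    )
    order by id
""").fetchall()

print(best_month)
print(best_year)
```

Note that both queries return more than one row for an ID when there is a tie for the maximum.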
I have 2 tables in MySQL. I want to count the number of OrderIds per month for each customer. If a customer has no orders in a month, I would like the count to be 0.
Customer Table
CustomerID
1
2
3
Order Table
OrderId CustomerID Date
1 1 2022-01-02
2 1 2022-01-04
3 2 2022-02-03
4 2 2022-03-03
Expected results
CustomerID Date CountOrderID
1 2022-01 2
2 2022-01 1
3 2022-01 0
1 2022-02 0
2 2022-02 1
3 2022-02 0
1 2022-03 0
2 2022-03 1
3 2022-03 0
How can I do this in MySQL?
SELECT customer.CustomerID,
`year_month`.y_m AS `Date`,
COUNT(`order`.OrderId) AS CountOrderID
FROM customer
CROSS JOIN (
SELECT DISTINCT DATE_FORMAT(`date`, '%Y-%m') AS y_m
FROM `order`
) AS `year_month`
LEFT JOIN `order` ON `order`.CustomerID = customer.CustomerID
AND DATE_FORMAT(`order`.`date`, '%Y-%m') = `year_month`.y_m
GROUP BY 1, 2;
If the order table contains no rows for some year and month, that month will not appear in the output. If you need it, generate a calendar table and use it in place of the year_month subquery.
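A rough sketch of the calendar-table idea, using SQLite through Python's sqlite3 module rather than MySQL (so strftime stands in for DATE_FORMAT). The month_range helper, the calendar table name and the January to April range are my own assumptions for illustration; the customer and order rows are the ones from the question. April has no orders at all, yet it still shows up with 0 for every customer.

```python
import sqlite3
from datetime import date

def month_range(start, end):
    """Return every 'YYYY-MM' string from start's month through end's month."""
    months = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        months.append(f"{y:04d}-{m:02d}")
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return months

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table customer (CustomerID int);
    insert into customer values (1), (2), (3);
    create table "order" (OrderId int, CustomerID int, "Date" text);
    insert into "order" values
        (1, 1, '2022-01-02'), (2, 1, '2022-01-04'),
        (3, 2, '2022-02-03'), (4, 2, '2022-03-03');
    create table calendar (y_m text);
""")

# Fill the calendar independently of the order data (hypothetical range).
conn.executemany(
    "insert into calendar values (?)",
    [(ym,) for ym in month_range(date(2022, 1, 1), date(2022, 4, 1))],
)

rows = conn.execute("""
    select c.CustomerID, cal.y_m, count(o.OrderId)
    from customer c
    cross join calendar cal
    left join "order" o
           on o.CustomerID = c.CustomerID
          and strftime('%Y-%m', o."Date") = cal.y_m
    group by c.CustomerID, cal.y_m
    order by cal.y_m, c.CustomerID
""").fetchall()

for row in rows:
    print(row)
```

In MySQL the calendar could just as well be a permanent table filled once; the join itself stays the same.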
You can reduce the number of CTEs; I added more here to explain the steps (CTEs require MySQL 8.0 or later):
First, you need the date in year-month format; for that I used the DATE_FORMAT() function.
Since you need every combination of year-month and customer, you need a cross join. This produces every pair of distinct year-month and customer ID.
Once you have a table with all the combinations, you attach the actual data with a left join. This produces NULL where there are no matching rows, and COUNT() skips NULLs, so those pairs are counted as 0.
The last step is simply the COUNT() function.
with main as (
select distinct DATE_FORMAT(`date`,'%Y-%m') as `year_month` from `order`
),
calendar as (
select * from customer
cross join main
),
joining_all as (
select
calendar.*,
`order`.OrderId
from calendar
left join `order`
on calendar.CustomerID = `order`.CustomerID
and calendar.`year_month` = DATE_FORMAT(`order`.`date`,'%Y-%m')
)
select
CustomerID,
`year_month` as `Date`,
count(OrderId) as CountOrderID
from joining_all
group by 1,2
The shorter version below may also work. If it runs into a syntax error, use the one above.
with main as (
select distinct customer.CustomerID, DATE_FORMAT(`order`.`date`,'%Y-%m') as `year_month`
from `order`
cross join customer
)
select
main.CustomerID,
main.`year_month` as `Date`,
count(`order`.OrderId) as CountOrderID
from main
left join `order`
on main.CustomerID = `order`.CustomerID
and main.`year_month` = DATE_FORMAT(`order`.`date`,'%Y-%m')
group by 1,2
I am having trouble coming up with a query to get a list of customer ids and the date of their 20th purchase.
I am given a table called transactions with the column name customer_id and purchase_date. Each row in the table is equal to one transaction.
customer_id purchase_date
1 2020-11-19
2 2022-01-01
3 2021-12-05
3 2021-12-09
3 2021-12-16
I tried to do it like this and assumed I would have to count the number of times the customer_id has been mentioned and return the id number if the count equals 20.
SELECT customer_id, MAX(purchase_date)
FROM transactions
(
SELECT customer_id,
FROM transactions
GROUP BY customer_id
HAVING COUNT (customer_id) =20
)
How can I get this to return the list of customer_id and only the date of the 20th transaction?
You need to number each customer's transactions in date order and keep only the 20th row:
SELECT * FROM (
SELECT customer_id, purchase_date, ROW_NUMBER() OVER(
PARTITION BY customer_id
ORDER BY purchase_date
) AS nth
FROM transactions
) as t WHERE nth = 20
My solution:
select *
from transactions t
inner join (
select
customer_id,
purchase_date,
row_number() over (partition by customer_id order by purchase_date) R
from transactions) x on x.purchase_date=t.purchase_date
and x.customer_id=t.customer_id
where x.R=20;
For MySQL 5.7:
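set @r := 0, @c := null;
select *
from transactions t
inner join (
   select
      customer_id,
      purchase_date,
      -- restart the counter whenever customer_id changes
      @r := if(@c = customer_id, @r + 1, 1) R,
      @c := customer_id c
   from transactions
   order by customer_id, purchase_date) x on x.purchase_date=t.purchase_date
   and x.customer_id=t.customer_id
where x.R=20;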
Use row_number = 20
SELECT
customer_id,
purchase_date as date_t_20
FROM
(
SELECT
customer_id,
purchase_date,
Row_number() OVER (
PARTITION BY customer_id
ORDER BY purchase_date) AS rn
FROM transactions
) T
WHERE rn = 20;
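If window functions are not available at all, a different technique also works: a correlated subquery that skips the first 19 purchases of each customer with LIMIT/OFFSET. Below is a minimal sketch of that alternative, run on SQLite through Python's sqlite3 module. The sample rows are made up for the example (customer 1 has 25 purchases, customer 2 only 5); the column alias date_t_20 is borrowed from the answer above.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("create table transactions (customer_id int, purchase_date text)")

# Made-up data: customer 1 buys on 25 consecutive days, customer 2 on 5.
start = date(2021, 1, 1)
sample = [(1, (start + timedelta(days=i)).isoformat()) for i in range(25)]
sample += [(2, (start + timedelta(days=i)).isoformat()) for i in range(5)]
conn.executemany("insert into transactions values (?, ?)", sample)

rows = conn.execute("""
    select c.customer_id,
           (select t.purchase_date
              from transactions t
             where t.customer_id = c.customer_id
             order by t.purchase_date
             limit 1 offset 19) as date_t_20
      from (select distinct customer_id from transactions) c
     order by c.customer_id
""").fetchall()

# Customers with fewer than 20 purchases come back as NULL (None); drop them.
twentieth = [r for r in rows if r[1] is not None]
print(twentieth)
```

Like ROW_NUMBER(), this picks an arbitrary row among purchases that share the same purchase_date, so a tie-breaking column in the ORDER BY helps if that matters.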
I have a table with the fields customer, date and amount.
I want to sum the amount grouped by customer, excluding each customer's last two amounts by date.
sample data
customer date amount
a 2020-10-1 100
a 2020-10-2 150
a 2020-10-3 30
a 2020-10-4 20
b 2020-10-1 1
b 2020-10-5 13
b 2020-10-7 50
b 2020-10-9 18
desired result
Customer Amount
A 150
B 14
something like
select Customer ,
SUM(amount- last 2 amount)
From TableA
Group By Customer
One option uses window functions, available in MySQL 8.0:
select customer, sum(amount) total_amount
from (
select a.*, row_number() over(partition by customer order by date desc) rn
from tablea a
) a
where rn > 2
group by customer
In earlier versions, an alternative uses a correlated subquery that returns the third latest date per customer for filtering:
select customer, sum(amount) total_amount
from tablea a
where date <= (select a1.date from tablea a1 where a1.customer = a.customer order by a1.date desc limit 2, 1)
group by customer
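As a quick check of the correlated-subquery version, here is a sketch that runs it on the sample data from the question. It uses SQLite through Python's sqlite3 module as a stand-in (SQLite accepts the same `limit 2, 1` form), and I wrote the dates with zero-padded days so that they compare correctly as text. On this data it returns 250 for a (100 + 150) and 14 for b (1 + 13); note that 250 differs from the 150 listed for A in the question's desired result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table tablea (customer text, date text, amount int);
    insert into tablea values
        ('a', '2020-10-01', 100), ('a', '2020-10-02', 150),
        ('a', '2020-10-03', 30),  ('a', '2020-10-04', 20),
        ('b', '2020-10-01', 1),   ('b', '2020-10-05', 13),
        ('b', '2020-10-07', 50),  ('b', '2020-10-09', 18);
""")

# Keep only rows dated on or before each customer's third-latest date,
# which drops the two most recent rows, then sum what is left.
totals = conn.execute("""
    select customer, sum(amount) total_amount
    from tablea a
    where date <= (select a1.date from tablea a1
                   where a1.customer = a.customer
                   order by a1.date desc limit 2, 1)
    group by customer
    order by customer
""").fetchall()

print(totals)
```

A customer with only one or two rows disappears from the result, because the subquery returns NULL for them.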
I have a table with following format -
Customer_id Purchase_date
c1 2015-01-11
c2 2015-02-12
c3 2015-11-12
c1 2016-01-01
c2 2016-12-29
c4 2016-11-28
c4 2015-03-15
... ...
The table essentially contains customer_id with their purchase_date. The customer_id is repetitive based on the purchase made on purchase_date. The above is just a sample data and the table contains about 100,000 records.
Is there a way to partition the customers based on pre-defined categories?
Category Partitioning
- Category-1: Customer who has not made a purchase in the last 10 weeks, but made a purchase before that
- Category-2: Customer who has not made a purchase in the last 5 weeks, but made a purchase before that
- Category-3: Customer who has made one or more purchases in the last 4 weeks, or it has been 8 weeks since the first purchase
- Category-4: Customer who has made only one purchase in the last 1 week
- Category-5: Customer who has made only one purchase
What I'm looking for is a query that tells customer and their category -
Customer_id Category
C1 Category-1
... ...
The query can adhere to - oracle, postgres, sqlserver
From your question it seems that a customer can fall into multiple categories. So let's find the customers in each category and then take the UNION of the results.
SELECT DISTINCT Customer_Id, 'CATEGORY-1' AS Category FROM mytable
GROUP BY Customer_Id
HAVING DATEDIFF(ww,MAX(Purchase_date),GETDATE()) > 10
UNION
SELECT DISTINCT Customer_Id, 'CATEGORY-2' AS Category FROM mytable
GROUP BY Customer_Id
HAVING DATEDIFF(ww,MAX(Purchase_date),GETDATE()) > 5
UNION
SELECT DISTINCT Customer_Id, 'CATEGORY-3' AS Category FROM mytable
GROUP BY Customer_Id
HAVING DATEDIFF(ww,MAX(Purchase_date),GETDATE()) < 4
OR DATEDIFF(ww,MIN(Purchase_date),GETDATE()) = 8
UNION
SELECT DISTINCT Customer_Id, 'CATEGORY-4' AS Category FROM mytable
WHERE DATEDIFF(ww,Purchase_date,GETDATE()) <= 1
GROUP BY Customer_Id
HAVING COUNT(*) = 1
UNION
SELECT DISTINCT Customer_Id, 'CATEGORY-5' AS Category FROM mytable
GROUP BY Customer_Id
HAVING COUNT(*) = 1
ORDER BY Category
Hope this serves your purpose.
Thanks
You can use something like this:
with myTab as (
SELECT Customer_id ,MIN(Purchase_date) AS Min_Purchase_date,MAX(Purchase_date) AS Max_Purchase_date
, SUM(CASE WHEN Purchase_date >= DATEADD(WEEK,-1,GETDATE()) THEN 1 ELSE 0 END ) AS Count_LastWeek
, COUNT(*) AS Count_All
FROM Purchases_Table
GROUP BY Customer_id
)
SELECT Customer_id
, CASE WHEN Max_Purchase_date < DATEADD(WEEK,-10,GETDATE()) THEN 'Category-1'
WHEN Max_Purchase_date < DATEADD(WEEK,-5,GETDATE()) THEN 'Category-2'
WHEN Max_Purchase_date >= DATEADD(WEEK,-4,GETDATE())
OR DATEDIFF(WEEK, Min_Purchase_date,Max_Purchase_date) >= 8 THEN 'Category-3'
WHEN Count_LastWeek = 1 THEN 'Category-4'
WHEN Count_All = 1 THEN 'Category-5'
ELSE 'No Category'
END AS Category
FROM myTab
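Because a CASE expression returns the first branch that matches, each customer ends up in exactly one category here, and the order of the branches decides which one. The sketch below mirrors that precedence in plain Python so it can be tried with a fixed "today". The categorize function is a hypothetical helper, not part of the answer, and it approximates DATEDIFF(WEEK, ...) with a whole-week difference, whereas SQL Server counts week boundaries crossed.

```python
from datetime import date, timedelta

def categorize(purchases, today):
    """Mirror the CASE branches above, in the same order (first match wins)."""
    first, last = min(purchases), max(purchases)
    count_last_week = sum(1 for d in purchases if d >= today - timedelta(weeks=1))
    if last < today - timedelta(weeks=10):
        return "Category-1"
    if last < today - timedelta(weeks=5):
        return "Category-2"
    # Approximation: whole weeks between the first and the last purchase.
    if last >= today - timedelta(weeks=4) or (last - first).days // 7 >= 8:
        return "Category-3"
    if count_last_week == 1:
        return "Category-4"
    if len(purchases) == 1:
        return "Category-5"
    return "No Category"

today = date(2022, 6, 1)  # fixed reference date so the result is repeatable
print(categorize([date(2022, 1, 1)], today))
print(categorize([date(2022, 4, 20)], today))
print(categorize([date(2022, 5, 30)], today))
print(categorize([date(2022, 5, 1)], today))
```

With this ordering a single purchase in the last week lands in Category-3, since the 4-week test is checked before the Category-4 branch; reorder the branches if the more specific categories should win.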
Currently trying to create a query that shows how many accounts have paid month on month but on a cumulative basis (penetration). So as an example I have a table with Month paid and account number, which shows what month that account paid.
Month | AccountNo
Jan-14 | 123456
Feb-14 | 321654
So using the above the result set would show
Month | Payers
Jan-14 | 1
Feb-14 | 2
This is because one account paid in Jan and one in Feb, meaning that by the end of Feb there have been 2 payments overall, but only one in Jan. I tried a few inner joins back onto the table itself with t1.Month >= t2.Month, as I would for a normal cumulative query, but the result is always wrong.
Any questions please ask, unsure if the above will be clear to anyone but me.
If you have a date column in the table, you can try the following query.
SELECT [Month]
,(SELECT COUNT(AccountNo)
FROM theTable i
-- Count everything up to the last second of the current row's month.
WHERE i.[Date] <= DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,o.[Date])+1,0))) AS CumulativeCount
FROM theTable o
OK, several things. You need an actual date field, as you can't order by the month column you have.
You need to consider that there may be gaps in the months, i.e. some months with no payments (not sure whether that applies here).
I'd recommend a recursive common table expression to do the actual aggregation.
Here's how it works out:
-- setup
DECLARE @t TABLE ([Month] NCHAR(6), AccountNo INT)
INSERT @t ( [Month], AccountNo )
VALUES ( 'Jan-14',123456),('Feb-14',456789),('Apr-14',567890)
-- assume no payments in march
; WITH
t2 AS -- get a date column we can sort on
(
SELECT [Month],
CONVERT(DATETIME, '01 ' + REPLACE([Month], '-',' '), 6) AS MonthStart,
AccountNo
FROM @t
),
),
t3 AS -- group by to get the number of payments in each month
(
SELECT [Month], MonthStart, COUNT(1) AS PaymentCount FROM t2
GROUP BY t2.[Month], t2.MonthStart
),
t4 AS -- get a row number column to order by (accounting for gaps)
(
SELECT [Month], MonthStart, PaymentCount,
ROW_NUMBER() OVER (ORDER BY MonthStart) AS rn FROM t3
),
t5 AS -- recursive common table expression to aggregate subsequent rows
(
SELECT [Month], MonthStart, PaymentCount AS CumulativePaymentCount, rn
FROM t4 WHERE rn = 1
UNION ALL
SELECT t4.[Month], t4.MonthStart,
t4.PaymentCount + t5.CumulativePaymentCount AS CumulativePaymentCount, t4.rn
FROM t5 JOIN t4 ON t5.rn + 1 = t4.rn
)
SELECT [Month], CumulativePaymentCount FROM t5 -- select desired results
and the results...
Month CumulativePaymentCount
Jan-14 1
Feb-14 2
Apr-14 3
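The same steps (turn the month text into something sortable, count per month, then carry a running total forward) can be sketched outside SQL as well. This is only an illustration in Python using the three sample rows above; it assumes English month abbreviations, since strptime's %b depends on the locale.

```python
from collections import Counter
from datetime import datetime
from itertools import accumulate

payments = [("Jan-14", 123456), ("Feb-14", 456789), ("Apr-14", 567890)]

# Step 1 and 2: parse 'Mon-yy' into a real date and count payments per month.
per_month = Counter(datetime.strptime(month, "%b-%y") for month, _ in payments)

# Step 3: walk the months in date order and keep a running total.
months = sorted(per_month)
running = accumulate(per_month[m] for m in months)
cumulative = [(m.strftime("%b-%y"), total) for m, total in zip(months, running)]

for month, total in cumulative:
    print(month, total)
```

Months without payments (March here) are simply absent, which matches the result set above.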
If your month column is a date type, it is easy to work with; otherwise you need an additional conversion. Here is the query:
create table example (
MONTHS datetime,
AccountNo INT
)
GO
insert into example values ('01/Jan/2009',300345)
insert into example values ('01/Feb/2009',300346)
insert into example values ('01/Feb/2009',300347)
insert into example values ('01/Mar/2009',300348)
insert into example values ('01/Feb/2009',300349)
insert into example values ('01/Mar/2009',300350)
SELECT distinct datepart (m,months),
(SELECT count(accountno)
FROM example b
WHERE datepart (m,b.MONTHS) <= datepart (m,a.MONTHS)) AS Total FROM example a