SQL Consecutive Monthly Purchases - mysql

I'm having great difficulty writing this query and cannot find any answers online which could be applied to my problem.
I have a couple of tables that look similar to the one below. Each purchase date corresponds to one item purchased.
Cust_ID    Purchase_Date
123        08/01/2022
123        08/20/2022
123        09/05/2022
123        10/08/2022
123        12/25/2022
123        01/26/2023
The result I am looking for should contain the customer's ID, the date range of the purchases, the number of consecutive months in which they made a purchase (regardless of the day of the month), and a count of how many purchases they made in that time frame. For my example, the result should look something like the below.
Cust_ID    Min Purchase Date    Max Purchase Date    Consecutive Months    No. Items Purchased
123        08/01/2022           10/08/2022           3                     4
123        12/25/2022           01/26/2023           2                     2
I have tried using CTEs with queries similar to
WITH CTE as
(
    SELECT
        PaymentDate PD,
        c.CustomerID CustID,
        DATEADD(m, -ROW_NUMBER() OVER (PARTITION BY c.CustomerID ORDER BY
            DATEPART(m, PaymentDate)), PaymentDate) as TempCol1
    FROM customers as c
    LEFT JOIN payments as p on c.customerid = p.customerid
    GROUP BY c.CustomerID, p.PaymentDate
)
SELECT
    CustID,
    MIN(PD) AS MinPaymentDate,
    MAX(PD) AS MaxPaymentDate,
    COUNT(*) as ConsecutiveMonths
FROM CTE
GROUP BY CustID, TempCol1
However, the above failed to properly count consecutive months. When the payment dates matched a month apart (e.g. 1/1/22 - 2/1/22), the query properly counts the consecutive months. However, if the dates do not match from month to month (e.g. 1/5/22 - 2/15/22), the count breaks.
Any guidance/help would be much appreciated!

This is just a small enhancement on the answer already given by ahmed. If the date range for this query spans more than a year, then year(M.Purchase_Date) + month(M.Purchase_Date) will be 2024 for both 2022-02-01 and 2023-01-01, because YEAR() and MONTH() both return integers that are simply added together. This returns an incorrect count of consecutive months. You can use CONCAT() or FORMAT() instead. Also, the count for ItemsPurchased should count a column from the right-hand side of the join rather than use COUNT(*), because it is a LEFT JOIN and COUNT(*) would count a customer without payments as one item.
WITH consecutive_months AS
(
    SELECT *,
        DATEADD(
            month,
            -DENSE_RANK() OVER (
                PARTITION BY CustomerID
                ORDER BY YEAR(PaymentDate), MONTH(PaymentDate)
            ),
            PaymentDate
        ) AS grp_date
    FROM payments
)
SELECT
    C.CustomerID AS CustID,
    MIN(M.PaymentDate) AS MinPaymentDate,
    MAX(M.PaymentDate) AS MaxPaymentDate,
    COUNT(DISTINCT FORMAT(M.PaymentDate, 'yyyyMM')) AS ConsecutiveMonths,
    COUNT(M.CustomerID) AS ItemsPurchased
FROM customers C
LEFT JOIN consecutive_months M
    ON C.CustomerID = M.CustomerID
GROUP BY C.CustomerID, YEAR(M.grp_date), MONTH(M.grp_date)
Here's a db<>fiddle
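The collision described above is easy to check outside the database. The following is a small sketch in Python; the helper names naive_key and month_index are made up for illustration and do not appear in either answer. It contrasts the summed key with a month index (year * 12 + month offset), which is one possible collision-free alternative to the string keys built with CONCAT() or FORMAT().

```python
from datetime import date


def naive_key(d: date) -> int:
    # Adding the parts loses information: different months can share a sum.
    return d.year + d.month


def month_index(d: date) -> int:
    # Months counted from year 0: every calendar month gets a unique integer,
    # and consecutive months differ by exactly 1.
    return d.year * 12 + (d.month - 1)


a, b = date(2022, 2, 1), date(2023, 1, 1)
print(naive_key(a), naive_key(b))      # the two sums are equal
print(month_index(a), month_index(b))  # distinct values, 11 months apart
```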

You need to use the dense_rank function instead of the row_number, this will give the same rank for the same months and avoid breaking the grouping column. Also, you need to aggregate for 'year-month' of the grouping date column.
with consecutive_months as
(
    select *,
        Purchase_Date - interval
            dense_rank() over (partition by Cust_ID order by year(Purchase_Date), month(Purchase_Date))
        month as grp_date
    from payments
)
select C.Cust_ID,
    min(M.Purchase_Date) as MinPurchaseDate,
    max(M.Purchase_Date) as MaxPurchaseDate,
    count(distinct year(M.Purchase_Date), month(M.Purchase_Date)) as ConsecutiveMonthsNo,
    count(M.Cust_ID) as ItemsPurchased
from customers C left join consecutive_months M
    on C.Cust_ID = M.Cust_ID
group by C.Cust_ID, year(M.grp_date), month(M.grp_date)
See demo on MySQL
You tagged your question with MySQL, but it seems that you posted SQL Server query syntax. For SQL Server, just use dateadd(month, -dense_rank() over (partition by Cust_ID order by year(Purchase_Date), month(Purchase_Date)), Purchase_Date).
See demo on SQL Server.
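For readers who want to run the idea end to end, here is a minimal sketch of the same gaps-and-islands technique using Python's built-in sqlite3 module. SQLite is only a stand-in here (an assumption, not what either answer used), and it needs SQLite 3.25 or later for window functions. Instead of subtracting the rank from the date, this variant subtracts it from a month index (month_idx, a name introduced for this sketch), which gives the same grouping for the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE payments (Cust_ID INTEGER, Purchase_Date TEXT);
    INSERT INTO payments VALUES
        (123, '2022-08-01'), (123, '2022-08-20'), (123, '2022-09-05'),
        (123, '2022-10-08'), (123, '2022-12-25'), (123, '2023-01-26');
""")

query = """
WITH indexed AS (
    -- month_idx grows by exactly 1 from one calendar month to the next
    SELECT Cust_ID, Purchase_Date,
           CAST(strftime('%Y', Purchase_Date) AS INTEGER) * 12
             + CAST(strftime('%m', Purchase_Date) AS INTEGER) AS month_idx
    FROM payments
),
grouped AS (
    -- within a run of consecutive months, month_idx and the dense rank
    -- rise together, so their difference stays constant
    SELECT *,
           month_idx - DENSE_RANK() OVER (
               PARTITION BY Cust_ID ORDER BY month_idx
           ) AS grp
    FROM indexed
)
SELECT Cust_ID,
       MIN(Purchase_Date), MAX(Purchase_Date),
       COUNT(DISTINCT month_idx), COUNT(*)
FROM grouped
GROUP BY Cust_ID, grp
ORDER BY MIN(Purchase_Date)
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this yields one row for August to October 2022 (3 months, 4 items) and one for December 2022 to January 2023 (2 months, 2 items), matching the result the question asks for.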

Related

Retrieving top company for each quarter and corresponding revenue

Company_name    Quarter    Year    Revenue
TCS             Q1         2001    50
CTS             Q2         2010    60
ZOHO            Q2         2007    70
CTS             Q4         2015    90
This is my sample table, where I store the company names, the quarter, the year, and each company's revenue for that quarter of that year.
I want to find the company with top revenue for each quarter, regardless of the year, and display its revenue too.
In the above case the resultant output should be something like this:
QUARTER    COMPANY_NAME    REVENUE
Q1         TCS             50
Q2         ZOHO            70
Q4         CTS             90
Here's what I've tried:
SELECT DISTINCT(C1.QUARTER),
C1.REVENUE
FROM COMPANY_REVENUE C1,
COMPANY_REVENUE C2
WHERE C1.REVENUE = GREATEST(C1.REVENUE, C2.REVENUE);
There are a couple of problems in your query, among which:
the DISTINCT keyword applies to whole rows, not to single fields,
the self join should be explicit and, most importantly, it requires a matching condition defined by an ON clause (e.g. SELECT ... FROM tab1 JOIN tab2 ON tab1.field = tab2.field WHERE ...).
You could probably solve your problem in another way, though.
Approach for MySQL 8.0
One way of computing values on partitions (in your case you want to partition on quarters only) is to use window functions. Here you can use ROW_NUMBER, which ranks the revenues in descending order within each partition. Since you want the highest revenue for each quarter, select the rows whose row number equals 1.
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER(
               PARTITION BY Quarter
               ORDER BY Revenue DESC
           ) AS rn
    FROM tab
)
SELECT Quarter,
       Company_name,
       Revenue
FROM cte
WHERE rn = 1
Check the demo here.
Approach for MySQL 5.7
In this case you can use an aggregate function. Since you want the maximum "Revenue" for each "Quarter", first select the maximum value for each "Quarter", then join back to your original table on two conditions:
the table's quarter matches the subquery's quarter,
the table's revenue matches the subquery's max revenue.
SELECT tab.Quarter,
       tab.Company_name,
       tab.Revenue
FROM tab
INNER JOIN (SELECT Quarter,
                   MAX(Revenue) AS Revenue
            FROM tab
            GROUP BY Quarter) max_revenues
        ON tab.Quarter = max_revenues.Quarter
       AND tab.Revenue = max_revenues.Revenue
Check the demo here.
Note: the second solution finds, for each quarter, all companies that have the maximum revenue for that quarter, so if two or more companies share the same maximum value, all of them are returned. This won't happen with the first solution, because ROW_NUMBER assigns rank 1 to exactly one row per quarter (which of the tied rows gets it is arbitrary unless the ORDER BY breaks the tie).
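The difference in tie handling can be seen with a small experiment. This is a sketch using Python's sqlite3 module as a stand-in for MySQL 8.0 (an assumption; window functions need SQLite 3.25 or later). The ACME row is hypothetical and was added only to create a tie with CTS in Q4.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab (Company_name TEXT, Quarter TEXT, Year INTEGER, Revenue INTEGER);
    INSERT INTO tab VALUES
        ('TCS',  'Q1', 2001, 50), ('CTS', 'Q2', 2010, 60),
        ('ZOHO', 'Q2', 2007, 70), ('CTS', 'Q4', 2015, 90),
        ('ACME', 'Q4', 2016, 90);  -- hypothetical row tying with CTS in Q4
""")

# Window-function approach: exactly one row per quarter.
row_number_sql = """
WITH cte AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Quarter ORDER BY Revenue DESC) AS rn
    FROM tab
)
SELECT Quarter, Company_name, Revenue FROM cte WHERE rn = 1 ORDER BY Quarter
"""

# Join-back approach: every row that reaches the quarter's maximum.
join_sql = """
SELECT tab.Quarter, tab.Company_name, tab.Revenue
FROM tab
JOIN (SELECT Quarter, MAX(Revenue) AS Revenue FROM tab GROUP BY Quarter) m
  ON tab.Quarter = m.Quarter AND tab.Revenue = m.Revenue
ORDER BY tab.Quarter, tab.Company_name
"""

top_one = conn.execute(row_number_sql).fetchall()
with_ties = conn.execute(join_sql).fetchall()
print(len(top_one), "rows from ROW_NUMBER")
print(len(with_ties), "rows from the join on MAX")
```

If all tied companies should be kept while still using a window function, RANK() in place of ROW_NUMBER() is the usual choice, since it gives tied rows the same rank.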
You can just use a cte:
with x as (
    select Quarter, max(Revenue) as Revenue
    from test
    group by Quarter
)
select t.Company_name, x.Quarter, x.Revenue
from x
join test t
    on x.Revenue = t.Revenue
    and t.Quarter = x.Quarter;
see db<>fiddle.
First I select the max Revenue grouped by Quarter, then I join back to the table on the returned max(Revenue). As @lemon pointed out in the comments, that alone is not enough: when the same revenue value also appears in a different quarter, the join returns extra rows, as shown in this db<>fiddle.
That's why I added the join on Quarter, so that only rows from the matching quarter are returned.
If you're using a version of MySQL that doesn't support CTEs, you can use a subquery instead:
select t.Company_name, x.Quarter, x.Revenue
from
(
    select Quarter, max(Revenue) as Revenue
    from test
    group by Quarter
) x
join test t
    on x.Quarter = t.Quarter
    and x.Revenue = t.Revenue;
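Try this,
SELECT quarter, company_name,max(revenue) FROM table_name GROUP BY quarter
Note that company_name is neither aggregated nor listed in the GROUP BY, so MySQL rejects this query when ONLY_FULL_GROUP_BY is enabled, and otherwise returns an arbitrary company for each quarter rather than the one with the maximum revenue.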

I would like to count the number of users who made multiple purchases grouped by month

What I'm trying to do here is count the number of repeat users (users who made more than one order) in a period of time, be it a day, a month or a year; in this case, months.
I'm currently running MySQL (MariaDB) and I'm pretty much a beginner in MySQL. I've tried multiple subqueries, but all have failed so far.
This is what I have tried so far.
This returns the total number of users, with no condition on the order count.
Since people are asking for sample data, here is what the data looks like at the moment:
Order_Creation_Date - User_ID - Order_ID
2019-01-01 123 1
2019-01-01 123 2
2019-01-01 231 3
2019-01-01 231 4
This is the query I am using to get the result, but it keeps returning the total number of users within the month:
select month(o.created_at) month,
       year(o.created_at) year,
       count(distinct o.user_uuid) from orders o
group by month(o.created_at)
having count(*) > 1
And this one returns the number of users as 1:
select month(o.created_at) month,
       year(o.created_at) year,
       (select count(distinct ord.user_uuid) from orders ord
        where ord.user_uuid = o.user_uuid
        group by ord.user_uuid
        having count(*) > 1) from orders o
group by month(o.created_at)
Expected result will be from the sample data above
Month Count of repeat users
1 2
If you want the number of users that make more than one purchase in January, then do two levels of aggregations: one by user and month and the other by month:
select yyyy, mm, sum(num_orders > 1) as num_repeat_users
from (select year(o.created_at) as yyyy, month(o.created_at) as mm,
             o.user_uuid, count(*) as num_orders
      from orders o
      group by yyyy, mm, o.user_uuid
     ) o
group by yyyy, mm;
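The two levels of aggregation can be tried out with the sample data from the question. The sketch below uses Python's sqlite3 module as a stand-in for MySQL/MariaDB (an assumption), so year() and month() are replaced by strftime(). The February order is a hypothetical extra row, added to show a month in which nobody is a repeat user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (created_at TEXT, user_uuid INTEGER, order_id INTEGER);
    INSERT INTO orders VALUES
        ('2019-01-01', 123, 1), ('2019-01-01', 123, 2),
        ('2019-01-01', 231, 3), ('2019-01-01', 231, 4),
        ('2019-02-03', 123, 5);  -- hypothetical single order in February
""")

sql = """
SELECT yyyy, mm, SUM(num_orders > 1) AS num_repeat_users
FROM (
    -- level 1: one row per user and month, with that user's order count
    SELECT strftime('%Y', created_at) AS yyyy,
           strftime('%m', created_at) AS mm,
           user_uuid,
           COUNT(*) AS num_orders
    FROM orders
    GROUP BY yyyy, mm, user_uuid
) per_user
-- level 2: per month, count the users whose order count exceeds 1
GROUP BY yyyy, mm
ORDER BY yyyy, mm
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

January reports 2 repeat users, as expected in the question, and February reports 0 because the comparison num_orders > 1 evaluates to 0 for its only user.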
I think you should try something like this, which returns the list of user IDs, by month and year, that ordered more than once in that period:
SELECT
    o.user_uuid,
    MONTH(o.created_at) AS month,
    YEAR(o.created_at) AS year,
    COUNT(*) AS num_orders
FROM orders o
GROUP BY
    o.user_uuid, MONTH(o.created_at), YEAR(o.created_at)
HAVING COUNT(*) > 1;
If you are then looking for the number of users who placed more than one order, use the above query as a subquery and count the 'user_uuid' column.

How can I make a query with a couple of profits?

First, I need the sum of TotalPrice for the sport and music departments for the first 3 months of 2016. Second, I need that result divided by the sum of all TotalPrice for the year 2016 across all departments. Third, I need the first result divided by the sum of all TotalPrice across all years.
All of this in the same query!
Thanks!
The table is called Sales and its attributes are: S_id, date, department, totalPrice.
This is my query:
Select sum(TotalPrice) as sportMusic, sportMusic/sum(TotalPrice)
From Sales
Where (Department="MUSIC" OR Department="SPORT") and
DATE BETWEEN "2016/01/01" AND "2016/03/31"
You can use your query and two more queries as subqueries (also called "derived tables") in your from clause. Cross join the three result rows and use the totals in your select clause. Something along the lines of:
select
    ms_2016_q1.total as ms_2016_q1_total,
    ms_2016_q1.total / all_2016.total as rate_2016,
    ms_2016_q1.total / all_years.total as rate_all
from
(
    select sum(totalprice) as total
    from sales
    where department in ('MUSIC', 'SPORT')
    and date between date '2016-01-01' and date '2016-03-31'
) ms_2016_q1
cross join
(
    select sum(totalprice) as total
    from sales
    where date between date '2016-01-01' and date '2016-12-31'
) all_2016
cross join
(
    select sum(totalprice) as total
    from sales
) all_years;
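To see the cross join of three one-row subqueries in action, here is a small sketch with Python's sqlite3 module (used as a stand-in for the asker's database, which is an assumption). All five sales rows are invented for illustration, and the prices are chosen so that the ratios come out as round numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (s_id INTEGER, date TEXT, department TEXT, totalprice REAL);
    -- hypothetical rows, invented for this sketch
    INSERT INTO sales VALUES
        (1, '2016-01-15', 'MUSIC', 100), (2, '2016-02-10', 'SPORT', 100),
        (3, '2016-06-01', 'MUSIC', 200), (4, '2016-03-05', 'BOOKS', 400),
        (5, '2015-05-05', 'SPORT', 800);
""")

sql = """
SELECT q1.total, q1.total / y.total, q1.total / a.total
FROM (SELECT SUM(totalprice) AS total FROM sales
      WHERE department IN ('MUSIC', 'SPORT')
        AND date BETWEEN '2016-01-01' AND '2016-03-31') q1
CROSS JOIN (SELECT SUM(totalprice) AS total FROM sales
            WHERE date BETWEEN '2016-01-01' AND '2016-12-31') y
CROSS JOIN (SELECT SUM(totalprice) AS total FROM sales) a
"""
# Each subquery yields exactly one row, so the cross join yields one row too.
ms_q1_total, rate_2016, rate_all = conn.execute(sql).fetchone()
print(ms_q1_total, rate_2016, rate_all)
```

Here the music and sport total for the first quarter is 200, the 2016 total is 800 and the all-time total is 1600, so the two rates are 0.25 and 0.125.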

How can I optimize the query below which uses three levels of select statements?

How can I optimize the query below?
I have two tables, 'calendar_table' and 'consumption'. I use this query to calculate the monthly consumption for each year.
The calendar table has day, month and year for the years 2005 - 2009, and the consumption table has billed consumption data for a monthly bill cycle. This query counts the number of days on each bill and uses that to find the consumption for each month.
SELECT id,
       date_from as bill_start_date,
       theYear as Year,
       MONTHNAME(STR_TO_DATE(theMonth, '%m')) as month,
       sum(DaysOnBill),
       TotalDaysInTheMonth,
       sum(perDayConsumption * DaysOnBill) as EstimatedConsumption
FROM
(
    SELECT
        id,
        date_from,
        theYear,
        theMonth, # use theMonth for displaying the month as a number
        COUNT(*) AS DaysOnBill,
        TotalDaysInTheMonth,
        perDayConsumption
    FROM
    (
        SELECT
            c.id,
            c.date_from as date_from,
            ct.dt,
            y AS theYear,
            month AS theMonth,
            DAY(LAST_DAY(ct.dt)) as TotalDaysInTheMonth,
            perDayConsumption
        FROM
            consumption AS c
        INNER JOIN
            calendar_table AS ct
        ON ct.dt >= c.date_from
        AND ct.dt <= c.date_to
    ) AS allDates
    GROUP BY
        id,
        date_from,
        theYear,
        theMonth ) AS estimates
GROUP BY
    id,
    theYear,
    theMonth;
It takes around 1000 seconds to go through around 1 million records. Can something be done to make it faster?
The query is a bit dubious: it pretends to do one grouping first and then build on that with another, which isn't actually the case.
First the bill gets joined with all its days. Then we group by bill plus month and year, thus getting a monthly view on the data. This could be done in one pass, but the query joins first and then uses the result as a derived table, which gets aggregated. Finally the results are taken again and "another" group is built, which is actually the same as before (bill plus month and year), and some pseudo aggregations are done (e.g. sum(perDayConsumption * DaysOnBill), which is the same as perDayConsumption * DaysOnBill, as SUM sums only one record here).
This can simply be written as:
SELECT
    c.id,
    c.date_from as bill_start_date,
    ct.y AS Year,
    MONTHNAME(STR_TO_DATE(ct.month, '%m')) as month,
    COUNT(*) AS DaysOnBill,
    DAY(LAST_DAY(ct.dt)) as TotalDaysInTheMonth,
    SUM(c.perDayConsumption) as EstimatedConsumption
FROM consumption AS c
INNER JOIN calendar_table AS ct ON ct.dt BETWEEN c.date_from AND c.date_to
GROUP BY
    c.id,
    ct.y,
    ct.month;
I don't know whether this will be faster, or whether MySQL's optimizer already sees through your query and boils it down to this anyhow.
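The single-pass version can be checked on a tiny data set. This is a sketch with Python's sqlite3 module standing in for MySQL (an assumption); the calendar covers only the first three months of 2005 and the single bill is hypothetical. TotalDaysInTheMonth and MONTHNAME are left out because LAST_DAY() and MONTHNAME() are MySQL functions that SQLite does not provide.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar_table (dt TEXT, y INTEGER, month INTEGER)")
conn.execute(
    "CREATE TABLE consumption "
    "(id INTEGER, date_from TEXT, date_to TEXT, perDayConsumption REAL)"
)

# Fill the calendar with one row per day for 2005-01-01 .. 2005-03-31.
day = date(2005, 1, 1)
while day <= date(2005, 3, 31):
    conn.execute(
        "INSERT INTO calendar_table VALUES (?, ?, ?)",
        (day.isoformat(), day.year, day.month),
    )
    day += timedelta(days=1)

# One hypothetical bill: 2005-01-20 .. 2005-02-10 at 2.0 units per day.
conn.execute("INSERT INTO consumption VALUES (1, '2005-01-20', '2005-02-10', 2.0)")

sql = """
SELECT c.id, ct.y, ct.month,
       COUNT(*) AS DaysOnBill,
       SUM(c.perDayConsumption) AS EstimatedConsumption
FROM consumption AS c
JOIN calendar_table AS ct ON ct.dt BETWEEN c.date_from AND c.date_to
GROUP BY c.id, ct.y, ct.month
ORDER BY ct.y, ct.month
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

The bill is split into 12 days in January and 10 days in February, and summing the per-day consumption over the joined days gives the same figure as perDayConsumption * DaysOnBill.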

How to use group concat with group by?

My table has the following fields:
employee_id,
expense_id,
expense_type,
expense_cost,
expense_date, etc.
I want to display the expenses month-wise, one row per month, for a particular employee.
The data in my table is stored like this:
2wheeler 01/03/2014 99 Santhosh 4493.00 March 500.00
Auto 03/02/2014 99 Santhosh 0.00 February 80.00
Food 01/02/2014 99 Santhosh 0.00 February 200.00
Phone Expense 01/03/2014 99 Santhosh 0.00 March 500.00
From this table I want to get a single row per user and month, with the expense types concatenated and the costs summed. For example, March should have a single row with the concatenated expense types and the sum of their costs.
I would suggest a subquery that sums up all the occurrences of an expense per employee, per month, per expense type. Then use that as a source to get the list of employees, months and the GROUP_CONCATed list of expense types with their total cost.
Like this:-
SELECT employee_id, expense_month, GROUP_CONCAT(CONCAT_WS('=', expense_type, monthly_employee_expense))
FROM
(
    SELECT employee_id, MONTH(expense_date) AS expense_month, expense_type, SUM(expense_cost) AS monthly_employee_expense
    FROM some_table
    GROUP BY employee_id, expense_month, expense_type
) Sub1
GROUP BY employee_id, expense_month
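The sum-then-concatenate approach can be run against the four rows from the question. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption), so CONCAT_WS is replaced by the || operator and MONTH() by strftime(). The column layout of some_table and the month_total column are assumptions made for this sketch, and the dates are read as day/month/year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE some_table (
        employee_id INTEGER, expense_type TEXT,
        expense_cost INTEGER, expense_date TEXT
    );
    INSERT INTO some_table VALUES
        (99, '2wheeler',      500, '2014-03-01'),
        (99, 'Phone Expense', 500, '2014-03-01'),
        (99, 'Auto',           80, '2014-02-03'),
        (99, 'Food',          200, '2014-02-01');
""")

sql = """
SELECT employee_id, expense_month,
       GROUP_CONCAT(expense_type || '=' || monthly_expense) AS expenses,
       SUM(monthly_expense) AS month_total
FROM (
    -- one row per employee, month and expense type
    SELECT employee_id,
           strftime('%Y-%m', expense_date) AS expense_month,
           expense_type,
           SUM(expense_cost) AS monthly_expense
    FROM some_table
    GROUP BY employee_id, expense_month, expense_type
) sub1
GROUP BY employee_id, expense_month
ORDER BY expense_month
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Each month collapses to a single row per employee. The order of the items inside the concatenated string is not guaranteed here; in MySQL, GROUP_CONCAT accepts an ORDER BY clause if a fixed order is needed.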
EDIT
Reading your comment, it seems that you need an expense listed even when an employee has not incurred that expense that month.
To do that, I think you will need to cross join the employees (I have assumed a table name of tbl_employee) with the table of expense types, and also with all the possible months (assuming you want a row for a month for an employee even when that employee has had no expenses that month). I have got the possible months by just selecting the distinct year / month from the table listing all the expenses (there are other ways to get this; it depends on whether there are any months where no employees had any expenses and whether you want to output these anyway).
Once those are cross joined to get every combination of employee, month and expense type, you can left join the actual expenses in the subquery, and then do the GROUP_CONCAT much as before.
Not tested, but something like this:-
SELECT employee_id, expense_month, GROUP_CONCAT(CONCAT_WS('=', exp_type_text, monthly_employee_expense))
FROM
(
    SELECT tbl_employee.employee_id, expense_months.expense_month, tbl_expense_type.exp_type_id, tbl_expense_type.exp_type_text, SUM(expense_cost) AS monthly_employee_expense
    FROM tbl_employee
    CROSS JOIN tbl_expense_type
    CROSS JOIN
    (
        SELECT DISTINCT DATE_FORMAT(expense_date, '%Y%m') AS expense_month
        FROM some_table
    ) expense_months
    LEFT OUTER JOIN some_table
        ON tbl_employee.employee_id = some_table.employee_id
        AND tbl_expense_type.exp_type_id = some_table.expense_type
        AND expense_months.expense_month = DATE_FORMAT(some_table.expense_date, '%Y%m')
    GROUP BY tbl_employee.employee_id, expense_months.expense_month, tbl_expense_type.exp_type_id, tbl_expense_type.exp_type_text
) Sub1
GROUP BY employee_id, expense_month