Including and excluding specific records - mysql

I want to find buyers who meet a special condition: in this case, transactions >= 600000, which makes them a "star member".
Specifically, I want the star members (transactions >= 600000) of both January 2020 and March 2020, excluding star members who made transactions in February 2020.
Here's my query:
SELECT users_id
FROM order_star_member
GROUP BY users_id
HAVING SUM(CASE WHEN MONTHNAME(createdAt) = 'January'
THEN total_price_star_member END) >= 600000
AND SUM(CASE WHEN MONTHNAME(createdAt) = 'March'
THEN total_price_star_member END) >= 600000
AND NOT EXISTS (SELECT 1 FROM order_star_member
GROUP BY users_id
having sum(case when monthname(createdAt) = 'February'
THEN total_price_star_member END) >= 600000);
And here's my fiddle:
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=2c85037215fe71f700b51c8fd3a5ae76
In my fiddle, the expected result is users_id 15, because that id ordered in January and March but not in February.

First, the inner query t groups by user and month to determine who is a star member in each month.
The outer query groups by users_id. Each user's score is the sum of their star_member flags.
For February (m=2, February being the second month, on the first line of the query below), a user who is a star_member gets a penalty of -100, an arbitrary value that the SUM cannot overcome.
The only way month_score=2 can occur is if a user has star_member true (1) for both January and March but not for February.
SELECT users_id, SUM(IF(m=2 and star_member, -100, star_member)) as month_score
FROM
(SELECT users_id,
MONTH(createdAt) as m,
SUM(total_price_star_member) >= 600000 as star_member
FROM order_star_member
WHERE createdAt BETWEEN '20190101' AND '20190331'
GROUP BY users_id, m
) t
GROUP BY users_id
HAVING month_score=2
fiddle
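The penalty scoring described above can be tried without a MySQL server. This is only a minimal sketch: it runs the same idea on SQLite through Python's sqlite3 module, so strftime('%m', ...) stands in for MySQL's MONTH() and a CASE expression stands in for IF(). The rows are hypothetical and are not the data from the fiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_star_member ("
    "users_id INTEGER, createdAt TEXT, total_price_star_member INTEGER)"
)
# Hypothetical rows, not the data from the fiddle.
conn.executemany(
    "INSERT INTO order_star_member VALUES (?, ?, ?)",
    [
        (15, "2019-01-05", 700000),  # star in January
        (15, "2019-03-10", 650000),  # star in March, nothing in February
        (16, "2019-01-05", 700000),
        (16, "2019-02-07", 800000),  # star in February, so penalised
        (16, "2019-03-09", 900000),
        (17, "2019-01-05", 600000),
        (17, "2019-02-01", 100000),  # February order below the threshold
        (17, "2019-03-02", 300000),
        (17, "2019-03-20", 300000),  # two March orders add up to 600000
        (18, "2019-03-01", 900000),  # star in March only
    ],
)

# SQLite dialect: the comparison SUM(...) >= 600000 yields 1 or 0.
PENALTY_QUERY = """
SELECT users_id,
       SUM(CASE WHEN m = 2 AND star_member = 1 THEN -100
                ELSE star_member END) AS month_score
FROM (
    SELECT users_id,
           CAST(strftime('%m', createdAt) AS INTEGER) AS m,
           SUM(total_price_star_member) >= 600000 AS star_member
    FROM order_star_member
    WHERE createdAt BETWEEN '2019-01-01' AND '2019-03-31'
    GROUP BY users_id, strftime('%m', createdAt)
) t
GROUP BY users_id
ORDER BY users_id
"""

scores = conn.execute(PENALTY_QUERY).fetchall()
# A score of exactly 2 means: star in January and March, not a star in February.
qualifying = [user for user, score in scores if score == 2]
print(scores)
print(qualifying)
```

The filtering is done in Python here only so that every user's score stays visible; in the answer's query the same filter is the HAVING month_score=2 line. Note that user 17 still qualifies despite ordering in February, because the penalty applies only to a February total that reaches the threshold.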

Related

how to make cohort analysis in mysql

I have a table called order_star_member:
create table order_star_member(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
users_id INT(11) NOT NULL,
createdAt datetime NOT NULL,
total_price_star_member decimal(10,2) NOT NULL,
PRIMARY KEY (id)
);
INSERT INTO order_star_member(users_id, createdAt, total_price_star_member)
VALUES
(15, '2021-01-01', 350000),
(15, '2021-01-02', 400000),
(16, '2021-01-02', 700000),
(15, '2021-02-01', 350000),
(16, '2021-02-02', 700000),
(15, '2021-03-01', 350000),
(16, '2021-03-01', 850000),
(17, '2021-03-03', 350000);
DB Fiddle
I want to find the users whose transactions in March total >= 700000, together with the month of their first transaction >= 700000. A user whose transactions total >= 700000 in a month is called a star member.
My query so far:
SELECT COUNT(users_id) count_star_member,
year_and_month DIV 100 `year`,
year_and_month MOD 100 `month`
FROM (SELECT users_id,
MIN(year_and_month) year_and_month
FROM ( SELECT users_id,
DATE_FORMAT(createdAt, '%Y%m') year_and_month,
SUM(total_price_star_member) month_price
FROM order_star_member
GROUP BY users_id,
DATE_FORMAT(createdAt, '%Y%m')
HAVING month_price >= 350000 ) starrings
GROUP BY users_id
HAVING SUM(year_and_month = '202103') > 0 ) first_starrings
GROUP BY year_and_month
ORDER BY `year`, `month`;
+-------------------+------+-------+
| count_star_member | year | month |
+-------------------+------+-------+
| 1 | 2021 | 1 |
+-------------------+------+-------+
Explanation: in March 2021 there is only one 'star member', users_id 16, whose first transaction was in January 2021, so the 'star member' count for March 2021 is as shown above.
But starting from March, the definition of 'star member' changes from 700,000 to 350,000.
I want to find the 'star members' in March together with their first transaction. If the first transaction is in a month before March 2021, the star member should be a user whose transactions were >= 700,000; but if the first transaction is in March 2021, as I said, select a user whose transactions are >= 350,000.
Thus my updated expectation:
+-------------------+------+-------+
| count_star_member | year | month |
+-------------------+------+-------+
| 2 | 2021 | 1 |
| 1 | 2021 | 3 |
+-------------------+------+-------+
Explanation: users 15, 16, and 17 are star members in March 2021. Users 15 and 16 first became star members in January 2021 (before March 2021, when the requirement to become a star member was 700,000), while user 17 is also a star member because their first transaction, 350,000, is in March 2021.
My understanding is that in determining the final output, you need 2 things:
A user's first transaction
The users who are star members for the requested month, using the condition that the cumulative monthly transaction amount is >= 700000 before March 2021 and >= 350000 from March 2021 onward
If that is correct, and since you are using a version below 8.0 (where it could be done with one statement), your solution is as follows:
You need a rules table or some other configuration of rules (we'll call it SMLimitDef). Entered directly in a table, the rules would look like this:
insert into SMLimitDef(sEffDate,eEffDate,priceLimit)
VALUES('1980-01-01','2021-02-28',700000),
('2021-03-01','2999-12-31',350000);
Next, you need a query or view that figures out your first transactions(called vFirstUserTransMatch) which would look something like this:
create view vFirstUserTransMatch as
SELECT *,month(osm.createdAt) as createMonth, year(osm.createdAt) as createYear
FROM order_star_member osm
where createdAt=(select MIN(createdAt) from order_star_member b
where b.users_id=osm.users_id
)
Next you need a summary view or query that summarizes transactions per month per user
create view vOSMSummary as
SELECT users_id,month(osm.createdAt) as createMonth, year(osm.createdAt) as createYear, sum(total_price_star_member) as totalPrice
FROM order_star_member osm
group by users_id,month(osm.createdAt), year(osm.createdAt);
Next you need a query that puts it all together based on your criteria:
select osm.*,futm.createMonth as firstMonth, futm.createYear as firstYear
from vOSMSummary osm
inner join vFirstUserTransMatch futm
on osm.users_id=futm.users_id
where exists(select 'x' from SMLimitDef c
where STR_TO_DATE(CONCAT(osm.createYear,'-',osm.createMonth,'-01'),'%Y-%c-%d') between c.sEffDate and c.eEffDate
and osm.totalPrice>=c.pricelimit
)
and osm.CreateMonth=3 and osm.createYear=2021
Lastly, you can do your summary
SELECT COUNT(users_id) count_star_member,
firstYear `year`,
firstMonth `month`
FROM (
select osm.*,futm.createMonth as firstMonth, futm.createYear as firstYear
from vOSMSummary osm
inner join vFirstUserTransMatch futm
on osm.users_id=futm.users_id
where exists(select 'x' from SMLimitDef c
where STR_TO_DATE(CONCAT(osm.createYear,'-',osm.createMonth,'-01'),'%Y-%c-%d') between c.sEffDate and c.eEffDate
and osm.totalPrice>=c.pricelimit
)
and osm.CreateMonth=3 and osm.createYear=2021
) d
group by firstYear, firstMonth
As I said, if you were using MySQL 8, everything could go in one query using WITH clauses (common table expressions). For your version, views keep things readable and simple; alternatively, you can embed the SQL of those views directly in the final query.
Fiddle looks like this
Contrast with version 8 which looks like this
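For reference, the single-statement form mentioned above might look roughly like the sketch below. It is not the contents of either fiddle. It runs on SQLite through Python's sqlite3 module so that it works without a server, which means strftime() stands in for MySQL's YEAR(), MONTH() and DATE_FORMAT(). The CTE names limits, first_tx and monthly are made up for this sketch; the rule rows and the sample data are the ones given in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_star_member ("
    "users_id INTEGER, createdAt TEXT, total_price_star_member INTEGER)"
)
# Sample rows from the question.
conn.executemany(
    "INSERT INTO order_star_member VALUES (?, ?, ?)",
    [
        (15, "2021-01-01", 350000),
        (15, "2021-01-02", 400000),
        (16, "2021-01-02", 700000),
        (15, "2021-02-01", 350000),
        (16, "2021-02-02", 700000),
        (15, "2021-03-01", 350000),
        (16, "2021-03-01", 850000),
        (17, "2021-03-03", 350000),
    ],
)

# limits plays the role of SMLimitDef, first_tx of vFirstUserTransMatch,
# and monthly of vOSMSummary (all three CTE names are hypothetical).
CTE_QUERY = """
WITH limits(s_eff, e_eff, price_limit) AS (
    VALUES ('1980-01-01', '2021-02-28', 700000),
           ('2021-03-01', '2999-12-31', 350000)
),
first_tx AS (
    SELECT users_id, MIN(createdAt) AS first_created
    FROM order_star_member
    GROUP BY users_id
),
monthly AS (
    SELECT users_id,
           strftime('%Y-%m-01', createdAt) AS month_start,
           SUM(total_price_star_member) AS total_price
    FROM order_star_member
    GROUP BY users_id, strftime('%Y-%m-01', createdAt)
)
SELECT COUNT(*) AS count_star_member,
       CAST(strftime('%Y', f.first_created) AS INTEGER) AS first_year,
       CAST(strftime('%m', f.first_created) AS INTEGER) AS first_month
FROM monthly m
JOIN first_tx f ON f.users_id = m.users_id
JOIN limits l ON m.month_start BETWEEN l.s_eff AND l.e_eff
WHERE m.month_start = '2021-03-01'
  AND m.total_price >= l.price_limit
GROUP BY first_year, first_month
ORDER BY first_year, first_month
"""

cohorts = conn.execute(CTE_QUERY).fetchall()
print(cohorts)
```

Joining the rule on the first day of the month, rather than on month and year separately, keeps the range check correct for rules that span several years.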
This is probably what you need:
SELECT min_year, min_month, COUNT(users_id)
FROM (
SELECT osm2.users_id, YEAR(min_createdAt) min_year, MONTH(min_createdAt) min_month, SUM(total_price_star_member) sum_price
FROM (
SELECT users_id, MIN(createdAt) min_createdAt
FROM order_star_member
GROUP BY users_id
) AS osm1
JOIN order_star_member osm2 ON osm1.users_id = osm2.users_id
WHERE DATE_FORMAT(osm2.createdAt, '%Y%m') = DATE_FORMAT(osm1.min_createdAt, '%Y%m')
GROUP BY osm2.users_id, min_createdAt
) t1
WHERE users_id IN (
SELECT users_id
FROM (
SELECT users_id, DATE_FORMAT(createdAt, '%Y-%m-01') month_createdAt
FROM order_star_member
WHERE DATE_FORMAT(createdAt, '%Y%m') = '202103'
GROUP BY users_id, DATE_FORMAT(createdAt, '%Y-%m-01')
HAVING SUM(total_price_star_member) >= (
CASE
WHEN date(month_createdAt) < date '2021-03-01' THEN 700000
ELSE 350000
END
)
) t3
) AND
(((min_year < 2021 OR (min_year = 2021 AND min_month < 3)) AND t1.sum_price >= 700000) OR
((min_year = 2021 AND min_month = 3) AND t1.sum_price >= 350000))
GROUP BY min_year, min_month
First you find the MIN(createdAt) for each member, with:
SELECT users_id, MIN(createdAt) min_createdAt
FROM order_star_member
GROUP BY users_id
Then you compute the SUM of all the total_price_star_member in the month of the min_createdAt date:
SELECT osm2.users_id, YEAR(min_createdAt) min_year, MONTH(min_createdAt) min_month, SUM(total_price_star_member) sum_price
FROM osm1
JOIN order_star_member osm2 ON osm1.users_id = osm2.users_id
WHERE DATE_FORMAT(osm2.createdAt, '%Y%m') = DATE_FORMAT(osm1.min_createdAt, '%Y%m')
GROUP BY osm2.users_id, min_createdAt
Next you filter on the month you are interested in. HAVING can only use values that can be computed from what appears in the GROUP BY clause, so you also need to project DATE_FORMAT(createdAt, '%Y-%m-01'); the HAVING clause is then allowed to use it to pick the minimum total price for star membership.
SELECT users_id
FROM (
SELECT users_id, DATE_FORMAT(createdAt, '%Y-%m-01') month_createdAt
FROM order_star_member
WHERE DATE_FORMAT(createdAt, '%Y%m') = '202103'
GROUP BY users_id, DATE_FORMAT(createdAt, '%Y-%m-01')
HAVING SUM(total_price_star_member) >= (
CASE
WHEN date(month_createdAt) < date '2021-03-01' THEN 700000
ELSE 350000
END
)
) t3
In the end you check also for the min_month and min_year, then you group based on these attributes and COUNT how many members in each group.
SELECT min_year, min_month, COUNT(users_id)
FROM t1
WHERE users_id IN (...) AND
(((min_year < 2021 OR (min_year = 2021 AND min_month < 3)) AND t1.sum_price >= 700000) OR
((min_year = 2021 AND min_month = 3) AND t1.sum_price >= 350000))
GROUP BY min_year, min_month
I did not understand your goal immediately, and I am not sure I understand it now, which is why I have changed this query a few times. You may be able to simplify it.

count with more than 1 having clause

This case follows on from my previous question, "how to count with more than 1 having clause mysql".
Assume I have dummy data like this:
CREATE TABLE order_star_member ( users_id INT,
createdAt DATE,
total_price_star_member DECIMAL(10,2) );
INSERT INTO order_star_member VALUES
(12,'2019-01-01',100000),
(12,'2019-01-10',100000),
(12,'2019-01-20',100000),
(12,'2019-02-10',100000),
(12,'2019-02-15',300000),
(12,'2019-02-21',500000),
(13,'2019-01-02',900000),
(13,'2019-01-11',300000),
(13,'2019-01-18',400000),
(13,'2019-02-06',100000),
(13,'2019-02-08',900000),
(13,'2019-02-14',400000),
(14,'2019-01-21',500000),
(14,'2019-01-23',200000),
(14,'2019-01-24',300000),
(14,'2019-02-08',100000),
(14,'2019-02-09',200000),
(14,'2019-02-14',100000),
(15, '2019-03-04',1000000),
(14, '2019-03-04', 300000),
(14, '2019-03-04', 350000),
(13, '2019-03-04', 400000),
(15, '2019-01-23', 620000),
(15, '2019-02-01', 650000),
(12, '2019-03-03', 750000),
(16, '2019-03-04', 650000),
(17, '2019-03-03', 670000),
(18, '2019-02-02', 450000),
(19, '2019-03-03', 750000);
SELECT * from order_star_member;
and then i summarize data per month
-- summary per-month data
SELECT users_id,
SUM( CASE WHEN MONTHNAME(createdAt) = 'January'
THEN total_price_star_member
END ) total_price_star_member_January,
SUM( CASE WHEN MONTHNAME(createdAt) = 'February'
THEN total_price_star_member
END ) total_price_star_member_February,
SUM( CASE WHEN MONTHNAME(createdAt) = 'March'
THEN total_price_star_member
END ) total_price_star_member_March
FROM order_star_member
GROUP BY users_id
ORDER BY 1;
In my previous question I had a table called order_star_member, which contains createdAt as the date of the transaction, users_id as the buyer, and total_price_star_member as the amount of the transaction. In this case I want to find the buyers who are star members (transactions totalling >= 600000 within a month) and the month they come from. The dummy data for order_star_member runs from January 2019 until March 2019. It was solved by @Akina with this query:
SELECT COUNT(users_id) count_star_member,
year_and_month DIV 100 `year`,
year_and_month MOD 100 `month`
FROM (SELECT users_id,
MIN(year_and_month) year_and_month
FROM ( SELECT users_id,
DATE_FORMAT(createdAt, '%Y%m') year_and_month,
SUM(total_price_star_member) month_price
FROM order_star_member
GROUP BY users_id,
DATE_FORMAT(createdAt, '%Y%m')
HAVING month_price >= 600000 ) starrings
GROUP BY users_id
HAVING SUM(year_and_month = '201903') > 0 ) first_starrings
GROUP BY year_and_month
ORDER BY `year`, `month`;
Explanation: I want the distribution of the star members of March (users_id whose transactions total >= 600000 in a month) according to the month before March in which the users_id first had transactions >= 600,000. (If the users_id did so for the first time in March, it goes into the March-to-March statistic.)
But from April 2019, to be a star member your transactions must total >= 700,000 instead of 600,000. So I want the users whose transactions total >= 700,000 within April 2019 (star members) and who made their first transaction before April 2019 with a total of 600,000 in a month.
So which part of this query should I change to find the users whose transactions total >= 700,000 in April and whose first transaction (if it was before April) was >= 600,000?

select mysql data with having sum

I have a table order_star_member, which contains users_id as the buyer, createdAt as the time of the transaction, and total_price_star_member as the amount of the transaction. I want to find the buyers with transactions >= 600000 in January who also have transactions >= 600000 in February (both months >= 600000). I don't know the exact query, so I made a new table called january, containing the buyers with January transactions >= 600000, and another called february, containing the buyers with February transactions >= 600000. After that I used this query:
select count(*) as total from (SELECT
sum(b.total_price_star_member) as total_transaction, b.users_id
FROM order_star_member b
WHERE
EXISTS (SELECT 1 FROM january d
WHERE d.buyer_id = b.users_id) AND
EXISTS (SELECT 1 FROM february a
WHERE a.buyer_id = b.users_id) AND
NOT EXISTS (SELECT 1 FROM order_star_member c
WHERE c.users_id = b.users_id AND c.createdAt < '2020-01-01')
GROUP BY b.users_id
HAVING SUM(b.total_price_star_member) >= 600000
ORDER BY total_transaction) inner_query;
Do you know the exact query, so that I don't need to create new tables like that?
Example tables:
January 2020
users_id  total_transaction
12        750000
13        450000
14        300000
February 2020
users_id  total_transaction
12        650000
13        550000
14        650000
So when I run the query, users_id 12 should appear, because in both January and February that user had total transactions >= 600000.
SELECT users_id
FROM order_star_member
GROUP BY users_id
HAVING SUM(CASE WHEN MONTHNAME(createdAt) = 'January'
THEN total_price_star_member END) >= 600000
AND SUM(CASE WHEN MONTHNAME(createdAt) = 'February'
THEN total_price_star_member END) >= 600000;
fiddle
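To check the conditional aggregation against the example tables from the question, here is a small sketch. It assumes SQLite through Python's sqlite3 module instead of MySQL, so strftime('%m', ...) replaces MONTHNAME(). The transaction dates are invented, user 19 is a hypothetical extra row, and the WHERE clause limiting the year is an addition of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_star_member ("
    "users_id INTEGER, createdAt TEXT, total_price_star_member INTEGER)"
)
# Monthly totals from the question's example tables; the exact dates are invented.
conn.executemany(
    "INSERT INTO order_star_member VALUES (?, ?, ?)",
    [
        (12, "2020-01-15", 750000),
        (13, "2020-01-15", 450000),
        (14, "2020-01-15", 300000),
        (12, "2020-02-15", 650000),
        (13, "2020-02-15", 550000),
        (14, "2020-02-15", 650000),
        (19, "2020-02-20", 900000),  # hypothetical user with no January orders
    ],
)

BOTH_MONTHS_QUERY = """
SELECT users_id
FROM order_star_member
WHERE createdAt >= '2020-01-01' AND createdAt < '2020-03-01'
GROUP BY users_id
HAVING SUM(CASE WHEN strftime('%m', createdAt) = '01'
                THEN total_price_star_member END) >= 600000
   AND SUM(CASE WHEN strftime('%m', createdAt) = '02'
                THEN total_price_star_member END) >= 600000
ORDER BY users_id
"""

both_months = [row[0] for row in conn.execute(BOTH_MONTHS_QUERY)]
print(both_months)
```

A user with no rows in one of the months gets NULL for that month's SUM, and NULL >= 600000 is not true, so such users drop out without any extra check. The date range in the WHERE clause keeps January of a different year from being added to the same total, which matching on the month alone would not prevent.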

Count most recent renewals by memberID

I have a subscription table that tracks membership renewals by date. Renewals are for the calendar year, but members are allowed to renew as early as Oct 1, in which case they would be current until Dec. 31 of the following year. It is therefore possible to renew in January and then renew again in October of the same year. I'm reporting total memberships by month and I want to avoid counting that as 2 memberships.
Each record has a unique prodID but can have more than 1 record of a memberID due to the renewal option above. payDate is the transaction date
My statement is:
$sql = "SELECT
EXTRACT(MONTH FROM payDate) as month,
EXTRACT(YEAR FROM payDate) as year,
count(*)
FROM
memberDues
WHERE payDate >= $lastYear-10-01
GROUP BY
month,
year
ORDER BY
year ASC,
month ASC";
I get an output like this (not formatted):
Member dues paid by month
October 46
November 30
December 99
January 42
February 8
March 9
April 4
May 1
June 3
Member Total: 242
How do I modify the select statement to avoid duplicate renewals in a report period?
You need to group by memberID so that a member's repeated renewals in the same month are counted only once.
SELECT year, month, count(memberID) FROM (
SELECT memberID,
EXTRACT(MONTH FROM payDate) as month,
EXTRACT(YEAR FROM payDate) as year,
count(*) as renewals
FROM
memberDues
WHERE payDate >= '$lastYear-10-01'
GROUP BY
memberID,
month,
year
) TMP
GROUP BY year, month
ORDER BY
year ASC,
month ASC
Counting the distinct memberID's for each month should do the job:
SELECT
EXTRACT(MONTH FROM payDate) as month,
EXTRACT(YEAR FROM payDate) as year,
count(DISTINCT memberID)
FROM
memberDues
WHERE payDate >= '$lastYear-10-01'
GROUP BY
month,
year
ORDER BY
year ASC,
month ASC
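COUNT(DISTINCT memberID) per month removes repeats inside one month, but a member who pays in October and again in January still appears in both months. If the goal is to count each member once for the whole report period, one possible approach (an assumption on my part, not something the question confirms) is to keep only each member's most recent payment and group that by month. The sketch below compares the two, using SQLite through Python's sqlite3 module with hypothetical rows. The cutoff date is passed as a bound parameter rather than interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE memberDues ("
    "prodID INTEGER PRIMARY KEY, memberID INTEGER, payDate TEXT)"
)
# Hypothetical payments.
conn.executemany(
    "INSERT INTO memberDues (memberID, payDate) VALUES (?, ?)",
    [
        (1, "2013-10-05"),
        (1, "2014-01-10"),  # member 1 pays again inside the report period
        (2, "2013-11-03"),
        (3, "2014-01-12"),
        (3, "2014-01-25"),  # member 3 pays twice in the same month
        (4, "2013-06-01"),  # before the report period, ignored
    ],
)

last_year = 2013
cutoff = f"{last_year}-10-01"  # quoted date value, bound as a parameter below

DISTINCT_PER_MONTH = """
SELECT CAST(strftime('%Y', payDate) AS INTEGER) AS y,
       CAST(strftime('%m', payDate) AS INTEGER) AS mo,
       COUNT(DISTINCT memberID)
FROM memberDues
WHERE payDate >= ?
GROUP BY y, mo
ORDER BY y, mo
"""

LATEST_PAYMENT_ONLY = """
SELECT CAST(strftime('%Y', lastPay) AS INTEGER) AS y,
       CAST(strftime('%m', lastPay) AS INTEGER) AS mo,
       COUNT(*)
FROM (
    SELECT memberID, MAX(payDate) AS lastPay
    FROM memberDues
    WHERE payDate >= ?
    GROUP BY memberID
) latest
GROUP BY y, mo
ORDER BY y, mo
"""

per_month = conn.execute(DISTINCT_PER_MONTH, (cutoff,)).fetchall()
once_per_member = conn.execute(LATEST_PAYMENT_ONLY, (cutoff,)).fetchall()
print(per_month)
print(once_per_member)
```

With these rows the per-month distinct counts add up to 4 although only 3 members paid in the period, while the latest-payment variant adds up to 3.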

Created date between that particular month mysql

I have a table event whose records have an end_date field. I want to fetch the number of records grouped by month, where each record is counted only in the month its end_date falls in. For example:
If a record has end_date 2013-01-01 00:00:00, it should be counted in January 2013. I am not able to do that: I don't know how to write the WHERE condition that tells the database that end_date should fall within the month it is currently grouping.
SELECT COUNT(*) AS 'count', MONTH(created) AS 'month', YEAR(created) AS 'year' FROM event WHERE is_approved =1 GROUP BY YEAR(created), MONTH(created)
Please help me out.
EDIT :
Data say i have is like:
Record name end_date
record_1 2013-11-01 00:00:00
record_2 2013-11-30 00:00:00
record_3 2013-12-01 00:00:00
record_4 2013-12-04 00:00:00
record_5 2013-12-06 00:00:00
record_6 2013-12-10 00:00:00
...many more
Result Expected is:
Count month year
2 11 2013
4 12 2013
....so on
Try this:
SELECT COUNT(1) AS 'count', MONTH(end_date) AS 'month', YEAR(end_date) AS 'year'
FROM event
WHERE is_approved = 1
GROUP BY EXTRACT(YEAR_MONTH FROM end_date);
OR
SELECT COUNT(1) AS 'count', MONTH(end_date) AS 'month', YEAR(end_date) AS 'year'
FROM event
WHERE is_approved = 1
GROUP BY YEAR(end_date), MONTH(end_date);
::EDIT::
1. end date is greater than that particular month - Simply add a WHERE condition to your query and pass the particular month in YYYYMM format in place of 201411
2. event is started - Add one more WHERE condition to check that the created date is less than or equal to the current date
SELECT COUNT(1) AS 'count', MONTH(end_date) AS 'month', YEAR(end_date) AS 'year'
FROM event
WHERE is_approved = 1 AND
EXTRACT(YEAR_MONTH FROM end_date) > 201411 AND
DATE(created) <= CURRENT_DATE()
GROUP BY EXTRACT(YEAR_MONTH FROM end_date);
OR
SELECT COUNT(1) AS 'count', MONTH(end_date) AS 'month', YEAR(end_date) AS 'year'
FROM event
WHERE is_approved = 1 AND
EXTRACT(YEAR_MONTH FROM end_date) > 201411 AND
DATE(created) <= CURRENT_DATE()
GROUP BY YEAR(end_date), MONTH(end_date);
The count is aggregated by both month and year, so if you span several years, Jan 2013 is not mixed with Jan 2014. That is why both values are selected, and they are also the basis of the GROUP BY.
As for your criteria, they all go in the WHERE clause. In this case, I took anything from Jan 1, 2013 through Dec 31, 2014, using the standard 'yyyy-mm-dd' date format. Given the table structure you provided, I am using the "end_date" column.
SELECT
YEAR(end_date) AS EventYear,
MONTH(end_Date) AS EventMonth,
COUNT(*) AS EventCount
FROM
event
WHERE is_approved = 1
and end_date >= '2013-01-01' and end_date < '2015-01-01'
GROUP BY
YEAR(end_date),
MONTH(end_Date)
Now, if you want the most recent events at the top, I would sort the year and month descending, so 2014 is listed first, then 2013, etc., and within each year December (month 12) comes before the other months.
ORDER BY
YEAR(end_date) DESC,
MONTH(end_Date) DESC
Your criteria could be almost anything: as simple as a date change or an approved status, or even counts per status if so needed, such as (these are just EXAMPLES, if you had such status code values)
SUM( is_approved = 0 ) as PendingEvent,
SUM( is_approved = 1 ) as ApprovedEvent,
SUM( is_approved = 2 ) as CancelledEvent
Per comment feedback.
For different date ranges, drop the BETWEEN clause and change the WHERE to something like
WHERE end_date > '2014-08-01' -- everything after a date
WHERE end_date < '2014-01-01' -- everything before a date
They will still group by month / year. If you wanted to filter on the start date of the event, just use that column instead, or IN ADDITION to the others.
MySQL has a bunch of date and time functions that can help you with that. For example:
MONTH() Return the month from the date passed
or
YEAR() Return the year
So you can just get the month and year of your dates. And group your results by them.
SELECT
COUNT(*) cnt
,MONTH(end_date) month
,YEAR(end_date) year
FROM events
GROUP BY month, year
Result :
cnt month year
2 11 2013
4 12 2013
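The grouping can be checked against the sample rows from the question. This is a minimal sketch that assumes SQLite through Python's sqlite3 module rather than MySQL, so strftime() replaces MONTH() and YEAR(). The column name for the record name and the is_approved values are assumptions, since the question does not show them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event (name TEXT, end_date TEXT, is_approved INTEGER)"
)
# Sample rows from the question; is_approved = 1 is assumed for all of them.
conn.executemany(
    "INSERT INTO event VALUES (?, ?, 1)",
    [
        ("record_1", "2013-11-01 00:00:00"),
        ("record_2", "2013-11-30 00:00:00"),
        ("record_3", "2013-12-01 00:00:00"),
        ("record_4", "2013-12-04 00:00:00"),
        ("record_5", "2013-12-06 00:00:00"),
        ("record_6", "2013-12-10 00:00:00"),
    ],
)

COUNT_BY_MONTH = """
SELECT COUNT(*) AS cnt,
       CAST(strftime('%m', end_date) AS INTEGER) AS month,
       CAST(strftime('%Y', end_date) AS INTEGER) AS year
FROM event
WHERE is_approved = 1
GROUP BY strftime('%Y', end_date), strftime('%m', end_date)
ORDER BY year, month
"""

counts = conn.execute(COUNT_BY_MONTH).fetchall()
for cnt, month, year in counts:
    print(cnt, month, year)
```

Each row lands in exactly one (year, month) group, the one its end_date falls in, so no extra "between the start and end of the month" condition is needed.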
Update:
For filtering only the records that have an end_date greater than a particular month AND have already started, you just need to add a WHERE clause. For example, if the particular month were February 2015:
SELECT
COUNT(*) cnt
,MONTH(end_date) month
,YEAR(end_date) year
FROM events
WHERE end_date >= '2015-03-01'
AND created < NOW()
GROUP BY month, year
Alternatively, the first part of the WHERE clause can be rewritten in the following way, which is probably more comfortable to use if you have to pass the year and month as distinct parameters.
...
WHERE (YEAR(end_date) > 2015
OR (YEAR(end_date) = 2015 AND MONTH(end_date) > 02))
AND created...
SELECT COUNT(*) AS 'count', MONTH(created) AS 'month', YEAR(created) AS 'year' FROM event WHERE is_approved = 1 AND MONTH(created) = "the month you want" AND YEAR(created) = "the year you want" GROUP BY YEAR(created), MONTH(created)
You will need to supply the month and year. I could help with that, but I'm not sure how you are getting them. Months would be 01, 02, 03, etc. and years 2013, 2014, 2015, etc.