MySQL: find the mode of multiple subsets

I have a database like this:
custNum date purchase dayOfWeek
333 2001-01-01 23.23 1
333 2001-03-04 34.56 5
345 2008-02-02 22.55 3
345 2008-04-05 12.35 6
... ... ... ...
I'm trying to get the mode (most frequently occurring value) of the dayOfWeek column for each customer. Basically, it would be the day of the week on which each customer shops the most. Like:
custNum max(count(dayofweek(date)))
333 5
345 3
356 2
388 7
... ...
Any help would be great thanks.

select custNum, dayOfWeek
from tableName t
group by custNum, dayOfWeek
having dayOfWeek = (
select dayOfWeek
from tableName
where custNum = t.custNum
group by dayOfWeek
order by count(*) desc, dayOfWeek
limit 1
)
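To make the tie-break in the query above explicit, here is a minimal sketch of the same logic in plain Python. The sample rows and the function name mode_day_per_customer are hypothetical, chosen only for illustration; ties go to the smaller dayOfWeek, as in ORDER BY count(*) DESC, dayOfWeek LIMIT 1.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows (custNum, dayOfWeek); not the question's data.
rows = [
    (333, 1), (333, 5), (333, 5),
    (345, 3), (345, 6), (345, 3), (345, 6),
]


def mode_day_per_customer(rows):
    """Return {custNum: most frequent dayOfWeek}; ties go to the smaller day."""
    counts = defaultdict(Counter)
    for cust, day in rows:
        counts[cust][day] += 1

    result = {}
    for cust, day_counts in counts.items():
        # Highest count first, then smallest day: the same ordering as
        # ORDER BY count(*) DESC, dayOfWeek LIMIT 1 in the SQL answer.
        result[cust] = min(day_counts, key=lambda d: (-day_counts[d], d))
    return result


print(mode_day_per_customer(rows))
```

Customer 345 has days 3 and 6 tied at two visits each, so the smaller day wins.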

Related

How to select rows in every day between two dates using MySQL

I want to keep track of users logged in every day between two dates.
Let's say I have a table my_table like this:
user_id | login_datetime
1 | 2021-10-02 10:00:00
1 | 2021-10-02 12:00:00
2 | 2021-10-02 12:20:00
1 | 2021-10-03 17:00:00
1 | 2021-10-04 22:00:00
2 | 2021-10-04 23:00:00
and given date range is from '2021-10-02' to '2021-10-04'.
I want to get user_id = 1 in this case, because user_id = 2 did not log in on '2021-10-03'.
Result:
user_id | login_date
1 | 2021-10-02
1 | 2021-10-03
1 | 2021-10-04
Is there any solution for this?
One approach uses aggregation:
SELECT user_id
FROM my_table
WHERE login_datetime >= '2021-10-02' AND login_datetime < '2021-10-05'
GROUP BY user_id
HAVING COUNT(DISTINCT DATE(login_datetime)) = 3; -- range has 3 dates in it
The HAVING clause asserts that any matching user must have 3 distinct dates present, which would imply that such a user would have login activity on all dates from 2021-10-02 to 2021-10-04 inclusive.
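The hard-coded 3 has to match the length of the date range. A minimal sketch that derives it from the range endpoints is below. It runs the same query shape against SQLite through Python's sqlite3 module purely so that it is runnable; that engine choice and the helper name users_logged_in_every_day are assumptions, not part of the answer.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (user_id INTEGER, login_datetime TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        (1, "2021-10-02 10:00:00"),
        (1, "2021-10-02 12:00:00"),
        (2, "2021-10-02 12:20:00"),
        (1, "2021-10-03 17:00:00"),
        (1, "2021-10-04 22:00:00"),
        (2, "2021-10-04 23:00:00"),
    ],
)


def users_logged_in_every_day(conn, start, end):
    """Return user_ids with a login on every day from start to end inclusive."""
    n_days = (end - start).days + 1            # number of dates in the range
    end_exclusive = end + timedelta(days=1)    # half-open upper bound
    sql = """
        SELECT user_id
        FROM my_table
        WHERE login_datetime >= ? AND login_datetime < ?
        GROUP BY user_id
        HAVING COUNT(DISTINCT DATE(login_datetime)) = ?
        ORDER BY user_id
    """
    params = (start.isoformat(), end_exclusive.isoformat(), n_days)
    return [row[0] for row in conn.execute(sql, params)]


print(users_logged_in_every_day(conn, date(2021, 10, 2), date(2021, 10, 4)))
```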
Edit:
To get the exact output in your question, you may use:
SELECT DISTINCT user_id, DATE(login_datetime) AS login_date
FROM my_table
WHERE user_id IN (
SELECT user_id
FROM my_table
WHERE login_datetime >= '2021-10-02' AND login_datetime < '2021-10-05'
GROUP BY user_id
HAVING COUNT(DISTINCT DATE(login_datetime)) = 3
);

Get monthly counts on multiple dates

I have a table that looks like this
id | date registered | date cancelled
1 | 2021-01-01 | 2021-03-02
2 | 2021-01-05 | 2021-01-21
3 | 2021-02-04 | 2021-02-25
4 | 2021-02-16 | 2021-03-26
How do I generate a query in mysql that will give me counts of cancelled and registered for each month.
I can do it for just one of the dates but don't know how to combine for both dates.
For example, for a single date I would do this:
SELECT date_format(`users`.`dateregistered`,_utf8'%Y-%m') AS `DateREegistered`, count(0) AS `Registration Count`
FROM `users`
GROUP BY date_format(`users`.`dateregistered`,_utf8'%Y-%m')
But I want something like this
Date | Registered Count | Cancelled Count
2021-01 | 2 | 1
2021-02 | 2 | 1
2021-03 | 0 | 2
Please let me know if you have any ideas.
You can join the distinct months appearing in date registered and date cancelled to the table and use conditional aggregation:
SELECT t.Date,
SUM(t.Date = date_format(dateregistered, '%Y-%m')) `Registered Count`,
SUM(t.Date = date_format(datecancelled, '%Y-%m')) `Cancelled Count`
FROM (
SELECT date_format(dateregistered, '%Y-%m') Date FROM users
UNION
SELECT date_format(datecancelled, '%Y-%m') FROM users
) t INNER JOIN users u
ON t.Date IN (date_format(dateregistered, '%Y-%m'), date_format(datecancelled, '%Y-%m'))
GROUP BY t.Date
Results:
Date | Registered Count | Cancelled Count
2021-01 | 2 | 1
2021-02 | 2 | 1
2021-03 | 0 | 2
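The conditional aggregation relies on a comparison evaluating to 1 or 0, so SUM counts the matching rows. As a rough, runnable sketch, the same query shape is shown below against SQLite through Python's sqlite3 module. This is an assumption made for testability: strftime stands in for MySQL's date_format, and the alias names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER, dateregistered TEXT, datecancelled TEXT)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "2021-01-01", "2021-03-02"),
        (2, "2021-01-05", "2021-01-21"),
        (3, "2021-02-04", "2021-02-25"),
        (4, "2021-02-16", "2021-03-26"),
    ],
)

# SQLite dialect: strftime('%Y-%m', col) plays the role of
# MySQL's date_format(col, '%Y-%m').
sql = """
    SELECT t.month,
           SUM(t.month = strftime('%Y-%m', u.dateregistered)) AS registered_count,
           SUM(t.month = strftime('%Y-%m', u.datecancelled))  AS cancelled_count
    FROM (
        SELECT strftime('%Y-%m', dateregistered) AS month FROM users
        UNION
        SELECT strftime('%Y-%m', datecancelled) FROM users
    ) AS t
    INNER JOIN users AS u
        ON t.month IN (strftime('%Y-%m', u.dateregistered),
                       strftime('%Y-%m', u.datecancelled))
    GROUP BY t.month
    ORDER BY t.month
"""
monthly_counts = conn.execute(sql).fetchall()
for row in monthly_counts:
    print(row)
```

The UNION supplies every month that appears in either column, which is why 2021-03 shows up with a registered count of 0.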

SQL Duplicate Count if Customer has Spent on Specific Date and Returned?

I've spent a fair amount of time trying to get my head round how to do this, and I can't. I'm making it far too complicated for myself. I understand the code, just not how it all flows together.
If I have table "Customers" with columns for "customer_id", "store_id", "visited", and "date" - I want to identify Customers who visited (visited = yes) a specific store (store_id="NEA") on a set date "2015-05-14" - and then have returned to the same store since then, and count the number of customers who have returned - can anyone help me out?
I know I would need to select customer_id for those who have a store_id of "NEA", a date of "2015-05-14" and a "yes" for visited, but how do I then identify those who returned, and count them - so how many customers visited on that day and then returned?
So for example:
customer_id | store_id | date | visited
123 NEA 2015-05-14 yes
456 NEA 2015-05-14 yes
789 ABC 2015-05-16 no
123 NEA 2015-05-14 yes
654 TDF 2015-05-12 yes
987 PEH 2015-05-14 yes
123 NEA 2015-05-14 no
456 NEA 2015-05-17 yes
987 LEA 2015-05-14 yes
159 NEA 2015-05-16 yes
123 NEA 2015-05-19 yes
or something like this:
SELECT count(*) AS cnt,t.*
FROM yourTable AS t
WHERE
`date` = '2015-05-14'
AND
store_id = 'NEA'
AND
visited = 'YES'
GROUP BY customer_id
HAVING cnt >1;
SELECT DISTINCT customer_id, date
FROM Customers
WHERE visited = 'yes'
GROUP BY customer_id, store_id, date
HAVING COUNT(*) >= 2
The above query yields a list of duplicate customers and the dates on which they visited the same store twice or more. If you want a count of duplicate customers by date, you can wrap it in a subquery:
SELECT t.date, COUNT(*) AS duplicateCount
FROM
(
SELECT DISTINCT customer_id, date
FROM Customers
WHERE visited = 'yes'
GROUP BY customer_id, store_id, date
HAVING COUNT(*) >= 2
) t
GROUP BY t.date
Update:
Based on your feedback, the following query might be what you had in mind:
SELECT DISTINCT customer_id
FROM Customers
WHERE visited = 'yes'
GROUP BY customer_id, store_id
HAVING SUM(CASE WHEN date = '2015-05-14' THEN 1 ELSE 0 END) >= 1 AND
SUM(CASE WHEN date > '2015-05-14' THEN 1 ELSE 0 END) >= 1
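The question also asks for a count and names a specific store. A minimal sketch that wraps the query above in COUNT(*) and adds a store filter is below; both additions are assumptions about what the asker wanted, and SQLite (via Python's sqlite3) is used only so the sketch can be run against the question's sample rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Customers "
    "(customer_id INTEGER, store_id TEXT, date TEXT, visited TEXT)"
)
conn.executemany(
    "INSERT INTO Customers VALUES (?, ?, ?, ?)",
    [
        (123, "NEA", "2015-05-14", "yes"),
        (456, "NEA", "2015-05-14", "yes"),
        (789, "ABC", "2015-05-16", "no"),
        (123, "NEA", "2015-05-14", "yes"),
        (654, "TDF", "2015-05-12", "yes"),
        (987, "PEH", "2015-05-14", "yes"),
        (123, "NEA", "2015-05-14", "no"),
        (456, "NEA", "2015-05-17", "yes"),
        (987, "LEA", "2015-05-14", "yes"),
        (159, "NEA", "2015-05-16", "yes"),
        (123, "NEA", "2015-05-19", "yes"),
    ],
)


def count_returning(conn, store, day):
    """Count customers who visited `store` on `day` and again on a later date."""
    sql = """
        SELECT COUNT(*) FROM (
            SELECT customer_id
            FROM Customers
            WHERE visited = 'yes' AND store_id = ?   -- store filter: an assumption
            GROUP BY customer_id
            HAVING SUM(CASE WHEN date = ? THEN 1 ELSE 0 END) >= 1
               AND SUM(CASE WHEN date > ? THEN 1 ELSE 0 END) >= 1
        ) AS returned
    """
    return conn.execute(sql, (store, day, day)).fetchone()[0]


print(count_returning(conn, "NEA", "2015-05-14"))
```

With the sample rows, customers 123 and 456 both visited NEA on 2015-05-14 and came back later, while 159 only visited afterwards.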

MySQL get count of periods where date in row

I have a MySQL table, similar to this example:
c_id date value
66 2015-07-01 1
66 2015-07-02 777
66 2015-08-01 33
66 2015-08-20 200
66 2015-08-21 11
66 2015-09-14 202
66 2015-09-15 204
66 2015-09-16 23
66 2015-09-17 0
66 2015-09-18 231
What I need to get is the count of periods in which the dates are consecutive. I don't have a fixed start or end date; there can be any.
For example: 2015-07-01 - 2015-07-02 is one period, 2015-08-01 is the second period, 2015-08-20 - 2015-08-21 is the third period and 2015-09-14 - 2015-09-18 is the fourth period. So in this example there are four periods.
SELECT
SUM(value) as value_sum,
... as period_count
FROM my_table
WHERE cid = 66
I haven't been able to figure this out all day. Thanks.
I don't have enough reputation to comment to the above answer.
If all you need is the NUMBER of splits, then you can simply reword your question: "How many entries have a date D, such that the date D - 1 DAY does not have an entry?"
In which case, this is all you need:
SELECT
COUNT(*) as PeriodCount
FROM
`periods`
WHERE
DATE_ADD(`date`, INTERVAL - 1 DAY) NOT IN (SELECT `date` from `periods`);
In your PHP, just select the "PeriodCount" column from the first row.
You had me working on some crazy stored procedure approach until that clarification :P
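The rewording above (count the dates whose previous day has no entry) can be checked outside the database. A minimal Python sketch, assuming one row per date and using the question's sample dates; the function name count_periods is illustrative:

```python
from datetime import date, timedelta

# Dates from the question's sample table.
dates = [
    date(2015, 7, 1), date(2015, 7, 2),
    date(2015, 8, 1),
    date(2015, 8, 20), date(2015, 8, 21),
    date(2015, 9, 14), date(2015, 9, 15), date(2015, 9, 16),
    date(2015, 9, 17), date(2015, 9, 18),
]


def count_periods(dates):
    """Count dates D such that D - 1 day is absent: each one starts a period."""
    present = set(dates)
    one_day = timedelta(days=1)
    return sum(1 for d in present if d - one_day not in present)


print(count_periods(dates))
```

Each period contributes exactly one such start date, so the count equals the number of periods.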
I should get deservedly flamed for this, but anyway, consider the following...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(date DATE NOT NULL PRIMARY KEY
,value INT NOT NULL
);
INSERT INTO my_table VALUES
('2015-07-01',1),
('2015-07-02',777),
('2015-08-01',33),
('2015-08-20',200),
('2015-08-21',11),
('2015-09-14',202),
('2015-09-15',204),
('2015-09-16',23),
('2015-09-17',0),
('2015-09-18',231);
SELECT x.*
, SUM(y.value) total
FROM
( SELECT a.date start
, MIN(c.date) end
FROM my_table a
LEFT
JOIN my_table b
ON b.date = a.date - INTERVAL 1 DAY
LEFT
JOIN my_table c
ON c.date >= a.date
LEFT
JOIN my_table d
ON d.date = c.date + INTERVAL 1 DAY
WHERE b.date IS NULL
AND c.date IS NOT NULL
AND d.date IS NULL
GROUP
BY a.date
) x
JOIN my_table y
ON y.date BETWEEN x.start AND x.end
GROUP
BY x.start;
+------------+------------+-------+
| start | end | total |
+------------+------------+-------+
| 2015-07-01 | 2015-07-02 | 778 |
| 2015-08-01 | 2015-08-01 | 33 |
| 2015-08-20 | 2015-08-21 | 211 |
| 2015-09-14 | 2015-09-18 | 660 |
+------------+------------+-------+
4 rows in set (0.00 sec) -- <-- This is the number of periods
There is a simpler way of doing this:
SELECT min(date) start,max(date) end,sum(value) total FROM
(SELECT @i:=@i+1 i,
ROUND(Unix_timestamp(date)/(24*60*60))-@i diff,
date,value
FROM tbl, (SELECT @i:=0)n WHERE c_id=66 ORDER BY date) t
GROUP BY diff
This select groups over the same difference between sequential number and date value.
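The grouping idea (day number minus row number is constant within a run of consecutive dates) can be sketched in Python as follows. It assumes distinct dates and uses the question's sample rows; the function name periods is illustrative, and date.toordinal() plays the role of the day number computed from unix_timestamp().

```python
from datetime import date
from itertools import groupby

# (date, value) rows from the question's sample table.
rows = [
    (date(2015, 7, 1), 1), (date(2015, 7, 2), 777),
    (date(2015, 8, 1), 33),
    (date(2015, 8, 20), 200), (date(2015, 8, 21), 11),
    (date(2015, 9, 14), 202), (date(2015, 9, 15), 204),
    (date(2015, 9, 16), 23), (date(2015, 9, 17), 0),
    (date(2015, 9, 18), 231),
]


def periods(rows):
    """Return (start, end, total) per run of consecutive dates."""
    ordered = sorted(rows)
    # Day number minus row index stays constant inside a consecutive run.
    keyed = [(d.toordinal() - i, d, v) for i, (d, v) in enumerate(ordered)]
    result = []
    for _, group in groupby(keyed, key=lambda item: item[0]):
        group = list(group)
        start, end = group[0][1], group[-1][1]
        result.append((start, end, sum(v for _, _, v in group)))
    return result


for start, end, total in periods(rows):
    print(start, end, total)
```

len(periods(rows)) then gives the period count.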
Edit
As Strawberry remarked quite rightly, there was a flaw in my approach when a period spans a month change or indeed a change into the next year. The unix_timestamp() function can cure this though: it returns the seconds since 1970-1-1, so by dividing this number by 24*60*60 you get the days since that particular date. The rest is simple ...
If you only need the count, as your last comment stated, you can do it even simpler:
SELECT count(distinct diff) period_count FROM
(SELECT @i:=@i+1 i,
ROUND(Unix_timestamp(date)/(24*60*60))-@i diff,
date,value
FROM tbl,(SELECT @i:=0)n WHERE c_id=66 ORDER BY date) t
Thanks. @cars10's solution worked in MySQL, but I could not get the period count to echo in PHP; it returned 0. I got it working thanks to @jarkinstall. So my final select looks something like this:
SELECT
sum(coalesce(count_tmp,coalesce(count_reserved,0))) as sum
,(SELECT COUNT(*) FROM my_table WHERE cid='.$cid.' AND DATE_ADD(date, INTERVAL - 1 DAY) NOT IN (SELECT date from my_table WHERE cid='.$cid.' AND coalesce(count_tmp,coalesce(count_reserved,0))>0)) as periods
,count(*) as count
,(min(date)) as min_date
,(max(date)) as max_date
FROM my_table WHERE cid=66
AND coalesce(count_tmp,coalesce(count_reserved,0))>0
ORDER BY date;

mysql get average data for full months

Given the following sample data:
tblData
Date Sales
----------------------
2011-12-01 122
2011-12-02 433
2011-12-03 213
...
2011-12-31 235
2011-11-01 122
2011-11-02 433
2011-11-03 213
...
2011-11-30 235
2011-10-10 122
2011-10-11 433
2011-10-12 213
...
2011-10-31 235
Notice that October data begins at 10 October, whereas subsequent months have complete data.
I need to get the average monthly sales over all complete months, which in this case would be November and December 2011.
How would I do this?
SELECT `date`, AVG(`sales`)
FROM sales
GROUP BY YEAR(`date`), MONTH(`date`)
HAVING COUNT(`date`) = DAY(LAST_DAY(`date`));
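The HAVING condition keeps a month only when the number of rows equals the number of days in that month. A minimal Python sketch of the same filter is below, using hypothetical data in which each day's sales equals its day number; calendar.monthrange stands in for DAY(LAST_DAY(date)), and distinct days are counted, which matches COUNT(date) when there is one row per day.

```python
import calendar
from collections import defaultdict
from datetime import date, timedelta


def average_over_full_months(rows):
    """Return {(year, month): average sales} for months with every day present."""
    by_month = defaultdict(list)
    for d, sales in rows:
        by_month[(d.year, d.month)].append((d, sales))

    result = {}
    for (year, month), items in by_month.items():
        days_in_month = calendar.monthrange(year, month)[1]
        distinct_days = {d for d, _ in items}
        if len(distinct_days) == days_in_month:   # complete month only
            result[(year, month)] = sum(s for _, s in items) / len(items)
    return result


# Hypothetical data: 2011-10-10 through 2011-12-31, sales = day of month.
rows = []
current = date(2011, 10, 10)
while current <= date(2011, 12, 31):
    rows.append((current, current.day))
    current += timedelta(days=1)

print(average_over_full_months(rows))
```

October is dropped because it has only 22 of its 31 days.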
If you want to limit the result, either
HAVING ...
ORDER BY `date` DESC LIMIT 3
which should always return data for the 3 most recent months, or something like
FROM ...
WHERE DATE_FORMAT(CURDATE() - INTERVAL 3 MONTH, '%Y-%m')
<= DATE_FORMAT(`date`, '%Y-%m')
GROUP BY ...
which should return data for the 3 previous months, if there is any. I'm not sure which is better, but I don't believe WHERE gets to use any index on date. Also, if you're using DATETIME and don't format it, you'll be comparing the days as well, which you don't want.
Can't test it right now, but please have a try with this one:
SELECT
DATE_FORMAT(`Date`, '%Y-%m') AS yearMonth,
SUM(Sales)
FROM
yourTable
GROUP BY
yearMonth
HAVING
COUNT(*) = DAY(LAST_DAY(`Date`))