MySQL SELECT SUM CASE with GROUP BY or DISTINCT - mysql

I'm trying to count unique user ids in a log table by month. So far I came up with the following query:
SELECT
COUNT(CASE WHEN log_date LIKE '2020-01%' THEN 1 END) AS januari
FROM user_log;
This query returns the total number of rows in user_log for January. However, I would like to know how many unique users logged in during January. So I need something like:
SELECT
COUNT(**DISTINCT user_id** CASE WHEN log_date LIKE '2020-01%' THEN 1 END) AS januari
FROM user_log;
I also tried GROUP BY, but so far no luck. Does anyone have a suggestion?

Consider:
SELECT COUNT(DISTINCT CASE WHEN log_date >= '2020-01-01' AND log_date < '2020-02-01' THEN user_id END) AS januari
FROM user_log;
I changed the filtering logic to use a half-open interval rather than string matching. Comparing the column directly avoids converting every log_date value to a string, and when the same condition is used in a WHERE clause it lets MySQL use an index on log_date.
Note that if you just want that result for January, a WHERE clause is sufficient:
SELECT COUNT(DISTINCT user_id) AS januari
FROM user_log
WHERE log_date >= '2020-01-01' AND log_date < '2020-02-01';
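To check the COUNT(DISTINCT CASE ...) pattern on toy data without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module. SQLite stands in for MySQL here, the sample rows are invented, and the second column (februari) is an assumed extension that shows how the same pattern gives one column per month.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_log (user_id INTEGER, log_date TEXT)")
conn.executemany(
    "INSERT INTO user_log VALUES (?, ?)",
    [
        (1, "2020-01-03"),
        (1, "2020-01-17"),  # same user again: must not be counted twice
        (2, "2020-01-31"),
        (2, "2020-02-01"),
        (3, "2020-02-10"),
        (4, "2020-02-29"),
    ],
)

# CASE yields user_id inside the half-open date range and NULL outside it;
# COUNT(DISTINCT ...) ignores the NULLs, so each column counts unique users
# for one month only.
januari, februari = conn.execute(
    """
    SELECT
        COUNT(DISTINCT CASE WHEN log_date >= '2020-01-01' AND log_date < '2020-02-01'
                            THEN user_id END) AS januari,
        COUNT(DISTINCT CASE WHEN log_date >= '2020-02-01' AND log_date < '2020-03-01'
                            THEN user_id END) AS februari
    FROM user_log
    """
).fetchone()

print(januari, februari)  # prints: 2 3
```

January has three rows but only two distinct users, which is the difference between the original COUNT(CASE ...) and the DISTINCT version.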

Related

Avg function not returning proper value

I expect this query to give me the average number of daily active users per month, up to the current date (from October to December). But the result is approximately 164K when it should be 128K. Why is AVG not working? The average should be the sum of the daily values divided by the number of days in the month so far.
SELECT sq.month_year AS 'month_year', AVG(number)
FROM
(
SELECT CONCAT(MONTHNAME(date), "-", YEAR(DATE)) AS 'month_year', count(distinct id_user) AS number
FROM table1
WHERE date between '2020-10-01' and '2020-12-31 23:59:59'
GROUP BY EXTRACT(year_month FROM date)
) sq
GROUP BY 1
OK, thanks for your help. The problem was that the subquery was aggregating by month and not by day. I should count by day there and group by month in the outer query. This finally worked:
SELECT EXTRACT(YEAR_MONTH FROM sq.day_month) AS month_year, AVG(sq.number)
FROM (SELECT DATE(date) AS day_month,
             COUNT(DISTINCT id_user) AS number
      FROM table1
      WHERE date >= '2020-10-01' AND
            date < '2021-01-01'
      GROUP BY 1
     ) sq
GROUP BY EXTRACT(YEAR_MONTH FROM sq.day_month);
Do not use single quotes for column aliases!
SELECT sq.month_year, AVG(number)
FROM (SELECT CONCAT(MONTHNAME(date), '-', YEAR(DATE)) AS month_year,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY month_year
) sq
GROUP BY 1;
Note the fixes to the query:
The GROUP BY uses the same columns as the SELECT. Your query should return an error under ONLY_FULL_GROUP_BY, which is enabled by default since MySQL 5.7.5 (it runs without an error in older versions).
The date comparisons have been simplified.
No single quotes on column aliases.
Note that the outer query is not needed. I assume it is there just to illustrate the issue you are having.
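The day-then-month aggregation described in the self-answer can be tried on a few invented rows. This is a sketch only: it runs on SQLite through Python's sqlite3 module, so date() and strftime() stand in for MySQL's DATE() and EXTRACT(YEAR_MONTH ...), and the sample data is hypothetical.

```python
import sqlite3

# Hypothetical activity rows; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id_user INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [
        (1, "2020-10-01 08:00:00"),
        (2, "2020-10-01 09:00:00"),
        (1, "2020-10-01 21:00:00"),  # user 1 again on the same day
        (1, "2020-10-02 10:00:00"),
        (1, "2020-11-05 10:00:00"),
        (2, "2020-11-05 11:00:00"),
        (3, "2020-11-05 12:00:00"),
        (4, "2020-11-05 13:00:00"),
    ],
)

# Inner query: distinct users per calendar day.
# Outer query: average of those daily counts per month.
rows = conn.execute(
    """
    SELECT strftime('%Y-%m', day) AS month, AVG(dau) AS avg_dau
    FROM (SELECT date(date) AS day, COUNT(DISTINCT id_user) AS dau
          FROM table1
          WHERE date >= '2020-10-01' AND date < '2021-01-01'
          GROUP BY date(date)) AS sq
    GROUP BY month
    ORDER BY month
    """
).fetchall()

print(rows)  # prints: [('2020-10', 1.5), ('2020-11', 4.0)]
```

Note that days without any activity produce no row in the inner query, so this averages over active days only. If the divisor must be the number of calendar days elapsed in the month, the sum of the daily counts has to be divided by that number explicitly.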

MySQL query that is a bit tricky

Hi there, I want to write this query in MySQL.
Statement: For all the customers that transacted during 2017, what % made another transaction within 30 days?
Can you tell me how such a query can be designed?
This is the picture of the table to perform this query on:
Table name is: transactions
Just use lead() (available in MySQL 8.0 and later) to get the next transaction date for each row. Then aggregate at the customer level to determine whether any transaction in the time period is followed by another within 30 days for that customer.
Finally, aggregate again:
select avg(case when mindiff <= 30 then 1.0 else 0 end) as within_30days
from (select customerid, min(datediff(next_date, date)) as mindiff
      from (select t.*, lead(date) over (partition by customerid order by date) as next_date
            from transactions t
           ) t
      where date >= '2017-01-01' and date < '2018-01-01'
      group by customerid
     ) c;
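A small worked example may help to see what each level of the query does. The sketch below assumes a transactions table with only customerid and date columns (the original table picture is not available), uses invented rows, and runs on SQLite via Python's sqlite3 module, so a julianday() difference stands in for MySQL's datediff(). It needs an SQLite build with window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customerid INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [
        (1, "2017-01-10"), (1, "2017-01-25"),  # 15 days apart: counts
        (2, "2017-03-01"), (2, "2017-06-01"),  # 92 days apart: does not count
        (3, "2017-12-20"), (3, "2018-01-05"),  # follow-up falls in 2018: still counts
        (4, "2017-05-05"),                     # single transaction: does not count
        (5, "2016-12-20"), (5, "2018-02-01"),  # no 2017 transaction: excluded
    ],
)

# LEAD() is evaluated before the 2017 filter, so a follow-up transaction in
# early 2018 is still visible as next_date for a December 2017 row.
(share,) = conn.execute(
    """
    SELECT AVG(CASE WHEN mindiff <= 30 THEN 1.0 ELSE 0 END) AS within_30days
    FROM (SELECT customerid,
                 MIN(julianday(next_date) - julianday(date)) AS mindiff
          FROM (SELECT t.*,
                       LEAD(date) OVER (PARTITION BY customerid ORDER BY date) AS next_date
                FROM transactions t) AS with_next
          WHERE date >= '2017-01-01' AND date < '2018-01-01'
          GROUP BY customerid) AS c
    """
).fetchone()

print(share)  # prints: 0.5
```

Customers 1 and 3 qualify out of the four customers who transacted in 2017, hence 0.5. Multiply by 100 for a percentage.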

MySQL combine 2 different counts in one query

I have a table, that pretty much looks like this:
users (id INT, masterId INT, date DATETIME)
Every user has exactly one master. But masters can have n users.
Now I want to find out how many users each master has. I'm doing that this way:
SELECT `masterId`, COUNT(`id`) AS `total` FROM `users` GROUP BY `masterId` ORDER BY `total` DESC
But now I also want to know how many new users each master has gained in the last 14 days. I could do it with this query:
SELECT `masterId`, COUNT(`id`) AS `last14days` FROM `users` WHERE `date` > DATE_SUB(NOW(), INTERVAL 14 DAY) GROUP BY `masterId` ORDER BY `last14days` DESC
Now the question: Could I somehow get this information with one query, instead of using 2 queries?
You can use conditional aggregation to do this by counting only the rows for which the condition is true. In standard SQL this is done with a case expression inside the aggregate function:
SELECT
masterId,
COUNT(id) AS total,
SUM(CASE WHEN date > DATE_SUB(NOW(), INTERVAL 14 DAY) THEN 1 ELSE 0 END) AS last14days
FROM users
GROUP BY masterId
ORDER BY total DESC
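Here is a minimal sketch of that conditional aggregation on invented rows, run on SQLite through Python's sqlite3 module. Two things are assumptions made for reproducibility: a fixed reference timestamp replaces NOW(), and SQLite's datetime(x, '-14 days') stands in for DATE_SUB(x, INTERVAL 14 DAY).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, masterId INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, 10, "2024-03-10 12:00:00"),  # within the last 14 days
        (2, 10, "2024-01-01 12:00:00"),
        (3, 10, "2024-03-14 09:00:00"),  # within the last 14 days
        (4, 20, "2023-12-31 00:00:00"),
        (5, 20, "2024-02-01 00:00:00"),
    ],
)

# Hypothetical "current time", fixed so the result does not depend on the clock.
now = "2024-03-15 00:00:00"

# COUNT(id) counts every row of the group; SUM(CASE ...) adds 1 only for
# rows newer than the 14-day cutoff and 0 for all others.
rows = conn.execute(
    """
    SELECT masterId,
           COUNT(id) AS total,
           SUM(CASE WHEN date > datetime(?, '-14 days') THEN 1 ELSE 0 END) AS last14days
    FROM users
    GROUP BY masterId
    ORDER BY total DESC
    """,
    (now,),
).fetchall()

print(rows)  # prints: [(10, 3, 2), (20, 2, 0)]
```

Master 20 still appears with last14days = 0. The separate query with the WHERE filter would drop that master entirely, which is one more reason to prefer the single-query form.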

Mysql query to count result by removing time from datetime

I am trying to write a query that counts the records registered on the same date. The issue is that the created_date field is of type DATETIME.
Let me give you an example:
If 5 people registered on 2015-02-25 and 6 people registered on 2015-02-11, the output should be:
Sno. Date. count
1) 2015-02-25 5
2) 2015-02-11 6
Here is sample of attached database rows for better understanding
http://i.stack.imgur.com/iPeLl.png
SELECT date(created_at),count(*) FROM myTable GROUP BY date(created_at)
It might be the one that you expected.
SELECT
DATE_FORMAT(created_at, '%Y-%m-%d') AS Date,
COUNT(*) AS count
FROM table_name
GROUP BY DATE_FORMAT(created_at, '%Y-%m-%d')
ORDER BY Date DESC;
This query counts the people registered on each day, with the most recent date first.
Your query should be like this:
select date(created_at) created_at, count(*) from myTable
group by date(created_at)
To select between dates:
select date(created_at) created_at, count(*) from myTable
where date(created_at) >= '2015-02-11' and date(created_at) <= '2015-02-25'
group by date(created_at)
With BETWEEN:
select date(created_at) created_at, count(*) from myTable
where date(created_at) BETWEEN '2015-01-05' AND '2015-02-25'
group by date(created_at)
Reference: count()
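The answers above all group on the date part of the DATETIME value. A quick sketch with the numbers from the question (5 registrations on 2015-02-25 and 6 on 2015-02-11) shows the effect. It runs on SQLite through Python's sqlite3 module, where date() behaves like MySQL's DATE() for this purpose; the times of day are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (created_at TEXT)")

# Hypothetical registrations: different times of day on two dates.
rows_in = [("2015-02-25 %02d:15:00" % h,) for h in range(5)]
rows_in += [("2015-02-11 %02d:30:00" % h,) for h in range(6)]
conn.executemany("INSERT INTO myTable VALUES (?)", rows_in)

# date() strips the time part, so all rows of one calendar day
# fall into the same group.
result = conn.execute(
    """
    SELECT date(created_at) AS day, COUNT(*) AS registrations
    FROM myTable
    GROUP BY date(created_at)
    ORDER BY day DESC
    """
).fetchall()

print(result)  # prints: [('2015-02-25', 5), ('2015-02-11', 6)]
```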

MySQL - How can I improve these queries?

first one:
SELECT MONTH(timestamp) AS d, COUNT(*) AS c
FROM table
WHERE YEAR(timestamp)=2012 AND Status = 1
GROUP BY MONTH(timestamp)
One of the issues I'm facing with this one is that I have to run multiple queries that use different values for Status. Is there a way to combine them into one? For example, one column would hold the counts for Status=1, another column the counts for Status=2, etc.
second one:
SELECT COUNT(*) c , MONTH(timestamp) t FROM
(
SELECT t.adminid, timestamp
FROM table1 t
LEFT JOIN admins a ON a.adminID=t.adminID
WHERE YEAR(timestamp)=2012
GROUP BY t.adminID, DATE(Timestamp)
ORDER BY timestamp DESC
) AS a
GROUP BY MONTH(timestamp)
ORDER BY MONTH(timestamp) ASC;
This is a nested query, and I'm not sure if I can improve on it. I'm running this one on 2 tables, one with ~35k rows and one with ~300k rows. It takes about half a second for the first table and 4-5 seconds for the second.
These might help:
First one:
SELECT MONTH(timestamp) AS d,
sum(case when Status=1 then 1 else 0 end) as Status1Count,
sum(case when Status=2 then 1 else 0 end) as Status2Count,
sum(case when Status=3 then 1 else 0 end) as Status3Count
FROM `table`
WHERE timestamp between '2012-01-01 00:00:00' and '2012-12-31 23:59:59'
AND Status in (1,2,3)
GROUP BY MONTH(timestamp);
Second one:
Make sure that there is an index on the timestamp column, and make sure the WHERE clause does not apply any conversion function (e.g. MONTH(timestamp) or YEAR(timestamp)) to the indexed column, because that prevents MySQL from using the index. Something like:
SELECT COUNT(*) c , a.m as t FROM
(
SELECT t.adminid, timestamp, MONTH(timestamp) as m
FROM table1 t
LEFT JOIN admins a ON a.adminID=t.adminID
WHERE timestamp between '2012-01-01 00:00:00' and '2012-12-31 23:59:59'
GROUP BY t.adminID, DATE(Timestamp)
ORDER BY timestamp DESC
) AS a
GROUP BY a.m
ORDER BY a.m ASC;
The second one is a bit tricky since I do not have the data in front of me, so I can't see the DB access path!