Using an alias in where clause with a group by - mysql

Using a SQL query, I am trying to find the number of users that have had page views greater than 5 in a given month.
What I have so far does exactly that, except that I can't add the condition of a minimum of 5 page views. It currently shows the number of users who have had at least 1 page view in a given month.
SELECT CONCAT(MONTH(analytics.date),'/',YEAR(analytics.date)) AS DATE,
COUNT(analytics.id) AS views,
COUNT(DISTINCT users.id) AS num_users
FROM users
LEFT JOIN analytics ON users.id = analytics.user_id
WHERE users.banned = 0
AND analytics.id IS NOT NULL
GROUP BY YEAR(analytics.date), MONTH(analytics.date)
I tried adding AND views > 5 to the WHERE clause, but that fails with an unknown column error.
I don't think a HAVING clause will work as this is applied after the GROUP BY and I need to find individual users who have had more than 5 page views.
How else can I achieve this?

If this is your requirement, then you need to aggregate twice, once at the user level and second at the analytics level. Or, use a subquery in the where clause. Here is what you may need:
SELECT CONCAT(MONTH(a.date),'/',YEAR(a.date)) AS DATE,
COUNT(a.id) AS views,
COUNT(DISTINCT u.id) AS num_users
FROM users u LEFT JOIN
analytics a
ON u.id = a.user_id
WHERE u.banned = 0 AND a.id IS NOT NULL AND
5 <= (SELECT COUNT(*) FROM analytics a2 WHERE a2.user_id = u.id)
GROUP BY YEAR(a.date), MONTH(a.date);
This applies the limit to each user's overall view count, not to the count within a single month.
EDIT: To speed up the subquery, be sure you have an index on analytics(user_id, date).
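The "aggregate twice" option mentioned above can be sketched as follows. This is only a sketch: it assumes the users and analytics tables from the question and reads the requirement as "more than 5 views within the month". The names per_user, yr, and mon are made up for illustration. The inner query counts views per user per month and keeps only the users over the threshold; the outer query then counts those users per month.

```sql
-- Sketch only: two-level aggregation (per user per month, then per month).
SELECT CONCAT(per_user.mon, '/', per_user.yr) AS DATE,
       SUM(per_user.views) AS views,   -- views from qualifying users only
       COUNT(*) AS num_users           -- one inner row per qualifying user per month
FROM (
    SELECT YEAR(a.date) AS yr,
           MONTH(a.date) AS mon,
           a.user_id,
           COUNT(*) AS views
    FROM analytics a
    INNER JOIN users u ON u.id = a.user_id
    WHERE u.banned = 0
    GROUP BY YEAR(a.date), MONTH(a.date), a.user_id
    HAVING COUNT(*) > 5                -- threshold applied per user per month
) AS per_user
GROUP BY per_user.yr, per_user.mon;
```

Unlike the query in the question, views here counts only the page views of users who pass the threshold. Change the HAVING condition to >= 5 if "a minimum of 5" is what is wanted.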

You have to use a subquery for this, since you're selecting which users feed into the GROUP BY. Here, we do a subquery in the WHERE clause to ask for each row if the user has more than five entries in the analytics table.
SELECT CONCAT(MONTH(analytics.date),'/',YEAR(analytics.date)) AS DATE,
COUNT(analytics.id) AS views,
COUNT(DISTINCT users.id) AS num_users
FROM users
LEFT JOIN analytics ON users.id = analytics.user_id
WHERE users.banned = 0
AND (SELECT COUNT(*) FROM analytics AS a WHERE a.user_id = users.id) > 5
AND analytics.id IS NOT NULL
GROUP BY YEAR(analytics.date), MONTH(analytics.date)
If you want there to be more than 5 views for the user in the given month, then you have to modify your query and you'll need to use an inner join:
SELECT CONCAT(MONTH(analytics.date),'/',YEAR(analytics.date)) AS DATE,
COUNT(analytics.id) AS views,
COUNT(DISTINCT users.id) AS num_users
FROM users
JOIN analytics ON users.id = analytics.user_id
WHERE users.banned = 0
AND (SELECT COUNT(*) FROM analytics AS a WHERE a.user_id = users.id AND EXTRACT(YEAR_MONTH FROM a.date) = EXTRACT(YEAR_MONTH FROM analytics.date)) > 5
AND analytics.id IS NOT NULL
GROUP BY YEAR(analytics.date), MONTH(analytics.date)

Related

mysql - join table row based on id with the highest column value in the table - multiple joins and conditions

I searched on SO before asking and tried the things I found, but my case is more involved because of multiple joins and conditions, and I cannot get the correct results.
SQL Fiddle here with basic data entered.
The query below does not give the results I want, but gives an idea of what I am looking to achieve. I want to return 1 result per computer_id where time.capture_timestamp is between a specific start/end value and is the highest value in the table for that computer_id including that row's other column values. I have tried a few different things I found here on SO involving MAX() and subqueries, but can't seem to get what I am looking for.
SELECT
computers.computer_name,
users.username,
time.title,
time.capture_timestamp
FROM computers
INNER JOIN time
ON time.computer_id = computers.computer_id AND time.capture_timestamp >= 0 AND time.capture_timestamp <= 9999999999
INNER JOIN users
ON users.user_id = time.user_id
GROUP BY computers.computer_id
ORDER BY time.capture_timestamp DESC
The fiddle as is will return:
computer_name | username | title       | capture_timestamp
computer1     | user1    | some title  | 1595524341
computer2     | user3    | some title3 | 1595524331
while the result I would like is actually:
computer_name | username | title       | capture_timestamp
computer1     | user2    | some title2 | 1595524351
computer2     | user3    | some title3 | 1595524331
... based on the example values in the fiddle. Yes, the start/end time values include 'everything' in this example, but in use there would actually be a timestamp range provided.
Using ROW_NUMBER (window functions and CTEs require MySQL 8.0 or later):
WITH cte AS (
SELECT c.computer_name, u.username, t.title, t.capture_timestamp,
ROW_NUMBER() OVER (PARTITION BY c.computer_id
ORDER BY t.capture_timestamp DESC) rn
FROM computers c
INNER JOIN time t ON t.computer_id = c.computer_id
INNER JOIN users u ON u.user_id = t.user_id
WHERE t.capture_timestamp BETWEEN 0 AND 9999999999
)
SELECT computer_name, username, title, capture_timestamp
FROM cte
WHERE rn = 1;
You can add a correlated subquery to get the desired results:
select
computers.computer_name,
users.username,
t.title,
t.capture_timestamp
from computers
inner join time t
on t.computer_id = computers.computer_id
and t.capture_timestamp >= 0 and t.capture_timestamp <= 9999999999
inner join users
on users.user_id = t.user_id
where t.capture_timestamp =(
select max(capture_timestamp)
from time
where capture_timestamp >= 0 and capture_timestamp <= 9999999999
and t.computer_id = computer_id
)
order by t.capture_timestamp desc

How can I get customer data based on the number of users they have?

I want to get customer data from all the businesses with more than 1 user.
For this I think I need a subquery to count more than 1 user and then the outer query to give me their emails.
I have tried subqueries in the WHERE and HAVING clause
SELECT u.mail
FROM users u
WHERE count IN (
SELECT count (u.id_business)
FROM businesses b
INNER JOIN users u ON b.id = u.id_business
GROUP BY b.id, u.id_business
HAVING COUNT (u.id_business) >= 2
)
I believe that you do not need a subquery; everything can be achieved in a joined aggregate query with a HAVING clause. Joining users to itself on id_business gives one row per user in the same business (including the user itself), so a count of 2 or more means the business has more than one user:
SELECT u.mail
FROM users u
INNER JOIN users u2 ON u2.id_business = u.id_business
GROUP BY u.id, u.mail
HAVING COUNT(*) >= 2
NB: in case several users may have the same email, I have added the primary key of users to the GROUP BY clause (I assumed that the pk is called id); you may remove this if mail is a unique field in users.
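For comparison, the subquery form that the question was reaching for might look like the sketch below. It assumes, as the question suggests, that users.id_business references businesses.id and that the email column is called mail. The inner query finds the businesses with at least two users, and the outer query returns the emails of the users that belong to them.

```sql
-- Sketch only: emails of users whose business has two or more users.
SELECT u.mail
FROM users u
WHERE u.id_business IN (
    SELECT id_business
    FROM users
    GROUP BY id_business
    HAVING COUNT(*) >= 2   -- keep businesses with more than one user
);
```

The businesses table is not needed here unless other columns from it are wanted in the result.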

Retrieve data based on COUNT() and MONTH() in MySQL

I have users and likes tables. A foreign key in the latter references id in the users table. The task at hand is to retrieve all distinct users who have more than 100 likes in March 2018. I'm trying to extract date-related values from a column of type TIMESTAMP.
So far I have only managed to list the users who have any likes at all in that period:
SELECT DISTINCT u.name
FROM users AS u
JOIN likes AS l ON u.id = l.user_id
WHERE MONTH(l.timestamp) = 3 AND YEAR(l.timestamp) = 2018;
I guess I have to make use of COUNT() and GROUP BY somehow, but all my struggles were leading to syntax errors. Please give a hand.
You don't want select distinct. You want group by and having:
SELECT u.name
FROM users u JOIN
likes l
ON u.id = l.user_id
WHERE MONTH(l.timestamp) = 3 AND YEAR(l.timestamp) = 2018
GROUP BY u.name
HAVING COUNT(*) > 100;
To be honest, it is better to write the WHERE clause as:
WHERE l.timestamp >= '2018-03-01' AND l.timestamp < '2018-04-01'
This allows the SQL engine to use an index on timestamp, if one is available.
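Putting the two suggestions together gives something like the following sketch. Grouping by u.id as well as u.name is an extra assumption here (that id is the primary key of users), added so that two different users who happen to share a name are not merged into one group. The like_count alias is illustrative.

```sql
-- Sketch only: users with more than 100 likes in March 2018.
SELECT u.id, u.name, COUNT(*) AS like_count
FROM users u
JOIN likes l ON u.id = l.user_id
WHERE l.timestamp >= '2018-03-01'   -- half-open range covers all of March
  AND l.timestamp <  '2018-04-01'
GROUP BY u.id, u.name
HAVING COUNT(*) > 100;
```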

MySQL order by with count and group by issues

My Tables look like:
# Table user
user_id PK
...
# Table buy
buy_id PK
user_id FK
...
# Table offert
offert_id
user_id
...
I need to know the last 'buy' of one 'user' and get the count of 'offert' rows this 'user' has. I tried something like:
select b.buy_id,count(distinct c.offert_id) as cv from user a
inner join buy b using(user_id) left join offert c using(user_id) where a.user_id=4
group by a.user_id order by b.buy_id desc
but it always returns the first 'buy', not the last; it looks like the ORDER BY doesn't have any effect.
I know that I can do it with subqueries, but I would like to know if there is a way to do it without subqueries, maybe using MAX(), but I don't know how.
Thanks.
Your approach is simply not guaranteed to work. One big reason is that the GROUP BY is processed before the ORDER BY, so buy_id has already been taken from an arbitrary row of the group by the time the sort runs.
Assuming that you mean the biggest buy_id for each user, you can do this as:
select u.user_id, u.last_buy_id, count(distinct o.offert_id)
from (select u.*,
(select b.buy_id from buy b where b.user_id = u.user_id order by b.buy_id desc limit 1
) last_buy_id
from user u
) u left outer join
offert o
on o.user_id = u.user_id
group by u.user_id, u.last_buy_id;
The first subquery uses a correlated subquery to get the last buy id for each user. It then joins in offert and does the aggregation. Note that this version includes the user_id in the aggregation.
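Since the question asks for a version without subqueries, here is a hedged sketch using MAX(). It assumes, as above, that the "last" buy is the one with the largest buy_id; the column aliases are illustrative.

```sql
-- Sketch only: last buy and offert count for one user, no subquery.
SELECT a.user_id,
       MAX(b.buy_id) AS last_buy_id,       -- largest buy_id for the user
       COUNT(DISTINCT c.offert_id) AS cv   -- DISTINCT undoes the row fan-out of the joins
FROM user a
INNER JOIN buy b USING (user_id)
LEFT JOIN offert c USING (user_id)
WHERE a.user_id = 4
GROUP BY a.user_id;
```

The two joins pair every buy with every offert of the user, which multiplies rows. That is harmless for MAX() and is compensated for by COUNT(DISTINCT ...); a plain COUNT(c.offert_id) would overcount.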

MySQL AVG(TIMESTAMPDIFF) with GROUP BY

I have two tables user (one) and transaction (many) and I need to get the average time in days from when a user was created to when they made their first transaction. I'm using AVG(TIMESTAMPDIFF) which is working well, except that the GROUP BY returns an average against every user instead of one single average for all unique users in the transaction table. If I remove the GROUP BY, I get a single average figure but it takes into account multiple transactions from users, whereas I just want to have one per user (the first they made).
Here's my SQL:
SELECT AVG(TIMESTAMPDIFF(DAY, u.date_created, t.transaction_date)) AS average
FROM transaction t
LEFT JOIN user u ON u.id = t.user_id
WHERE t.user_id IS NOT NULL AND t.status = 1
GROUP BY t.user_id;
I'd appreciate it if someone can help me return the average for unique users only. It's fine to break the query down into two, but the tables are large so returning lots of data and putting it back in is a no-go. Thanks in advance.
SELECT AVG(TIMESTAMPDIFF(DAY, s.date_created, s.first_transaction)) AS average
FROM (
SELECT u.date_created, MIN(t.transaction_date) AS first_transaction
FROM transaction t
INNER JOIN user u ON u.id = t.user_id
WHERE t.status = 1
GROUP BY t.user_id, u.date_created
) s
I replaced the LEFT JOIN with an INNER JOIN because I think that's what you want, but it's not 100% equivalent to your WHERE t.user_id IS NOT NULL.
Feel free to put the LEFT JOIN back if need be.
select avg( TIMESTAMPDIFF(DAY, u.date_created, min_tdate) ) as average
from user u
inner join
(select t.user_id, min(t.transaction_date) as min_tdate
from transaction t
where t.status=1
group by t.user_id
) as min_t
on u.id=min_t.user_id;