I have the tables users and statuses. I want to select all the users, plus any statuses they might have, but only the most recent status for each user.
Here is the code that doesn't work:
SELECT users.id, alias, gender, login, logout, users.create_date, statustext as statustxt, TIMESTAMPDIFF(YEAR,birthdate,CURDATE()) AS age
FROM users
LEFT JOIN statuses s ON users.id = s.user_id
WHERE s.ID = (
SELECT MAX(s2.ID)
FROM statuses s2
WHERE s2.user_id = s.user_id
)
This gets the users that have statuses, each with their most recent status, but it leaves out the users that have no status at all. Maybe it can be fixed with some small adjustment?
I got the subquery by searching, but I don't understand how that code works. It seems to compare two versions of the same table (for example: WHERE s2.user_id = s.user_id). Where can I read about this sort of technique?
Is a subquery required in this case, by the way?
If you can find a solution, that would be great, and a basic explanation of how it works would be highly appreciated.
----------EDIT---------------
I took one of the responses (by maresa) and combined it with the subquery from my initial code, and this works(!). It has 3 SELECTs and looks a bit overcomplicated, maybe?
SELECT users.id, alias, gender, login, logout, users.create_date, statustext as statustxt, TIMESTAMPDIFF(YEAR,birthdate,CURDATE()) AS age
FROM users
LEFT JOIN (
SELECT user_id, statustext FROM statuses s
WHERE s.ID = (
SELECT MAX(s2.ID)
FROM statuses s2
WHERE s2.user_id = s.user_id
)
) as s ON users.id = s.user_id
I've encountered a similar problem. This post is relevant: http://www.microshell.com/database/sql/optimizing-sql-that-selects-the-maxminetc-from-a-group/.
Regarding your specific query: since you care only about the latest status, you first want to get the latest status for each user. Assuming that the latest status has the highest ID (based on your sample), the SQL would be as below:
SELECT
MAX(ID) AS ID, user_id -- statustext is left out: a non-aggregated column here is not guaranteed to come from the MAX(ID) row
FROM
statuses
GROUP BY
user_id
What the above query does is get the latest status ID per user_id. Once you have that, you can think of it as if it were a table. Then simply join on this "table" (the query) instead of the real one (the statuses table), and join statuses once more on that ID to fetch the text. Therefore, your query would look like this:
SELECT
users.id, alias, gender, login, logout, users.create_date, statustext as statustxt, TIMESTAMPDIFF(YEAR,birthdate,CURDATE()) AS age
FROM
users
LEFT JOIN (
SELECT
MAX(ID) AS ID, user_id
FROM
statuses
GROUP BY
user_id
) as s ON users.id = s.user_id
LEFT JOIN statuses ON statuses.ID = s.ID -- EDIT: Added this line.
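To see that joining on the grouped subquery keeps users who have no status, here is a minimal sketch. It runs against an in-memory SQLite database through Python's sqlite3 module purely so it can be executed as-is (the answer above targets MySQL); the sample rows and the reduced column list are made up for illustration.

```python
import sqlite3

# Toy data (hypothetical): ann has two statuses, bob has one, cat has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, alias TEXT);
    CREATE TABLE statuses (ID INTEGER PRIMARY KEY, user_id INTEGER, statustext TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO statuses VALUES
        (1, 1, 'ann old'), (2, 2, 'bob only'), (3, 1, 'ann new');
""")

# Step 1: the derived table s holds one row per user with that user's highest status ID.
# Step 2: statuses is joined again on that ID to pick up the matching text.
latest_status_rows = conn.execute("""
    SELECT users.id, users.alias, statuses.statustext
    FROM users
    LEFT JOIN (
        SELECT MAX(ID) AS ID, user_id
        FROM statuses
        GROUP BY user_id
    ) AS s ON users.id = s.user_id
    LEFT JOIN statuses ON statuses.ID = s.ID
    ORDER BY users.id
""").fetchall()

for row in latest_status_rows:
    print(row)
```

Because both joins are LEFT JOINs and nothing in a WHERE clause touches the status columns, the user without a status still appears, with a NULL status text.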
You might use a correlated subselect in the join condition, limited to a single row (this assumes statuses has a status_date column):
SELECT
users.id,
alias,
gender,
login,
logout,
users.create_date,
statustext as statustxt,
TIMESTAMPDIFF(YEAR,birthdate,CURDATE()) AS age
FROM users
LEFT JOIN statuses s ON s.ID = (
-- a plain derived table cannot reference users.id (that needs LATERAL, MySQL 8.0.14+), so the correlation goes in the ON clause
SELECT s1.ID
FROM statuses s1
WHERE s1.user_id = users.id
ORDER BY s1.status_date DESC
LIMIT 1
)
You could use an IN clause with a tuple, placed in the join condition so that users without a status are kept:
SELECT
users.id
, alias
, gender
, login
, logout
, users.create_date
, statustext as statustxt
, TIMESTAMPDIFF(YEAR,birthdate,CURDATE()) AS age
FROM users
LEFT JOIN statuses s ON users.id = s.user_id
AND (s.user_id, s.ID) in (
SELECT user_id, MAX(s2.ID)
FROM statuses s2
group by user_id
)
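One thing worth checking with the tuple approach is where the test is placed: in a WHERE clause it also discards users that have no status row, while in the ON clause of the LEFT JOIN they are kept. The sketch below runs the same query both ways on made-up data, using an in-memory SQLite database via Python's sqlite3 only so that it is runnable as-is.

```python
import sqlite3

# Toy data (hypothetical): ann has two statuses, bob has one, cat has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, alias TEXT);
    CREATE TABLE statuses (ID INTEGER PRIMARY KEY, user_id INTEGER, statustext TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
    INSERT INTO statuses VALUES
        (1, 1, 'ann old'), (2, 2, 'bob only'), (3, 1, 'ann new');
""")

# {placement} is filled with WHERE or AND, which moves the same tuple test
# between the WHERE clause and the ON clause of the LEFT JOIN.
query = """
    SELECT users.id, s.statustext
    FROM users
    LEFT JOIN statuses s ON users.id = s.user_id
    {placement} (s.user_id, s.ID) IN (
        SELECT s2.user_id, MAX(s2.ID)
        FROM statuses s2
        GROUP BY s2.user_id
    )
    ORDER BY users.id
"""

rows_where = conn.execute(query.format(placement="WHERE")).fetchall()
rows_on = conn.execute(query.format(placement="AND")).fetchall()

print("WHERE:", rows_where)  # user 3 is filtered out
print("ON:   ", rows_on)     # user 3 is kept with a NULL status
```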
I have two tables
users: id, email, firstName, lastName
subscriptions: id, userId, currentPeriodStart, currentPeriodEnd
The query below just shows how the two tables are related. I want to return subscriptions that expire after 1565827199, but it needs to check against each user's most recent subscription.
select
u.id
from users u
join subscriptions s on u.id = s.userId
where s.currentPeriodEnd > 1565827199
ORDER BY u.lastName ASC
A user may have multiple subscriptions in the subscriptions table. What I need to do is modify the query above, so it checks against that user's most recent subscription and not the first one it finds.
select * from subscriptions ORDER BY currentPeriodEnd DESC LIMIT 1
I've tried a few different things (alias table, sub query) I found elsewhere on stackoverflow without any luck.
You can filter with a correlated subquery, like so:
select u.*, s.*
from users u
inner join subscriptions s on u.id = s.userId
where s.currentPeriodEnd = (
select max(s1.currentPeriodEnd)
from subscriptions s1
where s1.userId = u.id and s1.currentPeriodEnd > 1565827199
)
order by u.lastName
For performance, consider an index on subscriptions(userId, currentPeriodEnd).
Alternatively, if you are running MySQL 8.0, you can use row_number():
select *
from (
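select
u.id, u.email, u.firstName, u.lastName, -- explicit columns: u.* plus s.* would give the derived table two id columns
s.currentPeriodStart, s.currentPeriodEnd,
row_number() over(partition by u.id order by s.currentPeriodEnd desc) as rn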
from users u
inner join subscriptions s on u.id = s.userId
where s.currentPeriodEnd > 1565827199
) t
where rn = 1
order by lastName
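As a rough illustration of the row_number() approach, the sketch below numbers each user's qualifying subscriptions from newest to oldest and keeps only number 1. It uses an in-memory SQLite database through Python's sqlite3 (window functions need SQLite 3.25 or later) just to be runnable; the rows and the trimmed-down columns are invented.

```python
import sqlite3

# Toy data (hypothetical): Zed has two subscriptions past the cutoff,
# Abe has only an expired one, Moe has one expired and one still valid.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, lastName TEXT);
    CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, userId INTEGER, currentPeriodEnd INTEGER);
    INSERT INTO users VALUES (1, 'Zed'), (2, 'Abe'), (3, 'Moe');
    INSERT INTO subscriptions VALUES
        (1, 1, 1566500000), (2, 1, 1570000000),
        (3, 2, 1560000000),
        (4, 3, 1564000000), (5, 3, 1566000000);
""")

# rn = 1 marks the newest remaining subscription of each user.
ranked_rows = conn.execute("""
    SELECT id, lastName, currentPeriodEnd
    FROM (
        SELECT u.id, u.lastName, s.currentPeriodEnd,
               ROW_NUMBER() OVER (PARTITION BY u.id ORDER BY s.currentPeriodEnd DESC) AS rn
        FROM users u
        INNER JOIN subscriptions s ON u.id = s.userId
        WHERE s.currentPeriodEnd > 1565827199
    ) t
    WHERE rn = 1
    ORDER BY lastName
""").fetchall()

for row in ranked_rows:
    print(row)
```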
Join with a subquery that gets the latest time for each user, and filters it down to just the ones after your specified timestamp.
select u.id
from users u
join (
select userid
FROM subscriptions
GROUP BY userid
HAVING MAX(currentPeriodEnd) > 1565827199
) s ON s.userid = u.id
ORDER BY u.lastName ASC
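The GROUP BY ... HAVING variant can be tried on toy data as follows. This is only a sketch: it runs on an in-memory SQLite database via Python's sqlite3 so it works without a MySQL server, and the sample users and timestamps are made up.

```python
import sqlite3

# Toy data (hypothetical): only Abe's latest subscription ends before the cutoff.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, lastName TEXT);
    CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, userId INTEGER, currentPeriodEnd INTEGER);
    INSERT INTO users VALUES (1, 'Zed'), (2, 'Abe'), (3, 'Moe');
    INSERT INTO subscriptions VALUES
        (1, 1, 1566500000), (2, 1, 1570000000),
        (3, 2, 1560000000),
        (4, 3, 1564000000), (5, 3, 1566000000);
""")

# The subquery collapses each user's subscriptions to one group and keeps the
# group only if its latest currentPeriodEnd lies after the cutoff.
active_user_ids = conn.execute("""
    SELECT u.id
    FROM users u
    JOIN (
        SELECT userId
        FROM subscriptions
        GROUP BY userId
        HAVING MAX(currentPeriodEnd) > 1565827199
    ) s ON s.userId = u.id
    ORDER BY u.lastName ASC
""").fetchall()

print(active_user_ids)
```

Since the subquery returns at most one row per user, the outer join cannot duplicate users, which is why no DISTINCT is needed.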
I have the table "user" (primary key is id) and the table "user_meta" (primary key is user_id and valid_from).
The user table contains basic user data, e.g. username, password, etc.
The user_meta table contains data that may change, e.g. lastname, gender (yeah, it's 2018 :D), etc.
So I have a history of which data is valid from which day.
My problem is that I'm trying to select all users with their currently valid data, but I keep failing...
How can I select the correct data?
For one user I can simply use
"select * from user_meta
JOIN user on user_meta.user_id = user.id
ORDER BY valid_from DESC LIMIT 1"
but how does it work with multiple/all users?
greetings,
False
You could join on a subselect that gets the maximum valid_from per user:
select * from user_meta
inner join (
select user.id, max(user_meta.valid_from) max_valid
from user_meta
JOIN user on user_meta.user_id = user.id
group by user.id
) t on t.id= user_meta.user_id and t.max_valid = user_meta.valid_from
or, more simply:
select * from user_meta
inner join (
select user_meta.user_id, max(user_meta.valid_from) max_valid
from user_meta
group by user_meta.user_id
) t on t.user_id= user_meta.user_id and t.max_valid = user_meta.valid_from
You probably want something along these lines:
SELECT u.*, um.*
FROM user u
INNER JOIN user_meta um
ON u.id = um.user_id
INNER JOIN
(
SELECT user_id, MAX(valid_from) AS max_valid_from
FROM user_meta
GROUP BY user_id
) t
ON um.user_id = t.user_id AND
um.valid_from = t.max_valid_from;
Not much to explain here, except that the subquery aliased as t will filter off all metadata records except for the latest one, for each user.
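For a quick check of that filtering behaviour, here is a small sketch on invented data. It uses Python's sqlite3 with an in-memory database only to be self-contained; dates are stored as ISO text, which is an assumption of the example rather than something stated in the question.

```python
import sqlite3

# Toy data (hypothetical): alice changed her lastname once, bob never did.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE user_meta (
        user_id INTEGER, valid_from TEXT, lastname TEXT,
        PRIMARY KEY (user_id, valid_from)
    );
    INSERT INTO user VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO user_meta VALUES
        (1, '2017-01-01', 'Smith'), (1, '2018-06-01', 'Jones'),
        (2, '2018-02-01', 'Brown');
""")

# t holds the newest valid_from per user; joining user_meta against it
# discards every older metadata row.
current_meta_rows = conn.execute("""
    SELECT u.id, u.username, um.lastname, um.valid_from
    FROM user u
    INNER JOIN user_meta um ON u.id = um.user_id
    INNER JOIN (
        SELECT user_id, MAX(valid_from) AS max_valid_from
        FROM user_meta
        GROUP BY user_id
    ) t ON um.user_id = t.user_id AND um.valid_from = t.max_valid_from
    ORDER BY u.id
""").fetchall()

for row in current_meta_rows:
    print(row)
```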
One table is Users with id and email columns.
Another table is Payments with id, created_at, user_id and foo columns.
User has many Payments.
I need a query that returns each user's email, his last payment date and this last payment's foo value. How do I do that? What I have now is:
SELECT users.email, MAX(payments.created_at), payments.foo
FROM users
JOIN payments ON payments.user_id = users.id
GROUP BY users.id
This is wrong, because foo value does not necessarily belong to user's most recent payment.
Try this:
select users.email, payments.foo, payments.created_at
from users
left join (
select a.* from payments a
inner join (
select user_id, max(created_at) as max_created_at
from payments
group by user_id
) b on a.user_id = b.user_id and a.created_at = b.max_created_at
) payments on users.id = payments.user_id
If a user has no payment yet, then foo and created_at will be NULL. If you want to exclude users who have no payments, use INNER JOIN instead.
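A sketch of this kind of latest-row-per-user join on made-up data is shown below: the inner subquery finds each user's newest created_at, and the payment row with that timestamp supplies foo. It runs on an in-memory SQLite database through Python's sqlite3 only so it can be executed directly; the emails, dates and the alias p are illustrative.

```python
import sqlite3

# Toy data (hypothetical): user 1 has two payments, user 2 has one, user 3 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, created_at TEXT, user_id INTEGER, foo TEXT);
    INSERT INTO users VALUES (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO payments VALUES
        (1, '2020-01-01', 1, 'old'), (2, '2020-03-01', 1, 'new'),
        (3, '2020-02-01', 2, 'only');
""")

# b: newest created_at per user; a: the full payment row carrying that timestamp.
last_payment_rows = conn.execute("""
    SELECT users.email, p.created_at, p.foo
    FROM users
    LEFT JOIN (
        SELECT a.user_id, a.created_at, a.foo
        FROM payments a
        INNER JOIN (
            SELECT user_id, MAX(created_at) AS max_created_at
            FROM payments
            GROUP BY user_id
        ) b ON a.user_id = b.user_id AND a.created_at = b.max_created_at
    ) p ON users.id = p.user_id
    ORDER BY users.id
""").fetchall()

for row in last_payment_rows:
    print(row)
```

If two payments of one user share the same created_at, this pattern returns both; breaking the tie on the payment id would be one way to avoid that.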
One approach would be to use a MySQL version of rank over partition and then select only those rows with rank = 1:
select tt.email,tt.created_at,tt.foo from (
select t.*,
case when @cur_id = t.id then @r:=@r+1 else @r:=1 end as rnk, -- "rank" is a reserved word in MySQL 8.0
@cur_id := t.id
from (
SELECT users.id,users.email, payments.created_at, payments.foo
FROM users
JOIN payments ON payments.user_id = users.id
order by users.id asc,payments.created_at desc
) t
JOIN (select @cur_id:=-1,@r:=0) r
) tt
where tt.rnk = 1;
This would save hitting the payments table twice. Could be slower though. Depends on your data!
I need to count the number of users that have answered all 3 of those profile_options (so they have at least 3 records in the profile_answers table).
SELECT COUNT(DISTINCT(users.id)) users_count
FROM users
INNER JOIN profile_answers ON profile_answers.user_id = users.id
WHERE profile_answers.profile_option_id IN (37,86,102)
GROUP BY users.id
HAVING COUNT(DISTINCT(profile_answers.id))>=3
The problem is that this query returns a table with one row for each user and how many options they answered (in this case always 3). What I need is to return just one row that has the total number of users (so the sum of all rows of this example).
I know how to do it with another subquery, but the problem is that I am running into "Mysql::Error: Too high level of nesting for select".
Is there a way to do this without the extra subquery?
SELECT SUM(sum_sub.users_count) FROM (
(SELECT COUNT(DISTINCT(users.id)) users_count
FROM users
INNER JOIN profile_answers ON profile_answers.user_id = users.id
WHERE profile_answers.profile_option_id IN (37,86,102)
GROUP BY users.id
HAVING COUNT(DISTINCT(profile_answers.id))>=3)
) sum_sub
Please give this query a shot:
SELECT COUNT(DISTINCT(u.id)) AS users_count
FROM users AS u
INNER JOIN (
SELECT user_id, COUNT(DISTINCT profile_option_id) AS total
FROM profile_answers
WHERE profile_option_id IN (37,86,102)
GROUP BY user_id
HAVING COUNT(DISTINCT profile_option_id) = 3
) AS a ON a.user_id = u.id
If you have lots of data in your tables, you may get better performance by using a temporary table, like so:
CREATE TEMPORARY TABLE a (KEY(user_id)) ENGINE = MEMORY
SELECT user_id, COUNT(DISTINCT profile_option_id) AS total
FROM profile_answers
WHERE profile_option_id IN (37,86,102)
GROUP BY user_id
HAVING COUNT(DISTINCT profile_option_id) = 3;
Then your final query will look like this
SELECT COUNT(DISTINCT(u.id)) as users_count
FROM a
INNER JOIN users AS u ON a.user_id = u.id
Unless there is a need to join the users table you can go with this
SELECT COUNT(*) AS users_count
FROM (
SELECT user_id, COUNT(DISTINCT profile_option_id) AS total
FROM profile_answers
WHERE profile_option_id IN (37,86,102)
GROUP BY user_id
HAVING COUNT(DISTINCT profile_option_id) = 3
) AS a
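To check the counting logic on a small case, here is a sketch with invented answers. It uses an in-memory SQLite database via Python's sqlite3 only to be runnable without a MySQL server, and it groups by user_id, the only user column available inside the subquery.

```python
import sqlite3

# Toy data (hypothetical):
#   user 1 answered 37, 86, 102           -> counted
#   user 2 answered 37, 86                -> not counted
#   user 3 answered 37, 86, 102 and 5     -> counted
#   user 4 answered 37 twice and 86       -> not counted (only 2 distinct options)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profile_answers (id INTEGER PRIMARY KEY, user_id INTEGER, profile_option_id INTEGER);
    INSERT INTO profile_answers VALUES
        (1, 1, 37), (2, 1, 86), (3, 1, 102),
        (4, 2, 37), (5, 2, 86),
        (6, 3, 37), (7, 3, 86), (8, 3, 102), (9, 3, 5),
        (10, 4, 37), (11, 4, 37), (12, 4, 86);
""")

# Inner query: one row per user who answered all three options.
# Outer query: count those rows, giving a single number.
users_count = conn.execute("""
    SELECT COUNT(*) AS users_count
    FROM (
        SELECT user_id
        FROM profile_answers
        WHERE profile_option_id IN (37, 86, 102)
        GROUP BY user_id
        HAVING COUNT(DISTINCT profile_option_id) = 3
    ) AS a
""").fetchone()[0]

print(users_count)
```

Counting distinct profile_option_id values rather than rows is what keeps user 4, who answered one option twice, out of the total.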
Should you need another solution, please consider providing your EXPLAIN EXTENDED output for the query and the table definitions, along with a better problem description.
I hope this helps
You can give the queries a name using the AS clause. See the updated query below.
SELECT SUM(sum_sub.users_count) FROM (
(SELECT COUNT(DISTINCT(users.id)) as users_count
FROM users
INNER JOIN profile_answers ON profile_answers.user_id = users.id
WHERE profile_answers.profile_option_id IN (37,86,102)
GROUP BY users.id
HAVING COUNT(DISTINCT(profile_answers.id))>=3)
) as sum_sub
You should not GROUP BY a field that is not present in the SELECT list.
select id, count(*) from users group by id is fine
select count(id) from users group by id is NOT
Regarding your query I think the link to user table is not necessary. Just using foreign key should be fine.
Try this one:
select count(*) from
(SELECT user_id, count(*) as cnt
FROM profile_answers
WHERE profile_option_id IN (37,86,102)
group by user_id
having count(*) >= 3) as t
Need a little bit of help with a query.
SELECT id,email FROM user_info WHERE username!=''
AND email='example@gmail.com'
GROUP BY email ORDER BY id DESC LIMIT 1
What this does currently is fetch a distinct email but not the newest ID for an account under this email.
It sounds like you can apply an aggregate to the id field and it will return the most recent id for each email:
SELECT Max(id), email
FROM user_info
WHERE username!=''
AND email='example@gmail.com'
GROUP BY email
Unfortunately, when you apply a GROUP BY without an aggregate there is no guarantee what id will be returned unless you specify to select the max.
If you want to return the username that is associated with the max(id), then you can use a subquery:
SELECT i.MaxId,
i.email,
u.username
FROM user_info u
inner join
(
select max(id) MaxId, email
from user_info
WHERE username!=''
AND email='example@gmail.com'
group by email
) i
on u.id = i.maxid
and u.email = i.email
WHERE u.username!=''
AND u.email='example@gmail.com';
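The join-back pattern can be sketched on invented rows as below. To show the per-email grouping, the example drops the single-email filter and covers two addresses; it runs on an in-memory SQLite database through Python's sqlite3 just so it is executable as-is.

```python
import sqlite3

# Toy data (hypothetical): three accounts share x@example.com, one of them
# with an empty username; y@example.com has a single account.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_info (id INTEGER PRIMARY KEY, username TEXT, email TEXT);
    INSERT INTO user_info VALUES
        (1, 'a', 'x@example.com'), (2, 'b', 'x@example.com'),
        (3, '',  'x@example.com'), (4, 'c', 'y@example.com');
""")

# i: highest id per email among rows with a non-empty username.
# Joining back on that id fetches the username belonging to exactly that row.
newest_rows = conn.execute("""
    SELECT i.MaxId, i.email, u.username
    FROM user_info u
    INNER JOIN (
        SELECT MAX(id) AS MaxId, email
        FROM user_info
        WHERE username != ''
        GROUP BY email
    ) i ON u.id = i.MaxId AND u.email = i.email
    ORDER BY i.email
""").fetchall()

for row in newest_rows:
    print(row)
```

Row 3 has the highest id for x@example.com overall, but it is excluded by the username filter, so id 2 is reported for that address.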
If by "newest id" you mean the largest value, then here's one approach that makes use of MySQL user variables to retain the value from previous rows, so a comparison can be made:
SELECT IF(u.email=@prev_email,1,0) AS dup_email_ind
, u.id
, u.username
, @prev_email := u.email AS email
FROM user_info u
CROSS
JOIN ( SELECT @prev_email := NULL) i
WHERE u.username != ''
AND u.email = 'example@gmail.com'
ORDER
BY u.email DESC
, u.id DESC
, u.username DESC
This returns all of the rows, with an indicator of whether the row is considered an older "duplicate" email or not. Rows that have dup_email_ind = 1 are identified as older duplicates, dup_email_ind = 0 indicates that this row is the latest row (the row with the largest id value) for a given email value.
(Usually, when I'm looking for duplicates like this, it's helpful for me to return both, or all, of the rows that are "duplicates".)
To return only the rows with the "newest id", wrap the query above (as an inline view) in another query. (The output from the inner query is used as a row source for the outer query.)
SELECT d.*
FROM (
-- the query above gets put here
) d
WHERE d.dup_email_ind = 0
Another approach is to use correlated subqueries in the SELECT list, although this is really only suitable for returning small sets. (This approach can have serious performance issues with large sets.)
SELECT ( SELECT u1.id
FROM user_info u1
WHERE u1.email = e.email
AND u1.username != ''
ORDER BY u1.id DESC
LIMIT 1
) AS id
, ( SELECT u2.username
FROM user_info u2
WHERE u2.email = e.email
AND u2.username != ''
ORDER BY u2.id DESC
LIMIT 1
) AS username
, e.email
FROM ( SELECT u.email
FROM user_info u
WHERE u.email = 'example@gmail.com'
AND u.username != ''
GROUP BY u.email
) e