Transform sub-queries to joins - mysql

I have three tables: userProfiles, loginTimes, orders.
I am trying to get each user's profile row, his last login time, and his last order row.
Here's my query:
Select u.*, t.loginTime, orders.* From userProfiles u
Inner Join
(Select userId, MAX(time) loginTime From loginTimes Group By userID) t
On u.userId = t.userID
Inner Join
(Select userId, MAX(enterDate) orderDate From orders Group By userId) o
On u.userID = o.userID
Inner Join
orders On orders.userId = u.userId And orders.enterDate = o.orderDate
Is there any way to rewrite this without so many subqueries?

I think this is the query you are going for. It still requires two subqueries, but I don't believe your original query functioned as intended.
You could remove the loginTimes subquery and use MAX(time) in the outer SELECT list, but then you'd need to GROUP BY every column of the orders table, which is arguably just as unclean.
The following query retrieves the userId, the latest loginTime, and the entire order record for the user's most recent order:
SELECT u.userId,
       u.userName,
       l.loginTime,
       o.*
FROM userProfiles u
INNER JOIN (SELECT userId,
                   MAX(time) AS loginTime
            FROM loginTimes
            GROUP BY userId) l ON u.userId = l.userId
-- ROW_NUMBER() requires MySQL 8.0 or later
INNER JOIN (SELECT *,
                   ROW_NUMBER() OVER (PARTITION BY userId
                                      ORDER BY enterDate DESC) AS rowNum
            FROM orders) o ON u.userId = o.userId AND o.rowNum = 1
Working on SQLFiddle

You can easily re-write the aggregation with the following.
-- Aggregation
(
SELECT
T.userId,
MAX(time) as loginTime,
MAX(enterDate) as orderDate
FROM
loginTimes as T INNER JOIN orders as O
ON
T.userId = O.userId
GROUP BY
T.userId
)
However, I do not understand why you are calculating MAX(enterDate) and not using it.
Joining the two tables without aggregation is easy as well. You should stay away from using *; it is just wasted overhead if not all of the fields are being used.
SELECT
U.*, O.*
FROM
userProfiles as U
INNER JOIN
orders as O
ON O.userId = U.userId
Please explain what you are trying to return as values from the Query. What is the business logic?

I believe this will do:
SELECT u.userID
,u.otherColumn
,MAX(t.time) AS loginTime
,MAX(o.enterDate) AS orderDate
FROM userProfiles u
JOIN loginTimes t ON t.userID = u.userID
JOIN orders o ON o.userID = u.userID
GROUP BY u.userID, u.otherColumn
For every other column in userProfiles you add to the SELECT clause, you need to add it to the GROUP BY clause as well.
Update:
Just because it can be done, I tried it without any subquery :)
SELECT u.userID
,MAX(t.time) AS loginTime
,o.*
FROM userProfiles u
JOIN loginTimes t ON t.userID = u.userID
JOIN orders o ON o.userID = u.userID
LEFT JOIN orders o1 ON o.userID = o1.userID AND o.enterDate < o1.enterDate
WHERE o1.orderID IS NULL
GROUP BY u.userID
,o.* -- write out the fields here
You'll also have to list in the GROUP BY clause every orders column that you put in the SELECT clause.
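To see the no-subquery idea in action, here is a minimal sketch run against an in-memory SQLite database from Python rather than MySQL. The schema is a guess based on the question (orderId, userName and the sample rows are invented for illustration), so treat it as a toy check of the self-join trick, not as the OP's real tables.

```python
import sqlite3

# Hypothetical miniature schema; column names follow the question,
# the sample rows are invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE userProfiles (userId INTEGER PRIMARY KEY, userName TEXT);
CREATE TABLE loginTimes (userId INTEGER, time TEXT);
CREATE TABLE orders (orderId INTEGER PRIMARY KEY, userId INTEGER, enterDate TEXT);
INSERT INTO userProfiles VALUES (1, 'ann'), (2, 'bob');
INSERT INTO loginTimes VALUES (1, '2020-01-01'), (1, '2020-03-01'), (2, '2020-02-01');
INSERT INTO orders VALUES (10, 1, '2020-01-05'), (11, 1, '2020-02-05'), (12, 2, '2020-01-09');
""")

# The LEFT JOIN to o1 looks for a later order of the same user;
# keeping only rows where none exists leaves each user's latest order.
latest_rows = conn.execute("""
SELECT u.userId, MAX(t.time) AS loginTime, o.orderId, o.enterDate
FROM userProfiles u
JOIN loginTimes t ON t.userId = u.userId
JOIN orders o ON o.userId = u.userId
LEFT JOIN orders o1 ON o1.userId = o.userId AND o.enterDate < o1.enterDate
WHERE o1.orderId IS NULL
GROUP BY u.userId, o.orderId, o.enterDate
ORDER BY u.userId
""").fetchall()

for row in latest_rows:
    print(row)
```

With this sample data each user comes back once, paired with the latest login and the latest order. If two orders of one user share the same enterDate, both survive the anti-join, so a tie-breaker column would be needed in that case.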

Related

Subquery left join refer to parent ID

I am trying to make a query to fetch the newest car for each user:
select * from users
left join
(select cars.* from cars
where cars.userid=users.userid
order by cars.year desc limit 1) as cars
on cars.userid=users.userid
It fails with the error Unknown column "users.userid" in where clause.
I tried to remove cars.userid=users.userid part, but then it only fetches 1 newest car, and sticks it on to each user.
Is there any way to accomplish what I'm after? thanks!!
For this purpose, I usually use row_number():
select *
from users u left join
(select c.* , row_number() over (partition by c.userid order by c.year desc) as seqnum
from cars c
) c
on c.userid = u.userid and c.seqnum = 1;
One option is to filter the left join with a subquery:
select * -- better enumerate the columns here
from users u
left join cars c
on c.userid = u.userid
and c.year = (select max(c1.year) from cars c1 where c1.userid = c.userid)
For performance, consider an index on cars(userid, year).
Note that this might return multiple cars per user if you have duplicate (userid, year) in cars. It would be better to have a real date rather than just the year.
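A quick way to convince yourself of the tie caveat is a throwaway test. The sketch below uses Python's sqlite3 with an in-memory database instead of MySQL; carid, model, name and the sample rows are made up for illustration.

```python
import sqlite3

# Hypothetical sample data: user 2 owns two cars from the same year,
# user 3 owns no car at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE cars (carid INTEGER PRIMARY KEY, userid INTEGER, model TEXT, year INTEGER);
CREATE INDEX idx_cars_userid_year ON cars (userid, year);
INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
INSERT INTO cars VALUES (1, 1, 'A', 2010), (2, 1, 'B', 2015),
                        (3, 2, 'C', 2012), (4, 2, 'D', 2012);
""")

# Same shape as the filtered left join: keep only cars whose year
# equals the newest year among that user's cars.
newest_cars = conn.execute("""
SELECT u.userid, c.model, c.year
FROM users u
LEFT JOIN cars c
  ON c.userid = u.userid
 AND c.year = (SELECT MAX(c1.year) FROM cars c1 WHERE c1.userid = c.userid)
ORDER BY u.userid, c.model
""").fetchall()

for row in newest_cars:
    print(row)
```

User 1 gets one car, user 2 gets both 2012 cars (the tie), and user 3 still appears with NULLs because of the left join.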
Maybe there is a better and more efficient way to query this. Here is my solution:
select users.userid, cars.*
from users
left join (SELECT userid, MAX(year) AS maxYear
FROM cars
GROUP BY userid) as sub on sub.userid = users.userid
left join cars on cars.userid = users.userid and cars.year = sub.maxYear;

MySQL Group Join Table

I have the three SQL statements below and I want to combine them into a single result like the one shown. I tried but did not succeed.
I need some help.
Output:
member_id, balance, firstname, lastname, LastPurchase, LastOrder
SELECT c.member_id
, c.firstname
, c.lastname
, m.balance
FROM member m
, customer c
where m.member_id = c.member_id
order by m.member_id;
SELECT member_id, max(date) as LastPurchase
FROM purchase
GROUP BY member_id;
SELECT member_id, max(date) as LastOrder
FROM ordert
GROUP BY member_id;
You can join these statements:
SELECT c.member_id, c.firstname, c.lastname, m.balance, p.LastPurchase, o.LastOrder
FROM member m
join customer c on m.member_id = c.member_id
left join (SELECT member_id, max(date) as LastPurchase
FROM purchase
GROUP BY member_id) p on p.member_id = m.member_id
left join (SELECT member_id, max(date) as LastOrder
FROM ordert
GROUP BY member_id) o on o.member_id = m.member_id
order by m.member_id
You can join the aggregate queries. The JOIN ... USING syntax comes in handy here, since all join column names are the same:
SELECT c.member_id, c.firstname, c.lastname, m.balance, p.last_purchase, o.last_order
FROM member m
INNER JOIN customer c USING(member_id)
INNER JOIN (
SELECT member_id, max(date) last_purchase FROM purchase GROUP BY member_id
) p USING(member_id)
INNER JOIN (
SELECT member_id, max(date) last_order FROM ordert GROUP BY member_id
) o USING(member_id)
ORDER BY c.member_id
Important: your original query uses implicit, old-school joins (with a comma in the FROM clause). This syntax fell out of favor more than 20 years ago and its use is discouraged, since it is harder to write, read, and understand.
One of the many benefits of using explicit joins here is that you can easily change the INNER JOINs to LEFT JOINs if there is a possibility that a member has no purchase or no order at all.
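The difference between the two join types is easy to check on a toy data set. This is only a sketch using Python's sqlite3 in place of MySQL; the sample members and dates are invented, and the QUERY template with its {join} placeholder is just a convenience for running the same statement twice.

```python
import sqlite3

# Hypothetical data: member 1 has purchases and an order,
# member 2 has a purchase but no order.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member (member_id INTEGER PRIMARY KEY, balance REAL);
CREATE TABLE customer (member_id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
CREATE TABLE purchase (member_id INTEGER, date TEXT);
CREATE TABLE ordert (member_id INTEGER, date TEXT);
INSERT INTO member VALUES (1, 50.0), (2, 20.0);
INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray');
INSERT INTO purchase VALUES (1, '2021-01-10'), (1, '2021-02-01'), (2, '2021-01-15');
INSERT INTO ordert VALUES (1, '2021-03-01');
""")

# Same statement, parameterised only by the join type used for the
# two pre-aggregated subqueries.
QUERY = """
SELECT c.member_id, p.LastPurchase, o.LastOrder
FROM member m
JOIN customer c ON c.member_id = m.member_id
{join} (SELECT member_id, MAX(date) AS LastPurchase
        FROM purchase GROUP BY member_id) p ON p.member_id = m.member_id
{join} (SELECT member_id, MAX(date) AS LastOrder
        FROM ordert GROUP BY member_id) o ON o.member_id = m.member_id
ORDER BY c.member_id
"""

inner_rows = conn.execute(QUERY.format(join="INNER JOIN")).fetchall()
left_rows = conn.execute(QUERY.format(join="LEFT JOIN")).fetchall()

print(inner_rows)  # member 2 is missing: no order row to match
print(left_rows)   # member 2 is kept, with NULL for LastOrder
```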

Count different totals from multiple tables in mysql grouped by user_id in one query

I want to count the rows per user_id in the courses_taken and quiz_attempt tables, but my query returns wrong numbers.
SELECT
u.id,
u.email,
u.user,
u.joined,
MAX(qa.last_attempt_time) as last_attempt_time,
COUNT(qa.user_id) total_quiz,
COUNT(ct.user_id) total_courses
FROM users u
LEFT JOIN courses_taken ct
ON u.id = ct.user_id
LEFT JOIN quiz_attempt qa
ON u.id = qa.user_id AND qa.attempt_mode=1
GROUP BY u.id
ORDER BY total_courses DESC
Table structure
users table
id, email, user, joined
quiz_attempt table
id,user_id, last_attempt_time, attempt_mode etc.
courses_taken table
id,user_id,course_id,taken_on etc.
Here I am trying to get all users with their total number of quiz attempts and total number of courses taken, but my query returns the same number for both.
What you can do is use COUNT DISTINCT on a column which varies uniquely with the value that you are trying to count, i.e.:
...
COUNT(DISTINCT qa.id) total_quiz,
COUNT(DISTINCT ct.course_id) total_courses
...
SqlFiddle here
You should not put distinct on the user_ID column but put it on the id for that table like this:
SELECT u.id, u.email, u.user, u.joined,
MAX(qa.last_attempt_time) as last_attempt_time,
COUNT(DISTINCT qa.id) as total_quiz,
COUNT(DISTINCT ct.id) as total_courses
FROM users u LEFT JOIN
courses_taken ct
ON u.id = ct.user_id LEFT JOIN
quiz_attempt qa
ON u.id = qa.user_id AND qa.attempt_mode = 1
GROUP BY u.id, u.email, u.user, u.joined
ORDER BY total_courses DESC;
Or, if this confuses you, you can use subqueries like this:
SELECT
u.id,
u.email,
u.user,
u.joined,
qa.last_attempt_time as last_attempt_time,
qa.total_quizCOUNT,
ct.total_coursesCOUNT
FROM users u
LEFT JOIN
(Select user_id, Count(user_id) as total_coursesCOUNT from courses_taken group by user_id) ct
ON u.id = ct.user_id
LEFT JOIN (Select user_id, Count(user_id) total_quizCOUNT, MAX(last_attempt_time) as last_attempt_time from quiz_attempt where attempt_mode = 1 group by user_id) qa
ON u.id = qa.user_id
ORDER BY total_coursesCOUNT DESC
You probably have a Cartesian product problem because of the joins: every course row of a user is paired with every quiz row of that user. The better solution is to pre-aggregate the results. However, in many cases, if the tables are not too big, count(distinct) solves the problem:
SELECT u.id, u.email, u.user, u.joined,
MAX(qa.last_attempt_time) as last_attempt_time,
COUNT(DISTINCT qa.id) as total_quiz,
COUNT(DISTINCT ct.id) as total_courses
FROM users u LEFT JOIN
courses_taken ct
ON u.id = ct.user_id LEFT JOIN
quiz_attempt qa
ON u.id = qa.user_id AND qa.attempt_mode = 1
GROUP BY u.id
ORDER BY total_courses DESC;
Note that this works because you are using MAX() and COUNT(). It would not work with SUM() or AVG().
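Here is a small sketch of why the plain counts come out equal and why SUM() would be inflated. It runs on SQLite through Python's sqlite3, not MySQL, and the points column is purely hypothetical (the question's tables have no such column); it exists only to show the SUM() effect.

```python
import sqlite3

# One user with 2 courses and 3 quiz attempts. The "points" column is
# an invented example column used only to demonstrate SUM().
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE courses_taken (id INTEGER PRIMARY KEY, user_id INTEGER, course_id INTEGER);
CREATE TABLE quiz_attempt (id INTEGER PRIMARY KEY, user_id INTEGER,
                           attempt_mode INTEGER, points INTEGER);
INSERT INTO users VALUES (1, 'a@example.com');
INSERT INTO courses_taken VALUES (1, 1, 501), (2, 1, 502);
INSERT INTO quiz_attempt VALUES (1, 1, 1, 10), (2, 1, 1, 20), (3, 1, 1, 30);
""")

# The two LEFT JOINs pair every course row with every quiz row,
# so the group for user 1 holds 2 * 3 = 6 joined rows.
fan_out = conn.execute("""
SELECT COUNT(qa.user_id)     AS plain_quiz,
       COUNT(ct.user_id)     AS plain_courses,
       COUNT(DISTINCT qa.id) AS total_quiz,
       COUNT(DISTINCT ct.id) AS total_courses,
       SUM(qa.points)        AS inflated_points
FROM users u
LEFT JOIN courses_taken ct ON u.id = ct.user_id
LEFT JOIN quiz_attempt qa ON u.id = qa.user_id AND qa.attempt_mode = 1
GROUP BY u.id
""").fetchone()

print(fan_out)
```

Both plain counts are 6, the DISTINCT counts are 3 and 2, and the sum is 120 instead of the true 60 because each attempt is repeated once per course. Pre-aggregating each table in its own subquery avoids the repetition altogether.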

SQL query on 3 tables with one to many relationships

I need help figuring out the SQL query for my e-commerce site.
There is a Users table (customers / customer-service reps).
There is an Orders table.
There is a Line-Items table (columns are manufacturer, quantity, etc.).
Users have many Orders, and Orders have many Line-Items.
I am trying to find the list of users who have made one or more orders that include items from 'X-Parts' (the name of a manufacturer).
Any help would be greatly appreciated.
Try this (assuming the line-items table is named LineItems, since an unquoted name cannot contain a hyphen):
SELECT U.UserID, COUNT(DISTINCT O.OrderID) OrderCount
FROM Users U INNER JOIN Orders O ON U.UserID = O.UserID
INNER JOIN LineItems L ON O.OrderID = L.OrderID
Where L.manufacturer = 'X-Parts'
Group BY U.UserID
Having COUNT(DISTINCT O.OrderID) >= 1
Sample Demo:- http://sqlfiddle.com/#!3/f1712/2
It's one or more orders.
SELECT U.UserID, COUNT(DISTINCT O.OrderID) as OrderCount
FROM Users U
INNER JOIN Orders O ON U.UserID = O.UserID
INNER JOIN LineItems L ON O.OrderID = L.OrderID
Where L.manufacturer = 'X-Parts'
Group BY U.UserID
Having COUNT(DISTINCT O.OrderID) >= 1
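One thing worth checking with this kind of three-table join is what the count actually counts. The sketch below uses Python's sqlite3 rather than the asker's database, and it assumes the line-items table is called LineItems (a hypothetical name, chosen because an unquoted identifier cannot contain a hyphen); the rows are invented.

```python
import sqlite3

# Hypothetical data: user 1 has a single order containing two
# 'X-Parts' line items; user 2 bought nothing from 'X-Parts'.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (UserID INTEGER PRIMARY KEY);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, UserID INTEGER);
CREATE TABLE LineItems (OrderID INTEGER, manufacturer TEXT, quantity INTEGER);
INSERT INTO Users VALUES (1), (2);
INSERT INTO Orders VALUES (100, 1), (101, 2);
INSERT INTO LineItems VALUES (100, 'X-Parts', 1), (100, 'X-Parts', 4), (101, 'Other', 2);
""")

# COUNT(O.OrderID) counts joined line-item rows;
# COUNT(DISTINCT O.OrderID) counts orders.
order_counts = conn.execute("""
SELECT U.UserID,
       COUNT(O.OrderID)          AS LineItemRows,
       COUNT(DISTINCT O.OrderID) AS OrderCount
FROM Users U
INNER JOIN Orders O ON U.UserID = O.UserID
INNER JOIN LineItems L ON O.OrderID = L.OrderID
WHERE L.manufacturer = 'X-Parts'
GROUP BY U.UserID
""").fetchall()

print(order_counts)
```

Only user 1 is returned, with 2 matching line-item rows but 1 order. The inner joins plus the WHERE filter already guarantee at least one match per returned user, so a HAVING ... >= 1 clause does not change the result.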

MySQL INNER JOIN select only one row from second table

I have a users table and a payments table. Each user who has payments may have multiple associated rows in the payments table. I would like to select all users who have payments, but only select their latest payment. I'm trying this SQL, but I've never tried nested SQL statements before, so I want to know what I'm doing wrong. I appreciate the help.
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*
FROM payments AS p
ORDER BY date DESC
LIMIT 1
)
ON p.user_id = u.id
WHERE u.package = 1
You need to have a subquery to get their latest date per user ID.
SELECT u.*, p.*
FROM users u
INNER JOIN payments p
ON u.id = p.user_ID
INNER JOIN
(
SELECT user_ID, MAX(date) maxDate
FROM payments
GROUP BY user_ID
) b ON p.user_ID = b.user_ID AND
p.date = b.maxDate
WHERE u.package = 1
SELECT u.*, p.*
FROM users AS u
INNER JOIN payments AS p ON p.id = (
SELECT id
FROM payments AS p2
WHERE p2.user_id = u.id
ORDER BY date DESC
LIMIT 1
)
Or
SELECT u.*, p.*
FROM users AS u
INNER JOIN payments AS p ON p.user_id = u.id
WHERE NOT EXISTS (
SELECT 1
FROM payments AS p2
WHERE
p2.user_id = p.user_id AND
(p2.date > p.date OR (p2.date = p.date AND p2.id > p.id))
)
These solutions are better than the accepted answer because they work correctly when there are multiple payments with the same user and date. You can try them on SQL Fiddle.
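To make the tie problem concrete, here is a rough comparison of the MAX(date) self-join and the NOT EXISTS version. It is a sketch on SQLite via Python's sqlite3, not on MySQL, with invented rows in which user 1 has two payments on the same latest date.

```python
import sqlite3

# Hypothetical data: payments 2 and 3 belong to user 1 and share
# the same (latest) date.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, package INTEGER);
CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT);
INSERT INTO users VALUES (1, 1), (2, 1);
INSERT INTO payments VALUES (1, 1, '2020-01-01'), (2, 1, '2020-02-01'),
                            (3, 1, '2020-02-01'), (4, 2, '2020-01-10');
""")

# Self-join on MAX(date): every payment tied for the latest date matches.
max_date_rows = conn.execute("""
SELECT u.id, p.id
FROM users u
INNER JOIN payments p ON u.id = p.user_id
INNER JOIN (SELECT user_id, MAX(date) AS maxDate
            FROM payments GROUP BY user_id) b
  ON p.user_id = b.user_id AND p.date = b.maxDate
WHERE u.package = 1
ORDER BY u.id, p.id
""").fetchall()

# NOT EXISTS with an id tie-breaker: exactly one payment per user.
not_exists_rows = conn.execute("""
SELECT u.id, p.id
FROM users u
INNER JOIN payments p ON p.user_id = u.id
WHERE u.package = 1
  AND NOT EXISTS (
      SELECT 1 FROM payments p2
      WHERE p2.user_id = p.user_id
        AND (p2.date > p.date OR (p2.date = p.date AND p2.id > p.id))
  )
ORDER BY u.id
""").fetchall()

print(max_date_rows)
print(not_exists_rows)
```

The first query returns user 1 twice (payments 2 and 3), while the second keeps only payment 3, the one with the higher id among the tied rows.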
SELECT u.*, p.*, max(p.date)
FROM payments p
JOIN users u ON u.id=p.user_id AND u.package = 1
GROUP BY u.id
ORDER BY p.date DESC
Check out this sqlfiddle
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*,
@num := if(@id = user_id, @num + 1, 1) as row_num,
@id := user_id as tmp
FROM payments AS p,
(SELECT @num := 0) x,
(SELECT @id := 0) y
ORDER BY p.user_id ASC, date DESC) AS p
ON (p.user_id = u.id) and (p.row_num=1)
WHERE u.package = 1
You can try this:
SELECT u.*, p.*
FROM users AS u LEFT JOIN (
SELECT *, ROW_NUMBER() OVER(PARTITION BY user_id ORDER BY date DESC) AS RowNo
FROM payments
) AS p ON u.id = p.user_id AND p.RowNo=1
There are two problems with your query:
Every table and subquery needs a name, so you have to name the subquery INNER JOIN (SELECT ...) AS p ON ....
The subquery as you have it only returns one row period, but you actually want one row for each user. For that you need one query to get the max date and then self-join back to get the whole row.
Assuming there are no ties for payments.date, try:
SELECT u.*, p.*
FROM (
SELECT MAX(p.date) AS date, p.user_id
FROM payments AS p
GROUP BY p.user_id
) AS latestP
INNER JOIN users AS u ON latestP.user_id = u.id
INNER JOIN payments AS p ON p.user_id = u.id AND p.date = latestP.date
WHERE u.package = 1
@John Woo's answer helped me solve a similar problem. I've improved upon his answer by setting the correct ordering as well. This has worked for me:
SELECT a.*, c.*
FROM users a
INNER JOIN payments c
ON a.id = c.user_ID
INNER JOIN (
SELECT user_ID, MAX(date) as maxDate FROM
(
SELECT user_ID, date
FROM payments
ORDER BY date DESC
) d
GROUP BY user_ID
) b ON c.user_ID = b.user_ID AND
c.date = b.maxDate
WHERE a.package = 1
I'm not sure how efficient this is, though.
SELECT U.*, V.* FROM users AS U
INNER JOIN (SELECT *
FROM payments
WHERE id IN (
SELECT MAX(id)
FROM payments
GROUP BY user_id
)) AS V ON U.id = V.user_id
This will get it working
Matei Mihai gave a simple and efficient solution, but it will not work unless you put MAX(date) in the SELECT list, so the query becomes:
SELECT u.*, p.*, max(date)
FROM payments p
JOIN users u ON u.id=p.user_id AND u.package = 1
GROUP BY u.id
ORDER BY makes no difference to the grouping, but it can order the final result produced by GROUP BY. I tried it and it worked for me.
My answer is directly inspired by @valex's, and is very useful if you need several columns in the ORDER BY clause.
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*,
@num := if(@id = user_id, @num + 1, 1) as row_num,
@id := user_id as tmp
FROM (SELECT * FROM payments ORDER BY user_id ASC, date DESC) AS p,
(SELECT @num := 0) x,
(SELECT @id := 0) y
) AS p
ON (p.user_id = u.id) and (p.row_num=1)
WHERE u.package = 1
This is quite simple: do the inner join, then group by user_id and use the MAX aggregate function on the payment id. Assuming your tables are user and payment, the query can be:
SELECT user.id, max(payment.id)
FROM user INNER JOIN payment ON (user.id = payment.user_id)
GROUP BY user.id
If you do not have to return the payment from the query you can do this with distinct, like:
SELECT DISTINCT u.*
FROM users AS u
INNER JOIN payments AS p ON p.user_id = u.id
This returns only users who have at least one associated row in the payments table (because of the inner join), and a user with multiple payments is returned only once (because of DISTINCT). The payment itself is not returned; if you need it, you can use a subquery as others have proposed.
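A tiny sanity check of that behaviour, sketched with Python's sqlite3 on an in-memory database rather than MySQL (the rows are invented): user 1 has two payments, user 2 has one, user 3 has none.

```python
import sqlite3

# Hypothetical sample rows for the users/payments tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, package INTEGER);
CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT);
INSERT INTO users VALUES (1, 1), (2, 1), (3, 1);
INSERT INTO payments VALUES (1, 1, '2020-01-01'), (2, 1, '2020-02-01'), (3, 2, '2020-01-10');
""")

# Without DISTINCT the inner join repeats a user once per payment.
joined_ids = conn.execute("""
SELECT u.id
FROM users AS u
INNER JOIN payments AS p ON p.user_id = u.id
ORDER BY u.id
""").fetchall()

# With DISTINCT each paying user appears exactly once.
distinct_ids = conn.execute("""
SELECT DISTINCT u.id
FROM users AS u
INNER JOIN payments AS p ON p.user_id = u.id
ORDER BY u.id
""").fetchall()

print(joined_ids)
print(distinct_ids)
```

User 3 never shows up because the inner join finds no payment for them, and user 1 collapses from two rows to one once DISTINCT is applied.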