Following query shows the results I need from table pointsList (latest records per user grouped by column idMatch)
select * from pointsList p
inner join ( select idMatch, max(datePointsCalculated ) as MaxDate from pointsList group by idMatch ) tm
on p.idMatch = tm.idMatch and p.datePointsCalculated = tm.MaxDate
order by p.idMatch ASC
But now I want to also select some additional information about the user (from table users). My naive way to tackle this was by making a new inner join like this:
select * from pointsList p, users u
inner join users
on u.idUser = p.idUser
inner join ( select idMatch, max(datePointsCalculated ) as MaxDate from pointsList group by idMatch ) tm
on p.idMatch = tm.idMatch and p.datePointsCalculated = tm.MaxDate
order by p.idMatch ASC
But I get the error message "unknown column 'p.idUser' in on clause". I tried using users.idUser and pointsList.idUser and other combinations (renaming pointsList in the on-clouse) but I always get the unknown column error (pointsList.idUser really does exist). Anyone could explain what I am doing wrong? I would like to extend this query to another table as well.
Thank you in advance!
Comma , priority in FROM clause is less than JOIN priority. And your query acts as:
select * from pointsList p,
(
users u
inner join users
on u.idUser = p.idUser
inner join ( select idMatch, max(datePointsCalculated ) as MaxDate from pointsList group by idMatch ) tm
on p.idMatch = tm.idMatch and p.datePointsCalculated = tm.MaxDate
)
order by p.idMatch ASC
Of course, table pointsList AS p is not accessible within the parenthesis.
Use CROSS JOIN instead of comma:
select * from pointsList p
CROSS JOIN users u
inner join users
on u.idUser = p.idUser
inner join ( select idMatch, max(datePointsCalculated ) as MaxDate from pointsList group by idMatch ) tm
on p.idMatch = tm.idMatch and p.datePointsCalculated = tm.MaxDate
order by p.idMatch ASC
Related
I have a query with one LEFT JOIN that works fine. When I add a second LEFT JOIN to a table with multiple records per field in the first table, however, I am getting the product of the results in the two tables ie books x publishers returned. How can I prevent this from happening?
SELECT a.*,b.*,p.*, group_concat(b.id as `bids`)
FROM authors `a`
LEFT JOIN books `b`
ON b.authorid = a.id
LEFT JOIN publishers `p`
on p.authorid = a.id
GROUP by a.id
EDIT:
Figured it out. The way to do this is to use subqueries as in this answer:
SELECT u.id
, u.account_balance
, g.grocery_visits
, f.fishmarket_visits
FROM users u
LEFT JOIN (
SELECT user_id, count(*) AS grocery_visits
FROM grocery
GROUP BY user_id
) g ON g.user_id = u.id
LEFT JOIN (
SELECT user_id, count(*) AS fishmarket_visits
FROM fishmarket
GROUP BY user_id
) f ON f.user_id = u.id
ORDER BY u.id;
If you do multiple LEFT Joins, your query will return a cartesian product of the results. To avoid this and get only one copy of fields you desire, do a subquery for each table you wish to join as below. Hope this helps someone in the future.
SELECT u.id
, u.account_balance
, g.grocery_visits
, f.fishmarket_visits
FROM users u
LEFT JOIN (
SELECT user_id, count(*) AS grocery_visits
FROM grocery
GROUP BY user_id
) g ON g.user_id = u.id
LEFT JOIN (
SELECT user_id, count(*) AS fishmarket_visits
FROM fishmarket
GROUP BY user_id
) f ON f.user_id = u.id
ORDER BY u.id;
I have 3 tables sc_user, sc_cube, sc_cube_sent
I wand to join to a user query ( sc_user) one distinct random message/cube ( from sc_cube ), that has not been sent to that user before ( sc_cube_sent), so each row in the result set has a disctinct user id and a random cubeid from sc_cube that is not part of sc_cube_sent with that user id associated there.
I am facing the problem that I seem not to be able to use a correlation id for the case that I need the u.id of the outer query in the inner On clause. I would need the commented section to make it work.
# get one random idcube per user not already sent to that user
SELECT u.id, sub.idcube
FROM sc_user as u
LEFT JOIN (
SELECT c.idcube, sent.idreceiver FROM sc_cube c
LEFT JOIN sc_cube_sent sent ON ( c.idcube = sent.idcube /* AND sent.idreceiver = u.id <-- "unknown column u.id in on clause" */ )
WHERE sent.idcube IS NULL
ORDER BY RAND()
LIMIT 1
) as sub
ON 1
I added a fiddle with some data : http://sqlfiddle.com/#!9/7b0bc/1
new cubeids ( sc_cube ) that should show for user 1 are the following : 2150, 2151, 2152, 2153
Edit>>
I could do it with another subquery instead of a join, but that has a huge performance impact and is not feasible ( 30 secs+ on couple of thousand rows on each table with reasonably implemented keys ), so I am still looking for a way to use the solution with JOIN.
SELECT
u.id,
(SELECT sc_cube.idcube
FROM sc_cube
WHERE NOT EXISTS(
SELECT sc_cube.idcube FROM sc_cube_sent WHERE sc_cube_sent.idcube = sc_cube.idcube AND sc_cube_sent.idreceiver = u.id
)
ORDER BY RAND() LIMIT 0,1
) as idcube
FROM sc_user u
without being able to test this, I would say you need to include your sc_user in the subquery because you have lost the scope
LEFT JOIN
( SELECT c.idcube, sent.idreceiver
FROM sc_user u
JOIN sc_cube c ON c.whatever_your_join_column_is = u.whatever_your_join_column_is
LEFT JOIN sc_cube_sent sent ON ( c.idcube = sent.idcube AND sent.idreceiver = u.id )
WHERE sent.idcube IS NULL
ORDER BY RAND()
LIMIT 1
) sub
If you want to get messagges ids that has not been sent to the particular user, then why use a join or left join at all ?
Just do:
SELECT sent.idcube
FROM sc_cube_sent sent
WHERE sent.idreceiver <> u.id
Then the query may look like this:
SELECT u.id,
/* sub.idcube */
( SELECT sent.idcube
FROM sc_cube_sent sent
WHERE sent.idreceiver <> u.id
ORDER BY RAND()
LIMIT 1
) as idcube
FROM sc_user as u
Got it working with NOT IN subselect in the on clause. Whereas the correlation link u.id is not given within the LEFT JOIN scope, it is for the scope of the ON clause. Here is how it works:
SELECT u.id, sub.idcube
FROM sc_user as u
LEFT JOIN (
SELECT idcube FROM sc_cube c ORDER BY RAND()
) sub ON (
sub.idcube NOT IN (
SELECT s.idcube FROM sc_cube_sent s WHERE s.idreceiver = u.id
)
)
GROUP BY u.id
Fiddle : http://sqlfiddle.com/#!9/7b0bc/48
I don't know much about query optimization but I know the order in which queries get executed
FROM clause
WHERE clause
GROUP BY clause
HAVING clause
SELECT clause
ORDER BY clause
This the query I had written
SELECT
`main_table`.forum_id,
my_topics.topic_id,
(
SELECT MAX(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id
) AS `maxpostid`,
(
SELECT my_posts.admin_user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `admin_user_id`,
(
SELECT my_posts.user_id FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_posts.post_id DESC LIMIT 1
) AS `user_id`,
(
SELECT COUNT(my_topics.topic_id) FROM my_topics WHERE my_topics.forum_id = main_table.forum_id ORDER BY my_topics.forum_id DESC LIMIT 1
) AS `topicscount`,
(
SELECT COUNT(my_posts.post_id) FROM my_posts WHERE my_topics.topic_id = my_posts.topic_id ORDER BY my_topics.topic_id DESC LIMIT 1
) AS `postcount`,
(
SELECT CONCAT(admin_user.firstname,' ',admin_user.lastname) FROM admin_user INNER JOIN my_posts ON my_posts.admin_user_id = admin_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `adminname`,
(
SELECT forum_user.nick_name FROM forum_user INNER JOIN my_posts ON my_posts.user_id = forum_user.user_id WHERE my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `nickname`,
(
SELECT CONCAT(ce1.value,' ',ce2.value) AS fullname FROM my_posts INNER JOIN customer_entity_varchar AS ce1 ON ce1.entity_id = my_posts.user_id INNER JOIN customer_entity_varchar AS ce2 ON ce2.entity_id=my_posts.user_id WHERE (ce1.attribute_id = 1) AND (ce2.attribute_id = 2) AND my_posts.post_id = maxpostid ORDER BY my_posts.post_id DESC LIMIT 1
) AS `fullname`
FROM `my_forums` AS `main_table`
LEFT JOIN `my_topics` ON main_table.forum_id = my_topics.forum_id
WHERE (forum_status = '1')
And now I want to know if there is any way to optimize it ? Because all the logic is written in Select section not From, but I don't know how to write the same logic in From section of the query ?
Does it make any difference or both are same ?
Thanks
Correlated subqueries should really be a last resort, they often end up being executed RBAR, and given that a number of your subqueries are very similar, trying to get the same result using joins is going to result in a lot less table scans.
The first thing I note is that all of your subqueries include the table my_posts, and most contain ORDER BY my_posts.post_id DESC LIMIT 1, those that don't have a count with no group by so the order and limit are redundant anyway, so my first step would be to join to my_posts:
SELECT *
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
WHERE forum_status = '1';
Here the subquery just ensures you get the latest post per topic_id. I have shortened your table aliases here for my convenience, I am not sure why you would use a table alias that is longer than the actual table name?
Now you have the bulk of your query you can start adding in your columns, in order to get the post count, I have added a count to the subquery Maxp, I have also had to add a few more joins to get some of the detail out, such as names:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
t2.topicscount,
maxp.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN my_topics AS t
ON f.forum_id = t.forum_id
LEFT JOIN
( SELECT topic_id,
MAX(post_id) AS post_id,
COUNT(*) AS postcount
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.admin_user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
LEFT JOIN
( SELECT forum_id, COUNT(*) AS topicscount
FROM my_topics
GROUP BY forum_id
) AS t2
ON t2.forum_id = f.forum_id
WHERE forum_status = '1';
I am not familiar with your schema so the above may need some tweaking, but the principal remains - use JOINs over sub-selects.
The next stage of optimisation I would do is to get rid of your customer_entity_varchar table, or at least stop using it to store things as basic as first name and last name. The Entity-Attribute-Value model is an SQL antipattern, if you added two columns, FirstName and LastName to your forum_user table you would immediately lose two joins from your query. I won't get too involved in the EAV vs Relational debate as this has been extensively discussed a number of times, and I have nothing more to add.
The final stage would be to add appropriate indexes, you are in the best decision to decide what is appropriate, I'd suggest you probably want indexes on at least the foreign keys in each table, possibly more.
EDIT
To get one row per forum_id you would need to use the following:
SELECT f.forum_id,
t.topic_id,
p.post_id AS `maxpostid`,
p.admin_user_id,
p.user_id,
MaxT.topicscount,
maxp.postcount,
CONCAT(au.firstname,' ',au.lastname) AS adminname,
fu.nick_name AS nickname
CONCAT(ce1.value,' ',ce2.value) AS fullname
FROM my_forums AS f
LEFT JOIN
( SELECT t.forum_id,
COUNT(DISTINCT t.topic_id) AS topicscount,
COUNT(*) AS postCount,
MAX(t.topic_ID) AS topic_id
FROM my_topics AS t
INNER JOIN my_posts AS p
ON p.topic_id = p.topic_id
GROUP BY t.forum_id
) AS MaxT
ON MaxT.forum_id = f.forum_id
LEFT JOIN my_topics AS t
ON t.topic_ID = Maxt.topic_ID
LEFT JOIN
( SELECT topic_id, MAX(post_id) AS post_id
FROM my_posts
GROUP BY topic_id
) AS Maxp
ON Maxp.topic_id = t.topic_id
LEFT JOIN my_posts AS p
ON p.post_id = Maxp.post_id
LEFT JOIN admin_user AS au
ON au.admin_user_id = p.admin_user_id
LEFT JOIN forum_user AS fu
ON fu.user_id = p.user_id
LEFT JOIN customer_entity_varchar AS ce1
ON ce1.entity_id = p.user_id
AND ce1.attribute_id = 1
LEFT JOIN customer_entity_varchar AS ce2
ON ce2.entity_id = p.user_id
AND ce2.attribute_id = 2
WHERE forum_status = '1';
I have 3 tables "maintenances", "cars", "users" . I want to select all data from table maintenance with a distinct car_id and the last record for each distinct (based on max maintenance_date)
SELECT
m. * , u.username, c.Model, c.Make, c.License, c.Milage, COUNT( m.process_id ) AS count_nr
FROM
maintenances AS m
LEFT JOIN users AS u ON u.id = m.user_id
LEFT JOIN cars AS c ON c.id = m.car_id
WHERE
maintenance_date = (SELECT MAX(maintenance_date) FROM maintenances WHERE car_id = m.car_id)
The problem is that this query returns only one record which has the max date from all records. I want all records (distinct car_id and from records with the same car_id to display only values for max(maintenance_date))
This is your query:
SELECT m. * , u.username, c.Model, c.Make, c.License, c.Milage, COUNT( m.process_id ) AS count_nr
----------------------------------------------------------------^
FROM maintenances AS m LEFT JOIN
users AS u
ON u.id = m.user_id LEFT JOIN
cars AS c
ON c.id = m.car_id
WHERE maintenance_date = (SELECT MAX(maintenance_date) FROM maintenances WHERE car_id = m.car_id);
It is an aggregation query. Without a group by, only one row is returned (all the rows are in one group). So, add the group by:
SELECT m. * , u.username, c.Model, c.Make, c.License, c.Milage, COUNT( m.process_id ) AS count_nr
FROM maintenances AS m LEFT JOIN
users AS u
ON u.id = m.user_id LEFT JOIN
cars AS c
ON c.id = m.car_id
WHERE maintenance_date = (SELECT MAX(m2.maintenance_date) FROM maintenances m2 WHERE m2.car_id = m.car_id);
GROUP BY c.id
I also fixed the correlation statement, to be clear that it is correlated to the outer query.
add GROUP BY u.username .
WHERE
maintenance_date = (SELECT MAX(maintenance_date) FROM maintenances WHERE car_id = m.car_id)
GROUP BY u.username
i have these tables :
notice
id INT
cdate DATETIME
...
theme
id
name
notice_theme
id_notice
id_theme
I want to get the latest notices for each theme.
SELECT id_theme, n.id
FROM notice_theme
LEFT JOIN (
SELECT id, cdate
FROM notice
ORDER BY cdate DESC
) AS n ON notice_theme.id_notice = n.id
GROUP BY id_theme
The result is not good. An idea ? Thanks.
There are so many ways to solve this but I'm used to do it this way. An extra subquery is needed to separately calculate the latest cDate for every ID.
SELECT a.*, c.*
FROM theme a
INNER JOIN notice_theme b
ON a.ID = b.id_theme
INNER JOIN notice c
ON b.id_notice = c.ID
INNER JOIN
(
SELECT a.id_theme, MAX(b.DATE_CREATE) max_date
FROM notice_theme a
INNER JOIN notice b
ON a.ID_Notice = b.ID
GROUP BY a.id_theme
) d ON b.id_theme = d.id_theme AND
c.DATE_CREATE = d.max_date
SQLFiddle Demo