Consider these two queries:
SELECT p.* FROM posts p
JOIN posts_tags pt ON pt.post_id = p.id
JOIN tags t ON pt.tag_id = t.id AND t.name = 'php'
SELECT p.* FROM posts p
JOIN posts_tags pt ON pt.post_id = p.id
JOIN tags t ON pt.tag_id = t.id
WHERE t.name = 'php'
As you know, both return identical results. The condition t.name = 'php' is in the JOIN clause in the first query and in the WHERE clause in the second. Which one is better, and why?
Generally, putting the condition in the WHERE clause makes the code clearer. When you add it to the ON clause with AND, it gives the impression that the join is based on a combination of two fields.
For an inner join the two forms are logically equivalent, and the optimizer is generally free to apply the filter before joining in either case, so the choice is mainly about readability. The placement only changes the result for outer joins. I would suggest keeping it in the WHERE clause.
EDIT
Also, refer to this related post: Join best practices
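To make the ON versus WHERE point concrete, here is a small sketch. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, and the sample rows and the post_ids helper are made up for illustration. It runs the question's query with the filter in each position, once with inner joins and once with left joins.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO posts VALUES (1, 'PHP basics'), (2, 'CSS tricks'), (3, 'Untagged');
    INSERT INTO tags VALUES (1, 'php'), (2, 'css');
    INSERT INTO posts_tags VALUES (1, 1), (2, 2);
""")

def post_ids(join_type, filter_in_on):
    """Return sorted post ids with the tag filter placed in ON or in WHERE."""
    on_extra = " AND t.name = 'php'" if filter_in_on else ""
    where = "" if filter_in_on else " WHERE t.name = 'php'"
    sql = (
        "SELECT p.id FROM posts p "
        f"{join_type} posts_tags pt ON pt.post_id = p.id "
        f"{join_type} tags t ON pt.tag_id = t.id{on_extra}"
        f"{where} ORDER BY p.id"
    )
    return [row[0] for row in conn.execute(sql)]

inner_on = post_ids("JOIN", True)          # filter in the ON clause
inner_where = post_ids("JOIN", False)      # filter in the WHERE clause
left_on = post_ids("LEFT JOIN", True)      # unmatched posts are kept
left_where = post_ids("LEFT JOIN", False)  # WHERE discards the NULL rows

print(inner_on, inner_where, left_on, left_where)
```

With inner joins both placements return the same posts. With left joins the ON version keeps every post (the tag columns are simply NULL where nothing matched), while the WHERE version filters those rows out again.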
For this example I have 3 simple tables (Page, Subs and Followers):
For each page I need to know how many subs and followers it has.
My result is supposed to look like this:
I tried using the COUNT function in combination with a GROUP BY like this:
SELECT p.ID, COUNT(s.UID) AS SubCount, COUNT(f.UID) AS FollowCount
FROM page p, subs s, followers f
WHERE p.ID = s.ID AND p.ID = f.ID AND s.ID = f.ID
GROUP BY p.ID
Obviously this statement returns a wrong result.
My other attempt was using two different SELECT statements and then combining the two subresults into one table.
SELECT p.ID, COUNT(s.UID) AS SubCount FROM page p, subs s WHERE p.ID = s.ID GROUP BY p.ID
and
SELECT p.ID, COUNT(f.UID) AS FollowCount FROM page p, followers f WHERE p.ID = f.ID GROUP BY p.ID
I feel like there has to be a simpler / shorter way of doing it, but I'm too inexperienced to find it.
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax.
Next, learn what COUNT() does. It counts the number of non-NULL values. So, your expressions are going to return the same value -- because f.UID and s.UID are never NULL (due to the JOIN conditions).
The issue is that the different dimensions are multiplying the amounts. A simple fix is to use COUNT(DISTINCT):
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p JOIN
subs s
ON p.ID = s.ID JOIN
followers f
ON s.ID = f.ID
GROUP BY p.ID;
The inner joins are equivalent to the original query. You probably want left joins so you can get counts of zero:
SELECT p.ID, COUNT(DISTINCT s.UID) AS SubCount, COUNT(DISTINCT f.UID) AS FollowCount
FROM page p LEFT JOIN
subs s
ON p.ID = s.ID LEFT JOIN
followers f
ON p.ID = f.ID
GROUP BY p.ID;
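The multiplication effect is easier to see with numbers. The sketch below is only an illustration: it uses Python's sqlite3 (SQLite standing in for MySQL) and invented rows, with one page that has 3 subs and 2 followers and one page that has none.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (ID INTEGER PRIMARY KEY);
    CREATE TABLE subs (UID INTEGER, ID INTEGER);
    CREATE TABLE followers (UID INTEGER, ID INTEGER);
    INSERT INTO page VALUES (1), (2);
    INSERT INTO subs VALUES (10, 1), (11, 1), (12, 1);   -- page 1: 3 subs
    INSERT INTO followers VALUES (20, 1), (21, 1);       -- page 1: 2 followers
""")

joins = """
    FROM page p
    LEFT JOIN subs s ON p.ID = s.ID
    LEFT JOIN followers f ON p.ID = f.ID
    GROUP BY p.ID
    ORDER BY p.ID
"""

# Plain COUNT counts every joined row: 3 subs x 2 followers = 6 rows for page 1.
plain = conn.execute(
    "SELECT p.ID, COUNT(s.UID), COUNT(f.UID)" + joins
).fetchall()

# COUNT(DISTINCT ...) collapses the duplicates produced by the other join.
distinct = conn.execute(
    "SELECT p.ID, COUNT(DISTINCT s.UID), COUNT(DISTINCT f.UID)" + joins
).fetchall()

print(plain)
print(distinct)
```

Page 2 still appears with zero counts because of the left joins. Note that COUNT(DISTINCT) only gives the right answer when UID is unique within a page, which is assumed here.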
Scalar subquery should work in this case.
SELECT p.id,
(SELECT Count(s_uid)
FROM subs s1
WHERE s1.s_id = p.id) AS cnt_subs,
(SELECT Count(f_uid)
FROM followers f1
WHERE f1.f_id = p.id) AS cnt_fol
FROM page p
GROUP BY p.id;
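As a quick check of the scalar subquery approach, here is a sketch using Python's sqlite3 (SQLite standing in for MySQL). The column names UID and ID are taken from the question rather than from the answer above, and the rows are invented.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (ID INTEGER PRIMARY KEY);
    CREATE TABLE subs (UID INTEGER, ID INTEGER);
    CREATE TABLE followers (UID INTEGER, ID INTEGER);
    INSERT INTO page VALUES (1), (2);
    INSERT INTO subs VALUES (10, 1), (11, 1), (12, 1);
    INSERT INTO followers VALUES (20, 1), (21, 1);
""")

# Each subquery counts one table on its own, so the two counts cannot
# multiply each other and no GROUP BY is needed on the outer query.
counts = conn.execute("""
    SELECT p.ID,
           (SELECT COUNT(*) FROM subs s WHERE s.ID = p.ID) AS cnt_subs,
           (SELECT COUNT(*) FROM followers f WHERE f.ID = p.ID) AS cnt_fol
    FROM page p
    ORDER BY p.ID
""").fetchall()

print(counts)
```

Because each count is computed independently, pages without subs or followers come back with 0 instead of disappearing.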
Here is my query:
SELECT posts.id, posts.title, posts.body, posts.keywords
FROM posts
INNER JOIN pivot ON pivot.post_id = posts.id
INNER JOIN tags ON tags.id = pivot.tag_id
WHERE tags.name IN ('html', 'php')
GROUP BY posts.id
It selects all posts that are tagged with php, html, or both. Now I need to add an ORDER BY clause to sort the result by the number of matching tags, so that posts with both the php and html tags appear at the top of the result.
How can I do that?
Learn to use table aliases. It makes the queries easier to write and to read. However, you just need an appropriate ORDER BY:
SELECT p.id, p.title, p.body, p.keywords
FROM posts p INNER JOIN
pivot pi
ON pi.post_id = p.id INNER JOIN
tags t
ON t.id = pi.tag_id
WHERE t.name IN ('html', 'php')
GROUP BY p.id
ORDER BY COUNT(DISTINCT t.name) DESC;
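Here is a small sketch of that ORDER BY in action, using Python's sqlite3 (SQLite standing in for MySQL) and invented rows. The extra p.id sort key is my addition so that ties come out in a predictable order.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pivot (post_id INTEGER, tag_id INTEGER);
    INSERT INTO posts VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO tags VALUES (1, 'html'), (2, 'php'), (3, 'css');
    INSERT INTO pivot VALUES (1, 1), (2, 1), (2, 2), (3, 3), (4, 2);
""")

# Posts matching more of the requested tags sort first.
ranked = conn.execute("""
    SELECT p.id, COUNT(DISTINCT t.name) AS matches
    FROM posts p
    INNER JOIN pivot pi ON pi.post_id = p.id
    INNER JOIN tags t ON t.id = pi.tag_id
    WHERE t.name IN ('html', 'php')
    GROUP BY p.id
    ORDER BY COUNT(DISTINCT t.name) DESC, p.id
""").fetchall()

print(ranked)
```

Post 2 carries both tags and comes first, followed by the posts with a single matching tag. Post 3 only has the css tag, so the WHERE clause drops it.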
I'm a little bit confused about a stupid query:
I get rows from the table posts joined with the table authors and the table comments, in a way like this:
SELECT posts.*, authors.name, COUNT(comments.id_post) AS num_comments
FROM posts JOIN authors ON posts.id_author = authors.id_author
LEFT JOIN comments ON posts.id_post = comments.id_post
WHERE posts.active = 1
AND comments.active = 1
this doesn't work, of course.
What I try to do is to retrieve:
1) all my active posts (those that were not marked as deleted);
2) the names of their authors;
3) the number of active comments (those that were not marked as deleted) for each post (if there is at least one);
What's the way? I know it's a trivial one, but by now my brain is in offside…
Thanks!
Presumably, id_post uniquely identifies each row in posts. Try this:
SELECT p.*, a.name, COUNT(c.id_post) AS num_comments
FROM posts p JOIN
authors a
ON p.id_author = a.id_author LEFT JOIN
comments c
ON p.id_post = c.id_post
WHERE p.active = 1 AND c.active = 1
GROUP BY p.id_post;
Note that this uses a MySQL extension. In most other databases, you would need to list all the columns in posts plus a.name in the group by clause.
EDIT:
The above is based on your query. If you want all active posts with a count of active comments, just do:
SELECT p.*, a.name, SUM(c.active = 1) AS num_comments
FROM posts p LEFT JOIN
authors a
ON p.id_author = a.id_author LEFT JOIN
comments c
ON p.id_post = c.id_post
WHERE p.active = 1
GROUP BY p.id_post;
Since you are doing a count, you need a GROUP BY. posts.* cannot be used in a GROUP BY clause, so you will need to add the grouping columns explicitly, for example:
GROUP BY posts.id_post, authors.name
You should use a GROUP BY clause together with aggregate functions. Note that GROUP BY has to come after WHERE. Try something similar to:
SELECT posts.*, authors.name, COUNT(comments.id_post) AS num_comments
FROM posts JOIN authors ON posts.id_author = authors.id_author
LEFT JOIN comments ON posts.id_post = comments.id_post
WHERE posts.active = 1
AND comments.active = 1
-- group by
GROUP BY posts.id_post, authors.name
--
I found the correct solution:
SELECT posts.id_post, authors.name, COUNT(comments.id_post) AS num_comments
FROM posts JOIN authors
ON posts.id_author = authors.id_author
LEFT OUTER JOIN comments
ON (posts.id_post = comments.id_post AND comments.active = 1)
WHERE posts.active = 1
GROUP BY posts.id_post;
Thanks everyone for the help!
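For anyone wondering why moving comments.active = 1 into the ON clause matters, here is a small sketch. It uses Python's sqlite3 (SQLite standing in for MySQL) and invented rows; post 2 has only a deleted comment and post 3 is itself inactive.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id_author INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id_post INTEGER PRIMARY KEY, id_author INTEGER, active INTEGER);
    CREATE TABLE comments (id_comment INTEGER PRIMARY KEY, id_post INTEGER, active INTEGER);
    INSERT INTO authors VALUES (1, 'Ann');
    INSERT INTO posts VALUES (1, 1, 1), (2, 1, 1), (3, 1, 0);
    INSERT INTO comments VALUES (1, 1, 1), (2, 1, 1), (3, 1, 0), (4, 2, 0);
""")

select = """
    SELECT p.id_post, a.name, COUNT(c.id_post) AS num_comments
    FROM posts p
    JOIN authors a ON p.id_author = a.id_author
"""
tail = " GROUP BY p.id_post, a.name ORDER BY p.id_post"

# Filter in ON: inactive comments simply fail to match, the post row survives.
filter_in_on = conn.execute(
    select
    + " LEFT JOIN comments c ON p.id_post = c.id_post AND c.active = 1"
    + " WHERE p.active = 1"
    + tail
).fetchall()

# Filter in WHERE: rows with c.active NULL or 0 are removed after the join,
# so the LEFT JOIN behaves like an inner join.
filter_in_where = conn.execute(
    select
    + " LEFT JOIN comments c ON p.id_post = c.id_post"
    + " WHERE p.active = 1 AND c.active = 1"
    + tail
).fetchall()

print(filter_in_on)
print(filter_in_where)
```

With the filter in ON, post 2 is listed with a count of 0. With the filter in WHERE it vanishes from the result, which is the behaviour the original query suffered from.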
I'm fairly new to both mysql and php so I am still getting my head around it all, so please bear with me.
I basically have a site where users can make topics and attach tag words to them. I am trying to join the tables so that when I query for the POSTS, the result also includes the tag information.
"SELECT u.user_id, u.username, u.profile, topic_tags.tag_id, tags.tag_id, tags.tags,
p.post_id, p.post_content, p.post_date, p.post_topic, p.post_by, p.invisipost
FROM `posts` p
JOIN `users` u on p.post_by = u.user_id
JOIN `topics` t on p.post_topic = t.topic_id
WHERE p.post_topic='$id'
INNER JOIN `tags` ON topic_tags.tag_id = tags.tag_id
INNER JOIN `topic_tags` ON topics.topic_id = topic_tag.tag_id
WHERE topic_tags.tag_id = topics.topic_id";
Like I said I am still very new to this so if you could offer any advice I would be much appreciative.
EDIT: here is the code that calls the tags
<?php
$topic_id = $rows['topic_id'];
$sql=mysql_query("SELECT * FROM topic_tags WHERE `topic_id`='{$topic_id}'");
while($rowd=mysql_fetch_array($sql))
{
$tag_id = $rowd['tag_id'];
$fetch_name = mysql_fetch_object(mysql_query("SELECT * FROM `tags` WHERE `tag_id`='{$tag_id}'"));
?>
<div id="topic-tagged"><?php echo ucwords($fetch_name->tags);?></div>
<?php
}
?>
All the WHERE conditions should be at the end, in a single WHERE clause after all of the joins. A table also has to be joined before another ON clause can refer to it, and once a table has an alias you must use the alias:
SELECT u.user_id, u.username, u.profile,
topic_tags.tag_id, tags.tag_id, tags.tags,
p.post_id, p.post_content, p.post_date,
p.post_topic, p.post_by, p.invisipost
FROM `posts` p
JOIN `users` u on p.post_by = u.user_id
JOIN `topics` t on p.post_topic = t.topic_id
INNER JOIN `topic_tags` ON t.topic_id = topic_tags.topic_id
INNER JOIN `tags` ON topic_tags.tag_id = tags.tag_id
WHERE p.post_topic='$id' and topic_tags.tag_id = t.topic_id
This is the corrected query statement based on your original question. I am still not sure if the last part is correct though. You might want to run this in your database directly and see if you get the results you need.
There are a few rules when writing SQL. The WHERE clause comes after the FROM clause. In addition, a table can only be referenced in an ON clause after it has been placed in the FROM clause. And a query has only one FROM clause and one WHERE clause. All joins are placed in the FROM clause.
Two good practices are to use table aliases that are abbreviations for the table (which you do sometimes). And, don't mix join and inner join. They are synonyms, but only one should be used in a query.
SELECT u.user_id, u.username, u.profile, tt.tag_id, ta.tag_id, ta.tags,
p.post_id, p.post_content, p.post_date, p.post_topic, p.post_by, p.invisipost
FROM `posts` p join
`users` u
on p.post_by = u.user_id join
`topic_tags` tt
ON p.post_topic = tt.topic_id join
`tags` ta
ON tt.tag_id = ta.tag_id
WHERE p.post_topic='$id';
Finally, I'm pretty sure that you do not want the condition "and tt.tag_id = topics.topic_id". It compares a tag_id to a topic_id, and those do not refer to the same thing. I think the joins as shown above are sufficient for your query.
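To show the join chain from posts through topic_tags to tags end to end, here is a runnable sketch. It uses Python's sqlite3 (SQLite standing in for MySQL), invented rows, and a reduced column list. The bound ? parameter replaces the interpolated '$id' only because that is how sqlite3 passes values; the table layout is guessed from the question.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL; schema and rows are guesses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, post_content TEXT,
                        post_topic INTEGER, post_by INTEGER);
    CREATE TABLE topic_tags (topic_id INTEGER, tag_id INTEGER);
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, tags TEXT);
    INSERT INTO users VALUES (1, 'bob');
    INSERT INTO posts VALUES (1, 'hello', 5, 1), (2, 'other', 6, 1);
    INSERT INTO topic_tags VALUES (5, 1), (5, 2), (6, 3);
    INSERT INTO tags VALUES (1, 'sql'), (2, 'php'), (3, 'css');
""")

topic_id = 5  # hypothetical topic id, playing the role of $id

# Each table is joined before any ON clause refers to it, and the single
# WHERE clause comes after all of the joins.
rows = conn.execute("""
    SELECT p.post_id, u.username, ta.tags
    FROM posts p
    JOIN users u ON p.post_by = u.user_id
    JOIN topic_tags tt ON p.post_topic = tt.topic_id
    JOIN tags ta ON tt.tag_id = ta.tag_id
    WHERE p.post_topic = ?
    ORDER BY ta.tags
""", (topic_id,)).fetchall()

print(rows)
```

Each post comes back once per tag on its topic, so the application still has to group the tag rows per post when rendering.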
I have 3 tables: posts, tags, posts_tags. I want to list posts, and all tags associated with them, but to limit the results.
This is what I do now:
SELECT
p.*,
t.*
FROM
(
SELECT * FROM posts LIMIT 0, 10
) as p
LEFT JOIN
posts_tags as pt
ON pt.post_id = p.post_id
LEFT JOIN
tags as t
ON t.tag_id = pt.tag_id
It is working fine, but it seems to be a little bit slow.
Is there a better/faster way of doing this? Can I apply LIMIT somewhere else for better results?
EDIT: I want to limit posts, and not results. A post can have many tags.
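Have you tried moving the limiting subquery to the WHERE clause instead? (Be aware that MySQL has historically rejected LIMIT inside an IN subquery, so depending on your version this may fail and need to be wrapped in a derived table.)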
SELECT
p.*,
t.*
FROM
posts as p
LEFT JOIN
posts_tags as pt
ON pt.post_id = p.post_id
LEFT JOIN
tags as t
ON t.tag_id = pt.tag_id
WHERE
p.post_id in (select post_id from posts limit 0,10)
Try running your query with the EXPLAIN keyword in front of it:
EXPLAIN SELECT ...
This will give you an idea of how MySQL is executing your query. Maybe you are missing a key or an index somewhere. Here's how to read the result of EXPLAIN:
http://dev.mysql.com/doc/refman/5.5/en/explain-output.html
SELECT
p.*,
t.*
FROM
posts as p
LEFT JOIN
posts_tags as pt
ON pt.post_id = p.post_id
LEFT JOIN
tags as t
ON t.tag_id = pt.tag_id
LIMIT 0, 10
Should work ;)
EDIT
In my opinion MySQL can be quite slow when running multiple joins, so it may be better to split your query into two and then combine the results in your app code (the application overhead should not be large, since it's only 10 posts).
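The difference between limiting posts and limiting result rows, which the question's edit points out, can be checked with a small sketch. It uses Python's sqlite3 (SQLite standing in for MySQL) and invented rows. The ORDER BY clauses are my addition so that the LIMIT picks predictable rows.

```python
import sqlite3

# SQLite in-memory database as a stand-in for MySQL; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (post_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts_tags (post_id INTEGER, tag_id INTEGER);
    INSERT INTO posts VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO tags VALUES (1, 'css'), (2, 'html'), (3, 'php'), (4, 'sql');
    INSERT INTO posts_tags VALUES (1, 1), (1, 2), (1, 3), (2, 4);
""")

joins = """
    LEFT JOIN posts_tags pt ON pt.post_id = p.post_id
    LEFT JOIN tags t ON t.tag_id = pt.tag_id
"""

# LIMIT at the end caps the joined rows, so one heavily tagged post
# can use up the whole limit.
row_limited = conn.execute(
    "SELECT p.post_id, t.name FROM posts p" + joins
    + " ORDER BY p.post_id, t.name LIMIT 2"
).fetchall()

# LIMIT inside a derived table caps the posts first, then attaches all tags.
post_limited = conn.execute(
    "SELECT p.post_id, t.name"
    " FROM (SELECT * FROM posts ORDER BY post_id LIMIT 2) AS p" + joins
    + " ORDER BY p.post_id, t.name"
).fetchall()

print(row_limited)
print(post_limited)
```

The row-limited query returns only two tag rows of post 1, while the derived-table version returns two complete posts with all of their tags, which is what the question asks for.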