MYSQL subquery SELECT in JOIN clause - mysql

OK, I have to put the subquery in a JOIN clause because it selects more than one column; putting it in the SELECT clause is not allowed and fails with an operand error ("Operand should contain 1 column(s)").
Anywho, this is my query:
SELECT
c.id,
c.title,
c.description,
c.icon,
p.id as topic_id,
p.title AS topic_title,
p.date,
p.username
FROM forum_cat c
LEFT JOIN (
SELECT
ft.id,
ft.cat_id,
ft.title,
fp.date,
u.username
FROM forum_topic ft
JOIN forum_post fp ON fp.topic_id = ft.id
JOIN user u ON u.user_id = fp.author_id
WHERE ft.cat_id = c.id
ORDER BY fp.date DESC
LIMIT 1
) p ON p.cat_id = c.id
WHERE c.main_cat = ?
ORDER BY c.list_no
Now the important thing I need here... FOR EACH category, I want to show the latest post and topic title in each category.
However, this select statement runs INSIDE a foreach loop over the main categories, which are identified by main_cat.
So there are 5 main categories with 3-8 subcategories each, and this is the subcategory query. FOR EACH subcategory, I need to grab the latest post. However, because the SELECT runs once per main category, it only selects THE LATEST post across all subcategories combined. I want the latest post of EACH subcategory, but I'd rather not run a query for each subcategory, since I want the page to load fast.
BUT REMEMBER, some subcategories WILL NOT have a latest post since some of them may not even contain a topic yet! So hence the left join.
Does anyone know how to go about doing this?
AND BTW, there is an error it gives me (WHERE ft.cat_id = c.id) in the subquery because c.id is an unknown column. But I'm trying to reference it from the outer query so can someone help me on that issue as well?
Thank you!
All tables:
forum_cat (Subcategories)
-----------------------------------------------
ID, Title, Description, Icon, Main_cat, List_no
forum_topic (Topics in each subcategory)
--------------------------------------------
ID, Author_id, Cat_id, Title, Sticky, Locked
forum_post (Posts in each topic)
--------------------------------------------
ID, Topic_id, Author_id, Body, Date, Hidden
The main categories are listed in a function. I didn't store them in the database since it was a waste of space since they never change. There are 7 main categories though.

It's hard to tell without seeing DDL of your tables, relevant sample data and desired output.
I could've got your requirements wrong, but try this:
SELECT *
FROM forum_cat c LEFT JOIN
(SELECT t.cat_id,
p.topic_id,
t.title,
p.id,
p.body,
MAX(p.`date`) AS `date`,
p.author_id,
u.username
FROM forum_post p INNER JOIN
forum_topic t ON t.id = p.topic_id INNER JOIN
`user` u ON u.user_id = p.author_id
GROUP BY t.cat_id) d ON d.cat_id = c.id
WHERE c.main_cat = 1
ORDER BY c.list_no
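A note on the two queries above, with a sketch of one alternative. The "unknown column c.id" error occurs because a derived table (a subquery in FROM or JOIN) cannot reference columns of the outer query; MySQL only allows that with LATERAL, which as far as I know arrived in 8.0.14. Also, with GROUP BY t.cat_id and MAX(p.`date`), the other selected columns (title, username, ...) are not guaranteed to come from the row that holds the maximum date. One way around both issues is a correlated scalar subquery in the ON clause, which MySQL does accept. The example below is only a sketch: it uses Python's sqlite3 with SQLite standing in for MySQL, the sample rows are invented, and the tie-break on post id is my assumption.

```python
import sqlite3

# SQLite stands in for MySQL; table/column names follow the question,
# the sample rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE forum_cat (id INTEGER PRIMARY KEY, title TEXT, main_cat INTEGER, list_no INTEGER);
CREATE TABLE forum_topic (id INTEGER PRIMARY KEY, cat_id INTEGER, title TEXT);
CREATE TABLE forum_post (id INTEGER PRIMARY KEY, topic_id INTEGER, author_id INTEGER, date TEXT);
CREATE TABLE user (user_id INTEGER PRIMARY KEY, username TEXT);

INSERT INTO forum_cat VALUES (1, 'General', 1, 1), (2, 'Empty', 1, 2), (3, 'Help', 1, 3), (4, 'Other', 2, 1);
INSERT INTO user VALUES (1, 'alice'), (2, 'bob');
INSERT INTO forum_topic VALUES (1, 1, 'Hello'), (2, 1, 'Rules'), (3, 3, 'Install');
INSERT INTO forum_post VALUES
  (1, 1, 1, '2020-01-01'), (2, 2, 2, '2020-01-03'),
  (3, 1, 1, '2020-01-02'), (4, 3, 2, '2020-01-05');
""")

# The scalar subquery in the ON clause may reference c.id (unlike a derived
# table), so exactly one post, the newest, is joined per subcategory.
LATEST_POST_SQL = """
SELECT c.id, c.title, ft.title AS topic_title, fp.date, u.username
FROM forum_cat c
LEFT JOIN forum_post fp ON fp.id = (
    SELECT p2.id
    FROM forum_post p2
    JOIN forum_topic t2 ON t2.id = p2.topic_id
    WHERE t2.cat_id = c.id
    ORDER BY p2.date DESC, p2.id DESC  -- id as tie-breaker (assumption)
    LIMIT 1
)
LEFT JOIN forum_topic ft ON ft.id = fp.topic_id
LEFT JOIN user u ON u.user_id = fp.author_id
WHERE c.main_cat = ?
ORDER BY c.list_no
"""


def latest_post_per_category(connection, main_cat):
    """Return one row per subcategory of main_cat, with NULLs if it has no posts."""
    return connection.execute(LATEST_POST_SQL, (main_cat,)).fetchall()


rows = latest_post_per_category(conn, 1)
for row in rows:
    print(row)
```

Subcategories without any topic or post still appear, with NULL in the topic, date and username columns, which is what the LEFT JOIN in the question was meant to achieve.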

Related

Get total count and comments with posts

I want to get the total likes and total comments of every post in a single query with the help of joins.
I am using this query, but the result is wrong:
SELECT blog.id, count(blog_comments.id) as likes , count(blog_likes.id) as comments
FROM blog LEFT JOIN
blog_comments
ON blog.id = blog_comments.blog_id LEFT JOIN
blog_likes
ON blog.id = blog_likes.blog_id
GROUP BY blog.id
Please check the image for table structure:
Your problem is that you are aggregating along two dimensions at the same time. This produces a Cartesian product: each like row pairs with each comment row, for a total of l * c rows per post.
The simplest way to fix this is to use the DISTINCT keyword:
SELECT b.id, count(DISTINCT bl.id) as likes , count(DISTINCT bc.id) as comments
FROM blog b LEFT JOIN
blog_comments bc
ON b.id = bc.blog_id LEFT JOIN
blog_likes bl
ON b.id = bl.blog_id
GROUP BY b.id;
If you have posts that have lots of likes and lots of comments, this is not recommended, because it creates a Cartesian product of the two.
There are several solutions for this, but I would recommend correlated subqueries:
select b.id,
(select count(*) from blog_likes bl where bl.blog_id = b.id) as likes,
(select count(*) from blog_comments bc where bc.blog_id = b.id) as comments
from blog b;
This can take advantage of indexes on blog_likes(blog_id) and blog_comments(blog_id).
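To make the Cartesian-product effect concrete, here is a small sketch using Python's sqlite3 (SQLite standing in for MySQL). The table names follow the question; the sample data, one post with 2 comments and 3 likes plus one post with neither, is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE blog (id INTEGER PRIMARY KEY);
CREATE TABLE blog_comments (id INTEGER PRIMARY KEY, blog_id INTEGER);
CREATE TABLE blog_likes (id INTEGER PRIMARY KEY, blog_id INTEGER);
INSERT INTO blog VALUES (1), (2);
INSERT INTO blog_comments VALUES (1, 1), (2, 1);
INSERT INTO blog_likes VALUES (1, 1), (2, 1), (3, 1);
""")

# Shared FROM/GROUP BY part: joining both child tables at once pairs every
# comment row with every like row of the same post (2 * 3 = 6 rows for post 1).
JOINS = """
FROM blog b
LEFT JOIN blog_comments bc ON b.id = bc.blog_id
LEFT JOIN blog_likes bl ON b.id = bl.blog_id
GROUP BY b.id
ORDER BY b.id
"""

# Plain COUNT counts the multiplied rows.
naive = conn.execute(
    "SELECT b.id, COUNT(bl.id) AS likes, COUNT(bc.id) AS comments" + JOINS
).fetchall()

# COUNT(DISTINCT ...) collapses the duplicates again.
distinct = conn.execute(
    "SELECT b.id, COUNT(DISTINCT bl.id) AS likes, COUNT(DISTINCT bc.id) AS comments" + JOINS
).fetchall()

# Correlated subqueries never build the multiplied intermediate result.
subqueries = conn.execute("""
SELECT b.id,
       (SELECT COUNT(*) FROM blog_likes bl WHERE bl.blog_id = b.id) AS likes,
       (SELECT COUNT(*) FROM blog_comments bc WHERE bc.blog_id = b.id) AS comments
FROM blog b
ORDER BY b.id
""").fetchall()

print("naive:     ", naive)
print("distinct:  ", distinct)
print("subqueries:", subqueries)
```

The naive version reports 6 likes and 6 comments for post 1, while the other two report 3 likes and 2 comments.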
This is according to my table it will help you...
SELECT people.pe_name, COUNT(distinct orders.ord_id) AS num_orders, COUNT(items.item_id) AS num_items FROM people INNER JOIN orders ON orders.pe_id = people.pe_id INNER JOIN items ON items.ord_id = orders.ord_id GROUP BY people.pe_id;

Nested query performance

I have two queries below. The first one has a nested select. The second one makes use of a group by clause.
select
posts.*,
(select count(*) from comments where comments.post_id = posts.id and comments.is_approved = 1) as comments_count
from
posts
select
posts.*,
count(comments.id) comments_count
from
posts
left join comments on
comments.post_id = posts.id
group by
posts.*
From my understanding the first query is worse because it has to do a select for each record in posts where as the second query does not.
Is this true or false?
As with all performance questions, you should test the performance on your system with your data.
However, I would expect the first to perform better, with the right indexes. The right index for:
select p.*,
(select count(*)
from comments c
where c.post_id = p.id and c.is_approved = 1
) as comments_count
from posts p
is comments(post_id, is_approved).
MySQL implements a group by by doing a file sort. This version saves a file sort on all the data. My guess is that will be faster than the second method.
As a note: group by posts.* is not valid syntax. I assume this was intended for illustration purposes only.
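One more detail worth checking before comparing speed: as written, the first query in the question counts only approved comments while the second counts all of them. To make the join version equivalent, the is_approved filter has to go into the ON clause; putting it in WHERE drops posts without approved comments. The sketch below uses Python's sqlite3 as a stand-in for MySQL, with invented sample rows and the comments(post_id, is_approved) index suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE posts (id INTEGER PRIMARY KEY);
CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, is_approved INTEGER);
CREATE INDEX comments_post_approved ON comments (post_id, is_approved);
INSERT INTO posts VALUES (1), (2), (3);
-- post 1: one approved and one unapproved comment; post 2: unapproved only; post 3: none
INSERT INTO comments VALUES (1, 1, 1), (2, 1, 0), (3, 2, 0);
""")

# Correlated subquery, as in the first query of the question.
correlated = conn.execute("""
SELECT p.id,
       (SELECT COUNT(*) FROM comments c
        WHERE c.post_id = p.id AND c.is_approved = 1) AS comments_count
FROM posts p
ORDER BY p.id
""").fetchall()

# LEFT JOIN with the filter in ON: unmatched posts survive with a count of 0.
join_filter_in_on = conn.execute("""
SELECT p.id, COUNT(c.id) AS comments_count
FROM posts p
LEFT JOIN comments c ON c.post_id = p.id AND c.is_approved = 1
GROUP BY p.id
ORDER BY p.id
""").fetchall()

# LEFT JOIN with the filter in WHERE: rows whose c.is_approved is NULL or 0
# are removed, so posts 2 and 3 disappear from the result.
join_filter_in_where = conn.execute("""
SELECT p.id, COUNT(c.id) AS comments_count
FROM posts p
LEFT JOIN comments c ON c.post_id = p.id
WHERE c.is_approved = 1
GROUP BY p.id
ORDER BY p.id
""").fetchall()

print(correlated)
print(join_filter_in_on)
print(join_filter_in_where)
```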
This is the standard way I would do it (the use of LEFT JOIN, and SUM lets you also know which posts have no comments.)
SELECT posts.*
, SUM(IF(comments.id IS NULL, 0, 1)) AS comments_count
FROM posts
LEFT JOIN comments USING (post_id)
GROUP BY posts.post_id
;
But if I were trying for faster, this might be better.
SELECT posts.*, IFNULL(subQ.comments_count, 0) AS comments_count
FROM posts
LEFT JOIN (
SELECT post_id, COUNT(1) AS comments_count
FROM comments
GROUP BY post_id
) As subQ
USING (post_id)
;
After a bit more research I found no time difference between the two queries
require 'benchmark'

Benchmark.bm do |b|
  # Correlated (nested) subquery in the select list
  b.report('nested') do
    1000.times do
      ActiveRecord::Base.connection.execute('
        select
          p.id,
          (select count(c.id) from comments c where c.post_id = p.id) comment_count
        from
          posts p;')
    end
  end
  # LEFT JOIN with GROUP BY
  b.report('joined') do
    1000.times do
      ActiveRecord::Base.connection.execute('
        select
          p.id,
          count(c.id) comment_count
        from
          posts p
          left join comments c on
            c.post_id = p.id
        group by
          p.id;')
    end
  end
end
user system total real
nested 2.120000 0.900000 3.020000 ( 3.349015)
joined 2.110000 0.990000 3.100000 ( 3.402986)
However, when running EXPLAIN on both queries, I noticed that more indexes are possible in the first query, which makes me think it is the better option if the attributes needed in the select change.

SQL query optimization and sort by other row if first is empty

SQL Query:
SELECT
T.*,
U.nick AS author_nick,
P.id AS post_id,
P.name AS post_name,
P.author AS post_author_id,
U2.nick AS post_author
FROM
zero_topics T
LEFT JOIN
zero_posts P
ON
T.id = P.topic_id
LEFT JOIN
zero_players U
ON
T.author = U.uuid
LEFT JOIN
zero_players U2
ON
P.author = U2.uuid
ORDER BY
P.id DESC
Questions:
1. Do I need the double left join to get the user nick from the UUID for both the topic and the post?
2. Not all topics will have a post. As you can see, I sort by post id (it will be the date later), but that puts topics with the latest post first and topics without replies at the bottom. How can I define the order when posts don't exist?
1. You will need the double left join if you want to show the nicks in different columns.
2. You could use a CASE in your ORDER BY:
ORDER BY
CASE
WHEN P.id is null THEN T.ID
ELSE P.ID
END ASC
Final Query:-
SELECT
T.*,
U.nick AS author_nick,
P.id AS post_id,
P.name AS post_name,
P.author AS post_author_id,
U2.nick AS post_author
FROM
zero_topics T
LEFT JOIN
zero_posts P
ON
T.id = P.topic_id
LEFT JOIN
zero_players U
ON
T.author = U.uuid
LEFT JOIN
zero_players U2
ON
P.author = U2.uuid
ORDER BY
CASE
WHEN P.id is null THEN T.ID
ELSE P.ID
END ASC
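The CASE expression above is equivalent to COALESCE(P.id, T.id): use the post id when there is one, otherwise fall back to the topic id. A minimal sketch with Python's sqlite3 (SQLite standing in for MySQL; both sort NULL first in ascending order and last in descending order) and invented ids shows how the three orderings differ. Whether post ids and topic ids are meaningfully comparable is an assumption carried over from the answer; with a real date column on both tables the same pattern would apply to the dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE zero_topics (id INTEGER PRIMARY KEY);
CREATE TABLE zero_posts (id INTEGER PRIMARY KEY, topic_id INTEGER);
INSERT INTO zero_topics VALUES (1), (2), (3);
-- topic 2 has no posts
INSERT INTO zero_posts VALUES (10, 1), (11, 3);
""")

BASE = """
SELECT T.id
FROM zero_topics T
LEFT JOIN zero_posts P ON T.id = P.topic_id
ORDER BY {order}
"""


def topic_order(order_clause):
    """Return topic ids in the order produced by the given ORDER BY expression."""
    sql = BASE.format(order=order_clause)
    return [row[0] for row in conn.execute(sql)]


# Original ordering: topics without posts (P.id IS NULL) end up last.
by_post_desc = topic_order("P.id DESC")

# CASE from the answer: fall back to the topic id when there is no post.
by_case = topic_order("CASE WHEN P.id IS NULL THEN T.id ELSE P.id END ASC")

# Shorter spelling of the same fallback.
by_coalesce = topic_order("COALESCE(P.id, T.id) ASC")

print(by_post_desc, by_case, by_coalesce)
```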
You actually have two join chains from the topics table. One chain ties an author directly to the topic and one ties an author to each post about the topic, either one or both may be left joined. But once you start a left join in a chain, it must then be continued down the rest of the chain or you nullify the left join. Actually, the topic author is in a chain of length 1 so you don't have to worry about that one.
If every topic has an author, you don't need to left join the first players table (T.author = U.uuid) as that would always link. You would left join down the post chain to see topics even if they have no posts written on them.
Assuming that is what you want to see, then the order by clause could well stay just as you wrote it. What you would get is a list of posts, ordered by ID, with the topics scattered around however they ended up. Any topics with no posts would be clumped all either at the beginning or at the end of the result set, depending on your settings and the DBMS.
If, however, you wrote the order by like this:
order by t.Title, p.id;
Then you would get all the topics ordered by title, with the posts written about each topic ordered by ID within that topic. Any topic with no posts would have a single row (assuming only one topic author) in the proper title order but showing only topic data.
So it all depends on what you want to see.

MySQL query with LEFT OUTER JOIN and WHERE

I have three tables: stories, story_types, and comments
The following query retrieves all of the records in the stories table, gets their story_types, and the number of comments associated with each story:
SELECT s.id AS id,
s.story_date AS datetime,
s.story_content AS content,
t.story_type_label AS type_label,
t.story_type_slug AS type_slug,
COUNT(c.id) AS comment_count
FROM stories AS s
LEFT OUTER JOIN story_types AS t ON s.story_type_id = t.id
LEFT OUTER JOIN comments AS c ON s.id = c.story_id
GROUP BY s.id;
Now what I want to do is only retrieve a record from stories WHERE s.id = 1 (that's the primary key). I have tried the following, but it still returns all of the records:
SELECT s.id AS id,
s.story_date AS datetime,
s.story_content AS content,
t.story_type_label AS type_label,
t.story_type_slug AS type_slug,
COUNT(c.id) AS comment_count
FROM stories AS s
LEFT OUTER JOIN story_types AS t ON s.story_type_id = t.id
AND s.id = 1
LEFT OUTER JOIN comments AS c ON s.id = c.story_id
GROUP BY s.id;
I have also tried a WHERE clause at the end, which throws an error.
Can someone point out the correct syntax for a condition like this in this situation?
I'm using MySQL 5.1.47. Thanks.
I'm guessing you put the WHERE after the GROUP BY, which is illegal. See this reference on the SELECT syntax in MySQL.
Try this:
SELECT
s.id AS id,
s.story_date AS datetime,
s.story_content AS content,
t.story_type_label AS type_label,
t.story_type_slug AS type_slug,
COUNT(c.id) AS comment_count
FROM
stories AS s
LEFT JOIN story_types AS t ON s.story_type_id = t.id
LEFT JOIN comments AS c ON s.id = c.story_id
WHERE
s.id = 1
GROUP BY
s.id;
editor's note: I reformatted the code to highlight the query structure
Following up this comment on the accepted answer:
It is not intuitive to me that this WHERE would go in the second JOIN
This is just to outline how proper code formatting enhances understanding. Here is how I usually format SQL:
SELECT
s.id AS id,
s.story_date AS datetime,
s.story_content AS content,
t.story_type_label AS type_label,
t.story_type_slug AS type_slug,
COUNT(c.id) AS comment_count
FROM
stories AS s
LEFT JOIN story_types AS t ON t.id = s.story_type_id
LEFT OUTER JOIN comments AS c ON s.id = c.story_id
WHERE
s.id = 1
GROUP BY
s.id;
The WHERE is not on the second join. There is only one WHERE clause allowed in a SELECT statement, and it always is top level.
PS: Also note that in many database engines (apart from MySQL) it is illegal to use a GROUP BY clause and then select columns that are neither listed in the GROUP BY nor aggregated via functions like MIN(), MAX(), or COUNT(). IMHO this is bad style and a bad habit to get into.
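As for why the attempt with AND s.id = 1 inside the first join's ON clause still returned every story: a condition in the ON clause of a LEFT JOIN only decides whether a right-hand row matches; it never removes rows of the left table. Here is a small sketch using Python's sqlite3 as a stand-in for MySQL, with invented sample rows and only the columns needed for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE story_types (id INTEGER PRIMARY KEY, story_type_label TEXT);
CREATE TABLE stories (id INTEGER PRIMARY KEY, story_type_id INTEGER);
CREATE TABLE comments (id INTEGER PRIMARY KEY, story_id INTEGER);
INSERT INTO story_types VALUES (1, 'news');
INSERT INTO stories VALUES (1, 1), (2, 1);
INSERT INTO comments VALUES (1, 1), (2, 1), (3, 2);
""")

# Filter in ON: story 2 is still returned, it just loses its story_types match.
filter_in_on = conn.execute("""
SELECT s.id, t.story_type_label, COUNT(c.id) AS comment_count
FROM stories AS s
LEFT JOIN story_types AS t ON s.story_type_id = t.id AND s.id = 1
LEFT JOIN comments AS c ON s.id = c.story_id
GROUP BY s.id
ORDER BY s.id
""").fetchall()

# Filter in WHERE (placed before GROUP BY): only story 1 is returned.
filter_in_where = conn.execute("""
SELECT s.id, t.story_type_label, COUNT(c.id) AS comment_count
FROM stories AS s
LEFT JOIN story_types AS t ON s.story_type_id = t.id
LEFT JOIN comments AS c ON s.id = c.story_id
WHERE s.id = 1
GROUP BY s.id
ORDER BY s.id
""").fetchall()

print(filter_in_on)
print(filter_in_where)
```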

Get the latest row from another table in MySQL

Let's say I have two tables, news and comments.
news (
id,
subject,
body,
posted
)
comments (
id,
parent, // points to news.id
message,
name,
posted
)
I would like to create one query that grabs the latest x # of news item along with the name and posted date for the latest comment for each news post.
Speed matters, so selecting ALL the comments in a subquery is not an option.
I just realized the query does not return results if there are no comments attached to the news table, here's the fix as well as an added column for the total # of posts:
SELECT news.*, comments.name, comments.posted, (SELECT count(id) FROM comments WHERE comments.parent = news.id) AS numComments
FROM news
LEFT JOIN comments
ON news.id = comments.parent
AND comments.id = (SELECT max(id) FROM comments WHERE parent = news.id)
If speed is that important, why not create a recent_comment table that contains the id and parent id of just the most recent comments? Every time a comment is posted on a news post, replace that news id's most recent comment id. Create an index on the news id column of the new table and your joins will be fast. You'd be trading write speed for read speed, but not by a whole lot.
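A minimal sketch of that idea, using Python's sqlite3 as a stand-in for MySQL. The recent_comment column names, the add_comment helper and the sample rows are hypothetical; REPLACE INTO exists in both MySQL and SQLite and relies on parent being the primary key. The sketch also assumes the comment inserted last is the most recent one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE news (id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT, posted TEXT);
-- one row per news item: parent is the key, comment_id points at its newest comment
CREATE TABLE recent_comment (parent INTEGER PRIMARY KEY, comment_id INTEGER NOT NULL);
INSERT INTO news VALUES (1, 'First'), (2, 'Second');
""")


def add_comment(connection, parent, name, posted):
    """Insert a comment and record it as the newest one for its news item."""
    with connection:  # commit both statements together, roll back on error
        cur = connection.execute(
            "INSERT INTO comments (parent, name, posted) VALUES (?, ?, ?)",
            (parent, name, posted),
        )
        connection.execute(
            "REPLACE INTO recent_comment (parent, comment_id) VALUES (?, ?)",
            (parent, cur.lastrowid),
        )
    return cur.lastrowid


add_comment(conn, 1, 'ann', '2020-01-01')
add_comment(conn, 1, 'ben', '2020-01-02')

# Reading is now two primary-key lookups per news item; LEFT JOINs keep
# news items that have no comments yet.
rows = conn.execute("""
SELECT n.id, n.subject, c.name, c.posted
FROM news n
LEFT JOIN recent_comment rc ON rc.parent = n.id
LEFT JOIN comments c ON c.id = rc.comment_id
ORDER BY n.id
""").fetchall()
print(rows)
```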
Assuming posted is a unique timestamp, otherwise choose a unique autonumber
select c.id, c.parent,
c.message, c.name,
c.posted -- same as comment_latest.recent
from comments c
join
(
select parent, max(posted) as recent
from comments
group by parent
) as comment_latest
on c.parent = comment_latest.parent
and c.posted = comment_latest.recent
Complete(displays news information):
select
n.id as news_id, n.subject, n.body, n.posted as news_posted_date,
c.id as comment_id,
c.message, c.name as commenter_name, c.posted as comment_posted_date
from comments c
join
(
select r.parent, max(r.posted) as recent
from comments r
join
(
select id from news order by id desc limit $last_x_news
) as l
on r.parent = l.id
group by r.parent
) as comment_latest
on c.parent = comment_latest.parent
and c.posted = comment_latest.recent
join news n on c.parent = n.id
NOTE:
The above code does not use correlated subqueries; it joins against a derived table, which is usually faster. This is the subquery version (slow):
select
id,
subject,
body,
posted as news_posted_date,
(select id from comments where parent = news.id order by posted desc limit 1) as comment_id,
(select message from comments where parent = news.id order by posted desc limit 1) as message,
(select name from comments where parent = news.id order by posted desc limit 1) as name,
(select posted from comments where parent = news.id order by posted desc limit 1) as comment_posted_date
from news
SELECT news.subject, news.body, comments.name, comments.posted
FROM news
INNER JOIN comments ON
(comments.parent = news.id)
WHERE comments.parent = news.id
AND comments.id = (SELECT MAX(id)
FROM comments
WHERE parent = news.id)
ORDER BY news.id
This gets all the news items, along with the related comment with the highest id value, which in theory should be the latest.
My solution is similar to J but I think he added one line that is unnecessary:
SELECT news.*, comments.name, comments.posted FROM news INNER JOIN comments ON news.id = comments.parent WHERE comments.id = (SELECT max(id) FROM comments WHERE parent = news.id )
Not sure of the speed on an extremely large table though.
Given the constraints brought to light in the comments of my other answer, I have a new idea that may or may not make any sense in practise.
Create a view (or function if it's more appropriate) with the following definition, called recent_comments:
SELECT MAX(id) AS id, parent
FROM comments
GROUP BY parent
If you have a clustered index on the parent column, this is probably a reasonably fast query, but even then it will still be a bottleneck.
Using this, the query you need to get your answer is something like,
SELECT news.*, comments.*
FROM news
INNER JOIN recent_comments
ON news.id = recent_comments.parent
INNER JOIN comments
ON comments.id = recent_comments.id
Plus considerations for news posts that don't have any comments yet.
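One way to handle those comment-less news posts is to switch both joins to LEFT JOIN. The sketch below uses Python's sqlite3 as a stand-in for MySQL with invented sample rows; the view aliases MAX(id) as id so that recent_comments.id can be referenced in the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE news (id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT, posted TEXT);
INSERT INTO news VALUES (1, 'First'), (2, 'Second');
INSERT INTO comments VALUES (1, 1, 'ann', '2020-01-01'), (2, 1, 'ben', '2020-01-02');

-- newest comment id per news item; assumes a higher id means a later comment
CREATE VIEW recent_comments AS
SELECT MAX(id) AS id, parent
FROM comments
GROUP BY parent;
""")

# LEFT JOINs keep news item 2 even though it has no row in recent_comments.
rows = conn.execute("""
SELECT news.id, news.subject, comments.name, comments.posted
FROM news
LEFT JOIN recent_comments ON news.id = recent_comments.parent
LEFT JOIN comments ON comments.id = recent_comments.id
ORDER BY news.id
""").fetchall()
print(rows)
```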
I think the solution provided by @Jan is the best, i.e. create the view and inner join it in the SQL statement.
It'll definitely reduce the time to pull the data. I tested it and it works 100%.