MySQL forum - get number of replies

I'm making a very simple forum with one table called forum_posts. I store both the replies and the posts in that same table because I've found that works pretty well for a comments system I made before.
If the post is a reply, it has a reply_id that is the post_id of the post it is replying to. If it is a 'root post' so to speak, it has a 0 for the reply_id.
I already have the number of views. But I'd like to get the number of replies for each record in the results.
How would I do that?
SELECT a.account_id, a.store_name, p.post_id, p.post_title, p.post_text, p.views, p.creation_timestamp, p.update_timestamp
FROM forum_posts AS p
INNER JOIN accounts AS a
ON p.account_id = a.account_id
WHERE p.reply_id > 0
As you can probably guess, I'm making the forum listings where people choose a forum post to go and view.

You will need to join posts against itself and count it to get the number of replies. This isn't particularly efficient over millions of forum posts - most forums denormalize this and maintain a post count attribute separately.
However, in keeping with what you have, something like (untested)...
SELECT a.account_id, a.store_name, cp.* FROM
(
SELECT p.post_id, p.post_title, p.post_text, p.views, p.creation_timestamp,
p.update_timestamp, p.account_id, COUNT(p1.post_id) AS replies
FROM forum_posts AS p
LEFT JOIN forum_posts AS p1 ON p1.reply_id > 0 AND p1.reply_id = p.post_id
GROUP BY p.post_id
)
AS cp
INNER JOIN accounts AS a ON cp.account_id = a.account_id
WHERE cp.replies > 0
The end WHERE is optional. I just copied it from your first query. It's also worth noting that this is MySQL specific as it uses the GROUP BY without a full list of non-aggregated columns (but you tagged your question as MySQL so no problem).
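For the denormalized alternative mentioned at the start of this answer, here is a minimal sketch, assuming a hypothetical reply_count column on forum_posts that is bumped in the same transaction that inserts a reply. It uses Python's built-in sqlite3 only so that it runs anywhere; the two SQL statements are plain enough to carry over to MySQL, but treat the helper names (make_forum, add_post) as illustrative, not as part of your schema.

```python
import sqlite3


def make_forum():
    """Create an in-memory table with a hypothetical reply_count column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE forum_posts (
            post_id     INTEGER PRIMARY KEY,
            reply_id    INTEGER NOT NULL DEFAULT 0,  -- 0 marks a root post
            post_title  TEXT,
            reply_count INTEGER NOT NULL DEFAULT 0   -- denormalized counter (assumption)
        )
    """)
    return conn


def add_post(conn, title, reply_to=0):
    """Insert a post; if it is a reply, bump the parent's counter atomically."""
    with conn:  # commits both statements together, or rolls both back
        cur = conn.execute(
            "INSERT INTO forum_posts (reply_id, post_title) VALUES (?, ?)",
            (reply_to, title),
        )
        if reply_to:
            conn.execute(
                "UPDATE forum_posts SET reply_count = reply_count + 1 WHERE post_id = ?",
                (reply_to,),
            )
    return cur.lastrowid


conn = make_forum()
root = add_post(conn, "first thread")
add_post(conn, "reply a", reply_to=root)
add_post(conn, "reply b", reply_to=root)

# The listing query no longer needs a join or a GROUP BY.
listing = conn.execute(
    "SELECT post_id, post_title, reply_count FROM forum_posts WHERE reply_id = 0"
).fetchall()
print(listing)
```

The trade-off is that every code path that adds or deletes a reply has to keep the counter in sync, which is why the sketch wraps both statements in one transaction.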

SELECT account_id, store_name, post_id, post_title, post_text, views, creation_timestamp, update_timestamp, IF(reply_id IS NOT NULL, replies, 0) AS replies
FROM (
SELECT a.account_id, a.store_name, p.post_id, p.post_title, p.post_text, p.views, p.creation_timestamp, p.update_timestamp, r.reply_id, COUNT(*) as replies
FROM forum_posts AS p
INNER JOIN accounts AS a
ON p.account_id = a.account_id
LEFT JOIN forum_posts AS r
ON p.post_id = r.reply_id
WHERE p.reply_id = 0
GROUP BY p.post_id
) AS sq
It also seems that you need p.reply_id = 0 if you want to show only "root" posts (forum threads) rather than replies.
The outer query makes sure that posts with no replies (which the inner query still returns, with a COUNT(*) of 1) are shown with 0 replies rather than 1, which would obviously be incorrect.
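To see why a post without replies shows up as 1 unless it is corrected, here is a small sketch that runs the self-join on a trimmed-down forum_posts table and returns both COUNT(*) and a count of a column from the joined side. It uses Python's sqlite3 as a stand-in for MySQL, and the sample rows are made up for illustration.

```python
import sqlite3


def root_post_reply_counts():
    """Return (post_id, joined_rows, replies) for each root post."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE forum_posts (
            post_id    INTEGER PRIMARY KEY,
            reply_id   INTEGER NOT NULL DEFAULT 0,  -- 0 marks a root post
            post_title TEXT
        );
        INSERT INTO forum_posts (post_id, reply_id, post_title) VALUES
            (1, 0, 'first thread'),
            (2, 0, 'second thread'),
            (3, 1, 'reply a'),
            (4, 1, 'reply b');
    """)
    rows = conn.execute("""
        SELECT p.post_id,
               COUNT(*)         AS joined_rows,  -- counts the NULL-extended row too
               COUNT(r.post_id) AS replies       -- skips NULLs, so 0 when no reply matched
        FROM forum_posts AS p
        LEFT JOIN forum_posts AS r ON r.reply_id = p.post_id
        WHERE p.reply_id = 0
        GROUP BY p.post_id
        ORDER BY p.post_id
    """).fetchall()
    conn.close()
    return rows


reply_rows = root_post_reply_counts()
print(reply_rows)
```

Counting a column from the joined table gives the reply count directly, so the outer IF wrapper becomes optional if you go that way.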

Related

Nested query performance

I have two queries below. The first one has a nested select. The second one makes use of a group by clause.
select
posts.*,
(select count(*) from comments where comments.post_id = posts.id and comments.is_approved = 1) as comments_count
from
posts
select
posts.*,
count(comments.id) comments_count
from
posts
left join comments on
comments.post_id = posts.id
group by
posts.*
From my understanding, the first query is worse because it has to do a select for each record in posts, whereas the second query does not.
Is this true or false?
As with all performance questions, you should test the performance on your system with your data.
However, I would expect the first to perform better, with the right indexes. The right index for:
select p.*,
(select count(*)
from comments c
where c.post_id = p.id and c.is_approved = 1
) as comments_count
from posts p
is comments(post_id, is_approved).
MySQL typically implements a group by with a file sort, unless an index can supply the rows in group order. This version saves a file sort on all the data. My guess is that it will be faster than the second method.
As a note: group by posts.* is not valid syntax. I assume this was intended for illustration purposes only.
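A minimal sketch of the index-plus-correlated-subquery setup, run against Python's sqlite3 rather than MySQL so it is self-contained. The index name idx_comments_post_approved and the sample rows are assumptions for illustration; the query plan it prints is SQLite's, so check EXPLAIN on your own MySQL server before drawing conclusions.

```python
import sqlite3


def approved_comment_counts():
    """Return per-post approved comment counts and SQLite's plan for the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE comments (
            id          INTEGER PRIMARY KEY,
            post_id     INTEGER NOT NULL,
            is_approved INTEGER NOT NULL
        );
        -- Composite index matching the subquery's WHERE clause (name is hypothetical).
        CREATE INDEX idx_comments_post_approved ON comments (post_id, is_approved);
        INSERT INTO posts (id, title) VALUES (1, 'a'), (2, 'b');
        INSERT INTO comments (post_id, is_approved) VALUES (1, 1), (1, 1), (1, 0), (2, 0);
    """)
    query = """
        SELECT p.id,
               (SELECT COUNT(*)
                FROM comments c
                WHERE c.post_id = p.id AND c.is_approved = 1) AS comments_count
        FROM posts p
        ORDER BY p.id
    """
    counts = conn.execute(query).fetchall()
    # Column 3 of each EXPLAIN QUERY PLAN row is the human-readable step.
    plan = [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + query)]
    conn.close()
    return counts, plan


counts, plan = approved_comment_counts()
print(counts)
for step in plan:
    print(step)
```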
This is the standard way I would do it (the use of LEFT JOIN, and SUM lets you also know which posts have no comments.)
SELECT posts.*
, SUM(IF(comments.id IS NULL, 0, 1)) AS comments_count
FROM posts
LEFT JOIN comments ON comments.post_id = posts.id
GROUP BY posts.id
;
But if I were trying for faster, this might be better.
SELECT posts.*, IFNULL(subQ.comments_count, 0) AS comments_count
FROM posts
LEFT JOIN (
SELECT post_id, COUNT(1) AS comments_count
FROM comments
GROUP BY post_id
) As subQ
ON subQ.post_id = posts.id
;
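A quick way to convince yourself that the two shapes return the same counts is to run them side by side on a toy data set. This sketch does that with Python's sqlite3 as a stand-in for MySQL, assuming the question's schema (posts.id, comments.post_id); it says nothing about relative speed, which you would still have to measure on your own data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER NOT NULL);
    INSERT INTO posts (id, title) VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO comments (post_id) VALUES (1), (1), (2);
""")

# Shape 1: join first, then group the joined rows.
join_then_group = conn.execute("""
    SELECT posts.id, COUNT(comments.id) AS comments_count
    FROM posts
    LEFT JOIN comments ON comments.post_id = posts.id
    GROUP BY posts.id
    ORDER BY posts.id
""").fetchall()

# Shape 2: aggregate comments once, then join the small result to posts.
group_then_join = conn.execute("""
    SELECT posts.id, IFNULL(subQ.comments_count, 0) AS comments_count
    FROM posts
    LEFT JOIN (
        SELECT post_id, COUNT(1) AS comments_count
        FROM comments
        GROUP BY post_id
    ) AS subQ ON subQ.post_id = posts.id
    ORDER BY posts.id
""").fetchall()

print(join_then_group)
print(group_then_join)
```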
After a bit more research, I found no time difference between the two queries:
require 'benchmark'
# Assumes an ActiveRecord connection is already established (e.g. a Rails console).
Benchmark.bm do |b|
# Correlated subquery in the select list
b.report('nested') do
1000.times do
ActiveRecord::Base.connection.execute('
select
p.id,
(select count(c.id) from comments c where c.post_id = p.id) comment_count
from
posts p;')
end
end
# Left join with group by
b.report('joined') do
1000.times do
ActiveRecord::Base.connection.execute('
select
p.id,
count(c.id) comment_count
from
posts p
left join comments c on
c.post_id = p.id
group by
p.id;')
end
end
end
user system total real
nested 2.120000 0.900000 3.020000 ( 3.349015)
joined 2.110000 0.990000 3.100000 ( 3.402986)
However, I did notice that when running EXPLAIN for both queries, more possible indexes are listed for the first query, which makes me think it is the better option if the attributes needed in the select change.

SQL query optimization and sort by other row if first is empty

SQL Query:
SELECT
T.*,
U.nick AS author_nick,
P.id AS post_id,
P.name AS post_name,
P.author AS post_author_id,
U2.nick AS post_author
FROM
zero_topics T
LEFT JOIN
zero_posts P
ON
T.id = P.topic_id
LEFT JOIN
zero_players U
ON
T.author = U.uuid
LEFT JOIN
zero_players U2
ON
P.author = U2.uuid
ORDER BY
P.id DESC
Questions:
1. Do I need a double left join to get the user nick from the UUID for both the topic and the post?
2. Not all topics will have a post. As you can see, I sort by post id (later it will be the date), but that puts topics with the latest post first and topics without replies at the bottom. How can I define the order when posts don't exist?
1. You will need the double left join if you need to show the nicks in different columns.
2. You could use a CASE in your ORDER BY:
ORDER BY
CASE
WHEN P.id is null THEN T.ID
ELSE P.ID
END ASC
Final query:
SELECT
T.*,
U.nick AS author_nick,
P.id AS post_id,
P.name AS post_name,
P.author AS post_author_id,
U2.nick AS post_author
FROM
zero_topics T
LEFT JOIN
zero_posts P
ON
T.id = P.topic_id
LEFT JOIN
zero_players U
ON
T.author = U.uuid
LEFT JOIN
zero_players U2
ON
P.author = U2.uuid
ORDER BY
CASE
WHEN P.id is null THEN T.ID
ELSE P.ID
END ASC
You actually have two join chains from the topics table. One chain ties an author directly to the topic and one ties an author to each post about the topic, either one or both may be left joined. But once you start a left join in a chain, it must then be continued down the rest of the chain or you nullify the left join. Actually, the topic author is in a chain of length 1 so you don't have to worry about that one.
If every topic has an author, you don't need to left join the first players table (T.author = U.uuid) as that would always link. You would left join down the post chain to see topics even if they have no posts written on them.
Assuming that is what you want to see, then the order by clause could well stay just as you wrote it. What you would get is a list of posts, ordered by ID, with the topics scattered around however they ended up. Any topics with no posts would be clumped all either at the beginning or at the end of the result set, depending on your settings and the DBMS.
If, however, you wrote the order by like this:
order by t.Title, p.id;
Then you would get all the topics ordered by title, with the posts written about each topic ordered by ID within that topic. Any topic with no posts would have a single row (assuming only one topic author) in the proper title order but showing only topic data.
So it all depends on what you want to see.
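If you would rather not depend on where the DBMS happens to put NULLs, one option is to sort on the "is NULL" test first and on the post id second. A minimal sketch, run with Python's sqlite3 as a stand-in for MySQL (both treat the comparison result as 0 or 1); the table contents and the topic_rows helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE zero_topics (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE zero_posts  (id INTEGER PRIMARY KEY, topic_id INTEGER, name TEXT);
    INSERT INTO zero_topics (id, name) VALUES (1, 't1'), (2, 't2'), (3, 't3');
    -- Topic 2 has no posts at all.
    INSERT INTO zero_posts (id, topic_id, name) VALUES (10, 1, 'a'), (11, 3, 'b'), (12, 1, 'c');
""")


def topic_rows(empty_first=False):
    """Newest posts first; topics without posts go explicitly first or last."""
    # (P.id IS NULL) is 1 for topics without posts and 0 otherwise.
    null_order = "DESC" if empty_first else "ASC"
    return conn.execute(f"""
        SELECT T.id AS topic_id, P.id AS post_id
        FROM zero_topics T
        LEFT JOIN zero_posts P ON T.id = P.topic_id
        ORDER BY (P.id IS NULL) {null_order}, P.id DESC
    """).fetchall()


print(topic_rows())                  # topics without posts at the bottom
print(topic_rows(empty_first=True))  # topics without posts at the top
```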

MYSQL subquery SELECT in JOIN clause

Ok... well I have to put the subquery in a JOIN clause since it selects more than one column and putting it in the SELECT clause does not allow that as it gives me an error of an operand.
Anywho, this is my query:
SELECT
c.id,
c.title,
c.description,
c.icon,
p.id as topic_id,
p.title AS topic_title,
p.date,
p.username
FROM forum_cat c
LEFT JOIN (
SELECT
ft.id,
ft.cat_id,
ft.title,
fp.date,
u.username
FROM forum_topic ft
JOIN forum_post fp ON fp.topic_id = ft.id
JOIN user u ON u.user_id = fp.author_id
WHERE ft.cat_id = c.id
ORDER BY fp.date DESC
LIMIT 1
) p ON p.cat_id = c.id
WHERE c.main_cat = ?
ORDER BY c.list_no
Now the important thing I need here... FOR EACH category, I want to show the latest post and topic title in each category.
However, this select statement is going INSIDE a foreach loop looping around the general categories, which are found by main_cat.
So there are 5 main categories with 3-8 subcategories each, and this is the subcategory query. FOR EACH subcategory, I need to grab the latest post. However, it only runs this SELECT query once for each main category, so it only selects THE LATEST post across all subcategories combined. I want to get the latest post of EACH subcategory, but I'd rather not run this query for each subcategory, since I want the page load to be fast.
BUT REMEMBER, some subcategories WILL NOT have a latest post since some of them may not even contain a topic yet! So hence the left join.
Does anyone know how to go about doing this?
AND BTW, there is an error it gives me (WHERE ft.cat_id = c.id) in the subquery because c.id is an unknown column. But I'm trying to reference it from the outer query so can someone help me on that issue as well?
Thank you!
All tables:
forum_cat (Subcategories)
-----------------------------------------------
ID, Title, Description, Icon, Main_cat, List_no
forum_topic (Topics in each subcategory)
--------------------------------------------
ID, Author_id, Cat_id, Title, Sticky, Locked
forum_post (Posts in each topic)
--------------------------------------------
ID, Topic_id, Author_id, Body, Date, Hidden
The main categories are listed in a function. I didn't store them in the database since it was a waste of space since they never change. There are 7 main categories though.
It's hard to tell without seeing DDL of your tables, relevant sample data and desired output.
I could've got your requirements wrong, but try this:
SELECT *
FROM forum_cat c LEFT JOIN
(SELECT t.cat_id,
p.topic_id,
t.title,
p.id,
p.body,
MAX(p.`date`) AS `date`,
p.author_id,
u.username
FROM forum_post p INNER JOIN
forum_topic t ON t.id = p.topic_id INNER JOIN
`user` u ON u.user_id = p.author_id
GROUP BY t.cat_id) d ON d.cat_id = c.id
WHERE c.main_cat = 1
ORDER BY c.list_no
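One caveat with grouping by cat_id alone is that the non-aggregated columns (title, username and so on) are not guaranteed to come from the same row as MAX(date). A common workaround is to compute the latest date per subcategory in its own derived table and join back to the post that has that date. Here is a sketch of that idea, run with Python's sqlite3 as a stand-in for MySQL; the sample rows are invented, and it assumes no two posts in one subcategory share the exact same date (ties would produce duplicate rows).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum_cat   (id INTEGER PRIMARY KEY, title TEXT, main_cat INTEGER, list_no INTEGER);
    CREATE TABLE forum_topic (id INTEGER PRIMARY KEY, cat_id INTEGER, title TEXT);
    CREATE TABLE forum_post  (id INTEGER PRIMARY KEY, topic_id INTEGER, author_id INTEGER, date TEXT);
    CREATE TABLE user        (user_id INTEGER PRIMARY KEY, username TEXT);

    INSERT INTO forum_cat   VALUES (1, 'General', 1, 1), (2, 'Help', 1, 2), (3, 'Empty', 1, 3);
    INSERT INTO forum_topic VALUES (1, 1, 'Welcome'), (2, 1, 'Rules'), (3, 2, 'Install');
    INSERT INTO user        VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO forum_post  VALUES
        (1, 1, 1, '2020-01-01'),
        (2, 2, 2, '2020-01-05'),
        (3, 3, 1, '2020-01-03'),
        (4, 1, 2, '2020-01-02');
""")

latest_per_cat = conn.execute("""
    SELECT c.id, c.title, p.topic_title, p.date, p.username
    FROM forum_cat c
    LEFT JOIN (
        SELECT ft.cat_id, ft.title AS topic_title, fp.date, u.username
        FROM forum_post fp
        JOIN forum_topic ft ON ft.id = fp.topic_id
        JOIN user u ON u.user_id = fp.author_id
        JOIN (
            -- latest post date in each subcategory
            SELECT ft2.cat_id, MAX(fp2.date) AS latest
            FROM forum_post fp2
            JOIN forum_topic ft2 ON ft2.id = fp2.topic_id
            GROUP BY ft2.cat_id
        ) m ON m.cat_id = ft.cat_id AND m.latest = fp.date
    ) p ON p.cat_id = c.id
    WHERE c.main_cat = ?
    ORDER BY c.list_no
""", (1,)).fetchall()

for row in latest_per_cat:
    print(row)
```

Because the derived table is not correlated with the outer query, it also sidesteps the "unknown column c.id" error from the original attempt, and the outer LEFT JOIN keeps subcategories that have no topics yet.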

MySQL count of grandchild table returning different results

I am developing a PHP forum.
This forum uses four database tables: forum, thread, post, user.
On my landing page, I have a listing of all forums, plus columns for latest thread (achieved via join and inner join), total threads (simple count subquery), and total posts.
I have a fair-sized query that returns all of the above, and everything is working quite nicely - except for the total posts.
The main query is thus:
select f.id as forum_id,
f.name as forum_name,
f.description,
t.forum_id,
#this subquery counts total threads in each forum
(select count(t.forum_id)
from thread t
where t.forum_id = f.id
) as total_threads,
#this query counts total posts for each forum
(SELECT COUNT( p.id )
FROM post p
WHERE p.thread_id = t.id
AND t.forum_id = f.id
GROUP BY f.id) as total_posts,
t.id as thread_id,
t.name as thread_name,
t.forum_id as parent_forum,
t.user_id,
t.date_created,
u.id as user_id,
u.username
from forum f
# this join finds all latest threads of each forum
join
(select forum_id, max(date_created) as latest
from thread
group by forum_id) as d on d.forum_id = f.id
#and this inner join grabs the rest of the thread table for each latest thread
inner join thread as t
on d.forum_id = t.forum_id
and d.latest = t.date_created
join user as u on t.user_id = u.id
So, if you direct your attention to the total posts subquery above,
you'll notice that I am counting all posts whose thread id equals the id of each thread, which in turn belongs to each forum. If I use this query alone (and include the table aliases used elsewhere in the main query), it works perfectly.
However, when used in the context of the main query, with the table aliases being provided elsewhere, it only returns the count for the first thread per forum.
If I try to state the table aliases in the subquery, it returns the error that more than one row has been returned.
Why the discrepancy regarding the context of the query, and why is only the first thread counted when it is used as a calculated field in the main query?
As both t.forum_id and f.id are only relevant outside of the subquery, your subquery is equivalent to this:
IF(t.forum_id = f.id,
(SELECT COUNT(p.id)
FROM post p
WHERE p.thread_id = t.id
GROUP BY 1)
, 0) AS total_posts
You probably want something like this:
SELECT f.name AS forum_name, COUNT(p.id) AS total_posts
FROM forum AS f
JOIN thread AS t ON t.forum_id = f.id
JOIN post AS p ON p.thread_id = t.id
GROUP BY f.id
That query will return one row per forum, and should correctly include the post count.
Note that if there are no posts in a forum, that forum will not be returned by this query - you can change that by using LEFT JOINs instead of JOINs, if that is something you need to watch for.
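A small sketch of the LEFT JOIN variant, which also picks up the thread count in the same pass by counting distinct thread ids. It runs on Python's sqlite3 as a stand-in for MySQL, with made-up rows: one forum with posts, one with an empty thread, and one with no threads at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE forum  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE thread (id INTEGER PRIMARY KEY, forum_id INTEGER);
    CREATE TABLE post   (id INTEGER PRIMARY KEY, thread_id INTEGER);

    INSERT INTO forum  VALUES (1, 'News'), (2, 'Chat'), (3, 'Empty');
    INSERT INTO thread VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO post (thread_id) VALUES (1), (1), (2);
""")

forum_totals = conn.execute("""
    SELECT f.name AS forum_name,
           COUNT(DISTINCT t.id) AS total_threads,  -- a thread repeats once per post
           COUNT(p.id)          AS total_posts     -- NULLs from the LEFT JOIN are skipped
    FROM forum AS f
    LEFT JOIN thread AS t ON t.forum_id = f.id
    LEFT JOIN post   AS p ON p.thread_id = t.id
    GROUP BY f.id
    ORDER BY f.id
""").fetchall()

print(forum_totals)
```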

How to left join multiple one to many tables in mysql?

I have a problem with joining three tables in MySQL.
Let's say we have a table named posts in which I keep my entries, a table named likes in which I store user_ids and post_ids, and a third table named comments in which I store user_ids, post_ids and the comment text.
I need a query that fetches a list of my entries, with the number of likes and comments for each entry.
I'm using this query:
SELECT posts.id, count(comments.id) as total_comments, count(likes.id) as total_likes
FROM `posts`
LEFT OUTER JOIN comments ON comments.post_id = posts.id
LEFT OUTER JOIN likes ON likes.post_id = posts.id
GROUP BY posts.id
But there is a problem with this query: if an item has no comments, the likes count is fine, but if an entry has, say, 2 comments and 4 likes, both total_comments and total_likes will be "8", meaning that MySQL multiplies them.
I'm confused and I don't know what I should do.
Thanks in advance.
Use count(distinct comments.id) and count(distinct likes.id), provided these ids are unique.
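The multiplication is easy to reproduce: each of the 2 comment rows pairs with each of the 4 like rows, so the join produces 8 rows per post. A minimal sketch with Python's sqlite3 as a stand-in for MySQL, using invented sample rows, showing the plain counts next to the DISTINCT ones:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts    (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, user_id INTEGER);
    CREATE TABLE likes    (id INTEGER PRIMARY KEY, post_id INTEGER, user_id INTEGER);

    INSERT INTO posts (id) VALUES (1);
    INSERT INTO comments (post_id, user_id) VALUES (1, 1), (1, 2);
    INSERT INTO likes (post_id, user_id) VALUES (1, 1), (1, 2), (1, 3), (1, 4);
""")

counted = conn.execute("""
    SELECT posts.id,
           COUNT(comments.id)          AS raw_comments,    -- counts joined rows: 2 x 4
           COUNT(likes.id)             AS raw_likes,       -- same 8 joined rows
           COUNT(DISTINCT comments.id) AS total_comments,
           COUNT(DISTINCT likes.id)    AS total_likes
    FROM posts
    LEFT OUTER JOIN comments ON comments.post_id = posts.id
    LEFT OUTER JOIN likes    ON likes.post_id = posts.id
    GROUP BY posts.id
""").fetchall()

print(counted)
```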
Well this is one way to approach it (assuming mysql allows derived tables):
SELECT posts.id, comments.total_comments, likes.total_likes
FROM `posts`
LEFT OUTER JOIN (select post_id, count(id) as total_comments from comments group by post_id) comments
ON comments.post_id = posts.id
LEFT OUTER JOIN (select post_id, count(id) as total_likes from likes group by post_id) likes
ON likes.post_id = posts.id
You could also use correlated subqueries. You may want a CASE statement in there to put in a 0 when there are no matched records.
Let's try a correlated subquery:
SELECT posts.id,
(select count(Id) from comments where post_id = posts.id) as total_comments,
(select count(Id) from likes where post_id = posts.id) as total_likes
FROM `posts`
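For the derived-table route, the "put in a 0" step can be done with COALESCE instead of a CASE. A short sketch under the same table layout, run with Python's sqlite3 as a stand-in for MySQL and invented rows; each one-to-many table is aggregated per post_id before it is joined, so the counts cannot multiply each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts    (id INTEGER PRIMARY KEY);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, user_id INTEGER);
    CREATE TABLE likes    (id INTEGER PRIMARY KEY, post_id INTEGER, user_id INTEGER);

    INSERT INTO posts (id) VALUES (1), (2);   -- post 2 has no comments or likes
    INSERT INTO comments (post_id, user_id) VALUES (1, 1), (1, 2);
    INSERT INTO likes (post_id, user_id) VALUES (1, 1), (1, 2), (1, 3), (1, 4);
""")

totals = conn.execute("""
    SELECT posts.id,
           COALESCE(c.total_comments, 0) AS total_comments,  -- NULL means no match
           COALESCE(l.total_likes, 0)    AS total_likes
    FROM posts
    LEFT OUTER JOIN (
        SELECT post_id, COUNT(id) AS total_comments FROM comments GROUP BY post_id
    ) c ON c.post_id = posts.id
    LEFT OUTER JOIN (
        SELECT post_id, COUNT(id) AS total_likes FROM likes GROUP BY post_id
    ) l ON l.post_id = posts.id
    ORDER BY posts.id
""").fetchall()

print(totals)
```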