MySQL - Group by Parent/Child but order by time

I have a 'comment' system that allows for parent comments and also replies.
The following query groups the comments as follows
| Parent Comment Newest
-- Reply
-- Reply
| Parent Comment Oldest
SELECT *
FROM comments c
WHERE c.thred = 50
GROUP BY c.id
ORDER BY
    IF(parent_id IS NULL, c.id, parent_id) DESC,
    parent_id IS NOT NULL,
    c.id ASC
However I want to modify this query to also take into consideration TIME.
I would like the PARENT comments to be sorted by 'last replied time', so that the comments with the most recent activity are at the top.
I can modify the code to add a 'last_reply' column on the parent post if that makes the query easier. However, I am at a loss as to how to factor 'time' into the query above.
An example of a system that does this would be Yammer. Posts are sorted by Newest posted but also old posts are pushed back to the top if there have been recent responses/replies.
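One possible way to get this ordering without a 'last_reply' column is sketched below. It assumes a timestamp column named created_at on comments, which is a hypothetical name not shown in the question. The derived table computes the latest activity per parent (the parent itself or any of its replies), and the outer query sorts whole groups by that value.

```sql
-- Sketch only: created_at is an assumed column name.
SELECT c.*
FROM comments c
JOIN (
    -- one row per parent: the newest timestamp among the parent and its replies
    SELECT COALESCE(parent_id, id) AS root_id,
           MAX(created_at)         AS last_activity
    FROM comments
    WHERE thred = 50
    GROUP BY COALESCE(parent_id, id)
) t ON t.root_id = COALESCE(c.parent_id, c.id)
WHERE c.thred = 50
ORDER BY
    t.last_activity DESC,     -- most recently active conversation first
    t.root_id DESC,           -- tie-breaker that keeps each group contiguous
    c.parent_id IS NOT NULL,  -- the parent row (0) before its replies (1)
    c.id ASC;                 -- replies in insertion order
```

If a 'last_reply' column is maintained on the parent row instead, the derived table could be replaced by a self-join to the parent, at the cost of an extra write on every reply.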

Related

Database schema design for posts, comments and replies

In my previous project I had posts and comments as two tables:
post
id
text
timestamp
userid
comment
id
message
timestamp
userid
postid
Now I've got to design replies to comments. Replies are just one level deep: users can only reply to comments, not to replies. My first idea was to use the same comment table for both comments and replies, with one new column:
comment
id
message
timestamp
userid
postid
parentcommentid
Replies have parentcommentid set to the id of the parent comment they belong to. Parent comments leave it NULL.
Retrieving comments for a given post is simple:
but this time I need another query to find out the comment replies. This has to be done for each comment:
This doesn't seem to be a good solution, is there a way to have a single query which returns the complete list of comments/replies in the correct order? (dictated by the timestamp and the nesting)
You may use join and achieve result in single query like I provided below:
SELECT *, cc.message as replied_message
FROM `post`
JOIN comment as c
ON c.postid = post.id
JOIN comment as cc
ON cc.id = c.parentcommentid
ORDER BY c.timestamp DESC, cc.timestamp DESC;
Please note that this works correctly only if each comment has a single reply; multiple replies to one comment are not supported by this query.
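For the one-level case described in the question, a single query that keeps every comment and lists each comment's replies directly after it might look like the sketch below. It uses the column names from the question's comment table; the postid value of 1 is just a placeholder.

```sql
-- Sketch: one level of nesting, parents ordered by their own timestamp.
SELECT c.*
FROM comment AS c
LEFT JOIN comment AS p
       ON p.id = c.parentcommentid        -- p is NULL for top-level comments
WHERE c.postid = 1
ORDER BY
    COALESCE(p.timestamp, c.timestamp),   -- position of the whole group
    COALESCE(p.id, c.id),                 -- tie-breaker when timestamps are equal
    c.parentcommentid IS NOT NULL,        -- the parent (0) before its replies (1)
    c.timestamp;                          -- replies oldest first
```

The LEFT JOIN matters here: with an inner join, comments that have no parent would be dropped from the result.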
I know this reply is years too late, but hopefully it will help others facing this problem now.
I came up with a single table and a single query that returns all the results in the correct order. I use it on my own site; the setup is slightly different, but the logic could be adapted to fit this question.
table name : comments
id / varchar(32)
userid / int(10)
comment / text
ordering / int(10)
ordering_secondary / int(10)
source / tinyint(4)
state / tinyint(4)
created / timestamp
edited / timestamp
There is a primary key on (id, ordering, ordering_secondary, source), so no duplicate of these four columns combined will insert.
When inserting a comment, first look up the highest "ordering" value for that id and give the new comment that value plus 1.
SELECT ordering FROM comments WHERE id="page id" AND source=0 ORDER BY ordering DESC LIMIT 1
"source" column will be "0" for parent, "1" for a reply for example. So just insert the comment with the ordering value incremented by 1 for each comment on a specific id.
When inserting a reply comment, use the same "ordering" value as the parent but increment the "ordering_secondary" column.
SELECT ordering_secondary FROM comments WHERE id="page id" AND source=1 AND ordering="parent comment ordering" ORDER BY ordering_secondary DESC LIMIT 1
So the data would look like :
In the table there are two parent comments and two replies to the second parent comment. No replies to first parent comment.
This approach is obviously slightly more overhead on inserts as you have to look up the ordering value of the last comment on an "id" but querying the data is simple.
SELECT * FROM comments WHERE id=? ORDER BY ordering DESC, ordering_secondary ASC LIMIT 30
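To make the scheme concrete, here is a minimal sketch with sample rows matching the description above (two parent comments, two replies to the second one). The page id 'page-1', the user ids, and the comment texts are made up for illustration, and the state/created/edited columns are left out for brevity.

```sql
CREATE TABLE comments (
    id                 VARCHAR(32) NOT NULL,        -- page the comment belongs to
    userid             INT NOT NULL,
    comment            TEXT,
    ordering           INT NOT NULL,                -- one value per parent comment
    ordering_secondary INT NOT NULL DEFAULT 0,      -- 0 for the parent, 1..n for replies
    source             TINYINT NOT NULL,            -- 0 = parent, 1 = reply
    PRIMARY KEY (id, ordering, ordering_secondary, source)
);

INSERT INTO comments (id, userid, comment, ordering, ordering_secondary, source) VALUES
    ('page-1', 1, 'first parent',  1, 0, 0),
    ('page-1', 2, 'second parent', 2, 0, 0),
    ('page-1', 3, 'reply A',       2, 1, 1),
    ('page-1', 4, 'reply B',       2, 2, 1);

SELECT comment, ordering, ordering_secondary
FROM comments
WHERE id = 'page-1'
ORDER BY ordering DESC, ordering_secondary ASC
LIMIT 30;
-- rows come back as: second parent, reply A, reply B, first parent
```

Because the next ordering value is read before the insert, two concurrent inserts can pick the same value; the primary key then rejects one of them, so the application would need to retry.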
If you're using a database that supports JSON or object aggregation, you can get a nicer result from the query where each top-level comment is a row (and is not duplicated), and the replies are nested in an array/JSON within each row.
This gives you flexibility with what you do with it and also makes it easier to ensure the ordering and nesting is correct.
An example using Postgres:
SELECT
    p.id AS post_id,
    c.id AS comment_id,
    c.message,
    JSON_AGG(
        JSON_BUILD_OBJECT('comment', r.message, 'timestamp', r.timestamp)
        ORDER BY r.timestamp
    ) AS child_comments
FROM
    post AS p
    INNER JOIN comment AS c
        ON c.post_id = p.id
    LEFT JOIN comment AS r
        ON r.parent_id = c.id
WHERE
    p.id = <some id>
    AND c.parent_id IS NULL
GROUP BY
    p.id,
    c.id,
    c.message,
    c.timestamp
ORDER BY
    c.timestamp DESC
;
Note that, as above, this example only retrieves the top-level comments and their first-level replies. It won't get replies to replies. You can use recursive queries or additional subqueries to do that.
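As a rough illustration of the recursive approach, the sketch below walks the reply tree with a recursive common table expression, which PostgreSQL and MySQL 8.0+ both support. It reuses the post_id and parent_id column names from the example above; the post id of 1 is a placeholder and the depth column is added only for illustration.

```sql
WITH RECURSIVE thread AS (
    -- anchor: top-level comments of the post
    SELECT c.id, c.parent_id, c.message, c.timestamp, 0 AS depth
    FROM comment AS c
    WHERE c.post_id = 1
      AND c.parent_id IS NULL
    UNION ALL
    -- recursive step: replies to any comment already in the result
    SELECT r.id, r.parent_id, r.message, r.timestamp, t.depth + 1
    FROM comment AS r
    JOIN thread AS t ON r.parent_id = t.id
)
SELECT th.id, th.parent_id, th.message, th.timestamp, th.depth
FROM thread AS th
ORDER BY th.depth, th.timestamp;
```

This returns a flat list with a depth per row; arranging the rows into a nested display is left to the application.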

MySQL Select Newest Approval Date among article date & comment date

My aim is to write the last modified date (lastmod) of my approved articles for my sitemap.xml file. There are 2 possibilities for this data.
If the article has no approved comment, then lastmod is the approval date of the article.
If the article has at least 1 approved comment, then lastmod is the approval date of the last approved comment on that article.
After asking the question "php unsuccessful while loop within prepared statements" and reading the answers, I decided to:
not use a loop within prepared statements for this case
add a col_article_id column to my comments table, which holds the related article's id in the articles table
try to solve my case with a smarter MySQL query
I have 2 tables:
articles
comments (comments.col_article_id links the comment to the related article)
After I tried the query below,
select tb_articles.col_approvaldate, tb_comments.col_approvaldate
from tb_articles, tb_comments
where tb_articles.col_status ='approved' AND tb_comments.col_status ='approved' AND tb_articles.col_id=tb_comments.col_article_id
my problems are:
1 - I need something like select max(tb_articles.col_approvaldate, tb_comments.col_approvaldate), if that were allowed in MySQL syntax.
2 - If I have n approved articles without any approved comment and m approved articles with approved comment(s), then I should get n+m result rows with 1 column in each row. But currently I get m rows with 2 columns in each row.
So I'm aware that I'm badly on the wrong track.
I also searched for the keywords "mysql newest row per group", but this is as far as I got.
This is my first experience with joins. Can you please correct me? Best regards
Try this:
select
    tb_articles.col_id,
    ifnull(
        max(tb_comments.col_approvaldate),
        tb_articles.col_approvaldate
    ) as last_approved
from tb_articles
left join tb_comments
    on tb_articles.col_id = tb_comments.col_article_id
    and tb_comments.col_status = 'approved'
where tb_articles.col_status = 'approved'
group by 1;
Do I understand correctly that tb_comments record for an article is created by default? This should not be the case - if there are no comments, there shouldn't be a record there. If there is, what is the default date you are putting in? NULL? Also tb_comments.col_status seems redundant to me - if you have tb_comments.col_approvaldate then it is approved on that date and you don't really need status at all.
This query is probably what would work for you (the commented-out AND condition shouldn't change things if I understand your table structure properly):
SELECT
a.col_id AS 'article_id',
IFNULL(MAX(c.col_approvaldate), a.col_approvaldate)
FROM
tb_articles a
LEFT JOIN tb_comments c ON a.col_id = c.col_article_id #AND c.col_status ='approved'
WHERE
a.col_status ='approved'
GROUP BY a.col_id
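On point 1 of the question: MAX() is an aggregate over rows, but MySQL's GREATEST() compares several values within one row. GREATEST() returns NULL if any argument is NULL, so the comment date needs a fallback for articles without approved comments. A sketch along those lines, using the question's table and column names (the lastmod alias is just a label):

```sql
SELECT
    a.col_id,
    GREATEST(
        a.col_approvaldate,
        -- fall back to the article date when there is no approved comment
        COALESCE(MAX(c.col_approvaldate), a.col_approvaldate)
    ) AS lastmod
FROM tb_articles AS a
LEFT JOIN tb_comments AS c
       ON c.col_article_id = a.col_id
      AND c.col_status = 'approved'
WHERE a.col_status = 'approved'
GROUP BY a.col_id, a.col_approvaldate;
```

Unlike the IFNULL versions, this returns the later of the two dates, which only differs if a comment could be approved before its article.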

Mysql - Ordering Facebook posts, comments, and replies correctly

Can't seem to find a good answer for this. I currently have two tables, one with Facebook posts, the other with comments. I now need to add replies in addition to this, since FB recently did this.
My current query selects from the posts and joins to the comments. What I'm hoping to do for the replies is to add another entry in the comments but with a parent ID. Whatever query I end up having, I would like the results to look like this:
postID  commentID  parentID
1
2       1
2       2          1
2       3          1
3       4
So post 1 has no comments, post 2 has one comment with two replies to that comment, and post 3 only has one comment. In my comments table, comments 1-4 are all separate entries in the same table. Is there any way to do this with one query, without another join to the comments table?
Edit, current query. This query doesn't take care of replies, it's only for posts and one level of comments.
select facebookFeeds.*,
    facebookComments.userID,
    facebookComments.name,
    facebookComments.message as cMessage,
    facebookComments.createdTime as cCreatedTime
from facebookFeeds
left join facebookComments on facebookFeeds.id = facebookComments.feedID
where facebookComments.accountID = 24
order by createdTime desc
I figured it out, have it working using an if in the order clause. The only disadvantage is that the parent comment ID must be a lower number than its replies. Otherwise, the replies will show before the parent comment. When receiving the data, the replies come after the comment, so it should be fine.
select facebookFeeds.*,
    facebookComments.id as cID,
    parentID,
    facebookComments.userID,
    facebookComments.name,
    facebookComments.message as cMessage,
    facebookComments.createdTime as cCreatedTime
from facebookFeeds
left join facebookComments on facebookFeeds.id = facebookComments.feedID
where facebookComments.accountID = 24
order by facebookFeeds.createdTime desc, if(parentID is null, cID, parentID)
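The dependency on comment IDs being lower than reply IDs can probably be avoided by adding an explicit "parent first" sort key. The sketch below uses the table and column names from the queries above, selects only the three columns from the desired output, and leaves out the accountID filter for brevity.

```sql
SELECT f.id AS postID, c.id AS commentID, c.parentID
FROM facebookFeeds AS f
LEFT JOIN facebookComments AS c
       ON c.feedID = f.id
ORDER BY
    f.createdTime DESC,
    f.id,                        -- keeps each post's rows together on equal timestamps
    COALESCE(c.parentID, c.id),  -- a comment and its replies share one sort key
    c.parentID IS NOT NULL,      -- the comment itself (0) before its replies (1)
    c.createdTime;               -- replies oldest first
```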

Get the first and last posts in a thread

I am trying to code a forum website and I want to display a list of threads. Each thread should be accompanied by info about the first post (the "head" of the thread) as well as the last. My current database structure is the following:
threads table:
id - int, PK, not NULL, auto-increment
name - varchar(255)
posts table:
id - int, PK, not NULL, auto-increment
thread_id - FK for threads
The tables have other fields as well, but they are not relevant for the query. I am interested in querying threads and somehow JOINing with posts so that I obtain both the first and last post for each thread in a single query (with no subqueries). So far I am able to do it using multiple queries, and I have defined the first post as being:
SELECT *
FROM threads t
LEFT JOIN posts p ON t.id = p.thread_id
ORDER BY p.id
LIMIT 0, 1
The last post is pretty much the same except for ORDER BY id DESC. Now, I could select multiple threads with their first or last posts, by doing:
SELECT *
FROM threads t
LEFT JOIN posts p ON t.id = p.thread_id
GROUP BY t.id
ORDER BY p.id
But of course I can't get both at once, since I would need to sort both ASC and DESC at the same time.
What is the solution here? Is it even possible to use a single query? Is there any way I could change the structure of my tables to facilitate this? If this is not doable, then what tips could you give me to improve the query performance in this particular situation?
You could do something with a subquery and joins:
SELECT first.text as first_post_text, last.text as last_post_text
FROM
    (SELECT MAX(id) as max_id, MIN(id) as min_id FROM posts WHERE thread_id = 1234) as sub
    JOIN posts first ON (sub.min_id = first.id)
    JOIN posts last ON (sub.max_id = last.id)
But that doesn't solve your problem of doing it without subqueries.
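The same idea can be extended to list every thread at once by grouping the subquery on thread_id. This is only a sketch and it still relies on a subquery; the table and column names follow the question, and the output aliases are made up.

```sql
SELECT t.id,
       t.name,
       fp.id AS first_post_id,
       lp.id AS last_post_id
FROM threads AS t
LEFT JOIN (
    -- lowest and highest post id per thread
    SELECT thread_id, MIN(id) AS min_id, MAX(id) AS max_id
    FROM posts
    GROUP BY thread_id
) AS sub ON sub.thread_id = t.id
LEFT JOIN posts AS fp ON fp.id = sub.min_id   -- first post
LEFT JOIN posts AS lp ON lp.id = sub.max_id;  -- last post
```

The LEFT JOINs keep threads that have no posts yet; their post columns come back as NULL.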
You could add columns to your threads table so that you keep the id of the first and last post of each thread. The first post would never change, but every time you added a new post you would have to update that record in the threads table, so that would double your writes, and you may need to use a transaction to avoid race conditions.
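A minimal sketch of that bookkeeping in MySQL, assuming two hypothetical columns threads.first_post_id and threads.last_post_id, and a text column on posts as used in the query above:

```sql
START TRANSACTION;

INSERT INTO posts (thread_id, text)
VALUES (1234, 'new reply');

UPDATE threads
SET last_post_id  = LAST_INSERT_ID(),                          -- always the newest post
    first_post_id = COALESCE(first_post_id, LAST_INSERT_ID())  -- set once, on the first post
WHERE id = 1234;

COMMIT;
```

LAST_INSERT_ID() is per connection, so other sessions inserting posts at the same time do not affect the value used here.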
Or you could go so far as to duplicate information about the first and last post in the threads row. Say you needed the user_id of the poster, the timestamp it was posted, and the first 100 characters of the post. You could create 6 new columns in the threads table to contain those pieces of data for the first and last post. It duplicates data, but it means you may be able to display a list of threads without needing to query the posts table at all.

SQL select with inner join, sub select and limit

I've been working with this SQL problem for about 2 days now and suspect I'm very close to resolving the issue but just can't seem to find a solution that completely works.
What I'm attempting to do is a selective join on two tables called application_info and application_status that are used to store information about open access journal article funding requests.
application_info has general information about the applicant and uses an auto indexing field called Application_ID as a key field. application_status is used to track the ongoing information about the status of the application (received, under review, funded, denied, withdrawn, etc.) as well as status of the journal article (submitted, accepted, resubmitted, published or rejected) and contains both an Application_ID field and an auto indexing field called Status_ID along with a status text and status date field.
Because we want to keep a running log of application, article, and funding status changes we don't want to overwrite existing rows in the application_status with updated values, but instead want to only show the most recent status values. Because an application will eventually have more than one status change this creates a need to apply some sort of limit on the inner join of the status data to the application data so that only one row is returned for each application ID.
Here's an example of what I am attempting to do in a query that currently throws an error:
-- simplified example
SELECT
application_info.*,
artstatus.Status_ID AS Article_Status_ID,
artstatus.Application_ID AS Article_Application_ID,
artstatus.Status_State_Date AS Article_Status_State_Date,
artstatus.Status_State_Text AS Article_Status_State_Text
FROM application_info
LEFT JOIN (
SELECT
Status_ID,
Application_ID,
Status_State_Text,
Status_State_Date,
Status_State_InitiatedBy,
Status_State_ChangebBy,
Status_State_Notes
FROM application_status
WHERE Status_State_Text LIKE 'Article Status%'
AND Application_ID = application_info.Application_ID -- how to pass the current application_info.Application_ID from the ON clause to here?
-- and Application_ID = 29 -- this would be an option for specific IDs, but not an option for getting a complete list of application IDs with status
-- GROUP BY Application_ID -- reduces the sub query to 1 row (Yeah!) but returns the first row encountered before the ORDER BY comes into play
ORDER BY Status_ID DESC
-- a GROUP BY after the ORDER BY might resolve the issue if we could do a sort first
LIMIT 1 -- only want to get the first (most recent) row, only works correctly if passing an Application_ID
) AS artstatus
ON application_info.Application_ID = artstatus.Application_ID
-- WHERE application_info.Application_ID = 29 -- need to get all IDs with status values as well as for specific ID requests
;
Eliminating the AND Application_ID = application_info.Application_ID portion of the subquery, along with the LIMIT, causes the select to work, but it returns a row for every status for a given application ID. I've tried using MIN/MAX operators, but have noticed that they return unpredictable rows from the application_status table when they work.
I've also attempted to do sub selects in the ON section of the join, but don't know how to make that work because the end result would always need to return an Application_ID (can both Application_ID and Status_ID be returned and used?).
Any hints on how to get this to work as I'm intending? Can this even be done?
Further edit: working query below. The key was to move the sub query in the join one level deeper and then return just a single status ID.
-- simplified example (now working)
SELECT
application_info.*,
artstatus.Status_ID AS Article_Status_ID,
artstatus.Application_ID AS Article_Application_ID,
artstatus.Status_State_Date AS Article_Status_State_Date,
artstatus.Status_State_Text AS Article_Status_State_Text
FROM application_info
LEFT JOIN (
SELECT
Status_ID,
Application_ID,
Status_State_Text,
Status_State_Date,
Status_State_InitiatedBy,
Status_State_ChangebBy,
Status_State_Notes
FROM application_status AS artstatus_int
WHERE
-- sub query moved one level deeper so current join Application_ID can be passed
-- order by and limit can now be used
Status_ID = (
SELECT status_ID FROM application_status WHERE Application_ID = artstatus_int.Application_ID
AND status_State_Text LIKE 'Article Status%'
ORDER BY Status_ID DESC
LIMIT 1
)
ORDER BY Application_ID, Status_ID DESC
-- no need for GROUP BY or LIMIT here because only one row is returned per Application_ID
) AS artstatus
ON application_info.Application_ID = artstatus.Application_ID
-- WHERE application_info.Application_ID = 29 -- works for specific application ID as well
-- more LEFT JOINS follow
;
You can't have a correlated subquery in the from clause.
Try this idea instead:
select <whatever>
from (select a.*,
             (select max(aps.status_id)
              from application_status aps
              where aps.application_id = a.application_id
             ) as maxstatusid
      from application_info a
     ) a left outer join
     application_status aps
     on aps.status_id = a.maxstatusid
. . .
That is, put the correlated subquery in the select clause to get the most recent status. Then join this in to the status table to get other information. And, finish the query with other details.
You seem pretty adept at your SQL skills, so it doesn't seem necessary to rewrite the whole query for you.
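For newer servers there is another option: MySQL 8.0.14 and later support LATERAL derived tables, which are allowed to reference columns of earlier tables in the FROM clause. On such a version, the original "ORDER BY ... LIMIT 1 per application" idea could be written roughly as follows. This is a sketch using the question's table and column names with shortened aliases.

```sql
SELECT ai.*,
       artstatus.Status_ID         AS Article_Status_ID,
       artstatus.Status_State_Date AS Article_Status_State_Date,
       artstatus.Status_State_Text AS Article_Status_State_Text
FROM application_info AS ai
LEFT JOIN LATERAL (
    -- most recent article status row for the current application
    SELECT s.Status_ID, s.Status_State_Date, s.Status_State_Text
    FROM application_status AS s
    WHERE s.Application_ID = ai.Application_ID
      AND s.Status_State_Text LIKE 'Article Status%'
    ORDER BY s.Status_ID DESC
    LIMIT 1
) AS artstatus ON TRUE;
```

Applications without any article status still appear, with NULLs in the artstatus columns, because of the LEFT JOIN.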