Order by select count(*) and LIMIT is very slow - mysql

I have this query in my program. When I sort by the select count(*) field, the query runs very slowly, and I don't know why.
Ordering by posts_count is much slower than ordering by any other field.
Here's the query:
select `tags`.*, (select count(*) from `posts` inner join `post_tag` on `posts`.`id` = `post_tag`.`post_id` where `tags`.`id` = `post_tag`.`tag_id`) as `posts_count` from `tags` order by `posts_count` asc limit 15 offset 0;
Can someone please help me improve this query? Thank you.
What I expect is for the query to run faster.

SELECT t.*, COUNT(pt.tag_id) AS posts_count /* COUNT(*) would report 1 for a tag with no posts */
FROM tags AS t
LEFT OUTER JOIN post_tag AS pt ON t.id = pt.tag_id
GROUP BY t.id
ORDER BY posts_count ASC LIMIT 15 OFFSET 0;
You should make sure post_tag has an index starting with the tag_id column. You didn't include your table definition in your question, so I must assume the index is there. If the primary key starts with tag_id, that's okay too.
You don't need to join to posts, assuming that every row in post_tag references an existing row in posts. You can get the information you need by joining to post_tag alone.
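One detail is worth checking when counting through a LEFT JOIN: COUNT(*) counts the NULL-extended row that the join produces for a tag with no posts, while COUNT(pt.tag_id) skips it. The sketch below is only an illustration. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample rows and the index name idx_post_tag_tag are assumptions, not part of the original question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tag (post_id INTEGER, tag_id INTEGER);
    -- Hypothetical index name; the point is that tag_id is the leading column.
    CREATE INDEX idx_post_tag_tag ON post_tag (tag_id, post_id);
    INSERT INTO tags VALUES (1, 'sql'), (2, 'mysql'), (3, 'unused');
    INSERT INTO post_tag VALUES (10, 1), (11, 1), (12, 1), (10, 2);
""")

def tag_counts(count_expr):
    """Run the grouped LEFT JOIN query with the given counting expression."""
    sql = f"""
        SELECT t.name, {count_expr} AS posts_count
        FROM tags AS t
        LEFT JOIN post_tag AS pt ON t.id = pt.tag_id
        GROUP BY t.id
        ORDER BY posts_count ASC, t.id ASC
        LIMIT 15 OFFSET 0
    """
    return conn.execute(sql).fetchall()

star = tag_counts("COUNT(*)")         # counts the NULL row for 'unused'
col = tag_counts("COUNT(pt.tag_id)")  # ignores NULLs, so 'unused' gets 0

print(star)
print(col)
```

With this sample data the tag that has no posts is reported with a count of 1 by COUNT(*) and 0 by COUNT(pt.tag_id), which also changes which tags land in the first 15 rows.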

Related

select unique values from column but order based on another

I need a unique list of parent_threads ordered by postID descending. postID is always unique, but the parent_thread field is often the same for multiple posts.
So what I need is a list of threads in the order they were last replied to.
For example, in the image below I need to disregard posts 400 and 399 because they are repeats. I have a query that works using a subquery, but it can sometimes take up to 1 second to run, so I was wondering if there is a more efficient way to do this. I've tried GROUP BY and DISTINCT but keep getting the wrong results.
(image of the table)
Here is the query that i have which produces the results i want, which is often slow.
SELECT `postID`
FROM `posts`
ORDER BY
(
SELECT MAX(`postID`)
FROM `posts` `sub`
WHERE `sub`.`parent_thread` = `posts`.postID
)
DESC
Your subquery is known as a dependent (correlated) subquery. These can make queries very slow because the subquery is re-evaluated for each row of the outer query.
JOIN to your subquery instead. That way it is evaluated just once, and things will speed up. Try this subquery to generate a list of max post IDs, one for each parent thread:
SELECT MAX(postID) maxPostID, parent_thread
FROM posts
GROUP BY parent_thread
Then use it in your main query like this:
SELECT posts.postID
FROM posts
LEFT JOIN (
SELECT MAX(postID) maxPostID, parent_thread
FROM posts
GROUP BY parent_thread
) m ON posts.postID = m.parent_thread
ORDER BY m.maxPostID DESC
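To see that the joined form returns the same ordering as the dependent subquery, here is a small sketch using Python's sqlite3 module in place of MySQL. The sample rows are made up, thread starters are assumed to have a NULL parent_thread, and a secondary sort on postID is added only so that rows without replies come back in a deterministic order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (postID INTEGER PRIMARY KEY, parent_thread INTEGER);
    -- Hypothetical index so MAX(postID) per thread can be read from the index.
    CREATE INDEX idx_posts_parent ON posts (parent_thread, postID);
    INSERT INTO posts VALUES
        (1, NULL), (2, NULL), (3, NULL),          -- thread starters (assumed NULL parent)
        (4, 1), (5, 2), (6, 1), (7, 3), (8, 2);   -- replies
""")

# Original shape: the subquery is evaluated once per outer row.
dependent = """
    SELECT posts.postID
    FROM posts
    ORDER BY (SELECT MAX(sub.postID)
              FROM posts sub
              WHERE sub.parent_thread = posts.postID) DESC,
             posts.postID DESC
"""

# Rewritten shape: aggregate once, then join to the result.
joined = """
    SELECT posts.postID
    FROM posts
    LEFT JOIN (
        SELECT MAX(postID) AS maxPostID, parent_thread
        FROM posts
        GROUP BY parent_thread
    ) m ON posts.postID = m.parent_thread
    ORDER BY m.maxPostID DESC, posts.postID DESC
"""

dependent_ids = [row[0] for row in conn.execute(dependent)]
joined_ids = [row[0] for row in conn.execute(joined)]
print(joined_ids[:3])  # threads ordered by their most recent reply
```

Both queries list the threads by most recent reply first (2, 3, 1 for this data), followed by the posts that have no replies of their own.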

SQL INNER JOIN and AVG() returning wrong data

I am trying to select all rows from a table containing data about videos and then join each video's ratings, as an AVG(), from another table.
There is only 1 row for each video but many ratings per video, so I have to get all the ratings and find the average for each video.
I have this piece of SQL:
SELECT t1.video_id,
t1.video_title,
t1.video_url,
t1.video_views,
AVG(t2.videos_rating_rating) AS rating
FROM videos_approved t1
INNER JOIN videos_rating t2
ON t1.video_id = t2.videos_rating_video_fk
WHERE 1
ORDER BY video_id
DESC LIMIT 12
The SQL returns a result, but it returns only 1 row, with a wrong average value.
Can someone explain why this happens and what I could do instead?
You need to use GROUP BY here. In your current query you are taking an average over the entire table.
SELECT
t1.video_id,
t1.video_title,
t1.video_url,
t1.video_views,
AVG(t2.videos_rating_rating) AS rating
FROM videos_approved t1
INNER JOIN videos_rating t2
ON t1.video_id = t2.videos_rating_video_fk
GROUP BY
t1.video_id
ORDER BY
t1.video_id DESC
LIMIT 12
Note that my answer assumes that video_id is the primary key of the videos_approved table, in which case we may select any column from that table even when grouping by the video_id. If not, then strictly speaking we would have to do another join.
Try replacing INNER JOIN with LEFT JOIN, so that videos without any ratings are still returned (with a NULL rating).
See more details in this answer and on this page (search for AVG + GROUP BY + JOINS)
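The effect of both suggestions can be checked on a few rows. This is a rough sketch using Python's sqlite3 module instead of MySQL, with invented sample data and only two of the video columns; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos_approved (video_id INTEGER PRIMARY KEY, video_title TEXT);
    CREATE TABLE videos_rating (
        videos_rating_video_fk INTEGER,
        videos_rating_rating INTEGER
    );
    INSERT INTO videos_approved VALUES (1, 'a'), (2, 'b'), (3, 'c');
    -- Video 3 has no ratings at all.
    INSERT INTO videos_rating VALUES (1, 4), (1, 2), (2, 5);
""")

def ratings(join_kind, group_by=True):
    """Average rating per video with the given join type."""
    sql = f"""
        SELECT t1.video_id, AVG(t2.videos_rating_rating) AS rating
        FROM videos_approved t1
        {join_kind} JOIN videos_rating t2
            ON t1.video_id = t2.videos_rating_video_fk
        {"GROUP BY t1.video_id" if group_by else ""}
        ORDER BY t1.video_id DESC
        LIMIT 12
    """
    return conn.execute(sql).fetchall()

ungrouped = ratings("INNER", group_by=False)  # one row: average over every rating
inner = ratings("INNER")                      # one row per rated video
left = ratings("LEFT")                        # unrated videos kept, rating is NULL

print(len(ungrouped), inner, left)
```

Without GROUP BY the aggregate collapses everything into a single row, which matches the symptom in the question. The LEFT JOIN version additionally keeps video 3, whose rating comes back as NULL (None in Python).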

Select random record from mysql with multiple filters

I know this has been discussed many times, but my research did not help me with my problem.
I have an InnoDB table with about 3k records. I need to pick 1 random row with some filters, which I do like this:
select id, title, topic_id
from posts
where id not in
(select post_id from records where user_id='$my_id' and checked='1')
and topic_id='$topic_id' and status='1'
order by RAND() limit 1
This gives me the result I want. The problem is that it takes too much time even with 3k records, and it will get slower as the number of records grows.
I have to find a solution for this. Any suggestions?
Update: Both tables have an index on their id column.
Instead of using where id not in, I would use a LEFT JOIN:
SELECT p.id,
p.title,
p.topic_id
FROM posts p
LEFT JOIN records r
ON p.id = r.post_id
AND r.user_id='$my_id'
AND r.checked = '1'
WHERE p.topic_id='$topic_id'
AND p.status='1'
AND r.post_id IS NULL
ORDER BY RAND()
LIMIT 1;
With this, you will want an index on posts.id and a composite index on records (post_id, user_id, checked).
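The anti-join pattern (LEFT JOIN plus an IS NULL test) can be tried out as follows. This is a sketch under several assumptions: it runs on SQLite through Python's sqlite3 module, so RANDOM() replaces MySQL's RAND(), the sample rows and index name are invented, and the values are passed as bound parameters rather than interpolated into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT,
                        topic_id INTEGER, status INTEGER);
    CREATE TABLE records (id INTEGER PRIMARY KEY, post_id INTEGER,
                          user_id INTEGER, checked INTEGER);
    -- Hypothetical name for the composite index on the join columns.
    CREATE INDEX idx_records_lookup ON records (post_id, user_id, checked);
    INSERT INTO posts VALUES
        (1, 'a', 7, 1), (2, 'b', 7, 1), (3, 'c', 7, 0),
        (4, 'd', 8, 1), (5, 'e', 7, 1);
    INSERT INTO records (post_id, user_id, checked) VALUES
        (1, 42, 1),   -- user 42 already checked post 1
        (2, 42, 0),   -- seen but not checked, so still eligible
        (5, 99, 1);   -- checked by a different user
""")

ANTI_JOIN = """
    SELECT p.id, p.title, p.topic_id
    FROM posts p
    LEFT JOIN records r
           ON p.id = r.post_id AND r.user_id = ? AND r.checked = 1
    WHERE p.topic_id = ? AND p.status = 1
      AND r.post_id IS NULL   -- keep only posts with no matching record
"""

def eligible_ids(user_id, topic_id):
    """All post ids the user may still be shown, in id order."""
    rows = conn.execute(ANTI_JOIN + " ORDER BY p.id", (user_id, topic_id))
    return [row[0] for row in rows]

def pick_random_post(user_id, topic_id):
    """One random eligible post, or None if nothing is left."""
    sql = ANTI_JOIN + " ORDER BY RANDOM() LIMIT 1"  # RAND() in MySQL
    return conn.execute(sql, (user_id, topic_id)).fetchone()

candidates = eligible_ids(42, 7)
picked = pick_random_post(42, 7)
print(candidates, picked)
```

Binding the user and topic IDs as parameters also avoids building the query from raw '$my_id' and '$topic_id' strings.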

How to sort groups in MySQL join operator?

In MySQL I have this query:
SELECT * FROM threads t
JOIN (
SELECT c.*
FROM comments c
WHERE c.thread_id = t.id
ORDER BY date_sent
ASC LIMIT 1
) d ON t.id = d.thread_id
ORDER By d.date_sent DESC
Basically I have two tables, threads and comments. Comments have a foreign key to the threads table. I want to get the earliest comment row for each thread row. Threads should have at least 1 comment; if a thread has none, it shouldn't be included.
In my query above, I select from threads and then join it with a derived table. I want to use t.id, where t is the table selected outside the brackets. Inside the brackets I build a new result set containing the comments for the current thread, and I do the sorting and limiting there.
Afterwards I sort again so that the most recent first comment is on top. However, when I run this, it gives the error #1054 - Unknown column 't.id' in 'where clause'.
Does anyone know what's wrong here?
Thanks
The unknown column t.id is due to the fact that the alias t is unknown inside the subquery, but indeed it isn't needed anyway since you join it in the ON clause.
Instead of a LIMIT 1, use a MIN(date_sent) aggregate grouped by thread_id in the subquery. Be careful also using SELECT * in a join query, if columns in both tables have the same names; better to list the columns explicitly.
SELECT
/* List the columns you explicitly need here rather than *
if there is any name overlap (like `id` for example) */
t.*,
c.*
FROM
threads t
/* join threads against the subquery returning only thread_id and earliest date_sent */
INNER JOIN (
SELECT thread_id, MIN(date_sent) AS firstdate
FROM comments
GROUP BY thread_id
) earliest ON t.id = earliest.thread_id
/* then join the subquery back against the full comments table to get the other columns
in that table. The join is done on both thread_id and the date_sent timestamp */
INNER JOIN comments c
ON earliest.thread_id = c.thread_id
AND earliest.firstdate = c.date_sent
ORDER BY c.date_sent DESC
Michael's answer is correct. This is another answer that follows more the form of your query. You can do what you want as a correlated subquery and then join in the additional information:
SELECT *
FROM (SELECT t.*,
(SELECT c.id
FROM comments c
WHERE c.thread_id = t.id
ORDER BY c.date_sent ASC
LIMIT 1
) as firstcommentid
FROM threads t
) t JOIN
comments c
on t.firstcommentid = c.id
ORDER By c.date_sent DESC;
It is possible that this has better performance, because it does not require aggregating all the data. However, for performance, you would want an index on comments(thread_id, date_sent, id).
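Both approaches can be compared on a small data set. The sketch below uses Python's sqlite3 module as a stand-in for MySQL; the sample rows and the index name are assumptions. One difference to keep in mind: if two comments in a thread share the same earliest date_sent, the MIN() join returns both of them, while the LIMIT 1 form returns exactly one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE threads (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, thread_id INTEGER,
                           date_sent TEXT);
    -- Hypothetical index name; column order follows the answer's suggestion.
    CREATE INDEX idx_comments_thread_date ON comments (thread_id, date_sent, id);
    INSERT INTO threads VALUES (1, 't1'), (2, 't2'), (3, 'no comments');
    INSERT INTO comments VALUES
        (1, 1, '2024-01-05'), (2, 1, '2024-01-09'),
        (3, 2, '2024-01-07'), (4, 2, '2024-01-08');
""")

# Approach 1: aggregate the earliest date per thread, then join back.
min_join = """
    SELECT t.id, c.id, c.date_sent
    FROM threads t
    INNER JOIN (
        SELECT thread_id, MIN(date_sent) AS firstdate
        FROM comments
        GROUP BY thread_id
    ) earliest ON t.id = earliest.thread_id
    INNER JOIN comments c
            ON earliest.thread_id = c.thread_id
           AND earliest.firstdate = c.date_sent
    ORDER BY c.date_sent DESC
"""

# Approach 2: correlated scalar subquery picking the first comment id.
correlated = """
    SELECT t.id, c.id, c.date_sent
    FROM (SELECT th.*,
                 (SELECT cm.id FROM comments cm
                  WHERE cm.thread_id = th.id
                  ORDER BY cm.date_sent ASC
                  LIMIT 1) AS firstcommentid
          FROM threads th) t
    JOIN comments c ON t.firstcommentid = c.id
    ORDER BY c.date_sent DESC
"""

via_min = conn.execute(min_join).fetchall()
via_subquery = conn.execute(correlated).fetchall()
print(via_min)
```

The thread without comments drops out of both results because each query ends in an inner join against comments.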

MySQL query performance problem - INNER JOIN, ORDER BY, DESC

I have got this query:
SELECT
t.type_id, t.product_id, u.account_id, t.name, u.username
FROM
types AS t
INNER JOIN
( SELECT user_id, username, account_id
FROM users WHERE account_id=$account_id ) AS u
ON
t.user_id = u.user_id
ORDER BY
t.type_id DESC
1st question:
It takes around 30 seconds to run this at the moment, with only 18k records in the types table.
The only indexes at the moment are the primary keys on the id columns.
Is the long run time caused by the lack of other indexes, or does it have more to do with the structure of this query?
2nd question:
How can I add a LIMIT so I only get the 100 records with the highest type_id?
Without changing the results, I think it could be around 100 times faster if you don't make a sub-select of your users table. It is not needed at all in this case.
You can just add LIMIT 100 to get only the first 100 results (or fewer if there aren't 100).
SELECT SQL_CALC_FOUND_ROWS /* Calculate the total number of rows, without the LIMIT */
t.type_id, t.product_id, u.account_id, t.name, u.username
FROM
types t
INNER JOIN users u ON u.user_id = t.user_id
WHERE
u.account_id = $account_id
ORDER BY
t.type_id DESC
LIMIT 100
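Then, execute a second query to get the total number of rows that was calculated. (Note that SQL_CALC_FOUND_ROWS and FOUND_ROWS() are deprecated as of MySQL 8.0.17; a separate SELECT COUNT(*) query is the suggested replacement.)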
SELECT FOUND_ROWS()
That sub select on MySQL is going to slow down your query. I'm assuming that this
SELECT user_id, username, account_id
FROM users WHERE account_id=$account_id
doesn't return many rows at all. If that's the case then the sub select alone won't explain the delay you're seeing.
Try throwing an index on user_id in your types table. Without it, you're doing a full table scan of 18k records for each record returned by that sub select.
Inner join the users table and add that index and I bet you see a huge increase in speed.
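As a rough illustration of the advice above, the following sketch joins users directly, adds an index on types.user_id, and limits the result to the 100 highest type_id values. It runs on SQLite via Python's sqlite3 module rather than MySQL, the rows are generated test data, and the index name idx_types_user is hypothetical. Since SQLite has no FOUND_ROWS(), the total is fetched with a separate COUNT(*) query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT,
                        account_id INTEGER);
    CREATE TABLE types (type_id INTEGER PRIMARY KEY, product_id INTEGER,
                        name TEXT, user_id INTEGER);
    -- Hypothetical index name; lets the join find a user's types without a full scan.
    CREATE INDEX idx_types_user ON types (user_id);
""")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [(1, "ann", 1), (2, "bob", 1), (3, "cy", 2), (4, "dee", 2), (5, "eli", 2)],
)
# 300 generated rows spread evenly over the five users.
conn.executemany(
    "INSERT INTO types VALUES (?, ?, ?, ?)",
    [(i, i * 10, f"type{i}", (i % 5) + 1) for i in range(1, 301)],
)

account_id = 1  # bound as a parameter instead of being interpolated

rows = conn.execute("""
    SELECT t.type_id, t.product_id, u.account_id, t.name, u.username
    FROM types t
    INNER JOIN users u ON u.user_id = t.user_id
    WHERE u.account_id = ?
    ORDER BY t.type_id DESC
    LIMIT 100
""", (account_id,)).fetchall()

# Total number of matching rows, ignoring the LIMIT.
total = conn.execute("""
    SELECT COUNT(*)
    FROM types t
    INNER JOIN users u ON u.user_id = t.user_id
    WHERE u.account_id = ?
""", (account_id,)).fetchone()[0]

print(len(rows), total, rows[0])
```

On a real MySQL server, running EXPLAIN on the query before and after adding the index is a reasonable way to confirm that the full scan of types has gone away.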