Please help me optimize a query that fetches movie recommendations (rec). I have many records and the query runs quite slowly. The following query runs for 2 minutes:
SELECT rec.toMovieID, sum(rec.score)
FROM rec
WHERE movieID in
(SELECT movieid as movieID FROM userFavorites as ufv WHERE ufv.userid = 29)
GROUP BY rec.toMovieID
ORDER BY rec.score DESC
LIMIT 10
Do you think I can optimize it more?
You can use an inner join instead of a subselect:
SELECT
rec.toMovieID,
sum(rec.score)
FROM rec INNER JOIN userFavorites ON rec.movieID = userFavorites.movieid
WHERE
userFavorites.userid = 29
GROUP BY rec.toMovieID
ORDER BY rec.score DESC
LIMIT 10
You should index the columns used in the WHERE clause and the join, at least movieid and userid (if you have not already done so).
You can use EXISTS:
SELECT r.toMovieID, sum(r.score)
FROM rec r
WHERE EXISTS (SELECT 1 FROM userFavorites as ufv WHERE ufv.userid = 29 and ufv.MovieId = r.MovieId)
GROUP BY r.toMovieID
ORDER BY r.score DESC
LIMIT 10;
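You have to be careful using a join because of duplicate records: if userFavorites holds the same movie more than once for a user, the join repeats the matching rec rows and inflates the sums.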
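The duplicate-record caveat can be checked on a toy data set. The following is only a sketch: it uses Python's built-in sqlite3 module rather than MySQL, the sample rows are invented, and it orders by an aliased sum (total) so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rec (movieID INTEGER, toMovieID INTEGER, score REAL);
CREATE TABLE userFavorites (userid INTEGER, movieid INTEGER);
INSERT INTO rec VALUES (1, 100, 2.0), (1, 101, 1.0), (2, 100, 3.0);
-- invented data: movie 1 is stored twice for user 29
INSERT INTO userFavorites VALUES (29, 1), (29, 1), (29, 2);
""")

# JOIN version: every duplicate favorite repeats the matching rec rows.
join_sql = """
SELECT rec.toMovieID, SUM(rec.score) AS total
FROM rec INNER JOIN userFavorites uf ON rec.movieID = uf.movieid
WHERE uf.userid = 29
GROUP BY rec.toMovieID
ORDER BY total DESC
"""

# EXISTS version: each rec row is counted at most once.
exists_sql = """
SELECT r.toMovieID, SUM(r.score) AS total
FROM rec r
WHERE EXISTS (SELECT 1 FROM userFavorites uf
              WHERE uf.userid = 29 AND uf.movieid = r.movieID)
GROUP BY r.toMovieID
ORDER BY total DESC
"""

join_rows = conn.execute(join_sql).fetchall()
exists_rows = conn.execute(exists_sql).fetchall()
print(join_rows)    # sums inflated by the duplicate favorite
print(exists_rows)  # sums computed once per rec row
```

If userFavorites has a unique key on (userid, movieid), the two forms return the same totals and the join is safe.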
I have a performance issue with the query below on MySQL. The query involves 5 tables. When I apply the ORDER BY and LIMIT, the results are retrieved in 0.3 secs. Without the ORDER BY and LIMIT, I get the results in 0.01 secs. I have tried changing the query, but that did not work. Could someone please help me with this query so I can get the results in the desired time (<0.3 secs)?
Below are the details.
m_todos = 286579 (records)
m_pat = 214858 (records)
users = 119 (records)
m_programs = 26 (records)
role = 4 (records)
SELECT *
FROM (
SELECT t.*,
mp.name as A_name,
u.first_name, u.last_name,
p.first, p.last, p.zone, p.language,p.handling,
r.name,
u2.first_name AS created_first_name,
u2.last_name AS created_last_name
FROM m_todos t
INNER JOIN role r ON t.role_id=r.id
INNER JOIN m_pat p ON t.patient_id = p.id
LEFT JOIN users u2 ON t.created_id=u2.id
LEFT JOIN m_programs mp ON t.prog_id=mp.id
LEFT JOIN users u ON t.user_id=u.id
WHERE t.role_id !='9'
AND t.completed = '0000-00-00 00:00:00'
) C
ORDER BY priority DESC, due ASC
LIMIT 0,10
Get rid of the outer SELECT; move the ORDER BY and LIMIT in.
Indexes:
t: (completed)
t: (priority, due)
I assume priority and due are in t? Please qualify every column with its table alias in the query. It could make a huge difference.
If the following works, it should speed things up a lot: Start by finding the t.id without all the JOINs:
SELECT id
FROM m_todos
WHERE role_id !='9'
AND completed = '0000-00-00 00:00:00'
ORDER BY priority DESC, due ASC
LIMIT 10
That will benefit from this covering composite index:
INDEX(completed, role_id, priority, due, id)
Debug that. Then use it in the rest:
SELECT t.*, the-other-stuff
FROM ( that-query ) AS t1
JOIN m_todos AS t USING(id)
then-the-rest-of-the-JOINs
ORDER BY priority DESC, due ASC -- yes, again
If you don't need all of t.*, it may be beneficial to spell out the actual columns needed.
The reason for this to run much faster is that the 10 rows are found efficiently by looking only at the one table. The original code was shoveling around a lot more rows than 10 and they included all the columns of t, plus columns from the other tables.
My version does only 10 lookups for all the extra stuff.
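To see that the "pick the 10 ids first, then join" shape returns the same rows as the straightforward join, here is a small sketch. It runs on Python's built-in sqlite3 instead of MySQL, uses invented data and a reduced column set, and adds id as a tie-breaker in the ORDER BY (an assumption, not part of the original query) so both results are deterministic. It demonstrates equivalence only, not the speed-up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE m_pat (id INTEGER PRIMARY KEY, first TEXT);
CREATE TABLE m_todos (
    id INTEGER PRIMARY KEY, patient_id INTEGER, role_id INTEGER,
    completed TEXT, priority INTEGER, due TEXT);
-- the covering composite index suggested above
CREATE INDEX todo_pick ON m_todos (completed, role_id, priority, due, id);
""")
conn.executemany("INSERT INTO m_pat VALUES (?, ?)",
                 [(i, f"pat{i}") for i in range(1, 6)])
# invented rows: half are open, some belong to role 9
conn.executemany(
    "INSERT INTO m_todos VALUES (?, ?, ?, ?, ?, ?)",
    [(i, (i % 5) + 1, 9 if i % 7 == 0 else 1,
      "0000-00-00 00:00:00" if i % 2 else "2020-01-01 00:00:00",
      i % 3, f"2024-01-{(i % 28) + 1:02d}")
     for i in range(1, 101)])

# Step 1 (inner query) touches only m_todos; step 2 joins just 10 rows.
deferred_sql = """
SELECT t.id, t.priority, t.due, p.first
FROM (SELECT id FROM m_todos
      WHERE role_id != 9 AND completed = '0000-00-00 00:00:00'
      ORDER BY priority DESC, due ASC, id ASC
      LIMIT 10) AS t1
JOIN m_todos AS t ON t.id = t1.id
JOIN m_pat AS p ON p.id = t.patient_id
ORDER BY t.priority DESC, t.due ASC, t.id ASC
"""

# Reference: join everything, then sort and limit.
direct_sql = """
SELECT t.id, t.priority, t.due, p.first
FROM m_todos t
JOIN m_pat p ON p.id = t.patient_id
WHERE t.role_id != 9 AND t.completed = '0000-00-00 00:00:00'
ORDER BY t.priority DESC, t.due ASC, t.id ASC
LIMIT 10
"""

deferred_rows = conn.execute(deferred_sql).fetchall()
direct_rows = conn.execute(direct_sql).fetchall()
print(deferred_rows == direct_rows)
```

The outer ORDER BY is repeated because the join is not guaranteed to preserve the order of the inner query.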
I have a relatively basic query that fetches the most recent messages per conversation:
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
LEFT JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
The message table contains more than 911000 records, the conversation table contains around 680000. The execution time for this query varies between 4 and 10 seconds, depending on the load on the server, which is far too long.
Below is a screenshot of the EXPLAIN result:
The cause is apparently the MAX and/or the GROUP BY, because the following similar query only takes 10ms:
SELECT COUNT(*)
FROM `message`
LEFT JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE (`message`.`status`=0)
AND (`message`.`user_id` <> 1)
AND ((`conversation`.`sender_user_id` = 1 OR `conversation`.`receiver_user_id` = 1))
The corresponding EXPLAIN result:
I have tried adding different indices to both tables without any improvement, for example: conv_msg_idx(add_time, conversation_id) on message which seems to be used according to the first EXPLAIN result, however the query still takes around 10 seconds to execute.
Any help improving the indices or query to get the execution time down would be greatly appreciated.
EDIT:
I have changed the query to use an INNER JOIN:
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
INNER JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
But the execution time is still ~ 6 seconds.
You should create a multiple-column index on the columns that appear in your WHERE clause and in your SELECT list (except conversation_id).
conversation_id should be indexed in both tables.
Try to avoid OR in the WHERE clause, because it can keep the optimizer from using an index efficiently and slow the query down. Use UNION or another method instead:
SELECT message.conversation_id, MAX(message.add_time) AS max_add_time
FROM message
INNER JOIN conversation ON message.conversation_id = conversation.id
WHERE (conversation.sender_user_id = 1 AND conversation.status != -1)
GROUP BY conversation_id
UNION
SELECT message.conversation_id, MAX(message.add_time) AS max_add_time
FROM message
INNER JOIN conversation ON message.conversation_id = conversation.id
WHERE (conversation.receiver_user_id = 1 AND conversation.status != -2)
GROUP BY conversation_id
ORDER BY max_add_time DESC
LIMIT 12
Instead of depending on a single table message, have two tables: one for message, as you have, plus another, thread, that keeps the status of each thread of messages.
Yes, that requires a little more work when adding a new message -- update a column or two in thread.
But it eliminates the GROUP BY and MAX that are causing grief in this query.
While doing this split, see if some other columns would be better off in the new table.
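A rough sketch of that split, using Python's built-in sqlite3 for illustration. The thread table layout, the last_add_time column and the add_message helper are all hypothetical names, and add_time is stored as a plain integer to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE thread (
    conversation_id INTEGER PRIMARY KEY,
    sender_user_id INTEGER,
    receiver_user_id INTEGER,
    status INTEGER,
    last_add_time INTEGER          -- hypothetical summary column
);
CREATE TABLE message (
    id INTEGER PRIMARY KEY,
    conversation_id INTEGER,
    add_time INTEGER
);
INSERT INTO thread VALUES
    (1, 1, 2, 0, NULL), (2, 3, 1, 0, NULL),
    (3, 4, 5, 0, NULL), (4, 1, 6, -1, NULL);
""")

def add_message(conn, conversation_id, add_time):
    """Insert a message and refresh the thread summary in one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO message (conversation_id, add_time) VALUES (?, ?)",
            (conversation_id, add_time))
        conn.execute(
            "UPDATE thread SET last_add_time = MAX(COALESCE(last_add_time, 0), ?) "
            "WHERE conversation_id = ?",
            (add_time, conversation_id))

for conv_id, ts in [(1, 10), (2, 30), (1, 20), (3, 40), (4, 50)]:
    add_message(conn, conv_id, ts)

# No GROUP BY and no MAX: the summary column already holds the answer.
summary_rows = conn.execute("""
SELECT conversation_id, last_add_time AS max_add_time
FROM thread
WHERE (receiver_user_id = 1 AND status != -2)
   OR (sender_user_id = 1 AND status != -1)
ORDER BY last_add_time DESC
LIMIT 12
""").fetchall()

# The original GROUP BY / MAX form, for comparison.
grouped_rows = conn.execute("""
SELECT m.conversation_id, MAX(m.add_time) AS max_add_time
FROM message m
INNER JOIN thread c ON m.conversation_id = c.conversation_id
WHERE (c.receiver_user_id = 1 AND c.status != -2)
   OR (c.sender_user_id = 1 AND c.status != -1)
GROUP BY m.conversation_id
ORDER BY max_add_time DESC
LIMIT 12
""").fetchall()
print(summary_rows, grouped_rows)
```

The trade-off is that every code path that inserts a message has to go through something like add_message (or a trigger), otherwise the summary column drifts out of date.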
SELECT `message`.`conversation_id`, MAX(`message`.`add_time`) AS `max_add_time`
FROM `message`
INNER JOIN `conversation` ON `message`.`conversation_id` = `conversation`.`id`
WHERE ((`conversation`.`receiver_user_id` = 1 AND `conversation`.`status` != -2)
OR (`conversation`.`sender_user_id` = 1 AND `conversation`.`status` != -1))
GROUP BY `conversation_id`
ORDER BY `max_add_time` DESC
LIMIT 12
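You can try INNER JOIN, as long as it does not change the logic of your query.
You can also rewrite the query to avoid MAX() by using a window function (this requires MySQL 8.0 or later):
select t1.conversation_id, t1.add_time from(
select m.conversation_id, m.add_time,
row_number() over(partition by m.conversation_id order by m.add_time desc) p1
from message m
)t1 where t1.p1=1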
I am running a query on a forum database. I am using this query to get the thread name, poster, date, etc.
(Left only thread_subject for now)
SELECT `thread_subject` FROM `fusion_posts` JOIN `fusion_threads`
ON fusion_posts.thread_id=fusion_threads.thread_id JOIN `fusion_users` ON
fusion_posts.post_author=fusion_users.user_id
GROUP BY fusion_posts.thread_id ORDER BY `post_id` DESC LIMIT 16
Basically, I also need to add something like the count below to the existing select, to count posts of each thread.
SELECT COUNT(*) AS PostCount FROM fusion_posts,fusion_threads WHERE fusion_threads.thread_id = fusion_posts.thread_id group by fusion_threads.thread_id
How could I do that?
Try this:-
SELECT `thread_subject`, COUNT(*) AS PostCount
FROM `fusion_posts` JOIN `fusion_threads`
ON fusion_posts.thread_id=fusion_threads.thread_id JOIN `fusion_users`
ON fusion_posts.post_author=fusion_users.user_id
GROUP BY fusion_posts.thread_id, `thread_subject`
ORDER BY `post_id` DESC
LIMIT 16
I've got the following, slow performing, SQL query:
SELECT *
FROM news_events
WHERE 1 AND (user_id = 2416) OR id IN(SELECT content_id FROM likes WHERE user_id = 2416)
ORDER BY id DESC
LIMIT 0,10
The news_events table has indexes on user_id. And the likes table has an index on user_id.
To try to improve performance I have re-written the query using an INNER JOIN the following way:
SELECT a.*
FROM news_events a
INNER JOIN likes b ON (a.id = b.content_id)
WHERE (a.user_id = 2416) OR (b.user_id = 2416)
ORDER BY a.id DESC
LIMIT 0,10
But performance doesn't improve either. I've run explain on this last query and this is the result:
I appreciate any pointer on what I could do to improve the performance of this query.
SELECT *
FROM
(
SELECT a.*
FROM news_events a
WHERE a.user_id = 2416
UNION
SELECT ne.*
FROM news_events ne
INNER JOIN likes l
ON ne.id=l.content_id
WHERE l.user_id = 2416
) AS combined
ORDER BY id DESC
LIMIT 0,10
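As a quick sanity check that the UNION form returns the same rows as the OR / IN form, here is a small sketch on Python's built-in sqlite3 (not MySQL). The title column, the index names and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE news_events (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
CREATE TABLE likes (user_id INTEGER, content_id INTEGER);
CREATE INDEX ne_user ON news_events (user_id);
CREATE INDEX likes_user ON likes (user_id, content_id);
INSERT INTO news_events VALUES
    (1, 2416, 'own post'), (2, 7, 'liked post'),
    (3, 7, 'unrelated'), (4, 2416, 'own and liked');
INSERT INTO likes VALUES (2416, 2), (2416, 4), (9, 3);
""")

# Original shape: one query with OR across two different access paths.
or_sql = """
SELECT * FROM news_events
WHERE user_id = 2416
   OR id IN (SELECT content_id FROM likes WHERE user_id = 2416)
ORDER BY id DESC
LIMIT 10
"""

# Rewritten shape: each branch can use its own index; UNION removes
# the row that qualifies in both branches (id 4).
union_sql = """
SELECT * FROM (
    SELECT a.* FROM news_events a WHERE a.user_id = 2416
    UNION
    SELECT ne.* FROM news_events ne
    INNER JOIN likes l ON ne.id = l.content_id
    WHERE l.user_id = 2416
) AS combined
ORDER BY id DESC
LIMIT 10
"""

or_rows = conn.execute(or_sql).fetchall()
union_rows = conn.execute(union_sql).fetchall()
print(or_rows == union_rows)
```

UNION (rather than UNION ALL) matters here, since an event the user both wrote and liked would otherwise be returned twice.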
Try this query -
SELECT * FROM news_events ne
LEFT JOIN (SELECT content_id FROM likes WHERE user_id = 2416) l
ON ne.user_id = 2416 OR ne.id = l.content_id
ORDER BY
ne.id DESC
LIMIT
0, 10
These columns should be indexed: news_events.user_id, news_events.id, likes.user_id, likes.content_id.
Your query is good enough as it is, and the queries posted in the other answers are also fine. But if you have a large data set and have not rebuilt the indexes for a long time, you need to rebuild the indexes on both tables.
It is standard practice for a DB admin to rebuild all indexes periodically, as well as to recompile all the objects and packages in the database.
I hope it will help :)
Keep querying!
I have the following 2 queries.
Query 1 :
select distinct(thread_id) from records where client_name='MyClient'
Query 2 :
select max(thread_no) from records
where thread_id='loop_result_from_above_query' AND action='Reviewed'
Is it possible to combine them into a single query ?
The second query is run on every result of the first query.
Thank you.
See attached image of a small snippet of mysql records.
I need a single mysql query to output only records which have action="MyAction" as the latest records for a given set of thread_ids. In the sample data set : record with Sr: 7201
I hope this helps in helping me :)
SELECT client_name, thread_id, MAX(thread_no) max_thread
FROM records
WHERE action='Reviewed' AND client_name='MyClient'
GROUP BY client_name, thread_id
UPDATE 1
SELECT a.*
FROM records a
INNER JOIN
(
SELECT thread_id, max(sr) max_sr
FROM records
GROUP BY thread_id
) b ON a.thread_id = b.thread_id AND
a.sr = b.max_sr
WHERE a.action = 'MyAction'
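The point of filtering on action only after the latest row per thread has been found can be illustrated with a few invented rows. This is a sketch on Python's built-in sqlite3, assuming sr increases with every new record, as the sample data in the question suggests.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (
    sr INTEGER PRIMARY KEY, thread_id TEXT, thread_no INTEGER,
    action TEXT, client_name TEXT);
-- invented rows: thread B was last touched by a different action
INSERT INTO records VALUES
    (1, 'A', 1, 'Created',  'MyClient'),
    (2, 'A', 2, 'MyAction', 'MyClient'),
    (3, 'B', 1, 'MyAction', 'MyClient'),
    (4, 'B', 2, 'Reviewed', 'MyClient'),
    (5, 'C', 1, 'MyAction', 'MyClient');
""")

# Latest row per thread first, action filter afterwards.
latest_rows = conn.execute("""
SELECT a.sr, a.thread_id
FROM records a
INNER JOIN (SELECT thread_id, MAX(sr) AS max_sr
            FROM records
            GROUP BY thread_id) b
        ON a.thread_id = b.thread_id AND a.sr = b.max_sr
WHERE a.action = 'MyAction'
ORDER BY a.sr
""").fetchall()

# For contrast: filtering inside the subquery answers a different question
# ("latest MyAction row per thread") and wrongly keeps thread B.
filtered_first_rows = conn.execute("""
SELECT a.sr, a.thread_id
FROM records a
INNER JOIN (SELECT thread_id, MAX(sr) AS max_sr
            FROM records
            WHERE action = 'MyAction'
            GROUP BY thread_id) b
        ON a.thread_id = b.thread_id AND a.sr = b.max_sr
ORDER BY a.sr
""").fetchall()
print(latest_rows, filtered_first_rows)
```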
You can use a self join, but it is not advisable because it will hurt query performance. Please check the query below for reference:
SELECT r1.thread_id, MAX(r2.thread_no) FROM records r1 LEFT JOIN records r2 ON r2.thread_id=r1.thread_id WHERE r1.client_name='MyClient' AND r2.action='Reviewed' GROUP BY r1.thread_id
SELECT a.maxthreadid,
b.maxthreadno
FROM (SELECT DISTINCT( thread_id ) AS MaxThreadId
FROM records
WHERE client_name = 'MyClient') a
CROSS JOIN (SELECT Max(thread_no) AS MaxThreadNo
FROM records
WHERE thread_id = 'loop_result_from_above_query'
AND action = 'Reviewed') b
Try this.
SELECT *
FROM (SELECT Row_number()
OVER (
partition BY thread_id
ORDER BY thread_no) no,
Max(thread_no)
OVER(
partition BY thread_id ) Maxthread_no,
thread_id,
action,
client_name
FROM records
Where client_name = 'MyClient') AS T1
WHERE no = 1
AND action = 'Reviewed'