I installed a plug-in and have done some optimisation on the back end (SSD, single-column indexes on the columns used in GROUP BY and WHERE),
but when running this query
SELECT u.user_id, u.profile_page_id, u.server_id AS user_server_id, u.user_name, u.full_name, u.gender, u.user_image, u.is_invisible, u.user_group_id, u.language_id, u.birthday, u.country_iso, m.*
FROM(
(SELECT m.*
FROM phpfox_channel_video AS m
INNER JOIN phpfox_channel_category AS mc
ON(mc.category_id = mc.category_id)
INNER JOIN phpfox_channel_category_data AS mcd
ON(mcd.video_id = m.video_id)
WHERE m.in_process = 0 AND m.view_id = 0 AND m.module_id = 'videochannel' AND m.item_id = 0 AND m.privacy IN(0) AND mcd.category_id = 17
GROUP BY m.video_id
ORDER BY m.time_stamp DESC
)) AS m
JOIN phpfox_user AS u
ON(u.user_id = m.user_id)
ORDER BY m.time_stamp DESC
LIMIT 24;
it takes 20 seconds, while changing it to this instead
SELECT u.user_id, u.profile_page_id, u.server_id AS user_server_id, u.user_name, u.full_name, u.gender, u.user_image, u.is_invisible, u.user_group_id, u.language_id, u.birthday, u.country_iso, m.*
FROM(
(SELECT m.*
FROM phpfox_channel_video AS m
INNER JOIN phpfox_channel_category_data AS mcd
ON(mcd.video_id = m.video_id AND mcd.category_id = 17)
WHERE m.in_process = 0 AND m.view_id = 0 AND m.module_id = 'videochannel' AND m.item_id = 0 AND m.privacy IN(0)
GROUP BY m.video_id
ORDER BY m.time_stamp DESC
)) AS m
JOIN phpfox_user AS u
ON(u.user_id = m.user_id)
ORDER BY m.time_stamp DESC
LIMIT 24;
This runs in about 5-6 seconds.
The phpfox_channel_video table contains 2 million rows (and keeps growing quickly; it's a social media site and users can upload files too), so caching isn't very useful (though it is activated).
Any hints on how to optimise this? I have minimal experience with MariaDB/MySQL, as I'm accustomed to MS SQL for big data and to designing my own schema. Is there a recommended method that doesn't need much alteration to the tables (adding tables is OK)?
Or do I need to restructure the PHP and the tables to get the query below 1 second?
Thank you!
I found these links
http://mysql.rjweb.org/doc.php/memory &
http://mysql.rjweb.org/doc.php/ricksrots#indexing
Are they still relevant ?
The EXPLAIN results are attached.
As for indexes, every column is currently defined as its own index key on all the tables involved in the query above.
Would a printout of my current server configuration be helpful? Thanks!
INNER JOIN phpfox_channel_category AS mc ON(mc.category_id = mc.category_id)
Is almost useless.
You don't use any columns of mc for other purposes.
This JOIN is performed.
This JOIN verifies that there is a corresponding row in mc.
This JOIN will bloat the temp table if there are multiple corresponding rows.
Bloat leads to wasted work in the GROUP BY.
Similarly, your second query does not use mcd.
Please use different aliases for derived tables. It is hard to follow the multiple uses of m.
This is totally useless:
ORDER BY m.time_stamp DESC
MySQL/MariaDB is free to ignore an ORDER BY in a derived table. A table is defined to be an unordered set of rows. Ordering can only be done at the end.
Suggested index
m: INDEX(item_id, module_id, view_id, in_process, -- any order; tested with '='
privacy, -- sometimes has a list?
video_id) -- last
mcd: INDEX(category_id, video_id) -- in either order
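As a rough sketch, the suggested indexes could be created with statements like the following. The index names (idx_video_filter, idx_mcd_cat_video) are made up for illustration, and it is worth checking for existing equivalent indexes before adding new ones.

```sql
-- Hypothetical index names; column lists taken from the suggestion above.
ALTER TABLE phpfox_channel_video
    ADD INDEX idx_video_filter
        (item_id, module_id, view_id, in_process,  -- all tested with '='
         privacy,                                   -- may be an IN(...) list
         video_id);                                 -- last

ALTER TABLE phpfox_channel_category_data
    ADD INDEX idx_mcd_cat_video (category_id, video_id);
```

On a 2-million-row table the first statement may take a while to build, so it is probably best run at a quiet time.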
There is a more logical way to do this, and possibly faster:
INNER JOIN phpfox_channel_category_data AS mcd
ON mcd.video_id = m.video_id
AND mcd.category_id = 17
Remove that, and remove the GROUP BY m.video_id, then add this to the WHERE:
AND EXISTS( SELECT 1 FROM phpfox_channel_category_data AS mcd
WHERE mcd.video_id = m.video_id
AND mcd.category_id = 17 )
(The index mentioned above still applies.)
Note that I have perhaps eliminated two "filesorts": one for the GROUP BY and one for the ORDER BY. Another note: EXPLAIN does not always show how many filesorts there really are. (But EXPLAIN FORMAT=JSON SELECT ... does.)
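Putting the pieces together, the whole query might end up looking something like the sketch below. It goes one step further than the advice above by flattening the derived table into the outer query, and it uses the alias v instead of reusing m. It assumes video_id is unique in phpfox_channel_video, which is what makes dropping the GROUP BY safe.

```sql
-- Sketch only: assumes video_id is the PRIMARY KEY (or UNIQUE) in phpfox_channel_video.
SELECT  u.user_id, u.profile_page_id, u.server_id AS user_server_id,
        u.user_name, u.full_name, u.gender, u.user_image, u.is_invisible,
        u.user_group_id, u.language_id, u.birthday, u.country_iso,
        v.*
FROM phpfox_channel_video AS v
JOIN phpfox_user AS u
    ON u.user_id = v.user_id
WHERE v.in_process = 0
  AND v.view_id = 0
  AND v.module_id = 'videochannel'
  AND v.item_id = 0
  AND v.privacy IN (0)
  AND EXISTS ( SELECT 1
               FROM phpfox_channel_category_data AS mcd
               WHERE mcd.video_id = v.video_id
                 AND mcd.category_id = 17 )
ORDER BY v.time_stamp DESC   -- ordering happens once, at the end
LIMIT 24;
```

Prefixing this with EXPLAIN FORMAT=JSON should show whether any filesort remains.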
I managed to clean up the query. After checking the table, it turns out that
WHERE m.in_process = 0
AND m.view_id = 0
AND m.module_id = 'videochannel'
AND m.item_id = 0
AND m.privacy IN(0)
doesn't need to be run, because every row in the table matches those conditions (for the current case of this website). So I just trimmed those long queries and managed to get below 1 second now.
Related
I have a problem when IN clause contains too many values. Consider this query
EXPLAIN
SELECT DISTINCT t.entry_id , t.sticky , wd.field_id_104 , t.title
FROM exp_channel_titles AS t
LEFT JOIN exp_channels ON t.channel_id = exp_channels.channel_id
LEFT JOIN exp_channel_data AS wd ON t.entry_id = wd.entry_id
LEFT JOIN exp_members AS m ON m.member_id = t.author_id
INNER JOIN exp_category_posts ON t.entry_id = exp_category_posts.entry_id
INNER JOIN exp_categories ON exp_category_posts.cat_id = exp_categories.cat_id
WHERE t.entry_id !=''
AND t.site_id IN ('1')
AND t.entry_date < 1610109517
AND (t.expiration_date = 0 OR t.expiration_date > 1610109517)
AND t.entry_id IN ('0','649','650','651','652','653','654','655')
;
If there are only a few values, the output is the following, which is OK
but with IN ('0','649','650','651','652','653','654','655', thousand values)
the query runs for about 1 minute and the EXPLAIN changes to this
How can I fix that?
UPDATE: range_optimizer_max_mem_size was already set to 0, so that isn't the issue
We have had similar problems at my company when someone runs a query with a very long list of values in an IN (...) predicate.
We found that MySQL enforces a limit on memory available to the range optimizer. If the list of values is too long, it exceeds the memory limit, and the optimizer cannot finish its analysis to see if it should use the index. So it gives up and says, "forget it! it's a table-scan for you."
We fix it by setting the MySQL Server configuration value range_optimizer_max_mem_size=0 which means there is no limit to the memory that the range optimizer can use.
This creates a risk that if someone were to run a query with a million values in the IN (...) list, it could use a lot of memory, maybe enough to kill the MySQL Server. But so far the tradeoff is preferable, to allow the optimizer to choose the index.
See documentation:
https://dev.mysql.com/doc/refman/5.7/en/range-optimization.html
https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_range_optimizer_max_mem_size
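For reference, checking and changing the setting described above could look roughly like this. This is a sketch for MySQL 5.7 or later; trying it at session scope first seems a reasonable way to confirm the effect before changing it server-wide.

```sql
-- Check the current limit in bytes; 0 means "no limit".
SHOW VARIABLES LIKE 'range_optimizer_max_mem_size';

-- Remove the limit for the current session only, to test the effect.
SET SESSION range_optimizer_max_mem_size = 0;

-- Or change it server-wide (needs sufficient privileges; lost on restart
-- unless the value is also put in the server configuration file).
SET GLOBAL range_optimizer_max_mem_size = 0;

-- After running the slow query, a warning about the memory capacity being
-- exceeded suggests that range optimization was skipped.
SHOW WARNINGS;
```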
Re your comment:
Another common reason for the optimizer to choose to do a table-scan is that it calculates that your conditions match a large enough portion of the table that it's more expensive to use the index than to simply run a table-scan and examine every row.
The threshold for this isn't documented, and it depends on the implementation of the cost-based optimizer, so it might change from version to version. But my observation is that usually if your conditions match more than 20% of the table, the optimizer chooses the table-scan.
You could use an index hint to tell the optimizer to treat a table-scan as infinitely expensive, so the index is preferred to a table-scan.
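A minimal sketch of such a hint, applied to a cut-down version of the query in the question. It assumes entry_id is the PRIMARY KEY of exp_channel_titles; if it is not, substitute the name of whichever index covers entry_id (SHOW INDEX FROM exp_channel_titles lists them).

```sql
-- Assumption: entry_id is the PRIMARY KEY of exp_channel_titles.
SELECT t.entry_id, t.sticky, t.title
FROM exp_channel_titles AS t FORCE INDEX (PRIMARY)
WHERE t.site_id IN ('1')
  AND t.entry_date < 1610109517
  AND (t.expiration_date = 0 OR t.expiration_date > 1610109517)
  AND t.entry_id IN ('0','649','650','651','652','653','654','655');
```

FORCE INDEX only helps when the named index can actually be used for the conditions, so it is worth comparing the EXPLAIN output with and without the hint.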
Explode-implode. This is a classic problem of an inefficient way to write a query.
JOIN several tables
Filter
Collapse the results -- usually by GROUP BY or LIMIT, but DISTINCT has the same effect.
So... Turn the query inside out.
Find the ids of the desired rows in t
JOIN that to the rest of the tables.
Presumably the DISTINCT will not be needed at all.
SELECT t2.entry_id, t2.sticky, wd.field_id_104, t2.title
FROM ( SELECT entry_id
FROM exp_channel_titles
WHERE entry_id !=''
AND site_id IN ('1')
AND entry_date < 1610109517
AND (expiration_date = 0 OR expiration_date > 1610109517)
AND entry_id IN ('0','649','650','651','652','653','654','655')
) AS t
JOIN exp_channel_titles AS t2 USING(entry_id)
LEFT JOIN exp_channels ON t2.channel_id = exp_channels.channel_id
LEFT JOIN exp_channel_data AS wd ON t2.entry_id = wd.entry_id
;
Another reformulation
Since there is only one use for wd, this might be better:
SELECT t.entry_id,
t.sticky,
( SELECT wd.field_id_104
FROM exp_channel_data AS wd
WHERE wd.entry_id = t.entry_id -- assumes at most one wd row per entry_id
) AS field_id_104,
t.title
FROM exp_channel_titles AS t
WHERE entry_id !=''
AND site_id IN ('1')
AND entry_date < 1610109517
AND (expiration_date = 0 OR expiration_date > 1610109517)
AND entry_id IN ('0','649','650','651','652','653','654','655')
;
and have a 5-column index starting with site_id, entry_date
Other...
AND (t.expiration_date = 0 OR t.expiration_date > 1610109517)
OR is not sargeable. Can you redesign the table to avoid this OR?
Without the above reformulation, this may help:
INDEX(site_id, entry_date)
Also, get rid of these, since they seem to be totally useless:
LEFT JOIN exp_channels ON t.channel_id = exp_channels.channel_id
LEFT JOIN exp_members AS m ON m.member_id = t.author_id
And these may be useless:
INNER JOIN exp_category_posts ON t.entry_id = exp_category_posts.entry_id
INNER JOIN exp_categories ON exp_category_posts.cat_id = exp_categories.cat_id
MySQL query SLOW don’t know how to optimize
I think I'm fine on hardware: 60GB RAM, 10 cores, SSD.
Hi, I'm having a big issue with this query running slowly on MySQL. The query is below:
# Thread_id: 1165100 Schema: back-Alvo-11-07-19 QC_hit: No
# Query_time: 9.015205 Lock_time: 0.000188 Rows_sent: 1 Rows_examined: 2616880
# Rows_affected: 0
SET timestamp=1568549358;
SELECT count(*) as total_rows FROM(
(SELECT m.*
FROM phpfox_channel_video AS m
INNER JOIN phpfox_channel_category AS mc
ON(mc.category_id = mc.category_id)
INNER JOIN phpfox_channel_category_data AS mcd
ON(mcd.video_id = m.video_id)
WHERE m.in_process = 0 AND m.view_id = 0
AND m.module_id = 'videochannel'
AND m.item_id = 0 AND m.privacy IN(0)
AND mcd.category_id = 17
GROUP BY m.video_id
ORDER BY m.time_stamp DESC
LIMIT 12
)) AS m
JOIN phpfox_user AS u
ON(u.user_id = m.user_id);
This query is running very slowly, as you can see: 9 seconds.
Online help for optimizing queries always talks about adding indexes,
but as you can see in the EXPLAIN statement below, I already have indexes.
Do you have any idea where I should look to improve the speed of this query? I'm not a DB guy and am having a hard time with this. This is a website with 400,000 videos.
Thanks
The explain shows that you are not using an index on table phpfox_channel_video as m, and that it is using a temporary index on table phpfox_channel_category AS mc, which means it is not using an index, but is building an index first, which takes considerable time.
Also, the index for table phpfox_channel_category_data AS mcd could be better.
The indexes you need are:
CREATE INDEX idx_cat_data_video_id ON phpfox_channel_category_data
(category_id, video_id);
CREATE INDEX idx_channel_cat_id ON phpfox_channel_category (category_id);
CREATE INDEX idx_video_mult ON phpfox_channel_video
(in_process, view_id, module_id, item_id, privacy, video_id, time_stamp);
Don't fetch m.* if you are only going to do COUNT(*).
If phpfox_channel_category is a many-to-many mapping table, follow the tips in http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
m needs INDEX(in_process, view_id, module_id, item_id, privacy) in any order.
Avoid the GROUP BY:
INNER JOIN phpfox_channel_category AS mc ON(mc.category_id = mc.category_id)
INNER JOIN phpfox_channel_category_data AS mcd ON(mcd.video_id = m.video_id)
AND mcd.category_id = 17
GROUP BY m.video_id
--> (something like)
AND EXISTS(
SELECT 1
FROM phpfox_channel_category_data AS mcd
JOIN phpfox_channel_category AS mc
ON mc.category_id = mcd.category_id
WHERE mcd.category_id = 17
AND mcd.video_id = m.video_id
)
Let's make sure that we are optimizing the right query. I suggest we check this condition in the ON clause:
mc.category_id = mc.category_id
We know that's going to be TRUE for every row in mc with a non-NULL value of category_id. We could express that condition as:
mc.category_id IS NOT NULL
This means the join is almost a cross join; every row returned from m matched with every row from mc. That is, we could get an equivalent result writing:
FROM phpfox_channel_video m
JOIN phpfox_channel_category mc
ON mc.category_id IS NOT NULL
I suspect that's not actually the result we're after. I think we were meaning to match to m.category_id. But that's just a guess.
If video_id column is PRIMARY KEY or UNIQUE KEY on m, we can avoid the potentially expensive GROUP BY operation by avoiding the joins that create duplicated rows, by using EXISTS with correlated subqueries. If we can avoid generating an intermediate result with duplicate values of video_id, then we can avoid the need to do the GROUP BY.
Also, for the inline view query, rather than return all columns * we can return just the expressions that we need. In the outer query, the only column referenced is user_id.
So we could write something like this:
SELECT COUNT(*) AS total_rows
FROM (
SELECT m.user_id
FROM phpfox_channel_video m
WHERE EXISTS ( SELECT 1
FROM phpfox_channel_category mc
WHERE mc.category_id = m.category_id
-- mc.category_id = mc.category_id -- <original
)
AND EXISTS ( SELECT 1
FROM phpfox_channel_category_data mcd
WHERE mcd.video_id = m.video_id
AND mcd.category_id = 17
)
AND m.in_process = 0
AND m.view_id = 0
AND m.module_id = 'videochannel'
AND m.item_id = 0
AND m.privacy IN (0)
ORDER BY m.time_stamp DESC
LIMIT 12
) d
JOIN phpfox_user u
ON u.user_id = d.user_id
For tuning, the optimal index for m will have leading columns that have equality predicates, followed by the time_stamp column. That way the ORDER BY can be satisfied by returning rows in index order, and we avoid a "Using filesort" operation. It looks like the reason we need the rows ordered is for the LIMIT clause.
... ON phpfox_channel_video (in_process, view_id, item_id, module_id
, time_stamp, video_id, ... )
For the other two tables, we want indexes with leading columns that have equality predicates
... ON phpfox_channel_category_data (video_id, category_id, ...)
... ON phpfox_channel_category ( category_id, ... )
NOTES:
(It's not entirely clear why we need an inline view, and we are delaying the join from the user_id reference. Then again, the point of the entire query isn't really obvious to me; I'm just providing a re-write, given the provided SQL, with the change to the condition category_id.)
The above assumed that category_id column exists in m, and that it's a one-to-many relationship.
But if that's not true... if the mcd table is actually a junction table, resolving a many-to-many relationship between video and category, such that the join condition was meant to be
mcd.category_id = mc.category_id
^
Then we would want to replace the WHERE EXISTS and AND EXISTS in the query above, into a single correlated subquery. Something like this:
SELECT COUNT(*) AS total_rows
FROM (
SELECT m.user_id
FROM phpfox_channel_video m
WHERE EXISTS ( SELECT 1
FROM phpfox_channel_category mc
JOIN phpfox_channel_category_data mcd
ON mcd.category_id = mc.category_id
WHERE mcd.video_id = m.video_id
AND mcd.category_id = 17
)
AND m.in_process = 0
AND m.view_id = 0
AND m.module_id = 'videochannel'
AND m.item_id = 0
AND m.privacy IN (0)
ORDER BY m.time_stamp DESC
LIMIT 12
) d
JOIN phpfox_user u
ON u.user_id = d.user_id
I have a performance issue with the query below on MySQL. The query involves 5 tables. When I apply the ORDER BY and LIMIT, the results are retrieved in 0.3 secs, but without the ORDER BY and LIMIT I get the results in 0.01 secs. I have tried changing the query, but that did not work. Could someone please help me with this query so I can get the results in the desired time (<0.3 secs)?
Below are the details.
m_todos = 286579 (records)
m_pat = 214858 (records)
users = 119 (records)
m_programs = 26 (records)
role = 4 (records)
SELECT *
FROM (
SELECT t.*,
mp.name as A_name,
u.first_name, u.last_name,
p.first, p.last, p.zone, p.language,p.handling,
r.name,
u2.first_name AS created_first_name,
u2.last_name AS created_last_name
FROM m_todos t
INNER JOIN role r ON t.role_id=r.id
INNER JOIN m_pat p ON t.patient_id = p.id
LEFT JOIN users u2 ON t.created_id=u2.id
LEFT JOIN m_programs mp ON t.prog_id=mp.id
LEFT JOIN users u ON t.user_id=u.id
WHERE t.role_id !='9'
AND t.completed = '0000-00-00 00:00:00'
) C
ORDER BY priority DESC, due ASC
LIMIT 0,10
Get rid of the outer SELECT; move the ORDER BY and LIMIT in.
Indexes:
t: (completed)
t: (priority, due)
I assume priority and due are in t?? Please be explicit in the query. It could make a huge difference.
If the following works, it should speed things up a lot: Start by finding the t.id without all the JOINs:
SELECT id
FROM m_todos
WHERE role_id !='9'
AND completed = '0000-00-00 00:00:00'
ORDER BY priority DESC, due ASC
LIMIT 10
That will benefit from this covering composite index:
INDEX(completed, role_id, priority, due, id)
Debug that. Then use it in the rest:
SELECT t.*, the-other-stuff
FROM ( that-query ) AS t1
JOIN m_todos AS t USING(id)
then-the-rest-of-the-JOINs
ORDER BY priority DESC, due ASC -- yes, again
If you don't need all of t.*, it may be beneficial to spell out the actual columns needed.
The reason for this to run much faster is that the 10 rows are found efficiently by looking only at the one table. The original code was shoveling around a lot more rows than 10 and they included all the columns of t, plus columns from the other tables.
My version does only 10 lookups for all the extra stuff.
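Spelled out against the query in the question, the two-step approach could look something like this. It is a sketch that assumes id is the PRIMARY KEY of m_todos and that priority and due are columns of m_todos.

```sql
SELECT  t.*,
        mp.name AS A_name,
        u.first_name, u.last_name,
        p.first, p.last, p.zone, p.language, p.handling,
        r.name,
        u2.first_name AS created_first_name,
        u2.last_name  AS created_last_name
FROM (
        -- Step 1: find the 10 ids using only m_todos.
        SELECT id
        FROM m_todos
        WHERE role_id != '9'
          AND completed = '0000-00-00 00:00:00'
        ORDER BY priority DESC, due ASC
        LIMIT 10
     ) AS t1
-- Step 2: fetch the full rows and the extra columns for just those ids.
JOIN m_todos AS t USING (id)
INNER JOIN role r        ON t.role_id    = r.id
INNER JOIN m_pat p       ON t.patient_id = p.id
LEFT JOIN users u2       ON t.created_id = u2.id
LEFT JOIN m_programs mp  ON t.prog_id    = mp.id
LEFT JOIN users u        ON t.user_id    = u.id
ORDER BY t.priority DESC, t.due ASC;   -- repeated: the joins do not preserve order
```

One difference to be aware of: here the INNER JOINs to role and m_pat run after the LIMIT, so if a todo has no matching role or patient row, fewer than 10 rows come back. If that can happen in your data, those two joins would need to move into the inner query.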
The query below is very slow (it takes around 1 second), but it is only searching approx. 2500 records (plus the inner joined tables).
If I remove the ORDER BY, the query runs in much less time (0.05 seconds or less).
Or if I remove the nested select below "# used to select where no ProfilePhoto specified", it also runs fast, but I need both of these included.
I have indexes (or a primary key) on: tPhoto_PhotoID, PhotoID, p.Enabled, CustomerID, tCustomer_CustomerID, ProfilePhoto (bool), u.UserName, e.PrivateEmail, m.tUser_UserID, Enabled, Active, m.tMemberStatuses_MemberStatusID, e.tCustomerMembership_MembershipID, e.DateCreated
(Do I have too many indexes? My understanding is to add them anywhere I use WHERE or ON.)
The Query :
SELECT e.CustomerID,
e.CustomerName,
e.Location,
SUBSTRING_INDEX(e.CustomerProfile,' ', 25) AS Description,
IFNULL(p.PhotoURL, PhotoTable.PhotoURL) AS PhotoURL
FROM tCustomer e
LEFT JOIN (tCustomerPhoto ep INNER JOIN tPhoto p ON (ep.tPhoto_PhotoID = p.PhotoID AND p.Enabled=1))
ON e.CustomerID = ep.tCustomer_CustomerID AND ep.ProfilePhoto = 1
# used to select where no ProfilePhoto specified
LEFT JOIN ((SELECT pp.PhotoURL, epp.tCustomer_CustomerID
FROM tPhoto pp
LEFT JOIN tCustomerPhoto epp ON epp.tPhoto_PhotoID = pp.PhotoID
GROUP BY epp.tCustomer_CustomerID) AS PhotoTable) ON e.CustomerID = PhotoTable.tCustomer_CustomerID
INNER JOIN tUser u ON u.UserName = e.PrivateEmail
INNER JOIN tmembers m ON m.tUser_UserID = u.UserID
WHERE e.Enabled=1
AND e.Active=1
AND m.tMemberStatuses_MemberStatusID = 2
AND e.tCustomerMembership_MembershipID != 6
ORDER BY e.DateCreated DESC
LIMIT 12
I have similar queries, but they run much faster.
Any opinions would be appreciated:
Until we get more clarity on your question (for example, how the similar queries that run fast differ from this one), try running EXPLAIN {YourSelectQuery} in the MySQL client and use its output to see where performance can be improved.
I'm using a query which generally executes in under a second, but sometimes takes between 10-40 seconds to finish. I'm actually not totally clear on how the subquery works, I just know that it works, in that it gives me 15 rows for each faverprofileid.
I'm logging slow queries and it's telling me 5823244 rows were examined, which is odd because there aren't anywhere close to that many rows in any of the tables involved (the favorites table has the most at 50,000 rows).
Can anyone offer me some pointers? Is it an issue with the subquery and needing to use filesort?
EDIT: Running explain shows that the users table is not using an index (even though id is the primary key). Under extra it says: Using temporary; Using filesort.
SELECT F.id,F.created,U.username,U.fullname,U.id,I.*
FROM favorites AS F
INNER JOIN users AS U ON F.faver_profile_id = U.id
INNER JOIN items AS I ON F.notice_id = I.id
WHERE faver_profile_id IN (360,379,95,315,278,1)
AND F.removed = 0
AND I.removed = 0
AND F.collection_id is null
AND I.nudity = 0
AND (SELECT COUNT(*) FROM favorites WHERE faver_profile_id = F.faver_profile_id
AND created > F.created AND removed = 0 AND collection_id is null) < 15
ORDER BY F.faver_profile_id, F.created DESC;
The number of rows examined is large because many rows have been examined more than once. You are getting this because of a poorly optimized query plan, which results in table scans where index lookups should have been performed. In this case the number of rows examined grows multiplicatively, i.e. it is of an order of magnitude comparable to the product of the total number of rows in more than one table.
Make sure that you have run ANALYZE TABLE on your three tables.
Read on how to avoid table scans, and identify then create any missing indexes
Rerun ANALYZE and re-explain your queries
the number of examined rows must drop dramatically
if not, post the full explain plan
use query hints to force the use of indices (to see the index names for a table, use SHOW INDEX):
SELECT
F.id,F.created,U.username,U.fullname,U.id,I.*
FROM favorites AS F FORCE INDEX (faver_profile_id_key)
INNER JOIN users AS U FORCE INDEX FOR JOIN (PRIMARY) ON F.faver_profile_id = U.id
INNER JOIN items AS I FORCE INDEX FOR JOIN (PRIMARY) ON F.notice_id = I.id
WHERE faver_profile_id IN (360,379,95,315,278,1)
AND F.removed = 0
AND I.removed = 0
AND F.collection_id is null
AND I.nudity = 0
AND (SELECT COUNT(*) FROM favorites FORCE INDEX (faver_profile_id_key) WHERE faver_profile_id = F.faver_profile_id
AND created > F.created AND removed = 0 AND collection_id is null) < 15
ORDER BY F.faver_profile_id, F.created DESC;
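The maintenance steps in the list above map onto statements roughly like these. The EXPLAIN at the end uses a trimmed-down version of the query just to keep the example short; in practice you would explain the full query.

```sql
-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE favorites, users, items;

-- List the indexes on each table; the Key_name column gives the names
-- that can be used in FORCE INDEX hints.
SHOW INDEX FROM favorites;
SHOW INDEX FROM users;
SHOW INDEX FROM items;

-- Re-check the plan afterwards (trimmed-down query for illustration).
EXPLAIN
SELECT F.id, F.created
FROM favorites AS F
WHERE F.faver_profile_id IN (360,379,95,315,278,1)
  AND F.removed = 0
  AND F.collection_id IS NULL;
```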
You may also change your query to use GROUP BY faver_profile_id/HAVING count > 15 instead of the nested SELECT COUNT(*) subquery, as suggested by vartec. The performance of both your original and vartec's query should be comparable if both are properly optimized e.g. using hints (your query would use nested index lookups, whereas vartec's query would use a hash-based strategy.)
I think with GROUP BY and HAVING it should be faster.
Is that what you want?
SELECT F.id,F.created,U.username,U.fullname,U.id, I.field1, I.field2, count(*) as CNT
FROM favorites AS F
INNER JOIN users AS U ON F.faver_profile_id = U.id
INNER JOIN items AS I ON F.notice_id = I.id
WHERE faver_profile_id IN (360,379,95,315,278,1)
AND F.removed = 0
AND I.removed = 0
AND F.collection_id is null
AND I.nudity = 0
GROUP BY F.id,F.created,U.username,U.fullname,U.id,I.field1, I.field2
HAVING CNT < 15
ORDER BY F.faver_profile_id, F.created DESC;
Don't know which fields from items you need, so I've put placeholders.
I suggest you use Mysql Explain Query to see how your mysql server handles the query. My bet is your indexes aren't optimal, but explain should do much better than my bet.
You could do a loop on each id and use limit instead of the count(*) subquery:
foreach $id in [123,456,789]:
SELECT
F.id,
F.created,
U.username,
U.fullname,
U.id,
I.*
FROM
favorites AS F INNER JOIN
users AS U ON F.faver_profile_id = U.id INNER JOIN
items AS I ON F.notice_id = I.id
WHERE
F.faver_profile_id = {$id} AND
I.removed = 0 AND
I.nudity = 0 AND
F.removed = 0 AND
F.collection_id is null
ORDER BY
F.faver_profile_id,
F.created DESC
LIMIT
15;
I'll suppose the result of that query is intended to be shown as a paged list. In that case, perhaps you could consider doing a simpler "unjoined" query and then a second query for each row, to read only the 15, 20 or 30 elements shown. Isn't a JOIN a heavy operation? This would simplify the query, and it wouldn't become slower as the joined tables grow.
Tell me if I'm wrong, please.