I'm trying to optimize my query because it takes 3.5 seconds, which is too long.
This is my query:
SELECT
`products`.*,
IFNULL(SUM(`products_uses`.`quantity`),0) as `usesQuantity`,
IF((`products`.`productQuantity` - IFNULL(SUM(`products_uses`.`quantity`),0)) > 0, (`products`.`productQuantity` - IFNULL(SUM(`products_uses`.`quantity`),0)), 0) AS `totalUses`
FROM `products`
LEFT JOIN `products_uses` ON `products`.`id` = `products_uses`.`productId`
WHERE `products`.`nurseForm` = 1
GROUP BY `products`.`id`
ORDER BY `products`.`fav` DESC, `products`.`productName` ASC
I tried to optimize it with user variables, but nothing changed:
SELECT
`products`.*,
@usesQuantity := IFNULL(SUM(`products_uses`.`quantity`),0) as `usesQuantity`,
IF((`products`.`productQuantity` - @usesQuantity) > 0, (`products`.`productQuantity` - @usesQuantity), 0) AS `totalUses`
FROM `products`
LEFT JOIN `products_uses` ON `products`.`id` = `products_uses`.`productId`
WHERE `products`.`nurseForm` = 1
GROUP BY `products`.`id`
ORDER BY `products`.`fav` DESC, `products`.`productName` ASC
This query computes two values for each product:
the total quantity used: IFNULL(SUM(`products_uses`.`quantity`),0)
the number of uses left (never below zero): IF((`products`.`productQuantity` - IFNULL(SUM(`products_uses`.`quantity`),0)) > 0, (`products`.`productQuantity` - IFNULL(SUM(`products_uses`.`quantity`),0)), 0)
I tried changing the table engine to MyISAM and to InnoDB, but nothing changed.
What can I do to optimize this query?
Thanks
I'm not sure that introducing user variables will change much. But strategically adding indices on your two tables might help. Try this:
ALTER TABLE products ADD INDEX nurse_index (nurseForm);
ALTER TABLE products_uses ADD INDEX product_index (productId);
The first index, on the products.nurseForm column, might help the WHERE clause. In particular, this index would be a big help if only a few records match.
The second index, on products_uses.productId, might help the join go faster. Again, this would depend on how large your tables are.
You may also run EXPLAIN to see if any other bottlenecks stand out.
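As a rough illustration of checking a plan before and after adding the two indexes, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, so the plan text differs from MySQL's EXPLAIN output; the reduced table definitions are made up for the example.

```python
import sqlite3

# SQLite stands in for MySQL here; the columns are a reduced, assumed schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY, productName TEXT, nurseForm INTEGER);
    CREATE TABLE products_uses (
        id INTEGER PRIMARY KEY, productId INTEGER, quantity INTEGER);
""")

QUERY = """
    SELECT p.id, SUM(u.quantity)
    FROM products AS p
    LEFT JOIN products_uses AS u ON u.productId = p.id
    WHERE p.nurseForm = 1
    GROUP BY p.id
"""

def plan(sql):
    """Return the 'detail' text of each EXPLAIN QUERY PLAN row."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_before = plan(QUERY)

# Same index names as in the ALTER TABLE statements above.
conn.execute("CREATE INDEX nurse_index ON products (nurseForm)")
conn.execute("CREATE INDEX product_index ON products_uses (productId)")

plan_after = plan(QUERY)

print("before:", plan_before)
print("after: ", plan_after)
```

On MySQL the equivalent check is to run EXPLAIN on the query before and after the ALTER TABLE statements and compare the type, key and rows columns.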
Providing SHOW CREATE TABLE would help. Meanwhile, I will guess.
This may (or may not) help. It attempts to do the summing without hauling all the columns of products around. It moves the GROUP BY into a narrow derived table, so the outer query no longer groups the wide rows.
SELECT p2.*,
IFNULL(x.usesQuantity, 0) as `usesQuantity`,
GREATEST(p2.`productQuantity` - IFNULL(x.usesQuantity, 0), 0) AS `totalUses`
FROM `products` AS p2
LEFT JOIN
( SELECT p.id AS xid,
SUM(pu.quantity) as `usesQuantity`
FROM products_uses AS pu
JOIN products AS p ON p.id = pu.productId
WHERE p.nurseForm = 1
GROUP BY p.id
) AS x ON x.xid = p2.id
WHERE p2.nurseForm = 1
ORDER BY p2.`fav` DESC,
p2.`productName` ASC
These indexes should help:
products: INDEX(nurseForm, id)
products: PRIMARY KEY(id) -- I am assuming this??
products_uses: INDEX(productId, quantity)
If LEFT were unnecessary there would be other optimizations.
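MySQL 5.6 or later would help with the subquery, because it can build an index on the derived table automatically.
MySQL 8.0 could help with the ORDER BY, because it supports descending indexes. Until then, a sort is required due to the mixture of DESC and ASC.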
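To check that summing in a derived table returns the same rows as the original join plus GROUP BY, here is a small sketch. It assumes SQLite as a stand-in for MySQL and uses invented sample data; SQLite's two-argument MAX plays the role of MySQL's GREATEST.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (
        id INTEGER PRIMARY KEY, productName TEXT, fav INTEGER,
        nurseForm INTEGER, productQuantity INTEGER);
    CREATE TABLE products_uses (
        id INTEGER PRIMARY KEY, productId INTEGER, quantity INTEGER);
    -- Invented sample rows, for illustration only.
    INSERT INTO products VALUES
        (1, 'Gauze',   1, 1, 10),
        (2, 'Syringe', 0, 1,  5),
        (3, 'Bandage', 0, 1,  3),
        (4, 'Scalpel', 1, 0,  7);
    INSERT INTO products_uses (productId, quantity) VALUES
        (1, 4), (1, 2), (2, 9), (4, 1);
""")

# Shape of the question's query: join first, then GROUP BY the wide rows.
ORIGINAL = """
    SELECT p.id,
           IFNULL(SUM(u.quantity), 0) AS usesQuantity,
           MAX(p.productQuantity - IFNULL(SUM(u.quantity), 0), 0) AS totalUses
    FROM products AS p
    LEFT JOIN products_uses AS u ON u.productId = p.id
    WHERE p.nurseForm = 1
    GROUP BY p.id
    ORDER BY p.fav DESC, p.productName ASC
"""

# Shape of the answer's query: aggregate in a narrow derived table, then join.
DERIVED = """
    SELECT p2.id,
           IFNULL(x.usesQuantity, 0) AS usesQuantity,
           MAX(p2.productQuantity - IFNULL(x.usesQuantity, 0), 0) AS totalUses
    FROM products AS p2
    LEFT JOIN ( SELECT p.id AS xid, SUM(pu.quantity) AS usesQuantity
                FROM products_uses AS pu
                JOIN products AS p ON p.id = pu.productId
                WHERE p.nurseForm = 1
                GROUP BY p.id ) AS x ON x.xid = p2.id
    WHERE p2.nurseForm = 1
    ORDER BY p2.fav DESC, p2.productName ASC
"""

original_rows = conn.execute(ORIGINAL).fetchall()
derived_rows = conn.execute(DERIVED).fetchall()
print(original_rows)
print(derived_rows)
```

Both statements return the same three rows on this data; whether the second form is actually faster on MySQL depends on table sizes and indexes, so it is worth timing both.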
I want to improve my current query. I have a table called income with a varchar field sourceId. A single SELECT returns the fields I need, but I had to add an extra field, isFirstIncome, that indicates whether the row is the first one in which that sourceId was used. This is my current query:
SELECT DISTINCT
`income`.*,
CASE WHEN (
SELECT
`income2`.id
FROM
`income` as `income2`
WHERE
`income2`.`sourceId` = `income`.`sourceId`
ORDER BY
`income2`.created asc
LIMIT 1
) = `income`.id THEN true ELSE false END
as isFirstIncome
FROM
`income` as `income`
WHERE `income`.incomeType IN ('passive', 'active') AND `income`.status = 'paid'
ORDER BY `income`.created desc
LIMIT 50
The query works but slows down if I keep increasing the LIMIT or OFFSET. Any suggestions?
UPDATE 1:
Added WHERE statements used on the original query
UPDATE 2:
MYSQL version 5.7.22
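You can achieve this with a window (ordered analytical) function such as ROW_NUMBER or RANK. Note that window functions require MySQL 8.0 or later, so this will not run on 5.7.22.
The query below flags the earliest row per sourceId. Because the window is evaluated after the WHERE clause, "first" here means first among the rows that pass the filter, not first among all rows for that sourceId.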
SELECT *,
CASE
WHEN Row_number()
OVER(
PARTITION BY sourceid
ORDER BY created ASC) = 1 THEN true
ELSE false
END AS isFirstIncome
FROM income
WHERE incomeType IN ('passive', 'active') AND status = 'paid'
ORDER BY created desc
My first thought is that isFirstIncome should be an extra column in the table. It should be populated as the data is inserted.
If you don't like that, let's try to optimize the query...
Let's avoid doing the subquery more than 50 times. This requires turning the query inside-out. (It's like "explode-implode", where the query gathers lots of stuff, then sorts it and throws most of the rows away.)
To summarize:
Do the least amount of work needed to identify just the 50 rows.
JOIN to whatever tables are needed (including itself if appropriate); this is to get any other columns desired (including isFirstIncome).
SELECT i3.*,
( ... using i3 ... ) as isFirstIncome
FROM (
SELECT i1.id, i1.sourceId, i1.created -- created is needed by the outer ORDER BY
FROM `income` AS i1
WHERE i1.incomeType IN ('passive', 'active')
AND i1.status = 'paid'
ORDER BY i1.created DESC
LIMIT 50
) AS i2
JOIN income AS i3 USING(id)
ORDER BY i2.created DESC -- yes, repeated
(I left out the computation of isFirstIncome; it is discussed in other Answers. But note that it will be executed at most 50 times.)
(The aliases -- i1, i2, i3 -- are numbered in the order they will be "used"; this is to assist in following the SQL.)
To assist in performance, add
INDEX(status, incomeType, created, id, sourceId)
It should help with my formulation, but probably not for the other versions. Your version would benefit from
INDEX(sourceId, created, id)
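Here is a minimal sketch of the inside-out formulation with the isFirstIncome computation filled in, compared against the question's original shape. It assumes SQLite as a stand-in for MySQL, integer timestamps, a page size of 3 instead of 50, and invented rows; the index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE income (
        id INTEGER PRIMARY KEY, sourceId TEXT, incomeType TEXT,
        status TEXT, created INTEGER);
    -- Invented sample rows, for illustration only.
    INSERT INTO income VALUES
        (1, 'a', 'passive', 'paid',    100),
        (2, 'a', 'active',  'paid',    200),
        (3, 'b', 'active',  'pending', 300),
        (4, 'b', 'passive', 'paid',    400),
        (5, 'c', 'active',  'paid',    500),
        (6, 'a', 'other',   'paid',    600);
    -- The two indexes suggested above (names are hypothetical).
    CREATE INDEX income_filter ON income (status, incomeType, created, id, sourceId);
    CREATE INDEX income_source ON income (sourceId, created, id);
""")

PAGE_SIZE = 3  # the question uses 50

# Original shape: the correlated subquery belongs to every candidate row.
NAIVE = """
    SELECT i.id, i.sourceId,
           ( SELECT f.id FROM income AS f
             WHERE f.sourceId = i.sourceId
             ORDER BY f.created ASC LIMIT 1 ) = i.id AS isFirstIncome
    FROM income AS i
    WHERE i.incomeType IN ('passive', 'active') AND i.status = 'paid'
    ORDER BY i.created DESC
    LIMIT ?
"""

# Inside-out shape: pick the page of ids first, then decorate only those rows.
INSIDE_OUT = """
    SELECT i3.id, i3.sourceId,
           ( SELECT f.id FROM income AS f
             WHERE f.sourceId = i3.sourceId
             ORDER BY f.created ASC LIMIT 1 ) = i3.id AS isFirstIncome
    FROM ( SELECT i1.id, i1.created
           FROM income AS i1
           WHERE i1.incomeType IN ('passive', 'active')
             AND i1.status = 'paid'
           ORDER BY i1.created DESC
           LIMIT ? ) AS i2
    JOIN income AS i3 ON i3.id = i2.id
    ORDER BY i2.created DESC
"""

naive_rows = conn.execute(NAIVE, (PAGE_SIZE,)).fetchall()
inside_out_rows = conn.execute(INSIDE_OUT, (PAGE_SIZE,)).fetchall()
print(naive_rows)
print(inside_out_rows)
```

Both forms return the same page on this data. The point of the second form is that the derived table can be resolved from the first index alone, and the correlated lookup then runs only for the rows that survive the LIMIT.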
I have a Drupal 7 site running on MySQL. Some pages on the site are excruciatingly slow to load.
I investigated load times and have identified the culprit query, which is regularly taking 10s to execute on some pages. In one case it even took 70s!
The query is from a "view" that generates a short list of related content from elsewhere in the site based on the site taxonomy.
This is an example (with arguments) from one slow page:
SELECT node.nid AS nid, node.title AS node_title, node.created AS node_created, 'podcasts:panel_pane_3' AS view_name, RAND() AS random_field
FROM node node
LEFT JOIN (SELECT td.*, tn.nid AS nid
FROM taxonomy_term_data td
LEFT JOIN taxonomy_vocabulary tv ON td.vid = tv.vid
LEFT JOIN taxonomy_index tn ON tn.tid = td.tid
WHERE (tv.machine_name IN ('listen')) ) taxonomy_term_data_node
ON node.nid = taxonomy_term_data_node.nid
LEFT JOIN taxonomy_index taxonomy_index ON node.nid = taxonomy_index.nid
WHERE (( (taxonomy_index.tid IN ('472', '350', '742', '681', '3907', '1541', '411', '636', '990', '7757', '680', '743', '11479', '8106', '566', '2230', '11480', '766'))
AND (node.nid != '191314' OR node.nid IS NULL) )
AND(( (node.status = '1')
AND (node.type IN ('article', 'experiment', 'interview', 'podcast', 'question')) )))
ORDER BY random_field ASC, node_created DESC
LIMIT 5 OFFSET 0
From initial research I thought it would be a case of adding indices, but the columns of the tables concerned seem to have existing index entries.
I'm therefore uncertain how to proceed and would really value some guidance if anyone can help me please?
PS - I did ask MySQL to Explain itself and this is what was generated:
A few recommendations to optimize this query:
Avoid selecting unnecessary columns: do you really need all the columns in td.*? In most cases it means that too much information is passed over the network to the application.
Mixed ORDER BY directions: you're sorting by two columns, random_field ASC and node_created DESC. Sorting in mixed directions prevents index usage (before MySQL 8.0), which slows down the query. Since random_field is RAND(), that part of the sort cannot use an index in any case. Does it make sense to make both ASC or both DESC?
I assume that taxonomy_index.tid is numeric, and so are 'node.nid' and 'node.status'. In that case, when comparing them to constants, do not add quotes around the constant, as it will cause an unrequired cast which might prevent index use. For example, turn node.status = '1' to node.status = 1.
You are left joining a subquery (taxonomy_term_data_node) - if you're using MySQL < 5.6, or maybe even MySQL 5.7, it's most likely that MySQL can't index that subquery properly. Therefore, I would recommend to extract that subquery to a temporary table, index it and join to it from the outer query. See the transformation below.
So to apply most changes (the ones that do not require your decision, such as part 1 and 2 above), perform these steps:
First, index the main query by adding these indexes:
ALTER TABLE
`node`
ADD
INDEX `node_idx_status_nid_title_created` (`status`, `nid`, `title`, `created`);
ALTER TABLE
`taxonomy_index`
ADD
INDEX `taxonomy_index_idx_nid` (`nid`);
ALTER TABLE
`taxonomy_index`
ADD
INDEX `taxonomy_index_idx_tid_nid` (`tid`, `nid`);
ALTER TABLE
`taxonomy_term_data`
ADD
INDEX `taxonomy_term_data_idx_vid_tid` (`vid`, `tid`);
ALTER TABLE
`taxonomy_vocabulary`
ADD
INDEX `taxonomy_vocabulary_idx_vid` (`vid`);
Next, create the temporary table:
CREATE TEMPORARY TABLE IF NOT EXISTS temp1 AS SELECT
td.*,
tn.nid AS nid
FROM
taxonomy_term_data td
LEFT JOIN
taxonomy_vocabulary tv
ON td.vid = tv.vid
LEFT JOIN
taxonomy_index tn
ON tn.tid = td.tid
WHERE
(
tv.machine_name IN (
'listen'
)
);
Now index the temporary table:
ALTER TABLE temp1 ADD INDEX temp1_idx_nid (nid);
And the outer query will join to it:
SELECT
node.nid AS nid,
node.title AS node_title,
node.created AS node_created,
'podcasts:panel_pane_3' AS view_name,
RAND() AS random_field
FROM
node node
LEFT JOIN
temp1 taxonomy_term_data_node
ON node.nid = taxonomy_term_data_node.nid
LEFT JOIN
taxonomy_index taxonomy_index
ON node.nid = taxonomy_index.nid
WHERE
(
(
(
taxonomy_index.tid IN (
'472', '350', '742', '681', '3907', '1541', '411', '636', '990', '7757', '680', '743', '11479', '8106', '566', '2230', '11480', '766'
)
)
AND (
node.nid != '191314'
OR node.nid IS NULL
)
)
AND (
(
(
node.status = '1'
)
AND (
node.type IN (
'article', 'experiment', 'interview', 'podcast', 'question'
)
)
)
)
)
ORDER BY
random_field ASC,
node_created DESC LIMIT 5
Thanks for the guidance above, everyone. However, I have solved this with the help of Andy Batey at Cambridge University.
The clue was comparing the EXPLAIN statements generated when the query above was run on MySQL v5.5 (very fast results) versus v5.7 (very slow results); the query was being handled quite differently on the two platforms.
The key was adding this to my.cnf:
optimizer_switch='derived_merge=off'
Now the native query executes in 50ms or less, compared with 12s or longer before.
I hope this helps anyone else who runs into this upgrade problem.
This query (along with a few others I think have a related issue) did not take 30 seconds when MySQL was local on the same EC2 instance as the rest of the website. More like milliseconds.
Does anything look off?
SELECT *, chv_images.image_id FROM chv_images
LEFT JOIN chv_storages ON chv_images.image_storage_id = chv_storages.storage_id
LEFT JOIN chv_users ON chv_images.image_user_id = chv_users.user_id
LEFT JOIN chv_albums ON chv_images.image_album_id = chv_albums.album_id
LEFT JOIN chv_categories ON chv_images.image_category_id = chv_categories.category_id
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
LEFT JOIN chv_likes ON chv_likes.like_content_type = "image"
AND chv_likes.like_content_id = chv_images.image_id
AND chv_likes.like_user_id = 1
LEFT JOIN chv_follows ON chv_follows.follow_followed_user_id = chv_images.image_user_id
LEFT JOIN chv_follows_projects ON chv_follows_projects.follows_project_project_id = chv_images.image_project_id
LEFT JOIN chv_projects ON chv_projects.project_id = follows_project_project_id
WHERE chv_follows.follow_user_id='1'
OR (follows_project_user_id = 1
AND chv_projects.project_privacy = "public"
AND chv_projects.project_is_public_upload = 1)
GROUP BY chv_images.image_id
ORDER BY chv_images.image_id DESC
LIMIT 0,15
And this is what EXPLAIN shows:
Thank you
Update: This query has the same issue. It does not have a GROUP BY.
SELECT *, chv_images.image_id FROM chv_images
LEFT JOIN chv_storages ON chv_images.image_storage_id = chv_storages.storage_id
LEFT JOIN chv_users ON chv_images.image_user_id = chv_users.user_id
LEFT JOIN chv_albums ON chv_images.image_album_id = chv_albums.album_id
LEFT JOIN chv_categories ON chv_images.image_category_id = chv_categories.category_id
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
LEFT JOIN chv_likes ON chv_likes.like_content_type = "image"
AND chv_likes.like_content_id = chv_images.image_id
AND chv_likes.like_user_id = 1
ORDER BY chv_images.image_id DESC
LIMIT 0,15
That EXPLAIN shows several table-scans (type: ALL), so it's not surprising that it takes over 30 seconds.
Here's your EXPLAIN:
Notice the column rows shows an estimated 14420 rows read from the first table chv_images. It's doing a table-scan of all the rows.
In general, when you do a series of JOINs, you can multiply together all the values in the rows column of the EXPLAIN, and the final result is roughly how many row-reads MySQL has to do. In this case it's 14420 * 2 * 1 * 1 * 2 * 1 * 916, or 52,834,880 row-reads. That should put into perspective the high cost of doing several table-scans in the same query.
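The multiplication can be reproduced in a few lines. This sketch only redoes the arithmetic on the rows values quoted above; the second calculation is a hypothetical what-if and not something taken from the EXPLAIN.

```python
from math import prod

# "rows" estimates per table, in join order, as quoted from the EXPLAIN.
rows_per_table = [14420, 2, 1, 1, 2, 1, 916]
estimated_row_reads = prod(rows_per_table)
print(f"{estimated_row_reads:,}")  # 52,834,880

# Hypothetical what-if: the last table is reached through an index lookup
# that matches about one row instead of scanning 916 rows.
what_if = prod(rows_per_table[:-1] + [1])
print(f"{what_if:,}")  # 57,680
```

The estimate is only as good as the optimizer's statistics, but it shows how a single table-scan late in the join order multiplies the total work.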
You might help avoid those table-scans by creating some indexes on these tables:
ALTER TABLE chv_storages
ADD INDEX (storage_id);
ALTER TABLE chv_categories
ADD INDEX (category_id);
ALTER TABLE chv_likes
ADD INDEX (like_content_id, like_content_type, like_user_id);
Try creating those indexes and then run the EXPLAIN again.
The other tables are already doing lookups by primary key (type: eq_ref) or by secondary key (type: ref) so those are already optimized.
Your EXPLAIN shows your query uses a temporary table and filesort. You should reconsider whether you need the GROUP BY, because that's probably causing the extra work.
Another tip is to avoid using SELECT * because it might be forcing the query to read many extra columns that you don't need. Instead, explicitly name only the columns you need.
Are there any indexes on chv_images?
I propose:
CREATE INDEX idx_image_id ON chv_images (image_id);
(Bill's ideas are good. I'll take the discussion a different way...)
Explode-Implode -- If the LEFT JOINs match no more than 1 row, change, for example,
SELECT
...
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
into
SELECT ...,
( SELECT foo FROM chv_meta WHERE image_id = chv_images.image_id ) AS foo, ...
If that can be done for all the JOINs, you can get rid of GROUP BY. This will avoid the costly "explode-implode" where JOINs lead to more rows, then GROUP BY gets rid of the dups. (I suspect you can't move all the joins in.)
OR -> UNION -- OR is hard to optimize. Your query looks like a good candidate for turning into UNION, then making more indexes that will become useful.
WHERE chv_follows.follow_user_id='1'
OR (follows_project_user_id = 1
AND chv_projects.project_privacy = "public"
AND chv_projects.project_is_public_upload = 1
)
Assuming that follows_project_user_id is in chv_images,
( SELECT ...
WHERE chv_follows.follow_user_id='1' )
UNION DISTINCT -- or ALL, if you are sure there won't be dups
( SELECT ...
WHERE follows_project_user_id = 1
AND chv_projects.project_privacy = "public"
AND chv_projects.project_is_public_upload = 1 )
Indexes needed:
chv_follows: (follow_user_id)
chv_projects: (project_privacy, project_is_public_upload) -- either order
But this has not yet handled the ORDER BY and LIMIT. The general pattern for such:
( SELECT ... ORDER BY ... LIMIT 15 )
UNION
( SELECT ... ORDER BY ... LIMIT 15 )
ORDER BY ... LIMIT 15
Yes, the ORDER BY and LIMIT are repeated.
That works for page 1. If you want the next 15 rows, see http://mysql.rjweb.org/doc.php/pagination#pagination_and_union
After building those two sub-selects, look at them; I think you will be able to optimize each one, and may need new indexes because the Optimizer will start with a different 'first' table.
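A minimal sketch of the OR -> UNION rewrite with the repeated ORDER BY and LIMIT follows. It assumes SQLite as a stand-in for MySQL, a simplified three-table schema with shortened, hypothetical column names, and invented rows. SQLite does not accept ORDER BY inside a bare UNION branch, so each branch is wrapped in a subquery; MySQL accepts the parenthesized form shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical schema and data, for illustration only.
    CREATE TABLE images (image_id INTEGER PRIMARY KEY, user_id INTEGER, project_id INTEGER);
    CREATE TABLE follows (follower_id INTEGER, followed_id INTEGER);
    CREATE TABLE follows_projects (follower_id INTEGER, project_id INTEGER);
    INSERT INTO images VALUES
        (1, 10, 100), (2, 11, 101), (3, 12, 100),
        (4, 10, 102), (5, 13, 101), (6, 14, 103);
    INSERT INTO follows VALUES (1, 10), (2, 11);
    INSERT INTO follows_projects VALUES (1, 101), (2, 103);
    CREATE INDEX follows_follower ON follows (follower_id);
    CREATE INDEX follows_projects_follower ON follows_projects (follower_id);
""")

params = {"uid": 1, "n": 3}

# OR across two different joined tables: hard for the optimizer to index.
WITH_OR = """
    SELECT DISTINCT i.image_id AS image_id
    FROM images AS i
    LEFT JOIN follows AS f ON f.followed_id = i.user_id
    LEFT JOIN follows_projects AS fp ON fp.project_id = i.project_id
    WHERE f.follower_id = :uid OR fp.follower_id = :uid
    ORDER BY i.image_id DESC
    LIMIT :n
"""

# One branch per OR arm, each with its own ORDER BY and LIMIT,
# then ORDER BY and LIMIT once more on the combined result.
WITH_UNION = """
    SELECT image_id FROM (
        SELECT DISTINCT i.image_id AS image_id
        FROM images AS i
        JOIN follows AS f ON f.followed_id = i.user_id
        WHERE f.follower_id = :uid
        ORDER BY i.image_id DESC LIMIT :n )
    UNION
    SELECT image_id FROM (
        SELECT DISTINCT i.image_id AS image_id
        FROM images AS i
        JOIN follows_projects AS fp ON fp.project_id = i.project_id
        WHERE fp.follower_id = :uid
        ORDER BY i.image_id DESC LIMIT :n )
    ORDER BY image_id DESC
    LIMIT :n
"""

or_ids = [r[0] for r in conn.execute(WITH_OR, params)]
union_ids = [r[0] for r in conn.execute(WITH_UNION, params)]
print(or_ids)
print(union_ids)
```

Each branch now filters on a single table, so it can start from the index on that table's follower column. Keeping the top n rows of every branch is enough to get the overall top n.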
I have a big query (MYSQL) to join several tables:
SELECT * FROM
`AuthLogTable`,
`AppTable`,
`Company`,
`LicenseUserTable`,
`LicenseTable`,
`LicenseUserPool`,
`PoolTable`
WHERE
`LicenseUserPool`.`UserID`=`LicenseUserTable`.`UserID` and
`LicenseUserTable`.`License`=`LicenseTable`.`License` and
LEFT(RIGHT(`AuthLogTable`.`User`, 17), 16)=`LicenseUserPool`.`UserID` and
`LicenseUserPool`.`PoolID`=`PoolTable`.`id` and
`Company`.`id`=`LicenseTable`.`CompanyID` and
`AuthLogTable`.`License` = `LicenseTable`.`License` and
`AppTable`.`AppID` = `AuthLogTable`.`AppID` AND
`PoolTable`.`id` IN (-1,1,2,4,15,16,17,5,18,19,43,20,3,6,8,10,29,30,7,11,12,24,25,26,27,28,21,23,22,31,32,33,34,35,36,37,38,39,40,41,42,-1)
ORDER BY
`AuthLogTable`.`AuthDate` DESC,
`AuthLogTable`.`AuthTime` DESC
LIMIT 0,20
I use explain and it gives the following:
How to make this faster? It takes several seconds in a big table.
"Showing rows 0 - 19 ( 20 total, Query took 3.5825 sec)"
As far as I know, the fields used in the query are indexed in each table.
Indices are set for AuthLogTable
You can try running this query without the ORDER BY clause on your data and see if it makes a difference (also run EXPLAIN). If it does, consider adding an index on the fields you sort by. "Using temporary; Using filesort" means that a temporary table is created and then sorted without the help of an index, which takes time.
As far as I know, join style doesn't make any difference because query is parsed into another form anyway. But you still may want to use ANSI join syntax (see also this question ANSI joins versus "where clause" joins).
First of all, consider modifying your query to use JOINs properly. Also, make sure that you have indexed the columns used in the JOIN ON clause, the WHERE condition and the ORDER BY clause.
select * from `AuthLogTable`
join `AppTable` on `AppTable`.`AppID` = `AuthLogTable`.`AppID`
join `LicenseTable` on `AuthLogTable`.`License` = `LicenseTable`.`License`
join `Company` on `Company`.`id`=`LicenseTable`.`CompanyID`
join `LicenseUserTable` on `LicenseUserTable`.`License`=`LicenseTable`.`License`
join `LicenseUserPool` on `LicenseUserPool`.`UserID`=`LicenseUserTable`.`UserID`
join `PoolTable` on `LicenseUserPool`.`PoolID`=`PoolTable`.`id`
where LEFT(RIGHT(`AuthLogTable`.`User`, 17), 16)=`LicenseUserPool`.`UserID`
and `PoolTable`.`id` IN (-1,1,2,4,15,16,17,5,18,19,43,20,3,6,8,10,29,30,7,11,12,24,25,26,27,28,21,23,22,31,32,33,34,35,36,37,38,39,40,41,42,-1)
order by `AuthLogTable`.`AuthDate` desc, `AuthLogTable`.`AuthTime` desc
limit 0,20;
Try the following query:
SELECT *
FROM `AuthLogTable`
JOIN `AppTable` ON (`AppTable`.`AppID` = `AuthLogTable`.`AppID`)
JOIN `LicenseUserPool` ON (LEFT(RIGHT(`AuthLogTable`.`User`, 17), 16)=`LicenseUserPool`.`UserID`)
JOIN `LicenseUserTable` ON (`LicenseUserPool`.`UserID`=`LicenseUserTable`.`UserID`)
JOIN `LicenseTable` ON (`AuthLogTable`.`License` = `LicenseTable`.`License`
AND `LicenseUserTable`.`License`=`LicenseTable`.`License`)
JOIN `Company` ON (`Company`.`id`=`LicenseTable`.`CompanyID`)
JOIN `PoolTable` ON (`LicenseUserPool`.`PoolID`=`PoolTable`.`id`)
WHERE `PoolTable`.`id` IN (-1,1,2,4,15,16,17,5,18,19,43,20,3,6,8,10,29,30,7,11,12,24,25,26,27,28,21,23,22,31,32,33,34,35,36,37,38,39,40,41,42,-1)
ORDER BY `AuthLogTable`.`AuthDate` DESC, `AuthLogTable`.`AuthTime` DESC LIMIT 0,20
I have the following query:
SELECT *
FROM products
INNER JOIN product_meta
ON products.id = product_meta.product_id
JOIN sales_rights
ON product_meta.product_id = sales_rights.product_id
WHERE ( products.categories REGEXP '[[:<:]]5[[:>:]]' )
AND ( active = '1' )
AND ( products.show_browse = 1 )
AND ( product_meta.software_platform_mac IS NOT NULL )
AND ( sales_rights.country_id = '240'
OR sales_rights.country_id = '223' )
GROUP BY products.id
ORDER BY products.avg_rating DESC
LIMIT 0, 18;
Without the ORDER BY section the query runs in ~90ms; with the ORDER BY section it takes ~8s.
I've browsed around SO and have found the reason for this could be that the sort is being executed before all the data is returned, and instead we should be running ORDER BY on the result set instead? (See this post: Slow query when using ORDER BY)
But I can't quite figure out the definitive way to do this.
I've browsed around SO and have found the reason for this could be
that the sort is being executed before all the data is returned, and
instead we should be running ORDER BY on the result set instead?
I find that hard to believe, but if that's indeed the issue, I think you'll need to do something like this. (Note where I put the parens.)
select * from
(
SELECT products.id, products.avg_rating
FROM products
INNER JOIN product_meta
ON products.id = product_meta.product_id
JOIN sales_rights
ON product_meta.product_id = sales_rights.product_id
WHERE ( products.categories REGEXP '[[:<:]]5[[:>:]]' )
AND ( active = '1' )
AND ( products.show_browse = 1 )
AND ( product_meta.software_platform_mac IS NOT NULL )
AND ( sales_rights.country_id = '240'
OR sales_rights.country_id = '223' )
GROUP BY products.id
) as X
ORDER BY avg_rating DESC
LIMIT 0, 18;
Also, edit your question and include a link to that advice. I think many of us would benefit from reading it.
Additional, possibly unrelated issues
Every column used in a WHERE clause should probably be indexed somehow. Multi-column indexes might perform better for this particular query.
The column products.categories seems to be storing multiple values that you filter with regular expressions. Storing multiple values in a single column is usually a bad idea.
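MySQL's nonstandard GROUP BY, which lets you select columns that are neither grouped nor aggregated, returns indeterminate values for those columns. A standard SQL statement using a GROUP BY might return fewer rows, and it might return them faster.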
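To illustrate the point about storing multiple values in one column, here is a sketch of the usual alternative, a junction table with one row per product and category. It assumes SQLite as a stand-in for MySQL; the product_categories table, its columns and the sample rows are hypothetical and not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, avg_rating REAL);
    -- Hypothetical junction table: one row per (product, category) pair.
    CREATE TABLE product_categories (
        product_id  INTEGER NOT NULL REFERENCES products (id),
        category_id INTEGER NOT NULL,
        PRIMARY KEY (category_id, product_id)
    );
    -- Invented sample rows, for illustration only.
    INSERT INTO products VALUES
        (1, 'Editor', 4.5), (2, 'Game', 3.9), (3, 'Player', 4.8);
    INSERT INTO product_categories (product_id, category_id) VALUES
        (1, 5), (1, 15), (2, 15), (3, 5), (3, 7);
""")

# An equality test on an indexed column replaces the REGEXP word-boundary
# match, and category 15 can no longer be mistaken for category 5.
rows = conn.execute("""
    SELECT p.id, p.name
    FROM product_categories AS pc
    JOIN products AS p ON p.id = pc.product_id
    WHERE pc.category_id = 5
    ORDER BY p.avg_rating DESC
""").fetchall()
print(rows)
```

With the primary key led by category_id, the filter becomes an index lookup rather than a regular expression evaluated against every row.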
If you can, you may want to index your ID columns so that the query will run quicker. This is a DBA-level solution, rather than a SQL solution - tuning the database will help overall performance.
The issue with this query was that, when GROUP BY and ORDER BY are used together, MySQL is unable to use an index for the sort if the GROUP BY and ORDER BY expressions are different.
Related Reading:
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
http://mysqldba.blogspot.co.uk/2008/06/how-to-pick-indexes-for-order-by-and.html