I have a big query (MYSQL) to join several tables:
SELECT * FROM
`AuthLogTable`,
`AppTable`,
`Company`,
`LicenseUserTable`,
`LicenseTable`,
`LicenseUserPool`,
`PoolTable`
WHERE
`LicenseUserPool`.`UserID`=`LicenseUserTable`.`UserID` and
`LicenseUserTable`.`License`=`LicenseTable`.`License` and
LEFT(RIGHT(`AuthLogTable`.`User`, 17), 16)=`LicenseUserPool`.`UserID` and
`LicenseUserPool`.`PoolID`=`PoolTable`.`id` and
`Company`.`id`=`LicenseTable`.`CompanyID` and
`AuthLogTable`.`License` = `LicenseTable`.`License` and
`AppTable`.`AppID` = `AuthLogTable`.`AppID` AND
`PoolTable`.`id` IN (-1,1,2,4,15,16,17,5,18,19,43,20,3,6,8,10,29,30,7,11,12,24,25,26,27,28,21,23,22,31,32,33,34,35,36,37,38,39,40,41,42,-1)
ORDER BY
`AuthLogTable`.`AuthDate` DESC,
`AuthLogTable`.`AuthTime` DESC
LIMIT 0,20
I use explain and it gives the following:
How to make this faster? It takes several seconds in a big table.
"Showing rows 0 - 19 ( 20 total, Query took 3.5825 sec)"
as far as i know, the fields used in the query are indexed in each table.
Indices are set for AuthLogTable
You can try running this query without 'order by' clause on your data and see if it makes a difference (also run 'explain'). If it does, you can consider adding index/indices on the fields you sort by. Using temporary; using filesort; means that the temp table is created and then sorted, without index that takes time.
As far as I know, join style doesn't make any difference because query is parsed into another form anyway. But you still may want to use ANSI join syntax (see also this question ANSI joins versus "where clause" joins).
First of all consider modifying your query to use JOINS properly. Also, make sure that you have indexed the columns used in JOIN ON clause ,WHERE condition and ORDER BY clause.
select * from `AuthLogTable`
join `AppTable` on `AppTable`.`AppID` = `AuthLogTable`.`AppID`
join `LicenseTable` on `AuthLogTable`.`License` = `LicenseTable`.`License`
join `Company` on `Company`.`id`=`LicenseTable`.`CompanyID`
join `LicenseUserTable` on `LicenseUserTable`.`License`=`LicenseTable`.`License`
join `LicenseUserPool` on `LicenseUserPool`.`UserID`=`LicenseUserTable`.`UserID`
join `PoolTable` on `LicenseUserPool`.`PoolID`=`PoolTable`.`id`
where LEFT(RIGHT(`AuthLogTable`.`User`, 17), 16)=`LicenseUserPool`.`UserID`
and `PoolTable`.`id` IN (-1,1,2,4,15,16,17,5,18,19,43,20,3,6,8,10,29,30,7,11,12,24,25,26,27,28,21,23,22,31,32,33,34,35,36,37,38,39,40,41,42,-1)
order by `AuthLogTable`.`AuthDate` desc, `AuthLogTable`.`AuthTime` desc
limit 0,20;
Try the following query:
SELECT *
FROM `AuthLogTable`
JOIN `AppTable` ON (`AppTable`.`AppID` = `AuthLogTable`.`AppID`)
JOIN `LicenseUserPool` ON (LEFT(RIGHT(`AuthLogTable`.`User`, 17), 16)=`LicenseUserPool`.`UserID`)
JOIN `LicenseUserTable` ON (`LicenseUserPool`.`UserID`=`LicenseUserTable`.`UserID`)
JOIN `LicenseTable` ON (`AuthLogTable`.`License` = `LicenseTable`.`License`
AND `LicenseUserTable`.`License`=`LicenseTable`.`License`)
JOIN `Company` ON (`Company`.`id`=`LicenseTable`.`CompanyID`)
JOIN `PoolTable` ON (`LicenseUserPool`.`PoolID`=`PoolTable`.`id`)
WHERE `PoolTable`.`id` IN (-1,1,2,4,15,16,17,5,18,19,43,20,3,6,8,10,29,30,7,11,12,24,25,26,27,28,21,23,22,31,32,33,34,35,36,37,38,39,40,41,42,-1)
ORDER BY `AuthLogTable`.`AuthDate` DESC, `AuthLogTable`.`AuthTime` DESC LIMIT 0,20
Related
How can I optimize this query SQL?
CREATE TABLE table1 AS
SELECT * FROM temp
WHERE Birth_Place IN
(SELECT c.DES_COM
FROM tableCom AS c
WHERE c.COD_PROV IS NULL)
ORDER BY Cod, Birth_Date
I think that the problem is the IN clause
First of all it's not quite valid SQL, since you are selecting and sorting by columns that are not part of the group. What you want to do is called "select top N in group", check out Select first row in each GROUP BY group?
Your query doesn't make sense, because you have SELECT * with GROUP BY. Ignoring that, I would recommend writing the query as:
SELECT t.*
FROM temp t
WHERE EXISTS (SELECT 1
FROM tableCom c
WHERE t.Birth_Place = c.DES_COM AND
c.COD_PROV IS NULL
)
ORDER BY Cod, Birth_Date;
For this, I recommend an index on tableCom(desc_com, cod_prov). Your database might also be able to use an an index on temp(cod, birth_date, birthplace).
i try to optimize my query because it takes 3.5 seconds and its too long.
this is my query:
SELECT
`products`.*,
IFNULL(SUM(`products_uses`.`quantity`),0) as `usesQuantity`,
IF((`products`.`productQuantity` - IFNULL(SUM(`products_uses`.`quantity`),0)) > 0, (`products`.`productQuantity` - IFNULL(SUM(`products_uses`.`quantity`),0)), 0) AS `totalUses`
FROM `products`
LEFT JOIN `products_uses` ON `products`.`id` = `products_uses`.`productId`
WHERE `products`.`nurseForm` = 1
GROUP BY `products`.`id`
ORDER BY `products`.`fav` DESC, `products`.`productName` ASC
i tried to optimize with variables but nothing changed:
SELECT
`products`.*,
#usesQuantity := IFNULL(SUM(`products_uses`.`quantity`),0) as `usesQuantity`,
IF((`products`.`productQuantity` - #usesQuantity) > 0, (`products`.`productQuantity` - #usesQuantity), 0) AS `totalUses`
FROM `products`
LEFT JOIN `products_uses` ON `products`.`id` = `products_uses`.`productId`
WHERE `products`.`nurseForm` = 1
GROUP BY `products`.`id`
ORDER BY `products`.`fav` DESC, `products`.`productName` ASC
this query
sum how much quantity used each product - IFNULL(SUM(products_uses.quantity),0)
,
how much uses each product -
IF((`products`.`productQuantity` - IFNULL(SUM(`products_uses`.`quantity`),0)) > 0, (`products`.`productQuantity` - IFNULL(SUM(`products_uses`.`quantity`),0)), 0)
i tried to changed structure of the tables to myISAM and InnoDB nothing changed.
what can i do to optimize this query?
tnx
I'm not sure that introducing user variables will change much. But stategically adding indices on your two tables might help. Try this:
ALTER TABLE products ADD INDEX nurse_index (nurseForm);
ALTER TABLE products_uses ADD INDEX product_index (productId);
The first index, on the products.nurseForm column, might help the WHERE clause. In particular, this index would be a big help if only a few records match.
The second index, on products_uses.productId, might help the join go faster. Again, this would depend on how large your tables are.
You may also run EXPLAIN to see if any other bottlenecks stand out.
Providing SHOW CREATE TABLE would help. Meanwhile, I will guess.
This may (or may not) help. It attempts to do the summing without hauling all the columns of products around. It avoids the GROUP BY.
SELECT p2.*,
IFNULL(x.usesQuantity, 0) as `usesQuantity`,
GREATEST(p2.`productQuantity` - IFNULL(x.usesQuantity), 0) AS `totalUses`
FROM `products` AS p2
LEFT JOIN
( SELECT p.id AS xid,
SUM(pu.quantity) as `usesQuantity`
FROM products_uses AS pu
JOIN products AS p ON p.id = pu.productId
WHERE p.nurseForm = 1
) AS x ON x.xid = p2.id
ORDER BY p2.`fav` DESC,
p2.`productName` ASC
These indexes should help:
products: INDEX(nurseForm, id)
products: PRIMARY KEY(id) -- I am assuming this??
products_uses: INDEX(productId, quantity)
If LEFT were unnecessary there would be other optimizations.
MySQL 5.6 would help with the subquery.
MySQL 8.0 could help with the ORDER BY, meanwhile, a sort is required due to the mixture of DESC and ASC.
This query (along with a few others I think have a related issue) did not take 30 seconds when MySQL was local on the same EC2 instance as the rest of the website. More like milliseconds.
Does anything look off?
SELECT *, chv_images.image_id FROM chv_images
LEFT JOIN chv_storages ON chv_images.image_storage_id =
chv_storages.storage_id
LEFT JOIN chv_users ON chv_images.image_user_id = chv_users.user_id
LEFT JOIN chv_albums ON chv_images.image_album_id = chv_albums.album_id
LEFT JOIN chv_categories ON chv_images.image_category_id =
chv_categories.category_id
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
LEFT JOIN chv_likes ON chv_likes.like_content_type = "image" AND
chv_likes.like_content_id = chv_images.image_id AND chv_likes.like_user_id = 1
LEFT JOIN chv_follows ON chv_follows.follow_followed_user_id =
chv_images.image_user_id
LEFT JOIN chv_follows_projects ON
chv_follows_projects.follows_project_project_id =
chv_images.image_project_id LEFT JOIN chv_projects ON
chv_projects.project_id = follows_project_project_id WHERE
chv_follows.follow_user_id='1' OR (follows_project_user_id = 1 AND
chv_projects.project_privacy = "public" AND
chv_projects.project_is_public_upload = 1) GROUP BY chv_images.image_id
ORDER BY chv_images.image_id DESC
LIMIT 0,15
And this is what EXPLAIN shows:
Thank you
Update: This query has the same issue. It does not have a GROUP BY.
SELECT *, chv_images.image_id FROM chv_images
LEFT JOIN chv_storages ON chv_images.image_storage_id =
chv_storages.storage_id
LEFT JOIN chv_users ON chv_images.image_user_id = chv_users.user_id
LEFT JOIN chv_albums ON chv_images.image_album_id = chv_albums.album_id
LEFT JOIN chv_categories ON chv_images.image_category_id =
chv_categories.category_id
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
LEFT JOIN chv_likes ON chv_likes.like_content_type = "image" AND
chv_likes.like_content_id = chv_images.image_id AND chv_likes.like_user_id = 1
ORDER BY chv_images.image_id DESC
LIMIT 0,15
That EXPLAIN shows several table-scans (type: ALL), so it's not surprising that it takes over 30 seconds.
Here's your EXPLAIN:
Notice the column rows shows an estimated 14420 rows read from the first table chv_images. It's doing a table-scan of all the rows.
In general, when you do a series of JOINs, you can multiple together all the values in the rows column of the EXPLAIN, and the final result is how many row-reads MySQL has to do. In this case it's 14420 * 2 * 1 * 1 * 2 * 1 * 916, or 52,834,880 row-reads. That should put into perspective the high cost of doing several table-scans in the same query.
You might help avoid those table-scans by creating some indexes on these tables:
ALTER TABLE chv_storages
ADD INDEX (storage_id);
ALTER TABLE chv_categories
ADD INDEX (category_id);
ALTER TABLE chv_likes
ADD INDEX (like_content_id, like_content_type, like_user_id);
Try creating those indexes and then run the EXPLAIN again.
The other tables are already doing lookups by primary key (type: eq_ref) or by secondary key (type: ref) so those are already optimized.
Your EXPLAIN shows your query uses a temporary table and filesort. You should reconsider whether you need the GROUP BY, because that's probably causing the extra work.
Another tip is to avoid using SELECT * because it might be forcing the query to read many extra columns that you don't need. Instead, explicitly name only the columns you need.
Is there any indexes in chv_images?
I propose:
CREATE INDEX idx_image_id ON chv_images (image_id);
(Bill's ideas are good. I'll take the discussion a different way...)
Explode-Implode -- If the LEFT JOINs match no more than 1 row, change, for example,
SELECT
...
LEFT JOIN chv_meta ON chv_images.image_id = chv_meta.image_id
into
SELECT ...,
( SELECT foo FROM chv_meta WHERE image_id = chv_images.image_id ) AS foo, ...
If that can be done for all the JOINs, you can get rid of GROUP BY. This will avoid the costly "explode-implode" where JOINs lead to more rows, then GROUP BY gets rid of the dups. (I suspect you can't move all the joins in.)
OR -> UNION -- OR is hard to optimize. Your query looks like a good candidate for turning into UNION, then making more indexes that will become useful.
WHERE chv_follows.follow_user_id='1'
OR (follows_project_user_id = 1
AND chv_projects.project_privacy = "public"
AND chv_projects.project_is_public_upload = 1
)
Assuming that follows_project_user_id is in `chv_images,
( SELECT ...
WHERE chv_follows.follow_user_id='1' )
UNION DISTINCT -- or ALL, if you are sure there won't be dups
( SELECT ...
WHERE follows_project_user_id = 1
AND chv_projects.project_privacy = "public"
AND chv_projects.project_is_public_upload = 1 )
Indexes needed:
chv_follows: (follow_user_id)
chv_projects: (project_privacy, project_is_public_upload) -- either order
But this has not yet handled the ORDER BY and LIMIT. The general pattern for such:
( SELECT ... ORDER BY ... LIMIT 15 )
UNION
( SELECT ... ORDER BY ... LIMIT 15 )
ORDER BY ... LIMIT 15
Yes, the ORDER BY and LIMIT are repeated.
That works for page 1. If you want the next 15 rows, see http://mysql.rjweb.org/doc.php/pagination#pagination_and_union
After building those two sub-selects, look at them; I think you will be able to optimize each one, and may need new indexes because the Optimizer will start with a different 'first' table.
I'm not an expert in SQL, i have an sql statement :
SELECT * FROM articles WHERE article_id IN
(SELECT distinct(content_id) FROM contents_by_cats WHERE cat_id='$cat')
AND permission='true' AND date <= '$now_date_time' ORDER BY date DESC;
Table contents_by_cats has 11000 rows.
Table articles has 2700 rows.
Variables $now_date_time and $cat are php variables.
This query takes about 10 seconds to return the values (i think because it has nested SELECT statements) , and 10 seconds is a big amount of time.
How can i achieve this in another way ? (Views or JOIN) ?
I think JOIN will help me here but i don't know how to use it properly for the SQL statement that i mentioned.
Thanks in advance.
A JOIN is exactly what you are looking for. Try something like this:
SELECT DISTINCT articles.*
FROM articles
JOIN contents_by_cats ON articles.article_id = contents_by_cats.content_id
WHERE contents_by_cats.cat_id='$cat'
AND articles.permission='true'
AND articles.date <= '$now_date_time'
ORDER BY date DESC;
If your query is still not as fast as you would like then check that you have an index on articles.article_id and contents_by_cats.content_id and contents_by_cats.cat_id. Depending on the data you may want an index on articles.date as well.
Do note that if the $cat and $now_date_time values are coming from a user then you should really be preparing and binding the query rather than just dumping these values into the query.
This is the query we are starting with:
SELECT a.*
FROM articles a
WHERE article_id IN (SELECT distinct(content_id)
FROM contents_by_cats
WHERE cat_id ='$cat'
) AND
permission ='true' AND
date <= '$now_date_time'
ORDER BY date DESC;
Two things will help this query. The first is to rewrite it using exists rather than in and to simplify the subquery:
SELECT a.*
FROM articles a
WHERE EXISTS (SELECT 1
FROM contents_by_cats cbc
WHERE cbc.content_id = a.article_id and cat_id = '$cat'
) AND
permission ='true' AND
date <= '$now_date_time'
ORDER BY date DESC;
Second, you want indexes on both articles and contents_by_cats:
create index idx_articles_3 on articles(permission, date, article_id);
create index idx_contents_by_cats_2 on contents_by_cat(content_id, cat_id);
By the way, instead of $now_date_time, you can just use the now() function in MySQL.
I have the following query:
SELECT *
FROM products
INNER JOIN product_meta
ON products.id = product_meta.product_id
JOIN sales_rights
ON product_meta.product_id = sales_rights.product_id
WHERE ( products.categories REGEXP '[[:<:]]5[[:>:]]' )
AND ( active = '1' )
AND ( products.show_browse = 1 )
AND ( product_meta.software_platform_mac IS NOT NULL )
AND ( sales_rights.country_id = '240'
OR sales_rights.country_id = '223' )
GROUP BY products.id
ORDER BY products.avg_rating DESC
LIMIT 0, 18;
Running the query with the omission of the ORDER BY section and the query runs in ~90ms, with the ORDER BY section and the query takes ~8s.
I've browsed around SO and have found the reason for this could be that the sort is being executed before all the data is returned, and instead we should be running ORDER BY on the result set instead? (See this post: Slow query when using ORDER BY)
But I can't quite figure out the definitive way on how I do this?
I've browsed around SO and have found the reason for this could be
that the sort is being executed before all the data is returned, and
instead we should be running ORDER BY on the result set instead?
I find that hard to believe, but if that's indeed the issue, I think you'll need to do something like this. (Note where I put the parens.)
select * from
(
SELECT products.id, products.avg_rating
FROM products
INNER JOIN product_meta
ON products.id = product_meta.product_id
JOIN sales_rights
ON product_meta.product_id = sales_rights.product_id
WHERE ( products.categories REGEXP '[[:<:]]5[[:>:]]' )
AND ( active = '1' )
AND ( products.show_browse = 1 )
AND ( product_meta.software_platform_mac IS NOT NULL )
AND ( sales_rights.country_id = '240'
OR sales_rights.country_id = '223' )
GROUP BY products.id
) as X
ORDER BY avg_rating DESC
LIMIT 0, 18;
Also, edit your question and include a link to that advice. I think many of us would benefit from reading it.
Additional, possibly unrelated issues
Every column used in a WHERE clause should probably be indexed somehow. Multi-column indexes might perform better for this particular query.
The column products.categories seems to be storing multiple values that you filter with regular expressions. Storing multiple values in a single column is usually a bad idea.
MySQL's GROUP BY is indeterminate. A standard SQL statement using a GROUP BY might return fewer rows, and it might return them faster.
If you can, you may want to index your ID columns so that the query will run quicker. This is a DBA-level solution, rather than a SQL solution - tuning the database will help overall performance.
The issue in the instance of this query, was that by using GROUP BY and ORDER BY in a query, MySQL is unable to use the index if the GROUP BY and ORDER BY expressions are different.
Related Reading:
http://dev.mysql.com/doc/refman/5.0/en/order-by-optimization.html
http://mysqldba.blogspot.co.uk/2008/06/how-to-pick-indexes-for-order-by-and.html