Very slow MySQL query (Left-JOIN, GROUP BY, etc. ?) - mysql

I have trouble with the following query, which I run against a MySQL server. It takes 25 s, even though most of the tables have fewer than 20k rows (COUNT(*) < 20k); only featured has 600k rows. Indexes exist where they should, that is, on the columns used in the ON clauses. I tried removing the GROUP BY, which improved things a bit, but the query is still slow as a general rule. I am posting because I could not find a solution among the many similar cases on Stack Overflow. Any suggestions?
SELECT
doctor.id as doctor_id,
doctor.uuid as doctor_uuid,
doctor.firstname as doctor_firstname,
doctor.lastname as doctor_lastname,
doctor.cloudRdvMask as doctor_cloudRdvMask,
GROUP_CONCAT(recommendation.id SEPARATOR ' ') as recommendation_ids,
GROUP_CONCAT(recommendation.uuid SEPARATOR ' ') as recommendation_uuids,
GROUP_CONCAT(recommendation.disponibility SEPARATOR ' ') as recommendation_disponibilities,
GROUP_CONCAT(recommendation.user_id SEPARATOR ' ') as recommendation_user_ids,
GROUP_CONCAT(recommendation.user_uuid SEPARATOR ' ') as recommendation_user_uuids,
location.id as location_id,
location.uuid as location_uuid,
location.lat as location_lat,
location.lng as location_lng,
profession.id as profession_id,
profession.uuid as profession_uuid,
profession.name as profession_name
FROM featured as doctor
LEFT JOIN location as location
ON doctor.location_id = location.id
LEFT JOIN profession as profession
ON doctor.profession_id = profession.id
LEFT JOIN
(
SELECT
featured.id as id,
featured.uuid as uuid,
featured.doctor_id as doctor_id,
featured.disponibility as disponibility,
user.id as user_id,
user.uuid as user_uuid
FROM featured as featured
LEFT JOIN user as user
ON featured.user_id = user.id
WHERE discr = 'recommendation'
) as recommendation
ON recommendation.doctor_id = doctor.id
WHERE
doctor.discr = 'doctor'
AND
doctor.state = 'Publié'
GROUP BY doctor.uuid
Here comes the EXPLAIN result:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
1 | SIMPLE | doctor | NULL | ref | discr,state | discr | 767 | const | 194653 | 50.00 | Using where |
1 | SIMPLE | location | NULL | eq_ref | PRIMARY | PRIMARY | 4 | doctoome.doctor.location_id | 1 | 100.00 | NULL |
1 | SIMPLE | profession | NULL | eq_ref | PRIMARY | PRIMARY | 4 | doctoome.doctor.profession_id | 1 | 100.00 | NULL |
1 | SIMPLE | featured | NULL | ref | IDX_3C1359D487F4FB17,discr | IDX_3C1359D487F4FB17 | 5 | doctoome.doctor.id | 196 | 100.00 | Using where |
1 | SIMPLE | user | NULL | eq_ref | PRIMARY | PRIMARY | 4 | doctoome.featured.user_id | 1 | 100.00 | Using index |
EDIT: This link helped me; the query now takes 8 s. https://www.percona.com/blog/2016/10/12/mysql-5-7-performance-tuning-immediately-after-installation/. But I still find it slow, so I am leaving the question open in case anybody knows what else could be improved. Thanks.

I think removing the subquery might help, along with some more indexes:
SELECT . . . -- you need to fix the `GROUP_CONCAT()` column references
FROM featured doctor LEFT JOIN
location location
ON doctor.location_id = location.id LEFT JOIN
profession profession
ON doctor.profession_id = profession.id LEFT JOIN
featured featured
ON featured.doctor_id = doctor.id AND
-- keep this filter in ON; in WHERE it would drop doctors without recommendations
featured.discr = 'recommendation' LEFT JOIN
user user
ON featured.user_id = user.id
WHERE doctor.discr = 'doctor' AND
doctor.state = 'Publié'
GROUP BY doctor.uuid;
Then you want an index on featured(discr, state, doctor_id, location_id, profession_id) and featured(doctor_id, discr, user_id).
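The two indexes suggested above could be created roughly as follows. This is a sketch: the index names are made up for illustration, and because the EXPLAIN output shows a key_len of 767 for discr, the string columns may need prefix lengths if the server rejects the key as too long.

```sql
-- Index names are hypothetical; the column lists are the ones given in the answer.
-- Supports the WHERE on discr/state and the joins to location and profession.
CREATE INDEX idx_featured_discr_state
    ON featured (discr, state, doctor_id, location_id, profession_id);

-- Supports the self-join on doctor_id filtered by discr, then the join to user.
CREATE INDEX idx_featured_doctor_discr_user
    ON featured (doctor_id, discr, user_id);
```

Re-running EXPLAIN afterwards should show whether the optimizer actually picks them.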

Related

MySQL: How do I Optimize this JOIN Query?

I have 4 tables: artist, album, song_cover, and song.
I would like to select an album and the total number of songs in that album.
I am currently using this query, but it is logged in the mysql-slow.log file. In phpMyAdmin, the query speed is inconsistent: sometimes it executes in 0.0005 seconds and other times it takes 2 seconds or more.
SELECT /*+ MAX_EXECUTION_TIME(1000) */ album.*,
artist_id, artist_aka, artist_slug, artist_profile_image, cover_filename,
(
SELECT COUNT(*)
FROM song
WHERE song.song_album_id = album.album_id
) AS TotalSongs
FROM album
LEFT JOIN artist ON album.album_artist = artist.artist_id
LEFT JOIN song_cover ON album.album_cover_id = song_cover.cover_id
ORDER BY album_id DESC LIMIT 0, 11
ROWS
artist: 15,978, album: 14,167, song: 67,559, song_cover: 12,668
EXPLAIN
Thank you in advance.
I would write it this way:
EXPLAIN SELECT b.*,
a.artist_id, a.artist_aka, a.artist_slug, a.artist_profile_image,
c.cover_filename,
COUNT(s.song_album_id) AS TotalSongs -- counts matched songs only; COUNT(*) would report 1 for an album with no songs
FROM album AS b
INNER JOIN artist AS a ON b.album_artist = a.artist_id
LEFT OUTER JOIN song AS s ON s.song_album_id = b.album_id
LEFT OUTER JOIN song_cover AS c ON b.album_cover_id = c.cover_id
GROUP BY b.album_id
ORDER BY b.album_id DESC LIMIT 0, 11;
This eliminates the dependent subquery, in favor of another join and GROUP BY.
Here's the EXPLAIN report as near as I can guess at it:
+----+-------------+-------+------------+--------+-------------------------------------+---------------+---------+-----------------------+------+----------+---------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+--------+-------------------------------------+---------------+---------+-----------------------+------+----------+---------------------+
| 1 | SIMPLE | b | NULL | index | PRIMARY,album_cover_id,album_artist | PRIMARY | 4 | NULL | 1 | 100.00 | Backward index scan |
| 1 | SIMPLE | a | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.b.album_artist | 1 | 100.00 | NULL |
| 1 | SIMPLE | s | NULL | ref | song_album_id | song_album_id | 4 | test.b.album_id | 1 | 100.00 | Using index |
| 1 | SIMPLE | c | NULL | eq_ref | PRIMARY | PRIMARY | 4 | test.b.album_cover_id | 1 | 100.00 | NULL |
+----+-------------+-------+------------+--------+-------------------------------------+---------------+---------+-----------------------+------+----------+---------------------+
I have no data in my tables, so the row counts are trivial.
There's still a problem that it's doing an index-scan of album, which in your case is 14,167 rows. That could be costly.
But the other joins are all using indexes. Two of them are type: eq_ref, indicating that it's joining to the primary key of those tables.
I changed the join to artist to an inner join. I can't see how an album could not reference an artist. But I suppose it's possible for an album to have no songs, hence the outer join.
I find it strange that you join album directly to song_cover. Wouldn't song_cover also need to reference the original song it's a cover of?
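One possible way to shrink the scan on album mentioned above is to pick the 11 newest album IDs first and join everything else to that short list. This is only a sketch, untested against the asker's data: the derived-table alias latest is made up, and the artist join is kept as an outer join so that the LIMIT still yields 11 rows.

```sql
SELECT b.*,
       a.artist_id, a.artist_aka, a.artist_slug, a.artist_profile_image,
       c.cover_filename,
       COUNT(s.song_album_id) AS TotalSongs
FROM (
    -- "latest" is a hypothetical alias: only the 11 newest album IDs
    SELECT album_id
    FROM album
    ORDER BY album_id DESC
    LIMIT 11
) AS latest
INNER JOIN album AS b ON b.album_id = latest.album_id
LEFT OUTER JOIN artist AS a ON b.album_artist = a.artist_id
LEFT OUTER JOIN song AS s ON s.song_album_id = b.album_id
LEFT OUTER JOIN song_cover AS c ON b.album_cover_id = c.cover_id
GROUP BY b.album_id
ORDER BY b.album_id DESC;
```

The aggregation then touches at most 11 albums and their songs, whatever the size of the album table.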

How to optimize MySQL select query or make it faster

I have a SELECT query that fetches over 50k records from a MySQL 5.5 database at once, and this amount is expected to grow. The query contains multiple subqueries and takes over 120 s to execute.
Initially the sale_items and stocks tables had no indexes beyond the ID keys, so I added some more:
SELECT
`p`.`id` AS `id`,
`p`.`Name` AS `Name`,
`p`.`Created` AS `Created`,
`p`.`Image` AS `Image`,
`s`.`company` AS `supplier`,
`s`.`ID` AS `supplier_id`,
`c`.`name` AS `category`,
IFNULL((SELECT
SUM(`stocks`.`Total_Quantity`)
FROM `stocks`
WHERE (`stocks`.`Product_ID` = `p`.`id`)), 0) AS `total_qty`,
IFNULL((SELECT
SUM(`sale_items`.`quantity`)
FROM `sale_items`
WHERE (`sale_items`.`product_id` = `p`.`id`)), 0) AS `total_sold`,
IFNULL((SELECT
SUM(`sale_items`.`quantity`)
FROM `sale_items`
WHERE ((`sale_items`.`product_id` = `p`.`id`) AND `sale_items`.`Sale_ID` IN (SELECT
`refunds`.`Sale_ID`
FROM `refunds`))), 0) AS `total_refund`
FROM ((`products` `p`
LEFT JOIN `cats` `c`
ON ((`c`.`ID` = `p`.`cat_id`)))
LEFT JOIN `suppliers` `s`
ON ((`s`.`ID` = `p`.`supplier_id`)))
This is the explain result
+----+--------------------+------------+----------------+------------------------+------------------------+---------+---------------------------------
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+------------+----------------+------------------------+------------------------+---------+---------------------------------
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 20981 | |
| 2 | DERIVED | p | ALL | NULL | NULL | NULL | NULL | 20934 | |
| 2 | DERIVED | c | eq_ref | PRIMARY | PRIMARY | 4 | p.cat_id | 1 | |
| 2 | DERIVED | s | eq_ref | PRIMARY | PRIMARY | 4 | p.supplier_id | 1 | |
| 5 | DEPENDENT SUBQUERY | sale_items | ref | sales_items_product_id | sales_items_product_id | 5 | p.id | 33 | Using where |
| 6 | DEPENDENT SUBQUERY | refunds | index_subquery | IDX_refunds_sale_id | IDX_refunds_sale_id | 5 | func | 1 | Using index; Using where |
| 4 | DEPENDENT SUBQUERY | sale_items | ref | sales_items_product_id | sales_items_product_id | 5 | p.id | 33 | Using where |
| 3 | DEPENDENT SUBQUERY | stocks | ref | IDX_stocks_product_id | IDX_stocks_product_id | 5 | p.id | 1 | Using where |
+----+--------------------+------------+----------------+------------------------+------------------------+---------+---------------------------------
I expect the query to take less than 3 s at most, but I can't figure out the best way to optimize it.
The query looks fine to me. You select all data and aggregate some of it. This takes time. Your explain plan shows there are indexes on the IDs, which is good. And at a first glance there is not much we seem to be able to do here...
What you can do, though, is provide covering indexes, i.e. indexes that contain all columns you need from a table, so the data can be taken from the index directly.
create index idx1 on cats(id, name);
create index idx2 on suppliers(id, company);
create index idx3 on stocks(product_id, total_quantity);
create index idx4 on sale_items(product_id, quantity, sale_id);
This can really boost your query.
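Whether a covering index is actually used can be checked with EXPLAIN: when MySQL reads everything it needs from the index, the Extra column shows Using index. A minimal check, assuming idx4 from above exists and using a placeholder product ID of 1:

```sql
-- 1 is a placeholder product ID; any existing ID works.
EXPLAIN
SELECT SUM(quantity)
FROM sale_items
WHERE product_id = 1;
```

If the key column still shows sales_items_product_id, the optimizer has not picked the new index for this lookup.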
What you can try with the query itself is to move the subqueries to the FROM clause. MySQL's optimizer is not great, so although it should produce the same execution plan, it may well favor the FROM clause.
SELECT
p.id,
p.name,
p.created,
p.image,
s.company as supplier,
s.id AS supplier_id,
c.name AS category,
COALESCE(st.total, 0) AS total_qty,
COALESCE(si.total, 0) AS total_sold,
COALESCE(si.refund, 0) AS total_refund
FROM products p
LEFT JOIN cats c ON c.id = p.cat_id
LEFT JOIN suppliers s ON s.id = p.supplier_id
LEFT JOIN
(
SELECT product_id, SUM(total_quantity) AS total -- product_id is needed for the join below
FROM stocks
GROUP BY product_id
) st ON st.product_id = p.id
LEFT JOIN
(
SELECT
product_id,
SUM(quantity) AS total,
SUM(CASE WHEN sale_id IN (SELECT sale_id FROM refunds) THEN quantity END) as refund
FROM sale_items
GROUP BY product_id
) si ON si.product_id = p.id;
(If sale_id is unique in refunds, then you can even join it to sale_items. Again: this should usually not make a difference, but in MySQL it may still. MySQL was once notorious for treating IN clauses much worse than the FROM clause. This may not be the case anymore, I don't know. You can try - if refunds.sale_id is unique).
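The join mentioned in the note above could look roughly like this for the sale_items aggregate. It is a sketch that is only correct under the stated assumption that refunds.sale_id is unique; with duplicate refund rows per sale, the sums would be inflated.

```sql
-- Assumes at most one refunds row per sale_id.
SELECT si.product_id,
       SUM(si.quantity) AS total,
       SUM(CASE WHEN r.sale_id IS NOT NULL THEN si.quantity END) AS refund
FROM sale_items si
LEFT JOIN refunds r ON r.sale_id = si.sale_id
GROUP BY si.product_id;
```

This would replace the body of the si derived table, keeping the same three output columns.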

Select records from table where field not in left join of different table in MySql

Struggling to get this at any sensible run time. I have three tables:
temp_company
id (PRIMARY KEY), number (KEY), s_code (KEY)
company
id (PRIMARY KEY), number (KEY)
company_scode
company_id (UNIQUE on company_id and code), code (KEY)
There is also a foreign key between code and code in the code_description table.
There is also a foreign key between company_id and the id in the company table.
I need to match the temp_company table to the company table on the number field. I then want to check whether the s_code in the temporary table exists for that company in the company_scode table; if it doesn't, select that row.
So far I have:
SELECT temp_company.s_code
FROM temp_company
WHERE temp_company.s_code NOT IN
(SELECT code
FROM company
LEFT JOIN company_scode ON company.id = company_scode.company_id
WHERE
company.number = temp_company.number
)
but this is very slow. I would appreciate a better way to select every temp_company record whose s_code does not exist in the many-to-many relationship between company and company_scode.
* UPDATE *
Thank you to Loc and Ollie for your answers, these are still taking a very long time (I left Ollie's for 8 hours and it was still going).
In terms of indexes, I have updated the above with info. I've put the EXPLAIN output for the two answers below to try to shed some light and hopefully get this faster.
EXPLAIN for Ollie's answer:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
+----+--------------------+------------+-------+---------------+------------+---------+-----------------------+---------+--------------------------+
| 1 | PRIMARY | tc | ALL | (NULL) | (NULL) | (NULL) | (NULL) | 3216320 | |
+----+--------------------+------------+-------+---------------+------------+---------+-----------------------+---------+--------------------------+
| 1 | PRIMARY | <derived2> | ALL | (NULL) | (NULL) | (NULL) | (NULL) | 2619433 | Using where; Not exists |
+----+--------------------+------------+-------+---------------+------------+---------+-----------------------+---------+--------------------------+
| 2 | DERIVED | s | index | company_id | code | 62 | (NULL) | 2405379 | Using index |
+----+--------------------+------------+-------+---------------+------------+---------+-----------------------+---------+--------------------------+
| 2 | DERIVED | c | eq_ref| PRIMARY | PRIMARY | 4 | mydbname.s.company_id | 1 | |
+----+--------------------+------------+-------+---------------+------------+---------+-----------------------+---------+--------------------------+
EXPLAIN for Loc's answer:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
+----+--------------------+-------+-------+---------------+------------+---------+---------------+---------+--------------------------+
| 1 | PRIMARY | tc | ALL | (NULL) | (NULL) | (NULL) | (NULL) | 3216320 | Using where |
+----+--------------------+-------+-------+---------------+------------+---------+---------------+---------+--------------------------+
| 2 | DEPENDENT SUBQUERY | c | index | (NULL) | number | 63 | (NULL) | 3189756 | Using where; Using index |
+----+--------------------+-------+-------+---------------+------------+---------+---------------+---------+--------------------------+
| 2 | DEPENDENT SUBQUERY | cc | ref | company_id | company_id | 4 | mydbname.c.id | 1 | Using where; Using Index |
+----+--------------------+-------+-------+---------------+------------+---------+---------------+---------+--------------------------+
TEST this:
SELECT tc.*
FROM temp_company tc
WHERE NOT EXISTS
(
SELECT 1
FROM company c JOIN company_scode cc ON c.id = cc.company_id
WHERE c.number = tc.number
AND cc.code = tc.s_code -- without this the check ignores the code entirely
)
Here's a way that might be plenty faster than your nested SELECT.
SELECT tc.id, tc.number, tc.s_code
FROM temp_company AS tc
LEFT JOIN (
SELECT s.code AS company_scode,
c.number
FROM company AS c
JOIN company_scode AS s ON c.id = s.company_id
) AS existing_company
ON ( tc.s_code = existing_company.company_scode
AND tc.number = existing_company.number)
WHERE existing_company.company_scode IS NULL
This works by running a subquery that returns a list of (number, code) pairs. It then joins that to the temp_company table and uses IS NULL to look for items that only showed up on the left side of the join.
In the end I got this down to a manageable time of about 1 minute with the following:
CREATE TEMPORARY TABLE tc AS
(SELECT company.id AS cid, temp_company.scode AS tcode
FROM temp_company
INNER JOIN company ON temp_company.number = company.number
WHERE temp_company.scode IS NOT NULL AND temp_company.scode != '');
CREATE TEMPORARY TABLE rc AS
(SELECT tc.cid as cid FROM tc
LEFT JOIN company_scode ON tc.cid = company_scode.company_id
WHERE tc.tcode = company_scode.code);
SELECT * FROM tc
WHERE tc.cid NOT IN (SELECT cid FROM rc);
I'd rather not be using temporary tables so if anyone posts a solution in a similar or quicker timeframe then I'll happily update the answer to that.

NULL used as index key instead of possible key in MySQL db

I am running the following query:
SELECT p.val1, p.val2, p.val3, p.val4, p.val5, p.val6, p.val7, p.val8
FROM db1.tbl1 AS p
INNER JOIN db2.tbl2 vp ON p.pid = vp.pid
INNER JOIN db2.tbl1 AS vs ON vp.vid = vs.vid
INNER JOIN db3.tbl1 AS sa ON vs.sid = sa.sid
LEFT JOIN db4.tbl1 AS fs ON p.aid = fs.aid
WHERE sa.id = '11594'
AND fs.aid IS NULL
ORDER BY IF( (
ISNULL( egl )
OR egl = '' ) , 1, 0
), egl DESC
LIMIT 15
OFFSET 0
Unfortunately, it just hangs when run.
Running an EXPLAIN nets me this info:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
|*---*|*-----------*|*-----*|*-----*|*---------------------------------*|*----------*|*-------*|*----------*|*------*|*--------------
| 1 | SIMPLE | sa | const | PRIMARY,s_key,p_key,n_key,ignored | PRIMARY | 4 | const | 1 | Using filesort
| 1 | SIMPLE | p | ALL | PRIMARY, pid | NULL | NULL | NULL | 744704 |
| 1 | SIMPLE | vp | ref | PRIMARY,pid | pid | 130 | db1.p.pid | 1 | Using index
| 1 | SIMPLE | vs | ref | vid | vid | 130 | db2.vp.vid | 1 | Using where
| 1 | SIMPLE | fs | ref | a_key | a_key | 97 | func | 1 | Using where; Using index
If I add USE INDEX or FORCE INDEX after the FROM db1.tbl1 AS p, it does not change anything.
My assumption is the problem is that table p isn't using any of the indexes. Is this assumption correct?
What are some reasons this query wouldn't use one of the possible keys?
The problem was with the ORDER BY clause. The dbms was attempting to apply it to db1.tbl1 before the joins (apparently). Wrapping the query in a select and putting the ORDER BY outside made the dbms work as expected.
SELECT * FROM
(SELECT p.val1, p.val2, p.val3, p.val4, p.val5, p.val6, p.val7, p.val8
FROM db1.tbl1 AS p
INNER JOIN db2.tbl2 vp ON p.pid = vp.pid
INNER JOIN db2.tbl1 AS vs ON vp.vid = vs.vid
INNER JOIN db3.tbl1 AS sa ON vs.sid = sa.sid
LEFT JOIN db4.tbl1 AS fs ON p.aid = fs.aid
WHERE sa.id = '11594'
AND fs.aid IS NULL) AS tmp
ORDER BY IF( (
ISNULL( egl )
OR egl = '' ) , 1, 0
), egl DESC
LIMIT 15
OFFSET 0

MySQL - Find rows matching all rows from joined table AND string from other tables

This is a follow-up to MySQL - Find rows matching all rows from joined table.
Thanks to this site, the query runs perfectly.
But now I have had to extend the query to search by artist and track. This has led me to the following query:
SELECT DISTINCT `t`.`id`
FROM `trackwords` AS `tw`
INNER JOIN `wordlist` AS `wl` ON wl.id=tw.wordid
INNER JOIN `track` AS `t` ON tw.trackid=t.id
WHERE (wl.trackusecount>0) AND
(wl.word IN ('please','dont','leave','me')) AND
t.artist IN (
SELECT a.id
FROM artist as a
INNER JOIN `artistalias` AS `aa` ON aa.ref=a.id
WHERE a.name LIKE 'pink%' OR aa.name LIKE 'pink%'
)
GROUP BY tw.trackid
HAVING (COUNT(*) = 4);
The EXPLAIN for this query looks quite good, I think:
+----+--------------------+-------+--------+----------------------------+---------+---------+-----------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+-------+--------+----------------------------+---------+---------+-----------------+------+----------------------------------------------+
| 1 | PRIMARY | wl | range | PRIMARY,word,trackusecount | word | 767 | NULL | 4 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | tw | ref | wordid,trackid | wordid | 4 | mbdb.wl.id | 31 | |
| 1 | PRIMARY | t | eq_ref | PRIMARY | PRIMARY | 4 | mbdb.tw.trackid | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | aa | ref | ref,name | ref | 4 | func | 2 | |
| 2 | DEPENDENT SUBQUERY | a | eq_ref | PRIMARY,name,namefull | PRIMARY | 4 | func | 1 | Using where |
+----+--------------------+-------+--------+----------------------------+---------+---------+-----------------+------+----------------------------------------------+
Do you see room for optimization? The query has a runtime of around 7 seconds, which is unfortunately too much. Any suggestions are welcome.
TIA
You have two possible selective conditions here: the artist's name and the word list.
Assuming that the words are more selective than artists:
SELECT tw.trackid
FROM (
SELECT tw.trackid
FROM wordlist AS wl
JOIN trackwords AS tw
ON tw.wordid = wl.id
WHERE wl.trackusecount > 0
AND wl.word IN ('please','dont','leave','me')
GROUP BY
tw.trackid
HAVING COUNT(*) = 4
) tw
INNER JOIN
track AS t
ON t.id = tw.trackid
AND EXISTS
(
SELECT NULL
FROM artist a
WHERE a.name LIKE 'pink%'
AND a.id = t.artist
UNION ALL
SELECT NULL
FROM artist a
JOIN artistalias aa
ON aa.ref = a.id
AND aa.name LIKE 'pink%'
WHERE a.id = t.artist
)
You need to have the following indexes for this to be efficient:
wordlist (word, trackusecount)
trackwords (wordid, trackid)
artistalias (ref, name)
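As a sketch, the three indexes listed above could be created like this. The index names are made up for illustration; the column order follows the list.

```sql
-- Index names are hypothetical; tables and columns are the ones listed above.
CREATE INDEX idx_wordlist_word_usecount ON wordlist (word, trackusecount);
CREATE INDEX idx_trackwords_word_track  ON trackwords (wordid, trackid);
CREATE INDEX idx_artistalias_ref_name   ON artistalias (ref, name);
```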
Have you already indexed the name columns? That should speed this up.
You can also try using fulltext searching with Match and Against.
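A rough sketch of what the full-text variant might look like for the artist name, with two caveats: the index name is made up, and a boolean-mode prefix search is not equivalent to LIKE 'pink%', because it matches any word in the name that starts with "pink", not only names that begin with it.

```sql
-- Index name is hypothetical. InnoDB supports FULLTEXT from MySQL 5.6; older versions need MyISAM.
ALTER TABLE artist ADD FULLTEXT INDEX ft_artist_name (name);

-- 'pink*' matches any word starting with "pink", wherever it occurs in the name.
SELECT a.id
FROM artist AS a
WHERE MATCH(a.name) AGAINST('pink*' IN BOOLEAN MODE);
```

The same pattern would apply to artistalias.name if aliases need to be searched as well.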