Inexplicably slow query in MySQL

Given this result-set:
mysql> EXPLAIN SELECT c.cust_name, SUM(l.line_subtotal) FROM customer c
-> JOIN slip s ON s.cust_id = c.cust_id
-> JOIN line l ON l.slip_id = s.slip_id
-> JOIN vendor v ON v.vend_id = l.vend_id WHERE v.vend_name = 'blahblah'
-> GROUP BY c.cust_name
-> HAVING SUM(l.line_subtotal) > 49999
-> ORDER BY c.cust_name;
+----+-------------+-------+--------+---------------------------------+---------------+---------+----------------------+------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                   | key           | key_len | ref                  | rows | Extra                                        |
+----+-------------+-------+--------+---------------------------------+---------------+---------+----------------------+------+----------------------------------------------+
|  1 | SIMPLE      | v     | ref    | PRIMARY,idx_vend_name           | idx_vend_name | 12      | const                |    1 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | l     | ref    | idx_vend_id                     | idx_vend_id   | 4       | csv_import.v.vend_id |  446 |                                              |
|  1 | SIMPLE      | s     | eq_ref | PRIMARY,idx_cust_id,idx_slip_id | PRIMARY       | 4       | csv_import.l.slip_id |    1 |                                              |
|  1 | SIMPLE      | c     | eq_ref | PRIMARY,cIndex                  | PRIMARY       | 4       | csv_import.s.cust_id |    1 |                                              |
+----+-------------+-------+--------+---------------------------------+---------------+---------+----------------------+------+----------------------------------------------+
4 rows in set (0.04 sec)
I'm a bit baffled as to why the query referenced by this EXPLAIN statement is still taking about a minute to execute. Isn't it true that this query only has to search through 449 rows? Anyone have any idea as to what could be slowing it down so much?

I think the HAVING SUM() is the root of all evil. It forces MySQL to make two joins (from customer to slip and then to line) to get the value of the sum. After this it has to retrieve all the data to filter properly by the SUM() value to get a meaningful result.
It might be rewritten as follows and probably get better response times:
select c.cust_name,
       sum(grouping.line_subtotal) as line_subtotal
from customer c
join (select s.cust_id,
             l.vend_id,
             sum(l.line_subtotal) as line_subtotal
      from slip s
      join line l on s.slip_id = l.slip_id
      group by s.cust_id, l.vend_id) grouping
  on c.cust_id = grouping.cust_id
join vendor v on v.vend_id = grouping.vend_id
where v.vend_name = 'blahblah'
  and grouping.line_subtotal > 49999
group by c.cust_name
order by c.cust_name;
In other words, create a sub-select that does all the necessary grouping before making the real query.

You can run your select vendor query first, and then join the results with the rest:
SELECT c.cust_name, SUM(l.line_subtotal) FROM customer c
JOIN slip s ON s.cust_id = c.cust_id
JOIN line l ON l.slip_id = s.slip_id
JOIN (SELECT * FROM vendor WHERE vend_name = 'blahblah') v ON v.vend_id = l.vend_id
GROUP BY c.cust_name
HAVING SUM(l.line_subtotal) > 49999
ORDER BY c.cust_name;
Also, do vend_name and/or cust_name have an index? That might be an issue here.
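The EXPLAIN above shows vend_name is already covered by idx_vend_name; if cust_name has no index, a sketch of adding one (the index name is made up):

```sql
-- Hypothetical index name. An index on the GROUP BY / ORDER BY column
-- can let MySQL avoid the "Using temporary; Using filesort" step.
CREATE INDEX idx_cust_name ON customer (cust_name);
```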

Related

How to optimize MySQL select query or make it faster

I have a select query that selects over 50k records from a MySQL 5.5 database at once, and this amount is expected to grow. The query contains multiple subqueries and takes over 120 seconds to execute.
Initially some of the sale_items and stock tables didn't have more than the ID keys, so I added some more:
SELECT
    `p`.`id` AS `id`,
    `p`.`Name` AS `Name`,
    `p`.`Created` AS `Created`,
    `p`.`Image` AS `Image`,
    `s`.`company` AS `supplier`,
    `s`.`ID` AS `supplier_id`,
    `c`.`name` AS `category`,
    IFNULL((SELECT SUM(`stocks`.`Total_Quantity`)
            FROM `stocks`
            WHERE `stocks`.`Product_ID` = `p`.`id`), 0) AS `total_qty`,
    IFNULL((SELECT SUM(`sale_items`.`quantity`)
            FROM `sale_items`
            WHERE `sale_items`.`product_id` = `p`.`id`), 0) AS `total_sold`,
    IFNULL((SELECT SUM(`sale_items`.`quantity`)
            FROM `sale_items`
            WHERE `sale_items`.`product_id` = `p`.`id`
              AND `sale_items`.`Sale_ID` IN (SELECT `refunds`.`Sale_ID`
                                             FROM `refunds`)), 0) AS `total_refund`
FROM `products` `p`
LEFT JOIN `cats` `c` ON `c`.`ID` = `p`.`cat_id`
LEFT JOIN `suppliers` `s` ON `s`.`ID` = `p`.`supplier_id`
This is the explain result
+----+--------------------+------------+----------------+------------------------+------------------------+---------+---------------+-------+--------------------------+
| id | select_type        | table      | type           | possible_keys          | key                    | key_len | ref           | rows  | Extra                    |
+----+--------------------+------------+----------------+------------------------+------------------------+---------+---------------+-------+--------------------------+
|  1 | PRIMARY            | <derived2> | ALL            | NULL                   | NULL                   | NULL    | NULL          | 20981 |                          |
|  2 | DERIVED            | p          | ALL            | NULL                   | NULL                   | NULL    | NULL          | 20934 |                          |
|  2 | DERIVED            | c          | eq_ref         | PRIMARY                | PRIMARY                | 4       | p.cat_id      |     1 |                          |
|  2 | DERIVED            | s          | eq_ref         | PRIMARY                | PRIMARY                | 4       | p.supplier_id |     1 |                          |
|  5 | DEPENDENT SUBQUERY | sale_items | ref            | sales_items_product_id | sales_items_product_id | 5       | p.id          |    33 | Using where              |
|  6 | DEPENDENT SUBQUERY | refunds    | index_subquery | IDX_refunds_sale_id    | IDX_refunds_sale_id    | 5       | func          |     1 | Using index; Using where |
|  4 | DEPENDENT SUBQUERY | sale_items | ref            | sales_items_product_id | sales_items_product_id | 5       | p.id          |    33 | Using where              |
|  3 | DEPENDENT SUBQUERY | stocks     | ref            | IDX_stocks_product_id  | IDX_stocks_product_id  | 5       | p.id          |     1 | Using where              |
+----+--------------------+------------+----------------+------------------------+------------------------+---------+---------------+-------+--------------------------+
I am expecting the query to take at most 3 seconds, but I can't seem to figure out the best way to optimize it.
The query looks fine to me. You select all data and aggregate some of it. This takes time. Your explain plan shows there are indexes on the IDs, which is good. And at first glance there is not much we seem to be able to do here...
What you can do, though, is provide covering indexes, i.e. indexes that contain all columns you need from a table, so the data can be taken from the index directly.
create index idx1 on cats(id, name);
create index idx2 on suppliers(id, company);
create index idx3 on stocks(product_id, total_quantity);
create index idx4 on sale_items(product_id, quantity, sale_id);
This can really boost your query.
What you can try about the query itself is to move the subqueries to the FROM clause. MySQL's optimizer is not great, so although it should arrive at the same execution plan, it may well be that it favors the FROM-clause version.
SELECT
    p.id,
    p.name,
    p.created,
    p.image,
    s.company AS supplier,
    s.id AS supplier_id,
    c.name AS category,
    COALESCE(st.total, 0) AS total_qty,
    COALESCE(si.total, 0) AS total_sold,
    COALESCE(si.refund, 0) AS total_refund
FROM products p
LEFT JOIN cats c ON c.id = p.cat_id
LEFT JOIN suppliers s ON s.id = p.supplier_id
LEFT JOIN
(
    SELECT product_id, SUM(total_quantity) AS total
    FROM stocks
    GROUP BY product_id
) st ON st.product_id = p.id
LEFT JOIN
(
    SELECT
        product_id,
        SUM(quantity) AS total,
        SUM(CASE WHEN sale_id IN (SELECT sale_id FROM refunds) THEN quantity END) AS refund
    FROM sale_items
    GROUP BY product_id
) si ON si.product_id = p.id;
(If sale_id is unique in refunds, then you can even join refunds directly to sale_items. Again: this should usually not make a difference, but in MySQL it still may. MySQL was once notorious for treating IN clauses much worse than joins in the FROM clause. That may no longer be the case, I don't know. You can try it, provided refunds.sale_id is unique.)
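If refunds.sale_id is indeed unique, the refund sum inside the derived table could be sketched with a direct LEFT JOIN instead of the IN clause (untested sketch against the question's schema):

```sql
-- Assumes at most one refunds row per sale_id; otherwise quantities
-- would be double-counted by the join.
SELECT
    si.product_id,
    SUM(si.quantity) AS total,
    SUM(CASE WHEN r.sale_id IS NOT NULL THEN si.quantity END) AS refund
FROM sale_items si
LEFT JOIN refunds r ON r.sale_id = si.sale_id
GROUP BY si.product_id;
```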

SQL query fills /tmp and takes minutes to run

I have a query, which is not operating on a lot of data (IMHO) but takes a number of minutes (5-10) to execute and ends up filling the /tmp space (takes up to 20GB) while executing. Once it's finished the space is freed again.
The query is as follows:
SELECT c.name, count(b.id), c.parent_accounting_reference, o.contract,
       a.contact_person, a.address_email, a.address_phone, a.address_fax,
       concat(ifnull(concat(a.description, ', '), ''),
              ifnull(concat(a.apt_unit, ', '), ''),
              ifnull(concat(a.preamble, ', '), ''),
              ifnull(addr_entered, ''))
FROM booking b
join visit v on (b.visit_id = v.id)
join super_booking s on (v.super_booking_id = s.id)
join customer c on (s.customer_id = c.id)
join address a on (a.customer_id = c.id)
join customer_number cn on (cn.customer_numbers_id = c.id)
join number n on (cn.number_id = n.id)
join customer_email ce on (ce.customer_emails_id = c.id)
join email e on (ce.email_id = e.id)
left join organization o on (o.accounting_reference = c.parent_accounting_reference)
left join address_type at on (a.type_id = at.id and at.name_key = 'billing')
where s.company_id = 1
and v.expected_start_date between '2015-01-01 00:00:00' and '2015-02-01 00:00:00'
group by s.customer_id
order by count(b.id) desc
And the explain plan for the same is:
+----+-------------+-------+--------+--------------------------------------------------------------+---------------------+---------+--------------------------------------+-------+----------------------------------------------+
| id | select_type | table | type   | possible_keys                                                | key                 | key_len | ref                                  | rows  | Extra                                        |
+----+-------------+-------+--------+--------------------------------------------------------------+---------------------+---------+--------------------------------------+-------+----------------------------------------------+
|  1 | SIMPLE      | s     | ref    | PRIMARY,FKC4F8739580E01B03,FKC4F8739597AD73B1                | FKC4F8739580E01B03  | 9       | const                                | 74088 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | ce    | ref    | FK864C4FFBAF6458E3,customer_emails_id,customer_emails_id_2   | customer_emails_id  | 9       | id_dev.s.customer_id                 |     1 | Using where                                  |
|  1 | SIMPLE      | cn    | ref    | FK530F62CA30E87991,customer_numbers_id,customer_numbers_id_2 | customer_numbers_id | 9       | id_dev.ce.customer_emails_id         |     1 | Using where                                  |
|  1 | SIMPLE      | c     | eq_ref | PRIMARY                                                      | PRIMARY             | 8       | id_dev.s.customer_id                 |     1 |                                              |
|  1 | SIMPLE      | e     | eq_ref | PRIMARY                                                      | PRIMARY             | 8       | id_dev.ce.email_id                   |     1 | Using index                                  |
|  1 | SIMPLE      | n     | eq_ref | PRIMARY                                                      | PRIMARY             | 8       | id_dev.cn.number_id                  |     1 | Using index                                  |
|  1 | SIMPLE      | v     | ref    | PRIMARY,FK6B04D4BEF4FD9A                                     | FK6B04D4BEF4FD9A    | 8       | id_dev.s.id                          |     1 | Using where                                  |
|  1 | SIMPLE      | b     | ref    | FK3DB0859E1684683                                            | FK3DB0859E1684683   | 8       | id_dev.v.id                          |     1 | Using index                                  |
|  1 | SIMPLE      | o     | ref    | org_acct_reference                                           | org_acct_reference  | 767     | id_dev.c.parent_accounting_reference |     1 |                                              |
|  1 | SIMPLE      | a     | ref    | FKADDRCUST,customer_address_idx                              | FKADDRCUST          | 9       | id_dev.c.id                          |   256 | Using where                                  |
|  1 | SIMPLE      | at    | eq_ref | PRIMARY                                                      | PRIMARY             | 8       | id_dev.a.type_id                     |     1 |                                              |
+----+-------------+-------+--------+--------------------------------------------------------------+---------------------+---------+--------------------------------------+-------+----------------------------------------------+
It appears to be using the correct indexes, so I can't understand the heavy /tmp usage and the long execution time.
Your query uses a temporary table, which you can see from the Using temporary note in the EXPLAIN result. Your MySQL server is probably configured to store temporary tables in /tmp.
If you want to optimize the query further, you should probably investigate why the temporary table is needed at all. The best way to do that is gradually simplifying the query until you figure out what is causing it. In this case, probably just the amount of rows needed to be processed, so if you really do need all this data, you probably need the temp table too. But don't give up on optimizing on my account ;)
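To see where those temporary tables are written and how large an in-memory temporary table may grow before it spills to disk, you can check the standard server variables:

```sql
SHOW VARIABLES LIKE 'tmpdir';              -- where on-disk temp tables are written
SHOW VARIABLES LIKE 'tmp_table_size';      -- in-memory temp table size limit
SHOW VARIABLES LIKE 'max_heap_table_size'; -- effective limit is the smaller of the two
```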
By the way, on another note, you might want to look into COALESCE for handling NULL values.
You're stuck with a temporary table, because you're doing an aggregate query and then ordering it by one of the results in the aggregate. Your optimizing goal should be to reduce the number of rows and/or columns in that temporary table.
Add an index on visit.expected_start_date. This may help MySQL satisfy your query more quickly, especially if your visit table has many rows that lie outside the date range in your query.
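As a concrete sketch (the index name is made up):

```sql
-- Lets MySQL range-scan visits in the requested date window
-- instead of examining every row of the visit table:
CREATE INDEX idx_visit_expected_start_date ON visit (expected_start_date);
```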
It looks like you're trying to find the customers with the most bookings in a particular date range.
So, let's start with a subquery to summarize the least amount of material from your database.
SELECT count(*) booking_count, s.customer_id
FROM visit v
JOIN super_booking s ON v.super_booking_id = s.id
JOIN booking b ON v.id = b.visit_id
WHERE v.expected_start_date >= '2015-01-01 00:00:00'
AND v.expected_start_date < '2015-02-01 00:00:00'
AND s.company_id = 1
GROUP BY s.customer_id
This gives back a list of booking counts and customer ids for the date range and company id in question. It will be pretty efficient, especially if you put an index on expected_start_date in the visit table.
Then, let's join that subquery to the one that pulls out all that information you need.
SELECT c.name, booking_count, c.parent_accounting_reference,
o.contract,
a.contact_person, a.address_email, a.address_phone, a.address_fax,
concat(ifnull(concat(a.description, ', '),''),
ifnull(concat(a.apt_unit, ', '),''),
ifnull(concat(a.preamble, ', '),''),
ifnull(addr_entered,''))
FROM (
SELECT count(*) booking_count, s.customer_id
FROM visit v
JOIN super_booking s ON v.super_booking_id = s.id
JOIN booking b ON v.id = b.visit_id
WHERE v.expected_start_date >= '2015-01-01 00:00:00'
AND v.expected_start_date < '2015-02-01 00:00:00'
AND s.company_id = 1
GROUP BY s.customer_id
) top
join customer c on top.customer_id = c.id
join address a on (a.customer_id = c.id)
join customer_number cn on (cn.customer_numbers_id = c.id)
join number n on (cn.number_id = n.id)
join customer_email ce on (ce.customer_emails_id = c.id)
join email e on (ce.email_id = e.id)
left join organization o on (o.accounting_reference = c.parent_accounting_reference)
left join address_type at on (a.type_id = at.id and at.name_key = 'billing')
order by booking_count DESC
That should speed your work up a whole bunch, by reducing the size of the data you need to summarize.
Note: Beware the trap in date BETWEEN this AND that. You really want
date >= this
AND date < that
because BETWEEN means
date >= this
AND date <= that
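Applied to the DATETIME bounds in this query, the two fragments differ exactly at the upper bound's midnight instant:

```sql
-- Includes rows stamped exactly '2015-02-01 00:00:00':
WHERE v.expected_start_date BETWEEN '2015-01-01 00:00:00'
                                AND '2015-02-01 00:00:00'

-- Covers January only, excluding the Feb 1 midnight instant:
WHERE v.expected_start_date >= '2015-01-01 00:00:00'
  AND v.expected_start_date <  '2015-02-01 00:00:00'
```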

MySQL Query performance improvement for order by before group by

Below is a query I use to get the latest record per serverID. Unfortunately, this query takes forever to process. According to the Stack Overflow question below, it should be a very fast solution. Is there any way to speed up this query, or do I have to split it up (first get all serverIDs, then get the last record for each server)?
Retrieving the last record in each group
SELECT s1.performance, s1.playersOnline, s1.serverID, s.name, m.modpack, m.color
FROM stats_server s1
LEFT JOIN stats_server s2
ON (s1.serverID = s2.serverID AND s1.id < s2.id)
INNER JOIN server s
ON s1.serverID=s.id
INNER JOIN modpack m
ON s.modpack=m.id
WHERE s2.id IS NULL
ORDER BY m.id
15 rows in set (34.73 sec)
Explain:
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra            |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------+
|    1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | NULL     | Impossible WHERE |
+------+-------------+-------+------+---------------+------+---------+------+------+----------+------------------+
1 row in set, 1 warning (0.00 sec)
Sample Output:
+-------------+---------------+----------+---------------+-------------------------+--------+
| performance | playersOnline | serverID | name          | modpack                 | color  |
+-------------+---------------+----------+---------------+-------------------------+--------+
|          99 |            18 |       15 | hub           | Lobby                   | AAAAAA |
|          98 |            12 |       10 | horizons      | Horizons                | AA00AA |
|          97 |             6 |       11 | m_lobby       | Monster                 | AA0000 |
|          99 |             1 |       12 | m_north       | Monster                 | AA0000 |
|          86 |            10 |       13 | m_south       | Monster                 | AA0000 |
|          87 |            17 |       14 | m_east        | Monster                 | AA0000 |
|          98 |            10 |       16 | m_west        | Monster                 | AA0000 |
|          84 |             7 |        5 | tppi          | Test Pack Please Ignore | 55FFFF |
|          95 |            15 |        6 | agrarian_plus | Agrarian Skies          | 00AA00 |
|          98 |            23 |        7 | agrarian2     | Agrarian Skies          | 00AA00 |
|          74 |            18 |        9 | agrarian      | Agrarian Skies          | 00AA00 |
|          97 |            37 |       17 | agrarian3     | Agrarian Skies          | 00AA00 |
|          99 |            17 |        3 | bteam_pvp     | Attack of the B-Team    | FFAA00 |
|          73 |            44 |        8 | bteam_pve     | Attack of the B-Team    | FFAA00 |
|          93 |            11 |        4 | crackpack     | Crackpack               | EFEFEF |
+-------------+---------------+----------+---------------+-------------------------+--------+
15 rows in set (38.49 sec)
Sample Data:
http://www.mediafire.com/download/n0blj1io0c503ig/mym_bridge.sql.bz2
Edit
OK, I solved it. [Screenshot of the expanded row counts for the original slow query omitted.]
Here is a fast query using MAX() with GROUP BY that gives identical results. Please try it for yourself.
SELECT s1.id
,s1.performance
,s1.playersOnline
,s1.serverID
,s.name
,m.modpack
,m.color
FROM stats_server s1
JOIN (
SELECT MAX(id) as 'id'
FROM stats_server
GROUP BY serverID
) AS s2
ON s1.id = s2.id
JOIN server s
ON s1.serverID = s.id
JOIN modpack m
ON s.modpack = m.id
ORDER BY m.id
I would phrase this query using not exists:
SELECT ss.performance, ss.playersOnline, ss.serverID, s.name, m.modpack, m.color
FROM stats_server ss INNER JOIN
server s
ON ss.serverID = s.id INNER JOIN
modpack m
ON s.modpack = m.id
WHERE NOT EXISTS (select 1
from stats_server ss2
where ss2.serverID = ss.serverID AND ss2.id > ss.id
)
Apart from the primary key indexes on server and modpack (which I assume are there), you also want an index on stats_server(serverID, id). This index should also help your version of the query.
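Expressed as DDL, with an illustrative index name:

```sql
-- Supports both the NOT EXISTS probe on (serverID, id)
-- and the MAX(id) ... GROUP BY serverID variants:
CREATE INDEX idx_stats_server_serverid_id ON stats_server (serverID, id);
```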
Am I missing something? Why wouldn't a standard uncorrelated subquery work?
SELECT x.id, x.performance, x.playersOnline, s.name, m.modpack, m.color, x.timestamp
FROM stats_server x
JOIN
( SELECT serverid, MAX(id) maxid FROM stats_server GROUP BY serverid ) y
ON y.serverid = x.serverid AND y.maxid = x.id
JOIN server s
ON x.serverID=s.id
JOIN modpack m
ON s.modpack=m.id
I'm guessing that you really want this (notice the order of the joins and the join criteria), and this matches the indexes that you've created:
SELECT s1.performance, s1.playersOnline, s1.serverID, s.name, m.modpack, m.color
FROM server s
INNER JOIN stats_server s1
ON s1.serverID = s.id
LEFT JOIN stats_server s2
ON s2.serverID = s.id AND s2.id > s1.id
INNER JOIN modpack m
ON m.id = s.modpack
WHERE s2.id IS NULL
ORDER BY m.id
MySQL doesn't always inner join the tables in the order that you write them in the query since the order doesn't really matter for the result set (though it can affect index use).
With no usable index specified in the WHERE clause, MySQL might want to start with the table with the least number of rows (maybe stats_server in this case). With the ORDER BY clause, MySQL might want to start with modpack so it doesn't have to order the results later.
MySQL picks the execution plan then sees if it has the proper index for joining rather than seeing what indexes it has to join on then picking the execution plan. MySQL doesn't just automatically pick the plan that matches your indexes.
STRAIGHT_JOIN tells MySQL in what order to join the tables so that it uses the indexes that you expect it to use:
SELECT s1.performance, s1.playersOnline, s1.serverID, s.name, m.modpack, m.color
FROM server s
STRAIGHT_JOIN stats_server s1
ON s1.serverID = s.id
LEFT JOIN stats_server s2
ON s2.serverID = s.id AND s2.id > s1.id
STRAIGHT_JOIN modpack m
ON m.id = s.modpack
WHERE s2.id IS NULL
ORDER BY m.id
I don't know what indexes you've defined since you've not provided an EXPLAIN result or shown your indexes, but this should give you some idea on how to improve the situation.

group by slows down the query

I have the following index on the products table: (product_t, productid, forsale). The MySQL manual says:
The GROUP BY names only columns that form a leftmost prefix of the index and no other columns.
http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html
When I do the following query
SELECT z.product_t, COUNT(z.productid)
FROM xcart_products z
JOIN xcart_products_lng w ON z.productid = w.productid
AND w.code = 'US'
WHERE z.forsale = 'Y'
group by z.product_t
and therefore use the leftmost index field (product_t), the execution time is still massive:
+-----------+--------------------+
| product_t | COUNT(z.productid) |
+-----------+--------------------+
| B         |                  4 |
| C         |              10521 |
| D         |                  1 |
| F         |                 16 |
| G         |                363 |
| J         |                 16 |
| L         |                749 |
| M         |                 22 |
| O         |                279 |
| P         |               5304 |
| S         |                 22 |
| W         |                662 |
+-----------+--------------------+
12 rows in set (0.81 sec)
When I use the whole index (product_t, productid, forsale), the execution time is blazing fast (0.005 seconds). How should I change the query to make it faster?
I think the query could somehow be improved through the use of a semi-join, but I'm not sure how.
The slow down might not be related to the GROUP BY clause. Try adding an index for w.code and z.forsale individually.
MySQL Profiling might also help you in your endeavour
The most obvious answer would be to create a new index for product_t only. Can you create new indexes?
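As DDL sketches for the suggestions above (index names are made up; whether each helps depends on the column's selectivity):

```sql
CREATE INDEX idx_products_product_t ON xcart_products (product_t);
CREATE INDEX idx_products_forsale   ON xcart_products (forsale);
CREATE INDEX idx_lng_code           ON xcart_products_lng (code);
```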
Thankfully I found a way to increase the speed. I have another table where all the product_t values are defined. Each product_t there also uses the code field (to specify the translation of each product_t), and in that table I have an index defined on product_t and code as well. By changing the query to:
SELECT z.product_t, COUNT(z.productid)
FROM xcart_products z
JOIN xcart_products_lng w ON z.productid = w.productid
JOIN xcart_products_product_t_lng p ON z.product_t = p.product_t
AND p.code = 'US'
WHERE z.forsale = 'Y'
group by p.product_t,p.code
I managed to increase the speed to 0.10 seconds. The reason it's faster is that the grouping is now done on a whole index. Secondly, the WHERE code = 'US' filter moves from the big products_lng table to the small product_t_lng table. I think this is the most efficient the query can get:
mysql> SELECT z.product_t, COUNT( z.productid )
-> FROM xcart_products z
-> JOIN xcart_products_lng w ON z.productid = w.productid
-> JOIN xcart_products_product_t_lng p ON z.product_t = p.product_t
-> AND p.code = 'US'
-> WHERE z.forsale = 'Y'
-> GROUP BY p.product_t, p.code
-> ;
+-----------+----------------------+
| product_t | COUNT( z.productid ) |
+-----------+----------------------+
| B         |                    4 |
| C         |                10521 |
| F         |                   16 |
| G         |                  363 |
| L         |                  749 |
| M         |                   22 |
| O         |                  279 |
| P         |                 5304 |
| S         |                   22 |
| W         |                  662 |
+-----------+----------------------+
10 rows in set (0.14 sec)

MySQL - Find rows matching all rows from joined table AND string from other tables

This is a follow-up to MySQL - Find rows matching all rows from joined table. Thanks to this site, that query runs perfectly.
But now I had to extend the query to search for both artist and track, which has led me to the following query:
SELECT DISTINCT `t`.`id`
FROM `trackwords` AS `tw`
INNER JOIN `wordlist` AS `wl` ON wl.id=tw.wordid
INNER JOIN `track` AS `t` ON tw.trackid=t.id
WHERE (wl.trackusecount>0) AND
(wl.word IN ('please','dont','leave','me')) AND
t.artist IN (
SELECT a.id
FROM artist as a
INNER JOIN `artistalias` AS `aa` ON aa.ref=a.id
WHERE a.name LIKE 'pink%' OR aa.name LIKE 'pink%'
)
GROUP BY tw.trackid
HAVING (COUNT(*) = 4);
The EXPLAIN for this query looks quite good, I think:
+----+--------------------+-------+--------+----------------------------+---------+---------+-----------------+------+----------------------------------------------+
| id | select_type        | table | type   | possible_keys              | key     | key_len | ref             | rows | Extra                                        |
+----+--------------------+-------+--------+----------------------------+---------+---------+-----------------+------+----------------------------------------------+
|  1 | PRIMARY            | wl    | range  | PRIMARY,word,trackusecount | word    | 767     | NULL            |    4 | Using where; Using temporary; Using filesort |
|  1 | PRIMARY            | tw    | ref    | wordid,trackid             | wordid  | 4       | mbdb.wl.id      |   31 |                                              |
|  1 | PRIMARY            | t     | eq_ref | PRIMARY                    | PRIMARY | 4       | mbdb.tw.trackid |    1 | Using where                                  |
|  2 | DEPENDENT SUBQUERY | aa    | ref    | ref,name                   | ref     | 4       | func            |    2 |                                              |
|  2 | DEPENDENT SUBQUERY | a     | eq_ref | PRIMARY,name,namefull      | PRIMARY | 4       | func            |    1 | Using where                                  |
+----+--------------------+-------+--------+----------------------------+---------+---------+-----------------+------+----------------------------------------------+
Do you see room for optimization? The query has a runtime of around 7 seconds, which is too much, unfortunately. Any suggestions are welcome.
TIA
You have two potentially selective conditions here: the artist's name and the word list.
Assuming that the words are more selective than artists:
SELECT tw.trackid
FROM (
SELECT tw.trackid
FROM wordlist AS wl
JOIN trackwords AS tw
ON tw.wordid = wl.id
WHERE wl.trackusecount > 0
AND wl.word IN ('please','dont','leave','me')
GROUP BY
tw.trackid
HAVING COUNT(*) = 4
) tw
INNER JOIN
track AS t
ON t.id = tw.trackid
AND EXISTS
(
SELECT NULL
FROM artist a
WHERE a.name LIKE 'pink%'
AND a.id = t.artist
UNION ALL
SELECT NULL
FROM artist a
JOIN artistalias aa
ON aa.ref = a.id
AND aa.name LIKE 'pink%'
WHERE a.id = t.artist
)
You need to have the following indexes for this to be efficient:
wordlist (word, trackusecount)
trackwords (wordid, trackid)
artistalias (ref, name)
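The same three indexes as CREATE INDEX statements (index names are illustrative):

```sql
CREATE INDEX idx_wordlist_word_usecount ON wordlist (word, trackusecount);
CREATE INDEX idx_trackwords_word_track  ON trackwords (wordid, trackid);
CREATE INDEX idx_artistalias_ref_name   ON artistalias (ref, name);
```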
Have you already indexed the name columns? That should speed this up.
You can also try full-text searching with MATCH ... AGAINST.
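A sketch of the full-text approach, assuming a FULLTEXT index can be added to artist.name (MyISAM, or InnoDB on MySQL 5.6+):

```sql
ALTER TABLE artist ADD FULLTEXT INDEX ft_artist_name (name);

-- BOOLEAN MODE supports a trailing wildcard, matching the LIKE 'pink%' intent:
SELECT id
FROM artist
WHERE MATCH(name) AGAINST ('pink*' IN BOOLEAN MODE);
```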