Why is my MySQL query is so slow? - mysql

I'm trying to figure out why that query so slow (take about 6 second to get result)
SELECT DISTINCT
c.id
FROM
z1
INNER JOIN
c ON (z1.id = c.id)
INNER JOIN
i ON (c.member_id = i.member_id)
WHERE
c.id NOT IN (... big list of ids which should be excluded)
This is execution plan
+----+-------------+-------+--------+-------------------+---------+---------+--------------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+-------------------+---------+---------+--------------------+--------+----------+--------------------------+
| 1 | SIMPLE | z1 | index | PRIMARY | PRIMARY | 4 | NULL | 318563 | 99.85 | Using where; Using index; Using temporary |
| 1 | SIMPLE | c | eq_ref | PRIMARY,member_id | PRIMARY | 4 | z1.id | 1 | 100.00 | |
| 1 | SIMPLE | i | eq_ref | PRIMARY | PRIMARY | 4 | c.member_id | 1 | 100.00 | Using index |
+----+-------------+-------+--------+-------------------+---------+---------+--------------------+--------+----------+--------------------------+
is it because mysql has to take out almost whole 1st table ? Can it be adjusted ?

You can try to replace c with a subquery.
SELECT DISTINCT
c.id
FROM
z1
INNER JOIN
(select c.id
from c
WHERE
c.id NOT IN (... big list of ids which should be excluded)) c ON (z1.id = c.id)
INNER JOIN
i ON (c.member_id = i.member_id)
to leave only necessary id's

It is imposible to say from the information you've provided whether there is a faster solution to obtaining the same data (we would need to know abou data distributions and what foreign keys are obligatory). However assuming that this is a hierarchical data set, then the plan is probably not optimal: the only predicate to reduce the number of rows is c.id NOT IN.....
The first question to ask yourself when optimizing any query is Do I need all the rows? How many rows is this returning?
I'm struggling to see any utlity in a query which returns a list of 'id' values (implying a set of autoincrement integers).
You can't use an index for a NOT IN (or <>) hence the most eficient solution is probably to start with a full table scan on 'c' - which should be the outcome of StanislavL's query.
Since you don't use the values from i and z, the joins could be replaced with 'exists' which may help performance.

I would consider creating a compound index for c(id, member_id). This way the query should work at index level only without scanning any rows in tables.

Related

Joining table twice makes the query slow

My problem is that my query is very slow when use JOIN on the same table twice.
I want to retrieve all the products from a given category. But since the product can be in multiple categories I also want to get the (c.canonical) category that should provide the URL base. Therefore I have 2 extra JOIN on categories AS c and categories_products AS cp2.
Original query
SELECT p.product_id
FROM products AS p
JOIN categories_products AS cp
ON p.product_id = cp.product_id
JOIN product_variants AS pv
ON pv.product_id = p.product_id
WHERE cp.category_id = 2
AND p.status = 2
GROUP BY p.product_id
ORDER BY cp.product_sortorder ASC
LIMIT 0, 40
EXPLAIN
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
|----|-------------|-------|--------|------------------------|------------------------|---------|-------------------------|------|----------------------------------------------|
| 1 | SIMPLE | cp | ref | FK_categories_products | FK_categories_products | 4 | const | 1074 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | superlove.cp.product_id | 1 | Using where |
| 1 | SIMPLE | pv | ref | FK_product_variants | FK_product_variants | 4 | superlove.p.product_id | 1 | Using where |
Slow query
SELECT p.product_id, c.category_id
FROM products AS p
JOIN categories_products AS cp
ON p.product_id = cp.product_id
JOIN categories_products AS cp2 // Extra line
ON p.product_id = cp2.product_id // Extra line
JOIN categories AS c // Extra line
ON cp2.category_id = c.category_id // Extra line
JOIN product_variants AS pv
ON pv.product_id = p.product_id
WHERE cp.category_id = 2
AND p.status = 2
AND c.canonical = 1 // Extra line
GROUP BY p.product_id
ORDER BY cp.product_sortorder ASC
LIMIT 0, 40
EXPLAIN
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
|----|-------------|-------|--------|------------------------|------------------------|---------|--------------------------|------|----------------------------------------------|
| 1 | SIMPLE | c | ALL | PRIMARY | (null) | (null) | (null) | 221 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | cp2 | ref | FK_categories_products | FK_categories_products | 4 | superlove.c.category_id | 33 | |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 4 | superlove.cp2.product_id | 1 | Using where |
| 1 | SIMPLE | pv | ref | FK_product_variants | FK_product_variants | 4 | superlove.p.product_id | 1 | Using where |
| 1 | SIMPLE | cp | ref | FK_categories_products | FK_categories_products | 4 | const | 1074 | Using where |
The MySQL optimizer seems to have trouble with this query. I get the impression that only rather few products would be in the requested category, but there would likely be many canonical categories. However, the optimizer apparently cannot tell that cp.category_id = 2 is a stronger condition than c.canonical = 1, so it starts the new query with c instead of cp, leading to a lot of superfluous rows along the way.
Providing data to the optimizer
Your first attempt should be trying to provide the optimizer with the required data: using the ANALYZE TABLE command, you can collect information about key distribution. For this to work, you'd have to have suitable keys in place. So perhaps you should add a key on categories.canonical. Then MySQL would know that there are (if I understand you correctly) only two distinct values for that column, and perhaps even how many rows in each. With a bit of luck, that would tell it that using c.canonical = 1 as the starting point would be a poor choice.
Forcing join order
If that does not help, then I suggest you force the order using STRAIGHT_JOIN. In particular, you might want to force cp as the first table, just as your original (and fast) query had it. If that solves the problem, you can stick to that solution. If not, then you should provide a new EXPLAIN output, so we can see where that approach fails.
Schema considerations
One more thing to consider: your question implies that for every product, there is exactly one canonical category associated with it. But your database schema does not reflect that fact. You might want to consider ways to modify your schema to reflect that fact. For example, you could have a column called canonical_category_id in products table, and use categories_products for non-canonical categories only. If you use such a setup, you might want to create a VIEW which joins products to all their categories, both canonical and non-canonical ones, using a UNION like this:
CREATE VIEW products_all_categories AS
SELECT product_id, canonical_category_id AS category_id
FROM products
UNION ALL
SELECT product_id, category_id
FROM categories_products
You could use this instead of categories_products in those places where you don't care whether a category is canonical or not. You could even rename the table and name the view categories_products instead, so that your existing queries work as they used to. You should add an index on the two columns from products used in this query. Perhaps even two indices, one for either order of these columns.
Not sure whether this whole setup would be acceptable in your application. Not sure whether it would really bring the intended speed gain. In the end, you might be forced to maintain redundant data, like a products.canonical column in addition to a reference to the canonical category in the categories_products table. I know redundant data is ugly from a design point of view, but for the sake of performance it might be necessary in order to avoid long computations. At least on a RDBMS which doesn't support materialized views. You could probably use triggers to keep data consistent, though I have no actual experience there.

joining table in mysql not using index properly?

I have four tables that I am trying to join and output the result to a new table. My code looks like this:
create table tbl
select a.dte, a.permno, (ret - rf) f0_xs_ret, (xs_ret - (betav*xs_mkt)) f0_resid, mkt_cap last_year_mkt_cap, betav beta_value
from a inner join b using (dte)
inner join c on (year(a.dte) = c.yr and a.permno = c.permno)
inner join d on (a.permno = d.permno and year(a.dte)-1 = year(d.dte));
All of the tables have multiple indices and for table a, (dte, permno) identify a unique record, for table b, dte id's a unique record, for table c, (yr, permno) id a unique record and for table d, (dte, permno) id a unique record. the explain from the select part of the query is:
+----+-------------+-------+--------+-------------------+---------+---------+---------- ------------------------+--------+-------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------+---------+---------+---------- ------------------------+--------+-------------------+
| 1 | SIMPLE | d | ALL | idx1 | NULL | NULL | NULL | 264129 | |
| 1 | SIMPLE | c | ref | idx2 | idx2 | 4 | achernya.d.permno | 16 | |
| 1 | SIMPLE | b | ALL | PRIMARY,idx2 | NULL | NULL | NULL | 12336 | Using join buffer |
| 1 | SIMPLE | a | eq_ref | PRIMARY,idx1,idx2 | PRIMARY | 7 | achernya.b.dte,achernya.d.permno | 1 | Using where |
+----+-------------+-------+--------+-------------------+---------+---------+----------------------------------+--------+-------------------+
Why does mysql have to read so many rows to process this thing? and if i am reading this correctly, it has to read (264129*16*12336) rows which should take a good month.
Could someone please explain what's going on here?
MySQL has to read the rows because you're using functions as your join conditions. An index on dte will not help resolve YEAR(dte) in a query. If you want to make this fast, then put the year in its own column to use in joins and move the index to that column, even if that means some denormalization.
As for the other columns in your index that you don't apply functions to, they may not be used if the index won't provide much benefit, or they aren't the leftmost column in the index and you don't use the leftmost prefix of that index in your join condition.
Sometimes MySQL does not use an index, even if one is available. One circumstance under which this occurs is when the optimizer estimates that using the index would require MySQL to access a very large percentage of the rows in the table. (In this case, a table scan is likely to be much faster because it requires fewer seeks.)
http://dev.mysql.com/doc/refman/5.0/en/mysql-indexes.html

MySQL Query Optimization; SELECT multiple fields vs. JOIN

We've got a relatively straightforward query that does LEFT JOINs across 4 tables. A is the "main" table or the top-most table in the hierarchy. B links to A, C links to B. Furthermore, X links to A. So the hierarchy is basically
A
C => B => A
X => A
The query is essentially:
SELECT
a.*, b.*, c.*, x.*
FROM
a
LEFT JOIN b ON b.a_id = a.id
LEFT JOIN c ON c.b_id = b.id
LEFT JOIN x ON x.a_id = a.id
WHERE
b.flag = true
ORDER BY
x.date DESC
LIMIT 25
Via EXPLAIN, I've confirmed that the correct indexes are in place, and that the built-in MySQL query optimizer is using those indexes correctly and properly.
So here's the strange part...
When we run the query as is, it takes about 1.1 seconds to run.
However, after doing some checking, it seems that if I removed most of the SELECT fields, I get a significant speed boost.
So if instead we made this into a two-step query process:
First query same as above except change the SELECT clause to only SELECT a.id instead of SELECT *
Second query also same as above, except change the WHERE clause to only do an a.id IN agains the result of Query 1 instead of what we have before
The result is drastically different. It's .03 seconds for the first query and .02 for the second query.
Doing this two-step query in code essentially gives us a 20x boost in performance.
So here's my question:
Shouldn't this type of optimization already be done within the DB engine? Why does the difference in which fields that are actually SELECTed make a difference on the overall performance of the query?
At the end of the day, it's merely selecting the exact same 25 rows and returning the exact same full contents of those 25 rows. So, why the wide disparity in performance?
ADDED 2012-08-24 13:02 PM PDT
Thanks eggyal and invertedSpear for the feedback. First off, it's not a caching issue -- I've run tests running both queries multiple times (about 10 times) alternating between each approach. The result averages at 1.1 seconds for the first (single query) approach and .03+.02 seconds for the second (2 query) approach.
In terms of indexes, I thought I had done an EXPLAIN to ensure that we're going thru the keys, and for the most part we are. However, I just did a quick check again and one interesting thing to note:
The slower "single query" approach doesn't show the Extra note of "Using index" for the third line:
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY | shop_group_id_idx | 5 | NULL | 102 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t1.organization_id | 1 | Using where |
| 1 | SIMPLE | t0 | ref | bundle_idx,shop_id_idx | shop_id_idx | 4 | dbmodl_v18.t1.organization_id | 309 | |
| 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t0.id | 1 | |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
While it does show "Using index" for when we query for just the IDs:
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY | shop_group_id_idx | 5 | NULL | 102 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t1.organization_id | 1 | Using where |
| 1 | SIMPLE | t0 | ref | bundle_idx,shop_id_idx | shop_id_idx | 4 | dbmodl_v18.t1.organization_id | 309 | Using index |
| 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t0.id | 1 | |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
The strange thing is that both do list the correct index being used... but I guess it begs the questions:
Why are they different (considering all the other clauses are the exact same)? And is this an indication of why it's slower?
Unfortunately, the MySQL docs do not give much information for when the "Extra" column is blank/null in the EXPLAIN results.
More important than speed, you have a flaw in your query logic. When you test a LEFT JOINed column in the WHERE clause (other than testing for NULL), you force that join to behave as if it were an INNER JOIN. Instead, you'd want:
SELECT
a.*, b.*, c.*, x.*
FROM
a
LEFT JOIN b ON b.a_id = a.id
AND b.flag = true
LEFT JOIN c ON c.b_id = b.id
LEFT JOIN x ON x.a_id = a.id
ORDER BY
x.date DESC
LIMIT 25
My next suggestion would be to examine all of those .*'s in your SELECT. Do you really need all the columns from all the tables?

Why scan type is changed from ALL to RANGE when using LIMIT on SQL queries + Optimize query

I have this query
SELECT l.licitatii_id,
l.nume,
l.data_publicarii,
l.data_limita
FROM licitatii_ue l
INNER JOIN domenii_licitatii dl
ON l.licitatii_id = dl.licitatii_id
AND dl.tip_licitatie = '2'
INNER JOIN domenii d
ON dl.domenii_id = d.domenii_id
AND d.status = 1
AND d.tip_domeniu = '1'
WHERE l.status = 1
AND Unix_timestamp(TIMESTAMPADD(DAY, 1, CAST(From_unixtime(l.data_limita)
AS DATE)))
< '1300683793'
GROUP BY l.licitatii_id
ORDER BY data_publicarii DESC
Explain outputs:
+-----+--------------+--------+---------+-------------------------------------+----------+----------+---------------------------+-------+-----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | d | ALL | PRIMARY,key_status_tip_domeniu | NULL | NULL | NULL | 120 | 85.83 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | dl | ref | PRIMARY,tip_licitatie,licitatii_id | PRIMARY | 4 | web61db1.d.domenii_id | 6180 | 100.00 | Using where; Using index |
| 1 | SIMPLE | l | eq_ref | PRIMARY | PRIMARY | 4 | web61db1.dl.licitatii_id | 1 | 100.00 | Using where |
+-----+--------------+--------+---------+-------------------------------------+----------+----------+---------------------------+-------+-----------+----------------------------------------------+
As you see type=ALL for d table
now if I add LIMIT 100 to the query
plan changes to range:
+-----+--------------+--------+---------+-------------------------------------+-------------------------+----------+---------------------------+-------+-----------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
| 1 | SIMPLE | d | range | PRIMARY,key_status_tip_domeniu | key_status_tip_domeniu | 9 | NULL | 103 | 100.00 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | dl | ref | PRIMARY,tip_licitatie,licitatii_id | PRIMARY | 4 | web61db1.d.domenii_id | 6180 | 100.00 | Using where; Using index |
| 1 | SIMPLE | l | eq_ref | PRIMARY | PRIMARY | 4 | web61db1.dl.licitatii_id | 1 | 100.00 | Using where |
+-----+--------------+--------+---------+-------------------------------------+-------------------------+----------+---------------------------+-------+-----------+----------------------------------------------+
Why does this happen?
Can this query be optimized more, both queries take 13 seconds.
Table schema is visible on gist github
MySQL chooses domenii as the leading table for the join.
This table is filtered on (status, tip_domeniu) = (1, 1).
It does not seem to be a very selective condition, so normally a full table scan with filtering would be preferred over the index scan.
We can see that MySQL expects 120 records to be returned from domanii for which this condition would hold.
When you add a LIMIT, the number of records expected to be processed is decreased, and MySQL considers the index scan more efficient for this.
Note that this condition:
Unix_timestamp(TIMESTAMPADD(DAY, 1, CAST(From_unixtime(l.data_limita) AS DATE))) < '1300683793'
is not sargable, so you deprive the optimizer to use an index on data_limita.
Create the following indexes:
licitatii_ue (status, data_limita)
licitatii_ue (status, data_publicarii)
and rewrite the query like this:
SELECT l.licitatii_id,
l.nume,
l.data_publicarii,
l.data_limita
FROM licitatii_ue l
JOIN domenii_licitatii dl
ON l.licitatii_id = dl.licitatii_id
AND dl.tip_licitatie = '2'
JOIN domenii d
ON dl.domenii_id = d.domenii_id
AND d.status = 1
AND d.tip_domeniu = '1'
WHERE l.status = 1
AND l.data_limita < FROM_UNIXTIME(((1300683793 - 86400) div 86400) * 86400)
GROUP BY
l.licitatii_id
ORDER BY
data_publicarii DESC
Ah, the mysteries of the query optimizer are many and unknowable...
At a quick glance, the most obvious thing to optimize might be the
AND Unix_timestamp(TIMESTAMPADD(DAY, 1, CAST(From_unixtime(l.data_limita)
AS DATE)))
clause.
depending on the number of records in the licitatii_ue table, this looks like an expensive operation, and it will bypass any indices available.
ALL is table scan, range is range scan (due to LIMIT). Nothing bad with that, actually it also causes a key to be used (key_status_tip_domeniu).
The reason you are slow is, most likely, that you are using ORDER BY data_publicarii DESC (this is easy to test, just drop the ORDER BY and benchmark the query; would expect few orders of magnitude).
Mysql admits (under Extra column of explain) that it is using filesort (needed for order by because it can't or does not know how to use an index). Adding yet another index to the mix might help, especially if you confirm that ORDER BY is making it slow.
EDIT
Actually, you do have a cardinal sin in your query:
Unix_timestamp(TIMESTAMPADD(DAY, 1, CAST(From_unixtime(l.data_limita)
AS DATE)))
< '1300683793'
Avoid applying any functions to your field values if you can apply them to a constant. So switch it around and rewrite it as
l.data_limita < some_function('1300683793')
However complext the some_function would be, it will be calculated only once. Mysql planner will know it is a constant. The way you wrote it would force mysql to apply unix_timestamp, timestampadd, cast and from_unixtime to value of data_limita from each row. Now in I/O bound systems this will usually just burn some extra CPU cycles while waiting for the disks to spin around (however, it might get significant, your system might get CPU bound and it is just a bad thing). Biggest difference is that you loose possibility to use an index on data_limita.
Finally, all your indexes are singe field indexes and mysql does some index merging, but is not stellar in it. You might want to try creating indexes that cover all your conditions and sorting order (in order of selectivity for target query).

understanding mysql explain

So, I've never understood the explain of MySQL. I understand the gross concepts that you should have at least one entry in the possible_keys column for it to use an index, and that simple queries are better. But what is the difference between ref and eq_ref? What is the best way to be optimizing queries.
For example, this is my latest query that I'm trying to figure out why it takes forever (generated from django models) :
+----+-------------+---------------------+--------+-----------------------------------------------------------+---------------------------------+---------+--------------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------+--------+-----------------------------------------------------------+---------------------------------+---------+--------------------------------------+------+---------------------------------+
| 1 | SIMPLE | T6 | ref | yourock_achiever_achievement_id,yourock_achiever_alias_id | yourock_achiever_alias_id | 4 | const | 244 | Using temporary; Using filesort |
| 1 | SIMPLE | T5 | eq_ref | PRIMARY | PRIMARY | 4 | paul.T6.achievement_id | 1 | Using index |
| 1 | SIMPLE | T4 | ref | yourock_achiever_achievement_id,yourock_achiever_alias_id | yourock_achiever_achievement_id | 4 | paul.T6.achievement_id | 298 | |
| 1 | SIMPLE | yourock_alias | eq_ref | PRIMARY | PRIMARY | 4 | paul.T4.alias_id | 1 | Using index |
| 1 | SIMPLE | yourock_achiever | ref | yourock_achiever_achievement_id,yourock_achiever_alias_id | yourock_achiever_alias_id | 4 | paul.T4.alias_id | 152 | |
| 1 | SIMPLE | yourock_achievement | eq_ref | PRIMARY | PRIMARY | 4 | paul.yourock_achiever.achievement_id | 1 | |
+----+-------------+---------------------+--------+-----------------------------------------------------------+---------------------------------+---------+--------------------------------------+------+---------------------------------+
6 rows in set (0.00 sec)
I had hoped to learn enough about mysql explain that the query wouldn't be needed. Alas, it seems that you can't get enough information from the explain statement and you need the raw SQL. Query :
SELECT `yourock_achievement`.`id`,
`yourock_achievement`.`modified`,
`yourock_achievement`.`created`,
`yourock_achievement`.`string_id`,
`yourock_achievement`.`owner_id`,
`yourock_achievement`.`name`,
`yourock_achievement`.`description`,
`yourock_achievement`.`owner_points`,
`yourock_achievement`.`url`,
`yourock_achievement`.`remote_image`,
`yourock_achievement`.`image`,
`yourock_achievement`.`parent_achievement_id`,
`yourock_achievement`.`slug`,
`yourock_achievement`.`true_points`
FROM `yourock_achievement`
INNER JOIN
`yourock_achiever`
ON `yourock_achievement`.`id` = `yourock_achiever`.`achievement_id`
INNER JOIN
`yourock_alias`
ON `yourock_achiever`.`alias_id` = `yourock_alias`.`id`
INNER JOIN
`yourock_achiever` T4
ON `yourock_alias`.`id` = T4.`alias_id`
INNER JOIN
`yourock_achievement` T5
ON T4.`achievement_id` = T5.`id`
INNER JOIN
`yourock_achiever` T6
ON T5.`id` = T6.`achievement_id`
WHERE
T6.`alias_id` = 6
ORDER BY
`yourock_achievement`.`modified` DESC
Paul:
eq_ref
One row is read from this table for each combination of rows from the previous tables. Other than the system and const types, this is the best possible join type. It is used when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE index.
eq_ref can be used for indexed columns that are compared using the = operator. The comparison value can be a constant or an expression that uses columns from tables that are read before this table. In the following examples, MySQL can use an eq_ref join to process ref_table:
SELECT * FROM ref_table,other_table
WHERE ref_table.key_column=other_table.column;
SELECT * FROM ref_table,other_table
WHERE ref_table.key_column_part1=other_table.column
AND ref_table.key_column_part2=1;
ref
All rows with matching index values are read from this table for each combination of rows from the previous tables. ref is used if the join uses only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index (in other words, if the join cannot select a single row based on the key value). If the key that is used matches only a few rows, this is a good join type.
ref can be used for indexed columns that are compared using the = or <=> operator. In the following examples, MySQL can use a ref join to process ref_table:
SELECT * FROM ref_table WHERE key_column=expr;
SELECT * FROM ref_table,other_table
WHERE ref_table.key_column=other_table.column;
SELECT * FROM ref_table,other_table
WHERE ref_table.key_column_part1=other_table.column
AND ref_table.key_column_part2=1;
These are copied verbatim from the MySQL manual: http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
If you could post your query that is taking forever, I could help pinpoint what is slowing it down. Also, please specify what your definition of forever is. Also, if you could provide your "SHOW CREATE TABLE xxx;" statements for these tables, I could help in optimizing your query as much as possible.
What jumps out at me immediately as a possible point of improvement is the "Using temporary; Using filesort;". This means that a temporary table was created to satisfy the query (not necessarily a bad thing), and that the GROUP BY/ORDER BY you designated could not be retrieved from an index, thus resulting in a filesort.
You query seems to process (244 * 298 * 152) = 11,052,224 records, which according to Using temporary; Using filesort need to be sorted.
This can take long.
If you post your query here, we probably will be able to optimize it somehow.
Update:
You query indeed does a number of nested loops and seems to yield lots of values which need to be sorted then.
Could you please run the following query:
SELECT COUNT(*)
FROM `yourock_achievement`
INNER JOIN
`yourock_achiever`
ON `yourock_achievement`.`id` = `yourock_achiever`.`achievement_id`
INNER JOIN
`yourock_alias`
ON `yourock_achiever`.`alias_id` = `yourock_alias`.`id`
INNER JOIN
`yourock_achiever` T4
ON `yourock_alias`.`id` = T4.`alias_id`
INNER JOIN
`yourock_achievement` T5
ON T4.`achievement_id` = T5.`id`
INNER JOIN
`yourock_achiever` T6
ON T5.`id` = T6.`achievement_id`
WHERE
T6.`alias_id` = 6