sql query optimization sugarcrm - mysql

I am trying to optimize an SQL query in a MySQL database. I have tried various index combinations, but nothing helps. Maybe I am missing something.
Query:
SELECT count(1) AS fAccounts
from sugarcrm.accounts t4,
( SELECT t3.related_id
FROM sugarcrm.prospect_lists_prospects t3, sugarcrm.prospect_list_campaigns t2
where t3.deleted=0
and t3.related_type='Accounts'
and t3.prospect_list_id=t2.prospect_list_id
and t2.deleted=0
and t2.campaign_id='10909eb7-8080-45b6-8c9f-563b42be91e5'
) t3
where t4.deleted=0
and t4.id=t3.related_id;
Explain:
+----+-------------+------------+--------+---------------------------------------------------+----------------+---------+------------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------------------------------------------+----------------+---------+------------------------------+--------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5000 | |
| 1 | PRIMARY | t4 | eq_ref | PRIMARY;idx_accnt_id_del;idx_accnt_assigned_del | PRIMARY | 108 | t3.related_id | 1 | Using where |
| 2 | DERIVED | t2 | ref | idx_pro_id;idx_cam_id;idx_prospect_list_campaigns | idx_cam_id | 111 | | 1 | Using where |
| 2 | DERIVED | t3 | ref | idx_plp_pro_id;idx_plp_rel_id_2 | idx_plp_pro_id | 111 | sugarcrm.t2.prospect_list_id | 463968 | Using where |
+----+-------------+------------+--------+---------------------------------------------------+----------------+---------+------------------------------+--------+-------------+

The inner query is the trouble maker. There are two ways it could be performed: Start with t2 and do a "Nested Loop Join" to t3 or vice versa. The Optimizer will look at the WHERE clause and the table sizes and the indexes to estimate which one would be best to start with. Let's give the optimizer the 'best' index for going each way:
Starting with t2:
t2: INDEX(deleted, campaign_id) -- in either order
t3: INDEX(prospect_list_id, deleted, related_type) -- in any order
Starting with t3:
t3: INDEX(deleted, related_type) -- in either order
t2: INDEX(prospect_list_id, deleted, campaign_id) -- in any order
Rather than adding 2 indexes to each table, let's do
t2: INDEX(campaign_id, deleted, prospect_list_id) -- in this order
t3: INDEX(related_type, deleted, prospect_list_id) -- in this order
Similarly, t4 (which will be last) needs
INDEX(deleted, id)
unless it is InnoDB and already has PRIMARY KEY(id), which will be 'clustered' with the data.
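As a rough sketch, the two composite indexes could be created with statements like the ones below. The index names (idx_plc_camp_del_list, idx_plp_type_del_list) are made up for illustration, and the snippet runs against an in-memory SQLite database with a cut-down schema purely so that it is self-contained. The same CREATE INDEX syntax should be accepted by MySQL, but whether MySQL actually picks these indexes has to be checked with EXPLAIN on the real data.

```python
import sqlite3

# Cut-down stand-ins for the SugarCRM tables (only the columns the query touches).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prospect_list_campaigns (
    prospect_list_id TEXT, campaign_id TEXT, deleted INTEGER);
CREATE TABLE prospect_lists_prospects (
    prospect_list_id TEXT, related_id TEXT, related_type TEXT, deleted INTEGER);
""")

# Hypothetical index names; the column order follows the suggestion above.
INDEX_DDL = [
    "CREATE INDEX idx_plc_camp_del_list"
    " ON prospect_list_campaigns (campaign_id, deleted, prospect_list_id)",
    "CREATE INDEX idx_plp_type_del_list"
    " ON prospect_lists_prospects (related_type, deleted, prospect_list_id)",
]
for ddl in INDEX_DDL:
    conn.execute(ddl)

# List the indexes that now exist, to confirm the DDL was accepted.
created = sorted(
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index'")
)
print(created)
```

In each index the equality-tested columns come first and the join column last, which is the ordering the suggestion above relies on.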
There is a problem... When you do a JOIN, then compute aggregates, the JOIN first gives you an explosion of rows, then the COUNT() counts too many of them, thereby getting an inflated number. So, be sure to sanity check the results.
Since the only need for t4 is to verify that the related_id is there, the query could be reformulated as
SELECT COUNT(*) AS fAccounts
FROM prospect_lists_prospects t3
-- Note the use of `JOIN...ON...`:
JOIN prospect_list_campaigns t2 ON t3.prospect_list_id=t2.prospect_list_id
where t3.deleted=0
and t3.related_type='Accounts'
and t2.deleted=0
and t2.campaign_id='10909eb7-8080-45b6-8c9f-563b42be91e5'
AND EXISTS ( SELECT *
FROM accounts t4
WHERE t4.id = t3.related_id
)
This still needs the suggested indexes (one per table).
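The "inflated count" warning is easy to reproduce on toy data. The sketch below uses an in-memory SQLite database with invented ids ('acct-1', 'list-1', 'camp-1') and a cut-down schema; it also keeps the deleted=0 test on accounts from the question's original query. One account sits on two prospect lists of the same campaign, so COUNT(*) counts it twice, while COUNT(DISTINCT ...) does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE accounts (id TEXT PRIMARY KEY, deleted INTEGER);
CREATE TABLE prospect_list_campaigns (
    prospect_list_id TEXT, campaign_id TEXT, deleted INTEGER);
CREATE TABLE prospect_lists_prospects (
    prospect_list_id TEXT, related_id TEXT, related_type TEXT, deleted INTEGER);

INSERT INTO accounts VALUES ('acct-1', 0), ('acct-2', 0);
INSERT INTO prospect_list_campaigns VALUES
    ('list-1', 'camp-1', 0), ('list-2', 'camp-1', 0);
-- acct-1 is on both lists of the campaign, acct-2 only on one.
INSERT INTO prospect_lists_prospects VALUES
    ('list-1', 'acct-1', 'Accounts', 0),
    ('list-2', 'acct-1', 'Accounts', 0),
    ('list-2', 'acct-2', 'Accounts', 0);
""")

COUNT_SQL = """
SELECT COUNT(*), COUNT(DISTINCT t3.related_id)
FROM prospect_lists_prospects t3
JOIN prospect_list_campaigns t2 ON t3.prospect_list_id = t2.prospect_list_id
WHERE t3.deleted = 0
  AND t3.related_type = 'Accounts'
  AND t2.deleted = 0
  AND t2.campaign_id = 'camp-1'
  AND EXISTS (SELECT 1 FROM accounts t4
              WHERE t4.id = t3.related_id AND t4.deleted = 0)
"""
row_count, distinct_accounts = conn.execute(COUNT_SQL).fetchone()
print(row_count, distinct_accounts)
```

If the two numbers differ on the real data, it is worth deciding which one fAccounts is supposed to mean.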

Since you don't use DISTINCT anywhere, I see no need to bother creating a temporary table. Try this one:
SELECT
count(1) AS fAccounts
from sugarcrm.accounts t4 inner join
sugarcrm.prospect_lists_prospects t3 on t4.id=t3.related_id inner join
sugarcrm.prospect_list_campaigns t2 on t3.prospect_list_id=t2.prospect_list_id
where t3.deleted=0
and t3.related_type='Accounts'
and t2.deleted=0
and t2.campaign_id='10909eb7-8080-45b6-8c9f-563b42be91e5'
and t4.deleted=0


Exotic GROUP BY In MySQL

Consider a typical GROUP BY statement in SQL: you have a table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| B | 2 |
| A | 3 |
| B | 4 |
+------+-------+
And you ask for
SELECT Name, SUM(Value) as Value
FROM table
GROUP BY Name
You'll receive
+------+-------+
| Name | Value |
+------+-------+
| A | 4 |
| B | 6 |
+------+-------+
In your head, you can imagine that SQL generates an intermediate sorted table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| A | 3 |
| B | 2 |
| B | 4 |
+------+-------+
and then aggregates together successive rows: the "Value" column has been given an aggregator (in this case SUM), so it's easy to aggregate. The "Name" column has been given no aggregator, and thus uses what you might call the "trivial partial aggregator": given two things that are the same (e.g. A and A), it aggregates them into a single copy of one of the inputs (in this case A). Given any other input it doesn't know what to do and is forced to begin aggregating anew (this time with the "Name" column equal to B).
I want to do a more exotic kind of aggregation. My table looks like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| BC | 2 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BCR | 6 |
+------+-------+
And the intended output is
+------+-------+
| Name | Value |
+------+-------+
| A | 8 |
| B | 13 |
+------+-------+
Where does this come from? A and B are the "minimal prefixes" for this set of names: they occur in the data set and every Name has exactly one of them as a prefix. I want to aggregate data by grouping rows together when their Names have the same minimal prefix (and add the Values, of course).
In the toy grouping model from before, the intermediate sorted table would be
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BC | 2 |
| BCR | 6 |
+------+-------+
Instead of using the "trivial partial aggregator" for Names, we would use one that can aggregate X and Y together iff X is a prefix of Y; in that case it returns X. So the first three rows would be aggregated together into a row with (Name, Value) = (A, 8), then the aggregator would see that A and B couldn't be aggregated and would move on to a new "block" of rows to aggregate.
The tricky thing is that the value we're grouping by is "non-local": if A were not a name in the dataset, then AY and AZ would each be a minimal prefix. It turns out that the AY and AZ rows are aggregated into the same row in the final output, but you couldn't know that just by looking at them in isolation.
Miraculously, in my use case the minimal prefix of a string can be determined without reference to anything else in the dataset. (Imagine that each of my names is one of the strings "hello", "world", and "bar", followed by any number of z's. I want to group all of the Names with the same "base" word together.)
As I see it I have two options:
1) The simple option: compute the prefix for each row and group by that value directly. Unfortunately I have an index on the Name, and computing the minimal prefix (whose length depends on the Name itself) prevents me from using that index. This forces a full table scan, which is prohibitively slow.
2) The complicated option: somehow convince MySQL to use the "partial prefix aggregator" for Name. This runs into the "non-locality" problem above, but that's fine as long as we scan the table according to my index on Name, since then every minimal prefix will be encountered before any of the other strings it is a prefix of; we would never try to aggregate AY and AZ together if A were in the dataset.
In an imperative programming language #2 would be rather easy: extract rows one at a time, in alphabetical order, keeping track of the current prefix. If your new row's Name has that as a prefix, it goes in the bucket you're currently using. Otherwise, start a new bucket with the new Name as your prefix. In MySQL I am lost as to how to do it. Note that the set of minimal prefixes is not known beforehand.
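For reference, the row-at-a-time procedure described above could look roughly like this outside the database. This is only a sketch: the function name sum_by_minimal_prefix is invented here, and Python's default string ordering stands in for whatever collation the Name index uses.

```python
def sum_by_minimal_prefix(rows):
    """Sum Value per minimal prefix, scanning rows in sorted Name order."""
    totals = {}
    current = None  # the minimal prefix of the bucket being filled
    for name, value in sorted(rows):
        # A Name that does not extend the current prefix starts a new bucket.
        if current is None or not name.startswith(current):
            current = name
            totals[current] = 0
        totals[current] += value
    return totals


sample = [("A", 1), ("BC", 2), ("AY", 3), ("AZ", 4), ("B", 5), ("BCR", 6)]
result = sum_by_minimal_prefix(sample)
print(result)
```

Because the rows are visited in sorted order, a minimal prefix is always seen before any longer Name that starts with it, which is the property option 2 depends on.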
Edit 2
It occurred to me that if the table is ordered by Name, this would be a lot easier (and faster). Since I don't know if your data is sorted, I've included a sort in this query, but if the data is sorted, you can strip out (SELECT * FROM table1 ORDER BY Name) t1 and just use FROM table1
SELECT prefix, SUM(`Value`)
FROM (SELECT Name, Value, @prefix:=IF(Name NOT LIKE CONCAT(@prefix, '_%'), Name, @prefix) AS prefix
FROM (SELECT * FROM table1 ORDER BY Name) t1
JOIN (SELECT @prefix := '~') p
) t2
GROUP BY prefix
Updated SQLFiddle
Edit
Having slept on the problem, I realised that there is no need to do the IN, it's enough to just have a WHERE NOT EXISTS clause on the JOINed table:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE NOT EXISTS (SELECT *
FROM table1 t3
WHERE t1.Name LIKE CONCAT(t3.Name, '_%')
)
GROUP BY t1.Name
Updated Explain (Name changed to UNIQUE key from PRIMARY)
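+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+
| id | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                                     |
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+
| 1  | PRIMARY            | t1    | index | Name          | Name | 11      | NULL | 6    | Using where; Using index; Using temporary; Using filesort |
| 1  | PRIMARY            | t2    | ALL   | NULL          | NULL | NULL    | NULL | 6    | Using where; Using join buffer (Block Nested Loop)        |
| 3  | DEPENDENT SUBQUERY | t3    | index | NULL          | Name | 11      | NULL | 6    | Using where; Using index                                  |
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+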
Updated SQLFiddle
Original Answer
Here is one way you could do it. First, you need to find all the unique prefixes in your table. You can do that by looking for all values of Name where it does not look like another value of Name with other characters on the end. This can be done with this query:
SELECT Name
FROM table1 t1
WHERE NOT EXISTS (SELECT *
FROM table1 t2
WHERE t1.Name LIKE CONCAT(t2.Name, '_%')
)
For your sample data, that will give
Name
A
B
Now you can sum all the values where the Name starts with one of those prefixes. Note we change the LIKE pattern in this query so that it also matches the prefix, otherwise we wouldn't count the values for A and B in your example:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE t1.Name IN (SELECT Name
FROM table1 t3
WHERE NOT EXISTS (SELECT *
FROM table1 t4
WHERE t3.Name LIKE CONCAT(t4.Name, '_%')
)
)
GROUP BY t1.Name
Output:
Name Value
A 8
B 13
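To try both queries without a MySQL server, here is a small sketch against an in-memory SQLite database loaded with the sample rows from the question. Two things are assumptions of this sketch rather than part of the answer: SQLite spells CONCAT(x, '_%') as x || '_%', and its LIKE is case-insensitive for ASCII by default, which does not matter for this upper-case sample.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (Name TEXT PRIMARY KEY, Value INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("A", 1), ("BC", 2), ("AY", 3), ("AZ", 4), ("B", 5), ("BCR", 6)],
)

# Step 1: Names that are not a longer version of some other Name.
prefixes = [name for (name,) in conn.execute("""
    SELECT Name
    FROM table1 t1
    WHERE NOT EXISTS (SELECT 1 FROM table1 t2
                      WHERE t1.Name LIKE t2.Name || '_%')
    ORDER BY Name
""")]

# Step 2: sum every row whose Name starts with one of those prefixes.
totals = conn.execute("""
    SELECT t1.Name, SUM(t2.Value)
    FROM table1 t1
    JOIN table1 t2 ON t2.Name LIKE t1.Name || '%'
    WHERE NOT EXISTS (SELECT 1 FROM table1 t3
                      WHERE t1.Name LIKE t3.Name || '_%')
    GROUP BY t1.Name
    ORDER BY t1.Name
""").fetchall()
print(prefixes, totals)
```

Note that Names containing a literal % or _ would be treated as wildcards by the LIKE pattern, so real data with such characters would need escaping.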
An EXPLAIN says that both of these queries use the index on Name, so should be reasonably efficient. Here is the result of the explain on my MySQL 5.6 server:
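+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+
| id | select_type        | table | type   | possible_keys | key     | key_len | ref          | rows | Extra                                              |
+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+
| 1  | PRIMARY            | t1    | index  | PRIMARY       | PRIMARY | 11      | NULL         | 6    | Using index; Using temporary; Using filesort       |
| 1  | PRIMARY            | t3    | eq_ref | PRIMARY       | PRIMARY | 11      | test.t1.Name | 1    | Using where; Using index                           |
| 1  | PRIMARY            | t2    | ALL    | NULL          | NULL    | NULL    | NULL         | 6    | Using where; Using join buffer (Block Nested Loop) |
| 3  | DEPENDENT SUBQUERY | t4    | index  | NULL          | PRIMARY | 11      | NULL         | 6    | Using where; Using index                           |
+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+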
SQLFiddle Demo
Here are some hints on how to do the task. This locates any prefixes that are useful. That's not what you asked for, but the flow of the query and the usage of @variables, plus the need for 2 (actually 3) levels of nesting, might help you.
SELECT DISTINCT `Prev`
FROM
(
SELECT @prev := @next AS 'Prev',
@next := IF(LEFT(city, LENGTH(@prev)) = @prev, @next, city) AS 'Next'
FROM ( SELECT @next := ' ' ) AS init
JOIN ( SELECT DISTINCT city FROM us ) AS dedup
ORDER BY city
) x
WHERE `Prev` = `Next` ;
Partial output:
+----------------+
| Prev |
+----------------+
| Alamo |
| Allen |
| Altamont |
| Ames |
| Amherst |
| Anderson |
| Arlington |
| Arroyo |
| Auburn |
| Austin |
| Avon |
| Baker |
Check the Al% cities:
mysql> SELECT DISTINCT city FROM us WHERE city LIKE 'Al%' ORDER BY city;
+-------------------+
| city |
+-------------------+
| Alabaster |
| Alameda |
| Alamo | <--
| Alamogordo | <--
| Alamosa |
| Albany |
| Albemarle |
...
| Alhambra |
| Alice |
| Aliquippa |
| Aliso Viejo |
| Allen | <--
| Allen Park | <--
| Allentown | <--
| Alliance |
| Allouez |
| Alma |
| Aloha |
| Alondra Park |
| Alpena |
| Alpharetta |
| Alpine |
| Alsip |
| Altadena |
| Altamont | <--
| Altamonte Springs | <--
| Alton |
| Altoona |
| Altus |
| Alvin |
+-------------------+
40 rows in set (0.01 sec)

Load average jumps up during a sort in mysql query

I have encountered a MySQL query that takes over 2 minutes to complete and drives the server load very high (e.g. from 2 to 14, or sometimes higher).
The query does a left join between tables, then sorts the data based on a float column in one of the joined tables, like this:
SELECT table1.*, table2.*, table3.field, table4.field
FROM table1
LEFT JOIN table2 ON table1...
LEFT JOIN table3 ON table1...
LEFT JOIN table4 ON table3...
LEFT JOIN table5 ON table1...
WHERE table1.deleted=0
ORDER BY table2.float_field ASC
LIMIT 1,300
The JOINS are all done on indexed keys, and table2 also has an index on the float_field.
The same database structure and query are used on other databases without issues. This table2 is a custom field table, alterable by users of this database, so in this particular system I see that it has 107 fields, more than 2/3 of them being varchar(150). Would this explain the high load, or is there some other reason? Any suggestions for how to handle it (ideally without having to redo the db schema)?
Thanks in advance.
EDIT: Here are the 'explain' results:
+----+-------------+--------+--------+---------------+---------+---------+-----------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+--------+---------------+---------+---------+-----------------+-------+-------------+
| 1 | SIMPLE | table1 | ALL | idx_1,idx_2 | NULL | NULL | NULL | 33861 | Using where |
| 1 | SIMPLE | table2 | eq_ref | PRIMARY | PRIMARY | 108 | db.table1.id | 1 | |
| 1 | SIMPLE | jtl0 | ref | idx_X | idx_X | 111 | db.table1.id | 1 | |
| 1 | SIMPLE | table4 | eq_ref | PRIMARY,... | PRIMARY | 108 | db.jtl0.field | 1 | |
| 1 | SIMPLE | jt1 | eq_ref | PRIMARY | PRIMARY | 108 | db.table1.fieldX| 1 | |
+----+-------------+--------+--------+---------------+---------+---------+-----------------+-------+-------------+
Both idx_1 and idx_2 use the 'deleted' column as the first field in the index. It is the only field in the WHERE clause.
I also corrected the original text: there are 5 tables used, not 4 (although the last table has only 20 rows, so it doesn't matter here).
select table2.*
is generally bad style - returning a lot of columns you are not interested in. In this case it could be causing the slowness given the large number of (text) columns in this table.
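100 columns * 150 characters * 1300 rows is roughly 19.5 MB, so the slowness could well be reading all the data from disk and transmitting it across the network. (If LIMIT 1,300 is read the way MySQL parses it, as offset 1 and 300 rows, the same estimate comes to about 4.5 MB.)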
Do you still see the slowness if you restrict this to the particular columns of table2 that you are interested in?
EDIT : your explain select output seems to confirm that it is not a difficult query to run, with only a small number of rows. This makes the sheer data size of each row in table2 the most likely problem. You can test this by removing / limiting the reference to table2. If that is the case, then the only way to speed this query up will be to request fewer columns from table2.
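The back-of-the-envelope size estimate can be written down as a tiny helper. This is only a sketch: the function name is invented, it assumes one byte per character and every varchar filled to its declared width, and the row count is left as a parameter so both 1300 and 300 rows can be tried.

```python
def result_set_bytes(columns, bytes_per_column, rows):
    """Upper-bound estimate: every column filled to its declared width."""
    return columns * bytes_per_column * rows


estimate_1300 = result_set_bytes(100, 150, 1300)
estimate_300 = result_set_bytes(100, 150, 300)
print(estimate_1300 / 1e6, "MB")
print(estimate_300 / 1e6, "MB")
```

Real rows are usually far from full, so the numbers are a ceiling rather than a measurement.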

Why is my MySQL query so slow?

I'm trying to figure out why this query is so slow (it takes about 6 seconds to return a result):
SELECT DISTINCT
c.id
FROM
z1
INNER JOIN
c ON (z1.id = c.id)
INNER JOIN
i ON (c.member_id = i.member_id)
WHERE
c.id NOT IN (... big list of ids which should be excluded)
This is execution plan
+----+-------------+-------+--------+-------------------+---------+---------+--------------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+--------+-------------------+---------+---------+--------------------+--------+----------+--------------------------+
| 1 | SIMPLE | z1 | index | PRIMARY | PRIMARY | 4 | NULL | 318563 | 99.85 | Using where; Using index; Using temporary |
| 1 | SIMPLE | c | eq_ref | PRIMARY,member_id | PRIMARY | 4 | z1.id | 1 | 100.00 | |
| 1 | SIMPLE | i | eq_ref | PRIMARY | PRIMARY | 4 | c.member_id | 1 | 100.00 | Using index |
+----+-------------+-------+--------+-------------------+---------+---------+--------------------+--------+----------+--------------------------+
Is it because MySQL has to read almost the whole first table? Can it be adjusted?
You can try to replace c with a subquery.
SELECT DISTINCT
c.id
FROM
z1
INNER JOIN
(select c.id, c.member_id
from c
WHERE
c.id NOT IN (... big list of ids which should be excluded)) c ON (z1.id = c.id)
INNER JOIN
i ON (c.member_id = i.member_id)
so that only the necessary ids are left before the joins.
It is impossible to say from the information you've provided whether there is a faster solution to obtaining the same data (we would need to know about data distributions and which foreign keys are obligatory). However, assuming that this is a hierarchical data set, the plan is probably not optimal: the only predicate to reduce the number of rows is c.id NOT IN.....
The first question to ask yourself when optimizing any query is: Do I need all the rows? How many rows is this returning?
I'm struggling to see any utility in a query which returns a list of 'id' values (implying a set of autoincrement integers).
You can't use an index for a NOT IN (or <>), hence the most efficient solution is probably to start with a full table scan on 'c', which should be the outcome of StanislavL's query.
Since you don't use the values from i and z1, the joins could be replaced with 'exists', which may help performance.
I would consider creating a compound index for c(id, member_id). This way the query should work at index level only without scanning any rows in tables.
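A sketch of the 'exists' rewrite on toy data, to show that it returns the same ids as the join version. The table contents and the single excluded id (4) are invented, and SQLite is used only to keep the example self-contained; whether the rewrite is actually faster in MySQL has to be measured on the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE z1 (id INTEGER PRIMARY KEY);
CREATE TABLE i (member_id INTEGER PRIMARY KEY);
CREATE TABLE c (id INTEGER PRIMARY KEY, member_id INTEGER);
INSERT INTO z1 VALUES (1), (2), (3), (4);
INSERT INTO i VALUES (10), (20);
INSERT INTO c VALUES (1, 10), (2, 20), (3, 30), (4, 10), (5, 10);
""")
excluded = (4,)  # stands in for the "big list of ids"

join_ids = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.id
    FROM z1
    INNER JOIN c ON (z1.id = c.id)
    INNER JOIN i ON (c.member_id = i.member_id)
    WHERE c.id NOT IN (?)
    ORDER BY c.id
""", excluded)]

# Same filter, but z1 and i are only probed for existence, so no DISTINCT is needed.
exists_ids = [row[0] for row in conn.execute("""
    SELECT c.id
    FROM c
    WHERE c.id NOT IN (?)
      AND EXISTS (SELECT 1 FROM z1 WHERE z1.id = c.id)
      AND EXISTS (SELECT 1 FROM i WHERE i.member_id = c.member_id)
    ORDER BY c.id
""", excluded)]
print(join_ids, exists_ids)
```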

MySQL Query Optimization; SELECT multiple fields vs. JOIN

We've got a relatively straightforward query that does LEFT JOINs across 4 tables. A is the "main" table or the top-most table in the hierarchy. B links to A, C links to B. Furthermore, X links to A. So the hierarchy is basically
A
C => B => A
X => A
The query is essentially:
SELECT
a.*, b.*, c.*, x.*
FROM
a
LEFT JOIN b ON b.a_id = a.id
LEFT JOIN c ON c.b_id = b.id
LEFT JOIN x ON x.a_id = a.id
WHERE
b.flag = true
ORDER BY
x.date DESC
LIMIT 25
Via EXPLAIN, I've confirmed that the correct indexes are in place, and that the built-in MySQL query optimizer is using those indexes correctly and properly.
So here's the strange part...
When we run the query as is, it takes about 1.1 seconds to run.
However, after doing some checking, it seems that if I removed most of the SELECT fields, I get a significant speed boost.
So if instead we made this into a two-step query process:
First query same as above except change the SELECT clause to only SELECT a.id instead of SELECT *
Second query also same as above, except change the WHERE clause to only do an a.id IN against the result of Query 1 instead of what we have before
The result is drastically different. It's .03 seconds for the first query and .02 for the second query.
Doing this two-step query in code essentially gives us a 20x boost in performance.
So here's my question:
Shouldn't this type of optimization already be done within the DB engine? Why does the difference in which fields that are actually SELECTed make a difference on the overall performance of the query?
At the end of the day, it's merely selecting the exact same 25 rows and returning the exact same full contents of those 25 rows. So, why the wide disparity in performance?
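To make the two-step process concrete, here is a simplified sketch on toy data. It assumes only tables a and x with exactly one x row per a row (the real schema also has b and c, and the payload column is invented), and it uses an in-memory SQLite database rather than MySQL, so it illustrates the shape of the queries, not the timings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, payload TEXT);
CREATE TABLE x (a_id INTEGER PRIMARY KEY, date TEXT);
""")
conn.executemany("INSERT INTO a VALUES (?, ?)",
                 [(n, "wide row %d" % n) for n in range(1, 11)])
conn.executemany("INSERT INTO x VALUES (?, ?)",
                 [(n, "2012-08-%02d" % n) for n in range(1, 11)])

# Single query: sort and limit while carrying every selected column.
single = conn.execute("""
    SELECT a.id, a.payload, x.date
    FROM a LEFT JOIN x ON x.a_id = a.id
    ORDER BY x.date DESC LIMIT 3
""").fetchall()

# Step 1: sort and limit on the ids only.
ids = [row[0] for row in conn.execute("""
    SELECT a.id
    FROM a LEFT JOIN x ON x.a_id = a.id
    ORDER BY x.date DESC LIMIT 3
""")]

# Step 2: fetch the full rows for just those ids.
placeholders = ",".join("?" * len(ids))
two_step = conn.execute(f"""
    SELECT a.id, a.payload, x.date
    FROM a LEFT JOIN x ON x.a_id = a.id
    WHERE a.id IN ({placeholders})
    ORDER BY x.date DESC
""", ids).fetchall()
print(ids, single == two_step)
```

With one-to-many joins the id list can contain duplicates and step 2 can return more rows than the LIMIT, so the equivalence shown here holds only under the one-row-per-id assumption.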
ADDED 2012-08-24 13:02 PM PDT
Thanks eggyal and invertedSpear for the feedback. First off, it's not a caching issue -- I've run tests running both queries multiple times (about 10 times) alternating between each approach. The result averages at 1.1 seconds for the first (single query) approach and .03+.02 seconds for the second (2 query) approach.
In terms of indexes, I thought I had done an EXPLAIN to ensure that we're going thru the keys, and for the most part we are. However, I just did a quick check again and one interesting thing to note:
The slower "single query" approach doesn't show the Extra note of "Using index" for the third line:
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY | shop_group_id_idx | 5 | NULL | 102 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t1.organization_id | 1 | Using where |
| 1 | SIMPLE | t0 | ref | bundle_idx,shop_id_idx | shop_id_idx | 4 | dbmodl_v18.t1.organization_id | 309 | |
| 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t0.id | 1 | |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
While it does show "Using index" for when we query for just the IDs:
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY | shop_group_id_idx | 5 | NULL | 102 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t1.organization_id | 1 | Using where |
| 1 | SIMPLE | t0 | ref | bundle_idx,shop_id_idx | shop_id_idx | 4 | dbmodl_v18.t1.organization_id | 309 | Using index |
| 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t0.id | 1 | |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
The strange thing is that both list the correct index as being used... but I guess it raises the questions:
Why are they different (considering all the other clauses are the exact same)? And is this an indication of why it's slower?
Unfortunately, the MySQL docs do not give much information for when the "Extra" column is blank/null in the EXPLAIN results.
More important than speed, you have a flaw in your query logic. When you test a LEFT JOINed column in the WHERE clause (other than testing for NULL), you force that join to behave as if it were an INNER JOIN. Instead, you'd want:
SELECT
a.*, b.*, c.*, x.*
FROM
a
LEFT JOIN b ON b.a_id = a.id
AND b.flag = true
LEFT JOIN c ON c.b_id = b.id
LEFT JOIN x ON x.a_id = a.id
ORDER BY
x.date DESC
LIMIT 25
My next suggestion would be to examine all of those .*'s in your SELECT. Do you really need all the columns from all the tables?
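The difference between filtering in WHERE and filtering in ON can be seen on a three-row toy example. The data below is invented and only tables a and b are modelled; SQLite is used for convenience, but the LEFT JOIN semantics shown are standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY);
CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER, flag INTEGER);
INSERT INTO a VALUES (1), (2), (3);
-- a.id = 2 has an unflagged b row, a.id = 3 has no b row at all.
INSERT INTO b VALUES (10, 1, 1), (20, 2, 0);
""")

# Filter in WHERE: rows where b is NULL or unflagged are thrown away.
filter_in_where = conn.execute("""
    SELECT a.id, b.id
    FROM a LEFT JOIN b ON b.a_id = a.id
    WHERE b.flag = 1
    ORDER BY a.id
""").fetchall()

# Filter in ON: every a row survives, with NULL where no flagged b matches.
filter_in_on = conn.execute("""
    SELECT a.id, b.id
    FROM a LEFT JOIN b ON b.a_id = a.id AND b.flag = 1
    ORDER BY a.id
""").fetchall()
print(filter_in_where)
print(filter_in_on)
```

Which of the two results is wanted depends on whether rows of a without a flagged b should appear at all.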

understanding mysql explain

So, I've never understood the explain of MySQL. I understand the gross concepts that you should have at least one entry in the possible_keys column for it to use an index, and that simple queries are better. But what is the difference between ref and eq_ref? What is the best way to optimize queries?
For example, this is my latest query that I'm trying to figure out why it takes forever (generated from django models) :
+----+-------------+---------------------+--------+-----------------------------------------------------------+---------------------------------+---------+--------------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+---------------------+--------+-----------------------------------------------------------+---------------------------------+---------+--------------------------------------+------+---------------------------------+
| 1 | SIMPLE | T6 | ref | yourock_achiever_achievement_id,yourock_achiever_alias_id | yourock_achiever_alias_id | 4 | const | 244 | Using temporary; Using filesort |
| 1 | SIMPLE | T5 | eq_ref | PRIMARY | PRIMARY | 4 | paul.T6.achievement_id | 1 | Using index |
| 1 | SIMPLE | T4 | ref | yourock_achiever_achievement_id,yourock_achiever_alias_id | yourock_achiever_achievement_id | 4 | paul.T6.achievement_id | 298 | |
| 1 | SIMPLE | yourock_alias | eq_ref | PRIMARY | PRIMARY | 4 | paul.T4.alias_id | 1 | Using index |
| 1 | SIMPLE | yourock_achiever | ref | yourock_achiever_achievement_id,yourock_achiever_alias_id | yourock_achiever_alias_id | 4 | paul.T4.alias_id | 152 | |
| 1 | SIMPLE | yourock_achievement | eq_ref | PRIMARY | PRIMARY | 4 | paul.yourock_achiever.achievement_id | 1 | |
+----+-------------+---------------------+--------+-----------------------------------------------------------+---------------------------------+---------+--------------------------------------+------+---------------------------------+
6 rows in set (0.00 sec)
I had hoped to learn enough about mysql explain that the query wouldn't be needed. Alas, it seems that you can't get enough information from the explain statement and you need the raw SQL. Query :
SELECT `yourock_achievement`.`id`,
`yourock_achievement`.`modified`,
`yourock_achievement`.`created`,
`yourock_achievement`.`string_id`,
`yourock_achievement`.`owner_id`,
`yourock_achievement`.`name`,
`yourock_achievement`.`description`,
`yourock_achievement`.`owner_points`,
`yourock_achievement`.`url`,
`yourock_achievement`.`remote_image`,
`yourock_achievement`.`image`,
`yourock_achievement`.`parent_achievement_id`,
`yourock_achievement`.`slug`,
`yourock_achievement`.`true_points`
FROM `yourock_achievement`
INNER JOIN
`yourock_achiever`
ON `yourock_achievement`.`id` = `yourock_achiever`.`achievement_id`
INNER JOIN
`yourock_alias`
ON `yourock_achiever`.`alias_id` = `yourock_alias`.`id`
INNER JOIN
`yourock_achiever` T4
ON `yourock_alias`.`id` = T4.`alias_id`
INNER JOIN
`yourock_achievement` T5
ON T4.`achievement_id` = T5.`id`
INNER JOIN
`yourock_achiever` T6
ON T5.`id` = T6.`achievement_id`
WHERE
T6.`alias_id` = 6
ORDER BY
`yourock_achievement`.`modified` DESC
Paul:
eq_ref
One row is read from this table for each combination of rows from the previous tables. Other than the system and const types, this is the best possible join type. It is used when all parts of an index are used by the join and the index is a PRIMARY KEY or UNIQUE index.
eq_ref can be used for indexed columns that are compared using the = operator. The comparison value can be a constant or an expression that uses columns from tables that are read before this table. In the following examples, MySQL can use an eq_ref join to process ref_table:
SELECT * FROM ref_table,other_table
WHERE ref_table.key_column=other_table.column;
SELECT * FROM ref_table,other_table
WHERE ref_table.key_column_part1=other_table.column
AND ref_table.key_column_part2=1;
ref
All rows with matching index values are read from this table for each combination of rows from the previous tables. ref is used if the join uses only a leftmost prefix of the key or if the key is not a PRIMARY KEY or UNIQUE index (in other words, if the join cannot select a single row based on the key value). If the key that is used matches only a few rows, this is a good join type.
ref can be used for indexed columns that are compared using the = or <=> operator. In the following examples, MySQL can use a ref join to process ref_table:
SELECT * FROM ref_table WHERE key_column=expr;
SELECT * FROM ref_table,other_table
WHERE ref_table.key_column=other_table.column;
SELECT * FROM ref_table,other_table
WHERE ref_table.key_column_part1=other_table.column
AND ref_table.key_column_part2=1;
These are copied verbatim from the MySQL manual: http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
If you could post your query that is taking forever, I could help pinpoint what is slowing it down. Also, please specify what your definition of forever is. Also, if you could provide your "SHOW CREATE TABLE xxx;" statements for these tables, I could help in optimizing your query as much as possible.
What jumps out at me immediately as a possible point of improvement is the "Using temporary; Using filesort;". This means that a temporary table was created to satisfy the query (not necessarily a bad thing), and that the GROUP BY/ORDER BY you designated could not be retrieved from an index, thus resulting in a filesort.
Your query seems to process (244 * 298 * 152) = 11,052,224 records, which according to Using temporary; Using filesort need to be sorted.
This can take a long time.
If you post your query here, we probably will be able to optimize it somehow.
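The estimate above is simply the product of the rows column of the EXPLAIN output, taken in plan order. A minimal sketch of that arithmetic (it is the optimizer's estimate of the nested-loop fan-out, not a measured row count):

```python
from math import prod

# "rows" column of the EXPLAIN output above, in plan order:
# T6, T5, T4, yourock_alias, yourock_achiever, yourock_achievement
rows_per_step = [244, 1, 298, 1, 152, 1]

estimated_rows = prod(rows_per_step)
print(f"{estimated_rows:,}")
```

The eq_ref steps contribute a factor of 1 each, so only the three ref steps drive the total.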
Update:
Your query indeed does a number of nested loops and seems to yield lots of values which then need to be sorted.
Could you please run the following query:
SELECT COUNT(*)
FROM `yourock_achievement`
INNER JOIN
`yourock_achiever`
ON `yourock_achievement`.`id` = `yourock_achiever`.`achievement_id`
INNER JOIN
`yourock_alias`
ON `yourock_achiever`.`alias_id` = `yourock_alias`.`id`
INNER JOIN
`yourock_achiever` T4
ON `yourock_alias`.`id` = T4.`alias_id`
INNER JOIN
`yourock_achievement` T5
ON T4.`achievement_id` = T5.`id`
INNER JOIN
`yourock_achiever` T6
ON T5.`id` = T6.`achievement_id`
WHERE
T6.`alias_id` = 6