Google Cloud SQL slow for queries Using filesort - mysql

mysql> EXPLAIN SELECT * FROM `condominio_boleto`
INNER JOIN `contrato_contrato` ON (`condominio_boleto`.`contrato_id` = `contrato_contrato`.`id`)
INNER JOIN `cadastro_imovel` ON (`contrato_contrato`.`imovel_id` = `cadastro_imovel`.`id`)
INNER JOIN `cadastro_pessoa` ON (`contrato_contrato`.`pessoa_id` = `cadastro_pessoa`.`id`)
ORDER BY `condominio_boleto`.`id` DESC LIMIT 1;
+----+-------------+-------------------+--------+---------------------------------------------------------------+----------------------------+---------+------------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+--------+---------------------------------------------------------------+----------------------------+---------+------------------------------------+------+---------------------------------+
| 1 | SIMPLE | cadastro_imovel | ALL | PRIMARY | NULL | NULL | NULL | 128 | Using temporary; Using filesort |
| 1 | SIMPLE | contrato_contrato | ref | PRIMARY,contrato_contrato_33999a20,contrato_contrato_8b5ebd9d | contrato_contrato_33999a20 | 4 | mydb.cadastro_imovel.id | 1 | |
| 1 | SIMPLE | cadastro_pessoa | eq_ref | PRIMARY | PRIMARY | 4 | mydb.contrato_contrato.pessoa_id | 1 | |
| 1 | SIMPLE | condominio_boleto | ref | condominio_boleto_91c8cd68 | condominio_boleto_91c8cd68 | 4 | mydb.contrato_contrato.id | 9 | |
+----+-------------+-------------------+--------+---------------------------------------------------------------+----------------------------+---------+------------------------------------+------+---------------------------------+
4 rows in set (0.00 sec)
This query takes 3-4 seconds to run on Google Cloud SQL (D0 instance). If I remove the ORDER BY clause, the Extra column no longer shows Using temporary; Using filesort and the query speeds up to <100ms. But because the query is auto-generated by Django admin, I can't remove that ORDER BY clause.
All these tables are really small. condominio_boleto has 5k records all other tables have less than 500 records.
Can I speed this up with indexes? Is this a known problem on Google Cloud SQL?

I had a similar experience on the Google Cloud SQL Tier D0 (128MB RAM). One of my websites was running very slowly and took a long time to return pages. After running Jet Profiler I found that my database queries were running slowly (2-3s to execute and 7 threads on average). The problem queries were those with inner joins and ORDER BY clauses. So I upgraded to Tier D1 (512MB RAM) and, as expected, no more slow queries. My guess is D0 isn't made to handle high load or complex queries. It's mostly suited for low usage and testing.

Related

JOIN performance very slow when selecting VARCHAR field

I have a difficult problem with a query, and I can't figure out why it is performing so badly.
Please see following queries and query times (using HeidiSQL):
SELECT p.TID, a.TID
FROM characters AS p JOIN account a ON p.AccountId = a.TID;
=> rows: 57.879 Query time: 0.063 sec. (+ 0.328 sec. network)
Explain:
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+--------------------------+
| 1 | SIMPLE | a | index | TID | WebAccountId | 5 | NULL | 21086 | Using index |
| 1 | SIMPLE | p | ref | AccountId | AccountId | 5 | dol.a.TID | 1 | Using where; Using index |
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+--------------------------+
This is fast but as soon as I select a VARCHAR(255) field from table characters it gets very slow. See network time.
SELECT p.TID, a.TID, p.LastName
FROM characters AS p JOIN account a ON p.AccountId = a.TID;
=> rows: 57.879 Query time: 0.219 sec. (+ 116.234 sec. network)
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+-------------+
| 1 | SIMPLE | a | index | TID | WebAccountId | 5 | NULL | 21086 | Using index |
| 1 | SIMPLE | p | ref | AccountId | AccountId | 5 | dol.a.TID | 1 | Using where |
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+-------------+
Query time is still good but network time got unbearable.
One could think that it's caused by the transfer of p.LastName, but see the query without the join:
SELECT p.TID, p.LastName
FROM characters AS p
=> rows: 57.881 Query time: 0.063 sec. (+ 0.578 sec. network)
+----+-------------+-------+------+---------------+------+---------+------+-------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------+
| 1 | SIMPLE | p | ALL | NULL | NULL | NULL | NULL | 59800 | |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------+
Any idea what is going on here? I have no idea how to fix that.
Edit:
Added the Explain output for each query.
In case it matters, it's mysql 5.1.72-community
Edit2: Tested from the command line. Same performance. If I look into the mysql process list I see Sending data for the poorly performing query. The query was originally used in an ASP.NET web application and performance was very bad. That is why I used HeidiSQL to investigate. I would definitely rule out HeidiSQL as the problem.
Edit3 Test result in Mysql Workbench:
I found out what was the culprit here. I used mysql 5.1.72 with InnoDB on default settings.
This means it used an InnoDB buffer pool of just 8MB
innodb_buffer_pool_size=8M
Mysql was forced to write the result to disk as it couldn't hold it in memory for transfer as soon as I added the VARCHAR fields to the select clause. The Join seems to have pressured the memory usage of that buffer even more.
After I changed the buffer size to 1G the problem was gone.
innodb_buffer_pool_size=1G
The first request after mysql start can still be a bit slow but subsequent queries are very fast.
So it was basically misconfiguration of the mysql server.
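To check whether a server is in the same situation, one could look at the configured buffer pool size and at how often InnoDB has to read from disk. This is a minimal sketch, assuming a MySQL server using InnoDB; the interpretation in the comments is a rule of thumb, not a hard threshold:

```sql
-- Current buffer pool size in bytes (8388608 corresponds to the 8M default of 5.1).
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Innodb_buffer_pool_read_requests counts logical reads;
-- Innodb_buffer_pool_reads counts reads that had to go to disk.
-- If the second number grows quickly relative to the first,
-- the pool is probably too small for the working set.
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';
```

On 5.1 this variable cannot be changed at runtime, so the new value goes into the [mysqld] section of the server's configuration file and takes effect after a restart.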

complex query takes too much time transferring

The following query is very slow and I don't understand why. All the id columns are indexed (some are primary keys).
SELECT r.name as tool, r.url url ,r.id_tool recId, count(*) as count, r.source as source,
group_concat(t.name) as instrument
FROM tools r
INNER JOIN
instruments_tools ifr
ON ifr.id_tool = r.id_tool
INNER JOIN
instrument t
ON t.id= ifr.id_instrument
WHERE t.id IN (433,37,362) AND t.source IN (1,2,3)
GROUP BY r.id_tool
ORDER BY count desc,rand() limit 10;
Locally on a Wampserver installation I have serious issues with transferring data. With Heidi I see two "Sending Data" phases of 2 and 6 seconds respectively.
On a shared server, this is the important part I see:
| statistics | 0.079963 |
| preparing | 0.000028 |
| Creating tmp table | 0.000037 |
| executing | 0.000005 |
| Copying to tmp table | 7.963576 |
| converting HEAP to MyISAM | 0.015790 |
| Copying to tmp table on disk | 5.383739 |
| Creating sort index | 0.015143 |
| Copying to group table | 0.023708 |
| converting HEAP to MyISAM | 0.014513 |
| Copying to group table | 0.099595 |
| Sorting result | 0.034256 |
Considering that I'd like to improve the query (see LIMIT) or remove rand() and add weights, I'm a bit afraid I'm doing something very wrong.
Additional info:
The tools table has 500.000 rows, while instruments has around 6000. instruments_tools has around 3M rows.
The query is to find which tools I can make with the instruments I have (by checking t.id IN (ids of instruments)). Group_concat(t.name) is a way to know which instrument is selected.
explain of the query:
+----+-------------+-------+--------+-------------------------+---------------+---------+----------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------+---------------+---------+----------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t | range | PRIMARY | PRIMARY | 4 | NULL | 3 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ifr | ref | id_tool,id_instrument | id_instrument | 5 | mydb2.t.id | 374 | Using where |
| 1 | SIMPLE | r | eq_ref | PRIMARY | PRIMARY | 4 | mydb2.ifr.id_tool | 1 | |
+----+-------------+-------+--------+-------------------------+---------------+---------+----------------------------+------+----------------------------------------------+
You need a compound index on the intersection table:
ALTER TABLE instruments_tools ADD KEY (id_instrument, id_tool);
The order of columns in that index is important!
What you're hoping for is that the joins will start with the instrument table, then look up the matching index entry in the compound index based on id_instrument. Once it finds that index entry, it has the related id_tool for free. So it doesn't have to read the instruments_tools table at all; it only needs to read the index entry. That should give the "Using index" comment in your EXPLAIN for the instruments_tools table.
That should help, but you can't avoid the temp table and filesort, because the columns you're grouping by and sorting by cannot make use of an index.
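One way to confirm that the compound index is picked up is to list the indexes and re-run EXPLAIN on just the join between the two tables. This is a sketch that reuses the table and column names from the question; the reduced SELECT is only for checking the plan, not a replacement for the real query:

```sql
-- The new key should appear with id_instrument as Seq_in_index 1
-- and id_tool as Seq_in_index 2.
SHOW INDEX FROM instruments_tools;

-- If the index covers the join, the Extra column for ifr
-- should include "Using index".
EXPLAIN
SELECT ifr.id_tool
FROM instrument t
INNER JOIN instruments_tools ifr ON ifr.id_instrument = t.id
WHERE t.id IN (433, 37, 362);
```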
You can try to make MySQL avoid writing the temp table to disk by increasing the size of memory it can use for temporary tables:
mysql> SET GLOBAL tmp_table_size = 256*1024*1024; -- 256MB
mysql> SET GLOBAL max_heap_table_size = 256*1024*1024; -- 256MB
That figure is just an example. I have no idea how large it would have to be for the temp table in your case.
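To judge whether the larger limits actually keep the temp table in memory, one could compare the temp table counters before and after running the query. A small sketch, assuming sufficient privileges; note that SET GLOBAL only affects connections opened afterwards, and that the effective in-memory limit is the smaller of the two settings:

```sql
-- Values in effect for the current connection.
SHOW SESSION VARIABLES LIKE 'tmp_table_size';
SHOW SESSION VARIABLES LIKE 'max_heap_table_size';

-- Created_tmp_disk_tables increasing after the query means
-- the temp table still spilled to disk.
SHOW GLOBAL STATUS LIKE 'Created_tmp%tables';
```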

Optimizing / improving a slow mysql query - indexing? reorganizing?

First off, I've looked at several other questions about optimizing sql queries, but I'm still unclear for my situation what is causing my problem. I read a few articles on the topic as well and have tried implementing a couple possible solutions, as I'll describe below, but nothing has yet worked or even made an appreciable dent in the problem.
The application is a nutrition tracking system - users enter the foods they eat and based on an imported USDA database the application breaks down the foods to the individual nutrients and gives the user a breakdown of the nutrient quantities on a (for now) daily basis.
Here's a PDF of the abbreviated database schema, and here it is as a (perhaps poor quality) JPG. I made this in open office - if there are suggestions for better ways to visualize a database, I'm open to suggestions on that front as well! The blue tables are directly from the USDA, and the green and black tables are ones I've made. I've omitted a lot of data in order to not clutter things up unnecessarily.
Here's the query I'm trying to run that takes a very long time:
SELECT listing.date_time,listing.nutrdesc,data.total_nutr_mass,listing.units
FROM
(SELECT nutrdesc, nutr_no, date_time, units
FROM meals, nutr_def
WHERE meals.users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
AND (nutr_no <100000
OR nutr_no IN
(SELECT nutr_def_nutr_no
FROM nutr_rights
WHERE nutr_rights.users_userid = '2'))
) as listing
LEFT JOIN
(SELECT nutrdesc, date_time, nut_data.nutr_no, sum(ingred_gram_mass*entry_qty_num*nutr_val/100) AS total_nutr_mass
FROM nut_data, recipe_ingredients, food_entries, meals, nutr_def
WHERE nut_data.nutr_no = nutr_def.nutr_no
AND ndb_no = ingred_ndb_no
AND foods_food_id = entry_ident
AND meals_meal_id = meal_id
AND users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
GROUP BY date_time,nut_data.nutr_no ) as data
ON data.date_time = listing.date_time
AND listing.nutr_no = data.nutr_no
ORDER BY listing.date_time,listing.nutrdesc,listing.units
So I know that's rather complex - The first select gets a listing of all the nutrients that the user consumed within the given date range, and the second fills in all the quantities.
When I implement them separately, the first query is really fast, but the second is slow and gets very slow when the date ranges get large. The join makes the whole thing ridiculously slow. I know that the 'main' problem is the join between these two derived tables, and I can get rid of that and do the join by hand basically in php much faster, but I'm not convinced that's the whole story.
For example: for 1 month of data, the query takes about 8 seconds, which is slow, but not completely terrible. Separately, each query takes ~.01 and ~2 seconds respectively. 2 seconds still seems high to me.
If I try to retrieve a year's worth of data, it takes several (>10) minutes to run the whole query, which is problematic - the client-server connection sometimes times out, and in any case I don't want to sit there with a spinning 'please wait' icon. Mainly, I feel like there's a problem because it takes more than 12x as long to retrieve 12x more information, when it should take less than 12x as long, if I were doing things right.
Here's the 'explain' for each of the slow queries: (the whole thing, and just the second half).
Whole thing:
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5053 | Using temporary; Using filesort |
| 1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 4341 | |
| 4 | DERIVED | meals | range | PRIMARY,day_ind | day_ind | 9 | NULL | 30 | Using where; Using temporary; Using filesort |
| 4 | DERIVED | food_entries | ref | meals_meal_id | meals_meal_id | 5 | nutrition.meals.meal_id | 15 | Using where |
| 4 | DERIVED | recipe_ingredients | ref | foods_food_id,ingred_ndb_no | foods_food_id | 4 | nutrition.food_entries.entry_ident | 2 | |
| 4 | DERIVED | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | |
| 4 | DERIVED | nut_data | ref | PRIMARY | PRIMARY | 36 | nutrition.nutr_def.nutr_no,nutrition.recipe_ingredients.ingred_ndb_no | 1 | |
| 2 | DERIVED | meals | range | day_ind | day_ind | 9 | NULL | 30 | Using where |
| 2 | DERIVED | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | Using where |
| 3 | DEPENDENT SUBQUERY | nutr_rights | index_subquery | users_userid,nutr_def_nutr_no | nutr_def_nutr_no | 19 | func | 1 | Using index; Using where |
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
10 rows in set (2.82 sec)
Second chunk (data):
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | meals | range | PRIMARY,day_ind | day_ind | 9 | NULL | 30 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | food_entries | ref | meals_meal_id | meals_meal_id | 5 | nutrition.meals.meal_id | 15 | Using where |
| 1 | SIMPLE | recipe_ingredients | ref | foods_food_id,ingred_ndb_no | foods_food_id | 4 | nutrition.food_entries.entry_ident | 2 | |
| 1 | SIMPLE | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | |
| 1 | SIMPLE | nut_data | ref | PRIMARY | PRIMARY | 36 | nutrition.nutr_def.nutr_no,nutrition.recipe_ingredients.ingred_ndb_no | 1 | |
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
5 rows in set (0.00 sec)
I've 'analyzed' all the tables involved in the query, and added an index on the datetime field that is joining meals and food entries. I called it 'day_ind'. I hoped that would accelerate things, but it didn't seem to make a difference. I also tried removing the 'sum' function, as I understand that having a function in the query will frequently mean a full table scan, which is obviously much slower. Unfortunately removing the 'sum' didn't seem to make a difference either (well, about 3-5% or so, but not the order of magnitude that I'm looking for).
I would love any suggestions and will be happy to provide any more information you need to help diagnose and improve this problem. Thanks in advance!
There are a few rows with type ALL in your EXPLAIN, which suggests full table scans and hence the temp tables. You could add indexes on those columns if they are not there already.
Sorting and GROUP BY are usually the performance killers; you can adjust the MySQL memory settings to avoid physical I/O to temp tables if you have extra memory available.
Lastly, try to make sure the data types of the join attributes match, i.e. that data.date_time and listing.date_time have the same format.
Hope that helps.
Okay, so I eventually figured out what I'm gonna end up doing. I couldn't make the 'data' query any faster - that's still the bottleneck. But now I've made it so the total query process is pretty close to linear, not exponential.
I split the query into two parts and made each one into a temporary table. Then I added an index for each of those temp tables and did the join separately afterwards. This made the total execution time for 1 month of data drop from 8 to 2 seconds, and for 1 year of data from ~10 minutes to ~30 seconds. Good enough for now, I think. I can work with that.
Thanks for the suggestions. Here's what I ended up doing:
create table listing (
SELECT nutrdesc, nutr_no, date_time, units
FROM meals, nutr_def
WHERE meals.users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
AND (
nutr_no <100000 OR nutr_no IN (
SELECT nutr_def_nutr_no
FROM nutr_rights
WHERE nutr_rights.users_userid = '2'
)
)
);
create table data (
SELECT nutrdesc, date_time, nut_data.nutr_no, sum(ingred_gram_mass*entry_qty_num*nutr_val/100) AS total_nutr_mass
FROM nut_data, recipe_ingredients, food_entries, meals, nutr_def
WHERE nut_data.nutr_no = nutr_def.nutr_no
AND ndb_no = ingred_ndb_no
AND foods_food_id = entry_ident
AND meals_meal_id = meal_id
AND users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
GROUP BY date_time,nut_data.nutr_no
);
create index joiner on data(nutr_no, date_time);
create index joiner on listing(nutr_no, date_time);
SELECT listing.date_time,listing.nutrdesc,data.total_nutr_mass,listing.units
FROM listing
LEFT JOIN data
ON data.date_time = listing.date_time
AND listing.nutr_no = data.nutr_no
ORDER BY listing.date_time,listing.nutrdesc,listing.units;
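The statements above create ordinary tables, which stay around until they are dropped. A possible variant, sketched here for the data table only, uses CREATE TEMPORARY TABLE and declares the index in the same statement, so the table disappears when the connection closes. This is an assumption about what would fit the setup described, not what was actually run; the query body is copied unchanged from above:

```sql
-- Clean up a leftover from an earlier run on the same connection.
DROP TEMPORARY TABLE IF EXISTS data;

-- The index definition in parentheses is applied to the columns
-- produced by the SELECT that follows.
CREATE TEMPORARY TABLE data (
  INDEX joiner (nutr_no, date_time)
)
SELECT nutrdesc, date_time, nut_data.nutr_no,
       SUM(ingred_gram_mass * entry_qty_num * nutr_val / 100) AS total_nutr_mass
FROM nut_data, recipe_ingredients, food_entries, meals, nutr_def
WHERE nut_data.nutr_no = nutr_def.nutr_no
  AND ndb_no = ingred_ndb_no
  AND foods_food_id = entry_ident
  AND meals_meal_id = meal_id
  AND users_userid = '2'
  AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
GROUP BY date_time, nut_data.nutr_no;
```

The listing table could be built the same way. Temporary tables are private to the connection, so two users running the report at the same time would not collide on the table names.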

MySQL query optimization - distinct, order by and limit

I am trying to optimize the following query:
select distinct this_.id as y0_
from Rental this_
left outer join RentalRequest rentalrequ1_
on this_.id=rentalrequ1_.rental_id
left outer join RentalSegment rentalsegm2_
on rentalrequ1_.id=rentalsegm2_.rentalRequest_id
where
this_.DTYPE='B'
and this_.id<=1848978
and this_.billingStatus=1
and rentalsegm2_.endDate between 1273631699529 and 1274927699529
order by rentalsegm2_.id asc
limit 0, 100;
This query is run multiple times in a row for paginated processing of records (with a different limit each time). It returns the ids I need for the processing. My problem is that this query takes more than 3 seconds. I have about 2 million rows in each of the three tables.
Explain gives:
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+
| 1 | SIMPLE | rentalsegm2_ | range | index_endDate,fk_rentalRequest_id_BikeRentalSegment | index_endDate | 9 | NULL | 449904 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | rentalrequ1_ | eq_ref | PRIMARY,fk_rental_id_BikeRentalRequest | PRIMARY | 8 | solscsm_main.rentalsegm2_.rentalRequest_id | 1 | Using where |
| 1 | SIMPLE | this_ | eq_ref | PRIMARY,index_billingStatus | PRIMARY | 8 | solscsm_main.rentalrequ1_.rental_id | 1 | Using where |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+----------------------------------------------+
I tried removing the distinct and the query ran three times faster. Explain without the distinct gives:
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+
| 1 | SIMPLE | rentalsegm2_ | range | index_endDate,fk_rentalRequest_id_BikeRentalSegment | index_endDate | 9 | NULL | 451972 | Using where; Using filesort |
| 1 | SIMPLE | rentalrequ1_ | eq_ref | PRIMARY,fk_rental_id_BikeRentalRequest | PRIMARY | 8 | solscsm_main.rentalsegm2_.rentalRequest_id | 1 | Using where |
| 1 | SIMPLE | this_ | eq_ref | PRIMARY,index_billingStatus | PRIMARY | 8 | solscsm_main.rentalrequ1_.rental_id | 1 | Using where |
+----+-------------+--------------+--------+-----------------------------------------------------+---------------+---------+--------------------------------------------+--------+-----------------------------+
As you can see, the Using temporary is added when using distinct.
I already have an index on all fields used in the where clause.
Is there anything I can do to optimize this query?
Thank you very much!
Edit: I tried to order by on this_.id as suggested and the query was 5x slower. Here is the explain plan:
+----+-------------+--------------+------+-----------------------------------------------------+---------------------------------------+---------+------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------+------+-----------------------------------------------------+---------------------------------------+---------+------------------------------+--------+----------------------------------------------+
| 1 | SIMPLE | this_ | ref | PRIMARY,index_billingStatus | index_billingStatus | 5 | const | 782348 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | rentalrequ1_ | ref | PRIMARY,fk_rental_id_BikeRentalRequest | fk_rental_id_BikeRentalRequest | 9 | solscsm_main.this_.id | 1 | Using where; Using index; Distinct |
| 1 | SIMPLE | rentalsegm2_ | ref | index_endDate,fk_rentalRequest_id_BikeRentalSegment | fk_rentalRequest_id_BikeRentalSegment | 8 | solscsm_main.rentalrequ1_.id | 1 | Using where; Distinct |
+----+-------------+--------------+------+-----------------------------------------------------+---------------------------------------+---------+------------------------------+--------+----------------------------------------------+
From the execution plan we see that the optimizer is smart enough to understand that you do not require OUTER JOINs here. Still, it is better to specify that explicitly.
The DISTINCT modifier means that you want to GROUP BY all fields in the SELECT part, that is, ORDER BY all of the specified fields and then discard duplicates. In other words, the order by rentalsegm2_.id asc clause does not make any sense here.
The query below should return the equivalent result:
select distinct this_.id as y0_
from Rental this_
join RentalRequest rentalrequ1_
on this_.id=rentalrequ1_.rental_id
join RentalSegment rentalsegm2_
on rentalrequ1_.id=rentalsegm2_.rentalRequest_id
where
this_.DTYPE='B'
and this_.id<=1848978
and this_.billingStatus=1
and rentalsegm2_.endDate between 1273631699529 and 1274927699529
limit 0, 100;
UPD
If you want the execution plan to start with RentalSegment, you will need to add the following indices to the database:
RentalSegment (endDate)
RentalRequest (id, rental_id)
Rental (id, DTYPE, billingStatus) or (id, billingStatus, DTYPE)
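Written out as statements, the list above might look like the following sketch. The index names are hypothetical. The EXPLAIN output in the question already shows an index_endDate key on RentalSegment, so the first index may exist already and is left commented out:

```sql
-- Probably already present as index_endDate (see the EXPLAIN output).
-- CREATE INDEX idx_segment_enddate ON RentalSegment (endDate);

-- Lets the join read rental_id from the index without touching the row.
CREATE INDEX idx_request_id_rental ON RentalRequest (id, rental_id);

-- First of the two column orders suggested above.
CREATE INDEX idx_rental_id_dtype_status ON Rental (id, DTYPE, billingStatus);
```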
The query then could be rewritten as the following:
SELECT this_.id as y0_
FROM RentalSegment rs
JOIN RentalRequest rr
JOIN Rental this_
WHERE rs.endDate between 1273631699529 and 1274927699529
AND rs.rentalRequest_id = rr.id
AND rr.rental_id <= 1848978
AND rr.rental_id = this_.id
AND this_.DTYPE='B'
AND this_.billingStatus = 1
GROUP BY this_.id
LIMIT 0, 100;
If the execution plan does not start from RentalSegment, you can force it with STRAIGHT_JOIN.
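A sketch of what forcing the join order could look like. The STRAIGHT_JOIN modifier after SELECT makes MySQL join the tables in the order they appear in the FROM clause. The join conditions are moved into ON clauses for readability, and the DTYPE value is the one from the question's original query:

```sql
SELECT STRAIGHT_JOIN this_.id AS y0_
FROM RentalSegment rs
JOIN RentalRequest rr ON rs.rentalRequest_id = rr.id
JOIN Rental this_ ON rr.rental_id = this_.id
WHERE rs.endDate BETWEEN 1273631699529 AND 1274927699529
  AND rr.rental_id <= 1848978
  AND this_.DTYPE = 'B'
  AND this_.billingStatus = 1
GROUP BY this_.id
LIMIT 0, 100;
```

Running it under EXPLAIN first shows whether rs is now the first table in the plan.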
The reason that the query without the distinct runs faster is that you have a limit clause. Without the distinct, the server only needs to look at the first hundred matches. However, some of those rows may have duplicate values, so if you introduce the distinct clause, the server has to look at many more rows in order to find a hundred that do not have duplicate values.
BTW, why are you using OUTER JOIN?
Here, for the "rentalsegm2_" table, the optimizer has chosen the "index_endDate" index, and the number of rows it expects from this table is about 4.5 lakh (450,000). Since there are other where conditions, you can check the indexes on the "this_" table. I mean you can check how many records in the "this_" table match each of its where conditions.
In summary, you can try alternative solutions by changing the indexes used by the optimizer.
This can be done with the "USE INDEX" and "FORCE INDEX" hints.
Thanks
Rinson KE
DBA
www.qburst.com

Mysql queries crawl when switching servers

I ran into a problem last week moving from dev-testing where one of my queries which had run perfectly in dev, was crawling on my testing server.
It was fixed by adding FORCE INDEX on one of the indexes in the query.
Now I've loaded the same database onto the production server (and it's running with the FORCE INDEX command), and it has slowed again.
Any idea what would cause something like this to happen? The testing and prod are both running the same OS and version of mysql (unlike the dev).
Here's the query and the explain from it.
EXPLAIN SELECT showsdate.bid, showsdate.bandid, showsdate.date, showsdate.time,
-> showsdate.title, showsdate.name, showsdate.address, showsdate.rank, showsdate.city, showsdate.state,
-> showsdate.lat, showsdate.`long` , tickets.link, tickets.lowprice, tickets.highprice, tickets.source
-> , tickets.ext, artistGenre, showsdate.img
-> FROM tickets
-> RIGHT OUTER JOIN (
-> SELECT shows.bid, shows.date, shows.time, shows.title, artists.name, artists.img, artists.rank, artists
-> .bandid, shows.address, shows.city, shows.state, shows.lat, shows.`long`, GROUP_CONCAT(genres.genre SEPARATOR
-> ' | ') AS artistGenre
-> FROM shows FORCE INDEX (biddate_idx)
-> JOIN artists ON shows.bid = artists.bid JOIN genres ON artists.bid=genres.bid
-> WHERE `long` BETWEEN -74.34926984058 AND -73.62463215942 AND lat BETWEEN 40.39373515942 AND 41.11837284058
-> AND shows.date >= '2009-03-02' GROUP BY shows.bid, shows.date ORDER BY shows.date, artists.rank DESC
-> LIMIT 0, 30
-> )showsdate ON showsdate.bid = tickets.bid AND showsdate.date = tickets.date;
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 30 | |
| 1 | PRIMARY | tickets | ref | biddate_idx | biddate_idx | 7 | showsdate.bid,showsdate.date | 1 | |
| 2 | DERIVED | genres | index | bandid_idx | bandid_idx | 141 | NULL | 531281 | Using index; Using temporary; Using filesort |
| 2 | DERIVED | shows | ref | biddate_idx | biddate_idx | 4 | activeHW.genres.bid | 5 | Using where |
| 2 | DERIVED | artists | eq_ref | bid_idx | bid_idx | 4 | activeHW.genres.bid | 1 | |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
I think I chimed in when you asked this question about the differences in dev -> test.
Have you tried rebuilding the indexes and recalculating statistics? Generally, forcing an index is a bad idea as the optimizer usually makes good choices as to which indexes to use. However, that assumes that it has good statistics to work from and that the indexes aren't seriously fragmented.
ETA:
To rebuild indexes, use:
REPAIR TABLE tbl_name QUICK;
To recalculate statistics:
ANALYZE TABLE tbl_name;
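Applied to the tables in this question, that could look like the sketch below. One caveat: REPAIR TABLE is supported by MyISAM but not by InnoDB, so it is worth checking the storage engine first. The question does not say which engine is in use, so the rebuild statements are left commented out:

```sql
-- The Engine column shows MyISAM or InnoDB.
SHOW TABLE STATUS LIKE 'shows';

-- Refresh the index statistics the optimizer relies on.
ANALYZE TABLE shows, artists, genres, tickets;

-- The Cardinality column should now look plausible for each index.
SHOW INDEX FROM shows;

-- Rebuild, depending on the engine found above:
-- REPAIR TABLE shows QUICK;          -- MyISAM only
-- ALTER TABLE shows ENGINE=InnoDB;   -- rebuilds a table that is already InnoDB
```

After that, comparing EXPLAIN with and without the FORCE INDEX hint on both servers shows whether the optimizer now picks biddate_idx by itself.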
Does the test server have only 10 records and the production server 1000000000 records?
That might also cause different execution times.
Are the two servers configured the same? It sounds like you might be crossing a "tipping point" in MySQL's performance. I'd compare the MySQL configurations; there might be a memory parameter way different.
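A quick way to compare the two servers is to run the same statements on each and diff the output. The variable list below is only a guess at the settings most likely to matter for a query with joins, grouping and sorting:

```sql
-- Confirm both servers really run the same version.
SELECT VERSION();

-- Memory-related settings that commonly differ between installations.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN (
  'key_buffer_size',
  'innodb_buffer_pool_size',
  'tmp_table_size',
  'max_heap_table_size',
  'sort_buffer_size',
  'join_buffer_size',
  'read_rnd_buffer_size'
);
```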