My query is taking around 2,800 seconds to return its output.
We have indexes as well, but no luck.
My target is to get the output within 2 to 3 seconds.
If possible, please rewrite the query.
query:
select ttl.id, ttl.url, ttl.canonical_url_id
from t_target_url ttl
where ttl.own_domain_id=476 and ttl.type != 10
order by ttl.week_entrances desc
limit 550000;
Explain Plan:
+----+-------------+-------+------+--------------------------------+---------------------------+---------+-------+----------+-----------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+--------------------------------+---------------------------+---------+-------+----------+-----------------------------+
| 1 | SIMPLE | ttl | ref | own_domain_id_type_status,type | own_domain_id_type_status | 5 | const | 57871959 | Using where; Using filesort |
+----+-------------+-------+------+--------------------------------+---------------------------+---------+-------+----------+-----------------------------+
1 row in set (0.80 sec)
mysql> show create table t_target_url\G
*************************** 1. row ***************************
Table: t_target_url
Create Table: CREATE TABLE `t_target_url` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`own_domain_id` int(11) DEFAULT NULL,
`url` varchar(2000) NOT NULL,
`create_date` datetime DEFAULT NULL,
`friendly_name` varchar(255) DEFAULT NULL,
`section_name_id` int(11) DEFAULT NULL,
`type` int(11) DEFAULT NULL,
`status` int(11) DEFAULT NULL,
`week_entrances` int(11) DEFAULT NULL COMMENT 'last 7 days entrances',
`week_bounces` int(11) DEFAULT NULL COMMENT 'last 7 days bounce',
`canonical_url_id` int(11) DEFAULT NULL COMMENT 'the primary URL ID, NOT allow canonical of canonical',
KEY `id` (`id`),
KEY `urlindex` (`url`(255)),
KEY `own_domain_id_type_status` (`own_domain_id`,`type`,`status`),
KEY `canonical_url_id` (`canonical_url_id`),
KEY `type` (`type`,`status`)
) ENGINE=InnoDB AUTO_INCREMENT=227984392 DEFAULT CHARSET=utf8
/*!50100 PARTITION BY RANGE (`type`)
(PARTITION p0 VALUES LESS THAN (0) ENGINE = InnoDB,
PARTITION p1 VALUES LESS THAN (1) ENGINE = InnoDB,
PARTITION p2 VALUES LESS THAN (2) ENGINE = InnoDB,
PARTITION pEOW VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)
Your query itself looks fine; however, the ORDER BY clause, combined with a possible half-million returned rows, is probably your killer. I would add an index to help optimize that portion via
( own_domain_id, week_entrances, type )
This way, you first hit your critical key "own_domain_id" and then get everything already in order. The type column is only there for the != 10 filter; since any other type qualifies, putting type in the second index position would appear to cause more problems.
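If it helps, a minimal sketch of adding that index (the index name is just an illustration):
-- Assumed index name; own_domain_id serves the equality filter,
-- week_entrances serves the ORDER BY ... DESC, and type lets the
-- != 10 check be evaluated from the index itself.
ALTER TABLE t_target_url
    ADD INDEX idx_domain_week_type (own_domain_id, week_entrances, type);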
Comment Feedback.
For simplistic purposes, your critical key per the WHERE clause is "ttl.own_domain_id=476"; you only care about data for domain ID 476. Now, let's assume you have 15 "types" that span all different week entrances, such as
own_domain_id type week_entrances
476 1 1000
476 1 1700
476 1 850
476 2 15000
476 2 4250
476 2 12000
476 7 2500
476 7 5300
476 10 1250
476 10 4100
476 12 8000
476 12 3150
476 15 5750
476 15 27000
This obviously is not to the scale of your half-million rows, but it shows sample data.
With type != 10, the engine STILL has to blow through all the records for own_domain_id = 476 and exclude only those with type = 10; it then has to sort all that data by week entrances, which takes more time. With week_entrances in the second position of the key, THEN the type, the result set can come back from the index already in the proper order, and whenever a type of 10 is encountered it is simply skipped. Here is the revised index data for the sample above.
own_domain_id week_entrances type
476 850 1
476 1000 1
476 1250 10
476 1700 1
476 2500 7
476 3150 12
476 4100 10
476 4250 2
476 5300 7
476 5750 15
476 8000 12
476 12000 2
476 15000 2
476 27000 15
So, as you can see, the data is already pre-sorted per the index, and applying DESCENDING order is no problem for the engine: it just pulls the records in reverse order and skips the 10s as they are found.
Does that help?
Additional comment feedback per Salman.
Think of this another way: a store with 10 different branch locations, each with its own sales. The transaction receipts are stored in boxes (literally). Think of how you would want to go through the boxes if you were looking for all transactions on a given date.
Box 1 = Store #1 only, and transactions sorted by date
Box 2 = Store #2 only, and transactions sorted by date
Box ...
Box 10 = Store #10 only, sorted by date.
You have to go through 10 boxes, pulling out everything for a given date... or, per the original question, every transaction EXCEPT those for one date, and you want them in order by the dollar amount of the transaction, regardless of date... What a mess that could be.
If instead you had the boxes pre-sorted by amount, regardless of store:
Box 1 = Sales from $1 - $1000 (all properly sorted by amount)
Box 2 = Sales from $1001 - $2000 (properly sorted)
Box ...
Box 10... same...
You STILL have to go through all the boxes, but the transactions are already in amount order, and as you look through them you can simply throw out the ones for the excluded date.
Indexes help pre-organize how the engine can best go through them for your criteria.
Related
I have a very simple query that is running extremely slowly despite being indexed.
My table is as follows:
mysql> show create table mytable
CREATE TABLE `mytable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`start_time` datetime DEFAULT NULL,
`status` varchar(64) DEFAULT NULL,
`user_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `ix_status_user_id_start_time` (`status`,`user_id`,`start_time`),
### other columns and indices, not relevant
) ENGINE=InnoDB AUTO_INCREMENT=115884841 DEFAULT CHARSET=utf8
Then the following query takes more than 10 seconds to run:
select id from mytable USE INDEX (ix_status_user_id_start_time) where status = 'running';
There are about 7 million rows in the table, and approximately 200 rows have status 'running'.
I would expect this query to take less than a tenth of a second: it should find the first row in the index with status 'running', then scan the next ~200 rows until it reaches the first non-running row. It should not need to look outside the index.
When I explain the query I get a very strange result:
mysql> explain select id from mytable USE INDEX (ix_status_user_id_start_time) where status =
'running';
+----+-------------+---------+------------+------+------------------------------+------------------------------+---------+-------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------+------------+------+------------------------------+------------------------------+---------+-------+---------+----------+-------------+
| 1 | SIMPLE | mytable | NULL | ref | ix_status_user_id_start_time | ix_status_user_id_start_time | 195 | const | 2118793 | 100.00 | Using index |
+----+-------------+---------+------------+------+------------------------------+------------------------------+---------+-------+---------+----------+-------------+
It is estimating a scan of more than 2 million rows! Also, the cardinality of the status index does not seem correct. There are only about 5 or 6 different statuses, not 344.
Other info
There are somewhat frequent insertions and updates to this table. About 2 rows inserted per second, and 10 statuses updated per second. I don't know how much impact this has, but I would not expect it to be 30 seconds worth.
If I query by both status and user_id, sometimes it is fast (sub 0.1s) and sometimes it is slow (> 1s), depending on the user_id. This does not seem to depend on the size of the result set (some users with 20 rows are quick, others with 4 are slow)
Can anybody explain what is going on here and how it can be fixed?
I am using mysql version 5.7.33
As already mentioned in the comments, you are using many indexes on a big table, so the memory required for these indexes is very high.
You can increase the memory available for caching those indexes by raising innodb_buffer_pool_size in my.cnf.
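For example (the figure below is only an illustration; size it to your server's available RAM and workload):
# my.cnf, [mysqld] section -- example value, not a recommendation
innodb_buffer_pool_size = 2G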
But it is probably more efficient to use fewer indexes and to avoid combined indexes unless they are absolutely needed.
My guess is that if you remove all the indexes and create only one on status, this query will run in under 1 s.
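A minimal sketch of that suggestion (the index name ix_status is an assumption):
ALTER TABLE mytable ADD INDEX ix_status (status);
-- then the original query, with no index hint needed:
SELECT id FROM mytable WHERE status = 'running';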
I need some help figuring out a performance issue. A database containing a single table with a growing number of METARs (aviation weather reports) slows down once about 8 million records are present, despite indexes being in use. Performance can be recovered by rebuilding the indexes, but that is really slow and takes the database offline, so I've resorted to just dropping the table and recreating it (losing the last few weeks of data).
The behaviour is the same whether a query is run trying to retrieve an actual metar, or whether a simple select count(*) is executed.
The table creation syntax is as follows:
CREATE TABLE `metars` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`tstamp` timestamp NULL DEFAULT NULL,
`metar` varchar(255) DEFAULT NULL,
`icao` char(7) DEFAULT NULL,
`qnh` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `timestamp` (`tstamp`),
KEY `icao` (`icao`),
KEY `qnh` (`qnh`),
KEY `metar` (`metar`)
) ENGINE=InnoDB AUTO_INCREMENT=812803050 DEFAULT CHARSET=latin1;
Up to about 8 million records, a SELECT COUNT(*) returns in about 500 ms. Then it gradually increases; currently, at 14 million records again, the count takes between 3 and 30 seconds. I was surprised to see that, when explaining the count query, it uses the timestamp index rather than the primary key. Using the primary key, this should be a matter of just a few milliseconds to return the number of records:
mysql> explain select count(*) from metars;
+----+-------------+--------+-------+---------------+-----------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------+---------------+-----------+---------+------+----------+-------------+
| 1 | SIMPLE | metars | index | NULL | timestamp | 5 | NULL | 14693048 | Using index |
+----+-------------+--------+-------+---------------+-----------+---------+------+----------+-------------+
1 row in set (0.00 sec)
Forcing it to use the primary index is even slower:
mysql> select count(*) from metars use index(PRIMARY);
+----------+
| count(*) |
+----------+
| 14572329 |
+----------+
1 row in set (37.87 sec)
Oddly, the typical use-case query, which gets the weather for an airport nearest to a specific point in time, continues to perform very well despite being more complex than a simple count:
mysql> SELECT qnh, metar from metars WHERE icao like 'KLAX' ORDER BY ABS(TIMEDIFF(tstamp, STR_TO_DATE('2019-10-10 00:00:00', '%Y-%m-%d %H:%i:%s'))) LIMIT 0,1;
+------+-----------------------------------------------------------------------------------------+
| qnh | metar |
+------+-----------------------------------------------------------------------------------------+
| 2980 | KLAX 092353Z 25012KT 10SM FEW015 20/14 A2980 RMK AO2 SLP091 T02000139 10228 20200 56007 |
+------+-----------------------------------------------------------------------------------------+
1 row in set (0.01 sec)
What am I doing wrong here?
InnoDB performs a plain COUNT(*) by traversing some index. It prefers the smallest index because that will require touching the least number of blocks.
The PRIMARY KEY is clustered with the data, so that index is actually the biggest.
What version are you using? TIMESTAMP changed at some point. Perhaps that explains why tstamp is used instead of qnh.
If you are purging old data by using DELETE, see http://mysql.rjweb.org/doc.php/partitionmaint for a faster way.
I assume the data is static; that is, it is never UPDATEd? Consider building and maintaining a summary table, perhaps indexed by date. This could have various counts for each day. Then a fetch from that table would be much faster than hitting the raw data. More: http://mysql.rjweb.org/doc.php/summarytables
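A rough sketch of such a summary table, assuming a daily per-station count is what's wanted (all names here are hypothetical):
CREATE TABLE metars_daily_summary (
    day DATE NOT NULL,
    icao CHAR(7) NOT NULL,
    report_count INT UNSIGNED NOT NULL,
    PRIMARY KEY (day, icao)
) ENGINE=InnoDB;

-- Maintenance step, run once per finished day (here for 2019-10-09)
INSERT INTO metars_daily_summary (day, icao, report_count)
SELECT DATE(tstamp), icao, COUNT(*)
FROM metars
WHERE tstamp >= '2019-10-09' AND tstamp < '2019-10-10'
GROUP BY DATE(tstamp), icao;

-- Totals then come from the small table instead of the 14M-row one
SELECT SUM(report_count) FROM metars_daily_summary;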
How many rows are there for KLAX? That query must fetch all of them in order to evaluate the TIMEDIFF before doing the LIMIT. If you had INDEX(icao, tstamp), you could find the nearest report before or after a given time even faster.
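A sketch of that index and the kind of lookup it speeds up, using the columns from the schema above:
ALTER TABLE metars ADD INDEX icao_tstamp (icao, tstamp);

-- Nearest report at or before a given time, resolved from the index
SELECT qnh, metar
FROM metars
WHERE icao = 'KLAX' AND tstamp <= '2019-10-10 00:00:00'
ORDER BY tstamp DESC
LIMIT 1;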
Good afternoon, all. I am coming to you in the hope that you can provide some direction with a MySQL optimization problem I am having. First, a few system specifications.
MYSQL version: 5.2.47 CE
WampServer v 2.2
Computer:
Samsung QX410 (laptop)
Windows 7
Intel i5 (2.67 Ghz)
4GB RAM
I have two tables:
“Delta_Shares” contains stock trade data, and contains two columns of note. “Ticker” is Varchar(45), “Date_Filed” is Date. This table has about 3 million rows (all unique). I have an index on this table “DeltaSharesTickerDateFiled” on (Ticker, Date_Filed).
“Stock_Data” contains two columns of note. “Ticker” is Varchar(45), “Value_Date” is Date. This table has about 19 million rows (all unique). I have an index on this table “StockDataIndex” on (Ticker, Value_Date).
I am attempting to update the “Delta_Shares” table by looking up information from the Stock_Data table. The following query takes more than 4 hours to run.
update delta_shares A, stock_data B
set A.price_at_file = B.stock_close
where A.ticker = B.ticker
and A.date_filed = B.value_Date;
Is the excessive runtime the natural result of the large number of rows, poor indexing, a bad machine, bad SQL writing, or all of the above? Please let me know if any additional information would be useful (I am not overly familiar with MySQL, though this issue has moved me significantly down the path of optimization). I greatly appreciate any thoughts or suggestions.
UPDATED with "EXPLAIN SELECT"
1(id) SIMPLE(seltype) A(table) ALL(type) DeltaSharesTickerDateFiled(possible_keys) ... 3038011(rows)
1(id) SIMPLE(seltype) B(table) ref(type) StockDataIndex(possible_keys) StockDataIndex(key) 52(key_len) 13ffeb2013.A.ticker,13ffeb2013.A.date_filed(ref) 1(rows) Using where
UPDATED with table describes.
Stock_Data Table:
idstock_data int(11) NO PRI auto_increment
ticker varchar(45) YES MUL
value_date date YES
stock_close decimal(10,2) YES
Delta_Shares Table:
iddelta_shares int(11) NO PRI auto_increment
cik int(11) YES MUL
ticker varchar(45) YES MUL
date_filed_identify int(11) YES
Price_At_File decimal(10,2) YES
delta_shares int(11) YES
date_filed date YES
marketcomparable varchar(45) YES
market_comparable_price decimal(10,2) YES
industrycomparable varchar(45) YES
industry_comparable_price decimal(10,2) YES
Index from Delta_Shares:
delta_shares 0 PRIMARY 1 iddelta_shares A 3095057 BTREE
delta_shares 1 DeltaIndex 1 cik A 18 YES BTREE
delta_shares 1 DeltaIndex 2 date_filed_identify A 20633 YES BTREE
delta_shares 1 DeltaSharesAllIndex 1 cik A 18 YES BTREE
delta_shares 1 DeltaSharesAllIndex 2 ticker A 619011 YES BTREE
delta_shares 1 DeltaSharesAllIndex 3 date_filed_identify A 3095057 YES BTREE
delta_shares 1 DeltaSharesTickerDateFiled 1 ticker A 11813 YES BTREE
delta_shares 1 DeltaSharesTickerDateFiled 2 date_filed A 3095057 YES BTREE
Index from Stock_Data:
stock_data 0 PRIMARY 1 idstock_data A 18683114 BTREE
stock_data 1 StockDataIndex 1 ticker A 14676 YES BTREE
stock_data 1 StockDataIndex 2 value_date A 18683114 YES BTREE
There are a few benchmarks you could run to see where the bottleneck is. For example, try updating the field to a constant value and see how long it takes (obviously, you'll want to do this on a copy of the database). Then try a select query that doesn't update anything but just selects the values to be updated and the values they will be updated to.
Benchmarks like these will usually tell you whether you're wasting your time trying to optimize or whether there is much room for improvement.
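Concretely, those two benchmarks might look something like this, run against a copy of the data (the constant 0 is arbitrary):
-- Benchmark 1: write cost only -- update to a constant, no lookup involved
UPDATE delta_shares SET price_at_file = 0;

-- Benchmark 2: lookup cost only -- select what would be updated, and to what
SELECT A.iddelta_shares, B.stock_close
FROM delta_shares A
JOIN stock_data B
  ON A.ticker = B.ticker
 AND A.date_filed = B.value_date;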
As for the memory, here's a rough idea of what you're looking at:
varchar fields are 2 bytes plus actual length and datetime fields are 8 bytes. So let's make an extremely liberal guess that your varchar fields in the Stock_Data table average around 42 bytes. With the datetime field that adds up to 50 bytes per row.
50 bytes x 20 million rows = .93 gigabytes
So if this process is the only thing going on in your machine then I don't see memory as being an issue since you can easily fit all the data from both tables that the query is working with in memory at one time. But if there are other things going on then it might be a factor.
Try ANALYZE TABLE on both tables and use a STRAIGHT_JOIN instead of the implicit join. Just a guess, but it sounds like a confused optimiser.
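A sketch of both suggestions:
ANALYZE TABLE delta_shares;
ANALYZE TABLE stock_data;

-- STRAIGHT_JOIN fixes the join order as written: drive from delta_shares
-- and look up each row in stock_data via StockDataIndex
UPDATE delta_shares A
STRAIGHT_JOIN stock_data B
    ON A.ticker = B.ticker
   AND A.date_filed = B.value_date
SET A.price_at_file = B.stock_close;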
In my database there is a table as below:
+----+-----------+----------+
| id | fk_id | duration |
+----+-----------+----------+
| 1 | 23 | 00:00:31 |
| 2 | 23 | 00:00:36 |
| 3 | 677 | 00:00:36 |
| 4 | 678 | 00:00:36 |
+----+-----------+----------+
In the above table, the duration column is of data type TIME.
The following is the schema of that table:
Create Table: CREATE TABLE `durationof` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`fk_id` int(11) NOT NULL,
`duration` time NOT NULL,
PRIMARY KEY (`id`))
The thing is, I just want to add up all the times in the duration column in one query. How can I do that?
Just as there is a SUM function, is there any function to add up MySQL TIME values?
I tried select addtime(duration) from durationof;
but that is not working.
I completely disagree with Corbin's original answer. As documented under The TIME Type:
TIME values may range from '-838:59:59' to '838:59:59'. The hours part may be so large because the TIME type can be used not only to represent a time of day (which must be less than 24 hours), but also elapsed time or a time interval between two events (which may be much greater than 24 hours, or even negative).
To take the summation of all such intervals: convert to seconds, take the sum and then convert back again.
SELECT SEC_TO_TIME(SUM(TIME_TO_SEC(duration))) FROM durationof
See it on sqlfiddle.
Instead of converting to seconds to do the sum, I would be tempted to just store the time in seconds.
In particular, instead of storing a TIME, you could store some type of integer (INT, SMALLINT, BIGINT, etc.). You would identify your smallest unit of measure and store values in that unit.
For example, if you care about precision down to seconds, store the durations in seconds: you might store 45 for 45 seconds. If you cared about milliseconds, you would treat the data as milliseconds; in other words, 45000 would be stored for 45 seconds.
Then you're back to summing normally.
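A minimal sketch of the integer-seconds approach (the table and column names here are hypothetical):
CREATE TABLE durationof_secs (
    id INT NOT NULL AUTO_INCREMENT,
    fk_id INT NOT NULL,
    duration_seconds INT NOT NULL,  -- e.g. 45 means 45 seconds
    PRIMARY KEY (id)
);

-- Summing is then plain integer arithmetic; convert only for display
SELECT SEC_TO_TIME(SUM(duration_seconds)) FROM durationof_secs;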
Alternatively, if you want to stick with TIME, go with eggyval's answer.
Your specific circumstances will probably dictate whether TIME or an integer is better to work with.
I am going through the slow query log to try to determine why some of the queries behave erratically. For the sake of consistency, the queries are not cached and flushing was done to clear system cache before running the test. The query goes something like this:
SELECT P.id, P.name, P.lat, P.lng, P.price * E.rate AS 'ask' FROM Property P
INNER JOIN Exchange E ON E.currency = P.currency
WHERE P.floor_area >= k?
AND P.closing_date >= CURDATE() // this and key_buffer_size=0 prevents caching
AND P.type ='c'
AND P.lat BETWEEN v? AND v?
AND P.lng BETWEEN v? AND v?
AND P.price * E.rate BETWEEN k? AND k?
ORDER BY P.floor_area DESC LIMIT 100;
The k? are user-defined constant values; the v? are variables that change as the user drags or zooms on a map. 100 results are pulled from the table, sorted by floor area in descending order.
Only a PRIMARY KEY on id and an INDEX on floor_area are set up; no other index is created, so that MySQL will consistently use floor_area as the only key. The query times and rows examined are recorded as follows:
query number 1 2 3 4 5 6 7 8 9 10
user action on map start > + + < ^ + > v +
time in seconds 138 0.21 0.43 32.3 0.12 0.12 36.3 4.33 0.33 2.00
rows examined ('000) 43 43 43 60 43 43 111 139 133 176
The query execution plan is as follows:
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+
| 1 | SIMPLE | P | range | id_flA | id_flA | 3 | NULL | 4223660 | Using where |
| 1 | SIMPLE | E | eq_ref | PRIMARY | PRIMARY | 3 | BuySell.P.currency | 1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+--------------------+---------+-------------+
The test was performed a few times and the results are quite consistent with the above. What could be the reason(s) for the spikes in query times in queries 4 and 7, and how do I bring them down?
UPDATE:
Results of removing ORDER BY as suggested by Digital Precision:
query number 1 2 3 4 5 6 7 8 9 10
user action on map start > + + < ^ + > v +
time in seconds 255 3.10 3.16 3.08 3.18 3.21 3.32 3.18 3.17 3.80
rows examined ('000) 131 131 131 131 136 136 136 136 136 157
The query execution plan is the same as above, though it behaves more like a table scan. Note that I am using the MyISAM engine, MySQL version 5.5.14.
AS requested, below is schema:
| Property | CREATE TABLE `Property` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`type` char(1) NOT NULL DEFAULT '',
`lat` decimal(6,4) NOT NULL DEFAULT '0.0000',
`lng` decimal(7,4) NOT NULL DEFAULT '0.0000',
`floor_area` mediumint(8) unsigned NOT NULL DEFAULT '0',
`currency` char(3) NOT NULL DEFAULT '',
`price` int(10) unsigned NOT NULL DEFAULT '0',
`closing_date` date NOT NULL DEFAULT '0000-00-00',
`name` char(25) NOT NULL DEFAULT '',
PRIMARY KEY (`id`),
KEY `id_flA` (`floor_area`)
) ENGINE=MyISAM AUTO_INCREMENT=5000000 DEFAULT CHARSET=latin1
| Exchange | CREATE TABLE `Exchange` (
`currency` char(3) NOT NULL,
`rate` decimal(11,10) NOT NULL DEFAULT '0.0000000000',
PRIMARY KEY (`currency`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
2ND UPDATE:
I thought it would be appropriate to post the non-default parameters in the my.cnf configuration file, since two of the answerers mention the parameters:
max_heap_table_size = 1300M
key_buffer_size = 0
read_buffer_size = 1300M
read_rnd_buffer_size = 1024M
sort_buffer_size = 1300M
I have 2GB of RAM on my test server.
I think I figured out the reason for the spikes. Here is how it went:
First I created the tables and loaded some randomly generated data into them.
Here is my query:
SELECT SQL_NO_CACHE P.id, P.name, P.lat, P.lng, P.price * E.rate AS 'ask'
FROM Property P
INNER JOIN Exchange E ON E.currency = P.currency
WHERE P.floor_area >= 2000
AND P.closing_date >= CURDATE()
AND P.type ='c'
AND P.lat BETWEEN 12.00 AND 22.00
AND P.lng BETWEEN 10.00 AND 20.00
AND P.price BETWEEN 100 / E.rate AND 10000 / E.rate
ORDER BY P.floor_area DESC LIMIT 100;
And here is the describe :
+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | P | range | id_flA | id_flA | 3 | NULL | 4559537 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | E | ALL | PRIMARY | NULL | NULL | NULL | 6 | Using where; Using join buffer |
+----+-------------+-------+-------+---------------+--------+---------+------+---------+----------------------------------------------+
It took between 3.5 and 3.9 seconds every time I queried the data (it made no difference which parameters I used). That didn't make sense, so I researched "Using join buffer".
Then I wanted to try this query without the "join buffer", so I inserted one more random row into the Exchange table:
INSERT INTO Exchange(currency, rate) VALUES('JJ', 1);
Now I ran the same SQL and it took 0.3 ~ 0.5 seconds to respond. And here is the describe:
+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+
| 1 | SIMPLE | P | range | id_flA | id_flA | 3 | NULL | 4559537 | Using where |
| 1 | SIMPLE | E | eq_ref | PRIMARY | PRIMARY | 3 | test.P.currency | 1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+-----------------+---------+-------------+
So the problem, as far as I can see, is the optimizer trying to use the "join buffer". The optimal solution would be to force the optimizer not to use the "join buffer" (which I couldn't find a way to do) or to change the "join_buffer_size" value. I solved it by adding "dummy" values to the Exchange table (so the optimizer wouldn't use the join buffer), but that's not an exact solution, just a trick to fool MySQL.
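If you want to experiment with the buffer itself rather than adding dummy rows, join_buffer_size can be changed per session; the value below is only an example:
SET SESSION join_buffer_size = 131072;  -- 128 KB, example value only
-- re-run the SELECT in this session and compare the EXPLAIN output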
Edit: I researched this "join buffer" behaviour in the MySQL forums and bug reports, then asked about it in the official forums. I am going to file a bug report about this irrational behaviour of the optimizer.
Couple of things:
Why are you calculating the product of P.price and E.rate in the SELECT and aliasing it as 'ask', then doing the calculation again in the WHERE clause? You should be able to do AND ask BETWEEN k? AND k? -- Edit: This won't work because of the way MySQL works; apparently MySQL evaluates the WHERE clause before any aliases (sourced).
What kind of index do you have on Exchange.currency and Property.currency? If exchange is a lookup table, maybe you would be better off adding a pivot (linking) table with Property.Id and Exchange.Id
The ORDER BY floor_area forces MySQL to create a temp table in order to do the sorting correctly; any chance you can do the sorting at the app layer?
Adding an index on the type column will help as well.
-- Edit
Not sure what you mean by the comment "// this and key_buffer_size=0 prevents caching" on the CURDATE conditional; you can disable SQL caching using the SQL_NO_CACHE flag on your SELECT statement.
What I would recommend, now that you have removed the ORDER BY, is to update your query statement as follows (I added the P alias to the columns to reduce any confusion):
WHERE P.type ='condominium'
AND P.floor_area >= k?
AND P.closing_date >= CURDATE() // No longer necessary with SQL_NO_CACHE
AND P.lat BETWEEN v? AND v?
AND P.lng BETWEEN v? AND v?
AND P.price * E.rate BETWEEN k? AND k?
Then add an index on the 'type' column and a composite index on the 'type' and 'floor_area' columns, as sketched below. As you stated, the type column has low cardinality, but the table is large, so it should still help. And even though floor_area appears to be a high-cardinality column, the composite index will help speed up your query times.
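A sketch of those two indexes (the index names are illustrative):
ALTER TABLE Property
    ADD INDEX idx_type (type),
    ADD INDEX idx_type_floor_area (type, floor_area);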
You may also want to research whether there is a penalty for using BETWEEN rather than range operators (>, <, <=, etc.).
Try an index on type and floor_area (and possibly closing_date too).
Modify your constants by the exchange rate instead of the price column:
P.price between ( k? / E.rate ) and ( k? / E.rate )
then try an index on price.
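Sketched out, with illustrative index names, that suggestion would be something like:
ALTER TABLE Property
    ADD INDEX idx_type_floor_closing (type, floor_area, closing_date),
    ADD INDEX idx_price (price);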
I've become a little obsessed with this question; the spike is hard to explain.
Here's what I did:
I re-created your schema, and populated the property table with 4.5 million records, with random values for the numerical and date columns. This almost certainly doesn't match your data - I'm guessing the lat/longs tend to cluster in population areas, the prices around multiples of 10K, and the floor space will be skewed towards lower-end values.
I ran your query with a range of values for lat, long, floorspace and price. With just the index on floor area, I saw that the query plan would ignore the index for some values of floor area. This was presumably because the query analyzer decided the number of records excluded by using the index was too small. However, re-running the query for a variety of different scenarios, I noticed that the query plan would ignore the index every now and again - I can't explain that.
It's always worth running ANALYZE TABLE when dealing with this kind of weirdness.
I did get slightly different "explain" results: specifically, the property table select gave 'Using where; Using temporary; Using filesort'. This suggests the index is only used for the where clause, and not to order the results.
This confirms that the most likely explanation of the performance spikes is not so much the query engine as the way the temporary table is handled and the requirement to do a filesort. In trying to reproduce this issue, I did notice that response time went up dramatically as the number of records returned by the "where" clause increased - though I didn't see the spikes you've noticed.
I've tried a variety of different indices; using all the keys in the where clause does speed up the time to retrieve the records matching the where clause, but does nothing for the subsequent order by.
This, once again, suggests it's the performance of the temporary table that's the cause of the spikes. read_rnd_buffer_size would be the obvious thing to look at.