I took over a project written in Laravel 4. We have MySQL 5.6.21 and PHP 5.4.30, currently running on Windows 8.1.
Every morning, on the first attempt to access the landing page (which triggers about 5 queries on the backend), the site crashes with a PHP timeout (over 30 seconds for a response).
Using the approach from "Laravel 4 - logging SQL queries", I got closer to the cause: one of the queries takes more than 25 seconds on the first call. After that it's always < 0.5 seconds.
The query has 3 joins and 2 subselects, wrapped in Cache::remember. I want to optimize it so that production won't run into this problem, which means testing different versions of the SQL.
The problem is that after the first run the data gets cached somehow, so I can't tell whether my new SQL is actually better or not.
Since I suspect a caching issue (the first attempt is slow, subsequent ones are not), I tried the following:
MySQL: FLUSH TABLES;
restart MySQL
restart Apache
php artisan cache:clear
But the query still runs fast. Then, after some period with no database access at all (I can't give an exact time; maybe 4 hours of inactivity), it happens again.
EXPLAIN says (output abbreviated):
1 | PRIMARY | table1 | ALL | 2 possible keys | NULL | ... | 1010000 | Using where; Using temporary; Using filesort
1 | PRIMARY | table2 | eq_ref | PRIMARY | PRIMARY | ... | 1 | Using where; Using index
1 | PRIMARY | table3 | eq_ref | PRIMARY | PRIMARY | ... | 1 | Using where; Using index
1 | PRIMARY | table4 | eq_ref | PRIMARY | PRIMARY | ... | 1 | NULL
3 | DEPENDENT SUBQUERY | table5 | ref | 2 possible keys | table1.id | ... | 17 | Using where
2 | DEPENDENT SUBQUERY | table5 | ref | 2 possible keys | table1.id | ... | 17 | Using where
So here are the questions:
What's the reason for the long first run?
How can I reproduce it?
And is there a way to fix it?
I read "mysql slow on first query, then fast for related queries". However, that doesn't answer my question of how to reproduce this behaviour.
Update
I changed the SQL, and it now reads:
select
count(ec.id) as asdasda
from table1 ec force index for join (PRIMARY)
left join table2 e force index for join (PRIMARY) on ec.id = e.id
left join table3 v force index for join (PRIMARY) on e.id = v.id
where
v.col1 = 'aaa'
and v.col2 = 'bbb'
and v.col3 = 'ccc'
and e.datecol > curdate()
and e.col1 != 0
Now explain says:
+----+-------------+--------+--------+---------------+--------------+---------+-----------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+--------+---------------+--------------+---------+-----------------+--------+-------------+
| 1 | SIMPLE | table3 | ALL | PRIMARY | NULL | NULL | NULL | 114032 | Using where |
| 1 | SIMPLE | table2 | ref | PRIMARY | PRIMARY | 5 | table3.id | 11 | Using where |
| 1 | SIMPLE | table1 | eq_ref | PRIMARY | PRIMARY | 4 | table2.id | 1 | Using index |
+----+-------------+--------+--------+---------------+--------------+---------+-----------------+--------+-------------+
Is that as good as it can get?
The data might be cached in the InnoDB buffer pool or in the Windows filesystem cache.
You can't explicitly flush the InnoDB buffer pool, but you can set the flushing parameters to more aggressive values:
SET GLOBAL innodb_old_blocks_pct = 5;
SET GLOBAL innodb_max_dirty_pages_pct = 0;
You can use the solution provided in "Clear file cache to repeat performance testing" to clear the Windows filesystem cache.
But what you really need is a composite index on table3 (col1, col2, col3).
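For example, something like this (the index name is just illustrative):

ALTER TABLE table3 ADD INDEX idx_t3_col1_col2_col3 (col1, col2, col3);

With that index in place, the first row of the EXPLAIN should change from a full scan (type ALL over ~114,000 rows) to a ref lookup on the three equality conditions.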
Related
The following query is very slow, and I don't understand why. All of the id columns are indexed (some are primary keys).
SELECT r.name AS tool, r.url AS url, r.id_tool AS recId, COUNT(*) AS count, r.source AS source,
    GROUP_CONCAT(t.name) AS instrument
FROM tools r
INNER JOIN instruments_tools ifr ON ifr.id_tool = r.id_tool
INNER JOIN instrument t ON t.id = ifr.id_instrument
WHERE t.id IN (433, 37, 362) AND t.source IN (1, 2, 3)
GROUP BY r.id_tool
ORDER BY count DESC, rand()
LIMIT 10;
Locally, on a WampServer installation, I have serious issues with transferring data: in HeidiSQL I see two "Sending data" phases of 2 and 6 seconds respectively.
On a shared server, this is the important part of the profile:
| Status                       | Duration |
| statistics                   | 0.079963 |
| preparing                    | 0.000028 |
| Creating tmp table           | 0.000037 |
| executing                    | 0.000005 |
| Copying to tmp table         | 7.963576 |
| converting HEAP to MyISAM    | 0.015790 |
| Copying to tmp table on disk | 5.383739 |
| Creating sort index          | 0.015143 |
| Copying to group table       | 0.023708 |
| converting HEAP to MyISAM    | 0.014513 |
| Copying to group table       | 0.099595 |
| Sorting result               | 0.034256 |
Considering that I'd like to improve the query (see LIMIT) or remove rand() and add weights, I'm a bit afraid I'm doing something very wrong.
Additional info:
The tools table is about 500,000 rows, instruments around 6,000, and instruments_tools around 3M rows.
The query finds which tools I can make with the instruments I have (by checking t.id IN (ids of my instruments)). GROUP_CONCAT(t.name) is a way to know which instruments were matched.
EXPLAIN of the query:
+----+-------------+-------+--------+-------------------------+---------------+---------+----------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------+---------------+---------+----------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t | range | PRIMARY | PRIMARY | 4 | NULL | 3 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | ifr | ref | id_tool,id_instrument | id_instrument | 5 | mydb2.t.id | 374 | Using where |
| 1 | SIMPLE | r | eq_ref | PRIMARY | PRIMARY | 4 | mydb2.ifr.id_tool | 1 | |
+----+-------------+-------+--------+-------------------------+---------------+---------+----------------------------+------+----------------------------------------------+
You need a compound index on the intersection table:
ALTER TABLE instruments_tools ADD KEY (id_instrument, id_tool);
The order of columns in that index is important!
What you're hoping for is that the join will start with the instrument table and then look up matching entries in the compound index by id_instrument. Once it finds an index entry, it has the related id_tool for free, so it doesn't have to read the instruments_tools table at all; it only needs to read the index entry. That should produce the "Using index" note in your EXPLAIN for the instruments_tools table.
That should help, but you can't avoid the temp table and filesort, because the columns you're grouping and sorting by cannot make use of an index.
You can try to make MySQL avoid writing the temp table to disk by increasing the amount of memory it can use for temporary tables:
mysql> SET GLOBAL tmp_table_size = 256*1024*1024; -- 256MB
mysql> SET GLOBAL max_heap_table_size = 256*1024*1024; -- 256MB
That figure is just an example; I have no idea how large it would need to be to hold the temp table in your case.
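To check whether a temp table actually spills to disk, you can compare the session status counters before and after running the query:

mysql> SHOW SESSION STATUS LIKE 'Created_tmp%';

If Created_tmp_disk_tables increases after the query, the temp table went to disk; if only Created_tmp_tables increases, it stayed in memory.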
I have encountered a MySQL query that takes over 2 minutes to complete and drives the server load very high (e.g. from 2 to 14, or sometimes higher).
The query does LEFT JOINs between tables, then sorts the data by a float column in one of the joined tables, like this:
SELECT table1.*, table2.*, table3.field, table4.field
FROM table1
LEFT JOIN table2 ON table1...
LEFT JOIN table3 ON table1...
LEFT JOIN table4 ON table3...
LEFT JOIN table5 ON table1...
WHERE table1.deleted=0
ORDER BY table2.float_field ASC
LIMIT 1,300
The joins are all done on indexed keys, and table2 also has an index on float_field.
The same database structure and query are used on other databases without issues. table2 is a custom-field table, alterable by the users of this database, and in this particular system it has 107 fields, more than 2/3 of them varchar(150). Could this be the reason for the high load, or is there some other cause? Any suggestions for how to handle it (ideally without having to redo the DB schema)?
Thanks in advance.
EDIT: Here are the 'explain' results:
+----+-------------+--------+--------+---------------+---------+---------+-----------------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+--------+---------------+---------+---------+-----------------+-------+-------------+
| 1 | SIMPLE | table1 | ALL | idx_1,idx_2 | NULL | NULL | NULL | 33861 | Using where |
| 1 | SIMPLE | table2 | eq_ref | PRIMARY | PRIMARY | 108 | db.table1.id | 1 | |
| 1 | SIMPLE | jtl0 | ref | idx_X | idx_X | 111 | db.table1.id | 1 | |
| 1 | SIMPLE | table4 | eq_ref | PRIMARY,... | PRIMARY | 108 | db.jtl0.field | 1 | |
| 1 | SIMPLE | jt1 | eq_ref | PRIMARY | PRIMARY | 108 | db.table1.fieldX| 1 | |
+----+-------------+--------+--------+---------------+---------+---------+-----------------+-------+-------------+
Both idx_1 and idx_2 have the deleted column as the first field in the index, and deleted is the only field in the WHERE clause.
I also corrected the original text: there are 5 tables involved, not 4 (although the last table has only 20 rows, so it doesn't matter here).
select table2.*
is generally bad style: it returns a lot of columns you are not interested in. In this case it could well be causing the slowness, given the large number of (text) columns in this table.
100 columns * 150 characters * 1300 rows is roughly 19.5 MB, so the slowness could well be reading all the data from disk and transmitting it across the network.
Do you still see the slowness if you restrict this to the particular columns of table2 that you are interested in?
EDIT: your EXPLAIN output seems to confirm that this is not a difficult query to run, with only a small number of rows examined. That makes the sheer data size of each row in table2 the most likely problem. You can test this by removing or limiting the references to table2. If that is the cause, then the only way to speed this query up is to request fewer columns from table2.
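As a quick sketch of that test, keeping the original structure but naming only the table2 columns you actually need (needed_field1 and needed_field2 are placeholders):

SELECT table1.*, table2.needed_field1, table2.needed_field2, table3.field, table4.field
FROM table1
LEFT JOIN table2 ON table1...
LEFT JOIN table3 ON table1...
LEFT JOIN table4 ON table3...
LEFT JOIN table5 ON table1...
WHERE table1.deleted=0
ORDER BY table2.float_field ASC
LIMIT 1,300

If this version is fast, the row width of table2 is confirmed as the culprit.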
I'm creating a bunch of tables in a makefile. My make target looks something like:
TASK:
cat script.sql | mysql -v -v -v dbName
Inside script.sql, one of the CREATE TABLE statements hangs indefinitely, with the mysql process at 100% CPU.
If I run the same command as the same user on the same machine but from the command-line, it runs fine.
$ cat script.sql | mysql -v -v -v dbName
Delving into it a bit more, it turns out that explain yields different results in the two environments.
From inside make:
+----+-------------+-------+--------+---------------+---------+---------+----------------------------------------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------------------------------------+------+----------------------------------------------------+
| 1 | SIMPLE | o | ALL | NULL | NULL | NULL | NULL | 2340 | NULL |
| 1 | SIMPLE | d | index | NULL | PRIMARY | 3 | NULL | 2739 | Using index; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 7 | db1.o.field1,db3.d.date | 1 | Using where |
| 1 | SIMPLE | n | ALL | PRIMARY | NULL | NULL | NULL | 1 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+--------+---------------+---------+---------+----------------------------------------+------+----------------------------------------------------+
From the command-line:
+----+-------------+-------+--------+---------------+---------+---------+----------------------------------------+------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+---------------+---------+---------+----------------------------------------+------+----------------------------------------------------+
| 1 | SIMPLE | o | ALL | NULL | NULL | NULL | NULL | 2340 | NULL |
| 1 | SIMPLE | d | index | NULL | PRIMARY | 3 | NULL | 2739 | Using index; Using join buffer (Block Nested Loop) |
| 1 | SIMPLE | p | eq_ref | PRIMARY | PRIMARY | 7 | db1.o.field1,db3.d.date | 1 | Using where |
| 1 | SIMPLE | n | ref | PRIMARY | PRIMARY | 4 | db2.p.field1 | 1 | Using where |
+----+-------------+-------+--------+---------------+---------+---------+----------------------------------------+------+----------------------------------------------------+
Some digging directed me to this question, and running ANALYZE on one of the tables involved does solve the issue.
But seriously, what is going on here? Is there some environment variable that causes mysql to behave differently?
The query in question looks like this:
drop view if exists v;
create view v as (
select *
from db1.order o
cross join db3.dates d
left join db2.price p on (1=1
and p.id = o.id
and p.date = d.date
and p.volume > 0)
left join db3.security n on (1=1
and n.id = p.id
and n.date <= d.date)
);
explain select * from v;
analyze table n;
explain select * from v;
create table t (
primary key (date asc, id asc)
) as (
select * from v
);
From inside make, the first explain yields the first result above, then the analyze causes the second explain to yield the second result above.
I'm suspicious that the two runs aren't executing identical SQL. According to your EXPLAIN output, the join order is the same in both, but the table referenced by the third table p differs: when executed from the shell, p references db3.d, but under make, p references db2.d. That's why I'm suspicious.
Can you post your query? If it's confidential, rename the tables and columns. If there were a subquery, the same table alias could legitimately appear more than once, but it looks like there is no subquery here.
The question you linked is not related to yours: there, the asker had a new environment, and ANALYZE was required because the table statistics had changed.
To confirm whether the two runs really execute the same SQL, turn on the general query log. It's simple: add SET GLOBAL general_log = 'ON'; as the first line of script.sql and SET GLOBAL general_log = 'OFF'; at the end.
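A sketch of what that looks like in script.sql (the log file path is optional and just an example; by default the general log goes to host_name.log in the data directory):

-- first lines of script.sql
SET GLOBAL general_log_file = '/tmp/script-debug.log';
SET GLOBAL general_log = 'ON';
-- ... original statements ...
-- last line of script.sql
SET GLOBAL general_log = 'OFF';

Then diff the log produced by the make run against the one from the shell run.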
What do you think?
UPDATED
OK, script.sql is cleared of suspicion. Then I have no idea why the two runs behave differently; the MySQL forums may be able to help.
That said, here is some information that may be useful.
How does script.sql work? The CREATE VIEW and SELECT ... FROM v statements are part (or all) of script.sql, but is there any creation of, or insertion into, db3.security or the other tables? If you post to the MySQL forums, it would help to describe exactly what script.sql does.
USE INDEX: Did you try explicitly specifying USE INDEX? The innermost table n is doing a full scan.
innodb_stats_sample_pages: Finally, if you use InnoDB, set innodb_stats_sample_pages = 64 in my.cnf (the default is 8). When an InnoDB table is opened, MySQL reads that many random pages and uses them to estimate the table statistics that feed the join cost calculations. Because the pages are chosen at random, the statistics can change every time a table is opened; more sample pages mean more accurate statistics.
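That is, something like this in my.cnf on the affected server, followed by a restart (note that MySQL 5.6 and later deprecate this variable in favor of innodb_stats_transient_sample_pages):

[mysqld]
innodb_stats_sample_pages = 64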
We've got a relatively straightforward query that does LEFT JOINs across 4 tables. A is the "main" table or the top-most table in the hierarchy. B links to A, C links to B. Furthermore, X links to A. So the hierarchy is basically
A
C => B => A
X => A
The query is essentially:
SELECT
a.*, b.*, c.*, x.*
FROM
a
LEFT JOIN b ON b.a_id = a.id
LEFT JOIN c ON c.b_id = b.id
LEFT JOIN x ON x.a_id = a.id
WHERE
b.flag = true
ORDER BY
x.date DESC
LIMIT 25
Via EXPLAIN, I've confirmed that the correct indexes are in place, and that the built-in MySQL query optimizer is using those indexes correctly and properly.
So here's the strange part...
When we run the query as is, it takes about 1.1 seconds to run.
However, after doing some checking, it seems that if I remove most of the SELECT fields, I get a significant speed boost.
So if instead we make this into a two-step process:
Query 1: the same as above, except the SELECT clause fetches only a.id instead of *
Query 2: also the same as above, except the WHERE clause is replaced by an a.id IN (...) against the results of Query 1
The result is drastically different: 0.03 seconds for the first query and 0.02 for the second.
Doing this two-step query in code gives us roughly a 20x boost in performance.
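For clarity, here is roughly what the two-step version looks like, using the same placeholder schema as above:

-- Query 1: fetch only the ids of the 25 rows
SELECT a.id
FROM a
LEFT JOIN b ON b.a_id = a.id
LEFT JOIN c ON c.b_id = b.id
LEFT JOIN x ON x.a_id = a.id
WHERE b.flag = true
ORDER BY x.date DESC
LIMIT 25;

-- Query 2: fetch the full rows for just those ids
SELECT a.*, b.*, c.*, x.*
FROM a
LEFT JOIN b ON b.a_id = a.id
LEFT JOIN c ON c.b_id = b.id
LEFT JOIN x ON x.a_id = a.id
WHERE a.id IN (/* ids from Query 1 */);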
So here's my question:
Shouldn't this type of optimization already be done by the DB engine? Why does the set of fields actually SELECTed make such a difference to the overall performance of the query?
At the end of the day, both approaches select the exact same 25 rows and return the exact same full contents of those 25 rows. So why the wide disparity in performance?
ADDED 2012-08-24 13:02 PDT
Thanks eggyal and invertedSpear for the feedback. First off, it's not a caching issue: I've run tests executing both approaches multiple times (about 10 each), alternating between them. The results average 1.1 seconds for the single-query approach and 0.03 + 0.02 seconds for the two-query approach.
In terms of indexes, I thought I had done an EXPLAIN to ensure that we're going through the keys, and for the most part we are. However, I just did a quick check again, and one interesting thing stands out:
The slower single-query approach doesn't show the Extra note "Using index" on the third line:
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY | shop_group_id_idx | 5 | NULL | 102 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t1.organization_id | 1 | Using where |
| 1 | SIMPLE | t0 | ref | bundle_idx,shop_id_idx | shop_id_idx | 4 | dbmodl_v18.t1.organization_id | 309 | |
| 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t0.id | 1 | |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
While it does show "Using index" for when we query for just the IDs:
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | t1 | index | PRIMARY | shop_group_id_idx | 5 | NULL | 102 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | t2 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t1.organization_id | 1 | Using where |
| 1 | SIMPLE | t0 | ref | bundle_idx,shop_id_idx | shop_id_idx | 4 | dbmodl_v18.t1.organization_id | 309 | Using index |
| 1 | SIMPLE | t3 | eq_ref | PRIMARY | PRIMARY | 4 | dbmodl_v18.t0.id | 1 | |
+----+-------------+-------+--------+------------------------+-------------------+---------+-------------------------------+------+----------------------------------------------+
The strange thing is that both list the same index being used... but I guess this raises the questions:
Why do they differ (considering all the other clauses are exactly the same)? And is this an indication of why one is slower?
Unfortunately, the MySQL docs don't give much information about what a blank Extra column means in the EXPLAIN results.
More important than speed, you have a flaw in your query logic. When you test a LEFT JOINed column in the WHERE clause (other than testing it for NULL), you force that join to behave as if it were an INNER JOIN. Instead, you want:
SELECT
a.*, b.*, c.*, x.*
FROM
a
LEFT JOIN b ON b.a_id = a.id
AND b.flag = true
LEFT JOIN c ON c.b_id = b.id
LEFT JOIN x ON x.a_id = a.id
ORDER BY
x.date DESC
LIMIT 25
My next suggestion would be to examine all of those .*'s in your SELECT. Do you really need all the columns from all the tables?
I ran into a problem last week moving from dev to testing, where one of my queries that had run perfectly in dev was crawling on my testing server.
It was fixed by adding FORCE INDEX for one of the indexes in the query.
Now I've loaded the same database onto the production server (still running with the FORCE INDEX hint), and it has slowed down again.
Any idea what would cause something like this? The testing and production servers both run the same OS and version of MySQL (unlike dev).
Here's the query and its EXPLAIN output:
EXPLAIN SELECT showsdate.bid, showsdate.bandid, showsdate.date, showsdate.time,
    showsdate.title, showsdate.name, showsdate.address, showsdate.rank, showsdate.city, showsdate.state,
    showsdate.lat, showsdate.`long`, tickets.link, tickets.lowprice, tickets.highprice, tickets.source,
    tickets.ext, artistGenre, showsdate.img
FROM tickets
RIGHT OUTER JOIN (
    SELECT shows.bid, shows.date, shows.time, shows.title, artists.name, artists.img, artists.rank,
        artists.bandid, shows.address, shows.city, shows.state, shows.lat, shows.`long`,
        GROUP_CONCAT(genres.genre SEPARATOR ' | ') AS artistGenre
    FROM shows FORCE INDEX (biddate_idx)
    JOIN artists ON shows.bid = artists.bid
    JOIN genres ON artists.bid = genres.bid
    WHERE `long` BETWEEN -74.34926984058 AND -73.62463215942
        AND lat BETWEEN 40.39373515942 AND 41.11837284058
        AND shows.date >= '2009-03-02'
    GROUP BY shows.bid, shows.date
    ORDER BY shows.date, artists.rank DESC
    LIMIT 0, 30
) showsdate ON showsdate.bid = tickets.bid AND showsdate.date = tickets.date;
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 30 | |
| 1 | PRIMARY | tickets | ref | biddate_idx | biddate_idx | 7 | showsdate.bid,showsdate.date | 1 | |
| 2 | DERIVED | genres | index | bandid_idx | bandid_idx | 141 | NULL | 531281 | Using index; Using temporary; Using filesort |
| 2 | DERIVED | shows | ref | biddate_idx | biddate_idx | 4 | activeHW.genres.bid | 5 | Using where |
| 2 | DERIVED | artists | eq_ref | bid_idx | bid_idx | 4 | activeHW.genres.bid | 1 | |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
I think I chimed in when you asked about the differences between dev and test.
Have you tried rebuilding the indexes and recalculating the statistics? Generally, forcing an index is a bad idea, since the optimizer usually makes good choices about which indexes to use. However, that assumes it has good statistics to work from and that the indexes aren't seriously fragmented.
ETA:
To rebuild indexes, use:
REPAIR TABLE tbl_name QUICK;
To recalculate statistics:
ANALYZE TABLE tbl_name;
Does the test server have only 10 records while the production server has 1,000,000,000? That could also cause different execution times.
Are the two servers configured the same? It sounds like you might be crossing a "tipping point" in MySQL's performance. I'd compare the MySQL configurations; there might be a memory parameter that's way different.
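A quick first check is to compare the memory-related server variables on both machines, for example:

-- run on both servers and diff the output
SHOW VARIABLES LIKE '%buffer%';
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';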