Optimize MySQL InnoDB query for max, count

I have a MySQL InnoDB table with 5.7M rows, 1.9GB in size:
+-------------------+---------+------+-----+---------+----------------+
| Field             | Type    | Null | Key | Default | Extra          |
+-------------------+---------+------+-----+---------+----------------+
| id                | int(20) | NO   | PRI | NULL    | auto_increment |
| listing_id        | int(20) | YES  |     | NULL    |                |
| listing_link      | text    | YES  |     | NULL    |                |
| transaction_title | text    | YES  |     | NULL    |                |
| image_thumb       | text    | YES  |     | NULL    |                |
| seller_link       | text    | YES  |     | NULL    |                |
| seller_name       | text    | YES  |     | NULL    |                |
| sale_date         | date    | YES  |     | NULL    |                |
+-------------------+---------+------+-----+---------+----------------+
Here are my my.ini settings for my 3GB RAM server:
key_buffer = 16M
max_allowed_packet = 16M
sort_buffer_size = 8M
net_buffer_length = 8K
read_buffer_size = 2M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 8M
log_error = "mysql_error.log"
innodb_autoinc_lock_mode=0
join_buffer_size = 8M
thread_cache_size = 8
thread_concurrency = 8
query_cache_size = 64M
query_cache_limit = 2M
ft_min_word_len = 4
thread_stack = 192K
tmp_table_size = 64M
innodb_buffer_pool_size = 2G
innodb_additional_mem_pool_size = 16M
innodb_log_file_size = 512M
innodb_log_buffer_size = 8M
innodb_flush_log_at_trx_commit = 1
innodb_lock_wait_timeout = 120
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency = 16
innodb_log_files_in_group = 3
innodb_max_dirty_pages_pct = 90
When I run the following query, it takes over 20 minutes to return the results:
SELECT transaction_title,
listing_id,
seller_name,
Max(sale_date) AS sale_date,
Count(*) AS count
FROM sales_meta
WHERE `sale_date` BETWEEN '2017-06-06' AND '2017-06-06'
GROUP BY listing_id
HAVING Count(*) > 1
ORDER BY count DESC,
seller_name;
I've done some research and it appears I need to add some indexes to speed things up, but I am confused about how to go about it. There are single-column indexes and multi-column indexes; which kind should I use?
To make things more complicated, there are a few other queries that I will need to run on this table regularly:
SELECT *
FROM sales_meta
WHERE `sale_date` = '2017-06-06';
and
SELECT DISTINCT `seller_name`
FROM `sales_meta`;
These two are probably less taxing, but I still need to optimize for them as well if possible, although the first query out of three is the top priority for now.

If you want just the values for a single day and the data type is date, then you can avoid the BETWEEN clause and use =
SELECT transaction_title,
listing_id,
seller_name,
Max(sale_date) AS max_sale_date,
Count(*) AS count
FROM sales_meta
WHERE sale_date = str_to_date('2017-06-06', '%Y-%m-%d')
GROUP BY listing_id
HAVING Count(*) > 1
ORDER BY count DESC, seller_name;
and be sure you have an index on sale_date
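A minimal sketch of adding that index, assuming the table is named sales_meta as in the question. The index name idx_sale_date is only an illustrative choice, not something from the original post. Running EXPLAIN afterwards shows whether the optimizer actually picks the index up.

```sql
-- Hypothetical index name; any unused name works.
ALTER TABLE sales_meta ADD INDEX idx_sale_date (sale_date);

-- Check the plan: the `key` column should now name the new index
-- instead of showing a full table scan (type = ALL).
EXPLAIN
SELECT listing_id,
       MAX(sale_date) AS max_sale_date,
       COUNT(*) AS count
FROM sales_meta
WHERE sale_date = '2017-06-06'
GROUP BY listing_id
HAVING COUNT(*) > 1;
```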

An index on sale_date is definitely something you should add, since a couple of the queries in the question filter on sale_date.
Another suggestion is to index the column used in GROUP BY, as per MySQL's documentation.
Instead of adding all the indexes in one go, I would take an incremental approach and measure the performance after adding each index.
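One way the incremental approach could look, sketched under the assumption that a composite index on (sale_date, listing_id) is the candidate being tried. That index serves the WHERE filter and the GROUP BY column together, and because it starts with sale_date it also serves queries that filter on sale_date alone. The index name is hypothetical.

```sql
-- Step 1: record the baseline plan before changing anything.
EXPLAIN
SELECT listing_id, COUNT(*)
FROM sales_meta
WHERE sale_date = '2017-06-06'
GROUP BY listing_id;

-- Step 2: add ONE candidate index (hypothetical name), then re-run the
-- EXPLAIN above and time the real query.
ALTER TABLE sales_meta
  ADD INDEX idx_sale_date_listing (sale_date, listing_id);

-- Step 3: review what exists; drop the candidate if it did not help.
SHOW INDEX FROM sales_meta;
-- ALTER TABLE sales_meta DROP INDEX idx_sale_date_listing;
```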

INDEX(sale_date) -- very important for the first query
str_to_date('2017-06-06', '%Y-%m-%d') -- no better than '2017-06-06'
innodb_buffer_pool_size = 2G -- too big for your tiny RAM; change to 1G (swapping kills perf)
GROUP BY listing_id -- meaningless, since `listing_id` is unique; hence count is always 1
Prefer using an explicit list instead of `SELECT *`
SELECT DISTINCT `seller_name`
FROM `sales_meta`; -- needs INDEX(seller_name)
but `seller_name` needs to be a VARCHAR, not TEXT
Further evidence that str_to_date is useless:
mysql> SELECT STR_TO_DATE('2019-02-27', '%Y-%m-%d');
+---------------------------------------+
| STR_TO_DATE('2019-02-27', '%Y-%m-%d') |
+---------------------------------------+
| 2019-02-27 |
+---------------------------------------+
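The seller_name points above could be applied roughly as follows. This is a sketch, not a tested migration: the length of 100 characters and the index name idx_seller_name are assumptions, so check the longest stored value first and adjust.

```sql
-- Find the longest existing value so the new column size does not truncate data.
SELECT MAX(CHAR_LENGTH(seller_name)) AS longest_name
FROM sales_meta;

-- Assumption: every name fits in 100 characters; adjust to the result above.
-- On a 5.7M-row table this rebuild can take a while.
ALTER TABLE sales_meta
  MODIFY seller_name VARCHAR(100) NULL,
  ADD INDEX idx_seller_name (seller_name);

-- The DISTINCT query can now be answered from the index alone.
EXPLAIN
SELECT DISTINCT seller_name
FROM sales_meta;
```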

Related

MySQL "sending data" activity take longer time

I recently upgraded MySQL from 5.1 to 5.7.8rc.
I have an unusual issue with the "Sending data" status during profiling. It takes more time than expected for every complex or UNION query (and sometimes for simple queries too). I tried both a well-optimized version and the original query. I googled and tried all the configurations I could find, but no luck.
The query performs super fast in MySQL 5.1, but not in 5.7.
I also tried table optimization, analyze, repair, etc.
some details for reference:
OS: Centos 6.9 64 bit
MySQL: 5.7.8 rc
CPU: 4
RAM: 64 GB
Data volume: 450 GB
Type: Dedicated VM
Query Profiling:
Status Duration
starting 0.000515
checking permissions 0.000023
checking permissions 0.000016
checking permissions 0.000014
checking permissions 0.000014
checking permissions 0.000014
checking permissions 0.000014
checking permissions 0.000014
checking permissions 0.000014
checking permissions 0.000016
checking permissions 0.000014
checking permissions 0.000019
Opening tables 0.000079
init 0.000325
System lock 0.000068
optimizing 0.000079
statistics 0.001888
preparing 0.000151
Creating tmp table 0.000128
Sorting result 0.000027
optimizing 0.000034
statistics 0.000064
preparing 0.000047
Creating tmp table 0.000047
Sorting for group 0.000030
optimizing 0.000018
statistics 0.000022
preparing 0.000022
Creating tmp table 0.000043
Sorting for order 0.000023
executing 0.000016
Sending data 4.015596
Creating sort index 0.004766
executing 0.000010
Sending data 0.000159
executing 0.000008
Sending data 0.000025
Creating sort index 0.000349
end 0.000010
query end 0.000024
removing tmp table 0.000013
query end 0.000011
removing tmp table 0.000008
query end 0.000009
removing tmp table 0.000011
query end 0.000009
removing tmp table 0.000007
query end 0.000010
removing tmp table 0.000008
query end 0.000006
closing tables 0.000026
freeing items 0.000039
removing tmp table 0.000009
freeing items 0.000067
logging slow query 0.000047
cleaning up 0.000042
Execution Plan:
+----+--------------+-------------+------------+--------+-----------------------------------------------------+--------------------+---------+--------------------------------------+------+----------+------------------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+--------------+-------------+------------+--------+-----------------------------------------------------+--------------------+---------+--------------------------------------+------+----------+------------------------------------+
| 1 | PRIMARY | TIGM_GRP | NULL | index | PRIMARY,FLD_PARENT_GROUP_ID | PRIMARY | 2 | NULL | 87 | 0.33 | Using where |
| 1 | PRIMARY | TIPLD | NULL | range | FLD_PRICE_LEVEL_ID | FLD_PRICE_LEVEL_ID | 3 | NULL | 26 | 10.00 | Using index condition; Using where |
| 1 | PRIMARY | TIPLM | NULL | eq_ref | PRIMARY | PRIMARY | 2 | TIPLD.FLD_PRICE_LEVEL_ID | 1 | 10.00 | Using where |
| 1 | PRIMARY | TIGL | NULL | ref | FLD_GROUP_ID,fld_item_id | FLD_GROUP_ID | 2 | TIGM_GRP.FLD_GROUP_ID | 404 | 100.00 | NULL |
| 1 | PRIMARY | TIM | NULL | eq_ref | PRIMARY,FLD_ITEM_TYPE,FLD_ADDON_ID,FLD_PRODUCT_TYPE | PRIMARY | 4 | TIGL.FLD_ITEM_ID | 1 | 50.00 | Using where |
| 1 | PRIMARY | TSIAM | NULL | eq_ref | PRIMARY | PRIMARY | 4 | TIM.FLD_ADDON_ID | 1 | 10.00 | Using where |
| 1 | PRIMARY | TSIARM | NULL | ref | FLD_ADDON_ID | FLD_ADDON_ID | 5 | TIM.FLD_ADDON_ID | 1 | 10.00 | Using index condition; Using where |
| 1 | PRIMARY | TIPD | NULL | ref | FLD_ITEM_ID,FLD_PRICE_LEVEL_ID | FLD_ITEM_ID | 2 | TIM.FLD_ITEM_ID | 3 | 1.35 | Using index condition; Using where |
| 1 | PRIMARY | TIGM_PARENT | NULL | eq_ref | PRIMARY | PRIMARY | 2 | TIGM_GRP.FLD_PARENT_GROUP_ID | 1 | 100.00 | Using index |
| 2 | UNION | TIGM_GRP | NULL | index | PRIMARY,FLD_PARENT_GROUP_ID | PRIMARY | 2 | NULL | 87 | 0.06 | Using where |
| 2 | UNION | TIGM_PARENT | NULL | eq_ref | PRIMARY | PRIMARY | 2 | TIGM_GRP.FLD_PARENT_GROUP_ID | 1 | 100.00 | Using index |
|NULL| UNION RESULT | <union1,2> | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | Using temporary; Using filesort |
+----+--------------+-------------+------------+--------+-----------------------------------------------------+--------------------+---------+--------------------------------------+------+----------+------------------------------------+
my.cnf:
[mysql]
# CLIENT #######################################################################
port = 3306
socket = /var/lib/mysql/mysql.sock
[mysqld]
# GENERAL ######################################################################
user = mysql
port = 3306
socket = /var/lib/mysql/mysql.sock
server_id = 32108
default_storage_engine = InnoDB
pid_file = /var/run/mysqld/mysqld.pid
optimizer_prune_level = 0
optimizer_search_depth = 0
max_length_for_sort_data = 8388608 #New
net_buffer_length = 1048576 #New
back_log = 80
symbolic-links = 0
log_bin_trust_function_creators = 1
net_read_timeout = 10 #90
net_write_timeout = 10 #120
net_retry_count = 30
thread_stack = 512K #192K
long_query_time = 10
tmpdir = /tmp
# MyISAM #######################################################################
key_buffer_size = 64M
read_buffer_size = 32M
read_rnd_buffer_size = 32M
bulk_insert_buffer_size = 16M
myisam_sort_buffer_size = 128M
myisam_max_sort_file_size = 1G
myisam_repair_threads = 1
memlock
max_allowed_packet = 32M
max_connect_errors = 100
sql_mode = STRICT_TRANS_TABLES,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION
sysdate-is-now = 1
explicit_defaults_for_timestamp = 1
innodb = FORCE
# Password policy disabled as per communication 05-Sep2017
validate_password = OFF
# DATA STORAGE ##################################################################
datadir = /var/lib/mysql/
# BINARY LOGGING ################################################################
log-bin = /var/lib/mysql/mysql-bin
expire-logs-days = 14
sync-binlog = 1
# REPLICATION ###################################################################
skip-slave-start = 1
relay-log = /var/log/mysql-realy-logs/relay-bin
slave-skip-errors = 1062 #,1053,1032,1237,1146
slave-net-timeout = 60
relay_log_purge = 1
# CACHES AND LIMITS #############################################################
tmp-table-size = 256M
max-heap-table-size = 256M
query_cache_min_res_unit = 12288 #8192 #New
query-cache-type = 1
query-cache-size = 32M #102400 #0 #256M
max-connections = 150
thread-cache-size = 10 #-1 #Auto resized
open-files-limit = 65535
table_definition_cache = 2000
table_open_cache = 4096 #3092
table_open_cache_instances = 4
sort_buffer_size = 128M
join_buffer_size = 512M
binlog_cache_size = 16M
query_cache_limit = 4M
# INNODB ########################################################################
innodb_fast_shutdown = 1
innodb_flush_method = O_DIRECT
innodb_log_group_home_dir = /mysql_redo_logs/mysql_redo_logs
innodb_log_files_in_group = 2
innodb_log_file_size = 1G
innodb_flush_log_at_trx_commit = 2
innodb_file_per_table = 1 #ON
innodb_buffer_pool_size = 32G #20G
innodb_buffer_pool_instances = 32
innodb_log_buffer_size = 64M
innodb_adaptive_hash_index = 1 #ON
innodb_thread_concurrency = 300 # "0" is default and means infinite (as and when needed). #250 #32
innodb_thread_sleep_delay = 1
innodb_flush_neighbors = 1
innodb_sync_array_size = 832 # Default is 768
skip-innodb_doublewrite #New
innodb_page_cleaners = 32 # Must be =innodb_buffer_pool_instances
innodb_sort_buffer_size = 512M
innodb_read_io_threads = 64
innodb_write_io_threads = 16
#innodb_concurrency_tickets = 429496729
innodb_max_dirty_pages_pct = 90
innodb_lock_wait_timeout = 10 #80
innodb_compression_level = 0 #New
innodb_lru_scan_depth = 512 #1024 #Default
# LOGGING #######################################################################
log_error = /var/lib/mysql/mysql-error.log
log_queries_not_using_indexes = 1
slow_query_log = 1
log_error_verbosity = 3
[mysqld_safe]
open-files-limit = 4096
malloc-lib = /usr/lib64/libtcmalloc_minimal.so.4
[mysqldump]
quick
max_allowed_packet = 64M
So far I have tried most of the options; currently I have started the service in caching mode for a quicker response.
Can you please help me fix this "Sending data" delay?
Everyone,
I think I figured out the issue.
1) In 5.6, the plan the optimizer generates is based on a mixture of metadata and the user's query (joins without declared relations); the engine trusts the user and picks the best plan.
5.7, by contrast, does not seem to trust soft relations between tables, even when they are indexed.
2) 5.7 expects explicit data integrity and does not rely on the user.
POC:
I recreated the tables involved with hard FK references and, voilà, the response was dramatically faster. The same query that took 6.085 sec (Sending data) without FKs
took 0.05 sec with data integrity enforced (with no change in indexing).
So I think 5.7 expects strong data integrity constraints to be declared explicitly.
A lot of Optimizer changes happened between 5.1 and 5.7. For one thing, ICP is new; I see it several times in the EXPLAIN. We need to see the query to discuss further. If possible, get the EXPLAIN from 5.1.
Meanwhile, here are some minor comments on my.cnf:
thread_stack = 512K #192K
Generally, it is not useful to increase this setting.
optimizer_prune_level = 0
optimizer_search_depth = 0
What prompted you to set to 0?
innodb = FORCE
deprecated in 5.7.5; suggest removing before it becomes an error.
slave-skip-errors = 1062 #,1053,1032,1237,1146
Sweeping gremlins under the rug -- you will stub your toes on the lump.
innodb_buffer_pool_size = 32G #20G
Unless you have some large apps on the same machine, 44G might be a little better.
innodb_page_cleaners = 32 # Must be =innodb_buffer_pool_instances
re the comment -- Technically, it is auto-limited to pool_instances; see https://dev.mysql.com/doc/refman/8.0/en/innodb-parameters.html#sysvar_innodb_page_cleaners (Thanks, WilsonHauck)
log_queries_not_using_indexes = 1
In my opinion, this clutters the slowlog. The interesting entries are those that exceed long_query_time. Do you have any interesting queries in the slowlog?
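To follow up on the optimizer_prune_level / optimizer_search_depth question above, one low-risk experiment is to try the stock defaults for a single session and compare plans. This is only a sketch; the values 1 and 62 are the documented 5.7 defaults, and nothing here changes my.cnf.

```sql
-- Show the values currently in effect (0 and 0 according to the posted my.cnf).
SHOW VARIABLES LIKE 'optimizer_prune_level';
SHOW VARIABLES LIKE 'optimizer_search_depth';

-- Try the 5.7 defaults for this session only.
SET SESSION optimizer_prune_level = 1;
SET SESSION optimizer_search_depth = 62;

-- Now re-run EXPLAIN and the slow query in this same session and compare
-- the join order and timings against the original plan.
```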

Different sql explains on two servers. "Copying to tmp table" is extremely slow

I have a query that takes less time on the dev server than on prod (the database is the same). The prod server is much more powerful (64GB RAM, 12 cores, etc.).
Here's the query:
SELECT `u`.`id`,
`u`.`user_login`,
`u`.`last_name`,
`u`.`first_name`,
`r`.`referrals`,
`pr`.`worker`,
`rep`.`repurchase`
FROM `ci_users` `u`
LEFT JOIN
(SELECT `referrer_id`,
COUNT(user_id) referrals
FROM ci_referrers
GROUP BY referrer_id) AS `r` ON `r`.`referrer_id` = `u`.`id`
LEFT JOIN
(SELECT `user_id`,
`expire`,
SUM(`quantity`) worker
FROM ci_product_11111111111111111
GROUP BY `user_id`) AS `pr` ON `pr`.`user_id` = `u`.`id`
AND (`pr`.`expire` > '2015-12-10 09:23:45'
OR `pr`.`expire` IS NULL)
LEFT JOIN `ci_settings` `rep` ON `u`.`id` = `rep`.`id`
ORDER BY `id` ASC
LIMIT 100, 150;
Having following explain result on dev server:
+----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+
| 1 | PRIMARY | u | index | NULL | PRIMARY | 4 | NULL | 1 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 5 | dev1.u.id | 10 | NULL |
| 1 | PRIMARY | <derived3> | ref | <auto_key1> | <auto_key1> | 5 | dev1.u.id | 15 | Using where |
| 1 | PRIMARY | rep | eq_ref | PRIMARY | PRIMARY | 4 | dev1.u.id | 1 | NULL |
| 3 | DERIVED | ci_product_11111111111111111 | ALL | NULL | NULL | NULL | NULL | 30296 | Using temporary; Using filesort |
| 2 | DERIVED | ci_referrers | ALL | NULL | NULL | NULL | NULL | 11503 | Using temporary; Using filesort |
+----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+
And this one from prod:
+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+
| 1 | PRIMARY | u | ALL | NULL | NULL | NULL | NULL | 10990 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2628 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 8830 | |
| 1 | PRIMARY | rep | eq_ref | PRIMARY | PRIMARY | 4 | prod123.u.id | 1 | |
| 3 | DERIVED | ci_product_11111111111111111 | ALL | NULL | NULL | NULL | NULL | 28427 | Using temporary; Using filesort |
| 2 | DERIVED | ci_referrers | ALL | NULL | NULL | NULL | NULL | 11837 | Using temporary; Using filesort |
+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+
Profiling on the prod server showed something like this:
............................................
| statistics | 0.000030 |
| preparing | 0.000026 |
| Creating tmp table | 0.000037 |
| executing | 0.000008 |
| Copying to tmp table | 5.170296 |
| Sorting result | 0.001223 |
| Sending data | 0.000133 |
| Waiting for query cache lock | 0.000005 |
............................................
After googling a while I decided to move temporary tables into RAM:
/etc/fstab:
tmpfs /var/tmpfs tmpfs rw,uid=110,gid=115,size=16G,nr_inodes=10k,mode=0700 0 0
directory rules:
drwxrwxrwt 2 mysql mysql 40 Dec 15 13:57 tmpfs
/etc/mysql/my.cnf (I played a lot with the values):
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /var/tmpfs
lc-messages-dir = /usr/share/mysql
skip-external-locking
bind-address = 127.0.0.1
key_buffer = 16000M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 150
myisam-recover = BACKUP
tmp_table_size = 512M
max_heap_table_size = 1024M
max_connections = 100000
table_cache = 1024
innodb_thread_concurrency = 0
innodb_read_io_threads = 64
innodb_write_io_threads = 64
query_cache_limit = 1000M
query_cache_size = 10000M
log_error = /var/log/mysql/error.log
expire_logs_days = 10
max_binlog_size = 100M
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[mysql]
[isamchk]
key_buffer = 16M
And it doesn't work. Execution time is still the same, around 5 sec.
Can you please answer 2 questions:
What's wrong with the tmpfs configuration?
Why are the explains different on the two servers, and how can I optimize this query? (Even without tmpfs; I found that if the last 'order by' is removed, the query completes much faster.)
Thanks in advance.
Why explains are different on servers, how can I optimize this query?
(even not using tmpfs; I figured out that if the last 'order by'
removed, query completes much faster).
You say "database is the same", but from the explain outputs you presumably mean "the schema is the same". It looks like there is a lot more data in the production schema? MySQL optimises the way it handles queries based on the amount of data, index sizes, etc. That'll explain (at the highest level) why you're seeing such dramatic differences.
The column of your explain outputs to look at is "rows". Notice how the two derived tables were very small in dev? It looks like (you could ask in #mysql on freenode IRC to confirm) that MySQL was creating indexes for the derived tables in dev, but choosing not to in production (possibly because there were so many more records?).
What's wrong with tmpfs configuration?
Nothing. :) MySQL creates temporary tables in memory until the amount of data in them hits a certain size (tmp_table_size) before it writes temporary data to disk. You can trust MySQL to do this - you don't need to create all the complexity and overhead of creating a temporary filesystem in memory and pointing MySQL there... The key variable for InnoDB is innodb_buffer_pool_size, which I can't see you've tuned.
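A quick way to see what the server is actually doing with temporary tables, as a sketch using standard variables and status counters. An implicit in-memory temporary table is limited by the smaller of tmp_table_size and max_heap_table_size.

```sql
-- Effective in-memory limit is the smaller of these two values.
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';

-- The InnoDB setting the answer mentions as untuned.
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- How many temporary tables were created, and how many of them went to disk.
SHOW GLOBAL STATUS LIKE 'Created_tmp_tables';
SHOW GLOBAL STATUS LIKE 'Created_tmp_disk_tables';
```

If Created_tmp_disk_tables barely moves while the slow query runs, the temporary table is already in memory and moving tmpdir to tmpfs cannot help.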
There's plenty of documentation online, including a lot of (IMHO) good stuff by Percona. (I'm not affiliated with them, but I have worked with them; if you can afford a support contract with them - do it. They really know their stuff.)
I'm absolutely no expert in tuning MySQL, so I'm not going to comment on the options you've selected, except to say that I've spent weeks before reading and tuning - just to have the Percona team look at it and say "That's great, but you've missed this and got that wrong" - and had a noticeable improvement as a result!
Finally I'd point at some other things - indexes, schema and queries being the major ones. You've got two subqueries, I'd try to factor those out to see if that helps first. You'll need a representative data sample available in dev to tune the query properly. (I've used a read-only replication server for this in the past.) I'm not fully understanding what your query is trying to do but it looks like you can just join those tables in and group the overall result.
If I'm missing the obvious (likely!) - then I'd consider maintaining a table of the data in those subqueries separately. I've always used SPs to handle INSERTs by default since a DBA pointed out you can more easily add such cache logic in at a later time in a transactionally safe manner. So when you insert into ci_* tables, also update a table of the COUNT() data (if you can't factor out the subqueries) - so everything becomes a well-indexed set of joins.
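The summary-table idea could look roughly like this for the referral counts. Everything here is a hypothetical sketch: the table name ci_referrer_counts, the INT column types, and the @new_referrer_id variable are assumptions, and in practice the update would live in the same transaction (or stored procedure) as the insert into ci_referrers.

```sql
-- Hypothetical summary table: one row per referrer.
CREATE TABLE ci_referrer_counts (
  referrer_id INT NOT NULL PRIMARY KEY,
  referrals   INT NOT NULL DEFAULT 0
);

-- One-off backfill from the existing data.
INSERT INTO ci_referrer_counts (referrer_id, referrals)
SELECT referrer_id, COUNT(user_id)
FROM ci_referrers
WHERE referrer_id IS NOT NULL
GROUP BY referrer_id;

-- Whenever a referral row is added, bump the counter as well.
SET @new_referrer_id = 42;  -- illustrative value
INSERT INTO ci_referrer_counts (referrer_id, referrals)
VALUES (@new_referrer_id, 1)
ON DUPLICATE KEY UPDATE referrals = referrals + 1;

-- The report then joins a small, indexed table instead of a derived table.
SELECT u.id, u.user_login, c.referrals
FROM ci_users u
LEFT JOIN ci_referrer_counts c ON c.referrer_id = u.id
ORDER BY u.id
LIMIT 100, 150;
```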
The explains show that on prod the query does not use indexes on the u, <derived2> and <derived3> tables, while on dev it does. Scanned row numbers are significantly higher on prod as a result. The index names on the 2 derived tables suggest that these have been created by mysql on the fly, taking advantage of materialized derived tables optimisation strategy, which is available from mysql v5.6.5. Since no such optimization is present in the explain from the prod server, prod server may have an earlier mysql version.
As @Satevg supplied in a comment, the dev and prod environments have the following mysql versions:
Dev: debian 7, Mysql 5.6.28. Prod: debian 8, Mysql 5.5.44
This subtle difference in mysql version may explain the speed difference, since the dev server can take advantage of the materialization optimization strategy, while the prod - being v5.5 only - cannot.
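To confirm this on each server, a small check along these lines may help. It is a sketch that reuses only the referral part of the original query.

```sql
-- Run on both dev and prod.
SELECT VERSION();

-- On 5.6 the derived table row should show key = <auto_key0>, as in the dev
-- output above; on 5.5 it shows type = ALL with no key, as in the prod output.
EXPLAIN
SELECT u.id, r.referrals
FROM ci_users u
LEFT JOIN (SELECT referrer_id, COUNT(user_id) AS referrals
           FROM ci_referrers
           GROUP BY referrer_id) AS r
       ON r.referrer_id = u.id
ORDER BY u.id
LIMIT 100, 150;
```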

Improve MySQL query speed with lots of joins

I have a query in MySQL which uses multiple joins and it runs slow at the moment - on average it is taking around 35 seconds to run.
The query is:
SELECT t.id,
CASE t.emp_accepted
WHEN '1' THEN 'No'
WHEN '0' THEN 'Yes'
END AS accepted,
e.department,
e.works_id,
e.first_name,
e.sur_name,
e.job_title,
e.job_status,
e.site_id,
e.manager,
d1.department_name AS dept_name,
d2.department_name AS sub_dept_name,
temp_hours_worked.hours AS hours,
s.office_name AS site_name,
CONCAT(e2.first_name, ' ', e2.sur_name) AS manager_name,
CONCAT(e3.first_name, ' ', e3.sur_name) AS validated_by
FROM time t
LEFT JOIN employee e
ON t.employee_id = e.employee_id
LEFT JOIN departments d1
ON e.department = d1.id
LEFT JOIN departments d2
ON e.sub_department = d2.id
LEFT JOIN site s
ON e.site_id = s.id
LEFT JOIN employee e2
ON e.manager = e2.id
LEFT JOIN employee e3
ON t.manager_id = e3.id
LEFT JOIN temp_hours_worked
ON temp_hours_worked.week_beginning = t.week_beginning
AND temp_hours_worked.employee_id = t.employee_id
AND temp_hours_worked.company_id=?
WHERE t.company_id = ?;
Explain:
+----+-------------+-------------------+--------+---------------+-------------+---------+-----------------------------------------+------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+--------+---------------+-------------+---------+-----------------------------------------+------+-------+
| 1 | SIMPLE | t | ref | company_id | company_id | 4 | const | 5566 | |
| 1 | SIMPLE | e | ref | employee_id | employee_id | 4 | DBNAME.t.employee_id | 1 | |
| 1 | SIMPLE | d1 | eq_ref | PRIMARY | PRIMARY | 4 | DBNAME.e.department | 1 | |
| 1 | SIMPLE | d2 | eq_ref | PRIMARY | PRIMARY | 4 | DBNAME.e.sub_department | 1 | |
| 1 | SIMPLE | s | eq_ref | PRIMARY | PRIMARY | 4 | DBNAME.e.site_id | 1 | |
| 1 | SIMPLE | e2 | eq_ref | PRIMARY | PRIMARY | 4 | DBNAME.e.manager | 1 | |
| 1 | SIMPLE | e3 | eq_ref | PRIMARY | PRIMARY | 4 | DBNAME.t.manager_id | 1 | |
| 1 | SIMPLE | temp_hours_worked | ref | company_id | company_id | 4 | const | 5566 | |
+----+-------------+-------------------+--------+---------------+-------------+---------+-----------------------------------------+------+-------+
MySQL version is 5.5.31 running on Centos 6.5 64 bit and the server is 8 core, 4GB RAM with SSD disks. Load average on the box is:
load average: 0.24, 0.29, 0.29
and free memory shows as:
total used free shared buffers cached
Mem: 3880 3067 813 0 177 1065
-/+ buffers/cache: 1825 2055
Swap: 1023 0 1023
Disk space is OK:
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 45G 11G 32G 26% /
tmpfs 1.9G 0 1.9G 0% /dev/shm
/usr/tmpDSK 1008M 51M 907M 6% /tmp
Output from hdparm -Tt /dev/xvda1
Timing cached reads: 12538 MB in 1.99 seconds = 6297.76 MB/sec
Timing buffered disk reads: 826 MB in 3.00 seconds = 275.27 MB/sec
my.cnf:
[mysql]
# CLIENT #
port = 3306
socket = /var/lib/mysql/mysql.sock
[mysqld]
local-infile=0
# GENERAL #
user = mysql
default_storage_engine = InnoDB
socket = /var/lib/mysql/mysql.sock
pid_file = /var/lib/mysql/mysql.pid
# MyISAM #
key_buffer_size = 32M
myisam_recover = FORCE,BACKUP
# SAFETY #
max_allowed_packet = 16M
max_connect_errors = 1000000
skip_name_resolve
#sql_mode = NO_AUTO_CREATE_USER,NO_AUTO_VALUE_ON_ZERO,NO_ENGINE_SUBSTITUTION,NO_ZERO_DATE,NO_ZERO_IN_DATE,ONLY_FULL_GROUP_BY
sysdate_is_now = 1
innodb = FORCE
#innodb_strict_mode = 1
# DATA STORAGE #
datadir = /var/lib/mysql/
# BINARY LOGGING #
log_bin = /var/lib/mysql/mysql-bin
expire_logs_days = 14
sync_binlog = 1
# CACHES AND LIMITS #
tmp_table_size = 32M
max_heap_table_size = 32M
query_cache_type = 0
query_cache_size = 0
max_connections = 500
thread_cache_size = 50
open_files_limit = 65535
table_definition_cache = 4096
table_open_cache = 4096
# INNODB #
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 128M
innodb_flush_log_at_trx_commit = 1
innodb_file_per_table = 1
innodb_buffer_pool_size = 1456M
# LOGGING #
log_error = /var/lib/mysql/mysql-error.log
#log_queries_not_using_indexes = 1
slow_query_log = 1
slow_query_log_file = /var/lib/mysql/mysql-slow.log
The columns being queried/joined are all necessary and cannot be removed and I realise there is no index on quite a few of the columns but as they are only single rows I am not sure it matters - is there anything else I can do to speed this query up?
This query should never, ever be that slow on those hardware specs. The explain output indicates that all joined fields use pretty optimal indexes, and that the driving table scans a mere 5566 rows. The only index improvement could be a combined index on temp_hours_worked on fields week_beginning, employee_id, company_id, but that's never going to make much of a difference. There aren't even any filesorts or temp tables according to the explain output.
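For reference, the combined index mentioned above could be created as sketched here. The index name and the company_id value 1 are illustrative assumptions; since all three columns are compared with equality, the column order within the index is not critical.

```sql
-- Hypothetical index name covering all three join columns.
ALTER TABLE temp_hours_worked
  ADD INDEX idx_company_employee_week (company_id, employee_id, week_beginning);

-- Re-check the plan for the join: the temp_hours_worked row should now
-- report the new key and a much smaller `rows` estimate than 5566.
EXPLAIN
SELECT t.id, thw.hours
FROM time t
LEFT JOIN temp_hours_worked thw
       ON thw.week_beginning = t.week_beginning
      AND thw.employee_id = t.employee_id
      AND thw.company_id = 1
WHERE t.company_id = 1;
```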
I suspect you're either running into locking issues (the load you show is low, but doesn't tell how many simultaneous queries are running on these same tables) or your MySQL is incredibly underpowered by config (using the default tiny.config settings or comparable).
Things to check:
Use hdparm -Tt /dev/sdX to test drive performance - the SSD disks or RAID array may be borked
Check your performance settings. Don't hesitate to put all buffer settings in my.cnf at least at twice their current value, you have RAM to spare. A few may warrant extremely higher settings. A script like MySQLTuner may be of help with this.
Also check whether the issue is reproducible on another server.
A good beginning to up MySQL buffer values is adding this bit to your my.cnf:
key_buffer = 768M
table_cache = 1024
sort_buffer_size = 4M
read_buffer_size = 4M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 128M
query_cache_size = 128M
thread_concurrency = 16
table_open_cache = 2048
tmp_table_size = 64M
max_heap_table_size = 64M
You can review current values in phpMyAdmin (server -> variables).
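The same values can also be read from any SQL client. This is a minimal sketch; the list of names simply mirrors the settings discussed in this thread.

```sql
-- Current server-wide values for the buffers mentioned above.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN (
  'key_buffer_size',
  'table_open_cache',
  'sort_buffer_size',
  'read_buffer_size',
  'read_rnd_buffer_size',
  'query_cache_size',
  'tmp_table_size',
  'max_heap_table_size',
  'innodb_buffer_pool_size'
);
```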

MySQL: Modifying a query based on EXPLAIN plan

I have a long running query that I'd like to speed up. The result of the query is a new table.
The tables are all MYISAM and it is running on a large EC2 instance (m2.4xlarge, 64GB RAM).
System usage looks like this:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
17438 mysql 20 0 32.7g 7.1g 7420 S 2 10.6 1:35.82 mysqld
Relevant portion of my cnf:
key_buffer = 32768M
max_allowed_packet = 96M
thread_stack = 192K
thread_cache_size = 8
sort_buffer_size = 2M
read_buffer_size = 2M
read_rnd_buffer_size = 8M
table_cache = 512
thread_concurrency = 8
bulk_insert_buffer_size = 2048M
max_write_lock_count = 1
# ~1/4 of memory of machine
max_heap_table_size = 16384M
tmp_table_size = 16384M
# ~1/4 of memory
myisam_sort_buffer_size = 17179869184
When I run this simple query, it takes much longer than I think it should and memory and CPU usage on the machine is low.
The query explain plan looks like this:
mysql> explain SELECT encounter_id
-> FROM encounters e, sampled_patients sp
-> WHERE e.patient_id = sp.patient_id;
+----+-------------+-------+-------+---------------+------------+---------+--------------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+------------+---------+--------------------+---------+-------------+
| 1 | SIMPLE | sp | index | patient_id | patient_id | 4 | NULL | 1537954 | Using index |
| 1 | SIMPLE | e | ref | patient_id | patient_id | 5 | noah.sp.patient_id | 1 | Using where |
+----+-------------+-------+-------+---------------+------------+---------+--------------------+---------+-------------+
2 rows in set (0.00 sec)
It has to look through ~1.5M rows, but it's indexed. How can I speed this up?

Mysql 5.0.51 vs 5.1.66 performance issue

I have 2 mysql servers
Server A has mysql 5.0.51 - 8GB RAM Single Quad Core
Server B has mysql 5.1.66 - 64GB RAM - 2x Quad Core
Running the following query
select FULLNAME ,(select COUNT(*) FROM ORDERS S, ACCOUNTS T WHERE S.CREATED BETWEEN '2013-04-21 00:00' AND '2013-04-27 23:59' AND S.ACCOUNT=T.ACCOUNT AND T.USERNAME=U.USERNAME AND T.CUSTOMERSTATUS = 'Donation'
and T.TIMEST=
(SELECT TC.TIMEST FROM DETAILS A, ACCOUNTS TC WHERE S.ACCOUNT=A.ACCOUNT AND A.ACCOUNT = TC.ACCOUNT AND T.USERNAME=U.USERNAME AND T.CUSTOMERSTATUS = 'Donation' AND A.ANAL16 <> 'Cheque' order by TIMEST DESC LIMIT 1))
from USERS U
On server A it completes in 27 seconds
On server B it never finishes - I just terminated it now after sending data for 400 seconds.
Here is the configuration variables from server A
join_buffer_size 131072
key_buffer_size 16777216
myisam_sort_buffer_size 8388608
net_buffer_length 16384
preload_buffer_size 32768
read_buffer_size 131072
read_rnd_buffer_size 262144
sort_buffer_size 2097144
and the same from server B
join_buffer_size 131072
key_buffer_size 16777216
myisam_sort_buffer_size 8388608
net_buffer_length 16384
preload_buffer_size 32768
read_buffer_size 131072
read_rnd_buffer_size 262144
sort_buffer_size 2097144
sql_buffer_result OFF
I just can't figure out why it doesn't complete on the faster, much more powerful server.
I found a few posts online, but they all said it was an 'indexing' issue, and I can't fathom how it's any different between the 2 machines: I took the dump this morning with all the indexes and it all re-imported fine.
Any help would be much appreciated!
Update with explain code
Server A
1 PRIMARY U ALL NULL NULL NULL NULL 57 Using where; Using temporary; Using filesort
3 DEPENDENT SUBQUERY S ALL PRIMARY,ACCSTO0472 NULL NULL NULL 3948 Using where; Using temporary
3 DEPENDENT SUBQUERY T ref PRIMARY,TELCOM0473 TELCOM0473 9 func 1 Using where
4 DEPENDENT SUBQUERY TC ref PRIMARY,TELCOM0472 PRIMARY 98 tms42_gg.S.ACCOUNT 2273 Using where; Using temporary; Using filesort
4 DEPENDENT SUBQUERY A ref PRIMARY,RCMANL0472,RCMANL0473 RCMANL0473 98 tms42_gg.S.ACCOUNT 1 Using where; Using index
2 DEPENDENT SUBQUERY R ALL PRIMARY NULL NULL NULL 636 Using where; Using temporary
2 DEPENDENT SUBQUERY T ref PRIMARY,ACCSTO0122 ACCSTO0122 250 tms42_gg.R.ACCOUNT,tms42_gg.U.USERNAME 1 Using where
Server B
| 1 | PRIMARY | U | ALL | NULL | NULL | NULL | NULL | 57 | Using where; Using temporary; Using filesort |
| 3 | DEPENDENT SUBQUERY | S | ALL | PRIMARY,ACCSTO0472 | NULL | NULL | NULL | 3948 | Using where; Using temporary |
| 3 | DEPENDENT SUBQUERY | T | ref | PRIMARY,TELCOM0473,TELCOM047J,TELCOM047JR | TELCOM0473 | 9 | func | 1 | Using where |
| 4 | DEPENDENT SUBQUERY | TC | index | PRIMARY,TELCOM0472,TELCOM047J,TELCOM047JR | TELCOM0473 | 9 | NULL | 1 | Using where; Using temporary |
| 4 | DEPENDENT SUBQUERY | A | ref | PRIMARY,RCMANL0472,RCMANL0473 | RCMANL0473 | 98 | tms42_gg.S.ACCOUNT | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | R | ALL | PRIMARY | NULL | NULL | NULL | 636 | Using where; Using temporary |
| 2 | DEPENDENT SUBQUERY | T | ref | PRIMARY,ACCSTO0122 | ACCSTO0122 | 250 | tms42_gg.R.ACCOUNT,tms42_gg.U.USERNAME | 1 | Using where |
I set SESSION SQL_BUFFER_RESULT= ON before running the explain in both places - still the same results!
That SQL looks pretty inefficient with nested correlated queries.
Not sure I have understood quite what you are trying to do, but try recoding it something like this:-
SELECT U.FULLNAME , Sub2.RecCount
FROM USERS U
LEFT OUTER JOIN (select T.USERNAME, COUNT(*) AS RecCount
FROM ORDERS S
INNER JOIN ACCOUNTS T ON S.ACCOUNT = T.ACCOUNT
INNER JOIN (SELECT A.ACCOUNT, MAX(TC.TIMEST) AS MaxTimeSt
FROM DETAILS A
INNER JOIN ACCOUNTS TC ON A.ACCOUNT = TC.ACCOUNT
WHERE A.ANAL16 != 'Cheque'
GROUP BY A.ACCOUNT) Sub1 ON S.ACCOUNT = Sub1.ACCOUNT AND T.TIMEST = Sub1.MaxTimeSt
WHERE S.CREATED BETWEEN '2013-04-21 00:00' AND '2013-04-27 23:59'
AND T.CUSTOMERSTATUS = 'Donation'
GROUP BY T.USERNAME) Sub2
ON Sub2.USERNAME = U.USERNAME
Not tested so please excuse any typos