MySQL 5.0.51 vs 5.1.66 performance issue - mysql

I have two MySQL servers:
Server A has MySQL 5.0.51 - 8GB RAM, single quad core
Server B has MySQL 5.1.66 - 64GB RAM, 2x quad core
Running the following query:
select FULLNAME ,(select COUNT(*) FROM ORDERS S, ACCOUNTS T WHERE S.CREATED BETWEEN '2013-04-21 00:00' AND '2013-04-27 23:59' AND S.ACCOUNT=T.ACCOUNT AND T.USERNAME=U.USERNAME AND T.CUSTOMERSTATUS = 'Donation'
and T.TIMEST=
(SELECT TC.TIMEST FROM DETAILS A, ACCOUNTS TC WHERE S.ACCOUNT=A.ACCOUNT AND A.ACCOUNT = TC.ACCOUNT AND T.USERNAME=U.USERNAME AND T.CUSTOMERSTATUS = 'Donation' AND A.ANAL16 <> 'Cheque' order by TIMEST DESC LIMIT 1))
from USERS U
On server A it completes in 27 seconds.
On server B it never finishes - I just killed it after it had been in 'Sending data' for 400 seconds.
Here are the configuration variables from server A:
join_buffer_size 131072
key_buffer_size 16777216
myisam_sort_buffer_size 8388608
net_buffer_length 16384
preload_buffer_size 32768
read_buffer_size 131072
read_rnd_buffer_size 262144
sort_buffer_size 2097144
and the same from server B
join_buffer_size 131072
key_buffer_size 16777216
myisam_sort_buffer_size 8388608
net_buffer_length 16384
preload_buffer_size 32768
read_buffer_size 131072
read_rnd_buffer_size 262144
sort_buffer_size 2097144
sql_buffer_result OFF
I just can't figure out why it doesn't complete on the faster, much more powerful server.
I found a few posts online, but they all said it was an 'indexing' issue, and I can't see how that would differ between the two machines - I took the dump this morning with all the indexes, and it all re-imported fine.
Any help would be much appreciated!
Update with EXPLAIN output:
Server A
| 1 | PRIMARY | U | ALL | NULL | NULL | NULL | NULL | 57 | Using where; Using temporary; Using filesort |
| 3 | DEPENDENT SUBQUERY | S | ALL | PRIMARY,ACCSTO0472 | NULL | NULL | NULL | 3948 | Using where; Using temporary |
| 3 | DEPENDENT SUBQUERY | T | ref | PRIMARY,TELCOM0473 | TELCOM0473 | 9 | func | 1 | Using where |
| 4 | DEPENDENT SUBQUERY | TC | ref | PRIMARY,TELCOM0472 | PRIMARY | 98 | tms42_gg.S.ACCOUNT | 2273 | Using where; Using temporary; Using filesort |
| 4 | DEPENDENT SUBQUERY | A | ref | PRIMARY,RCMANL0472,RCMANL0473 | RCMANL0473 | 98 | tms42_gg.S.ACCOUNT | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | R | ALL | PRIMARY | NULL | NULL | NULL | 636 | Using where; Using temporary |
| 2 | DEPENDENT SUBQUERY | T | ref | PRIMARY,ACCSTO0122 | ACCSTO0122 | 250 | tms42_gg.R.ACCOUNT,tms42_gg.U.USERNAME | 1 | Using where |
Server B
| 1 | PRIMARY | U | ALL | NULL | NULL | NULL | NULL | 57 | Using where; Using temporary; Using filesort |
| 3 | DEPENDENT SUBQUERY | S | ALL | PRIMARY,ACCSTO0472 | NULL | NULL | NULL | 3948 | Using where; Using temporary |
| 3 | DEPENDENT SUBQUERY | T | ref | PRIMARY,TELCOM0473,TELCOM047J,TELCOM047JR | TELCOM0473 | 9 | func | 1 | Using where |
| 4 | DEPENDENT SUBQUERY | TC | index | PRIMARY,TELCOM0472,TELCOM047J,TELCOM047JR | TELCOM0473 | 9 | NULL | 1 | Using where; Using temporary |
| 4 | DEPENDENT SUBQUERY | A | ref | PRIMARY,RCMANL0472,RCMANL0473 | RCMANL0473 | 98 | tms42_gg.S.ACCOUNT | 1 | Using where; Using index |
| 2 | DEPENDENT SUBQUERY | R | ALL | PRIMARY | NULL | NULL | NULL | 636 | Using where; Using temporary |
| 2 | DEPENDENT SUBQUERY | T | ref | PRIMARY,ACCSTO0122 | ACCSTO0122 | 250 | tms42_gg.R.ACCOUNT,tms42_gg.U.USERNAME | 1 | Using where |
I set SESSION SQL_BUFFER_RESULT= ON before running the explain in both places - still the same results!

That SQL looks pretty inefficient with its nested correlated subqueries.
I'm not sure I've quite understood what you are trying to do, but try recoding it to something like this:
SELECT U.FULLNAME , Sub2.RecCount
FROM USERS U
LEFT OUTER JOIN (select T.USERNAME, COUNT(*) AS RecCount
FROM ORDERS S
INNER JOIN ACCOUNTS T ON S.ACCOUNT = T.ACCOUNT
INNER JOIN (SELECT A.ACCOUNT, MAX(TC.TIMEST) AS MaxTimeSt
FROM DETAILS A
INNER JOIN ACCOUNTS TC ON A.ACCOUNT = TC.ACCOUNT
WHERE A.ANAL16 != 'Cheque'
GROUP BY A.ACCOUNT) Sub1 ON S.ACCOUNT = Sub1.ACCOUNT AND T.TIMEST = Sub1.MaxTimeSt
WHERE S.CREATED BETWEEN '2013-04-21 00:00' AND '2013-04-27 23:59'
AND T.USERNAME = U.USERNAME
AND T.CUSTOMERSTATUS = 'Donation'
GROUP BY T.USERNAME) Sub2
ON Sub2.USERNAME = U.USERNAME
Not tested so please excuse any typos
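The equivalence of the two shapes is easy to sanity-check on toy data. Here is a minimal sketch in Python with sqlite3, on a simplified stand-in schema (not the asker's real tables): the per-row correlated COUNT and the aggregate-once-then-LEFT-JOIN derived table return the same counts.

```python
import sqlite3

# Simplified stand-in schema; table/column names are illustrative only.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE users(username TEXT, fullname TEXT);
CREATE TABLE accounts(account TEXT, username TEXT, status TEXT);
CREATE TABLE orders(account TEXT, created TEXT);
INSERT INTO users VALUES ('alice','Alice A'),('bob','Bob B');
INSERT INTO accounts VALUES ('a1','alice','Donation'),('b1','bob','Donation'),('b2','bob','Other');
INSERT INTO orders VALUES ('a1','2013-04-22'),('a1','2013-04-23'),('b1','2013-04-25'),('b2','2013-04-26');
""")

# Correlated form: one subquery executed per user row.
correlated = con.execute("""
SELECT u.fullname,
       (SELECT COUNT(*) FROM orders s JOIN accounts t ON s.account = t.account
        WHERE t.username = u.username AND t.status = 'Donation'
          AND s.created BETWEEN '2013-04-21' AND '2013-04-27') AS cnt
FROM users u ORDER BY u.fullname
""").fetchall()

# Derived-table form: aggregate once, then LEFT JOIN the totals back on.
derived = con.execute("""
SELECT u.fullname, COALESCE(sub.cnt, 0) AS cnt
FROM users u
LEFT JOIN (SELECT t.username, COUNT(*) AS cnt
           FROM orders s JOIN accounts t ON s.account = t.account
           WHERE t.status = 'Donation'
             AND s.created BETWEEN '2013-04-21' AND '2013-04-27'
           GROUP BY t.username) sub ON sub.username = u.username
ORDER BY u.fullname
""").fetchall()

print(correlated)  # [('Alice A', 2), ('Bob B', 1)]
assert correlated == derived
```

The derived-table form is evaluated once for the whole result, which is what makes it cheap relative to re-running the subquery for every row of USERS.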

Related

mysql query returns empty results intermittently

I have the following query that sometimes returns an empty set on the master but NEVER on the read replica, even though matching data exists in both databases. It is random, and I am wondering if there is a MySQL setting involved, or something with the query cache. Running MySQL 5.6.40-log on RDS.
I have tried optimizer_switch="index_merge_intersection=off" but it didn't work.
UPDATE: optimizer_switch="index_merge_intersection=off" seems to have worked after all (I cleared the query cache after making this change, and the problem seems to have resolved itself).
One really odd thing that happened is that the query worked via the mysql command line 100% of the time, but the web application didn't work until I cleared the query cache (even though it connects as the same user).
Once I run OPTIMIZE TABLE phppos_items it fixes it for a little while (3 minutes), and then it goes back to being erratic (mostly empty sets). These are all InnoDB tables.
settings:
https://gist.github.com/blasto333/82b18ef979438b93e4c39624bbf489d7
It seems to return an empty set more often during the busy time of day. The server is an RDS m4.large with 500 databases of 100 tables each.
Query:
SELECT SUM( phppos_sales_items.damaged_qty ) AS damaged_qty,
SUM( phppos_sales_items.subtotal ) AS subtotal,
SUM( phppos_sales_items.total ) AS total,
SUM( phppos_sales_items.tax ) AS tax,
SUM( phppos_sales_items.profit ) AS profit
FROM `phppos_sales`
JOIN `phppos_sales_items` ON `phppos_sales_items`.`sale_id` = `phppos_sales`.`sale_id`
JOIN `phppos_items` ON `phppos_sales_items`.`item_id` = `phppos_items`.`item_id`
WHERE `phppos_sales`.`deleted` =0
AND `sale_time` BETWEEN '2019-01-01 00:00:00' AND '2019-12-31 23:59:59'
AND `phppos_sales`.`location_id` IN ( 1 )
AND `phppos_sales`.`store_account_payment` =0
AND `suspended` <2
AND `phppos_items`.`deleted` =0
AND `phppos_items`.`supplier_id` = '485'
GROUP BY `phppos_sales_items`.`sale_id`
Explain:
+----+-------------+--------------------+-------------+-----------------------------------------------------------------------------------------------+-----------------------------+---------+-------------------------------------------------------+------+---------------------------------------------------------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------------+-----------------------------------------------------------------------------------------------+-----------------------------+---------+-------------------------------------------------------+------+---------------------------------------------------------------------------------------------------------+
| 1 | SIMPLE | phppos_items | index_merge | PRIMARY,phppos_items_ibfk_1,deleted,deleted_system_item | phppos_items_ibfk_1,deleted | 5,4 | NULL | 44 | Using intersect(phppos_items_ibfk_1,deleted); Using where; Using index; Using temporary; Using filesort |
| 1 | SIMPLE | phppos_sales_items | ref | PRIMARY,item_id,phppos_sales_items_ibfk_3,phppos_sales_items_ibfk_4,phppos_sales_items_ibfk_5 | item_id | 4 | phppoint_customer.phppos_items.item_id | 16 | NULL |
| 1 | SIMPLE | phppos_sales | eq_ref | PRIMARY,deleted,location_id,sales_search,phppos_sales_ibfk_10 | PRIMARY | 4 | phppoint_customer.phppos_sales_items.sale_id | 1 | Using where |
+----+-------------+--------------------+-------------+-----------------------------------------------------------------------------------------------+-----------------------------+---------+-------------------------------------------------------+------+---------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

How to execute a very huge MySQL query? (it took several days, then stopped without success)

I need to execute a query in DATABASE2 (60GB) that fetches data from DATABASE1 (80GB). Some tables have 400M rows in both databases.
INSERT IGNORE INTO product_to_category (
SELECT DISTINCT p.product_id, pds.nodeid
FROM product p
JOIN DATABASE2.article_links al ON al.supplierid=p.manufacturer_id
AND al.datasupplierarticlenumber=p.mpn
JOIN DATABASE2.passanger_car_pds pds ON al.productid=pds.productid
)
The execution took more than 6 days(!), then stopped without inserting any rows into the table.
[root@XXXX ~]# mysqladmin pr
+--------+-------------+-------------------+-------------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+-------------+-------------------+-------------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
| 939 | root | localhost | mws_autocms | Query | 408622 | Sending data | INSERT IGNORE INTO product_to_category (
SELECT p.product_id, pds.nodeid
FROM product p
JOIN DATABASE2 |
| 107374 | root | localhost | | Query | 0 | starting | show processlist |
+--------+-------------+-------------------+-------------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
If I run the query with LIMIT 100 at the end, it executes and inserts data into the table.
I tuned MySQL to:
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 1G
innodb_log_buffer_size = 512M
query_cache_size = 0
query_cache_type = 0
innodb_buffer_pool_size = 12G
innodb_buffer_pool_instances = 8
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_flush_log_at_trx_commit = 2
innodb_large_prefix = 1
innodb_file_per_table = 1
innodb_file_format = Barracuda
max_allowed_packet = 1024M
lower_case_table_names = 1
Without any success.
Any help/advice on running this query would be appreciated - I've been struggling for weeks.
Here is the output of the EXPLAIN command:
+----+-------------+---------------------+------------+------+--------------------------------------------------------+---------------------------+---------+-------------------------------------------------+---------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------------+------------+------+--------------------------------------------------------+---------------------------+---------+-------------------------------------------------+---------+----------+--------------------------+
| 1 | INSERT | product_to_category | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| 1 | SIMPLE | p | NULL | ALL | manufacturer_id | NULL | NULL | NULL | 5357582 | 100.00 | Using temporary |
| 1 | SIMPLE | al | NULL | ref | PRIMARY,productid,supplierid,datasupplierarticlenumber | datasupplierarticlenumber | 100 | mws_autocms.p.mpn,mws_autocms.p.manufacturer_id | 56 | 100.00 | Using where; Using index |
| 1 | SIMPLE | pds | NULL | ref | productid | productid | 4 | mws_tecdoc_2018_4_fr.al.productid | 1322 | 100.00 | Using where; Using index |
+----+-------------+---------------------+------------+------+--------------------------------------------------------+---------------------------+---------+-------------------------------------------------+---------+----------+--------------------------+
This is way too broad to answer here. My response is really a comment - but it's a bit long.
"I've been struggling for weeks" - and this is all you have to show for it?
I tuned MySQL to:
Why? How? What hardware is this running on?
Given that several of these options require a restart, does that mean you have exclusive use of the DB instance? If so, why are you using O_DIRECT?
Why the join when you are only using the data from one table?
Some tables have 400M rows in both databases.
You need to have a better understanding of cardinality or how to communicate this.
then stopped without inserting any rows into the table
Why did it stop without inserting? What did you do to investigate?
Whenever I get stuck with things like this, I start to break the requirement down and introduce intermediary steps into my plan. From reading your question, you need to:
1) Join data from multiple sources together and then
2) Insert that resultset into another database.
You can therefore break it down into multiple steps giving the database less to do before it times out.
Create a table of ONLY the data you want to insert (one query, something like the following)
CREATE TABLE dataToImport AS
SELECT DISTINCT p.product_id, pds.nodeid
FROM product p
JOIN DATABASE2.article_links al ON al.supplierid=p.manufacturer_id
AND al.datasupplierarticlenumber=p.mpn
JOIN DATABASE2.passanger_car_pds pds ON al.productid=pds.productid
Then import that data:
INSERT IGNORE INTO product_to_category SELECT product_id, nodeid FROM dataToImport
It's a bit of a crude approach, but it means the database is doing less work in any single hit, so you might find it solves your problem.
If it still doesn't work, you need to understand how big the result set of that SELECT is, so run the SELECT on its own first and look at the output.
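Along the same lines of giving the database less to do per hit, the final copy step can itself be chunked. A sketch in Python with sqlite3 (sqlite spells it INSERT OR IGNORE rather than MySQL's INSERT IGNORE; table names follow the answer above, the data is made up): walk the staging table in primary-key order and commit after each batch, so no single transaction has to cover hundreds of millions of rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE dataToImport(product_id INTEGER, nodeid INTEGER);
CREATE TABLE product_to_category(product_id INTEGER, nodeid INTEGER,
                                 PRIMARY KEY(product_id, nodeid));
""")
# Stand-in staging data: 1000 rows with unique product_ids.
con.executemany("INSERT INTO dataToImport VALUES (?, ?)",
                [(i, i % 7) for i in range(1, 1001)])

BATCH = 100
last_id = 0
while True:
    # Fetch the next key range from the staging table.
    rows = con.execute(
        "SELECT product_id, nodeid FROM dataToImport "
        "WHERE product_id > ? ORDER BY product_id LIMIT ?",
        (last_id, BATCH)).fetchall()
    if not rows:
        break
    con.executemany(
        "INSERT OR IGNORE INTO product_to_category VALUES (?, ?)", rows)
    con.commit()           # keep each transaction small
    last_id = rows[-1][0]  # resume after the last key copied

total = con.execute("SELECT COUNT(*) FROM product_to_category").fetchone()[0]
print(total)  # 1000
```

Keying the batches on the primary key (rather than LIMIT/OFFSET) means each batch query is an index range scan, and the loop can be safely restarted from the last committed key if it is interrupted.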

MySQL Entity Framework Wraps query into sub-select for Order By

We support both MSSQL and MySQL for Entity Framework 6 in an MVC 5 application. The problem I am having is that with the MySQL connector and LINQ, queries that have an INNER JOIN and an ORDER BY get wrapped in a sub-select, with the ORDER BY applied on the outside. This causes a substantial performance impact. It does not happen with the MSSQL connector. Here is an example:
SELECT
`Project3`.*
FROM
(SELECT
`Extent1`.*,
`Extent2`.`Name_First`
FROM
`ResultRecord` AS `Extent1`
LEFT OUTER JOIN `ResultInputEntity` AS `Extent2` ON `Extent1`.`Id` = `Extent2`.`Id`
WHERE
`Extent1`.`DateCreated` <= '4/4/2016 6:29:59 PM'
AND `Extent1`.`DateCreated` >= '12/31/2015 6:30:00 PM'
AND 0000 = `Extent1`.`CustomerId`
AND (`Extent1`.`InUseById` IS NULL OR 0000 = `Extent1`.`InUseById` OR `Extent1`.`LockExpiration` < '4/4/2016 6:29:59 PM')
AND `Extent1`.`DivisionId` IN (0000)
AND `Extent1`.`IsDeleted` != 1
AND EXISTS( SELECT
1 AS `C1`
FROM
`ResultInputEntityIdentification` AS `Extent3`
WHERE
`Extent1`.`Id` = `Extent3`.`InputEntity_Id`
AND 0 = `Extent3`.`Type`
AND '0000' = `Extent3`.`Number`
AND NOT (`Extent3`.`Number` IS NULL)
OR LENGTH(`Extent3`.`Number`) = 0)
AND EXISTS( SELECT
1 AS `C1`
FROM
`ResultRecordAssignment` AS `Extent4`
WHERE
1 = `Extent4`.`AssignmentType`
AND `Extent4`.`AssignmentId` = 0000
OR 2 = `Extent4`.`AssignmentType`
AND `Extent4`.`AssignmentId` = 0000
AND `Extent4`.`ResultRecordId` = `Extent1`.`Id`)) AS `Project3`
ORDER BY `Project3`.`DateCreated` ASC , `Project3`.`Name_First` ASC , `Project3`.`Id` ASC
LIMIT 0 , 25
This query simply times out when run against a few million rows. This is the EXPLAIN for the above query:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
| 1 | PRIMARY | Extent1 | ref | IX_ResultRecord_CustomerId,IX_ResultRecord_DateCreated,IX_ResultRecord_IsDeleted,IX_ResultRecord_InUseById,IX_ResultRecord_LockExpiration,IX_ResultRecord_DivisionId | IX_ResultRecord_CustomerId | 4 | const | 1 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | Extent2 | ref | PRIMARY | PRIMARY | 8 | Extent1.Id | 1 | |
| 4 | DEPENDENT SUBQUERY | Extent4 | ref | IX_RA_AT,IX_RA_A_ID,IX_RA_RR_ID | IX_RA_A_ID | 5 | const | 1 | Using where |
| 3 | DEPENDENT SUBQUERY | Extent3 | ALL | IX_InputEntity_Id,IX_InputEntityIdentification_Type,IX_InputEntityIdentification_Number | | | | 14341877 | Using where
Now, written as it would be generated for MSSQL - that is, with the sub-select removed and the ORDER BY applied directly - the improvement is dramatic:
SELECT
`Extent1`.*,
`Extent2`.`Name_First`
FROM
`ResultRecord` AS `Extent1`
LEFT OUTER JOIN `ResultInputEntity` AS `Extent2` ON `Extent1`.`Id` = `Extent2`.`Id`
WHERE
`Extent1`.`DateCreated` <= '4/4/2016 6:29:59 PM'
AND `Extent1`.`DateCreated` >= '12/31/2015 6:30:00 PM'
AND 0000 = `Extent1`.`CustomerId`
AND (`Extent1`.`InUseById` IS NULL
OR 0000 = `Extent1`.`InUseById`
OR `Extent1`.`LockExpiration` < '4/4/2016 6:29:59 PM')
AND `Extent1`.`DivisionId` IN (0000)
AND `Extent1`.`IsDeleted` != 1
AND EXISTS( SELECT
1 AS `C1`
FROM
`ResultInputEntityIdentification` AS `Extent3`
WHERE
`Extent1`.`Id` = `Extent3`.`InputEntity_Id`
AND 9 = `Extent3`.`Type`
AND '0000' = `Extent3`.`Number`
AND NOT (`Extent3`.`Number` IS NULL)
OR LENGTH(`Extent3`.`Number`) = 0)
AND EXISTS( SELECT
1 AS `C1`
FROM
`ResultRecordAssignment` AS `Extent4`
WHERE
1 = `Extent4`.`AssignmentType`
AND `Extent4`.`AssignmentId` = 0000
OR 2 = `Extent4`.`AssignmentType`
AND `Extent4`.`AssignmentId` = 0000
AND `Extent4`.`ResultRecordId` = `Extent1`.`Id`)
ORDER BY `Extent1`.`DateCreated` ASC , `Extent2`.`Name_First` ASC , `Extent1`.`Id` ASC
LIMIT 0 , 25
This query now runs in 0.10 seconds! And the explain plan is now this:
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | extra |
| 1 | PRIMARY | <subquery2> | ALL | distinct_key | | | | 1 | Using temporary; Using filesort |
| 1 | PRIMARY | Extent1 | ref | PRIMARY,IX_ResultRecord_CustomerId,IX_ResultRecord_DateCreated,IX_ResultRecord_IsDeleted,IX_ResultRecord_InUseById,IX_ResultRecord_LockExpiration,IX_ResultRecord_DivisionId | PRIMARY | 8 | Extent3.InputEntity_Id | 1 | Using where |
| 1 | PRIMARY | Extent4 | ref | IX_RA_AT,IX_RA_A_ID,IX_RA_RR_ID | IX_RA_RR_ID | 8 | Extent3.InputEntity_Id | 1 | Using where; Start temporary; End temporary |
| 1 | PRIMARY | Extent2 | ref | PRIMARY | PRIMARY | 8 | Extent3.InputEntity_Id | 1 | |
| 2 | MATERIALIZED | Extent3 | ref | IX_InputEntity_Id,IX_InputEntityIdentification_Type,IX_InputEntityIdentification_Number | IX_InputEntityIdentification_Type | 4 | const | 1 | Using where |
Now, I have hit this issue many times across the system, and it is clear that the MySQL EF6 connector always wraps a query in a sub-select to apply the ORDER BY, but only when the query contains a join. This is causing major performance issues. Some answers I have seen suggest modifying the connector source code, but that is tedious. Has anyone had this same issue, found a workaround, or already modified the connector? Any other suggestions are welcome, besides simply moving to SQL Server and leaving MySQL behind - that is not an option.
Did you look at the SQL generated for SQL Server? Is the query structure different, or only the performance?
Usually it is not the provider that decides the structure of the query (i.e. whether to order a subquery). The provider just translates the structure of the query into the syntax of the DBMS. So in your case the problem could be the DBMS optimizer.
In issues similar to yours I have used a different approach, based on mapping a query to entities, i.e. using ObjectContext.ExecuteStoreQuery.
It turns out that to work around this with the MySQL driver, your entire lambda must be written in one go - meaning in ONE Where(..) predicate. That way the driver knows it is all one result set. If you instead build an initial IQueryable and then keep appending Where clauses that access child tables, it believes there are multiple result sets and therefore wraps your entire query in a sub-select in order to sort and limit it.

Different sql explains on two servers. "Copying to tmp table" is extremely slow

I have a query that takes less time to execute on the dev server than on prod (the database is the same). The prod server is much more powerful (64GB RAM, 12 cores, etc.).
Here's the query:
SELECT `u`.`id`,
`u`.`user_login`,
`u`.`last_name`,
`u`.`first_name`,
`r`.`referrals`,
`pr`.`worker`,
`rep`.`repurchase`
FROM `ci_users` `u`
LEFT JOIN
(SELECT `referrer_id`,
COUNT(user_id) referrals
FROM ci_referrers
GROUP BY referrer_id) AS `r` ON `r`.`referrer_id` = `u`.`id`
LEFT JOIN
(SELECT `user_id`,
`expire`,
SUM(`quantity`) worker
FROM ci_product_11111111111111111
GROUP BY `user_id`) AS `pr` ON `pr`.`user_id` = `u`.`id`
AND (`pr`.`expire` > '2015-12-10 09:23:45'
OR `pr`.`expire` IS NULL)
LEFT JOIN `ci_settings` `rep` ON `u`.`id` = `rep`.`id`
ORDER BY `id` ASC LIMIT 100,
150;
Here is the explain result on the dev server:
+----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+
| 1 | PRIMARY | u | index | NULL | PRIMARY | 4 | NULL | 1 | NULL |
| 1 | PRIMARY | <derived2> | ref | <auto_key0> | <auto_key0> | 5 | dev1.u.id | 10 | NULL |
| 1 | PRIMARY | <derived3> | ref | <auto_key1> | <auto_key1> | 5 | dev1.u.id | 15 | Using where |
| 1 | PRIMARY | rep | eq_ref | PRIMARY | PRIMARY | 4 | dev1.u.id | 1 | NULL |
| 3 | DERIVED | ci_product_11111111111111111 | ALL | NULL | NULL | NULL | NULL | 30296 | Using temporary; Using filesort |
| 2 | DERIVED | ci_referrers | ALL | NULL | NULL | NULL | NULL | 11503 | Using temporary; Using filesort |
+----+-------------+------------------------------+--------+---------------+-------------+---------+-----------+-------+---------------------------------+
And this one from prod:
+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+
| 1 | PRIMARY | u | ALL | NULL | NULL | NULL | NULL | 10990 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2628 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 8830 | |
| 1 | PRIMARY | rep | eq_ref | PRIMARY | PRIMARY | 4 | prod123.u.id | 1 | |
| 3 | DERIVED | ci_product_11111111111111111 | ALL | NULL | NULL | NULL | NULL | 28427 | Using temporary; Using filesort |
| 2 | DERIVED | ci_referrers | ALL | NULL | NULL | NULL | NULL | 11837 | Using temporary; Using filesort |
+----+-------------+------------------------------+--------+---------------+---------+---------+--------------+-------+---------------------------------+
Profiling on the prod server showed me something like this:
............................................
| statistics | 0.000030 |
| preparing | 0.000026 |
| Creating tmp table | 0.000037 |
| executing | 0.000008 |
| Copying to tmp table | 5.170296 |
| Sorting result | 0.001223 |
| Sending data | 0.000133 |
| Waiting for query cache lock | 0.000005 |
............................................
After googling a while I decided to move temporary tables into RAM:
/etc/fstab:
tmpfs /var/tmpfs tmpfs rw,uid=110,gid=115,size=16G,nr_inodes=10k,mode=0700 0 0
directory rules:
drwxrwxrwt 2 mysql mysql 40 Dec 15 13:57 tmpfs
/etc/mysql/my.cnf(played a lot with values):
[client]
port = 3306
socket = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
[mysqld]
user = mysql
pid-file = /var/run/mysqld/mysqld.pid
socket = /var/run/mysqld/mysqld.sock
port = 3306
basedir = /usr
datadir = /var/lib/mysql
tmpdir = /var/tmpfs
lc-messages-dir = /usr/share/mysql
skip-external-locking
bind-address = 127.0.0.1
key_buffer = 16000M
max_allowed_packet = 16M
thread_stack = 192K
thread_cache_size = 150
myisam-recover = BACKUP
tmp_table_size = 512M
max_heap_table_size = 1024M
max_connections = 100000
table_cache = 1024
innodb_thread_concurrency = 0
innodb_read_io_threads = 64
innodb_write_io_threads = 64
query_cache_limit = 1000M
query_cache_size = 10000M
log_error = /var/log/mysql/error.log
expire_logs_days = 10
max_binlog_size = 100M
[mysqldump]
quick
quote-names
max_allowed_packet = 16M
[mysql]
[isamchk]
key_buffer = 16M
And it doesn't work. Execution time is still the same, around 5 seconds.
Can you please answer 2 questions:
What's wrong with the tmpfs configuration?
Why are the explains different between the servers, and how can I optimize this query? (Even without tmpfs; I figured out that if the last ORDER BY is removed, the query completes much faster.)
Thanks in advance.
Why are the explains different between the servers, and how can I optimize this query? (Even without tmpfs; I figured out that if the last ORDER BY is removed, the query completes much faster.)
You say "database is the same", but from the explain outputs you presumably mean "the schema is the same". It looks like there is a lot more data in the production schema? MySQL optimises the way it handles queries based on the amount of data, index sizes, etc. That'll explain (at the highest level) why you're seeing such dramatic differences.
The column of your explain outputs to look at is "rows". Notice how the two derived tables were very small in dev? It looks like (you could ask in #mysql on freenode IRC to confirm) that MySQL was creating indexes for the derived tables in dev, but choosing not to in production (possibly because there were so many more records?).
What's wrong with tmpfs configuration?
Nothing. :) MySQL creates temporary tables in memory until the amount of data in them hits a certain size (tmp_table_size) before it writes temporary data to disk. You can trust MySQL to do this - you don't need to create all the complexity and overhead of creating a temporary filesystem in memory and pointing MySQL there... The key variable for InnoDB is innodb_buffer_pool_size, which I can't see you've tuned.
There's plenty of documentation online, including a lot of (IMHO) good stuff by Percona. (I'm not affiliated with them, but I have worked with them; if you can afford a support contract with them - do it. They really know their stuff.)
I'm absolutely no expert in tuning MySQL, so I'm not going to comment on the options you've selected, except to say that I've spent weeks before reading and tuning - just to have the Percona team look at it and say "That's great, but you've missed this and got that wrong" - and had a noticeable improvement as a result!
Finally I'd point at some other things - indexes, schema and queries being the major ones. You've got two subqueries, I'd try to factor those out to see if that helps first. You'll need a representative data sample available in dev to tune the query properly. (I've used a read-only replication server for this in the past.) I'm not fully understanding what your query is trying to do but it looks like you can just join those tables in and group the overall result.
If I'm missing the obvious (likely!) - then I'd consider maintaining a table of the data in those subqueries separately. I've always used SPs to handle INSERTs by default since a DBA pointed out you can more easily add such cache logic in at a later time in a transactionally safe manner. So when you insert into ci_* tables, also update a table of the COUNT() data (if you can't factor out the subqueries) - so everything becomes a well-indexed set of joins.
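That cache-table idea can be sketched concretely. The example below uses Python with sqlite3 and a trigger purely for illustration (in MySQL the same bump could live in the insert stored procedure or a trigger; all names here are made up): every insert into the referrals table keeps a per-referrer counter current, so the report query becomes a plain indexed join instead of a GROUP BY derived table.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ci_referrers(referrer_id INTEGER, user_id INTEGER);
CREATE TABLE referral_counts(referrer_id INTEGER PRIMARY KEY,
                             referrals INTEGER NOT NULL);
-- Keep the aggregate current on every insert: ensure a row exists,
-- then bump its counter.
CREATE TRIGGER bump AFTER INSERT ON ci_referrers
BEGIN
    INSERT OR IGNORE INTO referral_counts VALUES (NEW.referrer_id, 0);
    UPDATE referral_counts SET referrals = referrals + 1
     WHERE referrer_id = NEW.referrer_id;
END;
""")
con.executemany("INSERT INTO ci_referrers VALUES (?, ?)",
                [(1, 10), (1, 11), (2, 12)])

# The report now reads a small, keyed table instead of aggregating.
counts = con.execute(
    "SELECT referrer_id, referrals FROM referral_counts "
    "ORDER BY referrer_id").fetchall()
print(counts)  # [(1, 2), (2, 1)]
```

The trade-off is a little extra work on every write in exchange for a well-indexed join at read time, which is usually the right direction for report queries like the one in the question.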
The explains show that on prod the query does not use indexes on the u, <derived2> and <derived3> tables, while on dev it does. Scanned row counts are significantly higher on prod as a result. The index names on the two derived tables suggest they were created by MySQL on the fly, taking advantage of the materialized-derived-table optimization strategy available from MySQL 5.6.5. Since no such optimization appears in the explain from the prod server, prod may be running an earlier MySQL version.
As @Satevg supplied in a comment, the dev and prod environments have the following MySQL versions:
Dev: debian 7, Mysql 5.6.28. Prod: debian 8, Mysql 5.5.44
This subtle difference in MySQL version may explain the speed difference, since the dev server can take advantage of the materialization optimization strategy, while prod - being v5.5 - cannot.
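If upgrading prod is not an option, the 5.6 behaviour can be imitated by hand: materialize each derived table into an indexed temporary table and join against that. A rough illustration in Python with sqlite3 on toy data (in MySQL 5.5 the equivalent would be a CREATE TEMPORARY TABLE plus an added index; names follow the question's referrals subquery but the data is invented):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ci_users(id INTEGER PRIMARY KEY, user_login TEXT);
CREATE TABLE ci_referrers(referrer_id INTEGER, user_id INTEGER);
INSERT INTO ci_users VALUES (1,'u1'),(2,'u2'),(3,'u3');
INSERT INTO ci_referrers VALUES (1,10),(1,11),(3,12);

-- Materialize the aggregate once, with an index the join can use,
-- instead of leaving it as an unindexed derived table.
CREATE TEMP TABLE r AS
  SELECT referrer_id, COUNT(user_id) AS referrals
  FROM ci_referrers GROUP BY referrer_id;
CREATE INDEX r_idx ON r(referrer_id);
""")

rows = con.execute("""
SELECT u.id, u.user_login, r.referrals
FROM ci_users u LEFT JOIN r ON r.referrer_id = u.id
ORDER BY u.id
""").fetchall()
print(rows)  # [(1, 'u1', 2), (2, 'u2', None), (3, 'u3', 1)]
```

Each outer row then probes the temp table through the index, which is exactly what the <auto_key0>/<auto_key1> entries in the dev explain were doing automatically.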

Same query, same system, different execution time

The queries below all execute instantly on our development server, whereas on production they can take up to 2 minutes 20 seconds.
The query execution time seems to be affected by how ambiguous the LIKE strings are. If they closely match a country that has few matches it will take less time, and if you use something like 'ge' for Germany it will take longer. But it doesn't always work out like that; at times it's quite erratic.
"Sending data" appears to be the culprit, but why, and what does that mean? Also, free memory on production looks quite low.
Production:
Intel Quad Xeon E3-1220 3.1GHz
4GB DDR3
2x 1TB SATA in RAID1
Network speed 100Mb
Ubuntu
Development
Intel Core i3-2100, 2C/4T, 3.10GHz
500 GB SATA - No RAID
4GB DDR3
This query is NOT the query in question, but it is related, so I'll post it.
SELECT
f.form_question_has_answer_id
FROM
form_question_has_answer f
INNER JOIN
project_company_has_user p ON f.form_question_has_answer_user_id = p.project_company_has_user_user_id
INNER JOIN
company c ON p.project_company_has_user_company_id = c.company_id
INNER JOIN
project p2 ON p.project_company_has_user_project_id = p2.project_id
INNER JOIN
user u ON p.project_company_has_user_user_id = u.user_id
INNER JOIN
form f2 ON p.project_company_has_user_project_id = f2.form_project_id
WHERE
(f2.form_template_name = 'custom' AND p.project_company_has_user_garbage_collection = 0 AND p.project_company_has_user_project_id = '29') AND (LCASE(c.company_country) LIKE '%ge%' OR LCASE(c.company_country) LIKE '%abcde%') AND f.form_question_has_answer_form_id = '174'
And the explain plan for the above query (running it on both dev and production produces the same plan):
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+-------------+
| 1 | SIMPLE | p2 | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index |
| 1 | SIMPLE | f | ref | form_question_has_answer_form_id,form_question_has_answer_user_id | form_question_has_answer_form_id | 4 | const | 796 | Using where |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | new_klarents.f.form_question_has_answer_user_id | 1 | Using index |
| 1 | SIMPLE | p | ref | project_company_has_user_unique_key,project_company_has_user_user_id,project_company_has_user_company_id,project_company_has_user_project_id | project_company_has_user_user_id | 4 | new_klarents.f.form_question_has_answer_user_id | 1 | Using where |
| 1 | SIMPLE | f2 | ref | form_project_id | form_project_id | 4 | const | 15 | Using where |
| 1 | SIMPLE | c | eq_ref | PRIMARY | PRIMARY | 4 | new_klarents.p.project_company_has_user_company_id | 1 | Using where |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+-------------+
This query takes around 2 minutes 20 seconds to execute.
The query that is ACTUALLY being run on the server is this one:
SELECT
COUNT(*) AS num_results
FROM (SELECT
f.form_question_has_answer_id
FROM
form_question_has_answer f
INNER JOIN
project_company_has_user p ON f.form_question_has_answer_user_id = p.project_company_has_user_user_id
INNER JOIN
company c ON p.project_company_has_user_company_id = c.company_id
INNER JOIN
project p2 ON p.project_company_has_user_project_id = p2.project_id
INNER JOIN
user u ON p.project_company_has_user_user_id = u.user_id
INNER JOIN
form f2 ON p.project_company_has_user_project_id = f2.form_project_id
WHERE
(f2.form_template_name = 'custom' AND p.project_company_has_user_garbage_collection = 0 AND p.project_company_has_user_project_id = '29') AND (LCASE(c.company_country) LIKE '%ge%' OR LCASE(c.company_country) LIKE '%abcde%') AND f.form_question_has_answer_form_id = '174'
GROUP BY
f.form_question_has_answer_id) dctrn_count_query;
With explain plans (again same on dev and production):
+----+-------------+-------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+------------------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 2 | DERIVED | p2 | const | PRIMARY | PRIMARY | 4 | | 1 | Using index |
| 2 | DERIVED | f | ref | form_question_has_answer_form_id,form_question_has_answer_user_id | form_question_has_answer_form_id | 4 | | 797 | Using where |
| 2 | DERIVED | p | ref | project_company_has_user_unique_key,project_company_has_user_user_id,project_company_has_user_company_id,project_company_has_user_project_id,project_company_has_user_garbage_collection | project_company_has_user_user_id | 4 | new_klarents.f.form_question_has_answer_user_id | 1 | Using where |
| 2 | DERIVED | f2 | ref | form_project_id | form_project_id | 4 | | 15 | Using where |
| 2 | DERIVED | c | eq_ref | PRIMARY | PRIMARY | 4 | new_klarents.p.project_company_has_user_company_id | 1 | Using where |
| 2 | DERIVED | u | eq_ref | PRIMARY | PRIMARY | 4 | new_klarents.p.project_company_has_user_user_id | 1 | Using where; Using index |
+----+-------------+-------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+------------------------------+
On the production server the information I have is as follows.
Upon execution:
+-------------+
| num_results |
+-------------+
| 3 |
+-------------+
1 row in set (2 min 14.28 sec)
Show profile:
+--------------------------------+------------+
| Status | Duration |
+--------------------------------+------------+
| starting | 0.000016 |
| checking query cache for query | 0.000057 |
| Opening tables | 0.004388 |
| System lock | 0.000003 |
| Table lock | 0.000036 |
| init | 0.000030 |
| optimizing | 0.000016 |
| statistics | 0.000111 |
| preparing | 0.000022 |
| executing | 0.000004 |
| Sorting result | 0.000002 |
| Sending data | 136.213836 |
| end | 0.000007 |
| query end | 0.000002 |
| freeing items | 0.004273 |
| storing result in query cache | 0.000010 |
| logging slow query | 0.000001 |
| logging slow query | 0.000002 |
| cleaning up | 0.000002 |
+--------------------------------+------------+
On development the results are as follows.
+-------------+
| num_results |
+-------------+
| 3 |
+-------------+
1 row in set (0.08 sec)
Again the profile for this query:
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000022 |
| checking query cache for query | 0.000148 |
| Opening tables | 0.000025 |
| System lock | 0.000008 |
| Table lock | 0.000101 |
| optimizing | 0.000035 |
| statistics | 0.001019 |
| preparing | 0.000047 |
| executing | 0.000008 |
| Sorting result | 0.000005 |
| Sending data | 0.086565 |
| init | 0.000015 |
| optimizing | 0.000006 |
| executing | 0.000020 |
| end | 0.000004 |
| query end | 0.000004 |
| freeing items | 0.000028 |
| storing result in query cache | 0.000005 |
| removing tmp table | 0.000008 |
| closing tables | 0.000008 |
| logging slow query | 0.000002 |
| cleaning up | 0.000005 |
+--------------------------------+----------+
If I remove the user and/or project inner joins, the query time is reduced to 30s.
Last bit of information I have:
The MySQL server and Apache are on the same box; there is only one box for production.
Production output from top, before and after running the query:
top - 15:43:25 up 78 days, 12:11, 4 users, load average: 1.42, 0.99, 0.78
Tasks: 162 total, 2 running, 160 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 50.4%sy, 0.0%ni, 49.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4037868k total, 3772580k used, 265288k free, 243704k buffers
Swap: 3905528k total, 265384k used, 3640144k free, 1207944k cached
top - 15:44:31 up 78 days, 12:13, 4 users, load average: 1.94, 1.23, 0.87
Tasks: 160 total, 2 running, 157 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.2%us, 50.6%sy, 0.0%ni, 49.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4037868k total, 3834300k used, 203568k free, 243736k buffers
Swap: 3905528k total, 265384k used, 3640144k free, 1207804k cached
But this isn't a good representation of production's normal status, so here is a capture from today, taken while the queries were not running.
top - 11:04:58 up 79 days, 7:33, 4 users, load average: 0.39, 0.58, 0.76
Tasks: 156 total, 1 running, 155 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.3%us, 2.8%sy, 0.0%ni, 93.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4037868k total, 3676136k used, 361732k free, 271480k buffers
Swap: 3905528k total, 268736k used, 3636792k free, 1063432k cached
Development: this output doesn't change during or after the query.
top - 15:47:07 up 110 days, 22:11, 7 users, load average: 0.17, 0.07, 0.06
Tasks: 210 total, 2 running, 208 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.2%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4111972k total, 1821100k used, 2290872k free, 238860k buffers
Swap: 4183036k total, 66472k used, 4116564k free, 921072k cached
If the slowness is intermittent, it's either server load or other resource contention (in your case, most likely memory). Your system needs enough RAM to hold all of the indexes in memory at once; otherwise it will have to swap pages out whenever a needed index isn't already loaded.
Your top results show that the amount of available RAM is low.
Use the innodb_buffer_pool_size setting to configure the size of the buffer pool for InnoDB, and key_buffer_size for MyISAM. Make sure each is set high enough to hold all of the corresponding indexes in RAM at the same time, and that the machine has enough RAM to accommodate this.
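As a rough check on whether the indexes fit, you can ask information_schema for per-table index sizes (a sketch only; the schema name new_klarents is taken from the EXPLAIN output, and index_length is an estimate, particularly for InnoDB):

```sql
-- Approximate index size in MB per table, to compare against
-- key_buffer_size (MyISAM) or innodb_buffer_pool_size (InnoDB).
SELECT table_name,
       engine,
       ROUND(index_length / 1024 / 1024, 1) AS index_mb
FROM information_schema.tables
WHERE table_schema = 'new_klarents'
ORDER BY index_length DESC;
```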
An EXPLAIN plan is usually the best place to start whenever you have a slow query. To get one, run:
DESCRIBE SELECT
COUNT(*) AS num_results
FROM (SELECT
f.form_question_has_answer_id
FROM
form_question_has_answer f
INNER JOIN
project_company_has_user p ON f.form_question_has_answer_user_id = p.project_company_has_user_user_id
INNER JOIN
company c ON p.project_company_has_user_company_id = c.company_id
INNER JOIN
project p2 ON p.project_company_has_user_project_id = p2.project_id
INNER JOIN
user u ON p.project_company_has_user_user_id = u.user_id
INNER JOIN
form f2 ON p.project_company_has_user_project_id = f2.form_project_id
WHERE
(f2.form_template_name = 'custom' AND p.project_company_has_user_garbage_collection = 0 AND p.project_company_has_user_project_id = '29') AND (LCASE(c.company_country) LIKE '%finland%' OR LCASE(c.company_country) LIKE '%finnland%') AND f.form_question_has_answer_form_id = '174'
GROUP BY
f.form_question_has_answer_id) dctrn_count_query;
This will show you a table listing the steps required to execute your query. If you see a large value in the 'rows' column and NULL in the 'key' column, that indicates that your query is having to scan a large number of rows to determine which ones to return.
In that case, adding an index on the column(s) used in the WHERE and JOIN clauses should dramatically speed up your query, at some cost to insert and delete speed (since the index will also need to be updated).
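For example, if the scan were happening on project_company_has_user, a composite index covering the filter columns from the WHERE clause might look like this (a sketch only; the index name and column choice are assumptions based on the query above, not a confirmed fix):

```sql
-- Covers the equality filters on project id and the
-- garbage-collection flag in one index lookup.
CREATE INDEX idx_phu_project_gc
    ON project_company_has_user (project_company_has_user_project_id,
                                 project_company_has_user_garbage_collection);
```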
Sending data | 136.213836
Obviously :D
This must be some kind of infrastructure issue, network or otherwise. Try to ping your server from your SQL client terminal and you will have your answer.
If this link is correct, the number behind 'Sending data' is not the time needed for sending data but for 'Sorting result'.
This in turn might hint at a memory or CPU issue on the production server. You might want to look into the relevant statistics on the machines.
Run the queries in development and production and compare the query plans. If they are different, try updating statistics on the tables involved in the query.
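On MySQL, updating statistics is done with ANALYZE TABLE; a sketch for the tables involved in this query (note that ANALYZE TABLE takes a read lock on MyISAM tables while it runs):

```sql
ANALYZE TABLE form_question_has_answer,
              project_company_has_user,
              company,
              project,
              user,
              form;
```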
I'm not intimately familiar with the optimizer in MySQL.
This problem is occurring at the end of the query. One possibility is a lock on the production system; for instance, perhaps there is some sort of metadata lock that prevents a new table from being created.
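A quick way to check for that while the query is running slowly (a sketch; both statements are available on MySQL 5.0+, and the exact states shown vary by version):

```sql
-- Show every connection and its state; look for "Locked",
-- "Waiting for table", or long-running statements.
SHOW FULL PROCESSLIST;

-- List currently open tables that are in use by some thread.
SHOW OPEN TABLES WHERE In_use > 0;
```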
Also, the environment where a query runs does affect the optimizer. At the very least, multiple threads/processors will have an impact on the query plan. MySQL could generate different query optimizations based on available resources. I know SQL Server will produce different query plans based on available memory, using hash tables when there is enough memory but nested-loop joins when less is available. Are the memory allocations exactly the same on both systems?
The only thing that occurs to me is a configuration difference between the two MySQL servers. Is there any significant difference in the memory setup between the two servers? Is it a virtual server?
Also, is the load on the production server very high?