Same query, same system, different execution time - MySQL

The queries below all execute instantly on our development server, whereas on production they can take up to 2 minutes 20 seconds.
The query execution time seems to be affected by how ambiguous the LIKE strings are. If they closely match a country that has few matches, the query takes less time; if you use something like 'ge' for Germany, it takes longer to execute. But it doesn't always work out like that; at times it's quite erratic.
'Sending data' appears to be the culprit, but why, and what does that mean? Also, free memory on production looks to be quite low.
Production:
Intel Quad Xeon E3-1220 3.1GHz
4GB DDR3
2x 1TB SATA in RAID1
Network speed 100Mb
Ubuntu
Development:
Intel Core i3-2100, 2C/4T, 3.10GHz
500 GB SATA - No RAID
4GB DDR3
This query is NOT the query in question, but it is related, so I'll post it.
SELECT
f.form_question_has_answer_id
FROM
form_question_has_answer f
INNER JOIN
project_company_has_user p ON f.form_question_has_answer_user_id = p.project_company_has_user_user_id
INNER JOIN
company c ON p.project_company_has_user_company_id = c.company_id
INNER JOIN
project p2 ON p.project_company_has_user_project_id = p2.project_id
INNER JOIN
user u ON p.project_company_has_user_user_id = u.user_id
INNER JOIN
form f2 ON p.project_company_has_user_project_id = f2.form_project_id
WHERE
(f2.form_template_name = 'custom' AND p.project_company_has_user_garbage_collection = 0 AND p.project_company_has_user_project_id = '29') AND (LCASE(c.company_country) LIKE '%ge%' OR LCASE(c.company_country) LIKE '%abcde%') AND f.form_question_has_answer_form_id = '174'
The EXPLAIN plan for the above query follows; both dev and production produce the same plan.
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+-------------+
| 1 | SIMPLE | p2 | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index |
| 1 | SIMPLE | f | ref | form_question_has_answer_form_id,form_question_has_answer_user_id | form_question_has_answer_form_id | 4 | const | 796 | Using where |
| 1 | SIMPLE | u | eq_ref | PRIMARY | PRIMARY | 4 | new_klarents.f.form_question_has_answer_user_id | 1 | Using index |
| 1 | SIMPLE | p | ref | project_company_has_user_unique_key,project_company_has_user_user_id,project_company_has_user_company_id,project_company_has_user_project_id | project_company_has_user_user_id | 4 | new_klarents.f.form_question_has_answer_user_id | 1 | Using where |
| 1 | SIMPLE | f2 | ref | form_project_id | form_project_id | 4 | const | 15 | Using where |
| 1 | SIMPLE | c | eq_ref | PRIMARY | PRIMARY | 4 | new_klarents.p.project_company_has_user_company_id | 1 | Using where |
+----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+-------------+
This query takes ~2 minutes 20 seconds to execute.
The query that is ACTUALLY being run on the server is this one:
SELECT
COUNT(*) AS num_results
FROM (SELECT
f.form_question_has_answer_id
FROM
form_question_has_answer f
INNER JOIN
project_company_has_user p ON f.form_question_has_answer_user_id = p.project_company_has_user_user_id
INNER JOIN
company c ON p.project_company_has_user_company_id = c.company_id
INNER JOIN
project p2 ON p.project_company_has_user_project_id = p2.project_id
INNER JOIN
user u ON p.project_company_has_user_user_id = u.user_id
INNER JOIN
form f2 ON p.project_company_has_user_project_id = f2.form_project_id
WHERE
(f2.form_template_name = 'custom' AND p.project_company_has_user_garbage_collection = 0 AND p.project_company_has_user_project_id = '29') AND (LCASE(c.company_country) LIKE '%ge%' OR LCASE(c.company_country) LIKE '%abcde%') AND f.form_question_has_answer_form_id = '174'
GROUP BY
f.form_question_has_answer_id) dctrn_count_query;
With explain plans (again same on dev and production):
+----+-------------+-------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+------------------------------+
| 1 | PRIMARY | NULL | NULL | NULL | NULL | NULL | NULL | NULL | Select tables optimized away |
| 2 | DERIVED | p2 | const | PRIMARY | PRIMARY | 4 | | 1 | Using index |
| 2 | DERIVED | f | ref | form_question_has_answer_form_id,form_question_has_answer_user_id | form_question_has_answer_form_id | 4 | | 797 | Using where |
| 2 | DERIVED | p | ref | project_company_has_user_unique_key,project_company_has_user_user_id,project_company_has_user_company_id,project_company_has_user_project_id,project_company_has_user_garbage_collection | project_company_has_user_user_id | 4 | new_klarents.f.form_question_has_answer_user_id | 1 | Using where |
| 2 | DERIVED | f2 | ref | form_project_id | form_project_id | 4 | | 15 | Using where |
| 2 | DERIVED | c | eq_ref | PRIMARY | PRIMARY | 4 | new_klarents.p.project_company_has_user_company_id | 1 | Using where |
| 2 | DERIVED | u | eq_ref | PRIMARY | PRIMARY | 4 | new_klarents.p.project_company_has_user_user_id | 1 | Using where; Using index |
+----+-------------+-------+--------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+----------------------------------------------------+------+------------------------------+
On the production server the information I have is as follows.
Upon execution:
+-------------+
| num_results |
+-------------+
| 3 |
+-------------+
1 row in set (2 min 14.28 sec)
Show profile:
+--------------------------------+------------+
| Status | Duration |
+--------------------------------+------------+
| starting | 0.000016 |
| checking query cache for query | 0.000057 |
| Opening tables | 0.004388 |
| System lock | 0.000003 |
| Table lock | 0.000036 |
| init | 0.000030 |
| optimizing | 0.000016 |
| statistics | 0.000111 |
| preparing | 0.000022 |
| executing | 0.000004 |
| Sorting result | 0.000002 |
| Sending data | 136.213836 |
| end | 0.000007 |
| query end | 0.000002 |
| freeing items | 0.004273 |
| storing result in query cache | 0.000010 |
| logging slow query | 0.000001 |
| logging slow query | 0.000002 |
| cleaning up | 0.000002 |
+--------------------------------+------------+
On development the results are as follows.
+-------------+
| num_results |
+-------------+
| 3 |
+-------------+
1 row in set (0.08 sec)
Again the profile for this query:
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000022 |
| checking query cache for query | 0.000148 |
| Opening tables | 0.000025 |
| System lock | 0.000008 |
| Table lock | 0.000101 |
| optimizing | 0.000035 |
| statistics | 0.001019 |
| preparing | 0.000047 |
| executing | 0.000008 |
| Sorting result | 0.000005 |
| Sending data | 0.086565 |
| init | 0.000015 |
| optimizing | 0.000006 |
| executing | 0.000020 |
| end | 0.000004 |
| query end | 0.000004 |
| freeing items | 0.000028 |
| storing result in query cache | 0.000005 |
| removing tmp table | 0.000008 |
| closing tables | 0.000008 |
| logging slow query | 0.000002 |
| cleaning up | 0.000005 |
+--------------------------------+----------+
If I remove the user and/or project INNER JOINs, the query time drops to 30s.
Last bit of information I have:
The MySQL server and Apache are on the same box; there is only one box for production.
Production output from top: before & after.
top - 15:43:25 up 78 days, 12:11, 4 users, load average: 1.42, 0.99, 0.78
Tasks: 162 total, 2 running, 160 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 50.4%sy, 0.0%ni, 49.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4037868k total, 3772580k used, 265288k free, 243704k buffers
Swap: 3905528k total, 265384k used, 3640144k free, 1207944k cached
top - 15:44:31 up 78 days, 12:13, 4 users, load average: 1.94, 1.23, 0.87
Tasks: 160 total, 2 running, 157 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.2%us, 50.6%sy, 0.0%ni, 49.3%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4037868k total, 3834300k used, 203568k free, 243736k buffers
Swap: 3905528k total, 265384k used, 3640144k free, 1207804k cached
But this isn't a good representation of production's normal status, so here is a grab from today, taken outside of executing the queries.
top - 11:04:58 up 79 days, 7:33, 4 users, load average: 0.39, 0.58, 0.76
Tasks: 156 total, 1 running, 155 sleeping, 0 stopped, 0 zombie
Cpu(s): 3.3%us, 2.8%sy, 0.0%ni, 93.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4037868k total, 3676136k used, 361732k free, 271480k buffers
Swap: 3905528k total, 268736k used, 3636792k free, 1063432k cached
Development: This one doesn't change during or after.
top - 15:47:07 up 110 days, 22:11, 7 users, load average: 0.17, 0.07, 0.06
Tasks: 210 total, 2 running, 208 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.1%us, 0.2%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 4111972k total, 1821100k used, 2290872k free, 238860k buffers
Swap: 4183036k total, 66472k used, 4116564k free, 921072k cached

If the slowness is intermittent, it's either server load or other resource contention (in your case, most likely memory). Your system needs to have enough RAM to store all of the indexes in memory at once, otherwise, it will have to swap out memory if the needed index isn't already loaded in RAM.
Your TOP results show that there is a low amount of RAM available.
Use the innodb_buffer_pool_size setting to configure the size of the buffer pool for InnoDB, and key_buffer_size for MyISAM. Ensure you set it high enough to hold all of the indexes in RAM at the same time, and that your system has enough RAM to accomplish this.
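As a quick sanity check (a sketch; the schema filter just skips MySQL's own databases), you can compare the configured buffer sizes against the total index size reported by information_schema:

SHOW VARIABLES LIKE 'innodb_buffer_pool_size';
SHOW VARIABLES LIKE 'key_buffer_size';

-- Approximate total index size per storage engine, in MB:
SELECT engine, ROUND(SUM(index_length) / 1024 / 1024) AS index_mb
FROM information_schema.tables
WHERE table_schema NOT IN ('mysql', 'information_schema', 'performance_schema')
GROUP BY engine;

Note that innodb_buffer_pool_size is set in my.cnf and, on older MySQL versions, changing it requires a server restart.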

An explain-plan is usually the best place to start whenever you have a slow query. To get one, run
DESCRIBE SELECT
COUNT(*) AS num_results
FROM (SELECT
f.form_question_has_answer_id
FROM
form_question_has_answer f
INNER JOIN
project_company_has_user p ON f.form_question_has_answer_user_id = p.project_company_has_user_user_id
INNER JOIN
company c ON p.project_company_has_user_company_id = c.company_id
INNER JOIN
project p2 ON p.project_company_has_user_project_id = p2.project_id
INNER JOIN
user u ON p.project_company_has_user_user_id = u.user_id
INNER JOIN
form f2 ON p.project_company_has_user_project_id = f2.form_project_id
WHERE
(f2.form_template_name = 'custom' AND p.project_company_has_user_garbage_collection = 0 AND p.project_company_has_user_project_id = '29') AND (LCASE(c.company_country) LIKE '%finland%' OR LCASE(c.company_country) LIKE '%finnland%') AND f.form_question_has_answer_form_id = '174'
GROUP BY
f.form_question_has_answer_id) dctrn_count_query;
This will show you a table listing the steps required to execute your query. If you see a large value in the 'rows' column and NULL in the 'key' column, that indicates that your query is having to scan a large number of rows to determine which ones to return.
In that case, adding an index on the column(s) you are filtering on should dramatically speed up your query, at some cost to insert and delete speed (since the index will also need to be updated).

Sending data | 136.213836
Obviously :D
This must be some kind of infrastructure issue... network or something. Try to ping your server from your SQL terminal and you will have your answer.

If this link is correct, the number behind 'Sending data' is not the time needed for sending data but for 'Sorting result'.
This in turn might hint at a memory or CPU issue on the production server. You might want to look into the relevant statistics on the machines.

Run the queries in development and production and compare the query plans. If they are different, try updating statistics on the tables involved in the query.
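In MySQL, updating statistics means ANALYZE TABLE; for the query in question it would look something like this (table list taken from the query above):

ANALYZE TABLE form_question_has_answer, project_company_has_user, company, project, user, form;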

I'm not intimately familiar with the optimizer in MySQL.
This problem is occurring at the end of the query. One possibility is a lock that is occurring on the production system. For instance, perhaps there is some sort of lock on metadata that prevents a new table from being created.
Also, the environment where a query runs does affect the optimizer. At the very least, multiple threads/processors will have an impact on the query plan. MySQL could generate different query optimizations based on available resources. I know SQL Server will produce different query plans based on available memory -- using hash tables when there is enough memory but nested loop joins when less is available. Are the memory allocations exactly the same on both systems?

The only thing that occurs to me is a configuration difference between the two MySQL servers. Is there any significant difference in the memory setup between the two servers? Is it a virtual server?
Also, is the load on the production server very high?

Related

How to execute a very huge MySQL query? (it took several days then stopped without success)

I need to execute a query in DATABASE2 (60G) that fetches data from DATABASE1 (80G). Some tables have 400M rows in both databases.
INSERT IGNORE INTO product_to_category (
SELECT DISTINCT p.product_id, pds.nodeid
FROM product p
JOIN DATABASE2.article_links al ON al.supplierid=p.manufacturer_id
AND al.datasupplierarticlenumber=p.mpn
JOIN DATABASE2.passanger_car_pds pds ON al.productid=pds.productid
)
The execution took more than 6 days!!! and then stopped without inserting any row into the table.
[root@XXXX ~]# mysqladmin pr
+--------+-------------+-------------------+-------------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
| Id | User | Host | db | Command | Time | State | Info |
+--------+-------------+-------------------+-------------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
| 939 | root | localhost | mws_autocms | Query | 408622 | Sending data | INSERT IGNORE INTO product_to_category (
SELECT p.product_id, pds.nodeid
FROM product p
JOIN DATABASE2 |
| 107374 | root | localhost | | Query | 0 | starting | show processlist |
+--------+-------------+-------------------+-------------+---------+--------+--------------+------------------------------------------------------------------------------------------------------+
If I run the query with LIMIT 100 at the end, it executes and inserts data into the table.
I tuned MySQL to:
innodb_flush_method = O_DIRECT
innodb_log_files_in_group = 2
innodb_log_file_size = 1G
innodb_log_buffer_size = 512M
query_cache_size = 0
query_cache_type = 0
innodb_buffer_pool_size = 12G
innodb_buffer_pool_instances = 8
innodb_read_io_threads = 16
innodb_write_io_threads = 16
innodb_flush_log_at_trx_commit = 2
innodb_large_prefix = 1
innodb_file_per_table = 1
innodb_file_format = Barracuda
max_allowed_packet = 1024M
lower_case_table_names = 1
Without any success.
Any help/advice to run this query please. I've been struggling for weeks.
Here is the output of the EXPLAIN command:
+----+-------------+---------------------+------------+------+--------------------------------------------------------+---------------------------+---------+-------------------------------------------------+---------+----------+--------------------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+---------------------+------------+------+--------------------------------------------------------+---------------------------+---------+-------------------------------------------------+---------+----------+--------------------------+
| 1 | INSERT | product_to_category | NULL | ALL | NULL | NULL | NULL | NULL | NULL | NULL | NULL |
| 1 | SIMPLE | p | NULL | ALL | manufacturer_id | NULL | NULL | NULL | 5357582 | 100.00 | Using temporary |
| 1 | SIMPLE | al | NULL | ref | PRIMARY,productid,supplierid,datasupplierarticlenumber | datasupplierarticlenumber | 100 | mws_autocms.p.mpn,mws_autocms.p.manufacturer_id | 56 | 100.00 | Using where; Using index |
| 1 | SIMPLE | pds | NULL | ref | productid | productid | 4 | mws_tecdoc_2018_4_fr.al.productid | 1322 | 100.00 | Using where; Using index |
+----+-------------+---------------------+------------+------+--------------------------------------------------------+---------------------------+---------+-------------------------------------------------+---------+----------+--------------------------+
This is way too broad to answer here. My response here is really a comment - but it's a bit long.
"I've been struggling for weeks" - and this is all you have to show for it?
I tuned MySQL to:
Why? How? What hardware is this running on?
Given that there are several options here which require a restart, does that mean you have exclusive use of the DB instance? If so, why are you using O_DIRECT?
Why the join when you are only using the data from one table?
Some tables have 400M rows in both databases.
You need to have a better understanding of cardinality or how to communicate this.
then stopped without inserting any row into the table
Why did it stop without inserting? What did you do to investigate?
Whenever I got stuck with things like this I start to break down the requirement and introduce intermediary steps into my plan. From reading your question, you need to:
1) Join data from multiple sources together and then
2) Insert that resultset into another database.
You can therefore break it down into multiple steps giving the database less to do before it times out.
Create a table of ONLY the data you want to insert (one query, something like the following)
CREATE TABLE dataToImport AS
SELECT DISTINCT p.product_id, pds.nodeid
FROM product p
JOIN DATABASE2.article_links al ON al.supplierid=p.manufacturer_id
AND al.datasupplierarticlenumber=p.mpn
JOIN DATABASE2.passanger_car_pds pds ON al.productid=pds.productid
Then import that data:
INSERT IGNORE INTO product_to_category SELECT product_id, nodeid FROM dataToImport
It's a bit of a crude operation, but it means the database is doing less work in a single hit, so you might find it solves your problem.
If it still doesn't work, you need to understand how big the resultset of that SELECT query is, so run your SELECT on its own first and look at the output.
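If the resultset does turn out to be huge, one further option (my suggestion, not part of the answer above) is to batch the final insert by key range so each statement stays small. A sketch, assuming product_id is indexed on dataToImport; the range bounds are placeholders:

-- How many rows are we about to import?
SELECT COUNT(*) FROM dataToImport;

-- Copy one key range at a time; repeat with the next range until done.
INSERT IGNORE INTO product_to_category
SELECT product_id, nodeid
FROM dataToImport
WHERE product_id BETWEEN 1 AND 1000000;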

Sending data taking too long but indexes already created

I am having problems with a query that is taking 20 seconds to return results :(
In tables cases and cases_cstm, I have 960,000 rows.
This is my query:
SELECT cases.id ,cases_cstm.assunto_c, cases.name , cases.case_number ,
cases.priority , accounts.name account_name ,
accounts.assigned_user_id account_name_owner ,
'Accounts' account_name_mod, cases.account_id ,
LTRIM(RTRIM(CONCAT(IFNULL(jt1.first_name,''),' ',IFNULL(jt1.last_name,'')))) assigned_user_name ,
jt1.created_by assigned_user_name_owner ,
'Users' assigned_user_name_mod, cases.status , cases.date_entered ,
cases.assigned_user_id
FROM cases
LEFT JOIN cases_cstm ON cases.id = cases_cstm.id_c
LEFT JOIN accounts accounts ON
cases.account_id=accounts.id AND accounts.deleted=0 AND
accounts.deleted=0
LEFT JOIN users jt1 ON
cases.assigned_user_id=jt1.id AND
jt1.deleted=0 AND jt1.deleted=0
where
(((LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%' OR
LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%'))) AND
cases.deleted=0 ORDER BY cases.date_entered DESC LIMIT 0,11;
Here are the indexes on the table:
+-------+------------+--------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name           | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+--------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
| cases | 0          | PRIMARY            | 1            | id               | A         | 911472      | NULL     | NULL   |      | BTREE      |         |
| cases | 0          | case_number        | 1            | case_number      | A         | 911472      | NULL     | NULL   |      | BTREE      |         |
| cases | 1          | idx_case_name      | 1            | name             | A         | 911472      | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_account_id     | 1            | account_id       | A         | 455736      | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_cases_stat_del | 1            | assigned_user_id | A         | 106         | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_cases_stat_del | 2            | status           | A         | 197         | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_cases_stat_del | 3            | deleted          | A         | 214         | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_priority       | 1            | priority         | A         | 455736      | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_date_entered   | 1            | date_entered     | A         | 455736      | NULL     | NULL   | YES  | BTREE      |         |
+-------+------------+--------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
The EXPLAIN output for the query was posted as an image (not reproduced here).
This is the profile of the query execution:
+--------------------+-----------+
| Status | Duration |
+--------------------+-----------+
| starting | 0.000122 |
| Opening tables | 0.000180 |
| System lock | 0.000005 |
| Table lock | 0.000005 |
| init | 0.000051 |
| optimizing | 0.000017 |
| statistics | 0.000071 |
| preparing | 0.000021 |
| executing | 0.000003 |
| Sorting result | 0.000004 |
| Sending data | 21.595455 |
| end | 0.000012 |
| query end | 0.000002 |
| freeing items | 0.000419 |
| logging slow query | 0.000005 |
| logging slow query | 0.000002 |
| cleaning up        | 0.000004  |
+--------------------+-----------+
Can someone help me understand why the query is taking so long to execute?
Thanks!!
First, change your LEFT JOIN to accounts into an INNER JOIN. I don't know if that will make a drastic change, but it makes a lot more sense if you understand the difference.
What you are saying with LEFT JOIN is "I want all cases, whether or not they have an associated account". An INNER JOIN here means "Give me all cases and return all accounts for them".
The end-result of your query is the same, because you are later on filtering things out with your WHERE clause, but I have a feeling that this might be why idx_account_id is being ignored.
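A sketch of just that change (my abbreviation of the original query: the select list is shortened and the accounts.name filter is omitted for brevity):

SELECT cases.id, cases_cstm.assunto_c, accounts.name account_name
FROM cases
LEFT JOIN cases_cstm ON cases.id = cases_cstm.id_c
INNER JOIN accounts ON cases.account_id = accounts.id AND accounts.deleted = 0
LEFT JOIN users jt1 ON cases.assigned_user_id = jt1.id AND jt1.deleted = 0
WHERE cases.deleted = 0
ORDER BY cases.date_entered DESC LIMIT 0,11;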
A second, probably bigger problem is your WHERE clause:
(((LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%' OR
LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%'))) AND
There's a ton of functions here, and MySQL can't optimize this using an index. Every record will be checked for this condition, and all functions you're using will be called for every record. This is most likely the biggest problem.
First, this can be simplified a bit. I think both sides of this OR statement are the same, so let's first turn it into one:
LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%'
Since you are adding a wildcard at the end of the LIKE pattern, why bother with the RTRIM?
LTRIM(CONCAT(IFNULL(accounts.name,''))) LIKE 'rodrigo fernando%'
You don't need to CONCAT anything, if there's only one thing!
LTRIM(IFNULL(accounts.name,'')) LIKE 'rodrigo fernando%'
LTRIM works just fine on NULL values, and a NULL name wouldn't have matched the pattern anyway:
LTRIM(accounts.name) LIKE 'rodrigo fernando%'
Alright, that saved us a bunch of functions. However, the last LTRIM is still a major problem, as it still completely blocks MySQL from using indexes. The solution is fairly simple though:
Update your accounts table, once:
UPDATE accounts SET name = LTRIM(name);
Make sure that whenever you insert new accounts, you trim before inserting. So you're really doing this now during INSERT time, not SELECT time.
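If you'd rather enforce the trim at write time than rely on application code, one option is a BEFORE INSERT trigger (the trigger name here is made up):

-- Trim the name on the way in, so SELECTs never have to:
CREATE TRIGGER accounts_trim_name
BEFORE INSERT ON accounts
FOR EACH ROW SET NEW.name = LTRIM(NEW.name);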
Change your previous WHERE clause to:
accounts.name LIKE 'rodrigo fernando%'
Boom, you can now use an index on accounts.name and it will be fast as fuck.
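If accounts.name isn't indexed yet, the index is a one-liner (the index name is a placeholder); a plain B-tree index handles LIKE patterns with only a trailing wildcard:

CREATE INDEX idx_accounts_name ON accounts (name);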
I was able to solve this problem following the tips of Evert!
To be clear, this is a query that is being mounted dynamically by a system, I will still need to optimize the code to remove the functions that do not make sense in this case.
What helped me was to replace the LEFT JOINs with INNER JOINs between cases and cases_cstm and between cases and accounts... with this change alone, the query started executing in 0.9 seconds!
Thanks for everyone's help!
You need to change your query as well as your indexing. You have indexes only on the cases table, but you are also using the accounts and users tables, so you should consider those tables when indexing. Make the changes in your query as follows.
SELECT cases.id, cases_cstm.assunto_c, cases.name, cases.case_number, cases.priority, accounts.name account_name, accounts.assigned_user_id account_name_owner,
'Accounts' account_name_mod, cases.account_id, LTRIM(RTRIM(CONCAT(IFNULL(jt1.first_name,''),' ',IFNULL(jt1.last_name,'')))) assigned_user_name,
jt1.created_by assigned_user_name_owner,'Users' assigned_user_name_mod, cases.status, cases.date_entered, cases.assigned_user_id
FROM cases
LEFT JOIN cases_cstm ON cases.id = cases_cstm.id_c
LEFT JOIN accounts accounts ON cases.account_id=accounts.id AND accounts.deleted=0 AND (TRIM(accounts.name) LIKE 'rodrigo fernando%' OR TRIM(accounts.name) LIKE 'rodrigo fernando%')
LEFT JOIN users jt1 ON cases.assigned_user_id=jt1.id AND jt1.deleted=0
WHERE cases.deleted=0 ORDER BY cases.date_entered DESC LIMIT 0,11;
Then, first delete the indexes that are not being used, and create indexes on the tables as follows:
cases - deleted, date_entered (one index with multiple columns)
accounts - deleted, name (one index with multiple columns)
users - deleted
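A minimal sketch of those indexes (index names are placeholders):

CREATE INDEX idx_cases_deleted_date ON cases (deleted, date_entered);
CREATE INDEX idx_accounts_deleted_name ON accounts (deleted, name);
CREATE INDEX idx_users_deleted ON users (deleted);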
Create these indexes and make sure the sequence of columns in each index matches how they are used in the query, because MySQL can use any leftmost prefix of an index. If you need more details, go through these links:
Multiple-Column Indexes
MySQL ORDER BY / LIMIT performance

Finding closest value. How to tell MySQL that the data is already ordered?

Let's say I have a table like the following:
+------------+------------+------+-----+---------+
| Field      | Type       | Null | Key | Default |
+------------+------------+------+-----+---------+
| datetime   | double     | NO   | PRI | NULL    |
| some_value | float      | NO   |     | NULL    |
+------------+------------+------+-----+---------+
The date has to be stored as a double and is recorded as Unix time with fractional seconds (there is no possibility to install MySQL 5.6 and use fractional DATETIME). In addition, the values of the datetime field are not only the primary key, they are also always increasing. I would like to find the row closest to a certain value. Usually you can use something like:
select * from table order by abs(datetime - $myvalue) limit 1
However, I'm afraid that this implementation will be slow for hundreds of thousands of values, because it is going to scan the whole table. And since I have an ordered list, I know I could do a binary search to speed up the process, but I have no idea how to tell MySQL to perform that kind of search.
In order to test the performance I do the following lines:
SET profiling = 1;
SELECT * FROM table order by abs(datetime - $myvalue) limit 1;
SHOW PROFILE FOR QUERY 1;
With the following results:
+--------------------------------+----------+
| Status | Duration |
+--------------------------------+----------+
| starting | 0.000122 |
| Waiting for query cache lock | 0.000051 |
| checking query cache for query | 0.000191 |
| checking permissions | 0.000038 |
| Opening tables | 0.000094 |
| System lock | 0.000047 |
| Waiting for query cache lock | 0.000085 |
| init | 0.000103 |
| optimizing | 0.000031 |
| statistics | 0.000057 |
| preparing | 0.000049 |
| executing | 0.000023 |
| Sorting result | 2.806665 |
| Sending data | 0.000359 |
| end | 0.000049 |
| query end | 0.000033 |
| closing tables | 0.000050 |
| freeing items | 0.000089 |
| logging slow query | 0.000067 |
| cleaning up | 0.000032 |
+--------------------------------+----------+
As I understand it, sorting the result takes 2.8 seconds, even though my data is already sorted. As additional information, I have around 240,000 rows.
It won't scan the entire database. A primary key is indexed by a B-tree. Forcing it into a binary search would be slower, if you could do it, which you can't.
Try making it a field:
select abs(datetime - $myvalue) as date_diff, table.*
from table
order by date_diff
limit 1
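For completeness, a common index-friendly pattern for closest-value lookups (my sketch, not part of the answer above) fetches the nearest neighbor on each side of the target via the primary key and keeps the closer one:

-- Each inner SELECT walks the PRIMARY KEY on datetime, so no full sort is needed;
-- $myvalue is the placeholder from the question.
(SELECT * FROM `table` WHERE datetime >= $myvalue ORDER BY datetime ASC LIMIT 1)
UNION ALL
(SELECT * FROM `table` WHERE datetime < $myvalue ORDER BY datetime DESC LIMIT 1)
ORDER BY ABS(datetime - $myvalue)
LIMIT 1;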
Indexes are supported in RDBMSs. Define an index on datetime (or whatever field you are interested in) and the DB will not do a complete table scan.

Google Cloud SQL slow for queries Using filesort

mysql> EXPLAIN SELECT * FROM `condominio_boleto`
INNER JOIN `contrato_contrato` ON (`condominio_boleto`.`contrato_id` = `contrato_contrato`.`id`)
INNER JOIN `cadastro_imovel` ON (`contrato_contrato`.`imovel_id` = `cadastro_imovel`.`id`)
INNER JOIN `cadastro_pessoa` ON (`contrato_contrato`.`pessoa_id` = `cadastro_pessoa`.`id`)
ORDER BY `condominio_boleto`.`id` DESC LIMIT 1;
+----+-------------+-------------------+--------+---------------------------------------------------------------+----------------------------+---------+------------------------------------+------+---------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------------+--------+---------------------------------------------------------------+----------------------------+---------+------------------------------------+------+---------------------------------+
| 1 | SIMPLE | cadastro_imovel | ALL | PRIMARY | NULL | NULL | NULL | 128 | Using temporary; Using filesort |
| 1 | SIMPLE | contrato_contrato | ref | PRIMARY,contrato_contrato_33999a20,contrato_contrato_8b5ebd9d | contrato_contrato_33999a20 | 4 | mydb.cadastro_imovel.id | 1 | |
| 1 | SIMPLE | cadastro_pessoa | eq_ref | PRIMARY | PRIMARY | 4 | mydb.contrato_contrato.pessoa_id | 1 | |
| 1 | SIMPLE | condominio_boleto | ref | condominio_boleto_91c8cd68 | condominio_boleto_91c8cd68 | 4 | mydb.contrato_contrato.id | 9 | |
+----+-------------+-------------------+--------+---------------------------------------------------------------+----------------------------+---------+------------------------------------+------+---------------------------------+
4 rows in set (0.00 sec)
This query takes 3-4 seconds to run on Google Cloud SQL (D0 instance). If I remove the ORDER BY clause, it no longer shows 'Using temporary; Using filesort' in the Extra column and speeds up to <100ms. But because it's auto-generated by the Django admin, I can't remove that ORDER BY clause.
All these tables are really small. condominio_boleto has 5k records all other tables have less than 500 records.
Can I speed this up with indexes? Is this a known problem on Google Cloud SQL?
I had a similar experience on Google Cloud SQL Tier D0 (128MB RAM). One of my websites was running very slowly and took a long time to return pages. After running Jet Profiler I found that my database queries were running slowly (2-3s to execute and 7 threads on average). The problem queries were those with inner joins and ORDER BYs. So I upgraded to Tier D1 (512MB RAM) and, as expected, no more slow queries. My guess is that D0 isn't made to handle high load or complex queries; it's mostly suited for low usage and testing.

Optimizing / improving a slow mysql query - indexing? reorganizing?

First off, I've looked at several other questions about optimizing SQL queries, but I'm still unclear about what is causing my problem in my situation. I read a few articles on the topic as well and have tried implementing a couple of possible solutions, as I'll describe below, but nothing has yet worked or even made an appreciable dent in the problem.
The application is a nutrition tracking system - users enter the foods they eat and based on an imported USDA database the application breaks down the foods to the individual nutrients and gives the user a breakdown of the nutrient quantities on a (for now) daily basis.
Here's a PDF of the abbreviated database schema, and here it is as a (perhaps poor quality) JPG. I made this in OpenOffice - if there are suggestions for better ways to visualize a database, I'm open to suggestions on that front as well! The blue tables are directly from the USDA, and the green and black tables are ones I've made. I've omitted a lot of data in order to not clutter things up unnecessarily.
Here's the query I'm trying to run that takes a very long time:
SELECT listing.date_time,listing.nutrdesc,data.total_nutr_mass,listing.units
FROM
(SELECT nutrdesc, nutr_no, date_time, units
FROM meals, nutr_def
WHERE meals.users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
AND (nutr_no <100000
OR nutr_no IN
(SELECT nutr_def_nutr_no
FROM nutr_rights
WHERE nutr_rights.users_userid = '2'))
) as listing
LEFT JOIN
(SELECT nutrdesc, date_time, nut_data.nutr_no, sum(ingred_gram_mass*entry_qty_num*nutr_val/100) AS total_nutr_mass
FROM nut_data, recipe_ingredients, food_entries, meals, nutr_def
WHERE nut_data.nutr_no = nutr_def.nutr_no
AND ndb_no = ingred_ndb_no
AND foods_food_id = entry_ident
AND meals_meal_id = meal_id
AND users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
GROUP BY date_time,nut_data.nutr_no ) as data
ON data.date_time = listing.date_time
AND listing.nutr_no = data.nutr_no
ORDER BY listing.date_time,listing.nutrdesc,listing.units
So I know that's rather complex - The first select gets a listing of all the nutrients that the user consumed within the given date range, and the second fills in all the quantities.
When I run them separately, the first query is really fast, but the second is slow and gets very slow when the date ranges get large. The join makes the whole thing ridiculously slow. I know that the 'main' problem is the join between these two derived tables, and I can get rid of it and basically do the join by hand in PHP much faster, but I'm not convinced that's the whole story.
For example: for 1 month of data, the query takes about 8 seconds, which is slow, but not completely terrible. Separately, each query takes ~.01 and ~2 seconds respectively. 2 seconds still seems high to me.
If I try to retrieve a year's worth of data, it takes several (>10) minutes to run the whole query, which is problematic - the client-server connection sometimes times out, and in any case I don't want to sit there with a spinning 'please wait' icon. Mainly, I feel like there's a problem because it takes more than 12x as long to retrieve 12x more information, when it should take less than 12x as long if I were doing things right.
Here's the 'explain' for each of the slow queries: (the whole thing, and just the second half).
Whole thing:
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5053 | Using temporary; Using filesort |
| 1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 4341 | |
| 4 | DERIVED | meals | range | PRIMARY,day_ind | day_ind | 9 | NULL | 30 | Using where; Using temporary; Using filesort |
| 4 | DERIVED | food_entries | ref | meals_meal_id | meals_meal_id | 5 | nutrition.meals.meal_id | 15 | Using where |
| 4 | DERIVED | recipe_ingredients | ref | foods_food_id,ingred_ndb_no | foods_food_id | 4 | nutrition.food_entries.entry_ident | 2 | |
| 4 | DERIVED | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | |
| 4 | DERIVED | nut_data | ref | PRIMARY | PRIMARY | 36 | nutrition.nutr_def.nutr_no,nutrition.recipe_ingredients.ingred_ndb_no | 1 | |
| 2 | DERIVED | meals | range | day_ind | day_ind | 9 | NULL | 30 | Using where |
| 2 | DERIVED | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | Using where |
| 3 | DEPENDENT SUBQUERY | nutr_rights | index_subquery | users_userid,nutr_def_nutr_no | nutr_def_nutr_no | 19 | func | 1 | Using index; Using where |
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
10 rows in set (2.82 sec)
Second chunk (data):
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | meals | range | PRIMARY,day_ind | day_ind | 9 | NULL | 30 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | food_entries | ref | meals_meal_id | meals_meal_id | 5 | nutrition.meals.meal_id | 15 | Using where |
| 1 | SIMPLE | recipe_ingredients | ref | foods_food_id,ingred_ndb_no | foods_food_id | 4 | nutrition.food_entries.entry_ident | 2 | |
| 1 | SIMPLE | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | |
| 1 | SIMPLE | nut_data | ref | PRIMARY | PRIMARY | 36 | nutrition.nutr_def.nutr_no,nutrition.recipe_ingredients.ingred_ndb_no | 1 | |
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
5 rows in set (0.00 sec)
I've ANALYZEd all the tables involved in the query, and added an index on the datetime field that joins meals and food_entries; I called it 'day_ind'. I hoped that would accelerate things, but it didn't seem to make a difference. I also tried removing the 'sum' function, as I understand that having a function in the query will frequently mean a full table scan, which is obviously much slower. Unfortunately, removing the 'sum' didn't seem to make a difference either (well, about 3-5% or so, but not the order of magnitude that I'm looking for).
I would love any suggestions and will be happy to provide any more information you need to help diagnose and improve this problem. Thanks in advance!
There are a few 'type: ALL' entries in your EXPLAIN output, which suggest full table scans and hence temporary-table creation. You could re-index if the indexes are not there already.
Sorting and GROUP BY are usually the performance killers; you can adjust MySQL memory settings to avoid physical I/O to the tmp table if you have extra memory available.
Lastly, try to make sure the data types of the join attributes match, i.e. that data.date_time = listing.date_time compares the same type on both sides.
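On the memory-settings side, the relevant MySQL knobs are tmp_table_size, max_heap_table_size and sort_buffer_size; the values below are illustrative only, not recommendations:

-- In-memory temp tables are capped by the LOWER of these two, so raise them together:
SET GLOBAL tmp_table_size      = 256 * 1024 * 1024;
SET GLOBAL max_heap_table_size = 256 * 1024 * 1024;
-- Per-connection buffer used for filesort:
SET GLOBAL sort_buffer_size    = 8 * 1024 * 1024;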
Hope that helps.
Okay, so I eventually figured out what I'm gonna end up doing. I couldn't make the 'data' query any faster - that's still the bottleneck. But now I've made it so the total query process is pretty close to linear, not exponential.
I split the query into two parts and made each one into a temporary table. Then I added an index for each of those temp tables and did the join separately afterwards. This made the total execution time for 1 month of data drop from 8 to 2 seconds, and for 1 year of data from ~10 minutes to ~30 seconds. Good enough for now, I think. I can work with that.
Thanks for the suggestions. Here's what I ended up doing:
create table listing (
SELECT nutrdesc, nutr_no, date_time, units
FROM meals, nutr_def
WHERE meals.users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
AND (
nutr_no <100000 OR nutr_no IN (
SELECT nutr_def_nutr_no
FROM nutr_rights
WHERE nutr_rights.users_userid = '2'
)
)
);
create table data (
SELECT nutrdesc, date_time, nut_data.nutr_no, sum(ingred_gram_mass*entry_qty_num*nutr_val/100) AS total_nutr_mass
FROM nut_data, recipe_ingredients, food_entries, meals, nutr_def
WHERE nut_data.nutr_no = nutr_def.nutr_no
AND ndb_no = ingred_ndb_no
AND foods_food_id = entry_ident
AND meals_meal_id = meal_id
AND users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
GROUP BY date_time,nut_data.nutr_no
);
create index joiner on data(nutr_no, date_time);
create index joiner on listing(nutr_no, date_time);
SELECT listing.date_time,listing.nutrdesc,data.total_nutr_mass,listing.units
FROM listing
LEFT JOIN data
ON data.date_time = listing.date_time
AND listing.nutr_no = data.nutr_no
ORDER BY listing.date_time,listing.nutrdesc,listing.units;