Finding closest value. How to tell MySQL that the data is already ordered?

Let's say I have a table like the following:
+------------+--------+------+-----+---------+
| Field      | Type   | Null | Key | Default |
+------------+--------+------+-----+---------+
| datetime   | double | NO   | PRI | NULL    |
| some_value | float  | NO   |     | NULL    |
+------------+--------+------+-----+---------+
The date has to be stored as a double because it holds Unix time with fractional seconds (installing MySQL 5.6 to get fractional DATETIME is not an option). In addition, the values in the datetime column are not only the primary key, they are also always increasing. I would like to find the row closest to a certain value. Usually you could use something like:
select * from table order by abs(datetime - $myvalue) limit 1
However, I'm afraid this implementation will be slow for hundreds of thousands of values, because it has to scan the whole table. Since the data is ordered, I know a binary search could speed up the process, but I have no idea how to tell MySQL to perform that kind of search.
To test the performance I run the following:
SET profiling = 1;
SELECT * FROM table order by abs(datetime - $myvalue) limit 1;
SHOW PROFILE FOR QUERY 1;
With the following results:
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000122 |
| Waiting for query cache lock   | 0.000051 |
| checking query cache for query | 0.000191 |
| checking permissions           | 0.000038 |
| Opening tables                 | 0.000094 |
| System lock                    | 0.000047 |
| Waiting for query cache lock   | 0.000085 |
| init                           | 0.000103 |
| optimizing                     | 0.000031 |
| statistics                     | 0.000057 |
| preparing                      | 0.000049 |
| executing                      | 0.000023 |
| Sorting result                 | 2.806665 |
| Sending data                   | 0.000359 |
| end                            | 0.000049 |
| query end                      | 0.000033 |
| closing tables                 | 0.000050 |
| freeing items                  | 0.000089 |
| logging slow query             | 0.000067 |
| cleaning up                    | 0.000032 |
+--------------------------------+----------+
As I understand it, sorting the result takes 2.8 seconds, even though my data is already sorted. As additional information, I have around 240,000 rows.

It won't scan the entire database. A primary key is indexed by a B-tree. Forcing it into a binary search would be slower, if you could do it, which you can't.

Try making it a field:
select abs(datetime - $myvalue) as date_diff, table.*
from table
order by date_diff
limit 1

Indexes are supported in RDBMSs. Define an index on datetime, or whatever field you are interested in, and the database will not do a complete table scan.
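If the ORDER BY ABS(...) form still forces a full sort (which is what the 2.8-second "Sorting result" step above suggests), one common workaround, offered here as a sketch rather than a guaranteed fix, is to let the primary-key index answer two tiny range queries and then keep the closer of the two candidate rows. The table name t and the variable @myvalue below are placeholders:
-- Two index-backed lookups (one row at or above the target, one below),
-- then pick whichever is closer; the outer ORDER BY sorts at most 2 rows.
SET @myvalue := 1262340000.25;   -- placeholder target timestamp
SELECT *
FROM (
    (SELECT * FROM t WHERE `datetime` >= @myvalue ORDER BY `datetime` ASC  LIMIT 1)
    UNION ALL
    (SELECT * FROM t WHERE `datetime` <  @myvalue ORDER BY `datetime` DESC LIMIT 1)
) AS candidates
ORDER BY ABS(`datetime` - @myvalue)
LIMIT 1;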

Related

How to speed up Group by query

I have a MySQL query that takes 30 seconds to run. There are more than 3 million rows in the table.
Here is the db structure:
text (VARCHAR(64)),
kpi1 (INT),
kpi2 (INT),
position (DECIMAL),
date (DATE),
device (VARCHAR(32))
Here is the query:
select date, sum(kpi1), sum(kpi2) FROM `table_name` GROUP BY date ;
EXPLAIN gives me this result:
id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | extra
1 | SIMPLE | table_name | NULL | index | UNIQUE,DATE | DATE | 3 | NULL | 3316480 | 100.00 | NULL
I have an index on date.
Here is the result with profiling:
mysql> show profile for query 1;
+----------------------+-----------+
| Status               | Duration  |
+----------------------+-----------+
| starting             |  0.000080 |
| checking permissions |  0.000011 |
| Opening tables       |  0.000021 |
| init                 |  0.000023 |
| System lock          |  0.000011 |
| optimizing           |  0.000007 |
| statistics           |  0.000021 |
| preparing            |  0.000019 |
| Sorting result       |  0.000007 |
| executing            |  0.000005 |
| Sending data         | 32.814836 |
| end                  |  0.000011 |
| query end            |  0.000009 |
| closing tables       |  0.000009 |
| freeing items        |  0.000082 |
| cleaning up          |  0.000013 |
+----------------------+-----------+
16 rows in set, 1 warning (0,00 sec)
Any idea?
If the data for historical dates is static (as in, not changing because the date / activity is already done), then this is a perfect example of when to use a summary table. Create a separate table that holds nothing but the date and the aggregates you need. Populate it for all days prior to the current one, and then at the end of each day (via some daily trigger or scheduled job) insert the sums for the day that just finished. You could even include the count of records, something like
insert into MyDailySummaryTable
( Date, kpi1Sum, kpi2Sum, numRecs )
select date,
sum(kpi1) kpi1Sum,
sum(kpi2) kpi2Sum,
count(*) numRecs
FROM
`table_name`
where
date < curdate()
GROUP BY
date ;
Then, for each day after the initial load:
insert into MyDailySummaryTable
( Date, kpi1Sum, kpi2Sum, numRecs )
select date,
sum(kpi1) kpi1Sum,
sum(kpi2) kpi2Sum,
count(*) numRecs
FROM
`table_name`
where
date = date_add( curdate(), interval -1 day )
GROUP BY
date ;
If your "date" field has timestamp information too, you may need to adjust the query to ignore the time portions.
Then, when trying to run your totals, you can just query from the MyDailySummaryTable directly and have instant results.
You could even expand the aggregate table to include counts per device, in case you ever want to find tracking info for one specific device too.
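And if the totals also need to include today's still-changing rows, a common pattern (again just a sketch built on the hypothetical MyDailySummaryTable above) is to union the summary with a live aggregate over the current day only:
-- Historical days come from the summary table; today is aggregated live.
SELECT Date, kpi1Sum, kpi2Sum FROM MyDailySummaryTable
UNION ALL
SELECT date, SUM(kpi1), SUM(kpi2)
FROM `table_name`
WHERE date = CURDATE()
GROUP BY date;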

JOIN performance very slow when selecting VARCHAR field

I have a difficult problem with a query, and I can't figure out why it is performing so badly.
Please see following queries and query times (using HeidiSQL):
SELECT p.TID, a.TID
FROM characters AS p JOIN account a ON p.AccountId = a.TID;
=> rows: 57.879 Query time: 0.063 sec. (+ 0.328 sec. network)
Explain:
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+--------------------------+
| 1 | SIMPLE | a | index | TID | WebAccountId | 5 | NULL | 21086 | Using index |
| 1 | SIMPLE | p | ref | AccountId | AccountId | 5 | dol.a.TID | 1 | Using where; Using index |
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+--------------------------+
This is fast but as soon as I select a VARCHAR(255) field from table characters it gets very slow. See network time.
SELECT p.TID, a.TID, p.LastName
FROM characters AS p JOIN account a ON p.AccountId = a.TID;
=> rows: 57.879 Query time: 0.219 sec. (+ 116.234 sec. network)
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+-------------+
| 1 | SIMPLE | a | index | TID | WebAccountId | 5 | NULL | 21086 | Using index |
| 1 | SIMPLE | p | ref | AccountId | AccountId | 5 | dol.a.TID | 1 | Using where |
+----+-------------+-------+-------+---------------+--------------+---------+-----------+-------+-------------+
Query time is still good but network time got unbearable.
One could think that it's caused by the transfer of p.LastName, but see the query without the join:
SELECT p.TID, p.LastName
FROM characters AS p
=> rows: 57.881 Query time: 0.063 sec. (+ 0.578 sec. network)
+----+-------------+-------+------+---------------+------+---------+------+-------+-------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------+
| 1 | SIMPLE | p | ALL | NULL | NULL | NULL | NULL | 59800 | |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------+
Any idea what is going on here? I have no idea how to fix that.
Edit:
Added the Explain output for each query.
In case it matters, it's mysql 5.1.72-community
Edit2: Tested from the command line. Same performance. If I look at the MySQL process list I see "Sending data" for the poorly performing query. The query was originally used in an ASP.NET web application, where performance was also very bad. That is why I used HeidiSQL to investigate. I would definitely rule out HeidiSQL as the problem.
Edit3 Test result in Mysql Workbench:
I found out what the culprit was here. I used MySQL 5.1.72 with InnoDB on default settings.
This means it used an InnoDB buffer pool of just 8MB
innodb_buffer_pool_size=8M
As soon as I added the VARCHAR field to the select clause, MySQL was forced to write the result to disk because it couldn't hold it in memory for the transfer. The join seems to have put even more pressure on the memory usage of that buffer.
After I changed the buffer size to 1G the problem was gone.
innodb_buffer_pool_size=1G
The first request after mysql start can still be a bit slow but subsequent queries are very fast.
So it was basically a misconfiguration of the MySQL server.
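For anyone checking their own server (this note is mine, not part of the original answer): the current value can be read at runtime, and the setting itself lives in the [mysqld] section of my.cnf / my.ini; on MySQL 5.1 a restart is needed for a change to take effect.
-- Reports the buffer pool size in bytes (8388608 = the 8 MB default on 5.1).
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';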

Sending data taking too long but indexes already created

I am having problems with a query that is taking 20 seconds to return results :(
In the tables cases and cases_cstm, I have 960,000 rows.
This is my query:
SELECT cases.id ,cases_cstm.assunto_c, cases.name , cases.case_number ,
cases.priority , accounts.name account_name ,
accounts.assigned_user_id account_name_owner ,
'Accounts' account_name_mod, cases.account_id ,
LTRIM(RTRIM(CONCAT(IFNULL(jt1.first_name,''),' ',IFNULL(jt1.last_name,'')))) assigned_user_name ,
jt1.created_by assigned_user_name_owner ,
'Users' assigned_user_name_mod, cases.status , cases.date_entered ,
cases.assigned_user_id
FROM cases
LEFT JOIN cases_cstm ON cases.id = cases_cstm.id_c
LEFT JOIN accounts accounts ON
cases.account_id=accounts.id AND accounts.deleted=0 AND
accounts.deleted=0
LEFT JOIN users jt1 ON
cases.assigned_user_id=jt1.id AND
jt1.deleted=0 AND jt1.deleted=0
where
(((LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%' OR
LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%'))) AND
cases.deleted=0 ORDER BY cases.date_entered DESC LIMIT 0,11;
Here are the indexes of the table:
+-------+------------+--------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name           | Seq_in_index | Column_name      | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+--------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
| cases | 0          | PRIMARY            | 1            | id               | A         | 911472      | NULL     | NULL   |      | BTREE      |         |
| cases | 0          | case_number        | 1            | case_number      | A         | 911472      | NULL     | NULL   |      | BTREE      |         |
| cases | 1          | idx_case_name      | 1            | name             | A         | 911472      | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_account_id     | 1            | account_id       | A         | 455736      | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_cases_stat_del | 1            | assigned_user_id | A         | 106         | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_cases_stat_del | 2            | status           | A         | 197         | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_cases_stat_del | 3            | deleted          | A         | 214         | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_priority       | 1            | priority         | A         | 455736      | NULL     | NULL   | YES  | BTREE      |         |
| cases | 1          | idx_date_entered   | 1            | date_entered     | A         | 455736      | NULL     | NULL   | YES  | BTREE      |         |
+-------+------------+--------------------+--------------+------------------+-----------+-------------+----------+--------+------+------------+---------+
The EXPLAIN output of the query was posted as an image (not reproduced here).
This is the profile of the query execution:
+--------------------+-----------+
| Status             | Duration  |
+--------------------+-----------+
| starting           |  0.000122 |
| Opening tables     |  0.000180 |
| System lock        |  0.000005 |
| Table lock         |  0.000005 |
| init               |  0.000051 |
| optimizing         |  0.000017 |
| statistics         |  0.000071 |
| preparing          |  0.000021 |
| executing          |  0.000003 |
| Sorting result     |  0.000004 |
| Sending data       | 21.595455 |
| end                |  0.000012 |
| query end          |  0.000002 |
| freeing items      |  0.000419 |
| logging slow query |  0.000005 |
| logging slow query |  0.000002 |
| cleaning up        |  0.000004 |
+--------------------+-----------+
Can someone help me understand why the query is taking so long to execute?
Thanks!!
First, change your LEFT JOIN to accounts into an INNER JOIN. I don't know if that will make a drastic change, but it makes a lot more sense if you understand the difference.
What you are saying with LEFT JOIN is "I want all cases, whether or not they have an associated account". An INNER JOIN here means "give me only the cases that actually have an associated account".
The end-result of your query is the same, because you are later on filtering things out with your WHERE clause, but I have a feeling that this might be why idx_account_id is being ignored.
A second, probably bigger problem is your where clause:
(((LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%' OR
LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%'))) AND
There's a ton of functions here, and MySQL can't optimize this using an index. Every record will be checked for this condition, and all functions you're using will be called for every record. This is most likely the biggest problem.
First, this can be simplified a bit. I think both sides of this OR statement are the same, so let's first turn it into one:
LTRIM(RTRIM(CONCAT(IFNULL(accounts.name,'')))) LIKE 'rodrigo fernando%'
Since you have a wildcard on the right side of the LIKE pattern, why bother with the RTRIM?
LTRIM(CONCAT(IFNULL(accounts.name,''))) LIKE 'rodrigo fernando%'
You don't need to CONCAT anything, if there's only one thing!
LTRIM(IFNULL(accounts.name,'')) LIKE 'rodrigo fernando%'
LTRIM works just fine on NULL values
LTRIM(accounts.name) LIKE 'rodrigo fernando%'
Alright, that saved us a bunch of functions. However, the remaining LTRIM is still a major problem, as it still completely blocks MySQL from using indexes. The solution is fairly simple though:
Update your accounts table, once:
UPDATE accounts SET name = LTRIM(name);
Make sure that whenever you insert new accounts, you trim before inserting. That way you do this work at INSERT time, not SELECT time.
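One way to enforce that, sketched here as an assumption rather than something from the original answer, is a BEFORE INSERT trigger on accounts (a matching BEFORE UPDATE trigger would cover edits):
-- Hypothetical trigger that trims names at write time.
DELIMITER //
CREATE TRIGGER trg_accounts_trim_name
BEFORE INSERT ON accounts
FOR EACH ROW
BEGIN
    SET NEW.name = LTRIM(NEW.name);
END//
DELIMITER ;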
Change your previous WHERE clause to:
accounts.name LIKE 'rodrigo fernando%'
Boom, you can now use an index on accounts.name and it will be fast as fuck.
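For completeness, and as an assumption on my part (the index listing in the question only covers the cases table), the index on accounts.name would look something like:
-- Hypothetical index name; accounts.name may or may not already be indexed.
CREATE INDEX idx_accounts_name ON accounts (name);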
I was able to solve this problem following the tips of Evert!
To be clear, this query is built dynamically by a system; I will still need to optimize the code to remove the functions that do not make sense in this case.
What helped me was replacing the LEFT JOINs with INNER JOINs between cases and cases_cstm and between cases and accounts... with this change alone the query started executing in 0.9 seconds!
Thanks for everyone's help!
You need to change your query as well as your indexing. Since you have indexes only on the cases table, and you are also using the accounts and users tables, you should consider those tables while indexing too. Make changes to your query as follows.
SELECT cases.id, cases_cstm.assunto_c, cases.name, cases.case_number, cases.priority, accounts.name account_name, accounts.assigned_user_id account_name_owner,
'Accounts' account_name_mod, cases.account_id, LTRIM(RTRIM(CONCAT(IFNULL(jt1.first_name,''),' ',IFNULL(jt1.last_name,'')))) assigned_user_name,
jt1.created_by assigned_user_name_owner,'Users' assigned_user_name_mod, cases.status, cases.date_entered, cases.assigned_user_id
FROM cases
LEFT JOIN cases_cstm ON cases.id = cases_cstm.id_c
LEFT JOIN accounts accounts ON cases.account_id=accounts.id AND accounts.deleted=0 AND (TRIM(accounts.name) LIKE 'rodrigo fernando%' OR TRIM(accounts.name) LIKE 'rodrigo fernando%')
LEFT JOIN users jt1 ON cases.assigned_user_id=jt1.id AND jt1.deleted=0
WHERE cases.deleted=0 ORDER BY cases.date_entered DESC LIMIT 0,11;
Then first delete the indexes that are not being used, and create indexes on the tables as follows:
cases - deleted, date_entered (one index with multiple columns)
accounts - deleted, name (one index with multiple columns)
users - deleted
Create these indexes and make sure the sequence of columns in each index matches how they are used in the query, because MySQL can use any leftmost prefix of the index. If you need more details, go through these links:
Multiple-Column Indexes
MySQL ORDER BY / LIMIT performance
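As a rough sketch of what the composite indexes suggested above could look like (the index names are invented here, and the column order follows the suggestion):
-- Hypothetical names; the leftmost columns match the equality filters.
CREATE INDEX idx_cases_deleted_date    ON cases    (deleted, date_entered);
CREATE INDEX idx_accounts_deleted_name ON accounts (deleted, name);
CREATE INDEX idx_users_deleted         ON users    (deleted);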

Optimizing / improving a slow mysql query - indexing? reorganizing?

First off, I've looked at several other questions about optimizing sql queries, but I'm still unclear for my situation what is causing my problem. I read a few articles on the topic as well and have tried implementing a couple possible solutions, as I'll describe below, but nothing has yet worked or even made an appreciable dent in the problem.
The application is a nutrition tracking system - users enter the foods they eat and based on an imported USDA database the application breaks down the foods to the individual nutrients and gives the user a breakdown of the nutrient quantities on a (for now) daily basis.
Here's a PDF of the abbreviated database schema, and here it is as a (perhaps poor-quality) JPG. I made this in OpenOffice; if there are suggestions for better ways to visualize a database, I'm open to suggestions on that front as well! The blue tables are directly from the USDA, and the green and black tables are ones I've made. I've omitted a lot of data in order not to clutter things up unnecessarily.
Here's the query I'm trying to run that takes a very long time:
SELECT listing.date_time,listing.nutrdesc,data.total_nutr_mass,listing.units
FROM
(SELECT nutrdesc, nutr_no, date_time, units
FROM meals, nutr_def
WHERE meals.users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
AND (nutr_no <100000
OR nutr_no IN
(SELECT nutr_def_nutr_no
FROM nutr_rights
WHERE nutr_rights.users_userid = '2'))
) as listing
LEFT JOIN
(SELECT nutrdesc, date_time, nut_data.nutr_no, sum(ingred_gram_mass*entry_qty_num*nutr_val/100) AS total_nutr_mass
FROM nut_data, recipe_ingredients, food_entries, meals, nutr_def
WHERE nut_data.nutr_no = nutr_def.nutr_no
AND ndb_no = ingred_ndb_no
AND foods_food_id = entry_ident
AND meals_meal_id = meal_id
AND users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
GROUP BY date_time,nut_data.nutr_no ) as data
ON data.date_time = listing.date_time
AND listing.nutr_no = data.nutr_no
ORDER BY listing.date_time,listing.nutrdesc,listing.units
So I know that's rather complex: the first select gets a listing of all the nutrients that the user consumed within the given date range, and the second fills in all the quantities.
When I implement them separately, the first query is really fast, but the second is slow and gets very slow when the date ranges get large. The join makes the whole thing ridiculously slow. I know that the 'main' problem is the join between these two derived tables, and I can get rid of that and do the join by hand basically in php much faster, but I'm not convinced that's the whole story.
For example: for 1 month of data, the query takes about 8 seconds, which is slow, but not completely terrible. Separately, each query takes ~.01 and ~2 seconds respectively. 2 seconds still seems high to me.
If I try to retrieve a year's worth of data, it takes several (>10) minutes to run the whole query, which is problematic: the client-server connection sometimes times out, and in any case I don't want to sit there with a spinning 'please wait' icon. Mainly, I feel like there's a problem because it takes more than 12x as long to retrieve 12x more information, when it should take less than 12x as long if I were doing things right.
Here's the 'explain' for each of the slow queries: (the whole thing, and just the second half).
Whole thing:
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5053 | Using temporary; Using filesort |
| 1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 4341 | |
| 4 | DERIVED | meals | range | PRIMARY,day_ind | day_ind | 9 | NULL | 30 | Using where; Using temporary; Using filesort |
| 4 | DERIVED | food_entries | ref | meals_meal_id | meals_meal_id | 5 | nutrition.meals.meal_id | 15 | Using where |
| 4 | DERIVED | recipe_ingredients | ref | foods_food_id,ingred_ndb_no | foods_food_id | 4 | nutrition.food_entries.entry_ident | 2 | |
| 4 | DERIVED | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | |
| 4 | DERIVED | nut_data | ref | PRIMARY | PRIMARY | 36 | nutrition.nutr_def.nutr_no,nutrition.recipe_ingredients.ingred_ndb_no | 1 | |
| 2 | DERIVED | meals | range | day_ind | day_ind | 9 | NULL | 30 | Using where |
| 2 | DERIVED | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | Using where |
| 3 | DEPENDENT SUBQUERY | nutr_rights | index_subquery | users_userid,nutr_def_nutr_no | nutr_def_nutr_no | 19 | func | 1 | Using index; Using where |
+----+--------------------+--------------------+----------------+-------------------------------+------------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
10 rows in set (2.82 sec)
Second chunk (data):
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | meals | range | PRIMARY,day_ind | day_ind | 9 | NULL | 30 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | food_entries | ref | meals_meal_id | meals_meal_id | 5 | nutrition.meals.meal_id | 15 | Using where |
| 1 | SIMPLE | recipe_ingredients | ref | foods_food_id,ingred_ndb_no | foods_food_id | 4 | nutrition.food_entries.entry_ident | 2 | |
| 1 | SIMPLE | nutr_def | ALL | PRIMARY | NULL | NULL | NULL | 174 | |
| 1 | SIMPLE | nut_data | ref | PRIMARY | PRIMARY | 36 | nutrition.nutr_def.nutr_no,nutrition.recipe_ingredients.ingred_ndb_no | 1 | |
+----+-------------+--------------------+-------+-----------------------------+---------------+---------+-----------------------------------------------------------------------+------+----------------------------------------------+
5 rows in set (0.00 sec)
I've 'analyzed' all the tables involved in the query, and added an index on the datetime field that joins meals and food_entries. I called it 'day_ind'. I hoped that would accelerate things, but it didn't seem to make a difference. I also tried removing the 'sum' function, as I understand that having a function in the query will frequently mean a full table scan, which is obviously much slower. Unfortunately, removing the 'sum' didn't seem to make a difference either (well, about 3-5% or so, but not the order of magnitude that I'm looking for).
I would love any suggestions and will be happy to provide any more information you need to help diagnose and improve this problem. Thanks in advance!
There are a few type=ALL entries in your EXPLAIN, which suggest full table scans and hence temporary tables. You could re-index if the indexes are not there already.
Sort and GROUP BY are usually the performance killers; you can adjust the MySQL memory settings to avoid physical I/O to the temp table if you have extra memory available.
Lastly, try to make sure the data types of the join attributes match, i.e. that data.date_time = listing.date_time compares the same data format.
Hope that helps.
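The memory settings mentioned above aren't named, so treat the following as the usual suspects rather than a prescription: an implicit temporary table stays in memory only up to the smaller of tmp_table_size and max_heap_table_size.
-- Check the current limits (MySQL uses the smaller of the two).
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';
-- Raise them for the current session only; size them to the memory you can spare.
SET SESSION tmp_table_size      = 256 * 1024 * 1024;
SET SESSION max_heap_table_size = 256 * 1024 * 1024;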
Okay, so I eventually figured out what I'm gonna end up doing. I couldn't make the 'data' query any faster - that's still the bottleneck. But now I've made it so the total query process is pretty close to linear, not exponential.
I split the query into two parts and made each one into a temporary table. Then I added an index for each of those temp tables and did the join separately afterwards. This made the total execution time for 1 month of data drop from 8 to 2 seconds, and for 1 year of data from ~10 minutes to ~30 seconds. Good enough for now, I think. I can work with that.
Thanks for the suggestions. Here's what I ended up doing:
create table listing (
SELECT nutrdesc, nutr_no, date_time, units
FROM meals, nutr_def
WHERE meals.users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
AND (
nutr_no <100000 OR nutr_no IN (
SELECT nutr_def_nutr_no
FROM nutr_rights
WHERE nutr_rights.users_userid = '2'
)
)
);
create table data (
SELECT nutrdesc, date_time, nut_data.nutr_no, sum(ingred_gram_mass*entry_qty_num*nutr_val/100) AS total_nutr_mass
FROM nut_data, recipe_ingredients, food_entries, meals, nutr_def
WHERE nut_data.nutr_no = nutr_def.nutr_no
AND ndb_no = ingred_ndb_no
AND foods_food_id = entry_ident
AND meals_meal_id = meal_id
AND users_userid = '2'
AND date_time BETWEEN '2009-8-12' AND '2009-9-12'
GROUP BY date_time,nut_data.nutr_no
);
create index joiner on data(nutr_no, date_time);
create index joiner on listing(nutr_no, date_time);
SELECT listing.date_time,listing.nutrdesc,data.total_nutr_mass,listing.units
FROM listing
LEFT JOIN data
ON data.date_time = listing.date_time
AND listing.nutr_no = data.nutr_no
ORDER BY listing.date_time,listing.nutrdesc,listing.units;

Mysql queries crawl when switching servers

I ran into a problem last week moving from dev to testing, where one of my queries, which had run perfectly in dev, was crawling on my testing server.
It was fixed by adding FORCE INDEX on one of the indexes in the query.
Now I've loaded the same database onto the production server (and it's running with the FORCE INDEX command), and it has slowed down again.
Any idea what would cause something like this to happen? The testing and prod are both running the same OS and version of mysql (unlike the dev).
Here's the query and the explain from it.
EXPLAIN SELECT showsdate.bid, showsdate.bandid, showsdate.date, showsdate.time,
-> showsdate.title, showsdate.name, showsdate.address, showsdate.rank, showsdate.city, showsdate.state,
-> showsdate.lat, showsdate.`long` , tickets.link, tickets.lowprice, tickets.highprice, tickets.source
-> , tickets.ext, artistGenre, showsdate.img
-> FROM tickets
-> RIGHT OUTER JOIN (
-> SELECT shows.bid, shows.date, shows.time, shows.title, artists.name, artists.img, artists.rank, artists
-> .bandid, shows.address, shows.city, shows.state, shows.lat, shows.`long`, GROUP_CONCAT(genres.genre SEPARATOR
-> ' | ') AS artistGenre
-> FROM shows FORCE INDEX (biddate_idx)
-> JOIN artists ON shows.bid = artists.bid JOIN genres ON artists.bid=genres.bid
-> WHERE `long` BETWEEN -74.34926984058 AND -73.62463215942 AND lat BETWEEN 40.39373515942 AND 41.11837284058
-> AND shows.date >= '2009-03-02' GROUP BY shows.bid, shows.date ORDER BY shows.date, artists.rank DESC
-> LIMIT 0, 30
-> )showsdate ON showsdate.bid = tickets.bid AND showsdate.date = tickets.date;
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 30 | |
| 1 | PRIMARY | tickets | ref | biddate_idx | biddate_idx | 7 | showsdate.bid,showsdate.date | 1 | |
| 2 | DERIVED | genres | index | bandid_idx | bandid_idx | 141 | NULL | 531281 | Using index; Using temporary; Using filesort |
| 2 | DERIVED | shows | ref | biddate_idx | biddate_idx | 4 | activeHW.genres.bid | 5 | Using where |
| 2 | DERIVED | artists | eq_ref | bid_idx | bid_idx | 4 | activeHW.genres.bid | 1 | |
+----+-------------+------------+--------+---------------+-------------+---------+------------------------------+--------+----------------------------------------------+
I think I chimed in when you asked this question about the differences in dev -> test.
Have you tried rebuilding the indexes and recalculating statistics? Generally, forcing an index is a bad idea as the optimizer usually makes good choices as to which indexes to use. However, that assumes that it has good statistics to work from and that the indexes aren't seriously fragmented.
ETA:
To rebuild indexes, use:
REPAIR TABLE tbl_name QUICK;
To recalculate statistics:
ANALYZE TABLE tbl_name;
Does the test server have only 10 records and the production server 1,000,000,000 records?
That might also cause different execution times.
Are the two servers configured the same? It sounds like you might be crossing a "tipping point" in MySQL's performance. I'd compare the MySQL configurations; there might be a memory parameter way different.
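A quick way to make that comparison (my suggestion, not part of the original answer) is to dump the buffer- and cache-related variables on both servers and diff the output:
-- Run on each server and compare; buffer and cache sizes are the usual
-- culprits for this kind of "tipping point" behaviour.
SHOW VARIABLES LIKE '%buffer%';
SHOW VARIABLES LIKE '%cache%';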