Optimize sql query - mysql

I'm trying to optimize sql query which now takes about 20s to execute.
Here is the structure of my tables.
last_login
id | ip_address |when
1 2130706433 2012-05-04 12:00:36
and
country_by_ip
ip_from | ip_to | country
16843008 | 16843263 | CN
Here is the query I use:
SELECT
ll.ip_address,
ll.when,
cbi.country
FROM last_login ll
LEFT JOIN `country_by_ip` cbi on ll.ip_address BETWEEN cbi.ip_from AND cbi.ip_to
The fields ip_from and ip_to are indexed.
Can you recommend me how to speed up this query ?
//EDIT
CREATE TABLE `last_login` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`ip_address` int(11) unsigned NOT NULL,
`when` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=32 DEFAULT CHARSET=utf8
CREATE TABLE `country_by_ip` (
`ip_from` int(20) unsigned NOT NULL,
`ip_to` int(20) DEFAULT NULL,
`country` varchar(255) DEFAULT NULL,
KEY `ip_from` (`ip_from`),
KEY `ip_to` (`ip_to`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

How about:
SELECT
ll.ip_address,
ll.when,
cbi.country
FROM last_login ll
LEFT JOIN `country_by_ip` cbi on ll.ip_address > cbi.ip_from
WHERE ll.ip_address < cbi.ip_to
However, I am totally agree with #Romain, change the DB schema to better design.

You're not doing yourself any favours by splitting the country_by_ip range across 2 sperate indexes - change these to KEY ip_range (ip_from,ip_to).

Related

Optimize sql query to speed up a search which currently takes around 85 seconds

I have a database with the records near about 2.7 milion . I need to fetch records from that for that i am using the below query
for result
SELECT r3.original_image_title,r3.uuid,r3.original_image_URL FROM `image_attributes` AS r1 INNER JOIN `filenames` as r3 WHERE r1.`uuid` = r3.`uuid` and r3.`status` = 1 and r1.status=1 and (r1.`attribute_name` like "Quvenzhané Wallis%" or r3.original_image_URL like "Quvenzhané Wallis%") group by r3.`uuid` limit 0,20
for total count
SELECT count(DISTINCT(r1.`uuid`)) as count FROM `image_attributes` AS r1 INNER JOIN `filenames` as r3 WHERE r1.`uuid` = r3.`uuid` and r3.`status` = 1 and r1.status=1 and (r1.`attribute_name` like "Quvenzhané Wallis%" or r3.original_image_URL like "Quvenzhané Wallis%")
table structures are as below
CREATE TABLE IF NOT EXISTS `image_attributes` (
`index` int(11) NOT NULL AUTO_INCREMENT,
`attribute_name` text NOT NULL,
`attribute_type` varchar(255) NOT NULL,
`uuid` varchar(255) NOT NULL,
`status` tinyint(1) NOT NULL DEFAULT '1',
PRIMARY KEY (`index`),
KEY `attribute_type` (`attribute_type`),
KEY `uuid` (`uuid`),
KEY `status` (`status`),
KEY `attribute_name` (`attribute_name`(50))
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=2730431 ;
CREATE TABLE IF NOT EXISTS `filenames` (
`index` int(11) NOT NULL AUTO_INCREMENT,
`original_image_title` text NOT NULL,
`original_image_URL` text NOT NULL,
`uuid` varchar(255) NOT NULL,
`status` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`index`),
KEY `uuid` (`uuid`),
KEY `original_image_URL` (`original_image_URL`(50))
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=591967 ;
please suggest me how can i optimize the queries to make the search result faster
I would recommend to you a book called 'High Performance MySql'. There is a section called Optimize databases and queries, or something like that.

MySQL select with where takes a long time

I have a table with about 700.000 rows:
CREATE TABLE IF NOT EXISTS `ext_log_entries` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`action` varchar(8) NOT NULL,
`logged_at` datetime NOT NULL,
`object_id` varchar(32) DEFAULT NULL,
`object_class` varchar(255) NOT NULL,
`version` int(11) NOT NULL,
`data` longtext COMMENT '(DC2Type:array)',
`username` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `log_date_lookup_idx` (`logged_at`),
KEY `log_user_lookup_idx` (`username`),
KEY `log_class_lookup_idx` (`object_class`),
KEY `log_version_lookup_idx` (`object_id`,`object_class`,`version`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1219777 ;
I try to run the following query:
SELECT n0_.id AS id0, n0_.action AS action1, n0_.logged_at AS logged_at2, n0_.object_id AS object_id3, n0_.object_class AS object_class4, n0_.version AS version5, n0_.data AS data6, n0_.username AS username7
FROM ext_log_entries n0_
WHERE n0_.object_id =275634
AND n0_.object_class = 'My\\MyBundle\\Entity\\Field'
AND n0_.version <=1
ORDER BY n0_.version ASC
Here is the MySQL plan:
id 1
select_type SIMPLE
table n0_
type ref
possible_keys log_class_lookup_idx,log_version_lookup_idx
key log_class_lookup_idx
key_len 767
ref const
rows 641159
Extra Using where; Using filesort
My query need about 37 seconds to be executed for only 1 row in the result...
I tried to run the same query by deleting my indexes and it goes a little bit faster : about 31 seconds...
I don't understand why my query is taking so much time and why my indexes don't help the performance? Do you know how I can do to have good performance on this query?
Thanks in advance for your help !
EDIT
Here are the cardinalties of the indexes
log_date_lookup_idx BTREE logged_at 1221578 A
log_user_lookup_idx BTREE username 40 A YES
log_class_lookup_idx BTREE object_class 1010 A
log_version_lookup_idx BTREE object_id 1221578 A YES
object_class 1221578 A
version 1221578 A
I found a solution, not THE solution, but at least it works for me.
I think it could help anyway all people who are using gedmo loggable and who are lucky (like me) to have objects with only integers IDs.
I changes my column object_id to integer instead of varchar(255). My query now take 0.008 second ! It works for me because i'm sure i'll always have only integers, for people who have varchar, I'm sorry i tried many things but nothing worked....
CREATE TABLE IF NOT EXISTS `ext_log_entries` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`action` varchar(8) NOT NULL,
`logged_at` datetime NOT NULL,
`object_id` int(11) DEFAULT NULL,
`object_class` varchar(255) NOT NULL,
`version` int(11) NOT NULL,
`data` longtext COMMENT '(DC2Type:array)',
`username` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `log_date_lookup_idx` (`logged_at`),
KEY `log_user_lookup_idx` (`username`),
KEY `log_class_lookup_idx` (`object_class`),
KEY `log_version_lookup_idx` (`object_id`,`object_class`,`version`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1219777 ;

mySQL query is very slow after using DISTINCT and GROUP BY?

I have tables with following structure:
-- Table structure for table `temp_app`
--
CREATE TABLE IF NOT EXISTS `temp_app` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`vid` int(5) NOT NULL,
`num` varchar(64) NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `vid` (`vid`),
KEY `num` (`num`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=69509;
-- Table structure for table `inv_flags`
--
CREATE TABLE IF NOT EXISTS `inv_flags` (
`num` varchar(64) NOT NULL,
`vid` int(11) NOT NULL,
`f_special` tinyint(1) NOT NULL, /*0 or 1*/
`f_inserted` tinyint(1) NOT NULL, /*0 or 1*/
`f_notinserted` tinyint(1) NOT NULL, /*0 or 1*/
`userID` int(11) NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
KEY `num` (`num`),
KEY `userID` (`userID`),
KEY `vid` (`vid`),
KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Execution time of the following query is 9 seconds to display 30 records. What is wrong?
SELECT date_format(ifs.`timestamp`,'%y/%m/%d') as `date`
,count(DISTINCT ta.num) as inserted /*Unique nums*/
,SUM(ifs.f_notinserted) as not_inserted
,SUM(ifs.f_special) as special
,count(ta.num) as links /*All nums*/
from inventory_flags ifs
LEFT JOIN temp_app ta ON ta.num = ifs.num AND ta.vid = ifs.vid
WHERE ifs.userID = 3
GROUP BY date(ifs.`timestamp`) DESC LIMIT 30
EXPLAIN RESULT
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE ifs ref userID userID 4 const 12153 Using where
1 SIMPLE ta ref vid,num num 194 ifs.num 1
COUNT DISTINCT can sometimes cause rotten performance with MySql. Try this instead:
select count(*) from (select distinct...
as it can sometimes prevent MySql from writing the entire interim result to disk.
Here is the MySql bug info:
http://bugs.mysql.com/bug.php?id=21849

MySQL Indexes for extremely slow queries

The following query, regardless of environment, takes more than 30 seconds to compute.
SELECT COUNT( r.response_answer )
FROM response r
INNER JOIN (
SELECT G.question_id
FROM question G
INNER JOIN answer_group AG ON G.answer_group_id = AG.answer_group_id
WHERE AG.answer_group_stat = 'statistic'
) AS q ON r.question_id = q.question_id
INNER JOIN org_survey os ON os.org_survey_code = r.org_survey_code
WHERE os.survey_id =42
AND r.response_answer = 5
AND DATEDIFF( NOW( ) , r.added_dt ) <1000000
AND r.uuid IS NOT NULL
When I explain the query,
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 1087
1 PRIMARY r ref question_id,org_survey_code,code_question,uuid,uor question_id 4 q.question_id 1545 Using where
1 PRIMARY os eq_ref org_survey_code,survey_id,org_survey_code_2 org_survey_code 12 survey_2.r.org_survey_code 1 Using where
2 DERIVED G ALL agid NULL NULL NULL 1680
2 DERIVED AG eq_ref PRIMARY PRIMARY 1 survey_2.G.answer_group_id 1 Using where
I have a very basic knowledge of indexing, but I have tried nearly every combination I can think of and cannot seem to improve the speed of this query. The responses table is right around 2 million rows, question is about 1500 rows, answer_group is about 50, and org_survey is about 8,000.
Here is the basic structure for each:
CREATE TABLE `response` (
`response_id` int(10) unsigned NOT NULL auto_increment,
`response_answer` text NOT NULL,
`question_id` int(10) unsigned NOT NULL default '0',
`org_survey_code` varchar(7) NOT NULL,
`uuid` varchar(40) default NULL,
`added_dt` datetime default NULL,
PRIMARY KEY (`response_id`),
KEY `question_id` (`question_id`),
KEY `org_survey_code` (`org_survey_code`),
KEY `code_question` (`org_survey_code`,`question_id`),
KEY `IDX_ADDED_DT` (`added_dt`),
KEY `uuid` (`uuid`),
KEY `response_answer` (`response_answer`(1)),
KEY `response_question` (`response_answer`(1),`question_id`),
) ENGINE=MyISAM AUTO_INCREMENT=2298109 DEFAULT CHARSET=latin1
CREATE TABLE `question` (
`question_id` int(10) unsigned NOT NULL auto_increment,
`question_text` varchar(250) NOT NULL default '',
`question_group` varchar(250) default NULL,
`question_position` tinyint(3) unsigned NOT NULL default '0',
`survey_id` tinyint(3) unsigned NOT NULL default '0',
`answer_group_id` mediumint(8) unsigned NOT NULL default '0',
`seq_id` int(11) NOT NULL default '0',
PRIMARY KEY (`question_id`),
KEY `question_group` (`question_group`(10)),
KEY `survey_id` (`survey_id`),
KEY `agid` (`answer_group_id`)
) ENGINE=MyISAM AUTO_INCREMENT=1860 DEFAULT CHARSET=latin1
CREATE TABLE `org_survey` (
`org_survey_id` int(11) NOT NULL auto_increment,
`org_survey_code` varchar(10) NOT NULL default '',
`org_id` int(11) NOT NULL default '0',
`org_manager_id` int(11) NOT NULL default '0',
`org_url_id` int(11) default '0',
`division_id` int(11) default '0',
`sector_id` int(11) default NULL,
`survey_id` int(11) NOT NULL default '0',
`process_batch` tinyint(4) default '0',
`added_dt` datetime default NULL,
PRIMARY KEY (`org_survey_id`),
UNIQUE KEY `org_survey_code` (`org_survey_code`),
KEY `org_id` (`org_id`),
KEY `survey_id` (`survey_id`),
KEY `org_survey_code_2` (`org_survey_code`,`total_taken`),
KEY `org_manager_id` (`org_manager_id`),
KEY `sector_id` (`sector_id`)
) ENGINE=MyISAM AUTO_INCREMENT=9268 DEFAULT CHARSET=latin1
CREATE TABLE `answer_group` (
`answer_group_id` tinyint(3) unsigned NOT NULL auto_increment,
`answer_group_name` varchar(50) NOT NULL default '',
`answer_group_type` varchar(20) NOT NULL default '',
`answer_group_stat` varchar(20) NOT NULL default 'demographic',
PRIMARY KEY (`answer_group_id`)
) ENGINE=MyISAM AUTO_INCREMENT=53 DEFAULT CHARSET=latin1
I know there are small things I can probably do to improve the efficiency of the database, such as reducing the size of integers where it's unnecessary. However, those are fairly trivial considering the ridiculous time it takes just to produce a result here. How can I properly index these tables, based on what explain has shown me? It seems that I have tried a large variety of combinations to no avail. Also, is there anything else that anyone can see that will optimize the table and reduce the query? I need it to be computed in less than a second. Thanks in advance!
1.If you want the index of r.added_dt to be used, instead of:
DATEDIFF(NOW(), r.added_dt) < 1000000
use:
CURDATE() - INTERVAL 1000000 DAY < r.added_dt
Anyway, the above condition is checking if added_at is a million days old or not. Do you really store so old dates? If not, you can simply remove this condition.
If you want this condition, an index on added_at would help a lot. Your query as it is now, checks all rows for this condition, calling the DATEDIFF() function as many times as the rows of the response table.
2.Since r.response_answer cannot be NULL, instead of:
SELECT COUNT( r.response_answer )
use:
SELECT COUNT( * )
COUNT(*) is faster than COUNT(field).
3.Two of the three fields that you use for joining tables have different datatypes:
ON question . answer_group_id
= answer_group . answer_group_id
CREATE TABLE question (
...
answer_group_id mediumint(8) ..., <--- mediumint
CREATE TABLE answer_group (
answer_group_id` tinyint(3) ..., <--- tinyint
-------------------------------
ON org_survey . org_survey_code
= response . org_survey_code
CREATE TABLE response (
...
org_survey_code varchar(7) NOT NULL, <--- 7
CREATE TABLE org_survey (
...
org_survey_code varchar(10) NOT NULL default '', <--- 10
Datatype mediumint is not the same as tinyint and the same goes for varchar(7) and varchar(10). When they are used for join, MySQL has to lose time doing conversion from one type to another. Convert one of them so they have identical datatypes. This is not the main issue of the query but this change will also help all other queries that use these joins.
And after making this change do a 'Analyze Table ' for the table. It will help mysql making better execution plans.
You have a response_answer = 5 condition, where response_answer is text. It's not an error, but it's better to use response_answer = '5' (the conversion of 5 to '5' will be done by MySQL anyway, if you don't do that).
Real issue is that you don't have a compound index on the 3 fields that are used in the WHERE conditions. Try adding this one:
ALTER TABLE response
ADD INDEX ind_u1_ra1_aa
(uuid(1), response_answer(1), added_at) ;
(this may take a while as your table is not small)
Can you try the following query? I've removed the sub-query from your original one. This may let the optimiser produce a better execution plan.
SELECT COUNT(r.response_answer)
FROM response r
INNER JOIN question q ON r.question_id = q.question_id
INNER JOIN answer_group ag ON q.answer_group_id = ag.answer_group_id
INNER JOIN org_survey os ON os.org_survey_code = r.org_survey_code
WHERE
ag.answer_group_stat = 'statistic'
AND os.survey_id = 42
AND r.response_answer = 5
AND DATEDIFF(NOW(), r.added_dt) < 1000000
AND r.uuid IS NOT NULL

Why does MySQL stop using an index for a join when I select non-indexed fields in the field list

I have the following two tables:
CREATE TABLE `temporal_expressions` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dated_obj_type` varchar(255) DEFAULT NULL,
`dated_obj_id` int(11) DEFAULT NULL,
`start_date` datetime DEFAULT NULL,
`end_date` datetime DEFAULT NULL,
`start_time` int(11) DEFAULT NULL,
`end_time` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`lock_version` int(11) NOT NULL DEFAULT '0',
`wday` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `te_search` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`),
KEY `te_calendar` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`start_time`,`end_time`),
KEY `te_search_wday` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`wday`),
KEY `te_calendar_wday` (`dated_obj_type`,`dated_obj_id`,`start_date`,`end_date`,`start_time`,`end_time`,`wday`),
KEY `te_index` (`wday`,`dated_obj_type`,`start_date`,`end_date`,`start_time`,`end_time`,`dated_obj_id`)
) ENGINE=InnoDB AUTO_INCREMENT=8162445 DEFAULT CHARSET=latin1
CREATE TABLE `asset_blocks` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`block_type` int(11) DEFAULT '0',
`spaces_left` int(11) DEFAULT NULL,
`provider_note` varchar(255) DEFAULT NULL,
`extra_data` text,
`lock_version` int(11) DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
`type` varchar(255) DEFAULT NULL,
`service_provider_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `type` (`type`,`id`),
KEY `service_provider_id` (`service_provider_id`,`type`,`id`),
) ENGINE=InnoDB AUTO_INCREMENT=516867 DEFAULT CHARSET=latin1
If I run explain on this query (note that I am only selecting fields in the te_calendar_wday index from temporal_expressions) it uses the index for the join as expected
EXPLAIN SELECT asset_blocks.*, temporal_expressions.id,
temporal_expressions.dated_obj_type, temporal_expressions.dated_obj_id,
temporal_expressions.start_date, temporal_expressions.end_date,
temporal_expressions.start_time
FROM `asset_blocks`
LEFT OUTER JOIN `temporal_expressions`
ON `temporal_expressions`.dated_obj_id = `asset_blocks`.id
AND `temporal_expressions`.dated_obj_type = 'AssetBlock'
WHERE ( temporal_expressions.start_date <= '2010-11-25'
AND temporal_expressions.end_date >= '2010-11-01'
AND temporal_expressions.start_time < 1000 AND temporal_expressions.end_time > 1200
AND temporal_expressions.wday IN (1,2,3,4,5,6)
AND asset_blocks.id IN (1,2,3,4,5,6,7,8,9) )
1 SIMPLE temporal_expressions range te_search,te_calendar,te_search_wday,te_calendar_wday,te_index te_calendar_wday 272 NULL 9 Using where; Using index
1 SIMPLE asset_blocks eq_ref PRIMARY PRIMARY 4 lb_production.temporal_expressions.dated_obj_id 1
However, if I run this query (note that I have added a non-indexed field to the field list) it no longer uses the index (it uses a join buffer). Is this intentional or am I missing something?
EXPLAIN SELECT asset_blocks.*, temporal_expressions.id,
temporal_expressions.dated_obj_type, temporal_expressions.dated_obj_id,
temporal_expressions.start_date, temporal_expressions.end_date,
temporal_expressions.start_time, temporal_expressions.created_at
FROM `asset_blocks`
LEFT OUTER JOIN `temporal_expressions`
ON `temporal_expressions`.dated_obj_id = `asset_blocks`.id
AND `temporal_expressions`.dated_obj_type = 'AssetBlock'
WHERE ( temporal_expressions.start_date <= '2010-11-25'
AND temporal_expressions.end_date >= '2010-11-01'
AND temporal_expressions.start_time < 1000 AND temporal_expressions.end_time > 1200
AND temporal_expressions.wday IN (1,2,3,4,5,6)
AND asset_blocks.id IN (1,2,3,4,5,6,7,8,9) )
1 SIMPLE asset_blocks range PRIMARY PRIMARY 4 NULL 9 Using where
1 SIMPLE temporal_expressions range te_search,te_calendar,te_search_wday,te_calendar_wday,new_te_index te_search 272 NULL 9 Using where; Using join buffer
I cannot be sure if this is the case here, but:
If you select only indexed fields, MySQL can answer the whole query out of the index and does not even load the table data file.
If you select a field that is not indexed, it has to load the table data.
When making its execution plan, in certain cases (see comment) MySQL decides to do a full table scan although an index is present. This is because it's much quicker to read all data blindly than to look up every entry in the index and then read the data.