Hint indexes to MySQL on JOIN

EXPLAIN SELECT SENDER AS PROFILEID, MESSAGE, 'R' AS SR, SEEN, DATE
FROM `MESSAGE_LOG`
JOIN MESSAGES ON ( MESSAGE_LOG.ID = MESSAGES.ID )
WHERE `RECEIVER` = '9063911'
AND SENDER NOT IN ('658', '87238', '99359', '643848', '651922', '734783', '880643')
AND `TYPE` = 'R' AND `IS_MSG` = 'Y'
UNION
SELECT RECEIVER AS PROFILEID, MESSAGE, 'S' AS SR, SEEN, DATE
FROM `MESSAGE_LOG`
JOIN MESSAGES ON ( MESSAGE_LOG.ID = MESSAGES.ID )
WHERE `SENDER` = '9063911'
AND RECEIVER NOT IN ('658', '87238', '99359', '643848', '651922', '734783', '880643')
AND `TYPE` = 'R' AND `IS_MSG` = 'Y'
ORDER BY DATE DESC
The EXPLAIN query returns:
id    select_type   table        type    possible_keys       key       key_len  ref                   rows  Extra
1     PRIMARY       MESSAGE_LOG  ref     ID,SENDER,RECEIVER  RECEIVER  4        const                 142   Using where
1     PRIMARY       MESSAGES     eq_ref  ID                  ID        4        newjs.MESSAGE_LOG.ID  1
2     UNION         MESSAGE_LOG  range   ID,SENDER,RECEIVER  SENDER    8        NULL                  168   Using where
2     UNION         MESSAGES     eq_ref  ID                  ID        4        newjs.MESSAGE_LOG.ID  1
NULL  UNION RESULT  <union1,2>   ALL     NULL                NULL      NULL     NULL
The query takes around 15 seconds to execute. What could be the reason, and how can the query be optimized further?
EDIT: Is it because MySQL is not able to use the indexes properly?
Indexes on MESSAGE_LOG table:
Keyname   Type    Cardinality  Field
ID        UNIQUE  44491833     ID
SENDER    INDEX   44491833     SENDER,RECEIVER
RECEIVER  INDEX   1483061      RECEIVER,FOLDERID,OBSCENE
Indexes on MESSAGES table:
Keyname  Type    Cardinality  Field
ID       UNIQUE  43572638     ID
Can I hint MySQL to use the right index? If yes, then how, and on which column?
Table structure:
CREATE TABLE `MESSAGE_LOG` (
`SENDER` int(8) unsigned NOT NULL DEFAULT '0',
`RECEIVER` int(8) unsigned NOT NULL DEFAULT '0',
`DATE` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
`IP` int(10) unsigned DEFAULT NULL,
`RECEIVER_STATUS` char(1) NOT NULL DEFAULT 'U',
`FOLDERID` mediumint(9) NOT NULL DEFAULT '0',
`MSG_OBS_ID` int(11) NOT NULL DEFAULT '0',
`SENDER_STATUS` char(1) NOT NULL DEFAULT 'U',
`TYPE` char(1) NOT NULL DEFAULT 'R',
`ID` int(11) NOT NULL,
`OBSCENE` char(1) NOT NULL DEFAULT 'N',
`IS_MSG` char(1) NOT NULL DEFAULT 'N',
`SEEN` char(1) NOT NULL,
UNIQUE KEY `ID` (`ID`),
KEY `SENDER` (`SENDER`,`RECEIVER`),
KEY `RECEIVER` (`RECEIVER`,`FOLDERID`,`OBSCENE`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
CREATE TABLE `MESSAGES` (
`ID` int(11) NOT NULL,
`MESSAGE` text NOT NULL,
UNIQUE KEY `ID` (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
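On the index-hint part of the question: MySQL accepts USE INDEX, FORCE INDEX and IGNORE INDEX directly after a table name in the FROM clause. Below is a minimal sketch for the first branch of the UNION, assuming the existing RECEIVER index is the one to force. The EXPLAIN output above already shows RECEIVER and SENDER being chosen, so a hint alone may not change the plan. The composite index name RECEIVER_TYPE_MSG is hypothetical, and whether it helps depends on the data distribution; it has not been measured here.

```sql
-- An index hint goes directly after the table name (or alias) it applies to.
SELECT SENDER AS PROFILEID, MESSAGE, 'R' AS SR, SEEN, DATE
FROM `MESSAGE_LOG` FORCE INDEX (RECEIVER)
JOIN MESSAGES ON ( MESSAGE_LOG.ID = MESSAGES.ID )
WHERE `RECEIVER` = '9063911'
AND SENDER NOT IN ('658', '87238', '99359', '643848', '651922', '734783', '880643')
AND `TYPE` = 'R' AND `IS_MSG` = 'Y';

-- Hypothetical alternative (assumption, not from the original post):
-- a composite index that covers all three equality filters, so fewer
-- rows have to be read and then rejected by "Using where".
ALTER TABLE `MESSAGE_LOG`
ADD INDEX RECEIVER_TYPE_MSG (`RECEIVER`, `TYPE`, `IS_MSG`);
```

Running EXPLAIN again after either change shows whether the chosen key and the estimated row count actually differ.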

Related

EXPLAIN filter value changes on changing column

I have experienced strange behavior in MySQL 5.7 with these two queries:
SELECT orderId
FROM `order`
WHERE vendor = 11 AND status >= 0 AND city = 1
AND created_date >= '2020-12-01' AND created_date <= '2020-12-31'
SELECT userId
FROM `order`
WHERE vendor = 11 AND status >= 0 AND city = 1
AND created_date >= '2020-12-01' AND created_date <= '2020-12-31'
I have a composite index on all 4 columns:
INDEX vendor_city_date (vendor, city, created_date, status).
But depending on the selected column, EXPLAIN reports a filtered value of 50 for some columns and 5.55 for others.
This is the structure of the table. It has 50 other columns, which I have removed here.
CREATE TABLE `order` (
`orderId` int(11) NOT NULL AUTO_INCREMENT,
`vendor` int(11) NOT NULL DEFAULT '1',
`userId` int(11) NOT NULL,
`offer` int(11) NOT NULL DEFAULT '0',
`city` int(11) NOT NULL,
`offset` int(11) NOT NULL DEFAULT '',
`status` tinyint(4) NOT NULL DEFAULT '0',
`payment` tinyint(4) NOT NULL DEFAULT '0',
`created_date` date NOT NULL,
PRIMARY KEY (`orderId`),
KEY `city_id` (`city`),
KEY `user_id` (`userId`),
KEY `status` (`status`),
KEY `vendor` (`vendor`),
KEY `user_status` (`userId`,`status`),
KEY `vendor_city_date` (`vendor`,`city`,`created_date`,`status`) USING BTREE
);

MySQL gives NULL for a record in a table if JOIN and COUNT are used, but a plain SELECT works fine. Why?

Table 1
CREATE TABLE IF NOT EXISTS `com_msg` (
`msg_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`msg_to` int(10) NOT NULL,
`msg_from` int(10) NOT NULL,
`msg_new` tinyint(1) unsigned NOT NULL DEFAULT '1',
`msg_content` varchar(300) NOT NULL,
`msg_date` date NOT NULL,
`bl_sender` tinyint(1) unsigned NOT NULL DEFAULT '0',
`bl_recip` tinyint(1) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`msg_id`),
UNIQUE KEY `msg_id` (`msg_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
Table 2
CREATE TABLE IF NOT EXISTS `ac_vars` (
`user_id` int(10) unsigned NOT NULL,
`ac_ballance` smallint(3) unsigned NOT NULL DEFAULT '0',
`prof_views` mediumint(8) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`user_id`),
UNIQUE KEY `id` (`user_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
When I use this query:
SELECT ac_ballance, prof_views, COUNT( msg_id ) AS messages
FROM ac_vars
INNER JOIN com_msg ON user_id = msg_to
WHERE user_id =".$userid." AND com_msg.msg_new =1;
I get:
ac_ballance=NULL (incorrect)
prof_views=NULL (incorrect)
messages=0 (correct)
But with a SELECT statement on ac_vars alone I get the correct values. What is the correct way of doing this?
You want rows from the table ac_vars even when there's no corresponding row in the table com_msg.
So you must use a LEFT JOIN:
SELECT ac_ballance, prof_views, COUNT( msg_id ) AS messages
FROM ac_vars
LEFT JOIN com_msg
ON user_id = msg_to AND com_msg.msg_new =1
WHERE user_id =".$userid.";
Please note that the condition
com_msg.msg_new =1
has to be part of the JOIN condition and not the WHERE clause. When no row in com_msg fulfills this condition, the LEFT JOIN still returns the ac_vars row with NULLs in the com_msg columns, and a WHERE test on com_msg.msg_new would discard that row again.
Note
Adding
GROUP BY ac_ballance, prof_views
is not needed in MySQL here, because the values in those columns depend directly on user_id and the WHERE clause permits only a single row.
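The effect of where the msg_new condition is placed can be sketched side by side. This assumes a hypothetical user id 42 that exists in ac_vars but has no new messages in com_msg; the literal stands in for the interpolated $userid of the original query.

```sql
-- Filter in WHERE: the LEFT JOIN first produces a NULL-extended row for
-- user 42, but com_msg.msg_new is NULL on that row, so WHERE discards it.
-- The aggregate then runs over zero rows: the plain columns come back
-- NULL and COUNT returns 0, which is the behavior seen in the question.
SELECT ac_ballance, prof_views, COUNT( msg_id ) AS messages
FROM ac_vars
LEFT JOIN com_msg ON user_id = msg_to
WHERE user_id = 42 AND com_msg.msg_new = 1;

-- Filter in ON: only new messages are joined, and the ac_vars row
-- survives even when nothing matches, so ac_ballance and prof_views
-- keep their stored values while COUNT( msg_id ) is 0.
SELECT ac_ballance, prof_views, COUNT( msg_id ) AS messages
FROM ac_vars
LEFT JOIN com_msg ON user_id = msg_to AND com_msg.msg_new = 1
WHERE user_id = 42;
```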

MySQL query is very slow after using DISTINCT and GROUP BY?

I have tables with the following structure:
-- Table structure for table `temp_app`
--
CREATE TABLE IF NOT EXISTS `temp_app` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`vid` int(5) NOT NULL,
`num` varchar(64) NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `vid` (`vid`),
KEY `num` (`num`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=69509;
-- Table structure for table `inv_flags`
--
CREATE TABLE IF NOT EXISTS `inv_flags` (
`num` varchar(64) NOT NULL,
`vid` int(11) NOT NULL,
`f_special` tinyint(1) NOT NULL, /*0 or 1*/
`f_inserted` tinyint(1) NOT NULL, /*0 or 1*/
`f_notinserted` tinyint(1) NOT NULL, /*0 or 1*/
`userID` int(11) NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
KEY `num` (`num`),
KEY `userID` (`userID`),
KEY `vid` (`vid`),
KEY `timestamp` (`timestamp`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The following query takes 9 seconds to display 30 records. What is wrong?
SELECT date_format(ifs.`timestamp`,'%y/%m/%d') as `date`
,count(DISTINCT ta.num) as inserted /*Unique nums*/
,SUM(ifs.f_notinserted) as not_inserted
,SUM(ifs.f_special) as special
,count(ta.num) as links /*All nums*/
from inventory_flags ifs
LEFT JOIN temp_app ta ON ta.num = ifs.num AND ta.vid = ifs.vid
WHERE ifs.userID = 3
GROUP BY date(ifs.`timestamp`) DESC LIMIT 30
EXPLAIN result:
id  select_type  table  type  possible_keys  key     key_len  ref      rows   Extra
1   SIMPLE       ifs    ref   userID         userID  4        const    12153  Using where
1   SIMPLE       ta     ref   vid,num        num     194      ifs.num  1
COUNT(DISTINCT ...) can sometimes cause rotten performance with MySQL. Try this instead:
select count(*) from (select distinct...
as it can sometimes prevent MySQL from writing the entire interim result to disk.
Here is the MySQL bug info:
http://bugs.mysql.com/bug.php?id=21849
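Applied to the question's query, the suggested rewrite might look like the sketch below. It is only a sketch: it covers just the `inserted` column (the SUM columns need the non-deduplicated rows and would have to be computed separately), it assumes the table is named inv_flags as in the posted DDL rather than inventory_flags as in the posted query, and it has not been timed against the original.

```sql
-- Deduplicate (date, num) pairs first, then count per date.
-- COUNT(d.num) skips NULLs, which matches COUNT(DISTINCT ta.num)
-- for days where the LEFT JOIN found no temp_app row.
SELECT d.`date`, COUNT(d.num) AS inserted
FROM (
    SELECT DISTINCT DATE(ifs.`timestamp`) AS `date`, ta.num
    FROM inv_flags ifs
    LEFT JOIN temp_app ta ON ta.num = ifs.num AND ta.vid = ifs.vid
    WHERE ifs.userID = 3
) AS d
GROUP BY d.`date`
ORDER BY d.`date` DESC
LIMIT 30;
```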

MySQL Indexes for extremely slow queries

The following query, regardless of environment, takes more than 30 seconds to compute.
SELECT COUNT( r.response_answer )
FROM response r
INNER JOIN (
SELECT G.question_id
FROM question G
INNER JOIN answer_group AG ON G.answer_group_id = AG.answer_group_id
WHERE AG.answer_group_stat = 'statistic'
) AS q ON r.question_id = q.question_id
INNER JOIN org_survey os ON os.org_survey_code = r.org_survey_code
WHERE os.survey_id =42
AND r.response_answer = 5
AND DATEDIFF( NOW( ) , r.added_dt ) <1000000
AND r.uuid IS NOT NULL
When I EXPLAIN the query, I get:
id  select_type  table       type    possible_keys                                       key              key_len  ref                         rows  Extra
1   PRIMARY      <derived2>  ALL     NULL                                                NULL             NULL     NULL                        1087
1   PRIMARY      r           ref     question_id,org_survey_code,code_question,uuid,uor  question_id      4        q.question_id               1545  Using where
1   PRIMARY      os          eq_ref  org_survey_code,survey_id,org_survey_code_2         org_survey_code  12       survey_2.r.org_survey_code  1     Using where
2   DERIVED      G           ALL     agid                                                NULL             NULL     NULL                        1680
2   DERIVED      AG          eq_ref  PRIMARY                                             PRIMARY          1        survey_2.G.answer_group_id  1     Using where
I have a very basic knowledge of indexing, but I have tried nearly every combination I can think of and cannot seem to improve the speed of this query. The responses table is right around 2 million rows, question is about 1500 rows, answer_group is about 50, and org_survey is about 8,000.
Here is the basic structure for each:
CREATE TABLE `response` (
`response_id` int(10) unsigned NOT NULL auto_increment,
`response_answer` text NOT NULL,
`question_id` int(10) unsigned NOT NULL default '0',
`org_survey_code` varchar(7) NOT NULL,
`uuid` varchar(40) default NULL,
`added_dt` datetime default NULL,
PRIMARY KEY (`response_id`),
KEY `question_id` (`question_id`),
KEY `org_survey_code` (`org_survey_code`),
KEY `code_question` (`org_survey_code`,`question_id`),
KEY `IDX_ADDED_DT` (`added_dt`),
KEY `uuid` (`uuid`),
KEY `response_answer` (`response_answer`(1)),
KEY `response_question` (`response_answer`(1),`question_id`)
) ENGINE=MyISAM AUTO_INCREMENT=2298109 DEFAULT CHARSET=latin1
CREATE TABLE `question` (
`question_id` int(10) unsigned NOT NULL auto_increment,
`question_text` varchar(250) NOT NULL default '',
`question_group` varchar(250) default NULL,
`question_position` tinyint(3) unsigned NOT NULL default '0',
`survey_id` tinyint(3) unsigned NOT NULL default '0',
`answer_group_id` mediumint(8) unsigned NOT NULL default '0',
`seq_id` int(11) NOT NULL default '0',
PRIMARY KEY (`question_id`),
KEY `question_group` (`question_group`(10)),
KEY `survey_id` (`survey_id`),
KEY `agid` (`answer_group_id`)
) ENGINE=MyISAM AUTO_INCREMENT=1860 DEFAULT CHARSET=latin1
CREATE TABLE `org_survey` (
`org_survey_id` int(11) NOT NULL auto_increment,
`org_survey_code` varchar(10) NOT NULL default '',
`org_id` int(11) NOT NULL default '0',
`org_manager_id` int(11) NOT NULL default '0',
`org_url_id` int(11) default '0',
`division_id` int(11) default '0',
`sector_id` int(11) default NULL,
`survey_id` int(11) NOT NULL default '0',
`process_batch` tinyint(4) default '0',
`added_dt` datetime default NULL,
PRIMARY KEY (`org_survey_id`),
UNIQUE KEY `org_survey_code` (`org_survey_code`),
KEY `org_id` (`org_id`),
KEY `survey_id` (`survey_id`),
KEY `org_survey_code_2` (`org_survey_code`,`total_taken`),
KEY `org_manager_id` (`org_manager_id`),
KEY `sector_id` (`sector_id`)
) ENGINE=MyISAM AUTO_INCREMENT=9268 DEFAULT CHARSET=latin1
CREATE TABLE `answer_group` (
`answer_group_id` tinyint(3) unsigned NOT NULL auto_increment,
`answer_group_name` varchar(50) NOT NULL default '',
`answer_group_type` varchar(20) NOT NULL default '',
`answer_group_stat` varchar(20) NOT NULL default 'demographic',
PRIMARY KEY (`answer_group_id`)
) ENGINE=MyISAM AUTO_INCREMENT=53 DEFAULT CHARSET=latin1
I know there are small things I can probably do to improve the efficiency of the database, such as reducing the size of integers where it's unnecessary. However, those are fairly trivial considering the ridiculous time it takes just to produce a result here. How can I properly index these tables, based on what explain has shown me? It seems that I have tried a large variety of combinations to no avail. Also, is there anything else that anyone can see that will optimize the table and reduce the query? I need it to be computed in less than a second. Thanks in advance!
1. If you want the index on r.added_dt to be used, instead of:
DATEDIFF(NOW(), r.added_dt) < 1000000
use:
CURDATE() - INTERVAL 1000000 DAY < r.added_dt
Anyway, the above condition checks whether added_dt is less than a million days old, which is roughly 2,700 years. Do you really store dates that old? If not, you can simply remove this condition.
If you do want to keep this condition, an index on added_dt would help a lot, but only when the column is compared directly as shown above. Your query, as it is now, checks all rows for this condition, calling the DATEDIFF() function once for every row of the response table.
2. Since r.response_answer cannot be NULL, instead of:
SELECT COUNT( r.response_answer )
use:
SELECT COUNT( * )
COUNT(*) is at least as fast as COUNT(field), because it does not have to check each value for NULL.
3. Two of the three fields that you use for joining tables have different datatypes:
ON question.answer_group_id = answer_group.answer_group_id
CREATE TABLE question (
...
answer_group_id mediumint(8) ..., <--- mediumint
CREATE TABLE answer_group (
answer_group_id tinyint(3) ..., <--- tinyint
-------------------------------
ON org_survey.org_survey_code = response.org_survey_code
CREATE TABLE response (
...
org_survey_code varchar(7) NOT NULL, <--- 7
CREATE TABLE org_survey (
...
org_survey_code varchar(10) NOT NULL default '', <--- 10
The mediumint datatype is not the same as tinyint, and the same goes for varchar(7) and varchar(10). When they are used in a join, MySQL has to spend time converting from one type to the other. Convert one column of each pair so that both have identical datatypes. This is not the main issue of the query, but the change will also help all other queries that use these joins.
After making this change, run ANALYZE TABLE on the table. It will help MySQL make better execution plans.
You have a response_answer = 5 condition, where response_answer is text. It's not an error, but it's better to use response_answer = '5' (MySQL will convert 5 to '5' anyway if you don't).
The real issue is that you don't have a compound index on the 3 fields that are used in the WHERE conditions. Try adding this one:
ALTER TABLE response
ADD INDEX ind_u1_ra1_aa
(uuid(1), response_answer(1), added_dt);
(this may take a while as your table is not small)
Can you try the following query? I've removed the sub-query from your original one. This may let the optimiser produce a better execution plan.
SELECT COUNT(r.response_answer)
FROM response r
INNER JOIN question q ON r.question_id = q.question_id
INNER JOIN answer_group ag ON q.answer_group_id = ag.answer_group_id
INNER JOIN org_survey os ON os.org_survey_code = r.org_survey_code
WHERE
ag.answer_group_stat = 'statistic'
AND os.survey_id = 42
AND r.response_answer = 5
AND DATEDIFF(NOW(), r.added_dt) < 1000000
AND r.uuid IS NOT NULL

Trying to optimize MySQL query with LEFT OUTER JOIN

I have this query, which works fine except that it takes a long while (7 seconds, with 40k records in the jobs table and 700k in the wq table).
I tried an EXPLAIN and it says it's looking at all the records in the jobs table and not using any of the indexes.
I don't know how to tell MySQL that it should use the jobs.status field to filter the records before looking up the wq table.
The objective is to get all the records from jobs that have a status != 331, and also any other job which has a wq status of (101, 111, 151).
Query:
SELECT jobs.*
FROM jobs
LEFT OUTER JOIN wq ON (wq.job = jobs.id AND jobs.status IN (341, 331) AND wq.status IN (101, 111, 151))
WHERE ((wq.info is not NULL) or (jobs.status != 331 and ack = 0))
EXPLAIN output:
id  select_type  table  type  possible_keys          key      key_len  ref          rows   Extra
1   SIMPLE       jobs   ALL   ack,status,status_ack  NULL     NULL     NULL         38111  Using filesort
1   SIMPLE       wq     ref   PRIMARY,job,status     PRIMARY  4        cts.jobs.id  20     Using where
Table definitions:
CREATE TABLE jobs (
id int(10) NOT NULL AUTO_INCREMENT,
comment varchar(100) NOT NULL DEFAULT '',
profile varchar(60) NOT NULL DEFAULT '',
start_at int(10) NOT NULL DEFAULT '0',
data text NOT NULL,
status int(10) NOT NULL DEFAULT '0',
info varchar(200) NOT NULL DEFAULT '',
finish int(10) NOT NULL DEFAULT '0',
priority int(5) NOT NULL DEFAULT '0',
ack tinyint(4) NOT NULL DEFAULT '0',
PRIMARY KEY (id),
KEY start_at (start_at),
KEY status (status),
KEY status_ack (status, ack)
) ENGINE=MyISAM AUTO_INCREMENT=2037530 DEFAULT CHARSET=latin1;
CREATE TABLE wq (
job int(10) NOT NULL DEFAULT '0',
process varchar(60) NOT NULL DEFAULT '',
step varchar(60) NOT NULL DEFAULT '',
status int(10) NOT NULL DEFAULT '0',
run_at int(10) NOT NULL DEFAULT '0',
original_run_at int(10) NOT NULL DEFAULT '0',
info varchar(200) NOT NULL DEFAULT '',
pos int(10) NOT NULL DEFAULT '0',
changed_at int(10) NOT NULL DEFAULT '0',
file varchar(60) NOT NULL DEFAULT '',
PRIMARY KEY (job, process, step, file),
KEY job (job),
KEY status (status)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
Unfortunately MySQL (and perhaps any DBMS) cannot optimize expressions like jobs.status != 331 and ack = 0, because a B-tree is not a structure that can quickly find everything that is not equal to a constant value. Thus you'll always get a full scan.
If there were a better condition, such as jobs.status = 331 and ack = 0 (note that I've changed != to =), then this would be my advice to speed up the query:
split the query into two, joined by UNION ALL
in one of the two queries (the one that implies wq.info is not NULL), replace the LEFT JOIN with an INNER JOIN
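The two steps above can be sketched as follows. This is a sketch under the answer's hypothetical equality condition (jobs.status = 331 instead of != 331), not a rewrite of the original query as posted. The NOT EXISTS in the second branch is an addition of this sketch: it keeps jobs that already appear in the first branch from being returned a second time by UNION ALL.

```sql
-- Branch 1: jobs that have a matching wq row. Since wq.info is declared
-- NOT NULL, "wq.info IS NOT NULL" just means "a wq row matched", so the
-- LEFT JOIN can become an INNER JOIN and the filters can use indexes.
SELECT jobs.*
FROM jobs
INNER JOIN wq ON wq.job = jobs.id
WHERE jobs.status IN (341, 331)
  AND wq.status IN (101, 111, 151)

UNION ALL

-- Branch 2: jobs selected by status/ack alone.
-- Hypothetical equality condition, as assumed in the answer above.
SELECT jobs.*
FROM jobs
WHERE jobs.status = 331
  AND jobs.ack = 0
  AND NOT EXISTS (
      SELECT 1
      FROM wq
      WHERE wq.job = jobs.id
        AND wq.status IN (101, 111, 151)
  );
```

With equality conditions the existing status_ack (status, ack) index becomes a candidate for the second branch, which is the point of the suggestion.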