MySQL query not using all indexed columns for searching

For faster searches I have indexed two columns (a composite index): client_id and batch_id.
Below is the output of the indexes on my table:
show indexes from authentication_codes
*************************** 3. row ***************************
Table: authentication_codes
Non_unique: 1
Key_name: client_id
Seq_in_index: 1
Column_name: client_id
Collation: A
Cardinality: 18
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
Index_comment:
*************************** 4. row ***************************
Table: authentication_codes
Non_unique: 1
Key_name: client_id
Seq_in_index: 2
Column_name: batch_id
Collation: A
Cardinality: 18
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
Index_comment:
4 rows in set (0.02 sec)
When I use EXPLAIN to check whether the index is used by the query, it gives me the output below:
mysql> explain select * from authentication_codes where client_id=6 and batch_id="101" \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: authentication_codes
type: ref
possible_keys: client_id
key: client_id
key_len: 773
ref: const,const
rows: 1044778
Extra: Using where
1 row in set (0.00 sec)
********************EDIT***************************
The output of show create table authentication_codes is below:
mysql> show create table authentication_codes \G;
*************************** 1. row ***************************
Table: authentication_codes
Create Table: CREATE TABLE `authentication_codes` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`code` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`batch_id` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`serial_num` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`client_id` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index_authentication_codes_on_code` (`code`),
KEY `client_id_batch_id` (`client_id`,`batch_id`)
) ENGINE=InnoDB AUTO_INCREMENT=48406205 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
1 row in set (0.00 sec)
My question is: why is the batch_id column not used for searching? Why is only the client_id column used?

To use an index on two columns you need to create a single two-column (composite) index; MySQL cannot combine two separate single-column indexes on one table for a lookup like this (apart from the rarely chosen index-merge optimization). This statement adds a multi-column index on client_id and batch_id:
alter table authentication_codes add index client_id_batch_id (client_id,batch_id);
http://dev.mysql.com/doc/refman/5.7/en/multiple-column-indexes.html

The EXPLAIN does not match the CREATE TABLE, at least in the name of the relevant index.
Explaining the EXPLAIN (as displayed at the moment):
select_type: SIMPLE
table: authentication_codes
type: ref
possible_keys: client_id
key: client_id -- The index named "client_id" was used
key_len: 773 -- (explained below)
ref: const,const -- 2 constants were used for the first two columns in that index
rows: 1044778 -- About this many rows (2% of table) matches those two constants
Extra: Using where
773 = 2 + 3 * 255 + 1 + 4 + 1
2 = length-prefix bytes for a VARCHAR
3 = max width of a utf8 character -- do you really need utf8?
255 = max length provided in VARCHAR(255) -- do you really need that much?
1 = extra length for NULL -- perhaps your columns could/should be NOT NULL?
4 = length of INT for client_id -- if you don't need 4 billion ids, maybe a smaller INT would work? and maybe UNSIGNED, too?
So, yes, it is using both parts of client_id=6 and batch_id="101". But there are a million rows in that batch for that client, so the query takes time.
If you want to discuss how to further speed up the use of this table, please provide the other common queries. (I don't want to tweak the schema to make this query faster, only to find that other queries are made slower.)
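For illustration only (the caveat above about other queries still applies, and the column sizes here are assumptions, not requirements taken from the question), the key_len arithmetic implies that tightening the column definitions would shrink each index entry dramatically:

```sql
-- Hypothetical sketch: following the same arithmetic, these definitions
-- would shrink the index entry from 773 bytes to 2 + 1*20 + 4 = 26 bytes.
ALTER TABLE authentication_codes
  MODIFY batch_id  VARCHAR(20) CHARACTER SET ascii NOT NULL,  -- 2 + 1*20: 1-byte charset, no NULL byte
  MODIFY client_id INT UNSIGNED NOT NULL;                     -- 4: NOT NULL drops the extra byte
```

A smaller key means more index entries per block, hence fewer disk reads when scanning the million matching rows.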

Related

MySQL count query takes too long for particular table only

I have one table that contains documents; in production there are about 1.2 million records in it. When I run select count(*) from <table> on this table, it takes so long that in the end I have to restart the DB. On the other hand, I also have many other tables containing 10-12 million rows, and those tables do not have this issue.
These are the indexes of that table:
mysql> show index from candidates_resume\G
*************************** 1. row ***************************
Table: candidates_resume
Non_unique: 0
Key_name: PRIMARY
Seq_in_index: 1
Column_name: id
Collation: A
Cardinality: 843657
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
Visible: YES
Expression: NULL
*************************** 2. row ***************************
Table: candidates_resume
Non_unique: 0
Key_name: candidate_id
Seq_in_index: 1
Column_name: candidate_id
Collation: A
Cardinality: 844009
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
Visible: YES
Expression: NULL
*************************** 3. row ***************************
Table: candidates_resume
Non_unique: 1
Key_name: candidates_resume_uploaded_on_e4c78158b8c18f_uniq
Seq_in_index: 1
Column_name: uploaded_on
Collation: A
Cardinality: 844009
Sub_part: NULL
Packed: NULL
Null:
Index_type: BTREE
Comment:
Index_comment:
Visible: YES
Expression: NULL
*************************** 4. row ***************************
Table: candidates_resume
Non_unique: 1
Key_name: candidates_resume_pdf_file_5b052603240d1d43_uniq
Seq_in_index: 1
Column_name: pdf_file
Collation: A
Cardinality: 844009
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
Index_comment:
Visible: YES
Expression: NULL
*************************** 5. row ***************************
Table: candidates_resume
Non_unique: 1
Key_name: candidates_resume_watermark_file_68fd6000f27d4f8d_uniq
Seq_in_index: 1
Column_name: watermark_file
Collation: A
Cardinality: 844009
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
Index_comment:
Visible: YES
Expression: NULL
And this is the result of SHOW CREATE TABLE:
Create Table: CREATE TABLE `candidates_resume` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(50) NOT NULL,
`uploaded_on` datetime NOT NULL,
`candidate_id` int(11) NOT NULL,
`file` varchar(100) NOT NULL,
`hash` varchar(10) NOT NULL,
`pdf_file` varchar(100) DEFAULT NULL,
`resume_text` longtext NOT NULL,
`watermark_file` varchar(100) DEFAULT NULL,
`html_file` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `candidate_id` (`candidate_id`),
KEY `candidates_resume_uploaded_on_e4c78158b8c18f_uniq` (`uploaded_on`),
KEY `candidates_resume_pdf_file_88ec1f31_uniq` (`pdf_file`),
KEY `candidates_resume_watermark_file_23af2d43_uniq` (`watermark_file`),
CONSTRAINT `candidate_id_refs_id_88f99c34` FOREIGN KEY (`candidate_id`) REFERENCES `candidates_candidate` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=591098 DEFAULT CHARSET=utf8
Can anyone guide me on how to track down the issue with this table?
SELECT COUNT(*) FROM ... without any filtering (WHERE) must scan the entire table or an index. This takes time.
Do EXPLAIN SELECT ... to see how it is handled. I think it will use your UNIQUE(candidate_id). (Please provide SHOW CREATE TABLE.)
Assuming that candidate_id is INT or BIGINT, the query can't be run much faster.
Why do you need to count the number of rows? Would an estimate be "good enough"? If so, see SHOW TABLE STATUS or the equivalent query in information_schema.
If the count from midnight this morning would be "good enough", then perform that and save it somewhere.
If you can't figure how to avoid timeout, see wait_timeout. Caution; there are several flavors of it.
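A sketch of the estimate route (the schema name is a placeholder; for InnoDB, TABLE_ROWS is only an estimate and can be off by a large factor):

```sql
-- Fast approximate row count from the data dictionary -- no table scan
SELECT TABLE_ROWS
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_db'            -- placeholder: your schema name
  AND TABLE_NAME   = 'candidates_resume';
```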
With a Summary Table
Build and maintain a table that keeps, say, the hourly counts of rows:
CREATE TABLE counts (
hr MEDIUMINT UNSIGNED NOT NULL,
ct SMALLINT UNSIGNED NOT NULL,
PRIMARY KEY(hr)
) ENGINE=InnoDB;
Initialize (one-time task):
INSERT INTO counts (hr, ct)
SELECT FLOOR(UNIX_TIMESTAMP(uploaded_on) / 3600),
COUNT(*)
FROM candidates_resume
GROUP BY 1;
As a new row is inserted into candidates_resume, bump the matching counter (note this INSERT targets the counts table, not candidates_resume):
INSERT INTO counts
(hr, ct)
VALUES
(FLOOR(UNIX_TIMESTAMP(uploaded_on) / 3600), 1)
ON DUPLICATE KEY UPDATE ct = ct + 1;
When wanting the count:
SELECT SUM(ct) FROM counts;
That gives the count up to the start of the current hour. If you need the count up to the current second, add on a second query to count just the rows since the start of the hour.
(There are a few loose ends to fix.)
More discussion: http://mysql.rjweb.org/doc.php/summarytables
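A sketch of that two-part count, with the current hour's boundary computed the same way as the hr column (assumes counts is being maintained as described):

```sql
-- Total = pre-aggregated whole hours + rows in the current (partial) hour
SELECT ( SELECT COALESCE(SUM(ct), 0) FROM counts )
     + ( SELECT COUNT(*)
         FROM candidates_resume
         WHERE uploaded_on >= FROM_UNIXTIME(
                 FLOOR(UNIX_TIMESTAMP(NOW()) / 3600) * 3600 )
       ) AS total_rows;
```

The second subquery is a range scan on the uploaded_on index over at most one hour of rows, so it stays fast regardless of table size.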

MySQL: the index on a TEXT column doesn't work

I created a table like this:
CREATE TABLE `text_tests` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`text_st_date` text NOT NULL,
`varchar_st_date` varchar(255) NOT NULL DEFAULT '2015-08-25',
`text_id` text NOT NULL,
`varchar_id` varchar(255) NOT NULL DEFAULT '0',
`int_id` int(11) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_of_text_st_date` (`text_st_date`(50),`id`),
KEY `idx_of_varchar_st_date` (`varchar_st_date`,`id`),
KEY `idx_of_text_id` (`text_id`(20),`id`),
KEY `idx_of_varchar_id` (`varchar_id`,`id`),
KEY `idx_of_int_id` (`int_id`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Then I generate some data using Ruby:
(1..10000).each do |_i|
item = TextTest.new
item.text_st_date = (Time.now + _i.days).to_s
item.varchar_st_date = (Time.now + _i.days).to_s
item.text_id = _i
item.varchar_id = _i
item.int_id = _i
item.save
end
At last, I try to use the indexes on the text columns, but they don't work; the query always does a full table scan.
EXPLAIN SELECT id
FROM text_tests
ORDER BY text_st_date DESC
LIMIT 20\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: text_tests
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 9797
Extra: Using filesort
1 row in set (0.02 sec)
EXPLAIN SELECT id
FROM text_tests
ORDER BY text_id DESC
LIMIT 20\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: text_tests
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 9797
Extra: Using filesort
1 row in set (0.00 sec)
The varchar indexes work well:
EXPLAIN SELECT id
FROM text_tests
ORDER BY varchar_st_date DESC
LIMIT 20\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: text_tests
type: index
possible_keys: NULL
key: idx_of_varchar_st_date
key_len: 771
ref: NULL
rows: 20
Extra: Using index
1 row in set (0.00 sec)
EXPLAIN SELECT id
FROM text_tests
ORDER BY varchar_id DESC
LIMIT 20\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: text_tests
type: index
possible_keys: NULL
key: idx_of_varchar_id
key_len: 771
ref: NULL
rows: 20
Extra: Using index
1 row in set (0.00 sec)
Why don't the indexes on the text columns work, and how can I get MySQL to use them?
Indexes don't serve a very strong purpose to satisfy queries that return all the rows of the table in the result set. One of their primary purposes is to accelerate WHERE and JOIN ... ON clauses. If your query has no WHERE clause, don't be surprised if the query planner decides to scan the whole table.
Also, your first query does ORDER BY text_st_date, but your index only covers the first fifty characters of that column. A prefix index cannot deliver rows in full-column order, so to satisfy the query MySQL has to sort the whole thing. What's more, it has to sort on disk, because the in-memory temporary-table engine can't handle BLOB/TEXT columns.
MySQL is very good at handling dates, but you need to tell it that you have dates, not VARCHAR(255).
Use a DATE datatype for date columns! If Ruby won't help you do that, then get rid of Ruby.
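A minimal sketch of that advice, assuming the strings really are dates (the column and index names here are made up):

```sql
-- Store dates as DATE, not TEXT/VARCHAR(255): 3 bytes, ordered, fully indexable
ALTER TABLE text_tests
  ADD COLUMN st_date DATE NOT NULL DEFAULT '2015-08-25',
  ADD KEY idx_of_st_date (st_date, id);

-- A query like the following can then read the index in order
-- instead of filesorting:
--   SELECT id FROM text_tests ORDER BY st_date DESC LIMIT 20;
```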

MySQL Refusing to Use Index for Simple Query

I have a table that I'm running a very simple query against. I've added an index to the table on a high cardinality column, so MySQL should be able to narrow the result almost instantly, but it's doing a full table scan every time. Why isn't MySQL using my index?
mysql> select count(*) FROM eventHistory;
+----------+
| count(*) |
+----------+
| 247514 |
+----------+
1 row in set (0.15 sec)
CREATE TABLE `eventHistory` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`whatID` varchar(255) DEFAULT NULL,
`whatType` varchar(255) DEFAULT NULL,
`whoID` varchar(255) DEFAULT NULL,
`createTimestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `whoID` (`whoID`,`whatID`)
) ENGINE=InnoDB;
mysql> explain SELECT * FROM eventHistory where whoID = 12551\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: eventHistory
type: ALL
possible_keys: whoID
key: NULL
key_len: NULL
ref: NULL
rows: 254481
Extra: Using where
1 row in set (0.00 sec)
I have tried adding FORCE INDEX to the query as well, and it still seems to be doing a full table scan. The performance of the query is also poor. It's currently taking about 0.65 seconds to find the appropriate row.
The above answers lead me to realize two things.
1) When using a VARCHAR index, the query criterion needs to be quoted, or MySQL will refuse to use the index (it implicitly casts the stored strings to numbers behind the scenes, which defeats the index):
SELECT * FROM foo WHERE column = '123'; # do this
SELECT * FROM foo where column = 123; # don't do this
2) You're better off using/indexing an INT if at all possible.
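A sketch of point 2, assuming whoID values really are always numeric (verify that before converting production data):

```sql
-- A properly typed column: no implicit-cast trap, and a much smaller index
ALTER TABLE eventHistory MODIFY whoID INT UNSIGNED DEFAULT NULL;

-- After which either form of the predicate can use the index:
--   SELECT * FROM eventHistory WHERE whoID = 12551;
```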

MySQL SELECT COUNT with GROUP and ORDER performance issue

The Facts:
Dedicated Server, 4 Cores, 16GB
MySQL 5.5.29-0ubuntu0.12.10.1-log - (Ubuntu)
One Table, 1.9M rows and growing
I need all rows sorted for export, or a chunk of 5. The query takes 25 seconds, with "Copying to tmp table" accounting for 23.3 s.
I tried InnoDB and MyISAM, changing the index order, using an MD5 hash of some_text for the GROUP BY, and partitioning the table by day.
day is a Unix timestamp and is always present.
lang, some_bool, some_filter, ano_filter and rel_id can appear in the WHERE clause but don't have to.
Here is the MyISAM example:
The table
mysql> SHOW CREATE TABLE data \G;
*************************** 1. row ***************************
Table: data
Create Table: CREATE TABLE `data` (
`data_id` bigint(20) NOT NULL AUTO_INCREMENT,
`rel_id` int(11) NOT NULL,
`some_text` varchar(255) DEFAULT NULL,
`lang` varchar(3) DEFAULT NULL,
`some_bool` tinyint(1) DEFAULT NULL,
`some_filter` varchar(40) DEFAULT NULL,
`ano_filter` varchar(10) DEFAULT NULL,
`day` int(11) DEFAULT NULL,
PRIMARY KEY (`data_id`),
KEY `cnt_idx` (`some_filter`,`ano_filter`,`rel_id`,`lang`,`some_bool`,`some_text`,`day`)
) ENGINE=MyISAM AUTO_INCREMENT=1900099 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
The query
mysql> EXPLAIN SELECT `some_text` , COUNT(*) AS `num` FROM `data`
WHERE `lang` = 'en' AND `day` BETWEEN '1364342400' AND
'1366934399' GROUP BY `some_text` ORDER BY `num` DESC \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: data
type: index
possible_keys: NULL
key: cnt_idx
key_len: 947
ref: NULL
rows: 1900098
Extra: Using where; Using index; Using temporary; Using filesort
1 row in set (0.00 sec)
mysql> SELECT `some_text` , COUNT(*) AS `num` FROM `data`
WHERE `lang` = 'en' AND `day` BETWEEN '1364342400' AND '1366934399'
GROUP BY `some_text` ORDER BY `num` DESC LIMIT 5 \G;
...
*************************** 5. row ***************************
5 rows in set (24.26 sec)
Any idea how to speed up that thing?
No index is being used to filter the rows; the EXPLAIN's "type: index" on cnt_idx just means a full scan of the covering index. Indexes work left to right, and cnt_idx starts with some_filter, which is not in your WHERE clause. For this query to filter through an index, you would need an index starting with (lang, day): the equality column first, then the range column.
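The suggested index as a statement (the index name is arbitrary):

```sql
-- Equality column (lang) first, then the range column (day)
ALTER TABLE data ADD INDEX idx_lang_day (`lang`, `day`);
```

Note that the GROUP BY some_text / ORDER BY num will still need a temporary table and a sort; this index only narrows the set of rows that feed into it.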

How can I get mysql to use indexes with a simple AND + OR query?

This is using MySQL 5.5.
I cannot seem to convince MySQL to use indexes for these queries and they are taking anywhere from 2-10 seconds to run on a table with 1.1 million rows.
Table:
CREATE TABLE `notifiable_events` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`key` varchar(255) NOT NULL,
`trigger_profile_id` int(10) unsigned DEFAULT NULL,
`model1` varchar(25) NOT NULL,
`model1_id` varchar(36) NOT NULL,
`model2` varchar(25) NOT NULL DEFAULT '',
`model2_id` varchar(36) NOT NULL DEFAULT '',
`event_data` text,
`created` datetime DEFAULT NULL,
`modified` datetime DEFAULT NULL,
`deleted` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
KEY `key` (`key`),
KEY `notifiable_events__trigger_profile` (`trigger_profile_id`),
KEY `deleted` (`deleted`),
KEY `noti_evnts__m2` (`model2`),
KEY `noti_evnts__m1` (`model1`),
CONSTRAINT `notifiable_events__trigger_profile` FOREIGN KEY (`trigger_profile_id`) REFERENCES `profiles` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1177918 DEFAULT CHARSET=utf8
QUERY:
SELECT *
FROM notifiable_events
WHERE (`model1` = 'page' AND `model1_id` = '54321')
OR (`model2` = 'page' AND `model2_id` = '12345');
EXPLAIN(S):
mysql> EXPLAIN EXTENDED SELECT * FROM notifiable_events WHERE (`model1` = 'page' AND `model1_id` = '922645') OR (`model2` = 'page' AND `model2_id` = '922645')\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: notifiable_events
type: ALL
possible_keys: noti_evnts__m2,noti_evnts__m1,noti_evnts__m1_m2
key: NULL
key_len: NULL
ref: NULL
rows: 1033088
filtered: 100.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN EXTENDED SELECT * FROM notifiable_events WHERE (`model1` = 'page' AND `model1_id` = '922645') OR (`model1` = 'page' AND `model1_id` = '922645')\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: notifiable_events
type: ref
possible_keys: noti_evnts__m1,noti_evnts__m1_m2
key: noti_evnts__m1
key_len: 77
ref: const
rows: 1
filtered: 100.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN EXTENDED SELECT * FROM notifiable_events WHERE (`model2` = 'page' AND `model2_id` = '922645') OR (`model2` = 'page' AND `model2_id` = '922645')\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: notifiable_events
type: ref
possible_keys: noti_evnts__m2
key: noti_evnts__m2
key_len: 77
ref: const
rows: 428920
filtered: 100.00
Extra: Using where
1 row in set, 1 warning (0.00 sec)
You can see that if I only use model1 or only use model2 then it will use the index, however, as soon as I try to use both of them together it gives up on the index entirely and does a full table scan.
I have already tried FORCE INDEX and I have tried a combination of multi-key indexes (and I left one in place for this table as an example). I have also tried rearranging the order of elements in the query but that doesn't seem to have any effect either.
UPDATE:
I forgot to mention that I already tried ANALYZE and OPTIMIZE (multiple times each, no change). I also already tried an index on the *_id (the cardinality is very bad and those columns are mostly unique entries) and a multi index with all 4 columns being used in the query. No improvement or use of the index in either of those cases either.
It seems like it should be really easy to use an index to limit the rows being checked here so I hope I am just missing something.
OR clauses goof up query optimization sometimes.
You could try a UNION, something like this. It may reactivate the indexes.
SELECT *
FROM notifiable_events
WHERE `model1` = 'page' AND `model1_id` = '54321'
UNION
SELECT *
FROM notifiable_events
WHERE `model2` = 'page' AND `model2_id` = '12345'
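If you go the UNION route, each arm can be made a straight ref lookup with a two-column index per pair. This is a sketch with made-up index names; the question mentions other combined indexes were already tried, but these match exactly the per-arm predicates:

```sql
-- One (model, model_id) index per UNION arm
ALTER TABLE notifiable_events
  ADD INDEX noti_evnts__m1_pair (`model1`, `model1_id`),
  ADD INDEX noti_evnts__m2_pair (`model2`, `model2_id`);
```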
Edit, to answer the question in the comment
If you're trying to update records using this kind of selection scheme, you can try this:
UPDATE notifiable_events
SET what=ever,
what=ever,
what=else
WHERE id IN (
SELECT id
FROM notifiable_events
WHERE `model1` = 'page' AND `model1_id` = '54321'
UNION
SELECT id
FROM notifiable_events
WHERE `model2` = 'page' AND `model2_id` = '12345'
)
(Note that it's helpful when you're using Stackoverflow to explain what you're actually trying to do. It is of course OK to reduce a complex problem to a simpler one if you can, but saying you're doing a SELECT when you're actually doing an UPDATE is, umm, an oversimplification.)