MySQL Not Using Index when Specifying Non-Indexed Columns

When trying to run a query specifying a non-indexed column or *, the index isn't used. Using the indexed column or count(*) does use the index. Any idea why specifying any non-indexed column in the output would cause the index to not be used? I've noticed that as the number of rows increases, my queries have been slowing down.
Here are three otherwise identical queries: one uses count(*), one uses *, and one selects an indexed column.
Optimized query
MariaDB [db]> explain select count(*) from data_history where id=18 and time > date_sub(now(), interval 7 day)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: data_history
type: index
possible_keys: PRIMARY,ID
key: ID
key_len: 26
ref: NULL
rows: 1176921
Extra: Using where; Using index
1 row in set (0.00 sec)
Non-optimized query
MariaDB [db]> explain select * from data_history where id=18 and time > date_sub(now(), interval 7 day)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: data_history
type: ALL
possible_keys: PRIMARY,ID
key: NULL
key_len: NULL
ref: NULL
rows: 1176929
Extra: Using where
1 row in set (0.00 sec)
Optimized query
MariaDB [db]> explain select id from data_history where id=18 and time > date_sub(now(), interval 7 day)\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: data_history
type: index
possible_keys: PRIMARY,ID
key: ID
key_len: 26
ref: NULL
rows: 1176935
Extra: Using where; Using index
1 row in set (0.00 sec)
Table structure
CREATE TABLE `data_history` (
`id` varchar(20) NOT NULL DEFAULT '',
`time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`b1` int(3) DEFAULT NULL,
PRIMARY KEY (`id`,`time`),
UNIQUE KEY `ID` (`id`,`time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
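One way to see whether the optimizer is skipping the index purely on cost grounds is to force it and compare the two plans (a diagnostic sketch, not a fix; the index name ID comes from the table definition above):

EXPLAIN SELECT * FROM data_history FORCE INDEX (ID)
WHERE id=18 AND time > DATE_SUB(NOW(), INTERVAL 7 DAY)\G

If the forced plan still estimates ~1.17M rows, the optimizer has simply judged that that many lookups from the secondary index back into the full rows would cost more than one sequential scan. That is consistent with the three plans above: the two fast queries are answered entirely from the index (Using index), while SELECT * cannot be.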

Related

How to rewrite a query for latin1 and utf8mb4

The below two tables use different charsets. Can I change the charset? Does it impact any data, or is it safe? For now, I want to change the query using the CONVERT(... USING latin1) function, but I'm not sure how to change the query.
Question 1: will anything happen to the data while changing the charset?
Question 2: does the below query have any possible rewrites?
select
    adc.consent_id,
    adc.user_id,
    adc.loan_id,
    ls.loan_schedule_id,
    adc.max_amount,
    adc.method,
    ls.due_date,
    DATEDIFF(CURDATE(), ls.due_date),
    l.status_code
from
    razorpay_enach_consent as adc
    join loan_schedules as ls on adc.loan_id = ls.loan_id
        AND adc.is_active = 1
        AND adc.token_status = 'confirmed'
        and ls.due_date <= DATE_ADD(now(), INTERVAL 1 DAY)
        and ls.status_code = 'repayment_pending'
        and due_date >= date_sub(now(), INTERVAL 2 day)
        and ls.loan_schedule_id not in (
            select loan_schedule_id
            from repayment_transactions rt
            where rt.status_code in (
                'repayment_auto_debit_order_created',
                'repayment_auto_debit_request_sent',
                'repayment_transaction_inprogress'
            )
            and rt.entry_type = 'AUTODEBIT_RP'
        )
    join loans l on adc.loan_id = l.loan_id
        and l.status_code = 'disbursal_completed'
limit 30
explain plan
*************************** 1. row ***************************
id: 1
select_type: PRIMARY
table: adc
partitions: NULL
type: ref
possible_keys: idx_is_active_loan_id
key: idx_is_active_loan_id
key_len: 1
ref: const
rows: 829
filtered: 10.00
Extra: Using index condition; Using where
*************************** 2. row ***************************
id: 1
select_type: PRIMARY
table: l
partitions: NULL
type: eq_ref
possible_keys: PRIMARY,idx_loans_status_code,idx_lid_uid_bnk_statcd,idx_loan_id_tenure_days,idx_disbursal_date_status_code
key: PRIMARY
key_len: 8
ref: loanfront.adc.loan_id
rows: 1
filtered: 7.15
Extra: Using where
*************************** 3. row ***************************
id: 1
select_type: PRIMARY
table: ls
partitions: NULL
type: ref
possible_keys: fk_loan_schedules_loans
key: fk_loan_schedules_loans
key_len: 8
ref: loanfront.adc.loan_id
rows: 1
filtered: 4.09
Extra: Using index condition; Using where
*************************** 4. row ***************************
id: 2
select_type: DEPENDENT SUBQUERY
table: rt
partitions: NULL
type: index_subquery
possible_keys: idx_transactions_status_code,idx_repayment_transactions,idx_entry_type_status_code
key: idx_repayment_transactions
key_len: 5
ref: func
rows: 4
filtered: 1.10
Extra: Using where
Table structure
Table: loan_schedules
(`tmp_user_id`,`tmp_loan_id`,`emi_group`,`schedule_num`),
KEY `fk_loan_schedules_product_types` (`product_type_id`),
KEY `fk_loan_schedules_loans` (`loan_id`),
KEY `loan_schedules_tmp_user_loan_group_schedule_num` (`tmp_user_id`,`tmp_loan_id`,`emi_group`,`schedule_num`),
KEY `loan_schedules_emi_group_index` (`emi_group`),
KEY `loan_schedules_tmp_loan_id_index` (`tmp_loan_id`),
KEY `loan_schedules_tmp_user_id_index` (`tmp_user_id`),
KEY `loan_schedules_user_id_index` (`user_id`),
KEY `idx_schedule_num_expected_total_am_status_code` (`schedule_num`,`expected_total_amt`,`status_code`),
CONSTRAINT `_fk_loan_schedules_product_types` FOREIGN KEY (`product_type_id`) REFERENCES `product_types` (`product_type_id`) ON DELETE NO ACTION ON UPDATE NO ACTION
) ENGINE=InnoDB AUTO_INCREMENT=8622683 DEFAULT CHARSET=latin1
Table: razorpay_enach_consent
`payment_created_at` timestamp NULL DEFAULT NULL,
`token_id` varchar(200) DEFAULT NULL,
`token_expiry_date` timestamp NULL DEFAULT NULL,
`signature` varchar(500) DEFAULT NULL,
`is_active` tinyint(2) NOT NULL DEFAULT '1',
PRIMARY KEY (`consent_id`),
UNIQUE KEY `token_id_UNIQUE` (`token_id`),
KEY `idx_is_active_loan_id` (`is_active`,`loan_id`)
) ENGINE=InnoDB AUTO_INCREMENT=4989 DEFAULT CHARSET=utf8mb4
The consequences of changing a collation or character set on a column in a table, or on the table as a whole, are:
converting the former character set / collation values to the new one. That's not always possible: for example, the Unicode ⌘ cannot be represented in latin1.
rebuilding any indexes containing textual columns.
If you use text (VARCHAR, CHAR, TEXT, MEDIUMTEXT and so forth) data types in ON conditions for JOINs, the character sets and collations of the columns should match. If you use numerical data types (INT, BIGINT) those data types should match. If they don't MySQL or MariaDB must do a lot of on-the-fly conversion to evaluate the ON condition.
Collations and character sets are baked into indexes. So if you do this on your utf8mb4 column razorpay_enach_consent.token_id
WHERE CONVERT(token_id USING latin1) = _latin1'Constant'
you'll defeat an index on token_id. But if you do this instead
WHERE token_id = _utf8mb4'Constant'
you'll use the index.
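A hedged sketch of how you might check and then align the charsets (the schema name loanfront comes from the EXPLAIN output; converting to utf8mb4 rather than latin1 is an assumption, and CONVERT TO rebuilds the table and every index on it, so try it on a copy first):

SELECT table_name, column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = 'loanfront'
  AND table_name IN ('loan_schedules', 'razorpay_enach_consent')
  AND character_set_name IS NOT NULL;

ALTER TABLE loan_schedules CONVERT TO CHARACTER SET utf8mb4;

Once both tables use the same charset, no CONVERT(... USING ...) is needed in the query at all, and indexes on the text columns stay usable.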

MySQL update with subquery and between stupidly slow

I have a simple query. All I am trying to do is combine geolocation data into a primary table so I don't need to use a join when pulling data. Yeah, a join is fine, but I want all the data in a single table...
Here is my query:
> explain UPDATE test AS l
SET l.country_name1 = (
select country_name
from ip2location_db9
where
l.ip >= ip_from
ORDER BY ip_from DESC
LIMIT 1)
******************** 1. row *********************
id: 1
select_type: UPDATE
table: l
partitions:
type: ALL
possible_keys:
key:
key_len:
ref:
rows: 10
filtered: 100.00
Extra:
******************** 2. row *********************
id: 2
select_type: DEPENDENT SUBQUERY
table: ip2location_db9
partitions:
type: index
possible_keys: idx_ip_from,idx_ip_from_to,ip_from_to,ip_from
key: idx_ip_from
key_len: 5
ref:
rows: 1
filtered: 33.33
Extra: Using where; Backward index scan
2 rows in set
Simple, right? Except that this query on 10 rows takes 130+ seconds.
The select statement, when run by itself, is super fast (0.00 sec) and works perfectly:
explain select country_name
from ip2location_db9
where
ip_from <= INET_ATON('114.160.63.108')
ORDER BY ip_from DESC
LIMIT 1
******************** 1. row *********************
id: 1
select_type: SIMPLE
table: ip2location_db9
partitions:
type: range
possible_keys: idx_ip_from,idx_ip_from_to
key: idx_ip_from
key_len: 5
ref:
rows: 1949595
filtered: 100.00
Extra: Using index condition
1 rows in set
However, when I use it in the update statement, with only 14 rows in the test table, the query takes a painful 130 seconds.
Tables are set up as follows:
CREATE TABLE `test` (
`ip` int(10) unsigned DEFAULT NULL,
`country_name` varchar(64) COLLATE utf8_bin DEFAULT NULL,
`region_name` varchar(128) COLLATE utf8_bin DEFAULT NULL,
`city_name` varchar(128) COLLATE utf8_bin DEFAULT NULL,
`latitude` double DEFAULT NULL,
`longitude` double DEFAULT NULL,
`zip_code` varchar(30) COLLATE utf8_bin DEFAULT NULL,
`data` varchar(512) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE `ip2location_db9`(
`ip_from` INT(10) UNSIGNED,
`ip_to` INT(10) UNSIGNED,
`country_code` CHAR(2),
`country_name` VARCHAR(64),
`region_name` VARCHAR(128),
`city_name` VARCHAR(128),
`latitude` DOUBLE,
`longitude` DOUBLE,
`zip_code` VARCHAR(30),
INDEX `idx_ip_from` (`ip_from`),
INDEX `idx_ip_to` (`ip_to`),
INDEX `idx_ip_from_to` (`ip_from`, `ip_to`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
Note that when running the update query, hard drive utilization goes crazy high.
See? Nothing fancy. Table data matches. Rows are properly indexed.
Explain shows that it uses the index just as it does when I perform the select by itself:
******************** 1. row *********************
id: 1
select_type: UPDATE
table: l
partitions:
type: ALL
possible_keys:
key:
key_len:
ref:
rows: 14
filtered: 100.00
Extra:
******************** 2. row *********************
id: 2
select_type: DEPENDENT SUBQUERY
table: ip2location_db9
partitions:
type: index
possible_keys: idx_ip_from,idx_ip_from_to
key: idx_ip_from
key_len: 5
ref:
rows: 1
filtered: 33.33
Extra: Using where
2 rows in set
So what could I possibly be doing wrong that would cause the query to take 2 minutes to run for 14 rows?
It's not a hardware thing: i7 6990, 32 GB RAM, running off an SSD. Not the fastest in the world, but I can manually update 14 rows faster than this query can...
I have spent an excess of time searching, trying to find out why this takes so long. I assume that I am just not searching correctly. Perhaps I don't know a specific term or something that would point me in the right direction. I am not a DB guy; just doing this for a work thing.
Hoping you guys can save my sanity..
Adding more info...
I have tried to make this query work many, many ways. Other questions on Stack said to avoid the subquery and use a join. OK, that was the first thing I tried, but I can't get the query to use the indexes I built.
> explain UPDATE test AS l
JOIN
ip2location_db9 AS h
ON
l.ip <= h.ip_from and l.ip >= h.ip_to
SET
l.country_name1 = h.country_name
******************** 1. row *********************
id: 1
select_type: UPDATE
table: l
partitions:
type: ALL
possible_keys: ip
key:
key_len:
ref:
rows: 10
filtered: 100.00
Extra:
******************** 2. row *********************
id: 1
select_type: SIMPLE
table: h
partitions:
type: ALL
possible_keys: idx_ip_from,idx_ip_to,idx_ip_from_to,ip_from_to,ip_from,ip_to
key:
key_len:
ref:
rows: 3495740
filtered: 11.11
Extra: Range checked for each record (index map: 0x3F)
2 rows in set
Does an update with a join get any easier than that?
Even using FORCE INDEX doesn't get the query to use an index.
I've tried this query so many ways:
UPDATE test
JOIN (
SELECT * FROM ip2location_db9
) AS t1
ON (test.ip <= t1.ip_from and test.ip >= t1.ip_to)
SET test.country_name1 = t1.country_name
UPDATE test l,
(SELECT
*
FROM
ip2location_db9,test
WHERE
test.ip >= ip_from
ORDER BY ip_from DESC
LIMIT 1) AS PreQualified
SET
l.country_name1 = PreQualified.country_name
Nothing works! What am I doing wrong?

MySQL query not using all indexed columns for searching

For faster searches I have indexed two columns (a composite index): client_id and batch_id.
Below is the output of the indexes on my table:
show indexes from authentication_codes
*************************** 3. row ***************************
Table: authentication_codes
Non_unique: 1
Key_name: client_id
Seq_in_index: 1
Column_name: client_id
Collation: A
Cardinality: 18
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
Index_comment:
*************************** 4. row ***************************
Table: authentication_codes
Non_unique: 1
Key_name: client_id
Seq_in_index: 2
Column_name: batch_id
Collation: A
Cardinality: 18
Sub_part: NULL
Packed: NULL
Null: YES
Index_type: BTREE
Comment:
Index_comment:
4 rows in set (0.02 sec)
When I use EXPLAIN to check whether the index is used by the query, it gives me the output below:
mysql> explain select * from authentication_codes where client_id=6 and batch_id="101" \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: authentication_codes
type: ref
possible_keys: client_id
key: client_id
key_len: 773
ref: const,const
rows: 1044778
Extra: Using where
1 row in set (0.00 sec)
********************EDIT***************************
The output of SHOW CREATE TABLE authentication_codes is below:
mysql> show create table authentication_codes \G;
*************************** 1. row ***************************
Table: authentication_codes
Create Table: CREATE TABLE `authentication_codes` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`code` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`batch_id` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`serial_num` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`client_id` int(11) DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `index_authentication_codes_on_code` (`code`),
KEY `client_id_batch_id` (`client_id`,`batch_id`)
) ENGINE=InnoDB AUTO_INCREMENT=48406205 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
1 row in set (0.00 sec)
My question is: why is the batch_id column not used for searching? Why is only the client_id column used?
To use an index on two columns you need to create a two-column index; MySQL will not, in general, combine two separate single-column indexes on one table to satisfy such a query.
This query will add a multi-column index on client_id and batch_id:
alter table authentication_codes add index client_id_batch_id (client_id,batch_id)
http://dev.mysql.com/doc/refman/5.7/en/multiple-column-indexes.html
The EXPLAIN does not match the CREATE TABLE, at least in the name of the relevant index.
Explaining the EXPLAIN (as displayed at the moment):
select_type: SIMPLE
table: authentication_codes
type: ref
possible_keys: client_id
key: client_id -- The index named "client_id" was used
key_len: 773 -- (explained below)
ref: const,const -- 2 constants were used for the first two columns in that index
rows: 1044778 -- About this many rows (2% of table) matches those two constants
Extra: Using where
773 = 2 + 3 * 255 + 1 + 4 + 1
2 = length for VARCHAR
3 = max width of a utf8 character -- do you really need utf8?
255 = max length provided in VARCHAR(255) -- do you really need that much?
1 = extra length for NULL -- perhaps your columns could/should be NOT NULL?
4 = length of INT for client_id -- if you don't need 4 billion ids, maybe a smaller INT would work? and maybe UNSIGNED, too?
So, yes, it is using both parts of client_id=6 and batch_id="101". But there are a million rows in that batch for that client, so the query takes time.
If you want to discuss how to further speed up the use of this table, please provide the other common queries. (I don't want to tweak the schema to make this query faster, only to find that other queries are made slower.)
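If the other queries turn out to permit it, the tightening hinted at in that breakdown would look roughly like this (the narrower types are assumptions; verify that all existing values fit them first):

ALTER TABLE authentication_codes
  MODIFY client_id SMALLINT UNSIGNED NOT NULL,
  MODIFY batch_id VARCHAR(32) CHARACTER SET latin1 NOT NULL;

That shrinks each index entry from 773 bytes to roughly 2 + 2 + 32 = 36, so the same million-row index scan reads about a twentieth of the data.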

MySQL, the index of text can't work

I created a table like this:
CREATE TABLE `text_tests` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`text_st_date` text NOT NULL,
`varchar_st_date` varchar(255) NOT NULL DEFAULT '2015-08-25',
`text_id` text NOT NULL,
`varchar_id` varchar(255) NOT NULL DEFAULT '0',
`int_id` int(11) NOT NULL DEFAULT '0',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `idx_of_text_st_date` (`text_st_date`(50),`id`),
KEY `idx_of_varchar_st_date` (`varchar_st_date`,`id`),
KEY `idx_of_text_id` (`text_id`(20),`id`),
KEY `idx_of_varchar_id` (`varchar_id`,`id`),
KEY `idx_of_int_id` (`int_id`,`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Then I generate some data using Ruby:
# Seed 10,000 rows; TextTest is the ActiveRecord model for text_tests
(1..10000).each do |i|
  item = TextTest.new
  item.text_st_date = (Time.now + i.days).to_s
  item.varchar_st_date = (Time.now + i.days).to_s
  item.text_id = i
  item.varchar_id = i
  item.int_id = i
  item.save
end
At last, I try to use the indexes on the text columns, but they don't work; the query always does a full table scan.
EXPLAIN SELECT id
FROM text_tests
ORDER BY text_st_date DESC
LIMIT 20\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: text_tests
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 9797
Extra: Using filesort
1 row in set (0.02 sec)
EXPLAIN SELECT id
FROM text_tests
ORDER BY text_id DESC
LIMIT 20\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: text_tests
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 9797
Extra: Using filesort
1 row in set (0.00 sec)
The varchar indexes work well:
EXPLAIN SELECT id
FROM text_tests
ORDER BY varchar_st_date DESC
LIMIT 20\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: text_tests
type: index
possible_keys: NULL
key: idx_of_varchar_st_date
key_len: 771
ref: NULL
rows: 20
Extra: Using index
1 row in set (0.00 sec)
EXPLAIN SELECT id
FROM text_tests
ORDER BY varchar_id DESC
LIMIT 20\G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: text_tests
type: index
possible_keys: NULL
key: idx_of_varchar_id
key_len: 771
ref: NULL
rows: 20
Extra: Using index
1 row in set (0.00 sec)
Why don't the indexes on the text columns work, and how can I use them?
Indexes don't serve a very strong purpose for queries that return all the rows of the table in the result set. One of their primary purposes is to accelerate WHERE and JOIN ... ON clauses. If your query has no WHERE clause, don't be surprised if the query planner decides to scan the whole table.
Also, your first query does ORDER BY text_st_date, but your index only encompasses the first fifty characters of that column. So, to satisfy the query, MySQL has to sort the whole thing. What's more, it has to sort it on the hard drive, because the in-memory temporary table engine can't handle BLOB or TEXT columns.
MySQL is very good at handling dates, but you need to tell it that you have dates, not VARCHAR(255).
Use a DATE datatype for date columns! If Ruby won't help you do that, then get rid of Ruby.
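A minimal sketch of that advice against the table above (the column name st_date and the backfill expression are assumptions; LEFT(..., 10) grabs the leading YYYY-MM-DD from the stored strings):

ALTER TABLE text_tests
  ADD COLUMN st_date DATE NOT NULL DEFAULT '2015-08-25',
  ADD KEY idx_of_st_date (st_date, id);

UPDATE text_tests
SET st_date = STR_TO_DATE(LEFT(varchar_st_date, 10), '%Y-%m-%d');

After that, EXPLAIN SELECT id FROM text_tests ORDER BY st_date DESC LIMIT 20 can read the new index backwards instead of filesorting the whole table.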

Add index to generated column

First, sorry if the terms I use are not right; I'm not a MySQL professional.
I have a table like this :
CREATE TABLE `accesses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`time` int(11) DEFAULT NULL,
`accessed_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `index_accesses_on_accessed_at` (`accessed_at`)
) ENGINE=InnoDB AUTO_INCREMENT=9278483 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
This table has 10,000,000 rows in it. I use it to generate charts, with queries like this:
SELECT SUM(time) AS value, DATE(created_at) AS date
FROM `accesses`
GROUP BY date;
This query is very slow (more than 1 minute). I'm running lots of other queries too (with AVG, MIN or MAX instead of SUM, with a WHERE on a specific day or month, with GROUP BY HOUR(created_at), etc.).
I want to optimize it.
The best idea I have is to add several columns, with redundancy, like DATE(created_at), HOUR(created_at), MONTH(created_at), and then add indexes on them.
... Is this solution good or is there any other one ?
Regards
Yes, it can be an optimization to store data redundantly in permanent columns with an index to optimize certain queries. This is one example of denormalization.
Depending on the amount of data and the frequency of queries, this can be an important speedup (@Marshall Tigerus downplays it too much, IMHO).
I tested this out by running EXPLAIN:
mysql> explain SELECT SUM(time) AS value, DATE(created_at) AS date FROM `accesses` GROUP BY date\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: accesses
partitions: NULL
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1
filtered: 100.00
Extra: Using temporary; Using filesort
Ignore the fact that the table is empty in my test. The important part is Using temporary; Using filesort, which are expensive operations, especially if your temp table gets so large that MySQL can't fit it in memory.
I added some columns and indexes on them:
mysql> alter table accesses add column cdate date, add key (cdate),
add column chour tinyint, add key (chour),
add column cmonth tinyint, add key (cmonth);
mysql> explain SELECT SUM(time) AS value, cdate FROM `accesses` GROUP BY cdate\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: accesses
partitions: NULL
type: index
possible_keys: cdate
key: cdate
key_len: 4
ref: NULL
rows: 1
filtered: 100.00
Extra: NULL
The temporary table and filesort went away, because MySQL knows it can do an index scan to process the rows in the correct order.
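Since the question title mentions generated columns: on MySQL 5.7+ you can get the same effect without maintaining the redundant columns by hand (a sketch; it assumes the chart queries switch to accessed_at, the timestamp column that actually exists in the table):

ALTER TABLE accesses
  ADD COLUMN cdate DATE AS (DATE(accessed_at)) STORED,
  ADD KEY (cdate);

SELECT SUM(time) AS value, cdate AS date
FROM accesses
GROUP BY cdate;

MySQL keeps cdate in sync on every write, and the index on it removes the temporary table and filesort exactly as above.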