slow query even with the index on large table - mysql

I am performing a simple select query to extract username from table logs(containing 54864 rows).
It took about 7.836s to retrieve data.
How can I speed up the performace???
SELECT username FROM `logs`
WHERE
logs.branch=1
and
logs.added_on > '2016-11-27 00:00:00'
On describing table,
+-------------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| username | char(255) | YES | MUL | NULL | |
| fullname | char(255) | YES | | NULL | |
| package | char(255) | YES | | NULL | |
| prev_expiry | date | YES | | NULL | |
| recharged_upto | date | YES | | NULL | |
| payment_option | int(11) | YES | MUL | NULL | |
| amount | float(14,2) | YES | | NULL | |
| branch | int(11) | YES | MUL | NULL | |
| added_by | int(11) | YES | | NULL | |
| added_on | datetime | YES | MUL | NULL | |
| remark | text | YES | | NULL | |
| payment_mode | char(255) | YES | | NULL | |
| recharge_duration | char(255) | YES | | NULL | |
| invoice_number | char(255) | YES | | NULL | |
| cheque_no | char(255) | YES | | NULL | |
| bank_name | char(255) | YES | | NULL | |
| verify_by_ac | int(11) | YES | | 0 | |
| adjusted_days | int(11) | YES | | NULL | |
| adjustment_note | text | YES | | NULL | |
+-------------------+-------------+------+-----+---------+----------------+
20 rows in set
On explaining query,
+----+-------------+--------------------------+------+-----------------------------+--------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------------+------+-----------------------------+--------------+---------+-------+------+-------------+
| 1 | SIMPLE | logs | ref | branch_index,added_on_index | branch_index | 5 | const | 37 | Using where |
+----+-------------+--------------------------+------+-----------------------------+--------------+---------+-------+------+-------------+
1 row in set
updated:: explaing query after adding composite index(branch_added_index )
+----+-------------+--------------------------+------+------------------------------------------------+--------------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------------------------+------+------------------------------------------------+--------------+---------+-------+------+-------------+
| 1 | SIMPLE | logs | ref | branch_index,added_on_index,branch_added_index | branch_index | 5 | const | 37 | Using where |
+----+-------------+--------------------------+------+------------------------------------------------+--------------+---------+-------+------+-------------+
1 row in set

Add a composite key on branch,added_on so you cover all the WHERE conditions since you use AND.
ALTER TABLE logs ADD KEY(branch,added_on)
This should be much faster,also you can drop the branch_index key since the above index can replace it.You only return 37 rows from 54000 so the cardinality is OK.
ALTER TABLE logs DROP INDEX `branch_index`;
Or you can use index hints
SELECT username FROM `logs` USE INDEX (branch_added_index) WHERE
logs.branch=1
and
logs.added_on > '2016-11-27 00:00:00'

if your table's existing index is already used for other queries try adding new composite index like below
create index <indexname> on logs(branch,added_on)

create a composite index on 2 fields (branch, added_on) like:
ALTER TABLE `logs` ADD KEY idx_branch_added (branch, added_on);

This will be even faster, because it is "covering":
INDEX(branch, added_on, username) -- in exactly that order.
(And drop any indexes that are prefixes of this.)
Index Cookbook
"Cardinality" is rarely of importance. And EXPLAIN often gets the value wrong.
The EXPLAIN shows 5 for the size of branch -- does it really need to be NULLable? Will you have 2 billion branches? Consider using something smaller, such as the 1-byte TINYINT UNSIGNED NOT NULL (values 0..255).
Also, shrink the 255 to something reasonable.
When describing a table, please use SHOW CREATE TABLE; it is more descriptive. It might help to know the Engine, Charset, etc.

Related

mysql: why below query unused union index?

table a
+----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------+-------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | NULL | |
| uid | int(11) | YES | MUL | NULL | |
| channel | varchar(20) | YES | | NULL | |
| createAt | datetime | YES | | NULL | |
+----------+-------------+------+-----+---------+-------+
table a index: a_index_uid_createAt` (`uid`,`createAt`)
table b:
+-----------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+-------+
| uid | int(11) | NO | PRI | NULL | |
| date | date | YES | MUL | NULL | |
| channel | varchar(20) | YES | MUL | NULL | |
| gender | smallint(6) | YES | MUL | NULL | |
| chargeAmt | int(11) | YES | | 0 | |
| revised | smallint(6) | YES | | 0 | |
| createAt | datetime | YES | | NULL | |
+-----------+-------------+------+-----+---------+-------+
query st:
select DATE(a.createAt) date,a.channel,b.chargeAmt
FROM a, b
where a.uid = b.uid
and a.createAt >= '2021-05-10 00:00:00'
and a.createAt <= '2021-05-10 23:59:59';
explain:
+----+-------------+-------+--------+-------------------------------------------------+---------+---------+--------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-------------------------------------------------+---------+---------+--------------+--------+-------------+
| 1 | SIMPLE | a | ALL | a_index_uid_createAt | NULL | NULL | NULL | 172725 | Using where |
| 1 | SIMPLE | b | eq_ref | PRIMARY | PRIMARY | 4 | xiehou.r.uid | 1 | |
+----+-------------+-------+--------+-------------------------------------------------+---------+---------+--------------+--------+-------------+
why? a_index_uid_createAt index invalid!
Please use the JOIN .. ON syntax:
select DATE(a.createAt) date, a.channel, b.chargeAmt
FROM a
JOIN b ON a.uid = b.uid -- How the tables are related
WHERE a.createAt >= '2021-05-10 -- filtering
and a.createAt < '2021-05-10 + INTERVAL 1 DAY; -- filtering
The Optimizer, when it sees a JOIN, starts by deciding which table to start with. The preferred table is the one with filtering, namely a.
To do the filtering, it needs an index that starts with the columns mentioned in the WHERE clause.
The other table will be reached by looking at the ON, which seems to have PRIMARY KEY(uid)
So, the only useful index is
a: INDEX(createAt)
Any INDEX(uid, ...) is likely to be unused, since it starts with an existing index, namely PRIMARY KEY(uid).
(In the future, please use SHOW CREATE TABLE; it is more descriptive than DESCRIBE.)

In MYSQL, what does it mean when there are duplicate indices where everything but key_name is the same?

describe etc_category_metadata;
+---------------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_id | bigint(20) | NO | | NULL | |
| time_updated | int(11) | YES | | NULL | |
| category_type | int(11) | YES | MUL | NULL | |
| status_keywords | mediumblob | YES | | NULL | |
| page_keywords | mediumblob | YES | | NULL | |
| profession_keywords | mediumblob | YES | | NULL | |
| adgroup_ids | mediumblob | YES | | NULL | |
| prod | tinyint(1) | YES | | 0 | |
| version | int(11) | YES | | 1 | |
| status | int(11) | YES | | 0 | |
| dep_category_ids | mediumblob | YES | | NULL | |
| custom_param | mediumblob | YES | | NULL | |
| queue_priority | int(11) | YES | | 1 | |
| auto_requeue_num | int(11) | YES | | 0 | |
| cloned_version | int(11) | YES | | 0 | |
| custom_query | varchar(1000) | YES | | NULL | |
| description | varchar(1000) | YES | | NULL | |
| error_message | mediumblob | YES | | NULL | |
| time_last_completed | int(11) | YES | | NULL | |
+---------------------+---------------+------+-----+---------+----------------+
21 rows in set (0.40 sec)
show index from etc_category_metadata;
+-----------------------+------------+-----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------------------+------------+-----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| etc_category_metadata | 0 | PRIMARY | 1 | id | A | 12613 | NULL | NULL | | BTREE | | |
| etc_category_metadata | 0 | category_type | 1 | category_type | A | 12613 | NULL | NULL | YES | BTREE | | |
| etc_category_metadata | 0 | category_type | 2 | version | A | 12613 | NULL | NULL | YES | BTREE | | |
| etc_category_metadata | 0 | category_type_2 | 1 | category_type | A | 12613 | NULL | NULL | YES | BTREE | | |
| etc_category_metadata | 0 | category_type_2 | 2 | version | A | 12613 | NULL | NULL | YES | BTREE | | |
| etc_category_metadata | 0 | category_type_3 | 1 | category_type | A | 12613 | NULL | NULL | YES | BTREE | | |
| etc_category_metadata | 0 | category_type_3 | 2 | version | A | 12613 | NULL | NULL | YES | BTREE | | |
+-----------------------+------------+-----------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
7 rows in set (0.07 sec)
Trying to figure out whether it's safe to delete the keys category_type_2 and category_type_3. It seems like they are the exact same indices as category_type. This is a legacy table that I have no idea who created a long time ago. Are there any valid reasons someone would end up creating these three keys that seem duplicates?
Having two identical indexes is a waste of disk space and slows down INSERTs (a little). There is no benefit.
You can't really see if they are duplicates from DESCRIBE TABLE. Instead, do SHOW CREATE TABLE. Watch for UNIQUE/not, prefixed/not, regular/FULLTEXT, created manually / created by FOREIGN KEY, etc.
Once you decide that they are identical (except for name), do drop one. There are other cases where indexes may as well be dropped. Suppose you had these 3 indexes:
INDEX(a,b) -- keep this
INDEX(a) -- unnecessary
INDEX(b) -- keep
Or this pair:
UNIQUE(a) -- keep; same as INDEX(a), plus a uniqueness check
INDEX(a) -- drop
More subtle, consider this pair:
INDEX(a,b) -- keep; provides composite index
UNIQUE(a) -- keep; provides uniqueness check
(There are more combinations.)
Yes,you have duplicate indexes on the same column so mysql just add a number if you havent given the indexes a name.IMO,mysql shoudnt even allow duplicate indexes.It is safe to delete them
DROP INDEX category_type_2 ON etc_category_metadata
and do the same for the others
There's no good reason to have multiple indexes on the same columns with the same order. Issues such a create index statement on a current MySQL database will still succeed (for backwards compatibility reasons), but will issue a warning:
Duplicate index 'index_name' defined on the table 'db.table_name'. This is deprecated and will be disallowed in a future release.
If you have such pre-existing duplicate indexes there's no reason not to remove them.

mysql slow query on indexed column

Running a simple query on an indexed column, yet it's taking over 500ms. These queries run often so it's affecting performance in a big way.
Just under 1M rows in the table, a very simple table. Using MyISAM. I don't understand why it's examining all rows, seems like it's ignoring the index! I tried adding a second index on the field, a normal one instead of a unique index, didn't make any difference. Thanks for looking.
# Time: 130730 22:00:07
# User#Host: engine[engine] # engine [10.0.0.6]
# Query_time: 0.511209 Lock_time: 0.000050 Rows_sent: 1 Rows_examined: 932048
SET timestamp=1375236007;
SELECT * FROM `marketplacesales`.`sales_order` WHERE `marketplace_orderid` = 823123693003;
+---------------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| marketplace_id | smallint(5) unsigned | NO | MUL | NULL | |
| marketplace_orderid | varchar(255) | NO | UNI | NULL | |
| datetime | datetime | NO | | NULL | |
| added | datetime | NO | MUL | NULL | |
| phone | varchar(255) | YES | | NULL | |
| email | varchar(255) | YES | | NULL | |
| company | varchar(255) | NO | | NULL | |
| name | varchar(255) | NO | | NULL | |
| address1 | varchar(255) | NO | | NULL | |
| address2 | varchar(255) | NO | | NULL | |
| city | varchar(255) | NO | | NULL | |
| state | varchar(255) | NO | | NULL | |
| zip | varchar(255) | NO | | NULL | |
| country | varchar(255) | NO | | NULL | |
+---------------------+-----------------------+------+-----+---------+----------------+
EXPLAIN SELECT * FROM `marketplacesales`.`sales_order` WHERE `marketplace_orderid` = 823123693003;
+----+-------------+-------------+------+-------------------------------------------+------+---------+------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+-------------------------------------------+------+---------+------+--------+-------------+
| 1 | SIMPLE | sales_order | ALL | marketplace_orderid,marketplace_orderid_2 | NULL | NULL | NULL | 932053 | Using where |
+----+-------------+-------------+------+-------------------------------------------+------+---------+------+--------+-------------+
1 row in set (0.00 sec)
You need to use explain to see what is happening. My guess is that the index is not being used.
The where clause is:
WHERE `marketplace_orderid` = 823123693003;
As explained here, the conversion will take place as floating point numbers. This requires a conversion on marketplace_orderid.
Either fix the field in the table so it is numeric. Or, put the value in quotes in the where clause:
WHERE `marketplace_orderid` = '823123693003';
The problem with quotes is that the actual value might have leading zeros, which would cause the match to fail.
Try using:
SELECT * FROM `marketplacesales`.`sales_order` WHERE `marketplace_orderid LIKE '823123693003%'
That should force the use of the index.
Also, if possible, I would consider changing the field to bigint intead of varchar

MySQL query with JOIN and GROUP BY optimization. Is it possible?

I have two tables: gpnxuser and key_value
mysql> describe gpnxuser;
+--------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| version | bigint(20) | NO | | NULL | |
| email | varchar(255) | YES | | NULL | |
| uuid | varchar(255) | NO | MUL | NULL | |
| partner_id | bigint(20) | NO | MUL | NULL | |
| password | varchar(255) | YES | | NULL | |
| date_created | datetime | YES | | NULL | |
| last_updated | datetime | YES | | NULL | |
+--------------+--------------+------+-----+---------+----------------+
and
mysql> describe key_value;
+----------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+--------------+------+-----+---------+----------------+
| id | bigint(20) | NO | PRI | NULL | auto_increment |
| version | bigint(20) | NO | | NULL | |
| date_created | datetime | YES | | NULL | |
| last_updated | datetime | YES | | NULL | |
| upkey | varchar(255) | NO | MUL | NULL | |
| user_id | bigint(20) | YES | MUL | NULL | |
| security_level | int(11) | NO | | NULL | |
+----------------+--------------+------+-----+---------+----------------+
key_value.user_id is FK that references gpnxuser.id. I also have an index in gpnxuser.partner_id which is a FK that references a table called "partner" (which, I think, does not matter much to this question).
For partner_id = 64, I have 500K rows in gpnxuser which have relationship with approximatelly 6M rows in key_value.
I wanted to have a query that returned all distinct 'key_value.upkey' for userĀ“s belonging to a given partner. I did something like this:
select upkey from gpnxuser join key_value on gpnxuser.id=key_value.user_id where partner_id=64 group by upkey;
which takes forever to run. The explain for the query looks like:
mysql> explain select upkey from gpnxuser join key_value on gpnxuser.id=key_value.user_id where partner_id=64 group by upkey;
+----+-------------+-----------+------+----------------------------+--------------------+---------+-----------------------------+--------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-----------+------+----------------------------+--------------------+---------+-----------------------------+--------+----------------------------------------------+
| 1 | SIMPLE | gpnxuser | ref | PRIMARY,FKB2D9FEBE725C505E | FKB2D9FEBE725C505E | 8 | const | 259640 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | key_value | ref | FK9E0C0F912D11F5A9 | FK9E0C0F912D11F5A9 | 9 | gpnx_finance_db.gpnxuser.id | 14 | Using where |
+----+-------------+-----------+------+----------------------------+--------------------+---------+-----------------------------+--------+----------------------------------------------+
My question is: is there a query that can run fast and obtain the result that I want?
what you need to do is utilize EXISTS statement: This will cause only partial table scan until a match found and not more.
select upkey from (select distinct upkey from key_value) upk
where EXISTS
(select 1 from gpnxuser u, key_value kv
where u.id=kv.user_id and partner_id=1 and kv.upkey = upk.upkey)
NB. In the original query, group by is misused: distinct looks better there.
select DISTINCT upkey from gpnxuser join key_value on
gpnxuser.id=key_value.user_id where partner_id=1
I would look into partitioning your key_value table on user_id, if you typically run queries based on this column.
http://dev.mysql.com/doc/refman/5.1/en/partitioning.html

Simple MySQL query with performance issues

I have the following simple MySQL query:
SELECT SQL_NO_CACHE mainID
FROM tableName
WHERE otherID3=19
AND dateStartCol >= '2012-08-01'
AND dateStartCol <= '2012-08-31';
When I run this it takes 0.29 seconds to bring back 36074 results. When I increase my date period to bring back more results (65703) it runs in 0.56. When I run other similar SQL queries on the same server but on different tables (some tables are larger) the results come back in approximately 0.01 seconds.
Although 0.29 isn't slow - this is a basic part for a complex query and this timing means that it is not scalable.
See below for the table definition and indexes.
I know it's not server load as I have the same issue on a development server which has very little usage.
+---------------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------------------+--------------+------+-----+---------+----------------+
| mainID | int(11) | NO | PRI | NULL | auto_increment |
| otherID1 | int(11) | NO | MUL | NULL | |
| otherID2 | int(11) | NO | MUL | NULL | |
| otherID3 | int(11) | NO | MUL | NULL | |
| keyword | varchar(200) | NO | MUL | NULL | |
| dateStartCol | date | NO | MUL | NULL | |
| timeStartCol | time | NO | MUL | NULL | |
| dateEndCol | date | NO | MUL | NULL | |
| timeEndCol | time | NO | MUL | NULL | |
| statusCode | int(1) | NO | MUL | NULL | |
| uRL | text | NO | | NULL | |
| hostname | varchar(200) | YES | MUL | NULL | |
| IPAddress | varchar(25) | YES | | NULL | |
| cookieVal | varchar(100) | NO | | NULL | |
| keywordVal | varchar(60) | NO | | NULL | |
| dateTimeCol | datetime | NO | MUL | NULL | |
+---------------------------+--------------+------+-----+---------+----------------+
+--------------------+------------+-------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------------------+------------+-------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+
| tableName | 0 | PRIMARY | 1 | mainID | A | 661990 | NULL | NULL | | BTREE | |
| tableName | 1 | idx_otherID1 | 1 | otherID1 | A | 330995 | NULL | NULL | | BTREE | |
| tableName | 1 | idx_otherID2 | 1 | otherID2 | A | 25 | NULL | NULL | | BTREE | |
| tableName | 1 | idx_otherID3 | 1 | otherID3 | A | 48 | NULL | NULL | | BTREE | |
| tableName | 1 | idx_dateStartCol | 1 | dateStartCol | A | 187 | NULL | NULL | | BTREE | |
| tableName | 1 | idx_timeStartCol | 1 | timeStartCol | A | 73554 | NULL | NULL | | BTREE | |
|tableName | 1 | idx_dateEndCol | 1 | dateEndCol | A | 188 | NULL | NULL | | BTREE | |
|tableName | 1 | idx_timeEndCol | 1 | timeEndCol | A | 73554 | NULL | NULL | | BTREE | |
| tableName | 1 | idx_keyword | 1 | keyword | A | 82748 | NULL | NULL | | BTREE | |
| tableName | 1 | idx_hostname | 1 | hostname | A | 2955 | NULL | NULL | YES | BTREE | |
| tableName | 1 | idx_dateTimeCol | 1 | dateTimeCol | A | 220663 | NULL | NULL | | BTREE | |
| tableName | 1 | idx_statusCode | 1 | statusCode | A | 2 | NULL | NULL | | BTREE | |
+--------------------+------------+-------------------------------+--------------+---------------------------+-----------+-------------+----------+--------+------+------------+---------+
Explain Output:
+----+-------------+-----------+-------+----------------------------------+-------------------+---------+------+-------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+-------+----------------------------------+-------------------+---------+------+-------+----------+-------------+
| 1 | SIMPLE | tableName | range | idx_otherID3,idx_dateStartCol | idx_dateStartCol | 3 | NULL | 66875 | 75.00 | Using where |
+----+-------------+-----------+-------+----------------------------------+-------------------+---------+------+-------+----------+-------------+
If that is really your query (and not a simplified version of same), then this ought to achieve best results:
CREATE INDEX table_ndx on tableName( otherID3, dateStartCol, mainID);
The first index entry means that the first match in the WHERE is very fast; the same also applies with dateStartCol. The third field is very small and does not slow the index appreciably, but allows for the datum you require to be found immediately in the index with no table access at all.
It is important that the keys are in the same index. In the EXPLAIN you posted, each key is in an index of its own, so even if MySQL chooses the best index, the performances will not be optimal. I'd try and use less indexes, for they also have a cost (shameless plug: Can Indices actually decrease SELECT performance? ).
First try to add the right key. It seems like dateStartCol is more selective than otherID3
ALTER TABLE tableName ADD KEY idx_dates(dateStartCol, dateStartCol)
Second - please make sure you select only rows you need by adding LIMIT clause to the SELECT. This will should up the query. Try like this:
SELECT SQL_NO_CACHE mainID
FROM tableName
WHERE otherID3=19
AND dateStartCol >= '2012-08-01'
AND dateStartCol <= '2012-08-31'
LIMIT 10;
Please also make sure that your MySQL tuned up properly. You may want to check key_buffer_size and innodb_buffer_pool_size as described in http://astellar.com/2011/12/why-is-stock-mysql-slow/
If this is a recurrent or important query then create a multiple column index:
CREATE INDEX index_name ON tableName (otherID3, dateStartCol)
Delete the non used indexes as they make table changes more expensive.
BTW you don't need two separate columns for date and time. You can combine then in a datetime or timestamp type. One less column and one less index.
The explain output shows it chose the dateStartCol index so you could try the opposite I suggested above:
CREATE INDEX index_name ON tableName (dateStartCol, otherID3)
Notice that the query's dateStartCol condition will still get 75% of the rows so not much improvement, if any, in using that single index.
How unique is otherID3? If there are not many repeated otherID3 you can hint the engine to use it.