How to improve query performance in MySQL

My query runs for longer than 30 minutes. It is a simple query and the table has indexes, but we cannot work out why it takes so long to execute, and it degrades the performance of our entire database.
Yesterday it ran for about 122.6 minutes.
This is my query:
SELECT tab1.customer_id, tab1.row_mod, tab1.row_create, tab1.event_id, tab1.event_type,
tab1.new_value, tab1.old_value
FROM tab1 FORCE INDEX (tab1_n2)
WHERE customer_id >= 1 AND customer_id < 5000000
AND (tab1.row_mod >= '2012-10-01')
OR (tab1.row_create >= '2012-10-01' AND tab1.row_create < '2012-10-13');
Explain plan
+----+-------------+------------------+------+---------------------+------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------------+------+---------------------+------+---------+------+----------+-------------+
| 1 | SIMPLE | tab1 | ALL | tab1_n2 | NULL | NULL | NULL | 18490530 | Using where |
+----+-------------+------------------+------+---------------------+------+---------+------+----------+-------------+
1 row in set (0.00 sec)
Table structure:
mysql> show create table tab1\G
*************************** 1. row ***************************
Table: tab1
Create Table: CREATE TABLE `tab1` (
`customer_id` int(11) NOT NULL,
`row_mod` datetime DEFAULT NULL,
`row_create` datetime DEFAULT NULL,
`event_id` int(11) DEFAULT NULL,
`event_type` varchar(45) DEFAULT NULL,
`new_value` varchar(255) DEFAULT NULL,
`old_value` varchar(255) DEFAULT NULL,
KEY `customer_id1` (`customer_id`),
KEY `new_value_n1` (`new_value`),
KEY `tab1_n1` (`row_create`),
KEY `tab1_n2` (`row_mod`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
Please help me tune it. It is slow even though the table has indexes.

Probably because you are forcing an index that does not make sense here.
The row_mod condition is only one branch of the OR condition, so that index is not much help: rows that match only the row_create branch still have to be found some other way. If you force every lookup through an index without eliminating many rows, that can be a lot slower than a full table scan. A good rule of thumb is that an index should eliminate more than 90% of rows.
Try the query without the "force index" part.
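It may also be worth checking the WHERE clause itself. In MySQL, AND binds more tightly than OR, so the query as written is evaluated as the first form below, and the customer_id range does not apply to the row_create branch. The second form is only a sketch of one other possible intent (an assumption; the question does not say which was meant):

```sql
-- How the original WHERE clause is parsed (AND is evaluated before OR):
SELECT customer_id, row_mod, row_create, event_id, event_type, new_value, old_value
FROM tab1
WHERE (customer_id >= 1 AND customer_id < 5000000 AND row_mod >= '2012-10-01')
   OR (row_create >= '2012-10-01' AND row_create < '2012-10-13');

-- Assumed alternative intent: the customer_id range applies to both date conditions.
SELECT customer_id, row_mod, row_create, event_id, event_type, new_value, old_value
FROM tab1
WHERE customer_id >= 1 AND customer_id < 5000000
  AND (row_mod >= '2012-10-01'
       OR (row_create >= '2012-10-01' AND row_create < '2012-10-13'));
```

The two forms can return different rows, so it is worth deciding which one is wanted before tuning indexes.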

Try using a UNION of the two conditions. That way each condition can use an index.
ALTER TABLE tab1 ADD INDEX idx_row_mod_customer_id (row_mod, customer_id);
-- row_create is already indexed by the existing key tab1_n1, so no extra index is needed for it.
SELECT tab1.customer_id, tab1.row_mod, tab1.row_create, tab1.event_id, tab1.event_type,
tab1.new_value, tab1.old_value
FROM tab1
WHERE customer_id >= 1 AND customer_id < 5000000
AND tab1.row_mod >= '2012-10-01'
UNION
SELECT tab1.customer_id, tab1.row_mod, tab1.row_create, tab1.event_id, tab1.event_type,
tab1.new_value, tab1.old_value
FROM tab1
WHERE tab1.row_create >= '2012-10-01' AND tab1.row_create < '2012-10-13';
To optimise further, you could add all selected columns to both indices, making them covering indexes, so that MySQL can answer the query from the index without loading the table rows. This will greatly increase the size of the indices, and therefore their memory requirement.

MariaDB - select count(*) using index but select * not using proper index

Version - 10.4.25-MariaDB
I have a table where the column name is the second part of the primary key (idarchive, name).
When I run count(*) on the table with name like 'done%', it uses the index on name properly, but when I run select * it does not use that separate index. Instead it uses the primary key (a full table scan), which slows down the query.
Any idea what we can do here?
Would a change to optimizer_switch, or some other alternative, help?
Note: we can't use force index because we do not control the queries.
Table structure:
CREATE TABLE `table1` (
`idarchive` int(10) unsigned NOT NULL,
`name` varchar(255) NOT NULL,
`idsite` int(10) unsigned DEFAULT NULL,
`date1` date DEFAULT NULL,
`date2` date DEFAULT NULL,
`period` tinyint(3) unsigned DEFAULT NULL,
`ts_archived` datetime DEFAULT NULL,
`value` double DEFAULT NULL,
PRIMARY KEY (`idarchive`,`name`),
KEY `index_idsite_dates_period` (`idsite`,`date1`,`date2`,`period`,`ts_archived`),
KEY `index_period_archived` (`period`,`ts_archived`),
KEY `name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
Queries:
explain select count(*) from table1 WHERE name like 'done%' ;
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
| 1 | SIMPLE | table1 | range | name | name | 767 | NULL | 9131455 | Using where; Using index |
+------+-------------+-------------------------------+-------+---------------+------+---------+------+---------+--------------------------+
1 row in set (0.000 sec)
explain select * from table1 WHERE name like 'done%' ;
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
| 1 | SIMPLE | table1 | ALL | name | NULL | NULL | NULL | 18262910 | Using where |
+------+-------------+-------------------------------+------+---------------+------+---------+------+----------+-------------+
1 row in set (0.000 sec)
Your SELECT COUNT(*) ... LIKE 'constant%' query is covered by your index on your name column. That is, the entire query can be satisfied by reading the index. So the query planner decides to range-scan your index to generate the result.
On the other hand, your SELECT * query needs all columns from all rows of the table. That can't be satisfied from any of your indexes. And, it's possible your WHERE name like 'done%' filter reads a significant fraction of the table, enough so the query planner decides the fastest way to satisfy it is to scan the entire table. The query planner figures this out by using statistics on the contents of the table, plus some knowledge of the relative costs of IO and CPU.
If you have just inserted many rows into the table you could try doing ANALYZE TABLE table1 and then rerun the query. Maybe after the table's statistics are updated you'll get a different query plan.
And, if you don't need all the columns, you could stop using SELECT * and instead name the columns you need. SELECT * is a notorious query-performance antipattern, because it often returns column data that never gets used. Once you know exactly what columns you want, you could create a covering index to provide them.
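As a sketch of that last point, suppose the application only needs name, idsite and value. That column list is hypothetical, since the question does not say which columns are actually used, and idx_name_idsite_value is an illustrative index name:

```sql
-- Hypothetical: assumes only name, idsite and value are needed by the application.
ALTER TABLE table1 ADD INDEX idx_name_idsite_value (name, idsite, value);

-- Every selected column is in the index, so the range scan on name
-- can be answered without reading the table rows.
SELECT name, idsite, value
FROM table1
WHERE name LIKE 'done%';
```

When an index covers a query, EXPLAIN reports "Using index" in the Extra column, as it does for the COUNT(*) query above.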
These days the query planner does a pretty good job of optimizing simple queries such as yours.
In MariaDB you can say ANALYZE FORMAT=JSON SELECT.... It will run the query and show you details of the actual execution plan it used.
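For the table in the question, those two statements would look roughly like this:

```sql
-- Refresh the index statistics that the optimizer relies on.
ANALYZE TABLE table1;

-- MariaDB only: executes the query and reports the plan together with
-- the actual row counts and timings observed during execution.
ANALYZE FORMAT=JSON
SELECT * FROM table1 WHERE name LIKE 'done%';
```

Because ANALYZE FORMAT=JSON really runs the statement, it takes about as long as the query itself.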

The best index when having two timestamp columns in MySQL

I have two columns in my MySQL Database start_time and end_time. They both are TIMESTAMP. I have created an index on the two columns.
I have 3.5 million rows. This query takes 13s to be executed :
select * from test WHERE start_time > TIMESTAMP('2020-04-02 09:00:00') and end_time < TIMESTAMP('2020-04-02 10:00:00')
Is there any way to optimise it ?
EDIT:
CREATE TABLE `test` (
`YYY` varchar(255) NOT NULL,
`start_time` timestamp NULL DEFAULT NULL,
`end_time` timestamp NULL DEFAULT NULL,
UNIQUE KEY `index1` (`YYY`,`start_time`) USING BTREE,
UNIQUE KEY `index2` (`YYY`,`start_time`,`end_time`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC
Don't bother having one UNIQUE (index2) that starts with the same column as another UNIQUE (index1).
There is no index (in MySQL, at least) that optimizes for such an "overlap" test.
If the time ranges are non-overlapping (that is, no two start..end pairs overlap, other than one end matching another start), then my technique for IP ranges scales quite well. But it requires restructuring the table. http://mysql.rjweb.org/doc.php/ipranges
I cannot replicate this finding:
EXPLAIN
-> SELECT *
-> FROM test
-> WHERE start_time >= '2020-04-02 09:00:00'
-> AND end_time <= '2020-04-02 10:00:00';
+----+-------------+-------+-------+---------------+--------+---------+------+--------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+--------+---------+------+--------+--------------------------+
| 1 | SIMPLE | test | index | NULL | index2 | 267 | NULL | 262035 | Using where; Using index |
+----+-------------+-------+-------+---------------+--------+---------+------+--------+--------------------------+
1 row in set (0.00 sec)
Actually, I'm surprised that the index is used, given that YYY is not part of the filtering condition. Anyway, try an index without YYY, or with YYY as the third column in the index. And remove the other indices.
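A minimal sketch of that suggestion, assuming the uniqueness rule the table needs is the one enforced by index1 on (YYY, start_time), which is therefore kept. idx_start_end_yyy is an illustrative name:

```sql
-- index2 is redundant as a uniqueness constraint:
-- index1 already guarantees that (YYY, start_time) is unique.
ALTER TABLE test
  DROP INDEX index2,
  ADD INDEX idx_start_end_yyy (start_time, end_time, YYY);

EXPLAIN
SELECT *
FROM test
WHERE start_time > '2020-04-02 09:00:00'
  AND end_time < '2020-04-02 10:00:00';
```

With start_time leading, the optimizer can range-scan on start_time. The end_time condition is then checked against each index entry in that range, so the scan still visits every entry that starts after 09:00.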

Why would an indexed column return results slowly when querying for `IS NULL`?

I have a table with 25 million rows, indexed appropriately.
But adding the clause AND status IS NULL turns a super fast query into a crazy slow query.
Please help me speed it up.
Query:
SELECT
student_id,
grade,
status
FROM
grades
WHERE
class_id = 1
AND status IS NULL -- This line delays results from <200ms to 40-70s!
AND grade BETWEEN 0 AND 0.7
LIMIT 25;
Table:
CREATE TABLE IF NOT EXISTS `grades` (
`student_id` BIGINT(20) NOT NULL,
`class_id` INT(11) NOT NULL,
`grade` FLOAT(10,6) DEFAULT NULL,
`status` INT(11) DEFAULT NULL,
UNIQUE KEY `unique_key` (`student_id`,`class_id`),
KEY `class_id` (`class_id`),
KEY `status` (`status`),
KEY `grade` (`grade`)
) ENGINE=INNODB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Local development shows results instantly (<200ms). On the production server there is a huge slowdown (40-70 seconds!).
Can you point me in the right direction to debug?
Explain:
+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
| 1 | SIMPLE | grades | index_merge | class_id,status,grade | status,class_id | 5,4 | NULL | 26811 | Using intersect(status,class_id); Using where |
+----+-------------+--------+-------------+-----------------------+-----------------+---------+------+-------+--------------------------------------------------------+
A SELECT statement normally uses only one index per table. The exception is an index merge, which is what your EXPLAIN shows: intersect(status,class_id).
Presumably the query before just used the sole index class_id for your condition class_id=1, which probably filtered the result set nicely before the other conditions were checked.
The optimiser is 'incorrectly' choosing an index merge on class_id and status for the second query and examining an estimated 26811 rows, which is probably not optimal. You could hint at the class_id index by adding USE INDEX (class_id) after the table name in the FROM clause.
You may get some joy with a composite index on (class_id,status,grade), which may run the query faster as it can match the first two columns and then range scan the grade. I'm not sure how this works with null though.
I'm guessing the ORDER BY pushed the optimiser to choose the class_id index again and returned your query to its original speed.
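The two suggestions above might look like this. Both are sketches: idx_class_status_grade is an illustrative name, and whether either helps depends on the data distribution in production. On the NULL question, MySQL can use an index for IS NULL in the same way as for an equality comparison, so the composite index can match class_id and status and then range-scan grade.

```sql
-- Option 1: steer the optimizer towards the single-column class_id index.
-- The hint goes directly after the table name.
SELECT student_id, grade, status
FROM grades USE INDEX (class_id)
WHERE class_id = 1
  AND status IS NULL
  AND grade BETWEEN 0 AND 0.7
LIMIT 25;

-- Option 2: a composite index with the two equality-style conditions first
-- and the range condition last.
ALTER TABLE grades ADD INDEX idx_class_status_grade (class_id, status, grade);
```

Comparing EXPLAIN output on the production server before and after either change should show whether the index merge is still being chosen.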

query taking more time to execute

The query takes between 5 and 20 minutes to execute.
Because of this we are getting load spikes.
Please help me rewrite the query and improve its performance.
Query:
SELECT DATE(create_time) as createDate, count(url_id)
FROM t_notification
WHERE domain_id = 185
AND type = 12
AND create_time >= '2012-12-15'
GROUP BY createDate
Explain
explain select DATE(create_time) as createDate, count(url_id) from t_notification where domain_id = 185 and type = 12 and create_time >= '2012-12-15' group by createDate;
+----+-------------+----------------+------+---------------------------------+----------+---------+-------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+------+---------------------------------+----------+---------+-------+---------+----------------------------------------------+
| 1 | SIMPLE | t_notification | ref | FK_notification_domain,idx_test | idx_test | 5 | const | 9189516 | Using where; Using temporary; Using filesort |
+----+-------------+----------------+------+---------------------------------+----------+---------+-------+---------+----------------------------------------------+
1 row in set (0.29 sec)
mysql> show create table t_notification\G
*************************** 1. row ***************************
Table: t_notification
Create Table: CREATE TABLE `t_notification` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`type` int(11) DEFAULT NULL,
`content` varchar(512) DEFAULT NULL,
`create_time` date DEFAULT NULL,
`domain_id` int(11) DEFAULT NULL,
`url_id` int(11) DEFAULT NULL,
`status` int(11) DEFAULT NULL,
`targetrul_partnerurl_id` int(11) DEFAULT NULL,
`week_entrances` int(11) DEFAULT NULL COMMENT 'for keyword and target_url',
PRIMARY KEY (`id`),
KEY `url_id` (`url_id`),
KEY `targetrul_partnerurl_id` (`targetrul_partnerurl_id`),
KEY `FK_notification_domain` (`domain_id`,`id`),
KEY `idx_test` (`domain_id`,`status`,`type`,`create_time`)
) ENGINE=InnoDB AUTO_INCREMENT=50747991 DEFAULT CHARSET=utf8
1 row in set (0.00 sec)
From the MySQL docs:
Suppose that you issue the following SELECT statement:
mysql> SELECT * FROM tbl_name WHERE col1=val1 AND col2=val2;
If a multiple-column index exists on col1 and col2, the appropriate
rows can be fetched directly. If separate single-column indexes exist
on col1 and col2, the optimizer will attempt to use the Index Merge
optimization (see Section 8.3.1.4, “Index Merge Optimization”), or
attempt to find the most restrictive index by deciding which index
finds fewer rows and using that index to fetch the rows.
If the table has a multiple-column index, any leftmost prefix of the
index can be used by the optimizer to find rows. For example, if you
have a three-column index on (col1, col2, col3), you have indexed
search capabilities on (col1), (col1, col2), and (col1, col2, col3).
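You don't have a usable index on type or create_time. In idx_test the second column is status, and because the query does not filter on status, only the leading domain_id part of the index can be used (the key_len of 5 in the EXPLAIN is a single nullable INT). Either drop status from key idx_test or create a new index that starts with the columns you filter on, such as (domain_id, type, create_time).
Consider creating a composite index on the domain_id and type columns, as they are used directly in the WHERE clause. It should improve the performance of your query.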
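A sketch of these index suggestions, with url_id added at the end so that the index also covers the COUNT (idx_domain_type_time is an illustrative name). Because create_time is already a DATE column in this table, DATE(create_time) returns the same value, so grouping on the column directly gives the same result and lets the optimizer group in index order:

```sql
ALTER TABLE t_notification
  ADD INDEX idx_domain_type_time (domain_id, type, create_time, url_id);

SELECT create_time AS createDate, COUNT(url_id)
FROM t_notification
WHERE domain_id = 185
  AND type = 12
  AND create_time >= '2012-12-15'
GROUP BY create_time;
```

Re-run EXPLAIN afterwards. The aim is for "Using temporary; Using filesort" to disappear, though the plan chosen depends on the server version and the table statistics.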

How to optimize a query that's using group by on a large number of rows

The table looks like this:
CREATE TABLE `tweet_tweet` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`text` varchar(256) NOT NULL,
`created_at` datetime NOT NULL,
`created_date` date NOT NULL,
...
`positive_sentiment` decimal(5,2) DEFAULT NULL,
`negative_sentiment` decimal(5,2) DEFAULT NULL,
`entity_id` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `tweet_tweet_entity_created` (`entity_id`,`created_at`)
) ENGINE=MyISAM AUTO_INCREMENT=1097134 DEFAULT CHARSET=utf8
The explain on the query looks like this:
mysql> explain SELECT `tweet_tweet`.`entity_id`,
STDDEV_POP(`tweet_tweet`.`positive_sentiment`) AS `sentiment_stddev`,
AVG(`tweet_tweet`.`positive_sentiment`) AS `sentiment_avg`,
COUNT(`tweet_tweet`.`id`) AS `tweet_count`
FROM `tweet_tweet`
WHERE `tweet_tweet`.`created_at` > '2010-10-06 16:24:43'
GROUP BY `tweet_tweet`.`entity_id` ORDER BY `tweet_tweet`.`entity_id` ASC;
+----+-------------+-------------+------+---------------+------+---------+------+---------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+------+---------------+------+---------+------+---------+----------------------------------------------+
| 1 | SIMPLE | tweet_tweet | ALL | NULL | NULL | NULL | NULL | 1097452 | Using where; Using temporary; Using filesort |
+----+-------------+-------------+------+---------------+------+---------+------+---------+----------------------------------------------+
1 row in set (0.00 sec)
About 300k rows are added to the table every day. The query runs about 4 seconds right now but I want to get it down to around 1 second and I'm afraid the query will take exponentially longer as the days go on. Total number of rows in tweet_tweet is currently only a little over 1M, but it will be growing fast.
Any thoughts on optimizing this? Do I need any more indexes? Should I be using something like Cassandra instead of MySQL? =)
You may try to reorder the fields in the index, i.e. KEY tweet_tweet_entity_created (created_at, entity_id). That will allow MySQL to use the index to reduce the number of rows that need to be grouped and ordered.
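As a sketch, the reordered index could be added like this. tweet_tweet_created_entity is an illustrative name, and the existing index is left in place on the assumption that other queries may filter on entity_id:

```sql
ALTER TABLE tweet_tweet
  ADD INDEX tweet_tweet_created_entity (created_at, entity_id);

EXPLAIN
SELECT entity_id,
       STDDEV_POP(positive_sentiment) AS sentiment_stddev,
       AVG(positive_sentiment) AS sentiment_avg,
       COUNT(id) AS tweet_count
FROM tweet_tweet
WHERE created_at > '2010-10-06 16:24:43'
GROUP BY entity_id
ORDER BY entity_id ASC;
```

With created_at as the leading column, the WHERE condition can be resolved as a range scan on the index. The grouping by entity_id still needs a sort or temporary table, but over far fewer rows when the date filter is selective.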
You're not using the index tweet_tweet_entity_created. Change your query to:
explain SELECT `tweet_tweet`.`entity_id`,
STDDEV_POP(`tweet_tweet`.`positive_sentiment`) AS `sentiment_stddev`,
AVG(`tweet_tweet`.`positive_sentiment`) AS `sentiment_avg`,
COUNT(`tweet_tweet`.`id`) AS `tweet_count`
FROM `tweet_tweet` FORCE INDEX (tweet_tweet_entity_created)
WHERE `tweet_tweet`.`created_at` > '2010-10-06 16:24:43'
GROUP BY `tweet_tweet`.`entity_id` ORDER BY `tweet_tweet`.`entity_id` ASC;
You can read more about index hints in the MySQL manual http://dev.mysql.com/doc/refman/5.1/en/index-hints.html
Sometimes MySQL's query optimizer needs a little help.
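MySQL has a catch here. An index over multiple columns can only be used when the query filters on its leftmost column(s). Your index on (entity_id, created_at) starts with entity_id, so it cannot be used to find rows by created_at alone. I've made tables that used Unique Keys and Foreign Keys, and I often had to set a separate index for one or more of the later columns.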
I suggest adding an extra index to just created_at at a minimum. I do not know if adding indexes to the aggregate columns will also speed things up.
If your MySQL version is 5.1 or higher, you can consider partitioning for large tables.
http://dev.mysql.com/doc/refman/5.1/en/partitioning.html