MySQL - InnoDB - possible_keys NULL when using a function in an UPDATE query

BACKGROUND
I am working on a high-traffic application that is extremely slow when executing the following queries.
Below is a description of my problem:
I have the following function defined:
CREATE FUNCTION getTableXMax() RETURNS INT
BEGIN
DECLARE NUM INT DEFAULT 0;
SELECT COALESCE((SELECT MAX(ID) FROM TABLE_X),0) INTO NUM;
RETURN NUM;
END //
TABLE_X has more than 30 million entries.
PROBLEMATIC QUERY
mysql> UPDATE TABLE_X SET COST = 0 WHERE ID=49996728;
-> //
Query OK, 1 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0
mysql> UPDATE TABLE_X SET COLUMN_X=0 WHERE ID=getTableXMax();
-> //
Query OK, 1 rows affected (1 min 23.13 sec)
Rows matched: 1 Changed: 0 Warnings: 0
------- QUESTION -----------
As you can see above, the second query takes more than a minute to execute when it uses the MySQL function. I want to understand why this happens (even though the overall implementation might be bad).
------- DEBUG --------------
I ran some EXPLAIN queries to check the possible_keys that MySQL considers when performing the search. As you can see below, the query that uses the function has a NULL value for possible_keys, which probably answers the "why". The remaining questions are what the underlying reason is and how to fix it.
mysql> EXPLAIN UPDATE TRANSCRIPTIONS SET COST = 0 WHERE ID=12434;//
+----+-------------+----------------+-------+---------------+---------+---------+-------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+-------+---------------+---------+---------+-------+------+-------------+
| 1 | SIMPLE | TRANSCRIPTIONS | range | PRIMARY | PRIMARY | 4 | const | 1 | Using where |
+----+-------------+----------------+-------+---------------+---------+---------+-------+------+-------------+
1 row in set (0.00 sec)
mysql> EXPLAIN UPDATE TRANSCRIPTIONS SET COST = 0 WHERE ID=getTableXMax();//
+----+-------------+----------------+-------+---------------+---------+---------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+-------+---------------+---------+---------+------+----------+-------------+
| 1 | SIMPLE | TRANSCRIPTIONS | index | NULL | PRIMARY | 4 | NULL | 38608423 | Using where |
+----+-------------+----------------+-------+---------------+---------+---------+------+----------+-------------+
MYSQL VERSION
+-------------------------+------------------------------+
| Variable_name | Value |
+-------------------------+------------------------------+
| innodb_version | 5.6.34 |
| protocol_version | 10 |
| slave_type_conversions | |
| version | 5.6.34 |
| version_comment | MySQL Community Server (GPL) |
| version_compile_machine | x86_64 |
| version_compile_os | Linux |
+-------------------------+------------------------------+
I hope my question was thorough enough.

I think that
UPDATE TABLE_X
SET COLUMN_X=0
ORDER BY ID DESC
LIMIT 1
is enough, and the function is not needed at all.
If you want to keep the function and the logic, then use
UPDATE TABLE_X,
( SELECT getTableXMax() criteria ) fn
SET COLUMN_X=0
WHERE ID=criteria;
But as a first step, try defining the function as DETERMINISTIC.

I think the problem is that the MySQL engine doesn't realize that getTableXMax() always returns the same value. So rather than calling the function once, and then finding that row in the index to update it, it scans the entire table, calling getTableXMax() for each row, and compares the result with ID to determine if it should update that row.
Declaring the function DETERMINISTIC should probably help this. This tells the optimizer that the function always returns the same value, so it only needs to call it once rather than for every row in the table.
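For reference, a minimal sketch of such a redefinition (the body is unchanged from the question; DETERMINISTIC and READS SQL DATA are the additions):

```sql
DELIMITER //
DROP FUNCTION IF EXISTS getTableXMax //
CREATE FUNCTION getTableXMax() RETURNS INT
    DETERMINISTIC      -- hints that one call per statement is enough
    READS SQL DATA
BEGIN
  DECLARE NUM INT DEFAULT 0;
  SELECT COALESCE((SELECT MAX(ID) FROM TABLE_X), 0) INTO NUM;
  RETURN NUM;
END //
DELIMITER ;
```

Strictly speaking the function is not deterministic - its result changes as rows are inserted into TABLE_X - so this label is a deliberate hint to the optimizer rather than a truthful declaration; use it only if that trade-off is acceptable.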
The rewrites in Akina's answer will also work, and you could also use a variable:
SET @maxID = getTableXMax();
UPDATE TABLE_X
SET COLUMN_X = 0
WHERE ID = @maxID;

Related

MySQL Similar Queries taking longer

So I have a table about 2 GB in size. I run two queries: one takes about 200 ms and the other over 8 s, and both correctly return 0. I added device_id and time_server as indexes, assuming it would make both queries quicker; it did, but only for one of them. So why is there a huge difference in time taken to query the same table?
Have I just been unlucky, in that one query is running from memory and the other from disk because I've hit the limit of innodb_buffer_pool_size?
Why the difference in the row counts from EXPLAIN? If both return a count of 0, I'd have thought each would do a full table scan and the row counts would be identical.
It's worth noting that CPU, RAM, disk I/O etc. are all fine, with nothing obvious that could slow things down. Repeatedly running the queries gives the same results, so it's consistent.
Query 1
mysql> SELECT count(*) AS count
FROM mydb.gps
WHERE device_id = 780 AND time_server > '2021-08-03 16:32:48';
+-------+
| count |
+-------+
| 0 |
+-------+
1 row in set (8.20 sec)
Query 2:
mysql> SELECT count(*) AS count
FROM mydb.gps
WHERE device_id = 430 AND time_server > '2021-08-03 16:32:48';
+-------+
| count |
+-------+
| 0 |
+-------+
1 row in set (0.02 sec)
If I run an explain on them:
Query 1
mysql> EXPLAIN
-> SELECT count(*) AS count FROM mydb.gps WHERE device_id = 780 AND time_server > '2021-08-03 16:32:48';
+----+-------------+-------+------------+------+-----------------------+-----------+---------+-------+--------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-----------------------+-----------+---------+-------+--------+----------+-------------+
| 1 | SIMPLE | gps | NULL | ref | device_id,time_server | device_id | 5 | const | 282416 | 2.12 | Using where |
+----+-------------+-------+------------+------+-----------------------+-----------+---------+-------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
Query 2
mysql> EXPLAIN
-> SELECT count(*) AS count FROM mydb.gps WHERE device_id = 430 AND time_server > '2021-08-03 16:32:48';
+----+-------------+-------+------------+------+-----------------------+-----------+---------+-------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+-----------------------+-----------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | gps | NULL | ref | device_id,time_server | device_id | 5 | const | 2001 | 2.12 | Using where |
+----+-------------+-------+------------+------+-----------------------+-----------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
You have more than 100x more rows with the first device_id, so the query has more rows to scan to check the time_server value. You may be able to improve it by creating a multi-column index:
ALTER TABLE gps ADD INDEX (device_id, time_server);
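Once a composite index like that exists, the single-column device_id index becomes largely redundant for this query shape and could be dropped. A sketch, assuming the existing index is named device_id as the EXPLAIN output suggests:

```sql
-- The composite index serves the equality on device_id
-- plus the range on time_server in one lookup.
ALTER TABLE gps ADD INDEX idx_device_time (device_id, time_server);

-- Optional: the old single-column index is now redundant here.
-- Verify nothing else relies on it before dropping.
ALTER TABLE gps DROP INDEX device_id;
```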

How does MySQL index not speed up update query?

I have a table located in RAM and doing some performance tests.
Let's consider a sample query, adding explain sentences along with results
mysql> explain update users_ram set balance = balance + speed where sub = 1;
+----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| 1 | UPDATE | users_ram | NULL | ALL | NULL | NULL | NULL | NULL | 2333333 | 100.00 | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set (0.00 sec)
mysql> update users_ram set balance = balance + speed where sub = 1;
Query OK, 1166970 rows affected (0.37 sec)
Rows matched: 1166970 Changed: 1166970 Warnings: 0
As you can see, it takes 0.37 sec without an index. Then I create an index on the sub column, an int column with just two possible values (0 and 1), and surprisingly nothing changes:
mysql> create index sub on users_ram (sub);
Query OK, 2333333 rows affected (2.04 sec)
Records: 2333333 Duplicates: 0 Warnings: 0
mysql> show index from lords.users_ram;
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| users_ram | 0 | user | 1 | user | NULL | 2333333 | NULL | NULL | YES | HASH | | |
| users_ram | 1 | sub | 1 | sub | NULL | 2 | NULL | NULL | | HASH | | |
+-----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
2 rows in set (0.00 sec)
mysql> explain update users_ram set balance = balance + speed where sub = 1;
+----+-------------+-----------+------------+-------+---------------+------+---------+-------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+------+---------+-------+---------+----------+-------------+
| 1 | UPDATE | users_ram | NULL | range | sub | sub | 5 | const | 1166666 | 100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+------+---------+-------+---------+----------+-------------+
1 row in set (0.00 sec)
mysql> update users_ram set balance = balance + speed where sub = 1;
Query OK, 1166970 rows affected (0.37 sec)
Rows matched: 1166970 Changed: 1166970 Warnings: 0
If I remove the index and add it again, but now using BTREE, it gets even weirder:
mysql> explain update users_ram set balance = balance + speed where sub = 1;
+----+-------------+-----------+------------+-------+---------------+------+---------+-------+---------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-----------+------------+-------+---------------+------+---------+-------+---------+----------+-------------+
| 1 | UPDATE | users_ram | NULL | range | sub | sub | 5 | const | 1057987 | 100.00 | Using where |
+----+-------------+-----------+------------+-------+---------------+------+---------+-------+---------+----------+-------------+
1 row in set (0.00 sec)
mysql> update users_ram set balance = balance + speed where sub = 1;
Query OK, 1166970 rows affected (0.62 sec)
Rows matched: 1166970 Changed: 1166970 Warnings: 0
How could adding an index have no effect, or even slow down the query?
Take into account that I'm not modifying the indexed column, so MySQL doesn't have to do an extra index write; I really can't understand what's happening here.
"table located in RAM" -- I suspect that is technically incorrect. The possibilities (in MySQL):
The table lives on disk, but it is usually fully cached in the in-RAM "buffer_pool".
The table is ENGINE=MEMORY. But that is used only for temp stuff; it is completely lost if the server goes down.
update users_ram set balance = balance + speed where sub = 1;
The table users_ram needs some index starting with sub. With such, it can go directly to the row(s). But...
It seems that there are 1166970 such rows. That seems like half the table?? At which point, the index is pretty useless. But...
Updating 1M rows is terribly slow, regardless of indexing.
Plan A: Avoid the UPDATE. Perhaps this can be done by storing speed in some other table and doing the + whenever you read the data. (It is generally bad schema design to need huge updates like that.)
Plan B: Update in chunks: http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks
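A minimal sketch of Plan B as a stored procedure, assuming an integer primary-key column named id (substitute the table's real key) and an illustrative chunk size of 10,000:

```sql
DELIMITER //
CREATE PROCEDURE update_in_chunks()
BEGIN
  DECLARE last_id INT DEFAULT 0;
  DECLARE max_id  INT;
  SELECT MAX(id) INTO max_id FROM users_ram;
  WHILE last_id < max_id DO
    -- Each chunk touches at most 10k rows, so locks are held briefly
    UPDATE users_ram
       SET balance = balance + speed
     WHERE sub = 1
       AND id > last_id AND id <= last_id + 10000;
    COMMIT;
    SET last_id = last_id + 10000;
  END WHILE;
END //
DELIMITER ;
```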
How the heck did you get the index type to be HASH? Perhaps ENGINE=MEMORY? What version of MySQL?
What is speed? Another column? A constant?
Please provide SHOW CREATE TABLE users_ram -- There are some other things we need to see, such as the PRIMARY KEY and ENGINE.
(I need some of the above info before tackling "How could adding an index have no effect or even slow down the query?")

How MySQL implements the loose index scan

Recently I came across the question of how MySQL implements the loose index scan.
For example, the test table structure is:
CREATE TABLE test (
id int(11) NOT NULL default '0',
v1 int(10) unsigned NOT NULL default '0',
v2 int(10) unsigned NOT NULL default '0',
v3 int(10) unsigned NOT NULL default '0',
PRIMARY KEY (id),
KEY v1_v2_v3 (v1,v2,v3)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
select * from test;
+----+----+-----+----+
| id | v1 | v2 | v3 |
+----+----+-----+----+
| 1 | 1 | 0 | 1 |
| 2 | 3 | 1 | 2 |
| 10 | 4 | 10 | 10 |
| 0 | 4 | 100 | 0 |
| 3 | 4 | 100 | 3 |
| 5 | 5 | 9 | 5 |
| 8 | 7 | 3 | 8 |
| 7 | 7 | 4 | 7 |
| 30 | 8 | 15 | 30 |
+----+----+-----+----+
Now let's look at two SQL statements.
The first one:
mysql> explain select v1,v2 from test group by v1,v2;
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
| 1 | SIMPLE | test | range | NULL | v1_v2_v3 | 8 | NULL | 3 | Using index for group-by |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
I know that Using index for group-by means MySQL uses a loose index scan for this query. But why is the rows column in the EXPLAIN output 3? I wonder how MySQL can scan only three rows and still get the query result.
The second one:
mysql> explain select max(v3) from test where v1>3 group by v1,v2;
+----+-------------+-------+-------+---------------+----------+---------+------+------+---------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------+---------+------+------+---------------------------------------+
| 1 | SIMPLE | test | range | v1_v2_v3 | v1_v2_v3 | 8 | NULL | 1 | Using where; Using index for group-by |
+----+-------------+-------+-------+---------------+----------+---------+------+------+---------------------------------------+
1 row in set (0.00 sec)
mysql> explain select max(v2) from test where v1>3 group by v1,v2;
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
| 1 | SIMPLE | test | range | v1_v2_v3 | v1_v2_v3 | 4 | NULL | 4 | Using where; Using index |
+----+-------------+-------+-------+---------------+----------+---------+------+------+--------------------------+
1 row in set (0.00 sec)
The only difference between the above two statements is the select list: one uses max(v3), the other max(v2). But why does max(v3) use the loose index scan while max(v2) does not? I don't understand this part of the GROUP BY Optimization documentation:
The only aggregate functions used in the select list (if any) are MIN() and MAX(), and all of them refer to the same column. The column must be in the index and must immediately follow the columns in the GROUP BY.
Why must the column immediately follow the columns in the GROUP BY?
I have been searching the net for a long time, to no avail. Please help, or try to give some ideas on how to achieve this.
Thanks!
This is too long for a comment.
Essentially, when asking "why does the optimizer behave a certain way", the answer is because the designers implemented it that way. If you want to know "why", you would have to ask them . . . that is not an appropriate question for a general-purpose forum.
I want to point out a few things, though. If you think that the max(v2) behavior is a bug, you can report it at bugs.mysql.com. I don't think it is a bug, for two reasons:
The documentation explicitly states how the optimization works, and this query is not documented to use the index (v2 does not immediately follow the GROUP BY columns in the index).
Even if it were documented differently, using an aggregation function on a GROUP BY key is, shall I say, nonsensical. It is valid SQL, but it is simply verbose and unnecessary. Such constructs are way down the list of priorities for database implementors.
Finally, MySQL does not really use statistics (very well?) when creating the query plan. However, in most databases, evaluating a query plan on 9 rows (which fit on a single data page) often results in a full table scan and "inefficient" algorithms. For example, bubble sort is quite inefficient on large numbers of rows, but it can be the most efficient sorting algorithm on a (very) small number of rows.
Is there any reason to use max(v2) in the query? The result is the same even if you do not use max(). If you change the query to select v2 from test where v1 > 3 group by v1, v2, it will be executed with the loose index scan method.
Here are the reasons why the column must immediately follow the columns in the GROUP BY.
v1 v2 v3
1 1 1
1 1 2
1 1 10
1 2 1
1 2 2
1 2 8
In this case, select max(v3) from t1 group by v1, v2 performs a loose index scan. It works as shown in the following figure:
v1 v2 v3
1 1 1
1 1 2
1 1 10 ------------------> 10 return
1 2 1
1 2 2
1 2 8 ------------------> 8 return
However, if you run select max(v3) from t1 group by v1, a loose index scan is not possible, because you would have to access all the keys to find the maximum value (=10):
v1 v2 v3
1 1 1 ------------------> (x)
1 1 2 ------------------> (x)
1 1 10 ------------------> 10 return
1 2 1 ------------------> (x)
1 2 2 ------------------> (x)
1 2 8 ------------------> (x)
Note that you can use the following command to see how many records are accessed using loose index scan (or tight index scan).
flush status;
select max(v3) from t1 group by v1,v2; -- perform loose index scan
show session status like 'Handler_read_key%';
flush status;
select max(v3) from t1 group by v1; -- perform tight index scan
show session status like 'Handler_read_key%';

Why is this MySQL JOIN statement returning more results?

I have two tables (characteristic_list and measure_list) that are related by a column called m_id. I want to retrieve records using filters (columns from characteristic_list) within a date range (columns from measure_list). When I run the following INNER JOIN query, it takes a while to retrieve the records. What am I doing wrong?
mysql> explain select c.power_set_point, m.value, m.uut_id, m.m_id, m.measurement_status, m.step_name from measure_list as m INNER JOIN characteristic_list as c ON (m.m_id=c.m_id) WHERE (m.sequence_end_time BETWEEN '2010-06-18' AND '2010-06-20');
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
| 1 | SIMPLE | c | ALL | NULL | NULL | NULL | NULL | 82952 | |
| 1 | SIMPLE | m | ALL | NULL | NULL | NULL | NULL | 85321 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
2 rows in set (0.00 sec)
mysql> select count(*) from measure_list;
+----------+
| count(*) |
+----------+
| 83635 |
+----------+
1 row in set (0.18 sec)
mysql> select count(*) from characteristic_list;
+----------+
| count(*) |
+----------+
| 83635 |
+----------+
1 row in set (0.10 sec)
The reason this query takes a while to execute is that it has to scan both tables in full. You never want to see ALL as the join type in EXPLAIN. To speed things up, you need to make smart decisions about what to index.
See the following documents at the MySQL site:
http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html
http://dev.mysql.com/doc/refman/5.1/en/using-explain.html
As an add-on to Dan's answer, you should consider indexing the join columns and the WHERE columns. In this case, that means the m_id columns in both tables and sequence_end_time in the measure_list table. The tables are small enough that you can add an index, run EXPLAIN and time the query, then change the index and compare. It should be relatively quick to solve.
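A sketch of those indexes (index names are illustrative; whether to fold m_id into a composite index depends on measured results):

```sql
-- Join column on the characteristic side
ALTER TABLE characteristic_list ADD INDEX idx_cl_m_id (m_id);

-- Range filter on the measure side; including m_id lets the join
-- read it from the index entry the range scan already visits
ALTER TABLE measure_list ADD INDEX idx_ml_seq_m (sequence_end_time, m_id);
```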

MySql function not using indexes

I have a simple function consisting of one SQL query:
CREATE FUNCTION `GetProductIDFunc`( in_title char (14) )
RETURNS bigint(20)
BEGIN
declare out_id bigint;
select id into out_id from products where title = in_title limit 1;
RETURN out_id;
END
Executing this function takes 5 seconds:
select Benchmark(500 ,GetProductIdFunc('sample_product'));
Executing the plain query takes 0.001 seconds:
select Benchmark(500,(select id from products where title = 'sample_product' limit 1));
"Title" field is indexed. Why function execution takes so much time and how can I optimize it?
edit:
Execution plan
mysql> EXPLAIN EXTENDED select id from products where title = 'sample_product' limit 1;
+----+-------------+----------+-------+---------------+------------+---------+-------+------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+-------+---------------+------------+---------+-------+------+----------+-------------+
| 1 | SIMPLE | products | const | Index_title | Index_title | 14 | const | 1 | 100.00 | Using index |
+----+-------------+----------+-------+---------------+------------+---------+-------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
mysql> EXPLAIN select GetProductIdFunc('sample_product');
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
| 1 | SIMPLE | NULL | NULL | NULL | NULL | NULL | NULL | NULL | No tables used |
+----+-------------+-------+------+---------------+------+---------+------+------+----------------+
1 row in set (0.00 sec)
This could be a character set issue. If the function is using a different character set than the table column, it would lead to very slow performance despite the index.
Run show create table products\G to determine the character set for the column.
Run show variables like 'character_set%'; to see what the relevant default character sets are for your DB.
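If the two character sets do differ, one possible fix (a sketch - utf8 is an assumption; use whatever SHOW CREATE TABLE reports for the title column) is to declare the parameter's character set explicitly, so the comparison can still use the index:

```sql
DELIMITER //
CREATE FUNCTION `GetProductIDFunc`( in_title CHAR(14) CHARACTER SET utf8 )
RETURNS BIGINT
BEGIN
  -- The parameter now matches the column's charset, so no implicit
  -- conversion is needed and Index_title remains usable
  DECLARE out_id BIGINT;
  SELECT id INTO out_id FROM products WHERE title = in_title LIMIT 1;
  RETURN out_id;
END //
DELIMITER ;
```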
Try this:
CREATE FUNCTION `GetProductIDFunc`( in_title char (14) )
RETURNS bigint(20)
BEGIN
declare out_id bigint;
set out_id = (select id from products where title = in_title limit 1);
RETURN out_id;
END