Stored procedure hanging - mysql

A stored procedure hangs from time to time. Any advice?
BEGIN
  DECLARE bookId int;

  SELECT IFNULL(id, 0) INTO bookId
  FROM products
  WHERE isbn = p_isbn
    AND stoc > 0
    AND status = 'vizibil'
    AND pret_ron = (SELECT MAX(pret_ron) FROM products
                    WHERE isbn = p_isbn
                      AND stoc > 0
                      AND status = 'vizibil')
  ORDER BY stoc DESC
  LIMIT 0, 1;

  IF bookId > 0 THEN
    UPDATE products SET afisat = 'nu' WHERE isbn = p_isbn;
    UPDATE products SET afisat = 'da' WHERE id = bookId;
    SELECT bookId INTO obookId;
  ELSE
    SELECT id INTO bookId
    FROM products
    WHERE isbn = p_isbn
      AND stoc = 0
      AND status = 'vizibil'
      AND pret_ron = (SELECT MAX(pret_ron) FROM products
                      WHERE isbn = p_isbn
                        AND stoc = 0
                        AND status = 'vizibil')
    LIMIT 0, 1;

    UPDATE products SET afisat = 'nu' WHERE isbn = p_isbn;
    UPDATE products SET afisat = 'da' WHERE id = bookId;
    SELECT bookId INTO obookId;
  END IF;
END
When it hangs, it does so on:
| 23970842 | username | sqlhost:54264 | database | Query | 65 | Sending data | SELECT IFNULL(id,0) INTO bookId FROM products
WHERE
isbn= NAME_CONST('p_isbn',_utf8'973-679-50 | 0.000 |
| 1133136 | username | sqlhost:52466 | database _emindb | Query | 18694 | Sending data | SELECT IFNULL(id,0) INTO bookId FROM products
WHERE
isbn= NAME_CONST('p_isbn',_utf8'606-92266- | 0.000 |

First, I'd like to mention the Percona Toolkit; it's great for debugging deadlocks and hung transactions. Second, I would guess that at the time of the hang there are multiple threads executing this same procedure. What we need to know is which locks are being held at the time of the hang. The MySQL command SHOW ENGINE INNODB STATUS gives you this information in detail. At the next 'hang', run this command.
I almost forgot to mention the tool innotop, which is similar, but better: https://github.com/innotop/innotop
Next, I am assuming you are using the InnoDB engine. The default transaction isolation level of REPEATABLE READ may be too strict in this situation because of range locking; you could try READ COMMITTED for the body of the procedure (SET it to READ COMMITTED at the beginning and back to REPEATABLE READ at the end).
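A minimal sketch of that isolation-level switch (assuming the procedure is not already inside an open transaction, since the isolation level cannot be changed mid-transaction):
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
-- ... existing SELECT / UPDATE logic ...
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;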
Finally, and perhaps most importantly, notice that your procedure performs SELECTs and UPDATEs (in mixed order) on the same table, possibly with the same p_isbn value. Imagine this procedure running concurrently -- it is a perfect deadlock setup.
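One common way to shrink that window, sketched here on the assumption that the procedure body is wrapped in an explicit transaction, is to take the row locks for the ISBN up front with a locking read, so concurrent calls for the same ISBN queue at the start instead of interleaving their SELECTs and UPDATEs:
START TRANSACTION;
-- lock every row for this ISBN first; a second call for the same ISBN waits here
SELECT COUNT(*) INTO @dummy FROM products WHERE isbn = p_isbn FOR UPDATE;
-- ... existing SELECT / UPDATE logic ...
COMMIT;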

Related

Why deleted rows still show up in a subsequent select query in concurrently executed MySQL transaction?

I have the following memberships table with some initial data.
CREATE TABLE memberships (
id SERIAL PRIMARY KEY,
user_id INT,
group_id INT
);
INSERT INTO memberships(user_id, group_id)
VALUES (1, 1), (2, 1), (1, 2), (2, 2);
I have two transactions (repeatable read isolation level) deleting all the rows whose group_id is 2 from the memberships table and retrieving the result using a select query, but the result I get is surprising.
T1  transaction 1: start transaction
T2  transaction 1: delete from memberships where group_id = 2
    transaction 2: start transaction
T3  transaction 2: select * from memberships
    (this is to make MySQL believe that transaction 2 starts before transaction 1 finishes)
T4  transaction 1: select * from memberships
    (this prints only rows whose group_id is 1)
T5  transaction 1: commit
T6  transaction 2: delete from memberships where group_id = 2
T7  transaction 2: select * from memberships
    (surprisingly, this prints all rows including rows whose group_id is 2)
Below is the result I get from T7.
select * from memberships;
+----+---------+----------+
| id | user_id | group_id |
+----+---------+----------+
|  1 |       1 |        1 |
|  2 |       2 |        1 |
|  3 |       1 |        2 |
|  4 |       2 |        2 |
+----+---------+----------+
4 rows in set (0.00 sec)
This is really surprising since this select query is immediately preceded by a delete query which should remove all the rows whose group_id is 2.
I tried this on MySQL 5.7 and 8.0, and both of them have this issue.
I also tried this on Postgres 14 (also repeatable read isolation level), fortunately, Postgres doesn't have this issue. At timestamp T6, I get an error could not serialize access due to concurrent delete.
Can someone explain to me:
Why does MySQL have the issue I described above? How does MySQL implement deletion, and how does it work with MySQL's MVCC scheme?
Why doesn't Postgres have the issue? How does Postgres implement deletion, and how does it work with Postgres's MVCC implementation?
Thanks a lot!
The repeatable read behavior you are seeing is mentioned in the MySQL documentation:
This is the default isolation level for InnoDB. Consistent reads within the same transaction read the snapshot established by the first read.
This means that the repeatable snapshot which the second transaction sees throughout its transaction is established at T3. Keep in mind that repeatable read is the default isolation level for MySQL.
On Postgres, the default isolation level is not repeatable read but read committed, so you have to request repeatable read explicitly, as you did. It is under repeatable read that the delete from the second transaction, interleaving with the first transaction, generates the serialize access error you saw at T6:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
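On the MySQL side, if you want T7 to reflect the committed state rather than the snapshot established at T3, a locking read performs a current read instead of a consistent read. A minimal sketch against the memberships table above:
-- inside transaction 2, after T6
SELECT * FROM memberships FOR UPDATE;  -- current read: shows only the group_id = 1 rows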

MySQL isolation level repeatable reads and atomic increments in updates

Over the last few hours I've studied the documentation about the different SQL transaction isolation levels, found out that MySQL uses repeatable read isolation by default, and did some experiments.
As far as I understand this, selects in an ongoing transaction should see the same data unless the same transaction updates it.
I found a non-repeatable read while using atomic increments (e.g. update test set age=age+1 where id=1).
My test table consists of two columns, id and age, with one row: (1, 20).
Running the following commands in 2 sessions I get a non-repeatable read:
Transaction 1               Transaction 2
---------------             -------------------
begin;                      begin;
select * from test;         select * from test;
+----+-----+                +----+-----+
| id | age |                | id | age |
+----+-----+                +----+-----+
|  1 |  20 |                |  1 |  20 |
+----+-----+                +----+-----+
update test set
  age=age+1 where id=1;
select * from test;         select * from test;
+----+-----+                +----+-----+
| id | age |                | id | age |
+----+-----+                +----+-----+
|  1 |  21 |                |  1 |  20 |
+----+-----+                +----+-----+
commit;
                            select * from test;
                            -- age = 20
                            update test set age=age+1 where id=1;
                            select * from test;
                            -- Expected age=21
                            -- got age=22 => Non-Repeatable Read
Why does the update use a different value than a select would return? Imagine I did a SELECT, incremented the returned value by one, and then UPDATEd the row with it; I would get different results.
The UPDATE operation from the connection in the right-hand column blocks until the transaction on the left completes. If you want repeatable reads on both connections, you'll need to use BEGIN / COMMIT on both connections.
The proper way to run such code is to add FOR UPDATE to the end of the first SELECT. Without that, you are asking for trouble like you found.
What I think happened is (in the right-hand connection):
the second SELECT on the right did a "repeatable read" and got only 20.
the UPDATE saw that 21 had been committed, so it bumped it to 22.
the third SELECT knew that you had changed the row, so it reread it, getting 22.
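A sketch of that locking-read pattern (using the same test table; the user variable @age is just for illustration):
BEGIN;
SELECT age INTO @age FROM test WHERE id=1 FOR UPDATE;  -- locks the row and reads the latest committed value
UPDATE test SET age = @age + 1 WHERE id=1;
COMMIT;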

MySQL ALTER TABLE taking long in small table

I have two tables in my scenario
table1, which has about 20 tuples
table2, which has about 3 million tuples
table2 has a foreign key referencing table1's "ID" column.
When I try to execute the following query:
ALTER TABLE table1 MODIFY vccolumn VARCHAR(1000);
It takes forever. Why is it taking that long? I have read that it should not, because it only has 20 tuples.
Is there any way to speed it up without server downtime? The query is also locking the table.
I would guess the ALTER TABLE is waiting on a metadata lock, and it has not actually started altering anything.
What is a metadata lock?
When you run any query like SELECT/INSERT/UPDATE/DELETE against a table, the query must acquire a metadata lock on that table. Those queries do not block each other. Any number of queries of that type can hold a metadata lock.
But a DDL statement like CREATE/ALTER/DROP/TRUNCATE/RENAME, or even CREATE TRIGGER or LOCK TABLES, must acquire an exclusive metadata lock. If any transaction still holds a metadata lock on the table, the DDL statement waits.
You can demonstrate this. Open two terminal windows and open the mysql client in each window.
Window 1: CREATE TABLE foo ( id int primary key );
Window 1: START TRANSACTION;
Window 1: SELECT * FROM foo; -- it doesn't matter that the table has no data
Window 2: DROP TABLE foo; -- notice it waits
Window 1: SHOW PROCESSLIST;
+-----+------+-----------+------+---------+------+---------------------------------+------------------+-----------+---------------+
| Id  | User | Host      | db   | Command | Time | State                           | Info             | Rows_sent | Rows_examined |
+-----+------+-----------+------+---------+------+---------------------------------+------------------+-----------+---------------+
| 679 | root | localhost | test | Query   |    0 | starting                        | show processlist |         0 |             0 |
| 680 | root | localhost | test | Query   |    4 | Waiting for table metadata lock | drop table foo   |         0 |             0 |
+-----+------+-----------+------+---------+------+---------------------------------+------------------+-----------+---------------+
You can see the drop table waiting for the table metadata lock. Just waiting. How long will it wait? Until the transaction in window 1 completes. Eventually it will time out after lock_wait_timeout seconds (by default, this is set to 1 year).
Window 1: COMMIT;
Window 2: Notice it stops waiting, and it immediately drops the
table.
So what can you do? Make sure there are no long-running transactions blocking your ALTER TABLE. Even a transaction that ran a quick SELECT against your table earlier will hold its metadata lock until the transaction commits.
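A quick way to look for such transactions, as a sketch (information_schema.innodb_trx is a standard view; on 5.7+ the sys schema's schema_table_lock_waits view shows metadata-lock waits directly):
-- list open InnoDB transactions, oldest first, with the thread id you could KILL if necessary
SELECT trx_mysql_thread_id, trx_started, trx_state, trx_query
FROM information_schema.innodb_trx
ORDER BY trx_started;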

optimized way to calculate compliance in mysql

I have a table which contains a task list for persons. The columns are as follows:
+---------+-----------+-------------------+------------+---------------------+
| task_id | person_id | task_name         | status     | due_date_time       |
+---------+-----------+-------------------+------------+---------------------+
|       1 |       111 | walk 20 min daily | INCOMPLETE | 2017-04-13 17:20:23 |
|       2 |       111 | brisk walk 30 min | COMPLETE   | 2017-03-14 20:20:54 |
|       3 |       111 | take medication   | COMPLETE   | 2017-04-20 15:15:23 |
|       4 |       222 | sport             | COMPLETE   | 2017-03-18 14:45:10 |
+---------+-----------+-------------------+------------+---------------------+
I want to find the monthly compliance percentage (completed tasks / total tasks * 100) of each person, like:
+---------------+-----------+------------+------------+
| compliance_id | person_id | compliance | month      |
+---------------+-----------+------------+------------+
|             1 |       111 |        100 | 2017-03-01 |
|             2 |       111 |         50 | 2017-04-01 |
|             3 |       222 |        100 | 2017-03-01 |
+---------------+-----------+------------+------------+
Here person_id 111 has one task in March 2017 (due 2017-03-14) whose status is COMPLETE; since 1 out of 1 tasks was completed in March, the compliance is 100%.
Currently, I am using a separate table which stores this compliance, but I have to recalculate the compliance and update that table every time a task's status changes.
I have also tried creating a view, but it takes too much time to execute: almost 0.5 seconds for 1 million records.
CREATE VIEW `person_compliance_view` AS
SELECT
`t`.`person_id`,
CAST((`t`.`due_date_time` - INTERVAL (DAYOFMONTH(`t`.`due_date_time`) - 1) DAY)
AS DATE) AS `month`,
COUNT(`t`.`status`) AS `total_count`,
COUNT((CASE
WHEN (`t`.`status` = 'COMPLETE') THEN 1
END)) AS `completed_count`,
CAST(((COUNT((CASE
WHEN (`t`.`status` = 'COMPLETE') THEN 1
END)) / COUNT(`t`.`status`)) * 100)
AS DECIMAL (10 , 2 )) AS `compliance`
FROM
`task` `t`
WHERE
((`t`.`isDeleted` = 0)
AND (`t`.`due_date_time` < NOW()))
GROUP BY `t`.`person_id` , EXTRACT(YEAR_MONTH FROM `t`.`due_date_time`);
Is there any optimized way to do it?
The first question to consider is whether the view can be optimized to give the required performance. This may mean making some changes to the underlying tables and data structure. For example, you might want indexes and you should check query plans to see where they would be most effective.
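For example, a composite index matching the view's WHERE and GROUP BY might look like the sketch below (column names are taken from the view above; verify the benefit with EXPLAIN on your data):
-- equality column first, then the grouping column, then the range column
ALTER TABLE task ADD INDEX idx_deleted_person_due (isDeleted, person_id, due_date_time);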
Other possible changes which would improve efficiency include adding an extra column "year_month" to the base table, which you could populate via a trigger. Another possibility would be to move all the deleted tasks to an 'archive' table to give the view less data to search through.
Whatever you do, a view will always perform worse than a table (assuming the table has relevant indexes). So depending on your needs you may find you need to use a table. That doesn't mean you should junk your view entirely. For example, if a daily refresh of your table is sufficient, you could use your view to help:
truncate table compliance;
insert into compliance select * from compliance_view;
Truncate is more efficient than delete, but it can't be rolled back, so you might prefer to use delete and top-and-tail the refresh with START TRANSACTION; ... COMMIT;. I've never created scheduled jobs in MySQL, but if you need help, this looks like a good starting point: here
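As a rough sketch of such a scheduled job using MySQL's built-in event scheduler (the event name is made up, event_scheduler must be ON, and the column list assumes the compliance table shown in the question):
DELIMITER //
CREATE EVENT refresh_compliance_daily
ON SCHEDULE EVERY 1 DAY
DO
BEGIN
  TRUNCATE TABLE compliance;
  INSERT INTO compliance (person_id, compliance, month)
    SELECT person_id, compliance, month FROM person_compliance_view;
END//
DELIMITER ;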
If daily isn't often enough, you could schedule this to run more often than daily, but better options would be triggers and/or "partial refreshes" (my term; I've no idea if there is a technical term for the idea).
A perfectly written trigger would spot any relevant insert/update/delete and then insert/update/delete the related records in the compliance table. The logic is a little daunting, and I won't attempt it here. An easier option would be a "partial refresh" called within a trigger. The trigger would spot the user targeted by the change, delete only the records from compliance which relate to that user, and then insert from your compliance_view the records relating to that user. You should be able to put that into a stored procedure which is called by the trigger.
Update expanding on the options (if a view just won't do):
Option 1: Daily full (or more frequent) refresh via a schedule
You'd want code like this executed (at least) daily.
truncate table compliance;
insert into compliance select * from compliance_view;
Option 2: Partial refresh via trigger
I don't work with triggers often, so I can't recall the exact syntax, but the logic should be as follows (not actual code, just pseudo-code; a more concrete MySQL sketch follows it)
AFTER INSERT -- you may need one for each of INSERT / UPDATE / DELETE
FOR EACH ROW -- or if there are multiple rows and you can trigger only on the last one to be changed, that would be better
DELETE FROM compliance
WHERE person_id = INSERTED.person_id
INSERT INTO compliance select * from compliance_view where person_id = INSERTED.person_id
END
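A concrete MySQL version of that pseudo-code might look something like this (a sketch only: it assumes the task and compliance tables shown above, that compliance_id is auto-generated, and that matching AFTER UPDATE and AFTER DELETE triggers would be written the same way):
DELIMITER //
CREATE TRIGGER task_after_insert AFTER INSERT ON task
FOR EACH ROW
BEGIN
  -- throw away this person's cached rows and rebuild them from the view
  DELETE FROM compliance WHERE person_id = NEW.person_id;
  INSERT INTO compliance (person_id, compliance, month)
    SELECT person_id, compliance, month
    FROM person_compliance_view
    WHERE person_id = NEW.person_id;
END//
DELIMITER ;
Note that this rebuilds the person's compliance rows on every inserted row, so it can be slow for bulk loads.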
Option 3: Smart update via trigger
This would be similar to option 2, but instead of deleting all the rows from compliance that relate to the relevant person_id and creating them from scratch, you'd work out which ones to update, update them, and decide whether any should be added or deleted. The logic is a little involved, and I'm not going to attempt it here.
Personally, I'd be most tempted by Option 2, but you'd need to combine it with option 1, since the data goes stale due to the use of now().
Here's a similar way of writing the same thing...
Views are of very limited benefit in MySQL, and I think should generally be avoided.
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(task_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,person_id INT NOT NULL
,task_name VARCHAR(30) NOT NULL
,status ENUM('INCOMPLETE','COMPLETE') NOT NULL
,due_date_time DATETIME NOT NULL
);
INSERT INTO my_table VALUES
(1,111,'walk 20 min daily','INCOMPLETE','2017-04-13 17:20:23'),
(2,111,'brisk walk 30 min','COMPLETE','2017-03-14 20:20:54'),
(3,111,'take medication','COMPLETE','2017-04-20 15:15:23'),
(4,222,'sport','COMPLETE','2017-03-18 14:45:10');
SELECT person_id
, DATE_FORMAT(due_date_time,'%Y-%m') yearmonth
, SUM(status = 'complete')/COUNT(*) x
FROM my_table
GROUP
BY person_id
, yearmonth;
person_id yearmonth x
111 2017-03 1.0
111 2017-04 0.5
222 2017-03 1.0

Profile Stored procedures in MySQL

I am working with MySQL and using stored procedures. I have a profiling tool that I am using to profile the code that communicates with MySQL through the stored procedures and I was wondering if there was a tool or capability within MySQL client to profile stored procedure executions. What I have in mind is something that's similar to running queries with profiling turned on. I am using MySQL 5.0.41 on Windows XP.
Thanks in advance.
There is a wonderfully detailed article about such profiling: http://mablomy.blogspot.com/2015/03/profiling-stored-procedures-in-mysql-57.html
As of MySQL 5.7, you can use performance_schema to get information about the duration of every statement in a stored procedure. Simply:
1) Activate the profiling (use "NO" afterward if you want to disable it)
UPDATE performance_schema.setup_consumers SET ENABLED="YES"
WHERE NAME = "events_statements_history_long";
2) Run the procedure
CALL test('with parameters', '{"if": "needed"}');
3) Query the performance schema to get the overall event information
SELECT event_id,sql_text,
CONCAT(TIMER_WAIT/1000000000,"ms") AS time
FROM performance_schema.events_statements_history_long
WHERE event_name="statement/sql/call_procedure";
| event_id | sql_text       | time        |
| 2432     | CALL test(...) | 1726.4098ms |
4) Get the detailed information for the event you want to profile
SELECT EVENT_NAME, SQL_TEXT,
CONCAT(TIMER_WAIT/1000000000,"ms") AS time
FROM performance_schema.events_statements_history_long
WHERE nesting_event_id=2432 ORDER BY event_id;
| EVENT_NAME        | SQL_TEXT                               | time     |
| statement/sp/stmt | ... 1 query of the procedure ...       | 4.6718ms |
| statement/sp/stmt | ... another query of the procedure ... | 4.6718ms |
| statement/sp/stmt | ... another etc ...                    | 4.6718ms |
This way, you can tell which query takes the longest time in your procedure call.
I don't know any tool that would turn this resultset into a KCachegrind friendly file or so.
Note that this should not be activated on a production server: it can be a performance issue and a data-size bump, and since performance_schema.events_statements_history_long holds the procedure's parameter values, it can also be a security issue (if a procedure parameter is an end user's email or password, for instance).
You can turn on slow query logging within MySQL.
Take a look at this other SO question:
MYSQL Slow Query
Depending on which version, you may actually be able to set the value to zero, so every single query in the DB is shown in the slow query log.
See here for additional details:
http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#sysvar_long_query_time
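As a sketch of that idea on a reasonably recent server (5.1 and later; on the 5.0.41 mentioned in the question the slow query log has to be enabled at startup with --log-slow-queries and long_query_time cannot go below 1 second):
SET GLOBAL slow_query_log = 'ON';  -- 5.1+: enable the slow query log at runtime
SET GLOBAL long_query_time = 0;    -- 0-second threshold: log (effectively) every query
-- the GLOBAL value applies to connections opened after the change;
-- remember to restore the previous values, since logging everything gets expensive fast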