MySQL query misses some rows occasionally

I'm having this problem; here is my SQL statement:
select * from tb1 where id > the_max_read order by id
The table tb1 monitors changes to some other tables, so it keeps growing.
The variable the_max_read is the maximum id the program has already read.
I'm running this SQL from C++ using MySQL's mysql_query function and saving the result with mysql_store_result.
The storage engine is InnoDB.
The problem is that it sometimes misses rows; not always, but it keeps happening.
For example, say I have this table:
| id     | name  |
| ------ | ----- |
| 834370 | name1 |
| 834371 | name2 |
| 834372 | name3 |
| 834373 | name4 |
| 834374 | name5 |
| 834375 | name6 |
With the_max_read = 834371, running the SQL above returns only 834374 and 834375; rows 834372 and 834373 are missing.
Other programs may be inserting new rows into this table, but I still cannot understand why it misses some rows; it's about the simplest SQL there is.

This sounds like it might be a transaction issue, where you read before some of the transactions are committed.
Try reading uncommitted data: http://dev.mysql.com/doc/refman/5.1/en/set-transaction.html
e.g.
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
select * from tb1 where id > the_max_read order by id;
SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
Hope that helps.
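For what it's worth, here is a minimal two-session sketch (the session labels and inserted values are illustrative) of how the gap can appear even though every insert eventually commits: auto-increment ids are handed out at INSERT time, but the transactions may commit in a different order.
-- Session A
BEGIN;
INSERT INTO tb1 (name) VALUES ('x');  -- allocates id 834372, not yet visible to other sessions
-- Session B
BEGIN;
INSERT INTO tb1 (name) VALUES ('y');  -- allocates id 834373
COMMIT;                               -- 834373 is now visible
-- The poller runs here: WHERE id > 834371 returns only 834373,
-- so the program advances the_max_read to 834373.
-- Session A, later
COMMIT;  -- 834372 finally becomes visible, but the poller never looks below the_max_read again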

Related

Concurrent scripts pulling jobs to do from MySQL table [duplicate]

Say I have multiple workers that can concurrently read and write against a MySQL table (e.g. jobs). The task for each worker is:
Find the oldest QUEUED job
Set its status to RUNNING
Return the corresponding ID.
Note that there may not be any qualifying (i.e. QUEUED) jobs when a worker runs step #1.
I have the following pseudo-code so far. I believe I need to cancel (ROLLBACK) the transaction if step #1 returns no jobs. How would I do that in the code below?
BEGIN TRANSACTION;
# Update the status of jobs fetched by this query:
SELECT id from jobs WHERE status = "QUEUED"
ORDER BY created_at ASC LIMIT 1;
# Do the actual update, otherwise abort (i.e. ROLLBACK?)
UPDATE jobs
SET status="RUNNING"
# HERE: Not sure how to make this conditional on the previous ID
# WHERE id = <ID from the previous SELECT>
COMMIT;
I am implementing something very similar to your case this week. A number of workers, each grabbing the "next" row in a set of rows to work on.
The pseudocode is something like this:
BEGIN;
SELECT id INTO @id FROM mytable WHERE status = 'QUEUED' LIMIT 1 FOR UPDATE;
UPDATE mytable SET status = 'RUNNING' WHERE id = @id;
COMMIT;
Using FOR UPDATE is important to avoid race conditions, i.e. more than one worker trying to grab the same row.
See https://dev.mysql.com/doc/refman/8.0/en/select-into.html for information about SELECT ... INTO.
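A hedged extension of this sketch to cover the "no QUEUED rows" case from the question (table and column names follow the pseudocode above): SELECT ... INTO leaves a user variable unchanged when the query returns no row, so initialize it to NULL first and the UPDATE becomes a harmless no-op.
BEGIN;
SET @id = NULL;  -- so a value from a previous iteration cannot leak in
SELECT id INTO @id FROM mytable
 WHERE status = 'QUEUED'
 ORDER BY created_at
 LIMIT 1
 FOR UPDATE;
UPDATE mytable SET status = 'RUNNING' WHERE id = @id;  -- WHERE id = NULL matches no rows
COMMIT;  -- commits the claim; when no job was found there is nothing to roll back
SELECT @id;  -- NULL means "no qualifying job"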
It's still not quite clear what you are after. But assuming your task is: find the next QUEUED job, set its status to RUNNING and select the corresponding ID.
In a single-threaded environment, you can just use your code. Fetch the selected ID into a variable in your application code and pass it to the UPDATE query in the WHERE clause. You don't even need a transaction, since there is only one writing statement. You can mimic this in an SQL script.
Assuming this is your current state:
| id | created_at | status |
| --- | ------------------- | -------- |
| 1 | 2020-06-15 12:00:00 | COMPLETED |
| 2 | 2020-06-15 12:00:10 | QUEUED |
| 3 | 2020-06-15 12:00:20 | QUEUED |
| 4 | 2020-06-15 12:00:30 | QUEUED |
You want to start the next queued job (which has id=2).
SET @id_for_update = (
SELECT id
FROM jobs
WHERE status = 'QUEUED'
ORDER BY id
LIMIT 1
);
UPDATE jobs
SET status = 'RUNNING'
WHERE id = @id_for_update;
SELECT @id_for_update;
You will get
| @id_for_update |
| -------------- |
| 2 |
from the last select. And the table will have this state:
| id | created_at | status |
| --- | ------------------- | -------- |
| 1 | 2020-06-15 12:00:00 | COMPLETED |
| 2 | 2020-06-15 12:00:10 | RUNNING |
| 3 | 2020-06-15 12:00:20 | QUEUED |
| 4 | 2020-06-15 12:00:30 | QUEUED |
If you have multiple processes that start jobs, you would need to lock the row with FOR UPDATE. But that can be avoided by using LAST_INSERT_ID():
Starting from the state above, with job 2 already running:
UPDATE jobs
SET status = 'RUNNING',
id = LAST_INSERT_ID(id)
WHERE status = 'QUEUED'
ORDER BY id
LIMIT 1;
SELECT LAST_INSERT_ID(), ROW_COUNT();
You will get:
| LAST_INSERT_ID() | ROW_COUNT() |
| ---------------- | ----------- |
| 3 | 1 |
And the new state is:
| id | created_at | status |
| --- | ------------------- | -------- |
| 1 | 2020-06-15 12:00:00 | COMPLETED |
| 2 | 2020-06-15 12:00:10 | RUNNING |
| 3 | 2020-06-15 12:00:20 | RUNNING |
| 4 | 2020-06-15 12:00:30 | QUEUED |
If the UPDATE statement affected no rows (there were no queued rows), ROW_COUNT() will be 0.
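A worker can therefore detect the empty-queue case in the same round trip; a small sketch (the claimed_id alias is illustrative):
UPDATE jobs
SET status = 'RUNNING',
    id = LAST_INSERT_ID(id)
WHERE status = 'QUEUED'
ORDER BY id
LIMIT 1;
-- Guard against a stale LAST_INSERT_ID() left over from an earlier claim:
SELECT IF(ROW_COUNT() = 0, NULL, LAST_INSERT_ID()) AS claimed_id;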
There might be some risks I am not aware of, but this is also not really how I would approach it. I would rather store more information in the jobs table. A simple example:
CREATE TABLE jobs (
id INT auto_increment primary key,
created_at timestamp not null default now(),
updated_at timestamp not null default now() on update now(),
status varchar(50) not null default 'QUEUED',
process_id varchar(50) null default null
);
and
UPDATE jobs
SET status = 'RUNNING',
process_id = 'some_unique_pid'
WHERE status = 'QUEUED'
ORDER BY id
LIMIT 1;
Now a running job belongs to a specific process and you can just select it with
SELECT * FROM jobs WHERE process_id = 'some_unique_pid';
You might even like to have more information, e.g. queued_at, started_at, finished_at.
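A sketch of such an extended table, following the CREATE TABLE above (the exact columns and defaults are just one possible choice):
CREATE TABLE jobs (
id INT auto_increment primary key,
queued_at timestamp not null default now(),
started_at timestamp null default null,
finished_at timestamp null default null,
status varchar(50) not null default 'QUEUED',
process_id varchar(50) null default null
);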
Adding SKIP LOCKED to the SELECT query, and wrapping the job in a SQL transaction that is committed only when the job is done, avoids jobs getting stuck in status RUNNING when a worker crashes (because the uncommitted transaction rolls back). SKIP LOCKED is now supported in recent versions of the most common DBMSs.
See:
Select only unlocked rows mysql
https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html#innodb-locking-reads-nowait-skip-locked
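A minimal sketch of that pattern on MySQL 8.0+ (table and column names follow the examples above):
BEGIN;
SET @id = NULL;
SELECT id INTO @id FROM jobs
 WHERE status = 'QUEUED'
 ORDER BY id
 LIMIT 1
 FOR UPDATE SKIP LOCKED;  -- rows locked by other workers are skipped instead of waited on
UPDATE jobs SET status = 'RUNNING' WHERE id = @id;
-- ... do the actual work ...
COMMIT;  -- a crash before COMMIT rolls the RUNNING status back automatically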
(This is not an answer to the question, but a list of caveats you need to be aware of when using any of the real answers. Some of these have already been mentioned.)
Replication -- You must do all the locking on the Primary. If you are using a cluster with multiple writable nodes, be aware of the inter-node delays.
Backlog -- When something breaks, you could get a huge list of tasks in the queue. This may lead to some ugly messes.
Number of 'workers' -- Don't have more than a "few" workers. If you try to have, say, 100 concurrent workers, they will stumble over each other and cause nasty problems.
Reaper -- Since a worker may crash, the task assigned to it may never get cleared. Have a TIMESTAMP on the rows so a separate (cron/EVENT/whatever) job can discover which tasks are long overdue and clear them; see the sketch after this list.
If the tasks are fast enough, then the overhead of the queue could be a burden. That is, "Don't queue it, just do it."
You are right to grab the task in one transaction, then later release the task in a separate transaction. Using InnoDB's locking is folly for anything but trivially fast actions.
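For the reaper caveat above, a sketch of such a cleanup job; the 10-minute threshold is arbitrary, and the updated_at/process_id columns are borrowed from the earlier CREATE TABLE example:
-- Re-queue tasks that have been RUNNING suspiciously long:
UPDATE jobs
SET status = 'QUEUED',
    process_id = NULL
WHERE status = 'RUNNING'
  AND updated_at < NOW() - INTERVAL 10 MINUTE;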


MySQL query results returned are semi-random / inconsistently ordered

I'm working with an NDB cluster setup that uses ProxySQL. There are 4 MySQL servers, 4 data nodes, and 2 management nodes. The following happens when I access one of the MySQL servers directly, so I think I can safely rule out ProxySQL as the root cause, but beyond that I'm just lost.
Here's a table I set up to help illustrate my problem:
mysql> describe delain;
+----------+-------------+------+-----+---------+----------------+
| Field    | Type        | Null | Key | Default | Extra          |
+----------+-------------+------+-----+---------+----------------+
| album_id | tinyint(2)  | NO   | PRI | NULL    | auto_increment |
| album    | varchar(30) | YES  |     | NULL    |                |
+----------+-------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
It contains the following data; note that I specified an order by clause:
mysql> select * from delain order by album_id;
+----------+-------------------------+
| album_id | album                   |
+----------+-------------------------+
| 1        | Lucidity                |
| 2        | April Rain              |
| 3        | We Are the Others       |
| 4        | The Human Contradiction |
| 5        | Moonbathers             |
+----------+-------------------------+
5 rows in set (0.00 sec)
If I don't specify an order clause, the results returned are seemingly random, such as this:
mysql> select * from delain;
+----------+-------------------------+
| album_id | album                   |
+----------+-------------------------+
| 3        | We Are the Others       |
| 5        | Moonbathers             |
| 1        | Lucidity                |
| 2        | April Rain              |
| 4        | The Human Contradiction |
+----------+-------------------------+
5 rows in set (0.00 sec)
When I repeat the query (sans order clause) I get a different ordering pretty much every time. It doesn't seem to be truly random, but there sure as heck isn't any sort of discernible pattern to me.
Why is this happening? My experience with MySQL has always been that the default ordering is essentially according to the primary key, but this is also the first time I've used an NDB cluster in particular; I don't know if there's a difference there, or if there's a setting inside a config file that got missed, or what. Any help is greatly appreciated!
This is standard SQL behavior.
https://mariadb.com/kb/en/library/sql-99/order-by-clause/ says in part:
An ORDER BY clause may optionally appear after a query expression: it specifies the order rows should have when returned from that query (if you omit the clause, your DBMS will return the rows in some random order).
(emphasis mine)
It'd be more accurate to say it will return the rows in some arbitrary order, instead of random order. Random implies that the order will change from one execution to the next.
In the case of InnoDB, the order tends to be the index order in which the rows were accessed. The index it reads is not necessarily the primary key. So the order is unchanging and somewhat predictable if you know something about the internals. But it's not random.
In the case of MyISAM, the order tends to be the order the rows are stored in the table, which can vary depending on the order the rows were inserted, and also depending on where there was space in the file at the time of insertion, after row deletions.
In the case of NDB, I don't know as much about its internals, so I can't describe its rule for "default" order, but it's still true that without an explicit ORDER BY, the storage engine is allowed to return rows in whatever order it wants to.
For NDB, the order depends on timing in the case of a
SELECT * FROM table;
A SELECT * FROM table is implemented as a parallelised full table scan within the data nodes and their database threads, with one MySQL thread receiving the results.
So with a filtered query like
SELECT * FROM table WHERE filter_column = 2;
the filter is evaluated in many threads in parallel. Each of those threads returns rows to the MySQL thread in an order that depends on the OS scheduler, networking and many other things. So there is no default ordering unless you use ORDER BY.
So for NDB the order is truly random, not just arbitrary. You'll see this in the NDB test suites using MTR: queries mostly use SELECT * FROM table ORDER BY some_field;

mysql index not optimizing query

I have a MySQL MyISAM table on which I am doing a simple select id from mytable limit 1;. This just freezes the system.
I tried explain select id from mytable limit 1;. Again, it freezes my system. Table demographics: 50k records, 10 MB in size, 2 indexes (auto-increment primary key), 8 columns.
I am clueless why the EXPLAIN statement failed, as it is supposed to display the query plan, nothing else. Neither the table size nor the number of records is enormous, so why is MySQL working so slowly? Rather, what am I missing here?
It was due to a wait state on mytable. eggyal gave me the clue to use SHOW PROCESSLIST. It showed:
+-----+------+-----------------+------+---------+------+---------------------------------+---------------------------------------------+
| Id  | User | Host            | db   | Command | Time | State                           | Info                                        |
+-----+------+-----------------+------+---------+------+---------------------------------+---------------------------------------------+
| 349 | root | localhost:56612 | mydb | Query   | 3582 | Waiting for table metadata lock | ALTER TABLE `mytable` ADD INDEX(`fk_to_02`) |
+-----+------+-----------------+------+---------+------+---------------------------------+---------------------------------------------+
I issued a KILL 349 to terminate that wait chain, and now the EXPLAIN statement works as expected.
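On MySQL 5.7 and later you can also inspect metadata locks directly through performance_schema (assuming the wait/lock/metadata/sql/mdl instrument is enabled; it is on by default only in 8.0+):
SELECT object_schema, object_name, lock_type, lock_status, owner_thread_id
FROM performance_schema.metadata_locks
WHERE object_name = 'mytable';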

MySQL query not going away after being killed

I have a MySQL query that is copying data from one table to another for processing. For some reason, this query that normally takes a few seconds locked up overnight and ran for several hours. When I logged in this morning, I tried to kill the query, but it is still listed in the process list.
+---------+----------+-----------+----+---------+-------+--------------+--------------------------------------------------------------------------------------+
| Id      | User     | Host      | db | Command | Time  | State        | Info                                                                                 |
+---------+----------+-----------+----+---------+-------+--------------+--------------------------------------------------------------------------------------+
| 1061763 | tb_admin | localhost | dw | Killed  | 45299 | Sending data | INSERT INTO email_data_inno_stage SELECT * FROM email_data_test LIMIT 4480000, 10000 |
| 1062614 | tb_admin | localhost | dw | Killed  | 863   | Sending data | INSERT INTO email_data_inno_stage SELECT * FROM email_data_test LIMIT 4480000, 10000 |
+---------+----------+-----------+----+---------+-------+--------------+--------------------------------------------------------------------------------------+
What could have caused this, and how can I kill this process so I can get on with my work?
If the table email_data_test is MyISAM and it was locked, that would have held up the INSERT.
If the table email_data_test is InnoDB, then a lot of MVCC data was being written to the ib_logfiles, which may not have completed yet.
In both cases, the LIMIT clause had to scroll through 4,480,000 rows just to reach the 10,000 rows you actually needed to INSERT.
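A keyset-based copy avoids that scan; a sketch, assuming email_data_test has an indexed id column and the application remembers the last copied id (here the hypothetical @last_copied_id user variable):
-- Copy the next 10,000 rows without scanning past a huge offset:
INSERT INTO email_data_inno_stage
SELECT * FROM email_data_test
WHERE id > @last_copied_id
ORDER BY id
LIMIT 10000;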
Killing the query only causes the InnoDB table email_data_inno_stage to execute a rollback, and rolling back that many inserted rows can itself take a long time; the thread stays in the process list as Killed until the rollback finishes.