Atomically copying one MySQL table over another? - mysql

I am trying to copy one table over another one "atomically". Basically I want to update a table periodically, such that a process that reads from the table will not get an incomplete result if another process is updating the table.
To give some background info, I want a table that acts as a leaderboard for a game. This leaderboard will update every few minutes via a separate process. My thinking is as follows:
Table SCORES contains the publicly-viewable leaderboard that will be read from when a user views the leaderboard. This table is updated every few minutes. The process that updates the leaderboard will create a SCORES_TEMP table that contains the new leaderboard. Once that table is created, I want to copy all of its contents over to SCORES "atomically". I think what I want to do is something like:
TRUNCATE TABLE SCORES;
INSERT INTO SCORES SELECT * FROM SCORES_TEMP;
I want to replace everything in SCORES. I don't need to maintain my primary keys or auto increment values. I just want to bring in all the data from SCORES_TEMP. But I know that if someone views the scores before these 2 statements are done, the leaderboard will be blank. How can I do this atomically, such that it will never show blank or incomplete data? Thanks!

Use RENAME TABLE:
RENAME TABLE old_table TO backup_table, new_table TO old_table;
It's atomic, works on all storage engines, and doesn't have to rebuild the indexes.
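Applied to the leaderboard in the question, the swap might look like the sketch below. It assumes SCORES_TEMP has already been fully populated and has the same structure as SCORES; SCORES_OLD is a hypothetical name for the retired copy.

```sql
-- Remove any leftover backup from a previous run
-- (SCORES_OLD is a hypothetical name, not from the question).
DROP TABLE IF EXISTS SCORES_OLD;

-- Swap both names in one atomic statement: readers see either the old
-- leaderboard or the new one, never an empty or half-filled table.
RENAME TABLE SCORES TO SCORES_OLD, SCORES_TEMP TO SCORES;

-- The previous leaderboard is no longer needed.
DROP TABLE SCORES_OLD;
```

After the swap SCORES_TEMP no longer exists, so the next refresh has to recreate it first, for example with CREATE TABLE SCORES_TEMP LIKE SCORES. SCORES_TEMP must also be an ordinary table rather than one created with CREATE TEMPORARY TABLE, because RENAME TABLE does not work on temporary tables.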

In MySQL, because of the behavior of TRUNCATE, I think you'll need to:
START TRANSACTION;
DELETE FROM SCORES;
INSERT INTO SCORES SELECT * FROM SCORES_TEMP;
COMMIT;
TRUNCATE TABLE is effectively a DDL operation in MySQL and causes an implicit commit, so I'm not sure there's a way to make it transaction-safe.

You may use transactions (for InnoDB):
START TRANSACTION;
DELETE FROM SCORES;
INSERT INTO SCORES SELECT * FROM SCORES_TEMP;
COMMIT;
or LOCK TABLES (for MyISAM):
LOCK TABLES SCORES WRITE, SCORES_TEMP READ;
DELETE FROM SCORES;
INSERT INTO SCORES SELECT * FROM SCORES_TEMP;
UNLOCK TABLES;

I don't know how MySQL deals with transactions, but in T-SQL you could write:
BEGIN TRAN
DELETE FROM SCORES
INSERT INTO SCORES SELECT * FROM SCORES_TEMP
COMMIT TRAN
This way your operation would be "atomic", but not instantaneous.

Related

MySQL concurrency and auto_incrementing key

I have a MySQL table of Users, and a table of Actions performed by the Users (linked to that User by the primary key, userid). The Actions table has an incrementing key indx. Whenever I add a new row to that table, I then update the latest column of the relevant Users row with the indx of the row I just added to the Actions table. So something like:
INSERT INTO actions(indx,actionname,userid) VALUES(default, "myaction", 1);
UPDATE users SET latest=LAST_INSERT_ID() WHERE userid=1;
The idea being that I can check for updates for a User by seeing if the latest is higher than the last time I checked.
My issue is that if more than one connection is opened on the database and they try to add an Action for the same User at the same time, connection2 could conceivably run its INSERT and UPDATE between the INSERT and UPDATE of connection1, and the latest entry of the user they're both trying to update will no longer have the indx of the most recent action entry.
I've been reading up on transactions, isolation levels, etc., but haven't really found a way around this (though my understanding of how these work exactly is pretty shaky, so maybe I just misunderstood). I think I need a way to lock the Actions table until the User table is updated. This application only gets used by a few hundred users tops, so I don't think the performance hit due to momentarily locking the table will be too bad.
So is that something that can be done in MySQL? Is there a better solution? I imagine this general pattern must be pretty common: having one table with a bunch of varieties of rows, and a second table with a row that tracks metadata for each variety in table A and needs to be updated atomically each time that first table is changed. So I'm hoping there's a solution that isn't too complex.
Use SELECT ... FOR UPDATE to lock the user's row, which serializes access for that user and prevents race conditions:
START TRANSACTION;
SELECT any_column FROM users WHERE userid=1 FOR UPDATE;
INSERT INTO actions(indx,actionname,userid) VALUES(default, "myaction", 1);
UPDATE users SET latest=LAST_INSERT_ID() WHERE userid=1;
COMMIT;
However, this will slow down your INSERT rate, because all such transactions for the same user, from all sessions, will be serialized.
The better option is not to store the last ID in the users table at all. Just use SELECT MAX(indx) FROM actions WHERE userid = xxxx wherever this number is required. With an index on actions(userid) this query will be very fast (assuming that the indx column is the primary key of this table), and the inserts will not be slowed down.
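As a rough sketch of that second option, using the column names from the question (actions.indx, actions.userid): the index name and the @last_seen variable below are hypothetical, and the explicit composite index is an assumption that spells out what the query needs.

```sql
-- Composite index so the per-user maximum can be read straight from the index.
-- (idx_actions_userid_indx is a hypothetical name.)
CREATE INDEX idx_actions_userid_indx ON actions (userid, indx);

-- Latest action for user 1, computed on demand instead of stored in users.latest.
SELECT MAX(indx) AS latest FROM actions WHERE userid = 1;

-- Polling for updates: are there actions newer than the last value the client saw?
-- (@last_seen is a hypothetical session variable holding that value.)
SET @last_seen = 0;
SELECT COUNT(*) AS new_actions
FROM actions
WHERE userid = 1 AND indx > @last_seen;
```

Because nothing is written to users on each insert, there is no second statement that can interleave with another connection's work.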

Handling multiple MySql queries (Deleting and Copy)

Good morning.
I have a table in a MySQL database.
Five robots write to this table, about 10 records each per hour.
Every 3 months a script that I have created makes a copy of the table and then deletes all the table entries (this way I can keep the IDs in a certain order).
My question is this.
These are two separate statements:
CREATE TABLE omologationResult_{date} AS SELECT * FROM omologationResult;
DELETE FROM omologationResult;
If the script is copying the table at point 0 and a record is added by the robots, there's no problem, because the SQL statement runs from the lowest ID to the end. But what happens if the script is deleting from the table while a robot is writing to it? Do I lose the robot's last record?
And if so, what can I do to make a copy of the table and then remove only the data that I've copied?
Thank you
Yes, this is not a safe operation because it's not atomic. It's quite possible for another thread to insert values into that table in between your CREATE .. SELECT and the DELETE. One option you have is to use a multi-table DELETE:
CREATE TABLE omologationResult_{date} AS SELECT * FROM omologationResult;
DELETE omologationResult FROM omologationResult
INNER JOIN omologationResult_{date} ON omologationResult_{date}.id = omologationResult.id;
This ensures that only rows that exist in both tables are deleted from omologationResult.
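If the script should build the date-stamped table name inside MySQL itself, the two statements can be assembled as prepared statements. This is only a sketch: the YYYYMMDD suffix format and the variable names (@archive, @sql) are assumptions, not something from the question.

```sql
-- Build today's archive table name. The YYYYMMDD suffix is an assumed format.
SET @archive = CONCAT('omologationResult_', DATE_FORMAT(CURDATE(), '%Y%m%d'));

-- Step 1: copy the current rows into the archive table.
SET @sql = CONCAT('CREATE TABLE ', @archive,
                  ' AS SELECT * FROM omologationResult');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

-- Step 2: delete only the rows that made it into the archive.
SET @sql = CONCAT('DELETE o FROM omologationResult AS o ',
                  'INNER JOIN ', @archive, ' AS a ON a.id = o.id');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

Rows the robots insert between the two steps are not in the archive, so the join does not match them and they stay in omologationResult. A table created with CREATE TABLE ... AS SELECT has no indexes, so on a large table it may be worth adding an index on the archive's id column before the DELETE.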

Performance of mysql counting rows in a big table

This fairly obvious question has very few (I couldn't find any) solid answers.
I do a simple select from a table of 2 million rows.
select count(id) as total from big_table
On every machine I try, this query usually takes at least 5 seconds to complete. This is unacceptable for realtime queries.
The reason I need an exact count of rows is for precise statistical calculations later on.
Using the last auto increment value is unfortunately not an option because rows also get deleted periodically.
It can indeed be slow when running on an InnoDB engine. As stated in section 14.24 of the MySQL 5.7 Reference Manual, “InnoDB Restrictions and Limitations”, 3rd bullet point:
InnoDB does not keep an internal count of rows in a table because concurrent transactions might “see” different numbers of rows at the same time. Consequently, SELECT COUNT(*) statements only count rows visible to the current transaction.
For information about how InnoDB processes SELECT COUNT(*) statements, refer to the COUNT() description in Section 12.20.1, “Aggregate Function Descriptions”.
The suggested solution is a counter table. This is a separate table with one row and column, having the current record count. It could be kept updated via triggers. Something like this:
create table big_table_count (rec_count int default 0);
-- one-shot initialisation:
insert into big_table_count select count(*) from big_table;
create trigger big_insert after insert on big_table
for each row
update big_table_count set rec_count = rec_count + 1;
create trigger big_delete after delete on big_table
for each row
update big_table_count set rec_count = rec_count - 1;
After running some insert/delete statements against big_table, you can see the effect with:
select rec_count from big_table_count;
You could extend this to several tables, either by creating such a counter table for each, or by reserving a row per table in the above counter table. It would then be keyed by a column "table_name".
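A keyed counter table along those lines could be sketched as follows. The names table_counts, big_table_count_ins and big_table_count_del are hypothetical, and these triggers are meant to take the place of the single-table triggers shown earlier rather than to run alongside them.

```sql
-- One shared counter table, one row per counted table.
CREATE TABLE table_counts (
  table_name VARCHAR(64) NOT NULL PRIMARY KEY,
  rec_count  BIGINT NOT NULL DEFAULT 0
);

-- One-shot initialisation for big_table.
INSERT INTO table_counts (table_name, rec_count)
SELECT 'big_table', COUNT(*) FROM big_table;

-- Keep the row for big_table in step with inserts and deletes.
CREATE TRIGGER big_table_count_ins AFTER INSERT ON big_table
FOR EACH ROW
UPDATE table_counts SET rec_count = rec_count + 1
WHERE table_name = 'big_table';

CREATE TRIGGER big_table_count_del AFTER DELETE ON big_table
FOR EACH ROW
UPDATE table_counts SET rec_count = rec_count - 1
WHERE table_name = 'big_table';

-- Reading the count for one table.
SELECT rec_count FROM table_counts WHERE table_name = 'big_table';
```

Each additional table needs its own initialisation row and its own pair of triggers with the matching table_name value.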
Improving concurrency
The above method does have an impact if you have many concurrent sessions inserting or deleting records, because they need to wait for each other to complete the update of the counter.
A solution is to not let the triggers update the same, single record, but to let them insert a new record, like this:
create trigger big_insert after insert on big_table
for each row
insert into big_table_count (rec_count) values (1);
create trigger big_delete after delete on big_table
for each row
insert into big_table_count (rec_count) values (-1);
The way to get the count then becomes:
select sum(rec_count) from big_table_count;
Then, once in a while (e.g. daily) you should re-initialise the counter table to keep it small:
truncate table big_table_count;
insert into big_table_count select count(*) from big_table;

Storing Only 10 rows in database [duplicate]

I need to set a maximum limit of rows in my MySQL table. The documentation tells us that one can use the following SQL code to create a table:
CREATE TABLE `table_with_limit` (
`id` int(11) DEFAULT NULL
) ENGINE=InnoDB MAX_ROWS=100000;
But the MAX_ROWS property is not a hard limit ("store not more than 100 000 rows and delete the others") but a hint for the database engine that this table will have AT LEAST 100 000 rows.
The only possible way I see to solve the problem is to use a BEFORE INSERT trigger which will check the count of rows in the table and delete the older rows. But I'm pretty sure that this is a huge overhead :/
Another solution is to clear the table with a cron script every N minutes. This is the simplest way, but it still needs another system to watch over it.
Does anyone know a better solution? :)
Try to make a restriction on adding a new record to a table. Raise an error when a new record is going to be added.
DELIMITER $$
CREATE TRIGGER trigger1
BEFORE INSERT
ON table1
FOR EACH ROW
BEGIN
SELECT COUNT(*) INTO @cnt FROM table1;
IF @cnt >= 25 THEN
CALL sth(); -- raise an error
END IF;
END
$$
DELIMITER ;
Note that the COUNT operation may be slow on big InnoDB tables.
On MySQL 5.5 and later you can use the SIGNAL / RESIGNAL statements to raise an error.
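With SIGNAL available, the same trigger can raise the error directly. The sketch below keeps the table name and the limit of 25 from the trigger above; the trigger name and the message text are made up for illustration.

```sql
DELIMITER $$
CREATE TRIGGER table1_row_cap      -- hypothetical trigger name
BEFORE INSERT
ON table1
FOR EACH ROW
BEGIN
  DECLARE cnt INT;
  -- Count the rows already in the table.
  SELECT COUNT(*) INTO cnt FROM table1;
  IF cnt >= 25 THEN
    -- SQLSTATE '45000' is the generic "unhandled user-defined exception" state.
    SIGNAL SQLSTATE '45000'
      SET MESSAGE_TEXT = 'table1 has reached its row limit';
  END IF;
END
$$
DELIMITER ;
```

An INSERT that would exceed the limit then fails with that message instead of adding a row.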
Create a table with 100,000 rows.
Pre-fill one of the fields with a "time-stamp" in the past.
Select the oldest record and update its "time-stamp" when "creating" (updating) a record.
Only use select and update; never use insert or delete.
An index on the "time-stamp" field makes the select/update fast.
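A small sketch of this fixed-slot idea, scaled down to three slots so it fits here. The table and column names (ring_log, slot, updated_at, payload) are hypothetical, and DATETIME(6) with NOW(6) assumes a MySQL version with fractional-second support (5.6.4 or later).

```sql
-- Fixed set of slots; rows are only ever updated, never inserted or deleted.
CREATE TABLE ring_log (
  slot       INT NOT NULL PRIMARY KEY,
  updated_at DATETIME(6) NOT NULL,
  payload    VARCHAR(255) NULL,
  KEY idx_updated_at (updated_at)
);

-- Pre-fill every slot once with a timestamp in the past
-- (three slots here; the answer suggests 100,000).
INSERT INTO ring_log (slot, updated_at) VALUES
  (1, '2000-01-01 00:00:00'),
  (2, '2000-01-01 00:00:00'),
  (3, '2000-01-01 00:00:00');

-- "Insert" a record by overwriting the oldest slot.
-- The slot number breaks ties between equal timestamps.
UPDATE ring_log
SET payload = 'new record', updated_at = NOW(6)
ORDER BY updated_at ASC, slot ASC
LIMIT 1;
```

The table never grows beyond the pre-filled number of rows, at the cost of silently overwriting the oldest entry.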
There is no way to limit the maximum number of rows in a MySQL table, unless you write a trigger to do that.
I'm just making up an answer off the top of my head. My assumption is that you want something like a 'bucket' where you put in records, and that you want to empty it before it hits a certain record count.
After an insert statement, run SELECT LAST_INSERT_ID(); which will get you the auto-increment id of the record. Yes, you still have to run an extra query, but it is cheap. Once you reach a certain count, truncate the table and reset the auto-increment id.
Otherwise you can't have a 'capped' table in MySQL, as you would have to have pre-defined actions (do we reject the record, do we truncate the table, etc.).

How to atomically move rows from one table to another?

I am collecting readings from several thousand sensors and storing them in a MySQL database. There are several hundred inserts per second. To improve the insert performance I am storing the values initially into a MEMORY buffer table. Once a minute I run a stored procedure which moves the inserted rows from the memory buffer to a permanent table.
Basically I would like to do the following in my stored procedure to move the rows from the temporary buffer:
INSERT INTO data SELECT * FROM data_buffer;
DELETE FROM data_buffer;
Unfortunately the previous is not usable because the data collection processes insert additional rows in "data_buffer" between INSERT and DELETE above. Thus those rows will get deleted without getting inserted to the "data" table.
How can I make the operation atomic or make the DELETE statement to delete only the rows which were SELECTed and INSERTed in the preceding statement?
I would prefer doing this in a standard way which works on different database engines if possible.
I would prefer not adding any additional "id" columns because of performance overhead and storage requirements.
I wish there was SELECT_AND_DELETE or MOVE statement in standard SQL or something similar...
I believe this will work, but it will block until the insert is done:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
START TRANSACTION;
INSERT INTO data (SELECT * FROM data_buffer FOR UPDATE);
DELETE FROM data_buffer;
COMMIT;
A possible way to avoid all those problems, and to also stay fast, would be to use two data_buffer tables (let's call them data_buffer1 and data_buffer2); while the collection processes insert into data_buffer1, you can do the insert and delete on data_buffer2; then you switch, so collected data goes into data_buffer2, while data is inserted+deleted from data_buffer1 into data.
How about having a row id: get the max value before the insert, do the insert, and then delete the records with id <= that max value?
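A minimal sketch of that idea, assuming data_buffer has an AUTO_INCREMENT column named id (the question would prefer to avoid adding one) and that data has a matching column layout; @max_id is a hypothetical session variable.

```sql
-- Remember the highest id present right now (NULL if the buffer is empty,
-- in which case the two statements below match no rows).
SET @max_id = (SELECT MAX(id) FROM data_buffer);

-- Copy exactly the rows up to that id...
INSERT INTO data SELECT * FROM data_buffer WHERE id <= @max_id;

-- ...and delete exactly the same rows; rows inserted later get a larger id
-- and are left for the next run.
DELETE FROM data_buffer WHERE id <= @max_id;
```

This relies on ids being handed out in increasing order and rows becoming visible in that order, which fits a non-transactional MEMORY buffer. On a transactional engine a row with a smaller id could be committed after the maximum was read, so the sketch would need extra care there.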
This is a similar solution to @ammoQ's answer. The difference is that instead of having the INSERTing process figure out which table to write to, you can transparently swap the tables in the scheduled procedure.
Use RENAME in the scheduled procedure to swap tables:
CREATE TABLE IF NOT EXISTS data_buffer_new LIKE data_buffer;
RENAME TABLE data_buffer TO data_buffer_old, data_buffer_new TO data_buffer;
INSERT INTO data SELECT * FROM data_buffer_old;
DROP TABLE data_buffer_old;
This works because the RENAME statement swaps the tables atomically, so the INSERTing processes will not fail with "table not found". This is MySQL-specific though.
I assume the tables are identical, with the same columns and primary key(s)? If that is the case, you could nest a SELECT inside a WHERE clause, something like this:
DELETE FROM data_buffer
WHERE primarykey IN (SELECT primarykey FROM data);
This is a MySQL-specific solution. You can use locking to prevent the INSERTing processes from adding new rows while you are moving rows.
The code which moves the rows (run from a client session, since MySQL does not permit LOCK TABLES inside stored programs) should be as follows:
LOCK TABLES data_buffer WRITE, data WRITE;
INSERT INTO data SELECT * FROM data_buffer;
DELETE FROM data_buffer;
UNLOCK TABLES;
The code which INSERTs new rows in the buffer should be changed as follows:
LOCK TABLE data_buffer WRITE;
INSERT INTO data_buffer VALUES (1, 2, 3);
UNLOCK TABLE;
The INSERT process will obviously block while the lock is in place.