Quickest way to delete enormous MySQL table

I have an enormous MySQL (InnoDB) database with millions of rows in the sessions table that were created by an unrelated, malfunctioning crawler running on the same server as ours. Unfortunately, I have to fix the mess now.
If I try to truncate table sessions; it seems to take an inordinately long time (upwards of 30 minutes). I don't care about the data; I just want to have the table wiped out as quickly as possible. Is there a quicker way, or will I have to just stick it out overnight?

(As this turned up high in Google's results, I thought a little more instruction might be handy.)
MySQL has a convenient way to create empty tables like existing tables, and an atomic table rename command. Together, this is a fast way to clear out data:
CREATE TABLE new_foo LIKE foo;
RENAME TABLE foo TO old_foo, new_foo TO foo;
DROP TABLE old_foo;
Done

The quickest way is to use DROP TABLE to drop the table completely and recreate it using the same definition. If you have no foreign key constraints on the table then you should do that.
If you're using a MySQL version later than 5.0.3, this will happen automatically with a TRUNCATE. You might also get some useful information out of the manual, which describes how TRUNCATE works with FK constraints. http://dev.mysql.com/doc/refman/5.0/en/truncate-table.html
EDIT: TRUNCATE is not the same as a drop or a DELETE FROM. For those that are confused about the differences, please check the manual link above. TRUNCATE will act the same as a drop if it can (if there are no FK's), otherwise it acts like a DELETE FROM with no where clause.
EDIT: If you have a large table, your MariaDB/MySQL is running with binlog_format set to ROW, and you execute a DELETE without a predicate/WHERE clause, replication will struggle to keep up, and Galera nodes may even hit a flow control state. The binary logs can also fill your disk. Be careful.
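To see whether the warning above applies to a given server, you can check the binary log settings before running a large DELETE. This is only a quick sketch; it reads two standard server variables and changes nothing:

```sql
-- Shows ROW, STATEMENT or MIXED for the current server.
SHOW VARIABLES LIKE 'binlog_format';

-- Shows whether binary logging is enabled at all (ON/OFF).
SHOW VARIABLES LIKE 'log_bin';
```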

The best way I have found of doing this with MySQL is:
DELETE from table_name LIMIT 1000;
Or 10,000 (depending on how fast it happens).
Put that in a loop until all the rows are deleted.
Please do try this as it will actually work. It will take some time, but it will work.
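One way to "put that in a loop" without an external script is a small stored procedure. The sketch below assumes the table is the sessions table from the question; the procedure name purge_sessions_in_batches and the batch size of 10000 are made up for illustration. DELIMITER is a directive of the mysql command-line client, so other clients may need a different way to define the procedure.

```sql
DELIMITER //

-- Hypothetical helper: deletes rows in batches until the table is empty.
CREATE PROCEDURE purge_sessions_in_batches()
BEGIN
  DECLARE rows_deleted INT DEFAULT 1;
  WHILE rows_deleted > 0 DO
    -- With autocommit on, each batch is committed on its own,
    -- which keeps individual transactions small.
    DELETE FROM sessions LIMIT 10000;
    -- ROW_COUNT() returns the number of rows the DELETE just removed.
    SET rows_deleted = ROW_COUNT();
  END WHILE;
END //

DELIMITER ;

CALL purge_sessions_in_batches();
DROP PROCEDURE purge_sessions_in_batches;
```

Tune the batch size to whatever the server handles comfortably; smaller batches hold locks for a shorter time but need more iterations.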

Couldn't you grab the schema, drop the table, and recreate it?

drop table should be the fastest way to get rid of it.

Have you tried to use "drop"? I've used it on tables over 20GB and it always completes in seconds.

If you just want to get rid of the table altogether, why not simply drop it?

Truncate is fast, usually on the order of seconds or less. If it took 30 minutes, you probably had a case of some foreign keys referencing the table you were truncating. There may also be locking issues involved.
Truncate is about as efficient a way to empty a table as there is, but you may have to remove the foreign key references unless you want those tables scrubbed as well.
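To find out whether any foreign keys actually reference the table, you can query information_schema. A minimal sketch, assuming the table is called sessions and lives in the currently selected database:

```sql
-- Lists every foreign key whose parent (referenced) table is `sessions`.
SELECT TABLE_NAME      AS child_table,
       CONSTRAINT_NAME AS fk_name
FROM information_schema.REFERENTIAL_CONSTRAINTS
WHERE CONSTRAINT_SCHEMA = DATABASE()
  AND REFERENCED_TABLE_NAME = 'sessions';
```

An empty result means no other table in that schema has a foreign key pointing at sessions.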

We had these issues. We no longer use the database as a session store with Rails 2.x and the cookie store. However, dropping the table is a decent solution. You may want to consider stopping the mysql service, temporarily disabling logging, starting things up in safe mode, and then doing your drop/create. When done, turn your logging back on.

I'm not sure why it's taking so long. But perhaps try a rename, and recreate a blank table. Then you can drop the "extra" table without worrying how long it takes.

searlea's answer is nice, but as stated in the comments, you lose the foreign keys in the process.
This solution is similar: the truncate is executed within a second, but you keep the foreign keys.
The trick is to disable the FK checks before the operation and re-enable them afterwards.
SET FOREIGN_KEY_CHECKS=0;
CREATE TABLE NewFoo LIKE Foo;
insert into NewFoo SELECT * from Foo where What_You_Want_To_Keep;
truncate table Foo;
insert into Foo SELECT * from NewFoo;
SET FOREIGN_KEY_CHECKS=1;
Extended answer - Delete all but some rows
My problem was this: because of a crazy script, my table was filled with 7,000,000 junk rows. I needed to delete 99% of the data in this table, which is why I needed to copy What I Want To Keep into a tmp table before deleting.
The Foo rows I needed to keep were referenced by other tables that have foreign keys and indexes.
Something like this:
insert into NewFoo SELECT * from Foo where ID in (
SELECT distinct FooID from TableA
union SELECT distinct FooID from TableB
union SELECT distinct FooID from TableC
)
However, this query always timed out after 1 hour.
So I had to do it like this:
CREATE TEMPORARY TABLE tmpFooIDS ENGINE=MEMORY AS (SELECT distinct FooID from TableA);
insert into tmpFooIDS SELECT distinct FooID from TableB;
insert into tmpFooIDS SELECT distinct FooID from TableC;
insert into NewFoo SELECT * from Foo where ID in (select FooID from tmpFooIDS);
In theory, because the indexes are set up correctly, I think both ways of populating NewFoo should have performed the same, but in practice they didn't.
This is why, in some cases, you could do it like this:
SET FOREIGN_KEY_CHECKS=0;
CREATE TABLE NewFoo LIKE Foo;
-- Alternative way of keeping some data.
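CREATE TEMPORARY TABLE tmpFooIDS ENGINE=MEMORY AS (SELECT ID from Foo where What_You_Want_To_Keep);
-- Your_Join_Condition is a placeholder for however Foo and Bar are related.
insert into tmpFooIDS SELECT Foo.ID from Foo left join Bar on Your_Join_Condition where OtherStuff_You_Want_To_Keep_Using_Bar;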
insert into NewFoo SELECT * from Foo where ID in (select ID from tmpFooIDS);
truncate table Foo;
insert into Foo SELECT * from NewFoo;
SET FOREIGN_KEY_CHECKS=1;

Related

Better way of copying data?

I have two tables where I want to copy the post_id from one table to another when the testpostmeta.meta_value = testTable.stockcode
There's about 2000 rows in testTable and 65k rows in testpostmeta.
The code works, it just takes about 1-2 minutes to complete. Is there anything that can be done to speed the hamster wheel up?
UPDATE testTable
INNER JOIN testpostmeta
ON testTable.stockcode = testpostmeta.meta_value
SET testTable.post_id = testpostmeta.post_id
I tried adding WHERE testpostmeta.meta_value = testTable.stockcode but that didn't work.

Be sure you have proper indexes on testTable and testpostmeta:
CREATE INDEX my_idx1 ON testTable (stockcode);
CREATE INDEX my_idx2 ON testpostmeta (meta_value , post_id);

Try adding an index to each table that matches the field used for your JOIN criteria:
ALTER TABLE testTable ADD INDEX stockcode_idx(stockcode);
ALTER TABLE testpostmeta ADD INDEX meta_idx(meta_value);
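To check whether the new indexes are actually used, you can run EXPLAIN on the equivalent SELECT of the join. This is a sketch; the aliases t and m are only for illustration. If meta_value turns out to be a TEXT or BLOB column, MySQL needs a prefix length in the index definition, for example meta_value(32), where 32 is an arbitrary illustrative value.

```sql
-- The `key` column of the output shows which index, if any,
-- the optimizer picked for each table in the join.
EXPLAIN
SELECT t.stockcode, m.post_id
FROM testTable AS t
INNER JOIN testpostmeta AS m
  ON t.stockcode = m.meta_value;
```

If the key column is NULL for both tables, the join is still scanning, and the index definitions (or the column types and collations on both sides of the join) are worth a second look.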

You can turn off autocommit so that all the changes are committed once at the end:
SET autocommit = 0 ;
-- Insert/Update/Delete stuff here
COMMIT ;

If post_id is indexed in the target table, that can also slow down the update.
Try disabling the index before the operation and enabling it afterwards, so your data is indexed once rather than on each data change. Note that DISABLE KEYS only affects nonunique indexes on MyISAM tables; InnoDB ignores it.
ALTER TABLE targetTable DISABLE KEYS;
-- Your UPDATE query
ALTER TABLE targetTable ENABLE KEYS;
And as said in the reference:
Performing multiple updates together is much quicker than doing one at a time if you lock the table.
Here are some reference pages that give more ideas on what can be done:
8.2.4.2 Optimizing UPDATE Statements
8.5.4 Bulk Data Loading for InnoDB Tables

SQL Entry's Duped

I messed up when trying to create a test database and accidentally duplicated everything inside a certain table. Basically, there are now 2 of every entry that was there before. Is there a simple way to fix this? (Using InnoDB tables)

Yet another good reason to use auto incrementing primary keys. That way, the rows wouldn't be total duplicates.
Probably the fastest way is to copy the data into another table, truncate the first table, and re-insert it:
create temporary table tmp as
select distinct *
from test;
truncate table test;
insert into test
select *
from tmp;
As a little note: in almost all cases, I recommend using the complete column list on an insert statement. This is the one case where it is optional. After all, you are putting all the columns in another table and just putting them back a statement later.
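Before truncating anything, it can be worth confirming how many rows are duplicates. A small sketch against the same test table; the column aliases are only illustrative:

```sql
-- Total number of rows currently in the table.
SELECT COUNT(*) AS total_rows FROM test;

-- Number of rows that would survive the SELECT DISTINCT copy.
SELECT COUNT(*) AS distinct_rows
FROM (SELECT DISTINCT * FROM test) AS d;
```

If every row was duplicated exactly once, total_rows should be twice distinct_rows.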

mySQL: duplicating multiple records via temporary table, how to preserve autoincrement index?

I wish to duplicate a selection of records in a mySQL table.
The pk of the table is an autoincremented int.
I want to do this with one set of mysql queries (for performance reasons).
It seems like the fastest way to do this is to put the results of the selection into a temporary table,
make any changes needed, and reinsert the records back to the original table, like this:
CREATE TEMPORARY TABLE temp1234 ENGINE=MEMORY SELECT * FROM a_table WHERE column='my selection';
# do updates in temp1234; (altering FK's mainly)
INSERT INTO a_table SELECT * FROM temp1234;
But when I try to do this, I get an error for duplicate PKs.
Now, I realise that I could alter the INSERT with SELECT query to exclude the pk/ID column, but as I am procedurally generating these queries across multiple tables for a large data copying function, I want to avoid having to supply column names.
What is the best way around this problem?

how to delete duplicate records in mysql table

I'm having an issue with finding and deleting duplicate records. I have a table with IDs called CallDetailRecordID which I need to scan to delete duplicates. The reason there are duplicates is that I'm exporting data to a special archiving engine that works with MySQL and doesn't support indexing.
I tried using "Select DISTINCT" but it doesn't work. Is there another way? I'm hoping I can create a stored procedure and have it run weekly to perform the cleanup.
Your help is highly appreciated.
Thank you

-- my_table is a placeholder for your table name (TABLE itself is a reserved word).
CREATE TABLE tmp_table LIKE my_table;
INSERT INTO tmp_table (SELECT * FROM my_table GROUP BY CallDetailRecordID);
RENAME TABLE my_table TO old_table, tmp_table TO my_table;
Drop the old table if you want, add a LOCK TABLES statement at the beginning to avoid lost inserts.

Delete, Truncate or Drop to clean out a table in MySQL

I am attempting to clean out a table but not get rid of the actual structure of the table. I have an id column that is auto-incrementing; I don't need to keep the ID number, but I do need it to keep its auto-incrementing characteristic. I've found delete and truncate but I'm worried one of these will completely drop the entire table rendering future insert commands useless.
How do I remove all of the records from the table so that I can insert new data?

drop table will remove the entire table with data
delete from table will remove the data, leaving the autoincrement values alone. It also takes a while if there's a lot of data in the table.
truncate table will remove the data, reset the autoincrement values (but leave them as autoincrement columns, so it'll just start at 1 and go up from there again), and is very quick.
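The difference between the three statements can be tried out on a throwaway table. This is a sketch; the demo table and its columns are made up for illustration, and the comments describe the behaviour on a default InnoDB setup:

```sql
CREATE TABLE demo (
  id   INT AUTO_INCREMENT PRIMARY KEY,
  note VARCHAR(20)
) ENGINE=InnoDB;

INSERT INTO demo (note) VALUES ('a'), ('b'), ('c');

-- DELETE removes the rows but leaves the AUTO_INCREMENT counter alone,
-- so the next insert continues after the highest id used so far.
DELETE FROM demo;
INSERT INTO demo (note) VALUES ('d');
SELECT id, note FROM demo;

-- TRUNCATE removes the rows and resets the counter,
-- so the next insert starts numbering from the beginning again.
TRUNCATE TABLE demo;
INSERT INTO demo (note) VALUES ('e');
SELECT id, note FROM demo;

-- DROP removes the table itself; inserts fail until it is recreated.
DROP TABLE demo;
```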

TRUNCATE will reset your auto-increment seed (on InnoDB tables, at least), although you could note its value before truncating and re-set accordingly afterwards using alter table:
ALTER TABLE t2 AUTO_INCREMENT = value

Drop will do just that....drop the table in question, unless the table is a parent to another table.
Delete will remove all the data that meets the condition; if no condition is specified, it'll remove all the data in the table.
Truncate is similar to delete; however, it resets the auto_increment counter back to 1 (or the initial starting value). It's better to use truncate than delete, because delete removes the data row by row and is therefore slower than truncate. However, truncate will not work on InnoDB tables where referential integrity is enforced unless it is turned off before the truncate command is issued.
So, relax; unless you issue a drop command on the table, it won't be dropped.

Truncate table is what you are looking for
http://www.1keydata.com/sql/sqltruncate.html

Another possibility involves creating an empty copy of the table, setting the AUTO_INCREMENT (with some eventual leeway for insertions during the non-atomic operation) and then rotating both :
CREATE TABLE t2_new LIKE t2;
SELECT @newautoinc:=auto_increment /*+[leeway]*/
FROM information_schema.tables
WHERE table_name='t2' AND table_schema=DATABASE();
SET @query = CONCAT('ALTER TABLE t2_new AUTO_INCREMENT = ', @newautoinc);
PREPARE stmt FROM @query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
RENAME TABLE t2 TO t2_old, t2_new TO t2;
And then, you have the extra advantage of being still able to change your mind before removing the old table.
If you reconsider your decision, you can still bring back old records from the table before the operation:
INSERT /*IGNORE*/ INTO t2 SELECT * FROM t2_old /*WHERE [condition]*/;
When you're good you can drop the old table:
DROP TABLE t2_old;

I've just come across a situation where DELETE is drastically affecting SELECT performance compared to TRUNCATE on a full-text InnoDB query.
If I DELETE all rows and then repopulate the table (1 million rows), a typical query takes 1s to come back.
If instead I TRUNCATE the table, and repopulate it in exactly the same way, a typical query takes 0.05s to come back.
YMMV, but for whatever reason for me on MariaDB 10.3.15-MariaDB-log DELETE seems to be ruining my index.
