Delete, Truncate or Drop to clean out a table in MySQL - mysql

I am attempting to clean out a table but not get rid of the actual structure of the table. I have an id column that is auto-incrementing; I don't need to keep the ID number, but I do need it to keep its auto-incrementing characteristic. I've found delete and truncate but I'm worried one of these will completely drop the entire table rendering future insert commands useless.
How do I remove all of the records from the table so that I can insert new data?

drop table will remove the entire table with data
delete * from table will remove the data, leaving the autoincrement values alone. it also takes a while if there's a lot of data in the table.
truncate table will remove the data, reset the autoincrement values (but leave them as autoincrement columns, so it'll just start at 1 and go up from there again), and is very quick.

TRUNCATE will reset your auto-increment seed (on InnoDB tables, at least), although you could note its value before truncating and re-set accordingly afterwards using alter table:
ALTER TABLE t2 AUTO_INCREMENT = value

Drop will do just that....drop the table in question, unless the table is a parent to another table.
Delete will remove all the data that meets the condition; if no condition is specified, it'll remove all the data in the table.
Truncate is similar to delete; however, it resets the auto_increment counter back to 1 (or the initial starting value). However, it's better to use truncate over delete because delete removes the data by each row, thus having a performance hit than truncate. However, truncate will not work on InnoDB tables where referential integrity is enforced unless it is turned off before the truncate command is issued.
So, relax; unless you issue a drop command on the table, it won't be dropped.

Truncate table is what you are looking for
http://www.1keydata.com/sql/sqltruncate.html

Another possibility involves creating an empty copy of the table, setting the AUTO_INCREMENT (with some eventual leeway for insertions during the non-atomic operation) and then rotating both :
CREATE TABLE t2_new LIKE t2;
SELECT #newautoinc:=auto_increment /*+[leeway]*/
FROM information_schema.tables
WHERE table_name='t2';
SET #query = CONCAT("ALTER TABLE t2_new AUTO_INCREMENT = ", #newautoinc);
PREPARE stmt FROM #query;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
RENAME TABLE t2 TO t2_old, t2_new TO t2;
And then, you have the extra advantage of being still able to change your mind before removing the old table.
If you reconsider your decision, you can still bring back old records from the table before the operation:
INSERT /*IGNORE*/ INTO t2 SELECT * FROM t2_old /*WHERE [condition]*/;
When you're good you can drop the old table:
DROP TABLE t2_old;

I've just come across a situation where DELETE is drastically affecting SELECT performance compared to TRUNCATE on a full-text InnoDB query.
If I DELETE all rows and then repopulate the table (1million rows), a typical query takes 1s to come back.
If instead I TRUNCATE the table, and repopulate it in exactly the same way, a typical query takes 0.05s to come back.
YMMV, but for whatever reason for me on MariaDB 10.3.15-MariaDB-log DELETE seems to be ruining my index.

Related

Better way of copying data?

I have two tables where I want to copy the post_id from one table to another when the testpostmeta.meta_value = testTable.stockcode
There's about 2000 rows in testTable and 65k rows in testpostmeta.
The code works, it just takes about 1-2 minutes to complete. Is there anything that can be done to speed the hamster wheel up?
UPDATE testTable
INNER JOIN testpostmeta
ON testTable.stockcode = testpostmeta.meta_value
SET testTable.post_id = testpostmeta.post_id
I tried adding WHERE testpostmeta.meta_value = testTable.stockcode but that didn't work.
be sure you have proper indexes on testTable and testpostmeta
CREATE INDEX my_idx1 ON testTable (stokcode);
CREATE INDEX my_idx2 ON testpostmeta (meta_value , post_id);
Try adding an index to each table that matches the field used for your JOIN criteria:
ALTER TABLE testTable ADD INDEX stockcode_idx(stockcode);
ALTER TABLE testpostmeta ADD INDEX meta_idx(meta_value);
You can stop the autocommit
SET autocommit = 0 ;
--Insert/Update/Delete stuff here
COMMIT ;
If post_id is indexed in target table that also can slow down the update.
Try disabling index before the operation and enable it after. So you data will be indexed once rather on each subsequent data change.
ALTER TABLE targetTable DISABLE KEYS;
-- Your UPDATE query
ALTER TABLE targetTable ENABLE KEYS;
And as said in the reference:
Performing multiple updates together is much quicker than doing one at a time if you lock the table.
Here some reference page that can give more idea on what can be done:
8.2.4.2 Optimizing UPDATE Statements
8.5.4 Bulk Data Loading for InnoDB Tables

SQL Entry's Duped

I messed up when trying to create a test Database and accidently duplicated everything inside of a certain table. Basically there is now 2 of every entry there was once before. Is there a simple way to fix this? (Using InnoDB tables)
Yet another good reason to use auto incrementing primary keys. That way, the rows wouldn't be total duplicates.
Probably the fastest way is to copy the data into another table, truncate the first table, and re-insert it:
create temporary table tmp as
select distinct *
from test;
truncate table test;
insert into test
select *
from tmp;
As a little note: in almost all cases, I recommend using the complete column list on an insert statement. This is the one case where it is optional. After all, you are putting all the columns in another table and just putting them back a statement later.

ALTER TABLE ADD COLUMN takes a long time

I was just trying to add a column called "location" to a table (main_table) in a database. The command I run was
ALTER TABLE main_table ADD COLUMN location varchar (256);
The main_table contains > 2,000,000 rows. It keeps running for more than 2 hours and still not completed.
I tried to use mytop
to monitor the activity of this database to make sure that the query is not locked by other querying process, but it seems not. Is it supposed to take that long time? Actually, I just rebooted the machine before running this command. Now this command is still running. I am not sure what to do.
Your ALTER TABLE statement implies mysql will have to re-write every single row of the table including the new column. Since you have more than 2 million rows, I would definitely expect it takes a significant amount of time, during which your server will likely be mostly IO-bound. You'd usually find it's more performant to do the following:
CREATE TABLE main_table_new LIKE main_table;
ALTER TABLE main_table_new ADD COLUMN location VARCHAR(256);
INSERT INTO main_table_new SELECT *, NULL FROM main_table;
RENAME TABLE main_table TO main_table_old, main_table_new TO main_table;
DROP TABLE main_table_old;
This way you add the column on the empty table, and basically write the data in that new table that you are sure no-one else will be looking at without locking as much resources.
I think the appropriate answer for this is using a feature like pt-online-schema-change or gh-ost.
We have done migration of over 4 billion rows with this, though it can take upto 10 days, with less than a minute of downtime.
Percona works in a very similar fashion as above
Create a temp table
Creates triggers on the first table (for inserts, updates, deletes) so that they are replicated to the temp table
In small batches, migrate data
When done, rename table to new table, and drop the other table
You can speed up the process by temporarily turning off unique checks and foreign key checks. You can also change the algorithm that gets used.
If you want the new column to be at the end of the table, use algorithm=instant:
SET unique_checks = 0;
SET foreign_key_checks = 0;
ALTER TABLE main_table ADD location varchar(256), algorithm=instant;
SET unique_checks = 1;
SET foreign_key_checks = 1;
Otherwise, if you need the column to be in a specific location, use algorithm=inplace:
SET unique_checks = 0;
SET foreign_key_checks = 0;
ALTER TABLE main_table ADD location varchar(256) AFTER othercolumn, algorithm=inplace;
SET unique_checks = 1;
SET foreign_key_checks = 1;
For reference, it took my PC about 2 minutes to alter a table with 20 million rows using the inplace algorithm. If you're using a program like Workbench, then you may want to increase the default timeout period in your settings before starting the operation.
If you find that the operation is hanging indefinitely, then you may need to look through the list of processes and kill whatever process has a lock on the table. You can do that using these commands:
SHOW FULL PROCESSLIST;
KILL PROCESS_NUMBER_GOES_HERE;
Alter table takes a long time with a big data like in your case, so avoid to use it in such situations, and use some code like this one:
select main_table.*,
cast(null as varchar(256)) as null_location, -- any column you want accepts null
cast('' as varchar(256)) as not_null_location, --any column doesn't accept null
cast(0 as int) as not_null_int, -- int column doesn't accept null
into new_table
from main_table;
drop table main_table;
rename table new_table TO main_table;
DB2 z/OS does a virtual add of the column instantly. And puts the table into Advisory-Reorg status. Anything that runs before the reorg gets the default value or null if no default. When updates are done, they expand the rows updated. Inserts are done expanded. The next reorg expands every unexpanded row and assigns the default value to anything it expands.
Only a real database handles this well. DB2 z/OS.

How to quickly prune large tables?

I currently have a MySQL table of about 20 million rows, and I need to prune it. I'd like to remove every row whose updateTime (timestamp of insertion) was more than one month ago. I have not personally performed any alterations of the table's order, so the data should be in the order in which it was inserted, and there is a UNIQUE key on two fields, id and updateTime. How would I go about doing this in a short amount of time?
How much down time can you incur? How big are the rows? How many are you deleting?
Simply put, deleting rows is one of the most expensive things you can do to a table. It's just a horrible thing overall.
If you don't have to do it, and you have the disk space for it, and your queries aren't affected by the table size (well indexed queries typically ignore table size), then you may just leave well enough alone.
If you have the opportunity and can take the table offline (and you're removing a good percentage of the table), then your best bet would be to copy the rows you want to keep to a new table, drop the old one, rename the new one to the old name, and THEN recreate your indexes.
Otherwise, you're pretty much stuck with good 'ol delete.
There are two ways to remove a large number of rows. First there is the obvious way:
DELETE FROM table1 WHERE updateTime < NOW() - interval 1 month;
The second (slightly more complicated) way is to create a new table and copy the data that you want to keep, truncate your old table, then copy the rows back.
CREATE TABLE table2 AS
SELECT * FROM table1 WHERE updateTime >= NOW() - interval 1 month;
TRUNCATE table1;
INSERT INTO table1
SELECT * FROM table2;
Using TRUNCATE is much faster than a DELETE with a WHERE clause when you have a large number of rows to delete and a relatively small number that you wish to keep.
Spliting the deletes with limit might speed up the process;
I had to delete 10M rows and i issued the command. It never responded for hours.
I killed the query ( which took couple of hours)
then Split the deletes.
DELETE from table where id > XXXX limit 10000;
DELETE from table where id > XXXX limit 10000;
DELETE from table where id > XXXX limit 10000;
DELETE from table where id > XXXX limit 10000;
Then i duplicated this statement in a file and used the command.
mysql> source /tmp/delete.sql
This was much faster.
You can also try to use tools like pt-tools. and pt-archiver.
Actually even if you can't take the table offline for long, you can still use the 'rename table' technique to get rid of old data.
Stop processes writting to table.
rename table tableName to tmpTableName;
create table tableName like tmpTableName;
set #currentId=(select max(id) from tmpTableName);
set #currentId=#currentId+1;
set #indexQuery = CONCAT("alter table test auto_increment = ", #currentId);
prepare stmt from #indexQuery;
execute stmt;
deallocate prepare stmt;
Start processes writting to table.
insert into tableName
select * from tmpTableName;
drop table;
New inserts to tableName will begin at the correct index; The old data will be inserted in correct indexes.

Quickest way to delete enormous MySQL table

I have an enormous MySQL (InnoDB) database with millions of rows in the sessions table that were created by an unrelated, malfunctioning crawler running on the same server as ours. Unfortunately, I have to fix the mess now.
If I try to truncate table sessions; it seems to take an inordinately long time (upwards of 30 minutes). I don't care about the data; I just want to have the table wiped out as quickly as possible. Is there a quicker way, or will I have to just stick it out overnight?
(As this turned up high in Google's results, I thought a little more instruction might be handy.)
MySQL has a convenient way to create empty tables like existing tables, and an atomic table rename command. Together, this is a fast way to clear out data:
CREATE TABLE new_foo LIKE foo;
RENAME TABLE foo TO old_foo, new_foo TO foo;
DROP TABLE old_foo;
Done
The quickest way is to use DROP TABLE to drop the table completely and recreate it using the same definition. If you have no foreign key constraints on the table then you should do that.
If you're using MySQL version greater than 5.0.3, this will happen automatically with a TRUNCATE. You might get some useful information out of the manual as well, it describes how a TRUNCATE works with FK constraints. http://dev.mysql.com/doc/refman/5.0/en/truncate-table.html
EDIT: TRUNCATE is not the same as a drop or a DELETE FROM. For those that are confused about the differences, please check the manual link above. TRUNCATE will act the same as a drop if it can (if there are no FK's), otherwise it acts like a DELETE FROM with no where clause.
EDIT: If you have a large table, your MariaDB/MySQL is running with a binlog_format as ROW and you execute a DELETE without a predicate/WHERE clause, you are going to have issues to keep up the replication or even, to keep your Galera nodes running without hitting a flow control state. Also, binary logs can get your disk full. Be careful.
The best way I have found of doing this with MySQL is:
DELETE from table_name LIMIT 1000;
Or 10,000 (depending on how fast it happens).
Put that in a loop until all the rows are deleted.
Please do try this as it will actually work. It will take some time, but it will work.
Couldn't you grab the schema drop the table and recreate it?
drop table should be the fastest way to get rid of it.
Have you tried to use "drop"? I've used it on tables over 20GB and it always completes in seconds.
If you just want to get rid of the table altogether, why not simply drop it?
Truncate is fast, usually on the order of seconds or less. If it took 30 minutes, you probably had a case of some foreign keys referencing the table you were truncating. There may also be locking issues involved.
Truncate is effectively as efficient as one can empty a table, but you may have to remove the foreign key references unless you want those tables scrubbed as well.
We had these issues. We no longer use the database as a session store with Rails 2.x and the cookie store. However, dropping the table is a decent solution. You may want to consider stopping the mysql service, temporarily disable logging, start things up in safe mode and then do your drop/create. When done, turn on your logging again.
I'm not sure why it's taking so long. But perhaps try a rename, and recreate a blank table. Then you can drop the "extra" table without worrying how long it takes.
searlea's answer is nice, but as stated in the comments, you lose the foreign keys during the fight.
this solution is similar: the truncate is executed within a second, but you keep the foreign keys.
The trick is that we disable/enable the FK checks.
SET FOREIGN_KEY_CHECKS=0;
CREATE TABLE NewFoo LIKE Foo;
insert into NewFoo SELECT * from Foo where What_You_Want_To_Keep
truncate table Foo;
insert into Foo SELECT * from NewFoo;
SET FOREIGN_KEY_CHECKS=1;
Extended answer - Delete all but some rows
My problem was: Because of a crazy script, my table was for with 7.000.000 junk rows. I needed to delete 99% of data in this table, this is why i needed to copy What I Want To Keep in a tmp table before deleteting.
These Foo Rows i needed to keep were depending on other tables, that have foreign keys, and indexes.
something like that:
insert into NewFoo SELECT * from Foo where ID in (
SELECT distinct FooID from TableA
union SELECT distinct FooID from TableB
union SELECT distinct FooID from TableC
)
but this query was always timing out after 1 hour.
So i had to do it like this:
CREATE TEMPORARY TABLE tmpFooIDS ENGINE=MEMORY AS (SELECT distinct FooID from TableA);
insert into tmpFooIDS SELECT distinct FooID from TableB
insert into tmpFooIDS SELECT distinct FooID from TableC
insert into NewFoo SELECT * from Foo where ID in (select ID from tmpFooIDS);
I theory, because indexes are setup correctly, i think both ways of populating NewFoo should have been the same, but practicaly it didn't.
This is why in some cases, you could do like this:
SET FOREIGN_KEY_CHECKS=0;
CREATE TABLE NewFoo LIKE Foo;
-- Alternative way of keeping some data.
CREATE TEMPORARY TABLE tmpFooIDS ENGINE=MEMORY AS (SELECT * from Foo where What_You_Want_To_Keep);
insert into tmpFooIDS SELECT ID from Foo left join Bar where OtherStuff_You_Want_To_Keep_Using_Bar
insert into NewFoo SELECT * from Foo where ID in (select ID from tmpFooIDS);
truncate table Foo;
insert into Foo SELECT * from NewFoo;
SET FOREIGN_KEY_CHECKS=1;