What's the fastest way in MySQL to delete a database's content? - mysql

What is the fastest way in MySQL to delete a database's content on FreeBSD? Please help.
I've tried deleting from Navicat (400,000+ rows),
but after an hour only 100,000 had been deleted.
I don't have phpMyAdmin.

To delete everything in a table:
TRUNCATE TABLE table_you_want_to_nuke
To delete certain rows, you have two options.
Option 1 - copy, truncate, and reinsert:
Create a temporary table using CREATE TABLE the_temp_table LIKE current_table
Drop all of the indexes on the temp table.
Copy the records you want to keep with INSERT INTO the_temp_table SELECT * FROM current_table WHERE ...
TRUNCATE TABLE current_table
INSERT INTO current_table SELECT * FROM the_temp_table
To speed this option up, you may want to drop all indexes from current_table before the final INSERT INTO, then recreate them after the INSERT. MySQL is much faster at indexing existing data than it is at indexing on the fly.
Option 2 - the one you're currently trying: DELETE FROM your_table WHERE whatever_condition. You probably need to break this into chunks using the WHERE condition or LIMIT, so you can batch it and not bog down the server forever (see the sketch after this answer).
Which is better/faster depends on lots of things, mostly the ratio of deleted records to retained records and the number of indexes involved. As always, test this carefully before doing it on a live database, as both DELETE and TRUNCATE will permanently destroy data.
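As a minimal sketch of that second option, assuming the condition is on an indexed created_at column (the column name and cutoff date are only illustrations):
-- Delete in batches so each statement only holds locks briefly.
-- Re-run the statement until it reports 0 rows affected.
DELETE FROM your_table
WHERE created_at < '2010-01-01'  -- your whatever_condition
LIMIT 10000;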

Related

Rake task that purges just a single table

Is there a Rake command I can run that would just delete all the rows of a table, instead of dropping the whole database and recreating it? I have a table that grows large very quickly, and most of the data in it does not need to persist for more than a week.
Try truncating the table:
ActiveRecord::Base.connection.execute("TRUNCATE TABLE table_name")
From MySQL's docs:
Logically, TRUNCATE TABLE is similar to a DELETE statement that deletes all rows, or a sequence of DROP TABLE and CREATE TABLE statements.
Which seems like what you want to achieve.
If you need to remove old records from the table, without deleting current data, you need to be careful not to issue SQL statements that could lock up a large % of the records in your table. (It sounds like this table is written to frequently, so locking it for a long time is not acceptable.)
At least with mysql+innodb, you can easily end up locking more rows than just the ones you actually delete. See http://mitchdickinson.com/mysql-innodb-row-locking-in-delete/ for more info on that.
Here's a process that should keep the table fairly available, and let you remove old rows:
Select just the ids of a set of records which you want to remove, based on their created_at times.
Issue a DELETE for those records.
Repeat this process as long as the SELECT returns records.
This gives you an idea of the process...
max_age = 7.days.ago
batch_size = 1000
loop do
  ids = Model.select(:id).
          where('created_at < ?', max_age).
          limit(batch_size).
          map(&:id)
  Model.where(id: ids).delete_all
  break if ids.size < batch_size
end
Since the SELECT and the DELETE are separate statements, you won't lock any records which aren't actually being removed. The overall time taken for this process will definitely be longer than a simple TRUNCATE TABLE, but the benefit is you'll be able to keep recent records. TRUNCATE will remove everything.

mysql: removing duplicates while avoiding client timeout

Issue: hundreds of identical (schema) tables. Some of these have some duplicated data that needs to be removed. My usual strategy for this is:
walk list of tables - for each do
create temp table with unique key on all fields
insert ignore select * from old table
truncate original table
insert select * back into original table
drop or clean temp table
For smaller tables this works fine. Unfortunately, the tables I'm cleaning often have hundreds of millions of records, so my jobs and client connections are timing out while I'm running this. (Since there are hundreds of these tables, I'm using Perl to walk the list and clean each one; this is where the timeout happens.)
Some options I'm looking into:
mysqldump - fast but I don't see how to do the subsequent 'insert ignore' step
into outfile / load infile - also fast but I'm running from a remote host and 'into outfile' creates all the files on the mysql server. Hard to clean up.
do the insert/select in blocks of 100K records - this avoids the db timeout but it's pretty slow.
I'm sure there is a better way. Suggestions?
If an SQL query to find the duplicates can complete without timing out, you should be able to do a SELECT with COUNT() and a GROUP BY ... HAVING clause that restricts the output to only the values with duplicate data (HAVING COUNT(DUPEDATA) > 1). The results of this SELECT can be placed INTO a temporary table, which can then be joined with the primary table for the DELETE query.
This approach uses the set-operations strengths of SQL/MySQL -- no need for Perl coding.
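A rough sketch of that idea, assuming a hypothetical table t with an integer primary key id, and a single column DUPEDATA standing in for whatever combination of fields defines a duplicate:
-- Keep the lowest id for each duplicated value in a temporary table.
CREATE TEMPORARY TABLE keepers AS
SELECT MIN(id) AS id, DUPEDATA
FROM t
GROUP BY DUPEDATA
HAVING COUNT(*) > 1;

-- Delete every duplicate row except the keeper for its value.
DELETE t
FROM t
JOIN keepers k ON t.DUPEDATA = k.DUPEDATA AND t.id <> k.id;
Since keepers is a separate temporary table, MySQL's restriction on deleting from a table you also select from in a subquery doesn't apply here.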

Running optimize on table copy?

I have an InnoDB table in MySQL which used to contain about 600k rows. After deleting 400k+ rows, my guess is that I need to run an OPTIMIZE.
However, since the table will be locked during this operation, the site will not be usable at that time. So, my question is: should I run the optimize on the live database table (with a little under 200k rows)? Or is it possible to create a copy of that table, run the OPTIMIZE on that copy, and after that rename both tables so that the copy becomes the live table?
If you create the copy with CREATE TABLE ... AS SELECT ..., it should already be optimised; no need to run OPTIMIZE separately.
However, I'd consider copying the 200k rows you want to keep into a new table, then renaming the tables.
That way there are fewer steps and less work all round.
CREATE TABLE MyTableCopy AS
SELECT *
FROM myTable
WHERE (insert Keep condition here);
RENAME TABLE
myTable TO myTable_DeleteMelater,
MyTableCopy TO myTable;

Is there any way to do a bulk/faster delete in mysql?

I have a table with 10 million records; what is the fastest way to delete the old data while retaining the last 30 days?
I know this can be done with the event scheduler, but my worry is that if it takes too much time, it might lock the table for too long.
It would be great if you could suggest an optimal approach.
Thanks.
Offhand, I would:
Rename the table
Create an empty table with the same name as your original table
Grab the last 30 days from your "temp" table and insert them back into the new table
Drop the temp table
This will enable you to keep the table live through (almost) the entire process and get the past 30 days worth of data at your leisure.
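A rough sketch of those steps in SQL, assuming a hypothetical table events with a created_at column:
-- 1. Move the full table out of the way.
RENAME TABLE events TO events_old;
-- 2. Recreate an empty table under the original name.
CREATE TABLE events LIKE events_old;
-- 3. Copy the last 30 days back in at your leisure.
INSERT INTO events
SELECT * FROM events_old
WHERE created_at >= NOW() - INTERVAL 30 DAY;
-- 4. Drop the renamed original once you're satisfied.
DROP TABLE events_old;
Note there is a brief window between steps 1 and 2 where the table does not exist; the double rename shown in a later answer avoids that gap.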
You could try partition tables.
PARTITION BY LIST (TO_DAYS( date_field ))
This would give you 1 partition per day, and when you need to prune data you just:
ALTER TABLE tbl_name DROP PARTITION p#
http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
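A hedged sketch of the idea, using RANGE rather than LIST partitioning (the more common pattern for date-based pruning) and a hypothetical table log:
CREATE TABLE log (
  id INT NOT NULL,
  date_field DATE NOT NULL,
  PRIMARY KEY (id, date_field)  -- unique keys must include the partitioning column
)
PARTITION BY RANGE (TO_DAYS(date_field)) (
  PARTITION p20100101 VALUES LESS THAN (TO_DAYS('2010-01-02')),
  PARTITION p20100102 VALUES LESS THAN (TO_DAYS('2010-01-03'))
  -- ...one partition per day, created ahead of time
);

-- Pruning a day's data is then just a metadata operation:
ALTER TABLE log DROP PARTITION p20100101;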
Not that it helps you with your current problem, but if this is a regular occurrence, you might want to look into a merge table: just add tables for different periods in time, and remove them from the merge table definition when no longer needed. Another option is partitioning, in which it is equally trivial to drop the (oldest) partition.
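If you go the merge-table route (MyISAM only), a minimal sketch with hypothetical per-period tables might look like this:
-- The underlying tables must be identical MyISAM tables.
CREATE TABLE log_2010_01 (id INT, msg VARCHAR(255)) ENGINE=MyISAM;
CREATE TABLE log_2010_02 LIKE log_2010_01;

CREATE TABLE log_all (id INT, msg VARCHAR(255))
  ENGINE=MERGE UNION=(log_2010_01, log_2010_02) INSERT_METHOD=LAST;

-- To expire the oldest period, redefine the union and drop the old table.
ALTER TABLE log_all UNION=(log_2010_02);
DROP TABLE log_2010_01;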
To expand on Michael Todd's answer.
If you have the space,
Create a blank staging table similar to the table you want to reduce in size
Fill the staging table with only the records you want to have in your destination table
Do a double rename like the following
Assuming:
table is the table name of the table you want to purge a large amount of data from
newtable is the staging table name
no other tables are called temptable
rename table table to temptable, newtable to table;
drop table temptable;
This will be done in a single transaction, which will require an instantaneous schema lock. Most high concurrency applications won't notice the change.
Alternatively, if you don't have the space, and you have a long window to purge this data, you can use dynamic SQL to insert the primary keys into a temp table, and join the temp table in a delete statement. When you insert into the temp table, be aware of what max_allowed_packet is. Most installations of MySQL use 16MB (16777216 bytes). Your insert command for the temp table should stay under max_allowed_packet. This will not lock the table. You'll want to run OPTIMIZE TABLE to reclaim space for the rest of the engine to use. You probably won't be able to reclaim disk space unless you shut down the engine and move the data files.
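A rough sketch of that alternative, assuming a hypothetical table big_table with an integer primary key id:
CREATE TEMPORARY TABLE purge_ids (id INT NOT NULL PRIMARY KEY);

-- Generated by your dynamic SQL; keep each statement under max_allowed_packet.
INSERT INTO purge_ids (id) VALUES (101), (102), (103) /* , ... */;

-- Delete by joining on the staged keys.
DELETE big_table
FROM big_table
JOIN purge_ids USING (id);

-- Reclaim the freed space inside the tablespace afterwards.
OPTIMIZE TABLE big_table;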
Shut down your resource, then SELECT ... INTO OUTFILE, parse the output, delete the table, and LOAD DATA LOCAL INFILE optimized_db.txt - it's much cheaper to re-create than to UPDATE.
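A hedged sketch of that dump-and-reload approach, assuming a hypothetical table big_table with a created_at column, and keeping the file on the server rather than using LOCAL (the path is an assumption):
-- Dump only the rows you want to keep; the file is written on the MySQL server.
SELECT * INTO OUTFILE '/tmp/optimized_db.txt'
FROM big_table
WHERE created_at >= NOW() - INTERVAL 30 DAY;

-- Recreate the table empty, then reload the kept rows.
TRUNCATE TABLE big_table;
LOAD DATA INFILE '/tmp/optimized_db.txt' INTO TABLE big_table;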

Duplicating table in MYSQL without copying one row at a time

I want to duplicate a very large table, but I do not want to copy it row by row. Is there a way to duplicate it?
For example, you can TRUNCATE without deleting row by row, so I was wondering if there is something similar for copying entire tables.
UPDATE: row-by-row insertion is very painful (because of the 120M rows). Any way to avoid that?
MySQL no longer has reliable "copy table" functionality - there are many reasons for this, related to how data is stored. However, the following does row-by-row insertion but is pretty simple:
CREATE TABLE `new_table` LIKE `old_table`;
INSERT INTO `new_table` (SELECT * FROM `old_table`);
You could use INSERT INTO ... SELECT.
If you're using MyISAM you can copy the physical files on disk. Restart the service and you'll have the new table with indexes and everything the same.
INSERT INTO TABLE2 SELECT * FROM TABLE1
It's nontrivial to copy a large table; ultimately the database is likely to need to rebuild it.
In InnoDB the only way is really to rebuild it, which means INSERT ... SELECT or something similar. However, with 120M rows, because it all happens in a single transaction, you will probably exceed the size of the rollback area, which will cause the insert to fail.
mysqldump, followed by renaming the original table and then restoring the dump, should work, as mysqldump can be made to commit every so many rows. However, it will be slow.
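One hedged way to keep the INSERT ... SELECT from blowing out the rollback area is to copy in primary-key ranges, so each statement is its own smaller transaction (assuming old_table has an integer primary key id):
CREATE TABLE `new_table` LIKE `old_table`;

INSERT INTO `new_table` SELECT * FROM `old_table` WHERE id BETWEEN 1 AND 1000000;
INSERT INTO `new_table` SELECT * FROM `old_table` WHERE id BETWEEN 1000001 AND 2000000;
-- ...repeat until the maximum id is covered, then swap the tables:
RENAME TABLE `old_table` TO `old_table_backup`, `new_table` TO `old_table`;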
For comparison, in Oracle you could simply do:
Create table t as select * from original_table