Rake task that purges just a single table - mysql

Is there a Rake command that I can run that would just delete all the rows of a table instead of dropping the whole database and recreating it? I have a table that grows large very quickly, and most of the data in it does not need to persist for more than a week.

Try truncating the table:
ActiveRecord::Base.connection.execute("TRUNCATE TABLE table_name")
From MySQL's docs:
Logically, TRUNCATE TABLE is similar to a DELETE statement that
deletes all rows, or a sequence of DROP TABLE and CREATE TABLE
statements.
Which seems like what you want to achieve.
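If it helps to see the practical difference, here is a minimal sketch (the table name is just a placeholder):
-- DELETE removes rows one at a time, keeps the AUTO_INCREMENT counter,
-- and can be rolled back inside an InnoDB transaction.
DELETE FROM table_name;
-- TRUNCATE drops and recreates the table internally: much faster,
-- resets AUTO_INCREMENT to 1, and cannot be rolled back.
TRUNCATE TABLE table_name;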

If you need to remove old records from the table, without deleting current data, you need to be careful not to issue SQL statements that could lock up a large % of the records in your table. (It sounds like this table is written to frequently, so locking it for a long time is not acceptable.)
At least with mysql+innodb, you can easily end up locking more rows than just the ones you actually delete. See http://mitchdickinson.com/mysql-innodb-row-locking-in-delete/ for more info on that.
Here's a process that should keep the table fairly available, and let you remove old rows:
Select just the ids of a set of records which you want to remove, based on their created_at times.
Issue a DELETE for those records.
Repeat this process as long as the SELECT returns records.
This gives you an idea of the process...
max_age = 7.days.ago
batch_size = 1000

loop do
  # Grab only the ids of one batch of expired records...
  ids = Model.select(:id).
              where('created_at < ?', max_age).
              limit(batch_size).
              map(&:id)

  # ...and delete exactly those rows, so nothing else gets locked.
  Model.where(id: ids).delete_all

  # A short batch means everything older than max_age is gone.
  break if ids.size < batch_size
end
Since the SELECT and the DELETE are separate statements, you won't lock any records which aren't actually being removed. The overall time taken for this process will definitely be longer than a simple TRUNCATE TABLE, but the benefit is you'll be able to keep recent records. TRUNCATE will remove everything.
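If you would rather have MySQL do the recurring cleanup itself instead of (or in addition to) a Rake task, one option is a scheduled event. This is only a sketch: the table and column names are placeholders, and it assumes the event scheduler is enabled.
-- Requires: SET GLOBAL event_scheduler = ON;
CREATE EVENT purge_old_rows
    ON SCHEDULE EVERY 1 DAY
DO
    DELETE FROM table_name
    WHERE created_at < NOW() - INTERVAL 7 DAY;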

Related

MySQL delete from large table increases the delete time for each transaction

I am trying to delete data from a table which contains almost 6,000,000,000 records, using a WHERE clause.
Here is the stored procedure I am using, run from the MySQL command prompt on Windows.
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `clean_table`( )
BEGIN
  REPEAT
    DO SLEEP(1);
    DELETE FROM tablename
    WHERE exportInfoId <= 8479 LIMIT 100000;
    SELECT ROW_COUNT();
  UNTIL ROW_COUNT() = 0 END REPEAT;
END$$
DELIMITER ;
It's deleting the data, but the time for each delete transaction keeps increasing. Why does it keep increasing, even though each delete reduces the amount of data in the table? Is there any way to make each transaction take the same time? I am already using SLEEP as suggested in some other answers.
You need to add an index to the column with which you are using to find the record(s) to be deleted.
With an index, MySQL knows exactly where the records are to be found so it can go straight to the record(s).
Without an index, then the table must be searched row by row.
The difference is that without an index, the deletes are performed in the order of the primary key; with an index, the records will be searched in the order of that particular column.
To add an index, do what @o-jones points out in the comments, which is:
-- Normal index
ALTER TABLE tablename ADD INDEX exportInfoId (exportInfoId);
-- Reverse (descending) index
ALTER TABLE tablename ADD INDEX exportInfoId (exportInfoId DESC);
Adding an index to the column is the correct answer here; however, there may be other answers that work for you, depending on your use case.
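In MySQL 5.6+ you can also EXPLAIN the DELETE itself to sanity-check that the new index is being used (a sketch; the values come from the question):
EXPLAIN DELETE FROM tablename WHERE exportInfoId <= 8479 LIMIT 100000;
-- the "key" column should show the exportInfoId index; "type: ALL" would mean a full table scan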
Sure, adding an index would help a lot. ("DESC" is optional.)
Sure, it will take time to add an index to 6B rows. Newer versions of MySQL can do it "instantly", but that just means it is happening in the background and the index is not available for a long time.
The reason for the slowdown is that the DELETEs are skipping over more and more rows that are not to be deleted. The INDEX would avoid that. (As Akina described.)
A note about DELETE (and UPDATE): It is slow because it keeps the deleted rows in case of a ROLLBACK (which would happen after a power failure).
There are other techniques. See http://mysql.rjweb.org/doc.php/deletebig . The main one that comes to mind for your task is to walk through the table 1K rows at a time based on the PRIMARY KEY. (Note that this avoids repeatedly skipping over rejected rows.)
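A sketch of that walk (the primary key column name id is an assumption; the question never names it):
-- Step through the PRIMARY KEY in 1K-row slices, so each DELETE
-- only ever touches rows in its own slice and never re-scans rejected rows.
DELETE FROM tablename
 WHERE id >= 2000000 AND id < 2001000    -- the current 1K slice of the PK
   AND exportInfoId <= 8479;
-- then advance the slice bounds by 1000 and repeat to the end of the table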
If "most" of the table is to be deleted, copying the table over is the best way. (See that link.) Note also that the table will not shrink after DELETEs; the freed up space will be reused, to some extent. (As Bill Commented on.)
If you are deleting "old" data such that PARTITIONing would work, then DROP PARTITION is virtually instantaneous on all versions (since 5.1?) of MySQL. (That is also mentioned in my link.) (Partitioning is usually used for "time-series"; does exportInfoId work like a "time"?)
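A sketch of what the partition route could look like, assuming exportInfoId (or a date column) is suitable as a RANGE partition key; note that partitioning an existing 6B-row table is itself a long operation, and the partition key must be part of every unique key:
ALTER TABLE tablename
PARTITION BY RANGE (exportInfoId) (
    PARTITION p0   VALUES LESS THAN (10000),
    PARTITION p1   VALUES LESS THAN (20000),
    PARTITION pMax VALUES LESS THAN MAXVALUE
);
-- Dropping an entire partition is nearly instantaneous, unlike a row-by-row DELETE:
ALTER TABLE tablename DROP PARTITION p0;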
To Alex, I say "there may be other answers".
As for 100K vs 5K rows: I like 1K because above that you get into "diminishing returns". And at 100K you might hit various buffer size settings that actually cause 100K to run slower.

How to speed up a simple UPDATE of all values in one column, in MySQL

This is taking many hours on a table with over 4.6 million records.
Is there a way to speed this up?
UPDATE tableA
SET SKU = CONCAT("X-", tableA.supplier_SKU);
There is no index on any column yet.
EXPLAIN indicates rows=4.6 million, filtered = 100% !
If there is an index (or indexes) on SKU, dropping it, updating, and recreating it might help.
Can you lock the table first, to ensure no other user is blocking your operation?
lock tables tableA write;
Can you create another table, update there and then rename?
https://dev.mysql.com/doc/refman/5.7/en/rename-table.html
Note: the link above describes how to swap two tables in one statement.
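A sketch of the copy-and-rename route for this particular UPDATE (the column list is an assumption; adapt it to the real schema):
-- Build the rewritten rows in a fresh table, then swap names atomically.
CREATE TABLE tableA_new LIKE tableA;
INSERT INTO tableA_new (supplier_SKU, SKU /* , other columns... */)
SELECT supplier_SKU, CONCAT('X-', supplier_SKU) /* , other columns... */
FROM tableA;
RENAME TABLE tableA TO tableA_old, tableA_new TO tableA;
DROP TABLE tableA_old;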
4.6M records doesn't sound like something that should take hours, unless you can't lock the table because other users keep updating it.
Please provide SHOW CREATE TABLE tableA.
The slow part is needing to save 4.6 million "old" rows before getting to the "commit".
Do not ever use LOCK TABLES with InnoDB.
You could break the task into chunks so that it blocks other actions less. (But the total time will probably be longer.) See this for 'chunking': http://mysql.rjweb.org/doc.php/deletebig#deleting_in_chunks
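A rough sketch of chunking that UPDATE (it assumes an auto-increment primary key named id, which the question does not show):
-- Run repeatedly, advancing the id range by 10000 each time,
-- until the upper bound passes MAX(id); each chunk commits on its own.
UPDATE tableA
   SET SKU = CONCAT('X-', supplier_SKU)
 WHERE id >= 1 AND id < 10001;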

MYSQL Table Times Out During Drop

I have a table with several hundred million rows of data. I want to delete the table, but every operation I perform on it loses the connection after running for 50,000+ seconds (about 16 hours), which is under the 60,000-second timeout I have set in the database. I've tried creating a stored procedure with the DROP TABLE code, thinking that if I send the work to the DB to perform, it will not need a connection to process it, but it does the same thing. Is it just timing out? Or do I need to do something else?
Instead do TRUNCATE TABLE. Internally it creates an equivalent, but empty, table, then swaps. This technique might take a second, even for a very big table.
If you are deleting most of a table, then it is usually faster (sometimes a lot faster), to do
CREATE TABLE new LIKE real;
INSERT INTO new
SELECT ... FROM real
WHERE ... -- the rows you want to keep
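The copy is usually finished off with a swap and a drop, roughly like this (table names follow the example above):
RENAME TABLE real TO old, new TO real;  -- atomic name swap
DROP TABLE old;                         -- discard the rows you did not keep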
Why do you need to delete everything?
For other techniques in massive deletes, including big chunks out of a huge table, see https://mariadb.com/kb/en/mariadb/big-deletes/

What's MySQL's fastest way to delete a database's content?

What is the fastest way to delete a database's content in MySQL on FreeBSD? Please help.
I've tried deleting from Navicat (400,000+ rows), but in an hour... just 100,000 were deleted.
I don't have phpMyAdmin.
To delete everything in a table:
TRUNCATE TABLE table_you_want_to_nuke
To delete certain rows, you have two options:
Follow these steps:
Create a temporary table using CREATE TABLE the_temp_table LIKE current_table
Drop all of the indexes on the temp table.
Copy the records you want to keep with INSERT INTO the_temp_table SELECT * FROM current_table WHERE ...
TRUNCATE TABLE current_table
INSERT INTO current_table SELECT * FROM the_temp_table
To speed this option up, you may want to drop all indexes from current_table before the final INSERT INTO, then recreate them after the INSERT. MySQL is much faster at indexing existing data than it is at indexing on the fly.
The option you're currently trying: DELETE FROM your_table WHERE whatever_condition. You probably need to break this into chunks using the WHERE condition or LIMIT, so you can batch it and not bog down the server forever.
Which is better/faster depends on lots of things, mostly the ratio of deleted records to retained records and the number of indexes involved. As always, test this carefully before doing it on a live database, as both DELETE and TRUNCATE will permanently destroy data.

mysql - Deleting Rows from InnoDB is very slow

I have a MySQL database with approx. 1 TB of data. The table fuelinjection_stroke has approx. 1,000,000,000 rows. DBID is the primary key, automatically incremented by one with each insert.
I am trying to delete the first 1,000,000 rows using a very simple statement:
Delete from fuelinjection_stroke where DBID < 1000000;
This query is taking very long (>24h) on my dedicated 8-core Xeon server (32 GB memory, SAS storage).
Any idea whether the process can be sped up?
I believe that your table becomes locked. I've faced the same problem and found out that I can delete 10k records pretty fast. So you might want to write a simple script/program which will delete records in chunks.
DELETE FROM fuelinjection_stroke WHERE DBID < 1000000 LIMIT 10000;
And keep executing it until it deletes everything
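One way to "keep executing it" without doing so by hand is a small stored procedure that loops until nothing is left to delete; a sketch, in the same spirit as the procedure quoted earlier on this page:
DELIMITER $$
CREATE PROCEDURE purge_fuelinjection()
BEGIN
    REPEAT
        DELETE FROM fuelinjection_stroke WHERE DBID < 1000000 LIMIT 10000;
    UNTIL ROW_COUNT() = 0 END REPEAT;
END$$
DELIMITER ;

CALL purge_fuelinjection();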
Are you space deprived? Is down time impossible?
If not, you could fit in a new INT column length 1 and default it to 1 for "active" (or whatever your terminology is) and 0 for "inactive". Actually, you could use 0 through 9 as 10 different states if necessary.
Adding this new column will take a looooooooong time, but once it's over, your UPDATEs should be lightning fast as long as you do it off the PRIMARY (as you do with your DELETE) and you don't index this new column.
The reason why InnoDB takes so long to DELETE on such a massive table as yours is because of the cluster index. It physically orders your table based upon your PRIMARY (or first UNIQUE it finds...or whatever it feels like if it can't find PRIMARY or UNIQUE), so when you pull out one row, it now reorders your ENTIRE table physically on the disk for speed and defragmentation. So it's not the DELETE that's taking so long. It's the physical reordering after that row is removed.
When you create a new INT column with a default value, the space will be filled, so when you UPDATE it, there's no need for physical reordering across your huge table.
I'm not sure what your schema is exactly, but using a column for a row's state is much faster than DELETEing; however, it will take more space.
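A sketch of that flag-column idea (the column name active and its type are assumptions):
-- One-time, slow, schema change:
ALTER TABLE fuelinjection_stroke ADD COLUMN active TINYINT NOT NULL DEFAULT 1;
-- From then on, "deleting" rows is a fast flag flip keyed off the PRIMARY KEY:
UPDATE fuelinjection_stroke SET active = 0 WHERE DBID < 1000000;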
Try setting values:
innodb_flush_log_at_trx_commit=2
innodb_flush_method=O_DIRECT (for non-windows machine)
innodb_buffer_pool_size=25GB (currently it is close to 21GB)
innodb_doublewrite=0
innodb_support_xa=0
innodb_thread_concurrency=0...1000 (try different values, beginning with 200)
References:
MySQL docs for description of different variables.
MySQL Server Setting Tuning
MySQL Performance Optimization basics
http://bugs.mysql.com/bug.php?id=28382
What indexes do you have?
I think your issue is that the delete is rebuilding the index on every iteration.
I'd delete the indexes if any, do the delete, then re-add the indexes. It'll be far faster (I think).
I was having the same problem, and my table has several indices that I didn't want to have to drop and recreate. So I did the following:
create table keepers
  select * from origTable where {clause to retrieve rows to preserve};
truncate table origTable;
insert into origTable
  select null, keepers.col2, ... keepers.col(last) from keepers;
drop table keepers;
About 2.2 million rows were processed in about 3 minutes.
Your database may be checking for records that need to be modified by foreign keys (ON DELETE cascades).
But I-Conica's answer is a good point (+1). The process of deleting a single record and updating a lot of indexes, done 100,000 times, is inefficient. Just drop the index, delete all records, and create it again.
And of course, check if there is any kind of lock in the database. One user or application can lock a record or table, and your query will be waiting until the user releases the resource or it reaches a timeout. One way to check whether your database is doing real work or just waiting is to launch the query from a connection that sets the --innodb_lock_wait_timeout parameter to a few seconds. If it fails, at least you know that the query is OK and that you need to find and release that lock. Examples of locks are SELECT * FROM XXX FOR UPDATE and uncommitted transactions.
For such long tables, I'd rather use MyISAM, especially if there are not a lot of transactions needed.
I don't know the exact answer to your question, but here is another way to delete those rows; please try this.
-- MySQL has no TOP; wrap the LIMIT in a derived table so it can be used inside IN()
delete from fuelinjection_stroke where DBID in
(
    select DBID from
    (
        select DBID from fuelinjection_stroke order by DBID asc limit 1000000
    ) as t
);