Delete rows older than 14 days in MySQL

Summary
I need to purge the history of a table from rows that are older than 14 days.
Being no MySQL expert, my searches led me to this:
delete
from SYS_VERROUS_POLICE
where idModificationPolice not in (
select distinct idModificationPolice
from SYS_VERROUS_POLICE
where date(dateHeureModification)
between curdate() and curdate() - interval 14 day
);
Thrown Exception
But then I'm stuck with this error message:
Error Code: 1093. You can't specify target table 'SYS_VERROUS_POLICE' for update in FROM clause.
What the...
Context
MySQL seems to be operating in safe update mode, so I just can't perform a DELETE that matches only on dates.
In safe mode, if I try to delete using only the date field, it doesn't comply.
delete
from SYS_VERROUS_POLICE
where date(dateHeureModification) < curdate() - interval 14 day
Error Code: 1175. You are using safe update mode and you tried to update a table without a
WHERE that uses a KEY column
To disable safe mode, toggle the option in Preferences -> SQL Editor and reconnect.
0,00071 sec
Am I missing something?

I understand your point about safe-updates mode. If you try to run UPDATE or DELETE with a condition on a non-indexed expression, it complains, because it can't estimate whether you will accidentally delete your whole table.
Using an expression on DATE(dateHeureModification) > ...
is naturally unindexed. MySQL can't do an index lookup against the result of a function.
You can use LIMIT in your delete query to make it satisfy safe-updates mode; MySQL treats a LIMIT as sufficient protection against accidentally deleting all the rows in the table.
DELETE
FROM SYS_VERROUS_POLICE
WHERE DATE(dateHeureModification) < (curdate() - INTERVAL 14 DAY)
LIMIT 1000;
It's a good idea to run the delete in limited-size batches anyway, so it doesn't create too many locks or add too much to the undo segment.
Just keep doing DELETE in a loop, deleting batches of 1000 rows at a time, and check rows-affected after each batch. Stop the loop when rows-affected reaches 0.
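If you want to keep that loop inside MySQL itself, here is a minimal sketch of the batching as a stored procedure (the procedure name is made up; the table and column names come from the question, and the WHERE clause uses the bare column, as suggested further down in this answer, so an index can be used):
DELIMITER $$
-- Hypothetical helper: deletes in batches of 1000 until nothing is left to delete.
CREATE PROCEDURE purge_old_police_locks()
BEGIN
  REPEAT
    DELETE FROM SYS_VERROUS_POLICE
    WHERE dateHeureModification < (CURDATE() - INTERVAL 14 DAY)
    LIMIT 1000;
  UNTIL ROW_COUNT() = 0 END REPEAT;
END$$
DELIMITER ;

CALL purge_old_police_locks();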
Another idea: I don't think you really need the DATE() function in your WHERE clause. So you might be able to do the DELETE like below, and it will be able to use an index. Besides, it should be faster for the query to find the matching rows if you have an index on dateHeureModification.
DELETE
FROM SYS_VERROUS_POLICE
WHERE dateHeureModification < (curdate() - INTERVAL 14 DAY)
LIMIT 1000;

I don't understand why you complicate it:
delete
from SYS_VERROUS_POLICE
where date(dateHeureModification) < (curdate() - interval 14 day);

#Jean Dous has the correct answer. But just to explain what the problem is in your query: you try to check a condition against the same table you are deleting from, which is like creating a circular reference.
Instead, you materialize the table as a subquery so you can use it.
delete
from SYS_VERROUS_POLICE
where idModificationPolice not in (
select distinct T.idModificationPolice
from (SELECT *
FROM SYS_VERROUS_POLICE) as T
where date(T.dateHeureModification)
between curdate() - interval 14 day and curdate()
);

Related

MySQL give rows a lifetime

How can I give a row a lifetime so that after a specific time, say 2 weeks, the row will automatically be erased? Any info would be great.
RDBMS don't generally allow rows to automatically self destruct. It's bad for business.
More seriously, some ideas, depending on your exact needs
run a scheduled job to run a DELETE to remove rows based on some date/time column
(more complex idea) use a partitioned table with a sliding window to move older rows to another partition
use a view to only show rows less than 2 weeks old
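For the last idea (a view), a minimal sketch, assuming a table my_table with a created_at datetime column (both names are only illustrative):
-- The view hides rows older than 2 weeks without ever deleting them.
CREATE VIEW my_table_recent AS
SELECT *
FROM my_table
WHERE created_at >= NOW() - INTERVAL 2 WEEK;
Queries then read from my_table_recent instead of my_table; the old rows still exist and still take up space, which is the trade-off of this approach.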
Add a timestamp column to the table that defaults to CURRENT_TIMESTAMP, and install a cron job on the server that frequently runs and prunes old records.
DELETE FROM MyTable WHERE datediff(now(), myTimestamp) >= 14;
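A sketch of the column this answer assumes, using the same illustrative names as the DELETE above:
-- TIMESTAMP with DEFAULT CURRENT_TIMESTAMP so new rows are stamped automatically.
ALTER TABLE MyTable
  ADD COLUMN myTimestamp TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;
Note that, as discussed earlier on this page, writing the condition as myTimestamp < NOW() - INTERVAL 14 DAY instead of wrapping the column in datediff() lets MySQL use an index on that column.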
Or you can add a timestamp column and always select like this:
SELECT * FROM myTable WHERE timestampColumn >= date_sub(now(), interval 2 week);
This is better if you don't need to erase the data and only want to show data from the last 2 weeks.

MySQL tables - Deleting Old Rows

I've got a Wordpress site. And some of the database tables are getting large--from data collected by the plug-ins. I know of two ways to delete this data:
I can delete rows manually--1000 at a time--in phpmyadmin.
I can empty the whole table in phpmyadmin.
What I'm looking for is a third way, so that I can delete just the data collected before a certain time period.
As noted above, I know that I can sort the rows by date and delete 1000 at a time--to get rid of old ones. But, there are over a million rows. Is there a single procedure I can use to delete--for example--all of the rows that are more than 60 days old?
To delete records that are more than 60 days old you can do something like this
DELETE
FROM table1
WHERE datefield < DATE_SUB(CURDATE(), INTERVAL 60 DAY)

MySQL event to change a field's value when a date is reached

What I am trying to accomplish is this:
I have table1, which contains user_id, group_id (int with a default value set) and expire_date. Also table2, which among other fields has a user_group_id field that serves as a foreign key to group_id of table1.
When the date is reached, I'd like to change the values of group_id and user_group_id back to their defaults.
Unfortunately it seems I can't figure my way around this, since I'm really new to MySQL.
table1 will contain around 500 rows max. Probably the event won't need to update more than 4-5 rows per run.
Automated alternative solutions are welcome.
mysql 5.2.7
php 5.3.8
CentOs 6
Thanks in advance for any responses!
You need a statement like
update table1 set group_id = DEFAULT where expire_date <= now();
You can run this update query from a cronjob or from a trigger or from a mysql event (http://dev.mysql.com/doc/refman/5.1/en/events.html)
Did it with an event.
CREATE EVENT event_name2
ON SCHEDULE
EVERY 24 HOUR
DO
UPDATE test.employees
SET `group`=DEFAULT
WHERE expire_date <= now( )
At first I thought of using triggers, but they only fire when something is changed in the database, which wasn't the case here.

What is the best way to delete old rows from MySQL on a rolling basis?

I find myself wanting to delete rows older than (x)-days on a rolling basis in a lot of applications. What is the best way to do this most efficiently on a high-traffic table?
For instance, if I have a table that stores notifications and I only want to keep these for 7 days. Or high scores that I only want to keep for 31 days.
Right now I keep a column storing the epoch time posted and run a cron job once per hour that deletes them in increments, like this:
DELETE FROM my_table WHERE time_stored < 1234567890 LIMIT 100
I do that until mysql_affected_rows returns 0.
I used to do it all at once but that caused everything in the application to hang for 30 seconds or so while INSERTS piled up. Adding the LIMIT worked to alleviate this but I'm wondering if there is a better way to do this.
Try creating an Event that will run on the database automatically at the interval you want.
Here is an example:
If you want to delete entries that are more than 30 days old from some table 'tableName' with a datetime column 'datetime', then the following event runs every day and performs the required clean-up.
CREATE EVENT AutoDeleteOldNotifications
ON SCHEDULE EVERY 1 DAY
ON COMPLETION PRESERVE
DO
DELETE LOW_PRIORITY FROM databaseName.tableName WHERE datetime < DATE_SUB(NOW(), INTERVAL 30 DAY)
We need to add ON COMPLETION PRESERVE to keep the event after each run. You can find more info here: http://www.mysqltutorial.org/mysql-triggers/working-mysql-scheduled-event/
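One detail worth checking before relying on the event above: the MySQL event scheduler has to be enabled, otherwise the event is created but never actually runs.
-- Check whether the scheduler is running and turn it on if it is not
-- (needs a privileged account; it can also be set in my.cnf as event_scheduler=ON).
SHOW VARIABLES LIKE 'event_scheduler';
SET GLOBAL event_scheduler = ON;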
Check out MySQL Partitioning:
Data that loses its usefulness can often be easily removed from a partitioned table by dropping the partition (or partitions) containing only that data. Conversely, the process of adding new data can in some cases be greatly facilitated by adding one or more new partitions for storing specifically that data.
See e.g. this section to get some ideas on how to apply it:
MySQL Partition Pruning
And this one:
Partitioning by dates: the quick how-to
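A minimal sketch of the partition-drop idea, assuming a DATETIME column named created_at (the table, column and partition names are only illustrative; note that the partitioning expression must be part of every unique key on the table):
-- Range-partition by month so old data can be discarded a partition at a time.
ALTER TABLE my_table
PARTITION BY RANGE (TO_DAYS(created_at)) (
  PARTITION p2012_01 VALUES LESS THAN (TO_DAYS('2012-02-01')),
  PARTITION p2012_02 VALUES LESS THAN (TO_DAYS('2012-03-01')),
  PARTITION p_future VALUES LESS THAN MAXVALUE
);

-- Dropping a month of old rows is then a near-instant metadata operation
-- instead of a row-by-row DELETE.
ALTER TABLE my_table DROP PARTITION p2012_01;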
Instead of executing the delete against the table alone, try gathering the matching keys first and then do a DELETE JOIN
Given your sample query above
DELETE FROM my_table WHERE time_stored < 1234567890 LIMIT 100 ;
You can leave the LIMIT out of the DELETE itself.
Let's say you want to delete data that is over 31 days old.
Let's compute 31 days in seconds (86400 X 31 = 2678400)
Start with key gathering
Next, index the keys
Then, perform DELETE JOIN
Finally, drop the gathered keys
Here is the algorithm
CREATE TABLE delete_keys SELECT id FROM my_table WHERE 1=2;
INSERT INTO delete_keys
SELECT id FROM
(
SELECT id FROM my_table
WHERE time_stored < (UNIX_TIMESTAMP() - 2678400)
ORDER BY time_stored
) A LIMIT 100;
ALTER TABLE delete_keys ADD PRIMARY KEY (id);
DELETE B.* FROM delete_keys
INNER JOIN my_table B USING (id);
DROP TABLE delete_keys;
If the key gathering takes less than 5 minutes, then run this every 5 minutes.
Give it a Try !!!
UPDATE 2012-02-27 16:55 EDT
Here is something that should speed up key gathering a little more. Add the following index:
ALTER TABLE my_table ADD INDEX time_stored_id_ndx (time_stored,id);
This will better support the subquery that populates the delete_keys table, because it provides a covering index so the fields are retrieved from the index only.
UPDATE 2012-02-27 16:59 EDT
Since you have to delete often, you may want to try this every two months
OPTIMIZE TABLE my_table;
This will defrag the table after all those annoying little deletes every 5 minutes for two months
At my company, we have a similar situation. We have a table that contains keys that have an expiration. We have a cron that runs to clean that out:
DELETE FROM t1 WHERE expiration < UNIX_TIMESTAMP(NOW());
This ran once an hour, but we were having issues similar to what you are experiencing. We increased it to once per minute, then to 6 times per minute: we set up a cron with a bash script that basically does the query, then sleeps for a few seconds and repeats until the minute is up.
The increased frequency significantly decreased the number of rows that we were deleting. Which relieved the contention. This is the route that I would go.
However, if you find that you still have too many rows to delete, use the limit and do a sleep between them. For example, if you have 50k rows to delete, do a 10k chunk with a 2 second sleep between them. This will help the queries from stacking up, and it will allow the server to perform some normal operations between these bulk deletes.
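If you want to keep the chunk-and-pause logic in plain SQL (for example in a script fed to the mysql client), here is a sketch using the numbers above and the question's own table and cutoff:
-- Delete a 10k chunk, pause 2 seconds, repeat; run as many pairs as needed,
-- or wrap the same pattern in a stored-procedure loop.
DELETE FROM my_table WHERE time_stored < 1234567890 LIMIT 10000;
DO SLEEP(2);
DELETE FROM my_table WHERE time_stored < 1234567890 LIMIT 10000;
DO SLEEP(2);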
You may want to consider introducing a master/slave (replication) solution into your design. If you shift all the read traffic to the slave, you open up the master to handle 'on-the-fly' CRUD activities, which then replicate down to the slave (your read server).
And because you are deleting so many records you may want to consider running an optimize on the table(s) from where the rows are being deleted.
Ended up using this to leave only the last 100 rows in place, without significant lag when executed frequently (every minute):
delete a from tbl a left join (
select ID
from tbl
order by id desc limit 100
) b on a.ID = b.ID
where b.ID is null;

MYSQL data archiving

I have a table with a timestamp column that records when the record is modified.
I would like to move, on a nightly basis, all records that are older than 6 days.
Should I use
insert into archive_table select * from regular_table where datediff( now(), audit_updated_date)>=6;
delete from regular_table where datediff( now(), audit_updated_date)>=6;
Since there are 1 million rows in regular_table, is there any way to optimize the queries so they run faster? Also, will the delete lock regular_table?
My main concern is the read query to the db won't be slowed down by this archiving process.
Two suggestions:
Compute the cutoff date once into a variable, and compare against that in both queries, e.g.:
SET @archivalCutoff = DATE_SUB(NOW(), INTERVAL 6 DAY);
insert into archive_table select * from regular_table where audit_updated_date < @archivalCutoff;
delete from regular_table where audit_updated_date < @archivalCutoff;
In fact, what you have in your question runs into problems, especially with lots of records: because the cutoff moves between the INSERT and the DELETE, you may end up with records in both the regular and archive tables, or with records that are deleted but never archived.
The second suggestion is to index the audit_updated_date field.
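For that second suggestion, a one-line sketch (the table and column names come from the question; the index name is made up):
-- An index on the cutoff column lets both the INSERT ... SELECT and the DELETE
-- find old rows with a range scan instead of a full table scan.
ALTER TABLE regular_table ADD INDEX idx_audit_updated_date (audit_updated_date);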