I need to delete a lot of rows, in multiple tables, that are older than a certain date.
Currently I'm doing this:
DELETE FROM trash1 WHERE deletion_date < DATE_SUB(NOW(), INTERVAL 48 HOUR);
DELETE FROM trash2 WHERE deletion_date < DATE_SUB(NOW(), INTERVAL 48 HOUR);
DELETE FROM trash3 WHERE deletion_date < DATE_SUB(NOW(), INTERVAL 48 HOUR);
Is there any approach to make it better or faster?
Is it possible to do it in only one query?
Thanks.
The first thing I would check is the efficiency of the condition WHERE deletion_date < DATE_SUB(NOW(), INTERVAL 48 HOUR) on those tables using EXPLAIN.
EXPLAIN DELETE FROM trash1 WHERE deletion_date < DATE_SUB(NOW(), INTERVAL 48 HOUR);
And so on. You're looking to see if the query is using an indexed column or not and how many rows it has to search through. You want to avoid searching every row in the table.
The next thing to check is whether deletion_date is indexed, and the type of index. Because this is a range query (i.e. "less than") rather than a simple equality check, the type of index can matter. There's a discussion about optimizing range queries in the MySQL manual.
Start with that, see how it goes. If you have trouble, post the output of your EXPLAINs.
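To make the "check the plan, then check the index" advice concrete, here is a minimal sketch. It uses SQLite through Python's built-in sqlite3 module purely as a runnable stand-in for MySQL, so the plan text differs from what MySQL's EXPLAIN prints. The index name idx_trash1_deletion_date and the helper plan_for are made up for the illustration.

```python
import sqlite3
from datetime import datetime, timedelta

def plan_for(conn, sql, params=()):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trash1 (id INTEGER PRIMARY KEY, deletion_date TEXT)")

# Cutoff computed in Python; in MySQL this would be DATE_SUB(NOW(), INTERVAL 48 HOUR).
cutoff = (datetime.now() - timedelta(hours=48)).strftime("%Y-%m-%d %H:%M:%S")
delete_sql = "DELETE FROM trash1 WHERE deletion_date < ?"

# Without an index the only option is to look at every row.
plan_without_index = plan_for(conn, delete_sql, (cutoff,))

# Hypothetical index name; the MySQL statement would have the same shape.
conn.execute("CREATE INDEX idx_trash1_deletion_date ON trash1 (deletion_date)")
plan_with_index = plan_for(conn, delete_sql, (cutoff,))

print("before:", plan_without_index)
print("after: ", plan_with_index)
```

In MySQL the equivalent check is to run EXPLAIN on the DELETE before and after creating the index and compare the reported access type and row estimate.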
I want to compare two queries. Here is a simplified version of my case.
Assume that the table is indexed using the log_timestamp column.
First query:
SELECT
name
FROM
user_table
WHERE
DATE(DATETIME_ADD(log_timestamp , INTERVAL 7 HOUR)) >= DATE('2018-01-01')
AND
DATE(DATETIME_ADD(log_timestamp , INTERVAL 7 HOUR)) < DATE('2019-01-01');
Second query:
SELECT
name
FROM
user_table
WHERE
log_timestamp >= DATETIME_SUB('2018-01-01', INTERVAL 7 HOUR)
AND
log_timestamp < DATETIME_SUB('2019-01-01', INTERVAL 7 HOUR);
Which of the two queries above would be faster and why?
To judge whether a query is fast or slow, you first need to know how many records are in the table.
If the table holds only around 100 to 1,000 records (depending on the number of fields in the table), both queries will return results in almost the same time.
Once the number of records exceeds 100,000, the difference in response time starts to become noticeable.
Remember to use the EXPLAIN statement to see how the query is executed.
Let's analyze the two queries.
First Query
SELECT
name
FROM
user_table
WHERE
DATE(DATETIME_ADD(log_timestamp , INTERVAL 7 HOUR)) >= DATE('2018-01-01')
AND
DATE(DATETIME_ADD(log_timestamp , INTERVAL 7 HOUR)) < DATE('2019-01-01');
MySQL will:
evaluate DATE(DATETIME_ADD(log_timestamp , INTERVAL 7 HOUR)) for every record in the table, without using the index,
then compare the result with >= DATE('2018-01-01'),
evaluate DATE(DATETIME_ADD(log_timestamp , INTERVAL 7 HOUR)) again for every record in the table, without using the index,
then compare the result with < DATE('2019-01-01'),
and display the results.
Notes:
Imagine you have 100,000 records in the table: it will take time to display the results.
Second Query
SELECT
name
FROM
user_table
WHERE
log_timestamp >= DATETIME_SUB('2018-01-01', INTERVAL 7 HOUR)
AND
log_timestamp < DATETIME_SUB('2019-01-01', INTERVAL 7 HOUR);
MySQL will:
compare log_timestamp >= DATETIME_SUB('2018-01-01', INTERVAL 7 HOUR) using the index, not a full table scan,
compare log_timestamp < DATETIME_SUB('2019-01-01', INTERVAL 7 HOUR) using the index, not a full table scan,
and display the results.
Notes:
Remember: an index on a table is just like the index in a book. When you want to read a book that has more than 1,000 pages, you check the index first to find the page you are looking for. You do not read every page to find the topic you want.
The question you asked should really be two separate questions. The first, along the lines of what you asked above, is which of the two queries is faster right now. The second question, which is really the one to consider, is how can you tune both queries to make them faster, and which one would be the fastest.
As it turns out, only the second query can use an index:
SELECT name
FROM user_table
WHERE log_timestamp >= DATETIME_SUB('2018-01-01', INTERVAL 7 HOUR) AND
log_timestamp < DATETIME_SUB('2019-01-01', INTERVAL 7 HOUR);
This query should benefit from an index on (log_timestamp, name). Note that your first query cannot really benefit from any index, so I expect your second query to be much faster, after the right index has been created.
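Before swapping the first query for the second, it is worth confirming that the two WHERE clauses select the same rows. The sketch below mirrors both predicates with Python's datetime module and tests timestamps on either side of each boundary. The function names and sample values are made up for the illustration, and it assumes log_timestamp is a plain date-time value with no time zone attached.

```python
from datetime import date, datetime, timedelta

OFFSET = timedelta(hours=7)

def first_predicate(log_timestamp):
    # Mirrors: DATE(log_timestamp + 7h) >= '2018-01-01' AND DATE(...) < '2019-01-01'
    shifted_day = (log_timestamp + OFFSET).date()
    return date(2018, 1, 1) <= shifted_day < date(2019, 1, 1)

def second_predicate(log_timestamp):
    # Mirrors: log_timestamp >= '2018-01-01' - 7h AND log_timestamp < '2019-01-01' - 7h
    lower = datetime(2018, 1, 1) - OFFSET   # 2017-12-31 17:00:00
    upper = datetime(2019, 1, 1) - OFFSET   # 2018-12-31 17:00:00
    return lower <= log_timestamp < upper

samples = [
    datetime(2017, 12, 31, 16, 59, 59),  # one second before the lower bound
    datetime(2017, 12, 31, 17, 0, 0),    # exactly the lower bound
    datetime(2018, 6, 15, 12, 0, 0),     # mid-year
    datetime(2018, 12, 31, 16, 59, 59),  # one second before the upper bound
    datetime(2018, 12, 31, 17, 0, 0),    # exactly the upper bound
]

results = [(first_predicate(ts), second_predicate(ts)) for ts in samples]
for ts, (a, b) in zip(samples, results):
    print(ts, a, b)
```

Both predicates agree on every sample, which is what makes the rewrite safe: the arithmetic moves from the column to the constants, so the column is left bare and an index on it becomes usable.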
I want to delete the records that are older than 1 day. What is the best way to achieve it? I have never used an event before, so I am having a little trouble.
For example, I want to delete records where start_time is older than 1 day.
After some research I got to this point:
CREATE EVENT deleteRecords
ON SCHEDULE EVERY 1 DAY
ON COMPLETION PRESERVE
DO
DELETE FROM databaseName.tableName WHERE start_time < DATE_SUB(NOW(), INTERVAL 1 DAY);
There are several ways to achieve the desired result.
-- sample 1
DELETE FROM databaseName.tableName WHERE DATE_ADD(start_time,INTERVAL 1 DAY) < NOW();
-- sample 2
DELETE FROM databaseName.tableName WHERE ADDTIME(start_time,"1 00:00:01") <= NOW();
The code in your post is fine to use too, and because it leaves start_time unwrapped it can use an index on that column, which the two samples above cannot. Don't forget to back up your data first if you're unsure of what you're doing.
Reference: Date and Time Functions
I have a system where you can get something for "free", but only once every 7 days. I'm currently having an issue with the once-every-7-days part.
What I want to do is delete entries from a certain table once they are more than 7 days old. The table has ID, USERNAME and DATE columns.
Any thoughts?
It should be as easy as:
delete from theTable where date < now() - interval 7 day;
Make sure to run it often enough so that you don't have to delete too many rows at once.
If you're in an environment without replication you can go ahead and add a limit:
delete from theTable where date < now() - interval 7 day limit 1000;
And if this is a large table, put an index on date (or a composite index where date is the first column) so it doesn't do a table scan.
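The "delete a limited number of rows, and run it often" idea can also be written as a loop that repeats until nothing is left to delete. This is a rough sketch using SQLite through Python's sqlite3 module as a stand-in for MySQL. Python's bundled SQLite usually does not accept DELETE ... LIMIT, so the batch is selected with a subquery on rowid, whereas in MySQL the LIMIT can go directly on the DELETE. The function name, batch size, index name and sample data are all assumptions made for the demonstration.

```python
import sqlite3
from datetime import datetime, timedelta

def delete_in_batches(conn, cutoff, batch_size=1000):
    """Delete rows older than cutoff in chunks; return the total number removed."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM theTable WHERE rowid IN ("
            " SELECT rowid FROM theTable WHERE date < ? LIMIT ?)",
            (cutoff, batch_size),
        )
        conn.commit()  # keep each batch in its own short transaction
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE theTable (id INTEGER PRIMARY KEY, username TEXT, date TEXT)")
conn.execute("CREATE INDEX idx_theTable_date ON theTable (date)")  # hypothetical index name

# Sample data: 2500 stale rows and 10 recent ones, relative to a fixed 'now'.
now = datetime(2024, 1, 15, 12, 0, 0)
old_rows = [("user%d" % i, "2024-01-01 00:00:00") for i in range(2500)]
new_rows = [("fresh%d" % i, "2024-01-14 00:00:00") for i in range(10)]
conn.executemany("INSERT INTO theTable (username, date) VALUES (?, ?)", old_rows + new_rows)
conn.commit()

cutoff = (now - timedelta(days=7)).strftime("%Y-%m-%d %H:%M:%S")
deleted = delete_in_batches(conn, cutoff)
remaining = conn.execute("SELECT COUNT(*) FROM theTable").fetchone()[0]
print(deleted, remaining)
```

Committing after every batch keeps each transaction short, which is the main reason to cap the number of rows per statement.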
I am looking for a query that is able to delete all rows from a table in a database where timestamp is older than the current date/time or current timestamp.
Would really appreciate some help out here urgently!
Here's the query I am using but as I thought it ain't working:
delete from events where timestamp<CURRENT_TIMESTAMP{);
Um... This may seem silly, but every record in the table will be older than NOW(), since NOW() is calculated at the time the query is processed. If you want to delete records that are older than another record, then you don't want to use NOW(), but the timestamp from the record you're comparing the rest to. Or, if you want to delete records that are older than a specific point in time, then you need to calculate the timestamp to compare against. For example, to delete records older than 10 minutes, you could use this:
DELETE FROM events WHERE timestamp < (NOW() - INTERVAL 10 MINUTE)
Or, for deleting records that are over a day old:
DELETE FROM events WHERE timestamp < (NOW() - INTERVAL 1 DAY)
For specific points in time (e.g. Oct. 12th, 2012 at 4:15:00 PM GMT), there's a method to do that, but the syntax escapes me, right now. Where's my MySQL manual? :)
delete from events where timestamp < NOW()
should be enough.
DELETE FROM events WHERE timestamp < UNIX_TIMESTAMP(NOW())
or, if it's a standard DATETIME column:
DELETE FROM events WHERE timestamp < NOW()
Hibernate (hql) Delete records older than 7 days
I am not sure, but you can try this:
String hqlQuery = "from PasswordHistory pwh "
+ "where pwh.created_date < datediff(curdate(), INTERVAL 7 DAY)";
List<Long> userList = (List<Long>)find(hqlQuery);
deleteAll(userList );// from baseDao
public void deleteAll(Collection list) {
getHibernateTemplate().deleteAll(list);
}
DELETE FROM table WHERE date < '2011-09-21 08:21:22';
In MySQL I am trying to select rows from a table that have a lock_dt older than 10 hours. How can I write this sql statement properly?
select phone from table where ((now() - lock_dt) < 10 hours)
SELECT phone
FROM table
WHERE lock_dt > NOW() - INTERVAL 10 HOUR
Using intervals is pretty handy. Also, I try to isolate the column so that an index can be used (sometimes the RDBMS doesn't know how to use an index with NOW() - lock_dt, even if there's an index on lock_dt).
Also, the description of your problem contradicts your query. NOW() - lock_dt < 10 hours means the interval is less than 10 hours, and that is what my query selects. You have to change > to < if you want rows older than 10 hours.
SELECT phone FROM table WHERE lock_dt < DATE_SUB(now(), interval 10 hour);
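The direction of the comparison is easy to get backwards, so here is a tiny sketch of the same test in Python. The function name and the fixed value of now are made up for the illustration.

```python
from datetime import datetime, timedelta

def locked_longer_than(lock_dt, now, hours=10):
    # Same comparison as: lock_dt < NOW() - INTERVAL 10 HOUR
    return lock_dt < now - timedelta(hours=hours)

now = datetime(2024, 1, 15, 12, 0, 0)
eleven_hours_ago = now - timedelta(hours=11)
two_hours_ago = now - timedelta(hours=2)

print(locked_longer_than(eleven_hours_ago, now))
print(locked_longer_than(two_hours_ago, now))
```

A lock taken 11 hours ago is earlier than the cutoff, so it satisfies the "older than 10 hours" condition, while one taken 2 hours ago does not.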