Way to automate deletion of expired records in MySQL? - mysql

I've googled around and searched the MySQL docs ad nauseam and couldn't find a succinct way of automating the deletion of records that have exceeded a given age. In 5.1 I've been able to write a query that casts a TIMESTAMP value to DATETIME inside a DIFF function against the current time to see whether the record meets the expiration criteria. I've read that 5.1 can now run scheduled tasks, but I haven't found much on how to configure them. I'm not using triggers for this.
In the MySQL docs for 5.1, it refers to creating an event:
CREATE
    [DEFINER = { user | CURRENT_USER }]
    EVENT
    [IF NOT EXISTS]
    event_name
    ON SCHEDULE schedule
    [ON COMPLETION [NOT] PRESERVE]
    [ENABLE | DISABLE | DISABLE ON SLAVE]
    [COMMENT 'comment']
    DO sql_statement;
schedule:
    AT timestamp [+ INTERVAL interval] ...
  | EVERY interval
    [STARTS timestamp [+ INTERVAL interval] ...]
    [ENDS timestamp [+ INTERVAL interval] ...]
interval:
I currently use Toad (which has been a godsend). My query effectively removes any records that are more than 30 minutes old. I just need to find out how this event gets invoked...
Thanks!

You are talking about using the MySQL Event Scheduler. Once you create that event, MySQL will run it automatically at whatever interval you configure. If you are having trouble getting it set up, post the query and the error you are getting.
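As a minimal sketch of what such an event could look like for the 30-minute case in the question: the table name `records`, the column `created_at`, and the 5-minute polling interval are all hypothetical, and the sketch assumes the Event Scheduler is already running and the account holds the EVENT privilege.

```sql
-- Hypothetical table and column names; adjust to the real schema.
-- Assumes the Event Scheduler is enabled and the user has the EVENT privilege.
CREATE EVENT IF NOT EXISTS purge_expired_records
ON SCHEDULE EVERY 5 MINUTE
COMMENT 'Remove records older than 30 minutes'
DO
  DELETE FROM records
  WHERE created_at < NOW() - INTERVAL 30 MINUTE;
```

Nothing has to invoke the event explicitly; the scheduler thread inside the server runs it on the stated interval. Comparing the bare column against a computed cutoff, rather than wrapping the column in a function, also leaves the optimizer free to use an index on `created_at`.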

Write a query and have it run as a job every so often. Say, check for the expired rows every 30 minutes or so.

If it doesn't have to be exact, and you're just housekeeping, you can tie the purge to another process, provided that process can afford the extra time.
If you have an old invoice file, purge it when month-end is run (possibly a lot of records, but it's a batch process anyway). Purge old inventory items when you add new ones (less frequent, but fewer records possibly). Keeping an access log table? Purge it when the most recent record in it falls on a different day than today. (for low traffic logfiles) And so on.

Related

How to make MySQL run a DELETE query automatically

I have an OTP table and I want to delete data that is older than 5 minutes automatically.
How could I create a trigger or procedure for that?
Use the Event Scheduler for this:
CREATE EVENT remove_old_rows
ON SCHEDULE
EVERY 10 SECOND
COMMENT 'Delete the rows that are older than 5 minutes from OTP table.'
DO
DELETE
FROM OTP_database.OTP_table
WHERE created_at < CURRENT_TIMESTAMP - INTERVAL 5 MINUTE;
Do not forget to enable Event Scheduler.
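A short sketch of how the scheduler might be switched on and checked from a client session. It assumes an account with enough privileges to change global variables; on older versions such as 5.1 to 5.7 the scheduler is off by default, while MySQL 8.0 enables it by default.

```sql
-- Turn the scheduler on for the running server. This does not survive a
-- restart unless the same setting is also placed in the server configuration.
SET GLOBAL event_scheduler = ON;

-- Confirm the setting took effect.
SHOW VARIABLES LIKE 'event_scheduler';

-- List the events defined in the schema used above and check their status.
SHOW EVENTS FROM OTP_database;
```

To keep the setting across restarts, the usual route is an `event_scheduler=ON` line in the `[mysqld]` section of the server's option file.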
Don't do it! Just create a view to get the most recent data:
create view v_otp as
select otp.*
from otp
where otp.created_at >= now() - interval 5 minute;
Anyone who uses the view only sees the most recent data.
Then you can leisurely delete old data during a period when the database is not busy.
An added benefit is that this is always accurate. If an event or job gets delayed, then your users might see old data. Further, this does not involve complicated locking and transaction semantics when the server is busy.
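The "leisurely" cleanup can itself be a scheduled event that fires at a quiet hour. The sketch below is one possible shape, not the answerer's own code: the event name and the 03:00 slot are assumptions, and it presumes the Event Scheduler is enabled.

```sql
-- Purge stale OTP rows once a day at about 03:00 server time.
-- Event name and time slot are illustrative assumptions.
CREATE EVENT IF NOT EXISTS purge_old_otp
ON SCHEDULE EVERY 1 DAY
STARTS (TIMESTAMP(CURRENT_DATE) + INTERVAL 1 DAY + INTERVAL 3 HOUR)
DO
  DELETE FROM otp
  WHERE created_at < NOW() - INTERVAL 5 MINUTE;
```

Because readers go through `v_otp`, rows that expire between two purges are never visible to them; the event only reclaims space.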

Automatically delete outdated rows from database every n seconds

I have a database which has a timestamp column and I want outdated data to be deleted.
So my idea is to put a MySQL query in a .php file that deletes every row where timestamp < current_timestamp - const. As there will be a lot of rows to check, I am going to add an index on the timestamp column.
So how can I run this script automatically every n seconds? I heard about Linux crontab - can I use this on my webserver (not the db server) to execute the .php file periodically, and is this overall a good technique for deleting outdated rows from a database?
The database is on an RDS instance on Amazon Web Services. My webserver is an EC2 instance (also Amazon Web Services).
Doing such a thing requires setting up an event or job. Such efforts keep the database very busy.
I would strongly recommend a different approach. Use a view to access the data you want:
create view v_t as
select t.*
from t
where timestamp > CURRENT_TIMESTAMP - ??;
Then use this view to access the data.
Then, periodically, go in and clean the table to get rid of the rows that you don't want. You can do this once a day, once a week, once an hour -- the deletions can occur at times when the database load is lighter, so it doesn't affect users.
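One way the periodic cleanup might look, as a sketch. The index name, the 1-day retention, and the 10000-row batch size are placeholder assumptions (the view above leaves the interval unspecified), and `timestamp` is backtick-quoted only because it doubles as a type name.

```sql
-- Index so both the view's filter and the cleanup can use a range scan.
CREATE INDEX idx_t_timestamp ON t (`timestamp`);

-- Off-peak cleanup in bounded batches, so no single statement holds
-- locks for long. Rerun until it reports 0 affected rows.
DELETE FROM t
WHERE `timestamp` < CURRENT_TIMESTAMP - INTERVAL 1 DAY
LIMIT 10000;

-- Number of rows removed by the DELETE just executed.
SELECT ROW_COUNT();
```

Whatever runs these statements (a cron-driven script on the web server, or an event where the hosting setup allows it) only needs to loop until `ROW_COUNT()` returns 0.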
I think you should check out the AWS Lambda service.
It allows you to run commands against AWS services without another instance running.
Here's an example on how to set it up.
http://docs.aws.amazon.com/lambda/latest/dg/vpc-rds-deployment-pkg.html
Good luck
Eugene
Gordon Linoff's approach is ideal, but if you want to go the route of scheduled jobs, the MySQL Event Scheduler is something you can try. The following example runs daily and deletes records older than a week.
CREATE EVENT
clean_my_table
ON SCHEDULE EVERY 1 DAY
DO
DELETE FROM my_table
WHERE time_stamp < date_sub(now(), INTERVAL 1 WEEK);
MySQL Event Reference page
https://dev.mysql.com/doc/refman/5.7/en/create-event.html

How to sync records with ETL to a datawarehouse in NRT

I am a newbie with all this ETL stuff.
I wonder what the best solutions are, with tools like PDI (Pentaho Data Integration), to sync some records from operational databases to a data warehouse.
I am in a near-real-time context (so I don't want to sync data once a day but, for example, every 5 minutes).
Three ways immediately come to mind:
using an indexed time column on the operational database
Ex: SELECT * FROM records WHERE date > NOW() - INTERVAL 5 MINUTE
but I could still miss some records or get some duplicates, etc.
using a table or a sync column
Ex: SELECT * FROM records WHERE synced = 'no'
using a queue service
Ex: at record creation, publishing an event to RabbitMQ (or any other tool) saying that something is ready to be synced
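For the first option, the gaps and duplicates that come from a sliding "last 5 minutes" window can be reduced by remembering where the previous run stopped. The following is only a sketch under assumptions: MySQL syntax, a hypothetical bookkeeping table `sync_state`, and a hypothetical job name `records_to_dwh`.

```sql
-- Hypothetical bookkeeping table holding the high-water mark of the last sync.
CREATE TABLE IF NOT EXISTS sync_state (
  job_name       VARCHAR(64) NOT NULL PRIMARY KEY,
  last_synced_at DATETIME    NOT NULL
);

-- 1. Fix the upper bound once so extraction and bookkeeping share one cutoff.
SET @cutoff = NOW();

-- 2. Extract rows in the window (last_synced_at, @cutoff].
SELECT r.*
FROM records AS r
JOIN sync_state AS s ON s.job_name = 'records_to_dwh'
WHERE r.`date` > s.last_synced_at
  AND r.`date` <= @cutoff;

-- 3. Only after the load into the warehouse succeeded, advance the mark.
UPDATE sync_state
SET last_synced_at = @cutoff
WHERE job_name = 'records_to_dwh';
```

Consecutive windows neither overlap nor leave holes, regardless of how late a run starts. Rows whose timestamp is assigned before a long transaction commits can still slip past the cutoff, so this narrows the problem rather than eliminating it.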

deleting item from database after 30 minutes [duplicate]

This question already has an answer here:
Delete MySQL Row after 30 minutes using Cron Jobs/Event Scheduler
(1 answer)
Closed 9 years ago.
I want a script that counts how many users are online on my site, including guests. I have created a session table; the script gives each user an ID and a 30-minute session. My problem is that a user who has been inactive for more than 30 minutes should be deleted from the table, because I count the online users by ID, and I can't work out how to do this.
Is there a simple way to do this?
As stated by Barmar in their answer here:
DELETE FROM my_table
WHERE timestamp < NOW() - INTERVAL 30 MINUTE
Write a PHP script that executes this SQL, and add a crontab entry
that runs it every 30 minutes. Or use the MySQL Event Scheduler to run
it periodically; it is described here.
Since you have not mentioned any specific database, I'll give you a general idea of how to implement it:
You can have a table like this:
USER_SESSIONS
(
USER_ID //UNIQUE_ID
LAST_ACTIVE //TIMESTAMP
)
Here is the behaviour needed to implement the session:
When a user logs in, you create an entry in this table.
When a user logs out, you delete the corresponding entry from this table.
When a user does some activity (depends on how you want to track activity in the front end), update the corresponding TIMESTAMP of the user.
Create a DB schedule (a continuously running process) that monitors the TIMESTAMP column. Many databases have a built-in scheduler. Here is some pseudo code for the scheduler:
FOR each entry in USER_SESSIONS
If (CURRENT_TIMESTAMP - LAST_ACTIVE) > [30 mins] then
delete the entry of the corresponding user from this table.
//This will essentially cause a session timeout.
End If
LOOP;
I hope you got a fair idea of how to implement it.
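The outline above is database-agnostic; the following sketch shows how it might translate to MySQL specifically. The column types, the index name, and the sample ID `guest-123` are assumptions.

```sql
-- Assumed MySQL rendering of the USER_SESSIONS outline.
CREATE TABLE IF NOT EXISTS USER_SESSIONS (
  USER_ID     VARCHAR(64) NOT NULL PRIMARY KEY,  -- session or guest identifier
  LAST_ACTIVE TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP,
  KEY idx_last_active (LAST_ACTIVE)
);

-- Login or any later activity: create the row or refresh its timestamp.
INSERT INTO USER_SESSIONS (USER_ID, LAST_ACTIVE)
VALUES ('guest-123', NOW())
ON DUPLICATE KEY UPDATE LAST_ACTIVE = NOW();

-- Users online right now; correct even if the cleanup has not run yet.
SELECT COUNT(*) AS users_online
FROM USER_SESSIONS
WHERE LAST_ACTIVE >= NOW() - INTERVAL 30 MINUTE;

-- Set-based equivalent of the FOR ... LOOP pseudo code.
DELETE FROM USER_SESSIONS
WHERE LAST_ACTIVE < NOW() - INTERVAL 30 MINUTE;
```

Filtering on `LAST_ACTIVE` in the count means the number stays accurate between cleanups, so the DELETE can run from cron or an event at whatever interval is convenient.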

MySQL - Missed event schedule

I am trying to use the MySQL Event Scheduler in my application. I have not used it before, so I am confused about a few things.
If my computer is off on the scheduled date, will the event run the next day, after I start my computer?
For example:
my event is scheduled for the beginning of every month (no predefined time set).
If my computer/server is off on that date,
will MySQL run the scheduled event the next day, after I turn my computer/server on?
If not, please suggest a solution.
Hmmmm, have you looked at something like this?
MySQL: Using the Event Scheduler
... or:
How to create MySQL Events
... or even: MySQL 5.1 Reference Manual: 19.4.1. Event Scheduler Overview?
Also please keep in mind that SQL DBMS servers are written with the rather strong presumption that they will be kept up and operating 24 hours per day with only brief periods of downtime for maintenance or repairs. There is generally very little consideration for operation on machines which are shut down at night and while not in use.
If you simply store a table of dates and events, then you can query that table for events which have passed or are upcoming within any range you like ... and you can run the program(s) containing those queries (and performing any appropriate activities based on the results) whenever you start your computer and periodically while it's up and running.
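A sketch of what such a table and its catch-up query could look like in MySQL. The table name, columns, and the literal `id = 1` are hypothetical; the application is assumed to run the SELECT at startup and on a timer.

```sql
-- Hypothetical task table: one row per occurrence that must eventually run.
CREATE TABLE IF NOT EXISTS scheduled_tasks (
  id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  task    VARCHAR(64)  NOT NULL,
  due_at  DATETIME     NOT NULL,
  done_at DATETIME     NULL,
  KEY idx_due (done_at, due_at)
);

-- Everything that is due, including occurrences that fell inside a
-- period when the machine was switched off.
SELECT id, task, due_at
FROM scheduled_tasks
WHERE done_at IS NULL
  AND due_at <= NOW()
ORDER BY due_at;

-- After the application has handled a task, mark it as done so it is
-- not picked up again (1 stands for an id returned by the query above).
UPDATE scheduled_tasks
SET done_at = NOW()
WHERE id = 1;
```

Because unfinished rows stay in the table until they are marked done, a missed monthly run is simply found the next time the machine is up.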
These links refer to a feature of MySQL which is designed to have the server internally execute certain commands (MySQL internal commands, such as re-indexing, creating/updating views, cleaning tables of data which "expires", and so on). I don't know whether a MySQL server would attempt to execute all events which came due during downtime, though it should only be a little bit of work to follow the tutorial, schedule some event for some time (say 15 minutes after the time you expect to hit [Enter]), then shut down your computer (or even just the MySQL server) and go off to lunch. Then come back, start it up and see what happens.
The scheduled event could be something absurdly simple, like inserting the "current" time into some table you set up.
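The experiment described above could be set up roughly as follows. The table and event names are hypothetical, and the sketch assumes the Event Scheduler is enabled; it makes no claim about what the result will be.

```sql
-- Log table for the experiment.
CREATE TABLE IF NOT EXISTS event_test_log (
  fired_at DATETIME NOT NULL
);

-- One-shot event due 15 minutes from now. ON COMPLETION PRESERVE keeps
-- the event definition around after it runs so it can still be inspected.
CREATE EVENT IF NOT EXISTS downtime_test
ON SCHEDULE AT CURRENT_TIMESTAMP + INTERVAL 15 MINUTE
ON COMPLETION PRESERVE
DO
  INSERT INTO event_test_log (fired_at) VALUES (NOW());

-- After restarting the server: did it fire, and when?
SELECT fired_at FROM event_test_log;

SELECT EVENT_NAME, STATUS, LAST_EXECUTED
FROM information_schema.EVENTS
WHERE EVENT_NAME = 'downtime_test';
```

Shut the server down before the 15 minutes elapse, restart it afterwards, and compare `fired_at` (if any row exists) with the originally scheduled time.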