I am writing a few tables for a MySQL database, and within those tables I have entries that I only need for a few weeks.
For example:
create table LoginAttempts (
    ReferenceId int unsigned auto_increment not null,
    UserId smallint unsigned not null,
    EventTime timestamp not null,
    ...
);
Now I have no need to keep records of user login attempts forever, so I plan on deleting them after, say, 30 days. Obviously I need this operation to happen automatically. My original thought was to add a trigger to this table that would run each time a row is inserted: it would compare each row's timestamp to the current timestamp and delete any row more than 30 days old.
However, my concern is the possible performance impact. This method, at least to me, seems inefficient; there ought to be a better way. So, how can I implement what I want in a better manner, if there is one?
Use a cron job or MySQL events for that.
If you go with MySQL events, you create an event like this:
CREATE EVENT event_name
ON SCHEDULE EVERY 1 DAY
DO
DELETE
FROM LoginAttempts
WHERE DATEDIFF(CURDATE(), DATE(EventTime)) > 30;
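As an aside, wrapping EventTime in DATEDIFF()/DATE() prevents MySQL from using an index on that column. A sketch of an index-friendly variant (the index itself is an assumption, not part of the schema above):

CREATE INDEX idx_event_time ON LoginAttempts (EventTime);

CREATE EVENT purge_login_attempts
ON SCHEDULE EVERY 1 DAY
DO
    DELETE FROM LoginAttempts
    WHERE EventTime < NOW() - INTERVAL 30 DAY;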
Use SHOW PROCESSLIST to check whether the event scheduler is enabled. If it's ON, you should see a "Daemon" process owned by the user "event_scheduler".
Use SET GLOBAL event_scheduler = ON; to enable the scheduler if it's not currently enabled.
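A quick sketch of that check-and-enable sequence (note that if the server was started with --event-scheduler=DISABLED, the variable cannot be changed at runtime):

SHOW VARIABLES LIKE 'event_scheduler';  -- ON, OFF, or DISABLED
SET GLOBAL event_scheduler = ON;        -- requires the SUPER privilege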
More on configuring the event scheduler can be found in the MySQL reference manual.
I have run into an issue when using a MySQL database where, after creating a new table and adding CRUD database query logic to my web application (with the backend written in C), UPDATE queries will sometimes take 10-20 minutes to execute.
The web application has Apache modules that talk to server daemons, which have a connection to a MySQL (MariaDB 10.4) database. The server daemons each have about 20 worker threads waiting to handle any requests from the Apache modules. The worker threads maintain a constant connection to the MySQL database. I added a new table with the following schema:
CREATE TABLE MyTable
(
    table_id INT NOT NULL AUTO_INCREMENT,
    index_id INT NOT NULL,
    int_column_1 INT DEFAULT 0,
    decimal_column_1 DECIMAL(9,3) DEFAULT 0,
    decimal_column_2 DECIMAL(9,3) DEFAULT 0,
    varchar_column_1 VARCHAR(3000) DEFAULT NULL,
    varchar_column_2 VARCHAR(3000) DEFAULT NULL,
    deleted TINYINT DEFAULT 0,
    PRIMARY KEY (table_id),
    KEY index_on_index_id (index_id)
);
Then I added the following CRUD operations:
1. RETRIEVE:
SELECT table_id, varchar_column_1, ... FROM MyTable WHERE index_id = ${given index_id}
2. CREATE:
INSERT INTO MyTable (index_id, varchar_column_2, ...) VALUES (${given}, ${given}, ...)
Note: This is done using a prepared statement because ${given varchar_column_2} is a user-entered value.
3. UPDATE:
UPDATE MyTable SET varchar_column_1 = IFNULL(${given varchar_column_1}, varchar_column_1) WHERE table_id = ${given table_id}
Note: This is also done using a prepared statement because the given value is user-entered. The IFNULL is a kludge for the possibility that the given value might be NULL, in which case the column is just set back to the value already in the table.
4. DELETE:
UPDATE MyTable SET deleted = 1 WHERE table_id = ${given table_id}
Finally, there is a delete-by-index_id operation:
UPDATE MyTable SET deleted = 1 WHERE index_id = ${given index_id}
This was deployed to a production server without proper testing. On that production server, a script I wrote was run to fill MyTable with about 30,000 entries. Then, using the CRUD operations, about 600 updates, 50 creates, 20 deletes, and thousands of retrieves were performed on the table. The problem is that after an hour or two of these operations, an update operation would take 10+ minutes to execute. Eventually, all of the worker threads in the server daemon would be stuck waiting on the update operations, and any other requests to the daemon would time out. This behavior happened twice in one day and once more two days later.
Three parts of this behavior really confused me. First, all update operations on the database were blocked, so even if the daemon, or any daemon, was updating a different table in the database, that update would take 10+ minutes. Second, select operations executed instantly while the update queries were taking 10+ minutes. Finally, after 10-20 minutes, all of the 20-ish pending update queries would execute successfully, the database would be correctly updated, and the threads would go back to working properly.
I received a dump of the database and ran EXPLAIN ${mysql query} for each of the new CRUD queries, and none produced strange results; in the Extra column, the only entry was "Using where" for the queries that have WHERE clauses. Another potential problem is the use of varchars: since the UPDATE operations are used the most and seem to be causing the problem, I wondered whether the varchar values changing size a lot (they range from 8 to 500 characters) might run into some MySQL memory issue that causes the long execution times. I also thought there might be an issue with table-level locks, but running
SHOW STATUS LIKE 'Table%';
returned Table_locks_waited = 0.
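For next time, a sketch of the InnoDB lock diagnostics I could capture while a stall is happening; Table_locks_waited only covers table-level locks, while InnoDB row-lock waits show up in these standard views (available in MariaDB 10.4):

SHOW ENGINE INNODB STATUS;  -- the TRANSACTIONS section lists lock waits

SELECT * FROM information_schema.INNODB_TRX
WHERE trx_state = 'LOCK WAIT';  -- transactions currently blocked

SELECT * FROM information_schema.INNODB_LOCK_WAITS;  -- which transaction blocks which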
Unfortunately, no database monitoring was being done on the production server that was having issues; I only have the order of the transactions as they happened. On top of this, each time the issue occurred, the first update query to be blocked was an update to a different table in the database. It was the same query both times (it is also the most common update query in the application), and it had been in the application for months without any issues.
I tried to reproduce the issue on a server with the same table and CRUD operations, but with only 600 entries in MyTable, making about 100 update requests, 20 create requests, 5 delete requests, and hundreds of get requests. I could not reproduce the update queries taking 10+ minutes. This makes me think the size of the table may have something to do with it.
I am looking for any suggestions on what might be causing this issue, or any ideas on how to better diagnose the problem.
Sorry for the extremely long question. I am a junior software engineer who is in a little over his head. Any help would be really appreciated; I can also provide additional information about the database or application if needed.
This question already has an answer here:
Delete MySQL Row after 30 minutes using Cron Jobs/Event Scheduler
I want a script that counts how many users are online on my site, including guests, so I created a sessions table: the script gives each visitor an ID and sets a 30-minute session. Now I have a problem: if a visitor is inactive for more than 30 minutes, they should be deleted from the database, because I count how many users are online by ID, and I'm getting a headache over how to do this.
Is there a simple way to do this?
As stated by Barmar in their answer here:
DELETE FROM my_table
WHERE timestamp < NOW() - INTERVAL 30 MINUTE
Write a PHP script that executes this SQL, and add a crontab entry
that runs it every 30 minutes. Or use the MySQL Event Scheduler to run
it periodically; it is described here.
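For completeness, a minimal sketch of the event scheduler variant, using the placeholder names from the quoted query (the event name is made up):

CREATE EVENT purge_stale_sessions
ON SCHEDULE EVERY 30 MINUTE
DO
    DELETE FROM my_table
    WHERE `timestamp` < NOW() - INTERVAL 30 MINUTE;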
Since you have not mentioned any specific database, I'll give you a general idea of how to implement it:
You can have a table like this:
USER_SESSIONS
(
USER_ID //UNIQUE_ID
LAST_ACTIVE //TIMESTAMP
)
Here is the functionality you can associate to implement the session:
When a user logs in, you create an entry in this table.
When a user logs out, you delete the corresponding entry from this table.
When the user does some activity (depending on how you want to track activity in the front end), update the corresponding TIMESTAMP for the user.
Create a DB schedule (a continuously running process) that monitors the TIMESTAMP column; all good databases have a built-in scheduler. Here is some pseudo code for the scheduler, with a concrete MySQL sketch after it:
FOR each entry in USER_SESSIONS
    IF (CURRENT_TIMESTAMP - LAST_ACTIVE) > [30 mins] THEN
        delete the entry of the corresponding user from this table
        //This will essentially cause a session timeout.
    END IF
LOOP;
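In MySQL specifically, that pseudo code might look like the sketch below; the ON UPDATE clause keeps LAST_ACTIVE fresh automatically whenever the row is updated, and the event name and one-minute check interval are assumptions:

CREATE TABLE USER_SESSIONS
(
    USER_ID INT UNSIGNED NOT NULL PRIMARY KEY,
    LAST_ACTIVE TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
                ON UPDATE CURRENT_TIMESTAMP
);

CREATE EVENT expire_user_sessions
ON SCHEDULE EVERY 1 MINUTE
DO
    DELETE FROM USER_SESSIONS
    WHERE LAST_ACTIVE < NOW() - INTERVAL 30 MINUTE;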
I hope you got a fair idea of how to implement it.
I've done some googling but can't really find much relevant information. I'm trying to set a date/time at which certain rows get deleted, depending on activity: each time a row is activated (inserted/updated), its deletion time gets bumped to a later time; otherwise the row gets deleted when that time passes. I've managed to update the rows' timestamps when they are activated; it's the automated deletion I can't work out.
Thanks in advance.
Firstly, do not put this update/delete in a trigger: if you have millions of rows that need to be deleted, you are going to see a huge performance hit on inserts/updates, and a trigger is not the best place for it. You can either create a cron job as Filype suggested or, if you want to keep it all in MySQL, use the MySQL Event Scheduler.
Go to this page to read more about scheduling events in MySQL:
http://dev.mysql.com/doc/refman/5.1/en/events.html
MySQL events allow you to schedule things on MySQL on a regular basis.
The code would look something like:
CREATE EVENT myevent
ON SCHEDULE EVERY 1 HOUR
DO
    DELETE FROM MyTable WHERE Expired < NOW();
Here is a suggestion I haven't tried yet: you might update the row with deleted=1 instead of actually deleting the record.
CREATE TRIGGER deleteInactiveRecords AFTER INSERT ON myTable
FOR EACH ROW
    DELETE FROM myTable WHERE updated < NOW() - INTERVAL 1 DAY;
Be aware, though, that MySQL allows only one event type per trigger (so AFTER UPDATE would need a second trigger) and will not let a trigger delete from the table it fires on, so in practice this would have to target a different table; that restriction is another reason to prefer the event scheduler here.
MySQL 5.1, Ubuntu 10.10 64bit, Linode virtual machine.
All tables are InnoDB.
One of our production machines uses a MySQL database containing 31 related tables. In one table, there is a field containing display values that may change several times per day, depending on conditions.
These changes to the display values are applied lazily throughout the day during usage hours. A script runs periodically, checks a few inexpensive conditions that may cause a change, and updates the display value if a condition is met. However, to keep background process load to a minimum during working hours, this lazy method doesn't catch every possible scenario in which the display value should be updated.
Once per night, a script purges all display values stored in the table and recalculates them all, thereby catching all possible changes. This is a much more expensive operation.
This has all been running consistently for about 6 months. Suddenly, 3 days ago, the run time of the nightly script went from an average of 40 seconds to 11 minutes.
The overall proportions on the stored data have not changed in a significant way.
I have investigated as best I can, and the part of the script that is suddenly running slower is the last update statement that writes the new display values. It is executed once per row, given the (INT(11)) id of the row and the new display value (also an INT).
update `table` set `display_value` = ? where `id` = ?
The funny thing is that the purge of all the previous values is executed as:
update `table` set `display_value` = null
And this statement still runs at the same speed as always.
The display_value field is not indexed; id is the primary key. There are 4 other foreign keys in the table that are not modified at any point during execution.
And the final curve ball: if I dump this schema to a test VM and execute the same script, it runs in 40 seconds, not 11 minutes. I have not attempted to rebuild the schema on the production machine, as that's simply not a long-term solution and I want to understand what's happening here.
Is something off with my indexes? Do they get cruft in them after thousands of updates on the same rows?
Update
I was able to completely resolve this problem by running optimize on the schema. Since InnoDB doesn't support OPTIMIZE directly, this forced a table rebuild, which resolved the issue. Perhaps I had a corrupted index?
mysqlcheck -A -o -u <user> -p
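For a single table the equivalent statement is below; on InnoDB it reports that the table does not support optimize and does "recreate + analyze" instead, i.e. it rebuilds the table and refreshes the index statistics:

OPTIMIZE TABLE `table`;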
There is a chance that the UPDATE statement won't use the index on id; however, that's very improbable (if possible at all) for a query like yours.
Is there a chance your table is locked by a long-running concurrent query / DML? Which engine does the table use?
Also, updating the table record by record is not efficient. You can load your values into a temporary table in bulk and update the main table with a single command:
CREATE TEMPORARY TABLE tmp_display_values (id INT NOT NULL PRIMARY KEY, new_display_value INT);
INSERT
INTO tmp_display_values
VALUES
(?, ?),
(?, ?),
…;
UPDATE `table` dv
JOIN tmp_display_values t
ON dv.id = t.id
SET dv.display_value = t.new_display_value;
I've googled around and searched the MySQL docs ad nauseam and couldn't find a succinct way of automating the deletion of records older than a given timeframe. I've been able to write a query in 5.1 that casts a TIMESTAMP value to DATETIME inside a date-difference function against the current time, to see whether a record meets the expiration criteria. I've read that 5.1 now has the capability of running scheduled tasks, but I haven't found much on configuring it. I'm not using triggers for this.
In the MySQL docs for 5.1, it refers to creating an event:
CREATE
    [DEFINER = { user | CURRENT_USER }]
    EVENT
    [IF NOT EXISTS]
    event_name
    ON SCHEDULE schedule
    [ON COMPLETION [NOT] PRESERVE]
    [ENABLE | DISABLE | DISABLE ON SLAVE]
    [COMMENT 'comment']
    DO sql_statement;
schedule:
    AT timestamp [+ INTERVAL interval] ...
  | EVERY interval
    [STARTS timestamp [+ INTERVAL interval] ...]
    [ENDS timestamp [+ INTERVAL interval] ...]
interval:
    quantity {YEAR | QUARTER | MONTH | DAY | HOUR | MINUTE |
              WEEK | SECOND | YEAR_MONTH | DAY_HOUR | DAY_MINUTE |
              DAY_SECOND | HOUR_MINUTE | HOUR_SECOND | MINUTE_SECOND}
I currently use Toad (which has been a Godsend). My query effectively removes any records that are more than 30 minutes old. I just need to find out how this event gets invoked...
Thanks!
You are talking about using the MySQL Event Scheduler. Once you create that event, MySQL will run it automatically at whatever interval you configure it with. If you are having trouble getting it set up, post the query and the error you are getting.
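As a quick sanity check that the event exists and is firing, the scheduler's metadata can be queried directly (nothing here is specific to your schema):

SHOW EVENTS;  -- lists events in the current schema

SELECT EVENT_NAME, STATUS, LAST_EXECUTED
FROM information_schema.EVENTS;  -- includes the last time each event ran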
Write a query and have it run as a job every so often; say, check for the expired rows every 30 minutes or so.
If it doesn't have to be exact and you're just housekeeping, you can tie the process to another one, if you can afford the time.
If you have an old invoice file, purge it when month-end is run (possibly a lot of records, but it's a batch process anyway). Purge old inventory items when you add new ones (less frequent, but possibly fewer records). Keeping an access log table? Purge it when the most recent record in it falls on a different day than today (for low-traffic logfiles). And so on.
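A minimal sketch of that last access-log idea (table and column names here are made up):

-- Run this just before inserting a new log row; the purge only fires on
-- the first hit of a new day, when the newest existing row is from an
-- earlier day than today. The derived table avoids MySQL's restriction
-- on selecting from the table being deleted from.
DELETE FROM access_log
WHERE log_time < CURDATE() - INTERVAL 30 DAY
  AND (SELECT newest
       FROM (SELECT MAX(log_time) AS newest FROM access_log) d) < CURDATE();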