Trigger Performance Degradation (To use or not to use) - MySQL

I've created this trigger:
DELIMITER $$
CREATE TRIGGER `increment_daily_called_count` BEFORE UPDATE ON `list`
FOR EACH ROW BEGIN
    IF (NEW.called_count != OLD.called_count) THEN
        SET NEW.daily_called_count = OLD.daily_called_count + (NEW.called_count - OLD.called_count);
        SET NEW.modify_date = OLD.modify_date;
    END IF;
END
$$
DELIMITER ;
The database table this runs on is accessed and used by hundreds of different scripts in the larger system, and the reason for the trigger is so I don't have to hunt down every little place in those scripts where the called_count might get updated...
My concern is that this particular table gets modified constantly (I'm talking dozens of times per second). Is this going to put undue strain on the database? Am I better off in the long run hunting down all the called_count update queries in the myriad scripts and adding daily_called_count = daily_called_count + 1 to them?
Some specifics I'd like to know the answer to here:
Does use of this trigger essentially make this 3 separate update queries where it was once a single query, or is MySQL smart enough to bundle these queries?
Is there a performance argument for hunting down and modifying the originating queries over using the trigger?
Could this trigger cause any unforeseen weirdness that I'm not anticipating?

Two disclaimers:
I have not worked with MySQL in a very long time, and never used triggers with it. I can only speak from general experience with RDBMS's.
The only way to really know anything for sure is to run a performance test.
That said, my attempts to answer with semi-educated guesses (from experience):
Does use of this trigger essentially make this 3 separate update queries where it was once a single query, or is MySQL smart enough to bundle these queries?
I don't think it's a separate update in the sense of statement execution. But you are adding a computation overhead cost to each row.
However, what I am more worried about is the row-by-row nature of this trigger. It literally says FOR EACH ROW. Generally speaking, row-by-row operations scale poorly in an RDBMS compared to set-based operations. MS SQL Server supports statement-level triggers, where the entire set of affected rows is passed in, so a row-by-row operation is not necessary. This may not be an option in MySQL triggers - I really don't know.
Is there a performance argument for hunting down and modifying the originating queries over using the trigger?
It would certainly make the system do less work. How big the performance impact is, numerically, I can't say. You'd have to test. If it's only a 1% difference, the trigger is probably fine. If it's 50%, well, it'd be worth hunting down all the code. Since hunting down the code is a burden, I suspect it's either embedded in an application or comes dynamically from an ORM. If that is the case, as long as the performance cost of the trigger is acceptable, I'd rather stick with the trigger, as it keeps a DB-specific detail in the DB.
Measure, measure, measure.
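As a rough illustration, a pure-SQL timing harness along these lines could compare the two approaches (this is only a sketch: it assumes an integer primary key id on list, and NOW(6)/TIMESTAMPDIFF need MySQL 5.6+ or MariaDB 5.3+):

-- Time a batch of updates with the trigger in place...
SET @t0 = NOW(6);
UPDATE list SET called_count = called_count + 1 WHERE id BETWEEN 1 AND 100000;
SELECT TIMESTAMPDIFF(MICROSECOND, @t0, NOW(6)) / 1000000 AS trigger_seconds;

-- ...then drop the trigger and time the equivalent manual update.
DROP TRIGGER IF EXISTS increment_daily_called_count;
SET @t0 = NOW(6);
UPDATE list SET called_count = called_count + 1,
                daily_called_count = daily_called_count + 1
WHERE id BETWEEN 1 AND 100000;
SELECT TIMESTAMPDIFF(MICROSECOND, @t0, NOW(6)) / 1000000 AS standard_seconds;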
Could this trigger cause any unforeseen weirdness that I'm not anticipating?
Caching comes to mind. If these columns are part of something an application reads and caches, its cache invalidation is probably tied to when it thinks it changed the data. If the database changes data underneath it, like with a trigger, caching may result in stale data being processed.

First, thanks to @Brandon for his response. I built my own script and test database to benchmark and answer my question... While I don't have a good answer to points 1 and 3, I do have an answer on the performance question...
Note that I am using 10.0.24-MariaDB on our development server, which didn't have anything else running on it at the time.
Here are my results...
Updating 100000 rows:
TRIGGER QUERY TIME: 6.85960197 SECONDS
STANDARD QUERY TIME: 5.90444183 SECONDS
Updating 200000 rows:
TRIGGER QUERY TIME: 13.19935203 SECONDS
STANDARD QUERY TIME: 11.88235188 SECONDS
You folks can decide for yourselves which way to go.

Related

Ruby on rails: ActiveRecords' first_or_create is very slow

I've got a Ruby script which imports XML files into a MySQL database. It does this by looping through the elements in the XML file and finally calling
table.where(
  value: e['value'],
  ...
).first_or_create
The script has to process a lot of data, most of which is already in the database. Because of this, it runs really slowly: first_or_create obviously triggers a lot of SELECT queries.
Is there any way to handle this more rapidly? Is it related to connection management?
Thanks
first_or_create is of course a convenience method, which doesn't care much about performance on bigger data sets.
Ensure all your indices are in place.
The first obvious way to increase performance: every create statement is wrapped in its own BEGIN/COMMIT transaction block, so that's three queries for one insert.
You can place your whole loop inside a single transaction block instead - that will gain you some time, as BEGIN and COMMIT will then only execute once (see the sketch below).
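In SQL terms, the pattern changes roughly like this (the items table and its column are invented for illustration):

-- Before: effectively one transaction per record
-- BEGIN; INSERT INTO items (value) VALUES ('v1'); COMMIT;
-- BEGIN; INSERT INTO items (value) VALUES ('v2'); COMMIT;

-- After: one transaction around the whole batch
START TRANSACTION;
INSERT INTO items (value) VALUES ('v1');
INSERT INTO items (value) VALUES ('v2');
-- ... all remaining inserts ...
COMMIT;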
Remember that the round trip to/from the database takes a considerable amount of time, so an obvious performance boost is to combine multiple statements into one. Try to create one SELECT query that finds a batch of, let's say, 1000 records. The DB will tell you that, say, 200 of them don't exist yet, and you can go ahead and build one INSERT statement for those 200 records.
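In raw SQL, the batched version boils down to two statements per batch (again, the table and column names are just for illustration):

-- 1) One round trip to find which of the batch's values already exist
SELECT value FROM items
WHERE value IN ('v1', 'v2', 'v3' /* ... up to ~1000 values ... */);

-- 2) One round trip inserting only the values the SELECT did not return
INSERT INTO items (value)
VALUES ('v2'), ('v3');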
Always perform measurements, and always try to formulate what level of performance you are trying to achieve, so you don't make the code more convoluted than necessary.
It's better to first work out which records actually need to be created: check which records are not yet in the DB, then create only those.

How can I find the bottleneck in my slow MySQL routine (stored procedure)?

I have a routine in MySQL that is very long and has multiple SELECT, INSERT, and UPDATE statements in it, with some IFs and REPEATs. It's been running fine until lately, when it started hanging and taking over 20 seconds to complete (which is unacceptable considering it used to take 1 second or so).
What is the quickest and easiest way for me to find out where in the routine the bottleneck is? Basically the routine is getting stopped up at some point... how can I find out where that is without breaking the routine apart and testing each section one by one?
If you use Percona Server (a free distribution of MySQL with many enhancements), you can make the slow-query log record times for individual queries, using the log_slow_sp_statements configuration variable. See http://www.percona.com/doc/percona-server/5.5/diagnostics/slow_extended_55.html
If you're using stock MySQL, you can add statements in the stored procedure to set a series of session variables to the value returned by the SYSDATE() function. Use a different session variable at different points in the SP. Then after you run the SP in a test execution, you can inspect the values of these session variables to see what section of the SP took the longest.
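For example (a sketch only; the checkpoint variables and section boundaries are up to you):

-- Inside the stored procedure, capture a timestamp before each section:
SET @step0 = SYSDATE();
-- ... first block of SELECT/INSERT/UPDATE statements ...
SET @step1 = SYSDATE();
-- ... second block ...
SET @step2 = SYSDATE();

-- After a test execution of the procedure, inspect the checkpoints:
SELECT TIMESTAMPDIFF(SECOND, @step0, @step1) AS section_1_seconds,
       TIMESTAMPDIFF(SECOND, @step1, @step2) AS section_2_seconds;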
To analyze a query, you can look at its execution plan. It is not always an easy task, but with a bit of reading you will find the solution. I'll leave some useful links:
http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html
http://dev.mysql.com/doc/refman/5.0/en/explain.html
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
http://www.lornajane.net/posts/2011/explaining-mysqls-explain

SQL Server 2008 - How to implement a "Watch Dog Service" which woofs when too many insert statements on a table

Like my title describes: how can I implement something like a watchdog service in SQL Server 2008 with the following task: alerting or taking an action when too many inserts are committed on a table.
For instance: in a normal situation the error table gets 10 error messages in one second. If there are more than 100 error messages (100 inserts) in one second, then: ALERT!
Would appreciate it if you could help me.
P.S.: No. SQL Jobs are not an option because the watchdog should be live and woof on the fly :-)
Integration Services? Are there easier ways to implement such a service?
Kind regards,
Sani
I don't understand your problem exactly, so I'm not entirely sure whether my answer actually solves anything or just makes an underlying problem worse. Especially if you are facing performance or concurrency problems, this may not work.
If you can update the original table, just add a datetime2 field like
InsertDate datetime2 NOT NULL DEFAULT GETDATE()
Preferably, make an index on that column, and then, at whatever interval fits, poll the table by seeing how many rows have an InsertDate > GETDATE() - X.
For this particular case, you might benefit from making the polling process read uncommitted (or use WITH NOLOCK), although one has to be careful when doing so.
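The polling query itself could be as simple as this (the table name and the one-second window are assumptions, following the example in the question):

-- Hypothetical poll: how many rows arrived in the last second?
SELECT COUNT(*) AS recent_inserts
FROM ErrorTable WITH (NOLOCK)
WHERE InsertDate > DATEADD(SECOND, -1, GETDATE());
-- Raise the alert if recent_inserts exceeds the threshold (e.g. 100).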
If you can't modify the table itself and you can't or won't make another process or job monitor the relevant variables, I'd suggest the following:
Make a 'counter' table that just has one Datetime2 column.
On the original table, create an AFTER INSERT trigger (a rough sketch follows this list) that:
Deletes all rows where the datetime-field is older than X seconds.
Inserts one row with current time.
Counts to see if too many rows are now present in the counter-table.
Acts if necessary - ie. by executing a procedure that will signal sender/throw exception/send mail/whatever.
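Something along these lines in T-SQL (only a sketch: InsertCounter, the one-second window, the 100-row threshold and the dbo.RaiseWatchdogAlert procedure are all assumptions):

CREATE TABLE InsertCounter (InsertedAt datetime2 NOT NULL);
GO
CREATE TRIGGER trg_ErrorTable_Watchdog ON ErrorTable
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- 1) Drop counter rows older than the window
    DELETE FROM InsertCounter WHERE InsertedAt < DATEADD(SECOND, -1, SYSDATETIME());
    -- 2) Record this insert
    INSERT INTO InsertCounter (InsertedAt) VALUES (SYSDATETIME());
    -- 3) Count and act if the threshold is exceeded
    IF (SELECT COUNT(*) FROM InsertCounter) > 100
        EXEC dbo.RaiseWatchdogAlert;  -- hypothetical alerting procedure
END
GO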
If you can modify the original table, add the datetime column to that table instead and make the trigger count all rows that aren't yet X seconds old, and act if necessary.
I would also look into getting another process (ie. a SQL Job, a homemade service, or similar) to do all the housekeeping, ie. deleting old rows, counting rows, and acting on it. Keeping this as the work of the trigger is not a good design and will probably cause problems in the long run.
If possible, you should consider having some other process doing the housekeeping.
Update: A better solution will probably be to make the trigger insert notifications (ie. datetimes) into a queue - if you then have something listening against that queue, you can write logic to determine whether your threshold has been exceeded. However, that will require you to move some of your logic to another process, which I initially understood was not an option.

Extreme low-priority SELECT query in MySQL

Is it possible to issue an (expensive, but low-priority) SELECT query to MySQL in such a way that if an UPDATE query appears in the queue, MySQL will immediately terminate the SELECT and re-append it to the end of the queue?
If re-appending to the queue is not possible, I'm happy with simply killing the SELECT query.
No, not really.
I am not sure exactly what you need, but my guess is that you need to either optimize the SELECT to not lock an entire table, or get the replication going and do the SELECT on the slave rather than the master.
You could theoretically find out what the MySQL process ID is of the SELECT query, and in your application send a KILL before you do any update.
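For instance (the 60-second cutoff and the process ID are placeholders):

-- Find the connection ID of the long-running SELECT...
SELECT ID, TIME, INFO
FROM information_schema.PROCESSLIST
WHERE COMMAND = 'Query' AND INFO LIKE 'SELECT%' AND TIME > 60;

-- ...then terminate just that statement (KILL <id> alone would drop the whole connection).
KILL QUERY 12345;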
Well, sort of maybe.
A client runs an application which occasionally throws out queries that completely kill performance for everything else on the server. We have monitoring, and if we've got a suitable person ready to react, we can deal with that query manually, and we learn about the problems in the app by doing things that way.
But to prevent major outages if no one is on the ball, we have an automated script which terminates long-running queries, so the server does recover in the event that no one is available to intervene within 15 minutes.
Far from ideal, but that's where things are currently at with this project, and it does prevent the occasional extended outages that used to occur. We can only move just so fast with fixing up the problem queries.
Anyway, you could run something similar that looks at the running queries, recognises when you have an update waiting on one of your large selects, and in that event kills the select. Doing this sort of check a few times a minute is not overly expensive. I'd want to do a bit of testing before running it, though.
So, whether you can solve your problem this way depends on what your tolerance is for how long an update can be delayed. Running this every minute (as we do) is no problem at all. Running it every second would noticeably add to the overall load. You'd need to test how far you can reasonably go in between those points.
This approach means some delay before the select gets pushed out of the way, but it saves you having to build this logic into potentially many different places in your application.
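As a sketch of that check (this assumes table-level locking, e.g. MyISAM, where a blocked UPDATE shows the 'Waiting for table level lock' state; the database name and 60-second cutoff are placeholders):

-- List long-running SELECTs while at least one UPDATE is stuck waiting.
SELECT s.ID, s.TIME, s.INFO
FROM information_schema.PROCESSLIST s
WHERE s.DB = 'mydb'
  AND s.INFO LIKE 'SELECT%'
  AND s.TIME > 60
  AND EXISTS (SELECT 1
              FROM information_schema.PROCESSLIST u
              WHERE u.DB = 'mydb'
                AND u.INFO LIKE 'UPDATE%'
                AND u.STATE = 'Waiting for table level lock');
-- Each returned ID would then be passed to KILL QUERY.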
--
Regarding breaking up your query, you're most likely better off restricting the chunks by id range from one or more tables in your query rather than by offset and limit.
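For example (table, columns and bounds invented for illustration):

-- Chunking by id range: each chunk is a cheap range scan on the primary key.
SELECT id, payload FROM big_table WHERE id > 0     AND id <= 10000;
SELECT id, payload FROM big_table WHERE id > 10000 AND id <= 20000;

-- The OFFSET/LIMIT equivalent still has to walk past all of the skipped rows:
SELECT id, payload FROM big_table ORDER BY id LIMIT 10000 OFFSET 10000;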
--
There may also be good solutions available based on partitioning your tables so that the queries don't collide as badly. Make sure you have a very good grasp on what you are doing for this though.

How can I get MySQL trigger execution time?

I have a rather complicated trigger and I'm afraid its execution time is too long. How can I measure it?
A trigger is like every other SQL query, with the difference that it cannot be called explicitly. As for measuring the performance of a SQL query, it really depends on your implementation, so a little more information would be useful.
With PHP, with some tool... how?
The simplest way (in the DB) is to INSERT NOW() into a logging table at the beginning of the trigger and INSERT NOW() again at the end.
But time measurement (if this is what you asked about) is not always the best way to measure performance.
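A minimal version of that idea (the timing table is invented; fractional seconds need MySQL 5.6.4+ or MariaDB 5.3+, and SYSDATE() is used instead of NOW() because NOW() is frozen at statement start and would show zero elapsed time inside the trigger):

-- One-off timing table
CREATE TABLE trigger_timing (label VARCHAR(10), ts DATETIME(6));

-- Inside the trigger body:
--   INSERT INTO trigger_timing VALUES ('start', SYSDATE(6));
--   ... existing trigger logic ...
--   INSERT INTO trigger_timing VALUES ('end', SYSDATE(6));

-- Afterwards, compare the latest pair of timestamps:
SELECT TIMESTAMPDIFF(MICROSECOND,
         (SELECT ts FROM trigger_timing WHERE label = 'start' ORDER BY ts DESC LIMIT 1),
         (SELECT ts FROM trigger_timing WHERE label = 'end'   ORDER BY ts DESC LIMIT 1)
       ) AS trigger_microseconds;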
This is a good way to start - Using the New MySQL Query Profiler