Should a statement be reused as many times as possible, or is there a limitation?
If there is a limitation, when is the right time to close it?
Is creating and closing a statement a costly operation?
Creating and closing a statement doesn't really make sense. I believe what you mean is creating and closing a cursor. A cursor is a query whose result set you iterate over. Typically you see them in Stored Procedures and Functions in MySQL. Yes, they have a cost to open and close, and you should iterate over the entire set.
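For illustration, a minimal cursor inside a stored procedure might look like this (the table and column names here are hypothetical):

DELIMITER $$
CREATE PROCEDURE sum_balances()
BEGIN
    DECLARE done INT DEFAULT 0;
    DECLARE v_balance DECIMAL(10,2);
    DECLARE v_total DECIMAL(10,2) DEFAULT 0;
    DECLARE cur CURSOR FOR SELECT balance FROM accounts;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

    OPEN cur;
    read_loop: LOOP
        FETCH cur INTO v_balance;
        IF done THEN
            LEAVE read_loop;
        END IF;
        SET v_total = v_total + v_balance;
    END LOOP;
    CLOSE cur;

    SELECT v_total;
END
$$
DELIMITER ;

Note how the loop runs until the NOT FOUND handler fires - the cursor is opened, iterated to the end, and closed.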
Alternatively, you may be talking about prepared statements, such as those you might create using the PDO library in PHP. In that case, you can reuse them as many times as you like, and indeed you should, as this is more efficient.
Every time MySQL receives a statement, it translates that into its own internal logic and creates a query plan. Using prepared statements means it only has to do this once rather than every time you call it.
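At the SQL level, the pattern looks like this (a minimal sketch; the users table is hypothetical):

PREPARE stmt FROM 'SELECT name FROM users WHERE id = ?';

SET @id = 1;
EXECUTE stmt USING @id;  -- parsed once, executed with this value

SET @id = 2;
EXECUTE stmt USING @id;  -- re-executed without re-parsing

DEALLOCATE PREPARE stmt; -- release it when you are done

Libraries like PDO do the equivalent of this for you through their prepare/execute methods.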
Finally, you might be trying to ask about a connection rather than a statement. In which case, again, the answer is yes - you can (and should) use it as many times as you need, since opening one carries a significant performance cost. You don't want to keep it open longer than necessary, though, because MySQL has a maximum number of connections it can hold open.
Hopefully one of those will answer your question.
I am in the process of writing my first C client for MySQL 5.5 and have stumbled across the following page in the documentation. Nearly at the end, it states (bold emphasis mine, italic emphasis not mine):
An advantage of mysql_use_result() is [...]. Disadvantages are that
[...]. Furthermore, you must retrieve all the rows even if you
determine in mid-retrieval that you've found the information you were
looking for.
The last sentence is not clear to me.
1) What happens if I don't follow that line?
2) I think that there actually must be a way to prematurely end fetching rows if I decide that I have enough information (otherwise, this whole thing wouldn't make much sense in my eyes).
I understand that something bad could happen if I just stop fetching rows and then try to execute the next statement, but isn't there a function like mysql_finish_fetch() or something like that?
And what happens if I call mysql_free_result()? This should free the result even if I haven't fetched all rows yet, so it should be safe to call it in mid-retrieval and continue with whatever I'd like to do. Am I wrong here?
This sounds like an internal threading issue that MySQL exposes to the client. Chalk it up to the various MySQL gotchas. The short of it is that MySQL apparently has a finite number of "searchers" internally, and using mysql_use_result() apparently dedicates one of them to your API request. Further, MySQL apparently has no exposed API call to cancel such a request. The only option is to see the fetch through until the end.
The slightly longer version: internally, MySQL's cursors apparently have a single code path -- I imagine for performance in the common cases. That code path exits only when the cursor finds no more results. When you use the more common mysql_store_result(), MySQL has done this already before returning the result to the application. When you use mysql_use_result(), however, MySQL requires that you do "the dirty work" of iterating the rest of the result set so as to clear the cursor. Fun.
From the documentation:
mysql_use_result() initiates a result set retrieval but does not actually read the result set into the client like mysql_store_result() does. Instead, each row must be retrieved individually by making calls to mysql_fetch_row(). This reads the result of a query directly from the server without storing it in a temporary table or local buffer, which is somewhat faster and uses much less memory than mysql_store_result(). The client allocates memory only for the current row and a communication buffer that may grow up to max_allowed_packet bytes.
On the other hand, you should not use mysql_use_result() for locking reads if you are doing a lot of processing for each row on the client side, or if the output is sent to a screen on which the user may type a ^S (stop scroll). This ties up the server and prevents other threads from updating any tables from which the data is being fetched.
When using mysql_use_result(), you must execute mysql_fetch_row() until a NULL value is returned; otherwise, the unfetched rows are returned as part of the result set for your next query. The C API gives the error Commands out of sync; you can't run this command now if you forget to do this!
So, to actually answer your questions:
1) What happens if I don't follow that line?
The C API will return the error message: Commands out of sync; you can't run this command now
2) I think that there actually must be a way to prematurely end fetching rows if I decide that I have enough information (otherwise, this whole thing wouldn't make much sense in my eyes).
One would think, but no. You must iterate the result set completely.
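To make the drain requirement concrete, here is a minimal C sketch (the connection handle, query, and column layout are hypothetical); note that the loop keeps fetching even after the interesting row has been found:

#include <mysql.h>
#include <string.h>

/* Returns 1 if "wanted" appears in the first column, 0 if not, -1 on error. */
int find_name(MYSQL *conn, const char *wanted)
{
    MYSQL_RES *res;
    MYSQL_ROW row;
    int found = 0;

    if (mysql_query(conn, "SELECT name FROM users"))
        return -1;

    res = mysql_use_result(conn);
    if (res == NULL)
        return -1;

    /* Must fetch until NULL, even after a hit, so the connection
       is left in a usable state for the next statement. */
    while ((row = mysql_fetch_row(res)) != NULL) {
        if (!found && row[0] != NULL && strcmp(row[0], wanted) == 0)
            found = 1; /* remember the hit, but keep draining */
    }

    mysql_free_result(res);
    return found;
}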
I've created this trigger:
DELIMITER $$
CREATE TRIGGER `increment_daily_called_count` BEFORE UPDATE ON `list`
FOR EACH ROW BEGIN
    IF (NEW.called_count != OLD.called_count) THEN
        SET NEW.daily_called_count = OLD.daily_called_count + (NEW.called_count - OLD.called_count);
        SET NEW.modify_date = OLD.modify_date;
    END IF;
END
$$
DELIMITER ;
The database table this runs on is accessed and used by hundreds of different scripts in the larger system, and the reason for the trigger is so I don't have to hunt down every little place in these scripts where the called_count might get updated...
My concern is that this particular table gets modified constantly (I'm talking dozens of times per second). Is this trigger going to put undue strain on the database? Am I better off in the long run hunting down all the called_count update queries in the myriad scripts and adding daily_called_count = daily_called_count + 1?
Some specifics I'd like to know the answers to:
Does use of this trigger essentially turn this into 3 separate update queries where it was once a single query, or is MySQL smart enough to bundle these queries?
Is there a performance argument for hunting down and modifying the originating queries over using the trigger?
Could this trigger cause any unforeseen weirdness that I'm not anticipating?
Two disclaimers:
I have not worked with MySQL in a very long time, and never used triggers with it. I can only speak from general experience with RDBMS's.
The only way to really know anything for sure is to run a performance test.
That said, my attempts to answer with semi-educated guesses (from experience):
Does use of this trigger essentially turn this into 3 separate update queries where it was once a single query, or is MySQL smart enough to bundle these queries?
I don't think it's a separate update in the sense of statement execution, but you are adding computation overhead to each row.
However, what I am more worried about is the row-by-row nature of this trigger. It literally says FOR EACH ROW. Generally speaking, row-by-row operations scale poorly in an RDBMS compared to set-based operations. MS SQL Server offers statement-level triggers, where the entire set of affected rows is passed in, so a row-by-row operation is not necessary. This may not be an option in MySQL triggers - I really don't know.
Is there a performance argument for hunting down and modifying the originating queries over using the trigger?
It would certainly make the system do less work. I can't say, numerically, how big the performance impact would be; you'd have to test. If it's only a 1% difference, the trigger is probably fine. If it's 50%, well, it'd be worth hunting down all the code. Since hunting down the code is a burden, I suspect it's either embedded in an application or comes dynamically from an ORM. If that is the case, as long as the performance cost of the trigger is acceptable, I'd rather stick with the trigger, as it keeps a DB-specific detail in the DB.
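For illustration, each originating query would change roughly like this (the WHERE clause is hypothetical):

-- before, relying on the trigger:
UPDATE list SET called_count = called_count + 1 WHERE id = 123;

-- after, with the trigger removed:
UPDATE list
SET called_count = called_count + 1,
    daily_called_count = daily_called_count + 1
WHERE id = 123;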
Measure, measure, measure.
Could this trigger cause any unforeseen weirdness that I'm not anticipating?
Caching comes to mind. If these columns are part of something an application reads and caches, its cache invalidation is probably tied to when it thinks it changed the data. If the database changes data underneath it, as with a trigger, the application may end up processing stale cached data.
First, thanks to @Brandon for his response. I built my own script and test database to benchmark and answer my question... While I don't have a good answer to points 1 and 3, I do have an answer on the performance question...
Note that I am using 10.0.24-MariaDB on our development server, which had nothing else running on it at the time.
Here are my results...
Updating 100000 rows:
TRIGGER QUERY TIME: 6.85960197 SECONDS
STANDARD QUERY TIME: 5.90444183 SECONDS
Updating 200000 rows:
TRIGGER QUERY TIME: 13.19935203 SECONDS
STANDARD QUERY TIME: 11.88235188 SECONDS
You folks can decide for yourselves which way to go.
My situation:
MySQL 5.5, but possible to migrate to 5.7
Legacy app executes a single MySQL query to get some data (1-10 rows, 20 columns)
Query can be modified via application configuration
Query is very complex SELECT with multiple JOINS and conditions, it's about 20KB of code
Query is well profiled and its index usage fine-tuned; I spent much time on this and see no room for improvement without splitting it into smaller queries
With a traditional app I would split this large query into several smaller ones and use caching to avoid many JOINs, but my legacy app does not allow that. I can use only one query to return results
My plan to improve performance is:
Reduce parsing time. Parsing 20KB of SQL on every request, while only parameter values change, seems inefficient
I'd like to turn this query into prepared statement and only fill placeholders with data
Query will be parsed once and executed multiple times, which should be much faster
Problems/questions:
First of all: does the above solution make sense?
MySQL prepared statements seem to be session-related. I can't use them, since I cannot execute any additional code ("init code") to create the statements for each session
Another solution I see is to use a prepared statement generated inside a procedure or function. But the examples I saw rely on dynamically generating the query using CONCAT() and executing the prepared statement locally inside the procedure. It seems that such statements will be prepared on every procedure call, so this will not save any processing time
Is there any way to declare a server-wide, rather than session-related, prepared statement in MySQL, so that it survives application restarts and server restarts?
If not, is it possible to cache prepared statements declared in functions/procedures?
I think the following will achieve your goal...
Put the monster in a Stored Routine.
Arrange to always execute that Stored Routine from the same connection. (This may involve restructuring your client and/or inserting a "web service" in the middle.)
The logic here is that Stored Routines are compiled once per connection. I don't know whether that includes caching the "prepare". Nor do I know whether you should leave the query naked, or artificially prepare & execute.
Suggest you try some timings, plus some profiling. The latter may give you clues about what I am uncertain of.
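A minimal sketch of that arrangement (the procedure, parameter, and table names are hypothetical; the tiny SELECT stands in for the real 20KB monster):

DELIMITER $$
CREATE PROCEDURE get_report_rows(IN p_customer_id INT)
BEGIN
    -- stand-in for the 20KB SELECT; the point is that its literals
    -- become procedure parameters
    SELECT o.id, o.total
    FROM orders AS o
    WHERE o.customer_id = p_customer_id;
END
$$
DELIMITER ;

-- the application then issues one tiny statement per request:
CALL get_report_rows(42);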
I have a routine in MySQL that is very long and has multiple SELECT, INSERT, and UPDATE statements in it, with some IFs and REPEATs. It had been running fine until lately; now it's hanging and taking over 20 seconds to complete (which is unacceptable considering it used to take about 1 second).
What is the quickest and easiest way for me to find out where in the routine the bottleneck is? Basically, the routine is getting stopped up at some point... how can I find out where that is without breaking the routine apart and testing each section one by one?
If you use Percona Server (a free distribution of MySQL with many enhancements), you can make the slow-query log record times for individual queries, using the log_slow_sp_statements configuration variable. See http://www.percona.com/doc/percona-server/5.5/diagnostics/slow_extended_55.html
If you're using stock MySQL, you can add statements in the stored procedure to set a series of session variables to the value returned by the SYSDATE() function. Use a different session variable at different points in the SP. Then after you run the SP in a test execution, you can inspect the values of these session variables to see what section of the SP took the longest.
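A minimal sketch of that technique (the section boundaries are hypothetical; on MySQL 5.6+ you could use SYSDATE(6) with TIMESTAMPDIFF(MICROSECOND, ...) for finer resolution):

-- inside the stored procedure, around the sections you suspect:
SET @t0 = SYSDATE(); -- unlike NOW(), SYSDATE() is evaluated when it runs
-- ... section 1 of the routine ...
SET @t1 = SYSDATE();
-- ... section 2 of the routine ...
SET @t2 = SYSDATE();

-- after CALLing the procedure, inspect the timings in the same session:
SELECT TIMESTAMPDIFF(SECOND, @t0, @t1) AS section1_seconds,
       TIMESTAMPDIFF(SECOND, @t1, @t2) AS section2_seconds;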
To analyze the query, you can look at its execution plan. It is not always an easy task, but with a bit of reading you will find the solution. Here are some useful links:
http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html
http://dev.mysql.com/doc/refman/5.0/en/explain.html
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
http://www.lornajane.net/posts/2011/explaining-mysqls-explain
I don't remember ever seeing a way to use prepared statements from the console, and I somehow don't think running an EXPLAIN query through the API as a prepared statement will get what I want.
This is related to this old question of mine.
I'm primarily interested in MySQL but would be interested in other DBs as well.
According to the brief research that I conducted, I don't see a way to get it. Ideally, the real execution plan would be generated once the variables are provided. Lookup tables can quickly eliminate actually running the query if a constant is not present. The ideal execution plan would take into account the frequency of occurrence. My understanding is that MySQL at least used to prepare an execution plan when the statement is prepared in order to validate the expression. Then, when you execute it, it generates another explain plan.
I believe the explain plan is temporarily housed in a table in MySQL but is quickly removed after it is used.
I would suggest asking on the MySQL internals list.
Good Luck,
Jacob
"You can't"
https://dev.mysql.com/doc/internals/en/prepared-stored-statement-execution.html
That basically says that the execution plan created for the prepared statement at compile time is not used. At execution time, once the variables are bound, it uses the values to create a new execution plan and uses that one.
This means that if you want to know what it will do, you can take the query you intended to prepare, substitute in the values you would bind, and EXPLAIN that complete query.
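For example, instead of trying to EXPLAIN the prepared statement itself, substitute the values you would bind and EXPLAIN the resulting query (the table and values here are hypothetical):

-- the statement you would have prepared:
--   SELECT o.id, o.total FROM orders AS o WHERE o.customer_id = ?

-- substitute the bound value and explain the complete query:
EXPLAIN
SELECT o.id, o.total
FROM orders AS o
WHERE o.customer_id = 42;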