My situation:
MySQL 5.5, but possible to migrate to 5.7
Legacy app executes a single MySQL query to get some data (1-10 rows, 20 columns)
Query can be modified via application configuration
Query is a very complex SELECT with multiple JOINs and conditions, about 20 KB of SQL
Query is well profiled and index usage is fine-tuned; I spent much time on this and see no room for improvement without splitting it into smaller queries
With a traditional app I would split this large query into several smaller ones and use caching to avoid many JOINs, but my legacy app does not allow that. I can use only one query to return results
My plan to improve performance is:
Reduce parsing time. Parsing 20 KB of SQL on every request, when only the parameter values change, seems inefficient
I'd like to turn this query into a prepared statement and only fill the placeholders with data
The query would be parsed once and executed multiple times, which should be much faster
Problems/questions:
First of all: does the above solution make sense?
MySQL prepared statements seem to be session-scoped. I can't use that, since I cannot execute any additional code ("init code") to create the statements for each session
The other solution I see is a prepared statement generated inside a procedure or function. But the examples I have seen rely on dynamically building the query with CONCAT() and preparing and executing it locally inside the procedure (see the sketch after this list). It seems that such a statement is re-prepared on every procedure call, so it would not save any processing time
Is there any way to declare a server-wide, not session-scoped, prepared statement in MySQL, so that it survives application restarts and server restarts?
If not, is it possible to cache prepared statements declared in functions/procedures?
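To make the two variants concrete, here is a minimal sketch of what I mean; the orders table and the tiny SELECT are hypothetical stand-ins for my real 20 KB query:

    -- Session-scoped statement: belongs to one connection and
    -- disappears when that connection closes.
    PREPARE big_query FROM 'SELECT * FROM orders WHERE customer_id = ?';
    SET @cid = 42;
    EXECUTE big_query USING @cid;
    DEALLOCATE PREPARE big_query;

    -- The procedure pattern from the examples I found: the query text
    -- is built with CONCAT() and PREPARE runs again on every CALL,
    -- so nothing seems to be saved across invocations.
    DELIMITER //
    CREATE PROCEDURE run_big_query(IN p_cid INT)
    BEGIN
        SET @sql = CONCAT('SELECT * FROM orders WHERE customer_id = ', p_cid);
        PREPARE stmt FROM @sql;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
    END //
    DELIMITER ;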
I think the following will achieve your goal...
Put the monster in a Stored Routine.
Arrange to always execute that Stored Routine from the same connection. (This may involve restructuring your client and/or inserting a "web service" in the middle.)
The logic here is that Stored Routines are compiled once per connection. I don't know whether that includes caching the "prepare". Nor do I know whether you should leave the query naked, or artificially prepare & execute.
Suggest you try some timings, plus try some profiling. The latter may shed light on what I am uncertain about.
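A sketch of what I mean, with a trivial SELECT standing in for your monster (table and names are made up):

    DELIMITER //
    CREATE PROCEDURE monster(IN p_cid INT)
    BEGIN
        -- the 20KB SELECT goes here, "naked", with the parameters
        -- used directly in the WHERE clauses
        SELECT * FROM orders WHERE customer_id = p_cid;
    END //
    DELIMITER ;

    -- the client, on its one long-lived connection, then only ever sends:
    CALL monster(42);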
Related
I'm running into an issue with a Spring Boot application, in which I am getting the below error:
(conn=1126) Can't create more than max_prepared_stmt_count statements (current value: 16382)
This seems to be hitting the ceiling of max_prepared_stmt_count in MySQL. Increasing it as much as we want could be problematic as well, since it could result in OOM-killer issues like this.
I'm exploring if there are any ways to limit the creation of PreparedStatements in Spring boot.
A couple of possible options that I can think of:
Avoiding lazy loading whenever possible, which would force Hibernate to fetch the data with fewer prepared statements, thus avoiding the problem.
Caching the PreparedStatements created by Hibernate.
If anyone has solved this problem or has deeper insight, please share your wisdom.
MySQL has no problem with many prepared statements if they are spread out over time. Obviously some MySQL databases stay up and running for months at a time, serving prepared statements. The total count of prepared statements over those months can be limitless.
The problem is how many prepared statements are cached in the MySQL Server at any given moment.
It has long been a source of trouble for MySQL Server that it allocates a small amount of RAM in the server for each prepared statement, to store the compiled version of the prepared statement and allow it to be executed subsequently. But the server doesn't know if the client will execute it again, so it has to keep that memory allocation indefinitely. The server depends on the client to explicitly deallocate the prepared statement. This can become a problem if the client neglects to "close" its prepared statements. They become long-lived, and eventually all those accumulated data structures take too much RAM in the MySQL Server.
So the variable really should be named max_prepared_stmts_allocated, not max_prepared_stmt_count. Likewise the status variable Prepared_stmt_count should be Prepared_stmts_open or something like that.
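You can watch the gap between the two concepts on a running server:

    -- how many prepared statements are currently allocated (open)
    SHOW GLOBAL STATUS LIKE 'Prepared_stmt_count';

    -- the ceiling you are hitting
    SHOW GLOBAL VARIABLES LIKE 'max_prepared_stmt_count';

If Prepared_stmt_count climbs steadily and never comes back down, something in the client is preparing statements without closing them.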
To fix this in your case, I would make sure in the client code that you deallocate prepared statements promptly when you have no more need of them.
If you open a prepared statement using a try-with-resources block, it should automatically be closed at the end of the block.
Otherwise you should call stmt.close() explicitly when you're done with it.
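For what it's worth, closing the statement is roughly equivalent to the DEALLOCATE step of the server-side lifecycle; a sketch in plain SQL, with a hypothetical table:

    PREPARE stmt FROM 'SELECT * FROM orders WHERE id = ?';  -- allocates RAM in the server
    SET @id = 1;
    EXECUTE stmt USING @id;       -- may run many times with different values
    DEALLOCATE PREPARE stmt;      -- frees the allocation; this is the step
                                  -- a leaked statement never reaches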
So, since cursory googling doesn't reveal anything enlightening:
How does MySQL generate a query plan for a Prepared Statement such as the server-side ones implemented in Connector/J for JDBC? Specifically, does it generate it at the time that the SQL statement is compiled and then reuse it with every execution regardless of the parameters or will it actually adjust the plan in the same manner that would be achieved with issuing each SQL query separately?
If it does happen to be "smart" about it, an explanation of how it does this would be great (e.g. variable peeking)
In almost all cases, the query plan is built when you execute the statement. In MySQL (unlike competing products), building the plan is very fast, so you don't really need to worry about whether it is cached in any way.
Also, by building the plan as needed, different constant values in the query can lead to different query plans, hence faster execution.
(In the extreme, I have seen one statement, with different constants, have 6 different query plans.)
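You can see this with EXPLAIN. A sketch, assuming a hypothetical orders table with an index on status, where most rows have status = 'done':

    -- a selective constant: the optimizer will likely use the index
    EXPLAIN SELECT * FROM orders WHERE status = 'pending';

    -- a non-selective constant: a table scan may be cheaper, so the
    -- same statement text can get a different plan
    EXPLAIN SELECT * FROM orders WHERE status = 'done';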
I will borrow some text from another question here:
The PreparedStatement is a slightly more powerful version of a Statement, and should always be at least as quick and easy to handle as a Statement.
The Prepared Statement may be parametrized
Most relational databases handle a JDBC / SQL query in four steps:
Parse the incoming SQL query
Compile the SQL query
Plan/optimize the data acquisition path
Execute the optimized query / acquire and return data
A Statement will always proceed through the four steps above for each SQL query sent to the database. A Prepared Statement pre-executes steps (1) - (3) in the execution process above. Thus, when creating a Prepared Statement some pre-optimization is performed immediately. The effect is to lessen the load on the database engine at execution time.
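If I understand correctly, in MySQL terms that split looks like this (hypothetical table):

    -- steps (1)-(3) happen once, at PREPARE time (though MySQL may
    -- still redo the planning step per execution)
    PREPARE find_user FROM 'SELECT * FROM users WHERE id = ?';

    -- step (4) happens per execution; only the parameter changes
    SET @id = 1; EXECUTE find_user USING @id;
    SET @id = 2; EXECUTE find_user USING @id;

    DEALLOCATE PREPARE find_user;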
Now here is my question:
If I use hundreds or thousands of Statements, will it cause performance problems in the database? (I don't mean that they will perform slower because of more work to do every time.) Will all those statements be cached in the database, or will they be lost in space as soon as they are executed?
Since there are no restrictions on using prepared statements, you should still work carefully with them.
As you said you need hundreds of prepared statements, think twice; maybe you are using them wrong.
The pattern they are meant for is an application doing heavy inserts/updates/selects hundreds or thousands of times a second, differing only in the variable values. In the real world that looks like: connect, create a session, send the statement once, then send batches of variables to that statement.
But if your plan is to create a prepared statement for each single operation, it is better to use plain queries.
On your questions:
Hundreds of statements will not kill MySQL or cause performance degradation
Prepared statements are stored in memory while the client session is up and running; as soon as you close the session, they are gone
To be sure you need them:
Your app executes statements frequently enough that you actually gain speed from them
Your query does not have a variable number of arguments; otherwise you can kill your app by creating and keeping statement objects in memory for every variation
I have a routine in MySQL that is very long and has multiple SELECT, INSERT, and UPDATE statements in it with some IFs and REPEATs. It's been running fine until lately, when it started hanging and taking over 20 seconds to complete (which is unacceptable considering it used to take 1 second or so).
What is the quickest and easiest way for me to find out where in the routine the bottleneck is? Basically the routine is getting stopped up at some point... how can I find out where that is without breaking the routine apart and testing each section one by one?
If you use Percona Server (a free distribution of MySQL with many enhancements), you can make the slow-query log record times for individual queries, using the log_slow_sp_statements configuration variable. See http://www.percona.com/doc/percona-server/5.5/diagnostics/slow_extended_55.html
If you're using stock MySQL, you can add statements in the stored procedure to set a series of session variables to the value returned by the SYSDATE() function. Use a different session variable at different points in the SP. Then after you run the SP in a test execution, you can inspect the values of these session variables to see what section of the SP took the longest.
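A minimal sketch of that technique; the procedure name and section bodies are placeholders:

    DELIMITER //
    CREATE PROCEDURE my_long_proc()
    BEGIN
        SET @t0 = SYSDATE();
        -- ... first section of the routine ...
        SET @t1 = SYSDATE();
        -- ... second section of the routine ...
        SET @t2 = SYSDATE();
    END //
    DELIMITER ;

    CALL my_long_proc();

    -- session variables survive the CALL, so you can read the timings:
    SELECT TIMESTAMPDIFF(SECOND, @t0, @t1) AS section_1_seconds,
           TIMESTAMPDIFF(SECOND, @t1, @t2) AS section_2_seconds;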
To analyze the query, you can look at its execution plan. It is not always an easy task, but with a bit of reading you will find the solution. I leave some useful links:
http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html
http://dev.mysql.com/doc/refman/5.0/en/explain.html
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
http://www.lornajane.net/posts/2011/explaining-mysqls-explain
The only reliable MySQL support I know of is through node-mysql and it specifically says in the documentation:
Warning: sql statements with multiple queries separated by semicolons are not supported yet.
Which kind of sucks... Because I need to insert or update a large number of rows at a time with some logic involved, and that's easily done with one large MySQL query containing IFs and ELSEs. I can't imagine how I'd move all the logic over to the node.js side without an enormous loss of performance and a lot of added code complexity. So I'm not ready to give up on the pure SQL solution.
Is there any way for me to execute a large SQL query from node.js relatively easily?
Can you define your logic within a stored procedure and then call that as your single statement instead (passing whatever parameters you need)?
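A sketch of the idea, with made-up table and logic; the IF/ELSE lives server-side, so the client only ever sends one statement:

    DELIMITER //
    CREATE PROCEDURE upsert_item(IN p_id INT, IN p_qty INT)
    BEGIN
        IF EXISTS (SELECT 1 FROM items WHERE id = p_id) THEN
            UPDATE items SET qty = qty + p_qty WHERE id = p_id;
        ELSE
            INSERT INTO items (id, qty) VALUES (p_id, p_qty);
        END IF;
    END //
    DELIMITER ;

From node-mysql that should then be a single, supported statement, e.g. connection.query('CALL upsert_item(?, ?)', [id, qty], callback).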