How to prematurely finish mysql_use_result() / mysql_fetch_row()? - mysql

I am in the process of writing my first C client for MySQL 5.5 and have stumbled across the following page in the documentation. Nearly at the end, it states (bold emphasis mine, italic emphasis not mine):
An advantage of mysql_use_result() is [...]. Disadvantages are that
[...]. Furthermore, you must retrieve all the rows even if you
determine in mid-retrieval that you've found the information you were
looking for.
The last sentence is not clear to me.
1) What happens if I don't follow that line?
2) I think that there actually must be a way to prematurely end fetching rows if I decide that I have enough information (otherwise, this whole thing wouldn't make much sense in my eyes).
I understand that something bad could happen if I just stop fetching rows and then try to execute the next statement, but isn't there a function like mysql_finish_fetch() or something like that?
And what happens if I call mysql_free_result()? This should free the result even if I haven't fetched all rows yet, so it should be safe to call it in mid-retrieval and continue with whatever I'd like to do. Am I wrong here?

This sounds like an internal threading issue that MySQL exposes to the client. Chalk it up to the various MySQL gotchas. The short of it is that MySQL apparently has a finite number of "searchers" internally, and using mysql_use_result() apparently dedicates one of them to your API request. Further, MySQL apparently has no exposed API call to cancel such a request. The only option is to see the fetch through until the end.
The slightly longer version: internally, MySQL's cursors apparently have a single code path -- I imagine for performance in the common cases. That code path exits only when the cursor finds no more results. When you use the more common mysql_store_result(), MySQL has done this already before returning the result to the application. When you use mysql_use_result(), however, MySQL requires that you do "the dirty work" of iterating the rest of the result set so as to clear the cursor. Fun.
From the documentation:
mysql_use_result() initiates a result set retrieval but does not actually read the result set into the client like mysql_store_result() does. Instead, each row must be retrieved individually by making calls to mysql_fetch_row(). This reads the result of a query directly from the server without storing it in a temporary table or local buffer, which is somewhat faster and uses much less memory than mysql_store_result(). The client allocates memory only for the current row and a communication buffer that may grow up to max_allowed_packet bytes.
On the other hand, you should not use mysql_use_result() for locking reads if you are doing a lot of processing for each row on the client side, or if the output is sent to a screen on which the user may type a ^S (stop scroll). This ties up the server and prevents other threads from updating any tables from which the data is being fetched.
When using mysql_use_result(), you must execute mysql_fetch_row() until a NULL value is returned, otherwise, the unfetched rows are returned as part of the result set for your next query. The C API gives the error Commands out of sync; you can't run this command now if you forget to do this!
So, to actually answer your questions:
1) What happens if I don't follow that line?
The C API will return the error message: Commands out of sync; you can't run this command now
2) I think that there actually must be a way to prematurely end fetching rows if I decide that I have enough information (otherwise, this whole thing wouldn't make much sense in my eyes).
One would think, but no. You must iterate the result set completely.
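To make the rule concrete, here is a minimal toy model in Python, not a real MySQL client. `ToyConnection`, `query_unbuffered`, `fetch_row`, and `drain` are hypothetical names invented for this sketch. It only mimics the behaviour described above: a new command is refused while rows from an unbuffered result are still unread, and reading until the fetch returns nothing frees the connection again. With the C API, the corresponding loop would call mysql_fetch_row() until it returns NULL.

```python
class ToyConnection:
    """Toy stand-in for one client connection. NOT a real MySQL driver."""

    def __init__(self, tables):
        self._tables = tables   # table name -> list of row tuples
        self._pending = None    # rows of the current unbuffered result

    def query_unbuffered(self, table):
        # A new command is refused while an earlier result is unread.
        if self._pending is not None:
            raise RuntimeError(
                "Commands out of sync; you can't run this command now")
        self._pending = iter(self._tables[table])

    def fetch_row(self):
        # Returns the next row, or None once the result is exhausted.
        if self._pending is None:
            return None
        row = next(self._pending, None)
        if row is None:
            self._pending = None  # result fully read; connection is free
        return row


def drain(conn):
    """Discard the remaining rows; return how many were skipped."""
    skipped = 0
    while conn.fetch_row() is not None:
        skipped += 1
    return skipped


conn = ToyConnection({"big": [(1,), (2,), (3,)], "small": [(9,)]})
conn.query_unbuffered("big")
first = conn.fetch_row()          # we found what we wanted after one row
try:
    conn.query_unbuffered("small")
    blocked = False
except RuntimeError:
    blocked = True                # next command refused: rows still pending
skipped = drain(conn)             # read and discard the rest
conn.query_unbuffered("small")    # now accepted
next_row = conn.fetch_row()
```

The point of the sketch is the order of operations: stop early, drain, then issue the next command. Draining still costs one pass over the remaining rows.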

Related

Qt SQL `nextResult` function for MySQL Server 8.0: delayed execution per result set?

We are currently doing a lot of small queries. We execute a query, read the results, and then execute the next one. Since network requests cost a lot of time, this ping-ponging gets slow very fast.
This is why we want to do multiple queries at once, sending all data that the SQL server must know to it, and only retrieving one result (consisting of multiple result sets).
We found that Qt 5.14.1's QSqlQuery has the nextResult() function, but in the documentation (link) it says:
Some databases may execute all statements at once while others may delay the execution until the result set is actually accessed, [...].
MY QUESTION:
So, does MySQL Server 8.0 delay the execution until the result set is actually accessed? If this is the case, then we still have a ping-pong for every query, right? That would still be very slow.
P.S. Our current solution for getting just one ping-pong is to union the different result sets (resulting in a kind of block diagonal matrix with lots and lots of null values). This question is meant to find a better way to do this.
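For readers unfamiliar with the "block diagonal" workaround, here is a minimal sketch of the idea. It uses Python's standard-library sqlite3 module purely as a stand-in so that it runs without a server. The tables `users` and `orders`, the tag column `src`, and the helper `split_blocks` are all assumptions made for illustration, not part of the original setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users  (id INTEGER, name TEXT);
    CREATE TABLE orders (order_id INTEGER, total INTEGER);
    INSERT INTO users  VALUES (1, 'ann');
    INSERT INTO users  VALUES (2, 'bob');
    INSERT INTO orders VALUES (10, 95);
""")

# One round trip: each SELECT fills its own columns and pads the rest
# with NULL, which produces the block-diagonal shape.
rows = conn.execute("""
    SELECT 'users' AS src, id, name, NULL AS order_id, NULL AS total FROM users
    UNION ALL
    SELECT 'orders', NULL, NULL, order_id, total FROM orders
""").fetchall()


def split_blocks(rows):
    """Group rows by their tag and drop the NULL padding.

    Caveat: this also drops genuine NULLs in the data, so it is only
    suitable for this toy example.
    """
    blocks = {}
    for src, *cols in rows:
        values = tuple(c for c in cols if c is not None)
        blocks.setdefault(src, []).append(values)
    return blocks


blocks = split_blocks(rows)
```

Every row carries all five columns, so the padding grows with each additional result set that is folded into the union.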

When should I close a statement in MySQL

Should a statement be reused as many times as possible, or is there a limitation?
If there is a limitation, when is the right time to close it?
Is creating and closing a statement a costly operation?
Creating and closing a statement doesn't really make sense. I believe what you mean is creating and closing a cursor. A cursor lets you iterate over the results of a query. Typically you see them in stored procedures and functions in MySQL. Yes, they have a cost to open and close, and you should iterate over the entire set.
Alternately you're talking about prepared statements such as you might create using the PDO library in PHP. In which case, you can use them as many times as possible and indeed you should, as this is more efficient.
Every time MySQL receives a statement, it translates that into its own internal logic and creates a query plan. Using prepared statements means it only has to do this once rather than every time you call it.
Finally, you might be trying to ask about a connection rather than a statement. In that case, again, the answer is yes: you can (and should) use it as many times as you need, as there's a significant performance cost to opening it. You don't want to keep it open longer than you need it, though, because MySQL has a maximum number of connections it can have open.
Hopefully one of those will answer your question.

Does mysql return data on demand, or all? Why looping over the data is slower than accessing them?

I was wondering: when we call the Perl DBI APIs to query a database, are all the results returned? Or do we get only part of the result set, retrieving more and more rows from the database as we iterate?
The reason I am asking is that I notice the following in a perl script.
I ran a query against a database which returns a really large number of records. After getting these records, I did a for loop over the results and created a hash of this data.
What I noticed is that the actual query returned from the database in a reasonable amount of time (given that there were a lot of results), but the big delay was in looping over the data to create the hash.
I don't understand this. I would expect that the query would be the slow part since the for loop and the construction of the hash would be in-memory and would be cheap.
Any explanation/idea why this happens? Am I misunderstanding something basic here?
Update
I understand that MySQL caches data, so when I run the same query multiple times it is faster from the second time on. But I still would not expect the for loop over the in-memory data set to take as long as (or longer than) the query to the MySQL DB.
Assuming you are using DBD::mysql, the default is to pull all the results from the server at once and store them in memory. This avoids tying up the server's resources and works fine for the majority of result sets as RAM is usually plentiful.
That answers your original question, but if you would like more assistance, I suggest pasting code - it's possible your hash building code is doing something wrong, or unnecessary queries are being made. See also Speeding up the DBI for tips on efficient use of the DBI API, and how to profile what DBI is doing.
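Before profiling with DBI itself, it can help to time the fetch and the hash-building phases separately. The sketch below shows that measurement pattern in Python, with the standard-library sqlite3 module standing in for DBI and MySQL. The `items` table and its 50,000 generated rows are assumptions for illustration, and the timings it prints say nothing about the asker's actual script.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO items (id, label) VALUES (?, ?)",
    ((i, f"item-{i}") for i in range(50_000)),
)

t0 = time.perf_counter()
# Phase 1: run the query and pull every row into client memory.
rows = conn.execute("SELECT id, label FROM items").fetchall()
t1 = time.perf_counter()
# Phase 2: purely client-side work, building the lookup table.
by_id = {row_id: label for row_id, label in rows}
t2 = time.perf_counter()

fetch_seconds = t1 - t0
build_seconds = t2 - t1
print(f"fetch: {fetch_seconds:.4f}s, build hash: {build_seconds:.4f}s")
```

If the second number dominates, the time is going into client-side code rather than into the database or the network.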

How can I find the bottleneck in my slow MySQL routine (stored procedure)?

I have a routine in MySQL that is very long and has multiple SELECT, INSERT, and UPDATE statements in it, with some IFs and REPEATs. It ran fine until recently; now it hangs and takes over 20 seconds to complete (which is unacceptable considering it used to take 1 second or so).
What is the quickest and easiest way for me to find out where in the routine the bottleneck is? Basically the routine is getting held up at some point... how can I find out where that is without breaking the routine apart and testing each section one by one?
If you use Percona Server (a free distribution of MySQL with many enhancements), you can make the slow-query log record times for individual queries, using the log_slow_sp_statements configuration variable. See http://www.percona.com/doc/percona-server/5.5/diagnostics/slow_extended_55.html
If you're using stock MySQL, you can add statements in the stored procedure to set a series of session variables to the value returned by the SYSDATE() function. Use a different session variable at different points in the SP. Then after you run the SP in a test execution, you can inspect the values of these session variables to see what section of the SP took the longest.
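Once the checkpoint values have been read back, finding the slow section is a matter of subtracting neighbouring timestamps. Here is a small Python sketch of that last step. The variable names (`t_start`, `t_after_select`, and so on) and the sample timestamps are made up for illustration; in practice they would be whatever session variables you set inside the procedure.

```python
from datetime import datetime


def slowest_section(checkpoints):
    """checkpoints: list of (label, datetime) in execution order.

    Returns (label_from, label_to, seconds) for the longest gap
    between two consecutive checkpoints.
    """
    gaps = [
        (a_label, b_label, (b_time - a_time).total_seconds())
        for (a_label, a_time), (b_label, b_time)
        in zip(checkpoints, checkpoints[1:])
    ]
    return max(gaps, key=lambda gap: gap[2])


# Hypothetical values, as if read back from the session variables.
checkpoints = [
    ("t_start", datetime(2024, 1, 1, 12, 0, 0)),
    ("t_after_select", datetime(2024, 1, 1, 12, 0, 1)),
    ("t_after_update", datetime(2024, 1, 1, 12, 0, 19)),
    ("t_end", datetime(2024, 1, 1, 12, 0, 20)),
]
result = slowest_section(checkpoints)
```

With these sample values the longest gap lies between `t_after_select` and `t_after_update`, which would point at the statements between those two checkpoints.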
To analyze a query, you can look at its execution plan. It is not always an easy task, but with a bit of reading you will find the solution. Here are some useful links:
http://dev.mysql.com/doc/refman/5.5/en/execution-plan-information.html
http://dev.mysql.com/doc/refman/5.0/en/explain.html
http://dev.mysql.com/doc/refman/5.0/en/using-explain.html
http://www.lornajane.net/posts/2011/explaining-mysqls-explain

MySQL - Restrict (Forcibly) the number of rows returned for ANY QUERY

I have a special security need with MySQL. I need to forcibly restrict the number of rows a query returns, issuing an error if the number of returned rows would exceed, say, a million. Here is the setup:
Need - The data has hundreds of millions of rows, and we don't want the client to run down the server or do a complete extraction (they would never need all the rows, just aggregations). The idea is that if they do need it, they run into an error or the barrier, and come to us explaining why they need to pull so many rows with one query.
System - Clients can use any query tool, so we have no control over what query is generated. Thus, we cannot use LIMIT x, which seems to be the solution suggested everywhere.
I have tried searching for a solution, and for now it seems that the only way to do it is at the application level (which we do not own).
Is there any way to achieve this?
Setting
1- We need to have SSL enabled.
2- MySQL 5.5
Thanks!
J
It seems like you might be able to get close with MySQL Proxy.
https://launchpad.net/mysql-proxy
See this page for manipulating results. Not sure if it does a buffered or unbuffered read, or if you can cancel the reading of results or not...
http://dev.mysql.com/doc/refman/5.1/en/mysql-proxy-scripting-read-query-result.html
It's open source, so you might be able to hire someone to tweak it if needed as well.
There may be other ways to restrict the overloading of your database server. Take a look at this link for more info:
MySQL - can I limit the maximum time allowed for a query to run?