SQLAlchemy connection pooling offers a recycle option which (if I'm reading the docs right) will invalidate connections after they have reached a certain total age.
MySQL has a config option wait_timeout which closes connections after an idle period.
Is there a way to make the SQLAlchemy recycle work on the time since a connection was last actively used rather than its total age so that it lines up better with the server's behaviour?
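To make the distinction concrete, here is a minimal sketch in plain Python contrasting the two expiry policies. It is not SQLAlchemy's API: the ConnRecord class, its fields, and both function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ConnRecord:
    """Hypothetical bookkeeping for one pooled connection (times in seconds)."""
    created_at: float
    last_used_at: float


def expired_by_age(rec: ConnRecord, now: float, recycle: float) -> bool:
    # Total-age policy: how the question understands "recycle" to work.
    return now - rec.created_at > recycle


def expired_by_idle(rec: ConnRecord, now: float, idle_limit: float) -> bool:
    # Idle policy: mirrors how wait_timeout is described on the server side.
    return now - rec.last_used_at > idle_limit


# A connection opened at t=0 and last used at t=3500, inspected at t=3700.
rec = ConnRecord(created_at=0.0, last_used_at=3500.0)
age_result = expired_by_age(rec, now=3700.0, recycle=3600.0)
idle_result = expired_by_idle(rec, now=3700.0, idle_limit=3600.0)
```

With a 3600-second limit, the age policy discards this connection even though it was used 200 seconds ago, while the idle policy keeps it. SQLAlchemy also documents a pool_pre_ping option that tests a connection when it is checked out, which may cover the stale-connection case without tracking idle time.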
I tested a MySQL cluster using sysbench to find the sweet spot for the maximum number of threads. In my endeavours I came across the threads option in sysbench.
--threads=N
I also came across thread_pool_size in the MySQL thread pool documentation:
thread_pool_size: The number of thread groups in the thread pool. This is the most important parameter controlling thread pool performance.
So the question that plagues me is: are the threads in sysbench similar to thread_pool_size in MySQL?
Here is an example of a command that I used.
sysbench oltp_read_write.lua --threads=26 --time=30 --mysql-user='root' --mysql-password='password' --table-size=10000 --mysql-host=10.100.100.64 --mysql-port=6033 run
This is an image to show my current configuration:
[image: CNFfiles]
OUCH!
thread_cache_size is the number of "threads" to hang onto. It is a simple-minded form of pooling. It is a count, not a size in bytes! 10 is a reasonable number. Anything more than max_connections is unnecessary.
max_connections refers to "concurrent" connections, not total over time. The default of 151 is fine for most systems. 1000 is "high" but is warranted for some systems; 10K is too high.
Check these:
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SHOW GLOBAL STATUS LIKE 'Threads_running';
The former is a high-water mark (since startup). If it is close to max_connections, then maybe max_connections should be increased.
The latter says how many of the current connections are actually doing anything. If it is over 100, the connections are stumbling over each other. We will need more details to discuss what to do next. (1 is common; a 'busy' system might say no more than 10, and change rapidly.)
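As a rough illustration of the two checks above, the following sketch turns the readings into notes. The function name is hypothetical, and the 0.8 ratio used for "close to max_connections" is an assumption of this sketch, not a MySQL rule; the 100-thread threshold is the one mentioned above.

```python
def assess_connections(max_used: int, max_connections: int,
                       threads_running: int, near_ratio: float = 0.8) -> list:
    """Return human-readable notes for the two status values discussed above."""
    notes = []
    # near_ratio is an assumed cut-off for "close to max_connections".
    if max_used >= near_ratio * max_connections:
        notes.append("Max_used_connections is close to max_connections")
    # Threshold of 100 taken from the answer above.
    if threads_running > 100:
        notes.append("Threads_running is over 100; connections are contending")
    return notes


quiet = assess_connections(max_used=12, max_connections=151, threads_running=1)
busy = assess_connections(max_used=150, max_connections=151, threads_running=140)
```

The values passed in would come from the two SHOW GLOBAL STATUS queries and from the configured max_connections.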
Sysbench is a client of MySQL. It can start a number of threads, one per connection.
When not using a thread pool in MySQL Server, every client connection starts its own thread. So there's a one-to-one correspondence between sysbench threads and MySQL Server threads.
It's typical that a client connection is not running a query every second. Normally a client application runs other code in between waiting for queries. So on the MySQL Server side, some threads exist, but they aren't doing anything. This appears as "Sleep" in the processlist.
It's pretty common to have hundreds of client connections open, but only one or two dozen of these connections doing any query at any given moment. The others are all sleeping.
As a metaphor, I would compare this to customers in a bank, where they approach a teller's window and do transactions. The customer blocks others from using the same teller, even if the customer is signing a form or something else that is not talking directly to the teller.
When using a thread pool, threads are handled differently in MySQL Server. The thread pool feature exists so that a smaller number of threads in the MySQL Server can be shared by a greater number of client connections. The threads in MySQL Server are no longer corresponding one-to-one with client connections. They switch when a client connection requests to execute an SQL query. This is done to reduce resource usage when your clients open a large number of connections.
A metaphor for this is a restaurant where a single server can handle a whole section with customers. The customers only need attention from time to time, and the server can therefore keep track of multiple tables of customers.
In the case of sysbench, this is probably not a typical workload. The client threads are running SQL queries more rapidly than a typical application. If you try to use a thread pool in this case, you might have more client requests than the number of threads in the thread pool, and in this case the client requests might queue up.
In the restaurant metaphor, this would be the infrequent times when more than one table wants something at the same time. Then all but one of the tables must wait, but hopefully not for long since most customer requests are brief.
Using the thread pool in MySQL Server while testing with sysbench might not be the best way to measure the maximum throughput of queries.
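The queueing effect can be sketched with Python's own ThreadPoolExecutor as an analogy. This is not how MySQL implements its thread pool; the constants and the handle_request function are hypothetical stand-ins for pool threads and for busy sysbench clients.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

POOL_THREADS = 2      # stand-in for the server-side thread pool
CLIENT_REQUESTS = 8   # stand-in for busy sysbench client threads

lock = threading.Lock()
stats = {"running": 0, "peak": 0}


def handle_request(i: int) -> int:
    """Pretend to run one query while tracking how many run concurrently."""
    with lock:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
    time.sleep(0.01)  # simulated query execution
    with lock:
        stats["running"] -= 1
    return i


# Eight requests arrive at once, but only POOL_THREADS can execute together;
# the rest wait in the executor's queue until a worker is free.
with ThreadPoolExecutor(max_workers=POOL_THREADS) as pool:
    results = list(pool.map(handle_request, range(CLIENT_REQUESTS)))
```

All eight requests complete, but the observed peak concurrency never exceeds the pool size, which is why a benchmark with constantly busy clients may measure the pool limit rather than the server's maximum throughput.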
Assuming I execute a query every 2 seconds, should I open a connection on every request, or should I keep the connection alive until the application (server) stops?
In my experience, establishing connections is unlikely to be a bottleneck for a MySQL server (connection overhead is fairly low in MySQL). That having been said, reusing existing connections is often an appropriate approach, but it requires some careful consideration: if the database server is temporarily unavailable, the code must reconnect; if the server is replaced, it must reconnect (MySQL deployments tend towards failover solutions rather than true high availability); if the application uses multiple connections to MySQL, you must be sure not to cross your connections between users or sessions (active database, time zone, charset, and so on are session variables, essentially tied to a connection). If you're not up to the task of making your reusable connection reliable in these and other edge cases, creating a new connection every 2 seconds may provide this durability for free.
In short, there can be less-than-obvious benefits to short-lived connections. I would not bother to add intelligence around maintaining a persistent connection unless you have reason to believe it will actually make a meaningful difference in your case (e.g. from benchmarks).
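For a sense of what "making a reusable connection reliable" involves, here is a minimal sketch of a wrapper that reconnects once when a query fails. Everything in it is hypothetical: the class names, the connect factory, and the use of Python's built-in ConnectionError (a real driver raises its own exception types). FlakyConn is a fake connection used only to exercise the wrapper.

```python
from typing import Any, Callable


class ReconnectingClient:
    """Hypothetical wrapper: reuse one connection, reconnect once on failure."""

    def __init__(self, connect: Callable[[], Any]):
        self._connect = connect   # factory that opens a fresh connection
        self._conn = None

    def run(self, query: str) -> Any:
        if self._conn is None:
            self._conn = self._connect()
        try:
            return self._conn.execute(query)
        except ConnectionError:
            # Server unavailable or replaced: open a new connection, retry once.
            self._conn = self._connect()
            return self._conn.execute(query)


class FlakyConn:
    """Fake connection for the demo: its second query always fails."""
    opened = 0

    def __init__(self):
        FlakyConn.opened += 1
        self.calls = 0

    def execute(self, query: str) -> str:
        self.calls += 1
        if self.calls == 2:
            raise ConnectionError("server has gone away")
        return "ok: " + query


client = ReconnectingClient(FlakyConn)
first = client.run("SELECT 1")
second = client.run("SELECT 1")  # first connection fails here; wrapper reconnects
```

Even this small wrapper leaves the session-state problem open: after a reconnect, settings such as the active database or charset would have to be applied again, which is part of why short-lived connections can be the simpler choice.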
I have written a web server using Delphi and the Indy TIdHttpServer component. I am managing a pool of TAdoConnection connections to a MySQL database. When a request comes in I query my pool for an available database connection. If one is not available then a new TAdoConnection is created and added to the pool.
Problems occur when a connection becomes "stale" (i.e. it has not been used in quite some time). I think in this instance the query results in the "MySQL server has gone away" error.
Does anyone have a method for getting around this? Or would I have to manage it myself by one of the following:
Writing a thread that will periodically "refresh" all connections.
Keeping track of the last active query, and if too old pass up using the connection and instead free it.
Three suggestions:
store a 'last used' time stamp with every pooled connection, and when a connection is requested, check whether it is too old; if it is, create a new one
add a validateObject() method which issues a no-op SQL query to detect whether the connection is still healthy
run a background thread which cleans up the pool at regular intervals: removing idle connections lets the pool shrink back to a minimum size after peak usage
For some suggestions, see this article about the Apache Commons Pool Framework: http://www.javaworld.com/article/2071834/build-ci-sdlc/pool-resources-using-apache-s-commons-pool-framework.html
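The suggestions above could be combined roughly as follows. This is a sketch in Python rather than Delphi, and it is neither the Commons Pool API nor ADO: the IdleAwarePool class, its parameters, and the injected clock are all hypothetical. It is not thread-safe and does not close dropped connections, both of which a real pool would need.

```python
import time
from typing import Any, Callable, List, Tuple


class IdleAwarePool:
    """Hypothetical pool: tracks last-used times and validates on checkout."""

    def __init__(self, connect: Callable[[], Any],
                 is_alive: Callable[[Any], bool], max_idle: float,
                 clock: Callable[[], float] = time.monotonic):
        self._connect = connect
        self._is_alive = is_alive   # e.g. runs a no-op query such as SELECT 1
        self._max_idle = max_idle
        self._clock = clock
        self._idle: List[Tuple[Any, float]] = []  # (connection, last-used time)

    def acquire(self) -> Any:
        while self._idle:
            conn, last_used = self._idle.pop()
            fresh = self._clock() - last_used <= self._max_idle
            if fresh and self._is_alive(conn):
                return conn
            # Too old or unhealthy: discard it and try the next one.
        return self._connect()

    def release(self, conn: Any) -> None:
        self._idle.append((conn, self._clock()))

    def evict_idle(self) -> int:
        """Drop connections idle for too long; meant for a periodic timer."""
        now = self._clock()
        keep = [(c, t) for c, t in self._idle if now - t <= self._max_idle]
        dropped = len(self._idle) - len(keep)
        self._idle = keep
        return dropped


# Demo with a fake clock and plain objects standing in for connections.
now = [0.0]
made = []


def connect() -> object:
    made.append(object())
    return made[-1]


pool = IdleAwarePool(connect, is_alive=lambda c: True, max_idle=60.0,
                     clock=lambda: now[0])
c1 = pool.acquire()
pool.release(c1)
now[0] = 30.0
c2 = pool.acquire()   # idle for 30 s: reused
pool.release(c2)
now[0] = 200.0
c3 = pool.acquire()   # idle for 170 s: discarded and replaced
```

Setting max_idle comfortably below the server's idle timeout means a connection is replaced before the server has a chance to close it.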
The most widely used options in database.yml are the following:
adapter
encoding
database
pool
username
password
socket
host
port
timeout
I know what most of the above are for, except pool.
So I want to know what the pool option in database.yml is for, and whether there is any other parameter we need to set for an application with very heavy traffic.
It sets the maximum number of connections per Ruby process. This matters if you are threading your Rails app or use transactions heavily. The limits here depend on your setup. Consider this:
50 ruby processes
each with 100 threads
a mysql with a setting of 1000 simultaneous connections
So it makes sense that every process can open at most 20 connections at a given time (50 * 20 == 1000). You would therefore set the pool value to 20 or less.
For anyone else who is looking for an answer to this question, the basic idea seems to be that a database can only support so many simultaneous connections, so there needs to be a way to limit the open connections. The pool attribute specifies the maximum number of connections that can be open at a given time.
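The arithmetic can be written out as a small helper. This is an illustrative sketch in Python; the function name and its parameters are hypothetical, and it assumes every process gets an equal share of the database's connection limit.

```python
def max_pool_per_process(db_max_connections: int, processes: int) -> int:
    """Largest per-process pool size that cannot exceed the database limit."""
    if processes <= 0:
        raise ValueError("processes must be positive")
    # Integer division rounds down, so the total never exceeds the limit.
    return db_max_connections // processes


pool_size = max_pool_per_process(db_max_connections=1000, processes=50)
total_if_all_full = pool_size * 50
```

Here pool_size comes out as 20 and the worst-case total as 1000. With 100 threads per process and a pool of 20, at most 20 of those threads can hold a connection at the same moment.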
See http://guides.rubyonrails.org/configuring.html#database-pooling for more information about this. The guide doesn't explicitly say that pool is the total connections for the app, but that's the sense I get after reading it.
pool sets the size of the connection pool, which is 5 by default.
http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html
MySQL default max_connections = 151. How many connections should I use for 1000 users per application?
I would think that if anything, you should probably decrease this number. Do you think that 15% of your users are logged in at the same time and all using the database? That seems like a very high percentage to assume. If your application does not hold on to database connections for longer than needed, then you likely need far fewer than 150 connections. As soon as the database communication is done, your app should release the connection. If you are using a connection pool, then opening and closing a connection is very fast. With this approach, two users logged in at the same time will likely not need more than one connection between them, because it is rare that both are executing a DB operation at the same moment.