Producer Consumer setup: How to handle Database Connections? - mysql

I'm building my first single-producer/single-consumer app in which the consumer takes items off the queue and stores them in a MySQL database.
Previously, when it was a single thread app, I would open a connection to the DB, send the query, close the connection, and repeat every time new info came in.
With a producer-consumer setup, what is the better way to handle the DB connection? Should I open it once before starting the consumer loop (I can't see a problem with this, but I'm sure that one of you fine folks will point it out if there is one)? Or should I open and close the DB connection on each iteration of the loop (seems like a waste of time and resources)?
This software runs on approximately 30 small linux computers and all of them talk to the same database. I don't see 30 simultaneous connections being an issue, but I'd love to hear your thoughts.
Apologies if this has been covered, I couldn't find it anywhere. If it has, a link would be fantastic. Thanks!
EDIT FOR CLARITY
My main focus here is the speed of the consumer thread. The whole reason for switching from single- to multi-threaded was because the single-threaded version was missing incoming information because it was busy trying to connect to the database. Given that the producer thread is expected to start dumping info into the buffer at quite a high rate, and given that the buffer will be limited in size, it is very important that the consumer work through the buffer as quickly as possible while remaining stable.
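To make the shape of the setup concrete, here is a minimal sketch of a bounded buffer between a producer and a consumer thread, assuming Python (the question does not name a language); read_incoming() and handle_item() are hypothetical stand-ins for the real data source and the real database write:

    import queue
    import threading
    import time

    buf = queue.Queue(maxsize=1000)          # bounded buffer between the threads

    def read_incoming():
        # Hypothetical stand-in for the real high-rate data source.
        time.sleep(0.001)
        return time.time()

    def handle_item(item):
        # Hypothetical stand-in for the database write discussed below.
        print("stored", item)

    def producer():
        while True:
            buf.put(read_incoming())         # blocks only when the buffer is full

    def consumer():
        while True:
            item = buf.get()                 # blocks until an item is available
            handle_item(item)
            buf.task_done()

    threading.Thread(target=producer, daemon=True).start()
    threading.Thread(target=consumer, daemon=True).start()
    time.sleep(1)                            # let the threads run briefly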

Your MySQL server shouldn't have any problems handling hundreds, if not thousands, of connections.
On each of your consumers you should set up a connection pool and use it from your consumer. If you consume the messages in a single thread (per application), the pool only needs to hold one connection, but it is also fine to consume and start parallel threads that each use one connection from the pool.
The reason for using a connection pool is that it will handle reconnection and keep-alive for you. Just ask it for a connection and it promises that it will work (it does this by running a small query against the database). If you don't use a connection for a while and it gets terminated, the pool will just create a new one.
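As a sketch of that advice, here is the consumer's write path using the built-in pool from mysql-connector-python (the library choice is an assumption; any pooling library works the same way, and the host/user/password/database values are placeholders):

    from mysql.connector import pooling

    pool = pooling.MySQLConnectionPool(
        pool_name="consumer_pool",
        pool_size=2,                  # a single consumer thread needs very few
        host="db.example.com",
        user="app",
        password="secret",
        database="telemetry",
    )

    def store(item):
        conn = pool.get_connection()  # the pool hands out a working connection
        try:
            cur = conn.cursor()
            cur.execute("INSERT INTO readings (value) VALUES (%s)", (item,))
            conn.commit()
            cur.close()
        finally:
            conn.close()              # returns the connection to the pool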

Related

Creating a pool of connections vs 1 permanent in MySQL

So this is more of a generic question but an important one for me and perhaps future googlers.
Since one can create a single connection and keep it alive for as long as the process using it is alive, and libraries can keep it healthy (reconnecting on failure, etc.), why would one use a pool?
I cannot understand where the performance enhancement comes into play.
The queries are just getting queued the same way they would be with one connection.
There is no 'parallel' processing.
Also, assuming the process and the DB are on the same server, there is no time lost sending the request over the network. In addition, no time is lost opening and closing the connection with either option.
I can only see the drawbacks, such as making sure the data selected is not currently being updated by another connection, thus receiving stale data, etc.
I wanted to boost the performance of my MySQL DB and make it more scalable, and I stumbled upon the pool-vs-one-permanent-connection argument without being sure whether I should change or not.
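For reference, the "one permanent connection kept healthy by the library" pattern the question describes might look like this sketch, assuming Python and mysql-connector-python (library choice assumed; credentials are placeholders):

    import mysql.connector

    conn = mysql.connector.connect(
        host="localhost", user="app", password="secret", database="appdb"
    )

    def run_query(sql, params=()):
        # Keep the single permanent connection healthy: ping revives it
        # if the server dropped it (up to 3 attempts, 1 second apart).
        conn.ping(reconnect=True, attempts=3, delay=1)
        cur = conn.cursor()
        cur.execute(sql, params)
        rows = cur.fetchall()
        cur.close()
        return rows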

How to Prevent "MySql has gone away" when using TIdHTTPServer

I have written a web server using Delphi and the Indy TIdHttpServer component. I am managing a pool of TAdoConnection connections to a MySql database. When a request comes in I query my pool for available database connections. If one is not available then a new TAdoConnection is created and added to the pool.
Problems occur when a connection becomes "stale" (i.e. it has not been used in quite some time). I think it is in this situation that a query results in the "MySQL server has gone away" error.
Does anyone have a method for getting around this? Or would I have to manage it myself by one of the following:
Writing a thread that will periodically "refresh" all connections.
Keeping track of the last active query, and if it is too old, skipping that connection and freeing it instead.
Three suggestions (combined into one sketch after the article link below):
Store a "last used" timestamp with every pooled connection, and when a connection is requested, check whether it is too old; in that case, create a new one.
Add a validateObject() method which issues a no-op SQL query to detect whether the connection is still healthy.
Run a background thread which cleans up the pool at regular intervals: removing idle connections allows the pool size to shrink back to a minimum after peak usage.
For some suggestions, see this article about the Apache Commons Pool Framework: http://www.javaworld.com/article/2071834/build-ci-sdlc/pool-resources-using-apache-s-commons-pool-framework.html
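Here is one possible sketch of all three suggestions together, assuming Python and mysql-connector-python (the original context is Delphi/ADO, so treat this purely as an illustration of the pattern; MAX_IDLE_SECONDS is an assumed cutoff):

    import threading
    import time
    import mysql.connector

    MAX_IDLE_SECONDS = 300            # assumed cutoff; tune to your server's timeout

    class PooledConn:
        def __init__(self, **params):
            self.params = params
            self.conn = mysql.connector.connect(**params)
            self.last_used = time.time()

        def validate(self):
            # Suggestion 2: a no-op query proves the connection is healthy.
            try:
                cur = self.conn.cursor()
                cur.execute("SELECT 1")
                cur.fetchall()
                cur.close()
                return True
            except mysql.connector.Error:
                return False

        def checkout(self):
            # Suggestion 1: if the connection is too old (or unhealthy),
            # replace it with a fresh one before handing it out.
            if time.time() - self.last_used > MAX_IDLE_SECONDS or not self.validate():
                self.conn = mysql.connector.connect(**self.params)
            self.last_used = time.time()
            return self.conn

    def reaper(pool, lock, interval=60):
        # Suggestion 3: a background thread trims idle connections so the
        # pool shrinks back to a minimum after peak usage.
        while True:
            time.sleep(interval)
            now = time.time()
            with lock:
                pool[:] = [p for p in pool if now - p.last_used <= MAX_IDLE_SECONDS]

    pool, lock = [], threading.Lock()
    threading.Thread(target=reaper, args=(pool, lock), daemon=True).start()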

How does mysql handle massive connections in real world?

I have been researching this for a while but have found no convincing answer.
According to the MySQL docs, the default number of connections is less than two hundred, and max_connections can be set to 2000 on a Linux box as long as you have enough resources. I think this number is far from enough for real-world deployments, as there might be millions of people visiting your website at the same time.
There are a couple of articles talking about how to optimize and reduce the time cost of each query. But none of them tells me the root cause of this issue. I think there must be some mechanism, like a queue, to prevent massive numbers of connections from being opened simultaneously; otherwise you will eventually get a "Too many connections" error.
Does anyone have expertise in this area? Thank you.
There are several options.
Connection pooling
As you mentioned: queuing (see the sketch after this list). If too many clients connect at the same time, the application layer should handle this exception, put the request to sleep for a short period of time, and try again. Requests lasting more than a couple of seconds should usually be banned in such a high-traffic environment.
Load balancing through replication and/or clustering
Normally, your application is supposed to reuse connections that are already established. However, the language you chose to implement your application in introduces limitations. If you use Java or .NET you can have a pool of connections. For PHP that is not the case; you can check this discussion.
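The queuing idea from the list above could look something like this minimal sketch, assuming Python and mysql-connector-python (error 1040 is MySQL's "Too many connections"):

    import time
    import mysql.connector
    from mysql.connector import errorcode

    def connect_with_backoff(retries=5, **params):
        delay = 0.5
        for attempt in range(retries):
            try:
                return mysql.connector.connect(**params)
            except mysql.connector.Error as err:
                # ER_CON_COUNT_ERROR (1040) is "Too many connections".
                if err.errno != errorcode.ER_CON_COUNT_ERROR or attempt == retries - 1:
                    raise             # unrelated error, or out of retries
                time.sleep(delay)     # back off, then try again
                delay *= 2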
If you exceed max_connections, you do get a "Too many connections" error. But if you really have 1 million users at your web server at the exact same time, you can't handle that with one server anyway; 1 million concurrent connections really requires a very big farm to handle.
However, the client to your database is a webapp, and that webapp usually connects to the database through an abstraction called a connection pool, which limits the number of connections to the database on the client side, as long as all database connections go through that same pool.

mysql connections. Should I keep it alive or start a new connection before each transaction?

I'm on my first foray with MySQL and I have a question about how to handle the connection(s) my application has.
What I am doing now is opening a connection and keeping it alive until I terminate my program. I do a mysql_ping() every now and then and the connection is started with MYSQL_OPT_RECONNECT.
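(For reference, a rough equivalent of mysql_ping() plus MYSQL_OPT_RECONNECT in Python's mysql-connector-python, purely as an illustration since the question does not name a client library, and with placeholder credentials:

    import mysql.connector

    conn = mysql.connector.connect(
        host="localhost", user="app", password="secret", database="appdb"
    )

    # Called "every now and then": re-establishes the connection if the
    # server has dropped it, trying up to 3 times, 2 seconds apart.
    conn.ping(reconnect=True, attempts=3, delay=2)
)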
The other option I can think of would be to open a new connection before doing anything that requires database access, and to close it after I'm done.
What are the pros and cons of these two approaches?
What are the "side effects" of a long-lived connection?
What is the most used method of handling this?
Cheers ;)
Some extra details
At this point I am keeping the connection alive, and I ping it every now and again to know its status and reconnect if needed.
In spite of this, when there is consistent concurrency, with queries happening in quick succession, I get a "MySQL server has gone away" message, and after a while the connection is re-established.
I'm left wondering if this is a side effect of a prolonged connection, or if it is just a case of bad MySQL server configuration.
Any ideas?
In general, a fair amount of overhead is incurred when opening a connection. Depending on how often you expect this to happen it might be OK, but if you are writing any kind of application that executes more than just a very few commands per run, I would recommend a connection pool (for server-type apps), or at least keeping a single connection (or very few) open for some time and reusing it for multiple transactions (for a standalone app).
That way you have better control over how many connections get opened at the application level, even before the database server gets involved. This is a service an application server offers you, but it can also be rolled by hand rather easily if you want to keep things smaller.
Apart from performance reasons a pool is also a good idea to be prepared for peaks in demand. When a lot of requests come in and each of them tries to open a separate connection to the database - or as you suggested even more (per transaction) - you are quickly going to run out of resources. Keep in mind that every connection consumes memory inside MySQL!
Also you want to make sure to use a non-root user to connect, because if you don't (I think it is tied to the MySQL SUPER privilege), you might find yourself locked out. MySQL reserves at least one connection for an administrator for problem fixing, but if your app connects with that privilege, all connections would already be used up when you try to put out the fire manually.
Unless you are worried about having too many connections open (i.e. over 1,000), you should leave the connection open. There is overhead in connecting/reconnecting that will only slow things down. If you know you are going to need the connection to stay open for a while, run this query instead of pinging periodically:
SET SESSION wait_timeout=#
Where # is the number of seconds to leave an idle connection open.
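For instance, applied right after connecting (Python and mysql-connector-python assumed; credentials are placeholders, and 28800 seconds is just an illustrative value, though it also happens to be MySQL's usual default for wait_timeout):

    import mysql.connector

    conn = mysql.connector.connect(
        host="localhost", user="app", password="secret", database="appdb"
    )
    cur = conn.cursor()
    cur.execute("SET SESSION wait_timeout = 28800")  # 8 hours of allowed idle time
    cur.close()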
What kind of application are you writing? If it's a web script: keep it open. If it's an executable, pool your connections (if necessary; most of the time a singleton will do).

Grails MySql processList

I have a Grails application with a webflow. I store the flow objects I am interested in in the conversation scope. After entering and leaving the flow a few times, I see that the single user connected to the DB (MySQL) generates a lot of threads on the MySQL server which are not released. The process list in MySQL shows me the threads in sleeping mode, and a netstat on the client shows established connections to the MySQL server.
I assume the connections are held active and not released. But why is that? What exactly does Grails do when entering and leaving a flow? Why are so many connections opened and not closed?
Any help would be appreciated.
regards,
masiar
Grails uses Hibernate, which in turn uses connection pooling; these are idle connections, waiting for traffic.
You can learn more about Hibernate's connection pooling at: https://www.hibernate.org/214.html
This is actually desirable behavior; it can take a non-negligible amount of time to open a new connection, much more time than it takes to send a query down an open one.
"Premature optimization is the root of all evil" - unless you are seeing a performance problem related to the database, I'd leave this alone.
Think of Hibernate's pooling like a steady, ready pool of cars with their engines running at all times, for you or your buddies to jump in and go anywhere you want... well, no, to the database. No need to wait for a taxi or to jump-start your own car before you are up and running... all good here.
Conversations are meant to stick around for as long as they are needed. Often you dive down into workflows and, upon finishing them, return to your old, and thus still alive, conversation. It is meant to work like that... all good here too.