I'm using a MySQL instance on Azure under the free trial period. One thing I've noticed is that max_user_connections is set to 4 on the free trial and 40 on the highest-priced tier.
Both of these seem really low, unless I'm misunderstanding something. Let's say I have 41 users making database requests simultaneously, wouldn't this cause a failure due to going over the max allowable connections? That doesn't seem like much room.
How can I use Azure to allow a realistic number of simultaneous connections? Or am I thinking about this incorrectly? Should I just dump MySQL for SQL Azure?
Thanks.
If you are using the .NET framework, connection pooling is managed by the data provider. Instead of opening a connection and leaving it open for an entire session, a .NET application typically opens a connection for each database operation/transaction, performs a single task, and closes the connection when the operation completes. The .NET MySQL data providers also support advanced connection pooling; see http://www.devart.com/dotconnect/mysql/docs/ComparingProviders.html
I would assume the Azure limitation is referring to applications that employ the first (session duration) alternative.
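For illustration, that per-operation pattern looks like this in Java with plain JDBC (the URL, credentials, and orders table here are hypothetical; the .NET equivalent follows the same shape):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ShortLivedConnectionDemo {
    // Hypothetical connection string for illustration only.
    private static final String URL = "jdbc:mysql://example.mysql.database.azure.com/mydb";

    static int countOrders(String user, String password) throws SQLException {
        // Open the connection only for the duration of one operation,
        // then close it so the scarce connection slot is freed immediately.
        try (Connection conn = DriverManager.getConnection(URL, user, password);
             PreparedStatement ps = conn.prepareStatement("SELECT COUNT(*) FROM orders");
             ResultSet rs = ps.executeQuery()) {
            rs.next();
            return rs.getInt(1);
        } // conn.close() runs here, returning the connection slot to the server
    }
}
```

With the provider's pool underneath, the physical connection is usually reused across such calls, so a low max_user_connections goes further than it first appears.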
I'm currently developing plugins for Bukkit, and a lot of them need a database connection. Now I'm wondering whether it would be better to have just one plugin that handles the connection for all plugins.
The question behind that is whether it is a good idea to keep a connection open even when there are no queries for several minutes (which may happen), or whether I would need to establish a new connection for each query.
It is a good idea to have one class/plugin handle the database, but the connection should not stay open all the time; make sure the connection is open only for the time the query takes.
Many applications use connection pools to have a number of connections readily available to run queries over. It reduces the number of protocol re-negotiations that the database driver has to do. This is especially useful for applications that need fast access times to the underlying data, yet have larger downtimes between requests. E-Commerce applications like webshops are a good example.
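As a sketch of what that can look like, here is a minimal pool built with HikariCP, one of several pooling libraries you could use; the JDBC URL, pool settings, and players table are made up for the example:

```java
import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class PooledDatabase {
    private final HikariDataSource pool;

    public PooledDatabase(String jdbcUrl, String user, String password) {
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl(jdbcUrl);        // e.g. "jdbc:mysql://localhost:3306/minecraft"
        config.setUsername(user);
        config.setPassword(password);
        config.setMaximumPoolSize(10);     // upper bound on simultaneous connections
        config.setIdleTimeout(600_000);    // retire connections idle for ten minutes
        pool = new HikariDataSource(config);
    }

    public int countPlayers() throws SQLException {
        // Borrow a connection; close() returns it to the pool instead of dropping it.
        try (Connection conn = pool.getConnection();
             PreparedStatement ps = conn.prepareStatement("SELECT COUNT(*) FROM players");
             ResultSet rs = ps.executeQuery()) {
            rs.next();
            return rs.getInt(1);
        }
    }

    public void shutdown() {
        pool.close();
    }
}
```

This combines both answers above: callers open and close connections around each query, while the pool quietly keeps a few physical connections alive between requests.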
I have been researching this for a while but haven't found a convincing answer.
According to the MySQL tutorial, the default connection limit is less than two hundred, and it says max_connections can be set to 2000 on a Linux box as long as you have enough resources. I think this number is far from enough for real-world deployments, where millions of people might visit your website at the same time.
There are a couple of articles about optimizing to reduce the time cost of each query, but none of them explains the root cause of this issue. I think there must be some mechanism, like a queue, to prevent massive numbers of connections from being opened simultaneously; otherwise you will eventually get a "too many connections" error.
Does anyone have expertise in this area? Thank you.
There are several options.
Connection pooling
As you mentioned: queuing. If too many clients connect at the same time, the application layer should handle this error, put the request to sleep for a short period of time, and try again (see the sketch after this list). Requests lasting more than a couple of seconds should usually be banned in such a high-traffic environment.
Load balancing through replication and/or clustering
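Regarding the queuing option, here is a rough sketch in Java with plain JDBC; the attempt count and backoff values are arbitrary, and 1040 is MySQL's ER_CON_COUNT_ERROR code for "too many connections":

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class RetryingConnector {
    private static final int TOO_MANY_CONNECTIONS = 1040; // MySQL ER_CON_COUNT_ERROR

    // Try to connect, sleeping briefly and retrying while the server is saturated.
    static Connection connectWithRetry(String url, String user, String pass,
                                       int maxAttempts)
            throws SQLException, InterruptedException {
        long backoffMillis = 100;
        for (int attempt = 1; ; attempt++) {
            try {
                return DriverManager.getConnection(url, user, pass);
            } catch (SQLException e) {
                if (e.getErrorCode() != TOO_MANY_CONNECTIONS || attempt == maxAttempts) {
                    throw e; // a different failure, or we have given up
                }
                Thread.sleep(backoffMillis);
                backoffMillis = Math.min(backoffMillis * 2, 2_000); // exponential backoff, capped
            }
        }
    }
}
```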
Normally, your application is supposed to reuse connections it has already established. However, the language you choose to implement your application in introduces limitations: with Java or .NET you can have a pool of connections, but for PHP that is not the case; you can check this discussion.
If you exceed max_connections, you do get a "too many connections" error. But if you really have 1 million users hitting your web server at the exact same time, you can't handle that with one server anyway; 1 million concurrent connections requires a very big farm.
However, the client of your database is a web app, and that web app usually connects to the database through an abstraction called a connection pool, which limits the number of connections to the database on the client side, as long as all database connections go through that same pool.
Question 1:
I am using MySQL Connector/J to connect to MySQL, and I am currently creating a connection for every request. I need to use a connection pool. Should I choose c3p0, or can I use the MysqlConnectionPool class provided by the connector library?
Question 2:
I may need to load balance / fail over between two MySQL database servers. I could use jdbc:mysql://host,host2/dbname to fail over automatically. I want to use a connection pool and failover in combination. How should I achieve that?
I'd recommend using c3p0 or something similar. It'll integrate better into a Java EE app server, and it's database agnostic.
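To sketch how that can also cover your second question: c3p0 just passes the JDBC URL through to Connector/J, so the failover URL from your question can be pooled directly (the driver class shown is the old com.mysql.jdbc.Driver name; the credentials and pool sizes below are placeholders):

```java
import com.mchange.v2.c3p0.ComboPooledDataSource;

import java.beans.PropertyVetoException;
import java.sql.Connection;

public class PooledFailoverDataSource {

    static ComboPooledDataSource create() throws PropertyVetoException {
        ComboPooledDataSource cpds = new ComboPooledDataSource();
        cpds.setDriverClass("com.mysql.jdbc.Driver");
        // Connector/J failover syntax from the question: host2 is tried if host fails.
        cpds.setJdbcUrl("jdbc:mysql://host,host2/dbname");
        cpds.setUser("appuser");
        cpds.setPassword("secret");
        cpds.setMinPoolSize(5);
        cpds.setMaxPoolSize(20);
        return cpds;
    }

    public static void main(String[] args) throws Exception {
        ComboPooledDataSource ds = create();
        try (Connection conn = ds.getConnection()) {
            System.out.println("connected: " + !conn.isClosed());
        }
        ds.close();
    }
}
```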
Your second question is a good deal more complicated. Load balancing is usually done with an appliance of some kind, like an F5 or ACE, that stands between the client and the load-balanced instances. Is that how you're doing it? How do you plan to keep the data in sync if you load balance between the two? If the connections aren't "sticky", you should expect to find INSERTed data split across both instances.
Maybe this reference can help you get started:
http://www.howtoforge.com/loadbalanced_mysql_cluster_debian
My site has always used persistent connections and, based on my understanding of them, there's no reason not to. Why close the connection when it can be reused? I have a site that accesses about 7 databases in total. It's not a huge-traffic site, but it's big enough. What's your take on persistent connections: should I use them?
With persistent connections:
You cannot build transaction processing effectively
user sessions on the same connection are impossible
the app is not scalable; over time you may need to extend it, and that will require management/tracking of persistent connections
if a script, for whatever reason, cannot release a lock on a table, then all following scripts on that connection will block indefinitely and the DB server has to be restarted. Likewise with transactions: an open transaction block will carry over to the next script using the same connection if script execution ends before the transaction block completes, etc.
Persistent connections do not give you anything that you cannot do with non-persistent connections.
Then why use them at all?
The only possible reason is performance: use them when the overhead of creating a link to your SQL server is high. And that depends on many factors, like:
database type
whether the MySQL server is on the same machine and, if not, how far away it is; it might even be outside your local network/domain
how heavily loaded the machine MySQL sits on is with other processes
You can always replace persistent connections with non-persistent ones. It might change the performance of the script, but not its behavior!
A commercial RDBMS might be licensed by the number of concurrently open connections, and there persistent connections can do you a disservice.
My knowledge of the area is kinda limited, so I can't give you many details on the subject, but as far as I know, the process of creating connections and handing them to a thread really costs resources, so I would avoid it if I were you. Anyhow, I think most of these decisions can't be generalized and depend on the business.
If, for instance, your application communicates continuously with the Database and will only stop when the application is closed, then perhaps persistent connections are the way to go, for you avoid the process mentioned before.
However, if your application only communicates with the Database sporadically to get minor information then closing the connection might be more sane, for you won't waste resources on opened connections that are not being used.
Also there is a technique called "Connection Pooling", in which you create a series of connections a priori and keep them there for other applications to consume. In this case connections are persistent to the database but non-persistent to the applications.
Note: Connections in MSSQL are always persistent to the database because connection pooling is the default behavior.
I've been thinking: why does Apache start a new connection to the MySQL server for each page request? Why doesn't it just keep ONE connection open at all times and send all SQL queries through that one connection (obviously with a client ID attached to each request)?
That would cut down on the handshake time overhead, and I see a couple of other advantages as well.
It's like plugging in a computer every time you want to use it. Why go to the outlet each time when you can just leave it plugged in?
MySQL does not support multiple sessions over a single connection.
Oracle, for instance, allows this, and you can set up Apache to multiplex several logical sessions over a single TCP connection.
This is a limitation of MySQL, not of Apache or the scripting languages.
There are modules that can do session pooling:
Precreate a number of connections
Pick a free connection on demand
Create additional connections if no free connection is available.
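A deliberately simplified sketch of those three steps in Java (real pooling modules such as mod_dbd, and libraries like c3p0, also validate connections, cap the pool size, and handle timeouts):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.concurrent.ConcurrentLinkedQueue;

public class TinyPool {
    private final ConcurrentLinkedQueue<Connection> idle = new ConcurrentLinkedQueue<>();
    private final String url, user, pass;

    public TinyPool(String url, String user, String pass, int initialSize) throws SQLException {
        this.url = url; this.user = user; this.pass = pass;
        for (int i = 0; i < initialSize; i++) {       // 1. precreate a number of connections
            idle.add(DriverManager.getConnection(url, user, pass));
        }
    }

    public Connection borrow() throws SQLException {
        Connection conn = idle.poll();                // 2. pick a free connection on demand
        if (conn == null) {                           // 3. none free: create an additional one
            conn = DriverManager.getConnection(url, user, pass);
        }
        return conn;
    }

    public void release(Connection conn) {
        idle.add(conn);                               // return it for the next request
    }
}
```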
The reason is: it's simpler.
To reuse connections, you would have to invent and implement connection pooling. This adds another layer of code that has to be developed, maintained, etc.
Plus, pooled connections invite a whole other class of bugs that you have to watch out for while developing your application. For example, if you define a user variable, and the next user of that connection goes down a code path that branches on the existence of that variable, then that user runs the wrong code. Other problems include temporary tables, transaction deadlocks, session variables, etc. All of these become very hard to reproduce, because they depend on the subsequent actions of two different users who appear to have no ties to each other.
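To make that user-variable hazard concrete, here is a small invented example, assuming two requests happen to be served by the same pooled connection (the @discount variable and the queries are made up):

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class LeakyStateDemo {
    // Request A sets a MySQL user variable and forgets to clear it
    // before its connection goes back into the pool.
    static void requestA(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement()) {
            st.execute("SET @discount = 0.5");
        }
    }

    // Request B is handed the same pooled connection later and
    // unknowingly branches on state left behind by request A.
    static double requestB(Connection conn) throws SQLException {
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SELECT IFNULL(@discount, 0)")) {
            rs.next();
            return rs.getDouble(1); // yields 0.5, not 0, after requestA has run
        }
    }
}
```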
Besides, the connection overhead on a MySQL connection is tiny. In my experience, connection pooling does not increase the number of users a server can support by very much.
Because that's the purpose of the mod_dbd module.