Our MySQL host has a limit on concurrent database connections. As it is rather pricey to raise that limit, the following question came up:
Info:
I have a web app (developed by an external coder, in case you wonder why I'm asking).
The web app is distributed and installed on many machines (every user installs it on their PC). These satellites send data into a MySQL db. Right now the satellites post into the db directly. To improve security and error handling I would like to have the satellites post to an XML-RPC endpoint (WordPress API), which then posts into the db.
Question:
Would such an API reduce the number of concurrent connections or not?
(Right now every satellite connects directly, so it is essentially 1 user = 1 connection.)
If 10 satellites post to one script, which then processes the data and writes it into the db, does that count as one connection, or as many connections as there are data sets to process?
What if the API throttled a little, so that only one post happens at a time? Would this lead to just one connection or not?
Any pointers are well appreciated!
Thank you in advance!
If you want to improve how concurrent database connections are handled (because the fact is, creating a connection to the database is "expensive"), you should look into using a ConnectionPool (example with Java).
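To make the pooling idea concrete, here is a minimal sketch in Node/TypeScript using the npm mysql package (my own illustration, not the linked Java example; host, credentials and table names are placeholders):

import * as mysql from 'mysql';   // npm "mysql" plus @types/mysql

// The pool caps how many connections this process will ever hold open at once.
const pool = mysql.createPool({
    connectionLimit: 5,            // never more than 5 concurrent connections to the server
    host: 'db.example.com',
    user: 'app_user',
    password: 'secret',
    database: 'satellite_data'
});

// Requests beyond the limit queue inside the pool until a connection is free.
pool.query(
    'INSERT INTO readings (satellite_id, payload) VALUES (?, ?)',
    [42, '{"temp": 21}'],
    (err) => { if (err) console.error('insert failed:', err); }
);

With a limit of 5, even hundreds of satellites posting through this one service can never open more than 5 connections to the MySQL host.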
How are concurrent database connections counted?
(source: iforce.co.nz)
A connectionless server
Uses a connectionless IPC API (e.g., connectionless datagram socket)
Sessions with concurrent clients can be interleaved.
A connection-oriented server
Uses a connection-oriented IPC API (e.g., stream-mode socket)
Sessions with concurrent clients can only be sequential unless the server is threaded.
(Client-server distributed computing paradigm, N.A.)
Design and Performance Issues
Database connections can become a bottleneck. This can be addressed by using connection pools.
Compiled SQL statements can be re-used by using PreparedStatements instead of statements. These statements can be parameterized.
Connections are usually not created directly by the servlet but either created using a factory (DataSource) or obtained from a naming service (JNDI).
It is important to release connections (close them or return them to the connection pool). This should be done in a finally clause (so that it is done in any case). Note that close() also throws an exception!
try
{
    Console.WriteLine("Executing the try statement.");
    throw new NullReferenceException();
}
catch (NullReferenceException e)
{
    Console.WriteLine("{0} Caught exception #1.", e);
}
catch
{
    Console.WriteLine("Caught exception #2.");
}
finally
{
    Console.WriteLine("Executing finally block.");
}
There are various problems when developing interfaces between OO and RDBMS. This is called the "paradigm mismatch". The main problem is that databases use reference by value while OO languages use reference by address. So-called middleware / object persistence frameworks try to ease this.
Dietrich, R. (2012). Web Application Architecture, Server Side Scripting Servlets. Palmerston North: Massey University.
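The notes above are Java/JDBC-specific, but the two habits they stress (parameterized statements, and always giving the connection back) translate directly to other stacks. A rough Node/TypeScript equivalent using the npm mysql package (my own sketch, not part of the cited notes):

import * as mysql from 'mysql';

const pool = mysql.createPool({ connectionLimit: 5, host: 'db.example.com', user: 'app_user', password: 'secret', database: 'satellite_data' });

function saveReading(satelliteId: number, payload: string): void {
    pool.getConnection((connErr, conn) => {
        if (connErr) { console.error('no connection available:', connErr); return; }
        // Parameterized query: the driver escapes the values, the SQL text stays constant.
        conn.query('INSERT INTO readings (satellite_id, payload) VALUES (?, ?)', [satelliteId, payload], (queryErr) => {
            conn.release();   // the node equivalent of "close it in a finally clause": release on success and on error
            if (queryErr) console.error('insert failed:', queryErr);
        });
    });
}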
It depends on the way you implement the centralized service.
If the service posts the data to MySQL immediately after receiving a request, you may have many connections when there are simultaneous requests. But using connection pooling you can control precisely how many open connections you will have; in the limit, you can have just one connection open. This might cause contention when there are many concurrent requests, as each request has to wait for the connection to be released.
If the service receives requests, stores them somewhere other than the database, and processes them in chunks, you can also get by with just one connection. This case is more complex to implement, though, because you have to control access (reading and writing) to the temporary data buffer.
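A sketch of that second option (my own illustration, assuming a Node/TypeScript service and the npm mysql package; table and column names are invented): buffer the incoming posts in memory and flush them periodically as one bulk INSERT over a single connection.

import * as mysql from 'mysql';

const conn = mysql.createConnection({ host: 'db.example.com', user: 'app_user', password: 'secret', database: 'satellite_data' });
conn.connect();

// In-memory buffer of [satellite_id, payload] rows received from the satellites.
const buffer: Array<[number, string]> = [];

export function enqueue(satelliteId: number, payload: string): void {
    buffer.push([satelliteId, payload]);
}

// Flush every 5 seconds: one bulk INSERT over one connection, no matter how many satellites posted.
setInterval(() => {
    if (buffer.length === 0) return;
    const rows = buffer.splice(0, buffer.length);
    // The mysql driver expands a nested array after "VALUES ?" into a multi-row insert.
    conn.query('INSERT INTO readings (satellite_id, payload) VALUES ?', [rows], (err) => {
        if (err) {
            console.error('bulk insert failed, re-queueing rows:', err);
            buffer.unshift(...rows);   // crude retry; a durable buffer (file, queue) is safer
        }
    });
}, 5000);

This keeps exactly one connection open, at the cost of the buffering and locking complexity the answer mentions.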
Related
I am using this
var mysql = require('mysql');
in my node.js app. I am interested in making my app perform as fast as possible. I have many functions that connect to MySQL. There are two approaches I am familiar with:
1. For every request, I make a new connection, execute the query, and then close the connection.
2. Open the connection and save it in a global variable, then never close it. Every request that comes in just uses the globally saved connection.
Which is generally better to use? Also for number 2, if the server closes unexpectedly, then the sql connection doesn't close. Is that bad?
Thanks
Approach 2 is faster, but to avoid the potential problem of connections dropping unexpectedly, you'll have to implement a testing mechanism for every segment that queries the database (e.g., check the number of returned rows).
To take this approach further, you can define a connection bank or pool, where you deal with connection testing and distribution. The basic idea is to have many connections to the database and hand out only healthy connections to consumers (functions or objects that query the database). As Andrew mentions in the comments, you can check this question: node.js + mysql connection pooling
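For approach 2, the usual safeguard against dropped connections (a sketch under my own assumptions, using the same mysql package) is to listen for the connection's 'error' event and reconnect:

import * as mysql from 'mysql';

let connection: mysql.Connection;

function connect(): void {
    connection = mysql.createConnection({ host: 'localhost', user: 'app', password: 'secret', database: 'mydb' });
    connection.connect((err) => {
        if (err) setTimeout(connect, 2000);   // server not reachable yet, retry shortly
    });
    // The driver emits 'error' with code PROTOCOL_CONNECTION_LOST when the server drops the socket.
    connection.on('error', (err) => {
        if (err.code === 'PROTOCOL_CONNECTION_LOST') connect();
        else throw err;
    });
}
connect();

A pool spares you most of this bookkeeping, since dead connections are discarded and replaced for you.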
Since the database is an essential asset to a project, if this is not a homework or learning project, it might not be a bad idea to explore 3rd-party libraries, where a lot of the connection and security details are covered and automated.
I'm using a MySQL instance on Azure under the free trial period. One thing I've noticed is that max_user_connections is set to 4 under this option and 40 under the highest priced tier.
Both of these seem really low, unless I'm misunderstanding something. Let's say I have 41 users making database requests simultaneously, wouldn't this cause a failure due to going over the max allowable connections? That doesn't seem like much room.
How can I use Azure to allow a realistic number of simultaneous connections? Or am I thinking about this incorrectly? Should I just dump MySQL for SQL Azure?
Thanks.
If you are using the .NET framework, connection pooling is managed by the data provider. Instead of opening a connection and leaving it open for an entire session, with .NET each database operation/transaction typically opens the connection, performs a single task and then closes the connection after the operation completes. The .NET MySQL data provider also supports advanced connection pooling, see http://www.devart.com/dotconnect/mysql/docs/ComparingProviders.html
I would assume the Azure limitation is referring to applications that employ the first (session duration) alternative.
I am writing my first .NET MVC application and I am using the Code-First approach. I have recently learned how to configure two SQL Servers installations for High Availability using a Mirror Database and a Witness (not to be confused with Failover Clusters) to do the failover process. I think this would be a great time to practice both things by mounting my web app into a highly-available DB.
Now, from what I learned (correct me if I'm wrong), in the mirror configuration the witness triggers a failover to the secondary DB if the first one goes down... but your application will also need to change the connection string to reference the secondary server.
What is the best approach to have both addresses in the Web.config (or somewhere else) and choosing the right connection string?
I have zero experience with connecting to mirrored databases, so this is all hearsay! :)
The short of it is that you may not have to do anything special, as long as you pass along the FailoverPartner attribute in your connection string. The long of it is that you may need additional error handling to attempt a new connection so the data provider will actually use the FailoverPartner name for the new connection.
There seems to be some good information with Connecting Clients to a Database Mirroring Session to get started. Have you had a chance to check that out?
If not, it's there under Making the Initial Connection, where they introduce the FailoverPartner attribute of the ConnectionString property.
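For reference, an ADO.NET connection string carrying that attribute (written with a space inside the string) looks something like this; server and database names here are placeholders:

Data Source=PrimaryServer;Failover Partner=MirrorServer;Initial Catalog=AppDb;Integrated Security=True;

The provider uses the partner name when the principal can't be reached, which is why the documentation stresses the initial connection.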
Reconnecting to a Database Mirroring Session suggests that on any client disconnect due to failover, the client will need to trap this exception and be prepared to reconnect:
The application must become aware of the error. Then, the application needs to close the failed connection and open a new connection using the same connection string attributes.
If the FailoverPartner attribute is available, this process should be relatively transparent to the client.
If the above doesn't work, then you might need to actually introduce some logic at the application tier to track who is the primary node, the failover node, and connection strings for each, and be prepared to persist that information somewhere - much like the data access provider should be doing for us (eyes wide open).
There is also this ServerFault post on database mirroring with Sql Server that might be of interest from an operational viewpoint that has additional reference information.
Hopefully someone with actual experience will back up any of this!
This may be totally off base, but what if you had a load balancer between your web server and the database servers?
The load balancer would have both databases in its pool, using basic health-check techniques (e.g., ping).
Your configuration would then only need to point to the IP of the Load Balancer, and wouldn't need to change.
This is what these network devices are good for. It's not the job of the programming framework (ASP.NET) to make decisions on the health of servers.
I've been thinking, why does Apache start a new connection to the MySQL server for each page request? Why doesn't it just keep ONE connection open at all times and send all sql queries through that one connection (obviously with client id attached to each req)?
It would cut down on the handshake overhead, plus a couple of other advantages that I can see.
It's like plugging in a computer every time you want to use it. Why go to the outlet each time when you can just leave it plugged in?
MySQL does not support multiple sessions over a single connection.
Oracle, for instance, allows this, and you can set up Apache to multiplex several logical sessions over a single TCP connection.
This is a limitation of MySQL, not of Apache or the scripting languages.
There are modules that can do session pooling:
Precreate a number of connections
Pick a free connection on demand
Create additional connections if no free connection is available.
the reason is: it's simpler.
to re-use connections, you have to invent and implement connection pooling. this adds another almost-layer of code that has to be developed, maintained, etc.
plus pooled connections invite a whole other class of bugs that you have to watch out for while developing your application. for example, if you define a user variable, and the next user of that connection goes down a code path that branches on whether that variable exists, then that user runs the wrong code. other problems include temporary tables, transaction deadlocks, session variables, etc. all of these become very hard to reproduce because they depend on the subsequent actions of two different users that appear to have no ties to each other.
besides, the connection overhead on a mysql connection is tiny. in my experience, connection pooling doesn't increase the number of users a server can support by very much.
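to make the user-variable pitfall concrete (my own sketch, assuming a node-style pool; the same effect happens in any language with pooled connections):

import * as mysql from 'mysql';
const pool = mysql.createPool({ connectionLimit: 2, host: 'localhost', user: 'app', password: 'secret', database: 'mydb' });

// request A grabs a pooled connection, sets a session variable, releases it without cleanup
pool.getConnection((errA, conn) => {
    if (errA) return;
    conn.query('SET @discount = 10', () => conn.release());
});

// request B may get the very same physical connection back from the pool...
pool.getConnection((errB, conn) => {
    if (errB) return;
    conn.query('SELECT @discount AS d', (err, rows) => {
        conn.release();
        // ...and can see @discount = 10 even though request B never set it
    });
});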
Because that's the purpose of the mod_dbd module.
I have a service that accepts callbacks from a provider.
Motivation: I do not want to EVER lose any callbacks (unless of course my network becomes unreachable).
Let's suppose the impossible happens and my mysql server becomes unreachable for some time,
I want to fallback to a secondary persistence store once I've retried several times and fail.
What are my options? Queues, an in-memory cache?
You say you're receiving "callbacks", but you've not made clear what they are. What is the protocol? Is it over a network?
If it were HTTP, then I would say the best way is that if your application is unable to write the data into permanent storage, it should return an error ("Try again later" if that exists in the protocol) to the caller, who should try again later.
An asynchronous process like a callback should always be able to cope with failures downstream and queue its requests.
I've worked with a payment provider where this has been the case (Paypal). If you're unable to completely process the request, just send an error back to the caller.
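If the callbacks do arrive over HTTP, pushing the retry back to the caller can be as small as this (a sketch assuming an Express-style handler; saveToMysql is a placeholder for whatever persistence call the service makes):

import express from 'express';

// placeholder signature for the real persistence routine
declare function saveToMysql(payload: unknown, done: (err?: Error) => void): void;

const app = express();
app.use(express.json());

app.post('/callback', (req, res) => {
    saveToMysql(req.body, (err) => {
        if (err) {
            // database unreachable: ask the provider to retry later instead of silently losing the callback
            res.status(503).set('Retry-After', '120').send('Temporarily unavailable, try again later');
        } else {
            res.sendStatus(200);
        }
    });
});

app.listen(8080);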
I recommend some sort of job queue server. I personally use Starling and have had great results with it. It speaks the memcache protocol so it is easy to use as a persistent queue.
Starling on Github
I've put a queue in SQLite for this before. Though, in my case, it was to protect against loss of the network link to the MySQL server — the data was locally-generated.
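A minimal sketch of that kind of local fallback queue (my assumptions: Node/TypeScript and the better-sqlite3 package; table and file names are invented):

import Database from 'better-sqlite3';

// local, file-based fallback store; survives process restarts
const fallback = new Database('callback-queue.db');
fallback.exec('CREATE TABLE IF NOT EXISTS pending (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)');

// called once the MySQL write has failed after several retries
export function stashLocally(payload: object): void {
    fallback.prepare('INSERT INTO pending (body) VALUES (?)').run(JSON.stringify(payload));
}

// a periodic job can later read from `pending`, replay the rows into MySQL,
// and delete them once the primary store acknowledges the write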
You can have a backup MySQL server and switch your connection to that one in case the primary one breaks down. If it's only going to be a failover store, you could probably run it locally on the application server.