I wonder what is the optimal way to establish/maintain connection with MySQL/Redis (from nodejs): store in one single object (conn) or create a new connection on every request? Namely:
1, Should we use a single connection for every nodejs http request? Use connection pool? Or a single connection on every new request (so reconnection should be important because the connection could be randomly lost)? How is the performance?
2, What is the difference between MySQL and Redis in term of maintaining such connection?
I will tell you how I used to manage to do this.
1, Should we use a single connection for every nodejs http request? Use connection pool? Or a single connection on every new request (so reconnection should be important because the connection could be randomly lost)? How is the performance?
You don't want to create connections manually for every Node.js HTTP request. Always use connection pooling if you are using the Node.js mysqljs/mysql module. I use it.
Pools take care of server reconnections automatically.
I don't have benchmarks, but performance should be better because connections in a pool are reused once released. Besides, believe me, creating and managing connections manually is cumbersome and error-prone.
Eg:
Declare your mysql connection pool in a Db.js file like below and export it.
var mysql = require("mysql");
var pool = mysql.createPool({
connectionLimit : 5,
host : process.env.DB_HOST || 'localhost',
user : process.env.DB_USER,
password : process.env.DB_PASSWORD,
database : 'mydb'
});
module.exports = pool;
And use it inside an endpoint in another file.
var pool = require('./Db.js');
app.get('/endpoint', function (req, res) {
// ...
pool.query('<your-query-here>', function (err, results, fields) {
if (err) throw err;
// do something with the query results (named `results` so it does not shadow the response object `res`)
});
});
I use both pool.query and pool.getConnection, depending on the scenario. Using query is safer because you don't need to think about releasing the connection; the library handles it automatically. I use getConnection only where several queries have to be run inside an endpoint, so that I can reuse the same connection for all of them.
2, What is the difference between MySQL and Redis in term of maintaining such connection?
In short, you don't need pooling for Redis. According to this, pooling Redis connections is not something we normally have to think about.
I built a multi-tenant microservice architecture in Nest.js; the multi-tenant connections are made using TypeORM's latest DataSource API. After upgrading to the latest TypeORM version, we started encountering the MySQL "Too many connections" error.
While searching for a solution, I found that the latest version added a poolSize option to control the number of active connections. I've added that too, but the issue is still there.
Ideally, TypeORM should close the connection once the DB operation has finished, or reuse an open connection (if any) on a new request. Instead, the connections to MySQL stay open in the Sleep state, and each new request creates a new connection.
This shows up when running show processlist; on MySQL.
I create the multi-tenant connection with a Nest.js provider for each incoming request in the microservice:
databaseSource is used for the initial connection to the default database; then, on each tenant request, we create a new DB connection.
export const databaseSource = new DataSource({
type: process.env.DB_CONNECTION,
host: process.env.DB_HOST,
port: parseInt(process.env.DB_PORT, 10),
username: process.env.DB_USERNAME,
password: process.env.DB_PASSWORD,
database: process.env.DB_DATABASE,
entities: ["src/models/entities/**/*.ts"],
synchronize: true,
poolSize: 10,
});
const connectionFactory = {
provide: CONNECTION,
scope: Scope.REQUEST,
useFactory: async (request: RequestContext) => {
const tenantHost = request.data["tenantId"] || request.data;
if (tenantHost) {
const tenantConnection: DataSource = await getTenantConnection(
tenantHost
);
return tenantConnection;
}
return null;
},
inject: [REQUEST],
};
@Global()
@Module({
providers: [connectionFactory],
exports: [CONNECTION],
})
export class TenancyModule {}
export async function getTenantConnection(
tenantHost: string
): Promise<DataSource> {
const tenantId = getTenantId(tenantHost);
const connectionName = `${tenantId}`;
const DBConfig: DataSourceOptions = {
type: process.env.DB_CONNECTION,
host: process.env.DB_HOST,
port: parseInt(process.env.DB_PORT, 10),
username: process.env.DB_USERNAME,
password: process.env.DB_PASSWORD,
database: connectionName,
entities: ["src/models/entities/**/*.ts"],
synchronize: true,
poolSize: 10,
};
const dataSource = new DataSource(DBConfig);
if (!dataSource.isInitialized) {
await dataSource.initialize();
}
return dataSource;
}
Once the data source is initialized, I inject it into a service and use it to call getRepository and perform DB operations.
I researched this a lot: some say to increase the MySQL max_connections limit, others say to pass the connectionLimit option in the TypeORM config (poolSize in the latest version), but nothing works for me.
Am I doing anything wrong when creating the tenant connection?
Is there any way to close the connection manually after a DB operation?
Your error doesn't have anything to do with the TypeORM version. Most likely the number of tenants has increased, and with the way you're creating connections you are going to run out of them, if not now then later.
There are a number of things you can do to make it work. The first is to limit the number of connections per tenant. The correct parameter for limiting the number of connections in a pool is connectionLimit inside the extra object for TypeORM versions < 0.3.10, and poolSize for TypeORM versions >= 0.3.10.
TypeORM is just an ORM; it delegates the underlying communication with the database to the corresponding driver. In the case of MySQL, it uses this mysql npm module. Whatever options you specify in dataSourceOptions are passed on to the driver. You can see the set of available pool options. You might want to set this number to a very small value, as you're going to have multiple tenants. Perhaps keep this value configurable per tenant: a smaller number for a not-so-big tenant and a larger one for a very busy tenant. This way you'll be able to reduce the overall connection pressure on your database server.
As for the screenshot you've pasted showing a high number of connections with the Sleep command, this is mostly due to the connection pools created by your application. It does no harm unless the total surpasses the max_connections variable on your MySQL server. In your case that is indeed what happened, hence the error: Too many connections.
Another option you might explore is to increase the value of max_connections so that you can accommodate all your tenants. You might also want to increase the server size, as raising this variable will increase RAM usage, unless of course MySQL is already running on a very big machine.
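The sizing advice above comes down to simple arithmetic: the sum of all tenant pool limits should stay below max_connections with some headroom. Here is a rough sketch of that budgeting. Note that budgetPoolSizes, the reserved count and the tenant weights are hypothetical names made up for illustration; they are not part of TypeORM or the mysql driver.

```javascript
// Hypothetical helper: split a MySQL connection budget across tenants.
// maxConnections mirrors the server's max_connections variable;
// reserved keeps some connections free for admin/monitoring sessions.
// tenantWeights maps tenant id -> positive relative load (an assumption).
function budgetPoolSizes(maxConnections, reserved, tenantWeights) {
  const tenants = Object.keys(tenantWeights);
  // Every tenant gets one connection up front; the rest is shared by weight.
  const spare = maxConnections - reserved - tenants.length;
  if (spare < 0) {
    throw new Error('Not enough connections to give each tenant one');
  }
  const totalWeight = tenants.reduce((sum, t) => sum + tenantWeights[t], 0);
  const sizes = {};
  for (const t of tenants) {
    // Rounding down guarantees the total never exceeds the budget.
    sizes[t] = 1 + Math.floor((spare * tenantWeights[t]) / totalWeight);
  }
  return sizes;
}

const sizes = budgetPoolSizes(151, 11, { small: 1, busy: 3 });
console.log(sizes);
```

Each resulting number could then be passed as poolSize (or connectionLimit) in that tenant's configuration.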
Edit 1: After reading your comment, I see a probable issue at this line:
if (databaseSource.options.database === connectionName) {
//
}
databaseSource.options.database will always be equal to process.env.DB_DATABASE when the databaseSource is first initialised. Upon any subsequent request for connection for any tenantId, this check will fail and every time a new connection pool will be created.
Edit 2: Again the issue lies within your code. You're always creating a new DataSource object without checking if there is already a connection pool for that tenant. isInitialized flag will always be false for a new object and your code will do dataSource.initialize() which will create new pool. Hint: Try to keep connection pools created in a map:
const tenantPools = {
tenantId: dataSource
}
and before creating a new DataSource object, check if that already exists in this map.
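A minimal sketch of that hint, written as plain JavaScript so it stays independent of the Nest.js wiring. createDataSource is an assumed factory (not a TypeORM function) that returns a Promise of an initialized data source; in the asker's code it would probably look like tenantId => new DataSource(configFor(tenantId)).initialize(), where configFor is also hypothetical.

```javascript
// Hypothetical sketch: one initialized data source (and so one pool) per tenant.
// createDataSource(tenantId) is an assumed factory returning a Promise.
function createTenantCache(createDataSource) {
  const pending = new Map(); // tenantId -> Promise of the data source

  return function getTenantConnection(tenantId) {
    if (!pending.has(tenantId)) {
      // Cache the promise itself so that concurrent requests for the same
      // tenant share a single initialization instead of racing.
      const promise = createDataSource(tenantId).catch((err) => {
        pending.delete(tenantId); // allow a retry after a failed initialization
        throw err;
      });
      pending.set(tenantId, promise);
    }
    return pending.get(tenantId);
  };
}
```

With something like this, the request-scoped provider would call getTenantConnection(tenantId) on every request, but a new pool would only be created the first time a tenant is seen.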
I am trying to implement nodejs mysql database following this tutorial. I know that
pool.query() is a shortcut for pool.getConnection() +
connection.query() + connection.release().
In the article the database is configured as:
var mysql = require('mysql')
var pool = mysql.createPool({
connectionLimit: 10,
host: 'localhost',
user: 'matt',
password: 'password',
database: 'my_database'
})
pool.getConnection((err, connection) => {
if (err) {
if (err.code === 'PROTOCOL_CONNECTION_LOST') {
console.error('Database connection was closed.')
}
if (err.code === 'ER_CON_COUNT_ERROR') {
console.error('Database has too many connections.')
}
if (err.code === 'ECONNREFUSED') {
console.error('Database connection was refused.')
}
}
if (connection) connection.release()
return
})
module.exports = pool
This can be used as:
pool.query('SELECT * FROM users', function (err, result, fields) {
if (err) throw err // err is already an Error; rethrow it as-is to keep its code and stack
// Do something with result.
})
However, I really do not understand the point of
if (connection) connection.release()
Why do we need this if using pool releases the connection automatically?
Once you do pool.getConnection(), you are removing a connection from the pool which you can then use and nobody else can get access to that connection from the pool. When you are done with it, you put it back in the pool so others can use it.
So, when not using pool.query() (which as you know puts it back in the pool automatically), you have to get a connection, do whatever you want with it and then put it back into the pool yourself.
If all you need to do is a single query, then use pool.query() and let it automatically get a connection from the pool, run the query, then release it back to the pool. But, if you have multiple things you want to do with the connection, such as multiple queries or multiple inserts to the database, then get the connection, do your multiple operations with it and then release it back to the pool. Getting a connection manually from the pool also allows you to build up state on that connection and share that state among several operations. Two successive calls to pool.query() may actually use different connections from the pool. They might even run in parallel.
However, I really do not understand the point of
if (connection) connection.release()
Why do we need this if using pool releases the connection automatically?
If you manually get a connection from the pool, then you have to manually put it back in the pool when you're done with it with connection.release(). Otherwise, the pool will soon be empty of connections and you'll have a bunch of idle connections that can't be used by anyone.
If you use the automatic methods like pool.query(), then it will handle putting it back into the pool after the single query operation.
Think of it like an automatic mode vs. a manual mode. The manual mode gives you finer-grained control over how you do things, but when the automatic mode lines up with your needs, it's easier to use. When the automatic mode (pool.query()) doesn't do exactly what you want, manually get a connection from the pool, use it and then put it back.
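A transaction is a typical case for the manual mode, because every statement has to run on the same connection. The sketch below assumes pool is a pool created with the mysql module; runInTransaction and the work callback are hypothetical names introduced only for this example.

```javascript
// Sketch: run several statements on ONE pooled connection inside a transaction.
// `work(connection, done)` is a hypothetical callback that issues its queries
// on the given connection and calls done(err) when it is finished.
function runInTransaction(pool, work, callback) {
  pool.getConnection(function (err, connection) {
    if (err) return callback(err);

    // Always hand the connection back to the pool before reporting the outcome.
    function finish(error) {
      connection.release();
      callback(error || null);
    }

    connection.beginTransaction(function (beginErr) {
      if (beginErr) return finish(beginErr);

      work(connection, function (workErr) {
        if (workErr) {
          // Undo what was done on this connection, then report the original error.
          return connection.rollback(function () { finish(workErr); });
        }
        connection.commit(function (commitErr) {
          if (commitErr) {
            return connection.rollback(function () { finish(commitErr); });
          }
          finish();
        });
      });
    });
  });
}
```

Two separate pool.query() calls could not do this reliably, since each of them may be served by a different connection.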
I built a program with Node.js where multiple users access it at the same time and perform a lot of operations that query the MySQL database.
My approach is very simple. I only open one connection when the app is started and leave it that way.
const dbConfig = require('./db-config');
const mysql = require('mysql');
// Create mySQL Connection
const db = mysql.createConnection({
host: dbConfig.host,
user: dbConfig.user,
password: dbConfig.password,
database: dbConfig.database,
multipleStatements: true
});
// Connect MySQL
db.connect((err) => {
if (err) {
throw err;
} else {
console.log('MySQL connected!');
}
});
module.exports = db;
And then, whenever the program needs to query the database, I do this:
db.query('query_in_here', (error, result) => {
// error handling and doing stuff with result
});
I'm having trouble when no one accesses the app for a long period of time (some hours).
When this happens, I think the connection is closed automatically. Then, when a user tries to access the app, I see in the console that the connection timed out.
My first thought was to handle the disconnection and connect again. But it got me thinking about whether this is the correct approach.
Should I use pool connections instead? Because if I keep only one connection, it means that two users can't query the database at the same time?
I tried to follow tutorials on pool connections but couldn't figure out when to create new connections and when I should end them.
UPDATE 1
Instead of creating one connection when the app starts, I changed it to create a connection pool.
const dbConfig = require('./db-config');
const mysql = require('mysql');
// Create mySQL Connection
const db = mysql.createPool({
host: dbConfig.host,
user: dbConfig.user,
password: dbConfig.password,
database: dbConfig.database,
multipleStatements: true
});
module.exports = db;
It seems that now, when I use db.query(...), acquiring and releasing the MySQL connection is done automatically.
So it should resolve my issue, but I don't know if this is the correct approach.
Should I use pool connections instead?
Yes you should. Pooling is supported out-of-the-box with the mysql module.
var mysql = require('mysql');
var pool = mysql.createPool({
connectionLimit : 10,
host : 'example.org',
user : 'bob',
password : 'secret',
database : 'my_db'
});
pool.query('SELECT 1 + 1 AS solution', function (error, results, fields) {
// should actually use an error-first callback to propagate the error, but anyway...
if (error) return console.error(error);
console.log('The solution is: ', results[0].solution);
});
You're not supposed to know how pooling works. It's abstracted from you. All you need to do is use pool to dispatch queries. How it works internally is not something you're required to understand.
What you should pay attention to is the connectionLimit configuration option. It must not exceed your MySQL server's max_connections setting (minus one, in case you want to connect to it yourself while your application is running), otherwise you'll get "too many connections" errors. The default max_connections in MySQL is 151 (it was 100 in versions before 5.1.15), so with the default setting connectionLimit should be at most 150.
Because if I keep only one connection, it means that two users can't query the database at the same time?
Without pooling, queries from different user requests cannot run in parallel: they queue up on the single connection and execute one at a time. Pooling is a must-have for any non-hobby, data-driven application.
Now, if you really want to know how connection pooling works, this article sums it up pretty nicely.
In software engineering, a connection pool is a cache of database connections maintained so that the connections can be reused when future requests to the database are required. Connection pools are used to enhance the performance of executing commands on a database. Opening and maintaining a database connection for each user, especially requests made to a dynamic database-driven website application, is costly and wastes resources. In connection pooling, after a connection is created, it is placed in the pool and it is used again so that a new connection does not have to be established. If all the connections are being used, a new connection is made and is added to the pool. Connection pooling also cuts down on the amount of time a user must wait to establish a connection to the database.
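To make the description above concrete, here is a toy pool. It is only an illustration of the idea, not how the mysql module implements pooling; TinyPool and the create factory are made-up names. Unlike the quoted text, it caps the number of connections and makes callers wait once the cap is reached, which is what a connectionLimit is for.

```javascript
// Toy connection pool to illustrate the idea; not the mysql module's implementation.
// `create` is an assumed synchronous factory returning a new "connection" object.
class TinyPool {
  constructor(create, limit) {
    this.create = create;
    this.limit = limit;
    this.idle = [];    // connections ready for reuse
    this.total = 0;    // connections created so far
    this.waiting = []; // callbacks queued while every connection is busy
  }

  acquire(callback) {
    // Prefer reusing an idle connection over opening a new one.
    if (this.idle.length > 0) return callback(this.idle.pop());
    if (this.total < this.limit) {
      this.total += 1;
      return callback(this.create());
    }
    this.waiting.push(callback); // at the limit: wait for a release
  }

  release(connection) {
    const next = this.waiting.shift();
    if (next) next(connection);      // hand it straight to the oldest waiter
    else this.idle.push(connection); // otherwise keep it for later reuse
  }
}
```

A real pool also has to deal with broken connections, timeouts and shutdown, which is a good reason to rely on the one built into the driver.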
I want to use mysql from AWS Lambda (hosted nodejs).
The Node.js instance will be automatically terminated by Lambda when no new requests show up for a few minutes.
Because of this Lambda behavior, I don't want to call end(), since that would turn every request into a connect-use-end cycle. I want the connection (or pool) to live across multiple requests.
Would it be a problem if connection.end() is never called and the instance gets terminated? (Could there be a leak or something?)
var mysql = require('mysql');
var connection = mysql.createConnection({
host : 'localhost',
user : 'me',
password : 'secret',
database : 'my_db'
});
connection.connect();
exports.handler = function () {
connection.query('SELECT x', function(err, rows, fields) {
// do something here
});
};
// * cannot call because potential incoming request still need to use.
// connection.end();
You may end up getting "Too many connections" errors at some point.
Your best option until then is to open and close the connection within the same invocation.
Connection pooling is a use case that needs effective reuse of Lambda containers, and it is a pending feature request on AWS.
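The open-and-close-per-invocation advice could look roughly like the sketch below. makeHandler and the injected createConnection factory are hypothetical; the factory is there only so the example does not depend on a real database, and 'SELECT 1 AS ok' is a placeholder query.

```javascript
// Sketch of the open-query-close pattern inside a single Lambda invocation.
// createConnection is an assumed factory, e.g. () => mysql.createConnection(config).
function makeHandler(createConnection) {
  return function handler(event, context, callback) {
    const connection = createConnection();
    // With the mysql module, query() connects implicitly if needed.
    connection.query('SELECT 1 AS ok', function (queryErr, rows) {
      // end() closes the connection gracefully whether or not the query
      // succeeded, so nothing is left open after the invocation.
      connection.end(function (endErr) {
        callback(queryErr || endErr || null, rows);
      });
    });
  };
}
```

Assuming the mysql module and a config object, it would be wired up as exports.handler = makeHandler(() => mysql.createConnection(config)).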
https://forums.aws.amazon.com/thread.jspa?threadID=216000
I found a very good module (node-mysql) for connecting to a MySQL database.
The module is very good; I only have a question about when to open the connection to MySQL.
Before starting with Node I always used php-mysql, where for each request I opened a connection, ran the query, then closed it.
Is it the same with Node? Do I have to open a connection and then close it for each request, or can I use a persistent connection?
Thank you
The open-query-close pattern generally relies on connection pooling to perform well. Node-mysql doesn't have any built in connection pooling, so if you use this pattern you'll be paying the cost of establishing a new connection each time you run a query (which may or may not be fine in your case).
Because node is single threaded, you can get away with a single persistent connection (especially since node-mysql will attempt to reconnect if the connection dies), but there are possible problems with that approach if you intend to use transactions (since all users of the node client are sharing the same connection and so same transaction state). Also, a single connection can be a limit in throughput since only one sql command can be executed at a time.
So, for transactional safety and for performance, the best case is really to use some sort of pooling. You could build a simple pool yourself in your app or investigate what other packages are out there to provide that capability. But either open-query-close, or persistent connection approaches may work in your case also.
felixge/node-mysql now has connection pooling (at the time of this writing.)
https://github.com/felixge/node-mysql#pooling-connections
Here's some sample code from the above link:
var mysql = require('mysql');
var pool = mysql.createPool(...);
pool.getConnection(function(err, connection) {
// Use the connection
connection.query( 'SELECT something FROM sometable', function(err, rows) {
// And done with the connection.
connection.release();
// Don't use the connection here, it has been returned to the pool.
});
});
So to answer your question (and same as @Geoff Chappell's answer): the best approach is to use pooling to manage connections.