I am using express 4.x, and the latest MySQL package for node.
The pattern for a PHP application (which I am most familiar with) is to have some sort of database connection common file that gets included and the connection is automatically closed upon the completion of the script. When implementing it in an express app, it might look something like this:
// includes and such
// ...
var db = require('./lib/db');
app.use(db({
  host: 'localhost',
  user: 'root',
  pass: '',
  dbname: 'testdb'
}));
app.get('/', function (req, res) {
  req.db.query('SELECT * FROM users', function (err, users) {
    res.render('home', {
      users: users
    });
  });
});
Excuse the lack of error handling; this is a primitive example. In any case, my db() function returns middleware that connects to the database and stores the connection object on req.db, effectively giving each request a new connection. There are a few problems with this method:
This does not scale at all; the number of database connections (which are expensive) grows linearly with the number of requests (which are fairly inexpensive).
Database connections are not closed automatically, and an uncaught error that trickles up will kill the application. You have to either catch it and reconnect (feels like an antipattern) or write more middleware that EVERYTHING must call prior to output to ensure the connection is closed (anti-DRY, arguably).
The next pattern I've seen is to simply open one connection as the app starts.
var mysql = require('mysql');
var connection = mysql.createConnection(config);
connection.on('connect', function () {
// start app.js here
});
Problems with this:
Still does not scale. One connection will easily get clogged with more than just 10-20 requests on my production boxes (1-2 GB RAM, 3.0 GHz quad CPU).
Connections will still time out after a while, so I have to provide an error handler to catch that and reconnect, which is very kludgy.
My question is, what kind of approach should be taken to handling database connections in an express app? It needs to scale (not infinitely, just within reason), I should not have to manually close the connection in the route or include extra middleware for every path, and I would prefer not to have to catch timeout errors and reopen connections.
Since you're talking about MySQL in Node.js, I have to point you to KnexJS! You'll find writing queries is much more fun. The other thing it gives you is connection pooling, which should solve your problem. It uses a little package called generic-pool-redux, which manages things like DB connections.
The idea is that your express app accesses the DB through one place in your code. That code, as it turns out, uses a connection pool to share the load among connections. I initialize mine something like this:
var Knex = require('knex');
Knex.knex = Knex({...}); //set options for DB
In other files:
var knex = require('knex').knex;
Now all files that could access the DB are using the same connection pool (set up once at start).
I'm sure there are other connection pool packages out there for Node and MySQL, but I personally recommend KnexJS if you're doing any dynamic or complex SQL queries. Good luck!
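The "set up once at start, share everywhere" idea can be sketched without any particular library. This is only an illustration: initPool, getPool and the factory argument are hypothetical names, and the factory stands in for whatever actually builds the pool (Knex, mysql.createPool, and so on).

```javascript
// db.js (hypothetical file name): create the pool once, hand out the same instance.
let sharedPool = null;

// `factory` is a stand-in for the function that builds the real pool.
function initPool(factory, options) {
  if (sharedPool === null) {
    sharedPool = factory(options); // only the first call creates anything
  }
  return sharedPool;
}

function getPool() {
  if (sharedPool === null) {
    throw new Error('Pool not initialised: call initPool() at startup');
  }
  return sharedPool;
}

// In a real file you would export these: module.exports = { initPool, getPool };
```

Because Node caches a module after the first require, every file that requires this module sees the same sharedPool, which is what makes the single-pool pattern work.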
Related
TL;DR: Vertical or Horizontal scaling for this system design?
I have NGINX running as a load balancer for my application. It distributes across 4 EC2 (t2.micro's cuz I'm cheap) to route traffic and those are all currently hitting one server for my MySQL database (also a t2.micro, totalling 6 separate EC2 instances for the whole system).
I'm thinking about horizontally scaling my database via Source/Replica distribution, and my thought is that I should route all read queries/GET requests (the highest traffic volume I'll get) to the Replicas and all write queries/POST requests to the Source db.
I know that I'll have to programmatically choose which DB my servers point to based on request method, but I'm unsure of how best to approach that or if I'm better off vertically scaling my DB at that point and investing in a larger EC2 instance.
Currently I'm connecting to the Source DB using an express server and it's handling everything. I haven't implemented the Source/Replica configuration just yet because I want to get my server-side planned out first.
Here's the current static connection setup:
const mysql = require('mysql2');
const Promise = require('bluebird');
const connection = mysql.createConnection({
  host: '****',
  port: 3306,
  user: '****',
  password: '*****',
  database: 'qandapi',
});
const db = Promise.promisifyAll(connection, { multiArgs: true });
db.connectAsync().then(() =>
  console.log(`Connected to QandApi as ID ${db.threadId}`)
);
module.exports = db;
What I want to happen is I want to either:
set up an express middleware function that looks at the request method and connects to the appropriate database by creating 2 configuration templates to put into the createConnection function (I'm unsure of how I would make sure it doesn't try to reconnect if a connection already exists, though)
if possible just open two connections simultaneously and route which database takes which method (I'm hopeful this option will work so that I can make things simpler)
Is this feasible? Am I going to see worse performance doing this than if I just vertically scaled my EC2 to something with more vCPUs?
Please let me know if any additional info is needed.
Simultaneous MySQL Database Connection
I would be hesitant to use any client input to decide which server to connect to, but I understand that some scenarios require it. The simplest and quickest way around this issue is to create a second database connection file. To make this dynamic, you can require that module conditionally, so the second connection is only loaded and awaited at certain points, once certain conditions are met. This approach can be risky, and calling require in the middle of your code isn't ideal, but it gets the job done. Example:
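const dbConnection = require("../utils/dbConnection");
// queryControlled and its flag are placeholder names for your own condition
async function queryControlled(useControlledConnection) {
  if (useControlledConnection) {
    // Loaded only when the condition holds; require() caches the module afterwards
    const controlledDBConnection = require("../utils/controlledDBConnection");
    const [rows] = await controlledDBConnection.execute("SELECT * FROM `foo`;");
    return rows;
  }
}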
Using more files could take up a little more space and could slow the code down while it waits for a new promise, but the overall effect will be minimal. controlledDBConnection.js would be close to a duplicate of dbConnection.js, with slightly different parameters depending on your needs.
Another path you can take if you want to avoid using multiple files is to export a module with a dynamically set variable from your controller file, and then import it into a standard connection file. This would allow you to change up your connection without rewriting a duplicate, but you will need diligent error checks and a default.
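For the read/write split described in the question, one possible shape is to open the connections (or pools) once at startup and only choose between them per request. The sketch below assumes a pools object of the form { source, replicas: [...] }; createDbRouter, pick and the round-robin choice are illustrative, not part of mysql2 or express.

```javascript
// Methods treated as read-only; everything else goes to the source database.
const READ_METHODS = new Set(['GET', 'HEAD', 'OPTIONS']);

// `pools` is assumed to look like { source: pool, replicas: [pool, ...] }.
function createDbRouter(pools) {
  let cursor = 0; // round-robin position over the replicas

  function pick(method) {
    const isRead = READ_METHODS.has(String(method).toUpperCase());
    if (!isRead || pools.replicas.length === 0) {
      return pools.source;
    }
    const chosen = pools.replicas[cursor % pools.replicas.length];
    cursor += 1;
    return chosen;
  }

  // Express-style middleware: attaches the chosen pool to the request.
  function middleware(req, res, next) {
    req.db = pick(req.method);
    next();
  }

  return { pick, middleware };
}
```

Since nothing is connected inside the middleware, there is no risk of reconnecting on every request. One caveat with asynchronous replication: a read sent to a replica immediately after a write may not see that write yet, so some GET routes may still need the source.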
Info on modules in JS : https://javascript.info/import-export
Some other points
Use environment variables for your database information (host, etc.), since this lets you change that information in one place and also lets you list your .env file in .gitignore if you are using GitHub
Here is another great stack overflow question/answer that might help with setting up a dynamic connection file : How to create dynamically database connection in Node.js?
How to set up .env files : https://nodejs.dev/learn/how-to-read-environment-variables-from-nodejs
How to set up .gitignore : https://stackabuse.com/git-ignore-files-with-gitignore/
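As a rough illustration of the environment-variable advice, connection settings can be built from process.env in one function. The DB_* variable names and buildDbConfig are made up for this sketch; no library requires them.

```javascript
// Builds connection settings from environment variables.
// The DB_* names are illustrative, not a convention required by any library.
function buildDbConfig(env) {
  const required = ['DB_HOST', 'DB_USER', 'DB_PASSWORD', 'DB_NAME'];
  const missing = required.filter(function (name) {
    return env[name] === undefined;
  });
  if (missing.length > 0) {
    throw new Error('Missing environment variables: ' + missing.join(', '));
  }
  return {
    host: env.DB_HOST,
    port: Number(env.DB_PORT || 3306), // fall back to the default MySQL port
    user: env.DB_USER,
    password: env.DB_PASSWORD,
    database: env.DB_NAME,
  };
}
```

It would typically be called once as buildDbConfig(process.env), and the result passed to whichever createConnection or createPool call you use.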
What is considered best practice for handling and managing connections when building an API or web application with Node.js that depends on MySQL (or in my case, MariaDB)?
Per the documentation for node-mysql, there seem to be two methods to use:
var connection = mysql.createConnection({...});
app.get("/", function(req, res) {
  connection.query("SELECT * FROM ....", function(error, result) {
    res.json(result);
  });
});
-- or --
var pool = mysql.createPool({...});
app.get("/", function(req, res) {
  pool.getConnection(function(error, connection) {
    if (error) {
      console.log("Error getting new connection from pool");
    } else {
      connection.query("SELECT * FROM ....", function(error, result) {
        connection.release();
        res.json(result);
      });
    }
  });
});
To me, it makes the most sense to use the second option, as it should use as many connections as are needed, as opposed to relying on a single connection. However, I have experienced problems using a pool with multiple routes, i.e each route gets a new connection from the pool, executes a query, and releases it back into the pool. Each time I get a connection from a pool, use it, and release it, it seems there is still a process in MySQL waiting for another request. Eventually, these processes build up in MySQL (visible by running SHOW PROCESSLIST) and the application is no longer able to retrieve a connection from the pool.
I have resorted to using the first method because it works and my application doesn't crash, but it doesn't seem like a robust solution. However, node-mariasql looks promising, but I can't tell if that will be any better than what I am currently using.
My question is: what is the best way to handle/structure MySQL connections when building an API or web application that relies heavily on SQL queries on almost every request?
Changing connection.release() to connection.destroy() solved my issue. I'm not sure what the former is supposed to do, but the latter behaves as expected and actually removes the connection. This means once a connection is done being used, it kills the MySQL process and creates another when needed. This also means that many queries can hit the API simultaneously, and slow queries will not block new ones.
Better late than never.
connection.destroy() would mean that on each request you are making a new connection to MySQL, instead of just grabbing an idle connection and querying on that, which would have less overhead. Basically you are not using the pool anymore.
It's possible that your MySQL user was limited to a small number of connections, or that your queries were completing more slowly than requests were coming into your server.
You can try tweaking the connectionLimit parameter to something higher, so your server can handle more connections simultaneously.
var pool = mysql.createPool({
connectionLimit : 10,
host : 'example.org',
user : 'bob',
password : 'secret'
});
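One possible cause of connections piling up is a code path that gets a connection but never reaches release(), for example when an error is thrown first. A small wrapper can make the release unconditional. This is a sketch, not part of node-mysql: withConnection is a hypothetical helper, and it only assumes the callback-style pool.getConnection(cb) and connection.release() that the pool exposes.

```javascript
// Runs `work(connection)` on a pooled connection and always releases it,
// whether the work succeeds or fails. `work` may return a value or a promise.
function withConnection(pool, work) {
  return new Promise(function (resolve, reject) {
    pool.getConnection(function (err, connection) {
      if (err) {
        reject(err); // nothing was handed out, so nothing to release
        return;
      }
      Promise.resolve()
        .then(function () { return work(connection); })
        .then(
          function (result) { connection.release(); resolve(result); },
          function (workErr) { connection.release(); reject(workErr); }
        );
    });
  });
}
```

For single statements, node-mysql also offers pool.query(), which acquires a connection, runs the query and releases the connection in one call.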
I have a login system with my NodeJS using mysql-node.
The problem I have, however, is how to keep the user logged in. If they refresh the page they have to log in again, because I do not know how to store the session.
My login system is like this:
socket.on('login', function(data,callBack){
  var username = sanitize(data['login']).escape(),
      pass = sanitize(data['password']).escape();
  var query = connection.query('SELECT uid FROM users WHERE name = ? AND pass = ?', [username,pass],
    function(err,results){
      if(err){
        console.log('Oh No! '+err);
      } else if(results.length == 1){
        //some how set a session here
      } else if(!results.length) {
        console.log('No rows found!');
      }
    });
});
I'm having difficulty understanding how to set up a session for each client that connects. Is this possible with Node.js?
From what I've read, they assign express to var app, but I already have this: var app = http.createServer( ... so how can I also assign express to it? It's a bit confusing.
You need to understand the difference between an Express server and a native Node.js server; see my linked comparison of a Node.js server vs. an Express server.
So you can do:
var app = express();
var server = http.createServer(app);
This lets you keep the low-level functionality of Node.js.
So, if you don't want to use existing modules or a framework, you can build your own session manager:
using cookie
using IP/UA
using socket
The best way would be first to implement it with socket, for example:
server.on('connection', function (socket) {
  socket.id = id; // id: a unique value you generate for this client
});
or
server.on('request', function (req, res) {
  req.connection.id = id; // The socket can also be accessed at request.connection.
});
So, you just need to implement a middleware that checks the id.
If you want to guard against session prediction, session sidejacking, etc., you need to combine cookies, IP, socket, and your own ideas to make your app more secure.
Once your session manager is done, you can choose where to store the sessions: in a simple object, in Redis, in MongoDB, in MySQL ... (Express uses MemoryStore by default, though that may have changed)
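To make the "simple object" storage option concrete, here is a minimal in-memory sketch. createSessionStore and its method names are hypothetical; the only real API used is Node's crypto.randomBytes, which gives ids that are hard to predict.

```javascript
const crypto = require('crypto');

// In-memory session store keyed by an unguessable random id.
// Sessions live in a Map, so they are lost on restart and are not
// shared between processes; Redis or MySQL would replace the Map.
function createSessionStore(ttlMs) {
  const sessions = new Map();

  // `now` can be passed explicitly, which makes expiry easy to test.
  function create(data, now = Date.now()) {
    const id = crypto.randomBytes(32).toString('hex');
    sessions.set(id, { data: data, expiresAt: now + ttlMs });
    return id;
  }

  function get(id, now = Date.now()) {
    const entry = sessions.get(id);
    if (!entry) return null;
    if (entry.expiresAt <= now) {
      sessions.delete(id); // expired: forget it
      return null;
    }
    return entry.data;
  }

  function destroy(id) {
    sessions.delete(id);
  }

  return { create, get, destroy };
}
```

In the login handler, the id returned by create({ uid: ... }) would be sent to the client (for instance in a cookie) and looked up with get(id) on later requests.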
I don't know whether Node.js has a core feature for saving sessions. You need to use a database along with it. Express will help you use a database to persist user sessions, so it is worth studying and using:
http://expressjs.com/
http://blog.modulus.io/nodejs-and-express-sessions
I don't think there is any session mechanism within Node.js core. However, there are plenty of libraries that would allow you to do it. The first that comes to mind is Connect's session, which is a middleware for Node.js. Have a look at this question for more details on how to use it.
Have a look at this tutorial from dailyjs which tries to include Express's session into a notepad webapp. The source code is available here. (Note that Express' session is based on Connect's, and is practically the same).
EDIT: Here is a more complete example for Node authentication, using mongoose. They do however show their schemas, so I assume you can easily make the transition to MySQL.
I use https://github.com/felixge/node-mysql for my application
When and why should I use
db_pool = mysql.createConnection(db);
or
db_pool = mysql.createPool(db);
What are the differences, and when should each be used?
A single connection runs one query at a time. While it is executing one query, any others issued on it are queued until that query finishes. Hence, your DB throughput may be reduced.
A pool manages many lazily-created (in felixge's module) connections. While one connection is busy running a query, others can be used to execute subsequent queries. This can result in an increase in application performance as it allows multiple queries to be run in parallel.
Connection pooling allows you to reuse existing database connections instead of opening a new connection for every request to your Node application.
Many PHP and .Net folks are accustomed to connection pooling, since the standard data access layers in these platforms pool connections automatically (depending on how you access the database.)
Opening a new database connection takes time and server resources. Using a connection that is already there is much faster, and overall, your application should need to maintain fewer total open connections at any one time if you use connection pooling.
The connection pooling functionality of node-mysql works very well and is easy to use. I keep the pool in a global variable and just pass that to any modules that need to access the database.
For example, here the env_settings variable in the app server holds global settings, including the active connection pool:
var http = require("http");
var connect = require("connect");
var mysql = require('mysql');
var site = require('./site'); // assumed path to the module that exports ajaxHandlers (shown below)
var env_settings = {
  dbConnSettings: {
    host: "localhost",
    database: "yourDBname",
    user: "yourDBuser",
    password: "yourDBuserPassword"
  },
  port: 80
};
// Create connection pool
env_settings.connection_pool = mysql.createPool(env_settings.dbConnSettings);
var app = connect()
  .use(site.ajaxHandlers(env_settings));
http.createServer(app).listen(env_settings.port);
And here is the ajaxHandlers module that uses the connection pool:
var ajaxHandlers = function (env_settings) {
  return function ajaxHandlers(req, res, next) {
    env_settings.connection_pool.getConnection(function (err, connection) {
      if (err) {
        // No connection was handed out, so there is nothing to release
        next(err);
        return;
      }
      var sql = "SELECT some_data FROM some_table";
      connection.query(sql, function (err, rows, fields) {
        // Return the connection to the pool as soon as the query is done
        connection.release();
        if (err) {
          // Handle data access error here
          next(err);
          return;
        }
        if (rows) {
          for (var i = 0; i < rows.length; i++) {
            // Process rows[i].some_data
          }
        }
        res.end('Process Complete');
      });
    });
  };
};
/* Expose public functions ------ */
exports.ajaxHandlers = ajaxHandlers;
The connection_pool.getConnection method is asynchronous, so when the existing open connection is returned from the pool, or a new connection is opened if need be, then the callback function is called and you can use the connection. Also note the use of connection.release() instead of ending the connection as normal. The release just allows the pool to take back the connection so it can be reused.
Here is a good way to think about the difference. Take the example of a very simple app that takes requests and returns a data set containing the results. Without connection pooling, every time a request is made, a new connection is opened to the database, the results are returned, and then the connection is closed. If the app gets more requests per second than it can fulfill, then the number of concurrently open connections increases, since more than one connection is active at any time. Also, each request will take longer because it has to open a new connection to the data server, which is a relatively big step.
With connection pooling, the app will only open new connections when none are available in the pool. So the pool will open a bunch of new connections upon the first few requests, and leave them open. Now when a new request is made, the connection pooling process will grab a connection that is already open and was used before instead of opening a new connection. This will be faster, and there will be fewer active connections to the database under heavy load. Of course, there will be more "waiting" connections open when no one is hitting the server, since they are held in the pool. But that is not usually an issue because the server has plenty of resources available in that case anyway.
So database connection pooling can be used to make your app faster and more scalable. If you have very little traffic, it is not as important, unless you want to return results as quickly as possible. Connection pooling is often part of an overall strategy to decrease latency and improve overall performance.
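The behaviour described above (open lazily, reuse idle connections, wait when everything is busy) can be shown with a toy pool. This is a teaching sketch only, not how node-mysql implements its pool; createTinyPool and the createConnection argument are hypothetical.

```javascript
// Toy pool: creates connections lazily up to `limit`, reuses idle ones,
// and queues callers when every connection is busy.
function createTinyPool(createConnection, limit) {
  const idle = [];
  const waiting = [];
  let created = 0;

  function acquire(callback) {
    if (idle.length > 0) {
      callback(idle.pop());          // reuse a connection that is already open
    } else if (created < limit) {
      created += 1;
      callback(createConnection());  // open a new one only when needed
    } else {
      waiting.push(callback);        // at the limit: wait for a release
    }
  }

  function release(connection) {
    const nextCaller = waiting.shift();
    if (nextCaller) {
      nextCaller(connection);        // hand it straight to the next waiter
    } else {
      idle.push(connection);         // otherwise keep it open for later
    }
  }

  function stats() {
    return { created: created, idle: idle.length, waiting: waiting.length };
  }

  return { acquire, release, stats };
}
```

With a limit of 2, a third simultaneous acquire() waits instead of opening a third connection, and a later request reuses an idle connection rather than paying the connection cost again.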
I'm aware of the popularity of a module like node-mysql for connecting to a database from an application, but I can't find any info on the connecting process without using a module like this.
Obviously I could go fishing around the modules themselves for the answer, but is there really no use case for simple connections with simple queries without module dependency and bloated functionality?
I find it strange given the very simple I/O of a process like MySQL.
This has less to do with node.js and more to do with knowing how to implement the MySQL client/server protocol. You simply need to create a TCP connection to the server and send the correct format and sequence of data per the protocol. node-mysql has done the difficult part: abstracting the protocol into something much easier to use.
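To give a feel for what "the correct format" means, here is a small sketch that decodes only the very first bytes a MySQL server sends. It relies on two protocol facts: every packet starts with a 3-byte little-endian payload length plus a 1-byte sequence id, and a version 10 handshake payload begins with the protocol version byte followed by a NUL-terminated server version string. The function names are made up, and everything after this point (capabilities, authentication, queries, result sets) is left out.

```javascript
// Every MySQL packet starts with a 4-byte header:
// 3 bytes of payload length (little-endian) and 1 byte of sequence id.
function parsePacketHeader(buffer) {
  if (buffer.length < 4) {
    throw new RangeError('Need at least 4 bytes for a packet header');
  }
  return {
    payloadLength: buffer.readUIntLE(0, 3),
    sequenceId: buffer.readUInt8(3),
  };
}

// Start of the server's initial handshake payload (protocol version 10):
// one version byte, then a NUL-terminated server version string.
function parseHandshakeStart(payload) {
  const protocolVersion = payload.readUInt8(0);
  const nul = payload.indexOf(0, 1);
  const end = nul === -1 ? payload.length : nul;
  const serverVersion = payload.toString('ascii', 1, end);
  return { protocolVersion: protocolVersion, serverVersion: serverVersion };
}
```

In practice these would be fed from the 'data' events of a net.Socket connected to port 3306, with buffering for packets that arrive split across chunks, which is part of the work the module already does.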
This is subjective, but the example in https://github.com/felixge/node-mysql looks to me like a simple connection and a simple query:
var mysql = require('mysql');
var connection = mysql.createConnection({
  host : 'localhost',
  user : 'me',
  password : 'secret',
});
connection.connect();
connection.query('SELECT 1 + 1 AS solution', function(err, rows, fields) {
  if (err) throw err;
  console.log('The solution is: ', rows[0].solution);
});
connection.end();
If you look at the source code, you'll see what it takes to implement the MySQL client protocol; I would say that it is not that simple:
https://github.com/felixge/node-mysql/blob/master/lib/Connection.js
https://github.com/felixge/node-mysql/tree/master/lib/protocol
But again, this is subjective. IMHO, I don't think there is a simpler way to query MySQL.