Database threads per request using Ecto - json

A typical Elixir web application will usually have a postgresql backend, with Ecto queries coupled with the API logic.
However since cowboy creates a child GenServer process (containing the app logic) per request, will this have the effect of producing n psql threads for n concurrent requests, even with the pooling cowboy/poolboy provides?
Then, moving to a scenario where multiple instances of the application exists (for example a docker container cluster) will this not add an extra factor to the total number of existing database threads?

Cowboy does create a new Erlang process for each request but executing an Ecto query from that process will not result in a new Database connection. Ecto keeps a pool of connections to the database (using db_connection/poolboy). The size of this pool is set using the pool_size option in the configuration of the Repo. When you initiate a query, a connection from this pool is borrowed and used to execute the query. The connection is returned to the pool after the execution is complete. Ecto will never create a new connection for each query. If a connection is not available in the pool, it'll wait for one to be available or eventually time out if no connection is checked in in the configured timeout (defaults to 30 seconds).

Related

MySQL with Nodd js

I am creating a rest api that uses mysql as data base. My confusion is that should i connect to database in every request and release the connection at the end of the operation. Or should i connect the database at the start of the server and make it globally available and forget about releasing the connection
I would caution that neither option is quite wise.
The advantage of creating one connection for each request is that those connections can interact with your database in parallel, this is great when you have a lot of requests coming through.
The disadvantage (and the reason you might just create one connection on startup and share it) is obviously the setup cost of establishing a new connection each time.
One option to look into is connection pooling https://en.wikipedia.org/wiki/Connection_pool.
At a high level you can establish a pool of open connections on startup. When you need to make a request remove one of those connections from the pool, use it, and return it when done.
There are a number of useful Node packages that implement this abstraction, you should be able to find one if you look.

Reuse connection in database connection pool

As per my understanding, database connection pool usually works this way:
create n connections during app initialization and put them into a cache(e.g. a list)
a thread will require a connection to do some operation to the db and return the connection back when it has finished
when there is no available connection in the cache, the thread in step2 will be waiting util a connection is pushed back to the cache
My question is:
Can we execute multiple db operations through one connection after we acquire it from the pool instead of do one db operation then put it back? it's seems more efficient, because it saves the time acquiring and putting back the connection. (under multiple threads condition, there must be some cost of locking when add and get from the connection pool)
can anyone help? Thks!
Yes, the database connection can be used for multiple operations each time it is acquired from the pool, and this behavior is typical for database applications that use pooling. For example, a connection might be acquired once and reused for several operations during the handling of a request to a REST service. This lifecycle also often involves managing those operations as a single transaction in the database.

Database Connection Pool with a Single Application Server Instance

I am connecting to a remote MySQL database in my node.js server code for a web app. Is there any advantage to using a connection pool when I only have a single instance of a node.js application server running?
Connection pools are per instance of application. When you connect to the db, you are doing it from that particular instance and hence the pool is in the scope of that instance. The advantage of creating a pool is that you don't create / close connections very often, as this is, in general, a very expensive process. Rather, you maintain a set of connections open, in idle state, ready to be used if there is a need.
Update
In node there is async.parallel() construct which allows you to launch a set of tasks in async manner. Immagine that those tasks represent each one a single query. If you have a single connection to use, each process should use that same one, and it will quickly become a bottelneck. Instead, if you have a pool of available connections, each task can use a separate connection until the pool is completely used. Check this for more detailed reference.

Rails holding on to DB connections too long, can't run Stored Procedures

I'm experiencing a problem with Rails and my MySQL RDS Instance. I have my rails app connected to it through our database.yml file with a pool of 10 (now 5) connections. The other day another user of the database tried running a stored procedure but it would not execute. It was stuck just hanging around waiting to execute. The user looked at the processes and noticed that our rails user had around 30 idle processes so they killed some of those. The stored procedure kicked off then and ran without issue.
We are on an r3.xlarge instance and had ~100 total processes at the time of problem. This doesn't seem alarmingly high to me and I'm not sure why the procedure wouldn't execute without freeing up some of the processes. I guess my question is, is there a way to tell my rails app to release some of these idle connections after x seconds, or a way to control these connections better? I can write a cron which frees them up, but I'd love to do it the rails/best way.
Thanks for any help!
It seems to me that you may have hit the maximum connections limit on the MySQL instance. You can run select ##max_connections on your MySQL to find out the limit.
I don't know of a way to force Rails to close its allocated db connections. Each server process may use up to the pool size connections (i.e. 10 or 5 in your case) to the db for its threads. The distinction between threads and processes is important: if you for example have multiple workers serving your rails app running as separate processes (e.g. puma can be configured like that), then each of the process may allocate up to 5 or 10 connections. If you use background processes (sidekiq etc.), they also may use up to this amount of connections.
The ConnectionPool also provides a reaper that can be used to free allocated db connections from dead threads but unless your app is having some larger troubles, this usually will not help (your threads are more probably idle than dead).
So, I'd give a general advice to try to estimate the maximum number of connections that all your rails processes might need and if it is near or above the MySQL connection limit, either lower the connection pool size or decrease the number of possibly run Rails processes (workers).
If you need more help, please specify what application server do you use to run your Rails app and how it is configured and the same also for any background job workers.
Try setting reaping_frequency in your database.yml file:
reaping_frequency: frequency in seconds to periodically run the Reaper,
which attempts to find and recover connections from dead threads,
which can occur if a programmer forgets to close a connection at the
end of a thread or a thread dies unexpectedly. Regardless of this
setting, the Reaper will be invoked before every blocking wait.
(Default nil, which means don't schedule the Reaper)
Above documentation from: http://api.rubyonrails.org/classes/ActiveRecord/ConnectionAdapters/ConnectionPool.html

should I use memcached mysql or NO-SQL Database?

I'm working on a tcp server application that need dispatch around 1000 frames per second on java. Each ingoing frame generates an update (save state) and insert(for logging) to mysql. It says, 1000 updates and 1000 inserts per second!. I have a bottleneck in this updates and inserts. At this point, I don't know if mysql is a bad solution for my scenary because I'm doing the benchmarks on a PC, not on a server. I try execute Updates and Inserts in batchs periodically, but each batch takes a lot of time.
I'm using MyIsam engine and Mina to manipulates tcp connection and a pool for the mysql connections.
Should I use memcached over Mysql or NO-SQL, or it can be a hardware resource problem?
thanks!
regards!!