Difference between connections in use and threads connected in mysql? - mysql

Can anyone please tell me what the difference is between the number of connections in use and threads connected to a cluster in MySQL?
I am currently using a single cluster and a pooling library, but intuitively I feel the number of connections and the threads have to be the same as from what I understand one thread has to be spawned for every connection.
However in the metrics, I am observing for my application, I see the number of threads connected is twice the number of connections in use at some points of time, how is this possible?

Related

Are sysbench threads similar to thread_pool_size?

I tested a mysql cluster using sysbench to figure out a sweet spot to set maximum threads to. In my endevours I came across the threads option in sysbench.
--threads=N
I also came across the thread_pool_size in Mysql Thread pool operations.
thread_pool_size: The number of thread groups in the thread pool. This is the most important parameter controlling thread pool performance.
So the question that plagues me is are the threads for sysbench similar to the thread_pool_size for mysql?
Here is an example of a command that I used.
sysbench oltp_read_write.lua --threads=26 --time=30 --mysql-user='root' --mysql-password='password' --table-size=10000 --mysql-host=10.100.100.64 --mysql-port=6033 run
This is an image to show my current configuration:
CNFfiles
OUCH!
thread_cache_size is the number of "threads" to hang onto. It is a simpleminded pooling. It is a number not bytes!! 10 is a reasonable number. Anything more than max_connections is unnecessary.
max_connections refers to "concurrent" connections, not total over time. The default of 151 is fine for most systems. 1000 is "high" but is warranted for some systems; 10K is too high.
Check these:
SHOW GLOBAL STATUS LIKE 'Max_used_connections';
SHOW GLOBAL STATUS LIKE 'Threads_running';
The former is a high-water mark (since startup). If it is close to max_connections, then maybe max_connections should be increased.
The latter says how many of the current connections are actually doing anything. If it is over 100, the connections are stumbling over each other. We will need more details to discuss what to do next. (1 is common; a 'busy' system might say no more than 10, and change rapidly.)
Sysbench is a client of MySQL. It can start a number of threads, one per connection.
When not using a thread pool in MySQL Server, every client connection starts its own thread. So there's a one-to-one correspondence between sysbench threads and MySQL Server threads.
It's typical that a client connection is not running a query every second. Normally a client application runs other code in between waiting for queries. So on the MySQL Server side, some threads exist, but they aren't doing anything. This appears as "Sleep" in the processlist.
It's pretty common to have hundreds of client connections open, but only one or two dozen of these connections doing any query at any given moment. The others are all sleeping.
As a metaphor, I would compare this to customers in a bank, where they approach a teller's window and do transactions. The customer blocks others from using the same teller, even if the customer is signing a form or something else that is not talking directly to the teller.
When using a thread pool, threads are handled differently in MySQL Server. The thread pool feature exists so that a smaller number of threads in the MySQL Server can be shared by a greater number of client connections. The threads in MySQL Server are no longer corresponding one-to-one with client connections. They switch when a client connection requests to execute an SQL query. This is done to reduce resource usage when your clients open a large number of connections.
A metaphor for this is a restaurant where a single server can handle a whole section with customers. The customers only need attention from time to time, and the server can therefore keep track of multiple tables of customers.
In the case of sysbench, this is probably not a typical workload. The client threads are running SQL queries more rapidly than a typical application. If you try to use a thread pool in this case, you might have more client requests than the number of threads in the thread pool, and in this case the client requests might queue up.
In the restaurant metaphor, this would be the infrequent times when more than one table wants something at the same time. Then all but one of the tables must wait, but hopefully not for long since most customer requests are brief.
Using the thread pool in MySQL Server while testing with sysbench might not be the best way to measure the maximum throughput of queries.

NodeJS + MySql and large concurrent access

So I have been writting this NodeJS Quiz App. At first we are not expecting that much users, but as time goes by we are looking at about ~50000 potential registered users.
Considering the DB of choice iss MySql, and also considering not all registered users are going to be logged in at the same time, what is the maximum number of concurrent MySQL connections possible?
Also, how much RAM and CPU would I need to host this app?
When you are using node.js, you can simply use connection pooling, there is no obvious downside to pooling but you are establishing much less connection (and less overhead).
In addition, it all goes down to how much query is actually executed and how much execution time they need, here are some make up numbers:
Let's say if 50k users all go online at the same time, and each user makes 1 interaction per 5s and takes 10ms DB time to execute, then 50k * 10ms / 5s = 100 concurrent connection needed.
Of course you need to factor in some real numbers, and read-to-write ratio matters a lot (read can potentially run in parallel), and cache can help a lot if reading makes up large number of queries.
All in all, you are unlikely to need much if all you do is some simple select and insert per interaction.
PS This post gave some interesting number: with connection pooling, 5k connection = 100k concurrent user. Of course he/she didn't mention his/her load type, but you can see that (concurrent / connection threads) can be up to tens and hundred when connection pool is in place.
you can set mySQL the max global connection up to 10,00,000.
and for hardware requirement below link can help you.
https://www.percona.com/blog/2019/02/25/mysql-challenge-100k-connections/
Hope you get your answers if not let me know.

simultaneous connections to a mysql database

I made a program that receives user input and stores it on a MySQL database. I want to implement this program on several computers so users can upload information to the same database simoultaneously. The database is very simple, it has just seven columns and the user will only enter four of them.
There would be around two-three hundred computers uploading information (not always at the same time but it can happen). How reliable is this? Is that even possible?
It's my first script ever so I appreciate if you could point me in the right direction. Thanks in advance.
Having simultaneous connections from the same script depends on how you're processing the requests. The typical choices are by forking a new Python process (usually handled by a webserver), or by handling all the requests with a single process.
If you're forking processes (new process each request):
A single MySQL connection should be perfectly fine (since the total number of active connections will be equal to the number of requests you're handling).
You typically shouldn't worry about multiple connections since a single MySQL connection (and the server), can handle loads much higher than that (completely dependent upon the hardware of course). In which case, as #GeorgeDaniel said, it's more important that you focus on controlling how many active processes you have and making sure they don't strain your computer.
If you're running a single process:
Yet again, a single MySQL connection should be fast enough for all of those requests. If you want, you can look into grouping the inserts together, as well as multiple connections.
MySQL is fast and should be able to easily handle 200+ simultaneous connections that are writing/reading, regardless of how many active connections you have open. And yet again, the performance you get from MySQL is completely dependent upon your hardware.
Yes, it is possible to have up to that many number of mySQL connectins. It depends on a few variables. The maximum number of connections MySQL can support depends on the quality of the thread library on a given platform, the amount of RAM available, how much RAM is used for each connection, the workload from each connection, and the desired response time.
The number of connections permitted is controlled by the max_connections system variable. The default value is 151 to improve performance when MySQL is used with the Apache Web server.
The important part is to properly handle the connections and closing them appropriately. You do not want redundant connections occurring, as it can cause slow-down issues in the long run. Make sure when coding that you properly close connections.

Producer Consumer setup: How to handle Database Connections?

I'm building my first single-producer/single-consumer app in which the consumer takes items off the queue and stores them in a MySQL database.
Previously, when it was a single thread app, I would open a connection to the DB, send the query, close the connection, and repeat every time new info came in.
With a producer-consumer setup, what is the better way to handle the DB connection? Should I open it once before starting the consumer loop (I can't see a problem with this, but I'm sure that one of you fine folks will point it out if there is one)? Or should I open and close the DB connection on each iteration of the loop (seems like a waste of time and resources)?
This software runs on approximately 30 small linux computers and all of them talk to the same database. I don't see 30 simultaneous connections being an issue, but I'd love to hear your thoughts.
Apologies if this has been covered, I couldn't find it anywhere. If it has, a link would be fantastic. Thanks!
EDIT FOR CLARITY
My main focus here is the speed of the consumer thread. The whole reason for switching from single- to multi-threaded was because the single-threaded version was missing incoming information because it was busy trying to connect to the database. Given that the producer thread is expected to start dumping info into the buffer at quite a high rate, and given that the buffer will be limited in size, it is very important that the consumer work through the buffer as quickly as possible while remaining stable.
Your MySQL shouldn't have any problems handling connections in the hundreds, if not thousands.
On each of your consumers you should set up a connection pool use that from your consumer. If you consume the messages in a single thread (per application) the pool only needs to use one connection but it's also fine to consume and start parallel threads that all use one connection.
The reason for using a connection pool is that it will handle re connection and keep alive for you. Just ask it for one connection and have it promise that it will work (it does this by running a small query against the database). If you don't use a connection for a while and it get's terminated the pool will just create a new one.

MySQL active connections at once, Windows Server

I have read every possible answer to this question and searched via Google in order to find the correct answer to the following question, but I am rather a novice and don't seem to get a clear understanding.
A lot I've read has to do with web servers, but I don't have a web server, but an intranet database.
I have a MySQL dsatabase in a Windows server at work.
I will have many users accessing this database constantly to perform simple queries and writting back to it new records.
The read/write will not be that heavy (chances are 50-100 users will do so exactly at the same time, even if 1000's could be connected).
The GUI will be either via Excel forms and/or Access.
What I need to know is the maximum number of active connections I can have at any given time to the database.
I know I can change the number on Mysql Admin however I really need to know what will really work...
I don't want to put 1000 users if the system will really handle 100 correctly (after that, although connected, the performance will be too slow, for example)
Any ideas or own experiences will be appreciated
This depends mainly on your server hardware (RAM, cpu, networking) and server load for other processes if not dedicated to the database. I think you won't have an absolute answer and the best way is testing.
I think something like 1000 should work ok, as long as you use 64 bit MySQL server. With 32 bit, too many connections may create virtual memory pressure - a connection has an own thread, and every thread needs a stack, so the stack memory will reduce possible size of the buffer pool and other buffers.
MySQL generally does not slow down if you have many idle connections, however special commands e.g "show processlist" or "kill", that enumerate every connection will be somewhat slower.
If idle connection stays idle for too long (idle time exceeds wait_timeout parameter), it is dropped by the server. If this is the case in your possible scenario, you might want to increase wait_timeout (its default value is 8 hours)