I've set up a single VPC on AWS and launched 6 MySQL databases in it. For each one I've created a read replica, so that I can always run queries against the read replicas quickly.
Most of the day, my write instances (the original instances) are fully loaded, with CPU usage mostly at 99%. The read replicas, however, show only ~7-10% CPU usage, yet sometimes a service connecting to a read replica fails with the error "TOO MANY CONNECTIONS".
I'm not much of an expert with AWS, but is this happening because the write instances are fully loaded and they're on the same VPC?
No, it isn't. This is unrelated to replication. In replication, the replica counts as exactly 1 connection on the master, but replication does not consume any connections on the replica itself. There is no impact on connections related to the intensity of the total workload from replication.
This issue simply means you have more clients connecting to the replica than are allowed by the parameter group based on your RDS instance type. Use the query SELECT @@MAX_CONNECTIONS; to see what this limit is. Use SHOW STATUS LIKE 'THREADS_CONNECTED'; to see how many connections exist currently, and use SHOW PROCESSLIST; (as the administrative user, or any user holding the PROCESS privilege) in order to see what all of these connections are doing.
If many of them show Sleep and have long values in Time (seconds spent in the current state) then the problem is that your application is somehow abandoning connections, rather than properly closing them after use or when they are otherwise no longer needed.
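The check described above can be scripted. This is only a minimal sketch, assuming the rows of SHOW PROCESSLIST have already been fetched as dictionaries keyed by the usual column names (Id, User, Command, Time); the function name `find_abandoned` and the 300-second threshold are hypothetical choices, and the sample rows are made up.

```python
def find_abandoned(rows, min_idle_seconds=300):
    """Return processlist rows that have been in Sleep for at least min_idle_seconds."""
    return [
        row for row in rows
        if row["Command"] == "Sleep" and int(row["Time"]) >= min_idle_seconds
    ]


# Rows shaped like SHOW PROCESSLIST output (illustrative values only).
sample = [
    {"Id": 11, "User": "app", "Command": "Sleep", "Time": 4210},
    {"Id": 12, "User": "app", "Command": "Query", "Time": 2},
    {"Id": 13, "User": "app", "Command": "Sleep", "Time": 15},
]
abandoned = find_abandoned(sample)
```

A long list of results from the same application user is a hint that connections are being left open rather than closed.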
Related
I am trying to set up a cluster of 3 servers in 3 different locations: Dallas (US), London (UK) and Mumbai (India). At each location I have set up a web server and a db server. On the db servers I have configured a Galera MariaDB multi-master cluster to replicate the db among all three servers. Each web server connects over a local IP to its regional db server. I expect my Dallas web server to fetch db records from the Dallas db server, the London web server from the London db server, and the Mumbai web server from the Mumbai db server.
Everything is working, but I have found that a MySQL query takes more than 100s to fetch records. I have tried MariaDB as a single instance, and it fetches data within 5s.
What am I doing wrong?
It is possible to configure Galera to be single-Master. It does not sound like you did that, but I suggest double-checking.
Given that all nodes are writable, here's a simplified view of what happens on each transaction.
1. Do all the work to store/update the data on the Master you are connected to. (Presumably, it is the local machine.)
2. At COMMIT time, make a single round-trip to each of the other nodes (probably ~200ms) to give them a chance to say "wait! that would cause a conflict".
3. Usually step 2 will come back with "it's OK". At this point, the COMMIT returns success to the client.
(Note: If you are not using BEGIN...COMMIT, but instead auto_commit=ON, then there is an implicit COMMIT at the end of each DML statement.)
For a local read, the default action should return "immediately".
But maybe you are concerned about the "critical read" problem (cf. wsrep_sync_wait). In this case, you want to make sure that a write has propagated to your server. This is likely to add a 200ms delay to the read, because it waits until the "gcache" has caught up.
If you can assume that clients only read from the same server that they write to, consider setting wsrep_sync_wait=0. If anyone does a cross-datacenter write and then a read, they could hit the "critical read" problem. (This is where a client writes something but may not see it on the next read.)
I have 3 game servers connected to the same database. I started using Galera Cluster to sync them, because remote MySQL connections suffer a delay due to the distance between the hosts (BR, US and FR), and my game server uses only one main thread for important queries.
This delay (lag) happens because the main thread needs to receive a callback (confirmation) before the application continues running.
I thought that with Galera Cluster, using a local database with ping 0, the problem wouldn't happen anymore. But I don't know why: every time I run INSERTs and DELETEs on the database, the same lag happens.
In my application's debug output, I see that queries are sent locally in 0 ms, but it still lags.
My question is, does Galera (mysql-wsrep) need confirmation from the other nodes in the cluster?
Galera checks with all the other nodes during the COMMIT command. This is when the lag occurs. Of course, COMMIT is explicitly or implicitly (autocommit) part of any transaction, so every transaction has that lag.
This implies that the optimal use of a geographically-dispersed Galera Cluster is to put many actions in a single transaction. (On the other hand, too many things in a single transaction can lead to undoing too much if there is any failure/deadlock/etc.)
The lag between US and Europe is on the order of 100ms; is that what you are seeing?
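A back-of-the-envelope sketch of why grouping statements helps. It assumes each COMMIT costs exactly one cross-datacenter round-trip and ignores all other work; the function name `commit_lag_ms` and the figures (50 statements, 100 ms round-trip) are illustrative, not measurements.

```python
def commit_lag_ms(statements, rtt_ms, per_transaction):
    """Estimate total time spent waiting on COMMIT round-trips, in milliseconds."""
    transactions = -(-statements // per_transaction)  # ceiling division
    return transactions * rtt_ms


# autocommit: every statement is its own transaction
autocommit_lag = commit_lag_ms(statements=50, rtt_ms=100, per_transaction=1)

# the same 50 statements grouped 10 per transaction
batched_lag = commit_lag_ms(statements=50, rtt_ms=100, per_transaction=10)
```

Under these assumptions the autocommit case waits 5000 ms in total, while the batched case waits 500 ms.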
I have a setup where multiple databases (one per tenant) live within the same MariaDB server (or cluster). The goal is to protect MariaDB from too many connections overall, but also from too many connections to each database. Basically, throttle each tenant at the database level without affecting the others.
Example: The tenant1 database is being hit hard and limited at a total of 10 connections. Other connections are queued. At the same time tenant2 can continue working as normal because it has not hit any limit and is therefore not affected by the queue.
I know HAProxy is great if you have one database being hit from multiple applications as you can have connections queued in HAProxy instead of hitting a hard limit in the database and having to deal with that in the application.
So the question: can HAProxy be used as a front for multiple databases within the same cluster (potentially with their own database credentials) and allow throttling connections per database? Or would you need multiple HAProxy servers for that (one per tenant)?
Another approach is to set up separate VMs, each with a MySQL instance. Then throttle access via CGroups. With this approach, HAProxy (etc) is not relevant unless you have some replication also.
CGroups has the feature that each VM can get "at least a certain percentage" of various resources (CPU, net, I/O). When the system is too busy, that percentage becomes a max. When otherwise idle, users can use more than their share.
Have you set the variable max_user_connections?
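For illustration only, the behaviour asked for in the question (a cap per tenant, with extra requests waiting instead of failing, and other tenants unaffected) can be sketched inside a single application process. This is not an HAProxy or MariaDB feature: the class `TenantLimiter` and its method names are hypothetical, and it only limits connections opened by the process that uses it.

```python
import threading
from contextlib import contextmanager


class TenantLimiter:
    """Allow at most max_per_tenant concurrent slots for each tenant."""

    def __init__(self, max_per_tenant):
        self._max = max_per_tenant
        self._lock = threading.Lock()
        self._slots = {}  # tenant name -> BoundedSemaphore

    def _semaphore(self, tenant):
        # Create the tenant's semaphore on first use, under a lock.
        with self._lock:
            if tenant not in self._slots:
                self._slots[tenant] = threading.BoundedSemaphore(self._max)
            return self._slots[tenant]

    @contextmanager
    def slot(self, tenant, timeout=None):
        """Wait (queue) for a free slot; raise TimeoutError if none frees up in time."""
        sem = self._semaphore(tenant)
        if not sem.acquire(timeout=timeout):
            raise TimeoutError(f"no free connection slot for {tenant}")
        try:
            yield
        finally:
            sem.release()


limiter = TenantLimiter(max_per_tenant=10)
with limiter.slot("tenant1"):
    pass  # open the tenant1 connection and run its queries here
```

Because each tenant has its own semaphore, a tenant that has used up its slots queues only its own callers.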
I made a program that receives user input and stores it in a MySQL database. I want to deploy this program on several computers so users can upload information to the same database simultaneously. The database is very simple: it has just seven columns, and the user will only enter four of them.
There would be around two-three hundred computers uploading information (not always at the same time but it can happen). How reliable is this? Is that even possible?
It's my first script ever so I appreciate if you could point me in the right direction. Thanks in advance.
Having simultaneous connections from the same script depends on how you're processing the requests. The typical choices are by forking a new Python process (usually handled by a webserver), or by handling all the requests with a single process.
If you're forking processes (new process each request):
A single MySQL connection should be perfectly fine (since the total number of active connections will be equal to the number of requests you're handling).
You typically shouldn't worry about multiple connections since a single MySQL connection (and the server) can handle loads much higher than that (completely dependent upon the hardware of course). In which case, as @GeorgeDaniel said, it's more important that you focus on controlling how many active processes you have and making sure they don't strain your computer.
If you're running a single process:
Yet again, a single MySQL connection should be fast enough for all of those requests. If you want, you can look into grouping the inserts together, as well as multiple connections.
MySQL is fast and should be able to easily handle 200+ simultaneous connections that are writing/reading, regardless of how many active connections you have open. And yet again, the performance you get from MySQL is completely dependent upon your hardware.
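A minimal sketch of grouping inserts over a single connection. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs without a server; the table `entries` and its columns are hypothetical. With a MySQL driver the same DB-API calls apply, but the placeholder is typically %s rather than ?.

```python
import sqlite3
from contextlib import closing

# Rows collected from several requests (made-up values).
rows = [("alice", 1), ("bob", 2), ("carol", 3)]

with closing(sqlite3.connect(":memory:")) as conn:
    conn.execute("CREATE TABLE entries (name TEXT, value INTEGER)")
    with conn:  # one transaction: commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO entries (name, value) VALUES (?, ?)", rows
        )
    inserted = conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]
```

All three rows go through one connection and one commit instead of three separate ones.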
Yes, it is possible to have that many MySQL connections. It depends on a few variables. The maximum number of connections MySQL can support depends on the quality of the thread library on a given platform, the amount of RAM available, how much RAM is used for each connection, the workload from each connection, and the desired response time.
The number of connections permitted is controlled by the max_connections system variable. The default value is 151 to improve performance when MySQL is used with the Apache Web server.
The important part is to handle the connections properly and close them appropriately. You do not want redundant connections piling up, as they can cause slow-down issues in the long run. Make sure when coding that you properly close connections.
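One way to make sure a connection is always closed, sketched with the standard library's contextlib.closing. sqlite3 stands in for the MySQL driver here so the example runs on its own; the function `save_entry` and the table `entries` are hypothetical.

```python
import sqlite3
from contextlib import closing


def save_entry(db_path, name):
    """Insert one row and return the row count; the connection is closed on exit."""
    with closing(sqlite3.connect(db_path)) as conn:
        with conn:  # commit on success, roll back on error
            conn.execute("CREATE TABLE IF NOT EXISTS entries (name TEXT)")
            conn.execute("INSERT INTO entries (name) VALUES (?)", (name,))
        return conn.execute("SELECT COUNT(*) FROM entries").fetchone()[0]


count = save_entry(":memory:", "first upload")
```

The `closing` block calls `conn.close()` even if one of the statements raises, so no connection is left behind.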
I am using Amazon RDS for my database services and want to use the read replica feature to distribute the traffic among my read replicas. I currently store the connection information for my database in a single config file. So my idea is to create a function that randomly picks from a list of my read-replica endpoints/addresses in my config file any time my application performs a read.
Is there a problem with this idea as long as I don't perform it on a write?
My guess is that if your service has enough traffic that you have multiple RDS read replicas to balance load across, then you also have multiple application servers in front of them operating behind a load balancer.
As such, you are probably better off having certain clusters of app server instances each pointing at a specific read replica. Perhaps you do this by availability zone.
The thought here is that your load balancer will then serve as the mechanism for properly distributing the incoming requests that ultimately lead to database reads. If you had the DB reads randomized across different replicas you could have unexpected spikes where too much traffic happens to be directed to one DB replica causing resulting latency spikes on your service.
The biggest challenge is that there is no guarantee that the read replicas will be up-to-date with the master or with each other when updates are made. If you pick a different read-replica each time you do a read you could see some strangeness if one of the read replicas is behind: one out of N reads would get stale data, giving an inconsistent view of the system.
Choosing a random read replica per transaction or session might be easier to deal with from the consistency perspective.
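A minimal sketch of the per-session variant: the replica is chosen at random once per session and then reused for every read in that session. The class `ReplicaPicker`, the session IDs and the endpoint names are all hypothetical; real endpoints would come from the config file mentioned in the question.

```python
import random


class ReplicaPicker:
    """Pick a random read replica once per session and stick to it."""

    def __init__(self, endpoints, rng=None):
        if not endpoints:
            raise ValueError("at least one replica endpoint is required")
        self._endpoints = list(endpoints)
        self._rng = rng or random.Random()
        self._by_session = {}  # session id -> chosen endpoint

    def endpoint_for(self, session_id):
        # Choose on first use, then always return the same endpoint.
        if session_id not in self._by_session:
            self._by_session[session_id] = self._rng.choice(self._endpoints)
        return self._by_session[session_id]


# Placeholder endpoint names, not real hosts.
picker = ReplicaPicker(["replica-1.example", "replica-2.example", "replica-3.example"])
first = picker.endpoint_for("session-42")
again = picker.endpoint_for("session-42")
```

Within a session all reads see the same replica, so a lagging replica yields a consistently older view rather than one that flips between fresh and stale.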