After Aurora Cluster DB failover, unable to write to DB - mysql

Right now I am connecting to a cluster endpoint that I have set up for an Aurora DB-MySQL compatible cluster, and after I do a "failover" from the AWS console, my web application is unable to properly connect to the DB that should be writable.
My setup is like this:
Java web app (Tomcat 8) with HikariCP as the connection pool and Connector/J as the MySQL driver. I am evaluating Aurora MySQL to see if it will satisfy some of the needs the application has. The web app sits on an EC2 instance that is in the same VPC and SG as the Aurora MySQL cluster. I am connecting through the cluster endpoint to get to the database.
After a failover, I would expect HikariCP to break connections (it does) and then attempt to reconnect (it does). However, the application must be connecting to the wrong server, because any time a write hits the database, a SQLException is thrown that says:
The MySQL server is running with the --read-only option so it cannot execute this statement
What is the solution here? Should I rework my code to flush DNS after all connections go down, or after I start receiving this error, and then try to re-initiate connections after that? That doesn't seem right...

I don't know why I keep asking questions if I just answer them (I should really be more patient), but here's an answer in case anyone stumbles upon this in a Google search:
RDS uses DNS changes behind the cluster endpoint to make failover look "seamless". Since the IP behind the hostname can change, any caching of the DNS lookup means the change won't be reflected, and the application keeps connecting to the old writer, which is now read-only. Here's a page from the AWS docs that goes into it a bit more: https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-jvm-ttl.html
To resolve my issue, I went into the JVM's security file and changed the DNS cache TTL (networkaddress.cache.ttl, the property that page describes) to 0, just to verify that this was what was happening. It seems to be. Now I just need to figure out how to do it properly...
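One way to do it without editing the JVM's security file is to set the same property in code at startup. This is only a minimal sketch: the class name DnsCacheConfig and the 5-second value are my own illustrative choices, and it assumes the call runs before the application does its first DNS lookup, since the JVM appears to read this property only once.

```java
import java.security.Security;

public class DnsCacheConfig {
    /**
     * Sets the JVM-wide TTL (in seconds) for cached successful DNS lookups.
     * Assumption: this is called early, before any hostname is resolved.
     */
    public static void configure(int ttlSeconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(ttlSeconds));
    }

    public static void main(String[] args) {
        // 5 seconds is an illustrative value, not a recommendation from the source.
        configure(5);
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
    }
}
```

A small non-zero TTL keeps some caching while still letting the pool pick up the new writer shortly after a failover; 0 disables caching of successful lookups entirely.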

Related

VerneMQ plugin_chain_exhausted Authentication MySQL

I have a running instance of VerneMQ (a cluster of 2 nodes) on Google Kubernetes, using MySQL (CloudSQL) for auth. The server accepts connections over TLS.
It works fine, but after a few days I start seeing this message in the log:
can't authenticate client {[],<<"Client-id">>} from X.X.X.X:16609 due to plugin_chain_exhausted
The client app (Paho) complains that the server refused the connection for being "not authorized" (code=5 in the Paho error).
After a few retries it finally connects, but every time it gets harder and harder until it just won't connect anymore.
If I restart VerneMQ, everything gets back to normal.
I have at most 3 clients connected at the same time.
Clients that are already connected have no issues with pub/sub.
In my configuration i have (among other things):
log.console.level=debug
plugins.vmq_diversity=on
vmq_diversity.mysql.* = all of them set
allow_anonymous=off
vmq_diversity.auth_mysql.enabled=on
It's like the server degrades over time. The status webpage reports no problem.
My Verne server was built from the git repository about a month ago and runs in a Docker container.
What could be the cause?
What else could I check to find possible causes? Maybe a diversity misconfiguration?
Thanks
To quickly explain the plugin_chain_exhausted log: with Verne you can run multiple authentication/authorization plugins, and they will be checked in a chain. If one plugin allows the client, it will be in. If no plugin allows the client, you'll see the log above.
This does not explain the behaviour you describe, though. I don't think I have seen that.
In any case, the first thing to check is whether you actually run multiple plugins. For instance: have you disabled the vmq_passwd and the vmq_acl plugins?

Pomelo MySQL (.NET Core) Can't Recover After Database Failure

Last night AWS RDS had an "Internet Connectivity Issue" that was resolved a short time later. However, my app (which runs in .NET Core and connects to an RDS MySQL instance via Pomelo.EntityFrameworkCore.MySql) could never re-establish a connection to the database even though the MySQL server was back online. I tested connecting from my own local machine and it worked just fine. I then re-deployed the .NET Core app and everything started working again.
Is there something that I need to re-create (the db context perhaps), or is there something that is being cached that I need to flush to try to connect again? I connect via hostname, and my connection string looks something like this:
server=something.somewhere.us-east-2.rds.amazonaws.com;userid=XXXX;password=YYYYY;database=ZZZZ
Here is the Exception being thrown:
MySqlException: Unable to connect to any of the specified MySQL hosts.
at MySqlConnector.Core.ServerSession+<ConnectAsync>d__56.MoveNext (C:\projects\mysqlconnector\src\MySqlConnector\Core\ServerSession.cs:239)
and here is how I create my db context in Startup.cs:
services.AddDbContext<BlayFapContext>(opt => opt.UseMySql(Settings.Instance.SQLConnectionString));
Any help would be greatly appreciated.
Giawa
Okay, we worked out what happened. Pomelo's MySQL wrapper had an issue as outlined in their git repo here: https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/issues/434
Basically, if a MySQL database is not available when the connection string is first used, then it will be cached as invalid and will never work again. You can easily confirm this by launching a service with no MySQL connectivity, verifying that it doesn't work, then launching MySQL and confirming that the service still doesn't work. It can never establish a MySQL connection after the first connection string is found to be invalid.
They patched it shortly after the 2.0.1 release, but they haven't updated NuGet with a new version since then, despite the issue being found 6 months ago. So, the fix is to check out their repository source code and patch it ourselves. We found the fix here works just fine: https://github.com/PomeloFoundation/Pomelo.EntityFrameworkCore.MySql/pull/456
So, why was the connection string retried? We already had a successful connection! It turns out that the internet connectivity issue with the Ohio data center was not limited to RDS, but also affected EC2. Our EC2 instance was rebooted as part of the fix, and the MySQL connection wasn't valid when it rebooted due to the continued connectivity issues. The state of that connection was cached, and even though the MySQL server came back online, our service was toast.
Giawa

AWS Aurora: The MySQL server is running with the --read-only option so it cannot execute this statement

I am getting this error when executing a GRANT statement on my Aurora DB instance in AWS:
The MySQL server is running with the --read-only option so it cannot execute this statement
My user is not read-only though, so why is this happening?
It turned out to be a silly mistake, but posting it anyway in case anyone else has the problem:
I was accessing the replica instance by mistake - I had copied the endpoint for the replica, and it is read-only apparently. So if you have this problem, verify that you are connecting to the Primary Instance or best of all the DB Cluster endpoint.
Edit: According to @Justin's answer, we definitely should use the DB Cluster endpoint:
You need to connect to the cluster, rather than an instance. This is because instances seem to take a turn to be the readers and writers.
In my case, I was receiving this error after performing a Blue/Green failover in a Test environment. I was trying to access the Blue database, in order to confirm the process for reverting back to Blue database should that be required later.
Accessing Blue via the cluster address yielded this error, as did attempting to use the direct links to the Blue "reader" and "writer" instances. In the end, I performed a failover of the Blue "reader" and "writer" instances, after which the cluster address was in a working state again.
tl;dr
Try a failover of the "writer".
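If this error shows up in application code rather than in a console session, it can help to detect it explicitly so the connection can be dropped and re-established against the current writer. The sketch below assumes a Java/JDBC client and relies on MySQL error code 1290 (ER_OPTION_PREVENTS_STATEMENT), which is the code behind this message; the class and method names are hypothetical.

```java
import java.sql.SQLException;

public class ReadOnlyErrorCheck {
    // MySQL error 1290 (ER_OPTION_PREVENTS_STATEMENT): "The MySQL server is
    // running with the ... option so it cannot execute this statement".
    public static final int ER_OPTION_PREVENTS_STATEMENT = 1290;

    /** Returns true if the exception, or any exception in its cause chain, carries error 1290. */
    public static boolean isReadOnlyError(SQLException e) {
        for (Throwable t = e; t != null; t = t.getCause()) {
            if (t instanceof SQLException
                    && ((SQLException) t).getErrorCode() == ER_OPTION_PREVENTS_STATEMENT) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        SQLException readOnly = new SQLException(
                "The MySQL server is running with the --read-only option", "HY000", 1290);
        SQLException other = new SQLException("Duplicate entry", "23000", 1062);
        System.out.println(isReadOnlyError(readOnly));
        System.out.println(isReadOnlyError(other));
    }
}
```

A caller that sees true here is most likely talking to a reader (for example, a stale DNS entry for the cluster endpoint after a failover), so discarding that connection and reconnecting is a reasonable reaction; retrying the statement on the same connection will not help.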

Ghost Blogging Platform Connection Reset Error

I am running Ghost as a web service on Microsoft Azure. I am using a MySQL database for storage instead of the default SQLite. Every time I open the blog I get an ECONNRESET error with status 500, and the SQL query is shown.
I have MySQL running in a virtual machine. Everything works fine on refresh, though. I am also using connection pooling.
How can I fix this, or what is the probable reason for Ghost dropping its connection to the database?
Solved the problem. The issue is with the underlying Knex MySQL driver. When a connection remains idle, Azure closes it; when the next request is made, Knex does not check whether the connection is still alive, which leads to the ECONNRESET error.
You can fix this by setting the minimum number of pool connections to zero in Knex.
For more details follow this issue:
https://github.com/tgriesser/knex/issues/975
Is the MySQL database hosted on another Azure instance?
If so, you will need to make it available to the outside (open the required ports).

MySQL connections limit in Micro CloudFoundry

I'm running my application with Micro CloudFoundry, but I'm having trouble connecting to MySQL: "User 'usGh0jJk8EoZn' has exceeded the 'max_user_connections' resource". How can I change this value?
I'm not quite sure you can change that value.
Before going down that road though, you may want to make sure that you are not leaking connections. Is your application running correctly when deployed locally (i.e. not using regular CloudFoundry nor Micro CF)? How are you connecting to the database? It may seem strange that you hit a connection limit if you're actually the sole user of your app, which I assume you are if using micro.
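To illustrate what a leak can look like, here is a minimal sketch, assuming the application is written in Java (the question does not say). FakeConnection is a hypothetical stand-in for java.sql.Connection so the example runs without a database; it only counts how many handles remain open.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LeakDemo {
    static final AtomicInteger open = new AtomicInteger();

    /** Hypothetical stand-in for java.sql.Connection; counts open handles. */
    static class FakeConnection implements AutoCloseable {
        FakeConnection() { open.incrementAndGet(); }
        @Override public void close() { open.decrementAndGet(); }
    }

    static void failingQuery() {
        throw new IllegalStateException("query failed");
    }

    /** Leaky: the exception skips close(), so the handle stays open. */
    public static void leaky() {
        FakeConnection c = new FakeConnection();
        failingQuery();
        c.close();
    }

    /** Safe: try-with-resources closes the handle even when the body throws. */
    public static void safe() {
        try (FakeConnection c = new FakeConnection()) {
            failingQuery();
        }
    }

    /** Runs the action from a clean counter and reports how many handles it left open. */
    public static int openAfter(Runnable action) {
        open.set(0);
        try {
            action.run();
        } catch (IllegalStateException expected) {
            // The simulated query failure is expected in both variants.
        }
        return open.get();
    }

    public static void main(String[] args) {
        System.out.println("leaky: " + openAfter(LeakDemo::leaky)); // prints "leaky: 1"
        System.out.println("safe: " + openAfter(LeakDemo::safe));   // prints "safe: 0"
    }
}
```

With a real driver, each leaked handle counts against max_user_connections until the server times it out, which is how a single user can hit the limit.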
As ebottard said, it's well worth making sure your code isn't leaking connections. But if you want to change the MySQL setup for the instance running on Micro CloudFoundry, you can SSH in to the VM using the 'vcap' user.
Once connected, you will find the mysql configuration file at /var/vcap/jobs/mysql_node/config/my.cnf
For maximum connections you will also have to change the max_user_conns value in /var/vcap/jobs/mysql_node/config/mysql_node.yml
Please also take a look at:
http://docs.cloudfoundry.com/infrastructure/micro/using-mcf.html#logging-in-to-micro-cloud-foundry