MySQL errors while executing queries in a Django project on AWS Lambda

I have a Django project hosted on an AWS Lambda function.
This microservice uses pymysql to connect to an AWS Aurora RDS database.
The service executes one and only one query over and over again.
1 in 200 times the query fails with an EOF packet error.
In order to investigate this issue, I implemented a "repeater" that repeats the same query if it fails (a maximum of 2 repeats with a 0.25-second delay).
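A minimal sketch of such a repeater, assuming the query is wrapped in a zero-argument callable. The names `run_with_retries`, `query`, and `before_retry` are hypothetical and not taken from the actual project; the optional `before_retry` hook is an addition for illustration only.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(
    query: Callable[[], T],
    max_repeats: int = 2,
    delay_seconds: float = 0.25,
    before_retry: Optional[Callable[[], None]] = None,
) -> T:
    """Call query(); if it raises, repeat it up to max_repeats more times."""
    attempt = 0
    while True:
        try:
            return query()
        except Exception:
            if attempt >= max_repeats:
                raise  # every attempt failed: surface the last error
            attempt += 1
            time.sleep(delay_seconds)
            if before_retry is not None:
                # Hypothetical hook, e.g. to discard the current connection.
                before_retry()
```

With the defaults this makes at most 3 calls in total. In a Django project, one thing that might be worth trying is passing `django.db.connection.close` as `before_retry`, so that each repeat runs on a fresh connection instead of the one that just failed; whether that helps in this particular case is an assumption, not something confirmed here.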
Once again, on a rare occasion, a query failed, and I expected to see a successful query after the first reattempt. However, it failed in all 3 consecutive calls, each with a DIFFERENT error message.
Error messages (in order):
AssertionError: Protocol error, expecting EOF
django.db.utils.InternalError: Packet sequence number wrong - got 24 expected 1
django.db.utils.InterfaceError: (0, '')
These are errors from 3 separate queries executed against the MySQL Aurora RDS database. (I want to emphasize that this is not a single stack trace but rather three different query errors.)
More useful info:
The microservice uses Django ORM to create queries.
The database is in Master-Slave configuration and those queries go against a Slave database.
The parameters observed in Master and Slave databases (such as CPU usage, free RAM space, various latencies, various throughputs, etc.) are completely normal and do not indicate any potential errors.
It is not a multithreaded environment.
Error stack traces:
Complete stack for *EOF error*:
https://pastebin.com/BracLTZX
Complete stack for *Packet sequence error*:
https://pastebin.com/fYmRGh69
Complete stack for *Interface error*:
https://pastebin.com/bstG1r2q

Related

AWS Time Out Problems with Elastic Beanstalk App with DB Access

Hi,
When my Elastic Beanstalk environment (m5a.large Windows Server with a deployed .NET Core Web API) comes under heavy load, the status on the Health page for my EC2 instances turns red, and my requests and the health check time out. This happens around 1-3 minutes after a server has been receiving at least 10-20 requests/sec.
I have to launch a lot of servers so that each server gets only 1-5 requests/second and does not turn red.
In my logs I saw the following errors:
Exception=MySql.Data.MySqlClient.MySqlException (0x80004005): Unable to connect to any of the specified MySQL hosts.
---> MySql.Data.MySqlClient.MySqlException (0x80004005): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
These errors brought me to the topic of connection pooling, so I switched
using MySql.Data.MySqlClient;
to
using MySqlConnector;
Now these errors no longer come up, but the problem remains.
The monitoring features of EB and RDS do not show any obvious problems. Running queries against the database in MySQL Workbench is as fast as usual.
At the moment, my database calls from the server are synchronous and do not use the async features of MySqlConnector.
Can an m5a.large really not process more than 5 requests/second?
Kind Regards

Will AWS Lambda automatically close MySQL connections?

If we don't close the MySQL connection at the end of the handler function in Lambda, will the MySQL connection close automatically when the Lambda dies and reconnect on a cold start?
The connections won't be closed immediately, but eventually they will be. By default, the connection timeout on MySQL is 8 hours, and on the instance shown below the maximum number of connections is capped at 66.
show variables like "wait_timeout"; -- 28800
show variables like "max_connections"; -- 66
When you create a connection to the MySQL server, the server creates a thread to serve that connection.
show status where variable_name = 'threads_connected';
select * from information_schema.processlist;
After a Lambda executes a request and sends a response, the Lambda execution environment is not removed immediately, and the same one may be used to serve other requests. This is your warm/hot Lambda, and in this case an active MySQL connection is very useful for your function execution; that is only possible if you did not close the connection in the previous invocation.
Eventually, when there are no more requests, the Lambda execution environment can be shut down and its resources returned to the pool of AWS compute resources. When the execution environment shuts down, the TCP connection from the Lambda to the MySQL server also terminates. The MySQL server can then remove the thread associated with that Lambda, which in effect reduces the number of active connections on the server. This also takes a bit of time. So if you are getting a lot of concurrent requests and the maximum number of connections is already active, requests will start failing.
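The warm-environment reuse described above can be sketched as a module-level cache. This is an illustrative sketch, not a definitive implementation: `get_connection`, `connect`, and `is_usable` are hypothetical names, and the two callables stand in for whatever the MySQL client library actually provides (for example, a function that opens a connection and a liveness check).

```python
from typing import Any, Callable, Optional

# Module-level state survives between invocations that land on the same
# (warm) execution environment; a cold start begins again with None.
_connection: Optional[Any] = None


def get_connection(
    connect: Callable[[], Any],
    is_usable: Callable[[Any], bool],
) -> Any:
    """Return the cached connection, opening a new one only when needed."""
    global _connection
    if _connection is None or not is_usable(_connection):
        # First invocation in this environment, or the cached connection
        # has gone stale (e.g. the server timed it out).
        _connection = connect()
    return _connection
```

A handler would call `get_connection(...)` at the start of each invocation and deliberately not close the connection at the end, which is what lets the next warm invocation skip the connection setup.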
I ran some tests to see how long it really takes to reclaim the connections, and here is the snapshot. The X axis is in minutes and the Y axis is on a scale of 0-70, where the lines parallel to the X axis are 10 units apart.
It takes roughly 10-15 minutes to reclaim the connections. But again, it depends on the Lambda usage pattern as well.
So should you close the connection on every invocation? Well, it depends!
Take a look at Lambda Runtime extensions and see if you can use the shutdown hook to close connection. If you can, then it would mean while the Lambda execution environment was serving multiple requests, you used a cached connection and just before your Lambda execution environment is taken away from you, you closed the connection.
Lambda RDS Proxy is also an alternative as mentioned above, but it is not free. Before you take the RDS Proxy route, do consider using another Serverless solution like AWS Fargate. In this case probably you would use a connection pool just like any long running server side application.
No, they will not be closed automatically unless you are doing something with your MySQL client that implicitly closes the connection when it goes out of scope.
The connection will stay open until it times out. Many people have reported problems in the past with poorly written Lambdas creating tons of open sessions/connections to relational databases, because the connections were not properly closed and had to wait to be timed out.
One feature that came out a year or so ago is RDS Proxy, which is a sort of intermediary between clients and the MySQL server that implements connection pooling. This solves the problem of Lambdas not being able to use connection pooling effectively, since RDS Proxy can do that for serverless clients.

MySQL Client Connection

I have a very basic question regarding the MySQL Workbench Client Connections window. In that window, a Command column and a Time column are shown. If the Command column value is Sleep and the Time column value is very large (say 1500), does that mean that the client connection object has not been used for quite some time? Also, what are the meanings of "Threads Connected", "Threads Running", "Total Connections", etc.?
A sample MySQL Workbench screenshot of Client Connections real time is shown below:
It basically utilizes output of SHOW PROCESSLIST command.
Command Column: The type of action happening in a particular connected thread. In the example screenshot: Sleep means that a thread is connected but not running any query right now. Query means that a query is being executed. That is why we have more Threads Connected but fewer Threads Running (threads running the Query command). Some threads are in the process of Connecting. Check more details here.
Time Column: The time in seconds that the thread has been in its current state.
Threads Connected: Number of MySQL client connections open to the server at the moment. So, for example, in our application code, when we do a mysqli_connect, it opens a connection to the Server. In this particular case, it also basically implies that 15 client sessions (most of them originating from application code) are executing simultaneously right now.
Threads Running: Out of these 15 connections, 4 are actually in the process of executing a query.
Total Connections: Total connections made to the server to date (since the last server restart, I believe).
Connection Limit: Maximum number of connections that can be made simultaneously. Default value of this is 151. In our case, we have increased it to 512, due to server capacity available.
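The relationship between the Command column and the two thread counters can be sketched as follows. This is an approximation for illustration: the function name `summarize_processlist` and the sample rows are hypothetical, and the rows are assumed to be dictionaries with the same `Command` and `Time` columns that SHOW PROCESSLIST returns.

```python
from typing import Dict, Iterable, Mapping


def summarize_processlist(rows: Iterable[Mapping[str, object]]) -> Dict[str, int]:
    """Count connected threads and those that are not idle (not in Sleep)."""
    connected = 0
    running = 0
    for row in rows:
        connected += 1  # every row is one open client connection
        if row["Command"] != "Sleep":
            running += 1  # the thread is doing something, e.g. a Query
    return {"threads_connected": connected, "threads_running": running}


# Hypothetical sample rows, reduced to the two columns discussed above.
sample_rows = [
    {"Command": "Sleep", "Time": 1500},  # idle for 1500 seconds
    {"Command": "Query", "Time": 0},     # executing a query right now
    {"Command": "Sleep", "Time": 12},
]
summary = summarize_processlist(sample_rows)
```

For the sample rows this counts 3 connected threads, of which 1 is running.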

aws DMS replicate-changes-only error

I have a prod AWS Aurora DB and I want to replicate changes to a test MySQL DB (the schema is the same; Aurora is based on MySQL).
I am using AWS DMS for this.
When performing a full replication of certain tables, the replication works fine.
When I want to perform replicate-changes-only, the replication fails.
I've set binlog_checksum=NONE and binlog_format=ROW in the parameter group.
The error I am receiving while running is:
Last Error The task stopped abnormally Stop Reason RECOVERABLE_ERROR Error Level RECOVERABLE
Last Error Task 'task-id' was suspended due to 6 successive unexpected failures Stop Reason FATAL_ERROR Error Level FATAL
Loading a snapshot to the test db isn't an option.
I just want to replicate the changes between specific tables.
Thanks in advance.
I had the same error; the task always stopped about 10 minutes after starting. More verbose logging didn't reveal anything more, but changing the task configuration, specifically the MaxFullLoadSubTasks parameter, resolved it.
By default the value is "MaxFullLoadSubTasks": 8; I changed it to "MaxFullLoadSubTasks": 1.
It is slower but it's working for now. You may be able to increase it a bit to be quicker without having the same error.
You can modify the task configuration by first copying the task json settings you will find under DMS > TASK > overview, then changing the value and saving it to a file and then:
aws dms modify-replication-task --replication-task-arn <TASK_ARN_ID> --replication-task-settings file:///path/to/your/task_config.json
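The "copy the JSON, change the value, save it to a file" step can also be scripted. The following is a minimal sketch under stated assumptions: the helper names `set_setting` and `rewrite_task_settings` are hypothetical, and because the exact nesting of the key inside the exported settings is not shown above, the helper simply updates the key wherever it appears.

```python
import json
from typing import Any


def set_setting(node: Any, key: str, value: Any) -> int:
    """Set every occurrence of key in nested JSON data; return how many."""
    changed = 0
    if isinstance(node, dict):
        for name in node:
            if name == key:
                node[name] = value
                changed += 1
            else:
                changed += set_setting(node[name], key, value)
    elif isinstance(node, list):
        for item in node:
            changed += set_setting(item, key, value)
    return changed


def rewrite_task_settings(src_path: str, dst_path: str, max_subtasks: int = 1) -> int:
    """Read exported task settings, lower MaxFullLoadSubTasks, write a copy."""
    with open(src_path, encoding="utf-8") as src:
        settings = json.load(src)
    changed = set_setting(settings, "MaxFullLoadSubTasks", max_subtasks)
    if changed == 0:
        raise KeyError("MaxFullLoadSubTasks not found in task settings")
    with open(dst_path, "w", encoding="utf-8") as dst:
        json.dump(settings, dst, indent=2)
    return changed
```

The file written to `dst_path` is what would then be passed to `--replication-task-settings` in the command above.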

Making MySQL client program thread safe?

I'm running into an interesting threading problem while running a D program that uses the MySQL C API. I am getting error 2013, "Lost connection to MySQL server during query." The problem appears to occur when enough threads flood the network interface buffer while the server still has more to transfer. This is my best guess based on some research and on running the program on two different computers. One computer has a 100Mb connection to the server and the other has a 1Gb connection. The computer with the 100Mb connection throws the error, while the 1Gb computer does not. I am wondering if I am running into what is described in the first paragraph of How to Write a Threaded Client in the MySQL documentation. If I am, what do I need to do with SIGPIPE and how do I do it?
For those who are interested, I am calling mysql_library_init before any library call and I am creating a new MYSQL* for each thread with mysql_init and mysql_real_connect. Also of note, the queries that I am executing are small SELECTs, only a few thousand records returned from each query and all queries are executed from the same table.
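The one-connection-per-thread arrangement described above can be illustrated in a language-neutral way. The sketch below is in Python rather than D, and `PerThreadConnection` and `connect` are hypothetical names; `connect` stands in for the `mysql_init` plus `mysql_real_connect` pair.

```python
import threading
from typing import Any, Callable


class PerThreadConnection:
    """Hand each thread its own connection, created lazily on first use."""

    def __init__(self, connect: Callable[[], Any]) -> None:
        self._connect = connect
        self._local = threading.local()  # attributes are private to each thread

    def get(self) -> Any:
        conn = getattr(self._local, "conn", None)
        if conn is None:
            # First call from this thread: open a connection just for it.
            conn = self._connect()
            self._local.conn = conn
        return conn
```

Because no connection object is ever shared between threads, no locking is needed around individual queries; the cost is one server-side connection per client thread.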
Please try this before mysql_real_connect:
my_bool myb = 1;
mysql_options(conn, mysql_option.MYSQL_OPT_RECONNECT, &myb);
Also please check this mysql troubleshooting page:
http://dev.mysql.com/doc/refman/5.5/en/gone-away.html