Insert data from AWS Lambda to AWS Aurora - mysql

In my application, I am using AWS S3 to upload and store files. Whenever a file is uploaded to S3, an event is created which triggers a specific lambda function λ. Then, my lambda function λ should perform an SQL INSERT (with the event data of the S3 event) to my running AWS Aurora instance. I expect that λ will be invoked about 10 - 50 times per second.
Summarising: S3 EVENT → TRIGGERS λ → AURORA INSERT
I found various posts claiming that accessing Aurora (or RDS in general) from a lambda function can result in problems due to missing connection pooling and stateless container architecture of AWS Lambda (e.g. AWS Lambda RDS Connection Pooling).
My λ can be written in any language, so the question is: which language or framework should I use to avoid the AWS Lambda connection pooling problem? In other words, is it possible to perform 10 - 50 inserts per second to Aurora with an Aurora MySQL compatible db.t2.small instance? Or are there any alternatives for performing INSERTs to Aurora with a service other than Lambda (e.g. SNS) without writing and running my own EC2 instance?
Update 2017-12-10: AWS recently announced Aurora Serverless as a preview, which looks promising for serverless architectures.

The connection pooling problem is not language-specific. It is caused by how your code connects to and disconnects from the database.
Basically, the best way to avoid it is to connect to and disconnect from the database within each Lambda invocation. This is not optimal for performance, but it is the least error-prone approach.
It is possible to reuse a database connection across invocations (for performance reasons), but whether this causes connection problems depends on how your database is configured to handle idle connections. It requires some trial and error and some database configuration tweaking. On top of that, tweaks that work in development may not work in production (since production traffic is different).
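A minimal sketch of the connect-and-disconnect-per-invocation approach, in Python. It assumes the PyMySQL driver is bundled with the function; the `uploads` table, its columns, and the `DB_*` environment variables are hypothetical names, not anything from the question. The `connect` parameter exists only so the handler can be exercised without a real database.

```python
import os

# Hypothetical table and column names, used for illustration only.
INSERT_SQL = "INSERT INTO uploads (bucket, object_key) VALUES (%s, %s)"


def open_connection():
    """Open a fresh connection (assumes PyMySQL is packaged with the function)."""
    import pymysql  # third-party driver; imported lazily so the module loads without it

    return pymysql.connect(
        host=os.environ["DB_HOST"],          # hypothetical environment variables
        user=os.environ["DB_USER"],
        password=os.environ["DB_PASSWORD"],
        database=os.environ["DB_NAME"],
        connect_timeout=5,
    )


def handler(event, context, connect=open_connection):
    """Connect, insert one row per S3 record, commit, and always disconnect."""
    rows = [
        (record["s3"]["bucket"]["name"], record["s3"]["object"]["key"])
        for record in event.get("Records", [])
    ]
    conn = connect()
    try:
        with conn.cursor() as cur:
            cur.executemany(INSERT_SQL, rows)
        conn.commit()
    finally:
        # Closing in `finally` means no connection outlives the invocation,
        # even when the INSERT fails.
        conn.close()
    return {"inserted": len(rows)}
```

At 10 - 50 invocations per second this opens a new connection for every event, which is the performance cost mentioned above, but the database never accumulates idle connections left behind by frozen containers.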
Now, to your questions:
Is it possible to perform 10 - 50 inserts per second to Aurora with an Aurora MySQL compatible db.t2.small instance?
I don't see why not. 50 inserts per second isn't really high.
Are there any alternatives to perform INSERTS to Aurora with another service than Lambda (e.g. SNS) without writing and running my own EC2 instance?
I don't think there are any. SQL INSERTs depend on a schema, so whatever performs the INSERT has to know that schema, which means you have to code it yourself using Lambda.
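To illustrate why the INSERT has to be hand-coded against your schema, here is a rough Python sketch that maps an S3 event onto parameter tuples for one particular table. The `uploads` table and its four columns are assumptions made up for this example; only the event fields (`Records`, `s3.bucket.name`, `s3.object.key`, `s3.object.size`, `eventTime`) come from the S3 notification format.

```python
from urllib.parse import unquote_plus

# Hypothetical schema: uploads(bucket, object_key, size_bytes, event_time)
INSERT_SQL = (
    "INSERT INTO uploads (bucket, object_key, size_bytes, event_time) "
    "VALUES (%s, %s, %s, %s)"
)


def rows_from_s3_event(event):
    """Turn each S3 event record into a parameter tuple matching INSERT_SQL."""
    rows = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        rows.append((
            s3["bucket"]["name"],
            unquote_plus(s3["object"]["key"]),  # object keys arrive URL-encoded
            s3["object"].get("size"),           # may be absent for some event types
            record.get("eventTime"),            # ISO 8601 string; convert if your column needs it
        ))
    return rows
```

The resulting list can be passed to a DB-API `cursor.executemany(INSERT_SQL, rows)` call. If the table changes, both the SQL string and the tuple order have to change with it.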

Related

First query or connection to AWS RDS is very very slow

I have a multi-tenant product built with Laravel.
It is deployed on an EC2 instance and uses AWS RDS as the database server.
I currently have around 100 databases in production.
Laravel's hyn tenancy module handles the connections.
Now, the problem is that for each tenant, after some idle time, the first request takes too long: around 15-20 seconds. After that, it works smoothly.
In the test environment, we are not using RDS but a local MySQL instance, and the problem does not occur there. The only difference between test and production is AWS RDS.
I have looked into max connections, query cache, and so on... but no luck so far.
Any suggestions?
The solution will depend on what kind of RDS you have.
I assume it's serverless (more common). In that case, there are minimum and maximum settings for ACUs (Aurora capacity units). I believe capacity goes down to zero by default if the DB is not accessed for a while. Check that setting and see if it is configured appropriately.
If you have a provisioned DB, then it's more complex. It starts caching things once queries are executed, but until a particular query has been run, you will be waiting for the DB to "wake up" and run the full query.
Check this page for relevant info.

AWS RDS Concurrency Question (multi-user)

I am trying to build a desktop application that ideally connects to an AWS RDS (MySQL) database. My use case is that at least 5 users will be using this app at the same time, and will likely be writing to the database at the same time. My question is: does RDS handle concurrency issues, or do I need to write some script to handle this in the desktop app?
Thank you!
RDS is a managed service, so you don't need to worry about concurrency or other configuration unless you need custom behavior.
The maximum number of simultaneous database connections varies by the DB engine type and the memory allocation for the DB instance class. The maximum number of connections is set in the parameter group associated with the DB instance, except for Microsoft SQL Server, where it is set in the server properties for the DB instance in SQL Server Management Studio (SSMS).
You can read more here.

AWS Aurora MySQL database is much slower than AWS RDS MySQL

We have existing data in an on-premises MySQL 5.7 database and are planning to move the application and database to AWS. We provisioned one RDS MySQL database and one Aurora MySQL database and connected the application to both servers. We saved the execution timings in the database and found that RDS MySQL runs about 2 times faster than the Aurora database.
AWS claims that Aurora performs 5 times faster than RDS MySQL, but this does not seem to be correct.
Please suggest whether any tuning is required for the Aurora DB.
System configuration for both DBs: db.r6g.large (2 vCPUs and 16 GB RAM)
Note: read the DB column prodQueryTime as 'MySQL performance time' and experimentQueryTime as 'Aurora DB performance time'.
There are many reports similar to yours, e.g. here. The answer is that it depends on what you do. An AWS representative writes:
The most important aspect to keep in mind is that Aurora is optimized for concurrent workloads and its benefits are best evaluated by running parallel benchmarks.
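As a rough illustration of what a parallel benchmark could look like, here is a Python sketch that runs the same query from several worker threads and reports throughput rather than single-query latency. `run_benchmark` and `make_query` are hypothetical names introduced for this example; `make_query` is assumed to open one connection per worker and return a zero-argument function that executes one representative statement.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(make_query, workers, queries_per_worker):
    """Run queries from `workers` threads; return (total_queries, queries_per_second)."""

    def worker(_index):
        # Each worker gets its own query function, so connections are not
        # shared between threads.
        run_query = make_query()
        for _ in range(queries_per_worker):
            run_query()
        return queries_per_worker

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(worker, range(workers)))
    elapsed = time.perf_counter() - start
    return total, total / elapsed
```

Running this against both endpoints at several worker counts (for example 1, 8, and 32) would show whether the gap narrows or reverses as concurrency grows, which is the comparison the quote recommends over timing individual queries.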
Totally agree.
Aurora MySQL performance is far worse than regular MySQL for our workload (an iOS and Android app accessed by 500K users).
We migrated from RDS MySQL (t3.medium) to Aurora MySQL.
We tried both Aurora MySQL and Aurora MySQL Serverless v2.
Originally, our MySQL t3.medium was running just fine, with CPU at 20-50% all the time.
An Aurora instance of the same size could not handle the load. We had to double the instance size and still kept having issues during peak times.
Also, when testing Aurora Serverless v2, we had to scale to 8 ACUs (16 GB) to be able to handle the load.
As I said before, with MySQL (not Aurora), we could handle the load just fine with a t3.medium (2 vCPUs, 4 GB).
With Aurora we had to double the instance size (or more), so the cost doubled (or more).
We are going back to regular RDS MySQL.

AWS RDS connection from external client extremely slow

I am currently connecting to an RDS instance (MariaDB) without an issue from within the configured VPC.
I am also connecting to the RDS instance from local clients (external to the VPC) with no connectivity issues but have serious issues with SQL execution speeds. It can take up to 20 times longer to execute a query remotely vs locally (an EC2 on same VPC).
I have the Security Group for the RDS instance set up to allow the external IPs as inbound rules, and the RDS instance is listening on a non-default port (not 3306).
I cannot think of anything I should be doing differently on the network side of things and I have set skip-name-resolve=1, yet the speed is ridiculous.
The type of SQL query makes no difference (SELECT, UPDATE, DELETE); they all execute slowly.
The RDS server is MariaDB 10.1.19 on a db.t2.medium instance.
Client connection is via MySQL .NET Connector and connection string:
Server=<ip>;Port=<port>;Database=<dbname>;User ID=<dbuser>;Pooling=true;CharSet=utf8;Password=<dbpass>
The client has no connectivity or speed issues when the DB is not on RDS (local MySQL).
I have seen various network related issues popping up now and then (connection stream dropped) but nothing serious apart from that, just very slow.
Any pointers on how to at least determine where the problem is?
The scenario I am trying to achieve (with acceptable speeds) is described here (albeit vague in their instructions):
http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_VPC.Scenarios.html#USER_VPC.Scenario4

persistently replicating RDS MySQL database to external slave

AWS now allows you to replicate data from an RDS instance to an external MySQL database.
However, according to the docs:
Replication to an instance of MySQL running external to Amazon RDS is only supported during the time it takes to export a database from a MySQL DB instance. The replication should be terminated when the data has been exported and applications can start accessing the external instance.
Is there a reason for this? Can I choose to ignore this if I want the replication to be persistent and permanent? Or does AWS enforce this somehow? If so, are there any work-arounds?
It doesn't look like Amazon explicitly states why they don't support ongoing replication other than the statement you quoted. In my experience, if AWS doesn't explicitly document a reason for why they do something then you're not likely to find out unless they decide to document it at a later time.
My guess would be that it has to do with the dynamic nature of Amazon instances and how they operate within RDS. RDS instances can have their IP address change suddenly without warning. We've encountered that on more than one occasion with the RDS instances that we run. According to the RDS Best Practices guide:
If your client application is caching the DNS data of your DB instances, set a TTL of less than 30 seconds. Because the underlying IP address of a DB instance can change after a failover, caching the DNS data for an extended time can lead to connection failures if your application tries to connect to an IP address that no longer is in service.
Given that RDS instances can and do change their IP address from time to time, my guess is that they simply want to avoid having to support people who set up external replication only to have it suddenly break if/when an RDS instance gets assigned a new IP address. Unless you leave the replication user and any firewalls protecting your external MySQL server pretty wide open, replication could suddenly stop if the RDS master reboots for any reason (maintenance, hardware failure, etc.). From a security point of view, opening up your replication user and firewall port like that is not a good idea.