Load testing a ec2 Node.js machine - Now... how do I remotely load test 6500 QPS? - stress-testing

Ok, I have my server built on ec2. My stack is Nginx as a load balancer, supervisord for managing processes for node.js i.e. one process for each cpu, and redis, master and slave on separate boxes. I have stress tested by testing failover and taking services offline. Using apache AB, on the server I can get up to 6500 QPS.
Now, I need to load test remotely. What are the best open source tools to accomplish this or even the most cost effective SaaS method to do this. I do expect 6500 QPS per server in production and need to extend the isolation of apache AB to remote testing. E.g. I will have servers in singapore and I need to test 6500 QPS from Japan and the effect of latency. I am aware of apache Jmeter but looking for a best practice solution.
Thanks

I have successfully used jMeter for load testing at significant scale.
If a single load generation client cannot output enough load, you can configure jMeter with multiple load generation clients, with the load coordinated by a master instance.
Using "open source tools" implies that you have the ability to spin up servers in the zones you're interested in (e.g. Japan). If you locate a cloud provider in that region, you can spin up as many load generation instances as needed. You may, however, need quite a few instances depending on the network connectivity offered to individual instances. The nice thing about jMeter is that it can coordinate many load generation instances.

You can use blazemeter as SaaS solution. It's 100% Jmeter compatible. There is Japan(Tokyo) load origin location which you need.

Related

Data transfer between local server and AWS server

We have an ASP.NET MVC 5 web application that reads data locally from within the same server. This server is in Europe. However when trying to read the same data from an AWS server based in Sidney the lag is many times greater. A ping from our local server to the AWS server in Australia takes 5 seconds. The data needs to be located in Australia because of data protection laws issued by the Australian Government. The database is MySQL. We have created a VPN between both servers and made no difference.
What are our options in order to improve the speed between these two servers?
If it is a web application serving content to users over internet you can use CloudFront distribution to reduce your latency issues.
https://aws.amazon.com/cloudfront/
If you are trying to connect your servers from your data center to AWS
Use AWS Direct Connect, this will provide a dedicated link between your on-premise datacenter and to the AWS Servers; Decreasing your latency by a lot.
https://aws.amazon.com/directconnect/
AWS runs your application regardless of which platform(ASP.NET, JAVA, C...) it is, AWS only provisions infrastructure. You don't need to worry about the platform on which your application is running and what database it connects to. You just need to ensure that all the network connections are properly open so that your servers can communicate with AWS servers.

aws - needed requirements - windows - sqlserver - webapp

I'm a totally newbie with cloud computing, but I want to try it.
I want to relocate a public app developed by myself and currently served by a traditional hosting.
The requirements are just the basics.
- Windows Server 2008 R2 or later
minimal ram
.net 2.0 and .net 4.0 support
IIS
SQL Server 2008
Some 20gb, for the app, db, and files
Some 10gb more (DB backup)
With win2 + sql server I can't use the free testing. I know that and I'm ready to pay the platform needed. In the current hosting I'm paying for that too.
I just want see if someone can validate my configuration and say if I'm forgetting something:
1 instance EC2 Windows and Std. SQL Server m4.large
1 Amazon EBS Volumes 100gb, 3IOPS, 30% snapshots
1 Elastic IP for the unique instance
5GB Transfer in, out, interdata
http://calculator.s3.amazonaws.com/index.html#r=IAD&s=EC2&key=calc-D29B73A3-A7C5-4CEE-9C4F-3ED5A74D2420
Basically, the server will run the web app who has connection with the database installed in the same machine (maybe in the future with more visits the db will be moved to another vm). The web app will be public access.
Someone see something missing?
Should I buy amazon s3 storage for the backups?
Where will be saved the ebs volume backups? or with my configuration I can't have backups?
At the moment I'll keep using my own dns server and I hope be capable to configure for redirect to aws without need of amazon route 53.
at are you seeing, I need some orientation, because I'm a little lost in this new world. What services I need? what about the backups?
I'm not thinking at this moment in optimization (cache, load balance, ...). If the things go in the right way, in the future. Right now I just want the most simply installation. My problem, windows and sql, those can't be used in free version and I can't change the app.
I feel shame for be so annoying, but I still have those doubts.
There are a lot of pieces to this. I will try and break some of it down.
Should I buy amazon s3 storage for the backups? Where will be saved
the ebs volume backups? or with my configuration I can't have backups?
You can script a backup of the drive to S3, or back up the volume entirely as a 'volume'. Totally up to you. Backing up contents of the drive to S3 with a proper script (like this) would be a great idea.
At the moment I'll keep using my own dns server and I hope be capable to configure for redirect to aws without need of amazon route 53.
Route 53 is amazing and an extremely small charge for the flexibility it gives. I would recommend using it because it ties in neatly to other AWS services you might need (like load balancers) in the future.
I just want see if someone can validate my configuration and say if I'm forgetting something:
1 instance EC2 Windows and Std. SQL Server m4.large
1 Amazon EBS Volumes 100gb, 3IOPS, 30% snapshots
1 Elastic IP for the unique instance
5GB Transfer in, out, interdata
With an m4.large, you get 2 VCPUs and 8gb of memory, and 450mbps throughput. Do you need all that horsepower? You can try a t2.medium if that sounds like a lot for some burstable cpu performance. It all depends on what utilization you're at now, and if you're CPU/memory/bandwidth bound.
Feel free to ask more questions.
For backups of EBS volumes to S3, the best way to backup would be us EBS snapshots.

Manual deployment vs. Amazon Elastic Beanstalk

What are the advantages we get by using Elastic Beanstalk over maually creating EC2 instance and setting up tomcat server and deploy etc for a typical java web applicaion. Are load balancing, Monitoring and autoscaling the only advantages?
Suppose for my web application which uses database I installed the database in the EC2 instance itself. When Autoscalling takes place will the database gets created in the newly created instance or it will be accessing the database I created in the master instance... If it creates just a replica when autoscaling happens how will be data sync happens between the instances?
All the things you mentioned like load balancing, monitoring and auto-scaling are definitely advantages.
However, you have to kind of think about it this way: In a true Platform as a Service (PAAS), the goal is to separate the application from the platform. As a developer, you only worry about your application. The platform is "rented" to you. The platform "instances" are automatically updated, administered, scaled, balanced, etc. for you. You just upload your WAR file and it just works (at least theoretically).
EC2 by itself is not PAAS. It is more like IAAS (Infrastructure as a Service). You still have to take care of the server instances, install software on them, keep them updated, etc.
Elastic Beanstalk is a PAAS system. So are App Engine and Azure among many others.
In a true PAAS system, the DBMS is a separate component from the web application server(s). The reason is obvious: The DBMS cannot be possibly installed on the instances that are being used for the application server because, as instances are created and destroyed based on your traffic, the DBMS would be lost! Having the DBMS and application server on the same machine/instance is not generally a good idea anyway.
In a PAAS system, the DBMS is a separate service. For Amazon, it would be Amazon RDS. Just like with Elastic Beanstalk, where you don't have to worry about the application server and you just upload your WAR file, with RDS, you don't have to worry about the DBMS and you just deploy your database(s).
Elastic Beanstalk and RDS work very well together, especially when deployed in the same availability zone, where the latency would be very low.
Finally, using Elastic Beanstalk doesn't cost anything more than the deployed resources (EC2 instances and the load balancer). However, RDS is not cheap and would definitely be more expensive than using a single EC2 instance for both the application server and the DBMS.
Elastic Beanstalk does more than just load balancing, monitoring, and autoscaling.
1) Manages application versions by storing and managing different versions of your application, allowing you to easily switch back and forth between different versions of your applications.
2) Has the concept of "environments" for each application, allowing you to deploy different versions of your application in each environment. This is handy for example if you want to set up separate QA and DEV environments, and you want to easily deploy a build first in DEV then deploy the same version of the application in QA when your QA team is ready for the next build.
3) Externalizes the important container configuration properties (Tomcat memory settings, for example) to the Elastic Beanstalk console and API. Because of this you can easily save the settings and copy them between environments.
4) View application log files through the console and automatically roll and archive log files to S3. (Admittedly this feature is currently a little weak.)
I had an app deployed both in EC2 dedicated(Nginx & Gunicorn) and Beanstalk Environment(CentOS & Apache2).
My observations:
BeanStalk is Paas. Manually creating an EC2 instance(IAAS), is like doing everything from scratch, but you have solid control.
BeanStalk comes with by default CentOS and Apache(Httpd). You could choose OS in dedicated instance.
These things that mattered to me,
There were lots of 504 errors showing up in Beanstalk environment.
It was difficult to debug when BeanStalk server crashed, as logs would also not show up and could not ssh into machine. This is very important.
Installing/configuring tools like Celery, Redis (need to run another port) etc.,. in dedicated instance is lot more easier.
In my case, I had to scale up (Beanstalk)server in order to run installation of some packages(like pandoc). These things are more simpler in Ubuntu.
Scaling is a lot more easier in BeanStalk. Cloning servers is straightforward in BeanStalk.
I had taken micro in both the cases (dedicated & Beanstalk). I felt dedicated micro instance was better.
Automated deployment in Beanstalk. I had to write scripts to automate the same, which is fine, since it is only once.

Java EE application deployment on Amazon EC2

We have a Java EE application (EAR file deployed on JBoss, MySQL, MongoDB) which we would like to deploy on an Amazon EC2 instance. I have several questions regarding deployment best practices.
What is the most commonly used Linux AMI which we can rely on for a robust deployment (There are so many Linux variants, and I am not sure which AMI is commonly used, is it Fedora, CentOS, Red Hat, SUSE ...)
How do we handle production upgrades (EAR file modifications or schema upgrades). Are there any tools which are available to handle this installation or rollback of these changes.
What kind of data backup capability is available for the database?
Should I rely on Amazon RDS for MySQL support?
How should I handle support for MongoDB?
This is the first time, I am hosting an web-app and would appreciate some inputs on how to manage the production instance.
I agree with Mark Robinson's answer: Use whichever Unix variant you're most comfortable with. It may pay to pick one with decent cloud support. For my site I use Ubuntu.
I have a common image which is the base of every version deploy I do. I have www.mysite.com pointing to an Elastic IP so I can decide which instance it goes to. The common image has all the software I need installed (Postgres/Postgis/Tomcat/etc) but the database and web server data folders and symlinked to Elastic Block Store (EBS) instances.
When it comes time to do a deploy I start a new instance up, freeze and snapshot the EBS volumes on production and make new volumes. I point my new instance at the new volumes and then install whatever I need to onto that. Once I've smoke tested everything successfully I can switch the Elastic IP to point to the new instance and everything keeps on going.
I'll note that I currently have the advantage where only I can modify the database; no users can. This will become a problem shortly.
If you use the XFS filesystem on top of the EBS volume then you can tell XFS to freeze the file system (so no updates happen) then call the EC2 api to snapshot the volume then unfreeze the file system. The result is that the snapshot is taken quickly and sent to S3. I have a nightly script which does this.
If RDS looks like it will suit your needs then use it. Amazon is building lots of solid tools quickly and this will ease your scalability issues if you have any.
I'm sorry, I have no idea.
Good question!
1) I would recommend going with whatever Linux variant you are most comfortable with. If you have someone who is really keen on CentOS, go with that. Once you have selected your AMI, take it and customize it by configuring how you want it. Then save that AMI as you base-layout. It will make rolling out new machines much easier and save your bacon if EC2 goes down.
2) Upgrades with EC2 can be tres cool. Instead of upgrading a live system, take your pre-configured AMI, update that and save that AMI as myAMI-1.1 (or whatever). That way, you can flip over to the new system almost instantly AND roll back to a previous version in case something breaks. You can also back-up DB instances to S3. It's cheap at about $0.10/GB/Month.
3) It depends where you are storing your DB. If you are storing it on your EC2 instance you are in trouble. The EC2 instances have no persistence storage. So if your machine crashes, you lose everything. I'm not familiar with Amazon DB system but you should also look into Elastic Block Store. It's basically an actual hard-drive you can write to. When you want to upgrade your schema, do a full DB dump to S3 and then do an upgrade of your actual schema. If something goes wrong, you can pull the previous version out of S3.
4) & 5) I have never used those so I can't help you.
What is the most commonly used Linux AMI which we can rely on for a robust deployment (There are so many Linux variants, and I am not sure which AMI is commonly used, is it Fedora, CentOS, Red Hat, SUSE ...)
How do we handle production upgrades (EAR file modifications or schema upgrades). Are there any tools which are available to handle this installation or rollback of these changes.
What kind of data backup capability is available for the database?
Should I rely on Amazon RDS for MySQL support?
How should I handle support for MongoDB?
Any Linux AMI will do the job, what you need is a JRE only. (assuming development work not required). If you need to monitor the JVM behavior then get JConsole installed.
Easiest and painless way is to SSH into the local home directory, transfer the updated class file/EAR file (depends the number of changes applied) and copy and replace into the Tomcat deployment directory, restart apache. (make sure you tested locally before upload to production).
Depends on which database you are using, if you are using MySQL then just do scheduled backup that writes to your home directory so that from time to time you could SSH in and download a copy for backup purpose.
I would not consider reply on Amazon RDS for MySQL support due to 2 reasons: MySQL is small enough and manageable, and also I would want to have total complete control of the database and why pay for more when you can do it yourself FOC?
The usage of MongoDB should be align with the purpose of your application and benefits you gain from that. I would recommend you use MongoDB for static data retrieval like state, country, area etc... where MySQL to be use for transaction data only.
If you can live with deploying your Java EE application on TomEE instead of JBoss, Boxfuse does what you want.
For you Java EE application you literally only have to execute (TomEE uses war files instead of ear files):
boxfuse run my-tomee-app-1.0.war -env=prod
This will
Create AMI containing TomEE and your application ready to boot
Create an Elastic IP or ELB
Create a security group with the correct ports defined
Create an auto-scaling group
Launch your instance(s)
Any subsequent update will be done as a zero downtime blue/green deployment.
More info: https://boxfuse.com/blog/javaee-aws

Development Environment for Testing MySQL Replication

Is there an easy way to setup an environment on one machine (or a VM) with MySQL replication? I would like to put together a proof of concept of MySQL replication with one Master write instance and two slave instances for reads.
I can see doing it across 2 or 3 VMs running on my computer, but that would really bog down my system. I'd rather have everything running on the same VM. What's the best way to proof out scalability solutions like this in a local dev environment?
Thanks for your help,
Dave
I think to truly test MySQL Replication it is important to do so in realistic constraints.
If you put all the replicate nodes under one operating system then you no longer have the bandwidth constraint, the data transfer speed would be much higher that what you would get if those replicate DBs are on different sites.
Everything under one VM is a shortcut to configurations, for instance it does not make you go through the configuration of the networking.
I suggest you use multiple VMs, even if you have to put them under one physical machine, you can always configure the hypervisor to make the packets go through a router, in which case the I/O will be bound by whatever the network interface has as throughput.
I can see doing it across 2 or 3 VMs
running on my computer, but that would
really bog down my system.
You can try and make a few VMs with JeOS (Just Enough OS) versions of the operating system you want. I know Ubuntu has one and it can boot on 128 RAM, which makes it convenient to deploy lots of cloned VMs under one physical machine without monster RAM.
Next step would be doing the same thing on a cloud (Infrastructure as a Service, IaaS) provider, and try your setup on different geographical sites.
If what you're testing is machine-to-machine replication, then setting up multiple VMs on a virtual private network would be the correct environment to test it. If you use Ubuntu Server, you don't have to install more than you actually need -- just give the VMs enough space for a base install + MySQL + your data. Memory usage can be as little as 256MB per VM. All you have to do is suspend or shutdown the VMs when you're not running a full-up test.
I've had situations where I was running 4 or more VMs simultaneously on my workstation, either for development or testing purposes -- it's not that taxing unless you're trying to do video rendering in each VM.