I am very new to stress testing and am just trying to learn the ropes. So my questions are:
If I have a development server that is identical in terms of software but of much lower spec in terms of hardware than the production server, is it worth stress testing the development server to identify obvious software defects?
How is it best to stress test a live production server without potentially jeopardising the experience of your users? Or should stress testing a live production server be avoided altogether?
Here are various tips/suggestions:
If your application is new, so you don't know if it can handle the load it will have in production, then you need to do "capacity" testing. You should do your capacity testing on your production hardware, which, since it hasn't gone "live" yet, won't affect users.
If your application is an existing application that is already deployed in production then what you should be doing is "performance regression" testing.
A performance regression test consists of doing a stress test of all the individual "features" (whatever that means for your application) on your development server to measure its performance. You keep a record of the results as your "baseline".
As you make changes to your application, re-run your performance regression tests to see if any results have changed significantly from the baseline (and record the new numbers as your new baseline).
If the performance regression results on your development server didn't change much from the baseline, then you should be safe to deploy to production without your server utilization changing (i.e. getting overloaded).
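To make the idea concrete, here is a minimal sketch of what recording and checking such a baseline could look like; the feature name, baseline file, and 20% tolerance are hypothetical choices, not from the answer above:

```java
import java.util.Properties;
import java.io.*;

// Minimal performance-regression check: time a "feature", compare against
// a stored baseline, and flag significant slowdowns. The feature key,
// baseline file, and 20% tolerance are illustrative choices.
public class PerfRegression {
    static final double TOLERANCE = 1.20; // fail if >20% slower than baseline

    public static void main(String[] args) throws IOException {
        long elapsedMs = timeFeature();

        Properties baseline = new Properties();
        File file = new File("baseline.properties");
        if (file.exists()) {
            try (Reader r = new FileReader(file)) { baseline.load(r); }
        }

        String key = "checkout.feature.ms";
        String recorded = baseline.getProperty(key);
        if (recorded != null && elapsedMs > Long.parseLong(recorded) * TOLERANCE) {
            System.err.printf("REGRESSION: %dms vs baseline %sms%n", elapsedMs, recorded);
        } else {
            baseline.setProperty(key, Long.toString(elapsedMs)); // record new baseline
            try (Writer w = new FileWriter(file)) { baseline.store(w, "perf baselines"); }
        }
    }

    // Stand-in for exercising one feature of the application under load.
    static long timeFeature() {
        long start = System.nanoTime();
        // ... drive the feature here (e.g. N concurrent requests) ...
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```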
I think you should avoid any work, including stress testing, on production machines unless you know you have a problem that you can't reproduce in your test environment. That said, maybe you know your users don't use the system during the night? If the tests are non-intrusive/read-only, then I'd say it's an additional option.
As to analyzing performance on a weaker machine, it's not so bad: most bottlenecks are caused by bad architecture of your system and should be visible on different hardware configurations, just under different load scenarios. It may even be easier to notice the problems on a weaker machine, so I'd say stress test and optimize on your development system, and you'll know that, at least theoretically, your production system should do even better.
In our team, the question was recently raised whether using H2 for integration tests is a bad practice that should be avoided if the production environment relies on a different database engine, in our case MySQL 8.
I'm not sure if I agree with that, considering we are using Spring Boot/Hibernate for our backends.
I did some reading and came across this article https://phauer.com/2017/dont-use-in-memory-databases-tests-h2/ stating basically the following (and more):
TL;DR
Using in-memory databases for tests reduce the reliability and scope of your tests. Your application’s SQL may fail in production against the real database, although the h2-based tests are green. They provide not the same features as the real database. Possible consequences are:
You change the application’s SQL code just to make it run in both the real and the in-memory database. This may result in less effective, elegant, accurate or maintainable implementations. Or you can’t do certain things at all.
You skip the tests for some features completely.
As far as I can tell, for a simple CRUD application with some business logic, none of those points concern me (there are some more in the article), because Hibernate abstracts away all the SQL and there is no native SQL in the code.
Are there any points that I am overlooking or have not considered that speak against an H2 DB? Is there a "best practice" regarding the usage of an in-memory DB for integration tests with Spring Boot/Hibernate?
I'd avoid using H2 if possible. H2 is good when you can't run your own instance, for example if your company uses stuff like Oracle and won't let you run your own DB wherever you want (local machine, own dev server...).
The problems with H2 are the following:
Migration scripts may differ between H2 and your real DB. You'll probably have to maintain separate tweaks for the H2 scripts and the MySQL scripts.
H2 usually doesn't provide the same features as a real RDBMS; you degrade the DB to plain SQL, and you won't be able to test stored procedures, triggers, and all the fancy stuff that may come in handy.
H2 and other RDBMSs are different. Tests won't be testing the same thing, and you may get errors in production that never appear in your tests.
Speaking of your simple CRUD application - it may not stay like that forever.
But go ahead with any approach you like; it is best to gain the experience yourself. I got burned by H2 too often to like it.
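If you do stay on H2 despite these caveats, it at least has a MySQL compatibility mode that narrows (but does not close) the gap. A minimal sketch with a plain JDBC connection; the schema here is purely illustrative:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

// H2 in MySQL compatibility mode: MODE=MySQL makes H2 accept a subset of
// MySQL syntax. It does NOT emulate MySQL fully (no stored procedures,
// different locking behaviour, etc.), so production can still diverge.
public class H2MySqlModeDemo {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:h2:mem:testdb;MODE=MySQL;DATABASE_TO_LOWER=TRUE";
        try (Connection conn = DriverManager.getConnection(url, "sa", "");
             Statement st = conn.createStatement()) {
            // AUTO_INCREMENT is MySQL syntax that some plain H2 setups
            // would reject; MODE=MySQL accepts it.
            st.execute("CREATE TABLE users (" +
                       "id INT AUTO_INCREMENT PRIMARY KEY, " +
                       "name VARCHAR(100) NOT NULL)");
            st.execute("INSERT INTO users (name) VALUES ('alice')");
        }
    }
}
```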
I would say it depends on the scope of your tests and what you can afford for your integration tests. I would prefer testing against an environment as close as possible to my production environment. But that's the ideal case; in reality that might not be possible, for various reasons. Also, expecting Hibernate to abstract away low-level details perfectly is an ideal case too; in reality the abstraction may be giving you a false sense of security.
If the scope of your tests is just to test CRUD operations, an in-memory test should be fine. It will perform quite adequately in that scope. It might even be beneficial, reducing the runtime of your tests as well as some degree of complexity. It won't detect any platform/version/vendor-specific issues, but that wasn't the scope of the test anyway. You can test those things in a staging environment before going to production.
In my opinion, it's now easier than ever to create a test environment as close as possible to your production environment using things like Docker; CI/CD platforms also support spinning up services for that purpose. If this isn't available or is too complicated for your use case, then the fallback is acceptable.
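For instance, with Spring Boot a common way to get a real MySQL 8 in tests is Testcontainers. A minimal sketch, assuming the Testcontainers MySQL and JUnit 5 test dependencies are on the classpath; the entity and repository names are made up:

```java
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.containers.MySQLContainer;
import org.testcontainers.junit.jupiter.Container;
import org.testcontainers.junit.jupiter.Testcontainers;

// Runs the integration test against a throwaway MySQL 8 in Docker, so the
// SQL that Hibernate generates is exercised against the real engine.
@SpringBootTest
@Testcontainers
class UserRepositoryIT {

    @Container
    static MySQLContainer<?> mysql = new MySQLContainer<>("mysql:8.0");

    // Point Spring's DataSource at the container instead of H2.
    @DynamicPropertySource
    static void datasourceProps(DynamicPropertyRegistry registry) {
        registry.add("spring.datasource.url", mysql::getJdbcUrl);
        registry.add("spring.datasource.username", mysql::getUsername);
        registry.add("spring.datasource.password", mysql::getPassword);
    }

    @Test
    void savesAndLoadsUser() {
        // ... exercise the hypothetical UserRepository here ...
    }
}
```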
From experience, I have faced failures related to platform/version/vendor-specific issues when deploying to production even though all my tests against the in-memory database were green. It's always better to detect these issues early; it saves a lot of recurring development time and, most importantly, your good night's sleep.
Our main website remotely accesses the database of our other website, which is on different domain hosting. My problem is that our main website is very slow to load a page, while the second website (where the database is hosted) does not experience this problem.
Why are we experiencing this problem on our main website?
What would be the possible reasons?
What would be the possible solutions for this?
Edit:
We just transferred the other domain to the same hosting as our main website.
Maybe the problem is the database authentication process between the two hosting providers.
This is a very, very wide question - I can only give general advice.
I'd start by making sure the slow website is properly written. Run the website on a controlled development environment, with a copy of your production database, and use a tool like Apache JMeter to subject it to load; make sure it is "fast" in that environment. "Fast" is a movable concept, but I'd be expecting to see sub-second response times up to hundreds of concurrent users.
If the site is slow in this context, it will be slow on production; find out where the bottleneck is, tune, optimize etc.
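If you want a quick sanity check before building out a full JMeter plan, even a crude harness gives you a feel for response times under concurrency. A rough sketch; the URL, thread count, and request count are placeholders:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Crude load harness: N concurrent workers hammer one URL and we report
// the median and worst response times. No ramp-up, no think time - a
// smoke test only, not a substitute for a proper JMeter plan.
public class MiniLoadTest {
    public static void main(String[] args) throws Exception {
        String url = "http://dev.example.com/slow-page"; // placeholder
        int workers = 50, requestsPerWorker = 20;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).build();
        List<Long> timings = Collections.synchronizedList(new ArrayList<>());

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int w = 0; w < workers; w++) {
            pool.submit(() -> {
                for (int i = 0; i < requestsPerWorker; i++) {
                    long start = System.nanoTime();
                    try {
                        client.send(request, HttpResponse.BodyHandlers.discarding());
                    } catch (Exception e) {
                        continue; // count only completed requests
                    }
                    timings.add((System.nanoTime() - start) / 1_000_000);
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.MINUTES);

        List<Long> sorted = new ArrayList<>(timings);
        if (sorted.isEmpty()) { System.out.println("no successful requests"); return; }
        Collections.sort(sorted);
        System.out.printf("requests=%d median=%dms max=%dms%n",
                sorted.size(), sorted.get(sorted.size() / 2),
                sorted.get(sorted.size() - 1));
    }
}
```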
If that isn't the problem, I'd replicate that setup with the other website connecting to the same database, and throw load at both sites simultaneously. You might just have reached the scalability limits of the system, and you may be seeing performance issues related to that - unlikely if the first website responds quickly and the second doesn't, but it's possible you're seeing deadlocks or other concurrency issues.
If the website behaves well on "perfect" infrastructure, but not in production, you need to work out what the issue is on production. The best way is to use a profiler on the production environment; this might mean creating a copy of the website which isn't publicly accessible, and installing the profiler there. XDebug works nicely for PHP.
The profiler will show you where your application slows down; it could be in the PHP code, it could be in the authentication section, it could be executing the SQL queries.
Once the profiler tells you where the problem is, you can work out how to fix it.
However, as a rule of thumb, running database queries outside a single network cage is a terrible idea; it's not secure, it exposes your database queries to arbitrary internet performance problems, and it eats into your bandwidth allocation. It's not really to do with the domain in the sense of "www.company.com" - one hosting environment can run multiple domains - but if you're routing your database traffic over the public internet, you give up any control over performance.
Apologies for the fairly generic nature of the question - I'm simply hoping someone can contribute some suggestions and/or ideas as I'm out of both!
The background:
We run a fairly large (35M hits/month, peak around 170 connections/sec) site which offers free software downloads (strictly legal) and which is written in ASP.NET 2 (VB.NET :( ). We have 2 web servers sat behind a dedicated hardware load balancer, and both servers are fairly chunky machines running Windows Server 2012 Pro 64-bit and IIS 8. We serve extensionless URLs by using a custom 404 page which parses out the requested URL and Server.Transfers appropriately. Because of this particular component, we have to run in classic pipeline mode.
DB-wise we use MySQL and have two replicated DBs; reads are mainly done from the slave. DB access is via a DevArt library and is extensively cached.
The Problem:
We recently (in the past few months) moved from older servers running Windows Server 2003 and IIS 6. In the process, we also upgraded the DevArt component and MySQL (5.1). Since then, we have suffered intermittent scalability issues, which have become significantly worse as we have added more content. We recently increased the number of programs from 2000 to 4000, and this caused response times to increase from <300ms to over 3000ms (measured with NewRelic). To my mind this points to either a bottleneck in the DB (relatively unlikely, given the extensive caching and the DB monitoring) or a badly written query or code problem.
We also regularly see spikes which seem to coincide with cache refreshes which could support the badly written query argument - unfortunately all caching is done for x minutes from retrieval so it can't always be pinpointed accurately.
All our caching uses locks (like this What is the best way to lock cache in asp.net?), so it could be that one specific operation is taking a while and backing up requests behind it.
The problem is... I can't find it!! Can anyone suggest from experience some tools or methods? I've tried to load test, I've profiled the code, I've read through it line by line... NewRelic Pro was doing a good job for us, but the trial expired and for political reasons we haven't purchased a full licence yet. Maybe WinDbg is the way forward?
Looking forward to any insight anyone can add :)
It is not a good idea to guess on a solution. Things could get painful or expensive quickly. You really should start with some standard/common triage techniques and make an educated decision.
The standard process for troubleshooting performance problems on a data-driven app goes like this:
Review DB indexes (unlikely) and tune as needed.
Check resource utilization: CPU, RAM. If your CPU is maxed-out, then consider adding/upgrading CPU or optimize code or split your tiers. If your RAM is maxed-out, then consider adding RAM or split your tiers. I realize that you just bought new hardware, but you also changed OS and IIS. So, all bets are off. Take the 10 minutes to confirm that you have enough CPU and RAM, so you can confidently eliminate those from the list.
Check HDD usage: if your queue length goes above 1 very often (more than once per 10 seconds), upgrade disk bandwidth or scale-out your disk (RAID, multiple MDF/LDFs, DB partitioning). Check this on each MySql box.
Check network bandwidth (very unlikely, but check it anyway)
Code: a) Consider upgrading to .NET 3.5 (or above). It was designed for better scalability and has much better options for caching. b) Use newer/improved caching. c) Pick through the code for harsh queries and DB usage. I have had really good experiences with RedGate ANTS, but equivalent products work well too.
And then things get more specific to your architecture, code and platform.
There are also some locking mechanisms for the Application variable, but they are rarely the cause of lockups.
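On the cache-locking point from the question: one thing worth ruling out is a single shared lock serializing every cache refresh, so that one slow query backs up all requests behind it. A per-key guard avoids that. The original code is ASP.NET, so this Java sketch only illustrates the pattern, with a placeholder loader:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Per-key cache: computeIfAbsent locks only the entry being loaded, so a
// slow refresh of one key does not block readers and writers of unrelated
// keys the way a single global cache lock would.
public class PerKeyCache<K, V> {
    private final Map<K, V> cache = new ConcurrentHashMap<>();

    public V get(K key, Supplier<V> loader) {
        // Only threads asking for THIS key wait on its computation.
        return cache.computeIfAbsent(key, k -> loader.get());
    }

    public void invalidate(K key) {
        cache.remove(key); // next get() re-runs the loader
    }
}
```

If everything instead funnels through one global lock, a single slow query during a cache refresh will show up exactly as the kind of request spikes described in the question.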
You might want to keep an eye on your pool recycle statistics. If you have a memory leak (or connection leak, etc.), IIS might seem to freeze when the pool tops out and restarts.
I was originally planning on using a local machine on our network as the development server.
Then I had the idea of using a subdomain.
So if the site was at www.example.com then the development could be done at dev.example.com.
If I did this, I would know that the entire software stack was configured exactly the same for development and production. Also, development could use the same database as production, removing the hassle of syncing the data. I could even use the same media (images, videos, etc.)
I have never heard of anyone else doing this, and with all these pros I am wondering why not?
What are the cons to this approach?
Update
OK, so it seems the major no-no of this approach is using the same DB for dev and production. If you take that out of the equation, is it still a terrible idea?
The obvious pro is what you mentioned: no need to duplicate files, databases, or even software stacks. The obvious con is slightly bigger: you're using the exact same files, databases, and software stacks. Needless to say, if your development isn't working correctly (infinite loops, and whatnot), production will be pulled down right alongside it. Obviously, there are possibilities to jail both environments within the OS, but in that case you're back to square one.
My suggestion: use a dedicated development machine, not the production server, for development. You want to split it for stability.
PS: Obviously, if the development environment missed a "WHERE id = ?", all information in the production database is removed. That sounds like a huge problem, doesn't it? :)
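That missing-WHERE-clause scenario is also a good argument for guarding destructive statements in code: run them in a transaction, check the affected row count, and roll back on surprises. A hedged JDBC sketch; the table name and expected count are invented for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Safety net for destructive SQL: verify the affected row count inside a
// transaction and roll back on surprises, so a missing WHERE clause can't
// silently wipe a table. Table name and expected count are illustrative.
public class GuardedDelete {
    public static void deleteUser(String jdbcUrl, long id) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps =
                     conn.prepareStatement("DELETE FROM users WHERE id = ?")) {
                ps.setLong(1, id);
                int affected = ps.executeUpdate();
                if (affected != 1) {
                    throw new SQLException("expected 1 row, got " + affected);
                }
                conn.commit();
            } catch (SQLException e) {
                conn.rollback(); // undo the partial work
                throw e;
            }
        }
    }
}
```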
People do do this.
However, it is a bad idea to run development against a production database.
What happens if your dev code accidentally overwrites a field?
We use subdomains of the production domain for development as you suggest, but the thought of the dev code touching the prod database is a bit hair-raising.
In my experience, using the same database for production and development is nonsense. How would you change your data model without changing your code?
And also 2 more things:
It's wise to prepare all changes as an SQL script that is run after testing from a separate environment, not from your console. Some accidental updates to a live system gave me headaches for weeks.
It once happened to me that a restored backup didn't reproduce a live system problem, because of an unordered query result. This strange behaviour of the backup later helped us find the real problem more easily than retrying on the live system would have.
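One way to make that scripted-changes discipline systematic is a migration tool. A sketch using Flyway, purely as an example; the JDBC URL, credentials, and file name are invented:

```java
import org.flywaydb.core.Flyway;

// Versioned migrations: every schema change lives in a numbered SQL file
// (e.g. src/main/resources/db/migration/V2__add_email_column.sql) that is
// tested on dev/staging first, then applied to production by the same
// tool - never typed by hand into a console.
public class Migrate {
    public static void main(String[] args) {
        Flyway flyway = Flyway.configure()
                .dataSource("jdbc:mysql://localhost:3306/app", "app", "secret")
                .load();
        flyway.migrate(); // applies only migrations not yet recorded
    }
}
```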
Using the production machine for development takes away your capacity to experiment. Trying out new modules/configurations can be very risky in a live environment. If I mess up our dev machine with an error in the Apache conf, I will just slightly inconvenience my fellow devs. You will be shutting down the live server while people are trying to give you their money.
Not only that, but you will be sharing resources with the live environment. You can forget about stress testing when the dev server also has to deal with actual customers. Any mistakes that can cause problems on the development server (an infinite loop taking up the entire CPU, running out of HDD space, etc.) suddenly become a real issue.
I have a client asking for this as a requirement, and I haven't done this before. What does he mean by web service infrastructure?
That phrase encompasses a wide variety of technical aspects. Your infrastructure is all of the components that make up the systems that run a web business or application, including hardware. So it refers to your server and network setup, your bandwidth and connections in and out, your database setup, backup solutions, web server software, code deployment methods, and anything else used to successfully run a web business with high reliability and uptime and low error and bug incidents.
In order to make such a thing scalable, you have to architect all these components together into something that will work smoothly with growth over time. A scalable architecture should be flexible enough to handle sudden traffic spikes.
Methods used to facilitate scalability include replicated databases, clustered web servers, load balancers, RAID disk striping, and network switching. Your code has to take much of this into account.
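As one small example of code taking this into account: with replicated databases, the application typically has to route reads to a replica and writes to the primary. A hypothetical sketch; the two JDBC URLs and credentials are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Read/write splitting for a replicated database: writes go to the
// primary, reads can be served by a replica. Real setups also handle
// replication lag, pooling, and failover - omitted here for brevity.
public class RoutingDataSource {
    private final String primaryUrl = "jdbc:mysql://primary:3306/app"; // placeholder
    private final String replicaUrl = "jdbc:mysql://replica:3306/app"; // placeholder

    public Connection forWrite() throws SQLException {
        return DriverManager.getConnection(primaryUrl, "app", "secret");
    }

    public Connection forRead() throws SQLException {
        // Note: a read issued right after a write may not see it yet
        // (replication lag); lag-sensitive reads may need the primary.
        return DriverManager.getConnection(replicaUrl, "app", "secret");
    }
}
```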
It's a tough service to provide.
The first thing that comes to mind is an enterprise service bus.
He probably means some sort of "infrastructure" to run a lot of complex interacting web services.
Either an enterprise application that you call via a web service and that can run on many instances of a web application server, or a single instance that is very nicely multithreaded and scales to many CPUs, or loads of different web services that all talk to each other, often via message queues, until you have something that breaks all the time and requires a huge team of people to maintain. Might as well throw in a load of virtual machines to have a virtualised, scalable, re-deployable web service infrastructure (i.e., loads of Tomcats or JBosses in Linux VMs ready to deploy as needed, one app per VM).
Then there is physical scalability. Is there enough CPU power for your needs? Is there enough bandwidth between physical nodes to send all these messages and SOAP transactions between machines? Is there enough storage? Is the storage available on a fast, low-latency interconnect? Is the database nicely fed with CPU power, bandwidth, and a disc system that doesn't lag? Is there a database backup? How about when a single machine can't handle the load of a particular function - then you need load balancers, although these are good for redundancy and software updates whilst remaining live as well.
Is there a site backup? Or are you scaling globally - will there be multiple data centres around the globe? Do you have redundant links to the internet from each data centre? What happens when a site goes down? How is data replicated between sites, to reduce inter-site communications, and how do these data caches and pushes work?
And so on and so forth. But your client probably just wants a web service that can be load balanced without thrashing (i.e., two or more instances can share data/sessions/etc, depends on the application really), with easy database configuration and backup. Ease of deployment is desirable, so make the install simple. Or even provide a Linux VM for them to add to their VM infrastructure. Talk to their sysadmin to see what they currently do.
This phrase is often used as a marketing term from companies who sell some part of what they'll call a "scalable web services infrastructure".
Try to find out from the client exactly what they need. Do they have existing web services? Do they have existing business logic they've decided to expose as web services? Do they have customers who are asking to be able to access your client's systems through web services?
Does your client even know what a web service is?