I have an application hosted on OpenShift. Now I want to figure out how many requests it can handle, in order to check its speed and availability.
So my first attempt will be to generate multiple HTTP GET requests to my REST service (written in Python and hosted on OpenShift).
My fear is that my workplace IP could get banned, since this looks like an attack.
On the other hand, I see there are tools like New Relic or DataDog to check metrics, but I don't know if I can simulate HTTP requests with them and then check the response times.
OpenShift Response
I finally wrote to OpenShift support, and they told me I can simulate HTTP requests without worry.
I recall the default behavior being that each gear can handle 16 concurrent connections, then auto-scaling would kick in and you would get a new gear. Therefore I would think it makes sense to start by testing that a gear works well with 16 users at once. If not, then you can change the scaling policy to what works best for your application.
BlazeMeter is a tool that could probably help with creating the connections. They mention 100,000 concurrent users on that main page so I don't think you have to worry about getting banned for this sort of test.
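If you would rather start with a small script before reaching for a hosted tool, here is a minimal sketch of a concurrent GET load generator using only the Python standard library. The default of 16 concurrent workers just mirrors the per-gear figure recalled above, and the URL in the usage note is a placeholder, not a real endpoint. Treat it as a rough first probe under those assumptions, not a replacement for a proper load-testing tool.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_get(url, timeout=10.0):
    """Issue one GET request and return the HTTP status code.

    urlopen raises for 4xx/5xx responses, so those count as failures below.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        return response.status


def run_load(fetch, total_requests, concurrency=16):
    """Call fetch() total_requests times with at most `concurrency` in flight.

    Returns a list of (ok, seconds) tuples, one per request.
    """
    def timed_call(_):
        start = time.perf_counter()
        try:
            fetch()
            ok = True
        except Exception:
            ok = False
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(total_requests)))


def summarize(results):
    """Reduce raw results to a few numbers worth watching."""
    if not results:
        raise ValueError("no results to summarize")
    latencies = sorted(seconds for _, seconds in results)
    failures = sum(1 for ok, _ in results if not ok)
    return {
        "requests": len(results),
        "failures": failures,
        "median_s": latencies[len(latencies) // 2],
        "max_s": latencies[-1],
    }
```

Usage would look something like `summarize(run_load(lambda: http_get("https://myapp.example.com/health"), 200, concurrency=16))`, with the URL swapped for your own service. Raising `concurrency` step by step and watching the median and failure count is one simple way to see where the response times start to degrade.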
Related
I am trying to load data via Ajax from a MySQL server on my LAN, using a Chrome App.
I am proposing Ajax because I need chrome app to load up any update in the SQL instantaneously.
Since this app is only used in LAN network, I presume there is no need to maintain a web server (aka running Apache). Can anyone provide some hints? The answer I found on the forum does not help me (an absolute newbie) much.
https://developer.chrome.com/extensions/xhr
Thank you.
YY
Since this app is only used in LAN network, I presume there is no need to maintain a web server (aka running Apache).
AJAX refers to making an HTTP request to... something.
Something that can answer HTTP requests is called a web server.
So, you do need some sort of web server. It may be a component of MySQL server, but it's still a web server.
That said, it doesn't look like MySQL has a supported HTTP interface. There is an HTTP Plugin that provides a REST API, but it's experimental. Therefore, you would need a separate server application that does what you need.
That said,
I am proposing Ajax because I need chrome app to load up any update in the SQL instantaneously.
AJAX is not a magic bullet. It works well for requesting data, but it is not adapted to receiving updates initiated by the server you're talking to. It's a request-response cycle, and while there are some techniques to use it to push data they are hacks.
WebSockets evolved to cover the bidirectional, persistent communication needs. However, this again would require a web server to sit as a proxy between your DB and your app - this time, WebSockets-capable.
That said, building a Chrome App allows you to connect to a database directly, since Chrome Apps are capable of using the chrome.sockets API. You would need a JavaScript library specifically adapted to the task, but those probably exist.
That said, and noting that I'm not an expert on databases, but..
Databases are not designed to notify you about updates. You need to poll them to see if the data has changed. You will not get it instantaneously no matter what interface you use. You'll need to periodically monitor it for changes.
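To make the "periodically monitor it for changes" idea concrete, here is a minimal polling sketch in Python. The `read_version` callable is hypothetical: it stands for whatever cheap query tells you the data changed (for example a maximum `updated_at` timestamp or a row count). Nothing here is specific to MySQL or Chrome Apps.

```python
import time


def poll_for_changes(read_version, on_change, interval_s=2.0,
                     max_polls=None, sleep=time.sleep):
    """Call read_version() every interval_s seconds and report changes.

    on_change(old, new) fires whenever the value differs from the value
    seen on the previous poll. max_polls=None means poll forever.
    """
    last_seen = read_version()
    polls = 0
    while max_polls is None or polls < max_polls:
        sleep(interval_s)
        current = read_version()
        if current != last_seen:
            on_change(last_seen, current)
            last_seen = current
        polls += 1
```

The update is only ever noticed up to `interval_s` late, which is the trade-off described above: a shorter interval means fresher data but more queries against the database.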
Considering this, depending on what you're trying to ultimately do you may be choosing a wrong instrument.
There's a lot of "buts" here, and it seems like a complex task. You should re-evaluate your readiness as an "absolute newbie" to undertake it.
I'm developing an HTML5 WebSocket-based application which should notify users in real time about different events. The client connects to the server and sends a handshake with a security token; the server checks whether the security token is valid and adds the client to the list of active clients. From then on the client gets notifications on special events.
Because there are different notifications from multiple applications, there is a notification core which handles the basics of the connection and also the authentication, because this is always the same. Applications can access the core and use it to communicate with the server.
Does it make sense, or is it necessary, to build some limitations into the core? For example, tracking the user's IP and refusing the connection if the user has made more than, let's say, 3 connections to the server in the last 10 seconds, to prevent flood attacks.
In my opinion it can reduce server load if someone tries to crash my service by holding the F5 key or using a botnet, as long as he isn't sending so much traffic to my server that my connection can't handle it.
I'm using socket.io if this is important.
If you're trying to protect your application from malicious attacks, there are many, many things you would need to consider, and it is important to prioritize those things and spend your development time on the ones that could most impact your service. I would think that creating multiple webSocket connections would be very low on the priority list, way behind operations in your service that actually change state, such as those that cause writes to a database. Modern servers can easily hold tens of thousands of sockets, and it costs little server load to just send the same notification to lots of sockets.
In addition, using the IP address as something to limit by can cause problems because larger organizations may use NAT to share a single IP address among many users for outbound connections. If you are going to limit by user, it's much better to limit by a userID (something each user uniquely logs in with).
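If you do decide to add a limit, here is a minimal sketch of a sliding-window limiter keyed by user ID, using the "3 connections in 10 seconds" numbers from the question. The class and method names are made up for illustration and it is not tied to socket.io; you would call something like `allow(user_id)` from your own handshake code after the token has been validated.

```python
import time
from collections import defaultdict, deque


class ConnectionLimiter:
    """Allow at most max_connections per user within a sliding time window."""

    def __init__(self, max_connections=3, window_s=10.0, clock=time.monotonic):
        self.max_connections = max_connections
        self.window_s = window_s
        self.clock = clock
        # user_id -> timestamps of recently accepted connections
        self._attempts = defaultdict(deque)

    def allow(self, user_id):
        """Return True and record the attempt if the user is under the limit."""
        now = self.clock()
        attempts = self._attempts[user_id]
        # Forget accepted connections that have fallen out of the window.
        while attempts and now - attempts[0] >= self.window_s:
            attempts.popleft()
        if len(attempts) >= self.max_connections:
            return False
        attempts.append(now)
        return True
```

This keeps state in the memory of a single process and is not thread-safe, so it is only a starting point. Note that rejected attempts are not recorded, so a blocked user is allowed back in as soon as the oldest accepted connection leaves the window.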
How to implement a dynamically updating vote count similar to Quora: whenever a user upvotes an answer, it's reflected automatically for everyone who is viewing that page.
I am looking for an answer that addresses the following:
Do we have to keep polling for upvote counts for every answer? If yes, then how to manage the server load arising from so many users polling for upvotes?
Or, to use websockets/push notifications, how scalable are these?
How to store the upvote/downvote count in a database or in memory to support this? How do they control the number of reads/writes? My backend database is MySQL.
The answer I am looking for may not be exactly how Quora is doing it, but rather how this can be done using available open-source technologies.
It's not the back-end system details that you need to worry about but the front end. Keeping connections open all the time is impractical at any real scale. Instead you want the opposite: to be able to serve and close connections from the back end as fast as you can.
Websockets is a sexy technology, but again, in the real world there are issues with proxies; if you are developing something that should work on a variety of screens (desktop, tablet, mobile), it might become a concern for you. Even good old long polls might not work through firewalls and proxies.
Here is the good news: I think
"keep polling for upvote counts for every answer"
is a totally good solution in this case. Consider the following:
your use case does not need truly real-time updates. There is little harm in seeing the counter updated a bit later
for very popular topics you would want to squash multiple up-votes/down-votes into one anyway
most of the topics will see no up-vote/down-vote traffic at all for days/weeks, so keeping a connection open, waiting for an event that never comes, is a waste
most users will never up-vote/down-vote, they just came to read a topic, so your read/write ratio for topic stats will be greatly skewed toward reads
network latencies vary hugely across clients; you will see horrible transfer rates even for 100 B HTTP responses, and while a sluggish client is fetching its response byte by byte, your precious server connection and, more importantly, a thread on a back-end server stay busy
Here is what I'd start with:
have browsers periodically poll for a new topic stat, after the main page loads
keep your MySQL, keep counters there. Every time there is an up/down vote, update the DB
put Memcached in front of the DB as a write-through cache, i.e. every time there is an up/down vote, update the cache, then update the DB. Set an explicit expire time for each counter there of 10-15 minutes. Every time a counter is updated, its expire time is extended automatically.
design these polling HTTP calls to be cacheable by HTTP proxies; set the expiry/TTL HTTP headers (Expires, Cache-Control: max-age) to 60 sec
put a reverse proxy (Varnish, nginx) in front of your front end servers, and have this proxy do the caching of the said polling calls. This takes care of the second-level cache and helps free up backend server threads quicker; see the network latencies concern above
set up your reverse proxy component to talk to the memcached servers directly, without making a call to the backend server. Yes, you can do it with both Varnish and nginx.
there is no fancy schema for storing such data; it's a simple inc()/dec() operation in memcached, and note that it's safe from the race condition point of view. It's also a safe atomic operation in MySQL: UPDATE table SET field = field + 1 WHERE [...]
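As a rough illustration of the counter part of that list, here is a small Python sketch. It is only a stand-in under stated assumptions: a plain dict plays the role of Memcached, an in-memory SQLite database plays the role of MySQL, and the table and class names are made up. Expiry times and the HTTP caching layers are left out entirely.

```python
import sqlite3


class VoteCounter:
    """Write-through vote counter: update the cache, then the database."""

    def __init__(self, db):
        self.db = db
        self.cache = {}  # stand-in for Memcached (no expiry in this sketch)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS votes ("
            "answer_id INTEGER PRIMARY KEY, score INTEGER NOT NULL)"
        )

    def vote(self, answer_id, delta):
        """Apply an up-vote (+1) or down-vote (-1)."""
        # Like a memcached incr/decr, only touch the cache if the key exists.
        if answer_id in self.cache:
            self.cache[answer_id] += delta
        # SQLite syntax; MySQL would use INSERT IGNORE or ON DUPLICATE KEY UPDATE.
        self.db.execute(
            "INSERT OR IGNORE INTO votes (answer_id, score) VALUES (?, 0)",
            (answer_id,),
        )
        # The increment happens inside the database, not in application code.
        self.db.execute(
            "UPDATE votes SET score = score + ? WHERE answer_id = ?",
            (delta, answer_id),
        )
        self.db.commit()

    def score(self, answer_id):
        """Read from the cache, falling back to the database on a miss."""
        if answer_id not in self.cache:
            row = self.db.execute(
                "SELECT score FROM votes WHERE answer_id = ?", (answer_id,)
            ).fetchone()
            self.cache[answer_id] = row[0] if row else 0
        return self.cache[answer_id]
```

Usage would be along the lines of `counter = VoteCounter(sqlite3.connect(":memory:"))`, then `counter.vote(7, +1)` and `counter.score(7)`. The point of the `score = score + ?` form is that the application never reads the value and writes it back, which is what would open a race between two concurrent voters.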
Aggressive multi level caching covers your read path: in Memcached and in all http caches along the way, note that these http poll requests will be cached on the edges as well.
To take care of the long tail of unpopular topics, make the HTTP TTL for such responses inversely proportional to popularity.
A read request will only infrequently get to the front end server, when the HTTP cache has expired and memcached does not have it either. If that is still a problem, add memcached servers and increase the expire time in memcached across the board.
After you are done with that, you have all the reads taken care of. The only problem you might still have, depending on the scale, is a high rate of writes, i.e. the flow of up/down votes. This is where your single MySQL instance might start showing some lag. Fear not: proceed along the old beaten path of sharding your instances, or adding a NoSQL store just for counters.
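One way to read "inversely proportional to popularity" is sketched below in Python. The function names, the use of votes per hour as the popularity measure, and the one-hour upper bound are all assumptions made for illustration; the 60-second floor is the TTL suggested earlier for the polling calls.

```python
def poll_ttl_seconds(votes_per_hour, min_ttl=60, max_ttl=3600):
    """Pick a cache TTL that shrinks as a topic gets more vote traffic."""
    if votes_per_hour <= 0:
        return max_ttl  # nothing is happening, cache as long as allowed
    # Roughly the expected number of seconds between two votes.
    ttl = 3600 / votes_per_hour
    return int(min(max_ttl, max(min_ttl, ttl)))


def cache_headers(votes_per_hour):
    """Build the response header for a poll response of a given topic."""
    ttl = poll_ttl_seconds(votes_per_hour)
    return {"Cache-Control": f"public, max-age={ttl}"}
```

With these numbers a topic receiving 6 votes an hour would be cached for 10 minutes, while anything at or above 60 votes an hour sits at the 60-second floor.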
Do not use any messaging system unless absolutely necessary or you want an excuse to play with it.
Websockets, Server Sent Events (I think that's what you meant by 'push notifications') and AJAX long polling have the same drawback - they keep underlying TCP connection open for a long time.
So the question is how many open TCP connections can a server handle.
Basically, it depends on its OS, the number of file descriptors (a config parameter) and available memory (each open connection reserves read/write buffers).
Here's more on that.
We once tested whether it was possible to keep 1 million websocket connections open on a single server (Windows 7 x64 with 16 GB of RAM, JVM 1.7 with 8 GB of heap, using Undertow beta to serve web requests).
Surprisingly, the hardest part was generating the load on the server :)
It managed to hold 1M. But again, the server didn't do anything useful: it just received requests, went through the protocol upgrade and kept those connections open.
There was also some number of lost connections, for whatever reason. We didn't investigate. But in production you would also have to ping the server and handle reconnection.
Apart from that, Websockets seem like overkill here, and SSE still isn't widely adopted.
So I would go with good old AJAX polling, but optimize it as much as possible.
Works everywhere, simple to implement and tweak, no reliance on an external system (I had bad experience with that several times), possibilities for optimization.
For instance, you could group updates for all open articles in a single browser, or adjust update interval according to how popular the article is.
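To illustrate the grouping idea, here is a rough server-side sketch in Python: the browser sends one request listing every article it currently has open, and gets all the counters back in one JSON body. The function names, the comma-separated `ids` format and the limit of 50 are assumptions for the sake of the example, and `get_count` is a placeholder for whatever cache or database lookup you already have.

```python
import json


def parse_article_ids(raw, limit=50):
    """Turn a query value like '12,15,99' into a list of unique ints."""
    ids = []
    for part in raw.split(","):
        part = part.strip()
        # Skip anything that is not a plain ASCII number.
        if part.isascii() and part.isdigit():
            value = int(part)
            if value not in ids:
                ids.append(value)
        if len(ids) == limit:
            break  # cap the work a single request can trigger
    return ids


def batch_response(raw_ids, get_count):
    """Build one JSON body with the vote count of every requested article."""
    ids = parse_article_ids(raw_ids)
    return json.dumps({str(i): get_count(i) for i in ids}, sort_keys=True)
```

One request per poll interval instead of one per article keeps the number of HTTP round trips independent of how many articles the user has open.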
After all it doesn't seem like you need real-time notifications here.
Sounds like you might be able to use a messaging system like Kafka, RabbitMQ or ActiveMQ. Your front end would send votes to a message channel and receive them with a listener, and you could have a server-side piece persist the votes to the DB periodically.
You could also accomplish your task by polling your database, and by incrementing/decrementing a number related to a post via a stored proc... There are a bunch of options here, and it depends on how much concurrency you may be facing.
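The "persist the votes periodically" part can be sketched without committing to a particular broker. In the Python sketch below, the standard library `queue.Queue` is only a stand-in for a Kafka/RabbitMQ/ActiveMQ channel, and `persist` is a hypothetical callable that writes one summed delta per post to the database.

```python
import queue
from collections import Counter


def drain_votes(vote_queue, max_items=1000):
    """Pull up to max_items (post_id, delta) messages and sum them per post."""
    totals = Counter()
    for _ in range(max_items):
        try:
            post_id, delta = vote_queue.get_nowait()
        except queue.Empty:
            break
        totals[post_id] += delta
    # Posts whose up- and down-votes cancelled out need no write at all.
    return {post_id: delta for post_id, delta in totals.items() if delta != 0}


def flush(vote_queue, persist):
    """Write one summed delta per post; returns how many posts were written."""
    batch = drain_votes(vote_queue)
    for post_id, delta in batch.items():
        persist(post_id, delta)
    return len(batch)
```

Calling `flush` on a timer (every few seconds, say) turns many individual votes into at most one database write per post per interval, at the cost of the stored count lagging slightly behind.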
So, we have a web application with 10-12 pages and many POST/GET DB calls. Apache usually crashes or has other problems when site traffic reaches 1000 or so concurrent users, which is a very small number, even though we have upgraded the server with plenty of RAM and other resources. Our system admin did load testing with Blitz and a custom script, and is suggesting we move away from Apache. Some things do not make sense to me: Apache should not be too bad at handling a few thousand concurrent users, considering we have Cloudflare for caching. Here is what he suggested:
1. Replace Apache+mod_fcgi with Nginx+php-fpm, which can make the server handle many more users, and then test it.
or
2. For testing: we need 10-20 servers to run a scenario from. Basically, what is needed is a more complex blitz.io analogue: create one server, which takes all those hours, then just clone it in the cloud and pay for about 1 hour of testing multiplied by the number of servers needed.
Once again, there are many DB calls and a lot of .htaccess usage. Also, what makes Nginx better than Apache in this case?
I would check this comparison first. Basically, nginx is event based, so it's able to handle more requests concurrently. However, as the MySQL DB seems to be the choke point here, it's very possible that nginx wouldn't solve all your problems. Perhaps moving to a NoSQL kind of database, that's better at scaling horizontally, would help (if that's feasible).
I am using the Web Stress Tester tool from Faststream to stress-test a web request, and it has an option for keeping the connection alive.
I reckon that for the usual web request (say, a request to a PHP page), the connection is closed after the web server processes the request and sends the response back. Is it accurate to say that I should not keep the connection alive in such cases?
TCP connection setup and teardown are expensive, hence the addition of features in HTTP 1.1 which reuse an already open connection instead of establishing a new one to download additional items. You may need to take a look at the documentation for Web Stress Tester to see if this is what is being referenced. You might also want to take a look at the W3C documentation on HTTP 1.1 to familiarize yourself with the additions related to connection management which come with the update.
In general, however, you would want to model your behavior as closely as possible to your actual end user behavior to make your test the best predictor possible of end user behavior in production. If you have an application which keeps sessions alive that you are modeling, then by all means enable the setting. If your application does not keep alive the current session then don't enable the option.
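If you want to see the difference for yourself outside of the tool, here is a small Python sketch using the standard `http.client` module. The function name and the `make_connection` parameter are made up for illustration; it says nothing about how Web Stress Tester implements its option.

```python
import time


def time_requests(make_connection, path="/", count=20, keep_alive=True):
    """Issue `count` GET requests and return (seconds, connections_opened).

    make_connection is any zero-argument callable returning an object with
    the http.client.HTTPConnection interface.
    """
    # Ask the server to close after each response when not reusing.
    headers = {} if keep_alive else {"Connection": "close"}
    opened = 0
    conn = None
    start = time.perf_counter()
    for _ in range(count):
        if conn is None:
            conn = make_connection()
            opened += 1
        conn.request("GET", path, headers=headers)
        response = conn.getresponse()
        response.read()  # the body must be consumed before reusing the connection
        if not keep_alive:
            conn.close()
            conn = None
    if conn is not None:
        conn.close()
    return time.perf_counter() - start, opened
```

Usage would be something like `time_requests(lambda: http.client.HTTPConnection("myapp.example.com", 80, timeout=10), "/page.php", keep_alive=False)` after `import http.client`, with the host and path replaced by your own. One caveat: if the server closes a kept-alive connection on its side, `http.client` reopens it transparently on the next request, so `connections_opened` counts connection objects, not necessarily TCP handshakes.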