I'm looking at implementing a live voting system on my website. The website provides a live stream, and I'd like to be able to prompt viewers to select an answer during a vote initiated by the caster. I understand how to store the data in a MySQL database and how to process the answers. However:
How would I initially start the vote on the client-side and display it? Should a script be running every few seconds on the page, checking another page to see if a question is available for the user?
Are there any existing examples of a real-time polling system such as what I'm looking at implementing?
You would have to query the server for a new question every few seconds.
The alternative is to hold the connection open until the server sends more data or it times out, which just reduces (but does not eliminate) the server hits. I think it is called "long polling". http://en.wikipedia.org/wiki/Push_technology
You will have to originate the connection from the client side. The simplest solution is to have the page make an AJAX request every second or so. Alternatively, the server doesn't have to respond immediately: it can hold a request for 30 seconds or more before responding without the connection timing out. Opening one connection that the server doesn't answer until it has something to say is called "long polling".
You could use setTimeout in JavaScript to make AJAX requests every few seconds to check whether there are new questions.
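A minimal sketch of that setTimeout loop might look like the following. The names startPolling, fetchQuestion and showQuestion and the 3000 ms default are assumptions made for illustration; fetchQuestion stands for whatever AJAX call asks the server for the current question and resolves to null when no vote is open.

```javascript
// Poll for a new question every intervalMs milliseconds.
// fetchQuestion: async function resolving to { id, ... } or null (hypothetical).
// showQuestion: callback that renders the vote prompt (hypothetical).
// schedule: injectable for testing; defaults to the real setTimeout.
function startPolling(fetchQuestion, showQuestion, intervalMs = 3000, schedule = setTimeout) {
  let lastId = null;
  let stopped = false;

  async function tick() {
    if (stopped) return;
    try {
      const question = await fetchQuestion();
      // Only show a question the viewer has not seen yet.
      if (question && question.id !== lastId) {
        lastId = question.id;
        showQuestion(question);
      }
    } catch (err) {
      // Ignore transient errors; the next tick retries.
    }
    // Re-arm only after the previous request has finished,
    // so slow responses never pile up.
    if (!stopped) schedule(tick, intervalMs);
  }

  tick();
  return () => { stopped = true; }; // call the returned function to stop polling
}
```

Re-arming the timer after each response (rather than using setInterval) keeps at most one request in flight per viewer.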
Yes, long polling might be better, but I'm sure it's a bit more complex. So if you are up to the job, go ahead and use it!
Here's a bit more info on the topic:
http://www.webdevelopmentbits.com/avoiding-long-polling
I have an Android frontend.
The Android client makes a request to my NodeJS backend server and waits for a reply.
The NodeJS server reads a value in a MySQL database record (without sending it back to the client) and waits for that value to change (another Android client changes it with a different request within 20 seconds). When that happens, the NodeJS server replies to the client with the new value.
Now, my approach was to create a MySQL trigger that notifies the NodeJS server when there is an update in that table, but I don't know how to do it.
To give you an idea, I thought of two easier ways that use busy waiting:
the client sends a request every 100ms and the server replies with the SELECT of that value, then when the client gets a different reply it means that the value changed;
the client sends a request and the server makes a SELECT query every 100ms until it gets a different value, then it replies with the value to the client.
Both are brute-force approaches, and I would rather not use them for obvious reasons. Any ideas?
Thank you.
Welcome to StackOverflow. Your question is very broad and I don't think I can give you a very detailed answer here. However, I think I can give you some hints and ideas that may help you along the road.
MySQL has no built-in way to run external commands as a trigger action. To my knowledge, there is a workaround in the form of an external plugin (UDF) that allows MySQL to do what you want. See "Invoking a PHP script from a MySQL trigger" and https://patternbuffer.wordpress.com/2012/09/14/triggering-shell-script-from-mysql/
However, I think going this route is a sign of using the wrong architecture or wrong design patterns for what you want to achieve.
The first idea that comes to mind is this: would it not be possible to introduce some sort of messaging from the second Node.js request (the one that changes the DB) to the first one (the one that needs an update when the DB value changes)? That way the first Node.js "process" only needs to query the DB when it receives a message, that is, upon real changes.
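If both requests are handled by the same Node.js process, a rough sketch of that messaging could use the built-in EventEmitter. The function names waitForChange and notifyChange, the event naming scheme and the 20 second default are assumptions for illustration, not part of any existing code.

```javascript
const { EventEmitter } = require("node:events");

// In-process message bus shared by all request handlers of one Node.js process.
const changes = new EventEmitter();

// Called by the handler of the waiting client.
// onDone receives the new value, or null if nothing changed within timeoutMs.
function waitForChange(recordId, onDone, timeoutMs = 20000) {
  const eventName = `changed:${recordId}`;
  const onChange = (value) => {
    clearTimeout(timer); // the change arrived in time, so cancel the timeout
    onDone(value);
  };
  const timer = setTimeout(() => {
    changes.removeListener(eventName, onChange);
    onDone(null);
  }, timeoutMs);
  changes.once(eventName, onChange);
}

// Called by the handler that performs the UPDATE, after the query succeeds.
function notifyChange(recordId, newValue) {
  changes.emit(`changed:${recordId}`, newValue);
}
```

This only works while a single process serves both clients. With several processes or servers, the same pattern needs an external broker between them.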
Another question is whether you actually need MySQL, or whether some other datastore might be better suited. Redis comes to mind, since with Redis you could implement the messaging to Node.js at the same time.
In general, polling is not always the wrong choice, especially in high-load environments where you expect each poll to collect some data. Polling makes it impossible to overload the processing capacity of the side retrieving the data, since that side controls the maximum throughput. With pushing you hand that control to the pushing side, and if there are many pushing sides, control is hard to achieve.
If I were you, I would look into Redis and learn how elegantly its publish/subscribe mechanism can be used as a messaging system in your context. See https://redis.io/topics/pubsub
Environment:
Windows Server 2003 - IIS 6.x
ASP.NET 3.5 (C#)
IE 7,8,9
FF (whatever the latest 10 versions are)
User Scenario:
User enters search criteria against large data-set. After initiating the request, they are navigated to a results page, where they wait until the data is loaded and can then refine the data.
Technical Scenario:
After the user sends search criteria (via an AJAX call), the UI calls a back-end service. The back-end service queries the transactional system(s) and puts the resulting data into a db "cache": a denormalized table set up for further refining of the data (i.e. sorting, filtering). The UI waits until the data is cached and then, upon being notified that the process is done, navigates to the results page. The results page then makes a call to get the data from the denormalized table.
Problem:
The search is relatively slow (15-25 seconds) for large queries that end up having to query many systems based on the criteria entered. It is relatively fast for other queries ( <4 seconds).
Technical Constraints:
We cannot entirely re-architect this search / results system. There are way too many complexities in how the UI and the back-end are tied together. The page is required (because of constraints that cannot be solved on StackOverflow) to turn after the search criteria are submitted.
We also cannot ask the organization to denormalize the data prior to searching because the data has to be real-time, i.e. if a user makes a change in other systems, the data has to show up correctly if they do a search afterwards.
Process that I want to follow:
I want to cheat a little. I want to issue the "Cache" request via an async HttpHandler in a fire-forget model.
After issuing the query, I want to transition the page to the resulting page.
On the transition page, I want to poll the "Cache" table to see if the data has been inserted into it yet.
The reason I want to do this transition right away is that the results page is expensive in itself (even without getting the data): it still takes 2 seconds to load before it even calls the service that gets the data from the cache.
Question:
Will the ASP.NET thread that is called via the async handler reliably continue processing even if I navigate away from the page using a javascript redirect?
Technical Boundaries 2:
Yes, I know... This search process does not sound efficient. There is nothing I can do about that right now. I am trying to do whatever I can to get it to perform a little better while we continue researching how we are going to re-architect it.
If your answer is to: "Throw it away and start over", please do not answer. That is not acceptable.
Yes.
There is a property, Response.IsClientConnected, which tells a long-running process whether the client is still connected. The property exists because a process continues running even if the client disconnects; a premature disconnect must be detected manually via the property, and the process must be shut down manually. By default, a running process is not stopped when the client disconnects.
Reference to this property: http://msdn.microsoft.com/en-us/library/system.web.httpresponse.isclientconnected.aspx
update
FYI, this is a very bad property to rely on these days with sockets. I strongly encourage an approach that lets you complete the request quickly by recording the long-running task in a database or queue (probably RabbitMQ or something like that), and that in turn uses socket.io or similar to update the web page or app once the task has completed.
How about not doing the async operation on an ASP.NET thread at all? Let the ASP.NET code call a service to queue the data search, then return to the browser with a token from the service. The browser then redirects to the results page, which waits for the completed result by polling with the token from the service.
That way, you won't have to worry about whether or not ASP.NET will somehow learn that the browser has moved to a different page.
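The token flow described above can be sketched independently of ASP.NET. The following is a language-neutral illustration written in JavaScript, with an in-memory Map standing in for the separate search service; submitSearch, completeSearch and getStatus are hypothetical names, and in the asker's stack each would be a service call.

```javascript
const { randomUUID } = require("node:crypto");

// In-memory job table standing in for the separate search service (an assumption).
const jobs = new Map();

// Step 1: the search request queues the work and returns a token immediately.
function submitSearch(criteria) {
  const token = randomUUID();
  jobs.set(token, { status: "pending", criteria, rowCount: null });
  return token;
}

// Step 2: the background worker marks the job done once the cache table is filled.
function completeSearch(token, rowCount) {
  const job = jobs.get(token);
  if (!job) throw new Error(`unknown token: ${token}`);
  job.status = "done";
  job.rowCount = rowCount;
}

// Step 3: the results page polls with its token until the status is "done".
function getStatus(token) {
  const job = jobs.get(token);
  return job
    ? { status: job.status, rowCount: job.rowCount }
    : { status: "unknown", rowCount: null };
}
```

Because the work is owned by the service rather than by the request that started it, navigating away from the search page has no effect on it.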
Another option is to use Threading (System.Threading).
When the user sends the search criteria, the server begins processing the page request and creates a new Thread responsible for executing the search. It then finishes the response, which redirects the browser to the results page, while the thread continues to execute in the background on the server.
The results page would keep checking with the server whether the query had finished, since the started Thread would share its progress information. When it does finish, the results are returned on the next AJAX call made by the results page.
WebSockets could also be considered: because they offer a full-duplex communication channel, the web server itself could tell the browser when the query has finished.
I want to have ONE single MySQL connection, used for EVERY user, that selects the data all the time and updates it if specific conditions are met (like a placed bid). Preferably it should keep running even when no user is visiting the website, if that's possible.
I have been googling for days trying to figure out how to solve my issue, but I haven't found anyone with enough knowledge to help me. So I'll try to ask my question as simply as possible without confusing you with my code. (But if you're interested in seeing the code: http://pastebin.com/dRFzWtEH)
This is all about an auction website with a live countdown timer. I just want to run a node.js server that SELECTs data every second and sends it over a WebSocket, to show all users visiting the website the countdown and price updates (on bids) in real time.
I accomplished this whole task using single MySQL queries, but then I ran into errors. The author of the node-mysql module on GitHub then suggested that I use a MySQL pool. But I can find almost no content about the specific aim stated in the first sentence of this question.
Now I want to ask in general: how could I accomplish this? Is it even possible, or does at least one user have to be on my website?
What would the code/code-structure/logical process look like?
And I guess I don't need to close the connection at all, so I won't need functions like connection.end()?
No, don't worry about connection pooling. It is not a big deal in MySQL.
Furthermore, a "pool" has a problem: it must clear out all settings, @variables, transaction state, etc., before allowing the next 'client' to use the pooled connection. This can take time, especially if the client is far from the server.
MySQL's connection/disconnection time is very low, unlike competing products.
If you are developing a Web product, then keep in mind that HTTP is "stateless". That is, you cannot hang onto a connection from one 'page' to the next 'page'. Hence, no 'state' can be saved.
Edit
If you have "across the pond" latency problems (100-200ms between the US and Europe), a client-side connection pool could be very useful. However, if the pool software is injecting commands to reset things, that could totally defeat the pooling.
If you can turn on the 'general log' (in a hosted service, you may have to use log_output=TABLE), do so to see what extra commands are injected.
Also, consider combining multiple client SQL statements into Stored Procedures to cut down on back-and-forth.
Also consider either moving the MySQL server closer to the client, or moving the client closer to the MySQL server, depending on how the end-user to client back-and-forth compares to the client to MySQL traffic.
How can I implement a dynamically updating vote count similar to Quora's? Whenever a user upvotes an answer, the change is reflected automatically for everyone who is viewing that page.
I am looking for an answer that addresses the following:
Do we have to keep polling for upvote counts for every answer? If yes, how do we manage the server load arising from so many users polling for upvotes?
Or should we use WebSockets/push notifications? How scalable are these?
How should the upvote/downvote counts be stored (in databases or in memory) to support this? How is the number of reads/writes controlled? My backend database is MySQL.
The answer I am looking for need not be exactly how Quora does it, but rather how this can be done using available open-source technologies.
It's not the back-end system details that you need to worry about but the front end. Keeping connections open all the time is impractical at any real scale. Instead you want the opposite: to be able to serve and close each connection from the back end as fast as you can.
WebSockets are a sexy technology, but in the real world there are issues with proxies, and if you are developing something that should work on a variety of screens (desktop, tablet, mobile) this might become a concern. Even good old long polling might not work through firewalls and proxies.
Here is the good news: I think "keep polling for upvote counts for every answer" is a perfectly good solution in this case. Consider the following:
your use case does not need truly real-time updates. There is little harm in seeing the counter updated a bit later
for very popular topics you would want to squash multiple up-votes/down-votes into one anyway
most topics will see no up-vote/down-vote traffic at all for days or weeks, so keeping a connection open, waiting for an event that never comes, is a waste
most users never up-vote/down-vote and just come to read a topic, so your read/write ratio for topic stats will be greatly skewed toward reads
network latencies vary hugely across clients, and you will see horrible transfer rates even for 100 B HTTP responses. While a sluggish client is fetching its response byte by byte, your precious server connection and, more importantly, a thread on a back-end server are busy
Here is what I'd start with:
have browsers periodically poll for new topic stats after the main page loads
keep your MySQL and keep the counters there. Every time there is an up/down vote, update the DB
put Memcached in front of the DB as a write-through cache, i.e. every time there is an up/down vote, update the cache, then update the DB. Set an explicit expiry time of 10-15 minutes for each counter. Every time a counter is updated, its expiry time is extended automatically.
design these polling HTTP calls to be cacheable by HTTP proxies; set the expiry and TTL HTTP headers to 60 sec
put a reverse proxy (Varnish, nginx) in front of your front-end servers and have it cache the polling calls. This takes care of the second-level cache and helps free up back-end server threads sooner; see the network latency concern above
set up your reverse proxy to talk to the Memcached servers directly, without making a call to the back-end server. Yes, you can do this with both Varnish and nginx.
there is no fancy schema for storing such data; it's a simple inc()/dec() operation in Memcached, which is safe from the race-condition point of view. It's also a safe atomic operation in MySQL: UPDATE table SET field = field + 1 WHERE [...]
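The write-through idea in the list above can be illustrated with a small single-process sketch. Two plain Maps stand in for Memcached and MySQL here (an assumption made so the example is self-contained), and the class name VoteCounter and its methods are hypothetical. A real deployment would rely on Memcached's atomic increment/decrement and the atomic SQL UPDATE instead of this read-then-write step, which is only safe inside one single-threaded process.

```javascript
// Write-through counter cache, sketched with in-memory stand-ins.
class VoteCounter {
  constructor(ttlMs = 10 * 60 * 1000, now = Date.now) {
    this.db = new Map();    // stand-in for the MySQL counters table
    this.cache = new Map(); // stand-in for Memcached: id -> { count, expiresAt }
    this.ttlMs = ttlMs;
    this.now = now;         // injectable clock, useful for testing
  }

  // Write path: update the cache first, then the database.
  // Every write also pushes the cache entry's expiry further out.
  vote(id, delta) {
    const count = this.read(id) + delta;
    this.cache.set(id, { count, expiresAt: this.now() + this.ttlMs });
    this.db.set(id, count);
    return count;
  }

  // Read path: serve from the cache while the entry is fresh,
  // otherwise fall back to the database and repopulate the cache.
  read(id) {
    const entry = this.cache.get(id);
    if (entry && entry.expiresAt > this.now()) return entry.count;
    const count = this.db.get(id) ?? 0;
    this.cache.set(id, { count, expiresAt: this.now() + this.ttlMs });
    return count;
  }
}
```

Counters that nobody votes on simply expire from the cache, so memory is spent only on topics that are actually active.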
Aggressive multi-level caching covers your read path: in Memcached and in all HTTP caches along the way. Note that these HTTP poll requests will be cached on the edges as well.
To take care of the long tail of unpopular topics, make the HTTP TTL for such responses inversely proportional to popularity.
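One possible way to compute such a TTL is sketched below. The function names, the votes-per-hour measure of popularity and the 60 to 3600 second bounds are all assumptions chosen for illustration; only the Cache-Control max-age directive itself is standard HTTP.

```javascript
// TTL in seconds, inversely proportional to recent vote traffic:
// roughly "expect one change per TTL", clamped to [minTtl, maxTtl].
function pollTtlSeconds(votesPerHour, minTtl = 60, maxTtl = 3600) {
  if (votesPerHour <= 0) return maxTtl; // quiet topic: cache as long as allowed
  const ttl = Math.round(3600 / votesPerHour);
  return Math.min(maxTtl, Math.max(minTtl, ttl));
}

// Value for the Cache-Control response header of a poll request.
function cacheControlHeader(votesPerHour) {
  return `public, max-age=${pollTtlSeconds(votesPerHour)}`;
}
```

With these numbers, a topic receiving 6 votes per hour would be cached for 600 seconds, while a topic with 60 or more votes per hour stays at the 60 second floor.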
A read request will only infrequently get to the front-end server: when the HTTP cache has expired and Memcached does not have the value either. If that is still a problem, add Memcached servers and increase the expiry time in Memcached across the board.
Once you are done with that, all the reads are taken care of. The only problem you might still have, depending on the scale, is a high rate of writes, i.e. the flow of up/down votes. This is where your single MySQL instance might start showing some lag. Fear not: proceed along the old beaten path of sharding your instances, or add a NoSQL store just for counters.
Do not use any messaging system unless absolutely necessary or you want an excuse to play with it.
WebSockets, Server-Sent Events (I think that's what you meant by 'push notifications') and AJAX long polling share the same drawback: they keep the underlying TCP connection open for a long time.
So the question is how many open TCP connections a server can handle.
Basically, it depends on the OS, the number of file descriptors (a config parameter) and the available memory (each open connection reserves read/write buffers).
Here's more on that.
We once tested whether a single server could keep 1 million WebSocket connections open (Windows 7 x64 with 16 GB of RAM, JVM 1.7 with 8 GB of heap, using Undertow beta to serve web requests).
Surprisingly, the hardest part was generating the load on the server.
It managed to hold 1M. But again, the server didn't do anything useful: it just received requests, went through the protocol upgrade and kept those connections open.
There was also some number of lost connections, for whatever reason. We didn't investigate. But in production you would also have to ping the server and handle reconnection.
Apart from that, WebSockets seem like overkill here, and SSE still isn't widely adopted.
So I would go with good old AJAX polling, but optimize it as much as possible.
Works everywhere, simple to implement and tweak, no reliance on an external system (I had bad experience with that several times), possibilities for optimization.
For instance, you could group updates for all open articles in a single browser, or adjust update interval according to how popular the article is.
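Grouping the updates could be as simple as the sketch below. The endpoint shape (an ids query parameter) and the function names buildBatchUrl and changedCounts are assumptions for illustration.

```javascript
// Build ONE polling URL for all answers currently shown in the browser.
// Duplicates are dropped and ids are sorted so that the same set of answers
// always yields the same URL, which also makes it friendlier to HTTP caches.
function buildBatchUrl(baseUrl, answerIds) {
  const ids = [...new Set(answerIds)].sort((a, b) => a - b);
  return `${baseUrl}?ids=${ids.join(",")}`;
}

// Compare the previous counts with the latest response and return only the
// entries that changed, so the page touches as little of the DOM as possible.
function changedCounts(previous, latest) {
  const changed = {};
  for (const [id, count] of Object.entries(latest)) {
    if (previous[id] !== count) changed[id] = count;
  }
  return changed;
}
```

For example, buildBatchUrl("/api/votes", [7, 3, 7, 12]) yields "/api/votes?ids=3,7,12", one request instead of three.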
After all it doesn't seem like you need real-time notifications here.
It sounds like you might be able to use a messaging system like Kafka, RabbitMQ, or ActiveMQ. Your front end would send votes to a message channel and receive them with a listener, and you could have a server-side piece persist the votes to the db periodically.
You could also accomplish your task by polling your database and by incrementing/decrementing a number related to a post via a stored procedure. There are a bunch of options here, and the right one depends on how much concurrency you may be facing.
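The "persist the votes periodically" part can be sketched without any particular broker. In the following, VoteBuffer is a hypothetical in-memory accumulator and persist is a caller-supplied function (for instance one that runs an UPDATE or calls a stored procedure); neither comes from a real library.

```javascript
// Accumulates vote deltas per post and writes them out in batches.
class VoteBuffer {
  constructor(persist) {
    this.persist = persist;   // persist(postId, delta): hypothetical DB write
    this.pending = new Map(); // postId -> net delta since the last flush
  }

  // Record one vote: +1 for an upvote, -1 for a downvote.
  add(postId, delta) {
    this.pending.set(postId, (this.pending.get(postId) ?? 0) + delta);
  }

  // Write one net delta per post, then start a fresh batch.
  // Returns the number of database writes performed.
  flush() {
    const batch = this.pending;
    this.pending = new Map();
    let writes = 0;
    for (const [postId, delta] of batch) {
      if (delta === 0) continue; // an upvote and a downvote cancelled out
      this.persist(postId, delta);
      writes += 1;
    }
    return writes;
  }
}
```

Calling flush from a timer, for example every few seconds, turns many individual votes into one write per post. The trade-off is that votes still held in memory are lost if the process dies, which is one reason to put a durable message queue in front of this step.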
I have a MySQL database with several tables. I have an input that makes an AJAX call for every character typed.
Is there a way to load balance by distributing to other domains etc?
Estimated statistics:
~1000-2000 hits a day. Average site time per user ~30-60 secs.
I think you'd be better off making the AJAX form set a timeout whenever a character is typed, so that the AJAX request is made, say, 300 ms after the last character. I did something similar to your current approach in a Java Swing application, and the load on the server from even a simple query was stupendous. As for load balancing MySQL, all I know is that you'll either have to give up on consistency or deal with degraded write performance.
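This technique is usually called debouncing. A minimal sketch, assuming a 300 ms wait and a hypothetical sendQuery function that performs the AJAX call, might look like this:

```javascript
// Returns a wrapper around fn that runs only after waitMs milliseconds have
// passed without another call. The timers parameter is injectable for testing
// and defaults to the real setTimeout/clearTimeout.
function debounce(
  fn,
  waitMs = 300,
  timers = {
    set: (callback, ms) => setTimeout(callback, ms),
    clear: (handle) => clearTimeout(handle),
  }
) {
  let handle = null;
  return (...args) => {
    // A new keystroke cancels the request that was about to be sent.
    if (handle !== null) timers.clear(handle);
    handle = timers.set(() => {
      handle = null;
      fn(...args); // only the most recent arguments are used
    }, waitMs);
  };
}
```

In a page this could be wired up as input.addEventListener("input", debounce((event) => sendQuery(event.target.value))), so that typing a ten-letter word results in one query instead of ten.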
I've heard good things about Perlbal for load balancing, and it's free, which makes it a good candidate for those on a budget.
Its source is hosted on Google Code.