WebSocket progress

I have a case where I may need to send 500KB - 1MB of data to the client via WebSockets. Therefore, I was wondering if it is possible to track the progress of how much data has been received by the client. That way the application does not appear unresponsive when connecting over a slower connection.

There is no built-in way to do this. (That is, while the WebSocket protocol allows for fragmentation of messages, a client using JavaScript's WebSocket API has no access to this and is only informed once the browser has received all message fragments and combined their contents into a single buffer.)
You could, however, indicate progress in application code by breaking your single large message into several smaller ones.
If you do this, you'll also need to define your own simple protocol. At a minimum, this could be an initial message that informs the client that the following messages are to be combined and add up to x bytes. Or, if you don't know the size of the data in advance, a second message that follows the final data transfer and indicates the end of your fragmented message.
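For illustration, a minimal sketch of the receiving side of such a protocol in browser TypeScript, assuming the server first sends a small JSON header of the form {"totalBytes": n} and then the payload as binary chunks (the endpoint URL and header shape are assumptions, not part of any standard):

```typescript
// Minimal sketch of the client side of a hand-rolled fragmentation protocol.
// Assumption: the server first sends a JSON header {"totalBytes": n}, then
// the payload as several binary chunks that together add up to n bytes.
const socket = new WebSocket("wss://example.com/data"); // assumed endpoint
socket.binaryType = "arraybuffer";

let expectedBytes = 0;
let receivedBytes = 0;
const chunks: ArrayBuffer[] = [];

socket.onmessage = (event) => {
  if (typeof event.data === "string") {
    // Header message: tells us how many bytes the following chunks add up to.
    expectedBytes = JSON.parse(event.data).totalBytes;
    return;
  }
  chunks.push(event.data);
  receivedBytes += event.data.byteLength;
  updateProgress(receivedBytes / expectedBytes); // e.g. drive a progress bar

  if (receivedBytes >= expectedBytes) {
    const full = new Blob(chunks); // recombine chunks into the original payload
    console.log("transfer complete:", full.size, "bytes");
  }
};

function updateProgress(fraction: number): void {
  console.log(`received ${(fraction * 100).toFixed(1)}%`);
}
```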

Transmitting Data Between Websites Via The Internet

I've edited this question. I hope this version is a bit clearer.
I am seeking to have a programmer build a process for me. I need to ensure that what is recommended follows best practice for the process below.
Here are the steps I need to have built:
Have an HTTPS web form on my server that submits client-entered data into a database on my server. The data is personally identifiable information and needs to be securely transmitted in the next step.
Once the data is loaded into my database, I need to transfer it in an encrypted JSON format to a third-party server. The third party will decrypt the data, score it, and send it back to my server encrypted.
While the data is being sent and scored by the third party, the client will see a browser screen indicating that processing is underway.
Once the scored data is sent back to my server, it will be decrypted, and the client's browser will be updated with options based on the score given by the third party.
Based on what I understand, I think an API on both my server and the third-party server might be best.
What is the best practice approach for the above process?
Below are some questions whose answers would be very helpful to include in your response.
1) Is the API approach the best?
2) What process is used by the third party to decrypt the data I send, and vice versa? How do I prevent others from decrypting the data if it is intercepted?
3) While the data is being scored by the third party, the client browser will show a processing screen. From a web development standpoint, how does this work? Also, how exactly is the processing screen triggered to update with results in the client's browser when the data is sent back from the third party?
The file that you will be transmitting is, as you mentioned, encrypted, so the format will depend entirely on the encryption algorithm you are using; encrypted data is generally encoded as Base64 or hex, so after encryption the data will be passed in one of those formats.
To answer your second question, "how will the receiving website receive the file?", there are several ways you can do this:
You can share the backend database your website is using; the data is then just a simple query away (by shared I mean both websites use the same database).
Another way of achieving this is to use an API which stores your data and can be used globally by any application that calls it.
Or you can set up a simple PHP server on your own machine and send data between the websites using HTTP GET or POST requests; a minimal sketch of this last option follows below.
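As a rough illustration of that transfer step (shown in TypeScript rather than PHP for brevity), assuming the receiving site exposes a hypothetical /receive endpoint protected by an assumed shared API key:

```typescript
// Minimal sketch: one site POSTs JSON to another over HTTPS.
// The endpoint URL, API-key header, and payload shape are assumptions.
// Uses the global fetch available in browsers and Node 18+.
async function sendRecord(record: { id: string; data: unknown }): Promise<void> {
  const res = await fetch("https://receiver.example.com/receive", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "X-Api-Key": process.env.RECEIVER_API_KEY ?? "", // assumed shared secret
    },
    body: JSON.stringify(record),
  });
  if (!res.ok) {
    throw new Error(`transfer failed: ${res.status}`);
  }
}
```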
Also, avoid using unnecessary tags like web-development-server, data-transfer or transmission; these tags are unrelated to your question. You should only use tags that relate to your question; a simple web-development tag would be enough.
Also, please edit your question so we can understand it properly: what problems are you facing? What have you tried? What do you expect from us in an answer?
Please clarify your question further.
Your concept of files being sent around is somewhat mistaken: in most cases none of this is ever written to disk, so there is no JSON file with a file name, and the data is not encrypted directly but only pushed through an encrypted channel. Most commonly both sides use either HTTPS or WSS as the protocol, which encrypts/decrypts the exchanged data transparently (all by itself). Depending on the protocol used, this requires either a client & server combination, a server & server combination, or a P2P network.
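To make the "encrypted channel" point concrete, here is a minimal sketch using the ws package in Node: nothing in this code encrypts the JSON itself; the TLS layer underneath the wss:// connection does that transparently (the URL and payload are assumptions):

```typescript
// Minimal sketch: plain JSON pushed through an encrypted (TLS) channel.
// No explicit encryption here; wss:// handles it transparently.
// The URL and payload are assumptions for illustration.
import WebSocket from "ws";

const socket = new WebSocket("wss://thirdparty.example.com/score");

socket.on("open", () => {
  socket.send(JSON.stringify({ applicantId: "abc123", income: 52000 }));
});

socket.on("message", (data) => {
  const result = JSON.parse(data.toString()); // e.g. an assumed { score: 710 }
  console.log("scored result", result);
});
```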
Further reading: Internetworking Basics - Computer and Information Science.

Possible to reduce AWS Data Transfer cost by switching from HTTP API to gRPC service?

I have a web service which handles a significant volume of traffic, which can be in the range of millions of requests per minute. The service is hosted on AWS EC2 behind an ELB and uses HTTP APIs. This leads to Data Transfer fees making up a good chunk of the AWS bill. The Data Transfer Out component is the larger part, since about 50% of the responses from the web service are somewhat large and encoded as JSON, in addition to the SSL negotiation overhead.
Now, gRPC payloads are smaller than similar data represented as JSON, due to binary serialization. So is it possible to save on data transfer costs by switching from HTTP APIs to gRPC?
I couldn't find any benchmark/article anywhere correlating AWS Data Transfer costs with HTTP APIs/gRPC services. Even 5-10% savings would be beneficial.
PS: Here the clients accessing the web service are also mine, so it is possible for me to make changes on both the server side and the client side.
Maybe, but probably not. It depends on your actual data.
If you're using HTTP for communications, then there are two components of overall message size: HTTP headers and response body. If the headers represent a significant portion of your overall message size, then it makes more sense to get rid of them by using an alternative layer-7 protocol, such as WebSockets.
If the headers aren't significant, then it depends on what your actual message content is. That's because Protocol Buffers, which is used by gRPC, performs essentially two optimizations:
Replacement of field names with a one- or two-byte value. This can be a big savings, as long as your JSON response doesn't frequently use the same field names (i.e., repeated objects). If it does, then using GZip encoding will reduce the average cost of a field name down to somewhere around 5 bytes (my observation with large files, YMMV).
Storage of numeric values in fewer than their normal number of bits. If your message content consists of arrays of numbers, this will be a huge win. If it's mostly text, you won't see much benefit, because the same byte sequence will have to be sent in either case.
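Since the answer ultimately depends on your actual data, one rough way to gauge the headroom is to compare a captured JSON response with its gzipped size; a minimal sketch in Node (the sample payload is a stand-in for a real response):

```typescript
// Rough sizing sketch: compare a raw JSON response with its gzipped form.
// The sample payload is a stand-in; substitute a captured production response.
import { gzipSync } from "zlib";

const payload = Array.from({ length: 1000 }, (_, i) => ({
  id: i,
  name: `item-${i}`,
  price: Math.round(Math.random() * 10000) / 100,
}));

const json = Buffer.from(JSON.stringify(payload));
const gzipped = gzipSync(json);

console.log(`raw JSON: ${json.length} bytes`);
console.log(`gzipped:  ${gzipped.length} bytes`);
// If gzip already removes most of the field-name overhead, the additional
// saving from Protocol Buffers' binary encoding may be modest for this data.
```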
Personally, I think switching to WebSockets would be the best first step. That assumes, of course, that these messages are coming from a relatively small number of clients. If every message is from a different client, you won't save anything.

Should I process Slang/Bad words masking on Server side (or Client side) to achieve better performance?

I have been developing a real-time chat service and need to mask bad words that clients send. So now I am wondering about system performance, as thousands of messages are transferred in real time.
Which side (server or client) is the best place to process bad-word masking for better performance?
Client side : Android
Server side : Nodejs (MySQL, Redis)
Methods I am considering:
Download the slang list from the server, and when the client sends a message, mask any bad words on the client. The process can take a long time, but there may be a good search algorithm.
Put the slang list in Redis and, for every message the server processes, check for bad words (through a Redis query) and send the masked message to the client and the endpoint. Going through Redis sounds great, but I have to send the masked message back to the client, which seems to me to make the system slow.
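For reference, a minimal sketch of the second method on the Node side, assuming the slang list has already been loaded into a Redis set named "slang" (the set name and client setup are assumptions):

```typescript
// Minimal sketch of the Redis-backed check: for each word in the message,
// ask Redis whether it is in the "slang" set and mask it if so.
// The set name and client setup are assumptions for illustration.
import { createClient } from "redis";

const redis = createClient();
await redis.connect(); // top-level await assumes an ES module context

async function maskMessage(message: string): Promise<string> {
  const words = message.split(/\s+/);
  const masked = await Promise.all(
    words.map(async (word) => {
      const isBad = await redis.sIsMember("slang", word.toLowerCase());
      return isBad ? "*".repeat(word.length) : word;
    })
  );
  return masked.join(" ");
}
```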
If you want to display the masked message in real time, then implement it on both ends. The client side would take care of displaying the message and simultaneously send it to the server (an async process) for masking and saving to the database.
Make sure that the algorithms for masking are the same on both ends, so that the data displayed in real time and the data later pulled from the database are the same.
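A minimal sketch of a masking routine meant to be kept identical on both ends (shown in TypeScript for the Node side; the Android client would need an equivalent implementation, and the word list here is a placeholder):

```typescript
// Minimal sketch of a masking routine intended to produce identical output
// on client and server. The word list is a placeholder; in practice it would
// come from the shared slang list.
const slang = new Set(["badword", "slangword"]);

function maskBadWords(message: string): string {
  return message
    .split(/(\s+)/) // keep whitespace so the message is reassembled as typed
    .map((token) =>
      slang.has(token.toLowerCase()) ? "*".repeat(token.length) : token
    )
    .join("");
}

// Example: maskBadWords("this badword stays hidden") -> "this ******* stays hidden"
```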

Will IIS ever terminate the thread if a POST gets canceled by the browser [duplicate]

Environment:
Windows Server 2003 - IIS 6.x
ASP.NET 3.5 (C#)
IE 7,8,9
FF (whatever the latest 10 versions are)
User Scenario:
User enters search criteria against a large data set. After initiating the request, they are navigated to a results page, where they wait until the data is loaded and can then refine the data.
Technical Scenario:
After the user sends search criteria (via an Ajax call), the UI calls a back-end service. The back-end service queries the transactional system(s) and puts the resulting data into a db "cache" - a denormalized table set up for further refining of the data (i.e. sorting, filtering). The UI waits until the data is cached and then, upon getting notified that the process is done, navigates to a results page. The results page then makes a call to get the data from the denormalized table.
Problem:
The search is relatively slow (15-25 seconds) for large queries that end up having to query many systems based on the criteria entered. It is relatively fast for other queries ( <4 seconds).
Technical Constraints:
We cannot entirely re-architect this search/results system. There are way too many complexities in how the UI and the back end are tied together. The page is required (because of constraints that cannot be solved on StackOverflow) to transition after the search criteria are submitted.
We also cannot ask the organization to denormalize the data prior to searching, because the data has to be real-time, i.e. if a user makes a change in other systems, the data has to show up correctly if they do a search afterwards.
Process that I want to follow:
I want to cheat a little. I want to issue the "Cache" request via an async HttpHandler in a fire-and-forget model.
After issuing the query, I want to transition the page to the results page.
On the transition page, I want to poll the "Cache" table to see if the data has been inserted into it yet.
The reason I want to do this transition right away is that the results page is expensive by itself (even without getting the data) - still 2 seconds of load time before it even calls the service that gets the data from the cache.
Question:
Will the ASP.NET thread that is called via the async handler reliably continue processing even if I navigate away from the page using a javascript redirect?
Technical Boundaries 2:
Yes, I know... This search process does not sound efficient. There is nothing I can do about that right now. I am trying to do whatever I can to get it to perform a little better while we continue researching how we are going to re-architect it.
If your answer is to: "Throw it away and start over", please do not answer. That is not acceptable.
Yes.
There is the property Response.IsClientConnected, which is used to determine whether the client of a long-running process is still connected. The reason this property exists is that a process will continue running even if the client becomes disconnected; a premature disconnect must be detected manually via the property, and the process shut down manually. By default, a running process is not discontinued on client disconnect.
Reference to this property: http://msdn.microsoft.com/en-us/library/system.web.httpresponse.isclientconnected.aspx
update
FYI, this is a very bad property to rely on these days with sockets. I strongly encourage an approach that quickly completes the request, records the long-running task in some database or queue (probably RabbitMQ or something like that), and in turn uses socket.io or similar to update the web page or app once the task is completed.
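A minimal sketch of the notification side of that approach, using socket.io-client in the browser; the server URL and the "watchTask"/"taskComplete" event names are assumptions:

```typescript
// Minimal sketch (assumed setup): the server emits a hypothetical
// "taskComplete" event once the queued search has been cached.
import { io } from "socket.io-client";

const socket = io("https://example.com"); // assumed server URL

// Tell the server which task this page cares about (hypothetical event name).
socket.emit("watchTask", { taskId: "abc123" });

// When the background work finishes, load the results.
socket.on("taskComplete", (payload: { taskId: string }) => {
  console.log(`Task ${payload.taskId} finished; loading results...`);
  // e.g. window.location.href = `/results?taskId=${payload.taskId}`;
});
```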
How about not doing the async operation on an ASP.NET thread at all? Let the ASP.NET code call a service that queues the data search, then return to the browser with a token from the service; the browser then redirects to the result page that awaits the completed result. The result page will poll using the token from the service.
That way, you won't have to worry about whether or not ASP.NET will somehow learn that the browser has moved to a different page.
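A minimal sketch of the polling side in browser TypeScript, assuming a hypothetical /api/search-status endpoint that reports whether the cache table has been populated for a given token:

```typescript
// Minimal sketch: poll a hypothetical status endpoint with the token returned
// when the search was queued, then navigate to the results when ready.
async function waitForResults(token: string, intervalMs = 2000): Promise<void> {
  while (true) {
    const res = await fetch(`/api/search-status?token=${encodeURIComponent(token)}`);
    const status = await res.json(); // assumed shape: { done: boolean }
    if (status.done) {
      window.location.href = `/results?token=${encodeURIComponent(token)}`;
      return;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```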
Another option is to use Threading (System.Threading).
When the user sends the search criteria, the server begins processing the page request, creates a new Thread responsible for executing the search, and finishes the response, returning to the browser and redirecting to the results page while the thread continues to execute in the background on the server.
The results page would keep checking with the server whether the query execution has finished, as the started Thread shares its progress information. When it does finish, the results are returned on the next Ajax call made by the results page.
Using WebSockets could also be considered, in the sense that the web server itself could tell the browser when it is done processing the query execution, since WebSockets offer full-duplex communication channels.

JSON Asynchronous Application server?

First let me explain the data flow I need
Client connects and registers with server
Server sends initialization JSON to client
Client listens for JSON messages sent from the server
Now all of this is easy and straightforward to do manually, but I would like to leverage a server of some sort to handle all of the connection stuff, keep-alive, dead clients, etc. etc.
Is there some precedent set for doing this kind of thing? Where a client connects and receives JSON messages asynchronously from a server? Without doing manual socket programming?
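For reference, the described flow maps fairly directly onto the browser WebSocket API; a minimal sketch (the URL and message shapes are assumptions):

```typescript
// Minimal sketch of the flow described above, using the browser WebSocket API.
// The URL and message shapes are assumptions for illustration.
const socket = new WebSocket("wss://example.com/feed");

socket.addEventListener("open", () => {
  // Client connects and registers with the server.
  socket.send(JSON.stringify({ type: "register", clientId: "abc123" }));
});

socket.addEventListener("message", (event) => {
  const msg = JSON.parse(event.data);
  if (msg.type === "init") {
    // Server sends initialization JSON to the client.
    console.log("initialized with", msg.payload);
  } else {
    // Client listens for JSON messages pushed by the server.
    console.log("server pushed", msg);
  }
});
```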
A possible solution is known as Comet, which involves the client opening a connection to the server that stays open for a long time. Then the server can push data to the client as soon as it's available, and the client gets it almost instantly. Eventually the Comet connection times out, and another is created.
Not sure what language you're using but I've seen several of these for Java and Scala. Search for comet framework and your language name in Google, and you should find something.
In the 'good old times' that would be easy, since at the first connection the server gets the IP address of the client, so it could call back. So easy, in fact, that this is how FTP does it, for no good reason.... But now we can be almost certain that the client is behind some NAT, so you can't 'call back'.
Then you could just keep the TCP connection open, since it's bidirectional, and make the client wait for data to appear; the server would send whatever it wants whenever it can.... But now everybody wants every application to run on top of a web browser, and that means HTTP, which is a strictly 'request/response' protocol initiated by the client.
So, the current answer is Comet. Simply put, a JavaScript client sends a request, but the server doesn't answer for a looooong time. If the connection times out, the client immediately reopens it, so there's always one open pipe waiting for the server's response. That response will contain whatever message the server wants to send to the client, and only when it's pertinent. The client receives it, and immediately sends a new query to keep the channel open.
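A minimal sketch of the client side of that loop, assuming a hypothetical /comet endpoint that holds each request open until the server has something to say:

```typescript
// Minimal long-polling loop: each request stays open until the server has a
// message (or times out), then a new request is issued immediately.
// The /comet endpoint and message shape are assumptions for illustration.
async function cometLoop(handle: (msg: unknown) => void): Promise<void> {
  while (true) {
    try {
      const res = await fetch("/comet", { method: "GET" });
      if (res.ok) {
        handle(await res.json()); // server responded with a pertinent message
      }
    } catch {
      // timeout or network hiccup: brief pause, then reopen the pipe
      await new Promise((resolve) => setTimeout(resolve, 1000));
    }
  }
}

cometLoop((msg) => console.log("server pushed", msg));
```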
The problem is that HTTP is a request/response protocol. The server cannot send any data unless a request is submitted by the client.
Trying to circumvent this by making a request and then continuously sending back responses on the same, original request is flawed, as the behavior does not conform with HTTP and does not play well with all sorts of intermediaries (proxies, routers etc.) or with browser behavior (Ajax completion). It also doesn't scale well: keeping a socket open on the server is very resource intensive, and sockets are precious resources (ordinarily only a few thousand are available).
Trying to circumvent this by reversing the flow (i.e. the server connects to the client when it has something to push) is even more flawed because of the security/authentication problems that come with it (the response can easily be hijacked, repudiated or spoofed) and also because oftentimes the client is unreachable (it lies behind proxies or NAT devices).
AFAIK most RIA clients just poll on a timer. Not ideal, but this is how HTTP works.
GWT provides a framework for this kind of stuff & has integration with Comet (at least for Jetty). If you don't mind writing at least part of your JavaScript in Java, it might be the easier approach.