WebSockets server: reading data after handshake - google-chrome

I've managed to get the socket opened and the handshake done, and while that's all fun, I would now like to handle the data itself. The small snag is that, unlike the HTTP headers, which are pure ASCII, the content seems to be encoded:
ÅÅúÅ à›ÅÅ»öë∑âÅÅ«∆{UÅÅeæƒ$ÅÅvü
‡7ÅÅŸJêÏòÅÅ~}Z¥?ÅÅ9TÉHxÅÅ[ 1†ÅÅs óE2ÅÅ9\ÅyxÅÅ#´°ºbÅÅïôx ‘ÅÅ)Ÿ1–hÅÅ⁄}
That's what the server received from the Google Chrome client's
socket.send("A");
socket.send("A");
Just skimming the protocol definition, I didn't find anything about encoding besides base64, which this clearly isn't.
How should I handle the content serverside?
Edit: I have already looked at quite a few articles, but nearly all are about the client side.

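Data that is sent from the client to the server is masked (to keep misbehaving intermediaries from getting confused). It's a running XOR with a 4-byte masking key: the key is sent in the frame header, right after the payload length and immediately before the payload, and payload byte i is XORed with key byte i mod 4. It is described in section 5.3 of the spec (RFC 6455, "Client-to-Server Masking").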
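As a rough illustration of the unmasking step, here is a minimal Python sketch that parses one masked client frame. The function name parse_client_frame and the sample masking key are made up for this example, and the sketch assumes the whole frame is already in memory and ignores fragmentation and control frames.

```python
import struct


def parse_client_frame(frame: bytes) -> tuple:
    """Parse one masked client-to-server frame; return (opcode, payload)."""
    opcode = frame[0] & 0x0F          # low 4 bits of the first byte
    masked = bool(frame[1] & 0x80)    # high bit of the second byte
    length = frame[1] & 0x7F          # low 7 bits: length, or a 126/127 marker
    offset = 2
    if length == 126:                 # next 2 bytes hold the real length
        (length,) = struct.unpack_from("!H", frame, offset)
        offset += 2
    elif length == 127:               # next 8 bytes hold the real length
        (length,) = struct.unpack_from("!Q", frame, offset)
        offset += 8
    if not masked:
        raise ValueError("client frames must be masked")
    key = frame[offset:offset + 4]    # 4-byte masking key
    offset += 4
    data = frame[offset:offset + length]
    # Running XOR: payload byte i is XORed with key byte i mod 4.
    payload = bytes(b ^ key[i % 4] for i, b in enumerate(data))
    return opcode, payload


# Hypothetical 1-byte text frame carrying "A", with a made-up masking key.
key = bytes([0x12, 0x34, 0x56, 0x78])
frame = bytes([0x81, 0x81]) + key + bytes([ord("A") ^ key[0]])
print(parse_client_frame(frame))
```

Opcode 1 marks a text frame, so the payload bytes can then be decoded as UTF-8.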

Related

What are the pros and cons of Base64 file upload through JSON, as opposed to AJAX or jQuery upload?

I was tasked to write image upload to remote server and save those images locally. It was quite easy to do it with Base64 transfer through JSON and storing with Node.js. However, is there a reason to not use this type of file upload, to use AJAX or other ways? (Other than the 30% bandwidth increase, which I know of. You can still include that in your answer in order for it to be full).
The idea of base64 encoding is to avoid binary data in protocols based on text. Outside this situation, I think it's always a bad idea.
Pros
Avoidance of binary data for protocols based on text, and independence from external files.
Avoidance of delimiter collision.
Cons
Increased time and space cost; for space it's 33–36% (33% by the encoding itself, up to 3% more by the inserted line breaks).
API response payloads are larger/too large.
User experience is negatively impacted, unless one uses some lazy loading.
By including all image data together in one API response, the app
must receive all data before drawing anything on screen. This means
users will see on-screen loading states for longer and the app will
appear sluggish as users wait.
This can, however, be mitigated with Axios and a lazy loader such as react-lazyload or lazyload.
CDN caching is harder. Contrary to image files, the Base64 strings inside an API response cannot be delivered via a CDN cache. The whole API response must be delivered by CDN. (cf., Don’t use Base64 encoded images on mobile and Why "optimizing" your images with Base64 is almost always a bad idea)
Image caching on the device is no longer possible.
Content management becomes harder on server side. Most content management tools handle images as binary files. But then when managing in binary, there is the time overhead of encoding/decoding.
No security gain and overhead in engineering to mitigate (Sanitizing, Input Validation, Escaping). Example of XSS attack: Preventing XSS with Base64 encoding: The False sense of web application security
The developers of that site might have opted to make the website appear more secure by having cryptic URLs and whatnot. However, that
doesn't mean this is security by obscurity.
If their website is vulnerable to SQL injection and they try to hide that by encoding the URLs, then it's security by obscurity. If their website is well secured against SQL injection; XSS; CSRF; etc., and they decided to encode the URLs like that, then it's just plain stupidity.
It does not help with text-encoded images such as SVG (Probably Don’t Base64 SVG)
Data URIs aren't supported on IE6 or IE7, nor on Opera before 7.2 (Which browsers support data URIs and since which version?)
References
https://en.wikipedia.org/wiki/Base64
https://en.wikipedia.org/wiki/Delimiter#Delimiter_collision
SO: What is base 64 encoding used for?
https://medium.com/snapp-mobile/dont-use-base64-encoded-images-on-mobile-13ddeac89d7c
https://css-tricks.com/probably-dont-base64-svg/
https://security.stackexchange.com/questions/46362/purpose-of-using-base64-encoded-urls
https://bunnycdn.com/blog/why-optimizing-your-images-with-base64-is-almost-always-a-bad-idea/
https://www.davidbcalhoun.com/2011/when-to-base64-encode-images-and-when-not-to/
Data Encoding
Data can be encoded and decoded for various reasons, each with its own benefits and downsides.
For example:
Error-detection encodings: they can detect errors but increase data size.
Encryption encodings: they turn data into ciphertext that an intruder cannot decipher.
There are many encoding algorithms, and each alters the data in a way that is useful for some purpose.
Base64 encoding maps every 6 bits of data to one character (8 bits), so 3 bytes become 4 bytes. It uses only alphanumeric characters (62 distinct) and 2 signs.
Its benefit is that the output does not contain special characters and signs.
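To make the 3-bytes-to-4-characters mapping concrete, here is a small Python sketch that encodes one full 3-byte group by hand and compares it with the standard library. The helper name encode_three_bytes is invented for this illustration, and it deliberately skips padding of shorter groups.

```python
import base64

# The standard Base64 alphabet: 62 alphanumerics plus "+" and "/".
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_three_bytes(chunk: bytes) -> str:
    """Encode exactly 3 bytes (24 bits) as 4 Base64 characters (6 bits each)."""
    if len(chunk) != 3:
        raise ValueError("this sketch handles only full 3-byte groups")
    n = int.from_bytes(chunk, "big")  # the 24 bits as one integer
    indices = [(n >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


print(encode_three_bytes(b"Man"))                # hand-rolled
print(base64.b64encode(b"Man").decode("ascii"))  # standard library
```

Both lines print the same four characters, which is the whole of the encoding apart from the "=" padding used when the input length is not a multiple of 3.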
Base64 Purpose
It makes it possible to transfer any data over channels that prohibit:
special characters like ' " / \ ...
non-printable ASCII characters like \0 \n \r \t \a
8-bit codes (bytes with the most significant bit set to 1)
Binary files usually contain arbitrary data which, if interpreted as characters, can be any 8-bit value.
In some protocols and applications there are I/O interfaces that accept only a handful of characters (alphanumerics plus a few signs).
This is done to:
prevent code injection (e.g. SQL injection, or any characters that look like programming-language syntax, such as ;)
avoid characters that already have a meaning in the protocol (e.g. in a URI query string the character & has a meaning and cannot appear unescaped in a query string value)
restrict input that is not intended to accept non-alphanumeric values (e.g. it should accept only human names)
With Base64 encoding, however, you can encode anything and transfer it over
any channel you want.
Example:
you can encode an image or an application and save it in a DBMS with SQL
you can include some binary data in a URI (using the URL-safe variant, since the standard alphabet contains + and /)
you can send binary files over a protocol designed to accept only alphanumeric human chat, like an IRC channel
Base64 is just a conversion format. It is used because an HTTP server will not accept binary data in the body unless the Content-Type header declares a binary type or another format accepted by the web server.
As you might know, JSON can carry various formats and information; thus, you can send something such as
{
"IMG_FILENAME": "HELLO",
"IMG_TYPE": "image/jpeg",
"DATA": "~~~BASE64 ENCODED IMAGE~~~~"
}
You can send the JSON through AJAX or any other method. But, as I said, an HTTP server has various limitations because it must comply with RFC 2616 (https://www.rfc-editor.org/rfc/rfc2616).
In short, JSON can carry various kinds of data.
AJAX is just one way of sending it, like any other.
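A minimal Python sketch of that idea, building such a JSON document on the sending side and decoding it on the receiving side. The helper names build_upload and read_upload are hypothetical, and the image bytes are a short stand-in rather than a real file.

```python
import base64
import json


def build_upload(filename: str, mime_type: str, data: bytes) -> str:
    """Wrap binary data in a JSON document, Base64-encoding the payload."""
    return json.dumps({
        "IMG_FILENAME": filename,
        "IMG_TYPE": mime_type,
        "DATA": base64.b64encode(data).decode("ascii"),
    })


def read_upload(body: str):
    """Reverse of build_upload: return (filename, mime_type, raw bytes)."""
    doc = json.loads(body)
    return doc["IMG_FILENAME"], doc["IMG_TYPE"], base64.b64decode(doc["DATA"])


raw = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])  # stand-in for image bytes
body = build_upload("HELLO", "image/jpeg", raw)
print(body)
print(read_upload(body) == ("HELLO", "image/jpeg", raw))
```

The receiving side would normally also validate the declared type and the decoded size before writing anything to disk.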
I used the same solution in one of my projects.
The only concern is the request body size. If all your images are small, say a few MB, then you should be fine.
My server is ASP.NET Core; its maxAllowedContentLength value is 30000000, which is approximately 28.6 MB. When the image size is over this, the request fails with the error "request body too large".
I think Node.js should have a similar setting; make sure to adjust it to meet your needs.
Please note that as the request size grows, the chance of a request timeout increases accordingly due to the network traffic. This will be an issue especially for requests from phones.
I think the use of base64 is valid.
The only concern is the size of the request, but this can be worked around by splitting the base64 string in the frontend: for a 30 MB file you could send 5 MB per request and put the parts back together in the backend. This is also useful for resuming a transfer when a network problem corrupts some part.
Hugs
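The splitting idea could be sketched roughly like this in Python. The function names and the index-keyed dictionary of received parts are assumptions made for this example, not a particular framework's API.

```python
import base64


def split_into_parts(encoded: str, part_size: int) -> list:
    """Cut a Base64 string into consecutive parts of at most part_size chars."""
    return [encoded[i:i + part_size] for i in range(0, len(encoded), part_size)]


def reassemble(parts: dict, total: int) -> bytes:
    """Join parts keyed by index and decode; report any parts still missing."""
    missing = [i for i in range(total) if i not in parts]
    if missing:
        # The sender only has to resend these indices, not the whole file.
        raise ValueError(f"missing parts: {missing}")
    return base64.b64decode("".join(parts[i] for i in range(total)))


original = bytes(range(256)) * 40                  # stand-in for file contents
encoded = base64.b64encode(original).decode("ascii")
pieces = split_into_parts(encoded, 1000)
received = {i: p for i, p in enumerate(pieces)}    # as if each came in its own request
print(reassemble(received, len(pieces)) == original)
```

Because each part carries its index, a failed request can be retried on its own.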
Base64 converts your data to an ASCII representation of the binary data. It allows you to embed your data in text streams such as JSON for example. Base64 increases the size of the data transferred by 33%.
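The 33% figure follows from 4 output characters per 3 input bytes. A quick Python check, where encoded_size is a helper name invented here and the inputs are just zero-filled buffers:

```python
import base64


def encoded_size(n: int) -> int:
    """Length of padded Base64 output for n input bytes: 4 chars per 3 bytes."""
    return 4 * ((n + 2) // 3)


for n in (1, 3, 1000, 3_000_000):
    actual = len(base64.b64encode(bytes(n)))
    assert actual == encoded_size(n)
    print(n, actual, f"overhead {actual / n - 1:.1%}")
```

For tiny inputs the padding dominates; for anything of realistic size the overhead settles at about one third.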
multipart/form-data is the standard way of transferring binary data in HTTP requests. It allows you to use specific encodings / content types for each part you'd like to transfer. In my opinion, you should stick to multipart uploads unless you have specific requirements or device/SDK capabilities.
Check out these links
What is difference Between Base64 and Multipart?
Base64 image upload VS Binary image upload?

Are WebRTC data channel packets atomic?

I want to use a WebRTC data channel to exchange JSON messages between peers.
Can I safely assume that each JSON message arrives atomically at the remote end (unlike TCP, where packets may be split or chunked together), or do I need to implement something like a length prefix to know where one message ends and another begins?
I'm using a reliable channel and possibly a TCP TURN server, if that's relevant.
Yes, according to the WebRTC draft spec, whatever message you send() down a data channel should arrive in a single onmessage callback at the far end.
In real life, however, Chrome sometimes calls onmessage with a partial message when it runs out of buffers. If you keep your messages under 64 KB this seems not to happen.
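If you would rather not rely on that, the length-prefix idea from the question can be sketched as follows. This is a language-neutral illustration written in Python, not WebRTC API code; the 4-byte big-endian prefix and the names frame_message and MessageAssembler are assumptions of this example.

```python
import json
import struct


def frame_message(obj) -> bytes:
    """Serialize obj as JSON and prepend a 4-byte big-endian length."""
    body = json.dumps(obj).encode("utf-8")
    return struct.pack("!I", len(body)) + body


class MessageAssembler:
    """Collects chunks and yields complete messages, however they were split."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk: bytes) -> list:
        self._buffer.extend(chunk)
        messages = []
        while len(self._buffer) >= 4:
            (length,) = struct.unpack_from("!I", self._buffer, 0)
            if len(self._buffer) < 4 + length:
                break  # body not complete yet; wait for more data
            body = bytes(self._buffer[4:4 + length])
            del self._buffer[:4 + length]
            messages.append(json.loads(body.decode("utf-8")))
        return messages


wire = frame_message({"id": 1}) + frame_message({"id": 2})
assembler = MessageAssembler()
print(assembler.feed(wire[:7]))   # partial data: nothing complete yet
print(assembler.feed(wire[7:]))   # remainder: both messages come out
```

The same assembler copes with messages that arrive split or glued together, so it is harmless when delivery is in fact atomic.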

Sending continuous data over HTTP with Go

I am currently working on a web service in Go that essentially takes a request and sends back JSON, rather typical. However, this particular JSON takes 10+ seconds to actually complete and return. Because I am also making a website that depends on the JSON, and the JSON contents are subject to change, I implemented a route that quickly generates and returns (potentially updated or new) names as placeholders that would get replaced later by real values that correspond to the names. The whole idea behind that is the website would connect to the service, get back JSON almost immediately to populate a table, then wait until the actual data to fill in came back from the service.
This is where I encounter an issue, potentially because I am newish to Go and don't understand its vast libraries completely. The previous method that I used to send JSON back through the HTTP requests was ResponseWriter.Write(theJSON). However, Write() terminates the response, so the website would have to continually ping the service, which could be a problem now and will be disastrous in the future.
So, I am seeking some industry knowledge into my issue. Can HTTP connections be continuous like that, where data is sent piecewise through the same http request? Is that even a computationally or security smart feature, or are there better ways to do what I am proposing? Finally, does Go even support a feature like that, and how would I asynchronously handle it for performance optimization?
For the record, my website is using React.js.
I would use WebSockets over HTTPS to achieve this effect, rather than (or even in addition to) a long-lived TCP connection. See the golang.org/x/net/websocket package from the Go developers, or the excellent http://www.gorillatoolkit.org/pkg/websocket from the Gorilla web toolkit, for usage details. You might use padding and smaller subunits to allow a submission to be interrupted and restarted, or a kind of diff protocol to rewrite previously submitted JSON. I found WebSockets pretty stable even with brief connection breakdowns.
Go does have a keep-alive ability, net.TCPConn's SetKeepAlive; the tcpkeepalive package gives finer control:
kaConn, _ := tcpkeepalive.EnableKeepAlive(conn)
kaConn.SetKeepAliveIdle(30*time.Second)
kaConn.SetKeepAliveCount(4)
kaConn.SetKeepAliveInterval(5*time.Second)
Code from felixge
You can use a REST API as the web service and send the data as JSON, so you can continuously send data over a communication channel.

Why use XML (SOAP) when JSON is so simple and easy to handle?

Receiving and sending data with JSON is done with simple HTTP requests. Whereas in SOAP, we need to take care of a lot of things. Parsing XML is also, sometimes, hard. Even Facebook uses JSON in Graph API. I still wonder why one should still use SOAP? Is there any reason or area where SOAP is still a better option? (Despite the data format)
Also, in simple client-server apps (like Mobile apps connected with a server), can SOAP give any advantage over JSON?
I would be very thankful if someone could list the major differences between JSON and SOAP, considering the information I have provided (if there are any).
I found the following on advantages of SOAP:
There is one big reason everyone sticks with SOAP instead of using JSON. With every JSON setup, you're always coming up with your own data structure for each project. I don't mean how the data is encoded and passed, but how the data format is defined: the data model.
SOAP has an industry-mature way of specifying that data will be in a certain format: e.g. "Cart is a collection of Products and each Product can have these attributes, etc." A well put together WSDL document really has this nailed. See W3C specification: Web Services Description Language
JSON has similar ways of specifying this data structure — a JavaScript class comes to mind as the most common way of doing this — but a JavaScript class isn't really a data structure used for this purpose in any kind of agnostic, well established, widely used way.
In short, SOAP has a way of specifying the data structure in a maturely formatted document (WSDL). JSON doesn't have a standard way of doing this.
If you are creating a client application and your server implementation is done with SOAP then you have to use SOAP in client side.
Also, see: Why use SOAP over JSON and custom data format in an “ENTERPRISE” application? [closed]
Nowadays SOAP is a complete overkill, IMHO. It was nice to use it, nice to learn it, and it is beautiful we can use JSON now.
The only difference between SOAP and REST services (whether or not they use JSON) is that a SOAP web service always has its own WSDL document that can easily be transformed into self-descriptive documentation, while with REST you have to write the documentation yourself (at least to document the data structures). Here are my pros and cons for both:
REST
Pros
lightweight (in every sense: no server-side or client-side extensions needed, and no big chunks of XML need to be transferred back and forth)
free choice of the data format: it's up to you to decide whether to use plain text, JSON, XML, or even create your own data format
most current data formats (even XML, if used) ensure that only the really required amount of data is transferred over HTTP, while with SOAP you need 1 kB of XML junk for 5 bytes of data (exaggerated, of course, but you get the point)
Cons
even though there are tools that can generate documentation from docblock comments, such comments need to be written in a very descriptive way to achieve good documentation
SOAP
Pros
has a WSDL that can be generated from even basic docblock comments (in many languages even without them) and that works well as documentation
there are even tools that work with WSDL to give an enhanced try-this-request interface (I do not know of any such tool for REST)
strict data structure
Cons
strict data structure
uses XML (only!) for data transfers, so each request contains a lot of junk and the response contains five times more
needs external libraries (for client and/or server; nowadays such libraries are already a native part of many languages, yet people always tend to use third-party ones)
To conclude, I do not see a big reason to prefer SOAP over REST (and JSON). Both can do the same; there is native support for JSON encoding and decoding in almost every popular web programming language; and with JSON you have more freedom and the HTTP transfers are free of a lot of useless junk. If I were to build any API now, I would use REST with JSON.
I disagree a bit with the pro-JSON trend I see here. Although JSON is an order of magnitude easier, I'd venture to say it's quite limited. For example, a SOAP web service is not the end of the story. Between a SOAP client and server you now have enterprise service buses, authentication schemes based on crypto, user management, timestamping of requests and replies, etc. For all of this, there are some huge software platforms that provide services around SOAP (well, "web services") and will inject stuff into your XML. So although JSON is probably enough for small projects and an order of magnitude easier there, I think it becomes quite limited if you have decoupled transmission control from content (i.e. you develop the content side, the actual server, but all the transmission is managed by another team, the authentication by one more team, and deployment by yet another team). I don't know if my experience at a big corporation is relevant, but I'd say that JSON won't survive there. There are too many constraints on top of the basic need for data representation. So the problem is not JSON-RPC itself; the problem is that it lacks the additional tools to manage the complexity that arises in complex applications (not to say that what you do is not complex, it's just that the software reflects the complexity of the company that produces it).
I think there is a lot of basic misinformation on this thread. SOAP, REST, XML, and JSON concepts seem to be mixed up in the responses.
Here is some clarification -
XML and JSON (and others) are encodings of information.
SOAP is a communications protocol.
REST is an (architectural) style.
Each is used for something different, although you might use more than one of these things together.
Let's start with encoding data structures as XML vs JSON:
Everything JSON currently supports can be done in XML, but not the other way around. JSON will eventually adopt all the features that XML has, but its proponents haven't encountered all of the problems yet; once they get more experience, things will be added to close the gap. For example, JSON didn't start out with schemas and binary formats.
SOAP is a communication protocol for calling an operation. It runs on top of things like HTTP, SMTP, etc. Aside from many other features, SOAP messages can span multiple "application" layer protocols, i.e. I can send a SOAP message by HTTP to a service endpoint, which then puts it on a message queue for another system. SOAP solves the problem of maintaining authentication, message authenticity, etc. as the request moves between different parts of a distributed system.
JSON and other data formats can be sent via SOAP. I work with some systems that send binary fixed-width encoded objects via SOAP; it's not a problem.
The analogy is that if only the postman is allowed to send you a letter, then it is just HTTP, but if anyone can send you a letter, then you want SOAP (i.e. message transport security vs message content security).
The 6 REST constraints define an architectural style. Interestingly, for the first several years of REST the examples were in SOAP. (There is no such thing as REST versus SOAP; they are not opposites.)
A "heavyweight, bloated, etc." SOA SOAP system might have monoliths with operations like GET, PUT, POST on instances of a single entity. SOAP doesn't have those operations predefined, but that is typically how it is used.
Consider that if you built a "REST" service on HTTP alone with an SSL/TLS terminating proxy, then you may have violated the 4th constraint of REST.
So for your software development today, you wouldn't normally interact with any of these directly, just as, if you were writing a graphics program, you wouldn't typically work directly with HDMI vs. DisplayPort.
The question is do you understand architecturally what your system needs to do and configure it to use the mechanism that does that job. (for example, all the challenges of applying today's microservices to general systems are old problems previously solved by SOAP, CORBA and the old protocols)
I have spent several years writing SOAP web services (with JAX WS). They are not hard to write. And I love the idea of a single endpoint and single HTTP method (POST). For me, REST is too verbose.
But as a data container, JSON is simpler, smaller, more readable, more flexible, looks closer to programming languages.
So, I reinvented the wheel and created my own approach to writing backends for AJAX requests. In comparison:
REST:
get user: method GET https://example.com/users/{id}
update user: method POST https://example.com/users/ (JSON with User object in request body)
RPC:
get user: method GET https://example.com/getUser?id=1
update user: method POST https://example.com/updateUser (JSON with User object in the request body)
My way (the proposed name is JOH - JSON over HTTP):
get user: method POST https://example.com/ (JSON specifies both user ID and class/method responsible for handling request)
update user: method POST https://example.com/ (JSON specifies both user object and class/method responsible for handling request)
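A rough Python sketch of what such a single-endpoint dispatcher might look like. The request fields "method" and "params", the handler names, and the in-memory USERS table are all invented for this illustration; the answer does not specify its actual wire format.

```python
import json

HANDLERS = {}                            # maps "Class.method" names to functions
USERS = {1: {"id": 1, "name": "Alice"}}  # stand-in for real storage


def handler(name):
    """Decorator that registers a function under the name used in requests."""
    def register(func):
        HANDLERS[name] = func
        return func
    return register


@handler("User.get")
def get_user(params):
    return USERS[params["id"]]


@handler("User.update")
def update_user(params):
    USERS[params["user"]["id"]] = params["user"]
    return {"ok": True}


def dispatch(body: str) -> str:
    """Handle one POST body: look up the named handler and call it."""
    request = json.loads(body)
    func = HANDLERS.get(request.get("method"))
    if func is None:
        return json.dumps({"error": "unknown method"})
    return json.dumps({"result": func(request.get("params", {}))})


print(dispatch('{"method": "User.get", "params": {"id": 1}}'))
print(dispatch('{"method": "User.update", "params": {"user": {"id": 1, "name": "Bob"}}}'))
print(dispatch('{"method": "User.get", "params": {"id": 1}}'))
```

Every call goes to the same URL with POST, and the routing decision lives entirely in the JSON body.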

Check a webpage remotely to see if it's ASCII or binary

Is it possible to check remotely (no local/FTP access) a URL to see if the webpage (file) was uploaded/created as binary or ascii?
Thanks,
Roy.
Not sure of the value (or indeed, intent) of the question but it's quite possible to send an HTTP GET request to the web page and examine what comes back. If all the bytes are within the range 0x20 through 0x7e (plus the usual whitespace controls such as tab, line feed and carriage return), you can safely assume it's ASCII text. Any byte above 0x7f is not ASCII.
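A minimal Python sketch of that check. Allowing tab, line feed and carriage return alongside 0x20 through 0x7e is an assumption added here, since ordinary text pages contain line breaks; the function names are made up, and fetch_and_check needs network access and a URL of your own.

```python
import urllib.request


def looks_like_ascii_text(data: bytes) -> bool:
    """True if every byte is printable ASCII or a common whitespace control."""
    allowed_controls = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return
    return all(0x20 <= b <= 0x7E or b in allowed_controls for b in data)


def fetch_and_check(url: str) -> bool:
    """GET the URL and test the response body (not called in this example)."""
    with urllib.request.urlopen(url) as response:
        return looks_like_ascii_text(response.read())


print(looks_like_ascii_text(b"<html>\r\n<p>Hello</p>\r\n</html>"))
print(looks_like_ascii_text("caf\u00e9".encode("utf-8")))
print(looks_like_ascii_text(bytes([0x89, 0x50, 0x4E, 0x47])))
```

Note that this only describes the bytes the server sends now; it says nothing about the transfer mode used when the file was uploaded.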
Perhaps if your question indicated more on why you were trying to do this, we could help you out further.
If you want to figure out whether the page was FTP'ed to the server in ASCII or binary mode, it won't make any difference (and you won't be able to tell) if the server and the machine that uploaded it are both ASCII.
If you uploaded, for example, an EBCDIC file in binary mode to an ASCII server, that will be immediately obvious :-).