How do I signal to a web server that I'm posting gzipped data? - json

I have a client that will be posting large JSON files to an API server. Since the files are so compressible, I would like to gzip them and send the compressed data.
What I would like to know is: What is the best way to signal my intent to the server?
Basically, I want the reverse of Accept-Encoding, such that the server would treat the data as compressed only for transport purposes, and automatically decompress it before interpreting it according to the Content-Type.
This means that I can't set the Content-Type field to application/gzip, because it needs to be application/json so the server knows what the true uncompressed data format is.
Content-Transfer-Encoding looks like it would serve my purposes, but it was built with email in mind, and only supports 7bit, quoted-printable, base64, 8bit, and binary.
Does transparent compression/decompression for HTTP transport from client to server exist? And if not, what are the best practices for what I'm trying to achieve?

The "reverse" of the Accept-encoding header is Content-encoding. This signals to the server that the content is gzipped:
Content-encoding: gzip
You're correct that you shouldn't use the Content-type header for this, since the gzip compression is purely a matter of how the request is encoded, not what it represents.
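For illustration, a minimal Python sketch of the client side; the endpoint URL and payload are hypothetical, and it assumes the server actually decompresses request bodies labelled Content-Encoding: gzip (many frameworks need extra middleware for this):

    # A minimal sketch; the URL and payload are hypothetical.
    import gzip
    import json
    import urllib.request

    payload = {"example": "data"}
    body = gzip.compress(json.dumps(payload).encode("utf-8"))

    req = urllib.request.Request(
        "https://api.example.com/upload",        # hypothetical endpoint
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",  # what the data is
            "Content-Encoding": "gzip",          # how it is encoded for transport
        },
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.status)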

Related

Binarized JSON which looks good in browsers

I'm looking into an HTTP interface that returns (essentially) a JSON object.
When I access the URL with Chrome or Firefox, the JSON data is shown with appropriate indents. However, when I download it with curl etc., the data is binary.
I think the browsers know this binary encoding method and show it in a pretty format. (If I save it to a file from the browser, it is a text file with the indents.)
What do you think this binary encoding is?
(Unfortunately, I can not upload the binary data here...)
[SOLVED]
Browsers send requests with certain headers, but curl doesn't send them by default. That is why I get different responses from the two methods: my API returns binarized (compressed) JSON when called without those headers.
You should have a look at the headers of the HTTP response message that contains the binary data. There should be values describing the encoding, content type and compression.
With these values you can decode the binary data.
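As a rough sketch of that approach (the URL is hypothetical), you can read the relevant headers and gunzip the body yourself when Content-Encoding says gzip:

    # Inspect the response headers and decompress manually if needed.
    import gzip
    import urllib.request

    with urllib.request.urlopen("https://api.example.com/data") as resp:  # hypothetical URL
        raw = resp.read()
        content_type = resp.headers.get("Content-Type")
        encoding = resp.headers.get("Content-Encoding", "")
        print(content_type, encoding)
        body = gzip.decompress(raw) if encoding == "gzip" else raw
        print(body.decode("utf-8"))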

technical inquiry - HTML transfer of image from server to browser

When an image is uploaded from the client's machine to the browser, it requires the FileReader() API in HTML; thereafter a base64-encoded URL (say) of the image is sent in chunks to the server, where it needs to be re-assembled. All of this is handled by the developer.
However, when an image is sent from the server to the client, only the directory path of the image on the server machine suffices; no chunking or encoding is required.
My questions are:
1. Does the server send the image in chunks to the HTML file? If it does not, how does sending full images not bottleneck the server's network? What would happen in the case of large video files?
2. In what form of binary data is the image sent to the client - base64url / ArrayBuffer / binary string / text / etc.?
3. If the server does send the image in chunks, who is doing the chunking and the re-assembly on the client thereafter?
Thanks.
HTML isn't really important here. What you care about are the transport protocols used - HTTP and TCP, most likely.
HTTP isn't chunked by default, though there are advanced headers that do allow that - those are mostly used to support seeking in large files (e.g. PDF, video). Technically, this isn't really chunking - it's just the infrastructure for allowing partial downloads (i.e. "Give me data from byte 1024 to byte 2048."); see the Range request sketch after this answer.
TCP is a stream-based protocol. From the programmer's point of view, that's all there is to it - no chunking. Technically, though, it will process your input data and send it as distinct packets that are reassembled in order on the other side. This is a matter of practicality - it allows you to manage data latency, streaming, packet retransmission etc. TCP handles the specifics during connection negotiation - flow control, window sizing, congestion control...
That's not the end of it, though. All the other layers add their own bits - their own ways to package the payload and split it as necessary, their own ways to handle routing and filtering, their own ways to handle congestion...
Finally, just like HTTP natively supports downloads, it supports uploading data as well. Just send an HTTP request (usually POST or PUT) and write data in a format the server understands - anything from text through base-64 to raw binary data. The limiting factor in your case isn't the server, the browser or HTTP - it's JavaScript. The basic mechanism is still the same - a request followed by a response.
Now, to address your questions:
The server doesn't send images to the HTML file. HTML only contains a URL of the image[1], and when the browser sees a URL in the img tag, it will initiate a new, separate request just for the image data. It isn't fundamentally different from downloading a file from a link. As for the transfer itself, it works pretty much exactly the same way as for the original HTML document - HTTP headers, then the payload.
Usually, raw binary data. HTTP is a text-based protocol, but its payload can be arbitrary. There's little reason to use Base64 to transfer image data (though in the past, there have been HTTP and FTP servers that didn't support binary at all, so you had to use something like Base64).
The HTTP server doesn't care (with the exception of "partial downloads" mentioned above). The underlying network protocols handle this.
[1] Nowadays, there are methods to embed images directly in the HTML text, but their practicality varies depending on the image size, caching requirements etc.
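As a small illustration of the partial-download mechanism mentioned above (the URL is hypothetical, and the server must support Range requests):

    # Request only bytes 1024-2048 of a resource via an HTTP Range header.
    import urllib.request

    req = urllib.request.Request(
        "https://www.example.com/large-image.jpg",   # hypothetical URL
        headers={"Range": "bytes=1024-2048"},
    )
    with urllib.request.urlopen(req) as resp:
        # Expect 206 Partial Content and a Content-Range header if honoured.
        print(resp.status, resp.headers.get("Content-Range"))
        print(len(resp.read()), "bytes received")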

How to set response Content-Length to infinite

I'm trying to create an application in the web2py framework. By default the web2py server uses the Transfer-Encoding: Chunked header for responses, but in that case, when the target remote web application sends a GET request to my app, it only gets the first line of text from the requested page (from the file's content displayed on the page). If I use Content-Length instead, for example with a value of 1000, it will get 1000 bytes of data from the page... But if I expect to respond with a huge amount of data, how do I set the Content-Length parameter to infinity or derive it from the file (like here, but with web2py syntax instead of PHP)?
If you are serving web2py via the built-in development server and serving files via the Expose functionality, then the files will be served via chunked transfer encoding.
However, if you instead use response.stream to serve files, the Content-Length header will be set automatically.
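A hedged controller sketch of that second approach; the file path is made up, and the chunk_size argument is an assumption that may vary between web2py versions:

    # web2py controller sketch: response.stream sets Content-Length from the
    # file size and streams the file in chunks, so no "infinite" length is needed.
    import os

    def download():
        path = os.path.join(request.folder, 'private', 'bigfile.dat')  # hypothetical file
        return response.stream(open(path, 'rb'), chunk_size=4096)      # chunk_size is an assumption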

Decoding charset of JSON response

I'm using the Dev HTTP Client Chrome extension to verify a RESTful URL so I can build a C# application that can consume it. I have trouble with encoding: when the response is shown inside the plugin/browser the encoding is not proper, but when I download it with that same plugin and open the file with Notepad++ the encoding is fine. I'm having the same problem with my C# application when reading the JSON response from that web service. I also used restclient-ui-3.1 to check the data, but it behaves the same as the Chrome plugin, meaning it displays wrong characters in its response body tab.
Obviously the web service is sending properly encoded data, but I can't manage to read it correctly on the client side. Any hints?
Dev HTTP Client uses the Content-Encoding and Content-Type HTTP headers to determine how the response body is encoded (e.g. Content-Encoding: gzip) and what representation it contains (e.g. Content-Type: application/json; charset=utf-8).
It seems that your web service is not sending proper HTTP headers. Check them.
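For illustration only (plain Python rather than C#, with a made-up URL), the client-side idea is to decode the body using the charset declared in Content-Type, falling back to UTF-8:

    # Decode the response body according to the declared charset.
    import urllib.request

    with urllib.request.urlopen("https://api.example.com/items") as resp:  # hypothetical URL
        charset = resp.headers.get_content_charset() or "utf-8"
        text = resp.read().decode(charset)
        print(resp.headers.get("Content-Type"), "->", text[:200])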

JSON REST Service: Content-Encoding: gzip

I am writing some code to interface with a service that I do not have yet, so I am writing a simulator to attempt to de-risk some of the problems I might run into when I get a chance to integrate against the real system. The interface is basically a REST style interface that returns JSON formatted strings.
The interface specification says the JSON formatted response is returned in lieu of the standard HTTP body. It also says that responses from the server will be zlib compressed and have the "Content-Encoding: gzip" set in the header.
So I created a WCF service that provides a REST interface that returns a JSON-formatted string. I now need to deal with the compression portion of the equation. To satisfy the Content-Encoding: gzip requirement, do I simply gzip the JSON string I created and return that rather than the string? Or is it more involved than that? Let me know if there is any other information needed here, as I am still a newbie when it comes to REST/HTTP.
Thanks so much for your time.
You're correct. Just gzip the JSON string and return it.
The best reference for any REST implementation is the HTTP/1.1 RFC: https://www.rfc-editor.org/rfc/rfc2616
In short: yes, it's as simple as that. The response body just needs to be the gzip-compressed version of the normal response body.
This question may have some useful information for setting up your service.
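For what it's worth, here is a minimal sketch of that contract in plain Python (not WCF, and the port is arbitrary): gzip the JSON body, set Content-Encoding: gzip, and keep Content-Type as application/json:

    # Minimal server sketch: return gzip-compressed JSON with the right headers.
    import gzip
    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            body = gzip.compress(json.dumps({"status": "ok"}).encode("utf-8"))
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Encoding", "gzip")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8080), Handler).serve_forever()  # arbitrary port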