Precompressed Brotli HTML file not being served correctly by Google Cloud Storage - html

I'm hosting a single-page static website on Google Cloud and I'm trying to implement text compression for my index.html file.
What I've done so far is copy all my minified HTML from index.html, compress it with Brotli using an online converter, and save the Brotli output as index2.html in my bucket. Finally, I set the Content-Encoding metadata value of index2.html to br.
However, contrary to my expectation, I only see a blank page in Chrome and a "Content Encoding Error" in Firefox when I go to the www.mysite.com/index2.html address.
I also did the same procedure with gzip compression, setting the Content-Encoding to gzip, but the results were the same. I used the following instructions by Google, but they don't seem very comprehensive.
What am I doing wrong?
P.S. I am using HTTPS with a valid SSL certificate. I also checked in my browser, and the server sent a header that includes gzip and br in the content-encoding field.
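For reference, here is a minimal command-line sketch of the same workflow using the brotli CLI and gsutil, which avoids pasting binary Brotli output through a text converter (${BUCKET} is a placeholder for your bucket name; the Content-Type header is assumed from the file being HTML):
# Compress the minified HTML locally (brotli output is binary, not text)
brotli index.html --output=index2.html
# Upload, setting both headers in one step
gsutil -h "Content-Type:text/html" \
  -h "Content-Encoding:br" \
  cp index2.html gs://${BUCKET}/index2.html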

It almost (!) works for me, but not how I expect it to, and I'm unclear why.
I have a (JPEG) file ${IMAGE} in a bucket that's hosting a website.
Copy the image locally
brotli it
Copy the brotli'd image back to the bucket
Set its metadata
Browse it.
gsutil cp gs://${BUCKET}/images/${IMAGE} ${PWD}
brotli ${IMAGE} --output=brotli.jpg
gsutil cp brotli.jpg gs://${BUCKET}/images
gsutil setmeta \
-h "Content-Type:image/jpeg" \
-h "Content-Encoding:br" \
gs://${BUCKET}/images/brotli.jpg
gsutil stat gs://${BUCKET}/images/brotli.jpg
Content-Encoding: br
Content-Type: image/jpeg
If I browse the site directly in Chrome, it fails (canceled, no response code):
ERR_CONTENT_DECODING_FAILED
If I browse the GCS public URL, it works (200):
https://storage.googleapis.com/${BUCKET}/images/brotli.jpg
And:
https://storage.cloud.google.com/${BUCKET}/images/brotli.jpg
If I use gzip rather than brotli, both work as expected.
For some reason, I'm unable to browse a brotli-compressed file as part of the static site, even though it's definitely present and I can reach the URL via other means.
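One way to narrow this down is to compare the response headers from the two endpoints with curl (a sketch; www.mysite.example stands in for the static site's actual host, and the Accept-Encoding header mimics what a browser sends):
# Direct GCS URL (works)
curl -sI -H 'Accept-Encoding: gzip, br' \
  "https://storage.googleapis.com/${BUCKET}/images/brotli.jpg" | grep -i '^content-'
# Static-site endpoint (fails); hypothetical host name
curl -sI -H 'Accept-Encoding: gzip, br' \
  "https://www.mysite.example/images/brotli.jpg" | grep -i '^content-'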

Related

How can I download a file instead of viewing it in Google Cloud Storage using HTML

I want to download files from Google Cloud Storage instead of viewing them in the browser, using HTML.
<a href="https://storage.googleapis.com/test/test.pdf" download>download</a>
I'm not able to download the PDF; it opens in the browser instead.
You need to update the Content-Type metadata of your file to application/octet-stream:
gsutil setmeta -h "Content-Type:application/octet-stream" \
  gs://test/test.pdf
This way, the browser can't detect the file type and open the PDF viewer. Instead, it will prompt you to save the file on your computer.
Note: if there is a way to do something similar in HTML, that could be good, but I don't know; I'm not good at frontend dev.
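For completeness, a sketch of another server-side option: GCS also supports a Content-Disposition metadata field, which should make the browser offer a download without changing the Content-Type (same setmeta usage as above; the filename parameter is optional):
gsutil setmeta -h "Content-Disposition:attachment; filename=test.pdf" \
  gs://test/test.pdf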

How to remove the .html extension in URLs on a Google Cloud Platform (storage bucket) hosted site? [duplicate]

I'd like to let website users load userProfile.html when they request www.website.com/userProfile (without .html).
There is nothing about that in the docs.
You can serve HTML content without a .html suffix. It's important to set the Content-Type, though. For example, with gsutil:
gsutil -h "Content-Type:text/html" cp /path/users/userProfile \
gs://www.website.com/userProfile
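To confirm the upload took the right metadata, you can inspect the object afterwards (a sketch using gsutil stat):
gsutil stat gs://www.website.com/userProfile
# Expect the output to include:
# Content-Type: text/html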

How to get the HTTP status from a website running on the Play Framework

Please forgive my ignorance. I don't know anything about the Play Framework or the MVC design pattern, so please bear with me.
I need to write a shell script to check whether our service is still up. I had planned to use curl to get the HTTP response code and check whether it's a 200 or a 404. We are using the Play Framework and designing the site with the MVC design pattern. I know nothing about these two things, so that might be adding to my confusion.
Anyway, the website is up because I checked in Chrome and FF and it's up and working. I ran this curl command, which should give me a 200:
curl -I http://oursite.com:8080
HTTP/1.1 404 Not Found
Content-Type: text/html; charset=utf-8
Content-Length: 1900
As a return code in the header, I get a 404. I don't know much about web technology, but I read up on 404 status codes on Wikipedia, and it seems like it's looking for an index.html page (or whatever the requested page is). I found a /dir/of/play/deployment/public directory on our server. I blindly created an index.html in it, but it didn't help. I still got a 404 when I was expecting a 200.
I also tried using wget. wget gives me a 200, but it downloads the contents of the main page to a file called index.html in my local account. I ran:
wget http://oursite.com:8080
--2014-01-29 19:42:42-- http://oursite.com:8080/
Resolving oursite.com... 10.64.8.76
Connecting to oursite.com|10.64.8.76|:8080... connected.
HTTP request sent, awaiting response... 200 OK
Length: 27181 (27K) [text/html]
Saving to: “index.html”
I also ran another wget command, but this one gave me a 404. I used the spider option so as not to download the file locally:
wget --spider http://oursite.com:8080
Spider mode enabled. Check if remote file exists.
--2014-01-29 19:42:42-- http://oursite.com:8080/
Resolving oursite.com... 10.64.8.76
Connecting to oursite.com|10.64.8.76|:8080... connected.
HTTP request sent, awaiting response... 404 Not Found
Remote file does not exist -- broken link!!!
The output of the last command says there's a remote file that it's looking for but can't find. However, the site was clearly sending some content when wget wasn't in spider mode, given the index.html page created in my local directory.
So the question is: what is this page that curl and "wget --spider" are looking for, and why did plain wget "find" this file and download its contents? Am I approaching this problem correctly (using curl or wget to see if the web service is up and running), or am I attacking it wrong and should I use something else to check?
Thanks in advance for your help.
I suspect this is due to:
https://github.com/playframework/playframework/issues/2280
curl -I sends a HEAD request; you'll need to make a regular GET call instead, with -i:
curl -is http://myapp.com | awk '/HTTP\/1.1 (.*) (.*?)/ {print $2}'
This should return 200 if all is working.
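For a shell script, a simpler way to capture just the numeric status code is curl's --write-out option (a sketch using the host from the question; still a GET, so it sidesteps the HEAD issue):
# -o /dev/null discards the body; -w prints only the status code
STATUS=$(curl -s -o /dev/null -w '%{http_code}' http://oursite.com:8080/)
if [ "$STATUS" -eq 200 ]; then
  echo "service is up"
else
  echo "service returned $STATUS"
fi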
One tip: in the Chrome developer tools, in the Network tab, you can right-click a request and select 'Copy as cURL'. You could compare that request with your handwritten curl request.

--no-clobber still overwrites the file if --html-extension is used in wget?

I have a script for downloading all of my Chrome bookmarks. I use wget with --html-extension because some of the bookmarks end in .php and can't be opened by a web browser unless the --html-extension option is used. The problem I'm having is that when I use --html-extension together with --no-clobber, it doesn't recognize that most of the files are already there, so it goes through the whole process of re-downloading things it already has.
An example:
wget -nc http://www.test.com/
Run once, this saves the file as it is supposed to. If you run it again, it says the file is already there, so it is not retrieved. That is the behavior I would expect.
However, delete the file that was just saved and run:
wget -nc http://www.test.com/ --html-extension
and then run that same command again. It overwrites the file instead of saying the file is already there. What is going on?
When the html suffix is added, wget can't tell what remote file you want to compare it to.
man wget: http://unixhelp.ed.ac.uk/CGI/man-cgi?wget
======================
--html-extension
If a file of type application/xhtml+xml or text/html is downloaded and the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this option will cause the suffix .html to be appended to the local filename. This is useful, for instance, when you're mirroring a remote site that uses .asp pages, but you want the mirrored pages to be viewable on your stock Apache server. Another good use for this is when you're downloading CGI-generated materials. A URL like http://site.com/article.cgi?25 will be saved as article.cgi?25.html.
Note that filenames changed in this way will be re-downloaded every time you re-mirror a site, because Wget can't tell that the local X.html file corresponds to remote URL X (since it doesn't yet know that the URL produces output of type text/html or application/xhtml+xml). To prevent this re-downloading, you must use -k and -K so that the original version of the file will be saved as X.orig.
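Following the man page's advice, a sketch of the workaround with the test URL from the question (-k is --convert-links and -K is --backup-converted, which keeps the original as X.orig so wget can match it against the remote file; whether this fully fixes the --no-clobber interaction may depend on your wget version):
wget -nc --html-extension -k -K http://www.test.com/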

Google Translate TTS problem

I'm testing with a simple HTML file, which contains:
<audio src="http://translate.google.com/translate_tts?tl=en&q=A+simple_text+to+voice+demonstration." controls autoplay></audio>
with Chrome v11.0.696.68 and FF v4.0.1. I'm going through a proxy server and it doesn't work. Nothing gets played, and clicking the play button does nothing in Chrome. In FF, it flashes and then shows an 'X' over the control. The error logs don't show anything.
So I've broken down the steps:
Typing the URL into either browser works
wget -q -U Mozilla -O /tmp/tts.mp3 "http://translate.google.com/translate_tts?tl=en&q=Welcome+to+our+fantastic+text+to+voice+demonstration." gets me a file that plays fine on both browsers.
If I serve this file from my local web server (i.e. one that doesn't go through the proxy), it works fine: src="http://localhost/tts.mp3"
I'm stumped. If the proxy were the problem then wget and address bar access shouldn't work. If the src being a URL were the problem then it shouldn't work from my local server.
Any clues or suggestions?
The reason this isn't working is most likely that translate.google.com restricts certain types of requests to keep the service from being overloaded. For instance, if you use wget without the "-U Mozilla" user-agent option, you will get an HTTP 404, because the service rejects wget's default user-agent string.
In your case, it looks like translate.google.com returns an HTTP 404 if an HTTP Referer header is included in the request. When you run wget from the command line, there is no referrer. When you use the audio tag from within a web page, an HTTP Referer header is sent when requesting the translation. I just tried the following and got a 404:
wget --referer="http://foo.com" -U Mozilla -O /tmp/tts.mp3 "http://translate.google.com/translate_tts?tl=en&q=Welcome+to+our+fantastic+text+to+voice+demonstration."
However, if you take the --referer option out, it works.
The service is working here (11-NOV-2011) but is limited to 100 characters. You can split your text into 100-character chunks, download the MP3 result for each chunk, and then join the chunks into the final MP3 file.
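A rough sketch of that chunk-and-join approach in shell (the splitting, encoding, and file naming here are my own; it assumes the endpoint still behaves as described, and relies on MP3 frames being simple enough to concatenate):
#!/bin/sh
# Split the text into chunks of at most 100 characters, breaking at spaces
TEXT="Welcome to our fantastic text to voice demonstration."
i=0
printf '%s\n' "$TEXT" | fold -s -w 100 | while IFS= read -r chunk; do
  i=$((i + 1))
  # Crude URL encoding: spaces become '+' (sufficient for plain ASCII text)
  q=$(printf '%s' "$chunk" | tr ' ' '+')
  wget -q -U Mozilla -O "$(printf 'part_%03d.mp3' "$i")" \
    "http://translate.google.com/translate_tts?tl=en&q=$q"
done
# MP3 frames are self-contained, so plain concatenation yields a playable file
cat part_*.mp3 > speech.mp3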