Why use protocol-relative URLs at all? - html

What this means has been an oft-discussed question on Stack Overflow:
<script src="//cdn.example.com/somewhere/something.js"></script>
The advantage is that if the page is being accessed over HTTPS, the asset is loaded over HTTPS automatically, instead of triggering that scary "Insecure elements on this page" warning.
But why use protocol-relative URLs at all? Why not simply use HTTPS always in CDN URLs? After all, an HTTP page has no reason to complain if you decide to load some parts of it over HTTPS.
(This is more specifically about CDNs; almost all CDNs have HTTPS capability, whereas your own server may not necessarily have HTTPS.)

As of December 2014, Paul Irish's blog on protocol-relative URLs says:
2014.12.17: Now that SSL is encouraged for everyone and doesn’t have performance concerns, this technique is now an anti-pattern. If the asset you need is available on SSL, then always use the https:// asset.
Unless you have specific performance concerns (such as the slow mobile network mentioned in Zakjan's answer) you should use https:// to protect your users.

Because of performance. Establishing an HTTPS connection takes much longer than HTTP: the TLS handshake adds up to 2 RTTs of latency. You can notice it on mobile networks. So it is better not to use HTTPS asset URLs if you don't need them.

There are a number of potential reasons, though none of them is particularly crucial:
How about the next time every business with an agenda pushes a new protocol? Are we going to have to swap out thousands of strings again then? No thanks.
HTTPS is slower than HTTP of the same version.
Any of the notes listed at caniuse.com for HTTP/2 might be a problem.
Conceptually, if the server enforces the protocol, there is no reason to be specific about it in the first place. Agnosticism is what it is: it covers all your bases.

One thing to note: if you are using CSP's upgrade-insecure-requests, you can safely use protocol-agnostic URLs (//example.com).
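The directive can be delivered as a response header or, as a rough sketch, in the page markup itself (cdn.example.com is a placeholder):
<!-- The meta form asks the browser to upgrade this page's insecure (http://)
     subresource requests to https:// before they are sent -->
<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">
<script src="//cdn.example.com/somewhere/something.js"></script>
Browsers that don't understand the directive simply ignore it, and the protocol-relative URL then just follows the page's scheme as before.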

Protocol-relative URLs sometimes break JS code that tries to detect location.protocol. They are also not understood by extremely old browsers. If you are developing web services that require maximum backward compatibility (e.g. serving crucial emergency information that can be received/sent on slow connections and/or old devices), do not use PRURLs.
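As a hypothetical illustration of the first point (not from the question itself), detection code like the following assumes a fixed mapping between the page's protocol and its assets, and protocol-relative references behave surprisingly outside http/https, for example when a page is opened from the local filesystem:
<script>
  // Hypothetical sketch: some scripts branch on the page's protocol.
  if (location.protocol === 'https:') {
    // expect every asset to be https://
  } else {
    // assume plain http://
  }
  // A protocol-relative src such as //cdn.example.com/lib.js inherits whatever
  // scheme the page uses; opened via file://, it resolves to file://cdn.example.com/lib.js
  // and the request fails, which naive protocol detection won't anticipate.
</script>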

Related

User-submitted images with http:// URLs are causing browsers to warn that the page is not secure

I'm working for a forum owner who allows users to submit hotlinked images from other domains in their posts. If they choose to use an http version of the URL, the otherwise clean page becomes insecure in the eyes of a browser, which some percentage of the time triggers a worried email from certain users.
I can't rewrite the URLs, since I can't code against the assumption that future off-site images will have HTTPS available. For the same reason, I can't use protocol-relative src attributes. I'm unwilling to fetch and cache the images on our server just so that they can be served over HTTPS, because of the computational expense involved.
What can I do? Is there some piece of HTML syntax or similar that I can use to tell the browser "This image doesn't matter, and doesn't constitute a security threat"?
This isn't possible. The image may not constitute a security threat in itself, but MITM attacks could still lead to images other than the intended one being loaded over the network, and who knows what an attacker might want to replace that image with. My suggestion would be to pass the annoyance on to your users and tell them they can only use https:// URLs.

What are alternatives to HTTP authentication?

My site uses HTTP authentication and I've learned it isn't very secure and it causes a lot of problems for many browsers, and not all browsers may support it, so I want to use an alternative that is secure and more widely supported; what are some alternatives?
Is it possible to lock all directories using an HTML login page?
My site uses HTTP authentication and I've learned it isn't very secure
That's false... unless you're referring to something like basic auth over an insecure channel. In that case, anything over the insecure channel has potential issues. (Even if you did some client-side encryption hackery, you still have the problem that the remote host is not verified without the TLS or SSL layer.)
Basic auth is fine in some cases, and not for others. It depends on what you're trying to do.
it causes a lot of problems for many browsers, and not all browsers may support it
Completely false. I've never seen a browser that didn't support basic auth and digest auth.
what are some alternatives?
This isn't possible to answer without a better understanding of your requirements. Two-factor auth with a DNA sample and a brainwave scan might be more secure but chances are that's not what you're looking for. Besides, you can't forget about the rest of your system and you've told us nothing about that.
Is it possible to lock all directories using an HTML login page?
Yes. How you do this depends on what you're running server-side, but yes it's completely possible and often done.
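On the markup side it is just a form posted over HTTPS; the /login action and field names below are placeholders, and the actual protection comes from the server-side session check on every request to the locked directories:
<!-- Sketch of a form-based login page; the action URL and the session
     handling behind it are placeholders for whatever your stack provides. -->
<form method="post" action="https://www.example.com/login">
  <label>Username <input type="text" name="username" autocomplete="username"></label>
  <label>Password <input type="password" name="password" autocomplete="current-password"></label>
  <button type="submit">Log in</button>
</form>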

Is loading scripts or other resources via HTTPS on a HTTP page problematic?

I'm aware of protocol-relative URLs, which are usually the right solution for serving scripts or other resources on pages that may be loaded using HTTP or HTTPS.
However, I have a script that I would like to always serve via HTTPS, even when the page it's being loaded onto is served via HTTP. Leaving the obvious potential security issues around mixing HTTP and HTTPS content aside (namely, that a MITM attack on some script served via HTTP could theoretically be used to inject exploit code used to read stuff from the script served via HTTPS), is this a bad idea for any other reason? For example, will this cause mixed content warnings in any old versions of IE?
Nope! At least, not on any browsers that remain in popular use.
Paul Irish (one of the developers of Google Chrome, and a modestly notable programming blogger and open-source contributor) has this advice to give in a 2014 update to his 2010 blog post, The Protocol-relative URL (emphasis from the original):
Now that SSL is encouraged for everyone and doesn’t have performance concerns, this technique is now an anti-pattern. If the asset you need is available on SSL, then always use the https:// asset.
Allowing the snippet to request over HTTP opens the door for attacks like the recent Github Man-on-the-side attack. It’s always safe to request HTTPS assets even if your site is on HTTP, however the reverse is not true.
More guidance and details in Eric Mills’ guide to CDNs & HTTPS.
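In practice that just means writing the scheme out instead of the protocol-relative form from the question (same placeholder URL as above):
<script src="https://cdn.example.com/somewhere/something.js"></script>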
If Paul Irish says that requesting HTTPS assets on a HTTP page is fine, then that's good enough for me.

Make full site HTTPS / SSL? What performance / SEO issues & best practices still apply in 2012? [closed]

Note: There are existing questions that look like duplicates (linked below), but most of them are from a few years ago. I'd like to get a clear and definitive answer that proves things either way.
Is making an entire website run in HTTPS not an issue today from a best practice and performance / SEO perspective?
UPDATE: I'm looking for more information with sources, especially around the impact on SEO. Bounty added.
Context:
The conversation came up when we wanted to introduce some buttons that spawn lightboxes with forms in them that collect personal information (some of them even allow users to log in). This is on pages that make up a big portion of the site. Since the forms would need to collect and submit information securely and the forms are not on pages of their own, the easiest way we could see to make this possible was to make the pages themselves HTTPS.
What I would like is for an answer that covers issues with switching a long running popular site to HTTPS such as the ones listed below:
Would a handshake be negotiated on every request?
Will all assets need to be encrypted?
Would browsers not cache HTTPS content, including assets?
Are downstream transparent proxies not caching HTTPS content, including assets (CSS, JS, etc.), still an issue?
Would all external assets (tracking pixels, videos, etc.) need to have an HTTPS version?
HTTPS and gzip might not be happy together?
Backlinks and organic links will always be HTTP, so you will be 301'ing all the time; does this impact SEO / performance? Any other SEO impact of changing this sitewide?
There's a move with some of the big players to always run HTTPS, see Always on SSL, is this setting a precedent / best practice?
Duplicate / related questions:
Good practice or bad practice to force entire site to HTTPS?
Using SSL Across Entire Site
SSL on entire site or just part of it?
Not sure I can answer all points in one go with references, but here goes. Please edit as appropriate:
Would a handshake be negotiated on every request?
No, SSL connections are typically reused for a number of consecutive requests. The overhead once associated with SSL is mostly gone these days. Computers have also gotten a lot faster.
Will all assets need to be encrypted?
Yes, otherwise the browser will not consider the entire site secure.
Would browsers not cache HTTPS content, including assets?
I do not think so, caching should work just fine.
Are downstream transparent proxies not caching HTTPS content, including assets (CSS, JS, etc.), still an issue?
For the proxy to cache SSL-encrypted connections/assets, the proxy would need to decrypt the connection. That largely negates the advantage of SSL. So yes, transparent proxies will not cache HTTPS content.
It is possible for a proxy to be an SSL endpoint to both client and server, so that it has separate SSL sessions with each and can see the plaintext being transmitted. One SSL connection would be between the proxy and the server; the proxy and the client would have a separate SSL connection signed with the certificate of the proxy. That requires that the client trusts the proxy's certificate and that the proxy trusts the server's certificate. This is sometimes set up in corporate environments.
Would all external assets (tracking pixels, videos, etc.) need to have an HTTPS version?
Yes.
HTTPS and gzip might not be happy together?
They operate at different protocol levels, so it should be fine. gzip is negotiated after the SSL layer is put over the TCP stream. For reasonably well-behaved servers and clients there should be no problems.
Backlinks and organic links will always be HTTP, so you will be 301'ing all the time; does this impact SEO?
Why will backlinks always be HTTP? That's not necessarily a given. How it impacts SEO very much depends on the SE in question. An intelligent SE can recognize that you're simply switching protocols and not punish you for it.
1- Would a handshake be negotiated on every request?
There are two issues here:
Most browsers don't need to establish a new connection between requests to the same site, even with plain HTTP. HTTP connections can be kept alive, so, no, you don't need to close the connection after each HTTP request/response: you can re-use a single connection for multiple requests.
You can also avoid performing multiple handshakes when parallel or subsequent SSL/TLS connections are required. There are multiple techniques explained in ImperialViolet - Overclocking SSL (definitely relevant for this question), written by Google engineers, in particular session resumption and false start. As far as I know, most modern browsers support at least session resumption.
These techniques don't get rid of new handshakes completely, but they reduce their cost. Apart from session reuse, OCSP stapling (to check the certificate revocation status) and elliptic-curve cipher suites can be used to reduce the key-exchange overhead during the handshake when perfect forward secrecy is required. These techniques also depend on browser support.
There will still be an overhead, and if you need massive web-farms, this could still be a problem, but such a deployment is possible nowadays (and some large companies do it), whereas it would have been considered inconceivable a few years ago.
2- Will all assets need to be encrypted?
Yes, as always. If you serve a page over HTTPS, all the resources it uses (iframe, scripts, stylesheets, images, any AJAX request) need to be using HTTPS. This is mainly because there is no way to show the user which part of the page can be trusted and which can't.
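As a concrete illustration with placeholder URLs, on a page served from https://www.example.com/ the first reference below is flagged (or blocked) as mixed content, while the second is not:
<!-- On an https: page this is mixed content and triggers a warning or is blocked -->
<img src="http://static.example.com/pixel.gif" alt="">
<!-- The same asset requested over HTTPS raises no warning -->
<img src="https://static.example.com/pixel.gif" alt="">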
3- Would browsers not cache HTTPS content, including assets?
Yes, they will cache HTTPS content. You can either set Cache-Control: public explicitly on your assets, or assume that the browser will cache them anyway. (In fact, you should prevent caching for sensitive resources.)
4- Are downstream transparent proxies not caching HTTPS content, including assets (CSS, JS, etc.), still an issue?
HTTP proxy servers merely relay the SSL/TLS connection without looking into it. However, some CDNs also provide HTTPS access (all the links on Google Libraries API are available via https://), which, combined with in-browser caching, allows for better performance.
5- Would all external assets (tracking pixels, videos, etc.) need to have an HTTPS version?
Yes, this goes with point #2. The fact that YouTube supports HTTPS access helps.
6- HTTPS and gzip might not be happy together?
They're independent. HTTPS is HTTP over TLS, the gzip compression happens at the HTTP level. Note that you can compress the SSL/TLS connection directly, but this is rarely used: you might as well use gzip compression at the HTTP level if you need (there's little point compressing twice).
7- Backlinks and organic links will always be HTTP, so you will be 301'ing all the time; does this impact SEO?
I'm not sure why these links should use http://. URL-shortening services are, generally speaking, a problem for SEO, if that's what you're referring to.
I think we'll see more and more usage of HTTP Strict Transport Security, so more https:// URLs by default.

Dangers of using HTML5 prefetch?

Ok, so it isn't a huge worry yet, as it is only supported by a few browsers:
Mozilla Firefox: Supported
Google Chrome: Supported since version 13 (uses an alternate syntax; see the markup below)
Safari: Currently not supported
Internet Explorer: Currently not supported
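For reference, the markup being discussed looks roughly like this (the URLs are placeholders; rel="prerender" is the alternate syntax Chrome used):
<!-- Hint that the browser may fetch a likely next page ahead of time -->
<link rel="prefetch" href="https://www.example.com/next-page.html">
<!-- Chrome's alternate syntax: prerender the whole page in the background -->
<link rel="prerender" href="https://www.example.com/next-page.html">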
However, prefetch makes me twitch. If the user lands on your page and bounces off to another site, have you paid for the bandwidth of them visiting your prefetch links?
Isn't there a risk of developers prefetching every link on the page, which in turn would make the website a slower experience for the user?
It looks like it can alter analytics. Will people be forcing page views onto users via prefetch?
Security: you won't know what pages are being prefetched. Can it prefetch malicious files?
Will all this prefetching be painful for mobile users with limited usage?
I can't call myself an expert on the subject, but I can make these observations:
Prefetch should be considered only where it is known to be beneficial. Enabling prefetch on everything would just be silly. It's essentially a balance of server load vs user experience.
I haven't looked into the HTML5 prefetching spec, but I would imagine they've specified a header that states "this request is being performed as part of prefetching", which could be used to fix the analytics problem - i.e. "if this is a prefetch, don't include it in analytics stats".
From a security standpoint, one would expect prefetch to follow the same cross-domain rules as Ajax does. This would mitigate any cases where XSS is an issue.
Mobile browsers that support HTML5 prefetch should be smart enough to turn it on when using WiFi, and off when using potentially expensive or slow forms of network connection, e.g. 2G/3G.
As I've stated, I can't guarantee any of the above things, but (like with any technology) it's a case of best practices. You wouldn't use Cache-Control to force every page on your site to be cached for a year. Nor would you expect a browser to satisfy a cross-domain Ajax request. Hopefully the same considerations were/will be taken for prefetching.
To answer the question of analytics and statistics, the spec has the following to say:
To ensure compatibility and improve the success rate of prerendering requests the target page can use the [PAGE-VISIBILITY] to determine the visibility state of the page as it is being rendered and implement appropriate logic to avoid actions that may cause the prerender to be abandoned (e.g. non-idempotent requests), or unwanted side-effects from being triggered (e.g. analytics beacons firing prior to the page being displayed).
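A minimal sketch of that idea with the Page Visibility API (recordPageView is a placeholder for whatever analytics beacon is in use):
<script>
  // Defer the analytics beacon until the page is actually visible, so a
  // prefetched/prerendered copy does not register as a page view.
  function recordPageView() {
    // placeholder: send the real analytics beacon here
  }
  if (document.hidden) {
    document.addEventListener('visibilitychange', function onVisible() {
      if (!document.hidden) {
        document.removeEventListener('visibilitychange', onVisible);
        recordPageView();
      }
    });
  } else {
    recordPageView();
  }
</script>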