Two forward slashes in a url/src/href attribute [duplicate] - html

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
URI starting with two slashes … how do they behave?
Absolute URLs omitting the protocol (scheme) in order to preserve the one of the current page
shorthand as // for script and link tags? anyone see / use this before?
I was looking through the source of HTML5 Reset when I noticed the following line:
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.5.1/jquery.min.js"></script>
Why does the URL start with two forward slashes? Is this a shorthand for http://?

The "two forward slashes" are a common shorthand for "request the referenced resource using whatever protocol is being used to load the current page".
Best known as "protocol relative URLs", they are particularly useful when elements — such as the JS file in your example — could be served and/or requested from either a http or a https context. By using protocol relative URLs, you can avoid implementing
if (window.location.protocol === 'http:') {
myResourceUrl = 'http://example.com/my-resource.js';
} else {
myResourceUrl = 'https://example.com/my-resource.js';
}
type of logic all over your codebase (assuming, of course, that the server at example.com is able to serve content through both http and https).
A prominent real-world example is the Magento 1.X E-Commerce engine: for performance reasons, the category and product pages use plain http by default, whereas the checkout is https enabled.
If some resources (e.g. promotional banners in the site's header) are referenced via non protocol relative URLs (i.e. http://example.com/banner.jpg), customers reaching the https enabled checkout are greeted with a rather unfriendly
"there are insecure elements on this page"
prompt - which, one can safely assume, isn't exactly great for business.
If the aforementioned resource is referenced via //example.com/banner.jpg though, the browser takes care of loading it via the proper protocol both on the plain http product/category pages and in the https-enabled checkout flow.
tl;dr: With even the slightest possibility of a mixed http/https environment, just use the double slash/protocol relative URLs to reference resources — assuming that the host serving them supports both http and https.

It will automatically add https or http, depending on how the request was made.

Related

HTTP Basic Authentication in URL supported or deprecated

On a project we spent considerable effort to work around basic authentication (because webdriver tests were depending on it, and webdriver has no api for basic authentication), and I remember basic authentication in the URL clearly not working. I.e. could not load http://username:password#url
Just google "basic authentication in url" and you will find tons of people complaining: https://medium.com/#lmakarov/say-goodbye-to-urls-with-embedded-credentials-b051f6c7b6a3
https://www.ietf.org/rfc/rfc3986.txt
Use of the format "user:password" in the userinfo field is deprecated.
Now today I told this quagmire to a friend and he said they are using http://username:password#url style basic authentication in webdriver tests without any problem.
I went in my current Chrome v71 to a demo page and to my surprise I found it indeed very well working: https://guest:guest#jigsaw.w3.org/HTTP/Basic/
How is this possible?? Are we living in parallel dimensions at the same time? Which one is true: is basic authentication using credentials in the URL supported or deprecated? (Or was this maybe added back to Chrome due to complaints of which I can't find any reference?)
Essentially, deprecated does not mean unsupported.
Which one is true: is basic authentication using credentials in the URL supported or deprecated?
The answer is yes, both are true. It is deprecated, but for the most part (anecdotally) still supported.
From the medium article:
While you would not usually have those hardcoded in a page, when you open a URL likehttps://user:pass#host and that page makes subsequent requests to resources linked via relative paths, that’s when those resources will also get the user:pass# part applied to them and banned by Chrome right there.
This means urls like <img src=./images/foo.png> but not urls like <a href=/foobar>zz</a>.
The rfc spec states:
Use of the format "user:password" in the userinfo field is
deprecated. Applications should not render as clear text any data
after the first colon (":") character found within a userinfo
subcomponent unless the data after the colon is the empty string
(indicating no password). Applications may choose to ignore or
reject such data when it is received as part of a reference and
should reject the storage of such data in unencrypted form. The
passing of authentication information in clear text has proven to be
a security risk in almost every case where it has been used.
Applications that render a URI for the sake of user feedback, such as
in graphical hypertext browsing, should render userinfo in a way that
is distinguished from the rest of a URI, when feasible. Such
rendering will assist the user in cases where the userinfo has been
misleadingly crafted to look like a trusted domain name
(Section 7.6).
So the use of user:pass#url is discouraged and backed up by specific recommendations and reasons for disabling the use. It also states that apps may opt to reject the userinfo field, but it does not say that apps must reject this.

ETags for server-side rendered pages that contain CSP nonce

I have a server-side-rendered React app and Node/Express so far were able to generate the correct, stable ETags, allowing for taking advantage of client-side caching.
Additionally, generated HTML contains fragments of render-blocking (above-the-fold) CSS and JS inlined as <script> and <style> tags for faster client-side first renders (as promoted by Google and its PageSpeed and Lighthouse tools).
Now I want to enable Content Security Policy (CSP) and I provide a nonce as an attribute to those <script> and <style> tags on every page request, to avoid unsafe-inline violations. However, ever-changing nonce makes ETags to change on every request as well. HTML is never cached and every request hits my Express server.
Is there a way to combine simultaneously:
inlined CSS and JS
CSP features (that is nonce, or similar)
ETags or alternatives
?
So far I see a contradiction between current performance vs security guidelines.
Are there equivalents to CSP nonce or can CSP nonce be provided while keeping HTML intact? Is there a way to otherwise cache pages that contain CSP nonce?
Ideally, I would like a solution to be contained within the Express server, without resorting to tinkering with my reverse proxy config, but any options are welcome.
One solution is to leave the whole content generation and caching to web application (Node in your case) and CSP nonce generation to front-end webserver (e.g. Nginx). I have implemented it with Django which does page caching with ETag, does all the Vary header logic etc and the HTML it produces contains such a static CSP nonce placeholder:
< script nonce="+++CSP_NONDE+++"> ... </script>
This placeholder is then filled in by Nginx using ngx_http_subs_filter_module:
sub_filter_once off;
sub_filter +++CSP_NONCE+++ $ssl_session_id;
add_header Content-Security-Policy "script-src 'nonce-$ssl_session_id'";
I have seen solutions using an additional Nginx module to generate a truly unique random nonce for each request but I believe it's an overkill and I'm just using TLS session identifier, which is unique per each connecting client and may be cached for some time (e.g. 10 minutes) depending on your Nginx configuration.
Just make sure the web application returns uncompressed HTML as Nginx won't be able to do string substitution.

Why is the HTTP location header only set for POST requests/201 (Created) responses?

Ignoring 3xx responses for a moment, I wonder why the HTTP location header is only used in conjunction with POST requests/201 (Created) responses.
From the RFC 2616 spec:
For 201 (Created) responses, the Location is that of the new resource which was created by the request.
This is a widely supported behavior, but why shouldn't it be used with other HTTP methods? Take the JSON API spec as an example:
It defines a self referencing link for the current resource inside the JSON payload (not uncommon for RESTful APIs). This link is included in every payload. The spec says that you MUST include an HTTP location header, if you create a new document via POST and that the value is the same as the self referencing link in the payload, but this is ONLY needed for POST. Why bother with a custom format for a self referencing link, if you could just use the HTTP location header?
Note: This isn't specific to JSON API. It's the same for HAL, JSON Hyper-Schema or other standards.
Note 2: It isn't even specific to the HTTP location header as it is the same with the HTTP link header. As you can see the JSON API, HAL and JSON Hyper-Schema not only define conventions for self referencing links, but also to express information about related resources or possible actions for a resource. But it seems that they all could just use the HTTP link header. (They could even put the self referencing link into the HTTP link header, if they don't want to use the HTTP location header.)
I don't want to rant, it just seems to be some sort of "reinventing the wheel". It also seems to be very limiting: if you would just use HTTP location/link header, it doesn't matter if you ask for JSON, XML or whatever in your HTTP accept header and you would get useful meta-information about your resource on a HEAD request, which wouldn't contain the links if you would use JSON API, HAL or JSON Hyper-Schema.
The semantics of the Location header isn't that of a self-referencing link, but of a link the user-agent should follow in order to complete the request. That makes sense in redirects, and when you create a new resource that will be in a new location you should go to. If your request is already completed, meaning you already have a full representation of the resource you wanted, it doesn't make sense to return a Location.
The Link header may be considered semantically equivalent to an hypertext Link, but it should be used to reference metadata related to the given resource when the media-type is not hypermedia-aware, so it doesn't replace the functionality of a link to related resources in a RESTful API.
The need for a custom link format in the resource representation is inherent to the need to decouple the resource from the underlying implementation and protocol. REST is not coupled to HTTP, and any protocol for which there's a valid URI scheme can be used. If you decided to use the Link header for all links, you're coupling to HTTP.
Let's say you present an FTP link for clients to follow. Where would be the Link in that case?
The semantic of the Location header depends on the status code. For 201, it links to the newly created resource, but in 3xx requests it can have multiple (although similiar) meanings. I think that is why it is generally avoided for other usages.
The alternative is the Content-Location header, which always has a consistent meaning. It tells the client the canonical URL the resource it requested. It is purely informative (in contrast to the Location, which is expected to be processed by the client).
So, the Content-Location header seems to closer resemble a self-referencing link. However, the Content-Location also has no defined behavior for PUT and POST. It also seems to be quite rarely used.
This blogs post Location vs Content-Location is a nice comparison. Here is a quote:
Finally, neither header is meant for general-purpose linking.
In sum, requiring a standardized, self link in the body seems to be good idea. It avoids a lot of confusion on the client side.

HTTP protocol: HTML only?

My informatics teacher explained that HTTP can really be used exclusively to transfer unencoded ASCII, and as soon as an image or the like have to be downloaded from the server, FTP is used instead.
Is this true?
No, it isn't. HTTP contains a header and data part, the latter gets interpreted by the receipent according to the CONTENT-TYPE header. HTTP can transfer arbitrary data.
If you want any more specific information about the HTTP protocol, the full RFC (request for comments) document is available here: http://www.w3.org/Protocols/rfc2616/rfc2616.html
An RFC is where the various different parties who have a stake in these protocols are presented the current version and invited to give their opinion. It's the method by which most of the internet has been built :)

How to respond to an alternate URI in a RESTful web service

I'm building a RESTful web service which has multiple URIs for one of its resources, because there is more than one unique identifier. Should the server respond to a GET request for an alternate URI by returning the resource, or should I send an HTTP 3xx redirect to the canonical URI? Is HTTP 303 (see also) the most appropriate redirect?
Clarification: the HTTP specification makes it clear that the choice of redirect depends on which URI future requests should use. In my application, the 'canonical' URI is the most stable of the alternatives; an alternative URI will always direct to same canonical URI, or become invalid.
I'd personally plump for returning the resource rather than faffing with a redirect, although I suspect that's only because my subcoscious is telling me redirects are slower.
However, if you were to decide to use a redirect I'd think a 302 or 307 might be more appropiate than a 303, although the w3.org has details of the different redirect codes you could use.
Under W3C's Architexture of the World Wide Web, Volume One, there is a section on URI aliases (Section 2.3.1) which states the following:
"When a URI alias does become common currency, the URI owner should use protocol techniques such as server-side redirects to relate the two resources. The community benefits when the URI owner supports redirection of an aliased URI to the corresponding "official" URI. For more information on redirection, see section 10.3, Redirection, in RFC2616. See also CHIPS for a discussion of some best practices for server administrators."
For what it's worth, I would recommend a 302 redirect.
The answer from Ubiguchi had what I needed, except that I now think a redirect is the way to go, via the link to the HTTP 1.1 specifiction section on response codes. It turns out that I actually need a 301 redirect because the URI I'm redirecting to is more 'correct' and stable, and should therefore be used for future requests.