How does refer(r)er work technically? - html

I don't understand: how are webserver and trackers like Google Analytics able to track referrals?
Is it part of HTTP?
Is it some (un)specified behavior of the browsers?
Apparently every time you click on a link on a web page, the original web page is passed along the request.
What is the exact mechanism behind that? Is it specified by some spec?
I've read a few docs and I've played with my own Tomcat server and my own Google Analytics account, but I don't understand how the "magic" happens.
Bonus (totally related) question: if, on my own website (served by Tomcat), I put a link to another site, does the other site see my website as the "referrer" without me doing anything special in Tomcat?

Referer (misspelled in the spec) is an HTTP header. It's a standard header that all major HTTP clients support (though some proxy servers and firewalls can be configured to strip it or mangle it). When you click on a link, your browser sends an HTTP request that contains the page being requested and the page on which the link was found, among other things.
Since this is a client/request header, the server is irrelevant, and yes, clicking a link on a page hosted on your own server would result in that page's URL being sent to the other site's server, though your server may not necessarily be accessible from that other site, depending on your network configuration.

One detail to add to what's already been said about how browsers send it: HTTPS changes the behavior a bit. I am not aware if it's in any spec, but if you jump from HTTPS to HTTP, and if you stay on the same domain or go to different domains, then sometimes the referrer is not sent. I don't know the exact rules, but I've observed this in the wild. If there's some spec or description about this, it would be great.
EDIT: ok, the RFC says plainly:
Clients SHOULD NOT include a Referer header field in a (non-secure) HTTP request if the referring page was transferred with a secure protocol.
So, if you go from HTTPS page to a HTTP link, referrer info is not sent.

From: http://en.wikipedia.org/wiki/HTTP_referrer
The referrer field is an optional part
of the HTTP request sent by the
browser program to the web server.
From RFC 2616:
The Referer[sic] request-header field
allows the client to specify, for
the server's benefit, the address
(URI) of the resource from which
the Request-URI was obtained (the
"referrer", although the header
field is misspelled.)

If you request a web page using a browser, your browser will sent the HTTP Referer header along with the request.

Your browser passes referrer with each page request.
It seems unusual that JavaScript has access to this as well, but it does.

Yes, the browser sends the previous page in the HTTP headers. This is defined in the HTTP/1.1 spec:
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.36
The answer to your question is yes, as the browser sends the referer.

"The referrer field is an optional part of the HTTP request sent by the browser program to the web server."
http://en.wikipedia.org/wiki/HTTP_referrer

When you click on a link the browser adds a Referer header to the request. It is part of HTTP. You can read more about it here.

Related

Same origin policy by default [duplicate]

tl;dr; About the Same Origin Policy
I have a Grunt process which initiates an instance of express.js server. This was working absolutely fine up until just now when it started serving a blank page with the following appearing in the error log in the developer's console in Chrome (latest version):
XMLHttpRequest cannot load https://www.example.com/
No 'Access-Control-Allow-Origin' header is present on the requested
resource. Origin 'http://localhost:4300' is therefore not allowed access.
What is stopping me from accessing the page?
tl;dr — When you want to read data, (mostly) using client-side JS, from a different server you need the server with the data to grant explicit permission to the code that wants the data.
There's a summary at the end and headings in the answer to make it easier to find the relevant parts. Reading everything is recommended though as it provides useful background for understanding the why that makes seeing how the how applies in different circumstances easier.
About the Same Origin Policy
This is the Same Origin Policy. It is a security feature implemented by browsers.
Your particular case is showing how it is implemented for XMLHttpRequest (and you'll get identical results if you were to use fetch), but it also applies to other things (such as images loaded onto a <canvas> or documents loaded into an <iframe>), just with slightly different implementations.
The standard scenario that demonstrates the need for the SOP can be demonstrated with three characters:
Alice is a person with a web browser
Bob runs a website (https://www.example.com/ in your example)
Mallory runs a website (http://localhost:4300 in your example)
Alice is logged into Bob's site and has some confidential data there. Perhaps it is a company intranet (accessible only to browsers on the LAN), or her online banking (accessible only with a cookie you get after entering a username and password).
Alice visits Mallory's website which has some JavaScript that causes Alice's browser to make an HTTP request to Bob's website (from her IP address with her cookies, etc). This could be as simple as using XMLHttpRequest and reading the responseText.
The browser's Same Origin Policy prevents that JavaScript from reading the data returned by Bob's website (which Bob and Alice don't want Mallory to access). (Note that you can, for example, display an image using an <img> element across origins because the content of the image is not exposed to JavaScript (or Mallory) … unless you throw canvas into the mix in which case you will generate a same-origin violation error).
Why the Same Origin Policy applies when you don't think it should
For any given URL it is possible that the SOP is not needed. A couple of common scenarios where this is the case are:
Alice, Bob, and Mallory are the same person.
Bob is providing entirely public information
… but the browser has no way of knowing if either of the above is true, so trust is not automatic and the SOP is applied. Permission has to be granted explicitly before the browser will give the data it has received from Bob to some other website.
Why the Same Origin Policy applies to JavaScript in a web page but little else
Outside the web page
Browser extensions*, the Network tab in browser developer tools, and applications like Postman are installed software. They aren't passing data from one website to the JavaScript belonging to a different website just because you visited that different website. Installing software usually takes a more conscious choice.
There isn't a third party (Mallory) who is considered a risk.
* Browser extensions do need to be written carefully to avoid cross-origin issues. See the Chrome documentation for example.
Inside the webpage
Most of the time, there isn't a great deal of information leakage when just showing something on a webpage.
If you use an <img> element to load an image, then it gets shown on the page, but very little information is exposed to Mallory. JavaScript can't read the image (unless you use a crossOrigin attribute to explicitly enable request permission with CORS) and then copy it to her server.
That said, some information does leak so, to quote Domenic Denicola (of Google):
The web's fundamental security model is the same origin policy. We
have several legacy exceptions to that rule from before that security
model was in place, with script tags being one of the most egregious
and most dangerous. (See the various "JSONP" attacks.)
Many years ago, perhaps with the introduction of XHR or web fonts (I
can't recall precisely), we drew a line in the sand, and said no new
web platform features would break the same origin policy. The existing
features need to be grandfathered in and subject to carefully-honed
and oft-exploited exceptions, for the sake of not breaking the web,
but we certainly can't add any more holes to our security policy.
This is why you need CORS permission to load fonts across origins.
Why you can display data on the page without reading it with JS
There are a number of circumstances where Mallory's site can cause a browser to fetch data from a third party and display it (e.g. by adding an <img> element to display an image). It isn't possible for Mallory's JavaScript to read the data in that resource though, only Alice's browser and Bob's server can do that, so it is still secure.
CORS
The Access-Control-Allow-Origin HTTP response header referred to in the error message is part of the CORS standard which allows Bob to explicitly grant permission to Mallory's site to access the data via Alice's browser.
A basic implementation would just include:
Access-Control-Allow-Origin: *
… in the response headers to permit any website to read the data.
Access-Control-Allow-Origin: http://example.com
… would allow only a specific site to access it, and Bob can dynamically generate that based on the Origin request header to permit multiple, but not all, sites to access it.
The specifics of how Bob sets that response header depend on Bob's HTTP server and/or server-side programming language. Users of Node.js/Express.js should use the well-documented CORS middleware. Users of other platforms should take a look at this collection of guides for various common configurations that might help.
NB: Some requests are complex and send a preflight OPTIONS request that the server will have to respond to before the browser will send the GET/POST/PUT/Whatever request that the JS wants to make. Implementations of CORS that only add Access-Control-Allow-Origin to specific URLs often get tripped up by this.
Obviously granting permission via CORS is something Bob would only do only if either:
The data was not private or
Mallory was trusted
How do I add these headers?
It depends on your server-side environment.
If you can, use a library designed to handle CORS as they will present you with simple options instead of having to deal with everything manually.
Enable-Cors.org has a list of documentation for specific platforms and frameworks that you might find useful.
But I'm not Bob!
There is no standard mechanism for Mallory to add this header because it has to come from Bob's website, which she does not control.
If Bob is running a public API then there might be a mechanism to turn on CORS (perhaps by formatting the request in a certain way, or a config option after logging into a Developer Portal site for Bob's site). This will have to be a mechanism implemented by Bob though. Mallory could read the documentation on Bob's site to see if something is available, or she could talk to Bob and ask him to implement CORS.
Error messages which mention "Response for preflight"
Some cross-origin requests are preflighted.
This happens when (roughly speaking) you try to make a cross-origin request that:
Includes credentials like cookies
Couldn't be generated with a regular HTML form (e.g. has custom headers or a Content-Type that you couldn't use in a form's enctype).
If you are correctly doing something that needs a preflight
In these cases then the rest of this answer still applies but you also need to make sure that the server can listen for the preflight request (which will be OPTIONS (and not GET, POST, or whatever you were trying to send) and respond to it with the right Access-Control-Allow-Origin header but also Access-Control-Allow-Methods and Access-Control-Allow-Headers to allow your specific HTTP methods or headers.
If you are triggering a preflight by mistake
Sometimes people make mistakes when trying to construct Ajax requests, and sometimes these trigger the need for a preflight. If the API is designed to allow cross-origin requests but doesn't require anything that would need a preflight, then this can break access.
Common mistakes that trigger this include:
trying to put Access-Control-Allow-Origin and other CORS response headers on the request. These don't belong on the request, don't do anything helpful (what would be the point of a permissions system where you could grant yourself permission?), and must appear only on the response.
trying to put a Content-Type: application/json header on a GET request that has no request body the content of which to describe (typically when the author confuses Content-Type and Accept).
In either of these cases, removing the extra request header will often be enough to avoid the need for a preflight (which will solve the problem when communicating with APIs that support simple requests but not preflighted requests).
Opaque responses (no-cors mode)
Sometimes you need to make an HTTP request, but you don't need to read the response. e.g. if you are posting a log message to the server for recording.
If you are using the fetch API (rather than XMLHttpRequest), then you can configure it to not try to use CORS.
Note that this won't let you do anything that you require CORS to do. You will not be able to read the response. You will not be able to make a request that requires a preflight.
It will let you make a simple request, not see the response, and not fill the Developer Console with error messages.
How to do it is explained by the Chrome error message given when you make a request using fetch and don't get permission to view the response with CORS:
Access to fetch at 'https://example.com/' from origin 'https://example.net' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
Thus:
fetch("http://example.com", { mode: "no-cors" });
Alternatives to CORS
JSONP
Bob could also provide the data using a hack like JSONP which is how people did cross-origin Ajax before CORS came along.
It works by presenting the data in the form of a JavaScript program that injects the data into Mallory's page.
It requires that Mallory trust Bob not to provide malicious code.
Note the common theme: The site providing the data has to tell the browser that it is OK for a third-party site to access the data it is sending to the browser.
Since JSONP works by appending a <script> element to load the data in the form of a JavaScript program that calls a function already in the page, attempting to use the JSONP technique on a URL that returns JSON will fail — typically with a CORB error — because JSON is not JavaScript.
Move the two resources to a single Origin
If the HTML document the JS runs in and the URL being requested are on the same origin (sharing the same scheme, hostname, and port) then the Same Origin Policy grants permission by default. CORS is not needed.
A Proxy
Mallory could use server-side code to fetch the data (which she could then pass from her server to Alice's browser through HTTP as usual).
It will either:
add CORS headers
convert the response to JSONP
exist on the same origin as the HTML document
That server-side code could be written & hosted by a third party (such as CORS Anywhere). Note the privacy implications of this: The third party can monitor who proxies what across their servers.
Bob wouldn't need to grant any permissions for that to happen.
There are no security implications here since that is just between Mallory and Bob. There is no way for Bob to think that Mallory is Alice and to provide Mallory with data that should be kept confidential between Alice and Bob.
Consequently, Mallory can only use this technique to read public data.
Do note, however, that taking content from someone else's website and displaying it on your own might be a violation of copyright and open you up to legal action.
Writing something other than a web app
As noted in the section "Why the Same Origin Policy only applies to JavaScript in a web page", you can avoid the SOP by not writing JavaScript in a webpage.
That doesn't mean you can't continue to use JavaScript and HTML, but you could distribute it using some other mechanism, such as Node-WebKit or PhoneGap.
Browser extensions
It is possible for a browser extension to inject the CORS headers in the response before the Same Origin Policy is applied.
These can be useful for development but are not practical for a production site (asking every user of your site to install a browser extension that disables a security feature of their browser is unreasonable).
They also tend to work only with simple requests (failing when handling preflight OPTIONS requests).
Having a proper development environment with a local development server
is usually a better approach.
Other security risks
Note that SOP / CORS do not mitigate XSS, CSRF, or SQL Injection attacks which need to be handled independently.
Summary
There is nothing you can do in your client-side code that will enable CORS access to someone else's server.
If you control the server the request is being made to: Add CORS permissions to it.
If you are friendly with the person who controls it: Get them to add CORS permissions to it.
If it is a public service:
Read their API documentation to see what they say about accessing it with client-side JavaScript:
They might tell you to use specific URLs
They might support JSONP
They might not support cross-origin access from client-side code at all (this might be a deliberate decision on security grounds, especially if you have to pass a personalized API Key in each request).
Make sure you aren't triggering a preflight request you don't need. The API might grant permission for simple requests but not preflighted requests.
If none of the above apply: Get the browser to talk to your server instead, and then have your server fetch the data from the other server and pass it on. (There are also third-party hosted services that attach CORS headers to publically accessible resources that you could use).
Target server must allowed cross-origin request. In order to allow it through express, simply handle http options request :
app.options('/url...', function(req, res, next){
res.header('Access-Control-Allow-Origin', "*");
res.header('Access-Control-Allow-Methods', 'POST');
res.header("Access-Control-Allow-Headers", "accept, content-type");
res.header("Access-Control-Max-Age", "1728000");
return res.sendStatus(200);
});
As this isn't mentioned in the accepted answer.
This is not the case for this exact question, but might help others that search for that problem
This is something you can do in your client-code to prevent CORS errors in some cases.
You can make use of Simple Requests.
In order to perform a 'Simple Requests' the request needs to meet several conditions. E.g. only allowing POST, GET and HEAD method, as well as only allowing some given Headers (you can find all conditions here).
If your client code does not explicit set affected Headers (e.g. "Accept") with a fix value in the request it might occur that some clients do set these Headers automatically with some "non-standard" values causing the server to not accept it as Simple Request - which will give you a CORS error.
This is happening because of the CORS error. CORS stands for Cross Origin Resource Sharing. In simple words, this error occurs when we try to access a domain/resource from another domain.
Read More about it here: CORS error with jquery
To fix this, if you have access to the other domain, you will have to allow Access-Control-Allow-Origin in the server. This can be added in the headers. You can enable this for all the requests/domains or a specific domain.
How to get a cross-origin resource sharing (CORS) post request working
These links may help
This CORS issue wasn't further elaborated (for other causes).
I'm having this issue currently under different reason.
My front end is returning 'Access-Control-Allow-Origin' header error as well.
Just that I've pointed the wrong URL so this header wasn't reflected properly (in which i kept presume it did). localhost (front end) -> call to non secured http (supposed to be https), make sure the API end point from front end is pointing to the correct protocol.
I got the same error in Chrome console.
My problem was, I was trying to go to the site using http:// instead of https://. So there was nothing to fix, just had to go to the same site using https.
This bug cost me 2 days. I checked my Server log, the Preflight Option request/response between browser Chrome/Edge and Server was ok. The main reason is that GET/POST/PUT/DELETE server response for XHTMLRequest must also have the following header:
access-control-allow-origin: origin
"origin" is in the request header (Browser will add it to request for you). for example:
Origin: http://localhost:4221
you can add response header like the following to accept for all:
access-control-allow-origin: *
or response header for a specific request like:
access-control-allow-origin: http://localhost:4221
The message in browsers is not clear to understand: "...The requested resource"
note that:
CORS works well for localhost. different port means different Domain.
if you get error message, check the CORS config on the server side.
In most housing services just add in the .htaccess on the target server folder this:
Header set Access-Control-Allow-Origin 'https://your.site.folder'
I had the same issue. In my case i fixed it by adding addition parameter of timestamp to my URL. Even this was not required by the server I was accessing.
Example yoururl.com/yourdocument?timestamp=1234567
Note: I used epos timestamp
"Get" request with appending headers transform to "Options" request. So Cors policy problems occur. You have to implement "Options" request to your server. Cors Policy about server side and you need to allow Cors Policy on your server side. For Nodejs server:details
app.use(cors)
For Java to integrate with Angular:details
#CrossOrigin(origins = "http://localhost:4200")
You should enable CORS to get it working.

Converting http to https; is it a must in order to be secured?

I already have a SSL certificate for my site and know how to redirect or convert my site into https.
But my question here is that. Is it a 'must', to convert the site to https in order to be secured? Or is it okay to retain using http but I should put the SSL certificate in the site so that even I type https, my site can still be visited and somehow it's also secured?
Sub notes:
http://www.sampleSite.com/ is the default link that I use in my site but if I type https://www.sampleSite.com/ it will also work though there will be no 'Secure' indicated besides my site link
Yes you should.
If you don't redirect http to https users will not redirect themselves.
Even if important pages (with personal information) that must be secured with https are redirected to https and the link pointing to them are https, because your pages with the link are using http, are vulnerable to sslstip.
If you want to properly secure your website, use https on all your webpage, redirect all http connection to https and use HSTS.
Anything less will be insecure and vulnerable to sslstrip attacks.
It depends of the use. If you web site use log in and sensitive informations, it's better to secure it.
The biggest difference between http and https is that
https encripte the online data. If anyone in between the sender and the recipient could open the message, they still could not understand it. Only the sender and the recipient, who know the "code," can decipher the message.

If an HTTP request is sent from an iframe, where does the iframed site see the request from?

Suppose I make a webpage that includes
<iframe src="http://google.com"/>
and a user browses through that iframe. Does Google see the request made from the server I'm hosting my site on, or from the user's router?
You do NOT load content of iframe source from your server. You just pass that code to the user browser then everything happens on client side. Therefore google will see client ip address and etc.
When one website is called through another domain whether iframe or not, browsers send current domain name to the next target (google.com in your case) with HTTP Referrer data. This is the only way of google.com to understand where the client request google from.
Details : What is the HTTP Referer if the link is clicked in an <iframe>?

Cross Domain Form POSTing

I've seen articles and posts all over (including SO) on this topic, and the prevailing commentary is that same-origin policy prevents a form POST across domains. The only place I've seen someone suggest that same-origin policy does not apply to form posts, is here.
I'd like to have an answer from a more "official" or formal source. For example, does anyone know the RFC that addresses how same-origin does or does not affect a form POST?
clarification: I am not asking if a GET or POST can be constructed and sent to any domain. I am asking:
if Chrome, IE, or Firefox will allow content from domain 'Y' to send a POST to domain 'X'
if the server receiving the POST will actually see any form values at all. I say this because the majority of online discussion records testers saying the server received the post, but the form values were all empty / stripped out.
What official document (i.e. RFC) explains what the expected behavior is (regardless of what the browsers have currently implemented).
Incidentally, if same-origin does not affect form POSTs - then it makes it somewhat more obvious of why anti-forgery tokens are necessary. I say "somewhat" because it seems too easy to believe that an attacker could simply issue an HTTP GET to retrieve a form containing the anti-forgery token, and then make an illicit POST which contains that same token. Comments?
The same origin policy is applicable only for browser side programming languages. So if you try to post to a different server than the origin server using JavaScript, then the same origin policy comes into play but if you post directly from the form i.e. the action points to a different server like:
<form action="http://someotherserver.com">
and there is no javascript involved in posting the form, then the same origin policy is not applicable.
See wikipedia for more information
It is possible to build an arbitrary GET or POST request and send it to any server accessible to a victims browser. This includes devices on your local network, such as Printers and Routers.
There are many ways of building a CSRF exploit. A simple POST based CSRF attack can be sent using .submit() method. More complex attacks, such as cross-site file upload CSRF attacks will exploit CORS use of the xhr.withCredentals behavior.
CSRF does not violate the Same-Origin Policy For JavaScript because the SOP is concerned with JavaScript reading the server's response to a clients request. CSRF attacks don't care about the response, they care about a side-effect, or state change produced by the request, such as adding an administrative user or executing arbitrary code on the server.
Make sure your requests are protected using one of the methods described in the OWASP CSRF Prevention Cheat Sheet. For more information about CSRF consult the OWASP page on CSRF.
Same origin policy has nothing to do with sending request to another url (different protocol or domain or port).
It is all about restricting access to (reading) response data from another url.
So JavaScript code within a page can post to arbitrary domain or submit forms within that page to anywhere (unless the form is in an iframe with different url).
But what makes these POST requests inefficient is that these requests lack antiforgery tokens, so are ignored by the other url. Moreover, if the JavaScript tries to get that security tokens, by sending AJAX request to the victim url, it is prevented to access that data by Same Origin Policy.
A good example: here
And a good documentation from Mozilla: here

Handling various types of URL redirects

Generally, when writing HTTP client software, the HTTP protocol provides sufficient information for how to handle redirected URLs. Specifically, if an HTTP request returns a redirect code of 302 or 307, the redirect should be considered temporary and the client should continue to use the original URL. However, a redirect code of 301 indicates that the client should discard the old URL, and permanently use the redirected URL.
But is there any standard practice for redirects which are NOT issued by the HTTP server itself? In other words, HTML or Javascript redirects? Intuitively, I'd think that an HTML/Javascript-redirect should be handled like a 301, but I'm not sure if this is a good idea.
There are no "redirects" in HTML or JavaScript, all you do is instruct the browser to navigate to another page. Think of it as equivalent to entering a URL in your address bar. There is no question of temporary or permanent here.