Is there a reason we include the http / https protocol on the href attribute of links?
Would it be fine to just leave it off:
my site
The inclusion of the “http:” or “https:” part is partly a matter of tradition and partly a matter of actually specifying the protocol. If it is omitted, the protocol of the current page is used; e.g., //www.example.com becomes http://www.example.com or https://www.example.com depending on the URL of the referring page. If a web page is saved on a local disk and then opened from there, its URL uses the file: scheme rather than http: or https:, so URLs like //www.example.com won’t work. That is one reason for including the “http:” or “https:” part.
Omitting the “//” part as well is a different issue altogether: it turns the URL into a relative URL that is interpreted relative to the current base URL.
The reason why www.example.com works when typed or pasted on a browser’s address line is that relative URLs would not make sense there (there is no base URL to relate to), so browser vendors decided to imply the “http://” prefix there.
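To make the scheme inheritance concrete, here is a minimal sketch using Python's standard `urllib.parse.urljoin`, which applies the same generic resolution rules for these cases. The page URLs in `bases` are made up for illustration; only `//www.example.com` comes from the answer above.

```python
from urllib.parse import urljoin

# A protocol-relative reference: it names a host but no scheme.
link = "//www.example.com"

# Hypothetical pages that might contain the link (assumed for illustration).
bases = [
    "http://example.org/page.html",
    "https://example.org/page.html",
    "file:///home/user/page.html",
]

# Resolve the same link against each page; the scheme is copied from the page.
resolved = {base: urljoin(base, link) for base in bases}

for base, target in resolved.items():
    print(base, "->", target)
```

For the page opened from disk the result is `file://www.example.com`, which is not a web address at all. That is the failure mode described above.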
URLs in href are not restricted to HTTP documents. They support all the protocols supported by browsers: ftp, mailto, file, etc.
Also, you can precede a name with '#' to link to an HTML id within the page. You can also give just a file name or directory path, without a protocol, which will be treated as a relative URL.
My solution was to trick the browser with a redirect service such as bit.ly or goo.gl (which will be discontinued soon), among others.
When the browser sees that the short URL is https, it allows the image link, follows the redirect, and displays the http image without showing the original link.
The annoying part is that the control panel of your redirector will report thousands of "clicks" that are actually just displays of the image.
With this experience I'm going to look for a WordPress plugin for redirection and create my own redirect links, so that https://mysite.com/id redirects to the http link.
I am learning HTML, and whenever I execute the href function in HTML and click the blue text, the browser tries to redirect me to a folder inside my computer, when in reality I want to enter a website. For example, if I try to execute the following code, instead of the browser redirecting me to duckduckgo.com, it tries to redirect me to a folder inside my computer:
<a href="duckduckgo.com">Browse anonymously and without being traced</a>
How can I solve this issue?
This happens because href="duckduckgo.com" is a relative URL, so the browser looks for duckduckgo.com relative to the URL of the page currently being displayed. To the browser it's no different than if you had used href="index.html"; the two are structurally identical.
Instead, use a fully-qualified URL:
<a href="https://duckduckgo.com">Browse anonymously and without being traced</a>
You can also default to the current protocol with this:
<a href="//duckduckgo.com">Browse anonymously and without being traced</a>
So if the current page is open via http:// or https:// then the link would use the same in the resulting request. Note however that your description of "a folder inside my computer" may somewhat imply that your current protocol could be file://, in which case an inferred protocol clearly wouldn't work. The point is, the structure of a complete URL is pretty versatile so you have options.
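As a rough illustration of why the three forms behave differently, the sketch below classifies an href by which parts it contains and then resolves the bare `duckduckgo.com` against a local page. It uses Python's `urllib.parse`; the `describe_href` helper and the `file:///home/me/site/page.html` path are hypothetical, not something from the question.

```python
from urllib.parse import urljoin, urlsplit

def describe_href(href: str) -> str:
    """Classify an href by the URL parts it actually contains (illustrative helper)."""
    parts = urlsplit(href)
    if parts.scheme:
        return "absolute"           # e.g. https://duckduckgo.com
    if parts.netloc:
        return "protocol-relative"  # e.g. //duckduckgo.com
    return "relative"               # e.g. duckduckgo.com or index.html

# Assumed location of the page being edited on the local disk.
page = "file:///home/me/site/page.html"

# A bare host name is resolved like any other relative file name.
local_target = urljoin(page, "duckduckgo.com")

print(describe_href("duckduckgo.com"), "->", local_target)
print(describe_href("index.html"))
print(describe_href("//duckduckgo.com"))
print(describe_href("https://duckduckgo.com"))
```

The bare name ends up as a path next to the page on disk, which matches the "folder inside my computer" behavior from the question.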
I've been looking around, and I find something that for some reason works for some links, but when I try to download my image it just opens the page with the image
<a href="https://i.imgur.com/KBUpwNd.jpg" download>Download</a>
I expect this to download the imgur image I linked, but it just opens the link instead, any ideas how to fix this?
I've just made a quick test and it seems like that origin domain matters in this case.
The download attribute works fine if the image is on the same domain. Otherwise it opens the link.
This attribute only works for same-origin URLs.
Although HTTP(s) URLs need to be in the same-origin, blob: URLs and data: URLs are allowed so that content generated by JavaScript, such as pictures created in an image-editor Web app, can be downloaded.
If the HTTP header Content-Disposition: gives a different filename than this attribute, the HTTP header takes priority over this attribute.
If Content-Disposition: is set to inline, Firefox prioritizes Content-Disposition, like the filename case, while Chrome prioritizes the download attribute.
According to the MDN description of the download attribute:
This attribute only works for same-origin URLs.
So it won't work with a URL that points to a different domain than your own, such as i.imgur.com.
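For reference, "same origin" means the scheme, host, and port all match. Below is a small sketch of that comparison in Python; the `origin` and `same_origin` helpers and the `mysite.example` page URL are assumptions for illustration, and browsers apply further rules (for example for `blob:` and `data:` URLs) that this does not model.

```python
from urllib.parse import urlsplit

# Default ports, so that https://host and https://host:443 compare equal.
DEFAULT_PORTS = {"http": 80, "https": 443}

def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, parts.hostname, port)

def same_origin(page_url: str, link_url: str) -> bool:
    """True when both URLs share scheme, host, and port."""
    return origin(page_url) == origin(link_url)

# Hypothetical page hosting the download link.
page = "https://mysite.example/gallery.html"

print(same_origin(page, "https://mysite.example/img/cat.jpg"))
print(same_origin(page, "https://i.imgur.com/KBUpwNd.jpg"))
```

The imgur URL fails the check because its host differs from the page's host, which is why the browser navigates to the image instead of downloading it.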
You can use a proxy script on your own server, something like:
<a href="image_download.php?url=https://i.imgur.com/KBUpwNd.jpg" download>Download</a>
Then write the image_download.php script that does:
readfile($_GET['url']);
You should of course have validation checks in the script so that it doesn't get abused as a general-purpose proxy by third parties. Google "php proxy" and you'll find some pre-written scripts.
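One possible shape for such a validation check is sketched below. It is written in Python rather than PHP purely to show the logic, and the allowlist contents and the `is_allowed` name are assumptions; a real script would apply the same idea before calling readfile().

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts the proxy is willing to fetch from.
ALLOWED_HOSTS = {"i.imgur.com"}

def is_allowed(url: str) -> bool:
    """Accept only https URLs whose host is on the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # Malformed input (for example a broken IPv6 literal) is rejected.
        return False
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS

print(is_allowed("https://i.imgur.com/KBUpwNd.jpg"))
print(is_allowed("file:///etc/passwd"))
print(is_allowed("https://i.imgur.com@evil.example/x.jpg"))
```

Checking the parsed host rather than searching the raw string matters: the last URL contains "i.imgur.com" but actually points at a different host.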
The download attribute only works for same-origin URLs.
You could either write a server-side script or take a look at this discussion for a quick workaround.
Simply pass your URL to downloadResource() like so: downloadResource('https://i.imgur.com/KBUpwNd.jpg'); then wrap it in script tags and run it in your browser.
I am working on Ruby on Rails web apps. My webpage URL looks like this:
http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102#/dashboard
Here I understood that advertiser_id is 2102 but I couldn't understand what #/dashboard is pointing to?
The portion of the URL which follows the # symbol is not normally sent to the server in the request for the page. If you open your web inspector and watch the request for the page, you will see that the #/dashboard portion is not included in the request at all.
On a normal (basic HTML) web page, the # symbol can be used to link to a section within the page, so that the browser jumps down to that section after the page loads.
In fancy JavaScript-heavy web applications, the # symbol is commonly followed by more URL paths, for example www.example.com/some-path#/other-path/etc. The other-path/etc portion of the URL is not seen by the server, but it is available for JavaScript to read in the browser and presumably display something different based on that URL path.
So in your case, the first part of the URL is a request to the server:
http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102
and the second part of the URL could be for Javascript to display a specific view of the page once it has loaded:
#/dashboard
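The same split can be reproduced with Python's standard `urllib.parse`, as a quick way to see which part belongs to the request and which part stays in the browser. This only illustrates URL parsing; it says nothing about what the application's JavaScript does with `/dashboard`.

```python
from urllib.parse import parse_qs, urldefrag, urlsplit

url = "http://dev.ibiza.jp:3000/facebook/report?advertiser_id=2102#/dashboard"

# urldefrag separates the part used for the request from the fragment.
sent_to_server, fragment = urldefrag(url)

# The query string is part of the request and is parsed on the server side.
query = parse_qs(urlsplit(url).query)

print("request URL:", sent_to_server)
print("query:", query)
print("fragment (kept by the browser):", fragment)
```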
The # symbol is also used to create a Fragment Identifier and is also typically used to link to a specific piece of content within a web page (such as to cause the browser to jump down to a particular section on the page).
As others have mentioned, this has SEO implications. In order to index pages such as this, you may have to employ different techniques to allow the content that is "behind the # symbol" to be accessible to search engines.
The # symbol introduces an anchor; it makes the browser jump to a specific position on the HTML page.
It's a crawling technique; you can read more here.
Providing another example
Here's a request to github for the sourcecode of a java class
https://github.com/spring-cloud/spring-cloud-consul/blob/master/spring-cloud-consul-discovery/src/main/java/org/springframework/cloud/consul/serviceregistry/ConsulServiceRegistry.java
By appending this with "#L90" the web browser will make the same request, and then scroll to line 90 and highlight the code.
https://github.com/spring-cloud/spring-cloud-consul/blob/master/spring-cloud-consul-discovery/src/main/java/org/springframework/cloud/consul/serviceregistry/ConsulServiceRegistry.java#L90
Your web browser made the same request to the github server, but in the anchored case, performed the additional action of highlighting the selected line after the response was received.
What comes after # is the hash of the location; the ! that follows is used by search engines to help index AJAX content. After that can come anything, but it is usually written to look like a path (hence the /).
SEO-wise, is it okay to use a protocol-free URL like this?
<link rel="canonical" href="//example.com" />
I redirect all users to HTTPS anyway.
By protocol-free I mean using // instead of either http:// or https://.
If the spec co-written by Google employees (https://www.rfc-editor.org/rfc/rfc6596) is correct, then yes, any relative reference is ok.
Yes, you can: both the spec and Google allow a relative URI as a canonical href.
This does present a risk if your page is reached via the wrong protocol, e.g. the http page is visited when you want https canonical URLs. In this case the relative canonical value is interpreted as an http URL.
However, if you have your 301 redirects correctly set up to go from http to https, you will not have an issue, and it may actually be preferable in some cases to use a relative canonical URL.
Case in point: switching from an http site to https will lose you all the Facebook likes accumulated on your http URLs. In this case you may want Facebook to still crawl your http site, whilst redirecting all other user agents to https.
Facebook will then reinstate your old http page likes on both your http and https pages, but not if your page's canonical URL points to an absolute https URL. In this instance a protocol-relative canonical URL such as //www.mysite.com is very useful.
Yes
The href attribute on a canonical link is like any href attribute on <link>: it supports URIs, and URIs can be full or relative.
Moreover, the Canonical Link Relation spec confirms that.
So of course you can use a relative URL such as a protocol-free one.
But don't
I would recommend always using full URLs anyway: scheme, host, path...
Why? Because the canonical URL exists to prevent robots from using the wrong URL.
A relative URL might let bots end up with a wrong URL, whereas with a full URL you can be certain it is the right one.
I believe you can't, according to this website.
Make them 100% specific. For various reasons, a ton of sites use protocol relative links, meaning they leave the http / https bit from their URLs. Don’t do this for your canonicals. You have a preference. Show it.
Also, I'd recommend the https, because Google is using it as a ranking signal.
I'm writing a web crawler and I'm testing it out by starting at Wikipedia. However, I noticed that many of wikipedia's links are prefaced with //, so the link from wikipedia.org to en.wikipedia.org is a link to //en.wikipedia.org. What exactly does this // mean in practice? Does it say "use whatever scheme you were using before and then redirect to this url?" or does it mean something entirely different?
The link will use the same protocol (http or https) as the page that contains it. For example, if https://stackoverflow.com/ contains the link, it will point to https://en.wikipedia.org.
It keeps the protocol (HTTP or HTTPS) that the web page is using.
It's particularly useful for external script and CSS tags when you don't know which protocol your site will be served over.
That's why on Google libraries (https://developers.google.com/speed/libraries/devguide#jquery) you have something like this:
<script src="//ajax.googleapis.com/ajax/libs/jquery/1.10.2/jquery.min.js"></script>
Just while writing this I found a duplicate: Two forward slashes in a url/src/href attribute
Take a look at it.
Yes, the link resolves to that URL using the scheme of the current location.
In order for this to work, the resource the URL points to must be available over every scheme the link may be followed from (usually both http and https).
It is a protocol-relative URL. It keeps http or https.
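For the crawler itself, a common approach is to resolve every extracted href against the URL of the page it was found on, which handles // links and ordinary relative links in one step. Here is a minimal sketch using only Python's standard library; the `LinkCollector` class, the sample HTML, and the `/about` path are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect absolute URLs for every <a href> found in a page."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url  # URL the HTML was fetched from
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin fills in whatever the href leaves out (scheme, host, path).
                self.links.append(urljoin(self.page_url, value))

# Sample markup standing in for a fetched page (assumed, not real Wikipedia HTML).
html = '<a href="//en.wikipedia.org/">English</a> <a href="/about">About</a>'

collector = LinkCollector("https://www.wikipedia.org/")
collector.feed(html)
print(collector.links)
```

No redirect is involved at this stage: the crawler simply ends up with absolute URLs that reuse the scheme of the page it was crawling.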