Change BASE HREF for absolute references? - html

I copy a large html source of an external page (say, http://www.example.com/bar/something.html) into a directory in my PC (say, /xxx). The file 'something.html' contains many absolute references in the form href="/bar/another.html" or src="/bar2/yetanother.jpg" etc.
If I simply click 'something.html' (accessing it from my browser as 'file://') -- or even if I upload it to my own server and access it via 'http://' -- all those references will be looked up on the same host where the file is. I still want them to be looked up on the original host (i.e., http://www.example.com).
Had they been relative references (without the leading slash), I would simply put <base href="http://www.example.com/"> in the HEAD section. How can I achieve a similar effect with those absolute references?
Consider also the case where something.html includes many other files (css, js, ...) which may also have such absolute references...

You have the terminology backwards: the reference /bar/another.html is a relative reference, not an absolute reference. The / indicates to restart at the root of the resource. For file:/// URLs, this will start at the root of the filesystem, but it's still relative.
If you add <base href="...">, the browser resolves every relative reference against it, unless the URL is indeed absolute (begins with http://, ftp://, file:// etc.). For a reference that begins with /, only the scheme and host of the base are kept and the path is replaced.
If you set the base href to file:///where_i_downloaded/, references without a leading slash are loaded from there, although references that begin with / still resolve to the root of the filesystem. If you set it to http://www.foo.com/, the browser will attempt to load everything from the original server (attempt, because URLs for AJAX services may not work with this).
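To see the resolution rule in action without a browser, here is a minimal sketch using Python's urllib.parse.urljoin, which follows the same general resolution rules for simple cases like these. The helper name resolve and the cdn.example.org URL are made up for illustration.

```python
from urllib.parse import urljoin

# The value that would go into <base href="..."> in the HEAD section.
BASE = "http://www.example.com/"


def resolve(reference, base=BASE):
    """Resolve a reference against the base, roughly as a browser would."""
    return urljoin(base, reference)


if __name__ == "__main__":
    for ref in (
        "/bar/another.html",              # starts with /: path replaces the base path
        "bar/another.html",               # no leading slash: appended to the base directory
        "http://cdn.example.org/lib.js",  # fully absolute: the base is ignored
    ):
        print(ref, "->", resolve(ref))
```

With a base that points at the root of the original host, both the slash and the no-slash form end up on that host, which is the effect asked for in the question.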

Related

How do I specify an absolute path?

I wrote this code, but it doesn't work in my HTML page, and I don't know why. Why does this happen when the path is not wrong?
<img id = "navLogo" src = "C:/Bitnami/wampstack-7.1.27-0/apache2/htdocs/TermProject/imgs/navLogo.jpg"></img>
For security reasons, websites may not request arbitrary files from your machine's filesystem.
Keep in mind that the way this works is the HTML is sent to the browser and then the browser sends a second request for the image.
In this case the browser would be trying to get the file off of your machine, (which may coincidentally be where the web server happens to be running), but if this site were live on the web and someone else accessed it, their browser would be trying to get this image from that user's machine, not from the website.
If web pages were allowed to load files from your local filesystem, one could very easily create a site to grab files off of your machine and transmit them elsewhere, creating a MASSIVE security problem.
To fix this you should specify a path relative to the web server's root, which would probably mean:
<img src="/imgs/navLogo.jpg" />
or maybe:
<img src="/TermProject/imgs/navLogo.jpg" />
Note that behavior will be different if you're loading the HTML file from the filesystem (the location is file:…) vs. serving it from a web server (location is http://…). I'm assuming you're doing the latter here based on the fact that your image is under an Apache directory.
Ray Hatfield's answer is good but I don't have enough rep to comment.
I think you are misunderstanding absolute vs relative paths here.
You are thinking of absolute as starting with your c: drive, but on a web server, the absolute path starts at the root of your site, which is always just a forward slash "/".
Relative paths begin from the current directory of the source file(s) that specify them.
Neither of them can go higher in the file system than your site's root folder.
To put it simply (specifically answering "How do I specify an absolute path?"):
All absolute paths on the web start with a /
All relative paths on the web do not.
Given the path in your example:
C:/Bitnami/wampstack-7.1.27-0/apache2/htdocs/TermProject/imgs/navLogo.jpg
A default Apache installation considers the site root to be the "htdocs" folder in this path.
This means that the absolute path / on your website is found at C:/Bitnami/wampstack-7.1.27-0/apache2/htdocs/ on your hard drive.
If you have a file C:/Bitnami/wampstack-7.1.27-0/apache2/htdocs/TermProject/index.html then you can access the navLogo image from that page
with an absolute path at:
/TermProject/imgs/navLogo.jpg
or a relative path at (note the missing forward slash):
imgs/navLogo.jpg
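As a rough illustration of the mapping described above, this sketch resolves a src value against the page URL and then appends the resulting URL path to the document root. The host name localhost and the helper file_served_for are assumptions, and a real Apache setup may add aliases or rewrites that this ignores.

```python
from urllib.parse import urljoin, urlsplit

# Site root from the question; Apache's DocumentRoot in a default install.
DOCUMENT_ROOT = "C:/Bitnami/wampstack-7.1.27-0/apache2/htdocs"
# Assumed URL of the page that contains the <img> tag.
PAGE_URL = "http://localhost/TermProject/index.html"


def file_served_for(src, page_url=PAGE_URL, document_root=DOCUMENT_ROOT):
    """Return the file on disk that a given src value would end up requesting."""
    absolute_url = urljoin(page_url, src)
    return document_root + urlsplit(absolute_url).path


if __name__ == "__main__":
    print(file_served_for("/TermProject/imgs/navLogo.jpg"))  # absolute path
    print(file_served_for("imgs/navLogo.jpg"))               # relative path
    print(file_served_for("/imgs/navLogo.jpg"))              # a different file
```

The first two calls point at the same file, while the third looks for an imgs folder directly under htdocs.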
I believe the source you specify should be a path that starts from the folder where your HTML file is located. So, try something like "imgs/navLogo.jpg" (without a leading slash, so that it resolves relative to that folder).

Relative paths from directory

We have a qa/dev server and a prod server. The two differ by a directory like this
https://domain/service/envQA/sitename
https://domain/service/env/sitename
In some static html I'm trying to put in src and href that are relative to avoid having the markup reference QA if a developer migrates the content and doesn't update an absolute path that includes the envQA. We aren't very fancy and just move most documents over by hand and a busy developer might miss a reference in the middle of several pages of markup -- it happens.
So I'm trying to use relative paths like this.
<img src="assets/backgroundimg.png" />
This works when the user is at our homepage url of https://domain/service/env/sitename but unfortunately our site also has navigational elements that return the user to https://domain/service/env/sitename/ (note the closing slash).
Is there any way (without javascript) to handle a relative path that would work from either of those "locations"?
Have you considered using the <base> tag?
https://developer.mozilla.org/en-US/docs/Web/HTML/Element/base
This would allow you to set a base per environment allowing configuring all urls at once.
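A small sketch of why the trailing slash matters and how a base changes the picture, using Python's urllib.parse.urljoin as a stand-in for the browser. The BASE_HREF value is only an assumption; it should be whichever directory actually contains assets/ in each environment.

```python
from urllib.parse import urljoin

SRC = "assets/backgroundimg.png"
# Assumed per-environment value for <base href="...">.
BASE_HREF = "https://domain/service/env/sitename/"


def resolve(page_url, reference, base_href=None):
    """Resolve a reference for a page, optionally with a <base href> in effect."""
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, reference)


if __name__ == "__main__":
    for page in ("https://domain/service/env/sitename",
                 "https://domain/service/env/sitename/"):
        print(page)
        print("  no base:  ", resolve(page, SRC))
        print("  with base:", resolve(page, SRC, BASE_HREF))
```

Without a base the two page URLs give two different image URLs; with the base both resolve to the same one, so only the base value has to change between QA and prod.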

Using Root-Relative Links With Subdomain

I have inherited a website where the previous developer has coded all links relative to the site root, with a leading forward slash in each link:
<link href="/css/file.css" />
<script src="/js/file.js"></script>
This works great for when the site is hosted on a server, as the links will equate to:
http://www.example.com/css/file.css
http://www.example.com/js/file.js
However, I'm trying to get these links to work correctly when called from within a subfolder for local testing. Specifically, I'm using WAMP, and have moved the entire code to a local folder called site at http://localhost:8080/site/.
I can't use the root of localhost, as WAMP stores various files there (including an index that would get overwritten).
The obvious solution, as many posts here on StackOverflow suggest, is to simply use folder-relative links, such as:
<link href="css/file.css" />
<script src="js/file.js"></script>
However, there are literally hundreds of hard-coded root-relative links in various different files, so it would be great to avoid having to alter every single one of them if possible.
To avoid having to edit every link, I've tried setting an HTML <base> tag and specifying the folder directly:
<base href="http://localhost:8080/site/">
However, this doesn't seem to work.
Is <base> incompatible with root-relative links?
Is there any way I can easily have all files reference http://localhost:8080/site/ without having to manually edit each one of their preexisting root-relative links? Or will I have to manually update each one to be folder-relative?
Is <base> incompatible with root-relative links?
No, but an absolute path is still an absolute path. It will resolve relative to http://localhost:8080/site/ by dropping /site/.
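A quick way to check this outside the browser is Python's urllib.parse.urljoin, which applies the same general resolution rules for simple cases like this. The helper name resolve is just for illustration.

```python
from urllib.parse import urljoin

# The <base href> that was tried in the question.
BASE = "http://localhost:8080/site/"


def resolve(reference, base=BASE):
    """Resolve a link against the base, roughly as a browser would."""
    return urljoin(base, reference)


if __name__ == "__main__":
    print(resolve("/css/file.css"))  # leading slash: /site/ is dropped
    print(resolve("css/file.css"))   # no leading slash: /site/ is kept
```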
If you want to keep the absolute paths and not keep your development sites in subdirectories, then configure your HTTP server to use name-based virtual hosting.
Add custom hostnames (either in the DNS server for your LAN or in the hosts file on your development system), such as site.localhost, and set the DocumentRoot for each one in a virtual host.
Have you tried using the replace function in your IDE? You can simply replace all the ="/ with =". It'll save you a lot of work and stress.
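If you would rather script the replacement than do it in an IDE, something along these lines could work. It is only a sketch: the function name make_folder_relative is made up, it only looks at href and src attributes, it leaves protocol-relative links such as //cdn.example.com/... alone, and the result is only correct for pages that sit in the top-level folder of the site (pages in subfolders would need ../ prefixes, and a bare href="/" would become empty).

```python
import re

# Matches href="/ or src="/ (single or double quotes) where the slash is
# not followed by a second slash, so protocol-relative URLs are skipped.
_ROOT_RELATIVE = re.compile(r'''\b(href|src)\s*=\s*(["'])/(?!/)''')


def make_folder_relative(html):
    """Strip the leading slash from root-relative href/src values."""
    return _ROOT_RELATIVE.sub(r"\1=\2", html)


if __name__ == "__main__":
    sample = (
        '<link href="/css/file.css" />\n'
        '<script src="/js/file.js"></script>\n'
        '<script src="//cdn.example.com/lib.js"></script>\n'
    )
    print(make_folder_relative(sample))
```

Run it on a copy of the files first and compare the output before overwriting anything.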

Absolute or Relative URL if my website may not be at the root folder?

I am developing a website on a web server which can be accessed via two URLs: mywebsite.example.com or example.com/mywebsite. For example, when I access mywebsite.example.com/images/abc.jpg and example.com/mywebsite/images/abc.jpg, I get the same picture.
The problem is, I have many links inside my website, and I am not sure whether I should use absolute or relative paths.
From another question
Absolute vs relative URLs
I found someone suggesting using URLs relative to the root (like /images/abc.jpg); however, when I access the website using example.com/mywebsite, every link breaks.
For relative paths, I found it hard to manage since webpages are in different folders, but use the same template which contains some links. It means I have to manually set some links as ../ and some as ./.
I have also tried using the <base> tag; however, it breaks anchor links. Even if I try to include the full path before the # symbol, some jQuery libraries do not function properly, since they read the value of the href attribute directly instead of extracting the part after #.
Would there be any better practice or suggestion?
I think you should use relative URLs, and concentrate your search on how to use relative URLs in templates so that they are resolved relative to the final page.
I don't know the technology you are using for templating, but I see two common solutions:
Declare a "relative path" variable in the template, and then override it in the different pages with the new relative path. Use this relative path as a prefix for all URLs.
Delegate URL construction to a service that knows the final page. Something like resolveUrl(..)
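As a rough sketch of the second idea, here is what such a helper might look like in Python. The name resolve_url and both parameters are hypothetical; the paths are given relative to the site's top folder, so the result works whether the site is served at mywebsite.example.com or at example.com/mywebsite.

```python
import posixpath


def resolve_url(page_path, target_path):
    """Return target_path expressed relative to the page at page_path.

    Both arguments are paths inside the site, e.g. "products/list/index.html"
    and "images/abc.jpg". A leading slash is tolerated and ignored.
    """
    page_dir = posixpath.dirname("/" + page_path.lstrip("/"))
    target = "/" + target_path.lstrip("/")
    return posixpath.relpath(target, start=page_dir)


if __name__ == "__main__":
    for page in ("index.html", "about/index.html", "products/list/index.html"):
        print(page, "->", resolve_url(page, "images/abc.jpg"))
```

The template would call the helper with the path of the page being rendered, so the ../ prefixes no longer have to be maintained by hand.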

Why would a developer place a forward slash at the start of each relative path?

I am examining some code for a friend, and have found that the developer who built his site began each and every relative src, href, and include with a forward slash /.
For example:
src="/assets/js/jquery.js"
I have never seen this before. So my question is, why would a developer place a forward slash / at the start of a relative path?
It's done in order to root the path (making it an absolute path).
It ensures that the path is not relative but read from the root of the site.
This allows one to move a file around and not have to change the links to the different resources.
Using your example:
src="/assets/js/jquery.js"
If the referencing file is in /pages/admin/main.html (for example) using relative paths you would use:
src="../../assets/js/jquery.js"
Suppose you move the file to a child directory. No changes would be needed with the original rooted path, but the relative one would need to change to:
src="../../../assets/js/jquery.js"
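The effect of the move can be checked with Python's urllib.parse.urljoin, which resolves these paths the way a browser would in simple cases. The host www.example.com and the tools subfolder are assumptions made for the sake of the example.

```python
from urllib.parse import urljoin

# Assumed page locations before and after moving the file one level down.
OLD_PAGE = "http://www.example.com/pages/admin/main.html"
NEW_PAGE = "http://www.example.com/pages/admin/tools/main.html"

ROOTED = "/assets/js/jquery.js"
RELATIVE = "../../assets/js/jquery.js"
RELATIVE_FIXED = "../../../assets/js/jquery.js"


def resolve(page_url, src):
    """Resolve a src value for the page at page_url."""
    return urljoin(page_url, src)


if __name__ == "__main__":
    print(resolve(OLD_PAGE, ROOTED), resolve(NEW_PAGE, ROOTED))  # identical
    print(resolve(OLD_PAGE, RELATIVE))        # correct before the move
    print(resolve(NEW_PAGE, RELATIVE))        # wrong after the move
    print(resolve(NEW_PAGE, RELATIVE_FIXED))  # correct again after the edit
```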
Adding on to @Oded's answer, the slash makes the URL absolute.
For example:
/foo/bar/baz.css
This translates to:
http://www.example.com/foo/bar/baz.css
But without the slash, things become a bit different:
foo/bar/baz.css
This tells the browser to look in the current folder (not the root folder) for the directory foo and then the subsequent directories and the file.
Also, take for instance this HTML:
<script type="text/javascript" src="foo.js"></script>
If you move the HTML file into another folder, then the script will not load, as foo.js isn't being moved with the HTML file.
But if you use an absolute URL:
<script type="text/javascript" src="/foo.js"></script>
Then the JS file is loaded EXACTLY from http://www.example.com/foo.js no matter where the HTML file is.
This is to ensure the asset comes from the "root" of the web server.
e.g.
Host is www.example.com
URL becomes www.example.com/assets/js/jquery.js
I do this with projects that I want to ensure live on their own virtual host.
The issue really comes down to where those assets are being included from. For example, if the asset is being included from /help/pages/faq/, then the developer can be sure the path will work correctly when the site is hosted on a host that does not change, e.g. example.com.
The issue with using a relative path such as 'assets/js/jquery.js' is that if the assets are included from /help/pages/faq/, then the path becomes relative to that starting point, e.g. /help/pages/faq/assets/js/jquery.js
Hope that helps
This is a bit off topic, but if there is any chance that your application will ever be served behind a reverse proxy (eg. using apache2 or nginx) under a sub-path, you should try to avoid absolute paths.
For example, if you reference "/style.css" on https://example.com/, and you tried to hide it behind a reverse proxy at https://proxy.example.com/example/, your absolute reference would break. The browser would make the request to "https://proxy.example.com/style.css" when it should have requested "https://proxy.example.com/example/style.css".
Unintentional absolute paths from a leading forward slash are a nightmare for reverse proxies to deal with.
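The reverse proxy case from the example can be reproduced with Python's urllib.parse.urljoin, which resolves references the way a browser does for simple paths like these. The helper name resolve is just for illustration.

```python
from urllib.parse import urljoin

# The page as seen by the browser once it is served through the proxy.
PROXIED_PAGE = "https://proxy.example.com/example/"


def resolve(reference, page_url=PROXIED_PAGE):
    """Resolve a stylesheet reference for the proxied page."""
    return urljoin(page_url, reference)


if __name__ == "__main__":
    print(resolve("/style.css"))  # escapes the /example/ prefix
    print(resolve("style.css"))   # stays under /example/
```

The first request goes to the proxy's root and never reaches the application behind /example/, while the second one does.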