Absolutizing an image url with a ".." - html

I have an HTML document I'm transforming with an image whose source url looks like this:
"../foo/bar/baz.png"
I'm using a tritium function to absolutize image source urls, but the ".." seems to be stumping it. It's prepending the hostname, etc, but when it does, it adds one too many layers.
So for example, the correct URL of the image is:
"www.host.com/foo/bar.png"
But the page on which it appears is at "www.host.com/site/baz/page.html"
The source of the image in the original html is therefore "../foo/bar.png"
But the absolutized result I'm getting is: "www.host.com/site/foo/bar.png"
In other words it's going up the file tree to "/site/", but it needs to be going up one more. I don't really see how it even works on the original page without another ".." How should I be handling the ".." in the url?

.. means to traverse one level up; you are using a relative path, not an absolute one like you should be. Drop the dots:
<img src="/foo/bar.png"> will load the image from the root of the domain.

There is a huge difference between src="/foo/bar.png" and src="foo/bar.png" (Notice the slash after the first double quote)
The first one points to http://example.com/foo/bar.png NO MATTER what.
The second one (without the leading slash) is a relative URL, so the resulting path depends on the location of the page on which the image appears.
That is why you were getting "www.host.com/site/foo/bar.png": "../" goes one level up from the page's directory, /site/baz/, which lands in /site/.
Two solutions:
1) src="/foo/bar.png" OR
2) src="../../foo/bar.png"
I always recommend the first approach because even after you move the files around, you won't have to change the absolute URL. (I learned it the hard way)
P.S. this rule applies to CSS files as well. (for example when specifying the background image URL) If you use absolute paths, you won't have to bang your head on the wall when you change the directory of the CSS file.
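To make the difference concrete, here is a minimal sketch that resolves the three forms against the page URL from the question, using the standard URL constructor found in browsers and Node.js. The "http://" scheme and the helper name resolveRef are assumptions added for illustration; the question only gives bare host names.

```javascript
// Page URL from the question; the "http://" scheme is an assumption.
const pageUrl = "http://www.host.com/site/baz/page.html";

// Hypothetical helper: resolve a reference against a base URL
// using the standard URL constructor.
function resolveRef(ref, base) {
  return new URL(ref, base).href;
}

// One ".." climbs from /site/baz/ to /site/, which is the result the asker saw.
console.log(resolveRef("../foo/bar.png", pageUrl));
// http://www.host.com/site/foo/bar.png

// Two ".." segments climb to the root.
console.log(resolveRef("../../foo/bar.png", pageUrl));
// http://www.host.com/foo/bar.png

// A leading slash ignores the page's directory entirely.
console.log(resolveRef("/foo/bar.png", pageUrl));
// http://www.host.com/foo/bar.png
```

If this is how the absolutizing step works, the output in the question is the correct resolution of "../foo/bar.png", and it is the src in the original HTML that is one level short.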

As you're in a Moovweb project, I would suggest manipulating the problematic src before you use the absolutize() function.
Is there an easy way you can select the image using Tritium? I'd suggest doing that, then manipulating the src attribute:
$("./img[@id='']") {
  attribute("src", "/foo/bar.png")
}
After this, you should be able to use the absolutize() function and the src will be rendered correctly.

Related

Why doesn't appending a path segment work using href="./other"?

I'm confused when it comes to how relative paths are calculated in urls.
With a base URL that has no trailing slash ("example.com/a/b"), is there no way to append a new segment using a relative path that contains only the new segment?
Why doesn't appending a path segment work using href="./c"?
When using href="../c" I get the expected result, a relative path one level up in the hierarchy. But what is the syntax to append a relative path even when the base url doesn't end with a trailing slash?
Just using href="c" replaces the last segment and using href="/c" removes all segments. The only relative option I have seems to be href="b/c", but then I have to repeat the last segment, which isn't always convenient. I wish href="./c" or something similar would work...
But perhaps "./c" is not correct because the dot refers to the "folder" which in this case could mean the last segment ending with a slash? But even then it should be possible to use some other syntax to accomplish the same.
Relative URLs (which don't start with a /) are always computed from the last "directory" segment of the path. Any "file name" part is dropped. There is no way to change that with plain URL syntax.
You could do it by writing your own URL resolution code in a programming language of your choice.
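As a rough sketch of what such custom resolution code might look like, the snippet below first shows how the standard URL constructor treats each form from the question, then a small helper that forces a trailing slash before resolving. The "http://" scheme and the name appendSegment are assumptions made for the example, not an existing API.

```javascript
// Base URL from the question; the "http://" scheme is an assumption.
const baseUrl = "http://example.com/a/b";

// Standard resolution: the last segment ("b") is treated as a file name and dropped.
console.log(new URL("c", baseUrl).href);   // http://example.com/a/c
console.log(new URL("./c", baseUrl).href); // http://example.com/a/c
console.log(new URL("/c", baseUrl).href);  // http://example.com/c
console.log(new URL("b/c", baseUrl).href); // http://example.com/a/b/c

// Hypothetical helper: treat the whole base path as a directory
// by adding a trailing slash before resolving the new segment.
function appendSegment(base, segment) {
  const url = new URL(base);
  if (!url.pathname.endsWith("/")) {
    url.pathname += "/";
  }
  return new URL(segment, url.href).href;
}

console.log(appendSegment(baseUrl, "c")); // http://example.com/a/b/c
```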

Why does href="/../subpage/doc.html" work?

The website I found looks like the following:
current URL: http://www.example.com/stocking/
a link: <a href="/../shop/alphabetic/page">
this takes you to http://www.example.com/shop/alphabetic/page.
From what I understand about relative paths, you use a leading slash to refer to the current base URL and leading dots to go up from the current directory. Therefore, it should make no sense to do the above.
Actually, I'm surprised this is even working and somehow equivalent to either
href="../shop/alphabetic/page"
href="/shop/alphabetic/page"
which should work as well for this purpose.
So how does this even work?
/ starts an absolute path.
../ then goes up a path segment, but as you are at the top already, it has no effect and is ignored.
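A quick way to check this is to resolve the three forms with the standard URL constructor (available in browsers and Node.js). This is only a sketch; the variable names are made up for the example.

```javascript
// Current page URL from the question.
const currentUrl = "http://www.example.com/stocking/";

// The leading slash restarts at the root, and the ".." has nothing left to remove.
const odd = new URL("/../shop/alphabetic/page", currentUrl).href;

// The two more conventional spellings mentioned in the question.
const dotted = new URL("../shop/alphabetic/page", currentUrl).href;
const rooted = new URL("/shop/alphabetic/page", currentUrl).href;

console.log(odd);    // http://www.example.com/shop/alphabetic/page
console.log(dotted); // http://www.example.com/shop/alphabetic/page
console.log(rooted); // http://www.example.com/shop/alphabetic/page
```

All three agree here only because the page sits one level below the root; on a deeper page, the "../" form would resolve differently from the other two.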

How to go to sub-directory with relative URI

Sorry if this is basic, but I could not find the answer by searching.
If we are on the page http://www.example.com/a-dir-without-trailing-slash, how can we reach the sub-directory http://www.example.com/a-dir-without-trailing-slash/pic using a relative URI? (We do not know the current directory name, i.e. a-dir-without-trailing-slash.)
Some more explanation:
a-dir-without-trailing-slash is the name of an article in the website. It is not an actual directory nor an actual file name. Now, I want to get the pictures that are used in this article by addresses like:
http://www.example.com/a-dir-without-trailing-slash/pic/1
http://www.example.com/a-dir-without-trailing-slash/pic/2
,...
and in the webpage html, I would refer to them with something similar to:
<img src="pic/1" />
If the original article address were in the form http://www.example.com/a-dir-with-trailing-slash/, the above example would work fine. I want to know whether it is possible to use a relative URI with the current article addresses (without a trailing slash).
Thank you very much
I suppose you want to avoid hard coding "slugs" in the content so that they can be stored and manipulated independent of each other.
One solution is to use the base tag which allows you to specify the prefix that is added to relative URLs instead of typing them all over the place.
Make sure that your website uses absolute URLs where necessary.
Modify your CMS to generate the following tag, with a trailing slash at the end of its href, and place it inside the head section:
<base href="/a-dir-without-trailing-slash/">
Then you can use relative URLs inside the content, for example:
<img src="pic/1">
<!-- http://www.example.com/a-dir-without-trailing-slash/pic/1 -->
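The effect of the base tag can be approximated outside the browser with the standard URL constructor. The sketch below is only an illustration: resolveWithBase is a hypothetical helper that mimics resolving the base href against the document URL first, and then resolving the image reference against the result.

```javascript
// Article URL from the question (no trailing slash).
const articleUrl = "http://www.example.com/a-dir-without-trailing-slash";

// Hypothetical helper: baseHref is the value of <base href="...">,
// or null when the page has no base tag.
function resolveWithBase(ref, baseHref, documentUrl) {
  const base = baseHref === null
    ? documentUrl
    : new URL(baseHref, documentUrl).href;
  return new URL(ref, base).href;
}

// Without a base tag, the last path segment is dropped before resolving.
console.log(resolveWithBase("pic/1", null, articleUrl));
// http://www.example.com/pic/1

// With the base tag from the answer, the image lands under the article.
console.log(resolveWithBase("pic/1", "/a-dir-without-trailing-slash/", articleUrl));
// http://www.example.com/a-dir-without-trailing-slash/pic/1
```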
You need server-side scripting to add the filename to URLs (or maybe just one <base> tag in the head). – Salman

Extract (random) image with no useful src= from web page

First I'd like to know how this can be achieved in general, and then maybe someone knows how to accomplish this using Capybara.
Example: <img src="http://example.com/getrandomimage">
The thing is, src points to a script which returns a random image, not to the image itself.
Page is loaded, script is run, image is displayed. I can easily get the src value, but if I access the link to download the image, the script runs again and returns a totally different picture. And I need the one that's already on the page.
I think the process would be very similar using JS or Capybara. I'd break it down into two steps:
Write a selector that will find the <img> tag. In JS that might look like:
myImg = document.getElementsByTagName("img")[0]
Call .src on the returned node:
result = myImg.src
I believe Capybara is limited to XPath and CSS selectors. Therefore, depending on the page you are trying to scrape, you'll have to identify some sort of pattern in the HTML tags or the CSS attributes to find the <img> tag.

Unable to apply CSS for body of a HTML document

I am unable to apply a background image in my HTML document using the following code in CSS:
body {
  text-align: center;
  background-image: url('C:\wamp\www\marks display\WI71.jpg');
}
I searched for this and found that the declaration above should be valid, but it has no effect. Why is this happening?
That's not a URL, that's a file path.
If the root of your site is marks display, probably you want this:
background-image:url('/WI71.jpg');
The path should not map to a drive (a file path) when publishing on the web; it should be a URL.
It should be like background-image:url('http://domainname/71.jpg'); -- Complete Url of Image
or background-image:url('WI71.jpg'); -- Relative url
HTML and CSS are really meant to be served from a web server. For a file on your local machine, the path would look like this:
background-image: url('c:/xyz/xyz/sample.jpg');
However, if you are uploading your site to a real web server, do not give paths like that; write it like this instead:
background-image: url('foldername_if required/imagename');
The string C:\wamp\www\marks display\WI71.jpg does not comply with URL syntax. To begin with, the character \ as such is not allowed in URLs; it should be replaced by the slash /. The space character should be %-encoded as %20. Finally, to refer to a file in the local system with a pathname, use a file: URL:
background-image:url('file:///C:/wamp/www/marks%20display/WI71.jpg');
However, IE has very permissive error recovery here, so your malformed code actually works on IE, if the file exists in the place indicated with the name given. Other browsers require correct code (mostly).
Such URLs are of very limited usefulness. They mostly work in local testing only, and even in local testing, it is better to use URLs that are relative to the location of the HTML document. This way, you can use the same code in local testing and on a web server, provided that you replicate the relevant parts of the folder structure.
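For local testing, the conversion from a Windows path to a file: URL described above (backslashes to slashes, spaces to %20, file:/// prefix) can be sketched in a few lines. The function name windowsPathToFileUrl is made up for this example, and it assumes an absolute path that starts with a drive letter.

```javascript
// Hypothetical helper: convert an absolute Windows path (starting with a
// drive letter such as "C:") into a file: URL.
function windowsPathToFileUrl(path) {
  // Backslashes are not allowed in URLs, so switch to forward slashes.
  const [drive, ...rest] = path.replace(/\\/g, "/").split("/");
  // Percent-encode each segment after the drive (a space becomes %20).
  const encoded = rest.map((segment) => encodeURIComponent(segment));
  return "file:///" + [drive, ...encoded].join("/");
}

console.log(windowsPathToFileUrl("C:\\wamp\\www\\marks display\\WI71.jpg"));
// file:///C:/wamp/www/marks%20display/WI71.jpg
```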