In HTML "href" automate - html

Suppose you are on the page "http://www.demo.com/demo/index.php".
This page has two links:
first
and second.
When you click "first", the address becomes "http://www.demo.com/demo/link.php?id=1", but
when you click "second", why is the address not "http://www.demo.com/demo/http://www.google.com"?
I have no idea why, although I have already tried it.

URLs have several components (e.g. the scheme, the hostname, the query string).
You can omit any number of them from the left and the URL will be resolved relative to another URL.
link.php?id=1 omits the scheme, the hostname and the / that indicates the top of the path, so it is resolved relative to the current URL.
The second starts with the scheme, so none of the current URL is kept.
If you did want a relative URI whose first path segment contains a colon, you would use a dot prefix, as per the spec:
A path segment that contains a colon character (e.g., "this:that")
cannot be used as the first segment of a relative-path reference, as
it would be mistaken for a scheme name. Such a segment must be
preceded by a dot-segment (e.g., "./this:that") to make a relative-
path reference.
See the URI spec for further reading.
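The resolution rules above can be checked with a short sketch. It uses `urljoin` from Python's standard `urllib.parse` as a stand-in for the browser's resolver; that substitution is an assumption (browsers follow the WHATWG URL standard), but the two agree for simple cases like these. The base URL and both link targets come from the question, and `./this:that` is the spec's own example.

```python
from urllib.parse import urljoin

# Base URL from the question: the page that contains both links.
base = "http://www.demo.com/demo/index.php"

# No scheme, no host, no leading slash: resolved against the base's directory.
first = urljoin(base, "link.php?id=1")

# Starts with a scheme, so nothing from the base URL is kept.
second = urljoin(base, "http://www.google.com")

# The spec's example: the dot-segment stops "this" being read as a scheme.
with_dot = urljoin(base, "./this:that")
without_dot = urljoin(base, "this:that")

print(first)        # http://www.demo.com/demo/link.php?id=1
print(second)       # http://www.google.com
print(with_dot)     # http://www.demo.com/demo/this:that
print(without_dot)  # this:that
```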

Related

What's the difference between the two css link below, and what are the purposes of the two links? [duplicate]

I just learned from a colleague that omitting the "http | https" part of a URL in a link will make that URL use whatever scheme the page it's on uses.
So for example, if my page is accessed at http://www.example.com and I have a link (notice the '//' at the front):
<a href="//www.google.com">Google</a>
That link will go to http://www.google.com.
But if I access the page at https://www.example.com with the same link, it will go to https://www.google.com
I wanted to look online for more information about this, but I'm having trouble thinking of a good search phrase. If I search for "URLs without HTTP" the pages returned are about urls with this form: "www.example.com", which is not what I'm looking for.
Would you call that a schemeless URL? A protocol-less URL?
Does this work in all browsers? I tested it in FF and IE 8 and it worked in both. Is this part of a standard, or should I test more browsers?
This is called a protocol-relative URL.
You may receive unusual security warnings in some browsers.
See also, Wikipedia Protocol-relative URLs for a brief definition.
At one time, it was recommended; but going forward, it should be avoided.
See also the Stack Overflow question Why use protocol-relative URLs at all?.
It is called a network-path reference (the part that is missing is called the scheme, or protocol). It is defined in RFC 3986, Section 4.2:
4.2 Relative Reference
A relative reference takes advantage of the hierarchical syntax
(Section 1.2.3) to express a URI reference relative to the name space
of another hierarchical URI.
    relative-ref  = relative-part [ "?" query ] [ "#" fragment ]

    relative-part = "//" authority path-abempty
                  / path-absolute
                  / path-noscheme
                  / path-empty
The URI referred to by a relative reference, also known as the target URI, is obtained by applying the reference resolution
algorithm of Section 5.
A relative reference that begins with two slash characters is
termed a network-path reference (emphasis mine); such references are rarely used.
A relative reference that begins with a single slash character is termed an absolute-path reference. A relative reference that does not begin with a slash character is termed a relative-path reference.
A path segment that contains a colon character (e.g., "this:that") cannot be used as the first segment of a relative-path reference, as it would be mistaken for a scheme name. Such a segment must be preceded by a dot-segment (e.g., "./this:that") to make a relative-path reference.
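As a quick illustration of how a network-path reference picks up the scheme of the page it appears on, here is a minimal sketch. It uses Python's `urllib.parse.urljoin` in place of a browser (an assumption made only for illustration), with the example hosts from the question.

```python
from urllib.parse import urljoin

# Network-path reference: the scheme is omitted, the authority is present.
reference = "//www.google.com"

# The same reference, resolved from an http page and from an https page.
from_http = urljoin("http://www.example.com/", reference)
from_https = urljoin("https://www.example.com/", reference)

print(from_http)   # http://www.google.com
print(from_https)  # https://www.google.com
```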

Why doesn't appending a path segment work using href="./other"?

I'm confused when it comes to how relative paths are calculated in urls.
When the base URL has no trailing slash ("example.com/a/b"), I can't seem to append a new segment with a relative path that contains only the new segment.
Why doesn't appending a path segment work using href="./c"?
When using href="../c" I get the expected result, a relative path one level up in the hierarchy. But what is the syntax to append a relative path even when the base url doesn't end with a trailing slash?
Just using href="c" replaces the last segment, and using href="/c" removes all segments. The only relative option I seem to have is href="b/c", but then I have to repeat the last segment, which isn't always easy to do. I wish href="./c" or something similar would work...
But perhaps "./c" is not correct because the dot refers to the "folder" which in this case could mean the last segment ending with a slash? But even then it should be possible to use some other syntax to accomplish the same.
Relative URLs (which don't start with a /) are always computed from the last "directory" segment of the path. Any "file name" part is dropped. There is no way to change that with plain URL syntax.
You could do it by writing your own URL resolution code in a programming language of your choice.
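A sketch of both points, using Python's `urllib.parse`: first how each relative form from the question resolves against a base without a trailing slash, then one possible "own resolution code" helper. The function name `append_segment` is hypothetical, and an `http://` scheme is added to the question's base URL so that it can be parsed.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit

base = "http://example.com/a/b"  # no trailing slash, as in the question

# Standard resolution: the last segment ("b") is treated as a file name and dropped.
results = {ref: urljoin(base, ref) for ref in ["c", "./c", "../c", "/c", "b/c"]}
for ref, resolved in results.items():
    print(f"{ref:5} -> {resolved}")
# c     -> http://example.com/a/c
# ./c   -> http://example.com/a/c
# ../c  -> http://example.com/c
# /c    -> http://example.com/c
# b/c   -> http://example.com/a/b/c


def append_segment(url: str, segment: str) -> str:
    """Append a path segment whether or not the URL ends with a slash.

    Hypothetical helper: it discards any query string and fragment on `url`.
    """
    parts = urlsplit(url)
    path = parts.path.rstrip("/") + "/" + segment.lstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


print(append_segment(base, "c"))        # http://example.com/a/b/c
print(append_segment(base + "/", "c"))  # http://example.com/a/b/c
```

With a trailing slash on the base, plain `urljoin(base + "/", "c")` already gives `http://example.com/a/b/c`, which is why the helper only matters for slash-less bases.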

How to point to an anchor when the fragment identifier is already used in the URL?

I have a page (over which I have no control) with a URL similar to
https://example.com/#group:1106/about:Bxk9H9jJQOm-pYkmpZVjhA
Within this page, there is an element
<h1 id="content-H1-59520">Introduction</h1>
Fragment identifiers (#) can be used to point to a specific id on a page:
In URIs for MIME text/html pages such as
http://www.example.org/foo.html#bar the fragment refers to the element
with id="bar".
My question: given that the fragment identifier is already used in the bare URL, how should I modify the URL so that it points to the H1 element above?
On a hunch I tried https://example.com/#group:1106/about:Bxk9H9jJQOm-pYkmpZVjhA#content-H1-59520 but it does not work.
The fragment identifier component is indicated by the first # and terminated by the end of the URL. The URL you tested is invalid, because the fragment identifier component may not contain a #.
The URL has to be:
https://example.com/#content-H1-59520
If a JavaScript-based site requires the fragment identifier to represent application states, it conflicts with the browser feature to jump to an ID.
You could either switch to a different URI design (that doesn’t require the fragment identifier component), or maybe you could rebuild the jump feature in JavaScript (e.g., appending the anchor ID to the fragment, delimited by #).
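To see where a parser puts the boundary, here is a minimal sketch using Python's `urllib.parse` (a stand-in for the browser, which is an assumption). It splits the URL tested in the question and then builds the plain-anchor form given in the answer.

```python
from urllib.parse import urlsplit, urldefrag

# The URL tried in the question, with two "#" characters.
tested = "https://example.com/#group:1106/about:Bxk9H9jJQOm-pYkmpZVjhA#content-H1-59520"

parts = urlsplit(tested)
print(parts.path)      # /
print(parts.fragment)  # group:1106/about:Bxk9H9jJQOm-pYkmpZVjhA#content-H1-59520

# Everything after the FIRST "#" is the fragment, so no element id matches it.
# Dropping the fragment and adding only the element id gives the plain anchor URL.
page, _ = urldefrag(tested)
plain_anchor = page + "#content-H1-59520"
print(plain_anchor)    # https://example.com/#content-H1-59520
```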

Is it standard web-browser behaviour to prepend the current URL to incomplete links?

If a website includes an incomplete link such as the following:
Link
Link
Is it standard, universal behaviour that the link would be interpreted as the current URL with the href value appended to it?
The HTML5 spec defines how [href] attributes behave:
The href attribute on a and area elements must have a value that is a valid URL potentially surrounded by spaces.
which links to:
A string is a valid URL potentially surrounded by spaces if, after stripping leading and trailing whitespace from it, it is a valid URL.
which links to:
A URL is a valid URL if it conforms to the authoring conformance requirements in the URL standard. [URL]
which links to a sizable block of text, but I think the following is important:
Most of the URL-related terms used in the HTML specification (URL, absolute URL, relative URL, relative schemes, scheme component, scheme data, username, password, host, port, path, query, fragment, percent encode, get the base, and UTF-8 percent encode) can be straightforwardly mapped to the terminology of [RFC3986] [RFC3987].
As for the "incomplete link" examples you included in your question: they are examples of a "fragment" and a "query" respectively. Both are resolved against the current document's URL, so the current path is kept and the fragment or query from the href replaces the existing one (note that it will not merge query strings or document fragment identifiers).
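A small sketch of that behaviour, again using Python's `urllib.parse.urljoin` as a stand-in for the browser. The href values in the question were lost, so `#section` and `?page=2` below are hypothetical examples of a fragment-only and a query-only link, and the current URL is made up as well.

```python
from urllib.parse import urljoin

# Hypothetical current URL, with both a query string and a fragment.
current = "http://example.com/a/b?x=1#top"

# Fragment-only reference: path and query are kept, the fragment is replaced.
fragment_link = urljoin(current, "#section")

# Query-only reference: the path is kept, the old query is replaced (not merged)
# and the old fragment is dropped.
query_link = urljoin(current, "?page=2")

print(fragment_link)  # http://example.com/a/b?x=1#section
print(query_link)     # http://example.com/a/b?page=2
```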

How do user agents distinguish domains from file extensions in relative urls?

Let's say a browser encounters a link like this:
<a href='stackoverflowhome.html'>home</a>
This is clearly a relative URL to an HTML file in the current directory, but how does the browser know that the .html is a file extension and not a TLD (top-level domain)? Does it have a list of common file extensions, or a list of TLDs? And if so, is it manually updated whenever a new file format becomes commonly used, or when the list of accepted TLDs changes, for example with brand TLDs?
It's because that is how RFC 3986 specifies that URIs should be parsed. If the URI does not have a scheme (a set of characters followed by a colon, e.g. http: or gopher:), then it must be treated as a relative URI. Quoting from the RFC:
A URI-reference is either a URI or a relative reference. If the
URI-reference's prefix does not match the syntax of a scheme followed
by its colon separator, then the URI-reference is a relative
reference.
User agents are allowed to make their best guess about what the user meant (see Section 4.5), especially where the context is ambiguous (such as the URL bar in a browser). However, the RFC recommends against relying on this for URIs that will be around for a long time: the heuristics change over time, so the same reference could resolve to different resources depending on when it is accessed and which user agent accesses it.
Relative URLs are never domain names.
A URL is only parsed as containing a domain name if it has a protocol (or is protocol-relative).
The URL does not start with a protocol specifier - no http:// or https://, so is interpreted as a relative URL.
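The rule is purely syntactic, which a short sketch can show. It uses Python's `urllib.parse` rather than a browser (an assumption for illustration), and the base URL `http://example.com/dir/page.html` is made up. No list of TLDs or file extensions is consulted anywhere.

```python
from urllib.parse import urlsplit, urljoin

href = "stackoverflowhome.html"  # the href from the question

# No scheme and no leading "//": the whole string is a path, not a host name.
parts = urlsplit(href)
print(repr(parts.scheme), repr(parts.netloc), repr(parts.path))
# '' '' 'stackoverflowhome.html'

# Only the "//" prefix (or a scheme followed by "//") makes it a host name.
network = urlsplit("//" + href)
print(repr(network.netloc))  # 'stackoverflowhome.html'

# Resolved against a hypothetical page URL, it stays in the current directory.
resolved = urljoin("http://example.com/dir/page.html", href)
print(resolved)  # http://example.com/dir/stackoverflowhome.html
```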