Using slashes in tel: URIs - cross-browser

It's pretty clear that the correct way to link to a phone number from HTML is with a URI like tel:2065551212. However, many browsers also accept telephone URIs with slashes, à la tel://2065551212, even though this is, AFAICT, not allowed by the relevant RFC (3966).
So, my question is this: why do some people recommend using slashes? Is it just a bad habit from http URIs?
Or... were there actually some early mobile browsers that did not properly recognize tel: links without the slashes? I have read at least one blogger indicating that this is the case, which would be a poor but understandable excuse for doing so. Can anybody provide an example where tel://### would work but tel:### would not?
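As a quick way to see why the slashed form clashes with the generic URI syntax, here is a small sketch using Python's urllib.parse (my illustration, not something from the RFC): the // makes a parser treat the digits as an authority (host) component, which the tel: scheme does not define.

    from urllib.parse import urlparse

    # RFC 3966 form: the number ends up in the path component
    print(urlparse("tel:2065551212"))
    # ParseResult(scheme='tel', netloc='', path='2065551212', params='', query='', fragment='')

    # Slashed form: the number is parsed as an authority (host),
    # a component the tel: scheme does not define
    print(urlparse("tel://2065551212"))
    # ParseResult(scheme='tel', netloc='2065551212', path='', params='', query='', fragment='')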

Related

Switch browser to a strict mode in order to write proper html code

Is it possible to switch a browser to a "strict mode" in order to write proper code at least during the development phase?
I constantly see invalid, dirty HTML code (not to mention bad JavaScript and CSS), and I feel that one reason is the high tolerance level of all browsers. So I would at least like to have a stricter mode to use in the browser during development, in order to force myself to write proper code.
Is there anything like that in any of the known browsers?
I know about the W3C validator, but honestly, who really uses it frequently?
Is there maybe some sort of regular interface between browser and validator? Are there any development environments where validation is tested automatically?
Is there anything like that in any of the known browsers? Is there maybe some sort of regular interface between browser and validator? Are there any development environments where validation is tested automatically?
The answer to all those questions is “No”. No browsers have any built-in integration like what you describe. There are (or were) some browser extensions that would take every single document you load and send it to the W3C validator for checking, but using one of those extensions (or anything else that automatically sends things to the W3C validator in the background) is a great way to get the W3C to block your IP address (or the IP-address range for your entire company network) for abuse of W3C services.
I know about the W3C validator, but honestly, who really uses it frequently?
The W3C validator currently processes around 17 requests every second—around 1.5 million documents every day—so I guess there are quite a lot of people using it frequently.
I constantly see invalid, dirty HTML code… I would at least like to have a stricter mode to use in the browser during development, in order to force myself to write proper code.
I'm not sure what specifically you mean by “dirty HTML code” or “proper code”, but I can say that there are a lot of markup cases that are not bad or invalid but which some people mistakenly consider bad.
For example, some people think every <p> start tag should always have a matching </p> end tag, but the fact is that from the time HTML was created, it has never required documents to have matching </p> end tags in all cases. (In fact, when HTML was created, the <p> element was basically an empty element, not a container, so the <p> tag was simply a marker.)
Another example of a case that some people mistakenly think of as bad is unquoted attribute values; e.g., <link rel=stylesheet …>. But the fact is that unless an attribute value contains spaces, it generally doesn't need to be quoted. So in fact there's actually nothing wrong at all with a case like <link rel=stylesheet …>.
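As a quick illustration of that point, here is a small sketch (mine, using Python's standard html.parser module rather than a real browser engine) showing that a quoted and an unquoted attribute value parse identically:

    from html.parser import HTMLParser

    class AttrDump(HTMLParser):
        def handle_starttag(self, tag, attrs):
            print(tag, attrs)

    # Quoted and unquoted attribute values produce the same result
    AttrDump().feed('<link rel=stylesheet href=main.css>')
    AttrDump().feed('<link rel="stylesheet" href="main.css">')
    # Both print: link [('rel', 'stylesheet'), ('href', 'main.css')]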
So there's basically no point in trying to find a tool or mechanism to check for cases like that, because those cases are not actually real problems.
All that said, the HTML spec does define some markup cases as being errors, and those cases are what the W3C validator checks.
So if you want to catch real problems and be able to fix them, the answer is pretty simple: Use the W3C validator.
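If you want to script that (for occasional checks, not for sending every page load to the service, for the reasons given above), the checker behind validator.w3.org/nu has a JSON output mode you can POST documents to. A minimal sketch in Python, assuming the third-party requests library and a local file page.html; for anything high-volume, run your own copy of the checker instead:

    import requests

    with open("page.html", "rb") as f:
        html = f.read()

    # POST the document to the checker and ask for JSON-formatted results
    resp = requests.post(
        "https://validator.w3.org/nu/?out=json",
        data=html,
        headers={"Content-Type": "text/html; charset=utf-8"},
    )
    for msg in resp.json().get("messages", []):
        print(msg.get("type"), msg.get("lastLine"), msg.get("message"))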
Disclosure: I'm the maintainer of the W3C validator. 😀
As @sideshowbarker notes, there isn't anything built into all browsers at the moment.
However, I do like the idea and wish there were such a tool (that's how I got to this question).
There is a "partial" solution, in that if you use Firefox, and view the source (not the developer tools, but the CTRL+U or right click "View Page Source") Firefox will highlight invalid tag nesting, and attribute issues in red in the raw HTML source. I find this invaluable as a first pass looking at a page that doesn't seem to be working.
It is quite nice because it isn't super picky about the asdf id not being quoted, or if an attribute is deprecated, but it highlights glitchy stuff like the spacing on the td attributes is messed up (this would cause issues if the attributes were not quoted), and it caught that the span tag was not properly closed, and that the script tag is outside of the html tag, and if I had missed the doctype or had content before it, it flags that too.
Unfortunately "seeing" these issues is a manual process... I'd love to see these in the dev console, and in all browsers.
Most plugins/extensions only get access to the DOM after it has been parsed, when these errors are gone or negated... however, if there is a way to get the raw HTML source in one of these extension models, so that we could code an extension to test for these types of errors, I'd be more than willing to help write one (DM @scunliffe on Twitter). Alternatively, this may require writing something at a lower level, like a script to run in Fiddler.

practices in handling bad robot requests: URLs containing "&amp;" instead of "&"

& is a reserved character in html therefore everywhere I have url's pointing to some path with querystring I put & instead of & so that I get valid HTML.
There are a many different crawlers that goes over the website and access this url's but they don't use html decode methods to get the correct url values so they make requests to my website with:
mywebsite.com/?p1=v1&p2=v2
instead of
mywebsite.com/?p1=v1&p2=v2
Right now I am responding with the error page, as the robots that make these requests are of no interest to me.
But my question is: what is the best practice for handling this kind of request?
Do you know if there is any use in supporting this kind of request? (For example, are there any popular crawlers or browsers that don't properly convert these URLs?)
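To make the failure concrete, here is a small sketch (mine, using Python's standard html module) of the difference between a consumer that HTML-decodes the attribute value and one that requests the attribute text verbatim:

    from html import escape, unescape

    url = "http://mywebsite.com/?p1=v1&p2=v2"

    # What belongs in the HTML source, inside href="..."
    href = escape(url)
    print(href)            # http://mywebsite.com/?p1=v1&amp;p2=v2

    # A correct parser decodes the entity before requesting the URL
    print(unescape(href))  # http://mywebsite.com/?p1=v1&p2=v2

    # A naive crawler requests the attribute text verbatim, so the
    # server sees a bogus parameter named "amp;p2" instead of "p2"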
I think you can expect that any major crawler is able to handle valid escaped URLs, so I wouldn't worry about the rest.
If you really want to, you may add rewrite rules to your Apache or whatever you use. But this may lead to other problems when a URL really contains the character sequence &amp; and gets erroneously replaced with & by your rewrite rule.
In my opinion it is better to leave this untouched. It is not your fault, and if you do not really care about these crawlers - so what? :)
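For completeness, here is a rough sketch of the normalization idea as WSGI middleware (hypothetical and untested against any particular framework; the same thing can be done with an Apache RewriteRule), including the caveat just mentioned:

    # Hypothetical WSGI middleware that undoes a crawler's failure to
    # HTML-decode "&amp;" in query strings before the app sees them.
    # Caveat from above: this corrupts URLs whose query string really
    # does contain the literal character sequence "&amp;".
    class FixAmpQueryString:
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            qs = environ.get("QUERY_STRING", "")
            environ["QUERY_STRING"] = qs.replace("&amp;", "&")
            return self.app(environ, start_response)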
Yes & is a reserved character but your not gonna put it in website links.
Correct
mywebsite.com/?p1=v1&p2=v2
Incorrect
mywebsite.com/?p1=v1&p2=v2

HTML5: which is better - using a character entity vs using a character directly?

I've recently noticed a lot of high profile sites using characters directly in their source, eg:
<q>“Hi there”</q>
Rather than:
<q>&ldquo;Hi there&rdquo;</q>
Which of these is preferred? I've always used entities in the past, but using the character directly seems more readable, and would seem to be OK in a Unicode document.
If the encoding is UTF-8, the normal characters will work fine, and there is no reason not to use them. Browsers that don't support UTF-8 will have lots of other issues while displaying a modern webpage, so don't worry about that.
So it is easier and more readable to use the characters and I would prefer to do so.
It also saves a couple of bytes, which is good, although there is much more to gain by using compression and minification.
The main advantage I can see with encoding characters is that they'll look right, even if the page is interpreted as ASCII.
For example, if your page is just a raw HTML file, the default settings on some servers would be to serve it as text/html; charset=ISO-8859-1 (the default in HTTP/1.1). Even if you set the meta tag for the content type, the HTTP header has higher priority.
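To see what that failure looks like, here is a small sketch (mine) of UTF-8 curly quotes being misread as ISO-8859-1, which is effectively what a misconfigured server makes the browser do; an entity like &ldquo; would survive unharmed because it is plain ASCII:

    # The UTF-8 bytes of “Hi there” misread as ISO-8859-1 (mojibake)
    s = "\u201cHi there\u201d"
    mangled = s.encode("utf-8").decode("iso-8859-1")
    print(repr(mangled))
    # 'â\x80\x9cHi thereâ\x80\x9d' (the curly quotes turn into garbage)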
Whether this matters depends on how likely the page is to be served by a misconfigured server.
It is better to use characters directly: they make for easier-to-read code.
Google's HTML style guide advocates for the same. The guide itself can be found here:
Google HTML/CSS Style guide.
Using characters directly. They are easier to read in the source (which is important as people do have to edit them!) and require less bandwidth.
The example given is definitely wrong, in theory as well as in practice, in HTML5 and in HTML 4. For example, the HTML5 discussion of q markup says:
“Quotation punctuation (such as quotation marks) that is quoting the contents of the element must not appear immediately before, after, or inside q elements; they will be inserted into the rendering by the user agent.”
That is, use either q markup or punctuation marks, not both. The latter is better on all practical accounts.
Regarding the issue of characters vs. entity references, the former are preferable for readability, but then you need to know how to save the data as UTF-8 and declare the encoding properly. It’s not rocket science, and usually better. But if your authoring environment is UTF-8 hostile, you need not be ashamed of using entity references.
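As a side note (my sketch), the fallback can be automated; Python, for example, can emit numeric character references for everything outside ASCII:

    # Turn non-ASCII characters into numeric character references
    text = "<q>\u201cHi there\u201d</q>"   # <q>“Hi there”</q>
    print(text.encode("ascii", "xmlcharrefreplace").decode("ascii"))
    # <q>&#8220;Hi there&#8221;</q>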

URL hash format, what's allowed and what's not?

I'm using hash-based navigation in my rich web app. I also found I needed to create permalinks that would point to single instances of resources, but since I cannot cause the page to refresh, and the main page is loaded from the single path '/', I cannot use real URLs. Instead I thought about using hashes. Let me give you an example, because I know the explanation above sucks.
So, instead of having http://example.com/path/to/resource/1, I would have http://example.com/#path/to/resource/1
This seems to work OK, and the browser believes '#path/to/resource/1' is a hash (slashes permitted, I think), but I was wondering what characters are allowed in a URL hash. Is there a specification or an RFC that I could read to find out what the standard behavior of browsers is when it comes to hashes?
EDIT: Ok, so silly me. Didn't actually check if slashes worked in all browsers. Chrome obviously doesn't like them. Only works in FF.
Look at: http://www.w3.org/Addressing/rfc1630.txt or http://www.w3.org/Addressing/URL/4_2_Fragments.html
Basically you can use anything that can be encoded in a URL.
Note: there might be browser inconsistencies. If you fear them, you might use a serialization mechanism, like converting the string to hex (it will be twice as long, though), or use an id of some sort.
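A small sketch (mine, in Python) of both options mentioned above, percent-encoding the whole payload versus hex-encoding it:

    from urllib.parse import quote, unquote

    payload = "path/to/resource/1"

    # Percent-encode everything, including the slashes
    enc = quote(payload, safe="")
    print(enc)           # path%2Fto%2Fresource%2F1
    print(unquote(enc))  # path/to/resource/1

    # Or hex-encode it (twice as long, as noted above)
    hexed = payload.encode("utf-8").hex()
    print(hexed)                                 # 706174682f746f2f7265736f757263652f31
    print(bytes.fromhex(hexed).decode("utf-8"))  # path/to/resource/1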
This document should help. Slashes are allowed, but the lexical analysis might differ between browsers.
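You can check how a standards-following parser splits the fragment off with Python's urllib.parse (my sketch); the slashes stay inside the fragment:

    from urllib.parse import urlparse, urldefrag

    u = urlparse("http://example.com/#path/to/resource/1")
    print(u.fragment)  # path/to/resource/1

    # urldefrag splits the same way
    print(urldefrag("http://example.com/#path/to/resource/1"))
    # DefragResult(url='http://example.com/', fragment='path/to/resource/1')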
I think you might find that useful: RFC3986
If you use PHP to generate your page paths, you could also use urlencode(), which generates a valid URL.

is this url invalid and not good practice?

I have a URL in this format:
http://www.example.com/manchester united
Note the space between manchester and united. Is this bad practice, or is it perfectly fine? I just wanted to know before I proceed, thanks.
The space is not a valid character in URIs; you have to replace it with %20. It may also be considered bad practice. Replacing the space with -, + or _ is preferable; it is both “prettier” and doesn't require escaping of the URI.
Most browsers will still try to parse URIs with a space, but that's highly ambiguous.
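A quick sketch of the escaping (mine, using Python's urllib.parse) and of the friendlier alternative:

    from urllib.parse import quote

    print(quote("manchester united"))             # manchester%20united

    # The friendlier alternative: pick a separator up front
    print("manchester united".replace(" ", "-"))  # manchester-united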
It's bad practice not only because browsers are required to turn the space into a %20 and thus obfuscate your users' address bars, but because it would be difficult to communicate the url to anyone.
Furthermore, what about all of those "find links in text" regexes that are around Stack Overflow? You'd effectively break them all!
It will be replaced in the address bar as http://www.example.com/manchester%20united, which I personally think is far uglier than the alternative http://www.example.com/manchester_united.
I believe spaces in URLs are replaced with %20 by many browsers.
You will need to use %20 instead of the space; however, the browser will do it for you. I would rather not have any spaces in the URI.
Technically this will work. The browser will replace the space with a %20, and the server will translate it back.
But ... it's not generally a good idea because it can lead to ambiguity, or difficulty in communicating the URL to others, particularly in an advertising setting where you're expecting someone to type in a URL they've seen in print.
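The round trip the browser and server perform looks roughly like this (my sketch, in Python):

    from urllib.parse import quote, unquote

    # What the browser sends on the wire, and what the server recovers
    sent = quote("/manchester united")
    print(sent)           # /manchester%20united
    print(unquote(sent))  # /manchester united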
Maybe a question for: https://webmasters.stackexchange.com/
But...
If you enter that into a browser, it will add %20 between manchester and united. Technically you should do this in your HTML page, but most modern browsers can handle it. Common practice is to split the words with a hyphen, i.e. http://www.example.com/manchester-united.
Look at the URL of this question for an example of this in action.
You can do that, but apparently it's bad style.
See the following: http://www.blooberry.com/indexdot/html/topics/urlencoding.htm