w3c markup validator ampersand (&) error - html

Is there any workaround for the w3c validation error for an & present in urls or some other place in HTML markup?
It says:
& did not start a character reference. (& probably should have been escaped as &amp;.)
The ampersand in my case is a part of a url for gravatar thumbnail. This is the problematic part of a url:
c91588793296e2?s=50&d=http%3A%2F%.

For each & sign you have to write &amp;
In your example it would be:
c91588793296e2?s=50&amp;d=http%3A%2F%

Use &amp; for literal ampersands, even in URLs.
http://htmlhelp.com/tools/validator/problems.html#amp

Replace & with &amp;

It should be:
c91588793296e2?s=50&amp;d=http%3A%2F%.
Notice the &amp;.
I know it feels wonky, but ampersands have to be encoded as HTML entities, which are confusingly denoted with ampersands.
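
If the markup is generated by a program, the replacement does not have to be done by hand. The following is a minimal sketch, assuming the page is produced from Python; it uses the standard-library html.escape, and the variable names are only illustrative.

```python
import html

# URL fragment from the question, with a raw ampersand between parameters
raw = "c91588793296e2?s=50&d=http%3A%2F%"

# html.escape replaces & with &amp; (and, by default, also <, > and quotes)
escaped = html.escape(raw)
print(escaped)
```

The escaped string is what goes into the HTML source; the browser turns &amp; back into & before requesting the URL.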

Related

How do you escape escaped text in HTML? [duplicate]

I have some XML text that I wish to render in an HTML page. This text contains an ampersand, which I want to render in its entity representation: &amp;.
How do I escape this ampersand in the source XML? I tried &amp;, but this is decoded as the actual ampersand character (&), which is invalid in HTML.
So I want to escape it in such a way that it will be rendered as &amp; in the web page that uses the XML output.
When your XML contains &amp;amp;, this will result in the text &amp;.
When you use that in HTML, that will be rendered as &.
As per §2.4 of the XML 1.0 spec, you should be able to use &amp;.
I tried &amp; but this isn't allowed.
Are you sure it isn't a different issue? XML explicitly defines this as the way to escape ampersands.
The & character is itself an escape character in XML, so the solution is to concatenate it and the Unicode decimal equivalent for &, thus ensuring that there are no XML parsing errors. That is, replace the character & with &#38;.
Use CDATA tags:
<![CDATA[
This is some text with ampersands & other funny characters. >>
]]>
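
As a rough check of what a CDATA section does, the sketch below parses a small document with Python's standard-library ElementTree. The wrapper element name note is made up for the example; the text inside the CDATA section is taken from the answer above.

```python
import xml.etree.ElementTree as ET

# A CDATA section lets & and > appear literally inside XML character data
doc = (
    "<note><![CDATA[This is some text with ampersands "
    "& other funny characters. >>]]></note>"
)

# The parser hands back the raw characters, with no entity decoding involved
text = ET.fromstring(doc).text
print(text)
```

Note that the parsed text contains a raw &, so it still has to be escaped as &amp; if it is later written into an HTML page.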
&amp; should work just fine. Wikipedia has a list of predefined entities in XML.
In my case I had to change it to %26.
I needed to escape & in a URL, so &amp; did not work out for me.
The urlencode function changes & to %26. This way neither XML nor the browser URL mechanism complained about the URL.
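
The urlencode function mentioned here is presumably PHP's. A comparable sketch in Python, assuming the standard-library urllib.parse.quote as the stand-in, looks like this; the example value and the example.com URL are hypothetical.

```python
from urllib.parse import quote

# A parameter value that itself contains an ampersand (hypothetical data)
value = "R&D"

# safe="" makes quote() percent-encode every reserved character, including &
encoded = quote(value, safe="")

# The encoded value can no longer be mistaken for a parameter separator
url = "http://example.com/search?q=" + encoded
print(url)
```

Percent-encoding applies to an ampersand that is part of the data. An ampersand that separates two parameters stays a literal & in the URL and is written as &amp; only when the URL is embedded in XML or HTML.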
I tried &amp, but it didn't work. Based on Wim ten Brink's answer I tried &amp;amp and it worked.
One of my fellow developers suggested that I use &#x26;, and that worked regardless of how many times it may be rendered.
&amp; is the way to represent an ampersand in most sections of an XML document.
If you want to have XML displayed within HTML, you need to first create properly encoded XML (which involves changing & to &amp;) and then use that to create properly encoded HTML (which involves again changing & to &amp;). That results in:
&amp;amp;
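
The two encoding passes can be sketched in Python with the standard library, assuming xml.sax.saxutils.escape for the XML step and html.escape for the HTML step. The sample text is made up.

```python
import html
from xml.sax.saxutils import escape as xml_escape

text = "Fish & Chips"  # hypothetical content containing a literal ampersand

# Pass 1: make the text safe inside an XML document
as_xml = xml_escape(text)

# Pass 2: make that XML source safe to display inside an HTML page
as_html = html.escape(as_xml)

print(as_xml)
print(as_html)
```

Each decoding step undoes exactly one pass: the browser turns the HTML form back into the XML source, and an XML parser would turn that back into the original text.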
For a more thorough explanation of XML encoding, see:
What characters do I need to escape in XML documents?
<xsl:text disable-output-escaping="yes">&amp; </xsl:text> will do the trick.
Suppose your XML looks like this:
<Employees Id="1" Name="ABC">
<Query>
SELECT * FROM EMP WHERE ID=1 AND RES<>'GCF'
</Query>
</Employees>
You cannot use <> directly, as it throws an error. In that case, you can use &lt;&gt; in its place:
<Employees Id="1" Name="ABC">
<Query>
SELECT * FROM EMP WHERE ID=1 AND RES &lt;&gt; 'GCF'
</Query>
</Employees>
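
When the XML is generated rather than typed, a serializer can apply these replacements automatically. The sketch below assumes Python's standard-library ElementTree and rebuilds the Employees example; it is an illustration, not the answerer's own method.

```python
import xml.etree.ElementTree as ET

# Build the document from plain strings; no manual escaping is needed here
employees = ET.Element("Employees", {"Id": "1", "Name": "ABC"})
query = ET.SubElement(employees, "Query")
query.text = "SELECT * FROM EMP WHERE ID=1 AND RES <> 'GCF'"

# The serializer writes < and > in text content as &lt; and &gt;
serialized = ET.tostring(employees, encoding="unicode")
print(serialized)

# Parsing the result gives back the original, unescaped SQL text
round_trip = ET.fromstring(serialized).find("Query").text
```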
14.1 How to use special characters in XML has all the codes.

Using a "&" in <a></a>

Currently, I have:
Start Process
However, I ran this code through the W3 HTML validator (https://validator.w3.org), and it comes up with this:
& did not start a character reference. (& probably should have been escaped as &amp;.)
Is there another proper way to put a "&" into an <a></a> tag, or should I just leave it like how it is?
Handling ampersands (&) in URLs is explained in the Web Design Group's Common Validator Problems page:
Ampersands (&'s) in URLs
Another common error occurs when including a URL which contains an ampersand ("&"):
<!-- This is invalid! --> ...
This example generates an error for "unknown entity section" because the "&" is assumed to begin an entity reference. Browsers often recover safely from this kind of error, but real problems do occur in some cases. In this example, many browsers correctly convert &copy=3 to ©=3, which may cause the link to fail. Since &lang; is the HTML entity for the left-pointing angle bracket, some browsers also convert &lang=en to 〈=en. And one old browser even finds the entity &sect;, converting &section=2 to §ion=2.
To avoid problems with both validators and browsers, always use &amp; in place of & when writing URLs in HTML:
...
Note that replacing & with &amp; is only done when writing the URL in HTML, where "&" is a special character (along with "<" and ">"). When writing the same URL in a plain text email message or in the location bar of your browser, you would use "&" and not "&amp;". With HTML, the browser translates "&amp;" to "&" so the Web server would only see "&" and not "&amp;" in the query string of the request.
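
The failure mode described above can be reproduced approximately with Python's standard-library html.unescape, which also recognizes some legacy entity names without a trailing semicolon. This is only a sketch: html.unescape is not a browser and does not apply the special attribute-value rules of current HTML parsers, and the page.cgi path is a hypothetical stand-in.

```python
import html

# Hypothetical URL using the parameter names discussed in the quoted text
raw_url = "page.cgi?chapter=1&section=2&copy=3"

# Decoding the unescaped URL lets &sect and &copy be taken for entities
misread = html.unescape(raw_url)
print(misread)

# Escaping first, as the page recommends, makes decoding return the real URL
escaped = html.escape(raw_url)
recovered = html.unescape(escaped)
print(escaped)
print(recovered)
```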

Set a mailto link with a subject containing an ampersand (&)

I'm using the following mailto link to send an email:
<a class="share3" title="" href="mailto:?subject=#check&body=#domain">
It works well, but sometimes my subject will contain an ampersand (&) character, and when it does my email is created without a body.
Any way to resolve this problem?
In order to get special/reserved characters into a URL, you must encode them - to get an & to work, it must be encoded to %26.
More details here: http://www.w3schools.com/tags/ref_urlencode.asp
Use %26 for the & in the subject.
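
A small sketch of building such a link, assuming the values are percent-encoded in Python with urllib.parse.quote before being placed in the mailto URL. The subject and body strings are made up.

```python
from urllib.parse import quote

subject = "Q&A session"   # hypothetical subject containing an ampersand
body = "See you there"    # hypothetical body text

# Encode each value separately so that only the & between the two
# parameters remains a literal separator
href = (
    "mailto:?subject=" + quote(subject, safe="")
    + "&body=" + quote(body, safe="")
)
print(href)
```

Because the ampersand in the subject becomes %26, the mail client no longer treats it as the start of a new parameter and the body is kept.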

Do I encode ampersands in <a href...>?

I'm writing code that automatically generates HTML, and I want it to encode things properly.
Say I'm generating a link to the following URL:
http://www.google.com/search?rls=en&q=stack+overflow
I'm assuming that all attribute values should be HTML-encoded. (Please correct me if I'm wrong.) So that means if I'm putting the above URL into an anchor tag, I should encode the ampersand as &amp;, like this:
<a href="http://www.google.com/search?rls=en&amp;q=stack+overflow">
Is that correct?
Yes, it is. HTML entities are parsed inside HTML attributes, and a stray & would create an ambiguity. That's why you should always write &amp; instead of just & inside all HTML attributes.
That said, only & and quotes need to be encoded. If you have special characters like é in your attribute, you don't need to encode those to satisfy the HTML parser.
It used to be the case that URLs needed special treatment for non-ASCII characters like é. You had to encode those using percent-escapes (in this case %C3%A9), because URLs were defined by RFC 1738. However, RFC 1738 has been superseded by RFC 3986 (URIs, Uniform Resource Identifiers) and RFC 3987 (IRIs, Internationalized Resource Identifiers), on which the WHATWG has based its work, since HTML5, to define how browsers should behave when they see a URL with non-ASCII characters in it. It's therefore now safe to include non-ASCII characters in URLs, percent-encoded or not.
By current official HTML recommendations, the ampersand must be escaped, e.g. as &amp;, in contexts like this. However, browsers do not require it, and the HTML5 CR proposes to make this a rule, so that special rules apply in attribute values. Current HTML5 validators are outdated in this respect (see bug report with comments).
It will remain possible to escape ampersands in attribute values, but apart from validation with current tools, there is no practical need to escape them in href values (and there is a small risk of making mistakes if you start escaping them).
You have two standards concerning URLs in links (<a href).
The first standard is RFC 1866 (HTML 2.0), where in "3.2.1. Data Characters" you can read which characters need to be escaped when used as the value of an HTML attribute. (Attribute names themselves do not allow special characters at all, e.g. <a hr&ef="http://... is not allowed, nor is <a hr&amp;ef="http://....)
This later went into the HTML 4 standard; the characters you need to escape are:
< to &lt;
> to &gt;
& to &amp;
" to &quot;
' to &apos;
The other standard is RFC 3986 "Generic URI standard", where URLs are handled (this happens when the browser is about to follow a link because the user clicked on the HTML element).
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
It is important to escape those characters so the client knows whether they represent data or a delimiter.
Example unescaped:
https://example.com/?user=test&password&te&st&goto=https://google.com
Example, a fully legitimate URL
https://example.com/?user=test&password&te%26st&goto=https%3A%2F%2Fgoogle.com
Example fully legitimate URL in the value of an HTML attribute:
https://example.com/?user=test&amp;password&amp;te%26st&amp;goto=https%3A%2F%2Fgoogle.com
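
The two layers can be sketched in Python, assuming urllib.parse.urlencode for the URI step (RFC 3986) and html.escape for the HTML attribute step. The parameter name key is hypothetical and stands in for the valueless parameters of the example above.

```python
import html
from urllib.parse import urlencode

# Layer 1: percent-encode the data inside each parameter
params = [
    ("user", "test"),
    ("key", "te&st"),                 # hypothetical name; & is data here
    ("goto", "https://google.com"),   # a URL nested inside a parameter
]
url = "https://example.com/?" + urlencode(params)

# Layer 2: escape the finished URL for use as an HTML attribute value
attr = html.escape(url, quote=True)
tag = f'<a href="{attr}">link</a>'

print(url)
print(tag)
```

The order matters: percent-encoding decides which ampersands are data, and HTML escaping is then applied to whatever separators remain.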
Also important scenarios:
JavaScript code as a value:
<img src="..." onclick="window.location.href = &quot;https://example.com/?user=test&amp;password&amp;te%26st&amp;goto=https%3A%2F%2Fgoogle.com&quot;;">...</a> (Yes, ;; is correct.)
JSON as a value:
...
Escaped things inside escaped things, double encoding, URL inside URL inside parameter, etc,...
http://x.com/?passwordUrl=http%3A%2F%2Fy.com%2F%3Fuser%3Dtest&password=""123
I am posting a new answer because I find zneak's answer does not have enough examples, does not show HTML and URI handling as different aspects and standards and has some minor things missing.
Yes, you should convert & to &amp;.
This HTML validator tool by W3C is helpful for questions like this. It will tell you the errors and warnings for a particular page.
