target = "_blank" vs. target = _blank - html

Is there any difference between target="_blank" and target=_blank ?
They seem to behave the same, but I want to know whether one is better practice than the other (and why). I have always used quotes, but I am reading the Rails Tutorial and noticed that Michael does not use them.

They are equivalent.
The HTML attribute syntax allows for quoted and unquoted attributes.
In addition to the general requirements for attribute values, an unquoted attribute value has the following restrictions:
must not contain any literal space characters
must not contain any """, "'", "=", ">", "<", or "`", characters
must not be the empty string
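As a rough illustration, the three restrictions above can be turned into a small check. This is only a sketch: `canBeUnquoted` is a made-up helper name, and "space characters" is taken here to mean space, tab, line feed, form feed and carriage return.

```javascript
// Hypothetical helper: returns true when `value` satisfies the three
// restrictions listed above for unquoted attribute values.
function canBeUnquoted(value) {
  if (value === '') return false; // must not be the empty string
  // space characters (space, tab, LF, FF, CR) and " ' = > < ` are forbidden
  return !/[ \t\n\f\r"'=<>`]/.test(value);
}

console.log(canBeUnquoted('_blank'));   // true
console.log(canBeUnquoted('not true')); // false
```

So target=_blank passes the check, while a value such as a class list with spaces would need quotes.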

Always use the first approach. When you define an attribute, surround its value with double quotes. While both work, the second looks messy and inconsistent, and it can lead to issues with older browsers.

Related

Is there any difference for data-attribute=false with data-attribute="false" in html element?

I have a data attribute on an HTML element: <button data-verified=false>Update</button>. It holds a boolean value.
Is there any difference from <button data-verified="false">Update</button>, where the value is wrapped in double quotes?
Are boolean values supported in HTML?
Boolean attributes are supported in HTML, but data-verified isn't one of them, no matter how it appears in the markup. data-verified=false and data-verified="false" both create an attribute of the type string and value "false", which if tested in JS as a boolean will be treated as true
This is only the case because false doesn't contain spaces. As a contrary example, data-verified=not true is invalid and not at all the same as data-verified="not true"
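A quick sketch of the truthiness point, in plain JavaScript without a DOM. The constant `raw` stands in for the attribute value the parser produces, and `parseFlag` is a hypothetical helper, not part of any library.

```javascript
// The markup parser hands over the same string in both cases:
// data-verified=false and data-verified="false" both yield "false".
const raw = 'false';

console.log(Boolean(raw));   // true, because any non-empty string is truthy

// Hypothetical helper: compare against the literal text instead.
function parseFlag(value) {
  return value === 'true';
}

console.log(parseFlag(raw)); // false
```

In other words, if you want boolean behavior from a data-* attribute, you have to compare the string yourself.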
There are no differences in the values - however, always prefer to quote around attribute values, because:
Looks cleaner
Easier to maintain
Every editor can deal with it easily
It's a standard, nearly all HTML code examples you'll see use the value quoted
My answer is corroborated by "Do you quote HTML5 attributes?"
I think it is just a convention that attributes always have double quotes.
However, in jQuery you can use the .data() method, which is smart enough to recognize boolean and numeric values.
The only difference is that only the latter is allowed in XHTML. In HTML syntax, they both are allowed, and they are equivalent: the difference is lost when the HTML markup is parsed, and the DOM contains in both cases only the string false.
This follows from general principles in HTML and does not depend on the name of the attribute in any way.
“Boolean value” is a vague term. In HTML5, some attributes are called “boolean attributes”, but this is strongly misleading – especially since the values true and false, far from being the only values allowed, aren’t allowed at all for such attributes. You need to read the specification of “boolean attributes” to see what they really are.
When you use data-* attributes, it is completely up to you what you use as values and how you process them.

escaping inside html tag attribute value

I am having trouble understanding how escaping works inside html tag attribute values that are javascript.
I was led to believe that you should always escape & ' " < > . So for JavaScript as an attribute value I tried:
It doesn't work. However:
and
does work in all browsers!
Now I am totally confused. If all my attribute values are enclosed in double quotes, does this mean I do not have to escape single quotes? Or are &apos; and ASCII 39 technically different characters, such that JavaScript requires ASCII 39 but not &apos;?
There are two types of “escapes” involved here, HTML and JavaScript. When interpreting an HTML document, the HTML escapes are parsed first.
As far as HTML is concerned, the rules within an attribute value are the same as elsewhere plus one additional rule:
The less-than character < should be escaped. Usually &lt; is used for this. Technically, depending on HTML version, escaping is not always required, but it has always been good practice.
The ampersand & should be escaped. Usually &amp; is used for this. This, too, is not always obligatory, but it is simpler to do it always than to learn and remember when it is required.
The character that is used as the delimiter around the attribute value must be escaped inside it. If you use the Ascii quotation mark " as delimiter, it is customary to escape its occurrences using &quot;, whereas for the Ascii apostrophe, the entity reference &apos; is defined in some HTML versions only, so it is safest to use the numeric reference &#39; (or &#x27;).
You can escape > (or any other data character) if you like, but it is never needed.
On the JavaScript side, there are some escape mechanisms (with \) in string literals. But these are a different issue, and not relevant in your case.
In your example, on a browser that conforms to current specifications, the JavaScript interpreter sees exactly the same code alert('Hello');. The browser has “unescaped” &apos; or &#39; to '. I was somewhat surprised to hear that &apos; is not universally supported these days, but it’s not an issue: there is seldom any need to escape the Ascii apostrophe in HTML (escaping is only needed within attribute values and only if you use the Ascii apostrophe as its delimiter), and when there is, you can use the &#39; reference.
&apos; is not a valid entity reference in HTML 4 (it is defined in XML and in HTML5). You should escape using &#39;
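To make the "HTML escapes are parsed first" point concrete, here is a minimal sketch of character-reference decoding. It is not how a browser is implemented: `decodeReferences` is a hypothetical function that handles only numeric references and five named ones. It shows that the script text is the same once the references are resolved.

```javascript
// Minimal sketch of HTML character-reference decoding. Only numeric
// references and five named ones are handled; a real parser knows many more.
const NAMED = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'" };

function decodeReferences(text) {
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z]+);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return code <= 0x10ffff ? String.fromCodePoint(code) : match;
    }
    // Unknown names are left as they are.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}

console.log(decodeReferences('alert(&#39;Hello&#39;);')); // alert('Hello');
console.log(decodeReferences('alert(&quot;Hello&quot;);')); // alert("Hello");
```

A browser that does not know &apos; would leave it in place, and the JavaScript interpreter would then choke on the literal text &apos;, which matches the behavior described in the question.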

Encoding rules for URL with the `javascript:` pseudo-protocol?

Is there any authoritative reference about the syntax and encoding of an URL for the pseudo-protocol javascript:? (I know it's not very well considered, but anyway it's useful for bookmarklets).
First, we know that standard URLs follow the syntax:
scheme://username:password@domain:port/path?query_string#anchor
but this format doesn't seem to apply here. Indeed, it seems, it would be more correct to speak of URI instead of URL : here is listed the "unofficial" format javascript:{body}.
Now, then, which are the valid characters for such a URI (what are the escape/unescape rules) when embedding it in HTML?
Specifically, if I have the code of a JavaScript function and I want to embed it in a javascript: URI, which escape rules should I apply?
Of course one could escape every non-alphanumeric character, but that would be overkill and make the code unreadable. I want to escape only the necessary characters.
Further, it's clear that it would be bad to use some urlencode/urldecode routine pair (those are for query string values), we don't want to decode '+' to spaces, for example.
My findings, so far:
First, there are the rules for writing a valid HTML attribute value: but here the standard only requires (if the attribute value is enclosed in quotes) an arbitrary CDATA (actually a %URI, but HTML itself does not impose additional validation at its level: any CDATA will validate).
Some examples:
<a href="javascript:alert('Hi!')"> (1)
<a href="javascript:if(a > b && 1 < 0) alert( b ? 'hi' : 'bye')"> (2)
<a href="javascript:if(a&gt;b &amp;&amp; 1 &lt; 0) alert( b ? 'hi' : 'bye')"> (3)
Example (1) is valid. But also example (2) is valid HTML 4.01 Strict. To make it valid XHTML we only need to escape the XML special characters < > & (example 3 is valid XHTML 1.0 Strict).
Now, is example (2) a valid javascript: URI ? I'm not sure, but I'd say it's not.
From RFC 2396: a URI is subject to some additional restrictions and, in particular, to escaping/unescaping via %xx sequences. And some characters are always prohibited:
among them spaces and {}# .
The RFC also defines a subset of opaque URIs: those that do not have hierarchical components, and for which the separating characters have no special meaning (for example, they don't have a 'query string', so the ? can be used as any non-special character). I assume javascript: URIs should be considered among them.
This would imply that the valid characters inside the 'body' of a javascript: URI are
a-zA-Z0-9
_|. !~*'();?:@&=+$,/-
%hh : (escape sequence, with two hexadecimal digits)
with the additional restriction that it can't begin with /.
This still leaves out some "important" ASCII characters, for example
{}#[]<>^\
Also % (because it's used for escape sequences), double quotes " and (most important) all blanks.
In some respects, this seems quite permissive: it's important to note that + is valid (and hence it should not be 'unescaped' when decoding, as a space).
But in other respects, it seems too restrictive. Braces and brackets, especially: I understand that they are normally used unescaped and browsers have no problems.
And what about spaces? Like braces, they are disallowed by the RFC, but I see no problem in this kind of URI. However, I see that in most bookmarklets they are escaped as "%20". Is there any (empirical or theoretical) explanation for this?
I still don't know if there are some standard functions to make this escape/unescape (in mainstream languages) or some sample code.
javascript: URLs are currently part of the HTML spec and are specified at https://html.spec.whatwg.org/multipage/browsing-the-web.html#the-javascript:-url-special-case
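As for standard functions: in JavaScript itself, the built-in encodeURI comes close to the "escape only what is necessary" idea from the question. The sketch below is one possible approach, not an authoritative rule for javascript: URLs. `toBookmarklet` and `fromBookmarklet` are made-up names. encodeURI leaves + alone, and decodeURIComponent does not turn + into a space.

```javascript
// Sketch: build a javascript: URL from source text. encodeURI leaves
// letters, digits and ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) # untouched and
// percent-encodes everything else (spaces, braces, double quotes, %, < > ...).
function toBookmarklet(source) {
  return 'javascript:' + encodeURI(source);
}

// Reverse step, assuming the URL was produced by toBookmarklet.
function fromBookmarklet(url) {
  return decodeURIComponent(url.slice('javascript:'.length));
}

const body = "if (a > b && 1 < 0) { alert(b ? 'hi' : 'bye'); }";
const url = toBookmarklet(body);

console.log(url);
console.log(fromBookmarklet(url) === body); // true
```

When the result goes into an href attribute, the HTML layer still applies on top of this: any & left in the URL should be written as &amp; in the markup.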

Single vs Double quotes (' vs ")

I've always used single quotes when writing my HTML by hand. I work with a lot of rendered HTML which always uses double quotes. This allows me to determine if the HTML was written by hand or generated. Is this a good idea?
What is the difference between the two? I know they both work and are supported by all modern browsers but is there a real difference where one is actually better than the other in different situations?
The w3 org said:
By default, SGML requires that all attribute values be delimited using either double quotation marks (ASCII decimal 34) or single quotation marks (ASCII decimal 39). Single quote marks can be included within the attribute value when the value is delimited by double quote marks, and vice versa. Authors may also use numeric character references to represent double quotes (&#34;) and single quotes (&#39;). For double quotes authors can also use the character entity reference &quot;.
So... seems to be no difference. Only depends on your style.
I use " as a top-tier and ' as a second tier, as I imagine most people do. For example
<a href="#" onclick="alert('Clicked!');">Click Me!</a>
In that example, you must use both, it is unavoidable.
Quoting Conventions for Web Developers
The Short Answer
In HTML, single quotes (') and double quotes (") are interchangeable; there is no difference.
But consistency is recommended, therefore we must pick a syntax convention and use it regularly.
The Long Answer
Web development often involves many programming languages: HTML, JS, CSS, PHP, ASP, RoR, Python, etc. Because of this we have many syntax conventions for different programming languages. Often habits from one language will follow us to other languages, even if they are not considered "proper" there, e.g. commenting conventions. Quoting conventions also fall into this category for me.
But I tend to use HTML tightly in conjunction with PHP. And in PHP there is a major difference between single quotes and double quotes. In PHP with double quotes "you can insert variables directly within the text of the string". (scriptingok.com) And when using single quotes "the text appears as it is". (scriptingok.com)
PHP takes longer to process double quoted strings. Since the PHP parser has to read the whole string in advance to detect any variable inside—and concatenate it—it takes longer to process than a single quoted string. (scriptingok.com)
 
Single quotes are easier on the server. Since PHP does not need to read the whole string in advance, the server can work faster and happier. (scriptingok.com)
Other things to consider
Frequency of double quotes within string. I find that I need to use double quotes (") within my strings more often than I need to use single quotes (') within strings. To reduce the number of character escapes needed I favor single quote delimiters.
It's easier to make a single quote. This is fairly self explanatory but to clarify, why press the SHIFT key more times than you have to.
My Convention
With this understanding of PHP I have set the convention (for myself and the rest of my company) that strings are to be delimited by single quotes by default for server optimization. Double quotes are used within the string if quotes are required, such as for JavaScript within an attribute, for example:
<button onClick='func("param");'>Press Me</button>
Of course if we are in PHP and want the parser to handle PHP variables within the string we should intentionally use double quotes. $a='Awesome'; $b = "Not $a";
Sources
Single quotes vs Double quotes in PHP. (n.d.). Retrieved November 26, 2014, from http://www.scriptingok.com/tutorial/Single-quotes-vs-double-quotes-in-PHP
If it's all the same, perhaps using single-quotes is better since it doesn't require holding down the shift key. Fewer keystrokes == less chance of repetitive strain injury.
Actually, the best way is the way Google recommends. Double quotes:
https://google.github.io/styleguide/htmlcssguide.xml?showone=HTML_Quotation_Marks#HTML_Quotation_Marks
See https://google.github.io/styleguide/htmlcssguide.xml?showone=HTML_Validity#HTML_Validity
Quoted Advice from Google: "Using valid HTML is a measurable baseline quality attribute that contributes to learning about technical requirements and constraints, and that ensures proper HTML usage."
In HTML I don't believe it matters whether you use " or ', but it should be used consistently throughout the document.
My own usage prefers that attributes/html use ", whereas all javascript uses ' instead.
This makes it slightly easier, for me, to read and check. If your use makes more sense for you than mine would, there's no need for change. But, to me, your code would feel messy. It's personal is all.
Using double quotes for HTML
i.e.
<div class="colorFont"></div>
Using single quotes for JavaScript
i.e.
$('#container').addClass('colorFont');
$('<div class="colorFont2"></div>');
I know LOTS of people wouldn't agree, but this is what I do and I really enjoy such a coding style: I actually don't use any quote in HTML unless it is absolutely necessary.
Example:
<form method=post action=#>
<fieldset>
<legend>Register here: </legend>
<label for=account>Account: </label>
<input id=account type=text name=account required><br>
<label for=password>Password: </label>
<input id=password type=password name=password required><br>
...
Double quotes are used only when there are spaces in the attribute values or whatever:
<form class="val1 val2 val3" method=post action=#>
...
</form>
I had an issue with Bootstrap where I had to use double quotes as single quotes didn't work.
class='row-fluid' made the last <span> fall below the other <span>s, rather than sitting nicely beside them on the far right. class="row-fluid" worked.
It makes no difference to the html but if you are generating html dynamically with another programming language then one way may be easier than another.
For example in Java the double quote is used to indicate the start and end of a String, so if you want to include a doublequote within the String you have to escape it with a backslash.
String s = "<a href=\"link\">a Link</a>";
You don't have such a problem with the single quote, so using single quotes makes for more readable code in Java.
String s = "<a href='link'>a Link</a>";
Especially if you have to write HTML elements with many attributes. (Note: I usually use a library such as jhtml to write HTML in Java, but it is not always practical to do so.)
If you are writing ASP.NET, then occasionally you have to use double quotes in Eval statements and single quotes for delimiting the values. This is mainly so that the C# inline code knows it is using a string in the Eval container rather than a character. Personally, I'd use only one or the other as a standard and not mix them; mixing looks messy, that's all.
Using " instead of ' when:
<input value="user"/> //Standard html
<input value="user's choice"/> //Need to use single quote
<input onclick="alert('hi')"/> //When giving string as parameter for javascript function
Using ' instead of " when:
<input value='"User"'/> //Need to use double quote
var html = "<input name='username'/>" //When assigning html content to a javascript variable
I'm a newbie here, but I use single quotation marks only when I need double quotation marks inside the value. In case that is unclear, here is an example:
<p align="center" title='One quote mark at the beginning so now I can
"cite".'> ... </p>
I hope I helped.
Lots of great insightful replies here! More than enough for anyone to make a clear and personal decision.
I would simply like to point out one thing that's always mattered to me.
And take this with a grain of salt!
Double quotes apply to strings that have more than a single phrase, such as "one two", while single quotes are for 'one' or 'two'. This can be traced as far back as C and C++.
(reference here or do your own online search).
And that's truly the difference.
With this principle (this difference), parsing became possible for strings such as "{{'a','b'},{'x','y'}}" or "/[^\r\n]*[\r\n]" (which needed to be space-independent because it is an expression) or, more famously for HTML, title = "Hello HTML!" or style = "font-family:arial; color:#FF0000;"
The funny thing here is that HTML (which derives from SGML) commonly adopted double quotes because of these expression features, even when the value is a single character (e.g. a number) or a single-phrase string.
As NibblyPig pointed out quite well and straightforward:
" as a top-tier and ' as a second tier since "'a string here'" is valid and expected by W3 standards (which is for the web) and will most likely never change.
And for consistency, double quotes are widely used, though that is a matter of preference rather than correctness.
In PHP using double quotes causes a slight decrease in performance because variable names are evaluated, so in practice, I always use single quotes when writing code:
echo "This will print you the value of $this_variable!";
echo 'This will literally say $this_variable with no evaluation.';
So you can write this instead;
echo 'This will show ' . $this_variable . '!';
JavaScript does not work this way: it treats single-quoted and double-quoted strings identically, so there is nothing to gain there.
Additionally, if you look all the way back to the HTML 2.0 spec, all the tags listed in the W3 HTML DTD Reference use double quotes. Consistency is important no matter which you tend to use more often.
Double quotes are used for strings (i.e., "this is a string") and single quotes are used for a character (i.e., 'a', 'b' or 'c'). Depending on the programming language and context, you can get away with using double quotes for a character but not single quotes for a string.
HTML doesn't care about which one you use. However, if you're writing HTML inside a PHP script, you should stick with double quotes as you will need to escape them (i.e., \"whatever\") to avoid confusing yourself and PHP.

Is it safe to display user input as input values without sanitization?

Say we have a form where the user types in various info. We validate the info, and find that something is wrong. A field is missing, invalid email, et cetera.
When displaying the form to the user again I of course don't want him to have to type in everything again so I want to populate the input fields. Is it safe to do this without sanitization? If not, what is the minimum sanitization that should be done first?
And to clarify: it would of course be sanitized before being, for example, added to a database or displayed elsewhere on the site.
No it isn't. The user might be directed to the form from a third party site, or simply enter data (innocently) that would break the HTML.
Convert any character with special meaning to its HTML entity.
i.e. & to &amp;, < to &lt;, > to &gt; and " to &quot; (assuming you delimit your attribute values using " and not ').
In Perl use HTML::Entities, in TT use the html filter, in PHP use htmlspecialchars. Otherwise look for something similar in the language you are using.
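If your language has nothing built in, the conversion is short enough to sketch by hand. The version below is only an illustration in JavaScript, with `escapeHtml` as a hypothetical name. It also encodes the apostrophe so that the result is usable inside single-quoted attribute values too.

```javascript
// Hypothetical helper: replace the characters that have special meaning
// in HTML text and quoted attribute values with character references.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;') // must run first, or later output is re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Re-populating a form field with what the user submitted.
const submitted = '"/><script>alert(1)</script>';
const html = '<input type="text" name="email" value="' + escapeHtml(submitted) + '">';

console.log(html);
```

The submitted quote can no longer close the value attribute, so the rest of the input stays plain data.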
It is not safe, because, if someone can force the user to submit specific data to your form, you will output it and it will be "executed" by the browser. For instance, if the user is forced to submit '/><meta http-equiv="refresh" content="0;http://verybadsite.org" />, as a result an unwanted redirection will occur.
You cannot insert user-provided data into an HTML document without encoding it first. Your goal is to ensure that the structure of the document cannot be changed and that the data is always treated as data-values and never as HTML markup or Javascript code. Attacks against this mechanism are commonly known as "cross-site scripting", or simply "XSS".
If inserting into an HTML attribute value, then you must ensure that the string cannot cause the attribute value to end prematurely. You must also, of course, ensure that the tag itself cannot be ended. You can achieve this by HTML-encoding any chars that are not guaranteed to be safe.
If you write HTML so that the value of the tag's attribute appears inside a pair of double-quote or single-quote characters then you only need to ensure that you HTML-encode the quote character you chose to use. If you are not correctly quoting your attributes as described above, then you need to worry about many more characters including whitespace, symbols, punctuation and other ASCII control chars. Although, to be honest, it's arguably safest to encode these non-alphanumeric chars anyway.
Remember that an HTML attribute value may appear in 3 different syntactical contexts:
Double-quoted attribute value
<input type="text" value="**insert-here**" />
You only need to encode the double quote character to a suitable HTML-safe value such as &quot;
Single-quoted attribute value
<input type='text' value='**insert-here**' />
You only need to encode the single quote character to a suitable HTML-safe value such as &#39;
Unquoted attribute value
<input type='text' value=**insert-here** />
You shouldn't ever have an html tag attribute value without quotes, but sometimes this is out of your control. In this case, we really need to worry about whitespace, punctuation and other control characters, as these will break us out of the attribute value.
Except for alphanumeric characters, escape all characters with ASCII values less than 256 with the &#xHH; format (or a named entity if available) to prevent switching out of the attribute. Unquoted attributes can be broken out of with many characters, including [space] % * + , - / ; < = > ^ and | (and more). [para lifted from OWASP]
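The rule in the paragraph above can be sketched as follows. This is an illustration of the &#xHH; scheme only, not OWASP's own encoder: `encodeForUnquotedAttribute` is a made-up name, and it always uses the numeric form rather than named entities.

```javascript
// Hypothetical helper: keep ASCII letters and digits, and turn every other
// character with a code below 256 into a &#xHH; reference.
function encodeForUnquotedAttribute(value) {
  let out = '';
  for (const ch of String(value)) {
    const code = ch.codePointAt(0);
    if (/^[A-Za-z0-9]$/.test(ch) || code >= 256) {
      out += ch;
    } else {
      out += '&#x' + code.toString(16).toUpperCase().padStart(2, '0') + ';';
    }
  }
  return out;
}

console.log(encodeForUnquotedAttribute('a b=c')); // a&#x20;b&#x3D;c
```

With every space, quote and symbol encoded this way, the value cannot switch out of the attribute even when the quotes are missing.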
Please remember that the above rules only apply to control injection when inserting into an HTML attribute value. Within other areas of the page, other rules apply.
Please see the XSS prevention cheat sheet at OWASP for more information
Yes, it's safe, provided of course that you encode the value properly.
A value that is placed inside an attribute in an HTML needs to be HTML encoded. The server side platform that you are using should have methods for this. In ASP.NET for example there is a Server.HtmlEncode method, and the TextBox control will automatically HTML encode the value that you put in the Text property.