HTML semantics and styling

My question is this – is it still best practice to use certain HTML tags even if you would then need to style them differently to how a browser interprets and displays those tags?
For example – the HTML5 <blockquote> tag will start on its own line, with default margins that produce an indent.
However – if you do not want the indent, should you still use the <blockquote> tag in order to convey meaning to the browser and search engines, and then apply CSS to reduce or remove the indent, or should you just use a <p> tag, for example?
It’s not much effort to restyle the blockquote element, and I assume that it is important to use the tags that most accurately convey the meaning of their contents, but at the same time I do not want to get into a habit of writing extra code if it is considered best practice not to do so.

I assume that it is important to use the tags that most accurately
convey the meaning of their contents
You assume correctly.
I would say yes: always try to use the appropriate element for the type of content or purpose you intend. Elements have names and designations for a reason: so that your code can be structured in a way that makes semantic sense. Why is this important? Even setting aside SEO, in which it plays an important part, and the ease of working with your code, this is the intended design of HTML.
Avoiding a specific element because it has default styling applied is not a sensible or logical course of action. This is especially so given that, once you compare browser-specific styling, most elements turn out to have some default styles applied.
I do not want to get into a habit of writing extra code if it is
considered best practice not to do so
Whether something is bad practice is to some degree subjective; otherwise it would simply be right or wrong. In this case it won't be game-breaking to not use the correct tags, but you would be going against their intended use according to the specification, so it would most certainly be bad practice.
See:
Semantic HTML (Wikipedia)
Semantic Web (Wikipedia)

Yes, use the tags which are most aligned with the semantics of your content. Don't concern yourself with styling when constructing your HTML doc, as you could have different styles for different resolutions (smartphone, tablet, desktop) or even different media (web browser, screen reader, braille display, whatever devices come out in the future...).
The following talks about class names specifically, but it does drive home the importance of semantic correctness:
ref: http://www.w3.org/QA/Tips/goodclassnames

HTML5 gives us many new elements to describe parts of a Web page, such as header, footer, nav, section, article, aside and so on. These exist because we Web developers actually wanted such semantics. How did the authors of the HTML5 specification know this? Because in 2005 Google analyzed 1 billion pages to see what authors were using as class names on divs and other elements. More recently, in 2008, Opera MAMA analyzed 3 million URLs to see the top class names and top IDs used in the wild. These analyses revealed that authors wanted to mark up these areas of the page but had no elements to do so, other than the humble and generic div, to which they then added descriptive classes and IDs.
(HTML5 Doctor has many articles about HTML5 semantics.)

You use the blockquote tag to denote that the content is a quote.
The default style provided by the mark-up communicates information to the one interpreting or decoding the sign.
http://html5doctor.com/blockquote-q-cite/
It is not only a matter of the default style of the mark-up element.
By using CSS you can create a variation of that basic style, and this is certainly valid for blockquote. The default style does not add quotation marks, just margins. The marks are chosen by the CSS quotes property and rendered through generated content (content: open-quote and close-quote). And this is just one way of styling a quote: you can have a left border on the quote, italics, ...
blockquote {
  quotes: "\201C" "\201D" "\2018" "\2019";
}
blockquote::before { content: open-quote; }
blockquote::after { content: close-quote; }
http://css-tricks.com/snippets/css/simple-and-nice-blockquote-styling/

This is largely opinion-based, and opinions on matters like this are often expressed almost as religious convictions, with little or no factual evidence or arguments presented. However, I will try to deal with the issue on a technical basis, focusing on the example given. (It’s a good example and does not take us too easily too deep into semantic jungles.)
The blockquote element is defined as structural rather than semantic. It does not say anything about the meaning of its content (it might be about apples, God, or screwdrivers), only about its structural relation with the enclosing document: the content is a copy of some content taken from an external source, i.e. from outside the document – that is, a quotation – and it is a “block quotation” as opposed to an “inline quotation”, which means little (if anything) else than its default rendering being a block.
In practice, blockquote has widely been used simply to indent text, especially in the old days, before CSS was generally available. This is one reason why there is no sign of search engines (or browsers) making any use of the structural relationship (or “semantics”, if you want to put it that way) nominally expressed by the element. There are many things that search engines could do with such information. They just don’t. It is questionable if they ever will: too much content would be incorrectly treated as quoted, and there is too much content that is actually quoted but not marked with blockquote.
Thus, the effect of blockquote is in practice just the default rendering, with certain margins on all sides. When CSS is disabled or overridden for some reason, this is the rendering, and you should try to use HTML markup that gives you tolerable rendering even without CSS. This can be seen as the good reason for using blockquote for block quotations. On the other hand, if your block quotations have been clearly indicated as quotations in other methods, such as introductory phrases or headings, this argument does not really apply.
On the other hand, there is no good reason not to use blockquote just because you don’t want the indent. In that case, you must have some other method of visually distinguishing the block quotation from other content, such as using a background color, a different font, or maybe large decorative quotation marks. You would naturally be using CSS for that, and compared to what you do with it, setting margin-left: 0 is a trivial thing to do.
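As a rough sketch of what that restyling might look like (the border, background colour and surrounding sentence are placeholder choices, not anything prescribed here), the element stays a blockquote while CSS removes the indent and marks the quotation in another way:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Blockquote without the default indent</title>
  <style>
    /* Browsers typically indent blockquote through left/right margins.
       Reset them and distinguish the quotation visually some other way. */
    blockquote {
      margin-left: 0;
      margin-right: 0;
      padding: 0.5em 1em;
      border-left: 4px solid #999;   /* illustrative choice */
      background-color: #f5f5f5;     /* illustrative choice */
    }
  </style>
</head>
<body>
  <p>The question put it this way:</p>
  <blockquote>
    <p>It’s not much effort to restyle the blockquote element.</p>
  </blockquote>
</body>
</html>
```

With the style sheet disabled, the quotation still falls back to the browser's default indented rendering.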
Moreover, even though browsers and general search engines ignore the structural meaning of blockquote, you (or your organization) may decide to do otherwise. You can choose to mark block quotations uniformly in order to be able to find them easily (for statistical, checking, or other purposes). You could achieve the same by using, say, class=quote consistently, but why not use an element when you can?

Related

Why does HTML5 support the `dir` attribute?

I noticed that if I want to change the direction of my elements I should use the dir attribute (e.g. <div dir="ltr"></div>).
I was surprised that HTML5 keeps this approach and does not replace it with CSS3 because, as far as I know, the approach of HTML5/CSS3 is to put elements in the HTML file and style them with CSS. That is why attributes such as align, width and height (except on the img tag, for optimization and only in px) are deprecated. In addition, the removal of styling tags such as <font> confirms this approach.
I am wondering why the dir attribute is still in HTML and not in CSS. Is there any reason for it?
The direction of elements is a side effect. The attribute describes the direction the language is written in. For example, English is a left to right language while Hebrew is a right to left language. It is thus a semantic attribute.
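A minimal sketch of that idea, assuming an English page that contains one Hebrew paragraph (the sample sentences are made up for illustration): the attribute travels with the text it describes, next to lang.

```html
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
  <meta charset="utf-8">
  <title>dir describes the text, not the layout</title>
</head>
<body>
  <!-- The page's base direction is left-to-right (English). -->
  <p>This paragraph is English and runs left to right.</p>

  <!-- This block holds Hebrew text, so its base direction is right-to-left. -->
  <p lang="he" dir="rtl">שלום עולם!</p>
</body>
</html>
```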
If an attribute in HTML5 is not deprecated there's a reason for it, and this case is no exception. Firstly, it's worth pointing out that there is a CSS direction property, so it does exist. But it's actually better to use the attribute; here's why:
(The following quotes are all from the W3C)
The dir attribute is used to set the base direction of text for display. It is essential for enabling HTML in right-to-left scripts.
Use the dir attribute on a block element only when you need to change the base direction of content in that block. Do not use CSS.
You should always use dedicated bidi markup to describe your content, where markup is available. Then CSS may or may not also be needed to describe the meaning of that markup.
More specifically:
...directionality is an integral part of the document structure [so] markup should be used ....
styling applied by CSS is not permanent. It may be turned off, be overridden, go unrecognised, or be changed/replaced in different contexts. Although bidi markup is only needed for the visual rendering of a text it is not purely decorative in function...
In essence, the attribute is more widely recognised and gives a more persistent result across browsers.
Personally, if you're using it purely for styling content (which is what CSS is designed for), then I wouldn't use the attribute, mainly because you're likely to use some other property that limits your website's compatibility anyway, and most people use it for styling purposes rather than for language support. I consider it an accessibility feature: not everyone needs it, but it's good for it to be there for those who do.
It's worth pointing out that there is also a third option: paired Unicode bidi formatting characters. However, it is best to avoid this as it can become impractical; the W3C recognises this too:
When control characters are used in free-flowing content there is always a likelihood of overlapping or unterminated ranges
It is also much easier to manage inheritance and the effects of paragraph separators with markup.
The HTML 4 specification specifically warns against mixing the two approaches because of the increased likelihood of improper nesting.
I lied, it's not an official accessibility feature, but I consider it as one. While the following is true:
neither markup nor CSS should be used unless they are needed.
HTML5 adds a new feature:
HTML5 provides a new value for the dir attribute: auto. The auto value tells the browser to look at the first strongly typed character in the element. If it's a right-to-left typed character such as a Hebrew or Arabic letter, the element will get a direction of rtl. If it's, say, a Latin character, the direction will be ltr.
Essentially, this means that with dir="auto" the browser selects the direction itself, based on the text the element actually contains. That helps when you cannot know the language in advance, and bear in mind that most browsers offer a translation service. By my definition this makes it an accessibility feature.
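Here is a small sketch of dir="auto" as described in the quote above (the sample strings and the field name are made up for illustration). The same attribute value resolves differently depending on the first strong character of each element's text:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>dir="auto"</title>
</head>
<body>
  <!-- First strong character is Latin, so this resolves to ltr. -->
  <p dir="auto">Hello, world!</p>

  <!-- First strong character is Arabic, so this resolves to rtl. -->
  <p dir="auto">مرحبا بالعالم</p>

  <!-- Handy for user input, where the language is not known in advance.
       The name "comment" is a hypothetical field name. -->
  <input type="text" name="comment" dir="auto">
</body>
</html>
```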
I found the complete answer here
Because directionality is an integral part of the document structure,
markup should be used to set the directionality for a document or
chunk of information, or to identify places in the text where the
Unicode bidirectional algorithm alone is insufficient to achieve
desired directionality.
To produce the desired right-to-left or bidirectional effect, some
people simply apply CSS to whatever general paragraph or inline
elements surround the relevant text. However, styling applied by CSS
is not permanent. It may be turned off, be overridden, go
unrecognised, or be changed/replaced in different contexts. The markup
may also find its way into places where the CSS is not available, eg.
via shared databases, or quoted fragments.
Although bidi markup is only needed for the visual rendering of a text
it is not purely decorative in function.
Markup remains integrated with the document content in a persistent
fashion. It also lends significant clarity to the content if you use
dedicated bidi markup. You should therefore use dedicated bidi markup
whenever it is available. Do not simply attach CSS styling to a
general element to achieve the effect.
That is why both the dir attribute and the CSS direction property exist.

Is There Any Reason to Use HX Tags When I Override All Styles Anyway?

I use <h1>, <h2>, <h3> tags for headers. But then I almost always "overwrite" pretty much all styles (font size/family/weight, margin and padding).
Given this, is there any real reason to use them, other than their somewhat informative nature (that something is meant as a heading)?
Is there any real reason to use them, other than their somewhat informative nature
That is the ONLY reason to use them: to describe your document. That's the whole point of HTML; CSS is meant purely for styling.
There are exceptions for the non-semantic tags like div and span, but generally what you plan to do with your CSS should never affect your decisions on which tag to use to mark up your content, always use the appropriate one.
Yes, it's good to use heading tags in your document. There are some good reasons:
They describe your document.
They are good for SEO.
They help when a screen reader is set to a scanning mode.
Read this for more http://webdesign.about.com/od/beginningtutorials/a/headings_struct.htm
Heading markup is known to be significant to search engines (they give greater relative weight to text in headings than to copy text), though the details (including the importance of this issue) are not public information. Some screen readers and assistive software make use of heading markup, e.g. allowing “heading reading mode” and by making noticeable pauses before and after a heading. When an HTML document is opened in Microsoft Word, heading elements are recognized as headings that are used e.g. in generating a table of contents. Headings are a widely known concept, so user style sheets may conceivably have settings for them, suitable for an individual user.
This description is not exhaustive, but it illustrates that for heading elements, the idea of “semantic markup” has some practical relevance.
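To make that concrete, here is a rough sketch (the font, sizes and heading text are arbitrary placeholders) in which nearly every default heading style is overridden, yet the document outline is still there for search engines, screen readers and other software:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Headings with overridden styles</title>
  <style>
    /* Override family, weight, margin and padding for both levels. */
    h1, h2 {
      font-family: Georgia, serif;
      font-weight: normal;
      margin: 0 0 0.5em;
      padding: 0;
    }
    h1 { font-size: 1.4em; }
    h2 { font-size: 1.1em; }
  </style>
</head>
<body>
  <h1>Page heading</h1>
  <p>Introductory text.</p>
  <h2>Section heading</h2>
  <p>Section text.</p>
</body>
</html>
```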

What's the difference between the <u> tag and the <ins> tag?

What's the difference between these two tags? They seem to do the same thing.
<p>My favorite color is <ins>red</ins>!</p>
and
<p>My favorite color is <u>red</u>!</p>
Semantics. The <ins> tag means content inserted after it was first published. The <u> tag is simply for underlining and has no meaning.
Reference: <ins>, <u>
The official difference depends on the HTML specification or draft that you choose to regard as official.
By the HTML 4.01 specification, u means underlined text style, whereas ins means that its content has been “inserted [...] with respect to a different version of a document (e.g., in draft legislation where lawmakers need to view the changes)”. The rendering of ins is not specified; instead, some possible renderings are described. In practice, browsers mostly use underlining. There is also the formal syntactic difference that u allows text-level content only, whereas ins may contain blocks, too.
In the HTML5 drafts, ins is essentially similar but with different wordings and with an explicit suggestion, or maybe (semi)recommendation, that the default rendering use underlining (see 10.3.4 Phrasing content there). The u was previously excluded from the draft, now added but with invented meaning: “The u element represents a span of text with an unarticulated, though explicitly rendered, non-textual annotation, such as labeling the text as being a proper name in Chinese text (a Chinese proper name mark), or labeling the text as being misspelt.” If this does not make sense to you, you’re not the only one. And u has underlining as the suggested, or recommended, default rendering.
In practice, the effect is mostly the same, except that ins tags are ignored by some old browsers. Some exotic browser could use different rendering. I have not seen any evidence of any browser, search engine, or any other relevant software make any distinction between the two; their “semantic difference” has no practical impact.
However, I would not use ins for anything except inserted text in some sense, just because some future browsers might treat it in some way that makes sense for inserted text but not otherwise. And there would be no tangible benefit from using ins just for underlining: u is shorter markup and more widely supported. Then again, situations where you should underline text, except links, on web pages are rare.
ins is a semantic tag: it denotes an element which has been inserted (see also del for deleted elements), while u instructs the engine to render the element underlined.
Some engines may decide to render ins as u but you can use the "semantic meaning" to decide to render it otherwise (with CSS). Speaking of CSS, it renders <u> useless and just makes style maintenance more difficult, just like <b>, <center>, and so on.
As for the other editing tag, del, it's not widely used and it's not really clear why it should be.
Reference:
http://www.w3.org/TR/html-markup/ins.html
http://www.w3.org/TR/html-markup/del.html
http://www.w3.org/TR/html-markup/u.html
ins has a meaning: This text was inserted at a later date.
u has no meaning, it just tells the browser to put some underlining on it. (Though CSS can override that just as for any other element, one of the reasons this element has been pointless since 4.0)
ins was introduced with 4.0, and is still current.
u was introduced with at least 2.0 (HTML1.0 never really stopped being work-in-progress so 2.0 was the first real standard), and deprecated with 4.0
u is inline-only, so <p><u>this is valid</u></p> but <u><p>this makes no sense</p></u>. ins is one of the very few elements that can be both block and inline (a few more inline elements were given this status with 5.0), so <p><ins>this is valid</ins></p><ins><p>this is also valid</p></ins>.
ins has an optional cite attribute giving the URI of a document (or fragment within a document, perhaps the same document it is in) explaining the change, and an optional datetime attribute giving the date of the change in a W3C date-time format.
Hence:
If the underline is to indicate something added, use ins.
If you want to indicate something added, but not by underlining, use ins and use CSS to change how it appears.
Don't use u ever. HTML's history is a nightmare from which we are trying to awake.
If you want underlining for another reason, pick the element that best matches your meaning, falling back to div or span if you really can't do any better, and use CSS to underline it.
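Those rules might be sketched like this. The cite URL, the datetime value and the class name "keyword" are hypothetical, chosen only for illustration:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>ins, del and underlining</title>
  <style>
    /* Something added, but shown without underlining. */
    ins { text-decoration: none; background-color: #e6ffe6; }
    del { text-decoration: line-through; color: #777; }
    /* Underlining for a reason unrelated to insertion. */
    .keyword { text-decoration: underline; }
  </style>
</head>
<body>
  <p>
    My favorite color is <del>red</del>
    <ins cite="changes.html#color" datetime="2012-05-01T10:00:00Z">blue</ins>!
  </p>
  <p>The <span class="keyword">quotes</span> property is covered later.</p>
</body>
</html>
```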

What is the actual meaning of separation of content and presentation?

What is the actual meaning of separation of content and presentation?
Does it just mean avoiding inline CSS?
Does it mean that the design should be able to be manipulated without changing the HTML?
Can we really make any change in design from CSS only?
If we want to change the size of images, we have to go into the HTML code.
If we want to add one more line break in a paragraph, we again have to go into the HTML code.
If we want to add one more separator at some place, we again have to go into the HTML code.
Which (X)HTML tags should we avoid in order to keep content and presentation separate?
Is separation of content and presentation also helpful for accessibility/screen reader users? ... and for programmers/developers/designers?
When defining what is content and presentation, see your HTML document as a data container. Then ask yourself the following on each element and attribute:
Does the attribute/element represent a meaningful entity in my data?
For example, are the words inside the <b> tag in bold simply for display purposes, or did I want to add emphasis to that data?
Am I using the proper attribute/element to properly represent the type of data I want to represent?
Since I want to add emphasis to that particular section, I should use <em> (it doesn't mean italic, it means emphasis, and can be made bold) or <strong>, depending on the level of emphasis wanted.
Am I using the attribute/element only for display purposes? If yes, can the element be removed and the parent element styled using CSS?
Sometimes a presentational tag can simply be replaced by CSS rules on the parent element, in which case the presentational tag should be removed.
After asking yourself these three simple questions, you are usually able to make a pretty informed decision. An example:
Original Code:
<label for="name"><b>Name:</b></label>
Checking the <b> tag...
Does the attribute/element represent a meaningful entity in my data?
No, the tag doesn't represent a data node. It is there purely for presentation.
Am I using the proper attribute/element to properly represent the type of data I want to represent?
<b> is used for presentation of bold elements.
Am I using the attribute/element only for display purposes? If yes, can the element be removed and the parent element styled using CSS?
Since <b> is presentational and I am using it for presentation, yes. And since the <b> element affects the whole of <label>, it can be removed and style be applied to the <label>.
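The result of that walk-through could look roughly like the following. The input element and its id are assumed here so that the label has something to point at; only the label itself comes from the original code:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Label without the presentational tag</title>
  <style>
    /* The bold rendering moves from the <b> tag to a rule on the parent. */
    label { font-weight: bold; }
  </style>
</head>
<body>
  <form>
    <label for="name">Name:</label>
    <input type="text" id="name" name="name">
  </form>
</body>
</html>
```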
Semantic HTML's goal is not to simplify design and redesign or to avoid inline styling, but to help a parser understand what a particular tag represents in your document. That way, applications (e.g. search engines) can be created to intelligently decide what your content signifies and classify it accordingly.
Therefore, it makes sense to use the CSS content property to add quotes around text located in a <q> tag (the quote marks have no value to the data contained in your document other than presentation), but it makes no sense to use the same CSS property to add a © symbol in your footer, as that symbol does have a value in your data.
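A small sketch of that distinction follows. The quoted sentence and the company name are placeholders, and many browsers already apply similar q rules in their default style sheets, so they are spelled out here only to show where the quote marks come from:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Generated quotes vs. real content</title>
  <style>
    /* Presentation only: the quote marks are generated by CSS. */
    q { quotes: "\201C" "\201D"; }
    q::before { content: open-quote; }
    q::after  { content: close-quote; }
  </style>
</head>
<body>
  <p>She said <q>semantics matter</q> and left.</p>

  <!-- Part of the data: the copyright symbol lives in the markup. -->
  <footer>
    <p>&copy; Example Company</p>
  </footer>
</body>
</html>
```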
The same applies to attributes. Using the width and height attributes on an <img> tag representing an icon at size 16x16 makes semantic sense, as the size is important for understanding the meaning of the <img> tag (an icon can have different representations depending on the size it is displayed at). Using the same attributes on an <img> tag representing a thumbnail of a larger image does not.
Sometimes you will need to add non-semantic elements to be able to achieve your wanted presentation, but usually those are avoidable.
There are no wrong elements. There are wrong uses of particular elements. <b> should not be used when adding emphasis. <small> should be used for legal sub-text, not to make text smaller (see HTML5 - Section 4.6.4 for why), etc... All elements have a particular usage scenario and they all represent data (minus presentational elements, but they do have a use in some cases). No elements should be set aside.
Attributes are a different thing. Most attributes are presentational in nature. Attributes such as <img border> and <body bgcolor> rarely carry any meaning in the data you are representing, so you should not use them (except in those rare cases).
Search engines are a good example of why semantic documents are so important. Microformats are a predefined set of elements and classes which you can use to represent data that search engines will understand in a certain way. The product price information in Google searches is an example of semantics at work.
Using the predefined rules of established standards to store information in your document allows third-party programs to understand what would otherwise seem to be a wall of text, without resorting to heuristic algorithms that may be prone to failure. It also helps screen readers and other accessibility applications to understand more easily the context in which the information is presented. It also greatly helps the maintainability of your markup, as everything is tied to a set definition.
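As one illustration of predefined class names, here is a sketch using the hCard microformat (contact details rather than product prices). The person, company and phone number are invented placeholders:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>hCard sketch</title>
</head>
<body>
  <!-- "vcard", "fn", "org" and "tel" are class names defined by hCard;
       a parser that knows the format can extract the fields directly. -->
  <div class="vcard">
    <span class="fn">Jane Example</span>,
    <span class="org">Example Company</span>,
    <span class="tel">+1-555-0100</span>
  </div>
</body>
</html>
```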
The best example is probably the CSS Zen Garden.
The goal of this site is to showcase what is possible with CSS-based design only, with a strict separation of content from the design. Style sheets contributed by various graphic designers are used to change the visual presentation of a single HTML file, producing hundreds of different designs. The HTML markup itself never changes between the different designs.
On each design page, you'd have a link to view the CSS file of that design.
What is the actual meaning of separation of content and presentation?
It is more a design philosophy than something concrete. In general, it means that you should preserve the semantics of the content and think of your content as a piece of structured information. It also means that you should keep all aesthetic details away from this structured information.
is it just mean to avoid inline css?
As noted above, inline styles have nothing to do with the semantics of your content and should be avoided at all costs. But it isn't just that.
Does it mean that, once the HTML has been written to match the design, any later design change should be made with CSS only, without touching the HTML?
Unfortunately, it is not always possible to achieve some concrete aesthetic goals without modifying the underlying markup; CSS3 tries its best to address these issues.
Which X/HTML tag we should avoid to use to keep separation of content and presentation?
Look for deprecated tags in W3C HTML 4.01 / XHTML 1.0 Reference
Is separation of content and presentation also helpful for accessibility/screen reader users?
Surely. Better structured information generally remains readable even if certain browsers render styles incorrectly (or do not render them at all). Such content may also look more adequate on printed media (though print styles may be applied to achieve even better aesthetics; they, again, have nothing to do with content semantics).
Is separation of content and presentation also helpful for programmer/developer/designer ?
Of course. The separation of content and presentation has its roots in a more general philosophy, the separation of concerns. Everybody benefits from the separation: the content supplier does not have to be a good designer and vice versa.
Putting in line breaks at certain points is inevitable; there will usually be some overlap of presentation and content. You should always aim for perfect separation, though.
Take the other extreme: a page containing loads and loads of tables that are used for layout purposes only. This is the definite anti-pattern that should be avoided at all costs. The content plays second fiddle to the layout here; it's often not in the right order and thereby hardly machine readable. Content that is not machine readable is bad for accessibility and bad for the page's search engine ranking.
By marking up content without concern for presentation, you are first and foremost making it machine readable. You are then also in a position to serve the same content to different clients in different formats, say in a mobile-optimized version. You can also change the presentation easily without having to mess with the HTML files, say for a big redesign.
Another benefit that comes naturally by separating content and presentation (HTML - CSS files) is that you have less to type and less to maintain, plus your pages can have a consistent styling applied very easily. Contrast thousands of inline styles vs. one style definition in one CSS file, which is "naturally" applied to all elements with the same "meaning" (markup).
Ideally your (X)HTML consists only of meaningful, semantic markup and your CSS of styles using this markup for its selectors. In the real world you'll often mix classes and IDs into your markup that add no extra meaning, because you need these extra "hooks" to style everything the way you want to. But even here there's a difference between class="blue right-aligned" and class="contact-info secondary". Always try to add meaning to the content, not style. Balancing this is quite an art in itself. :)
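The difference between those two class attributes might be sketched as follows (the colour and the contact text are placeholders). If the redesign turns the secondary contact block green and left-aligned, only the style sheet changes in the second version, while the first version's class names become misleading:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Presentational vs. meaningful class names</title>
  <style>
    /* Hooks that describe appearance. */
    .blue { color: #336699; }
    .right-aligned { text-align: right; }

    /* Hooks that describe meaning; the appearance is decided here. */
    .contact-info { text-align: right; }
    .contact-info.secondary { color: #336699; }
  </style>
</head>
<body>
  <p class="blue right-aligned">Sales desk: sales@example.com</p>
  <p class="contact-info secondary">Sales desk: sales@example.com</p>
</body>
</html>
```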

What does "semantically correct" mean?

I have seen it a lot in css talk. What does semantically correct mean?
Labeling correctly
It means that you're calling something what it actually is. The classic example is that if something is a table, it should contain rows and columns of data. To use that for layout is semantically incorrect - you're saying "this is a table" when it's not.
Another example: a list (<ul> or <ol>) should generally be used to group similar items (<li>). You could use a div for the group and a <span> for each item, and style each span to be on a separate line with a bullet point, and it might look the way you want. But "this is a list" conveys more information.
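Roughly, the two versions compare like this (the class names and list items are made up for illustration). Both may render alike, but only the second tells a machine that it is a list of three items:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Looks like a list vs. is a list</title>
  <style>
    /* Styling spans so that they imitate bullet points. */
    .item { display: list-item; list-style: disc; margin-left: 1.5em; }
  </style>
</head>
<body>
  <!-- Semantically just a generic container of generic spans. -->
  <div class="items">
    <span class="item">Apples</span>
    <span class="item">Pears</span>
    <span class="item">Plums</span>
  </div>

  <!-- Semantically a list. -->
  <ul>
    <li>Apples</li>
    <li>Pears</li>
    <li>Plums</li>
  </ul>
</body>
</html>
```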
Fits the ideal behind HTML
HTML stands for "HyperText Markup Language"; its purpose is to mark up, or label, your content. The more accurately you mark it up, the better. New elements are being introduced in HTML5 to more accurately label common web page parts, such as headers and footers.
Makes it more useful
All of this semantic labeling helps machines parse your content, which helps users. For instance:
Knowing what your elements are lets browsers use sensible defaults for how they should look and behave. This means you have less customization work to do and are more likely to get consistent results in different browsers.
Browsers can correctly apply your CSS (Cascading Style Sheets), describing how each type of content should look. You can offer alternative styles, or users can use their own; as long as you've labeled your elements semantically, rules like "I want headlines to be huge" will be usable.
Screen readers for the blind can help them fill out a form more easily if the logical sections are broken into fieldsets with one legend for each one. A blind user can hear the legend text and decide, "oh, I can skip this section," just as a sighted user might do by reading it.
Mobile phones can switch to a numeric keyboard when they see a form input of type="tel" (for telephone numbers).
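A short sketch of a form marked up along those lines. The action URL, field names and legends are hypothetical:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Semantically labelled form</title>
</head>
<body>
  <form action="/order" method="post">
    <!-- Each logical section is a fieldset with one legend. -->
    <fieldset>
      <legend>Contact details</legend>
      <label for="phone">Phone</label>
      <!-- type="tel" lets phones offer a numeric keypad. -->
      <input type="tel" id="phone" name="phone">
    </fieldset>

    <fieldset>
      <legend>Gift options (optional)</legend>
      <label for="note">Gift note</label>
      <input type="text" id="note" name="note">
    </fieldset>

    <button type="submit">Send</button>
  </form>
</body>
</html>
```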
Semantics basically means "The study of meaning".
Usually when people are talking about code being semantically correct, they're referring to the code that accurately describes something.
In (x)HTML, there are certain tags that give meaning to the content they contain. For example:
An H1 tag describes the data it contains as a level-1 heading. An H2 tag describes the data it contains as a level-2 heading. The implied meaning behind this is that each H2 under an H1 is in some way related (i.e. heading and subheading).
When you code in a semantic way, you basically give meaning to the data you're describing.
Consider the following 2 samples of semantic VS non-semantic:
<h1>Heading</h1>
<h2>Subheading</h2>
VS a non-semantic equivalent:
<p><strong>Heading</strong></p>
<p><em>Subheading</em></p>
Sometimes you might hear people in a debate saying "You're just talking semantics now" and this usually refers to the act of saying the same meaning as the other person but using different words.
"Semantically correct usage of elements means that you use them for what they are meant to be used for. It means that you use tables for tabular data but not for layout, it means that you use lists for listing things, strong and em for giving text an emphasis, and the like."
From: http://www.codingforums.com/archive/index.php/t-53165.html
HTML elements have meaning. "Semantically correct" means that your elements mean what they are supposed to.
For instance, your definition lists are represented by <dl> elements in code, your abbreviations by <abbr> elements, etc.
It means that HTML elements are used in the right context (not like tables are used for design purposes), CSS classes are named in a human-understandable way and the document itself has a structure that can be processed by non-browser clients like screen-readers, automatic parsers trying to extract the information and its structure from the document etc.
For example, you use lists to build up menus. This way a screen reader for disabled people will know these list items are parts of the same menu level, so it will read them in sequence for a person to make choice.
I've never heard it in a purely CSS context, but when talking about CSS and HTML, it means using the proper tags (for example, avoiding the use of the table tag for non-tabular data), providing proper values for the class and id that identify what the contained data is (and using microformats as appropriate), and so on.
It's all about making sure that your data can be understood by humans (everything is displayed properly) and computers (everything is properly identified and marked up).