Preserving good semantics with repetitive content - html

Say I'm building a typical document editor:
Where the preview (in red) is an up-to-date, formatted view of the form's data.
The preview element contains semantic elements (e.g. h1, h2, main, header, etc.). It's kind of a document in itself, which does make sense, conceptually. But this makes the structure of the real document quite confusing for crawlers and screen readers. There might be, for instance, two h1 or main elements. I'm looking for a way to avoid that.
Plus, there's the problem of repetitive content (see image).
For the accessibility part of the problem, I could just add an aria-hidden="true" attribute to the preview element. Visually impaired people don't need the preview; it's just redundancy to them, and they only need the form.
But for crawlers, here are my options:
Don't use semantic elements inside the preview element, use divs instead (😥).
Host the preview at another URL and insert it via an iframe (that's what I'm doing right now, but it seems hacky to me).
Leave it like that, crawlers don't care.
Any idea/resource/suggestion?

As long as your preview area is clearly indicated for assistive technology, it's perfectly fine to have redundant information. If you have an <iframe>, make sure there's a title attribute on it.
<iframe title="preview area"...>
However, you might have validator issues with multiple structure elements.
For example, HTML only allows one <main> element:
A document must not have more than one main element that does not have the hidden attribute specified.
You can have multiple <header> elements but a <header> has a default role of banner and the banner role says:
Within any document or application, the author SHOULD mark no more than one element with the banner role.
The key here is "should", meaning it's a strong recommendation but not required. You can also get away with multiple banner roles if your preview section has role="document".
I would recommend against using non-semantic elements (div), because an assistive technology user might want to check the actual semantic structure of what's generated. I suppose you could also have a "show in new tab" option for the preview that uses full semantics, kind of like your second bullet but without an iframe.
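A minimal sketch of the iframe variant discussed above. The /preview and /save URLs and the field names are hypothetical placeholders, not taken from the question:

```html
<!-- Editor page: it keeps its own single <main> and <h1>. -->
<main>
  <h1>Edit document</h1>
  <form action="/save" method="post">
    <label for="doc-title">Title</label>
    <input id="doc-title" name="title" type="text">
    <label for="doc-body">Body</label>
    <textarea id="doc-body" name="body"></textarea>
    <button type="submit">Save</button>
  </form>

  <!-- The preview is a separate document, so its own <h1>, <main>
       and <header> do not collide with the editor's structure.
       The title attribute names the frame for assistive technology. -->
  <iframe src="/preview" title="Preview area"></iframe>
</main>
```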

Related

Web accessibility - heading-order rule with missing <h2>

We have inherited a website which we are currently trying to make WCAG 2.0 AA compliant.
One of the pages is failing the heading-order rule as it has <h3> and <h4> tags but no prior <h1> or <h2>.
We are in the process of adding an <h1> tag (as all pages should have one), but there is no need for an <h2> tag, and amending the <h3> and <h4> would involve a large refactoring of various jQuery code and CSS.
Are there any tricks to make the page accessible? I'm loath to put a hidden <h2> tag in, as the screen readers would presumably pick this up. Or do they ignore the hidden tags and the page then becomes compliant?
Are you trying to satisfy WCAG 2.0 AA or are you trying to make the page accessible?
These are usually the same, but sometimes not.
A hidden <h2> would pass WCAG depending on how you hide it. That, however, would not make the page more accessible.
While this may suck, the best, most correct approach is to fix the <h3>s and <h4>s to become correct levels (that is my answer, the rest of this is fluff). Your question might be more appropriate if you instead provide some code samples and ask for tips on how to write a regex or otherwise script these replacements throughout the inherited system.
If you are being told you have no time to do it right, then code examples (or a sample site) might still be helpful to get some guidance.
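As a rough sketch of what fixing the levels could look like, assuming the old visual design can be carried over with classes. The class names and font sizes below are hypothetical placeholders, and any jQuery selectors that target h3 or h4 directly would still need updating:

```html
<!-- Before: the page jumped straight from nothing to <h3> and <h4>. -->
<!-- After: levels follow the outline; classes keep the old look. -->
<h1>Page title</h1>
<h2 class="heading-3">Section</h2>
<h3 class="heading-4">Subsection</h3>

<style>
  /* Hypothetical classes: copy the declarations that the old
     h3 and h4 rules used, so the visual design stays the same. */
  .heading-3 { font-size: 1.17em; }
  .heading-4 { font-size: 1em; }
</style>
```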
I really have seen this situation before, and actually have fallen into this trap myself. By applying styles to the h3 and h4, it is possible to make a page look, well, a certain way.
Looking at the point of the header tags however, it is their purpose to add semantics to the document, as we all well know. Is it, therefore, meaningful to have a document outline where there is an h3 but no h2? Screenreaders and other accessibility tools use this header information and some could get confused.
My most influential decision-making point is, "how will the user consume this information?" Will they be able to consume it? Is it meaningful to skip a header level? I initially think not, but please let me know of your differing opinion!
If you really do care about accessibility, note that an empty h2 (even an implied one, which is what you get when you omit the level) may leave users of some screen readers with no clue about which section is being announced when they navigate the document outline (1).
That being said, I can't find anywhere (neither in WCAG nor in the HTML5 documentation) a statement that you can't omit one level of headings.
The only official (for HTML5.1) requirement is to use "headings of lower rank" to start new subsections, which should mean that you could use an h3 directly below an h1 but can't use another h1.
Even the WCAG gives an example that uses omitted ranks, noting that the example does not intend to prescribe the level for a particular section.
(1) HTML5.1 provides an outline algorithm where we can read about "implied headings" and about how rank is used when there is a heading content element.
I would say the best way to ensure compliance is to refactor the JavaScript/CSS code. To hide elements you could use the hidden attribute or aria-hidden.
https://www.paciellogroup.com/blog/2012/05/html5-accessibility-chops-hidden-and-aria-hidden/
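For reference, a short sketch of how the two hiding mechanisms mentioned above differ. The text content is made up for illustration:

```html
<!-- hidden: not rendered, and not exposed to assistive technology. -->
<h2 hidden>Section overview</h2>

<!-- aria-hidden="true": still rendered visually, but removed from
     the accessibility tree, so screen readers skip it. -->
<p aria-hidden="true">Decorative text</p>
```

Note that a heading hidden either way gives screen reader users nothing to navigate to, which is why it would not make the page more accessible.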

Is it necessary to change <h2>'s to <h1>'s when the main <h1> is removed?

I'm following the CodeSchool foundation course on HTML now. It's saying it's 'not OK' to have an <h2> heading as the top one:
we removed the h1 tag and replaced it with an img tag. That's fine, but it also means that our highest level heading right now is an h2 tag, which isn't OK.
Is this going to affect the final product or is it just a design thing?
Nothing will break if there is an h2 and no h1. However, computers read websites by parsing the HTML. The only way the computer knows that something is a top-level header (h1) is by finding the h1 tag.
The most common reason I can think of at the moment for wanting an h1 instead of an h2 is how search engines index your website. Using an h1 means that the text within the tag is highly relevant to your website, while an h2 is not as important (but still more important than an h3, for example).
So it's not about what works and what doesn't, but about how things should be. Writing a website that follows all the "rules" of HTML will make it easily parsable by crawlers (programs that parse websites and gather information).
Another important reason why your HTML should be well structured is blind people. They use programs that read out text from websites to "read" the website. Badly structured or non-semantic HTML might make these programs useless.
Edit:
Sorry, as mentioned, document outline would break, but the website would still be rendered, which is what I was referring to.
The heading elements are used to provide semantic meaning to your content (as most HTML elements do). In addition, people who use assistive technologies, such as screen readers, rely on certain HTML elements to provide information as to how to navigate a page. The heading elements (h1...h6) help to create the page's Document Outline, for which there is a defined specification.
If you were to start with an h2, the outline would be broken.
Just as it makes no sense to start a report on section 2, you should not start a web page there for the same reason.
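One possible way to keep the image from the course exercise without losing the top-level heading is to put the image inside the h1, so that its alt text serves as the heading text. This is a sketch only; the file name and text are hypothetical:

```html
<!-- The alt text becomes the text of the top-level heading. -->
<h1><img src="logo.png" alt="Site name"></h1>

<!-- Lower-rank headings then start the subsections. -->
<h2>First section</h2>
<p>Section content.</p>
```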
SEO
Search Engine Optimization! If a person searches for, say, "Rubik's Cubes", a search engine will look through its data to see which sites have the string "Rubik's Cubes" (or some substring or related words) on their pages.
Now, if your site has the string "Rubik's Cubes" in an <h1> tag, the search engine will understand that this is the entire point of your page, and rank it higher, because it's the title of the entire page!
Meanwhile, if you have it in an <h2> it's probably just some part of the page, and that's fine too, but not AS good.
So while structurally, it makes no difference, think of SEO when you choose which header you use. If you're picking them just based on size, that's a bad idea. Just style them bigger/smaller instead!
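A small sketch of the "style them instead" advice: pick the heading level by meaning, then set the size in CSS. The sizes and text here are arbitrary examples:

```html
<style>
  /* Size is a presentation choice, independent of heading rank. */
  h1 { font-size: 1.5rem; }
  h2 { font-size: 1.25rem; }
</style>

<h1>Rubik's Cubes</h1>
<h2>How to solve one</h2>
```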

Adobe Search&Promote non-standard <noindex> tag

A website I am working on uses Adobe Search&Promote (SP) as its internal website indexing and searching tool.
I need to exclude common parts of each web page from being indexed by SP (such as the header, nav, footer) because they are the same on every single page.
SP's documentation states the following:
"To prevent parts of individual web pages from being searched, you can exclude portions of a page from indexing. Surround the text with <noindex> and </noindex> tags. This method is useful if you want to exclude navigation text from searches."
Of course, <noindex>, is not a standard HTML tag/element.
Is there javascript or something I should be doing to register/create this fake tag in browsers so I don't have to worry about any strange behavior as a result of having a non-standard HTML tag just hanging out in my code? Or should I just not care because browsers will ignore this non-existent element?
Note: There is absolutely no styling that needs to be done on this <noindex> element. It simply needs to wrap around content in the HTML.
There is nothing you need to do. Browsers are expected to ignore unknown tags, and they do, so they see <noindex>foo</noindex> just as foo. Well, not quite. Technically, modern browsers construct an element node (of type HTMLUnknownElement) in the DOM. But the element has no associated default styling and no associated action, so it’s really a dummy element and represents its content only.
It would be possible to remove such element nodes using client-side JavaScript, but that would be quite unnecessary.
The only real risk is that some day some specification or some browser or some web-wide indexing robot might start treating noindex as a real element with some defined meaning, possibly with default rendering and default functionality. Then you would be in trouble if these differ from what you expected. But it’s a rather small risk, and it seems that you don’t have a choice.
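For illustration, a sketch of the wrapping described in the SP documentation quote. The navigation links and page content are hypothetical:

```html
<!-- Browsers treat <noindex> as an unknown element with no default
     styling, so the navigation renders as if the wrapper were absent. -->
<noindex>
  <nav>
    <a href="/">Home</a>
    <a href="/products">Products</a>
  </nav>
</noindex>

<main>
  <h1>Page title</h1>
  <p>Content that should be indexed.</p>
</main>
```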
Although it's not in the documentation, our team consulted an Adobe consultant regarding this. He told us that we can use a 'noindex' class instead of the <noindex> element. He even recommended that we use the class instead of the tag.
A warning, though: the 'noindex' class only works with <div> elements, not with other elements such as <ul>, <header>, or <footer>.
So the usage would be something like this:
<div class="noindex">
<p>This should not be indexed.</p>
</div>

How should I use html5 elements with modal sections of my web page?

I'm pretty inexperienced as far as html goes and even less so with html5.
I have a question regarding modal popups - page sections that are interacted with using javascript/ajax, but not necessarily displayed on the page all the time. These are not generally in the main html flow - I might for instance place all my modal code at the end of the page for maintainability. The question is - should I be declaring these chunks of the page using html section tags, or something else?
To shed more light on the situation I'm describing, I have an application page. This contains a number of sections (I'm not referring to html5 here). The first section is modal on entering the page - it's a "click to continue if you agree" section. The next 5 chunks belong to a stepped application form - each step is displayed one at a time using a multiview control. Then another modal - a UI block, followed by a final decision section.
Since they are modal, and appear out of the flow, it is probably most suitable to use a div for them. If you do want to use a semantic block, then which you use will depend on what the content is, and how it relates to the rest of the page. The following articles should help you make that decision:
http://html5doctor.com/the-section-element/
http://html5doctor.com/the-article-element/
http://html5doctor.com/avoiding-common-html5-mistakes/ (particularly the first section of that article - "Don’t use section as a wrapper for styling")
The question is - should I be declaring these chunks of the page as sections, or something else
One of the big advantages of HTML5 is that it's semantically readable. If you feel that your modal popups are better described by something like an article tag, then use an article. Use the tag you feel most accurately describes your functionality.
For example, let's say I have a sample page like so:
<html>
<head></head>
<body>
<article>
<!-- Some stuff here -->
</article>
</body>
</html>
I would expect the content of that article tag to fit this definition:
The article element represents a component of a page that consists of a self-contained composition in a document, page, application, or site and that is intended to be independently distributable or reusable, e.g. in syndication. This could be a forum post, a magazine or newspaper article, a blog entry, a user-submitted comment, an interactive widget or gadget, or any other independent item of content.
W3C Specification. The Article Element.
Note: In this context, an article is designed to represent flow content. Given that your aim is not to write flow content (as you correctly put) this is not a good example. This is very clear from the definition I've provided.
Similarly, if I replaced article with section, I would expect it to fit this definition:
Examples of sections would be chapters, the various tabbed pages in a tabbed dialog box, or the numbered sections of a thesis. A Web site's home page could be split into sections for an introduction, news items, and contact information.
W3C Specification. The Section Element
If I were you I would have a look through the spec and think about the following questions:
What does my content actually mean to the user?
How will my content appear to other programmers?
Does the use of this content give me a hint at the correct semantics?
It depends what you have in your modal.
You could have a login form, subscription stuff, advertisements, articles, or a frame of another page, so it only makes sense to use <section> if the content is actually a meaningful section of the page. For example, if you have an article and want to display the author info in a modal box, then I would say it would be acceptable to use <section>.
So overall, if it is part of the content then it sounds OK to use it; if it is not, you should use a <div>.
I would also say that no one has the answer for this as it is purely opinionated, and quite frankly doesn't matter.
There is also another way to incorporate modals. As they depend on JavaScript, you could also load the popup contents via AJAX without having them in the document flow. A recent project I worked on first renders links to a normal and complete HTML page for popup contents (e.g. contact forms). If JS is enabled, a parameter is added to the links to load only the main content, without header, menu and sidebars, via AJAX.
As the modal content does not really belong to your site content (if it does, it shouldn't be a popup but part of the document's main content), it shouldn't get marked up with a section, main or article tag. Instead use a div to render the popups, or use an iframe if that is admissible for your project.
It doesn't really matter what tag is used for a modal, as long as it's appropriate for the purpose (don't use a <fieldset>, for example). Usually we see a <div> representing a modal.
You can use the role attribute for semantic information about the purpose of an element. In this case role="dialog" would be appropriate. You can find more info on the role attribute in HTML5 here.
Also note ARIA attributes: they enhance accessibility. For example, aria-hidden="true" indicates that the element should not be exposed to assistive technology, so screen readers skip its content.
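Putting the role and ARIA suggestions together, a modal could be marked up roughly as follows. This is a sketch: the ids and text are hypothetical, and the script that shows the dialog (by removing the hidden attribute) and manages focus is left out:

```html
<!-- A plain <div> with role="dialog" carries the semantics.
     aria-labelledby points at the heading that names the dialog.
     The hidden attribute keeps it out of view until a script opens it. -->
<div id="terms-dialog" role="dialog" aria-modal="true"
     aria-labelledby="terms-title" hidden>
  <h2 id="terms-title">Terms and conditions</h2>
  <p>Click to continue if you agree.</p>
  <button type="button">I agree</button>
</div>
```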

What is the actual meaning of separation of content and presentation?

What is the actual meaning of separation of content and presentation?
Does it just mean avoiding inline CSS?
Does it mean that the design should be changeable without touching the HTML?
Can we really make any design change from CSS only?
If we want to change the size of images, we have to go into the HTML code.
If we want to add one more line break in a paragraph, we again have to go into the HTML code.
If we want to add one more separator somewhere, we again have to go into the HTML code.
Which (X)HTML tags should we avoid in order to keep content and presentation separate?
Is separation of content and presentation also helpful for accessibility/screen reader users? ... and for programmer/developer/designer?
When defining what is content and presentation, see your HTML document as a data container. Then ask yourself the following on each element and attribute:
Does the attribute/element represent a meaningful entity in my data?
For example, are the words between the <b> tags in bold simply for display purposes, or did I want to add emphasis to that data?
Am I using the proper attribute/element to properly represent the type of data I want to represent?
Since I want to add emphasis on that particular section, I should use <em> (it doesn't mean italic, it means emphasis and can be made bold) or <strong>, depending on the level of emphasis wanted.
Am I using the attribute/element only for display purposes? If yes, can the element be removed and the parent element styled using CSS?
Sometimes a presentational tag can simply be replaced by CSS rules on the parent element. In that case, the presentational tag should be removed.
After asking yourself these three simple questions, you are usually able to make a pretty informed decision. An example:
Original Code:
<label for="name"><b>Name:</b></label>
Checking the <b> tag...
Does the attribute/element represent a meaningful entity in my data?
No, the tag doesn't represent a data node. It is there purely for presentation.
Am I using the proper attribute/element to properly represent the type of data I want to represent?
<b> is used for presentation of bold elements.
Am I using the attribute/element only for display purposes? If yes, can the element be removed and the parent element styled using CSS?
Since <b> is presentational and I am using it for presentation, yes. And since the <b> element affects the whole of <label>, it can be removed and the style applied to the <label> instead.
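The result of that walkthrough might look like the sketch below. The input element is added here only to make the label meaningful, and the blanket label selector is an assumption; a real page may want a narrower one:

```html
<style>
  /* The bold look moves from markup into the stylesheet. */
  label { font-weight: bold; }
</style>

<!-- The <b> wrapper is gone; the label carries only content. -->
<label for="name">Name:</label>
<input id="name" name="name" type="text">
```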
Semantic HTML's goal is not to simplify design and redesign or to avoid inline styling, but to help a parser understand what a particular tag represents in your document. That way, applications (e.g. search engines) can be created to intelligently decide what your content signifies and to classify it accordingly.
Therefore, it makes sense to use the CSS property content: to add quotes around text located in a <q> tag (the quotes have no value to the data contained in your document other than presentation), but it makes no sense to use the same CSS property to add a © symbol in your footer, as that symbol does have a value in your data.
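A sketch of that distinction, with made-up text and a hypothetical company name: the quotation marks come from CSS, while the © symbol stays in the markup because it is part of the data.

```html
<style>
  /* Quotation marks are presentation, so CSS generates them. */
  q { quotes: "\201C" "\201D"; }
  q::before { content: open-quote; }
  q::after  { content: close-quote; }
</style>

<p>She said <q>separate content from presentation</q>.</p>

<!-- The copyright symbol is data, so it lives in the HTML. -->
<footer>
  <p>&copy; Example Co.</p>
</footer>
```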
The same applies to attributes. Using the width and height attributes on an <img> tag representing an icon at size 16x16 makes semantic sense, as the size is important for understanding the meaning of the <img> tag (an icon can have different representations depending on the size it is displayed at). Using the same attributes on an <img> tag representing a thumbnail of a larger image does not.
Sometimes you will need to add non-semantic elements to be able to achieve your wanted presentation, but usually those are avoidable.
There are no wrong elements. There are wrong uses of particular elements. <b> should not be used when adding emphasis. <small> should be used for legal sub-text, not to make text smaller (see HTML5 - Section 4.6.4 for why), etc... All elements have a particular usage scenario and they all represent data (minus presentational elements, but they do have a use in some cases). No elements should be set aside.
Attributes are a different thing. Most attributes are presentational in nature. Attributes such as <img border> and <body bgcolor> rarely have any significance in the data you are representing, therefore you should not use them (except in those rare cases).
Search engines are a good example of why semantic documents are so important. Microformats are a predefined set of elements and classes which you can use to represent data that search engines will understand in a certain way. The product price information in Google searches is an example of semantics at work.
Using the predefined rules in set standards to store information in your document allows third-party programs to understand what would otherwise look like a wall of text, without relying on heuristic algorithms that may be prone to failure. It also helps screen readers and other accessibility applications to more easily understand the context in which the information is presented. It also greatly helps the maintainability of your markup, as everything is tied to a set definition.
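As an illustration of predefined classes, here is a small contact card sketch that assumes the microformats2 h-card vocabulary (h-card, p-name, u-url, p-org, p-tel). The person, company, URL and phone number are invented:

```html
<!-- The class names carry the meaning; a parser that knows h-card
     can extract name, URL, organisation and phone without heuristics. -->
<div class="h-card">
  <a class="p-name u-url" href="https://example.com">Jane Doe</a>
  <span class="p-org">Example Co.</span>
  <span class="p-tel">+1-555-0100</span>
</div>
```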
The best example is probably the CSS Zen Garden.
The goal of this site is to showcase what is possible with CSS-based design only, with a strict separation of content from the design. Style sheets contributed by various graphic designers are used to change the visual presentation of a single HTML file, producing hundreds of different designs. The HTML markup itself never changes between the different designs.
On each design page, you'd have a link to view the CSS file of that design.
What is the actual meaning of separation of content and presentation?
It is more a design philosophy than something concrete. In general, it means that you should preserve the semantics of the content: think of your content as a piece of structured information. It also means that you should keep all aesthetic details away from this structured information.
Does it just mean avoiding inline CSS?
As noted above, inline styles have nothing to do with the semantics of your content and should be avoided at all costs. But it isn't just that.
Does it mean that once the HTML is written, any later design change should be made with CSS only, without touching the HTML?
Unfortunately, it is not always possible to achieve some concrete aesthetic goals without modifying the underlying markup; CSS3 tries its best to address these issues.
Which (X)HTML tags should we avoid in order to keep content and presentation separate?
Look for deprecated tags in W3C HTML 4.01 / XHTML 1.0 Reference
Is separation of content and presentation also helpful for accessibility/screen reader users?
Surely. Better structured information generally remains readable even if certain browsers render styles incorrectly (or do not render them at all). Such content may also look more adequate on printed media (though print styles may be applied to achieve even better aesthetics; they, again, have nothing to do with content semantics).
Is separation of content and presentation also helpful for programmer/developer/designer ?
Of course. The separation of content and presentation takes its roots from a more general philosophy, the separation of concerns. Everybody benefits from the separation: the content supplier does not have to be a good designer and vice versa.
Putting in line breaks at certain points is inevitable; there will usually be some overlap of presentation and content. You should always aim for perfect separation though.
Take the other extreme: a page containing loads and loads of tables that are used for layout purposes only. This is the definite anti-pattern that should be avoided at all costs. The content plays second fiddle to the layout here; it's often not in the right order and therefore hardly machine readable. Content that is not machine readable is bad for accessibility and bad for the page's search engine ranking.
By marking up content without concern for presentation, you are first and foremost making it machine readable. You are then also in a position to serve the same content to different clients in different formats, say in a mobile-optimized version. You can also change the presentation easily without having to mess with the HTML files, say for a big redesign.
Another benefit that comes naturally by separating content and presentation (HTML - CSS files) is that you have less to type and less to maintain, plus your pages can have a consistent styling applied very easily. Contrast thousands of inline styles vs. one style definition in one CSS file, which is "naturally" applied to all elements with the same "meaning" (markup).
Ideally your (X)HTML consists only of meaningful, semantic markup and your CSS of styles using this markup for its selectors. In the real world you'll often mix classes and IDs into your markup that add no extra meaning, because you need these extra "hooks" to style everything the way you want to. But even here there's a difference between class="blue right-aligned" and class="contact-info secondary". Always try to add meaning to the content, not style. Balancing this is quite an art in itself. :)
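To make the class-name contrast concrete, a brief sketch using the two class sets from the paragraph above. The email address and the chosen styles are placeholders:

```html
<!-- Presentational names: they break as soon as the design changes. -->
<div class="blue right-aligned">support@example.com</div>

<!-- Meaningful names: they describe what the content is. -->
<div class="contact-info secondary">support@example.com</div>

<style>
  /* The look is decided here and can change without touching the HTML. */
  .contact-info.secondary { color: blue; text-align: right; }
</style>
```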