How to specify a language across an entire website?

For context: I am not a web developer or programmer of any sort and I use a screen-reader.
I am using an accessibility evaluation tool across some sites I'd like to visit, which, however, have some accessibility issues.
From some googling, I see that it is fairly easy to specify a single webpage's language by using <html lang='en'></html>, for example for English.
How easy would it be to implement this across an entire site with hundreds of separate pages? I realize this is a vague question without linking to the specific website. For example, take a car website with a separate page for each individual car, with each page missing a specified language. Would it be fairly simple to specify the language across all pages automatically, or would one have to manually copy/paste <html lang='en'></html> into each page? Or does it heavily depend on how the website was structured HTML-wise?
Thank you

You can define the content language sitewide without modifying the HTML using the HTTP header:
Content-Language: de-DE
This is described in WCAG: Specifying the default language in the HTTP header
This header should be used as fallback by browsers when no lang attribute is declared.
The HTML5 specification says that if there is no lang attribute on the html tag, and if there is no meta element with the http-equiv attribute set to Content-Language, and if there is only a single language tag in the HTTP header declaration, then a browser may use that information to guess at the default language of the text in the page.
But as noted in WCAG, this might not always be sufficient:
Note that the Content-Language HTTP header does not serve to identify the language used for processing the content. The content processing language can be identified by means of other techniques, such as the attributes lang and xml:lang in markup languages.
The technique Using the language attribute on the HTML element must still be applied. Specifying the lang attribute is not something you do on a per-page basis; you do it once by modifying the site template.
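To see which pages of a site are missing the attribute, a small check can help. This is only a sketch using Python's standard html.parser module; the names LangFinder and declared_language are made up for this example, and a real accessibility tool checks much more than this.

```python
from html.parser import HTMLParser


class LangFinder(HTMLParser):
    """Records the lang attribute of the first <html> start tag, if any."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.seen_html = False

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling us.
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            self.lang = dict(attrs).get("lang")


def declared_language(markup):
    """Return the lang value declared on <html>, or None if it is missing."""
    finder = LangFinder()
    finder.feed(markup)
    return finder.lang


print(declared_language("<html lang='en'><body>Hello</body></html>"))
print(declared_language("<html><body>Hallo</body></html>"))
```

Running this over the saved HTML of each page would show which pages rely only on the HTTP header or declare no language at all.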
And yes, this is a programming question. A beginner one, but still a programming one.

To answer the underlying programming question: Yes, usually changing one file is enough.
You can safely assume that the HTML files that make up a website are generated either in advance by a Static Site Generator, or by a Content Management System (CMS) on the server when a page is requested.
To generate these files, templates are used. Templates give a common structure and contain variables, which will be replaced by content from the respective page or component.
Usually, there is one main, high-level HTML template, which provides the basic HTML structure with the opening HTML tag <html>.
So usually it suffices to change only one file to include the language attribute, by writing <html lang="en"> in that template, or using a variable for a multi-lingual site <html lang="{{language}}"> (syntax depends on your generator).
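As a rough illustration of why one file is enough, here is a minimal sketch using Python's built-in string.Template in place of a real generator. The LAYOUT template, the render_page function and the car pages are all hypothetical; your generator or CMS will have its own template syntax.

```python
from string import Template

# Hypothetical site-wide layout: every page is rendered through this one template,
# so the lang attribute only has to be written here.
LAYOUT = Template(
    "<!DOCTYPE html>\n"
    '<html lang="$language">\n'
    "<head><title>$title</title></head>\n"
    "<body>$body</body>\n"
    "</html>\n"
)


def render_page(title, body, language="en"):
    """Fill the shared layout with one page's content (no HTML escaping here)."""
    return LAYOUT.substitute(language=language, title=title, body=body)


# Hypothetical page data; a real site would load this from files or a database.
pages = {
    "car-a.html": ("Car A", "<p>Details about car A</p>"),
    "car-b.html": ("Car B", "<p>Details about car B</p>"),
}

for filename, (title, body) in pages.items():
    html = render_page(title, body)
    print(filename, "->", html.splitlines()[1])
```

Every generated page picks up the same opening tag, and a multi-lingual site would pass a different language value per page instead of relying on the default.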

Related

How to include meta data in HTML tags

I plan to use a headless-CMS to manage content on my website, so a non-technical content editor will be able to independently maintain content.
To help the content editor map content between the website and the CMS, I need to inject a CMS ID into HTML tags (the idea being that the content editor will view the page source to find the value).
What is the standard way to inject metadata into an HTML tag?
E.g. <p cms-id=7adQpNPZxP4jK28RLp3wES></p>

You can use data attributes on elements - https://developer.mozilla.org/en-US/docs/Learn/HTML/Howto/Use_data_attributes.
It doesn't interfere with HTML semantics and is easily accessible with JavaScript.
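For example, the ID from the question could be written as a data attribute and read back by any tool that parses the page. The sketch below uses Python's standard html.parser; the attribute name data-cms-id and the helper names are assumptions made for this example, not something the CMS prescribes.

```python
from html.parser import HTMLParser


class CmsIdCollector(HTMLParser):
    """Collects (tag, id) pairs for every element carrying data-cms-id."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # "data-cms-id" is a hypothetical attribute name chosen for this sketch.
            if name == "data-cms-id":
                self.found.append((tag, value))


def collect_cms_ids(markup):
    """Return all (tag, CMS id) pairs found in the given HTML text."""
    collector = CmsIdCollector()
    collector.feed(markup)
    return collector.found


sample = '<h1 data-cms-id="7adQpNPZxP4jK28RLp3wES">Title</h1><p>No id here</p>'
print(collect_cms_ids(sample))
```

A content editor viewing the page source would see the same data-cms-id value next to the element it belongs to.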

The <meta> tag defines metadata about an HTML document. Metadata is information about data.
<meta> tags always go inside the <head> element, and are typically used to specify character set, page description, keywords, author of the document, and viewport settings.

Contentful DevRel here. 👋
The rendering has to be done by your server. Depending on the technology you use you have to place the id in the website. It's hard to give advice without knowing what you use to render the HTML.

Why use Schema.org microdata to mark up web page elements?

I understand why and how to use Schema.org to add microdata to your site, this is not a question about that. The question is why Schema.org has support for certain things that can be marked up with simple HTML5. Among these are
Types
WebPage and WebSite: I can see why WebPage and WebSite would be needed, for example, to reference the page/site of a certain organization in a link, but there's no need to mark up your own page with this; the <html> tag does this.
SiteNavigationElement: Why not just use <nav>?
Table: Just use <table>.
Properties
WebPage/mainContentOfPage: the <main> element
WebPage/relatedLink: a <link> element inside <head>

This answer is primarily about the WebPageElement types (like SiteNavigationElement).
For WebPage, see my answer to the question Implicity of web page structure in Schema.org (tl;dr: it can be useful to provide WebPage, even for the current page).
For WebSite, similar reasons from the answer above apply. HTML doesn’t allow you to state something about the whole site (and, by the way, a Google rich result makes use of this type).
Schema.org is not restricted to HTML5.
Schema.org is a vocabulary which can be used with various syntaxes (like JSON-LD, Microdata, RDFa, Turtle, …), stand-alone or in various host languages (like HTML 4.01, XHTML 1.0/1.1, (X)HTML5, XML, SVG, …). So having other ways to specify that something is (or: is about; or: represents) a site-wide navigation, a table etc. is the exception rather than the rule.
But there can be reasons to use these types even in HTML5 documents, for example:
The HTML5 markup and the annotations from Microdata/RDFa are two "different worlds": a Microdata/RDFa parser is only interested in the annotations, and after successfully parsing a document, the underlying markup is of no relevance anymore (e.g., the information that something was specified in a table element is lost in the Microdata/RDFa layer).
By using types like WebPageElement, you can specify metadata that is not possible to specify in plain HTML5. For example, the author/license/etc. of a table.
You can use these types to specify data about something which does not exist on the current document, e.g., you could say on your personal website that you are the author of a table in Wikipedia.
That said, these are not typical use cases relevant for a broad range of authors. Unless you have a specific reason for using them, you might want to omit them. They are not useful for typical websites. Using them can even be problematic in some cases.
See also my Schema.org issue The purpose of WebPageElement and mainContentOfPage, where I suggested deprecating WebPageElement and the mainContentOfPage property.

Just use <table>.
You seem to be reading the title of the pages and no further. The <table> tag doesn't have the dozens of special properties listed on that page like isFamilyFriendly or license or timeRequired.
Schema.org microdata is intended to build a standard set of additional, semantic metadata that can be used by automated systems - search engine spiders, parser robots, etc. - to better understand the nature and features of the content.

html validator versus SU:BADGE

How do I fix this validator error?
element "SU:BADGE" undefined <su:badge layout="5"></su:badge>
I use HTML 4.01 Transitional on my website.
The badge is from the stumbleupon.com website.
Thank you. Regards.

Since such markup is not valid in HTML 4.01, the validator is just doing what you asked it to do: reporting any reportable markup errors.
The only practical reason why such error messages might be disturbing is that if there are many of them, you might accidentally overlook some other error messages, relating to constructs where your markup unintentionally deviates from HTML 4.01. If this is an issue, consider writing a custom DTD. It requires some understanding of SGML, but so does the use of validators in general, does it not?
On the other hand, you might decide to refrain from using tags suggested by people who cannot do such a simple thing using valid HTML. There are many ways to put information on a web page in a manner that does not affect the visible appearance but can be retrieved by programs that search for specific constructs (e.g., meta tags). They decided on a way that causes problems for authors who wish to use validators.

You don’t.
Validation is nice (especially as a sanity check while developing) but if you don’t use valid markup then there’s nothing you can do, and if you need invalid markup (because you use a third-party API for instance) then validation simply ceases to be relevant.
Alternatively, you could serve your page as XML instead of HTML 4 and define an appropriate su XML namespace. But since this would mean that some browsers no longer display your page correctly, this is more of a theoretical possibility.

The difference between two different HTML hyperlinks? (link & html tags)

I've been googling the internet and still can't seem to find an answer. I was wondering what the difference is between using something like:
<link rel="profile" href="http://gmpg.org/xfn/11" />
and
<html xmlns:og="http://opengraphprotocol.org/schema/" xmlns:fb="http://www.facebook.com/2008/fbml">
I'm using a HTML5 doctype and would like to keep everything clean. Am I wrong in thinking that these are somehow similar? Thanks!

These two types of links have almost nothing in common, other than using HTTP URIs.
The profile link element links to another resource (often a web page), which should be relevant to the current page. Some browsers might show this link somehow in the user interface, or interpret it otherwise. Search engines might also use it.
For some rel values (like rel="stylesheet"), there are definitions on how to interpret them in the relevant standards, others are only used by human readers.
The xmlns:... attributes define an XML namespace prefix (og or fb) for the current document, with a URI used simply as an identifier for the namespace. This means that you can now use elements in these namespaces, in addition to the normal HTML elements (by prefixing their names with og: or fb:).
The document at that URI will not be retrieved. The elements will either be already known by the XML processor reading the file, or simply ignored (if this is a simple browser interpreting this as HTML).
This is structural metadata about the current document (or element, in fact, as they are allowed on non-root elements, too, and only apply to the element they are on and its enclosed elements).
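A small sketch with Python's xml.etree.ElementTree shows that the URI is only an identifier: nothing is fetched, and the prefix itself is interchangeable. The og:title element and the x prefix are invented for this example.

```python
import xml.etree.ElementTree as ET

OG_URI = "http://opengraphprotocol.org/schema/"

# Two documents binding different prefixes to the same namespace URI.
# The <...:title> element is made up purely for illustration.
doc_a = '<html xmlns:og="%s"><og:title>Example</og:title></html>' % OG_URI
doc_b = '<html xmlns:x="%s"><x:title>Example</x:title></html>' % OG_URI


def qualified_names(xml_text):
    """Return the namespace-qualified tag name of every element in the document."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


# The parser never requests the URI; it only uses it to qualify element names.
print(qualified_names(doc_a))
print(qualified_names(doc_b))
```

Both documents produce the same qualified names, because the parser identifies the element by the URI and its local name, not by the prefix.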
For your next question in the comment:
The Dublin Core metadata is information about the content of the current document. I can see no reason to use links (or URIs) here, so in fact neither of them fits. If you would put the metadata in a separate document, you could link to them (using a link element), but normally you would use a meta element with a name from the Dublin Core standard. (Inside the head element, of course.)

xmlns: is an XML attribute. HTML5 is not XML, so this is a worthless attribute in your document.

What is DOM? (summary and importance)

What is the Document Object Model (DOM)?
I am asking this question because I primarily worked in .NET and I only have limited experience, but I often hear more experienced developers talk about/mention it. I read tutorials online but I am unable to make sense of the whole picture. I know that it is an API!
More specific questions are:
Where is it currently used?
What field(s) of developers use it (ex-.NET developers)?
How relevant is it for all developers in general to understand?

In general terms a DOM is a model for a structured document.
It is a central concept in today's IT and no developer can opt out of the DOM, be it in .NET, in HTML, in XML or in other domains where it is used.
It applies to all documents (Word documents, HTML pages, XML files, etc.). In the developer sphere it applies mainly in the HTML and the XML domains, with slightly different meanings.
HTML
In the HTML arena, the DOM was introduced to support the revolution known in the late 1990s as "dynamic HTML". Before IE4 and Netscape 4.0, HTML documents were not changeable inside the browser (back then, all you had to spruce up a web page was an animated GIF, and HTML was at version 3.2).
Therefore dynamically manipulating inside the browser the document sent by the server was a huge revolution and initiated the march towards the attractive web sites we see today.
JavaScript had been introduced by Netscape (named JavaScript to ride the new Java trend, but unrelated to it) and was supported by both Netscape HTTP servers and Netscape browsers, with Internet Explorer eagerly following the move inside the browser. However, when JavaScript is used to manipulate the content of a document, you need an easy way to designate the part of the document you want to interact with. That's where the DOM comes in. Although HTML 4 is not "well formed", browsers build an internal representation of the page with the "html" element at its top and plenty of HTML elements below it, in a hierarchical organisation (child nodes, parent nodes, attributes, etc.). The DOM is the model underpinning the API that lets you navigate this hierarchy.
Since Netscape and IE were competing browsers, there was little chance that the NS and IE DOMs would converge. The W3C stepped in to allow smaller browser vendors to enter the competition and endeavoured to standardise the DOM. Hence the W3C DOM. At first, all this did was introduce another dialect, and as everybody knows it took years and two serious competitors to force MS to comply with the standards.
Even though more modern libraries such as jQuery offer shorthand notations for navigating the DOM, they rely on the DOM internally.
XML
HTML made obvious the disadvantages of showing leniency towards the "well-formedness" of documents, and this ushered in a new craze: XML. In the web arena, XML and XSLT were first supported by IE5 and adopted in many more domains than just displaying pages.
To parse XML, mainly in the Java world, you would develop a SAX parser, which is basically a plugin to a SAX engine in which you describe what the engine should do with all the XML events (tags...) it will encounter in the parsed document. Developing a SAX parser is not straightforward, but it is a low-footprint solution.
However you have to develop a specific one for each new document type...
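The event-driven style looks roughly like this. The answer talks about Java, but the sketch uses Python's standard xml.sax module, which follows the same idea; the TagCounter handler and the cars document are invented for this example.

```python
import xml.sax


class TagCounter(xml.sax.ContentHandler):
    """Handler that reacts to one kind of event: the start of an element."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def startElement(self, name, attrs):
        # Called by the SAX engine each time it meets an opening tag.
        self.counts[name] = self.counts.get(name, 0) + 1


def count_tags(xml_text):
    """Stream through the document once and count how often each tag occurs."""
    handler = TagCounter()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.counts


print(count_tags("<cars><car/><car/></cars>"))
```

No tree is kept in memory, which is why the footprint is low, but the handler only knows about the events it was written for.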
So it was not long before libraries started to appear to parse any document and build an in-memory map of its hierarchy. Because it also had the same concepts of root, parents and children (inherited from SGML through HTML), it was also termed a DOM and the name applies regardless of the library.
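One such library is Python's standard xml.dom.minidom, used here only as an example of the in-memory approach. The describe function and the garage document are made up for this sketch.

```python
from xml.dom import minidom


def describe(xml_text):
    """Parse any XML document and report its root and the root's element children."""
    document = minidom.parseString(xml_text)
    root = document.documentElement
    # childNodes may also contain text nodes, so keep only elements.
    children = [node for node in root.childNodes
                if node.nodeType == node.ELEMENT_NODE]
    # Each child knows its parent, so the tree can be walked in both directions.
    return root.tagName, [(child.tagName, child.parentNode.tagName)
                          for child in children]


print(describe("<garage><car/><car/></garage>"))
```

The same function works for any well-formed document type, which is the advantage over writing a specific SAX handler for each one.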
Other domains
The concept of a DOM is not restricted to, or even invented for, HTML or XML. A DOM is a general concept applicable to any document, especially those with a hierarchical structure that you need to navigate (which is the vast majority of them). You can speak about the DOM of an MS Word document, and there are APIs to navigate these as well.

The DOM is the application programming interface for well-defined HTML and XML structures (per W3C's document). It is used in any place where you interact with the elements of a web page (any element - style, text, attributes, etc). You will hear a lot about the DOM with JavaScript and/or JavaScript libraries, such as jQuery (which, of course, is JavaScript). It is also referenced with Java, ECMAScript, JScript, and VBScript.
If you are programming .NET it is important if you are doing web-based work. If you are doing application programming, it's not as important. The DOM is definitely not a thing of the past - it is used and worked within every day by many developers. With that said, there has been work towards standardization of the DOM across web browsers. (Again, libraries can help hide these differences. This is one reason jQuery is so popular. You don't have to worry about the browser specifics - you just do what you need to do.)
The document I linked to above does a great job of answering all your questions and more. I would highly recommend reading it. If you have more questions, you can also check out the links below:
What is the Document Object Model? (W3C)
Document Object Model (Wikipedia)

I'm really not going to be able to explain it any better than the Wikipedia article on the DOM.
But to answer a few of your questions:
Where do we still use it?
Every web browser since the mid-nineties.
Who uses it,
Every web developer since the mid-nineties.
in what technology?
Mostly the web via JavaScript, but pretty much anytime you access XML/HTML programmatically you are using some kind of DOM implementation.
How important is it for anyone in .net carrier? [sic]
Extremely, although you probably use it without even knowing it.
Is this just a thing of the past which was heavily used but had problems?
If it is then somebody needs to tell John Resig that he has wasted the past 3 years of his life.

When a browser loads an HTML page, it converts it to the Document Object Model (DOM).
The HTML DOM produced by the browser is constructed as a tree that consists of all your HTML page's elements as objects. For example, assume that you load the HTML page below in a browser:
<!DOCTYPE html>
<html>
<head>
<title>website title</title>
</head>
<body>
<p id="js_paragraphId">I'm a paragraph</p>
some website
</body>
</html>
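After loading, the browser converts it to a tree of objects: the html element at the root, head and body as its children, and the title and p elements (along with their text) below them.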
With the HTML DOM, scripting languages can, among other things:
1- Change all the HTML elements in the page.
2- Change all the HTML attributes in the page.
3- Change all the CSS styles on the page.
4- Remove existing HTML elements and attributes.
5- Add new HTML elements and attributes.
6- React to all existing HTML events in the page.
7- Create new HTML events on the page.
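A few of these operations can be sketched outside a browser. The example below uses Python's xml.dom.minidom as a stand-in for the browser's DOM, on a well-formed version of the page above; in a real page you would do the same from JavaScript, and edit_page is a name invented for this sketch.

```python
from xml.dom import minidom

# Well-formed version of the sample page (minidom needs valid XML).
PAGE = (
    "<html><head><title>website title</title></head>"
    '<body><p id="js_paragraphId">I\'m a paragraph</p></body></html>'
)


def edit_page(markup):
    """Apply a few DOM operations and return the resulting markup."""
    document = minidom.parseString(markup)
    paragraph = document.getElementsByTagName("p")[0]

    # Change an element's text content.
    paragraph.firstChild.data = "Updated paragraph"
    # Change (here: add) an attribute.
    paragraph.setAttribute("class", "highlight")
    # Add a new element before the paragraph.
    heading = document.createElement("h1")
    heading.appendChild(document.createTextNode("Hello"))
    paragraph.parentNode.insertBefore(heading, paragraph)
    # Remove an existing element.
    title = document.getElementsByTagName("title")[0]
    title.parentNode.removeChild(title)

    return document.documentElement.toxml()


print(edit_page(PAGE))
```

Styles and events are browser features, so they are not covered here; the tree operations themselves work the same way in any DOM implementation.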
Let's get back to your questions:
1- It is currently used in all modern browsers.
2- Front-end developers.
3- All front-end developers who use scripting languages, especially JavaScript.

What is DOM?
The Document Object Model (DOM) is a programming API for HTML and XML documents. It defines the logical structure of documents and the way a document is accessed and manipulated. It is a standard defined by the W3C (World Wide Web Consortium).
Source: http://www.w3.org/TR/WD-DOM/introduction.html
