Displaying invalid HTML in layout - html

Not sure how to tag this question. I have a database of XHTML documents that are converted by LaTeXMLpost; however, saying that they have validation issues is an understatement. I need to show them inside a browser. However, tag autoclosing due to invalid markup messes up my structure.
A minimal example:
<!doctype html>
<html>
<head>
<title>test</title>
</head>
<body>
<div id="content" style="background-color:pink">
<!-- yield -->
<section >
<ul>
<li>
<div>
<p>
First
<li>
<div>
<p>
Second
</p>
</div>
</li>
</p>
</div>
</li>
</ul>
</section>
<section>
Next
</section>
<!-- end yield -->
</div><!-- end content -->
</body>
</html>
jsfiddle
Everything outside comments is layout; inside it is the loaded document. If things were taken at face value, everything should be pink, right?
The problem is, "Next" gets booted outside the #content. Even though the document is well-formed XML, it does not conform to the HTML/XHTML DTD (or whatever passes for a DTD in HTML5), so the browser's parser mangles it.
The question is: How can I protect my layout against invalid markup inside it? Can I do something to the content to normalise it? I'm loading it into Nokogiri before displaying, but I still end up in this mess anyway (since the XML isn't malformed, I suppose, Nokogiri doesn't do anything about it).
I don't care whether it's displayed nicely; all I care about now is that it remains safely contained (otherwise I have trouble manipulating it, attaching events, styling it, and pretty much everything else).

You can try Nokogiri; it has some built-in functionality for fixing invalid markup.
Related question : Repairing invalid HTML with Nokogiri (removing invalid tags)

Related

Is there a smart way to hide a lot of text in HTML?

I have a huge amount of text from several documents that I'd like to insert into my web pages. When I copy and paste the text into my <p> element, it works fine, but it looks messy in my HTML file.
Is there any other way to transfer my written document to my HTML file, for instance by linking the document to the HTML file? Or is there a way to hide or separate the <p> so that the HTML file looks neat even though it contains a huge amount of text? Any advice?
I do not know of any way to include HTML in another HTML file (something like PHP's include), but it can be done with jQuery:
index.html:
<html>
<head>
<!-- link jquery -->
<script>
$(function(){
$("#fileContent").load("doc.html");
});
</script>
</head>
<body>
<div id="fileContent"></div>
</body>
</html>
doc.html (file that contains your text)
There's a lot you could do to separate these blocks of text.
Firstly, I'd recommend using <div>..</div> tags to divide the content into separate semantic sections. There are a bunch of different tags that aim to divide the content of the page semantically: <aside>, <main>, <header>, <nav>, and so on. I'd recommend reading up on these tags and using them appropriately.
However, to answer your question more directly, you should put each block of text into its own <p> tag. After all, the <p> tag is meant for marking up separate paragraphs. While the HTML document may not look pretty when it is indented and filled with many tags like <div> and <p>, it is the best way to do it.
Unless the HTML page is going to be presented as source code, how the <p> tags look in the .html file does not matter; what matters is that they define how the page is structured and rendered in the browser.
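As a rough sketch of the advice above, the pasted text could be split into sections and paragraphs like this. The id values, headings, and placeholder sentences are made up for illustration and are not from the question:

```html
<main>
  <!-- One section per source document; the ids are hypothetical -->
  <section id="document-one">
    <h2>First document</h2>
    <p>First paragraph of the pasted text.</p>
    <p>Second paragraph of the pasted text.</p>
  </section>
  <section id="document-two">
    <h2>Second document</h2>
    <p>Each block of text gets its own paragraph element.</p>
  </section>
</main>
```

The file gets longer, but each block of text can now be found, styled, or moved on its own.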

Total Validator doesn't find skip link

Total Validator doesn't find this link and gives me this warning:
Add a skip navigation link as the first link on the page.
How can I write this link in a better way?
<html>
<body>
<a href="#skip">Skip to Content</a>
navbar with menu
<div id="skip">
</div>
</body>
</html>
From what I have tested, the validator requires the link text to contain the word "skip" and the href attribute to start with a #, whether or not the target element exists.
With the code you have submitted, it works with my own installation of TotalValidator (I am not saying that I would use this tool).
For information, TotalValidator web site uses the following code
<div id="skip">Skip navigation</div>
[...]
<a id="content"></a>
In spite of what the first comment says, the ID value "skip" is technically fine; it does not need to be changed to "skiptocontent". The reason why TotalValidator does not detect the skip link is probably something else. The link goes to somewhere in the page, and that "somewhere" is not explicitly marked as the main content. You can do this using WAI-ARIA landmarks.
With markup such as the following, it should be obvious for a validator that your first link is a skip link to the main content:
<body>
<a href="#skip">Skip to Content</a>
<!-- navigation menu goes here -->
<div role="main" id="skip">
<p>...</p>
</div>
</body>
You can also use "semantic" elements, e.g.
<body>
<a href="#skip">Skip to Content</a>
<header><h1>...</h1></header>
<nav><!-- navigation menu goes here --></nav>
<main id="skip"><!--role="main" is redundant on the main element-->
<p>...</p>
</main>
<footer>
</footer>
</body>
See the WAI-ARIA specification for documentation on main (role) and the HTML5.2 spec for the main element.

HTML: Does text need a container element to conform to standards?

Is the following W3C compliant?
<div>
<h3>Heading</h3>
This is the text for this section.
</div>
Or does the text require a container element?
<div>
<h3>Heading</h3>
<p>This is the text for this section.</p>
</div>
The first example doesn't sit right with me, but a content editor asked me and I realized I don't really know if it's okay.
Both examples are valid.
Technically in the first one, the text is inside a container, the outer <div>.
Anyway it is perfectly valid to put text directly inside the <body>, which means the following HTML document will pass validation with no errors or warnings:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Document</title>
</head>
<body>
<h3>Heading</h3>
This is the text for this section.
</body>
</html>
The more relevant question is whether it is semantically correct. To put it simply, paragraph text should be surrounded by a <p> tag. More generally, each type of content SHOULD be written inside the semantically relevant tag.
I will advise you to use the second approach.
Using the correct heading tag helps your page's SEO.
Moreover, the paragraph tag, <p>, helps some browsers render your page in “reading mode”.
Finally, a div is a generic block-level element. Styling text through it, as in div { color: blue; }, looks a bit odd, whereas p { color: red; } makes more sense to a lot of people.
Technically, both are conforming HTML (unless you validate against the strict HTML 4.x/XHTML 1.x DTDs, which have little connection to reality anymore). However, the second approach is probably more convenient from a styling/scripting perspective, where it is always better to be able to address any piece of content directly. The first example has an implicit paragraph, and explicit is usually better than implicit.

How to generate unrendered HTML elements on web page with Angular 2.1.1 like stackoverflow?

What I am trying to do:
I am attempting to create a web page with Angular 2 which shows HTML on the screen in much the same way many websites do, such as Stack Overflow, CSS-Tricks, and W3Schools (to name a few). I would like to be able to copy the code and paste it somewhere else after it's shown on screen.
What I know:
I have come to realize that it will probably be necessary to convert all of my opening angle brackets (i.e., <) to &lt; and all of my closing angle brackets (i.e., >) to &gt;. However, I am still not sure what the best way is to interpolate some variables into the template.
For example, I have this in my template file:
<div>{{myTitle}}</div>
<div><p>{{mySubTitle}}</p></div>
<div>
<ul>
<li>{{item1}}</li>
<li>{{item2}}</li>
<li>{{item3}}</li>
</ul>
</div>
What I want to see (and be able to copy) in the browser:
<div>This is my title</div>
<div><p>This is my subtitle</p></div>
<div>
<ul>
<li>Apple</li>
<li>Orange</li>
<li>Durian</li>
</ul>
</div>
Stack Overflow makes this really easy and nice to accomplish by letting you highlight the code you want to display on screen and clicking the {} button in the editor. However, when I try using the <pre> and <code> tags in my Angular 2 app, I do not get the same result: I cannot see the actual HTML elements like <div> and <li>.
Instead what I see is:
{{myTitle}}
{{mySubTitle}}
{{item1}}
{{item2}}
{{item3}}
I have used handlebarsjs in the past and am familiar with that library but I was under the impression that using Angular2 would eliminate the need for handlebarsjs. Does anyone know how to accomplish what I am trying to do in Angular2 without handlebarsjs?
For < and > you'll probably need to use &lt; and &gt;.
For the braces in template expressions, you may want to use the ngNonBindable directive.
<div ngNonBindable> {{myTitle}} </div>
Use <pre> or <code> around the escaped HTML so that it is displayed as code.
<pre ngNonBindable>
<div>{{'{{'}}myTitle{{'}}'}}</div>
<div><p>{{'{{'}}mySubTitle{{'}}'}}</p></div>
<div>
<ul>
<li>{{'{{'}}item1{{'}}'}}</li>
<li>{{'{{'}}item2{{'}}'}}</li>
<li>{{'{{'}}item3{{'}}'}}</li>
</ul>
</div>
</pre>
You need to escape { and } (for example like shown above)
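If the goal is the output described in the question (the markup shown as text, with the values filled in), one possible sketch is to escape only the angle brackets and leave the interpolation active. This assumes the component exposes myTitle, mySubTitle, item1, item2 and item3 as in the question, and it deliberately omits ngNonBindable so that Angular still evaluates the bindings:

```html
<!-- Angular template sketch: entities display as literal < and >, bindings are still evaluated -->
<pre><code>&lt;div&gt;{{myTitle}}&lt;/div&gt;
&lt;div&gt;&lt;p&gt;{{mySubTitle}}&lt;/p&gt;&lt;/div&gt;
&lt;div&gt;
  &lt;ul&gt;
    &lt;li&gt;{{item1}}&lt;/li&gt;
    &lt;li&gt;{{item2}}&lt;/li&gt;
    &lt;li&gt;{{item3}}&lt;/li&gt;
  &lt;/ul&gt;
&lt;/div&gt;</code></pre>
```

Because the tags are plain text as far as the browser is concerned, selecting and copying the block gives the markup with the interpolated values.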

HTML eMail error with HTML 4.01

I'm working on some Oracle code to generate an HTML eMail. It's mostly working, but I took the resulting HTML and placed it in Dreamweaver CS6 to use the validation. I get a few errors:
1) No Character encoding declared at document level [HTML 4.01]
2) element "U" undefined [HTML 4.01]
The html code is generated automatically by a rich text editor widget. Should I use something other than HTML 4.01? I'm not too savvy with HTML Header code.
Here's the HTML code that is generated from my test.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<title>Saint Susanna Parish Mailing</title>
</head>
<body>
<p>This is normal text</p>
<p>
<strong>This is bold</strong>
</p>
<p>
<u>This is Underscored</u>
</p>
<ol>
<li>
<span style="color:#ff0000;">This is numbered</span>
</li>
</ol>
<ul>
<li>This is bulleted</li>
</ul>
<p style="text-align: center;">This is centered</p>
<p>
<span style="font-size:18px;"><span style="font-family: times new roman,times,serif;">This is a new font</span></span>
</p>
<p style="text-align: right;">This is right justified</p>
<p> </p>
</body>
</html>
Thanks for looking at this.
I think the encoding can -and must- be specified in the mail headers, so I would ignore that warning.
The article The Importance of Content-Type Character Encoding in HTML Emails says:
[The client] will display the email based on what Content-Type has been set.
However, email clients read the Content-Type value that is set in the
email header and they completely ignore the META tag that is within
the HTML.
So that suggests that you should add the proper header, and can safely ignore the validator's warning, although it can't hurt at all to add the meta tag as well.
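If you do add the meta tag, a minimal sketch for the HTML 4.01 head could look like the following. UTF-8 is an assumption here; use whichever charset matches the Content-Type header the mail is actually sent with:

```html
<head>
<!-- The charset value is an assumption; keep it identical to the mail's Content-Type header -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Saint Susanna Parish Mailing</title>
</head>
```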
If you want a second opinion, you can try the W3C Markup Validation Service, although that one might also complain about missing content types. After all, these validators don't know what headers you are going to supply.
Different rules apply to HTML mail anyway. Many clients ignore basically everything outside of the body. They also filter out all kinds of attributes, won't allow JavaScript, and often ignore external stylesheets and <style> blocks.
The <u> element is deprecated in HTML 4.01, which means it is defined only in the Transitional DTD. Your document declares the Strict DTD, so the validator is technically right that U is undefined there. Mail clients still render <u>, so you can ignore that warning or switch to the Transitional doctype. I wouldn't underline text at all though, because that text could easily be mistaken for a link. If you need to, and you don't want to use <u>, you can use an inline text-decoration style.
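For illustration, the underlined paragraph from the question could be written with an inline style instead of <u>. This is only a sketch of the inline text-decoration approach; whether a particular mail client keeps the style attribute is something to test:

```html
<p>
<!-- Replaces <u>...</u>; the style attribute is allowed in HTML 4.01 Strict -->
<span style="text-decoration: underline;">This is Underscored</span>
</p>
```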
Some suggestions:
You can control a lot by using classes etc., declared in a style.css file that you load first as well.
<!DOCTYPE HTML> - HTML 5
<b> and </b> can replace strong to save characters
<link rel="stylesheet" type="text/css" href="../style.css" title="Standard Style">