Using direct HTML tags instead of div.class - html

I have some special tags on my blogsite which need to be as simple as possible so that my colleges who don't know anything about HTML can use it. For example
<question>...</question>
and
<answer>...</answer>
and then these are styled in CSS. It's far easier for HTML-idiots than to use the <div class="answer">...</div> format.
I've just found out IE8 is displaying it all wrong while Firefox and Chrome do it right. Is that expected or am I doing something wrong? Do you know of any hack to fix this since there are tons of blogsposts I'll have to manually change otherwise!!

You want to create <question>...</question> etc.
These are not HTML (not even HTML5), and you will struggle to get browsers to understand them reliably.
A quick tip that might help you:
You say you've got it working in all browsers except IE. If so, you might be able to hack IE to get it working as well, using a technique similar to the hacks like HTML5Shiv that are being used to get IE to work with the new HTML5 tags. These use Javascript to create a DOM element with the new tag name, after which IE suddenly starts to recognise that tag as being valid HTML.
It might just work. But be aware that it is a hack, and it only targets IE. And since you're using non-standard tags, you also have no way of knowing what will happen in the future in terms of it breaking browsers, even if they work now. (in fact, I would say the worse case scenario would be if one of the tags you've invented is added to the HTML standard at a later date, because then you'll start getting weird layout glitches as it gets added to the default stylesheet)
If you can get it working that way, then well done. But consider yourself warned that it's not good practice.
What you have actually asked for is not HTML, but XML markup. This is perfectly fine, but shouldn't be put directly into a web page in the way you're hoping.
There are a number of well-documented ways to get raw XML code into a browser.
One option is to use XSL to transform it into valid HTML. Another way would be to load it into a DOM object in Javascript and process it using a script. (this is where the 'X' comes in 'Ajax').
My guess is that a simple XSL transformation would do the trick for you. (In fact, it sounds like your use case might be simple enough that even just basic string replacement might suffice for the same end result). You can get your colleauges to create the code using <whatever> <tags> <they> <want>, and you write a script that parses it and converts it to regular HTML prior to merging it with the rest of the page.
In the long term, this would probably be a far better solution than the hack I've described above.
Hope that helps.

I don't know if this answer fits your needs but imho using custom html tags is basically NOT using HTML. Therefore the absence of compatibilty.
If you need to render data in HTML wouldn't be better using XML + XSLT?
You can find guides on w3schools

You can't add new elements like that. HTML has some fixed elements that browsers understand, but if you add your own, browser don't know what to do with them.
HTML5 has some new elements you can maybe find useful : http://www.w3schools.com/html5/html5_new_elements.asp but this won't work with older browser without some kind of javascript to fix things. For example http://remysharp.com/2009/01/07/html5-enabling-script/
However, if you really want to add new tags, it is possible to do so and then "modify" them via javascript to known tags (actually it's what the html5 enabling script of IE do), but it won't be possible to apply CSS easily to the new tags.
In short, I strongly advise against adding new tags. It's not that hard to understand something like <div class="answer">.

sounds like you want to write XML and convert to HTML using XSLT. This is an old tutorial (includes defining an DTD), but a further web search will garner more results that might suit

here you go fella:
http://net.tutsplus.com/tutorials/html-css-techniques/how-to-make-all-browsers-render-html5-mark-up-correctly-even-ie6/
You need to use createElement :)

Related

What is really needed for HTML5 support?

I am using HTML5 tags like header and was using html5shiv: http://code.google.com/p/html5shiv/.
Looking through the files, it's seemed like everything is overdone with a bunch of unnecessary files, so I researched of an easier way through html5shiv http://css-tricks.com/snippets/javascript/make-html5-elements-work-in-old-ie/ and simply adding the "hotlink": http://html5shiv.googlecode.com/svn/trunk/html5.js and letting them host the rest.
Then I was thinking, even this code seems overdone. Why can't I just use createelement http://reference.sitepoint.com/javascript/Document/createElement, why do I need all the html5shiv code in general?
Here is some of the code from: http://html5shiv.googlecode.com/svn/trunk/html5.js
.cloneNode():f=c(a),f.canHaveChildren&&!d.test(a)?
g.appendChild(f):f},a.createDocumentFragment=Function("h,f","return function(){var
n=f.cloneNode(),c=n.createElement;h.shivMethods&&("+i().join()
I am not a professional at JavaScript, but I don't understand why this is necessary?
HTML5Shiv fixes several issues with using HTML5 element in IE, not just the obvious one of being able to create the elements in the first place. The first version was just that, but later versions have added further fixes for other issues.
The two other issues that I know of are:
Bugs with printing pages containing HTML5 elements.
A bug with .innerHTML when used with an HTML5 element.
The basic issue of allowing these elements to be added to the page is a pretty short and easy bit of code, but these other two issues are where the bulk of the HTML5Shiv code comes from.
A full write-up of the history of HTML5Shiv and when these things were added can be found here: http://paulirish.com/2011/the-history-of-the-html5-shiv/

How to use new HTML5 tags when they aren't fully implemented?

I've done a little bit of playing around with the new HTML5 tags and read up on all the new ones at W3cschools, etc. and I'm a little confused.
If I create an HTML page that uses "areas", "sections", "asides", etc. Nothing happens? I have to manually style them - which is FINE, but am I missing something? What's the point of declaring a tag an "aside" if I have to make it ASIDE (common css: float:right;width:30%, sorta thing)?
Why not just stay with DIV tags and style them?
I also noticed new attributes, such as "draggable", but, surprise-surprise, it doesn't drag! I have to code it to drag too (javascript/jquery??) ? What's the point of declaring it draggable? I can create a div tag and drag it using JQuery, so someone please enlighten me as to what's so "whoopty doo" about html5?
This is how the web changes in respect to HTML.
First new tags are created by browser makers for advanced users to try out.
Then they are added to the standard
Then they actually get implemented in the various browsers and devices.
Then they become widespread and useful.
Then they become universal and are implemented by 99.9% of devices.
For the tags mentioned they are probably between steps 2 and 3
It may not be "whoopty doo" but this is how changes occurs in this system
Developers who accept, embrace and use this pattern of evolution help move this process along.
It's a bit zen-like.
Additionally, as Aaron points out in his comment above (+1), these particular tags are semantic tags for organizational/outlining and search engines, screen readers and the like. So you yourself may not see much up front for them.
Might not be the case but at least I know for me I make this mistake every now and then. Don't forget to include at the top <!DOCTYPE html>. Furthermore, not all browsers support HTML5 yet I believe so make sure the browser version you are using supports it (most of them should support it nowadays though).

What issues need to be considered when using HTML5shiv?

I am thinking about starting to use some HTML5 elements in my sites. With the varying lack of support for HTML5 in Internet Explorer I was considering using HTML5shiv. I have read that I would need to set the CSS for various unrecognised elements to be block level and also the possibility of issues with loading HTML5 elements via ajax.
I would like to know what issues others have encountered when using this script. Thanks.
If you're going to dynamically load HTML5 elements you'll need the innershiv. You'll also need to bear in mind that if the IE user has JavaScript disabled, it won't work at all.
I've found the existing solution to be highly unreliable when used in real world scenarios - it's fine for noddy little "hello world" examples but as soon as the pages start getting more complex then you will find that styles will stop applying on some requests etc.
It's not a very nice answer, but the truth is that if you need to support older IE versions then you basically can't rely on being able to style HTML5 elements reliably. If you can get away with using the elements but use superflous markup (divs etc.) to do things like layout then you might get away with it, but then it depends what you consider to be the lesser of the two evils : Loads of noddy markup or no IE support.

Having the HTML of a webpage, how to obtain the visible words of that webpage?

Having the HTML of a webpage, what would be the easiest strategy to get the text that's visible on the correspondent page? I have thought of getting everything that's between the <a>..</a> and <p>...</p> but that is not working that well.
Keep in mind as that this is for a school project, I am not allowed to use any kind of external library (the idea is to have to do the parsing myself). Also, this will be implemented as the HTML of the page is downloaded, that is, I can't assume I already have the whole HTML page downloaded. It has to be showing up the extracted visible words as the HTML is being downloaded.
Also, it doesn't have to work for ALL the cases, just to be satisfatory most of the times.
I am not allowed to use any kind of external library
This is a poor requirement for a ‘software architecture’ course. Parsing HTML is extremely difficult to do correctly—certainly way outside the bounds of a course exercise. Any naïve approach you come up involving regex hacks is going to fall over badly on common web pages.
The software-architecturally correct thing to do here is use an external library that has already solved the problem of parsing HTML (such as, for .NET, the HTML Agility Pack), and then iterate over the document objects it generates looking for text nodes that aren't in ‘invisible’ elements like <script>.
If the task of grabbing data from web pages is of your own choosing, to demonstrate some other principle, then I would advise picking a different challenge, one you can usefully solve. For example, just changing the input from HTML to XML might allow you to use the built-in XML parser.
Literally all the text that is visible sounds like a big ask for a school project, as it would depend not only on the HTML itself, but also any in-page or external styling. One solution would be to simply strip the HTML tags from the input, though that wouldn't strictly meet your requirements as you have stated them.
Assuming that near enough is good enough, you could make a first pass to strip out the content of entire elements which you know won't be visible (such as script, style), and a second pass to remove the remaining tags themselves.
i'd consider writing regex to remove all html tags and you should be left with your desired text. This can be done in Javascript and doesn't require anything special.
I know this is not exactly what you asked for, but it can be done using Regular Expressions:
//javascript code
//should (could) work in C# (needs escaping for quotes) :
h = h.replace(/<(?:"[^"]*"|'[^']*'|[^'">])*>/g,'');
This RegExp will remove HTML tags, notice however that you first need to remove script,link,style,... tags.
If you decide to go this way, I can help you with the regular expressions needed.
HTML 5 includes a detailed description of how to build a parser. It is probably more complicated then you are looking for, but it is the recommended way.
You'll need to parse every DOM element for text, and then detect whether that DOM element is visible (el.style.display == 'block' or 'inline'), and then you'll need to detect whether that element is positioned in such a manner that it isn't outside of the viewable area of the page. Then you'll need to detect the z-index of each element and the background of each element in order to detect if any overlapping is hiding some text.
Basically, this is impossible to do within a month's time.

HTML differences between browsers

Do you know of any differences in handling HTML tags/properties in different browsers? For example, I once saw a page with a input tag with a maxlength field set to "2o". Firefox and Opera ignore the "o", and set the max length to 2, while Internet Explorer ignores the field altogether. Do you know of any more?
(Note: seeing as this will probably be a list, it would be great if the general name of the difference was in bold text, like: Different erratic value handling in tag properties)
Bug Lists
Web developers have already compiled some pretty comprehensive lists; I think it's better to compile a list of resources than to duplicate those lists.
http://www.positioniseverything.net/
http://www.richinstyle.com/bugs/table.html
http://www.quirksmode.org/ (as mentioned by Kristopher Johnson)
Javascript
I agree with Craig - it's best to program Javascript using a library that handles differences between browsers (as well as simplify things like namespacing, AJAX event handling, and context). Here's the jump to Craig's answer (on this page).
CSS Resets
CSS Resets can really simplify web development. They override settings which vary slightly between browsers to give you a more common starting point. I like Yahoo's YUI Reset CSS.
Check out http://www.quirksmode.org/
If you are programming in javascript the best advice I can give is to use a javascript library instead of trying to roll your own. The libraries are well tested, and the corner cases are more likely to have been encountered.
Scriptalicious - http://script.aculo.us/
jQuery - http://jquery.com/
Microsoft AJAX - http://www.asp.net/ajax/
Dojo - http://dojotoolkit.org/
Prototype - http://www.prototypejs.org/
YUI - http://developer.yahoo.com/yui/
Do you know of any differences in handling HTML tags/properties in different browsers
Is this question asking for information on all differences, including DOM and CSS? Bit of a big topic. I thought the OP was asking about HTML behaviour specifically, not all this other stuff...
The one that really annoys me is IE's broken document.getElementById javascript function - in most browsers this will give you something that has the id you specify, IE is happy to give you something that has the value in the name attribute, even if there is something later in the document with the id you asked for.
I once saw a page with a input tag
with a maxlength field set to "2o".
In this specific case, you're talking about invalid code. The maxlength attribute can't contain letters, only numbers.
What browsers do with invalid code varies a great deal, as you can see for yourself.
If you're really asking "what do all the different browsers do when faced with HTML code that, for any one of an infinite number of reasons, is broken?", that way lies madness.
We can reduce the problem space a great deal by using valid code.
So, use valid HTML. Then you are left with two main problem areas:
browser bugs -- how the browser follows the HTML standard and what it does wrong
differences in browser defaults, like the amount of padding/margin it gives to the body
Inconsistent parsing of XHTML in HTML mode
HTML parsers are not designed to handle XML.
If an XHTML document is served as "text/html“ and the compatibilities guidelines are not followed you can get unexpected results.
Empty tags is one possible source of problems. <tag/> and <tag></tag> are equivalent in XML. However the HTML parser can interpret them in two ways.
For instance Opera and IE treat <br></br> as two <br> but Firefox and WebKit treat <br></br> as one <br>.