What is the preferred method when dealing with choosing a class vs. an ID?
For instance, you can have a bunch of elements that might be styled identically and could all use the same class. However, for readability purposes, it's sometimes nice to have a unique ID for each element instead.
Obviously you don't want to go ridiculously overboard where every element has an ID. However, where do you guys draw the line and does using all IDs where you could be using classes slow things down noticeably? If so... when?
How to stop obliterating semantic HTML.
Most people learn HTML by looking at the source code of HTML pages and tinkering with it, seeing how <tag>foo</tag> looks and running with it. They don't really gain a deep understanding of it, but they go on to do things that require a deep understanding. The side effect is the problem you and thousands of others have every day: people are doing things without fully knowing how these tools work, because it looks so simple on the surface and the powerful uses are "hidden" in the funny manual that nobody feels the need to read. Everything is plainly explained and has been written down for a long time.
What IDs are for (directly from the HTML4 spec, with my notes)
The id attribute assigns a unique identifier to an element (it only happens ONCE, never TWICE or more, I'm tired of seeing people come on this site and dropping in their code with the same ID in twenty elements)
The id attribute has several roles in HTML:
As a style sheet selector. (This means, you can use it to describe CSS styles)
As a target anchor for hypertext links. (When you can jump to a section of a page)
As a means to reference a particular element from a script. (document.getElementById("whatever"))
As the name of a declared OBJECT element.
For general purpose processing by user agents (e.g. for identifying fields when extracting data from HTML pages into a database, translating HTML documents into other formats, etc.).
What Classes are for (directly from the HTML4 spec, with my notes)
The class attribute [...] assigns one or more class names to an element (this one gets to be re-used to your heart's content) ; the element may be said to belong to these classes. A class name may be shared by several element instances. The class attribute has several roles in HTML:
As a style sheet selector (when an author wishes to assign style information to a set of elements).
For general purpose processing by user agents. (Basically, it's just another part of an element)
What? I don't get it.
IDs: An ID is the fingerprint of something. There's only one, and you use each fingerprint only once in the entire document. You only use it when you need to give something an ID. You probably don't want to have hundreds of these, or even tens of these. You rarely if ever need to start making these. The specific uses are target anchors and improving selector speed in rare edge cases. Generally you never describe your CSS based on IDs. You might have some edge cases such as #HEADER .body h1, which may be styled differently from the same elements under #BODY, but I'd still advise against making them IDs for no real reason.
Classes: Nothing to do with unique fingerprints or linking to sections of a page; classes don't uniquely identify something. Classes describe a group of things that belong together or should behave the same way. If you're part of the class called coffee, you should exhibit the properties one might expect from coffee; if you're of the class cellphone, then look like a cellphone (don't provide coffee).
But how the heck am I supposed to access the 4th cell in the 6th column of some table, or group of divs or that 20th list item?
This is where people who don't know what HTML is throw their hands up in the air and decide to assign IDs to all the elements. This is a total side effect of nobody properly explaining to you how HTML works. That's a nice way of saying you didn't RTFM or ask questions early on (user1066982 in this case did, which is amazing and makes me happy; I'm writing this so I can point other people who fail at HTML to it in the future).
You need to start learning right now. Stop pretending you understand this stuff.
HTML is not a string of text such as <foo><bar>baz</bar>blah<ding/></foo>, sure that's how you write HTML but if that's what you believe it is you do not understand HTML in the browser.
HTML is a document that is structured like XML. HTML documents have a model, that means they aren't flat text. The text-representation of that document is a way your browser can take flat text and turn it into a tree structure. Trees are like arrays, except they aren't just flat elements in an array one-after-another, but rather they nest so one element may point to several other elements.
This below isn't a diagram (stolen from the w3c's spec on the Document Object Model) of how to write HTML text, this is a diagram of how your browser stores it in memory:
Since it's in memory like that, it doesn't mean "Oh crap! I have no way to access the first TD in the second TR of the table body in the table!", it means you simply and plainly explain to your code that there is a child element inside of the table.
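To make the "tree, not text" idea concrete, here is a minimal sketch. It uses plain objects as a toy stand-in for the browser's tree (the `node` and `collectByTag` helpers are made up for illustration and are not part of any DOM API), and shows that reaching "the first TD in the second TR" is just a matter of following child links:

```javascript
// A toy stand-in for the in-memory tree: each node has a tag name and children.
// This is NOT the real DOM, only an illustration of its shape.
function node(tag, children = []) {
  return { tag, children };
}

// Roughly: <table><tbody><tr><td/><td/></tr><tr><td/><td/></tr></tbody></table>
const table = node("table", [
  node("tbody", [
    node("tr", [node("td"), node("td")]),
    node("tr", [node("td"), node("td")]),
  ]),
]);

// Depth-first walk in document order, collecting every descendant with the given tag.
function collectByTag(root, tag, found = []) {
  for (const child of root.children) {
    if (child.tag === tag) found.push(child);
    collectByTag(child, tag, found);
  }
  return found;
}

// "The first TD in the second TR" is just a path through the tree.
const secondRow = collectByTag(table, "tr")[1];
const firstCell = secondRow.children[0];
```

No IDs were needed anywhere; the structure itself is the address.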
JavaScript provides a full DOM API that allows you to access every single node in that DOM tree.
PHP provides a full DOM API that allows you to access every single node in that DOM tree.
C++ has a full DOM API that allows you to access every single node in that DOM tree.
ASP provides a full DOM API that allows you to access every single node in that DOM tree.
EVERYTHING that touches the DOM provides a full DOM API that allows you to access every single node in that DOM tree, with the exception of sub-standard software that throws regular expressions around in a futile attempt at parsing HTML.
Use the API for the DOM to access those nodes based on semantic HTML. Semantic HTML means you have a structure to your HTML that makes sense. Paragraphs go in <p> tags, headings go into heading tags, and so on.
You never, under any circumstances whatsoever, need to reproduce the DOM API by hacking in values with id attributes because you didn't know you could just say getElementsByTagName("td")[3] to get the fourth element (the index is zero-based).
If you can grab getElementsByTagName("td")[3], you don't need to do <td id="id4"> and then later getElementById("id4") because you didn't want to learn just one other API call. I dread the day I ever have to maintain a pile of code left behind by someone who felt the need to stick an ID into every element "just to be sure", especially when I need to go back and insert a new element between the fifth and sixth element in a table of thousands (can you imagine replacing EVERY id? Especially when this feature was accounted for over 10 years ago?! Insanity!)
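For the table case specifically, a rough sketch of position-based access might look like the following. The helper name `cellAt` is made up for illustration; it relies only on the standard `rows` and `cells` collections that table and row elements expose, and it assumes you count rows and columns from 1:

```javascript
// Returns the cell at a 1-based row and column position, or null if out of range.
// `table` is expected to be an HTMLTableElement (or anything exposing the same
// `rows` / `cells` collections).
function cellAt(table, rowNumber, columnNumber) {
  const row = table.rows[rowNumber - 1]; // the collections are zero-based
  if (!row) return null;
  return row.cells[columnNumber - 1] || null;
}

// In a browser (assuming the page contains at least one table):
// const table = document.getElementsByTagName("table")[0];
// const cell = cellAt(table, 4, 6); // 4th row, 6th column
```

Inserting a new row or cell later does not require renaming anything, because positions are computed from the tree at the time you ask.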
Tl;dr
HTML isn't actually just a pile of text with one way to access it
RTFM. Stop pretending you understand it because you can do a handful of things; you're holding yourself back.
Don't shove IDs everywhere, only use them where absolutely required.
Use classes to describe things, not identify things.
?????
Profit.
However, for readability purposes, it's sometimes nice to have a unique ID for each element instead.
This makes absolutely no sense to me. What makes an ID more readable than a class? There's no point assigning unique identifiers to each of a group of related elements if there's no benefit in having identities.
For what it's worth, realize that a single element can have both classes and an ID. If your elements need to be uniquely identified somehow, give them IDs. If multiple elements should be styled identically and are all similar in purpose anyway, use classes. If your elements fit both criteria, give them both attributes, and use each attribute accordingly.
IDs should not be used for styling. Use classes instead. IDs have a very high specificity, and are difficult to override (leading to more IDs, and longer selector chains). Also, IDs are used for JavaScript DOM selection, so if you're using the same IDs in your CSS that you're using in your JavaScript, you've tied the styles to the scripts, and that's bad separation of concerns.
IDs are for JavaScript. Classes are for CSS.
Note: JavaScript and specificity are not the only reasons. Others include fragment identifiers and code reuse. As I say in the comments, there are several smart people who advise against IDs (start there and follow the links)
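To illustrate the specificity point, here is a deliberately simplified sketch of how specificity is compared. It only understands `#id`, `.class`, `[attribute]` and type selectors (no pseudo-classes, pseudo-elements, `:not()` and so on), and the selectors in the usage comments are hypothetical. Treat it as a rough model of the rule, not as what browsers actually run:

```javascript
// Returns [ids, classes + attributes, types] for a simple selector string.
function specificity(selector) {
  // Pull out attribute selectors first so their contents are not miscounted.
  const attributes = selector.match(/\[[^\]]*\]/g) || [];
  const rest = selector.replace(/\[[^\]]*\]/g, " ");
  const ids = rest.match(/#[\w-]+/g) || [];
  const classes = rest.match(/\.[\w-]+/g) || [];
  // Type selectors: names at the start of a compound selector.
  const types = rest.match(/(^|[\s>+~])[a-zA-Z][\w-]*/g) || [];
  return [ids.length, classes.length + attributes.length, types.length];
}

// Positive if `a` wins, negative if `b` wins, 0 on a tie.
// Columns are compared left to right, so one ID outweighs any number of classes.
function compareSpecificity(a, b) {
  const sa = specificity(a);
  const sb = specificity(b);
  for (let i = 0; i < 3; i++) {
    if (sa[i] !== sb[i]) return sa[i] - sb[i];
  }
  return 0;
}

// specificity("#header .title")      gives [1, 1, 0]
// specificity("[id=header] .title")  gives [0, 2, 0]
// compareSpecificity("#header .title", ".a.b.c") is positive: the ID rule wins.
```

Note that the attribute form `[id=...]` counts like a class, which is one way to target a unique element without paying the ID's specificity cost.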
I use IDs for elements that have a clear responsibility and classes for elements that share the same presentation, for example:
HTML:
<div id='sport-news'>
<article class='news'>...</article>
<article class='news'>...</article>
</div>
CSS:
.news { /* global styles */ }
[id=sport-news] .news { /* specific styles */ }
JavaScript:
var sportNews = document.getElementById('sport-news') // faster
, news = sportNews.children; // element children only; childNodes would also include whitespace text nodes
For me, [id] .class is more readable than .parent-class .child-class.
When designing, I use both IDs and classes. For specific items I use an ID only. But if you need to apply the same styles to different items, use classes. You cannot use the same ID for different items because an ID is specific to one item only.
From a language-design standpoint, what's the point of creating the id attribute for HTML if you can have a class with only one element? Why not just use classes for everything and not complicate the markup?
I can think of three possible explanations, but they don't fully satisfy me, so I wondered if you know why id was included in HTML. My thoughts are:
The existence of an id helps in creating CSS styles because its greater specificity makes it possible to give an id to one member of a class overriding styles given to other members of that class. This explanation doesn't fully satisfy me because you could just give it an extra class instead and put the styles for that class at the bottom of the stylesheet in a section for styles given to single elements.
When selecting elements with jQuery, the DOM traversal could stop as soon as the element with that id is found. Thus, the existence of an id would make the selection run faster. This explanation doesn't satisfy me because I'm fairly certain that jQuery was created long after ids and classes already existed.
Having an id as a language feature could help to ensure that styles (and selectors) which are supposed to be unique truly are applied to only one element because things go haywire when this isn't the case. This explanation doesn't satisfy me because having your site break when you accidentally create two elements with the same id doesn't seem to be a particularly effective way of informing you that something's gone wrong.
The first publicly available description of HTML was a document called "HTML Tags", first mentioned on the Internet by Berners-Lee in late 1991.
There is a description of anchor tag:
<A NAME=xxx HREF=XXX> ... </A>
HREF
...This allows for the form HREF=#identifier to
refer to another anchor in the same document.
NAME
The attribute NAME allows the anchor to be the destination of a link.
I think the NAME attribute here is the predecessor of the element ID: it allowed you to link directly to a desired part of a hypertext page (even within the same page).
IDs are unique values, so when you process the HTML with something such as JavaScript, you can be sure which element your script will hit.
For JavaScript, anyway, getElementById is a few times faster than getElementsByClassName:
Test                      Ops/sec
getElementById            269,235
getElementsByClassName     86,369
ref
More info from the spec
What makes attributes of type ID special is that no two such attributes can
have the same value in a conformant document, regardless of the type of the
elements that carry them; whatever the document language, an ID typed
attribute can be used to uniquely identify its element.
So it is a way to uniquely identify an element, where the class selector could only do so by coincidence.
ref
There are a great many reasons, most of which don't even involve CSS. For example, ajax and JS libraries often require unique IDs, and IDs can act as anchors with URL hashes.
XML, like HTML, is derived from SGML, but it is used for more generic data. In XML it is very often desirable to have a unique id (the same way it is often necessary in a database). Because XML puts a lot of effort into automated verifiability, the best approach was to simply add the id attribute as a language element. This way, an XML validator can output an error if the same value is assigned to two id attributes.
Later, many XML features found their way back to HTML, and I guess id is just one of them. It is not strictly needed, but it is a nice thing to have in combination with JavaScript.
I'm wondering why someone would want to use CSS selectors rather than XPath selectors, or vice-versa, if he could use either one. I think that understanding the algorithms that process the languages will resolve my wonder.
There's a lot of documentation on XPath and CSS selectors individually, but I've found very few comparisons. Also, I don't use CSS selectors that much.
Here's what I've read about the differences. (These three references discuss the use of XPath and CSS selectors in Selenium to query HTML, but my wonder is general.)
XPath allows traversal from child to parent
CSS selectors have features specific to HTML
CSS selectors are faster when you're using Internet Explorer in Selenium
It looks like CSS selection algorithms are somehow optimized for HTML, but I don't know how.
Is there a paper on how CSS and XPath query algorithms work and how they differ?
Are there other abstract differences between the languages that I'm missing?
The main difference is in how stable the document structure you target is:
XPath is a good query language when the structure matters and/or is stable. You usually specify a path, conditions, an exact offset... It is also a good query language for retrieving a set of similar objects, and because of that it has an intimate relationship with XQuery. Here the document has a stable structure and you must retrieve repeated or similar sections.
CSS selectors are better suited to CSS stylesheets. These do not care much about the document structure because it changes a lot. Think of one CSS stylesheet applied to all the HTML pages of a website. The content and structure of every page is different. Here CSS selectors are better because of that changing structure. You will notice that access is more tag based. Most CSS syntax specifies a set of elements, attributes, ids, classes... and not so much their structure. Here you must locate sections that do not have a clear location within a document structure but are marked with certain attributes.
Update: After a closer look at your question I realized that you are more interested in the current implementations, not the nature of the query languages. In that case I cannot give you the answer you are looking for. I can only suppose that the reason is still that one is more dependent on the structure than the other.
For example, in XPath you must keep track of the structure of the document you are working on. CSS selectors, on the other hand, are triggered when a specific tag shows up, and it usually does not matter what came before it. I can imagine that it is much easier to implement a CSS selector algorithm that works as you read a document, while XPath has more cases where you really need the full document and/or must keep strict track of what you are reading (because the history and background of what you are reading is more important).
Now, do not take my update too seriously. I am only guessing here because I have some background in language parsing, but I do not actually have experience with languages designed for data querying.
Why is it bad practice to have more than one HTML element with the same id attribute on the same page? I am looking for a way to explain this to someone who is not very familiar with HTML.
I know that the HTML spec requires ids to be unique but that doesn't sound like a convincing reason. Why should I care what someone wrote in some document?
The main reason I can think of is that multiple elements with the same id can cause strange and undefined behavior with Javascript functions such as document.getElementById. I also know that it can cause unexpected behavior with fragment identifiers in URLs. Can anyone think of any other reasons that would make sense to HTML newbies?
Based on your question you already know what the W3C has to say about this:
The id attribute specifies a unique id for an HTML element (the id
attribute value must be unique within the HTML document).
The id attribute can be used to point to a style in a style sheet.
The id attribute can also be used by a JavaScript (via the HTML DOM)
to make changes to the HTML element with the specific id.
The point of an id is that it must be unique. It is used to identify an element (or anything: if two students had the same student id, schools would come apart at the seams). It's not like a human name, which needn't be unique. If two elements in an array had the same index, or if two different real numbers were equal... the universe would just fall apart. It's part of the definition of identity.
You should probably use class for what you are trying to do, I think (ps: what are you trying to do?).
Hope this helps!
Why should I care what someone wrote in some document?
You should care because if you are writing HTML, it will be rendered in a browser which was written by someone who did care. W3C created the spec and Google, Mozilla, Microsoft etc... are following it so it is in your interest to follow it as well.
Besides the obvious reason (they are supposed to be unique), you should care because having multiple elements with the same id can break your application.
Let's say you have this markup:
<p id="my_id">One</p>
<p id="my_id">Two</p>
CSS is forgiving, this will color both elements red:
#my_id { color:red; }
..but with JavaScript, this will only style the first one:
document.getElementById('my_id').style.color = 'red';
This is just a simple example. When you're doing anything with JavaScript that relies on ids being unique, your whole application can fall apart. There are questions posted here every day where this is actually happening - something crucial is broken because the developer used duplicate id attributes.
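If you suspect a page already has this problem, a small check along these lines can help track it down. This is only a sketch: `findDuplicateIds` is a made-up helper that takes a plain list of id strings, and the commented lines show one way the list could be collected in a browser:

```javascript
// Returns the id values that occur more than once, in first-seen order.
function findDuplicateIds(ids) {
  const counts = new Map();
  for (const id of ids) {
    counts.set(id, (counts.get(id) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([id]) => id);
}

// In a browser, the ids could be collected like this:
// const ids = Array.from(document.querySelectorAll("[id]"), (el) => el.id);
// console.log(findDuplicateIds(ids));
```

For the markup above, the list of ids would be ["my_id", "my_id"], and the helper would report "my_id" as a duplicate.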
Because if you have multiple HTML elements with the same ID, it is no longer an IDentifier, is it?
Why can't two people have the same social security number?
You basically answered your own question. I think that as soon as an element can no longer be uniquely identified by its id, any function that relies on this uniqueness will break. You can still choose to search for elements XPath-style, using the id like you would use a class, but it's cumbersome, error-prone and will give you headaches later.
The main reason I can think of is that multiple elements with the same id can cause strange and undefined behavior with Javascript functions such as document.getElementById.
... and XPath expressions, crawlers, scrapers, etc. that rely on ids, but yes, that's exactly it. If they're not convinced, then too bad for them; it will bite them in the end, whether they know it or not (when their website gets visited poorly).
Why should a social security number be unique, or a license plate number? For the same reason any other identifier should be unique. So that it identifies exactly one thing, and you can find that one thing if you have the id.
The main reason I can think of is that multiple elements with the same
id can cause strange and undefined behavior with Javascript functions
such as document.getElementById.
This is exactly the problem. "Undefined behavior" means that one user's browser will behave one way (perhaps get only the first element), another will behave another way (perhaps get only the last element), and another will behave yet another way (perhaps get an array of all elements). The whole idea of programming is to give the computer (that is, the user's browser) exact instructions concerning what you want it to do. When you use ambiguous instructions like non-unique ID attributes, then you get unpredictable results, which is not what a programmer wants.
Why should I care what someone wrote in some document?
W3C specs are not merely "some document"; they are the rules that, if you follow in your coding, you can reasonably expect any browser to obey. Of course, W3C standards are rarely followed exactly by all browsers, but they are the best set of commonly accepted ground rules that exist.
The short answer is that in HTML/JavaScript DOM API you have the getElementById function which returns one element, not a collection. So if you have more than one element with the same id, it would not know which one to pick.
But the question isn't that dumb actually, because there are reasons to want one id that might refer to more than one element in the HTML. For example, a user might make a selection of text and wants to annotate it. You want to show this with a
<span class="Annotation" id="A01">Bla bla bla</span>
If the user selected text that spans multiple paragraphs, then the <span> needs to be broken up into fragments, but all fragments of that selection should be addressable by the same "id".
Note that in the past you could put
<a name="..."/>
elements in your HTML and you could find them with getElementsByName. So this is similar. But unfortunately the HTML specifications have started to deprecate this, which is a bad idea because it leaves an important use case without a simple solution.
Of course, with XPath you can do anything: use any attribute or even a text node as an id. Apparently the XPointer spec allows you to refer to elements by any XPath expression and use that in URL fragment references, as in
http://my.host.com/document.html#xpointer(id('A01'))
or its short version
http://my.host.com/document.html#A01
or, other equivalent XPath expressions:
http://my.host.com/document.html#xpointer(/*/descendant-or-self::*[@id = 'A01'])
and so, one could refer to name attributes
http://my.host.com/document.html#xpointer(/*/descendant-or-self::*[@name = 'A01'])
or whatever you name your attributes
http://my.host.com/document.html#xpointer(/*/descendant-or-self::*[@annotation-id = 'A01'])
Hope this helps.
I have seen it a lot in CSS discussions. What does semantically correct mean?
Labeling correctly
It means that you're calling something what it actually is. The classic example is that if something is a table, it should contain rows and columns of data. To use that for layout is semantically incorrect - you're saying "this is a table" when it's not.
Another example: a list (<ul> or <ol>) should generally be used to group similar items (<li>). You could use a div for the group and a <span> for each item, and style each span to be on a separate line with a bullet point, and it might look the way you want. But "this is a list" conveys more information.
Fits the ideal behind HTML
HTML stands for "HyperText Markup Language"; its purpose is to mark up, or label, your content. The more accurately you mark it up, the better. New elements are being introduced in HTML5 to more accurately label common web page parts, such as headers and footers.
Makes it more useful
All of this semantic labeling helps machines parse your content, which helps users. For instance:
Knowing what your elements are lets browsers use sensible defaults for how they should look and behave. This means you have less customization work to do and are more likely to get consistent results in different browsers.
Browsers can correctly apply your CSS (Cascading Style Sheets), describing how each type of content should look. You can offer alternative styles, or users can use their own; as long as you've labeled your elements semantically, rules like "I want headlines to be huge" will be usable.
Screen readers for the blind can help them fill out a form more easily if the logical sections are broken into fieldsets with one legend for each one. A blind user can hear the legend text and decide, "oh, I can skip this section," just as a sighted user might do by reading it.
Mobile phones can switch to a numeric keyboard when they see a form input of type="tel" (for telephone numbers).
Semantics basically means "The study of meaning".
Usually when people are talking about code being semantically correct, they're referring to the code that accurately describes something.
In (x)HTML, there are certain tags that give meaning to the content they contain. For example:
An H1 tag describes the data it contains as a level-1 heading. An H2 tag describes the data it contains as a level-2 heading. The implied meaning behind this is that each H2 under an H1 is in some way related (i.e. heading and subheading).
When you code in a semantic way, you basically give meaning to the data you're describing.
Consider the following 2 samples of semantic VS non-semantic:
<h1>Heading</h1>
<h2>Subheading</h2>
VS a non-semantic equivalent:
<p><strong>Heading</strong></p>
<p><em>Subheading</em></p>
Sometimes you might hear people in a debate saying "You're just talking semantics now"; this usually means the other person is expressing the same meaning but using different words.
"Semantically correct usage of elements means that you use them for what they are meant to be used for. It means that you use tables for tabular data but not for layout, it means that you use lists for listing things, strong and em for giving text an emphasis, and the like."
From: http://www.codingforums.com/archive/index.php/t-53165.html
HTML elements have meaning. "Semantically correct" means that your elements mean what they are supposed to.
For instance, your definition lists are represented by <dl> elements in code, your abbreviations are <abbr>s, etc.
It means that HTML elements are used in the right context (not like tables are used for design purposes), CSS classes are named in a human-understandable way and the document itself has a structure that can be processed by non-browser clients like screen-readers, automatic parsers trying to extract the information and its structure from the document etc.
For example, you use lists to build menus. This way a screen reader for disabled people will know these list items are parts of the same menu level, so it will read them in sequence for the person to make a choice.
I've never heard it in a purely CSS context, but when talking about CSS and HTML, it means using the proper tags (for example, avoiding the use of the table tag for non-tabular data), providing proper values for the class and id that identify what the contained data is (and using microformats as appropriate), and so on.
It's all about making sure that your data can be understood by humans (everything is displayed properly) and computers (everything is properly identified and marked up).
I have a query. Our application has lots of HTML tags. During development, many tags were not given an id because there was no requirement for one. Now the QA team wants to automate the test cases using QTP. In most cases the tool doesn't recognize the elements because it does not find ids for most of the HTML tags. Now we are being asked to add ids to all the HTML tags.
I want to know whether adding the id attribute to these tags will have any effect. Even positive impacts are welcome.
I do not think there will be any effect, positive or negative: maybe the size of the HTML page will increase a bit, but probably not by much.
Still, are you sure you need to put "id" attributes on every HTML tag of your pages? Wouldn't only a few of those be enough? Like on form fields, on links, on error messages, and that's probably about it?
One thing you must take care of, though, is that an "id", as in "identifier", must be unique. This implies it might be good, before starting to add them, to define some kind of "id policy" that says, for instance, "ids for elements of that kind should be named that way".
And, for your next projects: have developers add those while they're developing ;-)
(And following the policy, of course)
Now that I'm thinking about it: a positive effect might be that it'll be easier to write JavaScript code interacting with your HTML document. But that will only be true for future projects or later evolutions of this one, when those ids are already present in the HTML at the time developers put the JS code in place...
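As a rough idea of what such an "id policy" could look like in practice, here is a sketch. Everything in it is hypothetical: the "<kind>-<name>" naming scheme and the `makeId` and `createIdRegistry` helpers are made up for illustration, and your team may well prefer a different convention:

```javascript
// Hypothetical policy: "<kind>-<name>", lowercase, words separated by hyphens.
function makeId(kind, name) {
  const slug = (text) =>
    text
      .trim()
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-") // collapse anything else into one hyphen
      .replace(/^-+|-+$/g, ""); // drop leading and trailing hyphens
  return `${slug(kind)}-${slug(name)}`;
}

// Tracks ids handed out so far and refuses to issue the same one twice.
function createIdRegistry() {
  const used = new Set();
  return function register(kind, name) {
    const id = makeId(kind, name);
    if (used.has(id)) throw new Error(`Duplicate id: ${id}`);
    used.add(id);
    return id;
  };
}

// Usage:
// const register = createIdRegistry();
// register("Field", "User Name"); // "field-user-name"
// register("Field", "User Name"); // throws: Duplicate id: field-user-name
```

The point is less the exact scheme than having one rule everybody follows, plus something that complains loudly when uniqueness is violated.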
Since there are no QTP related answers yet.
GUI recognition in QTP is object-oriented. To identify an object, QTP needs a unique combination of the object's properties, and checking them should be as fast as possible, which is why an HTML ID would be ideal.
This is especially critical for objects that do not have other unique identifiers. The most typical example is HTML tables: their contents are dynamic, and their number on the page may vary. By adding an HTML ID you allow the recognition mechanism to get straight to the right table.
Objects with other unique properties can be recognized well without an HTML ID. For example, if you have a single "submit" link on the page, QTP will successfully recognize it by its inner text.
So the context-specific answer: don't start adding ids to every single tag. Ask the automation guys to prepare a list of objects they have problems with, and add ids to those objects.
PS. It also depends on automation programming skills. There are descriptive programming and dynamic recognition methods. They allow retrieving the right objects even when no ids are provided.
As Albert said, QTP doesn't rely solely on elements' ids. In fact, because many web applications generate different ids for each session, (as far as I remember) the id property isn't part of the default description for most web test objects.
QTP is pretty good at recognizing most simple web controls, and if you're facing problems, a Web Extensibility project may help you bridge the gap between the semantics of your web application and the raw HTML it is built from. If a complex control is recognized by QTP as a WebElement (which is actually the div that contains the span that drives the code), you will understandably have object recognition problems, since there are many divs on the page but probably far fewer complex controls.
If you are talking about side-effects - NO. Adding ids won't cause any problems (apart from taking up some extra bytes of course)
If you really have the need to add ids, go ahead and add them.
http://www.w3.org/TR/html4/struct/links.html#anchors-with-id says: The id and name attributes share the same name space. This means that they cannot both define an anchor with the same name in the same document. It is permissible to use both attributes to specify an element's unique identifier for the following elements: A, APPLET, FORM, FRAME, IFRAME, IMG, and MAP. When both attributes are used on a single element, their values must be identical.