How to comment out classes inside class=""? Is this even possible? - html

Is it possible to comment out specific classes inside the HTML (<div class="" ...>)?
For example:
<div data-group="group1" data-container="true" class="container-lg d-flex /* slider */ p-0">
Where in this example the class "slider" should be (temporarily) excluded from the class list.
[UPDATE]
Based on the comments I understand the way of thinking, so I go for the solution Lee Taylor mentioned. When I want to disable a specific class assignment, I just add a prefix to that class. For example:
<div class="slider container"...
becomes:
<div class="disable-slider container"...
Life can be so easy when the mind isn't overthinking things :-D
Thank you all for thinking with me!
[/UPDATE]
This would make life a lot easier, in my opinion, in these ways:
You don't have to switch to your style sheet, search for the matching class, comment it out, and switch back to the code.
It's clear for everyone that you just exclude the complete function, which - if named clearly - gives other developers a better overview.
For testing purposes you could use this as (style) modules, which are enabled/disabled in a snap! Again, no more hopping between screens/tabs/windows.
Easier debugging. Just comment out some classes and you've got the source of the problem in no time.
It stimulates developers to use recognizable and clearly named classes.
Currently I copy the whole element/row, comment the original out with a note, and paste the copied row below it. Then I remove classes from the pasted line.
Most of the time the commented-out copy doesn't get updated, so you can't rely on it as a backup while debugging.
I know this would be possible with JS, but why? (Also, changing the HTML structure with JS causes a lot of headaches when it comes to layout shifts, and not everyone can make use of server-side scripting.) Such a feature should exist in HTML, in my opinion.
Am I the only one who has this in mind?

Per the HTML Specification on the class attribute:
When specified on HTML elements, the class attribute must have a value that is a set of space-separated tokens representing the various classes that the element belongs to.
Here is the definition for space-separated tokens:
A set of space-separated tokens is a string containing zero or more words (known as tokens) separated by one or more ASCII whitespace, where words consist of any string of one or more characters, none of which are ASCII whitespace.
A string containing a set of space-separated tokens may have leading or trailing ASCII whitespace.
Therefore, no, you should technically not be allowed to comment out class list members in any way. If any implementation of the specification does allow this, then it is undefined behavior and should not be depended upon.
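To make the token rule concrete, here is a minimal sketch that splits a class value the way the quoted definition describes. The helper name classTokens is made up for this illustration and is not a browser API.

```javascript
// Split a class attribute value into tokens on ASCII whitespace,
// following the "set of space-separated tokens" definition quoted above.
// classTokens is a hypothetical helper, not part of any browser API.
function classTokens(value) {
  // ASCII whitespace: tab, line feed, form feed, carriage return, space.
  return value.split(/[\t\n\f\r ]+/).filter((token) => token.length > 0);
}

// The "comment" markers simply become two extra class names,
// and "slider" is still in the list.
console.log(classTokens("container-lg d-flex /* slider */ p-0"));

// The prefix workaround from the question's update: "slider" is gone
// because "disable-slider" is a different token.
console.log(classTokens("disable-slider container").includes("slider"));
```

So the /* ... */ attempt leaves the slider class applied, while the renamed token no longer matches any .slider selector.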

Related

How to edit this html lexer rule?

I want to edit this HTML lexer rule and I need help with the regular expression.
TAG_NAME refers to any HTML attribute, for example required, class, id, etc.
I want to edit it so that it does not accept this exact syntax: 'az-'.
I think this needs a regular expression modification. I looked it up, but I couldn't integrate what I found online with the way these rules are written.
As a first try I removed the '-' from Tag_NameChar, but that made the HTML lexer stop recognizing attributes like 'data-target'.
This snippet is for the rule:
and this one shows how the attributes are recognized.
ANTLR does not support lookahead syntax like some regex engines do, so there's no easy way to exclude certain matches from within the regex. It's always possible to rewrite a regular expression to exclude a given string (regular languages are closed under negation and intersection), but it usually ends up quite painful. In your case, you'd end up with something following the logic of "a tag name can either have fewer than 3 characters, more than 3 characters, or it could have three characters where the first isn't an 'a', the second isn't a 'z' or the last isn't a '-'".
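As an illustration of that lookahead-free rewrite, here is a sketch as a JavaScript regex rather than ANTLR syntax. The "." stands in for whatever characters the real TAG_NAME rule allows, which is not shown in the question, and the names notExactlyAzDash and isAllowedTagName are made up for the example.

```javascript
// Matches every string except exactly "az-", without using lookahead.
// Assumption: "." is a placeholder for the real tag-name character class.
// Alternatives, in order: fewer than 3 chars, more than 3 chars,
// or exactly 3 chars that differ from "az-" in at least one position.
const notExactlyAzDash = /^(?:.{0,2}|.{4,}|[^a]..|.[^z].|..[^-])$/;

function isAllowedTagName(name) {
  return notExactlyAzDash.test(name);
}

for (const name of ["az-", "az", "az-x", "bz-", "data-target"]) {
  console.log(name, isAllowedTagName(name));
}
```

Only "az-" is rejected here. A real rule would also have to require at least one character and restrict the alphabet, which is part of why this approach gets painful.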
The less painful, but also less cross-language solution is to use a predicate that returns false if the text of the tag name equals az-. So something like {!getText().equals("az-")}? depending on the language.
If you're okay with introducing an additional lexer rule, you may also introduce a rule INVALID_TAG_NAME (or whatever you want to call it) that matches exactly az- and that's defined before TAG_NAME. That way any tag that's named exactly az- will produce an INVALID_TAG_NAME token instead of a TAG_NAME token.
Depending on your requirements, you could also leave the grammar unchanged altogether and simply produce an error when you see a tag named az- when you traverse the tree in a listener or visitor.

Style guide for documentation in HTML urges to use spaces in <code>...</code>

In the style guide for the bulky HTML documentation of an existing system, which I have to maintain for a client, I found that text in a code tag should be enclosed in spaces, like:
..., the element<code> STATE </code>matches datatype ...
In most cases the whole text is enclosed in <p> tags:
<p>..., the element<code> STATE </code>matches datatype ...</p>
Does anyone have an idea why I should write <code> STATE </code> with no space before and after the tags?
One explanation could be that rendering the HTML leads to "better" (i.e. equal or wider) constant spacing between normal text and the code (the space inside the code tag seems to be "bigger"). Is that approach meaningful? Or are there arguments against this rule that I could use to convince the program director to drop it?
This sounds like a way of enforcing a style without, for whatever reason, using CSS.
There's no reason to do this other than to conform to somebody's preference (your boss or a client, presumably, in this case).
To back this up, the HTML specification itself uses examples of <code> elements wrapped within <p> elements which do not follow this format:
Example 104
The following example shows how the element can be used in a paragraph to mark up element names and computer code, including punctuation.
<p>The <code>code</code> element represents a fragment of computer code.</p>
— Example 104 within the HTML5.1 specification

RegEx to substitute tag names, leaving the content and attributes intact

I would like to replace the opening and closing tags, leaving the content of the tags and their attributes intact.
Here is what I have:
<div class="QText">Text to be kept</div>
to be replaced with
<span class="QText">Text to be kept</span>
I tried this expression which finds all expressions I want but there seems to be no way to replace found expressions.
<div class="QText">(.*?)</div>
Thanks in advance.
I think @AmitJoki's answer will work well enough in certain circumstances, but if you only want to replace div elements when they have an attribute or a specific set of attributes, then you would want to use a regex replacement with backreferences. How you specify and refer to a backreference, unfortunately, depends upon your chosen editor. Visual Studio has the most unusual and annoying "flavor" of regex I know of, while Dreamweaver has a fairly typical implementation. Both, as well as (I imagine) whatever editor you're using, do regex replacement; you just have to know the menu item or keystroke to bring up the dialog.
If memory serves, Dreamweaver has replacement options when you hit Ctrl+F, while in Visual Studio you have to hit Ctrl+H, so try those.
Once you get a "Find" and "Replace" box, you would put something like what you have in your last example above: <div class="QText">(.*?)</div> or perhaps <div class="(QText|RText|SText)">(.*?)</div> into your "Find" box, then put something like <span class="QText">\1</span> or <span class="\1">\2</span> in the "Replacement" box. A few utilities might use $1 to refer to a backreference rather than \1, but you'll have to look up the help or experiment to be sure.
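For readers running this from code rather than an editor, here is a rough JavaScript sketch of the same two-group replacement. JavaScript is one of the flavors that writes backreferences in the replacement as $1 and $2; the function name divToSpan is made up for the example.

```javascript
// Find pattern from the answer above, with the slash escaped for a JS literal.
const qTextDiv = /<div class="(QText|RText|SText)">(.*?)<\/div>/g;

// divToSpan is a hypothetical helper name.
function divToSpan(html) {
  // $1 is the captured class name, $2 the captured element content.
  return html.replace(qTextDiv, '<span class="$1">$2</span>');
}

console.log(divToSpan('<div class="QText">Text to be kept</div>'));
// A div with a different class is left untouched.
console.log(divToSpan('<div class="other">Left alone</div>'));
```

Like the editor version, this assumes the div does not contain nested divs and that its content sits on one line, since "." does not match newlines by default.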
If you are using a language to run this expression, you need to tell us which language.
If you are using a specific editor to run this expression, you need to tell us which editor.
...and never forget the prevailing wisdom on regex and HTML
Just replace div.
var s="<div class='QText'>Text to be kept</div>";
alert(s.replace(/div/g,"span"));
Demo: http://jsfiddle.net/9sgvP/
Mark it as answer if it helps ;)
Posted as requested
If it's going to be literal like that, capture what's to be kept, then replace the rest:
Find: <div( class="QText">.*?</)div>
Replace: <span$1span>
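A small sketch comparing the two suggestions above, using a sample string (made up for the illustration) whose text happens to contain the word "div":

```javascript
const sample = '<div class="QText">One div per question</div>';

// "Just replace div": every occurrence changes, including the one in the text.
const naive = sample.replace(/div/g, "span");

// Capture-and-keep: only the two tag names change; the captured middle
// part (attribute, content and "</") is put back untouched via $1.
const targeted = sample.replace(/<div( class="QText">.*?<\/)div>/g, "<span$1span>");

console.log(naive);
console.log(targeted);
```

The naive version turns the content into "One span per question", which is why it only works when "div" appears nowhere except in the tag names.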

Can an HTML element have multiple ids?

I understand that an id must be unique within an HTML/XHTML page.
For a given element, can I assign multiple ids to it?
<div id="nested_element_123 task_123"></div>
I realize I have an easy solution with simply using a class. I'm just curious about using ids in this manner.
No. From the XHTML 1.0 Spec:
In XML, fragment identifiers are of type ID, and there can only be a single attribute of type ID per element. Therefore, in XHTML 1.0 the id attribute is defined to be of type ID. In order to ensure that XHTML 1.0 documents are well-structured XML documents, XHTML 1.0 documents MUST use the id attribute when defining fragment identifiers on the elements listed above. See the HTML Compatibility Guidelines for information on ensuring such anchors are backward compatible when serving XHTML documents as media type text/html.
Contrary to what everyone else said, the correct answer is YES
The Selectors spec is very clear about this:
If an element has multiple ID attributes, all of them must be treated as IDs for that element for the purposes of the ID selector. Such a situation could be reached using mixtures of xml:id, DOM3 Core, XML DTDs, and namespace-specific knowledge.
Edit
Just to clarify: Yes, an XHTML element can have multiple ids, e.g.
<p id="foo" xml:id="bar">
but assigning multiple ids to the same id attribute using a space-separated list is not possible.
No. While the definition from W3C for HTML 4 doesn't seem to explicitly cover your question, the definition of the name and id attribute says no spaces in the identifier:
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be followed by any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons (":"), and periods (".").
My understanding has always been:
IDs are single use and are only applied to one element...
Each is attributed as a unique identifier to (only) one single element.
Classes can be used more than once...
They can therefore be applied to more than one element, and similarly yet different, there can be more than one class (i.e., multiple classes) per element.
No. Every DOM element, if it has an id, has a single, unique id. You could approximate it using something like:
<div id='enclosing_id_123'><span id='enclosed_id_123'></span></div>
and then use navigation to get what you really want.
If you are just looking to apply styles, class names are better.
You can only have one ID per element, but you can indeed have more than one class. But don't have multiple class attributes; put multiple class values into one attribute.
<div id="foo" class="bar baz bax">
is perfectly legal.
No, you cannot have multiple ids for a single tag, but I have seen a tag with a name attribute and an id attribute which are treated the same by some applications.
No, you should use nested DIVs if you want to head down that path. Besides, even if you could, imagine the confusion it would cause when you run document.getElementById(). Which ID is it going to grab if there are multiple ones?
On a slightly related topic, you can add multiple classes to a DIV. See Eric Meyer's discussion at:
http://meyerweb.com/eric/articles/webrev/199802a.html
I'd like to say technically yes, since really what gets rendered is technically always browser-dependent. Most browsers try to keep to the specifications as best they can and as far as I know there is nothing in the CSS specifications against it. I'm only going to vouch for the actual HTML, CSS, and JavaScript code that gets sent to the browser before any other interpreter steps in.
However, I also say no since every browser I typically test on doesn't actually let you.
If you need to see for yourself, save the following as a .html file and open it in the major browsers. In all browsers I tested, the JavaScript call will not match an element. However, remove "hunkojunk1 " from the id attribute, so that the id is exactly "hunkojunk2", and all works fine.
Sample Code
<html>
<head>
</head>
<body>
<p id="hunkojunk1 hunkojunk2"></p>
<script type="text/javascript">
document.getElementById('hunkojunk2').innerHTML = "JUNK JUNK JUNK JUNK JUNK JUNK";
</script>
</body>
</html>
Nay.
From 3.2.3.1 The id attribute:
The value must not contain any space characters.
id="a b" <-- find the space character in that VaLuE.
That said, you can style multiple IDs. But if you're following the specification, the answer is no.
From 7.5.2 Element identifiers: the id and class attributes:
The id attribute assigns a unique identifier to an element (which may
be verified by an SGML parser).
and
ID and NAME tokens must begin with a letter ([A-Za-z]) and may be
followed by any number of letters, digits ([0-9]), hyphens ("-"),
underscores ("_"), colons (":"), and periods (".").
So "id" must be unique and can't contain a space.
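The quoted HTML 4 token rule translates fairly directly into a regular expression. This is only a sketch of that one rule, and the names html4IdToken and isValidHtml4Id are made up for the example.

```javascript
// HTML 4 rule quoted above: a letter, followed by any number of
// letters, digits, hyphens, underscores, colons and periods.
const html4IdToken = /^[A-Za-z][A-Za-z0-9\-_:.]*$/;

// isValidHtml4Id is a hypothetical helper name.
function isValidHtml4Id(value) {
  return html4IdToken.test(value);
}

console.log(isValidHtml4Id("nested_element_123"));
// The space makes the whole value an invalid single id, not two ids.
console.log(isValidHtml4Id("nested_element_123 task_123"));
```

The rule quoted from 3.2.3.1 is looser (it only rules out space characters), but the value from the question fails under either reading.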
No.
Having said that, there's nothing to stop you doing it. But you'll get inconsistent behaviour with the various browsers. Don't do it. One ID per element.
If you want multiple assignations to an element use classes (separated by a space).
Any ID assigned to a div element is unique.
However, you can assign multiple IDs "under", and not "to" a div element.
In that case, you have to represent those IDs as <span></span> IDs.
Say, you want two links in the same HTML page to point to the same div element in the page.
The two different links
<p><a href="#exponentialEquationsCalculator">Exponential Equations</a></p>
<p><a href="#logarithmicEquationsCalculator">Logarithmic Expressions</a></p>
Point to the same section of the page
<!-- Exponential / Logarithmic Equations Calculator -->
<div class="w3-container w3-card white w3-margin-bottom">
<span id="exponentialEquationsCalculator"></span>
<span id="logarithmicEquationsCalculator"></span>
</div>
The simple answer is no, as others have said before me. An element can't have more than one ID and an ID can't be used more than once in a page. Try it out and you'll see how well it doesn't work.
In response to tvanfosson's answer regarding the use of the same ID in two different elements. As far as I'm aware, an ID can only be used once in a page regardless of whether it's attached to a different tag.
By definition, an element needing an ID should be unique, but if you need two ID's then it's not really unique and needs a class instead.
That's interesting, but as far as I know the answer is a firm no. I don't see why you need a nested ID, since you'll usually cross it with another element that has the same nested ID. If you don't there's no point, if you do there's still very little point.
Classes are specially made for this, and
here is the code from which you can understand it:
<html>
<head>
<style type="text/css">
.personal{
height:100px;
width: 100px;
}
.fam{
border: 2px solid #ccc;
}
.x{
background-color:#ccc;
}
</style>
</head>
<body>
<div class="personal fam x"></div>
</body>
</html>
ID's should be unique, so you should only use a particular ID once on a page. Classes may be used repeatedly.
Check HTML id Attribute (W3Schools) for more details.
I don't think you can have two IDs, but it should be possible. Using the same id twice is a different case... like two people using the same passport. However, one person could have multiple passports... I came looking for this since I have a situation where a single employee can have several functions, say "sysadm" and "team coordinator". Having id="sysadm teamcoordinator" would let me reference them from other pages, so that employees.html#sysadm and employees.html#teamcoordinator would lead to the same place... One day somebody else might take over the team coordinator function while the sysadm remains the sysadm... then I only have to change the ids on the employees.html page ... but like I said, it doesn't work :(

Variable order regex syntax

Is there a way to indicate that two or more regex phrases can occur in any order? For instance, XML attributes can be written in any order. Say that I have the following XML:
<a href="home.php" class="link" title="Home">Home</a>
<a href="home.php" title="Home" class="link">Home</a>
How would I write a match that checks the class and title and works for both cases? I'm mainly looking for the syntax that allows me to check in any order, not just matching the class and title as I can do that. Is there any way besides just including both combinations and connecting them with a '|'?
Edit: My preference would be to do it in a single regex as I'm building it programmatically and also unit testing it.
No, I believe the best way to do it with a single RE is exactly as you describe. Unfortunately, it'll get very messy when your XML can have 5 different attributes, giving you a large number of different REs to check.
On the other hand, I wouldn't be doing this with an RE at all since they're not meant to be programming languages. What's wrong with the old fashioned approach of using an XML processing library?
If you're required to use an RE, this answer probably won't help much, but I believe in using the right tools for the job.
Have you considered xpath? (where attribute order doesn't matter)
//a[@class and @title]
Will select both <a> nodes as valid matches. The only caveat being that the input must be xhtml (well formed xml).
You can create a lookahead for each of the attributes and plug them into a regex for the whole tag. For example, the regex for the tag could be
<a\b[^<>]*>
If you're using this on XML you'll probably need something more elaborate. By itself, this base regex will match a tag with zero or more attributes. Then you add a lookahead for each of the attributes you want to match:
(?=[^<>]*\s+class="link")
(?=[^<>]*\s+title="Home")
The [^<>]* lets it scan ahead for the attribute, but won't let it look beyond the closing angle bracket. Matching the leading whitespace here in the lookahead serves two purposes: it's more flexible than matching it in the base regex, and it ensures that we're matching a whole attribute name. Combining them we get:
<a\b(?=[^<>]*\s+class="link")(?=[^<>]*\s+title="Home")[^<>]+>[^<>]+</a>
Of course, I've made some simplifying assumptions for the sake of clarity. I didn't allow for whitespace around the equals signs, for single-quotes or no quotes around the attribute values, or for angle brackets in the attribute values (which I hear is legal, but I've never seen it done). Plugging those leaks (if you need to) will make the regex uglier, but won't require changes to the basic structure.
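A quick way to try the combined lookahead regex, sketched in JavaScript. The sample tags are assumptions based on the attribute values used in the regexes in this thread, and the name anyOrder is made up for the example.

```javascript
// The combined regex from above, with the closing slash escaped
// so it can be written as a JavaScript regex literal.
const anyOrder =
  /<a\b(?=[^<>]*\s+class="link")(?=[^<>]*\s+title="Home")[^<>]+>[^<>]+<\/a>/;

const samples = [
  '<a href="home.php" class="link" title="Home">Home</a>',      // both, class first
  '<a href="home.php" title="Home" class="link">Home</a>',      // both, title first
  '<a href="home.php" class="link">Home</a>',                   // title missing
  '<a href="home.php" data-class="link" title="Home">Home</a>', // not a real class attribute
];

for (const tag of samples) {
  console.log(anyOrder.test(tag), tag);
}
```

The last sample shows what the leading \s+ buys: data-class="link" is not mistaken for class="link", because no whitespace directly precedes the word "class" there.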
You could use named groups to pull the attributes out of the tag. Run the regex and then loop over the groups doing whatever tests that you need.
Something like this (untested, using .net regex syntax with the \w for word characters and \s for whitespace):
<a ((?<key>\w+)\s?=\s?['"](?<value>\w+)['"])+ />
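The same loop-over-named-groups idea can be sketched in JavaScript, which also supports named groups. This is an adaptation and not the .NET pattern above: the attribute pattern is a simplified assumption (quoted values only), and parseAttributes and hasLinkAndTitle are made-up names.

```javascript
// Collect the attributes of one opening tag into a Map, then test them
// independently of the order in which they were written.
function parseAttributes(tag) {
  // key = attribute name, group 2 = opening quote, value = text up to the
  // same quote again (\2). Unquoted values are not handled in this sketch.
  const attrPattern = /(?<key>[\w-]+)\s*=\s*(["'])(?<value>.*?)\2/g;
  const attrs = new Map();
  for (const match of tag.matchAll(attrPattern)) {
    attrs.set(match.groups.key, match.groups.value);
  }
  return attrs;
}

function hasLinkAndTitle(tag) {
  const attrs = parseAttributes(tag);
  return attrs.get("class") === "link" && attrs.get("title") === "Home";
}

console.log(hasLinkAndTitle('<a href="home.php" class="link" title="Home">'));
console.log(hasLinkAndTitle("<a href='home.php' title='Home' class='link'>"));
console.log(hasLinkAndTitle('<a href="home.php" class="link">'));
```

The checks on the Map are ordinary code, so adding a third or fourth required attribute does not make the regex any longer.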
The easiest way would be to write a regex that picks up the <a .... > part, and then write two more regexes to pull out the class and the title. Although you could probably do it with a single regex, it would be very complicated, and probably a lot more error prone.
With a single regex you would need something like
<a[^>]*((class="([^"]*)")|(title="([^"]*)"))?((title="([^"]*)")|(class="([^"]*)"))?[^>]*>
Which is just a first guess without checking to see if it's even valid. Much easier to just divide and conquer the problem.
A first ad hoc solution might be to do the following.
((class|title)="[^"]*?" *)+
This is far from perfect because it allows every attribute to occur more than once. I could imagine that this might be solvable with assertions. But if you just want to extract the attributes this might already be sufficient.
If you want to match a permutation of a set of elements, you could use a combination of back references and zero-width negative lookahead.
Say you want to match any one of these six lines:
123-abc-456-def-789-ghi-0AB
123-abc-456-ghi-789-def-0AB
123-def-456-abc-789-ghi-0AB
123-def-456-ghi-789-abc-0AB
123-ghi-456-abc-789-def-0AB
123-ghi-456-def-789-abc-0AB
You can do this with the following regex:
/123-(abc|def|ghi)-456-(?!\1)(abc|def|ghi)-789-(?!\1|\2)(abc|def|ghi)-0AB/
The back references (\1, \2) let you refer to your previous matches, and the zero-width negative lookahead ((?!...)) lets you negate a positional match, saying: don't match if the contained pattern matches at this position. Combining the two makes sure that your match is a legitimate permutation of the given elements, with each possibility occurring only once.
So, for example, in Ruby:
input = <<LINES
123-abc-456-abc-789-abc-0AB
123-abc-456-abc-789-def-0AB
123-abc-456-abc-789-ghi-0AB
123-abc-456-def-789-abc-0AB
123-abc-456-def-789-def-0AB
123-abc-456-def-789-ghi-0AB
123-abc-456-ghi-789-abc-0AB
123-abc-456-ghi-789-def-0AB
123-abc-456-ghi-789-ghi-0AB
123-def-456-abc-789-abc-0AB
123-def-456-abc-789-def-0AB
123-def-456-abc-789-ghi-0AB
123-def-456-def-789-abc-0AB
123-def-456-def-789-def-0AB
123-def-456-def-789-ghi-0AB
123-def-456-ghi-789-abc-0AB
123-def-456-ghi-789-def-0AB
123-def-456-ghi-789-ghi-0AB
123-ghi-456-abc-789-abc-0AB
123-ghi-456-abc-789-def-0AB
123-ghi-456-abc-789-ghi-0AB
123-ghi-456-def-789-abc-0AB
123-ghi-456-def-789-def-0AB
123-ghi-456-def-789-ghi-0AB
123-ghi-456-ghi-789-abc-0AB
123-ghi-456-ghi-789-def-0AB
123-ghi-456-ghi-789-ghi-0AB
LINES
# String is not Enumerable in Ruby 1.9 and later, so grep over its lines
# outputs only the permutations
puts input.lines.grep(/123-(abc|def|ghi)-456-(?!\1)(abc|def|ghi)-789-(?!\1|\2)(abc|def|ghi)-0AB/)
For a permutation of five elements, it would be:
/1-(abc|def|ghi|jkl|mno)-
2-(?!\1)(abc|def|ghi|jkl|mno)-
3-(?!\1|\2)(abc|def|ghi|jkl|mno)-
4-(?!\1|\2|\3)(abc|def|ghi|jkl|mno)-
5-(?!\1|\2|\3|\4)(abc|def|ghi|jkl|mno)-6/x
For your example, the regex would be
/<a href="home.php" (class="link"|title="Home") (?!\1)(class="link"|title="Home")>Home<\/a>/
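The same backreference-plus-negative-lookahead pattern also works in JavaScript. A small sketch to check it against both attribute orders and against a repeated attribute (the dot in home.php is escaped here, which the regex above leaves unescaped, and the name twoAttributePermutation is made up for the example):

```javascript
// Two slots, each accepting either attribute; (?!\1) stops the second
// slot from repeating whatever the first slot matched.
const twoAttributePermutation =
  /<a href="home\.php" (class="link"|title="Home") (?!\1)(class="link"|title="Home")>Home<\/a>/;

const candidates = [
  '<a href="home.php" class="link" title="Home">Home</a>',
  '<a href="home.php" title="Home" class="link">Home</a>',
  '<a href="home.php" class="link" class="link">Home</a>',
  '<a href="home.php" title="Home" title="Home">Home</a>',
];

for (const tag of candidates) {
  console.log(twoAttributePermutation.test(tag), tag);
}
```

The first two candidates match and the two with a duplicated attribute do not. Unlike the lookahead approach, this version fixes the exact spacing and the position of the attributes within the tag.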