What does a reference inside a div tag without an attribute do? - html

I would like to start using Attribute Selectors in my css. I am seeing div tags that contain a reference WITHOUT any attribute statement like:
<div class="container" data-footer>
All my searches (for the last hour) to find out how "data-footer" can be listed without the use of an attribute= (e.g., id= or class= or etc.) have resulted in no information. Dozens of SO and Google links without a single example of a reference inside a div tag without the use of an attribute. Because I do not know what this is (or what to call it) I'm probably not searching with the right keywords. Is this a short-form way to pass an id or ???

Data- attributes without a value can be used as Boolean. For example:
if(div.hasAttribute('data-footer')) {
// then do something
}
In css you can access it like:
div[data-footer] {
}

Related

How to select <div class="ok">.....<a href="soft://an.id/">...</div> nodes?

A document has several <div class="ok"> tags. I am able to select all of them with
"//*[#class="ok"]" (i don't have to specify div, because only div tags have this class). I get a list of 6 nodes matching this.
Now, i need
either to test each node in order to see if it includes the tag <a href="soft://an.id/">. This inclusion is not direct. I mean, the <div> includes a <table> with many <tr> and <td> and <span>, and the <a..> (only one, or none) somewhere before </div>.
or to directly select only (div) nodes of class="ok" that include this <a> tag.
I have tried many things, that all fail. Including protecting the "/" in the href detection (is it required?).
I am quite familiar with regular expressions, but i must confess that i find XPath syntax even harder to understand.. And the W3C reference documents are so hard, without examples..
Any hints are welcome.
In order to select only <div class="ok"> element containing <a href="soft://an.id/"> child element you can use the following XPath locator:
"//div[#class='ok' and .//a[#href='soft://an.id/']]"
If I understand you correctly, you have a nested somewhere under the div with class "ok", right?
So in xpath, the a / is meant for a direct locator under/above the current tag. If you are looking for the somewhere under the found div, you need to use:
//div[#class="ok"]//a[#href="soft://an.id/"]
Then you need to check if it exists or not by using some kind of an assertion.

Partial HTML Selection Using Jsoup

So I was wondering if there is a way to find the element that belongs to a specific String that you know exists on a HTML page as part of an attribute. The example is I know that "Apr-16-2015" is somewhere in an attribute on the HTML page. If I go look for it, it's part of the attribute title:
<a title="Apr-16-2015 5:04 AM"
However, I do not have the information about the exact time, i.e. the "5:04 AM". I was wondering if there is a way to partially search an attribute in order for it to return the full element.
This is my code:
org.jsoup.nodes.Element links = lastPage.select("[title=\"Apr-16-2015\"]").first();
Again, it doesn't work because I did not enter the full attribute title, as given above. My question: "Is there any way to make this selector work by not entering the full information, as I will be unable to have the latter part of the attribute to my disposition?"
You can use it in the following way:
lastPage.select("[title^=\"Apr-16-2015\"]").first();
As described on JSoup Documentation:
[attr^=value], [attr$=value], [attr*=value]: elements with attributes
that start with, end with, or contain the value, e.g. [href*=/path/]
References:
http://jsoup.org/cookbook/extracting-data/selector-syntax

Parsing awful HTML: How do I recognize boundaries with xpath?

This is almost going to sound like a joke, but I promise you this is real life. There is a site on the internet, one which you have all used, that does not believe in css classes. Everything is defined directly in the style tag on an element. It's horrifying.
My problem though is that it also makes the html extraordinarily difficult to parse. The structure that I've got to go on looks something like this:
<td>
<a name="<random_string>"></a>
<div style="generic-style, used by other elements">
<div style="similarly generic style">{some_stuff}</div>
</div>
<a name="<random_string>"></a>
...
</td>
Basically, I've got these a tags that are forming the boundaries of the reviews, whos only defining information is the random string that is their name. I don't actually care about the anchor tags, but I would like to grab the reviews between them using xpath.
I've looked into sibling queries, but they don't seem to be well suited for alternating boundaries. I also looked into the Kayessian method of xpath queries, which (aside from having an awesome name) only seems well suited to grab a particular div, rather than all divs between the anchor tags.
Any thoughts on how I could grab the divs here?
If //td/div[../a[#name]] works for you, then the following should also work :
//td[a/#name]/div
This way you don't need to go back and forth -or rather down and up-. For a more specific selector, you may want to try the following :
//td/div[preceding-sibling::*[1][self::a/#name]][following-sibling::*[1][self::a/#name]]
The XPath selects div element having all the following properties :
td/div : is child of <td> element
[preceding-sibling::*[1][self::a/#name]] : preceded directly by <a> element having attribute name
[following-sibling::*[1][self::a/#name]] : followed directly by <a> element having attribute name
I figured it out! It turns out that xpath will allow for relative attribute assertions. I am not sure if this behavior is desired, but it happens to work in this case! Here's the xpath:
//td/div[../a[#name]]
Nice and clean, the ../a[#name] basically just says:
Go up a level, and make sure on that level of the hierarchy there's an a element with a name attribute

How to identify a DIV without using an ID or a class name

So I just need a reminder,
Every element can only have 1 id, but multiple classes are possible,
But what if I would like to have 2 ways to uniquely identify an object without classes?
I cant remember the name of it, something like Tagname, that can be used in addition to ID.
And how would you identify this object in jQuery?
For classes it is: $('.class'), for ID's it is $('#id') but what about this thing that I am vaguely describing?
Taylor
I cant remember the name of it,
something like Tagname, that can be
used in addition to ID.
You might be thinking of GetElementsByTagName() but that will return a collection:
https://developer.mozilla.org/en/DOM/element.getElementsByTagName.
You can use arbitrary attributes in any modern browser (jQuery does this behind the scenes). So you can put whatever attribute you want on an element and locate it using a jQuery attribute selector (as #Dave pointed out in his answer).
<div myAttribute="foo"></div>
<script>
var element = $("div[myAttribute='foo']"); // matches all divs with "myAttribute" set
</script>
There are many options: http://api.jquery.com/category/selectors/
Just use the "name" attribute:
<div id="something" name="something"></div>
Then reference it by the id like usual, or by the name like this:
$('[name="something"]')
Info on on this selector here.

What do square brackets in class names mean?

I saw here square brackets that are used in class names:
<input class="validate[required,custom[onlyLetter],length[0,100]]" name="firstname" type="text" />
What do these square brackets mean?
The square brackets are used as an attribute selector, to select all elements that have a certain attribute value. In other words, they detect attribute presence.
Example 1:
img[alt="picName"] {width:100px;}
would affect only
<img src="picName.png" alt="picName" />
in your code, and won't affect
<img src="picName.png" alt="picName2" />
Example 2:
The following affects all elements with title attribute specified:
[title] {border:1px dotted #333;}
Example 3:
This CSS
p[class~="fancy"]
will affect the following html
<p class="fancy">Hello</p>
<p class="very fancy">Hola</p>
<p class="fancy maybe">Aloha</p>
but won't affect this:
<p class="fancy-fancy">Privet</p>
Example 4:
[lang|="en"]
will affect elements with lang attribute, which is hyphen-separated list of words beginning with “en”, like
<div lang="en">Tere</div>
<div lang="en-gb">GutenTag</div>
Examples 5, 6, 7:(CSS3)
The following attribute selector affects link elements whose href attribute value starts with the string “http:”.
a[href^="http:"]
The following attribute selector affects image elements whose src attribute values ends with the string “.png”.
img[src$=".png"]
The following attribute selector affects any input element whose name attribute value contains the string “choice”.
input[name*="choice"]
That is most likely used by some sort of validator or validation library. The class here means that validate this field denoted by validate keyword and then:
required it is required field
custom validation type; allow only letters
length should be between 0 to 100 chars
Well, this information is used by the jQuery validation library you posted the link to :)
Apart from the use-case / example given by the OP for brackets in class names, there is also another use case which Harry Roberts proposed (and later stopped proposing) in his blog a while back: grouping related classes in your markup where the square brackets could be used to group
two or more related class attributes to make them easier to notice
when scanning an HTML file
...
and that looks something like this:
<div class="[ foo foo--bar ] baz">
where:
There must be more than one ‘set’ of classes.
One ‘set’ must contain more than one class.
He also noted that adding the brackets is completely valid according to the html5 spec
There are no […] restrictions on the tokens authors can use in the
class attribute…
Just to reiterate:
The brackets in the class attributes - while being valid CSS class names are not actually meant to be used in the CSS1 - but rather to help readability in the HTML.
Notes:
1
Although technically, they can be used when escaped,
.\[ {
color: red;
}
<div class="[">Hi there</div>
Nothing. Brackets are a legal character for class names with no special meaning whatsoever.
In standard HTML, they have no particular meaning. It's just more text.
To the jQuery Validation plugin, they do.
Example:
[what-ever] {
color: red;
}
<p what-ever>Hello</p>
This will color Hello red. You can use square-bracket as class names
There is no particular rule within a class name. In your example they are almost certainly being used by a JavaScript validation framework. This is because in HTML you can't simply add your own attributes to tags, therefore the validation framework is exploiting the fact that CSS class names can contain such characters in order to 'store' the validation rules within the class name. There won't actually be any corresponding class in the style-sheet - this is just a trick to work around the limitations of HTML.