Extracting class element - extract

I tried "title = soup.find(class_=' _6YOLH _1JtW7 _2VF_A _2OMMP').get_text().strip()" to extract the title, but I keep getting an AttributeError saying that a 'NoneType' object has no attribute 'get_text'. The code is below. Any help is appreciated.
Classic Fit Solid Wool Suit

Maybe check if the attribute exists first then try to retrieve it.
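A minimal sketch of that check, for illustration only. It uses Python's standard-library ElementTree as a stand-in for BeautifulSoup (both return None from find when nothing matches) and assumes a small, well-formed snippet. The PAGE markup and the title_or_none helper are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for the real page.
PAGE = '<html><body><h1 itemprop="name">Classic Fit Solid Wool Suit</h1></body></html>'


def title_or_none(markup):
    """Return the stripped title text, or None when the element is missing."""
    root = ET.fromstring(markup)
    node = root.find(".//h1[@itemprop='name']")
    if node is None:
        # Nothing matched: return None instead of calling a method on None.
        return None
    return "".join(node.itertext()).strip()


print(title_or_none(PAGE))
print(title_or_none("<html><body></body></html>"))
```

If the guarded lookup returns None on the live page even though the element shows up in the browser inspector, the selector or the downloaded HTML is what needs a closer look.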

I've checked. It does exist. Here's the website's element, inside an h1 tag:
itemprop="name" overrideelementwith="div" class=" _6YOLH _1JtW7 _2VF_A _2OMMP">Classic Fit Solid Wool Suit

How can I get the element of a-tag in the div class with selenium?

I am working on a project where I have to get an element from a specific website.
I want to get the text shown in the markup below.
<div class="block-content">
<div class="block-heading">
<a href="https://www~~~~~~">
<i class="fa fa-map">
::before
</i>
"Text I want to get"
</a>
</div>
</div>
I have been trying to solve this for a while, but I could not find anything that works.
I would really appreciate any help.
Thank you.
According to the information you provided, the text you are looking for is inside an a element, so the XPath for this element is something like:
//a[contains(@href,'https://www')]
But since there is also an i element inside it, getting the text from the a element will give you both the text contained in a itself and the text inside i.
So you should get the text from i (which looks like just a space here) and remove it from the text you receive from a.
If you want to do this for all a elements that have an href attribute and an i element inside them, you can use the following XPath:
//a[@href and ./i]
If there are more specific criteria for the elements you are looking for, the XPath above should be updated accordingly.
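Here is a rough sketch of the idea of separating the link's own text from the icon's text. It uses Python's standard-library ElementTree rather than Selenium, so it only illustrates the logic on a static snippet. The href value is a placeholder (the real URL is not given in the question), and ElementTree's limited XPath needs the two conditions written as chained predicates.

```python
import xml.etree.ElementTree as ET

# Static stand-in for the page; the href is a placeholder, not the real URL.
SNIPPET = """
<div class="block-content">
  <div class="block-heading">
    <a href="https://www.example.com/">
      <i class="fa fa-map"></i>
      Text I want to get
    </a>
  </div>
</div>
"""

root = ET.fromstring(SNIPPET)

link_texts = []
# a elements that have an href attribute and an i child.
for a in root.findall(".//a[@href][i]"):
    icon = a.find("i")
    # Text that belongs to <a> itself: before the icon (a.text) and after it (icon.tail).
    own_text = (a.text or "") + (icon.tail or "")
    link_texts.append(own_text.strip())

print(link_texts)
```

With Selenium the same idea would mean reading the text of a and of i separately and removing the latter from the former.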
From your comment, I understood that you would like to extract that text. So here is the code for you which would extract the text you want.
Selenium::WebDriver::Wait
.new(timeout: 60)
.until { !driver.find_element(xpath: "//i[@class='fa fa-map-marker']/..").text.empty? }
p driver.find_element(xpath: "//i[@class='fa fa-map-marker']/..").text[/(?<=before \")\w+ \w+ \w+ \w+ \w+/]
output
"Text I want to get"
I couldn't get the elements that I wanted directly, so here's what I did.
I just post-processed the element's text with a few string methods.
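def seller_name
  # Collect every element with the "block-content" class and read the first one's text.
  shop_info_elements = @driver.find_elements(:class_name, "block-content")
  shop_info_text = shop_info_elements.first.text
  # The seller name is the first line of that text.
  shop_info_text_array = shop_info_text.lines
  seller_name = shop_info_text_array.first.chomp
  seller_name
end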
It is not beautiful, but it can work for any other pages on the same site.

Parsing awful HTML: How do I recognize boundaries with xpath?

This is almost going to sound like a joke, but I promise you this is real life. There is a site on the internet, one which you have all used, that does not believe in css classes. Everything is defined directly in the style tag on an element. It's horrifying.
My problem though is that it also makes the html extraordinarily difficult to parse. The structure that I've got to go on looks something like this:
<td>
<a name="<random_string>"></a>
<div style="generic-style, used by other elements">
<div style="similarly generic style">{some_stuff}</div>
</div>
<a name="<random_string>"></a>
...
</td>
Basically, I've got these a tags that form the boundaries of the reviews, whose only defining information is the random string that is their name. I don't actually care about the anchor tags, but I would like to grab the reviews between them using xpath.
I've looked into sibling queries, but they don't seem to be well suited for alternating boundaries. I also looked into the Kayessian method of xpath queries, which (aside from having an awesome name) only seems well suited to grab a particular div, rather than all divs between the anchor tags.
Any thoughts on how I could grab the divs here?
If //td/div[../a[@name]] works for you, then the following should also work:
//td[a/@name]/div
This way you don't need to go back and forth (or rather, down and up). For a more specific selector, you may want to try the following:
//td/div[preceding-sibling::*[1][self::a/@name]][following-sibling::*[1][self::a/@name]]
The XPath selects div elements that have all of the following properties:
td/div : is a child of a <td> element
[preceding-sibling::*[1][self::a/@name]] : preceded directly by an <a> element having a name attribute
[following-sibling::*[1][self::a/@name]] : followed directly by an <a> element having a name attribute
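To make the neighbor check concrete, here is a small sketch of the same logic in plain Python. The standard-library ElementTree does not support the preceding-sibling and following-sibling axes, so the check is done by hand over the children of td. The snippet, the anchor names, and the helper names are all invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up markup in the shape described in the question.
SNIPPET = """
<td>
  <a name="r1"></a>
  <div style="x"><div style="y">first review</div></div>
  <a name="r2"></a>
  <div style="x"><div style="y">second review</div></div>
  <a name="r3"></a>
  <div style="x">trailing block with no closing anchor</div>
</td>
"""


def is_named_anchor(el):
    """True for an <a> element that carries a name attribute."""
    return el is not None and el.tag == "a" and "name" in el.attrib


def divs_between_anchors(td):
    """Child divs whose immediate element neighbors are both named anchors."""
    children = list(td)
    found = []
    for idx, child in enumerate(children):
        if child.tag != "div":
            continue
        prev_el = children[idx - 1] if idx > 0 else None
        next_el = children[idx + 1] if idx + 1 < len(children) else None
        if is_named_anchor(prev_el) and is_named_anchor(next_el):
            found.append(child)
    return found


td = ET.fromstring(SNIPPET)
reviews = ["".join(d.itertext()).strip() for d in divs_between_anchors(td)]
print(reviews)
```

The last div is skipped because nothing follows it, which is the difference between this stricter selector and //td[a/@name]/div.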
I figured it out! It turns out that xpath will allow for relative attribute assertions. I am not sure if this behavior is desired, but it happens to work in this case! Here's the xpath:
//td/div[../a[@name]]
Nice and clean, the ../a[@name] basically just says:
Go up a level, and make sure on that level of the hierarchy there's an a element with a name attribute

Protractor: Finding Element by Div Text

Hey I have this code in one of my div elements:
<div class="col-sm-8">Account Information: </div>
Can someone tell me how I would go about finding this element in my protractor code? Is it possible to do something like this:
expect(element(by.divText('Account Information: ')).isDisplayed()).toBe(true);
I have multiple elements with the class "col-sm-8" so I am not able to find the element by class. I was just wondering if there is any way to possibly find the element using the text in the div element? Thanks for the help!
I would recommend you to use by.cssContainingText
element(by.cssContainingText('.col-sm-8', 'Account Information'))
There is no webdriver method which would allow locating an element by its text. You could try using xpath in the following way (not tested):
element(by.xpath('//div[contains(text(), "Account Information: ")]'))
Keep in mind that by.cssContainingText matches the element by PARTIAL text,
so element(by.cssContainingText('div', 'male')) will actually match both the "male" and the "female" text.
To solve this, use XPath with an exact text match:
element(by.xpath('//div[text()="male"]'))
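The partial versus exact distinction can be sketched outside Protractor too. The example below is only an illustration in Python's standard-library ElementTree, on a made-up snippet: the substring test stands in for partial matching, and the [.='male'] predicate (available in Python 3.7 and later) stands in for the exact XPath match.

```python
import xml.etree.ElementTree as ET

# Made-up markup with two divs whose texts overlap.
root = ET.fromstring("<form><div>male</div><div>female</div></form>")

# Partial match: any div whose text contains "male".
partial = [d.text for d in root.iter("div") if "male" in (d.text or "")]

# Exact match: only divs whose whole text equals "male".
exact = [d.text for d in root.findall(".//div[.='male']")]

print(partial)
print(exact)
```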

how to get text in a div without an id or class jsoup

I have the following problem. I want to parse an HTML page with jsoup. The parsing is not the problem, but I need to get the text in a div without using the id or class. There is a custom attribute on the div element.
<div id="test" custom="tester">Get this text </div>
I tried the following:
Element bedrijfwrapper = document.select("custom[name="tester"].first();
but I can't get it to work. Can someone help me out?
You should try this :
document.select("[custom='tester']").first();
Good luck.
This might work:
doc.select("div[custom=tester]")
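For comparison, the same attribute-based lookup can be sketched with Python's standard-library ElementTree. This is not jsoup, just an illustration of selecting by a custom attribute instead of id or class. The wrapping body element and the second div are added here so the snippet is well-formed and has something to exclude.

```python
import xml.etree.ElementTree as ET

# The div from the question, wrapped in a made-up body with one extra div.
MARKUP = (
    '<body>'
    '<div id="test" custom="tester">Get this text </div>'
    '<div id="other">Not this</div>'
    '</body>'
)

root = ET.fromstring(MARKUP)

# Select by the custom attribute only; find returns None when nothing matches.
match = root.find(".//div[@custom='tester']")
text = match.text.strip() if match is not None else None

print(text)
```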

Comments within style= attributes - safe?

I am working on a CMS that generates CSS "style='xyz'" statements from user input.
The user input will be validated but as an additional safeguard, I want to check the validity of the values on generation of the CSS code.
If an invalid value is encountered - e.g. a relative width ("50%") where only absolute values are allowed due to layout restrictions - I would like to return a comment INSIDE the style attribute to help debugging:
<div class="content" style="background-color: lightblue; /* WIDTH was invalid: Only absolute values allowed here */; border: 1px orange dotted;">
Is this "safe", i.e. will all major browsers still parse the settings before and after the comment properly? It is difficult to Google information about this.
It is probably safe but I wouldn't put the wrong values commented into the markup.
Let the user know they did something wrong in the very beginning before you generate markup.
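A rough sketch of that approach: validate each value first, collect the problems separately, and only then build the style string. Everything here is hypothetical, including the function names, the list of units treated as "absolute", and the error wording (borrowed from the comment in the question).

```python
import re

# Assumption: "absolute" means a number followed by one of these CSS units.
ABSOLUTE_LENGTH = re.compile(r"^\d+(\.\d+)?(px|pt|pc|cm|mm|in)$")


def validate_width(value):
    """Return (clean_value, None) if valid, else (None, error message)."""
    value = value.strip().lower()
    if ABSOLUTE_LENGTH.match(value):
        return value, None
    return None, f"WIDTH was invalid ({value!r}): only absolute values allowed here"


def build_style(declarations):
    """Return (style string, errors); invalid widths are left out of the style."""
    parts, errors = [], []
    for prop, value in declarations:
        if prop == "width":
            value, error = validate_width(value)
            if error:
                errors.append(error)
                continue
        parts.append(f"{prop}: {value}")
    return "; ".join(parts), errors


style, errors = build_style([
    ("background-color", "lightblue"),
    ("width", "50%"),
    ("border", "1px orange dotted"),
])
print(style)
print(errors)
```

The errors list can then be shown to the user or written to a log, so the debugging hint never has to live inside the style attribute.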
A good idea would be to create such a test case and feed it to the W3C validator to see what it says about it.
http://validator.w3.org/
Off the top of my head, IE supports it, Firefox doesn't.