extracting "author" from a book in amazon with Jsoup - html

I have been trying this for days now and it won't work.
I want the name of the author of this book:
"http://www.amazon.de/Weit-weg-ganz-Jojo-Moyes-ebook/dp/B00H07CB9O/ref=sr_1_1?s=books&ie=UTF8&qid=undefined&sr=1-1".
As Firebug shows, it is located in the following code:
<html>
...
<div class="buying">
<h1 class="parseasinTitle">
<span>
<span class="contributorNameTrigger" asin="B001HMNFPMB00H07CB9O">
<a id="contributorNameTriggerB001HMNFPMB00H07CB9O" href="http://www.amazon.de/Jojo-Moyes/e /B001HMNFPM/ref=ntt_athr_dp_pel_1" asin="B001HMNFPMB00H07CB9O">Jojo Moyes</a>
<a href="#" asin="B001HMNFPMB00H07CB9O">
</span>
...
</html>
I tried to select the name with
doc.getElementsByClass("contributorNameTrigger")
but it doesn't work. Then I used the class "buying" and tried to select the span and then the span class, but that doesn't work either:
doc.getElementsByClass("buying").select("span").select("span[class=contributorNameTrigger");
Any help is appreciated!

It finally works with the following code:
Element author = doc.getElementsByClass("buying").select("span").select("a").first();
Thanks for the answers!

Related

How to get a div or span class from a related span class?

I've found the lowest class, <span class="pill css-1a10nyx e1pqc3131">, which occurs on multiple elements of a website, and now I want to find the related upper class, for example the highest <div class="css-1v73czv eh8fd9011" xpath="1">. I've got the soup but can't figure out a way to get from the 'lowest' class to the 'highest' class. Any idea?
<div class="css-1v73czv eh8fd9011" xpath="1">
<div class="css-19qortz eh8fd9010">
<header class="css-1idy7oy eh8fd909">
<div class="css-1rkuvma eh8fd908">
<footer class="css-f9q2sp eh8fd907">
<span class="pill css-1a10nyx e1pqc3131">
End result would be:
INPUT - Search all elements of a page with class <span class="pill css-1a10nyx e1pqc3131"> (lowest)
OUTPUT - Get all related titles or headers of said class.
I've tried it with if-statements but that doesn't work consistently. Something with an if class = (searchable class) then get (desired higher class) should work.
I can add any more details if needed please let me know, thanks in advance!
EDIT: Picture per clarification where the title(highest class) = "Wooferland Festival 2022" and the number(lowest class) = 253
As mentioned, the question needs some more information before a concrete answer can be given.
Assuming you would like to scrape the information in the picture, and based on your example HTML, you can select your pill and use .find_previous() to locate your elements:
for e in soup.select('span.pill'):
    print(e.find_previous('header').text)
    print(e.find_previous('div').text)
    print(e.text)
Assuming there is a container tag in the HTML structure, such as an <a>, you would select it based on the condition that it contains a <span> with class pill:
for e in soup.select('a:has(span.pill)'):
    print(e.header.text)
    print(e.header.next.text)
    print(e.footer.span.text)
Note: Instead of using CSS classes, which can be highly dynamic, try to use more static attributes or the HTML structure.
Example
See both options; for the first one the <a> does not matter.
from bs4 import BeautifulSoup
html='''
<a>
<div class="css-1v73czv eh8fd9011" xpath="1">
<div class="css-19qortz eh8fd9010">
<header class="css-1idy7oy eh8fd909">some date information</header>
<div class="css-1rkuvma eh8fd908">some title</div>
<footer class="css-f9q2sp eh8fd907">
<span class="pill css-1a10nyx e1pqc3131">some number</span>
</footer>
</div>
</div>
</a>
'''
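# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, 'html.parser')
# Option 1: start at the pill and search backwards through the document
for e in soup.select('span.pill'):
    print(e.find_previous('header').text)
    print(e.find_previous('div').text)
    print(e.text)
print('---------')
# Option 2: select the container that has a pill, then navigate downwards
for e in soup.select('a:has(span.pill)'):
    print(e.header.text)
    # .next is the next parsed element, here the header's own text node,
    # so the date is printed a second time
    print(e.header.next.text)
    print(e.footer.span.text)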
Output
some date information
some title
some number
---------
some date information
some date information
some number
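Since the question asks how to get from the innermost element up to an enclosing one, a third option is to walk upwards with .find_parent(). The following is only a minimal sketch: the markup and the class names card and title are hypothetical stand-ins for the dynamic css-... classes, and it assumes every pill sits inside exactly one enclosing <div> that also holds the title.

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two "cards"; class names are placeholders.
html = """
<div class="card">
  <header>date A</header>
  <div class="title">title A</div>
  <footer><span class="pill">253</span></footer>
</div>
<div class="card">
  <header>date B</header>
  <div class="title">title B</div>
  <footer><span class="pill">17</span></footer>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

results = []
for pill in soup.select("span.pill"):
    # Walk upwards to the closest enclosing <div> (the card) ...
    card = pill.find_parent("div")
    # ... then search downwards inside that card only.
    title = card.find("div", class_="title")
    results.append((title.get_text(strip=True), pill.get_text(strip=True)))

print(results)
```

Searching inside the parent keeps each number paired with the title of its own card, which matters once the page contains more than one pill.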

How to add a link in the end of each element (string) from the list? Using Thymeleaf

I am fetching lines of text from the list one by one and I need to add a hyperlink at the end of each line. I am trying the code below, but the link is not displayed.
<p th:each="releases : ${release}"
class="releases" th:text="${releases}" th:href="www.abc.com"> New Releases </p>
<p th:each="releases : ${release}"> <span class="releases" th:text="${releases.split('Spotify')[0]}">
New Releases </span> <a class="spoturl" th:href="${releases.split('URL:\s')[1]}"> Spotify URL </a> </p>
My solution
If you want to add a link to the end of each "release" string, you can use this:
<p th:each="releases : ${release}"
class="releases">
<span th:text="${releases}"></span>
<a th:href="@{www.abc.com/{rel}(rel=${releases})}"
th:text="'[link]'"></a>
</p>
So, for example, if the items in the release list are Some_Release and Another_Release, you will get this:
Some_Release [link]
Another_Release [link]
Each link text will have a customized href.
Try this
<p th:each="releases : ${release}" th:href="www.abc.com"> <span class="releases" th:text="${releases}"> New Releases </span> </p>

Contents under a specific 'div' is not showing in BeautifulSoup

I have been trying to fetch the user comments on a blog post. The tag that encloses the comments is <div id='loadComment'> ..... </div>.
When I tried to view it through soup.prettify(), I only found <div id='loadComment'> </div>. The markup in between was not displayed.
Below is the real code which I should be able to see as output
<div id = "loadComment">
<div>
<div class = "comm-cont">
<a href = "...">
................
<a class = ....> ... </a>
</a>
</div>
</div>
But the output I get using prettify() is
<div id = 'loadComment'>
</div>
I want to fetch the comments, but I am not even able to view them through BeautifulSoup. Therefore, I need help from the community. Thank you in advance.
Edit:
url:
https://www.somewhereinblog.net/blog/nibhrita/30292259

C# Razor template with @Link.To renders the incorrect link

I have a template for the details view of a single product. In this template it lists the "tags" and "categories" with links to view products of the same tag or category.
I define the links for the tags and categories in the same way but they are rendered differently.
here is my template:
<link rel="stylesheet" href="@App.Path/assets/portfolio.css" data-enableoptimizations="bottom"/>
<div class="sc-element">
<div class="ks-portfolio-detail">
<div class="row">
<div class="col-sm-12 col-md-6">@Edit.Toolbar(Content)
<img src="@Content.Image" alt="@Content.UrlKey" class="img-responsive" />
</div>
<div class="col-sm-12 col-md-6">
<div class="ks-title"><h1>@Content.Title</h1></div>
...
<div class="ks-lable">Categories:</div>
@{ int count=0; }
@foreach(var item in AsDynamic(Content.Categories)){
count++;
@item.Title
@(count < Content.Categories.Count?" | ":"")
}
<br/><br/>
<div class="ks-lable">Tags:</div>
@{ int counter=0; }
@foreach(var item in AsDynamic(Content.Tags)){
counter++;
@item.Name
@(counter < Content.Tags.Count?" | ":"")
}
</div>
</div>
</div>
</div>
Please note the lines where the category and tag links are created:
@item.Title
...
@item.Name
PROBLEM
The tags Link.To renders the link with "slashes" like:
http://dnn804/portfolio/tag/Demo2
but the category renders the link like:
http://dnn804/portfolio?category=Flowers
QUESTION
Can someone help me figure out why these links are rendered differently when using the same function? I want them both to appear like the "tags" link.
Thanks in advance.
The Link.To uses the DNN-internal link resolution. I'm just going to guess that there are some terms that DNN maybe treats differently, causing special links. DNN does sometimes do strange things with links, just like on the home page, which is why we are using it.
That's just a guess though.
You could run some experiments like "cat=" instead of category, to validate this.

How to get the whole sentence for title of <a> tag using <%= t('vote.up') %> in rails?

In en.yml, vote.up = "This question does not show any research effort; it is unclear or not useful"
When I write:
</i>
I get the HTML:
<i class="icon-arrow-down"></i>
But I hope to get:
<i class="icon-arrow-up"></i>
What should I do? Thanks in advance!
Wrap title's html attribute value in double quotes:
title="<%= t('vote.up') %>"
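To see why the quotes matter, here is a minimal sketch using Python's standard-library HTML parser. The anchor markup and the shortened sentence are hypothetical, since the original tag is not shown in the question. Without quotes, an attribute value ends at the first whitespace, so only the first word becomes the title and the remaining words are read as separate, valueless attributes.

```python
from html.parser import HTMLParser


class AnchorAttrs(HTMLParser):
    """Collects the attributes of every <a> start tag it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.seen.append(dict(attrs))


def first_anchor_attrs(markup):
    parser = AnchorAttrs()
    parser.feed(markup)
    parser.close()
    return parser.seen[0]


# Hypothetical rendered output, with and without quotes around the value.
unquoted = first_anchor_attrs('<a href="#" title=This question is unclear>x</a>')
quoted = first_anchor_attrs('<a href="#" title="This question is unclear">x</a>')

print(unquoted)
print(quoted)
```

In the unquoted case the parser reports title as just the first word, which matches the truncated tooltip described in the question.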