How to get elements that are out of Parent Class - html
I am trying to extract some data from a web page. However, not all of the information I need is inside the parent class I am looping over. I can get the information that is inside the parent class.
QUESTION - Is there a way to get data that sits outside the parent class? Or is there a way to change the code below so it extracts the data without using a parent class?
Link
I am using IE as it allows me to search the site. I have tried several code variations; however, the extra information is not in the parent class that I am extracting from.
I am after the name, location and social media links. The location is at the top of the web page, outside the class.
I tried using shop-home as the parent class, since all the other classes fall inside it, but it did not work. I have never tried to get data that is not in the parent class, so I am not 100% sure how to do it. SIM helped with element.ParentNode.ParentNode.getElementsByClassName, as the product URL came before the parent. I have been trying to use the same approach for all the other data that is outside the parent, but I cannot get it to work. I do not fully understand it; if someone could explain what .ParentNode.ParentNode. is doing, that would help my understanding and I might be able to work the rest out myself.
The code below is for the first two items, which are pulled fine. The code layout is the same for all items, following the pattern If element.getElementsByClassName("CLASS HERE")(0). I have also tried ID, tag, span and so on, for example If element.getElementsByClassName("CLASS HERE")(0).getElementsByTagName("Span")(0)
Application.ScreenUpdating = False
Set HTML = objIE.document
''''########## Setting the Parent Class HERE ##########
Set elements = HTML.getElementsByClassName("v2-listing-card__info")
''''Scrolls Down the Browser
objIE.document.parentWindow.Scroll 0&, 9999 ' Scrolls Down the Browser
''''FOR LOOP
For Each element In elements
    ''' Element 1
    If element.ParentNode.ParentNode.getElementsByClassName("listing-link")(0) Is Nothing Then
        wsSheet.Cells(sht.Cells(sht.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = "-"
    Else
        HtmlText = element.ParentNode.ParentNode.getElementsByClassName("listing-link")(0).href
        wsSheet.Cells(sht.Cells(sht.Rows.Count, "A").End(xlUp).Row + 1, "A").Value = HtmlText
    End If
    ''' Element 2
    If element.getElementsByTagName("h3")(0) Is Nothing Then
        wsSheet.Cells(sht.Cells(sht.Rows.Count, "B").End(xlUp).Row + 1, "B").Value = "-"
    Else
        HtmlText = element.getElementsByTagName("h3")(0).innerText ' Get the text of the h3 child node
        wsSheet.Cells(sht.Cells(sht.Rows.Count, "B").End(xlUp).Row + 1, "B").Value = HtmlText ' Return value in column B
    End If
''' Element 3
RESULTS - Data in red is wrong or missing, as it is not in the above parent class.
The shipping in column H is pulled fine as it is in the parent; if there is no shipping info, a hyphen goes into the cell. The items for C, D and E are outside the parent class that I am using.
<div class="flex-grow-1">
<div class="max-width-760px ">
</div>
<div class="max-width-676px">
<div class="">
<p class="wt-text-heading-02 wt-display-inline" data-inplace-editable-text="story_headline" data-endpoint="AboutPost" data-key="story_headline" data-placeholder="Sum up what you do in one sentence. Or just write something catchy." data-use-inplace-input="1"
data-add-class="normal story-headline-edit-link"></p>
</div>
<div class="">
<div id="about-story" class="" aria-hidden="false">
<p class="about-story text-body-larger text-gray-lighter ">
<span class="mt-xs-1" data-inplace-editable-text="story" data-endpoint="AboutPost" data-key="story" data-placeholder="How did you get started? What inspires you? We know each seller’s story is unique — tell yours here."></span>
</p>
</div>
<div class="wt-text-center-xs">
</div>
</div>
</div>
<div class="wt-mb-xs-6 wt-mb-md-8">
<div class="clearfix"></div>
<div>
<h3 class="wt-text-title-01"></h3>
<div class="pt-xs-2 pt-lg-4">
<div class="display-flex-md flex-wrap max-width-760px">
<div class="mb-xs-2 text-body mr-md-6">
<a href="https://www.facebook.com/Lucky-Plum-706715642737271/" class="text-decoration-none clearfix" title="Facebook" target="_blank" rel="nofollow noopener">
<span class="etsy-icon"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" aria-hidden="true" focusable="false"><path d="M20,5V19a1.007,1.007,0,0,1-1,1H15V13.776h2l0.336-2.3H15V9.659a0.912,0.912,0,0,1,1-1.031h1.5V6.55a11.284,11.284,0,0,0-1.641-.109c-2.2,0-3.3,1.219-3.3,3.039v1.992h-2v2.3h2V20H5a1.007,1.007,0,0,1-1-1V5A1.007,1.007,0,0,1,5,4H19A1.007,1.007,0,0,1,20,5Z"></path></svg></span>
<span>Facebook</span>
</a>
</div>
<div class="mb-xs-2 text-body mr-md-6">
<a href="https://www.instagram.com/luckyplumstudio/" class="text-decoration-none clearfix" title="Instagram" target="_blank" rel="nofollow noopener">
<span class="etsy-icon"><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" aria-hidden="true" focusable="false"><path d="M12,5.447c2.136,0,2.389,0.008,3.233,0.047c0.78,0.036,1.204,0.166,1.485,0.275c0.373,0.145,0.64,0.318,0.92,0.598 c0.28,0.28,0.453,0.546,0.598,0.92c0.11,0.282,0.24,0.706,0.275,1.485c0.038,0.844,0.047,1.097,0.047,3.233 s-0.008,2.389-0.047,3.233c-0.036,0.78-0.166,1.204-0.275,1.485c-0.145,0.373-0.318,0.64-0.598,0.92 c-0.28,0.28-0.546,0.453-0.92,0.598c-0.282,0.11-0.706,0.24-1.485,0.275c-0.843,0.038-1.096,0.047-3.233,0.047 s-2.389-0.008-3.233-0.047c-0.78-0.036-1.204-0.166-1.485-0.275c-0.373-0.145-0.64-0.318-0.92-0.598 c-0.28-0.28-0.453-0.546-0.598-0.92c-0.11-0.282-0.24-0.706-0.275-1.485c-0.038-0.844-0.047-1.097-0.047-3.233 S5.45,9.616,5.488,8.773c0.036-0.78,0.166-1.204,0.275-1.485c0.145-0.373,0.318-0.64,0.598-0.92c0.28-0.28,0.546-0.453,0.92-0.598 c0.282-0.11,0.706-0.24,1.485-0.275C9.611,5.455,9.864,5.447,12,5.447 M12,4.005c-2.173,0-2.445,0.009-3.298,0.048 C7.85,4.092,7.269,4.227,6.76,4.425C6.234,4.63,5.787,4.903,5.343,5.348C4.898,5.793,4.624,6.239,4.42,6.765 c-0.198,0.509-0.333,1.09-0.372,1.942C4.009,9.56,4,9.833,4,12.005c0,2.173,0.009,2.445,0.048,3.298 c0.039,0.852,0.174,1.433,0.372,1.942c0.204,0.526,0.478,0.972,0.923,1.417c0.445,0.445,0.891,0.718,1.417,0.923 c0.509,0.198,1.09,0.333,1.942,0.372c0.853,0.039,1.126,0.048,3.298,0.048s2.445-0.009,3.298-0.048 c0.852-0.039,1.433-0.174,1.942-0.372c0.526-0.204,0.972-0.478,1.417-0.923c0.445-0.445,0.718-0.891,0.923-1.417 c0.198-0.509,0.333-1.09,0.372-1.942C19.991,14.45,20,14.178,20,12.005s-0.009-2.445-0.048-3.298 c-0.039-0.852-0.174-1.433-0.372-1.942c-0.204-0.526-0.478-0.972-0.923-1.417c-0.445-0.445-0.891-0.718-1.417-0.923 c-0.509-0.198-1.09-0.333-1.942-0.372C14.445,4.014,14.173,4.005,12,4.005L12,4.005z"></path><path d="M12,7.897c-2.269,0-4.108,1.839-4.108,4.108S9.731,16.113,12,16.113s4.108-1.839,4.108-4.108S14.269,7.897,12,7.897z 
M12,14.672c-1.473,0-2.667-1.194-2.667-2.667S10.527,9.339,12,9.339s2.667,1.194,2.667,2.667S13.473,14.672,12,14.672z"></path><circle cx="16.27" cy="7.735" r="0.96"></circle></svg></span>
<span>Instagram</span>
</a>
</div>
</div>
</div>
</div>
</div>
<div class="wt-mb-xs-8 wt-mb-md-10">
<div class="clearfix"></div>
<div class="about-section display-flex-md flex-direction-column-md mb-md-5 pl-xs-0 pr-xs-0" data-region="shop-members" id="shop-members">
<div class="p-xs-0">
<h3 class="wt-text-title-01">Shop members</h3>
</div>
<div class="pl-xs-0 pr-xs-0 pt-xs-2 pt-lg-4">
<div class="max-width-760px">
<ul class="list-unstyled block-grid-md-2" data-region="shop-member-list">
<li class="pt-xs-2 pb-xs-2 block-grid-item" data-region="shop-member" data-member-id="22676501471" data-member-avatar-url="https://i.etsystatic.com/isc/87253d/22676501471/isc_90x90.22676501471_6w54.jpg?version=0" data-member-bio="" data-member-role="Owner"
data-member-name="Lucky Plum Studio">
<div class="flag">
<div class="flag-img vertical-align-top pr-lg-3">
<img src="https://i.etsystatic.com/isc/87253d/22676501471/isc_90x90.22676501471_6w54.jpg?version=0" alt="" class="circle" data-region="member-avatar" width="48" height="48">
</div>
<div class="flag-body">
<h6 class="mb-xs-0 b text-transform-none text-body" data-region="member-name">Lucky Plum Studio</h6>
<p class="prose" data-region="member-role">Owner</p>
<p class="text-gray-lighter mb-xs-0" data-region="member-bio">
</p>
</div>
</div>
</li>
</ul>
</div>
</div>
</div>
</div>
<div class="">
</div>
</div>
As Always thanks in advance
''######### updated today 22/3/2021 at 6pm uk time #########
In reply to Qharr's answer: I had this for location and nothing was collected. Could you please explain where I went wrong? I should then be able to fix the rest.
''' Element 4
DoEvents
If element.getElementsByClassName("shop-location")(0).getElementsByTagName("Span")(0) Is Nothing Then ' Get class and child node
    wsSheet.Cells(sht.Cells(sht.Rows.Count, "D").End(xlUp).Row + 1, "D").Value = "-"
Else
    HtmlText = element.getElementsByClassName("shop-location")(0).getElementsByTagName("Span")(0).innerText
    wsSheet.Cells(sht.Cells(sht.Rows.Count, "D").End(xlUp).Row + 1, "D").Value = HtmlText
End If
Not sure what to say except read up on HTML, HTML document methods and CSS selectors so you understand the patterns you need to apply. The rest is just practice and learning which methods are the fastest and most robust.
CSS:
Location: .shop-location span selects a span that is a descendant of an element with class shop-location
Social media links: #about .text-decoration-none selects elements with class text-decoration-none that are descendants of the element with id about
Name: [data-region='member-name'] selects the element whose data-region attribute has the value member-name
Read about css selectors and descendant combinator here
Practice css selectors here
Learn about html here
VBA:
Option Explicit

Public Sub GetInfo()
    Dim ie As SHDocVw.InternetExplorer
    Set ie = New SHDocVw.InternetExplorer
    With ie
        .Visible = True
        .Navigate2 "https://www.etsy.com/uk/shop/LuckyPlumStudio"
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
        With .document
            Debug.Print .querySelector(".shop-location span").innerText 'location
            Dim i As Long, socialMedias As Object
            Set socialMedias = .querySelectorAll("#about .text-decoration-none")
            For i = 0 To socialMedias.Length - 1 'media links
                Debug.Print socialMedias.Item(i).href
            Next
            Debug.Print .querySelector("[data-region='member-name']").innerText 'company name
        End With
        .Quit
    End With
End Sub
Less optimal methods for selecting:
Option Explicit

Public Sub GetInfo()
    Dim ie As SHDocVw.InternetExplorer
    Set ie = New SHDocVw.InternetExplorer
    With ie
        .Visible = True
        .Navigate2 "https://www.etsy.com/uk/shop/LuckyPlumStudio"
        While .Busy Or .readyState <> READYSTATE_COMPLETE: DoEvents: Wend
        With .document
            Debug.Print .getElementsByClassName("shop-location wt-display-flex-xs")(0).getElementsByTagName("span")(0).innerText 'location
            Dim i As Object, socialMedias As Object
            Set socialMedias = .getElementById("about").getElementsByClassName("text-decoration-none clearfix")
            For Each i In socialMedias 'media links
                Debug.Print i.href
            Next
            Debug.Print .getElementById("about").getElementsByClassName("flag")(0).getElementsByTagName("h6")(0).innerText 'company name
        End With
        .Quit
    End With
End Sub
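On the question of what .ParentNode.ParentNode is doing: each .ParentNode steps one level up the DOM tree, from an element to the element that contains it. Searching from an ancestor can therefore reach elements that sit outside the element you started from. The sketch below is only an illustration under stated assumptions: the HTML string, the class names card-wrapper and card-body, the example.com URL and the procedure name ParentNodeDemo are made up and are not the real page markup. It assumes a reference to the Microsoft HTML Object Library.

```vba
Option Explicit

' Assumes a reference to "Microsoft HTML Object Library" (MSHTML).
Public Sub ParentNodeDemo()
    Dim doc As MSHTML.HTMLDocument
    Set doc = New MSHTML.HTMLDocument

    ' Hypothetical markup: the link sits next to the info block's parent,
    ' so it cannot be found by searching inside "v2-listing-card__info".
    doc.body.innerHTML = _
        "<div class='card-wrapper'>" & _
        "<a class='listing-link' href='https://example.com/item1'>Item</a>" & _
        "<div class='card-body'>" & _
        "<div class='v2-listing-card__info'><h3>Item 1</h3></div>" & _
        "</div>" & _
        "</div>"

    Dim info As Object
    Set info = doc.getElementsByClassName("v2-listing-card__info")(0)

    ' One step up: the element that directly contains the info block.
    Debug.Print info.ParentNode.className
    ' Two steps up: the grandparent, which also contains the link.
    Debug.Print info.ParentNode.ParentNode.className

    ' Searching from the grandparent can now reach the link.
    Debug.Print info.ParentNode.ParentNode.getElementsByClassName("listing-link")(0).href
End Sub
```

With this made-up markup the two class names printed should be card-body and then card-wrapper. The same reasoning explains why the location was not collected: .shop-location is not inside a listing card or its near ancestors, so it is simpler to select it from the document itself, as in the code above, than to climb up from each card.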
Related
VBA Webscrape HTML Coinmarketcap
I am trying to scrape the number of cryptos shown in the top left corner of https://coinmarketcap.com/. I tried to find the "tr" but could not, and I am not sure how to grab that value. Here is what I have so far; it throws runtime error 438, "Object doesn't support this property or method".
Sub cRYP()
    Dim appIE As Object
    Set appIE = CreateObject("internetexplorer.application")
    With appIE
        .Navigate "https://coinmarketcap.com/"
        .Visible = True
    End With
    Do While appIE.Busy
        DoEvents
    Loop
    Set allRowOfData = appIE.Document.getElementById("__next")
    Dim myValue As String: myValue = allRowOfData.Cells(16).innerHTML
    appIE.Quit
    Range("A1").Value = myValue
End Sub
There is no tr tag, because there is no table. First you must get the HTML structure that contains the wanted value, because it cannot be fetched directly. That is the structure with the class name container. Because the method getElementsByClassName() builds a node collection, you must pick the right structure by its index in the collection. That's easy because it's the first one; the first index of a collection is 0, like in an array. Then you have this HTML structure:
<div class="container">
<div><span class="sc-2bz68i-0 cVPJov">Cryptos <!-- -->: 17.826 </span><span class="sc-2bz68i-0 cVPJov">Exchanges <!-- -->: 459 </span><span class="sc-2bz68i-0 cVPJov">Market Cap <!-- -->: €1,536,467,483,857 </span><span class="sc-2bz68i-0 cVPJov">24h Vol <!-- -->: €105,960,257,048 </span><span class="sc-2bz68i-0 cVPJov">Dominance <!-- -->: <a href="/charts/#dominance-percentage" class="cmc-link">BTC <!-- -->: <!-- -->42.7% <!-- --> <!-- -->ETH <!-- -->: <!-- -->18.2% </a> </span><span class="sc-2bz68i-0 cVPJov"><span class="icon-Gas-Filled" style="margin-right:4px;vertical-align:middle"></span>ETH Gas <!-- -->: <a>35 <!-- --> <!-- -->Gwei<span class="sc-2bz68i-1 cEFmtT icon-Chevron-down"></span> </a> </span></div>
<div class="rz95fb-0 jKIeAa">
<div class="sc-16r8icm-0 cPgeGh nav-item"></div>
<div class="rz95fb-1 rz95fb-2 eanzZL"> <div class="cmc-popover"> <div class="cmc-popover__trigger"><button title="Change your language" class="sc-1kx6hcr-0 eFEgkr"><span class="sc-1b4wplq-1 kJnRBT">English</span><span class="sc-1b4wplq-0 ifkbzu"><span class="icon-Caret-down"></span></span></button></div> </div> </div>
<div class="rz95fb-1 cfBxiI"> <div><button title="Select Currency" data-qa-id="button-global-currency-picker" class="sc-1kx6hcr-0 eFEgkr"><span class="sc-1q0bpva-0 hEPBWj"></span><span class="sc-1bafwtq-1 dUQeWc">EUR</span><span class="sc-1bafwtq-0 cIzAJN"><span class="icon-Caret-down"></span></span></button></div> </div><button type="button" class="sc-1kx6hcr-0 rz95fb-6 ccLqrB cmc-theme-picker cmc-theme-picker--day"><span class="icon-Moon"></span></button>
</div>
</div>
As you can see, the wanted value is part of the first a tag in the scraped structure. We can simply get that tag with the method getElementsByTagName(). This also builds a node collection, and again we need the first element, at index 0. Then we have this: 17.826
Now we only need the innerText of this element and that's it.
Here is the VBA code. I don't use IE, because it is finally EOL and shouldn't be used anymore. You can load coinmarketcap simply, without any parameters, via XHR (XML HTTP request):
Sub CryptosCount()
    Const url As String = "https://coinmarketcap.com/"
    Dim doc As Object
    Dim nodeCryptosCount As Object

    Set doc = CreateObject("htmlFile")
    With CreateObject("MSXML2.XMLHTTP.6.0")
        .Open "GET", url, False
        .Send
        If .Status = 200 Then
            doc.body.innerHTML = .responseText
            Set nodeCryptosCount = doc.getElementsByClassName("container")(0).getElementsByTagName("a")(0)
            MsgBox "Number of cryptocurrencies on Coinmarketcap: " & nodeCryptosCount.innertext
        Else
            MsgBox "Page not loaded. HTTP status " & .Status
        End If
    End With
End Sub
Edit: As I see now, the value can also be fetched directly with getElementsByClassName("cmc-link")(0). You can play with the code to learn more.
Extract email HTML Element
I am trying to scrape a page and there is a point I am stuck at. Here's first the HTML part of the whole HTML page <article class="mod mod-Treffer" data-teilnehmerid="122085958708"> <div data-wipe="{"listener": "click", "name": "Trefferliste Eintrag zur Detailseite", "id": "122085958708", "synchron": true}" data-realid="2aeca1d2-2bc5-4070-ac4d-e16b10badca5" data-tnid="122085958708" target="_self"> <div class="mod-hervorhebung"> <p class="mod-hervorhebung--partnerHervorhebung" data-hervorhebungsstufe="3">Silber Partner</p> </div> <picture class="trefferlisten_logo"> <source media="(min-width: 768px)" srcset="https://ies.v4all.de/0122/GS/0122/5/8335/49428335_310x190.png"> <img alt="" data-lazy-src="https://ies.v4all.de/0122/GS/0122/5/8335/49428335_310x190.png" src="https://ies.v4all.de/0122/GS/0122/5/8335/49428335_310x190.png"> </picture> <h2 data-wipe-name="Titel">A & S Billing Pflege-Service GmbH</h2> <p class="d-inline-block mod-Treffer--besteBranche">Ambulante Pflegedienste</p> <div class="mod mod-Stars mod-Stars--" title="2.9/5" data-float="2,9"> <span class="mod-Stars__text" style="width: 58.000001907348632812500%;">2.9</span> </div> <span>2.9</span> <span>(8)</span> <address class="mod mod-AdresseKompakt"> <p data-wipe-name="Adresse"> Kirchenberg 2‑4, <span class="nobr"> 90482 Nürnberg </span> (Mögeldorf) </p> <p class="mod-AdresseKompakt__phoneNumber" data-hochgestellt-position="end" data-wipe-name="Kontaktdaten">(0911) 60 00 99 77</p> </address> </div> <div class="aktionsleiste_kompakt"> <div class="mod-gsSlider mod-gsSlider--noneOnWhite"> <span class="mod-gsSlider__arrow mod-gsSlider__arrow--arrow" data-direction="left" data-show="false" data-wipe="{"listener":"click","name":"Trefferliste: Aktionleiste-button-links"}"></span> <span class="mod-gsSlider__arrow mod-gsSlider__arrow--arrow" data-direction="right" data-show="false" data-wipe="{"listener":"click","name":"Trefferliste: Aktionleiste-button-rechts"}"></span> <div class="mod-gsSlider__slider" 
data-initialized="true"> <a class="contains-icon-homepage gs-btn" target="_blank" rel=" noopener" href="http://www.as-billing.de" data-wipe="{"listener":"click", "name":"Trefferliste Webseite-Button", "id":"122085958708"}" data-isneededpromise="false">Webseite</a> <a class="contains-icon-email gs-btn" href="mailto:info#as-billing.de?subject=Anfrage%20%C3%BCber%20Gelbe%20Seiten" data-wipe="{"listener":"click", "name":"Trefferliste Email-Button", "id":"122085958708"}" data-isneededpromise="false">E-Mail</a> <span class="contains-icon-route_finden gs-btn" data-wipe="{"listener":"click", "name":"Trefferliste Navigation-Button", "id":"122085958708"}" data-parameters="{"partner": "googlemaps", "searchquery": "A%20%26%20S%20Billing%20Pflege-Service%20GmbH%20Kirchenberg%202-4%2090482%20N%C3%BCrnberg"}" data-target="_blank">Route</span> <span class="contains-icon-details gs-btn" data-wipe="{"listener":"click", "name":"Trefferliste Actionbutton Mehr Details", "id":"122085958708"}" data-parameters="{"partner": "gs", "realId": "2aeca1d2-2bc5-4070-ac4d-e16b10badca5", "tnId": "122085958708"}">Mehr Details</span> </div> </div> </div> </article>
I first used these lines
Dim post As Object
Set post = html.querySelectorAll(".mod-Treffer")
For i = 0 To post.Length - 1
    Debug.Print post.Item(i).getElementsByTagName("h2")(0).innerText
    Debug.Print post.Item(i).getElementsByTagName("Address")(0).getElementsByTagName("p")(1).innerText
    'I am stuck with extracting the email
    'HERE
Next i
Moreover, sometimes the post object doesn't have the email information, so I need to extract it only if found.
That's the code till now
Const sURL As String = "https://www.gelbeseiten.de/Suche/Ambulante%20Pflegedienste/Bundesweit"
Dim http As MSXML2.XMLHTTP60, html As HTMLDocument
Set http = New MSXML2.XMLHTTP60
Set html = New MSHTML.HTMLDocument
With http
    .Open "Get", sURL, False
    .send
    html.body.innerHTML = .responseText
End With
Dim post As Object
Set post = html.querySelectorAll(".mod-Treffer")
Dim i As Long, r As Long
Range("A1").Resize(1, 3).Value = Array("Title", "Phone", "Email")
r = 2
For i = 0 To post.Length - 1
    Cells(r, 1).Value = post.Item(i).getElementsByTagName("h2")(0).innerText
    Cells(r, 2).Value = post.Item(i).getElementsByTagName("Address")(0).getElementsByTagName("p")(1).innerText
Next i
Here's a snapshot of the email part
Original question:
In this case I would use an attribute = value selector with the contains operator to target the href attribute by the string mailto.
CSS selector to add: [href*=mailto]
If you use querySelectorAll("[href*=mailto]") you can test whether the .Length property is greater than 0, or use querySelector and test If Not querySelector("[href*=mailto]") Is Nothing.
If you set it to a variable:
Dim ele As Object
Set ele = html.document.querySelector("[href*=mailto]")
If Not ele Is Nothing Then
    Debug.Print ele.href
    'do something with the href to parse out email
End If
Updated question:
For the updated question I would transfer the outerHTML of the current node in the nodeList into a surrogate HTMLDocument variable, so I can leverage the querySelector method again. I would target the email by class.
Option Explicit

Public Sub GetListingInfo()
    Const URL As String = "https://www.gelbeseiten.de/Suche/Ambulante%20Pflegedienste/Bundesweit"
    Dim http As MSXML2.XMLHTTP60, html As MSHTML.HTMLDocument
    Set http = New MSXML2.XMLHTTP60
    Set html = New MSHTML.HTMLDocument
    With http
        .Open "Get", URL, False
        .send
        html.body.innerHTML = .responseText
    End With
    Dim post As Object, html2 As MSHTML.HTMLDocument
    Set post = html.querySelectorAll(".mod-Treffer")
    Set html2 = New MSHTML.HTMLDocument
    Dim i As Long, emailNode As Object
    With ActiveSheet
        .Range("A1").Resize(1, 3).Value = Array("Title", "Phone", "Email")
        For i = 0 To post.Length - 1
            html2.body.innerHTML = post.Item(i).outerHTML
            .Cells(i + 2, 1).Value = html2.querySelector("h2").innerText
            .Cells(i + 2, 2).Value = html2.querySelector(".mod-AdresseKompakt__phoneNumber").innerText
            Set emailNode = html2.querySelector(".contains-icon-email")
            If Not emailNode Is Nothing Then .Cells(i + 2, 3).Value = Replace$(emailNode.href, "mailto:", vbNullString)
        Next i
    End With
End Sub
Thanks a lot. I could figure it out using these lines
If InStr(post.Item(i).getElementsByTagName("a")(1).href, "mailto:") Then
    Debug.Print Split(Split(post.Item(i).getElementsByTagName("a")(1).href, "mailto:")(1), "?")(0)
End If
But I welcome any other suggestions to improve and learn more.
* After testing, I encountered an error if the email is not found within the element. How can I avoid the error? I could use On Error Resume Next, but I would rather handle the error than skip it.
** Edit: I could solve the second point by using this structure
Dim emailObj As Object
Set emailObj = post.Item(i).getElementsByTagName("a")(1)
If Not emailObj Is Nothing Then
    If InStr(post.Item(i).getElementsByTagName("a")(1).href, "mailto:") Then
        Debug.Print Split(Split(post.Item(i).getElementsByTagName("a")(1).href, "mailto:")(1), "?")(0)
    End If
End If
The code works, but sometimes the email is not grabbed correctly. That is because of this line
Set emailObj = post.Item(i).getElementsByTagName("a")(1)
Sometimes the email link is not at index 1. So my last question: how can I get the email data regardless of its index? Inside the loop, I tried this line and played around with it to no avail
Set aNodeList = post.Item(i).querySelectorAll(".contains-icon-email")(0)
With VBA get Internet explorer to select item from list
Good day, I've searched for answers, and the solutions provided on this site did not seem to help, including selectedIndex and looping through arrays. I've got the following HTML code making up a table from which I want to select the second option, "Vorige week":
<table cellspacing="0" cellpadding="0" title="" class="mstrListBlock" id="id_mstr51" style="display: table; width: auto;"> <tbody> <tr> <td class="mstrListBlockCell"> <span class=""> <div class="mstrListBlockCaption" style="display: none;"/> <div class="mstrListBlockHeader" style="display: none;"> <div style="" class="mstrListBlockContents" id="ListBlockContents_id_mstr51"> <div oncontextmenu="return mstr.behaviors.Generic.oncontextmenu(arguments[0], self, 'id_mstr51');" onmouseup="try{mstr.$obj('id_mstr51').focus();}catch(localerr){}; return mstr.behaviors.Generic.clearBrowserHighlights(self)" onmousedown="var retVal = mstr.behaviors.ListView.onmousedown(arguments[0], self, 'id_mstr51'); try{mstr.$obj('id_mstr51').focus();}catch(localerr){}; return retVal" ondblclick="return mstr.behaviors.ListView.ondblclick(arguments[0], self, 'id_mstr51')" class="mstrListBlockListContainer" id="id_mstr51ListContainer" style="display: block;"> <div class="mstrListBlockItem" title="Huidige Week"> <div class="mstrListBlockItemSelected" title="Vorige Week"> <div class="mstrBGIcon_fi mstrListBlockItemName" style="background-position: 2px 50%; padding-left: 23px;">Vorige Week</div> </div> <div class="mstrListBlockItem" title="Afgesloten 4 Weken"> <div class="mstrListBlockItem" title="Afgesloten 8 Weken"> <div class="mstrListBlockItem" title="Huidige Periode"> <div class="mstrListBlockItem" title="Vorige Periode"> <div class="mstrListBlockItem" title="Afgesloten 2 Perioden"> <div class="mstrListBlockItem" title="Selectie Datum Hiërarchie. Aangepast wegens IServer crash icm. Metric prompts."> <div class="mstrListBlockItem" title="Gisteren">
I think my problem is in deciding which element I need to use to get the desired outcome.
Sub JDWReport()
    Dim objIE As InternetExplorer
    Set objIE = New InternetExplorerMedium
    objIE.Visible = True
    objIE.navigate "URL"
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    objIE.document.getElementById("Uid").Value = "username"
    objIE.document.getElementById("Pwd").Value = "password"
    objIE.document.getElementById("3054").Click
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    objIE.navigate "URL2"
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
    objIE.document.getElementsClassName("mstrBGIcon_fi mstrListBlockItemName")(0).Click
    objIE.Quit
End Sub
See the code above, which I'm currently using. It gets stuck on the line
objIE.document.getElementsClassName("mstrBGIcon_fi mstrListBlockItemName")(0).Click
I tried changing this line to different elements based on the HTML code, and using .click and .selectedindex=2, but those won't work.
<div class="mstrListBlockItemSelected" title="Vorige Week">
Currently it says mstrListBlockItemSelected; however, when first navigating to the site, the class is defined like the rest, as mstrListBlockItem. It only changes to selected if you click on the item in question (from a list of items). My ultimate goal would be to get the class of the item with title "Vorige Week" to change from mstrListBlockItem to mstrListBlockItemSelected.
I can see that you are using an HTML table and creating DIVs inside it. I searched and could not find any method or property for selecting the text in a DIV. I suggest using an HTML control whose value can be selected, for example a select element with options. You can create a drop-down using select and then use the code below to select a value in it.
Sub Select_Item()
    Dim post As Object, elem As Object
    With CreateObject("InternetExplorer.Application")
        .Visible = True
        .navigate "C:\Users\WCS\Desktop\element.html"
        While .Busy = True Or .ReadyState < 4: DoEvents: Wend
        Set post = .Document.getElementById("ctl00_ContentPlaceHolder1_ddlCycleID")
        For Each elem In post.getElementsByTagName("option")
            If InStr(elem.Value, "10") > 0 Then elem.Selected = True: Exit For
        Next elem
    End With
End Sub
You can try an attribute = value CSS selector:
ie.document.querySelector("[title='Vorige Week']").Selected = True
Or
ie.document.querySelector("[title='Vorige Week']").Click
Excel VBA - Get Link from HTML Anchor Where There is No ID Getting [object HTMLDivElement]
I'm trying to get the href value from a link on a webpage, where the anchor tag does not have an ID. The method that I'm using below returns [object HTMLDivElement] as the value for aEle. How do I get the actual link from the anchor tag within the h2 class="title" tag?
The HTML looks like this:
<div id="searchResults" class="searchResults_clear">
<div class="prodFeatures> <div class="inner"> <a title="Product Name 1" class="img" href="/product1.html"><img alt='' src="/products/product1.jpg" /></a> <div class="details"> <div class="info"> <p class="mfg">Mfg 1</p> <h2 class="title">Product Name 1</h2> <div class="SKU"> SKU #1 </div> <p class="model">Model #1</p> </div> </div> </div>
<div class="prodFeatures> <div class="inner"> <a title="Product Name 2" class="img" href="/product2.html"><img alt='' src="/products/product2.jpg" /></a> <div class="details"> <div class="info"> <p class="mfg">Mfg 2</p> <h2 class="title">Product Name 2</h2> <div class="SKU"> SKU #2 </div> <p class="model">Model #2</p> </div> </div> </div>
</div>
I've tried a few different methods that I found through StackOverflow. This method appears to come the closest. (I didn't save the links, but that would have been helpful for everyone, wouldn't it?) It seems like this should work. (I've only shown the part of the sub that is responsible for grabbing the data. The rest of it works fine.)
Dim objIE As InternetExplorer
Dim aEle As HTMLLinkElement
Dim y As Integer
Dim result As String
'set iteration counter
i = 1
'for each <a> element in the collection of objects with class of 'info'...
For Each aEle In objIE.document.getElementsByClassName("info")
    MsgBox (aEle)
    ' limit to 3 items returned per search term
    If i = 4 Then
        Exit For
    End If
    '...count result and print it to Sheet2 in col A
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(1, 0).Value = i
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(1, 0).EntireColumn.AutoFit
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(1, 0).EntireColumn.HorizontalAlignment = xlCenter
    Debug.Print i
    '...print search terms used in Sheet2 in col B
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 1).Value = searchCell
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 1).WrapText = False
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 1).EntireColumn.AutoFit
    Debug.Print searchCell
    If InStr(aEle, " ") = 1 Then
        Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 2).Value = "Nothing Found"
        Debug.Print "Nothing Found"
        GoTo nextSearch
    Else
        '...get the description within the element and print it to Sheet2 in col C
        Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 2).Value = aEle.innerText
        Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 2).WrapText = False
        Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 2).EntireColumn.AutoFit
        Debug.Print paraText
    End If
    '...get the href link and print it to Sheet2 in col D, next blank row
    result = aEle
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 3).Value = result
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 3).WrapText = False
    Worksheets("Results").Range("A1048576").End(xlUp).Offset(0, 3).EntireColumn.AutoFit
    Debug.Print result
    'increment our iteration counter
    i = i + 1
'repeat times the # of ele's we have in the collection
Next aEle
I want to loop through information in a web page's HTML. I'm not an expert in HTML but I know what I want from the code
</div> <p id="content-profile-view"> <h3 class="content-profile-title" id="content-profile-title-profile"> Member Profile </h3> <div class="content-profile-display" id="content-profile-display-profile"> <fieldset class="fieldgroup group-membership"><legend>Membership</legend><div class="field field-type-text field-field-membertype"> <div class="field-items"> <div class="field-item odd"> <div class="field-label-inline-first"> Member Type: </div> Fellow </div> </div> </div>
In the HTML above I want to return Member Type: Fellow. My code below gets me Member Type: but I can't seem to get the Fellow part. See my VBA code below.
Dim collection As MSHTML.IHTMLElementCollection
Dim element As MSHTML.HTMLInputElement, subElement As MSHTML.HTMLInputElement
Dim a As String
Dim b As String
Set collection = Doc.getElementsByTagName("div")
For Each element In collection
    If element.className = "field-label-inline-first" Then
        a = element.innerText
        Debug.Print a
    End If
Next element
To get the second piece of data, you'll need to look for divs with the class 'field-item', because Fellow is included in that overall div, not in the field-label-inline-first div. I've reformatted the content below to make it more obvious what is going on here.
<div class="field-item odd">
    <div class="field-label-inline-first">Member Type: </div>
    Fellow
</div>
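As a rough sketch of that suggestion in VBA: the loop below looks for the outer field-item div and prints its innerText, which should include both the label and the value. The inline HTML string stands in for the page, and the procedure name PrintMemberType is made up for illustration; it assumes a reference to the Microsoft HTML Object Library. The exact whitespace between label and value may differ, since the label sits in its own div.

```vba
Option Explicit

' Assumes a reference to "Microsoft HTML Object Library" (MSHTML).
Public Sub PrintMemberType()
    Dim doc As MSHTML.HTMLDocument
    Set doc = New MSHTML.HTMLDocument

    ' Stand-in for the fragment shown in the question.
    doc.body.innerHTML = _
        "<div class='field-items'>" & _
        "<div class='field-item odd'>" & _
        "<div class='field-label-inline-first'>Member Type: </div> Fellow" & _
        "</div></div>"

    Dim element As Object
    For Each element In doc.getElementsByTagName("div")
        ' The class attribute is "field-item odd", so test for the
        ' "field-item" token instead of comparing the whole string.
        If InStr(" " & element.className & " ", " field-item ") > 0 Then
            Debug.Print Trim$(element.innerText)
        End If
    Next element
End Sub
```

Padding the class name with spaces before the InStr test keeps field-items and field-label-inline-first from matching by accident. Comparing className = "field-item" directly would fail here because the element carries two classes.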