How to get HTML element with VBA in Excel?

I have an HTML in my Excel:
<form name="scform" action="online_range.aspx" autocomplete="off">
<input name="AcctNo" type="hidden" value="3949067512">
<table width="100%" border="0" cellpadding="3">
<tbody><tr>
<td width="6%"></td>
<td width="18%" align="center" valign="middle"><font color="#ffffff" face="verdana" size="1"><b>Numbers</b>
<td width="18%" align="right" valign="middle"><font color="#000000" face="verdana" size="1">**000,000,000,000.00**</font></td>
<td width="18%" align="right" valign="middle"><font color="#000000" face="verdana" size="1">**100,100,100,100.00**</font>
<td width="5%" align="center" valign="middle"><font color="#000000" face="verdana" size="1">
<!--<a href="javascript:document.scform.submit();" onmouseover="sctest('0479281963'); window.status='Account Details'; return true;">-->
<!-- INSERT BUILDMENU - APSMITH -->
<script>BuildMenu_SCPHP(0,'')</script>
<a onmouseover="showmenu(event,linksetSCPHP[0]); sctest(479281963, 'IM'); window.status='Account Details';" onmouseout="delayhidemenu()" href="javascript:document.scform.submit();">
<!-- END BUILDMENU - APSMITH -->
<img width="21" height="17" src="/images/detail2.gif" border="0"></a>
</font>
</td>
</tr>
</tbody></table></td>
<td width="3%"></td>
</tr></tbody></table></form>
I want to get the values from the td elements, which are 000,000,000,000.00 and 100,000,000,000.00, but have had no luck.
Here's what I tried:
Dim IE As New InternetExplorer
Dim Doc As HTMLDocument
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
'Navigate to Website
IE.navigate "https://secure1.bpiexpressonline.com/AuthFiles/login.aspx?URL=/direct_signin.htm"
'Loop until page load complete
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Set Doc = IE.document
Doc.getElementById("UserID").Value = Range("E23").Value
Doc.getElementById("Password").Value = Range("E24").Value
Doc.getElementById("login").submit
'Loop until page load complete
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
'Dim tb As Object, tr As Object, th As Object
Dim tb As Object
Set tb = Doc.getElementsByTagName("AcctNo")
What should I do here? I tried getElementsById(td)(1) and so on, but with no luck.
Using getElementsById(td)(n) gives no error, but the output is wrong. Can someone help me or teach me how to parse a form like this?
Thanks in advance.

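As I understand it, you are having difficulty constructing the path to the desired element.
1) Add an id attribute to your table row element, so it becomes:
<table width="100%" border="0" cellpadding="3">
<tbody><tr id="row_1">
2) Now you can use:
Dim row As Object
Set row = Doc.getElementById("row_1")
3) Now you can retrieve your values like this (collection indexes are zero-based, so td (2) and (3) are the third and fourth cells, and font (0) is the first font element inside each):
row.getElementsByTagName("td")(2).getElementsByTagName("font")(0).innerText
row.getElementsByTagName("td")(3).getElementsByTagName("font")(0).innerText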
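If the page cannot be edited to add an id, a rough alternative is to list every cell under the form by index and then read the ones that hold the amounts. The following is only a sketch: it assumes IE is the InternetExplorer instance from the question and that the page reached after login contains the scform form shown above. ListFormCells is a hypothetical name. It re-reads IE.document, because a Doc variable captured before the login submit still refers to the login page.

```vba
' Sketch only: prints the index and text of every <td> inside the "scform" form
' so the cells holding the amounts can be identified by index.
' Assumes IE has finished loading the page that contains the form.
Sub ListFormCells(ByVal IE As Object)
    Dim Doc As Object
    Dim frm As Object
    Dim cells As Object
    Dim i As Long

    ' Re-read the document after navigation; an older reference points at the previous page
    Set Doc = IE.document

    Set frm = Doc.forms.Item("scform")
    If frm Is Nothing Then
        Debug.Print "Form 'scform' was not found on this page"
        Exit Sub
    End If

    Set cells = frm.getElementsByTagName("td")
    For i = 0 To cells.Length - 1
        Debug.Print i, Trim$(cells.Item(i).innerText)
    Next i
End Sub
```

Once the Immediate window shows which indexes hold the two amounts, those indexes can be used directly, for example cells.Item(n).innerText.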

Related

QuerySelectorAll - returns null

I'm having a weird problem with Excel build version and a macro.
Basically, the macro reads a html page and does a queryselectorAll for a specific table ID and tag.
The code runs smooth as butter on my machine and for other users.
We all have version 16.0 (Office 365) and build 13801 (from application.version and application.build).
The user for which the macro doesn't work has the build 14326.
Did anything change between the two build versions? I'm confused.
The error is generated by the object inputBoxes that remains Null.
All the users are printing the line Debug.Print html.body.innerHTML just fine.
I've tried with a different selector html.querySelectorAll("td") but the error remains the same.
The macro goes :
Public Sub getGDIN(cp12 As String, adl As Integer, frm As UserForm)
Dim http As New XMLHTTP60
Dim html As New HTMLDocument
Dim response As String
With http
.Open "GET", "http://me.intra-dmu-13/gdi.asp?Type=P&cp12=" & cp12, False
.send
If .Status <> 200 Then
MsgBox "Unable to load the requested page (GDI)", vbCritical
Exit Sub
End If
response = StrConv(.responseBody, vbUnicode)
End With
html.body.innerHTML = response
Dim inputBoxes As Object
Dim i As Long
Dim ctrl As Control
Set inputBoxes = html.querySelectorAll("#Table3 td")
'Set inputBoxes2 = html.querySelectorAll("td") <--- same error
Debug.Print html.body.innerHTML ' <--- print the whole html (both build version)
If inputBoxes.Length < 32 Then ' <--- ERROR NULL
MsgBox "Nothing has been found with this file number.", vbInformation
frm.btnGenerer.Enabled = False
For Each ctrl In frm.Controls
If ctrl.Name Like "tb*adl" & adl And Not ctrl.Name = "tbCp12_adl" & adl Then
ctrl = vbNullString
End If
Next ctrl
Exit Sub
End If
' ...
End Sub
In addition, here is the generated html (since it's from an Intranet) :
<TABLE id=Table3 cellSpacing=0 cellPadding=0 width="100%" border=0>
<TBODY>
<TR height=10>
<TD class=cell-login height=10><IMG border=0 alt="" src="about:../../Images/Spacer.gif" width=1 height=10></TD>
</TR>
<TR>
<TD class=cell-login>
<TABLE style="FONT: 11px Arial, Helvetica, sans-serif" cellSpacing=2 cellPadding=2 border=0>
<TBODY>
<TR>
<TD><B>CP-12</B></TD>
<TD>:</TD>
<TD>BBBB01018097<INPUT type=hidden value=BBBB01018097 name=CP12></TD>
</TR>
<TR>
<TD><B>Nom</B></TD>
<TD>:</TD>
<TD>Byer<INPUT type=hidden value=Byer name=Nom></TD>
</TR>
<TR>
<TD><B>Prenom</B></TD>
<TD>:</TD>
<TD>Joe<INPUT type=hidden value=Joe name=Prenom></TD>
</TR>
<TR>
<TD><B>Titre</B></TD>
<TD>:</TD>
<TD>monsieur<INPUT type=hidden value=monsieur name=Sexe></TD>
</TR>
<TR>
<TD><B>Date of birth</B></TD>
<TD>:</TD>
<TD>1980-01-01<INPUT type=hidden value=1980-01-01 name=DB></TD>
</TR>
<TR>
<TD><B>SIN</B></TD>
<TD>:</TD>
<TD>121-111-111<INPUT type=hidden value=121-111-111 name=SIN></TD>
</TR>
<TR>
<TD><B>Langue</B></TD>
<TD>:</TD>
<TD>Anglais<INPUT type=hidden value=Anglais name=Langue></TD>
</TR>
<TR>
<TD><B>Adresse</B></TD>
<TD>: </TD>
<TD>9999, CH DE LA COTE-SAINT-LUC APP.9
<BR>MONTREAL (QUEBEC) H1X 7B8
<INPUT type=hidden value="9999, CH DE LA COTE-SAINT-LUC APP.9" name=Adresse>
<INPUT type=hidden value=MONTREAL name=Ville>
<INPUT type=hidden value="H1X 7B8" name=CP>
</TD>
</TR>
<TR>
<TD><B>Telephone</B></TD>
<TD>:</TD>
<TD>(000) 000-0000<INPUT type=hidden value="(000) 000-0000" name=Telephone></TD>
</TR>
<!--
<TR>
<TD colspan=3 align=center><img src="../../Images/fleche-connect.gif" alt="" align="absmiddle" border="0"> <a class="lien" href="javascript:AncAdr('BYEJ09027497');">Plus...</a></TD>
</TR>
-->
</TBODY>
</TABLE>
</TD>
</TR>
</FORM>
</TBODY>
</TABLE>
The variable cp12 passed into the sub is simply a string that contains a file number to query our ministry database.
If anyone can shed some light ... or maybe suggest a different approach.
Thank you.

Excel VBA Web Scraping Table Elements from a <frameset> and a <frame>

I am trying to scrape some table-looking items from a website into Excel.
I'm no stranger to coding in general, though I'm pretty new to VBA in an Excel sense :)
I have tried using Excel's Data > From Web interface, but it's not recognizing the table. I'm guessing that's because the page is built using <frameset> and <frame> (or at least that's what my Google-Fu has led me to understand).
Snippet of what the second table looks like:
<html>
<frame title="links" ...>...</frame>
<frame title="queue">
#document
<head>...</head>
<body>
<div id="container">
<script>...</script>
<div>
<table id="oTable">
<colgroup>...</colgroup>
<thead>...</thead>
<tbody>
<tr onclick="changeHighlight( 'eid0' )" id="eid0" class="queryshaded">
<td nowrap=""><a onclick="javascript:window.open('IWViewer.jsp?id=3.5599976.5599976');" title="Open Image" href="javascript:doNothing();"><img title="Open Image" border="0" alt="Open Image" src="URL.gif"></a> <a onclick="javascript:window.open('URL','_newtab');" title="Open Workitem" href="javascript:doNothing();"><img title="Open Workitem" border="0" alt="Open Workitem" src="URL.gif"></a>
</td><td scope="row" nowrap="">12345</td>
<td nowrap="">28/08/2018 17:00:49</td>
<td nowrap="">11/09/2018 16:28:39</td>
<td nowrap="">5,599,976</td>
<td nowrap="">dijm</td></tr>
<tr onclick="changeHighlight( 'eid1' )" id="eid1" class="queryunshaded">
<td nowrap=""><a onclick="javascript:window.open('IWViewer.jsp?id=3.6443276.6443276');" title="Open Image" href="javascript:doNothing();"><img title="Open Image" border="0" alt="Open Image" src="URL.gif"></a> <a onclick="javascript:window.open('URL;id=3.6443276.6443276','_newtab');" title="Open Workitem" href="javascript:doNothing();"><img title="Open Workitem" border="0" alt="Open Workitem" src="URL.gif"></a>
</td><td scope="row" nowrap="">67890</td>
<td nowrap="">25/06/2019 11:01:01</td>
<td nowrap="">09/07/2019 10:32:32</td>
<td nowrap="">6,443,276</td>
<td nowrap=""></td></tr>
<tr onclick="changeHighlight( 'eid2' )" id="eid2" class="queryshaded">
<td nowrap=""><a onclick="javascript:window.open('IWViewer.jsp?id=3.6443287.6443287');" title="Open Image" href="javascript:doNothing();"><img title="Open Image" border="0" alt="Open Image" src="URL.gif"></a> <a onclick="javascript:window.open('URL;id=3.6443287.6443287','_newtab');" title="Open Workitem" href="javascript:doNothing();"><img title="Open Workitem" border="0" alt="Open Workitem" src="URL.gif"></a>
</td><td scope="row" nowrap="">23456</td>
<td nowrap="">25/06/2019 11:01:24</td>
<td nowrap="">09/07/2019 10:35:30</td>
<td nowrap="">6,443,287</td>
<td nowrap=""></td></tr>
<tr onclick="changeHighlight( 'eid3' )" id="eid3" class="queryunshaded">
<td nowrap=""><a onclick="javascript:window.open('IWViewer.jsp?id=3.6443339.6443339');" title="Open Image" href="javascript:doNothing();"><img title="Open Image" border="0" alt="Open Image" src="URL.gif"></a> <a onclick="javascript:window.open('URL;id=3.6443339.6443339','_newtab');" title="Open Workitem" href="javascript:doNothing();"><img title="Open Workitem" border="0" alt="Open Workitem" src="URL.gif"></a>
</td><td scope="row" nowrap="">78901</td>
<td nowrap="">25/06/2019 11:06:02</td>
<td nowrap="">09/07/2019 10:40:39</td>
<td nowrap="">6,443,339</td>
<td nowrap=""></td></tr>
<tr onclick="changeHighlight( 'eid4' )" id="eid4" class="queryshaded">
<td nowrap=""><a onclick="javascript:window.open('IWViewer.jsp?id=3.6443344.6443344');" title="Open Image" href="javascript:doNothing();"><img title="Open Image" border="0" alt="Open Image" src="URL.gif"></a> <a onclick="javascript:window.open('URL;id=3.6443344.6443344','_newtab');" title="Open Workitem" href="javascript:doNothing();"><img title="Open Workitem" border="0" alt="Open Workitem" src="URL.gif"></a>
</td><td scope="row" nowrap="">34567</td>
<td nowrap="">25/06/2019 11:06:17</td>
<td nowrap="">09/07/2019 10:40:43</td>
<td nowrap="">6,443,344</td>
<td nowrap=""></td></tr>
I have tried various solutions that look somewhat like this:
https://www.ozgrid.com/forum/forum/other-software-applications/excel-and-web-browsers-help/131683-extracting-data-from-a-grid-on-webpage
and
Scraping data from website using vba
and trying to define the frames themselves to try and get the info from there?
(again: new to Excel VBA)
'set myHTMLDoc to the main pages IE document
Dim myHTMLDoc As HTMLDocument
Set myHTMLDoc = ie.Document
'set myHTMLFrame2 as the 2nd frame of the main page (index starts at 0)
Dim myHTMLFrame2 As HTMLDocument
Set myHTMLFrame2 = myHTMLDoc.Frames(1).Document
With the above block of code I'm getting a "Run-time error '438'".
Without the above block I'm getting a "Run-time error '1004'".
The info I eventually want is in each row:
</td><td scope="row" nowrap="">67890</td>
<td nowrap="">25/06/2019 11:01:01</td>
<td nowrap="">09/07/2019 10:32:32</td>
<td nowrap="">6,443,276</td>
Ideally I'd like to dump each element into a cell
67890 | 25/06/2019 11:01:01 | 09/07/2019 10:32:32 | 6,443,276
There are 20 of these rows on each page (there's a button to press to get to the next page, which I'll figure out later... hopefully, haha).
A massive pre-emptive thank you to anyone who can help :)
-EDIT-
This is the code that I'm currently working with (not precious about it :P )
Private Sub CommandButton1_Click()
Dim ie As Object
Dim html As Object
Dim objElementTR As Object
Dim objTR As Object
Dim objElementsTD As Object
Dim objTD As Object
Dim result As String
Dim intRow As Long
Dim intCol As Long
Set ie = CreateObject("InternetExplorer.Application")
ie.Navigate "URL"
ie.Visible = True ' loop until page is loaded
Do Until (ie.ReadyState = 4 And Not ie.Busy)
DoEvents
Loop
'set myHTMLDoc to the main pages IE document
Dim myHTMLDoc As HTMLDocument
Set myHTMLDoc = ie.Document
'set myHTMLFrame2 as the 2nd frame of the main page (index starts at 0)
Dim myHTMLFrame2 As HTMLDocument
Set myHTMLFrame2 = ie.Document.querySelector("[title=queue]").contentDocument.getElementById("oTable")
result = myHTMLFrame2
Set html = CreateObject("htmlfile")
myHTMLFrame2 = result
Set objElementTR = html.getElementsByTagName("tr")
ReDim myarray(0 To objElementTR.Length, 0 To 10)
For Each objTR In objElementTR
intRow = intRow + 1
Set objElementsTD = objTR.getElementsByTagName("td")
For Each objTD In objElementsTD
myarray(intRow, intCol) = objTD.innerText
intCol = intCol + 1
Next objTD
intCol = 0
Next objTR
With Sheets(1).Cells(1, 1).Cells(Rows.Count, "A").End(xlUp).Offset(1, 0)
.Resize(UBound(myarray), UBound(myarray, 2)).Value = myarray
End With
End Sub
You could try isolating the frame by its title attribute, then go via contentDocument and get the table by id:
ie.document.querySelector("[title=queue]").contentDocument.querySelector("#oTable")
The final .querySelector("#oTable") can be interchanged with .getElementById("oTable").
I would then copy the table's .outerHTML to the clipboard so that the table can be pasted directly into the sheet.
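As an alternative to the clipboard, the table can be walked row by row and written cell by cell. This is a minimal sketch, not a tested solution for the asker's site: it assumes ie is an InternetExplorer instance whose page has finished loading, that the frame's contentDocument is accessible (same-origin frame), and that writing to the first worksheet starting at A1 is acceptable. DumpQueueTable is a hypothetical name.

```vba
' Sketch only: copies every cell of the table with id "oTable", found inside the
' frame titled "queue", into the first worksheet starting at A1.
Sub DumpQueueTable(ByVal ie As Object)
    Dim frameEl As Object
    Dim tbl As Object
    Dim rw As Object
    Dim cl As Object
    Dim r As Long
    Dim c As Long

    Set frameEl = ie.document.querySelector("[title=queue]")
    If frameEl Is Nothing Then Exit Sub

    Set tbl = frameEl.contentDocument.getElementById("oTable")
    If tbl Is Nothing Then Exit Sub

    r = 1
    For Each rw In tbl.Rows
        c = 1
        For Each cl In rw.Cells
            ThisWorkbook.Worksheets(1).Cells(r, c).Value = cl.innerText
            c = c + 1
        Next cl
        r = r + 1
    Next rw
End Sub
```

Because the first cell of each data row only contains the two image links, its text will be blank; that column can be skipped or deleted afterwards.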

how to find location of specific <tr> each time code is run

My code below will extract a value for each hour of the day.
However, the webpage I'm scraping can change, so I want to find a way to assign the location of the <tr> to a variable so that the code knows its index every time. I found the current number, "116", by trial and error.
I included the html structure below as well. Any suggestions?
Sub scrape()
Dim IE As Object
Set IE = CreateObject("InternetExplorer.application")
With IE
.Visible = False
.navigate "web address"
Do Until .readyState = 4
DoEvents
Loop
.document.all.item("Login1_UserName").Value = "user"
.document.all.item("Login1_Password").Value = "pw"
.document.all.item("Login1_LoginButton").Click
Do Until .readyState = 4
DoEvents
Loop
End With
Dim htmldoc As Object
Dim r
Dim c
Dim aTable As Object
Dim TDelement As Object
Set htmldoc = IE.document
Dim td As Object
For Each td In htmldoc.getElementsByTagName("td")
On Error Resume Next
If span.Children(0).id = "ctl00_PageContent_grdReport_ctl08_Label50" Then
ThisWorkbook.Sheets("sheet1").Range("j8").Offset(r, c).Value = td.Children(1).innerText
End If
On Error GoTo 0
Next td
End Sub
HTML:
<form name="aspnetForm" id="aspnetForm" action="./MinMaxReport.aspx"
method="post">
<div>
</div>
<script type="text/javascript">...</script>
<div>
</div>
<table class="header-table">...</table>
<table class="page-area">
<tbody>
<tr>
<table id="ctl00_PageContent_Table1" border="0">...</table>
<table id="ctl00_PageContent_Table2" border="0">
<tbody>
<tr>
<td>
<div id="ctl00_PageContent_grdReport_div">
<tbody>
<tr style="background-color: beige;">
<td>...</td>
<td>
<span id="ctl00_PageContent_grdReport_ctl08_Label50">Most Restrictive
Capacity Maximum</span>
</td>
<td>
<span id="ctl00_PageContent_grdReport_ctl08_Label51">159</span>
</td>
</tr>
</tbody>
</div>
</td>
</tr>
</tbody>
</table>
</table>
</tr>
</tbody>
</table>
You could loop through all the TDs and check whether the first child's id is "ctl00_PageContent_grdReport_ctl08_Label50", for example:
For Each td In htmldoc.getElementsByTagName("td")
'skip cells that have no child elements
If td.Children.Length > 0 Then
If td.Children(0).ID = "ctl00_PageContent_grdReport_ctl08_Label50" Then
'in the posted HTML the value is in the next cell of the same row
ThisWorkbook.Sheets("sheet1").Range("j8").Offset(r, c).Value = td.parentElement.Cells(td.cellIndex + 1).innerText
End If
End If
Next td
Children(0) picks the first HTML element contained in the table cell. Checking Children.Length first covers the case where a td element has no children, which avoids relying on On Error Resume Next.
It is possible that more than one element on the page has this id. In that case, you must identify the table or table row first. I cannot do that because I can't see your whole HTML.
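If the generated ids (such as the ctl08 part) also shift when the page changes, matching on the label text may be more stable than matching on the id. This is a sketch under that assumption, and it relies on the value being in the cell immediately to the right of the label, as in the posted HTML. ValueNextToLabel is a hypothetical helper name.

```vba
' Sketch only: returns the text of the cell to the right of the first <span>
' whose text matches labelPattern (VBA Like syntax), or "" if none is found.
Function ValueNextToLabel(ByVal doc As Object, ByVal labelPattern As String) As String
    Dim sp As Object
    Dim cell As Object
    Dim rw As Object

    For Each sp In doc.getElementsByTagName("span")
        If Trim$(sp.innerText) Like labelPattern Then
            Set cell = sp.parentElement
            If UCase$(cell.tagName) = "TD" Then
                Set rw = cell.parentElement
                ' Only read the neighbour if the label is not the last cell in its row
                If cell.cellIndex + 1 < rw.Cells.Length Then
                    ValueNextToLabel = Trim$(rw.Cells.Item(cell.cellIndex + 1).innerText)
                    Exit Function
                End If
            End If
        End If
    Next sp
End Function
```

A call could look like ValueNextToLabel(htmldoc, "Most Restrictive*Capacity Maximum"); the * allows for the line break inside the label text.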

Web Scraping with Internet Explorer VBA - Get data from an unknown variable?

I am working on an Excel VBA project to scrape some specific information from a website. The view of this data on the website is as such:
Website View:
What I am looking to do is extract text based on two criteria: Name and post date. For example, I have the name Kaelan and the post date of 11/16/2016. I want to extract the amount of $365.
This is the HTML code:
<div class="familyLedgerAmountCategory" id="id_4541278">
<table>
<tr>
<td class="tdCategoryRow">
<div class="cmFloatLeft divExpandToggle expanded" id="divCategoryToggle_id_4541278"></div>
<div class="cmFloatLeft" id="divCategoryLabel_id_4541278" style="width: 430px;">
Kaelan
</div><span style="margin-left: 5px;">$ 465.00</span>
</td>
</tr>
<tbody>
<tr class="trListTableBody LedgerExisting" id="CamperFamilyLedgerRowControl_14816465">
<td class="tdCamperFamilyLedgerTableColumnDescription tdBorderTop" id="tdCamperFamilyLedgerTableColumnDescription_CamperFamilyLedgerRowControl_14816465">
<div class="divListTableBodyCell" id="tdColumnDescriptionCell">
<table class="tblListTableBodyCell">
<tr>
<td>
<div class="divListTableBodyLabel">
<a class="aColumnDescriptionCell" id="aColumnDescriptionCell_CamperFamilyLedgerRowControl_14816465" name="aColumnDescriptionCell_CamperFamilyLedgerRowControl_14816465" target="_self" title="Click to view details">2017 Super Early Bird Teen Camp - Tuition</a>
</div>
</td>
</tr>
</table>
</div>
</td>
<td class="tdCamperFamilyLedgerTableColumnPostDate tdBorderTop" id="tdCamperFamilyLedgerTableColumnPostDate_CamperFamilyLedgerRowControl_14816465">
<div class="divListTableBodyCell" id="tdColumnPostDateCell">
<table class="tblListTableBodyCell">
<tr>
<td>
<div class="divListTableBodyLabel">
11/16/2016
</div>
</td>
</tr>
</table>
</div>
</td>
<td class="tdCamperFamilyLedgerTableColumnEffective tdBorderTop" id="tdCamperFamilyLedgerTableColumnEffective_CamperFamilyLedgerRowControl_14816465">
<div class="divListTableBodyCell" id="tdColumnEffectiveCell">
<table class="tblListTableBodyCell">
<tr>
<td>
<div class="divListTableBodyLabel">
11/15/2016
</div>
</td>
</tr>
</table>
</div>
</td>
<td class="tdCamperFamilyLedgerTableColumnQty tdBorderTop" id="tdCamperFamilyLedgerTableColumnQty_CamperFamilyLedgerRowControl_14816465">
<div class="divListTableBodyCell" id="tdColumnQtyCell">
<table class="tblListTableBodyCell">
<tr>
<td>
<div class="divListTableBodyLabel">
1
</div>
</td>
</tr>
</table>
</div>
</td>
<td class="tdCamperFamilyLedgerTableColumnAmount tdBorderTop" id="tdCamperFamilyLedgerTableColumnAmount_CamperFamilyLedgerRowControl_14816465">
<div class="divListTableBodyCell" id="tdColumnAmountCell">
<table class="tblListTableBodyCell">
<tr>
<td>
<div class="divListTableBodyLabel">
$ 365.00
</div>
</td>
</tr>
</table>
</div>
</td>
<td class="tdCamperFamilyLedgerTableColumnAction tdBorderTop" id="tdCamperFamilyLedgerTableColumnAction_CamperFamilyLedgerRowControl_14816465"></td>
</tr>
</tbody>
</table>
</div>
My attempt to pull the amount is as follows:
Sub Test()
Dim ie As Object
Dim oElement As Object
Dim wsTarget As Worksheet
Dim i As Integer
Dim NewWB As Workbook
Set NewWB = ActiveWorkbook
Set wsTarget = NewWB.Sheets(1)
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
ie.navigate website...
Wait 6
ie.document.All.Item("txtUserName").Value = "User"
ie.document.All.Item("pswdPassword").Value = "Pass"
Wait 1
ie.document.getElementById("btnLogin").Click
Wait 5
ie.navigate website...
i = 1
For Each oElement In ie.document.getElementsByClassName("cmFloatLeft")
If oElement.innerText = "Kaelan" Then
extract1 = oElement.getElementsByClassName("divListTableBodyLabel").innerText
MsgBox extract1
Else
End If
Next
However, I get an error when running the code above. Can I find the cmFloatLeft element that I am looking for and then go straight to the divListTableBodyLabel class, even though that class is not nested directly below the cmFloatLeft element?
Sorry, I'm still pretty new to scraping web data.
Thanks
That structure is a bit difficult to scrape. You could try going "up" from the "Kaelan" node to the parent table, and then looping over that table to extract the various pieces of information. If the post structures are consistent, that could provide one approach.
Set doc = IE.document
Set els = doc.getElementsByClassName("cmFloatLeft")
i = 1
For Each oElement In els
Debug.Print oElement.innerText
If Trim(oElement.innerText) = "Kaelan" Then
Set tbl = GetParent(oElement, "table") '<< find the parent table
If Not tbl Is Nothing Then
'loop over the parent table
For Each rw In tbl.Rows
For Each cl In rw.Cells
Debug.Print cl.innerText
Next cl
Next rw
End If
End If
Next
Function to find a named parent (by tag name):
Function GetParent(el, tagParent)
Dim rv As Object
Set rv = el
Do While Not rv.parentElement Is Nothing
Set rv = rv.parentElement
If UCase(rv.tagName) = UCase(tagParent) Then
Set GetParent = rv
Exit Function
End If
Loop
Set GetParent = Nothing
End Function
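To go from the parent table to the amount for one post date, the row cells can be matched on the class names that appear in the question's HTML. This is only a sketch: it assumes every ledger row carries cells whose class names contain "ColumnPostDate" and "ColumnAmount", as in the posted markup. AmountForPostDate and CleanText are hypothetical helper names.

```vba
' Sketch only: scans the rows of tbl (the table returned by GetParent) and returns
' the amount text from the first row whose post-date cell equals postDate.
Function AmountForPostDate(ByVal tbl As Object, ByVal postDate As String) As String
    Dim rw As Object
    Dim cl As Object
    Dim rowDate As String
    Dim rowAmount As String

    For Each rw In tbl.Rows
        rowDate = vbNullString
        rowAmount = vbNullString
        For Each cl In rw.Cells
            If InStr(cl.className, "ColumnPostDate") > 0 Then rowDate = CleanText(cl.innerText)
            If InStr(cl.className, "ColumnAmount") > 0 Then rowAmount = CleanText(cl.innerText)
        Next cl
        If rowDate = postDate And Len(rowAmount) > 0 Then
            AmountForPostDate = rowAmount
            Exit Function
        End If
    Next rw
End Function

' Removes line breaks and surrounding spaces from text read out of nested markup
Private Function CleanText(ByVal s As String) As String
    CleanText = Trim$(Replace(Replace(s, vbCr, vbNullString), vbLf, vbNullString))
End Function
```

Inside the loop from the answer, a call such as Debug.Print AmountForPostDate(tbl, "11/16/2016") would then print the amount text of the matching row, if there is one.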

How to use getElementsByTagName with <td> with overflow: hidden on VBA?

I am using VBA automation to get some information from a ticket system at my job. I am trying to write the values from the generated table to column "A" on sheet "Plan1", but the only data that doesn't get there is in the <td> elements that have the overflow: hidden CSS attribute. I don't know whether the two are related, but coincidentally those are the only values that don't appear. Can someone help me?
HTML code:
<div id="posicionamentoContent">
<table class="grid">
<thead>...</thead>
<tbody>
<tr id="937712" class="gridrow">
<td width="200px"> Leonardo Peixoto </td>
<td width="200px"> 23/12/2015 09:45 </td>
<td width="200px"> SIM </td>
<td width="200px"> Telhado da loja com pontos de vazamento.</td>
<td width="200px" align="center"></td>
<td width="200px" align="center"></td>
</tr>
...
...
...
The complete code: http://i.stack.imgur.com/4BsFo.png
I need to get the text of the first 4 <td> elements (Leonardo Peixoto, 23/12/2015 09:45, SIM and Telhado da loja com pontos de vazamento.), but those are exactly the texts I can't get.
Note: When I use the developer tools (F12) to inspect each element, they show the information I need inside the <td> elements. But when I open the page's source code to check the HTML, it looks like this:
<div id="tabPosicionamento" style="padding: 5px 0 5px 0;" class="ui-tabs-hide">
<div id="posicionamentoContent"></div>
</div>
Example VBA:
Sub extractTablesData1()
'we define the essential variables
Dim IE As Object, obj As Object
Dim ticket As String
Set IE = CreateObject("InternetExplorer.Application")
ticket= InputBox("Enter the ticket code")
With IE
.Visible = False
.navigate ("https://www.example.com/details/") & ticket
While IE.ReadyState <> 4
DoEvents
Wend
ThisWorkbook.Sheets("Plan1").Range("A1:K500").ClearContents
Set data = IE.document.getElementsByClassName("thead")(0).getElementsByTagName("td")
i = 0
For Each elemCollection In data
ThisWorkbook.Sheets("Plan1").Range("A" & i + 1) = data(i).innerText
i = i + 1
Next elemCollection
End With
IE.Quit
Set IE = Nothing
....
....
End Sub
This function returns in column "A" of sheet Plan1 only <td class="info3"></td> and <td class="info4"></td>, but I also need <td class="info1"></td> and <td class="info2"></td>.
I wasn't able to read the page code because the proxy is blocking me, but I faced a similar issue a while ago, and the solution I found was to put all the data on the clipboard and paste it. After that, I clean up the data on the sheet.
Here is the code I used to do that:
Set ieTable = ie.document.getElementById("ID")
If Not ieTable Is Nothing Then
'DataObject requires a reference to the Microsoft Forms 2.0 Object Library
Set clip = New DataObject
clip.SetText "<html>" & ieTable.outerHTML & "</html>"
clip.PutInClipboard
Sheet1.Range("A1").Select
ActiveSheet.PasteSpecial Format:="Unicode Text", link:=False, DisplayAsIcon:=False, NoHTMLFormatting:=True
End If
Considering that you need to isolate the 4 td elements, you can do that with a loop for every search.
Your sample iterates over Data but never uses the loop variable. Also, the cell assignment should be Cells(x, y).Value. Here is the working code:
Sub extractTablesData1()
'we define the essential variables
Dim IE As Object, Data As Object
Dim ticket As String
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = False
.navigate ("put your data url here")
While IE.ReadyState <> 4
DoEvents
Wend
Set Data = IE.document.getElementsByTagName("tr")(0).getElementsByTagName("td")
i = 1
For Each elemCollection In Data
ActiveWorkbook.Sheets(1).Cells(1, i).Value = elemCollection.innerHTML
i = i + 1
Next elemCollection
End With
IE.Quit
Set IE = Nothing
End Sub
It doesn't return the information I need (the last <td> elements):
<div id="posicionamentoContent">
<table class="grid">
<thead>...</thead>
<tbody>
<tr id="937712" class="gridrow">
<td width="200px"> Leonardo Peixoto </td>
<td width="200px"> 23/12/2015 09:45 </td>
<td width="200px"> SIM </td>
<td width="200px"> Telhado da loja com pontos de vazamento.</td>
<td width="200px" align="center"></td>
<td width="200px" align="center"></td>
</tr>
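The observation in the question, that the rows are visible in the developer tools while the page source shows an empty posicionamentoContent div, suggests the table is inserted by script after the page reports it has loaded. In that case, waiting for readyState = 4 alone may not be enough. The following is a minimal sketch of polling for the cells before reading them; WaitForCells is a hypothetical helper name and the 10-second default timeout is an arbitrary assumption.

```vba
' Sketch only: polls until the element with id containerId contains at least one
' <td>, or until timeoutSeconds has elapsed. Returns True if cells were found.
' Note: Timer resets at midnight, so a wait that spans midnight may end early.
Function WaitForCells(ByVal doc As Object, ByVal containerId As String, _
                      Optional ByVal timeoutSeconds As Double = 10) As Boolean
    Dim container As Object
    Dim deadline As Double

    deadline = Timer + timeoutSeconds
    Do
        DoEvents
        Set container = doc.getElementById(containerId)
        If Not container Is Nothing Then
            If container.getElementsByTagName("td").Length > 0 Then
                WaitForCells = True
                Exit Function
            End If
        End If
    Loop While Timer < deadline
End Function
```

It could be called right after the readyState loop, for example If WaitForCells(IE.document, "posicionamentoContent") Then, before reading the td elements.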