Working code gives error when run on any other PC - html

I have working code that requests information from a website.
When I send the file to another PC and run the code, I get:
"Run-time error '91': Object variable or With block variable not set"
I ensured:
Macro security levels are the same (Enable all macros & trust access to VBA project object model)
All the checked boxes in VBA editor > Tools > References are the same (specifically, Microsoft HTML Object Library and Microsoft XML, v6.0 are checked)
Sub Macro1()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim Current As Variant
website = "https://www.thalia.de/shop/home/artikeldetails/A1062020980"
Set request = CreateObject("MSXML2.XMLHTTP")
request.Open "GET", website, False
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
request.send
response = StrConv(request.responseBody, vbUnicode)
html.body.innerHTML = response
Current = html.getElementsByClassName("element-text-standard value").Item(0).innerText
MsgBox (Current)
End Sub
The line on which I get the error:
Current = html.getElementsByClassName("element-text-standard value").Item(0).innerText

WinHttp
I tried a number of solutions. In the end, all it took to make it work on my computer was replacing MSXML2.XMLHTTP with WinHttp.WinHttpRequest.5.1. While researching, I also rewrote the code a little. I'm new to this, so I can't explain why one works and the other does not.
Option Explicit
Sub Macro1()
Const URL As String _
= "https://www.thalia.de/shop/home/artikeldetails/A1062020980"
'Const URL As String _
= "https://www.thalia.de/shop/home/artikeldetails/A1060523771"
Const ClassName As String _
= "element-text-standard value"
Dim WhrResponseText As String
WhrResponseText = GetWhrResponseText(URL)
If Len(WhrResponseText) = 0 Then
MsgBox "Could not get a response.", vbExclamation
Exit Sub
End If
' ' Write the response string to a worksheet.
' Dim ws As Worksheet: Set ws = ActiveSheet ' improve!
' Dim arr() As String: arr = Split(WhrResponseText, vbLf)
' ws.Range("A1").Resize(UBound(arr) + 1).Value = Application.Transpose(arr)
Dim Elements As Object
With CreateObject("htmlfile")
.body.innerHTML = WhrResponseText
Set Elements = .getElementsByClassName(ClassName)
End With
' Using 'Length' to determine if a result was found and returning
' the first element.
Dim Result As Variant
With Elements
If .Length > 0 Then
Result = .Item(0).innerText
MsgBox Result
Else
MsgBox "Nothing found."
End If
End With
Dim i As Long
' Loop through the elements using 'For Each... Next'.
Dim Element As Object
For Each Element In Elements
Debug.Print i, Element.innerText
i = i + 1
Next Element
' ' Loop through the elements using 'For... Next'.
' With Elements
' For i = 0 To .Length - 1
' Debug.Print i, .Item(i).innerText
' Next i
' End With
End Sub
Function GetWhrResponseText( _
ByVal URL As String) _
As String
Const ProcName As String = "GetWhrResponseText"
On Error GoTo ClearError
With CreateObject("WinHttp.WinHttpRequest.5.1")
.Open "GET", URL, False
.send
GetWhrResponseText = StrConv(.responseBody, vbUnicode)
End With
ProcExit:
Exit Function
ClearError:
Debug.Print "'" & ProcName & "' Run-time error '" _
& Err.Number & "':" & vbLf & " " & Err.Description
Resume ProcExit
End Function

Related

Excel VBA - Error 91 problem when HTML value is nothing

Hi, I recently discovered Excel VBA and am using it to aid my study of German.
I have a list of German words but no meanings, parts of speech, example sentences, etc.
I wrote a macro to go to a website (https://dictionary.cambridge.org/dictionary/german-english/) and fetch the HTML data.
However, for some words no example sentences are provided (hence the HTML returns no value and I get error 91).
I have referred to other posts about this and added If Not HTMLDoc.getElementsByClassName() Is Nothing Then statements, but no luck so far.
Could you please tell me how to write the code so that, if there is no HTML value, the macro moves on to the next word? (The word is set by an integer corresponding to the row number in the Excel sheet.)
Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim i As Integer
Dim strURL As String
For i = 2 To 3493
strURL = "https://dictionary.cambridge.org/dictionary/german-english/" & Range("A" & i)
XMLReq.Open "Get", strURL, False
XMLReq.send
If XMLReq.Status <> 200 Then
MsgBox "Error."
Exit Sub
End If
HTMLDoc.body.innerHTML = XMLReq.responseText
Set XMLReq = Nothing
'Part
If IsObject(HTMLDoc.getElementsByClassName("pos dpos")) Then
Range("B" & i) = HTMLDoc.getElementsByClassName("pos dpos")(0).innerText
End If
'Meaning
If IsObject(HTMLDoc.getElementsByClassName("ddef_h")) Then
Range("C" & i) = HTMLDoc.getElementsByClassName("ddef_h")(0).innerText
End If
'ExampleGer
If Not HTMLDoc.getElementsByClassName("eg deg") Is Nothing Then
i = i + 1
Else
Range("D" & i) = HTMLDoc.getElementsByClassName("eg deg")(0).innerText
End If
'ExampleEng
If Not HTMLDoc.getElementsByClassName("trans dtrans hdb") Is Nothing Then
i = i + 1
Else
Range("E" & i) = HTMLDoc.getElementsByClassName("trans dtrans hdb")(0).innerText
End If
Next i
End Sub
OK, I'm German and therefore did not need any example words.
A word that delivers all 4 values: Haus (house)
A word that delivers only 2 values: Gummibaum (rubber plant)
Try the following code, and please...
NEVER! NEVER! NEVER! manipulate the counter variable of a For loop inside the loop body. Never write i = i + 1 if i is the counter variable of the For loop. If you do, you will run into problems in 99.9% of cases.
Sub Dictionary()
Dim XMLReq As New MSXML2.XMLHTTP60
Dim HTMLDoc As New MSHTML.HTMLDocument
Dim i As Integer
Dim strURL As String
'Use object variables for each node you want to read from the DOM tree
'In the code below, these variables are then used to check whether an object exists or not
Dim nodePart As Object
Dim nodeMeaning As Object
Dim nodeExampleGer As Object
Dim nodeExampleEng As Object
For i = 2 To 3493
strURL = "https://dictionary.cambridge.org/dictionary/german-english/" & Range("A" & i)
'strURL = "https://dictionary.cambridge.org/dictionary/german-english/haus"
XMLReq.Open "Get", strURL, False
XMLReq.send
If XMLReq.Status <> 200 Then
MsgBox "Error."
Exit Sub
End If
HTMLDoc.body.innerHTML = XMLReq.responseText
Set XMLReq = Nothing
'Part
Set nodePart = HTMLDoc.getElementsByClassName("pos dpos")(0)
If Not nodePart Is Nothing Then
Range("B" & i) = nodePart.innerText
End If
'Meaning
Set nodeMeaning = HTMLDoc.getElementsByClassName("ddef_h")(0)
If Not nodeMeaning Is Nothing Then
Range("C" & i) = nodeMeaning.innerText
End If
'ExampleGer
Set nodeExampleGer = HTMLDoc.getElementsByClassName("eg deg")(0)
If Not nodeExampleGer Is Nothing Then
Range("D" & i) = nodeExampleGer.innerText
End If
'ExampleEng
Set nodeExampleEng = HTMLDoc.getElementsByClassName("trans dtrans hdb")(0)
If Not nodeExampleEng Is Nothing Then
Range("E" & i) = nodeExampleEng.innerText
End If
Next i
End Sub
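Why the checks in the question never fired: getElementsByClassName returns a collection object even when nothing matches, so IsObject(...) is always True and "... Is Nothing" is always False for the collection itself. The test has to be made on the first item (as in the code above) or on the collection's Length. A minimal sketch of the Length variant wrapped in a helper follows. The name FirstInnerText is made up for illustration, and the sketch assumes the Microsoft HTML Object Library reference that the code above already uses.

```vba
Option Explicit

' Hypothetical helper (not part of the original answer).
' Returns the innerText of the first element with the given class,
' or an empty string when no element matches.
Public Function FirstInnerText(ByVal doc As MSHTML.HTMLDocument, _
                               ByVal className As String) As String
    Dim nodes As Object
    Set nodes = doc.getElementsByClassName(className)
    ' The collection always exists; only its Length tells us whether anything matched.
    If nodes.Length > 0 Then
        FirstInnerText = nodes.Item(0).innerText
    End If
End Function
```

Inside the loop this would be used as Range("D" & i) = FirstInnerText(HTMLDoc, "eg deg"), which leaves the cell empty when the word has no example sentence.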

I'm getting stuck at vba runtime error 424

I'm getting run-time error 424 in the 68th row (line):
request.Open "GET", Url, False
and I don't know how to fix it.
This follows my previous question:
How to scrape specific part of online english dictionary?
My final goal is to get result like this;
A B
beginning bɪˈɡɪnɪŋ
behalf bɪˈhæf
behave bɪˈheɪv
behaviour bɪˈheɪvjər
belong bɪˈlɔːŋ
below bɪˈloʊ
bird bɜːrd
biscuit ˈbɪskɪt
Here's the code I wrote. It's mostly based on someone else's code I found on the internet.
' Microsoft ActiveX Data Objects x.x Library
' Microsoft XML, v3.0
' Microsoft VBScript Regular Expressions
Sub ParseHelp()
' Word reference from
Dim Url As String
Url = "https://www.oxfordlearnersdictionaries.com/definition/english/" & Cells(ActiveCell.Row, "B").Value
' Get dictionary's html
Dim Html As String
Html = GetHtml(Url)
' Check error
If InStr(Html, "<TITLE>Not Found</Title>") > 0 Then
MsgBox "404"
Exit Sub
End If
' Extract phonetic alphabet from HTML
Dim wrapPattern As String
wrapPattern = "<span class='name' (.*?)</span>"
Set wrapCollection = FindRegexpMatch(Html, wrapPattern)
' MsgBox StripHtml(CStr(wrapCollection(1)))
' Fill phonetic alphabet into cell
If Not wrapCollection Is Nothing Then
Dim wrap As String
On Error Resume Next
wrap = StripHtml(CStr(wrapCollection(1)))
If Err.Number <> 0 Then
wrap = ""
End If
Cells(ActiveCell.Row, "C").Value = wrap
Else
MsgBox "not found"
End If
End Sub
Public Function StripHtml(Html As String) As String
Dim RegEx As New RegExp
Dim sOut As String
Html = Replace(Html, "</li>", vbNewLine)
Html = Replace(Html, "&nbsp;", " ")
With RegEx
.Global = True
.IgnoreCase = True
.MultiLine = True
.Pattern = "<[^>]+>"
End With
sOut = RegEx.Replace(Html, "")
StripHtml = sOut
Set RegEx = Nothing
End Function
Public Function GetHtml(Url As String) As String
Dim xmlhttp As Object
Set xmlhttp = CreateObject("MSXML2.serverXMLHTTP")
Dim converter As New ADODB.stream
' Get
request.Open "GET", Url, False
request.send
' raw bytes
converter.Open
converter.Type = adTypeBinary
converter.Write request.responseBody
' read
converter.Position = 0
converter.Type = adTypeText
converter.Charset = "utf-8"
' close
GetHtml = converter.ReadText
converter.Close
End Function
Public Function FindRegexpMatch(txt As String, pat As String) As Collection
Set FindRegexpMatch = New Collection
Dim rx As New RegExp
Dim matcol As MatchCollection
Dim mat As Match
Dim ret As String
Dim delimiter As String
txt = Replace(txt, Chr(10), "")
txt = Replace(txt, Chr(13), "")
rx.Global = True
rx.IgnoreCase = True
rx.MultiLine = True
rx.Pattern = pat
Set matcol = rx.Execute(txt)
'MsgBox "Match:" & matcol.Count
On Error GoTo ErrorHandler
For Each mat In matcol
'FindRegexpMatch.Add mat.SubMatches(0)
FindRegexpMatch.Add mat.Value
Next mat
Set rx = Nothing
' Insert code that might generate an error here
Exit Function
ErrorHandler:
' Insert code to handle the error here
MsgBox "FindRegexpMatch. " & Err.GetException()
Resume Next
End Function
Any kind of help would be greatly appreciated.
The following is an example of how to read in values from column A and write out pronunciations to column B. It uses CSS selectors to match a child node, then steps up to the parentNode to ensure the entire pronunciation is grabbed. There are a number of ways you could have matched on the parent node to get the second pronunciation. Note that I use the parent node and Replace because the pronunciation may span multiple childNodes.
If you are doing this for lots of lookups, please be a good netizen and put some waits in the code so as not to bombard the site with requests.
Option Explicit
Public Sub WriteOutPronounciations()
Dim html As MSHTML.HTMLDocument, i As Long, ws As Worksheet
Dim data As String, lastRow As Long, urls()
Set ws = ThisWorkbook.Worksheets("Sheet1")
lastRow = ws.Cells(ws.rows.Count, "A").End(xlUp).row 'you need at least two words in column A or change the redim.
urls = Application.Transpose(ws.Range("A1:A" & lastRow).Value)
ReDim results(1 To UBound(urls))
Set html = New MSHTML.HTMLDocument
With CreateObject("MSXML2.ServerXMLHTTP")
For i = LBound(urls) To UBound(urls)
.Open "GET", "https://www.oxfordlearnersdictionaries.com/definition/english/" & urls(i), False
.send
html.body.innerHTML = .responseText
data = Replace$(Replace$(html.querySelector(".name ~ .wrap").ParentNode.innerText, "/", vbNullString), Chr$(10), Chr$(32))
results(i) = Right$(data, Len(data) - 4)
Next
End With
With ThisWorkbook.Worksheets(1)
.Cells(1, 2).Resize(UBound(results, 1), 1) = Application.Transpose(results)
End With
End Sub
Required references (VBE>Tools>References):
Microsoft HTML Object Library
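On the run-time error 424 itself: in the question's GetHtml, the HTTP object is stored in xmlhttp, but the calls are made on request, which is never declared or set. Without Option Explicit, request is an empty Variant, so request.Open raises "Object required" (424). Below is a minimal corrected sketch of that function, assuming the same ServerXMLHTTP and ADODB.Stream approach as the question. The stream is late-bound here so that no ADO reference is needed, and the literals 1 and 2 stand for adTypeBinary and adTypeText.

```vba
Option Explicit

' Sketch of the question's GetHtml with one consistently named request object.
Public Function GetHtml(ByVal Url As String) As String
    Dim request As Object
    Set request = CreateObject("MSXML2.ServerXMLHTTP")
    request.Open "GET", Url, False
    request.send

    ' Decode the raw response bytes as UTF-8 via an ADODB.Stream.
    Dim converter As Object
    Set converter = CreateObject("ADODB.Stream")
    converter.Open
    converter.Type = 1                  ' adTypeBinary
    converter.Write request.responseBody
    converter.Position = 0              ' Type can only be changed at position 0
    converter.Type = 2                  ' adTypeText
    converter.Charset = "utf-8"
    GetHtml = converter.ReadText
    converter.Close
End Function
```

Putting Option Explicit at the top of the module makes the compiler flag an undeclared name like request before the code runs.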
Should you go down the API route, here is a small example. You can make 1000 free calls a month with a Prototype account. The next best option, depending on how many calls you wish to make, looks like the 10,001 calls (that one extra PAYG call halves the price). The number of calls will be affected by whether the word is a head word or needs a lemmas lookup call first. The endpoint construction you need is GET /entries/{source_lang}/{word_id}?fields=pronunciations, though that doesn't seem to filter massively. You will need a JSON parser to handle the JSON returned, e.g. github.com/VBA-tools/VBA-JSON/blob/master/JsonConverter.bas. Download the raw code from there, remove the top Attribute line from the copied code, and add it to a standard module called JsonConverter. You then need to go to VBE > Tools > References and add a reference to Microsoft Scripting Runtime.
Option Explicit
Public Sub WriteOutPronounciations()
Dim html As MSHTML.HTMLDocument, i As Long, ws As Worksheet
Dim data As String, lastRow As Long, words()
'If not performing lemmas lookup then must be head word e.g. behave, behalf
Const appId As String = "yourAppId"
Const appKey As String = "yourAppKey"
Set ws = ThisWorkbook.Worksheets("Sheet1")
lastRow = ws.Cells(ws.rows.Count, "A").End(xlUp).row
words = Application.Transpose(ws.Range("A1:A" & lastRow).Value)
ReDim results(1 To UBound(words))
Set html = New MSHTML.HTMLDocument
Dim json As Object
With CreateObject("MSXML2.ServerXMLHTTP")
For i = LBound(words) To UBound(words)
.Open "GET", "https://od-api.oxforddictionaries.com/api/v2/entries/en-us/" & LCase$(words(i)) & "?fields=pronunciations", False
.setRequestHeader "app_id", appId
.setRequestHeader "app_key", appKey
.setRequestHeader "ContentType", "application/json"
.send
Set json = JsonConverter.ParseJson(.responseText)
results(i) = IIf(json("results")(1)("type") = "headword", json("results")(1)("lexicalEntries")(1)("pronunciations")(2)("phoneticSpelling"), "lemmas lookup required")
Set json = Nothing
Next
End With
With ThisWorkbook.Worksheets(1)
.Cells(1, 2).Resize(UBound(results, 1), 1) = Application.Transpose(results)
End With
End Sub

Store JSON results into an array excel vba

I have tried using an API to get some information for Yahoo Finance.
This is the UDF that I created:
Sub Test()
'1 >> High & 2 >> Close
MsgBox YahooHigh("GOOG", "2019-07-18", 1)
MsgBox YahooHigh("GOOG", "2019-07-18", 2)
End Sub
Function YahooHigh(sTicker As String, sDate As String, idx As Integer)
Dim json As Object
With CreateObject("WinHttp.WinHttpRequest.5.1")
.Open "GET", "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=" & sTicker & "&outputsize=full&apikey=myapikey"
.Send
Set json = JsonConverter.ParseJson(.responseText)
End With
If idx = 1 Then
YahooHigh = json("Time Series (Daily)")(sDate)("2. high")
ElseIf idx = 2 Then
YahooHigh = json("Time Series (Daily)")(sDate)("4. close")
Else
YahooHigh = Empty
End If
End Function
The UDF works fine, but of course it has to load the JSON result each time. In my example, the UDF runs twice: first for the High value and then for the Close value.
Is there a way to store the JSON results in an array, so that the array is used instead of loading the JSON again? I thought of Static, but I am stuck on this.
What I would like to do is store all the dates in the JSON results for a specific ticker (High and Close values only) and then recall the desired value from the static array. Any ideas?
Another variation:
I have tried using the HTML content, and it works fine for me when using the link directly:
Sub MyTest()
Dim html As Object, ele As Object
With CreateObject("MSXML2.ServerXMLHTTP")
'https://finance.yahoo.com/quote/GOOG/history?period1=1325566800&period2=1325566800&interval=1d&filter=history&frequency=1d
'.Open "GET", "https://finance.yahoo.com/quote/GOOG/history?period1=1325566800&period2=1325566800&interval=1d&filter=history&frequency=1d", False
Dim sTicker As String
sTicker = Sheets(1).Range("B1").Value 'GOOG
Dim period1 As Long, period2 As Long
period1 = ToUnix(Sheets(1).Range("B2").Value) '3 Jan 2012
period2 = ToUnix(Sheets(1).Range("B3").Value) '3 Jan 2012
.Open "GET", "https://finance.yahoo.com/quote/" & sTicker & "/history?period1=" & period1 & "&period2=" & period2 & "&interval=1d&filter=history&frequency=1d", False
.Send
If .Status <> 200 Then MsgBox "Problem" & vbNewLine & .Status & " - " & .StatusText: Exit Sub
Set html = CreateObject("htmlfile")
html.body.innerHTML = .responseText
'WriteTxtFile html.body.innerHTML
'Stop
Set ele = html.getElementsByTagName("table")(0).getElementsByTagName("tr")(1)
Dim tCell As Object
Dim cnt As Long
For Each tCell In ele.Children
cnt = cnt + 1
If cnt = 3 Then Debug.Print "High: " & tCell.innerText
If cnt = 5 Then Debug.Print "Close: " & tCell.innerText
Next tCell
End With
End Sub
Public Function ToUnix(dt) As Long
ToUnix = DateDiff("s", "1/1/1970", dt)
End Function
When using this line .Open "GET", "https://finance.yahoo.com/quote/GOOG/history?period1=1325566800&period2=1325566800&interval=1d&filter=history&frequency=1d", False it works fine and returns the High and Close values.
But when I try to convert the dates from the worksheet to UNIX time so as to use them in the link, it doesn't work.
This is my problem now.
Just have your function return the json object, then parse it in your sub.
The json object will contain all your data, and you can parse out what you want.
For example
In your function:
Function YahooHigh(sTicker As String) As Object
' Return the whole parsed JSON object; the caller picks out the values it needs.
With CreateObject("WinHttp.WinHttpRequest.5.1")
.Open "GET", "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=" & sTicker & "&outputsize=full&apikey=myapikey", False
.Send
Set YahooHigh = JsonConverter.ParseJson(.responseText)
End With
End Function
and in your Sub:
Sub Test()
Dim obj As Object
Set obj = YahooHigh("GOOG")
MsgBox obj("Time Series (Daily)")("2019-07-18")("2. high")
MsgBox obj("Time Series (Daily)")("2019-07-18")("4. close")
End Sub
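For the "load once, reuse afterwards" part of the question, one option is to keep the parsed series in a Static dictionary keyed by ticker, so that repeated calls for the same ticker skip the download. This is only a sketch under assumptions: GetDailySeries and YahooValue are made-up names, JsonConverter (VBA-JSON) is assumed to be imported as in the question, and the JSON layout is assumed to match the keys the question already uses.

```vba
Option Explicit

' Hypothetical helper: downloads and parses a ticker's daily series once,
' then serves later calls from the Static cache.
Private Function GetDailySeries(ByVal sTicker As String) As Object
    Static cache As Object              ' keeps its value between calls
    If cache Is Nothing Then Set cache = CreateObject("Scripting.Dictionary")

    If Not cache.Exists(sTicker) Then
        Dim parsed As Object
        With CreateObject("WinHttp.WinHttpRequest.5.1")
            .Open "GET", "https://www.alphavantage.co/query?function=TIME_SERIES_DAILY&symbol=" & sTicker & "&outputsize=full&apikey=myapikey", False
            .Send
            Set parsed = JsonConverter.ParseJson(.responseText)
        End With
        ' Only cache when the expected key is present (e.g. not an error message).
        If parsed.Exists("Time Series (Daily)") Then
            cache.Add sTicker, parsed("Time Series (Daily)")
        End If
    End If

    If cache.Exists(sTicker) Then Set GetDailySeries = cache(sTicker)
End Function

' Hypothetical UDF: fieldName is e.g. "2. high" or "4. close".
Public Function YahooValue(ByVal sTicker As String, ByVal sDate As String, _
                           ByVal fieldName As String) As Variant
    Dim series As Object
    Set series = GetDailySeries(sTicker)
    If series Is Nothing Then Exit Function         ' returns Empty
    If series.Exists(sDate) Then YahooValue = series(sDate)(fieldName)
End Function
```

With this, YahooValue("GOOG", "2019-07-18", "2. high") followed by YahooValue("GOOG", "2019-07-18", "4. close") should trigger a single request. The cache is lost whenever the VBA project is reset.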

Exploring the Instr VBA Function In Webscraping

I want to scrape this URL, https://www.realtor.com/realestateandhomes-search/06510, using the VBA InStr function and extract all URLs containing the substring "06510".
Here's the sample code I've been trying to make work.
Option Explicit
Sub GetLinks()
'
'To use HTMLDocument you need to set a reference to Tools -> References -> Microsoft HTML Object Library
Dim HTML As New HTMLDocument
Dim http As Object
Dim links As Object
Dim link As HTMLHtmlElement
Dim counter As Long
Dim website As Range
Dim LastRange As Range
Dim row As Long
Dim continue As Boolean
Dim respHead As String
Dim lRow As Long
Application.ScreenUpdating = False
' The row where website addresses start
row = 30
continue = True
lRow = Cells(Rows.count, 1).End(xlUp).row + 1
' XMLHTTP gives errors where ServerXMLHTTP does not
' even when using the same URL's
'Set http = CreateObject("MSXML2.XMLHTTP")
Set http = CreateObject("MSXML2.ServerXMLHTTP.6.0")
Do While continue
' Could set this to first cell with URL then OFFSET columns to get next web site
Set website = Range("A" & row)
Set LastRange = Range("B" & lRow)
If Len(website.Value) < 1 Then
continue = False
Exit Sub
End If
If website Is Nothing Then
continue = False
End If
'Debug.Print website
With http
On Error Resume Next
.Open "GET", website.Value, False
.send
' If Err.Num is not 0 then an error occurred accessing the website
' This checks for badly formatted URL's. The website can still return an error
' which should be checked in .Status
'Debug.Print Err.Number
' Clear the row of any previous results
Range("B" & row & ":e" & row).Clear
' If the website sent a valid response to our request
If Err.Number = 0 Then
If .Status = 200 Then
HTML.body.innerHTML = http.responseText
Set links = HTML.getElementsByTagName("a")
For Each link In links
If InStr(link.outerHTML, "06510") Then
LastRange.Value = link.href
End If
Next
End If
Set website = Nothing
Else
'Debug.Print "Error loading page"
LastRange.Value = "Error with website address"
End If
On Error GoTo 0
End With
row = row + 1
Loop
Application.ScreenUpdating = True
End Sub
After inspecting the page, here's a sample of the kind of URL to extract - https://www.realtor.com/realestateandhomes-detail/239-Bradley-St_New-Haven_CT_06510_M36855-92189. Any help will be appreciated
Using QHarr's code in a simplified way...
Sub GetLinks()
Dim url As String, links_count As Integer
Dim j As Integer, row As Integer
Dim XMLHTTP As Object, html As Object
'Dim tr_coll As Object, tr As Object
'Dim elements As Object
Dim i As Long, allLinksOfInterest As Object
'Dim td_coll As Object, td As Object, td_col, objT
url = "https://www.realtor.com/realestateandhomes-search/06510"
Set XMLHTTP = CreateObject("MSXML2.XMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.responseText
Set allLinksOfInterest = html.querySelectorAll("[href*='06510']")
For i = 0 To allLinksOfInterest.Length - 1
Debug.Print allLinksOfInterest.Item(i).href
Next
End Sub
Please check if I'm missing anything. I'm still getting the error "Object doesn't support this property or method"
Don't use InStr on the entire outerHTML of each node while looping over all a tags. There are times when this is required, but this shouldn't be one of them (hopefully).
You want to use an attribute = value CSS selector with the contains operator, *. It is specifically for matching on substrings in attribute values, and it is more efficient.
Set allLinksOfInterest = html.querySelectorAll("[href*='06510']")
So,
Dim i As Long, allLinksOfInterest As Object
Set allLinksOfInterest = html.querySelectorAll("[href*='06510']")
For i = 0 To allLinksOfInterest.Length - 1
Debug.Print allLinksOfInterest.Item(i).href
Next
Attribute = value with contains operator:
[attr*=value]
Represents elements with an attribute name of attr whose
value contains at least one occurrence of value within the string.
VBA:
This currently produces 26 links. All are relative links, so the domain needs to be added, as shown in the loop. Some are duplicates, so consider adding them to a dictionary as keys to remove the duplicates.
Option Explicit
Public Sub GetLinks()
Dim html As HTMLDocument
Set html = New HTMLDocument
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.realtor.com/realestateandhomes-search/06510", False
.send
html.body.innerHTML = .responseText
End With
Dim i As Long, allLinksOfInterest As Object
Set allLinksOfInterest = html.querySelectorAll("[href*='06510']")
For i = 0 To allLinksOfInterest.Length - 1
Debug.Print Replace$(allLinksOfInterest.item(i).href,"about:","https://www.realtor.com")
Next
End Sub
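The dictionary suggestion for removing duplicates can be sketched as follows. UniqueLinks is a made-up helper name, and it assumes the href values have already been collected into an array. It relies on Scripting.Dictionary, late-bound here so no extra reference is needed.

```vba
Option Explicit

' Hypothetical helper: swaps the "about:" prefix for the site root and
' returns the distinct links as a zero-based Variant array.
Public Function UniqueLinks(ByVal hrefs As Variant, ByVal baseUrl As String) As Variant
    Dim seen As Object, i As Long, link As String
    Set seen = CreateObject("Scripting.Dictionary")
    For i = LBound(hrefs) To UBound(hrefs)
        link = Replace$(CStr(hrefs(i)), "about:", baseUrl)
        ' Dictionary keys are unique, so a repeated link is simply skipped.
        If Not seen.Exists(link) Then seen.Add link, True
    Next i
    UniqueLinks = seen.Keys
End Function
```

The same idea works directly inside the loop above: keep one dictionary and add each converted href to it as a key instead of printing it straight away.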
If InStr(link.outerHTML, "06510") Then
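In the code above, InStr is used as if it were a Boolean function. It is not: it returns a number, the position of the match (0 when there is no match). VBA treats any non-zero number as True, so the condition happens to work, but it is clearer to add a comparison operator after the function, for example: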
If InStr(link.outerHTML, "06510")>0 Then

Web scraping html page with no tags as delimiter

I'm trying to import all lines of text in a web page into a string array. The URL is here: Vaticano-La Sacra Bibbia-Genesi-Cap.1.
Unfortunately (maybe by choice of the web designer), the tags have no IDs or classes. All the rows are separated by one or more < BR > elements. The start and end of the text are separated from a simple menu by two < HR > tags.
A clean extract of the page code is here: jsfiddle.
I found a way to get the text. This is what I have in VBA so far:
Note: objDoc is a Public variable coming from another module, filled from a .responseText without problems.
Public Sub ScriviXHTML(strBook As String, intNumCap As Integer)
Dim strDati2 As String
Dim TagBr As IHTMLElementCollection
Dim BrElement As IHTMLElement
Dim intElement As Integer
Dim objChild as Object
Dim strData, strTextCont, strNodeVal, strWholeText As String
Set objDoc2 = New HTMLDocument
Set objDoc2 = objDoc
Set objDoc = Nothing
'Put in variable string HTML code of the web page.
strDati2 = objDoc2.body.innerHTML
'Set in the variable object TAG type BR.
Set TagBr = objDoc2.body.getElementsByTagName("BR")
'Loop for all BRs in the page.
For Each BrElement In TagBr
'Here I try to get the NextSibling element of the <br>
' because seems contain the text I'm looking for.
Set objChild = BrElement.NextSibling
With objChild
' Here I try to put in the variables
strData = Trim("" & .Data & "")
strTextCont = Trim("" & .textContent & "")
strNodeVal = Trim("" & .NodeValue & "")
strWholeText = Trim("" & .wholeText & "")
End With
intElement = intElement + 1
Next BrElement
Two questions:
1) In your opinion, is this the best way to achieve what I'm trying to do?
2) Sometimes Element.NextSibling.Data doesn't exist and I get run-time error '438', so I manually move the point where the routine is suspended to bypass the error. How can I intercept this error? [Please, not with a simple On Error Resume Next!] Better: how can I use an If...Then...End If statement to check whether NextSibling has the Data member?
Thanks to all.
Well you can get all the text as follows:
Public Sub GetInfo()
Dim sResponse As String, xhr As Object, html As New HTMLDocument
Set xhr = CreateObject("MSXML2.XMLHTTP")
With xhr
.Open "GET", "http://www.vatican.va/archive/ITA0001/__P1.HTM", False
.send
sResponse = StrConv(.responseBody, vbUnicode)
sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))
html.body.innerHTML = sResponse
[A1] = Replace$(Replace$(regexRemove(html.body.innerHTML, "<([^>]+)>"), "&nbsp;", Chr$(32)), Chr$(10), Chr$(32))
End With
End Sub
Public Function regexRemove(ByVal s As String, ByVal pattern As String) As String
Dim regex As Object
Set regex = CreateObject("VBScript.RegExp")
With regex
.Global = True
.MultiLine = True
.IgnoreCase = False
.pattern = pattern
End With
If regex.test(s) Then
regexRemove = regex.Replace(s, vbNullString)
Else
regexRemove = s
End If
End Function
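On the second question (testing whether NextSibling has a Data member without On Error Resume Next): Data exists only on text nodes, so one option is to check the node's NodeType before reading it. In the DOM, a NodeType of 3 marks a text node. A minimal sketch follows. TextOfNode is a made-up helper name, and it assumes the node is passed late-bound, as objChild is in the question.

```vba
Option Explicit

' Hypothetical helper: returns the trimmed text of a node when it is a
' text node, and an empty string for Nothing or any other node type.
Public Function TextOfNode(ByVal node As Object) As String
    Const NODE_TEXT As Long = 3         ' DOM nodeType value for text nodes
    If node Is Nothing Then Exit Function
    If node.NodeType = NODE_TEXT Then
        TextOfNode = Trim$(node.NodeValue & "")
    End If
End Function
```

In the question's loop this would replace the With objChild block, for example strData = TextOfNode(BrElement.NextSibling). A < BR > followed by another element, or by nothing at all, then yields an empty string instead of error 438.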