Greetings to everyone.
GOAL
I want to open flashscore.com and get data related to soccer matches.
I want all the matches of this season, so the link "Show more matches" will have to be clicked one or more times.
RESTRICTIONS
I need to do this using VBA
This site does not support Internet Explorer
I cannot install anything on the pc that will be used, so Selenium is turned down as an option
Bearing in mind all of the above, the only option left, seems to be the Microsoft XML, v6.0 library.
WHAT I 'VE TRIED
I 've read several possible duplicates to this post and tried their solutions, but nothing seemed to help so far.
Here is the code with comments explaining every situation:
Option Explicit
Sub Get_Matches()
'REFERENCE TO Microsoft XML, v6.0
Dim httpReq As New MSXML2.XMLHTTP60
Dim doc As MSHTML.HTMLDocument
Dim eleCol As Object
Dim ele As MSHTML.HTMLHtmlElement
'Open site and get html. In comments:things proposed by others but seemed to make no change.
httpReq.Open "GET", "https://www.flashscore.com/football/england/premier-league-2019-2020/results/"
'httpReq.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
httpReq.send
Set doc = New MSHTML.HTMLDocument
doc.body.innerHTML = httpReq.responseText
'doc.body.innerHTML = httpReq.responseBody
'Tried to get the div containing the first match:
Set ele = doc.getElementById("g_1_2JDks1o7") '--> returns nothing
'Tried to get the "Show more matches" link:
Set ele = doc.getElementsByClassName("event__more")(0) '--> returns nothing
'Tried to get the first ancestor that has an id:
Set ele = doc.getElementById("live-table") '--> returns nothing
'Tried to get all <a> elements and then narrow them down till I find the "Show more matches" link:
Set eleCol = doc.getElementsByTagName("a")
For Each ele In eleCol
If ele.href Like "*[#]*" And ele.innerText = "Show more matches" Then
Exit For
End If
Next ele
ele.Click '--> I get the "Show more matches" link, but nothing seems to change
'Someone suggested firing onclick event.
ele.FireEvent "onclick" '--> did nothing
'Some people suggested waiting.
'So I tried this:
Do
DoEvents
Loop While doc.readyState = "loading"
If doc.readyState <> "complete" Then GoTo ERROR_END
Set doc = New MSHTML.HTMLDocument
doc.body.innerHTML = httpReq.responseText
'and the Delay_Code_By function below,
Set ele = doc.getElementById("live-table") '--> but still returns nothing
'Some people suggested looping, so the document gets loaded first.
Do: Set ele = doc.getElementById("live-table"): DoEvents: Loop While ele Is Nothing '--> resulted to infinite loop.
Do: Set eleCol = doc.getElementsByClassName("event__more"): DoEvents: Loop Until eleCol.Length > 0 '--> resulted to infinite loop.
'This block extracts html in a txt file in desktop, to help me see it, as it is at runtime.
'------------------------------------------------------------------------------------------
' Dim fso As Scripting.FileSystemObject
' Dim txtFile As Scripting.TextStream
' Set fso = New Scripting.FileSystemObject
' Set txtFile = fso.OpenTextFile(CreateObject("WScript.Shell").SpecialFolders("Desktop") & "\output.txt", 2, True, TristateTrue)
' txtFile.Write httpReq.responseText
' txtFile.Close
' Set txtFile = Nothing
' Set fso = Nothing
'------------------------------------------------------------------------------------------
'Inside the txt file I found html code that onclick calls some kind of function:
'Tried to call this function:
doc.parentWindow.execScript "document.body.classList.toggle('loading', true);", "JScript" '--> throws automation error (probably there's some error with my syntax).
'I also tried to call this function:
doc.parentWindow.execScript "function(){return cjs.Api.loader.get('cjs').call(function(_cjs){loadMoreGames(_cjs);});};" '--> which did not throw error but did nothing.
Exit Sub
ERROR_END:
MsgBox "error"
End Sub
Public Sub Delay_Code_By(seconds As Integer)
Dim endTime As Date
endTime = DateAdd("s", seconds, Now)
Do While Now < endTime
DoEvents
Loop
End Sub
QUESTIONS
If there is any other option that I'm missing, other than using Microsoft XML, v6.0 library, please tell me to try it.
As far as I understand the problem is that the elements I'm trying to get are not present the moment I try to get them. If I am correct, can anyone please explain me why is this happening and a possible workaround this one? If I am wrong please point me to the right direction.Thank you.
UPDATE
I'm posting this screenshot based on #QHarr 's comment about watching the network/xhr tab.
Related
I have a problem with the VBA code.
I would like to open this website: https://www.tnt.com/express/en_us/site/tracking.html and in Shipment numbers search box I would like to put active cells from Excel file. At the beginning I tried to put only a specific text for example: "777777".
I wrote the below code but unfortunately, the search button is empty and there is no error. I tried everything and I have no idea what should I change in my code.
Any clues? Thank you in advance.
HTML:
<input class="__c-form-field__text ng-touched ng-dirty ng-invalid" formcontrolname="query" pbconvertnewlinestocommasonpaste="" pbsearchhistorynavigation="" shamselectalltextonfocus="" type="search">
VBA:
Sub TNT2_tracker()
Dim objIE As InternetExplorer
Dim aEle As HTMLLinkElement
Dim y As Integer
Dim result As String
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.navigate "https://www.tnt.com/express/en_us/site/tracking.html"
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
Dim webpageelement As Object
For Each webpageelement In objIE.document.getElementsByClassName("input")
If webpageelement.Class = "__c-form-field__text ng-pristine ng-invalid ng-touched" Then
webpageelement.Value = "777"
End If
Next webpageelement
End Sub
You could use the querySelector + class name to find an element.
something like
'Find the input box
objIE.document.querySelector("input.__c-form-field__text").value = "test"
'Find the search button and do a click
objIE.document.querySelector("button.__c-btn").Click
No need to loop through elements. Unless the site allows you to search multiple tracking numbers at the same time.
It seems automating this page is a litte tricky. If you change the value of the input field it doesn' t work. Nothing happen by clicking the submit button.
A look in the dom inspector shows several events for the input field. I checked them out and it seems we need to paste the value over the clipboard by trigger the paste event of the shipping field.
In order for this to work without Internet Explorer prompting, its security settings for the Internet zone must be set to allow pasting from the clipboard. I'm using a German version of IE, so I have problems explaining how to find the setting.
This macro works for me:
Sub TNT2_tracker()
Dim browser As Object
Dim url As String
Dim nodeDivWithInputField As Object
Dim nodeInputShipmentNumber As Object
Dim textToClipboard As Object
'Dataobject by late binding to use the clipboard
Set textToClipboard = CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
url = "https://www.tnt.com/express/en_us/site/tracking.html"
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set browser = CreateObject("internetexplorer.application")
browser.Visible = True
browser.navigate url
Do Until browser.ReadyState = 4: DoEvents: Loop
'Manual break for loading the page complitly
'Application.Wait (Now + TimeSerial(pause_hours, pause_minutes, pause_seconds))
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get div element with input field for shipment number
Set nodeDivWithInputField = browser.Document.getElementsByClassName("pb-search-form-input-group")(0)
If Not nodeDivWithInputField Is Nothing Then
'If we got the div element ...
'First child element is the input field
Set nodeInputShipmentNumber = nodeDivWithInputField.FirstChild
'Put shipment number to clipboard
textToClipboard.setText "7777777"
textToClipboard.PutInClipboard
'Insert value by trigger paste event of the input field
Call TriggerEvent(browser.Document, nodeInputShipmentNumber, "paste")
'Click button
browser.Document.getElementsByClassName("__c-btn")(0).Click
Else
MsgBox "No input field for shipment number found."
End If
End Sub
And this function to trigger a html event:
Private Sub TriggerEvent(htmlDocument As Object, htmlElementWithEvent As Object, eventType As String)
Dim theEvent As Object
htmlElementWithEvent.Focus
Set theEvent = htmlDocument.createEvent("HTMLEvents")
theEvent.initEvent eventType, True, False
htmlElementWithEvent.dispatchEvent theEvent
End Sub
As #Stavros Jon alludes to..... there is a browserless way using xhr GET request via API. It returns json and thus you ideally need to use a json parser to handle the response.
I use jsonconverter.bas as the json parser to handle the response. Download raw code from here and add to standard module called JsonConverter . You then need to go VBE > Tools > References > Add reference to Microsoft Scripting Runtime. Remove the top Attribute line from the copied code.
Example request with dummy tracking number (deliberately passed as string):
Option Explicit
Public Sub TntTracking()
Dim json As Object, ws As Worksheet, trackingNumber As String
trackingNumber = "1234567" 'test input value. Currently this is not a valid input but is for demo.
Set ws = ThisWorkbook.Worksheets("Sheet1") 'for later use if writing something specific out
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.tnt.com/api/v3/shipment?con=" & trackingNumber & "&searchType=CON&locale=en_US&channel=OPENTRACK", False
.send
Set json = JsonConverter.ParseJson(.responseText)
End With
'do something with results
Debug.Print json("tracker.output")("notFound").Count > 0
Debug.Print JsonConverter.ConvertToJson(json("tracker.output")("notFound"))
End Sub
Below is some code I've found and altered to attempt to capture the keyword/phrase suggestions from Amazon's search bar. I'm very new to the concept of web scraping, so I know the code presented here may be very ineffective and inefficient. I've manually captured some data from the F12 DOM Explorer and Network windows. If the best answer is web scraping, I need that in the form of excel vba. I see in some of the below images that it appears as though some of the content type from the Network window is "application/json" and the Initiator/Type is "XMLHttpRequest", but this is only after it shows a connection and authentication to "https://completion.amazon.com". If that's the route, I have no idea how to complete those requests. Any help would be much appreciated.
So far I've tried invoking the search bar programmatically, via the scripts in the code, but that does nothing that I can see. Simply 'pasting' the keyword into the search bar with a 'space' appended to it does not produce the suggested keywords. However, typing into the search bar does. If I type the keyword in, then choose 'inspect element' of the dropdown suggestions, dynamic HTML is produced to show the HTML content of the suggestions (at which time I can get what I need). I've been unsuccessful in getting to that point.
Private Sub CommandButton1_Click()
Dim MyHTML_Element As IHTMLElement
Dim MyURL As String
Dim AASearchRank As Workbook
Dim AAws As Worksheet
Dim InputSearch As HTMLInputTextElement
Dim elems As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim elems2 As IHTMLElementCollection
Dim TDelement2 As HTMLDivElement
'Dim TDelement2 As HTMLInputTextElement
Dim InputSearchButton As HTMLInputButtonElement
Dim IE As InternetExplorer
Dim x As Integer
Dim i As Long
MyURL = "https://www.amazon.com/"
Set IE = New InternetExplorer
With IE
.Silent = True
.Navigate MyURL
.Visible = True
Do
DoEvents
Loop Until .ReadyState = READYSTATE_COMPLETE
End With
Set HTMLDoc = IE.Document
Set AASearchRank = Application.ThisWorkbook
Set AAws = AASearchRank.Worksheets("Sheet2")
Set InputSearchButton = HTMLDoc.getElementById("nav-search-submit-text")
Set InputSearchOrder = HTMLDoc.getElementById("twotabsearchtextbox")
If Not InputSearchOrder Is Nothing Then
InputSearchButton.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
End If
x = 2
If AAws.Range("D" & x).Value = "" Then
Do Until AAws.Range("B" & x) = ""
Set InputSearch = HTMLDoc.getElementById("twotabsearchtextbox")
InputSearch.Focus
'When a keyword is typed in the search bar with a 'space' after, it invokes the suggestions I'm looking for.
InputSearch.Value = "Travel "
'InputSearch.Value = AAws.Range("C" & x) & " "
Set InputSearchButton = HTMLDoc.getElementsByClassName("nav-input")(0)
InputSearch.Focus
'Here I was trying to invoke some script to see if it had any effect on the search bar drop down
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'UpNav',end:+new Date(),begin:window.navmet.tmp});"
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'Search',end:+new Date(),begin:window.navmet.tmp});"
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'NavBar',end:+new Date(),begin:window.navmet.main});"
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
'Application.Wait (Now + TimeValue("0:00:05"))
Set elems2 = HTMLDoc.getElementsByClassName("nav-issFlyout nav-flyout")
i = 0
For Each TDelement2 In elems2
'Debug statements strictly for learning what each option/query returns
Debug.Print TDelement2.innerText
Debug.Print TDelement2.className
Debug.Print TDelement2.dataFld
Debug.Print TDelement2.innerHTML
Debug.Print TDelement2.outerText
Debug.Print TDelement2.outerHTML
Debug.Print TDelement2.parentElement.className
Debug.Print TDelement2.tagName
Debug.Print TDelement2.ID
Next
'Once the searchbar is populated, and the drop down list provides suggestions,
'the below code will give me what I want. If there's an easier solution,
'I'm all for it
Set elems = HTMLDoc.getElementsByClassName("s-suggestion")
i = 0
For Each TDelement In elems
If Left(TDelement.ID, 6) = "issDiv" Then
Debug.Print TDelement.innerText
Debug.Print TDelement.ID
End If
Next
x = x + 1
Loop
End If
End Sub
An ideal solution would be to obtain these suggested keywords through either invoking the search bar dynamic HTML or via Amazon's completion site, but it appears as though that might not be open to the general public. Thank you for any help, and apologies up front for any posting deficiencies.
There is an API call you can find in the network tab. It returns a json string you can parse with as jsonparser to get the suggestions. I use jsonconverter.bas which, once downloaded I add to the project and then go VBE > Tools > References > Add a reference to Microsoft Scripting Runtime.
The url itself is a queryString i.e. it is constructed of different parameters. For example, there is a limit parameter, whose value is 11, which specifies the number of suggestions to return. You may be able to alter and/or remove some of these. Below, I concatenate the SEARCH_TERM constant into the query string to represent your search value (that which would be typed into the search box).
I don't know whether any of the params are time-based (i.e. expire over time - I have made a number of requests without problem since you posted your question). It may be that necessary time based values can be pulled via a prior GET request to Amazon search page.
params = (
('session-id', '141-0042012-2829544'),
('customer-id', ''),
('request-id', '7E7YCB7AZZM1HQEZF2G1'),
('page-type', 'Search'),
('lop', 'en_US'),
('site-variant', 'desktop'),
('client-info', 'amazon-search-ui'),
('mid', 'ATVPDKIKX0DER'),
('alias', 'aps'),
('b2b', '0'),
('fresh', '0'),
('ks', '76'),
('prefix', 'TRAVEL'),
('event', 'onKeyPress'),
('limit', '11'),
('fb', '1'),
('suggestion-type', ['KEYWORD', 'WIDGET']),
('_', '1556820864750')
)
VBA:
Option Explicit
Public Sub GetTable()
Dim json As Object, suggestion As Object '< VBE > Tools > References > Microsoft Scripting Runtime
Const SEARCH_TERM As String = "TRAVEL"
Const SEARCH_TERM2 As String = "BOOKS"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://completion.amazon.com/api/2017/suggestions?session-id=141-0042012-2829544" & _
"&customer-id=&request-id=7E7YCB7AZZM1HQEZF2G1&page-type=Search&lop=en_US&site-variant=" & _
"desktop&client-info=amazon-search-ui&mid=ATVPDKIKX0DER&alias=aps&b2b=0&fresh=0&ks=76&" & _
"prefix=" & SEARCH_TERM & "&event=onKeyPress&limit=11&fb=1&suggestion-type=KEYWORD&suggestion-type=" & _
"WIDGET&_=1556820864750", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Set json = JsonConverter.ParseJson(.responseText)("suggestions")
End With
For Each suggestion In json
Debug.Print suggestion("value")
Next
End Sub
In order to fix the following code, I tried to split it up into a smaller part. So, I have the following code that drives me crazy for hours in Sheet1:
Sub Scrapping_Data()
Dim IE As Object, EURUSD1 As String, EURUSD2 As String
Application.ScreenUpdating = False
Range("A:B").Clear
Set IE = CreateObject("internetexplorer.application")
With IE
.Navigate "http://uk.investing.com/currencies/streaming-forex-rates-majors"
.Visible = False
End With
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Set FOREX = IE.document.getElementById("pair_1")
EURUSD1 = FOREX.Cells(1).innerHTML
EURUSD2 = FOREX.Cells(2).innerHTML
IE.Quit
Set IE = Nothing
Range("A1").Value = EURUSD1
Range("B1").Value = EURUSD2
End Sub
I run it for the first time and it worked fine. But when I run it for the second time, the error the run-time error '91' occurred. So I clicked F8, but nothing happened the code worked fine and I checked Sheet1 there were values in Cells(1,1) and Cells(1,2). I then run it again and the error the run-time error '13' occurred this time. Again I clicked F8, but nothing happened the code worked fine. When I kept running the code, the errors still occurred and clicking F8 didn't help to find the problem. What is wrong with my code? How to fix it?
What I don't get it here too is my laptop is getting slow every time I run the code and I have to manually restart it many times.
The following requires that you go into the VBE's Tools ► References and place checkmarks beside Microsoft HTML Object library and Microsoft XML v6.0.
This is an xmlhttprewuest equivalent of an Internet Explorer object web scrape to the same URL.
Option Explicit
Sub tournamentFixtures()
'declare the objects with early binding
Dim htmlBDY As New HTMLDocument, xmlHTTP As New MSXML2.XMLHTTP60
'declare the regular variables
Dim sURL As String, ws As Worksheet
'set a var object to the destination worksheet
Set ws = Worksheets("Sheet1")
'assign the URL to a string var
sURL = "http://uk.investing.com/currencies/streaming-forex-rates-majors"
'isolate all commands to the MSXML2.XMLHTTP60 object
With xmlHTTP
'initiate the URL
.Open "GET", sURL, False
'set hidden header information
.setRequestHeader "User-Agent", "XMLHTTP/1.0"
'get the page data
.send
'safety check to make sure we got the web page's data
If .Status <> 200 Then GoTo bm_safe_Exit
'if here you got the page data - copy it to the local var
htmlBDY.body.innerHTML = .responseText
End With
'localize all commands to the page data
With htmlBDY
'check if the element ID exists
If Not .getElementById("pair_1") Is Nothing Then
'it exists - get the data directly to the worksheet
With .getElementById("pair_1")
ws.Range("A1") = .Cells(1).innerText
ws.Range("B1") = .Cells(2).innerText
End With
Else
'it doesn't exist - bad page data
MsgBox "there is no 'pair_1' on this page"
End If
End With
bm_safe_Exit:
'clean up all of the objects that were instantiated
Set htmlBDY = Nothing: Set xmlHTTP = Nothing: Set ws = Nothing
End Sub
I have commented virtually every line so you can follow what is happening. This may need some tweaking. I ran it ~40 times and it failed once but that could have been my own Internet connection. Consider this a starting point where you can do your own research to accomplish your goals. If you continue to have problems with this new code, please do not paste this into another question and ask why it doesn't work without doing some research and attempting a solution yourself. StackOverflow is a site for professional and enthusiast programmers.
I gave up trying to offer solutions to web scraping problems because page technology changes too fast to keep up on a peripheral level. You have to be involved in the immediate changes to be able to respond to them quickly and my own interests lie elsewhere. I responded to this request because you actually supplied the URL to test against (something few people asking questions actually think is important - go figure) and I thought the static dimming of the var would help.
The construction and destruction of an InternetExplorer object takes time; up to a few seconds even on the fastest sytems. You can wait an appropriate amount of time for it to relinquish all of the .DLLs et al it has loaded or you can declare your IE as a static object that will be reused on subsequent reruns of the sub procedure.
Option Explicit
Sub Scrapping_Data()
Static IE As Object
Dim EURUSD1 As String, EURUSD2 As String
Application.ScreenUpdating = False
With Worksheets("Sheet1") 'KNOW what worksheet you are on!!!!!
.Range("A:B").Clear
End With
If IE Is Nothing Then
Set IE = CreateObject("internetexplorer.application")
With IE
.Visible = True
'.Visible = False
.Silent = True
End With
End If
With IE
.Navigate "http://uk.investing.com/currencies/streaming-forex-rates-majors"
Do While .ReadyState <> 4: DoEvents: Loop
With .document.getElementById("pair_1")
EURUSD1 = .Cells(1).innerHTML
EURUSD2 = .Cells(2).innerHTML
End With
End With
With Worksheets("Sheet1") 'KNOW what worksheet you are on!!!!!
.Range("A1") = EURUSD1
.Range("B1") = EURUSD2
End With
IE.Navigate "about:blank"
End Sub
The caveat here is that you will have to destruct the InternetExplorer object yourself at some point in the future. Closing the workbook will close the VBA project but leave the IE object 'orphaned'.
Given all of the HTML5 debris that comes along with that web page, have you considered moving to xmlhttprequest? And if you are wondering then yes, that would be a new question under a different set of [tags].
I have a project that requires me to open a document in Word and wait while the document is open. The user then needs to edit the document, print it or save it and then they will close it manually. When the document (or Word itself) is closed I would like to continue my VBA script in Access.
I have the following code
If Len(Dir(sDocName)) > 0 Then
Dim wordApp As Word.Application
Dim wordDoc As Word.Document
' Launch Word
Set wordApp = New Word.Application
' Open the document
With wordApp
Set wordDoc = .Documents.Open(sDocName, , False)
' Pass data
With wordDoc
.Variables("bmArchitectCompanyName").Value = "Hello"
.Variables("bmArchitectCompanyAddress").Value = "Hello"
.Variables("bmArchitectCompanyPostCode").Value = "Hello"
.Variables("bmArchitectContactFirstName").Value = "Hello"
.Variables("bmProjectTitle").Value = ProjectTitle
.Fields.Update
End With
.Visible = True
.Activate
Do Until wordApp Is Nothing
' Wait for Word to be closed
Loop
' Display a success message
MsgBox "Success!"
End With
End If
However, the do loop never exits.
How do I check if the document I launched through VBA has been closed by the user from my VBA code?
If your are handling the Word Document in a class module or form module, try declaring a Word.Document object this way:
Dim WithEvents w As Word.Document
then you can use the Word.Document object event "Close":
Private Sub w_Close()
'Things you want to do when the Word.Document is closed
'reset object
set w=nothing
End Sub
You must not let then "container" form close or the object terminate unless w is nothing. The form or the object are quiet (or whatever you want them to do) until your Word document is closed.
You must yield to the OS using DoEvents, then check if the document was closed, otherwise coninue the DoEvents loop. The following is outline code that I unfortunately could not test.
Do
DoEvents
For i = 1 to wordApp.Documents.Count
If (wordApp.Documents(i).Name = sDocName) Then Exit For
Next i
If (i > wordApp.Documents.Count) Then Exit Loop
Loop While (True)
Note: you probably don't want to close the application because maybe the user had other documents open. Maybe you should test that.
I have found the solution...
Do While .Visible
' wait until the window is no longer visible
Loop
Does exactly what I was looking for, waits until word is closed and then continues with the MS Access VBA script.
Building on what Paul has entered this uses a different method to test if the document is open.
Sub Test()
Dim wordApp As Word.Application
Dim wordDoc As Word.Document
Dim sDocName As String
sDocName = "<Full path to document>"
Set wordApp = New Word.Application
With wordApp
Set wordDoc = .Documents.Open(sDocName, , False)
.Visible = True
.Activate
Do
DoEvents
Loop While FileIsOpen(sDocName)
End With
MsgBox "Finished"
End Sub
Public Function FileIsOpen(FullFilePath As String) As Boolean
Dim ff As Long
On Error Resume Next
ff = FreeFile()
Open FullFilePath For Input Lock Read As #ff
Close ff
FileIsOpen = (Err.Number <> 0)
On Error GoTo 0
End Function
The issue with checking to see if wordApp Is Nothing is that VBA doesn't know to dereference the object when it is closed externally. In other words, when the document to which wordApp refers is closed, the pointer to the object becomes bogus but not Nothing. To trap for the event, you can do a meaningless call to the document object in the loop, and exit if an error occurs:
On Error Resume Next
With wordDoc
Do
DoEvents
'This will only occur if there is an error:
If .Visible<>.Visible Then Exit Do
Loop
End With
Err.Clear
On Error Goto 0
I just tested with no loop or anything else. The macro code in Access waited until the instance of the Word application was closed before it finished executing. Here's what I tested, and the MsgBox only executed after I quit Word:
' Open the document
With wordApp
Set wordDoc = .Documents.Add
' Pass data
With wordDoc
.Variables("bmArchitectCompanyName").Value = "Hello"
.Variables("bmArchitectCompanyAddress").Value = "Hello"
.Variables("bmArchitectCompanyPostCode").Value = "Hello"
.Variables("bmArchitectContactFirstName").Value = "Hello"
.Content.Text = "test"
'.Variables("bmProjectTitle").Value = ProjectTitle
'.Fields.Update
End With
.Visible = True
.Activate
' Display a success message
MsgBox "Success!"
Set wordDoc = Nothing
Set wordApp = Nothing
End With
Note that you want to be sure to set the objects to Nothing or
you're likely to get error messages the next time you run the code
because it will leave the instance of WinWord open in memory.
I am trying to scrap data out of a section of a webpage. To get into the section I need to fill in a captcha security code and hit a button, but that is alright because the security code is actually written in the html of the page. So, I am creating an IE object, driving it to the webpage, getting the captcha security code, writing it in the proper box, hitting the submit button and then getting the html document so I can scrap data out of it.
Nonetheless I am executing the steps exatcly in the order I mentioned, it seems that the html document that is being gotten is not the one from the page after I pass through the captcha validation, but from the page before the captcha validation.
Would anyone know what must I do to get the correct html document and conseuently be able to scrap the data I really want? Thank you.
The subprocedure's code follows next:
'Getting National fuel prices from ANP
Sub subANPNationalFuelPrices()
'Creating variables for the URL and the HTML files
Dim urlANP As String: urlANP = "http://www.anp.gov.br/preco/prc/Resumo_Semanal_Index.asp"
Dim htmlANP1 As HTMLDocument
'Creating the IE object
Dim IE As InternetExplorer
Set IE = New InternetExplorer
IE.Visible = True
'Making sure that the webpage is fully load
IE.navigate (urlANP)
Do While IE.readyState <> READYSTATE_COMPLETE
Application.StatusBar = "Getting your data"
DoEvents
Loop
Set htmlANP1 = IE.document
'Getting the Captcha Password
Dim strCaptchaPassword As String
Dim colMyCollection As IHTMLElementCollection
Set colMyCollection = htmlANP1.getElementById("divQuadro").all
Dim objLabel As IHTMLElement
For Each objLabel In colMyCollection
strCaptchaPassword = strCaptchaPassword & objLabel.innerText
Next objLabel
'Getting the input box object and getting it the correct password
Dim objInputBox As IHTMLElement
Set objInputBox = htmlANP1.getElementById("txtValor")
objInputBox.Value = strCaptchaPassword
'Getting the submit button object and clicking it
Dim objInputButton As IHTMLElement
Set objInputButton = htmlANP1.getElementById("image1")
objInputButton.Click
'Getting the true rich data HTML
Set htmlANP1 = IE.document
'Extracting the data from the html document
Dim rngValues As range: Set rngValues = Sheet1.range("B17")
Dim strValues(35) As String
Dim dblValues(35) As Double
Dim objElement1 As IHTMLElement
Set objElement1 = htmlANP1.getElementsByTagName("TABLE")(1)
Dim colCollection1 As IHTMLElementCollection
Set colCollection1 = objElement1.all
Dim intTempCount As Integer
Dim objTempElement As IHTMLElement
intTempCount = 32
For Each objTempElement In colCollection1
Sheet1.Cells(intTempCount, 3) = objTempElement.tagName
Sheet1.Cells(intTempCount, 4) = objTempElement.innerText
intTempCount = intTempCount + 1
Next objTempElement
End sub
You are not waiting for the new webpage to load after clicking the button on the captcha. Either check the ready state of IE again or end you code here be starting a timer which starts your code off again in X seconds AND then checks the ready state of IE and Document.
I do scraping on a system using iFrame so using IE.Readystate isn't very reliable. Usually I have to wait for another element to 'exist', but using IsObject(element) hasn't been very reliable either. What I've had to do is use a loop in my main code that calls a function so if I'm waiting for something to load and I know that after the page loads, there's an element with the ID "UserName", then I do this..
...
Do Until IsErr(doc, "UserName") = False: Loop
...
Function IsErr(doc As HTMLDocument, ID As String) As Boolean
IsErr = True
On Error GoTo ExitFunction:
Debug.Print left(doc.getElementById(ID).innerHTML, 1)
IsErr = False
Exit Function
ExitFunction:
End Function
I could just do a loop statement that keeps trying to debug it, but that would be a nightmare with the error handling so if you use a separate function for the printing, it can exit the function after the error, then the loop re-initiates the function and it will do this forever until the next element exists.