How to check the status of a URL? - html

I created a macro that fetches every URL from a webpage.
Now I have each URL in a column.
How can I check whether a URL is working?
If any of these URLs is not working, it should show an error such as "not working" next to the URL, in the next column.
Below is the code I wrote:
Sub CommandButton1_Click()
Dim ie As Object
Dim html As Object
Dim j As Integer
j = 1
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = True
url = "www.mini.co.uk"
ie.navigate url
Do While ie.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "Trying to go to website ..."
Loop
Application.StatusBar = " "
Set html = ie.document
'Dim htmltext As Collection
Dim htmlElements As Object
Dim htmlElement As Object
Set htmlElements = html.getElementsByTagName("*")
For Each htmlElement In htmlElements
'If htmlElement.getAttribute("href") <> "" Then Debug.Print htmlElement.getAttribute("href")
If htmlElement.getAttribute("href") <> "" Then Cells(j, 1).Value = htmlElement.getAttribute("href")
j = j + 1
Next
ActiveSheet.Range("$A$1:$A$2752").removeDuplicates Columns:=1, Header:=xlNo
End Sub
This code fetches the URLs from the web page.
Below is the code that should check whether each URL is working:
Sub CommandButton2_Click()
Dim k As Integer
Dim j As Integer
k = 1
j = 1
'Dim Value As Object
'Dim urls As Object
'urls.Value = Cells(j, 1)
For Each url In Cells(j, 1)
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
ie.Visible = False
url = Cells(j, 1)
ie.navigate url
Do While ie.READYSTATE <> READYSTATE_COMPLETE
Application.StatusBar = "checking the Data. Please wait..."
Loop
Cells(k, 2).Value = "OK"
'Set html = ie.document
ie.Quit
j = j + 1
k = k + 1
Next
End Sub

Public Function IsURLGood(url As String) As Boolean
'Requires a reference to "Microsoft WinHTTP Services, version 5.1" (Tools > References)
Dim request As New WinHttpRequest
On Error GoTo IsURLGoodError
request.Open "HEAD", url
request.Send
If request.Status = 200 Then
IsURLGood = True
Else
IsURLGood = False
End If
Exit Function
IsURLGoodError:
IsURLGood = False
End Function
Sub testLink()
Dim source As Range, req As Object, url$
Set source = Range("A2:B2")
source.Columns(2).Clear
For i = 1 To source.Rows.Count
url = source.Cells(i, 1)
If IsURLGood(url) Then
source.Cells(i, 2) = "OK"
Else
source.Cells(i, 2) = "Down"
End If
Next
MsgBox "Done"
End Sub

Since you only want to know whether the link is working, XMLHTTP may be one solution.
Sub CheckLinks()
Dim sh As Worksheet, i As Long, strURL As String
Set sh = ThisWorkbook.Sheets("Sheet1")
Dim column_number: column_number = 2
'Row starts from 2
For i = 2 To 100
strURL = sh.Cells(i, column_number)
sh.Cells(i, column_number + 1) = CallHTTPRequest(strURL)
Next
End Sub
Function CallHTTPRequest(strURL)
Set objXMLHTTP = CreateObject("MSXML2.XMLHTTP")
objXMLHTTP.Open "GET", strURL, False
objXMLHTTP.send
status = objXMLHTTP.Status
'strContent = ""
'If objXMLHTTP.Status = 200 Then
' strContent = objXMLHTTP.responseText
'Else
' MsgBox "HTTP Request unsuccessfull!", vbCritical, "HTTP REQUEST"
' Exit Function
'End If
Set objXMLHTTP = Nothing
CallHTTPRequest = status
End Function
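As a hedged sketch rather than a drop-in replacement: a request to a host that cannot be resolved or reached typically raises a run-time error instead of returning a status code, so it can help to trap that case and to set timeouts. The function name UrlStatusText and the timeout values are assumptions for illustration, not part of the answer above. It uses WinHttp.WinHttpRequest.5.1 late-bound, so no extra reference is needed, and it assumes each URL includes its scheme (for example http://); a bare www.example.com ends up in the error branch.

```vba
' Hypothetical helper (the name is illustrative): returns the HTTP status
' as text, or "Error: ..." when the request could not be completed at all.
Public Function UrlStatusText(ByVal url As String) As String
    Dim req As Object
    On Error GoTo Failed
    Set req = CreateObject("WinHttp.WinHttpRequest.5.1")
    ' Timeouts in milliseconds: resolve, connect, send, receive (assumed values)
    req.SetTimeouts 5000, 5000, 10000, 10000
    req.Open "HEAD", url, False
    req.Send
    UrlStatusText = CStr(req.Status) & " " & req.StatusText
    Exit Function
Failed:
    ' DNS failure, refused connection, timeout, malformed URL, and so on
    UrlStatusText = "Error: " & Err.Description
End Function
```

It could be called from the loop above as sh.Cells(i, column_number + 1) = UrlStatusText(strURL). Some servers reject HEAD requests; switching the verb to "GET" is the usual fallback.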

You can actually get the status codes using IE automation, but it requires working with events and a reference to the Microsoft Internet Controls library.
Private Declare PtrSafe Sub SleepEx Lib "Kernel32.dll" (ByVal dwMilliseconds As Long, Optional ByVal bAlertable As Boolean = True)
Private WithEvents ie As SHDocVw.InternetExplorer
Private LastStatusCode As Long
Private Sub ie_NavigateError(ByVal pDisp As Object, URL As Variant, TargetFrameName As Variant, StatusCode As Variant, Cancel As Boolean)
LastStatusCode = StatusCode
End Sub
Public Function NavigateReturnStatus(url As String) As Long
Set ie = CreateObject("InternetExplorer.Application")
LastStatusCode = 0 'Reset so an earlier error is not reported again
ie.Navigate url
Do While ie.ReadyState <> READYSTATE_COMPLETE Or ie.Busy
SleepEx 50 'No busy waiting, short wait time
DoEvents 'Need to receive events from IE application
Loop
NavigateReturnStatus = LastStatusCode
End Function
This doesn't return a conventional HTTP status code, but instead returns a NavigateError status code. That means you can get more detailed information about errors, but no information about successful navigation. Of course, if it's 0, no error has occurred so the status is likely 200.
This is much slower than a WinHTTP/MSXML approach, but I'm sharing it mainly for cases where someone is already navigating with Internet Explorer anyway.
Of course, the code can (and likely should) be modified to reuse the Internet Explorer application.
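One way the reuse might look, as a rough sketch only. It assumes the module-level declarations shown above (ie, LastStatusCode and SleepEx) are in the same class or sheet module; the names NavigateReturnStatusReuse and CloseBrowser are made up for illustration. It does not handle the case where the user closes the browser window by hand.

```vba
' Sketch: navigate with a single, shared IE instance.
' Relies on ie, LastStatusCode and SleepEx declared at module level above.
Public Function NavigateReturnStatusReuse(ByVal url As String) As Long
    ' Create the browser only on first use
    If ie Is Nothing Then Set ie = New SHDocVw.InternetExplorer
    LastStatusCode = 0                 ' forget any error from the previous URL
    ie.Navigate url
    Do While ie.ReadyState <> READYSTATE_COMPLETE Or ie.Busy
        SleepEx 50                     ' short wait instead of busy waiting
        DoEvents                       ' let the NavigateError event fire
    Loop
    NavigateReturnStatusReuse = LastStatusCode
End Function

' Call once when all URLs have been checked
Public Sub CloseBrowser()
    If Not ie Is Nothing Then
        ie.Quit
        Set ie = Nothing
    End If
End Sub
```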

Sub URLWorkingorNot()
'Make Sure to Select Cells containing URL
Dim i As Long
AddReference
i= 1
Selection.Replace "#N/A", "NA": Selection.Offset(0, 1).EntireColumn.Insert
Dim IE As InternetExplorer
If ActiveWorkbook Is Nothing Then Exit Sub
For Each cell In Selection
If cell.Value <> "" Then
Set IE = New InternetExplorer
IE.Navigate2 cell.Value
IE.Left = 900
IE.Width = 900
IE.Visible = True
While IE.Busy: DoEvents: Wend
On Error Resume Next
If InStr(1, IE.document.body.innerText, "The webpage cannot be found", vbBinaryCompare) <> 0 Then cell.Offset(0, 1).Value = "Not Available"
'MsgBox IE.document.body.innerText
If err.Number <> 0 Then err.Clear: On Error GoTo 0
IE.Quit: Set IE = Nothing
End If
i = i + 1
ProgressBar Selection.Count, i, "Working on " & i & " Cell": DoEvents
If ActiveWorkbook.Path <> "" And i Mod 100 = 0 Then ActiveWorkbook.Save 'Save every 100 cells
Next cell
Unload UProgressBar
Application.StatusBar = ""
End Sub
Sub AddReference()
'Copied from the Internet
'Macro purpose: To add a reference to the project using the GUID for the
'reference library
Dim strGUID As String, theRef As Variant, i As Long
'Update the GUID you need below.
'strGUID = "{00020905-0000-0000-C000-000000000046}"
strGUID = "{EAB22AC0-30C1-11CF-A7EB-0000C05BAE0B}"
'Internet Controls - "{EAB22AC0-30C1-11CF-A7EB-0000C05BAE0B}" MAJOR 1 MINOR 1
'HTMLOBJECT "{3050F1C5-98B5-11CF-BB82-00AA00BDCE0B}" MAJOR 4 MINOR 0
'Set to continue in case of error
On Error Resume Next
'Remove any missing references
For i = ThisWorkbook.Vbproject.References.Count To 1 Step -1
Set theRef = ThisWorkbook.Vbproject.References.Item(i)
If theRef.isbroken = True Then
ThisWorkbook.Vbproject.References.Remove theRef
End If
Next i
'Clear any errors so that error trapping for GUID additions can be evaluated
err.Clear
'Add the reference
ThisWorkbook.Vbproject.References.AddFromGuid _
GUID:=strGUID, Major:=1, Minor:=0
ThisWorkbook.Vbproject.References.AddFromFile "C:\Windows\System32\UIAutomationCore.dll"
'If an error was encountered, inform the user
Select Case err.Number
Case Is = 32813
'Reference already in use. No action necessary
Case Is = 0
'Reference added without issue
Case Else
'An unknown error was encountered, so alert the user
MsgBox "A problem was encountered trying to" & vbNewLine _
& "add or remove a reference in this file" & vbNewLine & "Please check the " _
& "references in your VBA project!", vbCritical + vbOKOnly, "Error!"
End Select
On Error GoTo 0
End Sub

Related

HTML Element Collection filled from Previous Webpage Rather than Redirected Webpage VBA

The code below navigates to a webpage, fills search boxes with queries, and submits to the results page. However, the final element collection in the script, tdtags, which is defined after the redirect, is pulling data from the original search page, rather than the results page. I currently have the while ie.busy loop and a timed delay in the script, neither of which works. I have also tried waiting until an element only present in the results page becomes available in the html, but this also does not work.
Dim twb As Workbook
Dim ie As Object
Set twb = ThisWorkbook
twb.Activate
Set ie = CreateObject("internetexplorer.application")
'church = Sheets("Control").Range("A2").Value
'minister = Sheets("Control").Range("A4").Value
location = "London" 'Sheets("Control").Range("A6").Value
'denomination = Sheets("Control").Range("A8").Value
With ie
.navigate "http://www.ukchurch.org/index.php"
.Visible = True
Do While .Busy Or .ReadyState <> 4
DoEvents
Loop
End With
Application.Wait (Now + TimeValue("00:00:02"))
Set intags = ie.document.getelementsbytagname("input")
For Each intag In intags
If intag.getattribute("name") = "name" Then
If church <> "" Then
intag.Value = church
End If
ElseIf intag.getattribute("name") = "minister" Then
If minister <> "" Then
intag.Value = minister
End If
ElseIf intag.getattribute("name") = "location" Then
If location <> "" Then
intag.Value = location
End If
Else
End If
Next intag
Set dropopt = ie.document.getelementsbytagname("select")
For Each dropo In dropopt
If dropo.classname = "DenominationDropDown" Then
Set opttags = dropo.getelementsbytagname("option")
For Each opt In opttags
If opt.innertext = denomination Then
opt.Selected = True
End If
Next opt
End If
Next dropo
On Error Resume Next
For Each intag In intags
If intag.getattribute("src") = "images/ukchurch/button-go.jpg" Then
intag.Click
Do While ie.Busy Or ie.ReadyState <> 4
DoEvents
Loop
Application.Wait (Now + TimeValue("00:00:03"))
Exit For
End If
Next intag
Application.Wait (Now + TimeValue("00:00:03"))
Set tdtags = ie.document.getelementsbytagname("td")
For Each td In tdtags
If td.classname = "pText" Then
Debug.Print td.innertext
Debug.Print ie.locationURL
pagecount = Right(td.innertext, InStr(td.innertext, ":"))
End If
Next td
Debug.Print pagecount
End Sub
Any diagnosis would be appreciated.
Automating IE is a pain, so avoid it.
The following function requests the results page directly.
Public Function GetSearchResult(Optional ByVal ResultPage As Integer = 0, Optional ByVal ChurchName As String = "", Optional ByVal Minister As String = "", Optional ByVal ChurchLocation As String = "", Optional ByVal Denomination As String = "") As Object
Dim Request As Object: Set Request = CreateObject("MSXML2.serverXMLHTTP")
Dim Result As Object: Set Result = CreateObject("htmlfile")
Request.Open "POST", "http://www.ukchurch.org/searchresults1.php", False
Request.setRequestHeader "content-type", "application/x-www-form-urlencoded"
Request.send IIf(ResultPage = 0, "", "page=" & ResultPage & "&") & "name=" & ChurchName & "&minister=" & Minister & "&location=" & ChurchLocation & "&denomination=" & Denomination
Result.body.innerHTML = Request.responseText
Set GetSearchResult = Result
End Function
An example that prints the contents of each td with class name pText inside the table containing the search results:
Sub Main()
Dim Document As Object
Set Document = GetSearchResult(ChurchLocation:="London")
Dim ResultRows as Object
Dim ResultRow As Object
Set ResultRows = Document.getElementsByTagName("table")(8).getElementsByTagName("td")
For Each ResultRow in ResultRows
If ResultRow.Classname = "pText" Then
Debug.print ResultRow.innerText
End If
Next
End Sub
Update
You need to add a couple of References to your VBA project to make the following code work.
In the VBA editor, go to the Tools menu, click References, and in the dialog that opens check the following two items: Microsoft XML, v6.0 and Microsoft HTML Object Library.
Public Function GetChurchDetails(ByVal ChurchID As String) As MSHTML.HTMLDocument
Dim Request As New MSXML2.ServerXMLHTTP60
Dim Result As New MSHTML.HTMLDocument
Request.Open "GET", "http://www.ukchurch.org/churchdetails.php?churchid=" & ChurchID, False
Request.send
Result.body.innerHTML = Request.responseText
Set GetChurchDetails = Result
End Function
Sub Main2()
Dim Document As MSHTML.HTMLDocument
Dim Church As MSHTML.HTMLDocument
Set Document = GetSearchResult(ChurchLocation:="London")
Dim ResultRows As MSHTML.IHTMLElementCollection
Dim ResultRow As MSHTML.IHTMLElement
Dim ChurchID As String
'Set ResultRows = Document.getElementsByTagName("table")(8).getElementsByTagName("td")
' all result links on searchresults1.php have a classname of resultslink which makes getting them much easier
Set ResultRows = Document.getElementsByClassName("resultslink")
For Each ResultRow In ResultRows
ChurchID = ResultRow.getAttribute("href")
ChurchID = Mid(ChurchID, InStr(1, ChurchID, "=") + 1)
Set Church = GetChurchDetails(ChurchID)
' code to read data from the page using Church as the Document
' eg: Church.getElementsByTagName("td").....
Next
End Sub
You only need to use "POST" when you're submitting data; for everything else you can use "GET".
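When data is submitted with POST as form-urlencoded text, values that contain spaces, ampersands or similar characters generally need URL-encoding first. The sketch below is one possible way to build such a body. BuildFormBody is a hypothetical helper name, and it assumes Excel 2013 or later on Windows, where WorksheetFunction.EncodeURL is available.

```vba
' Hypothetical helper: builds "name=value&name=value" from name/value pairs,
' URL-encoding each value. Field names are assumed to be plain ASCII.
Public Function BuildFormBody(ParamArray pairs() As Variant) As String
    Dim i As Long
    Dim body As String
    ' Walk the arguments two at a time: name, value, name, value, ...
    For i = LBound(pairs) To UBound(pairs) - 1 Step 2
        If Len(body) > 0 Then body = body & "&"
        body = body & pairs(i) & "=" & _
               Application.WorksheetFunction.EncodeURL(CStr(pairs(i + 1)))
    Next i
    BuildFormBody = body
End Function
```

For example, Request.send BuildFormBody("name", ChurchName, "location", ChurchLocation) would send the two fields with their values encoded.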

Retrieving all Excel file links from a webpage

I'm trying to get all the downloadable Excel file links from the website, but I'm having difficulty. Please guide me. Thanks.
Sub TYEX()
Dim internet As Object
Dim internetdata As Object
Dim div_result As Object
Dim header_links As Object
Dim link As Object
Dim URL As String
Set internet = CreateObject("InternetExplorer.Application")
internet.Visible = True
URL = "https://www.jpx.co.jp/markets/public/short-selling/index.html"
internet.Navigate URL
Do Until internet.ReadyState >= 4
DoEvents
Loop
Application.Wait Now + TimeSerial(0, 0, 5)
Set internetdata = internet.Document
Set div_result = internetdata.getElementById("readArea")
Set header_links = div_result.getElementsByTagName("td")
For Each h In header_links
Set link = h.ChildNodes.item(0)
Cells(Range("A" & Rows.Count).End(xlUp).Row + 1, 1) = link.href
Next
MsgBox "done"
End Sub
You had the idea down correctly, but here's a different approach:
Sub TYEX()
Dim ie As Object
Set ie = CreateObject("InternetExplorer.Application")
With ie
.navigate "https://www.jpx.co.jp/markets/public/short-selling/index.html"
.Visible = True
Do While .Busy Or .readyState < 4
DoEvents
Loop
Dim doc As Object, tbl As Object
Set doc = .document
Set tbl = doc.getElementsByClassName("component-normal-table")(0).Children(0)
Dim r As Long, xlsArr(), a As Object
With tbl.Rows
ReDim xlsArr(1 To .Length - 1)
For r = 1 To .Length - 1 ' 0 is the table header
xlsArr(r) = .Item(r).Children(1).innerHTML
Next r
End With
With CreateObject("VBScript.RegExp")
.Pattern = "<a href=""(\/markets.*?\.xls)"
For r = 1 To UBound(xlsArr)
xlsArr(r) = "https://www.jpx.co.jp" & .Execute(xlsArr(r))(0).SubMatches(0)
Debug.Print xlsArr(r)
Next
End With
End With
'Add to sheet
Dim ws As Worksheet, rng As Range
Set ws = ThisWorkbook.Worksheets(1)
With ws
Set rng = .Range(.Cells(NextRow(ws), 1), .Cells(NextRow(ws) + UBound( _
xlsArr) - 1, 1))
rng.Value = Application.Transpose(xlsArr)
End With
End Sub
Public Function NextRow(ByVal ws As Worksheet, Optional ByVal col As Variant = 1) As Long
With ws
NextRow = .Cells(.Rows.Count, col).End(xlUp).Row + 1
End With
End Function
Breaking Down the Code
This will loop your html table rows. We start at 1, because 0 is actually just the table header.
With tbl.Rows
ReDim xlsArr(1 To .Length - 1)
For r = 1 To .Length - 1 ' 0 is the table header
xlsArr(r) = .Item(r).Children(1).innerHTML
Next r
End With
This uses regular expressions to extract the url from the innerHTML property. You can see how this particular regex works here: Regex101
With CreateObject("VBScript.RegExp")
.Pattern = "<a href=""(\/markets.*?\.xls)"
For r = 1 To UBound(xlsArr)
xlsArr(r) = "https://www.jpx.co.jp" & .Execute(xlsArr(r))(0).SubMatches(0)
Debug.Print xlsArr(r)
Next
End With
You will size your range to be the same size as your array that contains the links, then write the array to the worksheet. This is usually much faster than writing cells one-by-one.
'Add to sheet
Dim ws As Worksheet, rng As Range
Set ws = ThisWorkbook.Worksheets(1)
With ws
Set rng = .Range(.Cells(NextRow(ws), 1), .Cells(NextRow(ws) + UBound( _
xlsArr) - 1, 1))
rng.Value = Application.Transpose(xlsArr)
End With
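The same idea in isolation, as a small sketch with placeholder data (the file names and the target cell A1 are assumptions for illustration): size the target range from the array bounds with Resize, then write everything in one assignment. Note that it overwrites A1:A3 on the first worksheet.

```vba
Sub WriteArrayToColumn()
    Dim items(1 To 3) As Variant
    ' Placeholder values standing in for the extracted links
    items(1) = "a.xls": items(2) = "b.xls": items(3) = "c.xls"

    With ThisWorkbook.Worksheets(1)
        ' Transpose turns the one-dimensional array into a single column,
        ' and Resize makes the target range exactly as tall as the array
        .Range("A1").Resize(UBound(items) - LBound(items) + 1, 1).Value = _
            Application.Transpose(items)
    End With
End Sub
```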
You can use an attribute = value CSS selector with $ operator to say the href value must end with .xls. Then use querySelectorAll to retrieve all matched results. Using CSS selectors is a very fast and generally robust method.
Dim list As Object
Set list = ie.document.querySelectorAll("[href$='.xls']")
It is much faster to use XMLHTTP as well, rather than opening IE. Note that you can then pass these links to a function to perform a binary download or to URLMon for downloading.
Option Explicit
Public Sub Links()
Dim sResponse As String, html As HTMLDocument, list As Object, i As Long
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.jpx.co.jp/markets/public/short-selling/index.html", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
Set html = New HTMLDocument
With html
.body.innerHTML = sResponse
Set list = html.querySelectorAll("[href$='.xls']")
End With
For i = 0 To list.Length - 1
Debug.Print Replace$(list.item(i), "about:", "https://www.jpx.co.jp")
Next
End Sub
Example download function (though you could re-use your existing XMLHTTP object - this is just to illustrate):
Public Function DownloadFile(ByVal downloadFolder As String, ByVal downloadURL As String) As String
Dim http As Object , tempArr As Variant
Set http = CreateObject("WinHttp.WinHttpRequest.5.1")
http.Open "GET", downloadURL, False
http.send
On Error GoTo errhand
With CreateObject("ADODB.Stream")
.Open
.Type = 1
.write http.responseBody
tempArr = Split(downloadURL, "/")
tempArr = tempArr(UBound(tempArr))
.SaveToFile downloadFolder & tempArr, 2 'downloadFolder must end with a path separator; 2 = overwrite, which is OK if the file is not modified locally
.Close
End With
DownloadFile = downloadFolder & tempArr
Exit Function
errhand:
If Err.Number <> 0 Then
Debug.Print Err.Number, Err.Description
MsgBox "Download failed"
End If
DownloadFile = vbNullString
End Function
References (VBE > Tools > References):
Microsoft HTML Object Library

VBA reads HTML from the old page after clicking submit button

I am not a programmer, but I have managed to learn a few things in VBA. Now, on a certain website, I face a problem that does not occur on others.
What should happen is that a page form is filled in with data, the submit button is clicked, and then I get some data from the result page.
The first phase works fine, but it seems that no matter what I do, VBA still reads data from the page as it was before submit was clicked.
The code is:
Sub VIES2()
'Launch Internet Explorer and pause further action until it reaches the ready state
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.Navigate "http://ec.europa.eu/taxation_customs/vies/?locale=pl"
Do While IE.ReadyState <> 4: DoEvents: Loop
'Fill in the form with the appropriate values and click the check button
IE.document.getElementbyId("countryCombobox").Value = "IT"
IE.document.getElementbyId("number").Value = "01802840023"
IE.document.getElementbyId("requesterCountryCombobox").Value = "IT"
IE.document.getElementbyId("requesterNumber").Value = "01802840023"
IE.document.getElementbyId("submit").Click
'Test retrieving the description and the request identifier
For t = 1 To 999999
Next t
Application.Wait Now + TimeValue("00:00:10")
Do While IE.ReadyState <> 4: DoEvents: Loop
For t = 1 To 999999
Next t
Application.Wait Now + TimeValue("00:00:10")
MsgBox IE.LocationURL
Set Text = IE.document.getElementsbyClassName("layout-content")
For Each Element In Text
MsgBox Element.innerText
Next
Set Test = IE.document.getElementsbyTagName("TABLE")
For Each Element In Test
MsgBox Element.innerText
Next
End Sub
I have tried adding breakpoints, various wait loops, and Application.Wait, as suggested in similar questions where it seems to have worked. Here, even long after the page has fully loaded, the code still reads the old page; at least the URL and some of the data it pulls suggest that this is the case.
UPDATE: I should also add that I have tried to make the macro refresh the page, but that clears the input. Interestingly, the target URL is:
http://ec.europa.eu/taxation_customs/vies/vatResponse.html
If I change the initial page to this, the browser instantly redirects to the original page with a notification that the initial data is needed. The macro then fills in the data and clicks the submit button. In this case IE.LocationURL shows this URL:
http://ec.europa.eu/taxation_customs/vies/vatResponse.html
but the content I get with getElementsbyClassName still comes from the initial page:
http://ec.europa.eu/taxation_customs/vies/?locale=pl
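This worked for me; it prints out the VAT response table.
Note:
PtrSafe is required on 64-bit Office. Remove it only if you are on Office 2007 or earlier (VBA6), which does not recognize the keyword.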
Code:
Option Explicit
Declare PtrSafe Sub sleep Lib "kernel32" Alias "Sleep" (ByVal dwmilliseconds As Long)
Public Sub VIES2()
Application.ScreenUpdating = False
Dim IE As Object
'Launch Internet Explorer and pause further action until it reaches the ready state
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
IE.navigate "http://ec.europa.eu/taxation_customs/vies/?locale=pl"
Do While IE.ReadyState <> 4: DoEvents: Loop
'Fill in the form with the appropriate values and click the check button
IE.document.getElementById("countryCombobox").Value = "IT"
IE.document.getElementById("number").Value = "01802840023"
IE.document.getElementById("requesterCountryCombobox").Value = "IT"
IE.document.getElementById("requesterNumber").Value = "01802840023"
IE.document.getElementById("submit").Click
sleep (5000) 'or increase to 10000
Dim tbl As Object
Set tbl = IE.document.getElementById("vatResponseFormTable")
Dim ws As Worksheet
Set ws = ActiveWorkbook.Worksheets.Add
ws.Name = "Results"
Dim rng As Range, currentRow As Object, currentColumn As Object, i As Long, outputRow As Long
outputRow = outputRow + 1
Set rng = ws.Range("B" & outputRow)
For Each currentRow In tbl.Rows
For Each currentColumn In currentRow.Cells
rng.Value = currentColumn.outerText
Set rng = rng.Offset(, 1)
i = i + 1
Next currentColumn
outputRow = outputRow + 1
Set rng = rng.Offset(1, -i)
i = 0
Next currentRow
Application.ScreenUpdating = True
End Sub
Although QHarr's solution works on my end, here is another one with no hardcoded delay in the script.
Using IE as your question was:
Sub Get_Data()
Dim HTML As HTMLDocument, post As Object, elems As Object
Dim elem As Object, r&, c&
With New InternetExplorer
.Visible = False
.navigate "http://ec.europa.eu/taxation_customs/vies/?locale=pl"
While .Busy = True Or .readyState < 4: DoEvents: Wend
Set HTML = .document
With HTML
.getElementById("countryCombobox").Value = "IT"
.getElementById("number").Value = "01802840023"
.getElementById("requesterCountryCombobox").Value = "IT"
.getElementById("requesterNumber").Value = "01802840023"
.getElementById("submit").Click
Do: Set post = .getElementById("vatResponseFormTable"): DoEvents: Loop While post Is Nothing
For Each elems In post.Rows
For Each elem In elems.Cells
c = c + 1: Cells(r + 1, c) = elem.innerText
Next elem
c = 0: r = r + 1
Next elems
End With
.Quit
End With
End Sub
References to add to the project:
1. Microsoft Internet Controls
2. Microsoft HTML Object Library
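The polling loop above waits forever if the element never appears. A possible variation, sketched here under assumptions, adds a timeout and re-reads the browser's document on every pass so that it does not keep looking at a document from before the navigation. WaitForElement and the 15-second default are hypothetical, not part of the answer. If the old page happens to contain an element with the same id, it returns immediately.

```vba
' Hypothetical helper: polls for an element id until it appears or time runs out.
' Returns Nothing on timeout. browser is an InternetExplorer instance.
Public Function WaitForElement(ByVal browser As Object, ByVal elementId As String, _
                               Optional ByVal timeoutSeconds As Long = 15) As Object
    Dim deadline As Date
    Dim found As Object
    deadline = DateAdd("s", timeoutSeconds, Now)
    Do
        DoEvents
        Set found = Nothing
        ' The document can be briefly unavailable while the page is changing
        On Error Resume Next
        Set found = browser.document.getElementById(elementId)
        On Error GoTo 0
        If Not found Is Nothing Then
            Set WaitForElement = found
            Exit Function
        End If
    Loop While Now < deadline
    ' Falls through with Nothing when the element never appeared
End Function

' Possible usage after clicking submit (ieApp is an assumed variable name):
'   Set post = WaitForElement(ieApp, "vatResponseFormTable")
'   If post Is Nothing Then MsgBox "Timed out waiting for the result table"
```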
Using an XMLHTTP request (it is way faster than IE):
Sub Get_Data()
Dim elems, elem As Object
Dim QueryString$, S$, r&, c&
QueryString = "memberStateCode=IT&number=01802840023&traderName=&traderStreet=&traderPostalCode=&traderCity=&requesterMemberStateCode=IT&requesterNumber=01802840023&action=check&check=Weryfikuj"
With New XMLHTTP
.Open "POST", "http://ec.europa.eu/taxation_customs/vies/vatResponse.html", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded; charset=UTF-8"
.send QueryString
S = .responseText
End With
With New HTMLDocument
.body.innerHTML = S
For Each elems In .getElementById("vatResponseFormTable").Rows
For Each elem In elems.Cells
c = c + 1: Cells(r + 1, c) = elem.innerText
Next elem
c = 0: r = r + 1
Next elems
End With
End Sub
References to add to the project:
1. Microsoft XML, V6
2. Microsoft HTML Object Library
Most of the time, you should first check whether a REST/SOAP service is available for this kind of task.
Using an Internet Explorer instance for this is total overkill.
Try this simple function, that uses the SOAP service to validate VAT numbers:
Function IsVatValid(country_code, vat_number)
Dim objHTTP As Object
Dim xmlDoc As Object
Set objHTTP = CreateObject("WinHttp.WinHttpRequest.5.1")
sURL = "http://ec.europa.eu/taxation_customs/vies/services/checkVatService"
sEnv = "<s11:Envelope xmlns:s11='http://schemas.xmlsoap.org/soap/envelope/'>" & _
"<s11:Body>" & _
"<tns1:checkVat xmlns:tns1='urn:ec.europa.eu:taxud:vies:services:checkVat:types'>" & _
"<tns1:countryCode>" & country_code & "</tns1:countryCode>" & _
"<tns1:vatNumber>" & vat_number & "</tns1:vatNumber>" & _
"</tns1:checkVat>" & _
"</s11:Body>" & _
"</s11:Envelope>"
objHTTP.Open "Post", sURL, False
objHTTP.setRequestHeader "Content-Type", "text/xml"
objHTTP.setRequestHeader "SOAPAction", "checkVatService"
objHTTP.send (sEnv)
objHTTP.waitForResponse
Set xmlDoc = CreateObject("HTMLFile")
xmlDoc.body.innerHTML = objHTTP.responsetext
IsVatValid = CBool(xmlDoc.getElementsByTagName("valid")(0).innerHTML)
Set xmlDoc = Nothing
Set objHTTP = Nothing
End Function
And then you can simply validate all your vat numbers:
Debug.Print IsVatValid("IT", "01802840023")
>>> True

VBA - Internet Explorer 11 -Get Text from webpage

I have a webpage:
https://www.brcdirectory.com/InternalSite/Site.aspx?BrcSiteCode=1832583
I want to retrieve some text from this page, from within a HTML <Span ID>.
<span id="ctl00_ContentPlaceHolder1_FormView1_GridView1_ctl02_lb_ExpiryDate">Expiry Date : 07/12/2017</span>
I have IE 11.0.9600.18639
Via Excel, I am using the below code to open IE 11, navigate to the page and want to try and display a message box of the text inside the <SPAN>.
Code:
Option Explicit
Sub GoToWebsiteTest()
Application.ScreenUpdating = False
Application.DisplayAlerts = False
Dim appIE As Object
Dim objElement As Object
Dim objCollection As Object
Dim i As Long, LastRow As Long, sFolder As String
Dim sURL As String, FILE As String
LastRow = Range("I" & Rows.Count).End(xlUp).Row
For i = 6 To LastRow
Set appIE = New InternetExplorerMedium
sURL = "https://www.brcdirectory.com/InternalSite/Site.aspx?BrcSiteCode=" & Range("I392").Value
With appIE
.navigate sURL
.Visible = True
End With
Do While appIE.Busy Or appIE.READYSTATE <> READYSTATE_COMPLETE
DoEvents
Loop
Set objCollection = appIE.document.getElementById("ctl00_ContentPlaceHolder1_FormView1_GridView1_ctl02_lb_ExpiryDate")
MsgBox Replace(objCollection.innerText, "Expiry Date : ", "")
appIE.Quit
Set appIE = Nothing
Next i
Application.ScreenUpdating = True
Application.DisplayAlerts = True
MsgBox "All BRCs Succesfully Updated."
End Sub
I have tried everything! I have tried so many variations of this line where I get the error:
Do While appIE.Busy Or appIE.READYSTATE <> READYSTATE_COMPLETE
But alas I get this annoying error:
Runtime Error: -2147467259 (80004005)
Method 'Busy' of object 'IWebBrowser2' failed.
Please, please can someone show me what I am doing wrong? This is driving me crazy. Thanks in advance.
If you don't want to use the "get from web" you can use this code.
Sub expiry()
Dim RE As Object
Dim HTML As String
Set RE = CreateObject("vbscript.regexp")
HTML = GetHTML("https://www.brcdirectory.com/InternalSite/Site.aspx?BrcSiteCode=1832583")
'Expiry Date : 07/12/2017
RE.Pattern = "(Expiry Date : \d{2}\/\d{2}\/\d{4})"
RE.Global = True
RE.IgnoreCase = True
Set Matches = RE.Execute(HTML)
ExpiryDate = Matches.Item(0).submatches.Item(0)
End Sub
Function GetHTML(URL As String) As String
Dim HTML As String
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.Send
GetHTML = .ResponseText
End With
End Function
ExpiryDate will contain the text you wanted (I think).
If you only wanted the actual date you can use RE.Pattern = "Expiry Date : (\d{2}\/\d{2}\/\d{4})"
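To illustrate the difference between the two patterns, here is a small self-contained sketch that runs the second pattern against a sample string taken from the question instead of the live page. The procedure name ExpiryDateDemo is made up, and the Count check is an addition so that a page without the text does not raise an error.

```vba
Sub ExpiryDateDemo()
    Dim re As Object
    Dim matches As Object
    Set re = CreateObject("VBScript.RegExp")
    ' The parentheses capture only the date, not the "Expiry Date : " label
    re.Pattern = "Expiry Date : (\d{2}\/\d{2}\/\d{4})"
    re.IgnoreCase = True
    Set matches = re.Execute("<span>Expiry Date : 07/12/2017</span>")
    If matches.Count > 0 Then
        Debug.Print matches(0).SubMatches(0)   ' prints 07/12/2017
    Else
        Debug.Print "No expiry date found"
    End If
End Sub
```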
EDIT based on download to textfile.
Private Declare Function URLDownloadToFileA Lib "urlmon" (ByVal pCaller As Long, _
ByVal szURL As String, ByVal szFileName As String, ByVal dwReserved As Long, _
ByVal lpfnCB As Long) As Long
Sub expiry()
Dim RE As Object
Dim HTML As String
Dim MyData As String
Set RE = CreateObject("vbscript.regexp")
DownloadFile "https://www.brcdirectory.com/InternalSite/Site.aspx?BrcSiteCode=1832583", "C:\TEST\goog.txt"
Open "C:\TEST\goog.txt" For Binary As #1
HTML = Space$(LOF(1))
Get #1, , HTML
Close #1
'Expiry Date : 07/12/2017
RE.Pattern = "(Expiry Date : \d{2}\/\d{2}\/\d{4})"
RE.Global = True
RE.IgnoreCase = True
Set Matches = RE.Execute(HTML)
ExpiryDate = Matches.Item(0).submatches.Item(0)
End Sub
Private Function DownloadFile(URL As String, LocalFilename As String) As Boolean
'Thanks Mentalis:)
Dim lngRetVal As Long
lngRetVal = URLDownloadToFileA(0, URL, LocalFilename, 0, 0)
If lngRetVal = 0 Then DownloadFile = True
End Function
EDIT again.
I managed to resolve this by using the following code:
Option Explicit
Private ieBrowser As InternetExplorer
Sub GetBRCText()
Dim i As Long, LastRow As Long
Dim a As Range, b As Range
Dim strDocHTML As String, strDocHTML2 As String
Dim dteStartTime As Date
Application.ScreenUpdating = False
Application.DisplayAlerts = False
On Error Resume Next
LastRow = ThisWorkbook.ActiveSheet.Range("I" & Rows.Count).End(xlUp).Row
Set a = Range("I6:I" & LastRow)
'Create a browser object
Set ieBrowser = CreateObject("internetexplorer.application")
For Each b In a.Rows
If Not IsEmpty(b) Then
'Start Browsing loop
ieBrowser.navigate "https://www.brcdirectory.com/InternalSite/Site.aspx?BrcSiteCode=" & b.Value
dteStartTime = Now
Do While ieBrowser.READYSTATE <> READYSTATE_COMPLETE
If DateDiff("s", dteStartTime, Now) > 240 Then Exit Sub
Loop
On Error Resume Next
strDocHTML = ieBrowser.document.getElementById("ctl00_ContentPlaceHolder1_FormView1_GridView1_ctl02_lb_ExpiryDate").innerHTML
strDocHTML2 = ieBrowser.document.getElementById("ctl00_ContentPlaceHolder1_FormView1_GridView1_ctl02_lb_Grade").innerHTML
b.Offset(0, 2).Value = Replace(strDocHTML, "Expiry Date : ", "")
b.Offset(0, 1).Value = Replace(strDocHTML2, "Grade : ", "")
End If
Next b
ieBrowser.Quit
Set ieBrowser = Nothing
Application.ScreenUpdating = True
Application.DisplayAlerts = True
End Sub

Clicking checklist (IE - HTML). Why not work?

Good morning! I am developing a macro to click a few buttons on a page, but some are not working. Does anyone know why?
I posted my worksheet on: http://www.sendspace.com/file/3if8c6
Thanks in advance for any help.
Silmar
In the spreadsheet, I posted a picture showing the two buttons that I cannot click. Anyway, I'll post the code here.
Thanks!
Sub test()
Dim ie
Dim obj
Dim obj2
Dim linkCollection2
Dim elemCollection
Dim t As Integer
Dim r As Integer, c As Integer
Set ie = CreateObject("internetexplorer.application")
ie.Navigate2 "http://www.infomoney.com.br/mercados/agendas"
ie.Visible = True
Do While ie.Busy
Loop
Do Until ie.Document.ReadyState = "complete"
Loop
Dim LinkFound As Boolean
Dim linkCollection
Set linkCollection = ie.Document.getElementsByTagName("A")
For Each link In linkCollection
If link.InnerText = "Resultados" Then
LinkFound = True
link.Click
Exit For
End If
Next
If Not LinkFound Then
MsgBox "Link Not Found!"
Exit Sub
End If
Do While ie.Busy
Loop
Do Until ie.Document.ReadyState = "complete"
Loop
For Each obj In ie.Document.All.Item("ctl00$cphContent$ctl02$ddlReferencePage").Options
If obj.InnerText = "3T12" Then
obj.Selected = True
' ie.Document.forms(0).submit
Exit For
End If
Next obj
Set linkCollection2 = ie.Document.getElementsByTagName("A")
For Each link In linkCollection2
If link.InnerText = "Resultados" Then
LinkFound = True
link.Click
Exit For
End If
Next
Set elemCollection = ie.Document.getElementsByTagName("TABLE")
For t = 0 To elemCollection.Length - 1
For r = 0 To elemCollection(t).Rows.Length - 1
For c = 0 To elemCollection(t).Rows(r).Cells.Length - 1
ThisWorkbook.Worksheets(1).Cells(r + 1, c + 1) = elemCollection(t).Rows(r).Cells(c).InnerText
Next c
Next r
Next t
' The error happens here
' TO FROM HERE DOES NOT WORK. WHY?
For Each obj2 In ie.Document.All.Item("tblCInvestorData_length").Options
If obj.InnerText = "100" Then
obj2.Selected = True
' ie.Document.forms(0).submit
Exit For
End If
Next obj2
End Sub