I have the VBA code below to automate IE, extract the figures from an HTML table, and populate the data into an Excel table. Is it possible to do the same thing by automating the Edge browser? Since my company doesn't allow us to install any third-party applications, Selenium is not an option. As I am not too familiar with coding, I would highly appreciate it if someone could offer some sample code.
Sub sfc_esg_list()
    Dim IE As New InternetExplorer
    Dim doc As New MSHTML.HTMLDocument

    IE.Visible = True
    'Use the IE browser to navigate to the SFC website
    IE.navigate "https://www.sfc.hk/en/Regulatory-functions/Products/List-of-ESG-funds"
    Do
        DoEvents
        'Application.Wait (Now() + TimeValue("00:00:04"))
    Loop Until IE.readyState = 4

    Set doc = IE.Document
    Set TRs = doc.getElementsByTagName("tr")
    Sheets("ESG list_SFC").Activate

    'Copy and paste the ESG fund list from the SFC website to sheet "ESG list_SFC"
    With Sheets("ESG list_SFC")
        .Cells.Clear
        For Each TR In TRs
            r = r + 1
            For Each Cell In TR.Children
                C = C + 1
                .Cells(r, C).NumberFormat = "#"
                .Cells(r, C) = Cell.innerText
            Next Cell
            C = 0
        Next TR
    End With

    IE.Quit
    Set doc = Nothing
    Set IE = Nothing

    'Save the file
    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    'ActiveWorkbook.Save
End Sub
IE is pretty much dead at this point. I think it should be something like this.
Sub TryMe()
    Dim request As Object
    Dim oHtml As HTMLDocument
    Dim htmlText As String

    Set request = CreateObject("MSXML2.XMLHTTP")
    Set oHtml = New HTMLDocument

    request.Open "GET", "https://www.sfc.hk/en/Regulatory-functions/Products/List-of-ESG-funds/", False
    request.send
    oHtml.body.innerHTML = request.responseText

    htmlText = oHtml.getElementsByClassName("tablesorter tablesorter-default tablesorterfcd4c178102ad8")(0).outerHTML

    With CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") 'MSForms DataObject (clipboard)
        .SetText htmlText
        .PutInClipboard
        Sheets(1).Range("A1").Select
        Sheets(1).PasteSpecial Format:="Unicode Text"
    End With
End Sub
I thought the class name was 'tablesorter tablesorter-default tablesorterfcd4c178102ad8' but it doesn't seem to work, and I'm not sure why. Can you play around with some other class names? When you hit F12, you will see the HTML code behind the page.
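The trailing part of that class name ("tablesorterfcd4c178102ad8") looks like a dynamically generated suffix, so it can change between page loads and a match on the full string will then fail. Below is a minimal sketch that avoids the issue by grabbing the first table element on the page instead; it assumes the fund list is the first table (check with F12), and the Sub name is just for illustration. If the table is injected by JavaScript after the page loads, XMLHTTP will not see it at all and a different approach is needed.

Sub TryMeByTag()
    Dim request As Object
    Dim oHtml As New HTMLDocument
    Dim tables As Object
    Dim htmlText As String

    Set request = CreateObject("MSXML2.XMLHTTP")
    request.Open "GET", "https://www.sfc.hk/en/Regulatory-functions/Products/List-of-ESG-funds/", False
    request.send
    oHtml.body.innerHTML = request.responseText

    'Grab the first table on the page instead of relying on the generated class suffix
    Set tables = oHtml.getElementsByTagName("table")
    If tables.Length > 0 Then
        htmlText = tables(0).outerHTML
        With CreateObject("new:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}") 'MSForms DataObject (clipboard)
            .SetText htmlText
            .PutInClipboard
            Sheets(1).Range("A1").Select
            Sheets(1).PasteSpecial Format:="Unicode Text"
        End With
    Else
        MsgBox "No table found - the list is probably rendered by JavaScript."
    End If
End Sub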
I have 15 different URLs, and I need to fetch the price from each website into a particular column in Excel. Can you please help me out? It's my first VBA program; I tried, but it shows a syntax error.
Sub myfile()
    Dim IE As New InternetExplorer
    Dim url As String
    Dim item As HTMLHtmlElement
    Dim Doc As HTMLDocument
    Dim tagElements As Object
    Dim element As Object
    Dim lastRow

    Application.ScreenUpdating = False
    Application.DisplayAlerts = False
    Application.EnableEvents = False
    Application.Calculation = xlCalculationManual

    url = "https://wtb.app.channeliq.com/buyonline/D_nhoFMJcUal_LOXlInI_g/TOA-60?html=true"
    IE.navigate url
    IE.Visible = True
    Do
        DoEvents
    Loop Until IE.readyState = READYSTATE_COMPLETE

    Set Doc = IE.document
    lastRow = Sheet1.UsedRange.Rows.Count + 1
    Set tagElements = Doc.all.tags("tr")
    For Each element In tagElements
        If InStr(element.innerText, "ciq-price") > 0 And _
           InStr(element.className, "ciq-product-name") > 0 Then
            Sheet1.Cells(lastRow, 1).Value = element.innerText
            'Exit the For loop once you get the price to avoid unnecessary processing
            Exit For
        End If
    Next

    IE.Quit
    Set IE = Nothing
    Application.ScreenUpdating = True
    Application.DisplayAlerts = True
    Application.EnableEvents = True
    Application.Calculation = xlCalculationAutomatic
End Sub
You can't just copy any web scraping macro for your purposes. Every page has its own HTML structure, so you must write a separate web scraping macro for every page.
I can't explain everything about web scraping with VBA here. Please start your research with "excel vba web scraping" and "document object model". You will also need knowledge of HTML and CSS, and ideally of JavaScript as well.
The error message "User-defined type not defined" occurs because you use early binding without references to the libraries Microsoft HTML Object Library and Microsoft Internet Controls. You can set a reference via Tools -> References...; for the differences between the two approaches, see Early Binding v/s Late Binding and Microsoft's article Using early binding and late binding in Automation.
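To illustrate the difference, here is a minimal sketch (the Sub names are just examples): with early binding the declared types come from the referenced libraries, so the references must be set; with late binding everything is declared As Object and created with CreateObject, so no references are needed, but you lose IntelliSense and named constants such as READYSTATE_COMPLETE (use the literal 4 instead).

'--- Early binding: requires Tools -> References... with
'    "Microsoft Internet Controls" and "Microsoft HTML Object Library" ticked.
Sub EarlyBindingExample()
    Dim IE As SHDocVw.InternetExplorer      'types come from the referenced libraries
    Dim doc As MSHTML.HTMLDocument
    Set IE = New SHDocVw.InternetExplorer
    'named constants such as READYSTATE_COMPLETE are available here
    IE.Quit
End Sub

'--- Late binding: no references needed, but no IntelliSense and no named constants.
Sub LateBindingExample()
    Dim IE As Object
    Dim doc As Object
    Set IE = CreateObject("InternetExplorer.Application")
    'use the literal value 4 instead of READYSTATE_COMPLETE
    IE.Quit
End Sub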
To get the prices from the URL shown, you can use the following macro. I use late binding:
Option Explicit

Sub myfile()
    Dim IE As Object
    Dim url As String
    Dim tagElements As Object
    Dim element As Object
    Dim item As Object
    Dim lastRow As Long

    lastRow = ActiveSheet.UsedRange.Rows.Count + 1
    url = "https://wtb.app.channeliq.com/buyonline/D_nhoFMJcUal_LOXlInI_g/TOA-60?html=true"

    Set IE = CreateObject("internetexplorer.application")
    IE.navigate url
    IE.Visible = True
    Do: DoEvents: Loop Until IE.readyState = 4

    Set tagElements = IE.document.getElementsByClassName("ciq-online-offer-item ")
    For Each element In tagElements
        Set item = element.getElementsByTagName("td")(1)
        ActiveSheet.Cells(lastRow, 1).Value = Trim(item.innerText)
        lastRow = lastRow + 1
    Next

    IE.Quit
    Set IE = Nothing
End Sub
Edit for a second example:
The new link leads to a single offer. I assume the product's price is what should be fetched, so no loop is needed. You just have to find out which HTML segment contains the price, and then you can decide how to get at it. In the end, only two lines of VBA are needed to write the price into the Excel worksheet.
I'm in Germany, and Excel automatically changed the currency sign from dollar to euro, which is of course wrong. Depending on where you are, this may have to be intercepted (see the sketch after the code below).
Sub myfile2()
    Dim IE As Object
    Dim url As String
    Dim tagElements As Object
    Dim lastRow As Long

    lastRow = ActiveSheet.UsedRange.Rows.Count + 1
    url = "https://www.wayfair.com/kitchen-tabletop/pdx/cuisinart-air-fryer-toaster-oven-cui3490.html"

    Set IE = CreateObject("internetexplorer.application")
    IE.navigate url
    IE.Visible = True
    Do: DoEvents: Loop Until IE.readyState = 4
    'Break for 3 seconds
    Application.Wait (Now + TimeSerial(0, 0, 3))

    Set tagElements = IE.document.getElementsByClassName("BasePriceBlock BasePriceBlock--highlight")(0)
    ActiveSheet.Cells(lastRow, 1).Value = Trim(tagElements.innerText)

    IE.Quit
    Set IE = Nothing
End Sub
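Regarding the currency remark above, one way to intercept the locale issue is not to let Excel interpret the scraped text at all: strip everything except digits and the decimal point, convert the remainder yourself, and set the cell's number format explicitly. This is just a sketch of that idea; the Sub name is hypothetical and the price text and target cell are those used in myfile2.

Sub WritePriceAsNumber(ByVal priceText As String, ByVal target As Range)
    Dim cleaned As String
    Dim i As Long
    Dim ch As String

    'Keep only digits and the decimal point from e.g. "$199.95"
    For i = 1 To Len(priceText)
        ch = Mid$(priceText, i, 1)
        If ch Like "[0-9.]" Then cleaned = cleaned & ch
    Next i

    If Len(cleaned) > 0 Then
        'Val() always uses "." as decimal separator, independent of the Windows locale
        target.Value = Val(cleaned)
        'Force a dollar format so the local currency symbol is not substituted
        target.NumberFormat = "$#,##0.00"
    End If
End Sub

In myfile2 you would then call, for example, WritePriceAsNumber Trim(tagElements.innerText), ActiveSheet.Cells(lastRow, 1) instead of assigning the raw text to the cell.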
I have an HTML file saved locally on the desktop which contains a table of statistics. I need to pull specific data from it, paste it into an Excel workbook table, and then email it.
I've got the rest of the process working; I'm just struggling to figure out how to parse the HTML file, as all the other examples I've seen parse a website rather than a locally saved HTML file.
Apologies if this is a bit of a beginner question, but I'm finding it hard to make sense of the other examples I've seen.
Thank you for any assistance.
Thank you to everyone for your examples and for pointing me in the right direction! The example posted below copies the data from an HTML file stored on the user's desktop and pastes it into a new worksheet in Excel.
Option Explicit

Sub ParseHTML()
    Dim URL As String
    Dim IE As InternetExplorer
    Dim htmldoc As MSHTML.IHTMLDocument             'Document object
    Dim eleColtr As MSHTML.IHTMLElementCollection   'Element collection for tr tags
    Dim eleColtd As MSHTML.IHTMLElementCollection   'Element collection for td tags
    Dim htmlTables As MSHTML.IHTMLElementCollection 'Element collection for table tags
    Dim eleRow As MSHTML.IHTMLElement               'Row elements
    Dim eleCol As MSHTML.IHTMLElement               'Column elements
    Dim wksOut As Worksheet
    Dim rngOut As Range
    Dim intTableIndex As Integer
    Dim intRowIndex As Integer
    Dim intColIndex As Integer

    URL = Environ("userProfile") & "\desktop\FileName.HTML"

    'Open InternetExplorer.
    Set IE = New InternetExplorer

    'Navigate to URL.
    With IE
        .navigate URL
        .Visible = False

        'Extract html information to objects.
        Set htmldoc = IE.document
        Set htmlTables = htmldoc.getElementsByTagName("table")
        Set eleColtr = htmlTables(intTableIndex).getElementsByTagName("tr")

        'Extract table to a new blank worksheet.
        On Error Resume Next
        Set wksOut = ThisWorkbook.Worksheets("WorksheetName")
        If Err.Number <> 0 Then
            Set wksOut = ThisWorkbook.Worksheets.Add(After:=Worksheets(Worksheets.Count))
            wksOut.Name = "WorksheetName"
        End If
        With wksOut
            .Cells.Clear
            .Cells.NumberFormat = "General"
            .Cells.ColumnWidth = 2
        End With
        On Error GoTo 0

        'This section populates Excel
        intRowIndex = 0
        For Each eleRow In eleColtr
            'Get all the td elements in that specific tr
            Set eleColtd = htmlTables(intTableIndex).getElementsByTagName("tr")(intRowIndex).getElementsByTagName("td")
            Set rngOut = wksOut.Range("A1000000").End(xlUp).Offset(1, 0)
            intColIndex = 0
            For Each eleCol In eleColtd
                rngOut.Offset(0, intColIndex) = eleCol.innerText
                intColIndex = intColIndex + 1
            Next eleCol
            intRowIndex = intRowIndex + 1
        Next eleRow

        wksOut.Cells.EntireColumn.AutoFit

        'Cleanup
        IE.Quit
        Set IE = Nothing
        Set htmldoc = Nothing
        Set htmlTables = Nothing
        Set eleColtr = Nothing
        Set eleColtd = Nothing
        Set wksOut = Nothing
        Set rngOut = Nothing
    End With
End Sub
Please note that Excel may throw a runtime "Automation Error" on the line:
Set IE = New InternetExplorer
If this happens, try setting the InternetExplorer integrity level to Medium:
Set IE = New InternetExplorerMedium
If you need more information regarding InternetExplorer integrity levels, please see
https://blogs.msdn.microsoft.com/ieinternals/2011/08/03/default-integrity-level-and-automation/
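If you want the macro to handle both cases without editing the declaration each time, a hedged sketch like the one below falls back to InternetExplorerMedium only when the plain InternetExplorer class raises the automation error (this assumes the Microsoft Internet Controls reference is set, as in the code above; the Sub name is just an example).

Sub CreateIEWithFallback()
    Dim IE As Object

    On Error Resume Next
    Set IE = New SHDocVw.InternetExplorer           'may fail with an Automation Error
    If IE Is Nothing Then
        Set IE = New SHDocVw.InternetExplorerMedium 'medium-integrity fallback
    End If
    On Error GoTo 0

    If IE Is Nothing Then
        MsgBox "Could not start Internet Explorer."
        Exit Sub
    End If

    IE.Visible = True
    IE.navigate "about:blank"
    '...continue as in ParseHTML...
    IE.Quit
End Sub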
As Tim mentioned, I could open the file in Excel and copy and paste the values, which runs a lot faster:
Sub CopyHTML()
    Dim Wb As Workbook
    Dim Ws As Worksheet

    Set Wb = ActiveWorkbook
    Set Ws = Wb.Sheets("Sheet1")

    'Opens the html file and copies the range
    Workbooks.Open (Environ("userProfile") & "\desktop\FileName.html")
    Range("A1:AJ21").Select
    Selection.Copy

    'Pastes the range at cell B5 in the original workbook
    Wb.Activate
    Range("B5").Select
    Ws.Paste

    Application.CutCopyMode = False
    Workbooks("FileName.html").Close
End Sub
Thanks for the advice, Tim!
I am trying to learn how to parse data from HTML using Excel VBA. I found an example online which works fine, but when I change the URL from www.yahoo.com to a local HTML file on C: it gives me the error "Method 'Busy' of object 'IWebBrowser2' failed". The code is:
Sub GetBodyText()
    Dim URL As String
    Dim Data As String
    Dim ie As Object
    Dim ieDoc As Object
    Dim myarray As Variant
    Dim i As Long

    URL = "file:///C:/test.html"
    Set ie = CreateObject("InternetExplorer.Application")
    ie.navigate URL
    Do Until (ie.readyState = 4 And Not ie.Busy)
        DoEvents
    Loop

    Set ieDoc = ie.Document
    Data = ieDoc.body.innerText

    'Split Data into separate lines
    'or just use Range("A1") = Data
    myarray = Split(Data, vbCrLf)
    For i = 0 To UBound(myarray)
        'Start writing in cell A1
        Cells(i + 1, 1) = myarray(i)
    Next

    ie.Quit
    Set ie = Nothing
    Set ieDoc = Nothing
End Sub
For IE, just use:
URL = "c:\test.html"
I have been trying desperately for months to automate a process whereby a CSV file is downloaded, named, and saved in a given location.
So far I have only managed, with Excel VBA, to open the web page and click the button to download the CSV file, but the code stops and requires manual intervention to complete. I would like it to be fully automated if possible.
See the code used (I am not the author):
Sub WebDataExtraction()
    Dim URL As String
    Dim IeApp As Object
    Dim IeDoc As Object
    Dim ieForm As Object
    Dim ieObj As Object
    Dim objColl As Collection

    URL = "http://www.bmreports.com/bsp/BMRSSystemData.php?pT=DDAD&zT=N&dT=NRT"
    Set IeApp = CreateObject("InternetExplorer.Application")
    IeApp.Visible = True
    IeApp.Navigate URL
    Do Until IeApp.ReadyState = READYSTATE_COMPLETE
    Loop

    Set IeDoc = IeApp.Document
    For Each ele In IeApp.Document.getElementsByTagName("span")
        If ele.innerHTML = "CSV" Then
            Application.Wait (Now + TimeValue("0:00:15"))
            DoEvents
            ele.Click
            'At this point you need to Save the document manually
            'or figure out for yourself how to automate this interaction.
            Test_Save_As_Set_Filename
            File_Download_Click_Save
        End If
    Next

    IeApp.Quit
End Sub
Thanks in advance,
Nunzio
I am posting a second answer since, although I believe my first answer is adequate for many similar applications, it does not work in this instance.
Why the other methods fail:
The .Click method: this raises a new window which expects user input at run-time, and it doesn't seem to be possible to use the WinAPI to control this window, or at least not in any way that I can determine. Code execution stops on the .Click line until the user manually intervenes; there is no way to use a GoTo, a Wait, or any other method to circumvent this behavior.
Using a WinAPI function to download the source file directly does not work either, since the button's URL does not point to a file, but rather to a JS function that serves the file dynamically (a sketch of that direct-download approach follows for reference).
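For reference, the direct-download approach mentioned above normally looks like the sketch below; the URL and target path are placeholders, and it only works when the link points at an actual file, which is exactly what this site does not offer.

'Declare the WinAPI function at the top of a module
'(on older 32-bit VBA6 hosts, drop PtrSafe and use Long instead of LongPtr)
Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" _
    (ByVal pCaller As LongPtr, ByVal szURL As String, ByVal szFileName As String, _
     ByVal dwReserved As Long, ByVal lpfnCB As LongPtr) As Long

Sub DownloadFileDirectly()
    Dim result As Long
    'Placeholder URL and path: this only succeeds if the URL serves a real file
    result = URLDownloadToFile(0, "http://example.com/data.csv", "C:\Temp\data.csv", 0, 0)
    If result = 0 Then
        MsgBox "Downloaded OK"
    Else
        MsgBox "Download failed, code " & result
    End If
End Sub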
Here is my proposed workaround solution:
You can read the webpage's .body.innerText, write it out to a plain text/CSV file using FileSystemObject, and then, with a combination of regular expressions and string manipulation, parse the data into a properly delimited CSV file.
Sub WebDataExtraction()
    Dim url As String
    Dim fName As String
    Dim lnText As String
    Dim varLine() As Variant
    Dim vLn As Variant
    Dim newText As String
    Dim leftText As String
    Dim breakTime As Date
    '## Requires reference to Microsoft VBScript Regular Expressions 5.5
    Dim REMatches As MatchCollection
    Dim m As Match
    '## Requires reference to Microsoft Internet Controls
    Dim IeApp As InternetExplorer
    '## Requires reference to Microsoft HTML object library
    Dim IeDoc As HTMLDocument
    Dim ele As HTMLFormElement
    '## Requires reference to Microsoft Scripting Runtime
    Dim fso As FileSystemObject
    Dim f As TextStream
    Dim ln As Long: ln = 1

    breakTime = DateAdd("s", 60, Now)
    url = "http://www.bmreports.com/bsp/BMRSSystemData.php?pT=DDAD&zT=N&dT=NRT"
    Set IeApp = CreateObject("InternetExplorer.Application")
    With IeApp
        .Visible = True
        .Navigate url
        Do Until .ReadyState = 4
        Loop
        Set IeDoc = .Document
    End With

    'Wait for the data to display on the page
    Do
        If Now >= breakTime Then
            If MsgBox("The website is taking longer than usual, would you like to continue waiting?", vbYesNo) = vbNo Then
                GoTo EarlyExit
            Else
                breakTime = DateAdd("s", 60, Now)
            End If
        End If
    Loop While Trim(IeDoc.body.innerText) = "XML CSV Please Wait Data Loading Sorting"

    '## Create the text file
    fName = ActiveWorkbook.Path & "\exported-csv.csv"
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set f = fso.OpenTextFile(fName, 2, True, -1)
    f.Write IeDoc.body.innerText
    f.Close
    Set f = Nothing

    '## Read the text file
    Set f = fso.OpenTextFile(fName, 1, False, -1)
    Do
        lnText = f.ReadLine
        '## The data starts on the 4th line in the InnerText.
        If ln >= 4 Then
            '## Return a collection of matching date/timestamps to which we can parse
            Set REMatches = SplitLine(lnText)
            newText = lnText
            For Each m In REMatches
                newText = Replace(newText, m.Value, ("," & m.Value & ","), , -1, vbTextCompare)
            Next
            '## Get rid of consecutive delimiters:
            Do
                newText = Replace(newText, ",,", ",")
            Loop While InStr(1, newText, ",,", vbBinaryCompare) <> 0
            '## Then use some string manipulation to parse out the first 2 columns which are
            '   not a match to the RegExp we used above.
            leftText = Left(newText, InStr(1, newText, ",", vbTextCompare) - 1)
            leftText = Left(leftText, 10) & "," & Right(leftText, Len(leftText) - 10)
            newText = Right(newText, Len(newText) - InStr(1, newText, ",", vbTextCompare))
            newText = leftText & "," & newText
            '## Store these lines in an array
            ReDim Preserve varLine(ln - 4)
            varLine(ln - 4) = newText
        End If
        ln = ln + 1
    Loop While Not f.AtEndOfStream
    f.Close

    '## Re-open the file for writing the delimited lines:
    Set f = fso.OpenTextFile(fName, 2, True, -1)
    '## Iterate over the array and write the data in CSV:
    For Each vLn In varLine
        'Omit blank lines, if any.
        If Len(vLn) <> 0 Then f.WriteLine vLn
    Next
    f.Close

EarlyExit:
    Set fso = Nothing
    Set f = Nothing
    IeApp.Quit
    Set IeApp = Nothing
End Sub
Function SplitLine(strLine As String) As MatchCollection
    'Returns a RegExp MatchCollection of date/timestamps found in each line
    '## Requires reference to Microsoft VBScript Regular Expressions 5.5
    Dim RE As RegExp
    Dim matches As MatchCollection

    Set RE = CreateObject("vbscript.regexp")
    With RE
        .MultiLine = False
        .Global = True
        .IgnoreCase = True
        '## Use this RegEx pattern to parse the date & timestamps:
        .Pattern = "(19|20)\d\d[-](0[1-9]|1[012])[-](0[1-9]|[12][0-9]|3[01])[ ]\d\d?:\d\d:\d\d"
    End With
    Set matches = RE.Execute(strLine)
    Set SplitLine = matches
End Function
EDIT
I tested my original answer code using the URL:
http://www.bmreports.com/bsp/BMRSSystemData.php?pT=DDAD&zT=N&dT=NRT#saveasCSV
But this method does not seem to work for this site. The ele.Click doesn't seem to initiate the download; it just opens the data as a table on the webpage. To download, you need to do the right-click/save-as. If you have gotten that far (as I suspect, based on the subroutines you are calling, for which you did not provide the code), then you can probably use the Win API to get the HWND of the Save dialog and possibly automate that event. Santosh provides some information on that:
VBA - Go to website and download file from save prompt
Here is also a good resource that should help solve your problem:
http://social.msdn.microsoft.com/Forums/en-US/beb6fa0e-fbc8-49df-9f2e-30f85d941fad/download-file-from-ie-with-vba
Original Answer
If you are able to determine the URL of the CSV, then you can use this subroutine to open a connection to the CSV data and import it directly into the workbook. You may need to automate a text-to-columns operation on the imported data, but that can easily be replicated with the macro recorder. I put an example of this in the Test() subroutine below.
You could easily modify this to add the QueryTable in a new workbook, and then automate the SaveAs method on that workbook to save the file as a CSV (a sketch of this follows the code below).
This example uses a known URL for Yahoo Finance, Ford Motor Company, and adds a QueryTable with the CSV data at cell A1 of the active worksheet. This can be modified pretty easily to put it in another sheet, another workbook, etc.
Sub Test()
    Dim MyURL As String
    MyURL = "http://ichart.finance.yahoo.com/table.csv?s=GM&a0&b=1&c2010&d=05&e=20&f=2013&g=d&ignore=.csv"
    OpenURL MyURL

    'Explode the CSV data:
    Range("A:A").TextToColumns Destination:=Range("A1"), DataType:=xlDelimited, _
        TextQualifier:=xlDoubleQuote, ConsecutiveDelimiter:=False, Tab:=False, _
        Semicolon:=False, Comma:=True, Space:=False, Other:=False, FieldInfo _
        :=Array(Array(1, 3), Array(2, 1), Array(3, 1), Array(4, 1), Array(5, 1), Array(6, 1), _
        Array(7, 1)), TrailingMinusNumbers:=True
End Sub

Private Sub OpenURL(fullURL As String)
    'This opens the CSV in a QueryTables connection.
    On Error GoTo ErrOpenURL
    With ActiveSheet.QueryTables.Add(Connection:= _
        "URL;" & fullURL, Destination:=Range("A1"))
        .Name = fullURL
        .FieldNames = True
        .RowNumbers = False
        .FillAdjacentFormulas = False
        .PreserveFormatting = True
        .RefreshOnFileOpen = False
        .BackgroundQuery = True
        .RefreshStyle = xlInsertDeleteCells
        .SavePassword = True
        .SaveData = True
        .AdjustColumnWidth = True
        .RefreshPeriod = 0
        .WebSelectionType = xlEntirePage
        .WebFormatting = xlWebFormattingAll
        .WebPreFormattedTextToColumns = True
        .WebConsecutiveDelimitersAsOne = True
        .WebSingleBlockTextImport = False
        .WebDisableDateRecognition = False
        .WebDisableRedirections = False
        .Refresh BackgroundQuery:=False
    End With

ExitOpenURL:
    Exit Sub 'If all goes well, you can exit here

'Error handling...
ErrOpenURL:
    Err.Clear
    bCancel = True
    Resume ExitOpenURL
End Sub
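As mentioned above, the same QueryTable approach can be pointed at a fresh workbook and the result saved straight out as a CSV. The sketch below is one hedged way to do that; the URL and output path are placeholders, the Sub name is hypothetical, and it assumes it sits in the same module as OpenURL (which writes to the ActiveSheet, so the new workbook's sheet is activated first).

Sub ImportAndSaveAsCsv()
    Dim wbNew As Workbook
    Dim MyURL As String

    MyURL = "http://example.com/data.csv"   'placeholder URL

    Set wbNew = Workbooks.Add               'new workbook receives the QueryTable
    wbNew.Sheets(1).Activate                'OpenURL imports into the ActiveSheet
    OpenURL MyURL

    Application.DisplayAlerts = False       'suppress the CSV format prompt
    wbNew.SaveAs Filename:="C:\Temp\exported.csv", FileFormat:=xlCSV
    Application.DisplayAlerts = True
    wbNew.Close SaveChanges:=False
End Sub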