send Enter(~) key using vba to an outside IE object - html

I'm stuck on the following task: from an Excel file, I send data (the cell values of a certain column in the sheet) to a specific, predefined IE object.
A macro writes the data into a textarea field on the website (identified by its ID, id1). After each cell value is written into the textarea, the Enter key needs to be sent automatically, because this makes the website fill in some of its empty input fields. The code is listed below.
I'm struggling to find a way to send the Enter key to the textarea automatically after the data is inserted.
Updated code:
Sub adddata()
Dim objIE As Object
Dim objTR As Object
Dim i As Long, j As Long, counter As Long 'declare each type explicitly; "Dim i, j" alone yields Variants
Dim lastRow As Long
counter = 1
Set objIE = GetIeByTitle("https://exampletest.com", True, True)
Dim lastRow2 As Long
lastRow2 = Workbooks(path1).Worksheets("Test").Range("A" & Rows.Count).End(xlUp).Row
Workbooks(path1).Worksheets("Test").Activate
'First of all the unhidden files have to be take from the target excel file
'Selecting the unhidden lines from the excel file
For j = 1 To lastRow2
If Rows(j).EntireRow.Hidden = False Then
Workbooks(path1).Worksheets("Test").Range("A" & j & ":Z" & j).Select
Selection.Copy
ThisWorkbook.Worksheets("Test2").Range("A" & counter).PasteSpecial
counter = counter + 1
End If
Next j
'look into the new excel file containing just the unhidden lines
'Afterwards a look up through new excel file cells
For i = 2 To counter - 1 'counter points one past the last copied row
objIE.document.getelementbyid("id1").Value = Worksheets("Test2").Range("C" & i).Value ' taking the value from the cell and adding it on the text area field
objIE.document.getelementbyid("id1").Focus 'DOM elements expose Focus, not SetFocus
Application.SendKeys "~" ' sending the enter key
Application.Wait Now + TimeValue("00:00:07") ' Wait takes a clock time, so add the 7 second delay to Now
'2nd field
objIE.document.getelementbyid("id2").Value = Worksheets("Test2").Range("D" & i).Value
'3rd field
objIE.document.getelementbyid("id3").Value = Worksheets("Test2").Range("E" & i).Value
Next i
End Sub

It looks like you're missing a closing bracket here:
Set objIE = GetIeByTitle("https://exampletest.com", True, True)
Try the explicit key name instead of the bare tilde (braces around the tilde, "{~}", would send a literal ~ character rather than Enter):
Application.SendKeys "{ENTER}"
EDIT:
It sounds like there may be code attached to the text box's change event; perhaps try:
objIE.document.getelementbyid("id1").fireevent ("onchange")
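A minimal sketch of the event-based route, assuming the page reacts to the textarea's change event and that the document runs in a legacy IE document mode where fireEvent is still available. FillAndNotify is a hypothetical helper name, and the 7 second pause simply mirrors the question.

```vba
' Hypothetical helper: writes a value into a page element and notifies the page.
' Assumes objIE is an already-located InternetExplorer instance.
Sub FillAndNotify(ByVal objIE As Object, ByVal fieldId As String, ByVal newValue As String)
    Dim el As Object
    Set el = objIE.document.getElementById(fieldId)
    If el Is Nothing Then Exit Sub              ' field not present on the page

    el.Value = newValue
    el.Focus
    el.fireEvent "onchange"                     ' legacy IE document modes only

    ' Application.Wait expects a clock time, so add the delay to Now
    Application.Wait Now + TimeValue("00:00:07")
End Sub
```

Inside the loop it could be called as FillAndNotify objIE, "id1", Worksheets("Test2").Range("C" & i).Value. If the page listens for a key event rather than a change event, this sketch will not trigger it.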

Related

Parsing Html in VBA using a query get request

I am using someone else's code because this is an old file that other people still use. I want to update it to make it more efficient, but I need a little help. Below is the VBA operation. I need it to fetch the information but discard everything except one word, which changes every time the operation is run. I could use a regex with objRE.Pattern = "|", but the word changes depending on the status.
HTML:
<span onmouseover="ShowText('Message','blahblah'); return true;"
onmouseout="HideText('Message'); return true;"
href="javascript:ShowText('Message')">---(PSA)---</span>
</font><a href='?srn=numbers12131131'target='_self'><font color='#6666FF'
size='3'>numbers123232343</font></a><font size='3'>----Installed----MUM
Indication:In Scope-<font color='#00CC00'>PASS WITH WARNING</font>--- (20181018)
</td><tr></table> </b><br>
<table class="OrderForm" width="1000"> '
I just want the Installed status in my excel sheet.
VBA code that needs work:
Sub GetComment()
Dim book As Workbook
Dim sheet As Worksheet
Dim row As Integer
Dim SRN As String
Dim whttp As Object
Set book = ThisWorkbook
Set sheet = book.Worksheets("CMT Data")
Set whttp = CreateObject("WinHTTP.WinHTTPrequest.5.1")
row = 2
SRN = sheet.Cells(row, 1)
Do While SRN <> ""
Debug.Print SRN
whttp.Open "GET", "www.websitedatgoeshere.com" & SRN, False
whttp.SetRequestHeader "Cookie", "mycookiefromwebsite;"
whttp.send
Debug.Print whttp.responseText
sheet.Cells(row, 2) = whttp.responseText
row = row + 1
SRN = sheet.Cells(row, 1)
Loop
Set whttp = Nothing
End Sub
This works if, and only if, the word always sits between "----" and "----" and that pair is the first occurrence in the response. If it is not the first, adjust the index 1 as required.
Debug.Print Split(Split(whttp.responseText, "----")(1), "----")(0)
sheet.Cells(row, 2) = Split(Split(whttp.responseText, "----")(1), "----")(0)
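The same idea can be wrapped in a small function that returns an empty string instead of raising a "subscript out of range" error when the delimiters are missing. This is only a sketch; TextBetween and its index parameter are hypothetical names, not part of the original code.

```vba
' Returns the chunk at position "index" when the text is split on delim,
' but only if another delimiter follows it; otherwise returns "".
Function TextBetween(ByVal source As String, ByVal delim As String, _
                     Optional ByVal index As Long = 1) As String
    Dim parts() As String
    parts = Split(source, delim)

    ' index < UBound(parts) guarantees a closing delimiter after the chunk
    If index >= 1 And index < UBound(parts) Then
        TextBetween = parts(index)
    End If
End Function
```

In the loop this would read sheet.Cells(row, 2) = TextBetween(whttp.responseText, "----").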

Access 2007: Hide Data Labels on Chart Object via vba with 0 values?

I have a form in Access 2007 with a Stacked Bar Chart Object that is dynamically generated depending on the current date and outputs a PDF of the chart.
Everything generates and works fine, but what is happening is data labels are being applied even for series with a Null or 0 value. This leads to a mess of text in various places.
I'm looking for a way via VBA to remove any labels that belong to a series with no values.
I've tried ruling out null values from the SQL query and also setting the format options so the 0 values won't show. I have tried looping through the series and applying a label if the value is > 0, but if I set it to apply the series name it still puts it for blank values.
EDIT Current Code:
Option Compare Database
Private Sub Form_Load()
Dim tstChart As Graph.Chart
On Error GoTo Form_Load_Error
Set tstChart = [Forms]!testing!barEquip.Object
With tstChart
.HasTitle = True
.ChartTitle.Font.Size = 14
.ChartTitle.Text = VBA.Strings.MonthName(VBA.DatePart("m", VBA.Date()) - 1) & " " & VBA.DatePart("yyyy", VBA.Date()) & _
" Test Title"
For Each srs In .SeriesCollection
For Each pt In srs.Points
pt.DataLabel.Text = "Y"
Next
Next
End With
On Error GoTo 0
Exit Sub
Form_Load_Error:
MsgBox "Error " & Err.Number & " (" & Err.Description & ") in procedure Form_Load of VBA Document Form_testing"
End Sub
I'm able to change each label, but I can't seem to figure out a way to check each point in the series points.
EDIT:
SOLVED
(Simple, but works fine)
Sub AdjustDataLabels(cht As Chart)
Dim srs As Series
Dim pt As Point
Dim vals As Variant
For Each srs In cht.SeriesCollection
'Apply Value labels
srs.ApplyDataLabels (xlDataLabelsShowValue)
For Each pt In srs.Points
'Check for empty labels
If pt.DataLabel.Text = "" Then
'Do nothing
Else
'Add Series Name then remove Value
pt.DataLabel.ShowSeriesName = True
pt.DataLabel.ShowValue = False
End If
Next
Next
End Sub
You are using a Graph.Chart instead of a Chart. Graph.Chart objects are more limited in what you can do with them, which is what I was afraid of. But perhaps this can help anyway.
The idea is to first ensure that the series data labels are being displayed.
Once we know they are displayed, iterate over the points and selectively change each point's DataLabel.Text property based on its current value. I'm assuming the value being displayed is 0, and that you simply want to hide the label when it is 0 and leave the other labels alone.
Within your procedure we will call another sub to do this:
Set tstChart = [Forms]!testing!barEquip.Object
With tstChart
.HasTitle = True
.ChartTitle.Font.Size = 14
.ChartTitle.Text = VBA.Strings.MonthName(VBA.DatePart("m", VBA.Date()) - 1) & " " & VBA.DatePart("yyyy", VBA.Date()) & _
" Test Title"
Call AdjustDataLabels(tstChart) 'Call a procedure to modify the labels as needed
End With
So that code will now call on another sub-procedure:
Sub AdjustDataLabels(cht As Graph.Chart)
Dim srs As Graph.Series
Dim pt As Graph.Point
Dim vals As Variant
For Each srs In cht.SeriesCollection
'First, ensure the dataLabels are ON
srs.ApplyDataLabels
For Each pt In srs.Points
'Now, check the datalabels one by one, testing for your criteria
If pt.DataLabel.Text = " some condition " Then
'Criteria met, so blank out this datalabel
'pt.HasDataLabel = False
'OR:
pt.DataLabel.Text = vbNullString
Else
'If you need to make any adjustments to other labels,
' you can do that here.
' For example you could simply append the series name.
' Modify as needed.
pt.DataLabel.Text = pt.DataLabel.Text & " -- " & srs.Name
End If
Next
Next
End Sub
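As a concrete instance of the placeholder condition above, here is a sketch that blanks only labels whose text is a numeric zero. It assumes the labels show values (not series or category names) and uses late binding, so it needs no reference to the Graph library; HideZeroDataLabels is a hypothetical name.

```vba
' Blanks data labels that display a numeric zero; leaves all others untouched.
Sub HideZeroDataLabels(ByVal cht As Object)
    Dim srs As Object, pt As Object
    Dim txt As String

    For Each srs In cht.SeriesCollection
        srs.ApplyDataLabels                     ' make sure labels exist first
        For Each pt In srs.Points
            txt = pt.DataLabel.Text
            If IsNumeric(txt) Then              ' skips empty or non-numeric labels
                If CDbl(txt) = 0 Then pt.DataLabel.Text = vbNullString
            End If
        Next pt
    Next srs
End Sub
```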

Search on website conditional of changing different cells

I am currently trying to write some VBA code whose general purpose is:
Type input into a cell "A" in a defined range.
To the right of this particular cell (within the range), I want different information from a website search to be printed next to the input cell "A".
More specifically, I want to be able to type specific stock tickers into a column (e.g. "IBM"). When I do this, a procedure should start that goes to finance.yahoo.com, collects different information about this specific stock (e.g. "International Business Machines"), and prints it to the right of the cell where I typed the stock ticker in the first place.
The goal is to be able to type in 20-30 stock tickers, and make it retrieve the information for every ticker and print it to the right of those stock tickers.
I have already figured out how to type in one ticker in a specific cell, and make it retrieve the desired data from the website, and then print it into specific cells "to the right" of the input-cell.
My challenge now, is to be able to do this for a large range of cells below the first input-cell.
My code as of now:
Private Sub Worksheet_Change(ByVal Target As Range)
If Target.Row = Range("Sheet1!$A$2").Row And _
Target.Column = Range("Sheet1!$A$2").Column Then
Dim IE As New InternetExplorer
IE.Visible = False
IE.navigate "http://finance.yahoo.com/q;_ylt=AiMiBWm16z_q5Ai0SlNb3jaiuYdG;_ylu=X3oDMTBxdGVyNzJxBHNlYwNVSCAzIERlc2t0b3AgU2VhcmNoIDEx;_ylg=X3oDMTBsdWsyY2FpBGxhbmcDZW4tVVMEcHQDMgR0ZXN0Aw--;_ylv=3?s=" & Range("Sheet1!$A$2").Value 'This types in the value from my input-cell into the website, so i get directed to the webpage for this particular company.
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Dim Doc As HTMLDocument
Set Doc = IE.document
Dim Name_001 As String 'Info-cell with name of the company
Dim Ticker_001 As String 'Info-cell with current price of the company
Name_001 = Trim(Doc.getElementsByClassName("title")(0).innerText)
Ticker_001 = Trim(Doc.getElementsByClassName("yfi_rt_quote_summary_rt_top sigfig_promo_1")(0).innerText)
IE.Quit
Dim Nam_001 As Variant
Dim Tic_001 As Variant
Nam_001 = Split(Name_001, "(")
Tic_001 = Split(Ticker_001, " ")
Range("Sheet1!$B$2").Value = Nam_001(0)
Range("Sheet1!$C$2").Value = Tic_001
End If
End Sub
"Sheet1!$A$2": The input-cell
"Sheet1!$B$2": First output cell for the name of the stock ticker.
"Sheet1!$C$2": Second output cell for the current price of the stock ticker.
Right now my code only works for one row; I want it to work for all the rows beneath the current input cell.
I hope my question is clear and that you can help me with my problem. (I am not allowed to post any pictures, so I can't show you my workbook; however, I have tickers in column "A", names in column "B" and prices in column "C".)
Thanks in advance - Juhlers.
I modified your code: (1) changed the sheet name; (2) removed the anchored row ($2) and replaced it with 'Target.Row'; (3) added an error trap, since the lookup sometimes raises an error; (4) changed the cursor to 'Busy', since the lookup takes a few seconds. Try the following:
Option Explicit
Private Sub Worksheet_Change(ByVal Target As Range)
Dim IE As New InternetExplorer
Dim Doc As HTMLDocument
Dim Name_001 As String 'Info-cell with name of the company
Dim Ticker_001 As String 'Info-cell with current price of the company
Dim Nam_001 As Variant
Dim Tic_001 As Variant
On Error GoTo Error_Trap
If Target.column <> 1 Then
Exit Sub
End If
Application.Cursor = xlWait
IE.Visible = False
IE.navigate "http://finance.yahoo.com/q;_ylt=AiMiBWm16z_q5Ai0SlNb3jaiuYdG;_ylu=X3oDMTBxdGVyNzJxBHNlYwNVSCAzIERlc2t0b3AgU2VhcmNoIDEx;_ylg=X3oDMTBsdWsyY2FpBGxhbmcDZW4tVVMEcHQDMgR0ZXN0Aw--;_ylv=3?s=" & Range("Stocks!$A" & Target.row).value 'This types in the value from my input-cell into the website, so i get directed to the webpage for this particular company.
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
Set Doc = IE.document
Name_001 = Trim(Doc.getElementsByClassName("title")(0).innerText)
Ticker_001 = Trim(Doc.getElementsByClassName("yfi_rt_quote_summary_rt_top sigfig_promo_1")(0).innerText)
IE.Quit
Nam_001 = Split(Name_001, "(")
Tic_001 = Split(Ticker_001, " ")
Range("Stocks!$B" & Target.row).value = Nam_001(0)
Range("Stocks!$C" & Target.row).value = Tic_001(0) 'Tic_001 is an array; write its first token (the price)
Application.Cursor = xlNormal
Exit Sub
Error_Trap:
Application.Cursor = xlNormal
MsgBox "Error: " & Err.Number & vbTab & Err.Description
Exit Sub
End Sub
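If several tickers are pasted at once, Target covers more than one cell. A sketch of an outer handler that walks every changed cell in column A follows. LookupTicker is a hypothetical placeholder standing in for the IE lookup code, and the strings it writes are dummy values, not real data. It would replace the existing Worksheet_Change handler rather than sit beside it.

```vba
' Sheet-module sketch: processes every changed cell in column A, so a pasted
' block of tickers is handled row by row.
Private Sub Worksheet_Change(ByVal Target As Range)
    Dim changed As Range, cell As Range

    Set changed = Intersect(Target, Me.Columns(1))
    If changed Is Nothing Then Exit Sub

    Application.EnableEvents = False        ' avoid re-entering while results are written
    On Error GoTo CleanUp
    For Each cell In changed.Cells
        If Len(CStr(cell.Value)) > 0 Then LookupTicker cell
    Next cell

CleanUp:
    Application.EnableEvents = True         ' always restore event handling
End Sub

' Hypothetical placeholder: the IE lookup would go here, writing the company
' name to cell.Offset(0, 1) and the price to cell.Offset(0, 2).
Private Sub LookupTicker(ByVal cell As Range)
    cell.Offset(0, 1).Value = "(name for " & cell.Value & ")"   ' dummy value
    cell.Offset(0, 2).Value = "(price)"                         ' dummy value
End Sub
```

Switching events off while writing keeps the handler from firing again for the output cells, and the CleanUp label restores them even if a lookup fails.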

How to create individual HTML publish pages using loop through excel spreadsheet

I am trying to create individual HTML pages using a loop function on an excel spreadsheet. I had been manually publishing each page but I have thousands of entries, so I need an automated method using macro. I recorded a macro with the steps I use through the manual approach shown below:
Sub HTMLexport()
Columns("A:W").Select
With ActiveWorkbook.PublishObjects.Add(xlSourceRange, _
"C:\Users\<user_name>\Desktop\Excel2HTML\Articles\1045_VSE.htm", _
"Sheet1", "$A:$W", xlHtmlStatic, _
"FileName_10067", "")
.Publish (True)
End With
Columns("W:W").Select
Selection.EntireColumn.Hidden = True
End Sub
Ultimately what I want is to be able to have Column A and the next column selected (ex. B, C, H, Etc), then have those two published into an HTML page. The name of the file I would like to be based on a cell reference. Ex. Cell W3 would have a value of 1045, and the file name be saved as 1045_VSE.htm, where _VSE is constant through the loop process. That way, each new HTML page name would increment based on the cell reference. Once the HTML page is saved, hide the column and move to the next one, rinse and repeat. Any help with this would be amazing. Thanks in advance.
This should be fairly straightforward to put inside a loop.
Here is an example. I assume that the filename will come from the first row/second column in the sub-range; you can easily modify this, or ask me how and I can revise. I also assume that the Div ID ("FileName_10067") is constant. Again, this can easily be modified if needed.
Sub HTMLinLoop()
Dim wb As Workbook: Set wb = ActiveWorkbook
Dim ws As Worksheet: Set ws = ActiveSheet
Dim rng As Range '## The full range including all columns'
Dim subRng As Range '## a variable to contain each publishObjects range'
Dim pObj As PublishObject '## A variable to contain each publishObject as we create it.'
Dim p As Long '## use this integer to iterate over the columns in rng'
Dim fileName As String '## represents just the file name to export'
Dim exportFileName As String '## the full file path for each export'
Dim divName As String '## variable for the DivID argument, assume static for now'
Set rng = ws.Range("A3:W30") '## modify as needed'
For p = 1 To rng.Columns.Count - 1 '## stop one short, since each pass uses column p + 1'
'Identify the sub-range to use for this HTML export:'
' this will create ranges like "A:B", then "A:C", then "A:D", etc.'
Set subRng = Range(rng.Columns(1).Address, rng.Columns(p + 1).Address)
'Create the filename:'
'## modify as needed, probably using a range offset.'
fileName = subRng.Cells(1, 2).Value & "_VSE.htm"
'Concatenate the filename & path:'
'## modify as needed.'
exportFileName = "C:\Users\" & Environ("Username") & "\Desktop\" & fileName
'Create the DIV ID:'
divName = "FileName_10067" '## modify as needed, probably using a range offset.'
'## Now, create the publish object with the above arguments:'
Set pObj = wb.PublishObjects.Add( _
SourceType:=xlSourceRange, _
fileName:=exportFileName, _
Sheet:=ws.Name, _
Source:=subRng.Address, _
HtmlType:=xlHtmlStatic, _
DivID:=divName, _
Title:="")
'## Finally, publish it!'
pObj.Publish
'## Hide the last column:'
rng.Columns(p+1).EntireColumn.Hidden = True
Next
End Sub

html parsing of cricinfo scorecards

Aim
I am looking to scrape 20/20 cricket scorecard data from the Cricinfo website, ideally into CSV form for data analysis in Excel
As an example the current Australian Big Bash 2011/12 scorecards are available from
Game 1: http://www.espncricinfo.com/big-bash-league-2011/engine/match/524915.html
Last Game: http://www.espncricinfo.com/big-bash-league-2011/engine/match/524935.html
Background
I am proficient in using VBA (either automating IE or using XMLHTTP and then using regular expressions) to scrape data from websites, ie
Extract values from HTML TD and Tr
In that same question a comment was posted suggesting HTML parsing, which I hadn't come across before, so I have taken a look at questions such as RegEx match open tags except XHTML self-contained tags
Query
While I could write a regex to parse the cricket data below I would like advice as to how I could efficiently retrieve these results with html parsing.
Please bear in mind that my preference is a repeatable CSV format containing:
the date/name of the match
Team 1 name
the output should dump up to 11 records for Team 1 (blank records where players haven't batted, ie "Did Not Bat")
Team 2 name
the output should dump up to 11 records for Team 2 (blank records where players haven't batted)
Nirvana for me would be a solution that I could deploy using VBA or VBscript so I could fully automate my analysis, but I presume I will have to use a separate tool for the html parse.
Sample Site links and Data to be Extracted
There are 2 techniques that I use for "VBA". I will describe them one by one.
1) Using FireFox / Firebug Addon / Fiddler
2) Using Excel's inbuilt facility to get data from the web
Since this post will be read by many people, I will cover even the obvious. Please feel free to skip whatever part you already know.
1) Using FireFox / Firebug Addon / Fiddler
FireFox : http://en.wikipedia.org/wiki/Firefox
Free download (http://www.mozilla.org/en-US/firefox/new/)
Firebug Addon: http://en.wikipedia.org/wiki/Firebug_%28software%29
Free download (https://addons.mozilla.org/en-US/firefox/addon/firebug/)
Fiddler : http://en.wikipedia.org/wiki/Fiddler_%28software%29
Free download (http://www.fiddler2.com/fiddler2/)
Once you have installed Firefox, install the Firebug Addon. The Firebug Addon lets you inspect the different elements in a webpage. For example if you want to know the name of a button, simply right click on it and click on "Inspect Element with Firebug" and it will give you all the details that you will need for that button.
Another example would be finding the name of a table on a website which has the data that you need scraped.
I use Fiddler only when I am using XMLHTTP. It helps me see the exact information being passed when you click a button. Because of the increase in the number of bots that scrape sites, most sites now try to prevent automatic scraping by capturing your mouse coordinates and passing that information along, and Fiddler helps in debugging the information being passed. I will not go into much detail about it here, as this information can be used maliciously.
Now let's take a simple example on how to scrape the URL posted in your question
http://www.espncricinfo.com/big-bash-league-2011/engine/match/524915.html
First let's find the name of the table which has that info. Simply right click on the table and click on "Inspect Element with Firebug" and it will give you the below snapshot.
So now we know that our data is stored in a table called "inningsBat1". If we can extract the contents of that table to an Excel file, then we can definitely work with the data to do our analysis. Here is sample code which will dump that table into Sheet1.
Before we proceed, I would recommend closing all Excel instances and starting a fresh one.
Launch the VBA editor and insert a Userform. Place a command button and a WebBrowser control on it. Your Userform might look like this
Paste this code in the Userform code area
Option Explicit
'~~> Set Reference to Microsoft HTML Object Library
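'~~> PtrSafe is required for API declarations in 64-bit Office (VBA7)
#If VBA7 Then
Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If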
Private Sub CommandButton1_Click()
Dim URL As String
Dim oSheet As Worksheet
Set oSheet = Sheets("Sheet1")
URL = "http://www.espncricinfo.com/big-bash-league-2011/engine/match/524915.html"
PopulateDataSheets oSheet, URL
MsgBox "Data Scrapped. Please check " & oSheet.Name
End Sub
Public Sub PopulateDataSheets(wsk As Worksheet, URL As String)
Dim tbl As HTMLTable
Dim tr As HTMLTableRow
Dim insertRow As Long, Row As Long, col As Long
On Error GoTo whoa
WebBrowser1.navigate URL
WaitForWBReady
Set tbl = WebBrowser1.Document.getElementById("inningsBat1")
With wsk
.Cells.Clear
insertRow = 0
For Row = 0 To tbl.Rows.Length - 1
Set tr = tbl.Rows(Row)
If Trim(tr.innerText) <> "" Then
If tr.Cells.Length > 2 Then
If tr.Cells(1).innerText <> "Total" Then
insertRow = insertRow + 1
For col = 0 To tr.Cells.Length - 1
.Cells(insertRow, col + 1) = tr.Cells(col).innerText
Next
End If
End If
End If
Next
End With
whoa:
Unload Me
End Sub
Private Sub Wait(ByVal nSec As Long)
nSec = nSec + Timer
While Timer < nSec
DoEvents
Sleep 100
Wend
End Sub
Private Sub WaitForWBReady()
Wait 1
While WebBrowser1.ReadyState <> 4
Wait 3
Wend
End Sub
Now run your Userform and click on the Command button. You will notice that the data is dumped in Sheet1. See snapshot
Similarly you can scrape other info as well.
2) Using Excel's inbuilt facility to get data from the web
I believe you are using Excel 2007 so I will take that as an example to scrape the above mentioned link.
Navigate to Sheet2. Now navigate to Data Tab and click on the button "From Web" on the extreme right. See snapshot.
Enter the url in the "New Web Query Window" and click on "Go"
Once the page has loaded, select the relevant table that you want to import by clicking on the small arrow as shown in the snapshot. Once done, click on "Import"
Excel will then ask you where you want the data to be imported. Select the relevant cell and click on OK. And you are done! The data will be imported to the cell which you specified.
If you wish you can record a macro and automate this as well :)
Here is the macro that I recorded.
Sub Macro1()
With ActiveSheet.QueryTables.Add(Connection:= _
"URL;http://www.espncricinfo.com/big-bash-league-2011/engine/match/524915.html" _
, Destination:=Range("$A$1"))
.Name = "524915"
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlSpecifiedTables
.WebFormatting = xlWebFormattingNone
.WebTables = """inningsBat1"""
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = False
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
End Sub
Hope this helps. Let me know if you still have some queries.
Sid
For anyone else interested in this, I ended up using the code below, based on Siddharth Rout's earlier answer.
XMLHttp was significantly quicker than automating IE
the code generates a CSV file for each series to be downloaded (held in the X variable)
the code dumps each match to a regular 29 row range (regardless of how many players batted) to facilitate easier analysis later on
Public Sub PopulateDataSheets_XML()
Dim URL As String
Dim ws As Worksheet
Dim lngRow As Long
Dim lngRecords As Long
Dim lngWrite As Long
Dim lngSpare As Long
Dim lngInnings As Long
Dim lngRow1 As Long
Dim X(1 To 15, 1 To 4) As String
Dim objFSO As Object
Dim objTF As Object
Dim xmlHttp As Object
Dim htmldoc As HTMLDocument
Dim htmlbody As htmlbody
Dim tbl As HTMLTable
Dim tr As HTMLTableRow
Dim strInnings As String
s = Timer()
Set xmlHttp = CreateObject("MSXML2.ServerXMLHTTP")
Set objFSO = CreateObject("scripting.filesystemobject")
X(1, 1) = "http://www.espncricinfo.com/indian-premier-league-2011/engine/match/"
X(1, 2) = 501198
X(1, 3) = 501271
X(1, 4) = "indian-premier-league-2011"
X(2, 1) = "http://www.espncricinfo.com/big-bash-league-2011/engine/match/"
X(2, 2) = 524915
X(2, 3) = 524945
X(2, 4) = "big-bash-league-2011"
X(3, 1) = "http://www.espncricinfo.com/ausdomestic-2010/engine/match/"
X(3, 2) = 461028
X(3, 3) = 461047
X(3, 4) = "big-bash-league-2010"
Set htmldoc = New HTMLDocument
Set htmlbody = htmldoc.body
For lngRow = 1 To UBound(X, 1)
If Len(X(lngRow, 1)) = 0 Then Exit For
Set objTF = objFSO.createtextfile("c:\temp\" & X(lngRow, 4) & ".csv")
For lngRecords = X(lngRow, 2) To X(lngRow, 3)
URL = X(lngRow, 1) & lngRecords & ".html"
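xmlHttp.Open "GET", URL, False 'False = synchronous, so send returns only once the response is complete
xmlHttp.send
'Fail fast on a bad response; looping until Status = 200 would never end on e.g. a 404
If xmlHttp.Status <> 200 Then Err.Raise vbObjectError + 513, "PopulateDataSheets_XML", "HTTP " & xmlHttp.Status & " for " & URL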
htmlbody.innerHTML = xmlHttp.responseText
objTF.writeline X(lngRow, 1) & lngRecords & ".html"
For lngInnings = 1 To 2
strInnings = "Innings " & lngInnings
objTF.writeline strInnings
Set tbl = Nothing
On Error Resume Next
Set tbl = htmlbody.Document.getElementById("inningsBat" & lngInnings)
On Error GoTo 0
If Not tbl Is Nothing Then
lngWrite = 0
For lngRow1 = 0 To tbl.Rows.Length - 1
Set tr = tbl.Rows(lngRow1)
If Trim(tr.innerText) <> vbNewLine Then
If tr.Cells.Length > 2 Then
If tr.Cells(1).innerText <> "Extras" Then
If Len(tr.Cells(1).innerText) > 0 Then
objTF.writeline strInnings & "-" & lngWrite & "," & Trim(tr.Cells(1).innerText) & "," & Trim(tr.Cells(3).innerText)
lngWrite = lngWrite + 1
End If
Else
objTF.writeline strInnings & "-" & lngWrite & "," & Trim(tr.Cells(1).innerText) & "," & Trim(tr.Cells(3).innerText)
lngWrite = lngWrite + 1
Exit For
End If
End If
End If
Next
For lngSpare = 12 To lngWrite Step -1
objTF.writeline strInnings & "-" & lngWrite + (12 - lngSpare)
Next
Else
For lngSpare = 1 To 13
objTF.writeline strInnings & "-" & lngWrite + (12 - lngSpare)
Next
End If
Next
Next
Next
'Call ConsolidateSheets
End Sub
RegEx is not a complete solution for parsing HTML because HTML is not a regular language.
You should use the HtmlAgilityPack to query the HTML. It is a .NET library that parses the page into a DOM you can query with XPath; CSS selectors similar to jQuery's are available through add-ons such as Fizzler.
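Within VBA itself, the same DOM-based idea can be sketched with the MSHTML parser that ships with Windows, which avoids both regex and a separate tool. The function below is an illustration only: TableToCsv is a hypothetical name, it uses late binding through the "htmlfile" ProgID, and it assumes the table can be found by its id.

```vba
' Parses an HTML string and returns the rows of one table as quoted CSV lines.
Function TableToCsv(ByVal html As String, ByVal tableId As String) As String
    Dim doc As Object, tbl As Object, tr As Object, td As Object
    Dim fields As String, result As String

    Set doc = CreateObject("htmlfile")          ' late-bound MSHTML document
    doc.body.innerHTML = html

    Set tbl = doc.getElementById(tableId)
    If tbl Is Nothing Then Exit Function        ' table not found: return ""

    For Each tr In tbl.Rows
        fields = ""
        For Each td In tr.Cells
            If Len(fields) > 0 Then fields = fields & ","
            ' quote every field and double any embedded quotes
            fields = fields & """" & Replace(Trim$(td.innerText), """", """""") & """"
        Next td
        result = result & fields & vbCrLf
    Next tr

    TableToCsv = result
End Function
```

For the scorecards in the question, the call might look like TableToCsv(xmlHttp.responseText, "inningsBat1"), with the result written out through the FileSystemObject.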
As quite a few people may see this I thought I would use it as a chance to demonstrate a few features I rarely see people using in VBA web-scraping: deleteRow, querySelector and use of clipboard to write out a table (complete with formatting and hyperlinks) to a sheet based on the table.outerHTML.
deleteRow is used to remove the unwanted rows. querySelector is used to apply faster css selectors to match on nodes. Modern browsers/html parsers are optimized for css and class selectors (which I use) are the second fastest selector type (after id).
Use of css selectors and understanding htmlTable methods/properties will allow for much greater flexibility in your web-scraping endeavours. Understanding the use of the clipboard means a simple copy paste method for transferring a table to Excel.
Execution could easily be tied to a button push and the url read in from a cell.
VBA:
Option Explicit
Public Sub test()
WriteOutTable "https://www.espncricinfo.com/series/8044/scorecard/524935/hobart-hurricanes-vs-melbourne-stars-big-bash-league-2011-12"
End Sub
Public Sub WriteOutTable(ByVal url As String)
'required VBE (Alt+F11) > Tools > References > Microsoft HTML Object Library ; Microsoft XML, v6 (your version may vary)
Dim hTable As MSHTML.HTMLTable, clipboard As Object
Dim xhr As MSXML2.xmlhttp60, html As MSHTML.htmlDocument
Dim rowCount As Long, i As Long 'Option Explicit requires these to be declared
Set xhr = New MSXML2.xmlhttp60
Set html = New MSHTML.htmlDocument
With xhr
.Open "GET", url, False
.Send
html.body.innerHTML = .responseText
End With
Set hTable = html.querySelector(".batsman")
rowCount = hTable.Rows.Length - 1
For i = rowCount To 0 Step -1
Select Case True
Case i = rowCount Or i = rowCount - 1 Or InStr(hTable.Rows(i).outerHTML, "wicket-details") > 0
hTable.deleteRow i
End Select
Next
Set clipboard = GetObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
clipboard.SetText hTable.outerHTML
clipboard.PutInClipboard
ActiveSheet.Cells(1, 1).PasteSpecial
End Sub