Newbie here. I have HTML source code and would like to find the text between the heading tags <h1></h1>, <h2></h2> and so on through <h5></h5>, then convert that text to lower case, except for acronyms and abbreviations (written in capitals, two or more characters). Country names inside the headings should keep proper case.
As an example: It will find <h1>HR Policies and Procedures for Hiring - argentina LTD</h1>
It will convert it to: <H1>HR policies and procedures for hiring - Argentina LTD</H1>
I've tried a user-defined function for Excel VBA that I found online, CapIt(A2). It uses Search, Split and Join, but I can't put the pieces together to get the result. I would very much appreciate your help. Thank you.
Code I saw online as initial reference:
Function Capit(s As String)
Dim v As Variant, j As Long
v = Split(s, " ") ' separates the words
For j = LBound(v) To UBound(v)
If StrComp(v(j), UCase(v(j)), vbBinaryCompare) <> 0 Then v(j) = StrConv(v(j), vbProperCase)
Next j
Capit = Join(v, " ") ' joins the words
End Function
'Added this code below, can we use the results to lowercase the string and exclude the output in this function
Function FindAcronyms(yourWord As String)
Dim I As Integer
Dim ctr As Integer
FindAcronyms = Null
For I = 1 To Len(yourWord)
If Asc(Mid(yourWord, I, 1)) <= 90 And _
Asc(Mid(yourWord, I, 1)) >= 65 Then
If ctr > 0 Then
FindAcronyms = FindAcronyms & Mid(yourWord, I - 1, 1)
End If
ctr = ctr + 1
Else
If ctr > 1 Then
FindAcronyms = FindAcronyms & Mid(yourWord, I - 1, 1) & ", "
End If
ctr = 0
End If
Next
If ctr > 1 Then
FindAcronyms = FindAcronyms & Mid(yourWord, I - 1, 1)
End If
If Right(FindAcronyms, 2) = ", " Then
FindAcronyms = Left(FindAcronyms, Len(FindAcronyms) - 2)
End If
End Function
'the final look would be something like this
Sub TitleChange()
'define array
myarray = Range("A1:A100")
' Define the pattern
Dim pattern As String: pattern = "<h*>*</h*>" 'looks for the header tags
Dim f As Variant
For Each f In myarray
If f Like pattern = True Then Capital (f) 'changes all string to lower case except countries (to retain proper case) and acronyms (to retain uppercase)
Next f
End Sub
You can include the countries in an array
Sub Test()
Debug.Print Capital("HR Policies and Procedures for Hiring - argentina LTD")
End Sub
Function Capital(ByVal s As String)
Dim a, v As Variant, j As Long
a = Array("Argentina", "Egypt", "England")
v = Split(s, " ")
For j = LBound(v) To UBound(v)
If StrComp(v(j), UCase(v(j)), vbBinaryCompare) <> 0 Then v(j) = StrConv(v(j), vbLowerCase)
If Not IsError(Application.Match(v(j), a, 0)) Then v(j) = StrConv(v(j), vbProperCase)
Next j
Capital = Join(v, " ")
End Function
I added a UDF that parses the HTML code, turned the Sub Test code above into the UDF Capital, and used another UDF to join the results back together. Suggestions to make it cleaner or more efficient are welcome.
Dim rng As Range, cell As Range
Set rng = Range("A1:A5")
' Define the pattern
Dim pattern As String: pattern = "*<h?>*</h?>*"
' Check each item against the pattern
For Each cell In rng
If (cell Like pattern = True) Then
cell.Offset(0, 16).Value = cell.Value
cell.Offset(0, 16).Value = joinCell(Capital(StripHTML(cell)), cell.Offset(0, 0).Value) 'used UDF for striping innertext, applying rules and joining back string
End If
Next cell
End Sub
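As one possible way to tie the pieces together without separate strip and join helpers, the sketch below uses a late-bound VBScript.RegExp object to find each <h1> to <h5> element and run its inner text through a word-level rule like the one in Capital. The names ConvertHeadings and FixWord and the short country list are illustrative assumptions, not part of the code above, and the sketch assumes the headings contain plain text with no nested tags.

```vba
Option Explicit

' Hypothetical helper: lower-cases one word unless it is already entirely
' upper case (treated as an acronym) or matches a listed country.
Private Function FixWord(ByVal w As String) As String
    Dim countries As Variant, c As Variant, lowered As String
    countries = Array("Argentina", "Egypt", "England") ' sample list only

    ' Words with no lower-case letters (acronyms, "-", digits) stay as they are.
    If StrComp(w, UCase$(w), vbBinaryCompare) = 0 Then
        FixWord = w
        Exit Function
    End If

    lowered = LCase$(w)
    FixWord = lowered
    For Each c In countries
        If StrComp(lowered, c, vbTextCompare) = 0 Then
            FixWord = c   ' restore proper case for a country name
            Exit For
        End If
    Next c
End Function

' Hypothetical entry point: rewrites the text inside every <h1>..<h5> element.
Public Function ConvertHeadings(ByVal html As String) As String
    Dim re As Object, m As Object, words() As String
    Dim i As Long, result As String, pos As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.IgnoreCase = True
    ' Group 1: opening tag, group 2: inner text, group 3: closing tag.
    re.Pattern = "(<h[1-5][^>]*>)([^<]*)(</h[1-5]>)"

    pos = 1
    For Each m In re.Execute(html)
        ' Copy everything between the previous match and this one unchanged.
        result = result & Mid$(html, pos, m.FirstIndex + 1 - pos)
        words = Split(m.SubMatches(1), " ")
        For i = LBound(words) To UBound(words)
            words(i) = FixWord(words(i))
        Next i
        result = result & m.SubMatches(0) & Join(words, " ") & m.SubMatches(2)
        pos = m.FirstIndex + m.Length + 1
    Next m
    ConvertHeadings = result & Mid$(html, pos)
End Function
```

With the example from the question, ConvertHeadings("<h1>HR Policies and Procedures for Hiring - argentina LTD</h1>") returns <h1>HR policies and procedures for hiring - Argentina LTD</h1>. The tags themselves are copied through unchanged, so the case of the tag names is not altered.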
I am using VBA in Access to modify an Excel sheet with Macro 1 and load it into a table with Macro 2. When I run the two consecutively, my system gets stuck in a loop, but it works fine if I restart Microsoft Access after running Macro 1 and then run Macro 2. At times, the Excel file my code is working on also shows a pop-up asking to enable read/write access. Can someone help me with this?
Macro 1
Function Clean()
Dim CurrFilePath, PathName, Week As String
Dim Filename
Dim OpenExcel As Object
Set OpenExcel = CreateObject("Excel.Application")
OpenExcel.Visible = False
Dim OpenWorkbook, WS As Object
Dim i, j As Integer
Dim Count_WS As Integer
OpenExcel.Quit
CurrFilePath = Application.CurrentProject.path
StartTime = Timer
Week = InputBox("Enter the week for the data import e.g. 34")
PathName = CurrFilePath & "\Direct Deliveries\Week " & Week & "\"
Example = CurrFilePath & "\Direct Deliveries\Week " & Week
Confirm:
Confirm_Folder = MsgBox("Does the Direct Deliveries info exist in " & PathName & " path", vbYesNo)
If Confirm_Folder = vbNo Then
path = InputBox("Locate Direct Deliveries .xlsx on your System and Copy the Dir path here e.g." & Example)
PathName = path & "\"
GoTo Confirm
End If
Filename = Dir(PathName & "*.xlsx")
Do While Len(Filename) > 0
Set OpenExcel = CreateObject("Excel.Application")
OpenExcel.Visible = False
OpenExcel.EnableEvents = False
OpenExcel.ScreenUpdating = False
'Variables to track first cell
i = 0
j = 0
PathFile = PathName & Filename
Set OpenWorkbook = OpenExcel.Workbooks.Open(PathFile)
For Each WS In OpenWorkbook.Worksheets
'If condition to check correct worksheets
On Error Resume Next
If Range("A1").Value = "Carrier SCAC" And Range("D1").Value = "Trip ID" Then
'Loop to fill blank TripIDs
For Each Cell In WS.UsedRange.Columns(4).Cells
' For blank cells, set them to equal the cell above
If WS.Cells(Cell.Row, 1) <> "ABCD" And Not IsEmpty(WS.Cells(Cell.Row, 9)) Then
If i <> 0 Then
If (Len(Cell.Text) = 0) And PreviousCell <> "Trip ID" And Cell.Row Then
Cell.Value = PreviousCell
End If
End If
PreviousCell = Cell
i = i + 1
End If
Next Cell
'Loop to fill blank SCAC Codes
For Each CarrierCell In WS.UsedRange.Columns(1).Cells
' For blank cells, set them to equal the cell above
If j <> 0 Then
If (Len(CarrierCell.Text) = 0) And PreviousCell <> "Carrier SCAC" And PreviousCell <> "ABCD" And Not IsEmpty(WS.Cells(CarrierCell.Row, 4)) Then
CarrierCell.Value = PreviousCell
End If
End If
PreviousCell = CarrierCell
j = j + 1
Next CarrierCell
End If
Count_WS = Count_WS + 1
Next WS
Filename = Dir()
OpenWorkbook.Close SaveChanges:=True
Set OpenWorkbook = Nothing
OpenExcel.Quit
Set OpenExcel = Nothing
Loop
'Display the end status
TotalTime = Format((Timer - StartTime) / 86400, "hh:mm:ss")
Application.Echo True
DeleteImportErrTables
End Function
Macro 2
'--------------------------------------------------------
' Author: Akanksha Goel
' The code imports Direct Deliveries erroneous excel templates to Access Database
'------------------------------------------------------------
'
'------------------------------------------------------------
Function ListErrBeforeImports()
Dim OpenExcel As Object
Set OpenExcel = CreateObject("Excel.Application")
OpenExcel.Visible = False
Dim PathFile As String, Filename As String, PathName As String
Dim TableName As String
Dim HasFieldNames As Boolean
Dim OpenWorkbookED As Object
Dim SQL, CurrFilePath As String
Dim SQLcreate, SQLAlter, SQLSet As String
Dim SQL2, SQL3 As String
Dim Count_Templates As Integer
StartTime = Timer
OpenExcel.Quit
'Turn Off the warnings and screen updating
DoCmd.SetWarnings False
Application.Echo False
OpenExcel.EnableEvents = False
OpenExcel.ScreenUpdating = False
CurrFilePath = Application.CurrentProject.path
Week = InputBox("Enter the week for the data import e.g. 34")
PathName = CurrFilePath & "\Direct Deliveries\Week " & Week & "\"
Example = CurrFilePath & "\Direct Deliveries\Week " & Week
Confirm:
Confirm_Folder = MsgBox("Does the Direct Deliveries info exist in " & PathName & " path", vbYesNo)
If Confirm_Folder = vbNo Then
path = InputBox("Locate Direct Deliveries .xlsx on your System and Copy the Dir path here e.g." & Example)
PathName = path & "\"
GoTo Confirm
End If
HasFieldNames = True
TableName = "TempTable"
Filename = Dir(PathName & "*.xlsx")
PathFile = PathName & Filename
'Arguments for function AssignTablesToGroup()
Dim Arg1 As String
Dim Arg2 As Integer
Arg1 = "EmptyDeliveryDates_TripsWeek" & Week
Call DeleteTable(Arg1)
Arg2 = 383
SQLcreate = "Create Table EmptyDeliveryDates_TripsWeek" & Week & " ( TripID Text, ShipToZip Text, ArriveDelivery Text, Carrier Text, SourceWorkbook Text);"
DoCmd.RunSQL SQLcreate
'Assign Error Table to 'Errors in DirectDeliveries Excels' Group
Call AssignToGroup(Arg1, Arg2)
'Arguments for function AssignTablesToGroup()
Dim Arg3 As String
Arg3 = "InvalidZip_TripsWeek" & Week
DeleteTable Arg3
Arg2 = 383
SQLcreate = "Create Table InvalidZip_TripsWeek" & Week & " ( TripID Text, ShipToZip Text, ArriveDelivery Text, Carrier Text, SourceWorkbook Text);"
DoCmd.RunSQL SQLcreate
'Assign Error Table to 'Errors in DirectDeliveries Excels' Group
Call AssignToGroup(Arg3, Arg2)
'Arguments for function AssignTablesToGroup()
Dim Arg4 As String
Arg4 = "InvalidTrip_TripsWeek" & Week
DeleteTable Arg4
Arg2 = 383
SQLcreate = "Create Table InvalidTrip_TripsWeek" & Week & " ( TripID Text, ShipToZip Text, ArriveDelivery Text, Carrier Text, SourceWorkbook Text);"
DoCmd.RunSQL SQLcreate
'Assign Error Table to 'Errors in DirectDeliveries Excels' Group
Call AssignToGroup(Arg4, Arg2)
Do While Len(Filename) > 0
Set OpenExcel = CreateObject("Excel.Application")
OpenExcel.Visible = False
OpenExcel.EnableEvents = False
OpenExcel.ScreenUpdating = False
PathFile = PathName & Filename
Set OpenWorkbookED = OpenExcel.Workbooks.Open(PathFile, ReadOnly)
Set WS_Book = OpenWorkbookED.Worksheets
DeleteTable "TempTable"
'Loop through Worksheets in each template workbook
For Each WS In WS_Book
WorksheetName = WS.Name
x = WS.Range("A1")
If WS.Range("A1") = "Carrier SCAC" Then
'Get the used records in worksheet
GetUsedRange = WS.UsedRange.Address(0, 0)
'Import records from worksheet into Access Database table
DoCmd.TransferSpreadsheet acImport, 10, "TempTable", PathFile, HasFieldNames, WorksheetName & "!" & GetUsedRange
SQLAlter = "ALTER TABLE TempTable ADD COLUMN SourceBook TEXT(100)"
DoCmd.RunSQL SQLAlter
SQLSet = "UPDATE TempTable SET TempTable.SourceBook = '" & Filename & "' where ([Arrive Delivery]) is NULL or len([Arrive Delivery])<2 or len([Trip ID])<8 or len([Ship to Zip])<5;"
DoCmd.RunSQL SQLSet
SQL = "INSERT INTO " & Arg4 & "(TripID, ShipToZip, ArriveDelivery, Carrier, SourceWorkbook) Select Distinct [Trip ID], [Ship to Zip], [Arrive Delivery], [Carrier SCAC], SourceBook FROM TempTable WHERE len([Trip ID])<8 and len([Ship To Zip])>0 and len([Arrive Delivery])>0;"
DoCmd.RunSQL SQL
SQL2 = "INSERT INTO " & Arg3 & "(TripID, ShipToZip, ArriveDelivery, Carrier, SourceWorkbook) Select Distinct [Trip ID], [Ship to Zip], [Arrive Delivery], [Carrier SCAC], SourceBook FROM TempTable WHERE len([Ship To Zip])<5 and len([Arrive Delivery])>0 and len([Trip ID])>0;"
DoCmd.RunSQL SQL2
SQL3 = "INSERT INTO " & Arg1 & "(TripID, ShipToZip, ArriveDelivery, Carrier, SourceWorkbook) Select Distinct [Trip ID], [Ship to Zip], [Arrive Delivery], [Carrier SCAC], SourceBook FROM TempTable WHERE ([Arrive Delivery] is NULL or len([Arrive Delivery])<2) and len([Ship To Zip])>0 and len([Trip ID])>0 ;"
DoCmd.RunSQL SQL3
DoCmd.DeleteObject acTable, "TempTable"
Count_Templates = Count_Templates + 1
End If
Next WS
OpenWorkbookED.Saved = True
OpenWorkbookED.Close
Filename = Dir()
Set OpenWorkbookED = Nothing
OpenExcel.Quit
Set OpenExcel = Nothing
Loop
'Display the end status
TotalTime = Format((Timer - StartTime) / 86400, "hh:mm:ss")
MsgBox "Done! Error tables updated in 'Errors in DirectDeliveries Excels' group in with " & Count_Templates & " Templates " & TotalTime & " minutes", vbInformation
Application.Echo True
'CallFunction Delete Import Tables
DeleteImportErrTables
End Function
Merge the two functions so that you open only one instance of Excel (your OpenExcel object).
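A rough sketch of that idea, assuming the work of both macros can be done workbook by workbook: the Excel instance is created once, reused for every file in the folder, and quit only at the end. ProcessFolder and the placeholder comment are hypothetical; the cleaning and import logic from the two functions would go where indicated.

```vba
Option Explicit

' Hypothetical combined routine: one Excel instance for all workbooks.
Public Sub ProcessFolder(ByVal PathName As String)
    Dim xl As Object, wb As Object, ws As Object
    Dim Filename As String

    Set xl = CreateObject("Excel.Application")   ' created once, not per file
    xl.Visible = False
    xl.DisplayAlerts = False

    On Error GoTo CleanUp
    Filename = Dir(PathName & "*.xlsx")
    Do While Len(Filename) > 0
        Set wb = xl.Workbooks.Open(PathName & Filename)
        For Each ws In wb.Worksheets
            ' Qualify every range with the worksheet object (ws.Range, not Range).
            If ws.Range("A1").Value = "Carrier SCAC" Then
                ' ... work from Macro 1 and Macro 2 goes here ...
            End If
        Next ws
        wb.Close SaveChanges:=True
        Set wb = Nothing
        Filename = Dir()
    Loop

CleanUp:
    ' Reached on success and on error, so Excel is not left running.
    If Err.Number <> 0 Then MsgBox "Error: " & Err.Description
    If Not wb Is Nothing Then wb.Close SaveChanges:=False
    If Not xl Is Nothing Then xl.Quit
    Set xl = Nothing
End Sub
```

The point of the design is that CreateObject and Quit each appear exactly once, so no leftover Excel process holds a lock on a workbook when the next step runs.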
I'm trying to web-scrape multiple tables from a website. So far I have built an Excel VBA macro to do this, and I have figured out how to get all the data when it is spread over multiple pages, for instance when there are 1000 results but only 50 are displayed per page. The problem is that the same 5 tables appear on every page, because each table has 1000 results.
My code can only loop through the pages for one table. I have also written code to grab each table, but I cannot figure out how to do that for each page of 50 search results.
How can I loop through multiple tables and click the next page in the process to capture all the data?
Sub ETFDat()
Dim IE As Object
Dim i As Long
Dim strText As String
Dim jj As Long
Dim hBody As Object
Dim hTR As Object
Dim hTD As Object
Dim tb As Object
Dim bb As Object
Dim Tr As Object
Dim Td As Object
Dim ii As Long
Dim doc As Object
Dim hTable As Object
Dim y As Long
Dim z As Long
Dim wb As Excel.Workbook
Dim ws As Excel.Worksheet
Set wb = Excel.ActiveWorkbook
Set ws = wb.ActiveSheet
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
y = 1 'Column A in Excel
z = 1 'Row 1 in Excel
Sheets("Fund Basics").Activate
Cells.Select
Selection.Clear
IE.navigate "http://www.etf.com/channels/smart-beta-etfs/channels/smart-beta-etfs?qt-tabs=0#qt-tabs" ', , , , "Content-Type: application/x-www-form-urlencoded" & vbCrLf
Do While IE.busy: DoEvents: Loop
Do While IE.ReadyState <> 4: DoEvents: Loop
Set doc = IE.document
Set hTable = doc.getElementsByTagName("table") '.GetElementByID("tablePerformance")
ii = 1
Do While ii <= 17
For Each tb In hTable
Set hBody = tb.getElementsByTagName("tbody")
For Each bb In hBody
Set hTR = bb.getElementsByTagName("tr")
For Each Tr In hTR
Set hTD = Tr.getElementsByTagName("td")
y = 1 ' Resets back to column A
For Each Td In hTD
ws.Cells(z, y).Value = Td.innerText
y = y + 1
Next Td
DoEvents
z = z + 1
Next Tr
Exit For
Next bb
Exit For
Next tb
With doc
Set elems = .getElementsByTagName("a")
For Each e In elems
If (e.getAttribute("id") = "nextPage") Then
e.Click
Exit For
End If
Next e
End With
ii = ii + 1
Application.Wait (Now + TimeValue("00:00:05"))
Loop
MsgBox "Done"
End Sub
Here is an example showing how the data can be retrieved from the website using XHRs and JSON parsing. It consists of several steps.
Retrieve the data.
I looked into the XHRs a little using the Network tab of Chrome Developer Tools.
The most relevant data I found is the JSON string returned by a GET XHR to http://www.etf.com/etf-finder-channel-tag/Smart-Beta%20ETFs/-aum/50/50/1 after I clicked the next page button.
The response has the following structure for a single row item:
[
{
"productId": 576,
"fund": "iShares Russell 1000 Value ETF",
"ticker": "IWD",
"inceptionDate": "2000-05-22",
"launchDate": "2000-05-22",
"hasSegmentReport": "true",
"genericReport": "false",
"hasReport": "true",
"fundsInSegment": 20,
"economicDevelopment": "Developed Markets",
"totalRows": 803,
"fundBasics": {
"issuer": "<a href='/channels/blackrock-etfs' alt='BlackRock'>BlackRock</a>",
"expenseRatio": {
"value": 20
},
"aum": {
"value": 36957230250
},
"spreadPct": {
"value": 0.000094
},
"segment": "Equity: U.S. - Large Cap Value"
},
"performance": {
"priceTrAsOf": "2017-02-27",
"priceTr1Mo": {
"value": 0.031843
},
"priceTr3Mo": {
"value": 0.070156
},
"priceTr1Yr": {
"value": 0.281541
},
"priceTr3YrAnnualized": {
"value": 0.099171
},
"priceTr5YrAnnualized": {
"value": 0.13778
},
"priceTr10YrAnnualized": {
"value": 0.061687
}
},
"analysis": {
"analystPick": null,
"opportunitiesList": null,
"letterGrade": "A",
"efficiencyScore": 97.977103,
"tradabilityScore": 99.260541,
"fitScore": 84.915658,
"leveragedFactor": null,
"exposureReset": null,
"avgDailyDollarVolume": 243848188.037378,
"avgDailyShareVolume": 2148400.688889,
"spread": {
"value": 0.010636
},
"fundClosureRisk": "Low"
},
"fundamentals": {
"dividendYield": {
"value": 0.021543
},
"equity": {
"pe": 27.529645,
"pb": 1.964124
},
"fixedIncome": {
"duration": null,
"creditQuality": null,
"ytm": {
"value": null
}
}
},
"classification": {
"assetClass": "Equity",
"strategy": "Value",
"region": "North America",
"geography": "U.S.",
"category": "Size and Style",
"focus": "Large Cap",
"niche": "Value",
"inverse": "false",
"leveraged": "false",
"etn": "false",
"selectionCriteria": "Multi-Factor",
"weightingScheme": "Multi-Factor",
"activePerSec": "false",
"underlyingIndex": "Russell 1000 Value Index",
"indexProvider": "Russell",
"brand": "iShares"
},
"tax": {
"legalStructure": "Open-Ended Fund",
"maxLtCapitalGainsRate": 20,
"maxStCapitalGainsRate": 39.6,
"taxReporting": "1099"
}
}
]
The property "totalRows": 803 specifies the total row count. To retrieve the data as fast as possible, it is better to first request only a single row. As you can see from the URL, the tail ../-aum/50/50/.. specifies the sort order, the item to start from, and the total number of items to return. So, to get just one row, the URL should be http://www.etf.com/etf-finder-channel-tag/Smart-Beta%20ETFs/-aum/0/1/1
Parse the retrieved JSON and get the total number of rows from the totalRows property.
Make another request to get the entire table.
Parse the JSON for the entire table, convert it to a 2D array, and output it. You can then process the data further with direct access to the array.
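To make the URL tail concrete, here is a small sketch of a helper that assembles the request URL from its parts. BuildFinderUrl and its parameter names are made up for illustration, and the meaning of the trailing /1 segment is not evident from the observed requests, so it is copied through unchanged.

```vba
Option Explicit

' Hypothetical helper: builds .../<sort>/<start>/<count>/1 as seen in the XHR.
Public Function BuildFinderUrl(ByVal sSort As String, _
                               ByVal lStart As Long, _
                               ByVal lCount As Long) As String
    Const BASE_URL As String = _
        "http://www.etf.com/etf-finder-channel-tag/Smart-Beta%20ETFs/"
    ' The final "/1" is taken from the observed request as-is.
    BuildFinderUrl = BASE_URL & sSort & "/" & lStart & "/" & lCount & "/1"
End Function
```

BuildFinderUrl("-aum", 0, 1) gives the single-row URL used to read totalRows, and BuildFinderUrl("-aum", 0, 803) would request all 803 rows in one call.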
For the table shown below:
The resulting table contains 803 rows and a header with the following columns:
productId
fund
ticker
inceptionDate
launchDate
hasSegmentReport
genericReport
hasReport
fundsInSegment
economicDevelopment
totalRows
fundBasics_issuer
fundBasics_expenseRatio_value
fundBasics_aum_value
fundBasics_spreadPct_value
fundBasics_segment
performance_priceTrAsOf
performance_priceTr1Mo_value
performance_priceTr3Mo_value
performance_priceTr1Yr_value
performance_priceTr3YrAnnualized_value
performance_priceTr5YrAnnualized_value
performance_priceTr10YrAnnualized_value
analysis_analystPick
analysis_opportunitiesList
analysis_letterGrade
analysis_efficiencyScore
analysis_tradabilityScore
analysis_fitScore
analysis_leveragedFactor
analysis_exposureReset
analysis_avgDailyDollarVolume
analysis_avgDailyShareVolume
analysis_spread_value
analysis_fundClosureRisk
fundamentals_dividendYield_value
fundamentals_equity_pe
fundamentals_equity_pb
fundamentals_fixedIncome_duration
fundamentals_fixedIncome_creditQuality
fundamentals_fixedIncome_ytm_value
classification_assetClass
classification_strategy
classification_region
classification_geography
classification_category
classification_focus
classification_niche
classification_inverse
classification_leveraged
classification_etn
classification_selectionCriteria
classification_weightingScheme
classification_activePerSec
classification_underlyingIndex
classification_indexProvider
classification_brand
tax_legalStructure
tax_maxLtCapitalGainsRate
tax_maxStCapitalGainsRate
tax_taxReporting
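The column names above come from joining nested property names with an underscore. As a simplified sketch of that idea (not the ToArray routine itself), the following flattens a nested Scripting.Dictionary into a single-level one. FlattenDict and FlattenDemo are hypothetical names, and arrays are left out for brevity.

```vba
Option Explicit

' Hypothetical helper: copies the leaf values of a nested dictionary into
' oFlat, using "parent_child" style keys.
Public Sub FlattenDict(ByVal oSrc As Object, ByVal oFlat As Object, _
                       Optional ByVal sPrefix As String = "")
    Dim vKey As Variant, sName As String
    For Each vKey In oSrc.Keys
        sName = IIf(sPrefix = "", "", sPrefix & "_") & vKey
        If IsObject(oSrc(vKey)) Then
            FlattenDict oSrc(vKey), oFlat, sName   ' descend into nested object
        Else
            oFlat(sName) = oSrc(vKey)              ' leaf value
        End If
    Next vKey
End Sub

Sub FlattenDemo()
    Dim oRow As Object, oBasics As Object, oRatio As Object, oFlat As Object
    Set oRow = CreateObject("Scripting.Dictionary")
    Set oBasics = CreateObject("Scripting.Dictionary")
    Set oRatio = CreateObject("Scripting.Dictionary")
    Set oFlat = CreateObject("Scripting.Dictionary")

    ' A cut-down version of one row from the response shown earlier.
    oRatio("value") = 20
    Set oBasics("expenseRatio") = oRatio
    oBasics("segment") = "Equity: U.S. - Large Cap Value"
    oRow("ticker") = "IWD"
    Set oRow("fundBasics") = oBasics

    FlattenDict oRow, oFlat
    Debug.Print Join(oFlat.Keys, ", ")
    ' prints: ticker, fundBasics_expenseRatio_value, fundBasics_segment
End Sub
```

Because a Scripting.Dictionary keeps keys in insertion order, the flattened keys come out in the same order as the properties appear in the source object.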
Put the code below into a standard module of the VBA project:
Option Explicit
Sub GetData()
Dim sJSONString As String
Dim vJSON As Variant
Dim sState As String
Dim lRowsQty As Long
Dim aData()
Dim aHeader()
' Download and parse the only first row to get total rows qty
sJSONString = GetXHR("http://www.etf.com/etf-finder-channel-tag/Smart-Beta%20ETFs/-aum/0/1/1")
JSON.Parse sJSONString, vJSON, sState
lRowsQty = vJSON(0)("totalRows")
' Download and parse the entire data
sJSONString = GetXHR("http://www.etf.com/etf-finder-channel-tag/Smart-Beta%20ETFs/-aum/0/" & lRowsQty & "/1")
JSON.Parse sJSONString, vJSON, sState
' Convert JSON to 2d array
JSON.ToArray vJSON, aData, aHeader
' Output
With Sheets(1)
.Cells.Delete
OutputArray .Cells(1, 1), aHeader
Output2DArray .Cells(2, 1), aData
.Cells.Columns.AutoFit
End With
End Sub
Function GetXHR(sURL As String) As String
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", sURL, False
.Send
GetXHR = .responseText
End With
End Function
Sub OutputArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize( _
1, _
UBound(aCells) - LBound(aCells) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Sub Output2DArray(oDstRng As Range, aCells As Variant)
With oDstRng
.Parent.Select
With .Resize( _
UBound(aCells, 1) - LBound(aCells, 1) + 1, _
UBound(aCells, 2) - LBound(aCells, 2) + 1)
.NumberFormat = "#"
.Value = aCells
End With
End With
End Sub
Create one more standard module, name it JSON, and put the code below into it. This code provides the JSON processing functionality:
Option Explicit
Private sBuffer As String
Private oTokens As Object
Private oRegEx As Object
Private bMatch As Boolean
Private oChunks As Object
Private oHeader As Object
Private aData() As Variant
Private i As Long
Sub Parse(ByVal sSample As String, vJSON As Variant, sState As String)
' Backus–Naur form JSON parser implementation based on RegEx
' Input:
' sSample - source JSON string
' Output:
' vJson - created object or array to be returned as result
' sState - string Object|Array|Error depending on processing
sBuffer = sSample
Set oTokens = CreateObject("Scripting.Dictionary")
Set oRegEx = CreateObject("VBScript.RegExp")
With oRegEx ' Patterns based on specification http://www.json.org/
.Global = True
.MultiLine = True
.IgnoreCase = True ' Unspecified True, False, Null accepted
.Pattern = "(?:'[^']*'|""(?:\\""|[^""])*"")(?=\s*[,\:\]\}])" ' Double-quoted string, unspecified quoted string
Tokenize "s"
.Pattern = "[+-]?(?:\d+\.\d*|\.\d+|\d+)(?:e[+-]?\d+)?(?=\s*[,\]\}])" ' Number, E notation number
Tokenize "d"
.Pattern = "\b(?:true|false|null)(?=\s*[,\]\}])" ' Constants true, false, null
Tokenize "c"
.Pattern = "\b[A-Za-z_]\w*(?=\s*\:)" ' Unspecified non-double-quoted property name accepted
Tokenize "n"
.Pattern = "\s+"
sBuffer = .Replace(sBuffer, "") ' Remove unnecessary spaces
.MultiLine = False
Do
bMatch = False
.Pattern = "<\d+(?:[sn])>\:<\d+[codas]>" ' Object property structure
Tokenize "p"
.Pattern = "\{(?:<\d+p>(?:,<\d+p>)*)?\}" ' Object structure
Tokenize "o"
.Pattern = "\[(?:<\d+[codas]>(?:,<\d+[codas]>)*)?\]" ' Array structure
Tokenize "a"
Loop While bMatch
.Pattern = "^<\d+[oa]>$" ' Top level object structure, unspecified array accepted
If .Test(sBuffer) And oTokens.Exists(sBuffer) Then
Retrieve sBuffer, vJSON
sState = IIf(IsObject(vJSON), "Object", "Array")
Else
vJSON = Null
sState = "Error"
End If
End With
Set oTokens = Nothing
Set oRegEx = Nothing
End Sub
Private Sub Tokenize(sType)
Dim aContent() As String
Dim lCopyIndex As Long
Dim i As Long
Dim sKey As String
With oRegEx.Execute(sBuffer)
If .Count = 0 Then Exit Sub
ReDim aContent(0 To .Count - 1)
lCopyIndex = 1
For i = 0 To .Count - 1
With .Item(i)
sKey = "<" & oTokens.Count & sType & ">"
oTokens(sKey) = .Value
aContent(i) = Mid(sBuffer, lCopyIndex, .FirstIndex - lCopyIndex + 1) & sKey
lCopyIndex = .FirstIndex + .Length + 1
End With
Next
End With
sBuffer = Join(aContent, "") & Mid(sBuffer, lCopyIndex, Len(sBuffer) - lCopyIndex + 1)
bMatch = True
End Sub
Private Sub Retrieve(sTokenKey, vTransfer)
Dim sTokenValue As String
Dim sName As String
Dim vValue As Variant
Dim aTokens() As String
Dim i As Long
sTokenValue = oTokens(sTokenKey)
With oRegEx
.Global = True
Select Case Left(Right(sTokenKey, 2), 1)
Case "o"
Set vTransfer = CreateObject("Scripting.Dictionary")
aTokens = Split(sTokenValue, "<")
For i = 1 To UBound(aTokens)
Retrieve "<" & Split(aTokens(i), ">", 2)(0) & ">", vTransfer
Next
Case "p"
aTokens = Split(sTokenValue, "<", 4)
Retrieve "<" & Split(aTokens(1), ">", 2)(0) & ">", sName
Retrieve "<" & Split(aTokens(2), ">", 2)(0) & ">", vValue
If IsObject(vValue) Then
Set vTransfer(sName) = vValue
Else
vTransfer(sName) = vValue
End If
Case "a"
aTokens = Split(sTokenValue, "<")
If UBound(aTokens) = 0 Then
vTransfer = Array()
Else
ReDim vTransfer(0 To UBound(aTokens) - 1)
For i = 1 To UBound(aTokens)
Retrieve "<" & Split(aTokens(i), ">", 2)(0) & ">", vValue
If IsObject(vValue) Then
Set vTransfer(i - 1) = vValue
Else
vTransfer(i - 1) = vValue
End If
Next
End If
Case "n"
vTransfer = sTokenValue
Case "s"
vTransfer = Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace( _
Mid(sTokenValue, 2, Len(sTokenValue) - 2), _
"\""", """"), _
"\\", "\"), _
"\/", "/"), _
"\b", Chr(8)), _
"\f", Chr(12)), _
"\n", vbLf), _
"\r", vbCr), _
"\t", vbTab)
.Global = False
.Pattern = "\\u[0-9a-fA-F]{4}"
Do While .Test(vTransfer)
vTransfer = .Replace(vTransfer, ChrW(("&H" & Right(.Execute(vTransfer)(0).Value, 4)) * 1))
Loop
Case "d"
vTransfer = Evaluate(sTokenValue)
Case "c"
Select Case LCase(sTokenValue)
Case "true"
vTransfer = True
Case "false"
vTransfer = False
Case "null"
vTransfer = Null
End Select
End Select
End With
End Sub
Function Serialize(vJSON As Variant) As String
Set oChunks = CreateObject("Scripting.Dictionary")
SerializeElement vJSON, ""
Serialize = Join(oChunks.Items(), "")
Set oChunks = Nothing
End Function
Private Sub SerializeElement(vElement As Variant, ByVal sIndent As String)
Dim aKeys() As Variant
Dim i As Long
With oChunks
Select Case VarType(vElement)
Case vbObject
If vElement.Count = 0 Then
.Item(.Count) = "{}"
Else
.Item(.Count) = "{" & vbCrLf
aKeys = vElement.Keys
For i = 0 To UBound(aKeys)
.Item(.Count) = sIndent & vbTab & """" & aKeys(i) & """" & ": "
SerializeElement vElement(aKeys(i)), sIndent & vbTab
If Not (i = UBound(aKeys)) Then .Item(.Count) = ","
.Item(.Count) = vbCrLf
Next
.Item(.Count) = sIndent & "}"
End If
Case Is >= vbArray
If UBound(vElement) = -1 Then
.Item(.Count) = "[]"
Else
.Item(.Count) = "[" & vbCrLf
For i = 0 To UBound(vElement)
.Item(.Count) = sIndent & vbTab
SerializeElement vElement(i), sIndent & vbTab
If Not (i = UBound(vElement)) Then .Item(.Count) = "," 'sResult = sResult & ","
.Item(.Count) = vbCrLf
Next
.Item(.Count) = sIndent & "]"
End If
Case vbInteger, vbLong
.Item(.Count) = vElement
Case vbSingle, vbDouble
.Item(.Count) = Replace(vElement, ",", ".")
Case vbNull
.Item(.Count) = "null"
Case vbBoolean
.Item(.Count) = IIf(vElement, "true", "false")
Case Else
.Item(.Count) = """" & _
Replace(Replace(Replace(Replace(Replace(Replace(Replace(Replace(vElement, _
"\", "\\"), _
"""", "\"""), _
"/", "\/"), _
Chr(8), "\b"), _
Chr(12), "\f"), _
vbLf, "\n"), _
vbCr, "\r"), _
vbTab, "\t") & _
""""
End Select
End With
End Sub
Function ToString(vJSON As Variant) As String
Select Case VarType(vJSON)
Case vbObject, Is >= vbArray
Set oChunks = CreateObject("Scripting.Dictionary")
ToStringElement vJSON, ""
oChunks.Remove 0
ToString = Join(oChunks.Items(), "")
Set oChunks = Nothing
Case vbNull
ToString = "Null"
Case vbBoolean
ToString = IIf(vJSON, "True", "False")
Case Else
ToString = CStr(vJSON)
End Select
End Function
Private Sub ToStringElement(vElement As Variant, ByVal sIndent As String)
Dim aKeys() As Variant
Dim i As Long
With oChunks
Select Case VarType(vElement)
Case vbObject
If vElement.Count = 0 Then
.Item(.Count) = "''"
Else
.Item(.Count) = vbCrLf
aKeys = vElement.Keys
For i = 0 To UBound(aKeys)
.Item(.Count) = sIndent & aKeys(i) & ": "
ToStringElement vElement(aKeys(i)), sIndent & vbTab
If Not (i = UBound(aKeys)) Then .Item(.Count) = vbCrLf
Next
End If
Case Is >= vbArray
If UBound(vElement) = -1 Then
.Item(.Count) = "''"
Else
.Item(.Count) = vbCrLf
For i = 0 To UBound(vElement)
.Item(.Count) = sIndent & i & ": "
ToStringElement vElement(i), sIndent & vbTab
If Not (i = UBound(vElement)) Then .Item(.Count) = vbCrLf
Next
End If
Case vbNull
.Item(.Count) = "Null"
Case vbBoolean
.Item(.Count) = IIf(vElement, "True", "False")
Case Else
.Item(.Count) = CStr(vElement)
End Select
End With
End Sub
Sub ToArray(vJSON As Variant, aRows() As Variant, aHeader() As Variant)
' Input:
' vJSON - Array or Object which contains rows data
' Output:
' aData - 2d array representing JSON data
' aHeader - 1d array of property names
Dim sName As Variant
Set oHeader = CreateObject("Scripting.Dictionary")
Select Case VarType(vJSON)
Case vbObject
If vJSON.Count > 0 Then
ReDim aData(0 To vJSON.Count - 1, 0 To 0)
oHeader("#") = 0
i = 0
For Each sName In vJSON
aData(i, 0) = "#" & sName
ToArrayElement vJSON(sName), ""
i = i + 1
Next
Else
ReDim aData(0 To 0, 0 To 0)
End If
Case Is >= vbArray
If UBound(vJSON) >= 0 Then
ReDim aData(0 To UBound(vJSON), 0 To 0)
For i = 0 To UBound(vJSON)
ToArrayElement vJSON(i), ""
Next
Else
ReDim aData(0 To 0, 0 To 0)
End If
Case Else
ReDim aData(0 To 0, 0 To 0)
aData(0, 0) = ToString(vJSON)
End Select
aHeader = oHeader.Keys()
Set oHeader = Nothing
aRows = aData
Erase aData
End Sub
Private Sub ToArrayElement(vElement As Variant, sFieldName As String)
Dim sName As Variant
Dim j As Long
Select Case VarType(vElement)
Case vbObject ' collection of objects
For Each sName In vElement
ToArrayElement vElement(sName), sFieldName & IIf(sFieldName = "", "", "_") & sName
Next
Case Is >= vbArray ' collection of arrays
For j = 0 To UBound(vElement)
ToArrayElement vElement(j), sFieldName & IIf(sFieldName = "", "", "_") & "#" & j
Next
Case Else
If Not oHeader.Exists(sFieldName) Then
oHeader(sFieldName) = oHeader.Count
If UBound(aData, 2) < oHeader.Count - 1 Then ReDim Preserve aData(0 To UBound(aData, 1), 0 To oHeader.Count - 1)
End If
j = oHeader(sFieldName)
aData(i, j) = ToString(vElement)
End Select
End Sub
I tried the code from the original question, found it useful, and fixed a small bug.
The answer to your question is to:
1 loop over the doc elements until no next-page link is found, and
2 set/reset the doc variable on every pass so that the document reference is not lost
Do
Do While IE.Busy: DoEvents: Loop
Do While IE.readyState <> 4: DoEvents: Loop
Set doc = IE.document
....
Loop While nextPageFound 'exit if "next page" not found
The doc reference would otherwise be lost when the next-page hyperlink is followed during the loop over the doc elements.
I then also added:
3 separate handling of the table header and body (if a header exists), so that the first row holds the field names
4 reading the cell value from the element with a specific class name instead of the plain td.innerText
5 a loop over the doc links that sets NextPageFound, matching on
e.getAttribute("title") instead of id=nextPage
6 a Setup sheet with parameters to customize the script with the URL and the destination data sheet
strUrl = ThisWorkbook.Sheets("Setup").Range("b1").Value
strDestSheet = ThisWorkbook.Sheets("Setup").Range("b2").Value
And here is the revised VBA procedure:
Sub ETFDatNew()
Dim IE As Object
Dim i As Long
Dim strText As String
Dim jj As Long
Dim hBody As Object
Dim hTR As Object
Dim hTD As Object
Dim tb As Object
Dim bb As Object
Dim Tr As Object
Dim Td As Object
Dim ii As Long
Dim doc As Object
Dim hTable As Object
Dim y As Long
Dim z As Long
Dim wb As Excel.Workbook
Dim ws As Excel.Worksheet
Set wb = Excel.ActiveWorkbook
Set ws = wb.ActiveSheet
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
y = 1 'Column A in Excel
z = 1 'Row 1 in Excel
'6 I used a SetUp sheet with parameters to customize the script with url and destination data sheet
strUrl = ThisWorkbook.Sheets("Setup").Range("b1").Value
strDestSheet = ThisWorkbook.Sheets("Setup").Range("b2").Value
Sheets(strDestSheet).Activate ' Destination sheet
Cells.Select
Selection.Clear
'IE.navigate "http://halleyweb.com/c058057/mc/mc_p_ricerca.php" ', , , , "Content-Type: application/x-www-form-urlencoded" & vbCrLf
IE.navigate strUrl
ii = 1
nextPageFound = True 'initialize the flag so the loop starts
Do '1) loop while nextPageFound is True, instead of ii <= 17
'2) set/reset doc so the reference is not lost after the next-page link is clicked (see below)
Do While IE.Busy: DoEvents: Loop
Do While IE.readyState <> 4: DoEvents: Loop
Set doc = IE.document
' all tables or single one by Id
Set hTable = doc.getElementsByTagName("table") 'or by .getElementById("table-albo-pretorio")
For Each tb In hTable
'3) splitting header vs body for 1st row field name search
'tHeader
If z = 1 Then
Set hBody = tb.getElementsByTagName("thead")
For Each bb In hBody
Set hTR = bb.getElementsByTagName("tr")
For Each Tr In hTR
Set hTD = Tr.getElementsByTagName("th") 'header th
y = 1 ' Resets back to column A
For Each Th In hTD
ws.Cells(z, y).Value = Th.innerText
y = y + 1
Next Th
DoEvents
z = z + 1
Next Tr
Exit For
Next bb
End If
'tBody
Set hBody = tb.getElementsByTagName("tbody")
For Each bb In hBody
Set hTR = bb.getElementsByTagName("tr")
For Each Tr In hTR
Set hTD = Tr.getElementsByTagName("td")
y = 1 ' Resets back to column A
For Each Td In hTD
'4) read the cell value from the element with a specific class name instead of td.innerText
'ws.Cells(z, y).Value = Td.innerText
If CBool(Td.getElementsByClassName("tablesaw-cell-content").Length) Then 'there is at least 1
'use the first
ws.Cells(z, y).Value = Td.getElementsByClassName("tablesaw-cell-content")(0).innerText
End If
y = y + 1 'next column
Next Td
DoEvents
z = z + 1 'next row
Next Tr
Exit For
Next bb
Exit For
Next tb
'5) search the doc links for the next page, matching on e.getAttribute("title") instead of id=nextPage
With doc 'search the links
Set elems = .getElementsByTagName("a")
nextPageFound = False 'assume there is no next page, so the loop ends unless a link is found
For Each e In elems
If (e.getAttribute("title") = "Pagina successiva") Then 'look for a link with title="Pagina successiva" instead of id=nextPage
e.Click
nextPageFound = True 'found the next page
Exit For
End If
Next e
End With
ii = ii + 1
Application.Wait (Now + TimeValue("00:00:01"))
Loop While nextPageFound ' exit if the "Pagina successiva" link was not found
IE.Quit
Set IE = Nothing
Application.StatusBar = ""
MsgBox "Estrazione completata" ' completed
End Sub
In our code, we have a few Excel objects and a few subs and functions.
We edited a few things and now, for some reason, our objects aren't working inside the sub; they give an "Object Required" error.
We don't know what to do anymore, so any help would be greatly appreciated!
Note: We included the entire code in case there are questions about declarations and the like.
Sub:
Sub birthday (formatDate, i, intRow)
'Take date from database, separate it to days & months
Dim month, day, name
eventDate = Split(formatDate,"/")
month = eventDate(0)
day = eventDate(1)
'Get name of event out of database (one column to the right, from date of event)
name = "netch"
'Get value of row which is used to write events in the specific month
Dim k, row, c
k = 1
wscript.echo objXLCal.Cells(k, 2).Value
Do Until objXLCal.Cells(k, 2).Value = monthRet(month)
k = k + 1
Loop
'k will be used to find the day column, while row is where the events of that month are written
row = k + 3
c = 1
'Get value of column
Do Until objXLCal.Cells(k,c).Value = eval(day)
c = c + 1
Loop
'Insert name of event into place
If Asc(name) = 63 Then
objXLCal.Cells(row,c).Value = StrReverse(name)
Else
objXLCal.Cells(row,c).Value = name
End If
End Sub
Rest of code:
main("C:\Users\liatte\Desktop\hotFolder\Input")
Function main(argFilePath)
Dim templatePath
'-----------------------------------------------------------------------------
'Path to calendar template
templatePath = "C:\Users\liatte\Desktop\Aviv Omer Neta\Birthdays\Calendar1.xlsx"
'-----------------------------------------------------------------------------
'creates the msxml object
'Set xmlDoc = CreateObject("Msxml2.DOMDocument.6.0")
'Dim retVal
'load the xml data of the script
'retVal=xmlDoc.load(argFilePath)
Dim fso, folder, sFolder, inputFolder, xmlDataPath, curNode
'get input folder
'Set curNode=xmlDoc.selectSingleNode("//ScriptXmlData/inputFilePath")
'inputFolder=CSTR(curNode.text)
'location of input folder
'sFolder=inputFolder
sFolder=argFilePath
'creating file getting object
Set fso = CreateObject("Scripting.FileSystemObject")
Set folder = fso.GetFolder(sFolder)
'loop that runs on files in input - RUNS JUST ONCE
'For each folderIdx In folder.files
'Creating object for user excel
Set objXLBirth = CreateObject("Excel.Application")
Set objWorkbookBirth = objXLBirth.Workbooks.Open("C:\Users\liatte\Desktop\hotFolder\Input\Birthdays.xlsx")
'Creating object for calendar template excel
Set objXLCal = CreateObject("Excel.Application")
objXLCal.DisplayAlerts = false
Dim picStr, srcMonth, k, i, intRow, formatDate, txtStr
'Beginning reading from line 2, skipping header
intRow = 2
'loop for each person in user excel
Do Until objXLBirth.Cells(intRow,1).Value = ""
i=2
'Opening the template as new in each round of loop
Set objWorkbookCal = objXLCal.Workbooks.Open(templatePath)
'Cover pic
If Not objXLBirth.Cells(intRow, i).Value = "" Then
objXLCal.Cells(2, 49).Value = objXLBirth.Cells(intRow, i).Value
End If
'Month pic inserter
For i=3 To 14
If Not objXLBirth.Cells(intRow,i).Value = "" Then
picStr = objXLBirth.Cells(1,i).Value
srcMonth = monthRet(Mid(picStr,4))
k=1
Do Until objXLCal.Cells(k, 2).Value = srcMonth
k=k+1
Loop
objXLCal.Cells(k, 47).Value = objXLBirth.Cells(intRow,i).Value
End If
Next
i=15
'Cover text inserter
If Not objXLBirth.Cells(intRow, i).Value = "" Then
objXLCal.Cells(2, 50).Value = objXLBirth.Cells(intRow, i).Value
End If
'Month text inserter
For i = 16 To 27
If Not objXLBirth.Cells(intRow,i).Value = "" Then
txtStr = objXLBirth.Cells(1,i).Value
srcMonth = monthRet(Mid(txtStr,5))
k=1
Do Until objXLCal.Cells(k, 2).Value = srcMonth
k=k+1
Loop
If Asc(objXLBirth.Cells(intRow, i).Value)=63 Then
objXLCal.Cells(k, 48).Value = StrReverse(objXLBirth.Cells(intRow, i).Value)
Else
objXLCal.Cells(k, 48).Value = objXLBirth.Cells(intRow, i).Value
End If
End If
Next
i=28
'Birthday inserter
Do Until objXLBirth.Cells(intRow,i).Value = ""
formatdate=FormatDateTime(objXLBirth.Cells(intRow,i),2)
Call birthday (formatdate,i,intRow)
i=i+2
Loop
'saving changed calendar
objXLCal.ActiveWorkBook.SaveAs "C:\Users\liatte\Desktop\Aviv Omer Neta\Birthdays\Calendar_" & objXLBirth.Cells(intRow, 1).Value & ".txt", 42
intRow = intRow+1
Loop
'moving file to Success
'fso.MoveFile inputFolder, "C:\Users\liatte\Desktop\Success\"
'Next
objXLBirth.Quit
objXLCal.Quit
End Function
Another function:
Function monthRet(month)
Select Case month
Case "1"
monthRet="January"
Case "2"
monthRet="February"
Case "3"
monthRet="March"
Case "4"
monthRet="April"
Case "5"
monthRet="May"
Case "6"
monthRet="June"
Case "7"
monthRet="July"
Case "8"
monthRet="August"
Case "9"
monthRet="September"
Case "10"
monthRet="October"
Case "11"
monthRet="November"
Case "12"
monthRet="December"
End Select
End Function
Thank you very much!
Given a code layout like:
Sub birthday (formatDate, i, intRow)
...
wscript.echo objXLCal.Cells(k, 2).Value
...
End Sub
Function main(argFilePath)
...
Set objXLCal = CreateObject("Excel.Application")
...
End Function
main "C:\Users\liatte\Desktop\hotFolder\Input"
an "Object required" error on the WScript.Echo line is to be expected: the objXLCal that main initializes is a variable local to main, so the objXLCal used in birthday is a different local variable that was never initialized.
The correct solution would be to start with "Option Explicit" and follow the principles of decent procedural programming in VBScript (for example, passing objects like objXLCal to the procedures that need them), but the disgusting hack of Dim-ing variables like objXLCal at the top/global level will also work and won't lower the quality of the published code.
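As a minimal sketch of the parameter-passing approach, assuming a Scripting.Dictionary as a stand-in for the Excel object so that it runs under cscript without Excel (the names objCal and key and the reduced signatures are hypothetical, not taken from the original script):

```vbscript
Option Explicit

' Hypothetical reduced version of birthday: the object arrives as a
' parameter, so it refers to the very object that main created.
Sub birthday(objCal, key)
    WScript.Echo objCal.Item(key)
End Sub

Sub main()
    ' Declared explicitly; local to main.
    Dim objCal
    ' Stand-in for CreateObject("Excel.Application") (assumption, to keep the sketch self-contained).
    Set objCal = CreateObject("Scripting.Dictionary")
    objCal.Add "month", "January"
    ' Hand the object to the procedure that needs it.
    birthday objCal, "month"
End Sub

main
```

With Option Explicit in place, a stray use of an undeclared objXLCal inside birthday would be reported as "Variable is undefined" rather than failing later with "Object required". The quick hack mentioned above would instead be a single Dim objXLCal at script level, outside any Sub or Function, which makes the variable visible to all procedures.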