In this thread (Scraping table from local HTML with unicode characters), QHarr helped me scrape a table from a local HTML file.
I have an HTML file at this Link
I used the same code, changing only the variables 'startTableNumber', 'endTableNumber' and 'numColumns':
Public Sub Test()
Dim fStream As ADODB.Stream, html As HTMLDocument
Set html = New HTMLDocument
Set fStream = New ADODB.Stream
With fStream
.Charset = "UTF-8"
.Open
.LoadFromFile "C:\Users\Future\Desktop\Sample 2.html"
html.body.innerHTML = .ReadText
.Close
End With
Dim hTables As Object, startTableNumber As Long, i As Long, r As Long, c As Long
Dim counter As Long, endTableNumber As Long, numColumns As Long
startTableNumber = 91
endTableNumber = 509
numColumns = 14
Set hTables = html.getElementsByTagName("table")
r = 2: c = 1
For i = startTableNumber To endTableNumber Step 2
counter = counter + 1
If counter = 10 Then
c = 1: r = r + 1: counter = 1
End If
Cells(r, c) = hTables(i).innerText
c = c + 1
Next
End Sub
But the table data comes out scattered. Furthermore, I would like a flexible way for the code to determine those variables itself, without my assigning them manually.
I hope to find a solution using Selenium. I also hope not to receive negative rep; I have done my best to clarify the issue.
Regards
So, as I said in my comments, you need to study how the data appears in the later table tags and perform a mapping to get the correct ordering. The following writes out the table. As I also mentioned, this is not robust; only the methodology may be transferable to other documents.
In your case you wouldn't be reading from file but would use
Set tables = driver.FindElementsByCss("table[width='100%'] table:first-child")
You would then For Each over the web elements in the collection, adjusting the syntax as required, e.g. .Text instead of .innerText. There may be a few other adaptations for Selenium due to its indexing of webElements, but everything you need should be evident below.
VBA:
Option Explicit
Public Sub ParseInfo()
Dim html As HTMLDocument, tables As Object, ws As Worksheet, i As Long
Set ws = ThisWorkbook.Worksheets("Sheet2")
Dim fStream As ADODB.Stream
Set html = New HTMLDocument
Set fStream = New ADODB.Stream
With fStream
.Charset = "UTF-8"
.Open
.LoadFromFile "C:\Users\User\Desktop\test.html"
html.body.innerHTML = .ReadText
.Close
End With
Set tables = html.querySelectorAll("table[width='100%'] table:first-child")
Dim rowCounter: rowCounter = 2
Dim mappings(), j As Long, headers(), arr(13)
headers = Array("Notes", "Type", "Enrollment status", "Governorate of birth", "Year", "Month", "Day", "Date of Birth", "Religion", _
"Nationality", "Student Name", "National Number", "Student Code", "M")
mappings = Array(3, 8, 9, 12, 11, 10, 2, 7, 1, 6, 5, 4, 13)
ws.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
For i = 89 To 504 Step 26
arr(0) = vbNullString
For j = 0 To 12
arr(mappings(j)) = tables.item(2 * j + i).innerText
Next
ws.Cells(rowCounter + 1, 1).Resize(1, UBound(arr) + 1) = arr
rowCounter = rowCounter + 1
Next
End Sub
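To see the mapping step on its own, here is a minimal sketch, separate from the answer's code. ReorderByMapping and the three-item sample are hypothetical names and data; the function assumes mappings holds each target position from 1 to n exactly once, as the mappings array in ParseInfo does.

```vba
Option Explicit

' Hypothetical helper: places source(j) at position mappings(j) of the result.
' Slot 0 is left blank, mirroring arr(0) = vbNullString in ParseInfo.
Public Function ReorderByMapping(ByVal source As Variant, ByVal mappings As Variant) As Variant
    Dim result() As Variant, j As Long
    ReDim result(0 To UBound(mappings) - LBound(mappings) + 1)
    result(0) = vbNullString
    For j = LBound(mappings) To UBound(mappings)
        result(mappings(j)) = source(j)
    Next j
    ReorderByMapping = result
End Function

Public Sub ReorderDemo()
    Dim ordered As Variant
    ' Values arrive in document order; the mapping says which output slot each belongs in.
    ordered = ReorderByMapping(Array("a", "b", "c"), Array(2, 3, 1))
    Debug.Print Join(ordered, "|")
End Sub
```

Here "a" lands in slot 2, "b" in slot 3 and "c" in slot 1, so the values come out in sheet column order rather than document order.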
Edit1: Solutions to my questions can be found in the comments below, regards to Zwenn
I need help finding the right page elements to reference when writing a table to Excel. The code below works for the first half of the page but not the second: it does not output the "all seasons course statistics" table, and I would like to reach that table using different element navigation.
When inspecting the page, the elements of the "all seasons course statistics" table come after the element with class "tabs-wrapper rns-scroll".
Website in question: https://www.racingandsports.com/thoroughbred/jockey/jake-bayliss/27461.
The first part of the VBA code:
Sub Horse2()
Dim IE As InternetExplorer
Application.ScreenUpdating = False
Set IE = New InternetExplorer
IE.Visible = True
Set ws = ThisWorkbook.Worksheets("Sheet1")
Dim http As New XMLHTTP60, html As New HTMLDocument
Dim node As HTMLHtmlElement
Dim nodeTr As HTMLHtmlElement
Dim nodeDiv As HTMLHtmlElement
Dim Element1 As HTMLHtmlElement
Dim node1 As HTMLHtmlElement
Dim currentUrl As String
With http
.Open "GET", "https://www.racingandsports.com/thoroughbred/jockey/jake-bayliss/27461", False
.send
html.body.innerHTML = .responseText
End With
The element navigation starts here in the VBA. The elements here just navigate through other tables rather than the one I need and I have no idea how to get it to go to the "all seasons course statistics" table, even when it contains the same element names.
With html.getElementsByClassName("col-md-12 table-responsive")
For Each node In html.getElementsByClassName("table rns-table")
r = r + 1: c = 4
For Each nodeTr In node.getElementsByTagName("tr")
With nodeTr.getElementsByTagName("td")
If .Length Then
ws.Cells(r + 1, c + 3) = .Item(0).innerText
On Error Resume Next
ws.Cells(r + 1, c + 4) = .Item(1).innerText
On Error Resume Next
ws.Cells(r + 1, c + 5) = .Item(2).innerText
On Error Resume Next
ws.Cells(r + 1, c + 6) = .Item(3).innerText
On Error Resume Next
ws.Cells(r + 1, c + 7) = .Item(4).innerText
On Error Resume Next
ws.Cells(r + 1, c + 8) = .Item(5).innerText
On Error Resume Next
ws.Cells(r + 1, c + 9) = .Item(6).innerText
On Error Resume Next
ws.Cells(r + 1, c + 10) = .Item(7).innerText
On Error Resume Next
ws.Cells(r + 1, c + 11) = .Item(8).innerText
On Error Resume Next
ws.Cells(r + 1, c + 12) = .Item(9).innerText
On Error Resume Next
ws.Cells(r + 1, c + 13) = .Item(10).innerText
On Error Resume Next
ws.Cells(r + 1, c + 14) = .Item(11).innerText
On Error Resume Next
ws.Cells(r + 1, c + 15) = .Item(12).innerText
On Error Resume Next
r = r + 1
End If
End With
Next
Next
End With
IE.Quit
Set IE = Nothing
Application.StatusBar = ""
Application.ScreenUpdating = True
MsgBox "data input complete"
End Sub
In addition to my comment above, here is a suggestion for optimising the code. As I said, On Error Resume Next is not a good choice in most cases. You can avoid it here by looping over only as many elements as are present. You should also simplify your row and column handling. These are just numbers to be managed, which can usually be done by adding 1 and resetting the column. No other corrections are necessary.
The following code outputs exactly the same as your initial macro:
Sub Horse2()
Dim ws As Worksheet
Dim r As Long
Dim c As Long
Dim http As New XMLHTTP60
Dim html As New HTMLDocument
Dim node As HTMLHtmlElement
Dim nodeTr As HTMLHtmlElement
Dim oneElement As Long
Set ws = ThisWorkbook.Worksheets("Sheet1")
r = 2
c = 7
With http
.Open "GET", "https://www.racingandsports.com/thoroughbred/jockey/jake-bayliss/27461", False
.send
html.body.innerHTML = .responseText
End With
With html.getElementsByClassName("col-md-12 table-responsive")
For Each node In html.getElementsByClassName("table rns-table")
For Each nodeTr In node.getElementsByTagName("tr")
With nodeTr.getElementsByTagName("td")
If .Length Then
For oneElement = 0 To .Length - 1
ws.Cells(r, c) = .Item(oneElement).innerText
c = c + 1
Next oneElement
c = 7
r = r + 1
End If
End With
Next
r = r + 1
Next
End With
MsgBox "Data input complete"
End Sub
I found the code below which works nicely and I think I can repurpose it for my needs, but does not include the precipitation. I'm relatively new to HTML so having trouble understanding what each line of code's purpose is. I've gone to the website and looked at the elements and console but can't find "p[data-testid='wxPhrase']" or the word 'children' or 'child'.
I presumed precipitation was just another child so tried adding lines like these after editing the column headers in the first sub:
Results(r + 1, 3) = Children(r).FirstChild.innerText
Results(r + 1, 4) = Children(r).PreviousSibling.PreviousSibling.PreviousSibling.FirstChild.innerText
but it gives Run-time error '438': Object doesn't support this property or method. I would very much appreciate some help and education. Thanks in advance!
Sub MiamiWeather()
Dim Data As Variant
Data = MiamiWeatherData
Range("A1:B1").Value = Array("Date", "Temperature")
Range("A2").Resize(UBound(Data), 2).Value = Data
End Sub
Function MiamiWeatherData()
Const URL = "https://weather.com/weather/tenday/l/3881cd527264bc7c99b6b541473c0085e75aa026b6bd99658c56ad9bb55bd96e"
Dim responseText As String
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.send
responseText = .responseText
End With
Dim Document As HTMLDocument
Set Document = CreateObject("HTMLFILE")
Document.body.innerHTML = responseText
Dim Children As IHTMLDOMChildrenCollection
Set Children = Document.querySelectorAll("p[data-testid='wxPhrase']")
Dim Results As Variant
ReDim Results(1 To Children.Length, 1 To 2)
Dim r As Long
For r = 0 To Children.Length - 1
Results(r + 1, 1) = Children(r).PreviousSibling.PreviousSibling.FirstChild.innerText
Results(r + 1, 2) = Children(r).PreviousSibling.FirstChild.innerText
Next
MiamiWeatherData = Results
End Function
Assuming you want the percentage, you need to resize the array to hold an extra column, then add an extra selector within the loop. That selector can select by attribute = value and will need to work off Children(r).PreviousSibling. Assuming you have a still-maintained Microsoft set-up, you can chain querySelector at this point as shown below.
For older versions, e.g. <= Windows 7, use Results(r + 1, 3) = Children(r).PreviousSibling.Children(2).Children(0).Children(1).innerText
Option Explicit
Public Sub MiamiWeather()
Dim Data As Variant
Data = MiamiWeatherData
Range("A1:C1").value = Array("Date", "Temperature", "Precipitation")
Range("A2").Resize(UBound(Data), 3).value = Data
End Sub
Function MiamiWeatherData()
Const URL = "https://weather.com/weather/tenday/l/3881cd527264bc7c99b6b541473c0085e75aa026b6bd99658c56ad9bb55bd96e"
Dim responseText As String
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.send
responseText = .responseText
End With
Dim Document As HTMLDocument
Set Document = CreateObject("HTMLFILE")
Document.body.innerHTML = responseText
Dim Children As IHTMLDOMChildrenCollection
Set Children = Document.querySelectorAll("p[data-testid='wxPhrase']")
Dim Results As Variant
ReDim Results(1 To Children.length, 1 To 3)
Dim r As Long
For r = 0 To Children.length - 1
Results(r + 1, 1) = Children(r).PreviousSibling.PreviousSibling.FirstChild.innerText
Results(r + 1, 2) = Children(r).PreviousSibling.FirstChild.innerText
Results(r + 1, 3) = Children(r).PreviousSibling.querySelector("[data-testid=PercentageValue]").innerText
Next
MiamiWeatherData = Results
End Function
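One caveat: querySelector returns Nothing when no element matches, and reading .innerText from Nothing raises run-time error 91. If some rows might lack a precipitation value, a small guard avoids that. This is only a sketch; TextOrBlank is a hypothetical helper, not part of the answer above.

```vba
' Hypothetical helper: returns the node's text, or an empty string when the node is missing.
Public Function TextOrBlank(ByVal node As Object) As String
    If Not node Is Nothing Then TextOrBlank = node.innerText
End Function

' Possible use inside the loop of MiamiWeatherData:
' Results(r + 1, 3) = TextOrBlank(Children(r).PreviousSibling.querySelector("[data-testid=PercentageValue]"))
```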
Hi I am trying to scrape the Product name (Cohiba Robusto), Product Size (Single Cigar, Pack of 3, Box of 25) and prices (£33.65, £90, £730) from this website: https://www.jjfox.co.uk/cohiba-robusto-621.html
I am trying to get something like this:
I am using the code below, which gives an error ("Object variable or with variable not set").
I would appreciate any help with this.
Sub getproducts()
Sheets("JJFox").Select
Dim oHtml As HTMLDocument
Dim oElement As Object
Dim Elements As IHTMLElementCollection
Dim Document As HTMLDocument
Set oHtml = New HTMLDocument
'Cells(1, 6) = Time()
lastrow = ActiveSheet.Cells(Rows.Count, "A").End(xlUp).Row
cnt = lastrow + 1
counter1 = cnt
Dim gg As String
gg = "https://www.jjfox.co.uk/cohiba-robusto-621.html"
Dim objHTTP As New WinHttp.WinHttpRequest
url = gg
objHTTP.Open "POST", url, False
objHTTP.setRequestHeader "Content-Type", "application/json"
objHTTP.send ("{""key"":null,""from"":""me#me.com"",""to"":null,""cc"":null,""bcc"":null,""date"":null,""subject"":""My Subject"",""body"":null,""attachments"":null}")
oHtml.body.innerHTML = objHTTP.responseText
'Cells(rw, 2) = oHtml.getElementsByTagName("description").innerText
' If Not .Document.querySelector("button[aria-label='Close']") Is Nothing Then
' .Document.querySelector("button[aria-label='Close']").Click
' End If
txttitle = oHtml.getElementsByClassName("productcart")(0).innerText
txttitlehtml = oHtml.getElementsByClassName("packsize")(0).innerHTML
txttitle = Mid(txttitle, 1, InStr(1, txttitle, Chr(10)))
'Debug.Print txttitlehtml
'txttitle2 = oHtml.getElementsByClassName("price")(0).innerText
Dim Text As String
Text = GetHTML(gg)
starts = InStr(1, Text, "spConfig =")
endS = InStr(starts + 1, Text, "spConfig")
If starts = 0 Then
Cells(counter1, 1) = txttitle
Cells(counter1, 2) = "Single"
starts = InStr(starts + 1, Text, "productPrice")
endl = InStr(starts + 1, Text, ",")
Cells(counter1, 3) = Val(Mid(Text, starts + 14, endl - (starts + 14)))
Cells(counter1, 4) = "JJFox"
Cells(counter1, 5) = Now()
Cells(counter1, 7) = gg ' link to the page
counter1 = counter1 + 1
Else
Text = Mid(Text, starts, endS - starts)
'Debug.Print Text
'find how many pack options are avaialble
myTxt = Text
countTxt = "label"
bb = (Len(myTxt) - Len(replace(myTxt, countTxt, ""))) / Len(countTxt) - 1
'End find////////////////////////////////////
varlabel = "class=" & Chr(34) & "label" & Chr(34)
starts = InStr(1, Text, "label") + 1
Text = Mid(Text, starts, Len(Text))
For i = 1 To bb
starts = InStr(1, Text, "label")
If InStr(starts, Text, "label") Then
'Show the element's properties
Cells(counter1, 1) = txttitle
Cells(counter1, 2) = Mid(Text, starts + 8, InStr(starts, Text, " \") - (starts + 8))
starts = InStr(starts + 1, Text, "oldPrice")
endl = InStr(starts + 1, Text, ",")
Cells(counter1, 3).FormulaR1C1 = Val(Mid(Text, starts + 11, endl - (starts + 11)))
'Debug.Print Val(Mid(Text, startS + chrs, 6))
Cells(counter1, 4) = "JJFox"
Cells(counter1, 5) = Now()
starts = starts + 1
Text = Mid(Text, starts, Len(Text))
Cells(counter1, 7) = gg ' link to the page
counter1 = counter1 + 1
End If
Next i
End If
'Cells(2, 6) = Time()
End Sub
Function GetHTML(url As String) As String
With CreateObject("MSXML2.ServerXMLHTTP.6.0")
.Open "GET", url, False
.send
GetHTML = .responseText
End With
End Function
The prices and labels are pulled dynamically from a script tag whose content you can parse as JSON with a JSON parser. You need to grab the name from the HTML, however.
With a little knowledge of html and css, it is easy enough to define a css pattern to target the script node of interest with:
.fieldset [type='text/x-magento-init']
That matches a script element whose type attribute has the value text/x-magento-init and which is a descendant of an element with class fieldset.
I have used a slightly less efficient loop (you won't notice the difference):
For i = 1 To optionsCollection.Count
Simply because I know the collection is small and to allow me to index into two variables with a single loop.
Json library:
I use jsonconverter.bas. Download the raw code from here and add it to a standard module called JsonConverter. Remove the top Attribute line from the copied code.
You then need to go:
VBE > Tools > References > Add references to:
Microsoft Scripting Runtime
Microsoft HTML Object Library
Microsoft XML Library.
With this JSON parser in VBA, [] denotes a Collection and {} a Dictionary.
Option Explicit
Public Sub GetCigarData()
'< VBE > Tools > References:
'Microsoft Scripting Runtime
'Microsoft HTML Object Library
'Microsoft XML Library
Dim json As Object, html As MSHTML.HTMLDocument, xhr As MSXML2.XMLHTTP60, ws As Worksheet
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set xhr = New MSXML2.XMLHTTP60
Set html = New MSHTML.HTMLDocument
With xhr
.Open "GET", "https://www.jjfox.co.uk/cohiba-robusto-621.html", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
html.body.innerHTML = .responseText
End With
Set json = jsonConverter.ParseJson(html.querySelector(".fieldset [type='text/x-magento-init']").innerHTML)("#product_addtocart_form")("configurable")("spConfig")
Dim prices As Scripting.Dictionary, options As Scripting.Dictionary, optionsCollection As Collection
Set prices = json("optionPrices")
Set options = json("attributes")
Set optionsCollection = options(options.Keys(0))("options")
Dim results() As Variant, headers() As Variant, i As Long, name As String
ReDim results(1 To optionsCollection.Count, 1 To 3)
name = html.querySelector(".base").innerText
For i = 1 To optionsCollection.Count
results(i, 1) = name
results(i, 2) = optionsCollection.item(i)("label")
results(i, 3) = prices(prices.Keys(i - 1))("finalPrice")("amount")
Next
headers = Array("Name", "Size", "Price")
With ws
.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
End Sub
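As a small illustration of how [] and {} come back from the parser, the sketch below parses a literal string. It assumes the JsonConverter module described above is installed; the JSON text is made up for the example and is not the site's real payload.

```vba
Option Explicit

Public Sub JsonShapesDemo()
    Dim parsed As Object
    ' Illustrative JSON only: an object holding an array of objects.
    Set parsed = JsonConverter.ParseJson("{""options"":[{""label"":""Single Cigar""},{""label"":""Pack of 3""}]}")
    Debug.Print TypeName(parsed)               ' the outer {} is a dictionary object
    Debug.Print TypeName(parsed("options"))    ' the [] is a collection object
    Debug.Print parsed("options")(1)("label")  ' collections are indexed from 1
End Sub
```

The 1-based indexing of the Collection is why the loop above pairs optionsCollection.item(i) with prices.Keys(i - 1): dictionary keys are read from a 0-based array.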
Read about css selectors:
https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_Selectors
I am trying to extract the details on this webpage, which seem to sit under certain "divs" with the classes "selection-left" and "selection-right". I haven't found a way to pull them successfully yet.
This is the URL - https://sports.ladbrokes.com/en-af/betting/golf/golf-all-golf/us-masters/2020-us-masters/228648232/
And here is an image of what I want to extract. I want to copy the competition name and each participant and score.
I have tried using QHarr's approach in this link - How to extract values from nested divs using VBA. But I'm getting errors on this line -
ReDim results(1 To countries.Length / 2, 1 To 4)
Here is the code I've been trying to make work
Option Explicit
Public Sub GetData()
Dim html As HTMLDocument, ws As Worksheet, countries As Object, scores As Object, results(), i As Long, r As Long
Set ws = ThisWorkbook.Worksheets("Sheet1"): Set html = New HTMLDocument: r = 1
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://sports.ladbrokes.com/en-af/betting/golf/golf-all-golf/us-masters/2020-us-masters/228648232/", False
.send
html.body.innerHTML = .responseText
End With
Set participant = html.querySelectorAll(".market-content .selection-left"): Set scores = html.querySelectorAll("..market-content .selection-right")
ReDim results(1 To countries.Length / 2, 1 To 4)
For i = 0 To participant.Length - 1 Step 2
results(r, 1) = participant.item(i).innerText: results(r, 2) = "'" & scores.item(i).innerText
r = r + 1
Next
ws.Cells(1, 1).Resize(1, 4) = Array("Competition", "Participant", "Score")
ws.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End Sub
I will need help to make this code work
Content is dynamically added so will not be present in your current request format; hence your error as you have a nodeList of Length 0. You could try making POST requests as the page does but it doesn't look like a quick and easy bit of coding. I would go with browser automation, if this is a small project, so that js can run on the page and you can click the show more button. You will need a wait condition for the page to have properly loaded. I use the presence of the show more button.
Option Explicit
Public Sub GetOddsIE()
Dim d As InternetExplorer, odds As Object, names As Object, i As Long
Dim ws As Worksheet, results(), competition As String
Set d = New InternetExplorer
Set ws = ThisWorkbook.Worksheets("Sheet1")
Const URL = "https://sports.ladbrokes.com/en-af/betting/golf/golf-all-golf/us-masters/2020-us-masters/228648232/"
With d
.Visible = False
.Navigate2 URL
While .Busy Or .ReadyState <> 4: DoEvents: Wend
With .Document.getElementsByClassName("expandable-below-container-button")
Do
DoEvents
Loop While .Length = 0 'wait for element to be present
.Item(0).Click 'click on show more
End With
Set names = .Document.getElementsByClassName("selection-left-selection-name")
Set odds = .Document.getElementsByClassName("odds-convert")
competition = .Document.getElementsByClassName("league")(0).innerText
ReDim results(1 To names.Length, 1 To 3)
For i = 0 To names.Length - 1
results(i + 1, 1) = competition
results(i + 1, 2) = names.Item(i).innerText
results(i + 1, 3) = "'" & odds.Item(i).innerText
Next
.Quit
End With
ws.Cells(1, 1).Resize(1, 3) = Array("Competition", "Participant", "Score")
ws.Cells(2, 1).Resize(UBound(results, 1), UBound(results, 2)) = results
End Sub
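The wait loop above never exits if the button does not appear. A hedged alternative is to give up after a timeout. WaitForClass and its timeout parameter are hypothetical names; Timer resets at midnight, so this sketch assumes the run does not span midnight.

```vba
Option Explicit

' Hypothetical helper: polls the document until at least one element with the
' given class exists, or until timeoutSeconds have elapsed.
' Returns the matching collection, or Nothing on timeout.
Public Function WaitForClass(ByVal doc As Object, ByVal className As String, _
                             ByVal timeoutSeconds As Double) As Object
    Dim started As Single, nodes As Object
    started = Timer
    Do
        DoEvents
        Set nodes = doc.getElementsByClassName(className)
        If nodes.Length > 0 Then
            Set WaitForClass = nodes
            Exit Function
        End If
    Loop While Timer - started < timeoutSeconds
End Function

' Possible use in place of the Do...Loop in GetOddsIE:
' Dim buttons As Object
' Set buttons = WaitForClass(.Document, "expandable-below-container-button", 10)
' If Not buttons Is Nothing Then buttons.Item(0).Click
```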
Public Sub D_Galoplar()
Application.ScreenUpdating = False
Dim Asay(1 To 250)
Dim Jsay(1 To 100)
For q = 2 To Sheets("Y").Columns("A:A").Find(What:="boş").Row - 1
Asay(q - 1) = Sheets("Y").Range("A" & q)
Next q
For q = 2 To Sheets("Y").Columns("C:C").Find(What:="boş").Row - 1
Jsay(q - 1) = Sheets("Y").Range("C" & q)
Next q
For w = 1 To 250
Cells.Delete Shift:=xlUp
Range("A1").Select
If Asay(w) < 1 Then Exit For
Dim elem As Object, trow As Object
Dim R&, C&, s$
With New XMLHTTP60
.Open "POST", "https://yenibeygir.com/at/getatdetaytab", False
.setRequestHeader "content-type", "application/x-www-form-urlencoded; charset=UTF-8"
.send "tab=galopTab&id=" & Asay(w)
s = .responseText
End With
With New HTMLDocument
.body.innerHTML = s
For Each elem In .getElementsByClassName("at_Galoplar")(0).Rows
For Each trow In elem.Cells
C = C + 1: Cells(R + 1, C) = trow.innerText
Next trow
C = 0: R = R + 1
Next elem
End With
Cells.UnMerge
Range("A1").Select
If Range("A1048576").End(xlUp).Row < 2 Then GoTo ATLA2
Columns("A:A").Insert
For i = 2 To Range("B1048576").End(xlUp).Row - 1
Range("A" & i) = Asay(w)
Next i
Range("O2").FormulaR1C1 = "=IF(ISBLANK(RC[-9]),""-"",IF(ISNUMBER(RC[-9]),RC[-9]/4,((LEFT(RC[-9],1)*6000)+(MID(RC[-9],3,2)*100)+(RIGHT(RC[-9],1)*10))/400))"
Range("P2").FormulaR1C1 = "=IF(ISBLANK(RC[-9]),""-"",IF(ISNUMBER(RC[-9]),RC[-9]/6,((LEFT(RC[-9],1)*6000)+(MID(RC[-9],3,2)*100)+(RIGHT(RC[-9],1)*10))/600))"
Range("Q2").FormulaR1C1 = "=IF(ISBLANK(RC[-9]),""-"",IF(ISNUMBER(RC[-9]),RC[-9]/8,((LEFT(RC[-9],1)*6000)+(MID(RC[-9],3,2)*100)+(RIGHT(RC[-9],1)*10))/800))"
Range("R2").FormulaR1C1 = "=IF(ISBLANK(RC[-9]),""-"",IF(ISNUMBER(RC[-9]),RC[-9]/10,((LEFT(RC[-9],1)*6000)+(MID(RC[-9],3,2)*100)+(RIGHT(RC[-9],1)*10))/1000))"
Range("S2").FormulaR1C1 = "=IF(ISBLANK(RC[-9]),""-"",IF(ISNUMBER(RC[-9]),RC[-9]/12,((LEFT(RC[-9],1)*6000)+(MID(RC[-9],3,2)*100)+(RIGHT(RC[-9],1)*10))/1200))"
Range("T2").FormulaR1C1 = "=IF(ISBLANK(RC[-9]),""-"",IF(ISNUMBER(RC[-9]),RC[-9]/14,((LEFT(RC[-9],1)*6000)+(MID(RC[-9],3,2)*100)+(RIGHT(RC[-9],1)*10))/1400))"
Range("O2:T2").Copy
Range("O2:O" & Range("A1048576").End(xlUp).Row).PasteSpecial Paste:=xlPasteFormulas
Cells.Copy
Cells.PasteSpecial Paste:=xlPasteValues
Columns("O:T").Cut Columns("F:K")
Range("A2:N" & Range("A1048576").End(xlUp).Row).Copy
Sheets("Galop").Range("A" & Sheets("Galop").Range("A1048576").End(xlUp).Row + 1).PasteSpecial Paste:=xlPasteValues
ATLA2:
Cells.Delete Shift:=xlUp
Next w
End Sub
I want to get a lot of data with the For Next cycle, but after a while the page hangs. How can I reset objects at the end of each cycle?
Asay numbers
10182
10221
10279
10303
10316
10325
10360
10370
10680
11598
11629
11715
11745
12335
12385
12533
12559
13154
13393
13635
13641
13669
13673
14027
14057
14062
14228
14619
14674
14687
14743
14770
14778
15197
15217
15323
15382
15507
15775
15828
16077
16335
16510
17149
17513
17867
18532
37964
60176
66067
66255
66581
66582
66896
66998
67056
67309
67356
67379
67473
68008
68012
68162
68298
68312
68320
68332
68333
68353
68383
68545
68702
68775
68922
69445
69606
69817
69963
69968
69985
69986
70048
70202
71372
(boş)
The slowdown may be due to network throttling if you are hitting the site too many times in quick succession. This is particularly likely given your access method. It would be better to see whether an API is available for bulk access to the information. You are also likely going through many networks to reach this page. You may be able to get some basic information about delays by running the TRACERT command from a command prompt.
You are doing a POST so remember there is a fair amount of server side stuff going on as well.
You don't need to set elem to Nothing as it only exists during your For Loop. Same for tRow.
Putting .getElementsByClassName("at_Galoplar")(0).Rows into a variable will provide faster referencing.
Writing the results to an array first and then dumping the array out to the sheet in one go will significantly improve speed.
Using the New keyword can lead to unexpected behaviour. You can create one instance of HTMLDocument and work with that, provided you have good error handling in place. I have had occasional cases in a loop where I have had to set HTMLDocument to Nothing before looping back round.
Personally, I would cheat and re-write this to leverage that you can issue GET requests to get the same info. I use a class to hold the XMLHTTP object, and an array to hold the results. I write the results out in one go. This takes a few seconds to run for me. The asay numbers are in Sheet1 range A1:A84.
Class module clsHTTP
Option Explicit
Private http As Object
Private Sub Class_Initialize()
Set http = CreateObject("MSXML2.XMLHTTP")
End Sub
Public Function GetString(ByVal url As String) As String
Dim sResponse As String
With http
.Open "GET", url, False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
sResponse = StrConv(.responseBody, vbUnicode)
GetString = sResponse
End With
End Function
Standard module 1
Option Explicit
Public Sub DGaloplar()
Dim asays(), ws As Worksheet, asay As Long, html As HTMLDocument
Dim http As clsHTTP, url As String, headers(), numberOfRequests As Long
headers = Array("Asay", "Tarih", "Sehir", "Kg", "Jokey", "400", "600", "800", "1000", "1200", "1400", "Ç", "Pist", "Durum")
Set http = New clsHTTP
Set ws = ThisWorkbook.Worksheets("Sheet1")
Set html = New HTMLDocument
asays = Application.Transpose(ws.Range("A1:A84").Value) 'Load asay values from sheet 1
Const numTableRows As Long = 11
Const numTableColumns As Long = 15
Const BASE_URL As String = "https://yenibeygir.com/at/getatdetaytab/?tab=galopTab&id="
numberOfRequests = UBound(asays)
Dim results(), headerRow As Boolean, tRows As Object, tRow As Object, iRow As Long
Dim tCells As Object, tCell As Object, r As Long, c As Long, hTable As HTMLTable
ReDim results(1 To numTableRows * numberOfRequests, 1 To numTableColumns)
Application.ScreenUpdating = False
For asay = 1 To numberOfRequests
headerRow = True
url = BASE_URL & asays(asay)
html.body.innerHTML = http.GetString(url)
Set hTable = html.querySelector(".at_Galoplar")
Set tRows = hTable.getElementsByTagName("tr")
For Each tRow In tRows
If Not headerRow Then
c = 2: r = r + 1
results(r, 1) = asays(asay)
Set tCells = tRow.getElementsByTagName("td")
For Each tCell In tCells
results(r, c) = tCell.innerText
c = c + 1
Next
End If
headerRow = False
Next
Next
With ws
.Cells(1, 3).Resize(1, UBound(headers) + 1) = headers
.Cells(2, 3).Resize(UBound(results, 1), UBound(results, 2)) = results
End With
Application.ScreenUpdating = True
End Sub
References:
Microsoft HTML Object Library
In general Set elem = Nothing is what you need.
In your code you are assigning the variables within a for-each loop, thus even if you set them to Nothing later, there would not be a performance bonus.
Try setting those objects to Nothing, like below:
Set elem = Nothing
Set trow = Nothing
I am not sure whether you need the variable declarations inside your loop; taking them out of the loop might save some time.
But I think it is your HTTP requests that are taking so long, not the VBA code.
UPDATE
Try setting Application.EnableEvents and Application.ScreenUpdating to False at the beginning of the macro and setting them back to True at the end.