I built the code below to pull the price of a mutual fund into Excel via VBA. It worked until last night, when it started pulling a different number (the % return on the Dow, at the top of the page). I checked whether the website layout had changed, but can't figure it out.
I believe the code is getting confused between the "value" I am trying to pull and the "value positive" class for the dow.
Is there a way for the code to focus on "value" not "value positive"?
Sub ExtractLastValue()
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Top = 0
objIE.Left = 0
objIE.Width = 800
objIE.Height = 800
objIE.Visible = True
objIE.Navigate ("https://www.marketwatch.com/investing/fund/lfmix")
Do
DoEvents
Loop Until objIE.readystate = 4
Dim myValue As String: myValue = objIE.document.getElementsByClassName("value")(0).innerText
Range("C3").Value = myValue
End Sub
Always declare all variables. The best way to enforce this is to put Option Explicit at the top of each module. The declarations also belong at the top of the macro.
To get the right value, first narrow the search to the element that contains the price (CSS class "intraday__price"), then look up the CSS class "value" inside that element.
Option Explicit
Sub ExtractLastValue()
Dim objIE As Object
Dim url As String
Dim myValue As String
url = "https://www.marketwatch.com/investing/fund/lfmix"
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Top = 0
objIE.Left = 0
objIE.Width = 800
objIE.Height = 800
objIE.Visible = True
objIE.navigate url
Do: DoEvents: Loop Until objIE.readyState = 4
myValue = objIE.document.getElementsByClassName("intraday__price")(0).getElementsByClassName("value")(0).innerText
Range("C3").Value = myValue
End Sub
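You can scrape that value in several ways. Here is one of the faster methods, which uses an XMLHTTP request instead of automating Internet Explorer. It needs a reference to Microsoft HTML Object Library (for HTMLDocument). When execution finishes, the value should be in Range("C3").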
Sub FetchValue()
Const Url$ = "https://www.marketwatch.com/investing/fund/lfmix"
Dim Html As New HTMLDocument, S$, elem As Object
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", Url, False
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36"
.send
Html.body.innerHTML = .responseText
[C3] = Html.querySelector("h3.intraday__price > .value").innerText
End With
End Sub
Make sure you don't run the script too frequently, as the site is quick to ban IPs temporarily.
I am new to html scraping but familiar with VBA.
I am trying to get the price information from
https://shop.zuhalmuzik.com/index.php?p=search&search=FA-08
The html code that contains the information is span id "pric";
<div class="listingPriceWrap">
<div class="listingPriceNormal">
<span id="divmoneyorderprice7484">
<span id="pric">16,341</span> TL</span><script>var originalmoneyorderprice7484=16341;</script></div></div>
However, there are several span id="pric" that are 0. Therefore, the below code gives me 0. How can I get the value "16,341"?
Sub getprice2()
Dim ws As Worksheet: Set ws = ThisWorkbook.Worksheets("Sheet1")
Dim text As String
Dim lastrow As Long
Dim sht As Worksheet
Set sht = ActiveSheet
lastrow = sht.Cells(sht.Rows.Count, "A").End(xlUp).Row
For i = 2 To lastrow
With CreateObject("internetexplorer.application")
.Navigate "https://shop.zuhalmuzik.com/index.php?p=search&search=" & Cells(i, 1).Value
Do While .Busy And .ReadyState <> 4: DoEvents: Loop
Sleep 500
text = .Document.getElementById("pric").innerText
.Quit
End With
ws.Cells(i, 2).Value = text
Next i
End Sub
Okay, give the following a go. Make sure to add Microsoft HTML Object Library to the reference library before execution.
Sub fetchPrice()
Const Url$ = "https://shop.zuhalmuzik.com/index.php?p=search&search=FA-08"
Dim Html As New HTMLDocument
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", Url, False
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36"
.send
Html.body.innerHTML = .responseText
[A1] = Html.querySelector("span[id*='divmoneyorderprice'] > span[id='pric']").innerText
End With
End Sub
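The answer above fetches a single hard-coded search term, while the question loops over the terms in column A. A minimal sketch of combining the two is shown here, assuming the search terms start in row 2 of "Sheet1" as in the question and that the selector from the answer still matches the page. The name FetchPrices and the "not found" marker are hypothetical choices for illustration.

```vba
Option Explicit

' Sketch only. Requires a reference to Microsoft HTML Object Library.
' Assumes search terms are in column A of "Sheet1", starting at row 2.
Public Sub FetchPrices()
    Const BaseUrl As String = "https://shop.zuhalmuzik.com/index.php?p=search&search="
    Dim ws As Worksheet
    Dim html As HTMLDocument
    Dim node As Object
    Dim lastRow As Long
    Dim i As Long

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    With CreateObject("MSXML2.XMLHTTP")
        For i = 2 To lastRow
            ' Synchronous GET for the current search term
            .Open "GET", BaseUrl & ws.Cells(i, 1).Value, False
            .send

            ' Parse the response in a fresh document each time
            Set html = New HTMLDocument
            html.body.innerHTML = .responseText

            ' Same selector as the answer above; it skips the zero-valued "pric" spans
            Set node = html.querySelector("span[id*='divmoneyorderprice'] > span[id='pric']")
            If node Is Nothing Then
                ws.Cells(i, 2).Value = "not found" ' hypothetical marker for missing prices
            Else
                ws.Cells(i, 2).Value = node.innerText
            End If
        Next i
    End With
End Sub
```

Checking for Nothing before reading innerText avoids a runtime error when a search returns no matching product.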
I modified code I found online to be able to find a ClassName on an HTML page and returns its text when doing a Google search. I need to do this for about 10,000 companies but when testing it with only 100 rows it works and then stops around ~60th row. After that I am unable to get any results and found the only way to resolve it is by waiting about an hour and then executing it again. I tested this on another computer and had the same results and issue. It doesn't have to do with what is in the ~60th row because I use a different set of 100 companies each test. Even changing the loop to i = 2 to 101 still causes it to have the same problem.
Col A would have a company name such as: "Buchart Horn"
Col B returns "Architect in Baltimore, Maryland"
Col C would be blank (that's fine)
Col D returns "Baltimore, Maryland - Buchart Horn: Engineers, Architects and Planners"
I'm very new to VBA so any help would be appreciated. Thank you.
'References enabled:
'Microsoft Internet Controls, Microsoft HTML Object Library
Sub GoogleSearch()
Dim URL As String
Dim objHTTP As Object
Dim htmlDoc As HTMLDocument
Set htmlDoc = CreateObject("htmlfile")
Dim objResults1 As Object
Dim objResults2 As Object
Dim objResults3 As Object
On Error Resume Next
lastRow = Range("A" & Rows.count).End(xlUp).Row
For I = 2 To lastRow
URL = "https://www.google.com/search?q=" & Cells(I, 1)
Set objHTTP = CreateObject("MSXML2.XMLHTTP")
With objHTTP
.Open "GET", URL, False
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.send
htmlDoc.body.innerHTML = .responseText
End With
Set objResults1 = htmlDoc.getElementsByClassName("YhemCb")
Set objResults2 = htmlDoc.getElementsByClassName("wwUB2c kno-fb-ctx")
Set objResults3 = htmlDoc.getElementsByClassName("LC20lb")
Cells(I, 2) = objResults1(0).innerText
Cells(I, 3) = objResults2(0).innerText
Cells(I, 4) = objResults3(0).innerText
Next
Set htmlDoc = Nothing
Set objResults1 = Nothing
Set objResults2 = Nothing
Set objResults3 = Nothing
Set objHTTP = Nothing
End Sub
The problem here turned out to be too many requests sent too quickly to Google by the GET request below, which runs inside a For loop:
With objHTTP
.Open "GET", URL, False
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.send
htmlDoc.body.innerHTML = .responseText
End With
In order to make this code request at an acceptable pace to the server we are requesting from we add a pause in the loop.
The easiest built in way to pause in VBA is with:
For I = 2 To lastRow
... 'Lines omitted for clarity of purpose
...
Application.Wait (Now + TimeValue("0:00:6"))
Next
The application will then wait until that time before it continues executing. An unfortunate limitation of this method is that 1 second is the minimum wait time, so shorter delays need a different solution.
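For delays shorter than one second, one common option is the Windows Sleep API. The following is a minimal sketch, not the method from the answer above: PauseMs is a hypothetical helper name, and the 50 ms slice size is an arbitrary choice that keeps Excel responsive through DoEvents.

```vba
Option Explicit

' Sleep is a Windows API call; PtrSafe is required on VBA7 (Office 2010 and later).
#If VBA7 Then
    Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
    Private Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If

' Hypothetical helper: pause for roughly ms milliseconds,
' sleeping in slices of at most 50 ms and yielding between slices.
Public Sub PauseMs(ByVal ms As Long)
    Dim remaining As Long
    remaining = ms
    Do While remaining > 0
        If remaining > 50 Then
            Sleep 50
        Else
            Sleep remaining
        End If
        remaining = remaining - 50
        DoEvents
    Loop
End Sub
```

Inside the loop, a call such as PauseMs 500 would take the place of the Application.Wait line. Whether half a second is slow enough for the server is a separate question; the 6 seconds used above is the safer starting point.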
I have to pull information from a MURAL board (design thinking tool, which is pretty much an online whiteboard). I need to pull the following information for the stickies:
https://app.mural.co/t/hanno1/m/hanno1/1488557783266/465baa38d35e95edc969a5ca9e2a8bb8b6f10310
Sticky Note Text
Sticky Note Attributes (Color, Size, Shape)
Sticky Note Location
Image links (and locations if possible)
I have created code that is not working. Nothing is being pulled. It pretty much skips straight from opening to quitting the browser.
Also how do I pull the actual HTML code to find the attributes/location?
Option Explicit
Public Sub GetInfo()
Dim ie As InternetExplorer, arr(), col
Set ie = New InternetExplorer
Set col = New Collection
With ie
.Visible = True
.navigate "https://app.mural.co/t/nextgencomms9753/m/nextgencomms9753/1536712668215/cd70107230d7f406058157a3bb8e951cedc9afc0"
While .Busy Or .readyState < 4: DoEvents: Wend
Dim listedItems As Object, item As Object, prices As Object, price As Object, j As Long
Set listedItems = .document.getElementsByClassName("widget-layer-inner")
For Each item In listedItems
Set prices = item.getElementsByClassName("Linkify")
ReDim arr(0 To prices.Length - 1) 'you could limit this after by redim to 0 to 0
j = 0
For Each price In prices
arr(j) = price.innerText
j = j + 1
Next
col.Add Array(item.getElementsByClassName("widgets-container") (0).innerText, arr)
Next
.Quit
Dim item2 As Variant, rowNum As Long
For Each item2 In col
rowNum = rowNum + 1
With ThisWorkbook.Worksheets("Sheet1")
.Cells(rowNum, 1) = Replace$(Trim$(item2(0)), Chr$(10), Chr$(32))
.Cells(rowNum, 2).Resize(1, UBound(item2(1)) + 1) = item2(1)
End With
Next
End With
End Sub
In general, I think using IE automation should be avoided where possible, especially if you can figure out a method to emulate this request via a web request.
A little background on this method
I'm submitting two web requests: one to get an authorization token, and another to get the JSON from the page which populates the widgets on screen. I figured this out by studying the web requests sent back and forth between the client (me) and the server, and emulated those requests. The approach outlined below is pretty fast, about 2 seconds without URL decoding, and 10 seconds with decoding.
Things you'll need for this to work
Explicit Reference set to Microsoft XML v6.0
Explicit Reference set to Microsoft Scripting Runtime
The VBA-JSON project (JsonConverter.bas) imported into your project
Code
I split out token and json retrieval into two functions. What you get back from getJSON is a dictionary. This dictionary is somewhat nested, so you refer to items by key to traverse the dictionary down. E.g. MyDict(property1)(childPropertyOfproperty1)(childPropertyOf...) etc.
Here's the code.
Option Explicit
Public Sub SubmitRequest()
Const URL As String = "https://app.mural.co/t/hanno1/m/hanno1/1488557783266/465baa38d35e95edc969a5ca9e2a8bb8b6f10310"
Dim returnobject As Object
Dim widgets As Object
Dim widget As Variant
Dim WidgetArray As Variant
Dim id As String
Dim i As Long
Set returnobject = getJSON(URL, getToken(URL))
Set widgets = returnobject("widgets")
ReDim WidgetArray(0 To 7, 0 To 10000)
For Each widget In widgets
'Only add if a text item, change if you like
If returnobject("widgets")(widget)("type") = "murally.widget.TextWidget" Then
WidgetArray(0, i) = URLDecode(returnobject("widgets")(widget)("properties")("text"))
WidgetArray(1, i) = returnobject("widgets")(widget)("properties")("fontSize")
WidgetArray(2, i) = returnobject("widgets")(widget)("properties")("backgroundColor")
WidgetArray(3, i) = returnobject("widgets")(widget)("x")
WidgetArray(4, i) = returnobject("widgets")(widget)("y")
WidgetArray(5, i) = returnobject("widgets")(widget)("width")
WidgetArray(6, i) = returnobject("widgets")(widget)("height")
WidgetArray(7, i) = returnobject("widgets")(widget)("id")
i = i + 1
End If
Next
ReDim Preserve WidgetArray(0 To 7, i - 1)
With ThisWorkbook.Worksheets("Sheet1")
.Range("A1:H1") = Array("Text", "FontSize", "BackgroundColor", "X Position", "Y Position", "Width", "Height", "ID")
.Range(.Cells(2, 1), .Cells(i + 1, 8)).Value = WorksheetFunction.Transpose(WidgetArray)
End With
End Sub
Public Function getJSON(URL As String, Token As String) As Object
Dim baseURL As String
Dim getRequest As MSXML2.XMLHTTP60
Dim URLParts As Variant
Dim jsonconvert As Object
Dim id As String
Dim user As String
URLParts = Split(URL, "/", , vbBinaryCompare)
id = URLParts(UBound(URLParts) - 1)
user = URLParts(UBound(URLParts) - 2)
baseURL = Replace(Replace("https://app.mural.co/api/murals/{user}/{ID}", "{ID}", id), "{user}", user)
Set getRequest = New MSXML2.XMLHTTP60
With getRequest
.Open "GET", baseURL, False 'synchronous, so responseText is complete after .send
.setRequestHeader "Authorization", "Bearer " & Token
.setRequestHeader "Referer", URL
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
.send
Set getJSON = JsonConverter.ParseJson(.responseText)
End With
End Function
Public Function getToken(URL As String) As String
Dim getRequest As MSXML2.XMLHTTP60
Dim URLParts As Variant
Dim position As Long
Dim jsonconvert As Object
Dim Token As Object
Dim State As String
Dim User As String
Dim json As String
Dim referer As String
Dim id As String
Dim posturl As String
json = "{""state"": ""{STATE}""}"
posturl = "https://app.mural.co/api/v0/visitor/{user}.{ID}"
referer = "https://app.mural.co/t/{user}/m/{user}/{ID}"
URLParts = Split(URL, "/", , vbBinaryCompare)
position = InStrRev(URL, "/")
URL = Left$(URL, position - 1)
State = URLParts(UBound(URLParts))
id = URLParts(UBound(URLParts) - 1)
User = URLParts(UBound(URLParts) - 2)
json = Replace(json, "{STATE}", State)
posturl = Replace(Replace(posturl, "{user}", User), "{ID}", id)
referer = Replace(Replace(referer, "{user}", User), "{ID}", id)
Set getRequest = New MSXML2.XMLHTTP60
With getRequest
.Open "POST", posturl, False 'synchronous, so responseText is complete after .send
.setRequestHeader "origin", "https://app.mural.co"
.setRequestHeader "Referer", referer
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
.setRequestHeader "Content-Type", "application/json; charset=utf-8"
.send json
Set jsonconvert = JsonConverter.ParseJson(.responseText)
End With
getToken = jsonconvert("token")
End Function
' from https://stackoverflow.com/a/12804172/4839827
Public Function URLDecode(ByVal StringToDecode As String) As String
With CreateObject("htmlfile")
.Open
.Write StringToDecode
.Close
URLDecode = .body.outerText
End With
End Function
Here's the output returned. There are other properties available, however this code is meant to just give you an idea how to pull this back.
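To reach other properties, the same key-by-key traversal applies. As a minimal illustration of how the nested lookup works, the sketch below builds a small dictionary by hand instead of parsing a real response. The key "widget-1" and the text "Hello" are made-up sample data, not values from the Mural API.

```vba
Option Explicit

' Sketch: nested Scripting.Dictionary objects, similar in shape to what
' JsonConverter.ParseJson returns for JSON objects.
Public Sub NestedDictionaryDemo()
    Dim root As Object
    Dim widgets As Object
    Dim widget As Object
    Dim props As Object

    Set root = CreateObject("Scripting.Dictionary")
    Set widgets = CreateObject("Scripting.Dictionary")
    Set widget = CreateObject("Scripting.Dictionary")
    Set props = CreateObject("Scripting.Dictionary")

    ' Build the structure from the inside out (sample data only)
    props.Add "text", "Hello"
    widget.Add "type", "murally.widget.TextWidget"
    widget.Add "properties", props
    widgets.Add "widget-1", widget
    root.Add "widgets", widgets

    ' Traverse down by key, one level per pair of parentheses
    Debug.Print root("widgets")("widget-1")("properties")("text") ' prints Hello
End Sub
```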
I am creating a macro to grab Fax Numbers from a public banking website.
I have written enough code to get to the site, select from a dropdown list, and change the selection in the dropdown list. However when I use FireEvent ("onChange"), it does not trigger the webpage to update.
I have searched for an answer, but have not found any.
Website: https://www.atb.com/contact-us/Pages/branch-locator.aspx
Public Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As LongPtr)
Sub test()
Dim ieExplorer As New InternetExplorerMedium
Dim ieField As Object
Dim ieSubmit As Object
Dim ieSelect As Object
Dim iebutton As Object
Dim buttCounter As Integer
Dim objOption As Object
Dim objCount As Integer
Dim ieForm As Object
Dim intRow As Long, faxNum As String
intRow = 2
With ieExplorer
.Visible = True
.Navigate "https://www.atb.com/contact-us/Pages/branch-locator.aspx"
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Sleep 1000
Set ieSelect = .Document.getElementsByTagName("select")
Do While o < ieSelect.Length
If ieSelect(o).ID = "ba" Then
For Each i In ieSelect(o).Options
If i.Value <> "null" Then
ieSelect(o).Focus
i.Selected = True
ieSelect(o).FireEvent "onchange"
Set ieField = .Document.getElementsByTagName("p")
Do While x < ieField.Length
If InStr(ieField(x).innertext, "FAX") Then
Cells(intRow, "A").Value = i.Value
Cells(intRow, "B").Value = ieField(x).innertext
intRow = intRow + 1
End If
Loop
End If
Next
End If
o = o + 1
Loop
End With
End Sub
I would use XMLHTTP/WinHttp POST request and grab the xml and then parse that. You could adapt as a function. I would prefer to grab all the fax numbers in one go and write out to sheet. I use xpath to retrieve the title (name of branch) and the fax numbers.
You could adapt the XPath syntax to retrieve any of the listed values. For example, here is a returned row from which you could select values:
<z:row ows_ID='1' ows_Title='Acadia Valley' ows_Transit='1.00000000000000' ows_Classification='Agency' ows_Address='Acadia Valley' ows_City='Acadia Valley' ows_Postal='T0J 0A0' ows_Phone='(403) 972-3805' ows_Fax='(403) 972-2263' ows_Hours='Mon-Fri 9:00-12:30, 13:30-16:00' ows_LAT='51.159888' ows_LONG='-110.209308' ows__ModerationStatus='0' ows__Level='1' ows_UniqueId='1;#{2973F9AC-2019-4BD1-A740-41A270BAC267}' ows_owshiddenversion='3' ows_FSObjType='1;#0' ows_Created='2015-11-18 13:58:48' ows_PermMask='0x1000030041' ows_Modified='2016-02-08 11:16:05' ows_FileRef='1;#Lists/Branches/1_.000' ows_MetaInfo='1;#' />
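As a sketch of adapting the XPath to other attributes, the example below selects the title and phone number from a cut-down copy of that row. It assumes a simplified element named row without the z: namespace prefix, so it can be loaded on its own. The real response is namespaced, but an attribute-only XPath such as //@ows_Phone does not depend on the element name.

```vba
Option Explicit

' Sketch: select attribute values with XPath from a small inline XML sample.
Public Sub XPathAttributeDemo()
    Dim xmlDoc As Object
    Dim nodes As Object
    Dim node As Object

    Set xmlDoc = CreateObject("MSXML2.DOMDocument.6.0") ' XPath is the default selection language in 6.0
    xmlDoc.async = False

    ' Simplified sample row (assumption: no namespace prefix)
    If Not xmlDoc.LoadXML("<rows><row ows_Title='Acadia Valley' " & _
                          "ows_Phone='(403) 972-3805' ows_Fax='(403) 972-2263' /></rows>") Then
        Err.Raise xmlDoc.parseError.ErrorCode, , xmlDoc.parseError.reason
    End If

    ' The @ prefix selects attributes; | forms the union of both node sets
    Set nodes = xmlDoc.SelectNodes("//@ows_Title | //@ows_Phone")
    For Each node In nodes
        Debug.Print node.nodeName & " = " & node.Text
    Next node
End Sub
```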
VBA:
Option Explicit
Public Sub GetFaxNumbers()
Dim body As String, xmlDoc As Object, request As Object
Application.ScreenUpdating = False
Set xmlDoc = CreateObject("MSXML2.DOMDocument") 'New MSXML2.DOMDocument60
body = "<soapenv:Envelope xmlns:soapenv='http://schemas.xmlsoap.org/soap/envelope/' xmlns:soap='http://schemas.microsoft.com/sharepoint/soap/'>"
body = body & "<soapenv:Body><GetListItems xmlns='http://schemas.microsoft.com/sharepoint/soap/'><listName>Branches</listName>"
body = body & "<viewFields><ViewFields><FieldRef Name='ID' /><FieldRef Name='Title' /><FieldRef Name='Transit' />"
body = body & "<FieldRef Name='Classification' /><FieldRef Name='Address' /><FieldRef Name='City' /><FieldRef Name='Postal' />"
body = body & "<FieldRef Name='Phone' /><FieldRef Name='Fax' /><FieldRef Name='Hours' /><FieldRef Name='LAT' /><FieldRef Name='LONG' />"
body = body & "</ViewFields></viewFields><rowLimit>0</rowLimit><query><Query><OrderBy><FieldRef Name='Title' Ascending='True' />"
body = body & "</OrderBy></Query></query></GetListItems></soapenv:Body></soapenv:Envelope>"
Set request = CreateObject("WinHttp.WinHttpRequest.5.1")
With request
.Open "POST", "https://www.atb.com/_vti_bin/lists.asmx", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.181 Safari/537.36"
.setRequestHeader "Content-Type", "text/xml"
.send body
With xmlDoc
.validateOnParse = True
.setProperty "SelectionLanguage", "XPath"
.async = False
If Not .LoadXML(request.responseText) Then
Err.Raise .parseError.ErrorCode, , .parseError.reason
End If
End With
End With
Dim elements As Object, counter As Long, rowNum As Long
Set elements = xmlDoc.SelectNodes("//@ows_Title | //@ows_Fax")
rowNum = 1
For counter = 0 To elements.Length - 1 Step 2
With ThisWorkbook.Worksheets("Sheet1")
.Cells(rowNum, 1) = elements(counter).Text
.Cells(rowNum, 2) = elements(counter + 1).Text
End With
rowNum = rowNum + 1
Next
Application.ScreenUpdating = True
End Sub
Sample of results:
Looks like the select change is set up by this code:
$('body').find('#ba').change(function(){
var a = $(this).val();
lookyloo(a);
});
You should be able to call lookyloo using ExecScript and pass in the value
E.g. see: How to find and call javascript method from vba
Tested:
Dim ie As InternetExplorer, el
Set ie = New InternetExplorerMedium
ie.Visible = True
ie.navigate "https://www.atb.com/contact-us/Pages/branch-locator.aspx"
Set el = ie.document.getElementById("ba") 'I put a break here while the page loaded...
el.selectedIndex = 5 'for example
ie.document.parentWindow.Window.execScript "lookyloo('" & el.Value & "');"
I had a similar issue and got it to work by changing "onchange" to ("onchange").
I've been banging my head against the wall trying to figure out why this VBA code will not work :(
I am simply trying to insert the value entered into the excel input box into a website's input box. I am a novice when it comes to HTML so I'm sure that has something to do with it.
Here is the HTML element from the website Zomato.com:
<input class="dark" id="location_input" role="combobox" aria-expanded="true" aria-labelledby="label_search_location" aria-owns="explore-location-suggest" aria-autocomplete="list" placeholder="Please type a location...">
Here is my VBA code:
Sub Merchant_Extraction()
Dim IE As Object
Dim form As Variant
Dim button As Variant
Set IE = CreateObject("internetexplorer.application")
merchantzip = InputBox("Enter Zip Code")
With IE
.Visible = True
.navigate ("http://www.zomato.com")
While IE.readystate <> 4
DoEvents
Wend
IE.Document.GetElementByID(“location_input_sp”).Item.innertext = merchantzip
Set form = IE.Document.getelementsbytagname("form")
Set button = form(0).onsubmit
form(0).get
End With
Set IE = Nothing
End Sub
I am unclear why it's not working - any help would be incredible!
API XMLHTTP GET request
The API was mentioned. See the Zomato API documentation.
The basic free account allows access to restaurant information and search APIs (up to 1000 calls/day).
An example call for the first 20 results (start=0, count=20), with a city id specified (68 for Manchester, UK), is as follows; a JSON response is received. The response is parsed into a JSON object with JSONConverter.bas
Option Explicit
Public Sub GetInfo()
Dim URL As String, strJSON As String, json As Object
URL = "https://developers.zomato.com/api/v2.1/search?entity_id=68&entity_type=city&start=0&count=20"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.setRequestHeader "Content-Type", "application/json"
.setRequestHeader "user-key", "yourAPIkey"
.send
strJSON = .responseText
End With
Set json = JsonConverter.ParseJson(strJSON)
'other stuff with JSON object
End Sub
Example JSON response:
Zomato - Common APIs:
Finding your city ID:
The quickest way for me was to concatenate the city onto a base URL string, e.g. https://www.zomato.com/manchester, then click search and right-click to inspect the HTML of the first result. Then press Ctrl+F to bring up the search box, search for CITY_ID, and scan through the matches in the HTML until the city id is found e.g.
As far as entering values into webpages the correct syntax would be:
IE.Document.all.Item("location_input").Value = ""
I've combined your routine with some code that I use so you can see an example. I have not been able to test however. In my environment, the IE object disconnects after the .navigate portion so I added in a loop to find and re-assign the object...
Option Explicit
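#If VBA7 Then
Public Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#Else
Public Declare Sub Sleep Lib "kernel32" (ByVal dwMilliseconds As Long)
#End If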
Sub Merchant_Extraction()
Dim IE As Object, objShellWindows As Object
Dim MerchantZip As String, strWebPath As String
Dim Form As Variant, Button As Variant
Dim X As Long
strWebPath = "http://www.zomato.com"
MerchantZip = InputBox("Enter Zip Code")
If MerchantZip = vbNullString Then Exit Sub
Set IE = CreateObject("InternetExplorer.Application")
With IE
.Visible = True
.Navigate strWebPath
End With
Do
Sleep 250
DoEvents
Loop While IE.Busy Or IE.ReadyState <> 4
If TypeName(IE) <> "IWebBrowser2" Or IE.Name <> "Internet Explorer" Then
Set objShellWindows = CreateObject("Shell.Application").Windows
For X = 0 To objShellWindows.Count - 1
Set IE = objShellWindows.Item(X)
If Not IE Is Nothing Then
If IE.Name = "Internet Explorer" Then
If InStr(1, IE.LocationURL, strWebPath, 1) > 0 Then
Do While IE.Busy Or IE.ReadyState <> 4
Sleep 250
DoEvents
Loop
Exit For
End If
End If
End If
Set IE = Nothing
Next
Set objShellWindows = Nothing
End If
If Not IE Is Nothing Then
IE.Document.all.Item("location_input").Value = MerchantZip
Sleep 250
For Each Button In IE.Document.getelementsbytagname("form")
If StrComp(Button.Type, "Button", 1) = 0 Then
Button.Click
End If
Next
Set IE = Nothing
End If
End Sub