I have over 20,000 searches I need to run in Google. I want to use VBA to automate a search in Google (or Internet Explorer) and return the link to Excel. I have tried multiple VBA macros and none of them seem to work. Is there a macro that will run an automated search and return the link to the first site in the Google results to Excel? Below is the code I am currently using, but it isn't working. I am searching addresses in column A and need the link returned to column B.
Sub XMLHTTP_Count()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object
Dim start_time As Date
Dim end_time As Date
lastRow = Range("A" & Rows.Count).End(xlUp).Row
Dim cookie As String
Dim result_cookie As String
start_time = Time
Debug.Print "start_time:" & start_time
For i = 2 To lastRow
url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
Set XMLHTTP = CreateObject("MSXML2.XMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText
If html.getElementById("resultStats") Is Nothing Then
str_text = "0 Results"
Else
str_text = html.getElementById("resultStats").innerText
End If
Cells(i, 2) = str_text
DoEvents
Next
end_time = Time
Debug.Print "end_time:" & end_time
Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub
Well, you don't need the randomizer, and it looks like the 'resultStats' ID changed to 'result-stats'. Try the code below and see if it does what you want.
Sub GetSearchStats()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
Dim start_time As Date
Dim end_time As Date
Dim var As String
Dim var1 As Object
lastRow = Range("A" & Rows.Count).End(xlUp).row
Dim cookie As String
Dim result_cookie As String
start_time = Time
Debug.Print "start_time:" & start_time
For i = 2 To lastRow
url = "https://www.google.com/search?q=" & Cells(i, 1)
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText
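Set objResultDiv = html.getElementById("rso")
Set var1 = html.getElementById("result-stats")
' getElementById returns Nothing when the element is missing, so guard before reading innerText
If var1 Is Nothing Then
Cells(i, 2).Value = "0 Results"
Else
Cells(i, 2).Value = var1.innerText
End If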
DoEvents
Next
end_time = Time
Debug.Print "end_time:" & end_time
Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub
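One thing the loop above does not do is URL-encode the cell text before appending it to the query string. Addresses usually contain spaces, commas and characters such as # or &, which can change the meaning of the URL. The following is a minimal sketch of an encoder, not part of the original answer; UrlEncodeAscii is a hypothetical helper name, and it only handles ASCII input (other characters are passed through unchanged). Recent Windows versions of Excel may also offer WorksheetFunction.EncodeURL, if that is available to you.

```vba
' Hypothetical helper: percent-encodes ASCII text for use in a query string.
' Assumption: non-ASCII characters are left unchanged in this sketch.
Public Function UrlEncodeAscii(ByVal rawText As String) As String
    Dim i As Long, ch As String, code As Long, result As String
    For i = 1 To Len(rawText)
        ch = Mid$(rawText, i, 1)
        code = AscW(ch)
        Select Case code
            Case 48 To 57, 65 To 90, 97 To 122, 45, 46, 95, 126
                ' Digits, letters and - . _ ~ are safe as they are
                result = result & ch
            Case 32
                ' A space becomes + in a query string
                result = result & "+"
            Case 0 To 127
                ' Any other ASCII character becomes %XX
                result = result & "%" & Right$("0" & Hex$(code), 2)
            Case Else
                result = result & ch
        End Select
    Next i
    UrlEncodeAscii = result
End Function
```

With this in a standard module, the URL line in the loop could become url = "https://www.google.com/search?q=" & UrlEncodeAscii(CStr(Cells(i, 1).Value)).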
I think I answered your initial question. This sounds like a new question, and it probably warrants a new post, but I'll go ahead and offer a second answer here to address it.
Sub WebPage()
Dim internet As Object
Dim internetdata As Object
Dim div_result As Object
Dim header_links As Object
Dim link As Object
Dim URL As String
Set internet = CreateObject("InternetExplorer.Application")
internet.Visible = True
URL = "https://www.google.co.in/search?q=how+to+program+in+vba"
internet.Navigate URL
Do Until internet.ReadyState >= 4
DoEvents
Loop
Application.Wait Now + TimeSerial(0, 0, 5)
Set internetdata = internet.Document
Set div_result = internetdata.getelementbyid("res")
Set header_links = div_result.getelementsbytagname("h3")
For Each h In header_links
Set link = h.ChildNodes.Item(0)
Cells(Range("A" & Rows.Count).End(xlUp).Row + 1, 1) = link.href
Next
MsgBox "done"
End Sub
You can easily convert each text field to a hyperlink if you want to make these all clickable links. Feel free to modify the code to suit your needs.
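A minimal sketch of that conversion, assuming the links were written to column A as in the macro above. LinkifyColumn is a hypothetical helper name, and the "starts with http" check is just one simple way to skip cells that do not hold a URL.

```vba
' Hypothetical helper: turns every URL-looking cell in column A into a clickable hyperlink.
Public Sub LinkifyColumn(ByVal ws As Worksheet)
    Dim lastRow As Long, r As Long, target As String
    lastRow = ws.Range("A" & ws.Rows.Count).End(xlUp).Row
    For r = 1 To lastRow
        target = CStr(ws.Cells(r, 1).Value)
        ' Only convert cells whose text looks like a web address
        If LCase$(Left$(target, 4)) = "http" Then
            ws.Hyperlinks.Add Anchor:=ws.Cells(r, 1), Address:=target, TextToDisplay:=target
        End If
    Next r
End Sub
```

Call it with LinkifyColumn ActiveSheet after the scraping macro has finished.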
I am trying to get data from "https://in.tradingview.com/symbols/NSE-ABB/technicals/" using Excel VBA web scraping. Even though I am getting a response, body.innerHTML does not contain the required table, although when I inspect the page in Chrome I can see the table with the class name given below.
What is wrong with the code?
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", URL, False
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
sResponse = Mid$(sResponse, InStr(1, sResponse, "<!DOCTYPE "))
WriteTxtFile sResponse
With html
.body.innerHTML = sResponse
Set tElementC = .getElementsByClassName("table-1i1M26QY- maTable-27Z4Dq6Y- tableWithAction-2OCRQQ8y-")(0).getElementsByTagName("td")
End With
URL --> https://in.tradingview.com/symbols/NSE-ABB/technicals/
classname to access = "table-1i1M26QY- maTable-27Z4Dq6Y- tableWithAction-2OCRQQ8y-"
The source HTML of the page at https://in.tradingview.com/symbols/NSE-ABB/technicals/ doesn't contain the necessary data; the page loads it via AJAX. The website has a sort of API available, and the response is returned in JSON format. So you need to do some reverse engineering first to find out how the website works. In a browser, e.g. Chrome, press F12 to open DevTools, navigate to the webpage, go to the Network tab, and set the filter to XHR to see the logged requests:
Examine the logged responses. The one with the largest size actually contains all the necessary data:
To make such an XHR you need to keep the entire payload structure and add the relevant headers:
In the Form Data section there are a lot of quote field titles located within the array, so you may choose which ones you actually need. To find more available titles, click the Initiator link for the request in the Network tab; you will see the JS code that initiated the XHR. Click Pretty print {} at the bottom to make the code readable. Type any title you already pulled from Form Data into the search box, e.g. Recommend.Other, and find the others next to it in the code:
Here is a VBA example showing how such scraping could be done. Import the JSON.bas module into the VBA project for JSON processing.
Option Explicit
Sub Test()
Dim aQuoteFieldTitles()
Dim aQuoteFieldData()
Dim sPayload As String
Dim sJSONString As String
Dim vJSON
Dim sState As String
Dim i As Long
' Put the necessary field titles into array
aQuoteFieldTitles = Array( _
"name", "description", "country", "type", "after_tax_margin", "average_volume", "average_volume_30d_calc", "average_volume_60d_calc", "average_volume_90d_calc", "basic_eps_net_income", "beta_1_year", "beta_3_year", "beta_5_year", "current_ratio", "debt_to_assets", "debt_to_equity", "dividends_paid", "dividends_per_share_fq", _
"dividends_yield", "dps_common_stock_prim_issue_fy", "earnings_per_share_basic_ttm", "earnings_per_share_diluted_ttm", "earnings_per_share_forecast_next_fq", "earnings_per_share_fq", "earnings_release_date", "earnings_release_next_date", "ebitda", "enterprise_value_ebitda_ttm", "enterprise_value_fq", "exchange", "expected_annual_dividends", _
"gross_margin", "gross_profit", "gross_profit_fq", "industry", "last_annual_eps", "last_annual_revenue", "long_term_capital", "market_cap_basic", "market_cap_calc", "net_debt", "net_income", "number_of_employees", "number_of_shareholders", "operating_margin", _
"pre_tax_margin", "preferred_dividends", "price_52_week_high", "price_52_week_low", "price_book_ratio", "price_earnings_ttm", "price_revenue_ttm", "price_sales_ratio", "quick_ratio", "return_of_invested_capital_percent_ttm", "return_on_assets", "return_on_equity", "return_on_invested_capital", "revenue_per_employee", "sector", _
"eps_surprise_fq", "eps_surprise_percent_fq", "total_assets", "total_capital", "total_current_assets", "total_debt", "total_revenue", "total_shares_outstanding_fundamental", "volume", "relative_volume", "pre_change", "post_change", "close", "open", "high", "low", "gap", "price_earnings_to_growth_ttm", "price_sales", "price_book_fq", _
"price_free_cash_flow_ttm", "float_shares_outstanding", "total_shares_outstanding", "change_from_open", "change_from_open_abs", "Perf.W", "Perf.1M", "Perf.3M", "Perf.6M", "Perf.Y", "Perf.YTD", "Volatility.W", "Volatility.M", "Volatility.D", "RSI", "RSI7", "ADX", "ADX+DI", "ADX-DI", "ATR", "Mom", "High.All", "Low.All", "High.6M", "Low.6M", _
"High.3M", "Low.3M", "High.1M", "Low.1M", "EMA5", "EMA10", "EMA20", "EMA30", "EMA50", "EMA100", "EMA200", "SMA5", "SMA10", "SMA20", "SMA30", "SMA50", "SMA100", "SMA200", "Stoch.K", "Stoch.D", "MACD.macd", "MACD.signal", "Aroon.Up", "Aroon.Down", "BB.upper", "BB.lower", "goodwill", "debt_to_equity_fq", "CCI20", "DonchCh20.Upper", _
"DonchCh20.Lower", "HullMA9", "AO", "Pivot.M.Classic.S3", "Pivot.M.Classic.S2", "Pivot.M.Classic.S1", "Pivot.M.Classic.Middle", "Pivot.M.Classic.R1", "Pivot.M.Classic.R2", "Pivot.M.Classic.R3", "Pivot.M.Fibonacci.S3", "Pivot.M.Fibonacci.S2", "Pivot.M.Fibonacci.S1", "Pivot.M.Fibonacci.Middle", "Pivot.M.Fibonacci.R1", _
"Pivot.M.Fibonacci.R2", "Pivot.M.Fibonacci.R3", "Pivot.M.Camarilla.S3", "Pivot.M.Camarilla.S2", "Pivot.M.Camarilla.S1", "Pivot.M.Camarilla.Middle", "Pivot.M.Camarilla.R1", "Pivot.M.Camarilla.R2", "Pivot.M.Camarilla.R3", "Pivot.M.Woodie.S3", "Pivot.M.Woodie.S2", "Pivot.M.Woodie.S1", "Pivot.M.Woodie.Middle", "Pivot.M.Woodie.R1", _
"Pivot.M.Woodie.R2", "Pivot.M.Woodie.R3", "Pivot.M.Demark.S1", "Pivot.M.Demark.Middle", "Pivot.M.Demark.R1", "KltChnl.upper", "KltChnl.lower", "P.SAR", "Value.Traded", "MoneyFlow", "ChaikinMoneyFlow", "Recommend.All", "Recommend.MA", "Recommend.Other", "Stoch.RSI.K", "Stoch.RSI.D", "W.R", "ROC", "BBPower", "UO", "Ichimoku.CLine", _
"Ichimoku.BLine", "Ichimoku.Lead1", "Ichimoku.Lead2", "VWMA", "ADR", "RSI[1]", "Stoch.K[1]", "Stoch.D[1]", "CCI20[1]", "ADX-DI[1]", "AO[1]", "Mom[1]", "Rec.Stoch.RSI", "Rec.WR", "Rec.BBPower", "Rec.UO", "Rec.Ichimoku", "Rec.VWMA", "Rec.HullMA9" _
)
' Field titles exactly as in the table MOVING AVERAGES
' aQuoteFieldTitles = Array("EMA5", "SMA5", "EMA10", "SMA10", "EMA20", "SMA20", "EMA30", "SMA30", "EMA50", "SMA50", "EMA100", "SMA100", "EMA200", "SMA200", "Ichimoku.BLine", "VWMA", "HullMA9")
' Compose payload
sPayload = "{""symbols"":{""tickers"":[""NSE:ABB""],""query"":{""types"":[]}},""columns"":" & JSON.Serialize(aQuoteFieldTitles) & "}"
' Retrieve JSON response
With CreateObject("MSXML2.XMLHTTP")
.Open "POST", "https://scanner.tradingview.com/india/scan", True
.setRequestHeader "content-type", "application/x-www-form-urlencoded"
.setRequestHeader "user-agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.110 Safari/537.36"
.setRequestHeader "content-length", Len(sPayload)
.send (sPayload)
Do Until .readyState = 4: DoEvents: Loop
sJSONString = .responseText
End With
' Parse JSON response
JSON.Parse sJSONString, vJSON, sState
' Check response validity
Select Case True
Case sState <> "Object"
MsgBox "Invalid JSON response"
Case IsNull(vJSON("data"))
MsgBox vJSON("error")
Case Else
' Output data to worksheet #1
aQuoteFieldData = vJSON("data")(0)("d")
With ThisWorkbook.Sheets(1)
.Cells.Delete
.Cells.WrapText = False
For i = 0 To UBound(aQuoteFieldTitles)
.Cells(i + 1, 1).Value = aQuoteFieldTitles(i)
.Cells(i + 1, 2).Value = aQuoteFieldData(i)
Next
.Columns.AutoFit
End With
MsgBox "Completed"
End Select
End Sub
The output for me is as follows:
BTW, a similar approach is applied in other answers.
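If you only need the payload string and would rather not rely on JSON.Serialize for it, the column list can also be assembled by hand. The following is a sketch under that assumption; BuildScanPayload is a hypothetical helper name, and the payload layout is copied from the sPayload line in the code above rather than from any documented API.

```vba
' Hypothetical helper: builds the same payload shape as sPayload above
' from a ticker and a 1-D array of field titles.
Public Function BuildScanPayload(ByVal ticker As String, ByVal fieldTitles As Variant) As String
    Dim i As Long, quoted() As String
    ReDim quoted(LBound(fieldTitles) To UBound(fieldTitles))
    For i = LBound(fieldTitles) To UBound(fieldTitles)
        ' Escape backslashes and double quotes so each title is a valid JSON string
        quoted(i) = """" & Replace(Replace(CStr(fieldTitles(i)), "\", "\\"), """", "\""") & """"
    Next i
    BuildScanPayload = "{""symbols"":{""tickers"":[""" & ticker & """],""query"":{""types"":[]}}," & _
                       """columns"":[" & Join(quoted, ",") & "]}"
End Function
```

For example, BuildScanPayload("NSE:ABB", Array("EMA5", "SMA5")) yields a payload that requests just those two columns.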
As mentioned in the comments, JavaScript has to run on the page to update the required content. There doesn't appear to be a freely available API. You can use a browser instead. Go to VBE > Tools > References and add a reference to Microsoft Internet Controls.
Option Explicit
Public Sub GetInfo()
Dim IE As InternetExplorer, ws As Worksheet, hTable As Object, tRow As Object, td As Object, r As Long, c As Long, headers()
headers = Array("name", "value", "action")
Set ws = ThisWorkbook.Worksheets("Sheet1"): Set IE = New InternetExplorer
With IE
.Visible = True
.Navigate2 "https://in.tradingview.com/symbols/NSE-ABB/technicals/"
While .Busy Or .readyState < 4: DoEvents: Wend
Set hTable = IE.document.querySelector("table + .tableWithAction-2OCRQQ8y-")
ws.Cells(1, 1).Resize(1, UBound(headers) + 1) = headers
For Each tRow In hTable.getElementsByTagName("tr")
r = r + 1: c = 1
For Each td In tRow.getElementsByTagName("td")
ws.Cells(r, c).Value = td.innerText
c = c + 1
Next td
Next tRow
.Quit
End With
End Sub
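Because the table is filled in by JavaScript, it may not exist yet at the moment the Busy/readyState loop finishes. One way to handle that, sketched below, is to poll for the element with a timeout before reading it. WaitForElement is a hypothetical helper name, the 10-second default is an arbitrary assumption, and the sketch assumes the document supports querySelector, as IE.document does in the code above.

```vba
' Hypothetical helper: polls the document until the selector matches or the timeout expires.
' Returns Nothing if the element never appears.
' Note: Timer resets at midnight, so a wait that spans midnight would end early.
Public Function WaitForElement(ByVal doc As Object, ByVal selector As String, _
                               Optional ByVal timeoutSeconds As Double = 10) As Object
    Dim started As Single, el As Object
    started = Timer
    Do
        Set el = doc.querySelector(selector)
        If Not el Is Nothing Then Exit Do
        DoEvents    ' let the browser keep working while we wait
    Loop While Timer - started < timeoutSeconds
    Set WaitForElement = el
End Function
```

In GetInfo, the Set hTable line could then call WaitForElement(IE.document, "table + .tableWithAction-2OCRQQ8y-") and check for Nothing before looping over the rows.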
I have a cell that contains a term I would like to search for in Google News. I want the code to return the number of results for that search. Currently I have this code, which I found elsewhere on the site; it does not use Google News, and even then I sometimes get a
runtime error -2147024891 (80070005)
after 70 or so searches, and then I can't run it again.
Sub HawkishSearch()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object, html As Object
Dim start_time As Date
Dim end_time As Date
lastRow = Range("B" & Rows.Count).End(xlUp).Row
Dim cookie As String
Dim result_cookie As String
start_time = Time
Debug.Print "start_time:" & start_time
For i = 2 To lastRow
url = "https://www.google.co.in/search?q=" & Cells(i, 2) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
Set XMLHTTP = CreateObject("MSXML2.XMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText
If html.getElementById("resultStats") Is Nothing Then
str_text = "0 Results"
Else
str_text = html.getElementById("resultStats").innerText
End If
Cells(i, 3) = str_text
DoEvents
Next
end_time = Time
Debug.Print "end_time:" & end_time
Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
End Sub
The best option (IMO) is to use the Google News API and register for an API key. You can then send a query string that includes your search term and parse the JSON response to get the result count. I do that below and also populate a collection with the article titles and links. I use a JSON parser called JSONConverter.bas, which you download and add to your project. You then need to go to VBE > Tools > References and add a reference to Microsoft Scripting Runtime.
Sample JSON response from API:
The {} denotes a dictionary, which you access by key; the [] denotes a collection, which you access by index or by looping over it with For Each.
I use the key totalResults to retrieve the total results count from the initial dictionary returned by the API.
I then loop the collection of dictionaries (articles) and pull the story titles and URLs.
You can then inspect the results in the Locals window or print them out.
Sample of results in locals window:
Option Explicit
Public Sub GetStories()
Dim articles As Collection, article As Object
Dim searchTerm As String, finalResults As Collection, json As Object, arr(0 To 1)
Set finalResults = New Collection
searchTerm = "Obama"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://newsapi.org/v2/everything?q=" & searchTerm & "&apiKey=yourAPIkey", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
Set json = JsonConverter.ParseJson(.responseText)
End With
Debug.Print "total results = " & json("totalResults")
Set articles = json("articles")
For Each article In articles
arr(0) = article("title")
arr(1) = article("url")
finalResults.Add arr
Next
Stop '<== Delete me later
End Sub
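To make the dictionary/collection distinction described above concrete, here is a small sketch that parses a hand-written JSON string instead of calling the API. The sample text is made up for illustration and is not an actual API response; ArticleTitles is a hypothetical helper name, and JsonConverter is assumed to be in the project as in the code above.

```vba
' Hypothetical helper: returns the article titles found in a JSON string
' shaped like {"totalResults": n, "articles": [{"title": ..., "url": ...}, ...]}.
Public Function ArticleTitles(ByVal jsonText As String) As Collection
    Dim json As Object, article As Object, titles As Collection
    Set titles = New Collection
    Set json = JsonConverter.ParseJson(jsonText)   ' {} becomes a Dictionary
    For Each article In json("articles")           ' [] becomes a Collection of Dictionaries
        titles.Add article("title")
    Next article
    Set ArticleTitles = titles
End Function

Public Sub JsonShapeDemo()
    Dim sample As String, json As Object
    ' Made-up sample data, for illustration only
    sample = "{""totalResults"":2,""articles"":[" & _
             "{""title"":""First"",""url"":""https://example.com/1""}," & _
             "{""title"":""Second"",""url"":""https://example.com/2""}]}"
    Set json = JsonConverter.ParseJson(sample)
    Debug.Print json("totalResults")            ' dictionary: access by key
    Debug.Print json("articles")(1)("title")    ' collection: access by 1-based index
    Debug.Print ArticleTitles(sample).Count     ' collection: counted after a For Each loop
End Sub
```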
Loop:
If deploying in a loop, you can use a class, clsHTTP, to hold the XMLHTTP object. This is more efficient than creating and destroying the object on every iteration. I give this class a GetString method to retrieve the JSON response from the API, and a GetInfo method to parse the JSON and retrieve the results count and the article URLs and titles.
Example of results structure in locals window:
Class clsHTTP:
Option Explicit
Private http As Object
Private Sub Class_Initialize()
Set http = CreateObject("MSXML2.XMLHTTP")
End Sub
Public Function GetString(ByVal url As String) As String
With http
.Open "GET", url, False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
GetString = .responseText
End With
End Function
Public Function GetInfo(ByVal json As Object) As Variant
Dim results(), counter As Long, finalResults(0 To 1), articles As Object, article As Object
finalResults(0) = json("totalResults")
Set articles = json("articles")
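' Guard: ReDim with an upper bound of 0 would raise an error when no articles are returned
If articles.Count > 0 Then
ReDim results(1 To articles.Count, 1 To 2)
For Each article In articles
counter = counter + 1
results(counter, 1) = article("title")
results(counter, 2) = article("url")
Next
End If
finalResults(1) = results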
GetInfo = finalResults
End Function
Standard module:
Option Explicit
Public Sub GetStories()
Dim http As clsHTTP, json As Object
Dim finalResults(), searchTerms(), searchTerm As Long, url As String
Set http = New clsHTTP
With ThisWorkbook.Worksheets("Sheet1")
searchTerms = Application.Transpose(.Range("A1:A2")) '<== Change to appropriate range containing search terms
End With
ReDim finalResults(1 To UBound(searchTerms))
For searchTerm = LBound(searchTerms, 1) To UBound(searchTerms, 1)
url = "https://newsapi.org/v2/everything?q=" & searchTerms(searchTerm) & "&apiKey=yourAPIkey"
Set json = JsonConverter.ParseJson(http.GetString(url))
finalResults(searchTerm) = http.GetInfo(json)
Set json = Nothing
Next
Stop '<==Delete me later
End Sub
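Since the original goal was a results count per search term, it may help to pull just the counts out of finalResults and write them next to the terms. The sketch below assumes the structure built by GetInfo above, where element 0 of each entry is the total and element 1 is the title/URL array. ResultCounts is a hypothetical helper name.

```vba
' Hypothetical helper: extracts the totalResults value from each entry of finalResults
' and returns them as an n x 1 array that can be written to a worksheet column.
Public Function ResultCounts(ByRef finalResults As Variant) As Variant
    Dim counts() As Variant, i As Long, n As Long
    n = UBound(finalResults) - LBound(finalResults) + 1
    ReDim counts(1 To n, 1 To 1)
    For i = 1 To n
        ' Element 0 of each entry holds the total results count
        counts(i, 1) = finalResults(LBound(finalResults) + i - 1)(0)
    Next i
    ResultCounts = counts
End Function
```

In GetStories, in place of the Stop line, you could assign counts = ResultCounts(finalResults) to a Variant and then write it with ThisWorkbook.Worksheets("Sheet1").Range("B1").Resize(UBound(counts, 1), 1).Value = counts.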
Otherwise:
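I would use the following, where I grab story links by their class name. I get the count and write the links to a collection. The HTMLDocument type requires a reference to Microsoft HTML Object Library (VBE > Tools > References).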
Option Explicit
Public Sub GetStories()
Dim sResponse As String, html As HTMLDocument, articles As Collection
Const BASE_URL As String = "https://news.google.com/"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://news.google.com/topics/CAAqIggKIhxDQkFTRHdvSkwyMHZNRGxqTjNjd0VnSmxiaWdBUAE?hl=en-US&gl=US&ceid=US:en", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
Set html = New HTMLDocument: Set articles = New Collection
Dim numberOfStories As Long, nodeList As Object, i As Long
With html
.body.innerHTML = sResponse
Set nodeList = .querySelectorAll(".VDXfz")
numberOfStories = nodeList.Length
Debug.Print "number of stories = " & numberOfStories
For i = 0 To nodeList.Length - 1
articles.Add Replace$(Replace$(nodeList.item(i).href, "./", BASE_URL), "about:", vbNullString)
Next
End With
Debug.Print articles.Count
End Sub
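The nested Replace$ calls above exist because the page uses relative links, and a document loaded through innerHTML has no real base URL, so the href values can come back prefixed with "about:". A slightly stricter variant, sketched below, touches only the start of the string, so a "./" that happens to appear later in a path is left alone. NormalizeHref is a hypothetical helper name.

```vba
' Hypothetical helper: strips a leading "about:" and resolves a leading "./" against baseUrl.
' Absolute URLs are returned unchanged.
Public Function NormalizeHref(ByVal href As String, ByVal baseUrl As String) As String
    Dim cleaned As String
    cleaned = href
    If Left$(cleaned, 6) = "about:" Then cleaned = Mid$(cleaned, 7)
    If Left$(cleaned, 2) = "./" Then
        cleaned = baseUrl & Mid$(cleaned, 3)
    End If
    NormalizeHref = cleaned
End Function
```

Inside the loop this would be used as articles.Add NormalizeHref(nodeList.item(i).href, BASE_URL).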
Standard Google search:
The following works for an example standard Google search, but you will not always get the same HTML structure; it depends on your search term. You will need to provide some failing cases to help me determine whether there is a consistent selector method that can be applied.
Option Explicit
Public Sub GetResultsCount()
Dim sResponse As String, html As HTMLDocument
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.google.com/search?q=mitsubishi", False
.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
.send
sResponse = StrConv(.responseBody, vbUnicode)
End With
Set html = New HTMLDocument
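With html
.body.innerHTML = sResponse
' querySelector returns Nothing when the element is absent, so guard before reading innerText
If .querySelector("#resultStats") Is Nothing Then
Debug.Print "resultStats element not found"
Else
Debug.Print .querySelector("#resultStats").innerText
End If
End With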
End Sub