Excel VBA automation of HTML form generated by PHP

I am a non-programmer trying to create an Excel spreadsheet that uses some VBA automation to import web data based on certain cell values. I have managed to piece together part of the process using YouTube and other sites like this one. However, I have hit a roadblock that I am hoping someone can help me with.
Here is the setup:
I am trying to import some CO2 data based on 2 parameters (pressure and temperature) that will reside in 2 separate cells in my Excel sheet.
In my VBA code, I have managed to navigate to the first site (http://www.peacesoftware.de/einigewerte/co2_e.html), and then find the table elements, fill them in, and submit the form.
My problem is when IE navigates to the next page where the results are. I do not know how to import table elements from this new page. I assume I need to tell VBA to look at the new page, but it has a generic URL (http://www.peacesoftware.de/einigewerte/calc_co2.php5), so I am confused on what to reference.
I hope this is all the info needed to get a clear picture of my problem. Here is my current VBA code. The end part after "submit" is a wild guess on my part. Once VBA is looking at the right table/page, I will then import the enthalpy and entropy values to my excel spreadsheet.
Thanks in advance for your help!
' updates enthalpy, entropy data from peacesoftware site
Private Sub Worksheet_Change(ByVal Target As Range)
If Target.Row = Range("tempA").Row And Target.Column = Range("tempA").Column Then
Dim IE As Object
Dim pressA As String
Dim tempA As String
Dim denseA As String
Dim enthA As String
Dim entroA As String
tempA = Range("tempA")
pressA = Range("pressA")
Set IE = CreateObject("InternetExplorer.Application")
' setup internet explorer
IE.Visible = True
IE.navigate "http://www.peacesoftware.de/einigewerte/co2_e.html"
Do
DoEvents
Loop Until IE.readyState = READYSTATE_COMPLETE
' find pressure and temp entry form
IE.document.getElementsByName("druck")(0).Value = pressA
IE.document.getElementsByName("temperatur")(0).Value = tempA
IE.document.forms(0).submit
'Do
'DoEvents
'Loop Until IE.readyState = READYSTATE_COMPLETE
'update new energy data
'IE.navigate table_url ???
End If
End Sub
Private Sub Worksheet_SelectionChange(ByVal Target As Range)
End Sub
***Update
I think I found a solution, so I can at least get it functional. (thanks to this post: excel vba form submit and parse data from result )
I needed to wait after submit and then search for the "td" element I needed. Since the actual value I needed was the next element after the element label, I used TDelement.nextElementSibling.innerText to grab it.
IE.document.forms(0).submit
' wait for new page to load
Do While IE.Busy: DoEvents: Loop
Set doc = IE.document
Dim sdd As String
Set TDelements = doc.getElementsByTagName("td")
r = 0
For Each TDelement In TDelements
If TDelement.innerText = "Density : " Then
Range("denseA") = TDelement.nextElementSibling.innerText
r = r + 1
End If
Next
Thanks again for the help.

The problem is that the form is sent by POST, not by GET.
That means the parameters do not appear in the URL, so you cannot see or change them there.
However, you can use VBA to send a POST request directly to the result page.
The following code gets the values you mentioned in the comment section of this answer.
Change the postData variable if you want a different temperature or pressure ('druck').
This example code returns the result for a pressure of 20 and a temperature of 20:
Dim Density As String
Dim Entropy As String
Dim Enthalpy As String
Dim SoA As String
Dim Result As String
Dim myURL As String, postData As String
Dim winHttpReq As Object
Set winHttpReq = CreateObject("WinHttp.WinHttpRequest.5.1")
myURL = "http://www.peacesoftware.de/einigewerte/calc_co2.php5"
postData = "lang=english&calc=standard&druck=20&druckunit=1&temperatur=20&tempunit=1"
winHttpReq.Open "POST", myURL, False
winHttpReq.SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
winHttpReq.Send (postData)
Result = winHttpReq.responseText
Density = getValue("Density", Result, False)
Entropy = getValue("Entropy", Result, False)
Enthalpy = getValue("Enthalpy", Result, False)
SoA = getValue("state of aggregation", Result, True)
Additionally I created a helper-function which must also be implemented:
Public Function getValue(Property As String, Result As String, isBold As Boolean) As String
Dim posProp As Long
Dim posTD As Long
Dim posEndTD As Long
Dim startPosVal As Long
Dim endPosVal As Long
Dim valLength As Long
Dim Value As String
'find the position of the value on the page
posProp = InStr(1, Result, Property)
If posProp > 0 Then
posTD = InStr(posProp, Result, "<td>")
If isBold = True Then
posEndTD = InStr(posTD, Result, "</b></td>")
startPosVal = posTD + 7
Else
posEndTD = InStr(posTD, Result, "</td>")
startPosVal = posTD + 4
End If
endPosVal = posEndTD
valLength = endPosVal - startPosVal
Value = Mid(Result, startPosVal, valLength)
getValue = Value
End If
End Function
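To drive the request from the worksheet instead of the fixed 20/20 values, the postData string could be assembled from the two input cells. This is only a minimal sketch: it assumes the named ranges pressA and tempA from the question and keeps the unit codes (druckunit=1, tempunit=1) from the example above. BuildCo2PostData is a hypothetical helper name, not something the site or the code above defines.

```vba
' Sketch only: BuildCo2PostData is a hypothetical helper.
' It reuses the field names from the example postData string above.
Public Function BuildCo2PostData(ByVal pressure As String, ByVal temperature As String) As String
    BuildCo2PostData = "lang=english&calc=standard" & _
                       "&druck=" & pressure & "&druckunit=1" & _
                       "&temperatur=" & temperature & "&tempunit=1"
End Function

Public Sub DemoBuildPostData()
    ' Assumes the named ranges pressA and tempA exist, as in the question.
    Debug.Print BuildCo2PostData(Range("pressA").Value, Range("tempA").Value)
End Sub
```

The returned string can then be passed to winHttpReq.Send in place of the hard-coded postData.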
Hope this helps!
If you need any help just leave me a comment.
EDIT:
Ah, I just read that you found a solution.
But I think this code is a bit cleaner and faster because it sends just one HTTP request directly to the result page.

Related

Importing data from a hyperlink on a webpage in excel using macros

I want to import some data from a website https://www.amfiindia.com/nav-history-download. On this page, there is a link "Download Complete NAV Report in Text Format" which will give me the required data. But this link is not static so I cannot use this directly in VBA to download my data. So how to download data from a hyperlink on a webpage using excel?
My approach is first getting the hyperlink in a variable then use that variable to get the data?
First, get the hyperlink using getElementsByTagName function as shown below.
Then use that as URL to get the data.
But I am getting type mismatch error while equating website which is a string with my hyperlink.
I don't know the type of href. Tried seeing in watch window showing Variant, tried that still error.
Kindly help me with this.
Sub webscraping()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
Dim cellAddress As String
Dim rowNumber As Long
Dim ie As InternetExplorer
Dim ht As HTMLDocument
Dim hr As MSHTML.IHTMLElement
'Dim Hra As MSHTML.IHTMLElement
Set ie = New InternetExplorer
ie.Visible = True
ie.Navigate ("https://www.amfiindia.com/nav-history-download")
Do Until ie.ReadyState >= 4
DoEvents
Loop
Set ht = ie.Document
'MsgBox ht.getElementById("navhistorydownload")
Set hr = ht.getElementsByTagName("a")(18).href
' Website to go to.
website = StrConv(hr, vbUnicode)
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there - probably don't need to change this.
request.Open "GET", website, False
' Get fresh data.
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)
' Put the webpage into an html object to make data references easier.
html.body.innerHTML = response
' Get the price from the specified element on the page.
'price = html.getElementstagName("a").Item(0).innerText
cellAddress = Range("A" & Rows.Count).End(xlUp).Address
rowNumber = Range(cellAddress).Row
ThisWorkbook.Sheets(1).Cells(rowNumber + 1, 1) = response
' MsgBox rowNumber
' MsgBox cellAddress
' Output the price into a message box.
'MsgBox price
End Sub
If you don't know the type, you can use
?typename(ht.getElementsByTagName("a")(18).href)
in the Immediate window.
It should be a String and declared as such. Set is only for object references, so assign the href without Set.
Rather than indexing into an anchor collection, I would grab the link by CSS selector:
ht.querySelector(".nav-hist-dwnld a").href
This selects the element with class name nav-hist-dwnld and then returns the first a tag among its descendants.
The line website = StrConv(hr, vbUnicode) is not required. Use the extracted href directly.
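Put together, reading the link into a String could look like the sketch below. It assumes the Microsoft HTML Object Library reference that the question's code already relies on, and that the page still uses the nav-hist-dwnld class. GetNavReportUrl is a hypothetical helper name.

```vba
' Sketch only: GetNavReportUrl is a hypothetical helper.
' Requires a reference to Microsoft HTML Object Library (MSHTML).
Public Function GetNavReportUrl(ByVal doc As MSHTML.HTMLDocument) As String
    Dim link As Object
    ' First <a> inside the element with class nav-hist-dwnld
    Set link = doc.querySelector(".nav-hist-dwnld a")
    ' href is a String, so it is assigned without Set
    If Not link Is Nothing Then GetNavReportUrl = link.href
End Function
```

In the question's code this would be called as website = GetNavReportUrl(ht), after which website can be passed straight to request.Open. An empty result means the selector matched nothing.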

Return URL From First Search Result

I have an Excel workbook of around 25,000 company keywords from which I'd like to get the company website URL.
I am looking to run a VBA script which can run these keywords as a Google search, and pull the URL of the first result into a spreadsheet.
I found a similar thread.
I found the results of this to be hit-and-miss; some keywords return the URL in the next column, others remain blank.
It also seemed to pull the URL of Google's optimised sub-links in the first search result rather than the main website URL: Google Search Result example
I then found the below code here which I ran on a sample list of 1,000 keywords. The author of this blog stipulates that this code works for Mozilla Firefox.
I tested IE code that he has also written but this did not achieve the same results (it was adding hyperlinks consisting of descriptive text from the search results rather than the raw URL).
The Firefox code worked until the 714th row, then returned an error message:
"Run time error 91: object variable or with block variable not set"
Spreadsheet layout showing successful results and row at which macro stopped
Sub GoogleURL()
Dim url As String, lastRow As Long
Dim XMLHTTP As Object
Dim html As Object
Dim objResultDiv As Object
Dim objH As Object
lastRow = Range("A" & Rows.Count).End(xlUp).Row
For i = 2 To lastRow
url = "https://www.google.co.uk/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)
Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
XMLHTTP.Open "GET", url, False
XMLHTTP.setRequestHeader "Content-Type", "text/xml"
XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
XMLHTTP.send
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText
Set objResultDiv = html.getelementbyid("rso")
Set objH = objResultDiv.getelementsbytagname("h3")(0)
Cells(i, 2).Value = objH.innerText
Set html = CreateObject("htmlfile")
html.body.innerHTML = XMLHTTP.ResponseText
Set objResultDiv = html.getelementbyid("rso")
Set objH = objResultDiv.getelementsbytagname("cite")(0)
Cells(i, 3).Value = objH.innerText
DoEvents
Next
End Sub
Firefox is a third-party browser and outside the scope of Microsoft support, so I can only help you check the VBA code for the IE browser.
You said that the IE version of the VBA code in that link writes the description together with the link, and that you need the description and the link stored in separate columns.
I modified that sample code to match your requirement.
Here is the modified code:
Option Explicit
Const TargetItemsQty = 1 ' results for each keyword
Sub GWebSearchIECtl()
Dim objSheet As Worksheet
Dim objIE As Object
Dim x As Long
Dim y As Long
Dim strSearch As String
Dim lngFound As Long
Dim st As String
Dim colGItems As Object
Dim varGItem As Variant
Dim strHLink As String
Dim strDescr As String
Dim strNextURL As String
Set objSheet = Sheets("Sheet1")
Set objIE = CreateObject("InternetExplorer.Application")
objIE.Visible = True ' for debug or captcha request cases
y = 1 ' start searching for the keyword in the first row
With objSheet
.Select
.Range(.Columns("B:B"), .Columns("B:B").End(xlToRight)).Delete ' clear previous results
.Range(.Columns("C:C"), .Columns("C:C").End(xlToRight)).Delete ' clear previous results
.Range("A1").Select
Do Until .Cells(y, 1) = ""
x = 2 ' start writing results from column B
.Cells(y, 1).Select
strSearch = .Cells(y, 1) ' current keyword
With objIE
lngFound = 0
.navigate "https://www.google.com/search?q=" & EncodeUriComponent(strSearch) ' go to first search results page
Do
Do While .Busy Or Not .READYSTATE = 4: DoEvents: Loop ' wait IE
Do Until .document.READYSTATE = "complete": DoEvents: Loop ' wait document
Do While TypeName(.document.getelementbyid("res")) = "Null": DoEvents: Loop ' wait [#res] element
Set colGItems = .document.getelementbyid("res").getElementsByClassName("g") ' collection of search result [.g] items
For Each varGItem In colGItems ' process each item in collection
If varGItem.getelementsbytagname("a").Length > 0 And varGItem.getElementsByClassName("st").Length > 0 Then ' must have hyperlink and description
strHLink = varGItem.getelementsbytagname("a")(0).href ' get first hyperlink [a] found in current item
strDescr = GetInnerText(varGItem.getElementsByClassName("st")(0).innerHTML) ' get first description [span.st] found in current item
lngFound = lngFound + 1
'Debug.Print (strHLink)
'Debug.Print (strDescr)
With objSheet ' put result into cell
.Cells(y, x).Value = strDescr
.Hyperlinks.Add .Cells(y, x + 1), strHLink
.Cells(y, x).WrapText = True
x = x + 1 ' next column
End With
If lngFound = TargetItemsQty Then Exit Do ' continue with next keyword - necessary quantity of the results for current keyword found
End If
DoEvents
Next
If TypeName(.document.getelementbyid("pnnext")) = "Null" Then Exit Do ' continue with next keyword - no [a#pnnext.pn] next page button exists
strNextURL = .document.getelementbyid("pnnext").href ' get next page url
.navigate strNextURL ' go to next search results page
Loop
End With
y = y + 1 ' next row
Loop
End With
objIE.Quit
' google web search page contains the elements:
' [div#res] - main search results block
' [div.g] - each result item block within [div#res]
' [a] - hyperlink ancor(s) within each [div.g]
' [span.st] - description(s) within each [div.g]
' [a#pnnext.pn] - hyperlink ancor to the next search results page
End Sub
Function EncodeUriComponent(strText As String) As String
Static objHtmlfile As Object
If objHtmlfile Is Nothing Then
Set objHtmlfile = CreateObject("htmlfile")
objHtmlfile.parentWindow.execScript "function encode(s) {return encodeURIComponent(s)}", "jscript"
End If
EncodeUriComponent = objHtmlfile.parentWindow.encode(strText)
End Function
Function GetInnerText(strText As String) As String
Static objHtmlfile As Object
If objHtmlfile Is Nothing Then
Set objHtmlfile = CreateObject("htmlfile")
objHtmlfile.Open
objHtmlfile.Write "<body></body>"
End If
objHtmlfile.body.innerHTML = strText
GetInnerText = objHtmlfile.body.innerText
End Function
Output in IE 11 browser:
You can run it on your side to see the results with a large amount of data.
If you run into any performance issues, I suggest trying it with a smaller amount of data first.
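On the run-time error 91 from the question: that error is raised whenever an object variable is Nothing at the point it is used. In the XMLHTTP version this happens if the response has no element with id "rso", or no h3/cite inside it. The thread does not establish why Google returned such a page for row 714 (a block or captcha page is one possibility, but that is a guess). A guard along the following lines would let the loop skip such rows instead of stopping. FirstInnerText is a hypothetical helper name.

```vba
' Sketch only: FirstInnerText is a hypothetical helper.
' Returns the innerText of the first <tagName> under parent,
' or an empty string when parent or the tag is missing.
Public Function FirstInnerText(ByVal parent As Variant, ByVal tagName As String) As String
    Dim matches As Object
    If Not IsObject(parent) Then Exit Function   ' e.g. Null from a late-bound call
    If parent Is Nothing Then Exit Function      ' element not found
    Set matches = parent.getElementsByTagName(tagName)
    If matches.Length > 0 Then FirstInnerText = matches(0).innerText
End Function
```

Inside the question's loop this would be used as Cells(i, 2).Value = FirstInnerText(html.getElementById("rso"), "h3"), and likewise with "cite" for column 3. Blank cells then mark the rows that need a retry.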

filling a html auto search box and obtaining the results

I am trying to fill in a search box on a web page that automatically searches as you type. The website is https://pcpartpicker.com/products/motherboard/. If you go there and type in a motherboard manufacturer or motherboard name, you can see how it begins to narrow down the possible selections. I have code that fills in the search box, but nothing happens.
Sub GetMotherboards()
Dim ie As InternetExplorer
Set ie = New InternetExplorer
Dim doc As HTMLDocument
Dim objText As DataObject
Dim objArticleContents As Object
Dim objLinksCollection As Object
Dim objToClipBoard As DataObject
Dim r As Object
Dim prodRating As String
Dim prodName As String
Dim lngNumberOfVideos As Long
Dim strURL As String
Dim strNewString As String, strStr As String, strTestChar As String
Dim bFlag As Boolean
strURL = "https://pcpartpicker.com/products/motherboard/" ' Range("J5").Value
With ie
.navigate strURL
.Visible = True
Do While .readyState <> 4: DoEvents: Loop
Application.Wait Now + #12:00:02 AM#
Set doc = ie.document
End With
bFlag = False
With doc
Set objArticleContents = .getElementsByClassName("subTitle__form")
Stop
Set ele = .getElementsByClassName("subTitle__form")(0)
Set form = .getElementsByClassName("subTitle__form")(0).getElementsByClassName("form-label xs-inline")(1)
Set inzputz = ele.getElementsByClassName("text-input")(0)
Call .getElementsByClassName("text-input")(0).setAttribute("placeholder", "MSI B450 TOMAHAWK") '.setAttribute("part_category_search", "MSI B450 TOMAHAWK")
End With
End Sub
After reading some posts here (which I now can't find), my thinking is that there are event listeners and functions that need to be included in this code, but that is over my head. Could someone please help me figure this out?
Tim Williams has a post here (an answer to a post) which discussed this but now I can't find it.
You can avoid the expense of a browser and perform the same xhr GET request the page does that returns json. You will need a json parser to handle the response.
Json library:
I use jsonconverter.bas. Download the raw code from here and add it to a standard module called JsonConverter. You then need to go VBE > Tools > References > Add reference to Microsoft Scripting Runtime. Remove the top Attribute line from the copied code.
I show a partial implementation which makes requests for different categories and products and uses both full and partial string searches. It is a partial implementation in that I read responses into json objects and also print the json strings but do not attempt to access all items within the json object. That can be refined given more detail from you. For demo purposes I access ("result")("data"), which gives you the price and name info. Part of the original response json has html as the value for accessor ("result")("html"). This has description info, e.g. Socket/CPU for motherboard items.
Option Explicit
Public Sub ProductSearches()
Dim xhr As Object, category As String, items()
Set xhr = CreateObject("MSXML2.XMLHTTP")
category = "motherboard"
items = Array("Gigabyte B450M DS3H", "MSI B450 TOMAHAWK", "random string")
PrintListings items, xhr, category
category = "memory"
items = Array("Corsair Vengeance") 'partial search
PrintListings items, xhr, category
End Sub
Public Function GetListings(ByVal xhr As Object, ByVal category As String, ByVal item As String) As Object
Dim json As Object
With xhr
.Open "GET", "https://pcpartpicker.com/products/" & category & "/fetch/?xslug=&location=&search=" & item, False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Set json = JsonConverter.ParseJson(.responseText)("result")("data")
Set GetListings = json
End With
End Function
Public Sub PrintListings(ByRef items(), ByVal xhr As Object, ByVal category As String)
'Partially implemented. You need to decide what to do with contents of json object
Dim json As Object, i As Long
For i = LBound(items) To UBound(items)
Set json = GetListings(xhr, category, items(i))
'Debug.Print Len(JsonConverter.ConvertToJson(json)) ' Len(JsonConverter.ConvertToJson(json)) =2 i.e {} then no results
Debug.Print JsonConverter.ConvertToJson(json) 'demo purposes only
'do something with json
Next
End Sub
Json parsing:
Read about using JsonConverter and parsing json in vba here, here and here.
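One detail worth hedging in the request above: the search terms contain spaces and are concatenated into the URL as-is. A minimal sketch of building the same URL with the term percent-encoded follows. It assumes Excel 2013 or later, where WorksheetFunction.EncodeURL is available. BuildFetchUrl is a hypothetical helper name, and the URL pattern is simply the one used in GetListings.

```vba
' Sketch only: BuildFetchUrl is a hypothetical helper.
' Assumes Excel 2013+ for WorksheetFunction.EncodeURL.
Public Function BuildFetchUrl(ByVal category As String, ByVal item As String) As String
    BuildFetchUrl = "https://pcpartpicker.com/products/" & category & _
                    "/fetch/?xslug=&location=&search=" & _
                    Application.WorksheetFunction.EncodeURL(item)
End Function
```

GetListings could then call .Open "GET", BuildFetchUrl(category, item), False. On older Excel versions, the EncodeUriComponent approach based on an htmlfile object is an alternative.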
You need to execute the keyup event after you place your value into your textbox.
You can accomplish this by using the execScript method.
So, after you load the webpage, create a variable for your input/textbox. In the below example, it's tb. Set the .Value property to your search text (which I used "MSI") then fire the keyup event via script.
Dim tb As HTMLInputElement
Set tb = IE.document.getElementById("part_category_search")
tb.Value = "MSI"
IE.document.parentWindow.execScript "$('#part_category_search').keyup()"
I am not overly familiar with jQuery, but the #part_category_search selector targets the search box by its id. I've tested it and it works for your search.
Here was the full code I used in testing if you want to shorten yours:
Dim IE As InternetExplorer
Set IE = New InternetExplorer
IE.Visible = True
IE.navigate "https://pcpartpicker.com/products/motherboard/"
Do While IE.Busy Or IE.readyState < 4
DoEvents
Loop
Dim tb As HTMLInputElement
Set tb = IE.document.getElementById("part_category_search")
tb.Value = "MSI"
IE.document.parentWindow.execScript "$('#part_category_search').keyup()"

Extract data from HTML Element - VBA

I'm new to web scraping and the HTML language.
I'm trying to write a code in VBA to extract data from the following website:
https://companies.govmu.org:4343/MNSOnlineSearch/
I have an Excel sheet with over 5000 company names and their respective "File No" in columns A and B respectively, and I need to input their "Status" (either "Live" or "Defunct") in column C. This will be done after searching for each company by "File No" and then extracting their status to the Excel sheet.
The issue is that I can't seem to get the element containing the data that I need.
I've already written the bit of code which will extract the "File No" from my Excel sheet, paste it on the webpage in the "File No" search box, and run the search. (You can try searching C5113, as an example).
However, on the resulting webpage, I've tried getting the element containing the data that I need, but it does not work.
For example, I tried to MsgBox (MsgBox is my personal way to check whether my variable contains the data I need) the inner HTML of the tag fieldset (fs) with ID "CompanyList" as shown in the code below, but it returns an error.
I've also tried with another variable named div, of data type HTMLDivElement, and then getting the element by ID "companies".
And finally, I've also tried looping through a variable of type IHTMLElementCollection to look for the element that I need, but it still does not show the element that I need (it shows other elements that I don't need).
Option Explicit
Sub ExtractStatusDetails()
Dim ie As InternetExplorer
Dim html As HTMLDocument
Dim resultHtml As HTMLDocument
Dim fs As IHTMLElement
Dim searchBoxes As IHTMLElementCollection
Dim searchButton As Object
Dim homePage As String
homePage = "https://companies.govmu.org:4343/MNSOnlineSearch/"
Set ie = New InternetExplorer
ie.Visible = False
ie.navigate homePage
Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set html = ie.document
Set searchBoxes = html.getElementsByClassName("col-md-6 col-lg-4")
searchBoxes(0).innerHTML = Replace(searchBoxes(0).innerHTML, "placeholder", "value")
searchBoxes(0).innerHTML = Replace(searchBoxes(0).innerHTML, "Search company by File No...", "C63")
Set searchButton = searchBoxes(0).getElementsByClassName("btn btn-large btn-primary btn-raised")
searchButton(0).Click
Do While ie.readyState <> READYSTATE_COMPLETE
DoEvents
Loop
Set resultHtml = ie.document
Set fs = resultHtml.getElementById("CompanyList")
MsgBox fs.innerHTML
ie.Quit
End Sub
The page does an xmlhttp POST request which retrieves data from a backend data store (likely Oracle GlassFish > JDBC API > data repository e.g. MySQL) . It returns all similar matches, possibly including exact.
You can find the POST request in the network traffic of browser dev tools after you enter the fileNo and press the search button.
Below is a function you can call in a loop over your fileNos to retrieve the company status:
Option Explicit
Public Sub test()
Dim fileNo As String, xmlhttp As Object
fileNo = "C5113"
Set xmlhttp = CreateObject("MSXML2.XMLHTTP")
MsgBox GetCompanyStatus(fileNo, xmlhttp)
End Sub
Public Function GetCompanyStatus(ByVal fileNo As String, ByVal xmlhttp As Object) As String
Dim html As HTMLDocument, body As String, fileNos As Object, i As Long
Set html = New HTMLDocument
body = "tabs=tab-1&searchByName=&searchByFileNo=PLACEHOLDER&submitCompanies=&searchByBusName=&searchByBRN=&searchByIncDateFrom=&searchByIncDateTo=&doAction=search"
With xmlhttp
.Open "POST", "https://companies.govmu.org:4343/MNSOnlineSearch/GetCompanies", False
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.send Replace$(body, "PLACEHOLDER", fileNo)
html.body.innerHTML = .responseText
Set fileNos = html.querySelectorAll("td.tdFileNo")
If fileNos.Length > 0 Then
For i = 0 To fileNos.Length - 1
If fileNos.item(i).innerText = fileNo Then
GetCompanyStatus = html.querySelectorAll("td.tdStatus").item(i).innerText
Exit Function
End If
Next i
End If
GetCompanyStatus = "Not found"
End With
End Function
I would instead consider how you can group your requests. As you can post partial file numbers, you could cut down on the number of requests considerably by working in batches with partial file numbers, e.g. search for C5 or C51; the backend then does something like "C5%" to return all matches starting with the specified string, and you then loop over those results looking for the file numbers of interest that fall within that range.
You could have a dictionary with fileNo as key and status as value and update this as you loop the results returned by a request. I think the number of keys is constrained by Long, so I expect no problem storing all your fileNos in a dictionary at the start and updating it later during requests. You could even have multiple dictionaries that host ranges of fileNos, like the volumes of the good old Encyclopædia Britannica. This would limit loops to the dictionaries you hope to populate from the same request, for example. It is something to play around with and explore.
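As a starting point for the dictionary idea, here is a minimal sketch that caches one status per file number and writes it to column C. It does not implement the partial-number batching described above; it still makes one request per distinct file number via GetCompanyStatus. The sheet index and the header row are assumptions, and FillStatuses is a hypothetical name.

```vba
' Sketch only: FillStatuses is a hypothetical routine.
' Assumes file numbers in column B from row 2 down (row 1 = headers)
' on the first worksheet, and GetCompanyStatus as defined above.
Public Sub FillStatuses()
    Dim statuses As Object, xmlhttp As Object
    Dim ws As Worksheet, lastRow As Long, r As Long, fileNo As String

    Set statuses = CreateObject("Scripting.Dictionary") ' fileNo -> status
    Set xmlhttp = CreateObject("MSXML2.XMLHTTP")
    Set ws = ThisWorkbook.Worksheets(1)
    lastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row

    For r = 2 To lastRow
        fileNo = Trim$(ws.Cells(r, "B").Value)
        If Len(fileNo) > 0 Then
            ' Only hit the server the first time a file number is seen
            If Not statuses.Exists(fileNo) Then
                statuses(fileNo) = GetCompanyStatus(fileNo, xmlhttp)
            End If
            ws.Cells(r, "C").Value = statuses(fileNo)
        End If
    Next r
End Sub
```

To move towards batching, the dictionary could be pre-filled from one partial-number request and the per-row request kept only as a fallback for keys still missing.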

How can I retrieve Amazon's keyword/phrase suggestions from the search bar

Below is some code I've found and altered to attempt to capture the keyword/phrase suggestions from Amazon's search bar. I'm very new to the concept of web scraping, so I know the code presented here may be very ineffective and inefficient. I've manually captured some data from the F12 DOM Explorer and Network windows. If the best answer is web scraping, I need that in the form of excel vba. I see in some of the below images that it appears as though some of the content type from the Network window is "application/json" and the Initiator/Type is "XMLHttpRequest", but this is only after it shows a connection and authentication to "https://completion.amazon.com". If that's the route, I have no idea how to complete those requests. Any help would be much appreciated.
So far I've tried invoking the search bar programmatically, via the scripts in the code, but that does nothing that I can see. Simply 'pasting' the keyword into the search bar with a 'space' appended to it does not produce the suggested keywords. However, typing into the search bar does. If I type the keyword in, then choose 'inspect element' of the dropdown suggestions, dynamic HTML is produced to show the HTML content of the suggestions (at which time I can get what I need). I've been unsuccessful in getting to that point.
Private Sub CommandButton1_Click()
Dim MyHTML_Element As IHTMLElement
Dim MyURL As String
Dim AASearchRank As Workbook
Dim AAws As Worksheet
Dim InputSearch As HTMLInputTextElement
Dim elems As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim elems2 As IHTMLElementCollection
Dim TDelement2 As HTMLDivElement
'Dim TDelement2 As HTMLInputTextElement
Dim InputSearchButton As HTMLInputButtonElement
Dim IE As InternetExplorer
Dim x As Integer
Dim i As Long
MyURL = "https://www.amazon.com/"
Set IE = New InternetExplorer
With IE
.Silent = True
.Navigate MyURL
.Visible = True
Do
DoEvents
Loop Until .ReadyState = READYSTATE_COMPLETE
End With
Set HTMLDoc = IE.Document
Set AASearchRank = Application.ThisWorkbook
Set AAws = AASearchRank.Worksheets("Sheet2")
Set InputSearchButton = HTMLDoc.getElementById("nav-search-submit-text")
Set InputSearchOrder = HTMLDoc.getElementById("twotabsearchtextbox")
If Not InputSearchOrder Is Nothing Then
InputSearchButton.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
End If
x = 2
If AAws.Range("D" & x).Value = "" Then
Do Until AAws.Range("B" & x) = ""
Set InputSearch = HTMLDoc.getElementById("twotabsearchtextbox")
InputSearch.Focus
'When a keyword is typed in the search bar with a 'space' after, it invokes the suggestions I'm looking for.
InputSearch.Value = "Travel "
'InputSearch.Value = AAws.Range("C" & x) & " "
Set InputSearchButton = HTMLDoc.getElementsByClassName("nav-input")(0)
InputSearch.Focus
'Here I was trying to invoke some script to see if it had any effect on the search bar drop down
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'UpNav',end:+new Date(),begin:window.navmet.tmp});"
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'Search',end:+new Date(),begin:window.navmet.tmp});"
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'NavBar',end:+new Date(),begin:window.navmet.main});"
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
'Application.Wait (Now + TimeValue("0:00:05"))
Set elems2 = HTMLDoc.getElementsByClassName("nav-issFlyout nav-flyout")
i = 0
For Each TDelement2 In elems2
'Debug statements strictly for learning what each option/query returns
Debug.Print TDelement2.innerText
Debug.Print TDelement2.className
Debug.Print TDelement2.dataFld
Debug.Print TDelement2.innerHTML
Debug.Print TDelement2.outerText
Debug.Print TDelement2.outerHTML
Debug.Print TDelement2.parentElement.className
Debug.Print TDelement2.tagName
Debug.Print TDelement2.ID
Next
'Once the searchbar is populated, and the drop down list provides suggestions,
'the below code will give me what I want. If there's an easier solution,
'I'm all for it
Set elems = HTMLDoc.getElementsByClassName("s-suggestion")
i = 0
For Each TDelement In elems
If Left(TDelement.ID, 6) = "issDiv" Then
Debug.Print TDelement.innerText
Debug.Print TDelement.ID
End If
Next
x = x + 1
Loop
End If
End Sub
An ideal solution would be to obtain these suggested keywords through either invoking the search bar dynamic HTML or via Amazon's completion site, but it appears as though that might not be open to the general public. Thank you for any help, and apologies up front for any posting deficiencies.
There is an API call you can find in the network tab. It returns a json string you can parse with a JSON parser to get the suggestions. I use jsonconverter.bas which, once downloaded, I add to the project and then go VBE > Tools > References > Add a reference to Microsoft Scripting Runtime.
The url itself is a queryString i.e. it is constructed of different parameters. For example, there is a limit parameter, whose value is 11, which specifies the number of suggestions to return. You may be able to alter and/or remove some of these. Below, I concatenate the SEARCH_TERM constant into the query string to represent your search value (that which would be typed into the search box).
I don't know whether any of the params are time-based (i.e. expire over time - I have made a number of requests without problem since you posted your question). It may be that necessary time based values can be pulled via a prior GET request to Amazon search page.
params = (
('session-id', '141-0042012-2829544'),
('customer-id', ''),
('request-id', '7E7YCB7AZZM1HQEZF2G1'),
('page-type', 'Search'),
('lop', 'en_US'),
('site-variant', 'desktop'),
('client-info', 'amazon-search-ui'),
('mid', 'ATVPDKIKX0DER'),
('alias', 'aps'),
('b2b', '0'),
('fresh', '0'),
('ks', '76'),
('prefix', 'TRAVEL'),
('event', 'onKeyPress'),
('limit', '11'),
('fb', '1'),
('suggestion-type', ['KEYWORD', 'WIDGET']),
('_', '1556820864750')
)
VBA:
Option Explicit
Public Sub GetTable()
Dim json As Object, suggestion As Object '< VBE > Tools > References > Microsoft Scripting Runtime
Const SEARCH_TERM As String = "TRAVEL"
Const SEARCH_TERM2 As String = "BOOKS"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://completion.amazon.com/api/2017/suggestions?session-id=141-0042012-2829544" & _
"&customer-id=&request-id=7E7YCB7AZZM1HQEZF2G1&page-type=Search&lop=en_US&site-variant=" & _
"desktop&client-info=amazon-search-ui&mid=ATVPDKIKX0DER&alias=aps&b2b=0&fresh=0&ks=76&" & _
"prefix=" & SEARCH_TERM & "&event=onKeyPress&limit=11&fb=1&suggestion-type=KEYWORD&suggestion-type=" & _
"WIDGET&_=1556820864750", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Set json = JsonConverter.ParseJson(.responseText)("suggestions")
End With
For Each suggestion In json
Debug.Print suggestion("value")
Next
End Sub
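To get the suggestions into the workbook rather than the Immediate window, the parsed collection could be written down a column. This is a minimal sketch: WriteSuggestions is a hypothetical helper, and the target cell is an assumption to adapt to your layout. It expects the collection of dictionaries that JsonConverter.ParseJson(...)("suggestions") returns above.

```vba
' Sketch only: WriteSuggestions is a hypothetical helper.
' suggestions: collection of dictionaries, each with a "value" key.
' target: first cell to write to; later items go in the rows below it.
Public Sub WriteSuggestions(ByVal suggestions As Object, ByVal target As Range)
    Dim suggestion As Object, rowOffset As Long
    For Each suggestion In suggestions
        target.Offset(rowOffset, 0).Value = suggestion("value")
        rowOffset = rowOffset + 1
    Next suggestion
End Sub
```

For example, WriteSuggestions json, ThisWorkbook.Worksheets("Sheet2").Range("D2") after the With block in GetTable, using the sheet name from the question.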