I am making a request to a webpage using Visual Basic and i want to parse the response and add it to a database. My problem is that when i make the request, Windows prompts a question that asks if i want to save the file. Is there a way to receive the response as string directly in VB?
Here is a sample of my code:
Dim ret As Long
Dim ie As Object
'checking connection state
ret = InternetGetConnectedStateEx(ret, sConnType, 254, 0)
'instantiating IE
Set ie = CreateObject("internetexplorer.application")
'hiding browser
ie.Visible = False
'opening link
Header = "x-api-key:..." & Chr(10) & Chr(13)
ie.Navigate "https://api.openapi.ro/api/companies/13548146", , , , Header
If this is VBA, I suggest using XMLHTTP. In the VBA IDE, use tools--->references to add a reference to "Microsoft XML, v6.0" in the project (you may have a different version, adjust your code and references accordingly).
Here is an example of how you might use it:
Public Sub TestIt()
Dim xh As XMLHTTP60
Set xh = New XMLHTTP60
With xh
.Open "GET", "https://jsonplaceholder.typicode.com/posts/1"
.send
End With
'In this case, the response text was in ASCII, so I had to use StrConv to convert it to Unicode for VBA.
Debug.Print StrConv(xh.responseBody, vbUnicode)
End Sub
Related
I want to import some data from a website https://www.amfiindia.com/nav-history-download. On this page, there is a link "Download Complete NAV Report in Text Format" which will give me the required data. But this link is not static so I cannot use this directly in VBA to download my data. So how to download data from a hyperlink on a webpage using excel?
My approach is first getting the hyperlink in a variable then use that variable to get the data?
First, get the hyperlink using getElementsByTagName function as shown below.
Then use that as URL to get the data.
But I am getting type mismatch error while equating website which is a string with my hyperlink.
I don't know the type of href. Tried seeing in watch window showing Variant, tried that still error.
Kindly help me with this.
Sub webscraping()
Dim request As Object
Dim response As String
Dim html As New HTMLDocument
Dim website As String
Dim price As Variant
Dim cellAddress As String
Dim rowNumber As Long
Dim ie As InternetExplorer
Dim ht As HTMLDocument
Dim hr As MSHTML.IHTMLElement
'Dim Hra As MSHTML.IHTMLElement
Set ie = New InternetExplorer
ie.Visible = True
ie.Navigate ("https://www.amfiindia.com/nav-history-download")
Do Until ie.ReadyState >= 4
DoEvents
Loop
Set ht = ie.Document
'MsgBox ht.getElementById("navhistorydownload")
Set hr = ht.getElementsByTagName("a")(18).href
' Website to go to.
website = StrConv(hr, vbUnicode)
' Create the object that will make the webpage request.
Set request = CreateObject("MSXML2.XMLHTTP")
' Where to go and how to go there - probably don't need to change this.
request.Open "GET", website, False
' Get fresh data.
request.setRequestHeader "If-Modified-Since", "Sat, 1 Jan 2000 00:00:00 GMT"
' Send the request for the webpage.
request.send
' Get the webpage response data into a variable.
response = StrConv(request.responseBody, vbUnicode)
' Put the webpage into an html object to make data references easier.
html.body.innerHTML = response
' Get the price from the specified element on the page.
'price = html.getElementstagName("a").Item(0).innerText
cellAddress = Range("A" & Rows.Count).End(xlUp).Address
rowNumber = Range(cellAddress).Row
ThisWorkbook.Sheets(1).Cells(rowNumber + 1, 1) = response
' MsgBox rowNumber
' MsgBox cellAddress
' Output the price into a message box.
'MsgBox price
End Sub
If you don't know the type then you can use
?typename(ht.getElementsByTagName("a")(18).href)
in the immediate window.
It should be a string and declared as such.
Rather than indexing into an anchor collection I would grab by css selector
ht.querySelector(".nav-hist-dwnld a").href
This specifies the parent node with class name nav-hist-dwnld and then asks for the first child a tag.
This, website = StrConv(hr, vbUnicode) is not required. Use the extracted href direct.
Below is some code I've found and altered to attempt to capture the keyword/phrase suggestions from Amazon's search bar. I'm very new to the concept of web scraping, so I know the code presented here may be very ineffective and inefficient. I've manually captured some data from the F12 DOM Explorer and Network windows. If the best answer is web scraping, I need that in the form of excel vba. I see in some of the below images that it appears as though some of the content type from the Network window is "application/json" and the Initiator/Type is "XMLHttpRequest", but this is only after it shows a connection and authentication to "https://completion.amazon.com". If that's the route, I have no idea how to complete those requests. Any help would be much appreciated.
So far I've tried invoking the search bar programmatically, via the scripts in the code, but that does nothing that I can see. Simply 'pasting' the keyword into the search bar with a 'space' appended to it does not produce the suggested keywords. However, typing into the search bar does. If I type the keyword in, then choose 'inspect element' of the dropdown suggestions, dynamic HTML is produced to show the HTML content of the suggestions (at which time I can get what I need). I've been unsuccessful in getting to that point.
Private Sub CommandButton1_Click()
Dim MyHTML_Element As IHTMLElement
Dim MyURL As String
Dim AASearchRank As Workbook
Dim AAws As Worksheet
Dim InputSearch As HTMLInputTextElement
Dim elems As IHTMLElementCollection
Dim TDelement As HTMLTableCell
Dim elems2 As IHTMLElementCollection
Dim TDelement2 As HTMLDivElement
'Dim TDelement2 As HTMLInputTextElement
Dim InputSearchButton As HTMLInputButtonElement
Dim IE As InternetExplorer
Dim x As Integer
Dim i As Long
MyURL = "https://www.amazon.com/"
Set IE = New InternetExplorer
With IE
.Silent = True
.Navigate MyURL
.Visible = True
Do
DoEvents
Loop Until .ReadyState = READYSTATE_COMPLETE
End With
Set HTMLDoc = IE.Document
Set AASearchRank = Application.ThisWorkbook
Set AAws = AASearchRank.Worksheets("Sheet2")
Set InputSearchButton = HTMLDoc.getElementById("nav-search-submit-text")
Set InputSearchOrder = HTMLDoc.getElementById("twotabsearchtextbox")
If Not InputSearchOrder Is Nothing Then
InputSearchButton.Click
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
End If
x = 2
If AAws.Range("D" & x).Value = "" Then
Do Until AAws.Range("B" & x) = ""
Set InputSearch = HTMLDoc.getElementById("twotabsearchtextbox")
InputSearch.Focus
'When a keyword is typed in the search bar with a 'space' after, it invokes the suggestions I'm looking for.
InputSearch.Value = "Travel "
'InputSearch.Value = AAws.Range("C" & x) & " "
Set InputSearchButton = HTMLDoc.getElementsByClassName("nav-input")(0)
InputSearch.Focus
'Here I was trying to invoke some script to see if it had any effect on the search bar drop down
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'UpNav',end:+new Date(),begin:window.navmet.tmp});"
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'Search',end:+new Date(),begin:window.navmet.tmp});"
HTMLDoc.parentWindow.execScript "window.navmet.push({key:'NavBar',end:+new Date(),begin:window.navmet.main});"
Do
DoEvents
Loop Until IE.ReadyState = READYSTATE_COMPLETE
'Application.Wait (Now + TimeValue("0:00:05"))
Set elems2 = HTMLDoc.getElementsByClassName("nav-issFlyout nav-flyout")
i = 0
For Each TDelement2 In elems2
'Debug statements strictly for learning what each option/query returns
Debug.Print TDelement2.innerText
Debug.Print TDelement2.className
Debug.Print TDelement2.dataFld
Debug.Print TDelement2.innerHTML
Debug.Print TDelement2.outerText
Debug.Print TDelement2.outerHTML
Debug.Print TDelement2.parentElement.className
Debug.Print TDelement2.tagName
Debug.Print TDelement2.ID
Next
'Once the searchbar is populated, and the drop down list provides suggestions,
'the below code will give me what I want. If there's an easier solution,
'I'm all for it
Set elems = HTMLDoc.getElementsByClassName("s-suggestion")
i = 0
For Each TDelement In elems
If Left(TDelement.ID, 6) = "issDiv" Then
Debug.Print TDelement.innerText
Debug.Print TDelement.ID
End If
Next
x = x + 1
Loop
End If
End Sub
An ideal solution would be to obtain these suggested keywords through either invoking the search bar dynamic HTML or via Amazon's completion site, but it appears as though that might not be open to the general public. Thank you for any help, and apologies up front for any posting deficiencies.
There is an API call you can find in the network tab. It returns a json string you can parse with as jsonparser to get the suggestions. I use jsonconverter.bas which, once downloaded I add to the project and then go VBE > Tools > References > Add a reference to Microsoft Scripting Runtime.
The url itself is a queryString i.e. it is constructed of different parameters. For example, there is a limit parameter, whose value is 11, which specifies the number of suggestions to return. You may be able to alter and/or remove some of these. Below, I concatenate the SEARCH_TERM constant into the query string to represent your search value (that which would be typed into the search box).
I don't know whether any of the params are time-based (i.e. expire over time - I have made a number of requests without problem since you posted your question). It may be that necessary time based values can be pulled via a prior GET request to Amazon search page.
params = (
('session-id', '141-0042012-2829544'),
('customer-id', ''),
('request-id', '7E7YCB7AZZM1HQEZF2G1'),
('page-type', 'Search'),
('lop', 'en_US'),
('site-variant', 'desktop'),
('client-info', 'amazon-search-ui'),
('mid', 'ATVPDKIKX0DER'),
('alias', 'aps'),
('b2b', '0'),
('fresh', '0'),
('ks', '76'),
('prefix', 'TRAVEL'),
('event', 'onKeyPress'),
('limit', '11'),
('fb', '1'),
('suggestion-type', ['KEYWORD', 'WIDGET']),
('_', '1556820864750')
)
VBA:
Option Explicit
Public Sub GetTable()
Dim json As Object, suggestion As Object '< VBE > Tools > References > Microsoft Scripting Runtime
Const SEARCH_TERM As String = "TRAVEL"
Const SEARCH_TERM2 As String = "BOOKS"
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://completion.amazon.com/api/2017/suggestions?session-id=141-0042012-2829544" & _
"&customer-id=&request-id=7E7YCB7AZZM1HQEZF2G1&page-type=Search&lop=en_US&site-variant=" & _
"desktop&client-info=amazon-search-ui&mid=ATVPDKIKX0DER&alias=aps&b2b=0&fresh=0&ks=76&" & _
"prefix=" & SEARCH_TERM & "&event=onKeyPress&limit=11&fb=1&suggestion-type=KEYWORD&suggestion-type=" & _
"WIDGET&_=1556820864750", False
.setRequestHeader "User-Agent", "Mozilla/5.0"
.send
Set json = JsonConverter.ParseJson(.responseText)("suggestions")
End With
For Each suggestion In json
Debug.Print suggestion("value")
Next
End Sub
With dynamic web pages that display a table of retrieved data, I’ve found that both MSXML2.XMLHTTP and the Internet Explorer object usually can’t access this data. A good example is https://www.tiff.net/tiff/films.html. Both techniques won’t retrieve any of the movie data – just the surrounding web page. The code I’ve tried is as follows:
Function getHTTP(ByVal sReq As String) As Variant
On Error GoTo onErr
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", sReq, False
.send
getHTTP = StrConv(.responseBody, 64)
End With
Exit Function
onErr: MsgBox "Error " & Err & ": " & Err.Description, 49, "Error opening site"
End Function
Function GetHTML(ByVal strURL As String) As Variant
Dim oIE As InternetExplorer
Dim hElm As IHTMLElement
Set oIE = New InternetExplorer
oIE.Navigate strURL
Do While (oIE.Busy Or oIE.ReadyState <> READYSTATE_COMPLETE)
DoEvents
Loop
Set hElm = oIE.Document.all.tags("html").Item(0)
GetHTML = hElm.outerHTML
Set oIE = Nothing
Set hElm = Nothing
End Function
But there is a way to manually retrieve the movie data – just follow these steps with Microsoft Edge or Internet Explorer:
Right-click on one of the movies
Choose “inspect element." The DevTools console opens.
At the bottom-left of the screen, click on the “html” tab.
Right-click the tab. Choose “copy.”
Open notepad and paste what you’ve copied.
You now have the movie data and can save it to a file for parsing. My question: Is there any way to get this data programmatically?
Why Json? Because the page is loaded using json data
To View: Use Google Chrome --> Press F12 --> Load URL --> Goto Network tab
Code:
Sub getHTTP()
Dim Url As String, data As String
Dim xml As Object, JSON As Object, colObj, item
Url = "https://www.tiff.net/data/films-events-2018.json?q=1513263947586"
Set xml = CreateObject("MSXML2.ServerXMLHTTP")
With xml
.Open "GET", Url, False
.send
data = .responseText
End With
Set JSON = JsonConverter.ParseJson(data)
Set colObj = JSON("items")
For Each item In colObj
Debug.Print item("title")
Debug.Print item("description")
For Each c1 In item("cast")
Debug.Print c1
Next
For Each c2 In item("countries")
Debug.Print c2
Next
Next
End Sub
Output
Installation of JsonConverter
Download the latest release
Import JsonConverter.bas into your project (Open VBA Editor, Alt + F11; File > Import File)
Add Dictionary reference/class
For Windows-only, include a reference to "Microsoft Scripting Runtime"
For Windows and Mac, include VBA-Dictionary
Tree View of Data
Here are the film titles using IE (you can use same process to get directors)
Option Explicit
Public Sub GetFilms()
Dim IE As New InternetExplorer, html As HTMLDocument, films As Object, i As Long
With IE
.Visible = True
.navigate "https://www.tiff.net/tiff/films.html"
While .Busy Or .readyState < 4: DoEvents: Wend
Set films = .document.querySelectorAll("[target=_self]")
For i = 0 To films.Length - 1
Debug.Print films.item(i).innerText
Next
.Quit '<== Remember to quit application
End With
End Sub
XHR is too fast for this, with the URL provided, but IE is just fine.
If you inspect the HTML you can see each film has the following commonality:
There is an attribute within the a tag called target whose value is _self.
You can use an attribute CSS selector to gather all of these matching elements using the querySelectorAll method of document.
CSS selector (sample):
I would be interested in if this can be solved for getting the film descriptions by parsing the HTML. I had thought the presence of the comments was obscuring the film descriptions. A regex which selects the text within these in theory "<!-- react-text: \d+ -->([^...].+?(?=<))" seems to fail when applied to the .innerHTML as did attempts to swop out the comment start and finish with regex.
I want to automate my delivery status for my regular courier from various service provider like Blue Dart.
I have Docket Numbers; I tried the same using VBA but it is unable to fetch data from webpage.
My code enter the Docket number from cell in home page, then it redirects to other page where delivery status is mentioned in table.
Sub GetCourseList()
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
Dim IEWindows As SHDocVw.ShellWindows
Dim IEwindow As SHDocVw.InternetExplorer
Dim IEDocument As MSHTML.HTMLDocument
Dim BreadcrumbDiv As MSHTML.HTMLElementCollection
Set IEWindows = New SHDocVw.ShellWindows
'create new instance of IE. use reference to return current open IE if
'you want to use open IE window. Easiest way I know of is via title bar.
IE.Navigate "http://www.bluedart.com/maintracking.html"
'go to web page listed inside quotes
IE.Visible = True
While IE.busy
DoEvents 'wait until IE is done loading page.
Wend
IE.Document.All("numbers").Value = ThisWorkbook.Sheets("sheet1").Range("A1")
Application.SendKeys "~"
Dim URL As String
Dim qt As QueryTable
Dim ws As Worksheet
Set ws = Worksheets.Add
For Each IEwindow In IEWindows
If InStr(IEwindow.LocationURL, "your URL or some unique string") <> 0 Then ' Found it
Set IEDocument = IEwindow.Document
URL = IEwindow.LocationURL
Set qt = ws.QueryTables.Add( _
Connection:="URL;" & URL, _
Destination:=Range("F1"))
With qt
.RefreshOnFileOpen = True
.Name = "bluedart"
.FieldNames = True
.WebSelectionType = xlAllTables
.Refresh BackgroundQuery:=False
End With
End If
Next
End Sub
Your code does not attempt to interact with a page generated after entering Docket Number and confirming in any way. It could be done by:
Emulating browser interaction, can be Internet Explorer: click "Go" element on the page after Docket Number has been entered and use:
While IE.Busy Or IE.Readystate <> 4
DoEvents
Wend
It can also be achieved by creating POST request with proper parameters, including Docket Number.
Even after this is achieved, it still won't be possible to get data by query from this page, as its URL is this:
http://www.bluedart.com/servlet/RoutingServlet
Try to open this link. Nothing will display, because content of this URL is generated via POST method and parameters needed to generate content properly are not included in URL.
Instead of query, data can be accessed via finding HTML elements, such as tables, in HTML document for both methods I've mentioned.
We are in the process of implementing code to read/create 2D bar codes that are starting to show up on our supplier's parts.
We have a need to create the 2D bar codes in MS Access reports and forms. Has anyone had success with the font (IDAutomation) or Active X (dlSoft) solutions out there.
For C#, the open source library "http://barcoderender.codeplex.com/" was suggested. Any thoughts on how successful this was or if anyone has other open-source and/or pay for options.
Thanks,
Anton
Depending on the volume of codes you need to generate, you could use the Google Charts API to generate QR Codes.
Simply add a "Microsoft Web Browser" ActiveX component and the following code to your Form:
Dim Size As Integer
Dim Text As String
Dim URL As String
Size = 200
Text = "This is my test"
' Better to actually use a URL encoding function like those described here:
' http://stackoverflow.com/questions/218181/how-can-i-url-encode-a-string-in-excel-vba
Text = Replace(Text, " ", "%20")
URL = "http://chart.apis.google.com/chart?chs=" & Size & "x" & Size & "&cht=qr&chld=H|0&chl=" & Text
WebBrowser.Navigate (URL)
You can of course change the Size and the Text depending on your need. The Text can also be a value directly from your Form, therefore your data.
I would advise you to check Googles Terms and Services before using it.
I completely escaped the web browser control and the Google API since that functionality is now deprecated from what I can tell. I went with a different free API but the Google API or any other API could be used instead.
In my example I am creating an .png image in the same directory as the application. I have a text box on my form named txtToCode in which I type in any text I want to code. I also have an image control so that the image can be viewed from the form, but you can modify it how you wish:
Private Sub btnCode2_Click()
Call GetQRCode(Me.txtToCode, 150, 150)
End Sub
Sub GetQRCode(Content As String, Width As Integer, Height As Integer)
Dim ByteData() As Byte
Dim XmlHttp As Object
Dim HttpReq As String
Dim ReturnContent As String
Dim EncContent As String
Dim QRImage As String
EncContent = EncodeURL(Content)
HttpReq = "https://api.qrserver.com/v1/create-qr-code/?data=" & EncContent & "&size=" & Width & "x" & Height & ""
Set XmlHttp = CreateObject("MSXML2.XmlHttp")
XmlHttp.Open "GET", HttpReq, False
XmlHttp.Send
ByteData = XmlHttp.responseBody
Set XmlHttp = Nothing
ReturnContent = StrConv(ByteData, vbUnicode)
Call ExportImage(ReturnContent)
End Sub
Private Sub ExportImage(image As String)
Dim FilePath As String
On Error GoTo NoSave
' Build Export Path
FilePath = Application.CurrentProject.Path & "\qr.png"
Open FilePath For Binary As #1
Put #1, 1, image
Close #1
Me.Image3.Picture = FilePath
' Save File Path
Exit Sub
NoSave:
MsgBox "Could not save the QR Code Image! Reason: " & Err.Description, vbCritical, "File Save Error"
End Sub
Private Function EncodeURL(str As String)
Dim ScriptEngine As Object
Dim encoded As String
Dim Temp As String
Temp = Replace(str, " ", "%20")
Temp = Replace(Temp, "#", "%23")
EncodeURL = Temp
End Function