Downloading a file with WinHTTP using POST/GET and headers (with VBA)

As a follow-up to my previous question: automating the download of a file from a link that looks like this: https://www.domain.com/TableData/TableA.csv
I am learning more about the headers that are sent with POST and GET, but I am far from knowing what I need to know.
The code I have so far downloads a file, but the problem from the last question persists: when I open the downloaded file after execution, it is filled with the HTML of the login page, as though the credentials from the POST login did not work. I suspect my problem lies in the strAuthenticate string, but I don't know for sure.
Sub SaveFileFromURL()
Dim FileNum As Long
Dim FileData() As Byte
Dim WHTTP As Object
fileUrl = "https://www.ncci.com/Manuals/RateTableData/State/XX/XX.csv"
filePath = "C:\Apps\information.csv"
myuser = "xxxxxx"
mypass = "xxxxxx"
strAuthenticate = "sm_userid=xxxxx&sm_password=xxxxxx"
Set WHTTP = CreateObject("WinHTTP.WinHTTPrequest.5.1")
WHTTP.Open "POST", "https://www.ncci.com", False
WHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
WHTTP.send strAuthenticate
x = WHTTP.getResponseHeader("Set-Cookie")
MsgBox x
WHTTP.Open "GET", fileUrl, False
WHTTP.setRequestHeader "Cookie", x
WHTTP.send
FileData = WHTTP.responseBody
Set WHTTP = Nothing
FileNum = FreeFile
Open filePath For Binary Access Write As #FileNum
Put #FileNum, 1, FileData
Close #FileNum
MsgBox "File has been saved!", vbInformation, "Success"
End Sub

BIG THANK YOU TO RYAN WILDRY! Problem Solved.
I needed to set two specific cookies and post to the actual login link; using the main domain was not working.
For anyone else who finds this post one day: look through the cookies using the browser's developer tools and find the ones that you need to pass on login. I tested them by popping up a message box with the response headers, which I compared with the response headers I usually get when I log in manually.
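The approach described above can be sketched roughly as follows. This is only an illustration, not the exact code that solved the problem: LOGIN_URL, FILE_URL, the form fields and the cookie names CookieA and CookieB are hypothetical placeholders that have to be replaced with the values seen in the developer tools. CookiePair is a small helper (also an assumption of this sketch) that strips attributes such as path or expires from a Set-Cookie value, since only the name=value part belongs in a Cookie request header.

```vba
' Returns the "name=value" part of a single Set-Cookie header value,
' dropping attributes such as path, expires and secure.
Function CookiePair(ByVal setCookieValue As String) As String
    Dim pos As Long
    pos = InStr(setCookieValue, ";")
    If pos > 0 Then
        CookiePair = Trim$(Left$(setCookieValue, pos - 1))
    Else
        CookiePair = Trim$(setCookieValue)
    End If
End Function

Sub LoginThenDownloadSketch()
    ' Hypothetical values: replace them with the login URL, form fields
    ' and cookie names observed in the browser's developer tools.
    Const LOGIN_URL As String = "https://www.example.com/login"
    Const FILE_URL As String = "https://www.example.com/TableData/TableA.csv"
    Dim http As Object
    Set http = CreateObject("WinHttp.WinHttpRequest.5.1")

    ' Post the credentials to the real login link, not the main domain
    http.Open "POST", LOGIN_URL, False
    http.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
    http.send "sm_userid=xxxxx&sm_password=xxxxxx"

    ' Compare these headers with those of a manual login in the browser
    Debug.Print http.Status
    Debug.Print http.getAllResponseHeaders

    ' Send only the cookies the site needs, as name=value pairs
    http.Open "GET", FILE_URL, False
    http.setRequestHeader "Cookie", "CookieA=valueA; CookieB=valueB"
    http.send
    Debug.Print http.Status, Len(http.responseText)
End Sub
```

For example, CookiePair("SMSESSION=abc; path=/; secure") returns SMSESSION=abc, which can then be joined with other pairs using "; " to build the Cookie header.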

Related

Is there a way to tell if there is JSON on a website

I was wondering if there is a way to tell whether a site gets its data as JSON. I was working with the Sydney KWS site and someone was able to tell me what its JSON page was. The page I am looking at now is https://www.bne.com.au/passenger/flights/arrivals-departures. Any help would be great; I need to get the departure flight information from it.
I have taken a look at the backend and found that Java is being used to get the information, and that there is a redirect when you first load the page, but I could not make head or tail of that.
Open the webpage https://www.bne.com.au/passenger/flights/arrivals-departures in a browser (e.g. Chrome) and press F12 to open Developer Tools. Go to the Network tab, reload the page with F5, and enter json as the filter string; the matching requests are then logged.
Inspect the logged requests; the one with the largest size in this case contains the flight data. Open that request and you can see its URL on the Headers tab (a Unix timestamp is sent as the nocache parameter to disable caching).
The response content is shown on the Preview and Response tabs.
Here is a VBA example showing how that data could be retrieved. Import the JSON.bas module into the VBA project for JSON processing.
Option Explicit
Sub test()
Dim url As String
Dim resp As String
Dim data
Dim state As String
Dim body()
Dim head()
url = "https://www.bne.com.au/sites/default/files/00API-Today.json?nocache=" & CStr(epochTimeStamp(Now))
' Retrieve JSON content
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", url, True
.send
Do Until .readyState = 4: DoEvents: Loop
resp = .responseText
End With
' Parse JSON sample
JSON.Parse resp, data, state
If state = "Error" Then MsgBox "Invalid JSON": End
' Convert JSON to 2D Array
JSON.ToArray data, body, head
' Output to worksheet #1
output head, body, ThisWorkbook.Sheets(1)
MsgBox "Completed"
End Sub
Sub output(head, body, ws As Worksheet)
With ws
.Activate
.Cells.Delete
With .Cells(1, 1)
.Resize(1, UBound(head) - LBound(head) + 1).Value = head
.Offset(1, 0).Resize( _
UBound(body, 1) - LBound(body, 1) + 1, _
UBound(body, 2) - LBound(body, 2) + 1 _
).Value = body
End With
.Columns.AutoFit
End With
End Sub
Function epochTimeStamp(dateTime)
epochTimeStamp = (dateTime - 25569) * 86400
End Function
BTW, a similar approach is applied in other answers.
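Besides filtering in the Network tab, a candidate URL can also be checked from VBA by looking at its Content-Type response header. The sketch below is only a rough check under a few assumptions: IsJsonContentType and CheckUrlForJson are names made up for this example, the URL is the one found above and may have changed since, and some servers send JSON with a generic type such as text/plain, in which case this test gives a false negative.

```vba
' True when a Content-Type value mentions JSON, e.g. "application/json; charset=utf-8"
Function IsJsonContentType(ByVal contentType As String) As Boolean
    IsJsonContentType = InStr(1, contentType, "json", vbTextCompare) > 0
End Function

Sub CheckUrlForJson()
    Dim url As String
    Dim contentType As String
    ' URL taken from the Network tab; it may no longer exist
    url = "https://www.bne.com.au/sites/default/files/00API-Today.json"
    With CreateObject("MSXML2.XMLHTTP")
        .Open "GET", url, False
        .send
        contentType = .getResponseHeader("Content-Type")
        Debug.Print .Status, contentType
        Debug.Print "Looks like JSON: " & IsJsonContentType(contentType)
    End With
End Sub
```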

How to import JSON to Excel correctly

My objective is to run a search for some data and return the results into an Excel table. I'm using the service newsapi.org and using VBA to do this.
I'm sending an XMLHttpRequest to newsapi.org and successfully receiving a (JSON) response, which I am able to save into a file on my desktop. However, I cannot import that response into Excel, as I receive run-time error 13: type mismatch.
Bizarrely when I change my source to a different JSON file, it works. e.g. http://jsonplaceholder.typicode.com/users
So I'm assuming the issue is somewhere around the type of the JSON response I am receiving.
Public Sub xmlhttptutorial()
Dim xmlhttp As Object
Dim myurl As String
Dim JSON As Object
Dim myFile As String
Dim i As Integer
Dim ws As Worksheet
Set xmlhttp = CreateObject("MSXML2.XMLHTTP")
Set ws = Sheet2
myFile = "C:\Users\A0781525\Desktop\myFile.txt"
myurl = "https://newsapi.org/v2/everything?q=Ashley%20Madison%20Data%20Breach&"
xmlhttp.Open "GET", myurl, False
xmlhttp.Send
Set JSON = JsonConverter.ParseJson(xmlhttp.ResponseText)
Open myFile For Output As #1: Print #1, xmlhttp.ResponseText: Close #1
i = 2
For Each Item In JSON
Range("A2").Value = Item("articles")("0:")("source")("id:")
Range("A2").Value = Item("articles")("0:")("source")("name")
Range("A2").Value = Item("articles")("0:")("title")
i = i + 1
Next
End Sub
The break occurs at line:
Range("A2").Value = Item("articles")("0:")("source")("id:")
A sample of the JSON file output I receive:
{"status":"ok","totalResults":16,"articles":[{"source":{"id":"mashable","name":"Mashable"},"author":"Jack Morse","title":"Porn site leaks over a million users' private info","description":"The great thing about the internet is that no one has to know you have a serious thing for hentai pornography. Unless, that is, the porn site you have an account on leaks your personal information. Over a million Luscious.net account holders faced that unexpe…","url":"https://mashable.com/article/porn-site-leaks-users-data/","urlToImage":"https://mondrian.mashable.com/2019%252F08%252F20%252F24%252F62fc9aa277d54b2092a39393d2202a62.856fe.jpg%252F1200x630.jpg?signature=MBXieHs3n4uvowiVyV4K8cCO4j4=","publishedAt":"2019-08-20T22:36:24Z","content":"The great thing about the internet is that no one has to know you have a serious thing for hentai pornography. Unless, that is, the porn site you have an account on leaks your personal information. \r\nOver a million Luscious.net account holders faced that unex… [+2840 chars]"}
You are parsing the JSON incorrectly. Probably due to a misunderstanding of how it is constructed.
Try something like:
i = 2
'Cells.Clear
For Each item In JSON("articles")
Cells(i, 1).Value = item("source")("id")
Cells(i, 2).Value = item("source")("name")
Cells(i, 3).Value = item("title")
i = i + 1
Next
The problem is with the way you are trying to access the parsed JSON elements.
Not having the exact structure of the JSON, the best I can do is assume that what you need is this:
Debug.Print JSON("articles")(1)("source")("id")
to access the first article's id, or this:
For Each item In JSON("articles")
Debug.Print item("source")("id")
Next item
to loop through them.
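To see why keys such as "0:" and "id:" fail, it helps to look at the shape that JsonConverter.ParseJson produces: JSON objects become Dictionary objects keyed by the exact property name, and JSON arrays become Collection objects indexed from 1. The sketch below builds that shape by hand for a cut-down version of the sample response, so it runs without the parser; DemoParsedShape and the sample values are assumptions made for illustration only.

```vba
Sub DemoParsedShape()
    Dim src As Object
    Dim article As Object
    Dim root As Object
    Dim articles As New Collection

    ' {"id":"mashable","name":"Mashable"}  ->  Dictionary
    Set src = CreateObject("Scripting.Dictionary")
    src.Add "id", "mashable"
    src.Add "name", "Mashable"

    ' One element of the "articles" array  ->  Dictionary
    Set article = CreateObject("Scripting.Dictionary")
    article.Add "source", src
    article.Add "title", "Sample title"

    ' "articles":[ ... ]  ->  Collection, first element has index 1
    articles.Add article

    ' The top-level object  ->  Dictionary
    Set root = CreateObject("Scripting.Dictionary")
    root.Add "status", "ok"
    root.Add "articles", articles

    Debug.Print TypeName(root("articles"))              ' Collection
    Debug.Print root("articles")(1)("source")("id")     ' mashable
    Debug.Print root("articles").Count                  ' 1
End Sub
```

Because the top level is a single Dictionary, For Each over it yields its keys ("status", "articles") rather than articles, which is consistent with the type mismatch reported in the question.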

How to set dropdown box on website so that select option and scrape data

A website has changed so I can't scrape data from it anymore. I believe I just need to change the Set line below, but I have tried a number of things and none have worked. I'm not very knowledgeable in this area, I'm afraid, but the rest of the code is working. An example URL is https://www.gurufocus.com/stock/CFWFF/insider; it is the table of insider transactions for which I am trying to open the dropdown and change it to 100 instead of 10:
t = Timer
noTable = 0
Do
Set DropDown = doc.querySelectorAll(".el-dropdown-menu__item")
lastDropDrownItemIndex = DropDown.Length - 1
If Timer - t > MAX_WAIT_SEC Then
noTable = 1
Exit Do
End If
Loop While lastDropDrownItemIndex < 1
If noTable = 1 Then GoTo noTableEscape
DropDown.Item(lastDropDrownItemIndex).Click
Thanks
OK, so I'm not sure what exactly you are after, but the website you're scraping offers an API, which in my opinion could make your life a lot easier. Put simply, it provides an easy way to request data using parameters embedded in a URL. It returns the requested data in JSON format.
In the following code I will be using an HTTP request object (WinHttpRequest) and a JSON parser. For this you will need the following:
A JSON parser, which helps you parse the downloaded data. Follow its installation instructions to import it into your project.
A reference to the Microsoft Scripting Runtime library. The JSON parser needs it.
A reference to the Microsoft WinHTTP Services, Version 5.1 library. It lets you use an HTTP request object.
For demonstration purposes, the following code will only print in the immediate window the first entry's name and position. You can modify the code to fit your needs.
Sub test()
Dim req As New WinHttpRequest
Dim jsonResponse As String
Dim jsonParsed As Object
Dim url As String
Dim pageNum As Integer
Dim numPerPage As Integer
pageNum = 1 'You can change this parameter to navigate in different pages
numPerPage = 100 'You can change this parameter to control the number of entries
url = "https://www.gurufocus.com/reader/_api/stocks/OTCPK:CFWFF/insider?page=" & pageNum & "&per_page=" & numPerPage & "&sort=date%7Cdesc"
With req
.Open "GET", url, False
.setRequestHeader "Accept", "application/json, text/plain, */*"
.setRequestHeader "Authorization", ThisWorkbook.Worksheets("The name of your Worksheet").Range("A1").Value 'I have stored a string that is essential to the request in cell A1
.send
jsonResponse = .responseText
End With
Set jsonParsed = JsonConverter.ParseJson(jsonResponse)
Debug.Print jsonParsed("data")(1)("name") 'get the name parameter of the first entry
Debug.Print jsonParsed("data")(1)("position") 'get the position parameter of the first entry
End Sub
Please note that there's a very long string which is essential to the request, which I have stored in cell A1. This string looks like so:
Bearer
eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImp0aSI6ImUxYjAwMmYxMjczMGRiMTBmMmZkYjJkNDk0YTU4NjRmZDZjOWY3ZGI4ZmI1NDY1NTQ2MzZlMGJhNzkxODUxNmY4NTM2ZWIzZDNhODhmN2VmIn0.eyJhdWQiOiIyIiwianRpIjoiZTFiMDAyZjEyNzMwZGIxMGYyZmRiMmQ0OTRhNTg2NGZkNmM5ZjdkYjhmYjU0NjU1NDYzNmUwYmE3OTE4NTE2Zjg1MzZlYjNkM2E4OGY3ZWYiLCJpYXQiOjE1NTkwNzA3OTcsIm5iZiI6MTU1OTA3MDc5NywiZXhwIjoxNTY5NDM4NzkzLCJzdWIiOiIiLCJzY29wZXMiOltdfQ.mZ4DqhUk9YAU6JYDBScF8MJ_zHPyL94bAec7LxZTaWipcWf9uesdGDMDC9v_7W-6zrtXAUWhk4YAL70E5rpPjM7gusYH0RfO48O2PnaV8gsqXoNCFwFBOHuxh109q7X0YsNkfX2wX8m3XigtK9A_YAGID7wxgX96lwzBevsDJ3borHMcJlQtxidF_Bq2D5WPASsuy3jdY80HkOCR1y4eaSIswBEtK5rPj_xy7VXRbYGhLklqw4wgHgq4blfaHnVVmPXf6k8mx45ye8vPecS-w9kjuDOHVn2mvU6mpBzqEpbH4lqpiqmYG7M-CvB1joEAcMQtcilCvsdfKOusoC2MU4_vPtF3Q4ZFVaEcXIQgomdKtFa_XGpCudit45b2rEFacKMUENqLj_sPwYkgM1IPl1lQfR-VpigqnCHPAxVQAPzqwJvS6CxuYOPmvnrx23fBAillP7LtDHwHtlMpgZUjdB5y6IWsia76crM4kbkrKn3zc8xoAGb1fIrgJlY-9hOzrwsmrchantEdYOFZjcMJvhCnlfvnEm6kT2Sdcu4o6TndTZJjrVmD4mb-jNGy4kw_mAx1DfyqR7GLtCVSzcSLKgrrwCJEL22K2bfXH2HExXvgLFbPXivVZJc70TnF9lJmx_dx79cxAm7szFGIdrs56bAC4mdKpvKL3BNmVY-J-G0
The same string should work for you as well.
The result looks like so:
Brown, James Michael
Senior Officer
It's fairly easy to loop through all the entries as well. For example, to print the name of all the entries you would have to do this:
Dim item As Object
For Each item In jsonParsed("data")
Debug.Print item("name")
Next item
Finally, you can also loop through all the parameters of each entry. For example, the following code prints all the parameters and their corresponding values for the first entry:
Dim key As Variant
For Each key In jsonParsed("data")(1).Keys
Debug.Print key & ": " & jsonParsed("data")(1)(key)
Next key
So this way you can basically access any parameter you want for each entry.
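One caveat when printing every parameter: if a value is itself a nested object or array (a Dictionary or Collection after parsing), concatenating it with & is likely to raise a run-time error. A small helper along these lines can guard against that; ValueAsText is a name made up for this sketch, and whether the API actually returns nested values for these entries is an assumption.

```vba
' Converts a parsed JSON value to text that is safe to print
Function ValueAsText(ByVal v As Variant) As String
    If IsObject(v) Then
        ' Nested object or array: show only its type
        ValueAsText = "[" & TypeName(v) & "]"
    ElseIf IsNull(v) Then
        ' JSON null
        ValueAsText = "null"
    Else
        ValueAsText = CStr(v)
    End If
End Function
```

In the loop above it would be used as Debug.Print key & ": " & ValueAsText(jsonParsed("data")(1)(key)).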

Read html line by line in VBScript

I have a script that will pull the html down and put it into a text or HTML file. I can then parse the text file line by line, but I'd rather either parse the website itself or parse the textstream as I get it. Is this possible using VBS (my scripting languages are limited)?
Code:
dim URL
url = "www.something.com"
set wshshell = wscript.createobject("wscript.shell")
set http = createObject("microsoft.xmlhttp")
on error resume next
http.open "GET", URL, FALSE
http.send
if err.number = 0 then
outputFile.writeline http.responsetext
else
wscript.echo "error " & err.number & ": " & err.description
end if
set wshshell = nothing
set http = nothing
patchStatusFile.close
It works fine if I write to an external HTML file. I was wondering if I HAD to write to a file or can I parse the stream first? ie:
strToLookAt = http.responsetext
do until strToLookAt.atEndOfStream
strLine = strToLookAt.readLine
if strLine = "the thing I'm looking for"
...do stuff...
end if
loop
Why did no one answer such a simple question?
Here is an example of what I do.
dim up_http : up_http = "http://www.metrolyrics.com/Cornography-lyrics-Brad-Paisley.html"
dim xmlhttp : set xmlhttp = createobject("MSXML2.XMLHTTP.6.0")
'Open synchronously (False) so that responseText is complete when send returns
xmlhttp.open "GET", up_http, False
xmlhttp.send
LyricsURL = xmlhttp.responseText
'At this point we have the HTML from the web page in the in-memory variable LyricsURL
No need to write to any file.
You can just process the in-memory variable line by line.
In this case (script not shown) I have it in a function (get_html). I then process each line of the function's result, looking for particular strings that mark the beginning and end of the lyrics. I save that result into a variable and then replace and delete characters in it.
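A minimal sketch of processing the response text line by line without a temporary file is shown below. It is written as VBA; the same Split-based idea carries over to VBScript once the type declarations are removed. FindLine and its parameters are names invented for this example, not part of the script described above.

```vba
' Returns the 1-based number of the first line equal to target, or 0 if none.
Function FindLine(ByVal text As String, ByVal target As String) As Long
    Dim lines() As String
    Dim i As Long

    ' Normalise Windows and old Mac line endings to a single line feed
    text = Replace(text, vbCrLf, vbLf)
    text = Replace(text, vbCr, vbLf)
    lines = Split(text, vbLf)

    For i = LBound(lines) To UBound(lines)
        If Trim$(lines(i)) = target Then
            FindLine = i + 1
            Exit Function
        End If
    Next i
    FindLine = 0
End Function
```

With the response held in a string, a call such as FindLine(http.responseText, "the thing I'm looking for") replaces the readLine loop from the question.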
Never got an answer. What I decided to do was:
1) Create a temp file where I store the text info.
2) Parse temp file.
3) Delete Temp file.
Not best idea, but in all honesty, this isn't the best script anyway. Just realized there was an "export" button on the SCCM report. I am going to see about utilizing that in a script.

Microsoft Access 2007 and 2010: "Run-Time Error '429': ActiveX component can't create object"

I am trying to fix a Microsoft Access database that was imported from Access 97 format to Access 2007 format (.mdb to .accdb). The import was successful and I was able to make the database fully functional from my machine. It is also fully functional from my coworker's machine. However, when taken to another building that is part of our organization, we cannot get the database to run. We know that part of the problem is within opening a connection to the web server that holds the central database (there are several copies of this Access database that consist of the same code, but different data is input into them and uploaded to this central database). Here is the code.
Public Function updateSqlServer(TransType As String, SqlCommand As Variant) As Boolean
Dim xmldom As New MSXML2.DOMDocument40
Dim xmlhttp As New MSXML2.ServerXMLHTTP40
Const SoapServer = "http://www.example.com/webservice.asp"
'setup the XMLHTTP object and POST envelope to SoapServer
toResolve = 5 * 1000
toConnect = 5 * 1000
toSend = 15 * 1000
toReceive = 15 * 1000
xmlhttp.setTimeouts toResolve, toConnect, toSend, toReceive
xmlhttp.Open "POST", SoapServer, False 'YIELDS Run-Time Error 429 on this line: xmlhttp.Open
xmlhttp.setRequestHeader "Man", POST & " " & SoapServer & " HTTP/1.1"
xmlhttp.setRequestHeader "MessageType", "CALL"
xmlhttp.setRequestHeader "Content-Type", "text/xml"
xmlhttp.send (SoapEnvelope)
'synchronous wait for response; HTTP status other than 200 (OK) is an error
If xmlhttp.Status = 200 Then
Set xmldom = xmlhttp.responseXML 'get response into XML DOM document
Debug.Print (xmldom.xml) 'write soap response to screen
updateSqlServer = True
Else
'handle error
Debug.Print ("Didn't Work")
Debug.Print ("status=" & xmlhttp.Status) 'write soap return code
Debug.Print ("" & xmlhttp.statusText) 'write status text
updateSqlServer = False
End If
Set xmlhttp = Nothing
Set xmldom = Nothing
End Function
Things we have tried to fix this problem:
1. Added all necessary references (Ex. Microsoft XML)
2. Editing the permissions of the file and the folders
3. Registering ActiveX Controls
4. Made sure all options that can be changed match on both working and non-working machines
This is old and is not my code or design. I just have to fix it.
Any help will be appreciated.
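A coworker of mine figured out the answer. Those variables are MSXML 4 objects, and Windows 7 comes stock with MSXML 6.0 rather than MSXML 4, so the MSXML 4 objects cannot be created there unless MSXML 4 is installed.
Make sure to check the installed packages before anything else.
You can either install MSXML 4 or change the variables to their MSXML 6 equivalents (with a reference to Microsoft XML, v6.0), for example:
Dim xmldom As New MSXML2.DOMDocument60
instead of
Dim xmldom As New MSXML2.DOMDocument40
and likewise MSXML2.ServerXMLHTTP60 instead of MSXML2.ServerXMLHTTP40 for xmlhttp.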
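If the same database has to run on machines with different MSXML versions installed, one option is late binding with a fallback instead of a fixed reference. The following is only a sketch of that idea, not the fix that was actually applied here; CreateFirstAvailable is a helper name invented for the example, and the variables would have to be declared As Object rather than with New.

```vba
' Returns the first object that can be created from the given ProgIDs,
' or Nothing when none of them is registered on this machine.
Function CreateFirstAvailable(ByVal progIds As Variant) As Object
    Dim i As Long
    Dim obj As Object

    For i = LBound(progIds) To UBound(progIds)
        Set obj = Nothing
        On Error Resume Next          ' error 429 if the ProgID is missing
        Set obj = CreateObject(CStr(progIds(i)))
        On Error GoTo 0
        If Not obj Is Nothing Then Exit For
    Next i
    Set CreateFirstAvailable = obj
End Function
```

Usage could look like this: Set xmlhttp = CreateFirstAvailable(Array("MSXML2.ServerXMLHTTP.6.0", "MSXML2.ServerXMLHTTP.4.0")), followed by a check such as If xmlhttp Is Nothing Then to report that no suitable MSXML version is installed.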