I'm working with a block of code in VBA (Access 2016) to get the status of a MailChimp contact via their API:
With CreateObject("MSXML2.ServerXMLHTTP")
.Open Request, APIUrl & APIString & Criteria, False
.setrequestheader "Authorization", "Basic " & APIAuth
.send
APIResponse = .responsetext
If InStr(APIResponse, "{""exact_matches"":{""members"":[]") > 0 Then
MailExist = False
Else
MailExist = True
Set JSONControl = CreateObject("MSScriptControl.ScriptControl")
JSONControl.language = "Jscript"
APIString = "XXXXXXX"
.Open Request, APIUrl & APIString & Criteria, False
.setrequestheader "Authorization", "Basic " & APIAuth
.send
Set Subber = JSONControl.Eval("(" + .responsetext + ")")
If Subber.status = "subscribed" Then
MailSub = True
Else
MailSub = False
End If
End If
End With
Towards the end I use the 'Subber' object to see if the contact is subscribed or not
If Subber.status = "subscribed" Then
This has been working fine until recently, when the 'status' Property is suddenly being capitalised by the IDE, causing it to error out with an 'Object does not support this property or method' message.
I've done some digging and found a similar problem with a solution here, but after following the solutions the 'status' is still being capitalised. I've tried checking through the whole project for the word 'Status' with no results, and none of the controls on my forms use it either. I did have a table field with the name 'Status' but I've since renamed that so it shouldn't affect it, and I've also tried setting a public variable called 'status'. This worked to set the property back to all lowercase, but once the code is triggered the property goes back to 'Status', along with the variable name. I'm now stuck and can't figure out why the capitalisation occurs!
You can try CallByName
The CallByName function gets or sets a property, or invokes a method, at run time using a string name. Because the property name is passed as a string argument, the IDE's capitalisation of the identifier no longer matters. Use VbGet to retrieve the value.
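As a minimal sketch of that suggestion (the helper name GetPropertyByName is hypothetical, and it assumes the property holds a plain value such as a string rather than another object):

```vba
Option Explicit

' Hypothetical helper: reads a property by its string name, so the
' editor's automatic re-casing of identifiers cannot affect the lookup.
' Assumption: the property returns a scalar value, not an object.
Public Function GetPropertyByName(ByVal target As Object, ByVal propertyName As String) As Variant
    GetPropertyByName = CallByName(target, propertyName, VbGet)
End Function

' Usage with the Subber object from the question:
'   If GetPropertyByName(Subber, "status") = "subscribed" Then
'       MailSub = True
'   Else
'       MailSub = False
'   End If
```

The string "status" is matched at run time, so it keeps its lowercase spelling whatever the editor does to the identifier Status elsewhere in the project.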
I'm trying to retrieve a JSON response object through the query API below. When I read the responseText in VBA I get an empty result, but the exact same request returns correct data from Postman. Correct data is also returned when I send different request bodies. Whenever I execute Set Json = JsonConverter.ParseJson(strResponse), I get the error message Error Parsing JSON: ^ Expecting '{' or '['. Can you please help?
This is the VBA code:
Dim strUrl As String
Dim reqBody As String
'For search GOSS service API-Step1
strUrl = "https://gossrepo.ins.dell.com/gossv3/api/reporting/service/getrefid"
'create a method for calling HTTP services
Set hReq = CreateObject("MSXML2.XMLHTTP")
With hReq
.Open "POST", strUrl, blnAsync, False
reqBody = "{""methodType"":extract,""sourceApplication"":DSA,""searchParameter"":[{""conditionType"":term,""key"":global_bu_id,""value"":11},{""conditionType"":wildcard,""key"":customer_num,""value"":[530007546697]},{""conditionType"":range,""key"":order_date,""value"":[{""from"":2021-08-31,""to"":2021-09-09}]},{""conditionType"":sort,""key"":order_date_time,""value"":desc}],""pageSize"":40,""pageNum"":0}"
.SetRequestHeader "Content-type", "application/json"
.Send reqBody
While hReq.ReadyState <> 4
DoEvents
Wend
'read the response body so it can be parsed as JSON
strResponse = hReq.ResponseText
Debug.Print strResponse
End With
Set Json = JsonConverter.ParseJson(strResponse)
Update: the fixed version, with a different POST body:
Dim strUrl As String
Dim reqBody As String
'For search GOSS service API-Step1
strUrl = "https://gossrepo.us.dell.com/gossv3/api/reporting/service/getdata"
'create a method for calling HTTP services
Set hReq = CreateObject("MSXML2.XMLHTTP")
With hReq
.Open "POST", strUrl, blnAsync, False
reqBody = "{""methodType"":""details"",""sourceApplication"":""DSA"",""pageNum"":0,""pageSize"":300,""searchParameter"":[{""conditionType"":""term"",""key"":""global_bu_id"",""value"":""11""},{""conditionType"":""wildcard"",""key"":""customer_num"",""value"":[""" & ws & """]},{""conditionType"":""range"",""key"":""order_date"",""value"":[{""from"":""" & ws11 & """,""to"":""" & ws12 & """}]},{""conditionType"":""sort"",""key"":""order_date_time"",""value"":""desc""}]}"
.SetRequestHeader "Content-type", "application/json"
.Send reqBody
While hReq.ReadyState <> 4
DoEvents
Wend
'read the response body so it can be parsed as JSON
strResponse = hReq.ResponseText
End With
Set Json = JsonConverter.ParseJson(strResponse)
Your request is probably wrong, so you don't get the expected response. Look at the status that's returned (hReq.status and hReq.statusText); I bet it's 400 Bad Request or 500 Internal Server Error and not 200 OK. (You could also use an inspecting proxy like Fiddler to see exactly what you send and receive.)
I can already see that your request body is invalid JSON because it contains unquoted strings. It's not exactly the same as what you showed in Postman! That's likely the issue (or one of the issues). For example, you have "methodType": extract, but it has to be "methodType": "extract" (in VBA, ""methodType"": ""extract""). You did it correctly in Postman but wrong in your code.
As mentioned by CherryDT, your original reqBody had a lot of missing quotes. In your updated reqBody, you are missing quotes for order_date, and you also quoted the pageSize and pageNum values, which are numbers and therefore don't need quotes.
Below should give you the same JSON string as what you had in Postman:
reqBody = "{""methodType"":""extract"",""sourceApplication"":""DSA"",""searchParameter"":[{""conditionType"":""term"",""key"":""global_bu_id"",""value"":""11""},{""conditionType"":""wildcard"",""key"":""customer_num"",""value"":[""530007546697""]},{""conditionType"":""range"",""key"":""order_date"",""value"":[{""from"":""2021-08-31"",""to"":""2021-09-09""}]},{""conditionType"":""sort"",""key"":""order_date_time"",""value"":""desc""}],""pageSize"":40,""pageNum"":0}"
One way which has been working well for me so far is:
Copy the JSON string from Postman to Notepad
Open Replace dialog (Ctrl-H)
Enter " in Find What
Enter "" in Replace with
Click Replace All
Now you can copy the new string back into your VBA editor, and it should produce the same output as Postman's.
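An alternative to doubling every quote by hand is to write the JSON with single quotes in the VBA source and swap them at run time. This is only a sketch: the helper name ToJsonQuotes is hypothetical, and it assumes that no key or value contains an apostrophe.

```vba
Option Explicit

' Hypothetical helper: lets you write JSON with single quotes in VBA
' source and converts them to the double quotes that JSON requires.
' Assumption: no key or value contains an apostrophe.
Public Function ToJsonQuotes(ByVal singleQuoted As String) As String
    ToJsonQuotes = Replace(singleQuoted, "'", """")
End Function

Public Sub DemoToJsonQuotes()
    Dim reqBody As String
    reqBody = ToJsonQuotes("{'methodType':'extract','sourceApplication':'DSA','pageSize':40,'pageNum':0}")
    ' Prints: {"methodType":"extract","sourceApplication":"DSA","pageSize":40,"pageNum":0}
    Debug.Print reqBody
End Sub
```

Printing the body to the Immediate window before sending it makes it easy to compare against the Postman version character by character.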
I had a macro that went to a website using a value from column A (for example 517167000), pulled a value from a particular part of the page source, and returned that value to a cell.
The HTML source has changed now and I can't seem to get it to work.
My original code was
Public Function UnitPerBox(searchTerm As String) As String
Static request As Object
If request Is Nothing Then Set request = CreateObject("msxml2.xmlhttp")
With request
.Open "GET", "https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-" & searchTerm, False
.send
UnitPerBox = Trim(Split(Split(.responseText, "Units per box</td>")(1), "<tr")(0))
End With
End Function
So a working example of the website is
https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-517167000
So that you can go to the website and view the source.
The new HTML looks like the snippet below. It's been so long since I wrote the original macro that I assumed I could simply change
"Units per box</td>")(1), "<tr"
to
"Units per pack</td> <td class="value">")(1), "<tr"
since the HTML below is what is now on the site and I need the value (2.74, for example), but it's not working.
<tr>
<td class="name">Units per pack</td>
<td class="value">2.74</td>
</tr>
Any help would be much appreciated.
Cheers
If you are going to do text manipulation on .responseText with Split(), you might as well use a regular expression without setting its Global property:
Public Function UnitPerBox(searchTerm As String) As String
Static request As Object
If request Is Nothing Then Set request = CreateObject("msxml2.xmlhttp")
Dim RegEx As Object
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.Pattern = "\d+(?:\.\d+)?"
With request
.Open "GET", "https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-" & searchTerm, False
.send
UnitPerBox = RegEx.Execute(Split(.responsetext, "Units per pack</td>")(1))(0)
End With
End Function
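To try the same Split-plus-regex idea without hitting the website, here is a small offline sketch. The helper name ExtractUnitsPerPack is hypothetical, and the sample string is just the HTML fragment from the question.

```vba
Option Explicit

' Hypothetical helper: returns the first number that follows the
' "Units per pack" cell in the given HTML, or "" if none is found.
Public Function ExtractUnitsPerPack(ByVal html As String) As String
    Const MARKER As String = "Units per pack</td>"
    Dim parts() As String
    parts = Split(html, MARKER)
    If UBound(parts) < 1 Then Exit Function   ' marker not present

    Dim re As Object, found As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\d+(?:\.\d+)?"              ' integer or decimal number
    Set found = re.Execute(parts(1))
    If found.Count > 0 Then ExtractUnitsPerPack = found(0).Value
End Function

Public Sub DemoExtractUnitsPerPack()
    Dim sample As String
    sample = "<tr><td class=""name"">Units per pack</td>" & _
             "<td class=""value"">2.74</td></tr>"
    Debug.Print ExtractUnitsPerPack(sample)   ' prints 2.74
End Sub
```

The guard on UBound means a page without the marker yields an empty string instead of a "Subscript out of range" error.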
Neater (IMO), however, is to avoid text manipulation on the .responseText altogether and work through the HTML document, retrieving the data straight from the HTML table by element ID and table indexes:
Public Function UnitPerBox(searchTerm As String) As String
Static request As Object
If request Is Nothing Then Set request = CreateObject("msxml2.xmlhttp")
Dim htmlResponse As Object: Set htmlResponse = CreateObject("htmlfile")
With request
.Open "GET", "https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-" & searchTerm, False
.send
htmlResponse.body.innerHTML = .responseText
UnitPerBox = htmlResponse.body.document.getElementById("specifications").getElementsByTagName("tr")(10).getElementsByTagName("td")(1).innerText
End With
End Function
Note that the table is 0-indexed, meaning we are actually retrieving our value from the 11th row, second column. If you are not sure that the table content is always found at the same indexes, you could also loop over the rows:
Public Function UnitPerBox(searchTerm As String) As String
Static request As Object
If request Is Nothing Then Set request = CreateObject("msxml2.xmlhttp")
Dim htmlResponse As Object: Set htmlResponse = CreateObject("htmlfile")
Dim Rws As Object, Rw As Object
With request
.Open "GET", "https://larsonjuhl.co.uk/mouldings/larson-juhl-essentials/arq-essentials-moulding-" & searchTerm, False
.send
htmlResponse.body.innerHTML = .responseText
Set Rws = htmlResponse.body.document.getElementById("specifications").getElementsByTagName("tr")
For Each Rw In Rws
If Rw.getElementsByTagName("td")(0).InnerText = "Units per pack" Then
UnitPerBox = Rw.getElementsByTagName("td")(1).InnerText
Exit For
End If
Next
End With
End Function
While I personally prefer working with the HTML document over text manipulation, all of the options above retrieve your value =)
I am doing VBA web scraping to track the status of a FedEx shipment.
The URL is
https://www.fedex.com/apps/fedextrack/index.html?tracknumbers=475762806100&cntry_code=in
The current status of this shipment is delivered. I want to extract this.
I know how to select elements that have a class name.
The above website's HTML code does not have class names.
How do I select an element that does not have a specific class name using queryselector?
I am using MSXML2.XMLHTTP in my VBA code because it is the fastest method. One drawback is that getElementsByClassName does not work with it, and I have not found a way to make it work.
This is the reason I chose to use querySelector.
I am not able to pick the correct element using querySelector.
From the HTML below, how do I get the inner text of the element with the class
"redesignStatusChevronTVC tank-results-item__data-label-large tank-text-center statusChevron_key_status"?
<h3 class="redesignStatusChevronTVC tank-results-item__data-label-large tank-text-center statusChevron_key_status">Delivered</h3>
Sub GetInfo()
Dim sResponse As String, i As Long, html As New HTMLDocument
Dim prices As Object, info As Object
Application.ScreenUpdating = False
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.fedex.com/apps/fedextrack/index.html?tracknumbers=475762806100&cntry_code=in", False
.send
sResponse = .responseText
End With
With html
.body.innerHTML = sResponse
Set info = .querySelectorAll("redesignStatusChevronTVC tank-results-item__data-label-large tank-text-center statusChevron_key_status")
End With
With Worksheets(3)
Worksheets(3).Activate '
For i = 0 To info.Length - 1
Debug.Print info(i).innerText
Next i
End With
Application.ScreenUpdating = True
End Sub
Assuming I'm reading things correctly on my end, I think the HTML returned by the server does not contain the information you're looking for.
To confirm this, try printing:
InStr(1, sResponse, "redesignStatusChevronTVC tank-results-item__data-label-large tank-text-center statusChevron_key_status", vbTextCompare)
to the immediate window and you should see it return 0 (meaning that text is not present within the response text).
The information you're trying to scrape (and even the element which needs selecting with .redesignStatusChevronTVC.tank-results-item__data-label-large.tank-text-center.statusChevron_key_status) is populated dynamically via JavaScript and does not exist at the time you're trying to access it.
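For reference, the dotted selector above is just the class attribute with every space-separated class name prefixed by a dot. A small sketch of that conversion follows; the helper name ClassListToSelector is hypothetical. It will still find nothing on this particular page, because the element is not in the server's HTML.

```vba
Option Explicit

' Hypothetical helper: converts a class attribute value such as "a b c"
' into the compound CSS selector ".a.b.c".
Public Function ClassListToSelector(ByVal classAttribute As String) As String
    Dim token As Variant
    For Each token In Split(Trim$(classAttribute), " ")
        If Len(token) > 0 Then                 ' skip repeated spaces
            ClassListToSelector = ClassListToSelector & "." & token
        End If
    Next token
End Function

' Usage on a page where the element really is present in the HTML:
'   Set info = html.querySelectorAll(ClassListToSelector("tank-text-center statusChevron_key_status"))
```

Passing the raw class string to querySelectorAll, as in the question, is read as a chain of descendant tag names rather than class names, which is why it matched nothing even before the JavaScript issue.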
From what I can see, the web page makes an HTTP POST request, and the server returns some JSON, which represents information relating to that tracking number. Try the code below (run the procedure JustATest), which tries to make the same HTTP POST request:
Option Explicit
Private Sub JustATest()
MsgBox "Delivery status is: " & GetDeliveryStatusForPackage("475762806100", "en_IN")
End Sub
Private Function GetDeliveryStatusForPackage(ByVal trackingNumber As String, ByVal localeValue As String) As String
' Given a "trackingNumber" and "localeValue", should return the delivery status of that package.
Dim jsonResponse As String
jsonResponse = GetFedExJson(trackingNumber, localeValue)
GetDeliveryStatusForPackage = ExtractDeliveryStatusFromJson(jsonResponse)
End Function
Private Function ExtractDeliveryStatusFromJson(ByVal someJson As String) As String
' Should extract the delivery status. This function treats the JSON
' encoded string as a string and hence relies on basic string matching.
Const START_DELIMITER As String = """keyStatus"":"""
Dim startDelimiterIndex As Long
startDelimiterIndex = InStr(1, someJson, START_DELIMITER)
Debug.Assert startDelimiterIndex > 0
startDelimiterIndex = startDelimiterIndex + Len(START_DELIMITER)
Dim endDelimiterIndex As Long
endDelimiterIndex = InStr(startDelimiterIndex, someJson, """", vbBinaryCompare) ' search from the first character of the value so an empty value is handled too
Debug.Assert endDelimiterIndex > 0
ExtractDeliveryStatusFromJson = Mid$(someJson, startDelimiterIndex, endDelimiterIndex - startDelimiterIndex)
End Function
Private Function GetFedExJson(ByVal trackingNumber As String, ByVal localeValue As String) As String
' Should return a JSON-encoded response. The return value can be
' passed to a function that parses JSON (if such a function is available for use).
Dim formToPost As String
formToPost = CreateFedExForm(trackingNumber, localeValue)
Const TARGET_URL As String = "https://www.fedex.com/trackingCal/track"
With CreateObject("MSXML2.XMLHTTP")
.Open "POST", TARGET_URL, False
.SetRequestHeader "Connection", "keep-alive"
.SetRequestHeader "Content-Type", "application/x-www-form-urlencoded"
.SetRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"
.Send formToPost
Debug.Assert InStr(1, .ResponseText, "{""TrackPackagesResponse"":{""successful"":true,", vbBinaryCompare)
GetFedExJson = .ResponseText
End With
End Function
Private Function CreateFedExForm(ByVal trackingNumber As String, ByVal localeValue As String) As String
' Should return a string representing a form of URL encoded name-value pairs.
Dim data As String
data = "{""TrackPackagesRequest"":{""appType"":""WTRK"",""appDeviceType"":""DESKTOP"",""supportHTML"":true,""supportCurrentLocation"":true,""uniqueKey"":"""",""processingParameters"":{},""trackingInfoList"":[{""trackNumberInfo"":{""trackingNumber"":""" & trackingNumber & """,""trackingQualifier"":"""",""trackingCarrier"":""""}}]}}"
CreateFedExForm = "data=" & Application.EncodeURL(data) & "&action=trackpackages&locale=" & Application.EncodeURL(localeValue) & "&version=1&format=json"
End Function
If it works, then the function GetDeliveryStatusForPackage seems capable of returning the delivery status of a given trackingNumber and localeValue.
It's worth noting that the JSON returned by the server contains other information (which you didn't ask for in your question, but you might find relevant/useful). It's too long to post here, but you can explore it for yourself.
I think it might be possible to get information for multiple tracking numbers in a single request. (I say this because in the request TrackPackagesRequest.trackingInfoList is an array -- and in the response, TrackPackagesResponse.packageList is also an array). It's just a supposition/rational guess at this stage, but might be something that can potentially reduce how long your code takes to finish.
It might be worth getting a VBA module (https://github.com/VBA-tools/VBA-JSON) which supports JSON parsing. I didn't bother, since you only wanted the delivery status. But deserialising the response would be the proper way to do it (especially in terms of accessing the correct property path).
You might also want to check if the terms, which govern your usage of their website, expressly forbid web scraping or any other similar activities.
Regarding nested keyStatus property's value being "In transit" for invalid tracking numbers, check property path TrackPackagesResponse.packageList[0].errorList[0], where there is an object. For invalid tracking numbers it seems to be {"code":"1041","message":"This tracking number cannot be found. Please check the number or contact the sender."... -- and for valid tracking numbers, both the code and message properties appear to be zero-length strings.
It might be good to now get the VBA JSON module that I mention above, since there are two errorList objects (at different levels of nesting) and you want to make sure you're accessing the correct one.
The change required in the code would probably be to first check if the code and message properties of TrackPackagesResponse.packageList[0].errorList[0] indicate the tracking number is invalid (and return message if invalid). Otherwise, return TrackPackagesResponse.packageList[0].keyStatus. I don't have time to implement these changes right now. But I think it's something you can do (unless you're really unsure, in which case let me know which bit you need help with).
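A rough sketch of the change described above might look like this. It assumes the VBA-JSON module (JsonConverter) has been imported into the project, and it takes the property paths from the description above without verifying them against a live response. The function name StatusOrError is hypothetical.

```vba
Option Explicit

' Sketch only: requires the VBA-JSON module (JsonConverter) in the project.
' Property paths follow the description above and are not verified.
Public Function StatusOrError(ByVal someJson As String) As String
    Dim parsed As Object, pkg As Object, firstError As Object
    Set parsed = JsonConverter.ParseJson(someJson)

    ' VBA-JSON returns JSON arrays as Collections, which are 1-based,
    ' so packageList[0] in the JSON is item (1) here.
    Set pkg = parsed("TrackPackagesResponse")("packageList")(1)
    Set firstError = pkg("errorList")(1)

    If Len(firstError("code")) > 0 Then
        StatusOrError = firstError("message")   ' e.g. tracking number not found
    Else
        StatusOrError = pkg("keyStatus")
    End If
End Function
```

With this in place, GetDeliveryStatusForPackage could return StatusOrError(jsonResponse) instead of calling ExtractDeliveryStatusFromJson, provided the response has the shape described.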
I am writing VBA to open the central bank's website, enter values, and extract the resulting data. I usually do this on the mail site, the Bank of Brazil site, etc.
() of the central bank
I cannot set the input value in the textbox via VBA. I've already tried:
Ie.Document.all.Item("valueConverter").Innertext="1"
Ie.Document.getElementById("valueConverter").Value="1"
Ie.Document.getElementById("valueConverter")(0).Value="1"
Ie.Document.getElementByName("valueConverter").Value = "1"
The element on this site is:
<Input type = "text" name = "valueConverter" maxlength = "17" size "20" value onkeypress = "return (MascaraMoeda (this, '.', ',', Event)
Does anyone know how?
tl;dr;
I cannot mark this as a duplicate as there is no accepted answer to where I posted an answer to a similar question.
Not sure of the protocol as simply posting a link in the comments doesn't mean it will be found again.
My full answer is here: Excel Web Query Submit Issues
To summarize:
You can use the bcb.gov.br Open Data Portal.
Send a request for a JSON response with the conversion rates from their Exchange rates – daily bulletins.
With the received response, amongst other methods, you can then:
Use the JSON Converter .bas module to convert the response into a JSON object and work with that
Parse the response as a string with a regex to get the values
For brevity, I will give you just the second method here and you can view my other answer for both methods:
Public Sub GetInfo2()
Dim strURL As String, strJSON As String, item As Variant, http As Object, json As Object
Const TARGET_CURRENCY As String = "USD"
Const START_DATE As String = "06-13-2018"
Const END_DATE As String = "06-13-2018"
strURL = "https://olinda.bcb.gov.br/olinda/service/PTAX/version/v1/odata/ExchangeRatePeriod(moeda=#moeda,dataInicial=#dataInicial,dataFinalCotacao=#dataFinalCotacao)?%40moeda=%27" & TARGET_CURRENCY & "%27&%40dataInicial=%27" & START_DATE & "%27&%40dataFinalCotacao=%27" & END_DATE & "%27&%24format=json"
Set http = CreateObject("MSXML2.XMLHTTP")
http.Open "GET", strURL, False
http.send
strJSON = http.responseText
Dim Matches As Object
With CreateObject("VBScript.RegExp")
.Global = True
.MultiLine = True
.IgnoreCase = False
.Pattern = """cotacaoCompra"":\d{1,}\.\d{1,}" 'The pattern I really wanted, "(?<=""cotacaoCompra"":)\d{1,}\.\d{1,}", uses a lookbehind, which VBScript.RegExp does not support
If Not .test(strJSON) Then Exit Sub
Set Matches = .Execute(strJSON)
Dim match As Object
For Each match In Matches
Debug.Print Replace(match, """cotacaoCompra"":", vbNullString)
Next
End With
End Sub
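Since the lookbehind mentioned in the code comment is not available, a capture group gives the same effect: match the key and the number together, then read only the captured number from SubMatches. This is a sketch, and the helper name ExtractBuyRates is hypothetical.

```vba
Option Explicit

' Hypothetical helper: collects every cotacaoCompra value as text.
' A capture group stands in for the unsupported lookbehind.
Public Function ExtractBuyRates(ByVal strJSON As String) As Collection
    Dim results As New Collection
    Dim re As Object, m As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = """cotacaoCompra"":(\d+\.\d+)"   ' group 1 = the number only
    For Each m In re.Execute(strJSON)
        results.Add m.SubMatches(0)
    Next m
    Set ExtractBuyRates = results
End Function

' Usage:
'   Dim rate As Variant
'   For Each rate In ExtractBuyRates(strJSON)
'       Debug.Print rate
'   Next rate
```

This removes the need for the Replace call that strips the key name from each match.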
I want to extract a U.S. Patent title from a url like
http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=6293874.PN.&OS=PN/6293874&RS=PN/6293874
(Update: as pointed out in the comments, the patent title is not labeled "Title"; however, it consistently appears by itself above "Abstract" on the web page.) In most cases it is in the 7th child element of "body" or the 3rd "font" element in the document, but occasionally a notice at the top of the page to "** Please see images for: ( Certificate of Correction ) **" or "( Reexamination Certificate )" messes up both methods of extraction by inserting one additional child of "body" and three additional "font" elements before you get to the title.
However, the title seems to be consistently the first "font" element with the attribute "size" having a value of "+1". Unfortunately other elements have size="-1", including the aforementioned elements that are not always present, so it has to be specifically with that attribute and value. I have searched but can't figure out how to get elements by attribute and value. Here is my code:
Function Test_UpdateTitle(url As String)
Dim title As String
Dim pageSource As String
Dim xml_obj As XMLHTTP60
Set xml_obj = CreateObject("MSXML2.XMLHTTP")
xml_obj.Open "GET", url, False
xml_obj.send
pageSource = xml_obj.responseText
Set xml_obj = Nothing
Dim html_doc As HTMLDocument
Set html_doc = CreateObject("HTMLFile")
html_doc.body.innerHTML = pageSource
Dim fontElement As IHTMLElement
'Methods 1 and 2 fail in cases of a certificate of correction or reexamination certificate
'Method 1
' Dim body As IHTMLElement
' Set body = html_doc.getElementsByTagName("body").Item(0)
' Set fontElement = body.Children(6)
'Method 2
' Set fontElement = html_doc.getElementsByTagName("font").Item(3)
'Method 3
Dim n As Integer
For n = 3 To html_doc.getElementsByTagName("font").Length - 1
Set fontElement = html_doc.getElementsByTagName("font").Item(n)
If InStr(fontElement.innerText, "Please see") = 0 And _
InStr(fontElement.innerText, "( Certificate of Correction )") = 0 And _
InStr(fontElement.innerText, "( Reexamination Certificate )") = 0 And _
InStr(fontElement.innerText, " **") = 0 Then
Test_UpdateTitle = fontElement.innerText
Exit Function
End If
Next n
End Function
I should add that the " **" check is not working to skip the last element <b> **</b>, and I am getting " **" as the title where there is a notice to please see images. Is the asterisk a wildcard character in this context?
You can try this. As long as it's the first font tag with the size attribute and a value of "+1", this should work. I only tested with 3 different pages, but they all returned the correct results.
Function Test_UpdateTitle(url)
title = "Title Not Found!"
Set xml_obj = CreateObject("MSXML2.XMLHTTP")
xml_obj.Open "GET", url, False
xml_obj.send
pageSource = xml_obj.responseText
Set xml_obj = Nothing
Set document = CreateObject("HTMLFile")
document.write pageSource
For i = 0 To document.getElementsByTagName("font").length - 1
If document.getElementsByTagName("font")(i).size = "+1" Then
title = document.getElementsByTagName("font")(i).innerText
Exit For
End If
Next
Test_UpdateTitle = title
End Function
MsgBox Test_UpdateTitle("http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=6293874.PN.&OS=PN/6293874&RS=PN/6293874")
MsgBox Test_UpdateTitle("http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&p=1&u=%2Fnetahtml%2FPTO%2Fsearch-bool.html&r=1&f=G&l=50&co1=AND&d=PTXT&s1=fight.TI.&OS=TTL/fight&RS=TTL/fight")
MsgBox Test_UpdateTitle("http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.htm&r=14&f=G&l=50&d=PTXT&p=1&S1=search&OS=search&RS=search")
This answer is somewhat incomplete because my Excel won't do these lines:
Dim xml_obj As XMLHTTP60
Set xml_obj = CreateObject("MSXML2.XMLHTTP")
But I think this might be a preferred approach.
Instead of using USPTO's site, what about using Google's?
Hit this URL: http://www.google.com/patents/US6293874
Note that the patent number is apparent in that URL.
Then, in your function, just pull that tag named invention-title.
Set titleElement = html_doc.getElementsByTagName("invention-title").Item(0)
title = titleElement.innerText
MsgBox(title)
If you check the source on that page, there's only one of those.
If you're open to this alternative approach, it would be relatively easy to parse patent numbers from the URLs you have, and I think the extraction of invention-title would be much more reliable.
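As a sketch of that parsing step, assuming the USPTO URLs carry the number as s1=<digits>.PN. like the one in the question (URLs from other search types would not match), with GooglePatentUrl as a hypothetical helper name:

```vba
Option Explicit

' Hypothetical helper: pulls the patent number out of a USPTO URL that
' contains "s1=<number>.PN." and builds the Google Patents URL from it.
' Returns "" when the URL does not have that shape.
Public Function GooglePatentUrl(ByVal usptoUrl As String) As String
    Dim re As Object, found As Object
    Set re = CreateObject("VBScript.RegExp")
    re.IgnoreCase = True
    re.Pattern = "[?&]s1=(\d+)\.PN\."
    Set found = re.Execute(usptoUrl)
    If found.Count > 0 Then
        GooglePatentUrl = "http://www.google.com/patents/US" & found(0).SubMatches(0)
    End If
End Function
```

For the URL in the question this yields http://www.google.com/patents/US6293874, which is the address given above.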
See if this answer is working as intended. Make sure you have references to the following libraries in your workbook:
Microsoft XML, v6.0
Microsoft HTML Object Library
If you are not sure how to add these to Excel, have a read of this link: Link to reference adding
Option Explicit
Sub Test()
Debug.Print Test_UpdateTitle("http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=6293874.PN.&OS=PN/6293874&RS=PN/6293874")
End Sub
Function Test_UpdateTitle(ByVal strURL As String) As String
Dim oHTTP As MSXML2.XMLHTTP60
Dim oDoc As MSHTML.HTMLDocument
Dim oFontTags As Variant
Dim oFontTag As HTMLFontElement
Dim strInnerText As String
Dim strSize As String
' Create the http object and send it.
Set oHTTP = New MSXML2.XMLHTTP60
oHTTP.Open "GET", strURL, False
oHTTP.send
' Make sure that we get a response back
If oHTTP.Status = 200 Then
Set oDoc = New HTMLDocument
oDoc.body.innerHTML = oHTTP.responseText
Set oFontTags = oDoc.getElementsByTagName("font")
' Go through all the tags.
For Each oFontTag In oFontTags
'Get the inner text and size of each tag.
strInnerText = oFontTag.innerText
strSize = oFontTag.getAttributeNode("size").Value
'Compare to make sure you have what's needed
If InStr(strInnerText, "Please see") = 0 And _
InStr(strInnerText, "( Certificate of Correction )") = 0 And _
InStr(strInnerText, "( Reexamination Certificate )") = 0 And _
InStr(strInnerText, " **") = 0 Then
If strSize = "+1" Then
Test_UpdateTitle = strInnerText
Exit Function
End If
End If
Next oFontTag
End If
End Function
I hope this helps. :)
In short:
Debug.Print html_doc.querySelector("font[size=+1]").innerText
tl;dr;
① CSS selector:
No need for any of the long winded methods. You have stated the styling pattern. Use a CSS selector to grab it.
font[size=+1]
This reads as: a font tag whose size attribute has the value +1. You may need font[size='+1'] in VBA, so try both.
② CSS query:
③ VBA:
As it is the first match you want you can use the querySelector method of document to apply the selector and retrieve a single element.
html_doc.querySelector("font[size=+1]")
You may need to add a reference to HTML Object Library and use an early bound call of Dim html_doc As HTMLDocument to access the method. The late bound method may expose the querySelector method but if the interface doesn't then use early binding.
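Putting the pieces together, an early-bound sketch could look like the following. It assumes a reference to the Microsoft HTML Object Library, uses the quoted form of the selector, and the function name FirstPlusOneFontText is hypothetical.

```vba
Option Explicit

' Sketch, assuming a reference to "Microsoft HTML Object Library".
' Returns the text of the first <font size="+1"> element, or "".
Public Function FirstPlusOneFontText(ByVal pageSource As String) As String
    Dim html_doc As MSHTML.HTMLDocument
    Set html_doc = New MSHTML.HTMLDocument
    html_doc.body.innerHTML = pageSource

    Dim node As Object
    Set node = html_doc.querySelector("font[size='+1']")
    ' querySelector returns Nothing when no element matches.
    If Not node Is Nothing Then FirstPlusOneFontText = node.innerText
End Function
```

Checking for Nothing before reading innerText avoids a run-time error on pages that have no matching font element.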