How to extract something between <!-- --> using VBA? - html

I'm trying to scrape a page using VBA. I know how to get elements by id, class, and tag name. But now I have come across this tag:
<!-- <b>IE CODE : 3407004044</b> -->
Now after searching on the internet I know that this is a comment in the HTML, but what I'm unable to find is the tag name of this element, if it qualifies as a tag at all. Should I use
document.getElementsByTagName("!")?
If not, how else can I extract these comments?
EDIT:
I have a bunch of these td elements within tr elements and I want to extract IE Code : 3407004044
Below is a larger set of HTML code:
<tr align="left">
<td width="50%" class="subhead1">
' this is the part that I want to extract
<!-- <b>IE CODE : 3108011111</b> -->
</td>
<td rowspan="9" valign="top">
<span id="datalist1_ctl00_lbl_p"></span>
</td>
</tr>
Thanks!

Try something like this; it works as a starting point and can be refined further:
Option Explicit
Public Sub TestMe()
Dim myString As String
Dim cnt As Long
Dim myArr As Variant
myString = "<!-- <b>IE CODE : Koj sega e</b> -->blas<hr>My Website " & _
"is here<B><B><B><!-- <b>IE CODE : nomer </b> -->" & _
"is here<B><B><B><!-- <b>IE CODE : 1? </b> -->"
myString = Replace(myString, "-->", "<!--")
myArr = Split(myString, "<!--")
For cnt = LBound(myArr) To UBound(myArr)
If cnt Mod 2 = 1 Then Debug.Print myArr(cnt)
Next cnt
End Sub
This is what you get:
<b>IE CODE : Koj sega e</b>
<b>IE CODE : nomer </b>
<b>IE CODE : 1? </b>
The idea is the following:
Replace the --> with <!--
Split the input by <!--
Take every second value from the array
There are some scenarios where it will not work, e.g. if --> or <!-- appears somewhere within the text itself, but in the general case it should be OK.
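An alternative to the Replace/Split idea above is a regular expression that matches the comment directly. The following is only a minimal sketch: it assumes every comment has exactly the shape shown in the question (a <b> element containing "IE CODE :" followed by digits), and the names ExtractIeCodes and TestExtractIeCodes are made up for illustration. It uses the late-bound VBScript.RegExp object, so no reference needs to be set.

```vba
Option Explicit

' Hypothetical helper: returns the digits of every "IE CODE : <digits>"
' that sits inside an HTML comment of the form <!-- <b>...</b> -->.
Public Function ExtractIeCodes(ByVal html As String) As Collection
    Dim re As Object
    Dim m As Object
    Dim result As New Collection

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True        ' find all matches, not only the first one
    re.IgnoreCase = True    ' tolerate <B> as well as <b>
    re.Pattern = "<!--\s*<b>\s*IE CODE\s*:\s*(\d+)\s*</b>\s*-->"

    For Each m In re.Execute(html)
        result.Add m.SubMatches(0)   ' the captured digits
    Next m

    Set ExtractIeCodes = result
End Function

Public Sub TestExtractIeCodes()
    Dim code As Variant
    ' Sample input taken from the question's HTML
    For Each code In ExtractIeCodes("<td><!-- <b>IE CODE : 3108011111</b> --></td>")
        Debug.Print code
    Next code
End Sub
```

In a real scrape the html argument would be the page source, for example the responseText of an HTTP request or the innerHTML of the surrounding td element. If the comments on the page vary in format, the pattern would need to be loosened accordingly.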

You can use XPath:
substring-before(substring-after(//tr//comment(), "<b>"), "</b>")
to get required data

Related

VBA Webscrape HTML Coinmarketcap

Trying to scrape the number of cryptos in the top left corner of https://coinmarketcap.com/. I tried to find the "tr" but could not. Not sure how to grab that value up the top left of the page.
Here is what I have so far and I am being thrown a runtime error 438 Object doesn't support this property or method.
Sub cRYP()
Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")
With appIE
.Navigate "https://coinmarketcap.com/"
.Visible = True
End With
Do While appIE.Busy
DoEvents
Loop
Set allRowOfData = appIE.Document.getElementById("__next")
Dim myValue As String: myValue = allRowOfData.Cells(16).innerHTML
appIE.Quit
Range("A1").Value = myValue
End Sub
There is no tr tag because there is no table. You cannot get the value directly, so first get the HTML structure that contains it: the element with the class name container. Because the method getElementsByClassName() returns a node collection, you must pick the right element by its index in the collection. That's easy because it's the first one, and the first index of a collection is 0, as in an array.
Then you have this HTML structure:
<div class="container">
<div><span class="sc-2bz68i-0 cVPJov">Cryptos
<!-- -->: 17.826
</span><span class="sc-2bz68i-0 cVPJov">Exchanges
<!-- -->: 459
</span><span class="sc-2bz68i-0 cVPJov">Market Cap
<!-- -->: €1,536,467,483,857
</span><span class="sc-2bz68i-0 cVPJov">24h Vol
<!-- -->: €105,960,257,048
</span><span class="sc-2bz68i-0 cVPJov">Dominance
<!-- -->: <a href="/charts/#dominance-percentage" class="cmc-link">BTC
<!-- -->:
<!-- -->42.7%
<!-- -->
<!-- -->ETH
<!-- -->:
<!-- -->18.2%
</a>
</span><span class="sc-2bz68i-0 cVPJov"><span class="icon-Gas-Filled" style="margin-right:4px;vertical-align:middle"></span>ETH Gas
<!-- -->: <a>35
<!-- -->
<!-- -->Gwei<span class="sc-2bz68i-1 cEFmtT icon-Chevron-down"></span>
</a>
</span></div>
<div class="rz95fb-0 jKIeAa">
<div class="sc-16r8icm-0 cPgeGh nav-item"></div>
<div class="rz95fb-1 rz95fb-2 eanzZL">
<div class="cmc-popover">
<div class="cmc-popover__trigger"><button title="Change your language" class="sc-1kx6hcr-0 eFEgkr"><span class="sc-1b4wplq-1 kJnRBT">English</span><span class="sc-1b4wplq-0 ifkbzu"><span class="icon-Caret-down"></span></span></button></div>
</div>
</div>
<div class="rz95fb-1 cfBxiI">
<div><button title="Select Currency" data-qa-id="button-global-currency-picker" class="sc-1kx6hcr-0 eFEgkr"><span class="sc-1q0bpva-0 hEPBWj"></span><span class="sc-1bafwtq-1 dUQeWc">EUR</span><span class="sc-1bafwtq-0 cIzAJN"><span class="icon-Caret-down"></span></span></button></div>
</div><button type="button" class="sc-1kx6hcr-0 rz95fb-6 ccLqrB cmc-theme-picker cmc-theme-picker--day"><span class="icon-Moon"></span></button>
</div>
</div>
As you can see, the wanted value is part of the first a tag in the scraped structure. We can get that tag with the method getElementsByTagName(), which also returns a node collection. Again we need the first element of the collection, at index 0.
Then we have this:
17.826
Now we only need the innerText of this element, and that's it.
Here is the VBA code. I don't use IE because it has finally reached end of life and shouldn't be used anymore. You can load coinmarketcap without any parameters via XHR (XMLHttpRequest):
Sub CryptosCount()
Const url As String = "https://coinmarketcap.com/"
Dim doc As Object
Dim nodeCryptosCount As Object
Set doc = CreateObject("htmlFile")
With CreateObject("MSXML2.XMLHTTP.6.0")
.Open "GET", url, False
.Send
If .Status = 200 Then
doc.body.innerHTML = .responseText
Set nodeCryptosCount = doc.getElementsByClassName("container")(0).getElementsByTagName("a")(0)
MsgBox "Number of cryptocurrencies on Coinmarketcap: " & nodeCryptosCount.innertext
Else
MsgBox "Page not loaded. HTTP status " & .Status
End If
End With
End Sub
Edit
As I see now, there is a possibility to get the value directly by using
getElementsByClassName("cmc-link")(0)
You can play with the code to learn more.

Web scraping using excel VBA

I am looking at the HTML below:
<h1 class="wer wer">
<a href="http://somelink.com" rel="bookmark" title="Permanent Link to Title of this page that covers some random topic">
Short title of this page...</a>
</h1>
I am currently using the below code to pull out innertext ("Short title of this page...")
For Each ele In .document.all
Select Case ele.classname
Case "wer wer"
RowCount = RowCount + 1
sht.Range("A" & RowCount) = ele.innertext
End Select
Next ele
How can I modify this code to pull out title ("Permanent Link to Title of this page that covers some random topic") and href ("http://somelink.com")?
Any help would be much appreciated. Thanks.
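Select the element with a CSS attribute selector.
.document.querySelector("a[href='http://somelink.com']").innerText
a[href='http://somelink.com'] is a CSS selector that matches the first a element whose href equals 'http://somelink.com'. The attribute value has to be quoted because it contains characters such as : and /. On the matched element, .getAttribute("title") and .getAttribute("href") return the title and the link instead of the inner text.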
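If the href is not known in advance, the title and href can also be read by looping over the headings, in the same spirit as the loop in the question. This is a rough sketch, not a drop-in replacement: it parses a hard-coded copy of the question's HTML into a late-bound htmlfile document, and the Sub name PrintLinkTitleAndHref is invented for illustration. In the original code, the loop body would run against .document instead of doc.

```vba
Option Explicit

Public Sub PrintLinkTitleAndHref()
    Dim doc As Object
    Dim heading As Object
    Dim links As Object

    ' Assumption: a small inline sample stands in for the real page
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = "<h1 class=""wer wer"">" & _
        "<a href=""http://somelink.com"" rel=""bookmark"" " & _
        "title=""Permanent Link to Title of this page"">" & _
        "Short title of this page...</a></h1>"

    For Each heading In doc.getElementsByTagName("h1")
        If heading.className = "wer wer" Then
            Set links = heading.getElementsByTagName("a")
            If links.Length > 0 Then
                ' Attributes of the first link inside the heading
                Debug.Print links(0).getAttribute("title")
                Debug.Print links(0).getAttribute("href")
                Debug.Print links(0).innerText
            End If
        End If
    Next heading
End Sub
```

To write the values to a sheet as in the question, the Debug.Print lines could be replaced with assignments such as sht.Range("B" & RowCount) = links(0).getAttribute("title").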

Using Classic Asp variable to update CSS

I have a classic ASP page. On this page I need to hide a table based on whether the database query that fills it returns any results. If the table is empty, then the header is hidden. <table> doesn't have a visible or display attribute, so I am wrapping it in a <div>. However, when the page executes, the CSS isn't applied.
.hideDiv {
display: <%=vis%>;
}
<div class="hideDiv">
<table>
<!-- Table elements -->
<%
' Other code
If count > 0 Then
vis = "block"
Else
vis = "none"
End If
' The vis variable is not updated past this point
%>
</table>
</div>
I think you have a few options.
Here's an old fashioned method.
Option 1:
Instead of having your CSS determine whether the table is shown or hidden, have the If count > 0 check do the work server side.
If count > 0 Then
Response.Write("<table>" & vbCrLf)
'# Write your table tags and your code here.
Response.Write("</table>" & vbCrLf)
End If
If you must write CSS for your script you typically need to write the script twice, so you can have your CSS embedded in your header correctly.
Option 2:
Place this in the header.
<%
Dim vis
If count > 0 Then
vis = "block"
Else
vis = "none"
End If
Response.Write("<style type=""text/css"">" & vbCrLf)
Response.Write(" .hideDiv {" & vbCrLf)
Response.Write(" display: "&vis&";" & vbCrLf)
Response.Write("}" & vbCrLf)
Response.Write("</style>" & vbCrLf)
%>
Then you can place your table in the body.
<div class="hideDiv">
<table>
<!-- Table elements -->
</table>
</div>
Option 3:
You can inline your CSS and make it work. At least it should, as long as your code sets vis before the <div> is written.
<%
Dim vis
If count > 0 Then
vis = "block"
Else
vis = "none"
End If
%>
<div style="display:<%=vis%>;">
<table>
<!-- Table elements -->
</table>
</div>
Often in Classic ASP we need a small script to check whether the table data exists. Remember that code outside of functions and subs executes left to right, top to bottom.
The count > 0 needs to trigger the building of your CSS so it can include the vis to your <Div> element.
If you're getting your Count value after running your SQL then you might need to setup that second script to test if you have data for your table then build your CSS.
Example:
Function MyCount()
Dim Count
Count = 0
SQL = "SELECT TOP 1 ID FROM Table WHERE FIELD1 IS NOT NULL"
'# Run the query and open the recordset rs here (omitted)
If rs.EOF=False Then
count = 1
End If
MyCount = count
End Function
We can then mix the examples above so the table is only shown when there is data for it.
<header>
<%
Dim vis
If MyCount() = 1 Then
vis = "block"
Else
vis = "none"
End If
%>
</header>
In the body you could then use something like the following.
<div style="display:<%=vis%>;">
<table>
<!-- Table elements -->
</table>
</div>
In your post you are actually calling the <%=vis%> before you set it.
Top to Bottom, Left to Right, reorder your code.
Put the code below at the top of the page, then test again:
If count > 0 Then
vis = "block"
Else
vis = "none"
End If
The code below works on my machine:
<%
' Other code
If count > 0 Then
vis = "block"
Else
vis = "none"
End If
' The vis variable is not updated past this point
%>
.hideDiv {
display: <%=vis%>;
}
<div class="hideDiv">
<table>
<!-- Table elements -->
</table>
</div>

How to output result inside HTA window instead of a pop up message box?

I have a button in an HTA file; it searches for certain strings and then outputs the result (thanks to @Ansgar Wiechers). I want to output the result inside the HTA window instead of popping up a message box. The result should fill in the blank in "YOU ARE IN _____ MODE".
How can I do it?
<html>
<head>
<title></title>
<HTA:APPLICATION
APPLICATIONNAME=""
ID=""
VERSION="1.0"/>
</head>
<script language="VBScript">
Sub RUNCURRENTMODE
Set xml = CreateObject("Msxml2.DOMDocument.6.0")
xml.async = False
xml.load "C:\aaa\settings.xml"
If xml.ParseError Then
MsgBox xml.ParseError.Reason
self.Close() 'or perhaps "Exit Sub"
End If
For Each n In xml.SelectNodes("//CSVName")
Select Case n.Attributes.GetNamedItem("Value").Text
Case "standard.csv" : MsgBox "This is standard."
Case "non-standard.csv" : MsgBox "This is non-standard."
Case Else : MsgBox "Unexpected value."
End Select
Next
End Sub
</script>
<body bgcolor="buttonface">
<center>
<p><font face="verdana" color="red">YOU ARE CURRENTLY IN STANDARD CSV MODE</font></p>
<input id=runbutton class="button" type="button" value="CURRENT MODE" name="db_button" onClick="RUNCURRENTMODE" style="width: 170px"><p>
</center>
</body>
</html>
Normally you'd put an element with an ID into the <body> section of the HTA, for instance a paragraph. Or a span, since you want to update only a portion of the text:
<body>
<p>You are in <span id="FOO"></span> mode</p>
</body>
and change the value of the element with that ID in your function:
Select Case n.Attributes.GetNamedItem("Value").Text
Case "standard.csv" : FOO.innerText = "standard."
Case "non-standard.csv" : FOO.innerText = "non-standard."
Case Else : MsgBox "Unexpected value."
End Select

mixing asp and html codes

I have this code, which contains HTML and ASP code:
<%for each x in rs.Fields%>
<%IF (x.name="ID") THEN%>
<%dim i
i=x.value%>
<td><a href="form7.asp?id="+<%i%>>
<%Response.Write(x.value)%><a/>
I want to use the i variable inside the HTML code.
Or another example:
<%id=request("id")%>
<%=id%>
<tr>
<th>Name:</th>
<td><input name="n"></input></td>
I want to use id in the input tag as its value, i.e. value=id.
How do I do that? Can someone help me, please?
First, a basic ASP design principle: try to minimize the switching between HTML context and ASP (or really, VBScript) context on a page, for reasons of performance as well as readability.
Following that principle in your latter snippet, I would use Response.Write to emit the necessary HTML as follows:
<%
id=request("id")
Response.Write "<tr><th>Name:</th><td><input name=""n"" value=""" & Server.HTMLEncode(id) & """></input></td></tr>"
%>
All you're doing is supplying the VALUE attribute of the INPUT tag.
Fixed:
<%
dim i
for each x in rs.Fields
IF (x.name="ID") THEN
i=x.value
response.write("<td><a href='form7.asp?id=" & i & "'>")
response.write(x.value & "</a>")
'not sure if you want a closing TD here
response.write("</td>" & vbCrLf)
END IF
next
%>