ListBox with html element - html

Can anyone offer me some advice? I currently have a listbox I am using, in the listbox there is a list of images from any website. they are grabbed from the website via this method
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
Dim PageElements As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("img")
For Each CurElement As HtmlElement In PageElements
imagestxt.Items.Add(imagestxt.Text & CurElement.GetAttribute("src") & Environment.NewLine)
Next
Timer1.Enabled = True
End Sub
I then use the picture control method to get the image and display it.
pic1.Image = New Bitmap(New MemoryStream(New WebClient().DownloadData(imagestxtimagestxt.SelectedItem.ToString))).SelectedItem.ToString)))
This method pulls the images and title from the HTML.
Private Function StrHTML12() As Boolean
Dim htmlDocument As HtmlDocument = WebBrowser1.Document
ListBox1.Items.Clear()
For Each element As HtmlElement In htmlDocument.All
ListBox1.Items.Add(element.TagName)
If element.TagName.ToUpper = "IMG" Then
imgtags.Items.Add(element.OuterHtml.ToString)
End If
If element.TagName.ToUpper = "TITLE" Then
titletags.Items.Add(element.OuterHtml.ToString)
Timer1.Enabled = False
End If
Next
End Function
This is a counting method to count how many empty alt="" or empty img alt='' there are on the page.
Basically what i am looking to do is;
Have a program that can check the image, look at the alt='' or img alt='' if on the website the dev hasn't put anything in the alt tag i want the image to show in a picture box and i want the alt tag either next to it or underneith it or something. but i have no idea how.
counter = InStr(counter + 1, strHTML, "<img alt=''")
counter = InStr(counter + 1, strHTML, "alt=''")
counter = InStr(counter + 1, strHTML, "alt=""")
The above seems really slow and messy. is there a better way of doing it?

I do not have VB installed so I have not been able to test the code. I'm also not familiar with the datagridview component so have not attempted to integrate my code with it.
The code below should get you the title of the page, and loop through all the img tags that do not have (or have empty) alt-text
HtmlElement.GetAttribute(sAttr) returns the value of the attribute or an empty string.
Private Sub WebBrowser1_DocumentCompleted(ByVal sender As Object, ByVal e As WebBrowserDocumentCompletedEventArgs) Handles WebBrowser1.DocumentCompleted
Dim Title As String
Dim ImSrc As String
Dim PageElements As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("img")
// This line might need to be adjusted, see below
Title = PageElements.GetElementsByTagName("title")(0).InnerText
For Each CurElement As HtmlElement In PageElements
If CurElement.GetAttribute("alt") = "" Then
// CurElement does not have alt-text
ImSrc = CurElement.GetAttribute("src") // This Image has no Alt Text
Else
// CurElement has alt-text
End If
Next
Timer1.Enabled = True
End Sub
The line that gets the title might need to be changed as I'm unsure how collections can be accessed. You want the first (hopefully only) element returned from the GetElementsByTagName function.

Related

VBA Web search button - GetElementsbyClassName

I have a problem with the VBA code.
I would like to open this website: https://www.tnt.com/express/en_us/site/tracking.html and in Shipment numbers search box I would like to put active cells from Excel file. At the beginning I tried to put only a specific text for example: "777777".
I wrote the below code but unfortunately, the search button is empty and there is no error. I tried everything and I have no idea what should I change in my code.
Any clues? Thank you in advance.
HTML:
<input class="__c-form-field__text ng-touched ng-dirty ng-invalid" formcontrolname="query" pbconvertnewlinestocommasonpaste="" pbsearchhistorynavigation="" shamselectalltextonfocus="" type="search">
VBA:
Sub TNT2_tracker()
Dim objIE As InternetExplorer
Dim aEle As HTMLLinkElement
Dim y As Integer
Dim result As String
Set objIE = New InternetExplorer
objIE.Visible = True
objIE.navigate "https://www.tnt.com/express/en_us/site/tracking.html"
Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop
Dim webpageelement As Object
For Each webpageelement In objIE.document.getElementsByClassName("input")
If webpageelement.Class = "__c-form-field__text ng-pristine ng-invalid ng-touched" Then
webpageelement.Value = "777"
End If
Next webpageelement
End Sub
You could use the querySelector + class name to find an element.
something like
'Find the input box
objIE.document.querySelector("input.__c-form-field__text").value = "test"
'Find the search button and do a click
objIE.document.querySelector("button.__c-btn").Click
No need to loop through elements. Unless the site allows you to search multiple tracking numbers at the same time.
It seems automating this page is a litte tricky. If you change the value of the input field it doesn' t work. Nothing happen by clicking the submit button.
A look in the dom inspector shows several events for the input field. I checked them out and it seems we need to paste the value over the clipboard by trigger the paste event of the shipping field.
In order for this to work without Internet Explorer prompting, its security settings for the Internet zone must be set to allow pasting from the clipboard. I'm using a German version of IE, so I have problems explaining how to find the setting.
This macro works for me:
Sub TNT2_tracker()
Dim browser As Object
Dim url As String
Dim nodeDivWithInputField As Object
Dim nodeInputShipmentNumber As Object
Dim textToClipboard As Object
'Dataobject by late binding to use the clipboard
Set textToClipboard = CreateObject("New:{1C3B4210-F441-11CE-B9EA-00AA006B1A69}")
url = "https://www.tnt.com/express/en_us/site/tracking.html"
'Initialize Internet Explorer, set visibility,
'call URL and wait until page is fully loaded
Set browser = CreateObject("internetexplorer.application")
browser.Visible = True
browser.navigate url
Do Until browser.ReadyState = 4: DoEvents: Loop
'Manual break for loading the page complitly
'Application.Wait (Now + TimeSerial(pause_hours, pause_minutes, pause_seconds))
Application.Wait (Now + TimeSerial(0, 0, 3))
'Get div element with input field for shipment number
Set nodeDivWithInputField = browser.Document.getElementsByClassName("pb-search-form-input-group")(0)
If Not nodeDivWithInputField Is Nothing Then
'If we got the div element ...
'First child element is the input field
Set nodeInputShipmentNumber = nodeDivWithInputField.FirstChild
'Put shipment number to clipboard
textToClipboard.setText "7777777"
textToClipboard.PutInClipboard
'Insert value by trigger paste event of the input field
Call TriggerEvent(browser.Document, nodeInputShipmentNumber, "paste")
'Click button
browser.Document.getElementsByClassName("__c-btn")(0).Click
Else
MsgBox "No input field for shipment number found."
End If
End Sub
And this function to trigger a html event:
Private Sub TriggerEvent(htmlDocument As Object, htmlElementWithEvent As Object, eventType As String)
Dim theEvent As Object
htmlElementWithEvent.Focus
Set theEvent = htmlDocument.createEvent("HTMLEvents")
theEvent.initEvent eventType, True, False
htmlElementWithEvent.dispatchEvent theEvent
End Sub
As #Stavros Jon alludes to..... there is a browserless way using xhr GET request via API. It returns json and thus you ideally need to use a json parser to handle the response.
I use jsonconverter.bas as the json parser to handle the response. Download raw code from here and add to standard module called JsonConverter . You then need to go VBE > Tools > References > Add reference to Microsoft Scripting Runtime. Remove the top Attribute line from the copied code.
Example request with dummy tracking number (deliberately passed as string):
Option Explicit
Public Sub TntTracking()
Dim json As Object, ws As Worksheet, trackingNumber As String
trackingNumber = "1234567" 'test input value. Currently this is not a valid input but is for demo.
Set ws = ThisWorkbook.Worksheets("Sheet1") 'for later use if writing something specific out
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "https://www.tnt.com/api/v3/shipment?con=" & trackingNumber & "&searchType=CON&locale=en_US&channel=OPENTRACK", False
.send
Set json = JsonConverter.ParseJson(.responseText)
End With
'do something with results
Debug.Print json("tracker.output")("notFound").Count > 0
Debug.Print JsonConverter.ConvertToJson(json("tracker.output")("notFound"))
End Sub

Get website's inner text without webbrowser

I want to get website's inner text through code.
I can already get it's inner html with code below, but i can't find any code that's getting URL's inner text without webbrowser.
This code is getting text from website in webbrowser, but i need same thing, just without webbrowser.
Dim sourceString As String = WebBrowser1.Document.Body.InnerText
With HtmlAgilityPack...
Private Sub ToolStripButton1_Click(sender As Object, e As EventArgs) Handles ToolStripButton1.Click
Dim doc As HtmlAgilityPack.HtmlDocument = New HtmlAgilityPack.HtmlDocument
With New Net.WebClient
doc.LoadHtml(.DownloadString("https://example.com"))
.Dispose()
End With
Debug.Print(doc.DocumentNode.Name)
PrintChildNodes(doc.DocumentNode)
Debug.Print(doc.DocumentNode.Element("html").Element("body").InnerText)
End Sub
Sub PrintChildNodes(Node As HtmlAgilityPack.HtmlNode, Optional Indent As Integer = 1)
For Each Child As HtmlAgilityPack.HtmlNode In Node.ChildNodes
Debug.Print("{0}{1}", String.Empty.PadLeft(Indent, vbTab), Child.Name)
PrintChildNodes(Child, Indent + 1)
Next
End Sub
**Taken from **
Wolfwyrd
In this question HTTP GET in VB.NET
Try
Dim fr As System.Net.HttpWebRequest
Dim targetURI As New Uri("http://whatever.you.want.to.get/file.html")
fr = DirectCast(HttpWebRequest.Create(targetURI), System.Net.HttpWebRequest)
If (fr.GetResponse().ContentLength > 0) Then
Dim str As New System.IO.StreamReader(fr.GetResponse().GetResponseStream())
Response.Write(str.ReadToEnd())
str.Close();
End If
Catch ex As System.Net.WebException
'Error in accessing the resource, handle it
End Try
You will get Html as well as http headers. Don't think this will work by itself with https.

get url from HTML string

I have the following code that grabs a div element:
For Each ele As HtmlElement In WebBrowser1.Document.GetElementsByTagName("div")
If ele.GetAttribute("className").Contains("description") Then
Dim content As String = ele.InnerHtml
If content.Contains("http://myserver.com/image/check.png") Then
'Do stuff if image exists
Else
'Do stuff if image doesn't exist
End If
End If
The div element looks like this:
<DIV class=headline><SPAN class=blue-title-lg>TITLE_HERE
</SPAN> LOCATION1_HERE, LOCATION2_HERE</DIV>DESCRIPTION_HERE<BR>
<DIV class=about><A class=link href="viewprofile.aspx?
profile_id=00000000">USERNAME</A> 20 FSM -
Friends <FONT color=green>Online Today</FONT></DIV>
When the tick image doesn't exist, I want to grab the url that's in:
<a class=link href="viewprofile.aspx?profile_id=00000000"></a>
and put it into a string. This is where I've hit a brick wall and I need some help. I'd imagine a regex solution would resolve my issue, but regex is one of my weak spots. Can someone put me out of my misery?
Solved it!
I slept on it and came up with a really simple way of solving it. The UI of my app now looks like a mess, but I'll sort that later. I have the information I need.
Here's how I did it:
Dim PageElement As HtmlElementCollection = WebBrowser1.Document.GetElementsByTagName("a")
For Each CurElement As HtmlElement In PageElement
Dim linkunverified As String
linkunverified = CurElement.GetAttribute("href")
If linkunverified.Contains("viewprofile.aspx") Then
If ListBox1.Items.Contains(linkunverified) Then
Else
ListBox1.Items.Add(linkunverified)
End If
End If
Next
For Each ele As HtmlElement In WebBrowser1.Document.GetElementsByTagName("div")
If ele.GetAttribute("className").Contains("description") Then
Dim content As String = ele.InnerHtml
If content.Contains("http://pics.myserver.com/image/check.png") Then
Else
Dim i As Integer
For i = 0 To ListBox1.Items.Count - 1
If content.Contains(ListBox1.Items(i).Remove(0, 24)) Then
ListBox2.Items.Add("http://www.myserver.com/" & ListBox1.Items(i).Remove(0, 24))
End If
Next
End If
End If
Next

Visual Basic - Extracting text from a webpage

My program is designed to search Amazon and extract bits of data from the results - such as product name and price.
So far, my program pastes what the user has typed into txt_Search and presses 'Go' on the page. Here is the code so far:
Private Sub btn_Search_Click(sender As Object, e As EventArgs) Handles btn_Search.Click
Dim SearchTerm As String
'Set SearchTerm as text input
SearchTerm = txt_Search.Text
'Identify the search text box
Dim AmazonElementCollection As HtmlElementCollection = web_Browser.Document.GetElementsByTagName("Input")
For Each curElement As HtmlElement In AmazonElementCollection
Dim searchtextbox As String = curElement.GetAttribute("id").ToString
If searchtextbox = "twotabsearchtextbox" Then
'Populate the search text box
curElement.SetAttribute("Value", searchterm)
End If
Next
'BUTTON CLICK: Search button
'<input type="submit" value="Go" class="nav-submit-input" title="Go">
For Each curElement As HtmlElement In AmazonElementCollection
'Find the Go button
If curElement.GetAttribute("value").Equals("Go") Then
'Function to click the button
curElement.InvokeMember("click")
End If
Next
So this all works fine as it does all this on a web browser object called web_Browser. But how do I take the text off the web page for the first result for example and put it into a textbox. I would like the title of the first result to be put into a textbox called txt_Title. Thank you.
You need to add an event handler for the DocumentCompleted event of your web_browser control.
In that event handler, you can use the same method to find your results element as you have done to find the search text box and the search button, that is: loop through the elements of the document and use the properties of the elements to identify the one you want.
This might look something like this:
Add event handler:
If curElement.GetAttribute("value").Equals("Go") Then
'Add the handler
AddHandler web_Browser.DocumentCompleted, AddressOf myDocumentCompleted
'Function to click the button
curElement.InvokeMember("click")
End If
Event handler implementation:
Private Sub myDocumentCompleted(ByVal sender As Object, _
ByVal e As WebBrowserDocumentCompletedEventArgs)
Dim AmazonElementCollection As HtmlElementCollection = web_Browser.Document.GetElementsByTagName("something you need to figure out")
For Each curElement As HtmlElement In AmazonElementCollection
Dim resultText As String = curElement.GetAttribute("id").ToString
If resultText = "something you need to figure out" Then
'Populate the text box
txt_Title.Text = curElement.innerHTML;
End If
Next
End Sub
Note: I haven't run this through a compiler or tested it, so be prepared for some problems, but I hope you get the idea.

How to export data from VB6 to an HTML TextBox

I would like to know how can I export data from VB6 textbox to a HTML textbox? It could be a simple html page or an asp page.
for example, on my VB6 form, i have an name field. Upon clicking of a button on the VB6 form, the data in the name field will be exported to a textbox on the html page.
Thank you all for help and time for reading this.
To see this demo in action, and be able to follow it through and learn how to grab from it what you need:
Create a form with a lable over a textbox, and stick 1 command buttons on the form. Don't rename any of them - the program expects the text1, command1
The following CODE is the complete FORM CODE to copy/paste into it.
Add refernce to your project from (Project=>References)Microsoft Internet Controls,Microsoft HTML Object Library,
Option Explicit
Public TargetIE As SHDocVw.InternetExplorer
Private Sub Command1_Click() ' Send text to first IE-document found
GetTheIEObjectFromSystem
SendTextToActiveElementWithSubmitOptionSet (False)
End Sub
Private Sub Form_Load()
Me.Text1 = "This is a sample text message set and submitted programmatically" 'make text1 multiline in design
Me.Command1.Caption = "Text to the first IE browser document found"
End Sub
Public Sub GetTheIEObjectFromSystem(Optional ByVal inurl As String = ".") ' "." will be found in ALL browser URLs
Dim SWs As New SHDocVw.ShellWindows
Dim IE As SHDocVw.InternetExplorer
Dim Doc As Object
For Each IE In SWs
If TypeOf IE.Document Is HTMLDocument Then ' necessary to avoid Windows Explorer
If InStr(IE.LocationURL, inurl) > 0 Then
Set TargetIE = IE
Exit For
End If
End If
Next
Set SWs = Nothing
Set IE = Nothing
End Sub
Private Sub SendTextToActiveElementWithSubmitOptionSet(ByVal bSubmitIt As Boolean)
Dim TheActiveElement As IHTMLElement
Set TheActiveElement = TargetIE.Document.activeElement
If Not TheActiveElement.isTextEdit Then
MsgBox "Active element is not a text-input system"
Else
TheActiveElement.Value = Me.Text1.Text
Dim directParent As IHTMLElement
If bSubmitIt Then
Dim pageForm As IHTMLFormElement
Set directParent = TheActiveElement.parentElement
' find its parent FORM element by checking parent nodes up and up and up until found or BODY
Do While (UCase(directParent.tagName) <> "FORM" And UCase(directParent.tagName <> "BODY"))
Set directParent = directParent.parentElement
Loop
If UCase(directParent.tagName) = "FORM" Then
Set pageForm = directParent
pageForm.submit 'intrinsic Form-element Method
Else
MsgBox ("Error: No form unit for submitting the text on this page!")
End If
End If
Set pageForm = Nothing
Set directParent = Nothing
End If
Set TheActiveElement = Nothing
Set TargetIE = Nothing
End Sub