Get value from HTML element for display in Textbox - html

I have tried adapting a handful of solutions I've found on here and cannot get any to work. Most recently:
Private Sub Button4_Click(sender As Object, e As EventArgs) Handles Button4.Click
    Dim Web As New HtmlAgilityPack.HtmlWeb
    Dim Doc As New HtmlAgilityPack.HtmlDocument
    Doc = Web.Load("http://MyWebSearch.com/s/" + TextBox1.Text)
    For Each table As HtmlAgilityPack.HtmlNode In Doc.DocumentNode.SelectNodes(<div class="inline-block"></div>)
        Textbox5.text(table.InnerText)
    Next
End Sub
I am trying to run a search against a fixed address, with the contents of TextBox1 appended as the search term. After the search is conducted, I need to return the value of one element on the page into Textbox5. I can't for the life of me get this to work. I've tried obtaining the XPath, but that failed too. What am I doing wrong?
The web page is rbx.trade/s/"username"
I am trying to return the user's "Rap" and display it in Textbox5.

You're searching for the class of an element. You probably want to search for the ID instead.
For Each table As HtmlAgilityPack.HtmlNode In Doc.DocumentNode.SelectNodes("//div[@id='elementID']")
    Textbox5.Text = table.InnerText
Next
Or, if you do mean to use the class, store each value in an array or append it to the text box instead of assigning it directly. Say there are 8 elements with that class: assigning inside the loop the way you are will always leave Textbox5 holding the value of the 8th (last) matched element.


Obtain Innertext from Web Element with Variable Path - Selenium

I have a VBA macro that I'm running in Excel 2016. The macro brings back information from the internet using Chrome and Selenium WebDriver. The macro iterates through several similar webpages, but some pages have a few more or fewer lines than others. Hence, the XPath to the inner text I'm interested in varies slightly from page to page. Here is a snippet of the source code for the element; it is the "242" that I'm trying to locate and extract.
<div ng-repeat="squarefootage in improvement.SquareFootage" class="ng-scope">
    <div>
        <span class="labelSquareFootage ng-binding">ATTACHED GARAGE AREA </span><span class="result ng-binding">242</span>
    </div>
</div>
As a workaround I'm just grabbing the entire source code for the page and then parsing it with INSTR to find what I'm looking for. I was wondering if there was a more elegant method to find an element with a variable path? Is there something in WebDriver that would work like
WDriver.FindElementbyInnerHTML
?
Here is a link to the website, you can look at a few different addresses and see how the path changes from page (address) to page (next address).
You could gather all nodes with the matching class, loop until the desired garage text is found, and then take the nextSibling:
Public Sub Demo()
    'Your code to get to page and enter address and search, open heading, then....
    Dim html As MSHTML.HTMLDocument
    Set html = New MSHTML.HTMLDocument
    html.body.innerHTML = WDriver.PageSource
    Dim nodes As Object, node As Object, i As Long
    Set nodes = html.querySelectorAll(".labelSquareFootage")
    For i = 0 To nodes.Length - 1
        Set node = nodes.Item(i)
        If InStr(node.innerText, "ATTACHED GARAGE AREA") > 0 Then
            Debug.Print node.NextSibling.innerText
            Exit For
        End If
    Next i
End Sub
For XPath, you could try
//*[text()[contains(.,'ATTACHED GARAGE AREA')]]/following-sibling::span
if the desired value is in the next span node. This matches the element whose text contains the desired label and then takes its following sibling span.
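As a minimal sketch of how the XPath above might be used directly from the macro, assuming WDriver is an already-initialised SeleniumBasic driver with the results page loaded (the Sub name PrintGarageArea is hypothetical, and the [1] predicate is added here to keep only the nearest following span):

```vba
' Sketch only: assumes WDriver is a SeleniumBasic driver (e.g. Selenium.ChromeDriver)
' that has already navigated to the page. PrintGarageArea is a hypothetical name.
Public Sub PrintGarageArea(ByVal WDriver As Object)
    ' Find the label span by its text, then step to the first span that follows it.
    Const GARAGE_XPATH As String = _
        "//span[contains(text(),'ATTACHED GARAGE AREA')]/following-sibling::span[1]"

    ' FindElementByXPath raises an error if nothing matches, so pages that
    ' lack this row need an error handler around the call.
    Debug.Print WDriver.FindElementByXPath(GARAGE_XPATH).Text
End Sub
```

Because the expression keys on the label text rather than on position, it should not matter how many rows precede the garage row on a given page.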

Concatenate Rich Text Fields (HTML) and display result on Access form

I have an access database which deals with "articles" and "items" which are all textual stuff. An article is composed of several items. Each item has a rich text field and I wish to display the textual content of an article by concatenating all rich text fields of its items.
I have written a VBA program which concatenates the items' rich text fields and feeds the result into an unbound TextBox control on my form (Textbox.Text = resulting string), but it does not work: I get an error message saying "this property parameter is too long".
If I try to feed a single text field into the TextBox control, I get another error stating "Impossible to update the recordset", which I do not understand. What recordset is this about?
Each item field is typically something like this (I use square brackets instead of "<" and ">" because otherwise the post does not display correctly): "[div][font ...]Content[/font] [/div]", with "[em]" tags also included.
Given this problem, I have a number of questions:
1) How do you feed an HTML string into an unbound TextBox control?
2) Is it OK to concatenate these HTML strings, or should I modify the tags, for example to have only one "[div]" block instead of several in a row (suppressing the intermediate div tags)?
3) What control should I use to display the result?
You might well answer that I could just use a subform displaying the different items an article is made up of. Yes, but it is impossible to have a variable height for each item, and reading the whole article that way is very cumbersome.
Thank you for any advice you may provide.
It works for me with a simple function:
Public Function ConcatHtml() As String
    Dim RS As DAO.Recordset
    Dim S As String
    Set RS = CurrentDb.OpenRecordset("tRichtext")
    Do While Not RS.EOF
        ' Visually separate the records, it works with and without this line
        If S <> "" Then S = S & "<br>"
        S = S & RS!rText & vbCrLf
        RS.MoveNext
    Loop
    RS.Close
    ConcatHtml = S
End Function
and an unbound textbox with control source =ConcatHtml().
In your case you'd have to add the article foreign key as parameter to limit the item records you concatenate.
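A minimal sketch of that parameterised version follows. The field name ArticleID, its numeric type, and the function name ConcatHtmlForArticle are assumptions made for illustration; only tRichtext and rText come from the function above.

```vba
' Sketch only: ArticleID is a hypothetical numeric foreign-key field in tRichtext.
Public Function ConcatHtmlForArticle(ByVal ArticleID As Long) As String
    Dim RS As DAO.Recordset
    Dim S As String

    ' Read-only snapshot restricted to the items of one article.
    Set RS = CurrentDb.OpenRecordset( _
        "SELECT rText FROM tRichtext WHERE ArticleID = " & ArticleID, _
        dbOpenSnapshot)

    Do While Not RS.EOF
        ' Visually separate the items.
        If S <> "" Then S = S & "<br>"
        ' Nz guards against items whose rich text field is Null.
        S = S & Nz(RS!rText, "") & vbCrLf
        RS.MoveNext
    Loop
    RS.Close

    ConcatHtmlForArticle = S
End Function
```

The unbound textbox's control source would then be something like =ConcatHtmlForArticle([ArticleID]), where [ArticleID] is whatever control or field on the form holds the current article's key.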
The "rich text" feature of a textbox is only intended for simple text.
We use the web browser control to display a larger amount of HTML text, and load it like this:
Private Sub Form_Current()
    LoadWebPreview
End Sub
Private Sub HtmlKode_AfterUpdate()
    LoadWebPreview
End Sub
Private Sub LoadWebPreview()
    ' Let the browser control finish the rendering of its standard content.
    While Me!WebPreview.ReadyState <> acComplete
        DoEvents
    Wend
    ' Avoid the pop-up warning about running scripts.
    Me!WebPreview.Silent = True
    ' Show body as it would be displayed in Outlook.
    Me!WebPreview.Document.body.innerHTML = Me!HtmlBody.Value
End Sub

Retrieving the text between the <div> with VBA

I am trying to get a text string from inside a div on a webpage, but I can't seem to figure out how it is stored in the element.
Set eleval = objIE.Document.getElementsByClassName("outputValue")(0)
Debug.Print (eleval.innerText)
I have tried this and variations thereof, but my string just reads as "".
I mainly need help with how this type of data is referenced in VBA.
<div class="outputValue">"text data that I want"</div>
Here is a screenshot of the page in question; I cannot give a link since it requires a company login to reach.
With the .querySelector method, make sure the page is fully loaded before attempting this.
Delays can be added with, for example, Application.Wait Now + TimeSerial(h, m, s)
Set eleval = objIE.Document.querySelector("div[class='outputValue']")
Debug.Print eleval.innerText
If it is the first of its className on the page you could also use:
Set eleval = objIE.Document.querySelector(".outputValue")
If there is more than one and it is at a later index you can use
Set eleval = objIE.Document.querySelectorAll(".outputValue")
And then access items from the nodeList returned with
Debug.Print eleval.Item(0).innerText 'or replace 0 with the appropriate index.
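As an alternative to a fixed Application.Wait delay, the "make sure the page is fully loaded" advice is often implemented as a polling loop. The sketch below assumes objIE is an InternetExplorer object that has already been told to Navigate; the Sub name WaitForPage, the constant name, and the 30-second default timeout are hypothetical.

```vba
' Sketch only: assumes objIE is an InternetExplorer automation object.
Public Sub WaitForPage(ByVal objIE As Object, _
                       Optional ByVal TimeoutSeconds As Long = 30)
    Const IE_READYSTATE_COMPLETE As Long = 4   ' value of READYSTATE_COMPLETE
    Dim started As Date
    started = Now

    ' Poll until the browser reports that it is idle and the document is complete.
    Do While objIE.Busy Or objIE.readyState <> IE_READYSTATE_COMPLETE
        DoEvents
        ' Give up after the timeout rather than looping forever.
        If DateDiff("s", started, Now) > TimeoutSeconds Then Exit Do
    Loop
End Sub
```

Calling WaitForPage objIE before the querySelector line covers the normal page load. Content that scripts inject after the load completes can still arrive later, in which case an empty innerText may need a further wait or a retry.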
Dim elaval As Variant
elaval = Trim(Doc.getElementsByTagName("div")(X).innerText)
MsgBox elaval
Where X is the zero-based index of your div among all the div elements on the page.

Dynamically adding divs and images during runtime - vb

I have a table full of users which includes a username and an image. What I'm trying to do is for every user in the table, I want to create a new div with a class of "user" which contains a label for their username and their image. Here's what I have so far:
Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
    Dim label1 As New Label
    label1.ID = "label1"
    label1.Text = "Username goes here"
    Controls.Add(label1)
End Sub
I found out how to add a label but I'm not sure how I'd position it, or add HTML elements with a label and image object inside them. Here's what I'm hoping the result will look like:
<div class="user">
    <h2> Username </h2>
    <img src="user.png">
</div>
Hopefully this could then be repeated as many times as necessary for each user.
Or you can do it like this; here is an example in C#:
var div = new Panel();
div.Controls.Add(new LiteralControl("<h1>hallo</h1>"));
div.Controls.Add(new Image());
Page.Form.Controls.Add(div);
You will probably want to use the asp:Repeater control; you bind it the same way you bind a grid, i.e.
myRepeater.DataSource = results
myRepeater.DataBind
I will come back to this and provide a full, detailed explanation for this answer, as I'm on my phone at the moment.

VBA Excel Scraping

I am getting started with learning about scraping. I have a page that is behind a login, and I remember reading that you should not rely on the (1), (2) or (3) index after getElementsByTagName, but should rather target something more unique like a class name or ID. Can someone please tell me why
this getElementsByTagName version works, and
Dim Companyname As String
Companyname = ie.document.getElementsByTagName("span")(1).innertext
this getElementsByClassName version does not work?
Dim Companyname As String
Companyname = ie.document.getElementsByClassName("account-website-name").innertext
This is the text that I am scraping
<span class="account-website-name" data-journey-name="true">Dwellington Journey</span>
getELEMENTbyProperty vs getELEMENTSbyProperty
There are primarily two distinct types of commands to retrieve one or more elements from a web page's .Document; those that return a single object and those that return a collection of objects.
Getting an ELEMENT
When getElementById is used, you are asking for a single object (e.g. MSHTML.IHTMLElement). In this case the properties (e.g. .Value, .innerText, .outerHtml, etc) can be retrieved directly. There isn't supposed to be more than a single unique id property within an HTML body so this function should safely return the only element within the ie.document that matches.
'typical VBA use of getElementById
Dim CompanyName As String
CompanyName = ie.document.getElementById("CompanyID").innerText
Caveat: I've noticed a growing number of web designers who seem to think that using the same id for multiple elements is oh-key-doh-key as long as the id's are within different parent elements like different <div> elements. AFAIK, this is patently wrong but seems to be a growing practise. Be careful on what is returned when using .getElementById.
Getting ELEMENTS
When using getElementsByTagName, getElementsByClassName, etc. where the word Elements is plural, you are returning a collection (e.g. MSHTML.IHTMLElementCollection) of objects, even if that collection contains only one or even none. If you want to use these to directly access a property of one of the elements within the collection, an ordinal index number must be supplied so that a single element within the collection is referenced. The index number within these collections is zero based (i.e. the first starts at (0)).
'retrieve the text from the third <span> element on a webpage
Dim CompanyName As String
CompanyName = ie.document.getElementsByTagName("span")(2).innerText
'output all <span> classnames to the Immediate window until the right one comes along
'retrieve the text from the first <span> element with a classname of 'account-website-name'
Dim e As Long, es As Long
es = ie.document.getElementsByTagName("span").Length - 1
For e = 0 To es
    Debug.Print ie.document.getElementsByTagName("span")(e).className
    If ie.document.getElementsByTagName("span")(e).className = "account-website-name" Then
        CompanyName = ie.document.getElementsByTagName("span")(e).innerText
        Exit For
    End If
Next e
'same thing, different method
Dim eSPN As MSHTML.IHTMLElement, ecSPNs As MSHTML.IHTMLElementCollection
'object references must be assigned with Set
Set ecSPNs = ie.document.getElementsByTagName("span")
For Each eSPN In ecSPNs
    Debug.Print eSPN.className
    If eSPN.className = "account-website-name" Then
        CompanyName = eSPN.innerText
        Exit For
    End If
Next eSPN
Set eSPN = Nothing: Set ecSPNs = Nothing
To summarize, if your InternetExplorer method uses Elements (plural) rather than Element (singular), you are returning a collection which must have an index number appended if you wish to treat one of the elements within the collection as a single element.
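Applied to the class-name call from the question, that summary might look like the sketch below. It assumes ie is an InternetExplorer object with the page already loaded; the function name GetCompanyName is hypothetical.

```vba
' Sketch only: assumes ie is an InternetExplorer object with the page loaded.
Public Function GetCompanyName(ByVal ie As Object) As String
    Dim matches As Object

    ' getElementsByClassName (plural) returns a collection, not a single element.
    Set matches = ie.document.getElementsByClassName("account-website-name")

    ' Check the collection is not empty, then index it (zero based)
    ' before reading a property of one element.
    If matches.Length > 0 Then
        GetCompanyName = matches(0).innerText
    End If
End Function
```

The Length check avoids a run-time error when the class is not present, for example when the page has not finished loading or the login has failed.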
CSS selector:
You can achieve the same thing with a CSS selector of .account-website-name
The "." means className. This will return a collection of matching elements if there is more than one.
VBA:
You apply the selector with the .querySelectorAll method of .document. This returns a nodeList which you traverse the .Length of, accessing items by index, starting from 0.
Dim aNodeList As Object, i As Long
Set aNodeList = ie.document.querySelectorAll(".account-website-name")
For i = 0 To aNodeList.Length - 1
    Debug.Print aNodeList.Item(i).innerText
    ' Debug.Print aNodeList(i).innerText ''<== sometimes this syntax instead
Next i