I am trying to get a text string from inside a div on a webpage, but I can't seem to figure out how it is stored in the element.
Set eleval = objIE.Document.getElementsByClassName("outputValue")(0)
Debug.Print (eleval.innerText)
I have tried this and variations thereof, but my string just reads as "".
I mainly need help understanding how this type of data is referenced in VBA.
<div class="outputValue">"text data that I want"</div>
Here is a screenshot of the page in question, I cannot give a link since it requires a company login to reach.
With the .querySelector method, make sure the page is fully loaded before attempting the lookup.
Delays can be added, for example, with Application.Wait Now + TimeSerial(h,m,s)
Set eleval = objIE.Document.querySelector("div[class='outputValue']")
Debug.Print eleval.innerText
If it is the first of its className on the page you could also use:
Set eleval = objIE.Document.querySelector(".outputValue")
If there is more than one and it is at a later index you can use
Set eleval = objIE.Document.querySelectorAll(".outputValue")
And then access items from the nodeList returned with
Debug.Print eleval.Item(0).innerText 'or replace 0 with the appropriate index.
Dim elaval as Variant
elaval = Trim(Doc.getElementsByTagName("div")(X).innerText)
msgbox elaval
Where X is the instance of your class div
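If you do not know X in advance, one option is to loop over the div elements and compare class names. The following is only a minimal sketch: FindIndexByClass is a hypothetical helper name (not part of any library), and it assumes doc is an HTML document object such as objIE.Document.

```vba
' Hypothetical helper: returns the zero-based index of the first <tagName>
' element whose class list contains className, or -1 when nothing matches.
Public Function FindIndexByClass(ByVal doc As Object, ByVal tagName As String, _
                                 ByVal className As String) As Long
    Dim elements As Object
    Dim i As Long

    FindIndexByClass = -1
    Set elements = doc.getElementsByTagName(tagName)

    For i = 0 To elements.Length - 1
        ' Pad with spaces so "outputValue" also matches class="big outputValue"
        ' but does not match class="outputValueOld".
        If InStr(1, " " & elements.Item(i).className & " ", _
                 " " & className & " ", vbBinaryCompare) > 0 Then
            FindIndexByClass = i
            Exit Function
        End If
    Next i
End Function
```

You could then use X = FindIndexByClass(Doc, "div", "outputValue") and only read .innerText when X is not -1.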
I asked a similar question two days ago, but I now stumble on a similar yet somewhat different problem (see my previous question on a related problem).
I have a report of many lines with the same structure. I need to click an icon that is on the nth line. That report is structured in cells, so I know that my icon is in the first position (column) of that report. After I have clicked that icon, I'll also have to click on a button in the 10th column.
I already know how to access the page in question with this code:
Sub click_button_no_hlink()
Dim i As Long
Dim IE As Object
Dim Doc As Object
Dim objElement As Object
Dim objCollection As Object
Set IE = CreateObject("InternetExplorer.Application") 'create IE instance
IE.Visible = True
IE.Navigate "https://apex.xyz.qc.ca/apex/prd1/f?p=135:LOGIN_DESKTOP::::::" ' Adress of web page
While IE.Busy: DoEvents: Wend 'loading page
This first part is easy, isn't it? And I know how to handle it. Afterward I tried different variations around this, but either it does nothing or I get an error message. Obviously I don't fully understand what I'm doing with the "querySelector" thing…
dim step_target as string
step_target = 2
'identify all the lines of my table containing lines, containing icons
'and button to click on
Set objCollection = IE.document.getElementsByClassName("highlight-row")
i = 0
Do While i < objCollection.Length
'cell 2 is the one containing the step I'm targetting
If objCollection.Item(i).Cells(2).innerText = step_target Then
'that's not doing anything
objCollection.Item(i).Cells(9).Click
'tried many syntax around this with no luck
IE.document.querySelector([objCollection.Item(i).Cells(9)]).FireEvent ("onclick")
End If
i = i + 1
Loop
Here's images of the code of the page
Showing all the lines of the report
Showing all code lines of a particular line
and now the code of that first icon I need to click on (this is where I need help ;-) how can I call that action)
and finally the code of that button I also need to click on
Again, I thank you all in advance, for the time you'll take to help me along this.
For the first, you could try an attribute selector in combination with a descendant combinator and a type selector:
ie.document.querySelector("[headers='ID_DET_DEM_TRAV_STD'] a").click
For the second, you could try an attribute selector in combination with a descendant combinator and an input type selector:
ie.document.querySelector("[headers='BOUTON1'] input").click
An alternative for the second is:
ie.document.querySelector("[value=Fait]").click
Typically, if you want to select by position, e.g. the 1st and 10th columns, you would use
td:nth-of-type(1)
td:nth-of-type(10)
Though you would also use a tr:nth-of-type(n) to get the right row as well, e.g. first row, first column. Then add in any child type selector that you might need.
ie.document.querySelector("tr:nth-of-type(1) td:nth-of-type(1)")
Child a tag:
ie.document.querySelector("tr:nth-of-type(1) td:nth-of-type(1) a")
With a child input tag, this would then be:
IE.document.querySelector("tr:nth-of-type(4) td:nth-of-type(10) input").Click
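Since the row and column usually vary at run time, the selector string can be built from numbers. This is just a sketch: CellSelector is a hypothetical helper name, and it assumes the report is a plain table whose rows and cells are tr and td elements, as in the selectors above.

```vba
' Hypothetical helper: builds a positional CSS selector such as
' "tr:nth-of-type(4) td:nth-of-type(10) input".
' rowNumber and columnNumber are 1-based, as nth-of-type expects.
Public Function CellSelector(ByVal rowNumber As Long, ByVal columnNumber As Long, _
                             Optional ByVal childTag As String = "") As String
    CellSelector = "tr:nth-of-type(" & rowNumber & ") td:nth-of-type(" & columnNumber & ")"
    ' Append a descendant type selector (e.g. "a" or "input") when requested.
    If Len(childTag) > 0 Then CellSelector = CellSelector & " " & childTag
End Function
```

For example, IE.document.querySelector(CellSelector(4, 10, "input")).Click targets the same element as the line above.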
I'm trying to set up an Excel form that auto-fills an HTML web form. I've figured out how to use VBA to get the elements and cycle through them to add values. My issue is with a text field that is revealed with the click of a check box. I can't have Excel check the box until after I've obtained the elements for the form.
I've looked at the html, and it looks like the field is hidden to some degree when the form is first loaded, as the id and everything can only be found in the tree once the box is checked. The problem is, this field won't show up in the VBA elements list no matter what I try. I've tried re-doing the Set command, and gotten an error when I do that. I'm not sure how to refresh the elements list in VBA to include the new input box.
I used the Set command to get all the elements
Set frm = ie.document.getElementByID("form1")
This is fine, but I can't use that same command to try to re-Set the element list. I get the run-time error 438 (Object doesn't support this property or method)
I tried making a Variant titled frm2, but I get the same error
Sub formFill()
Dim ie As Object
Dim frm As Variant
Dim element As Variant
Set ie = CreateObject("InternetExplorer.Application")
ie.navigate "THIS IS THE URL"
While ie.readyState <> 4: DoEvents: Wend
'Get form by ID
Set frm = ie.document.getElementByID("form1")
ie.Visible = True
For Each element In frm.elements
Select Case element.Name
Case "fv_RRFC$chkOtherModel"
element.Checked = True
element.FireEvent ("OnClick")
frm.getElementByID("fv_RRFC_txtOtherModel")(0).Value = "test model" 'I tried using the command here, but it didn't work
Case "fv_RRFC$txtRRFC_PROGRAM"
element.Value = "test"
Case "fv_RRFC$txtOtherModel"
element.Value = "test model"
'My attempt to add it to the Select Case. Not surprised this didn't work, as the for Each loop uses the list it had before
End Select
Next
End Sub
I expected to be able to re-load the elements list to interact and fill the newly revealed box, but I've had no luck finding a way to do that.
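One thing worth checking here: getElementById is a method of the document, not of the form element, and it returns a single element, so no (0) index is needed. The sketch below looks the field up on the document again after the checkbox has been clicked. SetValueById is a hypothetical helper name, and the id "fv_RRFC_txtOtherModel" is taken from the attempt above, so treat it as an assumption about the page.

```vba
' Hypothetical helper: looks the element up on the document at call time,
' so it also finds fields that were added to the page after the form loaded.
' Returns False when no element with that id exists (yet).
Public Function SetValueById(ByVal doc As Object, ByVal elementId As String, _
                             ByVal newValue As String) As Boolean
    Dim target As Object

    Set target = doc.getElementById(elementId)
    If target Is Nothing Then Exit Function   ' not in the DOM yet

    target.Value = newValue
    SetValueById = True
End Function
```

Inside the Select Case, after element.FireEvent ("OnClick"), you could call SetValueById ie.document, "fv_RRFC_txtOtherModel", "test model". If it returns False, the page may need a moment to add the field before you try again.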
I have an access database which deals with "articles" and "items" which are all textual stuff. An article is composed of several items. Each item has a rich text field and I wish to display the textual content of an article by concatenating all rich text fields of its items.
I have written a VBA program which concatenates the items' rich text fields and feeds the result into an independent TextBox control on my form (Textbox.Text = resulting string), but it does not work: I get an error message saying "this property parameter is too long".
If I try to feed a single textual field into the Textbox control, I get another error stating "Impossible to update the recordset", which I do not understand. What recordset is this about?
Each item field is typically something like this (I use square brackets instead of "<" and ">" because otherwise the post does not display correctly): [div][font ...]Content[/font] [/div], with "[em]" tags also included.
Given this problem, I have a number of questions:
1) How do you feed an HTML string into an independent Textbox control?
2) Is it OK to concatenate these HTML strings, or should I modify the tags, for example by having only one "[div]" block instead of several in a row (suppressing the intermediate div tags)?
3) What control should I use to display the result?
You might well answer that I could just use a subform displaying the different items that make up an article. Yes, but it is impossible to have a variable height for each item, and reading the whole article that way is very cumbersome.
Thank you for any advice you may provide.
It works for me with a simple function:
Public Function ConcatHtml() As String
Dim RS As DAO.Recordset
Dim S As String
Set RS = CurrentDb.OpenRecordset("tRichtext")
Do While Not RS.EOF
' Visually separate the records, it works with and without this line
If S <> "" Then S = S & "<br>"
S = S & RS!rText & vbCrLf
RS.MoveNext
Loop
RS.Close
ConcatHtml = S
End Function
and an unbound textbox with control source =ConcatHtml().
In your case you'd have to add the article foreign key as parameter to limit the item records you concatenate.
The "rich text" feature of a textbox is only intended for simple text.
We use the web browser control to display a larger amount of HTML text, and load it like this:
Private Sub Form_Current()
LoadWebPreview
End Sub
Private Sub HtmlKode_AfterUpdate()
LoadWebPreview
End Sub
Private Sub LoadWebPreview()
' Let the browser control finish the rendering of its standard content.
While Me!WebPreview.ReadyState <> acComplete
DoEvents
Wend
' Avoid the pop-up warning about running scripts.
Me!WebPreview.Silent = True
' Show body as it would be displayed in Outlook.
Me!WebPreview.Document.body.innerHTML = Me!HtmlBody.Value
End Sub
Just an FYI... I am using UserForm1.WebBrowser1.document.CurrentWindow.execScript("return doSubmit( this.form )").Click code to extract
I need to click on the Next button, but no element ID is available; below is the HTML code
<input class="saveButton" onclick="return doSubmit( this.form )" type="BUTTON" value="Next">
I tried using the VBA code below, but this doesn't work for me
Dim CurrentWindow As HTMLWindowProxy: Set CurrentWindow = UserForm1.WebBrowser1.document.parentWindow
Call CurrentWindow.execScript("return doSubmit( this.form )")
Please let me know if anything further is required.
So, clicking something without an ID can be done; you just have to know how to FIND what you're looking for. Normally, I'd recommend clicking on the element by innertext as a good backup.
I know you probably can't share the webpage, so I won't ask. But what is the tag assigned to this element? "td", "a", "div", "input", etc.
You can find the element using this loop.
Dim Cnt As Long
Dim oCell As Object
Cnt = 0 'Position of the current element within the .tags("a") collection
With oie.document.all
For Each oCell In .tags("a") 'This will change according to html tagging of element
If oCell.value = "Next" Then 'If this "Next" value is unique.
'Cnt counts within .tags("a"), so index that same collection rather than document.all
.tags("a").Item(Cnt).Click 'If doesn't trigger try nextline
.tags("a").Item(Cnt).FireEvent ("onclick")
Exit For
End If
Cnt = Cnt + 1
Next oCell
End With
Set oCell = Nothing
Now, in order to get this to work. Enable the "Microsoft Internet Controls" reference library.
A couple things that might have to change depending on the HTML structure.
oie.document.all.item 'This might change to something like oie.document.body. or oie.document.forms(0).document.all.
Like I said, the "a" needs to be changed to look at every tag assigned to the element in the html.
I hope this points you in the right direction, if you don't want to run the loop everytime the macro runs, you can put this into a function that runs and assigns all of the elements on load. But one thing at a time :)
Hello and thanks for your quick turn around!
Code is not running...
I tried your code as per my requirement and made the necessary changes:
Dim Cnt As Variant
Dim oCell As Object
Cnt = 0
With UserForm1.WebBrowser1.document.all
For Each oCell In .tags("input")
If oCell.Class= "saveButton" Then
UserForm1.WebBrowser1.document.all.Item(Cnt).Click
UserForm1.WebBrowser1.document.all.Item(Cnt).FireEvent ("onclick")
Exit For
End If
Cnt = Cnt + 1
Next oCell
End With
Set oCell = Nothing
End Function
Just an FYI... i am using userform to perform this task.
As you said, we need some unique value so i have 'savebutton' as class and change in code as 'oCell.class'...
Is there any chance to find the 'savebutton' as this is unique?
Thanks a ton!
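One thing to check in the code above: HTML elements expose their class through the className property, so oCell.Class is likely what fails. Below is a minimal sketch of a class-based lookup. FindInputByClass is a hypothetical helper name, and it assumes the button's class attribute is exactly "saveButton", as in the HTML shown earlier.

```vba
' Hypothetical helper: returns the first <input> whose class attribute
' equals className, or Nothing when there is no such element.
Public Function FindInputByClass(ByVal doc As Object, ByVal className As String) As Object
    Dim candidate As Object

    For Each candidate In doc.getElementsByTagName("input")
        ' className is the DOM property behind the HTML class attribute.
        If candidate.className = className Then
            Set FindInputByClass = candidate
            Exit Function
        End If
    Next candidate
End Function
```

Usage might look like Set btn = FindInputByClass(UserForm1.WebBrowser1.document, "saveButton"), followed by btn.Click once you have confirmed that btn is not Nothing.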
I am getting started with learning about scraping. I have this page that is behind a login, and I remember reading that you should not use the (1), (2) or (3) index after getting elements by tag name, but rather target something more unique like a class name or ID. Can someone please tell me why that is?
This GetTag version works:
Dim Companyname As String
Companyname = ie.document.getElementsByTagName("span")(1).innertext
This GetClass version does not work:
Dim Companyname As String
Companyname = ie.document.getElementsByClassName("account-website-name").innertext
This is the text that I am scraping
<span class="account-website-name" data-journey-name="true">Dwellington Journey</span>
getELEMENTbyProperty vs getELEMENTSbyProperty
There are primarily two distinct types of commands to retrieve one or more elements from a web page's .Document; those that return a single object and those that return a collection of objects.
Getting an ELEMENT
When getElementById is used, you are asking for a single object (e.g. MSHTML.IHTMLElement). In this case the properties (e.g. .Value, .innerText, .outerHtml, etc) can be retrieved directly. There isn't supposed to be more than a single unique id property within an HTML body so this function should safely return the only element within the ie.document that matches.
'typical VBA use of getElementById
Dim CompanyName As String
CompanyName = ie.document.getElementById("CompanyID").innerText
Caveat: I've noticed a growing number of web designers who seem to think that using the same id for multiple elements is oh-key-doh-key as long as the id's are within different parent elements like different <div> elements. AFAIK, this is patently wrong but seems to be a growing practise. Be careful on what is returned when using .getElementById.
Getting ELEMENTS
When using getElementsByTagName, getElementsByClassName, etc. where the word Elements is plural, you are returning a collection (e.g. MSHTML.IHTMLElementCollection) of objects, even if that collection contains only one or even none. If you want to use these to directly access a property of one of the elements within the collection, an ordinal index number must be supplied so that a single element within the collection is referenced. The index number within these collections is zero based (i.e. the first starts at (0)).
'retrieve the text from the third <span> element on a webpage
Dim CompanyName As String
CompanyName = ie.document.getElementsByTagName("span")(2).innerText
'output all <span> classnames to the Immediate window until the right one comes along
'retrieve the text from the first <span> element with a classname of 'account-website-name'
Dim e as long, es as long
es = ie.document.getElementsByTagName("span").Length - 1
For e = 0 To es
Debug.Print ie.document.getElementsByTagName("span")(e).className
If ie.document.getElementsByTagName("span")(e).className = "account-website-name" Then
CompanyName = ie.document.getElementsByTagName("span")(e).innerText
Exit For
End If
Next e
'same thing, different method
Dim eSPN as MSHTML.IHTMLElement, ecSPNs as MSHTML.IHTMLElementCollection
Set ecSPNs = ie.document.getElementsByTagName("span")
For Each eSPN in ecSPNs
Debug.Print eSPN.className
If eSPN.className = "account-website-name" Then
CompanyName = eSPN.innerText
Exit For
End If
Next eSPN
Set eSPN = Nothing: Set ecSPNs = Nothing
To summarize, if your InternetExplorer method uses Elements (plural) rather than Element (singular), you are returning a collection which must have an index number appended if you wish to treat one of the elements within the collection as a single element.
CSS selector:
You can achieve the same thing with a CSS selector of .account-website-name
The "." means className. This will return a collection of matching elements if there are more than one.
VBA:
You apply the selector with the .querySelectorAll method of .document. This returns a nodeList which you traverse the .Length of, accessing items by index, starting from 0.
Dim aNodeList As Object, i As Long
Set aNodeList = ie.document.querySelectorAll(".account-website-name")
For i = 0 To aNodeList.Length -1
Debug.Print aNodeList.Item(i).innerText
' Debug.Print aNodeList(i).innerText ''<== sometimes this syntax instead
Next