Hi, I am trying to use RSelenium to select an item from a dropdown menu.
The field I want to click to open the dropdown is Date Range, so I looked in the HTML code (see picture below) and found
class="select2-choice"
to be the relevant selector, so I run this command to click on the dropdown menu
webElem <- rd$client$findElement(using = 'xpath',
value = '//*[@class="select2-choice"]')
webElem$clickElement()
Then I want to select "Custom" in the dropdown, so I looked in the HTML code (see picture below) and found that it is under
select id="namedRange-3640"
and the option is
value="custom"
so I run another RSelenium command to click on this custom option
webElem <- rd$client$findElement(using = 'xpath', "//select[@id='namedRange-3640']/option[@value='custom']")
webElem$clickElement()
However, nothing happens on the web page, and the code gives no warning either. I tried the same approach on a page with a much simpler structure (the W3C tutorial on dropdown menus) and it works. This case seems to be more complicated, with something called ng-repeat which I have not come across before. Does anyone know how to select the custom option?
Many thanks
This could be the solution.
library(RSelenium)
remDr <- remoteDriver(browser=c("firefox"), port = 4445)
remDr$open()
remDr$navigate("your_web_site.com")
frame_ws<- remDr$findElement(using='id', value="iframeResult")
remDr$switchToFrame(frame_ws)
# You can replace "today" with any other element in the list
option <- remDr$findElement(using = 'xpath', "//*/option[@value = 'today']")
option$clickElement()
If you want to explore this topic in more depth, you should visit here
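The locator itself can be checked offline before involving a browser. Below is a minimal sketch, assuming the xml2 package is installed; the markup is a made-up stand-in for the real page (the id "range" and the option labels are hypothetical). It only shows that an attribute predicate written with @ matches the intended option. It does not drive Selenium.

```r
library(xml2)

# Hypothetical markup standing in for the real page's <select> element.
doc <- read_html('<select id="range">
  <option value="today">Today</option>
  <option value="custom">Custom</option>
</select>')

# [@value="custom"] is an attribute predicate: it keeps only the
# <option> nodes whose value attribute equals "custom".
hit <- xml_find_first(doc, '//select[@id="range"]/option[@value="custom"]')

# Text content of the matched <option>.
label <- xml_text(hit)
print(label)
```

The same XPath string can then be passed to findElement(using = 'xpath', ...) once the locator is known to match.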
Related
I'm looking to use RSelenium to input some gene names into an online repository that creates a functional annotation heatmap for said genes.
However, I'm struggling to work out how to input the gene list into the text box to generate the heatmap.
Here is an image of the text box and the html associated with it.
The code I have so far (see below) can navigate successfully to the appropriate page, and select the appropriate text box but I can't work out how to input the text such that the list of genes at the bottom of the html code is added to as if I was typing the genes in manually. Note, the genes you can see in text box in the image were input manually.
##### Load driver and navigate to site #####
driver <- rsDriver(browser=c("chrome"), chromever="80.0.3987.106")
remDr <- driver[["client"]]
remDr$navigate("http://solo.bmap.ucla.edu/shiny/webapp/")
## Select heatmap option
gene_toggle <- remDr$findElement(using = 'css', '[class="dropdown-toggle"]')
gene_toggle$clickElement()
input <- remDr$findElement(using = 'css', '[data-value="panel-Heatmap"]')
input$clickElement()
## Input gene list to text box - not working yet - can't get text to enter properly
gene_select <- remDr$findElement(using = 'css', '[class="selectize-input items not-full has-options has-items"]')
gene_select$clickElement()
##### NOTE: I HAVE TRIED THE OPTIONS BELOW ....
gene_select$sendKeysToElement("NEUROD1")
gene_select$sendKeysToElement(list("NEUROD1"))
gene_select$sendKeysToElement(list("NEUROD1", key = "enter"))
gene_select$sendKeysToElement(list("NEUROD6, NEUROD2"))
I feel like I'm almost there, but unsure if I'm selecting the wrong element or formatting the sendKeysToElement command wrongly. I'm fairly new to RSelenium.
Any advice would be greatly appreciated.
You're almost there indeed, just need to select the <input> element inside your gene_select:
input <- gene_select$findChildElement(using = 'xpath', value = 'input')
input$sendKeysToElement(list("NEUROD2", key = "enter"))
I'm trying to scrape an NCBI web page (https://www.ncbi.nlm.nih.gov/protein/29436380) to obtain information about a protein. I need to access the gene_synonyms and GeneID fields. I have tried to find the relevant nodes with the SelectorGadget add-on in Chrome and with the code inspector in Firefox. I have tried this code:
require("dplyr")
require("rvest")
require("stringr")
GIwebPage <- read_html("https://www.ncbi.nlm.nih.gov/protein/29436380")
TestHTML <- GIwebPage %>% html_node("div.grid , div#maincontent.col.nine_col , div.sequence , pre.genebank , .feature") %>% html_text(trim = TRUE)
Then I try to find the relevant text but it is simply not there.
str_extract_all(TestHTML, pattern = "(synonym).{30}")
[[1]]
character(0)
str_extract_all(TestHTML, pattern = "(GeneID:).{30}")
[[1]]
character(0)
All I seem to be accessing is some of the text content of the column on the right.
str_extract_all(TestHTML, pattern = "(protein).{30}")
[[1]]
[1] "protein codes including ambiguities a"
[2] "protein sequence for myosin-9 (NP_00"
[3] "protein should not be confused with t"
[4] "protein, partial [Homo sapiens]gi|294"
[5] "protein codes including ambiguities a"
I have tried so many combinations of node selections with html_node() that I no longer know what to try. Is this content buried in some structure I can't see? Or am I just not skilled enough to work out which node to select?
Thanks a lot,
José.
The page loads the information dynamically. The underlying information is stored at another location.
Using the developer tools in your browser, look for the link:
The information you are looking for is stored at "viewer.fcgi"; right-click it to copy the link.
See similar question/answers: R not accepting xpath query
I have been searching for an answer to this, but I can't get it to work.
I can log in to the web page, and then it opens a new window. I need to select a value from a drop-down list.
Please help me with this.
HtmlElement
<select name="fldcustval" onchange="fnsetBankName();fnSetCurrencyList();fnGetCustomer();displayDetails(this);">..</select>
My Attempts
I have tried this, but can't get it to work.
ie.Document.getElementsByname("flducstval").Focus
ie.Document.getElementsByname("fldcustval").selectedindex=2
-->(the second option in dropdown list)
ie.Document.getElementsByName("fldcustval").FireEvent ("onchange")
I want to extract only "Beech Valley Solutions - "
When I run
html_nodes('li') %>%
html_nodes(".flexbox.empLoc") %>%
html_text()
all of the information comes out: "Beech Valley Solutions - Atlanta, GA Today 24hr"
There is one more way of doing scraping using rvest.
Instead of passing a CSS selector to html_nodes(), you can pass an XPath to html_nodes(). Just an example below:
page %>% html_nodes(xpath = "//*[@id='series-matches']/div[20]/div[3]/div[1]/a[1]/span")
Reference:
https://blog.rstudio.com/2014/11/24/rvest-easy-web-scraping-with-r/
The XPath is easy to fetch:
1. Right-click the section for which you want the XPath.
2. Select the inspect option from the drop-down.
3. The HTML will appear on the right side; right-click the highlighted element there and choose the Copy option.
4. A submenu will appear, from which select "Copy XPath".
5. Paste (Ctrl+V) the XPath into html_nodes(xpath = "xpath here").
I hope this helps.
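If the selected node still returns the combined text, the employer part can also be cut out at the string level. This is only a sketch in base R, and it assumes the scraped text always has the form "employer - location ..." with " - " as the separator, which the single example in the question cannot confirm.

```r
# Example string copied from the output in the question.
txt <- "Beech Valley Solutions - Atlanta, GA Today 24hr"

# Keep everything up to and including the first " - " separator.
# The lazy quantifier (.*?) requires perl = TRUE.
employer <- sub("^(.*? - ).*$", "\\1", txt, perl = TRUE)
print(employer)
```

sub() is vectorised, so the same call works on the whole character vector returned by html_text().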
Can anyone help me understand why the code below does not return any data for the selected table?
library('httr')
library('rvest')
url= read_html("http://projects.worldbank.org/search?lang=en&searchTerm=&sectorcode_exact=AB")
table = html_node(url,"table#f05v5-sorting-table.border-top2.border-allside.clearboth")
Thanks!
You are missing some steps. Your workflow should look like this:
dat_html <- read_html(
"http://projects.worldbank.org/search?lang=en&searchTerm=&sectorcode_exact=AB"
)
dat_nodes <- html_nodes(dat_html, xpath = "xxxx")
dat <- html_table(dat_nodes)
dat will be a list, so if you want a data frame, you could do something like:
dat_df <- as.data.frame(dat)
Or, if you like tibbles:
dat_tbl <- as_tibble(dat)
I cannot find the table you are interested in on that webpage, so you have to replace "xxxx" by the xpath of the table you are interested in.
To find the xpath, if you are inspecting the page from chrome or chromium, you can right click on the node in the inspector window, and look for Copy, then Copy XPath.
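The same workflow can be tried end to end on a small inline page, which helps separate "my XPath is wrong" from "the table is not in the downloaded HTML". This is a minimal sketch assuming rvest is installed; the table, its id "demo", and its contents are made up for illustration and have nothing to do with the World Bank page.

```r
library(rvest)

# Hypothetical stand-in page; the real one would come from read_html(url).
dat_html <- read_html('<html><body>
  <table id="demo">
    <tr><th>project</th><th>year</th></tr>
    <tr><td>Alpha</td><td>2001</td></tr>
    <tr><td>Beta</td><td>2005</td></tr>
  </table>
</body></html>')

# Select the table node(s) by XPath, then parse them.
dat_nodes <- html_nodes(dat_html, xpath = '//table[@id="demo"]')
dat <- html_table(dat_nodes)        # a list, one element per matched table

# Take the first (here: only) table as a data frame.
dat_df <- as.data.frame(dat[[1]])
print(dat_df)
```

If html_nodes() returns an empty node set on the real page, the table is probably not present in the HTML that read_html() received.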