I am trying to access the following web page in Python (https://power.larc.nasa.gov/data-access-viewer/). A confirmation page (let's say "Welcome to the power data access viewer!") appears at the same time as the main page (let's say "Power Data Access Viewer"). I used the following code to close the confirmation page by clicking the "Access Data" button, but I could not succeed. Any help is highly appreciated.
from selenium import webdriver
import time

driver = webdriver.Chrome()
driver.get('https://power.larc.nasa.gov/data-access-viewer/')
time.sleep(5)
driver.find_element_by_xpath("//*[@id='mysplash']/div[2]/div[2]").click()
Unfortunately, I received the error below. I should add that the two pages (the main and confirmation pages) belong to two separate classes within a single web page, whereas an alert/popup would share the same class; because of this, the switch_to approach did not work either. Furthermore, I used driver.window_handles to get the addresses of the two pages, but the handles I received were identical.
Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id='mysplash']/div[2]/div[2]"}
Again, thanks for your help in advance.
[Figure: illustration of the problem]
You need to wait: the confirmation page appears after a few moments, and then you can locate the element by CSS selector:
driver.get("https://power.larc.nasa.gov/data-access-viewer/")
element = WebDriverWait(driver, 30).until(EC.element_to_be_clickable((By.CSS_SELECTOR, 'div.enable-btn')))
element.click()
with the following imports:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
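Putting it all together, a minimal self-contained sketch (the div.enable-btn selector is the one used above; adjust it if the splash dialog's markup changes):

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://power.larc.nasa.gov/data-access-viewer/')

# wait up to 30 seconds for the "Access Data" button on the splash
# dialog to become clickable, then click it to reach the main viewer
access_data = WebDriverWait(driver, 30).until(
    EC.element_to_be_clickable((By.CSS_SELECTOR, 'div.enable-btn')))
access_data.click()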
I am having issues clicking a button with Selenium. I have never worked with Selenium before, so I tried searching the web for a solution, but have had no luck. I also tried some other things such as WebDriverWait, but nothing has worked.
# My Code
from selenium import webdriver
from selenium.webdriver.common.action_chains import ActionChains
import time
PATH = r"F:\SeleniumProjects\chromedriver.exe"
options = webdriver.ChromeOptions()
options.add_argument("start-maximized");
options.add_argument("disable-infobars")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(options=options, executable_path=PATH)
driver.get("https://www.discord.com")
time.sleep(3)
buddy = driver.find_element_by_xpath("/html/body/div/div/div/div[1]/div[1]/header[1]/nav/div/a")
ActionChains(driver).move_to_element(buddy).click().perform()
This exception is confusing me because I know the element can be interacted with, yet Selenium says it isn't interactable. I am sure there is some simple fix, but I am stumped.
selenium.common.exceptions.ElementNotInteractableException: Message: element not interactable: https://discord.com/login has no size and location
(Session info: chrome=90.0.4430.93)
Here is the button I am trying to press:
<a class="button-195cDm buttonWhite-18r1SC buttonSmall-2bnF7I gtm-click-class-login-button button-1x6X9g mobileAppButton-2dMGaq" href="//discord.com/login">Login</a>
Wrap a wait around it and the following should work:
buddy = WebDriverWait(driver,10).until(EC.visibility_of_element_located((By.XPATH, "//div[@class='appButton-2wSXh-']//a[contains(text(),'Login')]")))
buddy.click()
There are 2 elements of this on the page:
//a[contains(text(),'Login')]
That is why we need to go up one more level, to the div element, in the XPath.
Import the following at the top of your script if they aren't there already:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
ActionChains is not needed here; it is generally used for things like dropdown menus. Also, your XPath was the full absolute path from the HTML root, which is generally not advisable and can be brittle.
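If you'd rather not rely on the hashed class name in the XPath at all, a more general sketch is to collect every match and filter for the one the browser actually renders (this assumes the invisible duplicate reports is_displayed() as False, consistent with the "no size and location" error above):

from selenium.webdriver.common.by import By

# gather all anchors whose text is 'Login', then keep only the visible one
candidates = driver.find_elements(By.XPATH, "//a[contains(text(),'Login')]")
visible = [el for el in candidates if el.is_displayed()]
if visible:
    visible[0].click()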
This is an open site with tax information. I access the site and click on 'Abzüge'; the next page then appears in English. I try to change it to German, but I cannot locate the button. My current code is
from selenium import webdriver
driver = webdriver.Chrome('/path-to-my-chromedriver/chromedriver')
driver.get('https://www.estv.admin.ch/estv/de/home/allgemein/steuerstatistiken/fachinformationen/steuerbelastungen/steuerfuesse.html')
elem1 = driver.find_element_by_link_text('Abzüge')
elem1.click()
de = driver.find_element_by_xpath('/html/body/div/div[4]/header/div/div[2]/div/button[1]')
de.click()
driver.close()
The error is NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"/html/body/div/div[4]/header/div/div[2]/div/button[1]"} (Session info: chrome=85.0.4183.102)
I have also tried to find the button by class and by shorter XPaths, but I can never locate it.
I also need to select some buttons in the main field. I gave that a try as well, but it does not work either. My code to select Whole of Switzerland, without changing to German, is
element = driver.find_element_by_xpath('//*[@id="LOCATION"]/div[2]/div[1]/div[1]/div/div/div[2]')
element.click()
I get the same error, NoSuchElementException.
How would I change the language on the webpage, and how would I select a button in the body?
You can use the code below, with the XPath locator discussed in the comments.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver = webdriver.Chrome()
driver.get('https://www.estv.admin.ch/estv/de/home/allgemein/steuerstatistiken/fachinformationen/steuerbelastungen/steuerfuesse.html')
# wait for the language entry in the nav bar to become clickable
WebDriverWait(driver,20).until(EC.element_to_be_clickable((By.XPATH, "//nav[@class='nav-lang']//li[contains(., 'IT')]")))
print(driver.find_element(By.XPATH, "//nav[@class='nav-lang']//li[contains(., 'IT')]").text)
driver.find_element(By.XPATH, "//nav[@class='nav-lang']//li[contains(., 'IT')]").click()
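Since the goal was German rather than Italian, the same locator pattern should work with the language code swapped; treat the 'DE' entry text as an assumption about that nav bar:

# same nav-lang locator, assuming the bar also lists a 'DE' entry
de = WebDriverWait(driver, 20).until(
    EC.element_to_be_clickable((By.XPATH, "//nav[@class='nav-lang']//li[contains(., 'DE')]")))
de.click()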
I am trying to make a bot that can play Cookie Clicker. I have successfully opened the website using the webbrowser module. When I use the developer tools to view the HTML, I can see the information I want to obtain, such as how much money I have, how expensive items are, etc. But when I try to get that information using requests and BeautifulSoup, it instead gets the HTML of a new window. How can I get the HTML of the already-opened tab?
import webbrowser
webbrowser.open('https://orteil.dashnet.org/cookieclicker/')
from bs4 import BeautifulSoup
import requests
def scrape():
    html = requests.get('https://orteil.dashnet.org/cookieclicker/')
    print(html)

scrape()
requests.get makes a completely new request of its own, so it cannot see the tab that webbrowser opened. Drive the browser with Selenium instead; then you can try to do this:
driver = webdriver.Chrome()
driver.get('https://orteil.dashnet.org/cookieclicker/')
body_element = driver.find_element_by_xpath("//body")
body_content = body_element.get_attribute("innerHTML")
print(body_content)
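For the original goal of reading game state, you can keep querying the live DOM through the driver. A minimal sketch; the bigCookie and cookies element ids are assumptions about the game's current markup, so verify them in the developer tools first:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://orteil.dashnet.org/cookieclicker/')

# wait for the game to finish loading; the game may first show a
# language prompt, which would need to be dismissed before this works
big_cookie = WebDriverWait(driver, 60).until(
    EC.element_to_be_clickable((By.ID, 'bigCookie')))
big_cookie.click()

# read the money counter (id 'cookies' is likewise an assumption)
print(driver.find_element(By.ID, 'cookies').text)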
I am trying to find the donation button on the website of The University of British Columbia.
The donation button is located in the page footer, within the div classed as "span7".
However, when scraped, the HTML yielded the div with nothing inside it.
My program works perfectly with the div passed in directly as the source:
from bs4 import BeautifulSoup as bs
import re
site = '''<div class="span7" id="ubc7-footer-menu"><div class="row-fluid"><div class="span6"><h3>About UBC</h3><div>Contact UBC</div><div>About the University</div><div>News</div><div>Events</div><div>Careers</div><div>Make a Gift</div><div>Search UBC.ca</div></div><div class="span6"><h3>UBC Campuses</h3><div>Vancouver Campus</div><div>Okanagan Campus</div><h4>UBC Sites</h4><div>Robson Square</div><div>Centre for Digital Media</div><div>Faculty of Medicine Across BC</div><div>Asia Pacific Regional Office</div></div></div></'''
html = bs(site, 'html.parser')
link = html.find('a', string=re.compile('(?i)(donate|donation|gift)'))
#returns proper donation URL
However, using the live site does not work:
from bs4 import BeautifulSoup as bs
import requests
import re
site = requests.get('https://www.ubc.ca/')
html = bs(site.content, 'html.parser')
link = html.find('a', string=re.compile('(?i)(donate|donation|gift)'))
#returns none
Is there something wrong with my parser? Is it some sort of anti-scrape maneuver? Am I doomed?
I cannot seem to find the 'Donate' button on the URL that you provided, but there is nothing inherently wrong with your parser; it's just that the GET request you send only gives you the HTML initially returned in the response, rather than waiting for the page to fully render.
It appears that parts of the page are filled in by JavaScript. You can use Splash, which is designed to render JavaScript-based pages. You can run Splash in Docker quite easily and just make HTTP requests to the Splash container, which will return HTML that looks just like the web page as rendered in a browser.
Although this sounds overly complicated, it is actually quite simple to set up since you don't need to modify the Docker image at all, and you need no previous knowledge of Docker to get it to work. It requires just a single line from the command line to start a local Splash server:
docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash
You then just modify any existing requests in your Python code to route through Splash instead:
i.e. http://example.com/ becomes
http://localhost:8050/render.html?url=http://example.com/
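For example, fetching the rendered UBC page through a local Splash container and re-running the original search might look like this (the wait parameter gives the page's scripts a moment to finish; tune it as needed):

import requests
import re
from bs4 import BeautifulSoup as bs

# ask the local Splash server to render the page, then parse as before
resp = requests.get('http://localhost:8050/render.html',
                    params={'url': 'https://www.ubc.ca/', 'wait': 2})
html = bs(resp.content, 'html.parser')
link = html.find('a', string=re.compile('(?i)(donate|donation|gift)'))
print(link)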
Hi, I'm practicing extracting information from a website.
(I'm using Python, Selenium, and BeautifulSoup, which doesn't matter too much. The question is about finding an element in HTML.)
So (1) I want the info in the table. I located the table using the Firefox Inspector: <table id='......'>
(2) but in my code I can't find it:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
from bs4 import BeautifulSoup
url = 'http://corp.sec.state.ma.us/corpweb/UCCSearch/UCCSearch.aspx'
driver = webdriver.Firefox()
driver.get(url)
# navigate to the page I want using selenium
driver.find_element_by_id("MainContent_rdoSearchO").click()
driver.find_element_by_id("MainContent_txtName").send_keys("mcdonald")
Select(driver.find_element_by_id("MainContent_cboOState")).select_by_visible_text("Massachusetts")
Select(driver.find_element_by_id("MainContent_UCCSearchMethodO")).select_by_visible_text("Begins With")
driver.find_element_by_id("MainContent_btnSearch").click()
# now on next page, click link (selenium)
link_text = '95352026'
driver.find_element_by_link_text(link_text).click()
### real question starts here:
# now on the page I want
# in firefox inspector find: <table id="MainContent_tblFilingHistory">
table_id = 'MainContent_tblFilingHistory'
# try to find it
table = driver.find_elements_by_id(table_id)
len(table)  # length = 0, can't find it
html = driver.page_source  # the raw source of the current window
html.find(table_id)  # -1, the HTML really doesn't contain this string
The element you are having trouble locating is in another window. You need to tell the driver to switch its context to that window:
from selenium import webdriver
from selenium.webdriver.support.ui import Select
driver = webdriver.Firefox()
driver.get('http://corp.sec.state.ma.us/corpweb/UCCSearch/UCCSearch.aspx')
driver.find_element_by_id("MainContent_rdoSearchO").click()
driver.find_element_by_id("MainContent_txtName").send_keys("mcdonald")
Select(driver.find_element_by_id("MainContent_cboOState")).select_by_visible_text("Massachusetts")
Select(driver.find_element_by_id("MainContent_UCCSearchMethodO")).select_by_visible_text("Begins With")
driver.find_element_by_id("MainContent_btnSearch").click()
driver.find_element_by_link_text('95352026').click()
#switch to the next window
driver.switch_to.window(driver.window_handles[1])
table = driver.find_elements_by_id('MainContent_tblFilingHistory')
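Once you are done in the popup, the driver does not switch back on its own; a small follow-up to return to the original window:

# optional: close the popup and hand control back to the first window
driver.close()                                     # closes only the current (popup) window
driver.switch_to.window(driver.window_handles[0])  # back to the original window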