Selenium login to page with Python 3.6 can't find element by name - selenium-chromedriver

Today I tried to write a bot for the ytpals.com webpage.
I am using the Python Selenium library.
The first thing I am trying to do is log in to the page with my YouTube channel ID.
But whatever I did, I was unable to find the element 'channelid'.
On top of that, the page sometimes doesn't load fully...
By the way, it worked for me on other pages to find an input form, but this page... I can't understand it.
Maybe someone has a better understanding than me and knows how to log in to this page?
My simple code:
import time
from selenium import webdriver
browser = webdriver.Firefox()
browser.get('https://www.ytpals.com/')
search = browser.find_element_by_name('channelid')
search.send_keys("testchannel")
time.sleep(5) # sleep for 5 seconds so you can see the results
browser.quit()

So I found a solution to my problem.
I downloaded Selenium IDE, and I can use it as a debugger. Such a great tool!
If anyone needs it, here is the link:
https://www.seleniumhq.org/docs/02_selenium_ide.jsp
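
For the original symptom (the element exists but is not found while the page is still loading), an explicit wait is also worth knowing about. A minimal sketch, assuming the field really is named 'channelid' and is not inside an iframe:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
browser = webdriver.Firefox()
browser.get('https://www.ytpals.com/')
# wait up to 10 seconds for the input to appear before using it
search = WebDriverWait(browser, 10).until(
    EC.presence_of_element_located((By.NAME, 'channelid'))
)
search.send_keys('testchannel')
browser.quit()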

Related

Changing innerhtml(onclick) in selenium python without auto reload

I'm trying to replace an "onclick" handler in the inner HTML of an educational website.
Originally it is
onclick="if(movieEnd){fnViewChildMove('4', 'ft05', '403');}else{ alert('Finish the video first'); }"
and I want to change it to
onclick="fnViewChildMove('4', 'ft05', '403');"
This works fine when I edit it manually in devtools.
So I wrote some Selenium Python code to automate it, but the page reloads itself when I execute the script.
The code is below:
import time
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('Website Address')
time.sleep(1)
js = """
document.body.innerHTML = document.body.innerHTML.replace("if(movieEnd){", "")
document.body.innerHTML = document.body.innerHTML.replace(";}else{ alert('Finish the video first'); }", "")
"""
driver.execute_script(js)
This code works fine on other websites, but not where I need to apply it,
because if I execute the script there, the page reloads itself and refreshes without applying the replacement.
It seems the website is preventing this kind of manipulation for some reason, but I have no idea how to get through it.
Is there any way to prevent the auto-reload when I execute the script against the website?
Thanks in advance!
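
One approach worth sketching (an assumption, not a confirmed fix for this site): rewriting document.body.innerHTML forces the browser to re-parse the whole page, which can itself cause reload-like behaviour. Editing only the target element's onclick attribute avoids that. The attribute selector below is a guess:
js = """
document.querySelectorAll('[onclick*="movieEnd"]').forEach(function (el) {
    var handler = el.getAttribute('onclick')
        .replace("if(movieEnd){", "")
        .replace(";}else{ alert('Finish the video first'); }", "");
    el.setAttribute('onclick', handler);  // change the attribute only, not the whole body
});
"""
driver.execute_script(js)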

splinter nested <html> documents

I am working on some website automation. Currently, I am unable to access a nested HTML document with Splinter. Here's a sample website that demonstrates what I am dealing with: https://www.w3schools.com/html/tryit.asp?filename=tryhtml_elem_select
I am trying to get into the select element and choose the "saab" option. I am stuck on how to enter the second HTML document. I've read the documentation and found nothing. I'm hoping there is a way with Python.
Any thoughts?
Before Solution:
from splinter import Browser
exe = {"executable_path": "chromedriver.exe"}
browser = Browser("chrome", **exe, headless=False)
url = "https://www.w3schools.com/html/tryit.asp?filename=tryhtml_elem_select"
browser.visit(url)
# This is where I'm stuck. I cannot find a way to access the second (nested) html doc
innerframe = browser.find_by_name("iframeResult").first
innerframe.find_by_name("cars")[0]
Solution:
from splinter import Browser
exe = {"executable_path": "chromedriver.exe"}
browser = Browser("chrome", **exe, headless=False)
url = "https://www.w3schools.com/html/tryit.asp?filename=tryhtml_elem_select"
browser.visit(url)
with browser.get_iframe("iframeResult") as iframe:
    cars = iframe.find_by_name("cars")
    cars.select("saab")
I figured out that these are called iframes. Once I learned the terminology, it wasn't too hard to figure out how to interact with them. Searching for "nested HTML documents" was not returning the results I needed to find the solution.
I hope this helps someone out in the future!
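
For comparison, the same thing in plain Selenium uses switch_to.frame; a sketch under the same assumptions as the Splinter solution above:
from selenium import webdriver
from selenium.webdriver.support.select import Select
driver = webdriver.Chrome()
driver.get("https://www.w3schools.com/html/tryit.asp?filename=tryhtml_elem_select")
driver.switch_to.frame("iframeResult")  # enter the nested document
Select(driver.find_element_by_name("cars")).select_by_value("saab")
driver.switch_to.default_content()  # back to the top-level document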

Website hiding page footer from parser

I am trying to find the donation button on the website of
The University of British Columbia.
The donation button is located in the page footer, within the div classed as "span7".
However, when scraped, the HTML yielded the div with nothing inside it.
My program works perfectly with the div passed in directly as the source:
from bs4 import BeautifulSoup as bs
import re
site = '''<div class="span7" id="ubc7-footer-menu"><div class="row-fluid"><div class="span6"><h3>About UBC</h3><div>Contact UBC</div><div>About the University</div><div>News</div><div>Events</div><div>Careers</div><div>Make a Gift</div><div>Search UBC.ca</div></div><div class="span6"><h3>UBC Campuses</h3><div>Vancouver Campus</div><div>Okanagan Campus</div><h4>UBC Sites</h4><div>Robson Square</div><div>Centre for Digital Media</div><div>Faculty of Medicine Across BC</div><div>Asia Pacific Regional Office</div></div></div></div>'''
html = bs(site, 'html.parser')
link = html.find('a', string=re.compile('(?i)(donate|donation|gift)'))
# returns the proper donation URL
However, using the live site does not work:
from bs4 import BeautifulSoup as bs
import requests
import re
site = requests.get('https://www.ubc.ca/')
html = bs(site.content, 'html.parser')
link = html.find('a', string=re.compile('(?i)(donate|donation|gift)'))
# returns None
Is there something wrong with my parser? Is it some sort of anti-scrape maneuver? Am I doomed?
I cannot seem to find the 'Donate' button on the URL that you provided, but there is nothing inherently wrong with your parser; it's just that the GET request you send only gives you the HTML initially returned in the response, rather than waiting for the page to fully render.
It appears that parts of the page are filled in by Javascript. You can use Splash, which renders Javascript-based pages. You can run Splash in Docker quite easily and just make HTTP requests to the Splash container, which will return HTML that looks just like the webpage as rendered in a web browser.
Although this sounds overly complicated, it is actually quite simple to set up, since you don't need to modify the Docker image at all and need no previous knowledge of Docker to get it to work. Starting a local Splash server takes a single command:
docker run -p 8050:8050 -p 5023:5023 scrapinghub/splash
You then just modify any existing requests in your Python code to route through Splash instead:
i.e. http://example.com/ becomes
http://localhost:8050/render.html?url=http://example.com/
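
In practice that is a small change to the original Python code; a sketch (the wait parameter just gives the Javascript time to run):
import re
import requests
from bs4 import BeautifulSoup as bs
# fetch the rendered page through a local Splash instance instead of the site directly
rendered = requests.get(
    'http://localhost:8050/render.html',
    params={'url': 'https://www.ubc.ca/', 'wait': 2},
)
html = bs(rendered.content, 'html.parser')
link = html.find('a', string=re.compile('(?i)(donate|donation|gift)'))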

How to fill out a web form and return the data without knowing the web form id/name in Python

I am currently trying to automatically submit information into the web forms on this website: https://coinomi.com/recovery-phrase-tool.html. Unfortunately, I do not know the names of the forms and can't seem to find them in the source code. I have tried to fill out the forms using the requests Python module, just by passing the parameters through the URL before scraping it, but without the form names I can't do that either.
If possible, I wanted to do this with the offline version of the website at https://github.com/Coinomi/bip39/blob/master/bip39-standalone.html so that it is more secure, but I barely know how to use regular web forms with the tools I have, let alone locally from my computer.
I am not sure exactly what you are looking for. However, here is some code that uses Selenium to fill in parts of the form you mention.
from selenium import webdriver
from selenium.webdriver.support.select import Select
browser = webdriver.Chrome('C:\\Users...\\chromedriver.exe')
browser.get('https://coinomi.com/recovery-phrase-tool.html')
# Example to fill a text box
recoveryPhrase = browser.find_element_by_id('phrase')
recoveryPhrase.send_keys('your answer')
# Example to select a element
numberOfWords = Select(browser.find_element_by_id('strength'))
numberOfWords.select_by_visible_text('24')
# Example to click a button
generateRandomMnemonic = browser.find_element_by_xpath('/html/body/div[1]/div[1]/div/form/div[4]/div/div/span/button')
generateRandomMnemonic.click()
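
For the offline copy mentioned in the question, Selenium can also load a local file through a file:// URL; a short sketch (the path is hypothetical):
import pathlib
# hypothetical path to the downloaded standalone page
local_url = pathlib.Path('bip39-standalone.html').resolve().as_uri()
browser.get(local_url)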

Error when getting website table data using Python Selenium - multiple tables and "Unable to locate element"

I am trying to get info from the Brazilian stock market (BMF BOVESPA). The website has several tables, but my code is not able to get them.
The code below aims to get all data from the table "Ações em Circulação no Mercado", one of the last tables on the webpage.
I have tried the selectors below, but neither worked for me:
content = browser.find_element_by_css_selector('//div[@id="div1"]')
and
table = browser.find_element_by_xpath('//*[@id="div1"]/div/div/div[1]/table/tbody')
Thanks in advance for taking my question.
from selenium import webdriver
from time import sleep
url = "http://bvmf.bmfbovespa.com.br/cias-Listadas/Empresas-Listadas/ResumoEmpresaPrincipal.aspx?codigoCvm=19348&idioma=pt-br"
browser = webdriver.Chrome()
browser.get(url)
sleep(5)  # wait for the website to load
content = browser.find_element_by_css_selector('//div[@id="div1"]')
The HTML can be found in the attached picture.
As an alternative, the code below reaches the same website:
url = "http://bvmf.bmfbovespa.com.br/cias-Listadas/Empresas-Listadas/BuscaEmpresaListada.aspx?idioma=pt-br"
Ticker = 'ITUB4'
browser = webdriver.Chrome()
browser.get(url)
sleep(2)
browser.find_element_by_xpath('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_txtNomeEmpresa_txtNomeEmpresa_text"]').send_keys(Ticker)
browser.find_element_by_xpath('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_btnBuscar"]').click()
content = browser.find_element_by_id('div1')
Selenium with Python documentation (unofficial)
Hi there!
Selenium provides the following methods to locate elements in a page:
find_element_by_id
find_element_by_name
find_element_by_xpath
find_element_by_link_text
find_element_by_partial_link_text
find_element_by_tag_name
find_element_by_class_name
find_element_by_css_selector
Why doesn't your code work? Because you're not using the correct method to locate the element:
you're using XPath syntax inside a CSS selector.
content = browser.find_element_by_css_selector('//div[@id="div1"]')  # this part is wrong
Instead, you can do this if you want to select div1:
content = browser.find_element_by_id('div1')
Here's the corrected code:
url = "http://bvmf.bmfbovespa.com.br/cias-Listadas/Empresas-Listadas/BuscaEmpresaListada.aspx?idioma=pt-br"
Ticker = 'ITUB4'
browser = webdriver.Chrome()
browser.get(url)
sleep(2)
browser.find_element_by_xpath('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_txtNomeEmpresa_txtNomeEmpresa_text"]').send_keys(Ticker)
browser.find_element_by_xpath('//*[@id="ctl00_contentPlaceHolderConteudo_BuscaNomeEmpresa1_btnBuscar"]').click()
I tested it and it worked :)
Mark it as the best answer if I helped you :)
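
For reference, these locators all target the same element; the CSS selector and XPath forms are just the syntactically correct versions of what the question attempted (a sketch):
content = browser.find_element_by_id('div1')  # by id
content = browser.find_element_by_css_selector('div#div1')  # CSS selector syntax
content = browser.find_element_by_xpath('//div[@id="div1"]')  # XPath syntax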