In the question "How can I get a url from Chrome by Python?", it was mentioned that pywinauto 0.6 can read the URL from Python. How is it done?
Using inspect.exe (mentioned in the Getting Started guide), you can find Chrome's address bar element and see that its "Value" property contains the current URL.
I found two ways to get this url:
from __future__ import print_function
from pywinauto import Desktop
chrome_window = Desktop(backend="uia").window(class_name_re='Chrome')
address_bar_wrapper = chrome_window['Google Chrome'].main.Edit.wrapper_object()
Here's the first way:
url_1 = address_bar_wrapper.legacy_properties()['Value']
Here's the second:
url_2 = address_bar_wrapper.iface_value.CurrentValue
print(url_1)
print(url_2)
Also, if the protocol is "http", Chrome strips the "http://" prefix. You can add something like:
def format_url(url):
    if url and not url.startswith("https://"):
        return "http://" + url
    return url
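The helper above only checks for "https://", so an address bar value such as "chrome://settings" or "file:///..." would also get "http://" prepended. A minimal sketch of a slightly more defensive variant follows; the name ensure_scheme and the default_scheme parameter are hypothetical, not part of pywinauto.

```python
import re

# Matches a leading URL scheme such as "https://", "file://" or "chrome://".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def ensure_scheme(url, default_scheme="http"):
    """Prefix `url` with `default_scheme` only when it has no scheme at all."""
    if not url:
        return url  # empty string or None is returned unchanged
    if _SCHEME_RE.match(url):
        return url  # already has a scheme, leave it alone
    return default_scheme + "://" + url


print(ensure_scheme("example.com/page"))
print(ensure_scheme("https://example.com"))
print(ensure_scheme("chrome://settings"))
```

It can be applied to either url_1 or url_2 from the snippet above.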
I want to load my own JavaScript and HTML code into QWebEngine so that it reads the code and renders it in the browser. Is this possible? I think there is a way to do this, as I have been reading about it on the internet, but I'm unsure how. My code for the browser is:
from PyQt5.QtWidgets import *
from PyQt5.QtCore import *
from PyQt5.QtWebEngineWidgets import *
class MyWebBrowser(QMainWindow):
    def __init__(self,):
        super(MyWebBrowser, self).__init__()
        self.window=QWidget()
        self.window.setWindowTitle("Brave")
        self.layout=QVBoxLayout()
        self.horizontal = QHBoxLayout()
        self.url_bar = QTextEdit()
        self.url_bar.setMaximumHeight(30)
        self.go_btn=QPushButton("Go")
        self.go_btn.setMinimumHeight(30)
        self.back_btn = QPushButton("<")
        self.back_btn.setMinimumHeight(30)
        self.forward_btn = QPushButton(">")
        self.forward_btn.setMinimumHeight(30)
        self.horizontal.addWidget(self.url_bar)
        self.horizontal.addWidget(self.go_btn)
        self.horizontal.addWidget(self.back_btn)
        self.horizontal.addWidget(self.forward_btn)
        self.browser=QWebEngineView()
        self.go_btn.clicked.connect(lambda: self.navigate(self.url_bar.toPlainText()))
        self.back_btn.clicked.connect(self.browser.back)
        self.forward_btn.clicked.connect(self.browser.forward)
        self.layout.addLayout(self.horizontal)
        self.layout.addWidget(self.browser)
        self.browser.setUrl(QUrl("http://www.google.com"))
        self.window.setLayout(self.layout)
        self.window.show()

    def navigate(self,url):
        if not url.startswith("http"):
            url = "http://" + url
            self.url_bar.setText(url)
        # redirect to your website
        if "google.com" in url:
            url = "http://stackoverflow.com"
        self.browser.setUrl(QUrl(url))
app=QApplication([])
window=MyWebBrowser()
app.exec_()
If I create an HTML/JavaScript file called somefile, how do I load that code into the browser?
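One possible approach, sketched here under the assumption that the file lives on the local disk: turn its path into a file:// URL and hand that to the same setUrl call the browser already uses. The helper names local_file_url and load_local_page are hypothetical, and "somefile.html" is only the example name from the question.

```python
from pathlib import Path


def local_file_url(path):
    """Return a file:// URL for a path on disk (the file need not exist yet)."""
    return Path(path).resolve().as_uri()


def load_local_page(view, path):
    """Point a QWebEngineView at a local HTML file."""
    # Imported lazily so the URL helper above works without PyQt5 installed.
    from PyQt5.QtCore import QUrl
    view.setUrl(QUrl(local_file_url(path)))
```

Inside MyWebBrowser this could be called as load_local_page(self.browser, "somefile.html"). Because the page is loaded from a file URL, a script tag with a relative src in that HTML file is resolved against the file's own directory.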
I would love some help logging into the Fidelity website and navigating within it. My attempts so far have not led anywhere significant. Here is the code I have written, after much consultation with answers around the web. The steps are:
1. Log in to Fidelity.
2. Check that the response code is not 200 but 302 or 303. My code passes this test (with a code of 302).
3. Check the number of cookies returned (there were 5) and, for each cookie, try to navigate to a different web page within Fidelity. I do this five times, once per cookie, simply because I do not know which subscript "j" of the variable "cookie" will work.
function loginToFidelity(){
  var url = "https://www.fidelity.com";
  var payload = {
    "username":"*********",
    "password":"*********"
  };
  var opt = {
    "payload":payload,"method":"post","followRedirects" : false
  };
  var response = UrlFetchApp.fetch(encodeURI(url),opt);
  if ( response.getResponseCode() == 200 ) {
    Logger.log("Couldn't login.");
    return
  }
  else if (response.getResponseCode() == 303 || response.getResponseCode() == 302) {
    Logger.log("Logged in successfully. " + response.getResponseCode());
    var cookie = response.getAllHeaders()['Set-Cookie']
    for (j = 0; j < cookie.length; j++) {
      var downloadPage = UrlFetchApp.fetch("https://oltx.fidelity.com/ftgw/fbc/oftop/portfolio#activity",
        {"Cookie" : cookie[j],"method" : "post","followRedirects" : false,"payload":payload});
      Logger.log(downloadPage.getResponseCode())
      Logger.log(downloadPage.getContentText())
    }
  }
}
For each value of the subscript "j", I get the same ResponseCode (always 302) and the same ContentText. The ContentText is clearly not the page I expected; it is shown below:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>302 Found</title>
</head><body>
<h1>Found</h1>
<p>The document has moved here.</p>
</body></html>
Based on this, I have two questions:
Have I logged into the Fidelity site correctly? If not, why do I get a response code of 302 in the login process? What do I need to do differently to login correctly?
Why am I getting such a strange and obviously incorrect answer for my ContentText while getting a perfectly reasonable ResponseCode of 302? What do I need to do differently, so that I can get the password-controlled page within Fidelity, whose url is "https://oltx.fidelity.com/ftgw/fbc/oftop/portfolio#activity"?
NOTE: Some other tests have been done in addition to the one stated above. Results from these tests are provided in the discussion below.
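One thing that may be worth checking, offered as an assumption rather than a diagnosis: a server normally expects all of its cookies back together in a single Cookie request header, as plain name=value pairs without attributes such as Path or HttpOnly, and as far as I know UrlFetchApp only sends custom headers that are placed inside the "headers" option. The sketch below shows the idea in Python with the standard library; the function name cookie_header and the sample cookie values are made up for illustration.

```python
from http.cookies import SimpleCookie


def cookie_header(set_cookie_values):
    """Build one Cookie header value from a list of Set-Cookie header strings."""
    pairs = []
    for raw in set_cookie_values:
        jar = SimpleCookie()
        jar.load(raw)  # parses the name=value pair and drops it into the jar
        for name, morsel in jar.items():
            # Keep only name=value; attributes (Path, Secure, ...) are not sent back.
            pairs.append("{}={}".format(name, morsel.value))
    return "; ".join(pairs)


# Made-up sample values, only to show the shape of the result.
print(cookie_header(["SESSION=abc123; Path=/; HttpOnly", "LANG=en; Secure"]))
```

The resulting string would then be sent as the value of a single "Cookie" header on the follow-up request, instead of trying each cookie separately.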
Here is something that worked for me. You may have found the solution already; I'm not sure. Remember to fill in your login id where the XXXXX is and your PIN for YYYYY.
I understand this is Python code, not Google Apps Script, but you get the idea of the code flow.
import requests, sys, lxml.html
s = requests.Session()
r = s.get('https://login.fidelity.com')
payload = {
    'DEVICE_PRINT' : 'version%3D3.5.2_2%26pm_fpua%3Dmozilla%2F5.0+(x11%3B+linux+x86_64%3B+rv%3A41.0)+gecko%2F20100101+firefox%2F41.0%7C5.0+(X11)%7CLinux+x86_64',
    'SavedIdInd' : 'N',
    'SSN' : 'XXXXX',
    'PIN' : 'YYYYY'
}
# NOTE: login_url is not defined in this snippet; it must hold the login form's action URL
r = s.post(login_url, data=payload, headers=dict(referer='https://login.fidelity.com'))
response = s.get('https://oltx.fidelity.com/ftgw/fbc/oftop/portfolio')
print(response.content)
mwahal, you left out the critical form action URL (your login_url is undefined).
This works if added to your Python code:
login_url = 'https://login.fidelity.com/ftgw/Fas/Fidelity/RtlCust/Login/Response/dj.chf.ra'
By the way, here is the result of printing the response after the POST, showing a successful login:
{"status":
{
"result": "success",
"nextStep": "Finish",
"context": "RtlCust"
}
}
Or, adding some code:
if r.status_code == requests.codes.ok:
    status = r.json().get('status')
    print(status["result"])
gets you "success".
Unfortunately the answer from @mwahal doesn't work anymore. I've been trying to figure out why and will update if I do. One issue is that the login page now requires a cookie from the cfa.fidelity.com domain, which only gets set when one of the linked JavaScript files is loaded.
One alternative is to use selenium if you just want to navigate the site, or seleniumrequests if you want to tap into Fidelity's internal APIs.
There is a hitch with seleniumrequests for the transactions API: the API requires Content-Type: application/json, and seleniumrequests doesn't seem to support custom headers in requests. So I use selenium to log in, call one of the APIs that doesn't need that header, copy and then edit the request headers attached to the response, and use regular requests to get the transactions:
from seleniumrequests import Chrome
from selenium.webdriver.common.by import By
import requests
# Log into Fidelity (username and password must be defined beforehand)
driver = Chrome()
driver.get("https://www.fidelity.com")
driver.find_element(By.ID, "userId-input").send_keys(username)
driver.find_element(By.NAME, "PIN").send_keys(password)
driver.find_element(By.ID, "fs-login-button").click()
# Call an API that does not need the JSON content type, then reuse its request headers
r = driver.request('GET', 'https://digital.fidelity.com/ftgw/digital/rsc/api/profile-data')
headers = r.request.headers
headers['accept'] = "application/json, text/plain, */*"
headers['content-type'] = "application/json"
# Fetch the transactions with plain requests, which accepts custom headers
payload = '{"acctDetails":[{"acctNum":"<AcctId>"}],"searchCriteriaDetail":{"txnFromDate":1583639342,"txnToDate":1591411742}}'
api = "https://digital.fidelity.com/ftgw/digital/dc-history/api"
r = requests.post(api, headers=headers, data=payload)
transactions = r.json()
The idea is to collect the ids (not names) of all SoundCloud users who posted tracks whose first letter is, e.g., "f" within a given period, in our case the past year.
I applied the filters on SoundCloud and got the results at the following URL: https://soundcloud.com/search/sounds?q=f&filter.created_at=last_year&filter.genre_or_tag=hip-hop%20%26%20rap
I found the first user's id ("wavey-hefner") in the following line of HTML:
<a class="sound__coverArt" href="/wavey-hefner/foreign" draggable="true">
I want to get every user's id from the whole HTML page.
My code is:
import requests
import re
from bs4 import BeautifulSoup
html = requests.get("https://soundcloud.com/search/sounds?q=f&filter.created_at=last_year&filter.genre_or_tag=hip-hop%20%26%20rap")
soup = BeautifulSoup(html.text, 'html.parser')
for id in soup.findAll("a", {"class" : "sound__coverArt"}):
    print (id.get('href'))
It returns nothing :(
The page is rendered with JavaScript. You can use Selenium to render it. First install Selenium:
pip3 install selenium
Then get a driver, e.g. from https://sites.google.com/a/chromium.org/chromedriver/downloads, and put it on your PATH. (On Windows or Mac you can use a headless version of Chrome, Canary, if you like.)
from bs4 import BeautifulSoup
from selenium import webdriver
import time
browser = webdriver.Chrome()
url = 'https://soundcloud.com/search/sounds?q=f&filter.created_at=last_year&filter.genre_or_tag=hip-hop%20%26%20rap'
browser.get(url)
time.sleep(5)
# To make it load more scroll to the bottom of the page (repeat if you want to)
browser.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(5)
html_source = browser.page_source
browser.quit()
soup = BeautifulSoup(html_source, 'html.parser')
for id in soup.findAll("a", {"class" : "sound__coverArt"}):
    print (id.get('href'))
Outputs:
/tee-grizzley/from-the-d-to-the-a-feat-lil-yachty
/empire/fat-joe-remy-ma-all-the-way-up-ft-french-montana
/tee-grizzley/first-day-out
/21savage/feel-it
/pluggedsoundz/famous-dex-geek-1
/rodshootinbirds/fairytale-x-rod-da-god
/chancetherapper/finish-line-drown-feat-t-pain-kirk-franklin-eryn-allen-kane-noname
/alkermith/future-low-life-ft-the-weeknd-evol
/javon-woodbridge/fabolous-slim-thick
/hamburgerhelper/feed-the-streets-prod-dequexatron-1000
/rob-neal-139819089/french-montana-lockjaw-remix-ft-gucci-mane-kodak-black
/pluggedsoundz/famous-dex-energy
/ovosoundradiohits/future-ft-drake-used-to-this
/pluggedsoundz/famous
/a-boogie-wit-da-hoodie/fucking-kissing-feat-chris-brown
/wavey-hefner/foreign
/jalensantoy/foreplay
/yvng_swag/fall-in-luv
/rich-the-kid/intro-prod-by-lab-cook
/empire/fat-joe-remy-ma-money-showers-feat-ty-dolla-ign
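The hrefs above have the form /user-id/track-slug, so the user ids the question asks for are the first path segment of each one. A minimal sketch of that last step, assuming every href follows this shape (the function name user_ids is hypothetical):

```python
def user_ids(hrefs):
    """Return the unique first path segments of track hrefs, in first-seen order."""
    seen = []
    for href in hrefs:
        parts = href.strip("/").split("/")
        # Skip empty hrefs and ids that were already collected.
        if parts and parts[0] and parts[0] not in seen:
            seen.append(parts[0])
    return seen


sample = [
    "/tee-grizzley/from-the-d-to-the-a-feat-lil-yachty",
    "/wavey-hefner/foreign",
    "/tee-grizzley/first-day-out",
]
print(user_ids(sample))
```

In the Selenium script, the list passed in would be the href values collected in the final loop instead of the hard-coded sample.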
I am trying to adapt the code found in
Python + Selenium + PhantomJS render to PDF
so that, instead of saving one web page as a PDF file, I can iterate over a list of URLs and save each one under a specific name (taken from another list).
count = 0
while count < length:
    def execute(script, args):
        driver.execute('executePhantomScript', {'script': script, 'args' : args })
    driver = webdriver.PhantomJS('phantomjs')
    # hack while the python interface lags
    driver.command_executor._commands['executePhantomScript'] = ('POST', '/session/$sessionId/phantom/execute')
    driver.get(urls[count])
    # set page format
    # inside the execution script, webpage is "this"
    pageFormat = '''this.paperSize = {format: "A4", orientation: "portrait" };'''
    execute(pageFormat, [])
    # render current page
    render = '''this.render("test.pdf")'''
    execute(render, [])
    count+=1
I tested modifying
render = '''this.render("test.pdf")'''
to
render = '''this.render(names[count]+".pdf")'''
so as to include each name in the list using count, but without success.
Also tried:
dest = file_user[count]+".pdf"
render = '''this.render(dest)'''
execute(render, [])
But that did not work either.
I would greatly appreciate a suggestion for the appropriate syntax.
It must be very simple, but I am a newbie.
Use string formatting:
render = 'this.render("{file_name}.pdf")'.format(file_name=names[count])
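The earlier attempts fail because everything inside the quotes is sent to PhantomJS as JavaScript, where Python names such as names, count or dest do not exist; the file name has to be put into the string on the Python side. A small sketch of a helper that does this, using json.dumps so that quotes or backslashes in a name are escaped (the function name render_script is hypothetical):

```python
import json


def render_script(file_name):
    """Return the PhantomJS snippet that renders the page to '<file_name>.pdf'."""
    # json.dumps produces a double-quoted, escaped literal that is also valid JavaScript.
    return "this.render({});".format(json.dumps(file_name + ".pdf"))


print(render_script("report"))
```

Inside the loop this would be used as execute(render_script(names[count]), []).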
I have a Flash app that loads remote data, and we're transitioning to HTTPS (SSL).
I am wondering whether it is possible to just use "//", as you would in JavaScript, to automatically assume the parent page's protocol (http or https).
Thanks
Update: it seems that you can use a URL format like "//www.something.com", but instead of assuming the page protocol it just defaults to "http://www.something.com".
For now I'm working around this by checking whether the SWF was loaded from an SSL URL. Something like this:
if( loaderInfo.url.indexOf("https:") == 0 ) {
    //replace http: with https:
}
Unfortunately, it is inconvenient to do that everywhere you handle a remote asset URL. Just loading everything with a matching protocol would be a lot nicer, like "//www.someurl.com/wouldbenicer.xml", especially since JS and HTML both work that way.
Blah.
Any ideas?
Protocol-relative "//" URLs don't work in Flash the way they do in the browser for HTML; instead, Flash defaults to http://.
Workaround:
Check the URL of the SWF to see if the URL about to be loaded should be modified to have https:// protocol:
if( loaderInfo.url.indexOf("https:") == 0 ) {
    //replace http: with https:
} else {
    //replace https: with http:
}
Building upon OG Sean's answer, here's a wrapper function that'll manage protocol-relative URLs and default to HTTP.
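function relativeURL(url:String):String {
    // Pick the scheme the SWF itself was loaded with, defaulting to HTTP
    var scheme:String = (loaderInfo.url.indexOf("https:") == 0) ? "https:" : "http:";
    // Strip any existing scheme, then prepend the matching one
    return scheme + url.replace(/^https?:/, "");
}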
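For readers who want to check the string handling outside Flash, here is the same logic as a small Python sketch. It is only an illustration of the wrapper above, not ActionScript; the function name match_scheme and the sample host are made up, and page_url stands in for loaderInfo.url.

```python
import re


def match_scheme(page_url, asset_url):
    """Give asset_url the same scheme (http or https) as page_url."""
    # Default to HTTP unless the page itself was loaded over HTTPS.
    scheme = "https:" if page_url.startswith("https:") else "http:"
    # Drop any scheme the asset URL already has, then prepend the chosen one.
    return scheme + re.sub(r"^https?:", "", asset_url)


print(match_scheme("https://host/app.swf", "//www.someurl.com/wouldbenicer.xml"))
```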
Use a string that contains a slash, and add it twice:
var singlesplash:String = "/";
var doublesplash:String = singlesplash + singlesplash;
myurl = "http:" + doublesplash + "www.google.com";
or
myurl = "http:/" + "/www.google.com";