I'm looking for a way to get the page load time of a website.
Namely the "Finish" value from the Chrome network tab.
I want to compare websites, so the value doesn't have to match Chrome exactly; it just needs to be comparable between websites.
I basically want to answer the question "How does my page load time compare to other websites?"
I tried things like yslow.js (buggy) and tried Selenium and other headless browsers, but was not able to figure it out.
You can get these metrics from window.performance.timing.
Here is an example with Python:
from selenium import webdriver

driver = webdriver.Chrome()
driver.get(r"http://stackoverflow.com/")

times = driver.execute_script("""
    var t = window.performance.timing;
    return [
        t.domContentLoadedEventEnd - t.navigationStart,
        t.loadEventEnd - t.navigationStart
    ];
""")

print("DOMContentLoaded: %s  Load: %s" % tuple(times))
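Since the goal is comparison rather than reproducing Chrome's exact numbers, you can run the same script against each site in turn. A minimal sketch (the URL list is only an illustration, and it assumes Chrome plus chromedriver are installed):

from selenium import webdriver

urls = ["http://stackoverflow.com/", "http://example.com/"]   # sites to compare
driver = webdriver.Chrome()
for url in urls:
    driver.get(url)
    # driver.get() normally returns after the load event, so loadEventEnd is set
    load_ms = driver.execute_script(
        "var t = window.performance.timing;"
        "return t.loadEventEnd - t.navigationStart;")
    print("%s loaded in %s ms" % (url, load_ms))
driver.quit()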
If I use the PageSpeed Insights tool Google offers on its developer website, or the Lighthouse CLI with HTML as output, I get a very nicely formatted report with 0-100 scores.
However, if I run the Lighthouse CLI tool with the --output json option, I get a Lighthouse Result Object (LHR), and Lighthouse's Understanding the Results page helpfully points out that the HTML version is just "a rendering of information contained in the result object."
My question: How do I translate the JSON into the scores from the HTML version? I want to be able to programmatically react to changes for some custom monitoring I'm setting up for my site.
I actually ended up finding the answer myself by browsing the Lighthouse source code. So I'll just leave the answer here in case others are trying to do this.
With performance as an example, simply multiply lhr.categories.performance.score in the Lighthouse result object by 100, like so:
import chromeLauncher from 'chrome-launcher';
import lighthouse from 'lighthouse';

const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
const result = await lighthouse(`https://example.com`, {
  onlyCategories: ['performance'],
  port: chrome.port
});
await chrome.kill();

const score = result.lhr.categories.performance.score * 100;
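If you are driving the Lighthouse CLI with --output json instead of the Node API, the same field lives at categories.performance.score inside the report. A rough Python sketch, assuming the lighthouse CLI is on your PATH and report.json is a path you pick yourself:

import json
import subprocess

# Run Lighthouse and write the LHR JSON to a file of our choosing
subprocess.run([
    "lighthouse", "https://example.com",
    "--only-categories=performance",
    "--output=json", "--output-path=report.json",
    "--chrome-flags=--headless",
], check=True)

with open("report.json") as f:
    lhr = json.load(f)

# Same translation as above: score is 0..1 in the LHR, 0..100 in the report UI
print(round(lhr["categories"]["performance"]["score"] * 100))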
With Selenium 4 and chromedriver, I succeeded in printing websites to PDF with custom page sizes (see the Python code below). I would like to know the equivalent way to do this with geckodriver/Firefox.
import base64
import json

from selenium import webdriver

def send_devtools(driver, cmd, params={}):
    resource = "/session/%s/chromium/send_command_and_get_result" % driver.session_id
    url = driver.command_executor._url + resource
    body = json.dumps({'cmd': cmd, 'params': params})
    response = driver.command_executor._request('POST', url, body)
    if response.get('value') is not None:
        return response.get('value')
    else:
        return None

def save_as_pdf(driver, path, options={}):
    result = send_devtools(driver, "Page.printToPDF", options)
    if result is not None:
        with open(path, 'wb') as file:
            file.write(base64.b64decode(result['data']))
        return True
    else:
        return False
options = webdriver.ChromeOptions()
# headless setting is mandatory, otherwise saving to PDF won't work
options.add_argument("--headless")
driver = webdriver.Chrome(executable_path='/usr/local/bin/chromedriver', options=options)
# Chrome has to operate in headless mode to produce the PDF
driver.get(r'https://example.my')
send_devtools(driver, "Emulation.setEmulatedMedia", {'media': 'screen'})
pdf_options = { 'paperHeight': 92, 'paperWidth': 8, 'printBackground': True }
save_as_pdf(driver, 'myfilename.pdf', pdf_options)
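As a side note, newer Selenium Chrome bindings expose DevTools commands directly via execute_cdp_cmd, so the send_devtools helper can usually be dropped. A minimal sketch, assuming a Selenium version that provides this method and reusing the driver and pdf_options defined above:

# Same Page.printToPDF call through Selenium's built-in CDP helper (Chrome/Chromium only)
result = driver.execute_cdp_cmd("Page.printToPDF", pdf_options)
with open('myfilename.pdf', 'wb') as f:
    f.write(base64.b64decode(result['data']))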
Did you try wkhtmltopdf?
wkhtmltopdf and wkhtmltoimage are open source (LGPLv3) command line tools to render HTML into PDF and various image formats using the Qt WebKit rendering engine. These run entirely "headless" and do not require a display or display service.
Example usage:
wkhtmltopdf http://google.com google.pdf
If you want to do it with Python, after installation you can invoke it with:
import os

# counter used to generate a fresh file name when one already exists
number = iter(range(100))

def html_to_pdf(link, name="test"):
    if os.path.isfile(f"{name}.pdf"):  # a PDF with the same name is already on disk
        name = f"{name}{next(number)}"
    os.system(f"wkhtmltopdf {link} {name}.pdf")
Additionally, you can use subprocess.run if you want to call wkhtmltopdf with more parameters (see the sketch after the help command below); your html_to_pdf function becomes more flexible that way. You can check out the available options with:
wkhtmltopdf -H
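A minimal sketch of the subprocess.run variant; the --page-size and --orientation flags are common wkhtmltopdf options, but check wkhtmltopdf -H for the full list on your build:

import subprocess

def html_to_pdf(link, name="test", extra_args=()):
    # e.g. extra_args=("--page-size", "A4", "--orientation", "Landscape")
    subprocess.run(["wkhtmltopdf", *extra_args, link, f"{name}.pdf"], check=True)

html_to_pdf("http://google.com", "google")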
To print a page as PDF there is a specific WebDriver command that can be used for cross-browser automation. That means that there is no need to write custom code, which utilizes the Chrome DevTools protocol, as done above for Chrome.
For both Chrome and Firefox this command is already available in Selenium 3.141, and should also work without modifications for Selenium 4.
The command will return the base64 encoded PDF data in the response's payload, and would require you to save it to a file yourself.
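As an illustration, here is a rough sketch with the Selenium 4 Python bindings, assuming the print_page/PrintOptions API (page dimensions there are given in centimeters, unlike the inches used by Page.printToPDF); double-check the attribute names against your Selenium release:

import base64
from selenium import webdriver
from selenium.webdriver.common.print_page_options import PrintOptions

driver = webdriver.Firefox()
driver.get("https://example.my")

print_options = PrintOptions()
print_options.page_width = 21.0    # centimeters (assumed WebDriver print units)
print_options.page_height = 29.7
print_options.background = True

pdf_base64 = driver.print_page(print_options)   # base64-encoded PDF payload
with open("myfilename.pdf", "wb") as f:
    f.write(base64.b64decode(pdf_base64))

driver.quit()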
Issue:
When trying the same task with Firefox/geckodriver, the code above apparently has issues writing to the file, and the target document is never saved.
Solution:
So I tweaked the code: it now opens the website with geckodriver on Firefox, takes a screenshot of the body element using find_element_by_tag_name(), converts it to RGB mode with the screenshot's dimensions, and saves it as a PDF document using Pillow.
Code:
from PIL import Image
from io import BytesIO
from selenium import webdriver
driverOptions = webdriver.FirefoxOptions()
# Uncomment the below line and change the path according to your configurations if you encounter an error like "Expected browser binary location ..."
# driverOptions.binary_location = '/Applications/Firefox.app/Contents/MacOS/firefox'
driverOptions.add_argument("--headless")
webDriver = webdriver.Firefox(executable_path = '/usr/local/bin/geckodriver', options = driverOptions)
webDriver.get('https://stackoverflow.com')
websiteScreenshot = Image.open(BytesIO(webDriver.find_element_by_tag_name('body').screenshot_as_png))
rgbImage = Image.new('RGB', websiteScreenshot.size, (255, 255, 255))
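# The screenshot PNG is assumed to be RGBA; its alpha band (index 3) is used as the paste mask below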
rgbImage.paste(websiteScreenshot, mask=websiteScreenshot.split()[3])
rgbImage.save('fileName.pdf', "PDF", resolution=100)
webDriver.quit()
References:
Browser Binary Location Issue
Converting a screenshot to PDF
Additional:
You can download the Geckodriver for Firefox based on your configurations from here, happy coding! 😊
I'm attempting to scrape from this web page here: https://explorer.helium.com/accounts/14Jydka1ufeZBXAHNmjK9SWedvWtufdaJRbEMgtF8Bifc6Dv7Gm
I'm trying to find out how to pull the number below "Rewards(24H)" and import it into a cell.
I've tried using the IMPORTXML function, but it gave me the "Imported content is empty" error.
After doing some research, I think the value is not rendered server-side because I can't find it in the page source. So I opened the Developer Tools for the page, clicked the Network tab, and refreshed the page.
I filtered the results to see only the XHR requests. Clicking the Headers tab displays a number of APIs in the Request URL section.
This is as far as I have gotten. I cannot find any reference to the Rewards (24H) number in any of the JSON responses.
It'd be much appreciated if anyone could explain how I can find that number and import it into a Google Sheets cell, preferably self-updating every hour.
Thanks!
Looking at the network requests of the above URL, the data comes from api.helium.io, so you can follow the code below to get your desired data. FYI, that website already provides an API, documented here: API Documentation, so you can use it for other data as well. My guess is that you couldn't find the number because the value on the page is rounded to 2 decimal places, whereas the API returns 5 to 6, so the displayed number didn't match anything in the XHR responses.
Code:
import datetime
import requests

dt = datetime.datetime.now(datetime.timezone.utc)
querystring = {"min_time": "-60 day", "max_time": dt.strftime("%Y-%m-%dT%H:%M:%SZ"), "bucket": "day"}

response = requests.get('https://api.helium.io/v1/accounts/14Jydka1ufeZBXAHNmjK9SWedvWtufdaJRbEMgtF8Bifc6Dv7Gm/hotspots').json()
for data in response["data"]:
    print(f"name: {data['name']}")
    internal_data = requests.get(f"https://api.helium.io/v1/hotspots/{data['address']}/rewards/sum", params=querystring).json()
    last24hour = internal_data["data"][0]["total"]
    last7days = 0
    for i in range(0, 7):
        last7days += internal_data["data"][i]["total"]
    last30days = 0
    for i in range(0, 30):
        last30days += internal_data["data"][i]["total"]
    print(f"24H : {round(last24hour, 2)}")
    print(f"7D : {round(last7days, 2)}")
    print(f"30D : {round(last30days, 2)}")
Let me know if you have any questions :)
How can I filter only the requests with errors in the Google Chrome Network DevTools panel?
Option 1: Filtering HTTP Status Codes
You can filter responses by their status code; here's a useful list of all HTTP status codes.
AFAIK this filtering feature has been working for years. It's through the status-code property (you can see all properties you can use here, in Google Developers).
As explained:
status-code. Only show resources whose HTTP status code matches the specified code. DevTools populates the autocomplete dropdown menu with all of the status codes it has encountered.
While it's not as useful as a regex or a wildcard, it can narrow things down a lot. For instance, if you want to see all requests with error 403, the filter is status-code:403.
There's a useful plot twist: you can use negative filters, e.g. -status-code:200 (notice the prepended - sign). That will filter out all requests with a 200 code, showing, for the most part, only troubled requests.
With all the 200's out of the way, you can sort the status column for a better experience.
Option 2: Work with the HAR format
For a more in-depth analysis, almost as quick, you can easily export the whole network log, with all its details, to a HAR (HTTP Archive) file: right-click in the requests table and copy or save everything as HAR.
Paste it into your favorite editor. You'll see it's just a JSON file (plain text). You can always search for "error" or use regular expressions. If you know a bit of JS, Python, etc., you can quickly parse it however you wish.
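For example, a minimal Python sketch that lists the failing entries in an exported HAR file (the filename and the error threshold are just illustrative):

import json

with open("network.har", encoding="utf-8") as f:
    har = json.load(f)

# HAR layout: log.entries[] each hold a request and a response
for entry in har["log"]["entries"]:
    status = entry["response"]["status"]
    if status == 0 or status >= 400:  # 0 usually means the request failed outright
        print(status, entry["request"]["url"])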
Or you can save it as a *.har file, for instance, and use a HAR analyzer, like Google's free analyzer.
There are a lot of tools that will help you analyze HAR files. Apps like Paw, Charles, and others can import HAR and show it to you as a history of requests. AFAIK Postman doesn't understand HAR yet, but you can go to your network tab and copy in cURL format instead of HAR (or use a HAR->cURL converter like this one) and import it right into Postman.
There's no such functionality.
The Filter input doesn't apply to the Status column.
You can augment devtools itself by adding a checkbox in the filter bar:
open the network panel
undock devtools into a separate window
press the hotkey to invoke devtools - Ctrl+Shift+I or ⌘⌥I
paste the following code in this new devtools window console and run it
{
  // see the link in the notes below for a full list of request properties
  const CONDITION = r =>
    r.failed ||
    r.statusCode >= 400;
  const label = document.createElement('label');
  const input = label.appendChild(document.createElement('input'));
  input.type = 'checkbox';
  input.onchange = () => {
    const view = UI.panels.network._networkLogView;
    view.removeAllNodeHighlights();
    view._filters = input.checked ? [CONDITION] : [];
    view._filterRequests();
  };
  label.append('failed');
  UI.panels.network._filterBar._filters[1]._filterElement.appendChild(label);
}
You can save this code as a snippet in devtools to run it later.
To quickly switch docking mode in the main devtools press Ctrl+Shift+D or ⌘⇧D
Theoretically, it's not that hard to put this code into the resources.pak file in the Chrome application directory. There are several tools to decompile/build that file.
The full list of internal request properties is in the constructor of NetworkRequest.
I have an HTML/JavaScript file with Google's Web Speech API and I'm testing it using Selenium; however, every time I enter the site the browser requests permission to use my microphone and I have to click on 'ALLOW'.
How do I make Selenium click on ALLOW automatically?
Wrestled with this quite a bit myself.
The easiest way to do this is to avoid the permission prompt altogether by adding --use-fake-ui-for-media-stream to your browser switches.
Here's some shamelessly modified code from @ExperimentsWithCode's answer:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--use-fake-ui-for-media-stream")
driver = webdriver.Chrome(executable_path="path/to/chromedriver", chrome_options=chrome_options)
@ExperimentsWithCode
Thank you for your answer again. I have spent almost the whole day today trying to figure out how to do this, and I've also tried your suggestion of adding the --disable-user-media-security flag to Chrome; unfortunately it didn't work for me.
However, I thought of a really simple solution:
To automatically click on Allow, all I have to do is press the TAB key three times and then press Enter. So I have written the program to do that automatically, and it WORKS!!!
The first TAB press, when my HTML page opens, moves focus to my input box, the second to the address bar, and the third to the ALLOW button; then Enter is pressed.
The Python program uses Selenium as well as the PyWin32 bindings.
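A rough sketch of that TAB/TAB/TAB + Enter idea, assuming Windows with the pywin32 package installed; the page URL, delays, and number of TAB presses are specific to my page and will likely need adjusting:

import time
import win32com.client
from selenium import webdriver

shell = win32com.client.Dispatch("WScript.Shell")
driver = webdriver.Chrome()
driver.get("http://localhost/speech.html")   # hypothetical page using the Web Speech API

time.sleep(1)                     # give the permission prompt time to appear
for _ in range(3):
    shell.SendKeys("{TAB}")       # walk focus: input box -> address bar -> ALLOW button
    time.sleep(0.2)
shell.SendKeys("{ENTER}")         # accept the microphone prompt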
Thank you for taking your time and trying to help me it is much appreciated.
So I just ran into another question asking about disabling a different prompt box. It seems there may be a way for you to accomplish your goal.
This page lists options for starting chrome. One of the options is
--disable-user-media-security
"Disables some security measures when accessing user media devices like webcams and microphones, especially on non-HTTPS pages"
So maybe this will work for you:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
chrome_options.add_argument("--disable-user-media-security=true")
driver = webdriver.Chrome(executable_path="path/to/chromedriver", chrome_options=chrome_options)
Beware of Mohana Latha's answer for JAVA!
The code only presses the keys and NEVER releases them. This will cause a bunch of issues later on.
Use this instead:
// opening a new window
Robot robot;
try {
    robot = new Robot();
    robot.keyPress(KeyEvent.VK_CONTROL);
    robot.delay(100);
    robot.keyPress(KeyEvent.VK_N);
    robot.delay(100);
    robot.keyRelease(KeyEvent.VK_N);
    robot.delay(100);
    robot.keyRelease(KeyEvent.VK_CONTROL);
    robot.delay(100);
} catch (AWTException e) {
    log.error("Failed to press buttons: " + e.getMessage());
}
Building on the answer from @Shady Programmer.
I tried to send Tab keys with Selenium in order to focus on the popup, but as reported by others it didn't work on Linux. Therefore, instead of using Selenium keys, I use the xdotool command from Python:
import subprocess

def send_key(winid, key):
    xdotool_args = ['xdotool', 'windowactivate', '--sync', winid, 'key', key]
    subprocess.check_output(xdotool_args)
which, for Firefox, gives the following approximate sequence:
# Focus on the permissions popup icon
for i in range(6):
    time.sleep(0.1)
    send_key(win_info['winid'], 'Tab')
# Enter the permissions popup
time.sleep(0.1)
send_key(win_info['winid'], 'space')
# Focus on the "accept" button
for i in range(3):
    time.sleep(0.1)
    send_key(win_info['winid'], 'Tab')
# Press "accept"
send_key(win_info['winid'], 'a')
[Java]: Yes, there is a simple technique to click the Allow button using java.awt.Robot:
public void allowGEOLocationCapture() {
    Robot robot = null;
    try {
        robot = new Robot();
        robot.keyPress(KeyEvent.VK_TAB);
        robot.keyPress(KeyEvent.VK_ENTER);
        robot.delay(600);
    } catch (AWTException e) {
        getLogger().info(e);
    }
}
You can allow it using add_experimental_option as shown below.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_experimental_option('prefs', {'profile.default_content_setting_values.notifications': 1})
driver = webdriver.Chrome(chrome_options=chrome_options)
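Note that the notifications key above targets notification prompts. For the microphone prompt in the original question, the commonly used profile keys are media_stream_mic and media_stream_camera; this is an assumption worth verifying against your Chrome version:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Assumed pref keys: 1 = allow, 2 = block (verify for your Chrome version)
chrome_options = Options()
chrome_options.add_experimental_option('prefs', {
    'profile.default_content_setting_values.media_stream_mic': 1,
    'profile.default_content_setting_values.media_stream_camera': 1,
})
driver = webdriver.Chrome(chrome_options=chrome_options)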
For Android Chrome this really works! Do it like this:
adb -s emulator-5554 push <YOUR_COMPUTER_PATH_FOLDER>/com.android.chrome_preferences.xml /data/data/com.android.chrome/shared_prefs/com.android.chrome_preferences.xml
The config file is here: https://yadi.sk/d/ubAxmWsN5RQ3HA
Chrome 80 x86
Or you can tick the box by hand and then save the settings file yourself; in adb that's the "pull" command.
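For reference, a pull mirroring the push above would look roughly like this (same device id and paths as in the earlier command):

adb -s emulator-5554 pull /data/data/com.android.chrome/shared_prefs/com.android.chrome_preferences.xml <YOUR_COMPUTER_PATH_FOLDER>/com.android.chrome_preferences.xml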