Chrome headless with Puppeteer, how to catch js crash? - puppeteer

I'm running Chrome in headless mode with Puppeteer, and I discovered that if a URL I load contains JavaScript code like:
while (true) {console.log('crash')}
the page loads forever, even though I have a timeout set and waitUntil defined:
await page.goto('http://...', {waitUntil: ['load', 'domcontentloaded', 'networkidle0'], timeout: timeout})
How can I ensure that no JS (or any other kind of) abuse gets my code stuck?
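One guard that is sometimes used for this kind of hang is to race the navigation against an independent timer on the Node side, so the script regains control even if the navigation promise never settles. The sketch below is only that: withHardTimeout is a hypothetical helper name, and it has not been confirmed to rescue the infinite-loop case above. If the renderer is truly stuck, closing the page may hang as well, and closing or killing the browser may be the only way out.

```javascript
// Hypothetical helper (not part of Puppeteer): rejects if `promise`
// has not settled within `ms` milliseconds.
function withHardTimeout(promise, ms, label = 'operation') {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} did not settle within ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; the timer is always cleared afterwards
  // so it cannot keep the Node process alive.
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// Usage sketch (inside an async function; `page`, `url` and `timeout`
// are assumed to come from your own script):
//   try {
//     await withHardTimeout(page.goto(url, { waitUntil: 'load', timeout }), timeout + 5000, 'goto');
//   } catch (err) {
//     await browser.close(); // assumption: give up on the whole browser
//   }
```

The outer deadline is deliberately a little longer than the timeout passed to page.goto, so Puppeteer's own timeout gets the first chance to fire.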

Related

Why does nothing run after page.goto twitter?

So I am running Puppeteer and going to Twitter. I am running the browser with headless set to false and saving cookies.
So my code is:
const browser = await puppeteer.launch({
product: 'firefox',
headless: false,
userDataDir: './dataDir'
});
const page = await browser.newPage();
await page.setDefaultNavigationTimeout(0);
await page.goto('https://twitter.com/home');
console.log("hi");
It goes to Twitter and it saves cookies (I logged in, and the next time I ran it I was still logged in), but it never gets to printing hi on the console.
When I run it with some other URL, like google.com or news.ycombinator.com, it works fine, but not with Twitter, which makes me think they have some secret sauce running there (although I would expect Google to have that same secret sauce, so hmmm).
I have tried setting wait-for events on page.goto, for example {waitUntil: "domcontentloaded"}, but none of them improve the situation.
So anyway, how can I go to Twitter with Puppeteer and have that console.log show up after my page.goto?
ON EDIT: I have found that this affects Firefox with Puppeteer, but if I run Puppeteer with Chromium I do not have the problem. I would still like a solution, of course, as I prefer to work in Firefox.

Programmatically start the performance profiling in Chrome

Is there a way to start the performance profiling programmatically in Chrome?
I want to run a performance test of my web app several times to get a better estimate of the FPS but manually starting the performance profiling in Chrome is tricky because I'd have to manually align the frame models. (I am using this technique to extract the frames)
CMD + Shift + E reloads the page and immediately starts the profiling, which alleviates the alignment problem but it only runs for 3 seconds as explained here. So this doesn't work.
Ideally, I'd like to click a button that starts my test script and also starts the profiling. Is there a way to achieve that?
In case you're still interested, or someone else finds it helpful, there's an easy way to achieve this using Puppeteer's tracing class.
Puppeteer uses Chrome DevTools Protocol's Tracing Domain under the hood, and writes a JSON file to your system that can be loaded in the dev tools performance panel.
To get a profile trace of your page's loading time you can implement the following:
const puppeteer = require('puppeteer');

(async () => {
  // launch the Puppeteer browser in headful mode
  const browser = await puppeteer.launch({
    headless: false,
    devtools: true
  });
  // open a page in the browser
  const page = await browser.newPage();
  // start the profiling, with a path to the output file and screenshots collected
  await page.tracing.start({
    path: `tests/logs/trace-${new Date().getTime()}.json`,
    screenshots: true
  });
  // go to the page
  await page.goto('http://localhost:8080');
  // wait for as long as you want (a plain timer is used here because
  // page.waitFor is deprecated and absent from recent Puppeteer releases)
  await new Promise(resolve => setTimeout(resolve, 4000));
  // or you can wait for an element to appear with:
  // await page.waitForSelector('some-css-selector');
  // stop the tracing
  await page.tracing.stop();
  // close the browser
  await browser.close();
})();
Of course, you'll have to install Puppeteer first (npm i puppeteer). If you don't want to use Puppeteer, you can interact with the Chrome DevTools Protocol's API directly (see link above). I didn't investigate that option very much, since Puppeteer delivers a high-level, easy-to-use API over CDP's API. You can also interact directly with CDP via Puppeteer's CDPSession API.
Hope this helps. Good luck!
You can use the Chrome DevTools Protocol with any driver library listed at https://github.com/ChromeDevTools/awesome-chrome-devtools#protocol-driver-libraries to create a profile programmatically.
Use the Profiler.start method (https://chromedevtools.github.io/devtools-protocol/tot/Profiler#method-start) to start a profile.
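As a rough illustration of the two suggestions above combined, the Profiler domain can also be driven through Puppeteer's CDPSession. In this sketch profileWithCDP and its action parameter are hypothetical names, and page is assumed to be a Puppeteer Page. Note that this collects a JavaScript CPU profile, which is not the same as the full trace written by page.tracing.

```javascript
// Sketch: collect a CPU profile around `action` through a raw CDP session.
// `page` is assumed to be a Puppeteer Page; `action` is any async function
// (for example, the test script you want to measure).
async function profileWithCDP(page, action) {
  const client = await page.target().createCDPSession();
  await client.send('Profiler.enable');
  await client.send('Profiler.start');
  await action();
  // Profiler.stop resolves with an object whose `profile` field holds the data.
  const { profile } = await client.send('Profiler.stop');
  await client.detach();
  return profile;
}

// Usage sketch (inside an async function, with `page` from browser.newPage()):
//   const profile = await profileWithCDP(page, () => page.goto('http://localhost:8080'));
//   require('fs').writeFileSync('run.cpuprofile', JSON.stringify(profile));
```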

Why is incognito mode not working while using puppeteer-extra plugin with puppeteer

I am using the puppeteer-extra package with the stealth plugin. With the default puppeteer package, incognito shows up, but with puppeteer-extra, even when I initialize an incognito context, the incognito window doesn't open. Any idea whether it's a compatibility issue, or has someone already come across this problem?
I have tried both passing the "--incognito" argument and using the context method.
With the --incognito argument it opens the parent window in incognito, but newPage() then opens a second window that is not incognito.
The two approaches I used:
Importing puppeteer extra package:
import puppeteer from 'puppeteer-extra';
import pluginStealth from 'puppeteer-extra-plugin-stealth';
Method 1:
const context = await browser.createIncognitoBrowserContext();
const page = await context.newPage();
Method 2 :
const browser = await puppeteer.launch({args: ['--incognito']});
I expect the behavior with puppeteer-extra to be the same as with puppeteer.
The problem
This appears to be caused by a bug in the puppeteer-extra library. When you open a puppeteer instance with puppeteer-extra, the browser instance is hotpatched to better integrate newly opened pages with plugins.
Unfortunately the current implementation of browser._createPageInContext (as of version 2.1.3) doesn't correctly handle which browser context the new page should belong to once it's opened.
The fix
The fix is this pull request.
Specifically, you need to change this line
return async (contextId) => {
to this
return async function (contextId) {
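so that arguments on the next line refers to the wrapper's own arguments (an arrow function has no arguments object of its own and picks up the enclosing function's instead):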
const page = await originalMethod.apply(context, arguments)
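The difference can be reproduced without puppeteer-extra at all. The following stand-alone sketch uses made-up names (patchWithArrow, patchWithFunction, echoContextId) and is not the library's code; it only shows what arguments resolves to in each form of wrapper.

```javascript
// Stand-in for the original method: it just returns the contextId it receives.
const echoContextId = function (contextId) {
  return contextId;
};

// Arrow wrapper: `arguments` belongs to patchWithArrow itself, so the
// original method receives (originalMethod, context) instead of contextId.
function patchWithArrow(originalMethod, context) {
  return async (contextId) => originalMethod.apply(context, arguments);
}

// Regular function wrapper: `arguments` is the wrapper's own call
// arguments, so contextId is forwarded as intended.
function patchWithFunction(originalMethod, context) {
  return async function (contextId) {
    return originalMethod.apply(context, arguments);
  };
}

// Usage sketch (inside an async function):
//   await patchWithFunction(echoContextId, {})('ctx-1'); // resolves to 'ctx-1'
//   await patchWithArrow(echoContextId, {})('ctx-1');    // resolves to echoContextId, not 'ctx-1'
```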

How can I make HTTP request wait before continuing

I'm developing an extension for Google Chrome and I'm monitoring HTTP requests. In the event handler for chrome.webRequest.onHeadersReceived I'm trying to introduce a delay. The handler cannot wait asynchronously (unlike WebExtensions in Firefox), and there is nothing like Thread.Sleep, CriticalSection, ResetEvent or similar. The only solution I see is spin-waiting, which is a very bad choice. Even synchronous XMLHttpRequest is deprecated and doesn't work.
var headersReceived = function (e) {
/// ?????? some method to delay synchronously
return {cancel: false};
};
chrome.webRequest.onHeadersReceived.addListener(headersReceived,
{urls: ["*://*/*"]},
["blocking", "responseHeaders"]);
You can try this plugin for reference: network-spinner-devtool. It is a browser DevTools extension with URL-level configuration and control that can introduce a delay before sending an HTTP request or after receiving a response (supported in Firefox only).
It supports both Chrome and Firefox.
It can be installed from the Chrome Web Store as well as Chrome DevTools.

How to get content on timeout in puppeteer (headless chrome)?

We are using puppeteer to run automated tests on hundreds of websites and URLs. Some of those websites are very slow and run into a timeout. That is often the case because there is an ad that does not finish loading. So increasing the timeout is not an option.
Is there a way to get the currently rendered HTML (DOM) at the moment the timeout is happening? page.content() is only returning a promise that is still pending.
You might be able to use something like evaluate, which injects a custom JavaScript function to execute. However, if the thread is truly "locked" then it'll likely run into the same issue.
const body = await page.evaluate(() => document.documentElement.outerHTML);
You might also need to be a little more flexible on how you orchestrate the script by catching goto timeouts and then trying the above.
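A minimal sketch of that orchestration might look like the following. contentEvenOnTimeout is a hypothetical helper name, page is assumed to be a Puppeteer Page, and the check on err.name assumes that Puppeteer's timeout errors are named TimeoutError. As noted above, if the page's thread is really locked, the evaluate call may hang too.

```javascript
// Sketch: navigate, but if goto times out, still try to read the DOM
// that has been rendered so far.
async function contentEvenOnTimeout(page, url, timeout = 30000) {
  try {
    await page.goto(url, { waitUntil: 'load', timeout });
  } catch (err) {
    // Assumption: navigation timeouts surface as an error named 'TimeoutError'.
    // Anything else (DNS failure, bad URL, ...) is rethrown.
    if (err.name !== 'TimeoutError') throw err;
  }
  // Runs in the page context and returns the current serialized DOM.
  return page.evaluate(() => document.documentElement.outerHTML);
}

// Usage sketch (inside an async function, with `page` from browser.newPage()):
//   const html = await contentEvenOnTimeout(page, 'https://example.com', 15000);
```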