Puppeteer. What is waitForNavigation waiting for?

What "navigation" is waitForNavigation waiting for?
The website's navigation? The browser's refresh icon "navigation" to finish spinning?
Or is this just an awkwardly worded method that should be named waitForBrowserToLoad?
But, when I use it, like so:
await this.page.waitForNavigation();
The page DOES finish loading.
And yet it never resolves. I'm not sure why.
What is waitForNavigation waiting for?
async function beforeScrape(page) {
  // code gets this far
  await this.page.waitForNavigation();
  // never resolves
  await page.click(".table-header");
}

Documentation says:
[page.waitForNavigation] resolves when the page navigates to a new URL or reloads. It is useful for when you run code which will indirectly cause the page to navigate.
For example, you fill in a form and then click the "Submit" button, after which a new page is shown with results. This way you can wait until the new page has loaded:
await Promise.all([
  page.waitForNavigation(),
  page.click('input[type=submit]'),
]);
If you instruct the script to wait for navigation, but do not cause one, it will just sit there waiting until timing out.
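A minimal sketch of that pattern as a reusable function, in case it helps. The helper name clickAndWaitForNavigation and the 10-second default are my own assumptions, not part of Puppeteer; only page.waitForNavigation() and page.click() are library calls. The wait is started before the click so the navigation cannot be missed, and the explicit timeout makes the "just sits there" case fail sooner.

```javascript
// Hypothetical helper (not a Puppeteer API): trigger a navigation with a click
// and wait for it to finish.
async function clickAndWaitForNavigation(page, selector, timeoutMs = 10000) {
  const [response] = await Promise.all([
    // Started first, so the wait is already listening when the click fires.
    // Rejects if no navigation completes within timeoutMs.
    page.waitForNavigation({ waitUntil: 'load', timeout: timeoutMs }),
    page.click(selector),
  ]);
  return response;
}
```

With a real page this would be called as await clickAndWaitForNavigation(page, 'input[type=submit]'). If the click does not cause a navigation, the call rejects after the timeout instead of resolving.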

Related

Slow click (and events) performance on Puppeteer

I have a small automation script that works on a website, using Puppeteer with headless Chrome.
The code is very simple:
locate an element.
click on it.
The problem is that the click events are SUPER slow. I assumed a click should take no more than a few ms,
but a single click event takes 90-400 ms.
There is no need to scroll, because all the elements are already visible on the screen.
Here is a piece of the code:
logger.verbose(`before clicking on row`);
// get by link text
// search for any element with this class and text
// page.$x() resolves to an array of handles, so take the first match
let [workElement] = await page.$x(`//span[contains(@class,"tr-class") and contains(text(),"${jobSearchId}")]`);
//we don't want to click on the element itself because it makes things more difficult
let row = await (await workElement.getProperty("parentNode")).getProperty("parentNode");
await row.click();
logger.verbose(`row clicked`);
//click on the button
logger.verbose(`button click before`);
await page.click(".searchJobDescription");
logger.verbose(`button clicked`);
0|main | 2022-08-17 13:09:05.283 - verbose: before clicking on row
0|main | 2022-08-17 13:09:05.685 - verbose: row clicked
0|main | 2022-08-17 13:09:05.685 - verbose: button click before
0|main | 2022-08-17 13:09:05.776 - verbose: button clicked
Clicking on the row takes 402 ms,
and clicking the button takes 91 ms.
Even a human can perform these actions much faster.
Can anyone help me understand how to speed up these actions?
Puppeteer is very slow and I don't know why.
So...
The solution is super weird and seems unrelated to the problem, but I have tested it again and again, and I'm 100% sure that this is the cause (or part of a bigger bug I can't see).
I built a cookies file that stores the page cookies. When I inject the cookies back into the page, everything slows down. I am using the following method for it:
async function injectCookies() {
  const cookiesList = getCookiesFromJarAsList();
  await page.setCookie(...cookiesList);
}
When I remove the call that injects the cookies, Puppeteer runs fast again (~30 ms per click).
async function injectCookies() {
  const cookiesList = getCookiesFromJarAsList();
  // removing this line makes Puppeteer great again
  // await page.setCookie(...cookiesList);
}
I am not even sure that this is a Puppeteer issue.
It seems like a Chrome/Chromium problem, because when I debugged the CDP traffic, Chrome itself returned very slow answers for the "mouseMove" command (which is part of Puppeteer's click method).
I'm not sure what's going on, so if someone smart is able to solve it along the way, that would be great.
For me the problem has been solved, and I wasted more than a week debugging it, so I'm leaving it this way.
If someone can reproduce it and open a bug for the Puppeteer/Chromium folks, I am sending them very good karma.
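For anyone who wants to try reproducing this, here is a rough sketch of how the two cases could be timed side by side. timeAction and compareClickTimings are hypothetical names of mine; page.click() and page.setCookie() are the only Puppeteer calls. It assumes that clicking the same selector twice is harmless on the page under test.

```javascript
// Hypothetical helper: run an async action and return the elapsed milliseconds.
async function timeAction(action) {
  const start = Date.now();
  await action();
  return Date.now() - start;
}

// Hypothetical comparison: time one click before and one click after
// injecting cookies. `cookies` is an array of cookie objects for setCookie.
async function compareClickTimings(page, selector, cookies) {
  const withoutCookies = await timeAction(() => page.click(selector));
  await page.setCookie(...cookies);
  const withCookies = await timeAction(() => page.click(selector));
  return { withoutCookies, withCookies };
}
```

A single pair of measurements is noisy, so repeating the comparison several times would give a more trustworthy picture.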

WaitForNavigation alternative to ensure that no navigation is currently happening

A similar question may have been asked, but despite spending a couple of hours I could not find a satisfying answer.
I would like to call Frame.click(). At the time of calling it - I want to make sure that no navigation is pending. I also don't know if the element I am about to click on will result in navigation (it's passed in dynamically).
I tried https://www.npmjs.com/package/pending-xhr-puppeteer, but that seems to no longer be supported. (It still uses Request instead of HTTPRequest.)
Right now I am resorting to page.waitForNetworkIdle(). Is this the best I can do?
P.S. It seems page.waitForNavigation used to return right away if no navigation was taking place (on a year-old version of the code). Since I updated to 12.1, it waits for the timeout and then throws an exception if no navigation started within that timeframe.
I think you're confusing navigation with requests/responses. Navigation can be finished while requests/responses are still happening (e.g. .aspx requests).
To make sure the navigation has happened, you can run the click and the wait together with Promise.all():
await Promise.all([
  page.click(`#submit`),
  page.waitForNavigation(),
]);
Additionally, you could await page.content() to get the full HTML contents of the page, including the doctype.
In the case of an Ajax request, you can use page.waitForResponse():
await Promise.all([
  page.click(`#submit`),
  page.waitForResponse(response => response.status() === 200),
]);
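For the case in the question, where it is not known in advance whether the click navigates, one possible sketch is to wait with a short timeout and treat the timeout as "no navigation". The name clickAndReportNavigation and the 3-second default are assumptions of mine. The code also assumes that Puppeteer's timeout rejection carries the name 'TimeoutError'; that is worth checking against the installed version.

```javascript
// Hypothetical helper: click, then report whether a navigation followed.
// Resolves to true if a navigation finished, false if none started in time.
async function clickAndReportNavigation(page, selector, timeoutMs = 3000) {
  const navigation = page
    .waitForNavigation({ timeout: timeoutMs })
    .then(() => true)
    .catch((error) => {
      // Assumption: a timeout here means the click did not navigate.
      if (error && error.name === 'TimeoutError') return false;
      throw error; // anything else is a real failure
    });
  const [navigated] = await Promise.all([navigation, page.click(selector)]);
  return navigated;
}
```

The trade-off is that a click which does not navigate always costs the full timeout, so the value has to be tuned to the site.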

Puppeteer - clicks do not work outside of slowMo

I'm navigating with Puppeteer around a React website.
Two sample lines of code:
await page.waitForSelector('a.btn-lg[data-target="#loginModal"]');
await page.click('a.btn-lg[data-target="#loginModal"]');
With a sufficient slowMo value, the effects are consistent - the button gets clicked every time.
However, without slowMo, sometimes the button does get clicked, and sometimes it doesn't (a window wired to it doesn't open).
It happens for a lot of elements, not just this one button in particular.
I just started using Puppeteer, and it looks like I'm either misusing the library, or the website somehow screws up my efforts.
Please tell me why sometimes the effects of clicking are visible and sometimes not, and how to remedy it.
UPDATE:
Code such as this does not work either.
await page.evaluate(() => (document.querySelector('span.pum-close') as any).click());
await page.$$eval('span.pum-close', elements =>
  elements[0].click()
);

Are waitUntil of page.goto and page.waitForNavigation the same?

As far as I know, both page.goto and page.waitForNavigation accept waitUntil as a parameter. Are they just two ways to achieve the same result?
For example:
page.goto(url, {waitUntil: 'domcontentloaded'})
vs:
page.waitForNavigation(url, {waitUntil: 'domcontentloaded'})
In Puppeteer version 1.19.0, waitForNavigation does not accept a URL. Usually waitForNavigation is used with a click, where clicking might cause a navigation in the browser.
Example:
const [response] = await Promise.all([
  page.waitForNavigation(), // The promise resolves after navigation has finished
  page.click('a.my-link'), // Clicking the link will indirectly cause a navigation
]);
It accepts options similar to .goto(), but that's all.
So,
page.goto() will go to a URL and wait for navigation.
page.waitForNavigation() will only wait for navigation.
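To put the two side by side with the same waitUntil option, here is a small sketch. openAndFollowLink is a hypothetical name, and the URL and selector are whatever the caller passes in.

```javascript
// Hypothetical helper contrasting the two calls with the same waitUntil value.
async function openAndFollowLink(page, url, linkSelector) {
  const options = { waitUntil: 'domcontentloaded' };

  // goto() starts the navigation itself and waits for it.
  await page.goto(url, options);

  // waitForNavigation() starts nothing; the click is what navigates.
  const [response] = await Promise.all([
    page.waitForNavigation(options),
    page.click(linkSelector),
  ]);
  return response;
}
```

So the waitUntil option means the same thing in both calls; the difference is only in who triggers the navigation.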

Puppeteer problems, waitForNavigation() returns immediately

I downloaded the latest version of Puppeteer a couple of weeks ago, so I'm new to it. The first thing I noticed is that
await this.page.waitForNavigation();
does not seem to work. If I run in non-headless mode and debug, I can see that waitForNavigation() returns as soon as navigation starts, not when it finishes. Who cares when navigation starts? You can't do anything until navigation is complete.
How can I be sure a page is ready? Right now I have had to fill my code with lots of
await this.page.waitFor(SomeDelayMs);
Generally speaking, you're better off using:
await page.waitForSelector('your_selector')
That will cause puppeteer to wait until a specific selector is available before continuing execution.
You can also use something like this if you're dealing with something that only shows up once clicked:
await page.waitForSelector('your_selector', {visible: true})
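As a rough sketch of how this can replace fixed delays, the wait and the click can be wrapped together. clickWhenVisible is a hypothetical helper name and the 5-second default is an assumption; page.waitForSelector() with the visible and timeout options, and click() on the returned element handle, are the Puppeteer calls used.

```javascript
// Hypothetical helper: wait until the element is visible, then click it.
async function clickWhenVisible(page, selector, timeoutMs = 5000) {
  // Resolves with a handle to the element once it is present and visible;
  // rejects if that does not happen within timeoutMs.
  const handle = await page.waitForSelector(selector, {
    visible: true,
    timeout: timeoutMs,
  });
  await handle.click();
  return handle;
}
```

Usage would look like await clickWhenVisible(page, 'your_selector'), in place of a waitFor(SomeDelayMs) followed by a click.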