I downloaded the latest version of Puppeteer a couple of weeks ago, so I'm new to it. The first thing I noticed is that
await this.page.waitForNavigation();
does not seem to work. If I run in non-headless mode and debug, I can see that waitForNavigation() returns as soon as navigation starts, not when it finishes. Who cares when navigation starts? You can't do anything until navigation is complete.
How can I be sure a page is ready? Right now I've had to fill my code with lots of
await this.page.waitFor(SomeDelayMs);
Generally speaking, you're better off using:
await page.waitForSelector('your_selector')
That will cause Puppeteer to wait until a specific selector is available before continuing execution.
You can also use something like this if you're dealing with an element that only shows up after a click:
await page.waitForSelector('your_selector', {visible: true})
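To make this concrete, here is a minimal sketch of swapping a fixed delay for a selector wait (the #results selector and the click target are placeholders, not taken from the question):

// Before: guessing how long the page needs
// await this.page.waitFor(SomeDelayMs);
// After: wait for an element that only exists once the page is usable
await this.page.waitForSelector('#results', { visible: true });
await this.page.click('#results a');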
I have a small automation script that works on a website, using Puppeteer with headless Chrome.
The code is very simple: locate an element and click on it.
The thing is that the click events are SUPER slow. I assume a click should take less than a few ms, but it takes 90-400ms to perform a single click event.
There is no need to scroll, because all the elements appear on the screen.
Here is a piece of code:
logger.verbose(`before clicking on row`);
// get by link text
// search for any element with this class and text
// page.$x returns an array of matches, so take the first one
let [workElement] = await page.$x(`//span[contains(@class,"tr-class") and contains(text(),"${jobSearchId}")]`);
//we don't want to click on the element itself because it makes things more difficult
let row = await (await workElement.getProperty("parentNode")).getProperty("parentNode");
await row.click();
logger.verbose(`row clicked`);
//click on the button
logger.verbose(`button click before`);
await page.click(".searchJobDescription");
logger.verbose(`button clicked`);
0|main | 2022-08-17 13:09:05.283 - verbose: before clicking on row
0|main | 2022-08-17 13:09:05.685 - verbose: row clicked
0|main | 2022-08-17 13:09:05.685 - verbose: button click before
0|main | 2022-08-17 13:09:05.776 - verbose: button clicked
Clicking on the row takes 402ms, and clicking the button takes 91ms. Even a human can perform these actions much faster.
Can anyone help me understand how to speed up these actions? Puppeteer is very slow and I don't know why.
So...
The solution is super weird and seems unrelated to the problem, but I have tested it again and again, and I'm 100% sure that this is the cause (or part of a bigger bug I can't see).
I had built a cookies file that stores the page cookies. When I inject those cookies back into the page, everything slows down. I was using the following method for it:
async function injectCookies() {
const cookiesList = getCookiesFromJarAsList();
await page.setCookie(...cookiesList);
}
When I remove the call that injects the cookies, Puppeteer runs fast again (~30ms per click):
async function injectCookies() {
const cookiesList = getCookiesFromJarAsList();
// removing this line makes Puppeteer great again
// await page.setCookie(...cookiesList);
}
I am not even sure this is a Puppeteer issue.
It seems like a Chrome/Chromium problem, because when I debugged the CDP, Chrome itself returned very slow responses for the "mouseMove" command (which is part of Puppeteer's click method).
I'm not sure what's going on, so if someone smarter comes along and manages to solve it, that'll be great.
For me the problem has been solved, and I wasted more than a week debugging it, so I'm leaving it this way.
If someone can reproduce it and open a bug with the Puppeteer/Chromium folks, I'm sending them very good karma.
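If anyone wants to try reproducing it, here is a rough, self-contained sketch (the cookies.json path, the target URL, and the clicked selector are placeholders, not taken from the original code) that times the same click with and without the setCookie call:

const fs = require('fs');
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  const page = await browser.newPage();
  await page.goto('https://example.com');

  // Toggle this block on and off and compare the timings below.
  const cookies = JSON.parse(fs.readFileSync('./cookies.json', 'utf8'));
  await page.setCookie(...cookies);

  console.time('click');
  await page.click('a'); // any clickable element on the page
  console.timeEnd('click');

  await browser.close();
})();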
A similar question may have been asked, but despite spending a couple of hours I could not find a satisfying answer.
I would like to call Frame.click(). At the time of calling it, I want to make sure that no navigation is pending. I also don't know whether the element I am about to click will result in navigation (it's passed in dynamically).
I tried https://www.npmjs.com/package/pending-xhr-puppeteer, but that seems to no longer be maintained (it still uses Request instead of HTTPRequest).
Right now I am resorting to page.waitForNetworkIdle(). Is this the best I can do?
P.S. It seems page.waitForNavigation used to return right away if no navigation was taking place (on a year-old version of the code). Since I updated to 12.1, it started waiting for the timeout and then throwing an exception if no navigation started within that timeframe.
I think you're confusing navigation with requests/responses. Navigation can be finished while requests/responses are still happening (e.g. .aspx requests).
To make sure the navigation has happened, you can wrap the click and the navigation wait together with Promise.all():
await Promise.all([
page.click(`#submit`),
page.waitForNavigation(),
]);
Additionally, you could await page.content() to get the full HTML contents of the page, including the doctype.
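For instance (purely illustrative):

const html = await page.content(); // full serialized HTML, including the doctype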
In the case of an AJAX request, you can use page.waitForResponse():
await Promise.all([
page.click(`#submit`),
page.waitForResponse(response => response.status() === 200),
]);
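Regarding the P.S. about waitForNavigation throwing when nothing navigates: one option (a sketch, not an official pattern) is to give the navigation wait its own timeout and swallow the resulting error, so a non-navigating click isn't treated as a failure:

await Promise.all([
  page.click('#submit'),
  page.waitForNavigation({ timeout: 5000 }).catch(() => {
    // No navigation started within 5s - assume it was a non-navigating click.
  }),
]);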
I'm navigating with Puppeteer around a React website.
Two sample lines of code:
await page.waitForSelector('a.btn-lg[data-target="#loginModal"]');
await page.click('a.btn-lg[data-target="#loginModal"]');
With a sufficient slowMo value, the effects are consistent - the button gets clicked every time.
However, without slowMo, sometimes the button does get clicked, and sometimes it doesn't (a window wired to it doesn't open).
It happens for a lot of elements, not just this one button in particular.
I just started using Puppeteer, and it looks like I'm either misusing the library, or the website somehow screws up my efforts.
Please tell me why sometimes the effects of clicking are visible and sometimes not, and how to remedy it.
UPDATE:
Code such as this does not work either.
await page.evaluate(() => (document.querySelector('span.pum-close') as any).click());
await page.$$eval('span.pum-close', elements =>
elements[0].click()
);
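One pattern that often helps with this kind of race (a sketch: the button selector comes from the question, and it assumes the modal element is #loginModal and becomes visible when opened) is to click, verify the expected effect, and retry once if it never appears:

const button = 'a.btn-lg[data-target="#loginModal"]';
await page.waitForSelector(button, { visible: true });
await page.click(button);
try {
  // Did the click actually open the modal?
  await page.waitForSelector('#loginModal', { visible: true, timeout: 3000 });
} catch (e) {
  // The handler probably wasn't attached yet - click once more.
  await page.click(button);
  await page.waitForSelector('#loginModal', { visible: true });
}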
I have spent the better part of a day researching this, but still have no freaking clue where to even begin!
Basically, all I want to do is add a button to the DevTools toolbar that injects the entire DevTools window into an arbitrary page as a div or iframe, so that I can manipulate it with hover/click event handlers to show/hide it more easily.
Working on a 13" laptop with no option to buy a second monitor (I am dirt poor, hence the need to do this and get my web dev business up and running) makes this an absolute necessity. It is so frustrating to have to constantly expand/collapse the window, or undock it and move it around, to be able to inspect and change arbitrary parts of a page simultaneously.
As mentioned before, I have spent ~2 hours scouring the DevTools Protocol docs, ~3 hours trying to build my own Chromium from source before my hard drive ran out of space, and countless hours researching this topic on other sites. But Chrome is such a massive and complicated beast that I am utterly baffled by the overly-simplistic DevTools extension API docs and other incomplete resources that I've found.
Could anyone shed some light on this for me? I've gotten so far as creating a DevTools extension with the following devtools.js:
chrome.devtools.panels.create('\u25b2\u25bc', '', 'panel.html', function(panel) {
panel.onShown.addListener(function() {
chrome.extension.sendMessage({
tabId: chrome.devtools.inspectedWindow.tabId,
action: "code",
content: "var newItem = document.createElement('div'); // Create a new div node\n" +
"newItem.innerHTML = '<p>Boo!</p>';" +
"var body = document.getElementsByTagName('body')[0]; // Get the body element to insert the header into\n" +
"body.insertBefore(newItem, body.childNodes[0]); // Insert header before the first child of the body"
});
});
});
That adds a panel which, when switched to, inserts the div correctly, but even that tiny amount of progress took ~4 hours to make!! Am I just the world's worst web developer, or is this API the most complicated and poorly-documented one ever created?! Some help to make the button actually do what I want would be very much appreciated!
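For context, a background page roughly like this (an assumption; the actual background script isn't shown in the question) is what would receive that message and inject the code string into the inspected tab:

// background.js (Manifest V2) - receives the message sent from devtools.js
// and injects the code string into the inspected tab.
chrome.runtime.onMessage.addListener(function (message) {
  if (message.action === 'code' && message.tabId) {
    chrome.tabs.executeScript(message.tabId, { code: message.content });
  }
});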
I'd also like to know whether the extension route is the best one for me to follow, or should I be using the DevTools Protocol? That seems to be much more powerful, but all the talk of "remote debugging" and "external clients" on the aforementioned docs page just confuses me even more (can't I just mimic the DevTools locally in my own browser, since I'm already running Apache locally?).
MTIA and apologies for the long-winded question!
I'm using console.log a lot in my JavaScript for debugging mouse move events. The problem I'm having is that in the Chrome console the new entries aren't followed.
It's best illustrated in these screenshots:
The first lot of logs is fine, because it all fits on the screen:
A few seconds later:
The log has gone past the size of the window, requiring me to scroll.
This makes it incredibly difficult to debug mouse move events, because I have to move over to the console and scroll down, thus adding more entries to the log.
So my question is: how can I get Chrome to essentially tail the log, instead of stopping and requiring me to scroll?
With the console open, drag the scroll bar down to the bottom of the window and release it. It should tail the output for you.
It took me quite a few tries to get it to work in Version 27.0.1438.7 dev-m. But in Version 27.0.1440.0 canary, not only did it happen automatically, but I could also reattach the auto-scroll each time I tried.
You can download Canary from here.
The default behavior is for the Console to follow (tail) logs as they come in.
However, we had a bug in the DevTools where, if you changed the zoom factor (cmd++), it didn't always work.
We just fixed that: https://codereview.chromium.org/180733003/. You'll need Canary for a little while (from the date of this post), but it'll work its way down to Stable in about 10 weeks.
There's a rather pernicious bug here (present in Chrome for as long as I can remember) where, if you log any sort of expando item like a DOM element or some such thing, it messes with the display of the log and causes the scroll to stop following.
I solved this with a little bit of ingenuity: find the offending log. You don't even need to delete the log statement; you just have to make it "friendlier". What works very often is to take any such log statement, such as
console.log((mouse ? "mouse" : "touch") + " start on", jqtarg[0]);
and wrap it in an array:
console.log([(mouse ? "mouse" : "touch") + " start on", jqtarg[0]]);
You may try other things as well, in an attempt to make the log more readable, such as logging an object (I haven't tested any of this rigorously; it may still cause the annoying failure to scroll-follow):
console.log({"mouse/touch start on": jqtarg[0]});
Based on a very small amount of testing, it would appear that if an entry appears in the log as an item that can be directly hovered to make the inspector highlight it in the DOM (as opposed to requiring you to manually expand it first), then it may trigger "scroll lock syndrome".
BTW, a helpful thing to be aware of is that if you log the exact same stuff repeatedly, Chrome helpfully "stacks" the entries with a repeat count. (See? I fixed the autoscroll by shoving my log in an object! Yay!)
If you don't really need to see precise coordinate values, printing coarser (e.g. rounded) values will lead to a much more compact log, which will still give you sensible feedback via those counts.
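For example, a quick sketch (the 50px bucket size is arbitrary) that rounds mouse coordinates so the same message repeats and gets collapsed with a count:

// Round mouse coordinates to 50px buckets so the log contains far fewer
// distinct values; identical messages get stacked with a count.
document.addEventListener('mousemove', function (e) {
  var coarseX = Math.round(e.clientX / 50) * 50;
  var coarseY = Math.round(e.clientY / 50) * 50;
  console.log("mouse at " + coarseX + "," + coarseY);
});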
Update: Sometimes none of this works. Sometimes you're just out of luck, and you have to clean up all the logs you don't need and log the minimal amount of information, to keep from overloading the console and causing it to fail to scroll down.