A similar question may have been asked, but despite spending a couple of hours I could not find a satisfying answer.
I would like to call Frame.click(). At the time of calling it - I want to make sure that no navigation is pending. I also don't know if the element I am about to click on will result in navigation (it's passed in dynamically).
I tried https://www.npmjs.com/package/pending-xhr-puppeteer, but it no longer seems to be maintained (it still uses Request instead of HTTPRequest).
Right now I am resorting to page.waitForNetworkIdle(). Is this the best I can do?
P.S. It seems page.waitForNavigation() used to return right away if no navigation was taking place (on a year-old version of the code). Since I updated to 12.1, it waits for the timeout and then throws an exception if no navigation started within that timeframe.
I think you're confusing navigation and requests/responses. Navigation can be finished while requests/responses are still happening (e.g. .aspx requests).
To make sure the navigation has happened, you could start the click and the wait together via Promise.all():
await Promise.all([
page.click(`#submit`),
page.waitForNavigation(),
]);
Additionally, you could await page.content() to get the full HTML contents of the page, including the doctype.
In the case of an ajax request you can use page.waitForResponse():
await Promise.all([
page.click(`#submit`),
page.waitForResponse(response => response.status() === 200),
]);
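For the case in the question, where it is not known in advance whether the click navigates, one option is to wait for a navigation only for a short grace period and treat a timeout as "no navigation". This is a minimal sketch, not a definitive implementation: clickAndMaybeNavigate and graceMs are names made up here, and it only assumes that page exposes click() and waitForNavigation({ timeout }) as Puppeteer's Page does.

```javascript
// Hypothetical helper: click, then wait for a navigation only if one shows up
// within `graceMs`. Resolves to true if a navigation completed, false otherwise.
async function clickAndMaybeNavigate(page, selector, graceMs = 1000) {
  // Start listening before the click so a fast navigation is not missed.
  const navigated = page
    .waitForNavigation({ timeout: graceMs })
    .then(() => true)
    // Simplification: any rejection (normally the timeout) counts as "no navigation".
    .catch(() => false);

  await page.click(selector);
  return navigated;
}
```

The trade-off is that every non-navigating click costs up to graceMs, so the value should be as small as the site allows.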
I have a small automation code that works on a website, using puppeteer, chrome headless.
The code is very simple.
locate an element.
click on it.
The thing is that the click events are SUPER slow. I assumed a click should take no more than a few ms,
but it takes 90-400ms to perform a single click event.
there is no need to scroll because all the elements appear on the screen.
here is a piece of code:
logger.verbose(`before clicking on row`);
// get by link text
// search for any element with this class and text
// page.$x() resolves to an array of handles, so take the first match
let [workElement] = await page.$x(`//span[contains(@class,"tr-class") and contains(text(),"${jobSearchId}")]`);
//we don't want to click on the element itself because it makes things more difficult
let row = await (await workElement.getProperty("parentNode")).getProperty("parentNode");
await row.click();
logger.verbose(`row clicked`);
//click on the button
logger.verbose(`button click before`);
await page.click(".searchJobDescription");
logger.verbose(`button clicked`);
0|main | 2022-08-17 13:09:05.283 - verbose: before clicking on row
0|main | 2022-08-17 13:09:05.685 - verbose: row clicked
0|main | 2022-08-17 13:09:05.685 - verbose: button click before
0|main | 2022-08-17 13:09:05.776 - verbose: button clicked
clicking on the row takes 402ms
and clicking the button takes 91ms
even a human can perform these actions much faster.
can anyone help me understand how to speed up these actions?
puppeteer is very slow and I don't know why.
So...
The solution is super weird and seems unrelated to the problem, but I have tested it again and again, and I'm 100% sure that this is the cause (or part of a bigger bug I can't see).
I built a cookies file that stores the page cookies. When I inject the cookies back into the page, everything slows down. I am using the following method for it.
async function injectCookies() {
const cookiesList = getCookiesFromJarAsList();
await page.setCookie(...cookiesList);
}
When I remove the call that injects the cookies, Puppeteer runs fast again (~30ms per click).
async function injectCookies() {
const cookiesList = getCookiesFromJarAsList();
// removing this line makes Puppeteer great again
// await page.setCookie(...cookiesList);
}
I am not even sure that this is a Puppeteer issue.
It seems like a Chrome/Chromium problem, because when I debugged the CDP traffic, Chrome itself returned very slow answers for the command "mouseMove" (which is part of Puppeteer's click method).
I'm not sure what's going on, so if someone smart along the way is able to solve it, that would be great.
For me the problem is solved, and I wasted more than a week debugging it, so I'm leaving it this way.
If someone can reproduce it and open a bug for the Puppeteer/Chromium guys, I am sending them very good karma.
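For anyone who wants to try reproducing this, a small timing wrapper makes the comparison easier than reading log timestamps. This is only a sketch: timed is a hypothetical helper name, and the idea is to wrap the same click once with and once without the setCookie call.

```javascript
const { performance } = require('node:perf_hooks');

// Hypothetical helper: run an async action, log how long it took,
// and return both the action's result and the elapsed milliseconds.
async function timed(label, action) {
  const start = performance.now();
  const result = await action();
  const ms = performance.now() - start;
  console.log(`${label}: ${ms.toFixed(1)} ms`);
  return { result, ms };
}

// Usage inside an async function (assumes an open Puppeteer `page`):
//   await timed('button click', () => page.click('.searchJobDescription'));
```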
I'm navigating with Puppeteer around a React website.
Two sample lines of code:
await page.waitForSelector('a.btn-lg[data-target="#loginModal"]');
await page.click('a.btn-lg[data-target="#loginModal"]');
With a sufficient slowMo value, the effects are consistent - the button gets clicked every time.
However, without slowMo, sometimes the button does get clicked, and sometimes it doesn't (a window wired to it doesn't open).
It happens for a lot of elements, not just this one button in particular.
I just started using Puppeteer, and it looks like I'm either misusing the library, or the website somehow screws up my efforts.
Please tell me why sometimes the effects of clicking are visible and sometimes not, and how to remedy it.
UPDATE:
Code such as this does not work either.
await page.evaluate(() => (document.querySelector('span.pum-close') as any).click());
await page.$$eval('span.pum-close', elements =>
elements[0].click()
);
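One possible workaround, assuming the clicks are lost because the element is in the DOM before its handler is attached (this is a guess, not something confirmed above), is to retry the click until its expected effect shows up. A rough sketch follows; clickUntilVisible, expectedSelector, attempts and waitMs are made-up names, and only page.click() and page.waitForSelector() are used from Puppeteer.

```javascript
// Hypothetical helper: click `selector` and retry until `expectedSelector`
// becomes visible. Resolves to the number of attempts that were needed.
async function clickUntilVisible(page, selector, expectedSelector, attempts = 5, waitMs = 500) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    await page.waitForSelector(selector, { visible: true });
    await page.click(selector);
    try {
      // Give the page a short window to react to the click.
      await page.waitForSelector(expectedSelector, { visible: true, timeout: waitMs });
      return attempt;
    } catch (err) {
      // The expected element did not appear in time; assume the click was lost and retry.
    }
  }
  throw new Error(`"${expectedSelector}" did not appear after ${attempts} clicks on "${selector}"`);
}
```

This only suits clicks that are safe to repeat; retrying a toggle could close what the previous click opened.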
I downloaded the latest version of Puppeteer a couple weeks ago, so I'm new with it. The first thing I noticed is that
await this.page.waitForNavigation();
does not seem to work. If I run in non-headless mode and debug, I can see that waitForNavigation() returns as soon as navigation starts, not when it finishes. Who cares when navigation starts? You can't do anything until navigation is complete.
How can I be sure a page is ready? Right now I have had to fill my code with lots of
await this.page.waitFor(SomeDelayMs);
Generally speaking, you're better off using:
await page.waitForSelector('your_selector')
That will cause puppeteer to wait until a specific selector is available before continuing execution.
You can also use something like this if you're dealing with something that only shows up once clicked:
await page.waitForSelector('your_selector', {visible: true})
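Putting the two ideas together, a click that triggers a navigation can wait for the navigation to settle and then for a selector that marks the page as usable. A minimal sketch, assuming the click really does navigate; clickAndWaitForReady and readySelector are hypothetical names, while waitUntil: 'networkidle0' is one of the waitForNavigation options Puppeteer accepts.

```javascript
// Hypothetical helper: click something that navigates, wait until the network
// has gone quiet, then wait for an element that signals the page is ready.
async function clickAndWaitForReady(page, clickSelector, readySelector) {
  await Promise.all([
    // Registered before the click is dispatched so the navigation is not missed.
    page.waitForNavigation({ waitUntil: 'networkidle0' }),
    page.click(clickSelector),
  ]);
  return page.waitForSelector(readySelector, { visible: true });
}
```

This replaces fixed delays with conditions, so it waits only as long as the page actually needs.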
A recurring problem with modern web design can be summed up as "too much sh** all over the place". There're two problems with this: one, it takes up memory and takes longer to load, and two, it visually clutters the webpage.
If I just wanted to solve the second problem, I wouldn't need help. JavaScript can delete DOM nodes and CSS can hide them, so there're already a few visible ways to simply hide parts of a webpage. What I want to do is solve the first problem - make a webpage load faster by not loading certain elements.
I'm pretty sure it's impossible to selectively download certain parts of an HTML file. But once the source is downloaded, the browser doesn't have to actually parse and display all of it, does it?
Of course, if this is done after it's already been parsed and displayed, it would be pointless. So I need a way to tell Chrome what to do before it begins parsing the HTML. Is this possible, and do you think it would significantly reduce load time/memory usage?
Yeah, unfortunately I've never seen a way of changing the HTML before Chrome renders it.
But as far as blocking things that the page gets to display, I'd recommend just using AdBlock: https://chrome.google.com/webstore/detail/gighmmpiobklfepjocnamgkkbiglidom
AdBlock can stop resources (JS, images, CSS, XMLHttpRequest) from ever being downloaded (it blocks them in the background using the webRequest API) and can also hide elements using CSS. It's rather effective: just remember to select advanced options on its options page, and then when you click the AdBlock button you get "Show the resource list". Installing Flashblock can also help, or you can disable plugins in Chrome's settings; either way the plugins won't load automatically but will still show on the page, and you can then load them on demand.
Totally possible! Meet the newest Chrome API: webRequest, finalized in the current version of Chrome - 17.
Docs for webRequest: http://code.google.com/chrome/extensions/webRequest.html#event-onBeforeRequest
I'm trying to think of a solid way to do this... one suggestion I have is using the 'sub_frame' filter, and watching if it's a like/tweet/social button url
You could also block known analytics stuff... and the list goes on! Have fun! Do you have an email list I can sub to for when you launch? If not, get one and drop me a comment!
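A rough sketch of what such a blocker could look like in an extension background script. The blocklist entries and the shouldBlock name are purely illustrative assumptions; the listener uses the blocking form of chrome.webRequest.onBeforeRequest, which needs the webRequest and webRequestBlocking permissions plus host permissions in the manifest.

```javascript
// Hypothetical blocklist: URL fragments for social widgets and analytics.
const BLOCKED_URL_PARTS = [
  'facebook.com/plugins',
  'platform.twitter.com',
  'google-analytics.com',
];

// Decide whether a request should be cancelled, based only on its URL.
function shouldBlock(details) {
  return BLOCKED_URL_PARTS.some((part) => details.url.includes(part));
}

// Register the listener only when running inside an extension.
if (typeof chrome !== 'undefined' && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    (details) => ({ cancel: shouldBlock(details) }),
    { urls: ['<all_urls>'], types: ['sub_frame', 'script'] },
    ['blocking']
  );
}
```

Cancelling at onBeforeRequest means the resource is never downloaded, which is what saves load time rather than just hiding the element afterwards.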
(From the comments, here is how an innerHTML hack could work)
//This modLoop constantly peers into and modifies the innerHTML in attempt to modify the html before it's fully processed.
var modLoop = function modLoop(){
var html = document.documentElement.innerHTML
//modify the page html before it's processed!
//like: html = html.replace('//google'sCDN.com/jquery/1.7.1/', chrome.extension.getURL('localjQuery.1.7.1.js'));
//I just pulled that ^ out of nowhere, you'll want to put careful thought into it.
//Then, mod the innerHTML:
document.documentElement.innerHTML = html;
setTimeout(modLoop, 1);
};
var starter = function starter(){
if (document.documentElement.innerHTML && document.documentElement.innerHTML.length > 0) {
modLoop();
} else {
setTimeout(starter, 1);
}
};
starter();
I am starting my internship on a home server that can control multiple home-automation devices from a web page. The general idea is that a click on a button spawns a certain script on the server, which controls a microcontroller.
My tutor built a simple website he gave me, using AJAX to always stay on 1 page, and brings the menus according to user actions (they are hidden if not used, brought back to front if used).
I have set up an apache server which I configured to execute CGI scripts, and it works.
To always stay on one page, I used the '204 No Content' return trick, so that the server's answer to the page is 'I don't have anything to say, just stay on this page'.
But the one problem I have is that the CGI is launched only once. If I click the button for the first time it works, afterwards nothing happens.
I tried using the SSI (shtml) to use the in a button code instead of using a FORM with GET method but the script still won't execute twice.
I might be using the wrong tools. Should I keep going with CGIs ? Or is there something else (like AJAX, jquery) that actually is designed to do what I want ?
Thanks for having read.
EDIT : I have found a way around it (it's always when I'm desperate after looking for days for an answer that I go to forums and then find myself a nice solution in the next hour .... )
I used a basic link, and for some reason it has a different behaviour than using a button. Whatever. My interrogations on the technologies used still stand though :)
EDIT2 : My solution is crappy, for some reason the script is also called at page refresh (or when the page loads for the first time). It's strange, because since it's in the it should only be spawned when I click on it ...
Familiarize yourself with jQuery and its AJAX API. You can't make it not load a new page unless you use AJAX. Here is an example of what an AJAX call looks like:
$.ajax({
url: 'http://server.domain.com/cgi-bin/myfile.cgi',
data: {
x: 1,
today: '20110504',
user: 'Joe'
}
}).done(function(data, status, xhr) { // .done() replaces the deprecated .success()
if (data)
alert(data);
});
That is for jQuery 1.5 or higher. You can run that whenever a certain button is clicked like this:
HTML for the button:
<input type="button" id="doThis"/>
JavaScript:
$(function() {
$('#doThis').click(function() {
//put AJAX sample shown above, here
});
});
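Putting the two pieces together gives something like the sketch below. The URL, the action parameter and buildCgiRequest are placeholders invented for illustration. The cache: false option is added on the assumption (not confirmed in the question) that the script runs only once because the browser caches the GET response; jQuery then appends a timestamp parameter so each click produces a fresh request.

```javascript
// Hypothetical helper: build the $.ajax settings for one button action.
function buildCgiRequest(action) {
  return {
    url: '/cgi-bin/myfile.cgi',   // placeholder path to the CGI script
    data: { action: action },     // placeholder parameter read by the script
    cache: false                  // ask jQuery to bypass the browser cache for GET
  };
}

// Wire the button up only when jQuery is loaded (i.e. in the browser).
if (typeof $ !== 'undefined') {
  $(function() {
    $('#doThis').click(function() {
      $.ajax(buildCgiRequest('toggle')).done(function(data) {
        if (data)
          alert(data);
      });
    });
  });
}
```

Because the request is sent in the background, the page never navigates, so the '204 No Content' trick is no longer needed.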