I'm experimenting with JavaScript and Puppeteer. When I visit a site, my script takes a screenshot, but some elements of the page haven't loaded yet when the screenshot is taken. Is there anything I can do to add a delay?
Check my answer at: https://stackoverflow.com/a/57677369/10251383
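Try one of these waitUntil options with page.goto():
// Resolves when the load event fires (the default).
await page.goto(url, { waitUntil: 'load' });
// Resolves when the DOMContentLoaded event fires.
await page.goto(url, { waitUntil: 'domcontentloaded' });
// Resolves when there have been no network connections for at least 500 ms.
await page.goto(url, { waitUntil: 'networkidle0' });
// Resolves when there have been no more than 2 network connections for at least 500 ms.
await page.goto(url, { waitUntil: 'networkidle2' });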
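As a rough sketch of how one of these options fits into the screenshot flow, the helper below navigates with networkidle2 and optionally waits for a specific element before capturing. The name screenshotWhenReady and its selector option are hypothetical, and page is assumed to be a Puppeteer Page created with browser.newPage().

```javascript
// Hypothetical helper: navigate, wait for the page to settle, then take a screenshot.
// `page` is assumed to be a Puppeteer Page object.
async function screenshotWhenReady(page, url, path, options = {}) {
  const { waitUntil = "networkidle2", selector } = options;
  // Wait for the chosen lifecycle event before continuing.
  await page.goto(url, { waitUntil });
  // Optionally wait for an element that is known to render late.
  if (selector) {
    await page.waitForSelector(selector);
  }
  await page.screenshot({ path });
}
```

Waiting for a concrete selector is usually more reliable than a fixed delay, since it continues as soon as the element exists and does not depend on guessing how long the page needs.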
I am trying to log the data shown under Inspect > Network > Preview, but right now my code logs what appears under Inspect > Network > Headers.
Here is what I have:
const puppeteer = require("puppeteer");
const url = "https://www.google.com/";
async function StartScraping() {
  await puppeteer
    .launch({
      headless: false,
    })
    .then(async (browser) => {
      const page = await browser.newPage();
      await page.setViewport({
        width: 1500,
        height: 800,
      });
      page.on("response", async (response) => {
        if (response.url().includes("Text")) {
          console.log(await response);
        }
      });
      await page.goto(url, {
        waitUntil: "load",
        timeout: 0,
      });
    });
}
StartScraping();
It depends on how you want it formatted. More information can be found here: https://github.com/puppeteer/puppeteer/blob/9ef4153f6e3548ac3fd2ac75b4570343e53e3a0a/docs/api.md#class-response
I've modified your code to log the response body, which I think is what you want:
page.on("response", async (response) => {
  if (response.url().includes("Text")) {
    console.log(await response.text());
  }
});
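Since the Preview tab typically shows a parsed view of the body, it may help to parse JSON when the body allows it. The sketch below is one way to do that; readBody is a hypothetical name, and response is assumed to be the Puppeteer response object passed to the "response" handler.

```javascript
// Hypothetical helper: return the response body, parsed as JSON when possible.
// `response` is assumed to be a Puppeteer response object with a text() method.
async function readBody(response) {
  const text = await response.text();
  try {
    // Structured data, closer to what the Preview tab displays.
    return JSON.parse(text);
  } catch (err) {
    // Not valid JSON: fall back to the raw string.
    return text;
  }
}
```

Inside the handler, console.log(await readBody(response)) would take the place of the response.text() call.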
When I open the page in my own browser, the result page loads correctly. The page link is below.
https://parcelsapp.com/en/tracking/016-35294405
[Image: working result]
I want to use Puppeteer to load the result page for me, but there the page renders differently.
I set headless: false to debug and found that the browser window launched by Puppeteer cannot load the URL correctly. I guess this is because of the different environments. How can I solve the problem? Thank you.
[Image: non-working result]
My code is below:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    slowMo: 250, // slow down by 250ms
    executablePath: '/usr/bin/google-chrome-stable',
  });
  const page = await browser.newPage();
  await page.on("request", (request) => {
    request.abort();
  });
  await page.goto('https://parcelsapp.com/en/tracking/016-35294405');
  await page.waitForNavigation()
  await page.screenshot({ path: 'result.png' });
  await browser.close();
})();
Is there any way to keep a website from detecting that I am using Puppeteer? I can't navigate around the https://www.footlocker.ca/ website with Puppeteer. I have tried the stealth plugin and random user agents, to no avail.
Any advice on what else I can try?
This website uses navigator.webdriver to check whether you are a real user or a bot, so you can use the code below to delete the navigator.webdriver value.
const puppeteer = require("puppeteer");
(async () => {
  const browser = await puppeteer.launch({
    headless: false,
  });
  const page = await browser.newPage();
  await page.evaluateOnNewDocument(() => {
    delete navigator.__proto__.webdriver;
  });
  await page.goto("https://www.footlocker.ca", {
    waitUntil: "domcontentloaded",
  });
})();
I have a button like the one below and I do not know how to click it. It is the play button shown at the start of the video. I have tried just about everything and it still does not work. Is it possible to click it?
<button class="ytp-large-play-button ytp-button" aria-label="test"></button>
My code:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({headless:false});
  const page = await browser.newPage();
  await page.setViewport({ width: 1749, height: 1080, deviceScaleFactor: 1, });
  await page.goto('https://www.bananki.pl/');
  await page.waitFor(2000)
  // Await the click so the next step does not start before it completes.
  await page.click("a[id='login-btn']")
  await page.waitFor(2000)
  await page.type('input[name=user_mail]', 'email', {delay: 20})
  await page.type('input[name=user_pass]', 'password', {delay: 20})
  await page.keyboard.press("Enter")
  await page.waitFor(3000)
  await page.goto('https://www.bananki.pl/zdobywaj-bananki/banana-tv/');
  await page.waitFor(5000)
  await page.waitFor(10000);
})();
You should be able to click it by passing its CSS selector to page.click:
await page.click(<button css selector here>)
For example:
await page.click('.ytp-large-play-button[aria-label="test"]')
Including the aria-label attribute in the selector makes sure you target exactly this button.
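If the click still fails, the button may simply not be rendered yet at the moment page.click runs. A minimal sketch of waiting for it first is shown here; clickWhenVisible is a hypothetical name, and page is assumed to be a Puppeteer Page. If the player is embedded in an iframe (an assumption, the question does not confirm this), the same helper could be given the matching Frame object in place of the page, since page-level selectors do not reach into frames.

```javascript
// Hypothetical helper: wait for an element to be visible, then click it.
// `page` is assumed to be a Puppeteer Page (or Frame) object.
async function clickWhenVisible(page, selector, timeout = 10000) {
  // Rejects if the element does not become visible within `timeout` ms.
  await page.waitForSelector(selector, { visible: true, timeout });
  await page.click(selector);
}
```

This replaces fixed pauses such as page.waitFor(5000) with a wait that ends as soon as the button is actually on screen.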
I want to close pages when Puppeteer encounters any error. Sometimes the page I try to load crashes, and .close() is never called.
(async () => {
  const page = await browser.newPage();
  await page.setViewport({width: resWidth, height: resHeight});
  await page.goto(d["entities"]["urls"][0]["expanded_url"], {timeout :90000});
  await page.screenshot({path: './resimdata/'+d['id']+'.png' ,fullPage: true});
  await page.close();
})();
There is an issue/PR on the Puppeteer repo about this, which may help in similar situations.
Related issue link: https://github.com/GoogleChrome/puppeteer/issues/952
Meanwhile, you can try this little hack. If the PR has landed in version 0.12+, you won't need the following code.
const puppeteer = require("puppeteer");
(async() => {
  const browser = await puppeteer.launch({headless: false});
  const page = await browser.newPage();
  // Log the reason, close the page and browser, then exit with a failure code.
  function handleClose(msg){
    console.log(msg);
    page.close();
    browser.close();
    process.exit(1);
  }
  process.on("uncaughtException", () => {
    handleClose(`I crashed`);
  });
  process.on("unhandledRejection", () => {
    handleClose(`I was rejected`);
  });
  // Deliberately crash the page to trigger the handlers above.
  await page.goto("chrome://crash");
})();
This outputs something like the following:
▶ node app/app.js
I was rejected
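A different technique from the process-level handlers, and one that keeps the script running, is to wrap each page's work in try/finally so the page is closed whether the work succeeds or throws. The sketch below illustrates that idea; withPage is a hypothetical name, and browser is assumed to be a Puppeteer Browser object.

```javascript
// Hypothetical helper: open a page, run `task` with it, and always close the page.
// `browser` is assumed to be a Puppeteer Browser object.
async function withPage(browser, task) {
  const page = await browser.newPage();
  try {
    return await task(page);
  } finally {
    // Runs whether `task` resolved or threw; errors from closing
    // an already-dead page are ignored.
    await page.close().catch(() => {});
  }
}
```

The screenshot code from the question would go inside the task callback, for example withPage(browser, async (page) => { ... }), and any error from page.goto would still propagate to the caller after the page has been closed.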