What is the current folder for img references of static page - puppeteer

When a page is rendered with page.setContent from static HTML content, which folder do relative attributes such as the src of an img tag resolve against?
For example, for:
await page.setContent('<img src="./pic.jpg" />');
which folder does ./ refer to?

It appears to be undefined. Here are my test results:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox']});
  const page = await browser.newPage();
  page.on('request', request => console.log('send request: ' + request.url()));
  page.on('console', message => console.log('console: ' + message.text()));
  await page.setContent('<img src="./test.jpg" /><script>console.log("href="+window.location.href);</script>');
  await browser.close();
})();
output:
console: href=about:blank
The page URL is about:blank and no requests are sent.
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox']});
  const page = await browser.newPage();
  page.on('request', request => console.log('send request: ' + request.url()));
  page.on('console', message => console.log('console: ' + message.text()));
  await page.setContent('<base href="https://www.google.com"><img src="./test.jpg" /><script>console.log("href="+window.location.href);</script>');
  await browser.close();
})();
output:
console: href=about:blank
send request: https://www.google.com/test.jpg
console: Failed to load resource: the server responded with a status of 404 ()
After a base element is added, the browser requests test.jpg relative to the base href, while the page URL is still about:blank.
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox']});
  const page = await browser.newPage();
  page.on('request', request => console.log('send request: ' + request.url()));
  page.on('console', message => console.log('console: ' + message.text()));
  // set base href to local URL
  await page.setContent('<base href="file:///abc/index.html"><img src="./test.jpg" /><script>console.log("href="+window.location.href);</script>');
  await browser.close();
})();
output:
console: href=about:blank
console: Not allowed to load local resource: file:///abc/test.jpg
send request: file:///abc/test.jpg

The folder is resolved relative to the URL of the page you are visiting.
For example, if the URL is
mydomain.com/directory1/page.html
then ./pic.jpg resolves to mydomain.com/directory1/pic.jpg
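
The results in this thread are consistent with ordinary URL resolution: a relative src is resolved against the document's base URL, which is the page URL unless a base element overrides it. As a rough sketch outside the browser, Node's built-in URL class applies the same resolution rules. The helper name resolveSrc is made up for illustration, and the http:// scheme on mydomain.com is an assumption, since the answer gives no scheme.

```javascript
// Resolve a relative reference the way a browser resolves an img src:
// against the document's base URL. Returns null when resolution is impossible.
function resolveSrc(src, baseUrl) {
  try {
    return new URL(src, baseUrl).href;
  } catch (err) {
    return null; // the base has no hierarchy to resolve against
  }
}

// Page URL used as the base (scheme assumed for the example).
console.log(resolveSrc('./pic.jpg', 'http://mydomain.com/directory1/page.html'));
// → http://mydomain.com/directory1/pic.jpg

// Base URLs taken from the <base href> tests in this thread.
console.log(resolveSrc('./test.jpg', 'https://www.google.com'));
// → https://www.google.com/test.jpg
console.log(resolveSrc('./test.jpg', 'file:///abc/index.html'));
// → file:///abc/test.jpg

// about:blank cannot act as a base for a relative path.
console.log(resolveSrc('./test.jpg', 'about:blank'));
// → null
```

The last case matches the first test: with the page at about:blank and no base element, there is nothing to resolve ./test.jpg against, so no request goes out.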

Related

puppeteer-proxy: I'm getting Error: net::ERR_FAILED

I'm trying the puppeteer-proxy library for the first time, and I'm getting this error.
I don't know whether it is an error in puppeteer-proxy or in the call await page.setRequestInterception(true), because another user has reported the same error.
Code
const puppeteer = require('puppeteer-extra')
const StealthPlugin = require('puppeteer-extra-plugin-stealth')
const {proxyRequest} = require('puppeteer-proxy')
puppeteer.use(StealthPlugin())
var browser = null
const func = (async () => {
  browser = await puppeteer.launch({
    headless: false,
    executablePath: "chrome-win/chrome.exe"
  })
  const page = await browser.newPage()
  await page.setRequestInterception(true);
  page.on('request', async (request) => {
    await proxyRequest({
      page,
      proxyUrl: "https://username:password#174.25.210.207:6286",
      request,
    });
  });
  await page.goto("https://www.google.com/")
})();
Error
C:\Users\edina\Documents\Last Developer Projects\Scraping Browser\node_modules\puppeteer-core\lib\cjs\puppeteer\common\Frame.js:238
? new Error(`${response.errorText} at ${url}`)
^
Error: net::ERR_FAILED at https://www.google.com/
at navigate (C:\Users\edina\Documents\Last Developer Projects\Scraping Browser\node_modules\puppeteer-core\lib\cjs\puppeteer\common\Frame.js:238:23)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Frame.goto (C:\Users\edina\Documents\Last Developer Projects\Scraping Browser\node_modules\puppeteer-core\lib\cjs\puppeteer\common\Frame.js:207:21)
at async CDPPage.goto (C:\Users\edina\Documents\Last Developer Projects\Scraping Browser\node_modules\puppeteer-core\lib\cjs\puppeteer\common\Page.js:439:16)
at async C:\Users\edina\Documents\Last Developer Projects\Scraping Browser\test.js:28:5
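
One detail that may be worth ruling out, offered only as a guess since this question has no confirmed answer: URL syntax separates credentials from the host with @, while the proxyUrl in the question uses #. The sketch below uses Node's built-in URL class to show how the two forms parse. describeProxyUrl is a hypothetical helper, and passing this check does not by itself guarantee that puppeteer-proxy accepts the URL or that this is the cause of net::ERR_FAILED.

```javascript
// Parse a proxy URL and report the parts a proxy client needs.
// Returns null when the string is not a valid URL.
function describeProxyUrl(proxyUrl) {
  try {
    const { protocol, username, password, hostname, port } = new URL(proxyUrl);
    return { protocol, username, password, hostname, port };
  } catch (err) {
    return null;
  }
}

// With '#', "username" is read as the host and "password" as an invalid port.
console.log(describeProxyUrl('https://username:password#174.25.210.207:6286'));

// With '@', the credentials and the host are separated as intended.
console.log(describeProxyUrl('https://username:password@174.25.210.207:6286'));
```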

HTTPS iframe in puppeteer

I'm trying to render an iframe in puppeteer. So far so good: it's set up and working. The problem is that to render an iframe from this URL, the embedding page needs to be on an https URL. The error I get is:
Refused to frame because an ancestor violates the following Content Security Policy directive: "frame-ancestors 'self' https:".
Is there a way to get this kind of thing working in puppeteer?
Here's the code I have so far:
const puppeteer = require("puppeteer");
const embed = `
<iframe src="<some https url>" style="width: 330px; height: 186px; border: 0px;"></iframe>
`;
const timedPromise = time => new Promise(res => {
  setTimeout(() => { res() }, time);
});
(async function () {
  const browser = await puppeteer.launch({ headless: false});
  const page = await browser.newPage();
  await page.setContent(embed);
  await timedPromise(3000);
  await page.screenshot({ path: `screenshot${Number(Date.now())}.png` });
  await browser.close();
})();
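
The directive frame-ancestors 'self' https: only allows ancestors that share the frame's origin or have an https URL, and content loaded with page.setContent lives at about:blank. One possible workaround, sketched here on the assumption that request interception is acceptable for the use case, is to answer an intercepted https navigation with the same HTML so that the embedding page gets an https URL. The function name setContentOverHttps and the placeholder URL https://example.com/ are made up for illustration; page is expected to be a Puppeteer Page.

```javascript
// Serve `html` as the response for `fakeUrl` so the page has an https URL.
// `fakeUrl` is only a placeholder: the request is answered locally.
async function setContentOverHttps(page, html, fakeUrl = 'https://example.com/') {
  await page.setRequestInterception(true);
  page.on('request', request => {
    if (request.url() === fakeUrl) {
      // Answer the top-level navigation without touching the network.
      request.respond({ status: 200, contentType: 'text/html', body: html });
    } else {
      // Everything else, including the iframe itself, loads normally.
      request.continue();
    }
  });
  await page.goto(fakeUrl);
}

// Usage (inside an async function), replacing page.setContent(embed):
//   await setContentOverHttps(page, embed);
```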

How to use puppeteer to go to a web page, then press Control+P to print the page?

How do I press Control+P on a web page that is automated by puppeteer?
This code loads the web page, but using await page.keyboard.down('Control') to press the Control key has no effect.
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({ headless: false });
  const page = await browser.newPage();
  await page.goto(`https://google.com`);
  await page.waitForSelector('input');
  await page.focus("input");
  // this works
  await page.keyboard.down('Shift');
  await page.keyboard.press('KeyP');
  await page.keyboard.up('Shift');
  // this has no effect.
  await page.keyboard.down('Control');
  await page.keyboard.press('KeyP');
  await page.keyboard.up('Control');
})();
What I would like to do is navigate to a PDF file, have the browser open the PDF, then press Control+P and automate the print dialog to the extent that the code selects the printer to print to and presses the Enter key.
Running puppeteer with the --kiosk-printing flag lets the window.print() dialog be answered automatically.
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    args: ['--kiosk-printing']
  });
  const page = await browser.newPage();
  await page.goto(`file:///C:/Users/srich/Downloads/packing-list.pdf`);
  await page.evaluate(() => { window.print(); });
  // Give the print job time to be submitted. A plain timer is used because
  // page.waitForTimeout is not available in newer Puppeteer releases.
  await new Promise(resolve => setTimeout(resolve, 2000));
  await browser.close();
})();

Puppeteer can't catch failing request & errors

I'm trying to collect data from failing requests and JS errors.
I'm using the following site: https://nitzani1.wixsite.com/marketing-automation/3rd-page
The site makes a request to https://api.fixer.io/1latest, which returns a status code of 404.
The page also contains the following JS error:
"Uncaught (in promise) Fetch did not succeed"
I've tried the code below to catch the 404 and the JS error, but couldn't.
I'm not sure what I'm doing wrong. Any idea how to solve it?
const puppeteer = require('puppeteer');
function wait (ms) {
  return new Promise(resolve => setTimeout(() => resolve(), ms));
}
var run = async () => {
  const browser = await puppeteer.launch({
    headless: false,
    args: ['--start-fullscreen']
  });
  page = await browser.newPage();
  page.on('error', err=> {
    console.log('err: '+err);
  });
  page.on('pageerror', pageerr=> {
    console.log('pageerr: '+pageerr);
  });
  page.on('requestfailed', err => console.log('requestfailed: '+err));
  collectResponse = [];
  await page.on('requestfailed', rf => {
    console.log('rf: '+rf);
  });
  await page.on('response', response => {
    const url = response.url();
    response.buffer().then(
      b => {
        // console.log(url+' : '+response.status())
      },
      e => {
        console.log('response err');
      }
    );
  });
  await wait(500);
  await page.setViewport({ width: 1920, height: 1080 });
  await page.goto('https://nitzani1.wixsite.com/marketing-automation/3rd-page', {
  });
};
run();
The complete working answer is:
const puppeteer = require('puppeteer');
const run = async () => {
  const browser = await puppeteer.launch({
    headless: true
  });
  const page = await browser.newPage();
  // Catch requests that fail at the network level (DNS errors, aborted requests, ...)
  page.on('requestfailed', request => {
    console.log(`url: ${request.url()}, errText: ${request.failure().errorText}, method: ${request.method()}`)
  });
  // HTTP error statuses (4xx..5xx) arrive as ordinary responses, not as 'requestfailed'
  page.on('response', response => {
    if (response.status() >= 400) {
      console.log(`url: ${response.url()}, status: ${response.status()}`);
    }
  });
  // Catch uncaught exceptions thrown inside the page
  page.on("pageerror", err => {
    console.log(`Page error: ${err.toString()}`);
  });
  // Catch all console messages
  page.on('console', msg => {
    console.log('Logger:', msg.type());
    console.log('Logger:', msg.text());
    console.log('Logger:', msg.location());
  });
  await page.setViewport({ width: 1920, height: 1080 });
  await page.goto('https://nitzani1.wixsite.com/marketing-automation/3rd-page', { waitUntil: 'domcontentloaded' });
  // Wait 10 seconds so that late exceptions are logged and handled.
  // A plain timer is used because page.waitFor is not available in newer Puppeteer releases.
  await new Promise(resolve => setTimeout(resolve, 10000));
  await browser.close();
};
run();
Save it in a .js file and run it with node.
Puppeteer 8.0.0 and later put very little information in message.text(), so the description of the error has to be read from the JSHandle.
This answer shows how to get fully descriptive console errors from the JSHandle objects:
https://stackoverflow.com/a/66801550/9026103
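
For reference, Puppeteer documents the requestfailed event as covering requests that fail at the network level, while HTTP error responses such as 404 or 503 still complete and arrive through the response event. A small sketch of a helper that records both kinds of failure, plus page exceptions, in one array follows. The name collectFailures and the shape of the collected entries are made up for illustration; page is expected to be a Puppeteer Page.

```javascript
// Attach listeners to a Puppeteer Page and collect failures into one array:
// network failures, HTTP error statuses, and uncaught page exceptions.
function collectFailures(page) {
  const failures = [];

  // Network-level failures (DNS errors, aborted requests, ...).
  page.on('requestfailed', request => {
    const failure = request.failure();
    failures.push({
      kind: 'requestfailed',
      url: request.url(),
      detail: failure ? failure.errorText : 'unknown',
    });
  });

  // HTTP error responses (4xx and 5xx) are delivered as normal responses.
  page.on('response', response => {
    if (response.status() >= 400) {
      failures.push({ kind: 'http', url: response.url(), detail: String(response.status()) });
    }
  });

  // Uncaught exceptions raised inside the page.
  page.on('pageerror', err => {
    failures.push({ kind: 'pageerror', url: page.url(), detail: String(err) });
  });

  return failures;
}
```

Calling collectFailures(page) before page.goto keeps early events from being missed; the returned array can be inspected once the page has settled.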

Chrome puppeteer Close page on error event

I want to close pages when puppeteer runs into any error. Sometimes the page that I try to load crashes and .close() is never called.
(async () => {
  const page = await browser.newPage();
  await page.setViewport({width: resWidth, height: resHeight});
  await page.goto(d["entities"]["urls"][0]["expanded_url"], {timeout :90000});
  await page.screenshot({path: './resimdata/'+d['id']+'.png' ,fullPage: true});
  await page.close();
})();
There is an issue/PR on the puppeteer repo about this that may help in similar situations.
Related issue link: https://github.com/GoogleChrome/puppeteer/issues/952
Meanwhile, you can try this little hack. If the PR has landed in version 0.12+, the following code is no longer needed.
const puppeteer = require('puppeteer');
(async() => {
  const browser = await puppeteer.launch({headless: false});
  const page = await browser.newPage();
  // Log the reason, close everything, and exit with a failure code
  function handleClose(msg){
    console.log(msg);
    page.close();
    browser.close();
    process.exit(1);
  }
  process.on("uncaughtException", () => {
    handleClose(`I crashed`);
  });
  process.on("unhandledRejection", () => {
    handleClose(`I was rejected`);
  });
  // chrome://crash deliberately crashes the page
  await page.goto("chrome://crash");
})();
This outputs something like the following:
▶ node app/app.js
I was rejected
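
A more targeted alternative to process-wide handlers, sketched here as an assumption rather than as the approach from the linked issue, is to wrap each page's work in try/finally so that the page is closed whether or not the work throws. The helper name withPage is hypothetical; browser is expected to be a Puppeteer Browser.

```javascript
// Run `work(page)` on a fresh page and always close the page afterwards,
// even when `work` throws or a navigation times out.
async function withPage(browser, work) {
  const page = await browser.newPage();
  try {
    return await work(page);
  } finally {
    // Closing can itself fail if the page has already crashed; ignore that.
    await page.close().catch(() => {});
  }
}

// Usage (inside an async function):
//   await withPage(browser, async page => {
//     await page.goto(url, { timeout: 90000 });
//     await page.screenshot({ path: 'shot.png', fullPage: true });
//   });
```

The error from work still propagates to the caller, so it can be logged or retried there without leaving the page open.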