I have a Chrome extension in which I'm trying to use puppeteer-web. I followed the code from this question to get set up:
puppeteer-web: "Puppeteer is not a constructor"
This is my code:
const puppeteer = require("puppeteer");

async function initiatePuppeteer() {
  let browserWSEndpoint = '';
  await fetch("http://127.0.0.1:9222/json")
    .then(response => response.json())
    .then(function(data) {
      let filteredData = data.filter(tab => tab.type === 'page');
      browserWSEndpoint = filteredData[0].webSocketDebuggerUrl;
    })
    .catch(error => console.log(error));
  const browser = await puppeteer.connect({
    browserWSEndpoint: browserWSEndpoint
  });
  const page = await browser.newPage();
  // ....etc
}
The code doesn't seem to get past this point: when I set a breakpoint at const browser = await puppeteer.connect, I get this error:
Uncaught (in promise) Error: Protocol error (Target.getBrowserContexts): Not allowed.
Using Chrome version V76.0.3809.100
Any ideas?
Edit: my webSocketDebuggerUrl is something like ws://127.0.0.1:9222/devtools/page/E1B62B356262B00C26A5D79D03745360
I suspect it's because the path is /page/ and not /browser/, but I couldn't find any target of type browser in the /json route. I'll take another look at it tonight.
OK, so it turns out Puppeteer can only connect to a browser target and not a page target, as per https://github.com/GoogleChrome/puppeteer/issues/4579#issuecomment-511267351
The API documentation (which I should have read earlier...) at https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#browserwsendpoint states that the WebSocket URL for the browser target is found at http://${host}:${port}/json/version rather than at /json.
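As a rough sketch of that fix, the snippet below reads webSocketDebuggerUrl from /json/version and checks that it is a browser-level endpoint before connecting. It assumes Chrome was started with --remote-debugging-port=9222 and that a global fetch is available (a browser context, or Node 18+). The helper names isBrowserEndpoint and connectToRunningChrome are made up for illustration and are not part of Puppeteer.

```javascript
// Hypothetical helper: true only for browser-level DevTools endpoints
// (/devtools/browser/...), false for page-level ones (/devtools/page/...).
function isBrowserEndpoint(wsUrl) {
  return new URL(wsUrl).pathname.startsWith("/devtools/browser/");
}

// Hypothetical helper: look up the browser target and connect to it.
// `puppeteer` is passed in so the sketch works with puppeteer or puppeteer-web.
async function connectToRunningChrome(puppeteer, host = "127.0.0.1", port = 9222) {
  // /json lists page targets; /json/version describes the browser itself.
  const response = await fetch(`http://${host}:${port}/json/version`);
  const { webSocketDebuggerUrl } = await response.json();
  if (!isBrowserEndpoint(webSocketDebuggerUrl)) {
    throw new Error(`Not a browser endpoint: ${webSocketDebuggerUrl}`);
  }
  return puppeteer.connect({ browserWSEndpoint: webSocketDebuggerUrl });
}

// Usage (inside an async function, with Chrome already running):
//   const browser = await connectToRunningChrome(require("puppeteer"));
```

The endpoint check is only there to fail early with a readable message instead of the "Target.getBrowserContexts: Not allowed" protocol error.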
Related
I have been studying the usage of the puppeteer.connect method for almost an hour, but it still makes an error. Please help. I am a beginner. This method has troubled me for a long time.
const puppeteer = require('puppeteer');

(async () => {
  var wsa = "ws://localhost:9222/devtools/browser/f59fe52c-d869-48a1-a7d4-c2b604a5b3";
  const browserConfig = {
    browserWSEndpoint: wsa
  };
  const browser = await puppeteer.connect(browserConfig);
  const page = await browser.newPage();
  // TODO: your script goes here
})().catch(err => {
  console.log(err);
  process.exit();
});
The error I got:
ErrorEvent {
target: WebSocket {
_events: [Object: null prototype] { open: [Function], error: [Function] },
type: 'error',
message: 'Unexpected server response: 404',
error: Error: Unexpected server response: 404
Browser command line
"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222
I saw that other people's answers solve the problem like this:
const puppeteer = require('puppeteer');

(async () => {
  const browserURL = 'http://127.0.0.1:9222';
  const browser = await puppeteer.connect({ browserURL, defaultViewport: null });
  const page = await browser.newPage();
  // TODO: your script goes here
})().catch(err => {
  console.log(err);
  process.exit();
});
The mistake was in my original code. I thought the browserWSEndpoint parameter should be the webSocketDebuggerUrl value listed at http://localhost:9222/json in Chrome,
for example "ws://localhost:9222/devtools/page/F0B9E1C93A2F91F91F4A63D3FAEDDB".
In fact, you only need to pass the local IP plus the port number as browserURL.
I only understood the details after reading other people's code. Below is the official explanation of the function.
puppeteer.connect(options)
options <Object>
browserWSEndpoint <string> a browser websocket endpoint to connect to.
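To make the difference between the two options concrete, here is a minimal sketch that picks browserURL for an http(s) address and browserWSEndpoint for a ws(s) address. The function name connectOptionsFor is hypothetical. With browserURL, Puppeteer looks up the WebSocket endpoint itself, so you don't have to copy an ID by hand.

```javascript
// Hypothetical helper: choose the puppeteer.connect option that matches the address.
function connectOptionsFor(address) {
  const { protocol } = new URL(address);
  if (protocol === "http:" || protocol === "https:") {
    // Puppeteer resolves the WebSocket endpoint from this base URL.
    return { browserURL: address, defaultViewport: null };
  }
  if (protocol === "ws:" || protocol === "wss:") {
    // Must be a browser-level endpoint (/devtools/browser/...), not a page one.
    return { browserWSEndpoint: address, defaultViewport: null };
  }
  throw new Error(`Unsupported address: ${address}`);
}

// Usage (needs Chrome running with --remote-debugging-port=9222):
//   const browser = await puppeteer.connect(connectOptionsFor("http://127.0.0.1:9222"));
```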
I've seen this asked around but I can't pinpoint what I'm doing wrong.
Error: UnhandledPromiseRejectionWarning: Error: Evaluation failed: ReferenceError: page is not defined
const page = await browser.newPage();
await page.goto('https://example.com');
const getAllElements = await page.$$eval('.aclass', links => {
  links.map(link => {
    page.hover(link);
    page.screenshot({path: `example${link}.png`});
  })
})
The expected behavior is that I go to example.com and get all the .aclass elements, returned as "links". I then map over the links, which should give me each element as link, and I expect to be able to call page.hover and page.screenshot on it. However, this is where I get the error that page is not defined. Any idea what I'm doing wrong?
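hover and screenshot each return a Promise, so they need to be awaited.
Also, the callback passed to page.$$eval runs in the browser context, where page does not exist, so don't use page inside it. Instead you can do the following: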
const page = await browser.newPage();
await page.goto('https://example.com');
const getAllElements = await page.$$('.aclass');
for (const [i, link] of getAllElements.entries()) {
  await link.hover();
  await link.screenshot({path: `example${i}.png`});
}
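The same loop can be wrapped in a small reusable function. This is only a sketch: hoverAndShoot and its pathFor parameter are hypothetical names, and the function relies on nothing more than each handle having the hover() and screenshot() methods that the handles returned by page.$$ provide.

```javascript
// Hypothetical helper: hover and screenshot each handle, strictly one after another.
// Runs in Node (not inside $$eval), so the handles' methods are available.
async function hoverAndShoot(handles, pathFor = (i) => `example${i}.png`) {
  const written = [];
  for (const [i, handle] of handles.entries()) {
    await handle.hover();                 // wait for the hover to finish
    const path = pathFor(i);
    await handle.screenshot({ path });    // then capture this element
    written.push(path);
  }
  return written;                         // the file paths, in order
}

// Usage (inside an async function, after page.goto):
//   const files = await hoverAndShoot(await page.$$('.aclass'));
```

A plain for...of loop is used instead of map or forEach so that each await completes before the next element is touched.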
I'm trying to build a web scraper using Puppeteer that scrapes my Venmo page to look for payments. When I try to run my script, I get an error that says "page.goto is not a function".
I'm honestly not quite sure where to even start with this
const puppeteer = require('puppeteer');
const url = 'generic.com';

(async () => {
  //running in headless to observe what happens for now
  const browser = await puppeteer.launch({headless: false});
  const page = browser.newPage();
  await page.goto(url);
  let data = await page.evaluate(() => {
    let amount = document.querySelector('span.bold.medium.green').innerText;
    let timePayed = document.querySelector('a.grey_link').innerText;
    return {
      amount,
      timePayed
    }
  });
  console.log(data);
  debugger;
  await browser.close();
})();
This is my error message
UnhandledPromiseRejectionWarning: TypeError: page.goto is not a function
at D:\venmoScraper\scraper.js:12:12
at process._tickCallback (internal/process/next_tick.js:68:7)
(node:13212) UnhandledPromiseRejectionWarning: Unhandled promise
rejection. This error originated either by throwing inside of an async
function without a catch block, or by rejecting a promise which was not
handled with .catch(). (rejection id: 1)
(node:13212) [DEP0018] DeprecationWarning: Unhandled promise rejections
are deprecated. In the future, promise rejections that are not handled
will terminate the Node.js process with a non-zero exit code.
The line
const page = browser.newPage();
should be written as
const page = await browser.newPage();
because browser.newPage() returns a Promise.
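The effect of the missing await can be reproduced without Puppeteer at all. In the sketch below, fakeNewPage is a made-up stand-in for browser.newPage(): it resolves to an object with a goto method, but the Promise it returns has no such method.

```javascript
// Made-up stand-in for browser.newPage(): an async function always returns a Promise.
async function fakeNewPage() {
  return { goto: async (url) => `navigated to ${url}` };
}

async function demo() {
  const notAwaited = fakeNewPage();       // a Promise, not a page
  const awaited = await fakeNewPage();    // the resolved page-like object
  return {
    notAwaitedHasGoto: typeof notAwaited.goto === "function",
    awaitedHasGoto: typeof awaited.goto === "function",
  };
}

// Usage: demo().then(console.log);
```

Calling notAwaited.goto(url) would therefore throw a TypeError saying goto is not a function, which matches the error in the question.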
Once page has been awaited, you can navigate with:
await page.goto("https://www.google.com/");
In addition to the helpful answers above, beginners might also want to double-check that the line above page.goto has the correct syntax i.e. page = await browser.newPage(); instead of something like page = await browser.newPage; (don't forget to put in the parens etc).
(I know the question asked is a bit different, but thought this might be useful since it's how I ran into a 'page.goto is not a function' error)
Can someone explain why this code isn't working? I have a console.log before I run page.evaluate() which logs what I expect, but the console.log inside page.evaluate never appears.
const puppeteer = require('puppeteer');

(async () => {
  try {
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://www.example.com');
    page.on('response', async response => {
      const url = response.url();
      if (url.includes('something')) {
        console.log('this code runs');
        await page.evaluate(() => {
          console.log("this code doesn't run");
        });
      }
    });
  } catch (err) {
    console.log(err);
  }
})();
Console log doesn't work in page.evaluate()
https://github.com/GoogleChrome/puppeteer/issues/1944
Try this code to display console.log output from evaluate:
page.on('console', msg => {
  for (let i = 0; i < msg.args().length; ++i)
    console.log(`${i}: ${msg.args()[i]}`);
});
page.evaluate(() => console.log('hello', 5, {foo: 'bar'}));
https://pptr.dev/#?product=Puppeteer&version=v1.20.0&show=api-event-console
The code inside page.evaluate is run in the browser context, so the console.log works, but inside the Chrome console and not the Puppeteer one.
To display the logs of the Chrome context inside the Puppeteer console, you can set dumpio to true in the arguments when launching a browser using Puppeteer:
const browser = await puppeteer.launch({
dumpio: true
})
console.log works, but in the browser context. I'm guessing that you are trying to see the log in the CLI. If you want to see the log, set headless to false and then look at the browser's console.
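If you would rather see those messages in the terminal, a listener on the page's console event can relay them. The sketch below uses a hypothetical helper, forwardConsole, and relies on the type() and text() methods of the message object passed to the listener. The sink parameter is an assumption added so the output destination can be swapped.

```javascript
// Hypothetical helper: relay browser-side console messages to a Node-side sink.
function forwardConsole(page, sink = console.log) {
  page.on("console", (msg) => {
    // msg.type() is e.g. "log" or "error"; msg.text() is the message text.
    sink(`[browser ${msg.type()}] ${msg.text()}`);
  });
}

// Usage (inside an async function, before calling page.evaluate):
//   forwardConsole(page);
//   await page.evaluate(() => console.log("now visible in the terminal"));
```

Register the listener before the code that logs, otherwise earlier messages are missed.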
I know you can capture a single HTML node via the command prompt, but is it possible to do this programmatically from the console, similar to Puppeteer? I'd like to loop over all elements on a page and capture them, for occasional one-off projects where I don't want to set up a full auth process in Puppeteer.
I'm referring to this functionality:
But executed from the console like during a foreach or something like that.
See the puppeteer reference here.
Something to the effect of this:
$x("//*[contains(@class, 'special-class-name')]").forEach((el)=> el.screenshot())
I just made a script that takes a screenshot of every submit button on the Google main page. Take a look and use it as a starting point.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: null,
    devtools: true,
    args: ['--window-size=1920,1170', '--window-position=0,0']
  });
  const page = (await browser.pages())[0];
  await page.goto('https://www.google.com');
  const submit = await page.$$('input[type="submit"]');
  let num = 0;
  // Use for...of rather than forEach so each screenshot is awaited in turn.
  for (const elemHandle of submit) {
    num++;
    await elemHandle.screenshot({
      path: `${Date.now()}_${num}.png`
    });
  }
})();
You can use ElementHandle.screenshot() to take a screenshot of a specific element on the page. The ElementHandle can be obtained from Page.$(selector) or Page.$$(selector) if you want to return multiple results.
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.goto("https://stackoverflow.com/questions/50715164");
const userInfo = await page.$(".user-info");
await userInfo.screenshot({ path: "userInfo.png" });
The output image after executing the code:
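One caveat for the snippet above: Page.$(selector) resolves to null when nothing matches, so calling screenshot() on the result can throw. A minimal guard might look like the following. The name screenshotIfPresent is hypothetical, and the function only assumes a page object with a $ method as described above.

```javascript
// Hypothetical helper: screenshot the first match, or report that nothing matched.
async function screenshotIfPresent(page, selector, path) {
  const handle = await page.$(selector);   // null when no element matches
  if (handle === null) {
    return false;                           // nothing to capture
  }
  await handle.screenshot({ path });
  return true;
}

// Usage (inside an async function, after page.goto):
//   const ok = await screenshotIfPresent(page, ".user-info", "userInfo.png");
//   if (!ok) console.log("No .user-info element on this page");
```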