Where to put the configurations for puppeteer? - puppeteer

I'm using the gatsby-remark-mermaid plugin, which installs puppeteer as a dependency. The mermaid diagrams render properly on my end, but the build fails. Here is the error message:
Failed to launch the browser process!
/tmp/build/node_modules/puppeteer/.local-chromium/linux-970485/chrome-linux/chrome: error while loading shared libraries: libnss3.so: cannot open shared object file: No such file or directory
I looked into the documentation and since I'm running it on Heroku, I must include this configuration:
puppeteer.launch({ args: ['--no-sandbox'] });
I tried adding it to gatsby-browser.js like this, but it only produced errors:
const puppeteer = require('puppeteer')
const browser = await puppeteer.launch({ args: ['--no-sandbox'] })
Where do I need to put this configuration for it to work?
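One likely reason the gatsby-browser.js attempt fails is that this file is bundled for the visitor's browser, while Puppeteer only runs in Node, and `await` cannot be used at the top level of a CommonJS file. Below is a minimal sketch of a Node-side launch. `launchWithoutSandbox` is a hypothetical helper name, and the sketch assumes it is called from a build-time Node script rather than from browser code. It does not by itself change how gatsby-remark-mermaid launches its own browser, and `--no-sandbox` does not supply a missing shared library such as libnss3.so, which has to be installed in the build environment.

```javascript
// Hypothetical helper: the puppeteer module is passed in as an argument so the
// launch flags live in one place and the function can be exercised with a stub.
async function launchWithoutSandbox(puppeteer, extraOptions = {}) {
  const options = {
    ...extraOptions,
    // Keep any caller-supplied args and add the sandbox flag only once.
    args: [...new Set([...(extraOptions.args || []), '--no-sandbox'])],
  };
  return puppeteer.launch(options);
}

// Usage from a Node script (not from gatsby-browser.js), assuming puppeteer is installed:
// (async () => {
//   const browser = await launchWithoutSandbox(require('puppeteer'));
//   await browser.close();
// })();
```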

Related

how to use chromium with puppeteer on google colab

chromium not working on google colab
Hi, I want to run puppeteer on google colab.
running test code with !node --trace-warnings test.js says:
Command '/usr/bin/chromium-browser' requires the chromium snap to be installed. Please install it with:snap install chromium
installed chromium with
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
!apt install chromium-browser # also gave same result
puppeteer code
const puppeteer = require('puppeteer');
const dotenv = require('dotenv');
dotenv.config();
(async () => {
const browser = await puppeteer.launch({executablePath: '/usr/bin/chromium-browser'});
const page = await browser.newPage();
await page.goto('https://www.google.com');
await page.screenshot({
path: 'google.png',
fullPage: true
});
await browser.close();
})();
Tried
I tried installing chromium with 'snap install chromium', but it says:
error: cannot communicate with server: Post http://localhost/v2/snaps/chromium: dial unix /run/snapd.socket: connect: no such file or directory
Tried installing snapd
!systemctl status snapd.service # snapd: unrecognized service
!sudo apt update && upgrade
!sudo apt install snapd
!which snapd # error: cannot communicate with server: Post http://localhost/v2/snaps/chromium: dial unix /run/snapd.socket: connect: no such file or directory
Tried brave-browser
installation code: https://brave.com/linux/
executablePath: "/opt/brave.com/brave"
error: (node:42514) UnhandledPromiseRejectionWarning: Error: Failed to launch the browser process! spawn /opt/brave.com/brave EACCES
chmod +x /opt/brave.com/brave # did not solve
Tried google-chrome: Worked while creating this question
installation code: https://brave.com/linux/
executablePath: "/opt/google/chrome/chrome"
references:
https://colab.research.google.com/drive/168X6Zo0Yk2fzEJ7WDfY9Q_0UOEmHSrZc?usp=sharing#scrollTo=_Yf4OfPBAAPR
https://forum.snapcraft.io/t/snap-d-error-cannot-communicate-with-server-connection-refused/6093/23
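The attempts in this question come down to finding a browser binary that exists and can be executed on the machine. Below is a minimal sketch of that check using only Node's fs module. `firstUsableExecutable` is a hypothetical helper, and the candidate list is simply the paths mentioned in the question. A directory passes an execute-permission check, so the sketch also requires a regular file; that may be why spawning /opt/brave.com/brave returned EACCES, although this is an assumption. The check cannot detect a wrapper such as the snap stub at /usr/bin/chromium-browser, which exists but refuses to run, so that path is tried last.

```javascript
const fs = require('fs');

// Return the first path that is executable and is a regular file, or null.
function firstUsableExecutable(candidates) {
  for (const candidate of candidates) {
    try {
      fs.accessSync(candidate, fs.constants.X_OK);
      if (fs.statSync(candidate).isFile()) return candidate;
    } catch {
      // Missing or not executable: try the next candidate.
    }
  }
  return null;
}

const executablePath = firstUsableExecutable([
  '/opt/google/chrome/chrome',
  '/opt/brave.com/brave',
  '/usr/bin/chromium-browser',
]);
console.log('usable browser binary:', executablePath);
// When the result is not null, pass it on: puppeteer.launch({ executablePath })
```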
To drive Chromium from Python on Google Colab, you can install pyppeteer, a Python port of Puppeteer, and launch Chromium using the following code:
!pip install pyppeteer
Now the Python code:
import asyncio
from pyppeteer import launch
async def main():
    # Colab has no display, so run headless; --no-sandbox lets Chromium start as root.
    browser = await launch(headless=True, args=['--no-sandbox'])
    page = await browser.newPage()
    await page.goto('https://www.example.com')
    await browser.close()
# In a notebook cell an event loop is already running; use `await main()` there instead.
asyncio.get_event_loop().run_until_complete(main())
Note that the headless option is set to True because Colab has no display on which to show a browser window. The --no-sandbox argument lets Chromium start as root in a containerized environment, which is necessary when running on Google Colab.
With this code, you can launch a Chromium browser, create a new page, navigate to a URL, and close the browser, all using pyppeteer. From here, you can use its Puppeteer-style API to automate tasks or extract data from web pages.

How to find which chrome executable is getting used by Puppeteer?

// executablePath is specified
const browser = await puppeteer.launch({
executablePath: '/path/to/chrome'
});
// executablePath is not specified
const browser = await puppeteer.launch();
// This will not work.
// console.log('executablePath is', browser.executablePath)
If we do not specify a value for the executablePath option, Puppeteer will try to find the default installation of Chrome or Chromium on the system. On Windows, this is usually C:\Program Files (x86)\Google\Chrome\Application\chrome.exe. On macOS and Linux, Puppeteer will try to use the chrome or chromium executable in the PATH.
How can we find out, from within the Puppeteer script itself, which executable Puppeteer is using?
https://pptr.dev/api/puppeteer.puppeteernode.executablepath
const puppeteer = require('puppeteer')
console.log(puppeteer.executablePath())
Example output on my node-18 docker image:
root@021100c40ec4:/usr/src/app# node -e "console.log(require('puppeteer').executablePath())"
/root/.cache/puppeteer/chrome/linux-1069273/chrome-linux/chrome
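`puppeteer.executablePath()` reports the path Puppeteer would use by default. For a browser that has already been launched, a minimal sketch is to read the path from the child process instead: `browser.process()` returns the spawned Node ChildProcess (or null when Puppeteer is attached to an existing browser), and a ChildProcess exposes a `spawnfile` property. `launchedExecutable` is a hypothetical helper name.

```javascript
// Hypothetical helper: report which binary a launched browser was spawned from.
function launchedExecutable(browser) {
  const child = browser.process(); // null when attached via puppeteer.connect()
  return child ? child.spawnfile : null;
}

// Usage inside an async function, assuming puppeteer is installed:
// const browser = await require('puppeteer').launch();
// console.log('executablePath is', launchedExecutable(browser));
```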

puppeteer-cluster: --no-sandbox option is not working on launch

This is my puppeteer-cluster config:
const cluster = await Cluster.launch({
concurrency: Cluster.CONCURRENCY_CONTEXT,
workerCreationDelay: 2000,
puppeteerOptions:{args: ['--no-sandbox', '--disable-setuid-sandbox']},
maxConcurrency: numCPUs,
});
When I try to run it on my host, it fails with this error:
Error: Failed to launch the browser process!
[1014/132057.583562:ERROR:zygote_host_impl_linux.cc(90)] Running as
root without --no-sandbox is not supported. See
https://crbug.com/638180.
But according to the puppeteer-cluster documentation, Puppeteer launch options can be passed in puppeteerOptions.
Why is passing the options not working?

Launching Chrome with Puppeteer (not Chromium)

I tried to launch Chrome with puppeteer, but it gave me this error:
Error: Failed to launch the browser process! spawn //C://Program Files (x86)//Google//Chrome//Application ENOENT
This is the code I used
const puppeteer = require('puppeteer')
const browser = await puppeteer.launch( { headless: false,
executablePath: '//C://Program Files (x86)//Google//Chrome//Application' })
So how can I launch chrome with puppeteer?
The path you gave is not valid in this format. If you are on Windows (which I assume from the path you gave): (1) use double backslashes \\, (2) don't start the path with slashes or backslashes, and (3) end the path with the exact executable file: chrome.exe.
The process goes like this: look up the exact executable path on Chrome's chrome://version/ page, then escape each backslash with another backslash.
Correct usage:
C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe
I'd add that you may want the chrome-launcher package, which takes care of starting the Chrome browser.
You can then use puppeteer.connect() from the puppeteer-core library to connect to the opened browser and instrument it.
This is what worked for me on Windows:
const browser = await puppeteer.launch({
headless: false,
executablePath: 'C:/Program Files/Google/Chrome/Application/chrome.exe',
})
Note the forward slashes in the path.
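The working forms in these answers point at the same kind of path: a string literal with doubled backslashes, or one written with forward slashes. A small sketch using Node's built-in `path.win32` helpers shows the equivalence on any platform. The path is the one from the first answer, and `String.raw` is an extra option, not something the answers mention.

```javascript
const path = require('path');

// Backslashes must be doubled inside a normal string literal.
const escaped = 'C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe';
// String.raw keeps backslashes as typed, so a path copied from chrome://version/ can be pasted in.
const raw = String.raw`C:\Program Files (x86)\Google\Chrome\Application\chrome.exe`;
// Forward slashes are accepted on Windows; win32.normalize converts them to backslashes.
const forward = path.win32.normalize('C:/Program Files (x86)/Google/Chrome/Application/chrome.exe');

console.log(escaped === raw, escaped === forward); // prints: true true
console.log(path.win32.basename(escaped)); // prints: chrome.exe
```

The last line is a quick way to confirm that the path ends in the executable itself rather than in the Application folder, which was the ENOENT cause in the question.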

PuppeteerSharp Does Not Work in Service Fabric Stateless Service

I am developing a web crawler that can render JavaScript-heavy websites, so I decided to use PuppeteerSharp, a .NET port of Puppeteer, the popular Node.js API for headless Chrome. I am running Service Fabric's local development cluster on a Windows 10 development machine and have one stateless service in my solution.
I've created a Data folder under the service project's PackageRoot folder and put the .local-chromium folder contents there (including the chrome.exe executable), so it deploys as an independent data package of the service.
I've also placed this XML config line in the ServiceManifest.xml file:
<DataPackage Name="Data" Version="1.0.0" />
So far it looks good, and the headless browser content is copied to the Service Fabric cluster's Data package directory properly.
Then in my stateless service code I try to launch the Chromium executable through PuppeteerSharp as follows:
var browser = await Puppeteer.LaunchAsync(new LaunchOptions
{
Headless = true,
ExecutablePath = _chromiumPath // @$"{context.CodePackageActivationContext.GetDataPackageObject("Data").Path}\.local-chromium\Win64-706915\chrome-win\chrome.exe"
});
using (var page = (await browser.NewPageAsync()))
{
Response renderResponse;
try
{
renderResponse = await page.GoToAsync(webPage.AbsoluteUri, timeout);
if (renderResponse.Status != System.Net.HttpStatusCode.OK)
{
return new RenderResult(RenderStatus.OtherFailure);
}
// other code
}
catch (TimeoutException)
{
return new RenderResult(RenderStatus.Timeouted);
}
At the line using (var page = (await browser.NewPageAsync())) my code (thread) simply hangs without returning. In the Debug console I see many thread exits, but no exception occurs. Earlier I was getting System.IO.FileNotFoundException while fixing other errors related to copying the chromium folder contents, but those errors are gone now, so it seems the code finds the .exe but somehow cannot start PuppeteerSharp's headless mode.
Does that mean that I cannot simply run an external Chromium .exe binary with Service Fabric's native application model? Should I use Docker and Linux containers instead?