How to use Chromium with Puppeteer on Google Colab

Chromium not working on Google Colab
Hi, I want to run Puppeteer on Google Colab.
Running my test code with !node --trace-warnings test.js says:
Command '/usr/bin/chromium-browser' requires the chromium snap to be installed. Please install it with: snap install chromium
I installed Chromium with:
!apt install chromium-chromedriver
!cp /usr/lib/chromium-browser/chromedriver /usr/bin
!apt install chromium-browser # also gave same result
Puppeteer code:
const puppeteer = require('puppeteer');
const dotenv = require('dotenv');
dotenv.config();

(async () => {
  const browser = await puppeteer.launch({ executablePath: '/usr/bin/chromium-browser' });
  const page = await browser.newPage();
  await page.goto('https://www.google.com');
  await page.screenshot({
    path: 'google.png',
    fullPage: true
  });
  await browser.close();
})();
Tried
I tried installing Chromium with snap install chromium; it says:
error: cannot communicate with server: Post http://localhost/v2/snaps/chromium: dial unix /run/snapd.socket: connect: no such file or directory
Tried installing snapd
!systemctl status snapd.service # snapd: unrecognized service
!sudo apt update && upgrade
!sudo apt install snapd
!which snapd # error: cannot communicate with server: Post http://localhost/v2/snaps/chromium: dial unix /run/snapd.socket: connect: no such file or directory
Tried brave-browser
installation code: https://brave.com/linux/
executablePath: "/opt/brave.com/brave"
error: (node:42514) UnhandledPromiseRejectionWarning: Error: Failed to launch the browser process! spawn /opt/brave.com/brave EACCES
chmod +x /opt/brave.com/brave # did not solve
Tried google-chrome: Worked while creating this question
installation code: https://brave.com/linux/
executablePath: "/opt/google/chrome/chrome"
references:
https://colab.research.google.com/drive/168X6Zo0Yk2fzEJ7WDfY9Q_0UOEmHSrZc?usp=sharing#scrollTo=_Yf4OfPBAAPR
https://forum.snapcraft.io/t/snap-d-error-cannot-communicate-with-server-connection-refused/6093/23

An alternative on Google Colab is to drive Chromium from Python with pyppeteer, a Python port of Puppeteer. Install it in a notebook cell:
!pip install pyppeteer
Then the Python code:
import asyncio
from pyppeteer import launch

async def main():
    # Colab has no display, so keep Chromium headless; --no-sandbox is needed
    # because the notebook process runs as root inside a container.
    browser = await launch(headless=True, args=['--no-sandbox'])
    page = await browser.newPage()
    await page.goto('https://www.example.com')
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
Note that Chromium must run headless here because Colab provides no display. The --no-sandbox argument is required because the notebook runs Chromium as root inside a container.
With this code, you can launch a Chromium browser, create a new page, navigate to a URL, and close the browser. From here, you can use the pyppeteer API (which mirrors Puppeteer's) to automate tasks or extract data from web pages.

Related

How to find which chrome executable is getting used by Puppeteer?

// executablePath is specified
const browser = await puppeteer.launch({
  executablePath: '/path/to/chrome'
});

// executablePath is not specified
const browser = await puppeteer.launch();

// This will not work:
// console.log('executablePath is', browser.executablePath)
If we do not specify a value for the executablePath option, Puppeteer will try to find the default installation of Chrome or Chromium on the system. On Windows, this is usually C:\Program Files (x86)\Google\Chrome\Application\chrome.exe. On macOS and Linux, Puppeteer will try to use the chrome or chromium executable in the PATH.
How can we find out which executable is being used by Puppeteer from within the Puppeteer script itself?
https://pptr.dev/api/puppeteer.puppeteernode.executablepath
const puppeteer = require('puppeteer')
console.log(puppeteer.executablePath())
Example output on my node-18 docker image is
root@021100c40ec4:/usr/src/app# node -e "console.log(require('puppeteer').executablePath())"
/root/.cache/puppeteer/chrome/linux-1069273/chrome-linux/chrome
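If you want to check which binary an already-launched browser is actually using (for example, when executablePath comes from configuration), one option is to inspect the spawned child process. This is a sketch rather than an official recipe: Browser.process() returns the underlying Node ChildProcess (or null for a connected browser), and its spawnfile property holds the path it was started with.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  // browser.process() is null when connecting to an existing browser,
  // so guard before reading spawnfile.
  const proc = browser.process();
  console.log('Launched executable:', proc ? proc.spawnfile : 'unknown (connected browser)');
  await browser.close();
})();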

Install multiple VS Code extensions in CI/CD

My unit test launch looks like this. As you can see I have exploited CLI options to install a VSIX my CICD has already produced, and then also tried to install ms-vscode-remote.remote-ssh because I want to re-run the tests on a remote workspace.
import * as path from 'path';
import * as fs from 'fs';
import { runTests } from '@vscode/test-electron';

async function main() {
  try {
    // The folder containing the Extension Manifest package.json
    // Passed to `--extensionDevelopmentPath`
    const extensionDevelopmentPath = path.resolve(__dirname, '../../');

    // The path to the extension test runner script
    // Passed to --extensionTestsPath
    const extensionTestsPath = path.resolve(__dirname, './suite/index');

    const vsixName = fs.readdirSync(extensionDevelopmentPath)
      .filter(p => path.extname(p) === ".vsix")
      .sort((a, b) => a < b ? 1 : a > b ? -1 : 0)[0];

    const launchArgsLocal = [
      path.resolve(__dirname, '../../src/test/test-docs'),
      "--install-extension",
      vsixName,
      "--install-extension",
      "ms-vscode-remote.remote-ssh"
    ];

    const SSH_HOST = process.argv[2];
    const SSH_WORKSPACE = process.argv[3];
    const launchArgsRemote = [
      "--folder-uri",
      `vscode-remote://ssh-remote+testuser@${SSH_HOST}${SSH_WORKSPACE}`
    ];

    // Download VS Code, unzip it and run the integration test
    await runTests({ extensionDevelopmentPath, extensionTestsPath, launchArgs: launchArgsLocal });
    await runTests({ extensionDevelopmentPath, extensionTestsPath, launchArgs: launchArgsRemote });
  } catch (err) {
    console.error(err);
    console.error('Failed to run tests');
    process.exit(1);
  }
}

main();
runTests downloads and installs VS Code, and passes through the parameters I supply. For the local file system all the tests pass, so the extension from the VSIX is definitely installed.
But ms-vscode-remote.remote-ssh doesn't seem to be installed - I get this error:
Cannot get canonical URI because no extension is installed to resolve ssh-remote
and then the tests fail because there's no open workspace.
This may be related to the fact that CLI installation of multiple extensions repeats the --install-extension switch. I suspect the switch name is used as a hash key.
What to do? Well, I'm not committed to any particular course of action, just platform independence. If I knew how to do a platform independent headless CLI installation of VS Code:latest in a GitHub Action, that would certainly do the trick. I could then directly use the CLI to install the extensions before the tests, and pass the installation path. Which would also require a unified way to get the path for vs code.
Update 2022-07-20
Having figured out how to do a platform-independent headless CLI installation of VS Code:latest in a GitHub Action, followed by installation of the required extensions, I face new problems.
The test framework options include a path to an existing installation of VS Code. According to the interface documentation, supplying this should cause the test to use the existing installation instead of installing VS Code; this is why I thought the above installation would solve my problems.
However, the option seems to be ignored.
My latest iteration uses an extension dependency on remote-ssh to install it. There's a new problem: how to get the correct version of my extension onto the remote host. By default the remote host uses the marketplace version, which obviously won't be the version we're trying to test.
I would first try with only one --install-extension option, just to check whether any extension is installed.
I would also check whether the same set of commands works locally (install VS Code and its Remote SSH extension).
Testing it locally (with only one extension) also lets you check whether that extension has any dependencies (like Remote SSH - Editing).
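A workaround I would consider (hedged, since I have not verified it in this exact pipeline): @vscode/test-electron also exports downloadAndUnzipVSCode and resolveCliArgsFromVSCodeExecutablePath, which let you install extensions through the VS Code CLI yourself and then run the tests against that same installation. The extension id below is just the one from the question:
import * as cp from 'child_process';
import {
  downloadAndUnzipVSCode,
  resolveCliArgsFromVSCodeExecutablePath,
  runTests
} from '@vscode/test-electron';

async function runWithPreinstalledExtension(extensionDevelopmentPath: string, extensionTestsPath: string) {
  // Download VS Code once and reuse the same installation for the CLI and the tests.
  const vscodeExecutablePath = await downloadAndUnzipVSCode('stable');
  const [cliPath, ...cliArgs] = resolveCliArgsFromVSCodeExecutablePath(vscodeExecutablePath);

  // Install each extension with its own --install-extension flag via the CLI.
  cp.spawnSync(
    cliPath,
    [...cliArgs, '--install-extension', 'ms-vscode-remote.remote-ssh'],
    { encoding: 'utf-8', stdio: 'inherit' }
  );

  // Point runTests at the installation that was just prepared.
  await runTests({ vscodeExecutablePath, extensionDevelopmentPath, extensionTestsPath });
}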

Where to put the configuration for Puppeteer?

I'm using the plugin gatsby-remark-mermaid, which also involves installing Puppeteer. The Mermaid diagrams render properly on my end; however, the build fails with this error:
Failed to launch the browser process!
/tmp/build/node_modules/puppeteer/.local-chromium/linux-970485/chrome-linux/chrome: error while loading shared libraries: libnss3.so: cannot open shared object file: No such file or directory
I looked into the documentation and since I'm running it on Heroku, I must include this configuration:
puppeteer.launch({ args: ['--no-sandbox'] });
I tried using it in gatsby-browser.js like this, but I only got errors instead.
const puppeteer = require('puppeteer')
const browser = await puppeteer.launch({ args: ['--no-sandbox'] })
Where do I need to put this configuration for it to work?
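Not a definitive answer, but a note on placement: gatsby-remark-mermaid renders the diagrams at build time, so gatsby-browser.js (which runs in the browser at runtime) is not where Puppeteer launch options can take effect. If your version of the plugin exposes a way to forward launch options (check its README; the launchOptions key below is an assumption, not a confirmed API), it would go under the plugin's options in gatsby-config.js, roughly like this:
// gatsby-config.js (sketch; `launchOptions` is an assumed option name — check the
// plugin's README for what your version actually accepts)
module.exports = {
  plugins: [
    {
      resolve: 'gatsby-transformer-remark',
      options: {
        plugins: [
          {
            resolve: 'gatsby-remark-mermaid',
            options: {
              launchOptions: { args: ['--no-sandbox'] }
            }
          }
        ]
      }
    }
  ]
};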

puppeteer-cluster: --no-sandbox option is not working on launch

This is my config for the puppeteer cluster:
const cluster = await Cluster.launch({
  concurrency: Cluster.CONCURRENCY_CONTEXT,
  workerCreationDelay: 2000,
  puppeteerOptions: { args: ['--no-sandbox', '--disable-setuid-sandbox'] },
  maxConcurrency: numCPUs,
});
When I try to run it on my host, it fails with this error:
Error: Failed to launch the browser process!
[1014/132057.583562:ERROR:zygote_host_impl_linux.cc(90)] Running as root without --no-sandbox is not supported. See https://crbug.com/638180.
But according to the puppeteer-cluster documentation, you can pass Puppeteer options in puppeteerOptions.
Why are the options not being applied?
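One way to narrow this down (a debugging sketch, not a confirmed fix): launch plain Puppeteer with exactly the same args outside the cluster. If this also fails, the problem is the Puppeteer/Chromium setup on the host rather than how puppeteer-cluster forwards puppeteerOptions:
const puppeteer = require('puppeteer');

(async () => {
  // Same flags as in puppeteerOptions above; if this still hits the
  // "Running as root without --no-sandbox" error, the args are not the issue.
  const browser = await puppeteer.launch({
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  });
  console.log(await browser.version());
  await browser.close();
})();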

I am trying to download data through the Chrome browser using chromedriver in a Python file

I am getting this error while using chromedriver to download the Google Images. The chromedriver is on the PATH, and the executable is downloaded, but the issue persists.
Operating system is Windows 10; chromedriver is installed through pip too.
Input:
if __name__ == '__main__':
    chrome_driver = 'C:\\Users\\320086442\\AppData\\Local\\Continuum\\anaconda3\\Lib\\site-packages\\selenium\\webdriver\\chrome'

    # download the emotion data
    data_dir = 'C:\\Users\\320086442\\Downloads\\emotion-master\\image_net'
    emotions = {'angry': ['angry', 'furious', 'resentful', 'irate'],
                'disgusted': ['disgusted', 'sour', 'grossed out'],
                'happy': ['happy', 'smiling', 'cheerful', 'elated', 'joyful'],
                'sad': ['sad', 'depressed', 'sorrowful', 'mournful', 'grieving', 'crying'],
                'surprised': ['surprised', 'astonished', 'shocked', 'amazed']}
    download_emotions(emotions, data_dir, chrome_driver)

    # download the pseudo ImageNet data
    imagenet_labels = []
    with open('C:\\Users\\320086442\\Downloads\\emotion-master\\image_net\\imagenet_labels.txt', 'r') as file:
        for line in file:
            imagenet_labels.append(line.strip())
    data_dir = 'C:\\Users\\320086442\\Downloads\\emotion-master'
    imagenet_label_file = 'C:\\Users\\320086442\\Downloads\\emotion-master\\image_net\\imagenet_labels.txt'
    download_fake_imagenet(imagenet_labels, data_dir, chrome_driver)
Output:
C:\Users\320086442\AppData\Local\Continuum\anaconda3\python.exe C:/Users/320086442/Downloads/emotion-master/download_data.py
Downloading images for: angry human face ...
Looks like we cannot locate the path the 'chromedriver' (use the '--chromedriver' argument to specify the path to the executable.) or google chrome browser is not installed on your machine (exception: use options instead of chrome_options)

Process finished with exit code 0