Puppeteer: how to access/intercept a FileSystemDirectoryHandle? - puppeteer

I'm wondering if it's possible within puppeteer to access a FileSystemDirectoryHandle (from the File System Access API). I would like to pass in a directory path via puppeteer as though the user had selected a directory via window.showDirectoryPicker(). On my client page I use the File System Access API to write a series of png files taken from a canvas element, like:
const directoryHandle = await window.showDirectoryPicker();
for (let frame = 0; frame < totalFrames; frame++){
const fileHandle = await directoryHandle.getFileHandle(`${frame}.png`, { create: true });
const writable = await fileHandle.createWritable();
updateCanvas(); // <--- update the contents of my canvas element
const blob = await new Promise((resolve) => canvas.toBlob(resolve, 'image/png'));
await writable.write(blob);
await writable.close();
}
On the puppeteer side, I want to mimic that behavior with something like:
const page = await browser.newPage();
await page.goto("http://localhost:3333/canvasRenderer.html");
// --- this part doesn't seem to exist ---
const [dirChooser] = await Promise.all([
page.waitForDirectoryChooser(),
page.click('#choose-directory'),
]);
await dirChooser.accept(['save/frames/here']);
//--------------------------------------
but waitForDirectoryChooser() doesn't exist.
I'd really appreciate any ideas or insights on how I might accomplish this!
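Puppeteer has page.waitForFileChooser() but no directory equivalent, so one possible workaround is to replace window.showDirectoryPicker() with a shim before the page's scripts run and send each write back to Node. The following is only a sketch under assumptions: `page` is a Puppeteer Page, the client code calls nothing beyond getFileHandle(), createWritable(), write() with a Blob, and close() (as in the snippet above), and the names installDirectoryPickerShim and __saveFrame are made up for illustration and are not Puppeteer APIs.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: `page` is assumed to be a Puppeteer Page.
async function installDirectoryPickerShim(page, outDir) {
  fs.mkdirSync(outDir, { recursive: true });

  // Node-side writer callable from the page; file data arrives base64-encoded.
  await page.exposeFunction('__saveFrame', (name, base64) => {
    const target = path.join(outDir, path.basename(name));
    fs.writeFileSync(target, Buffer.from(base64, 'base64'));
  });

  // Runs in the browser before any page script; replaces the real picker.
  await page.evaluateOnNewDocument(() => {
    const toBase64 = (blob) => new Promise((resolve, reject) => {
      const reader = new FileReader();
      reader.onload = () => resolve(reader.result.split(',')[1]);
      reader.onerror = reject;
      reader.readAsDataURL(blob);
    });

    window.showDirectoryPicker = async () => ({
      getFileHandle: async (name) => ({
        createWritable: async () => {
          const chunks = [];
          return {
            // Only Blob input is handled in this sketch.
            write: async (data) => { chunks.push(data); },
            close: async () => window.__saveFrame(name, await toBase64(new Blob(chunks))),
          };
        },
      }),
    });
  });
}
```

Call installDirectoryPickerShim(page, 'save/frames/here') before page.goto(), then trigger the page's own button with page.click('#choose-directory'). The page code stays unchanged, and the PNG files land in the Node-side directory.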

Related

Solution to stay logged in using javascript/puppeteer

Some intranet sites log me out more often than I'd like. I tried keeping the sessions alive with Python, PowerShell, and JavaScript. Only the JavaScript version worked, but it relied on opening the sites in child/popup windows, which kept pulling focus back to those pages. I then looked at Puppeteer to do this with a headless Edge session.
I am not a developer - just know how to write short scripts
I am open to any Windows-based solution as I am not allowed to use Linux
The following is what I tried with Puppeteer (against MS Edge (chromium)):
const puppeteer = require("puppeteer-core");
const edgePaths = require("edge-paths");
const EDGE_PATH = edgePaths.getEdgePath();
async function loadTabs(){
const browser = await puppeteer.launch({executablePath: EDGE_PATH});
const url1 = await browser.newPage();
const url2 = await browser.newPage();
await url1.goto("https://url1");
await url2.goto("https://url2");
await delay(3000);
await browser.close();
}
function delay(time){
return new Promise(function(resolve){
setTimeout(resolve, time)
});
}
loadTabs();
The script is cobbled together from what I found about Puppeteer and JavaScript but I make no claim to this being even set up correctly.
Thanks!
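The script above opens both pages, waits three seconds, and closes the browser, so nothing stays open to keep a session active. A minimal sketch of one alternative is to leave the pages open and reload them on a timer. It assumes `pages` are Puppeteer Page objects, that a reload counts as activity for the sites in question, and that the automated browser holds the login you care about (none of which the question confirms). keepSessionsAlive and its parameters are hypothetical names.

```javascript
// Small promise-based sleep helper.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: reloads every page once per round, then sleeps.
// `pages` is assumed to be an array of Puppeteer Page objects.
async function keepSessionsAlive(pages, intervalMs, rounds) {
  for (let round = 0; round < rounds; round++) {
    for (const page of pages) {
      await page.reload({ waitUntil: 'domcontentloaded' });
    }
    await delay(intervalMs);
  }
}
```

In loadTabs() this would replace the `await delay(3000)` line, for example `await keepSessionsAlive([url1, url2], 5 * 60 * 1000, 100)` to reload both pages every five minutes for 100 rounds before closing the browser.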

create new tab in puppeteer inside a loop cause Navigation timeout

I have recently been learning Puppeteer from the docs and trying to scrape some information.
First approach
First I collect a list of URLs from the main page. Then I create a new tab, visit each URL in turn, and collect some data. I suspect that once the loop starts, the new tab doesn't behave as I expect: it freezes without returning any data, and eventually I get the error TimeoutError: Navigation timeout of 30000 ms exceeded. Is there a better approach?
(async () => {
const browser = await puppeteer.launch({ headless: true });
const mainpage = await browser.newPage();
console.log('goto main page'.green);
await mainpage.goto(mainURL);
console.log('collecting some url'.green);
const URLS = await mainpage.evaluate(() =>
Array.from(
document.querySelectorAll('.result-actions a'),
(element) => element.href
)
);
if (typeof URLS[0] === 'string') console.log('OK'.green);
console.log('collecting finished'.green);
const newTab= await browser.newPage();
console.log('create new tab'.green);
var data = [];
for (let i = 0, n = URLS.length; i < n; i++) {
//console.log(URLS[i]);
// use this new tab to collect some data then close this tab
// continue this process
await newTab.waitForNavigation();
await newTab.goto(URLS[i]);
await newTab.waitForSelector('.profile-phone-column span a');
console.log('Go each url using new tab'.green);
// collecting data
data.push(collected_data);
// close this tab
await collectNamePage.close();
console.log(data);
}
await mainpage.close();
await browser.close();
console.log('closing browser'.green);
})();
Second approach
This time I wanted to skip collecting the data in a new tab. I collected the URLs using page.$$(), iterated over them with for...of, and tried to collect the data using elementHandle.$(selector), but this approach also failed.
I am getting frustrated. Am I doing it the wrong way, or have I misunderstood the documentation?
In your script, you do not need newTab.waitForNavigation() at all. It is normally used when the navigation is triggered by some event, such as a click. When you call .goto(), Puppeteer already waits for the page to load.
Even when you do need waitForNavigation(), you should not await it before the navigation has been triggered, otherwise it just times out. Instead, await it together with the action that triggers the navigation:
await Promise.all([element.click(), page.waitForNavigation()]);
So try simply deleting await newTab.waitForNavigation();.
Also, do not close the new tab inside the loop; close it after the loop.
Edited script:
const puppeteer = require('puppeteer');
const mainURL = 'https://www.psychologytoday.com/us/therapists/illinois/';
(async () => {
const browser = await puppeteer.launch({ headless: false });
const mainpage = await browser.newPage();
console.log('goto main page');
await mainpage.goto(mainURL);
console.log('collecting urls');
const URLS = await mainpage.evaluate(() =>
Array.from(
document.querySelectorAll('.result-actions a'),
(element) => element.href
)
);
if (typeof URLS[0] === 'string') console.log('OK');
console.log('collection finished');
const collectNamePage = await browser.newPage();
console.log('create new tab');
var data = [];
for (let i = 0, totalUrls = URLS.length; i < totalUrls; i++) {
console.log(URLS[i]);
await collectNamePage.goto(URLS[i]);
await collectNamePage.waitForSelector('.profile-phone-column span a');
console.log('create new tab and go there');
// collecting data
const [name, phone] = await collectNamePage.evaluate(
() => [
document.querySelector('.profile-middle .name-title-column h1').innerText,
document.querySelector('.profile-phone-column span a').innerText
]
);
data.push({ name, phone });
}
console.log(data);
await collectNamePage.close();
await mainpage.close();
await browser.close();
console.log('closing browser');
})();
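If a single slow profile page should not abort the whole run, the loop can catch the timeout for each URL and carry on. This is only a sketch: `page` is assumed to be a Puppeteer Page, `extract` is a caller-supplied function that reads data from the loaded page, and collectFromUrls is a hypothetical name.

```javascript
// Hypothetical helper: visits each URL with one reused tab and records
// failures (for example navigation timeouts) instead of stopping the loop.
async function collectFromUrls(page, urls, extract) {
  const results = [];
  const failures = [];
  for (const url of urls) {
    try {
      await page.goto(url);
      results.push(await extract(page));
    } catch (error) {
      failures.push({ url, message: error.message });
    }
  }
  return { results, failures };
}
```

With the edited script this could be called as collectFromUrls(collectNamePage, URLS, extract), where extract wraps the waitForSelector() and evaluate() calls from the loop body.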

how to create folder and upload images in ipfs

https://proto.school/#/mutable-file-system
I have gone through this link but don't know how to do the same thing in Node.js.
I have added the word 'hello' to the IPFS network and it works fine, and I have also uploaded an image to IPFS. What I want to know is how to create a folder in the IPFS network and upload images into that folder.
So my problem is how to create the folder and upload a picture into it.
Here is my code:
const addFile = async () => {
//const Added = await ipfs.add('hello');
const fsReadImgData = fs.readFileSync('image1.jpg');
var ipfsSave = await ipfs.add({
path: 'image1.jpg',
content: fsReadImgData
});
return fsReadImgData;
}
const fileHash = await addFile();
First, read the file into a buffer (or replace this with however you are getting the image data):
const imgdata = fs.readFileSync('/yourfile.jpg');
Regular IPFS files method (immutable, you do not expect to update these):
let added = await ipfs.add({
path: 'images/yourfile.jpg',
content: imgdata
}, { wrapWithDirectory: true })
Mutable filesystem method (you expect to update and change the files):
await ipfs.files.mkdir('/images')
await ipfs.files.write(
'/images/yourfile.jpg',
imgdata,
{create: true})
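To put several images into one folder with the regular (immutable) method, every file in a local directory can be given a path under the same folder name. The sketch below assumes an `ipfs` client that exposes addAll() returning an async iterable (as in js-ipfs and ipfs-http-client releases that support it). addFolder and its parameters are hypothetical names.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: adds every regular file in `localDir` to IPFS
// under `folderName/` and returns whatever the client reports per entry.
async function addFolder(ipfs, localDir, folderName) {
  const entries = fs.readdirSync(localDir)
    .sort()
    .filter((name) => fs.statSync(path.join(localDir, name)).isFile())
    .map((name) => ({
      path: `${folderName}/${name}`,
      content: fs.readFileSync(path.join(localDir, name)),
    }));

  const added = [];
  for await (const result of ipfs.addAll(entries, { wrapWithDirectory: true })) {
    added.push(result);
  }
  return added;
}
```

The returned list should include entries for the files as well as for the directories that contain them.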

Can a har file be programmatically generated from headless chrome using Puppeteer?

I would like to control a headless chrome instance using puppeteer, taking snapshots and clicking on various page elements, while capturing a har file. Is this possible? I have looked at the API but haven't found anything useful.
There is no HAR generator helper in Puppeteer, but you can use chrome-har to generate a HAR file from the DevTools protocol events.
const fs = require('fs');
const { promisify } = require('util');
const puppeteer = require('puppeteer');
const { harFromMessages } = require('chrome-har');
// list of events for converting to HAR
const events = [];
// event types to observe
const observe = [
'Page.loadEventFired',
'Page.domContentEventFired',
'Page.frameStartedLoading',
'Page.frameAttached',
'Network.requestWillBeSent',
'Network.requestServedFromCache',
'Network.dataReceived',
'Network.responseReceived',
'Network.resourceChangedPriority',
'Network.loadingFinished',
'Network.loadingFailed',
];
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
// register events listeners
const client = await page.target().createCDPSession();
await client.send('Page.enable');
await client.send('Network.enable');
observe.forEach(method => {
client.on(method, params => {
events.push({ method, params });
});
});
// perform tests
await page.goto('https://en.wikipedia.org');
// start waiting before clicking so the navigation cannot be missed
await Promise.all([
page.waitForNavigation({ waitUntil: 'networkidle2' }),
page.click('#n-help > a'),
]);
await browser.close();
// convert events to HAR file
const har = harFromMessages(events);
await promisify(fs.writeFile)('en.wikipedia.org.har', JSON.stringify(har));
})();
Here you can find an article about this solution.
The solution proposed by @Everettss is the only option so far, but the result is not as good as a HAR saved from the browser. I generated a HAR for the google.com page both ways. The one generated by puppeteer-har (which uses chrome-har) contains too few requests, has no metrics for the main document, and shows strangely different timings.
Puppeteer is not a perfect option for HAR files, so I suggest using https://github.com/cyrus-and/chrome-har-capturer instead.
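To compare HAR files produced by different tools, it can help to reduce each one to a few numbers first. The sketch below relies only on the standard HAR 1.2 layout (log.entries, each with response.status and time). summarizeHar is a hypothetical helper name.

```javascript
// Hypothetical helper: counts requests, sums entry times, and groups
// entries by HTTP status for a parsed HAR object.
function summarizeHar(har) {
  const entries = har.log.entries;
  const byStatus = {};
  let totalTimeMs = 0;
  for (const entry of entries) {
    const status = entry.response.status;
    byStatus[status] = (byStatus[status] || 0) + 1;
    totalTimeMs += entry.time;
  }
  return { requestCount: entries.length, totalTimeMs, byStatus };
}
```

Running it on the file written above, for example summarizeHar(JSON.parse(fs.readFileSync('en.wikipedia.org.har', 'utf8'))), and on a HAR exported from the browser's DevTools makes a difference in request counts easy to spot.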

Puppeteer - How to fill form that is inside an iframe?

I have to fill out a form that is inside an iframe (see the sample page). I cannot reach the fields by simply using page.focus() and page.type(). I tried getting the form's iframe with const formFrame = page.mainFrame().childFrames()[0], which returns a frame, but I cannot really interact with the form inside it.
I figured it out myself. Here's the code.
console.log('waiting for iframe with form to be ready.');
await page.waitForSelector('iframe');
console.log('iframe is ready. Loading iframe content');
const elementHandle = await page.$(
'iframe[src="https://example.com"]',
);
const frame = await elementHandle.contentFrame();
console.log('filling form in iframe');
await frame.type('#Name', 'Bob', { delay: 100 });
Instead of figuring out how to get inside the iframe and type, I would simplify the problem by navigating to the iframe URL directly:
https://warranty.goodmanmfg.com/registration/NewRegistration/NewRegistration.aspx?Sender=Goodman
Make your script go directly to the URL above and automate the form there; it should work.
Edit-1: Using frames
Since the simple approach didn't work for you, here is how to do it with frames.
Below is a simple script that should help you get started:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto('http://www.goodmanmfg.com/product-registration', { timeout: 80000 });
var frames = await page.frames();
var myframe = frames.find(
f =>
f.url().indexOf("NewRegistration") > -1);
const serialNumber = await myframe.$("#MainContent_SerNumText");
await serialNumber.type("12345");
await page.screenshot({ path: 'example.png' });
await browser.close();
})();
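In the script above, frames.find() returns undefined if the iframe has not been attached or navigated yet when page.frames() is read. A small polling helper is one way to guard against that. This is a sketch that assumes `page` is a Puppeteer Page; findFrameByUrl and its parameters are hypothetical names.

```javascript
// Small promise-based sleep helper.
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: polls page.frames() until a frame whose URL
// contains `urlFragment` appears, or throws once `timeoutMs` has passed.
async function findFrameByUrl(page, urlFragment, timeoutMs = 10000, pollMs = 100) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const frame = page.frames().find((f) => f.url().includes(urlFragment));
    if (frame) return frame;
    if (Date.now() >= deadline) {
      throw new Error(`No frame with URL containing "${urlFragment}" after ${timeoutMs} ms`);
    }
    await delay(pollMs);
  }
}
```

In the script this would replace the frames.find(...) lines: const myframe = await findFrameByUrl(page, 'NewRegistration');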
If you can't select or find the iframe, read this:
I had an issue with finding Stripe elements.
The reason is the following:
You can't access an <iframe> with a different origin using JavaScript; it would be a huge security flaw if you could. Because of the same-origin policy, browsers block scripts that try to access a frame with a different origin. See the more detailed answer here.
So when I used Puppeteer's Page.frames(), Page.mainFrame(), and ElementHandle.contentFrame(), none of them returned the iframe. The problem was that this failed silently, so I couldn't figure out why nothing was found.
Adding these arguments to the launch options solved the issue:
'--disable-web-security',
'--disable-features=IsolateOrigins,site-per-process'
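As a sketch of where these flags go: Puppeteer's launch() accepts an `args` array of extra browser command-line switches. buildLaunchOptions is a hypothetical helper name. Note that both flags weaken browser security, so they belong in test automation only.

```javascript
// Hypothetical helper: returns launch options that include the two flags
// quoted above, plus any extra switches the caller passes in.
function buildLaunchOptions(extraArgs = []) {
  return {
    args: [
      '--disable-web-security',
      '--disable-features=IsolateOrigins,site-per-process',
      ...extraArgs,
    ],
  };
}

// Usage (requires puppeteer to be installed):
// const browser = await require('puppeteer').launch(buildLaunchOptions());
```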
You have already figured it out, but I think I have a better solution. Hope it helps.
async doFillForm() {
return await this.page.evaluate(() => {
let iframe = document.getElementById('frame_id_where_form_is_present');
let doc = iframe.contentDocument;
doc.querySelector('#username').value='Bob';
doc.querySelector('#password').value='pass123';
});
}
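Two caveats apply to the approach above: contentDocument is null for a cross-origin iframe, and assigning .value directly does not fire the input or change events that many forms listen for. The sketch below handles both. fillIframeField is a hypothetical name, and the function is written to be self-contained so it can be passed to page.evaluate(), which serializes it and runs it in the browser.

```javascript
// Hypothetical helper meant to run in the page via page.evaluate().
// It uses only browser globals (document, Event) and its own arguments.
function fillIframeField(frameId, selector, value) {
  const iframe = document.getElementById(frameId);
  const doc = iframe && iframe.contentDocument;
  if (!doc) {
    throw new Error(`Cannot reach the document of iframe "${frameId}" (missing or cross-origin)`);
  }
  const field = doc.querySelector(selector);
  if (!field) {
    throw new Error(`No element matches "${selector}" inside iframe "${frameId}"`);
  }
  field.value = value;
  // Notify listeners that would normally react to typing.
  field.dispatchEvent(new Event('input', { bubbles: true }));
  field.dispatchEvent(new Event('change', { bubbles: true }));
  return field.value;
}

// Usage from Node, assuming `page` is a Puppeteer Page:
// await page.evaluate(fillIframeField, 'frame_id', '#username', 'Bob');
```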