Puppeteer how to get all image response from website with lazy loading - puppeteer

It executes concurrently while the page is loading
page.on('response', async response => {});
I use this to scroll the page, but it does it after the page has finished loading
async function autoScroll(page){
await page.evaluate(async () => {
await new Promise((resolve, reject) => {
var totalHeight = 0;
var distance = 500;
var timer = setInterval(() => {
var scrollHeight = document.body.scrollHeight;
window.scrollBy(0, distance);
totalHeight += distance;
if(totalHeight >= scrollHeight){
clearInterval(timer);
resolve();
}
}, 100);
});
});
}
Lazy loading demo: https://davidwalsh.name/demo/lazyload-2.0.php
Demo
Does anyone have a way?

Related

How can I fix camera rotation on mobile?

How can I fix the camera rotation on mobile? I've tried to handle the rotation event and override the handle gesture by referring to this link: https://www.keanw.com/2017/04/fixing-pinch-zoom-in-forge-viewer-applications.html
I've fixed the rotation but I can't pinch-to-zoom to the position I touched.
Please review this article. Here is the code snippet:
let viewer = null;
function onGestureEnd() {
var hitTest = viewer.clientToWorld(window.innerWidth/2, window.innerHeight/2, true);
viewer.navigation.setPivotPoint(hitTest.point);
}
function loadModel(urn) {
const options = {
env: 'AutodeskProduction',
accessToken: _adsk.token.access_token,
};
Autodesk.Viewing.Initializer(options, () => {
const div = document.getElementById('forgeViewer');
viewer = new Autodesk.Viewing.Private.GuiViewer3D(div);
viewer.start();
Autodesk.Viewing.Document.load(`urn:${urn}`, (doc) => {
var viewables = doc.getRoot().getDefaultGeometry();
viewer.loadDocumentNode(doc, viewables).then( onLoadFinished );
});
});
function onLoadFinished() {
document.addEventListener('touchend', onGestureEnd);
}
}

create new tab in puppeteer inside a loop cause Navigation timeout

Recently I am learning puppeteer using their docs and try to scrape some information.
First approach
First I collect a list of url from the mainpage. Second I create a new tab and go those url iterately and collect some data. I doubt when I enter the loop the new tab didn't work as I expect and freezed without giving any data. Eventually I got a error TimeoutError: Navigation timeout of 30000 ms exceeded. Is there any better approach?
(async () => {
const browser = await puppeteer.launch({ headless: true });
const mainpage = await browser.newPage();
console.log('goto main page'.green);
await mainpage.goto(mainURL);
console.log('collecting some url'.green);
const URLS = await mainpage.evaluate(() =>
Array.from(
document.querySelectorAll('.result-actions a'),
(element) => element.href
)
);
if (typeof URLS[0] === 'string') console.log('OK'.green);
console.log('collecting finished'.green);
const newTab= await browser.newPage();
console.log('create new tab'.green);
var data = [];
for (let i = 0, n = URLS.length; i < n; i++) {
//console.log(URLS[i]);
// use this new tab to collect some data then close this tab
// continue this process
await newTab.waitForNavigation();
await newTab.goto(URLS[i]);
await newTab.waitForSelector('.profile-phone-column span a');
console.log('Go each url using new tab'.green);
// collecting data
data.push(collected_data);
// close this tab
await collectNamePage.close();
console.log(data);
}
await mainpage.close();
await browser.close();
console.log('closing browser'.green);
})();
Second approach
This time I want to skip the part where I collect those data using a new tab. Hence I collect my urls using page.$$() and try to iterating using for...of over urls and collect my data using elementHandle.$(selector) but this approach also failed.
I am getting frustrated. Am I doing it wrong way or I didn't understand their documentation?
In your script, you do not need newTab.waitForNavigation(); at all. Usually, this is used when the navigation is caused by some event. When you just use .goto(), the page loading is waited automatically.
Even if you need waitForNavigation(), you usually do not await it before the navigation triggered, otherwise you just get the timeout. You await it with navigation trigger together:
await Promise.all([element.click(), page.waitForNavigation()]);
So try to just delete await newTab.waitForNavigation();.
Also, do not close the new tab in the loop, delete it after the loop.
Edited script:
const puppeteer = require('puppeteer');
const mainURL = 'https://www.psychologytoday.com/us/therapists/illinois/';
(async () => {
const browser = await puppeteer.launch({ headless: false });
const mainpage = await browser.newPage();
console.log('goto main page');
await mainpage.goto(mainURL);
console.log('collecting urls');
const URLS = await mainpage.evaluate(() =>
Array.from(
document.querySelectorAll('.result-actions a'),
(element) => element.href
)
);
if (typeof URLS[0] === 'string') console.log('OK');
console.log('collection finished');
const collectNamePage = await browser.newPage();
console.log('create new tab');
var data = [];
for (let i = 0, totalUrls = URLS.length; i < totalUrls; i++) {
console.log(URLS[i]);
await collectNamePage.goto(URLS[i]);
await collectNamePage.waitForSelector('.profile-phone-column span a');
console.log('create new tab and go there');
// collecting data
const [name, phone] = await collectNamePage.evaluate(
() => [
document.querySelector('.profile-middle .name-title-column h1').innerText,
document.querySelector('.profile-phone-column span a').innerText
]
);
data.push({ name, phone });
}
console.log(data);
await collectNamePage.close();
await mainpage.close();
await browser.close();
console.log('closing browser');
})();

Firebase Functions Multipart/formdata Error

My webpage provides functionallity to convert pdf to image.
For Webpage i am using Firebase Hosting and for functions obvs Functions.
But after file upload function logs error in firebase dashboard Boundary not found
Below is the code i used to upload file in html:
function uploadFile() {
var file = document.getElementById("file_input").files[0];
var pass = document.getElementById("pass").value;
console.log(file + pass);
var formdata = new FormData();
formdata.append("file", file);
formdata.append("password", pass);
var ajax = new XMLHttpRequest();
ajax.upload.addEventListener("progress", progressHandler, false);
ajax.addEventListener("load", completeHandler, false);
ajax.addEventListener("error", errorHandler, false);
ajax.addEventListener("abort", abortHandler, false);
ajax.open("POST", "/upload");
ajax.setRequestHeader("Content-Type", "multipart/form-data");
ajax.send(formdata);
}
and this is the code of functions:
var functions = require('firebase-functions');
var process;
var Busboy;
var path = require('path');
var os = require('os');
var fs = require('fs');
exports.upload = functions.https.onRequest((req, res) => {
const busboy = new Busboy({ headers: req.headers });
const fields = {};
const tmpdir = os.tmpdir();
const uploads = {};
const fileWrites = [];
var pass = '';
busboy.on('file', (fieldname, file, filename) => {
console.log(`Processed file ${filename}`);
const filepath = path.join(tmpdir, filename);
uploads[fieldname] = filepath;
const writeStream = fs.createWriteStream(filepath);
file.pipe(writeStream);
const promise = new Promise((resolve, reject) => {
file.on('end', () => {
writeStream.end();
});
writeStream.on('finish', resolve);
writeStream.on('error', reject);
});
fileWrites.push(promise);
});
busboy.on('field', function (fieldname, val, fieldnameTruncated, valTruncated, encoding, mimetype) {
pass = val;
});
busboy.on('finish', function () {
console.log('Done parsing form!');
console.log(pass);
console.log(uploads);
process.processCard(uploads['file'], pass, 2).then((s) => {
res.end(`
<!DOCTYPE html>
<html>
<body>
ImageConverted!!
<img src="data:image/jpeg;base64,${s}" width="90%"></img>
</body>
</html>
`);
}).catch((err) => { res.end('Error: ' + err) });
});
busboy.end(req.body);
});
What am i doing wrong ?
For multipart body it is recommended to use req.rawBody instead of req.body
https://stackoverflow.com/a/48289899/6003934

How to free up memory when saving images inside IndexDB

I have a no of images on page and trying to save it inside IndexDb if it does not exist.
All seems to be working fine and images load up instantly if it exist but looks like browser memory is leaking. It's give some jerk and hang sometime. I m not sure how this can be handle, I have written a directive that looks like this
(function () {
'use strict';
// TODO: replace app with your module name
angular.module('app').directive('imageLocal', imageLocal);
imageLocal.$inject = ['$timeout', '$window', 'config', 'indexDb'];
function imageLocal($timeout, $window, config, indexDb) {
// Usage:
//
// Creates:
//
var directive = {
link: link,
restrict: 'A'
};
return directive;
function link(scope, element, attrs) {
var imageId = attrs.imageLocal;
// Open a transaction to the database
var transaction;
$timeout(function () {
transaction = indexDb.db.transaction(["mystore"], "readwrite");
getImage();
}, 500);
function getImage() {
transaction.objectStore('mystore').get(imageId)
.onsuccess = function (event) {
var imgFile = event.target.result;
if (imgFile == undefined) {
saveToDb(imgFile);
return false;
}
showImage(imgFile);
}
}
function showImage(imgFile) {
console.log('getting');
// Get window.URL object
var url = $window.URL || $window.webkitURL;
// Create and revoke ObjectURL
var imageUrl = url.createObjectURL(imgFile);
element.css({
'background-image': 'url("' + imageUrl + '")',
'background-size': 'cover'
});
}
function saveToDb() {
// Create XHR
var xhr = new XMLHttpRequest(),
blob;
xhr.open("GET", config.remoteServiceName + '/image/' + imageId, true);
// Set the responseType to blob
xhr.responseType = "blob";
xhr.addEventListener("load", function () {
if (xhr.status === 200) {
console.log("Image retrieved");
// Blob as response
blob = xhr.response;
console.log("Blob:" + blob);
// Put the received blob into IndexedDB
putInDb(blob);
}
}, false);
// Send XHR
xhr.send();
function putInDb(blob) {
// Open a transaction to the database
transaction = indexDb.db.transaction(["mystore"], "readwrite");
// Put the blob into the database
var request = transaction.objectStore("mystore").add(blob, imageId);
getImage();
request.onsuccess = function (event) {
console.log('saved');
}
};
}
}
}
})();

Code to take snapshot of a html page? [duplicate]

Google's "Report a Bug" or "Feedback Tool" lets you select an area of your browser window to create a screenshot that is submitted with your feedback about a bug.
Screenshot by Jason Small, posted in a duplicate question.
How are they doing this? Google's JavaScript feedback API is loaded from here and their overview of the feedback module will demonstrate the screenshot capability.
JavaScript can read the DOM and render a fairly accurate representation of that using canvas. I have been working on a script which converts HTML into a canvas image. Decided today to make an implementation of it into sending feedbacks like you described.
The script allows you to create feedback forms which include a screenshot, created on the client's browser, along with the form. The screenshot is based on the DOM and as such may not be 100% accurate to the real representation as it does not make an actual screenshot, but builds the screenshot based on the information available on the page.
It does not require any rendering from the server, as the whole image is created on the client's browser. The HTML2Canvas script itself is still in a very experimental state, as it does not parse nearly as much of the CSS3 attributes I would want it to, nor does it have any support to load CORS images even if a proxy was available.
Still quite limited browser compatibility (not because more couldn't be supported, just haven't had time to make it more cross browser supported).
For more information, have a look at the examples here:
http://hertzen.com/experiments/jsfeedback/
edit
The html2canvas script is now available separately here and some examples here.
edit 2
Another confirmation that Google uses a very similar method (in fact, based on the documentation, the only major difference is their async method of traversing/drawing) can be found in this presentation by Elliott Sprehn from the Google Feedback team:
http://www.elliottsprehn.com/preso/fluentconf/
Your web app can now take a 'native' screenshot of the client's entire desktop using getUserMedia():
Have a look at this example:
https://www.webrtc-experiment.com/Pluginfree-Screen-Sharing/
The client will have to be using chrome (for now) and will need to enable screen capture support under chrome://flags.
PoC
As Niklas mentioned you can use the html2canvas library to take a screenshot using JS in the browser. I will extend his answer in this point by providing an example of taking a screenshot using this library ("Proof of Concept"):
function report() {
let region = document.querySelector("body"); // whole screen
html2canvas(region, {
onrendered: function(canvas) {
let pngUrl = canvas.toDataURL(); // png in dataURL format
let img = document.querySelector(".screen");
img.src = pngUrl;
// here you can allow user to set bug-region
// and send it with 'pngUrl' to server
},
});
}
.container {
margin-top: 10px;
border: solid 1px black;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/0.4.1/html2canvas.min.js"></script>
<div>Screenshot tester</div>
<button onclick="report()">Take screenshot</button>
<div class="container">
<img width="75%" class="screen">
</div>
In report() function in onrendered after getting image as data URI you can show it to the user and allow him to draw "bug region" by mouse and then send a screenshot and region coordinates to the server.
In this example async/await version was made: with nice makeScreenshot() function.
UPDATE
Simple example which allows you to take screenshot, select region, describe bug and send POST request (here jsfiddle) (the main function is report()).
async function report() {
let screenshot = await makeScreenshot(); // png dataUrl
let img = q(".screen");
img.src = screenshot;
let c = q(".bug-container");
c.classList.remove('hide')
let box = await getBox();
c.classList.add('hide');
send(screenshot,box); // sed post request with bug image, region and description
alert('To see POST requset with image go to: chrome console > network tab');
}
// ----- Helper functions
let q = s => document.querySelector(s); // query selector helper
window.report = report; // bind report be visible in fiddle html
async function makeScreenshot(selector="body")
{
return new Promise((resolve, reject) => {
let node = document.querySelector(selector);
html2canvas(node, { onrendered: (canvas) => {
let pngUrl = canvas.toDataURL();
resolve(pngUrl);
}});
});
}
async function getBox(box) {
return new Promise((resolve, reject) => {
let b = q(".bug");
let r = q(".region");
let scr = q(".screen");
let send = q(".send");
let start=0;
let sx,sy,ex,ey=-1;
r.style.width=0;
r.style.height=0;
let drawBox= () => {
r.style.left = (ex > 0 ? sx : sx+ex ) +'px';
r.style.top = (ey > 0 ? sy : sy+ey) +'px';
r.style.width = Math.abs(ex) +'px';
r.style.height = Math.abs(ey) +'px';
}
//console.log({b,r, scr});
b.addEventListener("click", e=>{
if(start==0) {
sx=e.pageX;
sy=e.pageY;
ex=0;
ey=0;
drawBox();
}
start=(start+1)%3;
});
b.addEventListener("mousemove", e=>{
//console.log(e)
if(start==1) {
ex=e.pageX-sx;
ey=e.pageY-sy
drawBox();
}
});
send.addEventListener("click", e=>{
start=0;
let a=100/75 //zoom out img 75%
resolve({
x:Math.floor(((ex > 0 ? sx : sx+ex )-scr.offsetLeft)*a),
y:Math.floor(((ey > 0 ? sy : sy+ey )-b.offsetTop)*a),
width:Math.floor(Math.abs(ex)*a),
height:Math.floor(Math.abs(ex)*a),
desc: q('.bug-desc').value
});
});
});
}
function send(image,box) {
let formData = new FormData();
let req = new XMLHttpRequest();
formData.append("box", JSON.stringify(box));
formData.append("screenshot", image);
req.open("POST", '/upload/screenshot');
req.send(formData);
}
.bug-container { background: rgb(255,0,0,0.1); margin-top:20px; text-align: center; }
.send { border-radius:5px; padding:10px; background: green; cursor: pointer; }
.region { position: absolute; background: rgba(255,0,0,0.4); }
.example { height: 100px; background: yellow; }
.bug { margin-top: 10px; cursor: crosshair; }
.hide { display: none; }
.screen { pointer-events: none }
<script src="https://cdnjs.cloudflare.com/ajax/libs/html2canvas/0.4.1/html2canvas.min.js"></script>
<body>
<div>Screenshot tester</div>
<button onclick="report()">Report bug</button>
<div class="example">Lorem ipsum</div>
<div class="bug-container hide">
<div>Select bug region: click once - move mouse - click again</div>
<div class="bug">
<img width="75%" class="screen" >
<div class="region"></div>
</div>
<div>
<textarea class="bug-desc">Describe bug here...</textarea>
</div>
<div class="send">SEND BUG</div>
</div>
</body>
Get screenshot as Canvas or Jpeg Blob / ArrayBuffer using getDisplayMedia API:
FIX 1: Use the getUserMedia with chromeMediaSource only for Electron.js
FIX 2: Throw error instead return null object
FIX 3: Fix demo to prevent the error: getDisplayMedia must be called from a user gesture handler
// docs: https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getDisplayMedia
// see: https://www.webrtc-experiment.com/Pluginfree-Screen-Sharing/#20893521368186473
// see: https://github.com/muaz-khan/WebRTC-Experiment/blob/master/Pluginfree-Screen-Sharing/conference.js
function getDisplayMedia(options) {
if (navigator.mediaDevices && navigator.mediaDevices.getDisplayMedia) {
return navigator.mediaDevices.getDisplayMedia(options)
}
if (navigator.getDisplayMedia) {
return navigator.getDisplayMedia(options)
}
if (navigator.webkitGetDisplayMedia) {
return navigator.webkitGetDisplayMedia(options)
}
if (navigator.mozGetDisplayMedia) {
return navigator.mozGetDisplayMedia(options)
}
throw new Error('getDisplayMedia is not defined')
}
function getUserMedia(options) {
if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
return navigator.mediaDevices.getUserMedia(options)
}
if (navigator.getUserMedia) {
return navigator.getUserMedia(options)
}
if (navigator.webkitGetUserMedia) {
return navigator.webkitGetUserMedia(options)
}
if (navigator.mozGetUserMedia) {
return navigator.mozGetUserMedia(options)
}
throw new Error('getUserMedia is not defined')
}
async function takeScreenshotStream() {
// see: https://developer.mozilla.org/en-US/docs/Web/API/Window/screen
const width = screen.width * (window.devicePixelRatio || 1)
const height = screen.height * (window.devicePixelRatio || 1)
const errors = []
let stream
try {
stream = await getDisplayMedia({
audio: false,
// see: https://developer.mozilla.org/en-US/docs/Web/API/MediaStreamConstraints/video
video: {
width,
height,
frameRate: 1,
},
})
} catch (ex) {
errors.push(ex)
}
// for electron js
if (navigator.userAgent.indexOf('Electron') >= 0) {
try {
stream = await getUserMedia({
audio: false,
video: {
mandatory: {
chromeMediaSource: 'desktop',
// chromeMediaSourceId: source.id,
minWidth : width,
maxWidth : width,
minHeight : height,
maxHeight : height,
},
},
})
} catch (ex) {
errors.push(ex)
}
}
if (errors.length) {
console.debug(...errors)
if (!stream) {
throw errors[errors.length - 1]
}
}
return stream
}
async function takeScreenshotCanvas() {
const stream = await takeScreenshotStream()
// from: https://stackoverflow.com/a/57665309/5221762
const video = document.createElement('video')
const result = await new Promise((resolve, reject) => {
video.onloadedmetadata = () => {
video.play()
video.pause()
// from: https://github.com/kasprownik/electron-screencapture/blob/master/index.js
const canvas = document.createElement('canvas')
canvas.width = video.videoWidth
canvas.height = video.videoHeight
const context = canvas.getContext('2d')
// see: https://developer.mozilla.org/en-US/docs/Web/API/HTMLVideoElement
context.drawImage(video, 0, 0, video.videoWidth, video.videoHeight)
resolve(canvas)
}
video.srcObject = stream
})
stream.getTracks().forEach(function (track) {
track.stop()
})
if (result == null) {
throw new Error('Cannot take canvas screenshot')
}
return result
}
// from: https://stackoverflow.com/a/46182044/5221762
function getJpegBlob(canvas) {
return new Promise((resolve, reject) => {
// docs: https://developer.mozilla.org/en-US/docs/Web/API/HTMLCanvasElement/toBlob
canvas.toBlob(blob => resolve(blob), 'image/jpeg', 0.95)
})
}
async function getJpegBytes(canvas) {
const blob = await getJpegBlob(canvas)
return new Promise((resolve, reject) => {
const fileReader = new FileReader()
fileReader.addEventListener('loadend', function () {
if (this.error) {
reject(this.error)
return
}
resolve(this.result)
})
fileReader.readAsArrayBuffer(blob)
})
}
async function takeScreenshotJpegBlob() {
const canvas = await takeScreenshotCanvas()
return getJpegBlob(canvas)
}
async function takeScreenshotJpegBytes() {
const canvas = await takeScreenshotCanvas()
return getJpegBytes(canvas)
}
function blobToCanvas(blob, maxWidth, maxHeight) {
return new Promise((resolve, reject) => {
const img = new Image()
img.onload = function () {
const canvas = document.createElement('canvas')
const scale = Math.min(
1,
maxWidth ? maxWidth / img.width : 1,
maxHeight ? maxHeight / img.height : 1,
)
canvas.width = img.width * scale
canvas.height = img.height * scale
const ctx = canvas.getContext('2d')
ctx.drawImage(img, 0, 0, img.width, img.height, 0, 0, canvas.width, canvas.height)
resolve(canvas)
}
img.onerror = () => {
reject(new Error('Error load blob to Image'))
}
img.src = URL.createObjectURL(blob)
})
}
DEMO:
document.body.onclick = async () => {
// take the screenshot
var screenshotJpegBlob = await takeScreenshotJpegBlob()
// show preview with max size 300 x 300 px
var previewCanvas = await blobToCanvas(screenshotJpegBlob, 300, 300)
previewCanvas.style.position = 'fixed'
document.body.appendChild(previewCanvas)
// send it to the server
var formdata = new FormData()
formdata.append("screenshot", screenshotJpegBlob)
await fetch('https://your-web-site.com/', {
method: 'POST',
body: formdata,
'Content-Type' : "multipart/form-data",
})
}
// and click on the page
Here is a complete screenshot example that works with chrome in 2021. The end result is a blob ready to be transmitted. Flow is: request media > grab frame > draw to canvas > transfer to blob. If you want to do it more memory efficient explore OffscreenCanvas or possibly ImageBitmapRenderingContext
https://jsfiddle.net/v24hyd3q/1/
// Request media
navigator.mediaDevices.getDisplayMedia().then(stream =>
{
// Grab frame from stream
let track = stream.getVideoTracks()[0];
let capture = new ImageCapture(track);
capture.grabFrame().then(bitmap =>
{
// Stop sharing
track.stop();
// Draw the bitmap to canvas
canvas.width = bitmap.width;
canvas.height = bitmap.height;
canvas.getContext('2d').drawImage(bitmap, 0, 0);
// Grab blob from canvas
canvas.toBlob(blob => {
// Do things with blob here
console.log('output blob:', blob);
});
});
})
.catch(e => console.log(e));
Heres an example using: getDisplayMedia
document.body.innerHTML = '<video style="width: 100%; height: 100%; border: 1px black solid;"/>';
navigator.mediaDevices.getDisplayMedia()
.then( mediaStream => {
const video = document.querySelector('video');
video.srcObject = mediaStream;
video.onloadedmetadata = e => {
video.play();
video.pause();
};
})
.catch( err => console.log(`${err.name}: ${err.message}`));
Also worth checking out is the Screen Capture API docs.
You can try my new JS library: screenshot.js.
It's enable to take real screenshot.
You load the script:
<script src="https://raw.githubusercontent.com/amiad/screenshot.js/master/screenshot.js"></script>
and take screenshot:
new Screenshot({success: img => {
// callback function
myimage = img;
}});
You can read more options in project page.