Error related to deviceScaleFactor - puppeteer

So just using https://try-puppeteer.appspot.com/
This code works fine:
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('http://demo.spoonthemes.net/themes/couponis/');
await page.setViewport({width: 1280, height: 978, deviceScaleFactor: 1});
await page.screenshot({path: 'example2.jpg'});
await browser.close();
But if I change deviceScaleFactor to 2 (because I'm on a retina screen) I get this error: Error running your code. Error: Protocol error (Page.captureScreenshot): Target closed.
Any ideas why? It also seems to work if I change the URL to example.com, but not if I try other websites.

This error no longer appears as of Puppeteer v1.5.0.
The website that you were trying to access had maximum-scale=1 in the source code:
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
Therefore, deviceScaleFactor: 2 appears to have been failing because of that constraint.
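For reference, a minimal sketch of the same screenshot with the higher scale factor, which should work on Puppeteer 1.5.0 or later:
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setViewport({ width: 1280, height: 978, deviceScaleFactor: 2 });
await page.goto('http://demo.spoonthemes.net/themes/couponis/');
await page.screenshot({ path: 'example2.jpg' });
await browser.close();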

Related

Chrome doesn't send "if-none-match" header in HTTPS (but sends it in HTTP)

tl;dr: Chrome is not sending the "If-None-Match" header for HTTPS requests but sends it for HTTP requests. Firefox always sends "If-None-Match", both over HTTPS and HTTP.
I was trying to optimize cookies management for my node server when I came across a weird behavior with Chrome. I will try to describe it and compare it with Firefox.
First, here is the HTTP node server I'm using to test this:
#!/usr/bin/env node
'use strict';
const express = require('express');
const cors = require('cors');
const compression = require('compression');
const pathUtils = require('path');
const fs = require('fs');
const http = require('http');
let app = express();
app.disable('x-powered-by');
app.use(function (req, res, next) {
res.set('Cache-control', 'no-cache');
console.log(req.headers);
next();
});
app.use(express.json({ limit: '50mb' }));
app.use(cors());
app.use(compression({}));
let server = http.createServer(app);
app.get('/api/test', (req, res) => {
res.status(200).send(fs.readFileSync(pathUtils.join(__dirname, 'dummy.txt')));
});
server.listen(1234);
And here is the client code:
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Document</title>
</head>
<body>
<script>
let test = fetch('http://localhost:1234/api/test', { mode: 'no-cors' })
.then((res) => {
return res.text();
})
.then((resText) => console.log(resText));
</script>
</body>
</html>
I use the header "no-cache" to force the client to re-validate the response.
If I've understood correctly how the cache works, I'm expecting the client to send the request with an "If-None-Match" header containing the previous response's ETag, and the server to respond with a 304 status code.
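For reference, a minimal sketch of the exchange I'm expecting, made outside the browser with Node's http module (the ETag value is simply whatever the first response returns):
const http = require('http');
// First request: the server replies 200 with an ETag header.
http.get({ host: 'localhost', port: 1234, path: '/api/test' }, (res) => {
  const etag = res.headers.etag;
  res.resume(); // drain the body
  // Second request: echo the ETag back in If-None-Match.
  const opts = { host: 'localhost', port: 1234, path: '/api/test', headers: { 'If-None-Match': etag } };
  http.get(opts, (res2) => {
    console.log(res2.statusCode); // 304 expected if the resource has not changed
    res2.resume();
  });
});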
Here is the result when I refresh the page in Firefox (so at least one response has already been received). I embedded the server log of the request headers.
Here the header "If-None-Match" is sent by the client request, as expected.
Now the same test with Chrome gives:
Well, here Chrome shows a 200 response code, but under the hood it's really a 304 response that is sent by the server, as shown by this Wireshark capture:
As you can see, Chrome sends the "If-None-Match" header with the correct ETag, hence the 304 response.
So now, let's try this with HTTPS. In the server code, I just replaced require('http') with require('https') and passed my SSL keys in the createServer options (as described here).
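For completeness, a rough sketch of that change (the key and certificate file names here are hypothetical; fs and pathUtils are the requires from the server above):
const https = require('https');
const sslOptions = {
  key: fs.readFileSync(pathUtils.join(__dirname, 'localhost-key.pem')),   // hypothetical file name
  cert: fs.readFileSync(pathUtils.join(__dirname, 'localhost-cert.pem')), // hypothetical file name
};
let server = https.createServer(sslOptions, app);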
So first, Firefox behaviour:
I've included the Wireshark capture. As you can see, everything is right: Firefox has the expected behaviour.
Now let's see the same thing with Chrome:
Here is my problem: as you can see, "If-None-Match" is not sent by Chrome. So, as expected, the server returns a 200 response, which can be seen in the Wireshark capture (I refreshed the page twice, that's why there are two exchanges).
Does anyone have an idea why Chrome has this weird behaviour?
I think it happens because you didn't add the certificate for your localhost in your Chrome settings.
Go to the settings and add it :)
[Chrome settings capture]

Unable to locate an element with puppeteer

I'm trying to do a basic search on FB Marketplace with puppeteer (it was working for me before) but it fails recently.
The whole thing fails when it gets to the "location" link on the Marketplace page. To change the location I need to click on it, but puppeteer errors out saying:
Error: Node is either not visible or not an HTMLElement
If I try to get the boundingBox of the element it returns null:
const browser = await puppeteer.launch();
const page = await browser.newPage();
const resp = await page.goto('https://www.facebook.com/marketplace', { waitUntil: 'networkidle2' })
const withinLink = await page.waitForXPath('//span[contains(.,"Within")]', { timeout: 4000 })
console.log(await withinLink.boundingBox()) //returns null
await withinLink.click() //errors out
If I take a screenshot of the page right before I locate the element, it is clearly there, and I am able to locate it in the Chrome console using the same XPath manually.
It just doesn't seem to work in puppeteer.
Something clearly changed on FB. Maybe they started to use some AI technology to detect scraping?
I don't think Facebook has changed its headless browser detection lately, but it seems you haven't taken into account that const withinLink = await page.waitForXPath('//span[contains(.,"Within")]', { timeout: 4000 }) returns an array, even if there is only one element matching contains(.,"Within").
It should work if you add a [0] index to the element handles:
const withinLink = await page.waitForXPath('//span[contains(.,"Within")]')
console.log(await withinLink[0].boundingBox())
await withinLink[0].click()
Note: Timeout is not mandatory in waitForXPath, but I'd suggest using domcontentloaded instead of networkidle2 in page.goto if you don't need all the analytics/tracking events to achieve the desired results; waiting for them just slows down your script execution (see the sketch after these notes).
Note 2: Honestly, I don't have such an element on my FB page; maybe it is market dependent. But it works with any other XPath selector with specific content.
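The goto change from the first note would look roughly like this:
const resp = await page.goto('https://www.facebook.com/marketplace', { waitUntil: 'domcontentloaded' })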

Server or HTML isn't displaying CSS (but works when opening HTML file)

I've been trying to learn how to set up a node.js server for a simple website for the first time and am encountering some strange behavior. When I open my index.html file directly from my computer it opens up perfectly, with all of the CSS working properly. However, when I set up a basic node.js server and access the index.html file through my browser, it only loads the HTML but not the CSS.
I'm extremely new to this so I haven't been able to try much, and the code is so simple that I can't see what's missing (I tried following this tutorial, if that helps). I also found another question on here that seemed similar, but it didn't have an answer and didn't really help. I did check that all the files are UTF-8.
The HTML:
<html>
<head>
<title>My Page</title>
<link rel="stylesheet" href="styles.css" type="text/css">
</head>
<body>
<h1>A headline</h1>
</body>
</html>
And the node.js server:
const http = require("http");
const fs = require("fs");
const server = http.createServer((req, res) => {
res.writeHead(200, {"Content-Type": "text/html"});
const myReadStream = fs.createReadStream(__dirname + "/index.html", "utf8");
myReadStream.pipe(res);
});
server.listen(3000, "127.0.0.1");
console.log("Listening to port 3000");
When I include the CSS directly in index.html within <style> tags it does work, but I've tried putting <link rel="stylesheet" href="styles.css" type="text/css"> between <style> tags and that still doesn't work (it would also be weird if that were necessary, seeing as the page displays perfectly when I simply open the HTML file). I've also tried removing type="text/css" but that didn't seem to change anything. Any help would be much appreciated!
You need to serve styles.css as well. You are serving index.html, but index.html then requests http://127.0.0.1:3000/styles.css, and when that request comes to your app it is STILL served the index.html file. (You can confirm this in the Network pane of the developer tools.)
const server = http.createServer(function (req, res) {
const url = req.url;
if (url === '/styles.css') {
res.writeHead(200, { 'Content-Type': 'text/css' }); // http header
fs.createReadStream(__dirname + "/styles.css", "utf8").pipe(res);
} else {
res.writeHead(200, { 'Content-Type': 'text/html' }); // http header
fs.createReadStream(__dirname + "/index.html", "utf8").pipe(res);
}
})
Note: It is very easy to achieve this using express, probably the most popular nodejs package.
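A minimal sketch with express, assuming index.html and styles.css sit in a public folder next to the server file:
const express = require('express');
const path = require('path');
const app = express();
// express.static serves each file with the correct Content-Type (text/html, text/css, ...).
app.use(express.static(path.join(__dirname, 'public')));
app.listen(3000, '127.0.0.1');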

Puppeteer Element Handle loses context when navigating

What I'm trying to do:
I'm trying to get a screenshot of every element example in my Storybook project. The way I'm trying to do this is by clicking on an element, taking a screenshot, clicking on the next one, taking a screenshot, and so on.
Here is the attached code:
test('no visual regression for button', async () => {
const selector = 'a[href*="?selectedKind=Buttons&selectedStory="]';
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('http://localhost:8080');
let examples = await page.$$(selector);
await examples.map( async(example) => {
await example.click();
const screen = await page.screenshot();
expect(screen).toMatchImageSnapshot();
});
await browser.close();
});
But when I run this code I get the following error:
Protocol error (Runtime.callFunctionOn): Target closed.
at Session._onClosed (../../node_modules/puppeteer/lib/Connection.js:209:23)
at Connection._onClose (../../node_modules/puppeteer/lib/Connection.js:116:15)
at Connection.dispose (../../node_modules/puppeteer/lib/Connection.js:121:10)
at Browser.close (../../node_modules/puppeteer/lib/Browser.js:60:22)
at Object.<anonymous>.test (__tests__/visual.spec.js:21:17)
at <anonymous>
at process._tickCallback (internal/process/next_tick.js:169:7)
I believe it is because the element loses its context or something similar and I don't know what methods to use to get around this. Could you provide a deeper explanation or a possible solution? I don't find the API docs helpful at all.
ElementHandle.dispose() is called once page navigation occurs, as part of garbage collection, as stated here in the docs. So when you call example.click() the page navigates, and the rest of the element handles no longer point to anything.
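A rough sketch of one way around this, assuming the story links are still present after each click: re-query the selector on every iteration instead of holding on to the original handles, and use a plain for loop so each click and screenshot completes before the next one starts.
const selector = 'a[href*="?selectedKind=Buttons&selectedStory="]';
const count = (await page.$$(selector)).length;
for (let i = 0; i < count; i++) {
  // Fresh handles after the previous navigation, so none of them have been disposed.
  const examples = await page.$$(selector);
  await examples[i].click();
  const screen = await page.screenshot();
  expect(screen).toMatchImageSnapshot();
}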

Nodejs load my HTML without the CSS

I wrote a little program in Node.js on Cloud9 that is supposed to serve my HTML. It worked, but without the CSS. I tried many things but I didn't find the solution.
var http = require("http");
var fs = require("fs");
var events = require('events');
var eventEmitter = new events.EventEmitter();
fs.readFile('homepage.html', function(err, data) {
if (err) return console.log(err);
var server = http.createServer(function (request, response) {
response.writeHead(200, {'Content-Type': 'text/html'});
response.writeHead(200, {'Content-Type': 'text/css'});
response.write(data);
response.end();
}).listen(8081);
});
console.log("Server running.");
my HTML:
<!DOCTYPE html>
<html>
<head>
<title>My project</title>
<meta charset="UTF-8">
<link rel="stylesheet" href="css/style.css"/>
</head>
<body>
<h1>My project</h1>
<p>coming soon</p>
</body>
</html>
my CSS:
h1{
font-size: 4px;
}
I tried opening just the .html file in my browser and my h1 was 4px.
<link rel="stylesheet" href="css/style.css"/>
You tell the browser to load css/style.css.
The browser asks the server for it.
response.writeHead(200, {'Content-Type': 'text/html'});
The server says "Here is some HTML!"
response.writeHead(200, {'Content-Type': 'text/css'});
Then it says "Here is some CSS!"
Which is very odd. You can send an HTML document or a standalone stylesheet. You can't send both at the same time.
response.write(data);
Then the server sends the contents of homepage.html
So when the browser asks for the style sheet, it gets another copy of the homepage.
You need to pay attention to the request object which tells you what the browser is asking for. In particular pay attention to the method and url properties.
You then need to give it the correct response for that particular URL (which should include the correct content-type for the file type, and could be a 404 error if it isn't a request for something you were expecting).
At the moment you always send the homepage. If the browser asks for /, you send the homepage. If it asks for /css/style.css, you send the homepage. If it asks for /this/does/not/exist, you send the homepage.
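A minimal sketch of that idea, keeping the file names and port from the question:
const http = require('http');
const fs = require('fs');
http.createServer(function (request, response) {
  if (request.url === '/css/style.css') {
    // The stylesheet, with the content-type the browser expects for CSS.
    response.writeHead(200, { 'Content-Type': 'text/css' });
    fs.createReadStream(__dirname + '/css/style.css').pipe(response);
  } else if (request.url === '/') {
    // The homepage.
    response.writeHead(200, { 'Content-Type': 'text/html' });
    fs.createReadStream(__dirname + '/homepage.html').pipe(response);
  } else {
    // Anything else was not expected.
    response.writeHead(404, { 'Content-Type': 'text/plain' });
    response.end('Not found');
  }
}).listen(8081);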