Issue listening for custom event via puppeteer

I am currently working on a GitLab CI test environment, and I have a test harness that we use to test our SDK. I have set up a custom event, fired on the page, that marks the end of the test run. In my puppeteer implementation I want to listen for this custom event, "TEST_COMPLETE".
I have not been able to get this to work, so I figured I would at least make sure the custom-event.js example in the puppeteer repo worked, and there too I am not seeing what I believe I should be getting. I cloned the main repo (linked below) and ran npm install. When I run the example with headless: false and without closing the browser, I do not see any log in the console showing a custom event being fired.
It is my understanding that I should see a console message with 'fired', followed by the 'app-ready' event and its info, but this is not the case. Even if I interact with the page, I don't see anything other than some 'features_loaded' and 'features_unveil' logs.
https://github.com/puppeteer/puppeteer/blob/main/examples/custom-event.js
Is anyone able to get the expected behavior from this code today? I am not sure whether this worked previously and has since broken, or whether I am just doing something wrong. Any info would be of great help. Thanks!

Not sure if this is what you need, but I can get the message 'TEST_COMPLETE fired.' in Node.js console with this simplified code (puppeteer 8.0.0):
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch();
try {
  const [page] = await browser.pages();
  await page.goto('https://example.org/');
  // Expose a Node.js callback that the page can call as window.onCustomEvent().
  await page.exposeFunction('onCustomEvent', async (type) => {
    console.log(`${type} fired.`);
    await browser.close();
  });
  await page.evaluate(() => {
    // Forward the DOM event to Node.js, then dispatch it once to exercise the bridge.
    document.addEventListener('TEST_COMPLETE', (e) => {
      window.onCustomEvent('TEST_COMPLETE');
    });
    document.dispatchEvent(new Event('TEST_COMPLETE'));
  });
} catch (err) { console.error(err); }
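For a real test run, where the page itself dispatches the event at some later point, the same bridge can probably be wrapped in a promise with a timeout. This is only a minimal sketch, not what the puppeteer example does: waitForCustomEvent and the __on_ prefix are made-up names, and it assumes the event has not already fired when the listener is attached and that e.detail (if any) is serializable.

```javascript
// Resolve with the event's detail once `eventName` is dispatched on the page's
// document, or reject after `timeoutMs`. `page` is expected to be a Puppeteer Page.
function waitForCustomEvent(page, eventName, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${eventName} not fired within ${timeoutMs} ms`)),
      timeoutMs
    );
    // Hypothetical bridge name; exposeFunction rejects if the name is already taken.
    const bridgeName = `__on_${eventName}`;
    page
      .exposeFunction(bridgeName, (detail) => {
        clearTimeout(timer);
        resolve(detail);
      })
      .then(() =>
        // Runs in the page: forward the first matching event to the bridge.
        page.evaluate((name, bridge) => {
          document.addEventListener(
            name,
            (e) => window[bridge](e.detail ?? null),
            { once: true }
          );
        }, eventName, bridgeName)
      )
      .catch((err) => {
        clearTimeout(timer);
        reject(err);
      });
  });
}
```

It would be called after page.goto(), for example as const detail = await waitForCustomEvent(page, 'TEST_COMPLETE'). If the harness can fire the event before the listener is attached, registering the listener with page.evaluateOnNewDocument() before navigating is likely the safer variant.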

Related

SERVICE WORKER: The service worker navigation preload request failed with network error: net::ERR_INTERNET_DISCONNECTED in Chrome 89

I have a problem with my Service Worker.
I'm currently implementing offline functionality with an offline.html site to be shown in case of network failure. I have implemented Navigation Preloads as described here: https://developers.google.com/web/updates/2017/02/navigation-preload#activating_navigation_preload
Here is my install event listener, where I call skipWaiting() and initialize the new cache:
const version = 'v.1.2.3'
const CACHE_NAME = '::static-cache'
const urlsToCache = ['index~offline.html', 'favicon-512.png']
self.addEventListener('install', function(event) {
  self.skipWaiting()
  event.waitUntil(
    caches
      .open(version + CACHE_NAME)
      .then(function(cache) {
        return cache.addAll(urlsToCache)
      })
      .then(function() {
        console.log('WORKER: install completed')
      })
  )
})
Here is my activate event listener, where I feature-detect navigationPreload and enable it. Afterwards I check for old caches and delete them:
self.addEventListener('activate', event => {
  console.log('WORKER: activated')
  event.waitUntil(
    (async function() {
      // Feature-detect
      if (self.registration.navigationPreload) {
        // Enable navigation preloads!
        console.log('WORKER: Enable navigation preloads')
        await self.registration.navigationPreload.enable()
      }
      // Delete every cache except the current version, and wait for the deletions
      const cacheNames = await caches.keys()
      await Promise.all(
        cacheNames
          .filter(cacheName => cacheName !== version + CACHE_NAME)
          .map(async cacheName => {
            await caches.delete(cacheName)
            console.log(cacheName + ' CACHE deleted')
          })
      )
    })()
  )
})
This is my fetch eventListener
self.addEventListener('fetch', event => {
  const { request } = event
  // Always bypass for range requests, due to browser bugs
  if (request.headers.has('range')) return
  event.respondWith(
    (async function() {
      // Try to get from the cache:
      const cachedResponse = await caches.match(request)
      if (cachedResponse) return cachedResponse
      try {
        const response = await event.preloadResponse
        if (response) return response
        // Otherwise, get from the network
        return await fetch(request)
      } catch (err) {
        // If this was a navigation, show the offline page:
        if (request.mode === 'navigate') {
          console.log('Err: ',err)
          console.log('Request: ', request)
          return caches.match('index~offline.html')
        }
        // Otherwise throw
        throw err
      }
    })()
  )
})
Now my problem:
On my local machine, on localhost, everything works as it should: if the network is offline, the index~offline.html page is delivered to the user. When I deploy to my test server, everything also works as expected, except for a strange error message in Chrome during normal browsing (not in offline mode):
The service worker navigation preload request failed with network error: net::ERR_INTERNET_DISCONNECTED.
I logged the error and the request to get more information
Error:
DOMException: The service worker navigation preload request failed with a network error.
Request:
It's strange, because somehow index.html is requested no matter which page is loaded.
Additional information: this is happening in Chrome 89; in Chrome 88 everything seems fine (I checked in BrowserStack). I just saw that there was a change in PWA offline detection in Chrome 89:
https://developer.chrome.com/blog/improved-pwa-offline-detection/
Does anybody have an idea what the problem might be?
Update
I reproduced the problem here so everybody can check it out: https://dreamy-leavitt-bd4f0e.netlify.app/
This error is directly caused by the improved pwa offline detection you linked to:
https://developer.chrome.com/blog/improved-pwa-offline-detection/
The browser fakes an offline context and tries to request the start_url of your manifest, e.g. the index.html specified in your https://dreamy-leavitt-bd4f0e.netlify.app/site.webmanifest
This is to make sure that your service worker is actually returning a valid 200 response in this situation, i.e. the valid cached response for your index~offline.html page.
The error you're asking about specifically comes from the await event.preloadResponse part, and it apparently can't be suppressed.
The await fetch call produces a similar error, but that one can be suppressed: just don't console.log in the catch section.
Hopefully Chrome won't show this error from preload responses in the future when doing offline PWA detection, as it's needlessly confusing.
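To make the "don't log in the catch" part concrete, here is a rough sketch of the same fetch logic with a quiet fallback. respondWithFallback and its scope parameter are my own names (the parameter exists only so the logic can be exercised outside a worker); in a real service worker you would pass self. It does not stop Chrome from printing its own preload message.

```javascript
// Sketch of the fetch logic with a quiet offline fallback.
// `scope` is the service worker global scope (pass `self` in a real worker).
async function respondWithFallback(event, scope, offlineUrl = 'index~offline.html') {
  const { request } = event;

  // Serve from the cache first, as in the original handler.
  const cached = await scope.caches.match(request);
  if (cached) return cached;

  try {
    // May reject while Chrome simulates being offline; Chrome reports that itself.
    const preloaded = await event.preloadResponse;
    if (preloaded) return preloaded;
    return await scope.fetch(request);
  } catch (err) {
    // No console.log here, so the handler adds nothing to the console.
    if (request.mode === 'navigate') return scope.caches.match(offlineUrl);
    throw err;
  }
}
```

Inside the fetch listener this would be wired up as event.respondWith(respondWithFallback(event, self)), keeping the range-request bypass in front of it.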

Unable to locate an element with puppeteer

I'm trying to do a basic search on FB Marketplace with puppeteer. It was working for me before, but it has started failing recently.
The whole thing fails when it gets to the "location" link on the Marketplace page. To change the location I need to click on it, but puppeteer errors out, saying:
Error: Node is either not visible or not an HTMLElement
If I try to get the boundingBox of the element, it returns null:
const browser = await puppeteer.launch();
const page = await browser.newPage();
const resp = await page.goto('https://www.facebook.com/marketplace', { waitUntil: 'networkidle2' })
const withinLink = await page.waitForXPath('//span[contains(.,"Within")]', { timeout: 4000 })
console.log(await withinLink.boundingBox()) //returns null
await withinLink.click() //errors out
If I take a screenshot of the page right before I locate the element, it is clearly there, and I am able to locate it in the Chrome console manually using the same XPath.
It just doesn't seem to work in puppeteer.
Something clearly changed on FB. Maybe they started to use some AI technology to detect scraping?
I don't think Facebook has changed its headless browser detection lately. One thing to check is which node you are actually holding: page.waitForXPath('//span[contains(.,"Within")]', { timeout: 4000 }) resolves to a single element handle, whereas page.$x returns an array of handles, even if there is only one element matching contains(.,"Within").
If the handle you get points at a node that is not rendered, boundingBox() returns null and click() fails. Querying with page.$x lets you inspect every match and pick one by index:
await page.waitForXPath('//span[contains(.,"Within")]')
const withinLinks = await page.$x('//span[contains(.,"Within")]')
console.log(await withinLinks[0].boundingBox())
await withinLinks[0].click()
Note: The timeout option is not mandatory in waitForXPath. I'd also suggest using domcontentloaded instead of networkidle2 in page.goto if you don't need all analytics/tracking events to achieve the desired result; waiting for them just slows down your script.
Note 2: Honestly, I don't have such an element on my FB platform; maybe it is market dependent. But it works with other XPath selectors that match specific content.
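Since boundingBox() resolves to null for nodes that are not rendered, one way to debug a case like this is to collect every match and pick the first one that has a real box. A minimal sketch, assuming handles is an array of element handles such as page.$x returns; firstVisible is a hypothetical helper, not part of puppeteer:

```javascript
// Hypothetical helper: return the first handle that is actually rendered.
// ElementHandle.boundingBox() resolves to null when the element is not visible.
async function firstVisible(handles) {
  for (const handle of handles) {
    const box = await handle.boundingBox();
    if (box && box.width > 0 && box.height > 0) return handle;
  }
  return null; // nothing clickable among the matches
}

// Possible usage inside the script from the question:
//   const matches = await page.$x('//span[contains(.,"Within")]');
//   const withinLink = await firstVisible(matches);
//   if (withinLink) await withinLink.click();
```

If this returns null while the screenshot shows the text, the XPath is probably matching only hidden duplicates, and a more specific selector would be the next thing to try.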

Slack webhooks cause cls-hooked request context to orphan mysql connections

The main issue:
We have a lovely little express app, which has been crushing it for months with no issues. We manage our DB connections by opening a connection on demand, but then caching it "per request" using the cls-hooked library. Upon the request ending, we release the connection so our connection pool doesn't run out. Classic. Over the course of months and many connections, we've never "leaked" connections. Until now! Enter... slack! We are using the slack event handler as follows:
app.use('/webhooks/slack', slackEventHandler.expressMiddleware());
and we sort of think of it like any other request, however slack requests seem to play weirdly with our cls-hooked usage. For example, we use node-ts and nodemon to run our app locally (e.g. you change code, the app restarts automatically). Every time the app restarts locally on our dev machines, and you try and play with slack events, suddenly when our middleware that releases the connection tries to do so, it thinks there is nothing in session. When you then use a normal endpoint... it works fine and essentially seems to reset slack to working okay again. We are now scared to go to prod with our slack integration, because we're worried our slack "requests" are going to starve our connection pool.
Background
Relevant subset of our package.json:
{
  "@slack/events-api": "^2.3.2",
  "@slack/web-api": "^5.8.0",
  "express": "~4.16.1",
  "cls-hooked": "^4.2.2",
  "mysql2": "^2.0.0"
}
The middleware that makes the cls-hooked session
import { session } from '../db';
const context = (req, res, next) => {
  session.run(() => {
    session.bindEmitter(req);
    session.bindEmitter(res);
    next();
  });
};
export default context;
The middleware that releases our connections
export const dbReleaseMiddleware = async (req, res, next) => {
  res.on('finish', async () => {
    const conn = session.get('conn');
    if (conn) {
      incrementConnsReleased();
      await conn.release();
    }
  });
  next();
};
the code that creates the connection on demand and stores it in "session"
const poolConn = await pool.getConnection();
if (session.active) {
  session.set('conn', poolConn);
}
return poolConn;
the code that sets up the session in the first place
export const session = clsHooked.createNamespace('our_company_name');
If you got this far, congrats. Any help appreciated!
Side note: you couldn't pay me to write a more confusing title...
Figured it out! It seems we have identified the following behavior in the node version of slack's API (seems to only happen on mac computers... sometimes)
The issue is that this is in the context of an express app, so Slack is managing the interface between its own event handler system + the http side of things with express (e.g. returning 200, or 500, or whatever). So what seems to happen is...
// you have some slack event handler
slackEventHandler.on('message', async (rawEvent: any) => {
  let i = 0;
  i = i + 1;
  // at this point, the http request has not returned 200; it is "pending" from express's POV
  await myService.someMethod();
  // ^^ while this was doing its async thing, the express request returned 200,
  // so things like res.on('finish') all fired and all your middleware ran,
  // but your event handler code is still going
});
So we ended up creating a manual call to release connections in our slack event handlers. Weird!
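For anyone wanting to do the same, the manual release can be kept in one place with a small wrapper. This is only a sketch of the idea, not our exact code: withConnection is a hypothetical helper, and it assumes pool is a promise-based pool (as with mysql2/promise) whose getConnection() resolves to a connection with a release() method.

```javascript
// Hypothetical helper: run `handler` with a pooled connection and always release
// it, even if the handler throws, without relying on the request's 'finish' event.
async function withConnection(pool, handler) {
  const conn = await pool.getConnection();
  try {
    return await handler(conn);
  } finally {
    conn.release();
  }
}

// Possible usage in a Slack event handler (passing conn to the service is an assumption):
//   slackEventHandler.on('message', (rawEvent) =>
//     withConnection(pool, (conn) => myService.someMethod(conn, rawEvent)));
```

The point of the try/finally is that the connection's lifetime is tied to the handler's own promise rather than to the HTTP response, which may already have been sent.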

How to get all console messages with puppeteer? including errors, CSP violations, failed resources, etc

I am fetching a page with puppeteer that has some errors in the browser console but the puppeteer's console event is not being triggered by all of the console messages.
The puppeteer chromium browser shows multiple console messages
However, puppeteer only console logs one thing in node
Here is the script I am currently using:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  page.on('console', msg => console.log('PAGE LOG:', msg.text));
  await page.goto('https://pagewithsomeconsoleerrors.com');
  await browser.close();
})();
Edit: As stated in my comment below, I did try the page.waitFor(5000) command that Everettss recommended but that didn't work.
Edit2: removed spread operator from msg.text as it was there by accident.
Edit3: I opened an issue on github regarding this with similar but different example screenshots: https://github.com/GoogleChrome/puppeteer/issues/1512
The GitHub issue about capturing console errors includes a great comment about listening to console and network events. For example, you can register for console output and network responses and failures like this:
page
  .on('console', message =>
    console.log(`${message.type().substr(0, 3).toUpperCase()} ${message.text()}`))
  .on('pageerror', ({ message }) => console.log(message))
  .on('response', response =>
    console.log(`${response.status()} ${response.url()}`))
  .on('requestfailed', request =>
    console.log(`${request.failure().errorText} ${request.url()}`))
And get the following output, for example:
200 'http://do.carlosesilva.com/puppeteer/'
LOG This is a standard console.log message
Error: This is an error we are forcibly throwing
at http://do.carlosesilva.com/puppeteer/:22:11
net::ERR_CONNECTION_REFUSED https://do.carlosesilva.com/http-only/sample.png
404 'http://do.carlosesilva.com/puppeteer/this-image-does-not-exist.png'
ERR Failed to load resource: the server responded with a status of 404 (Not Found)
See also types of console messages received with the console event and response, request and failure objects received with other events.
If you want to pimp your output with some colours, you can add chalk, kleur, colorette or others:
const { blue, cyan, green, magenta, red, yellow } = require('colorette')
page
  .on('console', message => {
    const type = message.type().substr(0, 3).toUpperCase()
    const colors = {
      LOG: text => text,
      ERR: red,
      WAR: yellow,
      INF: cyan
    }
    const color = colors[type] || blue
    console.log(color(`${type} ${message.text()}`))
  })
  .on('pageerror', ({ message }) => console.log(red(message)))
  .on('response', response =>
    console.log(green(`${response.status()} ${response.url()}`)))
  .on('requestfailed', request =>
    console.log(magenta(`${request.failure().errorText} ${request.url()}`)))
The examples above use Puppeteer API v2.0.0.
The easiest way to capture all console messages is to pass the dumpio option to puppeteer.launch().
From Puppeteer API docs:
dumpio: <boolean> Whether to pipe the browser process stdout and stderr
into process.stdout and process.stderr. Defaults to false.
Example code:
const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch({
    dumpio: true
  });
  ...
})();
You need to set multiple listeners if you want to catch everything. The console event is emitted when JavaScript within the page calls a console API method (like console.log).
For a full list of the console API take a look at the docs for console on MDN:
https://developer.mozilla.org/en-US/docs/Web/API/Console
The reason you need multiple listeners is that some of what is logged in the image you posted does not happen within the page.
So, for example, to catch the first error in the image, net::ERR_CONNECTION_REFUSED, you would set the listener like so:
page.on('requestfailed', err => console.log(err));
Puppeteer's documentation contains a full list of events. You should take a look at the documentation for the version you are using and look at the different events the Page class will emit as well as what those events will return. The example I've written above will return an instance of Puppeteer's request class.
https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#class-page
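To illustrate the "multiple listeners" point, the listeners can be bundled into one function and attached before page.goto(). This is a rough sketch rather than anything from the puppeteer docs: attachPageLoggers and the line prefixes are made up, while the four event names and the accessor methods are the ones the Page class documents.

```javascript
// Hypothetical helper: attach one listener per source of "console-like" output.
// `page` is a Puppeteer Page (an event emitter); `log` receives each formatted line.
function attachPageLoggers(page, log = console.log) {
  page
    // console.* calls made by scripts in the page
    .on('console', (message) => log(`console.${message.type()}: ${message.text()}`))
    // uncaught exceptions thrown in the page
    .on('pageerror', (error) => log(`pageerror: ${error.message}`))
    // responses with an error status (404, 500, ...)
    .on('response', (response) => {
      if (response.status() >= 400) log(`http ${response.status()}: ${response.url()}`);
    })
    // requests that never got a response (refused, blocked, aborted, ...)
    .on('requestfailed', (request) => {
      const failure = request.failure();
      log(`requestfailed: ${failure ? failure.errorText : 'unknown'} ${request.url()}`);
    });
  return page;
}
```

Passing a custom log function (for example one that pushes into an array) makes it easy to assert on the collected lines in a test instead of just printing them.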
In another answer, I extended the code from Ferdinand's great comment with descriptive error text taken from the JSHandle objects of the consoleMessage, so you can catch all the errors from the browser and read the full descriptions, just as in the browser console.
Check it out here: https://stackoverflow.com/a/66801550/9026103

Google chrome web push api bug

What is this bug? When I send a web push, Google Chrome "sometimes" shows a second message with the text: "This site has been updated in the background."
I want only one message to be shown.
I found this text in the Chrome source:
This site has been updated in the background.
github.com/scheib/chromium/blob/master/chrome/app/resources/generated_resources_en-GB.xtb
How do I get rid of this message?
This behavior is a feature, not a bug.
Here is an issue that explains your situation in Chrome: https://code.google.com/p/chromium/issues/detail?id=437277
And more specific code comment in Chromium code:
https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/push_messaging/push_messaging_notification_manager.cc&rcl=1449664275&l=287
What might have happened is some of the push messages sent to the client did not result in showing a notification.
Hope that helps
The reason this often occurs is the promise returned to event.waitUntil() didn't resolve with a notification being shown.
An example that might show the default push notification:
function handlePush() {
  // BAD: The fetch's promise isn't returned
  fetch('/some/api')
    .then(function(response) {
      return response.json();
    })
    .then(function(data) {
      // BAD: the showNotification promise isn't returned
      self.registration.showNotification(data.title, {body: data.body});
    });
}
self.addEventListener('push', function(event) {
  event.waitUntil(handlePush());
});
Instead, you should write this as:
function handlePush() {
  // GOOD
  return fetch('/some/api')
    .then(function(response) {
      return response.json();
    })
    .then(function(data) {
      // GOOD
      return self.registration.showNotification(data.title, {body: data.body});
    });
}
self.addEventListener('push', function(event) {
  const myNotificationPromise = handlePush();
  event.waitUntil(myNotificationPromise);
});
All of this matters because browsers wait for the promise passed into event.waitUntil() to settle; that is how they know how long the service worker needs to be kept alive and running.
When the promise for a push event resolves, Chrome checks whether a notification has been shown, and whether it displays this default notification comes down to a race condition / specific circumstances. Your best bet is to ensure you have a correct promise chain.
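With async/await it is harder to drop a promise by accident, so the same handler could be sketched as below. registerPushHandler and showPushNotification are made-up names, '/some/api' is the placeholder endpoint from above, and the scope parameter stands in for the service worker's self so the logic can be tried outside a worker.

```javascript
// Fetch the payload and show the notification; the returned promise only
// settles after showNotification() has settled.
async function showPushNotification(scope, apiUrl) {
  const response = await scope.fetch(apiUrl);
  const data = await response.json();
  await scope.registration.showNotification(data.title, { body: data.body });
}

// Hypothetical helper: `scope` is the service worker global scope (`self`).
function registerPushHandler(scope, apiUrl = '/some/api') {
  scope.addEventListener('push', (event) => {
    // waitUntil() receives the whole chain, so the worker stays alive until
    // the notification has been shown.
    event.waitUntil(showPushNotification(scope, apiUrl));
  });
}
```

In the service worker file this would be called once at the top level as registerPushHandler(self).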
I put some extra notes on promises on this post (See: 'Side Quest: Promises' https://gauntface.com/blog/2016/05/01/push-debugging-analytics)