How do I get the actual config in puppeteer?

I want to conditionally execute some code based on the headless config attribute in puppeteer (passed in the .launch function).
e.g.: when I use the .type function, if it is running with headless: true, I don't want any delay; otherwise, add some { delay: 200 }.
How can I retrieve the headless value from the config?

Edit (thanks to #AndreyLushnikov's comment)
You can figure out at runtime whether puppeteer is running (non-)headless by checking browser.process().spawnargs for the --headless switch that Chromium was (or was not) launched with:
const headless = browser.process().spawnargs.includes("--headless");
console.log("Headless? " + headless);

With the latest puppeteer version to date (1.7.0), this is how I retrieved the config:
const client = await page.target().createCDPSession();
const response = await client.send('Browser.getBrowserCommandLine');
page.headless = response.arguments.includes('--headless');
See this github issue for more information
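For the original use case (skipping the typing delay when running headless), a minimal sketch building on the snippets above could look like this; the selector and text are just placeholders:
// page.headless was set above (or use the spawnargs check instead)
const typeOptions = page.headless ? {} : { delay: 200 };
await page.type('input[name="q"]', 'hello world', typeOptions);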

setValue() / addValue() type into adress bar instead of selected element

I'm using WebdriverIO + devtools:puppeteer + cucumber + Firefox Nightly.
When using setValue() / addValue(), the first letter of my input is typed into the address bar instead of the selected element. The issue doesn't appear for the same tests in the mse or chrome browsers.
Issue:
After this, nothing happens until the function times out:
INFO devtools: COMMAND navigateTo("https://google.com/")
INFO devtools: RESULT null
INFO devtools: COMMAND findElement("css selector", "input[type=text]")
INFO devtools: RESULT { 'element-6066-11e4-a52e-4f735466cecf': 'ELEMENT-1' }
INFO devtools: COMMAND elementClear("ELEMENT-1")
INFO devtools: RESULT null
INFO devtools: COMMAND elementSendKeys("ELEMENT-1", "hello world")
Code examples:
Test:
Scenario: Try google
  When I open "google.com"
  Then I type "hello world" into "input[type=text]"
Steps:
When('I open {string}', async function (URL) {
  await browser.url(`https://${URL}`);
});

Then('I type {string} into {string}', async function (input, selector) {
  await $(selector).setValue(input);
});
Although there is a workaround for some URLs of clicking on the element before using setValue(), it doesn't work in all cases (e.g. when redirecting from a pre-login page to a login page with a pre-typed login, I could not click + setValue for the password field).
Hope anyone knows how this could be solved or worked around for all cases. Thanks.
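For reference, the click-before-setValue workaround mentioned above looks roughly like this (the step and selector come from the test above):
Then('I type {string} into {string}', async function (input, selector) {
  // Clicking first focuses the element, which avoids the keystrokes going to the address bar on some pages
  await $(selector).click();
  await $(selector).setValue(input);
});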
[UPD]
#AnthumChris
As I'm using the built-in puppeteer, page is not defined by default.
Instead I tried:
const puppeteerBrowser = await browser.getPuppeteer()
const pages = await puppeteerBrowser.pages()
const page = pages[0]
await (await page.waitForSelector('input[type=text]')).type('hello')
It worked for chrome and mse again, but failed for Firefox Nightly.
After opening the requested URL (google.com) in the browser, I received the following error:
Error in "21: Then I type "hello world" into "input[type=text]""
TypeError [ERR_INVALID_URL]: Invalid URL: http://localhost:localhost:64619
[UPD]
I've changed browserURL: 'http://localhost:${rdPort}' to browserURL: 'http://${rdPort}' in the ...\node_modules\webdriverio\build\commands\browser\getPuppeteer.js file,
so I could at least connect to the puppeteer.pages object, but there's still a problem with the await (await page.waitForSelector('input[type=text]')).type('hello') action:
ProtocolError: Protocol error (DOM.resolveNode): Node with given id does not belong to the document resolveNode#chrome://remote/content/cdp/domains/content/DOM.jsm:245:15
execute#chrome://remote/content/cdp/domains/DomainCache.jsm:101:25
receiveMessage#chrome://remote/content/cdp/sessions/ContentProcessSession.jsm:84:45
Try awaiting the <input> and typing directly into it:
await (await page.waitForSelector('input[type=text]')).type('hello')

Beginner problem with chrome navigator.serial

I am working on use of serial port access with chrome browser, using "navigator.serial".
My initial experiment is based on a prior posting to stackoverflow:
Is there an example site that uses navigator.serial?
I have duplicated the code example referenced above, and have made the required configuration change #enable-experimental-web-platform-features, again as described above.
I am doing this all on Ubuntu 18.04. There are two USB serial ports attached to the machine, and I have verified using gtkterm that I can send and receive data between the two ports.
From the example given (code duplicated below), I find that I can open the serial port and establish a "reader", and the step await reader.read() does wait until an incoming character appears on the serial port, but at this point the variable/object "data" remains undefined.
Two questions/issues:
What am I doing wrong that leaves "data" undefined? I added an alert() dialog box that pops up once const {done, data} = await reader.read(); proceeds; however, the dialog box says that "data" is undefined at that point. Is data a promise that I am failing to wait on?
I have not been able to find a (hopefully self-contained) reference on the methods and members of the classes involved (i.e., reader.read() and reader.write() are methods available to my object "reader"); where can I find a list of the available methods and their properties?
Here is a copy of the code (small web page) that was obtained from the year-ago posting above:
<html>
<script>
var port;
var buffy = new ArrayBuffer(1);
var writer;
buffy[0]=10;
const test = async function () {
  const requestOptions = {
    // Filter on devices with the Arduino USB vendor ID.
    //filters: [{ vendorId: 0x2341 }],
  };
  // Request an Arduino from the user.
  port = await navigator.serial.requestPort(requestOptions);
  // Open and begin reading.
  await port.open({ baudrate: 115200 });
  //const reader = port.in.getReader();
  const reader = port.readable.getReader();
  writer = port.writable.getWriter();
  //const writer = port.writable.getWriter();
  //writer.write(buffy);
  while (true) {
    const {done, data} = await reader.read();
    if (done) break;
    console.log(data);
  }
} // end of function
</script>
<button onclick="test()">Click It</button>
</html>
Thank you for any assistance!
I was having the exact same problem and managed to solve it.
Change
const {done, data} = await reader.read();
To
const {value, done} = await reader.read();
The example you got this from (and a few others) was wrong; the destructured names were the wrong way around.
Also, when I used
const {data, done} = await reader.read();
it did not work either: destructuring binds by property name, and the result object only has value and done, so data ends up undefined (you could write const { value: data, done } = await reader.read(); if you want to keep the name data).
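Applied to the loop from the question, the corrected version looks roughly like this (for a serial port, value is a Uint8Array of the received bytes):
while (true) {
  // done becomes true once the stream is closed; value holds the received bytes
  const { value, done } = await reader.read();
  if (done) break;
  console.log(value);
}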
Documentation on navigator.serial is not great. Here are some links to help
The API (note this is draft and does not exactly match the Chrome implementation)
https://wicg.github.io/serial/
port.readable is a ReadableStream
https://streams.spec.whatwg.org/#readablestream
and calling getReader() on it returns a ReadableStreamDefaultReader, whose read() result is defined as
dictionary ReadableStreamDefaultReadResult {
  any value;
  boolean done;
};
https://streams.spec.whatwg.org/#readablestreamdefaultreader
An explainer
https://github.com/WICG/serial/blob/gh-pages/EXPLAINER.md
A tutorial
https://codelabs.developers.google.com/codelabs/web-serial
Chromium tracker
https://goo.gle/fugu-api-tracker
The Web Serial API work item
https://bugs.chromium.org/p/chromium/issues/detail?id=884928

Unable to locate an element with puppeteer

I'm trying to do a basic search on FB Marketplace with puppeteer (it was working for me before), but it has started failing recently.
The whole thing fails when it gets to the "location" link on the Marketplace page. To change the location I need to click on it, but puppeteer errors out saying:
Error: Node is either not visible or not an HTMLElement
If I try to get the boundingBox of the element, it returns null:
const browser = await puppeteer.launch();
const page = await browser.newPage();
const resp = await page.goto('https://www.facebook.com/marketplace', { waitUntil: 'networkidle2' })
const withinLink = await page.waitForXPath('//span[contains(.,"Within")]', { timeout: 4000 })
console.log(await withinLink.boundingBox()) //returns null
await withinLink.click() //errors out
If I take a screenshot of the page right before I locate the element, it is clearly there, and I am able to locate it manually in the Chrome console using the same XPath.
It just doesn't seem to work in puppeteer.
Something clearly changed on FB. Maybe they started to use some AI technology to detect scraping?
I don't think Facebook changed its headless-browser detection lately, but it seems you haven't taken into account that const withinLink = await page.waitForXPath('//span[contains(.,"Within")]', { timeout: 4000 }) returns an array, even if there is only one element matching contains(.,"Within").
That should work if you add a [0] index to the element handles:
const withinLink = await page.waitForXPath('//span[contains(.,"Within")]')
console.log(await withinLink[0].boundingBox())
await withinLink[0].click()
Note: the timeout is not mandatory in waitForXPath, but I'd suggest using domcontentloaded instead of networkidle2 in page.goto if you don't need all the analytics/tracking events to achieve the desired result; it just slows down your script execution.
Note 2: Honestly, I don't have such an element on my FB page, maybe it is market dependent. But it works with any other XPath selector with specific content.
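For example, the navigation from the question could be adjusted like this (a sketch of the suggested waitUntil swap):
// 'domcontentloaded' resolves once the DOM is parsed, without waiting for trackers/analytics
const resp = await page.goto('https://www.facebook.com/marketplace', { waitUntil: 'domcontentloaded' })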

Cypress throwing SecurityError

I am currently running with Chrome 74 and trying to use Cypress to test a style-guide in my app. When I load up Cypress it throws this error:
SecurityError: Blocked a frame with origin "http://localhost:3000"
from accessing a cross-origin frame.
Please let me know if there is a solution to this!
I had tried to follow along with this:
https://github.com/cypress-io/cypress/issues/1951
But nothing has changed/worked for me. :(
My code is shown below: cypress/plugins/index.js
module.exports = (on, config) => {
  on('before:browser:launch', (browser = {}, args) => {
    // browser will look something like this
    // {
    //   name: 'chrome',
    //   displayName: 'Chrome',
    //   version: '63.0.3239.108',
    //   path: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
    //   majorVersion: '63'
    // }
    if (browser.name === 'chrome') {
      args.push('--disable-site-isolation-trials');
      return args
    }
    if (browser.name === 'electron') {
      args['fullscreen'] = true
      // whatever you return here becomes the new args
      return args
    }
  })
}
In my cypress/support/index.js, this loads the site before every test I run, to save myself from having to write cy.visit in every test:
beforeEach(() => {
  cy.visit('http://localhost:3000/style-guide')
})
I had the very same issue yesterday, and the answer from #jsjoeio in the cypress issue #1951 you referenced in your question actually helped me.
So basically the only thing I did was modify my cypress.json and add the following value:
{
  "chromeWebSecurity": false
}
You can disable Chrome web security to overcome this issue.
Go to the cypress.json file.
Write { "chromeWebSecurity": false } and save.
Run the test again.
I had exactly the same problem; I advise you to do as DurkoMatko recommends (see the chromeWebSecurity documentation).
But I encountered another problem with a link pointing to localhost in an iframe.
If you want to use a link in an iframe I recommend this :
cy.get('iframe').then((iframe) => {
  const body = iframe.contents().find('body');
  cy.wrap(body).find('a').click();
});
I have also faced this issue. My application was using service workers. Disabling service workers while visiting a page solved the issue.
cy.visit('index.html', {
  onBeforeLoad (win) {
    delete win.navigator.__proto__.serviceWorker
  }
})
Ref: https://glebbahmutov.com/blog/cypress-tips-and-tricks/#disable-serviceworker
So, at least for me, my remaining problem was an internal one with tokens, logins, etc. BUT the code I posted for the index file in the plugins folder is correct for bypassing the Chrome issue. That is how you want to fix it!
Go to your cypress.json file.
Set chromeWebSecurity to false:
{
  "chromeWebSecurity": false
}
To get around these restrictions, Cypress implements some strategies involving JavaScript code, the browser's internal APIs, and network proxying to play by the rules of the same-origin policy.
Access your project and in the 'cypress.json' file insert:
{
  "chromeWebSecurity": false
}
Reference: Cypress Documentation

How to run Headless Chrome in Azure Cloud Service or Azure Functions?

I am trying to use Headless Chrome to generate a PDF file from a complex HTML file (containing images, SVGs, etc.). I am able to use wkhtmltopdf.exe on a Cloud Service (Windows) to generate a simple PDF file, but I really need Chrome to produce PDFs that are as close as possible to the original HTML + SVG + images.
I was hoping to be able to run Headless Chrome in an Azure Cloud Service or Azure Functions, but I cannot get it to work. I suppose this is due to restrictions on GDI. I was able to run my code and Headless Chrome in the Azure emulator on my own machine, but once it is deployed nothing works.
Below is the code I am currently running in Azure Functions (for Windows). I am using Puppeteer to take a screenshot of example.com. If I can get this to work, I suppose that generating a PDF will become easy.
const fs = require('fs');
const path = require('path');
const puppeteer = require('puppeteer');
const os = require('os');

module.exports = function (context, req) {
  function failureCallback(error) {
    context.log("--> Failure = '" + error + "'");
  }

  const chromeDir = path.normalize(__dirname + "/../node_modules/puppeteer/.local-chromium/win64-508693/chrome-win32/chrome.exe");
  context.log("--> Chrome Path = " + chromeDir);

  const dir = path.join(os.tmpdir(), '/screenshots');
  if (!fs.existsSync(dir)) {
    fs.mkdirSync(dir);
  }
  const screenshotPath = path.join(dir, "example.png");
  context.log("--> Path = " + screenshotPath);

  let browser, page;
  puppeteer.launch({ executablePath: chromeDir, headless: true, args: [ '--no-sandbox', '--single-process', '--disable-gpu' ] })
    .then(b => {
      context.log("----> 1");
      browser = b;
      return browser.newPage();
    }, failureCallback)
    .then(p => {
      context.log("----> 2");
      page = p;
      return p.goto('https://www.example.com');
    }, failureCallback)
    .then(response => {
      context.log("----> 3");
      return page.screenshot({path: screenshotPath, fullPage: true});
    }, failureCallback)
    .then(r => {
      browser.close();
      context.res = {
        body: "Done!"
      };
      context.done();
    }, failureCallback);
};
Below is the log when trying to execute the script.
2017-12-18T04:32:05 Welcome, you are now connected to log-streaming service.
2017-12-18T04:33:05 No new trace in the past 1 min(s).
2017-12-18T04:33:11.400 Function started (Id=89b31468-8a5d-43cd-832f-b641216dffc0)
2017-12-18T04:33:20.578 JavaScript HTTP trigger function processed a request.
2017-12-18T04:33:20.578 --> Chrome Path D:\home\site\wwwroot\node_modules\puppeteer\.local-chromium\win64-508693\chrome-win32\chrome.exe
2017-12-18T04:33:20.578 --> Path = D:\local\Temp\screenshots\example.png
2017-12-18T04:33:20.965 --> Failure = 'Error: spawn UNKNOWN'
2017-12-18T04:33:20.965 ----> 2
The error "Failure = 'Error: spawn UNKNOWN'" is not clear. I made sure that the path I am using is correct using Kudu and PowerShell.
I am looking for a way to run Chrome on Azure Cloud Service and/or Azure Functions (for Windows - in order to use my existing App Service plan). Anybody has also attempted to run Headless Chrome in Azure? I am open to any ideas which would help me to get this script to work?
I would recommend using https://www.browserless.io/ so you don't have to run chrome.exe in the App Service.
Replace puppeteer.launch with puppeteer.connect:
const browser = await puppeteer.connect({
  browserWSEndpoint: 'wss://chrome.browserless.io/'
});
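From there, generating the PDF the question asks about could look roughly like this (page.pdf() is standard Puppeteer; the exact browserless endpoint and any required token are not covered here):
const page = await browser.newPage();
await page.goto('https://www.example.com', { waitUntil: 'networkidle2' });
// Render the page to a PDF file instead of a screenshot
await page.pdf({ path: 'example.pdf', format: 'A4', printBackground: true });
await browser.close();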
I'm not sure about the usage of Headless Chrome, but the sandbox that Azure Functions runs in has problems generating PDFs from HTML due to some GDI restrictions.
Consider trying your task in Azure Functions on Linux. While this is still in preview, it does not utilize a sandbox, so if you can get headless chrome working on it then you may have more luck with the PDF generation.
Azure allows NodeJS:
You can do it in NodeJS using Phantom (instead of Chrome, since you won't have access to any browsers, nor will you be able to run them on Azure Web Apps). See the example; it's hosted on Google Firebase, but you can easily apply it to your NodeJS project:
https://stackoverflow.com/a/51828577/6306638
An IIS server on an Azure VM is your only alternative if you NEED Chrome.
Let me know if you need any help with this!