How to run Headless Chrome in Azure Cloud Service or Azure Functions?

I am trying to use Headless Chrome to generate a PDF file from a complex HTML file (it contains images, SVGs, etc.). I am able to use wkhtmltopdf.exe on Cloud Service (Windows) to generate simple PDF files, but I really need Chrome to produce PDFs that are as close as possible to the HTML + SVG + images.
I was hoping to be able to run Headless Chrome in Azure Cloud Service or Azure Functions, but I cannot get it to work. I suppose this is due to restrictions on GDI. I was able to run my code and Headless Chrome in the Azure Emulator on my own machine, but once it is deployed nothing works.
Below is the code I am currently running in Azure Functions (for Windows). I am using Puppeteer to take a screenshot of example.com. If I can get this to work, I suppose that generating PDF will become easy.
const fs = require('fs');
const path = require('path');
const puppeteer = require('puppeteer');
const os = require('os');

module.exports = function (context, req) {
    function failureCallback(error) {
        context.log("--> Failure = '" + error + "'");
    }

    const chromeDir = path.normalize(__dirname + "/../node_modules/puppeteer/.local-chromium/win64-508693/chrome-win32/chrome.exe");
    context.log("--> Chrome Path = " + chromeDir);

    const dir = path.join(os.tmpdir(), '/screenshots');
    if (!fs.existsSync(dir)){
        fs.mkdirSync(dir);
    }

    const screenshotPath = path.join(dir, "example.png");
    context.log("--> Path = " + screenshotPath);

    let browser, page;
    puppeteer.launch({ executablePath: chromeDir, headless: true, args: [ '--no-sandbox', '--single-process', '--disable-gpu' ] })
        .then(b => {
            context.log("----> 1");
            browser = b;
            return browser.newPage();
        }, failureCallback)
        .then(p => {
            context.log("----> 2");
            page = p;
            return p.goto('https://www.example.com');
        }, failureCallback)
        .then(response => {
            context.log("----> 3");
            return page.screenshot({path: screenshotPath, fullPage: true});
        }, failureCallback)
        .then(r => {
            browser.close();
            context.res = {
                body: "Done!"
            };
            context.done();
        }, failureCallback);
};
Below is the log when trying to execute the script.
2017-12-18T04:32:05 Welcome, you are now connected to log-streaming service.
2017-12-18T04:33:05 No new trace in the past 1 min(s).
2017-12-18T04:33:11.400 Function started (Id=89b31468-8a5d-43cd-832f-b641216dffc0)
2017-12-18T04:33:20.578 JavaScript HTTP trigger function processed a request.
2017-12-18T04:33:20.578 --> Chrome Path D:\home\site\wwwroot\node_modules\puppeteer\.local-chromium\win64-508693\chrome-win32\chrome.exe
2017-12-18T04:33:20.578 --> Path = D:\local\Temp\screenshots\example.png
2017-12-18T04:33:20.965 --> Failure = 'Error: spawn UNKNOWN'
2017-12-18T04:33:20.965 ----> 2
The error "Failure = 'Error: spawn UNKNOWN'" is not clear. I made sure that the path I am using is correct using Kudu and PowerShell.
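One detail in the log above is worth spelling out: "----> 2" is printed after the failure. This is most likely because failureCallback is passed as the second argument of every .then(). Such a handler returns undefined, which marks the rejection as handled, so the next .then() runs anyway with undefined as its value. Below is a minimal sketch of that behaviour without Puppeteer; failingLaunch and the step names are hypothetical stand-ins, not part of the function above.

```javascript
// Stand-in for puppeteer.launch() failing; hypothetical, for illustration only.
function failingLaunch() {
  return Promise.reject(new Error('spawn UNKNOWN'));
}

// Pattern from the function above: a rejection handler on every .then().
// The handler swallows the error, so the following step still runs.
function perStepHandlers(log) {
  const onFailure = (error) => { log.push('failure: ' + error.message); };
  return failingLaunch()
    .then(() => { log.push('step 1'); }, onFailure)
    .then(() => { log.push('step 2'); }, onFailure);
}

// Alternative: a single .catch() at the end, so later steps are skipped.
function singleCatch(log) {
  return failingLaunch()
    .then(() => { log.push('step 1'); })
    .then(() => { log.push('step 2'); })
    .catch((error) => { log.push('failure: ' + error.message); });
}
```

With the first variant the log contains the failure followed by 'step 2'; with the second it contains only the failure. In the Azure Function, a final .catch() would probably also be the natural place to call context.done(), so that the invocation ends when something fails.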
I am looking for a way to run Chrome on Azure Cloud Service and/or Azure Functions (for Windows, in order to use my existing App Service plan). Has anybody else attempted to run Headless Chrome in Azure? I am open to any ideas that would help me get this script to work.

I would recommend using https://www.browserless.io/ so you don't have to run chrome.exe in the App Service.
Replace puppeteer.launch with puppeteer.connect:
const browser = await puppeteer.connect({
browserWSEndpoint: 'wss://chrome.browserless.io/'
});
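For the PDF use case from the question, the connect call could be combined with page.pdf() roughly as follows. This is only a sketch under assumptions: the endpoint is whatever remote browser URL you use (the hosted service may require an API token), the puppeteer module is passed in as a parameter, and the helper name renderPdf and the chosen PDF options are made up for illustration.

```javascript
// Sketch: render an HTML string to a PDF buffer on a remote Chrome.
// `puppeteer` is passed in (e.g. require('puppeteer')); `wsEndpoint` is the
// WebSocket URL of the remote browser. renderPdf is a hypothetical helper name.
async function renderPdf(puppeteer, wsEndpoint, html) {
  const browser = await puppeteer.connect({ browserWSEndpoint: wsEndpoint });
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so images and SVGs have loaded.
    await page.setContent(html, { waitUntil: 'networkidle0' });
    // A4 and printBackground are illustrative choices, not requirements.
    return await page.pdf({ format: 'A4', printBackground: true });
  } finally {
    // End the remote session even if rendering failed.
    await browser.close();
  }
}
```

Because no chrome.exe is spawned locally, this path should not hit the "spawn UNKNOWN" error, although whether the App Service sandbox permits the outbound WebSocket connection is something to verify.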

I'm not sure about the usage of Headless Chrome, but the sandbox that Azure Functions runs in has problems generating PDFs from HTML due to some GDI restrictions.
Consider trying your task in Azure Functions on Linux. While this is still in preview, it does not utilize a sandbox, so if you can get headless chrome working on it then you may have more luck with the PDF generation.

Azure allows Node.js:
You can do it in Node.js using Phantom (instead of Chrome, since you won't have access to any browsers, nor will you be able to run them on Azure Web Apps). See the example below; it is hosted on Google Firebase, but you can easily apply it to your Node.js project:
https://stackoverflow.com/a/51828577/6306638
An IIS server on an Azure VM is your only alternative if you NEED Chrome.
Let me know if you need any help with this!

Related

How to link Node.js Post script to HTML form?

I have created a RESTful API, which works as expected when I test it with Postman. I run it from an index.js file, which registers the routes as shown below.
const config = require('config');
const mongoose = require('mongoose');
const users = require('./routes/users');
const auth = require('./routes/auth');
const express = require('express');
const app = express();

//mongoose.set();
if (!config.get('jwtPrivateKey')) {
    console.log('FATAL ERROR: jwtPrivateKey is not defined');
    process.exit(1);
}

// uri is the MongoDB connection string (its definition is not shown here)
mongoose.connect(uri, {
    useNewUrlParser: true,
    useUnifiedTopology: true,
    useCreateIndex: true
})
    .then(() => console.log('Connected to MongoDB...'))
    .catch(err => console.log('Not Connected, bad ;(', err));

app.use(express.json());
// This is only for posting the user, e.g. registering them
app.use('/api/users', users);
app.use('/api/auth', auth);

const port = process.env.PORT || 3000;
app.listen(port, () => console.log(`Listening on port ${port}...`));
The real work happens here. Testing this in Postman, I could confirm that the values are saved in MongoDB.
router.post('/', async (req, res) => {
    // Validate the request.
    const { error } = validate(req.body);
    if (error) return res.status(400).send(error.details[0].message);

    let user = await User.findOne({email: req.body.email});
    if (user) return res.status(400).send('User Already Register, try again!');

    user = new User(_.pick(req.body, ['firstName','lastName','email','password','subscription']));
    // genSalt is the asynchronous variant; genSaltSync returns the salt directly and needs no await.
    const salt = await bcrypt.genSalt(15);
    user.password = await bcrypt.hash(user.password, salt);

    // Here the user is being saved in the database.
    await user.save();

    //const token = user.generateAuthToken();
    //const token = jwt.sign({_id: user._id}, config.get('jwtPrivateKey'));
    const token = user.generateAuthToken();
    // We are sending the authentication token in the header, and the information back to the client
    res.header('x-auth-token',token).send( _.pick(user, ['_id','firstName','lastName','email','subscription']));
});
Now my questions are:
How can I call the second code block from a form in one particular HTML file? When I set action to the path of users.js, the browser opens the JS file's code but doesn't do anything.
Do I need to rewrite the POST block so that it also includes the connection details for the DB? And would this mean I keep the connection to MongoDB open for every insert, read, etc.? Wouldn't this eat a lot of resources if multiple users were to, e.g., log in at the same time?
Or is there a way to use index.js together with the users.js that is referenced in index.js?
All of these are theoretical questions, as I am not quite sure how to use the created API in HTML; I created it while walking through a tutorial.
Do I need to change the approach here?
After some long hours I finally understood my own issue and question.
What I wanted to achieve was to post data from an HTML page to MongoDB through the API (I assume this is the best way to describe it).
In order to do this I needed to:
Start the server for the API, e.g. nodemon index.js, which has the information regarding the API.
To do so, I opened VS Code, opened the terminal and started the API server (if I can call it that).
Open CMD and start a local host for index.html by navigating to its folder and running http-server; after that I could access it at http://127.0.0.1:8080.
For the register.html in the form I needed to post:
This is the part I didn't understand before, but now it makes sense. Basically, I start the API server separately, and once it is running I can use e.g. Postman and other apps that can access this link. I somehow thought HTML needed more direct calls.
So once the local host is started, register.html knows where to post the data via the API.
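A rough sketch of how register.html could post to the API from a script. It assumes the API listens on port 3000 (the default in index.js) with the route mounted at /api/users, and that the form field names match the ones picked in the route; the helper name buildRegisterRequest is made up for illustration. Because index.js only uses express.json(), the body is sent as JSON rather than as a classic form submission.

```javascript
// Assumption: default port and route taken from index.js above.
const API_URL = 'http://localhost:3000/api/users';

// Hypothetical helper: turn form values into arguments for fetch().
function buildRegisterRequest(values) {
  return {
    url: API_URL,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(values)
    }
  };
}

// In the browser, this could be wired to the form's submit event, for example:
// form.addEventListener('submit', async (event) => {
//   event.preventDefault();
//   const values = Object.fromEntries(new FormData(form));
//   const { url, options } = buildRegisterRequest(values);
//   const response = await fetch(url, options);
//   console.log(response.status);
// });
```

Since the page is served from http://127.0.0.1:8080 and the API runs on a different port, the browser treats the request as cross-origin, so the API would also need to send CORS headers that allow it.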
Now I have a Joi validation issue, although this worked in a different, simpler case, so I just need to fix the code there.
Thank you for reading through, and apologies if this was not clear; I am still learning the terminology!

Problem with Firebase Image Resize extension [duplicate]

I am following a tutorial to resize images via Cloud Functions on upload and am experiencing two major issues which I can't figure out:
1) If a PNG is uploaded, it generates the correctly sized thumbnails, but their previews won't load in the Firebase Storage console (the loading spinner shows indefinitely). It only shows the image after I click on "Generate new access token" (none of the generated thumbnails have an access token initially).
2) If a JPEG or any other format is uploaded, the MIME type shows as "application/octet-stream". I'm not sure how to extract the extension correctly so that I can put it into the filename of the newly generated thumbnails.
export const generateThumbs = functions.storage
    .object()
    .onFinalize(async object => {
        const bucket = gcs.bucket(object.bucket);
        const filePath = object.name;
        const fileName = filePath.split('/').pop();
        const bucketDir = dirname(filePath);
        const workingDir = join(tmpdir(), 'thumbs');
        const tmpFilePath = join(workingDir, 'source.png');

        if (fileName.includes('thumb#') || !object.contentType.includes('image')) {
            console.log('exiting function');
            return false;
        }

        // 1. Ensure thumbnail dir exists
        await fs.ensureDir(workingDir);

        // 2. Download Source File
        await bucket.file(filePath).download({
            destination: tmpFilePath
        });

        // 3. Resize the images and define an array of upload promises
        const sizes = [64, 128, 256];
        const uploadPromises = sizes.map(async size => {
            const thumbName = `thumb#${size}_${fileName}`;
            const thumbPath = join(workingDir, thumbName);

            // Resize source image
            await sharp(tmpFilePath)
                .resize(size, size)
                .toFile(thumbPath);

            // Upload to GCS
            return bucket.upload(thumbPath, {
                destination: join(bucketDir, thumbName)
            });
        });

        // 4. Run the upload operations
        await Promise.all(uploadPromises);

        // 5. Cleanup remove the tmp/thumbs from the filesystem
        return fs.remove(workingDir);
    });
Would greatly appreciate any feedback!
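On the second point, one possible approach is to take the extension from the object name rather than from the MIME type. A small sketch using only Node's path module; thumbFileName is a hypothetical helper, and falling back to .png when the name has no extension is an assumption.

```javascript
const path = require('path');

// Build a thumbnail name that keeps the source file's extension.
// Falls back to `fallbackExt` when the name has no extension (assumption).
function thumbFileName(filePath, size, fallbackExt = '.png') {
  const originalExt = path.extname(filePath);
  const base = path.basename(filePath, originalExt);
  return `thumb#${size}_${base}${originalExt || fallbackExt}`;
}

// Example: thumbFileName('photos/cat.JPEG', 64) gives 'thumb#64_cat.JPEG'.
```

As far as I know, sharp's toFile() chooses the output format from the file extension, so the extension picked here would also decide the format of the generated thumbnail.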
I just had the same problem. For some unknown reason, Firebase's Resize Images extension deliberately removes the download token from the resized image.
To stop it from deleting download access tokens:
Go to https://console.cloud.google.com
Select Cloud Functions from the left
Select ext-storage-resize-images-generateResizedImage
Click EDIT
In the Inline Editor, go to the file FUNCTIONS/LIB/INDEX.JS
Add // before this line: delete metadata.metadata.firebaseStorageDownloadTokens;
Comment out the same line in the file FUNCTIONS/SRC/INDEX.TS too
Press DEPLOY and wait until it finishes
Note: both the original and the resized image will have the same token.
I just started using the extension myself. I noticed that I can't access the image preview from the firebase console until I click on "create access token"
I guess that you have to create this token programmatically before the image is available.
I hope it helps
November 2020
Regarding @Somebody's answer, I can't seem to find ext-storage-resize-images-generateResizedImage in GCP Cloud Functions.
A better way is to reuse the original file's firebaseStorageDownloadTokens.
This is how I did mine:
functions
    .storage
    .object()
    .onFinalize((object) => {
        // some image optimization code here
        // (bucket, tempLocalFile and file are assumed to come from that elided code)
        // get the original file access token
        const downloadtoken = object.metadata?.firebaseStorageDownloadTokens;
        return bucket.upload(tempLocalFile, {
            destination: file,
            metadata: {
                metadata: {
                    optimized: true, // other custom flags
                    firebaseStorageDownloadTokens: downloadtoken, // access token
                }
            }
        });
    });

Windows 10 - running a puppeteer script opens a blank Chromium window

I am new to Puppeteer and am trying to run the example script. However, I get a blank chromium window (with no tab or URL bar).
Environment details:
OS: Windows 10
Node version: 8.4.0
NPM version: 6.4.1
I installed puppeteer using NPM and version 1.0.0 got installed. I also installed version 1.9.0 directly from Puppeteer's github page. Both versions have a similar issue.
This is my script:
const puppeteer = require('puppeteer');

(async () => {
    try {
        console.log('starting');
        const browser = await puppeteer.launch({
            executablePath: 'D:/Code/Puppeteer/node_modules/puppeteer/.local-chromium/win64-594312/chrome-win/chrome.exe',
            headless: false
        });
        console.log('one');
        const page = await browser.newPage();
        console.log('two');
        await page.goto('https://github.com');
        console.log('three');
        await page.screenshot({path: 'example.png'});
        console.log("Page is up");
        await browser.close();
    }
    catch (e) {
        console.log("Error: ", e);
    }
})();
In the above script, I can see 'starting', and then the Chromium window opens with nothing on screen. When I press F12 to bring up the dev tools, I see 'one' being printed on screen.
I have set environment variable 'path' to use this:
D:\Code\Puppeteer\node_modules\puppeteer\.local-chromium\win64-594312\chrome-win; C:\Program Files (x86)\Google\Chrome\Application
The Puppeteer script is working now. I had started the Node.js command prompt in administrator mode to run the script, and that did not work; running it in normal (non-elevated) mode worked.

How do I get actual config in puppeteer?

I want to conditionally execute some code based on the headless config attribute in puppeteer (passed in the .launch function).
e.g. : when I use the .type function, if it is running with headless: true, I don't want any delay. Else, add some { delay: 200 }.
How can I retrieve the headless value from the config?
Edit (thanks to @AndreyLushnikov's comment)
You can figure out at runtime whether Puppeteer runs headless by checking whether the spawnargs of browser.process() contain the --headless switch that Chromium was (or was not) launched with:
const headless = browser.process().spawnargs.includes("--headless");
console.log("Headless? " + headless);
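Building on that check, the delay for .type could be derived from the spawn arguments. A minimal sketch: the helper names isHeadless and typeOptions and the 200 ms default are only illustrative, and also matching the --headless= prefix is an assumption meant to cover variants such as --headless=new that some newer versions pass.

```javascript
// True if the Chromium argument list contains a headless switch.
function isHeadless(spawnargs) {
  return spawnargs.some(
    (arg) => arg === '--headless' || arg.startsWith('--headless=')
  );
}

// Options for page.type(): no delay when headless, a visible delay otherwise.
function typeOptions(spawnargs, headfulDelay = 200) {
  return isHeadless(spawnargs) ? {} : { delay: headfulDelay };
}

// Usage (inside an async function, with a launched `browser` and `page`):
// await page.type('#name', 'text', typeOptions(browser.process().spawnargs));
```

Note that browser.process() returns null when the browser was attached with puppeteer.connect, so this only applies to browsers started with puppeteer.launch.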
With the latest Puppeteer version to date (1.7.0), this is how I retrieved the config:
const client = await page.target().createCDPSession();
const response = await client.send('Browser.getBrowserCommandLine');
page.headless = response.arguments.includes('--headless');
See this github issue for more information

webrtc: failed to send arraybuffer over data channel in chrome

I want to send streaming data (as sequences of ArrayBuffer) from a Chrome extension to a Chrome App. Since the Chrome message API (chrome.runtime.sendMessage, postMessage, ...) does not support ArrayBuffer and JS arrays have poor performance, I had to try other methods. Eventually, I found that WebRTC with an RTCDataChannel might be a good solution in my case.
I have succeeded in sending strings over an RTCDataChannel, but when I tried to send an ArrayBuffer I got:
code: 19
message: "Failed to execute 'send' on 'RTCDataChannel': Could not send data"
name: "NetworkError"
It does not seem to be a bandwidth limit problem, since it failed even when I sent a single byte of data. Here is my code:
pc = new RTCPeerConnection(configuration, { optional: [ { RtpDataChannels: true } ]});
//...
var dataChannel = m.pc.createDataChannel("mydata", {reliable: true});
//...
var ab = new ArrayBuffer(8);
dataChannel.send(ab);
Tested on OSX 10.10.1 with Chrome M40 (Stable) and M42 (Canary), and on a Chromebook with M40.
I have filed a bug for WebRTC here.
I modified my code, and now everything works:
Removed the RtpDataChannels option when creating the RTCPeerConnection. (Yes, remove the RtpDataChannels option if you want a data channel; what a magic world!)
On the receiver side: there is no need to call createDataChannel; instead, handle onmessage, onxxx by using event.channel from the pc.ondatachannel callback:
pc.ondatachannel = function(event) {
    var receiveChannel = event.channel;
    receiveChannel.onmessage = function(event){
        console.log("Got Data Channel Message:", event.data);
    };
};
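Wrapped into a reusable function, the receiver side could look like the sketch below. attachReceiver and onData are hypothetical names, and setting binaryType to 'arraybuffer' is an assumption so that binary messages are delivered as ArrayBuffer rather than Blob.

```javascript
// Register a handler for data channels opened by the remote peer.
// `pc` is an RTCPeerConnection created without the RtpDataChannels option.
function attachReceiver(pc, onData) {
  pc.ondatachannel = function (event) {
    const receiveChannel = event.channel;
    // Ask for binary payloads as ArrayBuffer (assumption, see above).
    receiveChannel.binaryType = 'arraybuffer';
    receiveChannel.onmessage = function (messageEvent) {
      onData(messageEvent.data);
    };
  };
}

// Usage: attachReceiver(pc, (data) => console.log('received', data));
```

On the sending side it is probably safest to call dataChannel.send(ab) only after the channel's open event has fired.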