How do I use Headless Chrome in Chrome 60 on Windows 10? - google-chrome

I've been looking at the following article about Headless Chrome:
https://developers.google.com/web/updates/2017/04/headless-chrome
I just upgraded Chrome on Windows 10 to version 60, but when I run either of the following commands from the command line, nothing seems to happen:
chrome --headless --disable-gpu --dump-dom https://www.google.com/
chrome --headless --disable-gpu --print-to-pdf https://www.google.com/
And I'm running all of these commands from the following path (the default installation path for Chrome on Windows):
C:\Program Files (x86)\Google\Chrome\Application\
When I run the commands, something seems to process for a second, but I don't actually see anything. What am I doing wrong?
Thanks.
Edit:
As noted by Mark Rajcok, if you add --enable-logging to the --dump-dom command, it works. Also, the --print-to-pdf command works as well in Chrome 61.0.3163.79, but you'll probably have to specify a different path for the output file in order to have the necessary permissions to save it.
As such, the following two commands worked for me:
"C:\Program Files (x86)\Google\Chrome\Application\chrome" --headless --disable-gpu --enable-logging --dump-dom https://www.google.com/
"C:\Program Files (x86)\Google\Chrome\Application\chrome" --headless --disable-gpu --print-to-pdf=D:\output.pdf https://www.google.com/
I guess the next step is being able to step through the dumped DOM like PhantomJS with DOM selectors and whatnot, but I suppose that's a separate question.
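For what it's worth, here is a minimal sketch of how the dumped DOM could be queried afterwards, using only Python's standard library. The class name LinkCollector, the function extract_links and the sample HTML are made up for illustration; in practice the input would be whatever --dump-dom wrote to stdout or a file.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical sample standing in for the output of --dump-dom.
sample = (
    '<body><a href="/about">About</a><p>text</p>'
    '<a href="https://example.com/">Example</a></body>'
)
print(extract_links(sample))
```

This is not a full selector engine, just the simplest thing that works without extra packages.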
Edit #2:
For what it's worth, I recently came across a Node API for Headless Chrome called Puppeteer (https://github.com/GoogleChrome/puppeteer), which is really easy to use and delivers all the power of Headless Chrome. If you're looking for an easy way to use Headless Chrome, I highly recommend it.

This works for me:
start chrome --enable-logging --headless --disable-gpu --print-to-pdf=c:\misc\output.pdf https://www.google.com/
... but only with "start chrome", "--enable-logging", and an explicit path for the PDF, and only if the folder "misc" already exists on the C: drive.
Addition: the PDF path ("c:\misc" above) can of course be replaced with any other folder.
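Since the output folder apparently has to exist beforehand, a script could create it before building the command. This is only a sketch: the function name pdf_command is hypothetical, and the flags are just the ones from the command above.

```python
from pathlib import Path


def pdf_command(chrome, url, pdf_path):
    """Return the argument list for printing url to pdf_path.

    The parent folder of pdf_path is created first, because the
    command above only worked when that folder already existed.
    """
    target = Path(pdf_path).resolve()
    target.parent.mkdir(parents=True, exist_ok=True)
    return [
        chrome,
        "--enable-logging",
        "--headless",
        "--disable-gpu",
        f"--print-to-pdf={target}",
        url,
    ]
```

The resulting list can be handed to subprocess.run(pdf_command(...), check=True); passing a list avoids having to quote paths that contain spaces.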

With Chrome 61.0.3163.79, if I add --enable-logging then --dump-dom produces output:
> "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --enable-logging --headless --disable-gpu --dump-dom https://www.chromestatus.com
<body class="loading" data-path="/features">
<app-drawer-layout fullbleed="">
...
</script>
</body>
If you want to programmatically control headless Chrome, here's one way to do it with Python3 and Selenium:
In an Admin cmd window, install Selenium for Python:
C:\Users\Mark> pip install -U selenium
Download ChromeDriver v2.32 and extract it. I put the chromedriver.exe in C:\Users\Mark, which is where I put this headless.py Python script:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("headless") # remove this line if you want to see the browser popup
driver = webdriver.Chrome(chrome_options = options)
driver.get('https://www.google.com/')
print(driver.page_source)
driver.quit() # don't miss this, or chromedriver.exe will keep running!
Run it in a normal cmd window:
C:\Users\Mark> python headless.py
<!DOCTYPE html><html xmlns="http://www.w3.org/1999/xhtml" ...
... lots and lots of stuff here ...
...</body></html>

Current versions (68-70) seem to require --no-sandbox in order to run; without it they do absolutely nothing and hang in the background.
The full commands I use are:
chrome --headless --user-data-dir=tmp --no-sandbox --enable-logging --dump-dom https://www.google.com/ > file.html
chrome --headless --user-data-dir=tmp --no-sandbox --print-to-pdf=whatever.pdf https://www.google.com/
Using --no-sandbox is a pretty bad idea and you should use this only for websites you trust, but sadly it's the only way of making it work at all.
--user-data-dir=... uses the specified directory instead of the default one, which is likely already in use by your regular browser.
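A rough sketch of the same idea in Python, using a throwaway profile folder so it never collides with the regular browser's profile. The names dump_dom_command and dump_dom are made up for illustration, and --no-sandbox is left off by default because of the caveat above.

```python
import subprocess
import tempfile


def dump_dom_command(chrome, url, profile_dir, no_sandbox=False):
    """Build the --dump-dom argument list with a separate profile folder."""
    args = [chrome, "--headless", f"--user-data-dir={profile_dir}"]
    if no_sandbox:
        # Only for sites you trust, as noted above.
        args.append("--no-sandbox")
    args += ["--enable-logging", "--dump-dom", url]
    return args


def dump_dom(chrome, url, no_sandbox=False):
    """Run Chrome with a temporary profile and return the dumped HTML."""
    with tempfile.TemporaryDirectory() as profile_dir:
        result = subprocess.run(
            dump_dom_command(chrome, url, profile_dir, no_sandbox),
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout
```

Calling dump_dom("chrome", "https://www.google.com/", no_sandbox=True) would correspond to the first command above, assuming "chrome" resolves to the executable on your machine.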
However, if you're trying to make a PDF from HTML, then this is fairly useless, since you can't remove the header and footer (containing text like file:///...), and the only viable solution is to use Puppeteer.

You should be good. Check under the Chrome Version directory
C:\Program Files (x86)\Google\Chrome\Application\60.0.3112.78
For the command
chrome --headless --disable-gpu --print-to-pdf https://www.google.com/
the PDF is written to
C:\Program Files (x86)\Google\Chrome\Application\60.0.3112.78\output.pdf
Edit:
Still execute commands where the chrome executable is, in this instance
C:\Program Files (x86)\Google\Chrome\Application\
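If it's unclear where output.pdf ended up, something like the following could search the Application folder (including the version subfolder) for it. The function name find_outputs is hypothetical; the folder and file name are the ones from this answer.

```python
from pathlib import Path


def find_outputs(app_dir, name="output.pdf"):
    """Return every file called name at or below app_dir, newest first."""
    matches = [p for p in Path(app_dir).rglob(name) if p.is_file()]
    return sorted(matches, key=lambda p: p.stat().st_mtime, reverse=True)
```

For example, find_outputs(r"C:\Program Files (x86)\Google\Chrome\Application") would list any output.pdf files under the Chrome installation folder.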

I know this question is for Windows, but since Google gives this post as the first search result, here's what works on Mac:
Mac OS X
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --headless --dump-dom 'http://www.google.com'
Note that you MUST include the http:// prefix or it won't work.
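If the URLs come from user input or a file, a tiny helper can add the missing scheme before they are passed to Chrome. This is a sketch only; ensure_scheme is a made-up name, and it deliberately only recognises the "scheme://" form.

```python
import re

# Matches a leading "scheme://" such as http://, https:// or file://.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://")


def ensure_scheme(url, default="http"):
    """Prefix url with default:// unless it already starts with a scheme."""
    if _SCHEME.match(url):
        return url
    return f"{default}://{url}"


print(ensure_scheme("www.google.com"))
```

URLs without "//" such as about:blank are not handled and would need a special case.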
Further tips
To indent the html (which is highly desirable in real pages that are bloated), use tidy:
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --headless --dump-dom 'http://www.google.com' | tidy
You can get tidy with:
brew install tidy

If you want to sidestep the problem altogether and just use a service to do the work for you: I'm the author/founder of browserless, which attempts to tackle running headless Chrome in a service-like fashion. Otherwise, it's pretty tough to keep up with the changes and make sure all the appropriate packages and resources are installed to get Chrome running, but it's definitely doable.

I solved it by running this in PowerShell (inside the chrome.exe directory):
start-process chrome -ArgumentList "--enable-logging --headless --disable-gpu --print-to-pdf=c:\users\output.pdf https://www.google.com/"
You can choose your own path: --print-to-pdf=<custom path>

Related

How to enable remote debugging in Chrome headless browser?

As per details given in chrome://version/
Command Line : /Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary --flag-switches-begin --flag-switches-end
Executable Path : /Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary
If I try to run the executable path directly, I get a "No such file or directory" error:
rajamuhamm.qaiser$ /Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary
-bash: /Applications/Google: No such file or directory
The error occurs because of the spaces in the path.
So I modified my command a bit to
/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --headless -disable-gpu -screenshot https://google.com
That actually works. Next I wanted to save a PDF of the page, so I ran the command below:
/Applications/Google\ Chrome\ Canary.app/Contents/MacOS/Google\ Chrome\ Canary --headless -disable-gpu –print-to-pdf https://google.com
which returned
:ERROR:headless_shell.cc(184)] Open multiple tabs is only supported when remote debugging is enabled.
Before posting this question, I searched Stack Overflow for other solutions, but all the answers say to check the spacing in the directory names. My case seems different: I am able to run the command, but it requires me to enable remote debugging.
Can you please tell me how? I Googled it but could not find much.
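One thing that may be worth checking, and this is an assumption rather than something confirmed in this thread: in the PDF command the flag is written with an en dash (–print-to-pdf) instead of two ASCII hyphens, so Chrome may be treating it as a second URL. The sketch below flags such arguments and also shows how a path with spaces can be quoted for the shell. The helper names are made up for illustration.

```python
import shlex

# Characters that look like a hyphen but are not ASCII "-"
# (hyphen, non-breaking hyphen, figure dash, en dash, em dash, minus sign).
LOOKALIKE_DASHES = "\u2010\u2011\u2012\u2013\u2014\u2212"


def lookalike_dash_args(args):
    """Return the arguments that start with a non-ASCII dash."""
    return [a for a in args if a and a[0] in LOOKALIKE_DASHES]


def shell_line(args):
    """Render args as one POSIX shell command line with safe quoting."""
    return shlex.join(args)


args = [
    "/Applications/Google Chrome Canary.app/Contents/MacOS/Google Chrome Canary",
    "--headless",
    "\u2013print-to-pdf",  # en dash, as in the failing command
    "https://google.com",
]
print(lookalike_dash_args(args))
print(shell_line(args))
```

If the first print shows anything, retyping that flag with two plain hyphens would be the first thing to try.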

What is the exact command for launching Chrome with remote debugging in Terminal?

I've looked up and tried a couple of ways of launching Chrome with remote debugging through the terminal, and neither has worked. I get the error "no such directory" or "command not found". I've tried:
chrome --remote-debugging-port=9222
and
/Applications/Google\Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
Is either of these correct? And if not, what is the right command?
One simple change is needed: add the bash shebang to the Chrome Debugger script.
#!/bin/bash
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222&
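Once Chrome is running with --remote-debugging-port=9222, a quick way to confirm the port is really open is to ask the DevTools HTTP endpoint for the browser version. A minimal sketch, assuming Chrome is listening on 127.0.0.1; the function names are hypothetical.

```python
import json
from urllib.request import urlopen


def debug_url(port=9222, path="json/version", host="127.0.0.1"):
    """Build the URL of a DevTools HTTP endpoint."""
    return f"http://{host}:{port}/{path}"


def browser_version(port=9222, timeout=5):
    """Ask a running Chrome for its version string.

    Raises URLError (an OSError) if nothing is listening on the port.
    """
    with urlopen(debug_url(port), timeout=timeout) as response:
        return json.load(response)["Browser"]


print(debug_url())
```

Calling browser_version() only makes sense while the script above is running; if it raises, the port was not opened.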

Headless chrome print to pdf hangs on certain sites when running on debian

I am trying to use headless Chrome to print to PDF on Debian 9.
On some sites it hangs and never returns, without reporting any error.
The same thing works when I try it on Windows 10.
Example site: https://www.ynet.co.il/home/0,7340,L-8,00.html
Turning on logging does not reveal any relevant information.
I assume it has something to do with fonts, but since it hangs and returns no error, I am not sure how to proceed.
Is there some way to find out why it hangs?
Setup on debian:
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
dpkg -i google-chrome-stable_current_amd64.deb; apt-get -fy install
Creating pdf on debian:
/usr/bin/google-chrome --headless --no-sandbox --disable-gpu --displayHeaderFooter=false --print-to-pdf=result.pdf https://www.ynet.co.il/home/0,7340,L-8,00.html
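This does not explain the hang, but a wrapper with a timeout at least keeps a batch job from blocking forever on such a site. A sketch only; run_with_timeout is a made-up name, and the demo line runs the Python interpreter rather than Chrome so it works anywhere.

```python
import subprocess
import sys


def run_with_timeout(args, seconds):
    """Run args; return the exit code, or None if it exceeded `seconds`."""
    try:
        completed = subprocess.run(args, timeout=seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before re-raising.
        return None
    return completed.returncode


# Stand-in command that finishes immediately.
print(run_with_timeout([sys.executable, "-c", "pass"], 10))
```

With the command above it would be called as run_with_timeout(["/usr/bin/google-chrome", "--headless", ...], 60). Note that only the direct child is killed, so helper processes started by Chrome may need separate cleanup.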
Creating pdf on windows (need to install google chrome and change the path to it accordingly):
"c:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --headless --no-sandbox --disable-gpu --displayHeaderFooter=false --print-to-pdf=result.pdf https://www.ynet.co.il/home/0,7340,L-8,00.html

Chrome headless immediately exits with --repl flag

I'm trying to use the chrome.exe headless REPL, but it seems to immediately exit.
I'm currently on Windows 7 Pro 64-bit
Chrome Version 72.0.3626.121
Command Used:
$ chrome.exe --headless --disable-gpu --enable-logging --no-sandbox --repl https://www.chromestatus.com/
Result
As you can see below it almost looks like I am able to start using the REPL, except there is no >>> .
$ [0307/131904.237:INFO:headless_shell.cc(370)] Type a Javascript expression to evaluate or "quit" to exit.
If I type a JavaScript expression:
$ [0307/132502.083:INFO:headless_shell.cc(370)] Type a Javascript expression to evaluate or "quit" to exit.
const someNumber = 1
'const' is not recognized as an internal or external command,
operable program or batch file.
$
It appears chrome has already exited. I've tried this in cmd.exe, PowerShell and ConEmu all with the same result. This is my first time with chrome headless so I apologize if the answer is obvious.
Chrome's official blog recommends two methods for keeping Chromium from exiting after being launched from the command line. The blog explicitly and repeatedly mentions Windows in its instructions, so I assume they apply to the Windows version of Chrome as well:
Start the browser with remote debugging enabled by passing --remote-debugging-port=PORTNUM at the command line, or
Start Chrome in REPL mode by passing --repl at the command line. This will cause Chromium to persist as long as stdin remains open.
I've tested both approaches and can attest that both work with Chromium 108.0.5359.124 on Linux as of the time of this writing. For the sake of completeness, I've included the exact commands I used to confirm this below:
Remote Debugging
chromium --headless --temp-profile --password-store=basic --disable-gpu --remote-debugging-port=9222 https://example.com
REPL Mode
chromium --headless --temp-profile --password-store=basic --disable-gpu --repl https://example.com
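Because REPL mode persists only while stdin stays open, one way to control that from a script is to give the process a pipe for stdin. A rough sketch with hypothetical function names; it assumes a Chromium build that still accepts --repl, as the 108 build tested above did.

```python
import subprocess


def repl_command(chromium, url):
    """Argument list for starting headless Chromium in REPL mode."""
    return [chromium, "--headless", "--disable-gpu", "--repl", url]


def start_repl(chromium, url):
    """Start Chromium with a pipe on stdin, so the caller decides when it closes."""
    return subprocess.Popen(
        repl_command(chromium, url),
        stdin=subprocess.PIPE,
        text=True,
    )
```

A caller would write expressions to proc.stdin, then send "quit" and close the pipe, for example proc.stdin.write("quit\n"); proc.stdin.close(); proc.wait().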

Chrome Headless Doesn't work

I've read about Headless Chrome on developers.google.com, which says we can run Chrome without a UI. Quote from that page:
Headless Chrome is shipping in Chrome 59. It's a way to run the Chrome
browser in a headless environment. Essentially, running Chrome without
chrome! It brings all modern web platform features provided by
Chromium and the Blink rendering engine to the command line.
Why is that useful?
A headless browser is a great tool for automated testing and server
environments where you don't need a visible UI shell. For example, you
may want to run some tests against a real web page, create a PDF of
it, or just inspect how the browser renders an URL.
This is a really great feature, so I did some experiments with it. The idea is to take a screenshot of a site, as the documentation describes, by calling chrome.exe from the Windows Command Prompt, as follows:
chrome --headless --disable-gpu --screenshot https://www.chromestatus.com/
After trying several times and following the instructions from that site, I got nothing. I don't get any picture or screenshot named screenshot.png, even though the document says: Running with --screenshot will produce a file named screenshot.png in the current working directory.
The document also says this about versions:
Caution: Headless mode is available on Mac and Linux in Chrome 59.
Windows support is coming in Chrome 60. To check what version of
Chrome you have, open chrome://version.
After checking as suggested, I opened chrome://version in Chrome on my Windows x64 machine and got this result:
Google Chrome 62.0.3202.94 (Official Build) (64-bit) (cohort: Stable)
Revision 4fd852a98d66564c88736c017b0a0b0478e885ad-refs/branch-heads/3202#{#789}
What's wrong? What did I miss?
Thanks
After some experiments: --screenshot saves the image in the same location as chrome.exe, which means it is saved under Program Files.
So we need to combine the parameter name and its argument with a =
--screenshot="D:\screen.png" will work; otherwise Chrome writes to its installation folder. Big design flaw: no software should use its installation folder as a working directory.
Here is the complete command:
chrome --headless --enable-logging --disable-gpu --screenshot="D:\screen.png" "https://www.chromestatus.com/"
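The same command can be assembled in a script so that the screenshot path is always absolute and always joined to the flag with "=". This is only a sketch; screenshot_command is a made-up name, and the flags are the ones from the command above.

```python
from pathlib import Path


def screenshot_command(chrome, url, image_path):
    """Build the argument list with --screenshot=<absolute path>."""
    target = Path(image_path).resolve()
    return [
        chrome,
        "--headless",
        "--enable-logging",
        "--disable-gpu",
        f"--screenshot={target}",
        url,
    ]
```

Passing the list to subprocess.run(screenshot_command("chrome", "https://www.chromestatus.com/", r"D:\screen.png")) avoids the quoting that the command line needs.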