How to save the webpage as .mht in puppeteer - puppeteer

How can I save a webpage as .mht in Puppeteer?
There are options for taking a screenshot and for saving as PDF, but there is no option to save as HTML or other formats.

It seems like you could set browser flags to enable saving as MHTML.
But there is no way to access the browser's context menu to trigger "Save as".
The easiest option would be to save as PDF and then convert the PDF to MHTML.
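Another route that may avoid the context menu entirely is the Chrome DevTools Protocol, which has a Page.captureSnapshot method that returns the page as MHTML. The sketch below is only an illustration: CdpSessionLike and captureMhtml are names made up for this example, the interface is a stand-in for Puppeteer's CDPSession so the snippet needs no imports, and the protocol method is marked experimental, so it may change between Chromium versions.

```typescript
// Stand-in for Puppeteer's CDPSession: only the one method this sketch needs.
// (Hypothetical interface; a cast may be needed when passing a real session.)
interface CdpSessionLike {
  send(method: string, params?: Record<string, unknown>): Promise<unknown>;
}

// Ask the browser for an MHTML snapshot of the page attached to `session`.
async function captureMhtml(session: CdpSessionLike): Promise<string> {
  const result = (await session.send("Page.captureSnapshot", {
    format: "mhtml",
  })) as { data?: unknown };
  if (typeof result?.data !== "string") {
    throw new Error("Page.captureSnapshot returned no MHTML data");
  }
  return result.data;
}

// Possible usage with Puppeteer (not executed here; assumes `puppeteer`
// and `fs` are imported and the code runs inside an async function):
//   const browser = await puppeteer.launch();
//   const page = await browser.newPage();
//   await page.goto("https://example.com", { waitUntil: "networkidle0" });
//   const session = await page.target().createCDPSession();
//   fs.writeFileSync("page.mhtml", await captureMhtml(session));
//   await browser.close();
```

The snapshot comes back as a plain string, so writing it to a .mht or .mhtml file is left to the caller.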

Related

PDF link and accessibility -- download yes or no?

I'm creating a site with downloadable PDF links in which we add a parameter at the end of the file download URL to tell the browser how to serve up the file:
https://example.com?ref=0&download=y
Using the parameter download=y opens the browser's file download dialog, asking the user to save the file to their desktop so they can open it with their machine's default PDF viewer.
Using download=n will open the browser's built-in PDF viewer, allowing the user to read the PDF without saving it to their machine.
I'm trying to understand which approach is more accessible for mobile / desktop / users with disabilities. Is one approach inherently better than the other from an accessibility perspective?
You could always let the user decide. If your link went directly to a PDF, then the user can change their browser settings to either view the PDF within the browser or to use an external viewer. I much prefer that over the web developer trying to choose for me (no offense). Personally, I like to view the PDF externally in Acrobat because the screen reader handles the PDF pretty well if the PDF is tagged. If you don't have a tagged PDF, then it won't matter how you serve up the file because the visually impaired user will have a tough time reading it.
Letting the user decide is the correct approach. This isn't a setting that you should be attempting to configure for accessibility purposes.
Having properly tagged PDF documents is vastly more important.
It's also good practice that any HTML links to PDF documents be labeled as such in the anchor text.
e.g. Title of Document (PDF)
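For reference, the download=y / download=n switch from the question usually comes down to which Content-Disposition type the server sends. A minimal sketch of that mapping follows. The function names are invented for this example, treating anything other than "y" as inline is an assumption, and the filename is assumed to be plain ASCII without quotes.

```typescript
type Disposition = "attachment" | "inline";

// Map the `download` query parameter to a Content-Disposition type.
// Assumption: only an explicit "y" forces a download; everything else
// (including a missing parameter) falls back to inline viewing.
function dispositionFor(requestUrl: string): Disposition {
  const params = new URL(requestUrl).searchParams;
  return params.get("download") === "y" ? "attachment" : "inline";
}

// Headers a server might send along with the PDF bytes.
// Assumption: `filename` is plain ASCII and contains no quotes.
function pdfHeaders(requestUrl: string, filename: string): Record<string, string> {
  return {
    "Content-Type": "application/pdf",
    "Content-Disposition": `${dispositionFor(requestUrl)}; filename="${filename}"`,
  };
}
```

Even with "inline", the browser's own PDF settings still decide whether the file is shown or saved, which is in line with the answers above.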

Open link in new tab of browser from PDF file

I have created a PDF file programmatically using an HTML-to-PDF conversion library. The PDF contains links to some pages of a website.
I wrote the HTML as follows and then converted it to PDF:
<a target="_blank" href="http://mywebsite_url_here.html">Link</a>
But when I open this PDF in Chrome or Firefox and click any link in it, the link opens in the same tab instead of a new tab. Please help me find a solution, so that the PDF stays in one tab and the link opens in another.
I have already tried
target="_blank"
target="_top"
<a onclick="window.open ('http://mywebsite_url_here.html', '');
return false" href="javascript:void(0);"></a>
But nothing has worked for me.
Short answer: It is not possible in a cross-platform, guaranteed-to-work way.
Long answer: Hyperlinks in a PDF are different from Hyperlinks in HTML. PDF was not designed to be viewed as part of a browsing experience. Hence there is no option available for PDF Hyperlinks to open them in a new tab, because PDF does not know about the concept of tabs.
There is some discussion in Adobe's forums about it, which boils down to "not directly possible, but you could embed JavaScript in the PDF to do it". They give an EPS file as an example:
%!PS-Adobe-3.0 EPSF-3.0
%%BoundingBox: 0 0 100 100
%%EndProlog
[ /Rect [ 0 0 100 100 ]
/Action << /Subtype /JavaScript /JS (app.launchURL\("PLACE-YOUR-URL-HERE", true\);) >>
/Subtype /Link
/ANN pdfmark
%%EOF
Now before you try and get this EPS file embedded in your PDF, be aware that Chrome's PDF viewer has very little support for embedded JavaScript, so it is not guaranteed to work. It may also issue a warning to the User that there is JavaScript code going to be executed if they click on it. I would say it isn't worth the hassle.

Accessing visible PDF with Chrome Extension Content Script

I'm trying to build a Chrome extension to access the binary of a PDF in the case of a user viewing that PDF in a tab in Chrome. I thought that I would use a Content Script to get around file permission issues, and now I'm trying to figure out how to access the binary of the PDF via PDFium through the DOM. Is this possible? Has anyone found a good solution for this? I don't need to do any rendering, I just need the binary representation of the rendered file.

How to force a Download File prompt instead of displaying it in-browser with HTML?

<a href="..." target="_blank">Download</a>
When I click the Download button, target="_blank" opens the file in a new window.
But I need it to prompt a dialog for saving the file. How can I achieve this?
This is not something you can fully control with HTML alone.
If the user has a browser with PDF reading capabilities (or a plugin) and settings that open PDF files in-browser, the PDF will open that way.
The PDF opens in a new tab simply because of your target="_blank", which has nothing to do with a download prompt.
If you are using HTML5 you can use the download attribute:
<a href="path_to_file" download="proposed_file_name">Download</a>
If you control a back-end service, or feel like fiddling with your web server, you can look into setting the right Content-Disposition header. See this SO question for some nice discussion on Content-Disposition.
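One detail that tends to come up when setting Content-Disposition by hand is non-ASCII file names. The sketch below builds an attachment header with both a plain filename fallback and an RFC 5987 encoded filename* parameter. The helper names are hypothetical, and replacing unsupported characters with "_" in the fallback is just one possible choice.

```typescript
// Percent-encode a value for the `filename*` parameter (RFC 5987).
// encodeURIComponent leaves ' ( ) * unescaped, so they are encoded here.
function encodeRfc5987(value: string): string {
  return encodeURIComponent(value).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase(),
  );
}

// Build a Content-Disposition value that asks the browser to download.
function attachmentHeader(filename: string): string {
  // ASCII-only fallback: non-ASCII characters, quotes and backslashes
  // become "_" (an arbitrary choice made for this sketch).
  const fallback = filename.replace(/[^\x20-\x7e]|["\\]/g, "_");
  return `attachment; filename="${fallback}"; filename*=UTF-8''${encodeRfc5987(filename)}`;
}
```

Browsers that understand filename* are expected to prefer it over the plain filename value, while older clients fall back to the ASCII name.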

Links on a webpage to either view or download a PDF

I have never thought about this before, but is there a way to control what happens when a user clicks a link to a PDF file?
My boss would like to offer two links to do the following:
1. View this PDF in the browser
2. Download the PDF
Is there a way to do this? I don't usually think about these kinds of things; most modern browsers will open a PDF in the browser, and if I want to download it, I right-click and download. Is there any way to force the action?
Thanks
You can link to an asset, or you can stream the data to the browser. Those are the only two options you have on your end.
If you link to a file: <a href="file.pdf">, whether it opens in the browser is entirely dependent on the end user's browser and operating system preferences.
You can, however, force the download of a file by streaming it to the browser, which will usually trigger the browser's save-as dialog.
Your best bet, though, from a user experience perspective, is to simply link to the PDF and let the user know that they are about to click on a PDF link...that way they can decide what they do with it.
How PDFs are displayed depends on the user's browser version and configuration. For example, Chrome includes a PDF viewer by default, but the user can change the behavior of the plug-in (automatically open PDFs, disable, ask the user).
One way to do this is to set the ContentType and Content-Disposition so the browser will know how to handle the request. For example in ASP.NET you would do it like this:
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", "attachment;filename=filename.pdf");
Disclosure: I hijacked this code from this article
Let me know if this helps.
There are a few ways using server side technologies:
you can link to a .net / php page that serves the file to download, eg:
download pdf
or to display:
view pdf
If you are using iTextSharp to generate your PDF, you can add the following to the Response object to force a download:
Response.AddHeader(
"Content-Disposition",
"attachment; filename=itext.pdf"
);
or the following to open in the same window:
Response.AddHeader(
"Content-disposition",
"inline; filename=itext.pdf");
The user can always set their Adobe Reader plugin to always download, in which case displaying in the browser window won't work.