QWebEngine: export HTML pages > 2 MB to PDF

I’m having trouble displaying a dynamically built HTML page in QWebEngine because, as the documentation notes (https://doc.qt.io/qt-6/qwebengineview.html#setHtml), content larger than 2 MB cannot be displayed directly using setHtml().
As suggested in some SO posts, dumping my HTML to a local file and then displaying it via QWebEngineView::load(QUrl::fromLocalFile()) let me work around this limitation.
However, I would also like to export this page as a PDF. I tried QWebEngineView::printToPdf(), but it fails for such “big” pages (it works for normal pages).
Do you know how I could work around this HTML-to-PDF export issue in Qt? I guess I could use a third-party library or tool but, if possible, I’d like to avoid complicating the code too much.
Thanks in advance for your help!

Related

Python Jupyter notebook: heading numbering disappears after converting to HTML

I used the Table of Contents (2) extension from Nbextensions to create a table of contents. The titles were written in Markdown.
Everything works fine; it looks pretty good in notebook mode.
But after I converted the .ipynb file to HTML, the number in front of each title disappeared. I used the menu File -> Download as -> HTML to do this.
I also tried the option File -> Download as -> HTML with toc. Although it generates the desired numbering, it is still not what I want, because it produces not only an HTML file but also multiple image files if the notebook contains plots.
Does anyone have a good idea?
I just need a SINGLE HTML file with everything embedded.

The Nbextensions numbering is preserved if you download the notebook as PDF (after installing the relevant packages).
As for the HTML version, I haven't found an answer yet and would be glad of help too.

The notebook uses its own formatting for content. The export options only let you save your code so that you can send it to others.
It is better to keep the data in notebook form to preserve the formatting. Even exporting to PDF won't reproduce the formatting exactly as the notebook shows it, but it will be better than HTML.
If you still want HTML, you can format it manually as a web page. The formatting may improve in a future version of nbconvert.

Edit Content of website without opening HTML file

Is there a way to edit or change text or pictures on my website through its interface rather than in the HTML file? My client is wondering how they can update the "Events" box (they don't know how to use HTML). I'm really new at this and open to any suggestions. Thank you.

It depends on what you are using. If you are using a CMS, that is possible.
You cannot edit a page straight from the browser, without opening its files, unless you have some external help.

It seems that you want to edit the content of your website. Yes, it's possible without opening any file, using your CMS dashboard. If the content is static, then you have to open the PHP/HTML file.

How to integrate a json data frame into an html file

I'm currently "learning" d3js by myself and I found a lot of examples here. It seems that all the visualizations need two separate files: one is a script (an HTML file) and the other is a JSON file that contains the data set.
I'm curious whether there is a way to put the JSON data into the HTML file so that there is only one file. I think I saw an example like this on the internet but I lost it.
The only reason I want to do this is that when the data set is separate from the HTML file, I cannot use Chrome to view my result (I think Chrome blocks the script from reading a local data set). I can open my result in Firefox, but the animation doesn't run smoothly.
Maybe some of my understanding is not quite right. If you have any suggestions, please let me know. Thanks in advance.

If you're just using one HTML file, you probably have a <script> tag on your page where all the code is located. You can define your data there as a JavaScript array.
It can be nice to use multiple files to organize code, data, and view elements (the HTML). This page gives some help on setting your browser to let you do that. For Chrome, close all open windows, then run Chrome from the 'Run' prompt with this flag: chrome --allow-file-access-from-files
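
As a rough sketch of the single-file approach from this answer, the Python script below writes a data set directly into the <script> tag of an HTML page, so the page never has to request a separate JSON file. The function name inline_json, the /*DATA*/ placeholder, the dataset variable, and the sample data are all hypothetical and chosen only for illustration; the d3js drawing code would go where the console.log call is.

```python
import json

# Hypothetical page template: /*DATA*/ marks where the data literal goes.
TEMPLATE = """<!DOCTYPE html>
<html>
<body>
<script>
// The data set lives in the page itself, so no local file is read.
var dataset = /*DATA*/;
console.log(dataset.length);
</script>
</body>
</html>
"""


def inline_json(template, data, placeholder="/*DATA*/"):
    """Return the template with the placeholder replaced by a JSON literal."""
    payload = json.dumps(data)
    # Escape "</" so a string such as "</script>" inside the data
    # cannot close the surrounding script element early.
    payload = payload.replace("</", "<\\/")
    return template.replace(placeholder, payload)


if __name__ == "__main__":
    sample = [{"name": "a", "value": 1}, {"name": "b", "value": 2}]
    print(inline_json(TEMPLATE, sample))
```

Redirecting the printed output to a file (for example chart.html, an assumed name) gives one self-contained page that can be opened from disk in Chrome without the command-line flag.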

Convert webarchive to html

I managed to capture the behavior of a complex web site in a webarchive. I would now like to turn that webarchive into a set of HTML files in nested directories. Yet when I did it, both with Waf and with commercial software bought on the Apple store, all I got was the nested directories with the HTML page at the bottom and no images, no CSS, and no working links.
If you are interested the webarchive document is at:
http://www.miafoto.it/it/GiroMilano.webarchive
while the weak product of the extraction is at:
http://www.miafoto.it/it/Giromilano/Pagine/default.aspx
and the empty directories above.
Apart from the different look, the webarchive behaves like the official web site when a list box value is selected and the button is pushed, while the extracted version produces a page with no content because it loads itself rather than the official page.
As you can see, the webarchive is over 1 MB while the extraction is just a little over 1 KB.
What is wrong with it, and how can I perform such an apparently trivial task with usable results?
Thanks,

textutil -convert html example.webarchive
Be careful: the HTML and its resource files are created in the same folder as the webarchive!
Also, I had to open the .html file in a text editor and fix the "file:///image.tiff" links (replace "file:///" with "") so that they point to relative paths.
Also, not all browsers display .tiff images.
Who knew we have Stack Overflow wiki?
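
The manual link fix described in this answer could be scripted. The following is a minimal sketch, assuming the converted page contains links of the form quoted above; the function name make_links_relative is hypothetical. Note that it is narrower than a plain search-and-replace: it only strips "file:///" at the start of src and href attribute values and leaves other occurrences in the text alone.

```python
import re

# Matches src="file:///... or href='file:///... and captures the attribute
# prefix (name, equals sign, opening quote) so that it can be kept.
_FILE_URL = re.compile(r"""((?:src|href)\s*=\s*["'])file:///""", re.IGNORECASE)


def make_links_relative(html):
    """Strip the file:/// prefix from src/href values, leaving relative paths."""
    return _FILE_URL.sub(r"\1", html)


if __name__ == "__main__":
    print(make_links_relative('<img src="file:///image.tiff">'))
```

To apply it to a real file, read the .html produced by textutil into a string, pass it through the function, and write the result back.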

I find that WebArchiveExtractor.app works on my Mac (macOS Mojave):
https://robrohan.github.io/WebArchiveExtractor/

I solved the issue by finding all the parameters submitted by the page and submitting them from my script, ignoring the webarchive.

To save HTML pages on a Mac, I use Chrome. Download and install it, then save your page as HTML. Safari saves web pages in webarchive format, which I find very hard to deal with.

PdfSharp, GDI+ and HTML printing

I currently have a "PrintingWebService" that I call from an AJAX page with all the information needed to construct a highly customized PDF printout. It uses PdfSharp in its GDI+ mode, which takes DrawString and other commands that work basically like GDI+, except that they draw to the PDF.
I then save the PDF file to a location on the web server and return the file name from the web service, and the AJAX page opens a new window with the PDF file.
So far it works well. However, there is one part of my AJAX page that I want to print out and haven't found a solution for yet: I have a string with the HTML content of a TinyMCE editor that I want to display in the bottom part of the PDF page.
I'm looking for some sort of tool I could use for this purpose. Even something open source that prints to GDI+ would do; I could take the source code and translate it to use PdfSharp's GDI+ classes (the class names are like XGraphics, with each class having X before the GDI+ name).
If I have to, I will limit what HTML TinyMCE can generate and write my own renderer, but that would be a big challenge, so I'm looking for other solutions first.
I've stayed away from a printer-friendly page approach because I wanted to construct a page nearly identical to an existing WinForms printout, using my existing code. With PdfSharp I was able to convert all the code except the text area part (which used RichTextBox and RTF in the WinForms version).

Tony,
I have personally used WebSupergoo's ABCPdf library with much success. You can render HTML directly to the PDF, and it does fairly well in terms of accuracy.
Another library that lets you write HTML to PDF, which is free and which I have used in the past with much success, is iTextSharp.
Otherwise, I think you'll have to write something to render HTML to GDI.
Either way, you may want to consider generating the PDF file in an HttpHandler that you map in your web.config. This lets you render the PDF to a byte stream and send it directly to the user (as opposed to saving each PDF receipt on the web server). It also lets you use the .pdf extension in the URL that returns the receipt (PurchaseReceipt.pdf could be mapped to an HttpHandler), making it more cross-browser friendly. Older versions of Adobe Reader and of some browsers will not display the document correctly if you send a PDF byte stream from an ASPX page.
Hope this helps.
