Need to display page numbers in Table of contents - puppeteer

I am just printing the HTML page, I need to show the page numbers opposite to the titles in Table of contents. I investigated a lot, but couldn't find a proper solution.
While investigating I found target-counter but it is not working for me. Seems it is explicitly providing support for PrinceXML, but I am using puppeteer to convert HTML content to PDF, since PrinceXML doesn't support .Net Core, I have opted Puppeteer Sharp.
So is there any way I could get page numbers in the table of contents.

Related

How to generate preview for a new post written in html by parsing

In my web application, a user may make a post with images, embed videos and text with different styles. I want to generate a preview for the post to show it on the front page of the web application. It demands that it doesn't take too much space and as clear as possible.
I know that I need to parse the post html first and may extract image elements first. My consideration for the text is simply to extract all plain texts and show part of them.
Could someone provide other advice, methods or resource about this problem?
As I understand, you want to render a HTML page into an image? You should use some layout engine, such as WebKit or Gecko on server side.
Another option is to use some third-party online tool for these previews. But rendering the page is pretty hard process, because of it's time complexity and memory space requirements for storing images. I have found these services:
http://www.thumboo.com/
http://www.thumbalizr.com/
http://www.zubrag.com/scripts/website-thumbnail-generator.php

Good PDF to HTML Converter for Mobiles

We are having Multiple PDF which have account tables and balance sheet within it. We have tried many Converters but the result is not satisfactory. Can anybody please suggest any good converter that would replicated the contents of PDF to Exact structure in HTML. IF any paid Converter is there please suggest me .
This is the PDF we want to convert and Show in html "http://www.marico.com/html/investor/pdf/Quarterly_Updates/Consolidated%20Financial%20Results%20-%20Q3FY11.pdf"
Have you looked into this? http://pdftohtml.sourceforge.net/
It's open source as well, so it's free and can be modified if necessary.
There's even a demo showing the before PDF and the after HTML version. Not bad if you ask me.
If you're having issues specifically with tables in PDFs, perhaps the issue are the table themselves and whatever program is being used to generate them. Not all PDFs are created equal.
ALSO: Be aware that all PDFs that I've created and come across over the years have had lots of issues when it comes to copy/pasting blocks/lines of text that have other blocks/lines of text at equal or higher height on any given page. I think Acrobat lacks the ability to define a "sequence order" of what block is selected after what (or most programs don't use it properly), so the system sorta moves from a top-down, left-to-right way of selecting content.....even if that means jumping over large blank areas or grabbing lines from multiple columns at once when you wouldn't expect it. This may be part of your tabular data issue. Your weak link here is the PDF format itself and I think perhaps you may be expecting too much from it. Turning anything into a PDF is pretty much a one-way street, especially when you start putting lots of editable text into it.
Have you tried http://www.jpedal.org/html_index.php - there is also a free online version

Recommended approach for presenting formatted text in Android?

I am developing an application that will provide instructions to making a product. The text will have bullets and/or numbered steps as well regular text paragraphs. I may have headings for various sections. The text will be placed into a scrollable TexView.
I was originally planning on loading the text from a resource text file and then applying formatting via xml. However, I just learned about WebView and the ability to load local html files. I could easily format the text in html and load it into a WebView for the various activities.
My question is, is there a performance issue with using WebView vs. TextView? Are there other ways to easily format text for a TextView?
Thanks,
WebView definitely takes longer to load the first time into your process. It also is not designed to go in a ScrollView, since it scrolls itself. OTOH, you get excellent HTML support.
TextView can display limited HTML, converted into a SpannedString via Html.fromHtml(). Here is a blog post where I list the HTML tags supported by the Android 2.1 edition of fromHtml(). Note that these are undocumented, and so the roster of tags may be different in other Android releases.

HTML printing - what methods are there to make an html based printout? What are the pros and cons?

I have a report I need to print out in an application I'm usually doing maintenance for. My question, which interests me beyond the scope of this task is, what are the ways to format an HTML page for printing? What are the pros and cons of each?
Note that the page is meant only to be printed. I'm not asking about an HTML page that looks ok also when printed.
Generally speaking, I know I can either rely heavily on <table>s or on <div>s, but I don't know which way to go.
I would also appreciate some resources to get me started, or to help with known problems, in any method you suggest.
Thanks,
Asaf
As you can certainly see, printing and web presentation are two different creatures. The main issue is the bounds of the printed page, which does not exist in a web page. Even if you think you have a page laid out in a manner that will fit a printed page, then you need to deal with the fact that the font you are using may not work or scale correctly on the user's printer.
I know of three ways to deal with this issue:
Use fixed-sized fonts (like Courier), limit yourself to an 80 column width, and only use font characters: meaning use something like asterisks for borders, etc. This is VERY old school - your reports look simple and old and plain. But, they will always print they way you intended.
Convert your report to an image. Images can be made to confirm to a specific size which can fit on a page. However, you can still have issues due to printer margin settings.
Let another application do the work for you. What I mean by this is put your report into a PDF or a spreadsheet. Both PHP and Perl have easy to use modules for creating a PDF - with no licensing needed. Perl has a fantastic spreadsheet module. This route takes a little learning up front, but frees you from having to be an expert on printing (which can be a real pain).
In case you DO want to have a page that also looks good when viewed in a browser, consider multiple stylesheets for different medias.

Do you know a webpage appearance comparator?

I need a tool to compare the design of a website, I do not want to compare the HTML code only, but the output design.
Is this even possible? also is there any opensource program of this kind?
I have searched google, but I only get one candidate so far which is an HTML Match.
In modern webpages the appearance is controlled by various 'things': html code, css styles and images at least (also javascript in some pages). Simple text-based diff programs are not enough because their output can be irrelevant to the webpage appearance (i.e. cleaning up css can show many differences but the rendered webpage remains the same).
For simpler pages HTML Match mentioned above could do the job. If I have to compare the design of two "complex" pages (including layout, space, image and text changes) I would do a two-step approach:
Run a diff tool on the html sources to highlight the textual content differences. Then I would modify one of the pages to show the same content as the other (in order to make the next step more accurate and 'focused' to show 'real' layout changes). Of course it works only with very similar html.
Load the pages in the same web browser, get some screenshots from the rendered output at fixed positions and compare the images (i.e. with ImageMagick). It should show all visual differences in the rendered output.
It is not perfect but should work.
[UPDATE] HTML Match seems dead, see this answer for an alternative solution.
Solution: “compare web pages” tool. (“We've been doing it since 1999. It's free.”)
Example output (comparing pages for TP-Link USB hub model UH700 and UH720):
Under windows:
http://www.htmlmatch.com/
If you are using KDE, you can use Kompare or KDiff3.
However, if you want to view how your web page looks in different browsers in different operating systems, BrowserShots can used.
There are these online tools - that aren't brilliant:
http://www.w3.org/2007/10/htmldiff
http://www.aaronsw.com/2002/diff/
I like the look of daisydiff but have not used it in anger: http://code.google.com/p/daisydiff/
The keyword you're looking for is "diff".
A good program that can show you the differences between two files (html markup or other) would be ExamDiff for windows.
I'm working on one and i tell you it's hard and there is nothing on the market. Maybe Google and Bing have something inhouse. You can use some image comparison tools which identify rectangle regions of changed images. This is for example a part of all modern video compression but you have to do it for different regions of the webpage (the nav bar section, the main article, the region filtered by an ad-blocker etc.) as some of them may change and it's still considered the same content on the page.
As i said very complex problem with no exact solution.
The other is going the non visual way and just compare the resulting computed computer styles of each html element. You have to hack the browser to get access to the layout tree. There is also no official API or existing library/program/hack/patch for it.
You can make a visual comparison with Araxis Merge Pro by taking screen output with systems like BrowserStack, Cross Browser, PhantomJS