rendering xml (reversing pdftohtml) - html

I have a document (shown below) created with pdftohtml (a Linux package/command). It has enough information to render a page: for example, the page's height and width attributes, and text boxes with coordinates. How should I approach rendering it back to a readable, PDF-like view? I'd appreciate any pointers.
<page number="112" position="absolute" top="0" left="0" height="992" width="756">
<text top="78" left="108" width="540" height="21" font="22">This configures networking routes such that you have unique IP addresses assigned</text>
<text top="98" left="108" width="118" height="21" font="22">to every service of </text>
<text top="101" left="226" width="135" height="15" font="25">type: LoadBalancer</text>
<text top="98" left="360" width="4" height="21" font="22">.</text>
<text top="132" left="108" width="127" height="28" font="33"><b>Configuring DNS</b></text>
</page>

The whole point is that it should be an HTML view.
For example, if I print this page to a PDF (top) and run the pdftohtml command, I get the output (middle), with the browser's HTML view below it.
I can now print to PDF again. However, round-tripping a PDF has penalties: some links will work but others (like the package one) will not, and the same goes for some other objects. Note that the line starting "Also" may need its coordinates readjusted. After that correction, the easiest route is a single Chrome/Edge command line with --headless --print-to-pdf in place of my manual reprint.
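As for rendering the XML itself, here is a minimal Python sketch, assuming the input is a single <page> element shaped like the one in the question. It turns each <text> box into an absolutely positioned <div> inside a page-sized container. The function name page_to_html and the font-NN class names are made up for illustration. The font attribute is only carried over as a class, because the font definitions it refers to are not part of the excerpt.

```python
import html
import xml.etree.ElementTree as ET


def page_to_html(page_xml: str) -> str:
    """Render one pdftohtml <page> element as absolutely positioned HTML."""
    page = ET.fromstring(page_xml)
    lines = [
        '<div class="page" style="position:relative;'
        f'width:{page.get("width")}px;height:{page.get("height")}px">'
    ]
    for text in page.iter("text"):
        # Leading text of the box, escaped for HTML.
        inner = html.escape(text.text or "", quote=False)
        # Keep inline markup such as <b> by serialising the child elements.
        inner += "".join(
            ET.tostring(child, encoding="unicode", method="html") for child in text
        )
        style = (
            "position:absolute;white-space:nowrap;"
            f'top:{text.get("top")}px;left:{text.get("left")}px;'
            f'width:{text.get("width")}px;height:{text.get("height")}px'
        )
        # "font-NN" is a hypothetical class name; style it in your own CSS.
        lines.append(
            f'  <div class="font-{text.get("font")}" style="{style}">{inner}</div>'
        )
    lines.append("</div>")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = (
        '<page number="112" position="absolute" top="0" left="0" '
        'height="992" width="756">'
        '<text top="132" left="108" width="127" height="28" font="33">'
        "<b>Configuring DNS</b></text></page>"
    )
    print(page_to_html(sample))
```

The resulting HTML can be opened in a browser and, if needed, printed to PDF from there.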

Related

Adding google font Roboto in svg not displaying desired font [duplicate]

I'm working with an SVG pattern that uses a custom font, so as to use that pattern as a background image on an HTML page.
Everything renders fine in Chrome and Safari but it starts to get funny in Firefox:
Firefox renders the SVG along with the custom font text just fine when I open the SVG file itself (so far so good!);
However, Firefox does NOT render the custom font anymore when that same SVG file is used as the background to an HTML element (!)
I've spent hours trying to isolate the issue and a fresh pair of eyes would be welcome.
Here's what I've got in short:
CSS:
@import url(http://fonts.googleapis.com/css?family=Indie+Flower);
body {
background: url(pattern-google.svg);
}
SVG file:
<svg xmlns="http://www.w3.org/2000/svg" version="1.1" xmlns:xlink="http://www.w3.org/1999/xlink" height="200" width="200">
<style type="text/css">@import url(http://fonts.googleapis.com/css?family=Indie+Flower);</style>
<defs>
<!-- Geometry -->
<g>
<rect id="square" x="0" y="0" width="200" height="200" />
</g>
<!-- Patterns with Text -->
<pattern id="pattern" x="0" y="0" width="40" height="40" patternUnits="userSpaceOnUse" text-anchor="middle" font-size="20" font-family="Indie Flower, sans-serif" style="font-family: Indie Flower, sans-serif;">
<rect x="00" y="00" width="40" height="40" fill="transparent" />
<text x="00" y="00" fill="#777">S</text>
<text x="40" y="00" fill="#777">S</text>
<text x="20" y="20" fill="#777">S</text>
<text x="00" y="40" fill="#777">S</text>
<text x="40" y="40" fill="#777">S</text>
</pattern>
</defs>
<!-- Graphics -->
<use xlink:href="#square" transform="translate(0, 0)" fill="url(#pattern)"/>
</svg>
The HTML itself does not really matter but I've linked it below.
I did not produce a jsfiddle in the end because I could not host the SVG file there.
(Outside of the demo, the real-world application here is that I want to use a custom font to display phonetic symbols. (As background image, to help people learn them.) Doing so in SVG saves me the hassle to export to bitmap anytime I make a change in design.)
You are using SVG in an image context. I.e. either via the html <img> tag, the SVG <image> tag or in your case as a background image.
In Firefox (and likely in other UAs at some point) images must consist of a single file only. Any data external to the image file (pattern-google.svg) is ignored. If you display the SVG directly then external data is loaded/used.
So what can you do...
Load the data as a data URI, i.e. base64-encode http://fonts.googleapis.com/css?family=Indie+Flower (but read the final paragraph before you do this) and then put that data directly in the SVG file itself.
So the import would look like this...
@import url('data:text/css;base64,whatever the base 64 encoded data looks like...')
Be careful, though: http://fonts.googleapis.com/css?family=Indie+Flower itself references external data, so that data must also be encoded as a data URI. I.e. you must go all the way down the rabbit hole, changing that file as I've sketched out below.
@font-face {
font-family: 'Indie Flower';
font-style: normal;
font-weight: 400;
src: local('Indie Flower'), local('IndieFlower'), url(**convert this file to a data URI too before you convert the whole file to a data URI**) format('woff');
}
Once you've done that you can then encode the whole file as a data URI and @import it.
So, to reiterate step by step...
1. Convert http://themes.googleusercontent.com/static/fonts/indieflower/v5/10JVD_humAd5zP2yrFqw6nhCUOGz7vYGh680lGh-uXM.woff to a data URI
2. Replace http://fonts.googleapis.com/css?family=Indie+Flower with a version that has the data URI from step 1
3. Convert the file in step 2 to a data URI
4. @import the data URI from step 3
There are plenty of sites online that will create data URIs.
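If you would rather script it than use a website, the steps above can be sketched in Python roughly as follows. This is only an illustration: the function names are made up, the font/woff MIME type is an assumption, and downloading the stylesheet and font is left out, so you pass in the content you have already fetched.

```python
import base64


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Base64-encode raw bytes into a data URI."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_font_css(css_text: str, font_url: str, font_bytes: bytes) -> str:
    """Embed the font in the stylesheet, then encode the stylesheet itself."""
    font_uri = to_data_uri(font_bytes, "font/woff")   # step 1 (MIME type assumed)
    css_inlined = css_text.replace(font_url, font_uri)  # step 2
    return to_data_uri(css_inlined.encode("utf-8"), "text/css")  # step 3


def import_rule(css_data_uri: str) -> str:
    """Step 4: the rule to paste into the <style> element of the SVG."""
    return f"@import url('{css_data_uri}');"


if __name__ == "__main__":
    # Placeholder inputs; substitute the real stylesheet text and font bytes.
    css = "src: url(http://example.invalid/font.woff) format('woff');"
    uri = inline_font_css(css, "http://example.invalid/font.woff", b"abc")
    print(import_rule(uri))
```

Note that the result gets large quickly, since base64 adds roughly a third to the font's size and the font ends up encoded twice.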

Firefox A11y Audit with inline SVGs - content with images must be labeled

If I have the following SVG image for example:
<svg role="img" viewbox="0 0 100 50" height="100px">
<title>Site Logo</title>
<rect x="0" y="00" width="100" height="10" fill="red"></rect>
<rect x="0" y="10" width="100" height="10" fill="salmon"></rect>
<rect x="0" y="20" width="100" height="10" fill="pink"></rect>
<rect x="0" y="30" width="100" height="10" fill="aqua"></rect>
<rect x="0" y="40" width="100" height="10" fill="blue"></rect>
</svg>
I should be meeting the a11y guidelines for SVG by setting role=img and including a <title> element.
However, when I run the Accessibility Audit in Firefox, it flags every element/graphic inside the SVG (path, rect, circle) with the following warning:
Content with images must be labeled. Learn more
But surely I don't need to mark up every individual path within the svg?
What should I do to improve a11y or indicate to FF what the correct alt text is?
Here's a demo page in fiddle that will reproduce this issue
I found that including a <title>Image title</title> tag within the <svg role="img"></svg> tags led Firefox to stop showing the error. Note the inclusion of the role attribute on the svg tag (as noted in an earlier comment), along with any other attributes you may need for the opening tag.
According to MDN Web Docs:
The <title> element provides an accessible, short-text description of
any SVG container element or graphics element.
So it works like the alt attribute of an <img> element. It sounds like a <desc> element could be used for additional descriptive information.
I am pretty sure this is a bug, or design flaw in FF's accessibility tool. I have reported it here.
Remember that automated accessibility audits can not catch every issue, and often report false positives. Try installing (e.g.) the WAVE accessibility add-on, which is another automatic accessibility auditor. It makes no such complaint.
w3 says
An img represents a single graphic within a document, whether or not
it is formed by a collection of drawing objects.
So you are right that role="img" on the SVG root should do The Right Thing. The accessibility API will not try to expose the children, but Firefox's current beta version of the accessibility tool obviously does.
I tried your code (wrapped in a bare bones HTML doc) with a screen reader (NVDA) and it didn't try to announce the rects, which is what I would expect. It did announce the accessible name. Actually it announces it twice (which is a known NVDA bug at time of writing).
I also tried putting a <g role="presentation"> element around the contents of the svg, but the accessibility tool still flagged warnings on all the children. This shouldn't be necessary.
So, I think you're good.
Elements with img role have the children presentational property set to true. So rect elements can't have an alternative name.
It's likely due to a bug in this Firefox plugin.
Note that (curiously?) the Accessible Name and Description guidelines state that the name can be:
generated from [...] a host language labeling mechanism, such as the alt or title attribute in HTML, or the desc element in SVG.
So according to this statement you should use the desc element. I'm not sure if it's an error in the documentation as the title element seems to be a more suitable choice.
This might come in late, but it helps all the time.
Since the svg does not have text, including aria-label="descriptive text here" and alt="svg description here" should help with the text label checks.
However, if the svg is just for presentational purpose, using empty alt helps out as well. This is to make screen readers recognise the svg, but will not describe the image (instead they'd just say "image", or similar).

How to print html content to a PDF with jasper reports?

I'm using PrimeFaces 5.3 in a web app along with the PrimeFaces <p:editor>, which saves its data to a database table. I would like to export/print that stored value to PDF with the formatting rendered, instead of printing out raw tags like
<br> or <ul> <li>, etc.
I'm using Jasper reports to create the report and then exporting to PDF, so I need to create "something" that can be printed with Jasper reports and that will take care of the HTML tags.
You can use the HtmlComponent <hc:html/>.
Example
<componentElement>
<reportElement x="0" y="100" width="230" height="110" backcolor="#ADD8E6" uuid="332dd551-e8cd-4cb0-a11f-7325f481017b"/>
<hc:html xmlns:hc="http://jasperreports.sourceforge.net/htmlcomponent" xsi:schemaLocation="http://jasperreports.sourceforge.net/htmlcomponent http://jasperreports.sourceforge.net/xsd/htmlcomponent.xsd" scaleType="FillFrame" horizontalAlign="Left" verticalAlign="Top">
<hc:htmlContentExpression><![CDATA["Hello<br/>World"]]></hc:htmlContentExpression>
</hc:html>
</componentElement>
It will generate an image of your html.

How to "use" local "defs" in SVG

I have multiple SVG pictures embedded in a single HTML page.
Every SVG has its own defs section that I reference from my use elements.
It looks like I can't define an element with the same id inside multiple defs sections and reference it.
The use in the second SVG picks up the definition from the first SVG's defs section and ignores the local redefinition.
Does anybody know how I can reference the LOCAL defs section?
The behaviour is the same in Chrome and Firefox.
See the example below:
<html><head></head><body>
<svg height="50" width="50">
<defs>
<rect id="mybox" height="40" width="40" style="fill:#00F;"></rect>
</defs>
<use xlink:href="#mybox"/>
</svg>
<svg height="50" width="50">
<defs>
<rect id="mybox" height="20" width="20" style="fill:#F00;"></rect>
</defs>
<use xlink:href="#mybox"/>
</svg>
</body></html>
An SVG file with multiple identical IDs is invalid per http://www.w3.org/TR/SVG/struct.html#IDAttribute
Your options are either make all the IDs unique or move the SVG into separate files and reference them via <object> or <iframe> tags.
I created a tool to randomize definition ids to avoid this issue with inline SVGs referencing the same #id; hopefully it will be useful for someone else. http://hugozap.com/randomize_svg_def_ids.html
One way to solve this is to use svgo:
svgo --enable=prefixIds *.svg
svgo can be installed via npm and is available as a library as well
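If you just want to see what "make all the IDs unique" amounts to, here is a rough Python sketch of the idea. It is not how svgo or the tool above work internally. The function name prefix_ids is made up, and the regexes assume double-quoted attributes and simple #id references in href="..." or url(...).

```python
import re


def prefix_ids(svg_markup: str, prefix: str) -> str:
    """Prefix every id in one SVG fragment and update its local references."""
    ids = set(re.findall(r'\bid="([^"]+)"', svg_markup))
    if not ids:
        return svg_markup

    def rename_id(match: re.Match) -> str:
        return f'id="{prefix}{match.group(1)}"'

    def rename_ref(match: re.Match) -> str:
        name = match.group(2)
        # Only rewrite references to ids defined in this fragment.
        if name in ids:
            return f"{match.group(1)}#{prefix}{name}"
        return match.group(0)

    out = re.sub(r'\bid="([^"]+)"', rename_id, svg_markup)
    # Covers href="#name" (including xlink:href) and url(#name).
    return re.sub(r'(href="|url\()#([^")]+)', rename_ref, out)


if __name__ == "__main__":
    first = '<svg><defs><rect id="mybox"/></defs><use xlink:href="#mybox"/></svg>'
    second = '<svg><defs><rect id="mybox"/></defs><use xlink:href="#mybox"/></svg>'
    # Give each inline SVG its own prefix before writing it into the page.
    print(prefix_ids(first, "svg1-"))
    print(prefix_ids(second, "svg2-"))
```

Each SVG then refers only to its own definitions, because no two fragments share an id any more.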