iTextSharp HTML to PDF conversion - unable to change font

I'm creating some PDF documents with iTextSharp (5.5.7.0) from HTML in an ASP.NET MVC 5 application, but I'm unable to change the font. I've tried almost everything that I was able to find on SO or in other resources.
Code for PDF generation is as follows:
public Byte[] GetRecordsPdf(RecordsViewModel model)
{
    var viewPath = "~/Template/RecordTemplate.cshtml";
    var renderedReport = RenderViewToString(viewPath, model);
    FontFactory.RegisterDirectory(Environment.GetFolderPath(Environment.SpecialFolder.Fonts));
    using (var ms = new MemoryStream())
    {
        using (var doc = new Document())
        {
            doc.SetPageSize(PageSize.A4.Rotate());
            using (var writer = PdfWriter.GetInstance(doc, ms))
            {
                doc.Open();
                using (var html = new MemoryStream(Encoding.Default.GetBytes(renderedReport)))
                {
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, html, Encoding.Default);
                }
                doc.Close();
            }
        }
        var bytes = ms.ToArray();
        return bytes;
    }
}
Actual HTML is contained in renderedReport string variable (I have strongly typed .cshtml file which I render using MVC Razor engine and then return HTML in string).
I've tried to register some specific fonts, but that didn't help. I've also tried to register all fonts on my machine (as shown in the example above), but that didn't help either. The fonts were loaded; I've checked that in debug mode.
CSS is embedded in HTML file (in heading, style tag) like this:
body {
font-size: 7px;
font-family: Comic Sans MS;
}
(For testing, I've decided to use Comic Sans because I can recognize it easily; I'm actually more interested in Arial Unicode MS.)
I am able to change the font with the font-family property from CSS, but only to fonts that iTextSharp preloads by default: Times New Roman, Arial, Courier, and some others (Helvetica, I think). When I change it to Comic Sans, or some other font that is not preloaded, iTextSharp renders with the default font (Arial, I would say).
The reason why I need to change the font is because I have some Croatian characters in my rendered HTML (ČĆŠĐŽčćšđž) which are missing from PDF, and currently I think the main reason is - font.
What am I missing?

A couple of things to make this work.
First, XMLWorkerHelper doesn't use FontFactory by default; you need to use one of the ParseXHtml() overloads that takes an IFontProvider. Both of those overloads also require a Stream for a CSS file, but you can just pass null if your CSS lives inside your HTML file. Luckily, FontFactory has a static property that implements IFontProvider, called FontFactory.FontImp:
// **This guy**
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHTML, null, Encoding.UTF8, FontFactory.FontImp);
Second, I know that you said that you tried registering your entire font directory out of desperation but that can be a rather expensive call. If you can, always try to just register the fonts you need. Although optional, I also strongly recommend that you explicitly define the font's alias because fonts can have several names and they're not always what we think.
FontFactory.Register(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "comic.ttf"), "Comic Sans MS");
Third, and this might not affect you, but any tags not present in the HTML, even if they're logically implied, won't get styling applied to them from CSS. That sounds weird so to say it differently, if your HTML is just <p>Hello</p> and your CSS is body{font-size: 7px;}, the font size won't get applied because your HTML is missing the <body> tag.
Fourth, and this is optional, but it's usually easier to specify your HTML and CSS separately from each other, which I'll do in the example below.
Your code was 95% there so with just a couple of tweaks it should work. Instead of a view I'm just parsing raw HTML and CSS but you can modify as needed. Please do remember (and I think you know this) that iTextSharp cannot process ASP.Net, only HTML, so you need to make sure that your ASP.Net to HTML conversion process is sane.
//Sample HTML and CSS
var html = @"<body><p>Sva ljudska bića rađaju se slobodna i jednaka u dostojanstvu i pravima. Ona su obdarena razumom i sviješću i trebaju jedna prema drugima postupati u duhu bratstva.</p></body>";
var css = "body{font-size: 7px; font-family: Comic Sans MS;}";
//Register a single font
FontFactory.Register(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "comic.ttf"), "Comic Sans MS");
//Placeholder variable for later
Byte[] bytes;
using (var ms = new MemoryStream()) {
    using (var doc = new Document()) {
        doc.SetPageSize(PageSize.A4.Rotate());
        using (var writer = PdfWriter.GetInstance(doc, ms)) {
            doc.Open();
            //Get a stream of our HTML
            using (var msHTML = new MemoryStream(Encoding.UTF8.GetBytes(html))) {
                //Get a stream of our CSS
                using (var msCSS = new MemoryStream(Encoding.UTF8.GetBytes(css))) {
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHTML, msCSS, Encoding.UTF8, FontFactory.FontImp);
                }
            }
            doc.Close();
        }
    }
    bytes = ms.ToArray();
}

Related

HTML Renderer - letter-spacing CSS not working

I'm generating a PDF from HTML using HTML Renderer, but the letter-spacing CSS property is not working:
h1 {
text-align: center;
font-size: 2.2em;
letter-spacing: 3px;
}
Generating PDF like below:
var config = new PdfGenerateConfig();
config.PageOrientation = PageOrientation.Landscape;
config.PageSize = PdfSharpPageSize.A4;
string cssStr = File.ReadAllText(folderPath + "1.css");
CssData css = PdfGenerator.ParseStyleSheet(cssStr);
PdfSharp.Pdf.PdfDocument pdf = PdfGenerator.GeneratePdf(html, config, css);
MemoryStream stream = new MemoryStream();
pdf.Save(stream, false);
byte[] bytes = stream.ToArray();
File.WriteAllBytes(folderPath + "document.pdf", bytes);
Also tried inline CSS style, also not working:
<h1 style="letter-spacing:3px;">
Maybe one of these links could be useful?
https://forum.pdfsharp.net/viewtopic.php?p=5793
https://forum.pdfsharp.net/viewtopic.php?p=9372
It seems as if you have to use the 'XTextFormatter' class to adjust word spacing in PDFsharp.
I haven't tested it, but it seems to be the solution to your problem.

Some local fonts don't render in the browser

I've created a small app to view all local fonts on a browser. (Demo).
I'm using Flash to get font names:
import flash.external.ExternalInterface;
ExternalInterface.call("getFontList", getDeviceFonts());
function getDeviceFonts(): Array {
var embeddedAndDeviceFonts: Array = Font.enumerateFonts(true);
var deviceFontNames: Array = [];
for each(var font: Font in embeddedAndDeviceFonts) {
deviceFontNames.push(font.fontName);
}
deviceFontNames.sort();
return deviceFontNames;
}
and later process the names and create a 4 column layout to view them:
function getFontList(fontList) {
$("#flash").remove();
fontList.forEach(function (fontName) {
var cell = elem("div", {"class":"tile"}, [
elem("div", {"class":"fontName"}, fontName),
elem("div", {"class":"demo", "style":"font-family: \"" + fontName.replace(/"/g, '\\"') + "\", Consolas, Arial;"}),
]);
output.append(cell);
});
updateCells($("#input").val());
}
This is working fine, but there's a slight problem. Some fonts don't render in the browser, and the fallback font is applied.
For example, this font: !PaulMaul. I can view this on MS Word and Photoshop:
and this is how it looks (Consolas applied) on the browser:
I haven't encountered anything like this before, so I don't know what the problem is. I thought any local font was usable in the browser, especially if MS Word and Photoshop can render it fine. I guess that isn't true.
What exactly is the problem here, and is there a way to solve this?
App is available on Github, if anyone wants to tinker.

iTextSharp support for HTML controls conversion C#

Does iTextSharp HTML to PDF conversion support controls like text boxes, buttons, etc.? Or do we need to use an iTextSharp class like TextField to implement controls during PDF conversion?
iTextSharp doesn't support conversion of text boxes, buttons, etc. Most likely, you'll need to implement your own logic if you want to convert an HTML page (with text boxes, buttons, etc.) to a PDF document. You can find all supported tags and styles here. You can also check whether an element is supported using this simple example:
byte[] bytes;
using (var stream = new MemoryStream())
{
using (var document = new Document())
{
using (var writer = PdfWriter.GetInstance(document, stream))
{
document.Open();
var html = @"<p>Before the button</p><br/><input type=""submit"" value=""Click me""/><br/><p>After the button</p>";
using (var reader = new StringReader(html))
{
XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
}
document.Close();
}
}
bytes = stream.ToArray();
}
File.WriteAllBytes("test.pdf", bytes);
If you run this example, you'll see that the input element is not a part of the final document:

code tag and pre css in html not functioning properly

In HTML I am using the code tag together with the CSS shown below:
<style type="text/css">
code { white-space: pre; }
</style>
<code>
public static ArrayList<File> getFiles(File[] files){
ArrayList<File> _files = new ArrayList<File>();
for (int i=0; i<files.length; i++)
if (files[i].isDirectory())
_files.addAll(getFiles(new File(files[i].toString()).listFiles()));
else
_files.add(files[i]);
return _files;
}
public static File[] getAllFiles(File[] files) {
ArrayList<File> fs = getFiles(files);
return (File[]) fs.toArray(new File[fs.size()]);
}
</code>
When I use the code tag as shown above, part of the code is missing when the HTML page is viewed. The rendered output is as shown below:
public static ArrayList getFiles(File[] files){
ArrayList _files = new ArrayList();
for (int i=0; i fs = getFiles(files);
return (File[]) fs.toArray(new File[fs.size()]);
}
In the first method some part is missing, and the second method does not appear at all. What is the problem and how can I fix it?
You have these <File> sequences inside your <code> tag, so the browser parses them as tags. You need to convert the < and > characters to the &lt; and &gt; HTML entities.
Demo
<code>
public static ArrayList&lt;File&gt; getFiles(File[] files){
ArrayList&lt;File&gt; _files = new ArrayList&lt;File&gt;();
for (int i=0; i&lt;files.length; i++)
if (files[i].isDirectory())
_files.addAll(getFiles(new File(files[i].toString()).listFiles()));
else
_files.add(files[i]);
return _files;
}
public static File[] getAllFiles(File[] files) {
ArrayList&lt;File&gt; fs = getFiles(files);
return (File[]) fs.toArray(new File[fs.size()]);
}
</code>
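If the code comes from a file or a string rather than being typed by hand, the same escaping can be automated. The following is only a minimal JavaScript sketch; escapeHtml is a hypothetical helper name, not part of the answer above, and it handles only the three characters that matter inside element content (&, < and >).

```javascript
// Hypothetical helper (an assumption, not from the original answer):
// escapes the characters an HTML parser would otherwise treat as markup.
// The ampersand must be replaced first so that the entities produced by
// the later replacements are not escaped a second time.
function escapeHtml(source) {
  return source
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// One line from the question's Java code, run through the helper.
const snippet = "ArrayList<File> _files = new ArrayList<File>();";
console.log(escapeHtml(snippet));
```

The escaped string can then be placed inside the <code> element as text. Attribute values would additionally need quote characters escaped, which this sketch does not attempt.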
As already identified by Mr. Alien, you have characters being interpreted as markup inside your <code> block.
As an alternative to escaping lots of characters, providing your code does not include the string </script, you can exploit the parsing and (non)execution behaviour of the <script> element like this:
<code>
<script type="text/x-code">
public static ArrayList<File> getFiles(File[] files){
ArrayList<File> _files = new ArrayList<File>();
for (int i=0; i<files.length; i++)
if (files[i].isDirectory())
_files.addAll(getFiles(new File(files[i].toString()).listFiles()));
else
_files.add(files[i]);
return _files;
}
public static File[] getAllFiles(File[] files) {
ArrayList<File> fs = getFiles(files);
return (File[]) fs.toArray(new File[fs.size()]);
}
</script>
</code>
with this CSS:
script[type=text\/x-code] {
display: block;
white-space: pre;
line-height: 20px;
margin-top: -20px;
}
See JSfiddle: http://jsfiddle.net/fZuPm/3/
Update: In the comments, RoToRa raises some interesting points about the "correctness" of this approach, and I thank RoToRa for them.
Using a type attribute to stop the contents of a script tag from being executed as JavaScript is a well understood technique, and although the list of type names that cause script to be executed varies from browser to browser, finding one that won't cause execution is not hard.
More interesting is the question of the semantics. It is my view that the semantics of the script element are essentially inert, like a div or span element, while RoToRa's view is that it affects the semantics of the content. Looking at the specs, it is not easy to resolve. HTML 4.01 says very little about the semantics of the script element, concentrating solely on its functionality.
The HTML5 spec is not much better, but it does say "The element does not represent content for the user.". I don't know what to make of that. Saying what an element doesn't do is not very helpful. If it implies that its contents are semantically "hidden" from the user, such that its contents are not semantically part of the contents of the containing code element, then this technique should not be used.
If, however, it means that no new semantics are introduced by the script element, then there doesn't appear to be any problem.
I can't find any evidence of a script element being semantically required to contain script, as RoToRa suggests, and while it might be considered common-sense to infer that, that's not how HTML semantics works.
In many ways, this approach is really about trying to find a way to do validly what the XMP element does in browsers anyway, but is not valid. XMP was very nearly made valid in HTML5 but just missed out. The editor described it as a tough call. Using the script element like this meets that requirement, but it seems nevertheless to be controversial. If you are uncomfortable with whatever semantics you feel are being applied in this approach, I would suggest that you don't use it.

AS3 > Use different text styles (bold and regular) in the same dynamic text field

I've been using AS3 for many years, but every time I need to manipulate fonts I go crazy :(
My setup:
I'm filling up via code a movieClip with a lot of dynamic textFields. Their values come from an external XML.
My problem:
My client wants to insert HTML tags inside the XML to make part of the text bold. For example, they want to have: "this string with this part bold".
The XML part is OK, formatted with CDATA and so on. If I trace the value coming from the XML, it is HTML, but it is shown as regular text inside the text field.
The textfields are using client custom font (not system font), and have the font embedded via embedding dialog panel in Flash by the graphic designer.
Any help?
This is the part of code that fills up the textfields (is inside a for loop)
var labelToWrite:String = labelsData.label.(@id == nameOfChildren)[VarHolder.activeLang];
if (labelToWrite != "") {
foundTextField.htmlText = labelToWrite;
// trace ("labelToWrite is -->" +labelToWrite);
}
And the trace outputs me
This should be <b>bold text with b tag</b> and this should be <strong>strong text </strong>.
Your code looks good. So the issue will be with the embedded Font. When you embed a font in flash, it doesn't embed any separate bold versions, so you may need to embed a bold version of your font.
Some resources:
Flash CS4 <b> tag in with htmlText
I find that html text works best by using the style sheets to declare your fonts instead of text formats.
var style:StyleSheet = new StyleSheet();
style.setStyle(".mainFont", { fontFamily: "Verdana", fontSize: 50, color: "#ff0000" } );
var foundTextField:TextField = new TextField();
foundTextField.embedFonts = true;
foundTextField.autoSize = TextFieldAutoSize.LEFT;
foundTextField.antiAliasType = AntiAliasType.ADVANCED;
foundTextField.styleSheet = style;
foundTextField.htmlText = "<div class=\"mainFont\">" + labelToWrite + "</div>";
See: flash as3 xml cdata bold tags rendered in htmlText with an embedded font
Other reasons maybe:
Are you embedding the right type of font (TLF vs CFF)?
If not using the flash IDE to make your textField, are you registering the font?
Font.registerFont(MyFontClass);
You can work with embedded fonts in Adobe Animate. Just add the fonts and export them with a class name.
In the Adobe Animate library you can add fonts and export them as class objects like this:
Select a font and the styles to add
Assign a name and export it as an ActionScript class
In the frame code, you must instantiate the fonts using the same class names created before:
var gothamMediumItalic = new GothamMediumItalic();
var formatMediumItalic:TextFormat = new TextFormat();
formatMediumItalic.font = gothamMediumItalic.fontName;
var gothamBookItalic = new GothamBookItalic();
var formatBookItalic:TextFormat = new TextFormat();
formatBookItalic.font = gothamBookItalic.fontName;
var gothamBoldItalic = new GothamBoldItalic();
var formatBoldItalic:TextFormat = new TextFormat();
formatBoldItalic.font = gothamBoldItalic.fontName;
this.embedFonts = true;
this.antiAliasType = AntiAliasType.ADVANCED;
Finally, take the text box you added as a graphic object and set its format and content:
var yourText:String = "Some Text";
boxText.defaultTextFormat = formatBook; //My default text font style
boxText.text = yourText; //Full text in same format
boxText.setTextFormat(formatBookItalic,yourText.indexOf("<i>"),yourText.indexOf("</i>")); //Some text in another format found by custom chars
That works fine and lets you use less code and more interface options. I hope this is useful to someone.
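The index arithmetic behind the setTextFormat call can be made explicit. The following is a minimal JavaScript sketch (the same logic should port to ActionScript 3 with little change), assuming the text contains a single marker pair and that the end index passed to setTextFormat is exclusive; findStyledRange is a hypothetical helper name, not part of the answer above.

```javascript
// Hypothetical helper (an assumption, not from the original answer):
// finds one marker pair, strips the markers, and returns the plain text
// together with the [begin, end) range that the markers enclosed.
function findStyledRange(text, openTag, closeTag) {
  const open = text.indexOf(openTag);
  const close = text.indexOf(closeTag, open + openTag.length);
  if (open === -1 || close === -1) {
    // No complete marker pair: leave the text untouched.
    return { plain: text, begin: -1, end: -1 };
  }
  const inner = text.slice(open + openTag.length, close);
  const plain =
    text.slice(0, open) + inner + text.slice(close + closeTag.length);
  // After stripping, the styled span starts where the opening marker was.
  return { plain: plain, begin: open, end: open + inner.length };
}

const result = findStyledRange("Some <i>italic</i> text", "<i>", "</i>");
console.log(result.plain, result.begin, result.end);
```

The plain string would be assigned to the text field first, and the begin and end values would then be passed as the index arguments of the formatting call, so the markers themselves never appear in the displayed text.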