iTextSharp support for HTML controls conversion C# - html

Does iTextSharp HTML to PDF conversion support controls like text boxes, buttons, etc.? Or do we need to use an iTextSharp class such as TextField to implement controls during PDF conversion?

iTextSharp doesn't support conversion of text boxes, buttons, etc. Most likely you will need to implement your own logic if you want to convert an HTML page (with text boxes, buttons, etc.) to a PDF document. You can find all supported tags and styles here. You can also check whether an element is supported using this simple example:
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

byte[] bytes;
using (var stream = new MemoryStream())
{
    using (var document = new Document())
    {
        using (var writer = PdfWriter.GetInstance(document, stream))
        {
            document.Open();
            // HTML containing an <input> element between two paragraphs
            var html = @"<p>Before the button</p><br/><input type=""submit"" value=""Click me""/><br/><p>After the button</p>";
            using (var reader = new StringReader(html))
            {
                XMLWorkerHelper.GetInstance().ParseXHtml(writer, document, reader);
            }
            document.Close();
        }
    }
    bytes = stream.ToArray();
}
File.WriteAllBytes("test.pdf", bytes);
If you run this example, you'll see that the input element is not part of the final document.

Related

HTML.TextAreaFor - removing html tags for display only

In an MVC application I have to use @Html.TextAreaFor to display some text from a database. The trouble is that the text sometimes contains HTML tags, and I can't see a way to remove them for display only.
Is it possible to do this in the view (maybe with CSS?) without having to strip the tags in the controller first?
EDIT
The data coming from the controller contains HTML tags which I do not want to remove; I just don't want to display them.
Normally I would use @Html.Raw, but it has to work in an @Html.TextAreaFor control.
If you want to decode HTML returned from the controller, you can use the following JavaScript function.
It strips script and HTML tags and decodes entities; for example, it turns "Chris&apos; corner" into "Chris' corner".
var decodeEntities = (function () {
    // this prevents any overhead from creating the object each time
    var element = document.createElement('div');
    function decodeHTMLEntities(str) {
        if (str && typeof str === 'string') {
            // strip script/html tags
            str = str.replace(/<script[^>]*>([\S\s]*?)<\/script>/gmi, '');
            str = str.replace(/<\/?\w(?:[^"'>]|"[^"]*"|'[^']*')*>/gmi, '');
            element.innerHTML = str;
            str = element.textContent;
            element.textContent = '';
        }
        return str;
    }
    return decodeHTMLEntities;
})();
You can do this with Razor code in your view:
@Html.Raw(HttpUtility.HtmlDecode(Model.Content))
If I set Model.Content to the string "<strong>This is me</strong><button>click</button>", the code above renders it as HTML, so the output is bold text next to a button.
There are some nice rich text editor libraries, such as CKEditor, Quill, and TinyMCE, that can display HTML while keeping the capabilities of a text editor. All of them can also be made read-only if necessary.
I sorted this out by changing TextAreaFor to TextBoxFor and setting a formatted value:
@Html.TextBoxFor(x => Model.MyItem, new { @class = "form-control", @required = "true", Value = Regex.Replace(Model.MyItem, "<.*?>", String.Empty) })
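The non-greedy pattern passed to Regex.Replace above can be tried out quickly in JavaScript, the language of the earlier decodeEntities answer. This is only a rough sketch: stripTags is a hypothetical helper name, not something from the answers, and the second call shows one limitation of this kind of pattern.

```javascript
// Hypothetical helper (not part of the original answers): removes anything
// that looks like a tag, using the same non-greedy pattern "<.*?>".
function stripTags(text) {
  if (typeof text !== 'string') {
    return '';
  }
  return text.replace(/<.*?>/g, '');
}

console.log(stripTags('<strong>This is me</strong><button>click</button>'));
// prints "This is meclick"

// Limitation: a literal "<" followed later by ">" is treated as a tag,
// so part of this plain-text comparison is lost.
console.log(stripTags('1 < 2 and 3 > 2'));
// prints "1  2"
```

If the stored text can contain literal angle brackets, a real HTML parser is probably a safer choice than a regular expression.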

iTextSharp HTML to PDF conversion - unable to change font

I'm creating some PDF documents with iTextSharp (5.5.7.0) from HTML in an ASP.NET MVC 5 application, but I'm unable to change the font. I've tried almost everything that I was able to find on SO or in other resources.
Code for PDF generation is as follows:
public Byte[] GetRecordsPdf(RecordsViewModel model)
{
    var viewPath = "~/Template/RecordTemplate.cshtml";
    var renderedReport = RenderViewToString(viewPath, model);
    FontFactory.RegisterDirectory(Environment.GetFolderPath(Environment.SpecialFolder.Fonts));
    using (var ms = new MemoryStream())
    {
        using (var doc = new Document())
        {
            doc.SetPageSize(PageSize.A4.Rotate());
            using (var writer = PdfWriter.GetInstance(doc, ms))
            {
                doc.Open();
                using (var html = new MemoryStream(Encoding.Default.GetBytes(renderedReport)))
                {
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, html, Encoding.Default);
                }
                doc.Close();
            }
        }
        var bytes = ms.ToArray();
        return bytes;
    }
}
The actual HTML is contained in the renderedReport string variable (I have a strongly typed .cshtml file which I render using the MVC Razor engine, and then I return the HTML as a string).
I've tried to register some specific fonts, but that didn't help. I've also tried to register all fonts on my machine (as shown in the example above), but that didn't help either. The fonts were loaded; I checked that in debug mode.
The CSS is embedded in the HTML file (in the head, inside a style tag) like this:
body {
    font-size: 7px;
    font-family: Comic Sans MS;
}
(For testing I decided to use Comic Sans because I can recognize it easily; I'm actually more interested in Arial Unicode MS.)
I am able to change the font with the font-family property from CSS, but only to fonts that iTextSharp preloads by default: Times New Roman, Arial, Courier, and some others (Helvetica, I think). When I change it to Comic Sans, or some other font that is not preloaded, iTextSharp renders with the default font (Arial, I would say).
The reason I need to change the font is that I have some Croatian characters in my rendered HTML (ČĆŠĐŽčćšđž) which are missing from the PDF, and currently I think the main reason is the font.
What am I missing?
A couple of things to make this work.
First, XMLWorkerHelper doesn't use FontFactory by default; you need to use one of the ParseXHtml() overloads that takes an IFontProvider. Both of those overloads require a Stream for a CSS file, but you can just pass null if your CSS lives inside your HTML file. Luckily, FontFactory has a static property that implements this interface, FontFactory.FontImp:
// **This guy**
XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHTML, null, Encoding.UTF8, FontFactory.FontImp);
Second, I know that you said that you tried registering your entire font directory out of desperation but that can be a rather expensive call. If you can, always try to just register the fonts you need. Although optional, I also strongly recommend that you explicitly define the font's alias because fonts can have several names and they're not always what we think.
FontFactory.Register(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "comic.ttf"), "Comic Sans MS");
Third, and this might not affect you, but any tags not present in the HTML, even if they're logically implied, won't get styling applied to them from CSS. That sounds weird so to say it differently, if your HTML is just <p>Hello</p> and your CSS is body{font-size: 7px;}, the font size won't get applied because your HTML is missing the <body> tag.
Fourth, and this is optional, but it's usually easier to specify your HTML and CSS separately from each other, which I'll do in the example below.
Your code was 95% there so with just a couple of tweaks it should work. Instead of a view I'm just parsing raw HTML and CSS but you can modify as needed. Please do remember (and I think you know this) that iTextSharp cannot process ASP.Net, only HTML, so you need to make sure that your ASP.Net to HTML conversion process is sane.
//Sample HTML and CSS
var html = @"<body><p>Sva ljudska bića rađaju se slobodna i jednaka u dostojanstvu i pravima. Ona su obdarena razumom i sviješću i trebaju jedna prema drugima postupati u duhu bratstva.</p></body>";
var css = "body{font-size: 7px; font-family: Comic Sans MS;}";
//Register a single font
FontFactory.Register(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "comic.ttf"), "Comic Sans MS");
//Placeholder variable for later
Byte[] bytes;
using (var ms = new MemoryStream()) {
    using (var doc = new Document()) {
        doc.SetPageSize(PageSize.A4.Rotate());
        using (var writer = PdfWriter.GetInstance(doc, ms)) {
            doc.Open();
            //Get a stream of our HTML
            using (var msHTML = new MemoryStream(Encoding.UTF8.GetBytes(html))) {
                //Get a stream of our CSS
                using (var msCSS = new MemoryStream(Encoding.UTF8.GetBytes(css))) {
                    XMLWorkerHelper.GetInstance().ParseXHtml(writer, doc, msHTML, msCSS, Encoding.UTF8, FontFactory.FontImp);
                }
            }
            doc.Close();
        }
    }
    bytes = ms.ToArray();
}

How to insert html in Microsoft Word Placeholder

I have this situation:
The user has an editor on his page and enters text (with colors, formatting, and hyperlinks; he can also add pictures). When he clicks Submit, the data from the editor (with the proper formatting) must be sent to a specific placeholder in a Microsoft Office Word document.
I am using the OpenXml SDK to write to the document, and I tried HtmlToOpenXml so I can read the HTML.
With HtmlToOpenXml I get a couple of paragraphs from the HTML string (from the user), and now I have to insert them into the content control. Do you know how I can find the control and append them to it (if possible)?
I managed to fix this, and here is the code I used:
//name of the file which will be saved
const string filename = "test.docx";
//html string to be inserted and rendered in the word document
string html = @"<b>Test</b>";
//the Html2OpenXML dll supports all the common html tags
//open the template document with the content controls in it (in my case I used a Rich Text Field Content Control)
byte[] byteArray = File.ReadAllBytes("..."); // template path
using (MemoryStream generatedDocument = new MemoryStream())
{
    generatedDocument.Write(byteArray, 0, byteArray.Length);
    using (WordprocessingDocument doc = WordprocessingDocument.Open(generatedDocument, true))
    {
        MainDocumentPart mainPart = doc.MainDocumentPart;
        //just in case
        if (mainPart == null)
        {
            mainPart = doc.AddMainDocumentPart();
            new Document(new Body()).Save(mainPart);
        }
        HtmlConverter converter = new HtmlConverter(mainPart);
        Body body = mainPart.Document.Body;
        //sdtElement is the Content Control we need.
        //Html is the name of the placeholder we are looking for
        SdtElement sdtElement = doc.MainDocumentPart.Document.Descendants<SdtElement>()
            .Where(
                element =>
                    element.SdtProperties.GetFirstChild<SdtAlias>() != null &&
                    element.SdtProperties.GetFirstChild<SdtAlias>().Val == "Html").FirstOrDefault();
        //stop early if the template has no content control with that alias
        if (sdtElement == null)
        {
            throw new InvalidOperationException("Content control \"Html\" not found.");
        }
        //the HtmlConverter returns a set of paragraphs.
        //in them we have the data which we want to insert in the document with its formatting
        //After that we just need to append all paragraphs to the Content Control and save the document
        var paragraphs = converter.Parse(html);
        for (int i = 0; i < paragraphs.Count; i++)
        {
            sdtElement.Append(paragraphs[i]);
        }
        mainPart.Document.Save();
    }
    File.WriteAllBytes(filename, generatedDocument.ToArray());
}

AS3 > Use different text styles (bold and regular) in the same dynamic text field

I've been using AS3 for many years, but every time I need to manipulate fonts I go crazy :(
My setup:
I'm filling a MovieClip with a lot of dynamic text fields via code. Their values come from an external XML file.
My problem:
My client wants to insert HTML tags inside the XML to make part of the text bold. For example, they want to have: "this string with this part bold".
The XML part is OK, formatted with CDATA and so on. If I trace the value coming from the XML, it is HTML, but it is shown as regular text inside the text field.
The text fields use the client's custom font (not a system font), and the graphic designer embedded the font via the embedding dialog panel in Flash.
Any help?
This is the part of the code that fills the text fields (it is inside a for loop):
var labelToWrite:String = labelsData.label.(@id == nameOfChildren)[VarHolder.activeLang];
if (labelToWrite != "") {
    foundTextField.htmlText = labelToWrite;
    // trace ("labelToWrite is -->" +labelToWrite);
}
And the trace outputs:
This should be <b>bold text with b tag</b> and this should be <strong>strong text </strong>.
Your code looks good. So the issue will be with the embedded Font. When you embed a font in flash, it doesn't embed any separate bold versions, so you may need to embed a bold version of your font.
Some resources:
Flash CS4 <b> tag in with htmlText
I find that html text works best by using the style sheets to declare your fonts instead of text formats.
var style:StyleSheet = new StyleSheet();
style.setStyle(".mainFont", { fontFamily: "Verdana", fontSize: 50, color: "#ff0000" } );
var foundTextField:TextField = new TextField();
foundTextField.embedFonts = true;
foundTextField.autoSize = TextFieldAutoSize.LEFT;
foundTextField.antiAliasType = AntiAliasType.ADVANCED;
foundTextField.styleSheet = style;
foundTextField.htmlText = "<div class=\"mainFont\">" + labelToWrite + "</div>";
See: flash as3 xml cdata bold tags rendered in htmlText with an embedded font
Other possible reasons:
Are you embedding the right type of font (TLF vs CFF)?
If you are not using the Flash IDE to create your text field, are you registering the font?
Font.registerFont(MyFontClass);
You can work with embedded fonts in Adobe Animate. Just add the fonts and export each one with a class name.
In the Adobe Animate library you can add fonts and create them as class objects like this:
Select a font and the styles to add
Assign a name and export it as an ActionScript class
In the frame code, you must instantiate the fonts using the same class names created before:
var gothamMediumItalic = new GothamMediumItalic();
var formatMediumItalic:TextFormat = new TextFormat();
formatMediumItalic.font = gothamMediumItalic.fontName;
var gothamBookItalic = new GothamBookItalic();
var formatBookItalic:TextFormat = new TextFormat();
formatBookItalic.font = gothamBookItalic.fontName;
var gothamBoldItalic = new GothamBoldItalic();
var formatBoldItalic:TextFormat = new TextFormat();
formatBoldItalic.font = gothamBoldItalic.fontName;
this.embedFonts = true;
this.antiAliasType = AntiAliasType.ADVANCED;
Finally, you just need to use the text box you added as a graphic object, and set its format and content:
var yourText:String = "Some Text";
boxText.defaultTextFormat = formatBook; //My default text font style
boxText.text = yourText; //Full text in same format
boxText.setTextFormat(formatBookItalic,yourText.indexOf("<i>"),yourText.indexOf("</i>")); //Some text in another format found by custom chars
That works fine and lets you use less code and more interface options. I hope this is useful for someone.

WPF RichTextBox with Inline HTML - i18n

I have a RichTextBox which I'm trying to use to display a translatable block of text containing hyperlinks. The problem is that I can't find a way to set the text property without manually coding the Runs and Hyperlink controls into the content, which isn't translatable. Is there any way of doing this? I tried saving a simple RTF file containing one sentence using Word so I could extract the bits I need, but I ended up with 160 lines of difficult-to-decipher RTF text.
Ideally I would use HTML, which would be easier, but this doesn't seem to be supported.
I solved this by using the HTML Agility Pack (http://htmlagilitypack.codeplex.com/) to parse out the anchors:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Documents;
using HtmlAgilityPack;

public static IEnumerable<Inline> ParseHtml(string text)
{
    var inlines = new List<Inline>();
    HtmlDocument doc = new HtmlDocument();
    doc.LoadHtml(text);
    if (doc.ParseErrors == null || !doc.ParseErrors.Any()) {
        foreach (var childNode in doc.DocumentNode.ChildNodes) {
            switch (childNode.Name.ToLowerInvariant()) {
                case "a":
                    // anchors become clickable Hyperlink inlines
                    var lnk = new Hyperlink(new Run(childNode.InnerText));
                    lnk.NavigateUri = new Uri(childNode.Attributes["href"].Value);
                    inlines.Add(lnk);
                    break;
                default:
                    // everything else is added as plain text
                    inlines.Add(new Run(childNode.InnerText));
                    break;
            }
        }
    }
    return inlines;
}