Getting date in CSS and HTML to PDF's footer - html

I'm using ITextRenderer to generate a PDF file from HTML and CSS. I have a footer with the current page number on the right.
Now I would like to have the current date on the left.
I found this:
<div data-line="1"></div>
div[data-line]:after {
content: "[line " attr(data-line) "]";
}
But I don't know how to combine it with:
@bottom-left {
content: "Date: ";
}
Is this possible or is there any other way?
I would like the footer to look something like this:
Date: 2015-03-17 12:04
UPDATE 1:
I have a method createPDF(String html, String resourceUrl) that looks like this:
ByteArrayOutputStream out = new ByteArrayOutputStream();
HtmlCleaner cleaner = new HtmlCleaner();
TagNode node = cleaner.clean(html);
CleanerProperties props = cleaner.getProperties();
new SimpleXmlSerializer(props).writeToStream(node, out, "ISO-8859-1");
ITextRenderer renderer = new ITextRenderer();
renderer.setPDFVersion(PdfWriter.VERSION_1_7);
renderer.setDocumentFromString(new String(out.toByteArray(), "ISO-8859-1"));
renderer.getSharedContext().setBaseURL(resourceUrl);
renderer.layout();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
renderer.createPDF(outputStream);
renderer.finishPDF();
outputStream.flush();
outputStream.close();
return outputStream.toByteArray();

The example that you cite of using data-line in the CSS is an interesting way to dynamically generate CSS based on data-* HTML attributes. These in turn might be dynamically generated in a web page by a server-side or client-side script. But it seems like this might not be the appropriate method for your case.
If you are already using a Java class like ITextRenderer to generate this PDF, then the best method for you is probably just to use Java itself to generate the current date in the format that you want, then print that directly into the CSS as a string.
If you are loading the CSS from a manually created file, one way to do this would be to write your CSS with some text you intend to replace, for example INSERTDATE. Then in Java, load your document the way you normally do, then use some Java code like String.replace() to replace INSERTDATE with today's date.
Update 1:
Based on your sample code above, you could write INSERTDATE where you want the date to appear in your HTML/CSS, then call:
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm"); // your desired format
String strDate = sdf.format(new Date()); // get the current date
html = html.replace("INSERTDATE", strDate);
at the top of your method.
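The same placeholder replacement can be sketched with the newer java.time API. This is only an illustration: the class name FooterDate, the method insertDate, and the sample CSS in the comment are assumptions, not part of the original code. Passing the timestamp in as a parameter keeps the method easy to test.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class FooterDate {
    // Placeholder token from the answer above; any string that cannot
    // occur in the real markup would work just as well.
    static final String PLACEHOLDER = "INSERTDATE";

    // Matches the format requested in the question, e.g. 2015-03-17 12:04.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm");

    // Replaces every occurrence of the placeholder with the formatted timestamp.
    // Assumed usage: the CSS contains something like
    //   content: "Date: INSERTDATE";
    static String insertDate(String html, LocalDateTime now) {
        return html.replace(PLACEHOLDER, FORMAT.format(now));
    }
}
```

Inside createPDF this would be called as html = FooterDate.insertDate(html, LocalDateTime.now()); before the HTML is cleaned and rendered.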

You can use the CSS string-set property with the content() and string() functions (see https://www.w3.org/TR/css-gcpm-3/), like this:
<div class="where-content-comes-from">1</div>
<div class="where-content-is-injected"></div>
.where-content-comes-from{
display:none;
string-set: am-just-a-var content();
}
.where-content-is-injected:after {
content: string(am-just-a-var);
}
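Bonus: you can also look at the CSS env() function (see https://developer.mozilla.org/en-US/docs/Web/CSS/env), although it only exposes environment variables defined by the user agent, and a date is not among the standard ones.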

Related

Parsing html page content without using selector

I am going to parse some web pages using a Java program. For this purpose I wrote a small piece of code that parses page content using XPath as the selector. To parse different sites, you need to find the appropriate XPath for each site. The problem is that this requires an operator to find the right XPath for you (for example, using the FirePath Firefox add-on). Suppose you don't know in advance which pages you will have to parse, or the number of sites gets too big for an operator to find the right XPath for each one. In that case you need a way to parse pages without using any selector (the same scenario exists for CSS selectors), or there should be a way to find the XPath automatically. What methods exist for parsing web pages this way?
Here is the small code which I wrote for this purpose, please feel free to extend that in presenting your solutions.
public void downloadHTML(String url) throws IOException {
CleanerProperties props = new CleanerProperties();
// set some properties to non-default values
props.setTranslateSpecialEntities(true);
props.setTransResCharsToNCR(true);
props.setOmitComments(true);
// do parsing
TagNode tagNode = new HtmlCleaner(props).clean(
new URL(url)
);
// serialize to xml file
new PrettyXmlSerializer(props).writeToFile(
tagNode, "c:\\TEMP\\clean.xml", "utf-8"
);
}
public static void testJavaxXpath(String pattern)
throws ParserConfigurationException, SAXException, IOException,
FileNotFoundException, XPathExpressionException {
DocumentBuilder b = DocumentBuilderFactory.newInstance()
.newDocumentBuilder();
org.w3c.dom.Document doc = b.parse(new FileInputStream(
"c:\\TEMP\\clean.xml"));
// Evaluate XPath against Document itself
javax.xml.xpath.XPath xPath = XPathFactory.newInstance().newXPath();
NodeList nodes = (NodeList) xPath.evaluate(pattern,
doc.getDocumentElement(), XPathConstants.NODESET);
for (int i = 0; i < nodes.getLength(); ++i) {
Element e = (Element) nodes.item(i);
System.out.println(e.getFirstChild().getTextContent());
}
}

iText HtmlWorker Pre tag rendering issue

I am a newbie to iText. I am trying to convert an HTML file to PDF. After conversion, the content inside the "pre" tag is not rendered properly. If anybody has come across this issue before, please share your thoughts and the solution that you applied.
Document document = new Document();
string filePath = HostingEnvironment.MapPath("~/Content/Pdf/");
PdfWriter.GetInstance(document, new FileStream(filePath + "\\pdf-"+Name+".pdf", FileMode.Create));
document.Open();
iTextSharp.text.html.simpleparser.HTMLWorker hw = new iTextSharp.text.html.simpleparser.HTMLWorker(document);
FontFactory.Register(Path.Combine(_webHelper.MapPath("~/App_Data/Pdf/arial.ttf")), "Garamond"); // just give a path of arial.ttf
StyleSheet css = new StyleSheet();
css.LoadTagStyle("body", "face", "Garamond");
css.LoadTagStyle("body", "encoding", "Identity-H");
css.LoadTagStyle("body", "size", "12pt");
hw.SetStyleSheet(css);
hw.Parse(new StringReader(htmlText));
In the code above, htmlText is the HTML as a string. Pass your view page's HTML in string form and the PDF will be generated.
Please let me know if this code does not work.
Hope this helps.

The image I add to a header doesn't appear

I'm trying to create a header with an image; the header is added, but the image is missing in header. My application is deployed on an Oracle Weblogic server (using Java EE and Hibernate).
I'm trying to create the image from the path returned by getImage(seasonalFilter.getPictureFileId()).getAbsolutePath(). The image path looks something like this: /tmp/6461346546165461313_65464.jpg.
Note that I want to add text under the image in the header (for every page).
public File convertHtmlToPdf(String JSONString, ExportQueryTypeDTO queryType, String htmlText, ExportTypeDTO type) throws VedStatException {
try {
File retFile = null;
FilterDTO filter = null;
HashMap<Object, Object> properties = new HashMap<Object, Object>(queryType.getHashMap());
filter = JSONCoder.decodeSeasonalFilterDTO(JSONString);
DateFormat formatter = new SimpleDateFormat("yyyy_MM_dd__HH_mm");
//logger.debug("<<<<<< HTML TEXT: " + htmlText + " >>>>>>>>>>>>>>>>");
StringBuilder tmpFileName = new StringBuilder();
tmpFileName.append(formatter.format(new Date()));
retFile = File.createTempFile(tmpFileName.toString(), type.getSuffix());
OutputStream out = new FileOutputStream(retFile);
com.lowagie.text.Document document = new com.lowagie.text.Document(com.lowagie.text.PageSize.LETTER);
com.lowagie.text.pdf.PdfWriter pdfWriter = com.lowagie.text.pdf.PdfWriter.getInstance(document, out);
document.open();
com.lowagie.text.html.simpleparser.HTMLWorker htmlWorker = new com.lowagie.text.html.simpleparser.HTMLWorker(document);
String str = htmlText.replaceAll("ű", "û").replaceAll("ő", "õ").replaceAll("Ő", "Õ").replaceAll("Ű", "Û");
htmlWorker.parse(new StringReader(str));
if (filter instanceof SeasonalFilterDTO) {
SeasonalFilterDTO seasonalFilter = (SeasonalFilterDTO) filter;
if (seasonalFilter.getPictureFileId() != null) {
logger.debug("Image absolutePath: " + getImage(seasonalFilter.getPictureFileId()).getAbsolutePath());
Image logo = Image.getInstance(getImage(seasonalFilter.getPictureFileId()).getAbsolutePath());
logo.setAlignment(Image.MIDDLE);
logo.setAbsolutePosition(0, 0);
logo.scalePercent(100);
Chunk chunk = new Chunk(logo, 0, 0);
HeaderFooter header = new HeaderFooter(new Phrase(chunk), true);
header.setBorder(Rectangle.NO_BORDER);
document.setHeader(header);
}
}
document.close();
return retFile;
} catch (Exception e) {
throw new VedStatException(e);
}
}
I really dislike your false allegation that "Every tutorial is based on C:\imagelocation\dsadsa.jpg"
I'm the author of two books and many tutorials about iText and I know for a fact that what you say doesn't make any sense. Take a look at my name: "Bruno Lowagie." You are using my name in your code, so please believe me when I say you're doing it completely wrong.
Instead of HTMLWorker, you should use XML Worker. HTMLWorker is no longer supported and will probably be removed from iText in the near future.
I see that you're also using the HeaderFooter class. This class was removed several years ago. Please take a look at the newer examples: http://www.itextpdf.com/themes/keyword.php?id=221
These examples are written in Java; if you need the C# version, please look for the corresponding C# examples in the SVN repository.
Regarding images, you may want to read chapter 10 of my book.
Finally: please read http://lowagie.com/itext2

AJAX HTMLEditorExtender on postback tables don't display

I am currently using an Ajax tool, HTMLEditorExtender, to turn a textbox into a WYSIWYG editor in a C# ASP.NET project. On the initial page load I place a large amount of formatted text and tables into the editor, which appears fine, even the tables.
The data is loaded into an asp:Panel, and the rendered output of the panel is what is actually loaded into the extender and displayed.
However, I want a button that saves all of the data in the editor to the Session and, after the button press, still displays everything in the WYSIWYG editor. On the page postback, everything that loads into the textbox is fine except for the tables: they come up with the tags showing. Is there any way around this?
The code I am using to initially load the page is this:
ContentPlaceHolder cphMain = (ContentPlaceHolder)this.Master.FindControl("MainContent");
Panel pnlContent = (Panel)cphMain.FindControl("innerFrame");
StringBuilder sb = new StringBuilder();
StringWriter sw = new StringWriter(sb);
HtmlTextWriter hw = new HtmlTextWriter(sw);
pnlContent.RenderControl(hw);
txtPN.Text = sb.ToString();
pnlContent.Visible = false;
On the button click I am having this saved:
string strHTMLText = txtPN.Text;
Session["ProgressNoteHTML"] = strHTMLText;
And I am loading it on the postback like this:
txtPN.Text = (string)Session["ProgressNoteHTML"];
ContentPlaceHolder cphMain = (ContentPlaceHolder)this.Master.FindControl("MainContent");
Panel pnlContent = (Panel)cphMain.FindControl("innerFrame");
pnlContent.Visible = false;
Any ideas as to why any postbacks would make the tags appear and in the original page load they do not?
The solution offered by Erik won't work for table tags that carry attributes. For instance, <table align="right"> will not be decoded. I have also found that <img> tags are encoded by the HTMLEditorExtender as well.
The easier solution is to use the Server.HtmlDecode() method.
TextBox_Editor.Text = Server.HtmlDecode(TextBox_Editor.Text) 'fixes encoding bug in ajax:HTMLEditor
I have the same problem. It seems to have something to do with the default sanitizing that the extender performs on the HTML content. I haven't found a way to switch it off, but the workaround is pretty simple.
Write an anti-sanitizing function that replaces the cleansed tags with proper tags. Below is mine, written in VB.NET. A C# version would look very similar:
Protected Function FixTableTags(ByVal input As String) As String
'find all the matching cleansed tags and replace them with correct tags.
Dim output As String = input
'replace Cleansed table tags.
output = output.Replace("&lt;table&gt;", "<table>")
output = output.Replace("&lt;/table&gt;", "</table>")
output = output.Replace("&lt;tbody&gt;", "<tbody>")
output = output.Replace("&lt;/tbody&gt;", "</tbody>")
output = output.Replace("&lt;tr&gt;", "<tr>")
output = output.Replace("&lt;td&gt;", "<td>")
output = output.Replace("&lt;/td&gt;", "</td>")
output = output.Replace("&lt;/tr&gt;", "</tr>")
Return output
End Function
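The fixed-string replacements above miss tags that carry attributes, such as the <table align="right"> case mentioned in the other answer. A regex-based variant of the same whitelist idea is sketched below in Java purely as an illustration; the same pattern should translate to .NET's Regex class. It assumes the sanitizer escapes <, > and " as &lt;, &gt; and &quot;, which is not confirmed in this thread, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TableTagFixer {
    // Matches an escaped opening or closing tag for a small whitelist of
    // table elements, with optional escaped attributes, e.g.
    // &lt;table align=&quot;right&quot;&gt; or &lt;/td&gt;.
    private static final Pattern ESCAPED_TAG = Pattern.compile(
            "&lt;(/?)(table|tbody|tr|td)(\\s.*?)?&gt;",
            Pattern.CASE_INSENSITIVE);

    // Restores only the whitelisted tags; all other escaped markup stays escaped.
    static String fixTableTags(String input) {
        Matcher m = ESCAPED_TAG.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Assumption: attribute quotes were escaped as &quot;.
            String attrs = m.group(3) == null
                    ? ""
                    : m.group(3).replace("&quot;", "\"");
            String tag = "<" + m.group(1) + m.group(2) + attrs + ">";
            m.appendReplacement(out, Matcher.quoteReplacement(tag));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

Unlike decoding the whole string, this leaves every tag outside the whitelist escaped, so the sanitizer still covers everything except the table markup.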

Adding html text to Word using Interop

I'm trying to add some HTML formatted text to Word using Office Interop. My code looks like this:
Clipboard.SetText(notes, TextDataFormat.Html);
pgCriteria.Range.Paste();
but it's throwing a Command Failed exception. Any idea?
After spending several hours on this, the solution is to use this excellent class:
http://blogs.msdn.com/jmstall/pages/sample-code-html-clipboard.aspx
This worked for me on Windows 7 and Word 2007:
public static void pasteHTML(this Range range, string html)
{
Clipboard.SetData(
"HTML Format",
string.Format("Version:0.9\nStartHTML:80\nEndHTML:{0,8}\nStart" + "Fragment:80\nEndFragment:{0,8}\n", 80 + html.Length) + html + "<");
range.Paste();
}
Sample use: range.pasteHTML("a<b>b</b>c");
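The constant 80 in that header is not arbitrary: it is the length of the header itself once both eight-character offset fields are filled in, so the HTML starts right after it. The arithmetic can be checked with a short sketch, written in Java here only as an illustration (the class and method names are hypothetical). It assumes, per the HTML clipboard format description, that the offsets are byte counts of UTF-8 data, so it measures the encoded length rather than the character count used by html.Length above; the two agree only for ASCII content.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class CfHtmlHeader {
    // Same layout as the C# snippet above; %8d mirrors the {0,8}
    // right-aligned, width-8 field. Locale.ROOT keeps the digits ASCII.
    static String header(int endOffset) {
        return String.format(Locale.ROOT,
                "Version:0.9\nStartHTML:80\nEndHTML:%8d\n"
                        + "StartFragment:80\nEndFragment:%8d\n",
                endOffset, endOffset);
    }

    // Prepends the header, using the UTF-8 byte length of the HTML
    // (an assumption about how the offsets are counted).
    static String wrap(String html) {
        int htmlBytes = html.getBytes(StandardCharsets.UTF_8).length;
        return header(80 + htmlBytes) + html;
    }
}
```

For the sample fragment a<b>b</b>c, which is 10 bytes long, both end offsets come out as 90.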
A probably more reliable way, which avoids the clipboard, is to save the HTML fragment in a file and use InsertFile. Something like:
public static void insertHTML(this Range range, string html) {
string path = System.IO.Path.GetTempFileName();
System.IO.File.WriteAllText(path, "<html>" + html); // must start with certain tag to be detected as html: <html> or <body> or <table> ...
range.InsertFile(path, ConfirmConversions: false);
System.IO.File.Delete(path); }
It is kind of tricky to add HTML to a Word document. The best way is to create a temporary file and then insert this file into the selected range of the document. The trick is to leverage the InsertFile function of the Range. This allows us to insert arbitrary HTML strings by first saving them as files to a temporary location on disk.
The only catch is that <html> must be the root element of the document.
I use something like this in one of my projects.
private static string HtmlHeader => "<html lang='en' xmlns='http://www.w3.org/1999/xhtml'><head><meta charset='utf-8' /></head>{0}</html>";
public static string GenerateTemporaryHtmlFile(Range range,string htmlContent)
{
string path = Path.GetTempFileName();
File.WriteAllText(path, string.Format(HtmlHeader , htmlContent));
range.InsertFile(FileName: path, ConfirmConversions: false);
return path;
}
It is important to add
charset='utf-8'
to the head of the HTML file; otherwise you may see unexpected characters in your Word document after you insert the file.
Just build a temporary html file with your html content and insert it like below.
// 1- Sample HTML Text
var Html = @"<h1>Sample Title</h1><p>Lorem ipsum dolor <b>his sonet</b> simul</p>";
// 2- Write temporary html file
var HtmlTempPath = Path.Combine(Path.GetTempPath(), $"{Path.GetRandomFileName()}.html");
File.WriteAllText(HtmlTempPath, $"<html>{Html}</html>");
// 3- Insert html file to word
ContentControl ContentCtrl = Document.ContentControls.Add(WdContentControlType.wdContentControlRichText, Missing);
ContentCtrl.Range.InsertFile(HtmlTempPath, ref Missing, ref Missing, ref Missing, ref Missing);