Adding html text to Word using Interop - html

I'm trying to add some HTML formatted text to Word using Office Interop. My code looks like this:
Clipboard.SetText(notes, TextDataFormat.Html);
pgCriteria.Range.Paste();
but it's throwing a Command Failed exception. Any idea?

After spending several hours on this, the solution I found is to use this excellent class:
http://blogs.msdn.com/jmstall/pages/sample-code-html-clipboard.aspx

This worked for me on Windows 7 and Word 2007:
public static void pasteHTML(this Range range, string html)
{
Clipboard.SetData(
"HTML Format",
string.Format("Version:0.9\nStartHTML:80\nEndHTML:{0,8}\nStart" + "Fragment:80\nEndFragment:{0,8}\n", 80 + html.Length) + html + "<");
range.Paste();
}
Sample use: range.pasteHTML("a<b>b</b>c");
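The hard-coded offsets in that snippet can also be computed. Below is a minimal sketch, written in Java rather than C# and only building the payload string (it does not touch the clipboard). The class and method names are hypothetical. The layout assumed here is the documented "HTML Format" header: Version, StartHTML, EndHTML, StartFragment and EndFragment, where the numbers are byte offsets into the UTF-8 data and the fragment sits between StartFragment/EndFragment comment markers.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: builds the text payload for the "HTML Format" clipboard format.
final class CfHtmlSketch {
    // Offsets are zero-padded to a fixed width so the header length is known up front.
    private static final String HEADER =
            "Version:0.9\r\n"
          + "StartHTML:%010d\r\n"
          + "EndHTML:%010d\r\n"
          + "StartFragment:%010d\r\n"
          + "EndFragment:%010d\r\n";
    private static final String PREFIX = "<html><body><!--StartFragment-->";
    private static final String SUFFIX = "<!--EndFragment--></body></html>";

    static String build(String fragment) {
        // The header is pure ASCII, so its character count equals its byte count.
        int startHtml = String.format(Locale.ROOT, HEADER, 0, 0, 0, 0).length();
        int startFragment = startHtml + utf8Length(PREFIX);
        int endFragment = startFragment + utf8Length(fragment);
        int endHtml = endFragment + utf8Length(SUFFIX);
        return String.format(Locale.ROOT, HEADER, startHtml, endHtml, startFragment, endFragment)
                + PREFIX + fragment + SUFFIX;
    }

    // Offsets count UTF-8 bytes, not characters.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }
}
```

The same arithmetic carries over to C#: build the string this way and pass it to Clipboard.SetData("HTML Format", ...) instead of guessing the value 80.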
A probably more reliable way, which avoids the clipboard, is to save the HTML fragment to a file and use InsertFile. Something like:
public static void insertHTML(this Range range, string html) {
string path = System.IO.Path.GetTempFileName();
System.IO.File.WriteAllText(path, "<html>" + html); // must start with certain tag to be detected as html: <html> or <body> or <table> ...
range.InsertFile(path, ConfirmConversions: false);
System.IO.File.Delete(path);
}

Adding HTML to a Word document is a bit tricky. The best way is to create a temporary file and then insert that file into the selected range of the document. The trick is to use the InsertFile function of the Range, which lets us insert arbitrary HTML strings by first saving them as files in a temporary location on disk.
The only catch is that <html> must be the root element of the document.
I use something like this in one of my projects.
private static string HtmlHeader => "<html lang='en' xmlns='http://www.w3.org/1999/xhtml'><head><meta charset='utf-8' /></head>{0}</html>";
public static string GenerateTemporaryHtmlFile(Range range, string htmlContent)
{
// Write the wrapped HTML to a temporary file, then insert that file into the range.
string path = Path.GetTempFileName();
File.WriteAllText(path, string.Format(HtmlHeader, htmlContent));
range.InsertFile(FileName: path, ConfirmConversions: false);
return path;
}
It is important to add
charset='utf-8'
to the head of the HTML file; otherwise you may see unexpected characters in your Word document after you insert the file.

Just build a temporary html file with your html content and insert it like below.
// 1- Sample HTML Text
var Html = @"<h1>Sample Title</h1><p>Lorem ipsum dolor <b>his sonet</b> simul</p>";
// 2- Write temporary html file
var HtmlTempPath = Path.Combine(Path.GetTempPath(), $"{Path.GetRandomFileName()}.html");
File.WriteAllText(HtmlTempPath, $"<html>{Html}</html>");
// 3- Insert html file to word
ContentControl ContentCtrl = Document.ContentControls.Add(WdContentControlType.wdContentControlRichText, Missing);
ContentCtrl.Range.InsertFile(HtmlTempPath, ref Missing, ref Missing, ref Missing, ref Missing);

Getting date in CSS and HTML to PDF's footer

I'm using ITextRenderer to generate a PDF file from HTML and CSS. I have a footer with the current page number on the right.
But now I would like to have the current date on the left.
I found this:
<div data-line="1"></div>
div[data-line]:after {
content: "[line " attr(data-line) "]";
}
But I don't know how to combine it with:
@bottom-left {
content: "Date: ";
}
Is this possible or is there any other way?
I would like the footer to look something like this:
Date: 2015-03-17 12:04
UPDATE 1:
I have a method createPDF(string html, String resourceUrl), that looks like this:
ByteArrayOutputStream out = new ByteArrayOutputStream();
HtmlCleaner cleaner = new HtmlCleaner();
TagNode node = cleaner.clean(html);
CleanerProperties props = cleaner.getProperties();
new SimpleXmlSerializer(props).writeToStream(node, out, "ISO-8859-1");
ITextRenderer renderer = new ITextRenderer();
renderer.setPDFVersion(PdfWriter.VERSION_1_7);
renderer.setDocumentFromString(new String(out.toByteArray(), "ISO-8859-1"));
renderer.getSharedContext().setBaseURL(resourceUrl);
renderer.layout();
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
renderer.createPDF(outputStream);
renderer.finishPDF();
outputStream.flush();
outputStream.close();
return outputStream.toByteArray();
The example that you cite of using data-line in the CSS is an interesting way to dynamically generate CSS based on data-* HTML attributes. These in turn might be dynamically generated in a web page by a server-side or client-side script. But it seems like this might not be the appropriate method for your case.
If you are already using a Java class like ITextRenderer to generate this PDF, then the best method for you is probably just to use Java itself to generate the current date in the format that you want, then print that directly into the CSS as a string.
If you are loading the CSS from a manually created file, one way to do this would be to write your CSS with some text you intend to replace, for example INSERTDATE. Then in Java, load your document the way you normally do, then use some Java code like String.replace() to replace INSERTDATE with today's date.
Update 1:
Based on your sample code above, you could write INSERTDATE where you want the date to appear in your HTML/CSS, then call:
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm"); // your desired format
String strDate = sdf.format(new Date()); // get the current date
html = html.replace("INSERTDATE", strDate);
at the top of your method.
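To show how the placeholder could sit inside the footer rule itself, here is a small self-contained sketch. It assumes the footer is defined with a page margin box such as @page { @bottom-left { content: "Date: INSERTDATE"; } }, and it uses java.time instead of SimpleDateFormat. The class name FooterDateSketch and the method insertDate are made up for illustration.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper: fills a date placeholder in the HTML/CSS before rendering.
final class FooterDateSketch {
    static final String PLACEHOLDER = "INSERTDATE";
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm", Locale.ROOT);

    // The timestamp is passed in so the result is predictable and easy to test.
    static String insertDate(String html, LocalDateTime now) {
        return html.replace(PLACEHOLDER, FORMAT.format(now));
    }
}
```

Calling html = FooterDateSketch.insertDate(html, LocalDateTime.now()); before the HtmlCleaner step would then put the formatted date into the margin box content.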
You can use the CSS string-set property together with the content() and string() functions (see https://www.w3.org/TR/css-gcpm-3/), like this:
<div class="where-content-comes-from">1</div>
<div class="where-content-is-injected"></div>
.where-content-comes-from{
display:none;
string-set: am-just-a-var content();
}
.where-content-is-injected:after {
content: string(am-just-a-var);
}
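Bonus: you could also look at the CSS env() function (see https://developer.mozilla.org/en-US/docs/Web/CSS/env), although it only exposes values that the user agent defines, so a date value is available only if your renderer provides one.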

Allow using some html tags in MVC 4

How can I allow clients to use HTML tags in MVC 4?
I would like to save records to the database and, when they are shown in the view, allow only some HTML tags (<b>, <i>, <img>); all other tags must be displayed as text.
My Controller:
[ValidateInput(false)]
[HttpPost]
public ActionResult Rep(String a)
{
var dbreader = new DataBaseReader();
var text = Request["report_text"];
dbreader.SendReport(text, uid, secret).ToString();
...
}
My View:
@{
var dbreader = new DataBaseReader();
var reports = dbreader.GetReports();
foreach (var report in reports)
{
<div class="report_content">@Html.Raw(report.content)</div>
...
}
}
You can replace all < characters with the HTML entity:
tags = tags.Replace("<", "&lt;");
Now restore only the allowed tags:
tags = tags
.Replace("&lt;b>", "<b>")
.Replace("&lt;/b>", "</b>")
.Replace("&lt;i>", "<i>")
.Replace("&lt;/i>", "</i>")
.Replace("&lt;img ", "<img ");
And render to the page using @Html.Raw(tags)
If you want a property of your view model object to accept HTML text, use AllowHtmlAttribute:
[AllowHtml]
public string UserComment{ get; set; }
and, before binding to the view, remove the tags you do not allow:
model.UserComment=model.UserComment.Replace("<othertagstart/end>",""); //hard
Turn off validation for report_text (1) and write custom HTML encoder (2):
Step 1:
Request.Unvalidated().Form["report_text"]
More info here. You don't need to turn off validation for entire controller action.
Step 2:
Write a custom HTML encoder (convert all tags except b, i and img, e.g. <script> becomes &lt;script&gt;), since you are customizing the default behaviour of request validation and HTML tag filtering. Also consider safeguarding yourself against SQL injection attacks by checking the SQL parameters passed to stored procedures, functions, etc.
You may want to check out BBCode (see BBCode on Wikipedia). This way you have some control over what is allowed and what is not, and can prevent illegal usage.
This would work like this:
A user submits something like 'the meeting will now be on [b]monday![/b]'
Before saving it to your database you remove all real html tags ('< ... >') to avoid the use of illegal tags or code injection, but leave the pseudo tags as they are.
When viewed you convert only the allowed pseudo html tags into real html
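The steps above could be sketched roughly as follows. This is an illustration in Java with hypothetical names (BbCodeSketch, render), and it differs from the description in one respect: instead of removing real HTML tags before saving, it escapes them when rendering, so nothing the user typed is lost. Only the [b] and [i] pseudo tags are handled here.

```java
import java.util.regex.Pattern;

// Hypothetical helper: renders stored text with [b] and [i] pseudo tags as HTML.
final class BbCodeSketch {
    private static final Pattern BOLD =
            Pattern.compile("\\[b\\](.*?)\\[/b\\]", Pattern.DOTALL);
    private static final Pattern ITALIC =
            Pattern.compile("\\[i\\](.*?)\\[/i\\]", Pattern.DOTALL);

    // Neutralize real markup; & goes first so the entities added below are not re-escaped.
    static String escapeHtml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    // Escape everything, then turn only the allowed pseudo tags into real tags.
    static String render(String stored) {
        String html = escapeHtml(stored);
        html = BOLD.matcher(html).replaceAll("<b>$1</b>");
        html = ITALIC.matcher(html).replaceAll("<i>$1</i>");
        return html;
    }
}
```

Because the escaping runs before the pseudo tags are converted, a submitted <script> can only ever come out as text, while [b]monday![/b] becomes real bold markup.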
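I found a solution to my problem:
// assumes html was HTML-encoded first, so the tags appear as &lt;...&gt;
html = Regex.Replace(html, "&lt;b&gt;(.*?)&lt;/b&gt;", "<b>$1</b>");
html = Regex.Replace(html, "&lt;i&gt;(.*?)&lt;/i&gt;", "<i>$1</i>");
html = Regex.Replace(html, "&lt;img(?:.*?)src=&quot;(.*?)&quot;(?:.*?)/&gt;", "<img src=\"$1\"/>");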

LibTiff.NET append mode bug?

I've recently started using LibTiff.NET for writing TIFF IPTC tags and discovered strange behavior on some files that I have here. I'm using the sample code that ships with the LibTiff.NET binaries, and it works fine with most images, but some files end up with corrupted image data after these lines:
class Program
{
private const TiffTag TIFFTAG_GDAL_METADATA = (TiffTag)42112;
private static Tiff.TiffExtendProc m_parentExtender;
public static void TagExtender(Tiff tif)
{
TiffFieldInfo[] tiffFieldInfo =
{
new TiffFieldInfo(TIFFTAG_GDAL_METADATA, -1, -1, TiffType.ASCII,
FieldBit.Custom, true, false, "GDALMetadata"),
};
tif.MergeFieldInfo(tiffFieldInfo, tiffFieldInfo.Length);
if (m_parentExtender != null)
m_parentExtender(tif);
}
public static void Main(string[] args)
{
// Register the extender callback
// It's a good idea to keep track of the previous tag extender (if any) so that we can call it
// from our extender allowing a chain of customizations to take effect.
m_parentExtender = Tiff.SetTagExtender(TagExtender);
string destFile = @"d:\00000641(tiffed).tif";
File.Copy(@"d:\00000641.tif", destFile);
//Console.WriteLine("Hello World!");
// TODO: Implement Functionality Here
using (Tiff image = Tiff.Open(destFile, "a"))
{
// we should rewind to first directory (first image) because of append mode
image.SetDirectory(0);
// set the custom tag
string value = "<GDALMetadata>\n<Item name=\"IMG_GUID\">" +
"817C0168-0688-45CD-B799-CF8C4DE9AB2B</Item>\n<Item" +
" name=\"LAYER_TYPE\" sample=\"0\">athematic</Item>\n</GDALMetadata>";
image.SetField(TIFFTAG_GDAL_METADATA, value);
// rewrites directory saving new tag
image.CheckpointDirectory();
}
// restore previous tag extender
Tiff.SetTagExtender(m_parentExtender);
Console.Write("Press any key to continue . . . ");
Console.ReadKey(true);
}
}
After opening the file I see a mostly blank white image, or multiple black and white lines, instead of the text that was there (I don't need to read or write tags to produce this behavior). I noticed this happens when the image already has a custom tag (the console window warns about it) or when one of the tags has a 'bad value' (in this case the console window says 'vsetfield:%pathToTiffFile%: bad value 0 for "%TagName%" tag').
Original image: http://dl.dropbox.com/u/1476402/00000641.tif
Image after LibTiff.NET: http://dl.dropbox.com/u/1476402/00000641%28tiffed%29.tif
I would be grateful for any help provided.
You probably should not use the CheckpointDirectory method for files opened in append mode. Try using the RewriteDirectory method instead.
It will rewrite the directory, but instead of placing it at its old
location (as WriteDirectory() would) it will place them at the end of
the file, correcting the pointer from the preceding directory or file
header to point to its new location. This is particularly important
in cases where the size of the directory and pointed to data has
grown, so it won’t fit in the space available at the old location.
Note that this will result in the loss of the previously used
directory space.

Word having single quotes search from xml file using jquery issue

Hi, I need to parse an XML file using jQuery. I created read and display functionality, but it does not work when a word contains a single quote.
My XML looks like this:
<container>
<data name="Google" definition="A search engine"/>
<data name=" Mozilla's " definition="A web browser"/>
</container>
Using my jQuery code I can read the definition of Google, but I can't read the definition of Mozilla's because of the single quote. This is my jQuery code:
var displayDefinition = function(obj){
$.get("definitions.xml", function(data){
xml_data1.find("data[name^='"+obj.innerHTML+"']").each(function(k, v){
right=''+ $(this).attr("Defination") + '';
}
}
$(".result").append(right);
}
If anybody knows the solution, please help me.
Thanks
jQuery deals with single quotes very well. The structure of your function looks really wild, though. I changed it a bit, assuming you want a function that displays the definition for the name you pass in: http://jsfiddle.net/rkw79/VQxZ2/
function display(id) {
$('container').find('data[name="' +id.trim()+ '"]').each(function() {
var right = $(this).attr("definition");
$(".result").html(right);
});
}
Note, you have to make sure your 'name' attribute does not begin or end with spaces; and just trim the string that the user passes in.

jsf messages: adding link

Currently in JSF, all HTML contained within a message (rich:messages tag) is escaped and just shows up as the markup. For example, in my backing bean, I have:
createMessage("Title created successfully with product number: " + product.getProductNumber() + ".");
where createMessage() is just a helper function that adds a new Message to the faces context and is then viewable in my rich:messages tag.
When this message is created, my message simply shows up with the escaped HTML:
Title created successfully with product number: 1234.
Is there any way to avoid this and just provide an actual link in the message instead?
Thanks in advance
~Zack
A quick solution is to create a new renderer.
I've done this for h:messages as I wanted to separate the messages of different severities into separate divs. If you never want to use the default renderer then it's a good option.
The standard class that you would overwrite/extend is:
public class MessagesRenderer extends HtmlBasicRenderer
You would just use a ResponseWriter that doesn't escape the text. The concrete class is HtmlResponseWriter, which does escape the text. You could extend it and override
public void writeText(Object text, String componentPropertyName)
so that it doesn't use HtmlUtils.
Then just add your new renderer to faces-config.xml
<render-kit>
<renderer>
<component-family>javax.faces.Messages</component-family>
<renderer-type>javax.faces.Messages</renderer-type>
<renderer-class>com.mypackage.MessagesRenderer</renderer-class>
</renderer>
</render-kit>
It sounds like you need to create your own version of rich:messages that has an escape attribute, like h:outputText, so you can disable HTML escaping.
If you're using jquery you can unescape the xml characters:
<script type="text/javascript">
//<![CDATA[
$(document).ready(function() {
$(".esc").each(function(i) {
var h = $(this).html();
h = h.replace(/&lt;/gi, "<");
h = h.replace(/&gt;/gi, ">");
$(this).html(h);
});
});
//]]>
</script>