Encoding HTML within XML - html

I have a big XML file which contains some HTML
<Orchard>
<Recipe>
<Name>Generated by Orchard.ImportExport</Name>
<Author>admin</Author>
</Recipe>
<Data>
<BodyPart Text="<p>My HTML</p><p align ="center">blah blah</p>"/>
</Data>
</Orchard>
I want to encode the HTML, but leave the XML unencoded.
I've given regular expressions a shot but couldn't come up with a solution.
Any ideas?
Cheers

If you want a simple hand-coded solution:
<Orchard>
<Recipe>
<Name>Generated by Orchard.ImportExport</Name>
<Author>admin</Author>
</Recipe>
<Data>
<BodyPart><Text><![CDATA[<p>My HTML</p><p align ="center">blah blah</p>]]></Text></BodyPart>
</Data>
</Orchard>
...but bear in mind that if the text "]]>" is present in the HTML, it will need to be escaped.
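If you can't modify the structure of the file, use the DOM to find the attribute. Inside the attribute value you need to escape the ampersand (as &amp;amp;), the less-than sign (as &amp;lt;, since a literal < is not allowed in an XML attribute value), and the enclosing quote (the double quote in your case, as &amp;quot;).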
You might indicate what server-side language you are using and whether you can change the XML.
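As a language-neutral illustration of both options, here is a minimal Python sketch using only the standard library. The helper names escape_attr and wrap_cdata are made up for this example; the sample HTML is the one from the question, and the " ]]> tail" suffix is added only to exercise the CDATA edge case.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def escape_attr(value):
    """Escape text for use inside a double-quoted XML attribute (hypothetical helper)."""
    # escape() handles &, < and >; the extra mapping adds the double quote.
    return escape(value, {'"': "&quot;"})


def wrap_cdata(text):
    """Wrap text in CDATA, splitting any "]]>" so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


html = '<p>My HTML</p><p align ="center">blah blah</p>'

# Option 1: keep the attribute and escape its value.
attr_form = '<BodyPart Text="{}"/>'.format(escape_attr(html))

# Option 2: move the HTML into a CDATA section inside a child element.
cdata_form = "<Text>{}</Text>".format(wrap_cdata(html + " ]]> tail"))

# Both forms are well-formed XML and parse back to the original text.
attr_roundtrip = ET.fromstring(attr_form).get("Text")
cdata_roundtrip = ET.fromstring(cdata_form).text

print(attr_form)
print(cdata_form)
```

Splitting "]]>" across two CDATA sections is the usual workaround, because a CDATA section has no escape mechanism of its own.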

Here's my solution (not the nicest way, I know, but it worked):
I moved the HTML into a CDATA as Brett Zamir suggested.
I then created a small program to parse the XML, find all the BodyPart items, and escape the HTML inside. It then moved the escaped HTML into the Text attribute and deleted the inner text.
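The program itself isn't shown, so the following is only a rough Python sketch of the same steps, not the original code. It assumes the intermediate file uses the <BodyPart><Text><![CDATA[...]]></Text></BodyPart> layout from the earlier answer; the function name move_html_into_attribute is invented for the example. ElementTree escapes attribute values when it serializes, so no manual escaping is needed.

```python
import xml.etree.ElementTree as ET


def move_html_into_attribute(xml_text):
    """Move each BodyPart's <Text> child content into a Text attribute."""
    root = ET.fromstring(xml_text)
    # list() takes a snapshot so the tree can be modified while looping.
    for body_part in list(root.iter("BodyPart")):
        text_el = body_part.find("Text")
        if text_el is None:
            continue  # nothing to move for this item
        # The parser has already unwrapped the CDATA into plain text.
        body_part.set("Text", text_el.text or "")
        body_part.remove(text_el)
    # Serializing escapes &, <, > and " inside the attribute value.
    return ET.tostring(root, encoding="unicode")


source = """<Orchard>
<Data>
<BodyPart><Text><![CDATA[<p>My HTML</p><p align ="center">blah blah</p>]]></Text></BodyPart>
</Data>
</Orchard>"""

result = move_html_into_attribute(source)
print(result)
```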

You can also copy the BodyPartDriver to your own project and override the Importing and Exporting methods. This way you can encrypt, encode, or do whatever you want. It will run after the Import/Export from the usual BodyPartDriver.
[UsedImplicitly]
public class BodyPartDriver : ContentPartDriver<BodyPart>
{
    protected override void Importing( BodyPart part, ImportContentContext context )
    {
        //Do your decoding here
        var importedText = context.Attribute( part.PartDefinition.Name, "Text" );
        if ( importedText != null )
        {
            part.Text = importedText;
        }
    }

    protected override void Exporting( BodyPart part, ExportContentContext context )
    {
        //Do your encoding here
        context.Element( part.PartDefinition.Name ).SetAttributeValue( "Text", part.Text );
    }
}

How to show HTML content in TextBlock using XAML from UWP app?

In JSON response:
results = "<p>This is a <b>paragraph</b></p><p>New paragraph with symbols > tags</p>";
XAML:
<TextBlock Text="{Binding results}"/>
result:
This is a **paragraph** New Word
New paragraph with symbols > tags
You can use RichTextBlock to more easily match the HTML DOM with the XAML output. Unfortunately, there is no built-in API that will transform HTML into equivalent XAML for the control.
You can parse the HTML into known tags using HtmlAgilityPack and add items into RichTextBlock.Inlines manually. There is an old Docs article on this process, but it still applies. One of the examples it shows:
private static Inline GenerateBlockForNode(HtmlNode node)
{
switch (node.Name)
{
case "div":
return GenerateSpan(node);
case "p":
case "P":
return GenerateInnerParagraph(node);
case "img":
case "IMG":
return GenerateImage(node);
...
The individual GenerateXXX methods then generate appropriate inlines:
private static Inline GenerateSpan(HtmlNode node)
{
Span s = new Span();
AddChildren(s, node);
return s;
}
The easiest solution would be to use the code in this GitHub repo, which implements a lot of the tag conversion; you may be able to just copy the converter into your project and get running.

MvcHtmlString.Create() method does not return Html encoded string

I must be missing something here, because this doc says MvcHtmlString.Create("someStringHere") returns an HTML-encoded string, but if I have something like MvcHtmlString.Create("<h1>myHeading</h1>") it still shows up as myHeading (as a heading) and not as the encoded text <h1>myHeading</h1>.
Then what is meant by the statement
MvcHtmlString.Create(String)
Creates an HTML-encoded string using the specified text value.
I don't understand this and would be grateful if somebody could explain what the doc means, and what the difference is between the encoding it refers to and HTML encoding.
Thanks in advance!
I agree that the documentation seems misleading for MvcHtmlString.
However, MvcHtmlString is intended to be used when you don't want a string to be HTML encoded. The default behaviour of razor is to encode the output.
The Html string that you pass to it should already be encoded to ensure that it is outputted without additional encoding.
So assuming the following HTML helper:
public static class HtmlHelper
{
    public static string GetHtmlString(this System.Web.Mvc.HtmlHelper htmlHelper)
    {
        return "<h1>myHeading</h1>";
    }

    public static MvcHtmlString GetMvcHtmlString(this System.Web.Mvc.HtmlHelper htmlHelper)
    {
        return MvcHtmlString.Create("<h1>myHeading</h1>");
    }
}
With the Razor view:
@Html.GetHtmlString()
@Html.GetMvcHtmlString()
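The rendered markup would be:
&lt;h1&gt;myHeading&lt;/h1&gt;
<h1>myHeading</h1>
so the browser shows the first line as the literal text <h1>myHeading</h1> and the second as a heading.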
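The same "encode by default, unless the value is marked as already safe" idea can be sketched outside ASP.NET. The Python snippet below is only an analogy, not MVC code: RawHtml and render are hypothetical names standing in for MvcHtmlString and for Razor's output step.

```python
from html import escape


class RawHtml(str):
    """Marker type: the content is already HTML and must not be encoded again."""


def render(value):
    """Mimic a view engine that HTML-encodes everything except marked strings."""
    if isinstance(value, RawHtml):
        return str(value)       # trusted markup passes through untouched
    return escape(str(value))   # plain strings are encoded for safety


plain = render("<h1>myHeading</h1>")
marked = render(RawHtml("<h1>myHeading</h1>"))

print(plain)
print(marked)
```

The wrapper does no encoding itself; it only tells the renderer to skip its own encoding, which is why the string you wrap has to be safe already.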

How to build html using HTML helpers in MVC3

I've a helper like this, I created this using raw HTML inside as follows:
private static readonly Core Db = new Core();
// Main menu
public static MvcHtmlString MainMenu()
{
IQueryable<Page> primaryPages = Db.Pages.Where(p => p.IsItShowInMenu);
var sb = new StringBuilder();
sb.Clear();
string pagecode = Convert.ToString(HttpContext.Current.Request.RequestContext.RouteData.Values["url"]);
sb.Append("<div id=\"Logo\">");
sb.Append("<span id=\"Logo_Text\">Dr. Shreekumar</span> <span id=\"Logo_Sub_Text\">Obstetrician & Gynecologist</span>");
sb.Append("</div>");
sb.Append("<div id=\"Primary_Menu\">");
sb.Append("<ul>");
foreach (Page page in primaryPages)
{
if (page.PageCode != "Home")
{
Page currentPage = Db.Pages.SingleOrDefault(p => p.PageCode == pagecode);
if (currentPage != null)
{
Page parentPage = Db.Pages.Find(currentPage.ParentId);
if (parentPage != null)
{
sb.AppendFormat((page.PageCode == parentPage.PageCode ||
page.PageCode == currentPage.PageCode)
? "<li class=\"active\">{1}</li>"
: "<li>{1}</li>", page.PageCode,
page.Name.Trim());
}
else
{
sb.AppendFormat("<li>{1}</li>", page.PageCode,page.Name);
}
}
else
{
sb.AppendFormat("<li>{1}</li>", page.PageCode, page.Name);
}
}
}
sb.Append("</ul>");
sb.Append("</div>");
return new MvcHtmlString(sb.ToString());
}
Can anybody suggest how I can convert this to use MVC HTML helpers (helpers for anchors, list items (li), divs, etc.)?
It is an important part of your role as the architect of your application to define what will be generated by helpers and what not, as it depends on what is repeated where and how often in your code. I am not going to tell you what to build helpers for because that depends on the architecture of your whole application. To help you make the decision, however, consider the two general types of helpers you can build: global and local.
Global helpers are for chunks of code which are often repeated across your site, possibly with a few minor changes that can be handled by passing in different parameters. Local helpers do the same job, but are local to a given page. A page which has a repeating segment of code that isn't really found anywhere else should implement a local helper. Now then...
Global helpers: Create a new static class to contain your helpers. Then, create static methods inside the container class that look like this:
public static MvcHtmlString MyHelper(this HtmlHelper helper, (the rest of your arguments here))
{
// Create your HTML string.
return MvcHtmlString.Create(your string);
}
What this does is create an extension method on the Html helper class which will allow you to access your helpers with the standard Html. syntax. Note that you will have to include the namespace of this class in any files where you want to use your custom helpers.
Local helpers: The other way to do helpers works when you want them to be local to a single view. Perhaps you have a block of code in a view that is being repeated over and over again. You can use the following syntax:
@helper MyHelper()
{
    // Create a string
    @MvcHtmlString.Create(your string here);
}
You can then output this onto your page using:
@MyHelper()
The reason why we are always creating MvcHtmlString objects is because as a security feature built into MVC, outputted strings are encoded to appear as they look in text on the page. That means that a < will be encoded so that you actually see a "<" on the page. It won't by default start an HTML tag.
To get around this, we use the MvcHtmlString class, which bypasses this security feature and allows us to output HTML directly to the page.
I suggest you move all this logic into a separate Section as it is a Menu that is being rendered.
Instead of building the HTML from the code, it is cleaner and a lot more convenient to build it using Razor's helpers. Refer to this as well as this article from Scott Gu on how to render sections to get a quick starting guide.
Consider using Helper methods such as
@Html.DropDownListFor() or
@Html.DropDownList()

Trying to override Nokogiri's serializing behaviour

I'm using Nokogiri to alter an HTML tree and output the code. I need to alter the way a particular node outputs to html (details below), so I've subclassed Nokogiri::XML::Node.
How do I override that subclass' output behaviour?
Right now, if I override to_html(), then I get the display I want when calling to_html() for instances of Nokogiri::HTML::DocumentFragment, but when I call it on instances of Nokogiri::HTML::Document, the normal output behaviour takes over. That won't do because I actually need to make changes to the document head (which is excluded from DocumentFragment instances).
Why I need to alter the HTML output:
I need to be able to include an unpartnered </noscript> tag for the sake of using GWO with my code. However, I can't add an unpartnered end tag in an HTML tree.
With Nokogiri, I can't add it as text either because the < and > get escaped as html char codes.
I can't use Hpricot for this project because I'm running it over some bad code (written by others at work), and Hpricot won't preserve the errors in question (like putting a block element inside of an <a> element). (No, I'm not about to track down all the bad HTML and fix it.)
Specs: WinXP, Ruby 1.8.6, Nokogiri 1.4.4
Update:
For a reason I can't guess, when I create a constructor for my subclass, regardless of how many parameters I require for the subclass constructor, I get errors if I supply any number but two (the number of params required for the superclass).
class NoScript < Nokogiri::XML::Node
  def initialize(doc)
    super("string", doc)
  end
end
I haven't had this problem with other classes. Am I missing something?
Most likely, your code calls write_to at some point (to_html calls serialize, and serialize calls write_to). That in turn calls native_write_to on the current node. Let's take a look at it.
static VALUE native_write_to(
VALUE self,
VALUE io,
VALUE encoding,
VALUE indent_string,
VALUE options
) {
xmlNodePtr node;
const char * before_indent;
xmlSaveCtxtPtr savectx;
Data_Get_Struct(self, xmlNode, node);
xmlIndentTreeOutput = 1;
before_indent = xmlTreeIndentString;
xmlTreeIndentString = StringValuePtr(indent_string);
savectx = xmlSaveToIO(
(xmlOutputWriteCallback)io_write_callback,
(xmlOutputCloseCallback)io_close_callback,
(void *)io,
RTEST(encoding) ? StringValuePtr(encoding) : NULL,
(int)NUM2INT(options)
);
xmlSaveTree(savectx, node);
xmlSaveClose(savectx);
xmlTreeIndentString = before_indent;
return io;
}
The code is on GitHub. If you read it, you will see that it does not call your to_html anywhere, so your custom method is never run. On the other hand, if you use a Nokogiri::HTML::DocumentFragment, it is called, because DocumentFragment#to_html relies on Nokogiri::XML::NodeSet#to_html, which is a plain map:
def to_html *args
  if Nokogiri.jruby?
    options = args.first.is_a?(Hash) ? args.shift : {}
    if !options[:save_with]
      options[:save_with] = Node::SaveOptions::NO_DECLARATION | Node::SaveOptions::NO_EMPTY_TAGS | Node::SaveOptions::AS_HTML
    end
    args.insert(0, options)
  end
  map { |x| x.to_html(*args) }.join
end

Is there a list of defined C# functions for userCSharp in XSLT?

I'm trying to debug a BizTalk map that has some custom XSLT in it that makes use of C#. I've found:
userCSharp:MathSubtract
userCSharp:MathAdd
userCSharp:StringSize
userCSharp:StringSubstring
and a few others but I'm finding it difficult to find some resources online defining all of the available predefined c# functions and their documentation.
The reason I ask is that I have a "userCSharp:StringFind" call which blows up, saying StringFind() is an unknown XSLT function.
The xslt functions MathSubtract, MathAdd etc correspond to the predefined Functoids that your map uses (in the xmlns 'userCSharp').
Most of the functoids are just inline XSLT C# functions - BizTalk adds the C# script for the functoid at the bottom of the xslt when the map gets compiled. (I think some of the simple functoids can use xslt primitives as well). Your own script functoids will also be added to this block.
You can see what BizTalk is doing by compiling your assembly containing the maps, and then using the "Show all Files" command to look at the corresponding .btm.cs file to see what has been added.
The BizBert site gives quite a good reference on the implementation of each of the functoids.
(The double "" escaping is because the XSLT is kept in a string constant)
private const string _strMap = @"<?xml version=""1.0"" encoding=""UTF-16""?>
<xsl:stylesheet xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"" xmlns:msxsl=""urn:schemas-microsoft-com:xslt""
...
xmlns:userCSharp=""http://schemas.microsoft.com/BizTalk/2003/userCSharp"">
and then a script CDATA block at the bottom
<msxsl:script language=""C#"" implements-prefix=""userCSharp""><![CDATA[
public bool IsNumeric(string val)
{
if (val == null)
{
return false;
}
double d = 0;
return Double.TryParse(val, System.Globalization.NumberStyles.AllowThousands | System.Globalization.NumberStyles.Float, System.Globalization.CultureInfo.InvariantCulture, out d);
}
public string MathAdd(string param0, string param1)
{
System.Collections.ArrayList listValues = new System.Collections.ArrayList();
... etc