Is it possible to insert HTML content in XML document? - html

I need to insert HTML content into an XML document, is this possible or should HTML content be, for example, encoded in BASE64 or with something else like that?

You can include HTML content. One possibility is encoding it in BASE64 as you have mentioned.
Another might be using CDATA tags.
Example using CDATA:
<xml>
<title>Your HTML title</title>
<htmlData><![CDATA[<html>
<head>
<script/>
</head>
<body>
Your HTML's body
</body>
</html>
]]>
</htmlData>
</xml>
Please note:
CDATA's opening character sequence: <![CDATA[
CDATA's closing character sequence: ]]>

so long as your html content doesn't need to contain a CDATA element, you can contain the HTML in a CDATA element, otherwise you'll have to escape the XML entities.
<element><![CDATA[<p>your html here</p>]]></element>
VS
<element><p>your html here</p></element>

The purpose of BASE64 encoding is to take binary data and be able to persist that to a string. That benefit comes at a cost, an increase in the size of the result (I think it's a 4 to 3 ratio). There are two solutions. If you know the data will be well formed XML, include it directly. The other, an better option, is to include the HTML in a CDATA section within an element within the XML.

Please see this.
Text inside a CDATA section will be ignored by the parser.
http://www.w3schools.com/xml/dom_cdatasection.asp
This is will help you to understand the basics about XML

Just put the html tags with there content and add the xmlns attribute with quotes after the equals and in between the quotes is http://www.w3.org/1999/xhtml

Related

How do I show actual HTML Code in textarea rather than rendered HTML?

I have a code that saves (html code) plus (some text) in mysql from textarea.
I then take the text from the mysql and display it under the textarea. The thing is if I save the code
<div style="color:red">Hello</div>
in mysql and then display it, I see Hello in red, but I want to see the actual
<div style="color:red">Hello</div>
to appear under the textarea. I hope you understand my problem.
so when you've grabbed the data from the database you want to actually display the html, rather than the page rendering the html?
if so just use the php function htmlentities();
You can use the xmp element, see What was the tag used for. It has been in HTML since the beginning and is supported by all browsers. Specifications frown upon it, but HTML5 CR still describes it and requires browsers to support it (though it also tells authors not to use it, but it cannot really prevent you).
Everything inside xmp is taken as such, no markup (tags or character references) is recognized there, except, for apparent reason, the end tag of the element itself, .
Otherwise xmp is rendered like pre.
When using “real XHTML”, i.e. XHTML served with an XML media type (which is rare), the special parsing rules do not apply, so xmp is treated like pre. But in “real XHTML”, you can use a CDATA section, which implies similar parsing rules. It has no special formatting, so you would probably want to wrap it inside a pre element:
<![CDATA[
This is a demo, tags will
appear literally.
<div style="color:red">Hello</div>
]]>
you can refer this ans : https://stackoverflow.com/a/16785992/3000179
If you want to do on browser level, you can follow the steps :
Replace the & character with &
Replace the < character with <
Replace the > character with >
Optionally surround your HTML sample with <pre> and/or <code>
tags.
Hope this helps.

Display CDATA as text in html

How can I display <!CDATA[ as text in an html document? I tried wrapping it in <pre> tags but that didn't work.
<! DOCTYPE html >
<html>
<body>
<div>This should show <pre><!CDATA[</pre> when the page renders</div>
</body>
</html>
Browsers do not support CDATA markers in text/html documents. Use character references instead (<, &, etc).
As others have written, you should escape “<” as <. HTML markup like pre or code does not change the way content is parsed, with “<” as markup-significant. You can use e.g. code markup here, since the content is computer code and this markup may have some advantages, but that’s a different issue. Example: <code><!CDATA[</code>.
There is a half-secret markup, however: <xmp><!CDATA[</xmp>. Within an xmp element, everything is taken literally; only the end tag of this element is recognized. Such markup is regarded as very obsolete by many, and it has been dropped from HTML specs, but it actually keeps working.

Mixing own XML with HTML5 havin eclipse to display code hints

I am writing my own templating engine mainly for web applications.
It is actually mix of my own XML tags and HTML.
Here is the sample:
<lp:view xmlns:lp="http://sminit.com/view" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://sminit.com/view view.xsd ">
<lp:list name="my_items">
<lp:list_header>
<table>
</lp:list_header>
<lp:list_item>
<tr><td>$title$</td></tr>
</lp:list_item>
<lp:list_footer>
</table>
</lp:list_footer>
</lp:list>
</lp:view>
A little explanation:
Those tags prefixed with "lp" belong to my templating engine and are kind of "processing instructions" for it. The lp:view is a root node, then there is a lp:list node which having received some data source will produce a list: first it will include content of lp:list_header, then repeat proper times content of lp:list_item (replacing $title$ by actual data, but this does not matter here), then it will add content of lp:list_footer node. As you can see, for this reason I have html tag "table" splitting across my tags.
I have met two major problems here:
1. Eclipse complains that "table" is not properly closed -- I want Eclipse to stop complaining, treat this tag as a text or -- maybe you can suggest something?
2. Eclipse will not show any code hint if I am inside any of html tags. (code hint: attributes that maybe used by this tag like "class" or "id" etc)
I understand that I'm asking a weird freak question, but maybe there are some XSD gurus here who can direct me:
Eclipse should treat my xml template file as the following:
1. the tags prefixed "lp" are gods! They have precedence over anything other. Only errors from that tags (missing required attributes, missing required child elements etc) should be displayed.
2. All the other tags (any stuff in between angle brackets) are HTML tags. Eclipse should display code hint for them, but should anything be "incorrect" (like in my sample: no closing /table tag) -- Eclipse should not complain.
I hope this is possible.
thanks!
You would have to wrap your HTML in CDATA blocks. This will make the XML parser consider the contents (the unclosed <table>) to be plain text, and not a broken tag.
<lp:list_header><![CDATA[
<table>
]]></lp:list_header>
This is just a partial answer, but I'll still put it as an answer because it's too long to type into a comment.
To stop Eclipse complaining about unclosed tags, you should wrap the content in a <![CDATA[..]] section like so:
<lp:view xmlns:lp="http://sminit.com/view" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://sminit.com/view view.xsd ">
<lp:list name="my_items">
<lp:list_header>
<![CDATA[ <table> ]]>
</lp:list_header>
<lp:list_item>
<tr><td>$title$</td></tr>
</lp:list_item>
<lp:list_footer>
<![CDATA[ </table> ]]>
</lp:list_footer>
</lp:list>
They will be treated as text and Eclipse will not complain, but in that case you will lose any Eclipse completion inside the CDATA section.
To get completion working for HTML tags, I think you can try adding a default namespace for XHTML to your root tag, like so:
<?xml version="1.0" ?>
<lp:view xmlns="http://www.w3.org/1999/xhtml" xmlns:lp="http://sminit.com/view" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://sminit.com/view view.xsd ">
<lp:list name="my_items">
<lp:list_header>
<![CDATA[ <table> ]]>
</lp:list_header>
<lp:list_item>
<tr><td>$title$</td></tr>
</lp:list_item>
<lp:list_footer>
<![CDATA[ </table> ]]>
</lp:list_footer>
</lp:list>
EDIT: I think the second part won't work though, because the XHTML schema defines that the root element should be <html>. I just tried in Eclipse and completion for HTML tags only starts working when I first insert an <html> tag somewhere in the document. Maybe some other people can weigh in.

CDATA not rendering tags

I'm trying to display XML tags mixed in with plain text on a web page. I do this from a python script that obtains it's data from a database. I've simplified my problem to the program below.
#!/usr/local/bin/python
print """Content-type: text/html;charset=utf-8\n\n"""
print """<html><body>
start:<![CDATA[This is the <xml> tag </xml>.]]>:end
</body></html>"""
I'm expecting it to display the following:
start:This is the <xml> tag </xml>.:end
In both IE8 and Chrome15 it however displays the following:
start: tag .]]>:end
When I look at the HTML source of the page in IE, I can see the following:
<html><body>
start:<![CDATA[This is the <xml> tagxml.]]>:end
</body></html>
In Chrome I see the the same when looking at the source, but it seems that the <![CDATA[This is the <xml> part is in green because it is considered a comment.
I particularly want to keep the text (instead of converting the < to <) because via javascript I access the elements, allowing people to edit them in a separate textarea. Converting them would then save them converted, resulting in problems further down in processing. I could convert them back before saving, but this seems like the wrong approach.
Any idea what I'm doing wrong?
Thanks in advance,
Grant
CDATA is part of XML, not HTML, so the browser ignores it, and then treats any tags in it as it would any other tags - ignoring ones it doesn't recognise, and paying attention to those it does.
I think there's no alternative but to use < etc and convert to tags when editing and convert back when saving.
<!DOCTYPE html>
<html>
<body>
<div id="div1"><b>hi</b></div>
<textarea id="area"></textarea>
<script type="text/javascript">
var div1 = document.getElementById('div1')
var area = document.getElementById('area')
var text = div1.firstChild.nodeValue
area.value = text
</script>
</body>
</html>
Where the problem is?

Are there any issues with always self closing empty tags in html?

Are there any browser issues with always collapsing empty tags in html.
So for example an empty head tag can be written like this
<head></head>
but is can also be written like this
<head/>
Will the second case cause issues in any scenerio?
Thanks
Self-closing <script> tags can mess up some browsers really badly. I remember my whole page disappearing into thin air in IE after I self-closed a script tag - everything after it was read as a script.
Assuming that you are serving your XHTML as XML, no. <head></head> is entirely equivalent to <head />. In fact, an XML parser won't even bother to tell you which one you have.
(There is, however, an issue in that the <head> tag must contain a <title>.)
You shouldn't use minimized form for head in XHTML.
http://www.w3.org/TR/xhtml1/#guidelines
About empty elements:
http://www.w3.org/TR/xhtml1/#C_3
Given an empty instance of an element
whose content model is not EMPTY (for
example, an empty title or paragraph)
do not use the minimized form (e.g.
use <p> </p> and not <p />).
In other words, paragraph should always be closed in XHTML, in HTML you could go with only opening tag. But if the element is supposed to have content it should be properly opened and closed.
For example line break has EMPTY content model and can be written as <br /> (same goes for <hr />) but not <div />.
Also see this SO question.
Empty Elements (XHTML)
Shorthand markup in HTML
Self-closing tags don't exist in HTML. The / is always ignored, that is, <foo/> and <foo> are equivalent. For elements such as br, that's fine, because you want <br>. However, <script src="..." /> means the same as <script src="...">, which is a problem (as noted in other answers). <head/> is less of a problem, because the </head> end tag is optional anyway.
In XML, on the other hand, self-closing tags do what you want. However, you probably aren't using XML, even if you've got an XHTML doctype. Unless you send your documents with a text/xml, application/xml or application/xhtml+xml MIME type (or any other XML MIME type), particularly if you send them as text/html, they will not be treated as XML.
Not that I am aware of. One caveat that has bitten me in the past is self closing my script tag: <script type="text/javascript" src="somefile.js" />
This results in some interesting fail.
In general an empty element can be written as a self closing tag, or opening and closing tags.
However, the HTML4 DTD specifies that the document HEAD must contain a TITLE element.
"Every HTML document must have a TITLE element in the HEAD section."
http://www.w3.org/TR/1999/REC-html401-19991224/struct/global.html#h-7.4.1
I believe some older browsers had problems with the lack of whitespacing - in particular
<head/> would be interpreted as a "head/" tag, whereas <head /> will be interpreted as a "head" tag with a blank attribute "/" which is ignored.
This only affects a few browsers, AFAIK. Either is valid XHTML, but older HTML-only browsers might have trouble.
This is in fact documented in the XHTML guidelines as C.2
Even considering only browser issues (i.e. disregarding validity) and narrowing the question down to the head tag alone, the answer is still yes.
Compare
<head/>
<object>Does this display?</object>
against
<head></head>
<object>Does this display?</object>
each served as text/html to any version of IE.
Does this display? will be shown only in the latter example.