Problem adding namespaces to MSXML (using setProperty('SelectionNamespaces', ...)) - namespaces

A while back, I asked a question regarding the usage of namespaces in MSXML. At first, I circumvented the whole thing with the XPath *[local-name()]-hack (see my previous post), but having a crisis of conscience I decided to do things the right way. (Doh!)
Consider the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<Root xsi:schemaLocation="http://www.foo.bar mySchema.xsd" xmlns="http://www.foo.bar" xmlns:ds="http://www.w3.org/2000/09/xmldsig#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<MyElement>
</MyElement>
</Root>
When I try to add these namespaces using IXMLDOMDocument3.setProperty('SelectionNamespaces', NSString);, I get the following error: "SelectionNamespaces property value is invalid. Only well-formed xmlns attributes are allowed.". When removing the namespace xsi:schemaLocation="http://www.foo.bar mySchema.xsd", everything runs smoothly. What am I doing wrong here? Is there an error in the XML? Is MSXML to blame?

xsi:schemaLocation="..." is not a namespace definition, it is an attribute of the <Root> element which is in xsi namespace.
So removing this from the list of namespaces as you did is already the solution.

Related

XSL stylesheet keeps Firefox from recognising DTD-defined ids

I want a client-side XSL-transformed document with elements targettable (jumpable to) by #foo (URL fragments). Problem is, as soon as I attach the simplest XSL stylesheet, Firefox stops scrolling to the elements. Here's simple code:
test.xml:
<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet type='text/xsl' href='test.xsl'?>
<!DOCTYPE foo [<!ATTLIST bar id ID #REQUIRED>]>
<foo xmlns:html='http://www.w3.org/1999/xhtml' xml:lang='en-GB'>
<html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/>
<bar id='baz'>Baf.</bar>
</foo>
test.xsl:
<xsl:stylesheet version='1.0' xmlns:html='http://www.w3.org/1999/xhtml' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>
<xsl:template match='/'>
<xsl:copy-of select='.'/>
</xsl:template>
</xsl:stylesheet>
As soon as I uncomment the stylesheet line, /test.xml#baz does nothing. As though the transformation somehow loses some data about elements' identification.
Any ideas? Thanks.
Well the XSLT/XPath data model does not include any DTD and thus your result tree that XSLT creates is a copy of the input without the DTD, thus there is no definition of any ID attributes in the result tree and Firefox has no way of establishing to which element with which attribute #some-id refers.
Usually if you use client-side XSLT in the browser the target format is (X)HTML or SVG or a mix of both where id attributes are known by the browser implementation without needing a DTD. If you want to transform to a result format unknown to the browser then I don't think there is a way to use DTDs for the result tree in Firefox/Mozilla. And I am not sure whether they ever implemented xml:id support so that you could use that instead of defining your own ID attributes.
Martin Honnen's mention of XHTML resulted in experimentation during which I found out that setting the target element's namespace to XHTML's, xmlns='http://www.w3.org/1999/xhtml', does the trick. It doesn't seem very clean, but it doesn't seem as grave as, for instance, setting the whole doctype to XHTML's. So text.xml is now:
<?xml version='1.0' encoding='UTF-8'?>
<?xml-stylesheet type='text/xsl' href='test.xsl'?>
<foo xmlns:html='http://www.w3.org/1999/xhtml' xml:lang='en-GB'>
<html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/><html:br/>
<html:bar id='baz'>Baf.</html:bar>
</foo>
Also relevant might be http://xmlplease.com/xhtmlxhtml I found.
Thanks, all.

MSXSL Error and barely any output

I am trying to transform some HTML files to my own XML-format via XSL.
For this purpose I use HTML Tidy to clean up the input files, then transform them to xhtml with html2xhtml and then use a xsl script with msxsl to transform the xhtml files to my own format.
However, the last step is failing with not a error message at all (it is a semantical fail; not a technical ;-)): My output file just contains empty tags.
I had a problem like this before and removed the xmlns attribute from the html tag, what causes nearly all of the online transformers to work with my files correctly. MSXSL now writes the following error message: "Use of default namespace declaration attribute in DTD not supported".
Find the files I use here: http://pastie.org/5483087
Thank you in advance!
Well that is the FAQ with XSLT and XPath 1.0, the elements in your input XHTML document are in a namespace and your XSLT does not take that into account. You need to change it to e.g.
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xhtml="http://www.w3.org/1999/xhtml"
exclude-result-prefixes="xhtml">
<xsl:template match="/">
<stellenausschreibung>
<hochschule><xsl:value-of select="//xhtml:div[#id='contentText']/xhtml:img/#alt" /></hochschule>
<anbieter><xsl:value-of select="//xhtml:p[#id='ad_employer']" /></anbieter>
<typ><xsl:value-of select="//xhtml:h1" /></typ>
<bewerbungsschluss><xsl:value-of select="//xhtml:span[#id='ad_bewerbungsschluss']" /></bewerbungsschluss>
<erscheinungsdatum><xsl:value-of select="//xhtml:span[#class='job_published_at']" /></erscheinungsdatum>
<inhalt><xsl:value-of select="//xhtml:p[#id='ad_job']" /></inhalt>
</stellenausschreibung>
</xsl:template>
</xsl:stylesheet>
The prefix (in my example xhtml) for the XHTML namespace used in the stylesheet can of course be freely chosen but it is necessary to use one as with XSLT/XPath 1.0 a path of e.g. //p always selects p elements in no namespace.

Mixing own XML with HTML5 havin eclipse to display code hints

I am writing my own templating engine mainly for web applications.
It is actually mix of my own XML tags and HTML.
Here is the sample:
<lp:view xmlns:lp="http://sminit.com/view" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://sminit.com/view view.xsd ">
<lp:list name="my_items">
<lp:list_header>
<table>
</lp:list_header>
<lp:list_item>
<tr><td>$title$</td></tr>
</lp:list_item>
<lp:list_footer>
</table>
</lp:list_footer>
</lp:list>
</lp:view>
A little explanation:
Those tags prefixed with "lp" belong to my templating engine and are kind of "processing instructions" for it. The lp:view is a root node, then there is a lp:list node which having received some data source will produce a list: first it will include content of lp:list_header, then repeat proper times content of lp:list_item (replacing $title$ by actual data, but this does not matter here), then it will add content of lp:list_footer node. As you can see, for this reason I have html tag "table" splitting across my tags.
I have met two major problems here:
1. Eclipse complains that "table" is not properly closed -- I want Eclipse to stop complaining, treat this tag as a text or -- maybe you can suggest something?
2. Eclipse will not show any code hint if I am inside any of html tags. (code hint: attributes that maybe used by this tag like "class" or "id" etc)
I understand that I'm asking a weird freak question, but maybe there are some XSD gurus here who can direct me:
Eclipse should treat my xml template file as the following:
1. the tags prefixed "lp" are gods! They have precedence over anything other. Only errors from that tags (missing required attributes, missing required child elements etc) should be displayed.
2. All the other tags (any stuff in between angle brackets) are HTML tags. Eclipse should display code hint for them, but should anything be "incorrect" (like in my sample: no closing /table tag) -- Eclipse should not complain.
I hope this is possible.
thanks!
You would have to wrap your HTML in CDATA blocks. This will make the XML parser consider the contents (the unclosed <table>) to be plain text, and not a broken tag.
<lp:list_header><![CDATA[
<table>
]]></lp:list_header>
This is just a partial answer, but I'll still put it as an answer because it's too long to type into a comment.
To stop Eclipse complaining about unclosed tags, you should wrap the content in a <![CDATA[..]] section like so:
<lp:view xmlns:lp="http://sminit.com/view" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://sminit.com/view view.xsd ">
<lp:list name="my_items">
<lp:list_header>
<![CDATA[ <table> ]]>
</lp:list_header>
<lp:list_item>
<tr><td>$title$</td></tr>
</lp:list_item>
<lp:list_footer>
<![CDATA[ </table> ]]>
</lp:list_footer>
</lp:list>
They will be treated as text and Eclipse will not complain, but in that case you will lose any Eclipse completion inside the CDATA section.
To get completion working for HTML tags, I think you can try adding a default namespace for XHTML to your root tag, like so:
<?xml version="1.0" ?>
<lp:view xmlns="http://www.w3.org/1999/xhtml" xmlns:lp="http://sminit.com/view" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://sminit.com/view view.xsd ">
<lp:list name="my_items">
<lp:list_header>
<![CDATA[ <table> ]]>
</lp:list_header>
<lp:list_item>
<tr><td>$title$</td></tr>
</lp:list_item>
<lp:list_footer>
<![CDATA[ </table> ]]>
</lp:list_footer>
</lp:list>
EDIT: I think the second part won't work though, because the XHTML schema defines that the root element should be <html>. I just tried in Eclipse and completion for HTML tags only starts working when I first insert an <html> tag somewhere in the document. Maybe some other people can weigh in.

<xsl:value-of select="document(content)//title"/> returns empty node

I'm trying to get title of simple html document to build sitemap. But always return empty value. I debug this and found out that document(content) returns document nodes. It looks like this.alt text http://www.freeimagehosting.net/uploads/f7caf412dc.png But I could not access document(content)/html or something like this. Please help!
Some more code would help, but in such situations the first one to blame is namespace. I can see that your nodes are in the XHTML namespace, but you do not use any namespace prefix in your XPath.
You have to declare namespace prefix in your stylesheet like this:
<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:h="http://www.w3.org/1999/xhtml"
>
And then use this prefix in your XPath like this:
document(content)/h:html
If your xml elements are in a namespace, even if it is the default namespace for the document, you must use namespace prefixes in any XPath expressions and template match rules. It is the namespace uri and not the prefix that matters. Note that attributes will not be in the default namespace, they only have a namespace if their name has a prefix.
Additionally, an XPath expression containing // is usually less efficient than one that does not.
<xsl:stylesheet version="1.0"
xmlns:h="http://www.w3.org/1999/xhtml"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- and elsewhere in your stylesheet -->
<xsl:value-of select="document(content)/h:html/h:head/h:title"/>

what is the exact usage of xmlns in xml, and html

Does any one know what is the exact usage of xmlns in HTML, XML files?
Edit: Why do we need this namespace? What is its usage?
The xmlns attribute has special handling, that allows the declaration of a namespace.
All names, such as tag names, in a document belong to a namespace. In the absence of the xmlns attribute all the names belong to the "no name" namespace. Hence:-
<root><item /></root>
In the above example both root and item are names in the "no name" namespace. Whereas:-
<root xmlns="urn:mydomain.com:mystuff"><item /></root>
Now root and item exist in the "urn:mydomain.com:mystuff" namespace.
The xmlns can further define additional namespaces elements of which can be distinguished from others by using an alias prefix:-
<root xmlns="urn:mydomain.com:mystuff" xmlns:a="urn:otherdomain.com:other">
<item>
<a:supplement />
</item>
</root>
In this case root and item continue to be in the "urn:mydomain.com:mystuff" namespace but a:supplement indicates that the name supplement is in the "urn:otherdomain.com:other" namespace.
What does this acheive?
The X in XML stands for eXtensible. One goal is to allow additional information to layer onto an existing document, i.e., the ability to extend the document. Consider:-
Party A create a document:-
<root>
<item />
<root>
Party B extends the document by including additional information:-
<root>
<item />
<supplement />
</root>
Later Party A adds new info to their original form of the document and just so happen to also use the name supplement in their original. We could end up with something like:-
<root>
<item />
<supplement />
<supplement />
</root>
Which supplement element belongs to which party? By using namespaces the document would look like this:-
<root xmlns="urn:mydomain.com:mystuff" xmlns:a="urn:otherdomain.com:other">
<item />
<supplement />
<a:supplement />
</root>
Now when it comes to parsing and querying the XML its clear which element belongs to whom. Namespaces elimnate the collision between what would otherwise be a global set of simple names.
The xmlns attribute declares an XML Namespace. The Namespaces in XML standard discusses this element in depth.
Namespaces are used primarily to avoid conflicts between element names when mixing XML languages. If you have a particular application that you have questions about, perhaps you could post an example.
XML namespaces help contextualize elements an attributes, among other things. It also offers a precise identification for a particular element or attribute.
For instance, the <html> element can be defined by anyone and have any meaning. However, the <html> element within the http://www.w3.org/1999/xhtml namespace is unique and refers to the XHTML.
Namespaces also prove useful when dealing with homographs, when using multiple XML languages in a single file.
In HTML, xmlns is just a talisman to make moving from and to XHTML easier. It doesn't do anything at all.
Namespaces let you reduce ambiguity when there are duplicates. You could have a <title> tag that refers to authors and <title> tag that refers to a salutation, like Mr., Mrs. etc. To differentiate, you could assign them to different namespaces.
You can also use namespaces when validating documents for conformance to a particular standard/restrictions, where the namespace would indicate to what "Schema" that the document is belonging to.