Adding an invisible document name in XSLT or HTML - html

I have this XSLT document that has a file name. However for archiving purposes we want the file name to be displayed somewhere else within the code.
Now I used to do this like this:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
sample-xml=".\DOCUMENTNAME.xml"
xmlns:v="urn:schemas-microsoft-com:vml"
xmlns:o="urn:schemas-microsoft-com:office:office">
However we want to get rid of this solution due to new working methods. Now I was wondering if I could place this (literally just the word DOCUMENTNAME) somewhere within the XSL or within the HTML wrapped within it, in a way that it is not visible.
We add this code to a database through a validator that looks for the documentname and checks for a match. Because only the contents of the code are stored in the database, it's easier to check back from the database which documentname was uploaded. However, this documentname should not be visible in the HTML output.

Maybe add a processing instruction or comment somewhere in the XSLT.
Like:
<?DOCUMENTNAME?>
or:
<!--DOCUMENTNAME-->
They're not invisible, but they definitely won't be included in the output.

Not visible to whom?
Your current solution (with sample-xml) is not valid XSLT: unknown attributes on an XSLT element should be rejected by the processor unless they are in a namespace of their own.
The cleanest way to do this is probably a top-level element (a child of xsl:stylesheet) in a private namespace:
<my:document-name xmlns:my="http://my.company.com/ns/document-id">document.xml</my:document-name>
I don't know if that meets your criteria of being "not visible". It certainly wouldn't be visible in the output of the stylesheet.
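For the validator side, here is a minimal sketch of how the name could be read back from such a stylesheet. It assumes Python with only the standard library; the stylesheet text and the function name read_document_name are made up for illustration, and the namespace URI is the one from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stylesheet carrying the document name in a private namespace.
STYLESHEET = """<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:my="http://my.company.com/ns/document-id">
  <my:document-name>document.xml</my:document-name>
  <xsl:template match="/"><html/></xsl:template>
</xsl:stylesheet>"""

NS = {"my": "http://my.company.com/ns/document-id"}


def read_document_name(stylesheet_text):
    """Return the text of the top-level my:document-name element, or None."""
    root = ET.fromstring(stylesheet_text)
    element = root.find("my:document-name", NS)
    return element.text if element is not None else None


document_name = read_document_name(STYLESHEET)
```

Because the element is matched by namespace URI rather than by prefix, the check should keep working if someone renames the my prefix in the stylesheet.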

Related

Is it possible to make a selectable drop down menu using data from an XML file?

I'm trying to create a directory for an address book, and I was wondering if it would be possible to create a selectable drop down menu that would pull the contact data from an XML file. The ideal way I would want it is to have all of the names of the contacts in the drop down menu, and when one is selected the rest of the information would pop up above the drop down, such as Address, Phone Number, and Email.
Either use a server-side language such as PHP to extract the data from the XML and insert it into the HTML document, or use AJAX to pull the XML file to the client then use JavaScript to process it and insert it into the DOM.
There should be libraries/frameworks/plugins/whatever available to parse XML in whatever language you need. If you know how to insert content into the HTML document (in the case of PHP) or into the DOM (in the case of JavaScript), you can do this easily.
From what I understand, you have an XML document. Using XSLT you create an XHTML file from your XML, which you can display in your browser (XHTML is HTML that conforms to XML rules).
If that is the case then, yes, you can make links using XSLT. But the data needs to be in your XML source file and not in some database.
There is an article that describes it: http://www.ibm.com/developerworks/xml/library/x-tipxslt/index.html
You could attach an XSL to the XML using something like this:
<?xml version="1.0" encoding="ISO-8859-1"?>
<?xml-stylesheet type="text/xsl" href="cdcatalog.xsl"?>
... actual XML content...
If applying the XSL on the XML outputs an HTML page with JavaScript, you can get the actual result.
Outputting JavaScript is a bit of a pain because of character escaping but it can be done.

Run a regular expression into a new file (or another existing file)

I would like to take some stuff from file A and reformat it to stick into file B using regular expressions. I am kind of new to vim so this may be a dumb question but I could not find the solution to this anywhere. I guess I am searching for the wrong phrases. Anyway, here are the details of what I want to do. I have a static html page that I would like to have an RSS feed for. Luckily, this page is mostly links to various news items, so creating the RSS will be pretty easy.
I have the regular expression ready:
:%s/^<a href="\(.\{-}\)".title="\(.\{-}\)">\(.\{-}\)<\/a>/<title>\3<\/title>\r<link>\1<\/link>\r<description>\2<\/description>
My problem is I do not want to make the changes in the html file that I am searching. I want the changes to occur in another file, new or existing. How do I make this happen? Or is this method completely off.
Oh and by the way, this expression takes something like this in the html file:
<a href="http://linktosomesite.com" title="Description of link">Title of Link</a>
and turns it into this in the xml file:
<title>Title of Link</title>
<link>http://linktosomesite.com</link>
<description>Description of link</description>
Bonus: It would be really nice if I can place this within another file, say starting at line 5.
PS: I know this is a vim and regex question but posting it in html and rss because I noticed people have static html to rss questions there.
Why not just copy your file and then use sed/replace on the copied file?
It sounds like you want to write a transform. There are many transform tools. You certainly could do it with sed & awk for example. But I think the easiest way would be xslt. (you could use xsltproc or saxon...)
Here's an example template:
<xsl:template match="a">
<title><xsl:value-of select="text()"/></title>
<link><xsl:value-of select="@href"/></link>
<description><xsl:value-of select="@title"/></description>
</xsl:template>
It finds each a element, and outputs the results, with the text() node and attributes filled in.
Just run your substitution and save as another file:
$ vim file.html
:%s/^<a href="\(.\{-}\)".title="\(.\{-}\)">\(.\{-}\)<\/a>/<title>\3<\/title>\r<link>\1<\/link>\r<description>\2<\/description>
:w file.rss
:q
That's how I would in any editor, by the way.
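If you would rather script it than do it by hand, the same substitution can be sketched outside vim. This is only an illustration in Python: the pattern is a translation of the vim regex from the question, and the names links_to_rss_items and insert_at_line are invented here. The second helper covers the bonus of placing the result in another file's text starting at a given line.

```python
import re

# Python translation of the vim pattern: \(.\{-}\) becomes the non-greedy (.*?).
LINK_PATTERN = re.compile(r'^<a href="(.*?)".title="(.*?)">(.*?)</a>', re.MULTILINE)
ITEM_TEMPLATE = r"<title>\3</title>\n<link>\1</link>\n<description>\2</description>"


def links_to_rss_items(html_text):
    """Rewrite each matching link line; other lines are left untouched, as in vim."""
    return LINK_PATTERN.sub(ITEM_TEMPLATE, html_text)


def insert_at_line(target_text, new_text, line_number):
    """Insert new_text into target_text so that it starts at line_number (1-based)."""
    lines = target_text.splitlines()
    lines[line_number - 1:line_number - 1] = new_text.splitlines()
    return "\n".join(lines) + "\n"


sample = '<a href="http://linktosomesite.com" title="Description of link">Title of Link</a>'
items = links_to_rss_items(sample)
```

Reading file.html, calling links_to_rss_items on its text and writing the result to file.rss leaves the HTML file unchanged, which is the point of the question.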

How to extract an attribute from an HTML element with Ant?

I have an ANT configuration file which is becoming complicated, and now I'm stuck with an issue. One of the tasks retrieves a page from a website and saves it to a file. I need to load such file and extract from it the href attribute of a specific element. HTML is reasonably well formed, but I can't guarantee it.
I was thinking of a RegEx, but the element's attributes are not guaranteed to always appear in the same order (e.g. its class name, or id). Besides, I haven't found out how to just return the value of the href attribute, without the attribute itself.
I'm trying to limit the amount of addons to be added to ANT, therefore a "self-contained" solution would be welcome. Thanks.
I'm not sure how you're going to find the specific HTML element that has the href you're looking for (I'd assume by checking an id attribute, but you did not say so). I put together this chain of regex's to filter the HTML down to candidate anchor tags and then ultimately strip out just the href's. I used the source of this page as my sample input and since I couldn't find any id attributes associated to anchors (that also had hrefs), I filtered down to anchors with the class="question-hyperlink" -- I'm hopeful this could be a good starting point for you (and note: as you stipulated, it does not contain any dependencies on additional modules, etc, regardless of how easy they are to install):
<?xml version="1.0" encoding="UTF-8"?>
<project name="Test Html attribute" default="test" basedir=".">
<target name="test">
<loadfile srcFile="ant.htm" property="html">
<filterchain>
<linecontainsregexp>
<regexp pattern="&lt;a.*href[^&gt;]*&gt;"/>
<regexp pattern="&lt;a.*class=[&quot;']question-hyperlink[&quot;'][^&gt;]*&gt;"/>
</linecontainsregexp>
<tokenfilter>
<replaceregex pattern=".*&lt;a.*href=[&quot;']?([^&gt;&quot;']*).*&gt;[^&lt;]*" replace="\1" flags="gi"/>
</tokenfilter>
</filterchain>
</loadfile>
<echo>${html}</echo>
</target>
</project>
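For comparison only, and stepping outside the "Ant only" constraint: if a scripting language is available on the build machine, a tolerant HTML parser avoids the attribute-order problem altogether. The sketch below assumes Python's standard library; the class name HrefCollector and the function extract_hrefs are hypothetical, and the class value is the same one used in the Ant example.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect href values of <a> elements that carry a given class."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Attribute order does not matter here; class may hold several names.
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if self.wanted_class in classes and href is not None:
            self.hrefs.append(href)


def extract_hrefs(html_text, wanted_class):
    collector = HrefCollector(wanted_class)
    collector.feed(html_text)
    collector.close()
    return collector.hrefs


sample_html = (
    '<p><a class="question-hyperlink" href="/q/1">x</a>'
    "<a href=\"/q/2\" class='question-hyperlink big'>y</a>"
    '<a href="/q/3">z</a>'
)
found = extract_hrefs(sample_html, "question-hyperlink")
```

Unlike the line-based regex chain, this does not depend on the anchor and its attributes sitting on one line.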

Extracting attribute from escaped XML tag in XSLT?

So I've been working on some XSLT to modify YouTube's RSS XML, and of course as soon as I got it working, they've changed their formatting. Before, each video's unique ID was stored between <videoid> tags, which you could then use to create a URL. But now the only way to get a video's URL is from a tag like this
<media:player url='https://www.youtube.com/watch?v=XXXXXXXXXXX&amp;feature=youtube_gdata_player'/>
which is contained within <media:group> tags.
The way I've been trying to get at it is
<xsl:value-of select="media:group/media:player@url" />
but doing that gives me a compilation error that says
xsl:value-of : could not compile select expression 'media:group/media:player@url'
Does anyone see anything wrong with that?
Also, as a side note, I want to do something similar with
<xsl:value-of select="media:group/media:thumbnail@url" />
however there are several <media:thumbnail> tags for each entry; would this just grab the first one, or would this potentially cause errors?
You are missing a / in your XPath. Try:
<xsl:value-of select="media:group/media:player/@url" />
As far as the DOM API is concerned, attributes belong to an element. XPath goes about it in a slightly different way. Child nodes live on the child:: axis (which is the default, so you rarely see it used explicitly) and attributes live on the attribute:: axis (you can get there with the abbreviated @). When constructing an XPath expression you are basically building a sequence of location steps separated by /. An attribute is "one location step" away from the element owning it.
To the second part of your question: the selector will create a sequence (think node list) of all nodes that match the expression, and the xsl: instruction will then do what it prescribes on that sequence. In your case (xsl:value-of), all @url attributes of all matching media:thumbnail elements will be put together in a node-set that will then be converted to a string:
A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order
So you will get the value of the "first" one. That said, I would argue that running xsl:value-of on a sequence of more than one node is not really intuitive (even though the spec clearly says what it will do), so you would do a favor to anyone reading the code after you if you are more specific with your selectors. Something like: media:group/media:thumbnail[1]/@url
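To see the "element step, then attribute" idea outside XSLT, here is a small sketch in Python. It assumes the standard library's ElementTree, whose limited XPath subset cannot end a path in an attribute step, so the element is located by path and the attribute is read from it afterwards. The sample entry is invented; the media prefix is bound to the Media RSS namespace URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down feed entry; note the &amp; inside the attribute value.
SAMPLE = """<entry xmlns:media="http://search.yahoo.com/mrss/">
  <media:group>
    <media:player url="https://www.youtube.com/watch?v=XXXXXXXXXXX&amp;feature=youtube_gdata_player"/>
    <media:thumbnail url="https://example.com/thumb0.jpg"/>
    <media:thumbnail url="https://example.com/thumb1.jpg"/>
  </media:group>
</entry>"""

NS = {"media": "http://search.yahoo.com/mrss/"}
entry = ET.fromstring(SAMPLE)

# Step 1: walk to the element. Step 2: read the attribute it owns.
player_url = entry.find("media:group/media:player", NS).get("url")

# The [1] predicate picks the first thumbnail explicitly, in document order.
first_thumbnail_url = entry.find("media:group/media:thumbnail[1]", NS).get("url")

# Without the predicate, every matching thumbnail is returned.
all_thumbnail_urls = [
    thumbnail.get("url")
    for thumbnail in entry.findall("media:group/media:thumbnail", NS)
]
```

The explicit [1] makes the intent visible in the same way as media:group/media:thumbnail[1]/@url does in the stylesheet.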

How best open xml, parse with xslt and show result in browser

I am currently studying ways to present transformed xml files in browsers. My experience with this is minimal, so a number of questions pop up.
I have a transformation test.xslt which transforms input xml to html, and an input file test.xml containing
<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="test.xslt" ?>
<root>...</root>
which, when opened in IE9, neatly displays the transformed xml contained above in the root element.
Question 1
Is there a processing instruction or similar available to include the source xml into the xml to be opened, somewhat like the following:
<?xml version="1.0" standalone="yes"?>
<?xml-stylesheet type="text/xsl" href="test.xslt" ?>
<... instruction to include source file data.xml>
Question 2
The file opened has extension xml. Is there a way to change file contents so it is valid html, allowing the file to be saved with extension html, so that when opened, the default browser will be selected (simply changing extension to html obviously does not have the desired effect so some structural change is necessary) ?
Question 3
My goal is to query a db to get the data to be parsed by the xslt code. What is the best way to do this (no problem if this includes javascript)?
Question 4
Standard db utilities may export query results in attribute-centered fashion (column names and values being represented as attribute names and values). This may involve pre-parsing the xml from db in order to convert it to parent-child fashion (columns as children instead of attributes). What is the best way to do this pre-parsing (note: I already have the xslt for this; I wonder about the data flow and when/how to run two xslt's in sequence) and then apply test.xslt (preferably without saving intermediate xml result files on the server)?
Question 5
When I open the above XML in IE9, this works fine, as said. But opening it in Firefox produces an error (an RTF issue; apparently I need to use Firefox's node-set function, but I still have to discover which namespace that has), and Opera/Chrome/Safari do not show any content. What exactly are the prerequisites for the various browsers, and where can I find more information on this?
Q1 If you start by serving an html file which then accesses the xml and xslt via javascript it naturally has access to both the input and the output of the xslt. If you are serving the xml and initiating the transformation using xml-stylesheet pi, then perhaps the best thing to do (depending on what you want to do) is to stuff the original source into the output, then javascript in the generated page can access it if needed, eg
<xsl:template match="whatever">
<html>
<head>
<script id="source" type="x-xml-source">
<xsl:copy-of select="/"/>
</script>
.... whatever you were going to do
Then, if you need to access the source in response to a user action on the page, a script can retrieve the script element with id source and do whatever is needed. (If there is a possibility of the source containing a string that would end the script element early, you have to code it a bit more defensively.)
Q2 If you want to use the xml-stylesheet mechanism, then you have to serve the file as XML. However, you can instead just serve HTML and then access the XML and XSLT from within a script in the HTML page using the browser's JavaScript XSLT API. As noted above, that is more flexible than the xml-stylesheet mechanism.
Q3 pass
Q4 If you are accessing the xslt from javascript then it is easy to chain the output of one to the input of another without writing back to the server as you just have access to the result as a DOM node (or string, depending)
Answer to question 5: Firefox/Mozilla, Opera, Safari, Chrome all support the EXSLT node-set extension function in the namespace http://exslt.org/common, for IE and MSXML you can use script (imported) inside the XSLT stylesheet to allow it to support that namespace too, see http://dpcarlisle.blogspot.de/2007/05/exslt-node-set-function.html. That way inside the main stylesheet where you need to use the node-set function you don't need to write different code to cater for the different namespaces.