Why use HTML markup in languages like ruby, php, asp.net mvc instead of XLST to convert XML to HTML? - html

I just learned about XLST on stackoverflow today (I love how in computers you can program for years and constantly have 'darn, how did I not know about that technology' moments). I'm wondering how popular XLST it is for web development? I've worked on a few websites (using php, ruby, and asp.net mvc) but I'm not a web developer by any means.
Is the reason each web language I listed above has it's own way of marking up html (and thus taking advantage of 'templates') just to make it simpler (simpler as in more to the point and not and more geared to one specific purpose) in that you don't have to first convert what you want to display to xml and then to html? Or are there other reasons why XLST doesn't seem too popular for web development? Or am I just crazy (again most of my work is with Desktop apps) and actually it is widely used in webpages? If not in development, what do you mainly use it for?
It seems that being able to easily serialize objects in xml with C# would make XLST a very popular way of displaying object in HTML on websites?
Thanks for feeding my curiosity!!

IMHO there are two main reasons why XSLT is not very popular:
it's generally hard.
you can just skip it and directly write HTML, and HTML is not hard and has first-class support from all web frameworks.
In summary, there is usually not enough reason to introduce yet-another-abstraction. Abstractions are not free, they solve some problems but introduce others (i.e. the "solve it by adding another layer of indirection" adagio), so the benefits must clearly outweight the costs.
That said, there are XSLT-based solutions for many web frameworks, e.g.:
ASP.NET MVC XSLT view engine
libxslt in RoR
Here's an excellent article that discusses XSLT for view engines.

As I've started so many answers on Stackoverflow, it depends :)
Doing what you're describing is adding another layer of abstraction between application logic and the display output; and introducing another language. There can be very compelling reasons to do this, but the important part to keep in mind is that you need to recognize and quantify the need to be able to understand whether it's worth it.
It seems that being able to easily...
As with most things in software development, something that seems easy after a few hours of pondering turns out to be quite complex and involved when you actually try to do it. This is especially true here, because I have built exactly what you're describing in ASP.NET. It provides a very interesting mechanism for skinning sites, as you simply have to define your model XML schemas and anyone can write an XSLT to transform it. But XSLT is like a one-way tunnel. It can't (easily) reach back out or to the sides to pull in extra info that wasn't included in the original model - "peripheral data", so to speak. In fact, it has a hard time really being aware of what's going on in the application at all.
Also, XSLT is very verbose, and (in many ways) a crude language. This makes it... unpleasant to do things like loops, and rather time consuming to even do something like an if-else statement*. An XSLT that generated something like, say, the page you're looking out right now would probably be several thousand lines long - which you're adding on top of the application code you have to write either way.
It is simply an additional cost which may or may not be worth it, depending on what you're trying to accomplish.
*For example, I once saw a developer try to write a pager control (e.g. "first | prev | 1 | 2 | 3 | next | last") in XSLT. We still visit her in the sanitarium from time to time.

XSLT is not popular in web frameworks because XML is not popular in web frameworks. But, if you have XML data, or you are willing to convert your objects to XML then XSLT is the best tool for transforming that XML into HTML or XHTML.
If you are using ASP.NET checkout myxsl

When I first discovered XSLT I was excited because I could write semantically correct presentation markup, and I would no longer have to write so many hacks and use so many nested layers of <divs> just to get the effect I wanted -- XSLT could do all that for me!
Then I realized XSLT was a weird, backwards, and painful language to write in. I believe other web developers have discovered the same thing and steered away from it.
What's the point when you can just write straight HTML and avoid an extra layer of abstraction? And if your application is complex enough to warrant that, then there are so many better, simpler, more powerful alternatives (other templating languages).

The design of a web page is often handed over to designers.
One thing about web designers is they know HTML (if you're lucky) but they're not going to know XSLT.

It's unpleasant to do loops and if-else-statements in XSLT like it's unpleasant to hammer with a screwdriver. Don't do that. The behaviour of an XSLT script is driven by the data, you only need to find the right matches for your templates. XSLT-templates are not just a piece of code. The actual data from the XML-file fitting in the "match"-attribute of the "template"-elements decide if and when to execute a template.
Once you discover that XSLT works different to other languages you see the possibilities this gives to you. It's easy to do things that are hard to do in usual languages. Just use it when appropriate. If your data comes as XML and you know XSLT and XPath, this is the right tool to build web pages.
Hints are available at e.g. jenitennison.com/xslt.

Related

Custom parser for HTML5 and other languages

I'm attempting to write my very own custom parser (in C#) for (X)HTML5 and whatever might be embedded (EcmaScript, CSS) - just to learn and have fun. Although I'm an intermediate programmer I don't know much about parsers and all the technical stuff. I am able to create a lexical analyser (tokeniser) for HTML5 fairly easily but the syntactical analysis (parsing) is a bit tricky. I'm not sure if I should first lexically analyse all the source input and then do the other or try both at the same time; get char until I have a token, realise what the token syntactially means and then expect a certain token relevant to the previous one. The problem that I face is that HTML might have other languages such as CSS and JavaScript embedded and they, as far as I can see, will have different categories of tokens, so I'm not sure how to "know" where I am in the code as I tokenise it in order to have varying definitions of what a token "is". Any thoughts? Also, what are the benefits/drawbacks of analysing lexically first and then syntactically vs. doing both in at the same time?
If this is purely for your own education regarding parsing, I would suggest tacking a much smaller / easier field than HTML , CSS and JS parsing as HTML and JS both represent some really quite nasty parsing problems which even the most experienced parser writer would feel nervous tackling.
A language based off Scheme or Basic would probably be my first pick.
(A personal favourite is building a parser / interpreter as I go through http://mitpress.mit.edu/sicp/full-text/book/book-Z-H-10.html )
(Also picking up a copy of something like Modern Complier Design probably wouldn't hurt: http://www.amazon.com/Modern-Compiler-Design-D-Grune/dp/0471976970 )
If it has to be web related in order to keep your interest, I'd take a stab at doing your parser for one of the smaller web related languages such as sass ( http://sass-lang.com )
On the other hand, if this is something work related where you really need to parse those specific things, I'd suggest skipping the effort of writing your own parser entirely and hook into something like the Razor or Chromium libraries.
And to directly answer at least the second half of your question: I would recommend always splitting the various phases of parsing / interpreting out as far as possible from each other.
Each problem is difficult enough on it's own without trying to be "too smart" and attempting to combine functionality into a single sweep.
Where ever possible I'd suggest keeping things as high level, abstract and "clean" as possible... thus construct a tree of nodes specifically for lexical parsing and another for syntactical parsing... and in the case of combined languages as HTML, CSS and JS, a different AST and parsing code for each.
There is a great course on Udacity [1] called Programming Languages that covers the full concept of HTML and Javacript processing .
It covers in depth the lexical analysis, parsing and interpretation . It only covers a subset of Javascript so you have further development ahead of you after you finish the course, but you will have acquired the general structure and the concepts .
[1] http://www.udacity.com/overview/Course/cs262/CourseRev/apr2012

what's DSL in plain words?

I heard from someone that DSL is really powerful in some specific fields. So i want to find out if i can put it into my skill sets.
The first problem came out is What is DSL exactly? After doing some search, it seems Groovy supports DSL very well. Then i go and read Groovy's documents and try it out by myself.
And i got the impression that DSL is just some kind of configuration files consisting of texts, XMLs and you use some tools like Groovy to parse it, it magically become some methods or functions you can invoke. What happened?
I read something, but cannot get it straight. Any Help?
Did you read this? Martin Fowler is an authority on the subject and a great writer. I doubt that anyone will improve on the first paragraph. If you still don't get it, give it some time and re-read the article a few times.
I'd recommend looking into JetBrain's MPS
A book might be overwhelming, but there's a relatively new one available.
And i got the impression that DSL is just some kind of configuration
files consisting of texts, XMLs and you use some tools like Groovy to
parse it, it magically become some methods or functions you can
invoke. What happened?
I don't think your impression is entirely accurate. I'd forget about Groovy and parsing and all the implementation details for now. Focus on the problem that DSL is trying to solve.
A DSL designer tries to come up with a pseudo programming language that an expert, who is unfamilar with programming languages like Groovy or Java or C#, would recognize as a simple language describing they way they solve problems.
The DSL uses terms and concepts familiar to any one knowledgable about that domain.
The DSL shields users from the underlying implementation details so they can focus on how to attack their problems.
A DSL is written for the convenience of business users, not developers.
Keep that in mind and the rest is implementation. Eye on the prize....
A domain specific language (DSL) is a programing language that is not fully featured. The point is that programing in a DSL can be easier than programing in a general purpose language, and be less prone to bugs. The "domain" in "domain specific language" refers to the specific purpose the language will be used for.
For example, the language that a calculator uses with just + - * / and numbers could be called a domain specific language. It has the advantage over a regular programing language in that programs will never segfault, crash, loop forever, etc. Other examples of domains might be web development -- for example, Ur/Web is a DSL for building web applications. SQL is a database domain specific language. etc.
I don't know much about Groovy, but it seems that there are particular tools for using it to create DSLs. Fundamentally, to create a DSL you need to specify a syntax, along with some sort of semantics. How exactly Groovy does this I do not know.
DSL is a language dedicated to a specific domain. For instance, the well-known CSS is a Domain Specific Language serving the look and formatting of a document.
By using Groovy you might create your own DSL focusing on any selected domain - e.g. accounting, telecommunications, banking etc. This means, that the language will use the common terminology of this area meeting the needs of this domain. This language will be easily understood by people of this domain that are not necessarily technical (e.g. accountants). In some times, it focuses on being used by non-programmers. Especially Groovy is a dynamic language with which you can enable end-users to add code scripts dynamically similarly to what Excel does with VB, through configuration files.
You should delve into Martin Fowler's publications if you are interested in this subject, anyway.

What are the pros and cons of using a template engine like Jade?

I'm looking into developing a web app with Node.js. I'm coming from a PHP background where I didn't use a template engine (besides PHP itself) and I have always just written straight HTML. So, why should I or should I not use Jade or some other template engine?
Pros:
Encourages good code organization (data generation is separate from presentation code)
Output generation is more expressive (template syntax doesn't require a sea of string concatenation)
Better productivity (common problems such as output encoding, iterating, conditionals, etc. have been handled)
Generally requires less code overall (jade in particular has a very terse syntax)
Cons:
Some performance overhead
Yet another thing to learn
About JADE or any other template language that differ a lot from HTML:
First of all it is more time consuming to debug the produced HTML. You see HTML in the browser and you need to parse it back to JADE (in your brain) to compare with your editor content. This is very inconvenient and makes debugging harder then it should be.
Of course it may not be a problem if you are the only programmer who works on the code. It may seem so easy to match the html lines with JADE lines if you are the one who wrote them.
It is a problem when working in teams.

Studying standard library sources

How does one study open-source libraries code, particularly standard libraries?
The code base is often vast and hard to navigate. How to find some function or class definition?
Do I search through downloaded source files?
Do I need cvs/svn for that?
Maybe web-search?
Should I just know the structure of the standard library?
Is there any reference on it?
Or do some IDEs have such features? Or some other tools?
How to do it effectively without one?
What are the best practices of doing this in any open-source libraries?
Is there any convention of how are sources manipulated on Linux/Unix systems?
What are the differences for specific programming languages?
Broad presentation of the subject is highly encouraged.
I mark this 'community wiki' so everyone can rephrase and expand my awkward formulations!
Update: Probably didn't express the problem clear enough. What I want to, is to view just the source code of some specific library class or function. And the problem is mostly about work organization and usability - how do I navigate in the huge pile of sources to find the thing, maybe there are specific tools or approaches? It feels like there should've long existed some solution(s) for that.
One thing to note is that standard libraries are sometimes (often?) optimized more than is good for most production code.
Because they are widely used, they have to perform well over a wide variety of conditions, and may be full of clever tricks and special logic for corner cases.
Maybe they are not the best thing to study as a beginner.
Just a thought.
Well, I think that it's insane to just site down and read a library's code. My approach is to search whenever I come across the need to implement something by myself and then study the way that it's implemented in those libraries.
And there's also allot of projects/libraries with excellent documentation, which I find more important to read than the code. In Unix based systems you often find valuable information in the man pages.
Wow, that's a big question.
The short answer: it depends.
The long answer:
Some libraries provide documentation while others don't. Standard libraries are usually pretty well documented, whether your chosen implementation of the library includes documentation or not. For instance you may have found an implementation of the c standard library without documentation but the c standard has been around long enough that there are hundreds of good reference books available. Documentation with hyperlinks is a very useful way to learn a new API. In any case the first place I would look is the library's main website
For less well known libraries lacking documentation I find two different approaches very helpful.
First is a doc generator. Nearly every language I know of has one. It basically parses an source tree and creates documentation (usually as html or xml) which can be used to learn a library. Some use specially formatted comments in the code to create more complete documentation. JavaDoc is one good example of this. Doc generators for many other languages borrow from JavaDoc.
Second an IDE with a class browser. These act as a sort of on the fly documentation. Some display just the library's interface. Other's include description comments from the library's source.
Both of these will require access to the libraries source (which will come in handy if you intend actually use a library).
Many of these tools and techniques work equally well for closed/proprietary libraries.
The standard Java libraries' source code is available. For a beginning Java programmer these can be a great read. Especially the Collections framework is a good place to start. Take for instance the implementation of ArrayList and learn how you can implement a resizeable array in Java. Most of the source has even useful comments.
The best parts to read are probably whose purpose you can understand immediately. Start with the easy pieces and try to follow all the steps that are hidden behind that single call you make from your own code.
Something I do from time to time :
apt-get source foo
Then new C++ project (or whatever) in Eclipse and import.
=> Wow ! Browsable ! (use F3)

Is XSLT worth it? [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
A while ago, I started on a project where I designed a html-esque XML schema so that authors could write their content (educational course material) in a simplified format which would then be transformed into HTML via XSLT. I played around (struggled) with it for a while and got it to a very basic level but then was too annoyed by the limitations I was encountering (which may well have been limitations of my knowledge) and when I read a blog suggesting to ditch XSLT and just write your own XML-to-whatever parser in your language of choice, I eagerly jumped onto that and it's worked out brilliantly.
I'm still working on it to this day (I'm actually supposed to be working on it right now, instead of playing on SO), and I am seeing more and more things which make me think that the decision to ditch XSLT was a good one.
I know that XSLT has its place, in that it is an accepted standard, and that if everyone is writing their own interpreters, 90% of them will end up on TheDailyWTF. But given that it is a functional style language instead of the procedural style which most programmers are familiar with, for someone embarking on a project such as my own, would you recommend they go down the path that I did, or stick it out with XSLT?
So much negativity!
I've been using XSLT for a good few years now, and genuinely love it. The key thing you have to realise is that it's not a programming language it's a templating language (and in this respect I find it indescribably superior to asp.net /spit).
XML is the de facto data format of web development today, be it config files, raw data or in memory reprsentation. XSLT and XPath give you an enormously powerful and very efficient way to transform that data into any output format you might like, instantly giving you that MVC aspect of separating the presentation from the data.
Then there's the utility abilities: washing out namespaces, recognising disparate schema definitions, merging documents.
It must be better to deal with XSLT than developing your own in-house methods. At least XSLT is a standard and something you could hire for, and if it's ever really a problem for your team it's very nature would let you keep most of your team working with just XML.
A real world use case: I just wrote an app which handles in-memory XML docs throughout the system, and transforms to JSON, HTML, or XML as requested by the end user. I had a fairly random request to provide as Excel data. A former colleague had done something similar programatically but it required a module of a few class files and that the server had MS Office installed! Turns out Excel has an XSD: new functionality with minimum basecode impact in 3 hours.
Personally I think it's one of the cleanest things I've encountered in my career, and I believe all of it's apparent issues (debugging, string manipulation, programming structures) are down to a flawed understanding of the tool.
Obviously, I strongly believe it is "worth it".
Advantages of XSLT:
Domain-specific to XML, so for example no need to quote literal XML in the output.
Supports XPath/XQuery, which can be a nice way to query DOMs, in the same way that regular expressions can be a nice way to query strings.
Functional language.
Disadvantages of XSLT:
Can be obscenely verbose - you don't have to quote literal XML, which effectively means you do have to quote code. And not in a pretty way. But then again, it's not much worse than your typical SSI.
Doesn't do certain things which most programmers take for granted. For instance string manipulation can be a chore. This can lead to "unfortunate moments" when novices design code, then frantically search the web for hints how to implement functions they assumed would just be there and didn't give themselves time to write.
Functional language.
One way to get procedural behaviour, by the way, is to chain multiple transforms together. After each step you have a brand new DOM to work on which reflects the changes in that step. Some XSL processors have extensions to effectively do this in one transform, but I forget the details.
So, if your code is mostly output and not much logic, XSLT can be a very neat way to express it. If there is a lot of logic, but mostly of forms which are built in to XSLT (select all elements which look like blah, and for each one output blah), it's likely to be quite a friendly environment. If you fancy thinking XML-ishly at all times, then give XSLT 2 a go.
Otherwise, I'd say that if your favourite programming language has a good DOM implementation supporting XPath and allowing you to build documents in a useful way, then there are few benefits to using XSLT. Bindings to libxml2 and gdome2 should do nicely, and there's no shame in sticking to general-purpose languages you know well.
Home-grown XML parsers are usually either incomplete (in which case you'll come unstuck some day) or else not much smaller than something you could have got off the shelf (in which case you're probably wasting your time), and give you any number of opportunities to introduce severe security issues around malicious input. Don't write one unless you know exactly what you gain by doing it. Which is not to say you can't write a parser for something simpler than XML as your input format, if you don't need everything that XML offers.
I have to admit a bias here because I teach XSLT for a living. But, it might be worth covering off the areas that I see my students working in. They split into three groups generally: publishing, banking and web.
Many of the answers so far could be summarised as "it's no good for creating websites" or "it's nothing like language X". Many tech folks go through their careers with no exposure to functional/declarative languages. When I'm teaching, the experienced Java/VB/C/etc folk are the ones who have issues with the language (variables are variables in the sense of algebra not procedural programming for example). That's many of the people answering here - I've never gotten on with Java but I'm not going to bother to critique the language because of that.
In many circumstances it is an inappropriate tool for creating websites - a general purpose programming language may be better. I often need to take very large XML documents and present them on the web; XSLT makes that trivial. The students I see in this space tend to be processing data sets and presenting them on the web. XSLT is certainly not the only applicable tool in this space. However, many of them are using the DOM to do this and XSLT is certainly less painful.
The banking students I see use a DataPower box in general. This is an XML appliance and it's used to sit between services 'speaking' different XML dialects. Transformation from one XML language to another is almost trivial in XSLT and the number of students attending my courses on this are increasing.
The final set of students I see come from a publishing background (like me). These people tend to have immense documents in XML (believe me, publishing as an industry is getting very into XML - technical publishing has been there for years and trade publishing is getting there now). These documents need to be processing (DocBook to ePub comes to mind here).
Someone above commented that scripts tend to be below 60 lines or they become unwieldy. If it does become unwieldy, the odds are the coder hasn't really got the idea - XSLT is a very different mindset from many other languages. If you don't get the mindset it won't work.
It's certainly not a dying language (the amount of work I get tells me that). Right now, it's a bit 'stuck' until Microsoft finish their (very late) implementation of XSLT 2. But it's still there and seems to be going strong from my viewpoint.
We use XSLT extensively for things like documentation, and making some complex configuration settings user-serviceable.
For documentation, we use a lot of DocBook, which is an XML-based format. This lets us store and manage our documentation with all of our source code, since the files are plain text. With XSLT, we can easily build our own documentation formats, allowing us to both autogenerate the content in a generic way, and make the content more readable. For example, when we publish release notes, we can create XML that looks something like:
<ReleaseNotes>
<FixedBugs>
<Bug id="123" component="Admin">Error when clicking the Foo button</Bug>
<Bug id="125" component="Core">Crash at startup when configuration is missing</Bug>
<Bug id="127" component="Admin">Error when clicking the Bar button</Bug>
</FixedBugs>
</ReleaseNotes>
And then using XSLT (which transforms the above to DocBook) we end up with nice release notes (PDF or HTML usually) where bug IDs are automatically linked to our bug tracker, bugs are grouped by component, and the format of everything is perfectly consistent. And the above XML can be generated automatically by querying our bug tracker for what has changed between versions.
The other place where we have found XSLT to be useful is actually in our core product. Sometimes when interfacing with third-party systems we need to somehow process data in a complex HTML page. Parsing HTML is ugly, so we feed the data through something like TagSoup (which generates proper SAX XML events, essentially letting us deal with the HTML as if it were properly written XML) and then we can run some XSLT against it, to turn the data into a "known stable" format that we can actually work with. By separating out that transformation into an XSLT file, that means that if and when the HTML format changes, the application itself does not need to be upgraded, instead the end-user can just edit the XSLT file themselves, or we can e-mail them an updated XSLT file without the entire system needing to be upgraded.
I would say that for web projects, there are better ways to handle the view side than XSLT today, but as a technology there are definitely uses for XSLT. It's not the easiest language in the world to use, but it is definitely not dead, and from my perspective still has lots of good uses.
XSLT is an example of a declarative programming language.
Other examples of declarative programming languages include regular expressions, Prolog, and SQL. All of these are highly expressive and compact, and usually very well designed and powerful for the task for which they are designed.
However, software developers generally hate such languages, because they are so different from more mainstream OO or procedural languages that they're hard to learn and debug. Their compact nature generally makes it very easy to do a lot of damage inadvertently.
So while XSLT is an efficient mechanism to merge data into presentation, it fails in the ease-of-use department. I believe that's why it hasn't really caught on.
I remember all the hype around XSLT when the standard was newly released. All the excitement around being able built an entire HTML UI with a 'simple' transform.
Let’s face it, it is hard to use, near impossible to debug, often unbearably slow. The end result is nearly always quirky and less than ideal.
I will sooner gnaw off my own leg than use an XSLT while there are better ways to do things. Still it has its places, its good for simple transform tasks.
I've used XSLT (and also XQuery) extensively for various things - to generate C++ code as part of build process, to produce documentation from doc comments, and within an application that had to work with XML in general and XHTML in particular a lot. The code generator in particular was in excess of 10,000 lines of XSLT 2.0 code spread around about a dozen separate files (it did a lot of things - headers for clients, remoting proxies/stubs, COM wrappers, .NET wrappers, ORM - to name a few). I inherited it over another guy who didn't really understand the language well, and the older bits were consequently quite a mess. Newer stuff that we wrote was mostly kept sane and readable, however, and I do not recall any particular problems with achieving that. It was certainly not any harder than doing it for C++.
Speaking of versions, dealing with XSLT 2.0 definitely helps keep you sane, but 1.0 is still alright for simpler transforms. In its niche, it is an extremely handy tool, and the productivity you get from certain domain-specific features (most importantly, dynamic dispatch via template matching) is hard to match. Despite the perceived wordiness of XSLT's XML-based syntax, the same thing in LINQ to XML (even in VB with XML literals) was usually several times longer. Quite often, however, it gets undeserved flack because of unnecessary use of XML in some case in the first place.
To sum it up: it is an incredibly useful tool to have in one's toolbox, but it is a very specialized one, so it is good so long as you use it properly and for its intended purpose. I really wish there was a proper, native .NET implementation of XSLT 2.0.
I use XSLT (for lack of better alternative), but not for presentation, just for transformation:
I write short XSLT transformations to do mass edits on our maven pom.xml files.
I've written a pipeline of transformations to generate XML Schemas from XMI (UML Diagram). It worked for a while, but it finally got too complex and we had to take it out behind the barn.
I've used transformations to refactor XML Schemas.
I've worked around some limitations in XSLT by using it to generate an XSLT to do the real work. (Ever tried to write an XSLT that produces an output using namespaces that aren't known until runtime?)
I keep coming back to it because it does a better job round-tripping the XML it's processing than other approaches I've tried, which have seemed needlessly lossy or simply misunderstand XML. XSLT is unpleasant, but I find using Oxygen makes it bearable.
That said, I'm investigating using Clojure (a lisp) to perform transformations of XML, but I haven't gotten far enough yet to know if that approach will bring me benefits.
Personally I used XSLT in a totally different context. The computer game that I was working on at the time used tons of UI pages defined using XML. During a major refactor shortly after a release we wanted to change the structure of these XML documents. We made the game's input format follow a much better and schema aware structure.
XSLT seemed the perfect choice for this translation from old format -> New format. Within two weeks I had a working conversion from old to new for our hundreds of pages. I was also able to use it to extract lots of information on the layout of our UI pages. I created lists of which components were imbedded in which relatively easily which I then used XSLT to write into our schema definitions.
Also, coming from a C++ background, it was a very fun and interesting language to master.
I think that as a tool to translate XML from one format to another it is fantastic. However, it is not the only way to define an algorithm that takes XML as an input and outputs Something. If your algorithm is sufficiently complex, the fact that the input is XML becomes irrelevant to your choice of tool - i.e roll your own in C++ / Python / whatever.
Specific to your example, I would imagine the best idea would be to create your own XML->XML convert that follows your business logic. Next, write a XSLT translator that just knows about formatting and does nothing clever. That might be a nice middle ground but it totally depends what you are doing. Having a XSLT translator on the output makes it easier to create alternative output formats - printable, for mobiles, etc.
Yes, I use it a lot. By using different xslt files, I can use the same XML source to create multiple polyglot (X)HTML files (presenting the same data in different ways), a RSS feed, an Atom feed, a RDF descriptor file and fragment of a site map.
It's not a panacea. There are things it does well, and things it doesn't do well, and like all other aspects of programming, it's all about using the right tool for the right job. It's a tool that's well worth having in your toolbox but it should used only when it's appropriate to do so.
I would definitely reccomend to stick it out. Particularly if you are using visual studio which has built in editing, viewing and debugging tools for XSLT.
Yes, it is a pain while you are learning, but most of the pain is to do with familiarity. The pain does diminish as you learn the language.
W3schools has two articles that are of particular worth:
http://www.w3schools.com/xpath/xpath_functions.asp
http://www.w3schools.com/xsl/xsl_functions.asp
I have found XSLT to be quite difficult to work with.
I have had experience working on a system somewhat similar to the one you describe. My company noted that the data we were returning from "the middle tier" was in XML, and that the pages were to be rendered in HTML which might as well be XHTML, plus they'd heard that XSL was a standard for transforming between XML formats. So the "architects" (by which I mean people who think deep design thoughts but apparently never code) decided that our front tier would be implemented by writing XSLT scripts that transformed the data into the XHTML for display.
The choice turned out to be disastrous. XSLT, it turns out, is a pain to write. And so all of our pages were difficult to write and to maintain. We would have done much better to have used JSP (this was in Java) or some similar approach that used one kind of markup (angle brackets) for the output format (the HTML) and another kind of markup (like <%...%>) for the meta-data. The most confusing thing about XSLT is that it is written in XML, and it translates from XML to XML... it is quite difficult to keep all 3 different XML documents straight in one's mind.
Your situation is slightly different: instead of authoring each page in XSLT as I did, you only need to write ONE bit of code in XSLT (the code to convert from templates to display). But it sounds like you may have run into the same kind of difficulty that I did. I would say that trying to interpret a simple XML-based DSL (domain specific language) like you are doing is NOT one of the strong points of XSLT. (Although it CAN do the job... after all, it IS Turing complete!)
However, if what you had was simpler: you have data in one XML format and wanted to make simple alterations to it -- not a full page-description DSL, but some simple straightforward modifications, then XSLT is an excellent tool for that purpose. It's declarative (not procedural) nature is actually an advantage for that purpose.
-- Michael Chermside
XSLT is difficult to work with, but once you conquer it you will have a very thorough understanding of the DOM and schema. If you also XPath, then you on your way to learning functional programming and this will expose to new techniques and ways about solving problems. In some cases, successive transformation is more powerful than procedural solutions.
I use XSLT extensively, for a custom MVC style front-end. The model is "serialized" to xml (not via xml serializaiton), and then converted to html via xslt. The advantage over ASP.NET lie in the natural integration with XPath, and the more rigorous well-formedness requirements (it's much easier to reason about document structure in xslt than in most other languages).
Unfortunately, the language contains several limitations (for example, the ability to transform the output of another transform) which mean that it's occasionally frustrating to work with.
Nevertheless, the easily achievable, strongly enforced separation of concerns which it grants aren't something I see another technology providing right now - so for document transforms it's still something I'd recommend.
I used XML, XSD and XSLT on an integration project between very dis-similar DB systems sometime in 2004. I had to learn XSD and XSLT from scratch but it wasn't hard. The great thing about these tools was that it enabled me to write data independent C++ code, relying on XSD and XSLT to validate/verify and then transform the XML documents. Change the data format, change the XSD and XSLT documents not the C++ code which employed the Xerces libraries.
For interest: the main XSD was 150KB and the average size of the XSLT was < 5KB IIRC.
The other great benefit is that the XSD is a specification document that the XSLT is based on. The two work in harmony. And specs are rare in software development these days.
Although I did not have too much trouble learning the declarative nature XSD and XSLT I did find that other C/C++ programmers had great trouble in adjusting to the declarative way. When they saw that was it, ah procedural they muttered, now that I understand! And they proceeded (pun?) to write procedural XSLT! The thing is you have to learn XPath and understand the axes of XML. Reminds me of old-time C programmers adjusting to employing OO when writing C++.
I used these tools as they enabled me to write a small C++ code base that was isolated from all but the most fundamental of data structure modifications and these latter were DB structure changes. Even though I prefer C++ to any other language I'll use what I consider to be useful to benefit the long term viability of a software project.
I used to think XSLT was a great idea. I mean it is a great idea.
Where it fails is the execution.
The problem I discovered over time was that programming languages in XML are just a bad idea. It makes the whole thing impenetrable. Specifically I think XSLT is very hard learn, code and understand. The XML on top of the functional aspects just makes the whole thing too confusing. I have tried to learn it about 5 times in my career, and it just doesn't stick.
OK, you could 'tool' it -- I think that was partly the point of it's design -- but that's the second failing: all the XSLT tools on the market are, quite simply ... crap!
The XSLT specification defines XSLT as "a language for transforming XML documents into other XML documents". If you are trying to do any thing but the most basic data processing within XSLT there are probably better solutions.
Also worth noting that the data processing capabilities of XSLT can be extended in .NET using custom extension functions:
MSDN Documentation
CSharpFriends: Tutorial
I maintain an online documentation system for my company. The writers create the documentation in SGML ( an xml like language ). The SGML is then combined with XSLT and transformed into HTML.
This allows us to easily make changes to the documentation layout without doing any coding. Its just a matter of changing the XSLT.
This works well for us. In our case, its a read only document. The user isn't interacting with the documentation.
Also, by using XSLT, you are working closer to your problem domain (HTML). I always consider that to be good idea.
Lastly, if your current system WORKS, leave it alone. I would never suggest trashing your existing code. If I was starting from scratch, I would use XSLT, but in your case, I would use what you have.
It comes down to what you need it for. Its main strength is the easy maintainability of the transform, and writing your own parser generally obliterates that. With that said, sometimes a system is small and simple and really doesn't need a "fancy" solution. As long as your code-based builder is replaceable without having to change other code, no big deal.
As for the ugliness of XSL, yes it's ugly. Yes, it takes some getting used to. But once you get the hang of it (shouldn't take long IMO), it's actually smooth sailing. Compiled transforms run quite quickly in my experience, and you can certainly debug into them.
I still believe that XSLT can be useful but it is an ugly language and can lead to an awful unreadable, unmaintainable mess. Partly because XML is not human readable enough to make up a "language" and partly because XSLT is stuck somewhere between being declarative and procedural. Having said that, and I think a comparison can be drawn with regular expressions, it has it's uses when it comes to simple well defined problems.
Using the alternative approach and parsing XML in code can be equally nasty and you really want to employ some kind of XML marshalling/binding technology (such as JiBX in Java) that will convert your XML straight to an object.
If you can use XSLT in a declarative style (although I don't entirely agree that it is declarative language) then I think it is useful and expressive.
I've written web apps that use an OO language (C# in my case) to handle the data/ processing layer, but output XML rather than HTML. This can then be consumed directly by clients as a data API, or rendered as HTML by XSLTs. Because the C# was outputting XML that was structurally compatible with this use it was all very smooth, and the presentation logic was kept declarative. It was easier to follow and change than sending the tags from C#.
However, as you require more processing logic at the XSLT level it gets convoluted and verbose - even if you "get" the functional style.
Of course, these days I'd probably have written those web apps using a RESTful interface - and I think data "languages" such as JSON are gaining traction in areas that XML has traditionally been transformed by XSLT. But for now XSLT is still an important, and useful, technology.
I have spent a lot of time in XSLT and found that while it is a useful tool in some situations, it is definitely not a fix all. It works very well for B2B purposes when it is used for data translation for machine-readable XML input/output. I don't think you are on the wrong track in your statement of its limitations. One of the things that frustrated me the most were the nuances in the implementations of XSLT.
Perhaps you should look at some of the other markup languages available. I believe Jeff did an article about this very topic concerning Stack Overflow.
Is HTML a Humane Markup Language?
I would take a look at what he wrote. You can probably find a software package that does what you want "out of the box", or at least very close instead of writing your own stuff from the ground up.
I'm currently tasked with scraping data from a public site (yeah, i know). Thankfully it conforms to xhtml so I'm able to use xslt to gather the data I need. The resulting solution is readable, clean and easy to change if need occurs. Perfect!
I've used XSLT before. The group of 6 .xslt files (refactored out of one large one) was about 2750 lines long before I rewrote it in C#. The C# code is currently 4000 lines containing lots of logic; I don't even want to think about what that would have taken to write in XSLT.
The point where I gave up is when I realized not having XPATH 2.0 was significantly hurting my progress.
To answer your three questions:
I've used XSLT once some years ago.
I do believe XSLT could be the right solution in certain circumstances. (Never say never)
I tend to agree with your assesment that it is mostly useful for 'simple' transformations. But I think as long as you understand XSLT well, there is a case to be made for using it for bigger tasks like publishing a website as XML transformed into HTML.
I believe the reason many developers dislike XSLT is because they do not understand the fundamentally different paradigm it is based on. But with the recent interest in functional programming we might see XSLT making a comeback...
One place where xslt really shines is in generating reports. I've found that a 2 step process, with the first step exporting the report data as an xml file, and the second step generating the visual report from the xml using xslt. This allows for nice visual reports while still keeping the raw data around as a validation mechanism if needs be.
At a previous company we did a lot with XML and XSLT. Both XML and XSLT big.
Yes there is a learning curve, but then you have a powerful tool to handle XML. And you can even use XSLT on XSLT (which can sometimes be useful).
Performance is also an issue (with very large XML) but you can tackle that by using smart XSLT and do some preprocessing with the (generated) XML.
Anybody with knowledge of XSLT can change the apearance of the finished product because it is not compiled.
I personally like XSLT, and you may want to give the simplified syntax a look (no explicit templates, just a regular old HTML file with a few XSLT tags to spit values into it), but it just isn't for everyone.
Maybe you just want to offer your authors a simple Wiki or Markdown interface. There are libraries for that, too, and if XSLT isn't working for you, maybe XML isn't working for them either.
XSLT is not the end-all be-all of xml transformation. However, it's very difficult to judge based on the information given if it would have been the best solution to your problem or if there are other more efficient and maintainable approaches. You say the authors could enter their content in a simplified format - what format? Text boxes? What kind of html were you converting it to?
To judge whether XSLT is the right tool for the job, it would help to know the features of this transformation in more detail.
I enjoy using XSLT only for changing the tree structure of XML documents. I find it cumbersome to do anything related to text processing and relegate that to a custom script that I may run before or after applying an XSLT to an XML document.
XSLT 2.0 included a lot more string functions, but I think it's not a good fit for the language, and there's not many implementations of XSLT 2.0.