HTML annotation system to aid web UI test automation element/attribute references?

Is there a system for annotating HTML to identify elements/attributes critical to web user interface test automation (by Selenium, HTMLUnit, Watir, Sahi, etc.)? The system might be a standard or a library. The annotations might, for example, be implemented as HTML data attributes that reference an attribute name, an XPath expression, or a CSS selector that needs to exist and remain consistent for test automation purposes. If the identified attributes/elements change, test automation may break, so developers should not change them without coordinating with those responsible for automation. The annotations are thus, at the very least, a visual cue to developers.
But beyond that, perhaps an "enforcer" part of the system (continuous integration plugin, CLI, or perhaps JavaScript library on the same page) would enforce the annotations to some degree, failing quickly and clearly if certain conditions are not satisfied. Perhaps the identified attribute/XPath/selector must exist and reference the same element that defines the annotation. The enforcer can also gather all the annotations and perhaps report the full list of them on each page of a web application, for diff notification purposes.
I may have some details not well worked out, but hopefully the gist of what I'm looking for makes sense. A free license would be handy but not absolutely necessary.
Does something like this exist?
A helpful answer might also be:
There's a better way to address your underlying need (with explanation).
We tried implementing this and it's not worth it; here's why.
Motivation: Web user interface tests can be sensitive to changes to HTML elements and attributes. The tests often run in an integration testing job or phase which may be significantly later than the build or unit tests. It would be nice to catch some of the potential mismatches between UI elements and test expectations earlier, in the build or unit test phase, and also to provide a test automation engineer more information to indicate what may be a bug in a test vs. a bug in the product under test.

As you say:
The annotations might for example be implemented as HTML data attributes
This is the way people are going with HTML5, as it enables us to pass control over to our QA automation teams and lets development teams stop worrying about breaking tests (changing an ID on an element would otherwise affect their testing).
So, an HTML element might look like (a relatively contrived example):
<div id="button" data-test-attribute="submit"></div>
Most selectors used by these libraries are CSS-oriented, making it easy to select the element above (see the sketch after the quote below). There are a number of articles that have begun to address this.
A really brief one I like states:
IDs and CSS class names have purposes other than their use in automated UI tests. CSS classes are primarily used to style elements, and they can also get uglified on the production page. IDs likewise serve other purposes, and can sometimes change or not get implemented at all because they interfere with other functions on the page.
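Selecting by such a dedicated test attribute is straightforward. As a minimal sketch (assuming Selenium's Python bindings and the contrived div above; the page URL is hypothetical):

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()
    driver.get("https://example.com/form")  # hypothetical page containing the div above

    # Target the dedicated test annotation rather than the id or a CSS class
    submit = driver.find_element(By.CSS_SELECTOR, '[data-test-attribute="submit"]')
    submit.click()
    driver.quit()

Because the attribute exists only for testing, stylists and developers have no reason to touch it, which is exactly the visual-cue property the question asks for.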
It seems another part of your question is concerned about enforcement. This would be relatively simple to integrate into your build process with a hand-spun thing using something like Node.js and webpack (might increase your build time, considerably - but certainly possible).

Related

Remove element-ID's from Vaadin components in production mode

I use setId a lot for automated UI tests within my Vaadin application. For performance reasons, I would like to remove these IDs in production mode. Is there any good way to do so?
You can check if you are currently running in Vaadin Production Mode like this
VaadinService.getCurrent().getDeploymentConfiguration().isProductionMode();
So if you are setting your components id with setId() method, you can easily set it only when not in production mode, for example:
boolean isProductionMode = VaadinService.getCurrent().getDeploymentConfiguration().isProductionMode();
if (!isProductionMode) {
    foo.setId(FOO_ID);
}
But I would consider whether to use this approach at all. How many components do you assign IDs to for web tests? If the number is not extremely high, the performance hit will be negligible, while your code will become cluttered with production-mode checks. In many cases code readability and simplicity are more important than a minor performance hit.
Alternatively, you can rewrite many of your component selectors (assuming you are using Vaadin TestBench?) as XPath queries which do not depend on component IDs but on information that is already present - like the "location" attribute when using custom layouts, the CSS class, the position in the parent container, etc.

Framework vs. Toolkit vs. Library [duplicate]

What is the difference between a Framework, a Toolkit and a Library?
The most important difference, and in fact the defining difference between a library and a framework is Inversion of Control.
What does this mean? Well, it means that when you call a library, you are in control. But with a framework, the control is inverted: the framework calls you. (This is called the Hollywood Principle: Don't call Us, We'll call You.) This is pretty much the definition of a framework. If it doesn't have Inversion of Control, it's not a framework. (I'm looking at you, .NET!)
Basically, all the control flow is already in the framework, and there's just a bunch of predefined white spots that you can fill out with your code.
A library on the other hand is a collection of functionality that you can call.
I don't know if the term toolkit is really well defined. Just the word "kit" seems to suggest some kind of modularity, i.e. a set of independent libraries that you can pick and choose from. What, then, makes a toolkit different from just a bunch of independent libraries? Integration: if you just have a bunch of independent libraries, there is no guarantee that they will work well together, whereas the libraries in a toolkit have been designed to work well together – you just don't have to use all of them.
But that's really just my interpretation of the term. Unlike library and framework, which are well-defined, I don't think that there is a widely accepted definition of toolkit.
Martin Fowler discusses the difference between a library and a framework in his article on Inversion of Control:
Inversion of Control is a key part of
what makes a framework different to a
library. A library is essentially a
set of functions that you can call,
these days usually organized into
classes. Each call does some work and
returns control to the client.
A framework embodies some abstract
design, with more behavior built in.
In order to use it you need to insert
your behavior into various places in
the framework either by subclassing or
by plugging in your own classes. The
framework's code then calls your code
at these points.
To summarize: your code calls a library but a framework calls your code.
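A toy illustration of that difference, sketched in Python (TinyFramework and the handler are invented names, not a real package):

    import json

    # Library style: your code is in control and calls into the library;
    # each call does some work and returns control to you.
    data = json.loads('{"answer": 42}')

    # Framework style: you hand code to the framework and it decides when to run it.
    class TinyFramework:
        def __init__(self):
            self.handlers = []

        def register(self, handler):   # you fill in a predefined "white spot"
            self.handlers.append(handler)

        def run(self, events):         # the control flow lives in the framework
            for event in events:
                for handler in self.handlers:
                    handler(event)

    fw = TinyFramework()
    fw.register(lambda event: print("handling", event))
    fw.run(["start", "stop"])          # the framework calls your code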
Diagram
If you are a more visual learner, there is a diagram that makes it clearer (not reproduced here):
http://tom.lokhorst.eu/2010/09/why-libraries-are-better-than-frameworks
The answer provided by Barrass is probably the most complete. However, the explanation could easily be stated more clearly. Most people miss the fact that these are all nested concepts. So let me lay it out for you.
When writing code:
eventually you discover sections of code that you're repeating in your program, so you refactor those into Functions/Methods.
eventually, after having written a few programs, you find yourself copying functions you already made into new programs. To save yourself time you bundle those functions into Libraries.
eventually you find yourself creating the same kind of user interfaces every time you make use of certain libraries. So you refactor your work and create a Toolkit that allows you to create your UIs more easily from generic method calls.
eventually, you've written so many apps that use the same toolkits and libraries that you create a Framework that has a generic version of this boilerplate code already provided so all you need to do is design the look of the UI and handle the events that result from user interaction.
Generally speaking, this completely explains the differences between the terms.
Introduction
There are various terms relating to collections of related code, which have both historical (pre-1994/5 for the purposes of this answer) and current implications, and the reader should be aware of both, particularly when reading classic texts on computing/programming from the historic era.
Library
Both historically, and currently, a library is a collection of code relating to a specific task, or set of closely related tasks which operate at roughly the same level of abstraction. It generally lacks any purpose or intent of its own, and is intended to be used by (consumed) and integrated with client code to assist client code in executing its tasks.
Toolkit
Historically, a toolkit is a more focused library, with a defined and specific purpose. Currently, this term has fallen out of favour, and is used almost exclusively (to this author's knowledge) for graphical widgets and GUI components in the current era. A toolkit will most often operate at a higher layer of abstraction than a library, and will often consume and use libraries itself. Unlike library code, toolkit code will often be used to execute the task of the client code, such as building a window, resizing a window, etc. The lower levels of abstraction within a toolkit are either fixed, or can themselves be operated on by client code in a prescribed manner. (Think of window style, which can either be fixed, or which could be altered in advance by client code.)
Framework
Historically, a framework was a suite of inter-related libraries and modules which were separated into either 'General' or 'Specific' categories. General frameworks were intended to offer a comprehensive and integrated platform for building applications by offering general functionality, such as cross-platform memory management, multi-threading abstractions, and dynamic structures (and generic structures in general). Historical general frameworks (without dependency injection, see below) have almost universally been superseded by polymorphic templated (parameterised) packaged language offerings in OO languages, such as the STL for C++, or by packaged libraries for non-OO languages (guaranteed Solaris C headers). General frameworks operated at differing layers of abstraction, but universally low level, and like libraries relied on the client code carrying out its specific tasks with their assistance.
'Specific' frameworks were historically developed for single (but often sprawling) tasks, such as "command and control" systems for industrial systems and early networking stacks, and operated at a high level of abstraction; like toolkits, they were used to carry out execution of the client code's tasks.
Currently, the definition of a framework has become more focused and taken on the "Inversion of Control" principle as mentioned elsewhere as a guiding principle, so program flow, as well as execution is carried out by the framework. Frameworks are still however targeted either towards a specific output; an application for a specific OS for example (MFC for MS Windows for example), or for more general purpose work (Spring framework for example).
SDK: "Software Development Kit"
An SDK is a collection of tools to assist the programmer to create and deploy code/content which is very specifically targeted to either run on a very particular platform or in a very particular manner. An SDK can consist of simply a set of libraries which must be used in a specific way only by the client code and which can be compiled as normal, up to a set of binary tools which create or adapt binary assets to produce its (the SDK's) output.
Engine
An Engine (In code collection terms) is a binary which will run bespoke content or process input data in some way. Game and Graphics engines are perhaps the most prevalent users of this term, and are almost universally used with an SDK to target the engine itself, such as the UDK (Unreal Development Kit) but other engines also exist, such as Search engines and RDBMS engines.
An engine will often, but not always, allow only a few of its internals to be accessible to its clients, most often to target a different architecture, change the presentation of the engine's output, or for tuning purposes. Open-source engines are by definition open to clients to change and alter as required, while some proprietary engines are fixed completely. The most often used engines in the world, however, are almost certainly JavaScript engines. Embedded into every browser everywhere, there is a whole host of JavaScript engines which will take JavaScript as input, process it, and then produce rendered output.
API: "Application Programming Interface"
The final term I am answering is a personal bugbear of mine. API was historically used to describe the external interface of an application or environment which was itself capable of running independently, or at least of carrying out its tasks without any necessary client intervention after initial execution. Applications such as databases, word processors and windowing systems would expose a fixed set of internal hooks or objects through the external interface which a client could then call/modify/use, etc. to carry out capabilities which the original application could carry out. APIs varied in how much functionality was available through the API, and also in how much of the core application was (re)used by the client code. (For example, a word processing API may require the full application to be background-loaded when each instance of the client code runs, or perhaps just one of its linked libraries; whereas a running windowing system would create internal objects to be managed by itself and pass back handles to the client code to be utilised instead.)
Currently, the term API has a much broader range, and is often used to describe almost every other term within this answer. Indeed, the most common definition applied to this term is that an API offers up a contracted external interface to another piece of software (Client code to the API). In practice this means that an API is language dependent, and has a concrete implementation which is provided by one of the above code collections, such as a library, toolkit, or framework.
To look at a specific area, protocols, for example: an API is different from a protocol, which is a more generic term representing a set of rules; however, an individual implementation of a specific protocol/protocol suite that exposes an external interface to other software would most often be called an API.
Remark
As noted above, historic and current definitions of the above terms have shifted, and this can be seen to be down to advances in scientific understanding of the underlying computing principles and paradigms, and also down to the emergence of particular patterns of software. In particular, the GUI and Windowing systems of the early nineties helped to define many of these terms, but since the effective hybridisation of OS Kernel and Windowing system for mass consumer operating systems (bar perhaps Linux), and the mass adoption of dependency injection/inversion of control as a mechanism to consume libraries and frameworks, these terms have had to change their respective meanings.
P.S. (A year later)
After thinking carefully about this subject for over a year I reject the IoC principle as the defining difference between a framework and a library. There ARE a large number of popular authors who say that it is, but there are an almost equal number of people who say that it isn't. There are simply too many 'Frameworks' out there which DO NOT use IoC to say that it is the defining principle. A search for embedded or micro controller frameworks reveals a whole plethora which do NOT use IoC and I now believe that the .NET language and CLR is an acceptable descendant of the "general" framework. To say that IoC is the defining characteristic is simply too rigid for me to accept I'm afraid, and rejects out of hand anything putting itself forward as a framework which matches the historical representation as mentioned above.
For details of non-IoC frameworks, see, as mentioned above, many embedded and micro frameworks, as well as any historical framework in a language that does not provide callbacks through the language (OK, callbacks can be hacked for any device with a modern register system, but not by the average programmer), and, obviously, the .NET Framework.
A library is simply a collection of methods/functions wrapped up into a package that can be imported into a code project and re-used.
A framework is a robust library or collection of libraries that provides a "foundation" for your code. A framework follows the Inversion of Control pattern. For example, the .NET Framework is a large collection of cohesive libraries on top of which you build your application. You can argue there isn't a big difference between a framework and a library, but when people say "framework" it typically implies a larger, more robust suite of libraries which will play an integral part of an application.
I think of a toolkit the same way I think of an SDK. It comes with documentation, examples, libraries, wrappers, etc. Again, you can say this is the same as a framework and you would probably be right to do so.
They can almost all be used interchangeably.
Very, very similar: a framework is usually a bit more developed and complete than a library, and a toolkit can simply be a collection of similar libraries and frameworks.
A really good question that is maybe even the slightest bit subjective in nature, but I believe that is about the best answer I could give.
Library
I think it's unanimous that a library is code that has already been written which you can use so as not to have to write it again. The code must be organized in a way that allows you to look up the functionality you want and use it from your own code.
Most programming languages come with standard libraries, especially code that implements some kind of collection. This is always for the convenience of not having to code these things yourself. Similarly, most programming languages have constructs that allow you to look up functionality in libraries, with things like dynamic linking, namespaces, etc.
So code that often needs to be re-used is great code to put inside a library.
Toolkit
A set of tools used for a particular purpose. This much is unanimous. The question is what is considered a tool and what isn't. I'd say there's no fixed definition; it depends on the context of the thing calling itself a toolkit. Examples of tools could be libraries, widgets, scripts, programs, editors, documentation, servers, debuggers, etc.
Another thing to note is the "particular purpose". This is always true, but the scope of the purpose can easily change based on who made the toolkit. So it can easily be a programmer's toolkit, or it can be a string-parsing toolkit. One is so broad it could have tools touching everything programming-related, while the other is more precise.
SDKs are generally toolkits, in that they try and bundle a set of tools (often of multiple kind) into a single package.
I think the common thread is that a tool does something for you, either completely, or it helps you do it. And a toolkit is simply a set of tools which all perform or help you perform a particular set of activities.
Framework
Frameworks aren't quite as unanimously defined. It seems to be a bit of a blanket term for anything that can frame your code. Which would mean: any structure that underlies or supports your code.
This implies that you build your code against a framework, whereas you build a library against your code.
But it seems that sometimes the word framework is used in the same sense as toolkit or even library. The .NET Framework is mostly a toolkit, because it's composed of the FCL, which is a library, and the CLR, which is a virtual machine. So you would consider it a toolkit for C# development on Windows, Mono being a toolkit for C# development on Linux. Yet they called it a framework. It makes sense to think of it this way too, since it kind of frames your code, but a frame should support and hold things together more than do any kind of work, so my opinion is this is not the way you should use the word.
And I think the industry is trying to move into having framework mean an already written program with missing pieces that you must provide or customize. Which I think is a good thing, since toolkit and library are great precise terms for other usages of "framework".
Framework: installed on your machine and allowing you to interact with it. Without the framework you can't send programming commands to your machine.
Library: aims to solve a certain problem (or several problems related to the same category)
Toolkit: a collection of many pieces of code that can solve multiple problems on multiple issues (just like a toolbox)
It's a little bit subjective, I think. The toolkit is the easiest: it's just a bunch of methods and classes that can be used.
As for the library vs. framework question, I distinguish them by the way they are used. I read the perfect answer somewhere a long time ago: the framework calls your code, but your code calls the library.
In relation to the correct answer from Mittag:
a simple example. Let's say you implement the ISerializable interface (.NET) in one of your classes. You are then making use of the framework qualities of .NET, rather than its library qualities. You fill in the "white spots" (as Mittag said) and you have the skeleton completed. You must know in advance how the framework is going to "react" to your code. Actually .NET IS a framework, and here is where I disagree with the view of Mittag.
The full, complete answer to your question is given very lucidly in Chapter 19 (the whole chapter devoted to just this theme) of this book, which is a very good book by the way (not at all "just for Smalltalk").
Others have noted that .NET may be a framework, a library, and a toolkit depending on which part you use, but perhaps an example helps. Entity Framework, for dealing with databases, is a part of .NET that does use the inversion of control pattern: you let it know your models, and it figures out what to do with them. As a programmer it requires you to understand "the mind of the framework", or more realistically the mind of the designers and what they are going to do with your inputs. DataReader and related calls, on the other hand, are simply tools to get or put data to and from a table/view and make it available to you. A DataReader would never understand how to take a parent-child relationship and translate it from object to relational; you'd use multiple tools to do that. But you would have much more control over how that data was stored, when, transactions, etc.

Why should I use code generators

I have encountered this topic lately and couldn't understand why code generators are needed.
Can you explain why I should use them in my projects and how they can ease my life?
Examples would be great, as would pointers to where I can learn a little more about this topic.
At least you have framed the question from the correct perspective =)
The usual reasons for using a code generator are given as productivity and consistency because they assume that the solution to a consistent and repetitive problem is to throw more code at it. I would argue that any time you are considering code generation, look at why you are generating code and see if you can solve the problem through other means.
A classic example of this is data access; you could generate 250 classes ( 1 for each table in the schema ) effectively creating a table gateway solution, or you could build something more like a domain model and use NHibernate / ActiveRecord / LightSpeed / [pick your orm] to map a rich domain model onto the database.
While both the hand-rolled solution and the ORM are effectively code generators, the primary difference is when the code is generated. With the ORM it is an implicit step that happens at run-time and is therefore one-way by its nature. The hand-rolled solution requires an explicit step to generate the code during development, and the likelihood that the generated classes will need customising at some point creates problems when you re-generate the code. The explicit step that must happen during development introduces friction into the development process and often leads to code that violates DRY (although some argue that generated code can never violate DRY).
Another reason for touting code generation comes from the MDA / MDE world (Model Driven Architecture / Engineering). I don't put much stock in this, but rather than providing a number of poorly expressed arguments, I'm simply going to co-opt someone else's - http://www.infoq.com/articles/8-reasons-why-MDE-fails.
IMHO code generation is the only solution in an exceedingly narrow set of problems and whenever you are considering it, you should probably take a second look at the real problem you are trying to solve and see if there is a better solution.
One type of code generation that really does enhance productivity is "micro code-generation", where the use of macros and templates allows a developer to generate new code directly in the IDE and tab/type their way through placeholders (e.g. namespace, class name, etc.). This sort of code generation is a feature of ReSharper and I use it heavily every day. The reason that micro-generation benefits where most large-scale code generation fails is that the generated code is not tied back to any other resource that must be kept in sync; once the code is generated, it is just like all the other code in the solution.
#John
Moving the creation of "basic classes" from the IDE into XML / a DSL is often seen when doing big-bang development - a classic example would be developers trying to reverse-engineer the database into a domain model. Unless the code generator is very well written, it simply introduces an additional burden on the developers in that every time they need to update the domain model, they either have to context-switch and update the XML / DSL, or they have to extend the domain model and then port those changes back to the XML / DSL (effectively doing the work twice).
There are some code generators that work very well in this space (the LightSpeed designer is the only one I can think of at the moment) by acting as the engine for a design surface, but often these code generators generate terrible code that cannot be maintained (e.g. WinForms / WebForms design surfaces, the EF1 design surface) and therefore rapidly undo any productivity benefits gained from using the code generator in the first place.
Well, it's either:
you write 250 classes, all pretty much the same, but slightly different, e.g. to do data access; takes you a week, and it's boring and error-prone and annoying
OR:
you invest 30 minutes into generating a code template, and let a generation engine handle the grunt work in another 30 minutes
So a code generator gives you:
speed
reproducibility
a lot less errors
a lot more free time! :-)
Excellent examples:
Linq-to-SQL T4 templates by Damien Guard to generate one separate file per class in your database model, using the best kept Visual Studio 2008 secret - T4 templates
PLINQO - same thing, but for Codesmith's generator
and countless more.....
Anytime you need to produce large amounts of repetitive boilerplate code, the code generator is the guy for the job. The last time I used a code generator was when creating a custom Data Access Layer for a project, where the skeleton for various CRUD actions was created based on an object model. Instead of hand-coding all those classes, I put together a template-driven code generator (using StringTemplate) to make it for me. The advantages of this procedure were:
It was faster (there was a large amount of code to generate)
I could regenerate the code on a whim in case I detected a bug (code can sometimes have bugs in the early versions)
Less error-prone; when we had an error in the generated code it was everywhere, which means that it was more likely to be found (and, as noted in the previous point, it was easy to fix it and regenerate the code).
Using GUI builders that generate code for you is a common practice. Thanks to this you don't need to create all the widgets manually; you just drag & drop them and then use the generated code. For simple widgets this really saves time (I have used this a lot with wxWidgets).
Really, when you are using almost any programming language, you are using a "code generator" (except for assembly or machine code.) I often write little 200-line scripts that crank out a few thousand lines of C. There is also software you can get which helps generate certain types of code (yacc and lex, for example, are used to generate parsers to create programming languages.)
The key here is to think of your code generator's input as the actual source code, and think of the stuff it spits out as just part of the build process. In which case, you are writing in a higher-level language with fewer actual lines of code to deal with.
For example, here is a very long and tedious file I (didn't) write as part of my work modifying the Quake2-based game engine CRX. It takes the integer values of all #defined constants from two of the headers, and makes them into "cvars" (variables for the in-game console.)
http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/game/cvar_constants.c
Here is the short Bash script which generated that code at compile-time:
http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/autogen/constant_cvars.sh
Now, which would you rather maintain? They are both equivalent in terms of what they describe, but one is vastly longer and more annoying to deal with.
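To give a flavour of how small such a generator can be, here is a sketch of the same idea in Python (the header file name, the #define format, and the register_cvar output are assumptions modelled on the example above, not the actual CRX code):

    import re

    # Scan a header for integer #define constants and emit one registration per constant.
    DEFINE_RE = re.compile(r"#define\s+(\w+)\s+(\d+)")

    with open("g_defines.h") as header, open("cvar_constants.c", "w") as out:
        out.write("/* Generated file - do not edit by hand. */\n")
        for name, value in DEFINE_RE.findall(header.read()):
            out.write(f'register_cvar("{name}", {value});\n')

The generator script, not its output, is what you maintain and keep in version control.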
The canonical example of this is data access, but I have another example. I've worked on a messaging system that communicates over serial port, sockets, etc., and I found I kept having to write classes like this over and over again:
public class FooMessage
{
    public FooMessage()
    {
    }

    public FooMessage(int bar, string baz, DateTime blah)
    {
        this.Bar = bar;
        this.Baz = baz;
        this.Blah = blah;
    }

    public void Read(BinaryReader reader)
    {
        this.Bar = reader.ReadInt32();
        this.Baz = Encoding.ASCII.GetString(reader.ReadBytes(30));
        this.Blah = new DateTime(reader.ReadInt16(), reader.ReadByte(),
            reader.ReadByte());
    }

    public void Write(BinaryWriter writer)
    {
        writer.Write(this.Bar);
        writer.Write(Encoding.ASCII.GetBytes(
            this.Baz.PadRight(30).Substring(0, 30)));
        writer.Write((Int16)this.Blah.Year);
        writer.Write((byte)this.Blah.Month);
        writer.Write((byte)this.Blah.Day);
    }

    public int Bar { get; set; }
    public string Baz { get; set; }
    public DateTime Blah { get; set; }
}
Try to imagine, if you will, writing this code for no fewer than 300 different types of messages. The same boring, tedious, error-prone code being written, over and over again. I managed to write about 3 of these before I decided it would be easier for me to just write a code generator, so I did.
I won't post the code-gen code, it's a lot of arcane CodeDom stuff, but the bottom line is that I was able to compact the entire system down to a single XML file:
<Messages>
    <Message ID="12345" Name="Foo">
        <ByteField Name="Bar"/>
        <TextField Name="Baz" Length="30"/>
        <DateTimeField Name="Blah" Precision="Day"/>
    </Message>
    <!-- (More messages) -->
</Messages>
How much easier is this? (Rhetorical question.) I could finally breathe. I even added some bells and whistles so it was able to generate a "proxy", and I could write code like this:
var p = new MyMessagingProtocol(...);
SetFooResult result = p.SetFoo(3, "Hello", DateTime.Today);
In the end I'd say this saved me writing a good 7500 lines of code and turned a 3-week task into a 3-day task (well, plus the couple of days required to write the code-gen).
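The CodeDom generator itself is not shown, but the core loop of any such tool is simple: walk the XML and emit one class per message. A stripped-down sketch in Python (element and attribute names follow the sample XML above; the emitted bodies are elided):

    import xml.etree.ElementTree as ET

    tree = ET.parse("messages.xml")   # the message definition file shown above
    for message in tree.getroot().findall("Message"):
        name = message.get("Name")
        print(f"public class {name}Message")
        print("{")
        for field in message:
            # A real generator would emit the property plus Read/Write logic here.
            print(f"    // {field.tag} {field.get('Name')}")
        print("}")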
Conclusion: Code generation is only appropriate for a relatively small number of problems, but when you're able to use one, it will save your sanity.
A code generator is useful if:
1. The cost of writing and maintaining the code generator is less than the cost of writing and maintaining the repetition that it is replacing.
2. The consistency gained by using a code generator will reduce errors to a degree that makes it worthwhile.
3. The extra problem of debugging generated code will not make debugging inefficient enough to outweigh the benefits from 1 and 2.
For domain-driven or multi-tier apps, code generation is a great way to create the initial model or data access layer. It can churn out the 250 entity classes in 30 seconds (or, in my case, 750 classes in 5 minutes). This then leaves the programmer to focus on enhancing the model with relationships, business rules, or deriving views within MVC.
The key thing here is when I say initial model. If you are relying on the code generation to maintain the code, then the real work is being done in the templates. (As stated by Max E.) And beware of that because there is risk and complexity in maintaining template-based code.
If you just want the data layer to be "automagically created" so you can "make the GUI work in 2 days", then I'd suggest going with a product/toolset which is geared towards the data-driven or two-tier application scenario.
Finally, keep in mind "garbage in=garbage out". If your entire data layer is homogeneous and does not abstract from the database, please please ask yourself why you are bothering to have a data layer at all. (Unless you need to look productive :) )
How 'bout an example of a good use of a code generator?
This uses T4 templates (a code generator built into Visual Studio) to generate compressed CSS from .less files:
http://haacked.com/archive/2009/12/02/t4-template-for-less-css.aspx
Basically, it lets you define variables, real inheritance, and even behavior in your style sheets, and then create normal css from that at compile time.
Everyone talks here about simple code generation, but what about model-driven code generation (like MDSD or DSM)? This helps you move beyond the simple ORM/member accessors/boilerplate generators and into code generation of higher-level concepts for your problem domain.
It's not productive for one-off projects, but even for these, model-driven development introduces additional discipline, better understanding of employed solutions and usually a better evolution path.
Like 3GLs and OOP provided an increase in abstraction by generating large quantities of assembly code based on a higher level specification, model-driven development allows us to again increase the abstraction level, with yet another gain in productivity.
MetaEdit+ from MetaCase (mature) and ABSE from Isomeris (my project, in alpha, info at http://www.abse.info) are two technologies on the forefront of model-driven code generation.
What is needed really is a change in mindset (like OOP required in the 90's)...
I'm actually adding the finishing touches to a code generator I'm using for a project I've been hired on. We have a huge XML file of definitions, and in a day's worth of work I was able to generate over 500 C# classes. If I want to add functionality to all the classes, say I want to add an attribute to all the properties, I just add it to my code-gen, hit go, and bam! I'm done.
It's really nice, really.
There are many uses for code generation.
Writing code in a familiar language and generating code for a different target language:
GWT - Java -> JavaScript
MonoTouch - C# -> Objective-C
Writing code at a higher level of abstraction:
Compilers
Domain-Specific Languages
Automating repetitive tasks:
Data Access Layers
Initial Data Models
Ignoring all preconceived notions of code-generation, it is basically translating one representation (usually higher level) to another (usually lower level). Keeping that definition in mind, it is a very powerful tool to have.
The current state of programming languages has by no means reached its full potential and it never will. We will always be abstracting to get to a higher level than where we stand today. Code generation is what gets us there. We can either depend on the language creators to create that abstraction for us, or do it ourselves. Languages today are sophisticated enough to allow anybody to do it easily.
If by "code generator" you also mean snippets, try the difference between typing ctor + TAB and writing the constructor by hand in each of your classes. Or check how much time you save by using the snippet that creates a switch statement for an enum with many values.
If you're paid by LOC and work for people who don't understand what code generation is, it makes a lot of sense. This is not a joke, by the way - I have worked with more than one programmer who employs this technique for exactly this purpose. Nobody gets paid by LOC formally any more (that I know of, anyway), but programmers are generally expected to be productive, and churning out large volumes of code can make someone look productive.
As an only slightly tangential point, I think this also explains the tendency of some coders to break a single logical unit of code into as many different classes as possible (ever inherit a project with LastName, FirstName and MiddleInitial classes?).
Here's some heresy:
If a task is so stupid that it can be automated at program-writing time (i.e. source code can be generated by a script from, let's say, XML), then the same can also be done at run-time (i.e. some representation of that XML can be interpreted at run-time) or with meta-programming. So in essence the programmer was lazy, did not attempt to solve the real problem, but took the easy way out and wrote a code generator. In Java / C#, look at reflection; in C++, look at templates.
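For instance, the message definitions from the earlier answer could be interpreted at run-time instead of being compiled from generated source. A minimal sketch in Python, where type() builds classes dynamically (the XML mirrors the earlier sample and is an assumption):

    import xml.etree.ElementTree as ET

    xml_source = """
    <Messages>
      <Message ID="12345" Name="Foo">
        <TextField Name="Baz" Length="30"/>
      </Message>
    </Messages>
    """

    classes = {}
    for message in ET.fromstring(xml_source.strip()).findall("Message"):
        fields = [field.get("Name") for field in message]
        # Build the class at run-time instead of generating source code for it.
        classes[message.get("Name")] = type(message.get("Name") + "Message", (), {"fields": fields})

    foo = classes["Foo"]()
    print(type(foo).__name__, foo.fields)   # FooMessage ['Baz']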

What is a Shim?

What's the definition of a Shim?
Simple Explanation via Cartoon (the cartoon itself is not reproduced here)
Summary
A shim is some code that takes care of what's asked (by 'interception'), without anyone being any wiser about it.
Example of a Shim
An example of a shim would be rbenv (a Ruby tool). Calls to Ruby commands are "shimmed": when you run bundle install, rbenv intercepts that message and reroutes it according to the specific version of Ruby you are running. If that doesn't make sense, try this example, or just think of a fairy godmother intercepting messages and delivering apposite outcomes.
That's it!
Important Clarifications on this example
Note: like most analogies, this one is not perfect. Usually Ralph (the customer in the original cartoon) will get EXACTLY what he asked for, but the mechanics of HOW it was obtained are something Ralph doesn't care about. If Ralph asks for dog food, a good shim will deliver dog food.
I wanted to avoid semantic arguments and complexity, e.g. the Gang of Four adapter, facade, and proxy design patterns; they're not that great when you're trying to explain a concept. Introducing code? Pedagogically risky. A Wikipedia-like explanation? Boring, too complex, and time-consuming. So I deliberately simplified to a cartoon, so you can easily understand it in a "fun" way, in 30 seconds, and it's memorable so you can move on. This approach is not for everyone: if you want a precise definition, consider the Wikipedia entry on shims.
The term "shim" as defined in Wikipedia would technically be classified, based on its definition, as a "structural" design pattern. The many types of structural design patterns are quite clearly described in the (some would say de facto) object-oriented software design-patterns reference "Design Patterns: Elements of Reusable Object-Oriented Software", better known as the "Gang of Four" book.
The "Gang of Four" text outlines at least three well-established patterns, "Proxy", "Adapter" and "Facade", which all provide "shim"-type functionality. In most fields it's often the use or misuse of different terms for the same root concept that causes confusion. Using the word "shim" to describe the more specific structural design patterns "Proxy", "Adapter" and "Facade" is certainly a clear example of this type of situation. A "shim" is simply a more general term for the more specific types of structural patterns "Proxy", "Adapter", "Facade" and possibly others.
According to Microsoft's article "Demystifying Shims":
It’s a metaphor based on the English language word shim, which is an
engineering term used to describe a piece of wood or metal that is
inserted between two objects to make them fit together better. In
computer programming, a shim is a small library which transparently
intercepts an API, changes the parameters passed, handles the
operation itself, or redirects the operation elsewhere. Shims can also
be used for running programs on different software platforms than they
were developed for.
So a shim is a generic term for any library of code that acts as a middleman and partially or completely changes the behavior or operation of a program. Like a true middleman, it can affect the data passed to that program, or affect the data returned from that program.
The Windows API is an example:
The application is generally unaware that the request is going to a
shim DLL instead of to Windows itself, and Windows is unaware that the
request is coming from a source other than the application (because
the shim DLL is just another DLL inside the application’s process).
So the two programs that make the "bread" of the "shim sandwich" should not be able to differentiate between talking to their counterpart program and talking to the shim.
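In code, the middleman can be as small as a wrapper that neither side knows about. A toy sketch in Python (the logging is an arbitrary stand-in for whatever behaviour a real shim would add):

    import functools

    def shim(func):
        # Transparently intercepts calls: same name, same signature, extra behaviour.
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            print(f"shim intercepted {func.__name__}{args}")  # the added behaviour
            return func(*args, **kwargs)
        return wrapper

    def save_document(name):
        return f"saved {name}"

    save_document = shim(save_document)   # callers keep calling save_document as before
    print(save_document("report.txt"))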
What are some pros and cons of using shims?
Again, from the article:
You can fix applications without access to the source code, or without
changing them at all. You incur a minimal amount of additional
management overhead... and you can fix a
reasonable number of applications this way. The downside is support as
most vendors don’t support shimmed applications. You can’t fix every
application using shims. Most people typically consider shims for
applications where the vendor is out of business, the software isn’t
strategic enough to necessitate support, or they just want to buy some
time.
As for the origins of the word, quoth Apple's Dictionary widget:
noun
a washer or thin strip of material used to align parts,
make them fit, or reduce wear.
verb ( shimmed, shimming) [ trans. ]
wedge (something) or fill up (a space) with a shim.
ORIGIN early 18th cent.: of unknown origin
This seems to fit quite well with how web designers use the term.
Shims are used in the .NET 4.5 Microsoft Fakes framework to isolate your application from other assemblies for unit testing. Shims divert calls to specific methods to code that you write as part of your test.
As we can see in many responses here, a shim is a sort of adapter that provides functionality at the API level which was not necessarily part of that API. This thread has a lot of good and complete responses, so I'm not expanding the definition further.
However, I think I can add a good example: the JavaScript ES shims (e.g. https://github.com/es-shims/es6-shim):
JavaScript has evolved a lot during the last few years, and among many other changes to the language specification, a lot of new methods have been added to its core objects.
For example, in the ES2015 specification (aka ES6), the method find was added to the Array prototype. So let's say you are running your code on a JavaScript engine that predates this specification (e.g. Node 0.12) and doesn't offer that method yet. By loading the shim, these new methods are added to the Array prototype, allowing you to make use of them even if you are not running on a newer JavaScript specification.
You might ask: why would someone do that instead of upgrading the environment to a newer version (let's say Node 8)?
There are many real-world scenarios where this approach makes sense. One good example:
Let's say you have a legacy system running in an old environment, and you need these new methods to implement or fix some functionality. Upgrading your environment is still a work in progress, because there are compatibility issues that require a lot of code changes and tests (it is a critical component).
In this example, you could try to craft your own version of that functionality, but that would make your code harder to read and more complex, could introduce new bugs, and would require tons of additional tests just to cover functionality that you know will be available in the next release.
Instead, you can use the shim and make use of these new methods, taking advantage of the fact that this fix/functionality will remain compatible after the upgrade, because you are already using the methods known to be available in the next specification. And there is a bonus reason: since these methods are native in the next language specification, there is a good chance they will run faster than any implementation you could have written yourself.
Another real scenario where this approach is welcome is at the browser level. Let's say you need to support old browsers but want to take advantage of newer features. JavaScript is a language that allows you to add or modify methods in its core objects (like adding methods to the Array prototype), and the shim libraries are smart enough to add such methods only if the current implementation lacks them.
PS:
1) You will see the term "polyfill" related to these JavaScript shims. A polyfill is a more specialized type of shim, used to provide forward compatibility for features across browser specification levels. By the way, my example above refers to exactly such a case.
2) Shims are not limited to this example (adding functionality that will be available in a future release). There are different use cases that would be considered shims as well.
3) If you are curious about how this specific polyfill is implemented, you can open the JavaScript Array.find spec and scroll to the end of the page, where you will find a canonical implementation of this method.
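The same conditional pattern translates directly to other languages. As a Python analogue (math.isclose shipped in Python 3.5, so the guard only fires on older interpreters; the fallback follows the documented formula):

    import math

    # Install the shim only if the running interpreter lacks the function,
    # exactly like the ES shims do for missing Array methods.
    if not hasattr(math, "isclose"):
        def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
            return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
        math.isclose = isclose

    print(math.isclose(0.1 + 0.2, 0.3))   # True, whether native or shimmed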
A SHIM can also be another level of security check, done for all services in order to protect upstream systems. A SHIM server validates every incoming request by checking the user credentials in the headers against the user credentials passed in the request (SOAP / RESTful).

What are some different ways of implementing a plugin system?

I'm not looking so much for language-specific answers, just general models for implementing a plugin system (if you want to know, I'm using Python). I have my own idea (register callbacks, and that's about it), but I know others exist. What's normally used, and what else is reasonable?
What do you mean by a plugin system? Do Dependency Injection and IoC containers sound like a good solution?
I mean, uh, well, a way to insert functionality into the base program without altering it. I didn't intend to define it when I set out. Dependency Injection doesn't look particularly suitable for what I'm doing, but I don't know much about it.
A simple plugin architecture can define a plugin interface with all the methods the plugin ought to implement. The plugin handles events from the application, and can use the application's standard code, model objects, etc. to get things done. Basically the same as an ASP.NET Form does, except that you're overriding rather than implementing.
Nobody taught me this part, and I'm no expert, but I feel: in general a plugin will be less stable than its host application, so the application should always be in control and only give the plugin periodic opportunities to act. If a plugin can register an Observer, then calls to the delegate should be wrapped in try/catch.
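A bare-bones sketch of that architecture in Python (all names are invented for illustration):

    class Plugin:
        # The interface every plugin ought to implement.
        def on_event(self, event):
            raise NotImplementedError

    class Application:
        def __init__(self):
            self.plugins = []

        def register(self, plugin):
            self.plugins.append(plugin)

        def fire(self, event):
            # The application stays in control and guards against misbehaving plugins.
            for plugin in self.plugins:
                try:
                    plugin.on_event(event)
                except Exception as error:
                    print(f"plugin {type(plugin).__name__} failed: {error}")

    class GreeterPlugin(Plugin):
        def on_event(self, event):
            print("got", event)

    app = Application()
    app.register(GreeterPlugin())
    app.fire("startup")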
There is a very good episode of Software Engineering Radio, which you may be interested in.
For future reference, I have reproduced here the "Rules for Enablers" (alternative link) given in the excellent Contributing to Eclipse by Erich Gamma and Kent Beck.
Invitation Rule - Whenever possible, let others contribute to your contributions.
Lazy Loading Rule - Contributions are only loaded when they are needed.
Safe Platform Rule - As the provider of an extension point, you must protect yourself against misbehavior on the part of extenders.
Fair Play Rule - All clients play by the same rules, even me.
Explicit Extension Rule - Declare explicitly where a platform can be extended.
Diversity Rule - Extension points accept multiple extensions.
Good Fences Rule - When passing control outside your code, protect yourself.
Explicit API Rule - separate the API from internals.
Stability Rule - Once you invite someone to contribute, don't change the rules.
Defensive API Rule - Reveal only the API in which you are confident, but be prepared to reveal more API as clients ask for it.
In Python you can use the entry-point system provided by setuptools and pkg_resources. Each entry point should be a function that returns information about the plugin -- name, author, setup and teardown functions, etc.
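A sketch of the discovery side with pkg_resources (the group name myapp.plugins is an assumption; each plugin package would declare its entry point under that group in its own setup.py):

    import pkg_resources

    # Every installed package can advertise plugins under an agreed group name.
    for entry_point in pkg_resources.iter_entry_points("myapp.plugins"):
        plugin_info = entry_point.load()        # imports the function the package declared
        print(entry_point.name, plugin_info())  # e.g. name, author, setup/teardown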
How about an abstract factory? Your base program defines how the abstract concepts interact with each other, but the caller has to provide the implementation.
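A sketch of that idea in Python (the factory and its products are invented names): the base program codes against the abstract factory, and the caller supplies a concrete one.

    class StorageFactory:
        # The base program defines the abstract interactions...
        def make_reader(self):
            raise NotImplementedError

        def make_writer(self):
            raise NotImplementedError

    class InMemoryFactory(StorageFactory):
        # ...and the caller plugs in a concrete implementation.
        def __init__(self):
            self.data = {}

        def make_reader(self):
            return self.data.get

        def make_writer(self):
            return self.data.__setitem__

    def run(factory):
        # The base program only ever sees the abstract interface.
        write, read = factory.make_writer(), factory.make_reader()
        write("greeting", "hello")
        print(read("greeting"))

    run(InMemoryFactory())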