I'm working on a small online game where there is a need to store a reasonable amount of information about many (100+) different kinds of game objects.
I'm trying to decide whether to have this data generated by code or stored in some configuration file.
The data generation approach would be something like (in Java-ish pseudocode):
(within a set of functions executed once at program startup)
....
// create grass terrain
grass=new GameObject();
grass.inheritProperties(generic_terrain);
grass.set(NAME,"grass");
grass.set(MOVEABLE,true);
grass.set(MOVECOST,10);
grass.set(IMAGE_INDEX,1);
....
Whereas the config file approach would probably just use an XML-type format e.g.
(within terrain.xml file)
....
<terrain name="grass">
<inherit class="generic_terrain"/>
<property key="NAME" value="grass"/>
<property key="MOVEABLE" value="true"/>
<property key="MOVECOST" value="10"/>
<property key="IMAGE_INDEX" value="1"/>
</terrain>
....
Some important points:
This information is static each time the game is run (i.e. it does not change during execution)
The property names (NAME, MOVECOST etc.) are a relatively small list but additional ones could be added over time
It is safe to assume that it will only get changed by the development team (i.e. there is no need for configuration to be managed outside the build process).
It will need to be tweaked quite regularly during development for game balancing reasons (e.g. making units less/more powerful)
There is a certain amount of "inheritance" of properties, i.e. in the example above grass needs to have all the standard properties defined by generic_terrain plus a few new additions/changes.
Which approach would be best for this situation? And more importantly, why?
Personally I like to push as much to config as possible. The reason for this is that at some point I may want to reuse the code I wrote for the game in a completely different way. If the source code is littered with references to implementation-specific details, this becomes much harder.
One interesting caveat to the config approach comes up when you want to start describing the behaviors of objects in addition to their values. Consider a simple example where you have a cup object which needs to "catch" ball objects. You might express them like:
<object name="ball">
<property key="shape" value="circle"/>
<property key="movable" value="true"/>
<property key="speed" value="10"/>
</object>
<object name="cup">
<property key="shape" value="rectangle"/>
<property key="movable" value="true"/>
<property key="speed" value="6"/>
<property key="catches" value="ball"/>
</object>
The problem here is that somewhere you still have to define what "catches" does inside your code. If you are using an interpreted language you could do something like:
<object name="cup">
<property key="shape" value="rectangle"/>
<property key="movable" value="true"/>
<property key="speed" value="6"/>
<oncollision>
if (collided.getName() == "ball") {
collided.destroy();
points++;
}
</oncollision>
</object>
Now you have gained the ability to describe how an object behaves as well as what it is. The only problem here is that if you are not working in an interpreted language, you do not have the luxury of defining code at run time. This is one of the reasons Lua has become so popular in game development. It works well as both a declarative and a procedural language, and it is easy to embed in a compiled application. So you might express this situation like:
object {
name='ball';
movable=true;
speed=10;
}
object {
name='cup';
movable=true;
speed=6;
oncollision=function(collided)
if collided:getName() == "ball" then
collided:destroy();
points = points + 1;
end
end;
}
Separating data from code is just about ALWAYS a good idea. Even if the data is static during execution, it isn't during the design process. A game with hard-coded data is much less flexible than one which gathers its data from an easily-modifiable config file.
Keeping the data in separate files, xml for example, allows for quick and simple altering of the various values, which you say is important.
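As a rough illustration of how little loading code the XML route needs, here is a minimal Python sketch (Python is used only for brevity) that reads terrain definitions and resolves the "inherit" element from the question. The <terrains> root element, the values given to generic_terrain, and the function name load_terrains are all assumptions made for the example. Cyclic inheritance is not handled.

```python
import xml.etree.ElementTree as ET

# Hypothetical terrain.xml content. The <terrains> root element and the
# values of generic_terrain are assumptions, not taken from the question.
TERRAIN_XML = """
<terrains>
  <terrain name="generic_terrain">
    <property key="MOVEABLE" value="true"/>
    <property key="MOVECOST" value="5"/>
  </terrain>
  <terrain name="grass">
    <inherit class="generic_terrain"/>
    <property key="NAME" value="grass"/>
    <property key="MOVECOST" value="10"/>
    <property key="IMAGE_INDEX" value="1"/>
  </terrain>
</terrains>
"""


def load_terrains(xml_text):
    """Return {terrain name: {property key: value}} with inheritance applied."""
    root = ET.fromstring(xml_text)
    raw = {terrain.get("name"): terrain for terrain in root.findall("terrain")}
    resolved = {}

    def resolve(name):
        if name in resolved:
            return resolved[name]
        element = raw[name]
        props = {}
        # Copy inherited properties first so that local ones override them.
        for parent in element.findall("inherit"):
            props.update(resolve(parent.get("class")))
        for prop in element.findall("property"):
            props[prop.get("key")] = prop.get("value")
        resolved[name] = props
        return props

    for name in raw:
        resolve(name)
    return resolved


terrains = load_terrains(TERRAIN_XML)
print(terrains["grass"])
```

All values come back as strings here; a real loader would probably convert them to numbers and booleans based on the property name.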
This isn't language-agnostic.
If you're using a compiled language, use a configuration file so that you don't force a recompile every time you tweak something.
If you're using a language where you don't have to perform an explicit compile/link process, do it in code so that you don't have to deal with parsing and loading. (But do it in one place so that it's easy to completely swap out functionality, should you need to do so at some point in the future).
The basic philosophy here is that code is data, but sometimes code-as-data is painfully difficult to modify; in such cases (e.g., the compiled-language case), write it in a kind of code that's easier to modify. (Your interpreted configuration language.)
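For the "do it in code, but in one place" case, a minimal Python sketch might look like the following. The base values for generic_terrain and the names GAME_OBJECTS and get_property are invented for the example; ChainMap is just one convenient way to get the fall-back lookup the question calls inheritance.

```python
from collections import ChainMap

# All object definitions live in this one place, so they could later be
# swapped for a loader that reads a file instead.
GENERIC_TERRAIN = {"MOVEABLE": True, "MOVECOST": 5}  # assumed base values

# ChainMap looks a key up in the first mapping and falls back to the next,
# which gives a simple form of property inheritance.
GRASS = ChainMap(
    {"NAME": "grass", "MOVECOST": 10, "IMAGE_INDEX": 1},
    GENERIC_TERRAIN,
)

GAME_OBJECTS = {"generic_terrain": GENERIC_TERRAIN, "grass": GRASS}


def get_property(object_name, key):
    """Single access point that the rest of the game would use."""
    return GAME_OBJECTS[object_name][key]


print(get_property("grass", "MOVECOST"))  # overridden locally
print(get_property("grass", "MOVEABLE"))  # inherited from generic_terrain
```

Because callers only go through get_property, replacing the dictionaries with parsed file contents later would not touch the rest of the code.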
A lot of game engines use scripting languages for many of the reasons you mentioned. Lua is a really great, fairly fast, lightweight scripting engine that has been used to great success in a lot of games. You can easily use the parser to do simple config setting and leave it at that, or build in more functionality and let actual code be written in the file. Your example in lua might look something like:
grass = {
NAME = "grass",
MOVEABLE = true,
MOVECOST = 10,
IMAGE_INDEX = 1
}
setmetatable(grass, { __index = generic_terrain }) -- missing keys fall back to generic_terrain
If you do choose to use XML, at least try to use a sane schema. There's no reason to embed your own little schema ("object", "property", "key", "value") into XML when it's designed to represent just that stuff directly. How about:
<ball>
<shape>circle</shape>
<movable>true</movable>
<speed>10</speed>
</ball>
<cup>
<shape>rectangle</shape>
<movable>true</movable>
<speed>6</speed>
<catches>ball</catches>
</cup>
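One side effect of this flatter schema is that the loading code becomes almost trivial. A small Python sketch, assuming the objects are wrapped in a single <objects> root element (the wrapper and the function name parse_objects are assumptions, since a well-formed XML document needs exactly one root):

```python
import xml.etree.ElementTree as ET

# The <objects> wrapper is an assumption made for this example.
OBJECTS_XML = """
<objects>
  <ball>
    <shape>circle</shape>
    <movable>true</movable>
    <speed>10</speed>
  </ball>
  <cup>
    <shape>rectangle</shape>
    <movable>true</movable>
    <speed>6</speed>
    <catches>ball</catches>
  </cup>
</objects>
"""


def parse_objects(xml_text):
    """Map each object element to a dict of its child tags and text."""
    root = ET.fromstring(xml_text)
    return {obj.tag: {child.tag: child.text for child in obj} for obj in root}


objects = parse_objects(OBJECTS_XML)
print(objects["cup"]["catches"])
```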
We are now in the process of evaluating integration solutions and comparing Mule and Boomi.
Use case is to read an Excel file, map the columns to a pre-defined set of JSON attributes and then use the JSON to insert records into a database. The mapping may vary from one Excel template to another, since the column names in one Excel file may differ from those in another.
How do I inject mapping information (source vs target) from outside integration flow?
Note: In Mule, I'm able to do that using a mapping variable (value is JSON) that I inject using Mule DataWeave language.
Boomi's mapping component is static in terms of structure but more versatile solutions are certainly possible.
The data processor component opens up Groovy, JavaScript, and XSLT 3.0 as options. These are Turing-complete languages that can be used to bend Boomi to almost any outcome.
You could make the Boomi UI available to those who need to write the maps in JSON. It's a pretty simple interface to learn. By using a route component, there could be one "parent" process that governs a process for each template, with a map for each template. Such a solution would be pretty easy to build and run, and it would allow the template-specific processes to be deployed independently of the "parent".
You could map to a generic columnar structure and then dynamically alter the target columns by writing a SQL procedure.
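Independent of either product, the underlying idea of keeping the source-to-target mapping outside the flow can be sketched in a few lines. This is plain Python, not Boomi or Mule code; the column names, the mapping, and the function name apply_mapping are all invented for illustration, and the rows stand in for whatever a spreadsheet reader would return.

```python
import json

# Hypothetical mapping supplied from outside the flow (for example a file
# or a process property): source column name -> target JSON attribute.
MAPPING_JSON = '{"Cust Name": "customerName", "Cust No": "customerId"}'


def apply_mapping(rows, mapping):
    """Rename the columns of each row according to the mapping.

    Columns that are absent from the mapping are dropped.
    """
    return [
        {target: row[source] for source, target in mapping.items() if source in row}
        for row in rows
    ]


# Rows as they might come out of a spreadsheet reader (invented sample data).
rows = [{"Cust Name": "Ada", "Cust No": 7, "Notes": "ignored"}]
records = apply_mapping(rows, json.loads(MAPPING_JSON))
print(json.dumps(records))
```

A new Excel template then only needs a new mapping document, not a change to the transformation code.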
I've come across attempts to do what you're describing (not using either Boomi or Mulesoft) which were tragic failures: https://www.zdnet.com/article/uk-rural-payments-agency-rpa-it-failure-and-gross-incompetence-screws-farmers/ I draw your attention to the NAO's points:
ensure the system specifications retain a realistic level of flexibility
and
bespoke software is costly to develop, needs to be thoroughly tested, and takes more time to implement
The usual goal behind a requirement like yours is to make transformation/ETL available to "non-programmers", which denies the reality that delivering an outcome takes many more skills than "programming".
We have a software infrastructure which works pretty much like a software build system: Information is gathered from different sources and used to generate some outputs. Like in traditional software builds we have different types of output, dependency trees, etc.
The main difference is that our sources, intermediate results and outputs are not inherently file-based. Rather, they're (uniquely addressable) data objects.
Right now we're mapping our data structure to files and directories in combination with a traditional build system (SCons) but that does not scale, both w.r.t. performance and (more importantly) w.r.t. maintainability. Hence I'm looking for an infrastructure that's built for this purpose from the ground up.
As an illustration, assume you have 3 XML documents A, B and C. Let's say that B/foo/bar is to be calculated from A/x/y and A/x/z, and that similarly C/a/b is calculated from A/x/y. I need an infrastructure to
Implement these relationships (i.e. the transformations and their dependencies)
Automatically re-build the relevant parts after changes are made
One major problem with using files is that, if I map A, B and C to some files A.xml, B.xml and C.xml and use a traditional build system, then any change to A.xml will trigger a rebuild of B.xml and C.xml, even if A/x/y and A/x/z (the original dependencies of B) are not modified. For a fine-grained dependency resolution I would therefore need to map each of A, B and C not to a file, but to a directory where each sub-directory represents an element, files represent attributes, etc. As I said, this does not scale for us.
(Please note that our system is not actually based on XML)
Right now I'm looking for any existing software, infrastructure or concept which points into this direction, regardless of implementation language and underlying data structures.
It sounds like you need an active object database management system (ODBMS) like GemStone/S. ODBMSs provide the traditional persistence services without the old cost of mapping data structures to files, and with the well-known benefits of object technology. As you've mentioned dependency trees and addressable objects, in ODBMSs navigational references are stored as part of their data, allowing any complex interaction patterns among objects to be represented/accessed. This is especially true when you predict a system which makes use of inheritance, object nesting and cross-referencing.
Although an object engine may seem oversized for your requirements, it is common for large-scale production business systems to store and execute methods using ODBMSs, within a concurrent and multiuser environment. It doesn't come for free because you have to invest in the human part of the equation (education and experience), but once the initial fear is overcome, the investment pays off.
For re-building (subscribed) parts after changes (notifications from announcers) are made, you may use the Observer design pattern, or one of its variants (SASE or the Announcements framework), to implement your announce/subscription architecture. With this type of event framework there are intrinsic problems which are hard to solve with traditional file-based solutions, as you have noticed already. For example, it is typical for a dependency mechanism to manage the replacement of an object, or in your example an XML document, by another one. Any modern event framework should handle the case where an object is replaced, so that all dependents attached to the old object are updated to the new reference.
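To make the announce/subscribe idea concrete, here is a minimal Python sketch of fine-grained rebuilding over addressable values, using the paths from the question. It is not GemStone/S code; the class DataStore and its methods are invented for the example, and propagation is deliberately naive (no ordering of rebuilds, no cycle detection).

```python
class DataStore:
    """Addressable values plus derived values that rebuild on change."""

    def __init__(self):
        self.values = {}
        self.rules = {}       # target path -> (function, source paths)
        self.dependents = {}  # source path -> set of target paths
        self.rebuilt = []     # log of rebuilt targets, for illustration only

    def derive(self, target, func, sources):
        """Register target as computed from sources and subscribe to them."""
        self.rules[target] = (func, sources)
        for source in sources:
            self.dependents.setdefault(source, set()).add(target)
        self._rebuild(target)

    def set(self, path, value):
        if self.values.get(path) == value:
            return  # unchanged, so nobody is notified
        self.values[path] = value
        for target in sorted(self.dependents.get(path, ())):
            self._rebuild(target)

    def _rebuild(self, target):
        func, sources = self.rules[target]
        # Only rebuild once every source has a value.
        if all(source in self.values for source in sources):
            self.rebuilt.append(target)
            self.set(target, func(*(self.values[s] for s in sources)))


store = DataStore()
store.derive("B/foo/bar", lambda y, z: y + z, ["A/x/y", "A/x/z"])
store.derive("C/a/b", lambda y: y * 2, ["A/x/y"])
store.set("A/x/y", 1)
store.set("A/x/z", 2)

store.rebuilt.clear()
store.set("A/x/z", 5)  # only B/foo/bar depends on A/x/z
print(store.rebuilt, store.values["B/foo/bar"], store.values["C/a/b"])
```

The point of the sketch is the last call: changing A/x/z rebuilds B/foo/bar but leaves C/a/b alone, which is exactly what the file-per-document mapping cannot express.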
Finally, there is a free GemStone/S stack which includes an object dependency framework, so you may experiment with a real object database.
So nothing comes to mind that solves exactly your problem, but there are a few tools that might get you a little closer than you are now:
1) You might be able to throw something together using Fuse that would give you better control of how your data objects are mapped out to files. Fuse basically allows you to construct arbitrary file systems from whatever backing data you want. (The python bindings are pretty friendly, but there are a number of other language interfaces available as well). Then you could use a traditional build tool, and take advantage of file like objects better associated w/your data.
2) CMake has a pretty extensible language for writing custom targets that you might be able to press into service. Unfortunately its language is pretty didactic and has something of a steep learning curve, so it wouldn't be my first choice.
I have encountered this topic lately and couldn't understand why they are needed.
Can you explain why I should use them in my projects and how they can ease my life?
Examples would be great, as would pointers to where I can learn a little more about this topic.
At least you have framed the question from the correct perspective =)
The usual reasons for using a code generator are given as productivity and consistency because they assume that the solution to a consistent and repetitive problem is to throw more code at it. I would argue that any time you are considering code generation, look at why you are generating code and see if you can solve the problem through other means.
A classic example of this is data access; you could generate 250 classes ( 1 for each table in the schema ) effectively creating a table gateway solution, or you could build something more like a domain model and use NHibernate / ActiveRecord / LightSpeed / [pick your orm] to map a rich domain model onto the database.
While both the hand-rolled solution and the ORM are effectively code generators, the primary difference is when the code is generated. With the ORM it is an implicit step that happens at run-time and is therefore one-way by its nature. The hand-rolled solution requires an explicit step to generate the code during development, and the generated classes will likely need customising at some point, which creates problems when you re-generate the code. The explicit step that must happen during development introduces friction into the development process and often leads to code that violates DRY ( although some argue that generated code can never violate DRY ).
Another reason for touting code generation comes from the MDA / MDE world ( Model Driven Architecture / Engineering ). I don't put much stock in this but rather than providing a number of poorly expressed arguments, I'm simply going to co-opt someone else's - http://www.infoq.com/articles/8-reasons-why-MDE-fails.
IMHO code generation is the only solution in an exceedingly narrow set of problems and whenever you are considering it, you should probably take a second look at the real problem you are trying to solve and see if there is a better solution.
One type of code generation that really does enhance productivity is "micro code-generation" where the use of macros and templates allow a developer to generate new code directly in the IDE and tab / type their way through placeholders (eg namespace / classname etc). This sort of code generation is a feature of resharper and I use it heavily every day. The reason that micro-generation benefits where most large scale code generation fails is that the generated code is not tied back to any other resource that must be kept in sync and therefore once the code is generated, it is just like all the other code in the solution.
@John
Moving the creation of "basic classes" from the IDE into xml / dsl is often seen when doing big bang development - a classic example would be developers trying to reverse-engineer the database into a domain model. Unless the code generator is very well written it simply introduces an additional burden on the developer in that every time they need to update the domain model, they either have to context-switch and update the xml / dsl or they have to extend the domain model and then port those changes back to the xml / dsl ( effectively doing the work twice).
There are some code generators that work very well in this space ( the LightSpeed designer is the only one I can think of atm ) by acting as the engine for a design surface, but often these code generators generate terrible code that cannot be maintained (eg winforms / webforms design surfaces, EF1 design surface) and therefore rapidly undo any productivity benefits gained from using the code generator in the first place.
Well, it's either:
you write 250 classes, all pretty much the same, but slightly different, e.g. to do data access; takes you a week, and it's boring and error-prone and annoying
OR:
you invest 30 minutes into generating a code template, and let a generation engine handle the grunt work in another 30 minutes
So a code generator gives you:
speed
reproducibility
far fewer errors
a lot more free time! :-)
Excellent examples:
Linq-to-SQL T4 templates by Damien Guard to generate one separate file per class in your database model, using the best kept Visual Studio 2008 secret - T4 templates
PLINQO - same thing, but for Codesmith's generator
and countless more.....
Anytime you need to produce large amounts of repetitive boilerplate code, the code generator is the guy for the job. Last time I used a code generator was when creating a custom Data Access Layer for a project, where the skeleton for various CRUD actions was created based on an object model. Instead of hand-coding all those classes, I put together a template-driven code generator (using StringTemplate) to make it for me. The advantages of this procedure were:
It was faster (there was a large amount of code to generate)
I could regenerate the code on a whim in case I detected a bug (code can sometimes have bugs in the early versions)
Less error prone; when we had an error in the generated code it was everywhere which means that it was more likely to be found (and, as noted in the previous point, it was easy to fix it and regenerate the code).
Using GUI builders that generate code for you is a common practice. Thanks to this you don't need to manually create all widgets. You just drag&drop them and then use the generated code. For simple widgets this really saves time (I have used this a lot for wxWidgets).
Really, when you are using almost any programming language, you are using a "code generator" (except for assembly or machine code.) I often write little 200-line scripts that crank out a few thousand lines of C. There is also software you can get which helps generate certain types of code (yacc and lex, for example, are used to generate parsers to create programming languages.)
The key here is to think of your code generator's input as the actual source code, and think of the stuff it spits out as just part of the build process. In which case, you are writing in a higher-level language with fewer actual lines of code to deal with.
For example, here is a very long and tedious file I (didn't) write as part of my work modifying the Quake2-based game engine CRX. It takes the integer values of all #defined constants from two of the headers, and makes them into "cvars" (variables for the in-game console.)
http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/game/cvar_constants.c
Here is the short Bash script which generated that code at compile-time:
http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/autogen/constant_cvars.sh
Now, which would you rather maintain? They are both equivalent in terms of what they describe, but one is vastly longer and more annoying to deal with.
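To give a feel for how small such a generator can be, here is a Python sketch of the same idea: scan header text for integer #define constants and emit one C statement per constant. It is not the Bash script linked above; the header content is invented, and register_constant is a made-up C function name used only for illustration.

```python
import re

# Hypothetical header content; the real headers are not shown in the post.
HEADER = """
#define MAX_CLIENTS 256
#define MAX_EDICTS 1024
#define GAME_NAME "demo"
"""

# Matches only defines whose value is a plain decimal integer.
DEFINE_RE = re.compile(r"^#define[ \t]+(\w+)[ \t]+(\d+)[ \t]*$", re.MULTILINE)


def generate_c(header_text):
    """Emit one C statement per integer #define found in header_text."""
    lines = ["/* Generated file: do not edit by hand. */"]
    for name, value in DEFINE_RE.findall(header_text):
        # register_constant is a made-up function name for illustration.
        lines.append(f'register_constant("{name}", {value});')
    return "\n".join(lines)


generated = generate_c(HEADER)
print(generated)
```

Run as part of the build, the output file never needs to be read or edited by a person; the script and the headers are the real source.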
The canonical example of this is data access, but I have another example. I've worked on a messaging system that communicates over serial port, sockets, etc., and I found I kept having to write classes like this over and over again:
using System;
using System.IO;
using System.Text;

public class FooMessage
{
public FooMessage()
{
}
public FooMessage(int bar, string baz, DateTime blah)
{
this.Bar = bar;
this.Baz = baz;
this.Blah = blah;
}
public void Read(BinaryReader reader)
{
this.Bar = reader.ReadInt32();
this.Baz = Encoding.ASCII.GetString(reader.ReadBytes(30));
this.Blah = new DateTime(reader.ReadInt16(), reader.ReadByte(),
reader.ReadByte());
}
public void Write(BinaryWriter writer)
{
writer.Write(this.Bar);
writer.Write(Encoding.ASCII.GetBytes(
this.Baz.PadRight(30).Substring(0, 30)));
writer.Write((Int16)this.Blah.Year);
writer.Write((byte)this.Blah.Month);
writer.Write((byte)this.Blah.Day);
}
public int Bar { get; set; }
public string Baz { get; set; }
public DateTime Blah { get; set; }
}
Try to imagine, if you will, writing this code for no fewer than 300 different types of messages. The same boring, tedious, error-prone code being written, over and over again. I managed to write about 3 of these before I decided it would be easier for me to just write a code generator, so I did.
I won't post the code-gen code, it's a lot of arcane CodeDom stuff, but the bottom line is that I was able to compact the entire system down to a single XML file:
<Messages>
<Message ID="12345" Name="Foo">
<ByteField Name="Bar"/>
<TextField Name="Baz" Length="30"/>
<DateTimeField Name="Blah" Precision="Day"/>
</Message>
<!-- (More messages) -->
</Messages>
How much easier is this? (Rhetorical question.) I could finally breathe. I even added some bells and whistles so it was able to generate a "proxy", and I could write code like this:
var p = new MyMessagingProtocol(...);
SetFooResult result = p.SetFoo(3, "Hello", DateTime.Today);
In the end I'd say this saved me writing a good 7500 lines of code and turned a 3-week task into a 3-day task (well, plus the couple of days required to write the code-gen).
Conclusion: Code generation is only appropriate for a relatively small number of problems, but when you're able to use one, it will save your sanity.
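The shape of such a generator, stripped of the CodeDom details, can be sketched in Python. This is not the author's generator: it emits Python rather than C#, ignores the field types and lengths, and assumes every message has at least one field. The function name generate_class is invented for the example.

```python
import xml.etree.ElementTree as ET

# The definition from above, reduced to a single message.
MESSAGES_XML = """
<Messages>
  <Message ID="12345" Name="Foo">
    <ByteField Name="Bar"/>
    <TextField Name="Baz" Length="30"/>
    <DateTimeField Name="Blah" Precision="Day"/>
  </Message>
</Messages>
"""


def generate_class(message):
    """Return Python source text for one message class."""
    name = message.get("Name")
    fields = [field.get("Name").lower() for field in message]
    args = ", ".join(fields)
    lines = [
        f"class {name}Message:",
        f"    MESSAGE_ID = {int(message.get('ID'))}",
        "",
        f"    def __init__(self, {args}):",
    ]
    # One assignment per field; field types are ignored in this sketch.
    lines += [f"        self.{field} = {field}" for field in fields]
    return "\n".join(lines) + "\n"


source = "\n".join(generate_class(m) for m in ET.fromstring(MESSAGES_XML))
print(source)
```

A real generator would also emit the Read and Write methods from the field types, which is where most of the saved lines come from.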
A code generator is useful if:
The cost of writing and maintaining the code generator is less than the cost of writing and maintaining the repetition that it is replacing.
The consistency gained by using a code generator will reduce errors to a degree that makes it worthwhile.
The extra problem of debugging generated code will not make debugging inefficient enough to outweigh the benefits from 1 and 2.
For domain-driven or multi-tier apps, code generation is a great way to create the initial model or data access layer. It can churn out the 250 entity classes in 30 seconds ( or in my case 750 classes in 5 minutes). This then leaves the programmer to focus on enhancing the model with relationships, business rules or deriving views within MVC.
The key thing here is when I say initial model. If you are relying on the code generation to maintain the code, then the real work is being done in the templates. (As stated by Max E.) And beware of that because there is risk and complexity in maintaining template-based code.
If you just want the data layer to be "automagically created" so you can "make the GUI work in 2 days", then I'd suggest going with a product/toolset which is geared towards the data-driven or two-tier application scenario.
Finally, keep in mind "garbage in=garbage out". If your entire data layer is homogeneous and does not abstract from the database, please please ask yourself why you are bothering to have a data layer at all. (Unless you need to look productive :) )
How 'bout an example of a good use of a code generator?
This uses T4 templates (a code generator built in to Visual Studio) to generate compressed CSS from .less files:
http://haacked.com/archive/2009/12/02/t4-template-for-less-css.aspx
Basically, it lets you define variables, real inheritance, and even behavior in your style sheets, and then create normal css from that at compile time.
Everyone talks here about simple code generation, but what about model-driven code generation (like MDSD or DSM)? This helps you move beyond the simple ORM/member accessors/boilerplate generators and into code generation of higher-level concepts for your problem domain.
It's not productive for one-off projects, but even for these, model-driven development introduces additional discipline, better understanding of employed solutions and usually a better evolution path.
Like 3GLs and OOP provided an increase in abstraction by generating large quantities of assembly code based on a higher level specification, model-driven development allows us to again increase the abstraction level, with yet another gain in productivity.
MetaEdit+ from MetaCase (mature) and ABSE from Isomeris (my project, in alpha, info at http://www.abse.info) are two technologies on the forefront of model-driven code generation.
What is needed really is a change in mindset (like OOP required in the 90's)...
I'm actually adding the finishing touches to a code generator I'm using for a project I've been hired on. We have a huge XML file of definitions, and in a day's worth of work I was able to generate over 500 C# classes. If I want to add functionality to all the classes, say an attribute on all the properties, I just add it to my code-gen, hit go, and bam! I'm done.
It's really nice, really.
There are many uses for code generation.
Writing code in a familiar language and generating code for a different target language.
GWT - Java -> Javascript
MonoTouch - C# -> Objective-C
Writing code at a higher level of abstraction.
Compilers
Domain Specific Languages
Automating repetitive tasks.
Data Access Layers
Initial Data Models
Ignoring all preconceived notions of code-generation, it is basically translating one representation (usually higher level) to another (usually lower level). Keeping that definition in mind, it is a very powerful tool to have.
The current state of programming languages has by no means reached its full potential and it never will. We will always be abstracting to get to a higher level than where we stand today. Code generation is what gets us there. We can either depend on the language creators to create that abstraction for us, or do it ourselves. Languages today are sophisticated enough to allow anybody to do it easily.
If by code generator you also mean snippets, compare typing ctor + TAB with writing the constructor by hand each time in your classes. Or check how much time you save by using the snippet that creates a switch statement for an enum with many values.
If you're paid by LOC and work for people who don't understand what code generation is, it makes a lot of sense. This is not a joke, by the way - I have worked with more than one programmer who employs this technique for exactly this purpose. Nobody gets paid by LOC formally any more (that I know of, anyway), but programmers are generally expected to be productive, and churning out large volumes of code can make someone look productive.
As an only slightly tangential point, I think this also explains the tendency of some coders to break a single logical unit of code into as many different classes as possible (ever inherit a project with LastName, FirstName and MiddleInitial classes?).
Here's some heresy:
If a task is so stupid that it can be automated at program writing time (i.e. source code can be generated by a script from, let's say, XML) then the same can also be done at run-time (i.e. some representation of that XML can be interpreted at run-time) or using some meta-programming. So in essence, the programmer was lazy, did not attempt to solve the real problem but took the easy way out and wrote a code generator. In Java / C#, look at reflection, and in C++ look at templates.
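As a small illustration of the run-time alternative, the following Python sketch builds message types directly from an XML description when the program starts, with no generated source files. The XML layout and the function name build_message_types are assumptions made for the example; dataclasses.make_dataclass is the standard-library call doing the work.

```python
import xml.etree.ElementTree as ET
from dataclasses import make_dataclass

# Hypothetical message description, interpreted at run time.
MESSAGES_XML = """
<Messages>
  <Message Name="Foo">
    <Field Name="Bar"/>
    <Field Name="Baz"/>
    <Field Name="Blah"/>
  </Message>
</Messages>
"""


def build_message_types(xml_text):
    """Create one dataclass per <Message> element and return them by name."""
    types = {}
    for message in ET.fromstring(xml_text):
        field_names = [field.get("Name").lower() for field in message]
        class_name = message.get("Name") + "Message"
        types[message.get("Name")] = make_dataclass(class_name, field_names)
    return types


message_types = build_message_types(MESSAGES_XML)
foo = message_types["Foo"](bar=3, baz="Hello", blah="2024-01-01")
print(foo)
```

The trade-off is that errors in the description show up when the program runs rather than when it is built, and editors cannot offer completion for types that do not exist as source.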
I am developing a "dumb" front-end: an AIR application that interacts with a "smart" LiveCycle server. There are currently about 20 request & response pairs for the application. For many reasons (testing, developing outside the corporate network, etc), we have several XML files of fake data, and if a certain configuration flag is set, the files are loaded, a specific file is parsed and used to create a mock response. Each XML file is a set of responses for a different situation, all internally consistent. We currently have about 10 XML files, each corresponding to a different situation we can run into. This is probably going to grow to 30-50 XML files.
The current system was developed by me during one of those 90-hour-week release cycles, when we were under duress because LiveCycle was down again and we had a deadline to meet. Most of the minor crap has been cleaned up.
The fake data is in an object called FakeData, with properties like customerType1:XML, customerType2:XML, overdueCustomer1:XML, etc. Then in the FakeData constructor, all of the properties are set like this:
customerType1:XML = FileUtil.loadXML(File.applicationDirectory.resolvePath("fakeData/customerType1.xml"));
And whenever you need some fake data (this happens in special FakeDelegates that extend the real LiveCycle Delegates), you get it from an instance of FakeData.
This is awful, for many reasons, but it works. One embarrassing part is that every time you create an instance of FakeData, it reloads all the XML files.
I'm trying to figure out if there's a design pattern that is not Singleton that can handle this more elegantly. The constraints are:
No global instances can be required (currently, all the code dealing with the fake data, including the fake delegates, is pulled out of production builds without any side-effects, and it needs to stay that way). This puts the Factory pattern out of the running.
It can handle multiple objects using the XML data without performance issues.
The XML files are read centrally so that the other code doesn't have to know where the XML files are, and so some preprocessing can be done (like creating a map of certain tag values and the associated XML file).
Design patterns, or other architecture suggestions, would be greatly appreciated.
Take a look at ASMock, which was developed by a good friend of mine (and a member here, Richard Szalay) and is based on .NET's Rhino Mocks. We've used it in several production environments now, so I can vouch for its stability.
You should be able to get rid of any fake tests (more like integration tests) by using the mock object instead.
Wouldn't it make more sense to do traditional mocking with a mocking framework? Depending on your implementation, it might be possible to set up the Expects by reading the fake-data XML files.
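The idea of feeding a mock from the existing fake-data files can be sketched as follows. This uses Python's unittest.mock rather than an ActionScript framework, purely to show the shape; the XML content, the delegate method get_customer, and the function name make_fake_delegate are all invented for the example.

```python
import xml.etree.ElementTree as ET
from unittest.mock import Mock

# Stand-in for the contents of a fake-data file such as customerType1.xml
# (the element names are invented).
FAKE_CUSTOMER_XML = "<customer><name>Test User</name><overdue>false</overdue></customer>"


def make_fake_delegate(xml_text):
    """Build a mock delegate whose get_customer() returns the parsed XML."""
    delegate = Mock()
    delegate.get_customer.return_value = ET.fromstring(xml_text)
    return delegate


delegate = make_fake_delegate(FAKE_CUSTOMER_XML)
response = delegate.get_customer("any-id")
print(response.findtext("name"))

# The mock also records how it was called, which a hand-written fake does not.
delegate.get_customer.assert_called_once_with("any-id")
```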
Here is a Google Code project that offers mocking for ActionScript.
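If the fake-data files stay, the reload problem from the question can be handled by a small cache object that loads each file on first use and is passed to the fake delegates rather than held globally. A minimal Python sketch of that idea, with invented names (FakeDataCache, FAKE_FILES) and a dictionary standing in for the files on disk:

```python
import xml.etree.ElementTree as ET


class FakeDataCache:
    """Loads each fake-data document at most once, on first use."""

    def __init__(self, load_text):
        self._load_text = load_text  # callable: name -> XML text
        self._documents = {}
        self.load_count = 0          # only here to make the effect visible

    def get(self, name):
        if name not in self._documents:
            self.load_count += 1
            self._documents[name] = ET.fromstring(self._load_text(name))
        return self._documents[name]


# Stand-in for reading fakeData/<name>.xml from disk (invented content).
FAKE_FILES = {"customerType1": "<customer><name>Test User</name></customer>"}

cache = FakeDataCache(FAKE_FILES.__getitem__)
first = cache.get("customerType1")
second = cache.get("customerType1")
print(cache.load_count)
```

Since the cache is an ordinary object created by the test or debug setup code, it disappears from production builds together with the fake delegates that receive it.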
After reading the nice answers in this question, I watched the screencasts by Justin Etheredge. It all seems very nice, with a minimum of setup you get DI right from your code.
Now the question that creeps up to me is: why would you want to use a DI framework that doesn't use configuration files? Isn't that the whole point of using a DI infrastructure so that you can alter the behaviour (the "strategy", so to speak) after building/releasing/whatever the code?
Can anyone give me a good use case that validates using a non-configured DI like Ninject?
I don't think you want a DI-framework without configuration. I think you want a DI-framework with the configuration you need.
I'll take Spring as an example. Back in the "old days" we used to put everything in XML files to make everything configurable.
When switching to a fully annotated regime you basically define which component roles your application contains. So a given service may for instance have one implementation which is for "regular runtime" where there is another implementation that belongs in the "Stubbed" version of the application. Furthermore, when wiring for integration tests you may be using a third implementation.
When looking at the problem this way you quickly realize that most applications only contain a very limited set of component roles in the runtime - these are the things that actually cause different versions of a component to be used. And usually a given implementation of a component is always bound to this role; it is really the reason-of-existence of that implementation.
So if you let the "configuration" simply specify which component roles you require, you can get away without much more configuration at all.
Of course, there's always going to be exceptions, but then you just handle the exceptions instead.
I'm on a path with krosenvold, here, only with less text: Within most applications, you have exactly one implementation per required "service". We simply don't write applications where each object needs 10 or more implementations of each service. So it would make sense to have a simple way to say "this is the default implementation, 99% of all objects using this service will be happy with it".
In tests, you usually use a specific mockup, so no need for any config there either (since you do the wiring manually).
This is what convention-over-configuration is all about. Most of the time, the configuration is simply a dumb repetition of something that the DI framework should know already :)
In my apps, I use the class object as the key to look up implementations and the "key" happens to be the default implementation. If my DI framework can't find an override in the config, it will just try to instantiate the key. With over 1000 "services", I need four overrides. That would be a lot of useless XML to write.
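A minimal Java sketch of that lookup scheme might look like the following. It is not the author's actual framework: `DefaultingContainer`, `Clock` and `FixedClock` are hypothetical names, and the sketch assumes every service has a no-argument constructor and that overrides are registered in code rather than read from a config file.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical service whose class doubles as its default implementation.
class Clock {
    long now() { return System.currentTimeMillis(); }
}

// Hypothetical override, e.g. for a setup that needs a frozen clock.
class FixedClock extends Clock {
    @Override
    long now() { return 42L; }
}

// Hypothetical mini-container: the lookup key is also the fallback implementation.
class DefaultingContainer {
    private final Map<Class<?>, Class<?>> overrides = new HashMap<>();

    // Registers one of the rare exceptions to the default.
    <T> void override(Class<T> key, Class<? extends T> implementation) {
        overrides.put(key, implementation);
    }

    <T> T resolve(Class<T> key) {
        // No override registered: fall back to instantiating the key itself.
        Class<?> target = overrides.getOrDefault(key, key);
        try {
            return key.cast(target.getDeclaredConstructor().newInstance());
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + target.getName(), e);
        }
    }
}

class ContainerDemo {
    public static void main(String[] args) {
        DefaultingContainer container = new DefaultingContainer();
        // Nothing configured, so the key class itself is created.
        System.out.println(container.resolve(Clock.class).getClass().getSimpleName()); // prints Clock
        // One explicit override replaces the default.
        container.override(Clock.class, FixedClock.class);
        System.out.println(container.resolve(Clock.class).now()); // prints 42
    }
}
```

With this shape, a service that is happy with its default needs no entry anywhere; only the handful of overrides have to be written down.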
With dependency injection, unit tests become very simple to set up, because you can inject mocks instead of real objects into your object under test. You don't need configuration for that; just create and inject the mocks in the unit test code.
I received this comment on my blog, from Nate Kohari:
Glad you're considering using Ninject!
Ninject takes the stance that the configuration of your DI framework is actually part of your application, and shouldn't be publicly configurable. If you want certain bindings to be configurable, you can easily make your Ninject modules read your app.config. Having your bindings in code saves you from the verbosity of XML, and gives you type-safety, refactorability, and IntelliSense.
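To make the "bindings in code, with one choice read from configuration" idea concrete, here is a small framework-free Java sketch. It is not Ninject's API: `Bindings`, `AppModule`, `Mailer` and the `mailer` property key are all hypothetical, and a `java.util.Properties` object stands in for an app.config file.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import java.util.function.Supplier;

// Hypothetical service with two implementations.
interface Mailer { String send(String to); }
class SmtpMailer implements Mailer { public String send(String to) { return "smtp:" + to; } }
class LogOnlyMailer implements Mailer { public String send(String to) { return "log:" + to; } }

// Hypothetical binding table; the compiler checks that each supplier matches its service type.
class Bindings {
    private final Map<Class<?>, Supplier<?>> suppliers = new HashMap<>();

    <T> void bind(Class<T> service, Supplier<? extends T> supplier) {
        suppliers.put(service, supplier);
    }

    <T> T get(Class<T> service) {
        Supplier<?> supplier = suppliers.get(service);
        if (supplier == null) {
            throw new IllegalStateException("No binding for " + service.getName());
        }
        return service.cast(supplier.get());
    }
}

// Hypothetical module: bindings live in code, and only one choice is exposed to configuration.
class AppModule {
    static Bindings load(Properties config) {
        Bindings bindings = new Bindings();
        // Assumed setting name "mailer"; defaults to "smtp" when absent.
        boolean logOnly = "log".equals(config.getProperty("mailer", "smtp"));
        if (logOnly) {
            bindings.bind(Mailer.class, LogOnlyMailer::new);
        } else {
            bindings.bind(Mailer.class, SmtpMailer::new);
        }
        return bindings;
    }
}

class AppModuleDemo {
    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty("mailer", "log");
        System.out.println(AppModule.load(config).get(Mailer.class).send("pete")); // prints log:pete
    }
}
```

Binding `Mailer.class` to a class that does not implement `Mailer` fails at compile time here, which is the type-safety argument made in the comment above.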
You don't even need to use a DI framework to apply the dependency injection pattern. You can simply make use of static factory methods for creating your objects, if you don't need configurability apart from recompiling code.
So it all depends on how configurable you want your application to be. If you want it to be configurable/pluggable without code recompilation, you'll want something you can configure via text or XML files.
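A minimal Java sketch of the framework-free variant, assuming made-up names (`Greeter`, `EnglishGreeter`, `WelcomeScreen`, `Wiring`): the consumer still gets its dependency through the constructor, and one static factory method is the only place that names concrete classes.

```java
// Hypothetical service interface and implementation.
interface Greeter {
    String greet(String name);
}

class EnglishGreeter implements Greeter {
    @Override
    public String greet(String name) { return "Hello, " + name; }
}

// The consumer receives its dependency through the constructor.
class WelcomeScreen {
    private final Greeter greeter;

    WelcomeScreen(Greeter greeter) { this.greeter = greeter; }

    String render(String player) { return greeter.greet(player) + "!"; }
}

// The static factory is the single place that picks concrete classes;
// changing the wiring means editing this method and recompiling.
final class Wiring {
    private Wiring() {}

    static WelcomeScreen createWelcomeScreen() {
        return new WelcomeScreen(new EnglishGreeter());
    }
}

class WiringDemo {
    public static void main(String[] args) {
        System.out.println(Wiring.createWelcomeScreen().render("Ann")); // prints Hello, Ann!
    }
}
```

A test can bypass the factory and pass its own `Greeter` (for instance a lambda) straight to the `WelcomeScreen` constructor.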
I'll second the use of DI for testing. I only really consider using DI at the moment for testing, as our application doesn't require any configuration-based flexibility - it's also far too large to consider at the moment.
DI tends to lead to cleaner, more separated design - and that gives advantages all round.
If you want to change the behavior after a release build, then you will need a DI framework that supports external configurations, yes.
But I can think of other scenarios in which this configuration isn't necessary: for example, controlling the injection of the components in your business logic, or using a DI framework to make unit testing easier.
You should read about PRISM in .NET (it's a set of best practices for building composite applications in .NET). In these best practices each module "exposes" its implementation types inside a shared container. This way each module has clear responsibilities over "who provides the implementation for this interface". I think it will be clear enough once you understand how PRISM works.
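When you use inversion of control you are helping to make your class do as little as possible. Let's say you have some Windows service that waits for files and then performs a series of processes on each file. One of the processes is to zip the file and then email it.
using System.IO;
public class ZipProcessor : IFileProcessor
{
    private readonly IZipService ZipService;
    private readonly IEmailService EmailService;
    // The services are passed in, so this class never creates a concrete implementation itself.
    public ZipProcessor(IZipService zipService, IEmailService emailService)
    {
        ZipService = zipService;
        EmailService = emailService;
    }
    public void Process(string fileName)
    {
        ZipService.Zip(fileName, Path.ChangeExtension(fileName, ".zip"));
        EmailService.SendEmailTo(................);
    }
}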
Why would this class need to actually do the zipping and the emailing when you could have dedicated classes to do this for you? Obviously you wouldn't, but that's only a lead up to my point :-)
In addition to not implementing the zipping and emailing itself, why should the class know which class implements each service? If you pass interfaces to the constructor of this processor then it never needs to create an instance of a specific class; it is given everything it needs to do the job.
Using a DI container you can configure which classes implement certain interfaces and then just ask it to create an instance for you; it will inject the dependencies into the class.
var processor = Container.Resolve<ZipProcessor>();
So now not only have you cleanly separated the class's functionality from shared functionality, but you have also prevented the consumer/provider from having any explicit knowledge of each other. This makes the code easier to read and understand because there are fewer factors to consider at the same time.
Finally, when unit testing you can pass in mocked dependencies. When you test your ZipProcessor your mocked services will merely assert that the class attempted to send an email rather than it really trying to send one.
//Mock the ZIP
var mockZipService = MockRepository.GenerateMock<IZipService>();
mockZipService.Expect(x => x.Zip("Hello.xml", "Hello.zip"));
//Mock the email send
var mockEmailService = MockRepository.GenerateMock<IEmailService>();
mockEmailService.Expect(x => x.SendEmailTo(.................));
//Test the processor
var testSubject = new ZipProcessor(mockZipService, mockEmailService);
testSubject.Process("Hello.xml");
//Assert it used the services in the correct way
mockZipService.VerifyAllExpectations();
mockEmailService.VerifyAllExpectations();
So, in short, you would want to do it to:
01: Prevent consumers from knowing explicitly which provider implements the services they need, which means there's less to understand at once when you read the code.
02: Make unit testing easier.
Pete