Groovy performance - mysql

Hi
We are going to start a CRUD project.
I have some experience using Groovy and I think it is the right tool.
My concern is about performance. How good is Groovy compared to a Java solution?
It is estimated that we can have up to 100 simultaneous users. We are going to use a MySQL DB and a Tomcat server.
Any comment or suggestion?
Thanks

I've recently gathered five negative votes (!) on an answer on Groovy performance; however, I think there is, indeed, a need for objective facts. Personally, I think it's productive and fun to work with Groovy and Grails; nevertheless, there is a performance issue that needs to be addressed.
There are a number of benchmark comparisons on the web, including this one. You can never trust single benchmarks (and the cited one isn't even close to being scientific), but you'll get the idea.
Groovy strongly relies on runtime meta programming. Every object in Groovy (well, except Groovy scripts) extends from GroovyObject with its invokeMethod(..) method, for example. Every time you call a method on your Groovy classes, the method will not be called directly, as in Java, but by invoking the aforementioned invokeMethod(..) (which does a whole bunch of reflection and lookups).
Additionally, every GroovyObject has an associated MetaClass. The concepts of method invocation, etc., are similar.
There are other factors that decrease Groovy performance in comparison to Java, including boxing of primitive data types and (optional) weak typing, but the aforementioned concept of runtime meta programming is crucial. It also means the JIT compiler, which compiles hot bytecode to native code to speed up execution, cannot do much for Groovy's dynamic call sites.
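(To get a feeling for the gap, here is a plain-Java sketch - not a Groovy benchmark, just an illustration of why a call path built on reflection and boxing is much slower than a direct call the JIT can inline; all names are invented:)
import java.lang.reflect.Method;

public class DispatchOverhead {
    public int inc(int x) { return x + 1; }

    public static void main(String[] args) throws Exception {
        DispatchOverhead o = new DispatchOverhead();
        Method m = DispatchOverhead.class.getMethod("inc", int.class);

        long t0 = System.nanoTime();
        int a = 0;
        for (int i = 0; i < 10_000_000; i++) a = o.inc(a);          // direct call: the JIT can inline this
        System.out.println("direct:     " + (System.nanoTime() - t0) / 1_000_000 + " ms");

        t0 = System.nanoTime();
        Integer b = 0;
        for (int i = 0; i < 10_000_000; i++) b = (Integer) m.invoke(o, b);  // reflective call plus boxing
        System.out.println("reflective: " + (System.nanoTime() - t0) / 1_000_000 + " ms");
    }
}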
To address these issues, there's the Groovy++ project. You simply annotate your Groovy classes with @Typed, and they'll be statically compiled to (real) Java bytecode. Unfortunately, however, I found Groovy++ to be not quite mature, and not well integrated with the main Groovy line or with IDEs. Groovy++ also contradicts basic Groovy programming paradigms. Moreover, Groovy++'s @Typed annotation does not work recursively, that is, it does not affect underlying libraries like GORM or the Grails controllers infrastructure.
I guess you're evaluating Grails as well.
Grails' GORM makes heavy use of runtime meta programming; using Hibernate directly should perform much better.
At the controller or (especially) service level, extensive computations can be externalized to Java classes. However, GORM's share of the work in typical CRUD applications is high.
Potential performance issues in Grails are typically addressed by caching layers at the database level or by avoiding repeated calls to service or controller methods (see the SpringCache plugin or the Cache Filter plugin). These are typically implemented on top of the Ehcache infrastructure.
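(As an aside, a minimal hand-rolled sketch of the cache-aside idea on top of Ehcache's 2.x API - class names like Company and CompanyDao are invented here, and in Grails you would normally let the plugins mentioned above do this wiring:)
import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

class Company {
    final long id;
    Company(long id) { this.id = id; }
}

public class CompanyDao {
    private final Cache cache;

    public CompanyDao() {
        CacheManager manager = CacheManager.getInstance();
        manager.addCache("companies");              // configured via ehcache.xml defaults
        this.cache = manager.getCache("companies");
    }

    public Company findCompany(long id) {
        Element hit = cache.get(id);
        if (hit != null) {
            return (Company) hit.getObjectValue(); // cache hit: no database round trip
        }
        Company company = loadFromDatabase(id);     // hypothetical expensive query
        cache.put(new Element(id, company));
        return company;
    }

    private Company loadFromDatabase(long id) {
        // ... JDBC / GORM / Hibernate lookup elided ...
        return new Company(id);
    }
}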
Caching, obviously, suits static data well, in contrast to (database) data that changes frequently, or web output that is rather variable.
And, finally, you can "throw hardware at it". :-)
In conclusion, the most decisive factor for or against using Groovy/Grails on a large-scale website ought to be whether caching fits the specific website's nature.
EDIT:
As for the question of whether Java's JIT compiler has a chance to step in ...
A simple Groovy class
class Hello {
    def getGreeting(name) {
        "Hello " + name
    }
}
gets compiled to
public class Hello
    implements GroovyObject
{
    public Hello()
    {
        Hello this;
        CallSite[] arrayOfCallSite = $getCallSiteArray();
    }

    public Object getGreeting(Object name) {
        CallSite[] arrayOfCallSite = $getCallSiteArray();
        return arrayOfCallSite[0].call("Hello ", name);
    }

    static
    {
        Long tmp6_3 = Long.valueOf(0L);
        __timeStamp__239_neverHappen1288962446391 = (Long)tmp6_3;
        tmp6_3;
        Long tmp20_17 = Long.valueOf(1288962446391L);
        __timeStamp = (Long)tmp20_17;
        tmp20_17;
        return;
    }
}
This is just the tip of the iceberg. Jochen Theodorou, an active Groovy developer, put it this way:
A method invocation in Groovy usually consists of several normal method calls, where the arguments are stored in an array, the classes of the arguments must be retrieved, a key is generated out of them, a hashmap is used to look up the method and, if that fails, then we have to test the available methods for compatible methods, select one of the methods based on the runtime type, create a key for the hashmap and then, in the end, do a reflection-like call on the method.
I really don't think that the JIT inlines such dynamic, highly complex invocations.
As for a "solution" to your question, there is no "do it that way and you're fine". Instead, the task is to identify the factors that are more crucial than others and possible alternatives and mitigation strategies, to evaluate their impact on your current use cases ("can I live with it?"), and, finally, to identify the mix of technologies that meets the requirements best (not completely).

Performance (in the context of web applications) is an aspect of your application and not of the framework/language you are using. Any discussion and comparison about method invocation speed, reflection speed and the amount of framework layers a call goes through is completely irrelevant. You are not implementing photoshop filters, fractals or a raytracer. You are implementing web based CRUD.
Your showstopper will most probably be inefficient database design, N+1 queries (in case you use an ORM), full table scans, etc.
To answer your question: use any modern language/web framework you feel more confident with and focus on correct architecture/design to solve the business problem at hand.

Thanks for the answers and advice. I like Groovy. There might be some performance problems under some circumstances. Groovy++ might be a better choice. At this point I would prefer to give a chance to "Spring Roo", which has a huge overlap with Groovy, but you stay in Java and NO roo.jar is added to your project. Therefore you are not paying any extra cost for using it.
Moreover, "Roo" allows backward engineering and round-trip engineering.
Unfortunately, the plug-in library is pretty small up to now.
Luis

50 to 100 active users is not much traffic. As long as you have cached pages correctly and your MySQL queries are properly indexed, you should be OK.
Here is a site I am running in my basement on a $1000 server. It's written in Grails.
Check out the performance yourself: http://www.ewebhostguide.com
Caution: Sometimes Comcast connections are down and the site may appear down. But that happens only for a few minutes. Cons of running a site in your basement.

Related

Windsor WCF facility and typed factories

My question has to do with the WCF and typed factory facilities that are provided by Windsor. I am quite new to IoC container concepts, and especially facilities, but started looking into them after evaluating a project we wrote a while back. The program is n-layered and not very unit-testable because dependency injection is not implemented everywhere. The issue is that since it is n-layered, doing proper DI would end up making the top layer responsible for creating an instance of something that will get used like 5 layers down; so I turned to IoC.
However, after reading many a SO article and other sites I am now stuck with a couple of issues. One of the main issues initially was to decouple a class from a physical implementation of a WCF service and I did it as follows:
service = ManagerService.IoC.Resolve<IGridSubmitWorkService>(new { binding = new BasicHttpBinding(), remoteAddress = new EndpointAddress(WorkerAddress) });
result = service.SubmitWorkUnit(out errorString, aWorkUnit);
ManagerService.IoC.Release(service);
However, I found multiple mentions of how you should not be using IoC.Resolve<>() and should rather use factories and WCF facilities to remove the dependency on your IoC container, and that it is a service locator pattern, which some people consider to be an anti-pattern.
My issue is this: the above three lines of code solve almost all my problems, but to follow the proper patterns I would instead have to create a WCF facility, a typed factory (that will deal with providing me instances of the class where this code is used) and a new interface for the factory, and this feels like over-engineering and adding unnecessary complication to the code just for the sake of pleasing a pattern.
Question 1: Is there something fundamental I am missing here?
Question 2: As you can see from the code, I am calling a non-empty constructor for the web service. Will this still be as simple if I implement a WCF facility?
Question 3: Can you point me to a decent tutorial on using WCF facilities since I found the explanations on the Windsor wiki to be very brief and not really useful?
Thanks
Q1. Yes, I think there is something important being missed. In your three-line example, all you have done is replace one implementation dependency (your GridSubmitWorkService) with another one (ManagerService.IoC). To illustrate, why not make it even simpler with the following?
service = new GridSubmitWorkService(binding,remoteAddress);
result = service.SubmitWorkUnit();
This is even simpler again, but it has the hardwired implementation dependency. The point is that constructor and property injection allow you to write clean components that do not have to understand how the plumbing is done. In both your code and my code that principle is violated - the components are managing their own plumbing - and in a large application, that becomes very painful, very quickly. This is why ServiceLocator is considered by many to be an anti-pattern.
The name of your service locator itself gives the game away. It's called IoC - an abbreviation for Inversion of Control. But when you look at the structure of these two bits of code, you can clearly see that nothing has been inverted: the code is doing pretty much the same thing in both cases, but your example is just a little more abstract and convoluted.
Also, no-one offering IoC is trying to force you to stick to a pattern. If you don't see the benefit, or the benefits simply aren't there for your app, then please don't use the pattern.
Q2. Any decent DI container is based on constructor injection. Using the Windsor WCF facility is very straightforward, and it supports a variety of hosting methods, so things like bindings and addresses are handled properly anyway.
Q3. There is plenty of information around, ranging from forums to questions here on stackoverflow.com, in addition to the Windsor docs and wikis. I suggest posting a specific question if there is something you are unable to resolve.

Why should I use code generators

I have encountered this topic lately and couldn't understand why code generators are needed.
Can you explain why I should use them in my projects and how they can ease my life?
Examples would be great, as would pointers to where I can learn about this topic a little more.
At least you have framed the question from the correct perspective =)
The usual reasons for using a code generator are given as productivity and consistency because they assume that the solution to a consistent and repetitive problem is to throw more code at it. I would argue that any time you are considering code generation, look at why you are generating code and see if you can solve the problem through other means.
A classic example of this is data access; you could generate 250 classes (1 for each table in the schema), effectively creating a table gateway solution, or you could build something more like a domain model and use NHibernate / ActiveRecord / LightSpeed / [pick your ORM] to map a rich domain model onto the database.
While both the hand-rolled solution and the ORM are effectively code generators, the primary difference is when the code is generated. With the ORM it is an implicit step that happens at run-time and is therefore one-way by its nature. The hand-rolled solution requires an explicit step to generate the code during development, and the likelihood that the generated classes will need customising at some point therefore creates problems when you re-generate the code. The explicit step that must happen during development introduces friction into the development process and often leads to code that violates DRY (although some argue that generated code can never violate DRY).
Another reason for touting code generation comes from the MDA / MDE world (Model Driven Architecture / Engineering). I don't put much stock in this, but rather than providing a number of poorly expressed arguments, I'm simply going to co-opt someone else's - http://www.infoq.com/articles/8-reasons-why-MDE-fails.
IMHO code generation is the only solution in an exceedingly narrow set of problems and whenever you are considering it, you should probably take a second look at the real problem you are trying to solve and see if there is a better solution.
One type of code generation that really does enhance productivity is "micro code-generation", where the use of macros and templates allows a developer to generate new code directly in the IDE and tab/type their way through placeholders (eg namespace / classname etc). This sort of code generation is a feature of ReSharper and I use it heavily every day. The reason that micro-generation benefits where most large-scale code generation fails is that the generated code is not tied back to any other resource that must be kept in sync, and therefore once the code is generated, it is just like all the other code in the solution.
@John
Moving the creation of "basic classes" from the IDE into XML/DSL is often seen when doing big-bang development - a classic example would be developers trying to reverse-engineer the database into a domain model. Unless the code generator is very well written, it simply introduces an additional burden on the developer, in that every time they need to update the domain model, they either have to context-switch and update the XML/DSL, or they have to extend the domain model and then port those changes back to the XML/DSL (effectively doing the work twice).
There are some code generators that work very well in this space (the LightSpeed designer is the only one I can think of atm) by acting as the engine for a design surface, but often these code generators generate terrible code that cannot be maintained (eg winforms / webforms design surfaces, the EF1 design surface) and therefore rapidly undo any productivity benefits gained from using the code generator in the first place.
Well, it's either:
you write 250 classes, all pretty much the same, but slightly different, e.g. to do data access; takes you a week, and it's boring and error-prone and annoying
OR:
you invest 30 minutes into generating a code template, and let a generation engine handle the grunt work in another 30 minutes
So a code generator gives you:
speed
reproducibility
a lot less errors
a lot more free time! :-)
Excellent examples:
Linq-to-SQL T4 templates by Damien Guard to generate one separate file per class in your database model, using the best kept Visual Studio 2008 secret - T4 templates
PLINQO - same thing, but for Codesmith's generator
and countless more.....
Anytime you need to produce large amounts of repetitive boilerplate code, the code generator is the guy for the job. The last time I used a code generator was when creating a custom Data Access Layer for a project, where the skeleton for various CRUD actions was created based on an object model. Instead of hand-coding all those classes, I put together a template-driven code generator (using StringTemplate) to make it for me. The advantages of this procedure were:
It was faster (there was a large amount of code to generate)
I could regenerate the code on a whim in case I detected a bug (code can sometimes have bugs in the early versions)
Less error-prone; when we had an error in the generated code it was everywhere, which means that it was more likely to be found (and, as noted in the previous point, it was easy to fix it and regenerate the code).
Using GUI builders that generate code for you is a common practice. Thanks to this you don't need to create all the widgets manually; you just drag & drop them and then use the generated code. For simple widgets this really saves time (I have used this a lot with wxWidgets).
Really, when you are using almost any programming language, you are using a "code generator" (except for assembly or machine code). I often write little 200-line scripts that crank out a few thousand lines of C. There is also software you can get which helps generate certain types of code (yacc and lex, for example, are used to generate parsers for creating programming languages).
The key here is to think of your code generator's input as the actual source code, and think of the stuff it spits out as just part of the build process. In which case, you are writing in a higher-level language with fewer actual lines of code to deal with.
For example, here is a very long and tedious file I (didn't) write as part of my work modifying the Quake2-based game engine CRX. It takes the integer values of all #defined constants from two of the headers, and makes them into "cvars" (variables for the in-game console.)
http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/game/cvar_constants.c
Here is the short Bash script which generated that code at compile-time:
http://meliaserlow.dyndns.tv:8000/alienarena/lua_source/autogen/constant_cvars.sh
Now, which would you rather maintain? They are both equivalent in terms of what they describe, but one is vastly longer and more annoying to deal with.
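(To make that concrete in Java terms - since the linked scripts aren't shown here - a hypothetical generator that treats a list of constant names as the real source and emits the tedious registration file at build time:)
import java.io.PrintWriter;

public class CvarGenerator {
    public static void main(String[] args) throws Exception {
        // In reality these would be parsed out of the headers; hard-coded for the sketch.
        String[] constants = { "MAX_CLIENTS", "MAX_ENTITIES", "MAX_MODELS" };
        try (PrintWriter out = new PrintWriter("GeneratedCvars.java")) {
            out.println("// GENERATED FILE - edit CvarGenerator instead");
            out.println("public final class GeneratedCvars {");
            out.println("    private GeneratedCvars() {}");
            for (String name : constants) {
                out.printf("    public static final String %s = \"%s\";%n",
                           name, name.toLowerCase());
            }
            out.println("}");
        }
    }
}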
The canonical example of this is data access, but I have another example. I've worked on a messaging system that communicates over serial port, sockets, etc., and I found I kept having to write classes like this over and over again:
public class FooMessage
{
    public FooMessage()
    {
    }

    public FooMessage(int bar, string baz, DateTime blah)
    {
        this.Bar = bar;
        this.Baz = baz;
        this.Blah = blah;
    }

    public void Read(BinaryReader reader)
    {
        this.Bar = reader.ReadInt32();
        this.Baz = Encoding.ASCII.GetString(reader.ReadBytes(30));
        this.Blah = new DateTime(reader.ReadInt16(), reader.ReadByte(),
            reader.ReadByte());
    }

    public void Write(BinaryWriter writer)
    {
        writer.Write(this.Bar);
        writer.Write(Encoding.ASCII.GetBytes(
            this.Baz.PadRight(30).Substring(0, 30)));
        writer.Write((Int16)this.Blah.Year);
        writer.Write((byte)this.Blah.Month);
        writer.Write((byte)this.Blah.Day);
    }

    public int Bar { get; set; }
    public string Baz { get; set; }
    public DateTime Blah { get; set; }
}
Try to imagine, if you will, writing this code for no fewer than 300 different types of messages. The same boring, tedious, error-prone code being written, over and over again. I managed to write about 3 of these before I decided it would be easier for me to just write a code generator, so I did.
I won't post the code-gen code, it's a lot of arcane CodeDom stuff, but the bottom line is that I was able to compact the entire system down to a single XML file:
<Messages>
    <Message ID="12345" Name="Foo">
        <ByteField Name="Bar"/>
        <TextField Name="Baz" Length="30"/>
        <DateTimeField Name="Blah" Precision="Day"/>
    </Message>
    (More messages)
</Messages>
How much easier is this? (Rhetorical question.) I could finally breathe. I even added some bells and whistles so it was able to generate a "proxy", and I could write code like this:
var p = new MyMessagingProtocol(...);
SetFooResult result = p.SetFoo(3, "Hello", DateTime.Today);
In the end I'd say this saved me writing a good 7500 lines of code and turned a 3-week task into a 3-day task (well, plus the couple of days required to write the code-gen).
Conclusion: Code generation is only appropriate for a relatively small number of problems, but when you're able to use one, it will save your sanity.
A code generator is useful if:
1. The cost of writing and maintaining the code generator is less than the cost of writing and maintaining the repetition that it is replacing.
2. The consistency gained by using a code generator will reduce errors to a degree that makes it worthwhile.
3. The extra problem of debugging generated code will not make debugging inefficient enough to outweigh the benefits from 1 and 2.
For domain-driven or multi-tier apps, code generation is a great way to create the initial model or data access layer. It can churn out the 250 entity classes in 30 seconds (or, in my case, 750 classes in 5 minutes). This then leaves the programmer free to focus on enhancing the model with relationships, business rules, or deriving views within MVC.
The key thing here is when I say initial model. If you are relying on the code generation to maintain the code, then the real work is being done in the templates. (As stated by Max E.) And beware of that because there is risk and complexity in maintaining template-based code.
If you just want the data layer to be "automagically created" so you can "make the GUI work in 2 days", then I'd suggest going with a product/toolset which is geared towards the data-driven or two-tier application scenario.
Finally, keep in mind "garbage in=garbage out". If your entire data layer is homogeneous and does not abstract from the database, please please ask yourself why you are bothering to have a data layer at all. (Unless you need to look productive :) )
How 'bout an example of a good use of a code generator?
This uses T4 templates (a code generator built into Visual Studio) to generate compressed CSS from .less files:
http://haacked.com/archive/2009/12/02/t4-template-for-less-css.aspx
Basically, it lets you define variables, real inheritance, and even behavior in your style sheets, and then create normal css from that at compile time.
Everyone talks here about simple code generation, but what about model-driven code generation (like MDSD or DSM)? This helps you move beyond the simple ORM/member accessors/boilerplate generators and into code generation of higher-level concepts for your problem domain.
It's not productive for one-off projects, but even for these, model-driven development introduces additional discipline, better understanding of employed solutions and usually a better evolution path.
Like 3GLs and OOP provided an increase in abstraction by generating large quantities of assembly code based on a higher level specification, model-driven development allows us to again increase the abstraction level, with yet another gain in productivity.
MetaEdit+ from MetaCase (mature) and ABSE from Isomeris (my project, in alpha, info at http://www.abse.info) are two technologies on the forefront of model-driven code generation.
What is needed really is a change in mindset (like OOP required in the 90's)...
I'm actually adding the finishing touches to a code generator I'm using for a project I've been hired on. We have huge XML files of definitions, and in a day's worth of work I was able to generate over 500 C# classes. If I want to add functionality to all the classes - say I want to add an attribute to all the properties - I just add it to my code-gen, hit go, and bam! I'm done.
It's really nice, really.
There are many uses for code generation.
Writing code in a familiar language and generating code for a different target language.
GWT - Java -> Javascript
MonoTouch - C# -> Objective-C
Writing code at a higher level of abstraction.
Compilers
Domain Specific Languages
Automating repetitive tasks.
Data Access Layers
Initial Data Models
Ignoring all preconceived notions of code-generation, it is basically translating one representation (usually higher level) to another (usually lower level). Keeping that definition in mind, it is a very powerful tool to have.
The current state of programming languages has by no means reached its full potential and it never will. We will always be abstracting to get to a higher level than where we stand today. Code generation is what gets us there. We can either depend on the language creators to create that abstraction for us, or do it ourselves. Languages today are sophisticated enough to allow anybody to do it easily.
If by "code generator" you also mean snippets, try the difference between typing ctor + TAB and writing out the constructor each time in your classes. Or check how much time you save using the snippet that creates a switch statement over an enum with many values.
If you're paid by LOC and work for people who don't understand what code generation is, it makes a lot of sense. This is not a joke, by the way - I have worked with more than one programmer who employs this technique for exactly this purpose. Nobody gets paid by LOC formally any more (that I know of, anyway), but programmers are generally expected to be productive, and churning out large volumes of code can make someone look productive.
As an only slightly tangential point, I think this also explains the tendency of some coders to break a single logical unit of code into as many different classes as possible (ever inherit a project with LastName, FirstName and MiddleInitial classes?).
Here's some heresy:
If a task is so stupid that it can be automated at program-writing time (i.e. source code can be generated by a script from, let's say, XML), then the same can also be done at run-time (i.e. some representation of that XML can be interpreted at run-time) or using some meta-programming. So in essence, the programmer was lazy, did not attempt to solve the real problem, but took the easy way out and wrote a code generator. In Java/C#, look at reflection; in C++, look at templates.
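(A minimal Java sketch of that run-time alternative: instead of generating a class per XML-described record at build time, interpret a parsed field map reflectively while the program runs. The names here are invented for illustration:)
import java.lang.reflect.Method;
import java.util.Map;

public class RuntimeBinder {
    // Copies "fieldName -> value" pairs (e.g. parsed from XML) onto any bean via its setters.
    public static void bind(Object target, Map<String, Object> values) throws Exception {
        for (Map.Entry<String, Object> e : values.entrySet()) {
            String setter = "set" + Character.toUpperCase(e.getKey().charAt(0))
                          + e.getKey().substring(1);
            for (Method m : target.getClass().getMethods()) {
                if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                    m.invoke(target, e.getValue());  // the reflective call replaces generated code
                    break;
                }
            }
        }
    }
}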

What do you wish was automatic in your favorite programming language? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 5 years ago.
As a programmer, I often look at some features of the language I'm currently using and think to myself "This is pretty hard to do for a programmer, and could be taken care of automatically by the machine".
One example of such a feature is memory management, which has been automatic for a while in a variety of languages. While memory management is not that hard to do manually most of the time, doing it perfectly all the way through your application without leaking memory is extremely hard. Automation has made it easy again so that we programmers could concentrate on more critical questions.
Are there any features that you think programming languages should automate because the reward/difficulty ratio is just too low (say, for example concurrency)?
This question is intended to be a brainstorm about what the future of programming could be like, and what languages could do for us to let us focus on more important tasks, so please post your wishes even if you don't think automation is practical/feasible. Good answers will point to stuff that is genuinely hard to do in many languages, as opposed to single-language pet-peeves.
Whatever the language can do for me automatically, I will want a way of doing it for myself.
Concurrent programming/parallelism that is (semi-)automated, as opposed to having to mess around with threads, callbacks, and synchronisation. Being able to parallelise for loops, such as:
Parallel.ForEach(fooList, item =>
{
    item.PerformLongTask();
});
is just made of win.
Certain languages already support such functionality to a degree, however. Notably, F# has asynchronous workflows. Coming with the release of .NET 4.0, the Parallel Extensions library will make concurrency much easier in C# and VB.NET. I believe Python also has some sort of concurrency library, though I personally haven't used it.
What would also be cool is fully automated parallelism in purely functional languages, i.e. not having to change your code even slightly and automatically have it run near optimally across multiple cores. Note that this can only be done with purely functional languages (such as Haskell, but not CAML/F#). Still, constructs such as example given above would be very handy for automating parallelism in object-oriented and other languages.
I would imagine that libraries, design patterns, and even entire programming languages oriented towards simple and high-level support for parallelism will become increasingly widespread in the near future, as desktop computers start to move from 2 cores to 4 and then 8 cores and the advantage of automated concurrency becomes much more evident.
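(For contrast, in Java terms, here is roughly the boilerplate such a construct would hide today - a sketch using java.util.concurrent:)
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelLoop {
    public static void performAll(List<Runnable> fooList) throws InterruptedException {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        for (Runnable item : fooList) {
            pool.submit(item);              // the "body" of the parallel loop
        }
        pool.shutdown();                    // no new work accepted
        pool.awaitTermination(1, TimeUnit.HOURS);
    }
}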
exec("Build a system to keep the customer happy, based on requirements.txt");
In Java, create beans less verbosely.
For example:
bean Student
{
    String name;
    int id;
    type1 property1;
    type2 property2;
}
and this would create a bean with private fields, default accessors, toString, hashCode, equals, etc.
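(For comparison, here is roughly what that six-line declaration would have to expand to when written by hand today, sketched for just the two concrete fields:)
public class Student {
    private String name;
    private int id;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public int getId() { return id; }
    public void setId(int id) { this.id = id; }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Student)) return false;
        Student other = (Student) o;
        return id == other.id && java.util.Objects.equals(name, other.name);
    }

    @Override public int hashCode() {
        return java.util.Objects.hash(name, id);
    }

    @Override public String toString() {
        return "Student{name=" + name + ", id=" + id + "}";
    }
}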
In Java I would like a keyword that would make the entire class immutable.
E.g.
public immutable class Xyz {
}
And the compiler would warn me if any conditions of immutability were broken.
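(A sketch of what that keyword would have to enforce, written out by hand today:)
public final class Xyz {                          // final: no mutable subclass
    private final String name;                    // every field final
    private final java.util.List<String> tags;

    public Xyz(String name, java.util.List<String> tags) {
        this.name = name;
        // defensive copy, so callers can't mutate our state through their list
        this.tags = java.util.Collections.unmodifiableList(new java.util.ArrayList<>(tags));
    }

    public String getName() { return name; }
    public java.util.List<String> getTags() { return tags; }  // no setters anywhere
}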
Concurrency. That was my main idea when asking this question. This is going to get more and more important with time, since current CPUs already have up to 8 logical cores (4 cores + hyperthreading), and 12 logical cores will appear in a few months. In the future, we are going to have a hell of a lot of cores at our disposal, but most programming languages only make it easy for us to use one at a time.
The Threads + Synchronization model that is exposed by most programming languages is extremely low level, and very close to what the CPU does. To me, the current level of concurrency language support is roughly equivalent to the memory management support in C: Not integrated, but some things can be delegated to the OS (malloc, free).
I wish some language would come up with a suitable abstraction that either makes the Threads + Synchronization model easier, or that simply hides it from us completely (just as automatic memory management made good old malloc/free obsolete in Java).
Some functional languages such as Erlang have a reputation of having good multithreading support, but the brain-switch required to do functional programming doesn't really make the whole ordeal much easier.
.NET:
A warning when manipulating strings with methods such as Replace and not assigning the returned value (the new string) to a variable, because if you don't know that a string is immutable, this issue will frustrate you.
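(The same trap exists in Java, for what it's worth:)
String s = "frustrating";
s.replace('f', 'F');        // returned string silently discarded: strings are immutable
System.out.println(s);      // still prints "frustrating"
s = s.replace('f', 'F');    // correct: assign the returned value
System.out.println(s);      // prints "Frustrating"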
In C++, type inference for variable declarations, so that I don't need to write
for (vector<some_longwinded_type>::const_iterator i = v.begin(); i != v.end(); ++i) {
    ...
}
Luckily this is coming in C++1x in the form of auto:
for (auto i = v.begin(); i != v.end(); ++i) {
    ...
}
Coffee. I mean, the language is called Java - so it should be able to make my coffee! I hate getting up from programming, going to the coffee pot, and finding out someone from marketing has taken the last cup and not made another pot.
Persistence. It seems to me we write far too much code to deal with persistence, when this really should be a configuration problem.
In C++, enum-to-string.
In Ada, the language defines the 'Image attribute of an enumerated type as a function that returns a string corresponding to the textual representation of an enumeration value.
C++ provides no such clean facility. It takes several lines of very arcane preprocessor macro black magic to get a rough equivalent.
For languages that provide operator overloading, provide automatically generated overloads for symmetric operations when only one operation is defined. For example, if the programmer provides an equality operator but not an inequality operator, the language could easily generate one. In C++, the same could be done for copy constructors and assignment operators.
I think that automatically generating one-side of a symmetric operation would be nice. Of course, I would definitely want to be able to explicitly say don't do that when needed. I guess providing the implementation of both sides with one of them being private and empty could do the job.
Everything that LINQ does. C# has spoiled me and I now find it hard to do anything with collections in any other language. In Python I use list comprehensions a lot, but they are not quite as powerful as LINQ. I haven't found any other language that makes working with collections as easy as in C#.
In the Visual Studio environment, I want "Remove unused usings" to run across all files in the project. I find it a significant loss of time to have to manually open each individual file and invoke this operation on a per-file basis.
From a dynamic languages perspective, I'd like to see better tool support. Steve Yegge has a great post on this. For instance, there are lots of cases where a tool could look inside various code paths and determine if the methods or functions existed and provide the equivalent of the compiler smoke test. Obviously, if you're using lots and lots of truly dynamic code, this won't work, but the fact is, you probably aren't, so it would be pretty nice if Python, for instance, would tell you that .ToLowerCase() wasn't a valid function at compile time, rather than waiting until you hit the else clause.
s = "a Mixed Case String"
if True:
s = s.lower()
else:
s = s.ToLowerCase()
Easy: initialize variables in C/C++ just like C# does. It would have saved me multiple sessions of debugging in other people's code.
Of course there would be a keyword for when you specifically do not want to init a var.
noinit float myVal; // undefined
float my2ndVal; // 0.0f

Why would you want Dependency Injection without configuration?

After reading the nice answers in this question, I watched the screencasts by Justin Etheredge. It all seems very nice, with a minimum of setup you get DI right from your code.
Now the question that creeps up on me is: why would you want to use a DI framework that doesn't use configuration files? Isn't the whole point of using a DI infrastructure that you can alter the behaviour (the "strategy", so to speak) after building/releasing the code?
Can anyone give me a good use case that validates using a non-configured DI like Ninject?
I don't think you want a DI-framework without configuration. I think you want a DI-framework with the configuration you need.
I'll take spring as an example. Back in the "old days" we used to put everything in XML files to make everything configurable.
When switching to a fully annotated regime, you basically define which component roles your application contains. A given service may, for instance, have one implementation for the "regular runtime", while another implementation belongs in the "stubbed" version of the application. Furthermore, when wiring for integration tests you may be using a third implementation.
When looking at the problem this way, you quickly realize that most applications only contain a very limited set of component roles at runtime - these are the things that actually cause different versions of a component to be used. And usually a given implementation of a component is always bound to this role; it is really the reason of existence of that implementation.
So if you let the "configuration" simply specify which component roles you require, you can get away without much more configuration at all.
Of course, there's always going to be exceptions, but then you just handle the exceptions instead.
I'm on a path with krosenvold here, only with less text: within most applications, you have exactly one implementation per required "service". We simply don't write applications where each object needs 10 or more implementations of each service. So it makes sense to have a simple way to say "this is the default implementation; 99% of all objects using this service will be happy with it".
In tests, you usually use a specific mockup, so no need for any config there either (since you do the wiring manually).
This is what convention-over-configuration is all about. Most of the time, the configuration is simply a dumb repetition of something that the DI framework should know already :)
In my apps, I use the class object as the key to look up implementations, and the "key" happens to be the default implementation. If my DI framework can't find an override in the config, it will just try to instantiate the key. With over 1000 "services", I need four overrides. That would otherwise be a lot of useless XML to write.
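(A Java sketch of that convention - TinyContainer is invented for illustration; the class object is the lookup key, and with no override configured, the key itself is instantiated as the default:)
import java.util.HashMap;
import java.util.Map;

public class TinyContainer {
    private final Map<Class<?>, Class<?>> overrides = new HashMap<>();

    public <T> void override(Class<T> key, Class<? extends T> impl) {
        overrides.put(key, impl);                         // the rare case: four out of 1000
    }

    public <T> T resolve(Class<T> key) throws Exception {
        Class<?> impl = overrides.getOrDefault(key, key); // default: the key is the implementation
        return key.cast(impl.getDeclaredConstructor().newInstance());
    }
}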
With dependency injection, unit tests become very simple to set up, because you can inject mocks instead of real objects into your object under test. You don't need configuration for that; just create and inject the mocks in the unit test code.
I received this comment on my blog, from Nate Kohari:
Glad you're considering using Ninject!
Ninject takes the stance that the configuration of your DI framework is actually part of your application, and shouldn't be publicly configurable. If you want certain bindings to be configurable, you can easily make your Ninject modules read your app.config. Having your bindings in code saves you from the verbosity of XML, and gives you type-safety, refactorability, and intellisense.
You don't even need to use a DI framework to apply the dependency injection pattern. You can simply make use of static factory methods for creating your objects, if you don't need configurability apart from recompiling code.
So it all depends on how configurable you want your application to be. If you want it to be configurable/pluggable without code recompilation, you'll want something you can configure via text or XML files.
I'll second the use of DI for testing. I only really consider using DI at the moment for testing, as our application doesn't require any configuration-based flexibility - it's also far too large to consider at the moment.
DI tends to lead to cleaner, more separated design - and that gives advantages all round.
If you want to change the behavior after a release build, then you will need a DI framework that supports external configurations, yes.
But I can think of other scenarios in which this configuration isn't necessary: for example control the injection of the components in your business logic. Or use a DI framework to make unit testing easier.
You should read about PRISM in .NET (a set of best practices for building composite applications in .NET). Following these practices, each module "exposes" its implementation type inside a shared container. This way each module has clear responsibility over "who provides the implementation for this interface". I think it will be clear enough once you understand how PRISM works.
When you use inversion of control, you are helping to make your class do as little as possible. Let's say you have some Windows service that waits for files and then performs a series of processes on each file. One of the processes is to ZIP it and then email it.
public class ZipProcessor : IFileProcessor
{
    private readonly IZipService zipService;
    private readonly IEmailService emailService;

    // Dependencies arrive through the constructor; the class never creates them itself.
    public ZipProcessor(IZipService zipService, IEmailService emailService)
    {
        this.zipService = zipService;
        this.emailService = emailService;
    }

    public void Process(string fileName)
    {
        zipService.Zip(fileName, Path.ChangeExtension(fileName, ".zip"));
        emailService.SendEmailTo(................);
    }
}
Why would this class need to actually do the zipping and the emailing when you could have dedicated classes to do this for you? Obviously you wouldn't, but that's only a lead-up to my point :-)
In addition to not implementing the zip and email itself, why should the class know which class implements the services? If you pass interfaces to the constructor of this processor, then it never needs to create an instance of a specific class; it is given everything it needs to do the job.
Using a D.I.C. you can configure which classes implement certain interfaces and then just get it to create an instance for you, it will inject the dependencies into the class.
var processor = Container.Resolve<ZipProcessor>();
So now not only have you cleanly separated the class's functionality from the shared functionality, but you have also prevented the consumer/provider from having any explicit knowledge of each other. This makes the code easier to understand, because there are fewer factors to consider at the same time.
Finally, when unit testing you can pass in mocked dependencies. When you test your ZipProcessor your mocked services will merely assert that the class attempted to send an email rather than it really trying to send one.
//Mock the ZIP
var mockZipService = MockRepository.GenerateMock<IZipService>();
mockZipService.Expect(x => x.Zip("Hello.xml", "Hello.zip"));
//Mock the email send
var mockEmailService = MockRepository.GenerateMock<IEmailService>();
mockEmailService.Expect(x => x.SendEmailTo(.................));
//Test the processor
var testSubject = new ZipProcessor(mockZipService, mockEmailService);
testSubject.Process("Hello.xml");
//Assert it used the services in the correct way
mockZipService.VerifyAllExpectations();
mockEmailService.VerifyAllExpectations();
So, in short, you would want to do it to:
01: Prevent consumers from knowing explicitly which provider implements the services they need, which means there's less to understand at once when you read code.
02: Make unit testing easier.
Pete

Singletons: good design or a crutch? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
Singletons are a hotly debated design pattern, so I am interested in what the Stack Overflow community thinks about them.
Please provide reasons for your opinions, not just "Singletons are for lazy programmers!"
Here is a fairly good article on the issue, although it is against the use of Singletons:
scientificninja.com: performant-singletons.
Does anyone have any other good articles on them? Maybe in support of Singletons?
In defense of singletons:
They are not as bad as globals because globals have no standard-enforced initialization order, and you could easily see nondeterministic bugs due to naive or unexpected dependency orders. Singletons (assuming they're allocated on the heap) are created after all globals, and in a very predictable place in the code.
They're very useful for resource-lazy / -caching systems such as an interface to a slow I/O device. If you intelligently build a singleton interface to a slow device, and no one ever calls it, you won't waste any time. If another piece of code calls it from multiple places, your singleton can optimize caching for both simultaneously, and avoid any double look-ups. You can also easily avoid any deadlock condition on the singleton-controlled resource.
Against singletons:
In C++, there's no nice way to auto-clean-up after singletons. There are work-arounds, and slightly hacky ways to do it, but there's just no simple, universal way to make sure your singleton's destructor is always called. This isn't so terrible memory-wise -- just think of it as more global variables, for this purpose. But it can be bad if your singleton allocates other resources (e.g. locks some files) and doesn't release them.
My own opinion:
I use singletons, but avoid them if there's a reasonable alternative. This has worked well for me so far, and I have found them to be testable, although slightly more work to test.
Google has a Singleton Detector for Java that I believe started out as a tool that must be run on all code produced at Google. The nutshell reason to remove Singletons:
because they can make testing difficult and hide problems with your design
For a more explicit explanation see 'Why Singletons Are Controversial' from Google.
A singleton is just a bunch of global variables in a fancy dress.
Global variables have their uses, as do singletons, but if you think you're doing something cool and useful with a singleton instead of using a yucky global variable (everyone knows globals are bad mmkay), you're unfortunately misled.
The purpose of a Singleton is to ensure a class has only one instance, and provide a global point of access to it. Most of the time the focus is on the single instance point. Imagine if it were called a Globalton. It would sound less appealing as this emphasizes the (usually) negative connotations of a global variable.
Most of the good arguments against singletons have to do with the difficulty they present in testing, as creating test doubles for them is not easy.
There are three pretty good blog posts about singletons by Miško Hevery on the Google Testing Blog:
Singletons are Pathological Liars
Where Have All the Singletons Gone?
Root Cause of Singletons
Singleton is not a horrible pattern, although it is misused a lot. I think this misuse is because it is one of the easier patterns, and most people new to the singleton are attracted to the global side effect.
Erich Gamma has said the singleton is a pattern he wishes wasn't included in the GoF book, and that it's a bad design. I tend to disagree.
If the pattern is used in order to create a single instance of an object at any given time then the pattern is being used correctly. If the singleton is used in order to give a global effect, it is being used incorrectly.
Disadvantages:
You are coupling to one class throughout the code that calls the singleton
Creates a hassle with unit testing because it is difficult to replace the instance with a mock object
If the code needs to be refactored later on because of the need for more than one instance, it is more painful than if the singleton class were passed into the object (using an interface) that uses it
Advantages:
One instance of a class is represented at any given point in time.
By design you are enforcing this
Instance is created when it is needed
Global access is a side effect
Chicks dig me because I rarely use singleton and when I do it's typically something unusual. No, seriously, I love the singleton pattern. You know why? Because:
I'm lazy.
Nothing can go wrong.
Sure, the "experts" will throw around a bunch of talk about "unit testing" and "dependency injection" but that's all a load of dingo's kidneys. You say the singleton is hard to unit test? No problem! Just declare everything public and turn your class into a fun house of global goodness. You remember the show Highlander from the 1990's? The singleton is kind of like that because: A. It can never die; and B. There can be only one. So stop listening to all those DI weenies and implement your singleton with abandon. Here are some more good reasons...
Everybody is doing it.
The singleton pattern makes you invincible.
Singleton rhymes with "win" (or "fun" depending on your accent).
I think there is a great misunderstanding about the use of the Singleton pattern. Most of the comments here refer to it as a place to access global data. We need to be careful here - Singleton as a pattern is not for accessing globals.
Singleton should be used to have only one instance of the given class. Pattern Repository has great information on Singleton.
One of the colleagues I have worked with was very singleton-minded. Whenever there was something that was kind of a manager or boss-like object, he would make it into a singleton, because he figured that there should be only one boss. And each time the system took on some new requirements, it turned out there were perfectly valid reasons to allow multiple instances.
I would say that a singleton should be used if the domain model dictates (not 'suggests') that there is one. All other cases are just accidentally single instances of a class.
I've been trying to think of a way to come to the poor singleton's rescue here, but I must admit it's hard. I've seen very few legitimate uses of them, and with the current drive towards dependency injection and unit testing, they are just hard to use. They definitely are the "cargo cult" manifestation of programming with design patterns: I have worked with many programmers that have never cracked the "GoF" book, but they know 'Singleton' and thus they know 'Patterns'.
I do have to disagree with Orion, though; most of the time I've seen singletons overused, it's not global variables in a dress, but more like global services (methods) in a dress. It's interesting to note that if you try to use singletons in SQL Server 2005 in safe mode through the CLR interface, the system will flag the code. The problem is that you have persistent data beyond any given transaction that may run; of course, if you make the instance variable read-only, you can get around the issue.
That issue led to a lot of rework for me one year.
Holy wars! OK, let me see... Last time I checked, the design police said:
Singletons are bad because they hinder automated testing - instances cannot be created afresh for each test case.
Instead, the logic should be in a class (A) that can be easily instantiated and tested. Another class (B) should be responsible for constraining creation. Single Responsibility Principle to the fore! It should be team knowledge that you're supposed to go via B to access A - sort of a team convention.
I concur, mostly.
Many applications require that there is only one instance of some class, so the pattern of having only one instance of a class is useful. But there are variations to how the pattern is implemented.
There is the static singleton, in which the class enforces that there can be only one instance of the class per process (in Java, actually one per ClassLoader). Another option is to create only one instance.
Static singletons are evil - a sort of global variable. They make testing harder, because it's not possible to execute the tests in full isolation. You need complicated setup and teardown code to clean the system between every test, and it's very easy to forget to clean some global state properly, which in turn may result in unspecified behaviour in tests.
Creating only one instance is good. You just create one instance when the program starts, and then pass the pointer to that instance to all other objects which need it. Dependency injection frameworks make this easy: you just configure the scope of the object, and the DI framework will take care of creating the instance and passing it to everyone who needs it. For example, in Guice you would annotate the class with @Singleton, and the DI framework will create only one instance of the class (per application - you can have multiple applications running in the same JVM). This makes testing easy, because you can create a new instance of the class for each test and let the garbage collector destroy the instance when it is no longer used. No global state will leak from one test to another.
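(In code, the Guice version looks roughly like this - note that nothing in the class itself enforces singleton-ness; the scope annotation does:)
import com.google.inject.Guice;
import com.google.inject.Injector;
import com.google.inject.Singleton;

@Singleton
class AppConfig {
    // An ordinary class: no static state, no private-constructor tricks.
}

public class GuiceScopeDemo {
    public static void main(String[] args) {
        Injector injector = Guice.createInjector();
        AppConfig a = injector.getInstance(AppConfig.class);
        AppConfig b = injector.getInstance(AppConfig.class);
        System.out.println(a == b);   // true: one instance per injector (application)
        // In a test you can simply call new AppConfig() for a fresh, isolated instance.
    }
}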
For more information:
The Clean Code Talks - "Global State and Singletons"
Singleton as an implementation detail is fine. Singleton as an interface or as an access mechanism is a giant PITA.
A static method that takes no parameters and returns an instance of an object is only slightly different from just using a global variable. If instead an object has a reference to the singleton passed in, either via its constructor or another method, then it doesn't matter how the singleton is actually created, and the whole pattern turns out not to matter.
"It was not just a bunch of variables in a fancy dress, because it had dozens of responsibilities, like communicating with the persistence layer to save/retrieve data about the company, dealing with employees and prices collections, etc."
I must say you're not really describing something that should be a single object, and it's debatable that any of them, other than the data serialization, should have been a singleton.
I can see at least 3 sets of classes that I would normally have designed this as, but I tend to favor smaller, simpler objects that do a narrow set of tasks very well. I know that this is not the nature of most programmers. (Yes, I work on 5000-line class monstrosities every day, and I have a special love for the 1200-line methods some people write.)
I think the point is that in most cases you don't need a singleton, and often you're just making your life harder.
The biggest problem with singletons is that they make unit testing hard, particularly when you want to run your tests in parallel but independently.
The second is that people often believe that lazy initialisation with double-checked locking is a good way to implement them.
Finally, unless your singletons are immutable, they can easily become a performance problem when you try to scale your application up to run in multiple threads on multiple processors. Contended synchronization is expensive in most environments.
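(On that second point, here are the two Java idioms side by side - double-checked locking, which is only correct with volatile since Java 5 and is still easy to get wrong, and the initialization-on-demand holder idiom, which is lazy and thread-safe with no locking at all:)
class LazySingleton {
    private static volatile LazySingleton instance;   // volatile is essential here

    static LazySingleton getInstance() {
        if (instance == null) {                       // first check, no lock
            synchronized (LazySingleton.class) {
                if (instance == null) {               // second check, under lock
                    instance = new LazySingleton();
                }
            }
        }
        return instance;
    }
}

class HolderSingleton {
    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }
    static HolderSingleton getInstance() {
        return Holder.INSTANCE;                       // class initialization is lazy and thread-safe
    }
}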
Singletons have their uses, but one must be careful in using and exposing them, because they are way too easy to abuse, difficult to truly unit test, and it is easy to create circular dependencies based on two singletons that access each other.
They are helpful, however, when you want to be sure that all your data is synchronized across multiple instances; e.g., configurations for a distributed application may rely on singletons to make sure that all connections use the same up-to-date set of data.
I find you have to be very careful about why you're deciding to use a singleton. As others have mentioned, it's essentially the same issue as using global variables. You must be very cautious and consider what you could be doing by using one.
It's very rare to need them, and usually there is a better way to do things. I've run into situations where I've done something with a singleton and then had to sift through my code to take it out after I discovered how much worse it made things (or after I came up with a much better, saner solution).
I've used singletons a bunch of times in conjunction with Spring and didn't consider it a crutch or lazy.
What this pattern allowed me to do was create a single class for a bunch of configuration-type values and then share the single (non-mutable) instance of that specific configuration instance between several users of my web application.
In my case, the singleton contained client configuration criteria - CSS file location, DB connection criteria, feature sets, etc. - specific to that client. These classes were instantiated and accessed through Spring and shared by users with the same configuration (i.e. 2 users from the same company). (I know there's a name for this type of application, but it's escaping me.)
I feel it would've been wasteful to create (then garbage collect) new instances of these "constant" objects for each user of the app.
I'm reading a lot about "Singleton", its problems, when to use it, etc., and these are my conclusions so far:
Confusion between the classic implementation of Singleton and the real requirement: TO HAVE JUST ONE INSTANCE OF A CLASS!
It's generally badly implemented. If you want a unique instance, don't use the (anti)pattern of a static GetInstance() method returning a static object. This makes the class responsible for instantiating a single instance of itself as well as performing its logic, which breaks the Single Responsibility Principle. Instead, this should be implemented by a factory class whose responsibility is ensuring that only one instance exists (see the sketch after this list).
It's used in constructors because it's easy to use and doesn't have to be passed as a parameter. This should be resolved using dependency injection, which is a great pattern for achieving a good and testable object model.
Not TDD-friendly. If you do TDD, dependencies are extracted from the implementation because you want your tests to be easy to write. This makes your object model better. If you use TDD, you won't write a static GetInstance =). BTW, if you think in terms of objects with clear responsibilities instead of classes, you'll get the same effect =).
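(A sketch of the separation argued for above: the class itself carries no singleton logic, and a factory owns the "only one instance" rule - so tests can still instantiate the class freely. Names are invented for illustration:)
class PriceService {
    PriceService() { }                       // package-private: tests can instantiate freely
    // ...actual pricing logic only; no GetInstance() in here...
}

class PriceServiceFactory {
    private static PriceService instance;

    static synchronized PriceService get() { // the factory, not the class, enforces uniqueness
        if (instance == null) {
            instance = new PriceService();
        }
        return instance;
    }
}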
I really disagree with the "bunch of global variables in a fancy dress" idea. Singletons are really useful when used to solve the right problem. Let me give you a real example.
I once developed a small piece of software for a place I worked, and some forms had to use some info about the company, its employees, services and prices. In its first version, the system kept loading that data from the database every time a form was opened. Of course, I soon realized this approach was not the best one.
Then I created a singleton class, named Company, which encapsulated everything about the place, and it was completely filled with data by the time the system was opened.
It was not just a bunch of variables in a fancy dress, because it had dozens of responsibilities, like communicating with the persistence layer to save/retrieve data about the company, dealing with employees and prices collections, etc.
Plus, it was a fixed, system-wide, easily accessible point for the company data.
Singletons are very useful, and using them is not in and of itself an anti-pattern. However, they've gotten a bad reputation largely because they force any consuming code to acknowledge that they are singletons in order to interact with them. That means that if you ever need to "un-singletonize" them, the impact on your codebase can be very significant.
Instead, I'd suggest hiding the singleton behind a factory. That way, if you need to alter the service's instantiation behavior in the future, you can just change the factory rather than all the types that consume the singleton.
Even better, use an inversion of control container! Most of them allow you to separate instantiation behavior from the implementation of your classes.
One scary thing about singletons in, for instance, Java is that you can end up with multiple instances of the same singleton in some cases. The JVM uniquely identifies a class based on two elements: the class's fully qualified name, and the classloader responsible for loading it.
That means the same class can be loaded by two classloaders unaware of each other, and different parts of your application would have different instances of this singleton that they interact with.
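(This is easy to demonstrate: load the same class through two classloaders that don't delegate to a common parent, and the JVM treats them as two distinct classes, each with its own static state. The jar path and class name below are hypothetical:)
import java.net.URL;
import java.net.URLClassLoader;

public class TwoLoadersDemo {
    public static void main(String[] args) throws Exception {
        URL[] path = { new URL("file:/path/to/app.jar") };          // hypothetical jar
        ClassLoader a = new URLClassLoader(path, null);             // parent null: no delegation
        ClassLoader b = new URLClassLoader(path, null);

        Class<?> c1 = a.loadClass("com.example.MySingleton");       // hypothetical class
        Class<?> c2 = b.loadClass("com.example.MySingleton");

        System.out.println(c1 == c2);   // false: two classes, hence two "singleton" instances
    }
}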
Write normal, testable, injectable objects and let Guice/Spring/whatever handle the instantiation. Seriously.
This applies even in the case of caches or whatever the natural use cases for singletons are. There's no need to repeat the horror of writing code to try to enforce one instance. Let your dependency injection framework handle it. (I recommend Guice for a lightweight DI container if you're not already using one).