Sensible defaults for configuration - configuration

I've recently started to play with Ruby on Rails which favours convention over configuration and relies on sensible defaults to tie various aspects of the application together.
I was thinking that if this concept of sensible default configuration were used more generally for framework configuration, it might save some development headaches.
For example, in a .NET app I usually want to log an exception in the Windows event log using the Enterprise Library exception handling block, but if I don't explicitly state the behaviour I want in a config file then EL will complain. I think that instead, if it can't find custom configuration, it should revert to a sensible default configuration, such as logging my exception in the event log.
Would this be a good or bad concept for frameworks to adopt for their configuration?

I work a lot with a framework that does this exact thing. My trouble with this way of working is that:
The framework grew to have an excessive number of configuration keys that are never actually set in a configuration file.
The behavior of the software becomes implicit. Sometimes I want to explicitly set the system to behave a certain way instead of having it fall back on some other code path due to a 'default'.
A missed typo in a configuration key may result in a very long diagnostic session before figuring out what is going on.
When I forget to set a configuration value, I would rather have the software tell me than have it assume some behavior that I might not be after at all.
I'd prefer a 'template' configuration file in which I change what I want and have the unchanged settings serve as the defaults.
Figuring out which convention the software picked can also waste a lot of time when debugging.
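One way to keep defaults while addressing the concerns above is to require every key to be declared together with its default, and to report keys nobody declared. The following is only a minimal sketch of that idea: the class `DefaultedConfig` and the key names are hypothetical and do not come from any particular framework.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: keys are declared once with their default, so the
// defaults are visible in one place and unknown keys (typos) get reported.
final class DefaultedConfig {
    private final Map<String, String> defaults = new HashMap<>();
    private final Map<String, String> overrides = new HashMap<>();

    DefaultedConfig declare(String key, String defaultValue) {
        defaults.put(key, defaultValue);
        return this;
    }

    // Applies user-supplied values and returns the keys that were never declared.
    List<String> load(Map<String, String> userValues) {
        List<String> unknown = new ArrayList<>();
        for (Map.Entry<String, String> e : userValues.entrySet()) {
            if (defaults.containsKey(e.getKey())) {
                overrides.put(e.getKey(), e.getValue());
            } else {
                unknown.add(e.getKey());
            }
        }
        return unknown;
    }

    // Fails loudly for keys that were never declared instead of guessing.
    String get(String key) {
        if (!defaults.containsKey(key)) {
            throw new IllegalArgumentException("Undeclared configuration key: " + key);
        }
        return overrides.getOrDefault(key, defaults.get(key));
    }

    // Lets diagnostics show which values are implicit defaults.
    boolean isDefault(String key) {
        return !overrides.containsKey(key);
    }
}
```

A caller could log the list returned by `load` at startup, so a mistyped key shows up immediately rather than after a long debugging session.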

Related

Why shouldn't config files be changed line-by-line with Chef / Puppet?

Why is changing lines in configuration file considered an anti-pattern in Chef or Puppet?
As I understand it, it is treated as something of a bad habit. I am assuming that the file editing is done idempotently and with advanced tools (Augeas, for example).
Why is deploying the whole files, with ERB templates, considered a preferred method?
You can find a lot of examples where DevOps practitioners suggest using templates instead of file editing.
Actually, a large part of the DevOps community sees accepting system/package defaults for config files, and modifying only what you need through Augeas, as the preferred method. GitHub DevOps would be one of them (if you happened to catch them at Puppet Conf 2012).
I think having a default pattern of always using templates creates too high a maintenance load and almost always requires you to lock in specific versions for everything across your stack, or you risk having a template that is incompatible with a newer version of that resource.
There are use cases for both options, but in general I favor the "own as little as possible" practice over the "own everything even if you don't have to" practice.
In terms of setting your system to a known state, deploying whole files is better than editing, because you are sure the file is exactly as intended when you are done.
If you are tinkering around finding potential solutions to a problem and hand edit some configuration file, you don't have to worry about the hand edit you made staying around as an uncontrolled part of your environment. The next time you run chef-client, you know that the state will be exactly as specified in the Chef recipe, and won't include your edit.
Also, it is generally harder and more complicated to robustly edit a file than it is to just generate one. You might write something that is idempotent in the basic case, but if the file contains a syntax error or something invalid, then your editing no longer works.
As always though, sometimes you don't have a choice, and editing is the only way to go.
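The "generate the whole file" approach can be illustrated outside Chef or Puppet. The sketch below is not how either tool is implemented. It is a hypothetical Java stand-in (the class `WholeFileDeploy` and the `{{name}}` placeholder syntax are assumptions) showing why whole-file deployment is easy to make idempotent: render the desired content, then write it only when the file on disk differs.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

final class WholeFileDeploy {
    // Substitutes {{name}} placeholders; a simplified stand-in for an ERB template.
    static String render(String template, Map<String, String> vars) {
        String out = template;
        for (Map.Entry<String, String> e : vars.entrySet()) {
            out = out.replace("{{" + e.getKey() + "}}", e.getValue());
        }
        return out;
    }

    // Brings the file to the desired state. Returns true only if it had to be
    // (re)written, so repeated runs are no-ops and hand edits are overwritten.
    static boolean converge(Path target, String desired) {
        try {
            if (Files.exists(target)) {
                String current = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
                if (current.equals(desired)) {
                    return false;
                }
            }
            Files.write(target, desired.getBytes(StandardCharsets.UTF_8));
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that `converge` never has to parse the existing file, which is why a syntax error in the current contents cannot break it.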

Azure: can we check if a setting exists before trying to read it?

I currently use RoleEnvironment.GetConfigurationSettingValue(propertyName) to get the value of a setting defined in my WebRole config file (csdef + cscfg). Ok, sounds right.
This works well if the setting exists, but fails with an exception if the setting is not defined in the csdef and the cscfg.
I'm migrating an existing app to Azure which has many configuration settings in web.config. In my code, when reading a setting value, I'd like to test whether it exists in the WebRole config (csdef + cscfg): if it does, I read it from there; otherwise I read it from web.config with ConfigurationManager.
This would avoid having to migrate all the settings from my web.config, and would still allow me to customize one after the app is deployed.
Is there a way to do this?
I don't want to wrap GetConfigurationSettingValue in a try/catch (and read from web.config in the catch block) because it's really ugly (and above all it doesn't perform well!).
Thanks!
Update for 1.7 Azure SDK.
The CloudConfigurationManager class has been introduced. This allows for a single GetSetting call to look in your cscfg first and then fall back to web.config if the key is not found.
http://msdn.microsoft.com/en-us/LIBRARY/jj157248
Pre 1.7 SDK
The simple answer is no (that I know of).
The more interesting topic is to consider configuration as a dependency. I have found it beneficial to treat configuration settings as a dependency so that the backing implementation can be changed over time. That implementation may be a fake for testing, or something more complex such as switching from .config/.cscfg to a database implementation for multi-tenant solutions.
Given this configuration wrapper, you can write TryGetSetting as an internal method for whatever your sources of configuration options are. If this feature is ever added to the RoleEnvironment members, you would only have to change that internal implementation.
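The wrapper idea is language-neutral, so here is a minimal sketch of it in Java rather than C#. All names (`SettingsSource`, `MapSettingsSource`, `FallbackSettings`) are hypothetical, and the map-backed source merely stands in for the real role configuration and web.config readers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Configuration as a dependency: callers only see this interface.
interface SettingsSource {
    Optional<String> tryGetSetting(String name);
}

// Stand-in for one concrete source (role config, web.config, a test fake...).
final class MapSettingsSource implements SettingsSource {
    private final Map<String, String> values;

    MapSettingsSource(Map<String, String> values) {
        this.values = values;
    }

    @Override
    public Optional<String> tryGetSetting(String name) {
        return Optional.ofNullable(values.get(name));
    }
}

// Tries each source in order and returns the first value found.
final class FallbackSettings implements SettingsSource {
    private final List<SettingsSource> sources;

    FallbackSettings(SettingsSource... sources) {
        this.sources = Arrays.asList(sources);
    }

    @Override
    public Optional<String> tryGetSetting(String name) {
        for (SettingsSource source : sources) {
            Optional<String> value = source.tryGetSetting(name);
            if (value.isPresent()) {
                return value;
            }
        }
        return Optional.empty();
    }
}
```

Because callers depend only on `SettingsSource`, swapping the lookup order or adding a database-backed source would not touch the calling code.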

What are logging libraries for?

This may be a stupid question, as most of my programming consists of one-man scientific computing research prototypes and developing relatively low-level libraries. I've never programmed in the large in an enterprise environment before. I've always wondered, what are the main things that logging libraries make substantially easier than just using good old fashioned print statements or file output, simple programming logic and a few global variables to determine how verbosely things get logged? How do you know when a few print statements or some basic file output ain't gonna cut it and you need a real logging library?
Logging helps debug problems especially when you move to production and problems occur on people's machines you can't control. Best laid plans never survive contact with the enemy, and logging helps you track how that battle went when faced with real world data.
Off-the-shelf logging libraries are easy to plug in and start using in less than 5 minutes.
Log libraries allow for various levels of logging per statement (FATAL, ERROR, WARN, INFO, DEBUG, etc).
And you can turn logging up or down to get more or less information at runtime.
In highly threaded systems, logging helps sort out which thread was doing what. Log libraries can automatically record details such as thread names and timestamps, which ordinary print statements only include if you add them by hand.
Most allow you to turn on only portions of the logging to get more detail. So one system can log debug information, and another can log only fatal errors.
Logging libraries allow you to configure logging through an external file so it's easy to turn on or off in production without having to recompile, deploy, etc.
3rd party libraries usually log so you can control them just like the other portions of your system.
Most libraries allow you to log portions or all of your statements to one or many files based on criteria. So you can log to both the console AND a log file.
Log libraries allow you to rotate logs, keeping several log files based on many different criteria. For example, rotate to another file after the log reaches 20MB, and keep 10 log files around so that log data never exceeds roughly 200MB.
Some log statements can be compiled in or out (language dependent).
Log libraries can be extended to add new features.
You'll want to start using a logging library when you start wanting some of these features. If you find yourself changing your program to get some of these features, you might want to look into a good log library. They are ubiquitous and easy to learn, set up, and use.
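As an illustration of the rotation feature, here is a minimal sketch using the JDK's built-in `java.util.logging.FileHandler`, whose constructor takes a file pattern, a size limit in bytes, a file count, and an append flag. The class name `RotatingLogSetup`, the logger name, and the file pattern are assumptions for the example. With a 20 MB limit and 10 files, the logs are capped at roughly 200 MB.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

final class RotatingLogSetup {
    static final int LIMIT_BYTES = 20 * 1024 * 1024; // rotate after about 20 MB
    static final int FILE_COUNT = 10;                // keep at most 10 files

    // filePattern may use %t (temp dir) and %g (rotation generation number).
    static Logger create(String loggerName, String filePattern) {
        try {
            FileHandler handler = new FileHandler(filePattern, LIMIT_BYTES, FILE_COUNT, true);
            handler.setFormatter(new SimpleFormatter());
            Logger logger = Logger.getLogger(loggerName);
            logger.setUseParentHandlers(false); // do not also print to the console
            logger.addHandler(handler);
            logger.setLevel(Level.INFO);
            return logger;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Upper bound on disk usage implied by the two settings above.
    static long maxDiskUsageBytes() {
        return (long) LIMIT_BYTES * FILE_COUNT;
    }
}
```

For example, `RotatingLogSetup.create("myapp", "%t/myapp-%g.log")` would write `myapp-0.log` in the temp directory and shift older content to higher-numbered files as the limit is reached.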
They are used in environments where the requirements for logging may change, but the cost of changing or deploying a new executable is high. (Even when you have the source code, adding a one-line logging change to a program can be infeasible because of internal bureaucracy.)
The logging libraries provide a framework that the program will use to emit a wide variety of messages. These can be described by source (e.g. the logger object it is first sent to, often corresponding to the class the event has occurred in), severity, etc.
At runtime, the actual delivery of the messages is controlled using an "easily" edited config file. In normal situations most messages may be suppressed altogether. But if the situation changes, it is a simpler fix to enable more messages, without needing to deploy a new program.
The above describes the ideal logging framework as I understand the intention; in practice I have used them in Java and Python and in neither case have I found them worth the added complexity. :-(
They're for logging things.
Or more seriously, for saving you having to write it yourself, giving you flexible options on where logs are stored (database, event log, text file, CSV, sent to a remote web service, delivered by pixies on a velvet cushion) and on what is logged at runtime, rather than having to redefine a global variable and then recompile.
If you're only writing for yourself then it's unlikely you need one, and it may introduce an external dependency you don't want, but once your libraries start to be used by others then having a logging framework in place may well help your users, and you, track down problems.
I know that a logging library is useful when I have more than one subsystem with "verbose logging," but where I only want to see that verbose data from one of them.
Certainly this can be achieved by having a global log level per subsystem, but for me it's easier to use a "system" of some sort for that.
I generally have a 2D logging environment too; "Info/Warning/Error" (etc) on one axis and "AI/UI/Simulation/Networking" (etc) on the other. With this I can specify the logging level that I care about seeing for each subsystem easily. It's not actually that complicated once it's in place, indeed it's a lot cleaner than having if my_logging_level == DEBUG then print("An error occurred"); Plus, the logging system can stuff file/line info into the messages, and then getting totally fancy you can redirect them to multiple targets pretty easily (file, TTY, debugger, network socket...).
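The "2D" arrangement maps naturally onto hierarchical logger names. A minimal sketch with `java.util.logging`, assuming hypothetical subsystem names under a common `game` prefix:

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class SubsystemLogging {
    // One named logger per subsystem (the second axis). Keeping them in
    // static fields also prevents them from being garbage collected.
    static final Logger AI = Logger.getLogger("game.ai");
    static final Logger UI = Logger.getLogger("game.ui");
    static final Logger NET = Logger.getLogger("game.networking");

    static void configure() {
        // Default severity threshold (the first axis) for every "game.*" logger.
        Logger.getLogger("game").setLevel(Level.WARNING);
        // Verbose output for the AI subsystem only.
        AI.setLevel(Level.FINE);
        // Note: handlers have their own level too; the default ConsoleHandler
        // only prints INFO and above unless it is reconfigured.
    }
}
```

After `configure()`, a call such as `SubsystemLogging.UI.info("...")` is dropped, while `SubsystemLogging.AI.fine("...")` passes the logger's level check.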

Singleton for Application Configuration

In all my projects until now, I have used the singleton pattern to access application configuration throughout the application. Lately I have seen a lot of articles advising against the singleton pattern, because it does not promote testability and it hides component dependencies.
My question is: what is the best way to store application configuration so that it is easily accessible throughout the application, without passing the configuration object all over the application?
Thanks in Advance
Madhu
I think an application configuration is an excellent use of the Singleton pattern. I tend to use it myself to prevent having to reread the configuration each time I want to access it and because I like to have the configuration be strongly typed (i.e, not have to convert non-string values each time). I usually build in some backdoor methods to my Singleton to support testability -- i.e., the ability to inject an XML configuration so I can set it in my test and the ability to destroy the Singleton so that it gets recreated when needed. Typically these are private methods that I access via reflection so that they are hidden from the public interface.
EDIT We live and learn. While I think application configuration is one of the few places to use a Singleton, I don't do this any more. Typically, now, I will create an interface and a standard class implementation using static, Lazy<T> backing fields for the configuration properties. This allows me to have the "initialize once" behavior for each property with a better design for testability.
Use dependency injection to inject the single configuration object into any classes that need it. This way you can use a mock configuration for testing or whatever you want... you're not explicitly going out and getting something that needs to be initialized with configuration files. With dependency injection, you are not passing the object around either.
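A minimal sketch of the injection approach, written without any DI container so that it stays self-contained. The names `AppConfig`, `FixedConfig`, and `ReportService` are hypothetical. In a real application a container (or a single composition root) would build the one configuration object and hand it to the constructors.

```java
// What the rest of the application is allowed to know about configuration.
interface AppConfig {
    String databaseUrl();
    int pageSize();
}

// Simple immutable implementation; also usable as a fake in tests.
final class FixedConfig implements AppConfig {
    private final String databaseUrl;
    private final int pageSize;

    FixedConfig(String databaseUrl, int pageSize) {
        this.databaseUrl = databaseUrl;
        this.pageSize = pageSize;
    }

    @Override
    public String databaseUrl() {
        return databaseUrl;
    }

    @Override
    public int pageSize() {
        return pageSize;
    }
}

// The dependency is visible in the constructor instead of hidden in a
// static Singleton lookup.
final class ReportService {
    private final AppConfig config;

    ReportService(AppConfig config) {
        this.config = config;
    }

    String describe() {
        return "Reading " + config.pageSize() + " rows per page from " + config.databaseUrl();
    }
}
```

A test can then construct `new ReportService(new FixedConfig("jdbc:test", 5))` without touching any configuration file.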
For that specific situation I would create one configuration object and pass it around to those who need it.
Since it is the configuration, it should be used only in certain parts of the app and need not be omnipresent.
However, if you haven't had problems using singletons, and don't need to test that thoroughly, you can keep going as you have until now.
Read the discussion about why they are considered harmful. I think most of the problems come when a lot of resources are being held by the singleton.
For the app configuration I think it would be safe to keep it like it is.
The singleton pattern seems to be the way to go. Here's a Setting class that I wrote that works well for me.
If any component relies on configuration that can be changed at runtime (for example theme support for widgets), you need to provide some callback or signaling mechanism to notify about the changed config. That's why it is not enough to pass only the needed parameters to the component at creation time (like color).
You also need to provide access to the config from inside of the component (pass complete config to component), or make a component factory that stores references to the config and all its created components so it can eventually apply the changes.
The former has the big downside that it clutters the constructors or blows up the interface, though it is maybe fastest for prototyping. If you take the "Law of Demeter" into account this is a big no because it violates encapsulation.
The latter has the advantage that components keep their specific interface where components only take what they need, and as a bonus gives you a central place for refactoring (the factory). In the long run code maintenance will likely benefit from the factory pattern.
Also, even if the factory was a singleton, it would likely be used in far fewer places than a configuration singleton would have been.
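The factory variant described above could look roughly like this. It is only a sketch under assumptions: `Widget`, `WidgetFactory`, and the single `color` setting are hypothetical, and a real implementation would probably hold weak references or offer an explicit dispose step so that the factory does not keep every component alive.

```java
import java.util.ArrayList;
import java.util.List;

// The component only takes what it needs (a color), not the whole config.
final class Widget {
    private String color;

    Widget(String color) {
        this.color = color;
    }

    void setColor(String color) {
        this.color = color;
    }

    String color() {
        return color;
    }
}

// The factory knows the configuration and remembers what it created,
// so it can push runtime changes to existing components.
final class WidgetFactory {
    private String themeColor;
    private final List<Widget> created = new ArrayList<>();

    WidgetFactory(String themeColor) {
        this.themeColor = themeColor;
    }

    Widget create() {
        Widget widget = new Widget(themeColor);
        created.add(widget);
        return widget;
    }

    // Called when the configuration changes at runtime.
    void themeChanged(String newColor) {
        themeColor = newColor;
        for (Widget widget : created) {
            widget.setColor(newColor);
        }
    }
}
```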
Here is an example done using Castle.Core >> DictionaryAdapter and StructureMap

What are the best practices to log an error?

Many times I saw logging of errors like these:
System.out.println("Method aMethod with parameters a:"+a+" b: "+b);
print("Error in line 88");
So, what are the best practices to log an error?
EDIT:
This is Java, but it could be C/C++, Basic, etc.
Logging directly to the console is horrendous and frankly, the mark of an inexperienced developer. The only reason to do this sort of thing is 1) he or she is unaware of other approaches, and/or 2) the developer has not thought one bit about what will happen when his/her code is deployed to a production site, and how the application will be maintained at that point. Dealing with an application that is logging 1GB/day or more of completely unneeded debug logging is maddening.
The generally accepted best practice is to use a Logging framework that has concepts of:
Different log objects - Different classes/modules/etc can log to different loggers, so you can choose to apply different log configurations to different portions of the application.
Different log levels - so you can tweak the logging configuration to only log errors in production, to log all sorts of debug and trace info in a development environment, etc.
Different log outputs - the framework should allow you to configure where the log output is sent to without requiring any changes in the codebase. Some examples of different places you might want to send log output to are files, files that roll over based on date/size, databases, email, remoting sinks, etc.
The log framework should never never never throw any Exceptions or errors from the logging code. Your application should not fail to load or fail to start because the log framework cannot create its log file or obtain a lock on the file (unless this is a critical requirement, maybe for legal reasons, for your app).
The eventual log framework you will use will of course depend on your platform. Some common options:
Java:
Apache Commons Logging
log4j
logback
Built-in java.util.logging
.NET:
log4net
C++:
log4cxx
Apache Commons Logging is not intended for general application logging. It's intended to be used by libraries or APIs that don't want to force a logging implementation on the API's user.
There are also classloading issues with Commons Logging.
Pick one of the [many] logging APIs, the most widely used probably being log4j or the Java Logging API.
If you want implementation independence, you might want to consider SLF4J, by the original author of log4j.
Having picked an implementation, then use the logging levels/severity within that implementation consistently, so that searching/filtering logs is easier.
The easiest way to log errors in a consistent format is to use a logging framework such as Log4j (assuming you're using Java). It is useful to include a logging section in your code standards to make sure all developers know what needs to be logged. The nice thing about most logging frameworks is they have different logging levels so you can control how verbose the logging is between development, test, and production.
A best practice is to use the java.util.logging framework.
Then you can log messages in either of these formats
log.warning("..");
log.fine("..");
log.finer("..");
log.finest("..");
Or
log.log(Level.WARNING, "blah blah blah", e);
Then you can use a logging.properties (example below) to switch between levels of logging, and do all sorts of clever stuff like logging to files, with rotation etc.
handlers = java.util.logging.ConsoleHandler
.level = WARNING
java.util.logging.ConsoleHandler.level = ALL
com.example.blah.level = FINE
com.example.testcomponents.level = FINEST
Frameworks like log4j and others should be avoided in my opinion, Java has everything you need already.
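Such a properties file is normally picked up by starting the JVM with `-Djava.util.logging.config.file=<path>`. It can also be applied programmatically through `LogManager.readConfiguration(InputStream)`, which the following minimal sketch uses. The class name `LoggingConfigLoader` is hypothetical, and per-logger levels are written as `<logger name>.level` keys.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.logging.LogManager;

final class LoggingConfigLoader {
    // Replaces the current java.util.logging configuration with the given
    // properties text (same syntax as a logging.properties file).
    static void apply(String properties) {
        try {
            LogManager.getLogManager().readConfiguration(
                    new ByteArrayInputStream(properties.getBytes(StandardCharsets.UTF_8)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For instance, `LoggingConfigLoader.apply(".level = WARNING\ncom.example.blah.level = FINE\n")` leaves most loggers at WARNING while enabling FINE output for `com.example.blah`.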
EDIT
This can apply as a general practice for any programming language. Being able to control all levels of logging from a single property file is often very important in enterprise applications.
Some suggested best-practices
Use a logging framework. This will allow you to:
Easily change the destination of your log messages
Filter log messages based on severity
Support internationalised log messages
If you are using Java, then SLF4J is now preferred to Jakarta Commons Logging as the logging facade.
As stated, SLF4J is a facade, so you then have to pick an underlying implementation: log4j, java.util.logging, or 'simple'.
Follow your framework's advice to ensure expensive logging operations are not needlessly carried out.
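To illustrate the last point with `java.util.logging` (other frameworks have their own equivalents), a minimal sketch: either pass a `Supplier` so the message is only built when the level is enabled, or guard the call with `isLoggable`. The class `LazyLoggingDemo` and its counter are hypothetical and exist only to make the effect observable.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

final class LazyLoggingDemo {
    // Counts how often the expensive message is actually built.
    static final AtomicInteger expensiveCalls = new AtomicInteger();

    // Stand-in for something costly, such as serializing a large object graph.
    static String expensiveDump() {
        expensiveCalls.incrementAndGet();
        return "snapshot";
    }

    static void logState(Logger log) {
        // Supplier overload: expensiveDump() runs only if FINE is enabled.
        log.fine(() -> "state: " + expensiveDump());

        // Equivalent explicit guard.
        if (log.isLoggable(Level.FINE)) {
            log.fine("state: " + expensiveDump());
        }
    }
}
```

By contrast, a plain `log.fine("state: " + expensiveDump())` would pay for the dump on every call, even when FINE messages are discarded.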
The Apache Commons Logging API as mentioned above is a great resource. Referring back to Java, there is also a standard error output stream (System.err).
Directly from the Java API:
This stream is already open and ready to accept output data. Typically this stream corresponds to display output or another output destination specified by the host environment or user. By convention, this output stream is used to display error messages or other information that should come to the immediate attention of a user even if the principal output stream, the value of the variable out, has been redirected to a file or other destination that is typically not continuously monitored.
Aside from technical considerations from other answers it is advisable to log a meaningful message and perhaps some steps to avoid the error in the future. Depending on the errors, of course.
You could get more out of an I/O error when the message states something like "Could not read from file X, you don't have the appropriate permission."
See more examples on SO or search the web.
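As a minimal sketch of that advice in Java, assuming the failure surfaces as one of the `java.nio.file` exception types: translate the exception into a message that names the file and the likely cause, and still attach the exception so the stack trace is kept. The class `ErrorMessages` and the exact wording are hypothetical.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.logging.Level;
import java.util.logging.Logger;

final class ErrorMessages {
    // Builds a message that says what failed, on which file, and why.
    static String describe(Path file, IOException e) {
        if (e instanceof AccessDeniedException) {
            return "Could not read " + file + ": you don't have the appropriate permission.";
        }
        if (e instanceof NoSuchFileException) {
            return "Could not read " + file + ": the file does not exist.";
        }
        return "Could not read " + file + ": " + e.getMessage();
    }

    // Logs the readable message and keeps the exception for the stack trace.
    static void logReadFailure(Logger log, Path file, IOException e) {
        log.log(Level.SEVERE, describe(file, e), e);
    }
}
```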
There really is no single best practice for logging an error. It basically just needs to follow a consistent pattern (within the software/company/etc) that provides enough information to track the problem down. For example, you might want to keep track of the time, the method, parameters, calling method, etc.
So long as you dont just print "Error in "