The use of config file is it equivalent to use of globals? - language-agnostic

I've read many times and agree with avoiding the use of globals to keep code orthogonal. Does the use of the config file to keep read only information that your program uses similar to using Globals?

If you're using config files in place of globals, then yes, they are similar.
Config files should only be used in cases where the end-user (presumably a computer-savvy user, like a developer) needs to declare settings for an application or piece of code, while keeping their hands out of the code itself.

My first reaction would be that it is not the same. I think the problem with globals is the read+write scenario. Config-files are readonly (at least in terms of execution).
In the same way constants are not considered bad programming behaviour. Config-files, at least in the way I use them, are just easy-changable constants.

Well, since a config file and a global variable can both have the effect of propagating changes throughout a system - they are roughly similar.
But... in the case of a configuration file that change is usually going to take place in a single, highly-visible (to the developer) location, and global variables can affect change in very sneaky and hard to track down ways -- so in this way the two concepts are not similar.
Having a configuration file ususally helps with DRY concepts, and it shouldn't hurt the orthogonality of the system, either.
Bonus points for using the $25 word 'orthogonal'. I had to look that one up in Wikipedia to find out the non-Euclidean definition.

Configuration files are really meant to be easily editable by the end user as a way of telling the program how to run.
A more specialized form of configuration files, user preferences, are used to remember things between program executions.

Global is related to a unique instance for an object which will never change, whereas config file is used as container for reference values, for objects within the application that can change.
One "global" object will never change during runtime, the other object is initialized through config file, but can change later on.
Actually, those objects not only can change during the lifetime of the application, they can also monitor the config file in order to realize "hot-change" (modification of their value without stopping/restarting the application), if that config file is modified.

They are absolutely not the same or replacements for eachother. A config file, or object can be used non-globally, ie passed explicitly.
You can of course have a global variable that refers to a config object, and that would be defeating the purpose.

Related

Clojurescript decoupled file & namespace?

I'm using reagent to build several alternate root components, only one of which will be mounted on any given page; definitely either/or. These have a degree of commonality in their makeup, hence it will be convenient to move what is common among them to a common namespace.
What would be ideal is if in the file for each of these components I had the option to switch namespace into common, and add defs particular to the component, then switch back, thus avoiding circular dependencies nor needing any kind of inheritance.
I recalled this being possible in common lisp, how wonderful it was, and it also seems possible in clojure.
From Clojurescript docs: ns must be the first form and can only be used once, and in-ns is only usable from the repl.
I'm wondering if there's a way to achieve this kind of thing in clojurescript which is still eluding me.
If not I may need to reconsider my assumptions behind multiple alternate root components; the "many builds within one build" kind of idea, if that makes sense.
Update after some futher experimentation and confusion:
another option might be to split a single namespace across multiple files (is this possible?). Not sure what direction to turn in here.
The fact that in reagent I am using atoms in the global namespace is what's creating the need for circular dependencies if I use a separate namespace for common. Hence, wonder about one global namespace, in which case multiple files might help. Or is the way forward one giant file and one namespace??
Update: I've realised there is a great tension between keeping all app state globally (in my current case, multiple atoms), and passing app state around. My pattern currently is everything global, don't pass any of it around. Passing the necessary state as parameters to fns in the common namespace would solve the problem here (duh!), but then there's the question of what principles are being followed here regarding app state. If I just added a param whenever I needed one, but started with the idea that everything was global, there'd be no real principle to it...
In ClojureScript, everything is pre-compiled into a single static JavaScript "executable", so there is nothing like the repl you are used to in Clojure. Indeed, in CLJS the "Var" concept doesn't really after the compiler, they are just static (constant) variables and cannot be rebound.
Having said that, CLJS does emulate the behavior of Clojure dynamic variables via the binding form, so that may help you to reach your goal. As in CLJ, it creates what amounts to a (thread-local) global variable. This is a degenerate case in CLJS since there is only one thread. However, the source code looks identical to the CLJ case.
Another way to accomplish this is to just use a plain atom as a global variable so you don't have to pass a parameter around.
As always, when using a global variable, it reduces the number of parameters in function call trees, but it creates invisible dependencies between different parts of the code. Somethimes convenient, but usually a bad tradeoff.

How to organize code so that we can move and update it without having to edit the location of the configuration file?

The issue that I consider is how to write code that can easily know the location of a required config file and yet is portable, without any edit, from an environment to another. We don't want to edit the location of the configuration file to adapt the code to each new environment, say each time we move the code from a development environment to production. The method should not rely on resources that are not universally available, such as an access to user-defined environment variables or an access to a specific directory. For example, it may seem that using the DOCUMENT_ROOT as a base location for the config file is the way to go, but that is not universal. First, in a command line environment the DOCUMENT_ROOT makes no sense. Second, a programmer might be given access to a sub-folder of the DOCUMENT_ROOT only. Another requirement is that the configuration file could depend on values known at run time, say the user who call the application, as in this question How to load a config file based on user selection from "unknown" location .
The question is not what is the best location of the configuration file in specific environments, such as Location to put user configuration files in windows . The programmers would still have to figure out the best location so that end users could easily find the configuration file. The question is how this location, whatever it is, even if it depends on values known at run time, can be passed to the code in a portable manner.
One approach is to design any script file with in mind that it is to be included in another file and so on until we get to a wrapper script that only defines the directory of the config file to the benefit of the included file and other included files therein. Once this directory path is known, other configuration values can be obtained from a named configuration file within it. This works because the wrapper scripts are not updated when we update the code from a repository or testing environment. This approach seems universally applicable : no special support of any kind such as an access to user defined environment variables or to some specific directory in the server is needed. As long as you have access to the code, which is a strict minimum to expect, it works. Also, scripts are often naturally designed to be included in another file - so it is natural.
The approach only requires that we agree on a convention for the name of the constant, say CONFIG_DIRECTORY. If every programmer would agree to search at the location specified by this constant for the config file, then any user of the code could put the config file anywhere and just define this constant accordingly.
In Linux, they have the folder /etc for config files. So, the notion of an universally agreed standard in a very large context is already there. This is the same idea than the one proposed here, except that it is the same constant for all machines and someone might not have access to that level of the server. Moreover, we lose the possibility to have different configuration directories for different wrapper scripts. Allowing the universal standard to be a constant name, say 'CONFIG_DIRECTORY', instead of being the fixed constant '/etc', seems just an extra flexibility with no additional inconvenient. It does require that we define this constant in some wrapper script, but we could fall back to the old approach if it is not defined. The outcome, if the approach is strictly applied, would be that all the scripts required in the server document root would only be simple wrappers that define a configuration directory. That seems cool. Often people say that it is safer to have important code outside the document root.

Why make global Lua functions local?

I've been looking at some Lua source code, and I often see things like this at the beginning of the file:
local setmetatable, getmetatable, etc.. = setmetatable, getmetatable, etc..
Do they only make the functions local to let Lua access them faster when often used?
Local data are on the stack, and therefore they do access them faster. However, I seriously doubt that the function call time to setmetatable is actually a significant issue for some program.
Here are the possible explanations for this:
Prevention from polluting the global environment. Modern Lua convention for modules is to not have them register themselves directly into the global table. They should build a local table of functions and return them. Thus, the only way to access them is with a local variable. This forces a number of things:
One module cannot accidentally overwrite another module's functions.
If a module does accidentally do this, the original functions in the table returned by the module will still be accessible. Only by using local modname = require "modname" will you be guaranteed to get exactly and only what that module exposed.
Modules that include other modules can't interfere with one another. The table you get back from require is always what the module stores.
A premature optimization by someone who read "local variables are accessed faster" and then decided to make everything local.
In general, this is good practice. Well, unless it's because of #2.
In addition to Nicol Bolas's answer, I'd add on to the 3rd point:
It allows your code to be run from within a sandbox after it's been loaded.
If the functions have been excluded from the sandbox and the code is loaded from within the sandbox, then it won't work. But if the code is loaded first, the sandbox can then call the loaded code and be able to exclude setmetatable, etc, from the sandbox.
I do it because it allows me to see the functions used by each of my modules
Additionally it protects you from others changing the functions in global environment.
That it is a free (premature) optimisation is a bonus.
Another subtle benefit: It clearly documents which variables (functions, modules) are imported by the module. And if you are using the module statement, it enforces such declarations, because the global environment is replaced (so globals are not available).

Best practices for avoiding hardcoded values IRL

In theory, source code should not contain hardcoded values beyond 0, 1 and the empty string. In practice, I find it very hard to avoid all hardcoded values while being on very tight delivery times, so I end up using a few of them and feeling a little guilty.
How do you reconcile avoiding hardcoded values with tight delivery times?
To avoid hard-coding you should
use configuration files (put your values in XML or ini-like text files).
use database table(s) to store your values.
Of course not all values qualify to be moved to a config file. For those you should use constructs provided by the programming language such as (constants, enums, etc).
Just saw an answer to use "Constatn Interface". With all due respect to the poster and the voters, but this is not recommended. You can read more about that at:
http://en.wikipedia.org/wiki/Constant_interface
The assumption behind the question seems to me invalid.
For most software, configuration files are massively more difficult to change that source code. For widely installed, software, this could easily be a factor of a million times more difficult: there could easily be that many files hanging round on user installations which you have little knowledge and no control over.
Having numeric literals in the software is no different from having functional or algorithmic literals: it's just source code. It is the responsibility of any software that intends to be useful to get those values right.
Failing that make them at least maintainable: well named and organised.
Making them configurable is the kind of last-ditch compromise you might be forced into if you are on a tight schedule.
This comes with a little bit of planning, in most cases it is as simple as having a configuration file, or possibly a database table that stores critical configuration items. I don't find that there is any reason that you "have" to have hard coded values, and it shouldn't take you much additional time to offload to a configuration mechanism to where tight time lines would be a valid excuse.
The problem of hardcoded values is that sometimes it's not obvoius that particular code relies on them. For example, in java it is possible to move all constants into separate interface and separate particular constants into inner sub-interfaces. It's quite convenient and obvious. Also it's easy to find the constant usage just by using IDE facilities ("find usage" functionality) and change or refactor them.
Here's an example:
public interface IConstants {
public interface URL {
String ALL = "/**";
}
public interface REST_URL {
String DEBUG = "/debug";
String SYSTEM = "/system";
String GENERATE = "/generate";
}
}
Referencing is quite human readable: IConstants.REST_URL.SYSTEM
Most non-trivial enterprise-y projects will have some central concept of properties or configuration options, which already takes care of loading up option from a file/database. In these cases, it's usually simple (as in, less than 5 minutes' work) to extend this to support the new propert(ies) you need.
If your project doesn't have one, then either:
It could benefit from one - in which case write it, taking values from a flat .properties file to start with. This shouldn't take more than an hour, tops, and is reusable for any config stuff in future
Even that hour would be a waste - in which case, you can still hav a default value but allow this to be overridden by a system property. This require no infrastructure work and minimal time to implement in your class.
There's really no excuse for hardcoding values - it only saves you a few minutes at most, and if your project deadline is measured in minutes then you've got bigger problems than how to code for configurability.
Admittedly, I hardcode a lot of stuff at my current hobby project. Configuration files are ridiculously easy to use instead (at least with Python, which comes with a great and simple .cfg parser), I just don't bother to use them because I am 99% confident that I will never have to change them - and even if that assumption proved false, it's small enough to refactor it with reasonable effort. For annything larger/more important, however, I would never type if foo == "hardcoded bar", but rather if foo == cfg.bar (likely with a more meaningful name for cfg). Cfg is a global singleton (yeah, I know...) which is fed the .cfg file at startup, and next time some sentinel value changes, you change the configuration file and not the source.
With a dynamic/reflective language, you don't even need to change the part loading the .cfg when you add another value to it - make it populate the cfg object dynamically with all entries in the file (or use a hashmap, for that matter) and be done.
2 suggestions here:
First, if you are working on embedded system using language like C. Simply work out a coding convention to use a #define for any string or constant. All the #define should be categorized in a .h file. That should be enough - not too complex but enough for maintainability. You don't need to mangle all the constant between the code line.
Second, if you are working on a application with access to DB. It is simple just to keep all the constant values in the database. You just need a very simple interface API to do the retrieval.
With simple tricks, both methods can be extended to support multi-language feature.

Singleton for Application Configuration

In all my projects till now, I use to use singleton pattern to access Application configuration throughout the application. Lately I see lot of articles taking about not to use singleton pattern , because this pattern does not promote of testability also it hides the Component dependency.
My question is what is the best way to store Application configuration, which is easily accessible throughout the application without passing the configuration object all over the application ?.
Thanks in Advance
Madhu
I think an application configuration is an excellent use of the Singleton pattern. I tend to use it myself to prevent having to reread the configuration each time I want to access it and because I like to have the configuration be strongly typed (i.e, not have to convert non-string values each time). I usually build in some backdoor methods to my Singleton to support testability -- i.e., the ability to inject an XML configuration so I can set it in my test and the ability to destroy the Singleton so that it gets recreated when needed. Typically these are private methods that I access via reflection so that they are hidden from the public interface.
EDIT We live and learn. While I think application configuration is one of the few places to use a Singleton, I don't do this any more. Typically, now, I will create an interface and a standard class implementation using static, Lazy<T> backing fields for the configuration properties. This allows me to have the "initialize once" behavior for each property with a better design for testability.
Use dependency injection to inject the single configuration object into any classes that need it. This way you can use a mock configuration for testing or whatever you want... you're not explicitly going out and getting something that needs to be initialized with configuration files. With dependency injection, you are not passing the object around either.
For that specific situation I would create one configuration object and pass it around to those who need it.
Since it is the configuration it should be used only in certain parts of the app and not necessarily should be Omnipresent.
However if you haven't had problems using them, and don't want to test it that hard, you should keep going as you did until today.
Read the discussion about why are they considered harmful. I think most of the problems come when a lot of resources are being held by the singleton.
For the app configuration I think it would be safe to keep it like it is.
The singleton pattern seems to be the way to go. Here's a Setting class that I wrote that works well for me.
If any component relies on configuration that can be changed at runtime (for example theme support for widgets), you need to provide some callback or signaling mechanism to notify about the changed config. That's why it is not enough to pass only the needed parameters to the component at creation time (like color).
You also need to provide access to the config from inside of the component (pass complete config to component), or make a component factory that stores references to the config and all its created components so it can eventually apply the changes.
The former has the big downside that it clutters the constructors or blows up the interface, though it is maybe fastest for prototyping. If you take the "Law of Demeter" into account this is a big no because it violates encapsulation.
The latter has the advantage that components keep their specific interface where components only take what they need, and as a bonus gives you a central place for refactoring (the factory). In the long run code maintenance will likely benefit from the factory pattern.
Also, even if the factory was a singleton, it would likely be used in far fewer places than a configuration singleton would have been.
Here is an example done using Castale.Core >> DictionaryAdapter and StructureMap