Best way to configure a run script? - configuration

I am tasked with supportting a run script that uses environment variables to determine which tools to use, which directories to grab source files from, etc. This does not seem like the best technique to me. It seems like it would be much better to have configuration files that set all these things and have the run script parse this instead of relying on environment variables. For one thing it would allow others to run your tests ver easily (just point to the config file) and less prone to errors (environment variables getting contaminated) and easier to debug. I thought I had also read somewhere that best practices was to use an explict config file for these types of things.
I just wanted to get everyones thoughts on this.

Yes, it's often helpful to keep config separate from code (although I've seen this taken to ridiculous extremes with either too many "configurables", or long chains of dependent config files where one or two be fine).
One simple step could be moving the environment variables into a separate file and have the original script "source" the new file (which effectively becomes your "config file") - minimal changes and no additional parsing required.

Related

How to organize code so that we can move and update it without having to edit the location of the configuration file?

The issue that I consider is how to write code that can easily know the location of a required config file and yet is portable, without any edit, from an environment to another. We don't want to edit the location of the configuration file to adapt the code to each new environment, say each time we move the code from a development environment to production. The method should not rely on resources that are not universally available, such as an access to user-defined environment variables or an access to a specific directory. For example, it may seem that using the DOCUMENT_ROOT as a base location for the config file is the way to go, but that is not universal. First, in a command line environment the DOCUMENT_ROOT makes no sense. Second, a programmer might be given access to a sub-folder of the DOCUMENT_ROOT only. Another requirement is that the configuration file could depend on values known at run time, say the user who call the application, as in this question How to load a config file based on user selection from "unknown" location .
The question is not what is the best location of the configuration file in specific environments, such as Location to put user configuration files in windows . The programmers would still have to figure out the best location so that end users could easily find the configuration file. The question is how this location, whatever it is, even if it depends on values known at run time, can be passed to the code in a portable manner.
One approach is to design any script file with in mind that it is to be included in another file and so on until we get to a wrapper script that only defines the directory of the config file to the benefit of the included file and other included files therein. Once this directory path is known, other configuration values can be obtained from a named configuration file within it. This works because the wrapper scripts are not updated when we update the code from a repository or testing environment. This approach seems universally applicable : no special support of any kind such as an access to user defined environment variables or to some specific directory in the server is needed. As long as you have access to the code, which is a strict minimum to expect, it works. Also, scripts are often naturally designed to be included in another file - so it is natural.
The approach only requires that we agree on a convention for the name of the constant, say CONFIG_DIRECTORY. If every programmer would agree to search at the location specified by this constant for the config file, then any user of the code could put the config file anywhere and just define this constant accordingly.
In Linux, they have the folder /etc for config files. So, the notion of an universally agreed standard in a very large context is already there. This is the same idea than the one proposed here, except that it is the same constant for all machines and someone might not have access to that level of the server. Moreover, we lose the possibility to have different configuration directories for different wrapper scripts. Allowing the universal standard to be a constant name, say 'CONFIG_DIRECTORY', instead of being the fixed constant '/etc', seems just an extra flexibility with no additional inconvenient. It does require that we define this constant in some wrapper script, but we could fall back to the old approach if it is not defined. The outcome, if the approach is strictly applied, would be that all the scripts required in the server document root would only be simple wrappers that define a configuration directory. That seems cool. Often people say that it is safer to have important code outside the document root.

In what scenario is the --srcdir config option valid?

I am working on a configuration program - it isn't autoconf, but I'm trying (as much as possible) to get it so that ./configure files that use it can be interfaced in a similar manner to those that are made with autoconf -- and that means (as much as possible) supporting the same variable options.
Only there's one option that makes no sense to me. I mean, yes, I am fully clear on what the option means, I just can't conceive of a single scenario in which someone would be well-advised to use that option - except for one scenario in which I'm equally curious why ./configure scripts can't auto-detect the information it would provide.
The option I am referring to is the "--srcdir" option. The reason it so befuddles me is that the only scenario I can imagine in which the source-code files won't be in your present-working-directory (or relative to your present-working-directory as the configure script is programmed to expect) is if the "configure" script itself isn't in your present-working-directory ---- and in that one scenario, I really am unable to imagine why the "configure" script can't extrapolate the source-directory from the name it is invoked by - and instead has to have that --srcdir option to give it that information.
For example, let's say your program's source-code is located in the "awesome/software" directory. That means that, from where you are, the "configure" script would be "awesome/software/configure". Why can't the "configure" script deduce that the source-directory is "awesome/software" just from the fact that it is invoked by the name "awesome/software/configure", and instead require me to add a separate command-line option of: --srcdir=awesome/software
And if this is not the kind of scenario where one would need to specify the --srcdir option (or if it is not the only kind of such scenario) can someone describe to me any other kind of scenario where the person installing a program would be well-advised to alter the "srcdir" variable from it's default?
The option I am referring to is the "--srcdir" option. ...
the only scenario I can imagine in which the source-code files won't be in your present-working-directory (or relative to your present-working-directory as the configure script is programmed to expect) is if the "configure" script itself isn't in your present-working-directory
Right.
and in that one scenario, I really am unable to imagine why the "configure" script can't extrapolate the source-directory from the name it is invoked by - and instead has to have that --srcdir option to give it that information.
I'm not sure it's required. The configure script will attempt to guess the location of srcdir:
# Find the source files, if location was not specified.
if test -z "$srcdir"; then
ac_srcdir_defaulted=yes
# Try the directory containing this script, then the parent directory.
...
So if it's in neither of those places, this will fail, hence the need for --srcdir. Maybe this is (was?) needed where there's some kind of performance differential where the sources are stored on a "slow" drive and the build happens on a "fast" drive, and configure seems to run faster on the "fast" drive so it needs to be there as well...
At any rate, --srcdir is just a variable assignment, so it's not hard to do.
Why can't the "configure" script deduce that the source-directory is "awesome/software" just from the fact that it is invoked by the name "awesome/software/configure"
The configure source seems to do that without specifying --srcdir, but I have not tried it.

Why config files should't be changed line-by-line with Chef / Puppet?

Why is changing lines in configuration file considered an anti-pattern in Chef or Puppet?
It's something like bad habit, as I understood. I assume that this file-editing is done in some idempotent way and with advanced tools (augeas for example).
Why is deploying the whole files, with ERB templates, considered a preferred method?
You can find a lot of examples where dev-ops are suggesting usage of templates instead of file-editing. For example here, here, here, etc.
Actually there is a large part of the DevOps community that sees accepting system/package defaults for config files and only modifying what you need through augeas as the preferred method, Github devops would be one of them(if you happened to catch them at Puppet Conf 2012).
I think having a default pattern of always using templates creates too high of a maintenance load and almost always requires you lock in specific versions for everything across your stack or you risk having an incompatible template against a newer version of that resource.
There's use cases for both options but in general I favor the "own as little as possible" practice vs the "own everything even if you don't have to" practice.
In terms of setting the your system to a known state, deploying whole files is better than editing, because you are sure the file is exactly as intended when you are done.
If you are tinkering around finding potential solutions to a problem and hand edit some configuration file, you don't have to worry about the hand edit you made staying around as an uncontrolled part of your environment. The next time you run chef-client, you know that the state will be exactly as specified in the Chef recipe, and won't include your edit.
Also, it is just in general harder and more complicated to robustly edit a file than it is to just generate one. You might write something that is idempotent in the basic case, but if the file contains a syntax error or something invalid, than your editing no longer works.
As always though, sometimes you don't have a choice, and editing is the only way to go.

Read all CSV files in a directory into an internal table

I have a parameter and, on F4, we can choose the directory. I'm trying to figure out how to choose a folder and read the content of all the files in it (the files are in .CSV) to an internal table. I think I have to use TMP_GUI_DIRECTORY_LIST_FILES function. Hope I'm explaining myself. Thank you.
You'll have to do this manually: first read the list of files, the go through each file and process its contents. There may be some odd function modules to read CSV files, but be aware that many of them are broken - for example, they just clip the lines that exceed a certain length. Therefore I won't recommend any of them - personally, I'd implement the CSV import part myself.
If you have access to the transaction KCLJ in your system you could analyze the coding behind it. This tool has an option to interpret CSV files so you might find interesting function modules that might help you with your tasks.
EDIT: I looked at it very quickly and the piece of coding you could reuse is reconvert_format from include RKCDFILEINCFOR. An example how to call it is located starting from line 128 in the same include.

The use of config file is it equivalent to use of globals?

I've read many times and agree with avoiding the use of globals to keep code orthogonal. Does the use of the config file to keep read only information that your program uses similar to using Globals?
If you're using config files in place of globals, then yes, they are similar.
Config files should only be used in cases where the end-user (presumably a computer-savvy user, like a developer) needs to declare settings for an application or piece of code, while keeping their hands out of the code itself.
My first reaction would be that it is not the same. I think the problem with globals is the read+write scenario. Config-files are readonly (at least in terms of execution).
In the same way constants are not considered bad programming behaviour. Config-files, at least in the way I use them, are just easy-changable constants.
Well, since a config file and a global variable can both have the effect of propagating changes throughout a system - they are roughly similar.
But... in the case of a configuration file that change is usually going to take place in a single, highly-visible (to the developer) location, and global variables can affect change in very sneaky and hard to track down ways -- so in this way the two concepts are not similar.
Having a configuration file ususally helps with DRY concepts, and it shouldn't hurt the orthogonality of the system, either.
Bonus points for using the $25 word 'orthogonal'. I had to look that one up in Wikipedia to find out the non-Euclidean definition.
Configuration files are really meant to be easily editable by the end user as a way of telling the program how to run.
A more specialized form of configuration files, user preferences, are used to remember things between program executions.
Global is related to a unique instance for an object which will never change, whereas config file is used as container for reference values, for objects within the application that can change.
One "global" object will never change during runtime, the other object is initialized through config file, but can change later on.
Actually, those objects not only can change during the lifetime of the application, they can also monitor the config file in order to realize "hot-change" (modification of their value without stopping/restarting the application), if that config file is modified.
They are absolutely not the same or replacements for eachother. A config file, or object can be used non-globally, ie passed explicitly.
You can of course have a global variable that refers to a config object, and that would be defeating the purpose.