Change default configuration on Hadoop slave nodes? - configuration

I am currently trying to pass some values as command-line arguments and parse them using GenericOptionsParser, with Tool implemented.
From the master node I run something like this:
bin/hadoop jar MYJAR.jar MYJOB -D mapred.reduce.tasks=13
But this only gets applied on the master! Is there any way to apply it on the slaves as well?
I use Hadoop 0.20.203.
Any help is appreciated.

But this only gets applied on the master! Is there any way to apply it on the slaves as well?
According to "Hadoop: The Definitive Guide", setting some properties on the client side has no effect. You need to set them in the configuration file of the daemon that uses them. Note that you can also create new properties in the configuration files and read them in your code using the Configuration object.
Be aware that some properties have no effect when set in the client configuration. For example, if in your job submission you set mapred.tasktracker.map.tasks.maximum with the expectation that it would change the number of task slots for the tasktrackers running your job, then you would be disappointed, since this property is only honored if set in the tasktracker’s mapred-site.xml file. In general, you can tell the component where a property should be set by its name, so the fact that mapred.tasktracker.map.tasks.maximum starts with mapred.tasktracker gives you a clue that it can be set only for the tasktracker daemon. This is not a hard and fast rule, however, so in some cases you may need to resort to trial and error, or even reading the source.
You can also configure the environment of the Hadoop daemons using the HADOOP_*_OPTS variables in the conf/hadoop-env.sh file.
Again, according to "Hadoop: The Definitive Guide":
Do not confuse setting Hadoop properties using the -D property=value option to GenericOptionsParser (and ToolRunner) with setting JVM system properties using the -Dproperty=value option to the java command. The syntax for JVM system properties does not allow any whitespace between the D and the property name, whereas GenericOptionsParser requires them to be separated by whitespace.
JVM system properties are retrieved from the java.lang.System class, whereas Hadoop properties are accessible only from a Configuration object.
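To make the distinction concrete, here is a minimal sketch in plain Java. It does not use Hadoop at all: parseJobProperties is a hypothetical toy stand-in for GenericOptionsParser, and a java.util.Properties object plays the role of the Configuration object, so treat it as an illustration of the two separate namespaces rather than of Hadoop's actual parsing code.

```java
import java.util.Properties;

public class PropertyScopes {

    /**
     * Toy stand-in for GenericOptionsParser (hypothetical, not Hadoop code):
     * collects "-D key=value" pairs, where -D and the pair are two separate
     * arguments, into a Properties object that plays the role of a
     * Hadoop Configuration.
     */
    static Properties parseJobProperties(String[] args) {
        Properties conf = new Properties();
        for (int i = 0; i + 1 < args.length; i++) {
            if (args[i].equals("-D")) {
                String pair = args[++i];
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    conf.setProperty(pair.substring(0, eq), pair.substring(eq + 1));
                }
            }
        }
        return conf;
    }

    public static void main(String[] args) {
        Properties conf = parseJobProperties(
                new String[] {"MYJOB", "-D", "mapred.reduce.tasks=13"});
        // The job property is visible through the configuration object...
        System.out.println("conf:   " + conf.getProperty("mapred.reduce.tasks"));
        // ...but not through JVM system properties, which are only set by
        // "java -Dname=value" (no whitespace) or System.setProperty.
        System.out.println("system: " + System.getProperty("mapred.reduce.tasks"));
    }
}
```

In a real job you would read the value from the Configuration that ToolRunner hands to your Tool, not from System.getProperty.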

Related

Can I set the Node interpreter for all run / debug configuration at once in WebStorm / PhpStorm?

I use PhpStorm (though this should also apply to WebStorm) to develop node projects.
I manage my node versions via nvm and hence always have multiple node versions available.
In the past, I always set a specific node version for a run configuration, e.g.
~/.nvm/versions/node/v10.16.2.0/bin/node
which means that once I change my node version to e.g. v11.15.0, the file link becomes invalid and I have to update it to
~/.nvm/versions/node/v11.15.0/bin/node
This becomes tedious, and I can have a lot of run configs per project. Now I realized I could set the node interpreter to "Project" in the run configuration, and it would use the one defined in the general node settings.
This way, if I change the node version I can just change the project setting and it would be applied to all the run configurations using the project setting.
But now I have a lot of already set run configs, and I would have to change all of them one by one.
Hence: Is there a way to set all run configs at once? (Also, at best this would also reset the node interpreters for tslint / typescript and other tools.)
Or do I have to do it manually?
You have to do this manually. Specific interpreters chosen in run configurations (and other Node.js-dependent configs) always override the defaults, so changing the interpreter in one place won't update the others. You have to set the interpreter to the Project alias everywhere; then, the next time you change it in Settings | Languages & Frameworks | Node.js and NPM, all configs will be updated automatically.

Does Qpid Broker 7 support ${variable} substitution in its config file?

I am upgrading from version 6 to version 7, which means switching from instantiating a Broker to using the SystemLauncher.
In particular, I had specified the keystore with a variable, and am now providing that via the systemConfigAttributes passed to start. But it reports that it can't find the keystore, and names my ${variable} as the keystore, rather than the value I provided.
I've also tried using one of the standard variables ${qpid.amqp_port} instead, just in case, and it's still reporting that variable name rather than a value.
Is this a feature that doesn't work any more, or am I plugging it in wrong?
The feature is still used in the broker configuration, and so you should be able to use it. Can you share your code / config so we can see what the problem might be?

What are the output files of the VxWorks Workbench kernel configuration GUI

I'm trying to generate a VxWorks 6.9.4.8 kernel configuration that is identical to another kernel workbench project. The Workbench 3.3.6 only allows GUI configuration.
Is there an underlying kernel configuration file, produced by the GUI, which can be replaced?
After updating the kernel configuration using the Workbench GUI, I see the following files have changed:
linkSyms.c,
prjComps.h,
prjConfig.c, and
prjParams.h
I guess my question is: which one, if any, uniquely identifies the kernel as built?
prjComps.h will contain the names of all the components you have chosen in your kernel configuration GUI.
The first step in creating a new kernel configuration based on another one is to use the GUI configurator to add the components missing from prjComps.h. It is best to use a diff tool such as Beyond Compare and keep reducing the differences by adding or removing components. Remember not to edit this file directly, only via the GUI configurator, because the tool calculates the dependent components and adds or removes them.
The second step is to create the new prjParams.h in the same way.
Workbench also lets you edit the kernel configuration from the command line via the vxprj tool in VxWorks 6.9 (this tool has been replaced by "wrtool" in VxWorks 7). Right-click on the image project and choose 'Open Wind River vxWorks 6.9 Development Shell'.
If you want to add a component, e.g. the telnet client (INCLUDE_TELNET_CLIENT), you can use the following command:
vxprj component add INCLUDE_TELNET_CLIENT
To remove a component:
vxprj component remove INCLUDE_TELNET_CLIENT
For more on the vxprj tool, you can look up the documentation in Workbench itself.
The project configuration is held in a handful of files in the kernel project directory.
These are:
.project
.cproject
.wrproject
projectname.wpj
Files such as prjComps.h, prjParams.h and prjConfig.c are all generated by the configuration tool; however, they are not configuration files themselves. Instead, they are generated C code that contains, amongst other things, a list of selected components.
These files are also re-generated, I believe, when you rebuild the project.
As such, these are not really the authoritative source you are interested in.
For this, you need to look at the project files. In terms of a list of components, the most interesting is the .wpj file, which contains amongst other things a list of explicitly and implicitly included components.
The explicitly included components are those you manually selected in the Kernel Configuration GUI, the implicitly included are those that were then included to satisfy dependencies.
This distinction can sometimes make comparing kernel configurations tricky. In that case you may want to fall back on the generated files, e.g. prjComps.h, but you should always remember that they are a representation of the configuration, not the source.
The .project etc. configuration files are big and complex, but a decent diff tool such as Beyond Compare can make comparisons of the project directories fairly easy.
Thanks for the clue, #endTunnel. I looked at that file, and noticed that a few files get modified when I save my GUI selections.
prjComps.h - all the components #included in the kernel build
prjParams.h - the additional parameters set for the enabled components
prjConfig.c - the configuration and initialization calls for each module included.
'linkSyms.c' also gets modified. Not sure how that is used, yet.
I can now use diff to compare kernel configurations, and perhaps even duplicate a configuration (haven't tried that yet).
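As a rough sketch of that comparison, the following Java program lists the components present in one prjComps.h but not in another. It assumes each selected component shows up as a line of the form "#define INCLUDE_<NAME>"; check that against your own generated file. The class and method names (ComponentDiff, components, missingFrom) are made up for this example. The output is only a to-do list: the changes themselves should still go through the GUI or vxprj so that dependencies get resolved.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ComponentDiff {

    // Assumption: a selected component appears as "#define INCLUDE_<NAME>".
    private static final Pattern COMPONENT =
            Pattern.compile("^\\s*#\\s*define\\s+(INCLUDE_\\w+)", Pattern.MULTILINE);

    /** Extracts the sorted set of component names from the header text. */
    static Set<String> components(String headerText) {
        Set<String> names = new TreeSet<>();
        Matcher m = COMPONENT.matcher(headerText);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Returns the components that are in reference but not in candidate. */
    static Set<String> missingFrom(Set<String> reference, Set<String> candidate) {
        Set<String> missing = new TreeSet<>(reference);
        missing.removeAll(candidate);
        return missing;
    }

    private static String read(String path) throws IOException {
        return new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ComponentDiff <reference prjComps.h> <candidate prjComps.h>");
            return;
        }
        Set<String> reference = components(read(args[0]));
        Set<String> candidate = components(read(args[1]));
        System.out.println("To add:    " + missingFrom(reference, candidate));
        System.out.println("To remove: " + missingFrom(candidate, reference));
    }
}
```

Each name under "To add" could then be passed to vxprj component add, and each name under "To remove" to vxprj component remove.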

Azure : can we check if a setting exists before trying to read it?

I currently use RoleEnvironment.GetConfigurationSettingValue(propertyName) to get the value of a setting defined in my WebRole config file (csdef + cscfg). Ok, sounds right.
This works well if the setting exists, but fails with an exception if the setting is not defined in the csdef and the cscfg.
I'm migrating an existing app to Azure which has many configuration settings in web.config. In my code, to read a setting value, I'd like to test whether it exists in the WebRole config (csdef + cscfg): if it does, I read it from there; otherwise I read it from web.config with ConfigurationManager.
This would save me from migrating every setting out of my web.config, while still allowing me to customize one once the app is deployed.
Is there a way to do this?
I don't want to wrap GetConfigurationSettingValue in a try/catch (and read from web.config in the catch block), because it's an ugly approach and, above all, it performs poorly.
Thanks!
Update for 1.7 Azure SDK.
The CloudConfigurationManager class has been introduced. This allows a single GetSetting call to look in your cscfg first and then fall back to web.config if the key is not found.
http://msdn.microsoft.com/en-us/LIBRARY/jj157248
Pre 1.7 SDK
Simple answer is no. (That I know of)
The more interesting topic is to consider configuration as a dependency. I have found it beneficial to treat configuration settings as a dependency so that the backing implementation can be changed over time. That implementation may be a fake for testing, or something more complex like switching from .config/.cscfg to a database implementation for multi-tenant solutions.
Given this configuration wrapper, you can write TryGetSetting as an internal method for whatever your source of configuration options is. When this feature is added to the RoleEnvironment members, you would only have to change that internal implementation.
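The wrapper idea can be sketched as follows. This is Java rather than C#, and none of it is Azure API: SettingsSource, fromMap and firstOf are hypothetical names, and the in-memory maps stand in for the role configuration and web.config. The point is only the shape: callers depend on a "try get" interface, and the lookup order lives in one place.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class FallbackSettings {

    /** A source of settings that reports absence instead of throwing. */
    interface SettingsSource {
        Optional<String> tryGetSetting(String key);
    }

    /** In-memory source; a real one would wrap the role config or web.config. */
    static SettingsSource fromMap(Map<String, String> values) {
        return key -> Optional.ofNullable(values.get(key));
    }

    /** Consults the sources in order and returns the first value found. */
    static SettingsSource firstOf(SettingsSource... sources) {
        List<SettingsSource> ordered = Arrays.asList(sources);
        return key -> {
            for (SettingsSource source : ordered) {
                Optional<String> value = source.tryGetSetting(key);
                if (value.isPresent()) {
                    return value;
                }
            }
            return Optional.empty();
        };
    }

    public static void main(String[] args) {
        Map<String, String> roleConfig = new HashMap<>();
        roleConfig.put("StorageAccount", "from-cscfg");

        Map<String, String> webConfig = new HashMap<>();
        webConfig.put("StorageAccount", "from-web-config");
        webConfig.put("SmtpHost", "mail.local");

        // Role configuration wins; web.config is the fallback.
        SettingsSource settings = firstOf(fromMap(roleConfig), fromMap(webConfig));

        System.out.println(settings.tryGetSetting("StorageAccount").orElse("<missing>"));
        System.out.println(settings.tryGetSetting("SmtpHost").orElse("<missing>"));
        System.out.println(settings.tryGetSetting("Unknown").orElse("<missing>"));
    }
}
```

A fake for tests is then just another SettingsSource, and swapping the backing store later does not touch the callers.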

Command line parameters or configuration file?

I'm developing a tool that will perform several types of analysis, and each analysis can have different levels of thoroughness. This app will have a fair number of options to be given before it starts. I started implementing this using a configuration file, since the number of analysis types specified was small. As the number of options implemented grew, I created more configuration files. Then I started mixing in some command line parameters, since some of the options could only be flags. Now I've mixed a bunch of command line parameters with configuration files and feel it needs refactoring.
My question is: when and why would you use command line parameters instead of configuration files, and vice versa?
Is it perhaps related to the language you use, personal preference, etc.?
EDIT: I'm developing a java app that will work in Windows and Mac. I don't have a GUI for now.
Command line parameters are useful for quickly overriding a parameter setting from the configuration file. They are also useful when there are only a few parameters. For your case, I'd suggest that you export parameter presets to the command line.
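Since the app is in Java, here is a minimal sketch of that layering: defaults come from properties-format config text, and "--key=value" arguments override them. The class name LayeredOptions, the "--" prefix convention and the option names (analysis, thoroughness, verbose) are assumptions made for the example, not something from your tool.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class LayeredOptions {

    /**
     * Loads defaults from properties-format text, then applies overrides of
     * the form --key=value. A bare --flag is stored as flag=true.
     */
    static Properties resolve(String configText, String[] args) {
        Properties options = new Properties();
        try {
            options.load(new StringReader(configText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        for (String arg : args) {
            if (!arg.startsWith("--")) {
                continue; // positional arguments are ignored in this sketch
            }
            String body = arg.substring(2);
            int eq = body.indexOf('=');
            if (eq > 0) {
                options.setProperty(body.substring(0, eq), body.substring(eq + 1));
            } else if (eq < 0 && !body.isEmpty()) {
                options.setProperty(body, "true");
            }
        }
        return options;
    }

    public static void main(String[] args) {
        // In a real tool this text would be read from a file on disk.
        String config = "analysis=syntax\nthoroughness=low\n";
        Properties options = resolve(config, args);
        options.stringPropertyNames().stream().sorted()
                .forEach(name -> System.out.println(name + " = " + options.getProperty(name)));
    }
}
```

Running it with --thoroughness=high --verbose keeps analysis from the config text, overrides thoroughness, and adds verbose=true.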
Command line arguments:
Pros:
concise - no extra config files to maintain by itself
great interaction with bash scripts - e.g. variable substitution, variable reference, bash math, etc.
Cons:
it could get very long as the options become more complex
formatting is inflexible - besides some command line utilities that help you parse the high level switches and such, anything more complex (e.g. nested structured information) requires custom syntax such as using Regex, and the structure could be quite rigid - while JSON or YAML would be hard to specify at the command line level
Configuration files:
Pros:
it can be very large, as large as you need it to be
formatting is more flexible - you can use JSON, YAML, INI, or any other structural format to represent the information in a more human consumable way
Cons:
inflexible when it comes to bash variable substitution and references (as well as bash math) - you probably have to define your own substitution rules if you want the config file to be "generic" and reusable, whereas this is the biggest advantage of command line arguments. Variable math is difficult in config files, if not impossible: you either define your own "operator" in the config files, or rely on another bash script to do the math and perform your custom substitution so that the "generic" config file becomes "concretely usable".
even once a generic config file (with custom substitution rules) is ready, a bash script is still needed to carry out the actual substitution, and your command line still has to accept all the substituted values. So either your config files have no variable substitution, which means you "hard code" and repeat the config file for each scenario, or the custom substitution rules make your in-app config file logic much more complex.
In my use case, I place more value on being able to do variable substitution / reference (as well as bash math) in bash scripts, since I'm using the same binary to start many server nodes with different responsibilities in a server backend cluster. The bash scripts act as a kind of container, or effectively a config file, for starting the different nodes with differing command line arguments.
My vote: both, à la mysqld.exe.
What environment/platform? In Windows you'd rather use a config file, or even a configuration panel/window in the gui.
I place configuration that doesn't really change in a configuration file.
Configuration that changes often I place on the command line.