Delayed deployment of InfoSphere Streams Operators and Runtime 'tags' deployment - infosphere-spl

I'd like to know whether two capabilities are available in InfoSphere Streams; I could not find the answer anywhere else.
1) To the best of my knowledge, when an InfoSphere Streams application starts, all of its operators are deployed to hosts in the cluster. Is it possible to deploy a specific operator based on the results of previous operator(s), so that deployment happens during a job (and not only when a host fails)?
2) Also, to the best of my knowledge, tags exist that allow specifying which operators will be deployed to which hosts. Is it possible to change host tags while a job is running? Following on from question (1): is it possible, at runtime, to deploy an operator to a specific machine based on computations that occurred during the job?
Thanks, Tom.

Answers to your questions:
1.) Operators can be placed relative to the placement of other operators, but not based upon the results of an operator's execution.
2.) There is currently no way for a running operator to change host tags based upon its calculations.
Tags can be changed on a host while a job is running, but this must be done through administrator operations. The PEs must then be stopped and restarted to take advantage of the new tagging configuration.

Related

Openwhisk multitenancy on Openshift

I'm trying to install OpenWhisk on OpenShift. I've followed the official guide and it worked.
The point is that my environment will be a multitenant ecosystem, so let's say there are two different users (Ux and Uy) who want to run their containers in my OpenWhisk environment.
I'd like to have the following projects in my OpenShift cluster:
Core project, that hosts the OpenWhisk Ingress, Controller, Kafka and CouchDB components (maybe also the Invokers?)
UxPRJ project, that hosts only the containers running actions created by Ux (maybe also the Invokers?)
UyPRJ project, that hosts only the containers running actions created by Uy (maybe also the Invokers?)
(Two diagrams illustrating the intended layout accompanied the original question but are omitted here.)
Is such a configuration possible?
Looking around, I wasn't able to find anything like that...
Thank you.
The OpenWhisk load balancer, which assigns functions to invokers, does not segregate users the way you want, but you can achieve it if you modify the load balancer. The way it works now, there is a list of available invokers that forms the allowed set for a function assignment. At that point you could take a per-user partitioning into account and form the allowed set of invokers differently. There are other ways to realize the partitioning you want as well, but all of them require modifying the OpenWhisk control plane.
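The per-user partitioning described above could, for example, be a deterministic mapping from tenant namespace to a subset of the invoker pool. The sketch below is purely illustrative (OpenWhisk's actual load balancer is written in Scala, and the function names here are invented); it shows one such mapping using a hash of the namespace:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// invokers is the full pool known to the load balancer.
var invokers = []string{"invoker0", "invoker1", "invoker2", "invoker3"}

// allowedInvokers deterministically maps a tenant namespace to a
// fixed-size subset of the invoker pool, so each tenant's actions
// only ever land on "their" invokers.
func allowedInvokers(namespace string, pool []string, perTenant int) []string {
	h := fnv.New32a()
	h.Write([]byte(namespace))
	start := int(h.Sum32() % uint32(len(pool)))
	subset := make([]string, 0, perTenant)
	for i := 0; i < perTenant; i++ {
		// wrap around the pool so any start index yields perTenant invokers
		subset = append(subset, pool[(start+i)%len(pool)])
	}
	return subset
}

func main() {
	fmt.Println("Ux ->", allowedInvokers("UxPRJ", invokers, 2))
	fmt.Println("Uy ->", allowedInvokers("UyPRJ", invokers, 2))
}
```

The same namespace always hashes to the same subset, which is the property the partitioned allowed set needs.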

Can "MaxConcurrentStreams" server option be considered an equivalent to "maximum_concurrent_rpcs" from grpc-python?

I am implementing a gRPC server (in Go) where I need to respond with some sort of server busy/unavailable message when the server is already servicing the configured maximum number of RPCs.
I implemented a gRPC server with grpc-python earlier, where I achieved this with a combination of maximum_concurrent_rpcs and the max number of threads in the thread pool. I am looking for something similar in grpc-go. The closest I could find is the server setting applied via the ServerOption returned by calling MaxConcurrentStreams. My application only supports unary RPCs, and I am not sure whether this setting applies to those.
I just want to enforce a maximum number of active concurrent requests the server can handle. Would setting MaxConcurrentStreams work, or should I do it in my own code? (I have a rudimentary implementation of the latter, but I would rather use something provided by grpc-go.)
I've never used MaxConcurrentStreams, because for high-load services you usually want to get the most out of your hardware, and this limitation doesn't seem to make sense there. It may be possible to achieve your goal with this setting, but you should investigate what kind of error is returned when the MaxConcurrentStreams limit is reached. I think it would be a gRPC transport error rather than your own, so you would not be able to control the error message and code.
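Note also that, as far as I can tell, MaxConcurrentStreams caps concurrent streams per HTTP/2 connection rather than across the whole server, which is another argument for enforcing a server-wide cap in application code. Below is a self-contained sketch of that approach: a counting semaphore that a unary interceptor could consult, rejecting the call (for example with codes.ResourceExhausted) when all slots are taken. The limiter type and its method names are my own, not grpc-go API; only the semaphore logic is shown, with the gRPC wiring left out.

```go
package main

import "fmt"

// limiter is a counting semaphore: the channel's capacity is the
// maximum number of RPCs allowed in flight at once.
type limiter chan struct{}

func newLimiter(max int) limiter { return make(limiter, max) }

// tryAcquire claims a slot without blocking; it reports false when
// the server is already at capacity.
func (l limiter) tryAcquire() bool {
	select {
	case l <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot when the RPC finishes.
func (l limiter) release() { <-l }

func main() {
	l := newLimiter(2)
	fmt.Println(l.tryAcquire()) // true: slot 1 of 2
	fmt.Println(l.tryAcquire()) // true: slot 2 of 2
	fmt.Println(l.tryAcquire()) // false: at capacity, the interceptor would reject here
	l.release()
	fmt.Println(l.tryAcquire()) // true: a slot was freed
}
```

In a real server, tryAcquire/release would wrap the handler call inside a grpc.UnaryInterceptor, giving you full control over the returned status code and message.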

How about an Application Centralized Configuration Management System?

We have a build pipeline to manage the artifacts' life cycle. The pipeline consists of four stages, with a fifth on the way:
1. commit (running unit/integration tests)
2. at (deploy the artifact to the AT environment and run automated acceptance tests)
3. uat (deploy the artifact to the UAT environment and run manual acceptance tests)
4. pt (deploy to the PT environment and run performance tests)
5. //TODO we're trying to support the production environment.
The pipeline supports environment variables, so we can deploy artifacts with different configurations by triggering it with options. The problem is that sometimes there are so many configuration items that the deploy script contains too many replacement tasks.
I have an idea of building a centralized configuration management system (CCM for short), so we can maintain the configuration items there and leave only a URL (pointing to the CCM) replacement task (handling different stages) in the deploy script. The artifact then doesn't hold the configuration values; it asks the CCM for them.
Is this feasible, or a bad idea in the first place?
My concern is that the potential mismatch between a configuration key (defined in the artifact) and its value (set in the CCM) is not solved by this solution, and may even get worse.
Configuration files should remain with the project, or be set as configuration variables where the project runs. The reasoning is that you would be adding a new point of failure to your architecture: your configuration server could go down, breaking everything that depends on it.
I would advise against putting yourself in this situation.
There is no problem with having a long list of environment variables defined for a project; if anything, it can mean you're doing things properly.
If for some reason you find yourself changing configuration values a lot (for example database connection strings, API endpoints, etc.), then the real problem might be that need to change configurations, which should stay almost always the same.

Reliable way to tell development server apart from production server?

Here are the ways I've come up with:
Have a config file that is not version-controlled
Check the server-name/IP address against a list of known dev servers
Set some environment variable that can be read
I've used (2) on some of my projects, and that worked well with only one dev machine, but now that we're up to about ten, it may become difficult to manage an ever-changing list.
(1) I don't like, because that's an important file and it should be version controlled.
(3) I've never tried. It requires more configuration when we set up each server, but it could be an OK solution.
Are there any others I've missed? What are the pros/cons?
(3) doesn't have to require more configuration on the servers. You could instead default to server mode, and require more configuration on the dev machines.
In general I'd always want to make the dev machines the special case, and release behavior the default. The only tricky part is that if the relevant setting is in the config file, then developers will keep accidentally checking in their modified version of the file. You can avoid this either in your version-control system (for example a checkin hook), or:
read two config files, one of which is allowed to not exist (and only exists on dev machines, or perhaps on servers set up by expert users)
read an environment variable that is allowed to not exist.
Personally I prefer to have a config override file, just because you've already got the code to load the one config file, it should be pretty straightforward to add another. Reading the environment isn't exactly difficult, of course, it's just a separate mechanism.
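The two-file approach (a required base config plus an optional dev-only override) boils down to a map overlay. A minimal Go sketch, with file parsing omitted and mergeConfig an invented name:

```go
package main

import "fmt"

// mergeConfig overlays the optional override map (the dev-only file)
// on top of the base config. A nil override, i.e. the file not
// existing on a production server, simply leaves the base untouched.
func mergeConfig(base, override map[string]string) map[string]string {
	merged := make(map[string]string, len(base))
	for k, v := range base {
		merged[k] = v
	}
	for k, v := range override { // ranging over a nil map is a no-op
		merged[k] = v
	}
	return merged
}

func main() {
	base := map[string]string{"db": "prod-db:5432", "debug": "false"}
	devOverride := map[string]string{"db": "localhost:5432", "debug": "true"}
	fmt.Println(mergeConfig(base, devOverride))
	fmt.Println(mergeConfig(base, nil)) // production: no override file
}
```

Only the base file is version-controlled; the override file is listed in the VCS ignore list, so developers cannot accidentally check in their local settings.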
Some people really like their programs to be controlled by the environment, especially those who want to control them from scripts: they don't want to write a config file on the fly when it's so easy to set an environment variable in a script. So it might be worth using the environment from that point of view, but not just for this one setting.
Another completely different option: make dev/release mode configurable within the app, if you're logged into the app with suitable admin privileges. Whether this is a good idea might depend whether you have the kind of devs who write debug logging messages along the lines of, "I can't be bothered to fix this, but no customer is ever going to tell the difference, they're all too stupid." If so, (a) don't allow app admins to enable debug mode (b) re-educate your devs.
Here are a few other possibilities.
Some organizations keep development machines on one network, and production machines on another network, for example, dev.example.com and prod.example.com. If your organization uses that practice, then an application can determine its environment via the fully-qualified hostname on which it is running, or perhaps by examining some bits in its IP address.
Another possibility is to use an embeddable scripting language (Tcl, Lua and Python come to mind) as the syntax of your configuration file. Doing that means your configuration file can easily query environment variables (or IP addresses) and use that to drive an if-then-else statement. A drawback of this approach is the potential security risk of somebody editing a configuration file to add malicious code (for example, to delete files).
A final possibility is to start each application via a shell/Python/Perl script. The script can query its environment and use that to drive an if-then-else statement that passes a command-line option to the "real" application.
By the way, I don't like to code an environment-testing if-then-else statement as follows:
if (check-for-running-in-production) {
... // run program in production mode
} else {
... // run program in development mode
}
The above logic silently breaks if the check-for-running-in-production test has not been updated to deal with a newly added production machine. Instead, I prefer to code a bit more defensively:
if (check-for-running-in-production) {
... // run program in production mode
} else if (check-for-running-in-development) {
... // run program in development mode
} else {
print "Error: unknown environment"
exit
}

What is the role/responsibility of a 'shell'? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
I have been looking at the source code of the IronPython project and the Orchard CMS project. IronPython operates with a namespace called Microsoft.Scripting.Hosting.Shell (part of the DLR). The Orchard Project also operates with the concept of a 'shell' indirectly in various interfaces (IShellContainerFactory, IShellSettings).
None of the projects mentioned above have elaborate documentation, so picking up the meaning of a type (class etc.) from its name is pretty valuable if you are trying to figure out the overall application structure/architecture by reading the source code.
Now I am wondering: what do the authors of this source code have in mind when they refer to a 'shell'? When I hear the word 'shell', I think of something like a command line interpreter. This makes sense for IronPython, since it has an interactive interpreter. But to me, it doesn't make much sense with respect to a Web CMS.
What should I think of, when I encounter something called a 'shell'? What is, in general terms, the role and responsibility of a 'shell'? Can that question even be answered? Is the meaning of 'shell' subjective (making the term useless)?
Thanks.
In Orchard the term "shell" is really more of a metaphor for a scope. There are three nested scopes: host, shell, and work.
The host is a single container that lives for the duration of the web app domain.
The shell is a child container created by the host that is built according to the current configuration. If the configuration is changed a new shell is built up and the existing one is let go.
The work is another container, created by the shell, that holds the components living for the duration of a single request.
One nice thing about the use of a shell container is that it helps avoid static variables and the need to cycle the app domain when configuration changes. Another nice thing is that it enables an Orchard app domain to serve more than one "site" at the same time, when the host holds a number of shells and uses the appropriate one for each request.
I think that a general meaning for shell would be 'user process that interprets and executes commands'.
'User process': as distinct from a process built into the operating system kernel. JCL in the IBM mainframe world would be hard-pressed to count as a shell.
'interprets and executes': in some shape or form, a shell reads commands from a file or a terminal, and reacts to what is presented, rather than being rigidly programmed to do a certain sequence of commands.
'Commands': what the commands are depends on the context. In the standard Unix shells, the commands executed are mainly other programs, with the shell linking them together appropriately. Obviously, there are built-in commands, and there is usually also flow-control syntax to allow for appropriate reactions to the results of executing commands.
In other contexts, it is reasonable to think of other sorts of commands being executed. For example, one could envision an 'SQL Shell' which allowed the user to execute SQL statements while connected to a database.
A Python shell would support Pythonic notations and would execute Python-like statements, with a syntax closely related to the syntax of Python. A Perl Shell would support Perl-like notations and would execute Perl-like statements, ... And so the list goes on. (For example, Tcl has tclsh - the Tcl Shell.)