I know that the IP changes over time, but is there a way to force OpenShift to change it every X hours without restarting the application (if not, I will consider a restart)? For example, some command, cartridge or cron script? Does this option become available with a plan upgrade to Bronze?
If there is absolutely no way to do that, can someone recommend a platform similar to OpenShift that allows changing the IP on the fly?
With OpenShift Online, applications are sometimes moved to a different node. However, users cannot initiate or "force" a move to another node, and so cannot change the application's IP address on demand (even with Bronze or Silver plans).
The other part of your question does not seem suitable for Stack Overflow.
is there a way I could listen to the service that is changing the status of pods that run on my openshift? I would love to store the status to my database so my web app can read the info when some customer will need it. I wouldn't mind looping through 1 pod but some of our customers can have hundreds of microservices running on openshift and looping every so often through all the pods isn't something I want to do.
Thanks.
I'm not sure I really understand your use case: why would a web app care about what pods are running? And, more so, why would you want to store the information about which pods are running in a database? (That's what etcd and the Kubernetes API are for.)
90% of the time, unless you are building an operator, when I find people are asking these types of questions the right answer is "don't do that". Because, after all, this creates a very tight coupling to the implementation. It also means your permissions management becomes more complicated.
But with those cautionary remarks out of the way, yes, there's a pretty easy way to do what you are asking: watch operations. See the "efficient detection of changes" section of the Kubernetes API documentation. It shows how to receive notifications when the result of a query (such as a query on Pods) changes.
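As a rough sketch of what a watch looks like from the shell: the helper below only builds the watch URL for Pods in one namespace, following the `?watch=1&resourceVersion=...` pattern from that section of the API docs. The function name `pod_watch_url` and the `API_SERVER` and `TOKEN` variables in the usage comment are placeholders made up for illustration, not anything from the question.

```shell
#!/bin/sh
# Build the path for a watch on Pods in one namespace.
# $1 = namespace, $2 = resourceVersion returned by a previous list call.
pod_watch_url() {
    printf '/api/v1/namespaces/%s/pods?watch=1&resourceVersion=%s\n' "$1" "$2"
}

# Usage (needs a reachable API server and a token allowed to watch pods):
#   curl -sN -H "Authorization: Bearer $TOKEN" \
#       "$API_SERVER$(pod_watch_url myproject 10245)"
# Each line of the response is one JSON event (ADDED, MODIFIED, DELETED).
```

From the CLI, `oc get pods --watch` gives a similar stream without hand-built URLs, which may be enough if all you want is to feed status changes into another process.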
I am using this template (https://github.com/openshift-evangelists/php-quickstart) on a start node west 2 on OpenShift. I assigned 256MB to the PHP container and 256MB to the MySQL container.
I have no data in MySQL, and even with really bare-bones PHP scripts the time to first byte (TTFB) is 6 seconds. I don't get delays like this on other websites, and definitely not on my old OpenShift 2 installation.
Is this normal? Is OpenShift 3 this slow on the free (Starter) tier? Or is there something I am doing wrong? Is there any way I can troubleshoot this further?
256MB is too little for MySQL. From what I have seen it usually wants more than that, which is why the default was set to 512MB. The exception would be if it dynamically works out how much memory is available and tries to use as much of it as possible.
The slow responses are a known issue which has been affecting a number of the Online Starter environments on and off. The issue is still being investigated and a fix is being implemented.
You can verify actual response times by getting inside the container using oc rsh or the web console and using curl against $HOSTNAME:8080.
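A minimal sketch of that check, using curl's `-w` timing variables so that the router and external network are taken out of the picture. The function name `ttfb` and the `index.php` path in the usage comment are just illustrative.

```shell
#!/bin/sh
# Print time-to-first-byte and total time, in seconds, for one request.
# $1 = URL; defaults to the app's own port inside the container.
ttfb() {
    curl -s -o /dev/null \
        -w 'ttfb=%{time_starttransfer}s total=%{time_total}s\n' \
        "${1:-http://$HOSTNAME:8080/}"
}

# Usage, from a shell opened with `oc rsh <pod>`:
#   ttfb
#   ttfb "http://$HOSTNAME:8080/index.php"
```

If the numbers are small inside the container but large from outside, the delay is probably in front of the pod rather than in PHP or MySQL.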
We have an internal network devoted to development and testing, and this network has an OGE cluster on it. I'd like to allow any machine on that network to submit jobs, without having to add them manually one by one as submit hosts. I've tried doing a wildcard, but it hasn't liked my syntax. Is there any way to do this?
Thanks!
A qualified "no" - if you really need this, consider automating it instead.
GridEngine does not support wildcarding in host names. GE relies heavily on forward and reverse name resolution for pretty much all host interactions. You are not going to get a GridEngine cluster to blindly accept job submissions from any unspecified machine on your subnet without some bad magic.
If you use a configuration management system like Puppet or Chef, that might be the best layer to define whether a server is a submit host or not.
The alternative, brute-force way (this will almost certainly violate your IT department's acceptable use policy) is to use something like nmap to produce a list of hostnames on your network (if you think you can get away with it) and write a simple shell script to add each one as a submit host. This approach requires minor ongoing maintenance as the hosts on your network change over time.
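For what it's worth, the brute-force route could look roughly like this. It is only a sketch: it assumes nmap's greppable output (`-oG`), working reverse DNS for every host, and that you run it on an admin host where `qconf -as` is allowed. The function names and the 192.168.1.0/24 subnet are placeholders.

```shell
#!/bin/sh
# Extract reverse-DNS names of hosts that are up from nmap's greppable
# output (-oG), read on stdin. Hosts without a PTR record are skipped.
up_hostnames() {
    awk '/Status: Up/ { gsub(/[()]/, "", $3); if ($3 != "") print $3 }'
}

# Register every name read on stdin as a submit host.
add_submit_hosts() {
    while read -r host; do
        qconf -as "$host"
    done
}

# Usage (run on a Grid Engine admin host; the subnet is a placeholder):
#   nmap -sn -oG - 192.168.1.0/24 | up_hostnames | add_submit_hosts
```

Running it from cron would cover the ongoing-maintenance part, at the cost of scanning the network regularly.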
I'm running a complex server setup for a de facto high-availability service. So far it takes me about two days to set everything up, so I would like to automate the provisioning.
However, I make quite a lot of manual changes to running servers. A typical example is changing a firewall configuration to cope with various hacking attempts, packet floods, etc. Being able to work on active nodes quickly is important. Also, the server maintains a lot of active TCP connections, and losing those for a simple config change is out of the question.
I don't understand if either Chef or Puppet is designed to deal with this. Once I change some system config, I would like to store it somewhere and use it while the next instance is being provisioned. Should I stick with one of those tools or choose a different one?
Hand-made changes and provisioning don't hold hands. They don't even drink tea together.
At work we use puppet to manage our whole architecture and, like you, we sometimes need to make hand-made changes in a hurry because of performance bottlenecks, attacks, etc.
What we do first is make sure puppet is able to set up every single part of the architecture, ready to be delivered, without any specific tuning.
Then, when we need to make hand-made changes in a hurry, there is no risk as long as we don't touch files managed by puppet. If the file we need to change is managed by puppet, we just stop the puppet agent and do whatever we need.
Once the emergency is over, we ask ourselves the following questions:
Should these changes be applied to all servers with the same symptoms?
If so, you can develop what puppet calls 'facts': code that runs on the agent on every run and stores its results in variables available to all your puppet modules. For example, if you raised the ip conntrack max value because a firewall could not cope with all its connections, you could easily (ten lines of code) give puppet a variable holding the current conntrack count on each run, and tell puppet to set a max value relative to current usage. All other servers then benefit from this tuning, and you will likely never have to deal with conntrack issues again (as long as you keep running puppet frequently, which is the default).
Should these changes always be applied by hand in given emergencies?
If the configuration is managed by puppet, find a way to make the configuration include another file and tell puppet to ignore that file. This is the easiest way, but it's not always possible (e.g. /etc/network/interfaces does not support includes). If it's not possible, you will have to stop the puppet agent during emergencies so that you can change puppet-managed files without the risk of your changes being reverted on the next puppet run.
Are these changes only for this host, and will no other host ever need them?
Add it to puppet anyway! Place a sweet if $fqdn == 'my.very.specific.host' and put whatever you need inside. Even for a single case it's always beneficial (though time-consuming) to migrate every change you make on a server, as it allows you to do a full restore of the server setup if for some reason the server crashes into an unrecoverable state (e.g. hardware issues).
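To give an idea of the conntrack fact mentioned above: facts are usually written in Ruby, but Facter also accepts "external facts", which are executables that print key=value lines. Here is a sketch as a shell script; the fact names `conntrack_count` and `conntrack_max` are made up, and the proc paths assume a Linux box with the nf_conntrack module loaded.

```shell
#!/bin/sh
# External fact sketch: print conntrack usage as key=value lines.
# $1 = directory holding the counters (defaults to the Linux proc path).
conntrack_facts() {
    dir=${1:-/proc/sys/net/netfilter}
    # Print nothing if the conntrack module is not loaded.
    [ -r "$dir/nf_conntrack_count" ] || return 0
    printf 'conntrack_count=%s\n' "$(cat "$dir/nf_conntrack_count")"
    printf 'conntrack_max=%s\n' "$(cat "$dir/nf_conntrack_max")"
}

# When installed as an executable in Facter's facts.d directory,
# the script would end with a plain call:
#   conntrack_facts
```

The manifest side (setting the max relative to the count) is puppet code and is not shown here.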
In summary:
For me, the trick in dealing with hand-made changes is putting a lot of effort into understanding how you decided to make the change and, after the emergency is over, moving that logic into puppet. Say you felt something was wrong because all slots of a given piece of software were in use while free memory was still available on the server, so allowing more slots was a reasonable way to deal with the traffic peak: then spend some time moving that logic into puppet. Do it very carefully, of course; it takes as long as the number of different scenarios in your architecture you want to test it against, but in the end it's very, VERY rewarding.
I would like to complete Valor's excellent answer.
puppet is a tool to enforce a configuration. So you must think of it this way:
on the machine I run puppet onto...
I ask puppet client...
to ensure that the config of the current machine...
is as specified in the puppet config...
which is taken from a puppet server, or directly from a bunch of puppet files (easier)
So to answer one of your questions, puppet doesn't require a machine or a service reboot. But if a change in a config file you set with puppet requires a restart of the corresponding service/daemon/app, then there is no way to avoid it. There are methods in puppet (the notify and subscribe metaparameters) to say that a service needs to be relaunched when its config changes. Of course, puppet will not relaunch the service if it sees that nothing changed.
Valor is assuming you use puppet in a client/server way, with (for example) puppet clients polling a puppet server for config every hour. But it is also possible to move your puppet files from machine to machine, for example with git, and launch puppet manually. This approach:
is far simpler than the client/server technique (authentication is a headache)
only enforces config changes when you explicitly ask for it, thus avoiding any overwrite of your handmade changes
This is obviously not the best way to use puppet if you manage a lot of machines, but it may be a good start or a good transition.
And also, puppet is very hard to learn at an interesting level. It took me 2 weeks to be able to automatically install an AWS server from scratch. I don't regret it, but you may want to know that fact if you must convince a boss to allocate you time.
I can think of a few hacks using ping, the box name, and the HA shared name but I think that they are leading to data leakage.
Should a box even know it's part of an HA cluster or what that cluster name is? Is this more a function of DNS? Is there some API exposed for boxes to join an HA cluster and request the id of the currently active node?
I want to differentiate between the inactive node and active node in alerting mechanisms for a running program. If the active node is alerting I want to hit a pager and on the inactive node I want to send an email. Pushing the determination into the alerting layer moves the same problem elsewhere.
EASY SOLUTION: Polling the server from an external agent that connects through the network makes any shell game of who is the active node a moot point. To clarify this the only thing that will page is the remote agent monitoring the real. Each box can send emails all day long for all I care.
It really depends on the HA system you're using.
For example, if your system uses a shared IP and the traffic is managed by some hardware box, then it can be hard to determine if a certain box is a master or slave. That will depend on a specific solution really... As long as you can add a custom script to the supervisor, you should be ok - for example the controller can ping a daemon on the master server every second. In the alerting script, simply check if the time of the last ping < 2 sec...
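As a sketch of that last idea: assume the daemon on the master touches a stamp file every time the controller pings it, and the alerting script checks how old the file is. The function names and the stamp-file convention are hypothetical, and `stat -c %Y` is the GNU form.

```shell
#!/bin/sh
# Succeed if the file's mtime is less than $2 seconds old (default 2).
# $1 = stamp file touched by the controller's ping handler.
recently_pinged() {
    [ -e "$1" ] || return 1
    now=$(date +%s)
    last=$(stat -c %Y "$1")   # GNU stat; BSD/macOS would need `stat -f %m`
    [ $((now - last)) -lt "${2:-2}" ]
}

# Choose the alert channel for this node: page on the master, email otherwise.
alert_channel() {
    if recently_pinged "$1" "$2"; then echo page; else echo email; fi
}
```

With whole-second timestamps a 2-second window is tight, so in practice you would probably allow a few missed pings before treating the node as a slave.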
If your system doesn't have a supervisor / controller node, but each node tries to determine the state itself, you can have more problems. If a split brain occurs, you can end up with both slaves or both masters, so your alerting software will be wrong in both cases. Gadgets that can ensure only one live node (STONITH and others) could help.
On the other hand, in the second scenario, if the HA software works on both hosts properly, you should be able to obtain the master/slave information straight from it. It has to know its own state at any time, because it's one of its main functions. In most HA solutions you should be able to either get the current state, or add some code to run when the state changes. Heartbeat offers both.
I wouldn't worry about the edge cases like a split brain though. Almost any situation when you lose connection between the clustered nodes will be more important than the stuff that happens on the separate nodes :)
If the thing you care about is really logging / alerting only, then ideally you could have a separate logger box which gets all the information about the current network / cluster status. An external box will probably have a better idea of how to deal with the situation. If the alerting lives only on the cluster itself and the cluster gets DoS'ed / disconnected from the network / loses power, you won't get any alert. A redundant pair of independent monitors can save you from that.
I'm not sure why you mentioned DNS - due to its refresh time it shouldn't be a source of any "real-time" cluster information.
One way is to get the box to export its idea of whether it is active into your monitoring. From there you can predicate paging/emailing on this status (with a race condition around failover), and alert on none/too many systems believing they are active.
Another option is to monitor the active system via a DNS alias (or some other method to address the active system) and page on that. Then also monitor all the systems, both active and inactive, and email on that. This will cause duplicate alerts for the active system, but that's probably okay.
It's hard to be more specific without knowing more about your setup.
As a rule, the machines in an HA cluster shouldn't really know which one is active. There's one exception, mind, and that's with cronjobs. At work, we have an HA cluster on top of which some rather important services run. Some of those services have cronjobs, and we only want them running on the active box. To do that, we use this shell script:
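#!/bin/sh
# Run the given command only on the node that currently holds the cluster IP.
HA_CLUSTER_IP=0.0.0.0
# -F matches literally; "inet " and "/" keep 10.0.0.1 from matching 10.0.0.10.
if ip addr | grep -qF "inet $HA_CLUSTER_IP/"; then
    "$@"
fi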
(Note that this is running on Debian.) What this does is check to see if the current box is the active one within the cluster (replace 0.0.0.0 with the external IP of your HA cluster), and if so, executes the command passed in as arguments to the script. This ensures that one and only one box is ever actually executing the cronjobs.
Other than that, there's really no reason I can think of why you'd need to know which box is the active one.
UPDATE: Our HA cluster uses Heartbeat to assign the cluster's external IP address as a secondary address to the active machine in the cluster. Programmatically, you can check whether your machine is the current active box by calling getifaddrs() and iterating over the addresses returned until you either get to the end or find the cluster's IP in the list. (gethostbyname() on the local host name can also work, but only if the resolver returns every local address.)
Without hard-coding...? I assume you mean some native heartbeat query; I'm not sure about that. However, you could use ifconfig: HA creates a virtual interface on whatever interface it is configured to run on. For instance, if HA was configured on eth0, it would create a virtual interface eth0:0, but only on the active node.
Therefore you could do a simple query of the ifconfig output to determine whether the server is the active node or not. For example, if eth0 is the configured interface:
ACTIVE_NODE=`ifconfig | grep -c 'eth0:0'`
That will set the $ACTIVE_NODE variable to 1 (active) or 0 (standby). Hope that helps.
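On boxes where ifconfig is not installed, a similar check can be made against `ip` output. This is only a sketch: `holds_ip` is a made-up name, 192.0.2.10 stands in for your cluster address, and it matches on the address itself rather than the eth0:0 label.

```shell
#!/bin/sh
# Succeed if the listing on stdin (as printed by `ip -o -4 addr show`)
# contains exactly the address $1, ignoring the /prefix length.
holds_ip() {
    awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) found = 1 }
                    END { exit !found }'
}

# Usage:
#   if ip -o -4 addr show | holds_ip 192.0.2.10; then
#       echo active
#   else
#       echo standby
#   fi
```

Comparing the whole address avoids partial matches, e.g. 192.0.2.1 against 192.0.2.10.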