OpenShift OKD Excessive Logging

So I installed a single-host OpenShift OKD v3.11 cluster on a VM running CentOS 7.8.2003.
It seems to have installed OK, except that it continually streams verbose logs to /var/log/messages, around five messages per second, almost all of them about throttled requests. A typical example:
    Jun 13 15:49:13 centos7 journal: I0613 14:49:13.011402 1 request.go:485] Throttling request took 196.341689ms, request: GET:https://172.30.0.1:443/api/v1/namespaces/openshift-service-cert-signer/serviceaccounts/service-serving-cert-signer-sa
The only reference I have managed to find is a Red Hat solution, but access to the discussion is only available to those with deep pockets:
https://access.redhat.com/solutions/3348921
I assume these logs are nothing to worry about, so my main question is: what is the best/cleanest/simplest/easiest way to ensure the OpenShift cluster doesn't keep filling up /var/log/messages, while still logging any important messages there?

I would recommend looking at the root cause of this behavior. These messages indicate that a lot of requests are hitting your API; typically this is because some application is performing calls in a tight loop. In your case, check the openshift-service-cert-signer components to see whether there are any warnings or an abnormal amount of log messages.
If you just want to get rid of the throttling messages, you can increase the queries per second (QPS) limit for the API server; see Recommended Practices for OKD Master Hosts (lower part).
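For reference, on OKD/OpenShift 3.11 those limits live in /etc/origin/master/master-config.yaml under the masterClients section. The snippet below is only a sketch of the kind of override that document describes; the qps/burst numbers are illustrative, not recommendations:

    masterClients:
      externalKubernetesClientConnectionOverrides:
        acceptContentTypes: application/vnd.kubernetes.protobuf,application/json
        contentType: application/vnd.kubernetes.protobuf
        qps: 200        # illustrative value: client-side queries per second
        burst: 400      # illustrative value: allowed burst above the QPS limit
      openshiftLoopbackClientConnectionOverrides:
        acceptContentTypes: application/vnd.kubernetes.protobuf,application/json
        contentType: application/vnd.kubernetes.protobuf
        qps: 300        # illustrative value
        burst: 600      # illustrative value

After editing the file, the master API and controllers need to be restarted (on 3.11 that is typically done with master-restart api and master-restart controllers) for the change to take effect.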
Regarding "the only reference I have managed to find is a question here but the access to the discussion is only available to those with deep pockets" (https://access.redhat.com/solutions/3348921): I do not understand why you say that; I can access that document with my free Red Hat account, without any subscriptions. Have you tried with a free account, as the site suggests?

Simon's answer was helpful, but I've finally got to the bottom of this.
The problem was simply that the version of Docker I had installed was old. At the time of writing, the latest version of CentOS is 7.8.2003, and if you install that and then simply run "yum install docker", hoping to get something at least reasonably new and compatible with the rest of the Linux installation, you will probably be making a mistake.
The right thing to do is to follow the simple steps here:
https://docs.docker.com/engine/install/centos/
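For anyone following along, the steps on that page boil down to roughly the following (paraphrased from the Docker CE instructions for CentOS; package names and the repo URL may change over time):

    # remove the old distro-packaged docker, if present
    sudo yum remove -y docker docker-common docker-selinux docker-engine
    # add Docker's own CentOS repository and install Docker CE from it
    sudo yum install -y yum-utils
    sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
    sudo yum install -y docker-ce docker-ce-cli containerd.io
    # start the daemon and enable it at boot
    sudo systemctl enable --now docker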
The reason I found the problem was that the excessive logging from my OpenShift cluster wasn't the only issue; I started seeing strange behaviour in other containers. A process of trial and error narrowed the issue down to the default CentOS version of Docker. Once I followed the page above, all my problems vanished, including the original one of /var/log/messages getting hammered by OpenShift containers.
The main reason I decided to answer my own question is that surely someone else is going to be as impatient/thick as me and simply install CentOS 7, then try "yum install docker", without knowing they're about to enter a world of pain.

Related

Encountered problem while integrating DevStack - OSM (Open Source MANO)

I'm currently trying to develop a cloud on my PC using VirtualBox. The idea is that I have two virtual machines, one with DevStack installed (all-in-one) and the other with OSM MANO. Right now both have everything installed, so I can log in to MANO with user and password 'admin', as well as to DevStack.
Current properties:
VM1 (devstack): IP (enp0s8) -> 192.168.56.101
Login to 192.168.56.101 -> correct
VM2 (mano): IP (enp0s8) -> 192.168.56.105
Login to 192.168.56.105 -> correct
As some of you may guess, I have two network interfaces in each VM, the first being NAT (enp0s3 with IP 10.0.2.15) and the second being host-only (192.168.56.x, as assigned by VirtualBox).
Needless to say, I can ping from one virtual machine to another without any problem.
Now, in the past I've been using DevStack (Ubuntu 18.04) to play with it a little and learn how to deploy instances, create groups and so on. Indeed, I built a topology with an instance acting as a router and Nagios as the monitoring system. It worked and I learnt a lot!
Anyway, what I want in this case is to start from scratch (scratch meaning having downloaded MANO and DevStack but without going any further). So here I am, trying to integrate OSM with DevStack, using the osm vim-create command as is:
osm vim-create --name openstack-site --user admin --password my_openstack_password --auth_url http://192.168.56.101:5000/v3 --tenant admin --account_type openstack
In this case, my openrc file (downloaded from Horizon) gives the following auth_url:
export OS_AUTH_URL=http://192.168.56.101:5000/v3
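As a quick sanity check, assuming the openstack CLI is installed on the MANO VM, the same credentials can be exported there and Keystone queried directly; the user/password/project values below simply mirror the ones used in the command above:

    export OS_AUTH_URL=http://192.168.56.101:5000/v3
    export OS_USERNAME=admin
    export OS_PASSWORD=my_openstack_password
    export OS_PROJECT_NAME=admin
    export OS_USER_DOMAIN_NAME=Default
    export OS_PROJECT_DOMAIN_NAME=Default
    # should print a token if Keystone is reachable from this VM and the credentials are valid
    openstack token issue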
What I'm trying to get my head around is how it's possible that this doesn't work: whenever I log in to the MANO web interface (after running the osm vim-create command) and go to VIM accounts, the operational state equals "error".
Any kind of help would be much appreciated, as I've been struggling with this for a week now.
Thanks in advance!
I had the same problem. At the beginning I thought it was a network problem, but I finally found out it was due to an SSL problem. The easiest solution is to pass a flag that skips SSL verification until the developers fix it: --config '{insecure: True}'.
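Applied to the vim-create command from the question, that would look something like this (credentials unchanged, only the extra flag added):

    osm vim-create --name openstack-site --user admin --password my_openstack_password \
      --auth_url http://192.168.56.101:5000/v3 --tenant admin \
      --account_type openstack --config '{insecure: True}'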
I also encountered this problem after installing OSM 10 and OpenStack Ussuri on Ubuntu 18.04 a few days ago. I solved it by changing the URL from "--auth_url http://192.168.23.18:5000/v3" to "--auth_url http://controller:5000/v3" and putting "192.168.23.18 controller" into the RO container's /etc/hosts. The "controller" here is the host name of the machine where you installed OpenStack, which is used in your Keystone authentication URLs. Maybe you have already solved this, but the problem is so troublesome that I hope it doesn't annoy anyone else.
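In case it helps, here is a rough sketch of the two steps described above. How you reach the RO container depends on how OSM was installed; the docker command below assumes a Docker-based deployment, and the container name is just a placeholder:

    # map the "controller" name to the OpenStack host inside the RO container
    docker exec -it <ro-container-name> bash -c 'echo "192.168.23.18 controller" >> /etc/hosts'
    # then register the VIM using the host name instead of the raw IP
    osm vim-create --name openstack-site --user admin --password <password> \
      --auth_url http://controller:5000/v3 --tenant admin --account_type openstack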

Legacy GCE and GKE metadata requests from google_daemon/manage_addresses.py

I have an old Debian Compute Engine instance (created and running since December 2013) and got an email warning about the turndown of Legacy GCE and GKE metadata server endpoints (more details at https://cloud.google.com/compute/docs/migrating-to-v1-metadata-server).
I followed the directions for locating the process and found that the requests were coming from /usr/share/google/google_daemon/manage_addresses.py. The script seems to be the same as what's at https://github.com/gtt116/gce/blob/master/google_daemon/manage_addresses.py (also with what's in that directory).
I don't recall installing this, so I imagine it came with the provided Debian image I used in 2013.
Does anyone know what this manage_addresses.py script is, what it does, and what I should do with it now that the legacy metadata server endpoints are turning down? Is it safe to just stop running it? Or is there a new script I should replace it with? Or should I just try to update it myself to use the new endpoint?
I dug around and was able to trace /usr/share/google/google_daemon/manage_addresses.py as being installed by a package called google-compute-daemon. A search for that brought me to https://github.com/GoogleCloudPlatform/compute-image-packages#troubleshooting, which explains that google-compute-daemon has been replaced by python-google-compute-engine. That led me to https://cloud.google.com/compute/docs/images/install-guest-environment. I followed the instructions there and manually installed the guest environment.
I noticed during installation that it said it was removing the google-compute-daemon package (and a package called google-startup-scripts), so this seems like the right thing. And I'm no longer seeing any requests to the legacy endpoints. So it seems that at some point the old guest environment failed to update.
TL;DR: If you have this problem, follow the instructions at https://cloud.google.com/compute/docs/images/install-guest-environment#installing_guest_environment to manually update the guest environment.
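If you want to confirm which of these packages are present before and after switching, something like the following should work on a Debian-based image (package names taken from the ones mentioned above; exact names vary by image and OS release):

    # old packages that should disappear after installing the new guest environment
    dpkg -l | grep -E 'google-compute-daemon|google-startup-scripts'
    # newer guest environment packages
    dpkg -l | grep -E 'python-google-compute-engine|google-compute-engine'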

OpenShift system and package updates/patches

How does one keep OpenShift gears up-to-date? For example, updates to:
The Linux kernel
Important components/libraries like libc
Apache
Apache modules like mod_wsgi
Python
Python packages
Does OpenShift automatically update these and then restart the gear (or reboot the node)? Or does OpenShift send email notifications and the end-user can restart the gear during maintenance windows? What is the model?
What got me thinking about this was back in January there was a remote-code-execution bug in Ruby on Rails that everyone had to patch immediately.
This FAQ seems to suggest that some level of upgrade happens automatically, but it isn't clear whether this applies only to the OpenShift-specific code or also to other components like the kernel, Apache, etc.
I can tell you from my experience that changes to the OpenShift system are not always automatic. They made a change about 10 days ago and I'm still tracking down what they did so I can get my app running correctly again. As far as I know, no email was sent. I did find a blog post covering some of the major changes, but not all of them. Of course, they introduced at least one bug that I know of. YMMV.
My experiences over the last few weeks have been the following:
Last week there seemed to be an unannounced reboot of the server. I detected this by logging from a custom action hook. I didn't receive any email about it and I didn't see any notice at https://twitter.com/openshift_ops or https://openshift.redhat.com/app/status.
This week, there was the Heartbleed OpenSSL vulnerability and it seems like some gears were restarted. I didn't receive any email about it, Twitter didn't show anything, but there was information on the status page.

Struts Hibernate Mysql Tomcat based application hangs

I have an application running on Apache 2.2 and Tomcat 6; it uses the Struts and Hibernate frameworks, with MySQL at the back end. We also interact with third-party servers to place requests on behalf of our users. Due to confidentiality constraints I cannot tell you exactly what we do, but I can assure you that we have not customized anything and we use the most general builds available for Apache, Tomcat, etc. We run on Linux. Lots of visitors come to our site, where they first pay using payment gateways and then buy a product; to buy the product we again have to hit a third-party site. It's a simple e-commerce kind of setup. The problem is that sometimes the server hangs: it stops responding, and when we click on a link that (I know) is served by the Tomcat container, it does not load. Here is what I need help with:
Since my hosting is on a headless Linux platform, please suggest a good debugging tool.
We have logging in place and print the stack trace of almost every exception (when one happens), and we always monitor catalina.out, but when the server hangs we don't see any activity in catalina.out. Maybe this can give someone a clue.
We have show_sql disabled for Hibernate; we tried enabling it, but that was still not sufficient to figure out whether the application gets stuck on a query. We also have the slow query log enabled, but it does not show any significant queries. How can we check if my application is stuck on a query?
If my application is not stuck on a query, how can I know where it is stuck?
How can I get a Java stack dump?
What are possible ways to resolve such a problem?
Any suggestions are welcome. I thank you all in advance for reading my question and for the time you will spend writing an answer.
Facing your situation, I would take a thread dump when the servlet container "hangs". A thread dump gives you a list of the stack traces of all Java threads in a given JVM.
Doing a thread dump is pretty easy:
Find Tomcat's process ID, e.g. ps -ef | grep java
Send a SIGQUIT to the process, e.g. kill -3 tomcatProcessId
You will find the thread dump's content in TOMCAT_HOME/logs/catalina.out.
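Concretely, on a headless box that is roughly the following (the process ID and file locations are examples; jstack is only available if a full JDK is installed):

    # find the Tomcat JVM's process id
    ps -ef | grep java
    # ask the JVM for a thread dump; it keeps running and writes the dump to catalina.out
    kill -3 <tomcatProcessId>
    # alternative with a JDK installed: write the dump to a separate file
    jstack <tomcatProcessId> > /tmp/threaddump.txt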

Are there any disadvantages to using Bitnami vs a native server stack?

I have read about the advantages of using a BitNami stack for LAMP development; now I am wondering whether there are any drawbacks to using BitNami vs manually installing PHP, MySQL, and Apache separately. I use Mac OS, but I would be interested in how this applies to both Mac and Windows. Any thoughts?
I am one of the developers of BitNami. Whether to use a native stack or a BitNami stack depends on what you are trying to do. Installing the individual items separately should give you exactly the same result as running our installer, and the whole reason we put the installers together is so you would not have to :) On the Mac, one advantage of BitNami is that you can have more up-to-date components and multiple installations. A disadvantage / difference is that the applications and paths will be different from the typical ones, so if you are using third-party tutorials or documentation, they may not work right away.
There are 3 common drawbacks to Bitnami vs. a native LEMP/LAMP stack:
File paths. Because Bitnami takes a container-style approach to web stacks, it installs everything on Ubuntu (or whatever Linux distro) under the /opt/bitnami directory. So many developers who are used to customizing their stack with the nano or vim editors (via the Bash shell) quickly discover that they first have to figure out where all the different configuration files of their stack modules reside. Even after you figure those out, most of the online tutorials and documentation you find will not apply to your stack.
Lockdown. This can be seen as either an advantage or a disadvantage, depending on your perspective (and situation). The entire point of a containerized approach is to have more control over the stack environment, which can improve compatibility, predictability, security, and so on. However, as #team-life mentioned, this can quickly become frustrating when you are trying to use "standard" Bash shell commands or even the MySQL CLI, e.g. when analyzing or replicating your stack. To put it simply, logging into the shell on a server where Bitnami is installed is not, in fact, logging into the actual shell :)
Upgrades. At the end of the day, Bitnami (and other containers, like Docker) add another "layer" to your stack, and thus more bloat. For some users this "bloat" is justifiable, even preferable (for example, very large companies that require across-the-board uniformity). But what many developers discover with Bitnami and containers is that upgrading the stack can be rather janky. For all the alleged advantages in terms of environment "stability", upgrading your stack can actually introduce quite a bit of instability and unpredictability, often to the extent of canceling out the benefits. As #domi mentioned, all upgrades run through Bitnami (not the Ubuntu mirrors, etc.), meaning you are bound to their versions and release schedules; you are also often required to completely reinstall the stack...
Ultimately, containers are a recent trend that has become very popular among so-called "enterprise" and "corporate" in-house teams, but they might not be the best thing for smaller agencies or independent developers to embrace.
That is why native LEMP stacks like SlickStack (my project) are gaining momentum.
This Reddit thread has a few other AWS-specific comments as well.
BitNami uses paths that are very different from the industry-standard ones, so if you are trying to log in to a server to do some task, it will take you a lot of time to understand their custom-made folder structure. And that's a big drawback. When you log in to a Unix server, you know where the files and paths are; maybe there are one or two standard options. BitNami uses a completely different one. Chaos ensues.
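To give a concrete (and assumed, since exact paths vary by stack and version) example of the difference:

    # typical native Debian/Ubuntu locations
    /etc/apache2/apache2.conf
    /etc/mysql/my.cnf
    # typical Bitnami LAMP locations
    /opt/bitnami/apache2/conf/httpd.conf
    /opt/bitnami/mysql/my.cnf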
I'm a happy Bitnami stack user. It's a great stack, and I could describe many advantages.
The drawback of using a Bitnami stack is the update cycle. For example, on a Debian/Ubuntu based system you cannot use the standard apt-get update/upgrade.
That means some security updates might not reach your system as fast as they would through your standard cron (automated periodic) update mechanism.
To upgrade the system you need to create a backup, install a new stack, then import the backup into the new stack, which might not be an ideal procedure.
Some people categorize that as unsuitable for a production environment.
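In other words, something like the following keeps the operating system itself patched but does not touch the stack components Bitnami ships under /opt/bitnami (shown only to illustrate the split; adjust to your distro):

    # updates Debian/Ubuntu system packages only
    sudo apt-get update && sudo apt-get upgrade
    # the Apache/MySQL/PHP binaries under /opt/bitnami are not managed by apt,
    # so they stay at whatever version the Bitnami installer shipped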
Bitnami: ease of use, validated components, a known-good working configuration.
Disadvantage: patches and updates. You cannot update packages for security the way you can with a native install. Any security bulletin must be addressed by the Bitnami team, who may/will roll out an update to address the issue. Bitnami updates are full-stack upgrades, meaning you can't just upgrade a single component (PHP, for example); you need to upgrade the whole Bitnami stack, and the often-recommended method is to back up your application database, install a parallel Bitnami stack that has the latest updates, then restore or migrate to the new installation.
Some will tell you that you can shoehorn patches into Bitnami stacks, but it's not at all recommended, will lead you off the stack, and will most likely cause you downstream issues.
Bitnami is evidently unable to use certain commands from its MySQL command line, which I'm finding very frustrating. Here is some of what I found out:
It puts you into its own bash shell (bash-4.2#).
From the mysql> prompt, SHOW MASTER STATUS returns nothing; it doesn't seem to work.
rcmysql start or stop doesn't work from mysql>; you have to shell out of where you're at and run ctlscript.sh, which is a pain.
Just to get to the command line you have to run ./use_lampstack.
I'm guessing that they are giving us a very pared-down group of MySQL commands because there will be less for them to support and less for people to jack up.
This came up for me because I was trying to set up replication. I was following directions from someone who had a "regular" install, and it was difficult to follow because most of the commands he suggested didn't work from the Bitnami mysql> command line. So while I really like the uniformity of Bitnami and its modular nature, I have run into a snag trying to set up replication.
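For anyone hitting the same wall, the Bitnami-specific helpers mentioned above are used roughly like this (the /opt/bitnami path is an assumption; native Linux installers may put the stack under your home directory instead):

    cd /opt/bitnami                # or wherever the stack was installed
    ./use_lampstack                # opens the Bitnami bash shell with the stack's paths set up
    ./ctlscript.sh status          # report the status of the bundled services
    ./ctlscript.sh restart mysql   # start/stop/restart individual services from outside mysql>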