Open Splunk forwarder/receiver TCP port on OpenShift

I need to open TCP port 9997 on OpenShift so Splunk is able to listen for incoming data from forwarders on other servers.
I've set up Splunk using this guide: http://www.kelvinism.com/2013/11/free-splunk-hosting.html but I can't figure out how to add another TCP port to the manifest.yml file. I tried the following for a new OpenShift instance, but with no luck.
- Private-IP-Name: IP
  Private-Port-Name: PORT_FORWARDER
  Private-Port: 9997
  Public-Port-Name: PROXY_PORT_FORWARDER
  Options: { "ssl_to_gear": true }
Do I need to configure other parts of the cartridge to read my new port and set up some configuration elsewhere?

You will only be able to listen publicly on ports 80/443/8000/8443; no other TCP or UDP ports are allowed in (except 22 for ssh/scp/sftp). The private port that you have configured is for internal access only (either on the same gear, or on its own gear as part of a scaled application). Having remote agents connect to your application on port 9997 just won't work.

Alternatively, you can write a very simple Splunk add-on to listen on that port; that's very straightforward.
Splunk has an SDK you can use to implement it in various languages. Here is a skeleton for Python. For more information, you can see a full example of a UDP receiver: link to the example (it's not an English post, but you can read the code from there).
import sys
from splunklib.modularinput import *

class MyScript(Script):
    def get_scheme(self):
        # Returns the scheme.
        pass

    def validate_input(self, validation_definition):
        # Validates the input.
        pass

    def stream_events(self, inputs, ew):
        # Splunk Enterprise calls the modular input,
        # streams XML describing the inputs to stdin,
        # and waits for XML on stdout describing events.
        # TODO: implement a socket to listen and receive the
        # message, then send it with Event()
        pass

if __name__ == "__main__":
    sys.exit(MyScript().run(sys.argv))
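To give an idea of what the TODO could look like, here is a minimal sketch of a TCP receiver built on that skeleton. It is not a complete add-on: the input name "tcp_receiver", the "port" argument, and the line-per-event framing are assumptions for illustration, and on OpenShift you may need to bind to the gear's private IP rather than 0.0.0.0.
import socket
import sys
from splunklib.modularinput import Script, Scheme, Argument, Event

class TcpReceiver(Script):
    def get_scheme(self):
        # "tcp_receiver" and the "port" argument are illustrative names.
        scheme = Scheme("tcp_receiver")
        scheme.description = "Listens on a TCP port and indexes each received line."
        scheme.add_argument(Argument("port"))
        return scheme

    def stream_events(self, inputs, ew):
        for input_name, input_item in inputs.inputs.items():
            port = int(input_item["port"])  # e.g. 9997
            server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            # On OpenShift, bind to the gear's private IP instead of 0.0.0.0.
            server.bind(("0.0.0.0", port))
            server.listen(5)
            while True:
                conn, _addr = server.accept()
                line = conn.makefile().readline()
                conn.close()
                if line:
                    # Forward the received line to Splunk as one event.
                    ew.write_event(Event(data=line.strip(), stanza=input_name))

if __name__ == "__main__":
    sys.exit(TcpReceiver().run(sys.argv))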

How to allow IP dynamically using ingress controller

My structure
Kubernetes cluster on GKE
Ingress controller deployed using helm
An application which returns a list of IP ranges (note: it gets updated periodically)
curl https://allowed.domain.com
172.30.1.210/32,172.30.2.60/32
The secured application, which is the part that is not working
What am I trying to do?
Have my clients' IPs available from my API endpoint, which is done:
curl https://allowed.domain.com
172.30.1.210/32,172.30.2.60/32
Deploy my example app with an ingress that pulls the list from https://allowed.domain.com and allows only those clients to access the app
What did I try that didn't work?
Deploying the application with the include feature of nginx:
nginx.ingress.kubernetes.io/configuration-snippet: |
include /tmp/allowed-ips.conf;
deny all;
Yes, it works, but the problem is that when /tmp/allowed-ips.conf gets updated, the ingress config doesn't.
I tried to use an if condition to pull the IPs from the endpoint and deny access if the user is not in the list:
nginx.ingress.kubernetes.io/configuration-snippet: |
set $deny_access off;
if ($remote_addr !~ (https://2ce8-73-56-131-204.ngrok.io)) {
set $deny_access on;
}
I am using nginx.ingress.kubernetes.io/whitelist-source-range annotation but that is not what I am looking for
None of the options are working for me.
From the official docs of ingress-nginx controller:
The goal of this Ingress controller is the assembly of a configuration file (nginx.conf). The main implication of this requirement is the need to reload NGINX after any change in the configuration file. Though it is important to note that we don't reload Nginx on changes that impact only an upstream configuration (i.e Endpoints change when you deploy your app)
After the nginx ingress resource is initially created, the ingress controller assembles the nginx.conf file and uses it for routing traffic. The Nginx web server does not auto-reload its configuration when nginx.conf and other config files are changed.
So, you can work around this problem in several ways:
Update the k8s ingress resource with the new IP addresses and then apply the changes to the Kubernetes cluster (kubectl apply / kubectl patch / something else); this covers your options 2 and 3.
Run nginx -s reload inside the ingress Pod to reload the nginx configuration; this covers your option 1 with the included allow-list file.
$ kubectl exec ingress-nginx-controller-xxx-xxx -n ingress-nginx -- nginx -s reload
Try writing a Lua script (there are good examples for Nginx+Lua+Redis here and here). You should have a good understanding of nginx and Lua to estimate whether it is worth trying.
Sharing what I implemented at my workplace. We had a managed monitoring tool called Site24x7. The tool pings our servers from their VMs with dynamic IPs, and we had to automate the whitelisting of those IPs on GKE.
nginx.ingress.kubernetes.io/configuration-snippet allows you to set arbitrary Nginx configurations.
Set up a K8s CronJob resource in the specific namespace.
The CronJob runs a shell script, which
fetches the list of IPs to be allowed (curl, getent, etc.)
generates a set of NGINX configurations (= the value for nginx.ingress.kubernetes.io/configuration-snippet)
runs a kubectl command which overwrites the annotation of the target ingresses.
Example shell/bash script:
#!/bin/bash
site24x7_ip_lookup_url="site24x7.enduserexp.com"
site247_ips=$(getent ahosts $site24x7_ip_lookup_url | awk '{print "allow "$1";"}' | sort -u)
ip_whitelist=$(cat <<-EOT
# ---------- Default whitelist (Static IPs) ----------
# Office
allow vv.xx.yyy.zzz;
# VPN
allow aa.bbb.ccc.ddd;
# ---------- Custom whitelist (Dynamic IPs) ----------
$site247_ips # Here!
deny all;
EOT
)
for target_ingress in $TARGET_INGRESS_NAMES; do
kubectl -n $NAMESPACE annotate ingress/$target_ingress \
--overwrite \
nginx.ingress.kubernetes.io/satisfy="any" \
nginx.ingress.kubernetes.io/configuration-snippet="$ip_whitelist" \
description="*** $(date '+%Y/%m/%d %H:%M:%S') NGINX annotation 'configuration-snippet' updated by cronjob $CRONJOB_NAME ***"
done
The shell/bash script can be stored as ConfigMap to be mounted on the CronJob resource.
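For reference, a minimal sketch of what such a CronJob could look like, assuming the script above is stored in a ConfigMap named allowed-ips-script and mounted as /scripts/update-whitelist.sh, and that a ServiceAccount with permission to annotate ingresses exists. All names, the schedule, and the image are illustrative, not part of the original setup:
apiVersion: batch/v1
kind: CronJob
metadata:
  name: update-ingress-whitelist
spec:
  schedule: "*/30 * * * *"                       # run every 30 minutes
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: ingress-annotator  # needs RBAC to annotate ingresses
          restartPolicy: OnFailure
          containers:
          - name: updater
            image: bitnami/kubectl               # any image providing kubectl and bash
            command: ["bash", "/scripts/update-whitelist.sh"]
            env:
            - name: NAMESPACE
              value: default
            - name: TARGET_INGRESS_NAMES
              value: "my-app-ingress"
            - name: CRONJOB_NAME
              value: update-ingress-whitelist
            volumeMounts:
            - name: script
              mountPath: /scripts
          volumes:
          - name: script
            configMap:
              name: allowed-ips-script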

How is the traffic to the openshift_cluster_hostname redirected to the OpenShift web console

Question 1 :
1.1. Who is sitting behind the "openshift_master_cluster_public_hostname" hostname? Is it the web console (the web console service? or the web console deployment?) or something else?
1.2. When doing oc get service -n openshift-web-console I can see that the web console is running on 443. Isn't it supposed to work on port 8443? Same thing for the API server: shouldn't it be working on port 8443?
1.3. Can you explain to me the flow of a request to https://openshift_master_cluster_public_hostname:8443?
1.4. in the documentation is
Question 2:
Why do I get different responses for curl and wget?
When I run curl https://openshift_master_cluster_public_hostname:8443, I get:
{
"paths": [
"/api",
"/api/v1",
"/apis",
"/apis/",
"/apis/admissionregistration.k8s.io",
"/apis/admissionregistration.k8s.io/v1beta1",
"/apis/apiextensions.k8s.io",
"/apis/apiextensions.k8s.io/v1beta1",
...
"/swagger.json",
"/swaggerapi",
"/version",
"/version/openshift"
]
}
When I run wget https://openshift_master_cluster_public_hostname:8443, I get an index.html page.
Is the web console answering this request or the
Question 3 :
How can I expose the web console on port 443 rather than 8443? I found several solutions:
Using the variables openshift_master_console_port and openshift_master_api_port, but I found out that these are 'internal' ports and not designed to be the public ports, so changing these ports could crash your OpenShift setup.
Using an external service (described here).
I'm trying to set up port forwarding on an external HAProxy. Is it doable?
Answer to Q1:
1.1. Quoting from the documentation Configuring Your Inventory File:
This variable overrides the public host name for the cluster,
which defaults to the host name of the master. If you use an
external load balancer, specify the address of the external load balancer.
For example:
> openshift_master_cluster_public_hostname=openshift-ansible.public.example.com
This means that this variable is the public-facing interface to the OpenShift web console.
1.2. A Service is a virtual object which connects the service name to the pods and is used to connect the Route object with the Service object. This is explained in the documentation Services. You can use almost any port for a Service because it's virtual and nothing will actually bind to this port.
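To illustrate the port-mapping point, a Service can expose a virtual port (e.g. 443) while the pods behind it listen on a different one (e.g. 8443). A minimal sketch; the name and selector here are made up for illustration and are not the actual web-console definitions:
apiVersion: v1
kind: Service
metadata:
  name: webconsole
  namespace: openshift-web-console
spec:
  selector:
    webconsole: "true"
  ports:
  - port: 443        # virtual port clients use to reach the Service
    targetPort: 8443 # port the console pod actually listens on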
1.3. The answer depends on your setup. I'll explain it for an HA setup with a TCP load balancer in front of the masters.
                             /-> Master API 1
client -> load balancer ----+--> Master API 2
                             \-> Master API 3
The client makes a request to https://openshift_master_cluster_public_hostname:8443, the load balancer forwards the client to Master API 1, 2, or 3, and the client gets the answer from the requested Master API server.
The API server redirects to the console if the request comes from a browser (https://github.com/openshift/origin/blob/release-3.11/pkg/cmd/openshift-kube-apiserver/openshiftkubeapiserver/patch_handlerchain.go#L60-L61).
Answer to Q2:
curl and wget behave differently because they are different tools, but the HTTPS request is the same.
To get curl's behavior with wget:
wget --output-document=- https://openshift_master_cluster_public_hostname:8443
To get wget's behavior with curl:
curl -o index.html https://openshift_master_cluster_public_hostname:8443
Why - works this way is described in Usage of dash (-) in place of a filename.
Answer to Q3:
You can use the OpenShift router, which you already use for your apps, to make the web console available on 443. It's a little bit outdated, but the concept is the same for the current 3.x versions: Make OpenShift console available on port 443 (https) [UPDATE]

Tinyproxy not forwarding requests. Getting Unauthorized connection from <IP>

I have installed Tinyproxy on a CentOS 7 machine and changed the port to 8080 in tinyproxy.conf.
Whenever I send a request I get the following logs in tinyproxy.log:
CONNECT Mar 15 08:14:42 [22148]: Connect (file descriptor 6): <IP> [<IP>]
NOTICE Mar 15 08:14:42 [22148]: Unauthorized connection from "<IP>" [<IP>].
INFO Mar 15 08:14:42 [22148]: Read request entity of 1200 bytes
My request is reaching the proxy, but the proxy is not forwarding it to the destination.
In the Tinyproxy config file (/etc/tinyproxy/tinyproxy.conf) you can use the Allow directive to explicitly specify the host(s) that will be connecting to the proxy. You can also comment out or remove all Allow <host> lines to allow connections from all hosts. See below description from the config file (here I've commented out Allow 127.0.0.1 and since there are no other entries all connections will be allowed):
# Allow: Customization of authorization controls. If there are any
# access control keywords then the default action is to DENY. Otherwise,
# the default action is ALLOW.
#
# The order of the controls are important. All incoming connections are
# tested against the controls based on order.
#
#Allow 127.0.0.1
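Conversely, if you want to restrict access rather than allow everyone, you can list the permitted hosts or subnets explicitly; the addresses below are just examples:
Allow 127.0.0.1
Allow 10.0.0.0/8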

OpenShift Hazelcast

Is it possible to open a port for hazelcast on openshift? No matter what port I try, I get the same exception:
SocketException: Permission denied
I am not trying to open the port to the world. I just want to open a port so the gears can use Hazelcast. It seems like this should be possible.
You'll probably have to use an HTTP tunnel to connect Hazelcast; not a nice solution, but I prototyped it some time ago: https://github.com/noctarius/https-tunnel-openshift-hazelcast
Anyhow, gears mean OpenShift V2, don't they? I never tried it with V2; if you get the chance, there's support for V3 (and V3.1): http://blog.hazelcast.com/openshift/
What cartridge type do you use?
You can bind to any port from 15000 to 35530 internally, but other gears won't be able to access it.
From my experience - I had to open the public proxy port for other members of the cluster to join.
For example, Vert.x cartridge uses Hazelcast for clustering and has some additional public proxy ports open (see https://github.com/vert-x/openshift-cartridge/blob/master/metadata/manifest.yml).
Endpoints:
  - Private-IP-Name: IP
    Private-Port-Name: PORT
    Private-Port: 8080
    Public-Port-Name: PROXY_PORT
    Mappings:
      - Frontend: ""
        Backend: ""
        Options: { "websocket": 1 }
  - Private-IP-Name: IP
    Private-Port-Name: HAZELCAST_PORT
    Private-Port: 5701
    Public-Port-Name: HAZELCAST_PROXY_PORT
  - Private-IP-Name: IP
    Private-Port-Name: CLUSTER_PORT
    Private-Port: 9123
    Public-Port-Name: CLUSTER_PROXY_PORT
(see https://access.redhat.com/documentation/en-US/OpenShift_Online/2.0/html/Cartridge_Specification_Guide/chap-Exposing_Services.html).
On OpenShift, you should only bind websockets to either port 8000 or 8443.
See:
https://developers.openshift.com/en/managing-port-binding-routing.html
https://blog.openshift.com/paas-websockets/

Flask/MySQL app seems to be blocking concurrent requests when query is complicated

I have a simple, single page Flask (v0.8) application that queries a MySQL database and displays results for each request based on different request params. The application is served using Tornado over Nginx.
Recently I've noticed that the application seems to block concurrent requests from different clients while a DB query is still running. E.g.:
A client makes a request with a complicated DB query that takes a while to complete (> 20 sec).
A different client makes a request to the server and is blocked until the first query returns.
So basically the application behaves like a single process that serves everyone. I was thinking the problem was with a shared DB connection on the server, so I started using the dbutils module for connection pooling. That didn't help. I think I'm probably missing something big in the architecture or the configuration of the server, so I'd appreciate any feedback on this.
This is the Flask code that performs the DB querying (simplified):
# ... flask imports and such
import MySQLdb
from DBUtils.PooledDB import PooledDB

POOL_SIZE = 5

class DBConnection:
    def __init__(self):
        self.pool = PooledDB(MySQLdb,
                             POOL_SIZE,
                             user='admin',
                             passwd='sikrit',
                             host='localhost',
                             db='data',
                             blocking=False,
                             maxcached=10,
                             maxconnections=10)

    def query(self, sql):
        "execute SQL and return results"
        # obtain a connection from the pool and
        # query the database
        conn = self.pool.dedicated_connection()
        cursor = conn.cursor()
        cursor.execute(sql)
        # get results and terminate connection
        results = cursor.fetchall()
        cursor.close()
        conn.close()
        return results

global db
db = DBConnection()

@app.route('/query/')
def query():
    if request.method == 'GET':
        # perform some DB querying based on query params
        sql = process_request_params(request)
        results = db.query(sql)
        # parse, render, etc...
Here's the tornado wrapper (run.py):
#!/usr/bin/env python
import tornado
from tornado.wsgi import WSGIContainer
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop
from myapplication import app
from tornado.options import define, options

define("port", default=8888, help="run on the given port", type=int)

def main():
    tornado.options.parse_command_line()
    http_server = HTTPServer(WSGIContainer(app), xheaders=True)
    http_server.listen(options.port)
    IOLoop.instance().start()

if __name__ == '__main__':
    main()
Starting the app via startup script:
#!/bin/sh
APP_ROOT=/srv/www/site
cd $APP_ROOT
python run.py --port=8000 --log_file_prefix=$APP_ROOT/logs/app.8000.log > /dev/null 2>&1 &
python run.py --port=8001 --log_file_prefix=$APP_ROOT/logs/app.8001.log > /dev/null 2>&1 &
And this is the nginx configuration:
user nginx;
worker_processes 1;

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
    worker_connections 1024;
    use epoll;
}

http {
    upstream frontends {
        server 127.0.0.1:8000;
        server 127.0.0.1:8001;
    }

    include /usr/local/nginx/conf/mime.types;
    default_type application/octet-stream;

    # ..
    keepalive_timeout 65;
    proxy_read_timeout 200;
    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    gzip on;
    gzip_min_length 1000;
    gzip_proxied any;
    gzip_types text/plain text/html text/css text/xml application/x-javascript
               application/xml application/atom+xml text/javascript;
    proxy_next_upstream error;

    server {
        listen 80;
        root /srv/www/site;

        location ^~ /static/ {
            if ($query_string) {
                expires max;
            }
        }

        location / {
            proxy_pass_header Server;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Scheme $scheme;
            proxy_pass http://frontends;
        }
    }
}
This is a small application that serves a very small client base, and most of it is legacy code I inherited and never got around to fixing or rewriting. I've only noticed the problem after adding more complex query types that took longer to complete. If anything jumps out, I'd appreciate your feedback. Thanks.
The connection pool doesn't make MySQLdb asynchronous. The results = cursor.fetchall() call blocks Tornado until the query is complete.
That's what happens when using non-asynchronous libraries with Tornado. Tornado is an IO loop; it's one thread. If you have a 20 second query, the server will be unresponsive while it waits for MySQLdb to return. Unfortunately, I'm not aware of a good async python MySQL library. There are some Twisted ones but they introduce additional requirements and complexity into a Tornado app.
The Tornado guys recommend abstracting slow queries into an HTTP service, which you can then access using tornado.httpclient. You could also look at tuning your query (>20 seconds!), or running more Tornado processes. Or you could switch to a datastore with an async python library (MongoDB, Postgres, etc).
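As a rough sketch of the "running more Tornado processes" suggestion (reusing the run.py from the question): Tornado can fork several worker processes that share one listening socket, so a 20-second query only ties up the process that is handling it. This assumes the app keeps no per-process state that must be shared, and it is a mitigation rather than a fix for the blocking itself.
#!/usr/bin/env python
from tornado.httpserver import HTTPServer
from tornado.ioloop import IOLoop
from tornado.wsgi import WSGIContainer

from myapplication import app  # the Flask app from the question

def main():
    http_server = HTTPServer(WSGIContainer(app), xheaders=True)
    http_server.bind(8000)     # bind the shared listening socket before forking
    http_server.start(0)       # 0 = fork one worker process per CPU core
    IOLoop.instance().start()  # each worker runs its own IOLoop

if __name__ == '__main__':
    main()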
What kind of 'complicated DB queries' are you running? Are they just reads, or are you updating the tables? Under certain circumstances, MySQL must lock the tables, even on what might seem like read-only queries. This could explain the blocking behavior.
Additionally, I'd say any query that takes 20 seconds or more to run and that is run frequently is a candidate for optimization.
So, as we know, standard MySQL drivers are blocking, so the server will block while a query is executing. Here is a good article about how you can achieve non-blocking MySQL queries in Tornado.
By the way, as Mike Johnston mentioned, if your query takes more than 20 seconds to execute, it is very long. My suggestion is to find a way to move this query into the background. Tornado does not include an asynchronous MySQL driver in its package, because the guys at FriendFeed did their best to make their queries execute really fast.
Also, instead of using a pool of 20 synchronous database connections, you can start 20 server instances with 1 connection each and use nginx as a reverse proxy for them. They will be more bulletproof than a pool.