I am planning to run an SSIS ETL job whose SOURCE db is a SQL Server on a physical on-premise machine and whose DESTINATION db (Postgres/Patroni) runs on the OpenShift platform as pods/containers. The issue I am facing is that the DB hosted on OpenShift cannot be exposed via a TCP port. According to a few articles online, OpenShift only allows HTTP traffic via “routes”. Is this assumption right? If so, how do people in the real world run ETL, bulk data transfers, or migrations into a DB on OpenShift from outside? I am reluctant to use HTTP since I feel it is not efficient for ETL. A few folks suggested using oc port-forward, but how can OpenShift port forwarding be stable enough for a production app? Please share your comments.
In a production environment it is questionable whether you want to expose your database to the public internet. Normally you would rather go with a site-to-site VPN.
That aside, it is correct that OCP uses routes for most use cases, and these expose an HTTP(S) endpoint. If you need plain TCP, however, you can create a service of type LoadBalancer.
The regular setup with a route is stacked like
route --> service --> pods, where the service is commonly of type ClusterIP.
With a service of type LoadBalancer, you eliminate the route and directly expose a TCP service.
If you run on a public cloud, OCP takes care of the remaining requirements for you, namely creating a load balancer with your cloud provider. On AWS, for example, OCP would create an ELB (Elastic Load Balancer) for you.
You can find more information in the documentation
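As a rough illustration of the LoadBalancer approach, here is a minimal sketch that builds such a Service manifest in Python and prints it as JSON. The service name "postgres-external" and the selector label "app: postgres" are assumptions for illustration; use whatever labels your Postgres/Patroni pods actually carry.

```python
import json


def loadbalancer_service(name, selector, port=5432):
    """Build a Service manifest of type LoadBalancer for one plain TCP port."""
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": "LoadBalancer",
            # Hypothetical labels: must match the labels on your database pods.
            "selector": selector,
            "ports": [
                {
                    "name": "postgres",
                    "protocol": "TCP",
                    "port": port,        # port exposed by the load balancer
                    "targetPort": port,  # port the database container listens on
                }
            ],
        },
    }


manifest = loadbalancer_service("postgres-external", {"app": "postgres"})
print(json.dumps(manifest, indent=2))
```

The printed JSON can be piped into oc apply -f - to create the service. Whether an external address is actually provisioned depends on your cloud provider or on-premise load balancer integration.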
Related
I need to access a Postgres database from my Java code which resides in an OpenShift cluster, and I need a way to do so without manually initiating port forwarding through the oc port-forward command.
I have tried using the OpenShift Java client's OpenShift connection factory class to get a connection by passing the server URL and the username and password I use to log in to the console, but it didn't help.
(This is mostly just a more detailed version of Will Gordon's comment, so credit to him.)
It sounds like you are trying to expose a service (specifically Postgres) outside of your cluster. This is very common.
However, the best method to do so depends a bit on your physical infrastructure, because we are by definition trying to integrate with your networking. Look at the docs for Getting Traffic into your Cluster. Routes are probably not what you want, because Postgres speaks a plain TCP protocol rather than HTTP. One of the other options in that chapter (Load Balancer, External IP, or NodePort) is probably your best choice, depending on your networking infrastructure and needs.
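Whichever of those options you pick, it helps to confirm that the exposed port really accepts TCP connections from where your client runs before debugging the database driver. A minimal sketch of such a check, using only the Python standard library (the host and port you pass in are whatever your Load Balancer, External IP, or NodePort setup gives you):

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, tcp_reachable("db.example.internal", 5432) with a hypothetical hostname would tell you whether the port is open at all. It does not check that Postgres itself answers, only that something accepts the connection.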
I have two openshift routers, running as pods, running in OSE.
However, I don't see any associated services in my OpenShift cluster that forward or load-balance traffic to them.
Should I expose my routers to the external world in a normal OSE environment?
Note that this is in a running OpenShift (OSE) cluster, so I do not think it would be appropriate to recreate the routers with new service accounts, and even if I did want to do this, it isn't always guaranteed that I will have the access inside OpenShift to do so.
If you are talking about the haproxy routers which are a part of the OpenShift platform, and which route external HTTP/HTTPS requests through to the pods of an application that has been exposed using a route, then no, you should not expose them, at least not as an OpenShift Route. Adding a Route for them would be circular, as the router is what implements the route.
The incoming port of the haproxy routers does need to be exposed outside of the cluster, but this should have been handled as part of the setup you did when the OpenShift cluster was installed. Exactly what you needed to do to prepare for that when installing the cluster depends on the target system into which OpenShift was installed.
It may be better to step back and explain the problem you are having. If it is an installation issue, you may be better off asking on one of the lists at:
https://lists.openshift.redhat.com/openshiftmm/listinfo
as that is more frequented by people more familiar with installing OpenShift.
Question: How can I provide reliable access from (non-K8s) services running in a GCE network to other services running inside Kubernetes?
Background: We are running a hosted K8s setup in the Google Cloud Platform. Most services are 12factor apps and run just fine within K8s. Some backing stores (databases) are run outside of K8s. Accessing them is easy by using headless services with manually defined endpoints to fixed internal IPs. Those services usually do not need to "talk back" to the services in K8s.
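For readers unfamiliar with the pattern, the "headless service with manually defined endpoints" setup can be sketched roughly as below: a Service with no selector and clusterIP set to None, plus an Endpoints object of the same name that lists the fixed internal IP. The name "legacy-db", the IP, and the port are made up for illustration.

```python
import json


def external_backend(name, ip, port):
    """Build a selector-less headless Service and a matching Endpoints object."""
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        # No selector: Kubernetes will not manage the endpoints for us.
        "spec": {"clusterIP": "None", "ports": [{"port": port}]},
    }
    endpoints = {
        "apiVersion": "v1",
        "kind": "Endpoints",
        # Must carry the same name as the Service to be associated with it.
        "metadata": {"name": name},
        "subsets": [{"addresses": [{"ip": ip}], "ports": [{"port": port}]}],
    }
    return service, endpoints


# Hypothetical values: a database on a fixed internal IP outside the cluster.
service, endpoints = external_backend("legacy-db", "10.240.0.15", 5432)
bundle = {"apiVersion": "v1", "kind": "List", "items": [service, endpoints]}
print(json.dumps(bundle, indent=2))
```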
But some services running in the internal GCE network (but outside of K8s) need to access services running within K8s. We can expose the K8s services using spec.type: NodePort and talk to this port on any of the K8s Nodes IPs. But how can we automatically find the right NodePort and a valid Worker Node IP? Or maybe there is even a better way to solve this issue.
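To make the NodePort lookup concrete, here is a sketch of what "automatically finding the right NodePort and a valid node IP" could look like if done by hand. It assumes you feed it the parsed output of kubectl get svc NAME -o json and kubectl get nodes -o json; the sample data below is invented and only mirrors the fields those commands return.

```python
def node_port(service, port_name=None):
    """Return the nodePort of a Service (first port, or the one named port_name)."""
    for port in service["spec"]["ports"]:
        if port_name is None or port.get("name") == port_name:
            return port["nodePort"]
    raise LookupError(f"no port named {port_name!r} in service")


def ready_internal_ips(node_list):
    """Return the InternalIP of every node whose Ready condition is True."""
    ips = []
    for node in node_list["items"]:
        status = node["status"]
        ready = any(
            c["type"] == "Ready" and c["status"] == "True"
            for c in status.get("conditions", [])
        )
        if not ready:
            continue
        for addr in status.get("addresses", []):
            if addr["type"] == "InternalIP":
                ips.append(addr["address"])
    return ips


# Invented sample data in the shape returned by kubectl ... -o json.
sample_service = {"spec": {"ports": [{"name": "http", "port": 80, "nodePort": 30080}]}}
sample_nodes = {
    "items": [
        {"status": {
            "conditions": [{"type": "Ready", "status": "True"}],
            "addresses": [{"type": "InternalIP", "address": "10.240.0.3"},
                          {"type": "Hostname", "address": "node-a"}]}},
        {"status": {
            "conditions": [{"type": "Ready", "status": "False"}],
            "addresses": [{"type": "InternalIP", "address": "10.240.0.4"}]}},
    ]
}

targets = [f"{ip}:{node_port(sample_service)}" for ip in ready_internal_ips(sample_nodes)]
print(targets)
```

This only shows where the information lives. A real setup would still have to refresh the list as nodes come and go.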
This setup is probably not a typical use-case for a K8s deployment, but we'd like to go this way until PetSets and Persistent Storage in K8s have matured enough.
As we are talking about internal services I'd like to avoid using an external loadbalancer in this case.
You can make cluster service IPs meaningful outside of the cluster (but inside the private network) either by creating a "bastion route" or by running kube-proxy on the machine you are connecting from (see this answer).
I think you could also point your resolv.conf at the cluster's DNS service to be able to resolve service DNS names. This could get tricky if you have multiple clusters though.
One possible way is to use an Ingress Controller. Ingress Controllers are designed to provide access from outside a Kubernetes cluster to services running inside the cluster. An Ingress Controller runs as a pod within the cluster and will route requests from outside the cluster to the correct services inside the cluster, based on the configured rules. This provides a secure and reliable way for non-Kubernetes services running in a GCE network to access services running in Kubernetes.
My env variable contains the host for the MySQL database, but it is an IP in the local network (it starts with 127...). How can I make MySQL available to the external world via a domain name for the db?
This is not possible. OpenShift is a Platform-as-a-Service (PaaS) that shields the internals of the implementation: access goes through an API connector such as PHP and a database cartridge, or through SSH tunneling. It does not expose an IP address for your MySQL server on port 3306 for use in development with DB libraries for C#, Java, Python, etc., or with MySQL Workbench or the like.
In fact, it is not your mysql server as much as it is a shared one.
Infrastructure-as-a-Service (IaaS) platforms such as AWS EC2 would allow those native port 3306 connections and expose a public IP address if you opened up the firewall for them.
With OpenShift, in order to connect with tools such as MySQL Workbench, you need a PKI key pair and an SSH tunnel. The same goes for a native app, say one written in C#, which would need the likes of SSH.NET. These configurations are bearable for a single developer but generally don't scale for a rollout to your users, unless you are up for the task of key management.
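To give an idea of what such a tunnel involves, here is a small sketch that assembles a standard ssh local port forward (-L) command. The user, host, and key path are placeholders, and the remote host/port should be whatever your MySQL env variables contain; 127.0.0.1:3306 is only an assumed default here.

```python
import os
import shlex


def ssh_tunnel_command(ssh_user, ssh_host, key_path,
                       local_port=3307, remote_host="127.0.0.1", remote_port=3306):
    """Build an ssh command forwarding local_port to remote_host:remote_port."""
    return [
        "ssh",
        "-N",                                # no remote command, tunnel only
        "-i", os.path.expanduser(key_path),  # private key of the key pair
        "-L", f"{local_port}:{remote_host}:{remote_port}",
        f"{ssh_user}@{ssh_host}",
    ]


# Placeholder values for illustration only.
cmd = ssh_tunnel_command("appuser", "app.example.com", "~/.ssh/id_rsa")
print(shlex.join(cmd))
```

While that command is running, a client such as MySQL Workbench would connect to 127.0.0.1 on port 3307 and the traffic would be carried over SSH to the database.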
It is one of the drawbacks, but also one of the security guarantees you can bank on. You can also enjoy its simplicity. But it has its shortcomings. I have converted some people away from Openshift once they have realized this. The same limitations exist with major shared hosts where SSH is the only way in.
I hope I have answered your question.
I have a Java-based web application developed on Amazon EC2. It handles transactions involving confidential information. I installed a MySQL server myself on the same Amazon instance, and the web application accesses the database via localhost. In Security Groups, I have created a custom security group where port 8080 (Tomcat) can be accessed only via localhost.
Considering these, do I still need SSL to make sure the transactions are secured?
It depends. Are you comfortable with plain text inside the datacenter? Then don't bother with SSL.
Are you worried about that traffic being sniffed locally (tcpdump) or from a malicious source (for instance, if data was being rerouted from the switch between EC2 instances)? Use SSL.
There's a trend of large companies making sure to encrypt local traffic.
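If you do decide to encrypt, the client side mostly comes down to verifying the server certificate and hostname. As a language-neutral illustration (in Python rather than Java, and not tied to any particular MySQL driver), a strict client TLS context could be set up like this; the optional ca_file parameter is an assumption for the case of a private CA:

```python
import ssl


def strict_client_context(ca_file=None):
    """Create a client TLS context that verifies certificates and hostnames."""
    # create_default_context enables certificate and hostname verification.
    ctx = ssl.create_default_context(cafile=ca_file)
    # Refuse protocol versions older than TLS 1.2.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = strict_client_context()
print(ctx.verify_mode, ctx.check_hostname)
```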