Managing SnappyData cluster with YARN - snappydata

Is it possible to manage a SnappyData cluster with YARN?
The requirement is to containerise the SnappyData cluster so that it can be managed by YARN.

You can find out more about using YARN with SnappyData here:
http://snappydatainc.github.io/snappydata/programming_guide/working_with_hadoop_yarn_cluster_manager/
In short, when SnappyData is used in Smart Connector mode, YARN can manage the Spark side of the deployment.
Also, a Docker image of SnappyData exists here:
http://snappydatainc.github.io/snappydata/quickstart/getting_started_with_docker_image/
And SnappyData-on-Kubernetes is coming out shortly. You can track it here:
https://github.com/SnappyDataInc/spark-on-k8s

Related

How to schedule jupyter notebooks on kubernetes?

These are the requirements:
MySQL and Jupyter Notebook (both should run on the Kubernetes cluster).
I need to run machine learning models from a Jupyter notebook that fetches its data from the MySQL database, and this whole task needs to be scheduled (just like cron scheduling) on the Kubernetes cluster.
I am new to Kubernetes, but I know Docker containerization and have built containerized applications before. Any help would be appreciated.
Regarding "scheduled (just like cron scheduling)": you can use the CronJob feature to run workloads on a schedule on Kubernetes.
Read more about CronJob: https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/
If you have a Docker image of the notebook or code that you plan to run, you can create a YAML config that runs that image as a CronJob, and Kubernetes will execute it as a scheduled task.
I have not used Jupyter Notebook, so I am not sure how it works.
For running MySQL on Kubernetes you can follow the same approach: write the YAML files, apply them to the K8s cluster, and your container will be deployed.
You can read more here: https://kubernetes.io/docs/tasks/run-application/run-single-instance-stateful-application/
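As a rough illustration of the CronJob approach, here is a minimal Python sketch that builds a batch/v1 CronJob manifest and prints it as JSON, which kubectl apply -f accepts as well as YAML. The job name, image name, schedule and notebook path are all hypothetical placeholders, and it assumes the image already contains Jupyter and the notebook, so adjust everything to your setup.

```python
import json


def build_cronjob(name, image, schedule, command):
    """Return a batch/v1 CronJob manifest as a plain dict."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": name},
        "spec": {
            # Standard five-field cron expression.
            "schedule": schedule,
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "containers": [
                                {"name": name, "image": image, "command": command}
                            ],
                            # Job pods must use OnFailure or Never.
                            "restartPolicy": "OnFailure",
                        }
                    }
                }
            },
        },
    }


# All values below are assumptions made for illustration only.
manifest = build_cronjob(
    name="notebook-train",
    image="registry.example.com/ml-notebook:latest",
    schedule="0 2 * * *",  # every day at 02:00
    command=[
        "jupyter", "nbconvert", "--to", "notebook",
        "--execute", "/work/train.ipynb",
    ],
)

print(json.dumps(manifest, indent=2))
```

Save the printed JSON to a file and apply it with kubectl apply -f. The MySQL connection details would normally be passed to the container through environment variables or a Secret rather than baked into the image.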

Install AppDynamics in OpenShift 4.X

I am looking for a way to install AppDynamics in an OpenShift cluster.
I am unable to find proper documentation on how to install it and which tools need to be installed.
Should my application's Dockerfile also include any images related to AppDynamics?
If anyone is familiar with this, please share some steps or point me to the relevant documents.
Old docs: https://docs.appdynamics.com/22.2/en/infrastructure-visibility/monitor-containers-with-docker-visibility/use-docker-visibility-with-red-hat-openshift
New Docs: https://docs.appdynamics.com/22.2/en/infrastructure-visibility/monitor-kubernetes-with-the-cluster-agent
Note that there is no single prescribed way to instrument the cluster; you need to make some decisions.
For example (from the second doc link):
The first decision is to use the officially released pre-built
Appdynamics Operator images published on DockerHub and Redhat
Registry or If you want to build a custom Appdynamics Operator image.
See Build the Custom Cluster Agent Image.
The second decision is whether to use the officially released
pre-built Cluster Agent images published on DockerHub and Redhat
Registry or If you want to build a custom Cluster Agent image. See
Cluster Agent Container Image.
The third decision is whether to install the Cluster Agent using the
Kubernetes CLI or the Cluster Agent Helm Chart. See Install the
Cluster Agent with the Kubernetes CLI and Install the Cluster Agent
with Helm Charts.

How to use CDK/minishift OpenShift cluster with kubectl

I have installed CDK on my Windows 10 laptop.
I am following documentation on using IBM Blockchain Platform with RedHat OpenShift.
One of the first steps is issuing kubectl commands.
I see CDK comes with the OpenShift CLI (oc) installed but not with kubectl. Do I need to install kubectl separately? If so, how do I configure kubectl to know about my OpenShift cluster running in CDK/minishift?
To answer your specific question, any time you see a "kubectl" command you can replace it with "oc".
You can also download kubectl directly from upstream. By default it uses the same ~/.kube/config file as oc; set $KUBECONFIG to override that.
However, you should know that CDK is based on OpenShift 3.11.z and is approaching end-of-life. I would suggest you take a look at CRC, which is based on 4.x. Start here for more information -- https://console.redhat.com/openshift/create/local
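To make the kubeconfig lookup mentioned above concrete, here is a small Python sketch of the resolution order: the paths listed in $KUBECONFIG if it is set, otherwise ~/.kube/config. The function name kubeconfig_paths is made up for this example, and the sketch only mirrors the lookup rule; it does not read or merge the files the way kubectl and oc do.

```python
import os
from pathlib import Path


def kubeconfig_paths(env=None, home=None):
    """Return the kubeconfig files that would be consulted.

    KUBECONFIG may hold several paths separated by the platform's
    path separator (':' on Linux/macOS, ';' on Windows).
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    raw = env.get("KUBECONFIG", "")
    paths = [Path(p) for p in raw.split(os.pathsep) if p]
    # Fall back to the default location when the variable is unset or empty.
    return paths or [home / ".kube" / "config"]


if __name__ == "__main__":
    for path in kubeconfig_paths():
        print(path, "(exists)" if path.exists() else "(missing)")
```

If the file printed here is the one that oc login wrote to, a separately installed kubectl should see the same cluster and context without extra configuration.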

Is mysql/mongodb cluster suitable for installation on kubernetes?

I tested installing a MongoDB sharded cluster on Kubernetes with Helm, but found that the Helm charts do not produce a properly working sharded cluster. The charts correctly create Pods with names like mongos-1, mongod-server-1 and mongod-shard-1, which looks like a correct sharded cluster layout, but the appropriate mongos and mongod server instances are not created on the corresponding Pods. Each Pod just runs a plain mongod instance, and there is no connection between them. Do I need to add scripts that execute commands similar to rs.addShard(config)? I ran into the same problem when installing a MySQL cluster with Helm.
What I want to know is: is it generally inappropriate to install a MySQL/MongoDB cluster on Kubernetes? Should the database be installed independently or deployed on Kubernetes?
Yes, you can deploy MongoDB instances on Kubernetes clusters.
Use a standalone instance for testing and development, and a replica set for production-like deployments.
Also to make things easier you can use MongoDB Enterprise Kubernetes Operator:
The Operator enables easy deploys of MongoDB into Kubernetes clusters,
using our management, monitoring and backup platforms, Ops Manager and
Cloud Manager. By installing this integration, you will be able to
deploy MongoDB instances with a single simple command.
This guide has references to the official MongoDB documentation with more necessary details regarding:
Install Kubernetes Operator
Deploy Standalone
Deploy Replica Set
Deploy Sharded Cluster
Edit Deployment
Kubernetes Resource Specification
Troubleshooting Kubernetes Operator
Known Issues for Kubernetes Operator
That covers basically everything you need to know on this topic.
Please let me know if that helped.

Hadoop Metrics Issue

I am provisioning a Hadoop cluster with an Ambari local repository. My Ambari server runs on one machine and the cluster nodes run on remote machines. I am able to create the cluster successfully.
However, I am not getting any metrics. Can you suggest what needs to be done?
This might be a proxy issue; see the proxy instructions in the Ambari manual. If that is the cause, the fix is to set -Dhttp.nonProxyHosts to the list of your remote nodes and restart Ambari Server.
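For reference, the value of http.nonProxyHosts is a list of host patterns separated by the | character. Here is a small Python sketch that builds the flag from a list of node names; the helper name and the host names are made up for illustration, and where the flag goes in the Ambari Server JVM options is described in the Ambari manual.

```python
def non_proxy_hosts_flag(hosts):
    """Build a -Dhttp.nonProxyHosts JVM flag from a list of host names.

    Java expects the hosts separated by '|'. Blank entries and
    duplicates are dropped while the original order is kept.
    """
    cleaned = (h.strip() for h in hosts)
    unique = list(dict.fromkeys(h for h in cleaned if h))
    return "-Dhttp.nonProxyHosts=" + "|".join(unique)


# Hypothetical cluster nodes; replace with your own host names.
nodes = ["node1.example.internal", "node2.example.internal", "localhost"]
print(non_proxy_hosts_flag(nodes))
```

Because | is a shell metacharacter, quote the whole flag if you put it in a shell script or pass it on a command line.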