How to make oc cluster up persistent? - openshift

I'm using "oc cluster up" to start my Openshift Origin environment. I can see, however, that once I shutdown the cluster my projects aren't persisted at restart. Is there a way to make them persistent ?
Thanks

There are a couple ways to do this. oc cluster up doesn't have a primary use case of persisting resources.
There are couple ways to do it:
Leverage capturing etcd as described in the oc cluster up README
There is a wrapper tool, that makes it easy to do this.

There is now an example in the cluster up --help command, it is bound to stay up to date so check that first
oc cluster up --help
...
Examples:
# Start OpenShift on a new docker machine named 'openshift'
oc cluster up --create-machine
# Start OpenShift using a specific public host name
oc cluster up --public-hostname=my.address.example.com
# Start OpenShift and preserve data and config between restarts
oc cluster up --host-data-dir=/mydata --use-existing-config
So specifically in v1.3.2 use --host-data-dir and --use-existing-config

Assuming you are using docker machine with vm such as virtual box, the easiest way I found is taking a vm snapshot WHILE vm and openshift cluster are up and running. This snapshot will backup memory in addition to disk therefore you can restore entire cluster later on by restoring the vm snapshot, then run docker-machine start ...
btw, as of latest os image openshift/origin:v3.6.0-rc.0 and oc cli, --host-data-dir=/mydata as suggested in the other answer doesn't work for me.

I'm using:
VirtualBox 5.1.26
Kubernetes v1.5.2+43a9be4
openshift v1.5.0+031cbe4
Didn't work for me using --host-data-dir (and others) :
oc cluster up --logging=true --metrics=true --docker-machine=openshift --use-existing-config=true --host-data-dir=/vm/data --host-config-dir=/vm/config --host-pv-dir=/vm/pv --host-volumes-dir=/vm/volumes
With output:
-- Checking OpenShift client ... OK
-- Checking Docker client ...
Starting Docker machine 'openshift'
Started Docker machine 'openshift'
-- Checking Docker version ...
WARNING: Cannot verify Docker version
-- Checking for existing OpenShift container ... OK
-- Checking for openshift/origin:v1.5.0 image ... OK
-- Checking Docker daemon configuration ... OK
-- Checking for available ports ... OK
-- Checking type of volume mount ...
Using Docker shared volumes for OpenShift volumes
-- Creating host directories ... OK
-- Finding server IP ...
Using docker-machine IP 192.168.99.100 as the host IP
Using 192.168.99.100 as the server IP
-- Starting OpenShift container ...
Starting OpenShift using container 'origin'
FAIL
Error: could not start OpenShift container "origin"
Details:
Last 10 lines of "origin" container log:
github.com/openshift/origin/vendor/github.com/coreos/pkg/capnslog.(*PackageLogger).Panicf(0xc4202a1600, 0x42b94c0, 0x1f, 0xc4214d9f08, 0x2, 0x2)
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/pkg/capnslog/pkg_logger.go:75 +0x16a
github.com/openshift/origin/vendor/github.com/coreos/etcd/mvcc/backend.newBackend(0xc4209f84c0, 0x33, 0x5f5e100, 0x2710, 0xc4214d9fa8)
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/etcd/mvcc/backend/backend.go:106 +0x341
github.com/openshift/origin/vendor/github.com/coreos/etcd/mvcc/backend.NewDefaultBackend(0xc4209f84c0, 0x33, 0x461e51, 0xc421471200)
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/etcd/mvcc/backend/backend.go:100 +0x4d
github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver.NewServer.func1(0xc4204bf640, 0xc4209f84c0, 0x33, 0xc421079a40)
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver/server.go:272 +0x39
created by github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver.NewServer
/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/coreos/etcd/etcdserver/server.go:274 +0x345
Openshift writes to the directories /vm/... (also defined in VirtualBox) but successfully won't start.
See [https://github.com/openshift/origin/issues/12602][1]
Worked for me too, using Virtual Box Snapshots and restoring them.

To make it persistent after each shutdown you need to provide base-dir parameter.
$ mkdir ~/openshift-config
$ oc cluster up --base-dir=~/openshift-config
From help
$ oc cluster up --help
...
Options:
--base-dir='': Directory on Docker host for cluster up configuration
--enable=[*]: A list of components to enable. '*' enables all on-by-default components, 'foo' enables the component named 'foo', '-foo' disables the component named 'foo'.
--forward-ports=false: Use Docker port-forwarding to communicate with origin container. Requires 'socat' locally.
--http-proxy='': HTTP proxy to use for master and builds
--https-proxy='': HTTPS proxy to use for master and builds
--image='openshift/origin-${component}:${version}': Specify the images to use for OpenShift
--no-proxy=[]: List of hosts or subnets for which a proxy should not be used
--public-hostname='': Public hostname for OpenShift cluster
--routing-suffix='': Default suffix for server routes
--server-loglevel=0: Log level for OpenShift server
--skip-registry-check=false: Skip Docker daemon registry check
--write-config=false: Write the configuration files into host config dir
But you shouln't use it, because "cluster up" is removed in version 4.0.0. More here: https://github.com/openshift/origin/pull/21399

Related

Unable to connect worker node to master using K3S

I am trying to setup a K3S cluster for learning purposes but I am having trouble connecting the master node with agents. I have looked several tutorials and discussions on this but I can't find a solution. I know I am probably missing something obvious (due to my lack of knowledge), but still help would be much appreciated.
I am using two AWS t2.micro instances with default configuration.
When ssh into the master and installed K3S using
curl -sfL https://get.k3s.io | sh -s - --no-deploy traefik --write-kubeconfig-mode 644 --node-name k3s-master-01
with kubectl get nodes, I am able to see the master
NAME STATUS ROLES AGE VERSION
k3s-master-01 Ready control-plane,master 13s v1.23.6+k3s1
So far it seems I am doing things right. From what I understand, I am supposed to configure the kubeconfig file. So, I accessed it by using
cat /etc/rancher/k3s/k3s.yaml
I copied the configuration file and the server info to match the private IP I took from AWS console, resulting in something like this
apiVersion: v1
clusters:
- cluster:
certificate-authority-data: <lots_of_info>
server: https://<master_private_IP>:6443
name: default
contexts:
- context:
cluster: default
user: default
name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
user:
client-certificate-data: <my_certificate_data>
client-key-data: <my_key_data>
Then, I ran vi ~/.kube/config, and there I pasted the kubeconfig file
Finally, I grabbed the token with cat /var/lib/rancher/k3s/server/node-token, ssh into the other machine and then run the following
curl -sfL https://get.k3s.io | K3S_NODE_NAME=k3s-worker-01 K3S_URL=https://<master_private_IP>:6443 K3S_TOKEN=<master_token> sh -
The output is
[INFO] Finding release for channel stable
[INFO] Using v1.23.6+k3s1 as release
[INFO] Downloading hash https://github.com/k3s-io/k3s/releases/download/v1.23.6+k3s1/sha256sum-amd64.txt
[INFO] Downloading binary https://github.com/k3s-io/k3s/releases/download/v1.23.6+k3s1/k3s
[INFO] Verifying binary download
[INFO] Installing k3s to /usr/local/bin/k3s
[INFO] Skipping installation of SELinux RPM
[INFO] Creating /usr/local/bin/kubectl symlink to k3s
[INFO] Creating /usr/local/bin/crictl symlink to k3s
[INFO] Creating /usr/local/bin/ctr symlink to k3s
[INFO] Creating killall script /usr/local/bin/k3s-killall.sh
[INFO] Creating uninstall script /usr/local/bin/k3s-agent-uninstall.sh
[INFO] env: Creating environment file /etc/systemd/system/k3s-agent.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s-agent.service
[INFO] systemd: Enabling k3s-agent unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s-agent.service → /etc/systemd/system/k3s-agent.service.
[INFO] systemd: Starting k3s-agent
By this output, it looks like I have created an agent. However, when I run kubectl get nodes in the master, I still get
NAME STATUS ROLES AGE VERSION
k3s-master-01 Ready control-plane,master 12m v1.23.6+k3s1
What is the thing I was supposed to do in order to get the agent connected to the master? I am guess I am probably missing something simple, but I just can't seem to find the solution. I've read all the documentation but it is still not clear to me where I am making the mistake. I've tried saving the private master IP and token into the agent as environmental variables with export K3S_TOKEN=master_token and K3S_URL=master_private_IP and then simply running curl -sfL https://get.k3s.io | sh - but I still can't see the worker nodes when running kubectl get nodes
Any help would be appreciated.
It might be your VM instance firewall that prevents appropriate connection from your master to the worker node (and vice versa). Official rancher documentation advise to disable firewall for (Red Hat/CentOS) Enterprise Linux:
It is recommended to turn off firewalld:
systemctl disable firewalld --now
If enabled, it is required to disable nm-cloud-setup and reboot the node:
systemctl disable nm-cloud-setup.service nm-cloud-setup.timer reboot
If you are using Ubuntu on your VM's, there is a different firewall tool (ufw).
In my case, allowing 6443 and 443(not sure if required) port TCP connections worked fine.
Allow port 6443 and TCP connection in all of your cluster machines:
sudo ufw allow 6443/tcp
Then apply k3s installation script in your worker node(s):
curl -sfL https://get.k3s.io | K3S_NODE_NAME=k3s-worker-1 K3S_URL=https://<k3s-master-1 IP>:6443 K3S_TOKEN=<k3s-master-1 TOKEN> sh -
This should work. If not, you can try adding additional allow rule for 443 tcp port as well.
A few options to check.
Check Journalctl for errors
journalctl -u k3s-agent.service -n 300 -xn
If using RaspberryPi for a worker node, make sure you have
cgroup_enable=cpuset cgroup_enable=memory cgroup_memory=1
as the very end of your /boot/cmdline.txt file. DO NOT PUT THIS VALUE ON A NEW LINE! Should just be appended to the end of the line.
If your master node(s) have self-signed certs, make sure you copy the master node's self signed cert to your worker node(s). In linux or raspberry pi copy cert to /usr/local/share/ca-certificates, then issue an
sudo update-ca-certificates
on the worker node
Don't forget to reboot the worker node after you make these changes!
Hope this helps someone!

minishift - setup authentication for openshift internal docker registry?

In the Ubuntu desktop, started the OC cluster using minishift. The docker registry is available in the default namespace.
How to setup the authentication for the docker registry running inside the openshift cluster?
How to allow the users like developer,system or any users of openshift to push/pull images to/from the internal docker registry?
I have enabled the route for the docker service in the openshift.
root#desktop:~# docker login -p 5d2XKusYJ9xB6sg1_uRfwPE8Ap3FQMg8_MrR9IEw3N8 -u aprasath docker-registry-default.127.0.0.1.nip.io
WARNING! Using --password via the CLI is insecure. Use --password-stdin.
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
root#desktop:~# docker push docker-registry-default.127.0.0.1.nip.io/myproject/hello-world
The push refers to repository [docker-registry-default.127.0.0.1.nip.io/myproject/hello-world]
af0b15c8625b: Pushing [==================================================>] 3.584kB
unauthorized: authentication required
root#desktop:~# minishift addons list
- admin-user : enabled P(0)
- registry-route : enabled P(0)
The imagestream creation, tag and push to registry output below.
root#desktop:~# oc create imagestream ghost
Error from server (AlreadyExists): imagestreams.image.openshift.io "ghost" already exists
root#desktop:~ docker tag ghost docker-registry-default.127.0.0.1.nip.io/myproject/ghost:latest
root#desktop:~ docker push docker-registry-default.127.0.0.1.nip.io/myproject/ghost:latest
The push refers to repository [docker-registry-default.127.0.0.1.nip.io/myproject/ghost]
6545fabd1db4: Pushing [==================================================>] 4.096kB
e1b5357c9029: Pushing [==================================================>] 205.2MB/205.2MB
2f546e8c419e: Pushing [==================================================>] 25.33MB/25.33MB
2f5caec27732: Pushing [==================================================>] 1.287MB/1.287MB
da4dc4c42b60: Pushing [==================================================>] 3.584kB
24ad92b56299: Waiting
4eab4d25c303: Waiting
e2dd6cf79115: Waiting
67ecfc9591c8: Waiting
unauthorized: authentication required

go-ethereum - geth - puppeth - ethstat remote server : docker: command not found

I'm trying to setup a private ethereum test network using Puppeth (as Péter Szilágyi demoed in Ethereum devcon three 2017). I'm running it on a macbook pro (macOS Sierra).
When I try to setup the ethstat network component I get an "docker configured incorrectly: bash: docker: command not found" error. I have docker running and I can use it fine in the terminal e.g. docker ps.
Here are the steps I took:
What would you like to do? (default = stats)
1. Show network stats
2. Manage existing genesis
3. Track new remote server
4. Deploy network components
> 4
What would you like to deploy? (recommended order)
1. Ethstats - Network monitoring tool
2. Bootnode - Entry point of the network
3. Sealer - Full node minting new blocks
4. Wallet - Browser wallet for quick sends (todo)
5. Faucet - Crypto faucet to give away funds
6. Dashboard - Website listing above web-services
> 1
Which server do you want to interact with?
1. Connect another server
> 1
Please enter remote server's address:
> localhost
DEBUG[11-15|22:46:49] Attempting to establish SSH connection server=localhost
WARN [11-15|22:46:49] Bad SSH key, falling back to passwords path=/Users/xxx/.ssh/id_rsa err="ssh: cannot decode encrypted private keys"
The authenticity of host 'localhost:22 ([::1]:22)' can't be established.
SSH key fingerprint is xxx [MD5]
Are you sure you want to continue connecting (yes/no)? yes
What's the login password for xxx at localhost:22? (won't be echoed)
>
DEBUG[11-15|22:47:11] Verifying if docker is available server=localhost
ERROR[11-15|22:47:11] Server not ready for puppeth err="docker configured incorrectly: bash: docker: command not found\n"
Here are my questions:
Is there any documentation / tutorial describing how to setup this remote server properly. Or just on puppeth in general?
Can I not use localhost as "remote server address"
Any ideas on why the docker command is not found (it is installed and running and I can use it ok in the terminal).
Here is what I did.
For the docker you have to use the docker-compose binary. You can find it here.
Furthermore, you have to be sure that an ssh server is running on your localhost and that keys have been generated.
I didn't find any documentations for puppeth whatsoever.
I think I found the root cause to this problem. The SSH daemon is compiled with a default path. If you ssh to a machine with a specific command (other than a shell), you get that default path. This does not include /usr/local/bin for example, where docker lives in my case.
I found the solution here: https://serverfault.com/a/585075:
edit /etc/ssh/sshd_config and make sure it contains PermitUserEnvironment yes (you need to edit this with sudo)
create a file ~/.ssh/environment with the path that you want, in my case:
PATH=/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
When you now run ssh localhost env you should see a PATH that matches whatever you put in ~/.ssh/environment.

Questions on starting Locator using snappydata/bin> ./spark-shell.sh script

Spark v. 0.5
Here's the command I used to start a Locator:
ubuntu#ip-172-31-8-115:/snappydata-0.5-bin/bin$ ./snappy-shell locator start
Starting SnappyData Locator using peer discovery on:
0.0.0.0[10334] Starting DRDA server for SnappyData at address localhost/127.0.0.1[1527]
Logs generated in /snappydata-0.5-bin/bin/snappylocator.log
SnappyData Locator pid: 9352 status: running
It looks like it starts the DRDA server locally, with no outside interface for a client to connect to. So, I cannot reach my SnappyData Locator using this JDBC URL from an outside client host (e.g. my SquirrelSQL editor).
This does not connect:
jdbc:snappydata://MY-AWS-PUBLIC-IP-HERE:1527/
What property do I pass my ./snappy-shell.sh location start command to get the DRDA Server to start on a public IP address instead of "localhost/127.0.0.1"?
Use -client-bind-address and -client-port options. For locator also use the -peer-discovery-address and -peer-discovery-port options to specify bind address for other locators/servers/leads (that are passed to their -locators=<address>:<port>):
snappy-shell locator start -peer-discovery-address=<internal IP for peers> -client-bind-address=<public IP for clients>
See the output of snappy-shell locator --help for commonly used options.
For SnappyData releases, you may find it much easier to use the global configuration for all of the locators, servers, leads. Check configuring the cluster.
This will allow specifying all options for all JVMs of the cluster in conf/locators, conf/leads, conf/servers then starting with snappy-start-all.sh, status with snappy-status-all.sh and stop all with snappy-stop-all.sh
On a related note, we at SnappyData Inc., are developing scripts to enable users quickly launch SnappyData cluster on AWS.
If you want to try it out, below steps would guide you. We would love to hear your feedback on this.
Download its development branch git clone https://github.com/SnappyDataInc/snappydata.git -b SNAP-864 (You don't need to clone the repo for this, but I could not find a way to attach the scripts here.)
Go to ec2 directory cd snappydata/cluster/ec2
Run snappy-ec2. ./snappy-ec2 -k ec2-keypair-name -i /path/to/keypair/private/key/file launch your-cluster-name
See this README for more details.

Starting instance again after power off

How do I start instance on GCE again after power off.
Instance shows TERMINATED , but has PERSISTENT disk type.
if I use add instance with the same instance name it asks me for the
Select an new image with only choice of OS level, not my existing disk.
then fails with
ERROR: RESOURCE_ALREADY_EXISTS: The resource XXXX already exists
Is there way to start (or clone) copy of image once stopped?
Anything similar to AWS stop/start. I don't care about instance state or scratch to be saved, just start since I have boot disk stored and payed for.
Success, below is stop/start procedure, assuming that $PROJECT and $INSTANCE are set appropriately:
#--------- stop instance -----
#connect and shutdown
gcutil --project=$PROJECT ssh $INSTANCE
sudo shutdown -h now
# check
gcutil listinstances --project $PROJECT
#delete instance/keep boot disk , use -f to avoid confirmation
gcutil --project=$PROJECT deleteinstance $INSTANCE --nodelete_boot_pd
# check disks
gcutil listdisks --project=$PROJECT
#--------- start new instance -----
# launch instance using the existing disk (has to be in the same zone!)
gcutil --project=$PROJECT addinstance $INSTANCE --disk=$DISK,boot --zone=$ZONE --machine_type=n1-standard-1
#check that it's running
gcutil listinstances --project $PROJECT
You're on the right track. You just need to delete the existing TERMINATED instance before adding it again.
Even though the instance isn't running when it is TERMINATED, the resources (such as Persistent Disk) are still allocated to it.
Also, if this instance was created before December 5th, (when Compute Engine went GA), you'll need to add a kernel to the disk or it won't boot. See the transition guide for details.
(For a temporary work around to upgrading the kernel, see this Q/A: My Google Compute Engine instances hang during boot using the v1 API)