How to manually recreate the bootstrap client certificate for OpenShift 3.11 master? - openshift

Our origin-node.service on the master node fails with:
root@master> systemctl start origin-node.service
Job for origin-node.service failed because the control process exited with error code. See "systemctl status origin-node.service" and "journalctl -xe" for details.
root@master> systemctl status origin-node.service -l
[...]
May 05 07:17:47 master origin-node[44066]: bootstrap.go:195] Part of the existing bootstrap client certificate is expired: 2020-02-20 13:14:27 +0000 UTC
May 05 07:17:47 master origin-node[44066]: bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
May 05 07:17:47 master origin-node[44066]: certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
May 05 07:17:47 master origin-node[44066]: server.go:262] failed to run Kubelet: cannot create certificate signing request: Post https://lb.openshift-cluster.mydomain.com:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: EOF
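A quick way to confirm which of these certificates has actually expired is to inspect their dates directly; this is only a sanity check, assuming openssl is available on the master and that node.kubeconfig embeds its client certificate:
# Print the notAfter date of the kubelet client/server certificates from the log above
openssl x509 -noout -enddate -in /etc/origin/node/certificates/kubelet-client-current.pem
openssl x509 -noout -enddate -in /etc/origin/node/certificates/kubelet-server-current.pem
# If node.kubeconfig embeds the certificate (client-certificate-data), decode it first
grep client-certificate-data /etc/origin/node/node.kubeconfig | awk '{print $2}' | base64 -d | openssl x509 -noout -enddate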
So it seems that kubelet-client-current.pem and/or kubelet-server-current.pem contains an expired certificate and the service tries to create a CSR using an endpoint which is probably not yet available (because the master is down). We tried redeploying the certificates according to the OpenShift documentation Redeploying Certificates, but this fails while detecting an expired certificate:
root@master> ansible-playbook -i /etc/ansible/hosts openshift-master/redeploy-openshift-ca.yml
[...]
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] *******************************************************************************************************************************************
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 60 days of expiring. You may view the report at /root/cert-expiry-report.20200505T042754.html or /root/cert-expiry-report.20200505T042754.json.\n"}
[...]
root@master> cat /root/cert-expiry-report.20200505T042754.json
[...]
"kubeconfigs": [
{
"cert_cn": "O:system:cluster-admins, CN:system:admin",
"days_remaining": -75,
"expiry": "2020-02-20 13:14:27",
"health": "expired",
"issuer": "CN=openshift-signer#1519045219 ",
"path": "/etc/origin/node/node.kubeconfig",
"serial": 27,
"serial_hex": "0x1b"
},
{
"cert_cn": "O:system:cluster-admins, CN:system:admin",
"days_remaining": -75,
"expiry": "2020-02-20 13:14:27",
"health": "expired",
"issuer": "CN=openshift-signer#1519045219 ",
"path": "/etc/origin/node/node.kubeconfig",
"serial": 27,
"serial_hex": "0x1b"
},
[...]
"summary": {
"expired": 2,
"ok": 22,
"total": 24,
"warning": 0
}
}
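For what it is worth, the expiry report above can also be regenerated on its own, without attempting any redeploy; the playbook path below is assumed from the openshift-ansible 3.11 checkout and may differ slightly between releases:
# Only runs the openshift_certificate_expiry checks and writes the HTML/JSON report
ansible-playbook -i /etc/ansible/hosts playbooks/openshift-checks/certificate_expiry/easy-mode.yaml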
There is a guide for OpenShift 4.4, Recovering from expired control plane certificates, but that does not apply to 3.11, and we did not find such a guide for our version.
Is it possible to recreate the expired certificates without a running master node for 3.11? Thanks for any help.
OpenShift Ansible: https://github.com/openshift/openshift-ansible/releases/tag/openshift-ansible-3.11.153-2
Update 2020-05-06: I also executed redeploy-certificates.yml, but it fails at the same TASK:
root@master> ansible-playbook -i /etc/ansible/hosts playbooks/redeploy-certificates.yml
[...]
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] ******************************************************************************
Wednesday 06 May 2020 04:07:06 -0400 (0:00:00.909) 0:01:07.582 *********
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 60 days of expiring. You may view the report at /root/cert-expiry-report.20200506T040603.html or /root/cert-expiry-report.20200506T040603.json.\n"}
Update 2020-05-11: Running with -e openshift_certificate_expiry_fail_on_warn=False results in:
root@master> ansible-playbook -i /etc/ansible/hosts -e openshift_certificate_expiry_fail_on_warn=False playbooks/redeploy-certificates.yml
[...]
TASK [Wait for master API to come back online] *****************************************************************************************************************
Monday 11 May 2020 03:48:56 -0400 (0:00:00.111) 0:02:25.186 ************
skipping: [master.openshift-cluster.mydomain.com]
TASK [openshift_control_plane : restart master] ****************************************************************************************************************
Monday 11 May 2020 03:48:56 -0400 (0:00:00.257) 0:02:25.444 ************
changed: [master.openshift-cluster.mydomain.com] => (item=api)
changed: [master.openshift-cluster.mydomain.com] => (item=controllers)
RUNNING HANDLER [openshift_control_plane : verify API server] **************************************************************************************************
Monday 11 May 2020 03:48:57 -0400 (0:00:00.945) 0:02:26.389 ************
FAILED - RETRYING: verify API server (120 retries left).
FAILED - RETRYING: verify API server (119 retries left).
[...]
FAILED - RETRYING: verify API server (1 retries left).
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"attempts": 120, "changed": false, "cmd": ["curl", "--silent", "--tlsv1.2", "--max-time", "2", "--cacert", "/etc/origin/master/ca-bundle.crt", "https://lb.openshift-cluster.mydomain.com:8443/healthz/ready"], "delta": "0:00:00.182367", "end": "2020-05-11 03:51:52.245644", "msg": "non-zero return code", "rc": 35, "start": "2020-05-11 03:51:52.063277", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
root@master> systemctl status origin-node.service -l
[...]
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: E0511 04:23:28.077964 109972 bootstrap.go:195] Part of the existing bootstrap client certificate is expired: 2020-02-20 13:14:27 +0000 UTC
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: I0511 04:23:28.078001 109972 bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: I0511 04:23:28.080555 109972 certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: F0511 04:23:28.130968 109972 server.go:262] failed to run Kubelet: cannot create certificate signing request: Post https://lb.openshift-cluster.mydomain.com:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: EOF
[...]
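The failing health check from the playbook can be reproduced by hand, which also makes it easy to see whether the load balancer endpoint is serving an expired certificate at all; this is just the playbook's own curl plus a standard openssl inspection, using the hostname from the output above:
# Same check the playbook runs, but verbose so the TLS error is visible
curl -v --tlsv1.2 --cacert /etc/origin/master/ca-bundle.crt https://lb.openshift-cluster.mydomain.com:8443/healthz/ready
# Show the validity dates of the certificate actually served on 8443
echo | openssl s_client -connect lb.openshift-cluster.mydomain.com:8443 2>/dev/null | openssl x509 -noout -dates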

I had this same case in a customer environment. The error occurs because the certificate had expired; I "cheated" by changing the OS date to a date before the expiry, and the origin-node service started on my masters:
systemctl status origin-node
● origin-node.service - OpenShift Node
Loaded: loaded (/etc/systemd/system/origin-node.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2021-02-20 20:22:21 -02; 6min ago
Docs: https://github.com/openshift/origin
Main PID: 37230 (hyperkube)
Memory: 79.0M
CGroup: /system.slice/origin-node.service
└─37230 /usr/bin/hyperkube kubelet --v=2 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-token-webhook=true --authentication-token-webhook-cache-ttl=5m --authorization-mode=Webhook --authorization-webhook-c...
You have mail in /var/spool/mail/okd
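A minimal sketch of that workaround, assuming NTP/chrony can be paused while you work and that the clock is restored afterwards (the expiry date comes from the logs above):
# Stop time synchronisation so the clock is not corrected immediately
timedatectl set-ntp false            # or: systemctl stop chronyd
# Set a date before the certificate expiry (2020-02-20 in this case)
timedatectl set-time '2020-02-01 12:00:00'
systemctl start origin-node.service
# ...renew/redeploy the certificates, then restore the real time
timedatectl set-ntp true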

The openshift_certificate_expiry role uses the openshift_certificate_expiry_fail_on_warn variable to determine if the playbook should fail when the days left are less than openshift_certificate_expiry_warning_days.
So try running the redeploy-certificates.yml with this additional variable set to "False":
ansible-playbook -i /etc/ansible/hosts -e openshift_certificate_expiry_fail_on_warn=False playbooks/redeploy-certificates.yml
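If you prefer not to pass -e on every run, the same override can be kept in the inventory instead; this assumes the usual [OSEv3:vars] section in /etc/ansible/hosts:
# Add the following line under [OSEv3:vars] in /etc/ansible/hosts:
#   openshift_certificate_expiry_fail_on_warn=False
# then run the playbook as usual
ansible-playbook -i /etc/ansible/hosts playbooks/redeploy-certificates.yml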

Related

Unable to start nginx-ingress-controller Readiness and Liveness probes failed

I have installed using instructions at this link for the Install NGINX using NodePort option.
When I do ks logs -f ingress-nginx-controller-7f48b8-s7pg4 -n ingress-nginx I get :
W0304 09:33:40.568799 8 client_config.go:614] Neither --kubeconfig nor --master was
specified. Using the inClusterConfig. This might not work.
I0304 09:33:40.569097 8 main.go:241] "Creating API client" host="https://10.96.0.1:443"
I0304 09:33:40.584904 8 main.go:285] "Running in Kubernetes cluster" major="1" minor="23" git="v1.23.1+k0s" state="clean" commit="b230d3e4b9d6bf4b731d96116a6643786e16ac3f" platform="linux/amd64"
I0304 09:33:40.911443 8 main.go:105] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I0304 09:33:40.916404 8 main.go:115] "Enabling new Ingress features available since Kubernetes v1.18"
W0304 09:33:40.918137 8 main.go:127] No IngressClass resource with name nginx found. Only annotation will be used.
I0304 09:33:40.942282 8 ssl.go:532] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I0304 09:33:40.977766 8 nginx.go:254] "Starting NGINX Ingress controller"
I0304 09:33:41.007616 8 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"ingress-nginx-controller", UID:"1a4482d2-86cb-44f3-8ebb-d6342561892f", APIVersion:"v1", ResourceVersion:"987560", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/ingress-nginx-controller
E0304 09:33:42.087113 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:43.041954 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:44.724681 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:48.303789 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:59.113203 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:34:16.727052 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
I0304 09:34:39.216165 8 main.go:187] "Received SIGTERM, shutting down"
I0304 09:34:39.216773 8 nginx.go:372] "Shutting down controller queues"
E0304 09:34:39.217779 8 store.go:178] timed out waiting for caches to sync
I0304 09:34:39.217856 8 nginx.go:296] "Starting NGINX process"
I0304 09:34:39.218007 8 leaderelection.go:243] attempting to acquire leader lease ingress-nginx/ingress-controller-leader-nginx...
I0304 09:34:39.219741 8 queue.go:78] "queue has been shutdown, failed to enqueue" key="&ObjectMeta{Name:initial-sync,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ClusterName:,ManagedFields:[]ManagedFieldsEntry{},}"
I0304 09:34:39.219787 8 nginx.go:316] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I0304 09:34:39.242501 8 leaderelection.go:253] successfully acquired lease ingress-nginx/ingress-controller-leader-nginx
I0304 09:34:39.242807 8 queue.go:78] "queue has been shutdown, failed to enqueue" key="&ObjectMeta{Name:sync status,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ClusterName:,ManagedFields:[]ManagedFieldsEntry{},}"
I0304 09:34:39.242837 8 status.go:84] "New leader elected" identity="ingress-nginx-controller-7f48b8-s7pg4"
I0304 09:34:39.252025 8 status.go:204] "POD is not ready" pod="ingress-nginx/ingress-nginx-controller-7f48b8-s7pg4" node="fbcdcesdn02"
I0304 09:34:39.255282 8 status.go:132] "removing value from ingress status" address=[]
I0304 09:34:39.255328 8 nginx.go:380] "Stopping admission controller"
I0304 09:34:39.255379 8 nginx.go:388] "Stopping NGINX process"
E0304 09:34:39.255664 8 nginx.go:319] "Error listening for TLS connections" err="http: Server closed"
2022/03/04 09:34:39 [notice] 43#43: signal process started
I0304 09:34:40.263361 8 nginx.go:401] "NGINX process has stopped"
I0304 09:34:40.263396 8 main.go:195] "Handled quit, awaiting Pod deletion"
I0304 09:34:50.263585 8 main.go:198] "Exiting" code=0
When I do ks describe pod ingress-nginx-controller-7f48b8-s7pg4 -n ingress-nginx I get :
Name: ingress-nginx-controller-7f48b8-s7pg4
Namespace: ingress-nginx
Priority: 0
Node: fxxxxxxxx/10.XXX.XXX.XXX
Start Time: Fri, 04 Mar 2022 08:12:57 +0200
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/name=ingress-nginx
pod-template-hash=7f48b8
Annotations: kubernetes.io/psp: 00-k0s-privileged
Status: Running
IP: 10.244.0.119
IPs:
IP: 10.244.0.119
Controlled By: ReplicaSet/ingress-nginx-controller-7f48b8
Containers:
controller:
Container ID: containerd://638ff4d63b7ba566125bd6789d48db6e8149b06cbd9d887ecc57d08448ba1d7e
Image: k8s.gcr.io/ingress-nginx/controller:v0.48.1@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899
Image ID: k8s.gcr.io/ingress-nginx/controller@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899
Ports: 80/TCP, 443/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--election-id=ingress-controller-leader
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/ingress-nginx-controller
--validating-webhook=:8443
--validating-webhook-certificate=/usr/local/certificates/cert
--validating-webhook-key=/usr/local/certificates/key
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 04 Mar 2022 11:33:40 +0200
Finished: Fri, 04 Mar 2022 11:34:50 +0200
Ready: False
Restart Count: 61
Requests:
cpu: 100m
memory: 90Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-controller-7f48b8-s7pg4 (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/usr/local/certificates/ from webhook-cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zvcnr (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
webhook-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-nginx-admission
Optional: false
kube-api-access-zvcnr:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 23m (x316 over 178m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Warning BackOff 8m52s (x555 over 174m) kubelet Back-off restarting failed container
Normal Pulled 3m54s (x51 over 178m) kubelet Container image "k8s.gcr.io/ingress-nginx/controller:v0.48.1@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899" already present on machine
When I try to curl the health endpoints I get Connection refused.
The state of the pods shows that they are both not ready:
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-4hzzk 0/1 Completed 0 3h30m
ingress-nginx-controller-7f48b8-s7pg4 0/1 CrashLoopBackOff 63 (91s ago) 3h30m
I have tried to increase the values for initialDelaySeconds in /etc/nginx/nginx.conf, but when I attempt to exec into the container (ks exec -it -n ingress-nginx ingress-nginx-controller-7f48b8-s7pg4 -- bash) I also get an error: unable to upgrade connection: container not found ("controller")
I am not really sure where I should be looking in the overall setup.
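One thing worth checking up front is which Ingress API versions the cluster still serves, because the reflector errors above are all about v1beta1 Ingress, which was removed in Kubernetes 1.22 and is therefore not served by your 1.23 cluster (assuming kubectl access):
# List the networking API versions the API server still serves
kubectl api-versions | grep networking.k8s.io
# On 1.22+ only networking.k8s.io/v1 is left, so a controller that watches
# v1beta1 Ingress (such as v0.48.1) cannot list or watch Ingress objects.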
I have installed using instructions at this link for the Install NGINX using NodePort option.
The problem is that you are using outdated k0s documentation:
https://docs.k0sproject.io/v1.22.2+k0s.1/examples/nginx-ingress/
You should use this link instead:
https://docs.k0sproject.io/main/examples/nginx-ingress/
Following the current documentation link will install the controller-v1.0.0 version on your Kubernetes cluster.
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.0.0/deploy/static/provider/baremetal/deploy.yaml
The result is:
$ sudo k0s kubectl get pods -n ingress-nginx
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-dw2f4 0/1 Completed 0 11m
ingress-nginx-admission-patch-4dmpd 0/1 Completed 0 11m
ingress-nginx-controller-75f58fbf6b-xrfxr 1/1 Running 0 11m
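Once the controller is Running, you can also hit the same health endpoint the probes use, to confirm it no longer returns 500; the deployment name and port are taken from the manifests above, but treat the exact commands as a sketch:
# In one terminal: forward the controller's health port (the probes point at 10254)
kubectl -n ingress-nginx port-forward deploy/ingress-nginx-controller 10254:10254
# In another terminal: query the readiness/liveness endpoint
curl -s http://127.0.0.1:10254/healthz; echo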

Caused by: io.debezium.text.ParsingException: extraneous input 'ASC' expecting

I am running a Kafka source connector, but unfortunately I am getting the error below:
{"name":"supplier-central","connector":{"state":"RUNNING","worker_id":"192.168.208.4:8083"},"tasks":[{"id":0,"state":"FAILED","worker_id":"192.168.208.4:8083","trace":"org.apache.kafka.connect.errors.ConnectException: extraneous input 'ASC' expecting {<EOF>, '--'}\n\tat io.debezium.connector.mysql.AbstractReader.wrap(AbstractReader.java:230)\n\tat io.debezium.connector.mysql.AbstractReader.failed(AbstractReader.java:208)\n\tat io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:508)\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.notifyEventListeners(BinaryLogClient.java:1095)\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.listenForEventPackets(BinaryLogClient.java:943)\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient.connect(BinaryLogClient.java:580)\n\tat com.github.shyiko.mysql.binlog.BinaryLogClient$7.run(BinaryLogClient.java:825)\n\tat java.lang.Thread.run(Thread.java:748)\nCaused by: io.debezium.text.ParsingException: extraneous input 'ASC' expecting {<EOF>, '--'}\n\tat io.debezium.antlr.ParsingErrorListener.syntaxError(ParsingErrorListener.java:40)\n\tat org.antlr.v4.runtime.ProxyErrorListener.syntaxError(ProxyErrorListener.java:41)\n\tat org.antlr.v4.runtime.Parser.notifyErrorListeners(Parser.java:544)\n\tat org.antlr.v4.runtime.DefaultErrorStrategy.reportUnwantedToken(DefaultErrorStrategy.java:349)\n\tat org.antlr.v4.runtime.DefaultErrorStrategy.singleTokenDeletion(DefaultErrorStrategy.java:513)\n\tat org.antlr.v4.runtime.DefaultErrorStrategy.sync(DefaultErrorStrategy.java:238)\n\tat io.debezium.ddl.parser.mysql.generated.MySqlParser.root(MySqlParser.java:817)\n\tat io.debezium.connector.mysql.antlr.MySqlAntlrDdlParser.parseTree(MySqlAntlrDdlParser.java:68)\n\tat io.debezium.connector.mysql.antlr.MySqlAntlrDdlParser.parseTree(MySqlAntlrDdlParser.java:41)\n\tat io.debezium.antlr.AntlrDdlParser.parse(AntlrDdlParser.java:80)\n\tat io.debezium.connector.mysql.MySqlSchema.applyDdl(MySqlSchema.java:307)\n\tat io.debezium.connector.mysql.BinlogReader.handleQueryEvent(BinlogReader.java:694)\n\tat io.debezium.connector.mysql.BinlogReader.handleEvent(BinlogReader.java:492)\n\t... 5 more\n"}],"type":"source"}**
and in the Debezium logs I am getting the following error:
2019-08-23 05:02:40,101 INFO MySQL|data_lake|task [Consumer clientId=supplier-central-dbhistory, groupId=supplier-central-dbhistory] Member supplier-central-dbhistory-41cab001-1c64-4ab2-8869-58dca22b783c sending LeaveGroup request to coordinator kafka:9092 (id: 2147483646 rack: null) [org.apache.kafka.clients.consumer.internals.AbstractCoordinator]
Aug 23, 2019 5:02:41 AM com.github.shyiko.mysql.binlog.BinaryLogClient connect
INFO: Connected to 52.76.148.206:3306 at mysql-bin.010785/66551561 (sid:425, cid:315812)
2019-08-23 05:02:41,200 INFO || WorkerSourceTask{id=supplier-central-0} Source task finished initialization and start [org.apache.kafka.connect.runtime.WorkerSourceTask]
2019-08-23 05:02:41,841 INFO || WorkerSourceTask{id=supplier-central-0} Committing offsets [org.apache.kafka.connect.runtime.WorkerSourceTask]
2019-08-23 05:02:41,841 INFO || WorkerSourceTask{id=supplier-central-0} flushing 0 outstanding messages for offset commit [org.apache.kafka.connect.runtime.WorkerSourceTask]
2019-08-23 05:02:41,841 ERROR || WorkerSourceTask{id=supplier-central-0} Task threw an uncaught and unrecoverable exception [org.apache.kafka.connect.runtime.WorkerTask]
2019-08-23 05:02:41,841 ERROR || WorkerSourceTask{id=supplier-central-0} Task is being killed and will not recover until manually restarted [org.apache.kafka.connect.runtime.WorkerTask]
2019-08-23 05:02:41,859 INFO MySQL|data_lake|task [Producer clientId=supplier-central-dbhistory] Closing the Kafka producer with timeoutMillis = 9223372036854775807 ms. [org.apache.kafka.clients.producer.KafkaProducer]
I am not using Schema Registry or Avro. The source DB is MySQL.
My other source connector works fine, and I am not able to identify the error. The source DB is a third-party DB, so someone may have changed something in it, but as per my understanding any such change would also go through the binlog that the Kafka connector reads, so that may not be the issue.
Can anyone tell me the problem and a solution for this?
connector configuration:
curl -i -X POST -H "Accept:application/json" \
-H "Content-Type:application/json" http://localhost:38083/connectors/ \
-d '{
"name": "supplier-central",
"config": {
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"database.hostname": "localhost",
"database.port": "3306",
"database.user": "ankitg",
"snapshot.mode": "initial",
"include.schema.changes": "true",
"database.password": "abc#123",
"database.server.id": "425",
"database.server.name": "data_lake",
"database.whitelist": "supplier",
"database.history.kafka.bootstrap.servers": "kafka:9092",
"database.history.kafka.topic": "history.supplier_central",
"table.whitelist": "supplier_central.suppliers,supplier_central.supplier_business_types,supplier_central.supplier_address,supplier_central.supplier_banks,supplier_central.supplier_profile,supplier_central.supplier_documents",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter": "org.apache.kafka.connect.json.JsonConverter"
}
}'
I got this error when I used a different database name and a different table name in the configuration. Check that your database.whitelist and table.whitelist settings match each other and are configured correctly.
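For reference, in the configuration posted above the two settings do not match: database.whitelist is "supplier" while every table.whitelist entry is prefixed with supplier_central. Assuming the tables really live in a database called supplier_central, the fragment would need to look something like this (otherwise change the table prefixes to supplier instead):
"database.whitelist": "supplier_central",
"table.whitelist": "supplier_central.suppliers,supplier_central.supplier_business_types,supplier_central.supplier_address,supplier_central.supplier_banks,supplier_central.supplier_profile,supplier_central.supplier_documents"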

Can't install CloudWatch agent via CloudFormation on Amazon ECS-optimized AMI

I am creating a CloudFormation template which creates some resources such as an EC2 instance, an Auto Scaling group and a launch configuration.
Via the UserData property of the launch configuration resource, I tried to install the CloudWatch agent as follows:
"UserData":{ "Fn::Base64" : {
"Fn::Join" : ["", [
"#!/bin/bash -xe\n",
"yum -y install aws-cfn-bootstrap\n",
"/opt/aws/bin/cfn-init -v",
" --stack ", { "Ref": "AWS::StackName" },
" --resource LaunchCongig",
" --region ", { "Ref" : "AWS::Region" },"\n",
"yum -y install wget\n",
"# Get the CloudWatch Logs agent\n",
"wget https://s3.amazonaws.com/aws-cloudwatch/downloads/latest/awslogs-agent-setup.py\n",
"# Install the CloudWatch Logs agent\n",
"python ./awslogs-agent-setup.py -n -r ", { "Ref" : "AWS::Region" }, " -c /etc/cwlogs.cfg || error_exit 'Failed to run CloudWatch Logs agent setup'\n",
"service awslogs start"
]]}
After SSHing into the instance, I checked the file /var/log/cloud-init-output.log to see if everything was fine, but here is what I got:
+ wget https://s3.amazonaws.com/aws-cloudwatch/downloads/latest/awslogs-agent-setup.py
--2017-02-17 14:36:10-- https://s3.amazonaws.com/aws-cloudwatch/downloads/latest/awslogs-agent-setup.py
Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.216.226.59
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.216.226.59|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 47998 (47K) [text/x-python]
Saving to: ‘awslogs-agent-setup.py’
0K .......... .......... .......... .......... ...... 100% 196K=0.2s
2017-02-17 14:36:10 (196 KB/s) - ‘awslogs-agent-setup.py’ saved [47998/47998]
+ python ./awslogs-agent-setup.py -n -r eu-west-1 -c /etc/cwlogs.cfg
Step 1 of 5: Installing pip ...Traceback (most recent call last):
File "./awslogs-agent-setup.py", line 1144, in <module>
main()
File "./awslogs-agent-setup.py", line 1140, in main
setup.setup_artifacts()
File "./awslogs-agent-setup.py", line 693, in setup_artifacts
self.install_pip()
File "./awslogs-agent-setup.py", line 600, in install_pip
fail("Could not install pip. Please try again or see " + AGENT_SETUP_LOG_FILE + " for more details")
TypeError: fail() takes exactly 2 arguments (1 given)
+ error_exit 'Failed to run CloudWatch Logs agent setup'
/var/lib/cloud/instance/scripts/part-001: line 8: error_exit: command not found
Feb 17 14:36:12 cloud-init[2798]: util.py[WARNING]: Failed running /var/lib/cloud/instance/scripts/part-001 [127]
Feb 17 14:36:12 cloud-init[2798]: cc_scripts_user.py[WARNING]: Failed to run module scripts-user (scripts in /var/lib/cloud/instance/scripts)
Feb 17 14:36:12 cloud-init[2798]: util.py[WARNING]: Running module scripts-user (<module 'cloudinit.config.cc_scripts_user' from '/usr/lib/python2.7/dist-packages/cloudinit/config/cc_scripts_user.pyc'>) failed
Cloud-init v. 0.7.6 finished at Fri, 17 Feb 2017 14:36:12 +0000. Datasource DataSourceEc2. Up 85.78 seconds
What is wrong with this script? Is there any other way to install the agent?
Thank you.
EDIT:
I figured that this might be because the python-pip package didn't get installed, so I added this to the UserData:
"yum -y install python-pip\n",
After that I ran the template again and, strangely, I got the same error.
I am using an Amazon ECS-optimized AMI.
I solved the problem by installing the agent directly with yum install awslogs:
"UserData":{ "Fn::Base64" : {
"Fn::Join" : ["", [
"#!/bin/bash -xe\n",
"yum -y install aws-cfn-bootstrap\n",
"/opt/aws/bin/cfn-init -v",
" --stack ", { "Ref": "AWS::StackName" },
" --resource launchConfig",
" --region ", { "Ref" : "AWS::Region" },"\n",
"yum -y install awslogs\n",
"service awslogs start"
]]}
Here is the output from the log file:
Installed:
awslogs.noarch 0:1.1.2-1.10.amzn1
Dependency Installed:
aws-cli.noarch 0:1.11.29-1.45.amzn1
aws-cli-plugin-cloudwatch-logs.noarch 0:1.3.3-1.15.amzn1
freetype.x86_64 0:2.3.11-15.14.amzn1
libjpeg-turbo.x86_64 0:1.2.90-5.14.amzn1
mailcap.noarch 0:2.1.31-2.7.amzn1
python27-botocore.noarch 0:1.4.86-1.62.amzn1
python27-colorama.noarch 0:0.2.5-1.7.amzn1
python27-dateutil.noarch 0:2.1-1.3.amzn1
python27-docutils.noarch 0:0.11-1.15.amzn1
python27-futures.noarch 0:3.0.3-1.3.amzn1
python27-imaging.x86_64 0:1.1.6-19.9.amzn1
python27-jmespath.noarch 0:0.9.0-1.11.amzn1
python27-ply.noarch 0:3.4-3.12.amzn1
python27-pyasn1.noarch 0:0.1.7-2.9.amzn1
python27-rsa.noarch 0:3.4.1-1.8.amzn1
Complete!
+ service awslogs start
Starting awslogs: [ OK ]
Cloud-init v. 0.7.6 finished at Fri, 17 Feb 2017 15:33:42 +0000. Datasource DataSourceEc2. Up 83.47 seconds
Everything works fine this way. Hope that will help someone someday!
For ECS specifically, see Using CloudWatch Logs with Container Instances in the EC2 Container Service documentation for details on configuring CloudWatch Logs. The documentation recommends using yum install -y awslogs instead of the Python install script.
The documentation provides a complete sample in the Configuring CloudWatch Logs at Launch with User Data section.
In your case, since you're already managing your config files using cfn-init and CloudFormation::Init metadata in CloudFormation, you don't need any complex parsing of config files in your User-Data script, but you can still use the script as a reference. One thing worth adding to your User-Data script is running chkconfig awslogs on to make sure the service continues running on the instance after a reboot.
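Putting the two answers together, a minimal user-data sketch (shown here as a plain shell script rather than inside Fn::Join, with the chkconfig step added as suggested above) might look like this:
#!/bin/bash -xe
yum -y install awslogs
# keep the agent enabled across reboots
chkconfig awslogs on
service awslogs start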

Why won't Cygnus receive a subscription on my CentOS 6.7?

I just finished testing the entire thing in my virtual machine environment and now I am trying to launch it on the dedicated server. And now I ran into a completely new issue. First I confirmed that I have both the Context Broker and Cygnus running (on 1026 and 5050 respectively):
[root@centos conf]# netstat -ntlpd
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:1026 0.0.0.0:* LISTEN 1321/contextBroker
tcp 0 0 127.0.0.1:27017 0.0.0.0:* LISTEN 1282/mongod
tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN 1791/mysqld
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 1260/sshd
tcp 0 0 :::1026 :::* LISTEN 1321/contextBroker
tcp 0 0 :::8081 :::* LISTEN 2481/java
tcp 0 0 :::22 :::* LISTEN 1260/sshd
tcp 0 0 :::5050 :::* LISTEN 2481/java
[root@centos conf]# service cygnus status
Cygnus 1 status...
cygnus-flume-ng (pid 2481) is running...
Then I confirmed that I have data on contextBroker because this command gave me an appropriate response:
(curl localhost:1026/v1/queryContext -s -S --header 'Content-Type: application/json' \
--header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
{
"entities": [
{
"type": "Room",
"isPattern": "false",
"id": "Room1"
}
]
}
EOF
Following the workaround to an issue with the root user and logging, I fixed log4j.properties and changed the following:
flume.log.dir=/var/log/cygnus
I then started cygnus and got the following log:
Starting an ordered shutdown of Cygnus
Stopping sources
Stopping http-source (lyfecycle state=START)
All the channels are empty
Stopping channels
Stopping mysql-channel (lyfecycle state=START)
Stopping sinks
Stopping mysql-sink (lyfecycle state=START)
Info: Sourcing environment configuration script /usr/cygnus/conf/flume-env.sh
Warning: JAVA_HOME is not set!
+ exec /usr/bin/java -Xmx20m -Dflume.log.file=cygnus.log -cp '/usr/cygnus/conf:/usr/cygnus/lib/*:/usr/cygnus/plugins.d/cygnus/lib/*:/usr/cygnus/plugins.d/cygnus/libext/*' -Djava.library.path= com.telefonica.iot.cygnus.nodes.CygnusApplic$
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/cygnus/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/cygnus/plugins.d/cygnus/lib/cygnus-0.11.0-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
17 Dec 2015 13:35:37,684 INFO [main] (com.telefonica.iot.cygnus.nodes.CygnusApplication.main:235) - Starting a Jetty server listening on port 8081 (Management Interface)
17 Dec 2015 13:35:37,700 INFO [main] (org.mortbay.log.Slf4jLog.info:67) - Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
17 Dec 2015 13:35:37,700 INFO [main] (com.telefonica.iot.cygnus.nodes.CygnusApplication.main:238) - Starting Cygnus application
17 Dec 2015 13:35:37,700 INFO [Thread-1] (org.mortbay.log.Slf4jLog.info:67) - jetty-6.1.26
17 Dec 2015 13:35:37,713 INFO [lifecycleSupervisor-1-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider.start:61) - Configuration provider starting
17 Dec 2015 13:35:37,715 INFO [conf-file-poller-0] (org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run:133) - Reloading configuration file:/usr/cygnus/conf/agent_1.conf
17 Dec 2015 13:35:37,725 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:mysql-sink
17 Dec 2015 13:35:37,725 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:mysql-sink
17 Dec 2015 13:35:37,725 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.addProperty:1016) - Processing:mysql-sink
17 Dec 2015 13:35:37,755 INFO [Thread-1] (org.mortbay.log.Slf4jLog.info:67) - Started SocketConnector@0.0.0.0:8081
17 Dec 2015 13:35:37,764 WARN [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.isValid:319) - Agent configuration for 'cygunsagent' does not contain any channels. Marking it as invalid.
17 Dec 2015 13:35:37,765 WARN [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:127) - Agent configuration invalid for agent 'cygunsagent'. It will be removed.
17 Dec 2015 13:35:37,766 INFO [conf-file-poller-0] (org.apache.flume.conf.FlumeConfiguration.validateConfiguration:140) - Post-validation flume configuration contains configuration for agents: [cygnusagent]
17 Dec 2015 13:35:37,766 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:150) - Creating channels
17 Dec 2015 13:35:37,778 INFO [conf-file-poller-0] (org.apache.flume.channel.DefaultChannelFactory.create:40) - Creating instance of channel mysql-channel type memory
17 Dec 2015 13:35:37,782 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.loadChannels:205) - Created channel mysql-channel
17 Dec 2015 13:35:37,783 INFO [conf-file-poller-0] (org.apache.flume.source.DefaultSourceFactory.create:39) - Creating instance of source http-source, type org.apache.flume.source.http.HTTPSource
17 Dec 2015 13:35:37,791 INFO [conf-file-poller-0] (com.telefonica.iot.cygnus.handlers.OrionRestHandler.<init>:75) - Cygnus version (0.11.0.2a9c87fb7fd6156225e2eed7fbc9792f1d9c5dfe)
17 Dec 2015 13:35:37,807 INFO [conf-file-poller-0] (com.telefonica.iot.cygnus.handlers.OrionRestHandler.configure:141) - Startup completed
17 Dec 2015 13:35:37,826 INFO [conf-file-poller-0] (org.apache.flume.sink.DefaultSinkFactory.create:40) - Creating instance of sink: mysql-sink, type: com.telefonica.iot.cygnus.sinks.OrionMySQLSink
17 Dec 2015 13:35:37,839 INFO [conf-file-poller-0] (org.apache.flume.node.AbstractConfigurationProvider.getConfiguration:119) - Channel mysql-channel connected to [http-source, mysql-sink]
17 Dec 2015 13:35:37,843 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:138) - Starting new configuration:{ sourceRunners:{http-source=EventDrivenSourceRunner: { source:org.apache.flume.source.http.HTT$
17 Dec 2015 13:35:37,844 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:145) - Starting Channel mysql-channel
17 Dec 2015 13:35:37,910 INFO [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:110) - Monitoried counter group for type: CHANNEL, name: mysql-channel, registered successfully.
17 Dec 2015 13:35:37,910 INFO [lifecycleSupervisor-1-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:94) - Component type: CHANNEL, name: mysql-channel started
17 Dec 2015 13:35:37,911 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:173) - Starting Sink mysql-sink
17 Dec 2015 13:35:37,913 INFO [lifecycleSupervisor-1-1] (com.telefonica.iot.cygnus.sinks.OrionMySQLSink.start:153) - [mysql-sink] Startup completed
17 Dec 2015 13:35:37,915 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:184) - Starting Source http-source
17 Dec 2015 13:35:37,916 INFO [lifecycleSupervisor-1-2] (com.telefonica.iot.cygnus.interceptors.GroupingInterceptor.initialize:92) - Grouping rules read:
17 Dec 2015 13:35:37,916 INFO [conf-file-poller-0] (org.apache.flume.node.Application.stopAllComponents:101) - Shutting down configuration: { sourceRunners:{http-source=EventDrivenSourceRunner: { source:org.apache.flume.source.http.HT$
17 Dec 2015 13:35:37,917 INFO [conf-file-poller-0] (org.apache.flume.node.Application.stopAllComponents:105) - Stopping Source http-source
17 Dec 2015 13:35:37,920 ERROR [lifecycleSupervisor-1-2] (com.telefonica.iot.cygnus.interceptors.GroupingInterceptor.parseGroupingRules:165) - Error while parsing the Json-based grouping rules file. Details=null
17 Dec 2015 13:35:37,921 WARN [lifecycleSupervisor-1-2] (com.telefonica.iot.cygnus.interceptors.GroupingInterceptor.initialize:98) - Grouping rules syntax has errors
17 Dec 2015 13:35:37,948 INFO [lifecycleSupervisor-1-2] (org.mortbay.log.Slf4jLog.info:67) - jetty-6.1.26
17 Dec 2015 13:35:37,973 INFO [lifecycleSupervisor-1-2] (org.mortbay.log.Slf4jLog.info:67) - Started SocketConnector@0.0.0.0:5050
17 Dec 2015 13:35:37,974 INFO [lifecycleSupervisor-1-2] (org.apache.flume.instrumentation.MonitoredCounterGroup.register:110) - Monitoried counter group for type: SOURCE, name: http-source, registered successfully.
17 Dec 2015 13:35:37,974 INFO [lifecycleSupervisor-1-2] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:94) - Component type: SOURCE, name: http-source started
17 Dec 2015 13:35:37,974 INFO [conf-file-poller-0] (org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise:171) - Stopping component: EventDrivenSourceRunner: { source:org.apache.flume.source.http.HTTPSource{name:http-source,state$
17 Dec 2015 13:35:37,974 INFO [conf-file-poller-0] (org.mortbay.log.Slf4jLog.info:67) - Stopped SocketConnector@0.0.0.0:5050
17 Dec 2015 13:35:37,975 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:139) - Component type: SOURCE, name: http-source stopped
17 Dec 2015 13:35:37,976 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:145) - Shutdown Metric for type: SOURCE, name: http-source. source.start.time == 1450355737974
17 Dec 2015 13:35:37,976 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:151) - Shutdown Metric for type: SOURCE, name: http-source. source.stop.time == 1450355737975
17 Dec 2015 13:35:37,976 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: SOURCE, name: http-source. src.append-batch.accepted == 0
17 Dec 2015 13:35:37,976 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: SOURCE, name: http-source. src.append-batch.received == 0
17 Dec 2015 13:35:37,976 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: SOURCE, name: http-source. src.append.accepted == 0
17 Dec 2015 13:35:37,976 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: SOURCE, name: http-source. src.append.received == 0
17 Dec 2015 13:35:37,977 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: SOURCE, name: http-source. src.events.accepted == 0
17 Dec 2015 13:35:37,977 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: SOURCE, name: http-source. src.events.received == 0
17 Dec 2015 13:35:37,977 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: SOURCE, name: http-source. src.open-connection.count == 0
17 Dec 2015 13:35:37,977 INFO [conf-file-poller-0] (org.apache.flume.source.http.HTTPSource.stop:172) - Http source http-source stopped. Metrics: SOURCE:http-source{src.events.accepted=0, src.open-connection.count=0, src.append.receiv$
17 Dec 2015 13:35:37,977 INFO [conf-file-poller-0] (org.apache.flume.node.Application.stopAllComponents:115) - Stopping Sink mysql-sink
17 Dec 2015 13:35:37,977 INFO [conf-file-poller-0] (org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise:171) - Stopping component: SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@76da521f counterGroup:{ name:nul$
17 Dec 2015 13:35:37,987 INFO [conf-file-poller-0] (org.apache.flume.node.Application.stopAllComponents:125) - Stopping Channel mysql-channel
17 Dec 2015 13:35:37,987 INFO [conf-file-poller-0] (org.apache.flume.lifecycle.LifecycleSupervisor.unsupervise:171) - Stopping component: org.apache.flume.channel.MemoryChannel{name: mysql-channel}
17 Dec 2015 13:35:37,987 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:139) - Component type: CHANNEL, name: mysql-channel stopped
17 Dec 2015 13:35:37,987 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:145) - Shutdown Metric for type: CHANNEL, name: mysql-channel. channel.start.time == 1450355737910
17 Dec 2015 13:35:37,988 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:151) - Shutdown Metric for type: CHANNEL, name: mysql-channel. channel.stop.time == 1450355737987
17 Dec 2015 13:35:37,988 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: CHANNEL, name: mysql-channel. channel.capacity == 1000
17 Dec 2015 13:35:37,988 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: CHANNEL, name: mysql-channel. channel.current.size == 0
17 Dec 2015 13:35:37,988 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: CHANNEL, name: mysql-channel. channel.event.put.attempt == 0
17 Dec 2015 13:35:37,988 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: CHANNEL, name: mysql-channel. channel.event.put.success == 0
17 Dec 2015 13:35:37,988 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: CHANNEL, name: mysql-channel. channel.event.take.attempt == 1
17 Dec 2015 13:35:37,989 INFO [conf-file-poller-0] (org.apache.flume.instrumentation.MonitoredCounterGroup.stop:167) - Shutdown Metric for type: CHANNEL, name: mysql-channel. channel.event.take.success == 0
17 Dec 2015 13:35:37,989 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:138) - Starting new configuration:{ sourceRunners:{http-source=EventDrivenSourceRunner: { source:org.apache.flume.source.http.HTT$
17 Dec 2015 13:35:37,989 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:145) - Starting Channel mysql-channel
17 Dec 2015 13:35:37,989 INFO [lifecycleSupervisor-1-3] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:94) - Component type: CHANNEL, name: mysql-channel started
17 Dec 2015 13:35:37,992 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:173) - Starting Sink mysql-sink
17 Dec 2015 13:35:37,993 INFO [lifecycleSupervisor-1-8] (com.telefonica.iot.cygnus.sinks.OrionMySQLSink.start:153) - [mysql-sink] Startup completed
17 Dec 2015 13:35:37,993 INFO [conf-file-poller-0] (org.apache.flume.node.Application.startAllComponents:184) - Starting Source http-source
17 Dec 2015 13:35:37,993 INFO [lifecycleSupervisor-1-4] (com.telefonica.iot.cygnus.interceptors.GroupingInterceptor.initialize:92) - Grouping rules read:
17 Dec 2015 13:35:37,994 ERROR [lifecycleSupervisor-1-4] (com.telefonica.iot.cygnus.interceptors.GroupingInterceptor.parseGroupingRules:165) - Error while parsing the Json-based grouping rules file. Details=null
17 Dec 2015 13:35:37,994 WARN [lifecycleSupervisor-1-4] (com.telefonica.iot.cygnus.interceptors.GroupingInterceptor.initialize:98) - Grouping rules syntax has errors
17 Dec 2015 13:35:37,994 INFO [lifecycleSupervisor-1-4] (org.mortbay.log.Slf4jLog.info:67) - jetty-6.1.26
17 Dec 2015 13:35:37,996 INFO [lifecycleSupervisor-1-4] (org.mortbay.log.Slf4jLog.info:67) - Started SocketConnector@0.0.0.0:5050
17 Dec 2015 13:35:37,996 INFO [lifecycleSupervisor-1-4] (org.apache.flume.instrumentation.MonitoredCounterGroup.start:94) - Component type: SOURCE, name: http-source started
Then I tried to subscribe Cygnus to the previously mentioned data:
(curl localhost:1026/v1/subscribeContext -s -S --header 'Content-Type: application/json' \
--header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
{
"entities": [
{
"type": "Room",
"isPattern": "false",
"id": "Room1"
}
],
"attributes": [
"pressure"
"temperature"
],
"reference": "http://localhost:5050/notify",
"duration": "P1M",
"notifyConditions": [
{
"type": "ONCHANGE",
"condValues": [
"pressure",
"temperature"
]
}
],
"throttling": "PT1S"
}
EOF
Even after I updated the information in the Context Broker, thinking it would trigger an event:
(curl localhost:1026/v1/updateContext -s -S --header 'Content-Type: application/json' \
--header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
{
"contextElements": [
{
"type": "Room",
"isPattern": "false",
"id": "Room1",
"attributes": [
{
"name": "temperature",
"type": "float",
"value": "333"
},
{
"name": "pressure",
"type": "integer",
"value": "555"
}
]
}
],
"updateAction": "APPEND"
}
EOF
But the Cygnus log remained exactly the same, and it's like nothing even got through to it. Which is odd, considering my agent_1.conf:
# Copyright 2014 Telefónica Investigación y Desarrollo, S.A.U
#
# This file is part of fiware-cygnus (FI-WARE project).
#
# fiware-cygnus is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General
# Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any
# later version.
# fiware-cygnus is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more
# details.
#
# You should have received a copy of the GNU Affero General Public License along with fiware-cygnus. If not, see
# http://www.gnu.org/licenses/.
#
# For those usages not covered by the GNU Affero General Public License please contact with iot_support at tid dot es
#=============================================
# To be put in APACHE_FLUME_HOME/conf/agent.conf
#
# General configuration template explaining how to setup a sink of each of the available types (HDFS, CKAN, MySQL).
#=============================================
# The next tree fields set the sources, sinks and channels used by Cygnus. You could use different names than the
# ones suggested below, but in that case make sure you keep coherence in properties names along the configuration file.
# Regarding sinks, you can use multiple types at the same time; the only requirement is to provide a channel for each
# one of them (this example shows how to configure 3 sink types at the same time). Even, you can define more than one
# sink of the same type and sharing the channel in order to improve the performance (this is like having
# multi-threading).
cygnusagent.sources = http-source
cygnusagent.sinks = mysql-sink
cygnusagent.channels = mysql-channel
#=============================================
# source configuration
# channel name where to write the notification events
cygnusagent.sources.http-source.channels = mysql-channel
# source class, must not be changed
cygnusagent.sources.http-source.type = org.apache.flume.source.http.HTTPSource
# listening port the Flume source will use for receiving incoming notifications
cygnusagent.sources.http-source.port = 5050
# Flume handler that will parse the notifications, must not be changed
cygnusagent.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.OrionRestHandler
# URL target
cygnusagent.sources.http-source.handler.notification_target = /notify
# Default service (service semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service = Trace_Data
# Default service path (service path semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service_path = Sensor
# Number of channel re-injection retries before a Flume event is definitely discarded (-1 means infinite retries)
cygnusagent.sources.http-source.handler.events_ttl = 10
# Source interceptors, do not change
cygnusagent.sources.http-source.interceptors = ts gi
# TimestampInterceptor, do not change
cygnusagent.sources.http-source.interceptors.ts.type = timestamp
# GroupinInterceptor, do not change
cygnusagent.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.GroupingInterceptor$Builder
# Grouping rules for the GroupingInterceptor, put the right absolute path to the file if necessary
# See the doc/design/interceptors document for more details
cygnusagent.sources.http-source.interceptors.gi.grouping_rules_conf_file = /usr/cygnus/conf/grouping_rules.conf
# ============================================
# OrionMySQLSink configuration
# channel name from where to read notification events
cygnusagent.sinks.mysql-sink.channel = mysql-channel
# sink class, must not be changed
cygnusagent.sinks.mysql-sink.type = com.telefonica.iot.cygnus.sinks.OrionMySQLSink
# true if the grouping feature is enabled for this sink, false otherwise
cygnusagent.sinks.mysql-sink.enable_grouping = false
# the FQDN/IP address where the MySQL server runs
cygnusagent.sinks.mysql-sink.mysql_host = 127.0.0.1
# the port where the MySQL server listes for incomming connections
cygnusagent.sinks.mysql-sink.mysql_port = 3306
# a valid user in the MySQL server
cygnusagent.sinks.mysql-sink.mysql_username = root
# password for the user above
cygnusagent.sinks.mysql-sink.mysql_password = klasika
# how the attributes are stored, either per row either per column (row, column)
cygnusagent.sinks.mysql-sink.attr_persistence = column
# select the table type from table-by-destination and table-by-service-path
cygnusagent.sinks.mysql-sink.table_type = table-by-destination
# number of notifications to be included within a processing batch
cygnusagent.sinks.mysql-sink.batch_size = 1
# timeout for batch accumulation
cygunsagent.sinks.mysql-sink.batch_timeout = 30
#=============================================
# mysql-channel configuration
# channel type (must not be changed)
cygnusagent.channels.mysql-channel.type = memory
# capacity of the channel
cygnusagent.channels.mysql-channel.capacity = 1000
# amount of bytes that can be sent per transaction
cygnusagent.channels.mysql-channel.transactionCapacity = 100
#============================================
It has 5050 and /notify as the reference address. I double-checked cygnus_instance_1.conf as well and it is pointing at agent_1.conf:
#####
#
# Configuration file for apache-flume
#
#####
# Copyright 2014 Telefonica Investigación y Desarrollo, S.A.U
#
# This file is part of fiware-cygnus (FI-WARE project).
#
# fiware-cygnus is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General
# Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any
# later version.
# fiware-cygnus is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more
# details.
#
# You should have received a copy of the GNU Affero General Public License along with fiware-cygnus. If not, see
# http://www.gnu.org/licenses/.
#
# For those usages not covered by the GNU Affero General Public License please contact with iot_support at tid dot es
# Who to run cygnus as. Note that you may need to use root if you want
# to run cygnus in a privileged port (<1024)
CYGNUS_USER=cygnus
# Where is the config folder
CONFIG_FOLDER=/usr/cygnus/conf
# Which is the config file
CONFIG_FILE=/usr/cygnus/conf/agent_1.conf
# Name of the agent. The name of the agent is not trivial, since it is the base for the Flume parameters
# naming conventions, e.g. it appears in .sources.http-source.channels=...
AGENT_NAME=cygnusagent
# Name of the logfile located at /var/log/cygnus. It is important to put the extension '.log' in order to the log rotation works properly
LOGFILE_NAME=cygnus.log
# Administration port. Must be unique per instance
ADMIN_PORT=8081
# Polling interval (seconds) for the configuration reloading
POLLING_INTERVAL=30
This is the content of my config folder in /usr/cygnus/conf:
[root@centos conf]# ls
agent_1.conf cygnus_instance_1.conf flume-env.sh grouping_rules.conf krb5.conf krb5_login.conf log4j.properties.template
agent.conf.template cygnus_instance.conf.template flume-env.sh.template grouping_rules.conf.template krb5.conf.template log4j.properties README.md
I noticed that there is an exact mirror of this in /etc/cygnus/conf, but I didn't touch anything because the installation only instructs me to use the /usr/ folder.
Here is my MySQL CREATE statement. In this table I am expecting to receive the Context Broker data, but of course I get nothing, since the log didn't register anything.
CREATE TABLE sensor_room1_room (
sensorID INT NOT NULL AUTO_INCREMENT,
recvTime mediumtext,
fiwareservicepath text,
entityId text,
entityType text,
pressure text,
pressure_md text,
temperature text,
temperature_md text,
PRIMARY KEY (sensorID));
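Once notifications do arrive, a quick way to check whether rows are being persisted is to query the table directly; the database name comes from the FIWARE service (Trace_Data above) and the table name from the destination, so treat both as placeholders:
# Placeholder names; adjust to whatever database/table Cygnus actually creates
mysql -u root -p -e "SELECT * FROM Trace_Data.sensor_room1_room LIMIT 5;"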
Edit 1:
Here is the listener:
[root@centos conf]# nc -l 5050
But when I tried subscribing or updating the context, nothing was received on the listener side. I am not taking the client side of nc into account (nc 127.0.0.1 5050), because it successfully sends everything I type (even gibberish).
I also tried the test: /usr/cygnus/bin/cygnus-flume-ng agent --conf /usr/cygnus/conf/ -f /usr/cygnus/conf/agent_1.conf -n cygnusagent -Dflume.root.logger=DEBUG,console. I tried both 5050 and 8081 ports to subscribe to and then update context but nothing is read on the console.
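To rule out the Cygnus HTTP source itself, a notification can also be POSTed to port 5050 by hand, independently of Orion; the payload below is a hand-written NGSI v1 notifyContext example and the exact headers Cygnus expects may vary by version, so treat it as a sketch:
curl -v http://localhost:5050/notify \
--header 'Content-Type: application/json' \
--header 'Fiware-Service: Trace_Data' \
--header 'Fiware-ServicePath: Sensor' \
-d '{
  "subscriptionId": "51c0ac9ed714fb3b37d7d5a8",
  "originator": "localhost",
  "contextResponses": [
    {
      "contextElement": {
        "type": "Room",
        "isPattern": "false",
        "id": "Room1",
        "attributes": [
          { "name": "temperature", "type": "float", "value": "26.5" }
        ]
      },
      "statusCode": { "code": "200", "reasonPhrase": "OK" }
    }
  ]
}'
If this request does show up in the Cygnus log, the agent itself is fine and the problem is on the subscription side.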
I seriously have no idea why that subscription didn't work, but thanks to @fgalan I did manage to read the logs, so I am posting the subscription that did trigger the event to Cygnus:
(curl localhost:1026/v1/subscribeContext -s -S --header 'Content-Type: application/json' \
--header 'Accept: application/json' -d @- | python -mjson.tool) <<EOF
{
"entities": [
{
"type": "Room",
"isPattern": "false",
"id": "Room1"
}
],
"attributes": [
"temperature"
],
"reference": "http://localhost:5050/notify",
"duration": "P1M",
"notifyConditions": [
{
"type": "ONCHANGE",
"condValues": [
"pressure"
]
}
],
"throttling": "PT5S"
}
EOF
Thank you @fgalan one more time!

Unable to mount volumes for pod

EDITED:
I have an OpenShift cluster with one master and two nodes. I've installed NFS on the master and the NFS client on the nodes.
I've followed the wordpress example with NFS: https://github.com/openshift/origin/tree/master/examples/wordpress
I did the following on my master (logged in with oc login -u system:admin):
mkdir /home/data/pv0001
mkdir /home/data/pv0002
chown -R nfsnobody:nfsnobody /home/data
chmod -R 777 /home/data/
# Add to /etc/exports
/home/data/pv0001 *(rw,sync,no_root_squash)
/home/data/pv0002 *(rw,sync,no_root_squash)
# Enable the new exports without bouncing the NFS service
exportfs -a
So exportfs shows:
/home/data/pv0001
<world>
/home/data/pv0002
<world>
$ setsebool -P virt_use_nfs 1
# Create the persistent volumes for NFS.
# I did not change anything in the yaml-files
$ oc create -f examples/wordpress/nfs/pv-1.yaml
$ oc create -f examples/wordpress/nfs/pv-2.yaml
$ oc get pv
NAME LABELS CAPACITY ACCESSMODES STATUS CLAIM REASON
pv0001 <none> 1073741824 RWO,RWX Available
pv0002 <none> 5368709120 RWO Available
This is also what I get.
Then I'm going to my node:
oc login
test-admin
And I create a wordpress project:
oc new-project wordpress
# Create claims for storage in my project (same namespace).
# The claims in this example carefully match the volumes created above.
$ oc create -f examples/wordpress/pvc-wp.yaml
$ oc create -f examples/wordpress/pvc-mysql.yaml
$ oc get pvc
NAME LABELS STATUS VOLUME
claim-mysql map[] Bound pv0002
claim-wp map[] Bound pv0001
This looks exactly the same for me.
Launch the MySQL pod.
oc create -f examples/wordpress/pod-mysql.yaml
oc create -f examples/wordpress/service-mysql.yaml
oc create -f examples/wordpress/pod-wordpress.yaml
oc create -f examples/wordpress/service-wp.yaml
oc get svc
NAME LABELS SELECTOR IP(S) PORT(S)
mysql name=mysql name=mysql 172.30.115.137 3306/TCP
wpfrontend name=wpfrontend name=wordpress 172.30.170.55 5055/TCP
So actually everything seemed to work! But when I'm asking for my pod status I get the following:
[root@ip-10-0-0-104 pv0002]# oc get pod
NAME READY STATUS RESTARTS AGE
mysql 0/1 Image: openshift/mysql-55-centos7 is ready, container is creating 0 6h
wordpress 0/1 Image: wordpress is not ready on the node 0 6h
The pods are in pending state and in the webconsole they're giving the following error:
12:12:51 PM mysql Pod failedMount Unable to mount volumes for pod "mysql_wordpress": exit status 32 (607 times in the last hour, 41 minutes)
12:12:51 PM mysql Pod failedSync Error syncing pod, skipping: exit status 32 (607 times in the last hour, 41 minutes)
12:12:48 PM wordpress Pod failedMount Unable to mount volumes for pod "wordpress_wordpress": exit status 32 (604 times in the last hour, 40 minutes)
12:12:48 PM wordpress Pod failedSync Error syncing pod, skipping: exit status 32 (604 times in the last hour, 40 minutes)
Unable to mount, plus a timeout. But when I go to my node and do the following (/test is a directory I created on the node):
mount -t nfs -v masterhostname:/home/data/pv0002 /test
And when I place some file in /test on my node, it then appears in /home/data/pv0002 on my master, so that seems to work.
What's the reason that it's unable to mount in OpenShift?
I've been stuck on this for a while.
LOGS:
Oct 21 10:44:52 ip-10-0-0-129 docker: time="2015-10-21T10:44:52.795267904Z" level=info msg="GET /containers/json"
Oct 21 10:44:52 ip-10-0-0-129 origin-node: E1021 10:44:52.832179 1148 mount_linux.go:103] Mount failed: exit status 32
Oct 21 10:44:52 ip-10-0-0-129 origin-node: Mounting arguments: localhost:/home/data/pv0002 /var/lib/origin/openshift.local.volumes/pods/2bf19fe9-77ce-11e5-9122-02463424c049/volumes/kubernetes.io~nfs/pv0002 nfs []
Oct 21 10:44:52 ip-10-0-0-129 origin-node: Output: mount.nfs: access denied by server while mounting localhost:/home/data/pv0002
Oct 21 10:44:52 ip-10-0-0-129 origin-node: E1021 10:44:52.832279 1148 kubelet.go:1206] Unable to mount volumes for pod "mysql_wordpress": exit status 32; skipping pod
Oct 21 10:44:52 ip-10-0-0-129 docker: time="2015-10-21T10:44:52.832794476Z" level=info msg="GET /containers/json?all=1"
Oct 21 10:44:52 ip-10-0-0-129 docker: time="2015-10-21T10:44:52.835916304Z" level=info msg="GET /images/openshift/mysql-55-centos7/json"
Oct 21 10:44:52 ip-10-0-0-129 origin-node: E1021 10:44:52.837085 1148 pod_workers.go:111] Error syncing pod 2bf19fe9-77ce-11e5-9122-02463424c049, skipping: exit status 32
Logs showed Oct 21 10:44:52 ip-10-0-0-129 origin-node: Output: mount.nfs: access denied by server while mounting localhost:/home/data/pv0002
So it failed mounting on localhost.
To create my persistent volume I executed this YAML:
{
"apiVersion": "v1",
"kind": "PersistentVolume",
"metadata": {
"name": "registry-volume"
},
"spec": {
"capacity": {
"storage": "20Gi"
},
"accessModes": [ "ReadWriteMany" ],
"nfs": {
"path": "/home/data/pv0002",
"server": "localhost"
}
}
}
So I was mounting /home/data/pv0002, but that path was not on localhost; it was on my master server (which is ose3-master.example.com). So I had created my PV in the wrong way. Pointing the PV at the master's hostname instead fixed it:
{
"apiVersion": "v1",
"kind": "PersistentVolume",
"metadata": {
"name": "registry-volume"
},
"spec": {
"capacity": {
"storage": "20Gi"
},
"accessModes": [ "ReadWriteMany" ],
"nfs": {
"path": "/home/data/pv0002",
"server": "ose3-master.example.com"
}
}
}
This was also in a training environment. It's recommended to have an NFS server outside of your cluster to mount to.
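Before (re)creating the PV, the export can be verified from a node using the same server name the PV definition will use; a quick sanity check, assuming nfs-utils is installed on the node:
# What does the master export, seen from a node?
showmount -e ose3-master.example.com
# Try the mount exactly as the PV will request it
mount -t nfs -v ose3-master.example.com:/home/data/pv0002 /mnt
umount /mnt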