Our origin-node.service on the master node fails with:
root#master> systemctl start origin-node.service
Job for origin-node.service failed because the control process exited with error code. See "systemctl status origin-node.service" and "journalctl -xe" for details.
root#master> systemctl status origin-node.service -l
[...]
May 05 07:17:47 master origin-node[44066]: bootstrap.go:195] Part of the existing bootstrap client certificate is expired: 2020-02-20 13:14:27 +0000 UTC
May 05 07:17:47 master origin-node[44066]: bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
May 05 07:17:47 master origin-node[44066]: certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
May 05 07:17:47 master origin-node[44066]: server.go:262] failed to run Kubelet: cannot create certificate signing request: Post https://lb.openshift-cluster.mydomain.com:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: EOF
So it seems that kubelet-client-current.pem and/or kubelet-server-current.pem contain an expired certificate, and the service tries to create a CSR against an endpoint that is not yet available (because the master is down). We tried redeploying the certificates according to the OpenShift documentation, Redeploying Certificates, but this fails when it detects an expired certificate:
root#master> ansible-playbook -i /etc/ansible/hosts openshift-master/redeploy-openshift-ca.yml
[...]
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] *******************************************************************************************************************************************
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 60 days of expiring. You may view the report at /root/cert-expiry-report.20200505T042754.html or /root/cert-expiry-report.20200505T042754.json.\n"}
[...]
root#master> cat /root/cert-expiry-report.20200505T042754.json
[...]
"kubeconfigs": [
{
"cert_cn": "O:system:cluster-admins, CN:system:admin",
"days_remaining": -75,
"expiry": "2020-02-20 13:14:27",
"health": "expired",
"issuer": "CN=openshift-signer#1519045219 ",
"path": "/etc/origin/node/node.kubeconfig",
"serial": 27,
"serial_hex": "0x1b"
},
{
"cert_cn": "O:system:cluster-admins, CN:system:admin",
"days_remaining": -75,
"expiry": "2020-02-20 13:14:27",
"health": "expired",
"issuer": "CN=openshift-signer#1519045219 ",
"path": "/etc/origin/node/node.kubeconfig",
"serial": 27,
"serial_hex": "0x1b"
},
[...]
"summary": {
"expired": 2,
"ok": 22,
"total": 24,
"warning": 0
}
}
There is a guide for OpenShift 4.4, Recovering from expired control plane certificates, but it does not apply to 3.11, and we did not find such a guide for our version.
Is it possible to recreate the expired certificates for 3.11 without a running master node? Thanks for any help.
OpenShift Ansible: https://github.com/openshift/openshift-ansible/releases/tag/openshift-ansible-3.11.153-2
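For reference, the expiry dates reported above can be confirmed directly on the master with openssl (a hedged sketch; the paths are simply the ones that appear in the log and in the expiry report):

# Inspect the client certificate the kubelet is currently using
openssl x509 -noout -subject -enddate \
    -in /etc/origin/node/certificates/kubelet-client-current.pem

# Inspect the certificate embedded in the kubeconfig flagged by the report
awk '/client-certificate-data/ {print $2}' /etc/origin/node/node.kubeconfig \
    | base64 -d | openssl x509 -noout -subject -enddate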
Update 2020-05-06: I also executed redeploy-certificates.yml, but it fails at the same TASK:
root#master> ansible-playbook -i /etc/ansible/hosts playbooks/redeploy-certificates.yml
[...]
TASK [openshift_certificate_expiry : Fail when certs are near or already expired] ******************************************************************************
Wednesday 06 May 2020 04:07:06 -0400 (0:00:00.909) 0:01:07.582 *********
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"changed": false, "msg": "Cluster certificates found to be expired or within 60 days of expiring. You may view the report at /root/cert-expiry-report.20200506T040603.html or /root/cert-expiry-report.20200506T040603.json.\n"}
Update 2020-05-11: Running with -e openshift_certificate_expiry_fail_on_warn=False results in:
root#master> ansible-playbook -i /etc/ansible/hosts -e openshift_certificate_expiry_fail_on_warn=False playbooks/redeploy-certificates.yml
[...]
TASK [Wait for master API to come back online] *****************************************************************************************************************
Monday 11 May 2020 03:48:56 -0400 (0:00:00.111) 0:02:25.186 ************
skipping: [master.openshift-cluster.mydomain.com]
TASK [openshift_control_plane : restart master] ****************************************************************************************************************
Monday 11 May 2020 03:48:56 -0400 (0:00:00.257) 0:02:25.444 ************
changed: [master.openshift-cluster.mydomain.com] => (item=api)
changed: [master.openshift-cluster.mydomain.com] => (item=controllers)
RUNNING HANDLER [openshift_control_plane : verify API server] **************************************************************************************************
Monday 11 May 2020 03:48:57 -0400 (0:00:00.945) 0:02:26.389 ************
FAILED - RETRYING: verify API server (120 retries left).
FAILED - RETRYING: verify API server (119 retries left).
[...]
FAILED - RETRYING: verify API server (1 retries left).
fatal: [master.openshift-cluster.mydomain.com]: FAILED! => {"attempts": 120, "changed": false, "cmd": ["curl", "--silent", "--tlsv1.2", "--max-time", "2", "--cacert", "/etc/origin/master/ca-bundle.crt", "https://lb.openshift-cluster.mydomain.com:8443/healthz/ready"], "delta": "0:00:00.182367", "end": "2020-05-11 03:51:52.245644", "msg": "non-zero return code", "rc": 35, "start": "2020-05-11 03:51:52.063277", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}
root#master> systemctl status origin-node.service -l
[...]
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: E0511 04:23:28.077964 109972 bootstrap.go:195] Part of the existing bootstrap client certificate is expired: 2020-02-20 13:14:27 +0000 UTC
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: I0511 04:23:28.078001 109972 bootstrap.go:56] Using bootstrap kubeconfig to generate TLS client cert, key and kubeconfig file
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: I0511 04:23:28.080555 109972 certificate_store.go:131] Loading cert/key pair from "/etc/origin/node/certificates/kubelet-client-current.pem".
May 11 04:23:28 master.openshift-cluster.mydomain.com origin-node[109972]: F0511 04:23:28.130968 109972 server.go:262] failed to run Kubelet: cannot create certificate signing request: Post https://lb.openshift-cluster.mydomain.com:8443/apis/certificates.k8s.io/v1beta1/certificatesigningrequests: EOF
[...]
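curl exit code 35 is a TLS handshake failure, so the playbook is not even getting a proper HTTPS response from the load balancer. A useful next check (hedged sketch; the hostnames are the ones from the output above) is to see whether the endpoint presents any certificate at all, and if so which one and with which expiry:

# What certificate does the load balancer present on 8443?
openssl s_client -connect lb.openshift-cluster.mydomain.com:8443 </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -enddate

# Same check directly against the master API, bypassing the load balancer
# (assumes the master API itself also listens on 8443)
openssl s_client -connect master.openshift-cluster.mydomain.com:8443 </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -enddate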
I have the same case in a customer environment. This error happens because the certificate has expired. I "cheated" by changing the OS date to a point before the expiry date, and the origin-node service started on my masters:
systemctl status origin-node
● origin-node.service - OpenShift Node
Loaded: loaded (/etc/systemd/system/origin-node.service; enabled; vendor preset: disabled)
Active: active (running) since Sat 2021-02-20 20:22:21 -02; 6min ago
Docs: https://github.com/openshift/origin
Main PID: 37230 (hyperkube)
Memory: 79.0M
CGroup: /system.slice/origin-node.service
└─37230 /usr/bin/hyperkube kubelet --v=2 --address=0.0.0.0 --allow-privileged=true --anonymous-auth=true --authentication-token-webhook=true --authentication-token-webhook-cache-ttl=5m --authorization-mode=Webhook --authorization-webhook-c...
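For reference, a hedged sketch of that workaround (the timestamp is only an example chosen to lie before the expiry date from the log, and it assumes time synchronization can be paused; remember to restore the real clock once the certificates have been renewed):

# Pause NTP so the manual clock change is not overwritten
timedatectl set-ntp false

# Set the clock to a point before the certificate expiry (2020-02-20 in the log)
date -s "2020-02-19 12:00:00"

systemctl start origin-node.service

# ...renew/redeploy the certificates, then restore the real time
timedatectl set-ntp true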
The openshift_certificate_expiry role uses the openshift_certificate_expiry_fail_on_warn variable to determine if the playbook should fail when the days left are less than openshift_certificate_expiry_warning_days.
So try running the redeploy-certificates.yml with this additional variable set to "False":
ansible-playbook -i /etc/ansible/hosts -e openshift_certificate_expiry_fail_on_warn=False playbooks/redeploy-certificates.yml
I have a site-to-site VPN configuration between a Fortinet 800C and Google Cloud VPN, set up following this guide: https://cloud.google.com/files/CloudVPNGuide-UsingCloudVPNwithFortinetFortiGate300C.pdf.
But it is not successful. The logs look like this, repeated over and over:
16:43:36.240
sending packet: from 146.148.29.x[500] to 27.72.57.x[500] (640 bytes)
16:43:36.547
received packet: from 27.72.57.x[500] to 146.148.29.x[500] (360 bytes)
16:43:36.548
parsed IKE_SA_INIT request 0 [ SA KE No ]
16:43:36.548
27.72.57.x is initiating an IKE_SA
16:43:36.559
generating IKE_SA_INIT response 0 [ SA KE No N(MULT_AUTH) ]
16:43:36.559
sending packet: from 146.148.29.x[500] to 27.72.57.x[500] (384 bytes)
16:43:36.565
received packet: from 27.72.57.x[500] to 146.148.29.x[500] (360 bytes)
16:43:36.565
parsed IKE_SA_INIT response 0 [ SA KE No ]
16:43:36.571
authentication of '146.148.29.x' (myself) with pre-shared key
16:43:36.571
establishing CHILD_SA vpn_27.72.57.x{1}
16:43:36.571
generating IKE_AUTH request 1 [ IDi N(INIT_CONTACT) IDr AUTH SA TSi TSr N(EAP_ONLY) ]
16:43:36.572
sending packet: from 146.148.29.x[500] to 27.72.57.x[500] (316 bytes)
16:43:36.885
received packet: from 27.72.57.x[500] to 146.148.29.x[500] (204 bytes)
16:43:36.886
parsed IKE_AUTH request 1 [ IDi AUTH N(MSG_ID_SYN_SUP) SA TSi TSr ]
16:43:36.886
looking for peer configs matching 146.148.29.x[%any]...27.72.57.x[192.168.0.x]
16:43:36.886
no matching peer config found
16:43:36.886
generating IKE_AUTH response 1 [ N(AUTH_FAILED) ]
16:43:36.886
sending packet: from 146.148.29.x[500] to 27.72.57.x[500] (76 bytes)
16:43:36.891
received packet: from 27.72.57.x[500] to 146.148.29.x[500] (124 bytes)
16:43:36.891
parsed IKE_AUTH response 1 [ IDr AUTH N(TS_UNACCEPT) ]
16:43:36.891
authentication of '192.168.0.x' with pre-shared key successful
16:43:36.891
constraint check failed: identity '27.72.57.x' required
16:43:36.891
selected peer config 'vpn_27.72.57.x' inacceptable: constraint checking failed
16:43:36.891
no alternative config found
16:43:36.891
generating INFORMATIONAL request 2 [ N(AUTH_FAILED) ]
16:43:36.891
sending packet: from 146.148.29.x[500] to 27.72.57.x[500] (76 bytes)
16:43:37.887
received packet: from 27.72.57.x[500] to 146.148.29.x[500] (360 bytes)
16:43:37.888
parsed IKE_SA_INIT request 0 [ SA KE No ]
16:43:37.888
27.72.57.140 is initiating an IKE_SA
16:43:37.900
generating IKE_SA_INIT response 0 [ SA KE No N(MULT_AUTH) ]
I'd be very grateful if someone can spot my mistake. Thank you.
My guess is that the Cloud VPN and the Fortinet device are not configured to use the same IKE version. Please check that.
Also, try looking at the status message of the VPN as displayed in the Cloud Console, or by using 'gcloud compute vpn-tunnels describe' on the command line.
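A minimal sketch of that command (the tunnel name and region are placeholders; 'gcloud compute vpn-tunnels list' shows the real ones):

# List the tunnels in the project to find the right name and region
gcloud compute vpn-tunnels list

# Show the tunnel status, including the detailed status message
gcloud compute vpn-tunnels describe my-fortigate-tunnel \
    --region us-central1 \
    --format 'yaml(status, detailedStatus)'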
It looks like one or more of the phase 1 settings did not match up on both sides. Without looking at the actual config I cannot tell which one, but generally, check the pre-shared key, the authentication and encryption algorithms, the DH groups, the IP of the remote gateway, and the outgoing interface of the connection. These all have to match. Also, if you have NAT traversal enabled on one end, it has to be enabled on the other end as well.
I agree with the previous answers. The log says that phase 1 could not be established, so the parameters are not equal.
It seems that the PSK (pre-shared key) matches:
"authentication of '192.168.0.x' with pre-shared key successful"
I tried to start BOSH on ejabberd. My ejabberd.cfg snippet is below:
{5280, ejabberd_http, [
    {request_handlers, [
        {["xmpp-httpbind"], mod_http_bind}
    ]},
    captcha,
    http_bind,
    http_poll,
    web_admin
]}
http://localhost:5280/http-bind fails to open any page.
And my client gets this response from the server:
Sent XML:
<iq to='localhost' id='uid:50502b03:00004823' type='get' xmlns='jabber:client'><query xmlns='jabber:iq:auth'><username>anurag</username></query></iq>
Received XML:
<iq xmlns='jabber:client' from='localhost' id='uid:505029df:00004823' type='error'><error code='503' type='cancel'><service-unavailable xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/></error></iq>
Sent XML: </stream:stream>
auth failed. reason: 0
ce: 18
I am using the gloox library to create the client.
Did you add {mod_http_bind, []} to your modules section?
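For ejabberd 2.x that means mod_http_bind also has to be listed in the modules section of ejabberd.cfg, in addition to the request handler on the listener. Roughly like this (a sketch; the comment stands in for whatever modules your config already lists):

{modules, [
    %% ... your existing modules ...
    {mod_http_bind, []}
]}.

After changing the config, restart ejabberd; the handler is then served at the path configured under request_handlers (in the snippet above that is /xmpp-httpbind).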
I'm trying to connect to a Hypertable master machine (Hypertable is deployed via Mesos). When I copy the hypertable.cfg file from the master machine to some arbitrary machine and run start-thriftbroker.sh, all I get is about ten lines of "Waiting for ThriftBroker to come up..." followed by "ERROR: ThriftBroker did not come up". ThriftBroker's logfile says:
1342340080 NOTICE ThriftBroker : (/root/src/hypertable/src/cc/Common/Config.cc:526) Initializing ThriftBroker (Hypertable 0.9.5.6 (v0.9.5.6-dirty))...
CPU cores count=1
CephBroker.MonAddr=10.0.1.245:6789
CephBroker.Port=38030
CephBroker.Workers=20
DfsBroker.Host=localhost
DfsBroker.Local.Port=38030
DfsBroker.Local.Root=fs/local
DfsBroker.Port=38030
HdfsBroker.Port=38030
HdfsBroker.Workers=20
HdfsBroker.fs.default.name=hdfs://<ip>:9010
Hyperspace.GracePeriod=200000
Hyperspace.KeepAlive.Interval=30000
Hyperspace.Lease.Interval=1000000
Hyperspace.Replica.Dir=hyperspace
Hyperspace.Replica.Host=[<ip>]
Hyperspace.Replica.Port=38040
Hyperspace.Replica.Workers=20
Hypertable.Master.Port=38050
Hypertable.Master.Workers=20
Hypertable.RangeServer.Port=38060
Hypertable.Verbose=true
ThriftBroker.Port=38080
pidfile=/opt/hypertable/current/run/ThriftBroker.pid
port=38080
reactors=1
verbose=true
1342340080 INFO ThriftBroker : (/root/src/hypertable/src/cc/Hyperspace/Session.cc:63) Hyperspace session setup to reconnect
1342340082 ERROR ThriftBroker : main (/root/src/hypertable/src/cc/ThriftBroker/ThriftBroker.cc:2404): Hypertable::Exception: Hyperspace 'mkdir' error, name=/hypertable/namemap/names - HYPERSPACE file exists
at void Hyperspace::Session::mkdir(const std::string&, bool, const std::vector<Hyperspace::Attribute, std::allocator<Hyperspace::Attribute> >*, Hypertable::Timer*) (/root/src/hypertable/src/cc/Hyperspace/Session.cc:1257)
It got solved by updating to a newer version of Hypertable.
I made a Comet chat server with Erlang and MochiWeb, and I run "./start-dev.sh" to start the server. But after about a month I got the following error:
=ERROR REPORT==== 26-Sep-2009::09:21:06 ===
{mochiweb_socket_server,235,
{child_error,
{badmatch,
{error,
[70,97,105,108,101,100,32,115,101,110,100,105,110,103,32,100,
97,116,97,32,111,110,32,115,111,99,107,101,116,32,58,32,
"closed"]}}}}
mysql: fetch "SELECT appKey FROM applications WHERE appID = 1" (id p1)
=CRASH REPORT==== 26-Sep-2009::09:21:10 ===
crasher:
initial call: mochiweb_socket_server:acceptor_loop/1
pid: <0.4271.23>
registered_name: []
exception error: no match of right hand side value
{error,[70,97,105,108,101,100,32,115,101,110,100,105,110,
103,32,100,97,116,97,32,111,110,32,115,111,99,
107,101,116,32,58,32,"closed"]}
in function moonwalker_web:loop/2
in call from mochiweb_http:headers/5
ancestors: [moonwalker_web,moonwalker_sup,<0.52.0>]
messages: []
links: [<0.54.0>,#Port<0.792854>]
dictionary: [{mochiweb_request_body,
<<"appID=1&appKey=keyy&userID=8048943&nickName=bill&buddies=N%3B×tamp=1253928070154">>},
{mochiweb_request_recv,true},
{mochiweb_request_post,
[{"appID","1"},
{"appKey","key"},
{"userID","8048943"},
{"nickName",[143,229,167,144]},
{"buddies","N;"},
{"timestamp","1253928070154"}]},
{mochiweb_request_path,"/online"}]
trap_exit: false
status: running
heap_size: 2584
stack_size: 24
reductions: 1368
neighbours:
=ERROR REPORT==== 26-Sep-2009::09:21:10 ===
{mochiweb_socket_server,235,
{child_error,
{badmatch,
{error,
[70,97,105,108,101,100,32,115,101,110,100,105,110,103,32,100,
97,116,97,32,111,110,32,115,111,99,107,101,116,32,58,32,
"closed"]}}}}
And if I turn the following numbers into characters
[70,97,105,108,101,100,32,115,101,110,100,105,110,103,32,100,
97,116,97,32,111,110,32,115,111,99,107,101,116,32,58,32,
"closed"]}}}}
they are
Failed sending data on socket :"closed"
Does that mean I have a problem with the MySQL connection or with a socket?
I don't know if this error has something to do with "./start-dev.sh" or whether I just have some wrong settings.
What other information do I need to provide for diagnosis?
Thanks, and looking forward to your reply.
It looks like somewhere in the loop/2 function you don't handle an {error, Error} return from a function call. This causes the badmatch that crashes the process. Without the code it is difficult to say what caused the error return.
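As an illustration, a minimal sketch (not the question's actual code; gen_tcp:send/2 merely stands in for whatever call is returning the {error, ...} tuple inside moonwalker_web:loop/2):

-module(send_example).
-export([send_safe/2]).

%% A pattern match such as
%%     ok = gen_tcp:send(Socket, Data)
%% crashes the process with a badmatch as soon as the peer closes the
%% connection and the call returns {error, closed}. Matching on both
%% outcomes keeps the process alive and lets you log the failure instead.
send_safe(Socket, Data) ->
    case gen_tcp:send(Socket, Data) of
        ok ->
            ok;
        {error, Reason} ->
            error_logger:warning_msg("failed sending data on socket: ~p~n",
                                     [Reason]),
            {error, Reason}
    end.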