Problem:
Sending a test notification from SonarQube using STARTTLS over SMTP is failing.
Configuration used in SonarQube:
SMTP host: 1X.XXX.XX.X1
SMTP port: 587
Secure connection: starttls
A destination e-mail address is provided. SonarQube runs on a Debian 11 client. The SMTP host is an MS Exchange server that uses self-signed certificates, and these certificates are installed in the truststore.
Relevant:
A test notification sent using SMTP without STARTTLS is delivered successfully.
Log:
Below are the relevant fragments of the client's web.log from one such failed attempt to send a notification using SMTP with STARTTLS:
2022.10.24 09:36:57 INFO web[AYPp5oPhM9pKCPrzAA6Z][javax.mail] JavaMail version 1.6.2
2022.10.24 09:36:57 INFO web[AYPp5oPhM9pKCPrzAA6Z][javax.mail] successfully loaded resource: /META-INF/javamail.default.address.map
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][javax.activation] MailcapCommandMap: createDataContentHandler for text/plain
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][javax.activation] search DB #1
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][javax.activation] got content-handler
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][javax.activation] class com.sun.mail.handlers.text_plain
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][javax.mail] getProvider() returning javax.mail.Provider[TRANSPORT,smtp,com.sun.mail.smtp.SMTPTransport,Oracle]
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] useEhlo true, useAuth false
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] trying to connect to host "1X.XXX.XX.X1", port 587, isSSL false
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] connected to host "1X.XXX.XX.X1", port: 587
2022.10.24 09:36:57 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "SIZE", arg "26214400"
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "PIPELINING", arg ""
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "DSN", arg ""
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "ENHANCEDSTATUSCODES", arg ""
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "STARTTLS", arg ""
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "AUTH", arg "NTLM"
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "8BITMIME", arg ""
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "BINARYMIME", arg ""
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][com.sun.mail.smtp] Found extension "CHUNKING", arg ""
2022.10.24 09:36:58 DEBUG web[AYPp5oPhM9pKCPrzAA6Z][o.s.s.n.e.EmailNotificationChannel] Fail to send test email to xxxxxxx@xxxxx.xxx: {}
org.apache.commons.mail.EmailException: Sending the email to the following server failed : 1X.XXX.XX.X1:587
...
Caused by: javax.mail.MessagingException: Could not convert socket to TLS
...
Caused by: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
...
Caused by: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
...
Caused by: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
Command: $ echo | openssl s_client -connect 1X.XXX.XX.X1:587
returns:
CONNECTED(00000003)
140269928117568:error:1408F10B:SSL routines:ssl3_get_record:wrong version number:../ssl/record/ssl3_record.c:331:
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 5 bytes and written 283 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
---
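One possible reading of this first result: on port 587 the server speaks plaintext SMTP first, so a client that starts a TLS handshake immediately receives the SMTP greeting where it expects a TLS record. The sketch below illustrates the kind of check a TLS client applies to the 5-byte record header. The greeting bytes are an assumption (the output above only shows that 5 bytes were read), and `looks_like_tls_record` is a hypothetical helper, not part of OpenSSL.

```python
# Content types defined for the TLS record layer:
# 20 change_cipher_spec, 21 alert, 22 handshake, 23 application_data.
TLS_CONTENT_TYPES = {20, 21, 22, 23}


def looks_like_tls_record(header: bytes) -> bool:
    """Return True if the first 5 bytes look like a TLS record header.

    Layout: content type (1 byte), protocol version (2 bytes), length (2 bytes).
    """
    if len(header) < 5:
        return False
    content_type, major_version = header[0], header[1]
    return content_type in TLS_CONTENT_TYPES and major_version == 3


# Assumed first 5 bytes of an SMTP greeting such as "220 mail.example ...".
smtp_greeting_start = b"220 m"
# First 5 bytes of a TLS 1.2 handshake record (type 22, version 3.3, length 80).
tls_handshake_start = bytes([22, 3, 3, 0, 80])

print(looks_like_tls_record(smtp_greeting_start))  # False
print(looks_like_tls_record(tls_handshake_start))  # True
```

Passing -starttls smtp makes openssl issue the SMTP STARTTLS command before it begins the handshake, so the bytes it then reads are genuine TLS records.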
Command: $ echo | openssl s_client -connect 1X.XXX.XX.X1:587 -starttls smtp
returns:
CONNECTED(00000003)
Can't use SSL_get_servername
depth=2 CN = CA-ROOT
verify return:1
depth=1 DC = local, DC = regulator, CN = CA-SUB
verify return:1
depth=0 C = PL, ST = Aaa, L = Bbb, O = Ccc, OU = Ddd, CN = [DOMAIN.NAME]
verify return:1
---
Certificate chain
0 s:C = PL, ST = Aaa, L = Bbb, O = Ccc, OU = Ddd, CN = [DOMAIN.NAME]
i:DC = local, DC = regulator, CN = CA-SUB
1 s:DC = local, DC = regulator, CN = CA-SUB
i:CN = CA-ROOT
---
Server certificate
-----BEGIN CERTIFICATE-----
[...CERTIFICATE...]
-----END CERTIFICATE-----
subject=C = PL, ST = Aaa, L = Bbb, O = Ccc, OU = Ddd, CN = [DOMAIN.NAME]
issuer=DC = local, DC = regulator, CN = CA-SUB
---
No client certificate CA names sent
Peer signing digest: SHA256
Peer signature type: RSA
Server Temp Key: ECDH, P-256, 256 bits
---
SSL handshake has read 3596 bytes and written 498 bytes
Verification: OK
---
New, TLSv1.2, Cipher is ECDHE-RSA-AES256-SHA384
Server public key is 2048 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
Protocol : TLSv1.2
Cipher : ECDHE-RSA-AES256-SHA384
Session-ID: [...SESSION ID...]
Session-ID-ctx:
Master-Key: [...MASTER ID...]
PSK identity: None
PSK identity hint: None
SRP username: None
Start Time: 1666767236
Timeout : 7200 (sec)
Verify return code: 0 (ok)
Extended master secret: yes
---
250 CHUNKING
DONE
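The certificate chain printed above can also be read mechanically. The sketch below extracts the subject/issuer pairs and lists every issuer whose own certificate the server did not send; such a certificate has to come from the client's trust store. `parse_chain` and `issuers_not_sent` are hypothetical helpers. Whether openssl and the JVM that runs SonarQube consult the same trust store is an assumption worth checking in this setup, since the PKIX error is raised on the Java side while openssl reports "Verification: OK".

```python
import re

# Chain section copied from the openssl output above.
CHAIN = """\
Certificate chain
0 s:C = PL, ST = Aaa, L = Bbb, O = Ccc, OU = Ddd, CN = [DOMAIN.NAME]
i:DC = local, DC = regulator, CN = CA-SUB
1 s:DC = local, DC = regulator, CN = CA-SUB
i:CN = CA-ROOT
"""


def parse_chain(text: str) -> list[tuple[str, str]]:
    """Return (subject, issuer) pairs from a 'Certificate chain' section."""
    pairs = []
    subject = None
    for raw in text.splitlines():
        line = raw.strip()
        subject_match = re.match(r"^\d+\s+s:(.*)$", line)
        if subject_match:
            subject = subject_match.group(1).strip()
            continue
        issuer_match = re.match(r"^i:(.*)$", line)
        if issuer_match and subject is not None:
            pairs.append((subject, issuer_match.group(1).strip()))
            subject = None
    return pairs


def issuers_not_sent(pairs: list[tuple[str, str]]) -> list[str]:
    """List issuers that never appear as the subject of a sent certificate."""
    subjects = {subject for subject, _ in pairs}
    missing = []
    for _, issuer in pairs:
        if issuer not in subjects and issuer not in missing:
            missing.append(issuer)
    return missing


print(issuers_not_sent(parse_chain(CHAIN)))  # ['CN = CA-ROOT']
```

In this output the server sends the leaf certificate and CA-SUB, so CA-ROOT is the certificate the validating client must already trust.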
Question:
What do I need to do for SonarQube notifications to be delivered successfully using STARTTLS over SMTP?
I installed the NGINX ingress controller by following the instructions at this link, using the "Install NGINX using NodePort" option.
When I run ks logs -f ingress-nginx-controller-7f48b8-s7pg4 -n ingress-nginx I get:
W0304 09:33:40.568799 8 client_config.go:614] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.
I0304 09:33:40.569097 8 main.go:241] "Creating API client" host="https://10.96.0.1:443"
I0304 09:33:40.584904 8 main.go:285] "Running in Kubernetes cluster" major="1" minor="23" git="v1.23.1+k0s" state="clean" commit="b230d3e4b9d6bf4b731d96116a6643786e16ac3f" platform="linux/amd64"
I0304 09:33:40.911443 8 main.go:105] "SSL fake certificate created" file="/etc/ingress-controller/ssl/default-fake-certificate.pem"
I0304 09:33:40.916404 8 main.go:115] "Enabling new Ingress features available since Kubernetes v1.18"
W0304 09:33:40.918137 8 main.go:127] No IngressClass resource with name nginx found. Only annotation will be used.
I0304 09:33:40.942282 8 ssl.go:532] "loading tls certificate" path="/usr/local/certificates/cert" key="/usr/local/certificates/key"
I0304 09:33:40.977766 8 nginx.go:254] "Starting NGINX Ingress controller"
I0304 09:33:41.007616 8 event.go:282] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"ingress-nginx", Name:"ingress-nginx-controller", UID:"1a4482d2-86cb-44f3-8ebb-d6342561892f", APIVersion:"v1", ResourceVersion:"987560", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap ingress-nginx/ingress-nginx-controller
E0304 09:33:42.087113 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:43.041954 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:44.724681 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:48.303789 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:59.113203 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:34:16.727052 8 reflector.go:138] k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
I0304 09:34:39.216165 8 main.go:187] "Received SIGTERM, shutting down"
I0304 09:34:39.216773 8 nginx.go:372] "Shutting down controller queues"
E0304 09:34:39.217779 8 store.go:178] timed out waiting for caches to sync
I0304 09:34:39.217856 8 nginx.go:296] "Starting NGINX process"
I0304 09:34:39.218007 8 leaderelection.go:243] attempting to acquire leader lease ingress-nginx/ingress-controller-leader-nginx...
I0304 09:34:39.219741 8 queue.go:78] "queue has been shutdown, failed to enqueue" key="&ObjectMeta{Name:initial-sync,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ClusterName:,ManagedFields:[]ManagedFieldsEntry{},}"
I0304 09:34:39.219787 8 nginx.go:316] "Starting validation webhook" address=":8443" certPath="/usr/local/certificates/cert" keyPath="/usr/local/certificates/key"
I0304 09:34:39.242501 8 leaderelection.go:253] successfully acquired lease ingress-nginx/ingress-controller-leader-nginx
I0304 09:34:39.242807 8 queue.go:78] "queue has been shutdown, failed to enqueue" key="&ObjectMeta{Name:sync status,GenerateName:,Namespace:,SelfLink:,UID:,ResourceVersion:,Generation:0,CreationTimestamp:0001-01-01 00:00:00 +0000 UTC,DeletionTimestamp:<nil>,DeletionGracePeriodSeconds:nil,Labels:map[string]string{},Annotations:map[string]string{},OwnerReferences:[]OwnerReference{},Finalizers:[],ClusterName:,ManagedFields:[]ManagedFieldsEntry{},}"
I0304 09:34:39.242837 8 status.go:84] "New leader elected" identity="ingress-nginx-controller-7f48b8-s7pg4"
I0304 09:34:39.252025 8 status.go:204] "POD is not ready" pod="ingress-nginx/ingress-nginx-controller-7f48b8-s7pg4" node="fbcdcesdn02"
I0304 09:34:39.255282 8 status.go:132] "removing value from ingress status" address=[]
I0304 09:34:39.255328 8 nginx.go:380] "Stopping admission controller"
I0304 09:34:39.255379 8 nginx.go:388] "Stopping NGINX process"
E0304 09:34:39.255664 8 nginx.go:319] "Error listening for TLS connections" err="http: Server closed"
2022/03/04 09:34:39 [notice] 43#43: signal process started
I0304 09:34:40.263361 8 nginx.go:401] "NGINX process has stopped"
I0304 09:34:40.263396 8 main.go:195] "Handled quit, awaiting Pod deletion"
I0304 09:34:50.263585 8 main.go:198] "Exiting" code=0
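The repeated E-level lines are the informative part of this log: the controller keeps asking the API server for the v1beta1 Ingress resource, and the server answers that it cannot find it. A small sketch for tallying such lines in a saved log follows. `summarize_watch_failures` is a hypothetical helper, and the sample lines are abbreviated copies of the ones above (the client-go path prefix is omitted).

```python
import re
from collections import Counter

# Matches the reflector message and captures the resource type being watched.
WATCH_FAILURE = re.compile(r"Failed to watch ([^\s:]+):")

# Abbreviated sample taken from the controller log above.
SAMPLE_LOG = """\
I0304 09:33:40.977766 8 nginx.go:254] "Starting NGINX Ingress controller"
E0304 09:33:42.087113 8 reflector.go:138] Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:43.041954 8 reflector.go:138] Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
E0304 09:33:44.724681 8 reflector.go:138] Failed to watch *v1beta1.Ingress: failed to list *v1beta1.Ingress: the server could not find the requested resource
I0304 09:34:39.216165 8 main.go:187] "Received SIGTERM, shutting down"
"""


def summarize_watch_failures(log_text: str) -> Counter:
    """Count 'Failed to watch' lines per resource type."""
    counts = Counter()
    for line in log_text.splitlines():
        match = WATCH_FAILURE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


print(dict(summarize_watch_failures(SAMPLE_LOG)))  # {'*v1beta1.Ingress': 3}
```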
When I run ks describe pod ingress-nginx-controller-7f48b8-s7pg4 -n ingress-nginx I get:
Name: ingress-nginx-controller-7f48b8-s7pg4
Namespace: ingress-nginx
Priority: 0
Node: fxxxxxxxx/10.XXX.XXX.XXX
Start Time: Fri, 04 Mar 2022 08:12:57 +0200
Labels: app.kubernetes.io/component=controller
app.kubernetes.io/instance=ingress-nginx
app.kubernetes.io/name=ingress-nginx
pod-template-hash=7f48b8
Annotations: kubernetes.io/psp: 00-k0s-privileged
Status: Running
IP: 10.244.0.119
IPs:
IP: 10.244.0.119
Controlled By: ReplicaSet/ingress-nginx-controller-7f48b8
Containers:
controller:
Container ID: containerd://638ff4d63b7ba566125bd6789d48db6e8149b06cbd9d887ecc57d08448ba1d7e
Image: k8s.gcr.io/ingress-nginx/controller:v0.48.1@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899
Image ID: k8s.gcr.io/ingress-nginx/controller@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899
Ports: 80/TCP, 443/TCP, 8443/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP
Args:
/nginx-ingress-controller
--election-id=ingress-controller-leader
--ingress-class=nginx
--configmap=$(POD_NAMESPACE)/ingress-nginx-controller
--validating-webhook=:8443
--validating-webhook-certificate=/usr/local/certificates/cert
--validating-webhook-key=/usr/local/certificates/key
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Fri, 04 Mar 2022 11:33:40 +0200
Finished: Fri, 04 Mar 2022 11:34:50 +0200
Ready: False
Restart Count: 61
Requests:
cpu: 100m
memory: 90Mi
Liveness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=5
Readiness: http-get http://:10254/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
Environment:
POD_NAME: ingress-nginx-controller-7f48b8-s7pg4 (v1:metadata.name)
POD_NAMESPACE: ingress-nginx (v1:metadata.namespace)
LD_PRELOAD: /usr/local/lib/libmimalloc.so
Mounts:
/usr/local/certificates/ from webhook-cert (ro)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-zvcnr (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
webhook-cert:
Type: Secret (a volume populated by a Secret)
SecretName: ingress-nginx-admission
Optional: false
kube-api-access-zvcnr:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning Unhealthy 23m (x316 over 178m) kubelet Readiness probe failed: HTTP probe failed with statuscode: 500
Warning BackOff 8m52s (x555 over 174m) kubelet Back-off restarting failed container
Normal Pulled 3m54s (x51 over 178m) kubelet Container image "k8s.gcr.io/ingress-nginx/controller:v0.48.1@sha256:e9fb216ace49dfa4a5983b183067e97496e7a8b307d2093f4278cd550c303899" already present on machine
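One plausible reading of the exit code 0 together with the "Received SIGTERM" line in the log is that the kubelet restarts the container after the liveness probe fails repeatedly. The liveness and readiness probes both target /healthz on port 10254, and the readiness probe is reported as returning 500. This is an assumption: the events shown list only readiness failures. The sketch below checks whether the timing fits, using the probe settings from the describe output and the first and SIGTERM timestamps from the log. It is a rough model that ignores the probe timeout and kubelet scheduling jitter; the helper names are hypothetical.

```python
from datetime import datetime


def liveness_kill_window(initial_delay_s: int, period_s: int,
                         failure_threshold: int) -> tuple[int, int]:
    """Rough bounds (seconds after container start) for a liveness restart.

    Earliest: first probe fires right at the initial delay and every probe fails.
    Latest: the first probe is deferred by up to one extra period.
    """
    earliest = initial_delay_s + (failure_threshold - 1) * period_s
    latest = earliest + period_s
    return earliest, latest


def seconds_between(start: str, end: str, fmt: str = "%H:%M:%S") -> int:
    """Whole seconds between two same-day clock times."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds())


# Liveness: delay=10s period=10s #failure=5 (from the describe output).
window = liveness_kill_window(initial_delay_s=10, period_s=10, failure_threshold=5)
# First log line at 09:33:40, "Received SIGTERM" at 09:34:39.
observed = seconds_between("09:33:40", "09:34:39")

print(window, observed)                      # (50, 60) 59
print(window[0] <= observed <= window[1])    # True
```

If this reading is right, raising the probe delays would only postpone the restart, because /healthz keeps failing while the controller cannot sync its caches.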
When I try to curl the health endpoints I get Connection refused.
The pod listing shows that neither pod is ready:
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-4hzzk 0/1 Completed 0 3h30m
ingress-nginx-controller-7f48b8-s7pg4 0/1 CrashLoopBackOff 63 (91s ago) 3h30m
I have tried to increase the values for initialDelaySeconds in /etc/nginx/nginx.conf, but when I attempt to exec into the container (ks exec -it -n ingress-nginx ingress-nginx-controller-7f48b8-s7pg4 -- bash) I get the following error: error: unable to upgrade connection: container not found ("controller")
I am not really sure where I should be looking in the overall setup.
The problem is that you are using outdated k0s documentation:
https://docs.k0sproject.io/v1.22.2+k0s.1/examples/nginx-ingress/
You should use this link instead:
https://docs.k0sproject.io/main/examples/nginx-ingress/
Following the current documentation installs the controller-v1.0.0 version on your Kubernetes cluster:
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.0.0/deploy/static/provider/baremetal/deploy.yaml
The result is:
$ sudo k0s kubectl get pods -n ingress-nginx
NAME READY STATUS RESTARTS AGE
ingress-nginx-admission-create-dw2f4 0/1 Completed 0 11m
ingress-nginx-admission-patch-4dmpd 0/1 Completed 0 11m
ingress-nginx-controller-75f58fbf6b-xrfxr 1/1 Running 0 11m
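A likely explanation for why the newer manifest helps: the v0.48.1 controller in the question watches the v1beta1 Ingress API (see the "Failed to watch *v1beta1.Ingress" lines), while the cluster reports Kubernetes minor version "23", and Kubernetes stopped serving the v1beta1 Ingress API in 1.22. The sketch below encodes that as a small lookup. The version table is written from memory of the Kubernetes deprecation guide and should be verified against it; `served_ingress_apis` is a hypothetical helper.

```python
# (API group/version, first minor release serving it, first minor release NOT serving it)
# Assumption: values taken from the Kubernetes deprecated API migration guide.
INGRESS_APIS = [
    ("networking.k8s.io/v1beta1", 14, 22),
    ("networking.k8s.io/v1", 19, None),
]


def served_ingress_apis(minor: int) -> list[str]:
    """Return the Ingress API versions served by Kubernetes 1.<minor>."""
    return [
        name
        for name, since, removed in INGRESS_APIS
        if minor >= since and (removed is None or minor < removed)
    ]


print(served_ingress_apis(21))  # ['networking.k8s.io/v1beta1', 'networking.k8s.io/v1']
print(served_ingress_apis(23))  # ['networking.k8s.io/v1']
```

On the 1.23 cluster from the question only networking.k8s.io/v1 is served, so a controller that lists v1beta1 Ingress objects never finishes its initial sync.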
I am setting up a cluster. I tried to join 3 nodes, but got the error below while rebalancing. I extracted some information from debug.log but am unable to identify the exact issue. I would appreciate any help.
=========================CRASH REPORT=========================
crasher:
initial call: service_agent:-spawn_connection_waiter/2-fun-0-/0
pid: <0.18486.7>
registered_name: []
exception exit: {no_connection,"index-service_api"}
in function service_agent:wait_for_connection_loop/3 (src/service_agent.erl, line 305)
ancestors: ['service_agent-index',service_agent_children_sup,
service_agent_sup,ns_server_sup,ns_server_nodes_sup,
<0.170.0>,ns_server_cluster_sup,<0.89.0>]
messages: []
links: [<0.18481.7>,<0.18490.7>]
dictionary: []
trap_exit: false
status: running
heap_size: 987
stack_size: 27
reductions: 1195
neighbours:
[ns_server:error,2018-02-12T13:54:43.531-05:00,ns_1#xuodf9.firebrand.com:service_agent-index<0.18481.7>:service_agent:terminate:264]Terminating abnormally
[ns_server:debug,2018-02-12T13:54:43.531-05:00,ns_1#xuodf9.firebrand.com:<0.18487.7>:ns_pubsub:do_subscribe_link:145]Parent process of subscription {ns_config_events,<0.18481.7>} exited with reason {linked_process_died,
<0.18486.7>,
{no_connection,
"index-service_api"}}
[error_logger:error,2018-02-12T13:54:43.531-05:00,ns_1#xuodf9.firebrand.com:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]** Generic server 'service_agent-index' terminating
** Last message in was {'EXIT',<0.18486.7>,
{no_connection,"index-service_api"}}
** When Server state == {state,index,
{dict,6,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[[{uuid,<<"55a14ec6b06d72205b3cd956e6de60e7">>}|
'ns_1#xuodf7.firebrand.com']],
[],
[[{uuid,<<"c5e67322a74826bef8edf27d51de3257">>}|
'ns_1#xuodf8.firebrand.com']],
[],
[[{uuid,<<"3b55f7739e3fe85127dcf857a5819bdf">>}|
'ns_1#xuodf9.firebrand.com']],
[],
[[{node,'ns_1#xuodf7.firebrand.com'}|
<<"55a14ec6b06d72205b3cd956e6de60e7">>],
[{node,'ns_1#xuodf8.firebrand.com'}|
<<"c5e67322a74826bef8edf27d51de3257">>],
[{node,'ns_1#xuodf9.firebrand.com'}|
<<"3b55f7739e3fe85127dcf857a5819bdf">>]],
[],[],[],[],[],[],[],[],[]}}},
undefined,undefined,<0.18626.7>,#Ref<0.0.5.56873>,
<0.18639.7>,
{[{<0.18646.7>,#Ref<0.0.5.56891>}],[]},
undefined,undefined,undefined,undefined,undefined}
** Reason for termination ==
** {linked_process_died,<0.18486.7>,{no_connection,"index-service_api"}}
[error_logger:error,2018-02-12T13:54:43.532-05:00,ns_1#xuodf9.firebrand.com:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================CRASH REPORT=========================
crasher:
initial call: service_agent:init/1
pid: <0.18481.7>
registered_name: 'service_agent-index'
exception exit: {linked_process_died,<0.18486.7>,
{no_connection,"index-service_api"}}
in function gen_server:terminate/6 (gen_server.erl, line 744)
ancestors: [service_agent_children_sup,service_agent_sup,ns_server_sup,
ns_server_nodes_sup,<0.170.0>,ns_server_cluster_sup,
<0.89.0>]
messages: [{'EXIT',<0.18639.7>,
{linked_process_died,<0.18486.7>,
{no_connection,"index-service_api"}}}]
links: [<0.18487.7>,<0.4805.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 28690
stack_size: 27
reductions: 6334
neighbours:
[error_logger:error,2018-02-12T13:54:43.533-05:00,ns_1#xuodf9.firebrand.com:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================SUPERVISOR REPORT=========================
Supervisor: {local,service_agent_children_sup}
Context: child_terminated
Reason: {linked_process_died,<0.18486.7>,
{no_connection,"index-service_api"}}
Offender: [{pid,<0.18481.7>},
{name,{service_agent,index}},
{mfargs,{service_agent,start_link,[index]}},
{restart_type,permanent},
{shutdown,1000},
{child_type,worker}]
[ns_server:error,2018-02-12T13:54:43.533-05:00,ns_1#xuodf9.firebrand.com:service_rebalancer-index<0.18626.7>:service_rebalancer:run_rebalance:80]Agent terminated during the rebalance: {'DOWN',#Ref<0.0.5.56860>,process,
<0.18481.7>,
{linked_process_died,<0.18486.7>,
{no_connection,"index-service_api"}}}
[error_logger:info,2018-02-12T13:54:43.534-05:00,ns_1#xuodf9.firebrand.com:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================PROGRESS REPORT=========================
supervisor: {local,service_agent_children_sup}
started: [{pid,<0.20369.7>},
{name,{service_agent,index}},
{mfargs,{service_agent,start_link,[index]}},
{restart_type,permanent},
{shutdown,1000},
{child_type,worker}]
[ns_server:error,2018-02-12T13:54:43.534-05:00,ns_1#xuodf9.firebrand.com:service_agent-index<0.20369.7>:service_agent:handle_call:186]Got rebalance-only call {if_rebalance,<0.18626.7>,unset_rebalancer} that doesn't match rebalancer pid undefined
[ns_server:error,2018-02-12T13:54:43.534-05:00,ns_1#xuodf9.firebrand.com:service_rebalancer-index<0.18626.7>:service_agent:process_bad_results:815]Service call unset_rebalancer (service index) failed on some nodes:
[{'ns_1#xuodf9.firebrand.com',nack}]
[ns_server:warn,2018-02-12T13:54:43.534-05:00,ns_1#xuodf9.firebrand.com:service_rebalancer-index<0.18626.7>:service_rebalancer:run_rebalance:89]Failed to unset rebalancer on some nodes:
{error,{bad_nodes,index,unset_rebalancer,
[{'ns_1#xuodf9.firebrand.com',nack}]}}
[error_logger:error,2018-02-12T13:54:43.535-05:00,ns_1#xuodf9.firebrand.com:error_logger<0.6.0>:ale_error_logger_handler:do_log:203]
=========================CRASH REPORT=========================
crasher:
initial call: service_rebalancer:-spawn_monitor/6-fun-0-/0
pid: <0.18626.7>
registered_name: 'service_rebalancer-index'
exception exit: {linked_process_died,<0.18486.7>,
{no_connection,"index-service_api"}}
in function service_rebalancer:run_rebalance/7 (src/service_rebalancer.erl, line 92)
ancestors: [cleanup_process,ns_janitor_server,ns_orchestrator_child_sup,
ns_orchestrator_sup,mb_master_sup,mb_master,<0.4893.0>,
ns_server_sup,ns_server_nodes_sup,<0.170.0>,
ns_server_cluster_sup,<0.89.0>]
messages: [{'EXIT',<0.18640.7>,
{linked_process_died,<0.18486.7>,
{no_connection,"index-service_api"}}}]
links: []
dictionary: []
trap_exit: true
status: running
heap_size: 2586
stack_size: 27
reductions: 6359
neighbours:
[ns_server:error,2018-02-12T13:54:43.536-05:00,ns_1#xuodf9.firebrand.com:cleanup_process<0.18625.7>:service_janitor:maybe_init_topology_aware_service:84]Initial rebalance for `index` failed: {error,
{initial_rebalance_failed,index,
{linked_process_died,<0.18486.7>,
{no_connection,
"index-service_api"}}}}
[ns_server:debug,2018-02-12T13:54:43.536-05:00,ns_1#xuodf9.firebrand.com:menelaus_cbauth<0.4796.0>:menelaus_cbauth:handle_cast:95]Observed json rpc process {"projector-cbauth",<0.5099.0>} needs_update
[ns_server:debug,2018-02-12T13:54:43.538-05:00,ns_1#xuodf9.firebrand.com:menelaus_cbauth<0.4796.0>:menelaus_cbauth:handle_cast:95]Observed json rpc process {"goxdcr-cbauth",<0.479.0>} needs_update
[ns_server:debug,2018-02-12T13:54:43.539-05:00,ns_1#xuodf9.firebrand.com:menelaus_cbauth<0.4796.0>:menelaus_cbauth:handle_cast:95]Observed json rpc process {"cbq-engine-cbauth",<0.5124.0>} needs_update
[ns_server:debug,2018-02-12T13:54:43.540-05:00,ns_1#xuodf9.firebrand.com:menelaus_cbauth<0.4796.0>:menelaus_cbauth:handle_cast:95]Observed json rpc process {"fts-cbauth",<0.5129.0>} needs_update
This is a blocker for cluster creation at this point.
The rebalance error is caused by the index service. Check indexer.log for errors and to confirm that the indexer process is able to bootstrap correctly.
Please make sure the communication ports are open as described here: https://developer.couchbase.com/documentation/server/current/install/install-ports.html
In particular, a blocked projector_port (9999) can lead to this error.
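A quick way to test reachability between nodes is a plain TCP connect, sketched below. `port_is_open` and `check_ports` are hypothetical helpers. A successful connect only shows that something is listening and that no firewall blocks the path; it says nothing about the health of the service behind the port. The demo at the end probes a local listener so that the block runs anywhere.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host: str, ports: list[int], timeout: float = 2.0) -> dict[int, bool]:
    """Probe several ports on one host and report which ones accept connections."""
    return {port: port_is_open(host, port, timeout) for port in ports}


# Self-contained demo: open a listener on an ephemeral local port, then probe it.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
demo_port = listener.getsockname()[1]
print(check_ports("127.0.0.1", [demo_port]))  # the demo port maps to True
listener.close()
```

Against the cluster, one would run something like check_ports("xuodf7.firebrand.com", [9999]) from each of the other nodes, with the full port list taken from the page linked above.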