Couchbase rebalance keeps failing because of buckets_shutdown_wait_failed

I've upgraded a Couchbase cluster from version 2.5.1 to 3.0.3. The upgrade was performed node by node: removing one node from the cluster, upgrading it, and adding it back again.
Everything was working except the DCP upgrade. In error.log I found some "Weird vbucket_move_done" messages for the proximic2 bucket, which hasn't been used for a while now.
So I deleted this bucket and tried to rebalance, but it failed. That was a few hours ago and the rebalance keeps failing:
Rebalance exited with reason {buckets_shutdown_wait_failed,
[{'ns_1#cb6.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1#cb2.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1#cb1.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1#cb0.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}}]}
ns_orchestrator002 ns_1#cb2.xxx.com 20:42:59 - Fri Jun 19, 2015
Failed to wait deletion of some buckets on some nodes: [{'ns_1#cb6.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1#cb2.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1#cb1.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}},
{'ns_1#cb0.xxx.com',
{'EXIT',
{old_buckets_shutdown_wait_failed,
["proximic2"]}}}]
I'm a bit stuck right now. I tried to recreate the proximic2 bucket, but I get a message that it already exists. However, the bucket is not visible in the web GUI or the CLI. Couchbase runs on Debian 7.8. All nodes are on version 3.0.3.
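One way to check whether the cluster still holds stale metadata for the bucket is to query the REST API directly. A minimal sketch, assuming the default admin port 8091; the host name and credentials below are placeholders:

# List the buckets the cluster still knows about:
curl -s -u Administrator:password http://cb0.xxx.com:8091/pools/default/buckets | grep '"name"'

# If proximic2 still shows up, try deleting it again through the REST API:
curl -s -u Administrator:password -X DELETE http://cb0.xxx.com:8091/pools/default/buckets/proximic2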
Marek

Related

MySQL Workbench crashing when trying to do a SELECT query

I'm trying to use a Spring Boot application with MySQL Workbench; however, when I run a SELECT query the app constantly crashes. It seems to happen only on SELECT, because it works when I create a database and use an INSERT statement.
I'm on macOS Ventura. Has anyone run into this issue as well? If so, how was it fixed?
I have tried uninstalling and reinstalling but the issue persists.
Edit
This is the error I see.
-------------------------------------
Translated Report (Full Report Below)
-------------------------------------
Process: MySQLWorkbench [4040]
Path: /Applications/MySQLWorkbench.app/Contents/MacOS/MySQLWorkbench
Identifier: com.oracle.workbench.MySQLWorkbench
Version: 8.0.32.CE (1)
Code Type: X86-64 (Translated)
Parent Process: launchd [1]
User ID: 501
Date/Time: 2023-02-02 14:09:45.6928 -0400
OS Version: macOS 13.0.1 (22A400)
Report Version: 12
Anonymous UUID: F7C82EFE-672B-7D8B-2236-0CE707EC500E
Sleep/Wake UUID: D960C8C9-39F1-456B-96B6-12CB04D76D93
Time Awake Since Boot: 21000 seconds
Time Since Wake: 5002 seconds
System Integrity Protection: enabled
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_BAD_INSTRUCTION (SIGILL)
Exception Codes: 0x0000000000000001, 0x0000000000000000
Termination Reason: Namespace SIGNAL, Code 4 Illegal instruction: 4
Terminating Process: exc handler [4040]
There's much more, but it's too much text to include.
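One detail in the report worth ruling out: "Code Type: X86-64 (Translated)" means Workbench is running under Rosetta on Apple Silicon. A quick check, assuming the default install path:

# Native architecture of the machine (arm64 on Apple Silicon):
uname -m

# Architectures shipped in the Workbench binary:
lipo -archs /Applications/MySQLWorkbench.app/Contents/MacOS/MySQLWorkbench

If the binary is x86_64-only on an arm64 machine, trying the ARM64 build of Workbench is a reasonable next step.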

Cannot set configuration in Elastic Beanstalk

I have 4 Elastic Beanstalk deployments: 3 are Corretto 8 and the other one is Corretto 11.
On the Corretto 8 deployments, I can set new configuration without issue. On the Corretto 11 instance, however, any attempt to set a new configuration fails and causes a rollback.
The Corretto versions might not be the problem, but it's the only difference I can see. All 4 apps are Spring Boot apps that run as web servers (i.e. embedded Tomcat with exposed web ports). I am trying to set the exact same configuration name and value, and it only fails on the one instance.
The configuration I'm trying to set is pretty simple:
VALIDATE_RENEWALS = true
Even just trying to set DEBUG = true causes a failure and rollback.
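For reference, the same variable can also be set from the command line; a sketch, with the environment name as a placeholder:

aws elasticbeanstalk update-environment \
    --environment-name my-env \
    --option-settings Namespace=aws:elasticbeanstalk:application:environment,OptionName=VALIDATE_RENEWALS,Value=true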
I don't see a lot of information from the console about what's failing. Here is the event log:
2020-03-16 13:55:17 UTC-0600 INFO The environment was reverted to the previous configuration setting.
2020-03-16 13:54:45 UTC-0600 ERROR During an aborted deployment, some instances may have deployed the new application version. To ensure all instances are running the same version, re-deploy the appropriate application version.
2020-03-16 13:54:45 UTC-0600 ERROR Failed to deploy configuration.
2020-03-16 13:54:45 UTC-0600 ERROR Unsuccessful command execution on instance id(s) 'i-00553f4ac36afd327'. Aborting the operation.
2020-03-16 13:54:45 UTC-0600 INFO Command execution completed on all instances. Summary: [Successful: 0, Failed: 1].
2020-03-16 13:54:45 UTC-0600 ERROR [Instance: i-00553f4ac36afd327] Command failed on instance. An unexpected error has occurred [ErrorCode: 0000000001].
2020-03-16 13:54:20 UTC-0600 INFO Updating environment XXX's configuration settings.
2020-03-16 13:54:15 UTC-0600 INFO Environment update is starting.
I've also downloaded the full set of logs for the instance and don't see anything obvious. The app stdout doesn't have any errors or exceptions, it just starts normally and then gets terminated. None of the other log files have messages around the times above, so I'm really not sure what else I can look at.
Edit
The times don't line up, but I do see this in the eb-engine.log file:
2020/03/16 17:54:38.508634 [INFO] checking whether command is applicable to this instance...
2020/03/16 17:54:38.508658 [INFO] this command is applicable to the instance, thus instance should execute command
2020/03/16 17:54:38.508665 [INFO] check whether this is an enhanced env...
2020/03/16 17:54:38.508794 [INFO] Executing instruction: StageJavaApplication
2020/03/16 17:54:38.508858 [ERROR] GetArchivedFileType with file /opt/elasticbeanstalk/deployment/app_source_bundle failed with error open /opt/elasticbeanstalk/deployment/app_source_bundle: no such file or directory
2020/03/16 17:54:38.508868 [ERROR] An error occurred during execution of command [config-deploy] - [StageJavaApplication]. Stop running the command. Error: staging java app failed with error GetArchivedFileType with file /opt/elasticbeanstalk/deployment/app_source_bundle failed with error open /opt/elasticbeanstalk/deployment/app_source_bundle: no such file or directory
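That error says the staged source bundle is missing on the instance, and the event log above already suggests re-deploying the application version. A sketch using the AWS CLI; the application name, environment name, and version label are placeholders:

# Re-deploy the desired version so the source bundle is restored on the
# instance before applying configuration changes again:
aws elasticbeanstalk update-environment \
    --application-name my-app \
    --environment-name my-env \
    --version-label v42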

Maven build failure for Spring Cloud

I have been observing a build failure with the following error:
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR] The project org.springframework.cloud:spring-cloud-netflix-hystrix-stream:2.0.1.BUILD-SNAPSHOT (E:\springcloud\spring-cloud-netflix-master\spring-cloud-netflix-master\spring-cloud-netflix-hystrix-stream\pom.xml) has 1 error
[ERROR] Unresolveable build extension: Plugin org.springframework.cloud:spring-cloud-contract-maven-plugin:1.2.4.RELEASE or one of its dependencies could not be resolved: Failure to find org.springframework.cloud:spring-cloud-netflix-hystrix-contract:jar:2.0.1.BUILD-SNAPSHOT in https://repo.spring.io/libs-snapshot-local was cached in the local repository, resolution will not be reattempted until the update interval of spring-snapshots has elapsed or updates are forced -> [Help 2]
Please run ./scripts/build.sh to fix the issue.
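If the build script isn't available, the cached resolution failure can also be cleared by hand. A sketch, assuming a Unix-like shell and the default local repository location (on Windows the repository lives under %USERPROFILE%\.m2 instead):

# Force Maven to re-check remote snapshot repositories instead of honoring
# the cached "resolution will not be reattempted" marker:
mvn clean install -U

# Or drop the cached entry for the missing artifact so it is fetched fresh:
rm -rf ~/.m2/repository/org/springframework/cloud/spring-cloud-netflix-hystrix-contract
mvn clean install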

Apache Karaf stuck while shutting down with Pax-Exam

I was running integration tests with Pax-Exam and Karaf. The tests executed successfully, but while shutting down Karaf it got stuck on the output below and never resumed.
Pax-Exam = 4.11
Karaf = 4.2
[main] DEBUG o.ops4j.store.intern.TemporaryStore - Exit store(): 66cf6a516d0d1a670e78bd6b0be97f3da2a380b3
[main] DEBUG o.o.p.e.c.remote.RBCRemoteTarget - Preparing and Installing bundle (from stream )..
[main] DEBUG o.o.p.e.r.c.RemoteBundleContextClient - Packing probe into memory for true RMI. Hopefully things will fill in..
[main] DEBUG o.o.p.e.c.remote.RBCRemoteTarget - Installed bundle (from stream) as ID: 86
[main] DEBUG o.o.p.e.c.remote.RBCRemoteTarget - call [[TestAddress:PaxExam-bc970a6c-c656-4aa6-9300-35ded2bcde50 root:PaxExam-f6737e31-8f28-43e0-847e-1f3f49649233]]
[main] DEBUG o.o.p.e.k.c.i.KarafTestContainer - Shutting down the test container (Pax Runner)
The following is the JConsole output for the blocked thread:
Name: main
State: BLOCKED on java.lang.Object#d53a0bb owned by: KarafJavaRunner
Total blocked: 106 Total waited: 105
Stack trace:
org.ops4j.pax.exam.karaf.container.internal.runner.InternalRunner.shutdown(InternalRunner.java:71)
org.ops4j.pax.exam.karaf.container.internal.runner.KarafJavaRunner.shutdown(KarafJavaRunner.java:120)
- locked org.ops4j.pax.exam.karaf.container.internal.runner.KarafJavaRunner#279baf5b
org.ops4j.pax.exam.karaf.container.internal.KarafTestContainer.stop(KarafTestContainer.java:600)
- locked org.ops4j.pax.exam.karaf.container.internal.KarafTestContainer#25dcfa62
org.ops4j.pax.exam.spi.reactors.AllConfinedStagedReactor.invoke(AllConfinedStagedReactor.java:87)
org.ops4j.pax.exam.junit.impl.ProbeRunner$2.evaluate(ProbeRunner.java:267)
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
org.junit.runners.ParentRunner.run(ParentRunner.java:309)
org.ops4j.pax.exam.junit.impl.ProbeRunner.run(ProbeRunner.java:98)
org.ops4j.pax.exam.junit.PaxExam.run(PaxExam.java:93)
org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
Update:
One more thing I observed: if I forcefully shut it down and then run "mvn clean install", I get the following error and have to wait before it will run again:
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-clean-plugin:2.5:clean (default-clean) on project osgi-unit-tests-sample: Failed to clean project: Failed to delete C:\Users\..\target\paxexam\e266ddcb-5fed-4997-8178-3d4944251418\system\org\apache\felix\org.apache.felix.framework\5.6.10\org.apache.felix.framework-5.6.10.jar -> [Help 1]
Update2:
After exiting the prompt, it is still running:
C:\Program Files\Java\jdk1.8.0_162\bin>jps -l
1552 sun.tools.jps.Jps
4144
1420 org.apache.karaf.main.Main
C:\Program Files\Java\jdk1.8.0_162\bin>jps -l 1420
RMI Registry not available at 1420:1099
Exception creating connection to: 1420; nested exception is:
java.net.SocketException: Network is unreachable: connect
Update3:
If I kill this process, Pax-Exam resumes and reports successful execution of the tests. In fact, all tests pass before the shutdown; it is just not able to shut down.
TASKKILL /F /PID 10692
Right now I have no clue how to handle this locking issue.
Update4:
Name: main
State: WAITING on org.apache.felix.framework.util.ThreadGate#b3d26d8
Total blocked: 6 Total waited: 7
Stack trace:
java.lang.Object.wait(Native Method)
org.apache.felix.framework.util.ThreadGate.await(ThreadGate.java:79)
org.apache.felix.framework.Felix.waitForStop(Felix.java:1075)
org.apache.karaf.main.Main.awaitShutdown(Main.java:640)
org.apache.karaf.main.Main.main(Main.java:188)
Name: FelixDispatchQueue
State: WAITING on java.util.ArrayList#3276dd18
Total blocked: 353 Total waited: 342
Stack trace:
java.lang.Object.wait(Native Method)
java.lang.Object.wait(Object.java:502)
org.apache.felix.framework.EventDispatcher.run(EventDispatcher.java:1122)
org.apache.felix.framework.EventDispatcher.access$000(EventDispatcher.java:54)
org.apache.felix.framework.EventDispatcher$1.run(EventDispatcher.java:102)
java.lang.Thread.run(Thread.java:748)
Update5:
After spending a lot of time on this, I finally realized that adding the bundles below is what makes it get stuck; if I don't add them, everything works fine:
wrappedBundle( maven("org.ops4j.pax.tinybundles", "tinybundles").versionAsInProject() ), //2.1.0
wrappedBundle( maven("biz.aQute.bnd", "bndlib").versionAsInProject() )//2.4.0
Regards,
I resolved the issue by changing the versions of the following jars:
maven("org.ops4j.pax.tinybundles", "tinybundles") from 2.1.0 to 3.0.0
maven("biz.aQute.bnd", "bndlib") from 2.4.0 to 3.5.0

Are tables stored with the MEMORY engine recoverable from a cluster crash?

I have set up MySQL NDB Cluster 7.3.5 and the cluster was working fine.
Cluster with 4 nodes:
NodeA : SQLNode1, DataNode1
NodeB : SQLNode2, DataNode2
NodeC : Mgmt Node1
NodeD : Mgmt Node2
To test the server reboot scenario, I rebooted VMware ESXi and restarted all the VMs.
But the data nodes subsequently fail to start.
Here are the logs for each server:
/home/mysql/mysqlcluster_data/1/ndb_1_out.log (Data Node 1)
error: [ code: 708 line: 38848236 node: 1 count: 1 status: 32687 key: 445914048 name: 'hhmefep/def/fgvmev0000000000-elog-1398414831' ]
2014-05-13 13:16:40 [ndbd] INFO -- Failed to recreate object 505 during restart, error 708.
2014-05-13 13:16:40 [ndbd] INFO -- DBDICT (Line: 4688) 0x00000000
2014-05-13 13:16:40 [ndbd] INFO -- Error handler restarting system
2014-05-13 13:16:40 [ndbd] INFO -- Error handler shutdown completed - exiting
2014-05-13 13:16:40 [ndbd] ALERT -- Angel detected too many startup failures(3), not restarting again
2014-05-13 13:16:40 [ndbd] ALERT -- Node 1: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
It seems that the nodes are failing to recover this table:
hhmefep.fgvmev0000000000-elog-1398414831
/home/mysql/mysqlcluster_data/2/ndb_2_out.log (Data Node 2)
2014-05-13 13:05:48 [ndbd] INFO -- Start phase 1 completed
2014-05-13 13:05:48 [ndbd] INFO -- Start phase 2 completed
2014-05-13 13:05:48 [ndbd] INFO -- Start phase 3 completed
2014-05-13 13:05:51 [ndbd] INFO -- Node 1 disconnected
2014-05-13 13:05:51 [ndbd] INFO -- QMGR (Line: 3308) 0x00000000
2014-05-13 13:05:51 [ndbd] INFO -- Error handler restarting system
2014-05-13 13:05:51 [ndbd] INFO -- Error handler shutdown completed - exiting
2014-05-13 13:05:51 [ndbd] ALERT -- Angel detected too many startup failures(3), not restarting again
2014-05-13 13:05:51 [ndbd] ALERT -- Node 2: Forced node shutdown completed. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
It seems that data node 2 was trying to sync with data node 1 but was forcefully shut down by the management node.
(Mgmt Node)
ndb_mgm> Node 1: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 1: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 1: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Node 2: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2308: 'Another node failed during system restart, please investigate error(s) on other node(s)(Restart error). Temporary error, restart node'.
Node 2: Forced node shutdown completed, restarting. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
ndb_mgm> Node 2: Forced node shutdown completed. Occured during startphase 4. Caused by error 2355: 'Failure to restore schema(Resource configuration error). Permanent error, external action needed'.
Please help me with this; it is very frustrating.
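Since error 2355 is marked permanent ("external action needed"), the broken schema object usually cannot be repaired in place. A last-resort sketch, hedged because it is destructive: perform an initial restart of the data nodes and restore data from backup afterwards.

# Check node status from the management client first:
ndb_mgm -e show

# WARNING: --initial wipes this data node's file system and rebuilds it;
# all NDB data on the node is lost unless restored from backup afterwards.
ndbd --initial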
Per the MySQL memory engine page:
The MEMORY storage engine (formerly known as HEAP) creates
special-purpose tables with contents that are stored in memory.
Because the data is vulnerable to crashes, hardware issues, or power
outages, only use these tables as temporary work areas or read-only
caches for data pulled from other tables.
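To see which tables would be affected, i.e. which ones use the MEMORY engine and therefore lose their contents on any restart, a quick check from an SQL node (assuming mysql client access):

mysql -e "SELECT table_schema, table_name FROM information_schema.tables WHERE engine = 'MEMORY';"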