How does intracluster replication in Couchbase work?
I understand that the buckets containing the documents are subdivided into vBuckets.
The vBuckets also have replicas to provide high availability, and the active vBucket and its replicas are stored on different servers throughout the cluster. Now I want to understand how the copies are sent to the replicas. With MongoDB we have oplogs; what is the equivalent in Couchbase?
Couchbase Server uses the Database Change Protocol (DCP) for intracluster and intercluster replication.
From the Couchbase Distributed Data Management documentation:
[DCP is] a high-performance streaming protocol that communicates the state of the data using an ordered change log with sequence numbers.
The Couchbase Forums have some commentary on the replication process in the face of node failures.
DCP facilitates many Couchbase integrations such as the Kafka Connector. See the Connector Guides for more examples.
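As a toy illustration of the idea in that description (not Couchbase's actual implementation or wire protocol), each vBucket can be pictured as an ordered change log where every mutation gets a monotonically increasing sequence number, and a replica catches up by asking for everything after the last sequence number it has applied:

```java
import java.util.ArrayList;
import java.util.List;

// Conceptual sketch only: an ordered change log with sequence numbers,
// and a replica that streams every change it has not yet seen.
public class ChangeLogSketch {

    record Change(long seqno, String key, String value) {}

    static class VBucket {
        private final List<Change> log = new ArrayList<>();
        private long nextSeqno = 1;

        void mutate(String key, String value) {
            log.add(new Change(nextSeqno++, key, value));
        }

        // A replica asks for everything after the last sequence number it has applied.
        List<Change> streamSince(long lastSeenSeqno) {
            return log.stream()
                      .filter(c -> c.seqno() > lastSeenSeqno)
                      .toList();
        }
    }

    public static void main(String[] args) {
        VBucket active = new VBucket();
        active.mutate("user::1", "{\"name\":\"Ana\"}");
        active.mutate("user::2", "{\"name\":\"Bo\"}");

        long replicaSeqno = 0;                        // replica starts empty
        for (Change c : active.streamSince(replicaSeqno)) {
            replicaSeqno = c.seqno();                 // apply and remember position
            System.out.println("replica applied " + c);
        }
    }
}
```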
I want to deploy a master-slave MySQL cluster in k8s. I found two approaches that seem popular:
The first is to use StatefulSets directly, following the official k8s tutorial: https://kubernetes.io/docs/tutorials/stateful-application/basic-stateful-set/
The second is to use an operator, e.g. https://github.com/oracle/mysql-operator
Which way is most commonly used?
Also, in statefulsets, if my MySQL master dies, will k8s automatically promote the slave to be the master?
Lastly, when my backend app performs an operation (CRUD) against the MySQL cluster, how does k8s know which pod to route to? For example, writes can only go to the master, while reads can go to any replica.
Users can deploy and maintain a set of highly available MySQL services in k8s based on StatefulSets, but the process is relatively complex. It requires users to familiarize themselves with various k8s resource objects, learn many MySQL operational details, and maintain a set of complex management scripts. Kubernetes Operators are designed to lower the threshold for deploying complex applications on k8s.
An Operator hides the orchestration details of a complex application and greatly lowers the barrier to running it in k8s. If you need to deploy other complex applications, we recommend using an Operator.
Regarding master election while using a StatefulSet:
Promoting a slave to master is not an automatic process; you have to configure this manually, for example using XtraBackup. Here is more information: setting_up_replication.
Take a look: cloning-existing-data, starting-replication, mysql-statefulset-operator.
Useful tools: Vitess for better MySQL networking management, and percona-xtradb-cluster, which provides good performance, scalability, and instrumentation.
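On the read/write routing part of the question: in the StatefulSet tutorial linked above, the master pod is reachable through the headless Service as mysql-0.mysql, while the mysql-read Service load balances reads across all pods, so the application itself picks the endpoint. A rough JDBC sketch under those naming assumptions (credentials, database, and schema are made up for illustration, and the MySQL Connector/J driver is assumed to be on the classpath):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class ReadWriteSplitExample {

    // Hostnames follow the naming used in the official StatefulSet tutorial.
    private static final String WRITE_URL = "jdbc:mysql://mysql-0.mysql:3306/app";
    private static final String READ_URL  = "jdbc:mysql://mysql-read:3306/app";

    public static void main(String[] args) throws Exception {
        // Writes go to the master pod only.
        try (Connection write = DriverManager.getConnection(WRITE_URL, "app", "secret");
             Statement stmt = write.createStatement()) {
            stmt.executeUpdate("INSERT INTO notes (body) VALUES ('hello')");
        }

        // Reads can go to any pod behind the mysql-read Service.
        try (Connection read = DriverManager.getConnection(READ_URL, "app", "secret");
             Statement stmt = read.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM notes")) {
            if (rs.next()) {
                System.out.println("rows: " + rs.getLong(1));
            }
        }
    }
}
```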
I want to know how to sync Couchbase with other databases seamlessly. Can we use other databases alongside Couchbase in the same project?
As you haven't specified which databases you have in mind, I will give you a broad answer:
Mobile: Couchbase can be synced with Couchbase Lite (https://www.couchbase.com/products/lite) via Sync Gateway, the middleware between Couchbase Lite and Couchbase Server. Sync Gateway is mandatory in this case for security reasons, as you should not simply expose your database on the web.
Xamarin: https://blog.couchbase.com/synchronized-drawing-apps-with-couchbase-mobile/
Android: https://docs.couchbase.com/couchbase-lite/current/java-android.html
Swift: https://docs.couchbase.com/couchbase-lite/current/swift.html
Java: https://docs.couchbase.com/couchbase-lite/current/java-platform.html
Others: https://docs.couchbase.com/couchbase-lite/current/index.html
Couchbase Lite 1.x could also be synced with PouchDB, but this support was dropped in Couchbase Lite 2.x, as the whole thing was rewritten and this feature has yet to return.
Server: One of the most common ways to sync Couchbase Server with another database is through the Kafka Connector https://docs.couchbase.com/kafka-connector/current/index.html
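As a sketch of the consuming side of such a Kafka-based sync: the Couchbase source connector publishes document changes to a Kafka topic, and a small consumer forwards them to the target database. The topic name, bootstrap server, and payload handling below are assumptions for illustration, not fixed by the connector:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CouchbaseChangeConsumer {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "couchbase-sync");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // "couchbase-changes" is a placeholder topic name.
            consumer.subscribe(List.of("couchbase-changes"));
            while (true) {
                for (ConsumerRecord<String, String> record :
                         consumer.poll(Duration.ofSeconds(1))) {
                    // record.key() is the document ID, record.value() the change
                    // payload; map it to an upsert/delete in the target database.
                    System.out.printf("doc %s -> %s%n", record.key(), record.value());
                }
            }
        }
    }
}
```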
Currently, I have a requirement to sync data from Apache Directory LDAP to one of the RDBMS databases (MySQL, PostgreSQL). The directory holds approximately a few million records for now and may grow in the future. The LDAP directory is the primary data source for now, but the goal is to have real-time data in both LDAP and the RDBMS, since we plan to use the RDBMS for real-time analytics.
Option 1:
I'm thinking of using Spring Cloud Data Flow: a source Spring Boot app reads the LDAP data that has changed since the last sync run and pushes it to a queue (RabbitMQ for now); a sink, another Spring Boot app, collects the data from the queue and persists it into the RDBMS (a rough sketch of the sink is shown below). We would be able to better track and manage the sync jobs using the Spring Cloud Data Flow dashboard.
Option 2:
Spring's LdapTemplate lets our application talk to the LDAP directory. One approach would be to intercept the LdapTemplate calls wherever applicable and push the data to a queue; an intermediate app then reads the data from the queue (RabbitMQ) and converts the LDAP response into the format required to update the RDBMS.
I am new to LDAP and Spring Cloud Data Flow. So far I have come up with only these two approaches, considering my project's existing technology and system landscape. Any other suggestions or approaches are really appreciated. Thanks in advance.
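For Option 1, the sink side could look roughly like the sketch below, using Spring Cloud Stream's functional model with the RabbitMQ binder. The LdapPerson payload shape, table name, and upsert SQL (PostgreSQL syntax) are assumptions for illustration only; the binding name would be configured via spring.cloud.stream.bindings.persist-in-0.destination.

```java
import java.util.function.Consumer;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.context.annotation.Bean;
import org.springframework.jdbc.core.JdbcTemplate;

// Sink app: Spring Cloud Stream binds the "persist" function to a RabbitMQ
// queue, deserializes each message into LdapPerson, and writes it to the RDBMS.
@SpringBootApplication
public class LdapSinkApplication {

    // Hypothetical payload published by the source app.
    public record LdapPerson(String uid, String cn, String mail) {}

    @Bean
    public Consumer<LdapPerson> persist(JdbcTemplate jdbc) {
        return person -> jdbc.update(
            "INSERT INTO people (uid, cn, mail) VALUES (?, ?, ?) "
                + "ON CONFLICT (uid) DO UPDATE SET cn = EXCLUDED.cn, mail = EXCLUDED.mail",
            person.uid(), person.cn(), person.mail());
    }

    public static void main(String[] args) {
        SpringApplication.run(LdapSinkApplication.class, args);
    }
}
```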
Another approach, if the LDAP server is Microsoft AD: create a Windows service in C# that connects to your LDAP server, fetches the data every day, and sends it to your RDBMS over a socket connection. This is reliable and consistent.
I am running a Couchbase cluster (v2.1) and have a new Couchbase cluster (v4.0) provisioned. I want to transfer data from the 2.1 cluster to the 4.0 cluster. Can I simply use XDCR through the web console to do that, i.e. replicate the data from the v2.1 cluster to the v4.0 cluster?
Is there any risk that I might lose the data in the v2.1 cluster?
Thanks for the hints.
Yes, you can simply use XDCR to replicate the data to the new cluster. It is robust and is designed to replicate data safely. Note that XDCR uses some resources, so make sure your source cluster has enough CPU and memory headroom. Couchbase best practices recommend approximately one core per replication stream and at most 80% of RAM allocated to Couchbase.
Is there any limit on the server for the number of requests served per second, or the number of requests served simultaneously? [In configuration, not due to RAM, CPU, or other hardware limitations.]
Is there any limit on the number of simultaneous requests on an instance of CouchbaseClient in a Java servlet?
Is it best to create only one instance of CouchbaseClient and keep it open, or to create multiple instances and destroy them?
Is Moxi helpful with Couchbase Server 1.8.0 / Couchbase Java client 1.0.2?
I need this info to set up the application in production.
Thank you.
The memcached instance that runs behind Couchbase has a hard connection limit of 10,000 connections. Couchbase generally recommends that you increase the number of nodes to address the distribution of traffic at that level.
The client itself does not have a hardcoded limit on how many connections it makes to a Couchbase cluster.
Couchbase generally recommends that you create a connection pool from your application to the cluster and re-use those connections, rather than creating and destroying them over and over. In heavily loaded applications, repeatedly creating and destroying connections can get very expensive from a resource perspective.
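A rough sketch of the "create one client and reuse it" advice with the 1.x Java client follows. The node address and bucket name are placeholders; in a servlet environment you would typically build the client once at application startup (for example from a ServletContextListener) and shut it down when the application stops.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

import com.couchbase.client.CouchbaseClient;

// One CouchbaseClient shared by the whole application, instead of creating
// and destroying a client per request.
public final class CouchbaseHolder {

    // Placeholder node address and bucket name.
    private static final List<URI> NODES =
        Arrays.asList(URI.create("http://127.0.0.1:8091/pools"));

    private static CouchbaseClient client;

    private CouchbaseHolder() {}

    // Build the client once, e.g. from ServletContextListener.contextInitialized().
    public static synchronized CouchbaseClient get() throws Exception {
        if (client == null) {
            client = new CouchbaseClient(NODES, "default", "");
        }
        return client;
    }

    // Close it once when the application shuts down.
    public static synchronized void shutdown() {
        if (client != null) {
            client.shutdown(10, TimeUnit.SECONDS);
            client = null;
        }
    }
}
```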
Moxi is an integrated piece of Couchbase. However, it is generally in place as an adapter layer for client developers who specifically want to use it, or to give legacy applications designed to talk directly to a memcached interface access to the cluster. If you are using the Couchbase client driver you won't need the Moxi interface.