Couchbase Sync Gateway business logic

I'm currently working on an architecture with an offline-capable mobile client and a database server.
I was thinking about using the Sync Gateway component from Couchbase, i.e. Couchbase as the server database and PouchDB as the client database.
The business logic is quite complex, though, and as far as I understand, synchronization filtering, data validation, and authorization are all handled through the gateway configuration.
Is this a good idea, or are Couchbase's synchronization capabilities better suited to simpler logic, so that I should stick with a Spring REST API and fill the local IndexedDB manually?

Couchbase Sync Gateway is used in very large, enterprise-grade deployments of varying complexity and scale, so that shouldn't be an issue. The decision you need to make is whether you need sync, or whether you are looking for a simple request-response approach (which is better suited to always-connected environments). FWIW, Sync Gateway also supports a REST interface, so you can use IndexedDB and request data via the REST interface as well.
You mention an offline mobile client, so why are you not using Couchbase Lite as the embedded database? Is this a PWA? The synchronization protocol between Couchbase Lite and Sync Gateway is more performant and advanced than the CouchDB-based protocol used between the likes of PouchDB and Sync Gateway.
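To give a concrete feel for where that "business logic in the gateway configuration" lives: filtering, validation, and authorization are all expressed in the sync function of the Sync Gateway config. Here is a minimal sketch, assuming hypothetical document fields type and owner:

    // Sync Gateway sync function (sketch; field names are assumptions)
    function (doc, oldDoc) {
      // validation: reject documents without a type
      if (!doc.type) {
        throw({forbidden: "type is required"});
      }
      // authorization: only the owner may update an existing document
      if (oldDoc) {
        requireUser(oldDoc.owner);
      }
      // sync filtering: route the document into a per-owner channel
      channel("user-" + doc.owner);
      // grant the owner read access to that channel
      access(doc.owner, "user-" + doc.owner);
    }

Every write that goes through Sync Gateway passes through this function, so fairly complex rules can live here.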

Related

Real time communication between clients via websocket server on Google App Engine

This article describes how a websocket server for a chat application can look. We are planning to implement something similar: when a message is sent to the server, it is routed to the correct recipient based on an authentication token, and the message is saved in a MySQL database.
We will eventually host the server on Google App Engine, and I suspect that will cause some issues with the approach described above, since it depends on all clients being connected to the same server, and that probably won't be the case once multiple instances are created as needed. Is there a way to connect all the instances so that this isn't a problem (Pub/Sub, maybe, although that would add costs), or should we find a different solution?
One idea I had was to use mysql-events to monitor the binlog from the websocket server for the creation of new rows in the messages table, but I read somewhere that this wasn't recommended. I can't find where I read that, though, and maybe it is the best solution after all.
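For reference, the Pub/Sub fan-out idea from the question could look roughly like this sketch; the topic and subscription names, the message shape, and the saveToMysql/deliverToLocalSockets helpers are all assumptions:

    // Sketch: fan chat messages out across App Engine instances via Cloud Pub/Sub
    const { PubSub } = require('@google-cloud/pubsub');
    const pubsub = new PubSub();

    // On message from a websocket client: persist to MySQL, then publish
    async function onClientMessage(msg) {
      await saveToMysql(msg);                             // hypothetical helper
      await pubsub.topic('chat-messages').publishMessage({
        data: Buffer.from(JSON.stringify(msg)),
      });
    }

    // Each instance needs its OWN subscription to receive every message;
    // a shared subscription would load-balance instead of broadcasting.
    const sub = pubsub.subscription('chat-messages-sub-' + process.pid);
    sub.on('message', pubsubMsg => {
      const msg = JSON.parse(pubsubMsg.data.toString());
      deliverToLocalSockets(msg.recipientId, msg);        // hypothetical helper
      pubsubMsg.ack();
    });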
Since you asked about other solutions, I would recommend looking at Firebase, and specifically the Realtime Database. Out of the box it provides all of the functionality that you need for realtime communication between connected clients, plus Cloud Messaging for clients who aren't connected.
Here's a tutorial that uses Firestore to create a realtime chat web app, but it can all be applied to the Realtime Database with minor modifications. I say that because Firestore has expensive writes, which in my opinion make it unsuitable for a chat backend.
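As a rough sketch of the Realtime Database route (web v9 modular API; the room path and message shape are assumptions):

    // Sketch: realtime chat on the Firebase Realtime Database
    import { initializeApp } from 'firebase/app';
    import { getDatabase, ref, push, onChildAdded } from 'firebase/database';

    const app = initializeApp({ /* your Firebase config */ });
    const db = getDatabase(app);
    const messagesRef = ref(db, 'rooms/room42/messages');  // hypothetical path

    // send: one push() is one cheap write
    push(messagesRef, { from: 'alice', text: 'hi', ts: Date.now() });

    // receive: every connected client gets new children in realtime
    onChildAdded(messagesRef, snapshot => {
      console.log('new message', snapshot.val());
    });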

Using Couchbase SDK vs Sync Gateway API

I have a full Couchbase deployment (Server, Sync Gateway, and Lite) with an API, a mobile app, and a web app all using it.
It works very well, but I was wondering whether there are any advantages to using the Sync Gateway API over the Couchbase SDK. Specifically, I would like to know whether Sync Gateway handles larger numbers of operations better than the SDK, perhaps through an internal queue/cache system, but I can't seem to find definitive documentation on this.
At the moment the API uses the C# Couchbase SDK, and we use Sync Gateway very little (only really for synchronizing the mobile app).
First, some relevant background info:
Every document that needs to be synced over to Couchbase Lite (CBL) clients needs to be processed by Sync Gateway (SGW). This is true whether a doc is written via the SGW API or whether it comes in via a server write (N1QL or SDK). The latter case is referred to as "import processing", wherein a document written to the bucket (via N1QL or the SDK) is read by SGW off the DCP feed, processed, and written back to the bucket with the relevant sync metadata.
Prerequisite:
In order for SGW to import documents written directly via N1QL/SDK, you must enable "shared bucket access" and import processing, as discussed here.
Non-mobile documents:
If you have documents that are never going to be synced to CBL clients, then the choice is obvious: use the server SDKs or N1QL.
Mobile documents (docs to sync to CBL clients):
Assuming you are on SGW 2.x syncing with CBL 2.x clients, if you have documents written at the server end that need to be synced to CBL clients, then consider the following.
Server-side write rate:
If you are looking at server-side writes coming in at sustained rates significantly exceeding 1.5K/sec (let's say 5K/sec), then you should go the SGW API route. While it's easy enough to do a bulk update via a server-side N1QL query, remember that SGW still needs to keep up and do the import processing discussed in the background above.
This means that if you are doing high-volume updates through the SDK/N1QL, you will have to rate-limit them so that SGW can keep up (do batched updates via the SDK, as in the sketch below).
That said, it is important to note that if SGW can't keep up with the write throughput on the DCP feed, the result is latency, no matter how the writes happen (SGW API or N1QL).
If your sustained server-side write rate isn't expected to be significantly high, then go with N1QL.
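A minimal sketch of such a rate-limited, batched write path using the Couchbase Node SDK (bucket name, credentials, and the batch/pause values are assumptions to tune against your own SGW import throughput):

    // Sketch: chunked upserts so SGW import processing can keep up with DCP
    const couchbase = require('couchbase');

    async function batchedUpsert(docs, batchSize = 500, pauseMs = 250) {
      const cluster = await couchbase.connect('couchbase://localhost', {
        username: 'app', password: 'secret',       // assumed credentials
      });
      const collection = cluster.bucket('mobile-bucket').defaultCollection();
      for (let i = 0; i < docs.length; i += batchSize) {
        const batch = docs.slice(i, i + batchSize);
        await Promise.all(batch.map(d => collection.upsert(d.id, d.body)));
        await new Promise(r => setTimeout(r, pauseMs)); // crude rate limit
      }
      await cluster.close();
    }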
Deletes handling:
Does not matter. Under shared bucket access, deletes coming in via the SDK or the SGW API will result in a tombstone. Read more about it here.
SGW-specific config:
Naturally, if you are dealing with SGW-specific configuration, such as creating SGW users and roles, then you will use the SGW API for that.
Conflict handling:
In 2.x, it does not matter. Conflicts are handled on the CBL side.
Challenge with the SGW API:
Probably the biggest challenge in a real-world scenario is that using the SGW API path means either storing information about SG revision IDs in the external system, or performing every mutation as a read-then-write (since there is no way to PUT a document without providing a revision ID).
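That read-then-write pattern against the SGW public REST port might look like this sketch (database name, port, and the mutate callback are assumptions):

    // Sketch: GET the current revision, apply the change, PUT with the rev
    async function updateViaSgw(docId, mutate) {
      const base = 'http://localhost:4984/mydb';      // hypothetical db name
      const current = await (await fetch(`${base}/${docId}`)).json();
      const updated = mutate(current);                // apply the business change
      // the PUT must carry the current revision or SGW rejects it as a conflict
      await fetch(`${base}/${docId}?rev=${current._rev}`, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(updated),
      });
    }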
The short answer is that for backend operations the Couchbase SDK is your choice, and it will perform much better. Sync Gateway is meant to be used by mobile clients, with a few exceptions (*).
Bulk/Batch operations
In my performance tests using the Java Couchbase SDK and bulk operations from AsyncBucket (link), I have updated up to 8 thousand documents per second. In .NET you can do batch operations too (link).
Sync Gateway also supports bulk operations, yet it is much slower because it relies on the REST API and requires you to provide a _rev from the previous version of each document you want to update. This usually results in the backend having to do a GET before doing a PUT. Also, keep in mind that Sync Gateway is not a storage unit. It works as a proxy to Couchbase, managing mobile client access to segments of data based on the channels registered for each user, and it writes all of its metadata documents into the Couchbase Server bucket, including channel indexes, user registrations, document revisions, and views.
Querying
Views are indexed, so queries over large datasets can respond very fast. Whenever a document is changed, the map function of every view has the opportunity to map it. But when a view is created through the Sync Gateway REST API, some code is added to your map function to handle user channels/permissions, making it slower than a plain map function created directly in the Couchbase Admin UI. Querying views with compound keys using the startKey/endKey parameters is very powerful when you have hierarchical data, but this functionality, and the use of a reduce function, are not available to mobile clients.
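For illustration, a compound-key map function as you might define it directly in the Couchbase Admin UI (document fields are hypothetical):

    // Map function for a Couchbase view; no channel-handling wrapper is added,
    // because it was created in the Admin UI rather than through SGW
    function (doc, meta) {
      if (doc.type === 'order') {
        emit([doc.customerId, doc.createdAt], null);  // compound key
      }
    }
    // Querying with startKey=["cust42"] and endKey=["cust42", {}] then returns
    // all of that customer's orders, ordered by creation date.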
N1QL can also be very fast, when your N1QL query takes advantage of Couchbase indexes.
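A sketch of an indexed N1QL query from the Node SDK (bucket, index, and field names are assumptions):

    // Assumes an index created once beforehand, e.g.:
    //   CREATE INDEX idx_type ON `mobile-bucket`(type)
    const couchbase = require('couchbase');

    async function findOrders(cluster) {
      const result = await cluster.query(
        'SELECT META().id, b.* FROM `mobile-bucket` b WHERE b.type = $type',
        { parameters: { type: 'order' } }  // named parameter
      );
      return result.rows;
    }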
Notes
(*) One exception to the rule is when you want to delete a document and have this reflected on mobile phones. The DELETE operation leaves an empty document with a _deleted: true attribute and can only be done through Sync Gateway. The next time the mobile device synchronizes and finds this hint, it will delete the document from local storage. You can also set this attribute through a PUT operation, optionally adding an _exp: "2019-12-12T00:00:00.000Z" attribute to schedule a purge of the document at a future date, so that the server also gets cleaned up. However, just purging a document through Sync Gateway is equivalent to deleting it through the Couchbase SDK, and this won't be reflected on mobile devices.
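A sketch of that tombstone-plus-expiry PUT (database name and current revision are assumptions):

    // Sketch: mark the doc deleted for mobile clients and schedule a purge
    async function tombstone(docId, rev) {
      await fetch(`http://localhost:4984/mydb/${docId}?rev=${rev}`, {
        method: 'PUT',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          _deleted: true,                    // mobile clients delete on next sync
          _exp: '2019-12-12T00:00:00.000Z',  // server-side purge at this date
        }),
      });
    }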
NOTE: Prior to Sync Gateway 1.5 and Couchbase Server 5.0, all backend operations had to be done through Sync Gateway so that Sync Gateway and mobile clients could detect those changes. This changed when the shared_bucket_access option was introduced. More info here.

Would FeathersJS be suitable to synchronize an offline JavaScript database with a backend API

I want to be able to allow the user to work offline with a JavaScript database such as PouchDB or IndexedDB, storing records (not just user data), and then sync up to the server when online.
To that end, FeathersJS says it can sit in the middle between a legacy API and the Feathers client to handle real-time sync.
Does the real-time sync mean that Feathers is appropriate for use as a two-way client-to-API synchronization layer with conflict resolution?
My understanding is that the default Feathers real-time sync does not handle offline storage and conflict resolution out of the box.
That said, there are a lot of resources for getting Feathers to accomplish this. I would start here:
https://feathersjs-offline.github.io/docs/
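For context, the out-of-the-box real-time behaviour is service events pushed to connected clients, as in this sketch (server URL and service name are assumptions); the offline queueing and conflict handling are what feathersjs-offline layers on top:

    // Sketch: default Feathers realtime sync is event push to connected clients
    const feathers = require('@feathersjs/feathers');
    const socketio = require('@feathersjs/socketio-client');
    const io = require('socket.io-client');

    const app = feathers();
    app.configure(socketio(io('http://localhost:3030')));  // assumed server URL

    const messages = app.service('messages');              // hypothetical service
    messages.on('created', msg => console.log('live update', msg));
    messages.create({ text: 'hello' });  // emitted to every connected client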

Unable to sync PouchDB with Couchbase Sync Gateway

I am trying to sync PouchDB with Couchbase through Sync Gateway, but I only get the data added via PouchDB, not the initial data added to Couchbase. For example, there are 750 docs in Couchbase, but none of them are synced to PouchDB. Also, http://localhost:4985/_admin/db/db does not show the Couchbase docs either.
The problem is with adding data to Couchbase Server directly. Couchbase Mobile currently requires extra metadata in order to deal with replication and conflict resolution. This isn't handled by the Server SDKs.
The recommended approach is to do all database writes through Sync Gateway.
To simplify use with PHP, you may want to use a Swagger PHP client. (You can see an example of using clients autogenerated by Swagger in this post. The example uses JavaScript and Node.js, but the principles are the same.)
You can read from Couchbase Server directly if you want (to do a N1QL query, for example).
Another option is to use "bucket shadowing". This is trickier, and is likely to get deprecated at some point. I only list it for completeness.
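Once the writes go through Sync Gateway, pulling them into the browser is a small amount of PouchDB code; a sketch, assuming the public port 4984 and a database named db as in the question:

    // Sketch: two-way continuous replication between PouchDB and Sync Gateway
    const PouchDB = require('pouchdb');

    const local = new PouchDB('local-db');        // hypothetical local name
    const remote = 'http://localhost:4984/db';    // SGW public interface

    local.sync(remote, { live: true, retry: true })
      .on('change', info => console.log('replicated', info.direction))
      .on('error', err => console.error('sync error', err));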

Couchbase SDK and Moxi Client

I am new to Couchbase and am trying to understand why we need a client-side proxy like Moxi when using the Couchbase PHP SDK.
As per my understanding, the proxying of client requests to the right server is done by the client SDK, which maintains the vBucket map of all keys.
Why, in the case of a web application using the PHP SDK and Couchbase, do we need an additional Moxi client?
They are for two different things.
Moxi is for when you want to use a standard memcached library: Moxi proxies memcached calls to the Couchbase cluster and uses Couchbase buckets, so your code will not know it is talking to a persistent database in the background. Using Moxi with Couchbase buckets gives you some of the benefits of Couchbase, like high availability, easy scalability, and the performance Couchbase is known for, while letting you use any old off-the-shelf memcached library. Just know that, because it adheres to the memcached protocol, Moxi is limited to that functionality from an application perspective.
In my opinion, Moxi should be used to bridge the gap between running on memcached and using the full SDKs; it is not meant to be a final destination, though some people have been on it for years.
Using the Couchbase PHP SDK, on the other hand, gives you the full suite of functionality that Couchbase can offer, and you do not need Moxi at all.
In summary, if you are in a position to use the Couchbase SDK, do that. You will get more functionality, performance, etc. from it. Moxi is for those who already have memcached but want to step up to a clustered, high-performance cache without changing their code.
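The contrast in a sketch (Node is used for illustration since the idea is language-agnostic; the PHP SDK is analogous, and the host, credentials, and keys are assumptions):

    const Memcached = require('memcached');
    const couchbase = require('couchbase');

    // 1) Legacy memcached client pointed at Moxi: unchanged memcached code
    const cache = new Memcached('localhost:11211');   // Moxi's memcached port
    cache.set('user:42', JSON.stringify({ name: 'Ann' }), 0, err => {});

    // 2) Native Couchbase SDK: full functionality, no proxy needed
    async function sdkWrite() {
      const cluster = await couchbase.connect('couchbase://localhost', {
        username: 'app', password: 'secret',          // assumed credentials
      });
      await cluster.bucket('default').defaultCollection()
        .upsert('user:42', { name: 'Ann' });
    }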