Is it possible to store only one revision of each document in Couchbase so that it doesn't inflate after each update using set? This question is interesting in context of Couchbase Lite too.
Related
I am new to couchbase and NoSql.We have a central/main data center that contains whole documents and several branches/offices, they will have their own specific documents.We want to replicate some documents between branches and main data center.
Each branch must see its own documents and not others.
I want to know if there is any document level security in couchbase?
Have you decided what mechanism you are going to use for replication yet? If you use Couchbase's included XDCR (Cross Data Centre Replication) you could use XDCR filtering to replicate only a subset of documents from the main data centre to the branches. In this way the branch servers will only have the specific documents you send to them. You could also send any updates on the data back to the main DC by setting up a second XDCR process from the branch back to the main DC.
Couchbase's XDCR filtering documentation explain the filtering setup quite well and here are some more details about the XDCR setup in general. Hope that helps!
I am working on a project that uses Couchbase Server and Sync Gateway to synchronize the contents of a bucket with iOS and Android clients running Couchbase Lite. I also need read and write access to the Couchbase Server from a Node.js server application. From the research I've done, using shadowing is frowned upon (https://github.com/couchbase/sync_gateway/wiki/Bucket-Shadowing), which led me to look into the Sync Gateway API as a means to update the bucket from the Node.js application. Updating existing documents through the Sync Gateway API appears to require the most recent revision ID of the document to be passed in, requiring a separate read before the modification (http://mobile-couchbase.narkive.com/HT2kvBP0/cblite-sync-gateway-couchbase-server), which seems potentially inefficient. What is the best way to solve this problem?
Updating a document (which is really creating a new revision) requires the revision ID. Otherwise Couchbase can't associate the update with a parent. This breaks the whole approach to conflict resolution. (Couchbase uses a method known as multiversion concurrency control.)
The expectation is that you're updating the existing contents of a document. This implies you've read the document already, including the revision ID.
If for some reason you don't need to the old contents to update the document, you still need the revision ID. If you work around it (for example, by purging a document through Sync Gateway and then pushing your new version) you can end up with two versions of document in the system with no connection, which will cause a special kind of conflict.
So the short answer is no, there's no way to avoid this (without causing yourself other headaches).
I am not sure why your question was downvoted, as it seems like a reasonable question. You are correct, the Couchbase bucket that is used by Sync Gateway should probably best be thought of as "opaque", you should not be poking around in there and changing things. There are a number of implementations of Couchbase Lite, such as one for Java, .NET, and Mac OS X. Have you considered making a web service that, on one side, is serving your application, and on the other side is itself a Couchbase Lite client? You should be able to separate your data as necessary using channels.
I want to set TTL (time to live) at couchbase server for all documents getting pushed from mobile devices continuously for replication. I did not find any clear explanation/example in documentation to do this.
What should be the approach to set TTL for documents coming from mobile devices to Server through Sync Gateway.
Approach 1:
One approach is to create a view at server side which would return createdDate as key. We will query that view for keys of today date which would return today documents and we can set TTL for those documents. But how and when would we call this view and is it a good approach?
Approach 2:
Should I do it by using webhooks where it will listen to document changes (creations) made through Couchbase Lite push replications, set TTL for new documents and save back to Couchbase server?
Is there any other better way to do it?
Also what is the way to set TTL for only specific documents?
EDIT: My final approach:
I will create following view at couchbase server:
function (doc, meta) {
emit(dateToArray(doc.createdDate));
}
I will have a job which would run daily at EOD, query view to get documents created today and set TTL for those documents.
In this way I would be able to delete documents regularly.
Let me know if there is any issue with it or there is any better way.
Hopefully someone from the mobile team can chime in and confirm this, but I don't think that the mobile SDK allows to set an expiry at the moment.
I guess you could add the creation/update time to your document and create a batch job that uses the "core" SDKs to periodically find old documents (either via N1QL or a dedicated view) and remove them.
It is not currently possible to set a TTL for documents via Sync Gateway like you can with a Couchbase Server smart-client. There is a conceptual impedance mismatch with Sync Gateway using the native-style TTLs on documents. This is because the Sync Gateway protocol functions on the basis of revision trees and even when a document is 'deleted', there is still a document in place to mark that there is a document that has been deleted.
I would also be wary of workloads which might require TTLs (e.g. a cache), Sync Gateway documents take up space even after they've been deleted so your dataset may continue to grow unbounded.
If you do require TTLs, and if you do not think the dataset size will be an issue then the best way would be to store a field in the document that represents the time the document would expire. Then you would do two things:
When accessing the document, if it has expired then you manually delete it
Periodically iterate over the all docs endpoint and delete any documents you find with an expiry time in the past.
Couchbase does not delete when TTL reached;
Instead, when you access (expired) document,
then Couchbase check expiry, remove it at that moment.
http://developer.couchbase.com/documentation/server/4.0/developer-guide/expiry.html
i want to implement a feature that shows when a couchbase document is last read.
Is this saved by default in meta data of couchbase or i need to update the document with a field on every read so it can be retrieved later on .
There's nothing like that in the metadata, you'd have to update the document yourself.
Side note: For writes/updates, you could have made use of the auditing annotation feature of Spring Data (supported by Spring Data Couchbase since SDC 2.1.1) but not for reads.
Also note that performance will suffer as you'd have to effectively perform a write for each read. And there's also the potential consistency side-effects: what if there's already a write of the same document happening in parallel?
To implement this, if you can wait for Couchbase Server 4.5, you should maybe consider using the sub-document API. (see this blog).
Apache Jackrabbit (or the JCR API) helps you separate the data store from the data management system. This would mean that every data store provider would have to implement the JCR API for his own data store. The question is JCR implemented for MySQL? Can we use the JCR API over MySQL? I want to truly abstract out where i store my content, so that tomorrow if i don't want to use a relational DB i can swap it out with the file system with ease.
You could try ModeShape, which is a JCR implementation that can store its data in a variety of systems, including MySQL (or almost any other relational database), data grids (such as Infinispan), file systems, version control systems (e.g., SVN), etc. You can even create a single JCR repository backed by multiple federated systems. ModeShape does this through an extensible connector library (that is much, much simpler than implementing the full JCR API), so you can use the JCR API to get at your data in other systems, too.
Apache Jackrabbit can be configured to use MySQL for storage, the discussion at http://markmail.org/message/fbkw5vey2mme4uxe is a good starting point.
"ModeShape isn't your father's JCR" covers all this in more detail, as does the Reference Guide on the project site.
So is it correct to say that ModeShape and Teiid are kind of the same thing, apart from the fact that one gives you a relational view and the other a hierarchial (or tree) based view of the various data sources?