Sometimes when inserting values into a Couchbase Lite database, the Document.putProperties() method throws a CouchbaseLiteException with status code 409 (conflict). Why does that exception happen?
This happens when you try to save a document that has been altered by a different writer. The document revision you're trying to save is in conflict with the existing document.
This can happen because two code paths in your app write to the same document, which is usually a bug in how you manage your threads.
If you're sure it's not a threading bug in your app, then most likely you're running a replication. Replications run on a separate thread, so it's possible for the replication to alter a document between the time you retrieve it and the time you write your modified version.
Check the Couchbase Lite documentation on Documents. Read through the section on updating documents.
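If you're on the 1.x Java/Android API, one way to cope is to let Document.update() retry for you: it re-reads the current revision and re-runs your callback whenever the save hits a 409. A minimal sketch, assuming a hypothetical "count" field (the class name, document ID and field are placeholders, not from your question, and the exact API may differ between versions):

    import com.couchbase.lite.CouchbaseLiteException;
    import com.couchbase.lite.Database;
    import com.couchbase.lite.Document;
    import com.couchbase.lite.UnsavedRevision;

    import java.util.Map;

    public class ConflictSafeUpdate {
        // Bumps a counter, letting Couchbase Lite retry automatically on 409 conflicts.
        public static void incrementCount(Database database, String docId)
                throws CouchbaseLiteException {
            Document doc = database.getDocument(docId);
            // update() re-reads the latest revision and re-runs the callback if another
            // writer (for example the replicator) saved a new revision first.
            doc.update(new Document.DocumentUpdater() {
                @Override
                public boolean update(UnsavedRevision newRevision) {
                    Map<String, Object> props = newRevision.getUserProperties();
                    Number current = (Number) props.get("count");
                    props.put("count", current == null ? 1 : current.intValue() + 1);
                    newRevision.setUserProperties(props);
                    return true; // returning false abandons the update
                }
            });
        }
    }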
I am working on a project that uses Couchbase Server and Sync Gateway to synchronize the contents of a bucket with iOS and Android clients running Couchbase Lite. I also need read and write access to the Couchbase Server from a Node.js server application. From the research I've done, using shadowing is frowned upon (https://github.com/couchbase/sync_gateway/wiki/Bucket-Shadowing), which led me to look into the Sync Gateway API as a means to update the bucket from the Node.js application. Updating existing documents through the Sync Gateway API appears to require the most recent revision ID of the document to be passed in, requiring a separate read before the modification (http://mobile-couchbase.narkive.com/HT2kvBP0/cblite-sync-gateway-couchbase-server), which seems potentially inefficient. What is the best way to solve this problem?
Updating a document (which is really creating a new revision) requires the revision ID. Otherwise Couchbase can't associate the update with a parent revision, which breaks the whole approach to conflict resolution. (Couchbase uses a method known as multiversion concurrency control.)
The expectation is that you're updating the existing contents of a document. This implies you've read the document already, including the revision ID.
If for some reason you don't need the old contents to update the document, you still need the revision ID. If you work around it (for example, by purging a document through Sync Gateway and then pushing your new version) you can end up with two versions of the document in the system with no connection between them, which will cause a special kind of conflict.
So the short answer is no, there's no way to avoid this (without causing yourself other headaches).
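Concretely, against the Sync Gateway REST API this means a read-modify-write cycle: GET the document, take the _rev from the response, and send it back with the PUT. Below is a rough sketch in Java; the host, port, database and document names are placeholders, and a real client would use a proper JSON library instead of a regex:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class SyncGatewayUpdate {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            String docUrl = "http://localhost:4984/mydb/mydoc"; // placeholder Sync Gateway URL

            // 1. Read the current document to learn its revision ID.
            HttpResponse<String> get = client.send(
                    HttpRequest.newBuilder(URI.create(docUrl)).GET().build(),
                    HttpResponse.BodyHandlers.ofString());
            Matcher m = Pattern.compile("\"_rev\"\\s*:\\s*\"([^\"]+)\"").matcher(get.body());
            if (!m.find()) throw new IllegalStateException("no _rev in response");
            String rev = m.group(1);

            // 2. Write the new body, citing the revision we just read.
            String newBody = "{\"_rev\":\"" + rev + "\",\"status\":\"updated\"}";
            HttpResponse<String> put = client.send(
                    HttpRequest.newBuilder(URI.create(docUrl))
                            .header("Content-Type", "application/json")
                            .PUT(HttpRequest.BodyPublishers.ofString(newBody))
                            .build(),
                    HttpResponse.BodyHandlers.ofString());
            // A 409 here means someone else revised the document between the GET
            // and the PUT: re-read and try again.
            System.out.println(put.statusCode() + " " + put.body());
        }
    }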
I am not sure why your question was downvoted, as it seems like a reasonable question. You are correct: the Couchbase bucket used by Sync Gateway is probably best thought of as "opaque", and you should not be poking around in there and changing things. There are a number of implementations of Couchbase Lite, such as ones for Java, .NET, and Mac OS X. Have you considered making a web service that, on one side, serves your application and, on the other side, is itself a Couchbase Lite client? You should be able to separate your data as necessary using channels.
I want to implement a feature that shows when a Couchbase document was last read.
Is this saved by default in the document's metadata in Couchbase, or do I need to update the document with a field on every read so it can be retrieved later on?
There's nothing like that in the metadata, you'd have to update the document yourself.
Side note: For writes/updates, you could have made use of the auditing annotation feature of Spring Data (supported by Spring Data Couchbase since SDC 2.1.1) but not for reads.
Also note that performance will suffer, as you'd effectively have to perform a write for each read. And there are also potential consistency side effects: what if there's already a write of the same document happening in parallel?
To implement this, if you can wait for Couchbase Server 4.5, you might consider using the sub-document API (see this blog).
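As a rough illustration of that suggestion, with the Couchbase Java SDK 2.x a last-read timestamp could be recorded with a single sub-document mutation rather than rewriting the whole document. The field name, document ID and connection details below are placeholders, and the exact builder methods may differ between SDK versions:

    import com.couchbase.client.java.Bucket;
    import com.couchbase.client.java.CouchbaseCluster;
    import com.couchbase.client.java.document.JsonDocument;

    public class LastReadTracker {
        public static void main(String[] args) {
            CouchbaseCluster cluster = CouchbaseCluster.create("localhost"); // placeholder host
            Bucket bucket = cluster.openBucket("default");                   // placeholder bucket

            String docId = "user::42"; // placeholder document ID

            // The normal read the application already does.
            JsonDocument doc = bucket.get(docId);
            System.out.println(doc.content());

            // Record the read with a sub-document mutation: only the lastRead path
            // is sent over the wire, not the whole document.
            bucket.mutateIn(docId)
                  .upsert("lastRead", System.currentTimeMillis())
                  .execute();

            cluster.disconnect();
        }
    }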
I want to be able to read from a real, live, proper MySQL database using standard file access routines. I don't mean reading the MySQL database's own underlying private files. What I mean is implementing a file-based Linux device driver that "presents" a MySQL database as a file. In other words, the text file is a "view" of the MySQL database. The MySQL records are presented in our homegrown custom variation of the CSV format that the legacy code was originally written to understand.
Background
I have some legacy code that reads from a text file containing a very large table of data, each line being a separate record. New records (lines) need to be added, but there is contention for the file among the team, and there is also an overhead in deploying the legacy code and this file to many systems when releasing the software to them. The text file itself also needs to be version controlled.
Rather than modify the legacy code to call a MySQL database version of these records directly, I thought it would be better to leave it untouched. This avoids the risks of modifying the code and eases deployment; moreover, modifying the code would cause much overhead in de-risking, design discussions, more testing, and so on.
So what I'm looking to do is write a file-based device driver that makes the MySQL database appear as a file to the legacy code, with the data in the format the legacy code expects. That way the legacy code is not changed and can work oblivious to the fact that the file is really an underlying database. Contention is removed because individual records in the database can now be updated or added separately (via MySQL, or, even better, a separate web admin interface that guides and validates data entry from the user for individual records), and deployment effort is much reduced, without having to up-issue the whole file on all the systems that use it.
The device driver would contain routines to internally translate standard file read operations into queries to the MySQL database, and routines to translate the MySQL results into the text format returned to the file read operation.
This is for a Linux/Unix platform.
Has this been done and what are your thoughts?
This kind of thing has been done before - an obvious example being the dynamic view filing system in ClearCase which provided (maybe still does?) a virtualised view onto a version control repository. Behind the scenes it implemented an object cache and used RPC to fetch objects from other hosts if necessary, and made extensive use of both local and remote databases.
It's fairly clear that you are going to implement the bulk of your filing system in user-space, but you will need a (small) kernel resident portion. Unless there's a really good reason to do otherwise, FUSE is what you're looking for - it will provide the kernel-resident part for you. All you'll need to write is glue to turn file operations into SQL requests.
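Whichever FUSE binding you pick, the glue inside the read handler amounts to something like the sketch below: run a query and render the rows in the record-per-line text format the legacy code expects. The table name, column handling and connection details are placeholders, and the real formatting would follow your homegrown CSV variant:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.ResultSetMetaData;
    import java.sql.Statement;

    public class RecordsAsText {
        // Renders the records table as the line-per-record text the legacy code reads.
        // In a FUSE filesystem this would back the read() callback for the virtual file.
        public static String renderFile() throws Exception {
            StringBuilder out = new StringBuilder();
            try (Connection conn = DriverManager.getConnection(
                         "jdbc:mysql://localhost/legacy", "user", "password"); // placeholders
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT * FROM records ORDER BY id")) {
                ResultSetMetaData meta = rs.getMetaData();
                while (rs.next()) {
                    for (int i = 1; i <= meta.getColumnCount(); i++) {
                        if (i > 1) out.append(',');   // your homegrown CSV variant goes here
                        out.append(rs.getString(i));
                    }
                    out.append('\n');
                }
            }
            return out.toString();
        }
    }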
Am I correct in assuming that if a different process updates the DB, then my NHibernate-powered application will be out of sync? I'm almost using non-lazy updates.
My target DB is MySQL 5.0, if it makes any difference.
There isn't a simple way to answer that without more context.
What type of application are you thinking about (web, desktop, other)?
What do you think would be out of sync exactly?
If you have a desktop application with an open window and an open session that has data loaded, and you change the same entities somewhere else, then of course the loaded entities will be out of sync with the DB, but you can use Refresh to update those entities.
If you use NH second-level caching and you modify the cached entities somewhere else, the cache contents will be out of sync, but you can still use Refresh or cache-controlling methods to update directly from the DB.
In all cases, NH provides support for optimistic concurrency by using Version properties; those prevent modifications to out-of-sync entities.
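To make the version-property point concrete: in NHibernate this is the Version mapping; the sketch below is the JPA/Hibernate analogue, shown in Java as an illustration of the idea rather than as NHibernate code. An update carrying a stale version number fails instead of silently overwriting newer data:

    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Version;

    @Entity
    public class Invoice {
        @Id
        private Long id;

        private String status;

        // The persistence layer compares this value on UPDATE ("... WHERE version = ?")
        // and bumps it on success; a stale copy fails instead of overwriting newer data.
        @Version
        private int version;

        // getters and setters omitted
    }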
Yes, the objects in your current session will be out of sync, the same way a DataSet/DataTable would be out of sync if you fetch it and another process updates the same data.
The application's code and configuration files are maintained in a code repository. But sometimes, as part of the project, I also have some data (which in some cases can be >100MB, >1GB or so), which is stored in a database. Git does a nice job of handling the code and its changes, but how can the development team easily share the data?
It doesn't really fit in the code version control system, as it is mostly large binary files, and would make pulling updates a nightmare. But it does have to be synchronised with the repository, because some code revisions change the schema (i.e. migrations).
How do you handle such situations?
We have the data and schema stored in XML and use Liquibase to handle the updates to both the schema and the data. The advantage here is that you can diff the files to see what's going on, it plays nicely with any VCS, and you can automate it.
Due to the size of your database this would mean a sizable "version 0" file. But, using the migration strategy, the updates after that should be manageable, as they would only be deltas. You might be able to convert your existing migrations one-to-one to Liquibase as well, which might be nicer than a big-bang approach.
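As a sketch of the automation side, Liquibase can also be driven programmatically as part of a build or application startup step. The changelog file name and JDBC settings below are placeholders, and the exact classes vary between Liquibase versions:

    import java.sql.Connection;
    import java.sql.DriverManager;

    import liquibase.Contexts;
    import liquibase.Liquibase;
    import liquibase.database.Database;
    import liquibase.database.DatabaseFactory;
    import liquibase.database.jvm.JdbcConnection;
    import liquibase.resource.FileSystemResourceAccessor;

    public class ApplyMigrations {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://localhost/app", "user", "password")) { // placeholders
                Database db = DatabaseFactory.getInstance()
                        .findCorrectDatabaseImplementation(new JdbcConnection(conn));
                // db-changelog.xml holds the "version 0" snapshot plus the deltas.
                Liquibase liquibase = new Liquibase(
                        "db-changelog.xml", new FileSystemResourceAccessor(), db);
                liquibase.update(new Contexts()); // applies only change sets not yet run
            }
        }
    }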
You can also leverage #belisarius' strategy if your deltas are very large so each developer doesn't have to apply the delta individually.
It seems to me that your database has a lot of parallels with a binary library dependency: it's large (well, much larger than a reasonable code library!), binary, and has its own versions which must correspond to various versions of your codebase.
With this in mind, why not integrate a dependency manager (e.g. Apache Ivy) with your build process and let it manage your database? This seems like just the sort of task that a dependency manager was built for.
Regarding the sheer size of the data/download, I don't think there's any magic bullet (short of some serious document pre-loading infrastructure) unless you can serialize the data into a delta-able format (the XML/JSON/SQL you mentioned).
A second approach (maybe not so compatible with dependency management): If the specifics of your code allow it, you could keep a second file that is a manual diff that can take a base (version 0) database and bring it up to version X. Every developer will need to keep a clean version 0. A pull (of a version with a changed DB) will consist of: pull diff file, copy version 0 to working database, apply diff file. Note that applying the diff file might take a while for a sizable DB, so you may not be saving as much time over the straight download as it first seems.
We usually use the database sync or replication schema.
Each developer has 2 copies of the database, one for working and the other just for keeping the sync version.
When the code is synchronized, the script syncs the database too (the central DB against the "dead" developer's copy). After that, each developer updates his or her own working copy. Sometimes a developer needs to keep some of his or her data, so these second updates are not always driven by the standard script.
It is only as robust as the replication schema, and sometimes (depending on the DB) that is not good news.
DataGrove is a new product that gives you version control for databases. We allow you to store the entire database (schema and data), tag, restore and share the database at any point in time.
This sounds like what you are looking for.
We're currently working on features to allow git-like (push-pull) behavior so developers can share their repositories across machines and you can load the latest version of your database whenever you need it.