I have a Couchbase server and a .NET client. When I named the bucket "default", everything ran fine, but when I created a bucket with another name like "cashdb", my client got a "Null Pointer Exception" error.
If you want to have, say, three buckets with different names on one server, what should you do?
When you have multiple buckets (or even a single bucket that's not named 'default'), you have to explicitly specify which one you want to open when creating the connection.
In the 1.x SDK it's:
var config = new CouchbaseClientConfiguration();
config.Bucket = "mybucket";
config.BucketPassword = "12345";
var connection = new CouchbaseClient(config);
In the 2.x SDK it's slightly longer, so take a look here: http://docs.couchbase.com/developer/dotnet-2.0/configuring-the-client.html
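For reference, the 2.x equivalent looks roughly like this (the host, bucket name, and password are placeholders for your own values):

```csharp
using System;
using System.Collections.Generic;
using Couchbase;
using Couchbase.Configuration.Client;

var config = new ClientConfiguration
{
    // Point the client at your cluster; replace with your own node address.
    Servers = new List<Uri> { new Uri("http://127.0.0.1:8091/pools") }
};

var cluster = new Cluster(config);

// Open the named bucket explicitly instead of relying on "default".
var bucket = cluster.OpenBucket("cashdb", "12345");
```

The key point in both SDK versions is the same: a bucket other than "default" must be named explicitly when you open the connection.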
I cannot answer the part about the .NET driver, but I can address the multiple-bucket question.
You can have multiple buckets, but know why you are doing it. Logical organization alone is not necessarily a great reason, IMO, since more buckets means more resources being used. One great example of when you would split data into separate buckets is views. If you have views that only look at a subset of your data and will never look at the rest, it might make sense to split that subset out. Say you have JSON docs that make up 30% of your data and a bunch of key-value pairs that make up the other 70%. More than likely you would only ever define views on the JSON docs, and if there are enough of them, at large enough sizes, splitting them into their own bucket can give you much faster view creation, view maintenance, cluster rebalances, etc.
Another reason is if you have multiple applications accessing the same cluster. That is a good reason too.
Anyhow, it is fine to have multiple buckets, just read up on and understand the implications and do it strategically.
Related
I have only heard of the Couchbase bucket; is there also a "basket"? I would like to have multiple apps use multiple buckets, but for Couchbase performance, is there something lighter than a bucket, called a basket?
Never heard of a basket in Couchbase. That being said, we strongly encourage people to add a type field to every document stored in a bucket. Before we had queries, we would tell you to support multiple applications by prefixing all your document keys with an app prefix. Now that we have N1QL and you can run queries based on document content, you should add an app field in the document as well.
From a security perspective, you'll be mixing stuff from different apps in the same bucket. We currently have no way to distinguish one app's documents from another's on the server side, which means your security model has to be handled at the client/application layer.
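As a sketch of both conventions (the bucket name `apps`, the key `app1::user::42`, and the field names here are illustrative assumptions, not fixed Couchbase names):

```
Key:   app1::user::42
Value: { "type": "user", "app": "app1", "name": "Alice" }

N1QL:  SELECT name FROM apps WHERE type = "user" AND app = "app1";
```

The key prefix lets older key-value code partition by application, while the `type` and `app` fields let N1QL queries do the same filtering on content.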
I'll describe the application I'm trying to build and the technology stack I'm thinking at the moment to know your opinion.
Users should be able to work on a list of tasks. These tasks come from an API with all the information about them: id, image URLs, description, etc. The API is only available in one datacenter, and to avoid the delay, for example in China, the tasks are stored in a queue.
So you'll have different queues depending on your country, and once you finish your task it is sent to another queue, which later writes this information back to the original datacenter.
The list of tasks is quite large; that's why there is an API call to fetch the tasks (~10k rows) and store them in a queue, and users work on them depending on the country queue they are in.
For this system, where you can have around 100 queues, I was thinking of Redis to manage the task-list requests (e.g. get 5k rows from the China queue, write 500 rows to the write queue, etc.).
The API responses come as a list of JSON objects. These 10k rows, for example, need to be stored somewhere. Since you need to be able to filter within this queue, MySQL isn't an option unless I store every field of the JSON object as a new row. The first thought is a NoSQL DB, but I wasn't too happy with MongoDB in the past, and the shape of an API response doesn't change much. Since I also need relational tables for other things, I was thinking of PostgreSQL: it's a relational database, and it has the ability to store JSON and filter by it.
What do you think? Ask me if something isn't clear.
You can use PostgreSQL's json/jsonb types to store JSON (or the HStore extension for flat key-value data), or dynamic columns from MariaDB (a MySQL fork).
If you can move your persistence stack to Java, then many interesting options become available: MapDB (but it requires memory and its API is changing rapidly), Persistit, or MVStore (the engine behind H2).
All of these would let you store JSON with decent performance. I suggest you use a full-text search engine like Lucene to avoid searching JSON content in a slow way.
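A minimal PostgreSQL sketch of the jsonb approach (the table and field names are assumptions based on your description):

```sql
CREATE TABLE tasks (
    id      serial PRIMARY KEY,
    payload jsonb NOT NULL
);

-- Filter on a field inside the stored JSON document.
SELECT id, payload->>'description'
FROM tasks
WHERE payload->>'country' = 'CN';

-- A GIN index keeps these JSON field filters fast at ~10k+ rows.
CREATE INDEX tasks_payload_idx ON tasks USING gin (payload);
```

This gives you the relational tables you need alongside filterable JSON in the same database.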
When you create a document using the Couchbase Server API, one of the arguments is a document name. What is this used for and why is it needed?
When using Couchbase Lite you can create an empty document and it is assigned an _id and _rev. You do not need to give it a name. So what is this argument for in Couchbase Server?
In Couchbase Server it is a design decision that all objects are identified by the object ID, key, or name (all the same thing by different names), and those are not auto-assigned. The reason for this is that keys are not embedded in the document itself; key lookups are the fastest way to get an object, and the technology under the hood of the server dictates this. Getting a document by ID is much faster than querying for it. Querying means you are asking a question, whereas getting the object by ID means you already know the answer and are just telling the DB to go get it for you, and it is therefore faster.
If the ID is something random, then more than likely you must query the DB, and that is less efficient. Couchbase Mobile's Sync Gateway together with Couchbase Lite handles this on your behalf if you want it to, as it can have its own keyspace and key pattern that it manages for key lookups. If you are going straight to the DB on your own with a Couchbase SDK, though, knowing the key will be the fastest way to get that object. Like I said, Sync Gateway handles this lookup for you, as it is the app server. When you go direct with the SDKs you get more control, and different design patterns emerge.
Many people in Couchbase Server create a key pattern that means something to their application. As an example for a user profile store I might consider breaking up the profile into three separate documents with a unique username (in this example hernandez94) for each document:
1) login-data::hernandez94 is the object that has the encrypted password since I need to query that all of the time and want it in Couchbase's managed cache for performance reasons.
2) sec-questions::hernandez94 is the object that has the user's 3 security questions, and since I do not use that very often, I do not care if it is in the managed cache.
3) main::hernandez94 is the user's main document that has everything else that I might need to query often, but not nearly as often as other times.
This way I have tailored my keyspace naming to my application's access patterns, and I therefore get only the data I need, exactly when I need it, for best performance. Since these key names are standardized in my app, I could do a parallelized bulk get on all three of these documents, because my app can construct the names, and it would be VERY fast. Again, I am not querying for the data; I have the keys, so I just go get them. I could normalize this keyspace naming further depending on the access patterns of my application: email-addresses::hernandez94, phones::hernandez94, appl-settings::hernandez94, etc.
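As a sketch with the 1.x .NET SDK, given a connected CouchbaseClient named client (the username is the illustrative one from above):

```csharp
var username = "hernandez94";

// Construct the keys directly from the standardized naming pattern;
// no query is needed because the application already knows the keys.
var keys = new[]
{
    "login-data::" + username,
    "sec-questions::" + username,
    "main::" + username
};

// One bulk get returns all matching documents keyed by their IDs.
IDictionary<string, object> docs = client.Get(keys);
```

The bulk get fans out to the right nodes in parallel under the hood, which is why this pattern stays fast even across several documents per user.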
I am thinking of working with couchbase for my next web application, and I am wondering how my data should be structured, specifically the use of buckets. For example, assuming each user is going to have a lot of unique data, should a separate bucket be created for each user (maybe even for different categories of data)? Also I am wondering if there is any real advantage/disadvantage to separating data using buckets (aside from the obvious organizational benefit) instead of simply storing everything in one bucket.
You will not get any performance gain from using more or less buckets. The reason that Couchbase has buckets is so that it can be multi-tenant. The best use case I can think of for using multiple buckets is if you are a hosting provider and you want to have different users using the same database server. Buckets can be password protected and would prevent one user from accessing another users data.
Some people create multiple buckets for organizational purposes. Maybe you are running two different applications and you want the data to be separate or as you mentioned maybe you want to split data by category.
In terms of management, though, it is probably best to create as few buckets as possible for your application, since it will simplify your client logic by reducing the number of connections you need to Couchbase from your web tier (client). For each bucket you have, you must create a separate client connection.
It is recommended to have one repository per aggregate.
However, I have a case where the same aggregate object can be fetched from 2 heterogeneous data stores. For the background, that object is:
fetched from data store A (remote & read-only)
presented to the user for validation
on validation, imported into data store B (local & read-write)
it can be fetched from and modified in data store B
Obviously (or not), I can't have a single repository for that aggregate - at some point I need to know which data store the object was fetched from.
Given that the domain layer should ignore the infrastructure, my particular case breaks somehow my understanding of how the repository pattern and DDD in general should be properly implemented.
Did I get something wrong?
Did I get something wrong?
Seems to me what you got wrong is having two data stores for the same data.
If indeed there's a good reason for this redundancy, the two aggregates must be different in some way, and that might justify considering them as separate aggregates and having two repositories.
If you want to treat them as a single aggregate, a single repository should know how to disambiguate and deal with the correct datastore, but encapsulate that knowledge of datastores away from your domain model.
EDIT:
In the situation as explained in comments, where one datastore is read-only and the other a local modifiable copy, having two datastores is in fact forced on you. Your repository needs to know about both datastores and use the remote read-only store only if it does not find something locally. Immediately upon retrieving something from the remote, it should save it to the local and thereafter use the local copy.
This logic is sort of a caching proxy, but not exactly, as the remote is read-only and the local is read-write. It might contain enough logic to be extracted to a service used by the repository, but shouldn't be exposed to the domain.
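A minimal sketch of such a repository, assuming the store interfaces and the Aggregate type are illustrative names rather than anything prescribed by DDD (the domain model only ever sees AggregateRepository):

```csharp
public class Aggregate
{
    public string Id { get; set; }
}

public interface IDataStore
{
    Aggregate Find(string id);
}

public interface IWritableDataStore : IDataStore
{
    void Save(Aggregate aggregate);
}

public class AggregateRepository
{
    private readonly IDataStore _remote;        // data store A: remote, read-only
    private readonly IWritableDataStore _local; // data store B: local, read-write

    public AggregateRepository(IDataStore remote, IWritableDataStore local)
    {
        _remote = remote;
        _local = local;
    }

    public Aggregate Get(string id)
    {
        // Prefer the local copy; fall back to the remote read-only store.
        var aggregate = _local.Find(id);
        if (aggregate == null)
        {
            aggregate = _remote.Find(id);
            if (aggregate != null)
            {
                // Import on first retrieval; the local copy is used from then on.
                _local.Save(aggregate);
            }
        }
        return aggregate;
    }

    public void Save(Aggregate aggregate) => _local.Save(aggregate);
}
```

The "which datastore?" decision lives entirely inside the repository, so the domain layer stays ignorant of the infrastructure, as desired.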
This situation also has some risks that you need to think about. Once you've saved something locally, you have two versions of the same data, which will get out of sync. What do you do if someone with write access on the remote changes it after you've changed your local copy?