Is there such a thing as a Couchbase basket?

I have only heard of the Couchbase bucket; is there also a basket? I would like to have multiple apps use multiple buckets, but for Couchbase performance, is there something lighter than a bucket, called a basket?

I have never heard of a basket in Couchbase. That being said, we strongly encourage people to add a type field to every document stored in a bucket. Before queries existed, we would tell you to support multiple applications by prefixing every document key with an app prefix. Now that we have N1QL and you can run queries based on document content, you should add a field in the doc as well.
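For example, a minimal sketch of that pattern with the 2.x .NET SDK, assuming bucket is an already-opened IBucket; the app prefix, key, and field names here are purely illustrative:

// Prefix the key with the owning app so each app's documents stay distinct,
// and add a "type" field so content-based (N1QL) queries can filter on it.
var doc = new { type = "user", app = "app1", name = "Jane" };
bucket.Upsert("app1::user::1001", doc);
// e.g. in N1QL: SELECT * FROM `mybucket` WHERE type = 'user'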
From a security perspective, you will be mixing documents from different apps in the same bucket. We currently have no way to distinguish one app's documents from another's on the server side, which means your security model has to be handled at the client/application layer.

Related

Why does the Couchbase Server API require a name for new documents

When you create a document using the Couchbase Server API, one of the arguments is a document name. What is this used for and why is it needed?
When using Couchbase Lite you can create an empty document and it is assigned an _id and _rev. You do not need to give it a name. So what is this argument for in Couchbase Server?
In Couchbase Server, it is a design decision that all objects are identified by an object ID, key, or name (all the same thing under different names), and those are not auto-assigned. The reason is that keys are not embedded in the document itself; key lookups are the fastest way to get an object, and the server's underlying technology dictates this. Getting a document by ID is much faster than querying for it. Querying means you are asking a question, whereas getting the object by ID means you already know the answer and are just telling the DB to go get it for you, which is therefore faster.
If the ID is something random, then more than likely you must query the DB, and that is less efficient. Couchbase Mobile's Sync Gateway, together with Couchbase Lite, handles this on your behalf if you want it to, as it can manage its own keyspace and key pattern for key lookups. If you are going straight to the DB on your own with a Couchbase SDK, though, knowing the key will be the fastest way to get the object. As I said, Sync Gateway handles this lookup for you, since it is the app server. When you go direct with the SDKs, you get more control, and different design patterns emerge.
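As a concrete illustration, a key lookup with the .NET SDK is a single call with no query machinery involved (the bucket variable and key are illustrative):

// Direct fetch by key: the client hashes the key straight to the right node.
var result = bucket.Get<dynamic>("user::123");
// versus a N1QL query, which must be parsed, planned, and executed:
//   SELECT * FROM `profiles` WHERE username = 'hernandez94'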
Many people using Couchbase Server create a key pattern that means something to their application. As an example, for a user profile store I might break the profile into three separate documents, each keyed by a unique username (in this example, hernandez94):
1) login-data::hernandez94 is the object that holds the encrypted password, since I need to fetch it all of the time and want it in Couchbase's managed cache for performance reasons.
2) sec-questions::hernandez94 is the object that holds the user's 3 security questions; since I do not use it very often, I do not care if it is in the managed cache.
3) main::hernandez94 is the user's main document holding everything else that I might need often, but not nearly as often as the other pieces.
This way I have tailored my keyspace naming to my application's access patterns, and therefore I get only the data I need, exactly when I need it, for best performance. Since these key names are standardized in my app, I could do a parallelized bulk get on all three of these documents: my app can construct the names itself, so it would be VERY fast. Again, I am not querying for the data; I have the keys, so I just go get them. I could normalize this keyspace naming further depending on my application's access patterns: email-addresses::hernandez94, phones::hernandez94, appl-settings::hernandez94, etc.
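A rough sketch of that parallelized bulk get, assuming the 2.x .NET SDK's GetAsync and an already-opened bucket (variable names are illustrative):

// Construct all three keys from the username -- no query needed.
var login     = bucket.GetAsync<dynamic>("login-data::" + username);
var questions = bucket.GetAsync<dynamic>("sec-questions::" + username);
var main      = bucket.GetAsync<dynamic>("main::" + username);
await Task.WhenAll(login, questions, main); // all three fetched in parallel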

Any drawbacks of building a website on a JSON API for the Data Access Layer?

For instance, in ecommerce websites we generally have two interfaces: one with which customers interact and place orders, and one with which company employees interact to manage orders, customers, etc.
Suppose we divide this website into two different websites. That means two different projects altogether, not dependent on each other. The only thing common to both websites will be the database; both will use the same database. What, then, would be a good option for the Data Access Layer?
1. Each website has its own database access code and entities.
2. Link both websites through a centralized layer which exposes reads/writes to the database via a JSON-based API.
In my opinion, the second option would be better, as it removes the direct dependency on the database: any change made to the database does not need to be made in two places, among other benefits.
My only concern is how much this could hamper the performance of the overall system, because in that case we are serializing and deserializing objects and also making HTTP connections.
Could someone please shed some light on the benefits and drawbacks of an API-backed Data Access Layer compared to each site having its own database access code?
People disagree about the best architecture for this sort of thing, but one common and popular architectural guideline suggests that you avoid integrating two products at the database layer at all costs. It is simpler to have two separate apps and databases which can change independently of each other, and if you need to reference data from one in the other, you should have some sort of event pipeline between the two, configured on the ESB (enterprise service bus).
And you should probably have more than two back-end databases anyway: unless you have an incredibly simple system with only the two classes of objects you mentioned, you'll probably find that you have more than two bounded contexts.
Also, if your performance requirements increase, you'll probably want to look at splitting the read and write sides of your services and databases, connecting the two sides through an eventing system of some sort (maybe event sourcing).
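To make that read/write split concrete, here is a rough C# sketch of separated command and query sides; every name below is illustrative, not a prescribed API:

public class Order { public string Id; public decimal Total; }
public class OrderSummary { public string Id; public decimal Total; }

// Write side: commands mutate state and publish events.
public interface IOrderCommands
{
    void PlaceOrder(Order order); // persists to the write store, emits an OrderPlaced event
}

// Read side: queries serve denormalized views kept current from the event stream.
public interface IOrderQueries
{
    OrderSummary GetSummary(string orderId); // reads from a read-optimized store
}

The eventing system mentioned above is what would carry each OrderPlaced event from the write store over to the read store.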
Before you decide what to do, you should read Implementing Domain-Driven Design by Vaughn Vernon, as well as Martin Fowler's articles on CQRS and on event sourcing. For extra points, you should also read Fowler on the microservices architecture.
Finally, on JSON: I'm a big fan, but you should only use it at the repository interface if you're either using JavaScript on both the back end (a great idea if you're using io.js and Koa) and the front end (Backbone and Marionette, please), or if you're using a data source that natively emits JSON. If you have to parse it, it's only going to slow you down, so use some format native to the data source and its consumers; that way you'll be as fast as possible.
An API-centric approach makes more sense, as the data is standardised, and it gives you more flexibility by being usable from any language and for one or multiple interfaces.
Performance-wise, this would depend greatly on the quality and implementation of the technology stack behind the API. You could also look at caching certain data on the front end to improve page-load time.
The guys over at moltin have already built a platform like this and I've had great success using it. There's already a backend dashboard and the response times are pretty fast too!

Connect to a bucket not named 'default'

I have a Couchbase server and a .NET client. When I name the bucket "default", everything runs OK, but when I create a bucket with another name, like 'cashdb', my client gets a "Null Pointer Exception" error.
I also really don't know: if you want to have 3 buckets with different names on one server, what do you do?
When you have multiple buckets (or even a single bucket that's not named 'default'), you have to explicitly specify which one you want to open when creating the connection.
In the 1.x SDK it's:
// 1.x SDK: name the bucket (and its password) explicitly in the client configuration
var config = new CouchbaseClientConfiguration();
config.Bucket = "mybucket";
config.BucketPassword = "12345";
var connection = new CouchbaseClient(config);
In the 2.x SDK it's slightly longer, so take a look here: http://docs.couchbase.com/developer/dotnet-2.0/configuring-the-client.html
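For reference, a minimal 2.x sketch looks roughly like this (the server address, bucket name, and password are illustrative):

using System;
using System.Collections.Generic;
using Couchbase;
using Couchbase.Configuration.Client;

var config = new ClientConfiguration
{
    Servers = new List<Uri> { new Uri("http://127.0.0.1:8091") }
};
var cluster = new Cluster(config);
// Open the non-default bucket by name (and password, if one is set).
var bucket = cluster.OpenBucket("cashdb", "12345");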
I cannot answer the part about the .NET driver, but I can address the multiple-bucket question.
You can have multiple buckets, but know why you are doing it. Logical organization is not necessarily a great reason, IMO; more buckets means more resources being used. One great example of when you would split data into separate buckets is views. If you have views that only look at a subset of your data and will never, ever look at the other parts, then it might make sense to split the data out. Say you have JSON docs that make up 30% of your data and a bunch of key-value pairs that make up the other 70%. More than likely you would only ever define views on the JSON docs, and if there are enough of those docs, at large enough sizes, splitting them out can give you much faster view creation, maintenance, cluster rebalances, etc.
Another reason is if you have multiple applications accessing the same cluster. That is a good reason too.
Anyhow, it is fine to have multiple buckets; just read up on and understand the implications, and do it strategically.

What is the proper way to separate data in Couchbase?

I am thinking of working with couchbase for my next web application, and I am wondering how my data should be structured, specifically the use of buckets. For example, assuming each user is going to have a lot of unique data, should a separate bucket be created for each user (maybe even for different categories of data)? Also I am wondering if there is any real advantage/disadvantage to separating data using buckets (aside from the obvious organizational benefit) instead of simply storing everything in one bucket.
You will not get any performance gain from using more or fewer buckets. The reason Couchbase has buckets is so that it can be multi-tenant. The best use case I can think of for multiple buckets is a hosting provider that wants different users on the same database server. Buckets can be password-protected, which prevents one user from accessing another user's data.
Some people create multiple buckets for organizational purposes. Maybe you are running two different applications and you want the data to be separate or as you mentioned maybe you want to split data by category.
In terms of management, though, it is probably best to create as few buckets as possible for your application, since that will simplify your client logic by reducing the number of connections you need from your web tier (client) to Couchbase. For each bucket, you must create a separate client connection.
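To illustrate, with the 2.x .NET SDK each bucket you open is its own connection object (the bucket names here are made up):

var cluster = new Cluster(); // defaults to localhost
var users    = cluster.OpenBucket("users");    // one connection
var sessions = cluster.OpenBucket("sessions"); // a second, separate connection
// Fewer buckets means fewer of these connections for your web tier to manage.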

Drupal: many fields/content types mean *many* tables, and after a point MySQL becomes very slow

As I create more and more fields and content types, I see that Drupal creates a huge number of tables (>1,000) in MySQL, and after a point my system becomes very slow.
I have tried several MySQL performance-tuning tips, but nothing has improved performance significantly. Enabling caching makes for good speed on the front end, but if I try to edit a content type from the admin back end, it takes forever!
How do you cope with that? How do you scale Drupal?
If the sheer number of tables has become the database performance bottleneck, I'd have to agree with Rimian. You can define your own content types programmatically and then develop your own content model by leveraging the Node API.
API documentation and an example of doing just that are here: http://api.drupal.org/api/drupal/developer--examples--node_example--node_example.module/6
The code flow is basically:
Make Drupal recognize your content type
Define the fields it needs to take using the Forms API
Define how each of the Node API's functions should behave (view, load, save, etc.).
This gives you control over how things are stored, yet still lets you (and all contributed modules) leverage the hook system for Node API calls.
Obvious drawbacks are missing out on all of the features/modules that directly depend on CCK for their functionality. But at >1k tables (which suggests a gargantuan number of content types and fields), it sounds like you're at that level of custom work already.
I worked on a Drupal 5 site with more than a million nodes and this was a serious issue.
If you're scaling Drupal up to enterprise level, consider not using CCK for your fields and developing your own content model with the node API. It's actually quite easy.
The devel module offers a performance-monitoring tool that will show you all queries performed, organized by time, along with which hooks and modules called them, etc.
Just don't run it in production.