Join documents in Couchbase view

Join documents in Couchbase view - couchbase

Noob Question on Couchbase / NOSQL :
I have 2 different types of documents which have a common field across them, I would like to join them using views based on that common field :
Document 1 - Key : Page_Object1
{
"URL": "/someurl/",
"title": "some title"
}
Document 2 Key : Zone_Object1
{
"URL": "/someurl/",
"zone": "some Ad zone"
"
}
Can someone pls help me join these 2 documents and return a single joined document (using view?) based off of the "url" field, both these documents live in the same bucket.

Actually, I guess the view code posted will not work.
The correct would be:
if (doc.title) {
emit (doc.url, doc.title)
} else if (doc.zone) {
emit (doc.url, doc.zone)
}
And it's true,
If you have N1QL available, you'll get a proper join mechanism.

According to the Couchbase documentation http://docs.couchbase.com/admin/admin/Views/views-querySample.html, this is not possible in the strict sense.
But you can define your view like
if (doc.type == "page") {
emit (doc.url, doc.title)
} else if (doc.type == "zone") {
emit (doc.url, doc.zone)
}
Now, when querying this view for a given url, you'll get both informations.
If you have N1QL available, you'll get a proper join mechanism.

Related

How to query documents which are arrays

I'm a novice Couchbase user and I have a bucket which I've created that contains documents which are actually arrays in the form of:
{
"key": [
{
"data1": "somedata1"
},
{
"data2": "somedata2"
}
]
}
I want to query these documents via N1QL statements and have yet to find a solution to how to do this properly. More specifically, I would like to select fields inside each sub-document that is in an array of a certain key. For example, I would like to access: key.[1].data2 or key.[0].data1
How should I do it?

Couchbase has some reserved keywords that need to be escaped. In this case, key needs to be escaped. For example, if you're querying against my_bucket, then
SELECT my_bucket.`key`[0].data1 FROM my_bucket;
should return somedata1

REST POST json body to support complex query advice

I'm fairly new to REST. All of our legacy webservices were SOAP based with enterprise (ORACLE or DB2) databases. We are now moving to REST/couchbase.
Our team is looking into implementing a complex query method. We already have implemented simple query methods using GET, for example GET returns all entries and a GET/067e6162-3b6f-4ae2-a171-2470b63dff00 would return the entry for 067e6162-3b6f-4ae2-a171-2470b63dff00.
We want to support a query method that would support receiving several query parameters such a list of Ids and date ranges. The number of Ids can number into a few thousand and because of this, we realize we cannot pass these query parameters in a GET HTTP header since there is a limit on header size.
We are starting to look into passing our query parameters into the JSON body of a POST request. For example, we could have client pass in a few thousand Ids as an array and also pass in a date range, so we'd have each query param/filter be an object. The JSON body would then be an array of objects. For example:
{
"action" : "search",
"queryParameters" : {
[
{
“operation”: “in”,
"key" : "name.of.attribute.Id",
"value" : "[{ "id: "067e6162-3b6f-4ae2-a171-2470b63dff00"}, {"id": "next id"....}],
},
{
“operation”: “greater”,
"key" : "name.of.attribute “,
"value" : "8/20/2016"
},
{
“operation”: “less”,
"key" : "name.of.attribute “,
"value" : "8/31/2016"
}
]
}
The back end code would then receive POST and read the body. It would see action is a search and then look for any entries in the list that are in the list of Ids that are in the date range of > 8/20/2016 and < 8/31/2016.
I've been trying to look online for tips/best practices on how best to structure the JSON body for complex queries but have not found much. So any tips, guidance or advice would be greatly appreciated.
thanks.

Possible to chain results in N1ql?

I'm currently trying to do a bit of complex N1QL for a project I'm working on, theoretically I could do all of this processing in multiple N1QL calls and by parsing the results each time, however if possible I'd like for this to contained in one call.
What I would like to do is:
filter all documents that contain a "dataSync.test.id" field with more than 1 id
Read back all other ids in that list
Use that list to get other documents containing those ids
Get the "dataSync.test._channels" field for those documents (optionally a filter by docType might help parsing)
This would probably return a list of "dataSync.test._channels"
Is this possible in N1QL? It appears like it might be but I can't get the syntax right.
My data structures look a little like
{
"dataSync": {
"test": {
"_channels": [
"RP"
],
"id": [
"dataSync_user_1015",
"dataSync_user_1010",
"dataSync_user_1005"
],
"_lastUpdatedBy": "TEST"
}
},
...
}
{
"dataSync": {
"test": {
"_channels": [
"RSD"
],
"id": [
"dataSync_user_1010"
],
"_lastUpdatedBy": "TEST"
}
},
...
}

Yes. I think you can do all these.
Initial set of IDs with filtering can be retrieved as a subquery and then you can get subsquent documents by joins.
SELECT fulldoc
FROM (select meta().id as dockey from doc where a=1) as mydoc
INNER JOIN doc fulldoc ON KEYS mydoc.dockey;
There are optimizations that can be done here. Try the sequencing first to ensure you're get the job done.

REST API status as integer or as string?

Me and my colleague are working on REST API. We've been arguing quite a lot whether status of a resource/item should be a string or an integer---we both need to read, understand and modify this resource (using separate applications). As this is a very general subject, google did not help to settle this argument. I wonder what is your experience and which way is better.
For example, let's say we have Job resource, which is accesible through URI http://example.com/api/jobs/someid and it has the following JSON representation which is stored in NoSQL DB:
JOB A:
{
"id": "someid",
"name": "somename",
"status": "finished" // or "created", "failed", "compile_error"
}
So my question is - maybe it should be more like following?
JOB B:
{
"id": "someid",
"name": "somename",
"status": 0 // or 1, 2, 3, ...
}
In both cases each of us would have to create a map, that we use to make sense of status in our application logic. But I myself am leaning towards first one, as it is far more readable... You can also easily mix up '0' (string) and 0 (number).
However, as the API is consumed by machines, readability is not that important. Using numbers also has some other advantages - it is widely accepted when working with applications in console and can be beneficial when you want to include arbitrary new failed statuses, say:
status == 50 - means you have problem with network component X,
status > 100 - means some multiple special cases.
When you have numbers, you don't need to make up all those string names for them. So which way is best in you opinion? Maybe we need multiple fields (this could make matters a bit confusing):
JOB C:
{
"id": "someid",
"name": "somename",
"status": 0, // or 1, 2, 3...
"error_type": "compile_error",
"error_message": "You coding skill has failed. Please go away"
}

Personally I would look at handling this situation with a combination of both approaches you have mentioned. I would store the statuses as integers within a database, but would create an enumeration or class of constants to map status names to numeric status values.
For example (in C#):
public enum StatusType
{
Created = 0,
Failed = 1,
Compile_Error = 2,
// Add any further statuses here.
}
You could then convert the numeric status stored in the database to an instance of this enumeration, and use this for decision making throughout your code.
For example (in C#):
StatusType status = (StatusType) storedStatus;
if(status == StatusType.Created)
{
// Status is created.
}
else
{
// Handle any other statuses here.
}
If you're being pedantic, you could also store these mappings in your DB.
For access via an API, you could go either way depending on your requirements. You could even return a result with both the status number and status text:
object YourObject
{
status_code = 0,
status = "Failed"
}
You could also create an API to retrieve the status name from a code. However returning both the status code and name in the API would be the best from a performance standpoint.

Iterating through couchbase keys without a view

In couchbase, I was wondering if there was a way - WITHOUT using a view - to iterate through database keys. The admin interface appears to do this, but maybe its doing something special. What I'd like to is make a call like this to retrieve an array of keys:
$result = $cb->get("KEY_ALBERT", "KEY_FRED");
having the result be an array [KEY_ALEX, KEY_BOB, KEY_DOGBERT]
Again, I don't want to use a view unless there's no alternative. Doesn't look like its possible, but since the "view documents" in the admin appears to do this, I thought i'd double-check. I'm using the php interface if that matters.

Based on your comments, the only way is to create a simple view that emit only the id as par of the key:
function(doc, meta) {
emit( meta.id );
}
With this view you will be able to create query with the various options you need :
- pagination, range, ...
Note: you talk about the Administration Console, the console use an "internal view" that is similar to what I have written above (but not optimized)

I don't know about how couchbase admin works, but there are two options. First option is to store your docs as linked list, one doc have property (key) that points to another doc.
docs = [
{
id: "doc_C",
data: "somedata",
prev: "doc_B",
next: "doc_D"
},
{
id: "doc_D",
data: "somedata",
prev: "doc_C",
next: "doc_E"
}
]
The second approach is to use sequential id. You should have one doc that contain sequence and increment it on each add. It would be something like this:
docs = [
{
id: "doc_1",
data: "somedata"
},
{
id: "doc_2",
data: "somedata"
}
...
]
In this way you can do "range requests". To do this you form array of keys on server side:
[doc_1, doc_2 .... doc_N]and execute multiget query. Here is also a link to another example

The couchbase PHP sdk does support multiget requests. For a list of keys it will return an array of documents.
getMulti(array $ids, array $cas, int $flags) : array
http://www.couchbase.com/autodocs/couchbase-php-client-1.1.5/classes/Couchbase.html#method_getMulti

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008