How does simulating joins work in Couchbase?

I have two documents, one of which depends on the other. The first:
{
"doctype": "closed_auctions",
"seller": {
"person": "person11304"
},
"buyer": {
"person": "person0"
},
"itemref": {
"item": "item1"
},
"price": 50.03,
"date": "11/17/2001",
"quantity": 1,
"type": "Featured",
"annotation": {
"author": {
"person": "person8597"
}
}
}
Here you can see that doc.buyer.person refers to another document, like this:
{
"doctype": "people",
"id": "person0",
"name": "Kasidit Treweek",
"profile": {
"income": 20186.59,
"interest": [
{
"category": "category251"
}
],
"education": "Graduate School",
"business": "No"
},
"watch": [
{
"open_auction": "open_auction8747"
}
]
}
How can I get the buyer's name from these two documents? That is, doc.buyer.person matches the second document's id. This is a join, and the documentation does not make it clear how to do it: http://docs.couchbase.com/couchbase-manual-2.0/#solutions-for-simulating-joins

Well, first off, let me point out that the very first sentence of the documentation section that you referenced says (I added the emphasis):
Joins between data, even when the documents being examined are
contained within the same bucket, are not possible directly within the
view system.
So, the quick answer to your question is that you have lots of options. Here are a few of them:
1. Assume you need only the name for a rather small subset of people. Create a view that outputs the PersonId as key and Name as value, then query the view by PersonId each time you need a name.
2. Assume you need many people joined to many auctions. Download the full contents of the basic index from #1 and execute the join on the client, for example using LINQ.
3. Assume you need many properties of the person, not just the name. Download the Person document for each auction item.
4. Assume you need a small subset from both Auction and People. Index the fields from each that you need, include a type field, and emit all of them under the key of the Person. You will be able to query the view for all items belonging to the person.
The last approach was used in the example you linked to in your question. For performance, it will be necessary to tailor the approach to your usage scenario.
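As a rough illustration of option 2, here is a minimal sketch of a client-side join in plain JavaScript. It assumes the "PersonId as key, Name as value" view rows and the auction documents have already been downloaded. The sample rows, the second person's name, and the function name joinBuyerNames are made up for the example.

```javascript
// Rows as a "PersonId -> Name" view might return them (hypothetical sample data).
const personRows = [
  { key: 'person0', value: 'Kasidit Treweek' },
  { key: 'person11304', value: 'Example Seller' } // made-up name for illustration
];

// Auction documents fetched separately (trimmed to the fields used here).
const auctions = [
  { doctype: 'closed_auctions', buyer: { person: 'person0' }, price: 50.03 }
];

// Build a lookup table once, then resolve each auction's buyer against it.
function joinBuyerNames(rows, auctionDocs) {
  const nameById = new Map(rows.map(function (row) {
    return [row.key, row.value];
  }));
  return auctionDocs.map(function (auction) {
    const buyerId = auction.buyer.person;
    return {
      buyerId: buyerId,
      buyerName: nameById.has(buyerId) ? nameById.get(buyerId) : null,
      price: auction.price
    };
  });
}

const joined = joinBuyerNames(personRows, auctions);
console.log(joined);
```

Building the Map once keeps the join linear in the number of auctions, which matters when many people are joined to many auctions.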

Another solution is to merge the data in a custom reduce function.
// view
function (doc, meta) {
if (doc.doctype === "people") {
emit(doc.id, doc);
}
if (doc.doctype === "closed_auctions") {
emit(doc.buyer.person, doc);
}
}
// custom reduce
function (keys, values, rereduce) {
var peoples = values.filter(function (doc) {
return doc.doctype === "people";
});
for (var key in peoples) {
var people = peoples[key];
people.closed_auctions = (function (peopleId) {
return values.filter(function (doc) {
return doc.doctype === "closed_auctions" && doc.buyer.person === peopleId;
});
})(people.id);
}
return peoples;
}
You can then query one user with "key" or multiple users with "keys".
That said, I don't know what the performance implications of this method are.
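To see what this map/reduce pair produces without a server, here is a small in-memory sketch. It is not how Couchbase executes views: the grouping loop is only a stand-in for a grouped view query, emit is passed in as a parameter instead of being the global that Couchbase provides, and the rereduce case is ignored. The sample documents are trimmed versions of the ones in the question.

```javascript
const docs = [
  { doctype: 'people', id: 'person0', name: 'Kasidit Treweek' },
  { doctype: 'closed_auctions', buyer: { person: 'person0' }, price: 50.03 }
];

// Same logic as the view above, with `emit` passed in instead of being a global.
function map(doc, emit) {
  if (doc.doctype === 'people') emit(doc.id, doc);
  if (doc.doctype === 'closed_auctions') emit(doc.buyer.person, doc);
}

// Same idea as the custom reduce above: attach auctions to the matching person.
function reduce(values) {
  const people = values.filter(function (d) { return d.doctype === 'people'; });
  return people.map(function (person) {
    const closed = values.filter(function (d) {
      return d.doctype === 'closed_auctions' && d.buyer.person === person.id;
    });
    return Object.assign({}, person, { closed_auctions: closed });
  });
}

// Group emitted values by key, then reduce each group separately.
const groups = new Map();
docs.forEach(function (doc) {
  map(doc, function (key, value) {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  });
});

const reduced = {};
groups.forEach(function (values, key) {
  reduced[key] = reduce(values);
});
console.log(JSON.stringify(reduced, null, 2));
```

Because both document types are emitted under the person id, each group already contains everything needed for that person, which is why the reduce can do the merge.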


How to query multiple fields with one value in Firebase Realtime Database? [duplicate]

{
"movies": {
"movie1": {
"genre": "comedy",
"name": "As good as it gets",
"lead": "Jack Nicholson"
},
"movie2": {
"genre": "Horror",
"name": "The Shining",
"lead": "Jack Nicholson"
},
"movie3": {
"genre": "comedy",
"name": "The Mask",
"lead": "Jim Carrey"
}
}
}
I am a Firebase newbie. How can I retrieve a result from the data above where genre = 'comedy' AND lead = 'Jack Nicholson'?
What options do I have?
Using Firebase's Query API, you might be tempted to try this:
// !!! THIS WILL NOT WORK !!!
ref
.orderBy('genre')
.startAt('comedy').endAt('comedy')
.orderBy('lead') // !!! THIS LINE WILL RAISE AN ERROR !!!
.startAt('Jack Nicholson').endAt('Jack Nicholson')
.on('value', function(snapshot) {
console.log(snapshot.val());
});
But as @RobDiMarco from Firebase says in the comments:
multiple orderBy() calls will throw an error
So my code above will not work.
I know of three approaches that will work.
1. filter most on the server, do the rest on the client
What you can do is execute one orderBy().startAt().endAt() on the server, pull down the remaining data, and filter that in JavaScript code on your client.
ref
.orderBy('genre')
.equalTo('comedy')
.on('child_added', function(snapshot) {
var movie = snapshot.val();
if (movie.lead == 'Jack Nicholson') {
console.log(movie);
}
});
2. add a property that combines the values that you want to filter on
If that isn't good enough, you should consider modifying/expanding your data to allow your use-case. For example: you could stuff genre+lead into a single property that you just use for this filter.
"movie1": {
"genre": "comedy",
"name": "As good as it gets",
"lead": "Jack Nicholson",
"genre_lead": "comedy_Jack Nicholson"
}, //...
You're essentially building your own multi-column index that way and can query it with:
ref
.orderBy('genre_lead')
.equalTo('comedy_Jack Nicholson')
.on('child_added', function(snapshot) {
var movie = snapshot.val();
console.log(movie);
});
David East has written a library called QueryBase that helps with generating such properties.
You could even do relative/range queries, let's say that you want to allow querying movies by category and year. You'd use this data structure:
"movie1": {
"genre": "comedy",
"name": "As good as it gets",
"lead": "Jack Nicholson",
"genre_year": "comedy_1997"
}, //...
And then query for comedies of the 90s with:
ref
.orderBy('genre_year')
.startAt('comedy_1990')
.endAt('comedy_2000')
.on('child_added', function(snapshot) {
var movie = snapshot.val();
console.log(movie);
});
If you need to filter on more than just the year, make sure to add the other date parts in descending order, e.g. "comedy_1997-12-25". This way the lexicographical ordering that Firebase does on string values will be the same as the chronological ordering.
This combining of values in a property can work with more than two values, but you can only do a range filter on the last value in the composite property.
A very special variant of this is implemented by the GeoFire library for Firebase. This library combines the latitude and longitude of a location into a so-called Geohash, which can then be used to do realtime range queries on Firebase.
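The composite-property idea from approach 2 can be sketched in plain JavaScript as follows. This is only an illustration of how the genre_lead values could be generated before writing to the database. The helper names compositeKey and addGenreLead are made up, and the final filter is a local stand-in for the server-side equalTo query, not a Firebase call.

```javascript
// Join the values with "_" in a fixed order, as in "comedy_Jack Nicholson".
function compositeKey() {
  return Array.prototype.slice.call(arguments).join('_');
}

// Return a copy of each movie with a genre_lead property added.
function addGenreLead(movies) {
  const result = {};
  Object.keys(movies).forEach(function (id) {
    const movie = movies[id];
    result[id] = Object.assign({}, movie, {
      genre_lead: compositeKey(movie.genre, movie.lead)
    });
  });
  return result;
}

const movies = {
  movie1: { genre: 'comedy', name: 'As good as it gets', lead: 'Jack Nicholson' },
  movie2: { genre: 'Horror', name: 'The Shining', lead: 'Jack Nicholson' },
  movie3: { genre: 'comedy', name: 'The Mask', lead: 'Jim Carrey' }
};

const indexed = addGenreLead(movies);

// Local stand-in for an equality query on the genre_lead property.
const wanted = compositeKey('comedy', 'Jack Nicholson');
const matches = Object.keys(indexed).filter(function (id) {
  return indexed[id].genre_lead === wanted;
});
console.log(matches); // [ 'movie1' ]
```

The order of the parts must be the same when writing and when querying, and the separator should not occur inside the values themselves.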
3. create a custom index programmatically
Yet another alternative is to do what we've all done before this new Query API was added: create an index in a different node:
"movies"
// the same structure you have today
"by_genre"
"comedy"
"by_lead"
"Jack Nicholson"
"movie1"
"Jim Carrey"
"movie3"
"Horror"
"by_lead"
"Jack Nicholson"
"movie2"
There are probably more approaches. For example, this answer highlights an alternative tree-shaped custom index: https://stackoverflow.com/a/34105063
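The by_genre/by_lead index from approach 3 could be generated along these lines. This is a minimal sketch in plain JavaScript. The function name buildIndex and the choice to store true under each movie key are assumptions for illustration, and writing the result back to the database is left out.

```javascript
// Build a nested by_genre -> by_lead -> movie id structure from the movies node.
function buildIndex(movies) {
  const byGenre = {};
  Object.keys(movies).forEach(function (id) {
    const movie = movies[id];
    if (!byGenre[movie.genre]) {
      byGenre[movie.genre] = { by_lead: {} };
    }
    const byLead = byGenre[movie.genre].by_lead;
    if (!byLead[movie.lead]) {
      byLead[movie.lead] = {};
    }
    // Assumption: the movie id is the key and `true` is a placeholder value.
    byLead[movie.lead][id] = true;
  });
  return { by_genre: byGenre };
}

const movies = {
  movie1: { genre: 'comedy', name: 'As good as it gets', lead: 'Jack Nicholson' },
  movie2: { genre: 'Horror', name: 'The Shining', lead: 'Jack Nicholson' },
  movie3: { genre: 'comedy', name: 'The Mask', lead: 'Jim Carrey' }
};

const index = buildIndex(movies);

// Looking up "comedy AND Jack Nicholson" is now a direct path read.
const comedyNicholson = Object.keys(index.by_genre.comedy.by_lead['Jack Nicholson']);
console.log(comedyNicholson); // [ 'movie1' ]
```

The cost of this approach is that the index has to be kept in sync whenever a movie is added, changed, or removed.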
If none of these options work for you, but you still want to store your data in Firebase, you can also consider using its Cloud Firestore database.
Cloud Firestore can handle multiple equality filters in a single query, but only one range filter. Under the hood it essentially uses the same query model, but it's like it auto-generates the composite properties for you. See Firestore's documentation on compound queries.
I've written a personal library that allows you to order by multiple values, with all the ordering done on the server.
Meet Querybase!
Querybase takes in a Firebase Database Reference and an array of fields you wish to index on. When you create new records it will automatically handle the generation of keys that allow for multiple querying. The caveat is that it only supports straight equivalence (no less than or greater than).
const databaseRef = firebase.database().ref().child('people');
const querybaseRef = querybase.ref(databaseRef, ['name', 'age', 'location']);
// Automatically handles composite keys
querybaseRef.push({
name: 'David',
age: 27,
location: 'SF'
});
// Find records by multiple fields
// returns a Firebase Database ref
const queriedDbRef = querybaseRef
.where({
name: 'David',
age: 27
});
// Listen for realtime updates
queriedDbRef.on('value', snap => console.log(snap));
Firebase ref = new Firebase("https://your.firebaseio.com/");
Query query = ref.orderByChild("genre").equalTo("comedy");
query.addValueEventListener(new ValueEventListener() {
    @Override
    public void onDataChange(DataSnapshot dataSnapshot) {
        for (DataSnapshot movieSnapshot : dataSnapshot.getChildren()) {
            // Read each child snapshot, not the parent snapshot
            Movie movie = movieSnapshot.getValue(Movie.class);
            if ("Jack Nicholson".equals(movie.getLead())) {
                System.out.println(movieSnapshot.getKey());
            }
        }
    }

    @Override
    public void onCancelled(FirebaseError firebaseError) {
    }
});
Frank's answer is good, but Firestore recently introduced array-contains, which makes it easier to do AND queries.
You can create a filters field to hold your filters, and you can add as many values as you need. For example, to filter by comedy and Jack Nicholson you can add the value comedy_Jack Nicholson, and if you also want to filter by comedy and 2014 you can add the value comedy_2014 without creating more fields.
{
"movies": {
"movie1": {
"genre": "comedy",
"name": "As good as it gets",
"lead": "Jack Nicholson",
"year": 2014,
"filters": [
"comedy_Jack Nicholson",
"comedy_2014"
]
}
}
}
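A small sketch of how such a filters array might be generated before saving the document. The function name buildFilters and the choice of combinations are assumptions for illustration. The includes() call at the end only mimics what an array-contains filter would check and is not a Firestore call.

```javascript
// Build the combined filter values for one movie.
function buildFilters(movie) {
  return [
    movie.genre + '_' + movie.lead,
    movie.genre + '_' + movie.year
  ];
}

const movie = {
  genre: 'comedy',
  name: 'As good as it gets',
  lead: 'Jack Nicholson',
  year: 2014
};

// Copy the movie and attach the generated filters field.
const movieWithFilters = Object.assign({}, movie, { filters: buildFilters(movie) });
console.log(movieWithFilters.filters);

// Local stand-in for an array-contains check on the filters field.
const matchesComedy2014 = movieWithFilters.filters.includes('comedy_2014');
console.log(matchesComedy2014);
```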
For Cloud Firestore
https://firebase.google.com/docs/firestore/query-data/queries#compound_queries
Compound queries
You can chain multiple equality operators (== or array-contains) methods to create more specific queries (logical AND). However, you must create a composite index to combine equality operators with the inequality operators, <, <=, >, and !=.
citiesRef.where('state', '==', 'CO').where('name', '==', 'Denver');
citiesRef.where('state', '==', 'CA').where('population', '<', 1000000);
You can perform range (<, <=, >, >=) or not equals (!=) comparisons only on a single field, and you can include at most one array-contains or array-contains-any clause in a compound query.
Firebase doesn't allow querying with multiple conditions.
However, I did find a workaround for this:
We need to download the initial filtered data from the database and store it in an array list.
Query query = databaseReference.orderByChild("genre").equalTo("comedy");
// Attach the listener to the filtered query, not to the unfiltered reference
query.addValueEventListener(new ValueEventListener() {
    @Override
    public void onDataChange(@NonNull DataSnapshot dataSnapshot) {
        ArrayList<Movie> movies = new ArrayList<>();
        for (DataSnapshot dataSnapshot1 : dataSnapshot.getChildren()) {
            String lead = dataSnapshot1.child("lead").getValue(String.class);
            String genre = dataSnapshot1.child("genre").getValue(String.class);
            Movie movie = new Movie(lead, genre);
            movies.add(movie);
        }
        filterResults(movies, "Jack Nicholson");
    }

    @Override
    public void onCancelled(@NonNull DatabaseError databaseError) {
    }
});
Once we have the initially filtered data from the database, we filter it further in our own code.
public void filterResults(final List<Movie> list, final String lead) {
    // Keep only the movies whose lead matches
    List<Movie> movies = list.stream().filter(o -> o.getLead().equals(lead)).collect(Collectors.toList());
    System.out.println(movies);
    movies.forEach(movie -> System.out.println(movie.getLead()));
}
Data from the Firebase Realtime Database arrives as an _InternalLinkedHashMap<dynamic, dynamic>.
You can convert it to your own Map type and then filter it easily.
For example, I have a chat app and I use the Realtime Database to store each user's uid together with a bool that says whether the user is online.
I have a class RealtimeDatabase with a static method getOnlineUsersUID():
static getOnlineUsersUID() {
var dbRef = FirebaseDatabase.instance;
DatabaseReference reference = dbRef.reference().child("Online");
reference.once().then((value) {
Map<String, bool> map = Map<String, bool>.from(value.value);
List users = [];
map.forEach((key, value) {
if (value) {
users.add(key);
}
});
print(users);
});
}
It will print [NOraDTGaQSZbIEszidCujw1AEym2].
I am new to Flutter. If you know more, please update the answer.
ref.orderByChild("lead").startAt("Jack Nicholson").endAt("Jack Nicholson").listner....
This will work, but note that it filters on lead only, not on genre.

Mongo query to get comma separated value

I have a query that only matches from the beginning of the value.
example:
{
"orderStatus": "SUBMITTED",
"orderNumber": "785654",
"orderLine": [
{
"lineNumber": "E1000",
"trackingnumber": "12345,67890",
"lineStatus": "IN-PROGRESS",
"lineStatusCode": 50
}
],
"accountNumber": 9076
}
find({'orderLine.trackingNumber' : { $regex: "^12345.*"} })
When I use the above query I get the entire document, but I also want to fetch the document when I search with the value 67890.
At any point in time I will query with a single tracking number only: either 12345 or 67890. The tracking number value may grow over time, e.g. 12345,56789,01234,56678.
I need to pull the whole document no matter which tracking number I search for or what position it is in.
The output should be the whole document:
{
"orderStatus": "SUBMITTED",
"orderNumber": "785654",
"orderLine": [
{
"lineNumber": "E1000",
"trackingnumber": "12345,67890",
"lineStatus": "IN-PROGRESS",
"lineStatusCode": 50
}
],
"accountNumber": 9076
}
I have also created an index on the trackingNumber field. Any help is appreciated. Thanks in advance.
The following queries match either 12345 or 67890 anywhere in the value, similar to a SQL LIKE condition:
find({'orderLine.trackingNumber' : { $regex: /12345/} })
find({'orderLine.trackingNumber' : { $regex: /67890/} })
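One caveat with an unanchored pattern such as /6789/ is that it also matches partial numbers. A hedged sketch of a stricter pattern that only matches a complete entry of the comma-separated list is shown below. The helper name trackingNumberPattern is made up for the example, and the filter object uses the field name from the query in the question.

```javascript
// Build a regex that matches one complete entry of a comma-separated list.
function trackingNumberPattern(trackingNumber) {
  // Escape regex metacharacters in case the value is not purely numeric.
  const escaped = String(trackingNumber).replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  // The entry must start at the beginning or after a comma,
  // and end at a comma or at the end of the string.
  return new RegExp('(^|,)' + escaped + '(,|$)');
}

const stored = '12345,67890';

const matchesSecond = trackingNumberPattern('67890').test(stored);
const matchesPartial = trackingNumberPattern('6789').test(stored);
const unanchoredPartial = /6789/.test(stored);

console.log(matchesSecond, matchesPartial, unanchoredPartial);

// The pattern can be placed in a query filter like the ones above.
const filter = { 'orderLine.trackingNumber': { $regex: trackingNumberPattern('67890') } };
console.log(filter);
```

Keep in mind that a pattern that is not anchored to the start of the string generally cannot use an ordinary index efficiently, so on a large collection storing the tracking numbers as an array may be worth considering.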
There's also an alternative way to do this
Create a text index
db.order.createIndex({'orderLine.trackingnumber':"text"})
You can make use of this index to search the value from trackingnumber field
db.order.find({$text:{$search:'12345'}})
--
db.order.find({$text:{$search:'67890'}})
--
//Do take note that you can't search using few in between characters
//like the following query won't give any result..
db.order.find({$text:{$search:'6789'}}) //have purposefully removed 0
To further understand how $text searches work, see the MongoDB documentation on text search.

DynamoDB&NodeJS: search table from JSON array

I have a DynamoDB table with only two columns, "EmailId" and "SubscriptionId". "EmailId" is the primary sort key and "SubscriptionId" is the primary partition key. I have to insert records into it, but before that I need to make sure that a record does not already exist. I get the records from a third-party API endpoint as a JSON array. So I will have to search the table and insert the records that do not exist in it.
The records I get are in the following format. This is a sample response; the array may contain around 1000 records.
[{
"emailId": "abc1@abc1.com",
"subscriptionId": "A1"
}, {
"emailId": "abc2@abc2.com",
"subscriptionId": "A2"
}, {
"emailId": "abc3@abc3.com",
"subscriptionId": "A3"
}]
I don't want to pick each record from the array above, search the table, and insert it if it is not found, because this table is going to get huge. Is there any other way to do that? I am using this with NodeJS. I cannot change the JSON array, but I can make changes to the DynamoDB table. Any suggestions?
The BatchWriteItem API (batchWrite on the DocumentClient) can put multiple items in a single batch. A batch can contain at most 25 requests.
The best part is that if the key in a PutRequest is already present in the table, the existing item is replaced rather than an error or exception being thrown (i.e. key is not unique).
The disadvantage of this approach is that the latest update will overwrite all the attributes of the existing item in the table. For example, if the existing item in the table has 5 attributes and the latest update has only 3 attributes, the table will have only 3 attributes (as present in the latest PutRequest) after the latest batch execution.
var docClient = new AWS.DynamoDB.DocumentClient();
var params = {
RequestItems: {
"subscription": [
{
PutRequest: {
Item: {
"emailId": "abc1@abc1.com",
"subscriptionId": "A1"
}
}
},
{
PutRequest: {
Item: {
"emailId": "abc2@abc2.com",
"subscriptionId": "A2"
}
}
},
{
PutRequest: {
Item: {
"emailId": "abc3@abc3.com",
"subscriptionId": "A3"
}
}
}
]
}
};
docClient.batchWrite(params, function (err, data) {
if (err) {
console.error("Unable to write item. Error JSON:", JSON.stringify(err,
null, 2));
} else {
console.log("Write Item succeeded:", JSON.stringify(data, null, 2));
}
});
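Since a batch is limited to 25 requests and the question mentions around 1000 records, the array has to be split before calling batchWrite. Below is a minimal sketch of that splitting step only. The helper names chunk and toBatchParams and the generated sample records are assumptions for illustration, and no AWS call is made here.

```javascript
// Split an array into consecutive slices of at most `size` elements.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Turn the API records into one batchWrite params object per 25 records.
function toBatchParams(tableName, records) {
  return chunk(records, 25).map(function (batch) {
    const requestItems = {};
    requestItems[tableName] = batch.map(function (record) {
      return { PutRequest: { Item: record } };
    });
    return { RequestItems: requestItems };
  });
}

// Hypothetical sample input shaped like the third-party response.
const records = [];
for (let i = 1; i <= 60; i++) {
  records.push({ emailId: 'abc' + i + '@abc' + i + '.com', subscriptionId: 'A' + i });
}

const batches = toBatchParams('subscription', records);
console.log(batches.length); // 3
console.log(batches[2].RequestItems.subscription.length); // 10
```

Each element of batches can then be passed to docClient.batchWrite as in the code above. The response may contain UnprocessedItems, which should be retried.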

MongoDB find() to return the sub document when a (field,value) is matched

This is a single collection that holds 2 JSON documents. I am searching for a particular field: value pair in an object, and when it matches, the entire sub document must be returned (that is, only the matching one of the 2 sub documents below). Thanks in advance.
{
"clinical_study": {
"#rank": "379",
"#comment": [],
"required_header": {
"download_date": "ClinicalTrials.gov processed this data on March 18, 2015",
"link_text": "Link to the current ClinicalTrials.gov record.",
"url": "http://clinicaltrials.gov/show/NCT00000738"
},
"id_info": {
"org_study_id": "ACTG 162",
"secondary_id": "11137",
"nct_id": "NCT00000738"
},
"brief_title": "Randomized, Double-Blind, Placebo-Controlled Trial of Nimodipine for the Neurological Manifestations of HIV-1",
"official_title": "Randomized, Double-Blind, Placebo-Controlled Trial of Nimodipine for the Neurological Manifestations of HIV-1",
}
{
"clinical_study": {
"#rank": "381",
"#comment": [],
"required_header": {
"download_date": "ClinicalTrials.gov processed this data on March 18, 2015",
"link_text": "Link to the current ClinicalTrials.gov record.",
"url": "http://clinicaltrials.gov/show/NCT00001292"
},
"id_info": {
"org_study_id": "920106",
"secondary_id": "92-C-0106",
"nct_id": "NCT00001292"
},
"brief_title": "Study of Scaling Disorders and Other Inherited Skin Diseases",
"official_title": "Clinical and Genetic Studies of the Scaling Disorders and Other Selected Genodermatoses",
}
Your example documents are malformed - right now both clinical_study keys are part of the same object, and that object is missing a closing }. I assume you want them to be two separate documents, although you call them subdocuments. It doesn't make sense to have them be subdocuments of a document if they are both named under the same key. You cannot save the document that way, and in the mongo shell it will silently replace the first instance of the key with the second:
> var x = { "a" : 1, "a" : 2 }
> x
{ "a" : 2 }
If you just want to return the clinical_study part of the document when you match on clinical_study.#rank, use projection:
db.test.find({ "clinical_study.#rank" : "379" }, { "clinical_study" : 1, "_id" : 0 })
If instead you meant for the clinical_study documents to be elements of an array inside a larger document, then use $. Here, clinical_study is now the name of an array field which has as its elements the two values of the clinical_study key in your non-documents:
db.test.find({ "clinical_study.#rank" : "379" }, { "_id" : 0, "clinical_study.$" : 1 })

Can you GET Rally API requirements, defects, and all tasks with one query

Currently I have to make multiple GETs to receive all the information which I need
User Story: FormattedID, _refObjectName, State, Owner._refObjectName
Tasks for each User Story: FormattedID, _refObjectName, State, Owner._refObjectName
Defect: FormattedID, _refObjectName, State, Owner._refObjectName
Tasks for each Defect: FormattedID, _refObjectName, State, Owner._refObjectName
For all of the User Stories I use:
https://rally1.rallydev.com/slm/webservice/1.26/hierarchicalrequirement.js?query=((Project.Name = "[projectName]") and (Iteration.Name = "[iterationName]"))&fetch=true&start=1&pagesize=100
For all of the Defects I use:
https://rally1.rallydev.com/slm/webservice/1.26/defects.js?query=((Project.Name = "[projectName]") and (Iteration.Name = "[iterationName]"))&fetch=true&start=1&pagesize=100
Within each of these, if they have any Tasks, they display as:
{
"_rallyAPIMajor": "1",
"_rallyAPIMinor": "26",
"_ref": "https://rally1.rallydev.com/slm/webservice/1.26/task/9872916743.js",
"_refObjectName": "Update XYZ when ABC",
"_type": "Task"
}
This doesn't have all the information I need, so I hit each of the Tasks' _ref URLs to get the full task information.
This adds up to sometimes 80+ AJAX calls per page load.
Is there a better query which would provide the extra Task information up front?
The fetch parameter can be tricky with queries. If you provide fetch=true you will get all of the fields that exist on the queried type (Story, Defect). If a field is itself a domain object (like a task or a defect) you will only get the thin ref object, like this:
{
"_ref": "/task/1234.js"
}
If you want to get fields populated on the sub-objects you will need to specify the fields you want shown in the fetch param fetch=Name,FormattedID,Tasks. This would return an object like the one below:
{
"HierarchicalRequirement" : {
"Name" : "StoryName",
"FormattedID" : "S1234",
"Tasks" : [
{
"_rallyAPIMajor": "1",
"_rallyAPIMinor": "26",
"_ref": "https://rally1.rallydev.com/slm/webservice/1.26/task/9872916743.js",
"_refObjectName": "Update XYZ when ABC",
"_type": "Task",
"FormattedID" : "T1",
"Name" : "Update XYZ when ABC"
}
]
}
}
Let me know if that helped
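For reference, a small sketch of building such a request URL with an explicit fetch list. The base URL and query shape are taken from the question. The function name buildRallyUrl, the sample project and iteration names, and the exact field list are assumptions, so adjust them to the fields you actually need.

```javascript
// Build a Rally query URL with an explicit list of fields to fetch.
function buildRallyUrl(type, projectName, iterationName, fetchFields) {
  const base = 'https://rally1.rallydev.com/slm/webservice/1.26/' + type + '.js';
  const query = '((Project.Name = "' + projectName + '") and ' +
    '(Iteration.Name = "' + iterationName + '"))';
  return base +
    '?query=' + encodeURIComponent(query) +
    // Field names are joined with plain commas, as in fetch=Name,FormattedID,Tasks.
    '&fetch=' + fetchFields.map(encodeURIComponent).join(',') +
    '&start=1&pagesize=100';
}

const storyUrl = buildRallyUrl(
  'hierarchicalrequirement',
  'My Project',   // hypothetical project name
  'Sprint 1',     // hypothetical iteration name
  ['Name', 'FormattedID', 'State', 'Owner', 'Tasks']
);
console.log(storyUrl);
```

Including Tasks in the fetch list is what asks the server to populate the listed fields on the nested task objects, which avoids the extra request per task.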