Using JSON-based Database for unordered data

I am working on a simple app for Android. I am having some trouble using the Firebase database since it uses JSON objects and I am used to relational databases.
My data will consist of two users that share a value. In a relational database this would be represented in a table like this:
**uname1** **uname2** shared_value
In which the usernames are the keys. If I wanted all the values user Bob shares with other users, I could do a simple union statement that would return the rows where:
uname1 == Bob or uname2 == Bob
However, in JSON databases, there seems to be a tree-like hierarchy in the data, which is complicated since I would not be able to search for users at the top level. I am looking for help in how to do this or how to structure my database for best efficiency if my most common search will be one similar to the one above.
In case this is not enough information, I will elaborate: My database would be structured like this:
{
  "Bob": {
    "Alice": {
      "shared_value": 2
    }
  },
  "Cece": {
    "Bob": {
      "shared_value": 4
    }
  }
}
As you can see from the example, Bob is included in two relationships, but looking into Bob's node doesn't show that information. (The relationship is commutative, so who is "first" cannot be predicted.)
The most intuitive way to fix this would be to duplicate all data. For example, when we add Bob->Alice->2, also add Alice->Bob->2. In my experience with relational databases, duplication can be a big problem, which is why I haven't done this already. Also, duplication seems like an inefficient fix.

Is there a reason why you don't invert this? How about a collection like:
{ "_id": 2, "usernames":[ "Bob", "Alice"]}
{ "_id": 4, "usernames":[ "Bob", "Cece"]}
If you need all the values for "Bob", then index on "usernames".
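As a rough illustration of the lookup against this inverted layout, the sketch below filters a plain in-memory array rather than a real database. The helper name `valuesSharedBy` and the `shares` array are made up for the example.

```javascript
// In-memory stand-in for the collection shown above (not a real database call).
const shares = [
  { _id: 2, usernames: ["Bob", "Alice"] },
  { _id: 4, usernames: ["Bob", "Cece"] },
];

// Hypothetical helper: return every shared value that involves `user`,
// together with the other participant in that relationship.
function valuesSharedBy(docs, user) {
  return docs
    .filter((doc) => doc.usernames.includes(user))
    .map((doc) => ({
      other: doc.usernames.find((name) => name !== user),
      value: doc._id,
    }));
}

console.log(valuesSharedBy(shares, "Bob"));
```

In MongoDB the same membership test would be a query of the form find({ "usernames": "Bob" }), which is what an index on "usernames" speeds up.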
EDIT:
If you need the two usernames to be a unique key, then do something like this:
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 }
But this would still permit the creation of:
{ "_id": {"uname1":"Alice", "uname2":"Bob"}, "value": 78 }
(This issue is also present in your as-is relational model, btw. How do you handle it there?)
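One common way around the (Bob, Alice) versus (Alice, Bob) problem, in either model, is to put the pair into a canonical order before building the key. This is only a sketch; `pairKey` is a hypothetical helper, and it assumes usernames are plain strings that can be compared directly.

```javascript
// Hypothetical helper: build an order-independent key for a pair of users
// by sorting the two names before assigning them to uname1/uname2.
function pairKey(userA, userB) {
  const [first, second] = [userA, userB].sort();
  return { uname1: first, uname2: second };
}

// Both argument orders produce the same key, so only one document can exist.
console.log(pairKey("Bob", "Alice"));
console.log(pairKey("Alice", "Bob"));
```

With this in place, a write for Alice and Bob always targets the same "_id", whichever user is named first.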
In general, I think implementing an array by creating multiple columns with names like "attr1", "attr2", "attr3", etc. and then having to search them all for a possible value is an artifact of relational table modeling, which does not support array values. If you are converting to a document-oriented storage, these really should be an embedded list of values, and you should use the document paradigm and model them as such, instead of just reimplementing your table rows as documents.

You can still have old structure:
[
{ username: 'Bob', username2: 'Alice', value: 2 },
{ username: 'Cece', username2: 'Bob', value: 4 },
]
You may want to create indexes on 'username' and 'username2' for performance. And then just do the same union.

To create a tree-like structure, the best way is to create an "ancestors" array that stores all the ancestors of a particular entry. That way you can query for either ancestors or descendants and all documents that are related to a particular value in the tree. Using your example, you would be able to search for all descendants of Bob's, or any of his ancestors (and related documents).
The answer above suggests:
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 }
That is correct. But you don't get to see the relationship between Bob and Cece with this design. My suggestion, which follows a MongoDB modeling pattern, is to store ancestor keys in an "ancestors" array.
{ "_id": {"uname1":"Bob", "uname2":"Alice"}, "value": 2 , "ancestors": [{uname: "Cece"}]}
With this design you still get duplicates, which is something that you do not want. I would design it like this:
{"username": "Bob", "ancestors": [{"username": "Cece", "shared_value": 4}]}
{"username": "Alice", "ancestors": [{"username": "Bob", "shared_value": 2}, {"username": "Cece"}]}

Related

Bind JSON to build a dynamic form in AngularJs for One-To-Many relationship and having parent-child relations in both the tables

I have two database tables with a one-to-many relationship between them and the parent-child relationship within each of the tables. Here reference table (one side) works as a master table and the reference_copies table (many side) works as replicas of the master.
I want to create a UI form in AngularJS to provide insert/update functionality to the user. As shown in UI_image, the user can add as many levels as he/she wants. I have also attached an image of the database table structure.
In the reference_copies table, data may already exist because we also upload it through Excel files. Here, the name and type_id columns together form a unique constraint. So when the user tries to add a level, I need to check whether the name already exists for that type. If it exists, fetch the object; otherwise save it and fetch the saved object (with the created id). The value and selected type will be the same for all levels, so the user needs to select them only once.
On the final submit, each level of the master form will be mapped to the corresponding level of each reference copy, i.e. the parent name of the master will be mapped to the parent names of reference_copies 1, 2, and so on. Likewise, level1 of the master will be mapped to level1 of reference_copies 1 and 2, and so on. If there is no corresponding level in either form, nothing happens and that level is not mapped to anything. The forms are not required to have the same number of levels. As shown in the example, the master form has two levels, the reference copy 1 form has only one level, and the reference copy 2 form has three levels.
On the final Submit button, I want to build the JSON payload as below. Also, when I get the response JSON in the format below with IDs, the form should be filled in as shown above for the update.
{
  "name": "Reference Name",
  "childs": [
    {
      "name": "child level1 name",
      "childs": [
        {
          "name": "childlevel2 name",
          "childs": [],
          "referenceCopies": [
            {
              "id": 2004
            }
          ]
        }
      ],
      "referenceCopies": [
        {
          "id": 2001
        },
        {
          "id": 2003
        }
      ]
    }
  ],
  "referenceCopies": [
    {
      "id": 2000
    },
    {
      "id": 2002
    }
  ]
}
I tried using a recursive template in AngularJS to achieve this, but it's not working. Can anyone provide a demo or a suggestion for this kind of requirement?
Please let me know if the above description is incomplete or unclear.
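Independently of the AngularJS template, the level-by-level mapping described above can be sketched as a small recursive function. Everything here is an assumption made for illustration: `buildPayload` is a hypothetical name, each form is simplified to a single chain of levels (top level first), and each reference copy is represented only by the ids of its already-saved levels.

```javascript
// Hypothetical inputs: masterNames[i] is the master's name at depth i;
// each entry of `copies` lists one reference copy's saved ids by depth.
function buildPayload(masterNames, copies, depth = 0) {
  if (depth >= masterNames.length) return null;
  const child = buildPayload(masterNames, copies, depth + 1);
  return {
    name: masterNames[depth],
    childs: child ? [child] : [],
    // Map only the copies that actually have a level at this depth.
    referenceCopies: copies
      .filter((copy) => depth < copy.length)
      .map((copy) => ({ id: copy[depth] })),
  };
}

const payload = buildPayload(
  ["Reference Name", "child level1 name", "childlevel2 name"],
  [
    [2000, 2001],       // reference copy 1: parent + one level
    [2002, 2003, 2004], // reference copy 2: parent + two mapped levels
  ]
);
console.log(JSON.stringify(payload, null, 2));
```

Levels that exist in a copy but not in the master are simply never visited, which matches the "nothing happens" rule for unmatched levels.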

Confusion about Couchbase keys and indexes

I have imported a dataset into Couchbase that looks like so:
{
  "CLUSTER": "M1M",
  "CLUSTER_NAME": "MARTIN MARIETTA",
  "PRIMARY": "",
  "SET_NUM": "10000163",
  "SHORTENED_NAME": "MARTIN MARIETTA MATERIALS",
  "TYPE": "SET",
  "_class": "com.company.aad.xref.model.ClusterCodeXref"
}
I had to provide a key-generation strategy, and I made the strategy what I ultimately want my index to look like, %SET_NUM%::%TYPE%. So I have a couple of questions:
Does the key-generation automatically create a field called ID with those 2 elements, or do I need to create an ID column in my CSV dataset?
How can I create an index out of those 2 fields? I understand how to use the CREATE INDEX command with composite fields, but will that index look like the key generated by %SET_NUM%::%TYPE%? I need them to be the same, with the :: in the middle.
I hope my question is clear! Would appreciate any help.
In Couchbase, the ID/key of a document is not actually in the document itself. If you use the --generate-key template, your document would look something like:
key = "10000163::SET"
{
  "CLUSTER": "M1M",
  "CLUSTER_NAME": "MARTIN MARIETTA",
  "PRIMARY": "",
  "SET_NUM": "10000163",
  "SHORTENED_NAME": "MARTIN MARIETTA MATERIALS",
  "TYPE": "SET",
  "_class": "com.company.aad.xref.model.ClusterCodeXref"
}
There is no designated "id" field in Couchbase. You can certainly create an id field, but it will be just like any other field.
As for an index, it depends on what kind of query you want to run. You can CREATE INDEX idx_setnumandtype ON bucketname (SET_NUM, TYPE) as you mentioned. This is going to be a useful index for queries like: SELECT b.* FROM bucketname b WHERE b.SET_NUM = 'foo' AND b.TYPE = 'bar';
But, if you know those two values and just need to do a lookup of a single document, you don't necessarily need to create an index or use N1QL. You can simply do a key/value GET operation. In Java for instance: bucket.get("10000163::SET")
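To do that key/value lookup, the application has to rebuild the same key string the import produced. The following is a minimal sketch of expanding a %FIELD% style template in application code. `buildKey` is a hypothetical helper, not part of any Couchbase SDK, and it is not how the import tool itself is implemented.

```javascript
// Hypothetical helper: expand a key template such as "%SET_NUM%::%TYPE%"
// by substituting each %FIELD% placeholder with the document's value.
function buildKey(template, doc) {
  return template.replace(/%([^%]+)%/g, (match, field) => {
    if (!(field in doc)) {
      throw new Error(`Missing field for key template: ${field}`);
    }
    return String(doc[field]);
  });
}

const doc = { SET_NUM: "10000163", TYPE: "SET" };
console.log(buildKey("%SET_NUM%::%TYPE%", doc)); // 10000163::SET
```

The resulting string is what you would pass to the GET call, so the "::" separator only has to match between the import template and this helper; it has nothing to do with the secondary index.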

How to avoid Keys with Duplicate Values in Couchbase.Lite

Is it possible to tell CB.Lite to reject documents that repeat a value already stored under a certain key?
For instance, if I have the following document already in CB.Lite:
{
  "Dog": {
    "Name": "Dug",
    "Color": "Blue",
    "Age": 2
  }
}
Is it possible to tell CB.Lite to reject any document that repeats the value of the key "Name", so that if I try to add the following one:
{
  "Dog": {
    "Name": "Dug",
    "Color": "Green",
    "Age": 5
  }
}
it would reject it?
I know it would not be much hassle to implement this functionality myself, but I was wondering if CB.Lite already has something out of the box.
Currently not at commit time (this is as of 1.4.x). The closest you could get, with Couchbase doing most of the work, would be to create a View emitting the value you don't want repeated, then query it and do the enforcement yourself.
This is assuming the docs themselves have different IDs. If you had what you showed using the same document ID, there are other possibilities. For example, you could trap this and reject it in Sync Gateway.
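The "query, then enforce it yourself" step can be sketched in plain JavaScript. This is not Couchbase Lite code: `UniqueNameStore` is a hypothetical in-memory stand-in, where the `names` set plays the role of the keys a View would emit.

```javascript
// Hypothetical in-memory stand-in for "query the view, then decide".
class UniqueNameStore {
  constructor() {
    this.docs = [];
    this.names = new Set(); // plays the role of the view's emitted keys
  }

  // Returns true if the document was saved, false if it was rejected
  // because another document already uses the same Dog.Name.
  save(doc) {
    const name = doc.Dog.Name;
    if (this.names.has(name)) return false;
    this.names.add(name);
    this.docs.push(doc);
    return true;
  }
}

const store = new UniqueNameStore();
console.log(store.save({ Dog: { Name: "Dug", Color: "Blue", Age: 2 } }));  // true
console.log(store.save({ Dog: { Name: "Dug", Color: "Green", Age: 5 } })); // false
```

In a real app the check and the save are two separate steps against the database, so concurrent writers could still slip a duplicate through unless the writes are serialized.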

Update nested fields in Mongodb

I have a JSON document for a vendor:
{
  "id": 1,
  "contact": {
    "address": "abc",
    "phone": "123456"
  }
}
If the update is {"contact": {"address":"xyz"}}, the address should be updated to xyz, and phone is still there, i.e. not deleted.
I know $set and dot notation (https://docs.mongodb.org/manual/reference/operator/update/set/), for example, {$set: {"contact.address":"xyz"}}, can do this.
However, what I am trying to do is to come up with a generic solution, in the sense that it can be applied to models with a nesting depth larger than 2. In other words, given the update in JSON form, the solution should ONLY update the fields specified in the update and leave other fields intact.
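One possible approach, sketched here under some assumptions, is to flatten the nested update into dot-notation paths and pass the result to $set. `toDotNotation` is a hypothetical helper; it assumes arrays and empty objects in the update are meant to replace the stored value wholesale rather than be merged.

```javascript
// Hypothetical helper: flatten a nested update object into dot-notation
// paths suitable for $set. Only plain, non-empty objects are descended into.
function toDotNotation(update, prefix = "") {
  const flat = {};
  for (const [key, value] of Object.entries(update)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const isPlainObject =
      value !== null &&
      typeof value === "object" &&
      !Array.isArray(value) &&
      Object.getPrototypeOf(value) === Object.prototype;
    if (isPlainObject && Object.keys(value).length > 0) {
      Object.assign(flat, toDotNotation(value, path));
    } else {
      flat[path] = value; // scalars, arrays, dates, empty objects: set as-is
    }
  }
  return flat;
}

const setDoc = { $set: toDotNotation({ contact: { address: "xyz" } }) };
console.log(JSON.stringify(setDoc)); // {"$set":{"contact.address":"xyz"}}
```

Because only the leaf paths end up in $set, sibling fields such as "contact.phone" are left untouched at any depth.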

How to enter multiple table data in mongoDB using json

I am trying to learn mongodb. Suppose there are two tables and they are related. For example like this -
1st table has
First name- Fred, last name- Zhang, age- 20, id- s1234
2nd table has
id- s1234, course- COSC2406, semester- 1
id- s1234, course- COSC1127, semester- 1
id- s1234, course- COSC2110, semester- 1
How do I insert this data into MongoDB? I wrote it like this, but I am not sure whether it is correct:
db.users.insert({
  given_name: 'Fred',
  family_name: 'Zhang',
  Age: 20,
  student_number: 's1234',
  Course: ['COSC2406', 'COSC1127', 'COSC2110'],
  Semester: 1
});
Thank you in advance
This assumes that what you want to model treats "student_number" and "Semester" together as what is basically a unique identifier for the entries. There is also a way to do this without accumulating the array contents in code.
You can make use of the upsert functionality in the .update() method, with the help of a few other operators in the statement.
I am going to assume you are doing this inside a loop of sorts, so everything on the right-hand side is actually a variable:
db.users.update(
  {
    "student_number": student_number,
    "Semester": semester
  },
  {
    "$setOnInsert": {
      "given_name": given_name,
      "family_name": family_name,
      "Age": age
    },
    "$addToSet": { "courses": course }
  },
  { "upsert": true }
)
What an "upsert" operation does is first look for a document in your collection that matches the given query criteria. In this case, a "student_number" with the current "Semester" value.
When that match is found, the document is merely "updated". So what is being done here is using the $addToSet operator in order to "update" only unique values into the "courses" array element. This would seem to make sense to have unique courses but if that is not your case then of course you can simply use the $push operator instead. So that is the operation you want to happen every time, whether the document was "matched" or not.
In the case where no "matching" document is found, a new document will then be inserted into the collection. This is where the $setOnInsert operator comes in.
So the point of that section is that it will only be called when a new document is created as there is no need to update those fields with the same information every time. In addition to this, the fields you specified in the query criteria have explicit values, so the behavior of the "upsert" is to automatically create those fields with those values in the newly created document.
After a new document is created, then the next "upsert" statement that uses the same criteria will of course only "update" the now existing document, and as such only your new course information would be added.
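To make the sequence concrete, here is an in-memory sketch of what the upsert does for each source row. It only imitates the behavior described above on a plain array. `upsertCourse`, the `users` array, and the row shape are made up for illustration and are not MongoDB APIs.

```javascript
// In-memory imitation of the upsert: match on student_number + Semester,
// insert with the $setOnInsert fields if missing, then $addToSet the course.
function upsertCourse(collection, row) {
  let doc = collection.find(
    (d) => d.student_number === row.student_number && d.Semester === row.semester
  );
  if (!doc) {
    // No match: a new document gets the query fields plus the
    // $setOnInsert fields. This branch runs once per student/semester.
    doc = {
      student_number: row.student_number,
      Semester: row.semester,
      given_name: row.given_name,
      family_name: row.family_name,
      Age: row.age,
      courses: [],
    };
    collection.push(doc);
  }
  // $addToSet: applied on insert and on update, skipping existing values.
  if (!doc.courses.includes(row.course)) {
    doc.courses.push(row.course);
  }
  return doc;
}

const users = [];
const fred = {
  student_number: "s1234", semester: 1,
  given_name: "Fred", family_name: "Zhang", age: 20,
};
for (const course of ["COSC2406", "COSC1127", "COSC2110", "COSC2406"]) {
  upsertCourse(users, { ...fred, course });
}
console.log(users.length, users[0].courses);
```

Four rows go in, including one repeated course, and a single document with three distinct courses comes out.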
Overall working like this allows you to "pre-join" the two tables from your source with an appropriate query. Then you are just looping the results without needing to write code for trying to group the correct entries together and simply letting MongoDB do the accumulation work for you.
Of course you can always just write the code to do this yourself and it would result in fewer "trips" to the database in order to insert your already accumulated records if that would suit your needs.
As a final note, though it does require some additional complexity, you can get better performance out of the operation shown by using the newly introduced "batch updates" functionality. For this your MongoDB server version will need to be 2.6 or higher. But that is one way of still reducing the logic while maintaining fewer actual "over the wire" writes to the database.
You can either have two separate collections, one with student details and the other with courses, and link them with "id".
Alternatively, you can have a single document with the courses as an embedded array, as below:
{
  "FirstName": "Fred",
  "LastName": "Zhang",
  "age": 20,
  "id": "s1234",
  "Courses": [
    {
      "courseId": "COSC2406",
      "semester": 1
    },
    {
      "courseId": "COSC1127",
      "semester": 1
    },
    {
      "courseId": "COSC2110",
      "semester": 1
    },
    {
      "courseId": "COSC2110",
      "semester": 2
    }
  ]
}
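If the data starts out as two tables, the embedded form above can be built by grouping the course rows under their student. A minimal sketch follows, assuming the rows are already loaded as plain objects; `embedCourses` and the field names of the input rows are hypothetical.

```javascript
// Hypothetical helper: attach each student's course rows as an embedded array.
function embedCourses(students, enrolments) {
  return students.map((student) => ({
    ...student,
    Courses: enrolments
      .filter((row) => row.id === student.id)
      .map((row) => ({ courseId: row.course, semester: row.semester })),
  }));
}

const students = [{ FirstName: "Fred", LastName: "Zhang", age: 20, id: "s1234" }];
const enrolments = [
  { id: "s1234", course: "COSC2406", semester: 1 },
  { id: "s1234", course: "COSC1127", semester: 1 },
  { id: "s1234", course: "COSC2110", semester: 1 },
];
console.log(JSON.stringify(embedCourses(students, enrolments), null, 2));
```

Each resulting object can then be inserted as one document, which keeps a student's courses together at the cost of rewriting the document when enrolments change.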