Proper JSON format in NoSQL

I would like to keep a DB of nested groups (for example, a company hierarchy: a manager has managers who manage other managers who manage employees).
How should one represent this structure as JSON?
Should the name of each manager be a key and the managers below them be an object? (Assume each manager has a unique name.)
{
"manager1": {
"sub_manager1": {
...
},
"sub_manager2": {
...
}
}
}
Or should the JSON consist of "recursive objects", i.e., a key-value object where the key is an identifier and the value is an array of the same kind of key-value objects?
In this case, the two keys would be called "name" and "employees".
{
"name": "manager1",
"employees": [
{
"name": "sub_manager1",
"employees": [ ... ]
},
{
"name": "sub_manager2",
"employees": [ ... ]
}
]
}
In the first example, each manager has a unique key (better search performance?).
In the second example, all objects have the same keys (easier looping?).

In my view, you should use the second approach.
Benefits:
It is more extensible: you can add more data to the manager entity later if needed.
Looping is easy because every value sits under a known field name.
It is closer to the real world, where manager names may or may not be unique.
You will not lose search performance: you still have the key "name", and its values are still unique by your own assumption. Even if they are not unique, NoSQL databases that partition data by key range store a contiguous range of values on the same node.
When you ask for details about the manager or managers named "xyz", the search proceeds as follows:
You hit the API.
A node receives the request.
The request is forwarded to the node or nodes that hold the range "xyz" belongs to.
Only the data on those nodes is scanned, and the matching records are returned.
Also, as I see it, the first approach creates as many keys as there are managers. Given the limited number of nodes, a single node is scanned when you look up "xyz" as a key.
You will get better performance with approach 2 if you search for "xyz" and "xyz1" in the same query: because the string values are close to each other, they will most likely be on the same node. With the first approach, there is less chance of finding them on the same node, because they are entirely different keys and are not treated as neighbors.
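To make the "easy looping" point concrete, here is a minimal Python sketch of walking the second ("name"/"employees") structure. The sample data, including the hypothetical "employee1" entry, and the helper names `find_by_name` and `all_names` are assumptions made for illustration; they are not part of the question.

```python
from typing import Optional

# Hypothetical sample data in the "name"/"employees" shape from the question.
org = {
    "name": "manager1",
    "employees": [
        {
            "name": "sub_manager1",
            "employees": [{"name": "employee1", "employees": []}],
        },
        {"name": "sub_manager2", "employees": []},
    ],
}


def find_by_name(node: dict, name: str) -> Optional[dict]:
    """Depth-first search for the first node whose "name" matches."""
    if node["name"] == name:
        return node
    for child in node.get("employees", []):
        found = find_by_name(child, name)
        if found is not None:
            return found
    return None


def all_names(node: dict) -> list:
    """Collect every name in the hierarchy, parents before children."""
    names = [node["name"]]
    for child in node.get("employees", []):
        names.extend(all_names(child))
    return names
```

Because every level has the same two fields, one recursive function handles any depth, and adding a field such as a job title later would not change the traversal.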

Related

JSON object with id as primary key schema design

I want to use an ID as the primary key in a JSON object. This way all users in the list are unique.
Like so:
{
"user": [{
"id": 1,
"name": "bob"
}]
}
In an application, I have to search for the id in all elements of the list 'user'.
But I can also use the ID as an index to get easier access to a specific user.
Like so:
{
"user": {
"1": {
"name": "bob"
}
}
}
In an application, I can now simply write user["1"] to get the correct user.
What should I use? Are there any disadvantages to the second option? I'm sure there is a best practice.
It depends on what you want your objects to look like, how much processing you want to do on them, and how much data you have.
When dealing with web data you will often see the first format. Because the data is an array, you have to iterate through the records to find a matching id, which gets expensive when there is a lot of data. Often, though, that query is handled by the underlying data store, which may already be indexed (e.g. if it is a database), so this may not be an issue. This format is clean and binds easily.
Your second option works best when you need efficient lookups: a dictionary of key-value pairs allows significantly faster lookups in large datasets. A numeric key (even though you are forcing it to be a string) is not supported by all libraries. You can prefix the id with a letter, though, and add the prefix when doing a lookup. I have used k in this example, but you can choose a prefix that makes sense for your data. I use this format when storing objects as the JSON binary data type in databases.
{
"user": {
"k1": {
"name": "bob"
}
}
}
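The two layouts can be converted into each other, so the choice mostly affects how lookups are written. The following Python sketch builds the prefixed dictionary from the array form and compares the two lookups. The second user ("alice", id 3) and the helper names `index_by_id` and `find_in_list` are hypothetical, added only for illustration.

```python
# Array form from the question, with one hypothetical extra user.
users_list = {"user": [{"id": 1, "name": "bob"}, {"id": 3, "name": "alice"}]}


def index_by_id(users, prefix="k"):
    """Build a dict keyed by prefix + id; the id itself moves into the key."""
    return {
        f"{prefix}{user['id']}": {k: v for k, v in user.items() if k != "id"}
        for user in users
    }


def find_in_list(users, user_id):
    """Linear scan over the array form; returns None when nothing matches."""
    return next((user for user in users if user["id"] == user_id), None)


# Dictionary form: a lookup is a single key access instead of a scan.
users_by_key = {"user": index_by_id(users_list["user"])}
bob = users_by_key["user"]["k1"]
alice = find_in_list(users_list["user"], 3)
```

The scan costs time proportional to the number of users, while the keyed access does not, which is the trade-off described above.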

Return a field as object or as primitive type in JSON in a REST API?

Currently I'm working on a REST API with an object that has a status. Should I return the status as a string or as an object?
When is it smart to change a field from a primitive type to an object?
[
{
"id": 1,
"name": "Hello",
"status": "active"
},
{
"id": 1,
"name": "Hello",
"status": {
"id": 0,
"name": "active"
}
}
]
In terms of extensibility, I would suggest going for an object.
Using an object also lets you split the responsibilities of identifying a status (e.g. via an id field) and describing it (e.g. via a name or description field).
If i18n might become necessary, the object would also have to carry a string identifier.
None of this is possible with a simple primitive. Conclusion: go for an object.
Other interesting remarks are given here.
It depends on what you need to pass.
If you only want to distinguish between different states and have all other related information (strings, translations, images) on the client either way, you might only want to send a simple integer value and use an enum on the client side. This reduces the data to the smallest amount.
If you have data that changes within one status on the server side, you need an object to pass everything else.
But best practice here would be to reduce data as much as possible.
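As a rough illustration of the "integer plus client-side enum" option, here is a Python sketch of a client that maps the status to an enum and keeps the display strings locally. The `Status` enum, the `INACTIVE` member, the `LABELS` table and `parse_status` are all assumptions for the example; only the value 0 for "active" comes from the question.

```python
import json
from enum import IntEnum


class Status(IntEnum):
    ACTIVE = 0    # matches "id": 0 / "name": "active" in the question
    INACTIVE = 1  # hypothetical second state, for illustration only


# Display strings live on the client, so the API only has to send an integer.
LABELS = {Status.ACTIVE: "active", Status.INACTIVE: "inactive"}


def parse_status(raw):
    """Accept either a bare integer or an object that carries an "id" field."""
    if isinstance(raw, dict):
        return Status(raw["id"])
    return Status(raw)


payload = json.loads(
    '[{"id": 1, "name": "Hello", "status": 0},'
    ' {"id": 1, "name": "Hello", "status": {"id": 0, "name": "active"}}]'
)
statuses = [parse_status(item["status"]) for item in payload]
```

Accepting both shapes in `parse_status` is one way to keep a client working if the API later switches from the primitive to the object form.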

Store multiple authors in a Couchbase database

I am new to Couchbase Server. I want to store 10 author names in Couchbase, one after another. Should the structure be a single "author" document holding multiple values, like this:
{ "id": 1, "name": "Author 1" }, { "id": 2, "name": "Author 2" }
or should I store Author 1 in one document and Author 2 in another?
If the latter, how can I increment the id automatically before the insert command?
You can store all authors in a single document (only the first two are shown here; the rest follow the same pattern):
{
"doctype": "Authors",
"AuthorNames": [
{
"id": 1,
"Name": "author1"
},
{
"id": 2,
"Name": "author2"
}
]
}
If you want the ID to increase, one option is to store each author name in its own new document, but then the ID will be randomly generated and will not be in incremental order.
In Couchbase, think more about how your application will use the data than about how you want to store it. For example, will your application need to get all 10 authors every time? If so, one document might be worthwhile. Perhaps your application only ever reads or writes one author at a time. Then you might want to put each author in its own document, with an object key pattern that lets you get the object really fast. Objects that are used often are kept in the managed cache; objects that are not used often may fall out of the managed cache, and that is OK.
The other factor is the ratio of reads to writes on this data.
So, as I said, it depends on how your application will be reading and writing your data. Use that as the guidance for how your data should be stored.
The single JSON document is pretty straightforward. The more advanced schema design, where each author is in its own document and you access them by object key, might be a bit more complicated, but it is ultimately faster and more scalable, depending on the factors I already pointed out. I will lay out an example schema and some possibilities.
For the authors, I might create each author JSON document with an object key like this:
authors::ID
Here ID is a value I keep in a special incrementer object that I will call authors::incrementer. Think of that object as a key-value pair holding only an integer that happens to be the upper bound of an array. Couchbase SDKs include a special function to increment just such an integer object. With this, my application can put together that object key very quickly. If I want the 5th author, I do a read by object key for "authors::5". If I need 10, I do a parallelized BulkGet and get authors::1 through authors::10. If I want all the authors, I get the incrementer object, read that integer, and then do a parallelized bulk get. This way I can get them in order or in whatever order I feel like, and I am accessing them by object key, which is VERY fast in Couchbase.
All this being said, I could use a view to query this data, or the upcoming "SQL for Documents" in Couchbase 4.0, or I can mix and match between querying and getting objects by their key. Key access will ALWAYS be faster. It is the difference between asking a question and then going to get the object, and simply knowing the answer and getting it immediately.
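The incrementer-plus-key pattern described above can be sketched in Python as follows. The `InMemoryBucket` class is a stand-in written for this example so that it runs anywhere: it is NOT the Couchbase SDK, and its method names (`incr`, `set`, `get`, `get_multi`) are assumptions rather than the real client API.

```python
class InMemoryBucket:
    """Stand-in for a key-value store. This is NOT the Couchbase SDK."""

    def __init__(self):
        self._docs = {}

    def incr(self, key, initial=1):
        """Create the counter at `initial` if missing, otherwise add 1."""
        if key not in self._docs:
            self._docs[key] = initial
        else:
            self._docs[key] += 1
        return self._docs[key]

    def set(self, key, value):
        self._docs[key] = value

    def get(self, key):
        return self._docs[key]

    def get_multi(self, keys):
        """Fetch several documents by key in one call."""
        return {key: self._docs[key] for key in keys}


COUNTER_KEY = "authors::incrementer"


def add_author(bucket, name):
    """Reserve the next ID from the counter and store the author under it."""
    author_id = bucket.incr(COUNTER_KEY)
    key = f"authors::{author_id}"
    bucket.set(key, {"id": author_id, "name": name})
    return key


def all_authors(bucket):
    """Read the upper bound from the counter, then bulk-get every author."""
    upper = bucket.get(COUNTER_KEY)
    keys = [f"authors::{i}" for i in range(1, upper + 1)]
    docs = bucket.get_multi(keys)
    return [docs[key] for key in keys]
```

With a real client, the counter increment would need to be atomic on the server for this to stay correct under concurrent writers, which is what the special increment function mentioned above is for.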

Source of documentation for a standard JSON document structure?

I am working on a (.NET) REST API which returns some JSON data. The consumer of the API is an embedded client. We have been trying to establish the structure of the JSON we will be working with. The format the embedded client wants to use is something I have not seen before in working with JSON. I suggested that it is not "typical" JSON. I was met with the question "Where is 'typical' JSON format documented?"
As an example of JSON I "typically" see:
{
"item" : {
"users": [ ... list of user objects ... ],
"times": [ ... list of time objects ...]
}
}
An example of the non-typical JSON:
{
"item" : [
{
"users": [ ... list of user objects ... ]
},
{
"times": [ ... list of time objects ...]
}
]
}
In the second example, item contains an array of objects, which each contain a property whose value is an array of entities. This is valid JSON. However, I have not encountered another instance of JSON that is structured this way when it is not an arbitrary array of objects but is in fact a set list of properties on the "item" object.
In searching json.org, stackoverflow.com and other places on the interwebs I have not found any guidelines on why the structure of JSON follows the "typical" example above rather than the second example.
Can you provide links to documentation that would provide recommendations for one format or the other above?
Not a link, but a straightforward answer: items are either indexed (0, 1, 2, ...) or keyed (users, times). Whatever software you use, you can get at indexed or keyed data equally easily and quickly. That is not true of what you call "non-typical" JSON: to get at the users, I have to iterate through the array and find the one dictionary that has a key "users". But there might be two or more dictionaries with that key, so what am I supposed to do then? If you use JSON Schema, the "non-typical" JSON is also much harder to check. In iOS, in the typical case I write
NSArray *users = itemDict[@"users"];
For the non-typical JSON I have to write
NSArray *users = nil;
for (NSDictionary *dict in itemArray) {
    if (dict[@"users"] != nil) {
        users = dict[@"users"];
    }
}
but that still has no error checking for multiple dicts with the key "users", which is an error that isn't even possible in the first case. So just tell them that what they are asking for is rubbish and creates nothing but unnecessary work. With other software, you will probably have the same problems.
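The same comparison can be sketched outside iOS. This Python version adds the missing check for duplicate "users" entries; the sample values ("u1", "t1") and the function names are placeholders invented for the example.

```python
# Placeholder data in the two shapes from the question.
typical = {"item": {"users": ["u1", "u2"], "times": ["t1"]}}
non_typical = {"item": [{"users": ["u1", "u2"]}, {"times": ["t1"]}]}


def users_from_typical(doc):
    """Keyed access: one lookup, and a duplicate "users" key cannot exist."""
    return doc["item"]["users"]


def users_from_non_typical(doc):
    """Scan the array and insist on exactly one entry with a "users" key."""
    matches = [entry["users"] for entry in doc["item"] if "users" in entry]
    if len(matches) != 1:
        raise ValueError(
            f'expected exactly one "users" entry, found {len(matches)}'
        )
    return matches[0]
```

The second function needs a scan and an explicit error path just to recover what the keyed layout gives with a single lookup.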

json data format in firebase - are arrays supported? And/Or, if only objects are supported, can dictionaries be numbered with integers?

I am tinkering with Firebase and am curious about the data structure. When I browse to my database, Firebase lets me modify its structure and data. But it seems that Firebase only supports objects (and dictionaries for lists).
I want to know if arrays are supported. I would also like to know if dictionary items can be named with integers - the firebase interface only inserts strings as names which makes me concerned about ordering records.
Here is a sample of json created through firebase interface:
{
"dg":{
"users":{
"rein":{
"searches":{
"0":{
"urls":"http://reinpetersen.com,http://www.reinpetersen.com",
"keyphrases":"rein petersen,reinsbrain,programmer turned kitesurfer"
}
}
},
"jacqui":{
"searches":{
"0":{
"urls":"http://www.diving-fiji.com,http://diving-fiji.com",
"keyphrases":"diving marine conservation, diving fiji"
}
}
}
},
"crawl_list":{
"1":{
"urls":"http://www.diving-fiji.com,http://diving-fiji.com",
"keyphrases":"diving marine conservation, diving fiji"
},
"0":{
"urls":"http://reinpetersen.com,http://www.reinpetersen.com",
"keyphrases":"rein petersen,reinsbrain,programmer turned kitesurfer"
}
}
}
}
Obviously, for my lists, I want the dictionary item names to be integers so I can ensure sorting is correct.
You can save arrays into Firebase. For example:
var data = new Firebase(...);
data.set(['x', 'y', 'z']);
JavaScript arrays are essentially just objects with numeric keys. When retrieving data, we automatically detect when a Firebase object has only numeric keys, and we return an array if that is the case.
Note that for storing a list of data to which many people can append, an array is not a good choice, as multiple people writing to the same index in the array can cause conflicts. Instead, we have a "push" function which creates a chronologically-ordered unique ID for your data.
Also, if you're intending to use the array as a way of ordering data, there's a better way to do that using our priorities. See the docs.
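The idea of returning an array when an object has only numeric keys can be illustrated with a small Python sketch. This is a deliberately simplified stand-in, not Firebase's actual detection logic, whose exact rules are not given in this answer: the hypothetical `as_list_if_numeric` helper converts only when the keys are exactly "0" through "n-1".

```python
def as_list_if_numeric(node):
    """Return a list when the keys are exactly "0".."n-1"; else return node.

    Simplified illustration only; the real client may apply different rules.
    """
    if not isinstance(node, dict) or not node:
        return node
    expected = [str(i) for i in range(len(node))]
    if set(node) != set(expected):
        return node
    # Reading the values in index order restores the array ordering.
    return [node[key] for key in expected]


# Shaped like "crawl_list" in the question, where "1" is stored before "0".
crawl_list = {"1": {"urls": "b"}, "0": {"urls": "a"}}
ordered = as_list_if_numeric(crawl_list)
```

Here `ordered` comes back in index order even though the keys were stored out of order, while an object keyed by names such as "rein" is returned unchanged.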
The Firebase docs have a pretty good section on how to order your data: Ordered Data.
Just like JSON fields, Firebase fields can only be named with strings. It sounds like what you're looking for is setWithPriority(), which attaches sortable priority data to your fields, or push(), which is guaranteed to give your fields unique names, ordered chronologically. (More on lists and push() here.)
You can also push() or set() arrays. For example,
new Firebase("http://gamma.firebase.com/MyUser").push(["cakes","bulldozers"]);
results in a tree like you'd expect, with MyUser receiving a uniquely named child who has children "0":"cakes" and "1":"bulldozers".