I have an object model with a parent -> children relation:
[Image: class diagram]
The relation is a composition: a parent can have multiple children, while a child always has exactly one parent and cannot live without it. To persist this data, we serialize it into JSON and store it in a NoSQL document database (MongoDB):
{
"id": 123456,
"Attr1": "I'm a parent",
"children": [
{
"Attr1": "I'm a child"
},
{
"Attr1": "I'm another child"
}
]
}
Now we are trying to map this model to an OData EDM. We map the parent objects as an entity type. Because the children cannot live on their own and are part of the parent, we did not want to model them as an entity type but rather as a complex type.
With OData 2 I am having a hard time, because OData 2 does not seem to know about collections of primitive or complex types. As soon as you need to map a "to many" relation, it seems necessary to define the "many" side of the relation as an entity type in its own right. But when I define the children in my model as an entity type, OData 2 requires these children to be addressable at service root level.
So the canonical URL of the following object
httX://host/serviceroot/Parent(1)/Child(2)
really is something like this:
httX://host/serviceroot/Child(1,2)
See documentation of OData2:
"For OData services conformant with the addressing conventions in this section, the canonical form of an absolute URI identifying a single Entry is formed by adding a single path segment to the service root URI. The path segment is made up of the name of the Collection associated with the Entry followed by the key predicate identifying the Entry within the Collection. For example the URIs httX://services.odata.org/OData/OData.svc/Categories(1)/Products(1) and httX://services.odata.org/OData/OData.svc/Products(1) represent the same Entry, but the canonical URI for the Entry is httX://services.odata.org/OData/OData.svc/Products(1)."
http://www.odata.org/documentation/odata-version-2-0/uri-conventions/
To me, OData 2 seems to fit relational databases, where objects connected by a "to many" relation are mapped to different tables and every object therefore has its own primary key. In this case it is easy to address a child object at service root level (httX://host/serviceroot/Child(48346954), where 48346954 is the DB primary key) because you can fetch that object directly with SQL.
However, OData 2 does not seem to fit document databases, which can store nested object graphs as documents (e.g. JSON). Child objects do not need to be independent documents; they are nested in the parent document. When mapping this to OData and then addressing a concrete child, you would first need to find the document in which this child is embedded and from there traverse the object graph to the child. A direct fetch of an embedded child is not possible.
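The two-step lookup described above (find the parent document, then walk into it) can be sketched in a few lines. This is only an illustration: the `documents` dict is a hypothetical in-memory stand-in for the MongoDB collection, and because the children in the sample JSON carry no key of their own, the sketch assumes a child is identified by its position in the `children` array.

```python
# Hypothetical in-memory stand-in for the document collection, keyed by parent id.
documents = {
    123456: {
        "id": 123456,
        "Attr1": "I'm a parent",
        "children": [
            {"Attr1": "I'm a child"},
            {"Attr1": "I'm another child"},
        ],
    }
}


def find_child(parent_id, child_index):
    """Resolve an embedded child: fetch the parent document, then walk into it."""
    parent = documents.get(parent_id)
    if parent is None:
        return None  # no such parent, so the child cannot exist either
    children = parent.get("children", [])
    # Assumption: the array position acts as the child's key within its parent.
    if 0 <= child_index < len(children):
        return children[child_index]
    return None
```

There is no way to reach the child without going through the parent first, which is why a service-root URL such as Child(1,2) still needs both key parts.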
What do you think, am I right when stating that OData 2 does not really fit a document oriented database?
With OData 4, things change a bit, because OData 4 knows about collections of complex types. So maybe I can model my children as complex types. Then the "to many" relation from a parent to its children becomes a property of the parent whose type is a collection of a complex type. Would this be the recommended way to model dependent, nested child objects?
Is OData 4 better suited to map documents from a document oriented database to a REST interface? Has anyone tried it? Or are there more pitfalls I currently do not see and that make OData 4 not a good choice in my case?
I have a tree model in the database, as shown in the picture.
City node is linked to Region node by IS_A_City_BELONGING_TO
Sector node is linked to Region node by IS_A_SECTOR_BELONGING_TO_THAT_REGION
Sector node is linked to City node by IS_A_SECTOR_BELONGING_TO_THAT_CITY
The hierarchical nested json ideal output is as follows
Indexes
ON :TTL(ttl) ONLINE
ON :City(cityName) ONLINE (for uniqueness constraint)
ON :Region(region) ONLINE (for uniqueness constraint)
ON :Sector(sectorName) ONLINE (for uniqueness constraint)
Constraints
ON ( city:City ) ASSERT city.cityName IS UNIQUE
ON ( region:Region ) ASSERT region.region IS UNIQUE
ON ( sector:Sector ) ASSERT sector.sectorName IS UNIQUE
How can I generate the JSON file from the database using a Cypher query?
THANK YOU very much.
So... Your Hierarchy is kinda hard to read... so I'll focus on the JSON response part. While Neo4j doesn't have Map as a property type, it is valid inside Cypher.
To aggregate results into a map, you can use this format
MATCH (c:City)<--(s:Sector)
RETURN {city_node:c, city_properties:PROPERTIES(c), name:c.name, sectors:COLLECT(s)} as city
Basically {} as varname defines your map, and the contents of {} define the key-value pairs.
And you can merge 2 maps with the + operator like WITH map1 + map2 as mymap. In the case of conflict, the value in the second map takes priority.
If you only want the properties of a node, and not the whole node, you can use the PROPERTIES(c) function instead of passing in the node.
One thing you will quickly notice is that this will not work recursively. In your case the nesting looks fixed at 2 levels deep, so that limitation shouldn't be a problem.
On a side note, if this is meant to scale, you may want to make your Cypher paged (LIMIT+SKIP) to improve response times. (Only return what you need as you need it) On that note, it may be better to aggregate this client side, as you will probably be returning some sectors frequently for each city.
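If you do aggregate on the client side, the grouping itself is small. The following is a rough sketch that assumes the query returns flat rows of (region, city, sector) names; the row shape and the output keys (`region`, `cities`, `cityName`, `sectors`) are made up for illustration and are not taken from your schema.

```python
from collections import defaultdict


def nest_rows(rows):
    """Group flat (region, city, sector) rows into region -> city -> sectors."""
    tree = defaultdict(lambda: defaultdict(list))
    for region, city, sector in rows:
        tree[region][city].append(sector)
    # Convert the nested dicts into the list-of-objects shape a JSON consumer expects.
    return [
        {
            "region": region,
            "cities": [
                {"cityName": city, "sectors": sectors}
                for city, sectors in cities.items()
            ],
        }
        for region, cities in tree.items()
    ]
```

Passing the result to json.dumps then gives the nested JSON document.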
The JSON data structure for jstree is defined at https://github.com/vakata/jstree; here is an example:
[ { "text" : "Root node", "children" : [ "Child node 1", "Child node 2" ] } ]
Notably it says
The children key can be used to add children to the branch, it should be an array
However, later on, in the section "Populating the tree using AJAX and lazy loading nodes", it shows setting children to true to indicate that a node has children that have not been loaded yet:
[{
"id":1,"text":"Root node","children":[
{"id":2,"text":"Child node 1","children":true},
{"id":3,"text":"Child node 2"}
]
}]
So here we see children used both as an array and as a boolean.
I am using jstree as an example because this is where I encountered the issue, but my question is really a general JSON question: is it valid JSON for the same element to have two different types (an array and a boolean)?
Structure-wise, both are valid JSON packets. This is okay, as JSON is less strict than XML (with an XSD or a DTD). As per https://www.w3schools.com/js/js_json_objects.asp:
JSON objects are surrounded by curly braces {}.
JSON objects are written in key/value pairs.
Keys must be strings, and values must be a valid JSON data type (string, number, object, array, boolean or null).
Keys and values are separated by a colon.
Each key/value pair is separated by a comma.
Having said that, if the sender is allowed to send such JSON, the only caveat is that the receiving side has to handle the discrepancy between the two shapes. It makes for an awkward contract, so the receiver may need extra work to manage it, and handling such packets can become tricky.
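As a rough illustration of that extra handling, here is a minimal sketch of a receiver that accepts both shapes of children. The helper name `normalize_children` is made up for this example, and the payload is the lazy-loading sample from the question.

```python
import json


def normalize_children(node):
    """Return (child_nodes, needs_lazy_load) for a jsTree-style node dict."""
    children = node.get("children", [])
    if isinstance(children, bool):
        # Boolean form: no inline children; True means "load them later".
        return [], children
    # Array form: entries may be plain strings or nested node objects.
    nodes = [{"text": c} if isinstance(c, str) else c for c in children]
    return nodes, False


payload = json.loads(
    '[{"id":1,"text":"Root node","children":['
    '{"id":2,"text":"Child node 1","children":true},'
    '{"id":3,"text":"Child node 2"}]}]'
)
root_children, root_lazy = normalize_children(payload[0])
```

The type check has to happen somewhere; doing it once at the boundary keeps the rest of the code working with a single shape.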
See: How do I create JSON data structure when element can be different types in for use by
You could validate whether a JSON is okay or not at https://jsonlint.com/
See more about JSON in this answer: https://stackoverflow.com/a/4862511/945214
It is valid JSON. JSON RFC 8259 defines a general syntax, but it contains nothing that would allow a tool to identify that two equally named entries are meant to describe the same conceptual thing.
The need for a criterion to check two JSON structures for instance equality has been one motivation for creating something like JSON Schema.
I also think it is not too unusual for JavaScript to provide this kind of mixed data. Sometimes it might help to explicitly convert the JavaScript object to JSON, as in JSON.stringify(testObject).
Some tools for JSON validation:
https://www.npmjs.com/package/json-validation
https://davidwalsh.name/json-validation.
Let's say I have a json object:
Object = {
param1: '',
param2: '',
param3: '',
param4: {
paramA: '',
paramB: '',
paramC: '',
paramD: [AnotherJsonObject1,AnotherJsonObject2]
}
}
Will my MongoDB structure not be similar? Would this type of structuring make the data (or some of it) less searchable?
Edit 1:
By less searchable I mean: if the top level entities have sub entities which themselves have sub-entities and so on. Will I be able to reach the lowest level entities with the same efficiency of those in the top level?
I currently depend heavily on JSON files in my website. Those files need not be indexed or searchable, BUT they would fit in the DB logically.
For example: I have a director, the director has the list of movies he created, every movie in this list has itself a list of actors who play in it, and every actor has a bio.
The bio in this example doesn't need to be indexed. I can just include a link to the file that contains the actor's bio, but I am wondering whether I can simply add it to the DB, because that way it will all fit together logically, or whether 'unnecessary' data will harm the DB's ability to perform efficiently.
MongoDB stores documents in BSON format, which looks similar to the JSON structure.
The structure you explained seems to be a proper use case of nested documents.
You can query nested fields using the . operator
Would this type of structuring make the data (or some of it) less searchable?
It depends on your nested data structure and the kind of queries you run on those fields. Nested documents may bring some limitations, or queries may be a bit more complicated. However, as far as the searchability of your nested docs is concerned, it depends entirely on your use case.
For example:
{director: {movies: [{movieName: "movie1", actors: [{firstName: "will", lastName: "smith"}, {firstName: "bruce", lastName: "willis"}]}]}}
In the above scenario, searching for a director where any of the directed movies has an actor with firstName "will" and lastName "smith" may turn out to be a bit more complex.
A simple query like
{"director.movies.actors.firstName": "will", "director.movies.actors.lastName": "smith"}
may return a false positive, because the two conditions can be satisfied by different actors.
The document {director: {movies: [{actors: [{firstName: "will", lastName: "willis"}, {firstName: "bruce", lastName: "smith"}]}]}}
will also turn out to be a positive match.
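Negation queries need similar care: on an array field, a condition like firstName != "bruce" is evaluated against the array as a whole rather than per actor, so the result may not be what you expect.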
You may want to go through the MongoDB docs on this.
For the first case, you can refer to $elemMatch.
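To make the difference concrete, here is a small pure-Python sketch that mimics the two matching rules on the actor arrays from the example. It does not talk to MongoDB; the helper names `dotted_match` and `elem_match` are made up, and the query dict at the end is only what I would expect the $elemMatch form to look like for these field names, so check it against the docs.

```python
def dotted_match(actors, first, last):
    """Mimics independent dotted conditions: each may be met by a different element."""
    return any(a["firstName"] == first for a in actors) and any(
        a["lastName"] == last for a in actors
    )


def elem_match(actors, first, last):
    """Mimics $elemMatch: one single element must satisfy both conditions."""
    return any(a["firstName"] == first and a["lastName"] == last for a in actors)


doc1_actors = [
    {"firstName": "will", "lastName": "smith"},
    {"firstName": "bruce", "lastName": "willis"},
]
doc2_actors = [
    {"firstName": "will", "lastName": "willis"},
    {"firstName": "bruce", "lastName": "smith"},
]

# Assumed shape of the corresponding MongoDB filter (field names from the example above).
elem_match_query = {
    "director.movies.actors": {
        "$elemMatch": {"firstName": "will", "lastName": "smith"}
    }
}
```

With the dotted rule both actor lists match; with the element rule only the first one does.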
I am a newbie to Couchbase Server. I want to store 10 author names in Couchbase, one after another. Can someone tell me whether the structure should be a single document "author" with multiple values
{ id : 1, name : Author 1}, { id : 2, name : Author 2}
OR whether I should store Author 1 in one document and Author 2 in another document?
If the latter, how can I increment the id automatically before the "insert" command?
You can store all authors in a single document:
{
"doctype": "Authors",
"AuthorNames": [
{
"id": 1,
"Name": "author1"
},
{
"id": 2,
"Name": "author2"
},
...
]
}
If you want an incrementing ID, one option is to store each author in a new document, but then the ID will be randomly generated and will not be in incremental order.
In Couchbase, think more about how your application will use the data than about how you want to store it. For example, will your application need to get all 10 authors all of the time? If so, then one document might be worthwhile. Perhaps your application only ever needs to read or write one author at a time. Then you might want to put each author in its own document, with an object key pattern that lets you get the object really fast. Objects that are used often are kept in the managed cache; objects that are not used often may fall out of the managed cache, and that is OK.
The other factor is what your reads to writes ratio is on this data.
So like I said, it depends on how your application will be reading and writing your data. Use this as the guidance for how your data should be stored.
The single JSON document is pretty straightforward. The more advanced schema design, where each author is in its own document and you access them via object key, might be a bit more complicated, but is ultimately faster and more scalable, depending on the factors I already pointed out. I will lay out an example schema and some possibilities.
For the authors, I might create each author JSON document with an object key like this:
authors::ID
Where ID is a value I keep in a special incrementer object that I will call authors::incrementer. Think of that object as a key-value pair holding only an integer that happens to be the upper bound of an array. Couchbase SDKs include a special function to increment just such an integer object. With this, my application can put together that object key very quickly. If I want to go after the 5th author, I do a read by object key for "authors::5". If I need to get 10, I do a parallelized BulkGet and get authors::1 through authors::10. If I want to get all the authors, I get the incrementer object, read that integer and then do a parallelized bulk get. This way I can get them in order, or in whatever order I feel like, and I am accessing them by object key, which is VERY fast in Couchbase.
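The key pattern can be sketched without any SDK at all. The `InMemoryStore` class below is a hypothetical stand-in for a bucket, not the Couchbase API; its method names (`increment`, `upsert`, `get`, `get_multi`) are only assumptions chosen to mirror the operations described above.

```python
class InMemoryStore:
    """Hypothetical stand-in for a key-value bucket; not the Couchbase SDK."""

    def __init__(self):
        self._data = {}

    def increment(self, key, delta=1, initial=1):
        """Create the counter at `initial` if missing, else add `delta`."""
        if key not in self._data:
            self._data[key] = initial
        else:
            self._data[key] += delta
        return self._data[key]

    def upsert(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def get_multi(self, keys):
        return {k: self._data[k] for k in keys}


def add_author(store, name):
    """Reserve the next id from the incrementer, then write authors::ID."""
    author_id = store.increment("authors::incrementer")
    key = f"authors::{author_id}"
    store.upsert(key, {"id": author_id, "name": name})
    return key


def all_authors(store):
    """Read the upper bound, build every key, and fetch them in one bulk call."""
    upper = store.get("authors::incrementer")
    keys = [f"authors::{i}" for i in range(1, upper + 1)]
    return store.get_multi(keys)
```

In a real bucket the increment would have to be the SDK's atomic counter operation, so that two writers never receive the same ID.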
All this being said, I could use a view to query this data or the upcoming "SQL for Documents" in Couchbase 4.0 or I can mix and match when I query and when I get objects by their key. Key access will ALWAYS be faster. It is the difference between asking a question then going and getting the object and simply knowing the answer and getting it immediately.
I am using MongoDB and I want to store various trees in it.
One way of storing a tree is to store each node as a document with references to its children/parent/ancestors (as mentioned here).
The other way is to store the whole tree as one document with children as sub-documents, e.g.
tree : {
"title" : "root",
"children" : [
{
"title" : "node_1",
"children" : [
...
]
},
{
"title" : "node_2",
"children" : [
...
]
}
]
}
Question: Which way is recommended for storing trees?
Here are the operations that I want to perform on my data:
Add a node
Delete a node
Update a node
Get the json of whole tree
As I am planning to show this tree in the UI using JsTree (you can recommend a better alternative to JsTree), which expects JSON data in nested format (way 2), I thought of storing the data in the same way instead of way 1.
If I store the JSON data in the DB in way 1, then I will have to map a Java object for each document/node, manually build a tree object in Java by pointing each parent to its corresponding children, and then convert that Java tree object back to JSON to get the nested JSON.
The Java object for each node looks like:
class Node {
private String title;
private List<Node> children;
}
It looks like you are going to do lots of operations at different levels of nested nodes in the tree. Although MongoDB can store a structure like the one you describe, it is not very good at letting you update at lots of nested levels.
Therefore I would recommend storing each node as its own document, and thinking about where you store the parent-child relations. Remember to optimise the schema for your data operations.
I'd go with your "way 1" in this case. If you would not have to change the tree a lot, and you have say 1000x more read than write operations to the tree, then you could consider using "way 2" and just deal with the extra work it takes to update the nodes at a few levels deep.
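The rebuild step that "way 1" needs is not much code. Here is a minimal sketch, assuming each node document stores a reference to its parent in a field I call `parent` (with None for the root); the `_id` and `parent` field names and the sample data are assumptions for illustration only.

```python
def build_tree(nodes):
    """Assemble nested {"title", "children"} dicts from flat node documents."""
    # First pass: create one output node per document, indexed by id.
    by_id = {n["_id"]: {"title": n["title"], "children": []} for n in nodes}
    root = None
    # Second pass: attach every node to its parent; the parentless node is the root.
    for n in nodes:
        parent_id = n.get("parent")
        if parent_id is None:
            root = by_id[n["_id"]]
        else:
            by_id[parent_id]["children"].append(by_id[n["_id"]])
    return root


flat_nodes = [
    {"_id": 1, "title": "root", "parent": None},
    {"_id": 2, "title": "node_1", "parent": 1},
    {"_id": 3, "title": "node_2", "parent": 1},
]
```

The result of build_tree(flat_nodes) has the nested shape shown in the question and can be serialized as-is for the UI.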