Possible to have a table with referential integrity to itself? - mysql

Is it possible to have a table with circular referential integrity keys to itself? For example, if I had a table called Container:
ObjectId  ParentId
1         1
2         1
3         2
ObjectId 1 references itself. IDs 2 and 3 reference their respective parents, which are also in the same table. It wouldn't be possible to delete 3 without deleting 2, 2 without deleting 1, and it would be impossible to delete 1.
I know I could accomplish the same thing by having a cross reference table, such as,
ObjectId ContainerId
1 1
2 2
3 3
ContainerId ObjectId
1 1
2 1
3 3
But I'm interested in the first way of accomplishing it more, as it would eliminate a possibly unnecessary table. Is this possible?

Yes, self referencing tables are fine.
They are the classical way to represent deeply nested hierarchies.
Just set a foreign key from the child column to the parent column (so, a value in the child must exist in the parent column).
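Here is a minimal sketch of such a foreign key, using Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere. The Container table and its rows come from the question; the try_delete helper is made up for illustration. In MySQL the table would need the InnoDB engine for the constraint to be enforced.

```python
import sqlite3

# isolation_level=None means autocommit, so a failed statement undoes only itself.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.execute("""
    CREATE TABLE Container (
        ObjectId INTEGER PRIMARY KEY,
        ParentId INTEGER NOT NULL REFERENCES Container(ObjectId)
    )
""")
# Same rows as in the question; the root row (1) points at itself.
conn.executemany("INSERT INTO Container VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])

def try_delete(object_id):
    """Return True if the row was deleted, False if the foreign key blocked it."""
    try:
        conn.execute("DELETE FROM Container WHERE ObjectId = ?", (object_id,))
        return True
    except sqlite3.IntegrityError:
        return False

deleted_2 = try_delete(2)  # blocked: row 3 still names row 2 as its parent
deleted_3 = try_delete(3)  # allowed: nothing references row 3
remaining = conn.execute("SELECT ObjectId, ParentId FROM Container ORDER BY ObjectId").fetchall()
```

With a plain foreign key (no ON DELETE CASCADE), it is the parent that cannot be removed while children still point at it, so rows have to be deleted from the leaves upward.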

I have done this many times. But be aware that if you really are managing hierarchies of data, SQL isn't good at tree-like queries. Some SQL vendors have extensions that might help with this, but Joe Celko's 'Nested Sets' is the cat's meow for this. You'll get lots of hits in a search.
Currently I use the nested-sets approach, with a self-referencing 'parentID' as a shortcut for two questions:
Who is my parent?
Who are my immediate children?
The rest are nested-sets queries.

The first way works, however if you're trying to store an arbitrarily deep tree, the recursive queries will be slow. You could look into storing an adjacency list or a different method (see http://vadimtropashko.wordpress.com/2008/08/09/one-more-nested-intervals-vs-adjacency-list-comparison/).
One thing we do is to store (in a separate table) each object along with all of its successors as well as having a "parent" indicator in the main table, which we use to build the tree in the application.

The goal, George, is not to eliminate an unnecessary table when using the self-reference nested set approach. Rather, it is to handle a hierarchy whose depth is not known in advance: your boss's boss's boss's boss. Who knows how deep that organizational tree may go? If you do know the depth of the hierarchy in advance, and it is not subject to frequent change, you would be better served with separate tables because writing queries against nested sets is a headache best avoided. Simplicity is better than complexity.

Related

Recursion the optimal solution in this case?

I'm getting a headache over this. I'm building a system, that can handle a number of projects, groups and file references.
Please take a look at this:
A user should be able to create an unlimited number of projects and groups, and attach an unlimited number of file references, much like an ordinary PC file structure works with drive letters, folders and files.
All of the mentioned elements reside inside a MySQL database. However, I'm not sure if this (see below) is the optimal way of structuring the whole thing:
As you can see, it contains one entity called "Xrefs", containing projects and groups. The rows point to other rows in the same entity, probably making it ideal to do a recursive call when retrieving the data.
A different approach could be to create 1 entity for projects, 1 entity for groups and 1 entity for file references, as well as 1 helper entity that ties the three entities together and also contains a "parent" value that (similar to the first solution) refers to the upper-level tuples in order to create a hierarchy.
If you were to build a similar project, what would you do?
You have hit one of the best-known restrictions of MySQL: the lack of what are called recursive queries (PostgreSQL) or CTE queries (Oracle). There are some possible workarounds, but for a project with this kind of requirement you would probably suffer a lot from many other well-known MySQL limitations. Even SQLite would be more useful on this matter (except for the one-concurrent-user restriction).
DBIx::Class has some components to help you work around these MySQL limitations; search for Nested Trees, Ordered Trees, WITH RECURSIVE QUERY… See, for example, DBIx::Class::Tree::NestedSet.
You will need support for something like "7.8. WITH Queries (Common Table Expressions)", which MySQL does not offer.
Your structure is fine - since you are building a tree, not a general graph, there is no need for a separate table that ties entities together. I would put projects into their own table, because they appear to stand on their own, unless you must support hierarchy among projects as well.
However, given that your RDBMS is MySQL, you would have problems building recursive queries. For example, try thinking of a query that would give you all files related to xfer_id of 1 (i.e. the project). None of the files is tied to that ID, so you need to locate your first-level groups, then your second level groups, and then tie files to them. Since your groups can be nested in any number of levels, your query would have to be recursive as well.
Although you can certainly do it, it is currently not simple, and requires writing stored procedures. A common approach for situations like that is to build the tree in memory, with some assistance from RDBMS. The trick is to store the id of the top project in each group, i.e.
xfer_id  xfer_fk  xfer_top
-------  -------  --------
1        -        1
2        1        1
3        1        1
4        3        1
5        3        1
Now a query with the condition WHERE xfer_top=... will give you all the individual "parts", which can be combined in memory without having to bring the entire table into memory.
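A rough sketch of that idea, assuming a table named xrefs with the three columns shown above (SQLite via Python's sqlite3 is used only so the example is self-contained). Rows 6 and 7 are an invented second project, and load_tree is a hypothetical helper name.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xrefs (xfer_id INTEGER PRIMARY KEY, xfer_fk INTEGER, xfer_top INTEGER)"
)
conn.executemany(
    "INSERT INTO xrefs VALUES (?, ?, ?)",
    [
        (1, None, 1), (2, 1, 1), (3, 1, 1), (4, 3, 1), (5, 3, 1),  # rows from the answer
        (6, None, 6), (7, 6, 6),                                   # invented second project
    ],
)

def load_tree(top_id):
    """Fetch one project's rows with a single query and group them by parent."""
    rows = conn.execute(
        "SELECT xfer_id, xfer_fk FROM xrefs WHERE xfer_top = ? ORDER BY xfer_id",
        (top_id,),
    ).fetchall()
    children = defaultdict(list)
    for xfer_id, xfer_fk in rows:
        if xfer_fk is not None:  # the top row has no parent
            children[xfer_fk].append(xfer_id)
    return dict(children)

tree = load_tree(1)
```

The query is flat and non-recursive; all the tree-building happens in the application, and only the rows of the requested project are transferred.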

How to get multi-levels with a single query? [duplicate]

Given the following table
id  parentID  name   image
0   0                default.jpg
1   0         Jason
2   1         Beth   b.jpg
3   0         Layla  l.jpg
4   2         Hal
5   4         Ben
I am wanting to do the following:
If I search for Ben, I would like to find his image. If there is no image, I would like to find the parent's image; if that does not exist, the grandparent's image, and so on up the tree until we hit the default image.
What is the most efficient way to do this? I know SQL isn't really designed for hierarchical values, but this is what I need to do.
Cheers!
MySQL (before version 8.0) lacks recursive queries, which are part of standard SQL. Many other brands of database support this feature, including PostgreSQL (see http://www.postgresql.org/docs/8.4/static/queries-with.html).
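For comparison, this is roughly what the lookup looks like where recursive queries are available. The sketch runs against SQLite through Python's sqlite3 module; as far as I know the same WITH RECURSIVE syntax is also accepted by MySQL 8.0 and later. The table name people and the helper find_image are assumptions, and the nameless default row is stored with a NULL name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, parentID INTEGER, name TEXT, image TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (0, 0, None, "default.jpg"),
        (1, 0, "Jason", None),
        (2, 1, "Beth", "b.jpg"),
        (3, 0, "Layla", "l.jpg"),
        (4, 2, "Hal", None),
        (5, 4, "Ben", None),
    ],
)

def find_image(name):
    """Walk from the named row up through its ancestors; return the nearest image."""
    row = conn.execute(
        """
        WITH RECURSIVE chain(id, parentID, image, depth) AS (
            SELECT id, parentID, image, 0 FROM people WHERE name = ?
            UNION ALL
            SELECT p.id, p.parentID, p.image, chain.depth + 1
            FROM people AS p
            JOIN chain ON p.id = chain.parentID
            WHERE chain.id <> chain.parentID  -- stop at the self-referencing root
        )
        SELECT image FROM chain WHERE image IS NOT NULL ORDER BY depth LIMIT 1
        """,
        (name,),
    ).fetchone()
    return row[0] if row else None

ben_image = find_image("Ben")
```

The stop condition matters here because row 0 is its own parent; without it the recursion would never terminate.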
There are several techniques for handling hierarchical data in MySQL.
Simplest would be to add a column to note the hierarchy that a given photo belongs to. Then you can search for the photos that belong to the same hierarchy, fetch them all back to your application and figure out the ones you need there. This is slightly wasteful in terms of bandwidth, requires you to write more application code, and it's not good if your trees have many nodes.
There are also a few clever techniques to store hierarchical data so you can query them:
Path Enumeration stores the list of ancestors with each node. For instance, photo 5 in your example would store "0-1-2-4-5". You can find its ancestors with a LIKE predicate, by searching for nodes whose path concatenated with "%" matches 5's path.
Nested Sets is a complex but clever technique popularized by Joe Celko in his articles and his book "Trees and Hierarchies in SQL for Smarties." There are numerous online blogs and articles about it too. It's easy to query trees, but hard to query immediate children or parents and hard to insert or delete nodes.
Closure Table involves storing every ancestor/descendant relationship in a separate table. It's easy to query trees, easy to insert and delete, and easy to query immediate parents or children if you add a pathlength column.
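To make the Closure Table option concrete, here is a small sketch applied to the table from the question, again using SQLite through Python so it can be run as-is. The table names photos and closure, the column names, and the helper nearest_image are all assumptions made for illustration; the root's parent is represented as None rather than 0 to keep the loop simple.

```python
import sqlite3

# Parent links and images taken from the question's table.
parents = {0: None, 1: 0, 2: 1, 3: 0, 4: 2, 5: 4}
images = {0: "default.jpg", 2: "b.jpg", 3: "l.jpg"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE photos (id INTEGER PRIMARY KEY, image TEXT)")
conn.execute(
    """
    CREATE TABLE closure (
        ancestor INTEGER, descendant INTEGER, pathlength INTEGER,
        PRIMARY KEY (ancestor, descendant)
    )
    """
)
conn.executemany("INSERT INTO photos VALUES (?, ?)", [(i, images.get(i)) for i in parents])

# One closure row per (ancestor, descendant) pair, including each node with itself.
for node in parents:
    ancestor, length = node, 0
    while ancestor is not None:
        conn.execute("INSERT INTO closure VALUES (?, ?, ?)", (ancestor, node, length))
        ancestor, length = parents[ancestor], length + 1

def nearest_image(node_id):
    """Return the image of the closest ancestor (or the node itself) that has one."""
    row = conn.execute(
        """
        SELECT p.image
        FROM closure AS c
        JOIN photos AS p ON p.id = c.ancestor
        WHERE c.descendant = ? AND p.image IS NOT NULL
        ORDER BY c.pathlength
        LIMIT 1
        """,
        (node_id,),
    ).fetchone()
    return row[0] if row else None

subtree_of_beth = [r[0] for r in conn.execute(
    "SELECT descendant FROM closure WHERE ancestor = 2 ORDER BY descendant")]
```

Both the "nearest image" lookup and the subtree query are ordinary joins with no recursion, which is the main appeal of this design on a database without recursive queries. The cost is the extra rows that have to be maintained whenever a node moves.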
You can see more information comparing these methods in my presentation Practical Object-Oriented Models in SQL or my upcoming book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
Perhaps Managing Hierarchical Data in MySQL helps.

modeling many to many unary relationship and 1:M unary relationship

I'm getting back into database design and I realize that I have huge gaps in my knowledge.
I have a table that contains categories. Each category can have many subcategories and each subcategory can belong to many super-categories.
I want to create a folder with a category name which will contain all the subcategories' folders (a visual object like Windows folders).
So I need to perform quick searches of the subcategories.
I wonder what the benefits are of using a 1:M or an M:N relationship in this case?
And how would each design be implemented?
I have created an ERD model which is a 1:M unary relationship. (The diagram also contains an expense table which stores all the expense values, but it is irrelevant in this case.)
Is this design correct?
Will a many-to-many unary relationship allow for faster searches of super-categories, and is it the best design by default?
I would prefer an answer which contains an ERD.
If I understand you correctly, a single sub-category can have at most one (direct) super-category, in which case you don't need a separate table. Something like this should be enough:
Obviously, you'd need a recursive query to get the sub-categories from all levels, but it should be fairly efficient provided you put an index on PARENT_ID.
Going in the opposite direction (and getting all ancestors) would also require a recursive query. Since this would entail searching on PK (which is automatically indexed), this should be reasonably efficient as well.
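A sketch of the downward query, assuming a hypothetical category table with category_id, parent_id and name columns and some invented rows. It runs on SQLite through Python's sqlite3 module; the index on parent_id is what the recursive step would use to find each level's children.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE category (
        category_id INTEGER PRIMARY KEY,
        parent_id   INTEGER REFERENCES category(category_id),
        name        TEXT NOT NULL
    )
    """
)
conn.execute("CREATE INDEX idx_category_parent ON category(parent_id)")
conn.executemany(
    "INSERT INTO category VALUES (?, ?, ?)",
    [
        (1, None, "Household"),   # invented sample data
        (2, 1, "Food"),
        (3, 1, "Utilities"),
        (4, 2, "Groceries"),
        (5, 2, "Restaurants"),
    ],
)

def subcategories(root_id):
    """Names of all sub-categories of root_id, from every level, sorted by name."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(category_id, name) AS (
            SELECT category_id, name FROM category WHERE parent_id = ?
            UNION ALL
            SELECT c.category_id, c.name
            FROM category AS c
            JOIN sub ON c.parent_id = sub.category_id
        )
        SELECT name FROM sub ORDER BY name
        """,
        (root_id,),
    ).fetchall()
    return [name for (name,) in rows]

all_under_household = subcategories(1)
```

Walking upward instead would join on category_id = parent_id of the previous row, which hits the primary key rather than the extra index.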
For some more ideas and different performance tradeoffs, take a look at this slide-show.
In some cases the easiest way to maintain a multilevel hierarchy in a relational database is the Nested Set Model, sometimes also called "modified preorder tree traversal" (MPTT).
Basically the tree nodes store not only the parent id but also the ids of the left-most and right-most leaf:
spending_category
-----------------
parent_id  int
left_id    int
right_id   int
name       char
The major benefit from doing this is that now you are able to get an entire subtree of a node with a single query: the ids of subtree nodes are between left_id and right_id. There are many variations; others store the depth of the node in addition to or instead of the parent node id.
A drawback is that left_id and right_id have to be updated when nodes are inserted or deleted, which means this approach is useful only for trees of moderate size.
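The subtree query can be sketched like this. Note the assumption: left_id and right_id here hold the counters of the usual MPTT numbering (a depth-first walk that numbers each node on the way in and on the way out), which is one common variant and not necessarily the exact one described above. The sample categories and the helper names are invented.

```python
import sqlite3

# Invented sample tree: parent name -> child names.
children = {"Household": ["Food", "Utilities"], "Food": ["Groceries", "Restaurants"]}

def assign_bounds(node, next_value, bounds):
    """Depth-first walk that records (left, right) per node; returns the next free counter."""
    left = next_value
    next_value += 1
    for child in children.get(node, []):
        next_value = assign_bounds(child, next_value, bounds)
    bounds[node] = (left, next_value)
    return next_value + 1

bounds = {}
assign_bounds("Household", 1, bounds)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE spending_category (name TEXT, left_id INTEGER, right_id INTEGER)")
conn.executemany(
    "INSERT INTO spending_category VALUES (?, ?, ?)",
    [(name, left, right) for name, (left, right) in bounds.items()],
)

def subtree(root_name):
    """All nodes whose left_id falls inside the root's (left_id, right_id) range."""
    rows = conn.execute(
        """
        SELECT node.name
        FROM spending_category AS node
        JOIN spending_category AS root
          ON node.left_id BETWEEN root.left_id AND root.right_id
        WHERE root.name = ?
        ORDER BY node.left_id
        """,
        (root_name,),
    ).fetchall()
    return [name for (name,) in rows]

food_subtree = subtree("Food")
```

Reading a subtree is a single range comparison, but inserting a node means renumbering every row to its right, which is the update cost mentioned above.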
The Wikipedia article and the slideshow mentioned by Branko explain the technique better than I can. Also check out this list of resources if you want to know more about different ways of storing hierarchical data in a relational database.

Can a binary tree or tree be always represented in a Database as 1 table and self-referencing?

I hadn't noticed this rule before, but it seems that a binary tree, or any tree (each node can have many children, but children cannot point back to any parent), can be represented as 1 table in a database, with each row having an ID for itself and a parentID that points back to the parent node.
That is in fact the classical Employee - Manager diagram: one boss can have many people under him... and each person can have n people under him, etc. This is a tree structure and is represented in database books as a common example as a single table Employee.
The answer to your question is 'yes'.
Simon's warning about your trees becoming a cyclic graph is correct too.
All the stuff that has been said about "You have to ensure by hand that this won't happen, i.e. the DBMS won't do that for you automatically, because you will not break any integrity or reference rules.", is WRONG.
This remark and the corresponding comments hold true only as long as you consider SQL systems alone.
There exist systems which CAN do this for you in a pure declarative way, that is without you having to write *any* code whatsoever. That system is SIRA_PRISE (http://shark.armchair.mb.ca/~erwin).
Yes, you can represent hierarchical structures by self-referencing the table. Just be aware of such situations:
Employee Supervisor
1 2
2 1
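Since a plain foreign key will happily accept those two rows, the check has to live somewhere else, for example in application code before an update is saved. A minimal sketch, with find_cycle as a made-up helper that takes the Employee to Supervisor mapping as a dictionary:

```python
def find_cycle(supervisor_of, start):
    """Follow supervisor links from start; return the cycle as a list, or None."""
    seen = []
    current = start
    while current is not None:
        if current in seen:
            # We are back at a node already visited: everything from there on is the loop.
            return seen[seen.index(current):]
        seen.append(current)
        current = supervisor_of.get(current)
    return None

# The two rows above: 1 reports to 2 and 2 reports to 1.
cyclic = find_cycle({1: 2, 2: 1}, 1)

# A proper tree: 3 reports to 2, 2 reports to 1, and 1 has no supervisor.
acyclic = find_cycle({1: None, 2: 1, 3: 2}, 3)
```

The same walk also works as the "unrolling" loop for finding all transitive supervisors of an employee, as long as the data really is a tree.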
Yes, that is correct. Here's a good reference
Just be aware that you generally need a loop in order to unroll the tree (e.g. find transitive relationships)
