Retrieving data with a hierarchical structure in MySQL

Given the following table
id  parentID  name   image
0   0                default.jpg
1   0         Jason
2   1         Beth   b.jpg
3   0         Layla  l.jpg
4   2         Hal
5   4         Ben
I want to do the following:
If I search for Ben, I would like to find his image. If there is no image, I would like to find the parent's image; if that does not exist, the grandparent's image, and so on, up until we hit the default image.
What is the most efficient way to do this? I know SQL isn't really designed for hierarchical values, but this is what I need to do.
Cheers!

MySQL versions before 8.0 lack recursive queries, which are part of standard SQL (MySQL 8.0 added them as WITH RECURSIVE common table expressions). Many other brands of database support this feature, including PostgreSQL (see http://www.postgresql.org/docs/8.4/static/queries-with.html).
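Where recursive common table expressions are available (MySQL 8.0 or later, PostgreSQL, SQLite), the lookup from the question can be sketched as a single query. The example below is only an illustration: it runs against SQLite through Python's sqlite3 module as a stand-in for MySQL, and the table name people, the constant ANCESTOR_IMAGE_SQL and the helper find_image are made-up names. It also assumes the root row is the one whose parentID equals its own id, as in the sample data.

```python
import sqlite3

# Rows mirror the question's sample table; "people" is a hypothetical table name.
ROWS = [
    (0, 0, None, "default.jpg"),
    (1, 0, "Jason", None),
    (2, 1, "Beth", "b.jpg"),
    (3, 0, "Layla", "l.jpg"),
    (4, 2, "Hal", None),
    (5, 4, "Ben", None),
]

# Walk from the named row up through its parents, then keep the nearest
# row that has an image. The WHERE clause stops at the self-referencing root.
ANCESTOR_IMAGE_SQL = """
WITH RECURSIVE chain (id, parentID, image, depth) AS (
    SELECT id, parentID, image, 0 FROM people WHERE name = ?
    UNION ALL
    SELECT p.id, p.parentID, p.image, c.depth + 1
    FROM people AS p
    JOIN chain AS c ON p.id = c.parentID
    WHERE c.id <> c.parentID
)
SELECT image FROM chain WHERE image IS NOT NULL ORDER BY depth LIMIT 1
"""


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (id INTEGER PRIMARY KEY, parentID INTEGER, "
        "name TEXT, image TEXT)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", ROWS)
    return conn


def find_image(conn, name):
    """Return the nearest available image for the named row, or None."""
    row = conn.execute(ANCESTOR_IMAGE_SQL, (name,)).fetchone()
    return row[0] if row else None


conn = make_db()
print(find_image(conn, "Ben"))
```

With the sample data, Ben and Hal have no image of their own, so the query resolves to the first ancestor that does.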
There are several techniques for handling hierarchical data in MySQL.
Simplest would be to add a column to note the hierarchy that a given photo belongs to. Then you can search for the photos that belong to the same hierarchy, fetch them all back to your application and figure out the ones you need there. This is slightly wasteful in terms of bandwidth, requires you to write more application code, and it's not good if your trees have many nodes.
There are also a few clever techniques to store hierarchical data so you can query them:
Path Enumeration stores the list of ancestors with each node. For instance, photo 5 in your example would store "0-1-2-4-5". You can find a node's ancestors by searching for nodes whose path, concatenated with "%", matches 5's path in a LIKE predicate.
Nested Sets is a complex but clever technique popularized by Joe Celko in his articles and his book "Trees and Hierarchies in SQL for Smarties." There are numerous online blogs and articles about it too. It's easy to query trees, but hard to query immediate children or parents and hard to insert or delete nodes.
Closure Table involves storing every ancestor/descendant relationship in a separate table. It's easy to query trees, easy to insert and delete, and easy to query immediate parents or children if you add a pathlength column.
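As a rough illustration of the Path Enumeration lookup described above, the sketch below stores a path column for the question's sample rows and finds ancestors with LIKE. It uses SQLite through Python's sqlite3 module as a stand-in; the table name people and the helpers ancestors and nearest_image are hypothetical. The extra "-" appended to both sides of the LIKE is an added safeguard (not part of the description above) so that a path such as "0-1" does not accidentally match "0-10-...". In MySQL the `||` concatenation would normally be written with CONCAT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, image TEXT, path TEXT)"
)
# Each path lists the ancestors from the root down to the row itself.
conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", [
    (0, None, "default.jpg", "0"),
    (1, "Jason", None, "0-1"),
    (2, "Beth", "b.jpg", "0-1-2"),
    (3, "Layla", "l.jpg", "0-3"),
    (4, "Hal", None, "0-1-2-4"),
    (5, "Ben", None, "0-1-2-4-5"),
])


def ancestors(conn, node_id):
    """Ids from the node itself up to the root, nearest first."""
    rows = conn.execute(
        """
        SELECT a.id
        FROM people AS n
        JOIN people AS a ON n.path || '-' LIKE a.path || '-%'
        WHERE n.id = ?
        ORDER BY length(a.path) DESC
        """,
        (node_id,),
    ).fetchall()
    return [r[0] for r in rows]


def nearest_image(conn, node_id):
    """Image of the closest row on the path (the node included) that has one."""
    row = conn.execute(
        """
        SELECT a.image
        FROM people AS n
        JOIN people AS a ON n.path || '-' LIKE a.path || '-%'
        WHERE n.id = ? AND a.image IS NOT NULL
        ORDER BY length(a.path) DESC
        LIMIT 1
        """,
        (node_id,),
    ).fetchone()
    return row[0] if row else None


print(ancestors(conn, 5), nearest_image(conn, 5))
```

Ordering by path length works here because every ancestor's path is a strict prefix of its descendants' paths.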
You can see more information comparing these methods in my presentation Practical Object-Oriented Models in SQL or my upcoming book SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.

Perhaps Managing Hierarchical Data in MySQL helps.

Related

Best method for storing hierarchy of organisations using eloquent

I need to store an organisation ownership hierarchy in a Laravel backend. Each node in the hierarchy can be one of a number of types, and each relationship needs to carry the amount of ownership (and potentially more metadata relating to the relationship between nodes). The structure can be arbitrarily deep, and it must be possible to attach a subtree an arbitrary number of times (see C1 below, which appears twice). Below is a sketch of the kind of hierarchy I need....
I am using MySQL 8, so I have access to CTEs for recursion. I have looked into the adjacency-list package (staudenmeir/laravel-adjacency-list), which uses CTEs and looks good, but it uses self-referencing tables. I think this means that I cannot store relationship data, and I don't think I can get the repeated subtree structure you see above.
I am currently exploring many-to-many relationships, with a custom pivot table to store the "relationship weighting". But I am unsure whether this is a sensible approach, and perhaps I'm missing some useful design pattern for this.
I am aware that this is a nebulous question, but while I'm trying to crack this myself using Eloquent relationships, I thought I might get a discussion going about design patterns for this type of work.
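One way to picture the pivot-table idea from the question is as an edge table that carries the ownership share, walked with a recursive CTE. The sketch below is plain SQL run through Python's sqlite3 module, not Eloquent code, and everything in it apart from the node name C1 is invented for illustration: the tables organisations and ownership, the nodes A, B1 and B2, the type values and the share figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organisations (id TEXT PRIMARY KEY, type TEXT);
-- One row per ownership relationship; extra metadata columns could live here.
CREATE TABLE ownership (
    parent_id TEXT REFERENCES organisations(id),
    child_id  TEXT REFERENCES organisations(id),
    share     REAL,
    PRIMARY KEY (parent_id, child_id)
);
""")
conn.executemany("INSERT INTO organisations VALUES (?, ?)", [
    ("A", "holding"), ("B1", "company"), ("B2", "company"), ("C1", "company"),
])
# C1 hangs under both B1 and B2, so its subtree appears twice below A.
conn.executemany("INSERT INTO ownership VALUES (?, ?, ?)", [
    ("A", "B1", 0.6), ("A", "B2", 0.4), ("B1", "C1", 0.5), ("B2", "C1", 1.0),
])


def effective_ownership(conn, root_id):
    """List every path below root_id with the product of shares along it."""
    return conn.execute(
        """
        WITH RECURSIVE walk (node_id, share, path) AS (
            SELECT child_id, share, parent_id || '>' || child_id
            FROM ownership WHERE parent_id = ?
            UNION ALL
            SELECT o.child_id, w.share * o.share, w.path || '>' || o.child_id
            FROM ownership AS o
            JOIN walk AS w ON o.parent_id = w.node_id
        )
        SELECT path, share FROM walk ORDER BY path
        """,
        (root_id,),
    ).fetchall()


for path, share in effective_ownership(conn, "A"):
    print(path, round(share, 6))
```

Because relationships live in their own table, a node can have several parents and each edge can carry its own data. The walk assumes the graph has no cycles; a cyclic ownership structure would need an explicit guard.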

Recursion the optimal solution in this case?

I'm getting a headache over this. I'm building a system, that can handle a number of projects, groups and file references.
Please take a look at this:
A user should be able to create an infinite number of projects, an infinite number of groups and attach an infinite number of file references, much like an ordinary PC file structure works with drive letters, folders and files.
All of the mentioned elements reside inside a MySQL database. However, I'm not sure if this (see below) is the optimal way of structuring the whole thing:
As you can see, it contains one entity called "Xrefs", containing projects and groups. The rows point back into the same table, probably making it ideal to do a recursive call when retrieving the data.
A different approach could be to create one entity for projects, one entity for groups and one entity for file references, as well as one helper entity that ties the three entities together and also contains a "parent" value that (similar to the first solution) refers to the upper-level tuples in order to create a hierarchy.
If you were to build a similar project, what would you do?
You hit one of the best-known restrictions of MySQL (before version 8.0): the inability to use what are called recursive queries (PostgreSQL) or CTE queries (Oracle). There are some possible workarounds, but for a project with these requirements you would probably suffer a lot from many other well-known MySQL limitations. Even SQLite would be more useful (except for the one concurrent user restriction) in this regard.
DBIx::Class has some components to help you circumvent these MySQL limitations; search for Nested Trees, Ordered Trees, WITH RECURSIVE QUERY… DBIx::Class::Tree::NestedSet
You will need support for something like 7.8. WITH Queries (Common Table Expressions), which MySQL did not offer before version 8.0.
Your structure is fine - since you are building a tree, not a general graph, there is no need for a separate table that ties entities together. I would put projects into their own table, because they appear to stand on their own, unless you must support hierarchy among projects as well.
However, given that your RDBMS is MySQL, you would have problems building recursive queries. For example, try thinking of a query that would give you all files related to xfer_id of 1 (i.e. the project). None of the files is tied to that ID, so you need to locate your first-level groups, then your second level groups, and then tie files to them. Since your groups can be nested in any number of levels, your query would have to be recursive as well.
Although you can certainly do it, it is currently not simple, and requires writing stored procedures. A common approach for situations like that is to build the tree in memory, with some assistance from RDBMS. The trick is to store the id of the top project in each group, i.e.
xfer_id xfer_fk xfer_top
------- ------- --------
1       -       1
2       1       1
3       1       1
4       3       1
5       3       1
Now a query with the condition WHERE xfer_top=... will give you all the individual "parts", which can be combined in memory without having to bring the entire table into memory.
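The "fetch by xfer_top, then combine in memory" step might look roughly like the sketch below. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL; the table name xrefs, the helper load_tree and the second project (rows 6 and 7) are assumptions added for illustration, while rows 1 to 5 follow the table above.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xrefs (xfer_id INTEGER PRIMARY KEY, xfer_fk INTEGER, xfer_top INTEGER)"
)
conn.executemany("INSERT INTO xrefs VALUES (?, ?, ?)", [
    (1, None, 1), (2, 1, 1), (3, 1, 1), (4, 3, 1), (5, 3, 1),
    (6, None, 6), (7, 6, 6),  # a second, unrelated project
])


def load_tree(conn, top_id):
    """Fetch one project's rows in a single query and nest them in memory."""
    rows = conn.execute(
        "SELECT xfer_id, xfer_fk FROM xrefs WHERE xfer_top = ? ORDER BY xfer_id",
        (top_id,),
    ).fetchall()

    # Group node ids under their parent id.
    children = defaultdict(list)
    for node_id, parent_id in rows:
        if parent_id is not None:
            children[parent_id].append(node_id)

    def subtree(node_id):
        return {child: subtree(child) for child in children[node_id]}

    return {top_id: subtree(top_id)}


print(load_tree(conn, 1))
```

Only the rows of the requested project cross the wire, and the recursion happens in application code rather than in SQL.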

What is the best practice for fetching a tree of nodes from database for further rendering?

Let's say we have a table with user comments. First-level comments have a reference to the article they are attached to. Deeper-level comments do not have this reference by design, but they have a reference to their parent comment.
For this database structure, what would be the most efficient way to fetch all comments for a given article and then render them in HTML? (Let's assume approximately 200 first-level comments and a maximum depth of 20.)
I usually recommend a design called Closure Table.
See example in my answer to What is the most efficient/elegant way to parse a flat table into a tree?
I also designed this presentation: Models for Hierarchical Data with SQL and PHP. I developed a PHP app that renders a tree in 0.3 seconds, from a collection of hierarchical data with 490k nodes.
I blogged about Closure Table here: Rendering Trees with Closure Table.
I wrote a chapter about different strategies for hierarchical data in my book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
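A minimal sketch of the Closure Table idea applied to the comments question, assuming hypothetical table names (comments, comment_paths) and helper functions (add_comment, fetch_thread, render). It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL and renders indented text rather than HTML to stay short; it is not the code from the presentation or blog post mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (
    id INTEGER PRIMARY KEY, article_id INTEGER, parent_id INTEGER, body TEXT
);
-- One row per ancestor/descendant pair, including each comment with itself.
CREATE TABLE comment_paths (
    ancestor INTEGER, descendant INTEGER, length INTEGER,
    PRIMARY KEY (ancestor, descendant)
);
""")


def add_comment(conn, comment_id, body, article_id=None, parent_id=None):
    conn.execute("INSERT INTO comments VALUES (?, ?, ?, ?)",
                 (comment_id, article_id, parent_id, body))
    conn.execute("INSERT INTO comment_paths VALUES (?, ?, 0)",
                 (comment_id, comment_id))
    if parent_id is not None:
        # Every ancestor of the parent is also an ancestor of the new comment.
        conn.execute(
            "INSERT INTO comment_paths (ancestor, descendant, length) "
            "SELECT ancestor, ?, length + 1 FROM comment_paths WHERE descendant = ?",
            (comment_id, parent_id),
        )


def fetch_thread(conn, article_id):
    """All comments under an article's first-level comments, in one query."""
    return conn.execute(
        """
        SELECT c.id, c.parent_id, c.body
        FROM comments AS top
        JOIN comment_paths AS p ON p.ancestor = top.id
        JOIN comments AS c ON c.id = p.descendant
        WHERE top.article_id = ?
        ORDER BY c.id
        """,
        (article_id,),
    ).fetchall()


def render(rows):
    """Turn flat rows into indented lines, parents before their replies."""
    bodies, children = {}, {}
    for comment_id, parent_id, body in rows:
        bodies[comment_id] = body
        children.setdefault(parent_id, []).append(comment_id)
    lines = []

    def visit(comment_id, depth):
        lines.append("  " * depth + bodies[comment_id])
        for child in children.get(comment_id, []):
            visit(child, depth + 1)

    for top in children.get(None, []):
        visit(top, 0)
    return lines


add_comment(conn, 1, "first", article_id=10)
add_comment(conn, 2, "reply", parent_id=1)
add_comment(conn, 3, "nested", parent_id=2)
add_comment(conn, 4, "second", article_id=10)
add_comment(conn, 5, "other", article_id=11)
print("\n".join(render(fetch_thread(conn, 10))))
```

The fetch needs no recursion in SQL regardless of depth; the only recursive step is the in-memory rendering.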
For the most efficient way, Quassnoi has written a series of articles on this subject:
Hierarchical queries in MySQL
Hierarchical queries in MySQL: adding level
Hierarchical queries in MySQL: adding ancestry chains
Hierarchical queries in MySQL: finding leaves
Hierarchical queries in MySQL: finding loops
I suggest you read the first article and adapt the examples to work with your specific table, but the crux is to make a function that can recurse over the rows you need to fetch. You probably also want the level (depth in the hierarchy), so the second article is probably relevant too.
The other articles may be useful if you need to make other types of queries on your data. He also has an article Adjacency list vs. nested sets: MySQL in which he compares highly optimized queries for both the adjacency model and the nested set model.
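For comparison only: on an engine with recursive CTEs (MySQL 8.0 or later, PostgreSQL, SQLite), a subtree together with its level can be fetched without a custom function. The sketch below is not the function-based approach from those articles; it swaps in WITH RECURSIVE and runs on SQLite through Python's sqlite3 module. The table nodes, its sample rows and the helper subtree_with_level are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (id INTEGER PRIMARY KEY, parent_id INTEGER, label TEXT)"
)
conn.executemany("INSERT INTO nodes VALUES (?, ?, ?)", [
    (1, None, "root"), (2, 1, "a"), (3, 1, "b"), (4, 2, "a1"), (5, 4, "a1x"),
])


def subtree_with_level(conn, root_id):
    """Rows of the subtree under root_id, with level 1 for the root itself."""
    return conn.execute(
        """
        WITH RECURSIVE tree (id, label, level) AS (
            SELECT id, label, 1 FROM nodes WHERE id = ?
            UNION ALL
            SELECT n.id, n.label, t.level + 1
            FROM nodes AS n
            JOIN tree AS t ON n.parent_id = t.id
        )
        SELECT id, label, level FROM tree ORDER BY level, id
        """,
        (root_id,),
    ).fetchall()


for row in subtree_with_level(conn, 1):
    print(row)
```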

Can a binary tree or tree be always represented in a Database as 1 table and self-referencing?

I hadn't noticed this rule before, but it seems that a binary tree, or any tree (each node can have many children, but children cannot point back to any parent), can be represented as one table in a database, with each row having an ID for itself and a parentID that points back to the parent node.
That is in fact the classical Employee - Manager diagram: one boss can have many people under him, and each person can have n people under him, etc. This is a tree structure, and database books commonly present it as a single Employee table.
The answer to your question is 'yes'.
Simon's warning about your trees becoming a cyclic graph is correct too.
All the stuff that has been said about "You have to ensure by hand that this won't happen, i.e. the DBMS won't do that for you automatically, because you will not break any integrity or reference rules." is WRONG.
That remark and the corresponding comments hold true only as long as you consider SQL systems alone.
There exist systems which CAN do this for you in a pure declarative way, that is without you having to write *any* code whatsoever. That system is SIRA_PRISE (http://shark.armchair.mb.ca/~erwin).
Yes, you can represent hierarchical structures by self-referencing the table. Just be aware of such situations:
Employee  Supervisor
1         2
2         1
Yes, that is correct. Here's a good reference
Just be aware that you generally need a loop in order to unroll the tree (e.g. find transitive relationships)
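The loop mentioned above, combined with a guard against the cyclic case shown earlier (employee 1 supervised by 2 and 2 supervised by 1), could be sketched like this. The function name chain_of_command and the dictionary representation of the Employee/Supervisor table are assumptions made for the example.

```python
def chain_of_command(supervisor_of, employee):
    """Follow supervisor links upward; raise if the data contains a cycle.

    supervisor_of maps an employee id to the supervisor's id, or to None
    for the person at the top.
    """
    chain = []
    current = employee
    while current is not None:
        if current in chain:
            raise ValueError(f"cycle detected at employee {current}")
        chain.append(current)
        current = supervisor_of.get(current)
    return chain


tree = {1: None, 2: 1, 3: 2}
print(chain_of_command(tree, 3))

cyclic = {1: 2, 2: 1}
try:
    chain_of_command(cyclic, 1)
except ValueError as err:
    print(err)
```

A plain foreign key from the supervisor column to the employee id accepts both data sets, which is why the check is done in the traversal here.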