Tree structure mysql query - mysql

I have this table structure:
-----------------------------------
Name DocID ParentID
-----------------------------------
doc1 1 NULL
doc2 2 1
doc3 3 NULL
doc4 4 3
doc5 5 1
The query should output the tree structure with parent and child nodes; the tree can be any number of levels deep.
The output should look like this:
doc1
| --doc2
| --doc5
|
doc3
--doc4
Can you help me do that in MySQL, with either a simple or a recursive query?

Writing a query like that for your data model (known as an Adjacency List) would be relatively complex and inefficient. If you really need to do so, check out http://www.artfulsoftware.com/mysqlbook/sampler/mysqled1ch20.html#adjacency_list_model.
If you don't mind altering your data model, there are two approaches:
Nested Sets - http://en.wikipedia.org/wiki/Nested_set_model. This is the most common and easy to use. There are plenty of scripts and frameworks that facilitate the use of nested sets without requiring you to manage all of the low level operations. If you have existing data, you can loop through the existing rows and insert them into the new nested set table, and you'll be good to go.
Alternatively, you could store the complete hierarchical path of each document (like breadcrumbs) in a new column. Something like:
------
Path
------
doc1
doc1>doc2
doc1>doc5
doc3
doc3>doc4
The second method would allow you to do a simple SELECT with an ORDER BY on "Path". You could then look at the number of ">" (or whatever character(s) you use) to determine the document's level in the hierarchy (how much to indent it in the UI).
The second method is not ideal because it requires more maintenance. If you change one parent-child relationship, you have to regenerate the path of the moved node and of every node beneath it. Nested Sets also involve a fair amount of row updates, but those are done efficiently.
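To make the path idea concrete, here is a minimal sketch in Python rather than SQL. It assumes the five rows from the question are already loaded into memory and that ">" never appears in a document name; `build_paths` and `render_tree` are illustrative names, not part of any library.

```python
# Adjacency-list rows from the question: (Name, DocID, ParentID).
ROWS = [
    ("doc1", 1, None),
    ("doc2", 2, 1),
    ("doc3", 3, None),
    ("doc4", 4, 3),
    ("doc5", 5, 1),
]

SEP = ">"  # path separator; assumed never to occur in a document name


def build_paths(rows):
    """Return {DocID: 'root>...>name'} by walking each row up to its root."""
    by_id = {doc_id: (name, parent_id) for name, doc_id, parent_id in rows}
    paths = {}
    for doc_id in by_id:
        names = []
        current = doc_id
        while current is not None:  # assumes the data contains no cycles
            name, parent_id = by_id[current]
            names.append(name)
            current = parent_id
        paths[doc_id] = SEP.join(reversed(names))
    return paths


def render_tree(rows):
    """Sort by path and indent each name by its depth (number of separators)."""
    lines = []
    for path in sorted(build_paths(rows).values()):
        depth = path.count(SEP)
        lines.append("    " * depth + path.split(SEP)[-1])
    return lines


print("\n".join(render_tree(ROWS)))
```

One caveat with plain string sorting: if one name is a prefix of a sibling's name (say doc1 and doc10), doc10 can sort between doc1 and doc1's children, so fixed-width or ID-based path segments are probably safer in practice.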

Related

How to find missing numbers within a column of strings

I'm trying to find unaccounted-for numbers within a fairly large SQL dataset and am having some difficulty sorting them out.
By default, the data for the column reads
'Brochure1: Brochure2: Brochure3:...Brochure(k-1): Brochure(k):'
where k stands in for the number of brochures a unique id is eligible for.
The issue arises once brochures are accounted for. A sample of the updated data would read
'Brochure1: 00001 Brochure2: 00002 Brochure3: 00003....'
How does one query for the missing numbers if, in a range of, say, 00001-88888, some haven't been accounted for next to a Brochure(X):?
The right way:
You should change the structure of your database. If you care about performance, you should follow the good practices of relational databases, so, as the first comment under your question said: normalize. Instead of storing the brochure information in one column of the table, it is much faster and clearer to create another table that describes the relation between brochures and your-first-table-name:
<your-first-table-name>_id | brochure_id
----------------------------+---------------
1 | 00002
1 | 00038
1 | 00281
2 | 28192
2 | 00293
... | ...
Not to mention that, if possible, you should treat brochure_id as an integer, so use 12 instead of 0012.
The difference is that you can now write simple, efficient queries to find out how many brochures one ID from your first table has, or which ID a brochure belongs to. If for some reason you need to keep the ordinal number of every brochure, you can add a column such as brochure_number to the above table.
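As a rough illustration of those two queries, here is a sketch using Python's built-in sqlite3 module as a stand-in for your real database. The table name `item_brochures` and the column name `item_id` are placeholders I made up for "your-first-table-name"; the rows are the ones from the table above, stored as integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE item_brochures (
        item_id     INTEGER NOT NULL,  -- hypothetical name for <your-first-table-name>_id
        brochure_id INTEGER NOT NULL,
        PRIMARY KEY (item_id, brochure_id)
    )
    """
)
conn.executemany(
    "INSERT INTO item_brochures (item_id, brochure_id) VALUES (?, ?)",
    [(1, 2), (1, 38), (1, 281), (2, 28192), (2, 293)],
)


def brochure_count(item_id):
    """How many brochures does one ID from the first table have?"""
    row = conn.execute(
        "SELECT COUNT(*) FROM item_brochures WHERE item_id = ?", (item_id,)
    ).fetchone()
    return row[0]


def owners_of(brochure_id):
    """Which IDs from the first table does a brochure belong to?"""
    rows = conn.execute(
        "SELECT item_id FROM item_brochures WHERE brochure_id = ? ORDER BY item_id",
        (brochure_id,),
    ).fetchall()
    return [r[0] for r in rows]


print(brochure_count(1))  # 3
print(owners_of(28192))   # [2]
```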
What you want to achieve (not recommended): I think the fastest way to achieve your objective without changing the database structure is to fetch the value of your brochures column and then process it in your script. You really don't want to write a SQL statement to parse this kind of data. In PHP that would look something like this:
// Let's assume you already have your `brochures` column value in variable $brochures
// Capture each "BrochureN:" label together with the number that follows it;
// labels that have no number yet simply do not match.
preg_match_all('/Brochure(\d+):\s*(\d+)/', $brochures, $matches, PREG_SET_ORDER);
$brochures = array();
foreach ($matches as $m) {
    $brochures[$m[1]] = $m[2];
}
// Now you have $brochures array with keys representing the brochure number,
// and values representing the ID of brochure.
if (isset($brochures['3'])) {
    // that row has a defined Brochure3
} else {
    // ...
}
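For the "which numbers are missing" part of the question, the same parsing can feed a simple set difference. The following is only a sketch in Python, assuming the column values have already been fetched into a list of strings; `assigned_ids` and `missing_ids` are hypothetical helper names.

```python
import re

# "BrochureN:" followed by an assigned number; empty slots do not match.
BROCHURE_RE = re.compile(r"Brochure(\d+):\s*(\d+)")


def assigned_ids(column_value):
    """Return {brochure slot: assigned number} for the slots that are filled."""
    return {int(slot): int(num) for slot, num in BROCHURE_RE.findall(column_value)}


def missing_ids(column_values, low, high):
    """Return the numbers in [low, high] that appear in none of the values."""
    seen = set()
    for value in column_values:
        seen.update(assigned_ids(value).values())
    return [n for n in range(low, high + 1) if n not in seen]


values = [
    "Brochure1: 00001 Brochure2: 00002 Brochure3:",
    "Brochure1: 00005 Brochure2:",
]
print(missing_ids(values, 1, 6))  # [3, 4, 6]
```

For the full 00001-88888 range you would pass `low=1, high=88888`; with a normalized table the same difference could be computed from the brochure_id column directly.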

Is it more performant to have rows or columns in sql?

If I have to save many strings that are related and that may be divided into different languages, what's the best way to do it?
I think I have the following options. Options 1 and 3 are the clearest solutions to me. They have more columns, but result in fewer rows.
Options 2 and 4 are the most flexible ones (I could dynamically add a new string_x without changing the database). They have only three columns, but they will result in many rows.
Option 5 would result in many tables.
Option 1:
id | string_1 | string_2 | string_3 | string_4 | ... | string_n | lang
Option 2 *(where name would be string_1 or string_2 etc.)*
id | name | lang
Option 3
id | string_1 | string_2 | string_3 | string_4 | ... | string_n
id | lang | stringid
Option 4
id | lang | stringid
id | name
Option 5
id | string_1 | lang
id | string_2 | lang
id | ... |lang
I'm using it to store precached html values for multiple views (one line view, two lines, long description, etc.), if this is of interest.
Options 1 and 3 are not recommended, as you end up with the string identifier (which is data) in the field name. You have to change the database design if you want to add another string.
Option 5 is not recommended, as you end up with the string identifier (which is data) in the table name. You have to change the database design if you want to add another string.
Option 2 or 4 would work fine. Option 4 is more normalised, as you don't have duplicate string names, but option 2 might be easier to work with if you enter values directly into the table view.
Having many rows in a table is not a problem, that's what the database system is built for.
Although I've not had to deal specifically with multi-language interfaces, if translation is its only purpose, I would go with option 1, but swapped, something like
id English French German Spanish, etc...
So you would basically have a master column (such as English) as a "primary" word that is always populated, then as available, the other language columns get filled in. This way, you can keep adding as many "words" as you need, and if they get populated across all the different languages, so be it... If not, you still have a "primary" value that could be used.
It depends on a lot of other things. First of all, how many strings could there be? How many languages could there be? To simplify things, let's say that if either of those numbers is greater than 5, then options 1 and 3 are infeasible.
Before I go any further, you should definitely look into implementing multi-language functionality outside of the database. In PHP you can use Gettext and put your translation data in flat files. This is a better idea for multiple reasons, the main ones being performance and ease of use with external translators.
If you absolutely must do this in a database then you should use a table structure similar to this:
id | string | language
An example entry would be:
welcome_message | Hello, World! | english
Which I think is what you've described in Option 2. To clarify: because the number of languages and strings can grow, you should use a single table with a fixed number of fields.
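Here is a small sketch of that single-table layout, using Python's sqlite3 module purely as a stand-in database. The French row, the `translate` helper, and the fallback to English when a translation is missing are my own assumptions for illustration; only the three columns and the welcome_message entry come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE translations (
        id       TEXT NOT NULL,
        language TEXT NOT NULL,
        string   TEXT NOT NULL,
        PRIMARY KEY (id, language)
    )
    """
)
conn.executemany(
    "INSERT INTO translations (id, language, string) VALUES (?, ?, ?)",
    [
        ("welcome_message", "english", "Hello, World!"),
        ("welcome_message", "french", "Bonjour le monde"),  # made-up sample row
        ("goodbye_message", "english", "Goodbye!"),         # made-up sample row
    ],
)


def translate(string_id, language, fallback="english"):
    """Return the string in `language`, else in `fallback`, else None."""
    row = conn.execute(
        """
        SELECT string FROM translations
        WHERE id = ? AND language IN (?, ?)
        ORDER BY language = ? DESC  -- prefer the requested language
        LIMIT 1
        """,
        (string_id, language, fallback, language),
    ).fetchone()
    return row[0] if row else None


print(translate("welcome_message", "french"))  # Bonjour le monde
print(translate("goodbye_message", "french"))  # Goodbye!
```

Adding a language or a string is then just another INSERT, with no change to the schema.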
If you support only a few languages, you might also consider a schema in which each language is its own column:
ID EN ES FR Etc...
This is less normalized than your option 4, but it is very easy to work with. We have built our database translations like this. As we develop code, we create string resources and fill in the English text. Later, a translator fills in the strings for their language.

Hard-coding URLs vs Nested Set vs Combo in Content System

I've been putting together a database to handle content produced for a site, however, thinking about the long-term, I'm unsure if I have the best system.
At present I'm using the routing method of passing everything via index.php which .htaccess routes as follows index.php?route=example/url (user sees http://www.domain.com/example/url)
At present the database is setup like below:
uid | title | content | category
--------------------------------------------------
/ | Home | aaa | 1
/example | Example | bbb | 2
/example/url | Example: URL | ccc | 2
Though I am not sure if this is the best approach, especially if I wanted to rename example to something - I'd have to rename each URL...
So I've also thought about the Nested Set method (such as http://www.phpclasses.org/package/2547-PHP-Manipulate-database-records-in-hierarchical-trees.html), though this would just show lots of different numbers in the database, where I could access everything by its node. Example below:
node | left | right | name
--------------------------
1 | 1 | 6 | Home
2 | 2 | 5 | Example
3 | 3 | 4 | URL
Then I could use the node as the uid? But I'm unsure how I could translate http://www.domain.com/example/url to the uid equalling 3...
I already do have a category column in my database at the moment, to categorise the content, though I could potentially alter this.
I'm basically looking for suggestions about how to proceed, because as the site gets more content it will be harder to change the setup - so I want to ideally get this right from day one.
1. Which of the two is better for scalability?
2. If the second, how to translate the URL to the node?
3. Could I somehow combine both, so that the original database stores the uid as the node number, and then do a join of some sort to make the uid a URL (as in 1)?
^ I think I'd prefer this (the third), but I'm unsure how to do it in MySQL exactly. It would have some other benefits:
I could replace my category system with the parent node - which may be better
I could also then in theory store the node ID within a stats system, rather than a URL
If anyone can give some help/suggestions - I'd be grateful!
Well, if you use index.php?route=example/url, you could always do something like this:
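$args = explode( '/', $_GET['route'] );
// Sanitize the exploded segments, not the raw route string.
// (FILTER_SANITIZE_STRING is deprecated as of PHP 8.1.)
$args = filter_var_array( $args, FILTER_SANITIZE_STRING );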
Then your values of $args would be:
0 -> example
1 -> url
etc. You could then use these values to determine what template to load, and what content to grab from the database, or whatever else you're doing already.
HTH.
The nested set model probably is a good choice here. That'd result in a table layout like (id,left,right are the fields required by the nested set model, the others contain the respective content):
| id | left | right | uid | title | content | category |
More details on how to perform a particular query can be found here.
However I would not perform the look up on the database but a simple array cache:
$cache = array(
    '/' => array('content' => 'aaa', 'category' => 'bbbb'),
    '/example/' => array(),
    // .....
);
This cache can be built very easily (though building it is expensive) and queried very easily.
On a side note: I suspect you're trying to model page content here. Maybe you should refactor your database structure then, as this table would have two responsibilities (url->content mapping and content).
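As a sketch of how such a cache could be built from nested-set rows, here is a Python version that maps each URL to its node id in one pass. It assumes a single root that maps to "/", and it assumes the URL segment of a node is simply its lowercased name; both are my assumptions, as is the function name `build_url_cache`.

```python
# Nested-set rows from the question: (node, left, right, name).
ROWS = [
    (1, 1, 6, "Home"),
    (2, 2, 5, "Example"),
    (3, 3, 4, "URL"),
]


def build_url_cache(rows):
    """Map each URL path to its node id, visiting rows in order of `left`."""
    cache = {}
    stack = []  # (right, slug) of every ancestor of the row being visited
    for node, left, right, name in sorted(rows, key=lambda r: r[1]):
        # Drop ancestors whose interval closed before this row starts.
        while stack and stack[-1][0] < left:
            stack.pop()
        is_root = not stack
        # The root contributes no segment, so skip it with stack[1:].
        segments = [slug for _, slug in stack[1:]]
        if not is_root:
            segments.append(name.lower())
        cache["/" + "/".join(segments)] = node
        stack.append((right, name.lower()))
    return cache


print(build_url_cache(ROWS))
# {'/': 1, '/example': 2, '/example/url': 3}
```

Looking up http://www.domain.com/example/url then becomes `cache["/example/url"]`, which gives node 3.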

Get tree of single item in MySQL hierarchical database

I want to retrieve the path to a single node in a hierarchical database where only the parent node ID is stored as a reference. Could someone give me a query or some advice on how to write a query (ideally the first option - I'm a MySQL noob) so that all the node titles in the end node's path are given in a generated table?
id name depth
10 Top level 0
22 Second level 1
34 3rd level 2
43 End node 3
I want to use this data to create one of those "you are here" lists like:
Home > Forums > Stuffs > ... > Topics
Thanks for any help,
James
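With this structure, older MySQL versions can only do this for a fixed number of levels, because they have no recursive queries. MySQL 8.0 and later support recursive common table expressions (WITH RECURSIVE), which can walk up the parent chain.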
You can convert your data structure from the "adjacency list" model you have to the so-called "nested sets" model. With that model a "find the path to the top" query is possible.
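To show roughly what that conversion and the "path to the top" lookup involve, here is a sketch in Python working on in-memory rows. The parent IDs and the extra "Other branch" row are hypothetical (the question only shows the desired output), and `to_nested_sets` and `path_to` are illustrative names.

```python
from collections import defaultdict

# Adjacency-list rows: (id, parent_id, name). Parent IDs are assumed.
ROWS = [
    (10, None, "Top level"),
    (22, 10, "Second level"),
    (34, 22, "3rd level"),
    (43, 34, "End node"),
    (50, 10, "Other branch"),  # hypothetical sibling, not in the question
]


def to_nested_sets(rows):
    """Assign (left, right) bounds to every node with a depth-first walk."""
    children = defaultdict(list)
    for node_id, parent_id, _ in rows:
        children[parent_id].append(node_id)

    bounds = {}
    counter = 0

    def visit(node_id):
        nonlocal counter
        counter += 1
        left = counter
        for child in children[node_id]:
            visit(child)
        counter += 1
        bounds[node_id] = (left, counter)

    for root in children[None]:
        visit(root)
    return bounds


def path_to(node_id, rows, bounds):
    """Return (id, name, depth) for the node and its ancestors, root first.

    Roughly equivalent to: WHERE left <= :left AND right >= :right ORDER BY left
    """
    left, right = bounds[node_id]
    names = {nid: name for nid, _, name in rows}
    ancestors = [nid for nid, (l, r) in bounds.items() if l <= left and r >= right]
    ancestors.sort(key=lambda nid: bounds[nid][0])
    return [(nid, names[nid], depth) for depth, nid in enumerate(ancestors)]


bounds = to_nested_sets(ROWS)
for row in path_to(43, ROWS, bounds):
    print(row)
```

For node 43 this prints the four rows from the question's table, with depths 0 to 3; the sibling branch is excluded because its interval does not contain the node's interval.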

How to do a recursive query in Linq2Sql?

I have the following table, MenuItems, in the database:
ID ParentID Name
--- --------- -----
1 0 Item 1
2 1 Item 2
3 1 Item 3
4 0 Item 4
5 3 Item 5
I want to write an extension method to get all menu items to the root of the tree. Something like this:
// Extension methods must be static and live in a static class.
public static IQueryable<MenuItem> GetToRoot(this IQueryable<MenuItem> source, int menuItemID)
{
    return from m in source
           ????
           ????
           select m;
}
If I call this extension method with the data above for the menu item with ID 3, I should get:
ID ParentID Name
--- --------- -----
1 0 Item 1
3 1 Item 3
Is this possible with Linq2Sql with only one call to the database?
I don't think you'll be able to do it in a single query, and here's my thinking: discovering an item's parent effectively requires one join of the table with itself. Each additional menu level requires one more join of the table with itself. How many joins/additional levels will you need to reach the root? You won't know until you perform each one, right? So, whether on the database/SQL side or in LINQ to SQL, you'll have to take each step one at a time.
If you know your menu system won't go beyond a certain depth, I suppose you could set up a LINQ to SQL query that joins the table with itself that number of times, but that sounds ugly.
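The step-at-a-time walk described above can be sketched like this. It is written in Python against an in-memory dictionary rather than LINQ to SQL, so each loop iteration stands in for one parent lookup; `get_to_root` mirrors the name in the question but is otherwise my own illustration, and a ParentID of 0 is assumed to mean "no parent".

```python
# MenuItems rows from the question: ID -> (ParentID, Name).
MENU_ITEMS = {
    1: (0, "Item 1"),
    2: (1, "Item 2"),
    3: (1, "Item 3"),
    4: (0, "Item 4"),
    5: (3, "Item 5"),
}


def get_to_root(items, menu_item_id):
    """Follow ParentID links upward; return (ID, ParentID, Name) rows, root first."""
    chain = []
    seen = set()  # guards against accidental cycles in the data
    current = menu_item_id
    while current in items and current not in seen:
        seen.add(current)
        parent_id, name = items[current]
        chain.append((current, parent_id, name))
        current = parent_id  # 0 is not a key, so the loop stops at a root
    chain.reverse()
    return chain


print(get_to_root(MENU_ITEMS, 3))
# [(1, 0, 'Item 1'), (3, 1, 'Item 3')]
```

The number of iterations equals the depth of the item, which is exactly why the number of round trips is not known up front.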
What I would suggest is setting up an association of the table with itself in your DBML designer; that would give you a parent EntityRef<> property on the class. Since cycles are not allowed in your LoadOptions (and therefore the parent cannot be pre-loaded), you could force the lazy load of the parent in the entity's partial OnLoaded() method.
Here are some relevant SO questions:
https://stackoverflow.com/questions/1435229/hierarchy-problem-replace-recursion-with-linq-join
LINQ to SQL for self-referencing tables?
Here is a server-side/SQL treatment of the problem:
http://www.sqlteam.com/article/more-trees-hierarchies-in-sql
Here is someone who has written some helper code:
http://www.scip.be/index.php?Page=ArticlesNET18