Finding the lowest node in a given sub-tree of hierarchical data - mysql

Consider a database with these three tables:
category:
cat_id name parent_id
-----------------------
1 drinks 0
2 carbonated 1
3 cola 2
4 water 1
5 rc-cola 3
product:
prod_id name default_cat
-----------------------------------
1 cola-zero 2
2 mineral water 4
cat_prod:
cat_id prod_id
---------------
1 1
2 1
3 1
4 2
We have category hierarchy and a product, which may belong to several categories.
Each product also has a default category. Here the cola-zero product has default category 2 (carbonated), which is a mistake: the default category has to be 3 (cola), i.e., the lowest category in the category tree. However, I only need to consider a subset of the category tree: the categories that the product belongs to.
I need to update the default category of each product in the product table and ensure that product's default category is the most "defined" one, i.e., the lowest for a given product.
I can write a script, which would retrieve all categories, build the tree in memory and then for each product check the default category against this tree. But I hope there is a smarter way to do this via SQL only.
Is it even possible to do it in pure SQL?
Thanks.

If you store the hierarchy in a Closure Table, it's really easy to find the lowest node(s) in the tree:
SELECT c.descendant FROM closure c
JOIN (SELECT MAX(pathlength) AS pathlength FROM closure) x USING (pathlength);
To find the lowest node of a subtree, be specific about the starting node of the branch you want to search, both in the outer query and when computing the maximum path length:
SELECT c.descendant FROM closure c
JOIN (SELECT MAX(pathlength) AS pathlength FROM closure WHERE ancestor = 2) x USING (pathlength)
WHERE c.ancestor = 2;
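As a rough illustration of the closure-table idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The closure rows are written out by hand for the category tree from the question, and the helper name deepest_in_subtree is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE closure (ancestor INT, descendant INT, pathlength INT);
-- One row per (ancestor, descendant) pair, including each node paired with itself.
-- Tree: 1 drinks -> 2 carbonated -> 3 cola -> 5 rc-cola, and 1 drinks -> 4 water.
INSERT INTO closure VALUES
  (1,1,0),(1,2,1),(1,3,2),(1,4,1),(1,5,3),
  (2,2,0),(2,3,1),(2,5,2),
  (3,3,0),(3,5,1),
  (4,4,0),
  (5,5,0);
""")

def deepest_in_subtree(conn, root):
    """Return the descendant(s) of `root` with the longest path from `root`."""
    sql = """
    SELECT c.descendant
    FROM closure c
    WHERE c.ancestor = ?
      AND c.pathlength = (SELECT MAX(pathlength) FROM closure WHERE ancestor = ?)
    ORDER BY c.descendant
    """
    return [row[0] for row in conn.execute(sql, (root, root))]

print(deepest_in_subtree(conn, 2))  # [5]
```

The maximum path length is computed only over rows whose ancestor is the chosen root, so the result stays inside that subtree.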

I finally solved it. A bit dirty, since I have to create a temporary table to hold intermediate results, but it works.
Here is the full code:
-- schema
CREATE TABLE category
(
cat_id INT NOT NULL,
name VARCHAR(255) NOT NULL,
parent_id INT NOT NULL,
PRIMARY KEY (cat_id)
);
GO
CREATE TABLE product
(
prod_id INT NOT NULL,
name VARCHAR(255) NOT NULL,
default_cat INT NOT NULL,
PRIMARY KEY (prod_id)
);
GO
CREATE TABLE cat_prod
(
cat_id INT NOT NULL,
prod_id INT NOT NULL,
PRIMARY KEY (cat_id, prod_id),
FOREIGN KEY (cat_id) REFERENCES category(cat_id),
FOREIGN KEY (prod_id) REFERENCES product(prod_id)
);
GO
-- data
INSERT INTO category (cat_id, name, parent_id)
VALUES
(1, 'drinks', 0),
(2, 'carbonated', 1),
(3, 'cola', 2),
(4, 'water', 1),
(5, 'rc-cola', 3)
;
GO
INSERT INTO product (prod_id, name, default_cat)
VALUES
(1, 'cola-zero', 2), -- this is a mistake! must be 3
(2, 'mineral water', 4) -- this one should stay intact
;
GO
INSERT INTO cat_prod (cat_id, prod_id)
VALUES
(1, 1),
(2, 1),
(3, 1),
(4, 2),
(4, 1)
;
GO
-- stored proc
CREATE PROCEDURE iterate_products()
BEGIN
DECLARE prod_id INT;
DECLARE default_cat INT;
DECLARE new_default_cat INT;
DECLARE done INT DEFAULT FALSE;
DECLARE cur CURSOR FOR SELECT p.prod_id, p.default_cat FROM product p;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
-- temporary table to hold the category subtree for a given product
CREATE TABLE IF NOT EXISTS tmp_category_sub_tree
(
cat_id INT NOT NULL,
parent_id INT NOT NULL
);
OPEN cur;
UPDATE_LOOP: LOOP
FETCH cur INTO prod_id, default_cat;
IF done THEN
LEAVE UPDATE_LOOP;
END IF;
TRUNCATE TABLE tmp_category_sub_tree;
-- select all categories this product belongs to
INSERT INTO tmp_category_sub_tree (cat_id, parent_id)
SELECT category.cat_id, category.parent_id
FROM category
INNER JOIN cat_prod
ON category.cat_id = cat_prod.cat_id
WHERE
cat_prod.prod_id = prod_id;
-- select a leaf (only one)
SELECT t1.cat_id FROM
tmp_category_sub_tree AS t1 LEFT JOIN tmp_category_sub_tree AS t2
ON t1.cat_id = t2.parent_id
WHERE
t2.cat_id IS NULL
LIMIT 1
INTO NEW_DEFAULT_CAT;
-- update product record, if required
IF default_cat != new_default_cat THEN
UPDATE product
SET default_cat = new_default_cat
WHERE
product.prod_id = prod_id;
END IF;
END LOOP;
CLOSE cur;
DROP TABLE tmp_category_sub_tree;
END;
GO
Here is the SQLFiddle link: http://sqlfiddle.com/#!2/98a45/1
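For servers that support recursive common table expressions (MySQL 8.0 or later), the temporary table and cursor could probably be avoided. The following is only a sketch, run against Python's built-in sqlite3 module so it works without a server. It uses the data from the question, assumes the root category is the one with parent_id = 0, and the names DEEPEST and defaults are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE category (cat_id INT PRIMARY KEY, name TEXT, parent_id INT);
CREATE TABLE product  (prod_id INT PRIMARY KEY, name TEXT, default_cat INT);
CREATE TABLE cat_prod (cat_id INT, prod_id INT, PRIMARY KEY (cat_id, prod_id));
INSERT INTO category VALUES
  (1,'drinks',0),(2,'carbonated',1),(3,'cola',2),(4,'water',1),(5,'rc-cola',3);
INSERT INTO product VALUES (1,'cola-zero',2),(2,'mineral water',4);
INSERT INTO cat_prod VALUES (1,1),(2,1),(3,1),(4,2);
""")

# Compute the depth of every category, then keep, per product,
# the assigned category with the greatest depth.
DEEPEST = """
WITH RECURSIVE depth(cat_id, lvl) AS (
    SELECT cat_id, 0 FROM category WHERE parent_id = 0
    UNION ALL
    SELECT c.cat_id, d.lvl + 1
    FROM category c JOIN depth d ON c.parent_id = d.cat_id
)
SELECT cp.prod_id, cp.cat_id
FROM cat_prod cp JOIN depth d ON d.cat_id = cp.cat_id
WHERE d.lvl = (SELECT MAX(d2.lvl)
               FROM cat_prod cp2 JOIN depth d2 ON d2.cat_id = cp2.cat_id
               WHERE cp2.prod_id = cp.prod_id)
ORDER BY cp.prod_id, cp.cat_id
"""

# If two assigned categories tie on depth, the one with the higher cat_id wins here.
for prod_id, cat_id in conn.execute(DEEPEST).fetchall():
    conn.execute("UPDATE product SET default_cat = ? WHERE prod_id = ?",
                 (cat_id, prod_id))

defaults = dict(conn.execute("SELECT prod_id, default_cat FROM product").fetchall())
print(defaults)  # {1: 3, 2: 4}
```

The same CTE should carry over to MySQL 8.0+ with minor changes, but that has not been verified here.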


How to display 'InValid' records depending on, if the latest record is validated?

Goal
Retrieve InValid records but with some complexity. Let me explain.
Taking a look at this output: There is an InValid record for ItemId of 1.
Using the query below, I am able to see InValid records - to display to the user this Item needs to be Re-Checked.
SELECT * FROM `Records` WHERE IsValid = 0;
Problem
This is where I am stuck.
Highlighted in red is the InValid record, recorded on 28-09-2021. The next day, 29-09-2021, the same ItemId has a Valid record.
But the query below is no longer useful, because it still returns the InValid record even though the item was validated the next day.
SELECT * FROM `Records` WHERE IsValid = 0;
My idea to solve this problem (See Edit 1 below, for further details to this solution)
My idea would be to create a Trigger that will check if today's record is valid, if it is valid, then Update all the items to true where date is before today. From here, I can use the simple query above to see InValid records.
Also, I thought of creating a History table and a trigger to see what actions have been performed on the Records table.
Question
I am not sure my idea is appropriate: a trigger that updates all previous records would overwrite what actually happened, so the Records table would no longer show whether a record was valid at the time. My history table would still show the original values, though.
Is there a query I can use that avoids creating any triggers, or is it best to go with my solution?
Schema (MySQL v5.7)
-- Base
CREATE TABLE `Items` (
Id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
ItemName VARCHAR(30));
CREATE TABLE `Records` (
Id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
Date TIMESTAMP NOT NULL,
ItemId INT NOT NULL,
IsValid BOOLEAN NOT NULL,
FOREIGN KEY (ItemId) REFERENCES `Items`(Id));
-- History
CREATE TABLE `Records_History` LIKE `Records`;
ALTER TABLE `Records_History`
MODIFY COLUMN Id INT NOT NULL,
DROP PRIMARY KEY,
ADD Action VARCHAR(8) DEFAULT 'insert' FIRST;
CREATE TRIGGER Records_AfterInsert
AFTER INSERT ON `Records` FOR EACH ROW
INSERT INTO `Records_History` SELECT 'insert', d.*
FROM `Records` AS d WHERE d.Id = NEW.Id;
-- Insert Records
INSERT INTO `Items`
(ItemName)
VALUES
('Item 1'),
('Item 2');
INSERT INTO `Records`
(Date, ItemId, IsValid)
VALUES
('2021-09-28', 1, 0),
('2021-09-28', 2, 1),
('2021-09-29', 1, 1),
('2021-09-29', 2, 1);
Query #1
select * from `Records`;
Id   Date                  ItemId   IsValid
--------------------------------------------
1    2021-09-28 00:00:00   1        0
2    2021-09-28 00:00:00   2        1
3    2021-09-29 00:00:00   1        1
4    2021-09-29 00:00:00   2        1
Query #2
select * from `Records_History`;
Action   Id   Date                  ItemId   IsValid
-----------------------------------------------------
insert   1    2021-09-28 00:00:00   1        0
insert   2    2021-09-28 00:00:00   2        1
insert   3    2021-09-29 00:00:00   1        1
insert   4    2021-09-29 00:00:00   2        1
Edit 1
Unfortunately my solution is not an option. As I will hit this error:
Tried my solution: Error Code: 1442. Can't update table 'Records' in stored function/trigger because it is already used by statement which invoked this stored function
This basically means: I have a chance of causing an infinite loop.
This is what I have done to achieve my goal:
Schema (MySQL v5.7)
SET GLOBAL sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));
CREATE TABLE `Items` (
Id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
ItemName VARCHAR(30));
INSERT INTO `Items`
(ItemName)
VALUES
('Item 1'),
('Item 2');
CREATE TABLE `Records` (
Id INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
Date TIMESTAMP NOT NULL,
ItemId INT NOT NULL,
IsValid BOOLEAN NOT NULL,
FOREIGN KEY (ItemId) REFERENCES `Items`(Id));
INSERT INTO `Records`
(Date, ItemId, IsValid)
VALUES
('2021-09-28', 1, 1),
('2021-09-28', 2, 1),
('2021-09-29', 1, 0),
('2021-09-29', 2, 1),
('2021-09-30', 1, 1),
('2021-09-30', 2, 1),
('2021-10-01', 1, 0),
('2021-10-01', 2, 1);
Query #1
SELECT * FROM Records a WHERE IsValid = 0 AND ItemId NOT IN (
SELECT ItemId FROM Records b WHERE IsValid = 1 AND b.Date >= a.Date
) GROUP BY ItemId;
Id   Date                  ItemId   IsValid
--------------------------------------------
7    2021-10-01 00:00:00   1        0
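One possible alternative that does not require disabling ONLY_FULL_GROUP_BY is to look only at each item's most recent record and keep it if it is invalid. The sketch below runs on Python's built-in sqlite3 module with the same sample rows. It assumes an item has at most one record per date, and the name STILL_INVALID is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Records (Id INTEGER PRIMARY KEY, Date TEXT, ItemId INT, IsValid INT);
INSERT INTO Records (Date, ItemId, IsValid) VALUES
  ('2021-09-28', 1, 1), ('2021-09-28', 2, 1),
  ('2021-09-29', 1, 0), ('2021-09-29', 2, 1),
  ('2021-09-30', 1, 1), ('2021-09-30', 2, 1),
  ('2021-10-01', 1, 0), ('2021-10-01', 2, 1);
""")

# An item needs re-checking only if its latest record is invalid.
STILL_INVALID = """
SELECT r.Id, r.Date, r.ItemId
FROM Records r
WHERE r.IsValid = 0
  AND r.Date = (SELECT MAX(r2.Date) FROM Records r2 WHERE r2.ItemId = r.ItemId)
"""

rows = conn.execute(STILL_INVALID).fetchall()
print(rows)  # [(7, '2021-10-01', 1)]
```

The correlated subquery uses only standard SQL, so it should also be accepted by MySQL 5.7 with its default sql_mode.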

Link one item to other item in same table

I searched a lot but found nothing.
My scenario is:
I have database with two tables table_item and table_item_linked. table_item has many items. User will come and add item(s). Later other user come and link one item with other item(s) via a form with two dropdown.
What I did so far is:
Structure of table_item:
+-------------------+
| table_item |
+-------------------+
| item_id (Primary) |
| others |
| .... |
| .... |
| .... |
+-------------------+
Structure of table_item_linked:
+---------------------+
| table_item_linked |
+---------------------+
| linked_id | (Primary)
| item_id | (Foreign key referencing -> item_id of table_item)
| linked_items | (here I need to store ids of linked items)
| linked_by | (referencing to user_id of user_table)
| linked_timestamp | (timestamp)
+---------------------+
If I have items in table_item like:
A B C D E F G H
When I link D with G
I can successfully fetch G when I am fetching D or vice versa. But problem came when I
Link H with G
So I must fetch D H while fetching G.
(D H G are linked in all means and upon fetching one, the remaining two must be attached and fetched)
It is like a multiple relation (Many to Many relationship).
Guys I know there must be professional way to do it. I will like to have any guidance. I can even change my database structure.
PS:
Please don't suggest to add #tag as one item is exactly similar to the other linked.
UPDATES
Frontend looks like this. If I intend to link two records I will have two dropdowns as shown:
And If I check details of record A
And If I check details of record B
And If I check details of record C
Assuming your table_item looks like this:
create table table_item (
item_id int unsigned auto_increment not null,
record varchar(50),
primary key (item_id)
);
insert into table_item (record) values
('Record A'),
('Record B'),
('Record C'),
('Record D'),
('Record E'),
('Record F'),
('Record G'),
('Record H');
table_item_linked could then be
create table table_item_linked (
linked_id int unsigned auto_increment not null,
item1_id int unsigned not null,
item2_id int unsigned not null,
linked_by int unsigned not null,
linked_timestamp timestamp not null default now(),
primary key (linked_id),
unique key (item1_id, item2_id),
index (item2_id, item1_id),
foreign key (item1_id) references table_item(item_id),
foreign key (item2_id) references table_item(item_id)
);
This is basically a many-to-many relation between items of the same type.
Note that you usually don't need an AUTO_INCREMENT column here. You can remove it and define (item1_id, item2_id) as the PRIMARY KEY. Also, linked_by should be a FOREIGN KEY referencing the users table.
If a user (with ID 123) wants to link "Record A" (item_id = 1) with "Record B" (item_id = 2) and "Record B" (item_id = 2) with "Record C" (item_id = 3), your INSERT statements would be:
insert into table_item_linked (item1_id, item2_id, linked_by) values (1, 2, 123);
insert into table_item_linked (item1_id, item2_id, linked_by) values (2, 3, 123);
Now - When the user selects "Record A" (item_id = 1), you can get all related items with a recursive query (Requires at least MySQL 8.0 or MariaDB 10.2):
set @input_item_id = 1;
with recursive input as (
select @input_item_id as item_id
), rcte as (
select item_id from input
union distinct
select t.item2_id as item_id
from rcte r
join table_item_linked t on t.item1_id = r.item_id
union distinct
select t.item1_id as item_id
from rcte r
join table_item_linked t on t.item2_id = r.item_id
)
select i.*
from rcte r
join table_item i on i.item_id = r.item_id
where r.item_id <> (select item_id from input)
The result will be:
item_id record
———————————————————
2 Record B
3 Record C
In your application you would remove set @input_item_id = 1; and replace select @input_item_id as item_id with a placeholder: select ? as item_id. Then prepare the statement and bind item_id as a parameter.
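To illustrate the placeholder binding from application code, here is a small sketch using Python's built-in sqlite3 module in place of a MySQL driver. The recursive step is folded into a single SELECT with a CASE expression, which is a simplification of the query above rather than the same text. The table definitions are trimmed down, and the names LINKED and linked_items are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_item (item_id INTEGER PRIMARY KEY, record TEXT);
CREATE TABLE table_item_linked (item1_id INT, item2_id INT, linked_by INT);
INSERT INTO table_item (record) VALUES
  ('Record A'), ('Record B'), ('Record C'), ('Record D');
INSERT INTO table_item_linked VALUES (1, 2, 123), (2, 3, 123);
""")

# The ? placeholders are bound at execution time instead of using a session variable.
LINKED = """
WITH RECURSIVE rcte(item_id) AS (
    SELECT ?
    UNION
    -- follow a link in either direction and return the item on the other side
    SELECT CASE WHEN t.item1_id = r.item_id THEN t.item2_id ELSE t.item1_id END
    FROM rcte r
    JOIN table_item_linked t ON r.item_id IN (t.item1_id, t.item2_id)
)
SELECT i.item_id, i.record
FROM rcte r JOIN table_item i ON i.item_id = r.item_id
WHERE r.item_id <> ?
ORDER BY i.item_id
"""

def linked_items(conn, item_id):
    """Return all items directly or indirectly linked to item_id."""
    return conn.execute(LINKED, (item_id, item_id)).fetchall()

print(linked_items(conn, 1))  # [(2, 'Record B'), (3, 'Record C')]
```

UNION (rather than UNION ALL) removes already-seen ids, which is what stops the recursion on circular links.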
Update
If the server doesn't support recursive CTEs, you should consider storing redundant data in a separate table that is simple to query. A closure table would be an option, but it's not necessary here and might consume too much storage space. I would group items that are connected (directly and indirectly) into clusters.
Given the same schema as above, we define a new table table_item_cluster:
create table table_item_cluster (
item_id int unsigned not null,
cluster_id int unsigned not null,
primary key (item_id),
index (cluster_id, item_id),
foreign key (item_id) references table_item(item_id)
);
This table links items (item_id) to clusters (cluster_id). Since an item can belong only to one cluster, we can define item_id as primary key. It's also a foreign key referencing table_item.
When a new item is created, it's not connected to any other item and builds an own cluster. So when we insert a new item, we need also to insert a new row in table_item_cluster. For simplicity we identify the cluster by item_id (item_id = cluster_id). This can be done in the application code, or with the following trigger:
delimiter //
create trigger table_item_after_insert
after insert on table_item
for each row begin
-- create a new cluster for the new item
insert into table_item_cluster (item_id, cluster_id)
values (new.item_id, new.item_id);
end//
delimiter ;
When we link two items, we simply merge their clusters. The cluster_id for all items from the two merged clusters needs to be the same now. Here I would just take the least one of two. Again - we can do that in application code or with a trigger:
delimiter //
create trigger table_item_linked_after_insert
after insert on table_item_linked
for each row begin
declare cluster1_id, cluster2_id int unsigned;
set cluster1_id = (
select c.cluster_id
from table_item_cluster c
where c.item_id = new.item1_id
);
set cluster2_id = (
select c.cluster_id
from table_item_cluster c
where c.item_id = new.item2_id
);
-- merge the linked clusters
update table_item_cluster c
set c.cluster_id = least(cluster1_id, cluster2_id)
where c.cluster_id in (cluster1_id, cluster2_id);
end//
delimiter ;
Now - When we have an item and want to get all (directly and indirectly) linked items, we just select all items (except of the given item) from the same cluster:
select i.*
from table_item i
join table_item_cluster c on c.item_id = i.item_id
join table_item_cluster c1
on c1.cluster_id = c.cluster_id
and c1.item_id <> c.item_id -- exclude the given item
where c1.item_id = ?
The result for c1.item_id = 1 ("Record A") would be:
item_id record
———————————————————
2 Record B
3 Record C
But, as is almost always the case with redundant data, keeping it in sync with the source data can get quite complex. Adding and merging clusters is simple, but when you need to delete an item or a link, you might have to split a cluster, which may require recursive or iterative code to determine which items still belong together. A simple (and "stupid") algorithm would be to just remove and reinsert all affected items and links, and let the insert triggers do their work.
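If the rebuild is done in application code rather than by triggers, one way to recompute the clusters from the remaining links is a small union-find pass. This is only a sketch in Python. The function name rebuild_clusters is made up, and it follows the convention above of using the least item_id of a cluster as its cluster_id.

```python
def rebuild_clusters(item_ids, links):
    """Map every item_id to a cluster_id (the least item_id in its cluster)."""
    parent = {i: i for i in item_ids}

    def find(i):
        # Walk up to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in links:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Always keep the smaller root so the root is the least id.
            parent[max(ra, rb)] = min(ra, rb)

    return {i: find(i) for i in item_ids}


items = [1, 2, 3, 4, 5]
print(rebuild_clusters(items, [(1, 2), (2, 3), (4, 5)]))
# {1: 1, 2: 1, 3: 1, 4: 4, 5: 4}

# After deleting the link (2, 3), item 3 falls back into its own cluster.
print(rebuild_clusters(items, [(1, 2), (4, 5)]))
# {1: 1, 2: 1, 3: 3, 4: 4, 5: 4}
```

The resulting mapping could then be written back to table_item_cluster in one batch.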
Update 2
Last but not least: You can write a stored procedure, which will iterate through the links:
delimiter //
create procedure get_linked_items(in in_item_id int unsigned)
begin
set @ids := concat(in_item_id);
set @ids_next := @ids;
set @sql_tpl := "
select group_concat(distinct id order by id) into @ids_next
from (
select item2_id as id
from table_item_linked
where item1_id in ({params_in})
and item2_id not in ({params_not_in})
union all
select item1_id
from table_item_linked
where item2_id in ({params_in})
and item1_id not in ({params_not_in})
) x
";
while (@ids_next is not null) do
set @sql := @sql_tpl;
set @sql := replace(@sql, '{params_in}', @ids_next);
set @sql := replace(@sql, '{params_not_in}', @ids);
prepare stmt from @sql;
execute stmt;
set @ids := concat_ws(',', @ids, @ids_next);
end while;
set @sql := "
select *
from table_item
where item_id in ({params})
and item_id <> {in_item_id}
";
set @sql := replace(@sql, '{params}', @ids);
set @sql := replace(@sql, '{in_item_id}', in_item_id);
prepare stmt from @sql;
execute stmt;
end//
delimiter ;
To get all linked items of "Record A" (item_id = 1), you would use
call get_linked_items(1);
To explain it in pseudo code:
1. Initialize @ids and @ids_next with the input parameter.
2. Find all item IDs that are directly linked to any ID in @ids_next, except those already in @ids.
3. Store the result in @ids_next (overwriting it).
4. Append the IDs from @ids_next to @ids (merging the two sets into @ids).
5. If @ids_next is not empty, go to step 2.
6. Return all items with IDs in @ids.
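The same loop can be sketched outside the database. The following Python function mirrors the steps above on an in-memory list of (item1_id, item2_id) pairs. The function name linked_ids and the sample ids are made up for illustration.

```python
def linked_ids(links, start):
    """Return the ids linked directly or indirectly to `start`, excluding `start`."""
    ids = {start}        # everything found so far
    ids_next = {start}   # ids discovered in the previous round
    while ids_next:
        found = set()
        for a, b in links:
            # Links are undirected, so follow them both ways.
            if a in ids_next:
                found.add(b)
            if b in ids_next:
                found.add(a)
        ids_next = found - ids   # keep only ids not seen before
        ids |= ids_next
    return sorted(ids - {start})


# D=4 linked with H=8, and H=8 linked with G=7
print(linked_ids([(4, 8), (8, 7)], 7))  # [4, 8]
```

Each round only expands the ids found in the previous round, so the loop ends once a round discovers nothing new.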
The obvious solution is to store one row for each link in table_item_linked.
Your table then becomes
+---------------------+
| table_item_linked |
+---------------------+
| linked_id | (Primary)
| from_item_id | (the item linked _from_ -> item_id of table_item)
| to_item_id | (the item linked _to_ -> item_id of table_item)
| linked_by | (referencing to user_id of user_table)
| linked_timestamp | (timestamp)
+---------------------+
In your example, the data would be:
linked_id from_item_id to_item_id linked_by linked_timestamp
------------------------------------------------------------------------
1 D H sd '1 jan 2020'
2 H G sa '2 Jan 2020'
You then need to write a hierarchical query to retrieve all the "children" of G.

get list of deleted primary key from auto increment primary key

I have a table on which id is a primary key column set with auto increment. It contains over 10,00 rows.
I need to get all primary keys that have been deleted.
like
1 xcgh fct
2 xxml fcy
5 ccvb fcc
6 tylu cvn
9 vvbh cvv
The result that i should get is
3
4
7
8
Currently I count all records, insert the numbers 1 to count into another table, and then select the ids from that table that don't exist in the record table. But this method is very inefficient. Is there a direct query that I can use?
Please answer for MySQL.
See fiddle:
http://sqlfiddle.com/#!2/edf67/4/0
CREATE TABLE SomeTable (
id INT PRIMARY KEY
, mVal VARCHAR(32)
);
INSERT INTO SomeTable
VALUES (1, 'xcgh fct'),
(2, 'xxml fcy'),
(5, 'ccvb fcc'),
(6, 'tylu cvn'),
(9, 'vvbh cvv');
set @rank = (Select max(ID)+1 from sometable);
create table CompleteIDs as (Select @rank :=@rank-1 as Rank
from sometable st1, sometable st2
where @rank >1);
SELECT CompleteIDs.Rank
FROM CompleteIDs
LEFT JOIN someTable S1
on CompleteIDs.Rank = S1.ID
WHERE S1.ID is null
order by CompleteIDs.rank
There is one assumption here: the number of records in sometable multiplied by itself (the size of the cross join) must be greater than the maximum ID in sometable. Otherwise this doesn't work.
You can try to create a temp table, fill it with e.g. 1,000 values, you can do it using any scripting language or try a procedure (This might be not-effective overall)
DELIMITER $$
CREATE PROCEDURE InsertRand(IN NumRows INT)
BEGIN
DECLARE i INT;
SET i = 1;
START TRANSACTION;
WHILE i <= NumRows DO
INSERT INTO rand VALUES (i);
SET i = i + 1;
END WHILE;
COMMIT;
END$$
DELIMITER ;
CALL InsertRand(5);
Then you just do query
SELECT id AS deleted_id FROM temporary_table
WHERE id NOT IN
(SELECT id FROM main_table)
Please note that this is very memory-inefficient, so it should be run only occasionally (for example, once a day) rather than on every request.
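Where recursive CTEs are available (MySQL 8.0 or later), the helper table of consecutive numbers could be generated inside the query itself. The sketch below shows the idea with Python's built-in sqlite3 module and the sample rows from the question. The names MISSING and missing are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SomeTable (id INT PRIMARY KEY, mVal TEXT);
INSERT INTO SomeTable VALUES
  (1, 'xcgh fct'), (2, 'xxml fcy'), (5, 'ccvb fcc'),
  (6, 'tylu cvn'), (9, 'vvbh cvv');
""")

# Generate 1..MAX(id) on the fly, then keep the numbers with no matching row.
MISSING = """
WITH RECURSIVE seq(n) AS (
    SELECT 1
    UNION ALL
    SELECT n + 1 FROM seq WHERE n < (SELECT MAX(id) FROM SomeTable)
)
SELECT seq.n
FROM seq LEFT JOIN SomeTable s ON s.id = seq.n
WHERE s.id IS NULL
ORDER BY seq.n
"""

missing = [row[0] for row in conn.execute(MISSING)]
print(missing)  # [3, 4, 7, 8]
```

On MySQL the recursion depth is capped by the cte_max_recursion_depth system variable (1000 by default), so a table with larger ids would need that limit raised for the session.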

Recursive mysql select?

I saw this answer and i hope he is incorrect, just like someone was incorrect telling primary keys are on a column and I can't set it on multiple columns.
Here is my table
create table Users(id INT primary key AUTO_INCREMENT,
parent INT,
name TEXT NOT NULL,
FOREIGN KEY(parent)
REFERENCES Users(id)
);
+----+--------+---------+
| id | parent | name |
+----+--------+---------+
| 1 | NULL | root |
| 2 | 1 | one |
| 3 | 1 | 1down |
| 4 | 2 | one_a |
| 5 | 4 | one_a_b |
+----+--------+---------+
I'd like to select user id 2 and recurse so that I get all of its direct and indirect children (so ids 4 and 5).
How do I write it in such a way that this will work? I have seen recursion in PostgreSQL and SQL Server.
CREATE DEFINER = 'root'@'localhost'
PROCEDURE test.GetHierarchyUsers(IN StartKey INT)
BEGIN
-- prepare a hierarchy level variable
SET @hierlevel := 00000;
-- prepare a variable for total rows so we know when no more rows found
SET @lastRowCount := 0;
-- pre-drop temp table
DROP TABLE IF EXISTS MyHierarchy;
-- now, create it as the first level you want...
-- ie: a specific top level of all "no parent" entries
-- or parameterize the function and ask for a specific "ID".
-- add extra column as flag for next set of ID's to load into this.
CREATE TABLE MyHierarchy AS
SELECT U.ID
, U.Parent
, U.`name`
, 00 AS IDHierLevel
, 00 AS AlreadyProcessed
FROM
Users U
WHERE
U.ID = StartKey;
-- how many rows are we starting with at this tier level
-- START the cycle, only IF we found rows...
SET @lastRowCount := FOUND_ROWS();
-- we need to have a "key" for updates to be applied against,
-- otherwise our UPDATE statement will nag about an unsafe update command
CREATE INDEX MyHier_Idx1 ON MyHierarchy (IDHierLevel);
-- NOW, keep cycling through until we get no more records
WHILE @lastRowCount > 0
DO
UPDATE MyHierarchy
SET
AlreadyProcessed = 1
WHERE
IDHierLevel = @hierLevel;
-- NOW, load in all entries found from full-set NOT already processed
INSERT INTO MyHierarchy
SELECT DISTINCT U.ID
, U.Parent
, U.`name`
, @hierLevel + 1 AS IDHierLevel
, 0 AS AlreadyProcessed
FROM
MyHierarchy mh
JOIN Users U
ON mh.Parent = U.ID
WHERE
mh.IDHierLevel = @hierLevel;
-- preserve latest count of records accounted for from above query
-- now, how many actual rows DID we insert from the select query
SET @lastRowCount := ROW_COUNT();
-- only mark the LOWER level we just joined against as processed,
-- and NOT the new records we just inserted
UPDATE MyHierarchy
SET
AlreadyProcessed = 1
WHERE
IDHierLevel = @hierLevel;
-- now, update the hierarchy level
SET @hierLevel := @hierLevel + 1;
END WHILE;
-- return the final set now
SELECT *
FROM
MyHierarchy;
-- and we can clean-up after the query of data has been selected / returned.
-- drop table if exists MyHierarchy;
END
It might appear cumbersome, but to use this, do
call GetHierarchyUsers( 5 );
(or whatever key ID you want to find UP the hierarchical tree for).
The premise is to start with the one KEY you are working with. Then, use that as a basis to join to the users table AGAIN, but based on the first entry's PARENT ID. Once found, update the temp table as to not try and join for that key again on the next cycle. Then keep going until no more "parent" ID keys can be found.
This will return the entire hierarchy of records up to the top-level parent, no matter how deep the nesting. However, if you only want the FINAL parent, you can use the @hierlevel variable to return only the most recently added row, or use ORDER BY and LIMIT 1.
I know there is probably better and more efficient answer above but this snippet gives a slightly different approach and provides both - ancestors and children.
The idea is to keep inserting related row ids into a temporary table, then fetch a row and look for its relatives, and repeat until all rows are processed. The query can probably be optimized to use only one temporary table.
Here is a working sqlfiddle example.
CREATE TABLE Users
(`id` int, `parent` int,`name` VARCHAR(10))//
INSERT INTO Users
(`id`, `parent`, `name`)
VALUES
(1, NULL, 'root'),
(2, 1, 'one'),
(3, 1, '1down'),
(4, 2, 'one_a'),
(5, 4, 'one_a_b')//
CREATE PROCEDURE getAncestors (in ParRowId int)
BEGIN
DECLARE tmp_parentId int;
CREATE TEMPORARY TABLE tmp (parentId INT NOT NULL);
CREATE TEMPORARY TABLE results (parentId INT NOT NULL);
INSERT INTO tmp SELECT ParRowId;
WHILE (SELECT COUNT(*) FROM tmp) > 0 DO
SET tmp_parentId = (SELECT MIN(parentId) FROM tmp);
DELETE FROM tmp WHERE parentId = tmp_parentId;
INSERT INTO results SELECT parent FROM Users WHERE id = tmp_parentId AND parent IS NOT NULL;
INSERT INTO tmp SELECT parent FROM Users WHERE id = tmp_parentId AND parent IS NOT NULL;
END WHILE;
SELECT * FROM Users WHERE id IN (SELECT * FROM results);
END//
CREATE PROCEDURE getChildren (in ParRowId int)
BEGIN
DECLARE tmp_childId int;
CREATE TEMPORARY TABLE tmp (childId INT NOT NULL);
CREATE TEMPORARY TABLE results (childId INT NOT NULL);
INSERT INTO tmp SELECT ParRowId;
WHILE (SELECT COUNT(*) FROM tmp) > 0 DO
SET tmp_childId = (SELECT MIN(childId) FROM tmp);
DELETE FROM tmp WHERE childId = tmp_childId;
INSERT INTO results SELECT id FROM Users WHERE parent = tmp_childId;
INSERT INTO tmp SELECT id FROM Users WHERE parent = tmp_childId;
END WHILE;
SELECT * FROM Users WHERE id IN (SELECT * FROM results);
END//
Usage:
CALL getChildren(2);
-- returns
id parent name
4 2 one_a
5 4 one_a_b
CALL getAncestors(5);
-- returns
id parent name
1 (null) root
2 1 one
4 2 one_a
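On servers with recursive CTE support (MySQL 8.0 or later), both directions can probably be written as plain queries instead of procedures. Here is a minimal sketch on the same Users data, run through Python's built-in sqlite3 module. The names DESCENDANTS, ANCESTORS and ids are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Users (id INTEGER PRIMARY KEY,
                    parent INT REFERENCES Users(id),
                    name TEXT NOT NULL);
INSERT INTO Users VALUES
  (1, NULL, 'root'), (2, 1, 'one'), (3, 1, '1down'),
  (4, 2, 'one_a'), (5, 4, 'one_a_b');
""")

# Walk down: start with the direct children, then keep joining on parent.
DESCENDANTS = """
WITH RECURSIVE sub(id) AS (
    SELECT id FROM Users WHERE parent = ?
    UNION ALL
    SELECT u.id FROM Users u JOIN sub ON u.parent = sub.id
)
SELECT id FROM sub ORDER BY id
"""

# Walk up: start with the parent of the given row, then keep following parent.
ANCESTORS = """
WITH RECURSIVE up(id) AS (
    SELECT parent FROM Users WHERE id = ?
    UNION ALL
    SELECT u.parent FROM Users u JOIN up ON u.id = up.id
)
SELECT id FROM up WHERE id IS NOT NULL ORDER BY id
"""

def ids(sql, start):
    return [row[0] for row in conn.execute(sql, (start,))]

print(ids(DESCENDANTS, 2))  # [4, 5]
print(ids(ANCESTORS, 5))    # [1, 2, 4]
```

Both queries assume the parent links contain no cycles, since UNION ALL would otherwise recurse until the engine's limit is hit.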

Connect By Prior Equivalent for MySQL

All,
I have three fields in a table that define a parent child relationship present in a MySQL database version 5.0 . The table name is tb_Tree and it has the following data:
Table Name: tb_Tree
Id | ParentId | Name
--------------------
1 | 0 | Fruits
2 | 0 | Vegetables
3 | 1 | Apple
4 | 1 | Orange
5 | 2 | Cabbage
6 | 2 | Eggplant
How do I write a Query to get all the children if a ParentId is specified. Note that the table entries given are just sample data and they can have many more rows. Oracle has a "CONNECT BY PRIOR" clause, but I didn't find anything similar for MySQL. Can anyone please advise?
Thanks
MySQL (before version 8.0) doesn't support recursive queries, so you have to do it the hard way:
1. Select the rows where ParentID = X, where X is your root.
2. Collect the Id values from (1).
3. Repeat (1) for each Id from (2).
4. Keep recursing by hand until you find all the leaf nodes.
If you know a maximum depth then you can join your table to itself (using LEFT OUTER JOINs) out to the maximum possible depth and then clean up the NULLs.
You could also change your tree representation to nested sets.
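The level-by-level procedure from the steps above can be driven from application code with one query per level. The sketch below uses Python's built-in sqlite3 module with the tb_Tree sample data. The function name all_children is made up, and the loop assumes the tree contains no cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tb_Tree (Id INT PRIMARY KEY, ParentId INT, Name TEXT);
INSERT INTO tb_Tree VALUES
  (1, 0, 'Fruits'), (2, 0, 'Vegetables'), (3, 1, 'Apple'),
  (4, 1, 'Orange'), (5, 2, 'Cabbage'), (6, 2, 'Eggplant');
""")

def all_children(conn, root_id):
    """Collect every descendant of root_id, one tree level per query."""
    found = []
    level = [root_id]
    while level:
        # One placeholder per id of the current level.
        marks = ",".join("?" * len(level))
        rows = conn.execute(
            f"SELECT Id, Name FROM tb_Tree WHERE ParentId IN ({marks}) ORDER BY Id",
            level,
        ).fetchall()
        found.extend(rows)
        level = [row[0] for row in rows]  # the next level to expand
    return found

print(all_children(conn, 1))  # [(3, 'Apple'), (4, 'Orange')]
```

This costs one round trip per level of depth rather than one per node, which is usually acceptable for shallow trees.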
This might be a late post.
With MySQL 8 you can achieve this with a recursive WITH clause. Here is an example:
with recursive cte (id, name, parent_id) as (
select id,
name,
parent_id
from products
where parent_id = 19
union all
select p.id,
p.name,
p.parent_id
from products p
inner join cte
on p.parent_id = cte.id
)
select * from cte;
For more help, look for other threads on this topic. I hope this helps someone.
You can also look at this interesting blog post, which demonstrates how to get similar results in MySQL:
http://explainextended.com/2009/03/17/hierarchical-queries-in-mysql/
This is an old thread, but since I got the question in another forum I thought I'd add it here. For this case, I created a stored procedure that is hard-coded to handle the specific case. This does, of course, have some drawbacks, since not all users can create stored procedures at will, but it works nevertheless.
Consider the following table with nodes and children:
CREATE TABLE nodes (
parent INT,
child INT
);
INSERT INTO nodes VALUES
( 5, 2), ( 5, 3),
(18, 11), (18, 7),
(17, 9), (17, 8),
(26, 13), (26, 1), (26,12),
(15, 10), (15, 5),
(38, 15), (38, 17), (38, 6),
(NULL, 38), (NULL, 26), (NULL, 18);
With this table, the following stored procedure will compute a result set consisting of all the descendants of the node provided:
delimiter $$
CREATE PROCEDURE find_parts(seed INT)
BEGIN
-- Temporary storage
DROP TABLE IF EXISTS _result;
CREATE TEMPORARY TABLE _result (node INT PRIMARY KEY);
-- Seeding
INSERT INTO _result VALUES (seed);
-- Iteration
DROP TABLE IF EXISTS _tmp;
CREATE TEMPORARY TABLE _tmp LIKE _result;
REPEAT
TRUNCATE TABLE _tmp;
INSERT INTO _tmp SELECT child AS node
FROM _result JOIN nodes ON node = parent;
INSERT IGNORE INTO _result SELECT node FROM _tmp;
UNTIL ROW_COUNT() = 0
END REPEAT;
DROP TABLE _tmp;
SELECT * FROM _result;
END $$
delimiter ;
The below select lists all plants and their parentid up to 4-level (and of course you can extend the level):
select id, name, parentid
,(select parentid from tb_tree where id=t.parentid) parentid2
,(select parentid from tb_tree where id=(select parentid from tb_tree where id=t.parentid)) parentid3
,(select parentid from tb_tree where id=(select parentid from tb_tree where id=(select parentid from tb_tree where id=t.parentid))) parentid4
from tb_tree t
and then you can use this query to get the final result. for example, you can get all children of "Fruits" by the below sql:
select id ,name from (
select id, name, parentid
,(select parentid from tb_tree where id=t.parentid) parentid2
,(select parentid from tb_tree where id=(select parentid from tb_tree where id=t.parentid)) parentid3
,(select parentid from tb_tree where id=(select parentid from tb_tree where id=(select parentid from tb_tree where id=t.parentid))) parentid4
from tb_tree t) tt
where ifnull(parentid4,0)=1 or ifnull(parentid3,0)=1 or ifnull(parentid2,0)=1 or ifnull(parentid,0)=1
The stored procedure below orders a table whose rows each hold a back reference to the previous row. Note that in the first step I copy the rows that match some condition into a temp table. In my case those are rows that belong to the same linear (a road used in GPS navigation). The business domain is not important; in my case I am simply sorting segments that belong to the same road.
DROP PROCEDURE IF EXISTS orderLocations;
DELIMITER //
CREATE PROCEDURE orderLocations(_full_linear_code VARCHAR(11))
BEGIN
DECLARE _code VARCHAR(11);
DECLARE _id INT(4);
DECLARE _count INT(4);
DECLARE _pos INT(4);
DROP TEMPORARY TABLE IF EXISTS temp_sort;
CREATE TEMPORARY TABLE temp_sort (
id INT(4) PRIMARY KEY,
pos INT(4),
code VARCHAR(11),
prev_code VARCHAR(11)
);
-- copy all records to sort into temp table - this way sorting would go all in memory
INSERT INTO temp_sort SELECT
id, -- this is primary key of original table
NULL, -- this is position that still to be calculated
full_tmc_code, -- this is a column that references sorted by
negative_offset -- this is a reference to the previous record (will be blank for the first)
FROM tmc_file_location
WHERE linear_full_tmc_code = _full_linear_code;
-- this is how many records we have to sort / update position
SELECT count(*)
FROM temp_sort
INTO _count;
-- first position index
SET _pos = 1;
-- pick first record that has no prior record
SELECT
code,
id
FROM temp_sort l
WHERE prev_code IS NULL
INTO _code, _id;
-- update position of the first record
UPDATE temp_sort
SET pos = _pos
WHERE id = _id;
-- all other go by chain link
WHILE (_pos < _count) DO
SET _pos = _pos +1;
SELECT
code,
id
FROM temp_sort
WHERE prev_code = _code
INTO _code, _id;
UPDATE temp_sort
SET pos = _pos
WHERE id = _id;
END WHILE;
-- join two tables and return position along with all other fields
SELECT
t.pos,
l.*
FROM tmc_file_location l, temp_sort t
WHERE t.id = l.id
ORDER BY t.pos;
END;
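For comparison, the same chain ordering can be sketched in application code with a single pass over a dictionary keyed by the back reference. This is a Python illustration only. The function name order_by_chain and the sample rows are made up, and it assumes each row has a unique prev_code with exactly one row having none.

```python
def order_by_chain(rows):
    """Order (id, code, prev_code) rows by following prev_code links.

    The first row is the one whose prev_code is None.
    """
    # Index every row by the code of the row that precedes it.
    by_prev = {prev: (row_id, code) for row_id, code, prev in rows}
    ordered = []
    current = by_prev.get(None)
    # The length check guards against malformed data that would loop forever.
    while current is not None and len(ordered) < len(by_prev):
        ordered.append(current[0])
        current = by_prev.get(current[1])
    return ordered


segments = [(10, 'C', 'B'), (11, 'A', None), (12, 'B', 'A')]
print(order_by_chain(segments))  # [11, 12, 10]
```

Each step is a dictionary lookup, whereas the procedure above issues one SELECT and one UPDATE per row.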