Preventing circular joining, recursive searches - mysql

So in my situation I have three tables: list, item and list_relation.
Each item will be linked to a list through the list_id foreign key.
The list_relation table looks like this:
CREATE TABLE list_relation
(
    parent_id INT UNSIGNED NOT NULL,
    child_id INT UNSIGNED NOT NULL,
    UNIQUE (parent_id, child_id),
    FOREIGN KEY (parent_id)
        REFERENCES list (id)
        ON DELETE CASCADE,
    FOREIGN KEY (child_id)
        REFERENCES list (id)
        ON DELETE CASCADE
);
I want to be able to inherit from multiple lists as well (which includes the related items).
For example, I have lists 1, 2, and 3.
I was wondering if there is any SQL way to prevent a circular relation, e.g.:
List 1 inherits from List 3, List 2 inherits from List 1, List 3 inherits from List 1.
1 -> 2 -> 3 -> 1
My current idea is that I would have to validate the desired inheritance first, checking whether it would be circular, and only then insert it into the DB.

If you use MySQL 8.0 or MariaDB 10.2 (or higher) you can try recursive CTEs (common table expressions).
Assuming the following schema and data:
CREATE TABLE `list_relation` (
`child_id` int unsigned NOT NULL,
`parent_id` int unsigned NOT NULL,
PRIMARY KEY (`child_id`,`parent_id`)
);
insert into list_relation (child_id, parent_id) values
(2,1),
(3,1),
(4,2),
(4,3),
(5,3);
Now you try to insert a new row with child_id = 1 and parent_id = 4. But that would create cyclic relations (1->4->2->1 and 1->4->3->1), which you want to prevent. To find out if a reverse relation already exists, you can use the following query, which will show all parents of list 4 (including inherited/transitive parents):
set @new_child_id = 1;
set @new_parent_id = 4;
with recursive rcte as (
    select *
    from list_relation r
    where r.child_id = @new_parent_id
    union all
    select r.*
    from rcte
    join list_relation r on r.child_id = rcte.parent_id
)
select * from rcte
The result would be:
child_id | parent_id
4 | 2
4 | 3
2 | 1
3 | 1
You can see in the result that list 1 is one of the parents of list 4, so you wouldn't insert the new record.
Since you only want to know whether list 1 is in the result, you can change the last line to
select * from rcte where parent_id = @new_child_id limit 1
or to
select exists (select * from rcte where parent_id = @new_child_id)
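A minimal sketch of running this check from application code before the insert, which is what the question proposes. It uses Python's built-in sqlite3 module as a stand-in for MySQL 8.0, only because SQLite also supports WITH RECURSIVE; the helper name would_create_cycle and the in-memory database are assumptions for illustration. It uses union instead of union all so the recursion also terminates if the table already contains a cycle.

```python
import sqlite3

def would_create_cycle(conn, new_child_id, new_parent_id):
    """Return True if new_child_id is already a (transitive) parent of new_parent_id."""
    if new_child_id == new_parent_id:
        return True  # a list inheriting from itself is the shortest possible cycle
    row = conn.execute(
        """
        with recursive rcte(parent_id) as (
            select parent_id from list_relation where child_id = :new_parent
            union
            select r.parent_id
            from rcte
            join list_relation r on r.child_id = rcte.parent_id
        )
        select exists (select 1 from rcte where parent_id = :new_child)
        """,
        {"new_parent": new_parent_id, "new_child": new_child_id},
    ).fetchone()
    return bool(row[0])

# Same sample data as in the answer, loaded into an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table list_relation ("
    "child_id int, parent_id int, primary key (child_id, parent_id))"
)
conn.executemany(
    "insert into list_relation values (?, ?)",
    [(2, 1), (3, 1), (4, 2), (4, 3), (5, 3)],
)

print(would_create_cycle(conn, 1, 4))  # True: list 1 is already a parent of list 4
print(would_create_cycle(conn, 5, 4))  # False: list 5 is not a parent of list 4
```

With a MySQL driver the placeholder style would differ, and the check and the insert would need to run in the same transaction to be safe against concurrent writers.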
BTW: You can use the same query to prevent redundant relations.
Assume you want to insert the record with child_id = 4 and parent_id = 1. This would be redundant, since list 4 already inherits from list 1 via list 2 and list 3. The following query would show you that:
set @new_child_id = 4;
set @new_parent_id = 1;
with recursive rcte as (
    select *
    from list_relation r
    where r.child_id = @new_child_id
    union all
    select r.*
    from rcte
    join list_relation r on r.child_id = rcte.parent_id
)
select exists (select * from rcte where parent_id = @new_parent_id)
And you can use a similar query to get all inherited items:
set @list = 4;
with recursive rcte (list_id) as (
    select @list
    union distinct
    select r.parent_id
    from rcte
    join list_relation r on r.child_id = rcte.list_id
)
select distinct i.*
from rcte
join item i on i.list_id = rcte.list_id
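To see what the inherited-items query returns, here is a small sketch that runs the same shape of query through Python's sqlite3 module (again only a stand-in for MySQL 8.0). The item table layout (id, list_id, name) and the sample item names are assumptions, since the question does not show that table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table list_relation (child_id int, parent_id int, primary key (child_id, parent_id));
    -- hypothetical item table: one item per list, named 'a' to 'e'
    create table item (id integer primary key, list_id int, name text);
    insert into list_relation values (2,1),(3,1),(4,2),(4,3),(5,3);
    insert into item (list_id, name) values (1,'a'),(2,'b'),(3,'c'),(4,'d'),(5,'e');
""")

def inherited_items(conn, list_id):
    """Names of the items of list_id and of all lists it inherits from."""
    rows = conn.execute(
        """
        with recursive rcte(list_id) as (
            select ?
            union
            select r.parent_id
            from rcte
            join list_relation r on r.child_id = rcte.list_id
        )
        select i.name
        from rcte
        join item i on i.list_id = rcte.list_id
        order by i.name
        """,
        (list_id,),
    ).fetchall()
    return [name for (name,) in rows]

print(inherited_items(conn, 4))  # ['a', 'b', 'c', 'd']: own item plus those of lists 2, 3 and 1
print(inherited_items(conn, 5))  # ['a', 'c', 'e']: own item plus those of lists 3 and 1
```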

This is for those who do not have MySQL 8.0 or MariaDB 10.2 and would like to use a recursive method in MySQL 5.7. I just hope you don't have to exceed the maximum recursion depth of 255 (see the manual) :)
MySQL does not allow recursive functions; however, it does allow recursive procedures. Combining the two, you can have a nice little function which you can use in any SELECT statement.
The recursive SP takes two input parameters and one output parameter. The first input is the ID you are searching the node tree for; the second input is used by the SP to preserve results during the execution. The third parameter is the output parameter, which carries the end result.
CREATE DEFINER=`root`@`localhost` PROCEDURE `sp_list_relation_recursive`(
    in itemId text,
    in iPreserve text,
    out oResult text
)
BEGIN
    DECLARE ChildId text default null;
    IF (coalesce(itemId,'') = '') then
        -- when no id is received, return whatever we have in the preserve container
        set oResult = iPreserve;
    ELSE
        -- add the received id to the preserving container
        SET iPreserve = concat_ws(',',iPreserve,itemId);
        SET oResult = iPreserve;
        SET ChildId =
        (
            coalesce(
                (
                    Select
                        group_concat(TNode.child_id separator ',') -- get all children
                    from
                        list_relation as TNode
                    WHERE
                        not find_in_set(TNode.child_id, iPreserve) -- if we don't already have them
                        AND find_in_set(TNode.parent_id, itemId) -- from these parents
                )
            ,'')
        );
        IF length(ChildId) > 0 THEN
            -- one or more children found, recursively search again for further child elements
            CALL sp_list_relation_recursive(ChildId,iPreserve,oResult);
        END IF;
    END IF;
    -- uncomment this to see the progress looping steps
    -- select ChildId,iPreserve,oResult;
END
Test it:
SET MAX_SP_RECURSION_DEPTH = 250;
set @list = '';
call test.sp_list_relation_recursive(1,'',@list);
select @list;
+----------------+
| @list          |
+----------------+
| ,1,2,3,6,4,4,5 |
+----------------+
Don't mind the duplicate entries or the extra commas; we just want to know whether an element exists in the node tree, without a lot of ifs and whens.
It looks fine so far, but an SP can't be used in a SELECT statement, so we just create a wrapper function for this SP.
CREATE DEFINER=`root`@`localhost` FUNCTION `fn_list_relation_recursive`(
    NodeId int
) RETURNS text CHARSET utf8
    READS SQL DATA
    DETERMINISTIC
BEGIN
    /*
    Returns a tree of nodes
    branches out all possible branches
    */
    DECLARE mTree mediumtext;
    SET MAX_SP_RECURSION_DEPTH = 250;
    call sp_list_relation_recursive(NodeId,'',mTree);
    RETURN mTree;
END
Now check it in action:
SELECT
*,
FN_LIST_RELATION_RECURSIVE(parent_id) AS parents_children
FROM
list_relation;
+----------+-----------+------------------+
| child_id | parent_id | parents_children |
+----------+-----------+------------------+
|        1 |         7 | ,7,1,2,3,6,4,4,5 |
|        2 |         1 | ,1,2,3,6,4,4,5   |
|        3 |         1 | ,1,2,3,6,4,4,5   |
|        4 |         2 | ,2,4             |
|        4 |         3 | ,3,4,5           |
|        5 |         3 | ,3,4,5           |
|        6 |         1 | ,1,2,3,6,4,4,5   |
|       51 |        50 | ,50,51           |
+----------+-----------+------------------+
Your inserts will look like this:
insert into list_relation (child_id,parent_id)
select
    -- child, parent
    1,6
from dual
where
    -- the parent must not be found in the child's children node
    not find_in_set(6,fn_list_relation_recursive(1));
Inserting 1,6 should add 0 records. However, 1,7 should work.
As always, I'm just proving the concept; you are more than welcome to tweak the SP to return a parent's children node or a child's parent node, to have two separate SPs (one for each node tree), or even to combine everything so that a single ID returns all parents and children.
Try it, it's not that hard :)

Q: [is there] any SQL way to prevent a circular relation
A: SHORT ANSWER
There's no declarative constraint that would prevent an INSERT or UPDATE from creating a circular relation (as described in the question.)
But a combination of a BEFORE INSERT and BEFORE UPDATE trigger could prevent it, using queries and/or procedural logic to detect that successful completion of the INSERT or UPDATE would cause a circular relation.
When such a condition is detected, the triggers would need to raise an error to prevent the INSERT/UPDATE operation from completing.

Isn't it better to put a parent_id column inside the list table?
Then you can get the list tree by a query with LEFT JOIN on the list table, matching the parent_id with the list_id, e.g:
SELECT t1.list_id, t2.list_id, t3.list_id
FROM list AS t1
LEFT JOIN list as t2 ON t2.parent_id = t1.list_id
LEFT JOIN list as t3 ON t3.parent_id = t2.list_id
WHERE t1.list_id = #your_list_id#
Is it a solution to your case?
Anyway, I suggest you read about managing hierarchical data in MySQL; you can find a lot about this issue!

Do you mind adding an additional table?
An efficient SQL way to do this is to create an additional table which contains ALL parents for every child, and then to check whether the potential child exists in the parent list of the current node before the inheritance is established.
The parent_list table would be something like this:
CREATE TABLE parent_list (
list_id INT UNSIGNED NOT NULL,
parent_list_id INT UNSIGNED NOT NULL,
PRIMARY KEY (list_id, parent_list_id)
);
Now, let's start at the very beginning.
2 inherits from 1 and 4.
parent_list is empty, which means neither 1 nor 4 has any parents. So it's fine in this case.
After this step, parent_list should be:
list_id, parent_list_id
2, 1
2, 4
3 inherits from 2.
2 has two parents, 1 and 4. 3 isn't one of them. So it's fine again.
Now parent_list becomes (note that 2's parents should also be 3's parents):
list_id, parent_list_id
2, 1
2, 4
3, 1
3, 4
3, 2
4 inherits from 3.
4 exists in 3's parent list. This will lead to a cycle. NO WAY!
To check whether a cycle would occur, you just need one simple SQL query:
SELECT * FROM parent_list
WHERE list_id = potential_parent_id AND parent_list_id = potential_child_id;
Want to do all these things with one call? Apply a stored procedure:
CREATE PROCEDURE `inherit`(
    IN in_parent_id INT UNSIGNED,
    IN in_child_id INT UNSIGNED
)
BEGIN
    DECLARE result INT DEFAULT 0;
    DECLARE EXIT HANDLER FOR SQLEXCEPTION
    BEGIN
        ROLLBACK;
        SELECT -1;
    END;
    START TRANSACTION;
    IF EXISTS(SELECT * FROM parent_list WHERE list_id = in_parent_id AND parent_list_id = in_child_id) THEN
        SET result = 1; -- just some error code
    ELSE
        -- do your inserting here
        -- update parent_list: the child inherits all parents of its new parent ...
        INSERT INTO parent_list (list_id, parent_list_id)
            SELECT in_child_id, parent_list_id FROM parent_list WHERE list_id = in_parent_id;
        -- ... plus the new parent itself
        INSERT INTO parent_list VALUES (in_child_id, in_parent_id);
    END IF;
    COMMIT;
    SELECT result;
END
For multiple inheritance, just call inherit once per parent.
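The bookkeeping behind parent_list can be sketched in a few lines of Python, with a set of (list_id, parent_list_id) pairs standing in for the table. The class and method names are hypothetical. One deliberate difference from the procedure above, which is my assumption about the intended behavior: the sketch also copies the new ancestors to everything already below the child, which matters when lists are linked bottom-up rather than top-down.

```python
class ParentList:
    """In-memory stand-in for parent_list: a set of (list_id, parent_list_id) pairs."""

    def __init__(self):
        self.rows = set()

    def ancestors_of(self, list_id):
        # all direct and inherited parents recorded for list_id
        return {parent for (child, parent) in self.rows if child == list_id}

    def descendants_of(self, list_id):
        # all lists that have list_id somewhere in their parent list
        return {child for (child, parent) in self.rows if parent == list_id}

    def inherit(self, parent_id, child_id):
        """Record that child_id inherits from parent_id; return False on a cycle."""
        if parent_id == child_id or (parent_id, child_id) in self.rows:
            return False  # the child is already an ancestor of the parent
        ancestors = self.ancestors_of(parent_id) | {parent_id}
        descendants = self.descendants_of(child_id) | {child_id}
        # every ancestor of the parent becomes an ancestor of the child
        # and of everything that already inherits from the child
        self.rows |= {(d, a) for d in descendants for a in ancestors}
        return True

pl = ParentList()
print(pl.inherit(1, 2), pl.inherit(4, 2))  # True True
print(pl.inherit(2, 3))                    # True
print(pl.inherit(3, 4))                    # False: 4 is already in 3's parent list
```

The final call reproduces the rejected step of the walk-through: 4 cannot inherit from 3 because 3 already inherits from 4 via 2.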

In the example you provide, the errant relationship is simple: it's the pair of 3 -> 1 and 1 -> 3 relationships. You could simply look for the inverse relationship when inserting a new row. If it exists, don't insert the new row.
On the other hand, if you are looking at existing rows, you could identify the errant rows using a simple SQL statement like:
SELECT
a.parent_id,
a.child_id
FROM list_relation a
JOIN list_relation b
ON a.child_id = b.parent_id AND a.parent_id = b.child_id
If you add an auto-incrementing column, you could then identify the offending rows specifically.
Your question title includes the word "prevent", so I presume you want to avoid adding the rows. To do so, you would need a BEFORE INSERT trigger that checks for an existing row and prevents the insert. You could also use a BEFORE UPDATE trigger to prevent existing rows from being changed to values that would be a problem.

Related

Link one item to other item in same table

I searched a lot but found nothing.
My scenario is:
I have a database with two tables, table_item and table_item_linked. table_item has many items. A user will come and add item(s). Later, another user comes and links one item with other item(s) via a form with two dropdowns.
What I did so far is:
Structure of table_item:
+-------------------+
| table_item |
+-------------------+
| item_id (Primary) |
| others |
| .... |
| .... |
| .... |
+-------------------+
Structure of table_item_linked:
+---------------------+
| table_item_linked |
+---------------------+
| linked_id | (Primary)
| item_id | (Foreign key referencing -> item_id of table_item)
| linked_items | (here I need to store ids of linked items)
| linked_by | (referencing to user_id of user_table)
| linked_timestamp | (timestamp)
+---------------------+
If I have items in table_item like:
A B C D E F G H
When I link D with G
I can successfully fetch G when I am fetching D, or vice versa. But a problem arises when I
link H with G.
Then I must fetch D and H when fetching G.
(D, H, and G are all linked to each other, and when fetching one, the remaining two must be attached and fetched.)
It is like a many-to-many relationship.
I know there must be a professional way to do it, and I would appreciate any guidance. I can even change my database structure.
PS:
Please don't suggest to add #tag as one item is exactly similar to the other linked.
UPDATES
The frontend looks like this. If I intend to link two records, I have two dropdowns.
[Screenshots: the link form with two dropdowns, and the detail views of records A, B, and C]
Assuming your table_item looks like this:
create table table_item (
item_id int unsigned auto_increment not null,
record varchar(50),
primary key (item_id)
);
insert into table_item (record) values
('Record A'),
('Record B'),
('Record C'),
('Record D'),
('Record E'),
('Record F'),
('Record G'),
('Record H');
table_item_linked could then be
create table table_item_linked (
linked_id int unsigned auto_increment not null,
item1_id int unsigned not null,
item2_id int unsigned not null,
linked_by int unsigned not null,
linked_timestamp timestamp not null default now(),
primary key (linked_id),
unique key (item1_id, item2_id),
index (item2_id, item1_id),
foreign key (item1_id) references table_item(item_id),
foreign key (item2_id) references table_item(item_id)
);
This is basically a many-to-many relation between items of the same type.
Note that you usually don't need an AUTO_INCREMENT column here. You can remove it and define (item1_id, item2_id) as the PRIMARY KEY. Also, linked_by should be a FOREIGN KEY referencing the users table.
If a user (with ID 123) wants to link "Record A" (item_id = 1) with "Record B" (item_id = 2) and "Record B" (item_id = 2) with "Record C" (item_id = 3), your INSERT statements would be:
insert into table_item_linked (item1_id, item2_id, linked_by) values (1, 2, 123);
insert into table_item_linked (item1_id, item2_id, linked_by) values (2, 3, 123);
Now - When the user selects "Record A" (item_id = 1), you can get all related items with a recursive query (Requires at least MySQL 8.0 or MariaDB 10.2):
set @input_item_id = 1;
with recursive input as (
    select @input_item_id as item_id
), rcte as (
    select item_id from input
    union distinct
    select t.item2_id as item_id
    from rcte r
    join table_item_linked t on t.item1_id = r.item_id
    union distinct
    select t.item1_id as item_id
    from rcte r
    join table_item_linked t on t.item2_id = r.item_id
)
select i.*
from rcte r
join table_item i on i.item_id = r.item_id
where r.item_id <> (select item_id from input)
The result will be:
item_id record
———————————————————
2 Record B
3 Record C
In your application you would remove set @input_item_id = 1; and change select @input_item_id as item_id to select ? as item_id, using a placeholder. Then prepare the statement and bind item_id as a parameter.
Update
If the server doesn't support recursive CTEs, you should consider storing redundant data in a separate table that is simple to query. A closure table would be an option, but it's not necessary here and might consume too much storage space. I would group items that are connected together (directly and indirectly) into clusters.
Given the same schema as above, we define a new table table_item_cluster:
create table table_item_cluster (
item_id int unsigned not null,
cluster_id int unsigned not null,
primary key (item_id),
index (cluster_id, item_id),
foreign key (item_id) references table_item(item_id)
);
This table links items (item_id) to clusters (cluster_id). Since an item can belong only to one cluster, we can define item_id as primary key. It's also a foreign key referencing table_item.
When a new item is created, it's not connected to any other item and forms its own cluster. So when we insert a new item, we also need to insert a new row into table_item_cluster. For simplicity, we identify the cluster by item_id (item_id = cluster_id). This can be done in the application code, or with the following trigger:
delimiter //
create trigger table_item_after_insert
after insert on table_item
for each row begin
-- create a new cluster for the new item
insert into table_item_cluster (item_id, cluster_id)
values (new.item_id, new.item_id);
end//
delimiter ;
When we link two items, we simply merge their clusters. The cluster_id for all items from the two merged clusters needs to be the same now. Here I would just take the least one of two. Again - we can do that in application code or with a trigger:
delimiter //
create trigger table_item_linked_after_insert
after insert on table_item_linked
for each row begin
    declare cluster1_id, cluster2_id int unsigned;
    set cluster1_id = (
        select c.cluster_id
        from table_item_cluster c
        where c.item_id = new.item1_id
    );
    set cluster2_id = (
        select c.cluster_id
        from table_item_cluster c
        where c.item_id = new.item2_id
    );
    -- merge the linked clusters: move every item of both clusters
    -- to the smaller of the two cluster IDs
    update table_item_cluster c
    set c.cluster_id = least(cluster1_id, cluster2_id)
    where c.cluster_id in (cluster1_id, cluster2_id);
end//
delimiter ;
Now, when we have an item and want to get all (directly and indirectly) linked items, we just select all items (except the given item) from the same cluster:
select i.*
from table_item i
join table_item_cluster c on c.item_id = i.item_id
join table_item_cluster c1
on c1.cluster_id = c.cluster_id
and c1.item_id <> c.item_id -- exclude the given item
where c1.item_id = ?
The result for c1.item_id = 1 ("Record A") would be:
item_id record
———————————————————
2 Record B
3 Record C
But, as almost always when dealing with redundant data, keeping it in sync with the source data can get quite complex. While it is simple to add and merge clusters, when you need to remove/delete an item or a link, you might need to split a cluster, which may require writing recursive or iterative code to determine which items belong to the same cluster. A simple (and "stupid") algorithm, though, would be to just remove and reinsert all affected items and links, and let the insert triggers do their work.
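The cluster idea can be sketched in plain Python, with a dict standing in for table_item_cluster. The function names are hypothetical. The point of the sketch is the merge step: every item of both clusters has to move to the surviving cluster ID, not just the two items being linked.

```python
cluster_of = {}  # item_id -> cluster_id, mirrors table_item_cluster

def add_item(item_id):
    # a new item starts in its own cluster (cluster_id = item_id)
    cluster_of[item_id] = item_id

def link(item1_id, item2_id):
    # merge the two clusters into the one with the smaller ID
    c1, c2 = cluster_of[item1_id], cluster_of[item2_id]
    merged = min(c1, c2)
    for item_id, cluster_id in list(cluster_of.items()):
        if cluster_id in (c1, c2):
            cluster_of[item_id] = merged

def linked_items(item_id):
    # all other items in the same cluster
    cluster_id = cluster_of[item_id]
    return sorted(i for i, c in cluster_of.items() if c == cluster_id and i != item_id)

for item_id in range(1, 9):  # items A..H get the IDs 1..8
    add_item(item_id)
link(4, 7)  # D with G
link(8, 7)  # H with G

print(linked_items(7))  # [4, 8]: fetching G returns D and H
print(linked_items(4))  # [7, 8]: fetching D returns G and H
```

Removing a link is not covered here; as noted above, that may require splitting a cluster.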
Update 2
Last but not least: You can write a stored procedure, which will iterate through the links:
delimiter //
create procedure get_linked_items(in in_item_id int unsigned)
begin
    set @ids := concat(in_item_id);
    set @ids_next := @ids;
    set @sql_tpl := "
        select group_concat(distinct id order by id) into @ids_next
        from (
            select item2_id as id
            from table_item_linked
            where item1_id in ({params_in})
              and item2_id not in ({params_not_in})
            union all
            select item1_id
            from table_item_linked
            where item2_id in ({params_in})
              and item1_id not in ({params_not_in})
        ) x
    ";
    while (@ids_next is not null) do
        set @sql := @sql_tpl;
        set @sql := replace(@sql, '{params_in}', @ids_next);
        set @sql := replace(@sql, '{params_not_in}', @ids);
        prepare stmt from @sql;
        execute stmt;
        set @ids := concat_ws(',', @ids, @ids_next);
    end while;
    set @sql := "
        select *
        from table_item
        where item_id in ({params})
          and item_id <> {in_item_id}
    ";
    set @sql := replace(@sql, '{params}', @ids);
    set @sql := replace(@sql, '{in_item_id}', in_item_id);
    prepare stmt from @sql;
    execute stmt;
end//
delimiter ;
To get all linked items of "Record A" (item_id = 1), you would use
call get_linked_items(1);
To explain it in pseudo code:
1. Initialize @ids and @ids_next with the input parameter.
2. Find all item IDs which are directly linked to any ID in @ids_next, except those which are already in @ids.
3. Store the result into @ids_next (overwrite it).
4. Append the IDs from @ids_next to @ids (merge the two sets into @ids).
5. If @ids_next is not empty: GOTO step 2.
6. Return all items with IDs in @ids.
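The steps above can be sketched in Python, with a list of (item1_id, item2_id) pairs standing in for table_item_linked. The function name mirrors the procedure, but the in-memory representation is an assumption for illustration; the sketch returns IDs rather than full rows.

```python
def get_linked_items(links, in_item_id):
    ids = {in_item_id}       # everything found so far (@ids)
    ids_next = {in_item_id}  # the IDs found in the latest round (@ids_next)
    while ids_next:
        found = set()
        for item1_id, item2_id in links:
            # follow each link in both directions, skipping known IDs
            if item1_id in ids_next and item2_id not in ids:
                found.add(item2_id)
            if item2_id in ids_next and item1_id not in ids:
                found.add(item1_id)
        ids_next = found     # overwrite the frontier
        ids |= ids_next      # merge it into the result
    return sorted(ids - {in_item_id})

links = [(1, 2), (2, 3)]           # A linked with B, B linked with C
print(get_linked_items(links, 1))  # [2, 3]
print(get_linked_items(links, 3))  # [1, 2]
```

Each round only expands the newest IDs, so every link is followed a bounded number of times and the loop ends once a round finds nothing new.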
The obvious solution is to store one row for each link in table_item_linked.
Your table then becomes
+---------------------+
| table_item_linked |
+---------------------+
| linked_id | (Primary)
| from_item_id | (the item linked _from_ -> item_id of table_item)
| to_item_id | (the item linked _to_)
| linked_by | (referencing to user_id of user_table)
| linked_timestamp | (timestamp)
+---------------------+
In your example, the data would be:
linked_id from_item_id to_item_id linked_by linked_timestamp
------------------------------------------------------------------------
1 D H sd '1 jan 2020'
2 H G sa '2 Jan 2020'
You then need to write a hierarchical query to retrieve all the "children" of G.

How can I create a query that shows children of an item recursively [duplicate]

I have a MySQL table which has the following format:
CREATE TABLE IF NOT EXISTS `Company` (
`CompanyId` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
`Name` VARCHAR(45) NULL ,
`Address` VARCHAR(45) NULL ,
`ParentCompanyId` INT UNSIGNED NULL ,
PRIMARY KEY (`CompanyId`) ,
INDEX `fk_Company_Company_idx` (`ParentCompanyId` ASC) ,
CONSTRAINT `fk_Company_Company`
FOREIGN KEY (`ParentCompanyId` )
REFERENCES `Company` (`CompanyId` )
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
So to clarify, I have companies which can have a parent company. This could result in the following example table contents:
CompanyId  Name  Address       ParentCompanyId
1          Foo   Somestreet 3  NULL
2          Bar   Somelane 4    1
3          McD   Someway 1337  1
4          KFC   Somewhere 12  2
5          Pub   Someplace 2   4
Now comes my question.
I want to retrieve all children of CompanyId 2 recursively. So the following result set should appear:
CompanyId  Name  Address       ParentCompanyId
4          KFC   Somewhere 12  2
5          Pub   Someplace 2   4
I thought of using the WITH ... AS ... statement, but it is not supported by MySQL. Another solution I thought of was using a procedure or function which returns a result set and combining it, via UNION, with the recursive call of that function. But MySQL only supports column types as return values.
The last possible solution I thought about was to create a table with two fields: CompanyId and HasChildId. I could then write a procedure that loops recursively through the companies and fills the table with all recursive children by a companyid. In this case I could write a query which joins this table:
SELECT C.CompanyId, C.Name, C.Address
FROM Company C -- The child
INNER JOIN CompanyChildMappingTable M
ON M.HasChildId = C.CompanyId
INNER JOIN Company P -- The parent
ON P.CompanyId = M.CompanyId
WHERE P.CompanyId = 2;
This option should be fast if I call the procedure every 24 hours and also update the table on the fly when new records are inserted into Company. But keeping it up to date could be very tricky, and I would have to write triggers on the Company table to do it.
I would like to hear your advice.
Solution: I've built the following procedure to fill my table (now it just returns the SELECT result).
DELIMITER $$
DROP PROCEDURE IF EXISTS CompanyFillWithSubCompaniesByCompanyId$$
CREATE PROCEDURE CompanyFillWithSubCompaniesByCompanyId(IN V_CompanyId BIGINT UNSIGNED, IN V_TableName VARCHAR(100))
BEGIN
DECLARE V_CONCAT_IDS VARCHAR(9999) DEFAULT '';
DECLARE V_CURRENT_CONCAT VARCHAR(9999) DEFAULT '';
SET V_CONCAT_IDS = (SELECT GROUP_CONCAT(CompanyId) FROM Company WHERE V_CompanyId IS NULL OR ParentCompanyId = V_CompanyId);
SET V_CURRENT_CONCAT = V_CONCAT_IDS;
IF V_CompanyId IS NOT NULL THEN
companyLoop: LOOP
IF V_CURRENT_CONCAT IS NULL THEN
LEAVE companyLoop;
END IF;
SET V_CURRENT_CONCAT = (SELECT GROUP_CONCAT(CompanyId) FROM Company WHERE FIND_IN_SET(ParentCompanyId, V_CURRENT_CONCAT));
SET V_CONCAT_IDS = CONCAT_WS(',', V_CONCAT_IDS, V_CURRENT_CONCAT);
END LOOP;
END IF;
SELECT * FROM Company WHERE FIND_IN_SET(CompanyId, V_CONCAT_IDS);
END$$
DELIMITER ;
See:
Recursive MySQL Query with relational innoDB
and
How to find all child rows in MySQL?
These should give you an idea of how such a data structure can be handled in MySQL.
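On MySQL 8.0 or MariaDB 10.2 and later, WITH RECURSIVE is available, so the children of a company can be fetched in a single statement. The following is only a minimal sketch, not the procedure shown above: it uses Python's built-in sqlite3 module as a stand-in for a MySQL connection (an assumption, made so the example is self-contained), and the names descendants and sub are hypothetical. Apart from the placeholder syntax of the client library, the SQL should carry over to MySQL 8.0.

```python
import sqlite3

def descendants(conn, company_id):
    """Return (CompanyId, Name, ParentCompanyId) for all direct and indirect children."""
    sql = """
        WITH RECURSIVE sub AS (
            -- anchor: direct children of the given company
            SELECT CompanyId, Name, ParentCompanyId
            FROM Company
            WHERE ParentCompanyId = ?
            UNION ALL
            -- recursive step: children of rows found so far
            SELECT c.CompanyId, c.Name, c.ParentCompanyId
            FROM Company AS c
            JOIN sub ON c.ParentCompanyId = sub.CompanyId
        )
        SELECT CompanyId, Name, ParentCompanyId FROM sub ORDER BY CompanyId
    """
    return conn.execute(sql, (company_id,)).fetchall()

# Sample data from the question (SQLite in memory stands in for MySQL here).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Company (CompanyId INTEGER PRIMARY KEY, Name TEXT, "
    "Address TEXT, ParentCompanyId INTEGER)"
)
conn.executemany("INSERT INTO Company VALUES (?, ?, ?, ?)", [
    (1, "Foo", "Somestreet 3", None),
    (2, "Bar", "Somelane 4", 1),
    (3, "McD", "Someway 1337", 1),
    (4, "KFC", "Somewhere 12", 2),
    (5, "Pub", "Someplace 2", 4),
])

print(descendants(conn, 2))  # [(4, 'KFC', 2), (5, 'Pub', 4)]
```

Like the procedure, this assumes the parent links contain no cycle; with a cycle the recursion would not terminate on its own.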
One quick way to search is to derive company id values from the parent id, e.g. companyId = parentId * 2, and then query the database like: select * from company where (CompanyId % $parentId) = 0
I tried this code. It is quick, but the problem is that it creates each child's id as parentId * 2, so if the hierarchy gets deep, the id may go out of range for an int or float. So I rewrote my whole program.

Recursive mysql select?

I saw this answer and I hope it is incorrect, just like someone was incorrect when telling me that a primary key is on one column and can't be set on multiple columns.
Here is my table
create table Users(id INT primary key AUTO_INCREMENT,
parent INT,
name TEXT NOT NULL,
FOREIGN KEY(parent)
REFERENCES Users(id)
);
+----+--------+---------+
| id | parent | name |
+----+--------+---------+
| 1 | NULL | root |
| 2 | 1 | one |
| 3 | 1 | 1down |
| 4 | 2 | one_a |
| 5 | 4 | one_a_b |
+----+--------+---------+
I'd like to select user id 2 and recurse so I get all its direct and indirect children (so ids 4 and 5).
How do I write it in such a way that this will work? I have seen recursion in PostgreSQL and SQL Server.
CREATE DEFINER = 'root'@'localhost'
PROCEDURE test.GetHierarchyUsers(IN StartKey INT)
BEGIN
-- prepare a hierarchy level variable
SET @hierlevel := 00000;
-- prepare a variable for total rows so we know when no more rows found
SET @lastRowCount := 0;
-- pre-drop temp table
DROP TABLE IF EXISTS MyHierarchy;
-- now, create it as the first level you want...
-- ie: a specific top level of all "no parent" entries
-- or parameterize the function and ask for a specific "ID".
-- add extra column as flag for next set of ID's to load into this.
CREATE TABLE MyHierarchy AS
SELECT U.ID
, U.Parent
, U.`name`
, 00 AS IDHierLevel
, 00 AS AlreadyProcessed
FROM
Users U
WHERE
U.ID = StartKey;
-- how many rows are we starting with at this tier level
-- START the cycle, only IF we found rows...
SET @lastRowCount := FOUND_ROWS();
-- we need to have a "key" for updates to be applied against,
-- otherwise our UPDATE statement will nag about an unsafe update command
CREATE INDEX MyHier_Idx1 ON MyHierarchy (IDHierLevel);
-- NOW, keep cycling through until we get no more records
WHILE @lastRowCount > 0
DO
UPDATE MyHierarchy
SET
AlreadyProcessed = 1
WHERE
IDHierLevel = @hierLevel;
-- NOW, load in all entries found from full-set NOT already processed
INSERT INTO MyHierarchy
SELECT DISTINCT U.ID
, U.Parent
, U.`name`
, @hierLevel + 1 AS IDHierLevel
, 0 AS AlreadyProcessed
FROM
MyHierarchy mh
JOIN Users U
ON mh.Parent = U.ID
WHERE
mh.IDHierLevel = @hierLevel;
-- preserve latest count of records accounted for from above query
-- now, how many actual rows DID we insert from the select query
SET @lastRowCount := ROW_COUNT();
-- only mark the LOWER level we just joined against as processed,
-- and NOT the new records we just inserted
UPDATE MyHierarchy
SET
AlreadyProcessed = 1
WHERE
IDHierLevel = @hierLevel;
-- now, update the hierarchy level
SET @hierLevel := @hierLevel + 1;
END WHILE;
-- return the final set now
SELECT *
FROM
MyHierarchy;
-- and we can clean-up after the query of data has been selected / returned.
-- drop table if exists MyHierarchy;
END
It might appear cumbersome, but to use this, do
call GetHierarchyUsers( 5 );
(or whatever key ID you want to find UP the hierarchical tree for).
The premise is to start with the one KEY you are working with. Then, use that as a basis to join to the users table AGAIN, but based on the first entry's PARENT ID. Once found, update the temp table so as not to try to join for that key again on the next cycle. Then keep going until no more "parent" ID keys can be found.
This will return the entire hierarchy of records up to the top-level parent, no matter how deep the nesting. However, if you only want the FINAL parent, you can use the @hierlevel variable to return only the rows added last, or use ORDER BY and LIMIT 1.
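Where recursive CTEs are available (MySQL 8.0 or later), the same level-by-level walk up the tree can be written as a single query. The sketch below is not the procedure above: it runs against Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, to keep it self-contained), and the names hier, hierarchy_up and top_ancestor are hypothetical. The level column plays the role of IDHierLevel.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Users (id INTEGER PRIMARY KEY, "
    "parent INTEGER REFERENCES Users(id), name TEXT NOT NULL)"
)
conn.executemany("INSERT INTO Users (id, parent, name) VALUES (?, ?, ?)", [
    (1, None, "root"),
    (2, 1, "one"),
    (3, 1, "1down"),
    (4, 2, "one_a"),
    (5, 4, "one_a_b"),
])

HIERARCHY_SQL = """
WITH RECURSIVE hier (id, parent, name, level) AS (
    -- level 0: the starting row itself
    SELECT id, parent, name, 0 FROM Users WHERE id = ?
    UNION ALL
    -- next level: the parent of each row found so far
    SELECT u.id, u.parent, u.name, hier.level + 1
    FROM hier
    JOIN Users AS u ON u.id = hier.parent
)
SELECT id, parent, name, level FROM hier ORDER BY level
"""

def hierarchy_up(conn, start_id):
    """Return the start row and all its ancestors, nearest first."""
    return conn.execute(HIERARCHY_SQL, (start_id,)).fetchall()

def top_ancestor(conn, start_id):
    """Return only the row with the highest level (the FINAL parent), or None."""
    rows = hierarchy_up(conn, start_id)
    return rows[-1] if rows else None

print(hierarchy_up(conn, 5))
print(top_ancestor(conn, 5))  # (1, None, 'root', 3)
```

In plain SQL the last step would be ORDER BY level DESC LIMIT 1 on the same CTE.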
I know there is probably a better and more efficient answer above, but this snippet takes a slightly different approach and provides both ancestors and children.
The idea is to keep inserting related row ids into a temporary table, then fetch a row to look for its relatives, and repeat until all rows are processed. The query can probably be optimized to use only one temporary table.
Here is a working sqlfiddle example.
CREATE TABLE Users
(`id` int, `parent` int,`name` VARCHAR(10))//
INSERT INTO Users
(`id`, `parent`, `name`)
VALUES
(1, NULL, 'root'),
(2, 1, 'one'),
(3, 1, '1down'),
(4, 2, 'one_a'),
(5, 4, 'one_a_b')//
CREATE PROCEDURE getAncestors (in ParRowId int)
BEGIN
DECLARE tmp_parentId int;
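-- drop leftovers from an earlier call in the same session
DROP TEMPORARY TABLE IF EXISTS tmp, results;
CREATE TEMPORARY TABLE tmp (parentId INT NOT NULL);
CREATE TEMPORARY TABLE results (parentId INT NOT NULL);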
INSERT INTO tmp SELECT ParRowId;
WHILE (SELECT COUNT(*) FROM tmp) > 0 DO
SET tmp_parentId = (SELECT MIN(parentId) FROM tmp);
DELETE FROM tmp WHERE parentId = tmp_parentId;
INSERT INTO results SELECT parent FROM Users WHERE id = tmp_parentId AND parent IS NOT NULL;
INSERT INTO tmp SELECT parent FROM Users WHERE id = tmp_parentId AND parent IS NOT NULL;
END WHILE;
SELECT * FROM Users WHERE id IN (SELECT * FROM results);
END//
CREATE PROCEDURE getChildren (in ParRowId int)
BEGIN
DECLARE tmp_childId int;
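-- drop leftovers from an earlier call in the same session
DROP TEMPORARY TABLE IF EXISTS tmp, results;
CREATE TEMPORARY TABLE tmp (childId INT NOT NULL);
CREATE TEMPORARY TABLE results (childId INT NOT NULL);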
INSERT INTO tmp SELECT ParRowId;
WHILE (SELECT COUNT(*) FROM tmp) > 0 DO
SET tmp_childId = (SELECT MIN(childId) FROM tmp);
DELETE FROM tmp WHERE childId = tmp_childId;
INSERT INTO results SELECT id FROM Users WHERE parent = tmp_childId;
INSERT INTO tmp SELECT id FROM Users WHERE parent = tmp_childId;
END WHILE;
SELECT * FROM Users WHERE id IN (SELECT * FROM results);
END//
Usage:
CALL getChildren(2);
-- returns
id parent name
4 2 one_a
5 4 one_a_b
CALL getAncestors(5);
-- returns
id parent name
1 (null) root
2 1 one
4 2 one_a

Selecting all the descendants of a tree node

I'm using the adjacency list model to store a (very dynamic) tree structure in a MySQL database. I need a way to select all of the descendants of a given node, preferably via a single call to a stored routine. I know that the nested sets model would make this easy, but it would make other things very difficult, so unfortunately it's not an option for me. Here's what I've got so far:
DELIMITER //
CREATE PROCEDURE get_descendants(node_id INT)
BEGIN
DROP TEMPORARY TABLE IF EXISTS descendants;
CREATE TEMPORARY TABLE descendants (id INT, name VARCHAR(100), parent_id INT);
INSERT INTO descendants
SELECT *
FROM nodes
WHERE parent_id <=> node_id;
-- ...?
END//
DELIMITER ;
The idea is to keep drilling down and appending children to the descendants table until I reach the leaves. I can then access the temporary table from outside the procedure...I hope. (It really sucks that I can't return a result set from a stored function.)
I need to somehow loop over the results and issue a new SELECT statement for each row. I've read that cursors might help here, but I don't see how. It seems like with cursors you have to select everything up front, then iterate.
It depends on the read:write ratio. If reads greatly outnumber writes, it is very helpful to maintain a permanent full-relationship table rather than a temporary one.
A pseudo approach (not real code!):
1. node(1) has child node(2)
-> Insert a row with (parent_id = 1, child_id = 2, direct = True)
2. node(2) has child node(3)
-> Insert a row with (parent_id = 2, child_id = 3, direct = True)
-> Choose all ascendants of node(2)
-> Ascendants of node(2) : [node(1)]
-> Insert a row with (parent_id = 1, child_id = 3, direct = False)
3. To retrieve all descendants of node(1)
-> SELECT child_id FROM [table] WHERE parent_id = 1;
4. To retrieve children of node(1)
-> SELECT child_id FROM [table] WHERE parent_id = 1 AND direct = True;
5. To retrieve all ascendants of node(3)
-> SELECT parent_id FROM [table] WHERE child_id = 3;
6. To retrieve parent of node(3)
-> SELECT parent_id FROM [table] WHERE child_id = 3 AND direct = True;
+-----------+----------+--------+
| parent_id | child_id | direct |
+-----------+----------+--------+
| 1 | 2 | True |
| 1 | 3 | False |
| 2 | 3 | True |
....
+-----------+----------+--------+
Index 1 on ( parent_id, direct )
Index 2 on ( child_id, direct )
This approach performs badly when relationships are updated, because every change has to rewrite the indirect rows as well. Use at your own risk.
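The steps above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than MySQL, the table name relations and the helper add_child are hypothetical, direct is stored as 1/0, and only the simple case of attaching a new leaf (a child with no descendants of its own yet) is handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE relations ("
    "parent_id INTEGER NOT NULL, child_id INTEGER NOT NULL, "
    "direct INTEGER NOT NULL, PRIMARY KEY (parent_id, child_id))"
)
conn.execute("CREATE INDEX idx_parent ON relations (parent_id, direct)")
conn.execute("CREATE INDEX idx_child ON relations (child_id, direct)")

def add_child(conn, parent_id, child_id):
    """Attach a new leaf node: one direct row plus one indirect row per ascendant."""
    conn.execute(
        "INSERT INTO relations (parent_id, child_id, direct) VALUES (?, ?, 1)",
        (parent_id, child_id),
    )
    # Every ascendant of the parent is an indirect ascendant of the child.
    conn.execute(
        "INSERT INTO relations (parent_id, child_id, direct) "
        "SELECT parent_id, ?, 0 FROM relations WHERE child_id = ?",
        (child_id, parent_id),
    )

def ids(conn, sql, *params):
    """Run a single-column query and return its values as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql, params))

add_child(conn, 1, 2)  # step 1: node(1) has child node(2)
add_child(conn, 2, 3)  # step 2: node(2) has child node(3)

# Steps 3 to 6 from the list above.
print(ids(conn, "SELECT child_id FROM relations WHERE parent_id = ?", 1))                  # [2, 3]
print(ids(conn, "SELECT child_id FROM relations WHERE parent_id = ? AND direct = 1", 1))   # [2]
print(ids(conn, "SELECT parent_id FROM relations WHERE child_id = ?", 3))                  # [1, 2]
print(ids(conn, "SELECT parent_id FROM relations WHERE child_id = ? AND direct = 1", 3))   # [2]
```

Moving or deleting a subtree would need additional statements to rebuild the indirect rows, which is where the update cost mentioned above comes from.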