I'm writing myself a forum, and I want one of those "you are here" strings along the top ("home > forum > sub forum > topic > etc." kind of thing). The depth the forums can go to is limited by a TINYINT column in the database (127 if signed, 255 if unsigned), not that this matters.
My question is this: is there a way to select the current forum (using its ID, which is easy), but also select everything it is nested inside of, so I can generate the "you are here" string? Obviously "Home > " is hard-coded, but the rest will be titles of forums and sub forums.
I'd need some sort of loop, starting from the deepest-level forum I'm currently in and moving up to the top. Is the only way to do it using PHP loops and lots of queries? I'd rather use a single query, as it should be faster.
Thanks,
James
You can do it with a trivially simple query, no joins... if you change your schema to make that information easy to extract. Look up the nested set model.
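As a rough sketch of what that looks like: in a nested set, each row stores the left and right bounds of its subtree, and a node's ancestors are exactly the rows whose bounds enclose it. The table and column names below (forums, lft, rgt) and the sample rows are assumptions for illustration, not the asker's actual schema.

```sql
-- Hypothetical nested-set table: each forum stores the bounds of its subtree.
CREATE TABLE forums (
  id    INT UNSIGNED NOT NULL PRIMARY KEY,
  title VARCHAR(255) NOT NULL,
  lft   INT UNSIGNED NOT NULL,
  rgt   INT UNSIGNED NOT NULL
);

INSERT INTO forums (id, title, lft, rgt) VALUES
  (1, 'Forum',     1, 8),
  (2, 'Sub forum', 2, 5),
  (3, 'Topic',     3, 4),
  (4, 'Other sub', 6, 7);

-- Breadcrumb for 'Topic' (lft = 3, rgt = 4): every row whose bounds
-- enclose it, outermost first. No join is needed once the bounds are known.
SELECT id, title
FROM forums
WHERE lft <= 3 AND rgt >= 4
ORDER BY lft;
```

With the sample rows this returns Forum, Sub forum, Topic in that order. The trade-off is that inserting or moving a forum means renumbering lft and rgt for part of the tree.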
Well, once you have the initial ID, can't you just quickly use a PHP loop to generate a set of variables that you use to generate a "where" statement for your SQL query?
This is a previous answer of mine which might be of use: Recursively check the parents of a child in a database
It's a non recursive single call from php to db using a stored procedure...
-- TABLES
drop table if exists pages;
create table pages
(
page_id smallint unsigned not null auto_increment primary key,
title varchar(255) not null,
parent_page_id smallint unsigned null,
key (parent_page_id)
)
engine = innodb;
-- TEST DATA
insert into pages (title, parent_page_id) values
('Page 1',null),
('Page 2',null),
('Page 1-2',1),
('Page 1-2-1',3),
('Page 1-2-2',3),
('Page 2-1',2),
('Page 2-2',2);
-- STORED PROCEDURES
drop procedure if exists page_parents;
delimiter #
create procedure page_parents
(
in p_page_id smallint unsigned
)
begin
declare v_done tinyint unsigned default 0;
declare v_depth smallint unsigned default 0;
create temporary table hier(
parent_page_id smallint unsigned,
page_id smallint unsigned,
depth smallint unsigned default 0
)engine = memory;
insert into hier select parent_page_id, page_id, v_depth from pages where page_id = p_page_id;
/* http://dev.mysql.com/doc/refman/5.0/en/temporary-table-problems.html */
create temporary table tmp engine=memory select * from hier;
while not v_done do
if exists( select 1 from pages pg inner join hier on pg.page_id = hier.parent_page_id and hier.depth = v_depth) then
insert into hier
select pg.parent_page_id, pg.page_id, v_depth + 1 from pages pg
inner join tmp on pg.page_id = tmp.parent_page_id and tmp.depth = v_depth;
set v_depth = v_depth + 1;
truncate table tmp;
insert into tmp select * from hier where depth = v_depth;
else
set v_done = 1;
end if;
end while;
select
pg.page_id,
pg.title as page_title,
b.page_id as parent_page_id,
b.title as parent_page_title,
hier.depth
from
hier
inner join pages pg on hier.page_id = pg.page_id
left outer join pages b on hier.parent_page_id = b.page_id
order by
hier.depth, hier.page_id;
drop temporary table if exists hier;
drop temporary table if exists tmp;
end #
delimiter ;
-- TESTING (call this stored procedure from php)
call page_parents(5);
call page_parents(7);
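If you happen to be on MySQL 8.0 or later (an assumption; the procedure above was written for older versions), the same walk up the parent chain can be sketched as a recursive common table expression against the pages table and test data defined above, with no temporary tables. The CTE name parents is made up for this example.

```sql
-- Assumes MySQL 8.0+ and the pages table and test data from above.
WITH RECURSIVE parents (page_id, title, parent_page_id, depth) AS (
  -- Anchor: the page we start from.
  SELECT page_id, title, parent_page_id, 0
  FROM pages
  WHERE page_id = 5
  UNION ALL
  -- Recursive step: move one level up to the parent row.
  SELECT p.page_id, p.title, p.parent_page_id, parents.depth + 1
  FROM pages AS p
  JOIN parents ON p.page_id = parents.parent_page_id
)
SELECT page_id, title, depth
FROM parents
ORDER BY depth DESC;  -- root first, which is the order a breadcrumb needs
```

For page 5 in the test data this walks 5 -> 3 -> 1 and returns 'Page 1', 'Page 1-2', 'Page 1-2-2'.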
If you assume the user navigated using the physical hierarchy of the forums, just use a lot of left joins as follows:
select current.forum as current,
parent1.forum as history1,
parent2.forum as history2,
parent3.forum as history3,
parent4.forum as history4,
parent5.forum as history5,
parent6.forum as history6
from forum current
left join forum parent1 on parent1.id = current.parentid
left join forum parent2 on parent2.id = parent1.parentid
left join forum parent3 on parent3.id = parent2.parentid
left join forum parent4 on parent4.id = parent3.parentid
left join forum parent5 on parent5.id = parent4.parentid
left join forum parent6 on parent6.id = parent5.parentid
Otherwise, you may want to create a forum breadcrumb table to store the history of locations the user has visited. Update this table with each location the user visits.
Related
I'd like to create reports without having to create a pivot table in excel for every report.
I have survey software that creates a new table for each survey. The columns are named with ID numbers. So, I never know what the columns will be named. The software stores answers in two different tables depending on the 'type' of question. (text, radio button, etc.)
I manually created a table 'survey_answers_lookup' that stores a few key fields but it duplicates the answers. The procedure 'survey_report' works well and produces the required data but there is a challenge.
Since the survey tables are created when someone creates a new survey, I would need a trigger on the schema that creates a second trigger and I don't think that is possible. The second trigger would monitor the survey table and insert the data into the 'survey_answers_lookup' table after someone completes a survey.
I could edit the php software and insert the values into the survey_answers_lookup table but that would create more work when I update the software. (I'd have to update the files and then put my changes back in the files). I also could not determine where they insert the values into the tables.
Can you please help?
Edited. I posted my solution below.
Change some_user to a user who has access to the database.
CREATE DEFINER=`some_user`@`localhost` PROCEDURE `usp_produce_survey_report`(IN survey_id VARCHAR(10), IN lang VARCHAR(2))
SQL SECURITY INVOKER
BEGIN
/*---------------------------------------------------------------------------------
I do not guarantee that this will work for you or that it cannot be hacked with
SQL injections or other malicious intents.
This stored procedure will produce output that you may use to create a report.
It accepts two arguments; The survey id (745) and the language (en).
It parses the column name in the survey table to get the qid.
It will copy the answers from the survey table to the survey_report
table if the answer is type S or K. It will get the answers from
the answers table for other types. NOTE: Other types might need to
be added to the if statement.
Additionally, the qid and id from the survey table are also copied to
the survey_report table.
Then the questions from the questions table, and answers from the answers
and survey_report tables are combined and displayed.
The data in the survey_report table is deleted after the data is displayed.
The id from the survey table is displayed as the respondent_id which may
be used to combine the questions and answers from a specific respondent.
You may have to change the prefix on the table names.
Example: survey_answers to my_prefix_answers.
Use this to call the procedure.
Syntax: call survey.usp_produce_survey_report('<SURVEY_ID>', '<LANGUAGE>');
Example: call survey.usp_produce_survey_report('457345', 'en');
use this to create the table that stores the data
CREATE TABLE `survey_report` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`qid` int(11) NOT NULL DEFAULT '0',
`survey_row_id` int(11) NOT NULL DEFAULT '0' COMMENT 'id that is in the survey_<id> table',
`answer` mediumtext COLLATE utf8mb4_unicode_ci DEFAULT NULL,
PRIMARY KEY (`id`)
);
*/
DECLARE v_col_name VARCHAR (25);
DECLARE v_qid INT;
DECLARE v_col_count INT DEFAULT 0;
DECLARE done INT DEFAULT false;
DECLARE tname VARCHAR(24) DEFAULT CONCAT('survey_survey_',survey_id);
DECLARE counter INT DEFAULT 0;
DECLARE current_row INT DEFAULT 0;
DECLARE total_rows INT DEFAULT 0;
-- select locate ('X','123457X212X1125', 8); -- use locate to determine location of second X - returns 11
-- select substring('123457X212X1125', 11+1, 7); -- use substring to get the qid - returns 1125
DECLARE cur1 cursor for
SELECT column_name, substring(column_name, 11+1, 7) as qid -- get the qid from the column name. the 7 might need to be higher depending on the id.
FROM information_schema.columns -- this has the column names
WHERE table_name = tname -- table name created from the id that was passed to the stored procedure
AND column_name REGEXP 'X'; -- get the columns that have an X
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
SET done = FALSE;
OPEN cur1;
SET total_rows = (SELECT table_rows -- get the number of rows
FROM INFORMATION_SCHEMA.TABLES
WHERE table_name = tname);
-- SELECT total_rows;
read_loop: LOOP
FETCH cur1 INTO v_col_name, v_qid; -- v_col_name is the original column name and v_qid is the qid that is taken from the column name
IF done THEN
LEAVE read_loop;
END IF;
-- SELECT v_col_name, v_qid;
SET counter = 1; -- use to compare id's
SET current_row = 1; -- used for the while loop
WHILE current_row <= total_rows DO
SET @sql := NULL;
-- SELECT v_col_name, v_qid, counter, x;
-- SELECT counter as id, v_col_name, v_qid as qid, x;
-- SET @sql = CONCAT ('SELECT id ', ',',v_qid, ' as qid ,', v_col_name,' FROM ', tname, ' WHERE id = ', counter );
-- I would have to join the survey table below if I did not add the answer (v_col_name). I assume this is faster than another join.
SET @sql = CONCAT ('INSERT INTO survey_report(qid,survey_row_id,answer) SELECT ',v_qid, ',id,' , v_col_name, ' FROM ', tname, ' WHERE id = ', counter );
-- SELECT @sql;
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
-- SELECT counter, x;
SET current_row = current_row + 1; -- increment counter for while loop
SET counter = counter + 1; -- increment counter for id's
END WHILE;
END LOOP; -- read_loop
CLOSE cur1;
-- SELECT * FROM survey_report
-- ORDER BY id, qid;
SET @counter = 0;
SELECT
@counter:=@counter + 1 AS newindex, -- increment the counter that is in the header
survey_report.id,
survey_report.survey_row_id as respondent_id, -- the id that was copied from the survey table
survey_report.qid,
question,
IF(type IN ('S' , 'K'),
(SELECT answer
FROM survey_report
WHERE qid NOT IN (SELECT qid FROM survey_answers)
AND survey_questions.language = lang
AND survey_report.id = @counter),
(SELECT answer
FROM survey_answers
WHERE survey_questions.qid = survey_answers.qid
AND survey_report.qid = survey_questions.qid
AND survey_report.answer = survey_answers.code
AND survey_answers.language = lang
)
) AS answer
FROM survey_questions
JOIN survey_report ON survey_report.qid = survey_questions.qid
WHERE survey_questions.sid = survey_id
ORDER BY survey_report.survey_row_id, survey_report.id;
TRUNCATE TABLE survey_report;
END
Why does STRAIGHT_JOIN consume more CPU than a regular join? Do you have any idea?
When I use STRAIGHT_JOIN on one of my queries, it speeds the query up from about 12 seconds to 3 seconds, but it consumes much more CPU. Might this be down to server configuration or something else?
You might want to check the code after this comment: /* Topic Ids are OK, Getting Devices... */
Before this line there is some code that fills topic_ids into a temp table.
Here is the query:
CREATE PROCEDURE `DevicesByTopic`(IN platform TINYINT, IN application TINYINT, IN topicList TEXT, IN page_no MEDIUMINT UNSIGNED)
BEGIN
DECLARE m_index INT DEFAULT 0;
DECLARE m_topic VARCHAR(255);
DECLARE m_topic_id BIGINT UNSIGNED DEFAULT NULL;
DECLARE m_session_id VARCHAR(40) CHARSET utf8 COLLATE utf8_turkish_ci;
-- Session Id
SET m_session_id = replace(uuid(), '-', '');
-- Temp table
CREATE TEMPORARY TABLE IF NOT EXISTS tmp_topics(
topic_slug VARCHAR(100) COLLATE utf8_turkish_ci
,topic_id BIGINT UNSIGNED
,session_id VARCHAR(40) COLLATE utf8_turkish_ci
,INDEX idx_tmp_topic_session_id (session_id)
,INDEX idx_tmp_topic_id (topic_id)
) CHARSET=utf8 COLLATE=utf8_turkish_ci;
-- Filling topics in a loop
loop_topics: LOOP
SET m_index = m_index + 1;
SET m_topic_id = NULL;
SET m_topic= SPLIT_STR(topicList,',', m_index);
IF m_topic = '' THEN
LEAVE loop_topics;
END IF;
SELECT t.topic_id INTO m_topic_id FROM topic AS t WHERE t.application = application AND (t.slug_hashed = UNHEX(MD5(m_topic)) AND t.slug = m_topic) LIMIT 1;
-- Fill temp table
IF m_topic_id IS NOT NULL AND m_topic_id > 0 THEN
INSERT INTO tmp_topics
(topic_slug, topic_id, session_id)
VALUES
(m_topic, m_topic_id, m_session_id);
END IF;
END LOOP loop_topics;
/* Topic Ids are OK, Getting Devices... */
SELECT
dr.device_id, dr.platform, dr.application, dr.unique_device_id, dr.amazon_arn
FROM
device AS dr
INNER JOIN (
SELECT STRAIGHT_JOIN
DISTINCT
d.device_id
FROM
device AS d
INNER JOIN
device_user AS du ON du.device_id = d.device_id
INNER JOIN
topic_device_user AS tdu ON tdu.device_user_id = du.device_user_id
INNER JOIN
tmp_topics AS tmp_t ON tmp_t.topic_id = tdu.topic_id
WHERE
((platform IS NULL OR d.platform = platform) AND d.application = application)
AND d.page_no = page_no
AND d.status = 1
AND du.status = 1
AND tmp_t.session_id = m_session_id COLLATE utf8_turkish_ci
) dFiltered ON dFiltered.device_id = dr.device_id
WHERE
((platform IS NULL OR dr.platform = platform) AND dr.application = application)
AND dr.page_no = page_no
AND dr.status = 1;
-- Delete this session's rows from the temp table
DELETE FROM tmp_topics WHERE session_id = m_session_id;
END;
With STRAIGHT_JOIN this query takes about 3 seconds but consumes about 90% CPU; if I remove the STRAIGHT_JOIN keyword, it takes 12 seconds but consumes only about 12% CPU.
MySQL 5.6.19a - innodb
What might be the reason?
Best regards.
A STRAIGHT_JOIN is used when you need to override MySQL's optimizer. You are telling it to ignore its own optimized execution path and instead rely on reading the tables in the order you have written them in the query.
99% of the time you don't want to use a straight_join. Just rely on MySQL to do its job and optimize the execution path for you. After all, any RDBMS worth its salt is going to be pretty decent at optimizing.
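The few times you should use STRAIGHT_JOIN are when you've already tested MySQL's optimization for a given query and found it lacking. In your case the picture is mixed: the STRAIGHT_JOIN version returns in a quarter of the time, but at roughly 90% CPU for 3 seconds versus 12% for 12 seconds it burns more CPU in total, so your manual join order is not clearly better than MySQL's built-in optimization.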
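One way to see what STRAIGHT_JOIN actually changes is to compare the two execution plans. The statements below are a simplified version of the inner query from the question (the temp table and the procedure parameters are left out), so treat them as an outline rather than a drop-in test.

```sql
-- 1. Join order chosen by the optimizer.
EXPLAIN
SELECT DISTINCT d.device_id
FROM device AS d
INNER JOIN device_user AS du ON du.device_id = d.device_id
INNER JOIN topic_device_user AS tdu ON tdu.device_user_id = du.device_user_id
WHERE d.status = 1
  AND du.status = 1;

-- 2. Join order forced to follow the FROM clause: d, then du, then tdu.
EXPLAIN
SELECT DISTINCT STRAIGHT_JOIN d.device_id
FROM device AS d
INNER JOIN device_user AS du ON du.device_id = d.device_id
INNER JOIN topic_device_user AS tdu ON tdu.device_user_id = du.device_user_id
WHERE d.status = 1
  AND du.status = 1;
```

In the two outputs, compare the order of the rows (which table is read first) and the key, rows and Extra columns. A different driving table or a missing index in one of the plans usually explains a large gap in both run time and CPU.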
DELIMITER //
DROP PROCEDURE IF EXISTS salary_ref//
CREATE PROCEDURE salary_ref(con int(11),IN id varchar(120),maxval int(11),minval int(11) , taxo int(11))
BEGIN
DECLARE s VARCHAR(50);
IF con = 1 THEN
SELECT `i` . * , `taxo`.`id` , `t`.`item_id` AS id, `u`.`name` AS user_name, `t`.`value` AS val
FROM (
`taxonomy` AS taxo
)
JOIN `item_taxo` AS t ON `t`.`taxo_id` = `taxo`.`id`
INNER JOIN `items` AS i ON `i`.`id` = `t`.`item_id`
INNER JOIN `users` AS u ON `u`.`id` = `i`.`created_by`
WHERE `t`.`value`
BETWEEN minval
AND maxval
AND `t`.`taxo_id` = taxo
AND `i`.`parent_tag_id` in (id);
ELSE
SELECT `i` . * , `taxo`.`id` , `t`.`item_id` AS id, `u`.`name` AS user_name, `t`.`value` AS val
FROM (
`taxonomy` AS taxo
)
JOIN `item_taxo` AS t ON `t`.`taxo_id` = `taxo`.`id`
INNER JOIN `items` AS i ON `i`.`id` = `t`.`item_id`
INNER JOIN `users` AS u ON `u`.`id` = `i`.`created_by`
WHERE `t`.`value`
BETWEEN minval
AND maxval
AND `t`.`taxo_id` = taxo
AND `i`.`parent_tag_id` in (id);
END IF;
END;
//
DELIMITER ;
Calling it:
call salary_ref(2,"\'36\',\'50\',\'57\'",17000000,0,7)
This does not work for me.
The following changes should solve the issues.
Change 1:
I suggest not giving stored procedure parameters the same names as table columns.
Unless you handle them properly, an identifier conflict can arise, although that does not appear to be happening here.
Change the procedure signature as follows:
CREATE PROCEDURE salary_ref(
_con int(11), IN csv_id_values varchar(120),
_maxval int(11), _minval int(11), _taxo int(11)
)
Change 2:
You are trying to pass comma-separated values for the id field to use in the WHERE clause.
Escaping the quotes won't solve your problem: IN (id) treats the whole parameter as a single string value, not as a list. Pass the values as a plain CSV string instead:
call salary_ref( 2, '36,50,57', 17000000, 0, 7 )
The CSV value '36,50,57' can be used as is in the WHERE clause.
See Change 3 below.
Change 3:
You can use FIND_IN_SET on CSV values instead of IN in the WHERE clause.
WHERE `t`.`value` BETWEEN _minval AND _maxval
AND `t`.`taxo_id` = _taxo
AND FIND_IN_SET( `i`.`parent_tag_id`, csv_id_values );
Also, I think your procedure body is incomplete: the IF con = 1 and ELSE branches run the same SELECT statement, so the condition is redundant unless you change one of them.
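To illustrate what FIND_IN_SET does in Change 3, here are a few standalone calls using the literal values from the question. They can be run on their own, without the procedure or its tables.

```sql
-- FIND_IN_SET returns the 1-based position of the first argument within the
-- comma-separated list, or 0 when the value is not in the list.
SELECT FIND_IN_SET('50', '36,50,57');    -- 2
SELECT FIND_IN_SET('99', '36,50,57');    -- 0

-- Each element is compared as-is, so a list with spaces after the commas
-- does not match.
SELECT FIND_IN_SET('50', '36, 50, 57');  -- 0
```

Any non-zero result counts as true in a WHERE clause. One caveat: because the column is wrapped in a function call, MySQL generally cannot use an index on parent_tag_id for this condition, so it suits short lists and modest tables best.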
I have two tables, categories and products. For each category I would like to count how many products there are in all of its subcategories. I have already counted how many are in each category itself. Example tables are:
Categories:
ID ParentID ProductCount SubCategoryProducts
1 NULL 0
2 1 2
3 2 1
Products:
ProductID CategoryID
123 2
124 2
125 3
So I would like my query to produce:
ID ParentID ProductCount SubCategoryProducts
1 NULL 0 3
2 1 2 1
3 2 1 0
It simply needs to be a SELECT query; there is no need to update the database.
Any ideas?
EDIT: SQL Fiddle: http://sqlfiddle.com/#!2/1941a/4/0
If it were me I'd create a stored procedure. The other option is to loop through the results of the first query in PHP and run another query for each ID, but this kind of logic can slow down your page drastically.
Here's a nice tutorial on stored procedures: http://net.tutsplus.com/tutorials/an-introduction-to-stored-procedures/
Basically you run the same loops I mentioned above, just as you would with PHP, but inside the database, which avoids a round trip per query. The procedure is stored in the database and can be called like a function. The result is the same as a query.
As requested, here's a sample procedure (or rather, two). In my case, "ags_orgs" acts much like your categories table, with a parentOrgID column. "getChildOrgs" calls itself recursively, since I had no idea how many levels down I had to go. (This was written for MSSQL, so there are probably differences with MySQL.) Unfortunately it doesn't count rows; it fetches data. I highly recommend following a tutorial or two to get a better grip on how it works:
USE [dbname]
GO
/****** Object: StoredProcedure [dbo].[getChildOrgs] Script Date: 09/26/2012 15:30:06 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[getChildOrgs]
@myParentID int,
@isActive tinyint = NULL
AS
BEGIN
SET NOCOUNT ON
DECLARE @orgID int, @orgName varchar(255), @level int
DECLARE cur CURSOR LOCAL FOR SELECT orgID FROM dbo.ags_orgs WHERE parentOrgID = @myParentID AND isActive = ISNULL(@isActive, isActive) ORDER BY orderNum, orgName
OPEN cur
fetch next from cur into @orgID
WHILE @@fetch_status = 0
BEGIN
INSERT INTO #temp_childOrgs SELECT orgID,orgName,description,parentOrgID,adminID,isActive,@@NESTLEVEL-1 AS level FROM dbo.ags_orgs WHERE orgID = @orgID
EXEC getChildOrgs @orgID, @isActive
-- get next result
fetch next from cur into @orgID
END
CLOSE cur
DEALLOCATE cur
END
GO
Which is called by this proc:
USE [dbname]
GO
/****** Object: StoredProcedure [dbo].[execGetChildOrgs] Script Date: 09/26/2012 15:29:34 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[execGetChildOrgs]
@parentID int,
@isActive tinyint = NULL,
@showParent tinyint = NULL
AS
BEGIN
CREATE TABLE #temp_childOrgs
(
orgID int,
orgName varchar(255),
description text,
parentOrgID int,
adminID int,
isActive tinyint,
level int
)
-- if this isn't AGS top level (0), make the first record reflect the requested organization
IF @parentID != 0 AND @showParent = 1
BEGIN
INSERT INTO #temp_childOrgs SELECT orgID,orgName,description,parentOrgID,adminID,isActive,0 AS level FROM dbo.ags_orgs WHERE orgID = @parentID
END
exec getChildOrgs @parentID, @isActive
SELECT * FROM #temp_childOrgs
DROP TABLE #temp_childOrgs
END
GO
Here is my procedure for counting products in all subcategories
DELIMITER $$
CREATE PROCEDURE CountItemsInCategories(IN tmpTable INT, IN parentId INT, IN updateId INT)
BEGIN
DECLARE itemId INT DEFAULT NULL;
DECLARE countItems INT DEFAULT NULL;
DECLARE done INT DEFAULT FALSE;
DECLARE recCount INT DEFAULT NULL;
DECLARE
bufItemCategory CURSOR FOR
SELECT
itemCategory.id AS id,
COUNT(CASE WHEN item.isVisible = 1 then 1 ELSE NULL END) items
FROM
itemCategory
LEFT JOIN item ON
item.categoryId = itemCategory.id
WHERE
itemCategory.isVisible = 1 AND itemCategory.categoryParentId = parentId
GROUP BY
itemCategory.id
ORDER BY
itemCategory.name;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
SET max_sp_recursion_depth = 255; -- 255 is the maximum value MySQL allows
IF tmpTable = 1 THEN
DROP TEMPORARY TABLE IF EXISTS tblResults;
CREATE TEMPORARY TABLE IF NOT EXISTS tblResults(
id INT NOT NULL PRIMARY KEY,
items INT
);
END IF;
OPEN bufItemCategory;
Reading_bufItemCategory: LOOP
FETCH FROM bufItemCategory INTO itemId, countItems;
IF done THEN
LEAVE Reading_bufItemCategory;
END IF;
IF tmpTable = 1 THEN
INSERT INTO tblResults VALUES(itemId, countItems);
ELSE
UPDATE tblResults SET items = items + countItems WHERE id = updateId;
END IF;
SET recCount = (SELECT count(*) FROM itemCategory WHERE itemCategory.categoryParentId = itemId AND itemCategory.isVisible = 1);
IF recCount > 0 THEN
CALL CountItemsInCategories(0, itemId, CASE WHEN updateId = 0 then itemId ELSE updateId END);
END IF;
END LOOP Reading_bufItemCategory;
CLOSE bufItemCategory;
IF tmpTable = 1 THEN
SELECT * FROM tblResults WHERE items > 0;
DROP TEMPORARY TABLE IF EXISTS tblResults;
END IF;
END $$
DELIMITER ;
To call the procedure, run:
CALL CountItemsInCategories(firstLoop, parentId, updateId);
where the parameters are:
firstLoop (tmpTable in the procedure signature) - always "1" for the first call
parentId - parent of the subcategories
updateId - id of the row to update, always "0" for the first call
For example:
CALL CountItemsInCategories(1,1,0);
I hope this example will be useful to someone.
This assumes you have
a product table named prods
prod_id|categ_id
and a category table named categ
categ_id|parent_categ_id
As you seem to be using an adjacency list structure, where the foreign key column parent_categ_id references the categ_id column of the same table,
the following query should work (WITH RECURSIVE requires MySQL 8.0 or later):
select c1.categ_id,c1.parent_categ_id,count(prods.prod_id)
as product_count from categ c1
join prods on prods.categ_id=c1.categ_id or prods.categ_id
in( with recursive tree(id,parent_id)as
(select categ_id,parent_categ_id from categ
where categ_id=c1.categ_id
union all
select cat.categ_id,cat.parent_categ_id from categ cat
join tree on tree.id=cat.parent_categ_id) select id from tree)
group by c1.categ_id,c1.parent_categ_id
order by product_count
You can do this in one statement if you have a limit on the depth of the hierarchy. You said you only have 4 levels in total.
SELECT SUM(ProductCount)
FROM (
SELECT c0.ID, c0.ProductCount
FROM Categories AS c0
WHERE c0.ID = 1
UNION ALL
SELECT c1.ID, c1.ProductCount
FROM Categories AS c0
JOIN Categories AS c1 ON c0.ID = c1.ParentID
WHERE c0.ID = 1
UNION ALL
SELECT c2.ID, c2.ProductCount
FROM Categories AS c0
JOIN Categories AS c1 ON c0.ID = c1.ParentID
JOIN Categories AS c2 ON c1.ID = c2.ParentID
WHERE c0.ID = 1
UNION ALL
SELECT c3.ID, c3.ProductCount
FROM Categories AS c0
JOIN Categories AS c1 ON c0.ID = c1.ParentID
JOIN Categories AS c2 ON c1.ID = c2.ParentID
JOIN Categories AS c3 ON c2.ID = c3.ParentID
WHERE c0.ID = 1
) AS _hier;
That'll work for this query if you store the hierarchy in the way you're doing, which is called Adjacency List. Basically, the ParentID is the way each node records its position in the hierarchy.
There are a few other ways of storing hierarchies that allow for easier querying of whole trees or subtrees. The best data organization depends on which queries you want to run.
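As one illustration of such an alternative, here is a minimal sketch of a closure table for the sample data in the question. The CategoryPaths table and its columns are hypothetical names introduced for this example; the Categories table with ID, ParentID and ProductCount is assumed to exist as shown in the question.

```sql
-- Hypothetical closure table: one row per (ancestor, descendant) pair,
-- including each category paired with itself at depth 0.
CREATE TABLE CategoryPaths (
  ancestor   INT NOT NULL,
  descendant INT NOT NULL,
  depth      INT NOT NULL,
  PRIMARY KEY (ancestor, descendant)
);

-- Paths for the sample tree 1 -> 2 -> 3.
INSERT INTO CategoryPaths (ancestor, descendant, depth) VALUES
  (1, 1, 0), (1, 2, 1), (1, 3, 2),
  (2, 2, 0), (2, 3, 1),
  (3, 3, 0);

-- Products in all subcategories of each category, at any depth.
-- depth > 0 excludes the category's own products.
SELECT c.ID,
       c.ParentID,
       c.ProductCount,
       COALESCE(SUM(d.ProductCount), 0) AS SubCategoryProducts
FROM Categories AS c
LEFT JOIN CategoryPaths AS p
       ON p.ancestor = c.ID AND p.depth > 0
LEFT JOIN Categories AS d
       ON d.ID = p.descendant
GROUP BY c.ID, c.ParentID, c.ProductCount
ORDER BY c.ID;
```

With the sample rows this gives SubCategoryProducts of 3, 1 and 0 for categories 1, 2 and 3, matching the output requested in the question, and it works for any depth. The cost is that the paths table has to be maintained whenever a category is added or moved.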
Here are some more resources:
Models for Hierarchical Data with SQL and PHP (user @RaymondNijland linked to it in a comment)
I gave that presentation as a webinar (free to view the recording, but requires registration).
My book, SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming.
What is the most efficient/elegant way to parse a flat table into a tree?
I have a tree (nested categories) stored as follows:
CREATE TABLE `category` (
`category_id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`category_name` varchar(100) NOT NULL,
`parent_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`category_id`),
UNIQUE KEY `category_name_UNIQUE` (`category_name`,`parent_id`),
KEY `fk_category_category1` (`parent_id`,`category_id`),
CONSTRAINT `fk_category_category1` FOREIGN KEY (`parent_id`) REFERENCES `category` (`category_id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_spanish_ci
I need to feed my client-side language (PHP) with node information (child+parent) so it can build the tree in memory. I can tweak my PHP code but I think the operation would be way simpler if I could just retrieve the rows in such an order that all parents come before their children. I could do that if I knew the level for each node:
SELECT category_id, category_name, parent_id
FROM category
ORDER BY level -- No `level` column so far :(
Can you think of a way (view, stored routine or whatever...) to calculate the node level? I guess it's okay if it's not real-time and I need to recalculate it on node modification.
First update: progress so far
I've written these triggers based on feedback by Amarghosh:
DROP TRIGGER IF EXISTS `category_before_insert`;
DELIMITER //
CREATE TRIGGER `category_before_insert` BEFORE INSERT ON `category` FOR EACH ROW BEGIN
IF NEW.parent_id IS NULL THEN
SET @parent_level = 0;
ELSE
SELECT level INTO @parent_level
FROM category
WHERE category_id = NEW.parent_id;
END IF;
SET NEW.level = @parent_level+1;
END//
DELIMITER ;
DROP TRIGGER IF EXISTS `category_before_update`;
DELIMITER //
CREATE TRIGGER `category_before_update` BEFORE UPDATE ON `category` FOR EACH ROW BEGIN
IF NEW.parent_id IS NULL THEN
SET @parent_level = 0;
ELSE
SELECT level INTO @parent_level
FROM category
WHERE category_id = NEW.parent_id;
END IF;
SET NEW.level = @parent_level+1;
END//
DELIMITER ;
It seems to work fine for insertions and modifications, but it doesn't work for deletions: MySQL does not fire triggers when rows are changed by cascaded foreign key actions, such as the ON DELETE SET NULL on parent_id here.
The first obvious idea is to write a new trigger for deletion; however, a trigger on table category is not allowed to modify other rows in that same table:
DROP TRIGGER IF EXISTS `category_after_delete`;
DELIMITER //
CREATE TRIGGER `category_after_delete` AFTER DELETE ON `category` FOR EACH ROW BEGIN
/*
* Raises an error, see below
*/
UPDATE category SET parent_id=NULL
WHERE parent_id = OLD.category_id;
END//
DELIMITER ;
Error:
Grid editing error: SQL Error (1442):
Can't update table 'category' in
stored function/trigger because it is
already used by statement which
invoked this stored function/trigger.
Second update: working solution (unless proved wrong)
My first attempt was pretty sensible, but I hit a problem I could not solve: when a series of operations is launched from a trigger, MySQL does not allow it to alter other rows of the same table. Since deleting a node requires adjusting the level of all its descendants, I had hit a wall.
In the end, I changed the approach using code from here: rather than correcting individual levels when a node changes, I have code that recalculates all levels, and I trigger it on every edit. Since the calculation is slow and fetching the data requires a very complex query, I cache the result in a table. In my case that is acceptable, since edits should be rare.
1. New table for cached levels:
CREATE TABLE `category_level` (
`category_id` int(10) NOT NULL,
`parent_id` int(10) DEFAULT NULL, -- Not really necessary
`level` int(10) NOT NULL,
PRIMARY KEY (`category_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_spanish_ci
2. Helper function to calculate levels
If I've understood it correctly, it doesn't return anything useful by itself. Instead, it keeps its state in session variables.
CREATE FUNCTION `category_connect_by_parent_eq_prior_id`(`value` INT) RETURNS int(10)
READS SQL DATA
BEGIN
DECLARE _id INT;
DECLARE _parent INT;
DECLARE _next INT;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET @category_id = NULL;
SET _parent = @category_id;
SET _id = -1;
IF @category_id IS NULL THEN
RETURN NULL;
END IF;
LOOP
SELECT MIN(category_id)
INTO @category_id
FROM category
WHERE COALESCE(parent_id, 0) = _parent
AND category_id > _id;
IF @category_id IS NOT NULL OR _parent = @start_with THEN
SET @level = @level + 1;
RETURN @category_id;
END IF;
SET @level := @level - 1;
SELECT category_id, COALESCE(parent_id, 0)
INTO _id, _parent
FROM category
WHERE category_id = _parent;
END LOOP;
END
3. Procedure to launch the recalculation process
It basically encapsulates the complex query that retrieves the levels aided by the helper function.
CREATE PROCEDURE `update_category_level`()
SQL SECURITY INVOKER
BEGIN
DELETE FROM category_level;
INSERT INTO category_level (category_id, parent_id, level)
SELECT hi.category_id, parent_id, level
FROM (
SELECT category_connect_by_parent_eq_prior_id(category_id) AS category_id, @level AS level
FROM (
SELECT @start_with := 0,
@category_id := @start_with,
@level := 0
) vars, category
WHERE @category_id IS NOT NULL
) ho
JOIN category hi ON hi.category_id = ho.category_id;
END
4. Triggers to keep cache table up-to-date
CREATE TRIGGER `category_after_insert` AFTER INSERT ON `category` FOR EACH ROW BEGIN
call update_category_level();
END
CREATE TRIGGER `category_after_update` AFTER UPDATE ON `category` FOR EACH ROW BEGIN
call update_category_level();
END
CREATE TRIGGER `category_after_delete` AFTER DELETE ON `category` FOR EACH ROW BEGIN
call update_category_level();
END
5. Known issues
It's pretty suboptimal if nodes are altered frequently.
MySQL does not allow transaction control statements or explicit table locking inside triggers (or in procedures called from them), so you must take care of these details in the code that edits nodes.
There's an excellent series of articles here on Hierarchical Queries in MySQL that includes how to identify level, leaf nodes, loops in the hierarchy, etc.
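If you are on MySQL 8.0 or later (an assumption, since the question predates it), the level can also be computed at query time with a recursive common table expression, with no level column, triggers or cache table. This is only a sketch against the category table from the question; the CTE name tree is made up, and roots are given level 1 to match the asker's triggers.

```sql
-- Assumes MySQL 8.0+ and the category table from the question.
WITH RECURSIVE tree (category_id, category_name, parent_id, level) AS (
  -- Anchor: root categories have no parent and sit at level 1.
  SELECT category_id, category_name, parent_id, 1
  FROM category
  WHERE parent_id IS NULL
  UNION ALL
  -- Recursive step: each child is one level below its parent.
  SELECT c.category_id, c.category_name, c.parent_id, tree.level + 1
  FROM category AS c
  JOIN tree ON c.parent_id = tree.category_id
)
SELECT category_id, category_name, parent_id, level
FROM tree
ORDER BY level, category_id;  -- every parent comes before its children
```

Ordering by level guarantees that each parent row is returned before any of its children, which is the property the PHP tree builder in the question needs.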
If there won't be any cycles (if it will always be a tree and not a graph), you can have a level field that is set to zero (topmost) by default and a trigger that sets the level to (parent's level + 1) whenever you set the parent_id.
CREATE TRIGGER setLevelBeforeInsert BEFORE INSERT ON category
FOR EACH ROW
BEGIN
IF NEW.parent_id IS NOT NULL THEN
SELECT level INTO @pLevel FROM category WHERE category_id = NEW.parent_id;
SET NEW.level = @pLevel + 1;
ELSE
SET NEW.level = 0;
END IF;
END;
"No level column so far :("
Hmm *shrug*
I'd just maintain this level field manually.
Say, like a materialized path, with just one update after insert and without all these fancy triggers:
a field that would look like 000000100000210000022 for a third-level node, for example.
"so it can build the tree in memory."
If you are going to load the whole table into PHP, I see no problem here. A little recursive function can give you your tree as nested arrays.
"I can tweak my PHP code but I think the operation would be way simpler"
Well, well.
The code you've got so far doesn't seem "way simple" to me :)