How can I revise the following query, which relies on functions that themselves contain subqueries and joins? I want to append extra values to my main query, which draws on the main tables and repeatedly joins two primary tables (risks, users).
Creating an MCVE (Minimal, Complete, and Verifiable Example) proved somewhat challenging because the request sent to SQL Fiddle had too many rows (too many characters). After removing nearly all rows from the two main tables (users, risks), I ended up with a running query.
In the Fiddle (http://www.sqlfiddle.com/#!9/1d52a0/17), the create-function and insert statements contain fewer rows than the actual example on my local PC, because the full script exceeded SQL Fiddle's 8000-character limit on the request payload.
The actual tables have about 100 rows for risks and about 20 rows for users, and the query takes about 3 seconds to run.
What can I do to speed up the query, ideally cutting execution time to half or less: store the function results in a table, revise the query, add indexes, move the joins to the outer main query, use a stored procedure, or restructure the query? SQL Fiddle does not accept all the rows needed, so I pasted a very limited subset. Even on SQL Fiddle the full SELECT query (see below) does not run, due to a stack overflow (pun partially intended).
Base Queries that do run on the fiddle (see fiddle)
select * from users;
select * from risks;
select * from riskevents;
select * from riskmatrixthresholds;
select * from risklevels;
#significantly minimized result set but still query does not run due to stack overflow issue on sql fiddle - see fiddle result (on bottom most portion of fiddle query output)
SELECT r.RiskID,
r.CreatorID,
r.OwnerID,
r.ApproverID,
r.RiskTitle,
r.RiskStatement,
r.ClosureCriteria,
r.RiskState,
r.Context AS 'Context',
GetRiskUserLastOrFirstName(GetRiskUserID('Creator', r.RiskID,0),r.RiskID, 'Last','') AS 'creator.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Creator', r.RiskID,0),r.RiskID, 'First','') AS 'creator.firstname',
GetRiskUserLastOrFirstName(GetRiskUserID('Owner', r.RiskID,0),r.RiskID, 'Last','') AS 'owner.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Owner', r.RiskID,0),r.RiskID, 'First','') AS 'owner.firstname',
GetRiskUserLastOrFirstName(GetRiskUserID('Approver',r.RiskID,0),r.RiskID, 'Last','') AS 'approver.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Approver',r.RiskID,0),r.RiskID, 'First','') AS 'approver.firstname',
r.Likelihood AS 'OriginalLikelihood',
r.Technical AS 'OriginalTechnical',
r.Schedule AS 'OriginalSchedule',
r.Cost AS 'OriginalCost',
GREATEST(r.Technical, r.Schedule, r.Cost) AS 'OriginalConsequence',
RiskValue(r.Likelihood, GREATEST(r.Technical, r.Schedule, r.Cost),0) AS 'OriginalValue',
RiskLevel(RiskValue(r.Likelihood, GREATEST(r.Technical, r.Schedule, r.Cost),0),'') AS 'OriginalLevel',
LatestEventDate(r.RiskID, r.AssessmentDate,'') AS 'LatestEventDate',
r.AssessmentDate AS 'AssessmentDate',
(SELECT CurrentLikelihood(r.RiskID,0)) AS 'CurrentLikelihood',
(SELECT CurrentConsequence(r.RiskID,0)) AS 'CurrentConsequence',
(SELECT CurrentRiskValue(r.RiskID,0)) AS 'CurrentValue',
(SELECT RiskLevel(CurrentRiskValue(r.RiskID,0),'')) AS 'CurrentLevel'
FROM risks r;
Create Function Script
CREATE TABLE `riskevents` (
`ID` int NOT NULL AUTO_INCREMENT,
`EventID` int ,
`RiskID` int ,
`EventTitle` text,
`EventStatus` varchar(10) ,
`EventOwnerID` int ,
`ActualDate` date ,
`ScheduleDate` date ,
`BaselineDate` date ,
`ActualLikelihood` int ,
`ActualTechnical` int ,
`ActualSchedule` int ,
`ActualCost` int ,
`ScheduledLikelihood` int ,
`ScheduledTechnical` int ,
`ScheduledSchedule` int ,
`ScheduledCost` int ,
`BaselineLikelihood` int ,
`BaselineTechnical` int ,
`BaselineSchedule` int ,
`BaselineCost` int ,
PRIMARY KEY (`ID`)
);
CREATE TABLE `risklevels` (
`ID` int NOT NULL AUTO_INCREMENT,
`RiskLevelID` int ,
`RiskMaximum` float ,
`RiskHigh` float ,
`RiskMedium` float ,
`RiskMinimum` float ,
PRIMARY KEY (`ID`)
);
CREATE TABLE `riskmatrixthresholds` (
`ID` int NOT NULL AUTO_INCREMENT,
`CellID` int ,
`Likelihood` int ,
`Consequence` int ,
`Level` decimal(2,2) ,
PRIMARY KEY (`ID`)
);
CREATE TABLE `risks` (
`ID` int NOT NULL AUTO_INCREMENT,
`RiskState` varchar(10) ,
`RiskID` int ,
`RiskTitle` text CHARACTER SET latin1,
`RiskStatement` text CHARACTER SET latin1,
`ApproverID` int ,
`OwnerID` int ,
`CreatorID` int ,
`Likelihood` int ,
`Technical` int ,
`Schedule` int ,
`Cost` int ,
`ClosureCriteria` text CHARACTER SET latin1,
`CategoryID` int ,
`AssessmentDate` date ,
`CompletionDate` date ,
`ClosureDate` date ,
`Context` text,
PRIMARY KEY (`ID`),
UNIQUE KEY `risk_index` (`RiskID`)
);
CREATE TABLE `users` (
`ID` int NOT NULL AUTO_INCREMENT,
`UserID` int NOT NULL,
`LastName` char(25) ,
`FirstName` char(15) ,
`Title` char(20) ,
`Email` varchar(30) ,
`Phone` char(12) ,
`Extension` char(4) ,
`Department` char(25) ,
PRIMARY KEY (`ID`),
UNIQUE KEY `user_index` (`UserID`),
KEY `SURROGATE` (`UserID`)
);
insert into `riskevents`(`ID`,`EventID`,`RiskID`,`EventTitle`,`EventStatus`,`EventOwnerID`,`ActualDate`,`ScheduleDate`,`BaselineDate`,`ActualLikelihood`,`ActualTechnical`,`ActualSchedule`,`ActualCost`,`ScheduledLikelihood`,`ScheduledTechnical`,`ScheduledSchedule`,`ScheduledCost`,`BaselineLikelihood`,`BaselineTechnical`,`BaselineSchedule`,`BaselineCost`) values
(171,0,1,'Risk','Complete',5,'2019-06-14',NULL,'2019-06-14',5,2,2,5,NULL,NULL,NULL,NULL,5,2,2,5),
(184,0,10,'Risk','Complete',21,'2019-10-07',NULL,'2019-10-07',5,4,5,4,NULL,NULL,NULL,NULL,5,4,5,4);
insert into `risklevels`(`ID`,`RiskLevelID`,`RiskMaximum`,`RiskHigh`,`RiskMedium`,`RiskMinimum`) values
(1,1,1,0.55,0.3,0);
insert into `riskmatrixthresholds`(`ID`,`CellID`,`Likelihood`,`Consequence`,`Level`) values
(1,1,1,1,0.09),
(2,2,1,2,0.12),
(3,3,1,3,0.16),
(4,4,1,4,0.19),
(5,5,1,5,0.23),
(6,6,2,1,0.12),
(7,7,2,2,0.19),
(8,8,2,3,0.27),
(9,9,2,4,0.34),
(10,10,2,5,0.41),
(11,11,3,1,0.16),
(12,12,3,2,0.27),
(13,13,3,3,0.37),
(14,14,3,4,0.48),
(15,15,3,5,0.59),
(16,16,4,1,0.19),
(17,17,4,2,0.34),
(18,18,4,3,0.48),
(19,19,4,4,0.63),
(20,20,4,5,0.77),
(21,21,5,1,0.23),
(22,22,5,2,0.41),
(23,23,5,3,0.59),
(24,24,5,4,0.77),
(25,25,5,5,0.95);
insert into `risks`(`ID`,`RiskState`,`RiskID`,`RiskTitle`,`RiskStatement`,`ApproverID`,`OwnerID`,`CreatorID`,`Likelihood`,`Technical`,`Schedule`,`Cost`,`ClosureCriteria`,`CategoryID`,`AssessmentDate`,`CompletionDate`,`ClosureDate`,`Context`) values
(1,'Completed',1,'t','t',1,5,1,5,2,2,5,'t',NULL,'2019-06-14','2020-09-26',NULL,'t'),
(2,'Completed',2,'t','t',2,1,1,5,3,4,2,'test',NULL,'2019-05-14',NULL,NULL,'t');
insert into `users`(`ID`,`UserID`,`LastName`,`FirstName`,`Title`,`Email`,`Phone`,`Extension`,`Department`) values
(1,1,'Admin','','Admin','a@yz.com','17890','1234',''),
(2,2,'Last','First','Engineer','a@yz.com','123890','1234','Supplier');
CREATE FUNCTION Consequence(technical int, sched int, cost int, consequence int) RETURNS int
BEGIN
select GREATEST(technical, sched, cost) into consequence;
return consequence;
END;
CREATE FUNCTION CurrentRiskEventID(riskidentifier int, eid int) RETURNS int
BEGIN
select MAX(e.EventID) into eid
FROM riskevents e
WHERE e.eventstatus not in('Open')
AND e.riskid = riskidentifier;
return eid;
END;
CREATE FUNCTION CurrentConsequence(riskidentifier int, currentconsequence int) RETURNS int
BEGIN
SELECT coalesce(
(SELECT GREATEST(actualtechnical, actualschedule, actualcost)
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)
and actualtechnical is not null
and actualschedule is not null),
(SELECT greatest(technical, schedule, cost)
from risks
Where riskid = riskidentifier)
) into currentconsequence;
return currentconsequence;
END;
CREATE FUNCTION CurrentLikelihood(riskidentifier int, currentlikelihood int) RETURNS int
BEGIN
SELECT coalesce(
(SELECT actuallikelihood
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)),
(SELECT r.likelihood
FROM risks r
WHERE r.riskid = riskidentifier)) into currentlikelihood;
return currentlikelihood;
END;
CREATE FUNCTION CurrentRiskLevel(riskidentifier int, currentrisklevel int) RETURNS int
BEGIN
select RiskLevel(CurrentRiskValue(riskidentifier, 0), '') into currentrisklevel;
return currentrisklevel;
END;
CREATE FUNCTION CurrentRiskValue(riskidentifier int, currentriskvalue int) RETURNS int
BEGIN
SELECT RiskValue(CurrentLikelihood(riskidentifier, 0), CurrentConsequence(riskidentifier, 0), 0) into currentriskvalue;
return currentriskvalue;
END;
CREATE FUNCTION GetRiskUserID(riskusertype VARCHAR(25), riskidentifier int, riskuserid int) RETURNS int
BEGIN
SELECT COALESCE(userres.userid, 0) into riskuserid FROM
(
SELECT r.creatorid, r.ownerid, r.approverid, u.userid
FROM risks r, users u
WHERE r.riskid = (select riskidentifier) and
(
((select riskusertype) = 'Creator' AND u.userid = r.creatorid) OR
((select riskusertype) = 'Approver' AND u.userid = r.approverid) OR
((select riskusertype) = 'Owner' AND u.userid = r.ownerid)
)
) userres;
RETURN riskuserid;
END;
CREATE FUNCTION GetRiskUserLastOrFirstName(riskuserid int, riskid int, whichname char(25), firstorlastname char(25)) RETURNS char(25) CHARSET utf8 COLLATE utf8_unicode_ci
BEGIN
SELECT (case
when whichname = 'Last' then u.LastName
WHEN whichname = 'First' THEN u.FirstName
end)
into firstorlastname
FROM users u,risks r
WHERE u.UserID = riskuserid
AND r.RiskID = riskid;
return firstorlastname;
END;
CREATE FUNCTION LatestEventDate(riskidentifier int, riskassessmentdate date, latestdate date) RETURNS date
BEGIN
SELECT COALESCE(
(SELECT ActualDate FROM riskevents evt WHERE evt.eventid = CurrentRiskEventID(riskidentifier, 0) and evt.riskid = riskidentifier),
(SELECT riskassessmentdate)) into latestdate;
return latestdate;
END;
CREATE FUNCTION RiskLevel(riskvalue int, risklevel varchar(4)) RETURNS varchar(4)
begin
SELECT
CASE
WHEN riskvalue >= levels.riskhigh*100 THEN 'High'
WHEN riskvalue >= levels.riskmedium*100 THEN 'Med'
ELSE 'Low'
ENd as cat into risklevel
FROM risklevels levels;
return risklevel;
END;
CREATE FUNCTION RiskValue(likelihood int, consequence int, riskvalue int) RETURNS int
BEGIN
SELECT m.level*100 INTO riskvalue FROM riskmatrixthresholds m WHERE m.likelihood = likelihood AND m.consequence = consequence;
RETURN riskvalue;
END;
Note: SQL is a declarative language, not a procedural language. You tell it what you want, not how to get it. Your use of functions and so forth is procedural.
How can you make this application faster?
First, use the latest version of MySQL (8+, or MariaDB 10.4+). Later versions get faster.
Second, you have stated a requirement to use "subqueries composed of functions". That means you probably can't do much about the performance.
Why not? The subqueries buried in functions are so-called dependent subqueries. Those don't perform well. And because they're buried MySQL's query planner can't do anything useful to optimize them.
Refactoring your query to avoid using functions with SELECT operations will give the query planner visibility into your overall query. That will give it a chance to optimize things. You might replace them with views.
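As a rough sketch of the view idea: the lookup done inside CurrentRiskEventID() could be written once as a view and then joined, rather than called per row. The view name current_risk_event is made up for illustration, and the sketch assumes that RiskID plus EventID identifies a single riskevents row.

```sql
-- Hypothetical view: latest non-open event per risk, computed once for all risks.
CREATE OR REPLACE VIEW current_risk_event AS
SELECT e.RiskID, MAX(e.EventID) AS EventID
FROM riskevents e
WHERE e.EventStatus <> 'Open'
GROUP BY e.RiskID;

-- Join the view instead of calling a function for every row of risks.
-- COALESCE falls back to the values stored on the risk when no event exists.
SELECT r.RiskID,
       COALESCE(e.ActualLikelihood, r.Likelihood) AS CurrentLikelihood,
       COALESCE(e.ActualDate, r.AssessmentDate)   AS LatestEventDate
FROM risks r
LEFT JOIN current_risk_event ce ON ce.RiskID = r.RiskID
LEFT JOIN riskevents e ON e.RiskID = ce.RiskID AND e.EventID = ce.EventID;
```

Because the view is plain SQL, the query planner can see through it and choose a join strategy, which it cannot do for logic hidden in a stored function.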
And don't use the comma-join syntax FROM tablea, tableb. That's been obsolete since 1992. Use
FROM tablea JOIN tableb ON tablea.joincolumn = tableb.joincolumn.
I'd offer you more advice but I can't figure out your intent.
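For illustration only, here is how part of the question's query might look with the name and original-value lookups written as joins. The aliases c, o, a, m and l are my own, and the CROSS JOIN assumes risklevels holds exactly one row, as in the sample data; the remaining columns are left out for brevity.

```sql
-- Sketch: joins replace GetRiskUserLastOrFirstName(), RiskValue() and RiskLevel().
SELECT r.RiskID,
       r.RiskTitle,
       c.LastName  AS `creator.lastname`,
       c.FirstName AS `creator.firstname`,
       o.LastName  AS `owner.lastname`,
       o.FirstName AS `owner.firstname`,
       a.LastName  AS `approver.lastname`,
       a.FirstName AS `approver.firstname`,
       GREATEST(r.Technical, r.Schedule, r.Cost) AS OriginalConsequence,
       m.Level * 100 AS OriginalValue,
       CASE
           WHEN m.Level * 100 >= l.RiskHigh * 100   THEN 'High'
           WHEN m.Level * 100 >= l.RiskMedium * 100 THEN 'Med'
           ELSE 'Low'
       END AS OriginalLevel
FROM risks r
LEFT JOIN users c ON c.UserID = r.CreatorID    -- one join per user role
LEFT JOIN users o ON o.UserID = r.OwnerID
LEFT JOIN users a ON a.UserID = r.ApproverID
LEFT JOIN riskmatrixthresholds m
       ON m.Likelihood  = r.Likelihood
      AND m.Consequence = GREATEST(r.Technical, r.Schedule, r.Cost)
CROSS JOIN risklevels l;                       -- assumes a single threshold row
```

LEFT JOIN keeps risks whose creator, owner or approver has no matching users row, which is roughly what the functions do when they return NULL.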
Applying the following changes to CurrentLikelihood() and CurrentConsequence() reduced the query's execution time to 0.070 sec (0.082 sec total).
Old Current Likelihood Query (producing slow and incorrect output)
SELECT coalesce(
(SELECT actuallikelihood
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)),
(SELECT r.likelihood
FROM risks r
WHERE r.riskid = riskidentifier)) into currentlikelihood;
return currentlikelihood;
Working CurrentLikelihood Query
SELECT actuallikelihood INTO currentlikelihood
FROM riskevents
WHERE eventid = CurrentRiskEventID(riskidentifier)
AND riskid = riskidentifier;
Old CurrentConsequence Query (producing slow and incorrect output)
SELECT coalesce(
(SELECT GREATEST(actualtechnical, actualschedule, actualcost)
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)
and actualtechnical is not null
and actualschedule is not null),
(SELECT greatest(technical, schedule, cost)
from risks
Where riskid = riskidentifier)
) into currentconsequence;
Working CurrentConsequence Query
SELECT GREATEST(actualtechnical, actualschedule, actualcost) INTO currentconsequence
FROM riskevents
WHERE eventid = CurrentRiskEventID(riskidentifier)
AND riskid = riskidentifier;
Old CurrentRiskEventID() Query
select MAX(e.EventID) into currentriskeventid
FROM riskevents e
WHERE e.eventstatus not in('Open')
AND e.riskid = riskidentifier;
Modified CurrentRiskEventID() function
SELECT MAX(e.EventID) INTO currentriskeventid
FROM riskevents e
WHERE e.riskid = riskidentifier AND
(e.eventstatus != 'Open'
OR
(e.EventID = 0 AND e.eventstatus = 'Open'));
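One more thing that might help, offered as an assumption because the schema shown has only the primary key on riskevents: a composite index on the columns this lookup filters and aggregates. The index name is hypothetical, and whether the optimizer actually uses it should be checked with EXPLAIN.

```sql
-- Hypothetical index: RiskID first (equality filter), then EventID (for MAX),
-- with EventStatus included so the status test can be answered from the index.
CREATE INDEX idx_riskevents_risk_event
    ON riskevents (RiskID, EventID, EventStatus);

-- Inspect the plan for the lookup used inside the function.
EXPLAIN
SELECT MAX(e.EventID)
FROM riskevents e
WHERE e.RiskID = 1
  AND (e.EventStatus <> 'Open'
       OR (e.EventID = 0 AND e.EventStatus = 'Open'));
```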
Hi all, I have a MySQL table with a field of comma-separated values:
id res
=============================
1 hh_2,hh_5,hh_6
------------------------------
2 hh_3,hh_5,hh_4
------------------------------
3 hh_6,hh_8,hh_7
------------------------------
4 hh_2,hh_7,hh_4
------------------------------
Please see the example above. I need to compare each row's 'res' values with every other row's 'res' values and display a count of the matches. Please help me get the count.
For example,
in the first row, 'hh_2' also exists in the fourth row, so the count should be 2; likewise, every value needs to be compared across all rows.
I have run the function below and it works, but the table is very big. It has millions of records, so performance suffers. Checking one record against 50,000 records takes 25 seconds, and an input of 60 rows takes about an hour. Please help me optimize it.
CREATE FUNCTION `combine_two_field`(s1 CHAR(96), s3 TEXT) RETURNS int(11)
BEGIN
DECLARE ndx INT DEFAULT 0;
DECLARE icount INT DEFAULT 0;
DECLARE head1 char(10);
DECLARE head2 char(10);
DECLARE head3 char(10);
WHILE ndx <= LENGTH(s1) DO
SET head1 = SUBSTRING_INDEX(s3, ',', 1);
SET s3 = SUBSTRING(s3, LENGTH(head1) + 1 + @iSeparLen);
SET head2 = SUBSTRING_INDEX(s1, ',', 1);
SET s1 = SUBSTRING(s1, LENGTH(head2) + 1 + @iSeparLen);
IF (head1 = head2) THEN
SET icount = icount + 1;
END IF;
SET ndx = ndx + 1;
END WHILE;
RETURN icount;
END
The table is also very large, and I want to reduce the fetch time as well.
UPDATE QUERY:
DROP PROCEDURE IF EXISTS `pcompare7` $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `pcompare7`(IN in_analysis_id INT(11))
BEGIN
drop table if exists `tmp_in_results`;
CREATE TEMPORARY TABLE `tmp_in_results` (
`t_id` INTEGER UNSIGNED NOT NULL AUTO_INCREMENT,
`r_id` bigint(11) NOT NULL,
`r_res` char(11) NOT NULL,
PRIMARY KEY (`t_id`),
KEY r_res (r_res)
)
ENGINE = InnoDB;
SELECT splite_snp(r_snp,id,ruid) FROM results WHERE technical_status = 1 and critical_status = 1 and autosomal_status = 1 and gender_status != "NO CALL" and analys_id = in_analysis_id;
-- SELECT * FROM tmp_in_results;
-- COmpare Functionality
SELECT a.t_id, b.id, SUM(IF(FIND_IN_SET(a.r_res, b.r_snp), 1, 0)) FROM tmp_in_results a CROSS JOIN results b GROUP BY a.t_id, b.id;
END $$
Function FOR CREATE TEMP TABLE:
DROP FUNCTION IF EXISTS `splite_snp` $$
CREATE DEFINER=`root`@`localhost` FUNCTION `splite_snp`(s1 TEXT, in_id bigint(96), ruid char(11)) RETURNS tinyint(1)
BEGIN
DECLARE ndx INT DEFAULT 0;
DECLARE icount INT DEFAULT 0;
DECLARE head1 TEXT;
DECLARE head2 TEXT;
DECLARE intpos1 char(10);
DECLARE intpos2 char(10);
DECLARE Separ char(3) DEFAULT ',';
DECLARE iSeparLen INT;
SET @iSeparLen = LENGTH( Separ );
WHILE s1 != '' DO
SET intpos1 = SUBSTRING_INDEX(s1, ',', 1);
SET s1 = SUBSTRING(s1, LENGTH(intpos1) + 1 + @iSeparLen);
INSERT INTO tmp_in_results(r_id,r_res) VALUES(in_id,intpos1);
END WHILE;
RETURN TRUE;
END $$
New table structure
pc_input
id in_res in_id
=============================
1 hh_2 1000
------------------------------
2 hh_3 1000
------------------------------
3 hh_6 1001
------------------------------
4 hh_2 1001
------------------------------
res_snp
id r_res r_id
=============================
1 hh_2 999
------------------------------
2 hh_3 999
------------------------------
3 hh_9 999
------------------------------
4 hh_2 998
------------------------------
5 hh_6 998
------------------------------
6 hh_9 998
------------------------------
Result:
in_id r_id matches_count
=============================
1000 999 2 (hh_2,hh_3)
------------------------------
1000 998 1 (hh_2)
------------------------------
1001 999 1 (hh_2)
------------------------------
1001 998 2 (hh_2,hh_6)
------------------------------
I have added separate indexes on both tables: on in_res and in_id, and on r_res and r_id.
QUERY:
SELECT b.r_id,count(*) FROM pc_input AS a INNER JOIN results_snps AS b ON (b.r_snp = a.in_snp) group by a.in_id,b.r_id;
But the MySQL server froze. Could you please suggest another approach or a way to optimize my query?
EXPLAIN TABLE: res_snp
Field  Type         Null  Key  Default  Extra
id     bigint(11)   NO    PRI  NULL     auto_increment
r_snp  varchar(50)  NO    MUL  NULL
r_id   bigint(11)   NO    MUL  NULL
EXPLAIN TABLE: pc_input
Field   Type         Null  Key  Default  Extra
id      bigint(11)   NO    PRI  NULL     auto_increment
in_snp  varchar(55)  NO    MUL  NULL
in_id   bigint(11)   NO    MUL  NULL
Explain Query:
id  select_type  table  type  possible_keys  key    key_len  ref                       rows  Extra
1   SIMPLE       a      ALL   in_snp         NULL   NULL     NULL                      192   Using temporary; Using filesort
1   SIMPLE       b      ref   r_snp          r_snp  52       rutgers22042014.a.in_snp  2861  Using where
This is possible, but nasty. A properly normalised database would be far easier, but sometimes you have to work with an existing database.
Something like this should do it (not tested). It uses a couple of subqueries to generate the numbers 0 to 9, which combine to give a range from 0 to 99. That range is used with SUBSTRING_INDEX to split the string up, along with DISTINCT to eliminate the duplicates that this would otherwise generate (I assume there are no duplicates within any one line; if there are, they can be removed, but it gets more complicated). The result is then used as a subquery to do the counts.
SELECT aRes, COUNT(*)
FROM
(
SELECT DISTINCT sometable.id, SUBSTRING_INDEX(SUBSTRING_INDEX(sometable.res, ',', 1 + units.i + tens.i * 10), ',', -1) AS aRes
FROM sometable
CROSS JOIN (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
CROSS JOIN (SELECT 0 i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens
) Sub1
GROUP BY aRes
EDIT - now tested:-
http://www.sqlfiddle.com/#!2/0ef59/4
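To make the nested SUBSTRING_INDEX call in that query easier to follow, here is a small standalone illustration with a literal string in place of the res column. The column aliases are my own.

```sql
-- Inner call keeps the first 2 elements: 'hh_2,hh_5'.
-- Outer call with -1 keeps the last of those: 'hh_5'.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('hh_2,hh_5,hh_6', ',', 2), ',', -1) AS second_item;

-- Asking for more elements than exist makes the inner call return the whole
-- string, so the outer call yields the last element again: 'hh_6'.
-- This is where the duplicates that DISTINCT removes come from.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('hh_2,hh_5,hh_6', ',', 7), ',', -1) AS repeated_last;
```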
EDIT - Possible solution. Hopefully will be acceptably quick.
First extract your input rows into a temp table:-
CREATE TEMPORARY TABLE tmp_record
(
unique_id INT NOT NULL AUTO_INCREMENT,
id INT,
res varchar(25),
PRIMARY KEY (unique_id),
KEY `res` (`res`)
);
Load the above up with your test data
INSERT INTO tmp_record (unique_id, id, res)
VALUES
(1, 1, 'hh_2'),
(2, 1, 'hh_5'),
(3, 1, 'hh_6'),
(4, 2, 'hh_3'),
(5, 2, 'hh_5'),
(6, 2, 'hh_4');
Then you can do a join as follows.
SELECT a.id, b.id, SUM(IF(FIND_IN_SET(a.res, b.res), 1, 0))
FROM tmp_record a
CROSS JOIN sometable b
GROUP BY a.id, b.id
This joins every input row with every row of your main table and checks whether the individual input res is in the comma-separated list. If it is, the IF returns 1, otherwise 0. Those values are then summed, grouped by the two ids.
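As a quick illustration of the building block used in that join, FIND_IN_SET returns the 1-based position of a value in a comma-separated list, or 0 when it is absent. The literals and aliases below are only for demonstration.

```sql
SELECT FIND_IN_SET('hh_5', 'hh_2,hh_5,hh_6') AS found_pos,    -- 2 (second element)
       FIND_IN_SET('hh_9', 'hh_2,hh_5,hh_6') AS missing_pos;  -- 0 (not in the list)

-- Wrapped in IF, any non-zero position counts as one match.
SELECT IF(FIND_IN_SET('hh_5', 'hh_2,hh_5,hh_6'), 1, 0) AS is_match;  -- 1
```

Note that FIND_IN_SET cannot use an index on the list column, which is one reason the cross join may be slow on large tables.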
Not tested but hopefully this should work. I am unsure on performance (which might be slow as you are dealing with a LOT of potential records).
Note that temp tables only last for the length of time the connection to the database exists. If you need to do this over several scripts then you will probably need to create a normal table (and remember to drop it when you have finished with it)
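For comparison, the normalised layout mentioned at the start of this answer could look roughly like the sketch below, with one row per (row id, value) pair. The table name res_item and its index names are hypothetical, and the data is the sample from the question.

```sql
-- Hypothetical normalised table: each comma-separated value becomes its own row.
CREATE TABLE res_item (
    row_id INT NOT NULL,
    res    VARCHAR(25) NOT NULL,
    PRIMARY KEY (row_id, res),
    KEY idx_res (res, row_id)   -- lets the self-join below look values up by index
);

INSERT INTO res_item (row_id, res) VALUES
    (1, 'hh_2'), (1, 'hh_5'), (1, 'hh_6'),
    (2, 'hh_3'), (2, 'hh_5'), (2, 'hh_4'),
    (3, 'hh_6'), (3, 'hh_8'), (3, 'hh_7'),
    (4, 'hh_2'), (4, 'hh_7'), (4, 'hh_4');

-- Count shared values between every pair of distinct rows.
SELECT a.row_id AS input_id,
       b.row_id AS other_id,
       COUNT(*) AS matches_count
FROM res_item a
JOIN res_item b
  ON b.res = a.res
 AND b.row_id <> a.row_id
GROUP BY a.row_id, b.row_id;
```

With this shape the match is an ordinary equality join, so the database can use the index instead of scanning and parsing every list.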
UserID  UserName  ParentID  TopID
1       abc       Null      Null
2       edf       1         1
3       gef       1         1
4       huj       3         1
5       jdi       4         1
6       das       2         1
7       new       Null      Null
8       gka       7         7
TopID and ParentID both refer to UserID.
I want to get a user record together with its child and sub-child records. Here UserID 1 is the root, and its children are UserID 2 and UserID 3. So if the user ID is 1, I have to display all records from UserID 1 to UserID 6, since all of them are children or sub-children of the root. Similarly, for UserID 3 I have to display UserID 3, its child UserID 4, and UserID 5, the child of UserID 4.
if the userid is 3
output should be
Userid Username
3 gef
4 huj
5 jdi
I will know the UserID and the TopID, so how can I write the query to achieve the above result?
SELECT UserID, UserName FROM tbl_User WHERE ParentID=3 OR UserID=3 And TopID=1;
With the above query I can display UserID 3 and UserID 4, but not UserID 5. I'm stuck on this and need help. Thanks.
It is technically possible to do recursive hierarchical queries in MySQL using stored procedures.
Here is one adapted to your scenario:
CREATE TABLE `user` (
`UserID` int(16) unsigned NOT NULL,
`UserName` varchar(32),
`ParentID` int(16) DEFAULT NULL,
`TopID` int(16) DEFAULT NULL,
PRIMARY KEY (`UserID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO user VALUES (1, 'abc', NULL, NULL), (2, 'edf', 1, 1), (3, 'gef', 1, 1),
(4, 'huj', 3, 1), (5, 'jdi', 4, 1), (6, 'das', 2, 1), (7, 'new', NULL, NULL),
(8, 'gka', 7, 7);
DELIMITER $$
DROP PROCEDURE IF EXISTS `Hierarchy` $$
CREATE PROCEDURE `Hierarchy` (IN GivenID INT, IN initial INT)
BEGIN
DECLARE done INT DEFAULT 0;
DECLARE next_id INT;
-- CURSOR TO LOOP THROUGH RESULTS --
DECLARE cur1 CURSOR FOR SELECT UserID FROM user WHERE ParentID = GivenID;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
-- CREATE A TEMPORARY TABLE TO HOLD RESULTS --
IF initial=1 THEN
-- MAKE SURE TABLE DOESN'T CONTAIN OUTDATED INFO IF IT EXISTS (USUALLY ON ERROR) --
DROP TABLE IF EXISTS OUT_TEMP;
CREATE TEMPORARY TABLE OUT_TEMP (userID int, UserName varchar(32));
END IF;
-- ADD OURSELF TO THE TEMPORARY TABLE --
INSERT INTO OUT_TEMP SELECT UserID, UserName FROM user WHERE UserID = GivenID;
-- AND LOOP THROUGH THE CURSOR --
OPEN cur1;
read_loop: LOOP
FETCH cur1 INTO next_id;
-- NO ROWS FOUND, LEAVE LOOP --
IF done THEN
LEAVE read_loop;
END IF;
-- NEXT ROUND --
CALL Hierarchy(next_id, 0);
END LOOP;
CLOSE cur1;
-- THIS IS THE INITIAL CALL, LET'S GET THE RESULTS --
IF initial=1 THEN
SELECT * FROM OUT_TEMP;
-- CLEAN UP AFTER OURSELVES --
DROP TABLE OUT_TEMP;
END IF;
END $$
DELIMITER ;
CALL Hierarchy(3,1);
+--------+----------+
| userID | UserName |
+--------+----------+
| 3 | gef |
| 4 | huj |
| 5 | jdi |
+--------+----------+
3 rows in set (0.07 sec)
Query OK, 0 rows affected (0.07 sec)
CALL Hierarchy(1,1);
+--------+----------+
| userID | UserName |
+--------+----------+
| 1 | abc |
| 2 | edf |
| 6 | das |
| 3 | gef |
| 4 | huj |
| 5 | jdi |
+--------+----------+
6 rows in set (0.10 sec)
Query OK, 0 rows affected (0.10 sec)
Time to point out some caveats:
Since this is recursively calling a stored procedure, you need to increase the size of max_sp_recursion_depth, which has a max value of 255 (defaults to 0).
My results on a non-busy server with the limited test data (10 tuples of the user table) took 0.07-0.10 seconds to complete. The performance is such that it might be best to put the recursion in your application layer.
I didn't take advantage of your TopID column, so there might be a logic flaw. But the two test-cases gave me the expected results.
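As a small sketch of the first caveat, the recursion limit can be raised for the current session before calling the procedure. The value 255 is simply the maximum mentioned above; a lower value may be enough for shallow trees.

```sql
-- Allow nested CALLs for this session only (the default of 0 disables recursion).
SET SESSION max_sp_recursion_depth = 255;

CALL Hierarchy(3, 1);
```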
Disclaimer: This example was just to show that it can be done in MySQL, not that I endorse it in any way. Stored procedures, temporary tables and cursors are perhaps not the best way to solve this problem.
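As an alternative to the stored procedure, and a different technique from the one shown above, MySQL 8.0 and later support recursive common table expressions. The following is a minimal sketch against the same user table; the CTE name subtree is my own.

```sql
-- Recursive CTE (requires MySQL 8.0 or later): start at the given user,
-- then repeatedly add rows whose ParentID is already in the result.
WITH RECURSIVE subtree AS (
    SELECT UserID, UserName
    FROM user
    WHERE UserID = 3                       -- anchor: the starting user
    UNION ALL
    SELECT u.UserID, u.UserName
    FROM user u
    JOIN subtree s ON u.ParentID = s.UserID  -- recursive step: direct children
)
SELECT UserID, UserName
FROM subtree;
```

This avoids the temporary table, the cursor and the max_sp_recursion_depth setting, at the cost of requiring a newer server version.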
Well, it's not a very clean implementation, but since you need only the children and sub-children, either of these might work:
Query1:
SELECT UserID, UserName
FROM tbl_user
WHERE ParentID = 3 OR UserID = 3
UNION
SELECT UserID, UserName
FROM tbl_user
WHERE ParentID IN (SELECT UserID
FROM tbl_user
WHERE ParentID = 3);
Query 2:
SELECT UserID, UserName
FROM tbl_user
WHERE UserID = 3
OR ParentID = 3
OR ParentID IN (SELECT UserID
FROM tbl_user
WHERE ParentID = 3);
EDIT 1: Alternatively, you may modify your table structure to make it more convenient to query all children of a particular category. Please follow this link to read more on storing hierarchical data in MySQL.
You might also consider storing your data hierarchically in a tree-like fashion, which is very well explained in this article.
Please note that each method has its trade-offs with respect to retrieving desired results vs adding/removing categories but I'm sure you'll enjoy the reading.
This is one of the best articles I've seen for explaining the "Modified Preorder Tree Traversal" method of storing tree-like data in a SQL-style database.
http://www.sitepoint.com/hierarchical-data-database/
The MPTT stuff starts on page 2.
Essentially, you store a "Left" and a "Right" value for each node in the tree, in such a manner that to get all children of ParentA, you get the Left and Right for ParentA, then
-- Left and Right are reserved words in MySQL, so they are quoted with backticks
SELECT *
FROM TableName
WHERE `Left` > ParentLeft
AND `Right` < ParentRight;
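To make the Left/Right idea concrete, here is a small worked sketch using the user data from the question. The table name user_tree and the column names lft and rgt are my own choices (Left and Right are reserved words in MySQL), and the numbers were assigned by hand with a depth-first walk of the tree.

```sql
-- Hypothetical nested-set table; lft/rgt bracket each node's whole subtree.
CREATE TABLE user_tree (
    UserID   INT PRIMARY KEY,
    UserName VARCHAR(32),
    lft      INT NOT NULL,
    rgt      INT NOT NULL,
    KEY idx_lft_rgt (lft, rgt)
);

INSERT INTO user_tree (UserID, UserName, lft, rgt) VALUES
    (1, 'abc',  1, 12),
    (2, 'edf',  2,  5),
    (6, 'das',  3,  4),
    (3, 'gef',  6, 11),
    (4, 'huj',  7, 10),
    (5, 'jdi',  8,  9),
    (7, 'new', 13, 16),
    (8, 'gka', 14, 15);

-- User 3 plus all of its descendants: every node whose lft falls inside
-- the parent's [lft, rgt] range. Returns users 3, 4 and 5 (gef, huj, jdi).
SELECT node.UserID, node.UserName
FROM user_tree AS parent
JOIN user_tree AS node
  ON node.lft BETWEEN parent.lft AND parent.rgt
WHERE parent.UserID = 3
ORDER BY node.lft;
```

Reads become a single range query with no recursion, but inserting or moving a node means renumbering part of the tree, which is the trade-off of this method.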
To get all of the parents of the selected child (user_id = 3 in this example):
SELECT
@parent_id AS _user_id,
user_name,
(
SELECT @parent_id := parent_id
FROM users
WHERE user_id = _user_id
) AS parent
FROM (
-- initialize variables
SELECT
@parent_id := 3
) vars,
users u
WHERE @parent_id <> 0;
To get all of the children of a selected user_id
SELECT ui.user_id AS 'user_id', ui.user_name AS 'user_name', ui.parent_id
FROM
(
SELECT connect_by_parent(user_id) AS user_id
FROM (
SELECT
@start_user := 3,
@user_id := @start_user
) vars, users
WHERE @user_id IS NOT NULL
) uo
JOIN users ui ON ui.user_id = uo.user_id;
This requires the following function
CREATE FUNCTION connect_by_parent(value INT) RETURNS INT
NOT DETERMINISTIC
READS SQL DATA
BEGIN
DECLARE _user_id INT;
DECLARE _parent_id INT;
DECLARE _next INT;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET @user_id = NULL;
SET _parent_id = @user_id;
SET _user_id = -1;
IF @user_id IS NULL THEN
RETURN NULL;
END IF;
LOOP
SELECT MIN(user_id)
INTO @user_id
FROM users
WHERE parent_id = _parent_id
AND user_id > _user_id;
-- @start_user is the variable initialised by the calling query
IF @user_id IS NOT NULL OR _parent_id = @start_user THEN
RETURN @user_id;
END IF;
SELECT user_id, parent_id
INTO _user_id, _parent_id
FROM users
WHERE user_id = _parent_id;
END LOOP;
END
This example relies heavily on session variables, which many SQL users may be unfamiliar with, so here's a link that may provide some insight: session variables
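For readers who have not met them, here is a minimal sketch of how a session (user-defined) variable behaves, using the same users table and column names as the queries in this answer. Treat it as an illustration, not as part of the solution.

```sql
-- Assign with SET, then read the value back in a later statement.
SET @start_user := 3;
SELECT @start_user AS start_user;   -- 3

-- The value persists for the rest of the session, so it can parameterise a query.
SELECT user_id, user_name
FROM users
WHERE parent_id = @start_user;
```

The queries above also assign variables inside SELECT with :=. As far as I know, newer MySQL 8.0 releases mark that form as deprecated, so it is worth checking the documentation for your server version before relying on it.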