How can I revise the following query with subqueries composed of functions and join queries in functions. I want to append extra values to my main query that relies on main tables and two primary joins (risks, users) on several occasions.
Creating a MCVRE (Minimal Complete Verifiable Reproduceable Example) proved to be somewhat challenging because of request sent to SQL Fiddle has too many rows (too many text characters) After removing nearly all rows on main two tables ( users, risks ) I ended up with a running query.
The Fiddle (http://www.sqlfiddle.com/#!9/1d52a0/17) create functions and insertion of data commands have reduced rows from actual example on my local pc due to character count of 8000 being exceeded for request payload for SQLFiddle to understand.
Actual table has about 100 rows for risks, and 20 or so rows for users and takes about 3 seconds to run
What can I do to speed up query, via staving desired function results in table, or by revision, index insertion, movement of joins to outer main query, or even using stored procedure, or rewriting query structure, to reduce execution time to possibly half the time or less optimistically. SQL fiddle does not take all rows needed so I pasted a very limited subset, Even SQLFiddle query (see total select query below) does not run, due to Stack Overflow (pun partially intended).
http://www.sqlfiddle.com/#!9/1d52a0/17
Base Queries that do run on the fiddle (see fiddle)
select * from users;
select * from risks;
select * from riskevents;
select * from riskmatrixthresholds;
select * from risklevels;
#significantly minimized result set but still query does not run due to stack overflow issue on sql fiddle - see fiddle result (on bottom most portion of fiddle query output)
SELECT r.RiskID,
r.CreatorID,
r.OwnerID,
r.ApproverID,
r.RiskTitle,
r.RiskStatement,
r.ClosureCriteria,
r.RiskState,
r.Context AS 'Context',
GetRiskUserLastOrFirstName(GetRiskUserID('Creator', r.RiskID,0),r.RiskID, 'Last','') AS 'creator.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Creator', r.RiskID,0),r.RiskID, 'First','') AS 'creator.firstname',
GetRiskUserLastOrFirstName(GetRiskUserID('Owner', r.RiskID,0),r.RiskID, 'Last','') AS 'owner.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Owner', r.RiskID,0),r.RiskID, 'First','') AS 'owner.firstname',
GetRiskUserLastOrFirstName(GetRiskUserID('Approver',r.RiskID,0),r.RiskID, 'Last','') AS 'approver.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Approver',r.RiskID,0),r.RiskID, 'First','') AS 'approver.firstname',
r.Likelihood AS 'OriginalLikelihood',
r.Technical AS 'OriginalTechnical',
r.Schedule AS 'OriginalSchedule',
r.Cost AS 'OriginalCost',
GREATEST(r.Technical, r.Schedule, r.Cost) AS 'OriginalConsequence',
RiskValue(r.Likelihood, GREATEST(r.Technical, r.Schedule, r.Cost),0) AS 'OriginalValue',
RiskLevel(RiskValue(r.Likelihood, GREATEST(r.Technical, r.Schedule, r.Cost),0),'') AS 'OriginalLevel',
LatestEventDate(r.RiskID, r.AssessmentDate,'') AS 'LatestEventDate',
r.AssessmentDate AS 'AssessmentDate',
(SELECT CurrentLikelihood(r.RiskID,0)) AS 'CurrentLikelihood',
(SELECT CurrentConsequence(r.RiskID,0)) AS 'CurrentConsequence',
(SELECT CurrentRiskValue(r.RiskID,0)) AS 'CurrentValue',
(SELECT RiskLevel(CurrentRiskValue(r.RiskID,0),'')) AS 'CurrentLevel'
FROM risks r;
Create Function Script
CREATE TABLE `riskevents` (
`ID` int NOT NULL AUTO_INCREMENT,
`EventID` int ,
`RiskID` int ,
`EventTitle` text,
`EventStatus` varchar(10) ,
`EventOwnerID` int ,
`ActualDate` date ,
`ScheduleDate` date ,
`BaselineDate` date ,
`ActualLikelihood` int ,
`ActualTechnical` int ,
`ActualSchedule` int ,
`ActualCost` int ,
`ScheduledLikelihood` int ,
`ScheduledTechnical` int ,
`ScheduledSchedule` int ,
`ScheduledCost` int ,
`BaselineLikelihood` int ,
`BaselineTechnical` int ,
`BaselineSchedule` int ,
`BaselineCost` int ,
PRIMARY KEY (`ID`)
)
CREATE TABLE `risklevels` (
`ID` int NOT NULL AUTO_INCREMENT,
`RiskLevelID` int ,
`RiskMaximum` float ,
`RiskHigh` float ,
`RiskMedium` float ,
`RiskMinimum` float ,
PRIMARY KEY (`ID`)
)
CREATE TABLE `riskmatrixthresholds` (
`ID` int NOT NULL AUTO_INCREMENT,
`CellID` int ,
`Likelihood` int ,
`Consequence` int ,
`Level` decimal(2,2) ,
PRIMARY KEY (`ID`)
)
CREATE TABLE `risks` (
`ID` int NOT NULL AUTO_INCREMENT,
`RiskState` varchar(10) ,
`RiskID` int ,
`RiskTitle` text CHARACTER SET latin1,
`RiskStatement` text CHARACTER SET latin1,
`ApproverID` int ,
`OwnerID` int ,
`CreatorID` int ,
`Likelihood` int ,
`Technical` int ,
`Schedule` int ,
`Cost` int ,
`ClosureCriteria` text CHARACTER SET latin1,
`CategoryID` int ,
`AssessmentDate` date ,
`CompletionDate` date ,
`ClosureDate` date ,
`Context` text,
PRIMARY KEY (`ID`),
UNIQUE KEY `risk_index` (`RiskID`)
)
CREATE TABLE `users` (
`ID` int NOT NULL AUTO_INCREMENT,
`UserID` int NOT NULL,
`LastName` char(25) ,
`FirstName` char(15) ,
`Title` char(20) ,
`Email` varchar(30) ,
`Phone` char(12) ,
`Extension` char(4) ,
`Department` char(25) ,
PRIMARY KEY (`ID`),
UNIQUE KEY `user_index` (`UserID`),
KEY `SURROGATE` (`UserID`)
)
insert into `riskevents`(`ID`,`EventID`,`RiskID`,`EventTitle`,`EventStatus`,`EventOwnerID`,`ActualDate`,`ScheduleDate`,`BaselineDate`,`ActualLikelihood`,`ActualTechnical`,`ActualSchedule`,`ActualCost`,`ScheduledLikelihood`,`ScheduledTechnical`,`ScheduledSchedule`,`ScheduledCost`,`BaselineLikelihood`,`BaselineTechnical`,`BaselineSchedule`,`BaselineCost`) values
(171,0,1,'Risk','Complete',5,'2019-06-14',NULL,'2019-06-14',5,2,2,5,NULL,NULL,NULL,NULL,5,2,2,5),
(184,0,10,'Risk','Complete',21,'2019-10-07',NULL,'2019-10-07',5,4,5,4,NULL,NULL,NULL,NULL,5,4,5,4));
insert into `risklevels`(`ID`,`RiskLevelID`,`RiskMaximum`,`RiskHigh`,`RiskMedium`,`RiskMinimum`) values
(1,1,1,0.55,0.3,0);
insert into `riskmatrixthresholds`(`ID`,`CellID`,`Likelihood`,`Consequence`,`Level`) values
(1,1,1,1,0.09),
(2,2,1,2,0.12),
(3,3,1,3,0.16),
(4,4,1,4,0.19),
(5,5,1,5,0.23),
(6,6,2,1,0.12),
(7,7,2,2,0.19),
(8,8,2,3,0.27),
(9,9,2,4,0.34),
(10,10,2,5,0.41),
(11,11,3,1,0.16),
(12,12,3,2,0.27),
(13,13,3,3,0.37),
(14,14,3,4,0.48),
(15,15,3,5,0.59),
(16,16,4,1,0.19),
(17,17,4,2,0.34),
(18,18,4,3,0.48),
(19,19,4,4,0.63),
(20,20,4,5,0.77),
(21,21,5,1,0.23),
(22,22,5,2,0.41),
(23,23,5,3,0.59),
(24,24,5,4,0.77),
(25,25,5,5,0.95);
insert into `risks`(`ID`,`RiskState`,`RiskID`,`RiskTitle`,`RiskStatement`,`ApproverID`,`OwnerID`,`CreatorID`,`Likelihood`,`Technical`,`Schedule`,`Cost`,`ClosureCriteria`,`CategoryID`,`AssessmentDate`,`CompletionDate`,`ClosureDate`,`Context`) values
(1,'Completed',1,'t','t',1,5,1,5,2,2,5,'t',NULL,'2019-06-14','2020-09-26',NULL,'t'),
(2,'Completed',2,'t','t',2,1,1,5,3,4,2,'test',NULL,'2019-05-14',NULL,NULL,'t'),
insert into `users`(`ID`,`UserID`,`LastName`,`FirstName`,`Title`,`Email`,`Phone`,`Extension`,`Department`) values
(1,1,'Admin','','Admin','a#yz.com','17890','1234',''),
(2,2,'Last','First','Engineer','a#yz.com','123890','1234','Supplier');
CREATE FUNCTION Consequence(technical int, sched int, cost int, consequence int) RETURNS int
BEGIN
select GREATEST(technical, sched, cost) into consequence;
return consequence;
END;
CREATE FUNCTION CurrentRiskEventID(riskidentifier int, eid int) RETURNS int
BEGIN
select MAX(e.EventID) into eid
FROM riskevents e
WHERE e.eventstatus not in('Open')
AND e.riskid = riskidentifier;
return riskeventid;
END;
CREATE FUNCTION CurrentConsequence(riskidentifier int, currentconsequence int) RETURNS int
BEGIN
SELECT coalesce(
(SELECT GREATEST(actualtechnical, actualschedule, actualcost)
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)
and actualtechnical is not null
ANDactualschedule is not null
andactualschedule is not null),
(SELECT greatest(technical, schedule, cost)
from risks
Where riskid = riskidentifier)
) into currentconsequence;
return currentconsequence;
END;
CREATE FUNCTION CurrentLikelihood(riskidentifier int, currentlikelihood int) RETURNS int
BEGIN
SELECT coalesce(
(SELECT actuallikelihood
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)),
(SELECT r.likelihood
FROM risks r
WHERE r.riskid = riskidentifier)) into currentlikelihood;
return currentlikelihood;
END;
CREATE FUNCTION CurrentRiskLevel(riskidentifier int, currentrisklevel int) RETURNS int
BEGIN
select RiskLevel(CurrentRiskValue(riskidentifier, 0), '') into currentrisklevel;
return currentrisklevel;
END;
CREATE FUNCTION CurrentRiskValue(riskidentifier int, currentriskvalue int) RETURNS int
BEGIN
SELECT RiskValue(CurrentLikelihood(riskidentifier, 0), CurrentConsequence(riskidentifier, 0), 0) into currentriskvalue;
return currentriskvalue;
END;
CREATE FUNCTION GetRiskUserID(riskusertype VARCHAR(25), riskidentifier int, riskuserid int) RETURNS int
BEGIN
SELECT COALESCE(userres.userid, 0) into riskuserid FROM
(
SELECT r.creatorid, r.ownerid, r.approverid, u.userid
FROM risks r, users u
WHERE r.riskid = (select riskidentifier) and
(
((select riskusertype) = 'Creator' AND u.userid = r.creatorid) OR
((select riskusertype) = 'Approver' AND u.userid = r.approverid) OR
((select riskusertype) = 'Owner' AND u.userid = r.ownerid)
)
) userres;
RETURN riskuserid;
END;
CREATE FUNCTION GetRiskUserLastOrFirstName(riskuserid int, riskid int, whichname char(25), firstorlastname char(25)) RETURNS char(25) CHARSET utf8 COLLATE utf8_unicode_ci
BEGIN
SELECT (case
when whichname = 'Last' then u.LastName
WHEN whichname = 'First' THEN u.FirstName
end)
into firstorlastname
FROM users u,risks r
WHERE u.UserID = riskuserid
AND r.RiskID = riskid;
return firstorlastname;
END;
CREATE FUNCTION LatestEventDate(riskidentifier int, riskassessmentdate date, latestdate date) RETURNS date
BEGIN
SELECT COALESCE(
(SELECT ActualDate FROM riskevents evt WHERE evt.eventid = CurrentRiskEventID(riskidentifier, 0) and evt.riskid = riskidentifier),
(SELECT riskassessmentdate)) into latestdate;
return latestdate;
END;
CREATE FUNCTION RiskLevel(riskvalue int, risklevel varchar(4)) RETURNS varchar(4)
begin
SELECT
CASE
WHEN riskvalue >= levels.riskhigh*100 THEN 'High'
WHEN riskvalue >= levels.riskmedium*100 THEN 'Med'
ELSE 'Low'
ENd as cat into risklevel
FROM risklevels levels;
return risklevel;
END;
CREATE FUNCTION RiskValue(likelihood int, consequence int, riskvalue int) RETURNS int
BEGIN
SELECT m.level*100 INTO riskvalue FROM riskmatrixthresholds m WHERE m.likelihood = likelihood AND m.consequence = consequence;
RETURN riskvalue;
END;
http://www.sqlfiddle.com/#!9/1d52a0/17
Note: SQL is a declarative language, not a procedural language. You tell it what you want, not how to get it. Your use of functions and so forth is procedural.
How can you make this application faster?
First, use the latest version of MySQL (8+, or MariaDB 10.4+). Later versions get faster.
Second, you have stated a requirement to use "subqueries composed of functions". That means you probably can't do much about the performance.
Why not? The subqueries buried in functions are so-called dependent subqueries. Those don't perform well. And because they're buried MySQL's query planner can't do anything useful to optimize them.
Refactoring your query to avoid using functions with SELECT operations will give the query planner visibility into your overall query. That will give it a chance to optimize things. You might replace them with views.
And don't use SELECT tablea, tableb syntax. That's been obsolete since 1992. Use
SELECT tablea JOIN tableb ON tablea.joincolumn = tableb.joincolumn.
I'd offer you more advice but I can't figure out your intent.
Application of the following changes in CurrentLikelihood() and CurrentConsequence() reduced total query execution time to exec 0.070 sec, total 0.082 sec.
Old Current Likelihood Query (producing slow and incorrect output)
SELECT coalesce(
(SELECT actuallikelihood
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)),
(SELECT r.likelihood
FROM risks r
WHERE r.riskid = riskidentifier)) into currentlikelihood;
return currentlikelihood;
Working CurrentLikelihood Query
SELECT actuallikelihood INTO currentlikelihood
FROM riskevents
WHERE eventid = CurrentRiskEventID(riskidentifier)
AND riskid = riskidentifier;
Old CurrentConsequence Query (producing slow and incorrect output)
SELECT coalesce(
(SELECT GREATEST(actualtechnical, actualschedule, actualcost)
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)
and actualtechnical is not null
and actualschedule is not null),
(SELECT greatest(technical, schedule, cost)
from risks
Where riskid = riskidentifier)
) into currentconsequence;
Working CurrentConsequence Query
SELECT GREATEST(actualtechnical, actualschedule, actualcost) INTO currentconsequence
FROM riskevents
WHERE eventid = CurrentRiskEventID(riskidentifier)
AND riskid = riskidentifier;
Old CurrentRiskEventID() Query
select MAX(e.EventID) into currentriskeventid
FROM riskevents e
WHERE e.eventstatus not in('Open')
AND e.riskid = riskidentifier;
Modified GetRiskEventID() function
SELECT MAX(e.EventID) INTO currentriskeventid
FROM riskevents e
WHERE e.riskid = riskidentifier AND
(e.eventstatus != 'Open'
OR
(e.EventID = 0 AND e.eventstatus = 'Open'));