MySQL Query (Sub Queries + Composed Functions + JOIN operations) takes too long to run - mysql

How can I revise the following query, which calls functions that themselves contain subqueries and joins? I want to append extra values to my main query, which relies on the main tables and on two primary joins (risks, users) in several places.
Creating an MCVE (Minimal, Complete, Verifiable Example) proved somewhat challenging because the request sent to SQL Fiddle had too many rows (too many characters). After removing nearly all rows from the two main tables (users, risks), I ended up with a running query.
In the Fiddle (http://www.sqlfiddle.com/#!9/1d52a0/17), the create-function and insert statements contain fewer rows than the actual example on my local PC, because SQL Fiddle rejects a request payload of more than 8000 characters.
The actual tables have about 100 rows for risks and 20 or so rows for users, and the query takes about 3 seconds to run.
What can I do to speed up the query, whether by saving the desired function results in a table, revising the query, adding indexes, moving the joins to the outer main query, using a stored procedure, or rewriting the query structure, so that execution time drops to half or, optimistically, less? SQL Fiddle does not accept all the rows needed, so I pasted a very limited subset. Even on SQL Fiddle the full select query (see below) does not run, due to a stack overflow (pun partially intended).
Base Queries that do run on the fiddle (see fiddle)
select * from users;
select * from risks;
select * from riskevents;
select * from riskmatrixthresholds;
select * from risklevels;
#significantly minimized result set but still query does not run due to stack overflow issue on sql fiddle - see fiddle result (on bottom most portion of fiddle query output)
SELECT r.RiskID,
r.CreatorID,
r.OwnerID,
r.ApproverID,
r.RiskTitle,
r.RiskStatement,
r.ClosureCriteria,
r.RiskState,
r.Context AS 'Context',
GetRiskUserLastOrFirstName(GetRiskUserID('Creator', r.RiskID,0),r.RiskID, 'Last','') AS 'creator.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Creator', r.RiskID,0),r.RiskID, 'First','') AS 'creator.firstname',
GetRiskUserLastOrFirstName(GetRiskUserID('Owner', r.RiskID,0),r.RiskID, 'Last','') AS 'owner.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Owner', r.RiskID,0),r.RiskID, 'First','') AS 'owner.firstname',
GetRiskUserLastOrFirstName(GetRiskUserID('Approver',r.RiskID,0),r.RiskID, 'Last','') AS 'approver.lastname',
GetRiskUserLastOrFirstName(GetRiskUserID('Approver',r.RiskID,0),r.RiskID, 'First','') AS 'approver.firstname',
r.Likelihood AS 'OriginalLikelihood',
r.Technical AS 'OriginalTechnical',
r.Schedule AS 'OriginalSchedule',
r.Cost AS 'OriginalCost',
GREATEST(r.Technical, r.Schedule, r.Cost) AS 'OriginalConsequence',
RiskValue(r.Likelihood, GREATEST(r.Technical, r.Schedule, r.Cost),0) AS 'OriginalValue',
RiskLevel(RiskValue(r.Likelihood, GREATEST(r.Technical, r.Schedule, r.Cost),0),'') AS 'OriginalLevel',
LatestEventDate(r.RiskID, r.AssessmentDate,'') AS 'LatestEventDate',
r.AssessmentDate AS 'AssessmentDate',
(SELECT CurrentLikelihood(r.RiskID,0)) AS 'CurrentLikelihood',
(SELECT CurrentConsequence(r.RiskID,0)) AS 'CurrentConsequence',
(SELECT CurrentRiskValue(r.RiskID,0)) AS 'CurrentValue',
(SELECT RiskLevel(CurrentRiskValue(r.RiskID,0),'')) AS 'CurrentLevel'
FROM risks r;
Create Function Script
CREATE TABLE `riskevents` (
`ID` int NOT NULL AUTO_INCREMENT,
`EventID` int ,
`RiskID` int ,
`EventTitle` text,
`EventStatus` varchar(10) ,
`EventOwnerID` int ,
`ActualDate` date ,
`ScheduleDate` date ,
`BaselineDate` date ,
`ActualLikelihood` int ,
`ActualTechnical` int ,
`ActualSchedule` int ,
`ActualCost` int ,
`ScheduledLikelihood` int ,
`ScheduledTechnical` int ,
`ScheduledSchedule` int ,
`ScheduledCost` int ,
`BaselineLikelihood` int ,
`BaselineTechnical` int ,
`BaselineSchedule` int ,
`BaselineCost` int ,
PRIMARY KEY (`ID`)
);
CREATE TABLE `risklevels` (
`ID` int NOT NULL AUTO_INCREMENT,
`RiskLevelID` int ,
`RiskMaximum` float ,
`RiskHigh` float ,
`RiskMedium` float ,
`RiskMinimum` float ,
PRIMARY KEY (`ID`)
);
CREATE TABLE `riskmatrixthresholds` (
`ID` int NOT NULL AUTO_INCREMENT,
`CellID` int ,
`Likelihood` int ,
`Consequence` int ,
`Level` decimal(2,2) ,
PRIMARY KEY (`ID`)
);
CREATE TABLE `risks` (
`ID` int NOT NULL AUTO_INCREMENT,
`RiskState` varchar(10) ,
`RiskID` int ,
`RiskTitle` text CHARACTER SET latin1,
`RiskStatement` text CHARACTER SET latin1,
`ApproverID` int ,
`OwnerID` int ,
`CreatorID` int ,
`Likelihood` int ,
`Technical` int ,
`Schedule` int ,
`Cost` int ,
`ClosureCriteria` text CHARACTER SET latin1,
`CategoryID` int ,
`AssessmentDate` date ,
`CompletionDate` date ,
`ClosureDate` date ,
`Context` text,
PRIMARY KEY (`ID`),
UNIQUE KEY `risk_index` (`RiskID`)
);
CREATE TABLE `users` (
`ID` int NOT NULL AUTO_INCREMENT,
`UserID` int NOT NULL,
`LastName` char(25) ,
`FirstName` char(15) ,
`Title` char(20) ,
`Email` varchar(30) ,
`Phone` char(12) ,
`Extension` char(4) ,
`Department` char(25) ,
PRIMARY KEY (`ID`),
UNIQUE KEY `user_index` (`UserID`),
KEY `SURROGATE` (`UserID`)
);
insert into `riskevents`(`ID`,`EventID`,`RiskID`,`EventTitle`,`EventStatus`,`EventOwnerID`,`ActualDate`,`ScheduleDate`,`BaselineDate`,`ActualLikelihood`,`ActualTechnical`,`ActualSchedule`,`ActualCost`,`ScheduledLikelihood`,`ScheduledTechnical`,`ScheduledSchedule`,`ScheduledCost`,`BaselineLikelihood`,`BaselineTechnical`,`BaselineSchedule`,`BaselineCost`) values
(171,0,1,'Risk','Complete',5,'2019-06-14',NULL,'2019-06-14',5,2,2,5,NULL,NULL,NULL,NULL,5,2,2,5),
(184,0,10,'Risk','Complete',21,'2019-10-07',NULL,'2019-10-07',5,4,5,4,NULL,NULL,NULL,NULL,5,4,5,4);
insert into `risklevels`(`ID`,`RiskLevelID`,`RiskMaximum`,`RiskHigh`,`RiskMedium`,`RiskMinimum`) values
(1,1,1,0.55,0.3,0);
insert into `riskmatrixthresholds`(`ID`,`CellID`,`Likelihood`,`Consequence`,`Level`) values
(1,1,1,1,0.09),
(2,2,1,2,0.12),
(3,3,1,3,0.16),
(4,4,1,4,0.19),
(5,5,1,5,0.23),
(6,6,2,1,0.12),
(7,7,2,2,0.19),
(8,8,2,3,0.27),
(9,9,2,4,0.34),
(10,10,2,5,0.41),
(11,11,3,1,0.16),
(12,12,3,2,0.27),
(13,13,3,3,0.37),
(14,14,3,4,0.48),
(15,15,3,5,0.59),
(16,16,4,1,0.19),
(17,17,4,2,0.34),
(18,18,4,3,0.48),
(19,19,4,4,0.63),
(20,20,4,5,0.77),
(21,21,5,1,0.23),
(22,22,5,2,0.41),
(23,23,5,3,0.59),
(24,24,5,4,0.77),
(25,25,5,5,0.95);
insert into `risks`(`ID`,`RiskState`,`RiskID`,`RiskTitle`,`RiskStatement`,`ApproverID`,`OwnerID`,`CreatorID`,`Likelihood`,`Technical`,`Schedule`,`Cost`,`ClosureCriteria`,`CategoryID`,`AssessmentDate`,`CompletionDate`,`ClosureDate`,`Context`) values
(1,'Completed',1,'t','t',1,5,1,5,2,2,5,'t',NULL,'2019-06-14','2020-09-26',NULL,'t'),
(2,'Completed',2,'t','t',2,1,1,5,3,4,2,'test',NULL,'2019-05-14',NULL,NULL,'t');
insert into `users`(`ID`,`UserID`,`LastName`,`FirstName`,`Title`,`Email`,`Phone`,`Extension`,`Department`) values
(1,1,'Admin','','Admin','a@yz.com','17890','1234',''),
(2,2,'Last','First','Engineer','a@yz.com','123890','1234','Supplier');
CREATE FUNCTION Consequence(technical int, sched int, cost int, consequence int) RETURNS int
BEGIN
select GREATEST(technical, sched, cost) into consequence;
return consequence;
END;
CREATE FUNCTION CurrentRiskEventID(riskidentifier int, eid int) RETURNS int
BEGIN
select MAX(e.EventID) into eid
FROM riskevents e
WHERE e.eventstatus not in('Open')
AND e.riskid = riskidentifier;
return eid;
END;
CREATE FUNCTION CurrentConsequence(riskidentifier int, currentconsequence int) RETURNS int
BEGIN
SELECT coalesce(
(SELECT GREATEST(actualtechnical, actualschedule, actualcost)
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)
and actualtechnical is not null
AND actualschedule is not null
and actualschedule is not null),
(SELECT greatest(technical, schedule, cost)
from risks
Where riskid = riskidentifier)
) into currentconsequence;
return currentconsequence;
END;
CREATE FUNCTION CurrentLikelihood(riskidentifier int, currentlikelihood int) RETURNS int
BEGIN
SELECT coalesce(
(SELECT actuallikelihood
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)),
(SELECT r.likelihood
FROM risks r
WHERE r.riskid = riskidentifier)) into currentlikelihood;
return currentlikelihood;
END;
CREATE FUNCTION CurrentRiskLevel(riskidentifier int, currentrisklevel varchar(4)) RETURNS varchar(4)
BEGIN
select RiskLevel(CurrentRiskValue(riskidentifier, 0), '') into currentrisklevel;
return currentrisklevel;
END;
CREATE FUNCTION CurrentRiskValue(riskidentifier int, currentriskvalue int) RETURNS int
BEGIN
SELECT RiskValue(CurrentLikelihood(riskidentifier, 0), CurrentConsequence(riskidentifier, 0), 0) into currentriskvalue;
return currentriskvalue;
END;
CREATE FUNCTION GetRiskUserID(riskusertype VARCHAR(25), riskidentifier int, riskuserid int) RETURNS int
BEGIN
SELECT COALESCE(userres.userid, 0) into riskuserid FROM
(
SELECT r.creatorid, r.ownerid, r.approverid, u.userid
FROM risks r, users u
WHERE r.riskid = (select riskidentifier) and
(
((select riskusertype) = 'Creator' AND u.userid = r.creatorid) OR
((select riskusertype) = 'Approver' AND u.userid = r.approverid) OR
((select riskusertype) = 'Owner' AND u.userid = r.ownerid)
)
) userres;
RETURN riskuserid;
END;
CREATE FUNCTION GetRiskUserLastOrFirstName(riskuserid int, riskid int, whichname char(25), firstorlastname char(25)) RETURNS char(25) CHARSET utf8 COLLATE utf8_unicode_ci
BEGIN
SELECT (case
when whichname = 'Last' then u.LastName
WHEN whichname = 'First' THEN u.FirstName
end)
into firstorlastname
FROM users u,risks r
WHERE u.UserID = riskuserid
AND r.RiskID = riskid;
return firstorlastname;
END;
CREATE FUNCTION LatestEventDate(riskidentifier int, riskassessmentdate date, latestdate date) RETURNS date
BEGIN
SELECT COALESCE(
(SELECT ActualDate FROM riskevents evt WHERE evt.eventid = CurrentRiskEventID(riskidentifier, 0) and evt.riskid = riskidentifier),
(SELECT riskassessmentdate)) into latestdate;
return latestdate;
END;
CREATE FUNCTION RiskLevel(riskvalue int, risklevel varchar(4)) RETURNS varchar(4)
begin
SELECT
CASE
WHEN riskvalue >= levels.riskhigh*100 THEN 'High'
WHEN riskvalue >= levels.riskmedium*100 THEN 'Med'
ELSE 'Low'
ENd as cat into risklevel
FROM risklevels levels;
return risklevel;
END;
CREATE FUNCTION RiskValue(likelihood int, consequence int, riskvalue int) RETURNS int
BEGIN
SELECT m.level*100 INTO riskvalue FROM riskmatrixthresholds m WHERE m.likelihood = likelihood AND m.consequence = consequence;
RETURN riskvalue;
END;

Note: SQL is a declarative language, not a procedural language. You tell it what you want, not how to get it. Your use of functions and so forth is procedural.
How can you make this application faster?
First, use the latest version of MySQL (8+, or MariaDB 10.4+). Later versions get faster.
Second, you have stated a requirement to use "subqueries composed of functions". That means you probably can't do much about the performance.
Why not? The subqueries buried in functions are so-called dependent subqueries. Those don't perform well. And because they're buried MySQL's query planner can't do anything useful to optimize them.
Refactoring your query to avoid using functions with SELECT operations will give the query planner visibility into your overall query. That will give it a chance to optimize things. You might replace them with views.
And don't use the comma-join syntax FROM tablea, tableb. That's been obsolete since 1992. Use
FROM tablea JOIN tableb ON tablea.joincolumn = tableb.joincolumn.
I'd offer you more advice but I can't figure out your intent.
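As a rough illustration of the refactoring suggested above, here is a minimal sketch that replaces the per-row name-lookup and RiskValue() functions with plain LEFT JOINs. It runs on Python's built-in sqlite3 module only so that it is self-contained; the SELECT is ordinary SQL and should carry over to MySQL, where GREATEST() takes the place of SQLite's multi-argument MAX(). The sample rows, the trimmed column lists and the output aliases are assumptions made for the example, not the asker's real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (UserID INTEGER PRIMARY KEY, LastName TEXT, FirstName TEXT);
CREATE TABLE risks (RiskID INTEGER PRIMARY KEY, CreatorID INT, OwnerID INT,
                    ApproverID INT, Likelihood INT, Technical INT,
                    Schedule INT, Cost INT);
CREATE TABLE riskmatrixthresholds (Likelihood INT, Consequence INT, Level REAL);

-- Hypothetical sample rows, loosely modelled on the question's inserts.
INSERT INTO users VALUES (1, 'Admin', ''), (2, 'Last', 'First');
INSERT INTO risks VALUES (1, 1, 2, 1, 5, 2, 2, 5), (2, 1, 1, 2, 5, 3, 4, 2);
INSERT INTO riskmatrixthresholds VALUES (5, 4, 0.77), (5, 5, 0.95);
""")

# Each lookup that used to be a function call is now a join the optimizer
# can see. MAX(a, b, c) is SQLite's scalar maximum; use GREATEST in MySQL.
query = """
SELECT r.RiskID,
       c.LastName AS creator_lastname,
       o.LastName AS owner_lastname,
       a.LastName AS approver_lastname,
       MAX(r.Technical, r.Schedule, r.Cost) AS original_consequence,
       m.Level * 100 AS original_value
FROM risks r
LEFT JOIN users c ON c.UserID = r.CreatorID
LEFT JOIN users o ON o.UserID = r.OwnerID
LEFT JOIN users a ON a.UserID = r.ApproverID
LEFT JOIN riskmatrixthresholds m
       ON m.Likelihood = r.Likelihood
      AND m.Consequence = MAX(r.Technical, r.Schedule, r.Cost)
ORDER BY r.RiskID
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The users table is joined three times under different aliases, once per role, so each risk row is matched by key instead of triggering several hidden queries per row.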

Applying the following changes to CurrentLikelihood() and CurrentConsequence() reduced the total query execution time to 0.070 sec (exec), 0.082 sec (total).
Old Current Likelihood Query (producing slow and incorrect output)
SELECT coalesce(
(SELECT actuallikelihood
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)),
(SELECT r.likelihood
FROM risks r
WHERE r.riskid = riskidentifier)) into currentlikelihood;
return currentlikelihood;
Working CurrentLikelihood Query
SELECT actuallikelihood INTO currentlikelihood
FROM riskevents
WHERE eventid = CurrentRiskEventID(riskidentifier)
AND riskid = riskidentifier;
Old CurrentConsequence Query (producing slow and incorrect output)
SELECT coalesce(
(SELECT GREATEST(actualtechnical, actualschedule, actualcost)
FROM riskevents
WHERE id = CurrentRiskEventID(riskidentifier, 0)
and actualtechnical is not null
and actualschedule is not null),
(SELECT greatest(technical, schedule, cost)
from risks
Where riskid = riskidentifier)
) into currentconsequence;
Working CurrentConsequence Query
SELECT GREATEST(actualtechnical, actualschedule, actualcost) INTO currentconsequence
FROM riskevents
WHERE eventid = CurrentRiskEventID(riskidentifier)
AND riskid = riskidentifier;
Old CurrentRiskEventID() Query
select MAX(e.EventID) into currentriskeventid
FROM riskevents e
WHERE e.eventstatus not in('Open')
AND e.riskid = riskidentifier;
Modified CurrentRiskEventID() Query
SELECT MAX(e.EventID) INTO currentriskeventid
FROM riskevents e
WHERE e.riskid = riskidentifier AND
(e.eventstatus != 'Open'
OR
(e.EventID = 0 AND e.eventstatus = 'Open'));

Related

mysql: Finding a max in each column of a table

I have the table below. I'm looking for both the max value in each column AND its matching username (all NULL values to be ignored).
A bunch of mad googling has led me to believe I need to find the max values and then use a second query to find the matching username.
But is there a query that can return this in one go?
ID username Vale Jorge Andrea
-------------------------------------------
01 John 2 6 NULL
02 Ted NULL 0 0
03 Marcy NULL 2 1
Output would be...
John Jorge 6
John Vale 2
Marcy Andrea 1
There are different ways of looking at it. Here's a query that returns a row for each username that has a matching max value:
SELECT
username
, IF (max_vale = t.vale, max_vale, NULL) AS for_vale
, IF (max_jorge = t.jorge, max_jorge, NULL) AS for_jorge
, IF (max_andrea = t.andrea, max_andrea, NULL) AS for_andrea
FROM (
SELECT
MAX(vale) AS max_vale
, MAX(jorge) AS max_jorge
, MAX(andrea) AS max_andrea
FROM t
) y
JOIN t ON (
t.vale = max_vale
OR t.jorge = max_jorge
OR t.andrea = max_andrea
)
http://sqlfiddle.com/#!9/58e37d/5
This gives:
username for_vale for_jorge for_andrea
----------------------------------------------
John 2 6 (null)
Marty (null) (null) 1
Basically, all I'm doing is selecting the specific column max values, then using that query as the source for another query that just looks at the MAX generated columns, and filters (IF()) based on the matches found.
This is one reasonably simple way of doing it... make a table of all the maximum values and join it to the usernames on matching the value with the max. Uses the fact that NULL isn't equal to anything. I haven't attempted to order the results but that is easy enough with adding an ORDER BY clause.
select username, name, COALESCE(mv.vale, mv.jorge, mv.Andrea) as value
from table1
join
(select 'Vale' as name, max(vale) as vale, NULL as jorge, NULL as andrea from table1
union ALL
select 'Jorge', NULL, max(jorge), NULL from table1
union all
select 'Andrea', NULL, NULL, max(andrea) from table1) mv
on table1.vale = mv.vale or table1.jorge = mv.jorge or table1.andrea = mv.andrea
Output
username name value
John Vale 2
John Jorge 6
Marcy Andrea 1
To extrapolate this to more columns is reasonably straightforward (if somewhat painful) e.g. to add a column called fred you would use (changes inside **):
select username, name, COALESCE(mv.vale, mv.jorge, mv.Andrea**, mv.fred**) as value
from table1
join
(select 'Vale' as name, max(vale) as vale, NULL as jorge, NULL as andrea**, NULL as fred** from table1
union ALL
select 'Jorge', NULL, max(jorge), NULL**, NULL** from table1
union all
select 'Andrea', NULL, NULL, max(andrea)**, NULL** from table1
**union all
select 'Fred', NULL, NULL, NULL, max(fred) from table1**) mv
on table1.vale = mv.vale or table1.jorge = mv.jorge or table1.andrea = mv.andrea **or table1.fred = mv.fred**
If you have access to stored procedures, you can also do it like this (in a much more flexible way in terms of columns)
DROP PROCEDURE IF EXISTS list_maxes;
DELIMITER //
CREATE PROCEDURE list_maxes(tname VARCHAR(20), column_list VARCHAR(1000))
BEGIN
DECLARE maxv INT DEFAULT 0;
DECLARE cpos INT;
DECLARE colname VARCHAR(20);
-- loop through the column names
WHILE (LENGTH(column_list) > 0)
DO
SET cpos = LOCATE(',', column_list);
IF (cpos > 0) THEN
SET colname = LEFT(column_list, cpos - 1);
SET column_list = SUBSTRING(column_list, cpos + 1);
ELSE
SET colname = column_list;
SET column_list = '';
END IF;
-- find the maximum value of this column
SET @getmax = CONCAT('SELECT MAX(', colname, ') INTO @maxv FROM ', tname);
PREPARE s1 FROM @getmax;
EXECUTE s1;
DEALLOCATE PREPARE s1;
-- now find the user with the maximum value
SET @finduser = CONCAT("SELECT username, '", colname, "' AS name, ", colname, ' AS value FROM ', tname,' WHERE ', colname, ' = ', @maxv);
PREPARE s2 FROM @finduser;
EXECUTE s2;
DEALLOCATE PREPARE s2;
END WHILE;
END//
DELIMITER ;
CALL list_maxes('table1', 'Vale,Jorge,Andrea')
Output
John Vale 2
John Jorge 6
Marcy Andrea 1
This is quite long but working. I combine all columns into one table using UNION ALL. Then get the max value per surname. Join this to the original table using the surname and value. Order by value in descending order.
select tv.*
from( select surname, max(val) as maxval
from (
select username,'vale' as surname,vale as val
from tbl
union all
select username,'jorge' as surname,jorge
from tbl
union all
select username,'andrea' as surname,andrea
from tbl) tab
group by surname) tt
join (
select username,'vale' as surname,vale as val
from tbl
union all
select username,'jorge' as surname,jorge
from tbl
union all
select username,'andrea' as surname,andrea
from tbl) tv
on tt.surname=tv.surname and tt.maxval=tv.val
order by tv.val desc;
As a completely different way to approach the request, we can look at the model itself: what appears to be domain content (Vale, Jorge, Andrea) is represented as columns instead of rows. That shape is typical of the output of an aggregate query (a pivot or rollup that produces totals or groupings, for instance). If the underlying data is stored row by row, the query should probably be written against that underlying source instead.
Basically, I wonder why there are Vale, Jorge and Andrea columns at all in the database. This implies it's already been summarized.
So we may look at an alternate model that is notably easier to navigate for these purposes:
CREATE TABLE IF NOT EXISTS `user` (
`id` MEDIUMINT NOT NULL AUTO_INCREMENT,
`username` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `prospect` (
`id` MEDIUMINT NOT NULL AUTO_INCREMENT,
`name` varchar(20) NOT NULL,
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
CREATE TABLE IF NOT EXISTS `aggregate` (
`id` MEDIUMINT NOT NULL AUTO_INCREMENT,
`user_id` int(6) unsigned NOT NULL,
`prospect_id` int(6) unsigned NOT NULL,
`total` int(6) unsigned NOT NULL,
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
SELECT
user.username
, prospect.name
, MAX(aggregate.total) AS max_aggregate
FROM aggregate
JOIN user ON user_id = user.id
JOIN prospect ON prospect_id = prospect.id
GROUP BY username
This produces:
John Andrea 6
Marty Jorge 2
Ted Jorge 5
http://sqlfiddle.com/#!9/07ba0c/1
This may not be useful to you now, but it will make more sense as your experience with advanced querying grows. The main difficulty may be that the core data has already been pivoted, which makes the querying harder because what you want is a different dimension from the one already derived.

Speeding up a MySQL insert select into with 10 millions records

I have a few tables storing daily orders, customers and salespersons. The schema was not well designed: columns have inappropriate data values and types, indexes and partitions are missing, and so on. I designed a new schema and am populating the new tables from the old, broken ones. I am now stuck on populating the daily orders table (around 10M records).
The data definition and the SQL script that populates the table are given below.
TABLE DEFINITION
CREATE TABLE IF NOT EXISTS `testing`.`Orders` (
`order_ID` INT UNSIGNED NOT NULL AUTO_INCREMENT,
`ord_id` BIGINT UNSIGNED NOT NULL,
`create_time` DATETIME NOT NULL,
`create_date` DATE NOT NULL,
`cust_id` MEDIUMINT UNSIGNED NOT NULL,
`cust_mob` BIGINT UNSIGNED NULL,
`sales_id` MEDIUMINT UNSIGNED NULL,
`sales_mob` BIGINT UNSIGNED NULL,
`sales_flag` TINYINT UNSIGNED NULL,
`comm_flag` TINYINT UNSIGNED NULL,
`extraprice` TINYINT UNSIGNED NULL,
PRIMARY KEY (`order_ID`),
INDEX `Date_cust_id` (`create_date` ASC, `cust_id` ASC),
INDEX `Date_cust_mob` (`create_date` ASC, `cust_mob` ASC),
INDEX `Date_dri_id` (`create_date` ASC, `sales_id` ASC),
INDEX `Date_dri_mob` (`create_date` ASC, `sales_mob` ASC),
INDEX `Date_cust` (`create_date` ASC, `cust_id` ASC, `cust_mob` ASC),
INDEX `Date_dri` (`create_date` ASC, `sales_id` ASC, `sales_mob` ASC),
INDEX `cust` (`cust_id` ASC, `cust_mob` ASC),
INDEX `dri` (`sales_id` ASC, `sales_mob` ASC),
UNIQUE INDEX `ord_id_UNIQUE` (`ord_id` ASC)
)
ENGINE = InnoDB
DEFAULT CHARACTER SET = utf8;
This script populates the table. It involves two left-joined tables: the Pag table with 6xx K records and the dri table with 3x K records.
SET SQL_SAFE_UPDATES=0;
SET SQL_MODE='';
DROP PROCEDURE IF EXISTS testing.populate_ord1;
DELIMITER $$
CREATE PROCEDURE testing.populate_ord1()
BEGIN
PREPARE stmt
FROM "
INSERT INTO testing.Orders
SELECT
1
,ord_id
,CASE WHEN TRIM(create_time) ='NULL' THEN NULL ELSE STR_TO_DATE(substring(create_time,1,19), '%Y-%m-%d %H:%i:%s') END AS create_time
,CASE WHEN TRIM(create_time) ='NULL' THEN NULL ELSE DATE(STR_TO_DATE(substring(create_time,1,19), '%Y-%m-%d %H:%i:%s')) END AS create_date
,CASE WHEN TRIM(ord.cust_id) = 'NULL' THEN NULL else pag.cust_id END as cust_id
,CASE WHEN TRIM(ord.mob) = 'NULL' THEN NULL else pag.cust_mob END as cust_mob
,CASE WHEN TRIM(ord.sales_id) = 'NULL' THEN NULL else dri.sales_id END as sales_id
,CASE WHEN TRIM(ord.mob1) = 'NULL' THEN NULL else dri.sales_mob END as sales_mob
,CASE WHEN TRIM(sales_flag) ='NULL' THEN NULL ELSE CONVERT(TRIM(sales_flag),UNSIGNED INTEGER) end AS sales_flag
,CASE WHEN TRIM(comm_flag) ='NULL' THEN NULL ELSE CONVERT(TRIM(comm_flag),UNSIGNED INTEGER) end AS comm_flag
,CASE WHEN TRIM(extraprice) ='NULL' THEN NULL ELSE CONVERT(TRIM(extraprice),UNSIGNED INTEGER) end AS extraprice
FROM testing.ord_table ord
LEFT JOIN
(SELECT cust_id,customer_id,cust_mob FROM testing.Passenger) pag
ON TRIM(ord.customer_id) = TRIM(pag.pag_id)
AND TRIM(ord.mob) = TRIM(pag.passenger_mob)
LEFT JOIN
(SELECT sales_id,salesperson_id,sales_mob FROM testing.sales) dri
ON TRIM(ord.salesperson_id) = TRIM(dri.sales_id)
AND TRIM(ord.mob1) = TRIM(dri.sales_mob)
WHERE ord_id != 'NULL' AND create_time IS NOT NULL AND create_time != 'NULL' AND YEAR(create_time) = ? AND MONTH(create_time) = ? AND DAY(create_time) = ?
GROUP BY ord_id
ON DUPLICATE KEY UPDATE ord_id = ord_id
;
";
SET @y = 2014, @m = 9, @d = 1;
WHILE @y<= 2014 DO
WHILE @m<= 12 DO
SET @d = 1;
WHILE @d<= 31 DO
EXECUTE stmt USING @y, @m, @d;
SET @d = @d + 1;
END WHILE;
SET @m = @m + 1;
END WHILE;
SET @y = @y + 1;
SET @m = 1;
END WHILE;
DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;
set autocommit=0;
call testing.populate_ord1();
COMMIT;
I have failed to populate any records into the table. Sometimes it raises a lock wait timeout error or a data type error, or it simply takes too long (2 days); I doubt it is doing any work at all.
I searched the web a bit and added the following settings to my.cnf.
innodb_autoinc_lock_mode = 2
innodb_lock_wait_timeout = 150
innodb_flush_log_at_trx_commit = 2
innodb_buffer_pool_size = 14G
Would anyone advise on how I could accomplish the same task efficiently? The code above runs without any syntax error. If there is any naming confusion, please let me know whether it is critical to clarify, as I slightly tweaked the variable and table names.
Start by performing
UPDATE ... SET
comm_flag = TRIM(comm_flag),
sales_flag = TRIM(sales_flag),
...
That will speed up the subsequent queries some, and simplify them.
Then avoid using LEFT JOIN ( SELECT ... FROM x WHERE ... ). Instead, see if you can turn that into LEFT JOIN x ON ... WHERE .... That is likely to help.
It is usually a bad idea to split a DATE and TIME into two columns. Or do you have a good argument for such? Let's see the queries that touch that pair of columns.
There is no need for STR_TO_DATE() if the string is already a properly formatted DATE or DATETIME. That is, a string works just fine.
Once the TRIM is out of the way, CONVERT(TRIM(comm_flag),UNSIGNED INTEGER) can be simply comm_flag.
Don't loop through things a day at a time -- the way you have it structured, each execution will be doing a full table scan! As written, the loop executes 124 times (4 months x 31 days) !! (This is likely to be the biggest performance issue.)
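The last point can be sketched as one INSERT ... SELECT over a half-open date range instead of one execution per day. This is only an illustration under stated assumptions: it runs on Python's built-in sqlite3 module so it is self-contained, and the tables ord_table_demo and orders_demo, their columns and the sample rows are hypothetical, simplified stand-ins for the real ones. It also assumes create_time holds ISO-formatted strings, which compare in chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-ins for the source and target tables.
CREATE TABLE ord_table_demo (ord_id TEXT, create_time TEXT);
CREATE TABLE orders_demo (
    order_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    ord_id      INTEGER NOT NULL UNIQUE,
    create_time TEXT NOT NULL
);
CREATE INDEX idx_ord_create_time ON ord_table_demo (create_time);

INSERT INTO ord_table_demo VALUES
    ('101', '2014-09-01 08:00:00'),
    ('102', '2014-10-15 12:30:00'),
    ('103', '2014-12-31 23:59:59'),
    ('104', '2015-01-01 00:00:00'),   -- outside the range
    ('NULL', '2014-11-11 11:11:11');  -- placeholder row to skip
""")

# One statement covers the whole period. The raw column is compared against
# range bounds instead of being wrapped in YEAR()/MONTH()/DAY(), so an index
# on create_time remains usable by the optimizer.
conn.execute(
    """
    INSERT INTO orders_demo (ord_id, create_time)
    SELECT CAST(ord_id AS INTEGER), create_time
    FROM ord_table_demo
    WHERE ord_id <> 'NULL'
      AND create_time >= ?
      AND create_time <  ?
    """,
    ("2014-09-01", "2015-01-01"),
)
conn.commit()

inserted = conn.execute(
    "SELECT ord_id, create_time FROM orders_demo ORDER BY ord_id"
).fetchall()
for row in inserted:
    print(row)
```

If a single statement over 10M rows turns out to be too large for one transaction, the same range predicate can be run per month with different bounds, which still avoids the per-day function calls on the column.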

mySQL: Query performance with User Defined Functions

I'm trying to make a multi-language mySQL database.
I was considering using user defined functions to determine which column in a table to read - where different columns will store the different translations - however I was concerned about query performance.
For example, if I wanted to return a list of Cities, and the name of country - with the latter returned in multiple languages. At what point would the below approach impact performance? - as I might do an equivalent on a table with 2-5,000 rows.
Country table structure:
SELECT
`CountryID`,
`Name_English`,
`Name_French`,
`Name_Spanish`
FROM `Country`
WHERE `CountryID` = fCountryID;
City Table Structure:
SELECT
`City`.`CityName` 'City'
,getCountry(1, `City`.`CountryID`) 'Country'
FROM `City`;
Example Function Call:
SELECT
`City`.`CityName` 'City'
,getCountry(1, `City`.`CountryID`) 'Country'
FROM `City`;
Full Function:
delimiter $$
CREATE DEFINER=root@localhost FUNCTION
getCountry(fLanguageID INT, fCountryID SMALLINT)
RETURNS varchar(100) CHARSET utf8 COLLATE utf8_unicode_ci
BEGIN
DECLARE returnCountry VARCHAR(100);
IF (fLanguageID = 1) -- English
THEN
SET returnCountry = (
SELECT `Name_English` FROM `Country`
WHERE `CountryID` = fCountryID
);
ELSEIF (fLanguageID = 2) -- French
THEN
SET returnCountry = (
SELECT `Name_French` FROM `Country`
WHERE `CountryID` = fCountryID
);
ELSEIF (fLanguageID = 3) -- Spanish
THEN
SET returnCountry = (
SELECT `Name_Spanish` FROM `Country`
WHERE `CountryID` = fCountryID
);
END IF;
RETURN returnCountry;
END$$

MySQL INSERT INTO with SELECT and INNER JOIN with CASE

The following code reads:
-- Code created by Michael Berkowski
create table dvd (
dvd_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
);
INSERT INTO dvd VALUES (1),(2),(3),(4);
CREATE TABLE dvd_price (
dvd_price_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
dvd_id INT NOT NULL,
rental_amount INT,
time_rented_for VARCHAR(10)
);
INSERT INTO dvd_price (dvd_id, rental_amount, time_rented_for)
SELECT
dvd_id,
2 AS rental_amount,
rental_period
FROM
dvd
CROSS JOIN (
-- This is where I'm having issues
SELECT (CASE dvd.dvd_id
WHEN dvd.dvd_id = 1
THEN '1-Day'
ELSE '3-Day'
END) AS rental_period
) rental_periods
Why can I not do a CASE statement after the CROSS JOIN and how would I fix this?
I get the error, "Unknown table 'dvd' in field list:", what is a better way of writing this?
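An ordinary (non-LATERAL) derived table in the FROM clause cannot refer to columns of another table in the same FROM clause, which is why dvd is unknown inside the CROSS JOIN subquery. Move the CASE into the SELECT list, and use the searched form (CASE WHEN condition THEN ...) rather than mixing it with the CASE value WHEN form:
INSERT INTO dvd_price (dvd_id, rental_amount, time_rented_for)
SELECT
dvd_id,
2 AS rental_amount,
CASE
WHEN dvd.dvd_id = 1
THEN '1-Day'
ELSE '3-Day'
END
FROM
dvd;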
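To check how an INSERT ... SELECT with the CASE in the select list behaves, here is a small self-contained sketch. It runs on Python's built-in sqlite3 module rather than MySQL (an assumption made only so the example is runnable anywhere), reuses the table definitions from the question with SQLite's AUTOINCREMENT spelling, and writes the condition as a searched CASE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dvd (dvd_id INTEGER PRIMARY KEY);
INSERT INTO dvd VALUES (1), (2), (3), (4);

CREATE TABLE dvd_price (
    dvd_price_id    INTEGER PRIMARY KEY AUTOINCREMENT,
    dvd_id          INT NOT NULL,
    rental_amount   INT,
    time_rented_for VARCHAR(10)
);

-- The CASE sits in the SELECT list, where dvd's columns are in scope.
INSERT INTO dvd_price (dvd_id, rental_amount, time_rented_for)
SELECT dvd_id,
       2 AS rental_amount,
       CASE WHEN dvd_id = 1 THEN '1-Day' ELSE '3-Day' END
FROM dvd;
""")

prices = conn.execute(
    "SELECT dvd_id, rental_amount, time_rented_for "
    "FROM dvd_price ORDER BY dvd_id"
).fetchall()
for row in prices:
    print(row)
```

Only dvd 1 gets the '1-Day' period; every other row falls through to the ELSE branch.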

How to declare table in SQL Server?

I am trying to create a function that returns a table, but the function cannot even be created. How do I return the resulting table?
My script is like this
create function FNC_getPackageListById(@PkId int )
returns table
as
return
if exists (select Date1, Date2 from PromotionPackage
where PkId = @PkId and Date1 is null and Date2 is null)
begin
select Rate,Remarks,PackageName from Package where PkId=@PkId
end
else
begin
select p.Rate,
p.Remarks,
p.PackageName,
pp.Date1,
pp.Date2
from PromotionPackage pp,
Package p
where pp.PkId=p.PkId and p.PkId=@PkId
end
end
A function that returns a table is called a table-valued function. An inline one (RETURNS TABLE ... RETURN) can contain only a single SELECT, so IF/ELSE logic needs the multi-statement form, which declares the returned table variable up front. See this example:
CREATE FUNCTION TrackingItemsModified(@minId int)
RETURNS @trackingItems TABLE (
Id int NOT NULL,
Issued date NOT NULL,
Category int NOT NULL,
Modified datetime NULL
)
AS
BEGIN
INSERT INTO @trackingItems (Id, Issued, Category)
SELECT ti.Id, ti.Issued, ti.Category
FROM TrackingItem ti
WHERE ti.Id >= @minId;
RETURN;
END;