Slow nested function - mysql

Please let me know what information you seek to improve the question, rather than just downvoting.
I have a function that looks like this:
DELIMITER $$
DROP FUNCTION IF EXISTS f_splitadjprice$$
CREATE FUNCTION f_splitadjprice (id CHAR(8), startdate DATE)
RETURNS FLOAT
BEGIN
DECLARE splitfactor FLOAT;
DECLARE splitadjprice FLOAT;
SELECT f_splitfactor(id, startdate) INTO splitfactor;
SELECT (SELECT f.p_price FROM fp_v2_fp_basic_prices AS f WHERE f.fsym_id = id AND
f.p_date = startdate) * splitfactor INTO splitadjprice;
RETURN splitadjprice;
END$$
DELIMITER ;
The function for splitfactor is:
DELIMITER $$
DROP FUNCTION IF EXISTS f_splitfactor$$
CREATE FUNCTION f_splitfactor (id CHAR(8), startdate DATE)
RETURNS FLOAT
BEGIN
DECLARE splitfactor FLOAT;
SELECT IFNULL(EXP(SUM(LOG(f.p_split_factor))),1) INTO splitfactor
FROM fp_v2_fp_basic_splits AS f
WHERE f.fsym_id = id AND f.p_split_date > startdate AND f.p_split_date <
NOW();
RETURN splitfactor;
END$$
DELIMITER ;
The function f_splitadjprice runs extremely slowly: about 14 seconds per row. I have tried running the individual pieces of the function by themselves, that is, the call to f_splitfactor and the lookup SELECT f.p_price FROM fp_v2_fp_basic_prices AS f WHERE f.fsym_id = id AND f.p_date = startdate. Run on their own outside the function, each takes about 0.001 seconds. So why does combining them in a nested function make it take thousands of times longer?

The solution is to avoid querying tables inside functions. It seems to be bad practice in general, and it is extremely slow: a function called from a query is executed once for every row, and each call runs its own lookups. Instead, get rid of the function and perform the calculation directly in the query.
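As a rough sketch of what "directly in the query" could look like here, the split factor can be computed with a join and GROUP BY instead of one function call per row. This assumes that (fsym_id, p_date) identifies a single row in fp_v2_fp_basic_prices and that all split factors are positive; the alias split_adj_price is made up for illustration.

```sql
-- Sketch: compute the split-adjusted price in one set-based query
-- instead of calling f_splitadjprice() once per row.
SELECT
    p.fsym_id,
    p.p_date,
    p.p_price,
    -- EXP(SUM(LOG(x))) multiplies the matching factors;
    -- IFNULL(..., 1) covers prices with no later split.
    p.p_price * IFNULL(EXP(SUM(LOG(s.p_split_factor))), 1) AS split_adj_price
FROM fp_v2_fp_basic_prices AS p
LEFT JOIN fp_v2_fp_basic_splits AS s
       ON s.fsym_id = p.fsym_id
      AND s.p_split_date > p.p_date
      AND s.p_split_date < NOW()
GROUP BY p.fsym_id, p.p_date, p.p_price;
```

Written this way, the optimizer sees the whole query at once and can choose a join strategy, which it cannot do when the lookups are hidden inside a function body.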

Related

MySQL: Function not returning the correct integer

We have a question about a function returning the wrong integer value in MySQL. We have checked that "booked_passengers" contains the right value, 0, and the function works fine when we remove that variable and just return the integer 40. But as soon as we try to subtract "booked_passengers" from it, which should still return 40, it does not work.
The code is included below.
Thanks in advance! :-)
CREATE FUNCTION calculateFreeSeats(flightnumber INT)
RETURNS INT
NOT DETERMINISTIC
BEGIN
DECLARE booked_passengers INT;
SELECT BOOKED_PASSENGERS INTO booked_passengers FROM FLIGHT WHERE (flightnumber = NR);
RETURN (40-booked_passengers);
END $$
When a column name and a local variable name collide and the column is not qualified with a table name or alias, the variable takes precedence. So your SELECT BOOKED_PASSENGERS ... selects the variable's value (still NULL at that point), not the column's value. Qualify the column instead:
CREATE FUNCTION calculateFreeSeats(flightnumber INT)
RETURNS INT
READS SQL DATA
BEGIN
DECLARE booked_passengers INT;
SELECT FLIGHT.BOOKED_PASSENGERS INTO booked_passengers FROM FLIGHT WHERE (flightnumber = NR);
RETURN (40-booked_passengers);
END $$
On the other hand, the variable is unnecessary altogether:
CREATE FUNCTION calculateFreeSeats(flightnumber INT)
RETURNS INT
READS SQL DATA
RETURN (SELECT 40 - BOOKED_PASSENGERS FROM FLIGHT WHERE flightnumber = NR LIMIT 1);
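Another way to rule out the collision is to give parameters and variables names that cannot match a column. The following is a minimal, self-contained sketch; the table flight_demo, the function free_seats_demo, and the p_/v_ prefixes are all hypothetical and only stand in for the real FLIGHT table.

```sql
-- Hypothetical stand-in for the FLIGHT table.
CREATE TABLE flight_demo (
    nr INT PRIMARY KEY,
    booked_passengers INT NOT NULL
);
INSERT INTO flight_demo (nr, booked_passengers) VALUES (1, 12);

DELIMITER $$
CREATE FUNCTION free_seats_demo(p_flightnumber INT)
RETURNS INT
READS SQL DATA
BEGIN
    -- The v_ prefix keeps the variable distinct from the column name.
    DECLARE v_booked INT DEFAULT 0;
    SELECT f.booked_passengers INTO v_booked
    FROM flight_demo AS f
    WHERE f.nr = p_flightnumber;
    RETURN 40 - v_booked;
END$$
DELIMITER ;

SELECT free_seats_demo(1);  -- 28 (40 seats minus 12 booked)
```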

Very slow execution when calling function

I have two functions. The second function uses the output from the first function.
One is:
DELIMITER $$
DROP FUNCTION IF EXISTS fp_splitfactor$$
CREATE FUNCTION fp_splitfactor (id CHAR(8), startdate DATE)
RETURNS FLOAT
BEGIN
DECLARE splitfactor FLOAT;
SELECT IFNULL(EXP(SUM(LOG(f.p_split_factor))),1) INTO splitfactor
FROM fp_v2_fp_basic_splits AS f
WHERE f.fsym_id = id AND f.p_split_date > startdate AND f.p_split_date < NOW();
RETURN splitfactor;
END$$
DELIMITER ;
Second one is:
DELIMITER $$
DROP FUNCTION IF EXISTS fp_splitadjprice$$
CREATE FUNCTION fp_splitadjprice (id CHAR(8), startdate DATE)
RETURNS FLOAT
BEGIN
DECLARE splitfactor FLOAT;
DECLARE splitadjprice FLOAT;
DECLARE spinofffactor FLOAT;
SET splitfactor = 1.0;
SELECT fp_splitfactor(id, startdate) INTO splitfactor;
SELECT (p_price * splitfactor) INTO splitadjprice
FROM fp_v2_fp_basic_prices
WHERE fsym_id = id AND p_date = startdate;
RETURN splitadjprice;
END$$
DELIMITER ;
I then try to execute the following query:
SELECT
p.fsym_id,
b.p_co_sec_name_desc AS Company_Name,
b.region AS Region,
p.p_date,
p.p_price AS Unadjusted_Price,
fp_splitadjprice(p.fsym_id,p_date) AS Adjusted_Price
FROM
fp_v2_fp_basic_prices p
LEFT JOIN (
SELECT r2.region, b2.p_co_sec_name_desc, b2.fsym_id
FROM fp_v2_fp_sec_coverage b2
LEFT JOIN sym_v1_sym_region r2 ON b2.fsym_id = r2.fsym_id
WHERE r2.region = "EUR") b
ON b.fsym_id =p.fsym_id
So basically my query calls the second function, which in turn calls the first function to return a value to the query. The execution is extremely slow, though, and I do not understand why.
I found out that the slow execution was entirely due to MySQL Workbench not handling large datasets well. Once I migrated everything to BigQuery on Google Cloud, everything worked perfectly.
STAY AWAY FROM CALLING FUNCTIONS ON LARGE DATASETS IN MySQL Workbench!
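Whatever client is used, each call to fp_splitadjprice runs two lookups on the server, so one thing that could be worth checking is whether those lookups are indexed. This is a minimal sketch, assuming no suitable composite indexes exist yet; the index names and the literal values in the EXPLAIN are hypothetical.

```sql
-- See which indexes the two tables already have.
SHOW INDEX FROM fp_v2_fp_basic_splits;
SHOW INDEX FROM fp_v2_fp_basic_prices;

-- Composite indexes matching the WHERE clauses inside the two functions
-- (hypothetical names; skip if equivalent indexes already exist).
CREATE INDEX idx_splits_id_date ON fp_v2_fp_basic_splits (fsym_id, p_split_date);
CREATE INDEX idx_prices_id_date ON fp_v2_fp_basic_prices (fsym_id, p_date);

-- Check that the per-row price lookup can use an index
-- (the id and date are made-up sample values).
EXPLAIN
SELECT p_price
FROM fp_v2_fp_basic_prices
WHERE fsym_id = 'ABC123-S' AND p_date = '2020-01-02';
```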

MYSQL - table not updating from Procedure

I want to get the distance between two GeoPoints (using LatLong). For that I wrote a GETDISTANCE function based on the solution provided in "MySQL Function to calculate distance between two latitudes and longitudes". If I call the function independently, it works like a charm.
As far as I understand, I cannot return a result set from a function in MySQL, so I created a procedure and called the function inside it, as follows:
DELIMITER $$
CREATE PROCEDURE GetNearByGeoPoints(IN Lat REAL, IN Longi REAL)
BEGIN
DECLARE v_max int;
DECLARE v_counter int unsigned default 0;
SET @v_max = (SELECT COUNT(*) FROM TransmitterPointsData);
START TRANSACTION;
WHILE v_counter < v_max
DO
SELECT @coverageID :=CoverageID, @tableLatitude := Latitude, @tableLongitude :=Longitude FROM TransmitterPointsData LIMIT v_counter,1;
SET @Dist= GETDISTANCE(Lat, Longi, tableLatitude, tableLongitude);
UPDATE TransmitterPointsData SET DynamicDistance = @Dist WHERE CoverageID= @coverageID;
set v_counter=v_counter+1;
END WHILE;
COMMIT;
SELECT * FROM TransmitterPointsData;
END $$
DELIMITER ;
What I am trying to do is take a latitude/longitude pair from the user and compare it with each latitude/longitude pair in the table. After getting the output from the function, I update the TransmitterPointsData table with a WHERE condition on CoverageID.
This is my first MySQL query. As far as I can tell I followed the syntax, but I do not know why I am getting all NULL values in the DynamicDistance column.
Thank You in Advance
Try replacing the while loop with this:
UPDATE TransmitterPointsData
SET DynamicDistance = GETDISTANCE(Lat, Longi, Latitude, Longitude);
Much shorter, and you avoid potential issues with row selection via LIMIT + offset (which is poor style at best and, without an ORDER BY, returns rows in no guaranteed order at worst).
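Put together, the whole procedure could shrink to something like the sketch below. It assumes GETDISTANCE takes its arguments in the order used in the question; the parameter names p_lat and p_long are hypothetical, chosen so they cannot be confused with column names.

```sql
DELIMITER $$
DROP PROCEDURE IF EXISTS GetNearByGeoPoints$$
CREATE PROCEDURE GetNearByGeoPoints(IN p_lat REAL, IN p_long REAL)
BEGIN
    -- One set-based UPDATE replaces the counter, the loop, and the user variables.
    UPDATE TransmitterPointsData
    SET DynamicDistance = GETDISTANCE(p_lat, p_long, Latitude, Longitude);

    -- Return the rows, nearest first.
    SELECT *
    FROM TransmitterPointsData
    ORDER BY DynamicDistance;
END$$
DELIMITER ;

-- Example call with made-up coordinates.
CALL GetNearByGeoPoints(52.52, 13.405);
```

This also sidesteps a likely cause of the NULL values: in the original procedure the count is stored in the user variable @v_max, while the loop condition tests the local variable v_max, which is never assigned and therefore stays NULL, so the loop body would never run.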

User defined function only returns NULL

I have the following MySQL code:
DELIMITER $$
CREATE FUNCTION durationInMinutes(id INT)
RETURNS INT DETERMINISTIC
BEGIN
DECLARE Minutes INT;
SET Minutes =
(SELECT TIME_TO_SEC(TIMEDIFF(timeDeparture, timeArrival)) FROM AirRoute
WHERE pk_id = id) / 60;
RETURN Minutes;
END$$
DELIMITER ;
Basically, this function calculates the duration of a flight in minutes. The parameter is the id of the flight. For some reason though, this function always returns NULL. I even checked this:
SELECT (SELECT TIME_TO_SEC(TIMEDIFF(timeDeparture, timeArrival)) FROM AirRoute
WHERE pk_id = 925) / 60;
This does return the correct answer if I put id = 925, so there could be something wrong with the RETURN statement.
I suspect there is a column called id in the table. I always name parameters and local variables in a way to distinguish them from column names:
CREATE FUNCTION durationInMinutes (
in_id INT
)
RETURNS INT DETERMINISTIC
BEGIN
DECLARE out_Minutes INT;
SELECT TIME_TO_SEC(TIMEDIFF(timeDeparture, timeArrival)) / 60 INTO out_Minutes
FROM AirRoute ar
WHERE ar.pk_id = in_id;
RETURN out_Minutes;
END$$
Ok, I solved it. This is my corrected code:
DELIMITER $$
CREATE FUNCTION durationInMinutes(id INT)
RETURNS INT DETERMINISTIC
BEGIN
RETURN (SELECT TIME_TO_SEC(TIMEDIFF(timeDeparture, timeArrival))
FROM AirRoute
WHERE pk_id = id) / 60;
END$$
DELIMITER ;
Still, I really don't understand why it wasn't possible using a temp variable.
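For what it is worth, a temporary variable does work when the names cannot collide and the value is assigned with SELECT ... INTO. The sketch below is self-contained and uses a hypothetical table AirRouteDemo with TIME columns (the real column types are not shown in the question); it also subtracts departure from arrival, unlike the original, so that the duration comes out positive.

```sql
-- Hypothetical stand-in for the AirRoute table.
CREATE TABLE AirRouteDemo (
    pk_id INT PRIMARY KEY,
    timeDeparture TIME NOT NULL,
    timeArrival TIME NOT NULL
);
INSERT INTO AirRouteDemo VALUES (925, '10:00:00', '12:30:00');

DELIMITER $$
CREATE FUNCTION durationInMinutesDemo(in_id INT)
RETURNS INT
READS SQL DATA
BEGIN
    DECLARE v_minutes INT;
    -- Assign the query result to the local variable with INTO.
    SELECT TIME_TO_SEC(TIMEDIFF(ar.timeArrival, ar.timeDeparture)) / 60
    INTO v_minutes
    FROM AirRouteDemo AS ar
    WHERE ar.pk_id = in_id;
    RETURN v_minutes;
END$$
DELIMITER ;

SELECT durationInMinutesDemo(925);  -- 150 (2 h 30 min)
```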

GROUP_CONCAT as input of MySQL function

Is it possible to use GROUP_CONCAT in a SELECT as the input to a MySQL function? I cannot figure out which type to declare for the parameter. I've tried BLOB. I've tried TEXT (and then using another function to break it up into a result set), but I haven't had any success.
I want to use it like this:
SELECT
newCustomerCount(GROUP_CONCAT(DISTINCT items.invoicenumber)) AS new_customers
FROM items;
Here is the function:
DROP FUNCTION IF EXISTS newCustomerCount;
DELIMITER $$
CREATE FUNCTION newCustomerCount(invoicenumbers BLOB)
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE new_customers INT;
SET new_customers = 0;
SELECT
SUM(nc.customer) INTO new_customers
FROM (
SELECT
1 AS customer,
(SELECT COUNT(*) FROM person_to_invoice ps2 WHERE person_id = ps1.person_id AND invoice < ps1.invoice) AS previous_invoices
FROM person_to_invoice ps1
WHERE invoice IN(invoicenumbers)
HAVING previous_invoices = 0
) nc;
RETURN new_customers;
END$$
DELIMITER ;
Because MySQL stored functions do not support dynamic SQL, I recommend you rethink your basic strategy of passing a list of invoice numbers to your function. Instead, you could modify your function to accept a single invoice number and return the number of new customers for just that one invoice number.
Also, there are some optimizations you can make in your query for finding the number of new customers.
DROP FUNCTION IF EXISTS newCustomerCount;
DELIMITER $$
CREATE FUNCTION newCustomerCount(p_invoice INT)
RETURNS INT
DETERMINISTIC
BEGIN
DECLARE new_customers INT;
SET new_customers = 0;
SELECT
COUNT(DISTINCT ps1.person_id) INTO new_customers
FROM
person_to_invoice ps1
WHERE
ps1.invoice = p_invoice
AND NOT EXISTS (
SELECT 1
FROM person_to_invoice ps2
WHERE ps1.person_id = ps2.person_id
AND ps2.invoice < ps1.invoice
);
RETURN new_customers;
END$$
DELIMITER ;
Then you can still get the total number of new customers for a given list of invoice numbers like this:
SELECT
SUM(newCustomerCount(invoicenumber)) AS total_new_customers
FROM items
WHERE ...
You could try FIND_IN_SET() instead of IN(). Performance will probably be poor when passing in a long list of invoice numbers, since FIND_IN_SET() cannot use an index on invoice, but it should work.
WHERE FIND_IN_SET(invoice, invoicenumbers)
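The difference between the two operators can be seen with literals alone. In this small sketch the comma-separated string plays the role of the invoicenumbers parameter:

```sql
-- IN() sees ONE string value; for the comparison '1,2,3' is converted
-- to the number 1 (with a truncation warning), so 2 is not "in" it.
SELECT 2 IN ('1,2,3') AS in_result;                 -- 0

-- FIND_IN_SET() splits the string on commas and returns the 1-based
-- position of the match (0 when there is none).
SELECT FIND_IN_SET(2, '1,2,3') AS find_in_set_pos;  -- 2

-- GROUP_CONCAT output is truncated at group_concat_max_len
-- (1024 bytes by default), so a long list may need a higher limit.
SET SESSION group_concat_max_len = 1000000;
```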
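You are looking in the wrong place.
WHERE invoice IN(invoicenumbers) will not do the desired substitution: the whole list is treated as a single string value, not as separate numbers. Instead, you need to use CONCAT to construct the SQL text, then PREPARE and EXECUTE it. Note that MySQL allows prepared statements in stored procedures but not in stored functions.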
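A minimal sketch of that approach, written as a stored procedure because prepared statements are not available in functions. The procedure name newCustomerCountForList and the parameter p_invoice_list are hypothetical; the query reuses the person_to_invoice columns from the question. Because the list is spliced into SQL text, the sketch rejects anything that is not a plain comma-separated list of integers.

```sql
DELIMITER $$
DROP PROCEDURE IF EXISTS newCustomerCountForList$$
CREATE PROCEDURE newCustomerCountForList(IN p_invoice_list TEXT)
BEGIN
    -- Guard against SQL injection: allow digits and commas only.
    IF p_invoice_list IS NULL
       OR p_invoice_list NOT REGEXP '^[0-9]+(,[0-9]+)*$' THEN
        SIGNAL SQLSTATE '45000'
            SET MESSAGE_TEXT = 'invoice list must be comma-separated integers';
    END IF;

    -- Build the statement text with the list expanded inside IN (...).
    SET @sql = CONCAT(
        'SELECT COUNT(DISTINCT ps1.person_id) AS new_customers ',
        'FROM person_to_invoice ps1 ',
        'WHERE ps1.invoice IN (', p_invoice_list, ') ',
        'AND NOT EXISTS (SELECT 1 FROM person_to_invoice ps2 ',
        'WHERE ps2.person_id = ps1.person_id ',
        'AND ps2.invoice < ps1.invoice)'
    );

    PREPARE stmt FROM @sql;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;

-- Example call with made-up invoice numbers.
CALL newCustomerCountForList('101,102,103');
```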