I've been writing SQL queries for years but I'm stuck on this one.
I've got 2 tables in MySQL:
LOANPAYMENTSDUE includes LoanPaymentsDueId, LoanId, AmtDue, DueDate
LOANPAYMENTS includes LoanPaymentsId, LoanId, AmtPaid, PaidDate
The relationship between the tables is the LoanId, not a specific payment that is due. In a perfect world DueDate = PaidDate and AmtDue = AmtPaid. What makes this complex for me is that there is no relationship between LoanPaymentsDueId and LoanPaymentsId. The relationship exists only at the LoanId level, which allows partial payments to be made against a single LOANPAYMENTSDUE row.
I've searched the web trying to find the right query to create a report showing the date each LOANPAYMENTSDUE row was satisfied. This requires calculating the balance as of LOANPAYMENTSDUE.DueDate, because payments can be missed and a new payment should first satisfy the balance of the oldest LOANPAYMENTSDUE row.
Here is the sample data and table scripts:
CREATE TABLE LOANPAYMENTSDUE (
LoanPaymentsDueId BIGINT(20) NOT NULL AUTO_INCREMENT
, LoanId BIGINT(20)
, AmtDue double NOT NULL
, DueDate date NOT NULL
, PRIMARY KEY (LoanPaymentsDueId)
);
INSERT INTO LOANPAYMENTSDUE (LoanId, AmtDue, DueDate) VALUES (1, 100, '2013-07-15');
INSERT INTO LOANPAYMENTSDUE (LoanId, AmtDue, DueDate) VALUES (1, 100, '2013-08-15');
INSERT INTO LOANPAYMENTSDUE (LoanId, AmtDue, DueDate) VALUES (1, 100, '2013-09-15');
INSERT INTO LOANPAYMENTSDUE (LoanId, AmtDue, DueDate) VALUES (1, 100, '2013-10-15');
INSERT INTO LOANPAYMENTSDUE (LoanId, AmtDue, DueDate) VALUES (1, 100, '2013-11-15');
CREATE TABLE LOANPAYMENTS (
LoanPaymentsId BIGINT(20) NOT NULL AUTO_INCREMENT
, LoanId BIGINT(20)
, AmtPaid double NOT NULL
, PaidDate date NOT NULL
, PRIMARY KEY (LoanPaymentsId)
);
INSERT INTO LOANPAYMENTS (LoanId, AmtPaid, PaidDate) VALUES (1, 100, '2013-07-15'); /* Full pmt on due date */
INSERT INTO LOANPAYMENTS (LoanId, AmtPaid, PaidDate) VALUES (1, 100, '2013-08-10'); /* Full pmt a few days early */
INSERT INTO LOANPAYMENTS (LoanId, AmtPaid, PaidDate) VALUES (1, 100, '2013-09-22'); /* Full pmt a week late */
INSERT INTO LOANPAYMENTS (LoanId, AmtPaid, PaidDate) VALUES (1, 50, '2013-10-18'); /* Partial pmt a few days late */
INSERT INTO LOANPAYMENTS (LoanId, AmtPaid, PaidDate) VALUES (1, 50, '2013-11-07');/* Partial pmt 3 weeks late and satisfies the 10/15/2013 balance on this date */
INSERT INTO LOANPAYMENTS (LoanId, AmtPaid, PaidDate) VALUES (1, 100, '2013-11-22');/* Full pmt a week late and satisfies the 11/15/2013 pmt due */
The report query should simply provide the PAIDDATE when each LOANPAYMENTSDUE was satisfied. Using the table data above the report would be as follows:
LOANID  LOANPAYMENTSDUEID  AMTDUE  DUEDATE     PAIDDATE
1       1                  100     2013-07-15  2013-07-15
1       2                  100     2013-08-15  2013-08-10
1       3                  100     2013-09-15  2013-09-22
1       4                  100     2013-10-15  2013-11-07
1       5                  100     2013-11-15  2013-11-22
You could start with these two queries, which return all of the rows with a running-total column:
SELECT
LoanId, DueDate,
CASE WHEN LoanId=@last_LoanId THEN @Due:=@Due+AmtDue
ELSE @Due:=AmtDue END total_due,
@last_LoanId:=LoanId
FROM
LOANPAYMENTSDUE, (SELECT @last_LoanId:=NULL, @Due:=NULL) t;
SELECT
LoanId, PaidDate,
CASE WHEN LoanId=@last_LoanId THEN @Paid:=@Paid+AmtPaid
ELSE @Paid:=AmtPaid END total_paid,
@last_LoanId:=LoanId
FROM
LOANPAYMENTS, (SELECT @last_LoanId:=NULL, @Paid:=NULL) t;
and then you could use a LEFT JOIN on ld.LoanId=lp.LoanId AND total_due<=total_paid, and a GROUP BY to get the minimum date where the join succeeded:
SELECT
ld.LoanId, ld.DueDate, MIN(lp.PaidDate)
FROM
(SELECT
LoanId, DueDate,
CASE WHEN LoanId=@last_LoanId1 THEN @Due:=@Due+AmtDue ELSE @Due:=AmtDue END total_due,
@last_LoanId1:=LoanId
FROM
LOANPAYMENTSDUE, (SELECT @last_LoanId1:=NULL, @Due:=NULL) t1) ld
LEFT JOIN
(SELECT
LoanId, PaidDate,
CASE WHEN LoanId=@last_LoanId2 THEN @Paid:=@Paid+AmtPaid ELSE @Paid:=AmtPaid END total_paid,
@last_LoanId2:=LoanId
FROM
LOANPAYMENTS, (SELECT @last_LoanId2:=NULL, @Paid:=NULL) t2) lp
ON
ld.LoanId=lp.LoanId AND ld.total_due<=lp.total_paid
GROUP BY
ld.LoanId, ld.DueDate
Please see fiddle here.
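On MySQL 8.0+ you could also compute the running totals with window functions instead of user variables, which avoids relying on their evaluation order. A minimal sketch, assuming the schema above:
SELECT
    ld.LoanId, ld.LoanPaymentsDueId, ld.AmtDue, ld.DueDate, MIN(lp.PaidDate) AS PaidDate
FROM
    (SELECT LoanId, LoanPaymentsDueId, AmtDue, DueDate,
            SUM(AmtDue) OVER (PARTITION BY LoanId ORDER BY DueDate) AS total_due
     FROM LOANPAYMENTSDUE) ld
LEFT JOIN
    (SELECT LoanId, PaidDate,
            SUM(AmtPaid) OVER (PARTITION BY LoanId ORDER BY PaidDate) AS total_paid
     FROM LOANPAYMENTS) lp
    ON ld.LoanId = lp.LoanId AND ld.total_due <= lp.total_paid
GROUP BY ld.LoanId, ld.LoanPaymentsDueId, ld.AmtDue, ld.DueDate;
With the sample data this produces the same five rows as the expected report.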
Assuming that each payment is either a partial payment or pays the remaining amount in full, you can check by matching up the running Total Amount Due and Total Amount Paid. Here's the sqlFiddle example of your data and query:
SELECT T1.LoanId,
T1.LoanPaymentsDueId,
T1.AmtDue,
T1.DueDate,
T2.PaidDate
FROM
(SELECT
LD.LoanPaymentsDueId,
LD.LoanId,
LD.DueDate,
LD.AmtDue,
(SELECT Sum(AmtDue)
FROM LOANPAYMENTSDUE LD1
WHERE LD1.DueDate <= LD.DueDate
AND LD1.LoanId = LD.LoanId
)as AmtDueTotal
FROM
LOANPAYMENTSDUE LD
)T1,
(SELECT
L.LoanPaymentsId,
L.LoanId,
L.PaidDate,
(SELECT Sum(AmtPaid)
FROM LOANPAYMENTS L1
WHERE L1.PaidDate <= L.PaidDate
AND L1.LoanId = L.LoanId
)as AmtPaidTotal
FROM LOANPAYMENTS L
)T2
WHERE
T1.LoanId = T2.LoanId
AND T1.LoanId = 1
AND T1.AmtDueTotal = T2.AmtPaidTotal;
Related
I am trying to get the oldest record for every status update/change in the following table.
Table (status_updates):

id  entity_id  status    date
7   2          Approved  2022-02-10
6   2          Approved  2022-02-05
5   2          Approved  2022-02-04
4   2          OnHold    2022-02-04
3   2          OnHold    2022-02-03
2   2          Approved  2022-02-02
1   2          Approved  2022-02-01
Result Needed:

id  entity_id  status    date
5   2          Approved  2022-02-04
3   2          OnHold    2022-02-03
1   2          Approved  2022-02-01
Tried:
select
`status`,
`date`
from
`status_updates`
left join
(select
`id`,
row_number() over (partition by status_updates.entity_id, status_updates.status order by status_updates.date asc) as sequence
from
`status_updates`)
as `oldest_history`
on
`oldest_history`.`id` = `status_updates`.`id`
where `sequence` = 1
Result Achieved:

id  entity_id  status    date
3   2          OnHold    2022-02-03
1   2          Approved  2022-02-01
Just using lag:
select s.*
from (
select id, status<>coalesce(lag(status) over (partition by entity_id order by id),'') status_change
from status_updates
) ids
join status_updates s using (id)
where status_change
Here are the queries:
create table status_updates
(entity_id integer,
status varchar(32),
date date
);
insert into status_updates values (2, 'Approved', '2022-02-05');
insert into status_updates values (2, 'Approved', '2022-02-04');
insert into status_updates values (2, 'On Hold', '2022-02-04');
insert into status_updates values (2, 'On Hold', '2022-02-03');
insert into status_updates values (2, 'Approved', '2022-02-02');
insert into status_updates values (2, 'Approved', '2022-02-01');
select b.*
from status_updates a
right join status_updates b
on a.status=b.status and a.date=(b.date - interval 1 day)
where a.entity_id is null;
or this query (if you prefer a left join):
select a.*
from status_updates a
left join status_updates b
on a.status=b.status and a.date=(b.date + interval 1 day)
where b.entity_id is null;
In both you will see the expected result.
The second solution is almost the same, but joins by id instead of date:
create table status_updates
(id integer,
entity_id integer,
status varchar(32),
date date
);
insert into status_updates values (7, 2, 'Approved', '2022-02-10');
insert into status_updates values (6, 2, 'Approved', '2022-02-05');
insert into status_updates values (5, 2, 'Approved', '2022-02-04');
insert into status_updates values (4, 2, 'On Hold', '2022-02-04');
insert into status_updates values (3, 2, 'On Hold', '2022-02-03');
insert into status_updates values (2, 2, 'Approved', '2022-02-02');
insert into status_updates values (1, 2, 'Approved', '2022-02-01');
select a.*
from status_updates a
left join status_updates b
on a.status=b.status and a.id=b.id + 1
where b.entity_id is null;
The result is the same as what you expected.
I have an orders data set. I'd like to get the email addresses where the count of orders matches specific counts for each year. Let's say 2000 = 1, 2001 = 5 or fewer, 2002 = 3.
select email
from orders
where year in (2000,2001,2002)
That's where I'm stuck. My thought process is pushing me towards a HAVING clause or a CASE statement, but I'm hitting a wall with how to express the per-year count conditions.
In pseudo SQL it'd be:
select email
from orders
where count(year = 2000) = 1
and count(year = 2001) <= 5
and count(year = 2002) = 3
You can't do this in the WHERE clause; you have to group by email and apply your conditions in a HAVING clause (or put your GROUP BY query in a subquery and use a WHERE condition in an outer query).
select email
from orders
where year in (2000,2001,2002)
group by email
having sum(year = 2000) = 1
and sum(year = 2001) <= 5
and sum(year = 2002) = 3
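For completeness, the subquery form mentioned above might look like this (a sketch; the conditional sums per year are the same as in the HAVING version, they just move into a derived table):
select email
from (
    select email,
           sum(year = 2000) as cnt_2000,
           sum(year = 2001) as cnt_2001,
           sum(year = 2002) as cnt_2002
    from orders
    where year in (2000,2001,2002)
    group by email
) t
where cnt_2000 = 1
  and cnt_2001 <= 5
  and cnt_2002 = 3;
This also makes it easy to inspect the intermediate per-year counts while debugging.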
You can do it as below.
Note that you can change the filtered values within the where condition for the count value and the associated year.
-- create a table
CREATE TABLE Orders (
id INTEGER PRIMARY KEY,
email VARCHAR(30) NOT NULL,
year int NOT NULL
);
-- insert some values
INSERT INTO Orders VALUES (1, 'test1@mail.com', 2000);
INSERT INTO Orders VALUES (2, 'test2@mail.com', 2001);
INSERT INTO Orders VALUES (3, 'test2@mail.com', 2001);
INSERT INTO Orders VALUES (4, 'test3@mail.com', 2002);
INSERT INTO Orders VALUES (5, 'test2@mail.com', 2001);
INSERT INTO Orders VALUES (6, 'test3@mail.com', 2002);
INSERT INTO Orders VALUES (7, 'test2@mail.com', 2001);
INSERT INTO Orders VALUES (9, 'test2@mail.com', 2001);
INSERT INTO Orders VALUES (10, 'test3@mail.com', 2002);
INSERT INTO Orders VALUES (11, 'test4@mail.com', 2002);
INSERT INTO Orders VALUES (12, 'test4@mail.com', 2001);
INSERT INTO Orders VALUES (13, 'test4@mail.com', 2002);
--sql statement
select result.email from (
select email, year, count(*) As count from Orders where year in (2000,2001,2002)
group by year, email
)result
where
(result.count = 1 and year = 2000)
;
Output:
email
test1@mail.com
There are 2 tables ost_ticket and ost_ticket_action_history.
create table ost_ticket(
ticket_id int not null PRIMARY KEY,
created timestamp,
staff bool,
status varchar(50),
city_id int
);
create table ost_ticket_action_history(
ticket_id int not null,
action_id int not null PRIMARY KEY,
action_name varchar(50),
started timestamp,
FOREIGN KEY(ticket_id) REFERENCES ost_ticket(ticket_id)
);
In the ost_ticket_action_history table the data is:
INSERT INTO newdb.ost_ticket_action_history (ticket_id, action_id, action_name, started) VALUES (1, 1, 'Consultation', '2022-01-06 18:30:29');
INSERT INTO newdb.ost_ticket_action_history (ticket_id, action_id, action_name, started) VALUES (2, 2, 'Bank Application', '2022-02-06 18:30:45');
INSERT INTO newdb.ost_ticket_action_history (ticket_id, action_id, action_name, started) VALUES (3, 3, 'Consultation', '2022-05-06 18:42:48');
In the ost_ticket table the data is:
INSERT INTO newdb.ost_ticket (ticket_id, created, staff, status, city_id) VALUES (1, '2022-04-04 18:26:41', 1, 'open', 2);
INSERT INTO newdb.ost_ticket (ticket_id, created, staff, status, city_id) VALUES (2, '2022-05-05 18:30:48', 0, 'open', 3);
INSERT INTO newdb.ost_ticket (ticket_id, created, staff, status, city_id) VALUES (3, '2022-04-06 18:42:53', 1, 'open', 4);
My task is to get the conversion from the “Consultation” stage to the “Bank Application” stage, broken down by month (based on the start date of the “Bank Application” stage). Conversion is calculated according to the following formula: (number of applications with the “Bank Application” stage / number of applications with the “Consultation” stage) * 100%.
My query is like this:
select SUM(action_name='Bank Application')/SUM(action_name='Consultation') * 2 as 'Conversion'
from ost_ticket_action_history
JOIN ost_ticket ot on ot.ticket_id = ost_ticket_action_history.ticket_id
where status = 'open' and created > '2020-01-01 00:00:00'
group by action_name, started
having action_name = 'Bank Application';
As a result I get NULL.
Another query I tried:
SELECT
SUM(CASE
WHEN b.ticket_id IS NOT NULL THEN 1
ELSE 0
END) / COUNT(*) conversion,
YEAR(a.started) AS 'year',
MONTH(a.started) AS 'month'
FROM
ost_ticket_action_history a
LEFT JOIN
ost_ticket_action_history b ON a.ticket_id = b.ticket_id
AND b.action_name = 'Bank Application'
WHERE
a.action_name = 'Consultation'
AND a.status = 'open'
AND a.created > '2020-01-01 00:00:00'
GROUP BY YEAR(a.started) , MONTH(a.started)
I apologize if I didn't write very clearly. Please explain what to do.
Like I explained in my comment, you exclude rows with your having clause.
I will show you next how to debug.
First, check what the raw result of the select query is.
As you can see when you drop the aggregation and just SELECT *, you actually get only one row, the 'Bank Application' one, because the HAVING clause excludes all the other rows:
SELECT
*
FROM
ost_ticket_action_history
JOIN
ost_ticket ot ON ot.ticket_id = ost_ticket_action_history.ticket_id
WHERE
status = 'open'
AND created > '2020-01-01 00:00:00'
GROUP BY
action_name, started
HAVING
action_name = 'Bank Application';
Output:

ticket_id  action_id  action_name       started              ticket_id  created              staff  status  city_id
2          2          Bank Application  2022-02-06 18:30:45  2          2022-05-05 18:30:48  0      open    3
Second step: see what the result set is without calculating anything.
As you can see, you divide by 0, which, as you learned in school, is not allowed; that is why your result set is NULL:
SELECT
SUM(action_name = 'Bank Application')
#/
,SUM(action_name = 'Consultation') * 2 AS 'Conversion'
FROM
ost_ticket_action_history
JOIN
ost_ticket ot ON ot.ticket_id = ost_ticket_action_history.ticket_id
WHERE
status = 'open'
AND created > '2020-01-01 00:00:00'
GROUP BY action_name , started
HAVING action_name = 'Bank Application';
SUM(action_name = 'Bank Application') | Conversion
------------------------------------: | ---------:
1 | 0
db<>fiddle here
Third, what you can do is exclude division by 0. Here I didn't remove all the other rows, as this is only for emphasis:
SELECT
SUM(action_name = 'Bank Application')
/
SUM(action_name = 'Consultation') * 2 AS 'Conversion'
FROM
ost_ticket_action_history
JOIN
ost_ticket ot ON ot.ticket_id = ost_ticket_action_history.ticket_id
WHERE
status = 'open'
AND created > '2020-01-01 00:00:00'
GROUP BY action_name , started
HAVING SUM(action_name = 'Consultation') > 0;
| Conversion |
| ---------: |
| 0.0000 |
| 0.0000 |
db<>fiddle here
Final words: if you get a strange result, simply go back, remove everything that doesn't matter, and look at the raw values so that you can check your math.
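Putting the debugging steps together, a per-month conversion query could look like the sketch below. Note the assumptions: both stages are bucketed by the month of their own started date (the question asks for the month of the “Bank Application” stage, so you may need to adjust the grouping), and NULLIF guards against the division by zero discussed above.
SELECT
    YEAR(h.started) AS `year`,
    MONTH(h.started) AS `month`,
    SUM(h.action_name = 'Bank Application')
        / NULLIF(SUM(h.action_name = 'Consultation'), 0) * 100 AS conversion_pct
FROM
    ost_ticket_action_history h
JOIN
    ost_ticket t ON t.ticket_id = h.ticket_id
WHERE
    t.status = 'open'
    AND t.created > '2020-01-01 00:00:00'
GROUP BY YEAR(h.started), MONTH(h.started);
With the three sample rows this gives 0 for 2022-01 and 2022-05 (Consultations only) and NULL for 2022-02 (a Bank Application but no Consultation started that month), which again shows why the zero-denominator case has to be handled.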
I have a database containing tickets. Each ticket has a unique number, but this number is not unique in the table. So for example ticket #1000 can appear multiple times in the table with different other columns (which I have removed here for the example).
create table countries
(
isoalpha varchar(2),
pole varchar(50)
);
insert into countries values ('DE', 'EMEA'),('FR', 'EMEA'),('IT', 'EMEA'),('US','USCAN'),('CA', 'USCAN');
create table tickets
(
id int primary key auto_increment,
number int,
isoalpha varchar(2),
created datetime
);
insert into tickets (number, isoalpha, created) values
(1000, 'DE', '2021-01-01 00:00:00'),
(1001, 'US', '2021-01-01 00:00:00'),
(1002, 'FR', '2021-01-01 00:00:00'),
(1003, 'CA', '2021-01-01 00:00:00'),
(1000, 'DE', '2021-01-01 00:00:00'),
(1000, 'DE', '2021-01-01 00:00:00'),
(1004, 'DE', '2021-01-02 00:00:00'),
(1001, 'US', '2021-01-01 00:00:00'),
(1002, 'FR', '2021-01-01 00:00:00'),
(1005, 'IT', '2021-01-02 00:00:00'),
(1006, 'US', '2021-01-02 00:00:00'),
(1007, 'DE', '2021-01-02 00:00:00');
Here is an example:
http://sqlfiddle.com/#!9/3f4ba4/6
What I need as output is the number of newly created tickets for each day, divided into tickets from USCAN and the rest of the world.
So for this example the resulting data should be:
Date | USCAN | Other
'2021-01-01' | 2 | 2
'2021-01-02' | 1 | 3
At the moment I use these two queries to fetch all new tickets and then count the rows with the same date in my application code:
SELECT MIN(ti.created) AS date
FROM tickets ti
LEFT JOIN countries ct ON (ct.isoalpha = ti.isoalpha)
WHERE ct.pole = 'USCAN'
GROUP BY ti.number
ORDER BY date
SELECT MIN(ti.created) AS date
FROM tickets ti
LEFT JOIN countries ct ON (ct.isoalpha = ti.isoalpha)
WHERE ct.pole <> 'USCAN'
GROUP BY ti.number
ORDER BY date
but that doesn't look like a very clean method. So how can I improve the query to get the needed data with less overhead?
Ideally it should work with MySQL 5.7.
You may logically combine the queries using conditional aggregation:
SELECT
MIN(CASE WHEN ct.pole = 'USCAN' THEN ti.created END) AS date_uscan,
MIN(CASE WHEN ct.pole <> 'USCAN' THEN ti.created END) AS date_other
FROM tickets ti
LEFT JOIN countries ct ON ct.isoalpha = ti.isoalpha
GROUP BY ti.number
ORDER BY MIN(ti.created);
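To turn those per-ticket first dates into the daily counts the question asks for, you could wrap the query above in an outer aggregation. A sketch (the derived-table and column names are illustrative; it needs no window functions, so it works on MySQL 5.7):
SELECT DATE(COALESCE(t.date_uscan, t.date_other)) AS ticket_date,
       SUM(t.date_uscan IS NOT NULL) AS uscan,
       SUM(t.date_other IS NOT NULL) AS other
FROM (
    SELECT ti.number,
           MIN(CASE WHEN ct.pole = 'USCAN' THEN ti.created END) AS date_uscan,
           MIN(CASE WHEN ct.pole <> 'USCAN' THEN ti.created END) AS date_other
    FROM tickets ti
    LEFT JOIN countries ct ON ct.isoalpha = ti.isoalpha
    GROUP BY ti.number
) t
GROUP BY ticket_date
ORDER BY ticket_date;
With the sample data this returns 2 USCAN / 2 other for 2021-01-01 and 1 USCAN / 3 other for 2021-01-02, matching the expected output.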
You can create unique entries for each date/country combination and then use those to count the USCAN and non-USCAN tickets:
SELECT created,
SUM(1) as total,
SUM(CASE WHEN pole = 'USCAN' THEN 1 ELSE 0 END) as uscan,
SUM(CASE WHEN pole != 'USCAN' THEN 1 ELSE 0 END) as nonuscan
FROM (
SELECT created, t.isoalpha, MIN(pole) AS pole
FROM tickets t JOIN countries c ON t.isoalpha = c.isoalpha
GROUP BY created,isoalpha
) AS uniqueTickets
GROUP BY created
Results:
created total uscan nonuscan
2021-01-01T00:00:00Z 4 2 2
2021-01-02T00:00:00Z 3 1 2
http://sqlfiddle.com/#!9/3f4ba4/45/0
Building on the answer from SQL Hacks, I found the right solution:
SELECT created,
SUM(1) as total,
SUM(CASE WHEN pole = 'USCAN' THEN 1 ELSE 0 END) as uscan,
SUM(CASE WHEN pole != 'USCAN' THEN 1 ELSE 0 END) as nonuscan
FROM (
SELECT created, t.isoalpha, MIN(pole) AS pole
FROM tickets t JOIN countries c ON t.isoalpha = c.isoalpha
GROUP BY t.number
) AS uniqueTickets
GROUP BY SUBSTR(created, 1, 10)
I am stuck on a MySQL problem. I am trying to calculate the return series of a portfolio using:
for(i = startdate+1; i <= enddate; i++) {
return[i]=0;
for(n = 0; n < count(instruments); n++) {
return[i] += price[i,n] / price[i-1, n] * weight[n];
}
}
So, the return of portfolio today is calculated as a sum of price_today/price_yesterday*weight over the instruments in the portfolio.
I created a scribble at http://rextester.com/FUC35243.
If it doesn't work, the code is:
DROP TABLE IF EXISTS x_ports;
DROP TABLE IF EXISTS x_weights;
DROP TABLE IF EXISTS x_prices;
CREATE TABLE IF NOT EXISTS x_ports (id INT NOT NULL AUTO_INCREMENT, name VARCHAR(20), PRIMARY KEY (id));
CREATE TABLE IF NOT EXISTS x_weights (id INT NOT NULL AUTO_INCREMENT, port_id INT, inst_id INT, weight DOUBLE, PRIMARY KEY (id));
CREATE TABLE IF NOT EXISTS x_prices (id INT NOT NULL AUTO_INCREMENT, inst_id INT, trade_date DATE, price DOUBLE, PRIMARY KEY (id));
INSERT INTO x_ports (name) VALUES ('PORT A');
INSERT INTO x_ports (name) VALUES ('PORT B');
INSERT INTO x_weights (port_id, inst_id, weight) VALUES (1, 1, 20.0);
INSERT INTO x_weights (port_id, inst_id, weight) VALUES (1, 2, 80.0);
INSERT INTO x_weights (port_id, inst_id, weight) VALUES (2, 1, 100.0);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-01', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-02', 1.13);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-03', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-04', 1.12);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-05', 1.13);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (1, '2018-01-06', 1.14);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-01', 50.23);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-02', 50.45);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-03', 50.30);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-04', 50.29);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-05', 50.40);
INSERT INTO x_prices (inst_id, trade_date, price) VALUES (2, '2018-01-06', 50.66);
# GETTING THE DATES
SET @DtShort='2018-01-01';
SET @DtLong=@DtShort;
SELECT
@DtShort:=@DtLong as date_prev,
@DtLong:=dt.trade_date as date_current
FROM
(SELECT DISTINCT trade_date FROM x_prices ORDER BY trade_date) dt;
# GETTING RETURN FOR SINGLE DAY
SET @DtToday='2018-01-03';
SET @DtYesterday='2018-01-02';
SELECT
x2.trade_date,
x2.portfolio,
sum(x2.val*x2.weight)/sum(x2.weight) as ret
FROM
(SELECT
x1.trade_date,
x1.portfolio,
sum(x1.weight)/2.0 as weight,
sum(x1.val_end)/sum(x1.val_start) as val,
sum(x1.val_start) as val_start,
sum(x1.val_end) as val_end
FROM
(SELECT
@DtToday as trade_date,
prt.name as portfolio,
wts.inst_id as iid,
wts.weight,
if(prc.trade_date=@DtToday,prc.price*wts.weight,0) as val_start,
if(prc.trade_date=@DtYesterday,prc.price*wts.weight,0) as val_end
FROM
x_ports prt,
x_weights wts,
x_prices prc
WHERE
wts.port_id=prt.id and
prc.inst_id=wts.inst_id and
(prc.trade_date=@DtToday or prc.trade_date=@DtYesterday)) x1
GROUP BY x1.portfolio) x2
GROUP BY x2.portfolio;
I hope to be able to produce a result looking like this:
Date Port A Port B
--------------------------------------------
01/01/2018
02/01/2018 1.005289596 1.004379853
03/01/2018 0.995851496 0.997026759
04/01/2018 0.999840954 0.999801193
05/01/2018 1.003535565 1.002187314
06/01/2018 1.005896896 1.00515873
The return for Port A on the 2/1/2018 should be calculated as 1.13/1.12*20/(20+80) + 50.45/50.23*80/(20+80).
The return for Port B on the 2/1/2018 should be calculated as 50.45/50.23*100/100, or possibly 1.13/1.12*0/(0+100) + 50.45/50.23*100/(0+100).
FYI, in the looping function above I only calculate the numerator (the unscaled return), so Port A would be calculated as 1.13/1.12*20 + 50.45/50.23*80, which I see as the crucial step when calculating the return. The return is then found by dividing that by the sum of the weights.
Though it certainly can be done better, I can get the dates and I can calculate the return of a single day, but I just can't put the two together.
Simulating analytics is no fun! Demo
The math on this doesn't seem right to me, as I'm nowhere close to your expected results.
I'd like to be able to reuse CurDay, but as the MySQL version is below 8.0 I couldn't use a common table expression.
What this does:
X1 generates the join of the tables.
X2 gives us a count of the instruments in a portfolio, used later in the math.
r initializes a user variable with which we can assign row numbers @RN and @RN2.
CurDay generates a row number ordered correctly so we can join.
NextDay generates a copy of CurDay so we can join the current day to the next day on RN+1.
Z lets us do the math, group by the current day, and prepare for the pivot on the portfolio name.
The outermost select pivots the data so we have the date plus two columns.
SELECT Z.Trade_Date
, sum(case when name = 'Port A' then P_RETURN end) as PortA
, sum(case when name = 'Port B' then P_RETURN end) as PortB
FROM (
## Raw data
SELECT CurDay.*, NextDay.Price/CurDay.Price*CurDay.Weight/CurDay.Inst_Total_Weight as P_Return
FROM (SELECT x1.*, @RN:=@RN+1 rn, x2.inst_cnt, x2.Inst_Total_Weight
FROM (SELECT prt.name, W.port_ID, W.inst_ID, W.weight, prc.trade_Date, Prc.Price
FROM x_ports Prt
INNER JOIN x_weights W
on W.Port_ID = prt.ID
INNER JOIN x_prices Prc
on Prc.INST_ID = W.INST_ID
ORDER BY W.port_id, W.inst_id,trade_Date) x1
CROSS join (SELECT @RN:=0) r
INNER join (SELECT count(*) inst_Cnt, port_ID, sum(Weight) as Inst_Total_Weight
FROM x_weights
GROUP BY Port_ID) x2
on X1.Port_ID = X2.Port_ID) CurDay
LEFT JOIN (SELECT x1.*, @RN2:=@RN2+1 rn2
FROM (SELECT prt.name, W.port_ID, W.inst_ID, W.weight, prc.trade_Date, Prc.Price
FROM x_ports Prt
INNER JOIN x_weights W
on W.Port_ID = prt.ID
INNER JOIN x_prices Prc
on Prc.INST_ID = W.INST_ID
ORDER BY W.port_id, W.inst_id,trade_Date) x1
CROSS join (SELECT @RN2:=0) r
) NextDay
on NextDay.Port_ID = CurDay.Port_ID
and NextDay.Inst_ID = curday.Inst_ID
and NextDay.RN2 = CurDay.RN+1
GROUP BY CurDay.Port_ID, CurDay.Inst_ID, CurDay.Trade_Date) Z
##END RAW DATA
GROUP BY Trade_Date;
+----+---------------------+-------------------+-------------------+
| | Trade_Date | PortA | PortB |
+----+---------------------+-------------------+-------------------+
| 1 | 01.01.2018 00:00:00 | 1,00528959642786 | 1,00892857142857 |
| 2 | 02.01.2018 00:00:00 | 0,995851495829569 | 0,991150442477876 |
| 3 | 03.01.2018 00:00:00 | 0,999840954274354 | 1 |
| 4 | 04.01.2018 00:00:00 | 1,0035355651507 | 1,00892857142857 |
| 5 | 05.01.2018 00:00:00 | 1,00589689563141 | 1,00884955752212 |
| 6 | 06.01.2018 00:00:00 | NULL | NULL |
+----+---------------------+-------------------+-------------------+
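For reference, on MySQL 8.0+ the simulated row numbers and the CurDay/NextDay self-join can be replaced with window functions. This is only a sketch of an alternative under that assumption, using the same x_ports / x_weights / x_prices schema:
SELECT p.trade_date,
       SUM(CASE WHEN prt.name = 'PORT A'
                THEN p.price / p.prev_price * w.weight / tw.total_weight END) AS PortA,
       SUM(CASE WHEN prt.name = 'PORT B'
                THEN p.price / p.prev_price * w.weight / tw.total_weight END) AS PortB
FROM (
    -- LAG() fetches the previous trading day's price per instrument
    SELECT inst_id, trade_date, price,
           LAG(price) OVER (PARTITION BY inst_id ORDER BY trade_date) AS prev_price
    FROM x_prices
) p
JOIN x_weights w   ON w.inst_id = p.inst_id
JOIN x_ports   prt ON prt.id = w.port_id
JOIN (
    -- total weight per portfolio, used to scale each weighted return
    SELECT port_id, SUM(weight) AS total_weight
    FROM x_weights
    GROUP BY port_id
) tw ON tw.port_id = w.port_id
WHERE p.prev_price IS NOT NULL
GROUP BY p.trade_date
ORDER BY p.trade_date;
For Port A on 2018-01-02 this reproduces 1.13/1.12*20/100 + 50.45/50.23*80/100 ≈ 1.00529, the calculation described in the question.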