MySQL - Time Series Sliding Window

I have a MySQL table containing financial market prices.
+------------+------+--------+--------+--------+--------+
| date | pair | open | high | low | close |
+------------+------+--------+--------+--------+--------+
| 12/9/2009 | 1 | 1.4703 | 1.4783 | 1.4668 | 1.4727 |
| 12/9/2009 | 2 | 1.6287 | 1.6378 | 1.6167 | 1.6262 |
| 12/9/2009 | 3 | 0.9038 | 0.9116 | 0.9015 | 0.9086 |
| 12/9/2009 | 4 | 88.435 | 88.71 | 87.36 | 87.865 |
| 12/9/2009 | 5 | 1.064 | 1.0664 | 1.0515 | 1.0545 |
| 12/10/2009 | 1 | 1.4725 | 1.4761 | 1.4683 | 1.4732 |
| 12/10/2009 | 2 | 1.6261 | 1.6348 | 1.6214 | 1.6279 |
| 12/10/2009 | 3 | 0.9086 | 0.9192 | 0.908 | 0.9166 |
| 12/10/2009 | 4 | 87.87 | 88.47 | 87.73 | 88.2 |
| 12/10/2009 | 5 | 1.0546 | 1.0584 | 1.0479 | 1.0517 |
| 12/11/2009 | 1 | 1.4733 | 1.4778 | 1.4586 | 1.4615 |
| 12/11/2009 | 2 | 1.6278 | 1.634 | 1.6197 | 1.6262 |
| 12/11/2009 | 3 | 0.9164 | 0.9197 | 0.909 | 0.9128 |
| 12/11/2009 | 4 | 88.2 | 89.82 | 88.195 | 89.115 |
| 12/11/2009 | 5 | 1.0517 | 1.0624 | 1.0483 | 1.0602 |
+------------+------+--------+--------+--------+--------+
I want to get something like this, filtered by pair (where pair = 1). Every output row combines two consecutive input rows.
+--------+--------+--------+--------+--------+--------+--------+--------+
| open1 | high1 | low1 | close1 | open2 | high2 | low2 | close2 |
+--------+--------+--------+--------+--------+--------+--------+--------+
| 1.4703 | 1.4783 | 1.4668 | 1.4727 | 1.4725 | 1.4761 | 1.4683 | 1.4732 |
| 1.4725 | 1.4761 | 1.4683 | 1.4732 | 1.4733 | 1.4778 | 1.4586 | 1.4615 |
+--------+--------+--------+--------+--------+--------+--------+--------+
I tried this query from https://stackoverflow.com/a/5084722/1487781 to get two consecutive dates.
select (
select max(t1.date)
from data as t1
where t1.date < t2.date
and t1.pair = 1
) as date1,
t2.date as date2
from data as t2
It worked, but I can't adapt it to my needs: I need the actual values, and I can't just use max() to get them. I also need to know how to generalize the solution, for example if I need three or four consecutive rows.

Try this query:
SELECT d1.date date1,
       d2.date date2,
       d1.pair,
       d1.open open1,
       d1.high high1,
       d1.low low1,
       d1.close close1,
       d2.open open2,
       d2.high high2,
       d2.low low2,
       d2.close close2
FROM table1 d1
JOIN table1 d2
  ON d1.pair = d2.pair
 AND d1.date = d2.date - interval 1 day
Demo: http://www.sqlfiddle.com/#!2/f490d/2
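The query above only matches rows that are exactly one calendar day apart, so it skips rows separated by a gap such as a weekend. Here is a version with a subquery that determines the next date for a given pair number (next date = the lowest date that is greater than the given date):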
SELECT d1.date date1,
       d2.date date2,
       d1.pair,
       d1.open open1,
       d1.high high1,
       d1.low low1,
       d1.close close1,
       d2.open open2,
       d2.high high2,
       d2.low low2,
       d2.close close2
FROM table1 d1
JOIN table1 d2
  ON d1.pair = d2.pair
 AND d2.date = (
       SELECT min(date)
       FROM table1 t
       WHERE t.date > d1.date
         AND t.pair = d1.pair
     )
Demo: http://www.sqlfiddle.com/#!2/f490d/9
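To generalize to three or four consecutive rows, one option on MySQL 8.0 or later is the LEAD() window function instead of one extra self-join per row. This is only a sketch under assumptions: it reuses the table name table1 from the queries above, assumes the date column has type DATE, and the aliases date3, open3 and so on are made up for illustration.

```sql
-- Sketch: windows of three consecutive rows for pair = 1.
-- Assumes MySQL 8.0+ (window functions) and a DATE-typed `date` column.
SELECT *
FROM (
    SELECT `date`                   AS date1,
           LEAD(`date`, 1)  OVER w  AS date2,
           LEAD(`date`, 2)  OVER w  AS date3,
           `open`                   AS open1,
           high                     AS high1,
           low                      AS low1,
           `close`                  AS close1,
           LEAD(`open`, 1)  OVER w  AS open2,
           LEAD(high, 1)    OVER w  AS high2,
           LEAD(low, 1)     OVER w  AS low2,
           LEAD(`close`, 1) OVER w  AS close2,
           LEAD(`open`, 2)  OVER w  AS open3,
           LEAD(high, 2)    OVER w  AS high3,
           LEAD(low, 2)     OVER w  AS low3,
           LEAD(`close`, 2) OVER w  AS close3
    FROM table1
    WHERE pair = 1
    WINDOW w AS (ORDER BY `date`)
) AS triples
WHERE date3 IS NOT NULL   -- drop the trailing, incomplete windows
ORDER BY date1;
```

A fourth row would add LEAD(..., 3) columns and move the IS NOT NULL filter to date4. To process all pairs at once, the WHERE pair = 1 filter can be dropped and the window changed to PARTITION BY pair ORDER BY `date`.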

Related

What is the execution order of a correlated subquery in SQL?

I came across this SQL at work; it was written by a colleague. There are better solutions, but I'm curious about how it behaves. I have simplified it as follows:
-- Calculate the total of the 'APPROVING' salary and the 'AGENT' salary already actual paid
SELECT ifnull(sum(l.salary),0) +
(SELECT ifnull(sum(l1.salary),0)
FROM salary_header h1 JOIN salary_lines l1
ON h1.salary_id = l1.salary_id
WHERE h1.status='APPROVING' AND l1.project_id = l.project_id)
FROM salary_pay_headers h JOIN salary_pay_lines l
ON h.salary_pay_id = l.salary_pay_id
WHERE h.pay_type='AGENT'
AND l.project_id=9904
mysql> select * from salary_header;
+-----------+-----------+
| salary_id | status |
+-----------+-----------+
| 1 | APPROVING |
| 2 | PAID |
+-----------+-----------+
mysql> select * from salary_lines;
+----------------+-----------+------------+--------+
| salary_line_id | salary_id | project_id | salary |
+----------------+-----------+------------+--------+
| 1 | 2 | 9905 | 200.00 |
+----------------+-----------+------------+--------+
mysql> select * from salary_pay_headers;
+---------------+----------+
| salary_pay_id | pay_type |
+---------------+----------+
| 1 | AGENT |
| 2 | OTHER |
+---------------+----------+
mysql> select * from salary_pay_lines;
+--------------------+---------------+------------+--------+
| salary_pay_line_id | salary_pay_id | project_id | salary |
+--------------------+---------------+------------+--------+
| 1 | 1 | 9904 | 3.05 |
| 2 | 1 | 9904 | 201.37 |
| 3 | 1 | 9904 | 6.10 |
| 4 | 1 | 9904 | 10.17 |
| 5 | 1 | 9904 | 6.44 |
| 6 | 1 | 9904 | 9.15 |
| 8 | 3 | 9905 | 100.00 |
+--------------------+---------------+------------+--------+
The result is not 3.05+201.37+6.10+10.17+6.44+9.15=236.28 as I expected, but 236.28+200=436.28. Evidently the row in salary_lines is not filtered out. I spent a whole afternoon on this problem, so I would really like to know the execution order of this SQL.
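This does not explain the execution order, but one way to sidestep the question is to compute the two totals in independent, uncorrelated subqueries and add them. The sketch below assumes the project id can be repeated as a literal in both subqueries; the alias total is made up for illustration.

```sql
-- Sketch: each total is computed on its own, so the result no longer depends
-- on how the outer aggregate and the correlated reference interact.
SELECT
    (SELECT IFNULL(SUM(l.salary), 0)
     FROM salary_pay_headers h
     JOIN salary_pay_lines l ON h.salary_pay_id = l.salary_pay_id
     WHERE h.pay_type = 'AGENT'
       AND l.project_id = 9904)
  + (SELECT IFNULL(SUM(l1.salary), 0)
     FROM salary_header h1
     JOIN salary_lines l1 ON h1.salary_id = l1.salary_id
     WHERE h1.status = 'APPROVING'
       AND l1.project_id = 9904) AS total;
```

With the sample rows above, the first subquery sums the six AGENT lines for project 9904 (236.28) and the second finds no APPROVING lines for that project, so it contributes 0.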

How to group by columns in a different table

I am trying to write a query to return the sum of totalRxCount that is grouped by zipcode.
I have two tables named fact2 and demographic.
My problem is that the demographic table contains duplicate rows, which affect the sum of totalRxCount.
To avoid duplicates, I want to return only results where npiNum is distinct.
Right now I have this working, but it groups by relId (the primary key).
I cannot figure out how to group by zipcode, since that column and totalRxCount are in separate tables.
When I try, I get wrong results because the duplicate rows are counted.
Here is my query. I want to modify it to return results grouped by zipcode instead of relId.
Any input will be greatly appreciated!
SELECT fact2.relID
, SUM(fact2.`totalRxCount`)
FROM fact2
LEFT
JOIN (
SELECT O1.relId, COUNT(DISTINCT O1.npiNum)
FROM demographic As O1
GROUP BY O1.relId
) AS d1
ON d1.`relId` = fact2.relID
LEFT
JOIN (
SELECT O2.relID, Sum(O2.totalRxCount)
FROM fact2 AS O2
GROUP BY O2.relID
) AS p1
ON p1.relID = d1.relId
WHERE (monthEndDate BETWEEN 201911 AND 202010) GROUP BY fact2.relID;
Results:
+-------+---------------------------+
| relID | SUM(fact2.totalRxCount) |
+-------+---------------------------+
| 2465 | 2 |
+-------+---------------------------+
What I've tried
SELECT zipcode, SUM(fact2.`totalRxCount`)
FROM fact2
INNER JOIN demographic ON demographic.relId=fact2.relID
LEFT JOIN (
SELECT O1.`relId`, COUNT(DISTINCT O1.`npiNum`)
FROM demographic As O1
GROUP BY O1.`relId`
) AS d1
ON d1.`relId` = fact2.`relID`
LEFT JOIN (
SELECT O2.`relID`, Sum(O2.`totalRxCount`)
FROM fact2 AS O2
GROUP BY O2.`relID`
) AS p1
ON p1.`relID` = d1.`relId`
WHERE (`monthEndDate` BETWEEN 201911 AND 202010) GROUP BY zipcode;
This returns the sum multiplied by the number of duplicate rows in demographic.
Results:
+---------+---------------------------+
| zipcode | SUM(fact2.`totalRxCount`) |
+---------+---------------------------+
| 66097 | 4 |
+---------+---------------------------+
^ This should be 2
demographic table:
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
| relId | zipcode | providerId | writerType | firstName | middleName | lastName | title | specCode | specDesc | address | city | state | amaNoContact | pdrpInd | pdrpDate | deaNum | amaNum | amaCheckDig | npiNum | terrId | callStatusCode |
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
| 2465 | 66097 | | A | | | JEFFERSON COUNTY MEMORIAL HOSPITAL | | | | 408 DELAWARE ST | WINCHESTER | KS | | | | AJ4281096 | | | | 11604 | |
| 2465 | 66097 | | A | | | JEFFERSON COUNTY MEMORIAL HOSPITAL | | | | 408 DELAWARE ST | WINCHESTER | KS | | | | AJ4281096 | | | | 11604 | |
+-------+---------+------------+------------+-----------+------------+------------------------------------+-------+----------+----------+-----------------+------------+-------+--------------+---------+----------+-----------+--------+-------------+--------+--------+----------------+
fact2
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
| relID | marketId | marketName | productID | productName | dataType | providerId | writerType | planId | pmtTypeInd | monthEndDate | newRxCount | refillRxCount | totalRxCount | newRxQuan | refillRxQuan | totalRxQuan | newRxCost | refillRxCost | totalRxCost |
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
| 2465 | 10871 | GALT PP MONTHLY | 1399451 | ZOLPIDEM TARTRATE | 15 | | A | 900145 | C | 202004 | 1 | 0 | 1 | 30 | 0 | 30 | 139 | 0 | 139 |
| 2465 | 10871 | GALT PP MONTHLY | 1399458 | ESZOPICLONE | 15 | | A | 900145 | C | 202006 | 1 | 0 | 1 | 30 | 0 | 30 | 350 | 0 | 350 |
+-------+----------+-----------------+-----------+-------------------+----------+------------+------------+--------+------------+--------------+------------+---------------+--------------+-----------+--------------+-------------+-----------+--------------+-------------+
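One possible approach is to collapse the duplicate demographic rows before joining, so that each fact2 row is counted once. This is a sketch, assuming every relId maps to a single zipcode; the aliases f and d are illustrative, and if the duplicates should be identified by npiNum instead, that column can be added to the SELECT DISTINCT list.

```sql
-- Sketch: deduplicate demographic down to one (relId, zipcode) row,
-- then join and aggregate by zipcode.
SELECT d.zipcode,
       SUM(f.totalRxCount) AS totalRxCount
FROM fact2 AS f
JOIN (
    SELECT DISTINCT relId, zipcode
    FROM demographic
) AS d
  ON d.relId = f.relID
WHERE f.monthEndDate BETWEEN 201911 AND 202010
GROUP BY d.zipcode;
```

With the two sample tables above, the derived table holds a single row for relId 2465, so the two fact2 rows sum to 2 for zipcode 66097.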

How to pivot in MySQL for previous dates

The table structure and data are available at https://www.db-fiddle.com/f/5rKXiavsoMeQazSHwxaTVK/0
I am trying to get the percentage for each date in a range of previous dates, with columns like this:
circle_name|current_capacity|2020-03-16|2020-03-17|2020-03-18|
where current_capacity is the capacity on the latest date, and each date column holds the percentage for that date.
The final output should look like https://www.db-fiddle.com/f/wTFejVRzQBTXZkfWjBuqq8/0
select circle_name,Subscriber as current_capacity,
Subscriber/Subs_Capacity * 100 as percentage,
DATE_FORMAT(date,'%y-%m-%d') as date
from circle
where date BETWEEN date_add('2020-03-18',interval -3 day)
and '2020-03-18'
Please help me pivot this in MySQL.

Here is one approach that uses conditional aggregation. Instead of assigning actual dates to column names (which would require dynamic SQL), this works by setting a variable that represents the date from which you want the computation to start, and then generates columns named d_2 (for day - 2), d_1 and d_0.
set @mydate = '2020-03-18';
select
    circle_name,
    max(case when date = @mydate then subscriber end) current_capacity,
    100 * sum(case when date = @mydate - interval 2 day then subscriber end)
        / sum(case when date = @mydate - interval 2 day then subs_capacity end) d_2,
    100 * sum(case when date = @mydate - interval 1 day then subscriber end)
        / sum(case when date = @mydate - interval 1 day then subs_capacity end) d_1,
    100 * sum(case when date = @mydate then subscriber end)
        / sum(case when date = @mydate then subs_capacity end) d_0
from circle
group by circle_name;
In your db fiddle, the query yields:
| circle_name | current_capacity | d_2 | d_1 | d_0 |
| ----------- | ---------------- | ------- | ------- | ------- |
| AP | 758415 | 90.8422 | 91.1373 | 91.3753 |
| AS | 764976 | 83.461 | 83.2508 | 82.7001 |
| BH | 807447 | 84.5168 | 86.2083 | 85.8986 |
| CH | 785384 | 87.4384 | 87.2934 | 87.2649 |
| DL | 859161 | 85.9683 | 85.5766 | 85.9161 |
| GJ | 882817 | 85.6419 | 85.7533 | 86.1285 |
| HP | 203300 | 80.9292 | 80.9608 | 81.32 |
| HR | 255511 | 84.9163 | 85.1213 | 85.1703 |
| JK | 592271 | 84.244 | 84.2937 | 84.6101 |
| KK | 729628 | 88.0607 | 88.1499 | 87.907 |
| KL | 793872 | 80.0359 | 79.5544 | 79.3872 |
| KO | 847638 | 84.6341 | 84.6501 | 84.7638 |
| MB | 687501 | 87.8208 | 85.9449 | 85.9376 |
| MH | 886554 | 95.8487 | 95.6997 | 88.6554 |
| MP | 821474 | 83.1335 | 83.3047 | 83.8239 |
| NE | 824807 | 87.5708 | 87.5969 | 86.8218 |
| OR | 710822 | 84.5128 | 85.2319 | 83.6261 |
| PB | 194300 | 88.798 | 96.5235 | 97.15 |
| RJ | 840310 | 82.438 | 83.7309 | 84.031 |
| TN | 725307 | 90.8855 | 90.7334 | 90.6634 |
| UE | 903366 | 89.8577 | 90.0483 | 90.3366 |
| UW | 729154 | 98.9529 | 87.1339 | 87.8499 |
| WB | 0 | 0 | 0 | 0 |
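If the columns really need to be named after the dates, the dynamic SQL mentioned above could be built with a prepared statement. This is a sketch only, assuming the circle table from the fiddle has a DATE-typed date column; the variables @cols and @sql and the alias days are illustrative names.

```sql
-- Sketch: build one percentage column per date, named after that date.
SET @mydate = '2020-03-18';

-- One "100 * SUM(...) / SUM(...) AS `yyyy-mm-dd`" fragment per day in range.
SELECT GROUP_CONCAT(
         CONCAT(
           '100 * SUM(CASE WHEN `date` = ''', d, ''' THEN subscriber END)',
           ' / SUM(CASE WHEN `date` = ''', d, ''' THEN subs_capacity END)',
           ' AS `', d, '`'
         )
         ORDER BY d
         SEPARATOR ', '
       )
INTO @cols
FROM (
    SELECT DISTINCT DATE_FORMAT(`date`, '%Y-%m-%d') AS d
    FROM circle
    WHERE `date` BETWEEN @mydate - INTERVAL 2 DAY AND @mydate
) AS days;

-- Assemble the full pivot query around the generated column list.
SET @sql = CONCAT(
  'SELECT circle_name, ',
  'MAX(CASE WHEN `date` = ''', @mydate, ''' THEN subscriber END) AS current_capacity, ',
  @cols,
  ' FROM circle GROUP BY circle_name'
);

PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
```

For longer date ranges the generated column list may exceed the default group_concat_max_len, which would then need to be raised for the session.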

SQL update after choosing a MAX from some AVERAGEs

I have 2 tables with the same columns but different data. I need to compute the average of a column in one table (with some filters) and choose the MAX of those averages, then put that value in the 2nd table.
So far I've built this query:
UPDATE st16
INNER JOIN st17 ON st17.parent = st16.uid
SET
st16.p1 = SELECT MAX(
(SELECT AVG(st17.p1) FROM st17 WHERE st17.parent = st16.uid AND st17.row = st16.row)),
st16.p2 = SELECT MAX(
(SELECT AVG(st17.p2) FROM st17 WHERE st17.parent = st16.uid AND st17.row = st16.row))
but I get this error: "#1111 - Invalid use of group function".
Any ideas? Thanks!
Sample data ( first is st17, and below is st16 ):
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----+
| uid | parent | fen | p1 | p2 | row |
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----+
| ee95b564f2b3fa1573b451d8f4e00f5d | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D8D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/6D5D3D2DKSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD | -10.481481481481481 | 10.481481481481481 | 1 |
| 691ed545dd5375cb3e75f0b8d032534b | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D6D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/5D3D2DKSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D | -10.481481481481481 | 10.481481481481481 | 1 |
| b6e2a3f4ea51c8e6638a2cc657bf3511 | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D5D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/3D2DKSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D | -10.481481481481481 | 10.481481481481481 | 1 |
| 0dbe5038d01e457e4f65415ac081d0dd | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D3D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/2DKSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D | -10.481481481481481 | 10.481481481481481 | 1 |
| ca1e85058ed8294d60a9922d36f8c1fa | bc5ef0d66b3bde08b0ba35a91412c058 | QS7D2D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/KSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D | -10.481481481481481 | 10.481481481481481 | 1 |
| e85179f395ba8e441ff7b1544e05404c | c75eb9315dee4e3b42fb52e8cd509910 | QS7DJS/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/TS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2DAS | -9.703703703703704 | 9.703703703703704 | 1 |
| eb3c352febe8ff25f375032bbb6cc5d7 | c75eb9315dee4e3b42fb52e8cd509910 | QS7DTS/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2DASJS | -9.703703703703704 | 9.703703703703704 | 1 |
| 69f06801edf9b3cf669df56dc9152271 | c75eb9315dee4e3b42fb52e8cd509910 | QS7D8S/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2DASJSTS | -9.703703703703704 | 9.703703703703704 | 1 |
| 5f78082dd3aee8b51bf096286df5e4e7 | c75eb9315dee4e3b42fb52e8cd509910 | QS7D5H/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2DASJSTS8S7S6S5S3S2SAH7H | -9.703703703703704 | 9.703703703703704 | 1 |
| 7ee50e8aa1afd3af703b3a5b3cdf3cf8 | c75eb9315dee4e3b42fb52e8cd509910 | QS7D3H/4H9HQH4D4S/6H8HTHJHKH/4CKS/9S9D7CJC9C/6C8CQCKCAC/2HTC5C3C2CADKDQDJDTD8D6D5D3D2DASJSTS8S7S6S5S3S2SAH7H5H | -9.703703703703704 | 9.703703703703704 | 1 |
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+---------------------+---------------------+-----+
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+----+----+-----+
| uid | parent | fen | p1 | p2 | row |
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+----+----+-----+
| bc5ef0d66b3bde08b0ba35a91412c058 | 9e123e356e468b847d4493cf55809fcd | QS7D/4H9HQH4D4S/6H8HTHJHKH/4CAS/9S9D7CJC9C/6C8CQCKCAC/KSJSTS8S7S6S5S3S2SAH7H5H3H2HTC5C3C2CADKDQDJDTD8D6D5D3D2D | 0 | 0 | 1 |
+----------------------------------+----------------------------------+----------------------------------------------------------------------------------------------------------------+----+----+-----+

As Gordon Linoff mentioned, you can't pass a subquery to an aggregate function such as MAX(). Another problem that you will likely run into: you cannot select from the table you are updating in MySQL. So something like this
UPDATE st16
SET
st16.p1 = (SELECT AVG(st17.p1) FROM st17 JOIN st16 ON st17.parent = st16.uid WHERE st17.row = st16.row ORDER BY AVG(st17.p1) DESC LIMIT 1),
st16.p2 = (SELECT AVG(st17.p2) FROM st17 JOIN st16 ON st17.parent = st16.uid WHERE st17.row = st16.row ORDER BY AVG(st17.p2) DESC LIMIT 1);
will not work, unfortunately. You might just want to break this into multiple queries; that is, retrieve the maximum averages first in a SELECT, then ship those results in a second, separate UPDATE.
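A single-statement alternative, different from the two-query approach described above, is to join st16 to a derived table that aggregates only st17, so the table being updated is never selected from. The sketch below rests on an assumption about the intent: each st16 row should receive the averages of the st17 rows that share its uid (as parent) and row. It applies no MAX(), because under that reading there is exactly one average per st16 row; the alias a and the names avg_p1 and avg_p2 are illustrative.

```sql
-- Sketch: multi-table UPDATE joined to per-(parent, row) averages from st17.
-- `row` is backquoted because ROW is a reserved word in MySQL 8.0.
UPDATE st16
JOIN (
    SELECT parent,
           `row`,
           AVG(p1) AS avg_p1,
           AVG(p2) AS avg_p2
    FROM st17
    GROUP BY parent, `row`
) AS a
  ON a.parent = st16.uid
 AND a.`row` = st16.`row`
SET st16.p1 = a.avg_p1,
    st16.p2 = a.avg_p2;
```

If several averages per st16 row have to be compared, the derived table would need a further grouping level with MAX() over those averages.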

How do I properly format this MySQL JOIN Statement?

I've got a table that looks like:
Table 1 ->
+----+--------+--------+
| id | name | author |
+----+--------+--------+
| 1 | First | Me |
| 2 | Second | You |
+----+--------+--------+
Table 2 ->
+-----+------------+-----------+------------+
| mid | table1_id | key | value |
+-----+------------+-----------+------------+
| 1 | 1 | desc | hello |
| 2 | 1 | begin_day | monday |
| 3 | 1 | end_day | tuesday |
| 4 | 2 | desc | goodbye |
| 5 | 2 | begin_day | wednesday |
| 6 | 2 | end_day | friday |
+-----+------------+-----------+------------+
The relationship here is that the id in table 1 corresponds to the table1_id in table 2.
The output that I'm trying to get is...
+----+---------+---------+-------------+-----------+-----------+
| id | name | author | desc | begin_day | end_day |
+----+---------+---------+-------------+-----------+-----------+
| 1 | First | Me | hello | monday | tuesday |
| 2 | Second | You | goodbye | wednesday | friday |
+----+---------+---------+-------------+-----------+-----------+
I've tried several different join statements -- all a variation of the below. I'm not that well versed in MySQL queries, however.
SELECT * FROM table_1 LEFT JOIN table_2 on table_1.id = table_2.table1_id
Which produces...
+----+----------+----------+----------+------------+-----------+
| id | mid | name | author | key | value |
+----+----------+----------+----------+------------+-----------+
| 1 | 1 | First | Me | desc | hello |
| 1 | 2 | First | Me | begin_day | monday |
| 1 | 3 | First | Me | end_day | tuesday |
| 2 | 4 | Second | You | desc | goodbye |
| 2 | 5 | Second | You | begin_day | wednesday |
| 2 | 6 | Second | You | end_day | friday |
+----+----------+----------+----------+------------+-----------+
Obviously, iterating over this join statement produces 6 results, 1 for each row in table 2 that matches the id in table 1. How can I avoid this with a proper query statement?
Thank you in advance.

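You can use CASE expressions if you know all of the keys you will be getting. Each CASE has to be wrapped in an aggregate and the result grouped by id; otherwise you still get one row per table_2 row, mostly filled with NULLs. Note that desc is a reserved word, so the alias needs backticks:
SELECT table_1.*,
       MAX(CASE WHEN table_2.`key` = 'desc' THEN table_2.value END) AS `desc`,
       MAX(CASE WHEN table_2.`key` = 'begin_day' THEN table_2.value END) AS begin_day,
       MAX(CASE WHEN table_2.`key` = 'end_day' THEN table_2.value END) AS end_day
FROM table_1 LEFT JOIN table_2 ON table_1.id = table_2.table1_id
GROUP BY table_1.id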
Hope this helps!

SELECT
    table_1.*,
    MAX(IF(`key` = 'desc', value, NULL)) AS `desc`,
    MAX(IF(`key` = 'begin_day', value, NULL)) AS begin_day,
    MAX(IF(`key` = 'end_day', value, NULL)) AS end_day
FROM table_1
LEFT JOIN table_2 ON (id = table1_id)
GROUP BY id;
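Another way to get one row per table_1 entry, without aggregation, is to join table_2 once per key. This is a sketch, assuming each (table1_id, key) combination appears at most once; the aliases t1, d, b and e are illustrative.

```sql
-- Sketch: one LEFT JOIN per key, each restricted to a single key value,
-- so every join contributes at most one row per table_1 row.
SELECT t1.id,
       t1.name,
       t1.author,
       d.`value` AS `desc`,
       b.`value` AS begin_day,
       e.`value` AS end_day
FROM table_1 AS t1
LEFT JOIN table_2 AS d ON d.table1_id = t1.id AND d.`key` = 'desc'
LEFT JOIN table_2 AS b ON b.table1_id = t1.id AND b.`key` = 'begin_day'
LEFT JOIN table_2 AS e ON e.table1_id = t1.id AND e.`key` = 'end_day'
ORDER BY t1.id;
```

The key filter sits in each ON clause rather than in WHERE so that a table_1 row with a missing key still appears, with NULL in that column.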
