Custom function inside order by - mysql

I'm trying to make a pretty complex query. I have a database with blocks.
Each block has a start date, an end date and the module to which it belongs.
I have to calculate the turnover, which would be the difference between consecutive blocks (for the block[i]):
block[i].start - block[i - 1].end
Take the following example, with this data:
create table blocks (start datetime, end datetime, module integer);
insert into blocks (start, end, module)
values
('2016-04-13 09:00:00', '2016-04-13 10:00:00', 1), -- diff: null or 0
('2016-04-13 11:00:00', '2016-04-13 12:00:00', 1), -- diff: 1hour
('2016-04-13 12:30:00', '2016-04-13 14:00:00', 1), -- diff: 30minutes
-- turnoverAvg: 45min = (1h + 30min) / 2
('2016-04-13 09:00:00', '2016-04-13 10:00:00', 2), -- diff: null or 0
('2016-04-13 12:00:00', '2016-04-13 12:30:00', 2), -- diff: 2hour
('2016-04-13 13:30:00', '2016-04-13 14:30:00', 2), -- diff: 1hour
-- turnoverAvg: 90min = (2h + 1h) / 2
('2016-04-14 14:30:00', '2016-04-14 16:00:00', 2), -- diff: null or 0
('2016-04-14 17:00:00', '2016-04-14 18:00:00', 2), -- diff: 1hour
-- turnoverAvg: 60min = 1h/1
('2016-04-13 09:00:00', '2016-04-13 10:00:00', 3), -- diff: null or 0
('2016-04-13 10:00:00', '2016-04-13 11:00:00', 3), -- diff: 0
('2016-04-13 12:00:00', '2016-04-13 13:00:00', 3), -- diff: 1hour
('2016-04-13 14:00:00', '2016-04-13 15:00:00', 3), -- diff: 1hour
('2016-04-13 16:00:00', '2016-04-13 17:00:00', 3), -- diff: 1hour
-- turnoverAvg: 45min = (0 + 1h + 1h + 1h) / 4
('2016-04-13 09:00:00', '2016-04-13 10:00:00', 4), -- diff: null or 0
-- turnoverAvg: null
('2016-04-13 09:00:00', '2016-04-13 15:00:00', 5), -- diff: null or 0
('2016-04-13 19:00:00', '2016-04-13 20:00:00', 5); -- diff: 4hour
-- turnoverAvg: 240min = 4h/1
I should make the following query (pseudo-code):
SELECT turnoverAVG (rows of each group by)
FROM blocks
GROUP BY DATE (start), module
Where turnoverAvg would be a function like this (pseudo-code):
function turnoverAVG(rows):
acc = 0.0
for(i=1; i < rows.length; i++)
d = rows[i].start - rows[i - 1].end
acc += d
return acc/(rows.length - 1)
Actually I have tried many things, but I do not know where to start ... If someone has an idea, I would greatly appreciate it.
EDIT:
The output would be similar to:
turnoverAVG, module, day
45min, 1, 2016-04-13
1:30hour, 2, 2016-04-13
1hour, 2, 2016-04-14 -- different day but same module
45min, 3, 2016-04-13
4hour, 5, 2016-04-13
The turnoverAVG would be fine in minutes; I've only written it this way to make it easier to understand. As you can see, the first block of each group is never counted because there is no previous block to subtract from it.

Functions like this are called window functions. They are only available starting with MySQL 8.
Until then, you will have to find an alternative way to write your query, see e.g. this question. Most times, you will do it by using variables, although the SQL way is to use joins.
But in your specific case, you do not actually need these: the total turnover time is not only the sum of the gaps between consecutive modules (for which you need to know the previous row), but also the time between the first start and the last end of your day (for which you only need min and max) minus the time the modules were running (for which you do not need the previous row).
So try this:
select
module,
date(start),
case when count(module) > 1
then (TIMESTAMPDIFF(Minute,min(start),max(end)) -
sum(TIMESTAMPDIFF(Minute,start, end)))
/ (count(module) - 1)
else null
end as turnoverAVG,
-- details, just for information:
TIMESTAMPDIFF(Minute,min(start),max(end)) as total_day,
sum(TIMESTAMPDIFF(Minute,start, end)) as module_duration,
TIMESTAMPDIFF(Minute,min(start),max(end)) -
sum(TIMESTAMPDIFF(Minute,start, end)) as turnover,
count(module) as cnt
from blocks
group by date(start), module;
The 4 additional columns are just there so you can see the different terms used in the calculation, and you can remove them.
All modules are required to start and end on the same date (although you can simply modify it to support overnight modules). It also does not correct the times if the modules overlap (but neither does your pseudocode).
It's not entirely clear if you want to include days with only one module (as suggested in the comment for module 4) or not (as suggested in your sample output). If you want to exclude these, you can add e.g. having count(module) > 1 at the end of the query.
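The identity behind the query (sum of gaps = span of the day minus time spent inside the blocks) can be checked outside the database. The following is only a sketch in Python; the helper names `turnover_avg_loop`, `turnover_avg_aggregate`, `minutes` and `parse` are made up for illustration, and the data is copied from modules 1 and 3 of the question.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def minutes(a, b):
    """Whole minutes from datetime a to datetime b."""
    return int((b - a).total_seconds() // 60)

def turnover_avg_loop(blocks):
    """The question's pseudocode: average gap between consecutive blocks."""
    blocks = sorted(blocks)
    if len(blocks) < 2:
        return None
    gaps = [minutes(blocks[i - 1][1], blocks[i][0]) for i in range(1, len(blocks))]
    return sum(gaps) / len(gaps)

def turnover_avg_aggregate(blocks):
    """The answer's shortcut: (span of day - time inside blocks) / (count - 1)."""
    if len(blocks) < 2:
        return None
    span = minutes(min(s for s, _ in blocks), max(e for _, e in blocks))
    busy = sum(minutes(s, e) for s, e in blocks)
    return (span - busy) / (len(blocks) - 1)

def parse(rows):
    """Turn (start, end) strings into (datetime, datetime) tuples."""
    return [(datetime.strptime(s, FMT), datetime.strptime(e, FMT)) for s, e in rows]

module_1 = parse([
    ("2016-04-13 09:00:00", "2016-04-13 10:00:00"),
    ("2016-04-13 11:00:00", "2016-04-13 12:00:00"),
    ("2016-04-13 12:30:00", "2016-04-13 14:00:00"),
])
module_3 = parse([
    ("2016-04-13 09:00:00", "2016-04-13 10:00:00"),
    ("2016-04-13 10:00:00", "2016-04-13 11:00:00"),
    ("2016-04-13 12:00:00", "2016-04-13 13:00:00"),
    ("2016-04-13 14:00:00", "2016-04-13 15:00:00"),
    ("2016-04-13 16:00:00", "2016-04-13 17:00:00"),
])

for name, blocks in (("module 1", module_1), ("module 3", module_3)):
    print(name, turnover_avg_loop(blocks), turnover_avg_aggregate(blocks))
```

Both functions give 45 minutes for each of the two modules, matching the comments in the sample data. As in the query, the shortcut only agrees with the loop when the blocks of a group do not overlap.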

Related

How to fill the missing date in a sql table

I have a sql table related to discontinuous dates:
CREATE TABLE IF NOT EXISTS date_test1 ( items CHAR ( 8 ), trade_date date );
INSERT INTO `date_test1` VALUES ( 'a', '2020-03-20');
INSERT INTO `date_test1` VALUES ( 'b', '2020-03-20');
INSERT INTO `date_test1` VALUES ('a', '2020-03-21');
INSERT INTO `date_test1` VALUES ( 'c', '2020-03-22');
INSERT INTO `date_test1` VALUES ( 'd', '2020-03-22');
INSERT INTO `date_test1` VALUES ('a', '2020-03-25');
INSERT INTO `date_test1` VALUES ( 'e', '2020-03-26');
In this table, '2020-03-23' and '2020-03-24' are missing. I want to fill them with the data of the previous available date, which in this table is the '2020-03-22' data.
Expected result:
Both the number of consecutive missing dates and the number of records per day are unknown in advance.
So how can I do this in MySQL?
This solution uses Python and assumes that there aren't so many rows that they cannot be read into memory. I do not warrant this code free from defects; use at your own risk. So I suggest you run this against a copy of your table or make a backup first.
This code uses the pymysql driver.
import pymysql
from datetime import date, timedelta
from itertools import groupby
import sys
conn = pymysql.connect(db='x', user='x', password='x', charset='utf8mb4', use_unicode=True)
cursor = conn.cursor()
# must be sorted by date:
cursor.execute('select items, trade_date from date_test1 order by trade_date, items')
rows = cursor.fetchall() # tuples: (str, datetime.date)
if len(rows) == 0:
    sys.exit(0)
# collect the rows of each date into one list per date
groups = []
for k, g in groupby(rows, key=lambda row: row[1]):
    groups.append(list(g))
one_day = timedelta(days=1)
previous_group = groups.pop(0)
next_date = previous_group[0][1]
for group in groups:
    next_date = next_date + one_day
    while group[0][1] != next_date:
        # missing date: copy the rows of the closest earlier date
        for row in previous_group:
            cursor.execute('insert into date_test1(items, trade_date) values(%s, %s)', (row[0], next_date))
            print('inserting', row[0], next_date)
        conn.commit()
        next_date = next_date + one_day
    previous_group = group
Prints:
inserting c 2020-03-23
inserting d 2020-03-23
inserting c 2020-03-24
inserting d 2020-03-24
Discussion
With your sample data, after the rows are fetched, rows is:
(('a', datetime.date(2020, 3, 20)), ('b', datetime.date(2020, 3, 20)), ('a', datetime.date(2020, 3, 21)), ('c', datetime.date(2020, 3, 22)), ('d', datetime.date(2020, 3, 22)), ('a', datetime.date(2020, 3, 25)), ('e', datetime.date(2020, 3, 26)))
After the following is run:
groups = []
for k, g in groupby(rows, key=lambda row: row[1]):
    groups.append(list(g))
groups is:
[[('a', datetime.date(2020, 3, 20)), ('b', datetime.date(2020, 3, 20))], [('a', datetime.date(2020, 3, 21))], [('c', datetime.date(2020, 3, 22)), ('d', datetime.date(2020, 3, 22))], [('a', datetime.date(2020, 3, 25))], [('e', datetime.date(2020, 3, 26))]]
That is, all the tuples with the same date are grouped together in a list so it becomes easier to detect missing dates.
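The grouping and gap detection can also be tried without a database connection. The sketch below is an illustration only: `missing_rows` is a hypothetical helper that returns the rows that would have to be inserted instead of executing any SQL, and it assumes the input is already sorted by date.

```python
from datetime import date, timedelta
from itertools import groupby

def missing_rows(rows):
    """Return the (item, date) rows needed to fill the date gaps.

    rows must be sorted by date. Every missing date receives a copy of
    the items of the closest earlier date that has data.
    """
    one_day = timedelta(days=1)
    fills = []
    previous_group = None
    for day, group in groupby(rows, key=lambda row: row[1]):
        group = list(group)
        if previous_group is not None:
            expected = previous_group[0][1] + one_day
            # Walk over every date skipped between the two groups.
            while expected < day:
                fills.extend((item, expected) for item, _ in previous_group)
                expected += one_day
        previous_group = group
    return fills

rows = [
    ("a", date(2020, 3, 20)), ("b", date(2020, 3, 20)),
    ("a", date(2020, 3, 21)),
    ("c", date(2020, 3, 22)), ("d", date(2020, 3, 22)),
    ("a", date(2020, 3, 25)),
    ("e", date(2020, 3, 26)),
]

for item, day in missing_rows(rows):
    print(item, day)
```

With the sample data this yields c and d for 2020-03-23 and again for 2020-03-24, the same four rows the script above inserts.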

SSRS: How to return last day of a Custom/Financial Month (Not Calendar) through expression

This is a real hair puller so any help, much appreciated!
I want to be able to determine the:
First day of the current custom/financial month
Last day of the current custom/financial month
And use these new columns Start_Date and end_Date as between Filters in the Matrix.
Note: I understand that if this was calendar Month, then that will be "quite" straightforward.
But in this case its quite different.
Please see image which might help with the context i am trying to work with:
I'm sure this is possible using expressions in SSRS but I don't have time to investigate. In case it's useful, here's how I would do it in SQL.
Again there is probably a more elegant solution but this was what came to me.
I've reproduced your data, plus a few more dates at either end for testing, which I guessed based on your sample.
DECLARE @t TABLE(Custom_Date date, Custom_Day int)
INSERT INTO @t VALUES
('2017-10-26', 28),
('2017-10-27', 29),
('2017-10-28', 30),
('2017-10-29', 1),
('2017-10-30', 2),
('2017-10-31', 3),
('2017-11-01', 4),
('2017-11-02', 5),
('2017-11-03', 6),
('2017-11-04', 7),
('2017-11-05', 8),
('2017-11-06', 9),
('2017-11-07', 10),
('2017-11-08', 11),
('2017-11-09', 12),
('2017-11-10', 13),
('2017-11-11', 14),
('2017-11-12', 15),
('2017-11-13', 16),
('2017-11-14', 17),
('2017-11-15', 18),
('2017-11-16', 19),
('2017-11-17', 20),
('2017-11-18', 21),
('2017-11-19', 22),
('2017-11-20', 23),
('2017-11-21', 24),
('2017-11-22', 25),
('2017-11-23', 26),
('2017-11-24', 27),
('2017-11-25', 28),
('2017-11-26', 1),
('2017-11-27', 2),
('2017-11-28', 3)
Then two queries to pull out the correct dates which you could combine if required.
SELECT MAX(Custom_Date) FROM @t WHERE Custom_Date < getdate() AND custom_day = 1
SELECT MAX(Custom_Date)
FROM @t
WHERE
Custom_Date > getdate()
AND DATEDIFF(d, getdate(), Custom_Date)<=31
AND Custom_Day = (
SELECT MAX(custom_day)
FROM @t
WHERE
Custom_Date > getdate()
AND datediff(d, getdate(), Custom_Date)<=31
)
FYI: This would be a lot easier if you had a custom month/period and year in your dates table, as then you could just look up custom_day 1 and max(custom_day) where the custom month and year are the same as those of the current date.
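For comparison, here is a rough Python sketch of the same lookup. It is a variation rather than a translation of the queries above: it treats every row with custom day 1 as the start of a custom month and takes the day before the next such row as the end. The function name `custom_month_bounds` and the dictionary layout are assumptions made for this illustration.

```python
from datetime import date, timedelta

def custom_month_bounds(calendar, today):
    """Return (start, end) of the custom month that contains today.

    calendar maps each date to its custom day number; a new custom
    month begins wherever the day number is 1. Raises ValueError if no
    month start on or before today is known.
    """
    starts = sorted(d for d, day in calendar.items() if day == 1)
    start = max(d for d in starts if d <= today)
    later = [d for d in starts if d > today]
    # Without a later month start, fall back to the last known date,
    # which may be earlier than the real month end.
    end = later[0] - timedelta(days=1) if later else max(calendar)
    return start, end

# Rebuild the sample table: 2017-10-26 .. 2017-11-28.
calendar = {}
day = date(2017, 10, 26)
for number in [28, 29, 30] + list(range(1, 29)) + [1, 2, 3]:
    calendar[day] = number
    day += timedelta(days=1)

print(custom_month_bounds(calendar, date(2017, 11, 10)))
```

For 2017-11-10 this gives 2017-10-29 as the start and 2017-11-25 as the end of the custom month.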

Select/Query calculate time between timestamps without weekends

Problem presented is to calculate for each row returned the time ("ResponseTime") between 2 timestamps ("StartDateTime" and "EndDateTime") excluding the weekends. Does not take into consideration Work hours or Holidays.
Weekends in this case are defined as Saturday 00:00:00 to Sunday 23:59:59.
Had a tough time coming up with a solution for this question so thought I would share my final product. Found lots of solutions online but most either used a calendar table, which I couldn't use in this application, or had a logic I didn't understand. Solution shared below. Please feel free to offer your own solution based on the problem or to correct any errors you see in my code. Regards,
EDIT: as per the comments provided by @JuanCarlosOropeza, the solution I presented is not optimal. I am providing sample data so that he can put forward a different solution. If anyone has improvements as well, feel free to participate.
CREATE TABLE SourceTable
(`id` int, `StartDateTime` datetime, `EndDateTime` datetime)
;
INSERT INTO SourceTable
(`id`, `StartDateTime`, `EndDateTime`)
VALUES
(1, '2016-09-20 12:52:00', '2016-09-23 13:15:00'),
(2, '2016-09-19 19:15:00', '2016-09-22 19:15:00'),
(3, '2016-09-01 10:35:00', '2016-09-06 13:15:00'),
(4, '2016-09-26 10:34:00', '2016-09-29 11:25:00'),
(5, '2016-09-01 13:01:00', '2016-09-06 14:55:00'),
(6, '2016-09-05 02:21:00', '2016-09-08 19:15:00'),
(7, '2016-09-27 14:14:00', '2016-10-01 19:15:00'),
(8, '2016-09-27 04:18:00', '2016-09-30 14:15:00'),
(9, '2016-09-01 14:50:00', '2016-09-06 17:25:00'),
(10, '2016-09-20 12:52:00', '2016-09-23 13:15:00'),
(11, '2016-09-26 02:14:00', '2016-09-29 10:15:00'),
(12, '2016-09-01 12:04:00', '2016-09-06 17:05:00'),
(13, '2016-09-20 15:30:00', '2016-09-23 15:15:00'),
(14, '2016-09-02 16:04:00', '2016-09-07 20:55:00'),
(15, '2016-09-23 10:41:00', '2016-09-28 13:05:00'),
(16, '2016-09-27 16:28:00', '2016-10-01 13:15:00'),
(17, '2016-09-27 15:33:00', '2016-10-01 22:45:00'),
(18, '2016-09-20 12:53:00', '2016-09-23 13:25:00'),
(19, '2016-09-19 13:49:00', '2016-09-22 13:05:00'),
(20, '2016-09-20 13:46:00', '2016-09-23 13:15:00'),
(21, '2016-09-01 16:32:00', '2016-09-06 18:05:00'),
(22, '2016-09-01 10:35:00', '2016-09-06 22:45:00'),
(23, '2016-09-26 12:40:00', '2016-09-29 12:35:00'),
(24, '2016-09-27 10:37:00', '2016-09-30 21:25:00'),
(25, '2016-09-27 09:41:00', '2016-09-30 15:15:00'),
(26, '2016-09-16 02:09:00', '2016-09-21 10:05:00'),
(27, '2016-09-20 15:13:00', '2016-09-23 15:15:00'),
(28, '2016-09-20 15:30:00', '2016-09-23 15:15:00'),
(29, '2016-09-27 09:55:00', '2016-09-30 13:25:00'),
(30, '2016-09-27 04:18:00', '2016-09-30 14:15:00')
;
I created this solution considering the following logic assumptions.
StartDateTime always occurs before EndDateTime (though had some that didn't and it calculated the time difference correctly)
Week StartDateTime occurred: WEEK(StartDateTime,1)
Week EndDateTime occurred: WEEK(EndDateTime,1)
Start of weekend of week StartDateTime: ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime))
Start of workweek after first weekend: ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),7-WEEKDAY(StartDateTime))
Full Query:
SELECT
id,
StartDateTime,
EndDateTime,
CASE
WHEN ( WEEK(EndDateTime,1) = WEEK(StartDateTime,1) )
THEN
CASE
WHEN ( StartDateTime >= ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)) )
THEN SEC_TO_TIME(0)
ELSE
CASE
WHEN ( EndDateTime >= ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)) )
THEN ( TIMEDIFF(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),StartDateTime) )
ELSE ( TIMEDIFF(EndDateTime,StartDateTime) )
END
END
ELSE
CASE
WHEN ( StartDateTime >= ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)) )
THEN
CASE
WHEN ( EndDateTime >= ADDDATE(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),(WEEK(EndDateTime,1) - WEEK(StartDateTime,1)) * 7) )
THEN ( SEC_TO_TIME(120*3600*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1))) )
ELSE ( SEC_TO_TIME(120*3600*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1) - 1) + TIME_TO_SEC(TIMEDIFF(EndDateTime, ADDDATE(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),7-WEEKDAY(StartDateTime)),7*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1) - 1))))) )
END
ELSE
CASE
WHEN ( EndDateTime >= ADDDATE(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),(WEEK(EndDateTime,1) - WEEK(StartDateTime,1)) * 7) )
THEN ( SEC_TO_TIME(120*3600*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1)) + TIME_TO_SEC(TIMEDIFF(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),StartDateTime))) )
ELSE ( SEC_TO_TIME(120*3600*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1) - 1) + TIME_TO_SEC(TIMEDIFF(EndDateTime, ADDDATE(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),7-WEEKDAY(StartDateTime)),7*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1) - 1)))) + TIME_TO_SEC(TIMEDIFF(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),StartDateTime))) )
END
END
END as ResponseTime
FROM
SourceTable;
First CASE checks if both timestamps happened on the same week. Second layer checks if StartDateTime happened during the first weekend. Third layer checks if EndDateTime happened during a weekend. Based on these considerations outputs the correct calculation.
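To cross-check individual rows, the same definition (everything from Saturday 00:00:00 to Sunday 23:59:59 is skipped, no work hours, no holidays) can be written as a day-by-day loop. This is only a sketch in Python, not a port of the query; `weekday_time` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

def weekday_time(start, end):
    """Time between start and end, skipping Saturdays and Sundays."""
    total = timedelta(0)
    cursor = start
    while cursor < end:
        # Midnight that closes the calendar day the cursor is in.
        next_midnight = datetime.combine(cursor.date(), datetime.min.time()) + timedelta(days=1)
        chunk_end = min(next_midnight, end)
        if cursor.weekday() < 5:  # Monday=0 ... Friday=4
            total += chunk_end - cursor
        cursor = chunk_end
    return total

# Row 3 of the sample data: Thursday morning to the following Tuesday.
row_3 = weekday_time(datetime(2016, 9, 1, 10, 35), datetime(2016, 9, 6, 13, 15))
# Row 7 of the sample data: Tuesday afternoon to Saturday evening.
row_7 = weekday_time(datetime(2016, 9, 27, 14, 14), datetime(2016, 10, 1, 19, 15))
print(row_3, row_7)
```

Row 3 comes out at 74 hours 40 minutes and row 7 at 81 hours 46 minutes. The loop does one iteration per calendar day, so it is meant for checking results rather than for large result sets.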

Transfer mySQL data structure into new relation database (comma separated to relation table)

I have several MySQL tables in which the IDs of the related child-table rows are stored as comma-separated values. Now I have to transfer these entries into a new table that holds one entry per relation.
Is there an easy way to transform the import query into the correct format?
Here is a data example. My old table (cat_projects) has the following entries that I want to convert:
-- export of table cat_projects
INSERT INTO `cat_projects` (`id`, `authors`) VALUES
(2, '4,1'),
(3, '0'),
(4, '8,4,1'),
(5, '13,12'),
(10, '19,4,1'),
(13, ''),
(14, ''),
(15, '28,27,25,12,9,1');
I just want to write these entries into the new relation table (cat_project_relation). The att_id links to another table where I have saved the settings of the old authors column:
-- att_id = 58
-- item_id = id
-- value_sorting = counting from 0 for each item_id
-- value_id = for each relation one entry value
INSERT INTO `cat_project_relation` (`att_id`, `item_id`, `value_sorting`, `value_id`) VALUES
(58, 2, 0, '4'),
(58, 2, 1, '1'),
(58, 3, 0, '0'),
(58, 4, 0, '8'),
(58, 4, 1, '4'),
(58, 4, 2, '1'),
(58, 5, 0, '13'),
(58, 5, 1, '12'),
(58, 10, 0, '19'),
(58, 10, 1, '4'),
(58, 10, 2, '1'),
(58, 13, 0, ''),
(58, 14, 0, ''),
(58, 15, 0, '28'),
(58, 15, 1, '27'),
(58, 15, 2, '25'),
(58, 15, 3, '12'),
(58, 15, 4, '9'),
(58, 15, 5, '1');
I hope it is clear what I am trying to achieve. Is it possible to do that directly in SQL or do I have to use an external bash script?
Actually it wasn't as difficult as I thought to write a little bash script. And I guess it's an easier solution than programming something in SQL.
#!/bin/bash
# input table
table="(2, '4,1'),
(3, '0'),
(4, '8,4,1'),
(5, '13,12'),
(10, '19,4,1'),
(13, ''),
(14, ''),
(15, '28,27,25,12,9,1');"
# fixed attribute id
att_id=58
# read each line into an array
readarray -t y <<<"$table"
# for each array item (each line)
for (( i=0; i<${#y[*]}; i=i+1 ))
do
  z=${y[$i]}
  # split by comma into array
  IFS=', ' read -r -a array <<< "$z"
  # loop through each value
  for (( j=0; j<${#array[*]}; j=j+1 ))
  do
    # remove all characters other than digits
    nr=$(echo "${array[$j]}" | sed 's/[^0-9]*//g')
    # the first value of each line is the item_id
    if [ $j -eq 0 ]
    then
      item_id=$nr
    else
      k=$(expr $j - 1)
      value_id=$nr
      # print output line by line
      echo "($att_id, $item_id, $k, '$value_id')," >> output.txt
    fi
  done
done
The result will be the same as the one asked for in the question.
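The same transformation can be sketched in Python, which avoids parsing the SQL export text. This is an illustration under assumptions: `to_relation_rows` is a made-up helper, and the input is taken to be already available as (id, authors) pairs.

```python
def to_relation_rows(projects, att_id=58):
    """Expand (id, 'a,b,c') pairs into (att_id, item_id, value_sorting, value_id) rows."""
    rows = []
    for item_id, authors in projects:
        # ''.split(',') yields [''], so an empty authors field still
        # produces one row, as in the expected output of the question.
        for sorting, value_id in enumerate(authors.split(",")):
            rows.append((att_id, item_id, sorting, value_id.strip()))
    return rows

projects = [
    (2, "4,1"),
    (3, "0"),
    (4, "8,4,1"),
    (5, "13,12"),
    (10, "19,4,1"),
    (13, ""),
    (14, ""),
    (15, "28,27,25,12,9,1"),
]

for row in to_relation_rows(projects):
    print("({}, {}, {}, '{}'),".format(*row))
```

The printed lines have the same shape as the value list of the INSERT statement in the question; the final comma still has to be replaced by a semicolon before running it.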

MySQL query comparing values to previous rows' values

I've been searching but have been unable to find a solution to this--I know it's do-able but I just don't have the ninja SQL skills I need (yet)....
I'm looking for a solution to this issue: I have two tables related to stock market data. The first is a simple list of stock symbols with an ID and stock ticker symbol (ID, SYMBOL). The second table contains historical price data for each of the stocks (ID, DATE, OPEN, HIGH, LOW, CLOSE, VOLUME).
I'm trying to figure out how to query for stocks whose most recent CLOSE price is greater than their CLOSE price 5 trading days ago. I can't just do date math because the stocks don't trade every day (no trading on weekends and holidays, and some stocks may not trade on a normal trading day). Thus, I just need to compare the CLOSE price from the most recent row with the 5th row preceding it for each symbol.
I have sample tables and data here:
http://sqlfiddle.com/#!2/5fe76/2
CREATE TABLE `STOCKS` (
`ID` int,
`SYMBOL` varchar(10)
);
INSERT INTO `STOCKS` (`ID`,`SYMBOL`)
VALUES
(1, 'AA'),
(2, 'ADT'),
(3, 'AEO'),
(4, 'AFA');
CREATE TABLE `PRICES` (
`ID` int,
`DATE` date,
`OPEN` decimal(6,2),
`HIGH` decimal(6,2),
`LOW` decimal(6,2),
`CLOSE` decimal(6,2),
`VOLUME` bigint
);
INSERT INTO `PRICES` (`ID`,`DATE`,`OPEN`,`HIGH`,`LOW`,`CLOSE`,`VOLUME`) VALUES
(1, '2014-11-06', 16.37, 16.42, 16.15, 16.37, 14200400),
(1, '2014-11-05', 16.68, 16.69, 16.17, 16.26, 18198200),
(1, '2014-11-04', 16.85, 16.87, 16.43, 16.56, 13182800),
(1, '2014-11-03', 16.78, 17.03, 16.65, 16.93, 15938500),
(1, '2014-10-31', 16.43, 16.76, 16.24, 16.76, 18618300),
(1, '2014-10-30', 16.17, 16.36, 15.83, 16.22, 17854400),
(1, '2014-10-29', 16.58, 16.70, 16.05, 16.27, 31173000),
(1, '2014-10-28', 16.5, 16.65, 16.41, 16.60, 12305900),
(1, '2014-10-27', 16.56, 16.57, 16.31, 16.38, 15452900),
(1, '2014-10-24', 16.33, 16.57, 16.22, 16.55, 12840200),
(2, '2014-11-06', 35.9, 36.12, 35.75, 36.07, 1018100),
(2, '2014-11-05', 35.68, 35.99, 35.37, 35.96, 1101500),
(2, '2014-11-04', 35.13, 35.69, 35.02, 35.49, 819100),
(2, '2014-11-03', 35.81, 35.99, 35.27, 35.32, 1304500),
(2, '2014-10-31', 35.79, 35.86, 35.46, 35.84, 1319400),
(2, '2014-10-30', 34.7, 35.34, 34.66, 35.19, 1201800),
(2, '2014-10-29', 35.06, 35.56, 34.5, 34.92, 1359000),
(2, '2014-10-28', 34.32, 35.17, 34.15, 35.07, 1301800),
(2, '2014-10-27', 34.2, 34.2, 33.66, 34.1, 662600),
(2, '2014-10-24', 34.02, 34.54, 33.95, 34.5, 750600),
(3, '2014-11-06', 13.27, 13.92, 13.25, 13.82, 6518000),
(3, '2014-11-05', 12.95, 13.27, 12.74, 13.22, 8716700),
(3, '2014-11-04', 12.85, 12.94, 12.65, 12.89, 4541200),
(3, '2014-11-03', 12.91, 13.12, 12.73, 12.89, 4299100),
(3, '2014-10-31', 13.2, 13.23, 12.83, 12.87, 7274700),
(3, '2014-10-30', 12.83, 12.91, 12.68, 12.86, 4444300),
(3, '2014-10-29', 13.02, 13.20, 12.79, 12.91, 2974900),
(3, '2014-10-28', 12.87, 13.10, 12.52, 13.04, 7365600),
(3, '2014-10-27', 12.84, 13.00, 12.67, 12.92, 6647900),
(3, '2014-10-24', 13.26, 13.29, 12.60, 12.92, 12803300),
(4, '2014-11-06', 24.59, 24.59, 24.49, 24.55, 20400),
(4, '2014-11-05', 24.81, 24.9, 24.81, 24.88, 11800),
(4, '2014-11-04', 24.87, 24.88, 24.76, 24.88, 10600),
(4, '2014-11-03', 24.85, 24.88, 24.76, 24.81, 18100),
(4, '2014-10-31', 24.82, 24.85, 24.77, 24.78, 8100),
(4, '2014-10-30', 24.83, 24.87, 24.74, 24.79, 13900),
(4, '2014-10-29', 24.86, 24.86, 24.78, 24.81, 5500),
(4, '2014-10-28', 24.85, 24.85, 24.80, 24.84, 10600),
(4, '2014-10-27', 24.68, 24.85, 24.68, 24.85, 7700),
(4, '2014-10-24', 24.67, 24.82, 24.59, 24.82, 9300);
Pseudo code for the query would be something like this:
"Find symbols whose most recent closing price is greater than the closing price 5 trading days earlier"
The query I'd like to create should result in the following:
Date Symbol Close Close(-5)
2014-11-06 AA 16.37 16.22
2014-11-06 ADT 36.07 35.19
2014-11-06 AEO 13.82 12.86
(the symbol 'AFA' would not match as its most recent close is 24.55 and 5 rows prior it was 24.79)
You can get the price 5 days ago using a correlated subquery. In fact, you can get the most recent price the same way. So, this might be the right path:
select s.*,
(select p.close
from prices p
where p.id = s.id
order by date desc
limit 1
) as Close,
(select p.close
from prices p
where p.id = s.id
order by date desc
limit 1 offset 5 -- skip the 5 newest rows: counts trading days (rows), not calendar days
) as Close_5
from stocks s
having Close > Close_5;
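The row-based comparison the question asks for (latest row against the 5th row before it, regardless of calendar gaps) can be sketched in Python as a sanity check. The function name `rising_over_n_rows` is hypothetical, and only the six most recent closes of AA (ID 1) and AFA (ID 4) from the sample data are used.

```python
def rising_over_n_rows(prices, n=5):
    """Return {id: (latest_date, latest_close, earlier_close)} for every id
    whose latest close is greater than the close n rows before it."""
    by_id = {}
    for stock_id, day, close in prices:
        by_id.setdefault(stock_id, []).append((day, close))
    result = {}
    for stock_id, rows in by_id.items():
        rows.sort(reverse=True)  # ISO date strings sort chronologically; newest first
        if len(rows) > n and rows[0][1] > rows[n][1]:
            result[stock_id] = (rows[0][0], rows[0][1], rows[n][1])
    return result

prices = [
    (1, "2014-11-06", 16.37), (1, "2014-11-05", 16.26), (1, "2014-11-04", 16.56),
    (1, "2014-11-03", 16.93), (1, "2014-10-31", 16.76), (1, "2014-10-30", 16.22),
    (4, "2014-11-06", 24.55), (4, "2014-11-05", 24.88), (4, "2014-11-04", 24.88),
    (4, "2014-11-03", 24.81), (4, "2014-10-31", 24.78), (4, "2014-10-30", 24.79),
]

print(rising_over_n_rows(prices))
```

Only ID 1 (AA) is returned, with 16.37 against 16.22, which agrees with the expected result in the question; AFA is dropped because 24.55 is below 24.79.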