I have a SQL table with discontinuous dates:
CREATE TABLE IF NOT EXISTS date_test1 ( items CHAR ( 8 ), trade_date date );
INSERT INTO `date_test1` VALUES ( 'a', '2020-03-20');
INSERT INTO `date_test1` VALUES ( 'b', '2020-03-20');
INSERT INTO `date_test1` VALUES ('a', '2020-03-21');
INSERT INTO `date_test1` VALUES ( 'c', '2020-03-22');
INSERT INTO `date_test1` VALUES ( 'd', '2020-03-22');
INSERT INTO `date_test1` VALUES ('a', '2020-03-25');
INSERT INTO `date_test1` VALUES ( 'e', '2020-03-26');
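In this table, '2020-03-23' and '2020-03-24' are missing. I want to fill them with the data from the previous available date, which in this table is '2020-03-22'.
Expected result: the table additionally contains the rows ('c', '2020-03-23'), ('d', '2020-03-23'), ('c', '2020-03-24') and ('d', '2020-03-24').
Both the number of consecutive missing dates and the number of records in one day are unknown in advance.
So how can I do this in MySQL?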
This solution uses Python and assumes the table is small enough to be read into memory. I do not warrant this code free from defects; use it at your own risk. I suggest you run it against a copy of your table or make a backup first.
This code uses the pymysql driver.
import pymysql
from datetime import date, timedelta
from itertools import groupby
import sys
conn = pymysql.connect(db='x', user='x', password='x', charset='utf8mb4', use_unicode=True)
cursor = conn.cursor()
# must be sorted by date:
cursor.execute('select items, trade_date from date_test1 order by trade_date, items')
rows = cursor.fetchall()  # tuples: (str, datetime.date)
if len(rows) == 0:
    sys.exit(0)
# group the rows that share the same trade_date
groups = []
for k, g in groupby(rows, key=lambda row: row[1]):
    groups.append(list(g))
one_day = timedelta(days=1)
previous_group = groups.pop(0)
next_date = previous_group[0][1]
for group in groups:
    next_date = next_date + one_day
    while group[0][1] != next_date:
        # missing date: copy the previous date's rows onto it
        for row in previous_group:
            cursor.execute('insert into date_test1(items, trade_date) values(%s, %s)', (row[0], next_date))
            print('inserting', row[0], next_date)
        conn.commit()
        next_date = next_date + one_day
    previous_group = group
Prints:
inserting c 2020-03-23
inserting d 2020-03-23
inserting c 2020-03-24
inserting d 2020-03-24
Discussion
With your sample data, after the rows are fetched, rows is:
(('a', datetime.date(2020, 3, 20)), ('b', datetime.date(2020, 3, 20)), ('a', datetime.date(2020, 3, 21)), ('c', datetime.date(2020, 3, 22)), ('d', datetime.date(2020, 3, 22)), ('a', datetime.date(2020, 3, 25)), ('e', datetime.date(2020, 3, 26)))
After the following is run:
groups = []
for k, g in groupby(rows, key=lambda row: row[1]):
    groups.append(list(g))
groups is:
[[('a', datetime.date(2020, 3, 20)), ('b', datetime.date(2020, 3, 20))], [('a', datetime.date(2020, 3, 21))], [('c', datetime.date(2020, 3, 22)), ('d', datetime.date(2020, 3, 22))], [('a', datetime.date(2020, 3, 25))], [('e', datetime.date(2020, 3, 26))]]
That is, all the tuples with the same date are grouped together in a list, so it becomes easier to detect missing dates.
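To try the grouping logic without touching a database, here is a minimal sketch that works on an in-memory list of rows and only returns the rows that would be inserted. The function name fill_missing_dates is hypothetical, and it assumes the rows are already sorted by date, as in the query above.

```python
from datetime import date, timedelta
from itertools import groupby


def fill_missing_dates(rows):
    """Return the (item, date) rows needed to fill the gaps between dates.

    `rows` must be sorted by date. Each missing date receives a copy of
    the rows from the most recent earlier date that has data.
    """
    one_day = timedelta(days=1)
    new_rows = []
    previous_group = None
    for current_date, group in groupby(rows, key=lambda row: row[1]):
        group = list(group)
        if previous_group is not None:
            # Walk from the day after the previous date up to the current one.
            missing = previous_group[0][1] + one_day
            while missing < current_date:
                new_rows.extend((item, missing) for item, _ in previous_group)
                missing += one_day
        previous_group = group
    return new_rows


rows = [
    ('a', date(2020, 3, 20)), ('b', date(2020, 3, 20)),
    ('a', date(2020, 3, 21)),
    ('c', date(2020, 3, 22)), ('d', date(2020, 3, 22)),
    ('a', date(2020, 3, 25)),
    ('e', date(2020, 3, 26)),
]
for item, day in fill_missing_dates(rows):
    print('would insert', item, day)
```

With the sample data this lists c and d for 2020-03-23 and 2020-03-24, the same rows the database version inserts.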
This is a real hair puller, so any help is much appreciated!
I want to be able to determine:
First day of the current custom/financial month
Last day of the current custom/financial month
I then want to use these new columns, Start_Date and End_Date, as BETWEEN filters in the matrix.
Note: I understand that if this were a calendar month, it would be quite straightforward.
But in this case it's quite different.
Please see the image, which might help with the context I am working with:
I'm sure this is possible using expressions in SSRS, but I don't have time to investigate. In case it's useful, here's how I would do it in SQL.
Again, there is probably a more elegant solution, but this is what came to me.
I've reproduced your data, plus a few more dates at either end for testing, which I guessed based on your sample.
DECLARE @t TABLE(Custom_Date date, Custom_Day int)
INSERT INTO @t VALUES
('2017-10-26', 28),
('2017-10-27', 29),
('2017-10-28', 30),
('2017-10-29', 1),
('2017-10-30', 2),
('2017-10-31', 3),
('2017-11-01', 4),
('2017-11-02', 5),
('2017-11-03', 6),
('2017-11-04', 7),
('2017-11-05', 8),
('2017-11-06', 9),
('2017-11-07', 10),
('2017-11-08', 11),
('2017-11-09', 12),
('2017-11-10', 13),
('2017-11-11', 14),
('2017-11-12', 15),
('2017-11-13', 16),
('2017-11-14', 17),
('2017-11-15', 18),
('2017-11-16', 19),
('2017-11-17', 20),
('2017-11-18', 21),
('2017-11-19', 22),
('2017-11-20', 23),
('2017-11-21', 24),
('2017-11-22', 25),
('2017-11-23', 26),
('2017-11-24', 27),
('2017-11-25', 28),
('2017-11-26', 1),
('2017-11-27', 2),
('2017-11-28', 3)
Then two queries pull out the correct dates; you could combine them if required.
-- first day of the current custom month
SELECT MAX(Custom_Date) FROM @t WHERE Custom_Date < getdate() AND custom_day = 1
-- last day of the current custom month
SELECT MAX(Custom_Date)
FROM @t
WHERE
    Custom_Date > getdate()
    AND DATEDIFF(d, getdate(), Custom_Date) <= 31
    AND Custom_Day = (
        SELECT MAX(custom_day)
        FROM @t
        WHERE
            Custom_Date > getdate()
            AND DATEDIFF(d, getdate(), Custom_Date) <= 31
    )
FYI: This would be a lot easier if you had a custom month/period and year in your dates table, as then you could just look up custom_day 1 and MAX(custom_day) where the custom month and year are the same as those of the current date.
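The same lookup can be sketched in Python for checking the logic outside the report. This is only an illustration under assumptions: custom_month_bounds is a hypothetical helper, the calendar is held as a list of (date, custom_day) tuples, and the end of the month is taken as the day before the next custom day 1 rather than via the 31-day window used in the SQL.

```python
from datetime import date, timedelta


def custom_month_bounds(calendar, today):
    """Return (start, end) of the custom month that contains `today`.

    `calendar` is a list of (custom_date, custom_day) tuples in which
    custom_day restarts at 1 on the first day of each custom month.
    Raises ValueError if no custom day 1 exists on or before `today`.
    """
    # Start: the latest date on or before today whose custom day is 1.
    start = max(d for d, day in calendar if d <= today and day == 1)
    # End: the day before the next custom day 1, or the last known date.
    later_starts = [d for d, day in calendar if d > today and day == 1]
    if later_starts:
        next_start = min(later_starts)
        end = max(d for d, _ in calendar if d < next_start)
    else:
        end = max(d for d, _ in calendar)
    return start, end


# Rebuild the sample calendar from the answer: three runs of consecutive days.
calendar = []
for first, length, first_day in [
    (date(2017, 10, 26), 3, 28),
    (date(2017, 10, 29), 28, 1),
    (date(2017, 11, 26), 3, 1),
]:
    for offset in range(length):
        calendar.append((first + timedelta(days=offset), first_day + offset))

print(custom_month_bounds(calendar, date(2017, 11, 10)))
```

For 2017-11-10 this gives 2017-10-29 as the start and 2017-11-25 as the end of the custom month.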
The problem is to calculate, for each row returned, the time ("ResponseTime") between two timestamps ("StartDateTime" and "EndDateTime"), excluding weekends. It does not take work hours or holidays into consideration.
Weekends in this case are defined as Saturday 00:00:00 to Sunday 23:59:59.
I had a tough time coming up with a solution for this question, so I thought I would share my final product. I found lots of solutions online, but most either used a calendar table, which I couldn't use in this application, or had logic I didn't understand. My solution is shared below. Please feel free to offer your own solution based on the problem or to correct any errors you see in my code.
EDIT: As per the comments from @JuanCarlosOropeza, the solution I presented is not optimal. I am providing sample data so that he can put forward a different solution. If anyone else has improvements, feel free to participate.
CREATE TABLE SourceTable
(`id` int, `StartDateTime` datetime, `EndDateTime` datetime)
;
INSERT INTO SourceTable
(`id`, `StartDateTime`, `EndDateTime`)
VALUES
(1, '2016-09-20 12:52:00', '2016-09-23 13:15:00'),
(2, '2016-09-19 19:15:00', '2016-09-22 19:15:00'),
(3, '2016-09-01 10:35:00', '2016-09-06 13:15:00'),
(4, '2016-09-26 10:34:00', '2016-09-29 11:25:00'),
(5, '2016-09-01 13:01:00', '2016-09-06 14:55:00'),
(6, '2016-09-05 02:21:00', '2016-09-08 19:15:00'),
(7, '2016-09-27 14:14:00', '2016-10-01 19:15:00'),
(8, '2016-09-27 04:18:00', '2016-09-30 14:15:00'),
(9, '2016-09-01 14:50:00', '2016-09-06 17:25:00'),
(10, '2016-09-20 12:52:00', '2016-09-23 13:15:00'),
(11, '2016-09-26 02:14:00', '2016-09-29 10:15:00'),
(12, '2016-09-01 12:04:00', '2016-09-06 17:05:00'),
(13, '2016-09-20 15:30:00', '2016-09-23 15:15:00'),
(14, '2016-09-02 16:04:00', '2016-09-07 20:55:00'),
(15, '2016-09-23 10:41:00', '2016-09-28 13:05:00'),
(16, '2016-09-27 16:28:00', '2016-10-01 13:15:00'),
(17, '2016-09-27 15:33:00', '2016-10-01 22:45:00'),
(18, '2016-09-20 12:53:00', '2016-09-23 13:25:00'),
(19, '2016-09-19 13:49:00', '2016-09-22 13:05:00'),
(20, '2016-09-20 13:46:00', '2016-09-23 13:15:00'),
(21, '2016-09-01 16:32:00', '2016-09-06 18:05:00'),
(22, '2016-09-01 10:35:00', '2016-09-06 22:45:00'),
(23, '2016-09-26 12:40:00', '2016-09-29 12:35:00'),
(24, '2016-09-27 10:37:00', '2016-09-30 21:25:00'),
(25, '2016-09-27 09:41:00', '2016-09-30 15:15:00'),
(26, '2016-09-16 02:09:00', '2016-09-21 10:05:00'),
(27, '2016-09-20 15:13:00', '2016-09-23 15:15:00'),
(28, '2016-09-20 15:30:00', '2016-09-23 15:15:00'),
(29, '2016-09-27 09:55:00', '2016-09-30 13:25:00'),
(30, '2016-09-27 04:18:00', '2016-09-30 14:15:00')
;
I created this solution considering the following logic assumptions.
StartDateTime always occurs before EndDateTime (though had some that didn't and it calculated the time difference correctly)
Week StartDateTime occurred: WEEK(StartDateTime,1)
Week EndDateTime occurred: WEEK(EndDateTime,1)
Start of weekend of week StartDateTime: ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime))
Start of workweek after first weekend: ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),7-WEEKDAY(StartDateTime))
Full Query:
SELECT
    id,
    StartDateTime,
    EndDateTime,
    CASE
        -- both timestamps fall in the same week
        WHEN ( WEEK(EndDateTime,1) = WEEK(StartDateTime,1) )
        THEN
            CASE
                -- start is already on the weekend
                WHEN ( StartDateTime >= ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)) )
                THEN SEC_TO_TIME(0)
                ELSE
                    CASE
                        WHEN ( EndDateTime >= ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)) )
                        THEN ( TIMEDIFF(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),StartDateTime) )
                        ELSE ( TIMEDIFF(EndDateTime,StartDateTime) )
                    END
            END
        -- timestamps fall in different weeks; each full work week is 120 hours
        ELSE
            CASE
                WHEN ( StartDateTime >= ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)) )
                THEN
                    CASE
                        WHEN ( EndDateTime >= ADDDATE(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),(WEEK(EndDateTime,1) - WEEK(StartDateTime,1)) * 7) )
                        THEN ( SEC_TO_TIME(120*3600*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1))) )
                        ELSE ( SEC_TO_TIME(120*3600*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1) - 1) + TIME_TO_SEC(TIMEDIFF(EndDateTime, ADDDATE(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),7-WEEKDAY(StartDateTime)),7*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1) - 1))))) )
                    END
                ELSE
                    CASE
                        WHEN ( EndDateTime >= ADDDATE(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),(WEEK(EndDateTime,1) - WEEK(StartDateTime,1)) * 7) )
                        THEN ( SEC_TO_TIME(120*3600*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1)) + TIME_TO_SEC(TIMEDIFF(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),StartDateTime))) )
                        ELSE ( SEC_TO_TIME(120*3600*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1) - 1) + TIME_TO_SEC(TIMEDIFF(EndDateTime, ADDDATE(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),7-WEEKDAY(StartDateTime)),7*(WEEK(EndDateTime,1) - WEEK(StartDateTime,1) - 1)))) + TIME_TO_SEC(TIMEDIFF(ADDDATE(TIMESTAMP(DATE(StartDateTime),'00:00:00'),5-WEEKDAY(StartDateTime)),StartDateTime))) )
                    END
            END
    END as ResponseTime
FROM
    SourceTable;
The first CASE checks whether both timestamps fall in the same week. The second layer checks whether StartDateTime falls on the first weekend. The third layer checks whether EndDateTime falls on a weekend. Based on these checks, the query outputs the matching calculation. Because the week comparison uses WEEK(), the query assumes both timestamps fall in the same year.
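A simple way to cross-check results from a query like this is to compute the same quantity by brute force. The sketch below is not the query's method: it walks from the start to the end one calendar day at a time and only counts the Monday to Friday segments. The function name weekday_duration is hypothetical.

```python
from datetime import datetime, timedelta


def weekday_duration(start, end):
    """Return the time between `start` and `end`, skipping Saturdays and Sundays."""
    if end <= start:
        return timedelta(0)
    total = timedelta(0)
    cursor = start
    while cursor < end:
        # Midnight at the start of the next calendar day.
        next_midnight = datetime.combine(cursor.date(), datetime.min.time()) + timedelta(days=1)
        segment_end = min(next_midnight, end)
        if cursor.weekday() < 5:  # Monday is 0, Saturday is 5, Sunday is 6
            total += segment_end - cursor
        cursor = segment_end
    return total


# Row 7 of the sample data: starts on a Tuesday, ends on a Saturday.
print(weekday_duration(datetime(2016, 9, 27, 14, 14), datetime(2016, 10, 1, 19, 15)))
# prints "3 days, 9:46:00"
```

The loop does one iteration per calendar day, so it is slower than closed-form arithmetic, but it is easy to reason about and handles spans of any number of weeks.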
I have several MySQL tables in which the relation IDs of a child table are stored as a comma-separated list. Now I have to transfer these entries into a new table with one entry per relation.
Is there an easy way to transform the import query into the correct format?
Here is example data. My old table (cat_projects) has the following entries that I want to convert:
-- export of table cat_projects
INSERT INTO `cat_projects` (`id`, `authors`) VALUES
(2, '4,1'),
(3, '0'),
(4, '8,4,1'),
(5, '13,12'),
(10, '19,4,1'),
(13, ''),
(14, ''),
(15, '28,27,25,12,9,1');
I want to write these entries into the new relation table (cat_project_relation). The att_id links to another table where I have saved the settings of the old authors column:
-- att_id = 58
-- item_id = id
-- value_sorting = counting from 0 for each item_id
-- value_id = for each relation one entry value
INSERT INTO `cat_project_relation` (`att_id`, `item_id`, `value_sorting`, `value_id`) VALUES
(58, 2, 0, '4'),
(58, 2, 1, '1'),
(58, 3, 0, '0'),
(58, 4, 0, '8'),
(58, 4, 1, '4'),
(58, 4, 2, '1'),
(58, 5, 0, '13'),
(58, 5, 1, '12'),
(58, 10, 0, '19'),
(58, 10, 1, '4'),
(58, 10, 2, '1'),
(58, 13, 0, ''),
(58, 14, 0, ''),
(58, 15, 0, '28'),
(58, 15, 1, '27'),
(58, 15, 2, '25'),
(58, 15, 3, '12'),
(58, 15, 4, '9'),
(58, 15, 5, '1');
I hope it is clear what I am trying to achieve. Is it possible to do this directly in SQL, or do I have to use an external bash script?
Actually, it wasn't as difficult as I thought to write a little bash script, and I guess it's an easier solution than programming something in SQL.
#!/bin/bash
# input table
table="(2, '4,1'),
(3, '0'),
(4, '8,4,1'),
(5, '13,12'),
(10, '19,4,1'),
(13, ''),
(14, ''),
(15, '28,27,25,12,9,1');"
# fixed attribute id
att_id=58
# read each line into an array
readarray -t y <<<"$table"
# for each array item (each line)
for (( i=0; i<${#y[*]}; i=i+1 ))
do
    z=${y[$i]}
    # split on commas and spaces into an array
    IFS=', ' read -r -a array <<< "$z"
    # loop through each value
    for (( j=0; j<${#array[*]}; j=j+1 ))
    do
        # remove every character that is not a digit
        nr=$(echo "${array[$j]}" | sed 's/[^0-9]*//g')
        # the first value on each line is the item_id
        if [ "$j" -eq 0 ]
        then
            item_id=$nr
        else
            k=$(expr $j - 1)
            value_id=$nr
            # append one output line per value
            echo "($att_id, $item_id, $k, '$value_id')," >> output.txt
        fi
    done
done
The result, written to output.txt, matches the rows asked for in the question, except that the last line ends with a comma instead of a semicolon.
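For comparison, the same transformation can be sketched in Python, assuming the (id, authors) pairs have already been read from cat_projects. The function name split_relations is hypothetical; the default att_id of 58 comes from the question.

```python
def split_relations(rows, att_id=58):
    """Expand (item_id, 'v1,v2,...') rows into one tuple per value."""
    relations = []
    for item_id, authors in rows:
        # ''.split(',') yields [''], so an empty column still produces
        # one row, matching the expected output in the question.
        for position, value_id in enumerate(authors.split(',')):
            relations.append((att_id, item_id, position, value_id.strip()))
    return relations


cat_projects = [
    (2, '4,1'), (3, '0'), (4, '8,4,1'), (5, '13,12'),
    (10, '19,4,1'), (13, ''), (14, ''), (15, '28,27,25,12,9,1'),
]
for relation in split_relations(cat_projects):
    print(relation)
```

Each printed tuple has the form (att_id, item_id, value_sorting, value_id), so it can be passed to a parameterized INSERT instead of being pasted into a SQL string.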
I've been searching but have been unable to find a solution to this. I know it's doable, but I just don't have the ninja SQL skills I need (yet).
I'm looking for a solution to this issue: I have two tables related to stock market data. The first is a simple list of stock symbols with an ID and stock ticker symbol (ID, SYMBOL). The second table contains historical price data for each of the stocks (ID, DATE, OPEN, HIGH, LOW, CLOSE, VOLUME).
I'm trying to figure out how to query for stocks whose most recent CLOSE price is greater than their CLOSE price 5 trading days ago. I can't just do date math because the stocks don't trade every day (there is no trading on weekends and holidays, and some stocks may not trade on a normal trading day). Thus, I just need to compare the CLOSE price from the most recent row with that of the 5th row preceding it for each symbol.
I have sample tables and data here:
http://sqlfiddle.com/#!2/5fe76/2
CREATE TABLE `STOCKS` (
`ID` int,
`SYMBOL` varchar(10)
);
INSERT INTO `STOCKS` (`ID`,`SYMBOL`)
VALUES
(1, 'AA'),
(2, 'ADT'),
(3, 'AEO'),
(4, 'AFA');
CREATE TABLE `PRICES` (
`ID` int,
`DATE` date,
`OPEN` decimal(6,2),
`HIGH` decimal(6,2),
`LOW` decimal(6,2),
`CLOSE` decimal(6,2),
`VOLUME` bigint
);
INSERT INTO `PRICES` (`ID`,`DATE`,`OPEN`,`HIGH`,`LOW`,`CLOSE`,`VOLUME`) VALUES
(1, '2014-11-06', 16.37, 16.42, 16.15, 16.37, 14200400),
(1, '2014-11-05', 16.68, 16.69, 16.17, 16.26, 18198200),
(1, '2014-11-04', 16.85, 16.87, 16.43, 16.56, 13182800),
(1, '2014-11-03', 16.78, 17.03, 16.65, 16.93, 15938500),
(1, '2014-10-31', 16.43, 16.76, 16.24, 16.76, 18618300),
(1, '2014-10-30', 16.17, 16.36, 15.83, 16.22, 17854400),
(1, '2014-10-29', 16.58, 16.70, 16.05, 16.27, 31173000),
(1, '2014-10-28', 16.5, 16.65, 16.41, 16.60, 12305900),
(1, '2014-10-27', 16.56, 16.57, 16.31, 16.38, 15452900),
(1, '2014-10-24', 16.33, 16.57, 16.22, 16.55, 12840200),
(2, '2014-11-06', 35.9, 36.12, 35.75, 36.07, 1018100),
(2, '2014-11-05', 35.68, 35.99, 35.37, 35.96, 1101500),
(2, '2014-11-04', 35.13, 35.69, 35.02, 35.49, 819100),
(2, '2014-11-03', 35.81, 35.99, 35.27, 35.32, 1304500),
(2, '2014-10-31', 35.79, 35.86, 35.46, 35.84, 1319400),
(2, '2014-10-30', 34.7, 35.34, 34.66, 35.19, 1201800),
(2, '2014-10-29', 35.06, 35.56, 34.5, 34.92, 1359000),
(2, '2014-10-28', 34.32, 35.17, 34.15, 35.07, 1301800),
(2, '2014-10-27', 34.2, 34.2, 33.66, 34.1, 662600),
(2, '2014-10-24', 34.02, 34.54, 33.95, 34.5, 750600),
(3, '2014-11-06', 13.27, 13.92, 13.25, 13.82, 6518000),
(3, '2014-11-05', 12.95, 13.27, 12.74, 13.22, 8716700),
(3, '2014-11-04', 12.85, 12.94, 12.65, 12.89, 4541200),
(3, '2014-11-03', 12.91, 13.12, 12.73, 12.89, 4299100),
(3, '2014-10-31', 13.2, 13.23, 12.83, 12.87, 7274700),
(3, '2014-10-30', 12.83, 12.91, 12.68, 12.86, 4444300),
(3, '2014-10-29', 13.02, 13.20, 12.79, 12.91, 2974900),
(3, '2014-10-28', 12.87, 13.10, 12.52, 13.04, 7365600),
(3, '2014-10-27', 12.84, 13.00, 12.67, 12.92, 6647900),
(3, '2014-10-24', 13.26, 13.29, 12.60, 12.92, 12803300),
(4, '2014-11-06', 24.59, 24.59, 24.49, 24.55, 20400),
(4, '2014-11-05', 24.81, 24.9, 24.81, 24.88, 11800),
(4, '2014-11-04', 24.87, 24.88, 24.76, 24.88, 10600),
(4, '2014-11-03', 24.85, 24.88, 24.76, 24.81, 18100),
(4, '2014-10-31', 24.82, 24.85, 24.77, 24.78, 8100),
(4, '2014-10-30', 24.83, 24.87, 24.74, 24.79, 13900),
(4, '2014-10-29', 24.86, 24.86, 24.78, 24.81, 5500),
(4, '2014-10-28', 24.85, 24.85, 24.80, 24.84, 10600),
(4, '2014-10-27', 24.68, 24.85, 24.68, 24.85, 7700),
(4, '2014-10-24', 24.67, 24.82, 24.59, 24.82, 9300);
Pseudocode for the query would be something like this:
"Find symbols whose most recent closing price is greater than the closing price 5 trading days earlier"
The query I'd like to create should result in the following:
Date Symbol Close Close(-5)
2014-11-06 AA 16.37 16.22
2014-11-06 ADT 36.07 35.19
2014-11-06 AEO 13.82 12.86
(the symbol 'AFA' would not match, as its most recent close is 24.55 and 5 rows prior it was 24.79)
You can get the close from 5 trading days earlier using a correlated subquery that skips rows with OFFSET, so no date math is needed. In fact, you can get the most recent close the same way. So, this might be the right path:
select s.*,
       (select p.close
        from PRICES p
        where p.id = s.id
        order by p.date desc
        limit 1
       ) as Close,
       (select p.close
        from PRICES p
        where p.id = s.id
        order by p.date desc
        limit 1 offset 5
       ) as Close_5
from STOCKS s
having Close > Close_5;
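The question asks for a comparison by row position rather than by calendar date. As a cross-check, here is a minimal Python sketch of that row-based comparison, assuming the price rows have already been fetched as (stock_id, date, close) tuples with ISO date strings. The function name rising_over_n_rows is hypothetical, and the sample rows are the six most recent rows for stocks 1 and 4 from the question.

```python
from collections import defaultdict


def rising_over_n_rows(prices, n=5):
    """Return {stock_id: (latest_date, latest_close, earlier_close)}.

    A stock is included when its latest close is greater than the close
    `n` rows earlier. Stocks with fewer than n + 1 rows are skipped.
    """
    by_stock = defaultdict(list)
    for stock_id, day, close in prices:
        by_stock[stock_id].append((day, close))
    result = {}
    for stock_id, history in by_stock.items():
        history.sort(reverse=True)  # newest first; ISO dates sort as strings
        if len(history) <= n:
            continue
        latest_day, latest_close = history[0]
        earlier_close = history[n][1]
        if latest_close > earlier_close:
            result[stock_id] = (latest_day, latest_close, earlier_close)
    return result


prices = [
    (1, '2014-11-06', 16.37), (1, '2014-11-05', 16.26), (1, '2014-11-04', 16.56),
    (1, '2014-11-03', 16.93), (1, '2014-10-31', 16.76), (1, '2014-10-30', 16.22),
    (4, '2014-11-06', 24.55), (4, '2014-11-05', 24.88), (4, '2014-11-04', 24.88),
    (4, '2014-11-03', 24.81), (4, '2014-10-31', 24.78), (4, '2014-10-30', 24.79),
]
print(rising_over_n_rows(prices))
```

With these rows only stock 1 (AA) qualifies, since 16.37 is greater than 16.22, while stock 4 (AFA) closed at 24.55 against 24.79 five rows earlier.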