Populating a time dimension table automatically

I am currently working on a reporting project. In my data warehouse I need a dimension table "Time" containing all dates (since 01-01-2011 maybe?) that grows automatically every day, with dates in the format yyyy-mm-dd.
I'm using MySQL on Debian by the way.
thanks
JT

You can add a DATE field and use a query like this:
INSERT INTO table(date_column, column1, column2)
VALUES(DATE(NOW()), 'value1', 'value2');
You can also add a TIMESTAMP column with ON UPDATE CURRENT_TIMESTAMP; in that case the date-time value is updated automatically whenever the row changes.
See "Automatic Initialization and Updating for TIMESTAMP" in the MySQL manual.

See this answer
Or This one
There are a number of suggestions there. If your date range is going to be moderate, perhaps a year or two, and assuming your report uses a stored procedure to return the results, you could just create a temporary table on the fly using a rownum technique with limit to get you all of the dates in the range. Then join with your data as required.
Failing that the Union trick in the second answer seems to perform well according to the comments and can be extended to whatever maximum range you will need. It's very messy though!

This article seems to cover what you want. See also this question for another example of the columns you might want to have on your table. You should definitely generate a large amount of dates in advance instead of updating the table daily; it saves a lot of work and complications. 100 years are only ~36500 rows, which is a small table.
Temporary tables or procedural code are not good solutions for a data warehouse, because you want your reporting tool to be able to access the dimension tables. And if your RDBMS has optimizations for star schema queries (I don't know if MySQL does or not) then it would need to see the dimension too.

Here is what I am using to create and populate time dimension table:
DROP TABLE IF EXISTS time_dimension;
CREATE TABLE time_dimension (
id INTEGER PRIMARY KEY, -- year*10000+month*100+day
db_date DATE NOT NULL,
year INTEGER NOT NULL,
month INTEGER NOT NULL, -- 1 to 12
day INTEGER NOT NULL, -- 1 to 31
quarter INTEGER NOT NULL, -- 1 to 4
week INTEGER NOT NULL, -- 1 to 52/53
day_name VARCHAR(9) NOT NULL, -- 'Monday', 'Tuesday'...
month_name VARCHAR(9) NOT NULL, -- 'January', 'February'...
holiday_flag CHAR(1) DEFAULT 'f' CHECK (holiday_flag in ('t', 'f')),
weekend_flag CHAR(1) DEFAULT 'f' CHECK (weekend_flag in ('t', 'f')),
UNIQUE td_ymd_idx (year,month,day),
UNIQUE td_dbdate_idx (db_date)
) Engine=MyISAM;
DROP PROCEDURE IF EXISTS fill_date_dimension;
DELIMITER //
CREATE PROCEDURE fill_date_dimension(IN startdate DATE,IN stopdate DATE)
BEGIN
DECLARE currentdate DATE;
SET currentdate = startdate;
WHILE currentdate <= stopdate DO
INSERT INTO time_dimension VALUES (
YEAR(currentdate)*10000+MONTH(currentdate)*100 + DAY(currentdate),
currentdate,
YEAR(currentdate),
MONTH(currentdate),
DAY(currentdate),
QUARTER(currentdate),
WEEKOFYEAR(currentdate),
DATE_FORMAT(currentdate,'%W'),
DATE_FORMAT(currentdate,'%M'),
'f',
CASE DAYOFWEEK(currentdate) WHEN 1 THEN 't' WHEN 7 then 't' ELSE 'f' END
);
SET currentdate = ADDDATE(currentdate,INTERVAL 1 DAY);
END WHILE;
END
//
DELIMITER ;
TRUNCATE TABLE time_dimension;
CALL fill_date_dimension('1800-01-01','2050-01-01');
OPTIMIZE TABLE time_dimension;
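
Once the table is filled, it can be checked and joined like any other dimension. The following is only a sketch: the sales_fact table and its columns are hypothetical and not part of the answer above; the only assumption is that facts carry the same yyyymmdd integer used for time_dimension.id.

```sql
-- Sanity check: the row count should equal the number of calendar days covered.
SELECT COUNT(*) AS row_count,
       DATEDIFF(MAX(db_date), MIN(db_date)) + 1 AS expected_rows
FROM time_dimension;

-- Hypothetical fact table (an assumption for illustration), keyed by the
-- same yyyymmdd integer that time_dimension.id uses.
CREATE TABLE IF NOT EXISTS sales_fact (
    date_id INTEGER NOT NULL,
    amount  DECIMAL(10,2) NOT NULL
);

-- Typical reporting query: aggregate facts by calendar attributes.
SELECT td.year, td.quarter, SUM(f.amount) AS total_amount
FROM sales_fact AS f
JOIN time_dimension AS td ON td.id = f.date_id
WHERE td.db_date BETWEEN '2011-01-01' AND '2011-12-31'
GROUP BY td.year, td.quarter
ORDER BY td.year, td.quarter;
```

Because the dimension is an ordinary persistent table, a reporting tool can see it directly, which is the point made in the earlier answer about avoiding temporary tables.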

Related

Generate a text file based on comparison of two rows in MYSQL

I have a MYSQL table which contains timestamp and direction (buy/sell signal) of stock market data.
Below is the CREATE and INSERT statement of sample data.
The table is in descending order of timestamp, and the table is truncated and reinserted at 5-minute intervals. I have included an id field which is auto-incremented, as it may help in comparing the first row with the second row.
Every time the direction of the market changes, I want a text file to be generated. As an example (from sample data), when timestamp was 15:00:00, since it was the first row that was inserted to the table, it should generate a text file as SELL.txt. At 15:05:00, since the direction changed from SELL to BUY, it should generate a text file as BUY.txt. Since the direction did not change at 15:10:00 and 15:15:00 compared to the previous row, no text file should be generated. At 15:20:00, since the direction changed from BUY to SELL, it should generate a text file as SELL.txt. Since the direction did not change at 15:25:00 and 15:30:00 compared to the previous row, no text file should be generated.
In Summary, if the cell value of the first row of direction field is not equal to the cell value of the second row of direction field, then a text file has to be generated based on the value of the first row of direction field. If the cell value of the first row of direction field is equal to the cell value of the second row of direction field, then no text file has to be generated.
I am assuming this can be implemented using stored procedures. However, I am new to stored procedures, and I have not been able to get this implemented so far. I would truly appreciate if someone can help in this regard.
thanks and regards,
CREATE TABLE `tbl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`timestamp` datetime DEFAULT NULL,
`direction` varchar(10) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `market`.`tbl`
(`id`,
`timestamp`,
`direction`)
VALUES
(1,'2020-02-24 15:30:00','BUY'),
(2,'2020-02-24 15:25:00','SELL'),
(3,'2020-02-24 15:20:00','SELL'),
(4,'2020-02-24 15:15:00','BUY'),
(5,'2020-02-24 15:10:00','BUY'),
(6,'2020-02-24 15:05:00','BUY'),
(7,'2020-02-24 15:00:00','SELL');

CREATE TRIGGER tr
AFTER INSERT
ON tbl
FOR EACH ROW
BEGIN
IF EXISTS ( SELECT 1
FROM tbl t1, tbl t2
WHERE t1.`timestamp` BETWEEN CURRENT_TIMESTAMP - INTERVAL 2 MINUTE
AND CURRENT_TIMESTAMP + INTERVAL 2 MINUTE
AND t2.`timestamp` BETWEEN CURRENT_TIMESTAMP - INTERVAL 7 MINUTE
AND CURRENT_TIMESTAMP - INTERVAL 3 MINUTE
AND t1.direction != t2.direction ) THEN
IF 'SELL' = ( SELECT direction
FROM tbl
ORDER BY `timestamp` DESC LIMIT 1 ) THEN
/* SELECT 1 INTO OUTFILE 'SELL.txt'; */
INSERT INTO service (txt) VALUES (CONCAT(CURRENT_TIMESTAMP, ' SELL'));
ELSE
/* SELECT 1 INTO OUTFILE 'BUY.txt'; */
INSERT INTO service (txt) VALUES (CONCAT(CURRENT_TIMESTAMP, ' BUY'));
END IF;
END IF;
END
fiddle
Execute the fiddle several times: you'll see that a message is generated when the directions in the last 2 records differ, and not generated when the directions are the same.
The problem: each insert (except the first one) generates an insertion into the service table (and an OUTFILE, if you uncomment that line). A second attempt to create an OUTFILE that already exists will fail, which makes the whole INSERT query fail. You must therefore keep some persistent marker that identifies that the file was already created during this INSERT, so that the trigger does not try to create it again. The service table, which stores the timestamp, is a safe choice; check it with some tolerance, as in the record checks above (+/- 2 minutes seems reasonable).

Function that automatically converts the date format for every date column in a query result?

Date information in my database is formatted in unixtime to the millisecond. Currently, in order to convert a result set into MST I use this function string:
date_format(convert_tz(from_unixtime(table.column/1000),'utc','us/mountain'),'%m/%d/%Y')
I have a routine which I can apply to individual columns in my query that looks like this:
create function datefmt(convert_tz(TEDATE bigint, TEFMT text),'gmt','us/mountain')
returns varchar(50);
This works fine when I'm specifically calling date columns, but I can't apply it to all date columns in a select * statement. This can make running general queries quite tedious, especially with joins (as most of the tables I use have between 3-6 date columns)!
I am trying to figure out how to create something that will recognize every date column in the result set and then apply the date formatting to all rows in the applicable columns. I've considered using triggers, user defined functions, and routines. But I'm having a hard time figuring out exactly how I can accomplish this feat, or if it can even be accomplished.
An example table would be "Task" with these date columns: rowAddedDate (bigint not null), rowUpdatedDate (bigint not null), createdDate (bigint not null), orderedDate (bigint not null), serviceEndDate (int null), serviceStartDate (int null), expectedServiceDate (int null).
I use a clone, and the database software is MariaDB v 10.2.12.
Any help regarding this matter is greatly appreciated!

Looping with dynamic sql demonstration in sql server:
create table #temp (rowid int identity, test varchar(max))
insert #temp
values
('a'),('b'),('c'),('d')
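declare @iterator int = 1
declare @maxrows int = (select max(rowid) from #temp)
declare @sql varchar(max)
while @iterator<=@maxrows
begin
-- cast the int before concatenating; otherwise SQL Server tries to convert the string to int
set @sql = 'select test from #temp where rowid=' + cast(@iterator as varchar(10))
exec(@sql)
set @iterator=@iterator+1
end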

Sounds like the columns should have been TIMESTAMP(3), not DATETIME(3) or some hack with BIGINT. (Looks like you have the BIGINT.)
That way, you could SET the timezone once, and all times would be automatically converted.
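
A minimal sketch of that behaviour, assuming a hypothetical table task_demo (not from the question). Numeric offsets are used instead of a named zone such as 'US/Mountain' so the example does not depend on the server's time zone tables being loaded; a fixed offset does not follow daylight saving time.

```sql
-- Hypothetical demo table; TIMESTAMP values are stored as UTC internally.
CREATE TABLE task_demo (
    id      INT PRIMARY KEY,
    created TIMESTAMP(3) NOT NULL
);

-- Write the value while the session is in UTC.
SET time_zone = '+00:00';
INSERT INTO task_demo (id, created) VALUES (1, '2020-02-24 22:30:00.000');

-- Switch the session time zone once; every TIMESTAMP column is converted on read.
SET time_zone = '-07:00';
SELECT id, created FROM task_demo;   -- created is shown as 2020-02-24 15:30:00.000
```

With this column type, a plain SELECT * already returns converted values, so no per-column function call is needed.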

Creating procedure to validate data to be inserted

I'm trying to create a procedure to prevent the insertion of an incorrect date. The table accepts an integer 8 digits long so April 28 2015 would be inserted as 4282015.
My logic here was to create some temp variables to store month, day, and year and then assign them values by taking substrings from the original 8-digit value. I would then convert them to strings and concatenate them together (I am not sure if there's a way to concatenate ints; if there is, that would probably be better), then convert that back to an int to be inserted. This is what I have tried so far.
UPDATE: The "sample" table is just an example, I will be running this on a different table in a poorly set up database (my job to run analysis and fix it up a little). The way they have it set up, date is an integer.
CREATE TABLE sample (
id INT not null,
date INT not null
);
CREATE PROCEDURE InsertDate ( date int(8))
BEGIN
DECLARE month INT;
DECLARE day INT;
DECLARE year INT;
SET month = SUBSTRING(new.date, 1, 2);
SET day = SUBSTRING(new.date, 3, 2);
SET year = SUBSTRING(new.date, 5, 4);
IF ( month IN(1, 12) AND day IN(1, 31) AND year IN(2012, 2013) )
Declare temp as int;
#Cast all variables as VARCHARS to concatenate together
#Convert back to INT to be inserted
Set temp = CAST( (CAST(month AS VARCHAR(2)) +
CAST(day AS VARCHAR(2)) +
CAST(year as VARCHAR(4))) as INT );
insert into sample (id, date) values (1 ,temp);
END IF;
END;
If anyone wants to take a look and give me some pointers or explain some stuff, it would be much appreciated!

If you do the sensible thing, you will create the column as a Date or DateTime type. Since you have both MySQL and SQL-Server tags I don't actually know which DBMS you are using, but both of them support Date types.
Don't make it more complicated than you need to.
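
If the integer column has to stay for now, one possible way to check the existing values is to let the database parse them as dates. This is only a sketch: the sample_demo table is hypothetical, the MMDDYYYY layout is taken from the question, and the exact handling of out-of-range parts depends on the server's SQL mode, so treat the NULL result as the expected rather than guaranteed behaviour.

```sql
-- Hypothetical copy of the question's table layout.
CREATE TABLE sample_demo (
    id   INT NOT NULL,
    date INT NOT NULL
);
INSERT INTO sample_demo (id, date) VALUES (1, 4282015), (2, 13282015);

-- LPAD restores the leading zero that an INT drops (4282015 -> '04282015').
SELECT STR_TO_DATE(LPAD(4282015, 8, '0'), '%m%d%Y') AS parsed;   -- 2015-04-28

-- Rows whose value does not parse as a date (here, month 13) should come back NULL.
SELECT id, date
FROM sample_demo
WHERE STR_TO_DATE(LPAD(date, 8, '0'), '%m%d%Y') IS NULL;
```

The same expression could later populate a real DATE column, which removes the need for hand-written range checks.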

I stupidly made a row with dates as VARCHAR, can I still do date based selects on it?

So I have a large table that simply cannot be altered without breaking my PHP app. Stupidly (yes I know), I made a start date and end date as VARCHAR with data stored as '03/04/2013' for example.
Now I need to be able to search to see which rows are currently 'active', meaning which rows have a start date before today AND an end date after today.
Is this at all possible with an SQL query?

Action plan to migrate VARCHAR columns to DATE without breaking the application:
Create new indexed DATE columns and fill them with the respective values in the VARCHAR columns:
-- new column
ALTER TABLE MY_TABLE ADD `NEW_DATE_COLUMN` DATE;
-- index
CREATE INDEX `MY_TABLE_NEW_DATE_IDX` ON MY_TABLE(`NEW_DATE_COLUMN`);
-- initial values
UPDATE MY_TABLE
SET `NEW_DATE_COLUMN` = STR_TO_DATE(`VARCHAR_DATE`, '%d/%m/%Y')
WHERE `NEW_DATE_COLUMN` IS NULL;
Create insert / update triggers to cast your VARCHAR columns to DATE and update your new DATE columns with their respective values:
-- triggers
DELIMITER //
CREATE TRIGGER `MY_TABLE_VARCHAR_DATE_BI` BEFORE INSERT ON MY_TABLE
FOR EACH ROW
BEGIN
IF NEW.`NEW_DATE_COLUMN` IS NULL AND NEW.`VARCHAR_DATE` IS NOT NULL THEN
SET NEW.NEW_DATE_COLUMN = STR_TO_DATE(NEW.`VARCHAR_DATE`, '%d/%m/%Y');
END IF;
END;
//
CREATE TRIGGER `MY_TABLE_VARCHAR_DATE_BU` BEFORE UPDATE ON MY_TABLE
FOR EACH ROW
BEGIN
IF NEW.`NEW_DATE_COLUMN` IS NULL AND NEW.`VARCHAR_DATE` IS NOT NULL THEN
SET NEW.NEW_DATE_COLUMN = STR_TO_DATE(NEW.`VARCHAR_DATE`, '%d/%m/%Y');
END IF;
END;
//
DELIMITER ;
Use the DATE columns in your queries:
SELECT * FROM MY_TABLE
WHERE `NEW_DATE_COLUMN` BETWEEN
CURRENT_DATE AND DATE_ADD(CURRENT_DATE, INTERVAL 1 DAY);
Take your time and update your application to get rid of the places that use the original VARCHAR columns directly; meanwhile, nothing will be broken.
When you are done remove the triggers and the VARCHAR columns:
DROP TRIGGER `MY_TABLE_VARCHAR_DATE_BI`;
DROP TRIGGER `MY_TABLE_VARCHAR_DATE_BU`;
ALTER TABLE MY_TABLE DROP `VARCHAR_DATE`;
Working SQL Fiddle.

Yes you can do that.
Try something like this:-
select date_format(str_to_date('03/04/2013', '%d/%m/%Y'), '%Y%m');
or maybe this (with month and day swapped, since I can't tell whether 03 or 04 is the month):-
select date_format(str_to_date('03/04/2013', '%m/%d/%Y'), '%Y%m');
Or you may also try to convert your column to dates like this (the format string must match the stored text, which uses slashes):
UPDATE `table`
SET `column` = str_to_date( `column`, '%d/%m/%Y' );

Use STR_TO_DATE as follows:
WHERE STR_TO_DATE(start, '%d/%m/%Y') < DATE(NOW())
AND STR_TO_DATE(end, '%d/%m/%Y') > DATE_ADD(DATE(NOW()), INTERVAL 1 DAY)
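
As a self-contained illustration of the same idea, here is a sketch using a hypothetical table promo_demo with day-first dates stored as text (the table and column names are assumptions, and the question does not say whether its dates are day-first or month-first). The conversion runs per row, so an index on the VARCHAR columns cannot be used for this filter.

```sql
-- Hypothetical table with dates stored as 'dd/mm/yyyy' text.
CREATE TABLE promo_demo (
    id         INT PRIMARY KEY,
    start_date VARCHAR(10) NOT NULL,
    end_date   VARCHAR(10) NOT NULL
);
INSERT INTO promo_demo (id, start_date, end_date) VALUES
    (1, '01/01/2000', '31/12/2099'),   -- still active
    (2, '01/01/2000', '31/12/2000');   -- already ended

-- "Active" rows: started before today and ending after today.
SELECT id
FROM promo_demo
WHERE STR_TO_DATE(start_date, '%d/%m/%Y') < CURDATE()
  AND STR_TO_DATE(end_date,   '%d/%m/%Y') > CURDATE();
```

Run before the end of 2099, this should return only the row with id 1.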

UPDATE `table`
SET `yourColumn` = str_to_date( `yourColumn`, '%d/%m/%Y' );
Convert the column to a date type without losing your data, adding minutes or seconds if you need them. It will be easier in the long run than handling the strings in PHP.
Or create a new DATE column populated from the VARCHAR one.

adding '1' returns "BLOB"

A JobID goes as follows: ALC-YYYYMMDD-001. The first three characters are a company's initials; the last three are an incrementing number that resets daily and increments throughout the day as jobs are added, for a maximum of 999 jobs in a day. It is these last three that I am trying to work with.
I am trying to get a before-insert trigger to look for the max JobID of the day, and add one so I can have the trigger derive the proper JobID. For the first job, it will of course return null. So here is what I have so far.
Through the following I can get a result of '000'.
set @maxjobID =
(select SUBSTRING(
(Select MAX(
SUBSTRING((Select JobID FROM jobs WHERE SUBSTRING(JobID,5,8)=date_format(curdate(), '%Y%m%d')),4,12)
)
),14,3)
);
select lpad((select ifnull(@maxjobID,0)),3,'0')
But I really need to add one to this, keeping the leading zeros, to increment the first and subsequent jobs of the day. My problem is that as soon as I try to add '1' I get a return of 'BLOB'. That is:
select lpad((select ifnull(@maxjobID,0)+1),3,'0')
returns 'BLOB'
I need it to return '001' so I can concatenate that result with the CO initials and the current date.

Try casting the VARCHAR to an integer before adding, and the result back to CHAR before padding:
SELECT LPAD(CAST(COALESCE(CAST(@maxjobID AS SIGNED), 0) + 1 AS CHAR), 3, '0');

If you're using the MyISAM storage engine, you can implement exactly this with AUTO_INCREMENT, without denormalising your data into a delimited string:
For MyISAM tables, you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
In your case:
Normalise your schema:
ALTER TABLE jobs
ADD initials CHAR(3) NOT NULL FIRST,
ADD date DATE NOT NULL AFTER initials,
ADD seq SMALLINT(3) UNSIGNED NOT NULL AFTER date
;
Normalise your existing data:
UPDATE jobs SET
initials = SUBSTRING_INDEX(JobID, '-', 1),
date = STR_TO_DATE(SUBSTRING(JobID, 5, 8), '%Y%m%d'),
seq = SUBSTRING_INDEX(JobID, '-', -1)
;
Set up the AUTO_INCREMENT:
ALTER TABLE jobs
DROP PRIMARY KEY,
DROP JobID,
MODIFY seq SMALLINT(3) UNSIGNED NOT NULL AUTO_INCREMENT,
ADD PRIMARY KEY(initials, date, seq)
;
You can then recreate your JobID as required on SELECT (or even create a view from such a query):
SELECT CONCAT_WS(
'-',
initials,
DATE_FORMAT(date, '%Y%m%d'),
LPAD(seq, 3, '0')
) AS JobID,
-- etc.
If you're using InnoDB, whilst you can't generate sequence numbers in this fashion I'd still recommend normalising your data as above.
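
The per-group numbering described above can be tried in isolation. This is a sketch on a hypothetical table jobs_demo (not the asker's jobs table); job_date is used as the column name here purely for readability.

```sql
-- MyISAM only: AUTO_INCREMENT on the last column of a multi-column key
-- restarts numbering for each distinct (initials, job_date) prefix.
CREATE TABLE jobs_demo (
    initials CHAR(3) NOT NULL,
    job_date DATE NOT NULL,
    seq      SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (initials, job_date, seq)
) ENGINE=MyISAM;

INSERT INTO jobs_demo (initials, job_date) VALUES
    ('ALC', '2013-03-04'),
    ('ALC', '2013-03-04'),
    ('ALC', '2013-03-05');

-- Rebuild the display form of the job ID on read.
SELECT CONCAT_WS('-', initials,
                 DATE_FORMAT(job_date, '%Y%m%d'),
                 LPAD(seq, 3, '0')) AS JobID
FROM jobs_demo
ORDER BY initials, job_date, seq;
-- ALC-20130304-001
-- ALC-20130304-002
-- ALC-20130305-001
```

Note that this relies on MyISAM-specific behaviour; on InnoDB the same table definition is rejected because the AUTO_INCREMENT column must be the first column of some index.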

So, I found a query that works (thus far).
Declare maxjobID VARCHAR(16);
Declare jobincrement SMALLINT;
SET maxjobID =
(Select MAX(
ifnull(SUBSTRING(
(Select JobID FROM jobs WHERE SUBSTRING(JobID,5,8)=date_format(curdate(), '%Y%m%d')),
5,
12),0)
)
);
if maxjobID=0
then set jobincrement=1;
else set jobincrement=(select substring(maxjobID,10,3))+1;
end if;
Set NEW.JobID=concat
(New.AssignedCompany,'-',date_format(curdate(), '%Y%m%d'),'-',(select lpad(jobincrement,3,'0')));
Thanks for the responses! Especially eggyal for pointing out the auto_increment capabilities in MyISAM.
