Generate a text file based on comparison of two rows in MySQL

I have a MySQL table which contains the timestamp and direction (buy/sell signal) of stock market data.
Below are the CREATE and INSERT statements for the sample data.
The table is in descending order of timestamp, and it is truncated and repopulated at 5-minute intervals. I have included an id field which is auto-incremented, as it may help in comparing the first row with the second row.
Every time the direction of the market changes, I want a text file to be generated. As an example (from the sample data), when the timestamp was 15:00:00, since it was the first row inserted into the table, it should generate a text file named SELL.txt. At 15:05:00, since the direction changed from SELL to BUY, it should generate a text file named BUY.txt. Since the direction did not change at 15:10:00 and 15:15:00 compared to the previous row, no text file should be generated. At 15:20:00, since the direction changed from BUY to SELL, it should generate a text file named SELL.txt. Since the direction did not change at 15:25:00 and 15:30:00 compared to the previous row, no text file should be generated.
In summary, if the value of the direction field in the first row is not equal to the value of the direction field in the second row, then a text file has to be generated, named after the value in the first row. If the two values are equal, no text file should be generated.
I am assuming this can be implemented using stored procedures. However, I am new to stored procedures, and I have not been able to get this implemented so far. I would truly appreciate it if someone could help in this regard.
thanks and regards,
CREATE TABLE `tbl` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`timestamp` datetime DEFAULT NULL,
`direction` varchar(10) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `market`.`tbl`
(`id`,
`timestamp`,
`direction`)
VALUES
(1,'2020-02-24 15:30:00','BUY'),
(2,'2020-02-24 15:25:00','SELL'),
(3,'2020-02-24 15:20:00','SELL'),
(4,'2020-02-24 15:15:00','BUY'),
(5,'2020-02-24 15:10:00','BUY'),
(6,'2020-02-24 15:05:00','BUY'),
(7,'2020-02-24 15:00:00','SELL');

CREATE TRIGGER tr
AFTER INSERT
ON tbl
FOR EACH ROW
BEGIN
    IF EXISTS ( SELECT 1
                FROM tbl t1, tbl t2
                WHERE t1.`timestamp` BETWEEN CURRENT_TIMESTAMP - INTERVAL 2 MINUTE
                                         AND CURRENT_TIMESTAMP + INTERVAL 2 MINUTE
                  AND t2.`timestamp` BETWEEN CURRENT_TIMESTAMP - INTERVAL 7 MINUTE
                                         AND CURRENT_TIMESTAMP - INTERVAL 3 MINUTE
                  AND t1.direction != t2.direction ) THEN
        IF 'SELL' = ( SELECT direction
                      FROM tbl
                      ORDER BY `timestamp` DESC LIMIT 1 ) THEN
            /* SELECT 1 INTO OUTFILE 'SELL.txt'; */
            INSERT INTO service (txt) VALUES (CONCAT(CURRENT_TIMESTAMP, ' SELL'));
        ELSE
            /* SELECT 1 INTO OUTFILE 'BUY.txt'; */
            INSERT INTO service (txt) VALUES (CONCAT(CURRENT_TIMESTAMP, ' BUY'));
        END IF;
    END IF;
END
fiddle
Run the fiddle several times: you'll see that a message is generated when the directions in the last two records differ, and none is generated when the directions are the same.
The problem: each insert (except the first one) generates an insertion into the service table (and creates the OUTFILE if you uncomment it), but a second attempt to create an OUTFILE that already exists will fail, which causes the whole INSERT query to fail. You must keep some static mark that identifies that the file was already created during this INSERT, so that you do not try to create it again. A service table that stores the timestamp is a safe choice; check it with some clearance, as in the record check (+/- 2 minutes seems useful).
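The comparison rule from the question and the "static mark" guard described above can be sketched outside the database. This is a minimal Python sketch, not a stored procedure: the function names, the `last_written` parameter standing in for the mark, and the idea of deciding the file name in a client script are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence

CLEARANCE = timedelta(minutes=2)  # same +/- 2 minute window as the trigger


def changed_direction(directions: Sequence[str]) -> Optional[str]:
    """Return the newest direction if it differs from the previous row.

    `directions` is ordered newest first. A single row counts as a change,
    because the first signal should always produce a file.
    """
    if not directions:
        return None
    if len(directions) == 1 or directions[0] != directions[1]:
        return directions[0]
    return None


def should_write_file(
    directions: Sequence[str],
    last_written: Optional[datetime],
    now: datetime,
) -> Optional[str]:
    """Return a file name such as 'BUY.txt', or None if nothing should be written.

    `last_written` plays the role of the static mark (hypothetical here):
    the time at which a file was last produced.
    """
    direction = changed_direction(directions)
    if direction is None:
        return None
    if last_written is not None and abs(now - last_written) <= CLEARANCE:
        return None  # a file was already produced for this batch
    return f"{direction}.txt"
```

A caller would read the direction column ordered by timestamp descending, pass it in together with the stored mark, and update the mark only after the file has actually been written.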

Related

MySQL Variable Returning Incorrect Value

The Issue
I have a stored proc on a DB server that's bringing back a value of 5064803 when no record with that value exists; the value should be 5064800 as per the query that builds the value of the variable.
I'm not sure if this is an issue with the value being of the FLOAT data type and the value in the table record ending in a double zero, or something else, but I cannot figure it out easily.
The table data types match those of the sensors that are set, but this particular sensor's value never actually gets set to a data type. It is nearly always a 1-8 digit INT with no decimal, but I'd like to keep the data types the same as the correlated sensor just in case.
I've broken down the proc and I'm able to recreate the problem easily, so I will post the details below for those who may be able to help me figure out the issue and any workaround.
The SQL Data
Create Table
delimiter $$
CREATE TABLE `number` (
`TimeInt` varchar(10) NOT NULL,
`TimeStr` datetime NOT NULL,
`IsInitValue` int(11) NOT NULL,
`Value` float NOT NULL,
`IQuality` int(11) NOT NULL,
UNIQUE KEY `uk_Times` (`TimeInt`,`TimeStr`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8$$
Insert Data
INSERT INTO `Number` (`TimeInt`,`TimeStr`,`IsInitValue`,`Value`,`IQuality`) VALUES ('1502618950','2017-08-13 10:09:10',1,5064800,0);
INSERT INTO `Number` (`TimeInt`,`TimeStr`,`IsInitValue`,`Value`,`IQuality`) VALUES ('1502618796','2017-08-13 10:06:36',0,5064800,3);
INSERT INTO `Number` (`TimeInt`,`TimeStr`,`IsInitValue`,`Value`,`IQuality`) VALUES ('1502617167','2017-08-13 09:39:27',1,5063310,0);
INSERT INTO `Number` (`TimeInt`,`TimeStr`,`IsInitValue`,`Value`,`IQuality`) VALUES ('1502613355','2017-08-13 08:35:55',0,5063310,3);
INSERT INTO `Number` (`TimeInt`,`TimeStr`,`IsInitValue`,`Value`,`IQuality`) VALUES ('1502612814','2017-08-13 08:26:54',1,0,0);
INSERT INTO `Number` (`TimeInt`,`TimeStr`,`IsInitValue`,`Value`,`IQuality`) VALUES ('1502609015','2017-08-13 07:23:35',0,0,3);
The SQL Query Breakdown
SET @bStartTime = '2017-08-13 09:24:16';
SET @bEndTime = '2017-08-13 10:06:31';
SET @LastNumber = (SELECT Value FROM Number ORDER BY TimeStr DESC LIMIT 1);
SET @NowNumber = (SELECT Value FROM Number WHERE TimeStr BETWEEN @bStartTime AND @bEndTime ORDER BY TimeStr DESC LIMIT 1);
SELECT @NowNumber;
SELECT @LastNumber;
Recreating the Issue
So, based on The SQL Query Breakdown above, once all the data is in the table, if I run the SELECT queries inside the @NowNumber and/or @LastNumber assignments on their own, I get the correct result of 5064800. However, if I run the full SET statements for both and then SELECT those variables, I get the wrong result of 5064803.
For example, if I run SELECT Value FROM Number ORDER BY TimeStr DESC LIMIT 1 then the correct value is returned. If I run SET @LastNumber = (SELECT Value FROM Number ORDER BY TimeStr DESC LIMIT 1); and then run SELECT @LastNumber; I get the incorrect value returned.
Server System Specs
This particular MySQL Server is running the x86 version of 5.5.50 on Windows Server 2008 with 144 GB of RAM for some quick specs.
Question
I'd like to know what is causing this, and if there is a workaround to the problem either with or without changing the data type of the column assuming that's the issue when it's returned as a variable rather than just a straight query result.
I'll be happy to disclose more technical specs of the environment if needed, but I've included what I think is important for the question. Perhaps this is a version bug, or there's something obvious causing this that I cannot see, so I'm hoping someone can help me with this or explain why this is or is not possible with MySQL.
Sorry, declares can only be used in stored procedures in MySQL. I found this article which may help. It explains how MySQL rounds when storing digits and recommends using doubles. Try changing your floats to doubles.
MySql FLOAT datatype and problems with more then 7 digit scale
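To make the precision point concrete, the sketch below round-trips values through IEEE 754 single precision with Python's `struct` module, which is the storage format I assume for a MySQL FLOAT. It also rounds a value to six significant digits; my understanding is that a FLOAT column is displayed with roughly that precision, so treat that part as an assumption rather than a confirmed diagnosis of this case. The helper names are made up for illustration.

```python
import struct


def to_float32(x: float) -> float:
    """Round-trip a Python float (a double) through 4-byte single precision."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def six_significant_digits(x: float) -> float:
    """Round to six significant digits, as a six-digit display would."""
    return float(f"{x:.6g}")


# Whole numbers below 2**24 survive single precision unchanged ...
print(to_float32(5064803.0))              # 5064803.0
# ... but a six-significant-digit rendering of that value hides the last digit.
print(six_significant_digits(5064803.0))  # 5064800.0
# Above 2**24, single precision can no longer hold every integer.
print(to_float32(16777217.0))             # 16777216.0
```

If the stored value were in fact 5064803, a direct SELECT could plausibly show 5064800 while a variable holding the same value as a double shows 5064803, which would match the symptom described. That is a hypothesis to check against the real data, not a verified explanation.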

track change in sql row for update

I have a Java program that updates a SQL table (id, name, status). The entire table is rewritten with either the same data or slightly changed data. How can I tell whether a row is the same as it was before the update or has been modified? The id will always be the same; only the name may change, for example because of a small typo. I just want to check on the next update whether the name was modified, in which case the status field should change from 'same' to 'modified'. Will a timestamp solve my issue? Please help.
1 - If you are looking to audit the table (inserts, updates, deletes), look at my "How to prevent unwanted transactions" slide deck with code - http://craftydba.com/?page_id=880.
SEE CODE AT END!
The trigger that fills the audit table can hold information from multiple tables since the data is saved as XML. Therefore, you can un-delete if necessary. It tracks who and what made the change.
2 - If you are never going to purge the data from the audit table, why not mark the row as deleted but keep it forever?
Many systems like PeopleSoft use effective dating to show if a record is no longer active. In the BI world this is called a type 2 dimension table (slowly changing dimension).
See the data warehouse institute article. http://www.bidw.org/datawarehousing/scd-type-2/
Each record has a begin and end date. All active records have an end date of null.
3 - Microsoft SQL Server introduced the change data capture feature. While this tracks data changes using a log reader after the fact, it lacks things like who and what made the change.
Again, all the above solutions work. I am partial to my solution!
Sincerely
John
The Crafty DBA
--
-- 7 - Auditing data changes (table for DML trigger)
--
-- Delete existing table
IF OBJECT_ID('[AUDIT].[LOG_TABLE_CHANGES]') IS NOT NULL
DROP TABLE [AUDIT].[LOG_TABLE_CHANGES]
GO
-- Add the table
CREATE TABLE [AUDIT].[LOG_TABLE_CHANGES]
(
[CHG_ID] [numeric](18, 0) IDENTITY(1,1) NOT NULL,
[CHG_DATE] [datetime] NOT NULL,
[CHG_TYPE] [varchar](20) NOT NULL,
[CHG_BY] [nvarchar](256) NOT NULL,
[APP_NAME] [nvarchar](128) NOT NULL,
[HOST_NAME] [nvarchar](128) NOT NULL,
[SCHEMA_NAME] [sysname] NOT NULL,
[OBJECT_NAME] [sysname] NOT NULL,
[XML_RECSET] [xml] NULL,
CONSTRAINT [PK_LTC_CHG_ID] PRIMARY KEY CLUSTERED ([CHG_ID] ASC)
) ON [PRIMARY]
GO
-- Add defaults for key information
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES] ADD CONSTRAINT [DF_LTC_CHG_DATE] DEFAULT (getdate()) FOR [CHG_DATE];
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES] ADD CONSTRAINT [DF_LTC_CHG_TYPE] DEFAULT ('') FOR [CHG_TYPE];
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES] ADD CONSTRAINT [DF_LTC_CHG_BY] DEFAULT (coalesce(suser_sname(),'?')) FOR [CHG_BY];
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES] ADD CONSTRAINT [DF_LTC_APP_NAME] DEFAULT (coalesce(app_name(),'?')) FOR [APP_NAME];
ALTER TABLE [AUDIT].[LOG_TABLE_CHANGES] ADD CONSTRAINT [DF_LTC_HOST_NAME] DEFAULT (coalesce(host_name(),'?')) FOR [HOST_NAME];
GO
--
-- 8 - Make DML trigger to capture changes
--
-- Delete existing trigger
IF OBJECT_ID('[ACTIVE].[TRG_FLUID_DATA]') IS NOT NULL
DROP TRIGGER [ACTIVE].[TRG_FLUID_DATA]
GO
-- Add trigger to log all changes
CREATE TRIGGER [ACTIVE].[TRG_FLUID_DATA] ON [ACTIVE].[CARS_BY_COUNTRY]
FOR INSERT, UPDATE, DELETE AS
BEGIN
-- Detect inserts
IF EXISTS (select * from inserted) AND NOT EXISTS (select * from deleted)
BEGIN
INSERT [AUDIT].[LOG_TABLE_CHANGES] ([CHG_TYPE], [SCHEMA_NAME], [OBJECT_NAME], [XML_RECSET])
SELECT 'INSERT', '[ACTIVE]', '[CARS_BY_COUNTRY]', (SELECT * FROM inserted as Record for xml auto, elements , root('RecordSet'), type)
RETURN;
END
-- Detect deletes
IF EXISTS (select * from deleted) AND NOT EXISTS (select * from inserted)
BEGIN
INSERT [AUDIT].[LOG_TABLE_CHANGES] ([CHG_TYPE], [SCHEMA_NAME], [OBJECT_NAME], [XML_RECSET])
SELECT 'DELETE', '[ACTIVE]', '[CARS_BY_COUNTRY]', (SELECT * FROM deleted as Record for xml auto, elements , root('RecordSet'), type)
RETURN;
END
-- Detect updates (logs the pre-update rows from deleted)
IF EXISTS (select * from inserted) AND EXISTS (select * from deleted)
BEGIN
INSERT [AUDIT].[LOG_TABLE_CHANGES] ([CHG_TYPE], [SCHEMA_NAME], [OBJECT_NAME], [XML_RECSET])
SELECT 'UPDATE', '[ACTIVE]', '[CARS_BY_COUNTRY]', (SELECT * FROM deleted as Record for xml auto, elements , root('RecordSet'), type)
RETURN;
END
END;
GO
SQL Server doesn't have versioning at the table level. If you want to track the difference between two field values, you have at least two options:
Control from your Java application - do a pre-update check in your update method.
Control from SQL Server - write a trigger that does the pre-update check.
You can also create a supplementary field where a version number is kept.
Yes, you can use a timestamp.
By using a timestamp you can find the latest entry in the table: ORDER BY the timestamp in the query to get the latest row, and set the status based on the corresponding value.
It depends on how much information you need. If all you care about is whether the record has ever been modified, you can use created_when and updated_when fields. If the latter is greater than the former, it's been updated.
If you want to know what fields have been updated, you have to log the changes. The details depend on your requirements. If you need to log changes, a trigger is the best way to do it.
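The pre-update check suggested in the answers can be sketched as a comparison of the stored name with the incoming one. This is a minimal Python sketch with an in-memory dictionary standing in for the table; `apply_update`, the `Row` alias, and the choice to mark previously unseen ids as 'same' are assumptions for illustration, and the same comparison could equally live in the Java update method or in a trigger.

```python
from typing import Dict, Tuple

Row = Tuple[str, str]  # (name, status)


def apply_update(table: Dict[int, Row], incoming: Dict[int, str]) -> None:
    """Overwrite names in place and set status to 'modified' or 'same'."""
    for row_id, new_name in incoming.items():
        old = table.get(row_id)
        if old is None:
            # Assumption: an id that was not present before starts out as 'same'.
            table[row_id] = (new_name, "same")
            continue
        # Compare before overwriting; afterwards the old name is gone.
        status = "modified" if old[0] != new_name else "same"
        table[row_id] = (new_name, status)


table = {1: ("Alice", "same"), 2: ("Bob", "same")}
apply_update(table, {1: "Alice", 2: "Bobb"})
print(table[2])  # ('Bobb', 'modified')
```

A timestamp alone would only tell you that a row was written, not whether its content differs, which is why the sketch compares values.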

how to store date default values to table.but it takes null values

I am new to SQL Server 2008.
I created a table as:
-- Table structure for [xyz]
-- ----------------------------
DROP TABLE [xyz]
GO
CREATE TABLE [xyz] (
[abc] DATETIME DEFAULT GETDATE() NOT NULL
)
On insert, the date value is stored as: 2013-08-07 00:00:00.000
I want the stored value to include the current time as well.
You can also use the time stamp for current time for each inserted record.
DROP TABLE [xyz]
GO
CREATE TABLE [xyz] (
[abc] DATETIME default CURRENT_TIMESTAMP
)
Try something like this (constraint_name is a placeholder for the name you choose):
ALTER TABLE myTable ADD CONSTRAINT constraint_name DEFAULT GETDATE() FOR myColumn;
Your Java code is not including the time. The default value only applies when you do not specify a value; it does not "magically" add the time component to the date you pass in. If you want the database to add the time, you'll have to create an UPDATE/INSERT trigger that adds the current time to the date that's passed in.
However, I would just update your Java code (if you can) to include the time.
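The application-side fix amounts to attaching a time of day to the date-only value before it is sent to the database. Here is a minimal sketch in Python rather than Java; `with_current_time` is a hypothetical helper, and the injectable `now` parameter exists only to make the example reproducible.

```python
from datetime import date, datetime
from typing import Optional


def with_current_time(d: date, now: Optional[datetime] = None) -> datetime:
    """Attach a time of day to a date-only value.

    `now` can be injected for testing; by default the system clock is used.
    """
    if now is None:
        now = datetime.now()
    return datetime.combine(d, now.time())


fixed_now = datetime(2013, 8, 7, 14, 30, 5)
print(with_current_time(date(2013, 8, 7), fixed_now))  # 2013-08-07 14:30:05
```

Passing only the date is what produces values like 2013-08-07 00:00:00.000: the time part defaults to midnight because nothing else was supplied.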

SSIS Inserts not inserting the computed columns

I am using SSIS to insert an Excel file into a SQL Server table. I believe it uses bulk insert, and as a result it doesn't insert into the 'CreationDate' and 'ModificationDate' columns (both of which are computed columns with getdate() as the default).
Is there a way to get around this problem?
Also, just to be clear: neither of these date columns is part of the Excel file. Here is the exact scenario:
My Excel file has two columns, code and description. My SQL Server table has 4 columns: Code, Description, CreationDate, ModificationDate.
So, when SSIS copies the data, it copies Code and Description, but CreationDate and ModificationDate (which are SQL Server computed columns) are both empty.
You should use a normal column with a default constraint if you want to log creation.
A computed column defined as GETDATE() will change every time you query it.
It is also impossible for a computed column to not be populated.
So, assuming you mean "normal column with default", you need to stop sending NULL from SSIS, which overrides the default.
This is all demonstrated here:
CREATE TABLE #foo (
bar int NOT NULL,
testCol1Null datetime NULL DEFAULT GETDATE(),
testCol1NotNull datetime NOT NULL DEFAULT GETDATE(),
testCol2 AS GETDATE()
);
INSERT #foo (bar, testCol1Null) VALUES (1, NULL);
SELECT * FROM #foo;
WAITFOR DELAY '00:00:00.100';
SELECT * FROM #foo;
WAITFOR DELAY '00:00:00.100';
SELECT * FROM #foo;
DROP TABLE #foo;
Assuming you are using the Bulk Insert Task in SSIS, then you need to set "Keep nulls = off/unchecked" in the options page
You should have a default constraint on the column(s) that specifies getdate():
col1 datetime default getdate()
There should also be an option for the bulk insert KEEPNULLS which should be turned off.
From Bulk Insert on MSDN:
Specifies that empty columns should retain a null value during the bulk-import operation, instead of having any default values for the columns inserted. For more information, see Keeping Nulls or Using Default Values During Bulk Import.
KEEPNULLS is also documented: http://msdn.microsoft.com/en-us/library/ms187887.aspx
Put in a Derived Column in your dataflow and populate the two missing columns with the values you want.
The value of a computed column doesn't physically exist in the database; it is calculated every time SQL Server needs to access it, which is why you can't supply a value for it on an insert.
What you need is a column with a default, that is, a column whose default value is inserted if you don't supply any other value.
CreationDate datetime default getdate()
ModificationDate datetime default getdate()
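The difference between omitting a column and sending NULL for it can be tried out quickly. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for SQL Server, so the syntax is SQLite's (CURRENT_TIMESTAMP instead of getdate()) and the `codes` table is hypothetical; the point it illustrates is the same one made in the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE codes ("
    " code TEXT, description TEXT,"
    " creation_date TEXT DEFAULT CURRENT_TIMESTAMP)"
)
# Column omitted from the INSERT: the default fills it in.
conn.execute("INSERT INTO codes (code, description) VALUES ('A', 'omitted')")
# Explicit NULL: the default is bypassed and NULL is stored.
conn.execute(
    "INSERT INTO codes (code, description, creation_date) VALUES ('B', 'null', NULL)"
)
rows = dict(conn.execute("SELECT code, creation_date FROM codes"))
print(rows["A"] is not None)  # True
print(rows["B"] is None)      # True
```

This is why the bulk load has to stop sending NULL for those columns, or populate them itself, before the default can take effect.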

Populating a time dimension table automatically

I am currently working on a reporting project. In my data warehouse I need a dimension table "Time" containing all dates (since 01-01-2011, maybe?) in the format yyyy-mm-dd, and which grows automatically every day.
I'm using MySQL on Debian by the way.
thanks
JT
You can add a DATE field and use a query like this -
INSERT INTO table(date_column, column1, column2)
VALUES(DATE(NOW()), 'value1', 'value2');
Also, you can add a TIMESTAMP column with ON UPDATE CURRENT_TIMESTAMP; in this case the date-time value will be updated automatically.
Automatic Initialization and Updating for TIMESTAMP
See this answer
Or This one
There are a number of suggestions there. If your date range is going to be moderate, perhaps a year or two, and assuming your report uses a stored procedure to return the results, you could just create a temporary table on the fly using a rownum technique with limit to get you all of the dates in the range. Then join with your data as required.
Failing that the Union trick in the second answer seems to perform well according to the comments and can be extended to whatever maximum range you will need. It's very messy though!
This article seems to cover what you want. See also this question for another example of the columns you might want to have on your table. You should definitely generate a large amount of dates in advance instead of updating the table daily; it saves a lot of work and complications. 100 years are only ~36500 rows, which is a small table.
Temporary tables or procedural code are not good solutions for a data warehouse, because you want your reporting tool to be able to access the dimension tables. And if your RDBMS has optimizations for star schema queries (I don't know if MySQL does or not) then it would need to see the dimension too.
Here is what I am using to create and populate time dimension table:
DROP TABLE IF EXISTS time_dimension;
CREATE TABLE time_dimension (
id INTEGER PRIMARY KEY, -- year*10000+month*100+day
db_date DATE NOT NULL,
year INTEGER NOT NULL,
month INTEGER NOT NULL, -- 1 to 12
day INTEGER NOT NULL, -- 1 to 31
quarter INTEGER NOT NULL, -- 1 to 4
week INTEGER NOT NULL, -- 1 to 52/53
day_name VARCHAR(9) NOT NULL, -- 'Monday', 'Tuesday'...
month_name VARCHAR(9) NOT NULL, -- 'January', 'February'...
holiday_flag CHAR(1) DEFAULT 'f' CHECK (holiday_flag in ('t', 'f')),
weekend_flag CHAR(1) DEFAULT 'f' CHECK (weekend_flag in ('t', 'f')),
UNIQUE td_ymd_idx (year,month,day),
UNIQUE td_dbdate_idx (db_date)
) Engine=MyISAM;
DROP PROCEDURE IF EXISTS fill_date_dimension;
DELIMITER //
CREATE PROCEDURE fill_date_dimension(IN startdate DATE,IN stopdate DATE)
BEGIN
DECLARE currentdate DATE;
SET currentdate = startdate;
WHILE currentdate <= stopdate DO
INSERT INTO time_dimension VALUES (
YEAR(currentdate)*10000+MONTH(currentdate)*100 + DAY(currentdate),
currentdate,
YEAR(currentdate),
MONTH(currentdate),
DAY(currentdate),
QUARTER(currentdate),
WEEKOFYEAR(currentdate),
DATE_FORMAT(currentdate,'%W'),
DATE_FORMAT(currentdate,'%M'),
'f',
CASE DAYOFWEEK(currentdate) WHEN 1 THEN 't' WHEN 7 then 't' ELSE 'f' END
);
SET currentdate = ADDDATE(currentdate,INTERVAL 1 DAY);
END WHILE;
END
//
DELIMITER ;
TRUNCATE TABLE time_dimension;
CALL fill_date_dimension('1800-01-01','2050-01-01');
OPTIMIZE TABLE time_dimension;
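For checking the key encoding and the row counts before loading, the same rows can be generated outside MySQL. This is a minimal Python sketch that mirrors part of the procedure above; `date_dimension_rows` is a hypothetical name, the name columns and holiday flag are left out for brevity, and I am assuming WEEKOFYEAR follows ISO week numbering, which is what `isocalendar()` returns.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple

DimRow = Tuple[int, date, int, int, int, int, int, str]


def date_dimension_rows(start: date, stop: date) -> Iterator[DimRow]:
    """Yield one row per day from start to stop, both inclusive."""
    current = start
    while current <= stop:
        yield (
            current.year * 10000 + current.month * 100 + current.day,  # id
            current,
            current.year,
            current.month,
            current.day,
            (current.month - 1) // 3 + 1,            # quarter, 1 to 4
            current.isocalendar()[1],                # ISO week number
            "t" if current.weekday() >= 5 else "f",  # Saturday or Sunday
        )
        current += timedelta(days=1)


rows = list(date_dimension_rows(date(2011, 1, 1), date(2011, 12, 31)))
print(len(rows))   # 365
print(rows[0][0])  # 20110101
```

The row count is what makes pre-generating practical: one row per day means a full century is still only a few tens of thousands of rows.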