How to use the same SSIS Data Flow with different Date Values? - ssis

I have a very straightforward SSIS package containing one data flow which is comprised of an OLEDB source and a flat file destination. The OLEDB source calls a query that takes 2 sets of parameters. I've mapped the parameters to Date/Time variables.
I would like to know how best to pass 4 different sets of dates to the variables and use those values in my query?
I've experimented with the For Each Loop Container using an item enumerator. However, that does not seem to work and the package throws a System.IO.IOException error.
My container is configured as follows:
Note that both variables are of the Date/Time data type.
How can I pass 4 separate value sets to the same variables and use each variable pair to run my data flow?

Setup
I created a table and populated it with contiguous data for your sample set
DROP TABLE IF EXISTS dbo.SO_67439692;
CREATE TABLE dbo.SO_67439692
(
SurrogateKey int IDENTITY(1,1) NOT NULL
, ActionDate date
);
INSERT INTO
dbo.SO_67439692
(
ActionDate
)
SELECT
TOP (DATEDIFF(DAY, '2017-12-31', '2021-04-30'))
DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), '2017-12-31') AS ActionDate
FROM
sys.all_columns AS AC;
In my SSIS Package, I added two Variables, startDate and endDAte2018 both of type Date Time. I added an OLE DB Connection manager pointed to the database where I made the above tables.
I added a Foreach Item Enumerator, configured it for Item Enumerator and defined the columns there as datetime as well
I populated it (what a clunky editor) with the year ranges from 2018 to 2020 as shown and 2021-01-01 to 2021-04-30.
I wired the variables up as shown in the problem definition and ran it as is. No IO error reported.
Once I knew my foreach container was working, the data flow was trivial.
I added a data flow inside the foreach loop with an OLE DB Source using a parameterized query like so
DECLARE #StartDate date, #EndDate date;
SELECT #StartDate = ?, #EndDate = ?;
SELECT *
FROM
dbo.SO_67439692 AS S
WHERE
S.ActionDate >= #StartDate AND S.ActionDate <= #EndDate;
I mapped my two variables in as parameter names of 0 and 1 and ran it.
The setup you described works great. Either there is more to your problem than stated or there's something else misaligned. Follow along with my repro and compare it to what you've built and you should see where things are "off"

Related

SSIS type 2 scd with batch ID

I want to modify standard SSIS SCD behavior.
EmployeeID is my business key and title, firstname, lastname are type 2 attributes.
I want BatchLogID to reflect when a change occurred - otherwise it remains unchanged.
BatchLogID is passed to dataflow as an int
EmployeeID,title,firstname,lastname,BatchLogID,startdate,enddate
source data
101,Miss,Jane,Smith,101 -- inserted for first time
101,Miss,Jane,Smith,102 process runs
101,Miss,Jane,Smith,103 process runs
101,Miss,Jane,Smith,104 process runs
101,Mrs, Jane,Brown,105 process runs -- only when data has changed do I want the Batch number in target updated
target data
101,Miss,Jane,Smith,101,101,1 jan 2000,null-- inserted for first time
101,Miss,Jane,Smith,105,105,1 jan 2000,5 Jan 2000 -- as a change is detected the data is updated
101,Mrs, Jane,Brown,105,105 jan 2000,null-- only when data has changed to I want the Batch number updated
any thoughts?
You may need to perform delta load using :
Lookup and derived column
Merge join, Conditional Split and derived column
Change Data Capture
Temporal tables (if the RDBMS supports that)
To know more :
https://www.c-sharpcorner.com/article/design-the-full-load-and-delta-load-patterns-in-ssis/
Used a sql merge command - had to wash through twice for
declare #batchLogID int= 1
MERGE dbo.targetTable AS t
USING dbo.sourceTable AS s `enter code here`
ON (t.[key] = s.[key] and t.endDate is null)
WHEN MATCHED and s.[value] <> t.[value]
THEN UPDATE SET t.enddate = dateadd(ss,-1,cast(cast(getdate() as date) as datetime))
WHEN not MATCHED
THEN INSERT (key,[col1], [col2], [value], [col3],startdate,BatchLogID)
VALUES (s.key,s.[col1], s.[col2], s.[value], s.[col3],cast(getdate() as date),#batchLogID)

How to compare the two table row count , if counts matches than ok if not matches this will restart the SSIS package

I have made the ssis package in which i made the data flow for incremental data. Source and destination server ip's are different. Below you can find the flow diagram of my packageControl flow diagram
Data flow diagram
the package is working fine .
In the Execute SQl task :- it controls the log table and start the incremental task
query which i used is :-
insert into audit_log (
Packagename,
process_date,
start_datetime,
end_datetime,
Record_processed,
status
)values('CRM-TO-TRANSORGDB',null,GETDATE(),null,null,null);
select MAX(ID) as ID,MAX(process_date) as proc_date from audit_log where Packagename ='CRM-TO-TRANSORGDB' ;
store the ID and proc_date in the variable.
in the Execute SQl task 1:- it just update the log table.
UPDATE audit_log
SET
process_date=?,
end_datetime = GETDATE(),
status='SUCCESS'
record_processed=?
WHERE (packagename = 'CRM-TO-TRANSORGDB') AND ID=? ;
this is the query we have used to update the log table.
In the Data flow simple fetching the all the records and put in into the destination table.
this all i have done .
But my question are:-
1) How to compare the total no. of row counts from the source table to destination table in ssis package.
2) if its doesn't matches than it will restart my task automatically.
#thomas as per your instruction i have done the following thing:
1) i have made the Execute SQl Task for source and destination .
2) and Add the Execute Package task and added the condition for not matching the count.
and added the expression for check row_count_src!= row_count_dest
and in Source_table_count i have used the below query:
select count(SubOrderID) as row_count_src from fact_suborder_journey
WHERE Suborderdate between '2016-06-01' and GETDATE()-1 ;
in dest_table_count i have used the below query:
select count(SubOrderID) as row_count_dest from fact_suborder_journey
WHERE Suborderdate between '2016-06-01' and GETDATE()-1 ;
i have added the two variable as int64 in ths ssis package. and map in the result set below you can find the pic what i have done.
but After done all this this i am getting this error:
[Execute SQL Task] Error: An error occurred while assigning a value to variable "row_count_src": "The type of the value being assigned to variable "User::row_count_src" differs from the current variable type. Variables may not change type during execution. Variable types are strict, except for variables of type Object.
".
I havent tested this completely but you might be able to do something like this. This creates a loop of your packages and will executes as long as your count variables are different from each other.
What have i done?
First i have a DataFlow Task which moves data from source to
destination.
Then i have an Execute SQL task which basically counts all rows from
TableA and maps it to variable count1 eg. Source table
Then i have an Execute SQL task which basically counts all rows from
TableB and maps it to variable count2 eg. Destination Table
Then i create an Execute Package task where i reference it too it
self. Then i make a precedence constraint with an expression saying
Count1 != count2.
Because if they are different you want to restart the task. If they
are equal the last task Execute Package task will never be executed.
Hope that is something like that?
If I understand your challenge correctly...
In the data flow task, use a RowCount transformation between source
and destination to capture the rows written to the destination. This
will be stored in a variable.
In the control flow, get the max row counts available from the log table and store that a variable.
Create an execute package tasks that executes this same package and put a precedence constraint before if that compares if variable from Step1 <> variable in Step2.

SSIS 2012: How do I change the project's environment variables?

I want to change 2 date project variables, StartDateTime and EndDateTime. I have another variable called RunType. Very simply, I want to read the value RunType first. If it is set to "Incremental" I want to change the StartDate from whatever it is to 12:00:00 AM yesterday and the EndDate to 11:59:59 PM yesterday. The errors I get from trying to write back the values seem to indicate writes are out on project level variables. Is this true--or is there something I need to do differently when dealing with these project level variables? I thought of creating package level variables, a control table, blah blah... It seems like overkill for this.
When I test the package manually by changing the parameter values under the integration service catalog/environments--I get the range I expect. This package will run via sql agent job. Is there a pre-ssis step I can create and execute to do this simple task outside of ssis?
While the values may be stored in internal.object_parameters, resist the temptation to edit the values in those tables directly. Instead use the supplied methods for manipulation. In this case, the stored procedure catalog.set_environment_variable_value.
Below is a sample of how you can programmatically change the value of an environment variable. I would see this as being a TSQL jobstep in SQL Agent job that runs prior to your package launching to ensure the correct values for StartDateTime and EndDateTime are set as expected.
DECLARE #var sql_variant;
DECLARE
#StartDateTime date
, #EndDateTime datetime
, #RunType bit = 1;
SELECT
#StartDateTime = CAST(dateadd(d, -1, CURRENT_TIMESTAMP) AS date)
, #EndDateTime = DATEADD(s, -1, cast(cast(current_timestamp AS date) AS datetime))
SELECT #StartDateTime, #EndDateTime;
IF (#RunType = 1)
BEGIN
SET #var = #StartDateTime;
EXECUTE [SSISDB].[catalog].[set_environment_variable_value]
#variable_name=N'StartDateTime'
, #environment_name=N'MyEnvironmentName'
, #folder_name=N'MyFolder'
, #value=#var;
SET #var = #EndDateTime;
EXECUTE [SSISDB].[catalog].[set_environment_variable_value]
#variable_name=N'EndDateTime'
, #environment_name=N'MyEnvironmentName'
, #folder_name=N'MyFolder'
, #value=#var;
END
ELSE
BEGIN
PRINT 'Logic goes here to handle the other conditions for RunType'
END

Table valued parameters for SSRS 2008

We have a requirement of generating SSRS reports from where we need to convert multi-valued string and integer parameters to datatable and pass it to stored procedure. The stored procedure contains multiple table type parameters. Earlier we used varchar(8000) but it was also crossing the datatype limit. Then we thought to introducing datatable concept. But we were not aware of how to pass values from SSRS.
We found a solution from GruffCode on Using Table-Valued Parameters With SQL Server Reporting Services.
The solution solved my problem, and we're able to generate reports. However, sometimes SSRS returns the two following errors:
An error has occurred during report processing.
Query execution failed for dataset 'DSOutput'.
String or binary data would be truncated. The statement has been terminated.
And
An unexpected error occurred in Report Processing.
Exception of type 'System.OutOfMemoryException' was thrown.
I'm not sure when and where it's causing the issue.
The approach outlined in that blog post relies on building an enormous string in memory in order to load all of the selected parameter values into the table-valued parameter instance. If you are selecting a very large number of values to pass into the query I could see it potentially causing the 'System.OutOfMemoryException' while trying to build the string containing the insert statements that will load the parameter.
As for the 'string or binary data would be truncated' error that sounds like it's originating within the query or stored procedure that the report is using to gather its data. Without seeing what that t-sql looks like I couldn't say why that's happening, but I'd guess that it's also somehow related to selecting a very large number of parameter values.
Unfortunately I'm not sure that there's a workaround for this, other than trying to see if you could figure out a way to select fewer parameter values. Here's a couple of rough ideas:
If you have a situation where users might select a handful of parameter values or all parameter values then you could have the query simply take a very simple boolean value indicating that all values were selected rather than making the report send all of the values in through a parameter.
You could also consider "zooming out" of your parameter values a bit and grouping them together somehow if they lend themselves to that. That way users would be selecting from a smaller number of parameter values that represent a group of the individual values all rolled up.
I'm not a fan of using a Text parameter and EXEC in the SQL statement like the article you referenced describes as doing so is subject to SQL injection. The default SSRS behavior with a Multi-value parameter substitutes a comma-separated list of the values directly in place of the parameter when the query is sent to the SQL server. That works great for simple IN queries, but can be undesirable elsewhere. This behavior can be bypassed by setting the Parameter Value on the DataSet to an expression of =Join(Parameters!CustomerIDs.Value, ", "). Once you have done that you can get a table variable loaded by using the following SQL:
DECLARE #CustomerIDsTable TABLE (CustomerID int NOT NULL PRIMARY KEY)
INSERT INTO #CustomerIDsTable (CustomerID)
SELECT DISTINCT TextNodes.Node.value(N'.', N'int') AS CustomerID
FROM (
SELECT CONVERT(XML, N'<A>' + COALESCE(N'<e>' + REPLACE(#CustomerIDs, N',', N'</e><e>') + N'</e>', '') + N'</A>') AS pNode
) AS xmlDocs
CROSS APPLY pNode.nodes(N'/A/e') AS TextNodes(Node)
-- Do whatever with the resulting table variable, i.e.,
EXEC rpt_CustomerTransactionSummary #StartDate, #EndDate, #CustomerIDsTable
If using text instead of integers then a couple of lines get changed like so:
DECLARE #CustomerIDsTable TABLE (CustomerID nvarchar(MAX) NOT NULL PRIMARY KEY)
INSERT INTO #CustomerIDsTable (CustomerID)
SELECT DISTINCT TextNodes.Node.value(N'.', N'nvarchar(MAX)') AS CustomerID
FROM (
SELECT CONVERT(XML, N'<A>' + COALESCE(N'<e>' + REPLACE(#CustomerIDs, N',', N'</e><e>') + N'</e>', '') + N'</A>') AS pNode
) AS xmlDocs
CROSS APPLY pNode.nodes(N'/A/e') AS TextNodes(Node)
-- Do whatever with the resulting table variable, i.e.,
EXEC rpt_CustomerTransactionSummary #StartDate, #EndDate, #CustomerIDsTable
This approach also works well for handling user-entered strings of comma-separated items.

Query not working in execute SQL task in the ssis package

This query works fine in the query window of SQL Server 2005, but throws error when I run it in Execute SQL Task in the ssis package.
declare #VarExpiredDays int
Select #VarExpiredDays= Value1 From dbo.Configuration(nolock) where Type=11
DECLARE #VarENDDateTime datetime,#VarStartDateTime datetime
SET #VarStartDateTime= GETDATE()- #VarExpiredDays
SET #VarENDDateTime=GETDATE();
select #VarStartDateTime
select #VarENDDateTime
SELECT * FROM
(SELECT CONVERT(Varchar(11),#VarStartDateTime,106) AS VarStartDateTime) A,
(SELECT CONVERT(Varchar(11),#VarENDDateTime,106) AS VarENDDateTime) B
What is the issue here?
Your intention is to retrieve the values of start and end and assign those into SSIS variables.
As #Diego noted above, those two SELECTS are going to cause trouble. With the Execute SQL task, your resultset options are None, Single Row, Full resultset and XML. Discarding the XML option because I don't want to deal with it and None because we want rows back, our options are Single or Full. We could use Full, but then we'd need to return values of the same data type and then the processing gets much more complicated.
By process of elimination, that leads us to using a resultset of Single Row.
Query aka SQLStatement
I corrected the supplied query by simply removing the two aforementioned SELECTS. The final select can be simplified to the following (no need to put them into derived tables)
SELECT
CONVERT(Varchar(11),#VarStartDateTime,106) AS VarStartDateTime
, CONVERT(Varchar(11),#VarENDDateTime,106) AS VarENDDateTime
Full query used below
declare #VarExpiredDays int
-- I HARDCODED THIS
Select #VarExpiredDays= 10
DECLARE #VarENDDateTime datetime,#VarStartDateTime datetime
SET #VarStartDateTime= GETDATE()- #VarExpiredDays
SET #VarENDDateTime=GETDATE();
/*
select #VarStartDateTime
select #VarENDDateTime
*/
SELECT * FROM
(SELECT CONVERT(Varchar(11),#VarStartDateTime,106) AS VarStartDateTime) A,
(SELECT CONVERT(Varchar(11),#VarENDDateTime,106) AS VarENDDateTime) B
Verify the Execute SQL Task runs as expected. At this point, it simply becomes a matter of wiring up the outputs to SSIS variables. As you can see in the results window below, I created two package level variables StartDateText and EndDateText of type String with default values of an empty string. You can see in the Locals window they have values assigned that correspond to #VarExpiredDays = 10 in the supplied source query
Getting there is simply a matter of configuring the Result Set tab of the Execute SQL Task. The hardest part of this is ensuring you have a correct mapping between source system type and SSIS type. With an OLE DB connection, the Result Name has no bearing on what the column is called in the query. It is simply a matter of referencing columns by their ordinal position (0 based counting).
Final thought, I find it better to keep things in their base type, like a datetime data type and let the interface format it into a pretty, localized value.
you have more that one output type. You have two variables and one query.
You need to select only one on the "resultset" propertie
are you mapping these to the output parameters?
select #VarStartDateTime
select #VarENDDateTime