Recordset Destination in SSIS

What is the major use of the Recordset Destination in SSIS? I heard that it is an in-memory destination, so is the variable holding the data in raw format? Can someone explain a real-time project use of the Recordset Destination?

A Recordset Destination can be used for just about anything you can think of. A common use I hear of is to consume the recordset in a Foreach Loop. Say you want to export several "categories" from a transaction table. Perhaps you get a recordset of the categories that exist and then call a new data flow to export each category as its own file. Or perhaps date ranges, months, etc.
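For the categories example, the query feeding the Recordset Destination could be as simple as the following sketch (the table and column names are hypothetical):
SELECT DISTINCT Category
FROM dbo.TransactionTable;
The destination writes the result into an Object variable, which a Foreach ADO enumerator can then hand to the per-category data flow one row at a time.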
One way I use it is in a Script Task, to perform an action on the data that SSIS cannot do natively. I was using a Script Component, but that particular task ran into a concurrency issue, so by dumping the data to a recordset I was able to process it in a Script Task in a way that avoided the issue.
Another Script Task use is to build and send HTML emails.
I suppose another use might be to have one data flow produce a recordset, run a series of non-data-flow tasks against it, and then use it as a source in another data flow, but that is not something I have ever done.

Related

SSIS: How to get the number of updated and deleted rows in an audit?

Imagine that you want to save in a variable the number of rows that were updated or deleted in a table.
These are the steps that I did:
First, in the Control Flow I created a Data Flow Task.
Then, in the Data Flow, I created a source (in my case an Excel file), created two variables to count those rows (countDeleted and countUpdated), connected the variables to two Row Count transformations, and then connected my destination (OLE DB).
Now, in the Control Flow, what do I do? Create an Execute SQL Task, or a Script Task? What is the best way to do it, and what code should I use?
Thanks for your help.
PS: I only have 4 weeks of SSIS experience, sorry for my noobieness :)
An OLE DB Destination only inserts. It can't UPDATE or DELETE.
What's your logic for updating or deleting?
If you're just starting out and reading about doing things in SSIS, you will eventually find advice to use the OLE DB Command to perform row-by-row updates and deletes.
In my opinion this is to be avoided. It does not scale (it works fine for small recordsets, then fails for large ones), and it is difficult to maintain parameter mappings in the OLE DB Command. You should try it anyway, though, to familiarise yourself with it.
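For context, the row-by-row pattern being warned against is an OLE DB Command whose statement runs once per row flowing through the data flow, along the lines of this sketch (table and column names hypothetical):
DELETE FROM MyFinalTable WHERE PrimaryKey = ?;
Each ? marker is mapped to an input column in the component, which is exactly the part that becomes painful to maintain.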
My advice is to load the Excel data into a staging table, perform batch DELETE and UPDATE statements to load the data, and use @@ROWCOUNT to capture the records updated.
For example:
The data flow you described can be used to load into a table called StagingTable.
Before your data flow you should run an Execute SQL Task (this is in the Control Flow pane, not the Data Flow pane) that clears the staging table:
TRUNCATE TABLE StagingTable;
So first get that working: repeatedly running your package should clear the staging table, then load Excel into it without creating duplicates.
This in itself is a challenge, as Excel is a terrible data interchange format.
Once you have that working, you add an Execute SQL Task to the end that runs some SQL to delete the records you want and capture the count. For example:
DELETE FROM MyFinalTable WHERE PrimaryKey IN (SELECT PrimaryKey FROM StagingTable);
SELECT @@ROWCOUNT;
Then you follow the instructions here to load that back into your SSIS variable:
http://microsoft-ssis.blogspot.com/2011/03/rowcount-for-execute-sql-statement.html
What are you doing with this row count? Are you writing it to a logging table? Save yourself the bother of pulling it back into an SSIS variable and just write it directly:
DELETE FROM MyFinalTable WHERE PrimaryKey IN (SELECT PrimaryKey FROM StagingTable);
INSERT INTO LogTable (TableName, Operation, RowsAffected)
SELECT 'MyFinalTable', 'Delete', @@ROWCOUNT;
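The UPDATE side follows the same pattern. A minimal sketch, assuming the staged columns are called Col1 and Col2 (hypothetical names):
UPDATE f
SET f.Col1 = s.Col1,
    f.Col2 = s.Col2
FROM MyFinalTable AS f
JOIN StagingTable AS s ON s.PrimaryKey = f.PrimaryKey;
INSERT INTO LogTable (TableName, Operation, RowsAffected)
SELECT 'MyFinalTable', 'Update', @@ROWCOUNT;
As in the DELETE example, @@ROWCOUNT is read by the very next statement, so nothing may run between the UPDATE and the logging INSERT.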
In my experience it is not a good idea to build convoluted logic into SSIS packages if you can instead do it in the database, although it does depend on the person who eventually has to maintain it. Hopefully you can appreciate that this T-SQL approach is a more straightforward, code-based approach, as opposed to having to dig around in property pages, events, and other places inside SSIS packages.
I assume that you're using an Execute SQL Task for the updates and deletes? As @Nick.McDermaid mentioned, using an OLE DB Command within a Data Flow presents various issues when performing DML. You can find the number of rows updated, inserted, or deleted in a table through an Execute SQL Task by using the ExecValueVariable property of this task. Set the variable that will hold the row count to this property and it will return the number of affected rows. Note that it will only return the number of rows impacted by the last statement in the Execute SQL Task, regardless of whether batches (i.e. GO separators) are in the component.
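To illustrate that last-statement caveat, consider a task whose SQL runs two statements (table and predicates hypothetical):
UPDATE MyFinalTable SET IsActive = 1 WHERE Category = 'A';  -- say this touches 100 rows
DELETE FROM MyFinalTable WHERE IsActive = 0;                -- say this touches 5 rows
The variable bound to ExecValueVariable would receive 5, the count from the final statement only.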

Store a SQL query result in a Pentaho variable

I am new to PDI (coming from SSIS) and I am having some trouble handling variables.
I would like to do the following:
From a SQL select query, I would like to save the result into a variable.
For that reason I have created one job and two transformations, given that in Pentaho every step is executed in parallel.
The first transformation is in charge of setting the variable, and the second transformation is going to use this result as an input.
But in the first transformation I am having trouble setting the variable: I do not understand where I have to instantiate this variable to implement the "set season variable" step, and then how to get this result in the next transformation.
If anyone knows about this, or if you could recommend any link with a good example, I'll really appreciate it.
This can indeed be confusing for SSIS users. In PDI, you don't create a recordset variable as you do in SSIS; simply creating a job creates one for you. Each job has two different types of "results": one for recordset rows and one for filenames.
These variables are not directly accessible; they are just part of the job. There are steps that interact with them directly. For example, under the "Job" branch when you're creating a transform, there are Get rows from result and Copy rows to result steps. They work directly with the job's row results.
Be aware that you must manually manage the metadata for the results. This is a pain, but overall I find PDI's method of doing this more intuitive and easier than SSIS's, even though SSIS is more flexible in this regard.
There are also Get files from result and Set files in result steps. These interact with the job's built-in file results, which is simply a list of every file touched by any step configured in the job. On the job tab there are tasks that deal with it directly, such as Process result filenames, Add filenames to result and Delete filenames from result. These tasks operate on the job's built-in file results list and provide an easy way to, say, archive all the files loaded by the transform you just ran.
Be aware when using these steps that they record EVERY file touched by EVERY step in the job. If you look through most of the steps in transformations (data flows) that deal with files, there's usually an "Add files to result" checkbox that is checked by default. If you uncheck it, the file names will not be added to the job's file results. You can also delete specific files from the file results with the Delete filenames from result step.
From your job, start a transformation; then promote the transformation's variable into a global variable in your job and use it.

SSIS PACKAGE, Only want Derived Columns

I have an SSIS package with a Foreach Loop that imports multiple txt files into a SQL Server table. That runs fine.
What I am trying to accomplish is to store the distinct filename and the date it was imported into a separate table. I created a separate Foreach Loop for this, and then archive the txt file after it's complete with a File System Task.
The issue I am having is that I added an event handler to invoke an Execute SQL Task and a Send Mail Task if there is a warning (I was hoping for a warning only if there were no files in the directory the package imports from).
However, I found a warning that a column in the Data Flow Task was not being used and should be removed if not needed. But the Data Flow Task requires at least one field for me to add a Derived Column transformation.
Derived Column Field1 pulls @[User::CurrentFile] from the Foreach Loop container.
Field2 pulls the current date.
Is there a way to perform this without the warning?
It sounds like you're over-complicating things.
You have a Foreach Loop, and you're therefore assigning a value into some variable to contain the file name, @[User::CurrentFile]. You can get the date it was loaded through either a call to GETDATE() or a reference to the system-scoped variable StartTime, @[System::StartTime].
The most straightforward option would be to add an Execute SQL Task wired up to the OnSuccess precedence constraint from your Data Flow Task. The Execute SQL Task will then have a statement like INSERT INTO dbo.MyLog(FileName, InsertDate) SELECT ?, ?, assuming an OLE DB Connection Manager, and then you map in your two variables.
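A minimal sketch of that statement, assuming the OLE DB parameter markers map by ordinal position (dbo.MyLog is a hypothetical table):
-- Parameter 0 -> @[User::CurrentFile], Parameter 1 -> @[System::StartTime]
INSERT INTO dbo.MyLog (FileName, InsertDate)
SELECT ?, ?;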
Easy, clean, no warnings fired about unused columns in your data flow.
What I think you have, based on "I created a separate For Each Loop for this", is a second loop that the Execute SQL Task above makes unnecessary.

How to assign multiple rows of data to a variable in SSIS?

Just wondering, is there any way that I can assign multiple rows of data to a variable in SSIS?
example:
I have the following table (tableA) with the following data, created by the Data Flow Task:
DataRow
Jay,10,11 Happy St\n\n
David,12,13 Angel St\n\n
Tom,30,23 Betman St\n\n
How can I assign those records to a variable in SSIS, as below:
Jay,10,11 Happy St\n\nDavid,12,13 Angel St\n\nTom,30,23 Betman St\n\n
Then pass it into the Web Service Task. At the moment I'm running the Web Service Task inside the loop, but I would like to compile all the data rows and pass them over to the Web Service Task outside the loop.
Is there any way I can do it? Any example or link you could share?
If you can, take the source data using an Execute SQL Task. If that is not possible, you can create a temporary table and then populate that table in your Data Flow Task. The key is to store your result set in a variable of type Object; then you can iterate over it in a loop container and send every row to the Web Service Task as you want.
Here you could find a detailed tutorial about this:
Loop through ADO recordset in SSIS
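If the rows can instead be combined in the database, the whole concatenation fits in one Execute SQL Task and lands in a single String variable, so no loop is needed. A minimal sketch using FOR XML PATH, assuming SQL Server and the tableA / DataRow names from the question:
SELECT (
    SELECT DataRow AS [text()]
    FROM dbo.tableA
    FOR XML PATH(''), TYPE
).value('.', 'nvarchar(max)') AS AllRows;
Run the task with a single-row result set, map AllRows to a String variable, and pass that variable to the Web Service Task.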
If you need help with the temporary table just let me know.
Hope that helps,
Paul

Transfer multiple tables from one database to another database using SSIS

Can somebody please help me transfer around 15 tables from one database to another? At present I can do this one by one using a Data Flow Task, but then I need to repeat the task 15 times, which is very time-consuming.
Why don't you just use the built-in wizard? Maybe Tasks -> Export Data is what you're looking for.
Otherwise you'll need to create separate blocks for each table, or:
Create a variable of type Object.
Script Task: add all the table names to your list (or fill it with an Execute SQL Task, as sketched below).
Iterate over this object variable with a Foreach Loop container.
Inside the loop, create a source from a variable; in this variable, specify the connection dynamically depending on the current loop value.
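As an alternative to the Script Task, an Execute SQL Task with its result set mode set to Full result set can fill the Object variable directly. A sketch, assuming the 15 tables all live in the dbo schema (the filter is hypothetical):
SELECT name
FROM sys.tables
WHERE schema_id = SCHEMA_ID('dbo')
ORDER BY name;
Point the Foreach Loop's ADO enumerator at the Object variable and map column 0 to a string variable that drives the dynamic source.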
You can also use the Transfer SQL Server Objects Task from the SSIS toolbox. Under Objects, specify the source and destination servers and databases. Set CopyAllObjects to false; then under ObjectsToCopy, either set CopyAllTables to true, or leave it false and pick the tables you want to copy from the TablesList.