SSIS 2008R2 Data Driven Variable Values

I am fairly new to SSIS and have set myself a challenging first project, creating a data driven package framework. My current challenge is that I want to store the values of variables for my various packages in a table and then load them. So for instance, the SSIS package might be processing records between 2 dates. I would have two records in a parameters table:
ParmName ParmValue
-------- ---------
DateFrom 2013-01-01
DateTo   2013-01-31
These variable names will exist in the package; I just need to load them. In a false start, I tried using an Execute SQL Task, but this didn't work. I assume I need a Script Task (C#) to do this, but I don't know C#. I'm wondering if anyone could give me a pointer to where I can find some code similar to what I am trying to do. Just to make it a bit clearer, in pseudocode I envision a process like:
Dataset = Select * from PkgParms where PckID = ?
FOR EACH DataSet.Record
SET (DataSet.Record.ParmName.Value) = (DataSet.Record.ParmValue.Value)
If this is not doable, or I am in over my head, please just let me know.
Thanks
Steve

This is usually done with SSIS Package Configurations. Follow the wizard and choose the SQL Server configuration type.
You can find tutorials on how to do it, but in general you'll need two new columns:
packagepath with values like:
\Package.Variables[User::DateFrom].Properties[Value]
configurationfilter, which will have the same value for both dates, e.g. Dates
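For reference, the SQL Server configuration type stores each variable as a row in a configuration table (by default [dbo].[SSIS Configurations]). A rough sketch of the two rows for this example, assuming the wizard-default table and column names:

-- One row per variable; ConfiguredValueType must match the SSIS variable's data type
INSERT INTO [dbo].[SSIS Configurations]
    (ConfigurationFilter, ConfiguredValue, PackagePath, ConfiguredValueType)
VALUES
    (N'Dates', N'2013-01-01', N'\Package.Variables[User::DateFrom].Properties[Value]', N'DateTime'),
    (N'Dates', N'2013-01-31', N'\Package.Variables[User::DateTo].Properties[Value]',   N'DateTime');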


SSIS package - How to produce multiple files with a foreach loop container?

How can several files be produced in a single SSIS package? I have created one that produces a single file, but have no idea how to produce several.
The package I produced uses variables to know which data to retrieve, and an expression in the flat file connection manager to assign the correct name to the file (which is based on variables).
The single package I created retrieves the city for which I want the sales data (New York) and the month (September 2020) as variables/parameters, and uses them to extract the appropriate data. Example of SQL statement executed:
select * from table1 where City = ? and Period = ?
It then uses those to build the name for the file to be exported and sends it to a folder. But how do you do that to produce several files within the same package? How can I make the same SSIS package produce another file for Chicago - July 2020, another for Denver - June 2020, and another for San Diego - March 2020?
I plan to have a table that indicates what needs to be produced.
ExampleRow1: Chicago, Sep 2020, Produce=Yes.
ExampleRow2: Miami, Jan 2020, Produce=Yes.
So the SSIS package would need to use that info to produce a file, and then do it again, and again, until there is nothing more to produce. Is this even possible? I know a foreach loop container can help, but I'm not sure it can handle so many changing variables. This is pretty much the first package I've created, which is why I am this ignorant. Thanks in advance!
Right now, you have it working correctly for the value of your two SSIS variables (City and Period), and you have it parameterized, so I wouldn't discount that as your first SSIS package. People struggle with far easier tasks.
What you need to do is connect the orchestrator/driver table into your package. Here's how we're going to do that.
Create an SSIS variable called rsObject of type Object. This is going to hold a recordset object aka the results of our query.
Execute SQL Task
Add an Execute SQL Task to the Control Flow. Call it "SQL Get Driver Data". You'd use a query like:
SELECT T.City, T.Period FROM dbo.ExtractSetup AS T WHERE T.Produce = 1;
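(As a point of reference, here is a minimal sketch of what dbo.ExtractSetup could look like; the column names come from the question and the types are assumptions, so match them to your real table.)

-- Hypothetical driver table; adjust names and types to your actual design
CREATE TABLE dbo.ExtractSetup
(
    City    nvarchar(100) NOT NULL,
    Period  nvarchar(20)  NOT NULL,  -- e.g. 'Sep 2020'
    Produce bit           NOT NULL   -- 1 = generate a file for this row
);

INSERT INTO dbo.ExtractSetup (City, Period, Produce)
VALUES (N'Chicago', N'Sep 2020', 1),
       (N'Miami',   N'Jan 2020', 1);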
Change the default of No Result Set to Full Result Set. That tells SSIS to expect a table-shaped return object, but something needs to catch that incoming data.
On the Result Set tab, you now need to map the results into an SSIS variable. Assuming an OLE DB type connection manager, you'll select User::rsObject in the Variable Name list and 0 as the Result Name (doing this from memory so specifics might be slightly off).
Save and run that task. Assuming no errors, when the package runs we have a, potentially empty, set of data in our recordset object. Let's do something with that.
Shredding the data
The name I generally see applied to getting data out of enumerable objects in SSIS is "shredding the data". The implementation of that is a Foreach enumerator - one of the most powerful tools in your toolkit.
Drag a ForEach Loop Container onto the canvas. Drag the connector line (precedence constraint) from "SQL Get Driver Data" to our new ForEach Loop Container. I'd name it "FELC Shred Results" to indicate my intent.
Double-click the task and change the default enumerator type from File System to "ADO.NET recordset". This has no bearing on whether you used an OLE DB, ODBC or ADO.NET connection manager to populate the table-like object. If it's a table, use ADO.NET Recordset.
Specify our variable [User::rsObject] as the source of the Recordset object.
The last thing we need to do is configure what happens with the current row in the enumerator. That's on the Variable Mappings tab. Here you'll add two mappings, and this is a zero-based ordinal system. Choose [User::City] (or whatever you've named your City variable) for your first entry and map it to index 0. Add a row, use [User::Period], and map it to index 1.
The final step is to take the existing logic (Data Flow Task and whatever else is variable dependent) and move it into the FELC. That's literally drawing a box around it with the mouse to highlight everything, then holding the left mouse button and dragging it into the FELC.
Hit F5 and you should have 2 files generated.

How to implement logging at the end of each job in Talend?

I am new to Talend Open Studio.
However, I received a task:
Create file delimited .csv metadata (one for Lead & Opportunity).
Move files to your repository on the AWS server (the etl_process1 login).
Create two tables sfdc_leads_reporting_raw and sfdc_opp_reporting_raw.
Load the data from the files into the tables. Ensure the data types are correctly used when creating metadata schemas & tables.
I am done up to step 4.
Now the problem is:
How do I implement logging at the end of each job to report the number of leads (count of distinct id in the leads table) and the number of opportunities created (count of opportunity id) by stage (how many converted, qualified, closed won, and dead)?
Help would be appreciated.
You can get this data using global variables, in a subjob at the end of your job. Most components provide a global variable called tComponent_NB_LINE (or _NB_LINE_INSERTED for database components) that gives you the number of lines output by the component.
For instance tFileOutputDelimited_1_NB_LINE or tOracleOutput_1_NB_LINE_INSERTED.
Using these variables, you can log to the console or to a file.
Here is a simple example. If you have a tOracleOutput_1 in your job you can do:
tPostJob -- OnComponentOk -- tFixedFlowInput -- Main -- tLogRow
Inside tFixedFlowInput you retrieve the variable with (Integer)globalMap.get("tOracleOutput_1_NB_LINE_INSERTED").
If you need to log aggregated info, you can append a tAggregateRow to your output components, and use tSetGlobalVar to get count by certain criteria.
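If you also want the aggregated numbers pulled straight from the loaded tables (for example in a final subjob that feeds a tLogRow), queries along these lines would do it; the id and stage column names are assumptions based on the task description:

-- Number of leads loaded (count of distinct ids)
SELECT COUNT(DISTINCT id) AS lead_count
FROM sfdc_leads_reporting_raw;

-- Number of opportunities by stage (converted, qualified, closed won, dead, ...)
SELECT stage, COUNT(id) AS opportunity_count
FROM sfdc_opp_reporting_raw
GROUP BY stage;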

Batch job to export data into CSV

I'm doing my first ABAP job and I don't have much experience so I need a little help.
What I want to do is create a batch job that runs every morning at a specific time, fetches data from different tables and exports it as a CSV file. To create that batch job I can use transaction code SM36 or SM37.
But I need some help with how to fetch the data.
Does anyone have example code that I can use or take a look at?
TheG is right, it sounds like you're trying to learn ABAP from scratch with no guidance. That's difficult but here are some basics:
There are three parts to this:
1. create a program
2. generate a file
3. schedule the job
For 1,
If you go to SE38, you can create a new report. You'll have to check with your colleagues about the namespace, but usually you just start the program with Z (which puts it in the 'customer' namespace).
In the entry box of SE38, you can type DEMO to pull up lots of SAP-provided demo reports. The names usually give you a hint about what they demonstrate, and you can probably find one that mentions creating a file.
Once you create your own report through SE38 by typing in the name and hitting enter, you can use SELECT...INTO TABLE or SELECT ... ENDSELECT to query the database tables. Highlight SELECT and click the blue 'i' icon to pull up SAP's internal documentation.
At its most basic, you can use the WRITE statement to print out the rows and columns of your data.
Once you have your report running, scheduling it with SM36 will be more self-explanatory.
This is very basic ABAP reporting program stuff. Making the report run as a background/batch job is the least of the concerns. Let us help you walk through this.
-> Have you done any report programming before?
-> Do you have the list of tables from which you want the data, and do you know how they are linked?
-> Do you know how often this report will be run and what selection criteria are required?
-> Did you check with the functional team whether you want a 'delta pull' or a 'full pull' every time you run the report?
-> Do you have the file share where you want to output the file? Is it on the presentation server or the application server? If not the presentation server, can you reason out why not?
-> Did you confirm the file name and how it should look?
-> Do you know how to generate a CSV file? If this is a 'production requirement', are there reusable frameworks for handling file operations in your company?
-> Do you have the final format of how the CSV file should look?
-> Did you verify with the functional team whether they want the output data in external format for some fields?
-> Did you check if there are date fields in your output and what format you want them in for consistency?
If you are a little familiar with ABAP, explore the answers to the above, write a report and get it running in dialog mode. Then get back to us and we will help you run it as a batch job.

Capturing runtime for each task within a dataflow in SSIS 2012

In my SSIS package I have a dataflow that looks something like this.
My requirement is to log the end time of each flat file destination (or the time when each of the flat files is created) in a SQL Server table. To be more clear, there will be one row per flat file in the log table. Is there any (preferably simple) way to accomplish this? Thanks in advance.
Update: I ended up using a script task after the dataflow and reading the creation time of each of the files created in the dataflow. I also used the same script task to insert the logs into the table, just to keep things in one place. For details refer to the post marked as answer.
In order to get the accurate date and timestamp of each flat file created as the destination, you'll need to create three new global variables and set up a For Each Loop container in the control flow following your current data flow task. Then add to the For Each Loop container a script task that reads the date/time information from one flat file at a time. That information is saved to one of the new global variables, which can then be used in a second SQL task (also in the For Each Loop) to write the information to a database table.
The following link provides a good example of the steps you'll need to apply. There are a few steps that aren't applicable, which you can easily exclude.
http://microsoft-ssis.blogspot.com/2011/01/use-filedates-in-ssis.html
Hope this helps.
After looking more closely at the toolbox, I think the best way to do this is to move each source/destination pairing into its own dataflow and use the OnPostExecute event of each dataflow to write to the SQL table.
Wanted to provide more detail on @TabAlleman's approach.
For each control flow task with a name like Bene_hic, you will have a source file and a destination file.
On the 'Event Handlers' tab for that executable (use the drop-down list), you can create the OnPostExecute event.
In that event, I have two SQL tasks. One generates the SQL to execute for this control flow task, the second executes the SQL.
These SQL tasks are dependent on two user variables scoped in the OnPostExecute event. The EvaluateAsExpression property for both is set to True. The first one, Variable1, is used as a template for the SQL to execute and has a value like:
"SELECT execSQL FROM db.Ssis_onPostExecute
where stgTable = '" + #[System::SourceName] + "'"
#[System::SourceName] is an SSIS system variable containing the name of the control flow task.
I have a table in my database named Ssis_onPostExecute with two fields, an execSQL field with values like:
DELETE FROM db.TableStats WHERE TABLENAME = 'Bene_hic';

INSERT INTO db.TableStats
SELECT CreatorName, t.tname, CURRENT_TIMESTAMP, rcnt
FROM (SELECT databasename, TABLENAME AS tname, CreatorName FROM dbc.TablesV) t
INNER JOIN (SELECT 'Bene_hic' AS tname, COUNT(*) AS rcnt FROM db.Bene_hic) u
    ON t.tname = u.tname
WHERE t.databasename = 'db' AND t.tname = 'Bene_hic';
and a stgTable field containing the name of the corresponding control flow task in the package (case-sensitive!), like Bene_hic.
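In other words, the lookup table has a layout roughly like this (the column types here are my assumption):

-- Assumed layout: stgTable holds the (case-sensitive) control flow task name,
-- execSQL holds the logging SQL to run for that task
CREATE TABLE db.Ssis_onPostExecute
(
    stgTable VARCHAR(128),
    execSQL  VARCHAR(4000)
);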
In the first SQL task (named SQL), I have the SourceVariable set to a user variable (User::Variable1) and the ResultSet property set to 'Single row'. The Result Set detail includes Result Name = 0 and the Variable Name set to the second user variable (User::Variable2).
In the second SQL task (exec), I have the SQLSourceType property set to Variable and the SourceVariable property set to User::Variable2.
Then the package is able to copy the data in the source object to the destination, and whether it fails or not, enter a row in a table with the timestamp and number of rows copied, along with the table name and anything else you want to track.
Also, when debugging, you have to run the whole package, not just one task in the event. The variables won't be set correctly otherwise.
HTH, it took me forever to figure all this stuff out, working from examples on several web sites. I'm using code to generate the SQL in the execSQL field for each of the 42 control flow tasks, meaning I created 84 user variables.
-Beth
The easy solution will be:
1) Drag an OLE DB Command from the toolbox after the Flat File destination.
2) Update the script to update the table with the current date when the Flat File destination succeeds.
3) You can create a variable (scoped to the project) holding the system datetime.
4) You might have to create another variable, depending on your package construct, to capture success or failure.
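To illustrate step 2, the statement behind that command could look something like the sketch below; the log table and column names are placeholders, and the ? parameters map to data flow columns (OLE DB Command) or package variables (Execute SQL Task), depending on where you run it:

-- Placeholder logging statement; adjust table and columns to your own log table
INSERT INTO dbo.FileExportLog (FlatFileName, CompletedAt)
VALUES (?, GETDATE());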

SSIS 2012 Full Result Set to set variables

I'm trying to create an SSIS package that reads a mapping table containing foreign key information and the tables they point to, and stores the full result set. That result set is then used to populate 7 variables representing columns in the result set, which in turn are used to update an xxxSID column on 6 servers.
I'm stuck! Please help.
I've created the SQL Task with the query to build the result set and mapped it to the object variable SidMap, and the task runs successfully; however, I don't know where to go from there. Some blogs say to create a Foreach Loop Container and map the object variable to the collection, which I've done. I've also created string variables representing the 7 columns but don't know how to populate them.
The blogs I've read so far suggest this can only be done from a Script task. Is that true? If so how is it done?
Another user posted a question that sounded like he may be doing the same or a very similar thing using a SQL Task, but I didn't see how he was populating the column object variables and then converting the data into string variables.
SSIS Result set, Foreachloop and Variable
Currently I'm updating tables manually using a cursor. If anyone cares to see the code I can post it but didn't think it relevant to the question other than providing a clear picture of what I'm doing.
I would create a For Each Loop Container using the Foreach ADO Enumerator, and map the object variable to the collection. I would map the 7 string variables on the Variable Mappings page.
This process is documented in detail here:
http://technet.microsoft.com/en-us/library/cc879316.aspx
A common "gotcha" is mismatched datatypes between the result set and the Variables. To avoid this I always wrap CAST ( ... AS NVARCHAR ( 4000 ) ) or similar around the columns in the dataflow that produces the dataset, and all my receiving Variables are String datatype.