I am working on a large SSIS data migration project in which all of the output tables have fields for the creation date & user as well as the last update date and user. The values will be the same for all of the records in all of the output tables.
Is there a way to define parameters or variables that will appear in the destination mapping window, and can be used to populate the output table?
If I use a SQL statement in the source, I could, of course, include extra fields for this, but then I also have to add a Data Conversion task to translate the string fields from varchar to nvarchar.
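For illustration, a source query along these lines is what I mean (the table and audit column names here are just placeholders):

    -- Hypothetical source query with the audit fields added as constants.
    -- Plain string literals come into the data flow as DT_STR (varchar),
    -- which is why the Data Conversion step is needed for an nvarchar destination.
    SELECT  s.*,
            'MIGRATION' AS CreatedBy,
            GETDATE()   AS CreatedDate,
            'MIGRATION' AS UpdatedBy,
            GETDATE()   AS UpdatedDate
    FROM    dbo.SourceTable AS s;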
You cannot do this in the destination mapping.
As you've already considered, you could do this by including the extra fields in the source, but then you are passing all that uniform data through the entire dataflow and perhaps having to convert it as well.
A third option would be to run through your data flow without those columns at all (let them be NULL in the destination), and then follow the data flow with an UPDATE that sets those columns with a package variable value.
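As a rough sketch (the table, column, and variable names are only placeholders), the follow-up Execute SQL Task could look like this, with each ? parameter mapped to a package variable on its Parameter Mapping tab:

    -- Runs after the Data Flow Task; each ? is mapped to a package variable
    -- (e.g. User::AuditUser, User::AuditDate) on the Parameter Mapping tab.
    UPDATE dbo.TargetTable
    SET    CreatedBy   = ?,
           CreatedDate = ?,
           UpdatedBy   = ?,
           UpdatedDate = ?
    WHERE  CreatedBy IS NULL;  -- only the rows just loaded with NULL audit columns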
I have an SSIS package with two loops: one loops over Excel files and one over the sheets in each file. Inside the for-each-sheet loop there is a data flow task that uses a variable for the sheet name, with an Excel source and an ODBC destination.
The table in the db has all the columns I need such as userid, username, productname, supportname.
However, some sheets have only the columns username and productname, while others have userid, username, productname and supportname.
How can I load these Excel files? Can I add a Derived Column task that checks whether a column exists and, if it doesn't, adds it with a default value so it can be mapped to the destination?
thanks
SSIS is not an "any format goes at run time" data loading engine. There was a conscious design decision to make the fastest possible ETL tool, and one of the requirements was a defined contract between the shape of the data source and the destination. That's why you'll inevitably run into the VS_NEEDSNEWMETADATA error: something has altered the shape, and the package needs to be edited in the designer to update the columns and sizes.
If you want to write the C# to make a generic Excel ingest engine, more power to you.
An alternative approach would be to have multiple data flows defined within your file and worksheet looping construct. The trick would be to conditionally enable them based on the available column set.
Columns "username and productname" detected, enable DFT UserName and ProductName. And that DFT will have default values, or a lookup, for UserId, SupportName, etc
All columns present, enable DFT All.
Finally, Azure Data Factory can "slurp and burp" whatever source to whatever destination. Perhaps that might be a better fit for your problem.
I am somewhat new to MS Access and I have inherited an application with a table that uses this Lookup feature to replace a code with a value from a query to another table.
When I first used this table and exported it to Excel for analysis, I somehow got the base ID number (or whatever it would be called) rather than the translated lookup value. Now, when I do this, I get the translated text. The biggest problem is that while the base value is unique, the translated values are not, so I cannot use them for the work I am doing.
Can someone explain how to get the underlying ID value rather than the lookup value? Is there some setting I can use, or some way to reference the field upon which the lookup is based? When I query the ID field, I get the lookup value. I know that the first time I did this, the spreadsheet contained the ID number, not the text.
For now, I created a copy of the table and removed the lookup information from the copy, but I know I did not have to do that the first time.
Thanks.
When you export to Excel, leave "Export data with formatting and layout" unchecked. This will create a spreadsheet with the raw data values in the Lookup fields.
(Export settings screenshot)
I have an SSIS Package setup with the following Data Flow Tasks in order:
Flat File Source
Derived Column
Custom Task
Flat File Destination
The Flat File source contains fixed-width rows of data (282 characters per row).
The Derived Column splits each row into columns using the SUBSTRING() function.
The Custom Task performs some Regular Expression validation and creates two new output columns: RowIsValid (a DT_BOOL) and InvalidReason (a DT_WSTR of 200). There is no Custom UI for this Task.
The Flat File Destination contains the validated data in delimited column format. Eventually, this would be a database destination.
I know that this can be done using a Script Task; in fact, I am currently doing so in my solution. However, what I am trying to accomplish is building a Custom Task so that code changes are made in a single spot instead of having to change multiple Script Tasks.
I have a couple of issues I'm trying to overcome and am hoping for some help/guidance:
(Major) Currently, when I review the mappings of the Flat File Destination, the Available Input columns are coming from the Flat File Source, the Derived Column Task, and the Custom Task. Only one column is coming from the Flat File Source (because there is only one column), while the Derived Column and Custom Task each have all of the columns created in the Derived Column.
My expectation is that the Available Input Columns would/should only display the Custom Validator.[column name] columns (with only the column name) from the Custom Validator. Debugging, I don't see where I can manipulate and suppress the Derived Column.[column name] columns.
(Minor) Getting the input columns from the Derived Column Task to automatically be selected or used when the Input is attached.
Currently, after hooking up the input and output of the Custom Validator, I have to go to the Inputs tab on the Advanced Edit and select the columns I want. I'm selecting all, because I want all columns to go through the task, even though only some will be validated by the task.
I am pretty new to SSIS and BI in general, so first of all sorry if this is a newbie question.
I have my source data for the fact table in a CSV, so I want to match the IDs against the surrogate keys in lookup tables.
The data structure in the CSV is like this:
... userId, OriginStationId, DestinyStationId,..
What I am trying to accomplish is to match the data against my lookup table. So what I am doing is:
Reading Lookup data using OLE DB Source
Reading my csv file
Sorting both inputs by the same field
Doing a left join by Id, in order to get the SK
This way, if there is no match (aka can't find the surrogate key) I can redirect that to a rejected csv and handle it later.
Something like this (sorry for the Spanish!):
I am doing this for each dimension, so I can handle each one with different error codes.
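For context, the OLE DB Source for each dimension is just a simple query along these lines (table and column names changed):

    -- Hypothetical lookup/dimension query read by the OLE DB Source:
    -- the business key to match on, plus the surrogate key I need.
    SELECT  StationId,   -- matched against OriginStationId / DestinyStationId
            StationSK    -- surrogate key to place on the fact row
    FROM    dim.Station;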
Since OriginStationId and DestinyStationId are two values from the same dimension (they both match against the same lookup table), I wanted to know if there is a way to avoid reading the data from that table twice (I mean, not using two OLE DB sources to read the same table twice).
I tried adding a second output to the Sort but I am not allowed to. The same goes for adding another output to the OLE DB Source.
I see there is a "cache option"; is that the best way to go? (Although it would still imply creating another OLE DB source, right?)
The third option I thought of was joining by the two fields, but since there is only one field in the lookup table (the same field), I get an error when I try to map both columns from my CSV against the same column in my lookup table:
There are columns missing with the sort order 2 to 2
What is the best way to go about this?
Or am I thinking about this incorrectly?
If something was not clear let me know and I'll update my question
Any time you wish you could have multiple outputs from a component that only allows one, all you have to do is follow that component with the Multicast component, whose sole purpose is to split a Data Flow stream into multiple outputs.
Gonzalo
I have just used this article on how to derive columns when building a data warehouse:- How to Populate a Fact Table using SSIS (part 1).
Using this I built a simple package that reads a CSV file with two columns that are used to derive separate values from the same CodeTable. The CodeTable has two fields Id and Description.
The Data Flow has two "Lookup" tasks. The first one joins the attribute Lookup1 against the Description to derive its Id. The second joins the attribute Lookup2 against the Description to derive a different Id.
Here is the Data Flow:-
Note the "Data Conversion" was required to convert the string attributes from the CSV file into "Unicode string [DT_WSTR]" so they could be joined to the nvarchar(50) description attribute in the table.
Here is the Data Conversion:-
Here is the first Lookup (the second one joins "Copy of Lookup2" to the Description):-
Here is the Data Viewer output with the two derived Ids, CodeTableFirstId and CodeTableSecondId:-
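For reference, the two Lookups are logically doing the equivalent of the joins below (the staging name is just a placeholder for the rows coming out of the Data Conversion):

    -- Conceptual T-SQL equivalent of the two Lookup transformations: each
    -- CSV attribute is matched against CodeTable.Description to return its Id.
    SELECT  src.Lookup1,
            src.Lookup2,
            ct1.Id AS CodeTableFirstId,
            ct2.Id AS CodeTableSecondId
    FROM    dbo.StagedCsvRows AS src   -- hypothetical staging of the CSV rows
    LEFT JOIN dbo.CodeTable   AS ct1 ON ct1.Description = src.Lookup1
    LEFT JOIN dbo.CodeTable   AS ct2 ON ct2.Description = src.Lookup2;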
Hopefully I understand your problem and this is of use to you.
Cheers John
What is the expression in SSIS to get the same dates in the destination as in the source? If I use GETDATE() it gives the current date, but I want the same dates as in the source.
It sounds like you are looking to have the same date value for each row as it moves from the Source to the Destination. You can create your own variable and add it to the data flow with a Derived Column transformation, or you can use a system variable like ContainerStartTime via an Audit transformation (or a Derived Column, too).
Here's an article on all the available System Variables in SSIS.
Since your wording was "same dates mentioned in source", you could do the following to get a single date from the source and use it in your data flow.
On the control flow, create an Execute SQL Task that returns GETDATE() as a single-row result set from the source server. Save this result to a variable.
Within a data flow, add a derived column transformation after the source. Add the new variable value to the flow as a new column.
Map it to the destination column for a single date/time value that was derived from the source system right before the operation began.
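As a minimal sketch (the variable and alias names are just placeholders), the Execute SQL Task would run something like this against the source server, with its single-row result mapped to a package variable such as User::SourceDate, which the Derived Column then adds to the flow:

    -- Execute SQL Task against the source connection; ResultSet = Single row,
    -- with the SourceDate column mapped to a package variable (e.g. User::SourceDate).
    SELECT GETDATE() AS SourceDate;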