I have a Data Flow Task that picks up a variable at run time. The variable takes three different values, so the job has to run three times. I want a Lookup transformation in the DFT that checks whether the value about to be inserted already exists in the database for the current value of the variable. (I cannot create any unique key constraints in the database.) How do I make the WHERE clause of the Lookup transformation pick up the value from the variable? I cannot use Execute SQL, as it is restricted to the Control Flow.
A better option than a Lookup is a MERGE statement: http://technet.microsoft.com/en-us/library/bb510625.aspx
If you still want to use a Lookup, you have to disable the cache on the component (or set it to partial cache). Then, on the Advanced tab, check "Modify the SQL statement", type your query, and supply the variables through the "Parameters..." button.
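As a rough model of what a Lookup without caching does, the sketch below runs one parameterized existence check per incoming row, with the current variable value bound to a `?` placeholder. It uses Python's sqlite3 purely for illustration and is not SSIS code; the table `target`, its columns `batch_id` and `item`, and the function names are hypothetical.

```python
import sqlite3

def rows_to_insert(conn, batch_id, incoming_items):
    """Return the incoming items that do not yet exist for the given batch_id."""
    # One parameterized query per row, similar to a Lookup with caching disabled.
    sql = "SELECT 1 FROM target WHERE batch_id = ? AND item = ?"
    missing = []
    for item in incoming_items:
        found = conn.execute(sql, (batch_id, item)).fetchone()
        if found is None:
            missing.append(item)
    return missing

def demo():
    # Hypothetical data: batch 2 already contains item "a".
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (batch_id INTEGER, item TEXT)")
    conn.executemany(
        "INSERT INTO target VALUES (?, ?)", [(1, "a"), (1, "b"), (2, "a")]
    )
    return rows_to_insert(conn, 2, ["a", "b", "c"])
```

Here `batch_id` plays the role of the package variable: the same query text is reused for each of the three runs and only the bound value changes.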
I am using SSIS 2008 and have put a simple query (not a stored procedure) in an Execute SQL Task (Control Flow). The query returns one column with a single value, and I want to use that value to decide whether to run the tasks that follow. I tried mapping the value to a variable on the Parameter Mapping page with direction Output, ReturnValue, etc., but all attempts failed. The query takes no parameters. I know I could probably create a stored procedure with an output parameter and map that to a variable, but I am wondering whether there are other options (e.g. without creating a procedure, since it's a very simple query).
As mentioned, you need to set the ResultSet property of the Execute SQL Task to 'Single row'; you can then map that result set to a variable on the Result Set page.
From here you can use precedence constraints within the Control Flow to execute different tasks depending on the value of that variable; for example:
I have a package that retrieves data from a MySQL table and inserts it into a SQL Server table.
Old data often gets modified, and the client wants to dump all the data every time, which is too large and time consuming. So I have proposed that we load only yesterday's data on weekdays and do a complete dump on the weekend. Is it possible to enable or disable a DFT based on an expression? I tried Expressions->Disable based on DATEPART(WeekDAY,GETDATE()), but it runs the complete load regardless of the expression's value.
Regards,
Vijay
Create a SQL Task or Script Task that does your expression, and set the result to a variable.
Then create your data flow task.
Then connect the two with an arrow (aka precedence constraint).
Then right-click the arrow and choose Edit, and in the Precedence Constraint Editor choose:
Evaluation operation: Expression
Expression: {an expression on your variable, such as @iRowsUpdated == True}
You should put the condition as a WHERE clause on the SELECT statement in your source query. You will first need to change the source's data access mode from table to SQL command.
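The idea of moving the weekday decision into the source query can be sketched as follows. This is an illustrative Python model, not SSIS code: the table `source_table` and the column `modified_date` are hypothetical names, and the `?` placeholders assume a driver that uses that parameter style.

```python
from datetime import date, timedelta

def build_source_query(today):
    """Return (sql, params) for the source extract on the given date."""
    base = "SELECT * FROM source_table"
    # date.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    if today.weekday() >= 5:
        # Weekend: complete dump, no filter.
        return base, ()
    # Weekday: only rows modified since the start of yesterday.
    yesterday = today - timedelta(days=1)
    sql = base + " WHERE modified_date >= ? AND modified_date < ?"
    return sql, (yesterday, today)
```

With this shape there is a single data flow, and only the query text and its parameters differ between weekday and weekend runs.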
I have an input CSV file with the columns eid, ename and designation. Next I use a Lookup transformation, and inside the Lookup I am using a query like
select * from employee where ename=?
I need to pass the parameter ? from the CSV file; that is, the ename value in the CSV file has to be passed into the query used by the Lookup transformation.
Inside the Lookup I have changed the mode to Partial cache, and on the Advanced tab I selected "Modify the SQL statement", placed my query, and clicked the Parameters button. But I don't know how to pass the parameter.
You can't add parameters to your lookup query. If your goal in adding the parameters is to reduce the amount of data read from the database, you don't have to worry: the "partial cache" mode will do that for you.
Partial cache means that the lookup query is not executed during the validation phase (as it is with the full cache option) and that rows are added to the cache one by one as they are queried from the database. So, if your lookup table has one million rows and your input only references 10 of them, your lookup will run 10 selects against the database and end up with only 10 rows in the cache.
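The behaviour described above can be modelled with a small cache-on-miss class. This is a simplified sketch in Python with sqlite3, not how SSIS implements it (it ignores cache size limits and other options); the class name and demo data are hypothetical, while the `employee` table and its columns come from the question.

```python
import sqlite3

class PartialCacheLookup:
    """Query the database only for keys that are not yet in the cache."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.queries_run = 0

    def lookup(self, ename):
        if ename not in self.cache:
            # Cache miss: run one parameterized select for this key.
            self.queries_run += 1
            row = self.conn.execute(
                "SELECT eid, ename, designation FROM employee WHERE ename = ?",
                (ename,),
            ).fetchone()
            self.cache[ename] = row
        return self.cache[ename]

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (eid INTEGER, ename TEXT, designation TEXT)")
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(i, f"emp{i}", "dev") for i in range(1000)],
    )
    lk = PartialCacheLookup(conn)
    # 50 input rows that reference only 10 distinct names.
    for i in range(50):
        lk.lookup(f"emp{i % 10}")
    return lk.queries_run, len(lk.cache)
```

In the demo the reference table holds 1000 rows, but only the 10 distinct names seen in the input are ever fetched or cached.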
I would like to make a package that copies data from a table only if the table is not empty. I know how to do a count and how to make a package for copying data, but the problem is that a source can't have any inputs, so I don't know how to do it. Any suggestions?
I don't understand your comment about dragging a "green line from a package to a source" but instead of trying to determine in advance if the table is empty, just do your copy anyway and then see how many rows were copied:
Create a package variable for the rowcount
Populate the variable using the rowcount transformation
Use an expression in the precedence constraint to check the variable: if it's greater than zero then continue executing the rest of your package
@Pondlife I don't think you can use a precedence constraint inside the Data Flow Task, can you?
I believe you can use it only on the control flow.
I would add an "Execute SQL Task" that runs the count and sends the result to a variable. From this task, I would drag the green arrow to the Data Flow Task that makes the copy, and on this arrow I would add the expression to the precedence constraint.
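The count-then-gate pattern can be sketched outside SSIS like this. It is a minimal Python model using sqlite3: the count stands in for the Execute SQL Task, the `if` stands in for the precedence constraint expression, and the table names `src` and `dst` are hypothetical.

```python
import sqlite3

def copy_if_not_empty(conn):
    """Copy src into dst only when src has rows; return True if the copy ran."""
    # Step 1 (Execute SQL Task): store the row count in a variable.
    row_count = conn.execute("SELECT COUNT(*) FROM src").fetchone()[0]
    # Step 2 (precedence constraint): continue only if the count is above zero.
    if row_count > 0:
        # Step 3 (Data Flow Task): perform the copy.
        conn.execute("INSERT INTO dst SELECT * FROM src")
        conn.commit()
        return True
    return False
```

The check lives entirely in the control logic, so the copy step itself needs no input.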
As you have correctly noted, a data flow source does not accept input so one cannot perform logic in the dataflow to determine whether this task should run.
Cannot create connector.
The destination component does not have any available inputs for use in creating a path.
However, there's nothing stopping you from setting up this logic in your control flow. I would use a query that hits the DMVs for a fast rowcount on the destination system, filtered to only the tables I wished to replicate.
Armed with the list of empty tables, how I'd handle it would depend on how many there are. For a small number of tables, I'd define N data flows, all with a do-nothing Script Task as a precedent, and then use an expression on the table name to enable a path, much like I did on this question.
If there are many tables, I'd define a package per table and then invoke execute package task with the package name built dynamically based on the empty table name.
I'm fairly new to SSIS,
I'm importing from an XLS spreadsheet into a database table. Along the way I want to select a record from a table, but it is NOT a lookup, i.e. it is a straight SELECT with no join to the input source. Then I want to merge this record with the other rows from the XLS.
What is the best way to do this? Variables? OLE DB commands?
Thanks
You could use an OLE DB Command, but the important thing to remember is that it is fired on a per-row basis and could potentially be slow. You can still use a Lookup for this purpose, but make sure that you set the error output to ignore lookup errors for the cases when the Lookup transformation does not contain a value for the match you are looking for.
You could also use a Merge Join transformation with an outer join rather than an inner join.
If the record that you are retrieving from the database table is not dependent on the data within the row from the spreadsheet then it will probably be the same for each row - is that what you are hoping for?
In this case, I would consider using an Execute SQL Task in the Control Flow to retrieve the record and save it to a variable. You can then use a Script Component in the Data Flow to copy the values in the record from the variable to the appropriate fields in each row. This means the lookup data is retrieved only once rather than once per row, which is slow, as jn29098 said above.
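A minimal sketch of the fetch-once idea, written in Python rather than as an SSIS Script Component: the record is obtained a single time and then copied onto every row. The function name and the field names in the usage note are hypothetical.

```python
def add_lookup_values(rows, lookup_record):
    """Return new rows with the lookup fields copied onto each input row."""
    # lookup_record is fetched once, before the loop, and reused for every row.
    return [{**row, **lookup_record} for row in rows]
```

For instance, calling `add_lookup_values([{"id": 1}, {"id": 2}], {"region": "X"})` yields two rows that each carry the same `region` value, and the input rows are left unmodified.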
If the target for your Data Flow is the same database as the one from which you are extracting the 'lookup' record then you could also consider using an Execute SQL Task (in the Control Flow) to add the lookup values once the spreadsheet data has arrived in the database (once the Data Flow has completed). This would be much more efficient.