processing multiple files in business objects data services - business-objects

I am new to the Business Objects Data services.
I have to run a dataflow reading from a file. Filename should be read based on wild chars like Platform. And I want to run the dataflow only if the file exists, if file is not present , it should not error out or should not do anything but it should just move on to the next dataflow or workflow in the job.
I tried below code to check if the file exists as built_in function File_Exists cannot check the file based on wild chars.
*$FILEEXISTSFLAG= exec('/bin/ksh',' "ls xxxxxx/Platform.csv',8);*
My intention is based on the value assigned to $FILEEXISTSFLAG from above code, I will decide whether to execute the data flow or not (if $FILEEXISTSFLAG is null do nothing otherwise execute the data flow ) but its retrieving below output.
*ls: cannot access /xxxxxx/Platform.csv: No such file*
Is there any other way to achieve this?

I was able to solve the above problem by using the index function.
$FILEEXISTSFLAG is containing a value like "ls: cannot access Platform: No such file or directory ". So, I have used the index function as below. So if the output is not null for below index function, it will execute the dataflow, otherwise it will do nothing.
index( $FILEEXISTSFLAG , 'No such file',1)
Thanks,
Phani.

Related

Importing flat file which has changing column order using SSIS [duplicate]

Problem.
I regularly receive a feed files from different suppliers. Although the column names are consistent the problem comes when some suppliers send text files with more or less columns in there feed file.
Furthermore the arrangement of these files are inconsistent.
Other than the Dynamic data flow task provided by Cozy Roc is there another way I could import these files. I am not a C# guru but i am driven torwards using a "Script Task" control flow or "Script Component" Data flow task.
Any suggestion, samples or direction will greatly be appreciated.
http://www.cozyroc.com/ssis/data-flow-task
Some forums
http://www.sqlservercentral.com/Forums/Topic525799-148-1.aspx#bm526400
http://www.bidn.com/forums/microsoft-business-intelligence/integration-services/26/dynamic-data-flow
Off the top of my head, I have a 50% solution for you.
The problem
SSIS really cares about meta data so variations in it tend to result in exceptions. DTS was far more forgiving in this sense. That strong need for consistent meta data makes use of the Flat File Source troublesome.
Query based solution
If the problem is the component, let's not use it. What I like about this approach is that conceptually, it's the same as querying a table-the order of columns does not matter nor does the presence of extra columns matter.
Variables
I created 3 variables, all of type string: CurrentFileName, InputFolder and Query.
InputFolder is hard wired to the source folder. In my example, it's C:\ssisdata\Kipreal
CurrentFileName is the name of a file. During design time, it was input5columns.csv but that will change at run time.
Query is an expression "SELECT col1, col2, col3, col4, col5 FROM " + #[User::CurrentFilename]
Connection manager
Set up a connection to the input file using the JET OLEDB driver. After creating it as described in the linked article, I renamed it to FileOLEDB and set an expression on the ConnectionManager of "Data Source=" + #[User::InputFolder] + ";Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties=\"text;HDR=Yes;FMT=CSVDelimited;\";"
Control Flow
My Control Flow looks like a Data flow task nested in a Foreach file enumerator
Foreach File Enumerator
My Foreach File enumerator is configured to operate on files. I put an expression on the Directory for #[User::InputFolder] Notice that at this point, if the value of that folder needs to change, it'll correctly be updated in both the Connection Manager and the file enumerator. In "Retrieve file name", instead of the default "Fully Qualified", choose "Name and Extension"
In the Variable Mappings tab, assign the value to our #[User::CurrentFileName] variable
At this point, each iteration of the loop will change the value of the #[User::Query to reflect the current file name.
Data Flow
This is actually the easiest piece. Use an OLE DB source and wire it as indicated.
Use the FileOLEDB connection manager and change the Data Access mode to "SQL Command from variable." Use the #[User::Query] variable in there, click OK and you're ready to work.
Sample data
I created two sample files input5columns.csv and input7columns.csv All of the columns of 5 are in 7 but 7 has them in a different order (col2 is ordinal position 2 and 6). I negated all the values in 7 to make it readily apparent which file is being operated on.
col1,col3,col2,col5,col4
1,3,2,5,4
1111,3333,2222,5555,4444
11,33,22,55,44
111,333,222,555,444
and
col1,col3,col7,col5,col4,col6,col2
-1111,-3333,-7777,-5555,-4444,-6666,-2222
-111,-333,-777,-555,-444,-666,-222
-1,-3,-7,-5,-4,-6,-2
-11,-33,-77,-55,-44,-666,-222
Running the package results in these two screen shots
What's missing
I don't know of a way to tell the query based approach that it's OK if a column doesn't exist. If there's a unique key, I suppose you could define your query to have only the columns that must be there and then perform lookups against the file to try and obtain the columns that ought to be there and not fail the lookup if the column doesn't exist. Pretty kludgey though.
Our solution. We use parent child packages. In the parent pacakge we take the individual client files and transform them to our standard format files then call the child package to process the standard import using the file we created. This only works if the client is consistent in what they send though, if they try to change their format from what they agreed to send us, we return the file.

SSIS - Load flat files, save file names to SQL Table

I have a complex task that I need to complete. It worked well before since there was only one file but this is now changing. Each file has one long row that is first bulk inserted into a staging table. From here I'm supposed to save the file name into another table and then insert the the broken up parts of the staging table data. This is not the problem. We might have just one file or even multiple files to load at once. What needs to happen is this:
The first SSIS task is a script task that does some checks. The second task prepares the file list.
The staging table is truncated.
The third task is currently a Foreach loop container task that uses the files from the file list and processes it:
File is loaded into table using Bulk Insert task.
The file name needs to be passed as a variable to the next process. This was done with a C# task before but it is now a bit more complex since there could be more than one file and each file name needs to be saved separately.
The last task is a SQL task that executes a stored procedure with the file name as input variable.
My problem is that before it was only one file. This was easy enough. What would the best way be to go about it now?
In Data Flow Task which imports your file create a derrived column. Populate it with system variable value of filename. Load filename into the same table.
Use a Execute SQL task to retrieve distinc list of filenames into a recordset (Object type variable).
Use For Each Loop container to loop through the recordset. Place your code inside the container. Code will recieve filename from the loop as a value of a variable and process the file.
Use Execute SQL task in For Each Loop container to call SP. Pass filename as a parameter like:
Exec sp_MyCode param1, param2, ?
Where ? will pass filename INPUT as a string
EDIT
To make Flat File Connection to pick up the file specified by a variable - use Connection String property of the Flat File Connection
Select FF Connection, right click and select Properties
Click on empty field for Expressions and then click ellipsis that appears. With Expressions you can define every property of the object listed there using variables. Many objects in SSIS can have Expressions specified.
Add an Expression, select Connection String Property and define an expression with absolute path to the file (just to be on a safe side, it can be a UNC path too).
All the above can be accomplished using C# code in the script task itself. You can loop through all the files one by one and for each file :
1. Bulk Copy the data to the staging
2. Insert the filename to the other table
You can modify the logic as per your requirement and desired execution flow.
Add a colunm to your staging table - FileName
Capture the filename in a SSIS Variable (using expressions) then run something like this each loop:
UPDATE StagingTable SET FileName=? WHERE FileName IS NULL
Why are you messing about with C#? From your description it's totally unnecessary.

SSIS Use DataFlow task with variables instead of a source database

I have a task that I am working on that has me stumped. Hoping you can help me. I am using a data flow task which is basically inserting a row into a sqlite table. I was doing this using a "SQL Task" but unfortunately the only way to successfully insert a guid into the sqlite table is to convert it as a byte stream using the data flow task. I do not want to use a source database because my data is not flowing from one table to another. I really just want to take my populated variables and convert them to a byte stream which i can then insert successfully into a sqlite database. The issue is, i cannot use a dataflow task without a source database.
My work-around so far has been to declare a source database/table and only one column (but never use it in the data flow). This works fine and I am unable to insert the row into sqlite using my pre-set variables, but i am left with a somewhat annoying message in my Output log every time i do this:
Warning: 0x80047076 at , SSIS.Pipeline: The output column "" (117) on output "OLE DB Source Output" (11) and component "OLE DB Source" (1) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.
Anyone know of a good way to get this warning not to show up?
In your dataflow choose a Script Component.
When prompted to choose Source, Destination, or Transformation, choose Source.
Add your pre populated variables to the CustomProperties.ReadOnlyVariables section of the script tab.
Go to the Inputs and Outputs section.
Add a column to the default output for each of your variables.
In your script (if using C#) put something similar to the following in the CreateNewOutputRows() section
Output0Buffer.AddRow();
Output0Buffer.ContainerName = Variables.ContainerName;
Output0Buffer.TaskName = Variables.TaskName;
Output0Buffer.TaskStartDate = Variables.ContainerStartTime;
Save your script.
Connect your script component to your destination object.
If this is causing your package execution to get failed, you got an option of ignoring these warnings/errors..
Just double click the Source block in Dataflow and navigate to the last tab("Error OUtput") in left side pane and you need to select the option to ignore the errors. (I dont know eactly what phrase in that option will do it )

ssis - capturing the bad rows

HI, Can you help me to figure this out? Is there a way to get the row in which error occured in ssis? I have this flat file with some 10k + records which is being read via a 'flatfilesource'.
Right now the error output defaults to error-column, error-code, and 'flatfilesourceerroroutputcolumn' - and i use a script-component to handle it. But none of these three inputs (to script component) are user-friendly enough. So i want to get an output like the first column-value(this is a unique identifier) of the row in which error occured. How can I add that?
While debugging this in SSIS, you can add a Data Viewer on the path to where your script handles error. This path has all the columns of the original row where your error is.
If you want to handle your SSIS errors and also do something else with it, you can direct the error output from your flat file source to a Multicast and then send one stream down to a file, a table, or something else ( a Recordset destination and a subsequent foreach loop on the object used to store the Recordset will let you do stuff on a row-by-row basis on the errored row(s)).

SSIS 2005 - How to check a file to see if a files does not exist

How do you check to see if a file does not exist in SQL Server Integration Services 2005?
Is there a native SSIS component which will just do this for you?
I have checked for file existence this using the Script Task and then branch accordingly.
You can do something like
If System.IO.File.Exists("\\Server\Share\Folder\File.Ext") Then
Dts.TaskResult = Dts.Results.Success
Else
Dts.TaskResult = Dts.Results.Failure
End If
Although there are no native components for this, there are several third party components for SSIS that you can use for this purpose.
The File System Task in SSIS is basically for move, copy, delete, etc., but does not support file existence checks.
#Raj More gave a good solution. Another way that I have used before is to create a Foreach Loop Container that loops over the file system for a file spec. If you know the name of the file you want, then you can set the name in a variable and set the spec to equal the variable in the expression tab for the Foreach Loop Container. You could also just specify or a directory or a partial file name if you don't know the exact name but know the naming convention or know there will be no other files in the folder.
If you want to take a specific action based on whether or not there is a file, then you could create a variable with a default value of 0 and create a script task in the Foreach Loop Container that increments the variable. You could also just put the commands in the Foreach Loop Container that you want to execute if you want to execute it for the existence of each individual file. If you want to take an action based on the absence of the file, then you could restrict your precedence constraint after the Foreach Loop Container so that it is restricted on constraint and expression and make it check if the counter variable is > 0.
#Raj's solution could also be used to increment the variable. Instead of using an If Else to raise an error or success result, you could do this:
C#
if (System.IO.File.Exists("\\Server\Share\Folder\File.Ext"))
{ Dts.Variables["my_case_sensitive_variable_name"].Value = Dts.Variables["my_case_sensitive_variable_name"].Value + 1;
}
VB.NET
If System.IO.File.Exists("\\Server\Share\Folder\File.Ext") Then
Dts.Variables["my_case_sensitive_variable_name"].Value = Dts.Variables["my_case_sensitive_variable_name"].Value + 1
End If
The advantage of this approach is that the package may not need to fail in the absence of a file. You could also use a variable name if the file changes that you could define either as a variable in the package or just solely created in the script task. The only short-coming of #Raj's approach is that you have to know the file name you want to check.
Another possibility is to execute a File System Task to rename the file to its existing name or copy the file to its existing location. If the file doesn't exist, then you can route the error to an action. I don't recommend this solution, but I remember using it years ago in one instance where it actually made sense. But in that particular instance, I was actually copying it to a real location.
Good luck!