Disclaimer: new to SSIS and Active Directory
I have a need to extract all users within a particular Active Directory (AD) domain and import them into Excel. I have followed this: https://www.itnota.com/query-ldap-in-visual-studio-ssis/ in order to create my SSIS package. My SQL is:
LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=a*));Name,sAMAccountName
As you know there is a 1,000 row limit when pulling from the AD. In my SQL I currently have (name=a*) to test the process and it works. I need to know how to setup a loop with variables to pull all records and import into Excel (or whatever you experts recommend). Also, how do I know what the other field names are that are available to pull?
Thanks in advance.
How do I see what's in Active Directory
Tool recommendations are off topic for the site but a tool that you can download, no install required, is AD Explorer It's a MS tool that allows you to view your domain. Highly recommend people that need to see what's in AD use something like this as it shows you your basic structure.
What's my domain controller?
Start -> Command Prompt
Type set | find /i "userdnsdomain" and look for USERDNSDOMAIN and put that value in the connect dialog and I save it because I don't want to enter this every time.
Search/Find and then look yourself up. Here I'm going to find my account by using my sAMAccountName
The search results show only one user but there could have been multiples since I did a contains relationship.
Double clicking the value in the bottom results section causes the under pane window to update with the details of the search result.
This is nice because while the right side shows all the properties associated to my account, it's also updated the left pane to navigate to the CN. In my case it's CN=Users but again, it could be something else in your specific environment.
You might discover an interesting categorization for your particular domain. At a very large client, I discovered that my target users were all under a CN
(Canonical Name, I think) so I could use that in my AD query.
There are things you'll see here that you sure would like to bring into a data flow but you won't be able to. Like the memberOf that's a complex type and there's no equivalent in the data flow data types for it. I think Integer8 is also something that didn't work.
Loop the loop
The "trick" here is that we'll need to take advantage of the
The name of the AD provider has changed since I last looked at this. In VS 2017, I see the OLE DB Provider name as "OLE DB Provider for Microsoft Directory Service"
Put in your query and you should get results back. Let that happen so the metadata is set.
An ADO.NET source does not support parameterization as the OLE DB does. However, you can apply an Expression on the Data Flow which surfaces the component and that's what we'll do.
Click out of the Data Flow and back into the Control Flow and right click on the Data Flow and select Properties. In that properties window, find Expressions and click the ellipses ... Up pops the Property Expressions Editor
Find the ADO.NET source under Property and in the Expressions section, click the Ellipses.
Here, we'll use your same source query just to prove we're doing the right things
"LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=" + "a" + "*));Name,sAMAccountName"
We're doing string building here so the problem we're left to solve is how we can substitute something for the "a" in the above query.
The laziest route would be to
Create an SSIS variable of type String called CurrentLetter and initialize it to a
Update the expression we just created to be "LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=" + #[USer::CurrentLetter] + "*));Name,sAMAccountName"
Add a Foreach Loop Container (FELC) to your Control Flow.
Configure the FELC with an enumerator of "Foreach Item Enumerator"
Click the Columns...
Click Add (this results in Column 0 with data type String) so click OK
Fill the collection with each letter of the alphabet
In the Variable Mappings tab, assign Variable User::CurrentLetter to Index 0
Click OK
Old blog posts on the matter because I like clicks
https://billfellows.blogspot.com/2011/04/active-directory-ssis-data-source.html
http://billfellows.blogspot.com/2013/11/biml-active-directory-ssis-data-source.html
Imagine I have an external file dates.csv in the following format:
Name
Date
start_of_fin_year
01.03.2022
end_of_fin_year
28.02.2023
Obviously, the file may get updated in the future, and the date may change. I create a piece of code that checks the file periodically to extract needed dates and put them into the DB/variables. Roughly speaking, I have this pseudocode:
start_of_fin_year = SELECT Date FROM table WHERE Name = 'start_of_fin_year'
The problem I face: my code will break if I or someone else changes the name in the table. How do I prevent this?
FYI this is a personal project that I developed on my own, but I will have to give access to .csv files to others so they can update info. I'm afraid they may accidentally change the names, so that's why I'm worried.
I am new to the Business Objects Data services.
I have to run a dataflow reading from a file. Filename should be read based on wild chars like Platform. And I want to run the dataflow only if the file exists, if file is not present , it should not error out or should not do anything but it should just move on to the next dataflow or workflow in the job.
I tried below code to check if the file exists as built_in function File_Exists cannot check the file based on wild chars.
*$FILEEXISTSFLAG= exec('/bin/ksh',' "ls xxxxxx/Platform.csv',8);*
My intention is based on the value assigned to $FILEEXISTSFLAG from above code, I will decide whether to execute the data flow or not (if $FILEEXISTSFLAG is null do nothing otherwise execute the data flow ) but its retrieving below output.
*ls: cannot access /xxxxxx/Platform.csv: No such file*
Is there any other way to achieve this?
I was able to solve the above problem by using the index function.
$FILEEXISTSFLAG is containing a value like "ls: cannot access Platform: No such file or directory ". So, I have used the index function as below. So if the output is not null for below index function, it will execute the dataflow, otherwise it will do nothing.
index( $FILEEXISTSFLAG , 'No such file',1)
Thanks,
Phani.
Problem.
I regularly receive a feed files from different suppliers. Although the column names are consistent the problem comes when some suppliers send text files with more or less columns in there feed file.
Furthermore the arrangement of these files are inconsistent.
Other than the Dynamic data flow task provided by Cozy Roc is there another way I could import these files. I am not a C# guru but i am driven torwards using a "Script Task" control flow or "Script Component" Data flow task.
Any suggestion, samples or direction will greatly be appreciated.
http://www.cozyroc.com/ssis/data-flow-task
Some forums
http://www.sqlservercentral.com/Forums/Topic525799-148-1.aspx#bm526400
http://www.bidn.com/forums/microsoft-business-intelligence/integration-services/26/dynamic-data-flow
Off the top of my head, I have a 50% solution for you.
The problem
SSIS really cares about meta data so variations in it tend to result in exceptions. DTS was far more forgiving in this sense. That strong need for consistent meta data makes use of the Flat File Source troublesome.
Query based solution
If the problem is the component, let's not use it. What I like about this approach is that conceptually, it's the same as querying a table-the order of columns does not matter nor does the presence of extra columns matter.
Variables
I created 3 variables, all of type string: CurrentFileName, InputFolder and Query.
InputFolder is hard wired to the source folder. In my example, it's C:\ssisdata\Kipreal
CurrentFileName is the name of a file. During design time, it was input5columns.csv but that will change at run time.
Query is an expression "SELECT col1, col2, col3, col4, col5 FROM " + #[User::CurrentFilename]
Connection manager
Set up a connection to the input file using the JET OLEDB driver. After creating it as described in the linked article, I renamed it to FileOLEDB and set an expression on the ConnectionManager of "Data Source=" + #[User::InputFolder] + ";Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties=\"text;HDR=Yes;FMT=CSVDelimited;\";"
Control Flow
My Control Flow looks like a Data flow task nested in a Foreach file enumerator
Foreach File Enumerator
My Foreach File enumerator is configured to operate on files. I put an expression on the Directory for #[User::InputFolder] Notice that at this point, if the value of that folder needs to change, it'll correctly be updated in both the Connection Manager and the file enumerator. In "Retrieve file name", instead of the default "Fully Qualified", choose "Name and Extension"
In the Variable Mappings tab, assign the value to our #[User::CurrentFileName] variable
At this point, each iteration of the loop will change the value of the #[User::Query to reflect the current file name.
Data Flow
This is actually the easiest piece. Use an OLE DB source and wire it as indicated.
Use the FileOLEDB connection manager and change the Data Access mode to "SQL Command from variable." Use the #[User::Query] variable in there, click OK and you're ready to work.
Sample data
I created two sample files input5columns.csv and input7columns.csv All of the columns of 5 are in 7 but 7 has them in a different order (col2 is ordinal position 2 and 6). I negated all the values in 7 to make it readily apparent which file is being operated on.
col1,col3,col2,col5,col4
1,3,2,5,4
1111,3333,2222,5555,4444
11,33,22,55,44
111,333,222,555,444
and
col1,col3,col7,col5,col4,col6,col2
-1111,-3333,-7777,-5555,-4444,-6666,-2222
-111,-333,-777,-555,-444,-666,-222
-1,-3,-7,-5,-4,-6,-2
-11,-33,-77,-55,-44,-666,-222
Running the package results in these two screen shots
What's missing
I don't know of a way to tell the query based approach that it's OK if a column doesn't exist. If there's a unique key, I suppose you could define your query to have only the columns that must be there and then perform lookups against the file to try and obtain the columns that ought to be there and not fail the lookup if the column doesn't exist. Pretty kludgey though.
Our solution. We use parent child packages. In the parent pacakge we take the individual client files and transform them to our standard format files then call the child package to process the standard import using the file we created. This only works if the client is consistent in what they send though, if they try to change their format from what they agreed to send us, we return the file.
I am fairly new to SSIS and have set myself a challenging first project, creating a data driven package framework. My current challenge is that I want to store the values of variables for my various packages in a table and then load them. So for instance, the SSIS package might be processing records between 2 dates. I would have two records in a parameters table:
ParmName ParmValue
-------- ---------
DateFrom 2013-01-01
DateTom 2013-01-31
These variables names will exist in the package, I just need to load them. In a false start, I tried using an Execute SQL Task but this didn't work. I assume I need a Script Task C# to do this but I don't know C#. Wondering if anyone could give me a pointer to where I can find some code similar to what I am trying to do. Just to make it a bit clearer, in pseudo code I ebevision a process like
Dataset = Select * from PkgParms where PckID = ?
FOR EACH DataSet.Record
SET (DataSet.Record.ParmName.Value) = (DataSet.Record.ParmValue.Value)
If this is not doable or I am in over my head please just let me know
Thanks
Steve
This is usually done in SSIS - Package Configurations. Follow the wizard for SQL Server configuration type.
You can find some tutorials how to do it, but in general you'll need 2 new columns:
packagepath with values like:
\Package.Variables[User::DateFrom].Properties[Value]
configurationfilter which will have same values for both dates, e.g. Dates