Tracking parameter mappings in SSIS

In an SSIS project, I know how to set up a variable to hold the connection string for a Connection Manager - but how do I go the other way? That is, given an existing SSIS package, how do I find the name of the variable that provides the connection string for a particular Connection Manager?
I can find loads of references on how to set it up, and I expected that going to Connection Manager > Properties > Expressions would show me, but it doesn't. I did manage to find it by going to the Package Configuration Organizer, picking sensibly named configurations, editing them, and going to the second page of the wizard to find the exported property name. Surely this can't be the only way?
Regards, Stewart
EDIT - This is in Visual Studio 2008

First of all, grab BIDSHelper. It's a free add-in for Visual Studio, and at a minimum it helps identify when elements have Expressions and Configurations applied to them. One gets a teal highlight, the other a fuchsia one, and yes, an object can have both.
For the first scenario you described, look at Properties > Expressions to identify the use of expressions. For other objects, you might need to look at an Expressions tab.
Configurations work differently. You can use an environment variable, a registry value, a parent package value, an XML file, or a SQL Server table. The first three provide a 1:1 mapping between a configuration value and a configured item (variable, connection manager, etc.); XML and SQL Server can configure many items. The order in which configurations are applied is important, as you could have 5 configuration entries, each modifying the same setting with a different value. There is also a difference between how 2005 and 2008 apply configurations, so take a peek at Understanding How Integration Services Applies Configurations.
When a package loads, BIDS will indicate what configurations it is attempting to load (look in your Output window). Beyond the BIDSHelper highlighting, those messages are your other clue that configurations exist and are being applied. They are also your opportunity for detecting missing configurations: "I expected to find configuration X and didn't" means the configuration resource doesn't exist, while "I expected to configure property X but could not find it" means the thing being configured does not exist.
I have found the best approach is to define a common set of configurations (Sales connection, warehouse connection) that all the applications in an environment use, with a consistent configuration naming approach. We then use custom configurations for project-level things (the input and output path for the InsuranceProcessing packages would apply across all of those packages but would be different for Sales), and then a third set of configurations that is package specific. We use SQL Server tables for this, as it makes inspecting values much easier than glomming through lots of ugly XML.
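For reference, when you point the Package Configuration Wizard at SQL Server and let it create the storage table, the generated DDL looks roughly like this (a sketch; [SSIS Configurations] is just the default name, and the column sizes may differ in your version):
CREATE TABLE [dbo].[SSIS Configurations]
(
    ConfigurationFilter NVARCHAR(255) NOT NULL,
    ConfiguredValue NVARCHAR(255) NULL,
    PackagePath NVARCHAR(255) NOT NULL,
    ConfiguredValueType NVARCHAR(20) NOT NULL
)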
Lots of information, but nothing that directly answers your question. Sorry about that. You might be able to inspect the object model and look at what's configured but that's ugly.

Thanks to the above answers, I realised that it wasn't a variable, but a configuration. Once that became clear, more Googling led to an explanation of why the Target Object is blank in the Package Configuration Organizer (there can be more than one).
The answer is in the config database; the PackagePath entry holds \Package.Connections[connection manager name].Properties[ConnectionString], so to find out where a particular mapping comes from, use something like this:
SELECT TOP 1000 [ConfigurationFilter]
    ,[ConfiguredValue]
    ,[PackagePath]
    ,[ConfiguredValueType]
FROM [Database].[dbo].[Configurations_table]
WHERE ConfiguredValue LIKE '%object in which you are interested%'
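Going the other way, i.e. listing every configuration that targets any connection manager regardless of its value, you can filter on PackagePath instead (same hypothetical table name as above):
SELECT [ConfigurationFilter]
    ,[ConfiguredValue]
    ,[PackagePath]
    ,[ConfiguredValueType]
FROM [Database].[dbo].[Configurations_table]
WHERE PackagePath LIKE '\Package.Connections[[]%'
-- [[] matches a literal [ in a T-SQL LIKE pattern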
I still think it should be easier, but I hope this helps others.
Regards, Stewart

Great answer by billinkc. In addition to that answer (or rather, fleshing out the "inspect the object model and look at what's configured" part), I run a script at the start of any package that writes the values of all connection manager connection strings to the output window, followed by the connection string expression for each manager. It also loops through all the variables that have been specified for use in the script and outputs their values. Not so useful in production, but very useful when developing and testing.
Just add a Script Task to the start of the package flow, specify any variables you want to debug, then add the following code to the script:
'Report number of connections
Dts.Events.FireInformation(99, "debug", "number of connections = " & Dts.Connections.Count, "", 0, True)

'Loop through the connections collection
For Each cConnection As Microsoft.SqlServer.Dts.Runtime.ConnectionManager In Dts.Connections
    'Report the connection string value
    Try
        Dts.Events.FireInformation(99, "debug", "connection """ & cConnection.Name & """ value = " & cConnection.ConnectionString, "", 0, True)
    Catch
    End Try
    'Report the connection string expression, if any
    Try
        Dts.Events.FireInformation(99, "debug", "connection """ & cConnection.Name & """ constring expression = " & cConnection.GetExpression("ConnectionString"), "", 0, True)
    Catch
    End Try
Next

'Report number of variables
Dts.Events.FireInformation(99, "debug", "Number of Variables = " & Dts.Variables.Count, "", 0, True)

'Loop through the variables collection (only variables specified in the Script Task's ReadOnlyVariables/ReadWriteVariables are visible here)
For Each vVariable As Microsoft.SqlServer.Dts.Runtime.Variable In Dts.Variables
    'Report the variable value
    Try
        Dts.Events.FireInformation(99, "debug", "Variable """ & vVariable.Name & """ value = " & vVariable.Value.ToString(), "", 0, True)
    Catch
    End Try
Next

Related

SSIS - Loop Through Active Directory

Disclaimer: new to SSIS and Active Directory
I have a need to extract all users within a particular Active Directory (AD) domain and import them into Excel. I have followed this: https://www.itnota.com/query-ldap-in-visual-studio-ssis/ in order to create my SSIS package. My SQL is:
LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=a*));Name,sAMAccountName
As you know, there is a 1,000-row limit when pulling from AD. In my SQL I currently have (name=a*) to test the process, and it works. I need to know how to set up a loop with variables to pull all records and import them into Excel (or whatever you experts recommend). Also, how do I know what other field names are available to pull?
Thanks in advance.
How do I see what's in Active Directory
Tool recommendations are off topic for the site, but a tool that you can download, no install required, is AD Explorer. It's an MS tool that allows you to view your domain. I highly recommend that people who need to see what's in AD use something like this, as it shows you your basic structure.
What's my domain controller?
Start -> Command Prompt
Type set | find /i "userdnsdomain" and look for USERDNSDOMAIN; put that value in the connect dialog. I save it because I don't want to enter it every time.
Search/Find and then look yourself up. Here I'm going to find my account by using my sAMAccountName.
The search results show only one user, but there could have been multiple matches since I did a "contains" search.
Double-clicking the value in the bottom results section causes the pane underneath to update with the details of the search result.
This is nice because while the right side shows all the properties associated with my account, it also updates the left pane to navigate to the CN. In my case it's CN=Users but again, it could be something else in your specific environment.
You might discover an interesting categorization for your particular domain. At a very large client, I discovered that my target users were all under a CN (Common Name), so I could use that in my AD query.
There are things you'll see here that you'd sure like to bring into a data flow but won't be able to, like memberOf: it's a complex type, and there's no equivalent for it in the data flow data types. I think Integer8 is also something that didn't work.
Loop the loop
The "trick" here is that we'll need to take advantage of the
The name of the AD provider has changed since I last looked at this. In VS 2017, I see the OLE DB Provider name as "OLE DB Provider for Microsoft Directory Service"
Put in your query and you should get results back. Let that happen so the metadata is set.
An ADO.NET source does not support parameterization the way an OLE DB source does. However, you can apply an Expression on the Data Flow task, which surfaces the component's properties, and that's what we'll do.
Click out of the Data Flow and back into the Control Flow, right-click on the Data Flow Task, and select Properties. In that Properties window, find Expressions and click the ellipsis (...). Up pops the Property Expressions Editor.
Find the ADO.NET source in the Property column and, in the Expression column, click the ellipsis.
Here, we'll use your same source query, just to prove we're doing the right things:
"LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=" + "a" + "*));Name,sAMAccountName"
We're doing string building here so the problem we're left to solve is how we can substitute something for the "a" in the above query.
The laziest route would be to
Create an SSIS variable of type String called CurrentLetter and initialize it to a
Update the expression we just created to be "LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=" + @[User::CurrentLetter] + "*));Name,sAMAccountName"
Add a Foreach Loop Container (FELC) to your Control Flow.
Configure the FELC with an enumerator of "Foreach Item Enumerator"
Click the Columns...
Click Add (this results in Column 0 with data type String) so click OK
Fill the collection with each letter of the alphabet
In the Variable Mappings tab, assign Variable User::CurrentLetter to Index 0
Click OK
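On the pass where CurrentLetter holds "b", for example, the expression should evaluate to:
LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=b*));Name,sAMAccountName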
Old blog posts on the matter because I like clicks
https://billfellows.blogspot.com/2011/04/active-directory-ssis-data-source.html
http://billfellows.blogspot.com/2013/11/biml-active-directory-ssis-data-source.html

Importing flat file which has changing column order using SSIS [duplicate]

Problem.
I regularly receive feed files from different suppliers. Although the column names are consistent, the problem comes when some suppliers send text files with more or fewer columns in their feed file.
Furthermore, the arrangement of columns in these files is inconsistent.
Other than the Dynamic Data Flow Task provided by CozyRoc, is there another way I could import these files? I am not a C# guru, but I am driven towards using a "Script Task" in the control flow or a "Script Component" data flow task.
Any suggestions, samples, or direction will be greatly appreciated.
http://www.cozyroc.com/ssis/data-flow-task
Some forums
http://www.sqlservercentral.com/Forums/Topic525799-148-1.aspx#bm526400
http://www.bidn.com/forums/microsoft-business-intelligence/integration-services/26/dynamic-data-flow
Off the top of my head, I have a 50% solution for you.
The problem
SSIS really cares about metadata, so variations in it tend to result in exceptions. DTS was far more forgiving in this sense. That strong need for consistent metadata makes the Flat File Source troublesome to use.
Query-based solution
If the problem is the component, let's not use it. What I like about this approach is that, conceptually, it's the same as querying a table: the order of columns does not matter, nor does the presence of extra columns.
Variables
I created 3 variables, all of type string: CurrentFileName, InputFolder and Query.
InputFolder is hard wired to the source folder. In my example, it's C:\ssisdata\Kipreal
CurrentFileName is the name of a file. During design time, it was input5columns.csv but that will change at run time.
Query is an expression: "SELECT col1, col2, col3, col4, col5 FROM " + @[User::CurrentFileName]
Connection manager
Set up a connection to the input file using the JET OLEDB driver. After creating it as described in the linked article, I renamed it to FileOLEDB and set an expression on the ConnectionManager of "Data Source=" + @[User::InputFolder] + ";Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties=\"text;HDR=Yes;FMT=CSVDelimited;\";"
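With the InputFolder value above, that expression should evaluate to a connection string along these lines:
Data Source=C:\ssisdata\Kipreal;Provider=Microsoft.Jet.OLEDB.4.0;Extended Properties="text;HDR=Yes;FMT=CSVDelimited;";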
Control Flow
My Control Flow looks like a Data flow task nested in a Foreach file enumerator
Foreach File Enumerator
My Foreach File Enumerator is configured to operate on files. I put an expression on the Directory property of @[User::InputFolder]. Notice that at this point, if the value of that folder needs to change, it'll correctly be updated in both the Connection Manager and the file enumerator. In "Retrieve file name", instead of the default "Fully qualified", choose "Name and extension".
In the Variable Mappings tab, assign the value to our @[User::CurrentFileName] variable
At this point, each iteration of the loop will change the value of @[User::Query] to reflect the current file name.
Data Flow
This is actually the easiest piece. Use an OLE DB source and wire it as indicated.
Use the FileOLEDB connection manager and change the Data Access mode to "SQL Command from variable." Use the @[User::Query] variable in there, click OK and you're ready to work.
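To make that concrete: on the iteration that picks up input5columns.csv, the Query variable should evaluate to the following (with the Jet text driver, the file name serves as the table name):
SELECT col1, col2, col3, col4, col5 FROM input5columns.csv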
Sample data
I created two sample files, input5columns.csv and input7columns.csv. All of the columns of 5 are in 7, but 7 has them in a different order (col2 is at zero-based ordinal position 2 in one file and 6 in the other). I negated all the values in 7 to make it readily apparent which file is being operated on.
col1,col3,col2,col5,col4
1,3,2,5,4
1111,3333,2222,5555,4444
11,33,22,55,44
111,333,222,555,444
and
col1,col3,col7,col5,col4,col6,col2
-1111,-3333,-7777,-5555,-4444,-6666,-2222
-111,-333,-777,-555,-444,-666,-222
-1,-3,-7,-5,-4,-6,-2
-11,-33,-77,-55,-44,-66,-22
Running the package results in these two screen shots
What's missing
I don't know of a way to tell the query-based approach that it's OK if a column doesn't exist. If there's a unique key, I suppose you could define your query to have only the columns that must be there and then perform lookups against the file to try to obtain the columns that ought to be there, without failing the lookup if the column doesn't exist. Pretty kludgey, though.
Our solution: we use parent-child packages. In the parent package we take the individual client files and transform them to our standard-format files, then call the child package to process the standard import using the file we created. This only works if the client is consistent in what they send, though; if they try to change their format from what they agreed to send us, we return the file.

Shared data set for cached reports?

Is there a way in SSRS to create a snapshot for a report that uses a shared dataset? We are looking for a way to dynamically set the server and credentials in SSRS, but it seems that when a shared dataset is used there is no way to cache a report.
Two things that I think may help you:
You can create a dynamic connection string from parameters you pass in. However, you lose IntelliSense when creating this, so generally I build my dataset against an actual database first and change the connection string later:
A. Create a report parameter named Server and set its type to Text.
B. Create a local (embedded) data source. It must be local because, to my knowledge, you cannot make a shared data source dynamic; it has no parameter input to go on, so a shared data source must have fixed inputs.
C. Next to 'Connection string:', hit the 'fx' button to make the connection string dynamic. Build the connection string as text with your parameter as an input:
="Data Source=" & Parameters!Server.Value & ";Initial Catalog=(DBName)"
D. You NOW have to set up a dataset to supply the available values for the Server parameter; otherwise someone could type plain text and guess at a server. For this reason I usually create a dataset like:
select 'Server1' as Server
union
select 'Server2'
union
select 'Server3'
You can handle the cache aspect COMPLETELY from the hosted end and not worry about the report design. Once deployed, go to the report and choose 'Manage' > 'Snapshot Options', then set your preferences.
EDIT: You probably want another parameter for the database name as well; otherwise you will assume the same database all the time.
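For instance (a sketch; Database here is a second parameter you would add yourself, alongside Server):
="Data Source=" & Parameters!Server.Value & ";Initial Catalog=" & Parameters!Database.Value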

How to define connection string for multiple environments in SSIS?

Does anyone have any idea how we specify different connection strings in SSIS for different environments, like a system integration test environment, a user acceptance test environment, and a production environment?
Is it done by defining multiple connection managers, or can we define multiple configuration files and point our connection string at them?
You do not need to have multiple connections for the same database.
In SSIS 2008:
It's a good idea to have your connection strings defined using an expression. This is how you can do it:
Suppose you have added a new OLE DB connection in your connection manager. Copy the value of the ConnectionString property. It would look like this:
Data Source=(local);Initial Catalog=Learn;Provider=SQLNCLI10.1;Integrated Security=SSPI;Application Name=SSIS-RaggedFile-{03053F2E-8101-4985-9F2B-8C2DDE510065}(local).Learn;Auto Translate=False;
Remove the non-essentials:
Data Source=(local);Initial Catalog=Learn;Provider=SQLNCLI10.1;Integrated Security=SSPI;
Now create three new variables at the package level: sServer, sDb, and sProvider. The type for all three variables will be String. Their values, using this example, will be (local), Learn, and SQLNCLI10.1.
Go back to the ConnectionString property of your connection and set its expression to:
"Data Source=(local);Initial Catalog=Learn;Provider=SQLNCLI10.1;Integrated Security=SSPI;"
Now, replace the server, db, and provider with the variables you have created to make your expression look like this:
"Data Source=" + #[User::sServer] + ";Initial Catalog=" + #[User::sDb] + ";Provider=" + #[User::sProvider] + ";Integrated Security=SSPI;"
When we move from one environment to another, we may encounter different versions of the databases - hence it is helpful to have a variable for provider as well.
Now, these values can be changed at the time of deployment.
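For example, if you store them in a SQL Server configuration table, each environment's values are just rows keyed by ConfigurationFilter. A hypothetical UAT row for the server variable might look like this (the table and server names are illustrative):
INSERT INTO [dbo].[SSIS Configurations]
    (ConfigurationFilter, ConfiguredValue, PackagePath, ConfiguredValueType)
VALUES
    (N'UAT', N'UATSQL01', N'\Package.Variables[User::sServer].Properties[Value]', N'String');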
In SSIS 2012:
This method would still work with a little bit of change. Change these three variables to parameters and make them required. That way, at the time of deployment, you will be forced to supply the values. This is just a starting point; read up on server environments.
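To sketch what that looks like on the server side in 2012 (the folder and server names here are hypothetical), an environment and its variables live in the SSISDB catalog:
-- Create a UAT environment and a variable holding its server name
EXEC SSISDB.catalog.create_environment
    @folder_name = N'MyFolder',
    @environment_name = N'UAT';

EXEC SSISDB.catalog.create_environment_variable
    @folder_name = N'MyFolder',
    @environment_name = N'UAT',
    @variable_name = N'sServer',
    @data_type = N'String',
    @sensitive = 0,
    @value = N'UATSQL01',
    @description = N'UAT database server';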
Note: in my opinion, a variable should hold an atomic value. This helps avoid errors due to mistyping. That's why I created three separate variables as opposed to having one variable called sCnxn (for example) and having someone change a portion of that variable at the time of deployment.

Entity CreateDatabase: how to change the default .ldf log file name

Visual Web Developer, Entity data source model. I have it creating the new database fine. For example, it
creates SAMPLE1.MDF and SAMPLE1.LDF.
When I run my app, it creates another SAMPLE1_LOG.LDF file.
When I run CreateDatabase, is there a place I can specify _LOG.ldf for the log file name? SQL Server 2008 R2.
It messes things up when I run the DeleteDatabase functions... 2 log files...
How come it does not create the file SAMPLE1_Log.ldf to start with, if that is what it is looking for?
Thank you for your time,
Frank
// database or initial catalog produce same results...
// strip the .mdf off of newfile and see what happens?
// nope. this did not do anything... still does not create the ldf file correctly!!!
// sample1.mdf, sample1.ldf... but when run, it creates sample1_log.LDF...
newfile = newfile.Substring(0, newfile.Length - 4);
String mfile = "Initial Catalog=" + newfile + ";data source=";
String connectionString = FT_EntityDataSource.ConnectionManager.GetConnectionString().Replace("data source=", mfile);

// String mexclude = @"attachdbfilename=" + "|" + "DataDirectory" + "|" + @"\" + newfile + ";";
// nope. must have attach to create the file in App_Data, otherwise it goes to Documents and Settings, etc. for SQLEXPRESS.
// connectionString = connectionString.Replace(mexclude, "");

Labeldebug2.Text = connectionString;

using (FTMAIN_DataEntities1 context = new FTMAIN_DataEntities1(connectionString))
{
    // try
    // {
    if (context.DatabaseExists())
    {
        Buttoncreatedb.Enabled = false;
        box.Checked = true;
        boxcreatedate.Text = DateTime.Now.ToString();
        Session["zusermdf"] = Session["zusermdfsave"];
        return;
        // Make sure the database instance is closed.
        // context.DeleteDatabase();
        // i have an entire diff section for deletedatabase.. not here.
    }

    // View the database creation script.
    // Labeldebug.Text = Labeldebug.Text + " script ==> " + context.CreateDatabaseScript().ToString().Trim();
    // Console.WriteLine(context.CreateDatabaseScript());

    // Create the new database instance based on the storage (SSDL) section of the .edmx file.
    context.CreateDatabaseScript();
    context.CreateDatabase();
}
I took out all the try/catch so I can see anything that might happen...
==========================================================================
Rough code while working out the kinks..
The connection string it creates:
metadata=res://*/FT_EDS1.csdl|res://*/FT_EDS1.ssdl|res://*/FT_EDS1.msl;provider=System.Data.SqlClient;provider connection string="Initial Catalog=data_bac100;data source=.\SQLEXPRESS;attachdbfilename=|DataDirectory|\data_bac100.mdf;integrated security=True;user instance=True;multipleactiveresultsets=True;App=EntityFramework"
in this example, the file to create is "data_bac100.mdf".
It creates the data_bac100.mdf and data_bac100.ldf
When I actually run using this file and its tables, it auto-creates data_bac100_log.LDF.
1) I was trying to just not create the .ldf, so that when the system runs, it creates the single log file right off the bat...
2) The Initial Catalog and/or Database keywords are ONLY added to the connection string to run CreateDatabase(); the regular connection strings created in web.config only have the attachdbfilename stuff, and work fine.
I have 1 connection string for unlimited databases, with the main database in web.config. I use an initialize section based on the user roles (visitor, member, admin, anonymous, or not authenticated) which sets the database correctly with an expression builder and a function that parses the connection string with the correct values for the database to operate on. This all runs fine.
The Entity Framework automatically generates the script. I have tried with and without the .mdf extension; it makes no difference... I thought maybe there is a setting somewhere that holds naming conventions for .ldf files...
Eventually all of this will be for naught when I start trying to deploy somewhere that doesn't use the App_Data folder anyway...
Here is an example of the connection string created when running the application:
metadata=res://*/FT_EDS1.csdl|res://*/FT_EDS1.ssdl|res://*/FT_EDS1.msl;provider=System.Data.SqlClient;provider connection string="data source=.\SQLEXPRESS;attachdbfilename=|DataDirectory|\TDSLLC_Data.mdf;integrated security=True;user instance=True;multipleactiveresultsets=True;App=EntityFramework"
in this case, it uses the TDSLLC_Data.mdf file...
04/01/2012... follow-up... From the MSDN documentation (link below):
Entity Framework feature: Log files created by the ObjectContext.CreateDatabase method.
Change: When the CreateDatabase method is called, either directly or by using Code First with the SqlClient provider and an AttachDBFilename value in the connection string, it creates a log file named filename_log.ldf instead of filename.ldf (where filename is the name of the file specified by the AttachDBFilename value).
Impact: This change improves debugging by providing a log file named according to SQL Server specifications. It should have no unexpected side effects.
http://msdn.microsoft.com/en-us/library/hh367887(v=vs.110).aspx
I am on Windows XP with .NET 4 (not .NET 4.5)... I will hunt some more... but this looks like an issue that cannot be changed.
4/1/2012, 4:30...
OK, more hunting and searching on some of the inconsistencies I have experienced with CreateDatabase and DatabaseExists... .NET 4.5 is supposed to create the _log.ldf, and not just .ldf files, so they must have addressed this for some reason...
I found others with the same issues, but on a different server...
MySQL has a connector for EF4; the current version is 6.3.5 and its main functionality works fine, but it still has issues with a few methods, e.g.
•System.Data.Objects.ObjectContext.CreateDatabase()
•System.Data.Objects.ObjectContext.DatabaseExists()
which makes it difficult to fully use the model-first approach. It's possible by manually editing the MySQL script (available with the CreateDatabaseScript method). The MySQL team doesn't seem eager to solve those bugs, I'm not sure what the commitment level actually is from their part but it certainly is lower than it once was.
That being said, the same methods fail with SQL CE too (they are not implemented, and I don't see the MS team as likely to tackle that soon).
Ran out of space below... it just becomes a problem when you create a database and it does not create the _log.ldf file, only the .ldf file; then you use the database and it creates a _log.ldf file... now you have 2 log files, and one becomes invalid. Then when you're done with the database, you delete it, then try to create a new one, and since an .ldf already exists, it will not work...
It turns out this is just the way it is with EF4; they changed it in the EF 4.5 beta to create the _log.ldf file up front, to match what is created when the database is used.
Thanks for your time.
I've never used this "mdf attachment" feature myself and I don't know much about it, but according to the xcopy deployment documentation, you should not create a log file yourself because it will be automatically created when you attach the mdf. The docs also mention naming and say that the new log filename ends in _log.ldf. In other words, this behaviour appears to be by design and you can't change it.
Perhaps a more important question is, why do you care what the log file is called? Does it actually cause any problems for your application? If so, you should give details of that problem and see if someone has a solution.