SSIS newbie How to get data output to file

OK, so I'm a complete novice with SSIS, but I needed to export images stored in our DB relating to specific sales orders. I attempted to do this as a stored procedure in SQL, but that required cursors, and I found it very easy to do the same thing in SSIS. I have a bit of an odd question, though. The Data Flow Task with its OLE DB source works fine, but I have had to declare the output file path in the SQL. I have worked out how to create a dynamic file location in the Control Flow, but what I can't work out is how to remove the declared path and point it to the Control Flow File System Task. I really hope this makes sense, and I appreciate any assistance.
Control Flow
Data Flow Task
SQL -
DECLARE @Path nvarchar(1000);
SET @Path = N'C:\Users\X-JonC\Documents\';
SELECT
Products_IMAGEDATA.ImageData
, Products_IMAGEDATA.ImageTitle
, @Path + Imagetitle AS path
FROM
SalesOrders
INNER JOIN
SalesOrderItems
ON SalesOrders.SalesOrder = SalesOrderItems.SalesOrder
INNER JOIN
Products
ON SalesOrderItems.Product = Products.Product
AND SalesOrderItems.Product = Products.Product
INNER JOIN
Products_IMAGEDATA
ON Products.Product = Products_IMAGEDATA.Product
WHERE
SalesOrders.SalesOrderId = ?

A File System Task is something you use to perform operations on the file system (copy/rename/delete files/folders). You likely don't need a file system task unless you need to do something with the file after you've exported it to disk (like copy to a remote location or something).
A Data Flow Task is something you use to move data between multiple places. In your case, you'd like to move data from a database to a file system. The interesting twist is that you need to export binary/image data.
You have an excellent start by having created a data flow task and wired up an Export Column task to it.
The challenge you're struggling with is how to get an SSIS variable value into the query. Currently, the path is hard coded in the T-SQL variable @Path. Assuming what you have is working, you merely need to use the parameterization approach you already have with SalesOrderId (the ? placeholder) and populate the value of @Path in the same manner. Thus the SET line becomes SET @Path = ?;
One thing to note is that OLE DB and ODBC parameterization is based on ordinal position (0 and 1 based respectively). By adding this new parameter placeholder on the SET line, it becomes the first parameter, as it comes before the WHERE clause's usage, so you will need to update the mapping. Or you can be lazy and add DECLARE @SalesOrderId int = ?; above the SET line. This allows the first parameter to remain as is, and you add your new parameter as +1 to the usage. You'd then need to replace the final question mark with the local variable, like
DECLARE @Path nvarchar(1000);
DECLARE @SalesOrderId int = ?;
SET @Path = ?;
SELECT
Products_IMAGEDATA.ImageData
, Products_IMAGEDATA.ImageTitle
, @Path + Imagetitle AS path
FROM
SalesOrders
INNER JOIN
SalesOrderItems
ON SalesOrders.SalesOrder = SalesOrderItems.SalesOrder
INNER JOIN
Products
ON SalesOrderItems.Product = Products.Product
AND SalesOrderItems.Product = Products.Product
INNER JOIN
Products_IMAGEDATA
ON Products.Product = Products_IMAGEDATA.Product
WHERE
SalesOrders.SalesOrderId = @SalesOrderId;
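To see why the order of the ? placeholders matters, here is a minimal sketch in Python using the standard library's sqlite3 module, whose ? placeholders are also bound purely by position. It is only an analogy for the OLE DB behaviour described above, not SSIS itself; the orders table, its columns, and the export_paths helper are hypothetical names made up for the illustration.

```python
import sqlite3

# Hypothetical stand-in table; SQLite is used only because its "?" placeholders
# are bound by position, like the OLE DB parameters discussed above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, image_title TEXT)")
conn.execute("INSERT INTO orders VALUES (42, 'widget.png')")

# Two positional placeholders: the first "?" is the path prefix,
# the second "?" is the order id used in the WHERE clause.
QUERY = "SELECT ? || image_title AS path FROM orders WHERE order_id = ?"


def export_paths(conn, path_prefix, order_id):
    """Bind the values in the same order as the placeholders appear in QUERY."""
    rows = conn.execute(QUERY, (path_prefix, order_id)).fetchall()
    return [row[0] for row in rows]


# Values supplied in placeholder order: the row is found and the path is built.
correct = export_paths(conn, "C:\\export\\", 42)

# Values supplied in the wrong order: the order id lands in the path slot and
# the path lands in the WHERE clause, so nothing matches.
swapped = conn.execute(QUERY, ("C:\\export\\", 42)[::-1]).fetchall()

print(correct)
print(swapped)
```

The same reasoning is why the parameter mapping has to be updated when a new placeholder is added ahead of an existing one.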
Reference answers
Using SSIS to extract a XML representation of table data to a file
Export Varbinary(max) column with ssis

Related

What is the purpose of parameterizing connection manager in ssis?

I have an SSIS package with 2 connection managers.
When it is deployed to SQL Server and I right click and choose Execute, it allows me to set the connection manager configuration values.
In the same pop-up I can also set parameter values.
Similarly, I can right click and choose Configure to set the parameter and connection manager values.
So what exactly is the purpose of parameterizing connection managers in SSIS when I can configure the connection manager via the pop-up anyway?

A Parameter is a read only Variable that a package can receive at run time. An example of a package level parameter would be something like Processing Date. That way I can run yesterday's work and then rerun the package with Today's date.
A Variable can also be set at run-time but the mechanics of doing so are less intuitive. Net result is the same.
A Project Parameter is a read only Variable that all the packages in a project can reference. An example of a project level parameter would be a file path. At least in my world, I define that as a path like C:\ssisdata\MyProject and then I have Input/Output/Archive folders hanging off that path. When I get to production, or another developer's machine, maybe that value becomes D:\data or \\server2\share\MyProject
If each package had defined a Parameter of FilePath, then I would have to modify each package's parameter when it runs to reflect the server environment's value. If I change the value in the project, all of the packages pick up that new value.
That's all just in the execution environment from Visual Studio.
Running packages from the SSISDB
When you deploy to SQL Server's SSISDB catalog, you get some different options.
A simple case as you describe can be envisioned here.
Right click on the package and select Execute. The bold text for FilePath indicates I have changed it for this run of the package. The icon to the left show whether it is a project level parameter (first two) or package level (final one).
Behind the scenes, this generates the following SQL
DECLARE @execution_id bigint;
EXEC SSISDB.catalog.create_execution
@package_name = N'Package.dtsx'
, @execution_id = @execution_id OUTPUT
, @folder_name = N'So'
, @project_name = N'SO_66497856'
, @use32bitruntime = False
, @reference_id = NULL
, @runinscaleout = False;
SELECT
@execution_id;
DECLARE @var0 sql_variant = N'D:\ssisdata\MyProject';
EXEC SSISDB.catalog.set_execution_parameter_value
@execution_id
, @object_type = 20
, @parameter_name = N'FilePath'
, @parameter_value = @var0;
DECLARE @var1 smallint = 1;
EXEC SSISDB.catalog.set_execution_parameter_value
@execution_id
, @object_type = 50
, @parameter_name = N'LOGGING_LEVEL'
, @parameter_value = @var1;
EXEC SSISDB.catalog.start_execution
@execution_id;
GO
Every time I want to run this job and make it work for the environment (D: instead of C:), I would have to click the ellipses, ..., and provide a value.
Someone is going to mess that up, so you could script the T-SQL as I did and put that into the job definition. But if I run Package2, I would need to make the same run-time change with set_execution_parameter_value to ensure that package also used the D drive. By the time I get to Package100, I'd say there must be a better way.
If I right click on my project, SO_66497856, I have an option for Configure...
You can see me changing the value to an entirely different path on the D drive. Behind the scenes SQL is working with set_object_parameter_value
DECLARE @var sql_variant = N'D:\Set\Configure\Value';
EXEC SSISDB.catalog.set_object_parameter_value
@object_type = 20
, @parameter_name = N'FilePath'
, @object_name = N'SO_66497856'
, @folder_name = N'So'
, @project_name = N'SO_66497856'
, @value_type = V
, @parameter_value = @var;
GO
Now when I go to run the same package, look at that
It uses the Configured project parameter value without me having to provide a per run override (no bolded text).
For completeness, the last thing you can do is create an "Environment". An Environment is a set of shared variable values. For example, my Oracle User name and password (marked as sensitive) could be an Environment level thing because any of my 4 projects might want to use that value for configuration purposes. The Environment SOEnvironment is available to any of the projects.
I'm going to wire up MagicNumber from my Environment to my project's OtherProjectParameter.
Once again, right click on a project and choose Configure. Go to the References tab (this is a one time activity) and click Add and then find the Environment.
Now, back to the Parameters tab and click the ellipses on OtherProjectParameter. Notice that Use environment variable is no longer greyed out. This shows you allowable environment variables based on data type. Pick MagicNumber.
When you click OK, you now have an underscore on the configure screen
At this point, when I go to run the package, it will show me something like this
Pick your environment and that will get the OtherProjectParameter to fill in
That's a whirlwind tour of what your choices are and what/when they matter. How you should configure things is extremely dependent on your parameterization needs.
If you have multiple configurations enabled, then when you go to execute the package - either as a one-off execution or a SQL Agent job, you must pick the environment. Here I have SOEnvironment, SO_67402693_env0, and SO_67402693_env1 as sources for my package and for your deleted question, the latter two environments both provide a value for parameter p which is configuring OtherProjectParameter
When I go to execute the package, it will flag that it cannot start until an environment is picked. Here I select env0 and it results in the following T-SQL being generated. The @reference_id = 20002 is how any precedence would be determined, and in fact there is no precedence, as only one environment reference is allowed at runtime.
DECLARE @execution_id bigint;
EXEC SSISDB.catalog.create_execution
@package_name = N'Package.dtsx'
, @execution_id = @execution_id OUTPUT
, @folder_name = N'So'
, @project_name = N'SO_66497856'
, @use32bitruntime = False
, @reference_id = 20002
, @runinscaleout = False;
SELECT
@execution_id;
DECLARE @var0 smallint = 1;
EXEC SSISDB.catalog.set_execution_parameter_value
@execution_id
, @object_type = 50
, @parameter_name = N'LOGGING_LEVEL'
, @parameter_value = @var0;
EXEC SSISDB.catalog.start_execution
@execution_id;
GO
Similar commands are generated if this is done via SQL Agent instead of right clicking on a package to execute but the same single environment reference allowed will hold true.
what exactly is the purpose of parameterizing connection managers in ssis when I can anyways configure the connection manager via the pop-up?
Backwards compatibility. The pattern for 2005/2008 was to have SSIS connection strings with expressions driven by variables, which were in turn driven by classic Configuration, or to just use Configuration to inject values directly into the ConnectionString attributes. Some people continue to use that approach as is with the Project Deployment Model. Others use Package/Project parameters to pass in credentials or a connection string. I favor using the pop-up window to handle configuring connection managers as it's one less moving part to deal with.
An argument for project/package parameters is ftp credentials. The existing FTP task, last I used it, would fail if the expected file wasn't there. My pattern was to write a .NET script to handle FTP activities as I could better handle missing file scenarios. But, I would need to get credential data passed securely to my package and thus, I needed package parameters and I would check the Sensitive box. Were I to have supplied them at run-time, then they would be saved in clear text in the SQL Agent job steps.

SSIS Execute SQL Task error no rows returned

I am a bit new to SSIS and have been given a task to send mail to particular stores based on Purchase Orders -> PONumber.
The steps should be as follows:
1) Take an XML file from a particular folder
2) Get the PONumber from that file
3) Write a query to fetch all the store email addresses for the PONumbers
4) Send a mail to the particular restaurant
The screenshot below shows the package I created. The only thing giving me an issue is the Execute SQL Task, and I am not sure what the exact cause is.
Could you please help me with how to debug this? It was working fine before, but suddenly it started showing errors.

The Execute SQL Task is expecting results from the query, but is not getting any. You could use SQL Server Profiler to catch the exact SQL that is executed on SQL Server. Then you can run that SQL in a query window to see what it returns, or why it is not giving any results.
Edit.
With the additional information you have given, the interesting place is the "Parameter Mapping" page, which you did not include. You should link the SSIS variable to the query parameter there, as Matt explained. SSIS does NOT link your SSIS variables and query parameters automatically, even if they have the same names.

@dvlpr is correct: your problem is that you are getting NO results when Execute SQL Task 1 needs a single result.
The code you pasted is a little unclear as to which code is where, but I will assume the first part is the code you use in the SSIS Execute SQL Task and the latter is an example from SSMS. If that is the case, the problem is that you are assigning the variable a value of 0 in the script itself, and I assume there is no PONUMBER that is 0:
Declare @POID as Varchar(50)
Set @POID = 0
WHERE (BizTalk_POA_HEADER.PONUMBER = @POID)
If you want to pass in the PONUMBER from your first Data Flow Task, you need to load it into a variable, then use that variable in your Execute SQL Task, and make sure you set up the parameter mapping correctly when doing so. Here is one SO question on parameters that will help: How to pass variable as a parameter in Execute SQL Task SSIS? And here is the use of a script component in a Data Flow Task to set the variable's value: SSIS set result set from data flow to variable (note: use the non-accepted answer, as it was added later and is for 2012+, while the original was for 2008).
Next unless you are guaranteed only 1 result you will also need to add TOP 1 to your select statement because if you get more than 1 result you will get a different error again.
EDIT Per all of the comments:
So the configuration looks like you are using an ADO.NET connection, which allows you to use named parameters. There are restrictions if you don't use that (https://msdn.microsoft.com/en-us/library/cc280502.aspx). The parameter mapping looks correct, and the result set should be fine. As for your error, I don't know what the problem is because you haven't posted the exact error. If you use ADO.NET with the Execute SQL Task configuration shown in your images, you do have a problem: you are trying to DECLARE the variable that you want to pass as a parameter, which doesn't work, so you need to remove that DECLARE statement. I suspect all you really need to do is modify your SQL input to be:
SELECT DISTINCT BizTalk_POA_HEADER.PONUMBER, FAN_Suppliers.SupplierName,
FAN_Company_Details.CompanyName, FAN_Company_Details.[PrimaryEmail],
BizTalk_POA_HEADER.[DeliveryDate]
FROM BizTalk_POA_HEADER INNER JOIN
FAN_PO_Details ON BizTalk_POA_HEADER.PONUMBER =
CONCAT('PO',FAN_PO_Details.PoNumber) INNER JOIN
FAN_PO ON FAN_PO_Details.PurchaseOrderID = FAN_PO.PurchaseOrderID
INNER JOIN FAN_SupplierDetails ON FAN_PO.SupplierDetailsID =
FAN_SupplierDetails.SuppliersDetailsID INNER JOIN
FAN_Suppliers ON FAN_SupplierDetails.SupplierID = FAN_Suppliers.SupplierID
INNER JOIN FAN_Company_Details ON FAN_PO.CompanyID =
FAN_Company_Details.CompanyDetailsID
WHERE (BizTalk_POA_HEADER.PONUMBER = @POID)
Just get rid of the DECLARE @POID and SET @POID = 0 for a couple of reasons: 1, it is redundant when you have set up parameter mapping; 2, SSIS doesn't like it and will throw an error; 3, you are setting a value of 0, which means it would always be 0.

Get last cube processed date in SSIS

I need to get last processed date of SSAS cube in SSIS and save it into a variable.
I've tried a "Execute SQL task":
SELECT LAST_DATA_UPDATE as LAST_DT FROM $system.mdschema_cubes
WHERE CUBE_NAME = 'CubeName'
It works ok in MSSQL management studio MDX query window but in SSIS it says: Unsupported data type on result set binding.
Then I've tried:
WITH MEMBER [Measures].[LastProcessed] AS ASSP.GetCubeLastProcessedDate() SELECT [Measures].[LastProcessed] ON 0 FROM [CubeName]
And it says '[ASSP].[GetCubeLastProcessedDate]' function does not exist.
Any ideas how to do this?
Thank you

A linked server might be your best option.
Create the linked server with the following, changing as appropriate:
EXEC master.dbo.sp_addlinkedserver
@server = N'LINKED_SERVER_OLAP_TEST', --Change to a suitable name
@srvproduct='', --Leaves the product name blank
@provider=N'MSOLAP', --Analysis Services
@datasrc=N'localhost', --Change to your datasource
@catalog=N'TESTCUBE' --Change to set the default cube
Change the connection of your Execute SQL Task so that it points to a database on the server where the linked server is hosted; i.e. don't use an Analysis Services data source, use a standard OLE DB connection. Then put the following in your Execute SQL Task (changing as appropriate).
SELECT *
FROM OpenQuery(LINKED_SERVER_OLAP_TEST,'SELECT LAST_DATA_UPDATE as LAST_DT FROM $system.mdschema_cubes
WHERE CUBE_NAME = ''CUBENAME''')
Set the variable to be DATETIME and the result set to be single row.
There may well be other ways to do this; however, I have always found this method the most straightforward.

RMySQL update row, not full table

Does anyone know how I can use RMySQL (or another library) to update a row in a table, rather than having to pull out the full table and push it back in? I don't want to read such a huge table into memory just to update one row.
What I am trying to do is pull out a row, change some of the values in there within R and push the same row object back into the table.
However, dbWriteTable seems to replace the entire table rather than just the row I specify.

The easiest way is to construct a string within R containing the appropriate SQL UPDATE statement and use dbSendQuery to send it to the database, so only the targeted row is changed.
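The idea of sending a single UPDATE statement instead of rewriting the whole table can be sketched as follows. This is a Python illustration using the standard library's sqlite3 module as a stand-in for MySQL, not RMySQL code; the people table, the ALLOWED_COLUMNS set, and the update_row helper are hypothetical names invented for the example. Values are bound as parameters rather than pasted into the string, which is usually safer than plain string concatenation.

```python
import sqlite3

# Hypothetical allow-list: column names cannot be bound as parameters,
# so they are checked against known names before being placed in the SQL.
ALLOWED_COLUMNS = {"name", "age"}


def update_row(conn, row_id, new_values):
    """Update only the row with the given id; return the number of rows changed."""
    unknown = set(new_values) - ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unexpected columns: {sorted(unknown)}")
    assignments = ", ".join(f"{col} = ?" for col in new_values)
    sql = f"UPDATE people SET {assignments} WHERE id = ?"
    cur = conn.execute(sql, (*new_values.values(), row_id))
    conn.commit()
    return cur.rowcount


# Small in-memory table standing in for the large table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Ann", 30), (2, "Bob", 41), (3, "Cy", 25)],
)

changed = update_row(conn, 2, {"age": 42})
rows = conn.execute("SELECT id, name, age FROM people ORDER BY id").fetchall()
print(changed, rows)
```

Only the statement and its bound values travel to the database, so the table never has to be read into memory.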

Using sqldf package:
library(sqldf)
table_name = data.frame(a = 1:10, b = 4)
# Open connection
sqldf()
fn$sqldf("update table_name set b=1")
ans = sqldf("select * from main.table_name")
# Close connection
sqldf()
print(table_name)

Parameters in SQL Server 2008

I have a stored procedure that pulls data for a report. I'm having a problem with the parameters. I have a couple temp tables and some joins that work so I have omitted them below. The problem is this line:
WHERE
SeminarDivision = @SeminarDivision AND SeminarType = @SeminarType
When I put this WHERE clause in to use my seminar parameters, the stored proc returns nothing. But I need to generate a report based on those two parameters. So where do the parameters go? Can anyone help?
@StartDate DateTime,
@EndDate DateTime,
@SeminarDivision VARCHAR(50),
@SeminarType VARCHAR(50)
)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
... OMITTED
SELECT
WL.PID,
CONVERT(varchar(20), upper(substring(FirstName,1,1))+
LOWER(substring(FirstName,2,19))) AS FirstName,
CONVERT(varchar(20), upper(substring(LastName,1,1))+
LOWER(substring(LastName,2,19))) AS LastName,
S.SeminarDivision,
S.SeminarType,
S.StartDate,
S.SeminarLocation
FROM
#tblWaitList WL
INNER JOIN #tblSeminar S ON WL.SeminarGuid=S.SeminarGuid
WHERE
SeminarDivision = @SeminarDivision AND SeminarType = @SeminarType
ORDER BY
LastName,FirstName,StartDate

First and foremost, there is nothing wrong with your code. As for where the parameters go, they go exactly where you put them. The question is: are the values coming in for SeminarDivision and SeminarType the right data? As a test,
copy the code into a new query window in the editor. Run the query without the WHERE clause; if you get rows, great. Now change the WHERE clause to
WHERE
SeminarDivision = 'Possible_Value'
where Possible_Value is a value you know exists in the data. If it returns rows, good. Now add the second condition, also hardcoding a value:
WHERE SeminarDivision = 'Possible_Value' AND SeminarType = 'Possible_Value_2'
Getting any data? Is it possible you want OR rather than AND?

There's nothing wrong with the 'location' of your params.
If you're getting no data back, it's either because you've not populated #tblWaitList or #tblSeminar, or because the records simply don't match your WHERE clause.
Check that your params have the values you think they do by executing print @SeminarDivision etc.
SELECT * FROM #tblSeminar may give you a clue too.
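One way to find out which condition is filtering everything away is to count the rows that survive each predicate on its own. The sketch below does this in Python with the standard library's sqlite3 module as a stand-in for SQL Server; the seminar table, its sample rows, and the count_matches helper are hypothetical and only illustrate the approach.

```python
import sqlite3


def count_matches(conn, division, seminar_type):
    """Count rows matching no filter, each filter alone, and both together."""
    checks = {
        "all rows": ("SELECT COUNT(*) FROM seminar", ()),
        "division only": (
            "SELECT COUNT(*) FROM seminar WHERE division = ?",
            (division,),
        ),
        "type only": (
            "SELECT COUNT(*) FROM seminar WHERE seminar_type = ?",
            (seminar_type,),
        ),
        "both": (
            "SELECT COUNT(*) FROM seminar WHERE division = ? AND seminar_type = ?",
            (division, seminar_type),
        ),
    }
    return {
        label: conn.execute(sql, params).fetchone()[0]
        for label, (sql, params) in checks.items()
    }


# Hypothetical sample data standing in for the temp table contents.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seminar (division TEXT, seminar_type TEXT)")
conn.executemany(
    "INSERT INTO seminar VALUES (?, ?)",
    [("North", "Workshop"), ("North", "Lecture"), ("South", "Workshop")],
)

results = count_matches(conn, "North", "Webinar")
print(results)
```

In this made-up data the division matches two rows but the type matches none, which points at the SeminarType value rather than at where the parameters are placed.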

You are not setting parameters correctly for the call.
Try this in SSMS, change values accordingly
EXEC Proc '20110101', '20111101', 'PossibleDivision', 'PossibleType'
If this fails, then show us the "OMITTED" code.
If this works, show us how you are calling this from the client code.
