This is not exactly the same as passing a huge string parameter to stored procedure ....
I have a SQL Server 2008 sproc that takes an int id and a large string and inserts them into a table. The sproc is called from .NET 4.0 code, which does a File.ReadAllText into a string and then sends it to the sproc. The source of the string is a text file. The thought of reading many 100 MB files all day into immutable "use once" strings and then sending them to SQL Server sounds memory-wasteful on the C# side.
What is a smarter way to stream the text from disk to the sproc? I can change the current Varchar(Max) parameter to anything that makes better sense.
All ideas appreciated.
Thanks.
I see three options for your situation:
Keep your existing design
Use OPENROWSET in the SP
Use FILESTREAM
Item one is best if the files are remote from the server and have unique names.
Item two will take the work off the C# side, but you will have to deal with read permissions for SQL Server on the file, a naming convention, and file cleanup.
Item three is the current best practice for SQL Server 2008. There are numerous how-to articles to follow. This choice lets SQL Server manage the file while keeping it on NTFS storage.
I am implementing a SSIS package and currently trying to do the following.
Truncate the destination table
Fetch the data by executing the stored procedure and insert it into the destination table.
I have created an Execute SQL task to address step 1 and a data flow with an OLE DB source and an OLE DB destination to address step 2. It has been working successfully so far, but it isn't working for one of my stored procedures that uses temp tables.
When I edit the OLE DB source and click the Preview button, I get the error "no column returned".
I know that SSIS has an issue with generating columns when executing stored procedures that depend on temp tables. I have converted the stored proc to use table variables, and it is now able to return columns in SSIS when I do a preview. The only downside is that the stored procedure takes longer to execute: 1 hour 15 minutes, compared to 15 minutes when using temp tables.
I did see a suggestion to use SET FMTONLY before executing the stored procedure as an alternative to switching to table variables, but that didn't seem to work, as I am getting a "syntax or permission denied" error.
Could somebody suggest a solution to my problem that does not compromise on performance?
Sounds like you've already read all the approaches to using Temp tables in SSIS, including the IF 1=0... trick? If you haven't seen that one yet, google it.
You say that using table variables causes your stored procedure to take about 5 times longer than using temp tables. The most likely reason is that you are indexing your temp tables but not your table variables. In case you didn't know, table variables can be indexed too (on SQL Server 2008, by declaring PRIMARY KEY or UNIQUE constraints in the table variable definition). You might try that.
Finally, a solution that you haven't mentioned is that you can replace your temporary table with a real table that gets truncated when you're done using it.
Short comment:
Try EXEC WITH RESULT SETS and specify the metadata yourself for a proc with temp tables; or use the Script Component as a source and specify the Output columns yourself.
Long comment:
Technically speaking, it is the driver/database you are using in SSIS that would decide the behavior when working with temp tables.
Metadata is an important factor when using SSIS's pipeline components. By metadata, I mean the names of the columns, their data types etc that a pipeline component uses. When designing a data flow, someone/something should provide this metadata to the components that require it.
In most cases, SSIS automatically retrieves the metadata. Components that do not connect to an external data source, like Conditional Split, get their metadata from the other components they are connected to. For the pipeline components that connect to an external data source (like OLE DB Source, OLE DB Destination, Lookup etc.), SSIS provides a mechanism to get this metadata without human involvement. This mechanism involves the driver connecting to the database and retrieving the metadata of the output. If the driver/database is capable of returning the metadata, then that metadata is used. If the driver/database is incapable, then you get the errors you are seeing. The rest of my comments are based on the assumption that you are using a SQL Server database.
When working with a SQL Server database in SSIS, we typically use the native client drivers provided by Microsoft. These drivers try to get the metadata without actually executing the SQL statement (actual execution can have side effects and might take anywhere from seconds to hours, and you don't want side effects or long waits at package design time). So to get the metadata, the driver relies on the metadata of the actual objects used in the SQL command. If the command uses a physical table or view, SQL Server already has the metadata available and can supply it to the driver. If it is a temp table, SQL Server does not have the metadata until it can create the temp table. With the SET FMTONLY option, you can arrange for the temp tables to be created while avoiding any heavy processing or side effects, and thus retrieve the metadata without penalties. From SQL Server 2012 onwards, the native client drivers rely on newer functionality to retrieve metadata than the earlier drivers did: they call the sp_describe_first_result_set proc. So whether you can get metadata or not is determined by what sp_describe_first_result_set is able to work out.
So while SSIS can automatically get the metadata (because of the driver/database), it does not automatically get the metadata in some cases (again because of the driver/database). In cases involving the second scenario, some other process (typically a human) can help the driver infer metadata or provide the metadata to the component directly.
To help the driver, in the case of SQL Server 2012 and later, you can use the WITH RESULT SETS clause to specify the output metadata. When this clause is present, the driver uses it and doesn't try to query the metadata from system objects, which avoids the error you would otherwise get. If you are using the drivers that came with SQL Server 2008, you can use SET FMTONLY. This option works at the driver/database level.
Another option could be to use a Script Component as the Source and in the Output columns, you can specify the columns/metadata. SSIS would not try to retrieve metadata from the datasource in this case, but would rely on the definitions you provided in the Output section of the Script Component.
As you can see, both options involve a human (or some other process) specifying the metadata instead of SSIS trying to retrieve it in an automated fashion. I would prefer the first option when working with SQL Server and the second option when working with databases like MySQL.
My current application is written in Java with Hibernate on top of SQL Server 2008, and I used the HierarchyId data type for the department hierarchy in my database.
I have written SQL queries that deal with the HierarchyId data type, and the department tree structure is n levels deep.
Now I want to change my database server from SQL Server 2008 to MySQL, as per a business requirement.
After a feasibility check, I concluded that my whole application can migrate to MySQL except for the HierarchyId data type.
So my main challenge is to find an alternative to the HierarchyId data type that requires minimal code changes.
What is the best way to implement a department hierarchy in my database?
Thanks...
I faced a similar situation when our team decided to migrate from MS SQL to MySQL. We resolved the issue using the following steps:
Added a column of type varchar(100) to the same table in MS SQL.
Converted the hierarchyid from its hexadecimal value to a string using the hierarchyid.ToString() function and saved it in the newly created column using the computed column functionality, e.g. 0x58 -> "/1/", 0x7CE0 -> "/3/7/".
The level of the entity is equal to the number of '/' characters minus 1.
These columns could then be migrated to MySQL.
The IsDescendantOf() method was replaced with a LIKE comparison against the ancestor's path string concatenated with '%'.
Thus we got rid of the hierarchyid functionality in MySQL.
Whenever we face such an issue, we just need to ask ourselves what we would have done if the tool we use had not provided this functionality. We generally end up with a good answer.
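The string-path steps above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the department table, its rows, and the helper names level and descendants are all hypothetical.

```python
import sqlite3

def level(path):
    # Level = number of '/' characters minus 1, so "/" is 0 and "/3/7/" is 2.
    return path.count("/") - 1

def descendants(conn, ancestor_path):
    # Replacement for IsDescendantOf(): every path that starts with the
    # ancestor's path. Like IsDescendantOf(), this includes the node itself.
    rows = conn.execute(
        "SELECT name FROM department WHERE path LIKE ? ORDER BY path",
        (ancestor_path + "%",),
    )
    return [row[0] for row in rows]

# Hypothetical sample data: path holds the output of hierarchyid.ToString().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE department (name TEXT, path TEXT)")
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [
        ("Root", "/"),
        ("Sales", "/1/"),
        ("Engineering", "/3/"),
        ("Platform", "/3/7/"),
        ("Tools", "/3/7/2/"),
        ("Support", "/30/"),
    ],
)

under_engineering = descendants(conn, "/3/")
```

Because every stored path ends with '/', the prefix "/3/" does not match the sibling "/30/", which is why the trailing slash should be kept in the migrated column.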
MySQL has no equivalent that I'm aware of, but you could store the same data in a varchar.
For operations involving the HierarchyId, you're probably going to have to implement them yourself, probably as either user-defined functions or stored procedures.
What SQL Server does looks like the "materialized path" method of storing a hierarchy. One example of that in MySQL can be seen at http://www.cloudconnected.fr/2009/05/26/trees-in-sql-an-approach-based-on-materialized-paths-and-normalization-for-mysql/
I am creating an ETL in SSIS in which I want my data source to be a restricted query, like select * from table_name where id='Variable'. This variable is one I defined as a user-created variable.
I do not understand how I can have my source query interact with the SSIS scoped Variable.
The only present options are
Table
Table from variable
SQL Command
SQL command from a variable
What I want is to have a SQL statement having a variable as parameter
Simple. Choose SQL command as the Data Access Mode. Enter your query with a question mark as a parameter placeholder. Then click the Parameters button and map your variable to Parameter0 in the Set Query Parameters dialog.
More information is available on MSDN.
An inferior alternative to @Edmund's approach is to use an Expression on another Variable to build your string. Assuming you have @[User::FirstName] already defined, you would then create another variable, @[User::SourceQuery].
In the properties for this variable, set EvaluateAsExpression to True and then set an Expression like "SELECT FirstName, LastName FROM Person.Person WHERE FirstName = '" + @[User::FirstName] + "'". The double quotes are required because we are building an SSIS String.
There are two big reasons this approach should not be employed.
Caching
This approach is going to bloat your plan cache in SQL Server with N copies of essentially the same query. The first time it runs and the value is "Edmund" SQL Server will create an execution plan and save it (because it can be expensive to build them). You then run the package and the value is "Bill". SQL Server checks to see if it has a plan for this. It doesn't, it only has one for Edmund and so it creates another copy of the plan, this time hard coded to Bill. Lather-rinse-repeat and watch your available memory dwindle until it unloads some plans.
By using the parameter approach, when the query is submitted to SQL Server, it should create a parameterized version of the plan internally and assume that all parameters supplied will result in equally costly executions. Generally speaking, this is the desired behaviour.
If your server has the "optimize for ad hoc workloads" option enabled (it is off by default), the bloat should be mitigated, because only a small plan stub is cached the first time each distinct query text runs.
SQL Injection
The other big nasty you will run into with building your own string is that you open yourself up to SQL Injection attacks or at the least, you can get runtime errors. It's as simple as having a value of "d'Artagnan." That single quote will cause your query to fail resulting in package failure. Changing the value to "';DROP TABLE Person.Person;--" will result in great pain.
You might think it's trivial to safe quote everything but the effort of implementing it consistently everywhere you query is beyond what your employer is paying you. All the more so since there is native functionality provided to do the same thing.
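The difference between the two approaches can be reproduced outside SSIS. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for SQL Server and a hypothetical person table; like the OLE DB source, sqlite3 uses a question mark as the parameter placeholder.

```python
import sqlite3

# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Edmund", "Smith"), ("d'Artagnan", "Jones")],
)

def find_concatenated(conn, first_name):
    # Builds the SQL text by hand, as the expression-based variable does.
    sql = "SELECT last_name FROM person WHERE first_name = '" + first_name + "'"
    return conn.execute(sql).fetchall()

def find_parameterized(conn, first_name):
    # The value travels separately from the SQL text, so quotes are harmless.
    sql = "SELECT last_name FROM person WHERE first_name = ?"
    return conn.execute(sql, (first_name,)).fetchall()

# The single quote in the value breaks the hand-built statement.
try:
    find_concatenated(conn, "d'Artagnan")
    concatenated_failed = False
except sqlite3.OperationalError:
    concatenated_failed = True

parameterized_rows = find_parameterized(conn, "d'Artagnan")
```

The concatenated version only works for values that happen to contain no quote characters, whereas the parameterized version sends the same SQL text every time regardless of the value.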
When using an OLE DB connection manager (with the SQL Server Native Client 11.0 provider in my case) you may get an error like this:
Parameters cannot be extracted from the SQL command. The provider
might not help to parse parameter information from the command. In
that case, use the "SQL command from variable" access mode, in which
the entire SQL command is stored in a variable.
So you need to explicitly specify the database name in the OLE DB connection manager properties. Otherwise SQL Server Native Client may use a different database than you intend (e.g. master in SQL Server).
In some cases you can instead explicitly specify the database name for each database object used in the query, e.g.:
select Name
from MyDatabase.MySchema.MyTable
where id = ?
I have a DataTable that my application fills with values the user supplies via an Excel file. My application targets .NET Framework 2.0 and I can NOT change it to 3.0 or 3.5 in order to use LINQ.
So I have to send my DataTable values to a stored procedure and use them in a join operation.
Is this a good solution or not? If yes, how can I send my DataTable to a stored procedure as an input parameter?
Thank you
By using table-valued parameters you can send data to a SQL Server stored procedure. This user-defined type represents the definition of a table structure and is supported in SQL Server 2008 and later versions.
You can find an example and more information in this MSDN article.
Large Complex: For large complex data I'd probably get your data into a *.csv file (if it's not already that way in Excel), use .NET to BCP it into #temp tables, and then on that same connection call the proc and have the proc always know to look for the data in the #temp tables. BCP is the fastest way to get large chunks of data into SQL Server.
Medium Size: If the size of the data is moderate then you could format it as XML and send that to the proc. Here's a quick example using C# http://www.a2zdotnet.com/View.aspx?Id=107
Small Delimited: For small list-type data you might be able to get away with sending it as a comma-delimited string of values. This is very handy when sending a list of IDs to a proc (http://blog.logiclabz.com/sql-server/split-function-in-sql-server-to-break-comma-separated-strings-into-table.aspx)
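The "load into a temp table, then join on the same connection" idea can be sketched as below. This is an illustration only, assuming Python's built-in sqlite3 module in place of SQL Server and BCP; the product and staged tables and the join_with_staged_rows helper are hypothetical. As with #temp tables, a TEMP table in SQLite is visible only to the connection that created it, which is why loading and querying must share one connection.

```python
import sqlite3

# Hypothetical permanent table that the uploaded rows will be joined against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [(1, "Widget"), (2, "Gadget"), (3, "Gizmo")],
)

def join_with_staged_rows(conn, rows):
    # Stage the client-side rows in a connection-scoped temp table.
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS staged "
        "(product_id INTEGER, quantity INTEGER)"
    )
    conn.execute("DELETE FROM staged")
    conn.executemany("INSERT INTO staged VALUES (?, ?)", rows)
    # Join on the same connection, where the staged rows are visible.
    query = (
        "SELECT p.name, s.quantity "
        "FROM staged AS s JOIN product AS p ON p.id = s.product_id "
        "ORDER BY p.id"
    )
    return conn.execute(query).fetchall()

# Rows as they might come out of the uploaded spreadsheet.
uploaded = [(3, 10), (1, 5)]
result = join_with_staged_rows(conn, uploaded)
```

In the SQL Server version, the join would live inside the stored procedure, which reads from the #temp table that the caller filled before invoking it.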
I'm running SQL Server 2008 on Windows Server 2008 and I have a stored proc that outputs some information about a product entity, with a product Id as the input.
It outputs a record representing the product information, followed by a second table full of orders.
I'm wondering if there is any way to call the stored proc and write the orders data to a CSV file from the command shell?
The alternative is to do this with a custom-written application and a data reader, but I don't really want to go down this route.
You should be able to use the SQLCMD command-line utility to do this. It's a complex tool to use; the BOL entry can be found here, and if necessary a bit of googling should turn up the odd tutorial that goes over the basics.