I would like to transfer data from a MS SQL Server database to a MySQL database. So, I added a linked server to MS SQL so that I can use Openquery to insert the data in the MySQL database. I want to optimize the performance of the data transfer and I found the guidelines for improving performance of data loading in MySQL.
One optimization consists of disabling AUTOCOMMIT mode; however, I was not able to do this using OPENQUERY.
I tried both ways:
SELECT * from openquery(MYSQL,'SET autocommit=0')
exec openquery(MYSQL,'SET autocommit=0')
and I got:
Cannot process the object "SET autocommit=0". The OLE DB provider
"MSDASQL" for linked server "MYSQL" indicates that either the object
has no columns or the current user does not have permissions on that
object.
Is it possible to execute such statements through openquery?
Thanks,
Mickael
OPENDATASOURCE() and OPENROWSET() allow for ad-hoc server connections; you do not need to define a linked server ahead of time.
OPENQUERY(), by contrast, depends upon a linked server being defined ahead of time.
Here is the MSDN reference.
http://technet.microsoft.com/en-us/library/ms188427.aspx
Most of the examples show a DML statement (SELECT, UPDATE, DELETE, INSERT) using OPENQUERY() as the source or destination of the command. What you are trying to do is execute a session command, so it will fail. Also, you might not even know whether the session stays open for the next call.
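For reference, this is the shape that does work: OPENQUERY() as the destination of an INSERT. A minimal sketch; the MySQL-side target_table and its columns are hypothetical:
-- Works: OPENQUERY() as the destination of an INSERT (names are hypothetical)
INSERT INTO OPENQUERY(MYSQL, 'SELECT id, name FROM target_table')
SELECT id, name FROM dbo.source_table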
Why not package up the logic on the MySQL server as a stored procedure? The stored procedure can be executed on a linked server by using a four-part name.
For example:
INSERT INTO #results
EXEC server.database..stored-proc
This assumes MySQL has the same object structure as Oracle. Since I am not a MySQL person, I cannot comment; I will leave you to research this little item.
But this should work, and it will allow you to package any type of logic in the MySQL database.
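As an aside, if the goal is simply to run a statement on the linked server, the EXECUTE ... AT syntax may be worth a try. A sketch, assuming the linked server's RPC Out option can be enabled and that the MSDASQL/ODBC provider honors it:
-- Enable pass-through execution once for the linked server
EXEC sp_serveroption 'MYSQL', 'rpc out', 'true'
-- Run the session command on the linked server
EXECUTE ('SET autocommit=0') AT MYSQL
Whether the setting survives for subsequent calls is the same open question about session lifetime raised above.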
If you want to use SSIS to transfer data from SQL Server to MySQL:
For the ADO.NET Destination to work properly, the MySQL database needs to have the ANSI_QUOTES SQL_MODE option enabled. This option can be enabled globally, or for a particular session. To enable it for a single session:
1 - Create an ADO.NET Connection Manager which uses the ODBC driver
2 - Set the connection manager’s RetainSameConnection property to True
3 - Add an Execute SQL Task before your data flow to set the SQL_MODE – Ex. set sql_mode='STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,ANSI_QUOTES'
4 - Make sure that your Execute SQL Task and your ADO.NET Destination are using the same connection manager.
Matt Mason described this in a reply; the key is item #2, using the same connection.
http://blogs.msdn.com/b/mattm/archive/2009/01/07/writing-to-a-mysql-database-from-ssis.aspx#comments
Also, CozyRoc has a custom ODBC destination component that might be faster / more reliable than the free option from MySQL.
http://cozyroc.com/ssis/odbc-destination
Related
I am creating an ETL in SSIS in which I want my data source to be a restricted query, like select * from table_name where id='Variable'. This variable is what I defined as a user-created variable.
I do not understand how I can have my source query interact with the SSIS-scoped variable.
The only available options are
Table
Table from variable
SQL Command
SQL command from a variable
What I want is to have a SQL statement with a variable as a parameter.
Simple. Choose SQL command as the Data Access Mode. Enter your query with a question mark as a parameter placeholder. Then click the Parameters button and map your variable to Parameter0 in the Set Query Parameters dialog.
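For example, using the hypothetical table from the question, the source query would be the following, with the ? mapped to your variable in that dialog:
SELECT * FROM table_name WHERE id = ?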
More information is available on MSDN.
An inferior alternative to @Edmund's approach is to use an Expression on another Variable to build your string. Assuming you have @[User::FirstName] already defined, you would then create another variable, @[User::SourceQuery].
In the properties for this variable, set EvaluateAsExpression to True and then set an Expression like "SELECT FirstName, LastName FROM Person.Person WHERE FirstName = '" + @[User::FirstName] + "'". The double quotes are required because we are building an SSIS string.
There are two big reasons this approach should not be employed.
Caching
This approach is going to bloat your plan cache in SQL Server with N copies of essentially the same query. The first time it runs and the value is "Edmund", SQL Server will create an execution plan and save it (because plans can be expensive to build). You then run the package and the value is "Bill". SQL Server checks to see if it has a plan for this. It doesn't; it only has one for Edmund, so it creates another copy of the plan, this time hard-coded to Bill. Lather-rinse-repeat and watch your available memory dwindle until it unloads some plans.
By using the parameter approach, when the query is submitted to SQL Server, it should create a parameterized version of the plan internally and assume that all supplied parameter values will result in similarly costed executions. Generally speaking, this is the desired behaviour.
If your server is set to optimize for ad-hoc workloads (a setting that is off by default), the cache bloat is also mitigated, since single-use plans are stored only as small stubs until they are reused.
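Assuming that refers to the server-level 'optimize for ad hoc workloads' option, enabling it is a one-off configuration step. A sketch, assuming you have permission to change server settings:
-- 'optimize for ad hoc workloads' is an advanced option, so expose those first
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
EXEC sp_configure 'optimize for ad hoc workloads', 1
RECONFIGURE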
SQL Injection
The other big nasty you will run into with building your own string is that you open yourself up to SQL Injection attacks or at the least, you can get runtime errors. It's as simple as having a value of "d'Artagnan." That single quote will cause your query to fail resulting in package failure. Changing the value to "';DROP TABLE Person.Person;--" will result in great pain.
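To make that concrete, here is the statement the expression builds for that first value; the embedded quote terminates the string literal early and the query fails:
-- FirstName = "d'Artagnan" yields invalid SQL: the literal ends after the d
SELECT FirstName, LastName FROM Person.Person WHERE FirstName = 'd'Artagnan'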
You might think it's trivial to safe-quote everything, but the effort of implementing it consistently everywhere you query is beyond what your employer is paying you for. All the more so since there is native functionality provided to do the same thing.
When using an OLE DB connection manager (with the SQL Server Native Client 11.0 provider, in my case) you may hit an error like this:
Parameters cannot be extracted from the SQL command. The provider
might not help to parse parameter information from the command. In
that case, use the "SQL command from variable" access mode, in which
the entire SQL command is stored in a variable.
So you need to explicitly specify the database name in the OLE DB connection manager properties. Otherwise SQL Server Native Client may use a different database name than you intend (e.g. master in SQL Server).
Alternatively, in some cases you can explicitly specify the database name for each database object used in the query, e.g.:
select Name
from MyDatabase.MySchema.MyTable
where id = ?
I'm trying to use the SQL Server 2008 Change Tracking feature. Once the feature is enabled, you can make use of the CHANGETABLE() function to query the change tracking history that is kept internally by SQL Server, e.g.:
SELECT
CT.ID, CT.SYS_CHANGE_OPERATION,
CT.SYS_CHANGE_COLUMNS, CT.SYS_CHANGE_CONTEXT
FROM
CHANGETABLE(CHANGES dbo.CONTACT,20) AS CT
where the SYS_CHANGE_CONTEXT column records the CONTEXT_INFO() session value. This column is useful for auditing who changed what etc.
Some of the statements that change data are executed using four-part notation by a remote SQL Server that has the home server as a linked server e.g.:
INSERT INTO [home server].[db name].[dbo].[CONTACT](id) values(@id)
My problem is that CONTEXT_INFO(), as set in the session executing the query on the remote server, does not get picked up by change tracking on my home server; i.e., it doesn't look like CONTEXT_INFO spans a distributed query. This means that the following will not result in CONTEXT_INFO being logged in the home server's change tracking:
-- I'm running on a remote server
WITH CHANGE_TRACKING_CONTEXT (0x1256698477)
INSERT INTO [home server].[db name].[dbo].[CONTACT](id) values(@id)
Does anyone know whether this is a limitation or if there is a way to persist/communicate CONTEXT_INFO across the distributed query?
Thanks
I was thinking about using CONTEXT_INFO to audit changes (web app), but after doing some tests I understood it's not a good idea. Because of connection pooling, CONTEXT_INFO was not working the way I desired.
I ended up using a GUID identifier associated with each logical session, plus a table that stores the session GUID and the information related to the session; each audited table stores that identifier in a separate column. Not as easy to code as it would have been with CONTEXT_INFO().
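A rough sketch of that scheme, with hypothetical names throughout:
-- One row per logical session; the GUID travels with every write
CREATE TABLE AuditSession (
    SessionGuid UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
    AppUser NVARCHAR(128) NOT NULL,
    StartedAt DATETIME NOT NULL DEFAULT GETDATE()
)
-- Each audited table stores the identifier in a separate column
ALTER TABLE dbo.CONTACT ADD LastSessionGuid UNIQUEIDENTIFIER NULL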
And as far as I understood from the documentation, change tracking is not designed for audit purposes (which I think is what you are trying to do).
We had some old stored procedures in SQL Server 2000 which updated the system catalog from an application used for user application security; that security is tied to SQL Server roles to take advantage of SQL Server's built-in security and NT logins.
When we migrated the database to SQL Server 2008 and tried to run these stored procedures, we got this SQL Server 2008 error:
Ad hoc updates to system catalogs are not allowed.
I searched around and found that from SQL Server 2005 onwards Microsoft does not support catalog updates (unless using a Dedicated Administrator Connection (DAC)).
If anyone can help me with how to do this in the new versions, or with any other alternative (like .NET code run inside SQL Server?), that would be great.
Some sample queries are below:
update sysusers
set roles = convert(varbinary(2048), substring(convert(binary(2048), roles), 1, @ruidbyte-1)
+ convert(binary(1), (~@ruidbit) & substring(convert(binary(2048), roles), @ruidbyte, 1))
+ substring(convert(binary(2048), roles), @ruidbyte+1, 2048-@ruidbyte)),
updatedate = getdate()
where uid = @memuid
delete from syspermissions where grantee = @uid
delete from sysusers where uid = @uid
insert into sysusers
values(@uid, 0, @rolename, NULL, 0x00, getdate(), getdate(), @owner, NULL)
There is no routine reason to update system tables in SQL Server. Ever.
There is a whole array of commands and system stored procedures to do it properly in all versions of SQL Server.
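For example, the role and user manipulation in the question maps onto documented procedures. A sketch, with hypothetical role, user, and login names:
-- Instead of flipping role bits in sysusers:
EXEC sp_addrolemember @rolename = 'SomeRole', @membername = 'SomeUser'
EXEC sp_droprolemember @rolename = 'SomeRole', @membername = 'SomeUser'
-- Instead of inserting into or deleting from sysusers directly:
CREATE USER SomeUser FOR LOGIN SomeLogin
DROP USER SomeUser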
You can't anyway, as you noted: there is no hack or workaround.
Updating a system table is a dangerous task, as it may lead to unexpected results, so before you do it, make sure you are very confident in what you are doing. It is advisable to make the changes on a replica of the original database to prevent unwanted results or crashes.
The possible ways could be:
Use a DAC. You can find the technique by searching Google; it amounts to hacking the system tables.
Use the following code:
sp_configure 'allow updates', 1
go
reconfigure with override
go
-- your code here
go
sp_configure 'allow updates', 0
go
reconfigure with override
go
RECONFIGURE WITH OVERRIDE applies the configuration change, and setting 'allow updates' to 1 is what permits the updates to the system tables; the final pair of statements sets the option back when you are done. (On SQL Server 2005 and later the option no longer has this effect, as noted above.)
But the best way would be to find another alternative for your task!
I am writing an SSIS package that has a conditional split from a SQL Server source that splits records to either be updated or inserted into a MYSQL database.
The SQL Server connection uses the .NET Provider for OLE DB\SQL Server Native Client 10.0 provider.
The MySQL connection is a MySQL ODBC 5.1 ADO.NET connection.
I was thinking about using the OLE DB Command branching off of the conditional split to update records, but I cannot use this to connect to the MySQL database.
Does anyone know how to accomplish this task?
I would write to a staging table for updates including the PK and columns to be updated and then execute an UPDATE SQL statement using that table and the table to be updated. The alternative is to use the command for every row and that just doesn't seem to perform that well in my experience - at least compared to a nice fat batch insert and a single update command.
For that matter, I guess you could do without the conditional split altogether, write everything to a staging table and then use an UPDATE and INSERT in SQL back to back.
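A sketch of that set-based update on the MySQL side, assuming a staging table has been loaded first (all table and column names are hypothetical):
-- One batch UPDATE from the staging table to the target table
UPDATE contact AS t
JOIN stg_contact AS s ON s.id = t.id
SET t.name = s.name,
    t.email = s.email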
The following MSDN blog post might help you (I haven't tried this myself):
How do I UPDATE and DELETE if I don’t have an OLEDB provider?
The post suggests the following three options.
Script Component
Store the data in a Recordset
Use a custom component (like Merge destination component)
The author had also posted two other articles about MySQL prior to the one above:
Connecting to MySQL from SSIS
Writing to a MySQL database from SSIS
Hope that points you in the right direction.
I have a mysql database full of data which I need to keep but migrate to SQL Server 2008.
I know end to end where the data should go, table to table, but I have no idea how to go about moving the data. I've looked around the web but it seems there are 'solutions' which you have to download and run. I'd rather, if possible, do something myself in terms of writing scripts or code.
Can anyone recommend the best way to do this please?
You have several options here:
On the SQL Server side, you can set up a connection to your old MySQL DB using something called a linked server. This will allow you to write SQL code for SQL Server that returns data from the MySQL tables. You can use this to build INSERT or SELECT INTO statements (see the sketch after this list).
You can write queries for MySQL to export your data as CSV, and then use the BULK INSERT features of SQL Server to efficiently import the CSV data.
You can use SQL Server Integration Services to move the data over from MySQL.
Regardless of which you choose, non-data artifacts like indexes, foreign keys, triggers, stored procedures, and security will have to be moved manually.
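To illustrate the linked server option, here is a minimal sketch, assuming a system DSN named MYSQL_DSN has already been created for the MySQL ODBC driver (all names are hypothetical):
-- Create the linked server over the MySQL ODBC driver
EXEC sp_addlinkedserver
    @server = 'MYSQL',
    @srvproduct = 'MySQL',
    @provider = 'MSDASQL',
    @datasrc = 'MYSQL_DSN'
-- Pull rows across into a new SQL Server table
SELECT *
INTO dbo.imported_table
FROM OPENQUERY(MYSQL, 'SELECT * FROM old_db.some_table')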
Have you tried the tool from Microsoft called SQL Server Migration Assistant for MySQL?
https://www.microsoft.com/download/en/details.aspx?id=1495
Try this tutorial; it makes migrating from MySQL to SQL Server easy and straightforward, as mentioned:
http://www.codeproject.com/Articles/29106/Migrate-MySQL-to-Microsoft-SQL-Server
Thanks
You can use the Import/Export Wizard that comes with SQL Server Standard Edition.
Select your data source from MySQL using the ODBC data source. Note: you will first need to install the ODBC driver for MySQL (MySQL Connector/ODBC). Then select your SQL Server destination, select all tables, and fire it up. You will need to add your primary and foreign keys, and indexes, manually.
A somewhat more automated approach would be to use the SQL Server Migration Assistant for MySQL, also free. It has the benefit of recreating the relationships and indexes automatically for you. Probably your best bet.
I did it once, some time ago. First you can couple your MS SQL server to the MySQL server using the MySQL ODBC connector:
http://dev.mysql.com/downloads/connector/
After the connection is made you can write your database procedures as you would if they were two MS SQL DBs. It's probably easiest to write some SQL batch scripts, including a cursor, where you run through every row of a table and decide on a per-field basis where each field will need to go in the future.
example of a cursor: http://www.mssqltips.com/tip.asp?tip=1599
If you decide to go with the cursor, you can play with its parameters to increase performance. I especially remember the FORWARD_ONLY parameter giving a big boost.
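For reference, the minimal shape of such a cursor; a sketch with hypothetical table and column names:
-- FORWARD_ONLY, READ_ONLY cursor over the source rows
DECLARE @id INT, @name NVARCHAR(100)
DECLARE row_cursor CURSOR FORWARD_ONLY READ_ONLY FOR
    SELECT id, name FROM dbo.source_table
OPEN row_cursor
FETCH NEXT FROM row_cursor INTO @id, @name
WHILE @@FETCH_STATUS = 0
BEGIN
    -- decide per field where the value belongs in the new schema
    FETCH NEXT FROM row_cursor INTO @id, @name
END
CLOSE row_cursor
DEALLOCATE row_cursor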