I work for a large organization that has an established and well populated MS-SQL server. However, I am not a Microsoft user, and my database of choice is MySQL. I am looking for a solution that will allow me to either...
-Directly query our MS-SQL server from my MySQL server
and/or
-Set up some sort of job that will copy data systematically from the MS-SQL server to our MySQL server.
It looks like Linked Servers may be part of the solution, however everything I have found describes scenarios where MS-SQL is accessing MySQL, not the other way around.
To be clear I want my MySQL server to talk to/query/pull data from my MS-SQL server.
Any help appreciated!
As far as I'm aware, you can't query any other RDBMS vendor from MySQL. MySQL's remote access feature is FEDERATED tables, which only work with other MySQL databases as far as I know.
About the simplest way you could do this would be to use SQL Server's Import/Export Wizard to create a simple package that copies the data to your MySQL server through an ODBC or ADO.NET connection to the MySQL database.
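If you would rather script the copy job yourself than use the wizard, the idea can be sketched in a few lines of Python. This is only a minimal sketch under some assumptions: both connections are generic DB-API 2.0 connections, the `orders` table and its columns are made up for illustration, and in-memory SQLite databases stand in for the MS-SQL source and the MySQL target so the example runs anywhere. With real servers you would pass connections from your MS-SQL and MySQL drivers instead and set `placeholder` to whatever marker the target driver expects.

```python
import sqlite3


def copy_table(src_conn, dst_conn, table, columns, placeholder="?", batch_size=1000):
    """Copy `columns` of `table` from src_conn to dst_conn and return the row count.

    Both arguments are assumed to be Python DB-API 2.0 connections.
    `placeholder` is the destination driver's parameter marker:
    "?" for sqlite3 (used here), while MySQL drivers usually expect "%s".
    Table and column names are treated as trusted input in this sketch.
    """
    col_list = ", ".join(columns)
    markers = ", ".join([placeholder] * len(columns))
    select_sql = f"SELECT {col_list} FROM {table}"
    insert_sql = f"INSERT INTO {table} ({col_list}) VALUES ({markers})"

    src_cur = src_conn.cursor()
    dst_cur = dst_conn.cursor()
    src_cur.execute(select_sql)
    copied = 0
    while True:
        rows = src_cur.fetchmany(batch_size)  # read in batches, not all at once
        if not rows:
            break
        dst_cur.executemany(insert_sql, rows)
        copied += len(rows)
    dst_conn.commit()
    return copied


# Stand-ins for the two servers: in-memory SQLite databases (hypothetical schema).
source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
ddl = "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
source.execute(ddl)
target.execute(ddl)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "acme", 10.5), (2, "globex", 99.0), (3, "initech", 7.25)],
)
source.commit()

copied = copy_table(source, target, "orders", ["id", "customer", "total"], batch_size=2)
print(copied, target.execute("SELECT * FROM orders ORDER BY id").fetchall())
```

Run from cron or a scheduler, a job like this would also need to empty the target table first or switch to an upsert, otherwise the second run fails on duplicate keys.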
To be clear I want my MySQL server to talk to/query/pull data from my MS-SQL server.
I think it is hard to even assume this is the best decision. Without a TON more context of what the real problem is and/or the real "need", answers vary widely from "just use ms-sql" to other levels of ad-hoc ETL. That said, some abstract feedback.
There is nothing wrong with MS-SQL, as long as you are (a) not paying for it and (b) have a clean solution to use it from a real POSIX based system. Technically, MS-SQL is a great database, I just dislike Windows. To that end, I made sure that working with MS-SQL from Ruby was done well at both the C extension layer with TinyTDS and the ActiveRecord adapter.
Sadly, I have personally stopped maintaining the latter, but the C extension is solid and is even used by great projects like Sequel. If you had to do some sort of raw ETL without the overhead of ActiveRecord, Sequel is a great choice, since it has adapters for all DBs, TinyTDS included.
I'm looking for a possible solution for the following problem.
First the situation I'm at:
I've 2 databases, 1 Oracle DB and 1 MySQL DB. Although they have a lot of similarities they are not identical. A lot of tables are available on both the Oracle DB and the MySQL DB but the Oracle tables are often more extensive and contain more columns.
The situation with the databases can't be changed, so I've to deal with that.
Now I'm looking for the following:
I want to synchronise data from Oracle to MySQL and vice versa. This has to be done real time or as close to real time as possible. So when changes are made at one DB they have to be synced to the other DB as quickly as possible.
Also not every table has to be in sync, so the solution must offer a way of selecting which tables have to be synced and which not.
Because the databases are not identical replication isn't an option I think. But what is?
I hope you guys can help me with finding a way of doing this or a tool which does exactly what I need. Maybe you know some good papers/articles I can use?
Thanks!
Thanks for the comments.
I did some further research on ETL and EAI.
I found out that I am searching for an ETL tool.
I read your question and your answer. I have worked with Oracle, SQL, ETL and data warehouses, and here are my suggestions:
It is good to have a ready-made ETL tool. But if your application is big enough that you need a tailor-made one, I suggest a home-made ETL process.
If your transactional database is on Oracle, you can have triggers set up on the key tables that would further trigger an external procedure written in C, C++ or Java.
The reason behind using an external procedure is to be able to communicate with both databases at a time - Oracle and MySQL.
You can read more about Oracle External Procedures here.
If not through ExtProc, you can develop a separate application in Java or .Net that would extract data from the first database, transform it according to your business rules and load it into your warehouse.
With either approach, you will have greater control over the ETL process if you implement your own tool rather than going for a ready-made one.
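To make the separate-application route more concrete, here is a rough extract/transform/load sketch in Python. Everything specific in it is an assumption for illustration: the `customers` and `dim_customer` tables, their columns, and the business rule in `transform` are made up, and two in-memory SQLite databases stand in for the source database and the warehouse. It also shows the case from the question where the source table has more columns than the target: only the columns you name are extracted.

```python
import sqlite3


def extract(conn, table, columns):
    """Read only the selected columns and return each row as a dict."""
    cur = conn.cursor()
    cur.execute(f"SELECT {', '.join(columns)} FROM {table}")
    names = [d[0] for d in cur.description]
    return [dict(zip(names, row)) for row in cur.fetchall()]


def transform(row):
    """Hypothetical business rule: merge the name parts, normalise the email."""
    return {
        "id": row["id"],
        "full_name": f"{row['first_name']} {row['last_name']}",
        "email": row["email"].strip().lower(),
    }


def load(conn, table, rows, placeholder="?"):
    """Insert transformed rows; `placeholder` depends on the target driver."""
    if not rows:
        return 0
    columns = list(rows[0])
    markers = ", ".join([placeholder] * len(columns))
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({markers})"
    cur = conn.cursor()
    cur.executemany(sql, [tuple(r[c] for c in columns) for r in rows])
    conn.commit()
    return len(rows)


# Stand-in databases; the source table is wider than the target on purpose.
oltp = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")
oltp.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, first_name TEXT, "
    "last_name TEXT, email TEXT, credit_limit INTEGER)"
)
warehouse.execute(
    "CREATE TABLE dim_customer (id INTEGER PRIMARY KEY, full_name TEXT, email TEXT)"
)
oltp.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?, ?)",
    [(1, "Ada", "Lovelace", " ADA@Example.com ", 5000),
     (2, "Alan", "Turing", "alan@example.com", 100)],
)
oltp.commit()

extracted = extract(oltp, "customers", ["id", "first_name", "last_name", "email"])
loaded = load(warehouse, "dim_customer", [transform(r) for r in extracted])
result = warehouse.execute("SELECT * FROM dim_customer ORDER BY id").fetchall()
print(loaded, result)
```

Keeping the three steps as separate functions is the main design choice here: the business rules live in `transform` alone, so they can change without touching the code that talks to either database.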
I'm not sure if this exactly fits Stack Overflow; however, as I'm looking for code rather than a tool, I think it does.
I'm looking for a way to replicate / synchronize different database systems, in this case MySQL and MongoDB. We are running both for different purposes. We started with a MySQL database and added MongoDB later on for special applications. There's data we would like to have in both databases, where we want to have constraints in MySQL and, respectively, DBRefs in MongoDB. For example: we need a user record in MySQL, but also in MongoDB for references between tables and objects respectively. At the moment we have a cronjob which dumps the MySQL data and imports it into MongoDB. Although it works quite well, that's not the solution we would like to have.
I think for the moment a one-way replication would be enough (MySQL -> MongoDB). The important part is that the replication works in "realtime", much like MySQL master -> slave replication does.
Are there already any solutions for this problem, or does anyone have ideas on how to achieve this?
Thanks!
SymmetricDS is open source, Java-based, web-enabled, database independent, data synchronization/replication software that might do the trick with a few tweaks. It has an extension point called IDataLoaderFilter which you could use to implement a MongodbDataLoader.
This would help with one-way database replication. It might be a little more difficult to synchronize from MongoDB -> relational database, but the SymmetricDS team would be very helpful in trying to find the solution.
What you're looking for is called EAI (Enterprise application integration). There are a lot of commercial tools around but under the provided link, you'll also find a couple OSS solutions. The basis of EAI is that you have data sources and data sinks. The EAI framework offers tools to build custom pumps between the two.
I suggest either using a DB trigger to start the synchronization or sending a trigger signal from your applications. Note that there is no key-hole solution since synchronization can become arbitrarily complex (for example, how do you make sure that all rows are copied?).
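One common way to combine the DB-trigger idea with the "are all rows copied?" concern is a change-log table: triggers record every insert, update and delete with a sequence number, and the sync process replays entries past its last checkpoint. The sketch below is only an illustration under assumptions. The `users` and `change_log` tables and the `apply_changes` function are invented names, SQLite stands in for both databases, and the trigger and upsert syntax (`INSERT OR REPLACE`) are SQLite's; other databases spell these differently.

```python
import sqlite3

# Hypothetical source schema: triggers write one change_log row per change.
SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE change_log (
    seq INTEGER PRIMARY KEY AUTOINCREMENT, row_id INTEGER, op TEXT
);
CREATE TRIGGER users_ai AFTER INSERT ON users BEGIN
    INSERT INTO change_log (row_id, op) VALUES (NEW.id, 'I');
END;
CREATE TRIGGER users_au AFTER UPDATE ON users BEGIN
    INSERT INTO change_log (row_id, op) VALUES (NEW.id, 'U');
END;
CREATE TRIGGER users_ad AFTER DELETE ON users BEGIN
    INSERT INTO change_log (row_id, op) VALUES (OLD.id, 'D');
END;
"""


def apply_changes(src, dst, after_seq=0):
    """Replay change_log entries newer than after_seq; return the last seq seen."""
    log_cur = src.cursor()
    row_cur = src.cursor()
    dst_cur = dst.cursor()
    log_cur.execute(
        "SELECT seq, row_id, op FROM change_log WHERE seq > ? ORDER BY seq",
        (after_seq,),
    )
    last_seq = after_seq
    for seq, row_id, op in log_cur.fetchall():
        if op == "D":
            dst_cur.execute("DELETE FROM users WHERE id = ?", (row_id,))
        else:
            row_cur.execute("SELECT id, name FROM users WHERE id = ?", (row_id,))
            row = row_cur.fetchone()
            # The row may already be gone; its later 'D' entry handles that.
            if row is not None:
                dst_cur.execute(
                    "INSERT OR REPLACE INTO users (id, name) VALUES (?, ?)", row
                )
        last_seq = seq
    dst.commit()
    return last_seq


src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
src.executescript(SCHEMA)
dst.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

src.execute("INSERT INTO users VALUES (1, 'alice')")
src.execute("INSERT INTO users VALUES (2, 'bob')")
src.execute("UPDATE users SET name = 'alicia' WHERE id = 1")
src.execute("DELETE FROM users WHERE id = 2")
src.commit()

checkpoint = apply_changes(src, dst)
first_pass = dst.execute("SELECT * FROM users ORDER BY id").fetchall()

src.execute("INSERT INTO users VALUES (3, 'carol')")
src.commit()
checkpoint = apply_changes(src, dst, checkpoint)
second_pass = dst.execute("SELECT * FROM users ORDER BY id").fetchall()
print(checkpoint, first_pass, second_pass)
```

Because the checkpoint is just the last sequence number applied, the sync process can crash and resume without skipping changes, as long as it stores that number somewhere durable.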
As far as I can see, you need to develop some sort of "control program" that has the drivers for each DBMS and run it as a daemon. The daemon should have a trigger or a very small recheck interval to keep the DBs synchronized.
Technically, you could set up a process which parses the binary log of the MySQL server and replicates the relevant SQL queries. I've never done such a thing with a different database as a slave, but maybe it is worth a shot?
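A control program with a small recheck interval could look roughly like the sketch below. It assumes, purely for illustration, an `items` table with an `updated_at` column that increases on every change (plain integers here to keep the example deterministic), and it uses SQLite in place of both DBMS drivers. `sync_once`, `run_daemon` and the `max_cycles` parameter are made-up names; `max_cycles` exists only so the loop can be stopped in a demo.

```python
import sqlite3
import time


def sync_once(src, dst, watermark):
    """Copy rows changed since `watermark`; return the new watermark."""
    src_cur = src.cursor()
    dst_cur = dst.cursor()
    src_cur.execute(
        "SELECT id, payload, updated_at FROM items "
        "WHERE updated_at > ? ORDER BY updated_at",
        (watermark,),
    )
    for row_id, payload, updated_at in src_cur.fetchall():
        # SQLite upsert; the equivalent statement differs per target database.
        dst_cur.execute(
            "INSERT OR REPLACE INTO items (id, payload, updated_at) VALUES (?, ?, ?)",
            (row_id, payload, updated_at),
        )
        watermark = max(watermark, updated_at)
    dst.commit()
    return watermark


def run_daemon(src, dst, interval_seconds=1.0, max_cycles=None):
    """Poll forever (or max_cycles times) and return the last watermark."""
    watermark = 0
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        watermark = sync_once(src, dst, watermark)
        cycles += 1
        time.sleep(interval_seconds)  # the "recheck interval"
    return watermark


# Stand-in databases with a hypothetical schema.
src = sqlite3.connect(":memory:")
dst = sqlite3.connect(":memory:")
ddl = "CREATE TABLE items (id INTEGER PRIMARY KEY, payload TEXT, updated_at INTEGER)"
src.execute(ddl)
dst.execute(ddl)
src.executemany("INSERT INTO items VALUES (?, ?, ?)", [(1, "a", 10), (2, "b", 20)])
src.commit()

watermark = run_daemon(src, dst, interval_seconds=0, max_cycles=1)

# A later change on the source is picked up by the next poll.
src.execute("UPDATE items SET payload = 'a2', updated_at = 30 WHERE id = 1")
src.commit()
watermark = sync_once(src, dst, watermark)
synced = dst.execute("SELECT * FROM items ORDER BY id").fetchall()
print(watermark, synced)
```

Plain polling like this never sees deleted rows, and it can miss a row that is written with the same `updated_at` value as the current watermark, so a real daemon would need something extra (soft deletes or a change log) for those cases.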
I've used both and I've found MySql to have several frustrating bugs, limited support for: IDE integration, profiling, integration services, reporting, and even lack of a decent manager. Total cost of ownership of MSSQL Server is touted to be less than MySQL too (.net environment), but maintaining an open mind could someone point out any killer features of MySql?
I've used MySQL in the past and I'm using MSSQL lately but I can't remember anything that MySQL has and MSSQL can't do.
I think the most killer feature of MySQL is its simplicity. For some projects you just don't need all the power you can have with a huge system like MSSQL. I have a UNIX heritage and find a simple configuration file like my.ini a killer feature of MySQL.
Also, the security system of MySQL is much less robust, but it does the job for most applications. I believe MySQL is a killer itself from this point of view, and should stay that way, letting young users be introduced to RDBMSs with a simple view first. If your project gets big enough that you are considering switching to a more robust system, then MSSQL can pop up as a possibility.
That's what happened to me.
The only thing I can think of, off hand, is locking. SQLServer has traditionally had poor locking strategy that has tripped many people up.
You should use what you prefer, ultimately. It's not as if MySQL is not good enough to compete with MS SQL, e.g. Slashdot uses MySQL, so it's hardly got problems with high-scalability performance.
Its killer feature, though, is that it is free: you can deploy as many instances as you like without worrying one fig about licensing issues. That's more important for the spread of software than anyone could imagine.
(TCO is a difficult thing to calculate - and is advice only ever given from paid consultants and other vested interests. Ignore that. MSSQL is expensive and MySQL is free.)
About 6 years ago I developed a custom e-commerce website using ASP and MySQL for the database. At the time MySQL was clearly a better choice than MSDE, which had built-in throttling that concerned me enough to use MySQL. Also, the difference in coding between MySQL and MSDE/SQL was not that great or much of a concern.
Now, all these years later, I'm trying to get the code converted to .NET, and even after purchasing commercial MySQL drivers from CRLab I found that, as you hinted, the IDE integration is just not up to par.
I will say that MySQL is doing a great job even with our database tables approaching 4GB. So when I switch to MSSQL I have to go ahead and get SQL Workstation or higher ($$$), and not use SQL Express, which has a 4GB limit.
All of my experience has changed the way I develop new websites. Now, unless a site is expected to have a lot of traffic, I use VistaDB and then upgrade to SQL Server if needed. VistaDB is syntax and datasource compatible with SQL Server. And the best part is that it is only a single file for the database and a DLL for your bin folder.
That's my two cents based on my personal experience with using MySQL in ASP and now .NET.
I work with MSSQL, MySQL and PostgreSQL regularly (using .NET, Java and PHP). One of my favorite things about MySQL (esp. compared to MSSQL) is the ease with which you can run and restore full database backups.
MSSQL's model of using .bak files is really ugly and time-consuming (topic for another post). But if you want to do something like automated testing, or automated build processes (that include building a DB from scratch), MySQL can be a bit easier to deal with.
A few other points:
The management tools have gotten a lot better since the early days.
If you are interested in transactions, constraints, etc.. be sure you are defining your tables to use the InnoDB storage engine (instead of MyISAM which is designed for speed.)
I do miss MSSQL's schema generating tool, but I think there are equivalent tools out there.
We've used a Linux database server and a Windows web server (for .NET apps) with great success.
If you are using something like NHibernate or some other non-MS data abstraction layer, the case to look beyond MSSQL is stronger too...
Three points to consider; unfortunately the first two are contradictory:
1) .NET and MySQL were not designed to interact with one another, and there is no official support from either side. You're invariably going to encounter issues trying to use them together.
2) If portability off of Windows may ever be an issue (much .NET code runs quite nicely on other platforms via Mono), you'll want to avoid locking yourself too deeply to MSSQL. That doesn't mean not using it, but being careful that you don't rely on its particular quirks too much.
3) TCO is just a buzzword. It's complete nonsense when it's calculated by anyone other than you. Nobody can make such a calculation and honestly claim that it has any relevance outside their particular environment. There are too many factors, most of which have absolutely nothing to do with things like tool availability.
I've been using the community version of MySQL for almost 99% of my projects. What I like about MySQL is that I can deploy it via xcopy and that it is powerful compared to other "xcopy-able" database servers. I also wrote a wrapper to start and stop MySQL & Apache (like LAMP), but with my own implementation and add-on capability.
MySQL probably has a lower TCO, since administration and configuration is more simple and straightforward than the Spaghetti GUI that MS SQL makes you do most of the configuration through, having to dig through hundreds of obscure properties dialogs to accomplish even basic administration tasks.
There is one area where MS SQL clearly excels over MySQL in my experience:
Integration with other technologies. MS SQL allows you to replicate back and forth with Oracle and MySQL databases, and provides SSIS for executing scheduled data transformations from other database servers.
There may be others, but I don't have experience with them.
I've worked on a variety of systems as a programmer, some with Oracle, some with MySQL. I keep hearing people say that Oracle is more stable, more robust, and more secure. Is this the case?
If so in what ways and why?
For the purposes of this question, consider a small-medium sized production DB, perhaps 500,000 records or so.
Yes. Oracle is enterprise grade software.
I'm not sure if it's really any more stable than MySQL. I haven't used MySQL that much, but I don't ever remember having MySQL crash on me. I've had Oracle crash, but when it does, it gives me more information about why it crashed than I could possibly want, and Oracle support is always there to help (for a fee).
It's very, very robust. Oracle DB will do virtually everything it can before breaking your data. I've had MySQL servers do really weird things when they run out of disk space; Oracle will just halt all transactions, and eventually shut down if it can't write the files it needs. I've never lost data in Oracle. Even when I do stupid things like forget the WHERE clause and update every row rather than a single row, it's very easy to get the database back to how it was before screwing up.
Not sure about security. Certainly Oracle gives you lots of options for how you are going to connect to the DB and authenticate. It gives lots of options regarding which users have access to what, etc. But as with most things, if you want to take security seriously, then you need an expert to do it. Oracle certainly has a lot more to lose if they don't get security right. But, as with all things, there have been exploits.
If nothing else, just consider this... When Oracle stuffs up, they have customers who are paying $40k per CPU (if they are suckers and pay list price) for a license plus yearly maintenance fees. This gives them a very strong incentive to make sure the customers are happy with the product.
For a small database, I'd seriously recommend Oracle XE well before MySQL. It has the important feature of MySQL (it's free), it's dead easy to install, and it comes with a nice web interface and application framework (Application Express). If your DB will happily run on a single CPU, 1GB RAM and 4GB of data, then XE is the way to go IMHO.
MySQL has its uses, and many, many people have shown that you can build great things with it, but it's far behind Oracle (and SQL Server, and DB2) in terms of features... But then, it's also free and very easy to learn, which for many people is the most important feature.
I've had Oracle create a corrupt database when the disk ran out of space. It's hard to debug, uses loads of resources and is difficult to work with without seriously skilled DBAs holding your hand. Oracle even replaced system binaries (e.g. gcc) in /usr/bin/ when I installed it on one occasion.
Working with PostgreSQL, on the other hand, has been much more pleasant. It gives readable error messages and acts in a more understandable way if you're used to working with open source *nix systems. It's quite easy to set up replication, thus making your data fairly secure.
A 500K record database can probably be run on your mobile phone. Seriously, it's so small that both Oracle XE and MySQL will be more than sufficient to manage it.
-For smallish DBs (a few million records), Oracle is overkill.
-You need an experienced DBA to properly install and manage an Oracle system.
-Oracle has a larger "base overhead", i.e. you need a beefier machine to run Oracle.
-The "out of the box" experience of Oracle used to be atrocious (I haven't installed an Oracle system in years; no idea how it currently behaves), while MySQL is very nice.
Oracle is a beast that really needs DBA knowledge. I concur with those who say 500k records are nothing. It's not worth the complexity of Oracle if it's simple numeric/text data.
On the other hand, Oracle is extremely efficient with blobs. If each of your records was a 100MB binary file, you'd need a fortune to run it on Oracle (I'd recommend a 3-node RAC cluster with a good SAN).
I have a project that sends data (~10M rows, 1.2GB of data) to three different databases, 2 Oracle and 1 MySQL. I haven't had problems working with either system, nor have I seen any major advantages on either side. If you're in a place that already uses Oracle for other projects, adding on one new database shouldn't be too much of a problem, but if you're thinking of setting up a new database server and don't have anything in place already, MySQL will save you the money.
Oracle Enterprise assumes that there is an Enterprise to support it, i.e. a real Oracle DBA. A novice (but competent) DBA should be able to secure MySQL much more easily than Oracle, just because Oracle is inherently more complex. Of course, Oracle has the Enterprise monitoring tools beyond what MySQL currently features (as far as I've seen), but the DBA needs to be able to use them to be effective.
Such a small database as you describe could be handled by most anything so I can't see that Oracle would be warranted unless the infrastructure was already in place. Both have replication, transactions and warm-backups so either would serve well.
The answer depends entirely on how you configure each DBMS.
Both are capable of handling 500,000 records many times over.
Oracle is a lot beefier. Many of its features would only be looked for in a larger enterprise or high-performance setting. They're mainly features to do with scaling, replication and load balancing.
For small DBs, consider SQLite. For small-medium, look at MySQL or PostgreSQL. For the largest, look at MSSQL, Oracle, DB2, etc.
Edit: Having read the other answer, I'll add that if your data is really, really critical, you'll want a replicated setup and you'll probably want to look to one of the big DB providers for something like that.
If you can sacrifice potential (exceedingly rare) data losses and would prefer improved performance, look at some of the lighter-weight options.
It's true that Oracle is a beast.
Oracle is also widely assumed to be the most secure major database.
The problem is that Oracle's devs don't appear to grasp critical security concepts. According to independent security researchers, Oracle is the least secure database server on the market.
http://itic-corp.com/blog/2010/09/sql-server-most-secure-database-oracle-least-secure-database-since-2002/
MySQL is actually fairly secure according to these researchers. I don't know much about the tools available for it. What's most amusing about this research is that the same people who would call Microsoft SQL server a toy would have their data stolen by attackers that MSSQL would thwart because they are using a beast that has a terrible security model rather than a "toy" that is secure.
I'm using Oracle, SQL Server and MySQL for different applications and sites.
No database can beat Oracle in many areas, but it's also the database that requires the deepest administration knowledge.
And if you run into a problem with Oracle, you may spend quite some time solving it, even with good DBAs.
You can go with MySQL for 500K or millions of records. It's lighter than other DBs, requires zero administration work, and will not take a lot of your computer's resources. I always have it on my development PC, and I've never faced any serious problem with it.
I would recommend you go with MySQL or PostgreSQL if you don't need the advanced features of Oracle.