How to keep databases synchronized between hosting account and a local testing server? - mysql

I have several databases hosted on a shared server, and a local testing server which I use for development.
I would like to keep both sets of databases somewhat synchronized (more or less daily).
So far, my ideas to solve the problem seem very clumsy. Anyway, for reference, here is what I have considered so far:
Make a database dump from the online databases, trash the local databases, and recreate them from the dump. It's a lot of work and requires a lot of download time (which guarantees I won't do it as often as it should be done).
Write a small web service to access the new data, and write a small application locally to communicate with said web service, download the newest data, and update the local databases.
Both solutions sound like a lot of work for a problem that is probably already solved a zillion times over. Or maybe it's even an existing feature which I completely overlooked.
Is there an easy way to keep databases more or less in sync? Ideally something that I can set up once, schedule, and forget about.
I am using MySQL 5 (MyISAM) databases on both servers.
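For reference, option 1 could at least be scripted and scheduled so the manual work disappears. A minimal sketch in Python, with placeholder hosts and credentials, assuming the mysqldump and mysql client binaries are on the PATH:

    # sync_down.py - dump each remote database and reload it locally.
    # Placeholder credentials; database names are hypothetical.
    import gzip
    import subprocess

    REMOTE = {"host": "shared-host.example.com", "user": "remote_user", "password": "secret"}
    LOCAL = {"host": "localhost", "user": "root", "password": "secret"}
    DATABASES = ["mydb1", "mydb2"]

    for db in DATABASES:
        dump = subprocess.run(
            ["mysqldump",
             "--host", REMOTE["host"], "--user", REMOTE["user"],
             "--password=" + REMOTE["password"],
             "--compress",        # compress the client/server protocol traffic
             "--add-drop-table",  # "trash" each local table before recreating it
             db],
            check=True, capture_output=True).stdout
        with gzip.open(db + ".sql.gz", "wb") as f:  # keep an archived copy
            f.write(dump)
        subprocess.run(
            ["mysql",
             "--host", LOCAL["host"], "--user", LOCAL["user"],
             "--password=" + LOCAL["password"], db],
            input=dump, check=True)

A nightly cron entry (or Task Scheduler job) running that script covers the "set up once, schedule and forget" part; the download time is still the price of this approach, though --compress helps.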
=============
Edit: I had a look at replication, but it seems that I can't go that route because the shared hosting does not give me enough control over the server itself (I have most permissions on my databases, but not on the MySQL server itself).
I only need to keep the data synchronized, nothing else. Is there any other solution that doesn't require full control on the server?
Edit 2:
Sorry, I forgot to mention I am running on a LAMP stack on the shared server, so Windows-only solutions won't work.
I am surprised to see that there is no obvious off-the-shelf solution for this problem.

Have you considered replication? It's not to be trifled with, but it may be what you want. See here for more details: http://dev.mysql.com/doc/refman/5.0/en/replication-configuration.html

Take a look at the Microsoft Sync Framework - you will need to code in .NET, but it can resolve your issues.
http://msdn.microsoft.com/en-in/sync/default(en-us).aspx
Here is a sample for SQL Server, but it can be adapted to MySQL as well using the ADO.NET provider for MySQL.
http://code.msdn.microsoft.com/sync/Release/ProjectReleases.aspx?ReleaseId=4835
You will need additional tables in your MySQL database for change tracking and anchors (keeping track of the last synchronization) for this to work, but you won't need full control as long as you can access the db.
Replication would have been simpler :), but this might just work in your case.
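The anchor idea itself is framework-independent, so if .NET is a hurdle, the same pattern can be coded directly. A sketch in Python rather than the Sync Framework API; the sync_anchor table (keyed by table_name), the updated_at column, and the credentials are all hypothetical:

    # Incremental pull driven by a stored "anchor" (how far the last sync got).
    import MySQLdb  # assumes the MySQLdb (mysqlclient) driver

    remote = MySQLdb.connect(host="shared-host.example.com", user="u", passwd="p", db="app")
    local = MySQLdb.connect(host="localhost", user="root", passwd="p", db="app")

    def pull_changes(table, columns):
        lc, rc = local.cursor(), remote.cursor()
        # 1. Where did the previous run stop?
        lc.execute("SELECT last_seen FROM sync_anchor WHERE table_name = %s", (table,))
        row = lc.fetchone()
        anchor = row[0] if row else "1970-01-01 00:00:00"
        # 2. Fetch only rows changed since then (needs an updated_at column).
        cols = ", ".join(columns)
        rc.execute("SELECT %s, updated_at FROM %s WHERE updated_at > %%s" % (cols, table),
                   (anchor,))
        rows = rc.fetchall()
        # 3. Upsert locally (REPLACE needs a primary key) and advance the anchor.
        for r in rows:
            marks = ", ".join(["%s"] * len(r))
            lc.execute("REPLACE INTO %s (%s, updated_at) VALUES (%s)" % (table, cols, marks), r)
        if rows:
            lc.execute("REPLACE INTO sync_anchor (table_name, last_seen) VALUES (%s, %s)",
                       (table, max(r[-1] for r in rows)))
        local.commit()

    pull_changes("user", ["id", "name", "email"])

Deletes are the hard part of any anchor scheme - they never show up in an updated_at query - which is exactly why frameworks add tombstone/change-tracking tables.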

Related

How to update mysql tables between computers

I'm working on a group project where we all have a mysql database working on a local machine. The table mainly has filenames and stats used for image processing. We all will run some processing, which updates the database locally with results.
I want to know what the best way is to update everyone else's database, once someone has changed theirs.
My idea is to perform a mysqldump after each processing run, and let that file be tracked by git (which we use religiously). I've written a bunch of Python utils for the database, and it would be simple enough to read this dump into the database when we detect that the db is behind. I don't really want to do this, though, lest it clog up our git repo with unnecessary 10-50 MB files with every commit.
Does anyone know a better way to do this?
I'll also note that we are aerospace students. I have some DB experience, but it only comes out of need. We're busy and I'm not looking to become an IT networking guru; I just want to keep it hands-off for them, since they are DB noobs and get the glazed-over look of fear whenever I tell them to do anything with the database. I've made it hands-off for them thus far.
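For reference, the detect-and-reload step described above fits in a few lines. A minimal sketch, with hypothetical paths and credentials, using a stored hash to detect that the local database is behind the tracked dump:

    # refresh_db.py - reload the local database when the tracked dump has changed.
    # Assumes the shared dump lives in the repo at db/dump.sql and the mysql
    # client is on the PATH.
    import hashlib
    import pathlib
    import subprocess

    DUMP = pathlib.Path("db/dump.sql")
    STAMP = pathlib.Path("db/.last_loaded")  # hash of the dump we last imported

    current = hashlib.sha1(DUMP.read_bytes()).hexdigest()
    loaded = STAMP.read_text().strip() if STAMP.exists() else ""

    if current != loaded:
        # The local DB is behind the repo: replay the dump.
        subprocess.run(["mysql", "--user", "root", "--password=secret", "imagedb"],
                       input=DUMP.read_bytes(), check=True)
        STAMP.write_text(current)
        print("database refreshed")
    else:
        print("database already up to date")

Dumping with mysqldump --skip-extended-insert (one INSERT per row) also keeps the file line-stable, so git can store small deltas instead of a fresh 10-50 MB blob on every commit.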
You might want to consider following the Rails-style database migration concept, whereby as you are developing you provide roll-forward and roll-back SQL statements that work as patches, allowing you to roll your database to any particular revision state that is required.
Of course, this is typically meant for dealing with schema changes only (i.e. you don't worry about revisioning data that might be dynamically populated into tables). For configuration tables or similar tables whose content is basically static, you can certainly add migrations as well.
A Google search for "rails migrations for python" turned up a number of results, including the following tool:
http://pypi.python.org/pypi/simple-db-migrate
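If pulling in a framework is overkill, the core of the idea is small enough to sketch yourself. A hypothetical minimal runner that applies numbered roll-forward .sql patches in order and records the last applied revision:

    # migrate.py - apply numbered roll-forward patches (001-*.sql, 002-*.sql, ...).
    # A sketch of the Rails-style idea; paths, table and credentials are hypothetical.
    import pathlib
    import MySQLdb

    db = MySQLdb.connect(host="localhost", user="root", passwd="secret", db="imagedb")
    cur = db.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS schema_version (version INT NOT NULL)")
    cur.execute("SELECT COALESCE(MAX(version), 0) FROM schema_version")
    current = cur.fetchone()[0]

    for patch in sorted(pathlib.Path("migrations").glob("*.sql")):
        version = int(patch.name.split("-")[0])  # e.g. "003-add-index.sql" -> 3
        if version <= current:
            continue  # already applied
        # Naive statement split; fine for patches without ';' inside strings.
        for statement in patch.read_text().split(";"):
            if statement.strip():
                cur.execute(statement)
        cur.execute("INSERT INTO schema_version (version) VALUES (%s)", (version,))
        db.commit()
        print("applied", patch.name)

Roll-back support would be the same loop in reverse over matching *-down.sql patches.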
I would suggest creating a DEV MySQL server on any shared hosting (no DB experience is required).
Allow remote access to this server (again, no experience is required; everything can be done through the control panel).
Then you and your group of developers will have access to the database at any time, from any place and any device (as long as you have an internet connection).

Any reason NOT to use a subdomain for development?

I was originally planning on using a local machine on our network as the development server.
Then I had the idea of using a subdomain.
So if the site was at www.example.com then the development could be done at dev.example.com.
If I did this, I would know that the entire software stack was configured exactly the same for development and production. Also, development could use the same database as production, removing the hassle of syncing the data. I could even use the same media (images, videos, etc.).
I have never heard of anyone else doing this, and with all these pros I am wondering why not?
What are the cons to this approach?
Update
OK, so it seems the major no-no of this approach is using the same DB for dev and production. If you take that out of the equation, is it still a terrible idea?
The obvious pro is what you mentioned: no need to duplicate files, databases, or even software stacks. The obvious con is slightly bigger: you're using the exact same files, databases, and software stacks. Needless to say, if your development isn't working correctly (infinite loops, and whatnot), production will be pulled down right alongside it. Obviously, there are ways to jail both environments within the OS, but in that case you're back to square one.
My suggestion: use a dedicated development machine, not the production server, for development. You want to split it for stability.
PS: Obviously, if the development environment missed a "WHERE id = ?", all information in the production database is removed. That sounds like a huge problem, doesn't it? :)
People do do this.
However, it is a bad idea to run development against a production database.
What happens if your dev code accidentally overwrites a field?
We use subdomains of the production domain for development as you suggest, but the thought of the dev code touching the prod database is a bit hair-raising.
In my experience, using the same database for production and development is nonsense. How would you change your data model without changing your code?
And two more things:
It's wise to prepare all changes as an SQL script that is run after testing from a different environment, not typed into your console. Some accidental updates to a live system once gave me headaches for weeks.
It also happened to me that a restored backup didn't reproduce a live-system problem, because of an unordered query result. That strange behaviour of the backup later helped us find the real problem more easily than retrying on the live system would have.
Using the production machine for development takes away your capacity to experiment. Trying out new modules/configurations can be very risky in a live environment. If I mess up our dev machine with an error in the apache conf, I will just slightly inconvenience my fellow devs. You will be shutting down the live server while people are trying to give you their money.
Not only that, but you will be sharing resources with the live environment. You can forget about stress testing when the dev server also has to deal with actual customers. Any mistake that can cause problems on the development server (an infinite loop taking up the entire CPU, running out of HDD space, etc.) suddenly becomes a real issue.

How does database tiering work?

The only good reference that I can find on the internet is this whitepaper, which explains what database tiering is, but not how it works:
"The concept behind database tiering is the seamless co-existence of multiple (legacy and new) database technologies to best solve a business problem."
But how is it implemented? How does it work?
Any links regarding this would also be helpful. Thanks.
I think the idea of that document is for you to put "cheap" databases in front of the "expensive" databases to reduce costs.
For example, let's assume you have an "expensive" db - something like Oracle, DB2, or even MSSQL (more realistically, it's probably a legacy DB system that is no longer well supported or that needs specialized resources to maintain): a database engine that costs a lot to purchase and maintain. (Arguably these are not expensive when you take all factors into consideration, but let's use them for the example.)
Now if you suddenly get famous and your server starts to get overloaded, what do you do? Do you buy a bigger server and migrate all your data to that new server? That could be incredibly expensive.
With the tiering solution you put several "cheap" databases in front of your "expensive" database to take the brunt of the work. So your web servers (or app servers) talk to a bunch of MySQL servers, for example, instead of directly to your expensive server. These MySQL servers then handle the majority of the calls. For example, they could handle all read-only calls completely on their own and only pass write calls back to the main database server. The MySQL servers are then kept in sync via standard replication practices.
Using methods like this you could in theory scale out your expensive server to dozens, if not hundreds, of "cheap" database servers and handle a much higher load.
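In application code this usually shows up as plain read/write splitting. A hedged sketch with hypothetical hosts, where reads are spread over cheap replicas and only writes reach the expensive primary:

    # Read/write splitting: cheap MySQL replicas absorb the read traffic,
    # only writes reach the expensive primary. Hosts and credentials are
    # hypothetical; standard replication keeps the replicas in sync.
    import random
    import MySQLdb

    primary = MySQLdb.connect(host="expensive-db.internal", user="app", passwd="p", db="app")
    replicas = [MySQLdb.connect(host=h, user="app", passwd="p", db="app")
                for h in ("mysql-replica-1", "mysql-replica-2", "mysql-replica-3")]

    def execute(sql, params=()):
        is_read = sql.lstrip().upper().startswith("SELECT")
        conn = random.choice(replicas) if is_read else primary
        cur = conn.cursor()
        cur.execute(sql, params)
        if is_read:
            return cur.fetchall()
        conn.commit()

The caveat baked into this design is replication lag: a row you just wrote may not be visible on a replica for a moment, so read-your-own-writes queries should go to the primary.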
Database tiering is just a specific style of tiering. There are also application tiering and service tiering. It's a form of scalability.
What exactly are you asking? This question is rather vague.
This is a PDF from a course at Ohio State. What it discusses is a bit over my head, but hopefully you might understand it better.

How to replicate two different database systems?

I'm not sure if it fits Stack Overflow exactly; however, as I'm seeking code rather than a tool, I think it does.
I'm looking for a way to replicate / synchronize different database systems - in this case: MySQL and MongoDB. We run both for different purposes. We started with a MySQL database and added MongoDB later on for special applications. There is data we would like to have in both databases, where we want constraints in MySQL and, respectively, DBRefs in MongoDB. For example: we need a user record in MySQL, but also in MongoDB for references between tables and objects, respectively. At the moment we have a cronjob which dumps the MySQL data and imports it into MongoDB. Though it works quite well, that's not the solution we would like to have.
I think for the moment a one-way replication would be enough - MySQL->MongoDB. The important part is that the replication works in "realtime", much like MySQL master->slave replication works.
Are there already any solutions for this problem or ideas anyone of how to achieve this?
Thanks!
SymmetricDS is open source, Java-based, web-enabled, database-independent data synchronization/replication software that might do the trick with a few tweaks. It has an extension point called IDataLoaderFilter which you could use to implement a MongodbDataLoader.
This would help with one-way database replication. It might be a little more difficult to synchronize from MongoDB -> relational database, but the SymmetricDS team would be very helpful in trying to find a solution.
What you're looking for is called EAI (Enterprise Application Integration). There are a lot of commercial tools around, but under the provided link you'll also find a couple of OSS solutions. The basis of EAI is that you have data sources and data sinks; the EAI framework offers tools to build custom pumps between the two.
I suggest either using a DB trigger to start the synchronization or sending a trigger signal from your applications. Note that there is no turnkey solution, since synchronization can become arbitrarily complex (for example, how do you make sure that all rows are copied?).
As far as I can see, you need to develop some sort of "control program" that has the drivers for each DBMS, and run it as a daemon. The daemon should have a trigger or a very small recheck interval to keep the DBs synchronized.
Technically, you could set up a process which parses the binary log of the MySQL server and replicates the relevant SQL queries. I've never done such a thing with a different database as a slave, but maybe it is worth a shot?
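That binlog-parsing idea is workable, and it is what row-based replication clients automate today. A sketch assuming the pymysqlreplication and pymongo packages, binlog_format=ROW on the master (available from MySQL 5.1 onwards), and tables keyed by an id column:

    # One-way MySQL -> MongoDB replication by tailing the binary log.
    from pymysqlreplication import BinLogStreamReader
    from pymysqlreplication.row_event import (
        DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent)
    import pymongo

    mongo = pymongo.MongoClient()["app"]
    stream = BinLogStreamReader(
        connection_settings={"host": "127.0.0.1", "port": 3306,
                             "user": "repl", "passwd": "secret"},
        server_id=100,      # must be unique among "slaves"
        blocking=True,      # wait for new events: near-realtime
        resume_stream=True,
        only_events=[WriteRowsEvent, UpdateRowsEvent, DeleteRowsEvent])

    for event in stream:
        coll = mongo[event.table]
        for row in event.rows:
            if isinstance(event, DeleteRowsEvent):
                coll.delete_one({"_id": row["values"]["id"]})
            elif isinstance(event, UpdateRowsEvent):
                doc = row["after_values"]
                coll.replace_one({"_id": doc["id"]}, doc, upsert=True)
            else:  # WriteRowsEvent
                doc = row["values"]
                coll.replace_one({"_id": doc["id"]}, doc, upsert=True)

Persisting the stream position (the log_file/log_pos pair the reader accepts) between runs would let the daemon resume where it left off after a restart.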

Which server should I choose for MySQL: Windows or Unix/Linux/Ubuntu/Debian?

I'm working on a SaaS project and MySQL is our main database. Our application is written in C# .NET and runs on a Windows 2003 server.
Considering maintenance, cost, options and performance, which server platform should I choose for MySQL hosting: Windows or Unix/Linux/Ubuntu/Debian?
The scenario is as following:
The server I run today has a moderate transaction volume. Databases grow by 5 MB daily, we expect that to reach 50 MB daily within a couple of months, and the system is mission critical.
I don't know how big the database is going to get. We rent a VPS to host the application and database server.
Most of our queries are simple, but our ORM tool makes constant use of subqueries. We also run reports, both simple and heavy ones; some run after a user click, but most run in order from a queue.
Buying extra co-lo space will be nice as we get more clients. It's a SaaS project, after all.
When developing, you can use your Windows box to also run a MySQL server. If and when you want to have your DBMS on a separate server, it can be either a Windows or a Linux server.
MySQL and supporting tools (for backup etc.) probably offer more choices on Linux.
There are also 3rd party suppliers who will host your MySQL database on their servers. The benefit is they will handle backups, maintenance etc.
Also: look into phpMyAdmin for use as a great admin tool.
Larry
I think you need more information to make an informed decision. It's hard to just pull out a "best" answer based on no specific information.
What is your expected transaction volume?
How big will the database get?
How complex are your queries, i.e. are they long-running or relatively quick?
Are you hosting the application on your own server at your own location? If you have to buy extra co-lo space maybe an extra server isn't the best option.
How "mission critical" is this database? Ie maybe you need replicated servers to ensure stability.
There is a server sizing tool online at http://www.sizinglounge.com/, so you should check that out. It sounds like your server could be smaller than their smallest tier, but it should be a good place to start.
If this is a mission critical application you need to do some kind of replication to an extra server in case the primary one fails, so you are definitely looking at two systems. This has to be in addition to a good backup plan.
Given that you are uncertain about how big it could get you might just continue renting a server. For your backup one idea would be to look at running MySQL on an Amazon EC2 instance. BTW it is important to have a remote replicated server. If you have two systems next to each other and an environmental problem comes up, they could both be out of commission at the same time. But with a remote copy your options are open to potentially working around it.
If you run a lot of read-only queries locally and have your site hosted somewhere, it might make sense to set up a local replicated database copy to query against. That could potentially improve both your website and local performance quite a bit. Plus it would give you some good peace of mind, having a local copy under your control.
HTH,
Brandon