How to merge data from mysql schemas that have diverged? - mysql

I have two servers that share an original ancestor codebase, but whose database schemas have diverged over the past couple of months (I'm using MySQL). I'm about to use the second one as my new production server, but I have to update the data (there are new users, there's new data related to those users, etc.). I want the data in the server that's currently live, but has the old schema, to be authoritative, yet I want the schema in the new one to be the final one. So it's kind of a weird merge: I want data from the old server to be imported into a new server with a (not vastly) different schema.
I was thinking of simply making a dump of the server with the most up-to-date data, but then loading it wouldn't work since the schema has changed quite a bit.
I was also thinking of dumping the schema of the new server, applying it to a copy of the old one, then dumping the data from that copy and loading it into the new one, but I'm not sure how to go about doing that, or whether it's the safest option.
I develop on Mac OS X and both of my servers run Debian.

Applying the schema from the new server to the old one and then migrating the data is the safest option, largely because it forces you to evaluate what specifically has changed and what you want to do about that in terms of data (e.g., where a new column is added, what do you want to put in it?).
Since you mentioned the schemata are not massively different, simply doing a mysqldump without data (i.e., tables only) of each server and manually comparing (e.g., with diff) would tell you what columns are different. You can then apply those changes with ALTER on the old database.
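For example, a rough sketch of that comparison from the command line (the database names, credentials, and the sample ALTER are placeholders, not taken from the question):
# dump table definitions only, no data
mysqldump --no-data --skip-comments -u root -p old_db > old_schema.sql
mysqldump --no-data --skip-comments -u root -p new_db > new_schema.sql
# see which tables/columns/indexes differ
diff -u old_schema.sql new_schema.sql
# then bring the old database in line, e.g. (hypothetical column):
mysql -u root -p old_db -e "ALTER TABLE users ADD COLUMN signup_source VARCHAR(64) NULL;"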
It's all a little kludgy, but then ultimately there isn't really a non-kludgy way of doing this.

Look here: http://bitbucket.org/idler/mmp - it is a tool for mysql schema versioning, but only schema, not the data. First you must migrate your schema, then load your new data.

Related

merge design of mysql between localhost and server?

I'm kind of new to this kind of problem. I'm developing a web app and changing the DB design as I try to improve it and add new tables.
Until a few days ago we hadn't published the app,
so what I would do was dump all the tables on the server and import my local version. But now we're past version 1 and users are starting to use it,
so I can't just dump over the server, but I still need to update the design of the server DB when I want to publish a new version. What are the best practices here?
I'd like to know how I can manage the differences between the local and server databases in MySQL.
I need to preserve the data on the server and just change the design; the data in the local DB is only for testing.
Before this, all my other apps were small and I would change a single table or column by hand, but I can't keep track of all the changes now, since I might revert many of them later, and managing this across all team members is impossible.
Assuming you are not using a framework that provides a database migration tool, you need to keep track of the changes manually.
Create a folder sql_upgrades (or whatever name you like) in your code repository.
Whenever a team member updates the SQL schema, he creates a file in this folder with the corresponding ALTER statements, and possibly UPDATE, CREATE TABLE etc. So basically the file contains all the statements used to update the dev database.
Name the files so that it's easy to manage, and that statements for the same feature are grouped together. I suggest something like YYYYMMDD-description.sql, e.g. 20150825-queries-for-feature-foobar.sql
When you push to production, execute the files to upgrade your SQL schema in production. Only execute the files that have been created since your last deployment, and execute them in the order they were created.
Should you need to rollback a file, check the queries it contains, and write queries to undo what was done (drop added columns, re-create dropped columns, etc.). Note that this is "non-trivial", as many changes cannot be rolled back fully (e.g. you can recreate a dropped column, but you will have lost the data inside).
Many web frameworks (such as Ruby on Rails) have tools that will do exactly that process for you. They usually work together with the ORM provided by the framework. Keeping track of the changes manually in SQL works just as well.
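As an illustration, one of those files (say 20150825-queries-for-feature-foobar.sql) could look like the following; the table and column names are invented for the example:
-- 20150825-queries-for-feature-foobar.sql
-- all statements needed to bring the dev/production schema up to date for the "foobar" feature
ALTER TABLE users ADD COLUMN foobar_enabled TINYINT(1) NOT NULL DEFAULT 0;
CREATE TABLE foobar_settings (
  id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  user_id INT UNSIGNED NOT NULL,
  value VARCHAR(255) NOT NULL,
  KEY idx_foobar_settings_user (user_id)
) ENGINE=InnoDB;
-- backfill existing rows so the new feature has sensible defaults
UPDATE users SET foobar_enabled = 1 WHERE created_at < '2015-08-25';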

How to log mysql database structural changes

I'm working on a project that uses MySQL as the database. The application is hosted for many clients, and we often do upgrades of the current live systems.
In some instances a client has changed the database structure (adding new tables), which causes unexpected DB crashes.
I need to log all the structural changes made to that database so we can find the correct root cause. We can't do it 100% correctly with a diff tool, because it won't show the intermediate changes.
I found the http://www.liquibase.org/ tool, but it seems a little bit complex.
Is there any well-known technique or tool to track database structural changes only?
Well, from MySQL Studio you can generate every object's schema definition and compare it with your standard schema definition; this way you can compare the two database schemas.
Generating scripts of both databases (one being the client's database and one the master-copy database) and then comparing them with a file-compare tool would be the best practice in my opinion, because this way you can track which column was added, which column was deleted, which index was added, and so on, without downloading any tool.
Possible duplicate of Compare two MySQL databases?
Hope this helps.
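As an alternative, if both schemas happen to live on the same MySQL server, a rough comparison can also be done directly against information_schema; this is only a sketch, and the database names client_db and master_db are placeholders:
-- columns present in the client's database but missing or different in the master copy
SELECT table_name, column_name, column_type
FROM information_schema.columns
WHERE table_schema = 'client_db'
  AND (table_name, column_name, column_type) NOT IN (
    SELECT table_name, column_name, column_type
    FROM information_schema.columns
    WHERE table_schema = 'master_db'
  );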
If you have an application for your clients to manage these schema changes, you can use a mechanism at application level. If you have a Python and Django-based solution, you could probably use South which provides schema change tracking and rollbacks.

How do I manage a set of mysql tables in a production Rails app that are periodically recreated?

I have a production Rails app that serves data from a set of tables that are built from a LOAD DATA LOCAL INFILE MYSQL import of CSV files, via a ruby script. The tables are consistently named and the schema does not change. The script drops/creates the tables and schema, then loads the data.
However I want to re-do how I manage data changes. I need a suggestion on how to manage new published data over time, since the app is in production, so I can (1) push data updates frequently without breaking the application servicing user requests and (2) make the new set of data "testable" before it is live (with the ability to roll back to the previous tables/data if something went wrong).
What I'm thinking is keeping a table of "versions" and creating a record each time a new rebuild is done. The latest version ID could be stuck into the database.yml, and each model could specify a table name from database.yml. A script could move the version forward or backward to make sure everything is ok on the new import, without destroying the old version.
Is that a good approach? Any patterns like this already? It seems similar to Rails' migrations somewhat. Any plugins or gems that help with this sort of data management?
UPDATE/current solution: I ended up creating a separate entry in database.yml and creating the tables there at import time. The data doesn't change based on the environment, so it is a "peer" to the environment-specific config. Since there are only four models to update, I added the database connection explicitly:
establish_connection Rails.configuration.database_configuration["other_db"]
This way migrations and queries work as normal with Rails. So that I can keep running imports, I update the database name in the separate config for each import. I could manually specify the previous database version this way and restart the app if there was a problem.
require 'yaml'
path = File.join("config/database.yml")   # same file the app reads its DB config from
config = YAML.load_file(path)
config["other_db"]["database"] = OTHER_DB_NAME  # point "other_db" at the freshly imported database
File.open(path, 'w') { |f| f.write(config.to_yaml) }
One option would be to use soft deletes or an "is active" column. If you need to know when records were replaced/deleted, you can also add columns for date imported and date deleted. When you load new data, default "is active" to false. Your application can preview the newly loaded data by using different queries than the production application, and when you're ready to promote the new data, you can do it in a single transaction so the production application gets the changes atomically.
This would be simpler than trying to maintain multiple tables, but there would be some complexity around separating previously deleted rows and incoming rows that were just imported but haven't been made active.
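A minimal sketch of that promotion step, assuming an imported_rows table with is_active, imported_at, and deleted_at columns (all of these names, and the @new_batch session variable, are illustrative):
-- retire the currently active rows and activate the freshly loaded batch atomically
START TRANSACTION;
UPDATE imported_rows SET is_active = 0, deleted_at = NOW() WHERE is_active = 1;
UPDATE imported_rows SET is_active = 1 WHERE imported_at = @new_batch;
COMMIT;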

How to compare/update two mySQL databases' schema

Okay, I've got two databases, the second one being a more up-to-date version of the first. It has new columns, tables, constraints, and whatnot.
I was wondering if there is a solid program out there that will update the first database with everything from the second, already-updated database (not the data, just the tables, columns, and all that), or am I stuck creating my own update script from scratch?
I actually found another post that did not look like the same thing, but it still helped me. I found a program called Toad for MySQL, which has a compare-schema option that compares the two databases and can then sync one to the other (it creates a script and executes it). It seems to be working flawlessly, but I'm still testing the web app that uses the database to ensure this is true.
If you're on Windows, the RedGate SQL data and schema compare tools are beautiful:
http://mysql-compare.com/info
I've used them a few times. They're quite simple to use.
They're designed around creating DB diffs for moving from dev/QA/staging environments to integration/production environments (so yes, they generate scripts).
If you are looking for a tool that will compare at the schema level then I would suggest Navicat. The older version that I use works well for getting my production and development boxes in synch. I don't recommend it for large levels of data synchronization though - it seems very slow compared to a SQL dump and SQL import.

How do you maintain revision control of your database structure?

What is the simplest way of keeping track of changes to a project's database structure?
When I change something about the database (e.g., add a new table, add a new field to an existing table, add an index, etc.), I want that to be propagated to the rest of the team, and ultimately the production server, with minimal fuss and effort.
At the moment, the solution is pretty weak and relies on people remembering to do things, which is an accident waiting to happen.
Everything else is managed with standard revision control software (Perforce in our case).
We use MySQL, so tools that understand that would be helpful, though I would also be interested to learn how other places are handling this anyway, regardless of database engine.
You can dump the schema and commit it -- and the RCS will take care of the changes between versions.
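Since you're on MySQL and Perforce, that can be as simple as something like the following (the database name and depot path are placeholders):
# dump the structure only, no data, into the repository
mysqldump --no-data --skip-comments --skip-dump-date myapp_db > db/schema.sql
# check the updated schema in; the RCS diffs then show what changed between versions
p4 edit db/schema.sql
p4 submit -d "Update DB schema" db/schema.sql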
You can get a tool like Sql Compare from Red-Gate which allows you to point to two databases and it will let you know what is different, and will build alter scripts for you.
If you're using .NET(Visual Studio), you can create a Database project and check that into source control.
This has already been discussed a lot, I think. Anyhow, I really like Rails' approach to the issue. It's code that has three things:
The version number
The way of applying the changes (updates a version table)
The way of rolling the changes back (sets the version on the version table to the previous)
So, each time you make a changeset you create this code file that can rollback or upgrade the database schema when you execute it.
This, being code, you can commit in any revision control system. You commit the first dump and then the scripts only.
The great thing about this approach is that you can easily distribute the database changes to customers, whereas with a standard "just dump the schema and update it" approach, generating an upgrade/rollback script is a nuisance.
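In plain SQL, a changeset along those lines could look something like this; the schema_version table, the version numbers, and the users column are all made-up examples (in Rails itself this would be a Ruby migration class):
-- changeset 0003: add last_login to users
-- apply:
ALTER TABLE users ADD COLUMN last_login DATETIME NULL;
UPDATE schema_version SET version = 3;
-- roll back (run instead of the above to undo):
-- ALTER TABLE users DROP COLUMN last_login;
-- UPDATE schema_version SET version = 2;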
In my company each developer is encouraged to save all DB structure changes to script files in a folder named after the module's revision number. These scripts are kept in the SVN repository.
When the application starts, the DB upgrade code compares the current DB version with the code version and, if the code is newer, looks into the scripts folder and applies all DB changes automatically.
This way every instance of the application (on production or on developers' machines) always upgrades the DB to its code version, and it works great.
Of course, some automation could be added - if we find a suitable tool.
Poor man's version control:
Separate file for each object (table, view, etc)
When altering tables, you want to diff CREATE TABLE to CREATE TABLE. Source code history is for communicating a story. You can't do a meaningful diff of CREATE TABLE and ALTER TABLE
Try to make changes to the files, then commit them to source control, then commit them to the SQL database. Most tools poorly support this because you shouldn't commit to source control until you test and you can't test without putting the code into SQL. So in practice, you try to use SQL Redgate to compare your files to the SQL database. Failing that, you adopt a harsh policy of dropping everything in the database and replacing it with what made it into source control.
Change scripts are usually single-use, but applications exist, like WordPress, where you need to move the schema from 1.0 to 1.1, 1.1 to 1.5, etc. Each of those should be under source control and modified as such (i.e., as you find bugs in the script that moves you from 1.0 to 1.1, you create a new version of that script, not yet another script).