Using version control (Git) on a MySQL database

I am a WordPress Designer/Developer, who is getting more and more heavily involved with using version control, notably Git, though I do use SVN for some projects. I am currently using Beanstalk for my remote repo.
Adding all of the WordPress files to my repo is no problem. I know I could .gitignore the wp-config file if I wanted to, but since I'm currently the only developer and these projects are closed source, it makes little sense.
WordPress relies heavily on the database, as any CMS does, to keep textual content, and many settings depending on the specific plugin/theme configuration I'm using. I'm wondering what the best way of using version control on the database would be, if it's even possible. I guess I could do a SQL dump, though my MySQL server is running on Windows (read as: I don't know how to do it), and then add the SQL dump to my repository. But when I push something live, that poses huge security threats.
Is there an accepted practice of doing this?

You can back up your database within a Git repository. Of course, if you place the data into Git in a binary form, you will lose all of Git's ability to store the data efficiently using diffs (changes). So the number one best practice is this: store the data in a text serialised format.
mysqldump is a suitable program to help you do this. It isn't perfect, though. If anything disturbs the serialisation order of items (e.g. as a result of creating new tables) then artificial breaks will enter into the diff, which decreases the efficiency of storage. You could write a custom serialiser to serialise changes only -- but then you are doing the hard work that Git is already good at. Just use the SQL dump.
That being said, what you want to do isn't what devs normally mean when they talk about putting the database in Git. For instance, if you read the link posted by @eggyal (link to codinghorror) you will see that what is actually placed in Git are the scripts needed to generate the initial database. There may be additional scripts, like those to populate the database with a clean state, or to populate it with testing data. All such SQL scripts are text files, in pretty much the same format as the SQL dump you would get from mysqldump. So there's no reason you can't do it that way with your day-to-day data as well.
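As a rough sketch of the "text dump in Git" idea, the Python below builds a mysqldump call tuned for readable diffs. The function names, database name and user are placeholders, not anything from the question. It assumes mysqldump is on the PATH (it ships with the MySQL server and client tools, on Windows too) and that the password comes from a MySQL option file rather than the command line.

```python
import subprocess
from pathlib import Path


def dump_command(database: str, user: str) -> list[str]:
    """Build a mysqldump invocation whose output diffs cleanly between runs."""
    return [
        "mysqldump",
        f"--user={user}",
        "--skip-extended-insert",  # one INSERT per row, so a changed row is a changed line
        "--skip-dump-date",        # no completion timestamp that differs on every run
        "--order-by-primary",      # rows sorted by primary key for a stable order
        "--single-transaction",    # consistent snapshot for InnoDB tables
        database,
    ]


def write_dump(database: str, user: str, target: Path) -> None:
    """Run mysqldump and write its output to a file tracked by Git."""
    with target.open("wb") as out:
        subprocess.run(dump_command(database, user), stdout=out, check=True)
```

Calling write_dump("wordpress", "wp_user", Path("db/dump.sql")) and then committing db/dump.sql would give one text file per snapshot. Keep in mind that the dump contains everything in the database, including user records, so the repository needs the same protection as the database itself.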

There is not much software available for version-controlling databases like MySQL and MongoDB.
But one tool is under development, and its beta version is due to launch soon. Check out Klonio - Version Control for databases

The article How to Sync A Local & Remote WordPress Blog Using Version Control gives advice on how to automate sync between two instances (development, production) of a WordPress blog using Mercurial. It mentions that for this scenario, Git and Mercurial are very similar.
Step 4 (Synchronizing The Databases) is of interest here.
The database content will be exported to a file that is tracked by the revision control. Each time we pull changes, the database content will be replaced by this file, making our database up-to-date.
Then, it elaborates on conflicts and the scripting part of the job.
There is a version control tutorial in Mercurial out there, if you're not familiar with it.
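The "replace the database content with the tracked file" step could look roughly like this in Python. This is my own sketch, not the article's script. The function names are made up, and it assumes the mysql client is on the PATH with credentials in an option file. It could be run by hand after a pull or from a Git post-merge hook.

```python
import subprocess
from pathlib import Path


def restore_command(database: str, user: str) -> list[str]:
    """Build the mysql client invocation; the dump itself is fed on stdin."""
    return ["mysql", f"--user={user}", database]


def restore_dump(database: str, user: str, dump_file: Path) -> None:
    """Load the tracked dump into the database (intended to run after a pull)."""
    with dump_file.open("rb") as dump:
        subprocess.run(restore_command(database, user), stdin=dump, check=True)
```

This only replaces existing tables if the dump contains DROP TABLE IF EXISTS statements, which mysqldump writes by default. Anything changed locally since the last export is lost, which is why the article spends time on conflicts.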

If you are only interested in keeping schema changes under version control, there is a nice tool called SqlRog. It extracts the schema into project files that can be put under Git.

Be aware that WordPress stores all news feed content in the database, so even if you don't make any changes yourself, a lot of content will change between dumps.

Related

MySQL & GIT - Is it possible to use GIT versioning to merge MySQL databases? [duplicate]


Git Repository And Database Schemas

My company uses Git for “version control”, etc. Currently it is used for C, C# and Python. I have been asked to add the database schemas, together with the more “complex” SQL (no idea when it becomes “complex”), to the repository. Currently the database is backed up after changes have been made to the schemas or after data has been added (at the moment it is purely a development environment). Having looked at Git, database schemas and the like do not really seem (to me) to map onto it. Should I be considering another package for “source control” to complement the existing MySQL backups?
Thank you...
Assuming you are just wanting to store the SQL scripts that can recreate your DB schema without any data in it (CREATE TABLE, VIEW, INDEX, etc.) then Git seems like a perfectly good option. Git is generally good for version control of textual data, such as SQL scripts.
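One practical wrinkle with schema-only scripts: if they come from mysqldump --no-data, the table options still carry the current AUTO_INCREMENT counter, so the file changes whenever rows are added. A small Python filter (my own sketch, with a made-up function name) can strip that noise before committing:

```python
import re

# Matches the table-level counter, e.g. " AUTO_INCREMENT=42", but not the
# AUTO_INCREMENT keyword on a column definition (which has no "=<number>").
_AUTO_INCREMENT = re.compile(r" AUTO_INCREMENT=\d+")


def normalise_schema(dump_text: str) -> str:
    """Drop per-table AUTO_INCREMENT counters so only real schema changes show up in diffs."""
    return _AUTO_INCREMENT.sub("", dump_text)
```

With that in place, a diff between two commits of the schema file reflects changed tables, views and indexes rather than how much data happened to be in the development database.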
The rule of thumb is not to store large, frequently modified files in Git, for several reasons that are outside the scope of this answer (heuristics, snapshots, etc.). So I would suggest not adding them to Git directly and instead storing them in a submodule as a standalone repository.
This way you can still use Git to track changes, but your main repository will not grow to a huge size (pack files), and you can still manage it inside your project.
If you only want to store the SQL scripts, Git is a good choice since it will handle them like any other file.

Are there generic options for version control within a database?

I have a small amount of experience using SVN on my development projects, and I have just as little experience with relational databases. I know the basic concepts like tables, and SQL statements, but I'm far from being an expert.
What I'd like to know is if there are any generic version control type systems like SVN, but that work with a database rather than files. I would like the same kind of features you get with SVN like the ability to create branches, create tags, and merge branches together. Rather than a revision number being associated to a version of a file repository it would be associated with a version of the database.
Are there any generic solutions available that can add this kind of functionality independent of the actual database schema? I'd be interested in solutions that work with MySQL or MS SQL Server.
I should also clarify that I'm trying to version control the data, not the schema. I would expect the schema to remain constant. So really it seems like I want a way to create a log of all the INSERT, UPDATE, and DELETE requests sent to the database between each version of the data. That way any version could be recreated by resending all the SQL statements that have been saved up to the desired version.
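To make the idea concrete, here is a minimal Python sketch of such a statement log. It uses an in-memory SQLite database purely so that it runs without a server. The table, the log entries and the function name are invented for illustration.

```python
import sqlite3

# Fixed schema, as the question assumes; only the data changes.
SCHEMA = "CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)"

# Hypothetical change log: (data version, SQL statement, parameters).
CHANGE_LOG = [
    (1, "INSERT INTO items (id, name) VALUES (?, ?)", (1, "apple")),
    (2, "INSERT INTO items (id, name) VALUES (?, ?)", (2, "pear")),
    (3, "UPDATE items SET name = ? WHERE id = ?", ("green apple", 1)),
    (4, "DELETE FROM items WHERE id = ?", (2,)),
]


def rebuild(version: int) -> list[tuple]:
    """Recreate the data as of `version` by replaying the log from the start."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    for logged_version, statement, params in CHANGE_LOG:
        if logged_version > version:
            break
        conn.execute(statement, params)
    rows = conn.execute("SELECT id, name FROM items ORDER BY id").fetchall()
    conn.close()
    return rows
```

Here rebuild(2) gives both rows as first inserted, and rebuild(4) gives only the renamed first row. The obvious cost is that rebuilding any version means replaying the whole log up to that point.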
You can script all your DDL, stored procedures and such to regular text files.
Then you can simply use SVN for database versioning.
I've never found a solution that works as well as Subversion, but here are a few things I've done that have helped:
Make scripts that will create the schema and populate any initial data. Then make an update script for each change after that. It's a fairly manual process, but it works. There are extra things that help, like storing the current version number in a table in the db and making sure that the scripts are idempotent.
Store the full development db in Subversion. This doesn't usually work out too well for me if there is a lot of data or it is frequently changed. But in some projects it could work.
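The first approach (update scripts plus a version number stored in the db) can be sketched like this in Python. SQLite is used only so the example runs anywhere. The table name schema_version and the two update scripts are assumptions of mine.

```python
import sqlite3

# Hypothetical update scripts, keyed by the schema version they produce.
UPDATES = {
    1: "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)",
    2: "ALTER TABLE posts ADD COLUMN body TEXT",
}


def current_version(conn: sqlite3.Connection) -> int:
    """Read the version stored in the database, creating the table on first use."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def upgrade(conn: sqlite3.Connection) -> int:
    """Apply every update newer than the stored version; safe to run repeatedly."""
    version = current_version(conn)
    for target in sorted(v for v in UPDATES if v > version):
        conn.execute(UPDATES[target])
        conn.execute("INSERT INTO schema_version (version) VALUES (?)", (target,))
        conn.commit()
        version = target
    return version
```

Running upgrade a second time does nothing, because every script at or below the stored version is skipped. That is the idempotency that makes it safe to run on every machine after each checkout.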
I keep and maintain create scripts in my version control system.
There are two things I can think of:
http://www.liquibase.org/ - provides a way of generally managing database changes. Creates files that get committed into source control, and it helps manage changes across different development databases, etc.
http://www.viget.com/extend/backup-your-database-in-git/ - this describes a strategy for backing up a database into source control, but the same strategy can be used just on the schema. In this scheme, the database would be in a separate area from your main code. (This can be used with other source control systems too.)

MySQL Version Control - Subversion

I'm wondering if it is possible to have version control of a MySQL database.
I realize this question has been asked before; however, the newest one is almost a year old, and at the rate things change...
The problem is that each developer has Apache/MySQL/PHP on their own computer, where they sometimes edit the database. It's rather inconvenient if they have to send an email to all the other developers and then manually edit the test server's database.
How do you deal with this problem?
Thanks
This is not a MySQL-specific solution in itself, but we've had a lot of success with a product called Liquibase. (http://www.liquibase.org/)
It's a migration solution which covers many different database vendors, allowing all database changes to be coded in configuration files, all of which are kept in Subversion. Since all configuration is kept in XML files, it's easy to merge other people's changes into the mainline script and it plays well with tags and branches.
The database can be brought up to the current revision level by running the "update database" command. Most changes can also be rolled back, which can be helpful too. I would recommend following the practice of making sure you get current before running the migration, as this would likely be easiest.
Finally, when it comes to a production delivery, you can choose to have all the database changes output as a full SQL script so it can allow DBAs to run it and maintain a separation of duties.
So far, it's worked like a charm.
Well, we use Rails, which keeps all the changes in migration files. I know that a couple of PHP frameworks do the same thing - Symfony, for instance. So when all the changes are merged in our repository (we use Mercurial), we can see all the changes in migrations that need to be or were applied to the database in development. Then the person responsible for production rolls out the code to production after a full backup is made. However, if you don't use a PHP framework that takes care of this, then awied's suggestion sounds very interesting - I hadn't heard of Liquibase before, but I will definitely check it out.
There is a tool called iBatis, now called MyBatis, that handles versions of databases perfectly.
It takes a little work to have all your changes in script instead of with a graphical tool, but, if you are familiar with coding, it's not a problem.
When you have multiple databases (like dev-test-prod), you just make 3 environment files and you can update one environment with only one command-line instruction.

How do you maintain revision control of your database structure?

What is the simplest way of keeping track of changes to a project's database structure?
When I change something about the database (e.g. add a new table, add a new field to an existing table, add an index, etc.), I want that to be propagated to the rest of the team, and ultimately the production server, with minimal fuss and effort.
At the moment, the solution is pretty weak and relies on people remembering to do things, which is an accident waiting to happen.
Everything else is managed with standard revision control software (Perforce in our case).
We use MySQL, so tools that understand that would be helpful, though I would also be interested to learn how other places are handling this anyway, regardless of database engine.
You can dump the schema and commit it -- and the RCS will take care of the changes between versions.
You can get a tool like SQL Compare from Redgate, which lets you point to two databases, tells you what is different, and builds alter scripts for you.
If you're using .NET (Visual Studio), you can create a Database project and check that into source control.
This has already been discussed a lot, I think. Anyhow, I really like Rails' approach to the issue. It's code that has three things:
The version number
The way of applying the changes (updates a version table)
The way of rolling the changes back (sets the version on the version table to the previous)
So, each time you make a changeset you create this code file that can rollback or upgrade the database schema when you execute it.
This, being code, you can commit in any revision control system. You commit the first dump and then the scripts only.
The great thing about this approach is that you can easily distribute the database changes to customers, whereas with the standard "just dump the schema and update it" approach, generating an upgrade/rollback script is a nuisance.
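A stripped-down Python sketch of that shape (version number, a way to apply, a way to roll back) is below. It is not how Rails implements migrations. The class, the two changesets and the use of SQLite are my own choices so the example is runnable, and the current version is passed in by the caller rather than read from a version table to keep it short.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Migration:
    version: int
    up: str    # SQL that applies the change
    down: str  # SQL that reverts it


# Hypothetical changesets for illustration.
MIGRATIONS = [
    Migration(1, "CREATE TABLE users (id INTEGER PRIMARY KEY)", "DROP TABLE users"),
    Migration(2, "CREATE TABLE notes (id INTEGER PRIMARY KEY)", "DROP TABLE notes"),
]


def migrate(conn: sqlite3.Connection, current: int, target: int) -> int:
    """Move the schema from `current` to `target`, upgrading or rolling back."""
    if target > current:
        for m in MIGRATIONS:
            if current < m.version <= target:
                conn.execute(m.up)
    else:
        # Roll back newest first so dependent objects are removed in order.
        for m in reversed(MIGRATIONS):
            if target < m.version <= current:
                conn.execute(m.down)
    conn.commit()
    return target
```

Because each changeset carries both directions, the same files serve for upgrading a customer from any older version and for backing out a bad release.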
In my company, each developer is encouraged to save all db structure changes to script files in a folder named after the module's revision number. These scripts are kept in the svn repository.
When the application starts, the db upgrade code compares the current db version with the code version and, if the code is newer, looks into the scripts folder and applies all db changes automatically.
This way every instance of the application (on production or developers' machines) always upgrades the db to its code version, and it works great.
Of course, some automation could be added - if we find a suitable tool.
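The start-up check described above might select its scripts along these lines. This is a guess at the mechanics, not the actual code from that company. The folder layout (one directory per revision number, .sql files inside) and the function name are assumptions.

```python
from pathlib import Path


def pending_scripts(scripts_dir: Path, db_version: int, code_version: int) -> list[Path]:
    """Return the scripts needed to bring the database up to the code's version.

    Assumed layout (hypothetical): scripts_dir/<revision number>/<anything>.sql
    """
    pending = []
    revision_dirs = (d for d in scripts_dir.iterdir() if d.is_dir() and d.name.isdigit())
    # Sort numerically so that revision 10 comes after revision 9.
    for revision_dir in sorted(revision_dirs, key=lambda d: int(d.name)):
        if db_version < int(revision_dir.name) <= code_version:
            pending.extend(sorted(revision_dir.glob("*.sql")))
    return pending
```

The application would then execute each returned file in order and store code_version in the database once they all succeed.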
Poor man's version control:
Separate file for each object (table, view, etc.)
When altering tables, you want to diff CREATE TABLE to CREATE TABLE. Source code history is for communicating a story. You can't do a meaningful diff of CREATE TABLE and ALTER TABLE.
Try to make changes to the files, then commit them to source control, then commit them to the SQL database. Most tools support this poorly, because you shouldn't commit to source control until you test, and you can't test without putting the code into SQL. So in practice, you try to use a tool like Redgate's SQL Compare to compare your files to the SQL database. Failing that, you adopt a harsh policy of dropping everything in the database and replacing it with what made it into source control.
Change scripts are usually single use, but applications exist, like WordPress, where you need to move the schema from 1.0 to 1.1, 1.1 to 1.5, etc. Each of those scripts should be under source control and modified as such (i.e. as you find bugs in the script that moves you from 1.0 to 1.1, you create a new version of that script, not yet another script).
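For the "separate file for each object" point, a schema dump can be split into one CREATE TABLE per table with a few lines of Python. This sketch assumes the layout mysqldump produces (backquoted names, closing parenthesis at the start of a line) and would need more care if a table option contained a semicolon. The function name is made up.

```python
import re

# Matches one CREATE TABLE statement as mysqldump writes it: the closing
# parenthesis starts its own line and the table options follow up to ";".
_CREATE_TABLE = re.compile(r"CREATE TABLE `(?P<name>[^`]+)` \(.*?\n\)[^;]*;", re.DOTALL)


def split_tables(schema_dump: str) -> dict[str, str]:
    """Map each table name to its CREATE TABLE statement, one entry per future file."""
    return {m.group("name"): m.group(0) for m in _CREATE_TABLE.finditer(schema_dump)}
```

Writing each entry to its own file, for example tables/<name>.sql, means the history of a single table reads as a sequence of CREATE TABLE to CREATE TABLE diffs.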