How to keep track of database updates associated with Jira issues?

We are using Jira as our issue tracker, and our team works with Mercurial repositories. When a developer makes a database change associated with a Jira issue, he adds the SQL as a comment on the issue. The problem with this is that when it comes time to push these issues to our production site, I need to browse through each issue going live to see which ones have DB updates in their comments. There has to be a better way!
Our production MySQL DB is on a shared host that does not allow us direct access. Any SQL updates I want to go live need to be emailed in a SQL file to be imported.
Thanks.

What you describe is a common problem when developing against a database. The usual solution is "database versioning".
The basic idea is that different states of your schema (i.e. your tables, columns, stored procedures etc.) get different version numbers. Then scripts for migrating between schema versions are created and stored.
Be warned that you'll likely need to fundamentally change your workflow. I don't think keeping the SQL code for migrations in JIRA is a sustainable strategy. SQL is code, and belongs in the code repository.
See e.g. this question for details and techniques: Database Schema Versioning Strategies
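Given the shared-host constraint (SQL goes live via an emailed file), one low-tech convention keeps the SQL in the repository and still ties it to Jira: store one migration file per issue key and bundle the not-yet-applied ones into a single import file. A rough sketch - the migrations/ layout, the applied.txt tracking file, and every name in it are made up for illustration:

    #!/usr/bin/env python
    """Bundle pending migration files into one SQL file to email to the host.

    Sketch only: assumes a migrations/ directory with one file per Jira issue
    (e.g. migrations/PROJ-123.sql) and a plain-text list of already-applied
    files kept in the repo (applied.txt). Both conventions are invented.
    """
    import os

    MIGRATIONS_DIR = "migrations"
    APPLIED_LIST = os.path.join(MIGRATIONS_DIR, "applied.txt")

    def pending_migrations():
        applied = set()
        if os.path.exists(APPLIED_LIST):
            with open(APPLIED_LIST) as f:
                applied = set(line.strip() for line in f if line.strip())
        scripts = sorted(s for s in os.listdir(MIGRATIONS_DIR) if s.endswith(".sql"))
        return [s for s in scripts if s not in applied]

    def bundle(outfile="to_email.sql"):
        with open(outfile, "w") as out:
            for name in pending_migrations():
                out.write("-- %s\n" % name)  # keep the Jira key visible in the bundle
                with open(os.path.join(MIGRATIONS_DIR, name)) as f:
                    out.write(f.read().rstrip() + "\n\n")

    if __name__ == "__main__":
        bundle()

After the host imports the bundle, append the bundled file names to applied.txt and commit, so the next bundle only picks up new issues.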


How to reconcile WordPress data between staging and production

I've had a hard time trying to find good examples of how to manage database schemas and data between development, test, and production servers.
Here's our setup. Each developer has a virtual machine running our app and the MySQL database. It is their personal sandbox to do whatever they want. Currently, developers will make a change to the SQL schema and do a dump of the database to a text file that they commit into SVN.
We want to deploy a continuous integration development server that will always be running the latest committed code. If we do that now, it will reload the database from SVN for each build.
We have a test (virtual) server that runs "release candidates." Deploying to the test server is currently a very manual process, and usually involves me loading the latest SQL from SVN and tweaking it. Also, the data on the test server is inconsistent. You end up with whatever test data the last developer to commit had on his sandbox server.
Where everything breaks down is the deployment to production. Since we can't overwrite the live data with test data, this involves manually re-creating all the schema changes. If there are a large number of schema changes or conversion scripts to manipulate the data, this can get really hairy.
If the problem were just the schema, it'd be easier, but there is "base" data in the database that is updated during development as well, such as meta-data in security and permissions tables.
This is the biggest barrier I see in moving toward continuous integration and one-step-builds. How do you solve it?
A follow-up question: how do you track database versions so you know which scripts to run to upgrade a given database instance? Is a version table like Lance mentions below the standard procedure?
Thanks for the reference to Tarantino. I'm not in a .NET environment, but I found their DatabaseChangeManagement wiki page to be very helpful. Especially this PowerPoint Presentation (.ppt)
I'm going to write a Python script that checks the names of *.sql scripts in a given directory against a table in the database and runs the ones that aren't there, in order, based on an integer that forms the first part of the filename. If it is a pretty simple solution, as I suspect it will be, then I'll post it here.
I've got a working script for this. It handles initializing the DB if it doesn't exist and running upgrade scripts as necessary. There are also switches for wiping an existing database and importing test data from a file. It's about 200 lines, so I won't post all of it (though I might put it on pastebin if there's interest); the core is sketched below.
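A minimal sketch of that core, for anyone who wants to roll their own - the schema_version table name, the NNN_description.sql naming, and the mysql-connector-python usage are illustrative assumptions, not the actual script:

    """Run numbered *.sql scripts that aren't recorded in the database yet."""
    import os
    import mysql.connector

    SCRIPT_DIR = "db"  # directory holding files like 001_create_users.sql

    conn = mysql.connector.connect(host="localhost", user="app",
                                   password="secret", database="app")
    cur = conn.cursor()

    # Tracking table: one row per script that has already been run.
    cur.execute("""CREATE TABLE IF NOT EXISTS schema_version (
                       filename VARCHAR(255) PRIMARY KEY,
                       applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)""")
    cur.execute("SELECT filename FROM schema_version")
    applied = set(row[0] for row in cur.fetchall())

    # Order by the integer that forms the first part of the filename.
    scripts = sorted((s for s in os.listdir(SCRIPT_DIR) if s.endswith(".sql")),
                     key=lambda s: int(s.split("_", 1)[0]))

    for name in scripts:
        if name in applied:
            continue
        with open(os.path.join(SCRIPT_DIR, name)) as f:
            sql = f.read()
        for _ in cur.execute(sql, multi=True):  # scripts may hold many statements
            pass
        cur.execute("INSERT INTO schema_version (filename) VALUES (%s)", (name,))
        conn.commit()
        print("applied %s" % name)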
There are a couple of good options. I wouldn't use the "restore a backup" strategy.
Script all your schema changes, and have your CI server run those scripts on the database. Have a version table to keep track of the current database version, and only execute the scripts if they are for a newer version.
Use a migration solution. These solutions vary by language, but for .NET I use Migrator.NET. This allows you to version your database and move up and down between versions. Your schema is specified in C# code.
Your developers need to write change scripts (schema and data changes) for each bug/feature they work on, not simply dump the entire database into source control. These scripts will upgrade the current production database to the new version in development.
Your build process can restore a copy of the production database into an appropriate environment and run all the scripts from source control on it, which will update the database to the current version. We do this on a daily basis to make sure all the scripts run correctly.
Have a look at how Ruby on Rails does this.
First, there are so-called migration files, which transform the database schema and data from version N to version N+1 (or, when downgrading, from version N+1 to N). The database has a table which records the current version.
Test databases are always wiped clean before unit tests and populated with fixed data from files.
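Rails writes those migrations in Ruby; the shape of a single migration file, sketched here in Python for illustration (the products table and the price-to-cents change are invented), is just an up/down pair:

    """Migration 5: one up/down pair in the Rails style (illustrative only)."""

    def up(cur):
        # Schema change plus the data change that belongs with it.
        cur.execute("ALTER TABLE products ADD COLUMN price_cents INT")
        cur.execute("UPDATE products SET price_cents = ROUND(price * 100)")

    def down(cur):
        # Reverse the change so version 5 can be rolled back to version 4.
        cur.execute("ALTER TABLE products DROP COLUMN price_cents")

A driver script then compares the version stored in that table with the target version and calls up() or down() for each step in between.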
The book Refactoring Databases: Evolutionary Database Design might give you some ideas on how to manage the database. A short version is also readable at http://martinfowler.com/articles/evodb.html
In one PHP+MySQL project I've had the database revision number stored in the database, and when the program connects to the database, it will first check the revision. If the program requires a different revision, it will open a page for upgrading the database. Each upgrade is specified in PHP code, which will change the database schema and migrate all existing data.
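The same pattern, sketched in Python rather than PHP (the db_revision table and the two upgrade steps are invented for the example):

    # On connect, compare the stored revision with what the code requires,
    # then apply the missing upgrade steps in order.
    UPGRADES = {
        1: lambda cur: cur.execute("ALTER TABLE users ADD COLUMN email VARCHAR(255)"),
        2: lambda cur: cur.execute("CREATE INDEX idx_users_email ON users (email)"),
    }
    REQUIRED_REVISION = 2

    def upgrade_if_needed(conn):
        cur = conn.cursor()
        cur.execute("SELECT rev FROM db_revision")
        (current,) = cur.fetchone()
        for rev in range(current + 1, REQUIRED_REVISION + 1):
            UPGRADES[rev](cur)  # each step migrates rev-1 -> rev
            cur.execute("UPDATE db_revision SET rev = %s", (rev,))
            conn.commit()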
You could also look at using a tool like SQL Compare to script the difference between various versions of a database, allowing you to quickly migrate between versions.
Name your databases as follows - dev_<<db>>, tst_<<db>>, stg_<<db>>, prd_<<db>> (obviously you should never hardcode DB names).
That way you'd be able to deploy even the different types of DBs on the same physical server (I do not recommend that, but you may have to if resources are tight).
Ensure you are able to move data between those automatically.
Separate the DB creation scripts from the population: it should always be possible to recreate the DB from scratch and populate it (from the old DB version or an external data source).
Do not hardcode connection strings in the code (not even in the config files); use connection-string templates in the config files and populate them dynamically. Each reconfiguration of the application layer which needs a recompile is BAD.
Do use database versioning and DB-object versioning - if you can afford it, use ready-made products; if not, develop something of your own.
Track each DDL change and save it into some history table (see the sketch after this list).
DAILY backups! Test how fast you would be able to restore something lost from a backup (use automatic restore scripts).
Even if your DEV database and PROD have exactly the same creation script, you will have problems with the data, so allow developers to create an exact copy of prod and play with it (I know I will receive minuses for this one, but a change in the mindset and the business process will cost you much less when shit hits the fan - so make the coders sign whatever legal papers it takes, but ensure this one).
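On the DDL-history point above: MySQL has no DDL triggers, so one workaround is a small job (run from cron) that snapshots SHOW CREATE TABLE output and records anything that changed. A sketch - the ddl_history table and the connection details are assumptions:

    """Log DDL changes by polling SHOW CREATE TABLE into a history table."""
    import mysql.connector

    conn = mysql.connector.connect(host="localhost", user="app",
                                   password="secret", database="app")
    cur = conn.cursor()
    cur.execute("""CREATE TABLE IF NOT EXISTS ddl_history (
                       table_name VARCHAR(64),
                       ddl MEDIUMTEXT,
                       seen_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)""")

    cur.execute("SHOW TABLES")
    for table in [t for (t,) in cur.fetchall() if t != "ddl_history"]:
        cur.execute("SHOW CREATE TABLE `%s`" % table)
        current_ddl = cur.fetchone()[1]
        cur.execute("""SELECT ddl FROM ddl_history WHERE table_name = %s
                       ORDER BY seen_at DESC LIMIT 1""", (table,))
        row = cur.fetchone()
        if row is None or row[0] != current_ddl:  # changed since last snapshot
            cur.execute("INSERT INTO ddl_history (table_name, ddl) VALUES (%s, %s)",
                        (table, current_ddl))
    conn.commit()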
This is something that I'm constantly unsatisfied with - our solution to this problem that is. For several years we maintained a separate change script for each release. This script would contain the deltas from the last production release. With each release of the application, the version number would increment, giving something like the following:
dbChanges_1.sql
dbChanges_2.sql
...
dbChanges_n.sql
This worked well enough until we started maintaining two lines of development: Trunk/Mainline for new development, and a maintenance branch for bug fixes, short term enhancements, etc. Inevitably, the need arose to make changes to the schema in the branch. At this point, we already had dbChanges_n+1.sql in the Trunk, so we ended up going with a scheme like the following:
dbChanges_n.1.sql
dbChanges_n.2.sql
...
dbChanges_n.m.sql
Again, this worked well enough, until one day we looked up and saw 42 delta scripts in the mainline and 10 in the branch. ARGH!
These days we simply maintain one delta script and let SVN version it - i.e. we overwrite the script with each release. And we shy away from making schema changes in branches.
So, I'm not satisfied with this either. I really like the concept of migrations from Rails. I've become quite fascinated with LiquiBase. It supports the concept of incremental database refactorings. It's worth a look and I'll be looking at it in detail soon. Anybody have experience with it? I'd be very curious to hear about your results.
We have a very similar setup to the OP.
Developers develop in VMs with private DBs.
[Developers will soon be committing into private branches]
Testing is run on different machines (actually in VMs hosted on a server)
[Will soon be run by a Hudson CI server]
Test by loading the reference dump into the DB.
Apply the developers' schema patches,
then apply the developers' data patches.
Then run unit and system tests.
Production is deployed to customers as installers.
What we do:
We take a schema dump of our sandbox DB,
then a SQL data dump.
We diff that against the previous baseline.
That pair of deltas is what upgrades n-1 to n.
We keep the dumps and deltas under configuration management.
So to install version N CLEAN, we run the dump into an empty DB (sketched at the end of this answer).
To patch, apply the intervening patches.
(Juha mentioned that Rails' idea of having a table recording the current DB version is a good one, and it should make installing updates less fraught.)
Deltas and dumps have to be reviewed before beta test.
I can't see any way around this as I've seen developers insert test accounts into the DB for themselves.
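For the dump-and-diff steps in the workflow above, a rough sketch using mysqldump's --no-data / --no-create-info split (the version labels, credentials, and file names are made up):

    """Produce the schema dump, data dump, and n-1 -> n deltas (sketch)."""
    import subprocess

    DB = "sandbox_db"
    N = 7  # current version label, invented for the example

    def dump(extra_args, outfile):
        with open(outfile, "w") as out:
            subprocess.check_call(["mysqldump", "-u", "app", "-psecret"]
                                  + extra_args + [DB], stdout=out)

    dump(["--no-data"], "schema_%d.sql" % N)       # schema only
    dump(["--no-create-info"], "data_%d.sql" % N)  # data only

    # The diffs against the previous baseline become the n-1 -> n patch pair.
    for kind in ("schema", "data"):
        with open("%s_%d.diff" % (kind, N), "w") as out:
            subprocess.call(["diff", "-u", "%s_%d.sql" % (kind, N - 1),
                             "%s_%d.sql" % (kind, N)], stdout=out)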
I'm afraid I'm in agreement with other posters. Developers need to script their changes.
In many cases a simple ALTER TABLE won't work; you need to modify existing data too - developers need to think about what migrations are required and make sure they're scripted correctly (of course you need to test this carefully at some point in the release cycle).
Moreover, if you have any sense, you'll get your developers to script rollbacks for their changes as well so they can be reverted if need be. This should be tested too, to ensure that their rollback not only executes without error, but leaves the DB in the same state it was in previously (this is not always possible or desirable, but is a good rule most of the time).
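One way to test that rollback discipline is a round-trip check - a sketch, assuming scripts come in up/down pairs and shelling out to mysqldump/mysql (the paths and credentials are invented):

    """Round trip: the schema after up + down should match the schema before."""
    import subprocess

    def schema_dump(db):
        # --no-data so only the schema is compared, not row contents.
        return subprocess.check_output(["mysqldump", "--no-data",
                                        "-u", "app", "-psecret", db])

    def run_script(db, path):
        with open(path) as f:
            subprocess.check_call(["mysql", "-u", "app", "-psecret", db], stdin=f)

    before = schema_dump("test_db")
    run_script("test_db", "changes/42_up.sql")
    run_script("test_db", "changes/42_down.sql")
    after = schema_dump("test_db")
    assert before == after, "rollback did not restore the original schema"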
How you hook that into a CI server, I don't know. Perhaps your CI server needs to keep a known database snapshot, which it reverts to each night and then applies all the changes since then. That's probably best; otherwise a broken migration script will break not just that night's build, but all subsequent ones.
Check out dbdeploy; there are Java and .NET tools already available, and you could follow their standards for the SQL file layouts and schema version table and write your Python version.
We are using the command-line mysql-diff: it outputs the difference between two database schemas (from a live DB or a script) as an ALTER script. mysql-diff is executed at application start, and if the schema has changed, it reports to the developer. So developers do not need to write ALTERs manually; schema updates happen semi-automatically.
If you are in the .NET environment then the solution is Tarantino (archived). It handles all of this (including which SQL scripts to install) in a NAnt build.
I've written a tool which (by hooking into Open DBDiff) compares database schemas and will suggest migration scripts to you. If you make a change that deletes or modifies data, it will throw an error, but provide a suggestion for the script (e.g. when a column is missing in the new schema, it will check if the column has been renamed and create "xx - generated script.sql.suggestion" containing a rename statement).
http://code.google.com/p/migrationscriptgenerator/ SQL Server only I'm afraid :( It's also pretty alpha, but it is VERY low friction (particularly if you combine it with Tarantino or http://code.google.com/p/simplescriptrunner/)
The way I use it is to have a SQL scripts project in your .sln. You also have a db_next database locally which you make your changes to (using Management Studio or NHibernate Schema Export or LinqToSql CreateDatabase or something). Then you execute migrationscriptgenerator with the _dev and _next DBs, which creates the SQL update scripts for migrating across.
For Oracle databases we use the oracle-ddl2svn tools.
This tool automates the following process:
for every DB schema, get the schema DDLs
put them under version control
changes between instances are resolved manually

Visibility of databases in CloudBees?

I'm looking into using CloudBees for some application prototyping. I am using free accounts right now; I am not paying for any subscriptions at the moment.
The first step for me is to create a MySQL database to host my application's data. I've done so (and it was pretty easy!). I also use Liquibase to manage the database (I've started this work using local H2 databases for the pre-prototyping), and I've been able to construct everything as expected.
As part of checking whether liquibase created the tables, I brought up the MySQL database in NetBeans. And, it did function well. But I can also see other schemas as well as the schema I just created. They're all innocently named (test, test_6hob). But, I can see the tables and view their data.
My question is around the visibility of the data that's in the CloudBees database. Is the database created for the free accounts viewable to other people connecting to the same machine? Does this change if I use a paid account? Or is it more the nature of how the database was created? I can see other schemas (and their data), but I have no idea if other people can see mine. Is there a permissions aspect I need to ensure I set? I'm fairly ignorant of the inner workings of MySQL.
While this is a prototype, were I to move into using CloudBees for production applications, I wouldn't want the data to be visible to anyone who happened to connect to the same database as my application. It's entirely possible that I'm missing something in this new cloud world. :)
Thanks for any info
All CloudBees MySQL databases are secured separately (although they will be on shared instances unless you have a dedicated server) - they are not readable by any other account by default.
However, it is possible for the database owner to grant access to users from other accounts on that same database server if you really wanted to - even though it makes very little sense to do so (and your special user configuration will be lost during a failover).
So this is what has happened for the test databases that you can see - the database owner has opened up security on those databases / tables.
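For example, opening a database up to another account takes an explicit GRANT from the owner - something like the following (the user and database names are invented); nothing like this happens by default:

    import mysql.connector

    # What an owner would have to run to expose their schema to another account.
    conn = mysql.connector.connect(host="mysql.example.com", user="owner",
                                   password="secret")
    cur = conn.cursor()
    cur.execute("GRANT SELECT ON test.* TO 'someuser'@'%'")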
This question is probably off topic, but I'll bite anyway. The database data is private to your account. The actual hardware/VMs may be shared, but the data/database is not.

How to log MySQL database structural changes

I'm working on a project which is using MySQL as the database. The application is hosted with many clients, and we often do upgrades for the current live systems.
There are some instances where the client has changed the database structure (adding new tables), causing some unexpected DB crashes.
I need to log all the structural changes which were done at that database, so we can find the correct root cause. We can't do it 100% correctly with a diff tool because it will not show the intermediate changes.
I found the http://www.liquibase.org/ tool, but it seems a little bit complex.
Is there any well-known technique or tool to track database structural changes only?
Well, from MySQL Studio you can generate every object's schema definition and compare it with your standard schema definition; this way you can compare two database schemas...
Generating scripts of both databases (one being the client's database and one the master-copy database) and then comparing them using a file-compare tool would be the best practice, in my opinion, because this way you can track which column was added, which column was deleted, which index was added, and so on, without downloading any tool (see the sketch at the end of this post).
Possible duplicate of Compare two MySQL databases?
Hope this helps.
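A sketch of that dump-and-compare approach (assuming the two schema-only dumps were already produced with mysqldump --no-data):

    """Diff two schema dumps to see which columns/indexes changed (sketch)."""
    import difflib
    import sys

    with open("master_schema.sql") as f:
        master = f.readlines()
    with open("client_schema.sql") as f:
        client = f.readlines()

    for line in difflib.unified_diff(master, client,
                                     "master_schema.sql", "client_schema.sql"):
        sys.stdout.write(line)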
If you have an application for your clients to manage these schema changes, you can use a mechanism at the application level. If you have a Python and Django-based solution, you could probably use South, which provides schema change tracking and rollbacks.

Compare database differences and get SQL output

I need the ability to compare two similar databases. One will be slightly newer than the other and have changes to the structure of the database as well as possibly the content within it.
So far I have tried using liquibase but it doesn't seem to be comparing properly.
I have also tried the MySQL Diff Perl module which works but doesn't consider content.
Main Question:
Does anyone know any solutions that will give back SQL for both structural and content differences and generate a SQL script?
A bit more info:
The intended use for this is when making updates and installing MODs to phpBB, so that the forum can be included in the build process along with the rest of our website, which has a four-tier process (local, development, staging, production).
When installing the phpBB updates and MODs I will make a dump of the current production database and lock the site so no new data can be added whilst I make changes. That way the databases shouldn't come out of sync.
When installing MODs and updates sometimes the database structure changes and also the data within tables, especially when adding things requiring extra permissions etc.
The solution I use therefore will be used to compare the local database with the upgraded changes to the production database, providing me with a script I can run on each tier in the build process, rather than manually installing the update/MOD on each.
You can use SQLyog's database synchronization tool to sync two databases, either one-way or two-way. By far this is the best data-comparison GUI tool for MySQL. And Schema Sync for schema comparisons between two databases.
Both tools can generate SQL scripts.
I've actually found a way to do it via Navicat for MySQL using the Tools > Structure Synchronization option.
This will give SQL statements for differences in structure between the two databases.
Then to do the data differences you can use Data Synchronization.
I've managed to copy out the SQL script for differences in structure. However, the data synchronization seems to be more of an internal Navicat thing. I'm sure there's a way that the queries could be extracted, though.
Please note I'm using a licensed version, so I'm not sure if it's available in the free-to-use one.

Are there generic options for version control within a database?

I have a small amount of experience using SVN on my development projects, and I have just as little experience with relational databases. I know the basic concepts like tables, and SQL statements, but I'm far from being an expert.
What I'd like to know is if there are any generic version control type systems like SVN, but that work with a database rather than files. I would like the same kind of features you get with SVN like the ability to create branches, create tags, and merge branches together. Rather than a revision number being associated to a version of a file repository it would be associated with a version of the database.
Are there any generic solutions available that can add this kind of functionality independent of the actual database schema? I'd be interested in solutions that work with MySQL or MS SQL Server.
I should also clarify that I'm trying to version control the data, not the schema. I would expect the schema to remain constant. So really it seems like I want a way to create a log of all the INSERT, UPDATE, and DELETE requests sent to the database between each version of the data. That way any version could be recreated by resending all the SQL statements that have been saved up to the desired version.
You can script all your DDL, stored procedures and such to regular text files.
Then you can simply use SVN for database versioning.
I've never found a solution that works as well as Subversion, but here's a few things I've done that have helped:
Make scripts that will create the schema and populate any initial data. Then make an update script for each change after that. It's a fairly manual process, but it works. There are extra things that help, like storing the current version number in a table in the DB and making sure that the scripts are idempotent (an example of an idempotent update script follows this answer).
Store the full development DB in Subversion. This doesn't usually work out too well for me if there is a lot of data or it is frequently changed. But in some projects it could work.
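As an example of the idempotency mentioned in the first point: MySQL's ALTER TABLE has no IF NOT EXISTS for columns, so an update script has to guard by hand. A sketch with invented names:

    """Idempotent update script: safe to run any number of times (sketch)."""
    import mysql.connector

    conn = mysql.connector.connect(host="localhost", user="app",
                                   password="secret", database="app")
    cur = conn.cursor()

    # Only add the column if it isn't there yet.
    cur.execute("""SELECT COUNT(*) FROM information_schema.COLUMNS
                   WHERE TABLE_SCHEMA = DATABASE()
                     AND TABLE_NAME = 'users' AND COLUMN_NAME = 'last_login'""")
    if cur.fetchone()[0] == 0:
        cur.execute("ALTER TABLE users ADD COLUMN last_login DATETIME NULL")
    conn.commit()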
I keep and maintain create scripts in my version control system.
There are two things I can think of:
http://www.liquibase.org/ - provides a way of generally managing database changes. Creates files that get committed into source control, and it helps manage changes across different development databases, etc.
http://www.viget.com/extend/backup-your-database-in-git/ - this describes a strategy for backing up a database into source control, but the same strategy can be used just on the schema. In this scheme, the database would be in a separate area from your main code. (This can be used with other source control systems too.)