I've had a hard time trying to find good examples of how to manage database schemas and data between development, test, and production servers.
Here's our setup. Each developer has a virtual machine running our app and the MySQL database. It is their personal sandbox to do whatever they want. Currently, developers will make a change to the SQL schema and do a dump of the database to a text file that they commit into SVN.
We want to deploy a continuous integration server that will always be running the latest committed code. If we do that now, it will reload the database from SVN for each build.
We have a test (virtual) server that runs "release candidates." Deploying to the test server is currently a very manual process, and usually involves me loading the latest SQL from SVN and tweaking it. Also, the data on the test server is inconsistent. You end up with whatever test data the last developer to commit had on his sandbox server.
Where everything breaks down is the deployment to production. Since we can't overwrite the live data with test data, this involves manually re-creating all the schema changes. If there are a large number of schema changes or conversion scripts to manipulate the data, this can get really hairy.
If the problem were just the schema, it would be easier, but there is "base" data in the database that is updated during development as well, such as metadata in the security and permissions tables.
This is the biggest barrier I see in moving toward continuous integration and one-step-builds. How do you solve it?
A follow-up question: how do you track database versions so you know which scripts to run to upgrade a given database instance? Is a version table like Lance mentions below the standard procedure?
Thanks for the reference to Tarantino. I'm not in a .NET environment, but I found their DataBaseChangeMangement wiki page to be very helpful, especially this PowerPoint presentation (.ppt).
I'm going to write a Python script that checks the names of *.sql scripts in a given directory against a table in the database and runs the ones that aren't there, in order, based on an integer that forms the first part of the filename. If it turns out to be a pretty simple solution, as I suspect it will, then I'll post it here.
I've got a working script for this. It handles initializing the DB if it doesn't exist and running upgrade scripts as necessary. There are also switches for wiping an existing database and importing test data from a file. It's about 200 lines, so I won't post it (though I might put it on pastebin if there's interest).
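For anyone who wants the shape of it, here is a minimal sketch of the idea rather than the actual 200-line script; the `schema_version` table name, the database name, and the reliance on the `mysql` command-line client are all illustrative assumptions:

```python
#!/usr/bin/env python
"""Minimal sketch: run numbered *.sql scripts that haven't been applied yet."""
import re
import subprocess
from pathlib import Path

DB = "myapp"                      # hypothetical database name
SCRIPT_DIR = Path("db/scripts")   # hypothetical script directory

def mysql(sql):
    """Run a single statement through the mysql CLI; return output lines."""
    result = subprocess.run(["mysql", "-N", "-e", sql, DB],
                            capture_output=True, text=True, check=True)
    return result.stdout.splitlines()

def main():
    # Bookkeeping table: one row per script that has already been applied.
    mysql("CREATE TABLE IF NOT EXISTS schema_version ("
          "filename VARCHAR(255) PRIMARY KEY, "
          "applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP)")
    applied = set(mysql("SELECT filename FROM schema_version"))

    # Scripts are named like 001_create_users.sql; order by the integer prefix.
    scripts = sorted((p for p in SCRIPT_DIR.glob("*.sql")
                      if re.match(r"\d+", p.name)),
                     key=lambda p: int(re.match(r"\d+", p.name).group()))
    for script in scripts:
        if script.name in applied:
            continue
        print("applying", script.name)
        subprocess.run(["mysql", DB], stdin=script.open(), check=True)
        mysql("INSERT INTO schema_version (filename) VALUES ('%s')" % script.name)

if __name__ == "__main__":
    main()
```

The wipe and test-data-import switches would just be extra options wrapped around the same loop.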
There are a couple of good options. I wouldn't use the "restore a backup" strategy.
Script all your schema changes, and have your CI server run those scripts on the database. Have a version table to keep track of the current database version, and only execute the scripts if they are for a newer version.
Use a migration solution. These solutions vary by language, but for .NET I use Migrator.NET. This allows you to version your database and move up and down between versions. Your schema is specified in C# code.
Your developers need to write change scripts (schema and data change) for each bug/feature they work on, not just simply dump the entire database into source control. These scripts will upgrade the current production database to the new version in development.
Your build process can restore a copy of the production database into an appropriate environment and run all the scripts from source control on it, which will update the database to the current version. We do this on a daily basis to make sure all the scripts run correctly.
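A minimal sketch of that daily verification run might look like the following; the database name, backup path, and use of the `mysql` command-line client are assumptions for illustration, not part of the answer above:

```python
# Sketch: restore the latest production backup into a scratch database,
# then replay every change script against it; any failure fails the build.
import subprocess
from pathlib import Path

SCRATCH_DB = "nightly_check"  # hypothetical scratch database

subprocess.run(["mysql", "-e",
                "DROP DATABASE IF EXISTS %s; CREATE DATABASE %s"
                % (SCRATCH_DB, SCRATCH_DB)], check=True)
with open("backups/prod_latest.sql") as dump:        # hypothetical path
    subprocess.run(["mysql", SCRATCH_DB], stdin=dump, check=True)
for script in sorted(Path("db/changes").glob("*.sql")):
    subprocess.run(["mysql", SCRATCH_DB], stdin=script.open(), check=True)
print("all change scripts ran cleanly against the production snapshot")
```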
Have a look at how Ruby on Rails does this.
First, there are so-called migration files, which transform the database schema and data from version N to version N+1 (or, when downgrading, from version N+1 to version N). The database has a table that records the current version.
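Rails migrations are written in Ruby; sketched in Python for consistency with the rest of this thread, a single migration file boils down to an up/down pair (the version number, table, and columns here are made up):

```python
# One migration file: a step up (N -> N+1) and a step down (N+1 -> N).
VERSION = 7  # illustrative version number

def up(cursor):
    """Upgrade the schema and data from version 6 to version 7."""
    cursor.execute("ALTER TABLE users ADD COLUMN email VARCHAR(255)")
    cursor.execute("UPDATE users SET email = CONCAT(login, '@example.com')")

def down(cursor):
    """Revert from version 7 back to version 6."""
    cursor.execute("ALTER TABLE users DROP COLUMN email")
```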
Test databases are always wiped clean before unit tests and populated with fixed data from files.
The book Refactoring Databases: Evolutionary Database Design might give you some ideas on how to manage the database. A short version is also readable at http://martinfowler.com/articles/evodb.html
In one PHP+MySQL project I had the database revision number stored in the database, and when the program connects to the database, it first checks the revision. If the program requires a different revision, it opens a page for upgrading the database. Each upgrade is specified in PHP code, which changes the database schema and migrates all existing data.
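The original is PHP, but the same check-at-connect pattern can be sketched in Python; the `db_revision` table and the upgrade functions are made-up illustrations:

```python
# Sketch: on connect, compare the stored revision with what the program
# requires and apply in-code upgrade steps until they match.
REQUIRED_REVISION = 3

def upgrade_1_to_2(cursor):
    cursor.execute("ALTER TABLE permissions ADD COLUMN scope VARCHAR(50)")

def upgrade_2_to_3(cursor):
    cursor.execute("UPDATE permissions SET scope = 'global' WHERE scope IS NULL")

UPGRADES = {1: upgrade_1_to_2, 2: upgrade_2_to_3}

def ensure_revision(conn):
    cursor = conn.cursor()
    cursor.execute("SELECT revision FROM db_revision")
    (current,) = cursor.fetchone()
    while current < REQUIRED_REVISION:
        UPGRADES[current](cursor)  # run the next upgrade step
        current += 1
        cursor.execute("UPDATE db_revision SET revision = %s", (current,))
    conn.commit()
```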
You could also look at using a tool like SQL Compare to script the differences between versions of a database, allowing you to quickly migrate between them.
Name your databases as follows: dev_<<db>>, tst_<<db>>, stg_<<db>>, prd_<<db>> (obviously you should never hardcode db names).
That way you could even deploy the different types of db on the same physical server (I do not recommend that, but you may have to if resources are tight).
Ensure you can move data between those automatically.
Separate the db creation scripts from the data population: it should always be possible to recreate the db from scratch and populate it (from the old db version or an external data source).
Do not hardcode connection strings in the code (not even in the config files); use connection string templates in the config files and populate them dynamically. Every reconfiguration of the application layer that needs a recompile is bad (see the sketch after this list).
Do use database versioning and db object versioning; if you can afford it, use ready-made products, and if not, develop something of your own.
Track each DDL change and save it into a history table.
Make DAILY backups! And test how fast you can restore something lost from a backup (use automatic restore scripts).
Even if your DEV database and PROD have exactly the same creation script, you will have problems with the data, so allow developers to create an exact copy of prod and play with it. (I know I will get downvoted for this one, but a change in mindset and business process will cost you much less when things hit the fan; make the coders sign whatever legal paperwork it takes, but ensure this one.)
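Regarding the connection-string templates mentioned in the list above, here is a minimal sketch of the idea; the file name, template keys, and environment variables are all illustrative:

```python
# Sketch: the config file holds a connection-string template, not a
# finished string; the environment decides which database it points at.
import os
from string import Template

# e.g. db.conf contains: host=$DB_HOST;db=${ENV}_myapp;user=$DB_USER
template = Template(open("db.conf").read())

conn_string = template.substitute(
    ENV=os.environ.get("APP_ENV", "dev"),  # dev / tst / stg / prd
    DB_HOST=os.environ["DB_HOST"],
    DB_USER=os.environ["DB_USER"],
)
```

No recompile is needed to repoint the application; only the environment changes.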
This is something that I'm constantly unsatisfied with (our solution to this problem, that is). For several years we maintained a separate change script for each release. This script would contain the deltas from the last production release. With each release of the application, the version number would increment, giving something like the following:
dbChanges_1.sql
dbChanges_2.sql
...
dbChanges_n.sql
This worked well enough until we started maintaining two lines of development: Trunk/Mainline for new development, and a maintenance branch for bug fixes, short term enhancements, etc. Inevitably, the need arose to make changes to the schema in the branch. At this point, we already had dbChanges_n+1.sql in the Trunk, so we ended up going with a scheme like the following:
dbChanges_n.1.sql
dbChanges_n.2.sql
...
dbChanges_n.m.sql
Again, this worked well enough, until one day we looked up and saw 42 delta scripts in the mainline and 10 in the branch. ARGH!
These days we simply maintain one delta script and let SVN version it - i.e. we overwrite the script with each release. And we shy away from making schema changes in branches.
So, I'm not satisfied with this either. I really like the concept of migrations from Rails. I've become quite fascinated with LiquiBase. It supports the concept of incremental database refactorings. It's worth a look and I'll be looking at it in detail soon. Anybody have experience with it? I'd be very curious to hear about your results.
We have a very similar setup to the OP.
Developers develop in VMs with private DBs.
[Developers will soon be committing into private branches]
Testing is run on different machines (actually in VMs hosted on a server).
[Will soon be run by Hudson CI server]
Test by loading the reference dump into the db,
apply the developers' schema patches,
then apply the developers' data patches,
then run unit and system tests.
Production is deployed to customers as installers.
What we do:
We take a schema dump of our sandbox DB.
Then an SQL data dump.
We diff those against the previous baseline (a sketch of this step follows below).
That pair of deltas upgrades version n-1 to n.
We configure the dumps and deltas.
So to install version N clean, we run the dump into an empty db.
To patch, apply the intervening patches.
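A sketch of the dump-and-diff step referenced above; the `mysqldump` flags are real, but the database name and file layout are illustrative:

```python
# Sketch: take a schema-only dump and diff it against the previous baseline
# to produce the delta for release n. --skip-dump-date keeps the dump
# timestamp comment from polluting the diff.
import difflib
import subprocess

def schema_dump(db):
    # --no-data dumps DDL only; use --no-create-info instead for data only.
    out = subprocess.run(["mysqldump", "--no-data", "--skip-dump-date", db],
                         capture_output=True, text=True, check=True)
    return out.stdout.splitlines(keepends=True)

current = schema_dump("myapp_sandbox")
baseline = open("baseline_n-1.sql").readlines()
delta = difflib.unified_diff(baseline, current,
                             "baseline_n-1.sql", "schema_n.sql")
open("delta_n.diff", "w").writelines(delta)
```

The resulting diff still has to be turned into, and reviewed as, proper ALTER/UPDATE statements; it is a starting point, not a finished patch.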
(Juha mentioned that Rails' idea of having a table recording the current DB version is a good one; it should make installing updates less fraught.)
Deltas and dumps have to be reviewed before beta test.
I can't see any way around this as I've seen developers insert test accounts into the DB for themselves.
I'm afraid I'm in agreement with other posters. Developers need to script their changes.
In many cases a simple ALTER TABLE won't work; you need to modify existing data too. Developers need to think about what migrations are required and make sure they're scripted correctly (of course you need to test this carefully at some point in the release cycle).
Moreover, if you have any sense, you'll get your developers to script rollbacks for their changes as well so they can be reverted if need be. This should be tested as well, to ensure that their rollback not only executes without error, but leaves the DB in the same state as it was in previously (this is not always possible or desirable, but is a good rule most of the time).
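One cheap way to test that an upgrade/rollback pair really is symmetrical is to round-trip it and compare schema dumps; a sketch, with made-up script names and the same `mysql`/`mysqldump` CLI assumptions as the earlier examples (note this only compares schema, so data state still needs its own checks):

```python
# Sketch: apply the upgrade, then the rollback, and assert that the schema
# dump is identical to what we started with.
import subprocess

def schema_dump(db):
    out = subprocess.run(["mysqldump", "--no-data", "--skip-dump-date", db],
                         capture_output=True, text=True, check=True)
    return out.stdout

def run_script(db, path):
    with open(path) as script:
        subprocess.run(["mysql", db], stdin=script, check=True)

before = schema_dump("myapp_test")
run_script("myapp_test", "042_add_audit_table.sql")
run_script("myapp_test", "042_add_audit_table.rollback.sql")
assert schema_dump("myapp_test") == before, "rollback did not restore the schema"
```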
How you hook that into a CI server, I don't know. Perhaps your CI server needs to keep a known database snapshot, which it reverts to each night before applying all the changes committed since then. That's probably best; otherwise a broken migration script will break not just that night's build, but all subsequent ones.
Check out dbdeploy; there are Java and .NET tools already available. You could follow their standards for the SQL file layout and schema version table and write your Python version.
We are using the command-line mysql-diff tool: it outputs the difference between two database schemas (from a live DB or a script) as an ALTER script. mysql-diff is executed at application start, and if the schema has changed, it reports to the developer. So developers do not need to write ALTERs manually; schema updates happen semi-automatically.
If you are in the .NET environment then the solution is Tarantino (archived). It handles all of this (including which SQL scripts to install) in a NAnt build.
I've written a tool which (by hooking into Open DBDiff) compares database schemas and suggests migration scripts to you. If you make a change that deletes or modifies data, it will throw an error, but provide a suggestion for the script (e.g. when a column is missing in the new schema, it will check whether the column has been renamed and create xx - generated script.sql.suggestion containing a rename statement).
http://code.google.com/p/migrationscriptgenerator/ SQL Server only I'm afraid :( It's also pretty alpha, but it is VERY low friction (particularly if you combine it with Tarantino or http://code.google.com/p/simplescriptrunner/)
The way I use it is to have a SQL scripts project in your .sln. You also have a db_next database locally in which you make your changes (using Management Studio or NHibernate Schema Export or LinqToSql CreateDatabase or something). Then you execute migrationscriptgenerator with the _dev and _next DBs, which creates the SQL update scripts for migrating across.
For Oracle databases we use the oracle-ddl2svn tools.
This tool automates the following process:
for every DB schema, get the schema DDLs;
put them under version control;
changes between instances are resolved manually.
I have an Access database that connects to a VB6 application, and this whole thing is shared between two computers via a network share, one running Windows 8 and the other Windows 7. There is no internet involved in any way, nor should there be; that is in fact a requirement.
Sorry in advance; I have tried researching on the net, but time is short and there is a lot of confusing material online.
I am creating a WPF app connected to a MySQL DB.
I have copied the Access file and imported the contents of the DB into MySQL.
Things are a real mess in the imported DB, so I am fixing it.
What I am confused about is how I am going to make it work there. Do I:
install MySQL there and do the whole process manually, repeating all the steps and changes I made?
make a document that contains the code/script for all the changes I have made and run the data through it? And is there even a way to apply that as a whole in a single go?
connect both databases together? I don't even know if this is possible.
Yes: in place of a simple "file share" of the Access file, you are now going to run some kind of SQL server system, in this case MySQL, but it could be PostgreSQL or any kind of "server" database.
That instance of "SQL server" thus has to be installed and set up, and you need to ensure that the "box" running that instance of MySQL also allows external connections (the given computer's default firewall settings often prevent this).
At that point, 2 or 10 different computers on that same network can simply connect to the SQL server. The code, of course, is going to be very similar. You almost certainly used the OleDb provider with Access. However, you can use the ODBC provider, or even the provider from MySQL. Using those providers means you change the connection object, data reader object, etc.; however, the "base" .NET types such as DataRow, DataTable, or DataSet can remain as before (so you only change the provider). If you have a lot of code based on OleDb, then you could well consider continuing to use that OleDb provider code in .NET, in which case you just change the connection strings to point to MySQL.
If you don't have a lot of code, then by all means adopt the MySQL provider for .NET. But as noted, the fewest changes would come from continuing to use an OleDb provider for MySQL, which suggests the least amount of code to be changed.
As for the MS Access data migration? Well, it's not clear what tools you are using and how you are doing that now. But once you transfer the data to the MySQL server (assuming you installed and set up MySQL to run on one computer), it is a simple matter to point the .NET connection(s) in your code to MySQL as opposed to Access. As a result, most if not all of your code logic for working with the tables can remain as before, but as noted you have to swap out the provider parts in .NET.
Now, if you're really lucky and the .NET code used the ODBC provider, then all you have to do is change your connection strings. Of course, some SQL syntax in your code may have to be tweaked: Oracle, MS SQL Server, PostgreSQL, and MySQL all have some features and syntax that differ, especially in regard to date/time calculations, DATEDIFF(), etc. But the general SQL you have in your .NET code should continue to run mostly unchanged against MySQL tables.
As for how to migrate the data? A really good tool for this is MS Access itself. Get MySQL up and running, then use MS Access to open that database, and add linked tables from MS Access to the MySQL tables.
At that point you can run append queries from Access to move the data to MySQL. How much work this is really depends on how many tables, and how many related tables, are in that database. The more complex the schema and the greater the number of related tables in Access, the more challenging it is to move that data up to MySQL.
Transferring an Excel sheet or a single small (or even big) table is a breeze (again, use MS Access and link to the tables on the SQL server). However, things can become messy if you have, say, 25 tables that are all related, many with cascade deletes and enforced parent-to-child relationships. The more tables, and especially the more related data tables, the more work such a data migration task will become.
I think MS Access is a really good tool here, since once you set up a connection to MySQL you can execute a TransferDatabase command in Access to send a table up to MySQL, and all the columns and data types for those columns will be created for you automatically. So not only can Access transfer the data; more valuably, it can create the target tables on MySQL for you, which will save you a large amount of time building and setting up the tables on MySQL.
Bit of a back story: I was using MySQL Server 5.x on an old server. I retired that server and migrated all data to a new server with MySQL 8.0.11 (now 8.0.12) installed, using legacy authentication to reduce issues.
This all seemed to work, and my programs opened and ran as expected. I've been editing them and publishing without any issues as well; however, in that time I have not had any reason to change any of the data sources.
I went to change a data source today, though, and cannot get it to work for the life of me.
If I try and make any changes I get the error "Configure TableAdapter tbl_users failed. Specified cast is not valid.". Obviously the table name varies and this happens regardless of which table (even trying to add a new table that I've just created).
It seems to work, but on closer inspection the delete and update commands are not created, which means that if I try to run the application I just get errors.
I've currently got:
Visual Studio 2015
MySQL Connector Net 6.9.8
MySQL for Visual Studio 1.2.7
Thanks in advance for any help/ideas.
@Tyler: I suggest trying to update MySQL for Visual Studio and the MySQL connector to the newest versions. Have you tried that?
I think this issue is somehow related to the one described here:
https://bugs.mysql.com/bug.php?id=31338
This bug dates back to version 5.1 of the connector and VS 2005; it should have been fixed by now, but the symptoms are very similar.
Have you perhaps found another solution?
Update:
I slept on the problem and found a workaround the next day. It does not really solve the issue, but it made it possible to continue with the project.
I went to the 'properties' window of a table adapter and manually edited the select and update commands there. This caused the table adapter to reflect those changes, and I could use the new table field in my program. For me, this is good enough.
I have been working with phpMyAdmin for quite some time, and recently I came across JetBrains PhpStorm and IntelliJ, both of which I really liked.
Now I have found the database environment DataGrip.
I wanted to know, objectively, what the advantages of DataGrip over phpMyAdmin are, and vice versa:
What does DataGrip give me that phpMyAdmin lacks?
What does phpMyAdmin give me that DataGrip lacks?
I have been using PhpStorm and DataGrip since February 2017. Before that I used Dreamweaver and phpMyAdmin.
The only advantage phpMyAdmin had over DataGrip, from my perspective, was searching the whole database. But since 2019 DataGrip has had a "full-text search" which does exactly the same.
Furthermore:
DataGrip is fully integrated into your IDE (PhpStorm, PyCharm, IntelliJ, etc.). You don't need to leave it to run any SQL queries.
Within the SQL console you have access to "live templates", which let you insert huge code snippets, impossible to remember, by typing a few letters of the live template's name.
SQL consoles are saved automatically (numbered consecutively), and you can save them as SQL files to any directory right from the console via Ctrl/Cmd+S.
You also have access to the IDE's huge clipboard history with (in my case) 100 previously copied text pieces, each of which can be a whole (SQL) document.
It is so easy to modify a table without writing any queries (table name, column names, foreign keys, indexes, column data types, etc.).
Tables and search results are super easy to edit and update, as if you were editing an Excel sheet.
You can set up as many databases as you like on any project and access them easily.
You can set up and access any remote database via an SSH tunnel.
You can set up any type of DBMS.
DataGrip checks your SQL query syntax live, before you even run it.
My IDE setup also lets me test query time on two identical sites running on different server setups (one on nginx/MySQL and the other on OpenLiteSpeed/MariaDB).
And all of that you get for only a couple of bucks! I now pay only 80€ annually for PhpStorm. I often pay much more for a single-site license of some shoddy WordPress plugin, yet with PhpStorm you get a really, really high-quality software product. Seriously, probably the only company I would love to work at as an employee (having been a freelancer throughout my whole dev "career") is JetBrains. It seems as if they can read my thoughts :D Sure, there are a few minor issues, but any time they bring out a new version I am as excited as a child.
No, I'm not paid by JetBrains :D And no, I don't hype them because I'm Russian. At the time I fell in love with them I thought it was a Czech company with a bunch of Russian devs (nothing unusual in Europe), and Czechs in general don't like Russians. So I loved them even though I thought they wouldn't love me :D Only a year after switching from Dreamweaver to PhpStorm did I find out it is a fully Russian company.
The only thing I hate in DataGrip is that the SQL console output shares, for some reason, a tab/window with Docker (dafuq?), and it is a huge pain to navigate between multiple query outputs/results (as in the example above where I compare the performance of two servers).
Update:
The only flaw of DataGrip from my perspective (the pain of switching between console outputs) is gone now as well! :D
I've found a setting through which you can simply open a "services" tree (Command + Shift + T) which lists all the active/latest "services". I still don't understand why the DataGrip console output doesn't have a dedicated window, but at least I can now navigate easily between the different consoles' outputs.
What does DataGrip give me that phpMyAdmin lacks?
DataGrip provides fast code completion, based on the syntax — it can even complete your JOIN clause based on foreign keys.
It also has a data editor, so you can edit several cells at once, or edit many rows locally and then submit them.
Also, you can navigate inside the grid by foreign keys.
Multi-cursor in the editor can help you edit a bunch of statements.
What does phpMyAdmin give me that DataGrip lacks?
phpMyAdmin can export to PDF, ISO/IEC 26300 (OpenDocument Text and Spreadsheet), Word, and LaTeX.
phpMyAdmin has more administration features; DataGrip is not focused on administration at all.
It also supports working with user accounts and privileges.
Background: I have just completed a move of approximately 50 classic ASP sites from an IIS6/Server 2003 and SQL Server 2000 environment to a new virtual environment of two machines behind an nginx load balancer. Each MS machine is running IIS 7.5 and SQL Server 2008 R2. They currently each have 6GB & 2 vCPUs. The databases are set up in a mirroring configuration (currently without a witness).
During testing all sites appeared to function correctly.
Once live traffic started to hit the sites, it became apparent quite quickly that the initial resource allocation (2GB & 1 vCPU) was way too low, and it was quickly increased. The main problem has come from an intermittent ASP error occurring on approximately 10 sites (probably including the busiest) on the servers. They will produce a 500 response from an ASP error of
Provider error '8002000a' Out of present range.
All research has pointed to causes such as numbers too large to fit into an integer variable, and some people have mentioned a correlation with the newer implementation of RAND and NEWID() in SQL Server 2008 compared to 2000. The stored procedures that appear to cause the error are relatively simple, with some as simple as accepting a single VARCHAR parameter (well within the limits) and doing a single-column select on a table. Most do not even involve INTs at all, and when they do, the values are well within range.
The error can appear on one machine for a given amount of time while during this same time the other server will not necessarily have the error, it sometimes will though. After a while the error will stop occurring, this doesn't seem to correlate with excessively overloaded system resources either.
ASP to database is done via a DSN using SQL Server Client 10 drivers. The code is using the ADODB connection and command objects. This code has been working happily for 6+ years on the previous servers. The databases are set to compatibility mode 80 (SQL Server 2000).
Can anyone shed any light on where I should be looking to try and solve this please? If there is any other information I can share, specific code snippets etc please just let me know.
Update:
I thought the UPDATEUSAGE answer below had got it, but unfortunately the error reared up again a little later. After some thinking I've had the following thoughts: there are two instances of IIS, independent of each other; both talk to a single database (whether it is local at the time or not), and both execute identical synced code, code that has been working with the same syntax and valid variables for a long time. As the ASP execution through IIS is the only layer in this equation that is not a single point, as it were, this is where I've headed. When the problem reoccurred, I restarted IIS on the machine that was showing the error at that point (often it occurs on only one of the two servers). The restart of IIS appeared to cure the problem. It then happened on the other server with a different site; again, restarting IIS appeared to sort the issue.
Further reading has now led me to the "managed pipeline" modes of the app pools. They are currently set to "Integrated". I've done some reading and I'm wondering if they should be set to "Classic" to emulate IIS6. Does anyone have any more thoughts on this?
Many thanks
Eric
Did you:
(1) Update usage counters: In earlier versions of SQL Server, the values for the table and index row counts and page counts can become incorrect. To correct any invalid row or page counts, run DBCC UPDATEUSAGE on all databases following the upgrade.
(2) Rebuild all Indexes
Upgrading from SQL Server 2000 to 2008
I had the same problem and tracked it down to a field definition in my database that I had defined as a long integer. The value I had in there was something like 53435534126262; I immediately changed it to a text field and the problem disappeared.
try that??
I thought it might be useful to post my findings and solution to this problem, as I found nowhere on the web that mentioned the same situation I had.
I went through a number of steps that each seemed to reduce the frequency of the errors but not eliminate them. Firstly I changed the database authentication method to SQL instead of Windows based. At first I changed all the sites to use the same login but later on I changed them to all use a unique login.
I updated the SQL Server with service pack 2 and cumulative update pack 3.
As mentioned, the above steps reduced the frequency of the errors but didn't stop them. I started looking through the class that all the sites use to manage their database connections and their use of stored procedures. I came across the line adocommand.parameters.refresh. I read up on what this actually does: when called, it makes a call to the database to retrieve the parameters of a given stored procedure so that they can be accessed as an object in ASP, rather than the parameters having to be given in a particular order and having their types assigned manually. On the Microsoft page that details this method there is a little footnote that says
Parameters.refresh will fail in some situations or return information that is not entirely correct. Parameters.refresh is particularly vulnerable when used on ASP pages.
This was all it gave and I couldn't find any other details about this. I increased the logging on my sites to, on error, output what parameters.refresh had returned. I caught it in one instance returning the two variables from the stored procedure, with the correct names, but not with the correct variable types. They should have been a VARCHAR and an INT but they came back as both being CURRENCY. Obviously this then errors when you try and assign a string to a CURRENCY. I only managed to catch this one instance of an error before I fixed the problem.
The only way I found that seemed to fix the problem was to change from using an ODBC based driver, both DSN or DSNless, and use the SQL Native Client OLE DB driver with the "PROVIDER" keyword. This had the added benefit of appearing to enable connection pooling when it previously didn't appear to have been working.
One side effect of changing the driver is that the stored procedures and ASP became susceptible to intermediate results being returned from a stored procedure if it contained multiple statements and didn't have SET NOCOUNT ON explicitly set at the top. Rather than trying to update 1000+ stored procedures, I found that the NOCOUNT flag can be set at the database instance level for all databases, which solved this problem.
I hope this helps someone, as it was an incredibly frustrating 3 weeks that I spent tracking down this problem. Feel free to ask any further questions and I'll help if I can.
Thanks
Eric