Tools and Methods - mysql

What tools and procedures would you recommend or use yourself to help streamline the following scenario? (I know it's a long one, but any help is appreciated.)
I work in a team that develops an e-commerce app at our company. It's a reasonably standard LAMP application that we have been developing on and off for about three years. We develop the application on a testing domain, where we add new features and fix bugs. Our bug tracking and feature development are all managed within a hosted Subversion solution (unfuddle.com). As bugs are reported we make the fixes on the testing domain and commit the changes to svn once we are happy the bug has been fixed. We follow the same procedure for new features.
It is worth pointing out the general architecture of our system and application across our servers. Each time a new feature is developed we roll the update out to all sites using our application (always on servers we control). Each site using our system shares exactly the same files for about 95% of the codebase. Each site has a couple of folders containing files bespoke to that site (CSS files, images, etc.); beyond that, the differences between sites are defined by configuration settings in each site's database.
This brings us to the deployment itself. When we are ready to roll out an update we run a command on the server hosting the testing site. This performs a copy (cp -fru /testsite/ /othersite/) and goes through each vhost, force-updating files based on modification date. Each additional server we host on has a vhost that we rsync the production codebase to, and we then repeat the copy procedure for all sites on that server. During this process we move out the files we don't want overwritten and move them back when the copy has completed. Our rollout script also performs a number of other functions, such as applying SQL commands to alter each database, adding fields, new tables, etc.
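To illustrate, the rollout amounts to something like the following sketch (the paths, site names and SQL file name are illustrative, not our actual script):

    #!/bin/bash
    # Illustrative sketch of the rollout described above; paths and names are made up.
    TESTSITE=/var/www/testsite
    SITES="site-a site-b site-c"              # vhosts on this server

    for SITE in $SITES; do
        TARGET=/var/www/$SITE
        mv "$TARGET/css" "/tmp/$SITE-css"     # move bespoke files out of the way
        cp -fru "$TESTSITE/." "$TARGET/"      # force-update based on modification date
        rm -rf "$TARGET/css"
        mv "/tmp/$SITE-css" "$TARGET/css"     # put the bespoke files back
        mysql "${SITE}_db" < /var/deploy/alterations.sql   # apply schema changes
    done

    # push the updated codebase to the vhost on each additional server
    rsync -a /var/www/testsite/ deploy@web2.example.com:/var/www/production/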
We have become increasingly concerned that our process is not stable or fault-tolerant enough, and that it is also a bit of a brute-force method. We're also aware we are not making the best use of Subversion: because we don't use branches or tags, working on a new feature can prevent us from rolling out an important bug fix. It also seems wrong that we have so much replication of files across our servers, and we're not able to easily roll back what we have just rolled out. We do perform a diff before each rollout so we have a list of the files that will be changed, but rolling back would still be problematic. On the database side I've started looking into dbdeploy as a potential solution. What we really want, though, is some general guidance on how we can improve our file management and deployment. Ideally we want file management to be more closely linked to our repository, so that a rollout or rollback is driven by svn, for example by using the export command to make sure the site files match the repo files. It would also be good if the solution reduced the file replication across our servers.
Ignoring our current methods, it would be really good to hear how other people approach the same problem.
To summarise:
What is the best way to keep files across multiple servers in sync with svn?
How should we prevent file replication? Symlinks, or something else?
How should we structure our repo so we can develop new features and fix old ones?
How should we trigger rollouts/rollbacks?
Thanks in advance.

For rollback and testing out new features, the standard subversion concepts of branches and tags should be sufficient:
always create a tag before rollout, and roll out that tag; rollback then means returning to the previous tag (a minimal sketch follows this list).
develop new features in branches and merge to the trunk when completed; alternatively, develop new features in trunk, and have a maintenance branch that receives only bug fixes.
keep the per-site files in separate directories in subversion, and use a configuration file on each site, or a symbolic link, to have sites refer to their specific files.
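A minimal sketch of the tag-and-export approach from the first point; the repository URL, tag names and docroot here are hypothetical:

    # cut a tag from trunk for this release
    svn copy https://example.unfuddle.com/svn/app/trunk \
             https://example.unfuddle.com/svn/app/tags/release-42 \
             -m "Tag release 42"

    # deploy: export the tag (no .svn metadata) over the production docroot
    svn export --force https://example.unfuddle.com/svn/app/tags/release-42 /var/www/production

    # rollback: export the previous tag instead
    svn export --force https://example.unfuddle.com/svn/app/tags/release-41 /var/www/production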
To reduce file duplication, I recommend using NFS (in particular when all sites are virtual machines on the same host: make the host the NFS server and the sites NFS clients; alternatively, make a dedicated VM the NFS server). To deploy an update, install the new files only on the NFS server; the clients will pick up the changes automatically.
If you need a multi-step update (e.g. first update the database on each client, then update the code), you should still use NFS, but add symlinks on top. Check out the new code into a separate directory on the NFS server, then go to each VM, update its database, and change the symlink in that VM to point to the new code. When done, remove the old code on the NFS server.
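A sketch of the symlink switch (directory names are hypothetical): the new release sits alongside the old one on the NFS server, and the cut-over or rollback on each VM is a single symlink change.

    # on the NFS server: put the new release next to the current one
    svn export https://example.unfuddle.com/svn/app/tags/release-42 /exports/app/releases/42

    # on each VM, after its database has been migrated, switch the docroot symlink
    ln -sfn /mnt/app/releases/42 /var/www/current    # Apache vhost points at /var/www/current

    # rollback is the same operation pointing back at the previous release
    # ln -sfn /mnt/app/releases/41 /var/www/current

    # once every VM has been switched, the old release can be removed on the NFS server
    rm -rf /exports/app/releases/41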

You may want to look at this article, which covers deployment of PHP apps:
http://blog.digitalstruct.com/2009/10/07/deployments-php-applications/
Specifically it mentions a few tools which might help:
Phing
Ant
Liquibase
DbDeploy
I have also heard a few people mention using Capistrano, so you might want to look at that too.
EDIT:
From looking at this poll (http://twtpoll.com/3zwfox) it seems that SVN export is a common method in the community for deploying PHP apps. The poll seems to have been used in this SlideShare presentation: http://www.slideshare.net/ccornutt/taming-the-deployment-beast

Related

How to get the difference between two MySQL dumps and update the delta?

Is there a way to keep two databases in sync? I have a client who's running WordPress with MySQL. Is there a way to take a copy of the database in its current state, use it for a development server, and then, when the dev changes are done, push it back to the live site?
The client might make changes to the live site while I'm working on the dev version, and I'm wondering if there will be any merge conflicts.
If I import the updated database via phpMyAdmin, will it apply only the newest changes, or overwrite everything?
Here's a quick reference on MySQL Replication by Mark Baker, or you can use MySQL Workbench Synchronization.
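If full replication is more than you need, the simplest command-line way to take a one-way copy of the live database (not the replication or Workbench approach above, just a mysqldump; hostnames and credentials are made up) looks like this:

    # pull a snapshot of the live WordPress database down to the dev server
    mysqldump -h live.example.com -u deploy -p --single-transaction wordpress > wordpress-live.sql
    # the dump drops and recreates each table on import, so this replaces the dev copy
    # rather than merging changes into it
    mysql -h dev.example.com -u deploy -p wordpress < wordpress-live.sql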
So I finally found a solution to my problem. Since this was an issue for WordPress I found two plugins that worked really well.
Free one: Database Sync
Very simple and has an easy push/pull interface.
Paid Plugin $40-200: WP Migrate DB Pro
Much more polished and has an option to select specific tables you want to sync.
There's an answer to the duplication problem here. However, that's only the start of your difficulties. If two people are making changes independently to two copies of one database, merging the two will inevitably cause nightmares. In short, yes there will be merge conflicts. Exactly what, and what you do about it, will depend on the nature of the changes each of you have made. Good luck!
Other, more modern paid solutions to the problem (this post is quite old) are deevop and mergebot.
Mergebot is a plugin/SaaS that helps with complicated merges between development and production databases, specifically for WordPress.
deevop is a more comprehensive solution: it provides the development environment and also has many options for complex data synchronisation between phases (excluding tables, etc.), not only for WordPress but for other platforms too.
You can even combine the two, using deevop as the deployment manager (one-click deploy to/from production) and mergebot for the complex database merges.

MySQL version control to complement Git?

I work with a small web team that is currently in the process of getting GIT integrated into our development process. We develop locally, have a central bare repository and then pull changes down to separate test and production servers. This is working great for our files but we are hitting roadblocks when it comes to syncing MySQL databases.
We have a lot of sites built with Wordpress and the issues are more prominent here:
WordPress inserts the domain name into the DB. Right now we get around this by doing a find and replace whenever we move a site from local to testing and then to production. It would be nice if we didn't have to do this, though (a sketch of scripting this step follows the next point).
The production server site DBs are constantly changing (comments, etc.) and the testing server and our local servers are not in sync. This makes it difficult to send changes (after adding a plugin, page, etc.) to the production DB from the test server.
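As a side note on the first point, the find and replace itself can at least be scripted; a minimal sketch using WP-CLI's search-replace command, assuming WP-CLI is installed (the URLs are hypothetical):

    # preview what would change
    wp search-replace 'http://local.example.test' 'http://staging.example.com' --dry-run
    # run it for real, leaving the guid column untouched as WordPress convention suggests
    wp search-replace 'http://local.example.test' 'http://staging.example.com' --skip-columns=guid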
It would be great if we could find something that could integrate with GIT (maybe through githooks) that would allow us to sync the databases across different development and production servers. Moreover, it would be a bonus if there were a way to track changes within the database itself -- allowing us to merge changes (development edits and production changes) when pushing to production.
And finally, it would be even better if this could all work across multiple domains (local, testing and production); in other words, it would have to find and replace the URLs in the sql on each push/pull.
Thanks a bunch for any insight.
You might want to check out http://www.liquibase.org/. It's a database refactoring tool for creating and modifying database schemas, generating rollbacks, and producing SQL. I was introduced to it a while back and can't remember it that well, but it seems to be made for what you need, and from what I remember it kicks ass.

Collaborating on websites with relational databases and a CMS

What processes do you put in place when collaborating in a small team on websites with databases?
We have no problems working on site files as they are under revision control, so any number of our developers can work from any location on this aspect of a website.
But, when database changes need to be made (either directly as part of the development or implicitly by making content changes in a CMS), obviously it is difficult for the different developers to then merge these database changes.
Our approaches thus far have been limited to the following:
Putting a content freeze on the production website and having all developers work on the same copy of the production database
Delegating tasks that will involve database changes to one developer and then asking other developers to import a copy of that database once changes have been made; in the meantime other developers work only on site files under revision control
Allowing developers to make changes to their own copy of the database for the sake of their own development, but then manually making these changes on all other copies of the database (e.g. providing other developers with an SQL import script pertaining to the database changes they have made)
I'd be interested to know if you have any better suggestions.
We work mainly with MySQL databases and at present do not keep track of revisions to these databases. The problems discussed above pertain mainly to Drupal and Wordpress sites where a good deal of the 'development' is carried out in conjunction with changes made to the database in the CMS.
You put all your database changes in SQL scripts. Put some kind of sequence number into the filename of each script so you know the order they must be run in. Then check those scripts into your source control system. Now you have reproducible steps that you can apply to test and production databases.
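A minimal sketch of applying such numbered scripts in order (directory layout, credentials and database name are hypothetical; tools like dbdeploy additionally record which scripts have already been run):

    # migrations/001-create-orders.sql, migrations/002-add-status-index.sql, ...
    for script in $(ls migrations/*.sql | sort); do
        echo "applying $script"
        mysql -u deploy -p"$DB_PASS" app_db < "$script"
    done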
While you could put all your DDL into the VC, this can get very messy very quickly if you try to manage lots and lots of ALTER statements.
Forcing all developers to use the same source database is not a very efficient approach either.
The solution I used was to maintain a file for each database entity specifying how to create the entity (primarily so the changes could be viewed using a diff utility), and then to manually create ALTER statements by comparing the release version with the current version. Yes, it is rather labour-intensive, but it's the only way I've found to solve the problem.
I had a plan to automate the generation of the ALTER statements; it should be relatively straightforward, and indeed a quick Google found this article and this one. I never got round to implementing one myself, since the effort involved wasn't justified by the frequency of schema changes on the projects I was working on.
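A low-tech way to see what has drifted between two schemas, for anyone taking the manual route (this is only a structure diff, not the automated ALTER generation those articles describe; the hostnames are hypothetical):

    # dump structure only (no rows) from each environment and compare
    mysqldump --no-data --skip-comments -h dev.example.com  -u deploy -p app_db > dev-schema.sql
    mysqldump --no-data --skip-comments -h prod.example.com -u deploy -p app_db > prod-schema.sql
    diff -u prod-schema.sql dev-schema.sql    # the differences are the ALTERs you need to write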
Where I work, every developer (actually, every development virtual machine) has its own database (or rather, its own schema on a shared Oracle instance). Our working process is based around complete rebuilds. We don't have any ability to modify an existing database; we only ever have the nuclear option of blowing away the whole schema and building from scratch.
We have a little 'drop everything' script, which uses queries on system tables to identify every object in the schema, constructs a pile of SQL to drop them, and runs it. Then we have a stack of DDL files full of CREATE TABLE statements, and a stack of XML files containing the initial data for the system, which are loaded by a loading tool. All of this is checked into source control. When a developer does an update from source control and sees incoming database changes (DDL or data), they run the master build script, which runs them in order to create a fresh database from scratch.
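The poster's system is Oracle with an XML loading tool, but the shape of such a master build script is roughly this (a sketch using hypothetical MySQL equivalents, not the actual tooling described):

    #!/bin/bash
    # rebuild a developer schema from scratch: drop everything, run DDL, then load data
    set -e
    DB="dev_$(whoami)"

    mysql -u deploy -p"$DB_PASS" -e "DROP DATABASE IF EXISTS $DB; CREATE DATABASE $DB;"

    for f in $(ls ddl/*.sql | sort); do        # CREATE TABLE statements, in order
        mysql -u deploy -p"$DB_PASS" "$DB" < "$f"
    done
    for f in $(ls data/*.sql | sort); do       # initial/reference data
        mysql -u deploy -p"$DB_PASS" "$DB" < "$f"
    done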
The good thing is that this makes life simple. We never need to worry about diffs, deltas, ALTER TABLE, reversibility, etc, just straightforward DDL and data. We never have to worry about preserving the state of the database, or keeping it clean - you can get back to a clean state at the push of a button. Another important feature of this is that it makes it trivial to set up a new platform - and that means that when we add more development machines, or need to build an acceptance system or whatever, it's easy. I've seen projects fail because they couldn't build new instances from their muddled databases.
The main bad thing is that it takes some time. In our case, due to the particularly depressing details of our system, it takes a painfully long time, but I think a team that was really on top of its tools could do a complete rebuild like this in 10 minutes, or half an hour with a lot of data. That's short enough to do a few times during a working day without killing yourself.
The problem is what you do about data. There are two sides to this: data generated during development, and live data.
Data generated during development is actually pretty easy. People who don't work our way are presumably in the habit of creating that data directly in the database, and so see a problem in that it will be lost when rebuilding. The solution is simple: you don't create the data in the database, you create it in the loader scripts (XML in our case, but you could use SQL DML, or CSV with your database's import tool, or whatever). Think of the loader scripts as being source code, and the database as object code: the scripts are the definitive form, and are what you edit by hand; the database is what's made from them.
Live data is tougher. My company hasn't developed a single process which works in all cases; I don't know if we just haven't found the magic bullet yet, or if there isn't one. One of our projects is taking the approach that live is different from development, and that there are no complete rebuilds; rather, they have developed a set of practices for identifying the deltas when making a new release and applying them manually. They release every few weeks, so it's only a couple of days' work for a couple of people that often. Not a lot.
The project I'm on hasn't gone live yet, but it is replacing an existing live system, so we have a similar problem. Our approach is based on migration: rather than trying to use the existing database, we are migrating all the data from it into our system. We have written a rather sprawling tool to do this, which runs queries against the existing database (a copy of it, not the live version!) and writes the data out as loader scripts. These then feed into the build process just like any others. The migration is scripted and runs every night as part of our daily build. In this case the effort needed to write the tool was necessary anyway, because our database is very different in structure from the old one; the ability to do repeatable migrations at the push of a button came for free.
When we go live, one of our options will be to adapt this process to migrate from old versions of our database to new ones. We'll have to write completely new queries, but they should be very easy, because the source database is our own, and the mapping from it to the loader scripts is, as you would imagine, straightforward, even as the new version of the system drifts away from the live version. This would let us keep working in the complete rebuild paradigm - we still wouldn't have to worry about ALTER TABLE or keeping our databases clean, even when we're doing maintenance. I have no idea what the operations team will think of this idea, though!
You can use the replication module of the database engine, if it has one. One server will be the master; changes are to be made on it. The developers' copies will be slaves, and any change on the master is duplicated on the slaves. It's one-way replication, and it can be a bit tricky to put into place, as any changes made on the slaves will be erased. It also means that each developer needs two copies of the database: one as the slave and another as the "development" database. There are also tools for cross-database replication, so that any copy can act as the master. Both solutions can lead to disasters (replication errors).
The only solution I see fit is to have a single database for all developers and save it several times a day on a rotating history. That won't save you from conflicts, but you will be able to restore a previous version when something goes wrong (and it always does...).
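A sketch of the rotating-history part, run from cron (paths, retention and credentials are hypothetical):

    # dump the shared development database with a timestamp and keep the newest 20 dumps
    STAMP=$(date +%Y%m%d-%H%M)
    mysqldump -u deploy -p"$DB_PASS" --single-transaction dev_db | gzip > "/backups/dev_db-$STAMP.sql.gz"
    ls -1t /backups/dev_db-*.sql.gz | tail -n +21 | xargs -r rm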
Where I work we are using DotNetNuke, which poses the same problems: once released, the production site has data going into the database as well as files being added to the file system by some modules and into the DNN file system.
We are versioning the site file system with svn which for the most part works ok. However, the database is a different matter. The best method we have come across so far is to use RedGate tools to synchronise the staging database with the production database. RedGate tools are very good and well worth the money.
Basically we all develop locally with a local copy of the database and site. If the changes are major we branch. Then we commit locally and do a RedGate merge to put our DB changes on the shared dev server.
We use a shared dev server so others can do the testing. Once complete we then update the site on staging with svn and then merge the database changes from the development server to the staging server.
Then to go live we do the same from staging to prod.
This method works but is prone to error and is very time consuming when small changes need to be made. The prod DB is always backed up so we can roll back easily if a delivery goes wrong.
One major headache we have is that DotNetNuke uses identity columns in many tables, and if you have data going into the same tables on development and production (such as tabs, permissions and module instances) you have a nightmare syncing them. Ideally you want to find or build a CMS that uses GUIDs or something similar in the database so you can easily sync tables that are in use concurrently.
We'd love to find a better method, as we have a lot of trouble with branching and merging when projects run concurrently.
Gus

Mercurial, do I need a server for team-work or can I just create a repository on a network share?

If I want to set up a smallish Mercurial repository for some internal work among a few developers, can I just navigate to a network share and create a repository there, and then just clone that down locally? Or do I need to set up a server (I know, it's easy to do).
This is Windows by the way.
Specifically, I'm wondering if there will be concurrency issues, like abandoned transactions, etc. if multiple users work push/pull simultaneously.
So long as folks are interacting with the repo using only 'clone', 'push', and 'pull', you're in fine shape. What you can't do is have multiple people committing directly from a shared working directory. However, push, pull, and clone are safe to use to a shared folder from a user's personal repository. All changes end up effectively atomic, and no aborted work should cause anyone any problems.
When creating that clone, consider using clone -U so it's created without a working directory and folks aren't tempted to edit and commit there.
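A minimal sketch of that setup; the share path is hypothetical (on Windows it would be a UNC path such as \\server\share\project):

    # put a repository on the share without a working directory (-U / --noupdate)
    hg clone -U ~/work/project /mnt/teamshare/project

    # each developer clones from the share and works locally
    hg clone /mnt/teamshare/project ~/work/project
    cd ~/work/project
    # ...edit and commit locally, then sync over the share...
    hg commit -m "local work"
    hg push        # push to the shared repository
    hg pull -u     # pull and update with teammates' changes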
There's no reason I can think of why you wouldn't be able to do so. I do something similar, except that I use ssh rather than CIFS to access the files. There's no server setup to speak of in either case.
The only thing that came to mind as a possible problem was concurrent access, but you can see for yourself that Mercurial takes care not to allow users to step on each other's toes.

How to maintain application configuration data in the database across multiple environments?

The company I work for has attempted to maintain configuration data for our application across multiple environments, but syncing that data has always been problematic and we've never come up with a good solution.
To help clarify, we (developers or business) might change some configuration using our admin interface on the Staging environment, test it, and then want to copy those changes to our Production environment without having to redo all the changes in the Production environment. We've also typically wanted to sync these changes between all of our environments (dev, staging, & production), again without having to make the changes individually on each environment.
Preferably we don't want to use any low-level tools: asking the business to use something like Red Gate's SQL Data Compare and copy individual rows wouldn't work. It would need to be something intuitive enough for the not-so-technical to use without overwhelming them.
How do we maintain this configuration data across the different environments while still providing the business with the ability to test their changes before applying it to the live environment?
What level of technical know-how will the users have? As product manager at Red Gate I can give you our perspective. Although we're not considering support for data in our v1 release of SQL Source Control (currently under development), it will inevitably follow. However, this would still require those who wish to edit static data to do so in SSMS, although they could of course edit the values using SSMS's graphical designers. Or is this still less intuitive than you'd like? They would change the data on a dev or staging database and be expected to verify that the changes are correct and function as expected. These would then be committed to source control via our tool.
To deploy, it would be a question of launching SQL Data Compare, although we plan to provide simple shortcuts from SSMS rather than requiring users to negotiate their way around a completely separate tool. We haven't nailed down designs for this functionality, so I'd encourage you to participate in our Early Access Program and state your case. More details of the Program can be found here:
http://www.red-gate.com/Products/SQL_Source_Control/index.htm