Large scale MySQL changes to active sites

Large scale MySQL changes to active sites - mysql

Just some pointers here.
I am making fairly extensive modifications to a site, including the MySQL database.
My plan is to do everything on my development server, export the new MySQL structure for the db and import it onto the clients server.
Basically I need to know that performing a structure only import will not overwrite/delete existing data. I am not making changes to the data type or field length.

In my experience, when you export a database (through phpMyAdmin for instance), part of the SQL script that is created includes a "DROP TABLE IF EXISTS 'table_name';" before doing a "CREATE TABLE 'table_name'...;" to build the new table.
My guess is that this is not what you want to do! Certainly use the dev system to alter the structure in order to make everything correct, but then look around for a database synchronisation routine where you can provide the old structure, the new structure, and the software will create the appropriate "ALTER TABLE 'table_name'...;" scripts to make the required changes.
You should then really examine these change files before executing them on the live database, and of course BACKUP the live database, and ensure you are able to fully recover from the backup before starting any of the alterations!

I've had to do this a lot, and it always goes like this:
Make a backup of the live database, complete with data.
Make a backup of the live database schema only.
Calculate the differences between the old (live) schema and the new (devel) schema.
Create all of the 'ALTER TABLE ...' DDL statements necessary to upgrade from the old schema to the new one. Keep in mind that if you rename a field, you probably won't be able to just rename it -- you'll need to create the new field, copy the data from the old field, and then drop the old field.
If you changed relationships between tables, you'll probably need to drop indexes and foreign key relationships first, and then add them back afterwards.
You'll need to populate any new fields based upon their default values, if any.
Once you've got all the pieces working, you'll need to combine them into one large script, and then run it on a copy of the live database.
Dump the schema and compare it against the desired new schema -- if they don't match, go back to step 3 and repeat.
Dump the data and compare it against the expected changes -- again, if they don't match, go back to step 3 and repeat.
You're going to learn a lot more about SQL DDL/DML during this process than you ever thought you'd learn. (For one project, where we were switching from natural keys to UUID keys for 50+ tables, I ended up writing programs to generate all of the DDL/DML.)
Good luck, and make frequent backups.

I'd recommend to prepare a sql script for every change you do on development server, so you will be able to reproduce it on development. You shouldn't get to the point where you need to calculate differences between database structures
This is how I do it, all changes are reflected in sql scripts, and I can reconstruct the history of my database running all these files if needed.
Test the final release version on a "staging" mysql server. Make a copy of your production server on another machine and test your script to make sure everything's ok.
Of course, preliminary database backup is a must.

Related

How to properly wipe a database, and re-import?

I am unsure about the best way to do this. As I'm getting ready to put a new database into production, I need to import data from the old database that has been formed in the meantime of me working on it. The new database now also contains a lot of fake data that was used for testing, which I have to get rid of, so a fresh complete re-import seems reasonable.
Now, truncating all the tables in the new database cannot go through, because the foreign keys prevent it. Simply deleting the data instead would solve that problem, but it leaves the AUTO_INCREMENT indexes to the values where they were, so it's not a "proper" wipe. Now, there could be more properties such as that one, that would be left over (so to say), but this is the only one that I'm aware of.
So my question now is, how much of a problem could these "leftover" pieces of data pose to performance, if I were to go with the simple DELETE solution?
And also; is there a way that would be more thorough in cleaning it out, and also allow me to, of course, keep the defined constraints?

First i would use some gui tool to create the dump for the old DB ( like mySql workbench, or what ever you prefer ). Check options "Export to self-contained file", and check "Dump stored procedures and functions","Dump events" and "Dump triggers".
Then get create scripts for all tables not included in the old DB.
You can do this via "reverse engineer" option.
If you have trouble with this part this post will help.
How to get a table creation script in MySQL Workbench?
When you have old DB dump and create scripts for new sql tables, combine them to a single sql file.
On the first row add:
SET FOREIGN_KEY_CHECKS = 0;
On the last row add:
SET FOREIGN_KEY_CHECKS = 1;
Run the script. As a result you should have all tables ( new without data and old with data ), with all relations set properly. Hope it will work for you.

Perl: How to copy/mirror remote MYSQL table(s) to another database? Possibly different structure too?

I am very new to this and a good friend is in a bind. I am at my wits end. I have used gui's like navicat and sqlyog to do this but, only manually.
His band info data (schedules and whatnot) is in a MYSQL database on a server (admin server).
I am putting together a basic site for him written in Perl that grabs data from a database that resides on my server (public server) and displays schedule info, previous gig newsletters and some fan interaction.
He uses an administrative interface, which he likes and desires to keep, to manage the data on the admin server.
The admin server db has a bunch of tables and even table data the public db does not need.
So, I created tables on the public side that only contain relevant data.
I basically used a gui to export the data, then insert to the public side whenever he made updates to the admin db (copy and paste).
(FYI I am using DBI module to access the data in/via my public db perl script.)
I could access the admin server directly to grab only the data I need but, the whole purpose of this is to "mirror" the data not access the admin server on every query. Also, some tables are THOUSANDS of rows and parsing every row in a loop seemed too "bulky" to me. There is however a "time" column which could be utilized to compare to.
I cannot "sync" due to the fact that the structures are different, I only need the relevant table data from only three tables.
SO...... I desire to automate!
I read "copy" was a fast way but, my findings in how to implement were too advanced for my level.
I do not have the luxury of placing a script on the admin server to notify when there was an update.
1- I would like to set up a script to check a table to see if a row was updated or added on the admin servers db.
I would then desire to update or insert the new or changed data to the public servers db.
This "check" could be set up in a cron job I guess or triggered when a specific page loads on the public side. (the same sub routine called by the cron I would assume).
This data does not need to be "real time" but, if he updates something it would be nice to have it appear as quickly as possible.
I have done much reading, module research and experimenting but, here I am again at stackoverflow where I always get great advice and examples.
Much of the terminology is still quite over my head so verbose examples with explanations really help me learn quicker.
Thanks in advance.

The two terms you are looking for are either "replication" or "ETL".
First, replication approach.
Let's assume your admin server has tables T1, T2, T3 and your public server has tables TP1, TP2.
So, what you want to do (since you have different table structres as you said) is:
Take the tables from public server, and create exact copies of those tables on the admin server (TP1 and TP2).
Create a trigger on the admin server's original tables to populate the data from T1/T2/T3 into admin server's copy of TP1/TP2.
You will also need to do initial data population from T1/T2/T3 into admin server's copy of TP1/TP2. Duh.
Set up the "replication" from admin server's TP1/TP2 to public server's TP1/TP2
A different approach is to write a program (such programs are called ETL - Extract-Transform-Load) which will extract the data from T1/T2/T3 on admin server (the "E" part of "ETL"), massage the data into format suitable for loading into TP1/TP2 tables (the "T" part of "ETL"), transfer (via ftp/scp/whatnot) those files to public server, and the second half of the program (the "L") part will load the files into the tables TP1/TP2 on public server. Both halfs of the program would be launched by cron or your scheduler of choice.
There's an article with a very good example of how to start building Perl/MySQL ETL: http://oreilly.com/pub/a/databases/2007/04/12/building-a-data-warehouse-with-mysql-and-perl.html?page=2
If you prefer not to build your own, here's a list of open source ETL systems, never used any of them so no opinions on their usability/quality: http://www.manageability.org/blog/stuff/open-source-etl

I think you've misunderstood ETL as a problem domain, which is complicated, versus ETL as a one-off solution, which is often not much harder than writing a report. Unless I've totally misunderstood your problem, you don't need a general ETL solution, you need a one-off solution that works on a handful of tables and a few thousand rows. ETL and Schema mapping sound scarier than they are for a single job. (The generalization, scaling, change-management, and OLTP-to-OLAP support of ETL are where it gets especially difficult.) If you can use Perl to write a report out of a SQL database, you probably know enough to handle the ETL involved here.
1- I would like to set up a script to check a table to see if a row was updated or added on the admin servers db. I would then desire to update or insert the new or changed data to the public servers db.
If every table you need to pull from has an update timestamp column, then your cron job includes some SELECT statements with WHERE clauses based on the last time the cron job ran to get only the updates. Tables without an update timestamp will probably need a full dump.
I'd use a one-to-one table mapping unless normalization was required... just simpler to my opinion. Why complicate it with "big" schema changes if you don't have to?
some tables are THOUSANDS of rows and parsing every row in a loop seemed too "bulky" to me.
Limit your queries to only the columns you need (and if there are no BLOBs or exceptionally big columns in what you need) a few thousand rows should not be a problem via DBI with a FETCHALL method. Loop all you want locally, just make as few trips to the remote database as possible.
If a row is has a newer date, update it. I will also have to check for new rows for insertion.
Each table needs one SELECT ... WHERE updated_timestamp_columnname > last_cron_run_timestamp. That result set will contain all rows with newer timestamps, which contains newly inserted rows (if the timestamp column behaves like I'd expect). For updating your local database, check out MySQL's ON DUPLICATE KEY UPDATE syntax... this will let you do it in one step.
... how to implement were too advanced for my level ...
Yes, I have actually done this already but, I have to manually update...
Some questions to help us understand your level... Are you hitting the database from the mysql client command-line or from a GUI? Have you gotten to the point where you've wrapped your SQL queries in Perl and DBI, yet?

If the two databases have different, you'll need an ETL solution to map from one schema to another.
If the schemas are the same, all you have to do is replicate the data from one to the other.

Why not just create identical structure on the 'slave' server to the master server. Then create a small table that keeps track of the last timestamp or id for the updated tables.
Then select from the master all rows changed since the last timestamp or greater than the id. Insert them into the matching table on the slave server.
You will need to be careful of updated rows. If a row on the master is updated but the timestamp doesn't change then how will you tell which rows to fetch? If that's not an issue the process is quite simple.
If it is an issue then you need to be more sophisticated, but without knowing the data structure and update mechanism its a goose chase to give pointers on it.
The script could be called by cron every so often to update the changes.
if the database structures must be different on the two servers then a simple translation step may need to be added, but most of the time that can be done within the sql select statement and maybe a join or two.

How do you version and sync your MySQL data model?

What's the best way to save my MySQL data model and automatically apply changes to my development database server as they are made (or at least nightly)?
For example, today I'm working on my project and create this table in my database, and save the statement to SQL file to deploy to production later:
create table dog (
uid int,
name varchar(50)
);
And tomorrow, I decide I want to record the breed of each dog too. So I change the SQL file to read:
create table dog (
uid int,
name varchar(50),
breed varchar(30)
);
That script will work in production for the first release, but it won't help me update my development database because ERROR 1050 (42S01): Table 'dog' already exists. Furthermore, it won't work in production if this change was made after the first release. So I really need to ALTER the table now.
So now I have two concerns:
Is this how I should be saving my
data model (a bunch of create
statements in a SQL file), and
How
should I be applying changes like
this to my database?
My goal is to release changes accurately and enable continuous integration. I use a tool called DDLSYNC do find and apply difference in an Oracle database, but I'm not sure what similar tools exist for MySQL.

At work, we developed a small script to manage our database versioning. Every change to any table or set of data gets it's own SQL file.
The files are numbered sequentially. We keep track of which update files have been run by storing that information in the database. The script inserts a row with the filename when the file is about to be executed, and updates the row with a completion timestamp when the execution finishes. This is wrapped inside a transaction. (It's worth remembering that DDL commands in MySQL can not occur within a transaction. Any attempt to perform DDL in a transaction causes an implicit commit.)
Because the SQL files are part of our source code repository, we can make running the update script part of the normal rollout process. This makes keeping the database and the code in sync easy as pie. Honestly, the hardest part is making sure another dev hasn't grabbed the next number in a pending commit.
We combine this update system with an (optional) nightly wipe of our dev database, replacing the contents with last night's live system backup. After the backup is restored, the update gets run, with any pending update files getting run in the process.
The restoration occurs in such a way that only tables that were in the live database get overwritten. Any update that adds a table therefore also has to be responsible for only adding it if it doesn't exist. DROP TABLE IF EXISTS is handy. Unfortunately not all databases support that, so the update system also allows for execution of scripts written in our language of choice, not just SQL.
All of this in about 150 lines of code. It's as easy as reading a directory, comparing the contents to a table, and executing anything that hasn't already been executed, in a determined order.

There are standard tools for this in many frameworks: Rails has something called Migrations, something that's easily replicated in PHP or any similar language.

question about MySQL database migration

If I have a MySQL database with several tables on a live server, now I would like to migrate this database to another server. Of course, the migration I mean here involves some database tables, for example: add some new columns to several tables, add some new tables etc..
Now, the only method I can think of is to use some php/python(two scripts I know) script, connect two databases, dump the data from the old database, and then write into the new database. However, this method is not efficient at all. For example: in old database, table A has 28 columns; in new database, table A has 29 columns, but the extra column will have default value 0 for all the old rows. My script still needs to dump the data row by row and insert each row into the new database.
Using MySQLDump etc.. won't work. Here is the detail. For example: I have FOUR old databases, I can name them as 'DB_a', 'DB_b', 'DB_c', 'DB_d'. Now the old table A has 28 columns, I want to add each row in table A into the new database with a new column ID 'DB_x' (x to indicate which database it comes from). If I can't differentiate the database ID by the row's content, the only way I can identify them is going through some user input parameters.
Is there any tools or a better method than writing a script yourself? Here, I dont need to worry about multithread writing problems etc.., I mean the old database will be down (not open to public usage etc.., only for upgrade ) for a while.
Thanks!!

I don't entirely understand your situation with the columns (wouldn't it be more sensible to add any new columns after migration?), but one of the arguably fastest methods to copy a database across servers is mysqlhotcopy. It can copy myISAM only and has a number of other requirements, but it's awfully fast because it skips the create dump / import dump step completely.

Generally when you migrate a database to new servers, you don't apply a bunch of schema changes at the same time, for the reasons that you're running into right now.
MySQL has a dump tool called mysqldump that can be used to easily take a snapshot/backup of a database. The snapshot can then be copied to a new server and installed.
You should figure out all the changes that have been done to your "new" database, and write out a script of all the SQL commands needed to "upgrade" the old database to the new version that you're using (e.g. ALTER TABLE a ADD COLUMN x, etc). After you're sure it's working, take a dump of the old one, copy it over, install it, and then apply your change script.

Use mysqldump to dump the data, then echo output.txt > msyql. Now the old data is on the new server. Manipulate as necessary.

Sure there are tools that can help you achieving what you're trying to do. Mysqldump is a premier example of such tools. Just take a glance here:
http://dev.mysql.com/doc/refman/5.1/en/mysqldump.html
What you could do is:
1) You make a dump of the current db, using mysqldump (with the --no-data option) to fetch the schema only
2) You alter the schema you have dumped, adding new columns
3) You create your new schema (mysql < dump.sql - just google for mysql backup restore for more help on the syntax)
4) Dump your data using the mysqldump complete-insert option (see link above)
5) Import your data, using mysql < data.sql
This should do the job for you, good luck!

Adding extra rows can be done on a live database:
ALTER TABLE [table-name] ADD [row-name] MEDIUMINT(8) default 0;
MySql will default all existing rows to the default value.
So here is what I would do:
make a copy of you're old database with MySql dump command.
run the resulting SQL file against you're new database, now you have an exact copy.
write a migration.sql file that will modify you're database with modify table commands and for complex conversions some temporary MySql procedures.
test you're script (when fail, go to (2)).
If all OK, then goto (1) and go live with you're new database.

These are all valid approaches, but I believe you want to write a sql statement that writes other insert statements that support the new columns you have.

MySQL to SQL Server transferring data

I need to convert data that already exists in a MySQL database, to a SQL Server database.
The caveat here is that the old database was poorly designed, but the new one is in a proper 3N form. Does any one have any tips on how to go about doing this? I have SSMS 2005.
Can I use this to connect to the MySQL DB and create a DTS? Or do I need to use SSIS?
Do I need to script out the MySQL DB and alter every statement to "insert" into the SQL Server DB?
Has anyone gone through this before? Please HELP!!!

See this link. The idea is to add your MySQL database as a linked server in SQL Server via the MySQL ODBC driver. Then you can perform any operations you like on the MySQL database via SSMS, including copying data into SQL Server.
Congrats on moving up in the RDBMS world!

SSIS is designed to do this kind of thing. The first step is to map out manually where each piece of data will go in the new structure. So your old table had four fields, in your new structure fileds1 and 2 go to table a and field three and four go to table b, but you also need to have the autogenerated id from table a. Make notes as to where data types have changed and you may need to make adjustments or where you have required fileds where the data was not required before etc.
What I usually do is create staging tables. Put the data in the denormalized form in one staging table and then move to normalized staging tables and do the clean up there and add the new ids as soon as you have them to the staging tables. One thing you will need to do if you are moving from a denormalized database to a normalized one is that you will need to eliminate the duplicates from the parent tables before inserting them into the actual production tables. You may also need to do dataclean up as there may be required fileds in the new structure that were not required in the old or data converstion issues becasue of moving to better datatypes (for instance if you stored dates in the old database in varchar fields but properly move to datetime in the new db, you may have some records which don't have valid dates.
ANother issue you need to think about is how you will convert from the old record ids to the new ones.
This is not a an easy task, but it is doable if you take your time and work methodically. Now is not the time to try shortcuts.

What you need is an ETL (extract, transform, load) tool.
http://en.wikipedia.org/wiki/Extract,_transform,_load#Tools

I don't really know how far an 'ETL' tool will get you depending on the original and new database designs. In my career I've had to do more than a few data migrations and we usually always had to design a special utility which would update a fresh database with records from the old database, and yes we coded it complete with all the update/insert statements that would transform data.
I don't know how many tables your database has, but if they are not too many then you could consider going the grunt root. That's one technique that's guaranteed to work after all.

If you go to your database in SSMS and right-click, under tasks should be an option for "Import Data". You can try to use that. It's basically just a wizard that creates an SSIS package for you, which it can then either run for you automatically or which you can save and then alter as needed.
The big issue is how you need to transform the data. This goes into a lot of specifics which you don't include (and which are probably too numerous for you to include here anyway).
I'm certain that SSIS can handle whatever transformations you need to do to change it from the old format to the new. An alternative though would be to just import the tables into MS SQL as-is into staging tables, then use SQL code to transform the data into the 3NF tables. It's all a matter of what your most comfortable with. If you go the second route, then the import process that I mentioned above in SSMS could be used. It will even create the destination tables for you. Just be sure that you give them unique names, maybe prefixing them with "STG_" or something.
Davud mentioned linked servers. That's definitely another way that you can go (and got my upvote). Personally, I prefer to copy the tables over into MS SQL first since linked servers can sometimes have weirdness, especially when it comes to data types not mapping between different providers. Having the tables all in MS SQL will also probably be a bit faster and saves time if you have to rerun or correct portions of the data. As I said though, the linked server method would probably be fine too.

I have done this going the other direction and SSIS works fine, although I might have needed to use a script task to deal with slight data type weirdness. SSIS does ETL.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008