MySQL: update whole database without downtime

I have a large database that needs to be rebuilt every 24 hours. The database is built by a custom script on a server that pulls data from different files. The problem is that the whole process takes about a minute to complete, and that is one minute of downtime, because we need to drop the whole database in order to rebuild it (there is no other way than to drop it).
At first we planned to build a temporary database, drop the original, and then rename the temporary one to the original name, but MySQL doesn't support renaming databases.
The second approach was to dump an .sql file from the temporary database and import it into the main (original) database, but that also causes downtime.
What is the best way to do this?

Here is something that I do. It doesn't result in zero downtime but could finish in less than a second.
Create a database that contains only interface elements to your real database. In my case, it contains only view definitions, and all user queries go through this database.
Create a new database each night. When it is done, update the view definitions to refer to the new database. I would recommend either turning off user access to the database containing the views while you are updating them, or dropping all of the views and recreating them -- this prevents partial access to the old database. Because creating views is fast, this should be a very quick operation.
We do all of this through a job. In fact, before changing the production views, we test the view creation on another database to be sure they are all working.
Obviously, if you use alter view instead of requiring consistency across all the views, then there is no downtime, just a brief period of inconsistency.
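As a rough sketch of that view swap, assuming an interface database called app_views and nightly snapshot databases called app_data_new / app_data_old (all names here are invented for illustration):
-- once the new snapshot database is fully built, repoint each view
ALTER VIEW app_views.customers AS
    SELECT * FROM app_data_new.customers;
ALTER VIEW app_views.orders AS
    SELECT * FROM app_data_new.orders;
-- when every view references the new snapshot, drop the old one
DROP DATABASE app_data_old;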

Related

Proper way to sync a table from another similar table with a few different columns while inserts and updates are happening

We need to alter an existing InnoDB table with 10+ million records to add a few columns. We tried a simple ALTER TABLE query and it took almost an hour to complete; however, the change was not reflected, and no error details were available.
So, we are trying this approach:
creating a new table with the same schema,
then altering that table,
then syncing up data from the existing table,
then renaming the first table to a different name (the application will get errors during this time) and then renaming the second table to the production name used by the application (sketched below).
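A rough sketch of that copy-and-swap sequence, assuming a hypothetical accounts table and made-up new columns; note that MySQL renames both tables atomically in a single RENAME TABLE statement, which keeps the error window very small:
CREATE TABLE accounts_new LIKE accounts;
-- add the new columns while the copy is still empty, which is quick
ALTER TABLE accounts_new
    ADD COLUMN risk_score DECIMAL(5,2) NULL,
    ADD COLUMN last_review DATE NULL;
-- bulk-copy the existing rows (ideally in batches on a table this size)
INSERT INTO accounts_new (id, name, balance)
    SELECT id, name, balance FROM accounts;
-- atomic swap: both renames happen in one statement
RENAME TABLE accounts TO accounts_old,
             accounts_new TO accounts;
This still leaves the question of rows written to the old table between the copy and the swap, which is exactly the syncing problem described below.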
Problem in hand
I am not sure how to go ahead with the syncing while the application is live.
I think we should go ahead with syncing instead of just dumping and restoring. If dumping has to be done, it should be done with traffic shut down.
Edits can happen to the table in question as transactions are processed, so we need to ensure that, in addition to sanity checks on the total accounts migrated, we also don't lose any edits made to the table during the migration.
Is a stored procedure needed in this scenario?
Update
We need to make sure that no updates or inserts to the existing table (being written by the application) are missed. Not sure if a stored procedure is the solution here.
Do we need to shut down writes completely for this? Is there any way of doing it while keeping the application running?

Move data between AWS RDS instances

I need to move millions of rows between identical MySQL databases on two different RDS instances. The approach I thought about is this:
- use data-pipeline to export data from the first instance to amazon-s3
- use data-pipeline to import data from amazon-s3 to the second instance
My problem is that I need to delete the data on the first instance at the end. Since we're talking about huge amounts of data, I thought about creating a stored procedure to delete the rows in batches. Is there a way to achieve that in AWS? Or are there any other solutions?
One other thing is that I only need to move some rows from a specific table, not the whole table or the whole database.
You can use the AWS DMS service, which is the easiest way to move a huge amount of data. Please follow the steps below.
First, you need to change some settings in the parameter group on both RDS instances:
'log_bin' = 'ON'
'binlog_format' = 'ROW'
'binlog_checksum' = 'NONE'
'log_bin_use_v1_row_events' = 'ON'
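As a quick sanity check (my own suggestion, not a required DMS step), you can confirm from any SQL client that those parameter group changes have taken effect on each instance:
SHOW VARIABLES
WHERE Variable_name IN ('log_bin', 'binlog_format',
                        'binlog_checksum', 'log_bin_use_v1_row_events');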
Take a dump of the database's schema from the first RDS instance.
Restore it on the second RDS.
Now start configuring DMS.
Set up the endpoints first.
Then create a task to import data from the source (first RDS) to the destination (second RDS).
For the migration type, choose "Migrate existing data" if you just want to load the existing data, or include ongoing changes if you are trying to sync real-time data.
Under the task settings, select Target table preparation mode = Do nothing.
Check the Enable logging checkbox; it will help with debugging in case of any errors.
Once the task is started, you will be able to see its progress in the dashboard.
Use TRUNCATE TABLE instead of a DELETE statement if you want to delete all the data in one table. It will save you a lot of time.
Data-pipeline is more for a recurring process. It seems like a lot of extra hassle if you just want to do a one-time operation. It might be easier to launch an instance with decent network throughput, attach a big enough EBS volume to hold your data, and use command-line tools like mysqldump to move the data.
As far as cleanup goes, it's probably faster to come up with a query that copies the rows you want to keep into a temporary table (or everything except the rows you don't want), then use RENAME to swap the temporary table for the original, and then drop the original table.
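A minimal sketch of that keep-and-swap cleanup, with an imaginary events table and a made-up keep condition:
CREATE TABLE events_keep LIKE events;
-- copy only the rows you want to keep (the WHERE clause is illustrative)
INSERT INTO events_keep
    SELECT * FROM events
    WHERE migrated = 0;
-- atomic swap, then drop the old data in one go
RENAME TABLE events TO events_old,
             events_keep TO events;
DROP TABLE events_old;
Dropping a table is much cheaper than deleting millions of rows one batch at a time.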

Duplicate a whole database on the same server?

We are running a service where we have to set up a new database for each new site. The database is exactly the same, so we can simply restore it from a backup file or clone it from a sample database (which was created only for cloning purposes; no transactions run there, so there is no worry about corrupting data) on the same server. The database itself contains around 100 tables with some data, and it takes around 1-2 minutes to import, which is too slow.
I'm trying to find a way to do this as fast as possible. The first thought that came to mind was to copy the files within the sample database's data_dir, but it seems that I also need to somehow edit the table lists, or MySQL won't be able to read my new database's tables even though it still shows them there.
You're duplicating the database the wrong way; it will be much faster if you do it properly.
Here is how you duplicate a database:
create database new_database;
create table new_database.table_one select * from source_database.table_one;
create table new_database.table_two select * from source_database.table_two;
create table new_database.table_three select * from source_database.table_three;
...
I just did a performance test: this takes 81 seconds to duplicate 750MB of data across 7 million table rows. Presumably your database is smaller than that?
I don't think you are going to find anything faster. One thing you could do is keep a queue of duplicate databases on standby, ready to be picked up and used at any time. That way you don't need to create a new database at all; you just rename an existing database from the queue of available ones, and have a cron job running to make sure the queue never runs empty.
Why is MySQL not able to read it, and what did you change in the table lists?
I think there may be a problem with MySQL's permissions to read the copied files; otherwise it should be fine.
Thanks

Will a cron job for creating a db affect the user?

I have a website where I need to create a temporary database table which is recreated every 5 hours. It takes about 0.5 seconds to complete the action.
Site analytics show about 5 hits per second. This figure may gradually increase.
Question
The cron job empties the db and then recreates it. Does this mean that, while someone is accessing a page which populates data based on the temporary db while the cron job is running, they may get no data or incomplete data?
Or
Is this scenario taken care of by MySQL due to locking?
From my tests, if one MySQL client attempts to drop a database while another client has one of its tables locked, the client will wait until the table is unlocked.
However the client dropping the database cannot itself hold a lock on any of the database's tables either. So depending on what you are doing, you may need to use some other method to serialise requests and so on. For example, if the job needs to drop and re-create the database, create the table(s) and populate them before other clients use them, table locking will be a problem because there won't always be a table to lock.
Consider using explicit locking with GET_LOCK() to coordinate operations on the "temporary" database.
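A minimal sketch of that coordination, using a made-up lock name and timeouts; since GET_LOCK() is purely advisory, it only works if both the cron job and the application take the lock:
-- in the cron job, before dropping and repopulating the tables
SELECT GET_LOCK('temp_tables_rebuild', 10);
-- ... drop, recreate and populate the temporary tables here ...
SELECT RELEASE_LOCK('temp_tables_rebuild');
-- in the web application, around reads of those tables
SELECT GET_LOCK('temp_tables_rebuild', 5);
-- ... run the page's queries ...
SELECT RELEASE_LOCK('temp_tables_rebuild');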
Also consider rethinking your strategy. Dropping and re-creating entire databases on a regular basis is not a common practice.
Instead of dropping and recreating, you might want to create the new table first under a temporary name, populate it, and then drop the old one while renaming the new one.
Additionally, you should either make your web app able to retry when the table is not found, to cope with the small time window where the table does not exist, or operate on a view instead of renaming tables.
As far as I know, when you lock the table, others cannot access it until you unlock it, but the other connections will only wait until about 0.5 seconds later, so your users may have to wait an extra 0.5 seconds while you recreate the table.
Don't worry about missing data, only the occasional delay.

How to update DB structure when updating production system without doing a teardown / rebuild

If I'm working on a development server and have updates to the database structure for some of our releases, what is the best way to update the structure on the production server?
Currently we create a new production database containing the structure only, do a SQL dump of the data from the 'old' production database, and then run a SQL query to insert the data into the new database.
I know there is an easier way to do these updates, right?
Thanks in advance.
We don't run anything on prod without a script, and that script must be in source control. Additionally, we have to write a rollback script in case the initial script goes bad and we have to back it out. When we move to prod, configuration management does a differential compare between prod and dev to see if we have missed anything in the production script (any differences have to be traceable to development work that is not yet ready to move to prod, and documented). A product like Red Gate's SQL Compare can do this. Our process is very formalized so that we can maintain a certification required by our larger clients.
If you have large tables, even ALTER TABLE can be slow, but it's still generally more efficient in total time than making a copy of the table with a new name and structure, copying the data to that table, renaming the old table, then giving the new table the name of the original table, and then deleting the old table.
However, there are times when that is the preferable process, as the total downtime apparent to the user in that case is the time it takes to rename two tables, so it is good for tables whose data is only filled from the backend, not the application (if the application can update the tables, this is a dangerous practice, as you may lose changes made while the tables were in transition). A lot of which process to use depends on the nature of the change you are making. Some changes should be done in a maintenance window where the users are not allowed to access the database. For instance, if you are adding a new field with a default value to a table with 100,000,000 records, you are liable to lock the users out of the table while the update happens. It is better to do this in single-user mode during off hours (and when the users have been told in advance that the database will not be available). Other changes take only milliseconds and can happen easily while users are logged in.
Look at ALTER TABLE to change the schema.
It might not be easier than your method, but it means less copying of the database.
This is actually quite a deep question. If the only changes you've made are to add some columns then ALTER TABLE is probably sufficient. But if you're renaming or deleting columns then ALTER statements may break various foreign key constraints. In addition, sometimes you need to make changes both to the database and the data, which is pretty much unscriptable.
Most likely the best way to automate this would be to write a simple script for each deployment (along with a script to roll back!), which is basically what some systems like Rails will do for you, I believe. Some scripts might be simply ALTER statements, some might temporarily disable foreign-key checking and triggers, etc., and some might run UPDATE statements as well. Some might dump the db and rebuild it. I don't think there's a one-size-fits-all solution here, sorry :)
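For example, a hand-written migration script for one release might look like the sketch below (the table and column names are invented, and you would pair it with a matching rollback script):
-- forward migration for one release (names are illustrative)
SET FOREIGN_KEY_CHECKS = 0;
ALTER TABLE orders
    ADD COLUMN shipping_method VARCHAR(32) NOT NULL DEFAULT 'standard';
-- backfill the new column from existing data
UPDATE orders
   SET shipping_method = 'express'
 WHERE priority = 1;
SET FOREIGN_KEY_CHECKS = 1;
-- rollback script, kept alongside the migration:
-- ALTER TABLE orders DROP COLUMN shipping_method;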
Use the ALTER TABLE command: http://dev.mysql.com/doc/refman/5.0/en/alter-table.html