Automatic MySQL backup on shared host - mysql

I am not a coder, so maybe I should stop here, but I am interested in how far I can get building an automatic SQL backup script.
Case:
A website is hosted on a shared server that uses cPanel. The website uses only one DB, and one of the tables is the log. The log table has now reached 300k rows... (I might be doing something wrong here... or is it just a popular website? :))
So I need to shrink the log table, but I would rather back the data up than just delete it. Here is my idea:
Set up a backup DB and copy the old entries into it, using one table per quarter, so the log from Jan-Mar would be stored in table_2012-q1, etc.
Method:
I would like to use cron and an email alert.
Questions:
Is there a better or easier way to do the backup with this number of rows?
If I "move rows" with INSERT/DELETE, how can I check which rows have already been copied before I delete them?
Do I need to worry about the performance of this process, since it should run in the background? In other words, should it be a SELECT-based copy or a dump?
Sorry if this is too basic, but I would like to learn! I also don't want to use too much CPU for this.
Thanks Andras

Since you are using shared hosting, I'm pretty sure you will not be able to access cron, so here is an alternative:
Since the database is filled with log data:
1. Create a new table, regardless of the name or time period
2. Move the rows (from a certain id) from one table to the next; a rough sketch of this is at the end of this answer
This link will explain it better: mysqldump partial database
If this is an active DB, I would clone it first and then experiment with how the data gets moved, since you do not consider yourself a coder.
Hope this helps.
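To make step 2 concrete, here is a minimal sketch of the "move rows" idea in plain SQL (the log table name and the created_at column are assumptions; adjust them to your schema). Comparing SELECT COUNT(*) over the same date range on both tables before running the DELETE is a simple way to check that the copy is complete:
-- Assumed names: a `log` table with a `created_at` DATETIME column.
CREATE TABLE log_2012_q1 LIKE log;

INSERT INTO log_2012_q1
SELECT * FROM log
WHERE created_at >= '2012-01-01' AND created_at < '2012-04-01';

-- Verify both tables report the same COUNT(*) for this range,
-- then (and only then) delete the copied rows:
DELETE FROM log
WHERE created_at >= '2012-01-01' AND created_at < '2012-04-01';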

Related

Duplicate a whole database on the same server?

We are running a service where we have to set up a new database for each new site. The database is exactly the same, so we can simply dump from a backup file or clone from a sample database (which is created only for cloning purposes; no transactions run there, so there is no worry about corrupting data) on the same server. The database itself contains around 100 tables with some data, taking around 1-2 minutes to import, which is too slow.
I'm trying to find a way to do it as fast as possible. The first thought that came to mind was to copy the files within the sample database's data dir, but it seems like I also need to somehow edit the table list or MySQL won't be able to read my new database's tables, even though it still shows them there.
You're duplicating the database the wrong way; it will be much faster if you do it properly.
Here is how you duplicate a database:
create database new_database;
create table new_database.table_one select * from source_database.table_one;
create table new_database.table_two select * from source_database.table_two;
create table new_database.table_three select * from source_database.table_three;
...
I just did a performance test: this takes 81 seconds to duplicate 750 MB of data across 7 million rows. Presumably your database is smaller than that?
I don't think you are going to find anything faster. One thing you could do is keep a queue of duplicate databases on standby, ready to be picked up and used at any time. Then you don't need to create a new database at all; you just rename an existing database from the queue of available ones, and have a cron job running to make sure the queue never runs empty.
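If you go the standby-queue route, note that MySQL has no single statement to rename a whole database; in practice the "rename" is done by moving the tables with RENAME TABLE, which is a near-instant metadata change on the same server. A rough sketch with made-up names:
CREATE DATABASE new_site_db;
RENAME TABLE standby_db.table_one   TO new_site_db.table_one,
             standby_db.table_two   TO new_site_db.table_two,
             standby_db.table_three TO new_site_db.table_three;
-- ...and so on for the remaining tables, then DROP DATABASE standby_db;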
Why is MySQL not able to read them, and what did you change in the table list?
I think it may be a problem with file permissions preventing MySQL from reading them; otherwise it should be fine.
Thanks

Transfer mySQL from development to production

I need to sync the development MySQL DB with the production one.
Production db gets updated by user clicks and other data generated via web.
Development db gets updated with processing data.
What's the best practice to accomplish this?
I found some diff tools (e.g. MySQL diff), but they don't handle updated records.
I also found some application solution: http://www.isocra.com/2004/10/dumptosql/
but I'm not sure it's good practice, as in this case I would need to retest my code each time I add new InnoDB-related tables.
Any ideas?
Take a look at mysqldump. It may serve you well enough for this.
Assuming your tables are all indexed with some sort of unique key, you could do a dump and have it leave out the 'drop/create table' bits. Have it run as 'insert ignore' and you'll get the new data without affecting the existing data.
Another option would be to use the query part of mysqldump to dump only the new records from the production side. Again - have mysqldump leave off the 'drop/create' bits.
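For reference, the 'insert ignore' behaviour that such a dump produces boils down to statements like the following (the table and columns here are invented for illustration):
-- Rows whose unique key already exists on the development side are skipped:
INSERT IGNORE INTO dev_db.clicks (id, user_id, clicked_at)
VALUES (1001, 42, '2012-03-01 10:15:00');

-- If production values should instead overwrite stale development rows:
INSERT INTO dev_db.clicks (id, user_id, clicked_at)
VALUES (1001, 42, '2012-03-01 10:15:00')
ON DUPLICATE KEY UPDATE user_id = VALUES(user_id), clicked_at = VALUES(clicked_at);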

How to update database of ~25,000 music files?

Update:
I wrote a working script that finishes this job in a reasonable length of time, and seems to be quite reliable. It's coded entirely in PHP and is built around the array_diff() idea suggested by saccharine (so, thanks saccharine!).
You can access the source code here: http://pastebin.com/ddeiiEET
I have a MySQL database that is an index of MP3 files in a certain directory, together with their attributes (i.e. title/artist/album).
New files are often being added to the music directory. At the moment it contains about 25,000 MP3 files, but I need to create a cron job that goes through it each day or so, adding any files that it doesn't find in the database.
The problem is that I don't know what is the best / least taxing way of doing this. I'm assuming a MySQL query would have to be run for each file on each cron run (to check if it's already indexed), so the script would unavoidably take a little while to run (which is okay; it's an automated process). However, because of this, my usual language of choice (PHP) would probably not suffice, as it is not designed to run long-running scripts like this (or is it...?).
It would obviously be nice, but I'm not fussed about deleting index entries for deleted files (if files actually get deleted, it's always manual cleaning up, and I don't mind just going into the database by hand to fix the index).
By the way, the scan would need to be recursive; the files are mostly arranged in an Artist/Album/Title.mp3 structure, but they aren't religiously ordered like this, and the script would certainly have to be able to fetch ID3 tags for new files. In fact, ideally, I would like the script to fetch ID3 tags for each file on every run, and either add a new row to the database or update the existing row if it has changed.
Anyway, I'm starting from the ground up with this, so the most basic advice first I guess (such as which programming language to use - I'm willing to learn a new one if necessary). Thanks a lot!
First a dumb question, would it not be possible to simply order the files by date added and only run the iterations through the files added in the last day? I'm not very familiar working with files, but it seems like it should be possible.
If all you want to do is improve the speed of your current code, I would recommend that you check that your data is properly indexed. It makes queries a lot faster if you search through a table's index. If you're searching through columns that aren't part of a key, you might want to change your setup. You should also avoid using "SELECT *" and instead use "SELECT COUNT(*)", as MySQL will then return a single integer instead of whole rows.
You can also do everything in a few MySQL queries, though it will increase the complexity of your PHP code. Call the array with information about all the files $files. Select the rows from the DB whose files match a file in $files. Something like this:
"SELECT id FROM MUSIC WHERE id IN ($files)"
Read the returned array and label it $db_files. Then find all files in the $files array that don't appear in the $db_files array using array_diff(). Label the missing files $missing_files. Then insert the files in $missing_files into the DB.
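If you prefer to push the comparison into MySQL rather than array_diff(), the same idea can be expressed with a temporary table and a LEFT JOIN. This is only a sketch, and it assumes the index table is called music and is keyed by filename:
CREATE TEMPORARY TABLE scanned_files (filename VARCHAR(255) PRIMARY KEY);

-- The scanning script inserts every path it finds on disk, in batches:
INSERT INTO scanned_files (filename) VALUES
  ('Artist/Album/Track01.mp3'),
  ('Artist/Album/Track02.mp3');

-- Files on disk that are not yet in the index:
SELECT s.filename
FROM scanned_files s
LEFT JOIN music m ON m.filename = s.filename
WHERE m.filename IS NULL;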
What kind of engine are you using? If you're using MyISAM, the whole table will be locked while updating it. But still, 25k rows is not that much, so it should update in a few minutes at most. If it is InnoDB, just update it, since it's row-level locked and you should still be able to use your table while updating it.
By the way, if you're not using any fulltext search on that table, I believe you should convert it to InnoDB, as you can then use foreign keys, and that would help you a lot when joining tables. Also, it scales better, AFAIK.

MySQL table modified timestamp

I have a test server that uses data from a test database. When I'm done testing, it gets moved to the live database.
The problem is, I have other projects that rely on the data now in production, so I have to run a script that grabs the data from the tables I need, deletes the data in the test DB and inserts the data from the live DB.
I have been trying to figure out a way to improve this model. The problem isn't so much in the migration, since the data only gets updated once or twice a week (without any action on my part). The problem is having the migration take place only when it needs to. I would like to have my migration script include a quick check against the live tables and the test tables and, if need be, make the move. If there haven't been updates, the script quits.
This way, I can include the update script in my other scripts and not have to worry if the data is in sync.
I can't use timestamps. For one, I have no control over the tables on the live side once they go live, and it also seems a bit silly to bulk up the tables just for convenience.
I tried doing a "SHOW TABLE STATUS FROM livedb" but because the tables are all InnoDB, there is no "Update Time", plus, it appears that the "Create Time" was this morning, leading me to believe that the database is backed up and re-created daily.
Is there any other property in the table that would show which of the two is newer? A "Newest Row Date" perhaps?
In short: Make the development-live updating first-class in your application. Instead of depending on the database engine to supply you with the necessary information to enable you to make a decision (to update or not to update ... that is the question), just implement it as part of your application. Otherwise, you're trying to fit a round peg into a square hole.
Without knowing what your data model is, and without understanding at all what your synchronization model is, you have a few options:
Match primary keys in the live database against the test database; when the maximum IDs differ, do an update (sketched below).
Use timestamps in a table to determine if it needs to be updated
Use the md5 hash of a database table and modification date (UTC) to determine if a table has changed.
Long story short: Database synchronization is very hard. Implement a solution which is specific to your application. There is no "generic" solution which will work ideally.
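As a rough sketch of the key and checksum comparisons above (the schema and table names are made up, and CHECKSUM TABLE reads the whole table, so don't run it constantly against large tables):
-- Compare the highest primary-key value on each side:
SELECT
  (SELECT MAX(id) FROM live_db.orders) AS live_max,
  (SELECT MAX(id) FROM test_db.orders) AS test_max;

-- Or compare whole-table checksums; any difference means a sync is needed:
CHECKSUM TABLE live_db.orders, test_db.orders;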
If you have an autoincrement in your tables, you could compare the maximum autoincrement values to see if they're different.
But which version of mysql are you using?
Rather than rolling your own, you could use a preexisting solution for keeping databases in sync. I've heard good things about SQLYog's SJA (see here). I've never used it myself, but I've been very impressed with their other programs.

Best way to archive live MySQL database

We have a live MySQL database that is 99% INSERTs, around 100 per second. We want to archive the data each day so that we can run queries on it without affecting the main, live database. In addition, once the archive is completed, we want to clear the live database.
What is the best way to do this without (if possible) locking INSERTs? We use INSERT DELAYED for the queries.
http://www.maatkit.org/ has mk-archiver
archives or purges rows from a table to another table and/or a file. It is designed to efficiently “nibble” data in very small chunks without interfering with critical online transaction processing (OLTP) queries. It accomplishes this with a non-backtracking query plan that keeps its place in the table from query to query, so each subsequent query does very little work to find more archivable rows.
Another alternative is to simply create a new database table each day. MyISAM does have some advantages for this, since INSERTs at the end of the table don't generally block anyway, and there is a MERGE table type to bring them all back together. A number of websites log their httpd traffic to tables like that.
With MySQL 5.1, there are also partitioned tables that can do much the same (a rough sketch follows).
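A minimal sketch of what a 5.1 partitioned log table could look like (table, column, and partition names are made up). Old days can then be dumped and dropped per partition without touching current inserts:
CREATE TABLE access_log (
  id BIGINT NOT NULL AUTO_INCREMENT,
  logged_at DATETIME NOT NULL,
  message VARCHAR(255),
  PRIMARY KEY (id, logged_at)
)
PARTITION BY RANGE (TO_DAYS(logged_at)) (
  PARTITION p20120301 VALUES LESS THAN (TO_DAYS('2012-03-02')),
  PARTITION p20120302 VALUES LESS THAN (TO_DAYS('2012-03-03')),
  PARTITION pmax VALUES LESS THAN MAXVALUE
);

-- Dump the old day's partition if you want to keep it, then:
ALTER TABLE access_log DROP PARTITION p20120301;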
I use MySQL partitioned tables and I've achieved wonderful results in all aspects.
Sounds like replication is the best solution for this. After the initial sync the slave gets updates via the Binary Log, thus not affecting the master DB at all.
More on replication.
mk-archiver is an elegant tool for archiving MySQL data.
http://www.maatkit.org/doc/mk-archiver.html
MySQL replication would work perfectly for this.
Master -> the live server.
Slave -> a different server on the same network.
Could you keep two mirrored databases around? Write to one and keep the second as an archive. Switch every, say, 24 hours (or however long you deem appropriate). Into the database that was the archive, insert all of today's activity; then the two databases should match. Use this as the new live DB. Take the archived database and do whatever you want with it. You can back up/extract/read all you want now that it's not being actively written to. (A table-level sketch of this swap is below.)
It's kind of like having a mirrored RAID where you can take one drive offline for backup, resync it, then take the other drive offline for backup.
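At the table level, the swap described above can be done with RENAME TABLE, which renames several tables in one atomic step; this is only a sketch with made-up table names:
CREATE TABLE events_standby LIKE events;

-- Atomic swap: the live table becomes the archive and the empty
-- standby takes its place, so INSERTs never see a missing table.
RENAME TABLE events TO events_archive,
             events_standby TO events;

-- events_archive can now be dumped, queried, or truncated freely.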