I would like to use test databases for feature branches.
Of course it would be best to create a gitlab ci environment on the fly (review apps style) and also create a test database on the target system with the same name. Unfortunately, this is not possible because the MySQL databases in the target system have fixed names, like xxx_1, xxx_2 etc. and this cannot be changed without moving to a different hosting provider.
So I would like to do something like "grab an empty test database from the available xxx_n, then empty it again when the branch is deleted".
How could this be handled with gitlab ci?
Can I set a variable on the project that says "feature branch Y already uses database xxx_4"?
Or should I put a table into the test database to store this information?
Using dynamic environments/variables and stop jobs might do the trick. Stop jobs run when the environment is "stopped": for feature branches without an associated MR, that is when the feature branch is deleted; if there is an open MR for the review app, it is when the MR is merged or closed.
Can I set a variable on the project that says "feature branch Y already uses database xxx_4"?
One way may be to put the db name directly in the environment name. Then the Environments API keeps track of this.
stages:
  - pre-deploy
  - deploy
determine_database:
  stage: pre-deploy
  image: python:3.9-slim
  script:
    - pip install python-gitlab
    - database_name=$(determine-database) # determine what database names are not currently in use
    - echo "database_name=${database_name}" > vars.env
  artifacts:
    reports: # automatically set $database_name variable in subsequent jobs
      dotenv: "vars.env"
deploy_review_app:
  stage: deploy
  environment:
    name: review/$CI_COMMIT_REF_SLUG/$database_name
    on_stop: teardown
  script:
    - echo "deploying review app for $CI_COMMIT_REF_NAME with database name configuration $database_name"
    - ... # steps to actually do the deploy
teardown: # this will trigger when the environment is stopped
  stage: deploy
  variables:
    GIT_STRATEGY: none # ensures this works even if the branch is deleted
  when: manual
  script:
    - echo "tearing down test database $database_name"
    - ... # actual script steps to stop env and cleanup database
  environment:
    name: review/$CI_COMMIT_REF_SLUG/$database_name
    action: "stop"
The implementation of the determine-database command may have to connect to your database to determine what database names are available (or perhaps you have a set of these provisioned in advance). You can then inspect the GitLab environments API to see what database names are still in use (since it's baked into the environment name).
For example, you might have something like this. Here, I am using the python-gitlab API wrapper just because it's most familiar to me, but the same principle can be applied to any method of calling the GitLab REST API.
#!/usr/bin/env python3
import gitlab
import os, sys, random
GITLAB_URL = os.environ['CI_SERVER_URL']
PROJECT_TOKEN = os.environ['MY_PROJECT_TOKEN'] # you generate and add this to your CI/CD variables!
PROJECT_ID = os.environ['CI_PROJECT_ID']
DATABASE_NAMES = ['xxx_1', 'xxx_2', 'xxx_3'] # or determine this programmatically by connecting to the DB
gl = gitlab.Gitlab(GITLAB_URL, private_token=PROJECT_TOKEN)
in_use_databases = []
project = gl.projects.get(PROJECT_ID)
for environment in project.environments.list(state='available', all=True):
    # the in-use database name is the string after the last '/' in the env name
    in_use_db_name = environment.name.split('/')[-1]
    in_use_databases.append(in_use_db_name)
available_databases = [name for name in DATABASE_NAMES if name not in in_use_databases]
if not available_databases: # bail if all databases are in use
    print('FATAL. no available databases', file=sys.stderr)
    raise SystemExit(1)
# otherwise pick one and output to stdout
db_name = random.choice(available_databases)
# optionally you could prepare the database here, too, instead of relying on the `on_stop` job.
print(db_name)
There is a potential concurrency problem here: two runs of determine_database on different branches could select the same database before either finishes. That could be addressed with resource locks (for example, GitLab CI's resource_group keyword).
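The selection step of determine-database can also be isolated into a pure function, which makes it testable without a GitLab instance. This is only a sketch: pick_database and its parameters are names invented for illustration, and it assumes the environment naming scheme from the pipeline above (database name after the last '/').

```python
import random


def pick_database(all_names, environment_names, rng=random):
    """Return a database name that no active environment is using.

    Assumes environment names look like 'review/<branch-slug>/<database>'.
    Raises LookupError when every database is already taken.
    """
    # The in-use database is whatever follows the last '/' in each name.
    in_use = {name.rsplit("/", 1)[-1] for name in environment_names}
    available = [name for name in all_names if name not in in_use]
    if not available:
        raise LookupError("no available databases")
    return rng.choice(available)


if __name__ == "__main__":
    active = ["review/feature-a/xxx_1", "review/feature-b/xxx_3"]
    # Only xxx_2 is free here, so the random choice has a single candidate.
    print(pick_database(["xxx_1", "xxx_2", "xxx_3"], active))
```

Raising an exception instead of calling exit lets the caller decide how the CI job should fail.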
# default: on
# description: mysqlchk
service mysqlchk
{
# this is a config for xinetd, place it in /etc/xinetd.d/
disable = no
flags = REUSE
socket_type = stream
type = UNLISTED
port = 9200
wait = no
user = root
server = /usr/bin/mysqlclustercheck
log_on_failure += USERID
only_from = 0.0.0.0/0
#
# Passing arguments to clustercheck
# <user> <pass> <available_when_donor=0|1> <log_file> <available_when_readonly=0|1> <defaults_extra_file>"
# Recommended: server_args = user pass 1 /var/log/log-file 0 /etc/my.cnf.local"
# Compatibility: server_args = user pass 1 /var/log/log-file 1 /etc/my.cnf.local"
# 55-to-56 upgrade: server_args = user pass 1 /var/log/log-file 0 /etc/my.cnf.extra"
#
# recommended to put the IPs that need
# to connect exclusively (security purposes)
per_source = UNLIMITED
}
Strangely, the script works fine when run manually, but not when it is run through xinetd (/etc/xinetd.d/).
In the mysqlclustercheck script I am using the --login-path= syntax instead of --user= and --password=.
The script runs fine from the command line, but the xinetd status showed signal 13. After debugging, I found that even a simple command like this does not work:
mysql_config_editor print --all >>/tmp/test.txt
No output is generated when it is run through xinetd (mysqlclustercheck).
Have you tried the following instead of /usr/bin/mysqlclustercheck?
server = /usr/bin/clustercheck
I am wondering if you could test your binary location with the linux which command.
This question was asked a long time ago, but it only just came to my attention.
First of all, as mentioned, the Percona Cluster Control script is called clustercheck, so make sure you are using the correct name and the correct path.
Secondly, since the script runs fine from the command line, it seems to me that xinetd does not know the path of the mysql client command when it runs the Cluster Control script.
Since the mysqlclustercheck script, as offered by Percona, uses only the binary name mysql without an absolute path, I suggest you do the following:
Find where mysql client command is located on your system:
ccloud@gal1:~> sudo -i
gal1:~ # which mysql
/usr/local/mysql/bin/mysql
gal1:~ #
then edit script /usr/bin/mysqlclustercheck and in the following line:
MYSQL_CMDLINE="mysql --defaults-extra-file=$DEFAULTS_EXTRA_FILE -nNE --connect-timeout=$TIMEOUT \
place the exact path of mysql client command you found in the previous step.
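To check whether a restricted PATH is the culprit, you can ask how a command name resolves under a given search path. A minimal Python sketch: the value /usr/bin:/bin is only a guess at what a script started by xinetd might see, not something taken from the xinetd documentation, so substitute your own.

```python
import shutil


def resolve(command, search_path):
    """Return the absolute path `command` resolves to under `search_path`, or None."""
    return shutil.which(command, path=search_path)


if __name__ == "__main__":
    # Assumed minimal PATH for a daemon-launched script (hypothetical value).
    restricted = "/usr/bin:/bin"
    for name in ("mysql", "sh"):
        print(name, "->", resolve(name, restricted))
```

If mysql resolves to None under the restricted path but is found in your login shell, hard-coding the absolute path in the script, as described above, is the straightforward fix.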
I also see that you are not using MySQL connection credentials to connect to the MySQL server. The mysqlclustercheck script, as offered by Percona, uses a user and password to connect to the MySQL server.
So normally, you should execute the script in the command line like:
gal1:~ # /usr/sbin/clustercheck haproxy haproxyMySQLpass
HTTP/1.1 200 OK
Content-Type: text/plain
Where haproxy/haproxyMySQLpass is the MySQL connection user/pass for HAProxy monitoring user.
Additionally, you should specify them to your script's xinetd settings like:
server = /usr/bin/mysqlclustercheck
server_args = haproxy haproxyMySQLpass
Last but not least, the signal 13 you are getting is most likely caused by the script writing output while it is run by xinetd. If, for example, you add a statement like this to your mysqlclustercheck:
echo "debug message"
you are probably going to see the broken pipe signal (SIGPIPE, number 13 on Linux).
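What a broken pipe looks like can be reproduced in a few lines of Python. This is an illustrative sketch, not a reproduction of xinetd itself: it writes to a pipe whose reading end has already been closed. Python ignores SIGPIPE by default, so the failure surfaces as BrokenPipeError (errno EPIPE) instead of killing the process, whereas a shell script in the same situation would be terminated by the signal.

```python
import errno
import os
import signal


def write_to_closed_pipe(message=b"debug message\n"):
    """Write to a pipe whose read end is already closed; return the errno seen."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # nobody is listening on the other end any more
    try:
        os.write(write_fd, message)
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None


if __name__ == "__main__":
    print("SIGPIPE number:", int(signal.SIGPIPE))
    print("write failed with EPIPE:", write_to_closed_pipe() == errno.EPIPE)
```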
Finally, I had issues with this script on SLES 12.3, and I eventually managed to run it as 'root' rather than as 'nobody'.
Hope it helps
I've seen lots of examples of making Docker containers for Rails applications. Typically they run a rails server and have a CMD that runs migrations/setup then brings up the Rails server.
If I'm spawning 5 of these containers at the same time, how does Rails handle multiple processes trying to initiate the migrations? I can see Rails checking the current schema version in the general query log (it's a MySQL database):
SELECT `schema_migrations`.`version` FROM `schema_migrations`
But I can see a race condition here if this happens at the same time on different Rails instances.
Considering that DDL is not transactional in MySQL and I don't see any locks happening in the general query log while running migrations (other than the per-migration transactions), it would seem that kicking them off in parallel would be a bad idea. In fact if I kick this off three times locally I can see two of the rails instances crashing when trying to create a table because it already exists while the third rails instance completes the migrations happily. If this was a migration that inserted something into the database it would be quite unsafe.
Is it then a better idea to run a single container that runs migrations/setup then spawns (for example) a Unicorn instance which in turn spawns multiple rails workers?
Should I be spawning N rails containers and one 'migration container' that runs the migration then exits?
Is there a better option?
I don't have any experience with Rails specifically, but let's look at this from a Docker and software engineering point of view.
The Docker team advocates, sometimes quite aggressively, that containers are about shipping applications. In this really great statement, Jerome Petazzoni says that it is all about separation of concerns. I feel that this is exactly the point you already figured out.
Running a rails container which starts a migration or setup might be good for initial deployment and probably often required during development. However, when going into production, you really should consider separating the concerns.
Thus I would say: have one image, which you use to run N Rails containers, and add a tools/migration/setup container, which you use for administrative tasks. Have a look at what the developers of the official rails image say about this:
It is designed to be used both as a throw away container (mount your source code and start the container to start your app), as well as the base to build other images off of.
When you look at that image there is no setup or migration command. It is totally up to the user how to use it. So when you need to run several containers just go ahead.
From my experience with mysql this works fine. You can run a data-only container to host the data, run a container with the mysql server and finally run a container for administrative tasks like backup and restore. For all three containers you can use the same image. Now you are free to access your database from let's say several Wordpress containers. This means clear separation of concerns. When you use docker-compose it is not that difficult to manage all those containers. Certainly there are already many third party containers and tools to also support you with setting up a complex application consisting of several containers.
Finally, you should decide whether Docker and the microservice architecture are right for your problem. As outlined in this article, there are some reasons against it, one of the core problems being that it adds a whole new layer of complexity. However, that is the case with many solutions, and I guess you are aware of this and willing to accept it.
docker run <container name> rake db:migrate
This starts your standard application container, but runs rake db:migrate instead of the CMD (rails server).
UPDATE: Suggested by Roman, the command would now be:
docker exec <container> rake db:migrate
Having hit the same problem when publishing to a Docker swarm, I am posting a solution partially borrowed from others.
Rails already has a mechanism to detect concurrent migrations, using a lock on the database. But it raises ActiveRecord::ConcurrentMigrationError where it should just wait.
One solution is a loop that, whenever ConcurrentMigrationError is raised, waits 5 seconds and then retries the migration.
It is especially important that all containers perform the migration: if the migration fails, all containers must fail.
Solution from coffejumper:
namespace :db do
namespace :migrate do
desc 'Run db:migrate and monitor ActiveRecord::ConcurrentMigrationError errors'
task monitor_concurrent: :environment do
loop do
puts 'Invoking Migrations'
Rake::Task['db:migrate'].reenable
Rake::Task['db:migrate'].invoke
puts 'Migrations Successful'
break
rescue ActiveRecord::ConcurrentMigrationError
puts 'Migrations Sleeping 5'
sleep(5)
end
end
end
end
Sometimes there are other tasks that must also run one at a time alongside the migration, such as after_party, cron setup, etc. The solution is then to use the same mechanism as Rails and wrap the rake tasks in a database lock:
Below, based on Rails 6 code, migrate_without_lock performs the needed migrations while with_advisory_lock acquires the database lock (raising ConcurrentMigrationError if the lock cannot be acquired).
module Swarm
class Migration
def migrate
with_advisory_lock { migrate_without_lock }
end
private
def migrate_without_lock
puts "Database migration"
Rake::Task['db:migrate'].invoke
puts "After_party migration"
Rake::Task['after_party:run'].invoke
...
puts "Migrations successful"
end
def with_advisory_lock
lock_id = generate_migrator_advisory_lock_id
MyAdvisoryLockBase.establish_connection(ActiveRecord::Base.connection_config) unless MyAdvisoryLockBase.connected?
connection = MyAdvisoryLockBase.connection
got_lock = connection.get_advisory_lock(lock_id)
raise ActiveRecord::ConcurrentMigrationError unless got_lock
yield
ensure
if got_lock && !connection.release_advisory_lock(lock_id)
raise ActiveRecord::ConcurrentMigrationError.new(
ActiveRecord::ConcurrentMigrationError::RELEASE_LOCK_FAILED_MESSAGE
)
end
end
MIGRATOR_SALT = 1942351734
def generate_migrator_advisory_lock_id
db_name_hash = Zlib.crc32(ActiveRecord::Base.connection_config[:database])
MIGRATOR_SALT * db_name_hash
end
end
# based on rails 6.1 AdvisoryLockBase
class MyAdvisoryLockBase < ActiveRecord::AdvisoryLockBase # :nodoc:
self.connection_specification_name = "MDAdvisoryLockBase"
end
end
Then as before, do a loop to wait
namespace :swarm do
desc 'Run migrations tasks after acquisition of lock on database'
task migrate: :environment do
result = 1
(1..10).each do |i|
Swarm::Migration.new.migrate
puts "Attempt #{i} successfully terminated"
result = 0
break
rescue ActiveRecord::ConcurrentMigrationError
seconds = rand(3..10)
puts "Attempt #{i} another migration is running => sleeping #{seconds}s"
sleep(seconds)
rescue => e
puts e
e.backtrace.each { |m| puts m }
break
end
exit(result)
end
end
Then in your startup script just launch the rake tasks
set -e
bundle exec rails swarm:migrate
exec bundle exec rails server -b "0.0.0.0"
Finally, since the migration tasks are run by every container, they must do nothing when the work is already done (as db:migrate does).
Using this solution, the order in which Swarm launches containers doesn't matter anymore AND if something goes wrong, all containers know the problem :-)
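The lock-and-retry pattern described above can be sketched independently of Rails. In the Python illustration below, every name (FakeDatabase, migrate_with_retry, run_workers) is hypothetical, and a threading.Lock acquired without blocking stands in for the database advisory lock; it is not how Rails implements it.

```python
import random
import threading
import time


class ConcurrentMigrationError(Exception):
    """Raised when another worker already holds the migration lock."""


class FakeDatabase:
    """In-memory stand-in for a database with a single advisory lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.applied_versions = []

    def migrate(self, version):
        # Non-blocking acquire: fail fast instead of waiting for the lock.
        if not self._lock.acquire(blocking=False):
            raise ConcurrentMigrationError("another migration is running")
        try:
            if version not in self.applied_versions:  # no-op when already done
                time.sleep(0.01)  # pretend the migration takes a while
                self.applied_versions.append(version)
        finally:
            self._lock.release()


def migrate_with_retry(db, version, attempts=10):
    """Retry with a random back-off; return the attempt number that succeeded."""
    for attempt in range(1, attempts + 1):
        try:
            db.migrate(version)
            return attempt
        except ConcurrentMigrationError:
            time.sleep(random.uniform(0.01, 0.05))
    raise ConcurrentMigrationError("gave up after %d attempts" % attempts)


def run_workers(count=5, version=1):
    """Start `count` workers that all try to apply the same migration."""
    db = FakeDatabase()
    workers = [
        threading.Thread(target=migrate_with_retry, args=(db, version))
        for _ in range(count)
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return db.applied_versions
```

Every worker goes through the migration step, but only the first one to hold the lock changes anything; the others find the version already applied and return.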
For single container id:
docker exec -it <container ID> bundle exec rails db:migrate
For multiple containers, repeat the command for each one; if there are something like 1000 of them, you will need a script to run it.
How can I trace MySQL queries on my Linux server as they happen?
For example I'd love to set up some sort of listener, then request a web page and view all of the queries the engine executed, or just view all of the queries being run on a production server. How can I do this?
You can log every query to a log file really easily:
mysql> SHOW VARIABLES LIKE "general_log%";
+------------------+----------------------------+
| Variable_name | Value |
+------------------+----------------------------+
| general_log | OFF |
| general_log_file | /var/run/mysqld/mysqld.log |
+------------------+----------------------------+
mysql> SET GLOBAL general_log = 'ON';
Do your queries (on any db). Grep or otherwise examine /var/run/mysqld/mysqld.log
Then don't forget to
mysql> SET GLOBAL general_log = 'OFF';
or the performance will plummet and your disk will fill!
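If you toggle the log from a script, a context manager makes it harder to forget the OFF step. A minimal sketch, assuming a DB-API style cursor with an execute method; general_log_enabled is a made-up name, and RecordingCursor is a stand-in that only records statements so the example runs without a server.

```python
from contextlib import contextmanager


@contextmanager
def general_log_enabled(cursor):
    """Turn the general query log on for the duration of a `with` block."""
    cursor.execute("SET GLOBAL general_log = 'ON'")
    try:
        yield
    finally:
        # Runs even if the body raises, so logging is not left on by accident.
        cursor.execute("SET GLOBAL general_log = 'OFF'")


class RecordingCursor:
    """Hypothetical stand-in for a real cursor; it only records statements."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)


if __name__ == "__main__":
    cursor = RecordingCursor()
    with general_log_enabled(cursor):
        cursor.execute("SELECT 1")
    print(cursor.statements)
```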
You can run the MySQL command SHOW FULL PROCESSLIST; to see what queries are being processed at any given time, but that probably won't achieve what you're hoping for.
The best method to get a history without having to modify every application using the server is probably through triggers. You could set up triggers so that every query run results in the query being inserted into some sort of history table, and then create a separate page to access this information.
Do be aware that this will probably considerably slow down everything on the server though, with adding an extra INSERT on top of every single query.
Edit: another alternative is the General Query Log, but having it written to a flat file would remove a lot of possibilities for flexibility of displaying, especially in real-time. If you just want a simple, easy-to-implement way to see what's going on though, enabling the GQL and then running tail -f on the logfile would do the trick.
Even though an answer has already been accepted, I would like to present what might even be the simplest option:
$ mysqladmin -u bob -p -i 1 processlist
This will print the current queries on your screen every second.
-u The mysql user you want to execute the command as
-p Prompt for your password (so you don't have to save it in a file or have the command appear in your command history)
-i The interval in seconds.
Use the --verbose flag to show the full process list, displaying the entire query for each process. (Thanks, nmat)
There is a possible downside: fast queries might not show up if they run between polls. For example, with an interval of one second, a query that takes .02 seconds and runs between two polls won't be seen.
Use this option preferably when you quickly want to check on running queries without having to set up a listener or anything else.
Run this convenient SQL query to see running MySQL queries. It can be run from any environment you like, whenever you like, without any code changes or overheads. It may require some MySQL permissions configuration, but for me it just runs without any special setup.
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND != 'Sleep';
The only catch is that you often miss queries which execute very quickly, so it is most useful for longer-running queries or when the MySQL server has queries which are backing up - in my experience this is exactly the time when I want to view "live" queries.
You can also add conditions to make it more specific, just like any SQL query.
e.g. Shows all queries running for 5 seconds or more:
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND != 'Sleep' AND TIME >= 5;
e.g. Show all running UPDATEs:
SELECT * FROM INFORMATION_SCHEMA.PROCESSLIST WHERE COMMAND != 'Sleep' AND INFO LIKE '%UPDATE %';
For full details see: http://dev.mysql.com/doc/refman/5.1/en/processlist-table.html
strace
The quickest way to see live MySQL/MariaDB queries is to use a debugger. On Linux you can use strace, for example:
sudo strace -e trace=read,write -s 2000 -fp $(pgrep -nf mysql) 2>&1
Since there are a lot of escaped characters, you can format strace's output by piping the command above (just add | between the two one-liners) into the following command:
grep --line-buffered -o '".\+[^"]"' | grep --line-buffered -o '[^"]*[^"]' | while read -r line; do printf "%b" $line; done | tr "\r\n" "\275\276" | tr -d "[:cntrl:]" | tr "\275\276" "\r\n"
You should then see fairly clean SQL queries in no time, without touching configuration files.
Obviously this won't replace the standard way of enabling logs, which is described below (which involves reloading the SQL server).
dtrace
Use MySQL probes to view the live MySQL queries without touching the server. Example script:
#!/usr/sbin/dtrace -q
pid$target::*mysql_parse*:entry /* This probe is fired when the execution enters mysql_parse */
{
printf("Query: %s\n", copyinstr(arg1));
}
Save above script to a file (like watch.d), and run:
pfexec dtrace -s watch.d -p $(pgrep -x mysqld)
Learn more: Getting started with DTracing MySQL
Gibbs MySQL Spyglass
See this answer.
Logs
Here are the steps, useful for development purposes.
Add these lines into your ~/.my.cnf or global my.cnf:
[mysqld]
general_log=1
general_log_file=/tmp/mysqld.log
Paths: /var/log/mysqld.log or /usr/local/var/log/mysqld.log may also work depending on your file permissions.
then restart your MySQL/MariaDB by (prefix with sudo if necessary):
killall -HUP mysqld
Then check your logs:
tail -f /tmp/mysqld.log
When finished, change general_log to 0 (so you can use it again in the future), then remove the file and restart the SQL server again: killall -HUP mysqld.
I'm in a particular situation where I do not have permissions to turn logging on, and wouldn't have permissions to see the logs if they were turned on. I could not add a trigger, but I did have permissions to call show processlist. So, I gave it a best effort and came up with this:
Create a bash script called "showsqlprocesslist":
#!/bin/bash
while [ 1 -le 1 ]
do
mysql --port=**** --protocol=tcp --password=**** --user=**** --host=**** -e "show processlist\G" | grep Info | grep -v processlist | grep -v "Info: NULL";
done
Execute the script:
./showsqlprocesslist > showsqlprocesslist.out &
Tail the output:
tail -f showsqlprocesslist.out
Bingo bango. Even though it's not throttled, it only took up 2-4% CPU on the boxes I ran it on. I hope maybe this helps someone.
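The grep filtering in the script can also be done in a few lines of Python, which is handy if you want to post-process the queries. A sketch only: extract_queries is a hypothetical helper, it expects the text produced by "show processlist\G", and it is slightly stricter than the grep chain because it only looks at lines that start with "Info:".

```python
def extract_queries(processlist_output):
    """Return the non-empty Info values from `show processlist\\G` output."""
    queries = []
    for line in processlist_output.splitlines():
        stripped = line.strip()
        if not stripped.startswith("Info:"):
            continue
        query = stripped[len("Info:"):].strip()
        # Skip idle connections and the monitoring query itself
        # (compared case-insensitively, unlike the original grep).
        if query == "NULL" or "processlist" in query.lower():
            continue
        queries.append(query)
    return queries


if __name__ == "__main__":
    sample = """\
*************************** 1. row ***************************
     Id: 11
   Info: NULL
*************************** 2. row ***************************
     Id: 12
   Info: SELECT * FROM orders WHERE id = 42
*************************** 3. row ***************************
     Id: 13
   Info: show processlist
"""
    print(extract_queries(sample))
```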
From a command line you could run:
watch --interval=[your-interval-in-seconds] "mysqladmin -u root -p[your-root-pw] processlist | grep [your-db-name]"
Replace the values [x] with your values.
Or even better:
mysqladmin -u root -p -i 1 processlist;
This is the easiest setup on a Linux Ubuntu machine I have come across. Crazy to see all the queries live.
Find and open your MySQL configuration file, usually /etc/mysql/my.cnf on Ubuntu. Look for the section that says “Logging and Replication”
#
# * Logging and Replication
#
# Both location gets rotated by the cronjob.
# Be aware that this log type is a performance killer.
log = /var/log/mysql/mysql.log
Just uncomment the “log” variable to turn on logging. Restart MySQL with this command:
sudo /etc/init.d/mysql restart
Now we’re ready to start monitoring the queries as they come in. Open up a new terminal and run this command to scroll the log file, adjusting the path if necessary.
tail -f /var/log/mysql/mysql.log
Now run your application. You’ll see the database queries start flying by in your terminal window. (make sure you have scrolling and history enabled on the terminal)
FROM http://www.howtogeek.com/howto/database/monitor-all-sql-queries-in-mysql/
Check out mtop.
I've been looking to do the same, and have cobbled together a solution from various posts, plus created a small console app to output the live query text as it's written to the log file. This was important in my case as I'm using Entity Framework with MySQL and I need to be able to inspect the generated SQL.
Steps to create the log file (some duplication of other posts, all here for simplicity):
Edit the file located at:
C:\Program Files (x86)\MySQL\MySQL Server 5.5\my.ini
Add "log=development.log" to the bottom of the file. (Note saving this file required me to run my text editor as an admin).
Use MySql workbench to open a command line, enter the password.
Run the following to turn on general logging, which will record all queries run:
SET GLOBAL general_log = 'ON';
To turn off:
SET GLOBAL general_log = 'OFF';
This will cause running queries to be written to a text file at the following location.
C:\ProgramData\MySQL\MySQL Server 5.5\data\development.log
Create / Run a console app that will output the log information in real time:
Source available to download here
Source:
using System;
using System.Configuration;
using System.IO;
using System.Threading;
namespace LiveLogs.ConsoleApp
{
class Program
{
static void Main(string[] args)
{
// Console sizing can cause exceptions if you are using a
// small monitor. Change as required.
Console.SetWindowSize(152, 58);
Console.BufferHeight = 1500;
string filePath = ConfigurationManager.AppSettings["MonitoredTextFilePath"];
Console.Title = string.Format("Live Logs {0}", filePath);
var fileStream = new FileStream(filePath, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite);
// Move to the end of the stream so we do not read in existing
// log text, only watch for new text.
fileStream.Position = fileStream.Length;
StreamReader streamReader;
// Commented lines are for duplicating the log output as it's written to
// allow verification via a diff that the contents are the same and all
// is being output.
// var fsWrite = new FileStream(@"C:\DuplicateFile.txt", FileMode.Create);
// var sw = new StreamWriter(fsWrite);
int rowNum = 0;
while (true)
{
streamReader = new StreamReader(fileStream);
string line;
string rowStr;
while (streamReader.Peek() != -1)
{
rowNum++;
line = streamReader.ReadLine();
rowStr = rowNum.ToString();
string output = String.Format("{0} {1}:\t{2}", rowStr.PadLeft(6, '0'), DateTime.Now.ToLongTimeString(), line);
Console.WriteLine(output);
// sw.WriteLine(output);
}
// sw.Flush();
Thread.Sleep(500);
}
}
}
}
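The same idea, following a log file and printing only newly appended lines, can be sketched in Python for systems without .NET. The function names and the half-second poll interval are arbitrary choices for this illustration; the path passed to follow is whatever your general_log_file points to.

```python
import time


def read_new_lines(log_file):
    """Return the complete lines appended to `log_file` since the last read."""
    lines = []
    while True:
        position = log_file.tell()
        line = log_file.readline()
        if not line.endswith("\n"):
            # Nothing new, or a partially written line: rewind and retry later.
            log_file.seek(position)
            break
        lines.append(line.rstrip("\n"))
    return lines


def follow(path, poll_seconds=0.5):
    """Print lines as they are appended to `path`, starting at its current end."""
    with open(path, "r", errors="replace") as log_file:
        log_file.seek(0, 2)  # jump to the end so existing content is skipped
        while True:
            for line in read_new_lines(log_file):
                print(line)
            time.sleep(poll_seconds)
```

Calling follow("/tmp/mysqld.log") runs until interrupted, much like tail -f.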
In addition to previous answers describing how to enable general logging, I had to modify one additional variable in my vanilla MySql 5.6 installation before any SQL was written to the log:
SET GLOBAL log_output = 'FILE';
The default setting was 'NONE'.
Gibbs MySQL Spyglass
AgilData launched recently the Gibbs MySQL Scalability Advisor (a free self-service tool) which allows users to capture a live stream of queries to be uploaded to Gibbs. Spyglass (which is Open Source) will watch interactions between your MySQL Servers and client applications. No reconfiguration or restart of the MySQL database server is needed (either client or app).
GitHub: AgilData/gibbs-mysql-spyglass
Learn more: Packet Capturing MySQL with Rust
Install command:
curl -s https://raw.githubusercontent.com/AgilData/gibbs-mysql-spyglass/master/install.sh | bash
If you want monitoring and statistics, then there is a good open-source tool: Percona Monitoring and Management.
But it is a server-based system, and it is not trivial to set up.
It also has a live demo system for testing.
How do I do backups in MySQL?
I'm hoping there'll be something better than just running mysqldump every "x" hours.
Is there anything like SQL Server has, where you can take a full backup each day, and then incrementals every hour, so if your DB dies you can restore up to the latest backup?
Something like the DB log, where as long as the log doesn't die, you can restore up to the exact point where the DB died?
Also, how do these things affect locking?
I'd expect the online transactions to be locked for a while if I do a mysqldump.
You might want to look at incremental backups.
mysqldump is a reasonable approach, but bear in mind that for some engines, this will lock your tables for the duration of the dump - and this has availability concerns for large production datasets.
An obvious alternative to this is mk-parallel-dump from Maatkit (http://www.maatkit.org/) which you should really check out if you're a mysql administrator. This dumps multiple tables or databases in parallel using mysqldump, thereby decreasing the amount of total time your dump takes.
If you're running in a replicated setup (and if you're using MySQL for important data in production, you have no excuses not to be doing so), taking dumps from a replication slave dedicated to the purpose will prevent any lock issues from causing trouble.
The next obvious alternative - on Linux, at least - is to use LVM snapshots. You can lock your tables, snapshot the filesystem, and unlock the tables again; then start an additional MySQL using a mount of that snapshot, dumping from there. This approach is described here: http://www.mysqlperformanceblog.com/2006/08/21/using-lvm-for-mysql-backup-and-replication-setup/
now i am beginning to sound like a marketeer for this product. i answered a question with it here, then i answered another with it again here.
in a nutshell, try sqlyog (enterprise in your case) from webyog for all your mysql requirements. it not only schedules backups, but also schedules synchronization so you can actually replicate your database to a remote server.
it has a free community edition as well as an enterprise edition. i recommend the latter to you though i also recommend you start with the comm edition and first see how you like it.
I use mysqlhotcopy, a fast on-line hot-backup utility for local MySQL databases and tables. I'm pretty happy with it.
the Percona guys made an open source alternative to innobackup ...
Xtrabackup
https://launchpad.net/percona-xtrabackup/
Read this article about XtraDB
http://www.linux-mag.com/cache/7356/1.html
You might want to supplement your current offline backup scheme with MySQL replication.
Then if you have a hardware failure you can just swap machines. If you catch the failure quickly your users won't even notice any downtime or data loss.
I use a simple script that dumps the mysql database into a tar.gz file, encrypts it using gpg and sends it to a mail account (Google Mail, but that's irrelevant really)
The script is a Python script, which basically runs the following command, and emails the output file.
mysqldump -u theuser -pmypassword thedatabase | gzip -9 - | gpg -e -r 12345 -r 23456 > 2008_01_02.tar.gz.gpg
This is the entire backup. It also has the web-backup part, which just tar/gzips/encrypts the files. It's a fairly small site, so the web backups are much less than 20MB, so can be sent to the GMail account without problem (the MySQL dumps are tiny, about 300KB compressed). It's extremely basic, and won't scale very well. I run it once a week using cron.
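For reference, the same mysqldump | gzip | gpg pipeline can be built without a shell string, which avoids quoting problems in passwords. This is a Python 3 sketch under assumptions: build_pipeline and run_backup are names invented here, and the password is still passed on the command line, so it may be visible to other users on the machine.

```python
import subprocess


def build_pipeline(user, password, database, recipients):
    """Return the argument lists for mysqldump, gzip and gpg, in pipe order."""
    gpg = ["gpg", "-e"]
    for recipient in recipients:
        gpg += ["-r", recipient]
    return [
        ["mysqldump", "--user=" + user, "--password=" + password, database],
        ["gzip", "-9"],
        gpg,
    ]


def run_backup(commands, output_path):
    """Chain the commands with pipes and write the last one's output to a file.

    Returns the exit status of every process, in pipe order.
    """
    with open(output_path, "wb") as output:
        processes = []
        previous_stdout = None
        for index, argv in enumerate(commands):
            is_last = index == len(commands) - 1
            process = subprocess.Popen(
                argv,
                stdin=previous_stdout,
                stdout=output if is_last else subprocess.PIPE,
            )
            if previous_stdout is not None:
                # Close our copy so the upstream process sees a broken pipe
                # if the downstream process exits early.
                previous_stdout.close()
            previous_stdout = process.stdout
            processes.append(process)
        return [process.wait() for process in processes]
```

A call such as run_backup(build_pipeline("theuser", "mypassword", "thedatabase", ["12345", "23456"]), "backup.sql.gz.gpg") corresponds to the one-liner above; any non-zero entry in the returned list means that stage failed.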
I'm not quite sure how we're supposed to put longish scripts in answers, so I'll just shove it as a code-block..
#!/usr/bin/env python
#encoding:utf-8
#
# Creates a GPG encrypted web and database backups, and emails it
import os, sys, time, commands
################################################
### Config
DATE = time.strftime("%Y-%m-%d_%H-%M")
# MySQL login
SQL_USER = "mysqluser"
SQL_PASS = "mysqlpassword"
SQL_DB = "databasename"
# Email addresses
BACKUP_EMAIL=["email1@example.com", "email2@example.com"] # Array of email(s)
FROM_EMAIL = "root@myserver.com" # Only one email
# Temp backup locations
DB_BACKUP="/home/backupuser/db_backup/mysite_db-%(date)s.sql.gz.gpg" % {'date':DATE}
WEB_BACKUP="/home/backupuser/web_backup/mysite_web-%(date)s.tar.gz.gpg" % {'date':DATE}
# Email subjects
DB_EMAIL_SUBJECT="%(date)s/db/mysite" % {'date':DATE}
WEB_EMAIL_SUBJECT="%(date)s/web/mysite" % {'date':DATE}
GPG_RECP = ["MrAdmin","MrOtherAdmin"]
### end Config
################################################
################################################
### Process config
GPG_RECP = " ".join(["-r %s" % (x) for x in GPG_RECP]) # Format GPG_RECP as arg
sql_backup_command = "mysqldump -u %(SQL_USER)s -p%(SQL_PASS)s %(SQL_DB)s | gzip -9 - | gpg -e %(GPG_RECP)s > %(DB_BACKUP)s" % {
'GPG_RECP':GPG_RECP,
'DB_BACKUP':DB_BACKUP,
'SQL_USER':SQL_USER,
'SQL_PASS':SQL_PASS,
'SQL_DB':SQL_DB
}
web_backup_command = "cd /var/www/; tar -c mysite.org/ | gzip -9 | gpg -e %(GPG_RECP)s > %(WEB_BACKUP)s" % {
'GPG_RECP':GPG_RECP,
'WEB_BACKUP':WEB_BACKUP,
}
# end Process config
################################################
################################################
### Main application
def main():
    """Main backup function"""
    print "Backup commencing at %s" % (DATE)
    # Run commands
    print "Creating db backup..."
    sql_status,sql_cmd_out = commands.getstatusoutput(sql_backup_command)
    if sql_status == 0:
        db_file_size = round(float( os.stat(DB_BACKUP)[6] ) /1024/1024, 2) # Get file-size in MB
        print "..successful (%.2fMB)" % (db_file_size)
        try:
            send_mail(
                send_from = FROM_EMAIL,
                send_to = BACKUP_EMAIL,
                subject = DB_EMAIL_SUBJECT,
                text = "Database backup",
                files = [DB_BACKUP],
                server = "localhost"
            )
            print "Sending db backup successful"
        except Exception,errormsg:
            print "Sending db backup FAILED. Error was:",errormsg
        #end try
        # Remove backup file
        print "Removing db backup..."
        try:
            os.remove(DB_BACKUP)
            print "...successful"
        except Exception, errormsg:
            print "...FAILED. Error was: %s" % (errormsg)
        #end try
    else:
        print "Creating db backup FAILED. Output was:", sql_cmd_out
    #end if sql_status
    print "Creating web backup..."
    web_status,web_cmd_out = commands.getstatusoutput(web_backup_command)
    if web_status == 0:
        web_file_size = round(float( os.stat(WEB_BACKUP)[6] ) /1024/1024, 2) # File size in MB
        print "..successful (%.2fMB)" % (web_file_size)
        try:
            send_mail(
                send_from = FROM_EMAIL,
                send_to = BACKUP_EMAIL,
                subject = WEB_EMAIL_SUBJECT,
                text = "Website backup",
                files = [WEB_BACKUP],
                server = "localhost"
            )
            print "Sending web backup successful"
        except Exception,errormsg:
            print "Sending web backup FAILED. Error was: %s" % (errormsg)
        #end try
        # Remove backup file
        print "Removing web backup..."
        try:
            os.remove(WEB_BACKUP)
            print "...successful"
        except Exception, errormsg:
            print "...FAILED. Error was: %s" % (errormsg)
        #end try
    else:
        print "Creating web backup FAILED. Output was:", web_cmd_out
    #end if web_status
#end main
################################################
################################################
# Send email function
# needed email libs..
import smtplib
from email.MIMEMultipart import MIMEMultipart
from email.MIMEBase import MIMEBase
from email.MIMEText import MIMEText
from email.Utils import COMMASPACE, formatdate
from email import Encoders
def send_mail(send_from, send_to, subject, text, files=[], server="localhost"):
    assert type(send_to)==list
    assert type(files)==list
    msg = MIMEMultipart()
    msg['From'] = send_from
    msg['To'] = COMMASPACE.join(send_to)
    msg['Date'] = formatdate(localtime=True)
    msg['Subject'] = subject
    msg.attach( MIMEText(text) )
    for f in files:
        part = MIMEBase('application', "octet-stream")
        try:
            part.set_payload( open(f,"rb").read() )
        except Exception, errormsg:
            raise IOError("File not found: %s"%(errormsg))
        Encoders.encode_base64(part)
        part.add_header('Content-Disposition', 'attachment; filename="%s"' % os.path.basename(f))
        msg.attach(part)
    #end for f
    smtp = smtplib.SMTP(server)
    smtp.sendmail(send_from, send_to, msg.as_string())
    smtp.close()
#end send_mail
################################################
if __name__ == '__main__':
    main()
You can make full dumps of InnoDB databases/tables without locking (downtime) via mysqldump with the "--single-transaction --skip-lock-tables" options. This works well for making weekly snapshots plus daily/hourly binary log increments (see "Using the Binary Log to Enable Incremental Backups").
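As a rough illustration of the options above, here is a minimal Python 3 sketch that builds and runs such a dump. The helper names `innodb_dump_command` and `dump_to_file` are made up for this example, and it assumes the MySQL credentials come from an option file such as ~/.my.cnf rather than the command line.

```python
import subprocess

def innodb_dump_command(database):
    """Return the mysqldump argv for a non-locking dump of an InnoDB database."""
    return [
        "mysqldump",
        "--single-transaction",  # dump from one consistent InnoDB snapshot
        "--skip-lock-tables",    # do not take table locks for the dump
        database,
    ]

def dump_to_file(database, outfile):
    """Run the dump and write the SQL to outfile (hypothetical helper).

    Assumes mysqldump is on PATH and can log in without extra arguments.
    """
    with open(outfile, "wb") as out:
        subprocess.check_call(innodb_dump_command(database), stdout=out)
```

Calling `dump_to_file("databasename", "/tmp/databasename.sql")` from a weekly cron job would give the snapshot half of the scheme; the binary log increments are handled separately.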
@Jake,
Thanks for the info.
Now, it looks like only the commercial version has backup features.
Isn't there ANYTHING built into MySQL to do decent backups?
The official MySQL page even recommends things like "well, you can copy the files, AS LONG AS THEY'RE NOT BEING UPDATED"...
The problem with a straight backup of the mysql database folder is that the backup will not necessarily be consistent, unless you do a write-lock during the backup.
I run a script that iterates through all of the databases, doing a mysqldump and gzip on each to a backup folder, and then back up that folder to tape.
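That kind of script is not shown here, but a minimal Python 3 sketch of the idea could look like the following. All function names, the skip list, and the `.sql.gz` naming are assumptions for illustration, and the mysql and mysqldump clients are assumed to pick up credentials from an option file.

```python
import gzip
import os
import subprocess

# Schemas that are not worth dumping (an assumption; adjust to taste)
SKIP = {"information_schema", "performance_schema"}

def list_databases():
    """Ask the server for its database names, one per output line."""
    out = subprocess.check_output(["mysql", "-N", "-B", "-e", "SHOW DATABASES"])
    return [name for name in out.decode().splitlines() if name and name not in SKIP]

def backup_path(backup_dir, database):
    """Where the compressed dump of a database is written."""
    return os.path.join(backup_dir, database + ".sql.gz")

def dump_database(database, backup_dir):
    """Stream mysqldump output for one database into a gzip file."""
    target = backup_path(backup_dir, database)
    proc = subprocess.Popen(["mysqldump", database], stdout=subprocess.PIPE)
    with gzip.open(target, "wb") as out:
        for chunk in iter(lambda: proc.stdout.read(65536), b""):
            out.write(chunk)
    proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError("mysqldump failed for %s" % database)
    return target

def dump_all(backup_dir):
    """Dump every database into backup_dir and return the files written."""
    return [dump_database(db, backup_dir) for db in list_databases()]
```

The folder passed to `dump_all` is then what gets copied to tape (or wherever the backups end up).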
This, however, means that there is no such thing as incremental backups, since the nightly dump is a complete dump. But I would argue that this could be a good thing, since a restore from a full backup will be a significantly quicker process than restoring from incrementals - and if you are backing up to tape, it will likely mean gathering a number of tapes before you can do a full restore.
In any case, whichever backup plan you go with, make sure to do a trial restore to ensure that it works, and get an idea of how long it might take, and exactly what the steps are that you need to go through.
The correct way to run incremental or continuous backups of a MySQL server is with binary logs.
To start with, lock all of the tables or bring the server down. Use mysqldump to make a backup, or just copy the data directory. You only have to do this once, or any time you want a FULL backup.
Before you bring the server back up, make sure binary logging is enabled.
To take an incremental backup, log in to the server and issue a FLUSH LOGS command. Then back up the most recently closed binary log file.
If you have all InnoDB tables, it's simpler to just use InnoDB Hot Backup (not free) or mysqldump with the --single-transaction option (you'd better have a lot of memory to handle the transactions).
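The incremental step above (FLUSH LOGS, then copy the most recently closed binary log) could be sketched in Python 3 roughly as follows. This is only an outline under assumptions: the function names are invented, the binary log index file is assumed to list the logs one per line in order, and mysqladmin is assumed to be able to log in via an option file.

```python
import os
import shutil
import subprocess

def closed_binlogs(index_lines):
    """Return every binary log listed in the index except the last one.

    The last entry is the log the server is currently writing to.
    """
    names = [line.strip() for line in index_lines if line.strip()]
    return names[:-1]

def incremental_backup(index_path, backup_dir):
    """Rotate the binary log and copy the one that was just closed."""
    # Equivalent to issuing FLUSH LOGS on the server
    subprocess.check_call(["mysqladmin", "flush-logs"])
    with open(index_path) as index:
        closed = closed_binlogs(index)
    if not closed:
        return None
    # Index entries may be relative to the directory holding the index
    source = os.path.join(os.path.dirname(index_path), closed[-1])
    shutil.copy2(source, backup_dir)
    return source
```

The index path (something like the data directory plus the log base name with an `.index` suffix) depends on how binary logging was configured, so it is left as a parameter here.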
Binary logs are probably the correct way to do incremental backups, but if you don't trust binary file formats for permanent storage here is an ASCII way to do incremental backups.
mysqldump is not a bad format; the main problem is that it outputs each table's rows as one big line. The following trivial sed will split its output along record borders:
mysqldump --opt -p | sed -e "s/,(/,\n(/g" > database.dump
The resulting file is pretty diff-friendly, and I've been keeping them in a standard SVN repository fairly successfully. That also allows you to keep a history of backups, if you find that the last version got borked and you need last week's version.
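For anyone who would rather do the splitting outside of sed, the same substitution can be written in Python 3. This is a sketch of the identical naive rule, and `split_records` is just a name chosen for the example; like the sed version, it will also break a line wherever the two characters ",(" happen to occur inside string data.

```python
def split_records(dump_text):
    """Put each row of a multi-row INSERT on its own line.

    Mirrors the sed expression s/,(/,\\n(/g: every ",(" becomes ",\\n(".
    """
    return dump_text.replace(",(", ",\n(")
```

Reading the dump with `open(...).read()`, passing it through `split_records`, and writing the result out gives a file that diffs row by row.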
This is a pretty solid solution for Linux shell. I have been using it for years:
http://sourceforge.net/projects/automysqlbackup/
Does rolling backups: daily, monthly, yearly
Lots of options
@Daniel,
in case you are still interested, there is a newish (new to me) solution shared by Paul Galbraith: ibbackup, a tool from Oracle that allows for online backup of InnoDB tables, which, to quote Paul,
"when used in conjunction with innobackup, has worked great in creating a nightly backup, with no downtime during the backup"
More detail can be found on Paul's blog.
Sounds like you are talking about transaction rollback.
So in terms of what you need, if you have the logs containing all historical queries, isn't that the backup already? Why do you need an incremental backup which is basically a redundant copy of all the information in DB logs?
If so, why don't you just use mysqldump and do the backup every once in a while?