Making Database Backups

It's important to back up your databases in case tables are lost or damaged. If a serious system crash occurs, you want to be able to restore your tables to the state they were in at the time of the crash with as little data loss as possible. Likewise, a user who issues an unwise DROP DATABASE or DROP TABLE may show up at your door requesting that you perform data recovery.

Database backups also are useful for copying databases to another server. Most commonly, a database is transferred to a server running on another host, but you can also transfer data to a different server running on the same host. You might do this if you're testing a server for a new release of MySQL and want to use it with some real data from your production server.

Another use for a backup is to set up a replication server, because one of the first steps in setting up a slave server is to take a snapshot of the master server at a specific point in time. The backup serves as this snapshot, and loading it into the slave server brings it up to date with respect to the master server. Thereafter, updates made on the master server are replicated to the slave server through the standard replication protocol. The procedure for setting up replication is discussed in Chapter 11.

There are two general categories of database backups:

Text backups made by using mysqldump to write table contents into dump files. These files consist of CREATE TABLE and INSERT statements and can be reloaded into the server later to restore the tables.
Binary backups made by directly copying the files containing table contents. This type of backup can be made in various ways. For example, you can use programs such as mysqlhotcopy, cp, tar, or cpio.

Each method has its own advantages and disadvantages. Some of the factors to consider are whether you can leave the server running, the time needed to make the backup, and portability of the backup.

mysqldump operates in cooperation with the MySQL server, so you can use it while the server is running. Binary-backup methods involve file copy operations that are done external to the server. Some of these methods require that you stop the server. For those that do not, you still must take steps to ensure that the server does not modify the tables while you copy them.
mysqldump is slower than binary-backup techniques because the dump operation involves transferring the information over the network connection between mysqldump and the server. Binary-backup methods operate by copying files at the filesystem level and require no network traffic.
mysqldump generates text files containing SQL statements. These files are portable to other machines, even those with a different hardware architecture. They are therefore usable for copying databases from one server to another. Files generated by direct-copy binary backup methods may or may not be portable to other machines. It depends on whether the files correspond to tables that use a machine independent storage format. ISAM tables do not satisfy this constraint. For example, copying ISAM table files from Solaris on SPARC to Solaris on SPARC will work, but copying them from Solaris on SPARC to Solaris on Intel or to Mac OS X will not work. MyISAM and InnoDB tables normally are machine independent. For those storage engines, directly copied files can be moved to a server running on a machine with a different hardware architecture. For further discussion of the portability characteristics of various storage engines, see Chapter 2, "MySQL SQL Syntax and Use."

Whichever backup method you choose, there are certain principles to which you should adhere to ensure the best results if you ever need to restore database contents:

Perform backups regularly. Set a schedule and stick to it.
Configure the server to perform binary logging (see "Maintaining Log Files," in Chapter 11). The binary logs can help when you need to restore databases after a crash: After you use your backup files to restore the databases to the state they were in at the time of the backup, you can re-apply the changes that occurred after the backup was made by running the statements contained in the logs. This restores the tables in the databases to their state at the time the crash occurred.
Use a consistent and meaningful naming scheme for your backup files. Names like backup1, backup2, and so forth are not particularly helpful. When it comes time to perform a restore operation, you'll waste time figuring out what's in the files. You may find it useful to construct backup filenames using database names and dates. For example, if you dump the sampdb database on January 2, 2005, you might name the backup file sampdb-2005-01-02.
Put your backup files on a different filesystem than the one you use for your databases. This reduces the likelihood of filling up the filesystem containing the data directory as a result of generating backups. Also, if the filesystem where you store the backups is on a different physical drive, you further reduce the extent of damage that can be caused by drive failure, because loss of any one drive cannot destroy both your data directory and your backups.
Expire your backup files periodically to keep them from filling your disk. One way to do this is use file rotation techniques. Chapter 11 discusses these techniques in relation to log files, but the same principles apply to backup file expiration as well.
Include your database backup files in your regular filesystem backups. If you have a complete crash that wipes out not only your data directory but also the disk drive containing your database backups, you'll be in real trouble. Back up your log files, too.

Making Backups with `mysqldump`

The mysqldump program creates text dump files. By default, it writes a dump file in SQL format consisting of CREATE TABLE statements that create the tables being dumped and INSERT statements containing the data for the rows in the tables. To re-create the dumped tables later, reload the dump file into MySQL by using it as input to mysql. (Don't use mysqlimport to read SQL-format mysqldump output; mysqlimport expects to read raw data, not SQL statements.)

To back up all tables from all databases into a file, you can use a command like this:

% mysqldump --opt --all-databases > /archive/mysql/dump-all.2005-01-02

However, that produces a rather large dump file if you have a lot of data. You can dump a single database into a file as follows:

% mysqldump --opt sampdb > /archive/mysql/sampdb.2005-01-02

The beginning of the output file will look something like this:

-- MySQL dump 10.7
--
-- Host: localhost    Database: sampdb
-- ------------------------------------------------------
-- Server version   4.1.6-gamma-log
--
-- Table structure for table `absence`
--
DROP TABLE IF EXISTS `absence`;
CREATE TABLE `absence` (
  `student_id` int(10) unsigned NOT NULL default '0',
  `date` date NOT NULL default '0000-00-00',
  PRIMARY KEY  (`student_id`,`date`),
  CONSTRAINT `absence_ibfk_1` FOREIGN KEY (`student_id`)
  REFERENCES `student` (`student_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
--
-- Dumping data for table `absence`
--
LOCK TABLES `absence` WRITE;
INSERT INTO `absence` VALUES (3,'2004-09-03'),(5,'2004-09-03'),
(10,'2004-09-06'),(10,'2004-09-09'),(17,'2004-09-07'),(20,'2004-09-07');
UNLOCK TABLES;
...

The rest of the file consists of more CREATE TABLE and INSERT statements.

Dump files often are large, so you'll likely want to do what you can to make them smaller. One way to reduce their size is to use the --opt option, which optimizes the dump process to generate less output. One effect of this option is to cause mysqldump to write multiple-row INSERT statements. These not only take less space than the equivalent set of single-row INSERT statements, they can be processed more quickly when you reload the dump file later.

Another way to reduce the size of a dump file is to compress it. On Windows, you can use WinZip or similar program to compress the dump and produce a file in Zip format. On Unix, you might use gzip instead. You can even compress the backup as you generate it by using a command pipeline:

% mysqldump --opt sampdb | gzip > /archive/mysql/sampdb.2005-01-02.gz

If you find large dump files difficult to manage, it's possible to dump the contents of individual tables by naming them following the database name on the mysqldump command line. mysqldump will dump just the named tables rather than all the tables in the database, resulting in smaller, more manageable files. The following example shows how to dump subsets of the sampdb tables into separate files:

% mysqldump --opt sampdb member president > hist-league.sql
% mysqldump --opt sampdb student score grade_event absence > gradebook.sql

mysqldump has many options. Some of those that you may find useful include the following:

Typically, you name a database on the mysqldump command line, optionally followed by specific table names. To dump several databases at once, use the --databases option. mysqldump will interpret all names as database names and dump all the tables in each of them. To dump all of a server's databases, use --all-databases. In this case, you supply no database or table name arguments. Both --databases and --all-databases cause the output for each database to be preceded by CREATE DATABASE IF NOT EXISTS and USE statements.
Be careful with the --all-databases option if you intend to load the dump output into another server: The dump will include the grant tables in the mysql database, and you may not really want to replace the other server's grant tables.
By default, mysqldump dumps both table structure (the CREATE TABLE statements) and table contents (the INSERT statements). To dump just one or the other, use the --no-create-info or --no-data options.
As already mentioned, the --opt option optimizes the dump process. It turns on other options that speed up dumping the data. In addition, the dump file is written in such a way that it can be processed more quickly later when loaded back into the server.
Note: The --opt option is on by default as of MySQL 4.1.0, so you need not specify it explicitly unless you have an older version. If you really want an unoptimized dump for MySQL 4.1.0 or later, use the --skip-opt option.
Making backups using --opt is probably the most common method because of the benefits for backup speed. Be warned, however, that the --opt option does have a price; what --opt optimizes is your backup procedure, not access by other clients to the database. The --opt option prevents anyone from updating any of the tables that you're dumping because it locks all the tables at once. You can easily see for yourself the effect of this on general database access. Just try making a backup at the time of day when your database is normally most heavily used. It won't take long for your phone to start ringing with people calling to find out what's going on. (I'd appreciate it if you would refrain from asking how I happen to know this.)
--opt is useful for generating backup files that you intend to use for periodically refreshing the contents of another database (for example, a database on another server). That's because it automatically enables the --add-drop-table option, which tells mysqldump to precede each CREATE TABLE statement in the file with a DROP TABLE IF NOT EXISTS statement for the same table. When you take the backup file and load it into the second database later, you won't get an error if the tables already exist. If you're running a second test server that's not a replication slave, you can use this technique to reload it periodically with a copy of the data from the databases on your production server.
One effect of --opt is that it enables the --extended-insert option that causes mysqldump to write multiple-row INSERT statements. This is a disadvantage if you want a more readable dump file. To produce single-row INSERT statements, use the --skip-extended-insert option.
The combination of --flush-logs and --lock-tables is helpful for checkpointing your database. --lock-tables locks all the tables that you're dumping, and --flush-logs closes and reopens the log files. If binary logging is enabled, flushing the logs creates a new binary log file that will contain only those statements that modify tables subsequent to the checkpoint. This synchronizes your log to the time of the backup. (The downside is that locking all the tables is not so good for client access during the backups if you have clients that need to perform updates.)
If you use --flush-logs to checkpoint the logs to the time of the backup, it's probably best to dump entire databases at a time. During restore operations, it's common to extract log contents on a per-database basis. If you dump individual tables, it's much more difficult to synchronize log checkpoints against your backup files. (There is no option for extracting updates for individual tables, so you'll have to extract them yourself.)
If you're dumping InnoDB tables, the --single-transaction option dumps the tables within a transaction so that you get a consistent backup.

mysqldump has several other options; consult Appendix F for more information.

Making Binary Database Backups

A method for backing up databases or tables that doesn't involve mysqldump is to copy table files directly. Typically this is done using a special program developed for the task (such as mysqlhotcopy or InnoDB Hot Backup) or regular filesystem utilities (such as cp, tar, or cpio). There are two key points to observe when you use a direct-copy backup method:

You must make sure the tables aren't being used. If the server is changing a table while you're copying it, the copy will be worthless. The best way to ensure the integrity of your backups is to stop the server, copy the files, and restart the server. Some binary backup methods in fact require that you stop the server. If you don't want to stop the server (and the backup method doesn't require that you do so), use the read-only locking protocol described in "Performing Database Maintenance with the Server Running" earlier in this chapter. That protocol prevents the server from changing the tables while you're copying them.
You must copy all files that are required to restore the tables that you're backing up. Direct-copy methods are easiest to use for storage engines such as MyISAM that represent a given table using a unique set of files in the database directory. To back up a MyISAM table, you need copy only its .frm, .MYD, and .MYI files. For engines such as InnoDB or BDB, you must copy additional files because information about table contents is stored in files that are shared among tables or in storage engine-specific log files.

If you make a binary backup, beware of symbolic links, such as symlinks in the data directory to database directories or symlinks to MyISAM data or index files. These present a problem because your file-copying technique might copy only the symlinks and not the data that they point to.

Making a Complete Binary Backup

A complete binary backup includes all files in which table contents are stored and any log files that are used by specific storage engines. For good measure, you should also copy the binary logs. If the server is a replication slave, copy the relay log files and the master.info and relay-log.info files. Also, the slave may have created files with names of the form SQL_LOAD-xxx in its temporary file directory. You should back these up, too; they're needed for LOAD DATA statements. These files will be in the directory named by the --slave-load-tmpdir option (which, if not given, defaults to the value of the tmpdir system variable). To make it easier to identify these files for backup, create a directory to be devoted to use by the slave server, and start the slave with the --slave-load-tmpdir option set to that value.

To properly copy all the files just discussed, you must stop the server so that storage engines close their log files and the server closes any other logs that it is writing.

All of that sounds like a lot of stuff to back up, but it is not necessarily complicated to do so. For example, all of your database directories are under the data directory, and logs and information files are created there by default as well. In this case, you can make a backup by stopping the server and copying the entire data directory. For example, to create a backup as a compressed tar file under the /archive/mysql directory, change location into the data directory and use a command like this:

% tar czf /archive/mysql/dump-all-2005-04-11.tar.gz .

Making a Partial Binary Backup

Making a partial binary backup by copying files is similar to making a complete backup, except that you copy only a subset of the full set of files. In some cases, a partial backup can be made without stopping the server if you use the read-locking protocol to lock the tables that you want to copy. This is true if a database contains only MyISAM tables, for example. Suppose that you want to back up the mydb database located under the data directory /usr/local/mysql/data and store the backup under the archive directory /archive/mysql. Read-lock the tables, and then execute these commands:

% mkdir /archive/mysql/mydb
% cd /usr/local/mysql/data/mydb
% cp -r . /archive/mysql/mydb

After executing these commands, the /archive/mysql/mydb directory contains a copy of the mydb database. Individual tables can be backed up like this:

% mkdir /archive/mysql/mydb
% cd /usr/local/mysql/data/mydb
% cp tbl1.* /archive/mysql/mydb
% cp tbl2.* /archive/mysql/mydb
...

When you're done backing up, you can release the table locks. (As an alternative to using the read-locking protocol, stop the server before copying the files and restart it afterward.)

Making Backups with `mysqlhotcopy`

mysqlhotcopy is a Perl DBI script that helps you make database backups. The "hot" in the name refers to the fact that the backups are made while the server is running.

mysqlhotcopy has the following principal benefits:

It's faster than mysqldump because it directly copies table files rather than requesting the tables from the server the way mysqldump does. (This means that you must run mysqlhotcopy on the server host; it does not work with remote servers.)
It's convenient, because it automatically manages for you the locking protocol necessary to keep the server from changing the tables while they're being copied. (mysqlhotcopy does this by using internal locking as described earlier in "Performing Database Maintenance with the Server Running.")
It can flush the logs, which synchronizes the checkpoints for the backup files and the logs. This makes the backups easier to use for recovery, should that be necessary later.

mysqlhotcopy also has certain limitations:

Because it tells the server to read-lock the tables that it wants to copy, mysqlhotcopy must be used while the server is running.
It can be used only for MyISAM and ISAM tables.
It works only on Unix and NetWare, not on Windows.

The following examples assume that databases to be backed up contain only MyISAM or ISAM tables.

There are several ways to invoke mysqlhotcopy. Suppose that you want to copy a database named mydb. The following command creates a directory mydb_copy in the server's data directory and copies the files in the mydb database directory into it:

% mysqlhotcopy mydb

However, it's not necessarily a good idea to back up a database into the data directory. The new directory will also appear to the server to be a database that it can access. (You can see this by issuing a SHOW DATABASES statement after executing the preceding command. The output will show both mydb and mydb_copy.) Because of this, the tables in the backup directory could be modified by clients that connect to the server.

To copy the mydb database into a directory named mydb under a directory that you specify explicitly, give the directory pathname following the database name. For example, to copy the mydb database to a directory named /archive/mysql/mydb-2005-03-12, use this command:

% mysqlhotcopy mydb /archive/mysql/mydb-2005-03-12

To determine what actions mysqlhotcopy would perform for any given command, include the -n option in your invocation syntax. This runs mysqlhotcopy in "no execution" mode, so that it just prints commands rather than executing them.

Backing Up InnoDB or BDB Tables

Tables for the InnoDB and BDB transactional storage engines can be dumped using mysqldump, just like any other kind of tables. One option that is useful for transactional engines is --single-transaction, which causes mysqldump to dump the tables as part of a transaction. For InnoDB this ensures that the tables are not modified during the dump so that you get a consistent backup.

You can also make a binary backup. For InnoDB, the best way to do this is use InnoDB Hot Backup, available from Innobase Oy. InnoDB Hot Backup is a commercial tool that allows you to make InnoDB backups with the server running. Visit http://innodb.com for details.

To make a binary backup yourself, take care to observe the following special requirements:

InnoDB and BDB have their own log files for transaction management that are active while the server is running. Therefore, to make a binary backup, the server must be stopped. It must shut down cleanly, not abnormally, so that these storage engines have a chance to finish outstanding transactions and close their logs properly.
To make a binary backup of your InnoDB tables, you must copy the these files:
- The shared tablespace files.
- The .frm file for each table.
- The .ibd file for each table, if you have configured InnoDB to use individual tablespace files.
- The InnoDB log files.
- The option file in which the shared tablespace configuration is specified. (Make a copy of the option file because you'll want it for reinitializing the shared tablespace should you suffer loss of the current option file.)
To make a binary backup of BDB tables, copy the .frm and .db files for all BDB tables managed by the server. You must also copy the BDB log files. The BDB storage engine requires the logs to be present when the server starts, which means that should it be necessary to restore BDB tables, you'll need to restore the logs as well. BDB log files are created in the data directory by default and have names of the form log.nnnnnnnnnn, where nnnnnnnnnn is a 10-digit suffix indicating the log file sequence number. (Older versions of MySQL use 6-digit suffixes.)

Making Backups Using a Replication Slave

If you have a replication slave server set up, it can help you resolve a conflict of interest that arises from your duties as a MySQL administrator:

One the one hand, it's important to maximize the availability of your server to the members of your user community, which includes allowing them to make database updates.
On the other hand, it's important to make backups, which is best done while you prevent anyone from making database changes. Also, for recovery purposes, backups are most useful if you make sure your backup file and log file checkpoints are synchronized, either by stopping the server or by locking all the tables at once.

The goal of maintaining accessibility conflicts with enforcing complete or partial loss of database access to clients while making backups. A replication slave provides a way out of this dilemma. Rather than making backups of the master server, use the slave server instead. Suspend replication on the slave while you make the backup, resume it afterward, and the slave will catch up on any updates made by the master server during the backup period. This way you need not stop the master or otherwise make it unavailable to clients during the backup.

The following list describes some possible strategies for backing up the slave:

For a complete binary backup, stop the slave server, follow the instructions in "Making a Complete Binary Backup," and restart the server.
To make a backup using a method that does not require the slave to be stopped (for example, with mysqldump), suspend replication on the slave with STOP SLAVE and flush its logs. Then make the backup and resume replication with START SLAVE.
Some backup methods do not require even that you suspend replication. For example, if you're backing up a single database containing only MyISAM tables, you can use mysqlhotcopy or mysqldump with the appropriate options to lock all the tables at once. In these cases, the slave server can continue to run, but it won't attempt any updates to the locked tables during the backup. When the backup program finishes and releases the locks, the slave resumes update processing automatically.

Using a Backup to Rename a Database

MySQL has no command for renaming a database, but you can do so by using a dump file. Dump the database with mysqldump, create a new empty database with the new name, and then reload the dump file into the new database. After that, you can drop the old database. Here's an example that shows how to rename db1 to db2:

% mysqldump db1 > db1.sql
% mysqladmin create db2
% mysql db2 < db1.sql
% mysqladmin drop db1

Do not use the --database option to mysqldump. That causes it to write a USE db1 statement to the dump file that would cause the file contents to be reloaded into db1 rather than db2.

The preceding method might not work if db1 contains InnoDB tables that have foreign key relationships with tables in other databases. To rename a database while preserving foreign key relationships, create an empty database with the new name. Then use RENAME TABLE to rename each table in the original database into the new database, and drop the original database:

mysql> CREATE DATABASE db2;
mysql> RENAME TABLE db1.t1 TO db2.t1;
mysql> RENAME TABLE db1.t2 TO db2.t2;
mysql> ...
mysql> DROP DATABASE db1;

Whichever method you use for renaming a database, remember that access rights to it are controlled through the grant tables in the mysql database. If any of the grant tables have records that refer specifically to the database that was renamed, you'll need to adjust those records appropriately to refer to the new name. For a database renamed from db1 to db2, the statements to use look like this:

mysql> UPDATE db SET Db = 'db2' WHERE Db = 'db1';
mysql> UPDATE tables_priv SET Db = 'db2' WHERE Db = 'db1';
mysql> UPDATE columns_priv SET Db = 'db2' WHERE Db = 'db1';
mysql> UPDATE host SET Db = 'db2' WHERE Db = 'db1';
mysql> FLUSH PRIVILEGES;

No UPDATE statement is needed for the user table because it has no Db column.