Setting Up Replication Servers
One form of database "replication" involves simply copying a database to another server. But then you have to repeat the operation later if the original database changes and you want to keep the copy up to date. To achieve continual updating of a secondary database as changes are made to the contents of a master database, use MySQL's live replication capabilities. This gives you a means of keeping a copy of a database and making sure that changes to the original database propagate on a timely basis to the copy automatically.
How Replication Works
Database replication in MySQL is based on the following principles:
In a replication relationship, one server acts as the master and another server acts as the slave. Each server must be assigned a unique replication ID.
You can have multiple slaves per master, but not multiple masters per slave. However, one slave can serve as a master to another slave, thus creating a chain of replication servers.
Each slave must begin with its databases synchronized to those of its master. That is, any database to be replicated on the slave must be an identical copy of the master database when replication begins. After that, updates that are made on the master server propagate to the slave. Updates should not be made directly to the replicated databases on the slave.
The means of communication of updates is based on the master server's binary logs, which is where updates are recorded that are to be sent to the slaves. Binary logging therefore must be enabled on the master server. Stored updates in the binary logs are called "events."
Each slave server must have permission to connect to the master and request updates. When a slave connects to its master, it tells the master how far into the master's binary logs it had progressed when it last connected. This progress is measured in terms of replication coordinates: A binary log filename and position within that file. The master then begins sending to the slave those events in the binary logs that occurred after the given coordinates. When the slave has read all available events, it pauses and waits for more.
As new updates occur on the master server, it sends them to any connected slaves and records them in the binary log to be sent later to any slaves that are not currently connected.
The master server creates one thread to handle each connected slave, much as it creates a thread to handle regular clients. These slave handler threads count against the limit set by the max_connections system variable.
On the slave side, the server uses two threads to handle replication duties. The I/O thread receives events to be processed from the master server and writes them to the slave's relay logs. The SQL thread reads events from the relay logs and executes them. The relay logs serve as the means by which the I/O thread communicates changes to the SQL thread. As each relay log is processed completely, the slave removes it automatically. The I/O and SQL threads operate independently, so each can be stopped or started separately from the other. This decoupling of function into different threads has important benefits. For example, the I/O thread can continue to read events from the master server while you stop the SQL thread so that no updates occur in the slave's databases while you make a backup.
Replication support is an area of active development, so it's sometimes difficult to keep track of just which replication-related feature was added when. In general, it's best to run the most recent server that you can. You'll also need to consider compatibility constraints between different versions of the server. Replication compatibility is based on the binary log format. Three versions of the format have been used thus far. The original format was developed in MySQL 3.23 when replication support was added, the second format was developed in MySQL 4.0, and the third was developed in MySQL 5.0.
In general, I recommend that you follow these guidelines:
Try to match binary log formats for your master and slave servers. For example, try to match 4.x masters with 4.x slaves, not 5.x masters with 3.23.x slaves or vice versa.
Within MySQL versions that use the same binary log format, try to use the most recent versions possible. This will give you the benefit of the richest feature set and the greatest number of restrictions removed and problems eliminated.
Assuming that your servers have compatible binary log formats, they must also be feature-compatible. For example, if the master server replicates InnoDB tables that require the use of transactions or foreign keys, the slave server must include the InnoDB storage engine.
Establishing a Master-Slave Replication Relationship
The following procedure describes how to set up a master-slave replication relationship between two servers:
Determine what ID value you want to assign to each server. These IDs must be different and each should be a positive integer from 1 to 2321. The ID values will be needed for the server-id startup option used with each server.
The slave server needs an account on the master server so that it can connect and request updates. On the master server, set up an account like this:
GRANT REPLICATION SLAVE ON *.* TO 'slave_user'@'slave_host'
IDENTIFIED BY 'slave_pass';
Remember the slave_user
values for later when you set up the slave server. No other privileges are needed if the account is used only for the single, limited purpose of replication. However, you may want to grant additional privileges to the account if you plan to use it connect to the master from the slave host "manually" with the mysql
program for testing. Then you won't be so limited in what you can do. (For example, if REPLICATION SLAVE
is the only privilege granted to the account, you might not even be able to see database names on the master server with SHOW DATABASES
The slave server must begin with an exact copy of the databases to be replicated. Perform the initial synchronization of the slave to the master server by copying the master's databases to the slave. One way to do this is to make a backup on the master host, and then move it to the slave host and load it into the slave server. Another method is to copy all the databases over the network from the master to the slave. See Chapter 13
for database backup and copying techniques.
Another way to set up the slave is to use LOAD DATA FROM MASTER
. However, use of this statement is subject to a number of conditions, which are described in the description for LOAD DATA FROM MASTER
in Appendix E
. One such condition is that the statement works only for MyISAM tables, so it is not useful if you have other types of tables such as InnoDB tables.
Whatever method you use to copy databases from the master to the slave, you must make sure that no updates occur on the master between the time when you make the backup and the time that you reconfigure the master to enable binary logging.
Stop the master server if it is running.
Modify the master's configuration to tell the server its replication ID and to enable binary logging. To do this, add lines like the following to an option file that the master server reads when it starts:
Restart the master server; from this point on, it logs updates by writing them to the binary log. (If you already had binary logging enabled, back up your existing binary logs before restarting the server. Then, after it comes up, connect to it and issue a RESET MASTER
statement to clear the existing binary logs. You must do this before any updates occur.)
Stop the slave server if it is running.
Configure the slave server to know its replication ID, where to find the master server, and how to connect to it. In the simplest case, the two servers will be running on separate hosts and using the default TCP/IP port, and you'll need only four lines in the [mysqld]
group of an option file that the slave server reads when it starts:
is the replication ID of the slave server. It must be different from the master's ID. master_host
is the name of the host where the master server is running. On Unix, if the master host is the same as the slave host, use 127.0.0.1
rather than localhost
to make sure that the slave uses a TCP/IP connection. (A Unix socket file is used for connections to localhost
, and replication through a socket file is not supportedor rather, it is possible but there is no way to specify the socket file pathname.) The slave_user
values should be the name and password of the account that you set up on the master server earlier for the slave server to use when it connects to the master to request updates. Be sure to put these lines in an option file that is accessible only to the MySQL administrator's login account on the slave server, because you should keep the username and password confidential. Don't use /etc/my.cnf
, for example, which normally is world-readable. One possibility is to use my.cnf
in the server's data directory and make sure the data directory contents are locked down as described in Chapter 12
If it's necessary to be more specific about how to establish the connection to the master server, you can include a master-port
line in the option group to indicate a port number if the master isn't listening on the default port.
If the connection between the master and slave is intermittent or unreliable, you may want to change the defaults for the connection attempt interval or the number of retries before giving up (60 seconds and 86,400 times, respectively). The master-connect-retry
options can be used for this.
Restart the slave server. A slave uses two sources of information to determine its progress in the replication process. One is the master.info file in the data directory, and the other is the configuration information specified by the server's startup options. The first time you start a slave server, no master.info file exists, so the slave uses the values of the master-xxx startup options to determine how to connect. Then the slave creates a new master.info file in which to record that information and its replication status. Thereafter when the slave server starts, it uses the information in the master.info file in preference to the startup options. This means that if you later change the master host information in the option file, it will ignore your changes. Instead, specify the changes with the CHANGE MASTER statement, which enables you to modify replication parameters while the slave is running. The slave automatically updates the master.info file with these changes.
The procedure just described is based on the assumption that you want to replicate all databases from one server to another, including the mysql database that contains the grant tables. If you don't want to have the same accounts on both hosts, you can exclude replication of the mysql database. (For example, you might want to set up a private replication slave that people cannot connect to even if they have an account on the master.) To exclude the mysql database, add the following line to the [mysqld] group in the master's option file when you enable binary logging:
To exclude multiple databases, use the binlog-ignore-db option multiple times, once per database.
A problem with using binlog-ignore-db is that the binary log won't contain any information for the ignored databases, and that information is useful for performing recovery after a crash. For that reason, it might be preferable to use replicate-ignore-db on the slave server instead. In that case, the master logs all updates to its binary log and sends them to the slave, but the slave discards any that are for ignored databases.
After you have replication set up and running, there are several statements that you may find useful for monitoring or controlling the master and the slave. Details about these statements are available in Appendix E. A brief summary follows:
The STOP SLAVE and START SLAVE statements suspend and resume a slave server's replication-related activity. These statements can be useful for telling the slave to be quiescent while you're making a backup, for example.
SHOW SLAVE STATUS on a slave shows its replication coordinates. This information can be used to determine which binary logs are no longer needed.
CHANGE MASTER on a slave lets you alter several of the slave's current replication parameters, such as which binary log it reads from the master, or which relay log file it writes to.
PURGE MASTER on the master expires binary logs. You can use this after issuing a SHOW SLAVE STATUS statement on each of the slaves to determine which binary logs no longer are needed.
As mentioned earlier, a slave server uses two threads internally to manage replication. The I/O thread talks to the master server, receives updates from it, and writes updates to a relay log file. The SQL thread reads the relay logs and executes the updates it finds there. You can use STOP SLAVE and START SLAVE to suspend or resume each thread individually by adding IO_THREAD or SQL_THREAD to the end of the statement. For example, STOP SLAVE SQL_THREAD stops execution by the slave of the updates in the relay logs, but allows the slave to continue to read updates from the master and record them in the relay logs.
You can make a backup of the slave server while the SQL thread is stopped, and then restart the thread after making the backup. This way the slave won't be making changes to databases while you're backing them up. The I/O thread can be left running; it will continue to write events to the relay logs that it receives from the master.
Relay logs are generated in numbered sequence, much like the binary logs. There also is a relay log index file analogous to the binary log index. The default relay log and index filenames are HOSTNAME-relay-bin.nnnnnn and HOSTNAME-relay-bin.index. The defaults can be changed with the --relay-log and --relay-log-index server startup options. Another related status file is the relay information file, which has a default name of relay-log.info and can be changed with the --relay-log-info-file option.