After completing the installation (See: Installation and Configuration), we are going to set up MySQL master-master replication. We need to replicate MySQL servers to achieve high-availability (HA). In my case I need two masters that are synchronized with each other so that if one of them goes down, the other could take over and no data is lost. Similarly, when the first one goes up again, it will still be used as slave for the live one.
Here is a basic step by step tutorial, that will cover the MySQL master and slave replication and also will describe the MySQL master and master replication.
Notions: we will call system 1 as master1 and slave2, and system2 as master2 and slave 1.
Install MySQL on master 1 and slave 1. Configure network services on both systems, like:
...[mysqld]log-bin=/var/log/mysql/mysql-bin.logbinlog-do-db=<database name> # input the database which should be replicatedbinlog-ignore-db=mysql # input the database that should be ignored for replicationbinlog-ignore-db=testexpire_logs_days=14# binlog_cache_size = 64K # Enable this if binlog_cache_disk_use increase in 'show global status'sync_binlog=1 # Reduce performances but essential for data integrity between slave <- masterslave_compressed_protocol=1 # Good for wan replication to reduce network I/O at low cost of extra CPUserver-id=1bind-address=0.0.0.0...
Notes: Italic lines are nearly optional and bold lines need to be set
If you want to sync all databases without setting exclusions, you could do without adding ‘*ignore-db’ values:
1
2
3
4
5
6
7
8
9
10
11
...[mysqld]log-bin=/var/log/mysql/mysql-bin.logexpire_logs_days=14# binlog_cache_size = 64K # Enable this if binlog_cache_disk_use increase in 'show global status'sync_binlog=1 # Reduce performances but essential for data integrity between slave <- masterslave_compressed_protocol=1 # Good for wan replication to reduce network I/O at low cost of extra CPUserver-id=1bind-address=0.0.0.0...
The highlighted rows above must indicate related log files and Slave_IO_Running and Slave_SQL_Running must be set to YES.
Seconds_Behind_Master: is the difference time between master and slave data sync (in progress…).
Master_Host: This is the address of the master server. The slave will connect to it to get the logs that have to be replayed
Master_User: The user that we will use for the replication
Master_Port: The port where the slave will connect to on the master
Master_Log_File: This file is the binary log that the slave will have to replay. The binary log is located on the master server. This file contains every statement that makes a change to the database in a binary format.
Read_Master_Log_Pos: This is the position number on the binary log file that is being read by the “Slave_I/O Process” (the process that established the connection to the master and ensures the sync)
Relay_Log_File: Same as the binary log but this file is located on the slave. This is a relay file of the master binary log
Relay_Log_Pos: This is the position number on the relay log file that is being read by the “Slave_SQL Process” (the process that replays the changes to the slave database)
Relay_Master_Log_File: It contains all the statements that have been processed by the slave.
Slave_IO_Running: The status of the Slave_I/O process. When it’s “Yes” that means that the slave is properly connected to the master. If it’s “No” that means something is wrong with the connectivity between the 2 nodes
Slave_SQL_Running: The status of the Slave_SQL process. When it’s “Yes” that means that the slave is able to process statements and that all is working correctly. If it’s “No” that means something is wrong while reading the relay logs
...[mysqld]log-bin=/var/log/mysql/mysql-bin.logbinlog-do-db=<database name> # input the database which should be replicatedbinlog-ignore-db=mysql # input the database that should be ignored for replicationbinlog-ignore-db=testexpire_logs_days=14# binlog_cache_size = 64K # Enable this if binlog_cache_disk_use increase in 'show global status'sync_binlog=1 # Reduce performances but essential for data integrity between slave <- masterslave_compressed_protocol=1 # Good for wan replication to reduce network I/O at low cost of extra CPUserver-id=2bind-address=0.0.0.0# sync_binlog = 1expire_log_days=14...
Try to do this on the slave (adapt to your configuration):
1
2
3
4
5
6
mysql -uroot -p
slave stop;
reset slave;
change master to master_host='10.8.0.6', master_user='replication', master_password='password', master_log_file='mysql-bin.000173', master_log_pos=50937;
start slave;
show slave status\G;
The slave synchronization should be repaired :-)
Watch current process to determine syncro problem link
With this command, we can see if there is a MySQL replication problem:
1
2
3
4
5
6
7
8
9
10
$ mysqladmin -uroot -p processlist
+------+-------------+------------------------+----+-------------+--------+-----------------------------------------------------------------------+------------------+
| Id | User | Host | db | Command | Time | State | Info |
+------+-------------+------------------------+----+-------------+--------+-----------------------------------------------------------------------+------------------+
| 1 | system user | | | Connect | 264546 | Waiting for master to send event | |
| 2 | system user | | | Connect | 1798 | Has read all relay log; waiting for the slave I/O thread to update it | |
| 4796 | replication | 10.8.0.6:60523 | | Binlog Dump | 4733 | Has sent all binlog to slave; waiting for binlog to be updated | |
| 5020 | root | localhost | | Query | 0 | | show processlist |
+------+-------------+------------------------+----+-------------+--------+-----------------------------------------------------------------------+------------------+
If you’ve got this kind of problem, a tool called mk-table-checksum available here allows you to check consistency between 2 servers. It also can help to recreate a consistency database from the master.
You may need to repair your replication because something went wrong and you can see that in the slave status (Last_Error). The way to correct the problem is to stop the slave:
1
stop slave;
Then analyze the error message and correct it on the slave. Then skip the error message: