MHA 实施文档之最佳实践-亲试ok-202008041622-20201020

MHA 实施文档之最佳实践-亲试ok-202008041622
 
 
架构图如下:
         Master(192.168.124.100)
            vip:(192.168.12.499                                                    MHA监控服务器(192.168.124.103)
                                                                                                                                 
                     
       Slave02                         Slave03
(192.168.124.101)                  (192.168.124.102)
 
 
MHA软件由两部分组成:Manager工具包和Node工具包,具体说明如下:
MHA Manager:
1. masterha_check_ssh:检查MHA的SSH配置状况
2. masterha_check_repl:检查MySQL的复制状况
3. masterha_manager:启动MHA
4. masterha_check_status:检测当前MHA运行状态
5. masterha_master_monitor:检测Master是否宕机
6. masterha_master_switch:控制故障转移(自动或手动)
7. masterha_conf_host:添加或删除配置的server信息
8. masterha_stop:关闭MHA
 
MHA Node:
save_binary_logs:保存或复制Master的二进制日志
apply_diff_relay_logs:识别差异的relay log并将差异的event应用到其它Slave中
filter_mysqlbinlog:去除不必要的ROLLBACK事件(MHA已不再使用这个工具)
purge_relay_logs:消除中继日志(不会堵塞SQL线程)
 
另有如下几个脚本需自定义:
1. Master_ip_failover:管理VIP
2. Master_ip_online_change:
3. masterha_secondary_check:当MHA manager检测到Master不可用时,通过masterha_secondary_check脚本来进一步确认,减低误切的风险。
4. send_report:当发生故障切换时,可通过send_report脚本发送告警信息。
 
 
集群信息
角色                              IP地址                 ServerID          类型
Master                         192.168.124.100   100                 写入
Candicate Master        192.168.124.101   101                 读
Slave                            192.168.124.102   102                 读
Monitor host               192.168.124.103                      监控集群组
注:操作系统均为CentOS 7.6
其中,Master对外提供写服务,备选Master提供读服务,Slave也提供相关的读服务,一旦Master宕机,将会把备选Master提升为新的Master,Slave指向新的Master
 
 
1.操作系统版本
[root@MHA-Master ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
 
2.MySQL数据库版本
2.1 MHA-Master节点
 
[root@MHA-Master ~]# mysql -uroot -p
Enter password:
mysql> select version();
+------------+
| version()  |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.10 sec)
 
 
2.2 MHA-Slave02节点
 
[root@MHA-Slave01 ~]# mysql -uroot -p
Enter password:
mysql> select version();
+------------+
| version()  |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.08 sec)
 
2.3 MHA-Slave03
 
[root@MHA-Slave02 yum.repos.d]# mysql -uroot -p
Enter password:
mysql> select version();
+------------+
| version()  |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.09 sec)
2.4 MHA-Manager
 
[root@MHA-Manager ~]# mysql -uroot -p
Enter password:
 
mysql> select version();
+------------+
| version()  |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.07 sec)
 
3.配置GTID复制模式的主从架构
3.1修改 server_uuid 和 server_id 
由于三台虚拟机都是复制过来的,因此必须修改本地auto.cnf 文件中的 server_uuid
 
[root@MHA-Master ~]# vi /data/my3306/data/auto.cnf
[auto]
server-uuid=6c26c068-a952-11ea-ac80-000c29407100
 
 
[root@MHA-Slave01 data]# vi auto.cnf
[auto]
server-uuid=6c26c068-a952-11ea-ac80-000c29407101
 
[root@MHA-Slave02 yum.repos.d]# vi /data/my3306/data/auto.cnf
[auto]
server-uuid=6c26c068-a952-11ea-ac80-000c29407102
 
同理 server_id
 
[root@MHA-Slave01 data]# vi /etc/my.cnf
 
# GENERAL #
lower_case_table_names=1
server_id = 101
 
 
[root@MHA-Slave02 yum.repos.d]# vi /etc/my.cnf
 
# GENERAL #
lower_case_table_names=1
server_id = 102
 
3.2 启用GTID的复制模式
 
MHA-Master启用gtid模式
mysql> stop Slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
 
mysql> set global enforce_gtid_consistency=true;
Query OK, 0 rows affected (0.00 sec)
 
mysql> set global gtid_mode=off_permissive;
Query OK, 0 rows affected (0.01 sec)
 
mysql> set global gtid_mode=on_permissive;
Query OK, 0 rows affected (0.01 sec)
 
mysql> set global gtid_mode=on;
Query OK, 0 rows affected (0.00 sec)
 
分别在从节点启用gtid并指向主节点 开启Slave
mysql> change Master to Master_host='192.168.124.201',Master_port=3306,Master_user='repl',Master_password='123456',master_auto_position = 1;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
 
mysql> start Slave;
Query OK, 0 rows affected (0.00 sec)
设置relay log清除方式(在每个Slave上) 
set global relay_log_purge=0;
 
3.3 创建复制用户
 
mysql> create user 'repl'@'192.168.124.%' identified by '123456';
Query OK, 0 rows affected (0.32 sec)
 
mysql> grant replication Slave on *.* to 'repl'@'192.168.124.%';
Query OK, 0 rows affected (0.00 sec)
 
Master 创建用户,该用户会自动同步至其他两台主机
 
 
3.4 分别在三主机建立SSH的免认证登陆
 
Master
[root@MHA-Master ~]# SSH-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.SSH/id_rsa);   分别把该文件拷贝至三个主机 包括本地 哈哈 
Created directory '/root/.SSH'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.SSH/id_rsa.
Your public key has been saved in /root/.SSH/id_rsa.pub.
The key fingerprint is:
SHA256:dSJpYSwanC+07ELkWyajwF3NPpsy+MAaqbLSCEVtAB8 root@MHA-Master
The key's randomart image is:
+---[RSA 2048]----+
|..E+ .o.o        |
| .o.B..+.o       |
|.+o+.=..+ o .    |
|..*.B .+ o o     |
|.+oB..  S        |
|ooo+.o o         |
|ooo.o o          |
|=..  .           |
|+.               |
+----[SHA256]-----+
 
[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100
[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101
[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102
[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103
 
MHA-Slave01
[root@MHA-Slave01 data]# SSH-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.SSH/id_rsa):
Created directory '/root/.SSH'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.SSH/id_rsa.
Your public key has been saved in /root/.SSH/id_rsa.pub.
The key fingerprint is:
SHA256:8J5yqFgyYfXQa642uX3b8PVLVewVhlnIEq6TCxg4nP4 root@MHA-Slave02
The key's randomart image is:
+---[RSA 2048]----+
|           .o =+ |
|  . o.    .. =...|
|   =o.o    ..   +|
|  ...oo+  o    .o|
|  o. .+.S+      o|
| . ..o o..o    . |
|  o .E+ =.  . .  |
|   =++ o.+ . o   |
|  ..+o....o   o. |
+----[SHA256]-----+
[root@MHA-Slave01 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100
[root@MHA-Slave01~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101
[root@MHA-Slave01~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102
[root@MHA-Slave01 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103
 
MHA-Slave02
[root@MHA-Slave02 yum.repos.d]# SSH-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.SSH/id_rsa):
Created directory '/root/.SSH'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.SSH/id_rsa.
Your public key has been saved in /root/.SSH/id_rsa.pub.
The key fingerprint is:
SHA256:QF3o6ziZw12u7ihQ1OpdTkollbIjzWSzGydXf0WBEvE root@MHA-Slave03
The key's randomart image is:
+---[RSA 2048]----+
|     ..o.+.oo .oo|
|    ..B = ....  .|
|   . *.O . ..E . |
|    + X.*   . .  |
|   o + %S.   .   |
|  . . + o .      |
|   . . * o       |
|    . B.o .      |
|     ..=+.       |
+----[SHA256]-----+
[root@MHA-Slave02 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100
[root@MHA-Slave02~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101
[root@MHA-Slave02~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102
[root@MHA-Slave02 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103
 
MHA-Manager
[root@MHA-Manager ~]# SSH-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.SSH/id_rsa):
Created directory '/root/.SSH'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.SSH/id_rsa.
Your public key has been saved in /root/.SSH/id_rsa.pub.
The key fingerprint is:
SHA256:f5LkhocdeJE9d6lQgH5nEslGP5lOwYx5u5wHo8yeo8U root@MHA-Manager
The key's randomart image is:
+---[RSA 2048]----+
|          +oBo   |
|         . Xo++ .|
|        . +.=*o..|
|         o ++Oo. |
|        S B *o=  |
|         O.* + . |
|        o XEo .  |
|         o.*     |
|         .. .    |
+----[SHA256]-----+
[root@MHA-Manager ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100
[root@MHA-Manager~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101
[root@MHA-Manager~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102
[root@MHA-Manager ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103
 
MHA-Master 将秘钥拷贝至两个从节点和manager主机 包括 manager主机
 
 
分别登陆两个主机测试
[root@MHA-Master ~]# SSH 192.168.124.102
Last login: Mon Aug  3 05:14:04 2020 from 192.168.124.9
[root@MHA-Slave02 ~]# exit
logout
Connection to 192.168.124.102 closed.
[root@MHA-Master ~]# SSH 192.168.124.101
Last login: Mon Aug  3 05:13:06 2020 from 192.168.124.9
[root@MHA-Slave01 ~]# exit
logout
Connection to 192.168.124.101 closed.
 
 
4.安装mha-manager 和mha-node软件包
安装mha4mysql-node-0.58-0.el7.noarch.rpm
 
4.1 从下面地址下载软件包
[root@MHA-Master ~]# wget \
 
[root@MHA-Master ~]# wget \
 
mysql 节点都需要安装 mysql-node 软件包 ,管理节点上需要安装 mysql-manager 软件包
 
 
4.1.1 安装安装 mysql-node 软件包
[root@MHA-Master ~]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 192.168.124.102:~
root@192.168.124.102's password:
mha4mysql-node-0.58-0.el7.centos.noarch.rpm         
 
[root@MHA-Master yum.repos.d]# yum install -y perl-DBD-MySQL ncftp perl-DBI.x86
 
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version
No package ncftp available.
No package perl-DBI.x86 available.
Nothing to do
 
报错找不到 ,哈哈
需要运行
[root@MHA-Master yum.repos.d]#     yum install epel-release -y
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
 
再次安装
[root@MHA-Master yum.repos.d]# yum install -y perl-DBD-MySQL ncftp perl-DBI.x86
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version
Package 2:ncftp-3.2.5-7.el7.x86_64 already installed and latest version
No package perl-DBI.x86 available.
Nothing to do
 
dbi包找不到
[root@MHA-Master ~]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:mha4mysql-node-0.58-0.el7.centos ################################# [100%]
 
 
[root@MHA-Master ~]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm --nodeps --force
Preparing...                          ################################# [100%]
Updating / installing...
   1:mha4mysql-manager-0.58-0.el7.cent################################# [100%]
 
[root@MHA-Master ~]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 192.168.124.101:~
root@192.168.124.101's password:
mha4mysql-node-0.58-0.el7.centos.noarch.rpm                                                                                100%   35KB 656.2KB/s   00:00    
 
4.4.2 安装 mysql-manager 软件包
 
[root@MHA-Manager softwares]# yum install perl-Config-Tiny.noarch perl-Time-HiRes.x86_64 perl-Parallel-ForkManager perl-Log-Dispatch-Perl.noarch perl-DBD-Mysql ncftp
QL ncftp
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
Package perl-Config-Tiny-2.14-7.el7.noarch already installed and latest version
Package 4:perl-Time-HiRes-1.9725-3.el7.x86_64 already installed and latest version
Package perl-Parallel-ForkManager-1.18-2.el7.noarch already installed and latest version
No package perl-Log-Dispatch-Perl.noarch available.
Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version
Package 2:ncftp
 
费了一番周折 愣是装不上 perl-Log-Dispatch,重启机器搞定
[root@MHA-Manager ~]# yum install perl-Log-Dispatch
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
。。。。。。
Installed:
  perl-Log-Dispatch.noarch 0:2.41-1.el7.1                                                                                                                    
 
Dependency Installed:
  perl-Class-Load.noarch 0:0.20-3.el7               perl-Email-Date-Format.noarch 0:1.002-15.el7           perl-MIME-Lite.noarch 0:3.030-1.el7             
  perl-Mail-Sendmail.noarch 0:0.79-21.el7           perl-MailTools.noarch 0:2.12-2.el7                     perl-Net-SMTP-SSL.noarch 0:1.01-13.el7          
  perl-Package-Stash.noarch 0:0.34-2.el7            perl-Package-Stash-XS.x86_64 0:0.26-3.el7             
 
Complete!
 
 
[root@MHA-Manager softwares]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm   --force --nodeps
Preparing...                          ################################# [100%]
Updating / installing...
   1:mha4mysql-manager-0.58-0.el7.cent################################# [100%]
 
 
 
5.MHA管理mysql各个节点
 
mha-manager 管理节点
[root@MHA-Manager ~]# mkdir /etc/mha  mha的安装目录
[root@MHA-Manager ~]# mkdir /home/mysql_mha  mha的工作目录
 
[root@MHA-Manager ~]# cd /etc/mha/
[root@MHA-Manager mha]# ls
 
MySQL主节点创建监控用户
mysql> create user 'ha_monitor'@'192.168.124.%' identified by '123456';
Query OK, 0 rows affected (0.32 sec)
 
mysql> grant all privileges on *.* to 'ha_monitor'@'192.168.124.%';
Query OK, 0 rows affected (0.00 sec)
 
 
创建mha的配置文件
[root@MHA-Manager mha]# vi app1.cnf  
 
[server default]
#授权的manager监控用户和密码 mysql数据库用户
user=ha_monitor
password=123456
 
#MySQL服务复制用户和密码
repl_user=repl
repl_password=123456
 
# SSH远程登录用户和端口
SSH_user=root
SSH_port=22
 
# 检测主Master存活间隔和类型
ping_interval=1
ping_type=SELECT
 
#管理端log路径
manager_log=/home/mha/manager.log
#管理服务端所使用的目录
manager_workdir=/home/mha/
#远程节点工作目录
remote_workdir=/home/mha/
#数据库binlog目录
Master_binlog_dir=/data/my3306/log/binlog
 
#故障failover脚本,需在官网的基础上修改配置
Master_ip_failover_script=/etc/mha/Master_ip_failover
Master_ip_online_change_script=/etc/mha/Master_ip_online_change
 
#检测路由主Master是否宕机
secondary_check_script="masterha_secondary_check -s 192.168.124.101 -s 192.168.124.102"
 
[server1]
hostname=192.168.124.201
port=3306
candidate_Master=1
 
[server2]
hostname=192.168.124.202
port=3306
#设置可参数选举为MYSQL Master
candidate_Master=1
#忽略延迟
check_repl_delay=0
 
[server3]
hostname=192.168.124.203
port=3306
#不参与选举
no_Master=1
 
 
 
 
 
[root@MHA-Manager ~]# masterha_check_ssh  --conf=/etc/mha/app1.conf
 
Tue Aug  4 01:34:10 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug  4 01:34:10 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..
Tue Aug  4 01:34:10 2020 - [info] Reading server configuration from /etc/mha/app1.conf..
Tue Aug  4 01:34:10 2020 - [info] Starting SSH connection tests..
Tue Aug  4 01:34:14 2020 - [debug]
Tue Aug  4 01:34:11 2020 - [debug]  Connecting via SSH from root@192.168.124.102(192.168.124.102:22) to root@192.168.124.100(192.168.124.100:22)..
Tue Aug  4 01:34:13 2020 - [debug]   ok.
Tue Aug  4 01:34:13 2020 - [debug]  Connecting via SSH from root@192.168.124.102(192.168.124.102:22) to root@192.168.124.101(192.168.124.101:22)..
Tue Aug  4 01:34:14 2020 - [debug]   ok.
Tue Aug  4 01:34:14 2020 - [debug]
Tue Aug  4 01:34:10 2020 - [debug]  Connecting via SSH from root@192.168.124.100(192.168.124.100:22) to root@192.168.124.101(192.168.124.101:22)..
Tue Aug  4 01:34:12 2020 - [debug]   ok.
Tue Aug  4 01:34:12 2020 - [debug]  Connecting via SSH from root@192.168.124.100(192.168.124.100:22) to root@192.168.124.102(192.168.124.102:22)..
Tue Aug  4 01:34:13 2020 - [debug]   ok.
Tue Aug  4 01:34:14 2020 - [debug]
Tue Aug  4 01:34:11 2020 - [debug]  Connecting via SSH from root@192.168.124.101(192.168.124.101:22) to root@192.168.124.100(192.168.124.100:22)..
Tue Aug  4 01:34:12 2020 - [debug]   ok.
Tue Aug  4 01:34:12 2020 - [debug]  Connecting via SSH from root@192.168.124.101(192.168.124.101:22) to root@192.168.124.102(192.168.124.102:22)..
Tue Aug  4 01:34:13 2020 - [debug]   ok.
Tue Aug  4 01:34:14 2020 - [info] All SSH connection tests passed successfully.
 
 
[root@MHA-Manager ~]# masterha_check_repl  --conf=/etc/mha/app1.conf
[root@MHA-Manager ~]# masterha_check_repl  --conf=/etc/mha/app1.conf
Tue Aug  4 02:20:12 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug  4 02:20:12 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..
Tue Aug  4 02:20:12 2020 - [info] Reading server configuration from /etc/mha/app1.conf..
Tue Aug  4 02:20:12 2020 - [info] MHA::MasterMonitor version 0.58.
Tue Aug  4 02:20:14 2020 - [info] GTID failover mode = 1
Tue Aug  4 02:20:14 2020 - [info] Dead Servers:
Tue Aug  4 02:20:14 2020 - [info] Alive Servers:
Tue Aug  4 02:20:14 2020 - [info]   192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:20:14 2020 - [info]   192.168.124.101(192.168.124.101:3306)
Tue Aug  4 02:20:14 2020 - [info]   192.168.124.102(192.168.124.102:3306)
Tue Aug  4 02:20:14 2020 - [info] Alive Slaves:
Tue Aug  4 02:20:14 2020 - [info]   192.168.124.101(192.168.124.101:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:20:14 2020 - [info]     GTID ON
Tue Aug  4 02:20:14 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:20:14 2020 - [info]     Primary candidate for the new Master (candidate_Master is set)
Tue Aug  4 02:20:14 2020 - [info]   192.168.124.102(192.168.124.102:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:20:14 2020 - [info]     GTID ON
Tue Aug  4 02:20:14 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:20:14 2020 - [info]     Not candidate for the new Master (no_Master is set)
Tue Aug  4 02:20:14 2020 - [info] Current Alive Master: 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:20:14 2020 - [info] Checking Slave configurations..
Tue Aug  4 02:20:14 2020 - [info]  read_only=1 is not set on Slave 192.168.124.101(192.168.124.101:3306).
Tue Aug  4 02:20:14 2020 - [info]  read_only=1 is not set on Slave 192.168.124.102(192.168.124.102:3306).
Tue Aug  4 02:20:14 2020 - [info] Checking replication filtering settings..
Tue Aug  4 02:20:14 2020 - [info]  binlog_do_db= , binlog_ignore_db=
Tue Aug  4 02:20:14 2020 - [info]  Replication filtering check ok.
Tue Aug  4 02:20:14 2020 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Tue Aug  4 02:20:14 2020 - [info] Checking SSH publickey authentication settings on the current Master..
Tue Aug  4 02:20:15 2020 - [info] HealthCheck: SSH to 192.168.124.100 is reachable.
Tue Aug  4 02:20:15 2020 - [info]
192.168.124.100(192.168.124.100:3306) (current Master)
+--192.168.124.101(192.168.124.101:3306)
+--192.168.124.102(192.168.124.102:3306)
 
Tue Aug  4 02:20:15 2020 - [info] Checking replication health on 192.168.124.101..
Tue Aug  4 02:20:15 2020 - [info]  ok.
Tue Aug  4 02:20:15 2020 - [info] Checking replication health on 192.168.124.102..
Tue Aug  4 02:20:15 2020 - [info]  ok.
Tue Aug  4 02:20:15 2020 - [info] Checking Master_ip_failover_script status:
Tue Aug  4 02:20:15 2020 - [info]   /etc/mha/Master_ip_failover --command=status --SSH_user=root --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306
 
 
IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===
 
Checking the Status of the script.. OK
SIOCSIFADDR: No such device
eth0:3: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
Tue Aug  4 02:20:17 2020 - [info]  OK.
Tue Aug  4 02:20:17 2020 - [warning] shutdown_script is not defined.
Tue Aug  4 02:20:17 2020 - [info] Got exit code 0 (Not Master dead).
 
MySQL Replication Health is OK.
开启MHA Manager监控
[root@MHA-Manager ~]# nohup masterha_manager --conf=/etc/mha/app1.conf --remove_dead_master_conf  --ignore_last_failover < /dev/null > /home/mha/manager.log 2>&1 &
 
[1] 10135
 
Tue Aug  4 21:10:48 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug  4 21:10:48 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..
Tue Aug  4 21:10:48 2020 - [info] Reading server configuration from /etc/mha/app1.conf..
 
检查MHA Manager的状态
[root@MHA-Manager ~]# masterha_check_status  --conf=/etc/mha/app1.conf
app1 (pid:15102) is running(0:PING_OK), Master:192.168.124.101
 
关闭MHA Manager监控
[root@MHA-Manager ~]# masterha_stop --conf=/etc/mha/app1.conf
Stopped app1 successfully.
[2]+  Exit 1                  nohup masterha_manager --conf=/etc/mha/app1.conf --remove_dead_master_conf --ignore_last_failover < /dev/null > /home/mha/manager.log 2>&1  (wd: /etc)
(wd now: /etc/mha)
 
 
主节点配置虚拟ip
 
[root@MHA-Master-201 softwares]# ifconfig ens33:1 192.168.124.99/24
[root@MHA-Master-201 softwares]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.124.201  netmask 255.255.255.0  broadcast 192.168.124.255
        inet6 fe80::20c:29ff:fe8b:c2e2  prefixlen 64  scopeid 0x20<link>
        ether 00:0c:29:8b:c2:e2  txqueuelen 1000  (Ethernet)
        RX packets 6888  bytes 572997 (559.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4442  bytes 474278 (463.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
 
ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.124.99  netmask 255.255.255.0  broadcast 192.168.124.255
        ether 00:0c:29:8b:c2:e2  txqueuelen 1000  (Ethernet)
 
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 131  bytes 15546 (15.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 131  bytes 15546 (15.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
 
 
 
 
日志
[root@MHA-Manager ~]# tail -f  /home/mha/manager.log
 
failover演练
关停主节点
[root@MHA-Master ~]# service mysql stop
Shutting down MySQL..................... SUCCESS!
 
MHA可以自动的 无人值守的从 其他剩余节点中选择新的主节点
 
Tue Aug  4 02:53:19 2020 - [warning] Connection failed 4 time(s)..
Tue Aug  4 02:53:19 2020 - [warning] Master is not reachable from health checker!
Tue Aug  4 02:53:19 2020 - [warning] Master 192.168.124.100(192.168.124.100:3306) is not reachable!
Tue Aug  4 02:53:19 2020 - [warning] SSH is reachable.
Tue Aug  4 02:53:19 2020 - [info] Connecting to a Master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.conf again, and trying to connect to all servers to check server status..
Tue Aug  4 02:53:19 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug  4 02:53:19 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..
Tue Aug  4 02:53:19 2020 - [info] Reading server configuration from /etc/mha/app1.conf..
Tue Aug  4 02:53:21 2020 - [info] GTID failover mode = 1
Tue Aug  4 02:53:21 2020 - [info] Dead Servers:
Tue Aug  4 02:53:21 2020 - [info]   192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:21 2020 - [info] Alive Servers:
Tue Aug  4 02:53:21 2020 - [info]   192.168.124.101(192.168.124.101:3306)
Tue Aug  4 02:53:21 2020 - [info]   192.168.124.102(192.168.124.102:3306)
Tue Aug  4 02:53:21 2020 - [info] Alive Slaves:
Tue Aug  4 02:53:21 2020 - [info]   192.168.124.101(192.168.124.101:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:21 2020 - [info]     GTID ON
Tue Aug  4 02:53:21 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:21 2020 - [info]     Primary candidate for the new Master (candidate_Master is set)
Tue Aug  4 02:53:21 2020 - [info]   192.168.124.102(192.168.124.102:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:21 2020 - [info]     GTID ON
Tue Aug  4 02:53:21 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:21 2020 - [info]     Not candidate for the new Master (no_Master is set)
Tue Aug  4 02:53:21 2020 - [info] Checking Slave configurations..
Tue Aug  4 02:53:21 2020 - [info]  read_only=1 is not set on Slave 192.168.124.101(192.168.124.101:3306).
Tue Aug  4 02:53:21 2020 - [info]  read_only=1 is not set on Slave 192.168.124.102(192.168.124.102:3306).
Tue Aug  4 02:53:21 2020 - [info] Checking replication filtering settings..
Tue Aug  4 02:53:21 2020 - [info]  Replication filtering check ok.
Tue Aug  4 02:53:21 2020 - [info] Master is down!
Tue Aug  4 02:53:21 2020 - [info] Terminating monitoring script.
Tue Aug  4 02:53:21 2020 - [info] Got exit code 20 (Master dead).
Tue Aug  4 02:53:21 2020 - [info] MHA::MasterFailover version 0.58.
Tue Aug  4 02:53:21 2020 - [info] Starting Master failover.
Tue Aug  4 02:53:21 2020 - [info]
Tue Aug  4 02:53:21 2020 - [info] * Phase 1: Configuration Check Phase..
Tue Aug  4 02:53:21 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] GTID failover mode = 1
Tue Aug  4 02:53:22 2020 - [info] Dead Servers:
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info] Checking Master reachability via MySQL(double check)...
Tue Aug  4 02:53:22 2020 - [info]  ok.
Tue Aug  4 02:53:22 2020 - [info] Alive Servers:
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.101(192.168.124.101:3306)
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.102(192.168.124.102:3306)
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.101(192.168.124.101:3306)
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.102(192.168.124.102:3306)
Tue Aug  4 02:53:22 2020 - [info] Alive Slaves:
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.101(192.168.124.101:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:22 2020 - [info]     GTID ON
Tue Aug  4 02:53:22 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info]     Primary candidate for the new Master (candidate_Master is set)
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.102(192.168.124.102:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:22 2020 - [info]     GTID ON
Tue Aug  4 02:53:22 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info]     Not candidate for the new Master (no_Master is set)
Tue Aug  4 02:53:22 2020 - [info] Starting GTID based failover.
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] * Phase 2: Dead Master Shutdown Phase..
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] Forcing shutdown so that applications never connect to the current Master..
Tue Aug  4 02:53:22 2020 - [info] Executing Master IP deactivation script:
Tue Aug  4 02:53:22 2020 - [info]   /etc/mha/Master_ip_failover --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306 --command=stopSSH --SSH_user=root
 
 
IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===
 
Disabling the VIP on old Master: 192.168.124.100
eth0:3: ERROR while getting interface flags: No such device
Tue Aug  4 02:53:22 2020 - [info]  done.
Tue Aug  4 02:53:22 2020 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead Master.
Tue Aug  4 02:53:22 2020 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] * Phase 3: Master Recovery Phase..
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] The latest binary log file/position on all Slaves is log-bin.000031:731
Tue Aug  4 02:53:22 2020 - [info] Retrieved Gtid Set: 6c26c068-a952-11ea-ac80-000c29407100:3-4
Tue Aug  4 02:53:22 2020 - [info] Latest Slaves (Slaves that received relay log files to the latest):
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.101(192.168.124.101:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:22 2020 - [info]     GTID ON
Tue Aug  4 02:53:22 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info]     Primary candidate for the new Master (candidate_Master is set)
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.102(192.168.124.102:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:22 2020 - [info]     GTID ON
Tue Aug  4 02:53:22 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info] The oldest binary log file/position on all Slaves is log-bin.000031:731
Tue Aug  4 02:53:22 2020 - [info] Retrieved Gtid Set: 6c26c068-a952-11ea-ac80-000c29407100:3-4
Tue Aug  4 02:53:22 2020 - [info] Oldest Slaves:
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.101(192.168.124.101:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:22 2020 - [info]     GTID ON
Tue Aug  4 02:53:22 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info]     Primary candidate for the new Master (candidate_Master is set)
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.102(192.168.124.102:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:22 2020 - [info]     GTID ON
Tue Aug  4 02:53:22 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info]     Not candidate for the new Master (no_Master is set)
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] * Phase 3.3: Determining New Master Phase..
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] Searching new Master from Slaves..
Tue Aug  4 02:53:22 2020 - [info]  Candidate Masters from the configuration file:
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.101(192.168.124.101:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:22 2020 - [info]     GTID ON
Tue Aug  4 02:53:22 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info]     Primary candidate for the new Master (candidate_Master is set)
Tue Aug  4 02:53:22 2020 - [info]  Non-candidate Masters:
Tue Aug  4 02:53:22 2020 - [info]   192.168.124.102(192.168.124.102:3306)  Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug  4 02:53:22 2020 - [info]     GTID ON
Tue Aug  4 02:53:22 2020 - [info]     Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug  4 02:53:22 2020 - [info]     Not candidate for the new Master (no_Master is set)
Tue Aug  4 02:53:22 2020 - [info]  Searching from candidate_Master Slaves which have received the latest relay log events..
Tue Aug  4 02:53:22 2020 - [info] New Master is 192.168.124.101(192.168.124.101:3306)
Tue Aug  4 02:53:22 2020 - [info] Starting Master failover..
Tue Aug  4 02:53:22 2020 - [info]
From:
192.168.124.100(192.168.124.100:3306) (current Master)
+--192.168.124.101(192.168.124.101:3306)
+--192.168.124.102(192.168.124.102:3306)
 
To:
192.168.124.101(192.168.124.101:3306) (new Master)
+--192.168.124.102(192.168.124.102:3306)
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info] * Phase 3.3: New Master Recovery Phase..
Tue Aug  4 02:53:22 2020 - [info]
Tue Aug  4 02:53:22 2020 - [info]  Waiting all logs to be applied..
Tue Aug  4 02:53:22 2020 - [info]   done.
Tue Aug  4 02:53:22 2020 - [info] Getting new Master's binlog name and position..
Tue Aug  4 02:53:22 2020 - [info]  log-bin.000032:731
Tue Aug  4 02:53:22 2020 - [info]  All other Slaves should start replication from here. Statement should be: CHANGE Master TO Master_HOST='192.168.124.101', Master_PORT=3306, Master_AUTO_POSITION=1, Master_USER='repl', Master_PASSWORD='xxx';
Tue Aug  4 02:53:22 2020 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: log-bin.000032, 731, 6c26c068-a952-11ea-ac80-000c29407100:1-4
Tue Aug  4 02:53:22 2020 - [info] Executing Master IP activate script:
Tue Aug  4 02:53:22 2020 - [info]   /etc/mha/Master_ip_failover --command=start --SSH_user=root --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306 --new_Master_host=192.168.124.101 --new_Master_ip=192.168.124.101 --new_Master_port=3306 --new_Master_user='ha_monitor'   --new_Master_password=xxx
 
 
IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===
 
The new Master had set read_only=0.
Enabling the VIP - 192.168.124.99 on the new Master - 192.168.124.101
SIOCSIFADDR: No such device
eth0:3: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
Tue Aug  4 02:53:23 2020 - [info]  OK.
Tue Aug  4 02:53:23 2020 - [info] ** Finished Master recovery successfully.
Tue Aug  4 02:53:23 2020 - [info] * Phase 3: Master Recovery Phase completed.
Tue Aug  4 02:53:23 2020 - [info]
Tue Aug  4 02:53:23 2020 - [info] * Phase 4: Slaves Recovery Phase..
Tue Aug  4 02:53:23 2020 - [info]
Tue Aug  4 02:53:23 2020 - [info]
Tue Aug  4 02:53:23 2020 - [info] * Phase 4.1: Starting Slaves in parallel..
Tue Aug  4 02:53:23 2020 - [info]
Tue Aug  4 02:53:23 2020 - [info] -- Slave recovery on host 192.168.124.102(192.168.124.102:3306) started, pid: 12380. Check tmp log /home/mha//192.168.124.102_3306_20200804025321.log if it takes time..
Tue Aug  4 02:53:24 2020 - [info]
Tue Aug  4 02:53:24 2020 - [info] Log messages from 192.168.124.102 ...
Tue Aug  4 02:53:24 2020 - [info]
Tue Aug  4 02:53:23 2020 - [info]  Resetting Slave 192.168.124.102(192.168.124.102:3306) and starting replication from the new Master 192.168.124.101(192.168.124.101:3306)..
 
 
创建时间:2021-12-22 11:29
浏览量:0