MHA 实施文档之最佳实践-亲试ok-202008041622-20201020
MHA 实施文档之最佳实践-亲试ok-202008041622
架构图如下:
Master(192.168.124.100)
vip:(192.168.12.499) MHA监控服务器(192.168.124.103)

Slave02 Slave03
(192.168.124.101) (192.168.124.102)
MHA软件由两部分组成:Manager工具包和Node工具包,具体说明如下:
|
MHA Manager:
1. masterha_check_ssh:检查MHA的SSH配置状况
2. masterha_check_repl:检查MySQL的复制状况
3. masterha_manager:启动MHA
4. masterha_check_status:检测当前MHA运行状态
5. masterha_master_monitor:检测Master是否宕机
6. masterha_master_switch:控制故障转移(自动或手动)
7. masterha_conf_host:添加或删除配置的server信息
8. masterha_stop:关闭MHA
MHA Node:
save_binary_logs:保存或复制Master的二进制日志
apply_diff_relay_logs:识别差异的relay log并将差异的event应用到其它Slave中
filter_mysqlbinlog:去除不必要的ROLLBACK事件(MHA已不再使用这个工具)
purge_relay_logs:消除中继日志(不会堵塞SQL线程)
另有如下几个脚本需自定义:
1. Master_ip_failover:管理VIP
2. Master_ip_online_change:
3. masterha_secondary_check:当MHA manager检测到Master不可用时,通过masterha_secondary_check脚本来进一步确认,减低误切的风险。
4. send_report:当发生故障切换时,可通过send_report脚本发送告警信息。
|
集群信息
|
角色 IP地址 ServerID 类型
Master 192.168.124.100 100 写入
Candicate Master 192.168.124.101 101 读
Slave 192.168.124.102 102 读
Monitor host 192.168.124.103 监控集群组
注:操作系统均为CentOS 7.6
其中,Master对外提供写服务,备选Master提供读服务,Slave也提供相关的读服务,一旦Master宕机,将会把备选Master提升为新的Master,Slave指向新的Master
|
1.操作系统版本
[root@MHA-Master ~]# cat /etc/redhat-release
CentOS Linux release 7.6.1810 (Core)
2.MySQL数据库版本
2.1 MHA-Master节点
|
[root@MHA-Master ~]# mysql -uroot -p
Enter password:
略
mysql> select version();
+------------+
| version() |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.10 sec)
|
2.2 MHA-Slave02节点
|
[root@MHA-Slave01 ~]# mysql -uroot -p
Enter password:
略
mysql> select version();
+------------+
| version() |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.08 sec)
|
2.3 MHA-Slave03
|
[root@MHA-Slave02 yum.repos.d]# mysql -uroot -p
Enter password:
略
mysql> select version();
+------------+
| version() |
+------------+
| 5.7.26-log |
+------------+
1 row in set (0.09 sec)
|
2.4 MHA-Manager
|
|
3.配置GTID复制模式的主从架构
3.1修改 server_uuid 和 server_id
|
由于三台虚拟机都是复制过来的,因此必须修改本地auto.cnf 文件中的 server_uuid
[root@MHA-Master ~]# vi /data/my3306/data/auto.cnf
[auto]
server-uuid=6c26c068-a952-11ea-ac80-000c29407100
[root@MHA-Slave01 data]# vi auto.cnf
[auto]
server-uuid=6c26c068-a952-11ea-ac80-000c29407101
[root@MHA-Slave02 yum.repos.d]# vi /data/my3306/data/auto.cnf
[auto]
server-uuid=6c26c068-a952-11ea-ac80-000c29407102
同理 server_id
[root@MHA-Slave01 data]# vi /etc/my.cnf
# GENERAL #
lower_case_table_names=1
server_id = 101
[root@MHA-Slave02 yum.repos.d]# vi /etc/my.cnf
# GENERAL #
lower_case_table_names=1
server_id = 102
|
3.2 启用GTID的复制模式
|
MHA-Master启用gtid模式
mysql> stop Slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> set global enforce_gtid_consistency=true;
Query OK, 0 rows affected (0.00 sec)
mysql> set global gtid_mode=off_permissive;
Query OK, 0 rows affected (0.01 sec)
mysql> set global gtid_mode=on_permissive;
Query OK, 0 rows affected (0.01 sec)
mysql> set global gtid_mode=on;
Query OK, 0 rows affected (0.00 sec)
分别在从节点启用gtid并指向主节点 开启Slave
mysql> change Master to Master_host='192.168.124.201',Master_port=3306,Master_user='repl',Master_password='123456',master_auto_position = 1;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start Slave;
Query OK, 0 rows affected (0.00 sec)
|
设置relay log清除方式(在每个Slave上)
set global relay_log_purge=0;
3.3 创建复制用户
|
mysql> create user 'repl'@'192.168.124.%' identified by '123456';
Query OK, 0 rows affected (0.32 sec)
mysql> grant replication Slave on *.* to 'repl'@'192.168.124.%';
Query OK, 0 rows affected (0.00 sec)
Master 创建用户,该用户会自动同步至其他两台主机
|
3.4 分别在三主机建立SSH的免认证登陆
|
Master
[root@MHA-Master ~]# SSH-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.SSH/id_rsa); 分别把该文件拷贝至三个主机 包括本地 哈哈
Created directory '/root/.SSH'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.SSH/id_rsa.
Your public key has been saved in /root/.SSH/id_rsa.pub.
The key fingerprint is:
SHA256:dSJpYSwanC+07ELkWyajwF3NPpsy+MAaqbLSCEVtAB8 root@MHA-Master
The key's randomart image is:
+---[RSA 2048]----+
|..E+ .o.o |
| .o.B..+.o |
|.+o+.=..+ o . |
|..*.B .+ o o |
|.+oB.. S |
|ooo+.o o |
|ooo.o o |
|=.. . |
|+. |
+----[SHA256]-----+
[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100
[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101
[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102
[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103
MHA-Slave01
[root@MHA-Slave01 data]# SSH-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.SSH/id_rsa):
Created directory '/root/.SSH'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.SSH/id_rsa.
Your public key has been saved in /root/.SSH/id_rsa.pub.
The key fingerprint is:
SHA256:8J5yqFgyYfXQa642uX3b8PVLVewVhlnIEq6TCxg4nP4 root@MHA-Slave02
The key's randomart image is:
+---[RSA 2048]----+
| .o =+ |
| . o. .. =...|
| =o.o .. +|
| ...oo+ o .o|
| o. .+.S+ o|
| . ..o o..o . |
| o .E+ =. . . |
| =++ o.+ . o |
| ..+o....o o. |
+----[SHA256]-----+
[root@MHA-Slave01 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100
[root@MHA-Slave01~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101
[root@MHA-Slave01~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102
[root@MHA-Slave01 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103
MHA-Slave02
[root@MHA-Slave02 yum.repos.d]# SSH-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.SSH/id_rsa):
Created directory '/root/.SSH'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.SSH/id_rsa.
Your public key has been saved in /root/.SSH/id_rsa.pub.
The key fingerprint is:
SHA256:QF3o6ziZw12u7ihQ1OpdTkollbIjzWSzGydXf0WBEvE root@MHA-Slave03
The key's randomart image is:
+---[RSA 2048]----+
| ..o.+.oo .oo|
| ..B = .... .|
| . *.O . ..E . |
| + X.* . . |
| o + %S. . |
| . . + o . |
| . . * o |
| . B.o . |
| ..=+. |
+----[SHA256]-----+
[root@MHA-Slave02 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100
[root@MHA-Slave02~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101
[root@MHA-Slave02~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102
[root@MHA-Slave02 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103
MHA-Manager
[root@MHA-Manager ~]# SSH-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.SSH/id_rsa):
Created directory '/root/.SSH'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.SSH/id_rsa.
Your public key has been saved in /root/.SSH/id_rsa.pub.
The key fingerprint is:
SHA256:f5LkhocdeJE9d6lQgH5nEslGP5lOwYx5u5wHo8yeo8U root@MHA-Manager
The key's randomart image is:
+---[RSA 2048]----+
| +oBo |
| . Xo++ .|
| . +.=*o..|
| o ++Oo. |
| S B *o= |
| O.* + . |
| o XEo . |
| o.* |
| .. . |
+----[SHA256]-----+
[root@MHA-Manager ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100
[root@MHA-Manager~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101
[root@MHA-Manager~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102
[root@MHA-Manager ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103
MHA-Master 将秘钥拷贝至两个从节点和manager主机 包括 manager主机
分别登陆两个主机测试
[root@MHA-Master ~]# SSH 192.168.124.102
Last login: Mon Aug 3 05:14:04 2020 from 192.168.124.9
[root@MHA-Slave02 ~]# exit
logout
Connection to 192.168.124.102 closed.
[root@MHA-Master ~]# SSH 192.168.124.101
Last login: Mon Aug 3 05:13:06 2020 from 192.168.124.9
[root@MHA-Slave01 ~]# exit
logout
Connection to 192.168.124.101 closed.
|
4.安装mha-manager 和mha-node软件包
安装mha4mysql-node-0.58-0.el7.noarch.rpm
4.1 从下面地址下载软件包
[root@MHA-Master ~]# wget \
[root@MHA-Master ~]# wget \
mysql 节点都需要安装 mysql-node 软件包 ,管理节点上需要安装 mysql-manager 软件包
4.1.1 安装安装 mysql-node 软件包
[root@MHA-Master ~]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 192.168.124.102:~
root@192.168.124.102's password:
mha4mysql-node-0.58-0.el7.centos.noarch.rpm
[root@MHA-Master yum.repos.d]# yum install -y perl-DBD-MySQL ncftp perl-DBI.x86
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
* extras: mirrors.aliyun.com
* updates: mirrors.aliyun.com
Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version
No package ncftp available.
No package perl-DBI.x86 available.
Nothing to do
报错找不到 ,哈哈
需要运行
[root@MHA-Master yum.repos.d]# yum install epel-release -y
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
再次安装
[root@MHA-Master yum.repos.d]# yum install -y perl-DBD-MySQL ncftp perl-DBI.x86
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
* epel: mirrors.tuna.tsinghua.edu.cn
* extras: mirrors.aliyun.com
* updates: mirrors.aliyun.com
Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version
Package 2:ncftp-3.2.5-7.el7.x86_64 already installed and latest version
No package perl-DBI.x86 available.
Nothing to do
dbi包找不到
[root@MHA-Master ~]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
Preparing... ################################# [100%]
Updating / installing...
1:mha4mysql-node-0.58-0.el7.centos ################################# [100%]
[root@MHA-Master ~]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm --nodeps --force
Preparing... ################################# [100%]
Updating / installing...
1:mha4mysql-manager-0.58-0.el7.cent################################# [100%]
[root@MHA-Master ~]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 192.168.124.101:~
root@192.168.124.101's password:
mha4mysql-node-0.58-0.el7.centos.noarch.rpm 100% 35KB 656.2KB/s 00:00
4.4.2 安装 mysql-manager 软件包
[root@MHA-Manager softwares]# yum install perl-Config-Tiny.noarch perl-Time-HiRes.x86_64 perl-Parallel-ForkManager perl-Log-Dispatch-Perl.noarch perl-DBD-Mysql ncftp
略
QL ncftp
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
* base: mirrors.aliyun.com
* epel: mirrors.bfsu.edu.cn
* extras: mirrors.aliyun.com
* updates: mirrors.aliyun.com
Package perl-Config-Tiny-2.14-7.el7.noarch already installed and latest version
Package 4:perl-Time-HiRes-1.9725-3.el7.x86_64 already installed and latest version
Package perl-Parallel-ForkManager-1.18-2.el7.noarch already installed and latest version
No package perl-Log-Dispatch-Perl.noarch available.
Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version
Package 2:ncftp
费了一番周折 愣是装不上 perl-Log-Dispatch,重启机器搞定
[root@MHA-Manager ~]# yum install perl-Log-Dispatch
Loaded plugins: fastestmirror, langpacks
Loading mirror speeds from cached hostfile
。。。。。。
Installed:
perl-Log-Dispatch.noarch 0:2.41-1.el7.1
Dependency Installed:
perl-Class-Load.noarch 0:0.20-3.el7 perl-Email-Date-Format.noarch 0:1.002-15.el7 perl-MIME-Lite.noarch 0:3.030-1.el7
perl-Mail-Sendmail.noarch 0:0.79-21.el7 perl-MailTools.noarch 0:2.12-2.el7 perl-Net-SMTP-SSL.noarch 0:1.01-13.el7
perl-Package-Stash.noarch 0:0.34-2.el7 perl-Package-Stash-XS.x86_64 0:0.26-3.el7
Complete!
[root@MHA-Manager softwares]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm --force --nodeps
Preparing... ################################# [100%]
Updating / installing...
1:mha4mysql-manager-0.58-0.el7.cent################################# [100%]
5.MHA管理mysql各个节点
mha-manager 管理节点
[root@MHA-Manager ~]# mkdir /etc/mha mha的安装目录
[root@MHA-Manager ~]# mkdir /home/mysql_mha mha的工作目录
[root@MHA-Manager ~]# cd /etc/mha/
[root@MHA-Manager mha]# ls
MySQL主节点创建监控用户
mysql> create user 'ha_monitor'@'192.168.124.%' identified by '123456';
Query OK, 0 rows affected (0.32 sec)
mysql> grant all privileges on *.* to 'ha_monitor'@'192.168.124.%';
Query OK, 0 rows affected (0.00 sec)
创建mha的配置文件
[root@MHA-Manager mha]# vi app1.cnf
|
[server default]
#授权的manager监控用户和密码 mysql数据库用户
user=ha_monitor
password=123456
#MySQL服务复制用户和密码
repl_user=repl
repl_password=123456
# SSH远程登录用户和端口
SSH_user=root
SSH_port=22
# 检测主Master存活间隔和类型
ping_interval=1
ping_type=SELECT
#管理端log路径
manager_log=/home/mha/manager.log
#管理服务端所使用的目录
manager_workdir=/home/mha/
#远程节点工作目录
remote_workdir=/home/mha/
#数据库binlog目录
Master_binlog_dir=/data/my3306/log/binlog
#故障failover脚本,需在官网的基础上修改配置
Master_ip_failover_script=/etc/mha/Master_ip_failover
Master_ip_online_change_script=/etc/mha/Master_ip_online_change
#检测路由主Master是否宕机
secondary_check_script="masterha_secondary_check -s 192.168.124.101 -s 192.168.124.102"
[server1]
hostname=192.168.124.201
port=3306
candidate_Master=1
[server2]
hostname=192.168.124.202
port=3306
#设置可参数选举为MYSQL Master
candidate_Master=1
#忽略延迟
check_repl_delay=0
[server3]
hostname=192.168.124.203
port=3306
#不参与选举
no_Master=1
|
[root@MHA-Manager ~]# masterha_check_ssh --conf=/etc/mha/app1.conf
|
Tue Aug 4 01:34:10 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug 4 01:34:10 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..
Tue Aug 4 01:34:10 2020 - [info] Reading server configuration from /etc/mha/app1.conf..
Tue Aug 4 01:34:10 2020 - [info] Starting SSH connection tests..
Tue Aug 4 01:34:14 2020 - [debug]
Tue Aug 4 01:34:11 2020 - [debug] Connecting via SSH from root@192.168.124.102(192.168.124.102:22) to root@192.168.124.100(192.168.124.100:22)..
Tue Aug 4 01:34:13 2020 - [debug] ok.
Tue Aug 4 01:34:13 2020 - [debug] Connecting via SSH from root@192.168.124.102(192.168.124.102:22) to root@192.168.124.101(192.168.124.101:22)..
Tue Aug 4 01:34:14 2020 - [debug] ok.
Tue Aug 4 01:34:14 2020 - [debug]
Tue Aug 4 01:34:10 2020 - [debug] Connecting via SSH from root@192.168.124.100(192.168.124.100:22) to root@192.168.124.101(192.168.124.101:22)..
Tue Aug 4 01:34:12 2020 - [debug] ok.
Tue Aug 4 01:34:12 2020 - [debug] Connecting via SSH from root@192.168.124.100(192.168.124.100:22) to root@192.168.124.102(192.168.124.102:22)..
Tue Aug 4 01:34:13 2020 - [debug] ok.
Tue Aug 4 01:34:14 2020 - [debug]
Tue Aug 4 01:34:11 2020 - [debug] Connecting via SSH from root@192.168.124.101(192.168.124.101:22) to root@192.168.124.100(192.168.124.100:22)..
Tue Aug 4 01:34:12 2020 - [debug] ok.
Tue Aug 4 01:34:12 2020 - [debug] Connecting via SSH from root@192.168.124.101(192.168.124.101:22) to root@192.168.124.102(192.168.124.102:22)..
Tue Aug 4 01:34:13 2020 - [debug] ok.
Tue Aug 4 01:34:14 2020 - [info] All SSH connection tests passed successfully.
|
[root@MHA-Manager ~]# masterha_check_repl --conf=/etc/mha/app1.conf
|
[root@MHA-Manager ~]# masterha_check_repl --conf=/etc/mha/app1.conf
Tue Aug 4 02:20:12 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug 4 02:20:12 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..
Tue Aug 4 02:20:12 2020 - [info] Reading server configuration from /etc/mha/app1.conf..
Tue Aug 4 02:20:12 2020 - [info] MHA::MasterMonitor version 0.58.
Tue Aug 4 02:20:14 2020 - [info] GTID failover mode = 1
Tue Aug 4 02:20:14 2020 - [info] Dead Servers:
Tue Aug 4 02:20:14 2020 - [info] Alive Servers:
Tue Aug 4 02:20:14 2020 - [info] 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:20:14 2020 - [info] 192.168.124.101(192.168.124.101:3306)
Tue Aug 4 02:20:14 2020 - [info] 192.168.124.102(192.168.124.102:3306)
Tue Aug 4 02:20:14 2020 - [info] Alive Slaves:
Tue Aug 4 02:20:14 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:20:14 2020 - [info] GTID ON
Tue Aug 4 02:20:14 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:20:14 2020 - [info] Primary candidate for the new Master (candidate_Master is set)
Tue Aug 4 02:20:14 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:20:14 2020 - [info] GTID ON
Tue Aug 4 02:20:14 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:20:14 2020 - [info] Not candidate for the new Master (no_Master is set)
Tue Aug 4 02:20:14 2020 - [info] Current Alive Master: 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:20:14 2020 - [info] Checking Slave configurations..
Tue Aug 4 02:20:14 2020 - [info] read_only=1 is not set on Slave 192.168.124.101(192.168.124.101:3306).
Tue Aug 4 02:20:14 2020 - [info] read_only=1 is not set on Slave 192.168.124.102(192.168.124.102:3306).
Tue Aug 4 02:20:14 2020 - [info] Checking replication filtering settings..
Tue Aug 4 02:20:14 2020 - [info] binlog_do_db= , binlog_ignore_db=
Tue Aug 4 02:20:14 2020 - [info] Replication filtering check ok.
Tue Aug 4 02:20:14 2020 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Tue Aug 4 02:20:14 2020 - [info] Checking SSH publickey authentication settings on the current Master..
Tue Aug 4 02:20:15 2020 - [info] HealthCheck: SSH to 192.168.124.100 is reachable.
Tue Aug 4 02:20:15 2020 - [info]
192.168.124.100(192.168.124.100:3306) (current Master)
+--192.168.124.101(192.168.124.101:3306)
+--192.168.124.102(192.168.124.102:3306)
Tue Aug 4 02:20:15 2020 - [info] Checking replication health on 192.168.124.101..
Tue Aug 4 02:20:15 2020 - [info] ok.
Tue Aug 4 02:20:15 2020 - [info] Checking replication health on 192.168.124.102..
Tue Aug 4 02:20:15 2020 - [info] ok.
Tue Aug 4 02:20:15 2020 - [info] Checking Master_ip_failover_script status:
Tue Aug 4 02:20:15 2020 - [info] /etc/mha/Master_ip_failover --command=status --SSH_user=root --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306
IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===
Checking the Status of the script.. OK
SIOCSIFADDR: No such device
eth0:3: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
Tue Aug 4 02:20:17 2020 - [info] OK.
Tue Aug 4 02:20:17 2020 - [warning] shutdown_script is not defined.
Tue Aug 4 02:20:17 2020 - [info] Got exit code 0 (Not Master dead).
MySQL Replication Health is OK.
|
开启MHA Manager监控
[root@MHA-Manager ~]# nohup masterha_manager --conf=/etc/mha/app1.conf --remove_dead_master_conf --ignore_last_failover < /dev/null > /home/mha/manager.log 2>&1 &
[1] 10135
Tue Aug 4 21:10:48 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug 4 21:10:48 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..
Tue Aug 4 21:10:48 2020 - [info] Reading server configuration from /etc/mha/app1.conf..
检查MHA Manager的状态
[root@MHA-Manager ~]# masterha_check_status --conf=/etc/mha/app1.conf
app1 (pid:15102) is running(0:PING_OK), Master:192.168.124.101
关闭MHA Manager监控
[root@MHA-Manager ~]# masterha_stop --conf=/etc/mha/app1.conf
Stopped app1 successfully.
[2]+ Exit 1 nohup masterha_manager --conf=/etc/mha/app1.conf --remove_dead_master_conf --ignore_last_failover < /dev/null > /home/mha/manager.log 2>&1 (wd: /etc)
(wd now: /etc/mha)
主节点配置虚拟ip
|
[root@MHA-Master-201 softwares]# ifconfig ens33:1 192.168.124.99/24
[root@MHA-Master-201 softwares]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.124.201 netmask 255.255.255.0 broadcast 192.168.124.255
inet6 fe80::20c:29ff:fe8b:c2e2 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:8b:c2:e2 txqueuelen 1000 (Ethernet)
RX packets 6888 bytes 572997 (559.5 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4442 bytes 474278 (463.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.124.99 netmask 255.255.255.0 broadcast 192.168.124.255
ether 00:0c:29:8b:c2:e2 txqueuelen 1000 (Ethernet)
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 131 bytes 15546 (15.1 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 131 bytes 15546 (15.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
|
日志
[root@MHA-Manager ~]# tail -f /home/mha/manager.log
failover演练
关停主节点
[root@MHA-Master ~]# service mysql stop
Shutting down MySQL..................... SUCCESS!
MHA可以自动的 无人值守的从 其他剩余节点中选择新的主节点
|
Tue Aug 4 02:53:19 2020 - [warning] Connection failed 4 time(s)..
Tue Aug 4 02:53:19 2020 - [warning] Master is not reachable from health checker!
Tue Aug 4 02:53:19 2020 - [warning] Master 192.168.124.100(192.168.124.100:3306) is not reachable!
Tue Aug 4 02:53:19 2020 - [warning] SSH is reachable.
Tue Aug 4 02:53:19 2020 - [info] Connecting to a Master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.conf again, and trying to connect to all servers to check server status..
Tue Aug 4 02:53:19 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Aug 4 02:53:19 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..
Tue Aug 4 02:53:19 2020 - [info] Reading server configuration from /etc/mha/app1.conf..
Tue Aug 4 02:53:21 2020 - [info] GTID failover mode = 1
Tue Aug 4 02:53:21 2020 - [info] Dead Servers:
Tue Aug 4 02:53:21 2020 - [info] 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:21 2020 - [info] Alive Servers:
Tue Aug 4 02:53:21 2020 - [info] 192.168.124.101(192.168.124.101:3306)
Tue Aug 4 02:53:21 2020 - [info] 192.168.124.102(192.168.124.102:3306)
Tue Aug 4 02:53:21 2020 - [info] Alive Slaves:
Tue Aug 4 02:53:21 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:21 2020 - [info] GTID ON
Tue Aug 4 02:53:21 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:21 2020 - [info] Primary candidate for the new Master (candidate_Master is set)
Tue Aug 4 02:53:21 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:21 2020 - [info] GTID ON
Tue Aug 4 02:53:21 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:21 2020 - [info] Not candidate for the new Master (no_Master is set)
Tue Aug 4 02:53:21 2020 - [info] Checking Slave configurations..
Tue Aug 4 02:53:21 2020 - [info] read_only=1 is not set on Slave 192.168.124.101(192.168.124.101:3306).
Tue Aug 4 02:53:21 2020 - [info] read_only=1 is not set on Slave 192.168.124.102(192.168.124.102:3306).
Tue Aug 4 02:53:21 2020 - [info] Checking replication filtering settings..
Tue Aug 4 02:53:21 2020 - [info] Replication filtering check ok.
Tue Aug 4 02:53:21 2020 - [info] Master is down!
Tue Aug 4 02:53:21 2020 - [info] Terminating monitoring script.
Tue Aug 4 02:53:21 2020 - [info] Got exit code 20 (Master dead).
Tue Aug 4 02:53:21 2020 - [info] MHA::MasterFailover version 0.58.
Tue Aug 4 02:53:21 2020 - [info] Starting Master failover.
Tue Aug 4 02:53:21 2020 - [info]
Tue Aug 4 02:53:21 2020 - [info] * Phase 1: Configuration Check Phase..
Tue Aug 4 02:53:21 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] GTID failover mode = 1
Tue Aug 4 02:53:22 2020 - [info] Dead Servers:
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] Checking Master reachability via MySQL(double check)...
Tue Aug 4 02:53:22 2020 - [info] ok.
Tue Aug 4 02:53:22 2020 - [info] Alive Servers:
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306)
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306)
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306)
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306)
Tue Aug 4 02:53:22 2020 - [info] Alive Slaves:
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:22 2020 - [info] GTID ON
Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] Primary candidate for the new Master (candidate_Master is set)
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:22 2020 - [info] GTID ON
Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] Not candidate for the new Master (no_Master is set)
Tue Aug 4 02:53:22 2020 - [info] Starting GTID based failover.
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] ** Phase 1: Configuration Check Phase completed.
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] * Phase 2: Dead Master Shutdown Phase..
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] Forcing shutdown so that applications never connect to the current Master..
Tue Aug 4 02:53:22 2020 - [info] Executing Master IP deactivation script:
Tue Aug 4 02:53:22 2020 - [info] /etc/mha/Master_ip_failover --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306 --command=stopSSH --SSH_user=root
IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===
Disabling the VIP on old Master: 192.168.124.100
eth0:3: ERROR while getting interface flags: No such device
Tue Aug 4 02:53:22 2020 - [info] done.
Tue Aug 4 02:53:22 2020 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead Master.
Tue Aug 4 02:53:22 2020 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] * Phase 3: Master Recovery Phase..
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] The latest binary log file/position on all Slaves is log-bin.000031:731
Tue Aug 4 02:53:22 2020 - [info] Retrieved Gtid Set: 6c26c068-a952-11ea-ac80-000c29407100:3-4
Tue Aug 4 02:53:22 2020 - [info] Latest Slaves (Slaves that received relay log files to the latest):
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:22 2020 - [info] GTID ON
Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] Primary candidate for the new Master (candidate_Master is set)
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:22 2020 - [info] GTID ON
Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] The oldest binary log file/position on all Slaves is log-bin.000031:731
Tue Aug 4 02:53:22 2020 - [info] Retrieved Gtid Set: 6c26c068-a952-11ea-ac80-000c29407100:3-4
Tue Aug 4 02:53:22 2020 - [info] Oldest Slaves:
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:22 2020 - [info] GTID ON
Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] Primary candidate for the new Master (candidate_Master is set)
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:22 2020 - [info] GTID ON
Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] Not candidate for the new Master (no_Master is set)
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] * Phase 3.3: Determining New Master Phase..
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] Searching new Master from Slaves..
Tue Aug 4 02:53:22 2020 - [info] Candidate Masters from the configuration file:
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:22 2020 - [info] GTID ON
Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] Primary candidate for the new Master (candidate_Master is set)
Tue Aug 4 02:53:22 2020 - [info] Non-candidate Masters:
Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled
Tue Aug 4 02:53:22 2020 - [info] GTID ON
Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)
Tue Aug 4 02:53:22 2020 - [info] Not candidate for the new Master (no_Master is set)
Tue Aug 4 02:53:22 2020 - [info] Searching from candidate_Master Slaves which have received the latest relay log events..
Tue Aug 4 02:53:22 2020 - [info] New Master is 192.168.124.101(192.168.124.101:3306)
Tue Aug 4 02:53:22 2020 - [info] Starting Master failover..
Tue Aug 4 02:53:22 2020 - [info]
From:
192.168.124.100(192.168.124.100:3306) (current Master)
+--192.168.124.101(192.168.124.101:3306)
+--192.168.124.102(192.168.124.102:3306)
To:
192.168.124.101(192.168.124.101:3306) (new Master)
+--192.168.124.102(192.168.124.102:3306)
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] * Phase 3.3: New Master Recovery Phase..
Tue Aug 4 02:53:22 2020 - [info]
Tue Aug 4 02:53:22 2020 - [info] Waiting all logs to be applied..
Tue Aug 4 02:53:22 2020 - [info] done.
Tue Aug 4 02:53:22 2020 - [info] Getting new Master's binlog name and position..
Tue Aug 4 02:53:22 2020 - [info] log-bin.000032:731
Tue Aug 4 02:53:22 2020 - [info] All other Slaves should start replication from here. Statement should be: CHANGE Master TO Master_HOST='192.168.124.101', Master_PORT=3306, Master_AUTO_POSITION=1, Master_USER='repl', Master_PASSWORD='xxx';
Tue Aug 4 02:53:22 2020 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: log-bin.000032, 731, 6c26c068-a952-11ea-ac80-000c29407100:1-4
Tue Aug 4 02:53:22 2020 - [info] Executing Master IP activate script:
Tue Aug 4 02:53:22 2020 - [info] /etc/mha/Master_ip_failover --command=start --SSH_user=root --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306 --new_Master_host=192.168.124.101 --new_Master_ip=192.168.124.101 --new_Master_port=3306 --new_Master_user='ha_monitor' --new_Master_password=xxx
IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===
The new Master had set read_only=0.
Enabling the VIP - 192.168.124.99 on the new Master - 192.168.124.101
SIOCSIFADDR: No such device
eth0:3: ERROR while getting interface flags: No such device
SIOCSIFNETMASK: No such device
Tue Aug 4 02:53:23 2020 - [info] OK.
Tue Aug 4 02:53:23 2020 - [info] ** Finished Master recovery successfully.
Tue Aug 4 02:53:23 2020 - [info] * Phase 3: Master Recovery Phase completed.
Tue Aug 4 02:53:23 2020 - [info]
Tue Aug 4 02:53:23 2020 - [info] * Phase 4: Slaves Recovery Phase..
Tue Aug 4 02:53:23 2020 - [info]
Tue Aug 4 02:53:23 2020 - [info]
Tue Aug 4 02:53:23 2020 - [info] * Phase 4.1: Starting Slaves in parallel..
Tue Aug 4 02:53:23 2020 - [info]
Tue Aug 4 02:53:23 2020 - [info] -- Slave recovery on host 192.168.124.102(192.168.124.102:3306) started, pid: 12380. Check tmp log /home/mha//192.168.124.102_3306_20200804025321.log if it takes time..
Tue Aug 4 02:53:24 2020 - [info]
Tue Aug 4 02:53:24 2020 - [info] Log messages from 192.168.124.102 ...
Tue Aug 4 02:53:24 2020 - [info]
Tue Aug 4 02:53:23 2020 - [info] Resetting Slave 192.168.124.102(192.168.124.102:3306) and starting replication from the new Master 192.168.124.101(192.168.124.101:3306)..
|
创建时间:2021-12-22 11:29
넶浏览量:0