MHA 实施文档之最佳实践-亲试ok-202008041622-20201020

MHA 实施文档之最佳实践-亲试ok-202008041622

架构图如下：

Master(192.168.124.100)

vip：(192.168.12.499) MHA监控服务器(192.168.124.103)

Slave02 Slave03

(192.168.124.101) (192.168.124.102)

MHA软件由两部分组成：Manager工具包和Node工具包，具体说明如下：

MHA Manager：

1. masterha_check_ssh：检查MHA的SSH配置状况

2. masterha_check_repl：检查MySQL的复制状况

3. masterha_manager：启动MHA

4. masterha_check_status：检测当前MHA运行状态

5. masterha_master_monitor：检测Master是否宕机

6. masterha_master_switch：控制故障转移（自动或手动）

7. masterha_conf_host：添加或删除配置的server信息

8. masterha_stop：关闭MHA

MHA Node:

save_binary_logs：保存或复制Master的二进制日志

apply_diff_relay_logs：识别差异的relay log并将差异的event应用到其它Slave中

filter_mysqlbinlog：去除不必要的ROLLBACK事件（MHA已不再使用这个工具）

purge_relay_logs：消除中继日志（不会堵塞SQL线程）

另有如下几个脚本需自定义：

1. Master_ip_failover：管理VIP

2. Master_ip_online_change：

3. masterha_secondary_check：当MHA manager检测到Master不可用时，通过masterha_secondary_check脚本来进一步确认，减低误切的风险。

4. send_report：当发生故障切换时，可通过send_report脚本发送告警信息。

集群信息

角色 IP地址 ServerID 类型

Master 192.168.124.100 100 写入

Candicate Master 192.168.124.101 101 读

Slave 192.168.124.102 102 读

Monitor host 192.168.124.103 监控集群组

注：操作系统均为CentOS 7.6

其中，Master对外提供写服务，备选Master提供读服务，Slave也提供相关的读服务，一旦Master宕机，将会把备选Master提升为新的Master，Slave指向新的Master

1.操作系统版本

[root@MHA-Master ~]# cat /etc/redhat-release

CentOS Linux release 7.6.1810 (Core)

2.MySQL数据库版本

2.1 MHA-Master节点

[root@MHA-Master ~]# mysql -uroot -p

Enter password:

略

mysql> select version();

+------------+

| version() |

+------------+

| 5.7.26-log |

+------------+

1 row in set (0.10 sec)

2.2 MHA-Slave02节点

[root@MHA-Slave01 ~]# mysql -uroot -p

Enter password:

略

mysql> select version();

+------------+

| version() |

+------------+

| 5.7.26-log |

+------------+

1 row in set (0.08 sec)

2.3 MHA-Slave03

[root@MHA-Slave02 yum.repos.d]# mysql -uroot -p

Enter password:

略

mysql> select version();

+------------+

| version() |

+------------+

| 5.7.26-log |

+------------+

1 row in set (0.09 sec)

2.4 MHA-Manager

~~[root@MHA-Manager ~]# mysql -uroot -p~~

~~Enter password:~~

~~mysql> select version();~~

~~+------------+~~

~~| version() |~~

~~+------------+~~

| ~~5.7.26-log~~ |

~~+------------+~~

~~1 row in set (0.07 sec)~~

3.配置GTID复制模式的主从架构

3.1修改 server_uuid 和 server_id

由于三台虚拟机都是复制过来的，因此必须修改本地auto.cnf 文件中的 server_uuid

[root@MHA-Master ~]# vi /data/my3306/data/auto.cnf

[auto]

server-uuid=6c26c068-a952-11ea-ac80-000c29407100

[root@MHA-Slave01 data]# vi auto.cnf

[auto]

server-uuid=6c26c068-a952-11ea-ac80-000c29407101

[root@MHA-Slave02 yum.repos.d]# vi /data/my3306/data/auto.cnf

[auto]

server-uuid=6c26c068-a952-11ea-ac80-000c29407102

同理 server_id

[root@MHA-Slave01 data]# vi /etc/my.cnf

# GENERAL #

lower_case_table_names=1

server_id = 101

[root@MHA-Slave02 yum.repos.d]# vi /etc/my.cnf

# GENERAL #

lower_case_table_names=1

server_id = 102

3.2 启用GTID的复制模式

MHA-Master启用gtid模式

mysql> stop Slave;

Query OK, 0 rows affected, 1 warning (0.00 sec)

mysql> set global enforce_gtid_consistency=true;

Query OK, 0 rows affected (0.00 sec)

mysql> set global gtid_mode=off_permissive;

Query OK, 0 rows affected (0.01 sec)

mysql> set global gtid_mode=on_permissive;

Query OK, 0 rows affected (0.01 sec)

mysql> set global gtid_mode=on;

Query OK, 0 rows affected (0.00 sec)

分别在从节点启用gtid并指向主节点开启Slave

mysql> change Master to Master_host='192.168.124.201',Master_port=3306,Master_user='repl',Master_password='123456',master_auto_position = 1;

Query OK, 0 rows affected, 2 warnings (0.01 sec)

mysql> start Slave;

Query OK, 0 rows affected (0.00 sec)

设置relay log清除方式（在每个Slave上）

set global relay_log_purge=0;

3.3 创建复制用户

mysql> create user 'repl'@'192.168.124.%' identified by '123456';

Query OK, 0 rows affected (0.32 sec)

mysql> grant replication Slave on *.* to 'repl'@'192.168.124.%';

Query OK, 0 rows affected (0.00 sec)

Master 创建用户，该用户会自动同步至其他两台主机

3.4 分别在三主机建立SSH的免认证登陆

Master

[root@MHA-Master ~]# SSH-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.SSH/id_rsa); 分别把该文件拷贝至三个主机包括本地哈哈

Created directory '/root/.SSH'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.SSH/id_rsa.

Your public key has been saved in /root/.SSH/id_rsa.pub.

The key fingerprint is:

SHA256:dSJpYSwanC+07ELkWyajwF3NPpsy+MAaqbLSCEVtAB8 root@MHA-Master

The key's randomart image is:

+---[RSA 2048]----+

|..E+ .o.o |

| .o.B..+.o |

|.+o+.=..+ o . |

|..*.B .+ o o |

|.+oB.. S |

|ooo+.o o |

|ooo.o o |

|=.. . |

|+. |

+----[SHA256]-----+

[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100

[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101

[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102

[root@MHA-Master ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103

MHA-Slave01

[root@MHA-Slave01 data]# SSH-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.SSH/id_rsa):

Created directory '/root/.SSH'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.SSH/id_rsa.

Your public key has been saved in /root/.SSH/id_rsa.pub.

The key fingerprint is:

SHA256:8J5yqFgyYfXQa642uX3b8PVLVewVhlnIEq6TCxg4nP4 root@MHA-Slave02

The key's randomart image is:

+---[RSA 2048]----+

| .o =+ |

| . o. .. =...|

| =o.o .. +|

| ...oo+ o .o|

| o. .+.S+ o|

| . ..o o..o . |

| o .E+ =. . . |

| =++ o.+ . o |

| ..+o....o o. |

+----[SHA256]-----+

[root@MHA-Slave01 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100

[root@MHA-Slave01~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101

[root@MHA-Slave01~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102

[root@MHA-Slave01 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103

MHA-Slave02

[root@MHA-Slave02 yum.repos.d]# SSH-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.SSH/id_rsa):

Created directory '/root/.SSH'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.SSH/id_rsa.

Your public key has been saved in /root/.SSH/id_rsa.pub.

The key fingerprint is:

SHA256:QF3o6ziZw12u7ihQ1OpdTkollbIjzWSzGydXf0WBEvE root@MHA-Slave03

The key's randomart image is:

+---[RSA 2048]----+

| ..o.+.oo .oo|

| ..B = .... .|

| . *.O . ..E . |

| + X.* . . |

| o + %S. . |

| . . + o . |

| . . * o |

| . B.o . |

| ..=+. |

+----[SHA256]-----+

[root@MHA-Slave02 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100

[root@MHA-Slave02~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101

[root@MHA-Slave02~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102

[root@MHA-Slave02 ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103

MHA-Manager

[root@MHA-Manager ~]# SSH-keygen

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.SSH/id_rsa):

Created directory '/root/.SSH'.

Enter passphrase (empty for no passphrase):

Enter same passphrase again:

Your identification has been saved in /root/.SSH/id_rsa.

Your public key has been saved in /root/.SSH/id_rsa.pub.

The key fingerprint is:

SHA256:f5LkhocdeJE9d6lQgH5nEslGP5lOwYx5u5wHo8yeo8U root@MHA-Manager

The key's randomart image is:

+---[RSA 2048]----+

| +oBo |

| . Xo++ .|

| . +.=*o..|

| o ++Oo. |

| S B *o= |

| O.* + . |

| o XEo . |

| o.* |

| .. . |

+----[SHA256]-----+

[root@MHA-Manager ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.100

[root@MHA-Manager~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.101

[root@MHA-Manager~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.102

[root@MHA-Manager ~]# SSH-copy-id -i /root/.SSH/id_rsa root@192.168.124.103

MHA-Master 将秘钥拷贝至两个从节点和manager主机包括 manager主机

分别登陆两个主机测试

[root@MHA-Master ~]# SSH 192.168.124.102

Last login: Mon Aug 3 05:14:04 2020 from 192.168.124.9

[root@MHA-Slave02 ~]# exit

logout

Connection to 192.168.124.102 closed.

[root@MHA-Master ~]# SSH 192.168.124.101

Last login: Mon Aug 3 05:13:06 2020 from 192.168.124.9

[root@MHA-Slave01 ~]# exit

logout

Connection to 192.168.124.101 closed.

4.安装mha-manager 和mha-node软件包

安装mha4mysql-node-0.58-0.el7.noarch.rpm

4.1 从下面地址下载软件包

[root@MHA-Master ~]# wget \

> https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm

[root@MHA-Master ~]# wget \

> https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm

mysql 节点都需要安装 mysql-node 软件包，管理节点上需要安装 mysql-manager 软件包

4.1.1 安装安装 mysql-node 软件包

[root@MHA-Master ~]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 192.168.124.102:~

root@192.168.124.102's password:

mha4mysql-node-0.58-0.el7.centos.noarch.rpm

[root@MHA-Master yum.repos.d]# yum install -y perl-DBD-MySQL ncftp perl-DBI.x86

Loaded plugins: fastestmirror, langpacks

Loading mirror speeds from cached hostfile

* base: mirrors.aliyun.com

* extras: mirrors.aliyun.com

* updates: mirrors.aliyun.com

Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version

No package ncftp available.

No package perl-DBI.x86 available.

Nothing to do

报错找不到，哈哈

需要运行

[root@MHA-Master yum.repos.d]# yum install epel-release -y

Loaded plugins: fastestmirror, langpacks

Loading mirror speeds from cached hostfile

再次安装

[root@MHA-Master yum.repos.d]# yum install -y perl-DBD-MySQL ncftp perl-DBI.x86

Loaded plugins: fastestmirror, langpacks

Loading mirror speeds from cached hostfile

* base: mirrors.aliyun.com

* epel: mirrors.tuna.tsinghua.edu.cn

* extras: mirrors.aliyun.com

* updates: mirrors.aliyun.com

Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version

Package 2:ncftp-3.2.5-7.el7.x86_64 already installed and latest version

No package perl-DBI.x86 available.

Nothing to do

dbi包找不到

[root@MHA-Master ~]# rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm

Preparing... ################################# [100%]

Updating / installing...

1:mha4mysql-node-0.58-0.el7.centos ################################# [100%]

[root@MHA-Master ~]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm --nodeps --force

Preparing... ################################# [100%]

Updating / installing...

1:mha4mysql-manager-0.58-0.el7.cent################################# [100%]

[root@MHA-Master ~]# scp mha4mysql-node-0.58-0.el7.centos.noarch.rpm 192.168.124.101:~

root@192.168.124.101's password:

mha4mysql-node-0.58-0.el7.centos.noarch.rpm 100% 35KB 656.2KB/s 00:00

4.4.2 安装 mysql-manager 软件包

[root@MHA-Manager softwares]# yum install perl-Config-Tiny.noarch perl-Time-HiRes.x86_64 perl-Parallel-ForkManager perl-Log-Dispatch-Perl.noarch perl-DBD-Mysql ncftp

略

QL ncftp

Loaded plugins: fastestmirror, langpacks

Loading mirror speeds from cached hostfile

* base: mirrors.aliyun.com

* epel: mirrors.bfsu.edu.cn

* extras: mirrors.aliyun.com

* updates: mirrors.aliyun.com

Package perl-Config-Tiny-2.14-7.el7.noarch already installed and latest version

Package 4:perl-Time-HiRes-1.9725-3.el7.x86_64 already installed and latest version

Package perl-Parallel-ForkManager-1.18-2.el7.noarch already installed and latest version

No package perl-Log-Dispatch-Perl.noarch available.

Package perl-DBD-MySQL-4.023-6.el7.x86_64 already installed and latest version

Package 2:ncftp

费了一番周折愣是装不上 perl-Log-Dispatch，重启机器搞定

[root@MHA-Manager ~]# yum install perl-Log-Dispatch

Loaded plugins: fastestmirror, langpacks

Loading mirror speeds from cached hostfile

。。。。。。

Installed:

perl-Log-Dispatch.noarch 0:2.41-1.el7.1

Dependency Installed:

perl-Class-Load.noarch 0:0.20-3.el7 perl-Email-Date-Format.noarch 0:1.002-15.el7 perl-MIME-Lite.noarch 0:3.030-1.el7

perl-Mail-Sendmail.noarch 0:0.79-21.el7 perl-MailTools.noarch 0:2.12-2.el7 perl-Net-SMTP-SSL.noarch 0:1.01-13.el7

perl-Package-Stash.noarch 0:0.34-2.el7 perl-Package-Stash-XS.x86_64 0:0.26-3.el7

Complete!

[root@MHA-Manager softwares]# rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm --force --nodeps

Preparing... ################################# [100%]

Updating / installing...

1:mha4mysql-manager-0.58-0.el7.cent################################# [100%]

5.MHA管理mysql各个节点

mha-manager 管理节点

[root@MHA-Manager ~]# mkdir /etc/mha mha的安装目录

[root@MHA-Manager ~]# mkdir /home/mysql_mha mha的工作目录

[root@MHA-Manager ~]# cd /etc/mha/

[root@MHA-Manager mha]# ls

MySQL主节点创建监控用户

mysql> create user 'ha_monitor'@'192.168.124.%' identified by '123456';

Query OK, 0 rows affected (0.32 sec)

mysql> grant all privileges on *.* to 'ha_monitor'@'192.168.124.%';

Query OK, 0 rows affected (0.00 sec)

创建mha的配置文件

[root@MHA-Manager mha]# vi app1.cnf

[server default]

#授权的manager监控用户和密码 mysql数据库用户

user=ha_monitor

password=123456

#MySQL服务复制用户和密码

repl_user=repl

repl_password=123456

# SSH远程登录用户和端口

SSH_user=root

SSH_port=22

# 检测主Master存活间隔和类型

ping_interval=1

ping_type=SELECT

#管理端log路径

manager_log=/home/mha/manager.log

#管理服务端所使用的目录

manager_workdir=/home/mha/

#远程节点工作目录

remote_workdir=/home/mha/

#数据库binlog目录

Master_binlog_dir=/data/my3306/log/binlog

#故障failover脚本，需在官网的基础上修改配置

Master_ip_failover_script=/etc/mha/Master_ip_failover

Master_ip_online_change_script=/etc/mha/Master_ip_online_change

#检测路由主Master是否宕机

secondary_check_script="masterha_secondary_check -s 192.168.124.101 -s 192.168.124.102"

[server1]

hostname=192.168.124.201

port=3306

candidate_Master=1

[server2]

hostname=192.168.124.202

port=3306

#设置可参数选举为MYSQL Master

candidate_Master=1

#忽略延迟

check_repl_delay=0

[server3]

hostname=192.168.124.203

port=3306

#不参与选举

no_Master=1

[root@MHA-Manager ~]# masterha_check_ssh --conf=/etc/mha/app1.conf

Tue Aug 4 01:34:10 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Tue Aug 4 01:34:10 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..

Tue Aug 4 01:34:10 2020 - [info] Reading server configuration from /etc/mha/app1.conf..

Tue Aug 4 01:34:10 2020 - [info] Starting SSH connection tests..

Tue Aug 4 01:34:14 2020 - [debug]

Tue Aug 4 01:34:11 2020 - [debug] Connecting via SSH from root@192.168.124.102(192.168.124.102:22) to root@192.168.124.100(192.168.124.100:22)..

Tue Aug 4 01:34:13 2020 - [debug] ok.

Tue Aug 4 01:34:13 2020 - [debug] Connecting via SSH from root@192.168.124.102(192.168.124.102:22) to root@192.168.124.101(192.168.124.101:22)..

Tue Aug 4 01:34:14 2020 - [debug] ok.

Tue Aug 4 01:34:14 2020 - [debug]

Tue Aug 4 01:34:10 2020 - [debug] Connecting via SSH from root@192.168.124.100(192.168.124.100:22) to root@192.168.124.101(192.168.124.101:22)..

Tue Aug 4 01:34:12 2020 - [debug] ok.

Tue Aug 4 01:34:12 2020 - [debug] Connecting via SSH from root@192.168.124.100(192.168.124.100:22) to root@192.168.124.102(192.168.124.102:22)..

Tue Aug 4 01:34:13 2020 - [debug] ok.

Tue Aug 4 01:34:14 2020 - [debug]

Tue Aug 4 01:34:11 2020 - [debug] Connecting via SSH from root@192.168.124.101(192.168.124.101:22) to root@192.168.124.100(192.168.124.100:22)..

Tue Aug 4 01:34:12 2020 - [debug] ok.

Tue Aug 4 01:34:12 2020 - [debug] Connecting via SSH from root@192.168.124.101(192.168.124.101:22) to root@192.168.124.102(192.168.124.102:22)..

Tue Aug 4 01:34:13 2020 - [debug] ok.

Tue Aug 4 01:34:14 2020 - [info] All SSH connection tests passed successfully.

[root@MHA-Manager ~]# masterha_check_repl --conf=/etc/mha/app1.conf

Tue Aug 4 02:20:12 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Tue Aug 4 02:20:12 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..

Tue Aug 4 02:20:12 2020 - [info] Reading server configuration from /etc/mha/app1.conf..

Tue Aug 4 02:20:12 2020 - [info] MHA::MasterMonitor version 0.58.

Tue Aug 4 02:20:14 2020 - [info] GTID failover mode = 1

Tue Aug 4 02:20:14 2020 - [info] Dead Servers:

Tue Aug 4 02:20:14 2020 - [info] Alive Servers:

Tue Aug 4 02:20:14 2020 - [info] 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:20:14 2020 - [info] 192.168.124.101(192.168.124.101:3306)

Tue Aug 4 02:20:14 2020 - [info] 192.168.124.102(192.168.124.102:3306)

Tue Aug 4 02:20:14 2020 - [info] Alive Slaves:

Tue Aug 4 02:20:14 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:20:14 2020 - [info] GTID ON

Tue Aug 4 02:20:14 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:20:14 2020 - [info] Primary candidate for the new Master (candidate_Master is set)

Tue Aug 4 02:20:14 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:20:14 2020 - [info] GTID ON

Tue Aug 4 02:20:14 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:20:14 2020 - [info] Not candidate for the new Master (no_Master is set)

Tue Aug 4 02:20:14 2020 - [info] Current Alive Master: 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:20:14 2020 - [info] Checking Slave configurations..

Tue Aug 4 02:20:14 2020 - [info] read_only=1 is not set on Slave 192.168.124.101(192.168.124.101:3306).

Tue Aug 4 02:20:14 2020 - [info] read_only=1 is not set on Slave 192.168.124.102(192.168.124.102:3306).

Tue Aug 4 02:20:14 2020 - [info] Checking replication filtering settings..

Tue Aug 4 02:20:14 2020 - [info] binlog_do_db= , binlog_ignore_db=

Tue Aug 4 02:20:14 2020 - [info] Replication filtering check ok.

Tue Aug 4 02:20:14 2020 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.

Tue Aug 4 02:20:14 2020 - [info] Checking SSH publickey authentication settings on the current Master..

Tue Aug 4 02:20:15 2020 - [info] HealthCheck: SSH to 192.168.124.100 is reachable.

Tue Aug 4 02:20:15 2020 - [info]

192.168.124.100(192.168.124.100:3306) (current Master)

+--192.168.124.101(192.168.124.101:3306)

+--192.168.124.102(192.168.124.102:3306)

Tue Aug 4 02:20:15 2020 - [info] Checking replication health on 192.168.124.101..

Tue Aug 4 02:20:15 2020 - [info] ok.

Tue Aug 4 02:20:15 2020 - [info] Checking replication health on 192.168.124.102..

Tue Aug 4 02:20:15 2020 - [info] ok.

Tue Aug 4 02:20:15 2020 - [info] Checking Master_ip_failover_script status:

Tue Aug 4 02:20:15 2020 - [info] /etc/mha/Master_ip_failover --command=status --SSH_user=root --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306

IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===

Checking the Status of the script.. OK

SIOCSIFADDR: No such device

eth0:3: ERROR while getting interface flags: No such device

SIOCSIFNETMASK: No such device

Tue Aug 4 02:20:17 2020 - [info] OK.

Tue Aug 4 02:20:17 2020 - [warning] shutdown_script is not defined.

Tue Aug 4 02:20:17 2020 - [info] Got exit code 0 (Not Master dead).

MySQL Replication Health is OK.

开启MHA Manager监控

[root@MHA-Manager ~]# nohup masterha_manager --conf=/etc/mha/app1.conf --remove_dead_master_conf --ignore_last_failover < /dev/null > /home/mha/manager.log 2>&1 &

[1] 10135

Tue Aug 4 21:10:48 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Tue Aug 4 21:10:48 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..

Tue Aug 4 21:10:48 2020 - [info] Reading server configuration from /etc/mha/app1.conf..

检查MHA Manager的状态

[root@MHA-Manager ~]# masterha_check_status --conf=/etc/mha/app1.conf

app1 (pid:15102) is running(0:PING_OK), Master:192.168.124.101

关闭MHA Manager监控

[root@MHA-Manager ~]# masterha_stop --conf=/etc/mha/app1.conf

Stopped app1 successfully.

[2]+ Exit 1 nohup masterha_manager --conf=/etc/mha/app1.conf --remove_dead_master_conf --ignore_last_failover < /dev/null > /home/mha/manager.log 2>&1 (wd: /etc)

(wd now: /etc/mha)

主节点配置虚拟ip

[root@MHA-Master-201 softwares]# ifconfig ens33:1 192.168.124.99/24

[root@MHA-Master-201 softwares]# ifconfig

ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500

inet 192.168.124.201 netmask 255.255.255.0 broadcast 192.168.124.255

inet6 fe80::20c:29ff:fe8b:c2e2 prefixlen 64 scopeid 0x20<link>

ether 00:0c:29:8b:c2:e2 txqueuelen 1000 (Ethernet)

RX packets 6888 bytes 572997 (559.5 KiB)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 4442 bytes 474278 (463.1 KiB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500

inet 192.168.124.99 netmask 255.255.255.0 broadcast 192.168.124.255

ether 00:0c:29:8b:c2:e2 txqueuelen 1000 (Ethernet)

lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536

inet 127.0.0.1 netmask 255.0.0.0

inet6 ::1 prefixlen 128 scopeid 0x10<host>

loop txqueuelen 1000 (Local Loopback)

RX packets 131 bytes 15546 (15.1 KiB)

RX errors 0 dropped 0 overruns 0 frame 0

TX packets 131 bytes 15546 (15.1 KiB)

TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

日志

[root@MHA-Manager ~]# tail -f /home/mha/manager.log

failover演练

关停主节点

[root@MHA-Master ~]# service mysql stop

Shutting down MySQL..................... SUCCESS!

MHA可以自动的无人值守的从其他剩余节点中选择新的主节点

Tue Aug 4 02:53:19 2020 - [warning] Connection failed 4 time(s)..

Tue Aug 4 02:53:19 2020 - [warning] Master is not reachable from health checker!

Tue Aug 4 02:53:19 2020 - [warning] Master 192.168.124.100(192.168.124.100:3306) is not reachable!

Tue Aug 4 02:53:19 2020 - [warning] SSH is reachable.

Tue Aug 4 02:53:19 2020 - [info] Connecting to a Master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha/app1.conf again, and trying to connect to all servers to check server status..

Tue Aug 4 02:53:19 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

Tue Aug 4 02:53:19 2020 - [info] Reading application default configuration from /etc/mha/app1.conf..

Tue Aug 4 02:53:19 2020 - [info] Reading server configuration from /etc/mha/app1.conf..

Tue Aug 4 02:53:21 2020 - [info] GTID failover mode = 1

Tue Aug 4 02:53:21 2020 - [info] Dead Servers:

Tue Aug 4 02:53:21 2020 - [info] 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:21 2020 - [info] Alive Servers:

Tue Aug 4 02:53:21 2020 - [info] 192.168.124.101(192.168.124.101:3306)

Tue Aug 4 02:53:21 2020 - [info] 192.168.124.102(192.168.124.102:3306)

Tue Aug 4 02:53:21 2020 - [info] Alive Slaves:

Tue Aug 4 02:53:21 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:21 2020 - [info] GTID ON

Tue Aug 4 02:53:21 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:21 2020 - [info] Primary candidate for the new Master (candidate_Master is set)

Tue Aug 4 02:53:21 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:21 2020 - [info] GTID ON

Tue Aug 4 02:53:21 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:21 2020 - [info] Not candidate for the new Master (no_Master is set)

Tue Aug 4 02:53:21 2020 - [info] Checking Slave configurations..

Tue Aug 4 02:53:21 2020 - [info] read_only=1 is not set on Slave 192.168.124.101(192.168.124.101:3306).

Tue Aug 4 02:53:21 2020 - [info] read_only=1 is not set on Slave 192.168.124.102(192.168.124.102:3306).

Tue Aug 4 02:53:21 2020 - [info] Checking replication filtering settings..

Tue Aug 4 02:53:21 2020 - [info] Replication filtering check ok.

Tue Aug 4 02:53:21 2020 - [info] Master is down!

Tue Aug 4 02:53:21 2020 - [info] Terminating monitoring script.

Tue Aug 4 02:53:21 2020 - [info] Got exit code 20 (Master dead).

Tue Aug 4 02:53:21 2020 - [info] MHA::MasterFailover version 0.58.

Tue Aug 4 02:53:21 2020 - [info] Starting Master failover.

Tue Aug 4 02:53:21 2020 - [info]

Tue Aug 4 02:53:21 2020 - [info] * Phase 1: Configuration Check Phase..

Tue Aug 4 02:53:21 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] GTID failover mode = 1

Tue Aug 4 02:53:22 2020 - [info] Dead Servers:

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] Checking Master reachability via MySQL(double check)...

Tue Aug 4 02:53:22 2020 - [info] ok.

Tue Aug 4 02:53:22 2020 - [info] Alive Servers:

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306)

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306)

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306)

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306)

Tue Aug 4 02:53:22 2020 - [info] Alive Slaves:

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:22 2020 - [info] GTID ON

Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] Primary candidate for the new Master (candidate_Master is set)

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:22 2020 - [info] GTID ON

Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] Not candidate for the new Master (no_Master is set)

Tue Aug 4 02:53:22 2020 - [info] Starting GTID based failover.

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] ** Phase 1: Configuration Check Phase completed.

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] * Phase 2: Dead Master Shutdown Phase..

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] Forcing shutdown so that applications never connect to the current Master..

Tue Aug 4 02:53:22 2020 - [info] Executing Master IP deactivation script:

Tue Aug 4 02:53:22 2020 - [info] /etc/mha/Master_ip_failover --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306 --command=stopSSH --SSH_user=root

IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===

Disabling the VIP on old Master: 192.168.124.100

eth0:3: ERROR while getting interface flags: No such device

Tue Aug 4 02:53:22 2020 - [info] done.

Tue Aug 4 02:53:22 2020 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead Master.

Tue Aug 4 02:53:22 2020 - [info] * Phase 2: Dead Master Shutdown Phase completed.

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] * Phase 3: Master Recovery Phase..

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] * Phase 3.1: Getting Latest Slaves Phase..

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] The latest binary log file/position on all Slaves is log-bin.000031:731

Tue Aug 4 02:53:22 2020 - [info] Retrieved Gtid Set: 6c26c068-a952-11ea-ac80-000c29407100:3-4

Tue Aug 4 02:53:22 2020 - [info] Latest Slaves (Slaves that received relay log files to the latest):

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:22 2020 - [info] GTID ON

Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] Primary candidate for the new Master (candidate_Master is set)

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:22 2020 - [info] GTID ON

Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] The oldest binary log file/position on all Slaves is log-bin.000031:731

Tue Aug 4 02:53:22 2020 - [info] Retrieved Gtid Set: 6c26c068-a952-11ea-ac80-000c29407100:3-4

Tue Aug 4 02:53:22 2020 - [info] Oldest Slaves:

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:22 2020 - [info] GTID ON

Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] Primary candidate for the new Master (candidate_Master is set)

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:22 2020 - [info] GTID ON

Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] Not candidate for the new Master (no_Master is set)

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] * Phase 3.3: Determining New Master Phase..

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] Searching new Master from Slaves..

Tue Aug 4 02:53:22 2020 - [info] Candidate Masters from the configuration file:

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.101(192.168.124.101:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:22 2020 - [info] GTID ON

Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] Primary candidate for the new Master (candidate_Master is set)

Tue Aug 4 02:53:22 2020 - [info] Non-candidate Masters:

Tue Aug 4 02:53:22 2020 - [info] 192.168.124.102(192.168.124.102:3306) Version=5.7.26-log (oldest major version between Slaves) log-bin:enabled

Tue Aug 4 02:53:22 2020 - [info] GTID ON

Tue Aug 4 02:53:22 2020 - [info] Replicating from 192.168.124.100(192.168.124.100:3306)

Tue Aug 4 02:53:22 2020 - [info] Not candidate for the new Master (no_Master is set)

Tue Aug 4 02:53:22 2020 - [info] Searching from candidate_Master Slaves which have received the latest relay log events..

Tue Aug 4 02:53:22 2020 - [info] New Master is 192.168.124.101(192.168.124.101:3306)

Tue Aug 4 02:53:22 2020 - [info] Starting Master failover..

Tue Aug 4 02:53:22 2020 - [info]

From:

192.168.124.100(192.168.124.100:3306) (current Master)

+--192.168.124.101(192.168.124.101:3306)

+--192.168.124.102(192.168.124.102:3306)

To:

192.168.124.101(192.168.124.101:3306) (new Master)

+--192.168.124.102(192.168.124.102:3306)

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] * Phase 3.3: New Master Recovery Phase..

Tue Aug 4 02:53:22 2020 - [info]

Tue Aug 4 02:53:22 2020 - [info] Waiting all logs to be applied..

Tue Aug 4 02:53:22 2020 - [info] done.

Tue Aug 4 02:53:22 2020 - [info] Getting new Master's binlog name and position..

Tue Aug 4 02:53:22 2020 - [info] log-bin.000032:731

Tue Aug 4 02:53:22 2020 - [info] All other Slaves should start replication from here. Statement should be: CHANGE Master TO Master_HOST='192.168.124.101', Master_PORT=3306, Master_AUTO_POSITION=1, Master_USER='repl', Master_PASSWORD='xxx';

Tue Aug 4 02:53:22 2020 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: log-bin.000032, 731, 6c26c068-a952-11ea-ac80-000c29407100:1-4

Tue Aug 4 02:53:22 2020 - [info] Executing Master IP activate script:

Tue Aug 4 02:53:22 2020 - [info] /etc/mha/Master_ip_failover --command=start --SSH_user=root --orig_Master_host=192.168.124.100 --orig_Master_ip=192.168.124.100 --orig_Master_port=3306 --new_Master_host=192.168.124.101 --new_Master_ip=192.168.124.101 --new_Master_port=3306 --new_Master_user='ha_monitor' --new_Master_password=xxx

IN SCRIPT TEST====/sbin/ifconfig eth0:3 down==/sbin/ifconfig eth0:3 192.168.124.99 netmask 255.255.255.255 ;/sbin/arping -I eth0 -c 3 -s 192.168.124.99 192.168.124.1 >/dev/null 2>&1===

The new Master had set read_only=0.

Enabling the VIP - 192.168.124.99 on the new Master - 192.168.124.101

SIOCSIFADDR: No such device

eth0:3: ERROR while getting interface flags: No such device

SIOCSIFNETMASK: No such device

Tue Aug 4 02:53:23 2020 - [info] OK.

Tue Aug 4 02:53:23 2020 - [info] ** Finished Master recovery successfully.

Tue Aug 4 02:53:23 2020 - [info] * Phase 3: Master Recovery Phase completed.

Tue Aug 4 02:53:23 2020 - [info]

Tue Aug 4 02:53:23 2020 - [info] * Phase 4: Slaves Recovery Phase..

Tue Aug 4 02:53:23 2020 - [info]

Tue Aug 4 02:53:23 2020 - [info] * Phase 4.1: Starting Slaves in parallel..

Tue Aug 4 02:53:23 2020 - [info]

Tue Aug 4 02:53:23 2020 - [info] -- Slave recovery on host 192.168.124.102(192.168.124.102:3306) started, pid: 12380. Check tmp log /home/mha//192.168.124.102_3306_20200804025321.log if it takes time..

Tue Aug 4 02:53:24 2020 - [info]

Tue Aug 4 02:53:24 2020 - [info] Log messages from 192.168.124.102 ...

Tue Aug 4 02:53:24 2020 - [info]

Tue Aug 4 02:53:23 2020 - [info] Resetting Slave 192.168.124.102(192.168.124.102:3306) and starting replication from the new Master 192.168.124.101(192.168.124.101:3306)..

创建时间：2021-12-22 11:29

넶浏览量：0

ꄴ前一个：无

ꄲ后一个：无

MHA 实施文档之最佳实践-亲试ok-202008041622-20201020

全部评论 (0)