Akemi

Redis Sentinel Mode

2024/09/22

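As covered in the introduction to master-slave replication, when the master database fails you have to promote one of the slaves to master by hand, and when the original master comes back up, reconfigure it as a slave so that it syncs data from the new master.

Since version 2.8, Redis has provided Sentinel mode to automate monitoring and failure recovery. It is the officially recommended architecture.
1. Monitor whether the master and the slaves are working normally
2. When the master fails, elect a slave as the new master; the master address followed by the other slaves is automatically changed to the new master's address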

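Sentinel mode concepts
Each sentinel node is itself a Redis process, running in a special sentinel mode
While monitoring the Redis nodes, the sentinels also monitor each other
Sentinel nodes can form a cluster of their own, and one group of sentinels can monitor several Redis master-slave groups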

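How Sentinel mode works
When a sentinel node decides the master is down (subjectively down, sdown), it asks the other sentinel nodes
The other nodes check the master and, if they also see it as down, reply to that sentinel
Once the number of sentinels reporting it down reaches the configured quorum, the master is marked objectively down (odown); the sentinels then elect a leader, which needs votes from a majority of them, to carry out the failover (the failover condition)
The leader picks one of the slaves and sends it SLAVEOF NO ONE, which breaks the replication link and promotes it to master
It then sends the new master's information to the other slaves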
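
The two thresholds in the steps above are easy to mix up, so here is a minimal sketch of the arithmetic. The helper names `majority` and `can_failover` are hypothetical and not part of Redis; they only restate the rule that marking the master objectively down needs at least `quorum` agreeing sentinels, while the leader that performs the failover needs votes from a majority of all sentinels (and never fewer than `quorum`).

```shell
#!/bin/sh
# Hypothetical helpers, not Redis commands: they only restate the thresholds.

# Smallest number of sentinels that forms a majority of TOTAL sentinels.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# can_failover TOTAL QUORUM AGREE_DOWN VOTES
#   AGREE_DOWN: sentinels that report the master as down
#   VOTES:      sentinels that voted for the would-be leader
# Prints "yes" when both thresholds are met, otherwise "no".
can_failover() {
    total=$1; quorum=$2; agree=$3; votes=$4
    if [ "$agree" -ge "$quorum" ] \
        && [ "$votes" -ge "$(majority "$total")" ] \
        && [ "$votes" -ge "$quorum" ]; then
        echo yes
    else
        echo no
    fi
}

can_failover 3 2 2 2   # quorum reached, 2 of 3 sentinels voted
can_failover 3 2 1 3   # only one sentinel sees the master as down
can_failover 5 2 2 2   # quorum reached, but 2 votes are not a majority of 5
```

With three sentinels and a quorum of 2, as in the setup later in this post, losing one sentinel still leaves enough voters for a failover.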

Rules for electing the new master from multiple slaves
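
As far as the Redis Sentinel documentation describes it, the candidates are ordered by slave priority (lower wins, and a priority of 0 is never promoted), then by replication offset (higher wins), then by run ID (lexicographically smaller wins). The sketch below only mimics that ordering on text input. The function name `pick_slave`, the shortened run IDs and the host 192.168.10.119 are made up for illustration, and the real selection also filters out slaves that have been disconnected for too long.

```shell
#!/bin/sh
# pick_slave: read lines of "priority offset runid address" on stdin and
# print the address of the preferred candidate (illustrative only).
pick_slave() {
    awk '$1 > 0' \
        | sort -k1,1n -k2,2nr -k3,3 \
        | head -n 1 \
        | awk '{print $4}'
}

printf '%s\n' \
    '100 85291 bbbb 192.168.10.117:8000' \
    '100 85291 aaaa 192.168.10.118:8000' \
    '0 90000 cccc 192.168.10.119:8000' | pick_slave
```

Here the third candidate is skipped because its priority is 0, and the first two tie on priority and offset, so the smaller run ID decides.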

Setting up Sentinel mode

Configuration files

Environment:
centos7.9
redis5.0.5
192.168.10.116 redis1
192.168.10.117 redis2
192.168.10.118 redis3

# Configure Redis itself: redis.conf
redis1 (master node):
port 8000
daemonize yes
bind 0.0.0.0
pidfile /var/run/redis-8000.pid
logfile /var/log/redis/redis-8000.log

redis2 and redis3 (slave nodes):
port 8000
daemonize yes
bind 0.0.0.0
pidfile /var/run/redis-8000.pid
logfile /var/log/redis/redis-8000.log
slaveof 192.168.10.116 8000

# Start
redis-server /etc/redis.conf

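# Configure the sentinel on every host: sentinel.conf
daemonize yes
port 7000
logfile /var/log/redis/sentinel.log
pidfile /var/run/sentinel.pid
# sentinel monitor <name of the monitored master group> <master ip> <master port> <quorum>
# quorum: how many sentinels must consider the master down before a failover is triggered, usually more than half of them
sentinel monitor redismaster 192.168.10.116 8000 2
# How long (ms) a node must be unreachable before the sentinel considers it down
sentinel down-after-milliseconds redismaster 5000
# Failover timeout (ms)
sentinel failover-timeout redismaster 15000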

scp /etc/sentinel.conf redis2:/etc/sentinel.conf
scp /etc/sentinel.conf redis3:/etc/sentinel.conf
# Start the sentinel process
redis-sentinel /etc/sentinel.conf
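
Since the same sentinel.conf goes to all three hosts, one option is to render it from a few parameters instead of editing it by hand. This is only a sketch: `render_sentinel_conf` is a hypothetical helper, and the port, paths and timeouts are simply the values used above. Keep in mind that Sentinel rewrites its configuration file at runtime, so the file it is started with must be writable.

```shell
#!/bin/sh
# render_sentinel_conf NAME MASTER_IP MASTER_PORT QUORUM
# Prints a sentinel.conf equivalent to the one used in this post.
render_sentinel_conf() {
    cat <<EOF
daemonize yes
port 7000
logfile /var/log/redis/sentinel.log
pidfile /var/run/sentinel.pid
sentinel monitor $1 $2 $3 $4
sentinel down-after-milliseconds $1 5000
sentinel failover-timeout $1 15000
EOF
}

render_sentinel_conf redismaster 192.168.10.116 8000 2
```

Redirecting the output to /etc/sentinel.conf on redis1 and then copying it with scp gives the same result as the manual steps.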

Check the current status

Connect to Redis and check info
redis-cli -p 8000
info
# Replication
role:master
connected_slaves:2
slave0:ip=192.168.10.117,port=8000,state=online,offset=85291,lag=0
slave1:ip=192.168.10.118,port=8000,state=online,offset=85291,lag=1
master_replid:82dc702b0395f4ac908f58b5ce93ac2803efdbaa
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:85450
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:85450

Connect to the sentinel
redis-cli -p 7000
info
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=redismaster,status=ok,address=192.168.10.116:8000,slaves=2,sentinels=3
(group name, status, master address, number of slaves, number of sentinels)
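
For scripted health checks it can help to pull single fields out of that `master0` line. The sketch below assumes the `key=value,key=value` layout shown above; `master_field` is a hypothetical helper name, and the sample line is copied from the output above so the snippet runs without a live sentinel.

```shell
#!/bin/sh
# master_field FIELD
# Reads `info sentinel` output on stdin and prints FIELD from the master0 line.
master_field() {
    sed -n 's/^master0://p' \
        | tr ',' '\n' \
        | sed -n "s/^$1=//p" \
        | tr -d '\r'
}

sample='master0:name=redismaster,status=ok,address=192.168.10.116:8000,slaves=2,sentinels=3'
printf '%s\n' "$sample" | master_field address
printf '%s\n' "$sample" | master_field sentinels
```

Against a running sentinel the same function would be fed with `redis-cli -p 7000 info sentinel | master_field status`.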

Testing Sentinel mode

Stop Redis on the master node
ps -ef | grep redis-server
kill 31136    # the redis-server PID found with ps above
tail -f /var/log/redis/sentinel.log

14547:X 06 Jul 2024 14:48:00.723 # +sdown master redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:00.789 # +odown master redismaster 192.168.10.116 8000 #quorum 2/2
14547:X 06 Jul 2024 14:48:00.789 # +new-epoch 1
14547:X 06 Jul 2024 14:48:00.789 # +try-failover master redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:00.796 # +vote-for-leader 2077fac7fce192eee3d48b30f49918bbf707f745 1
14547:X 06 Jul 2024 14:48:00.802 # 726a08fc6822ab80eb68f69ebe64ea146e8ceaa7 voted for 2077fac7fce192eee3d48b30f49918bbf707f745 1
14547:X 06 Jul 2024 14:48:00.859 # +elected-leader master redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:00.859 # +failover-state-select-slave master redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:00.959 # +selected-slave slave 192.168.10.117:8000 192.168.10.117 8000 @ redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:00.960 * +failover-state-send-slaveof-noone slave 192.168.10.117:8000 192.168.10.117 8000 @ redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:01.043 * +failover-state-wait-promotion slave 192.168.10.117:8000 192.168.10.117 8000 @ redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:01.811 # +promoted-slave slave 192.168.10.117:8000 192.168.10.117 8000 @ redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:01.812 # +failover-state-reconf-slaves master redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:01.860 * +slave-reconf-sent slave 192.168.10.118:8000 192.168.10.118 8000 @ redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:01.860 # +failover-end master redismaster 192.168.10.116 8000
14547:X 06 Jul 2024 14:48:01.860 # +switch-master redismaster 192.168.10.116 8000 192.168.10.117 8000
14547:X 06 Jul 2024 14:48:01.860 * +slave slave 192.168.10.118:8000 192.168.10.118 8000 @ redismaster 192.168.10.117 8000
14547:X 06 Jul 2024 14:48:01.861 * +slave slave 192.168.10.116:8000 192.168.10.116 8000 @ redismaster 192.168.10.117 8000
14547:X 06 Jul 2024 14:48:06.904 # +sdown slave 192.168.10.116:8000 192.168.10.116 8000 @ redismaster 192.168.10.117 8000
14547:X 06 Jul 2024 14:48:06.904 # +sdown slave 192.168.10.118:8000 192.168.10.118 8000 @ redismaste

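The line +switch-master redismaster 192.168.10.116 8000 192.168.10.117 8000
shows that the master has changed from 116 to 117
The +slave lines show 118 and 116 being registered as slaves of the new master 117
The trailing +sdown slave lines mean the sentinel subjectively considers those slaves down; for 116 this is expected because its redis-server was just killed (the last log line is truncated)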
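
Besides reading the log, any sentinel can be asked directly for the current master with `SENTINEL get-master-addr-by-name`. Below is a small sketch around that command; the wrapper name `current_master` and the `REDIS_CLI` override variable are assumptions made for illustration, and the sentinel host and port are the ones from this setup.

```shell
#!/bin/sh
# Command used to talk to the sentinel; overridable, e.g. for testing.
REDIS_CLI=${REDIS_CLI:-redis-cli}

# current_master SENTINEL_HOST SENTINEL_PORT MASTER_NAME
# Prints the current master as "ip:port".
current_master() {
    # The reply is two lines (ip, then port); join them with a colon.
    "$REDIS_CLI" -h "$1" -p "$2" SENTINEL get-master-addr-by-name "$3" \
        | tr -d '\r' \
        | paste -sd: -
}

# Example call against the setup in this post:
#   current_master 192.168.10.117 7000 redismaster
```

Clients that cannot speak the Sentinel protocol themselves could run such a lookup before connecting, which is the problem the VIP approach in the next section solves differently.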

Log in to 116 again (its redis-server has to be started again first)
redis-cli -h 192.168.10.116 -p 8000
info
# Replication
role:slave (it has now become a slave node)
master_host:192.168.10.117
master_port:8000
master_link_status:down
master_last_io_seconds_ago:-1
master_sync_in_progress:0
slave_repl_offset:1328
master_link_down_since_seconds:1720249310
slave_priority:100
slave_read_only:1

Using a VIP with Sentinel mode

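When clients connect to Redis for reads and writes, a sentinel failover changes the IP address of the master they are connected to
A Redis deployment can use a VIP (virtual IP) instead and serve clients only through that fixed address. During a failover the VIP floats to the new master
Using the VIP address
1. Add this setting to sentinel.conf on every host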
sentinel client-reconfig-script redismaster /root/redis-5.0.5/notifyvip.sh

The sentinel client-reconfig-script parameter registers a script
that the sentinel runs when it performs a failover (switchover)
The script receives the following 7 arguments: master-name, role, state, from-ip, from-port, to-ip, to-port
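
To make the argument order concrete, here is a minimal sketch that maps the seven positional arguments to names and prints them. `describe_failover` is a hypothetical function, and the sample values in the call are illustrative only (`role` is either `leader` or `observer`). It shows why the VIP script reads the new master's address from `$6`.

```shell
#!/bin/sh
# describe_failover MASTER_NAME ROLE STATE FROM_IP FROM_PORT TO_IP TO_PORT
# Prints one line naming each argument a client-reconfig-script receives.
describe_failover() {
    if [ "$#" -ne 7 ]; then
        echo "expected 7 arguments, got $#" >&2
        return 1
    fi
    master_name=$1; role=$2; state=$3
    from_ip=$4; from_port=$5
    to_ip=$6; to_port=$7
    echo "master=$master_name role=$role state=$state from=$from_ip:$from_port to=$to_ip:$to_port"
}

# Illustrative values only
describe_failover redismaster leader start 192.168.10.116 8000 192.168.10.117 8000
```

Appending such a line to a log file at the top of the real script makes it easier to see when and with which arguments each sentinel invoked it.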

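/root/redis-5.0.5/notifyvip.sh
#!/bin/bash
MASTER_IP=$6                # IP address of the new master (to-ip)
LOCAL_IP='192.168.10.116'   # IP of the current host; set this to each host's own address
VIP='192.168.10.99'         # the VIP address
NETMASK='24'
INTERFACE='ens18'           # name of the interface the VIP is bound to
if [[ "${MASTER_IP}" == "${LOCAL_IP}" ]]; then
    # This host is the new master: bind the VIP to its own interface
    ip addr add ${VIP}/${NETMASK} dev ${INTERFACE}
    # Announce the move with gratuitous ARP (-I selects the interface)
    arping -q -c 3 -A ${VIP} -I ${INTERFACE}
    exit 0
else
    # This host is not the master: remove the VIP from the interface
    ip addr del ${VIP}/${NETMASK} dev ${INTERFACE}
    exit 0
fi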
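
Because LOCAL_IP is hard-coded, the script above has to be edited on every host. One possible alternative, sketched here under the assumption that `ip -o -4 addr show` prints entries of the form `inet <ip>/<prefix>`, is to decide from the interface's own addresses whether the new master IP belongs to this machine. `vip_action` is a hypothetical helper and the sample line stands in for real `ip` output.

```shell
#!/bin/sh
# vip_action TO_IP
# Reads `ip -o -4 addr show dev <iface>` output on stdin and prints
# "add" if TO_IP is one of this host's addresses, otherwise "del".
vip_action() {
    if grep -qF "inet $1/"; then
        echo add
    else
        echo del
    fi
}

# Sample line standing in for real output on host 192.168.10.117
sample='2: ens18    inet 192.168.10.117/24 brd 192.168.10.255 scope global noprefixroute ens18'
printf '%s\n' "$sample" | vip_action 192.168.10.117
printf '%s\n' "$sample" | vip_action 192.168.10.116
```

In the real script the input would come from `ip -o -4 addr show dev ens18`, with `$6` passed as the argument.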

2. Restart the sentinel
pkill -9 redis-sentinel
redis-sentinel /etc/sentinel.conf
3. Manually bind the VIP on the current master

ip addr add 192.168.10.99/24 dev ens18
ip addr show ens18
ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:6d:0e:53 brd ff:ff:ff:ff:ff:ff
inet 192.168.10.117/24 brd 192.168.10.255 scope global noprefixroute ens18
valid_lft forever preferred_lft forever
inet 192.168.10.99/24 scope global secondary ens18

4. Test
pkill -9 redis-server    # stop Redis on the master node

2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether bc:24:11:6d:0e:53 brd ff:ff:ff:ff:ff:ff
inet 192.168.10.117/24 brd 192.168.10.255 scope global noprefixroute ens18
The VIP is gone from the old master

15669:X 06 Jul 2024 20:18:19.673 # +switch-master redismaster 192.168.10.117 8000 192.168.10.116 8000
The log shows that 116 was chosen as the new master

ip addr | grep -w inet
inet 192.168.10.116/24 brd 192.168.10.255 scope global noprefixroute ens18
inet 192.168.10.99/24 scope global secondary ens18
Checking the addresses on 116 shows that 99 has moved over to 116
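
As a final check, it is worth confirming that the instance answering on the VIP really reports itself as master. A minimal sketch, assuming the `role:` line format of `info replication` shown earlier; `replication_role` is a hypothetical helper and the sample input is canned so the snippet runs without a server.

```shell
#!/bin/sh
# replication_role
# Reads `info replication` output on stdin and prints the value of "role".
replication_role() {
    sed -n 's/^role://p' | tr -d '\r'
}

printf '%s\n' '# Replication' 'role:master' 'connected_slaves:2' | replication_role
```

Against the live setup this would be `redis-cli -h 192.168.10.99 -p 8000 info replication | replication_role`.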
