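The introduction to master-slave replication explained that when the master fails, one of the slaves has to be promoted to master by hand to keep serving requests, and that when the original master comes back up it has to be reconfigured as a slave so it can resync its data.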
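Since version 2.8, Redis has provided Sentinel mode to automate monitoring and failure recovery. It is an officially recommended architecture, and it does two things:
1. Monitors whether the master and the slaves are working properly.
2. When the master fails, elects a slave as the new master and automatically points the remaining slaves at the new master's address.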
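Sentinel mode concepts
Each sentinel node is a Redis process running in sentinel mode; it does not serve data itself.
Sentinels monitor the Redis nodes and also monitor each other.
Several sentinels can form a cluster, and one sentinel cluster can monitor multiple Redis master-slave groups.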
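How Sentinel mode works
1. When a sentinel decides that the master is down (subjectively down, sdown), it asks the other sentinels.
2. The other sentinels check the master and reply if they also see it as down.
3. Once the number of sentinels reporting the master as down reaches the configured quorum, the master is marked objectively down (odown). The failover itself is run by a leader sentinel, which has to be elected by more than half of all sentinels (the failover condition).
4. The leader picks one slave and sends it SLAVEOF NO ONE, which breaks the replication link and promotes that slave to master.
5. The leader sends the new master's information to the other slaves.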
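The two thresholds in the process above are easy to mix up, so here is a minimal sketch of the arithmetic. It is not Sentinel's actual code: the helper names `is_odown` and `has_majority` are hypothetical, and the sketch assumes that the quorum only decides when the master is flagged objectively down, while the failover needs a leader elected by a strict majority of all sentinels.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the two thresholds; not part of Redis.

# is_odown REPORTS QUORUM
# Succeeds when enough sentinels report the master as down to reach the quorum.
is_odown() {
    [ "$1" -ge "$2" ]
}

# has_majority VOTES TOTAL
# Succeeds when VOTES is a strict majority of TOTAL sentinels.
has_majority() {
    [ "$1" -gt $(( $2 / 2 )) ]
}

# With 3 sentinels and quorum 2: two "down" reports flag the master as odown,
# and two votes are enough to elect the sentinel that runs the failover.
if is_odown 2 2 && has_majority 2 3; then
    echo "failover can start"
fi
```

With three sentinels and a quorum of 2, losing one sentinel together with the master still leaves two sentinels, which is enough for both checks. With only two sentinels, losing one would make a majority impossible.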
Rules Sentinel uses to elect the new master from multiple slaves
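As a rough illustration of the ordering, the sketch below ranks candidates the way the Redis Sentinel documentation describes it, as far as I understand it: slaves with priority 0 are never promoted, a lower priority value wins, then the larger replication offset, then the lexicographically smaller run ID. The function name `pick_replica`, the input format, and the run IDs are all made up for this example, and the real selection also skips slaves that have been disconnected for too long, which is not modeled here.

```shell
#!/bin/sh
# pick_replica: read candidates on stdin, one per line, as
#   PRIORITY OFFSET RUNID ADDR   (single spaces between fields)
# and print the ADDR of the preferred candidate.
# Hypothetical helper; the input format is an assumption of this sketch.
pick_replica() {
    # Drop candidates with priority 0, then sort by:
    #   priority ascending, offset descending, run ID ascending.
    awk '$1 > 0' |
        LC_ALL=C sort -t ' ' -k1,1n -k2,2nr -k3,3 |
        awk 'NR == 1 { print $4 }'
}

# Both slaves have the same priority and offset, so the smaller run ID wins.
# The run IDs "aaa" and "bbb" are placeholders.
printf '%s\n' \
    '100 85291 bbb 192.168.10.118:8000' \
    '100 85291 aaa 192.168.10.117:8000' | pick_replica
# prints: 192.168.10.117:8000
```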
Setting up Sentinel mode
Configuration files
Environment: CentOS 7.9, Redis 5.0.5

    192.168.10.116 redis1
    192.168.10.117 redis2
    192.168.10.118 redis3

Configure the Redis server itself in redis.conf.

redis1 (master):

    port 8000
    daemonize yes
    bind 0.0.0.0
    pidfile /var/run/redis-8000.pid
    logfile /var/log/redis/redis-8000.log

redis2 and redis3 (slaves):

    port 8000
    daemonize yes
    bind 0.0.0.0
    pidfile /var/run/redis-8000.pid
    logfile /var/log/redis/redis-8000.log
    slaveof 192.168.10.116 8000

Start Redis on every node:

    redis-server /etc/redis.conf

Configure the sentinel on every host in sentinel.conf:

    daemonize yes
    port 7000
    logfile /var/log/redis/sentinel.log
    pidfile /var/run/sentinel.pid
    # redismaster is the name of the monitored master group,
    # followed by the address and port of the current master.
    # The trailing 2 is the quorum: how many sentinels must agree that
    # the master is down before a switch, usually more than half of them.
    sentinel monitor redismaster 192.168.10.116 8000 2
    # How long (ms) a node must be unreachable before a sentinel considers it down
    sentinel down-after-milliseconds redismaster 5000
    # Failover timeout (ms)
    sentinel failover-timeout redismaster 15000

Copy the file to the other two hosts:

    scp /etc/sentinel.conf redis2:/etc/sentinel.conf
    scp /etc/sentinel.conf redis3:/etc/sentinel.conf

Start the sentinel process on every host:

    redis-sentinel /etc/sentinel.conf
Checking the current state
Connect to Redis and look at the replication info:

    redis-cli -p 8000 info
    # Replication
    role:master
    connected_slaves:2
    slave0:ip=192.168.10.117,port=8000,state=online,offset=85291,lag=0
    slave1:ip=192.168.10.118,port=8000,state=online,offset=85291,lag=1
    master_replid:82dc702b0395f4ac908f58b5ce93ac2803efdbaa
    master_replid2:0000000000000000000000000000000000000000
    master_repl_offset:85450
    second_repl_offset:-1
    repl_backlog_active:1
    repl_backlog_size:1048576
    repl_backlog_first_byte_offset:1
    repl_backlog_histlen:85450

Connect to the sentinel:

    redis-cli -p 7000 info
    # Sentinel
    sentinel_masters:1
    sentinel_tilt:0
    sentinel_running_scripts:0
    sentinel_scripts_queue_length:0
    sentinel_simulate_failure_flags:0
    master0:name=redismaster,status=ok,address=192.168.10.116:8000,slaves=2,sentinels=3

The master0 line shows the group name, the master node's address, the number of slaves, and the number of sentinels.
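For scripted health checks it can be handy to pull individual fields out of the master0 line. The following is a minimal sketch; `sentinel_field` is a hypothetical helper name, and the sample line is the one from the output above.

```shell
#!/bin/sh
# sentinel_field LINE KEY
# Print the value of KEY from a "masterN:name=...,status=...,..." line.
# Hypothetical helper; it only does string splitting and never talks to Redis.
sentinel_field() {
    # Strip the "masterN:" prefix, put each key=value pair on its own line,
    # then print the value whose key matches.
    printf '%s\n' "${1#*:}" | tr ',' '\n' |
        awk -F= -v key="$2" '$1 == key { print $2 }'
}

line='master0:name=redismaster,status=ok,address=192.168.10.116:8000,slaves=2,sentinels=3'
sentinel_field "$line" status      # prints: ok
sentinel_field "$line" address     # prints: 192.168.10.116:8000
sentinel_field "$line" sentinels   # prints: 3
```

On a live system the line would come from something like `redis-cli -p 7000 info sentinel | grep '^master0:' | tr -d '\r'`; the `tr` is there because INFO output uses CRLF line endings, which would otherwise leave a carriage return on the last field.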
Testing Sentinel mode
Shut down Redis on the master node:

    ps -ef | grep redis-server
    kill 31136

Follow the sentinel log:

    tail -f /var/log/redis/sentinel.log
    14547:X 06 Jul 2024 14:48:00.723 # +sdown master redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:00.789 # +odown master redismaster 192.168.10.116 8000 #quorum 2/2
    14547:X 06 Jul 2024 14:48:00.789 # +new-epoch 1
    14547:X 06 Jul 2024 14:48:00.789 # +try-failover master redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:00.796 # +vote-for-leader 2077fac7fce192eee3d48b30f49918bbf707f745 1
    14547:X 06 Jul 2024 14:48:00.802 # 726a08fc6822ab80eb68f69ebe64ea146e8ceaa7 voted for 2077fac7fce192eee3d48b30f49918bbf707f745 1
    14547:X 06 Jul 2024 14:48:00.859 # +elected-leader master redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:00.859 # +failover-state-select-slave master redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:00.959 # +selected-slave slave 192.168.10.117:8000 192.168.10.117 8000 @ redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:00.960 * +failover-state-send-slaveof-noone slave 192.168.10.117:8000 192.168.10.117 8000 @ redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:01.043 * +failover-state-wait-promotion slave 192.168.10.117:8000 192.168.10.117 8000 @ redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:01.811 # +promoted-slave slave 192.168.10.117:8000 192.168.10.117 8000 @ redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:01.812 # +failover-state-reconf-slaves master redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:01.860 * +slave-reconf-sent slave 192.168.10.118:8000 192.168.10.118 8000 @ redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:01.860 # +failover-end master redismaster 192.168.10.116 8000
    14547:X 06 Jul 2024 14:48:01.860 # +switch-master redismaster 192.168.10.116 8000 192.168.10.117 8000
    14547:X 06 Jul 2024 14:48:01.860 * +slave slave 192.168.10.118:8000 192.168.10.118 8000 @ redismaster 192.168.10.117 8000
    14547:X 06 Jul 2024 14:48:01.861 * +slave slave 192.168.10.116:8000 192.168.10.116 8000 @ redismaster 192.168.10.117 8000
    14547:X 06 Jul 2024 14:48:06.904 # +sdown slave 192.168.10.116:8000 192.168.10.116 8000 @ redismaster 192.168.10.117 8000
    14547:X 06 Jul 2024 14:48:06.904 # +sdown slave 192.168.10.118:8000 192.168.10.118 8000 @ redismaste

The line "+switch-master redismaster 192.168.10.116 8000 192.168.10.117 8000" shows that the master has changed to node 117.
The two "+slave" lines register 118 and 116 as slaves of the new master in this group. The "+sdown slave" lines are different: they mark a slave as subjectively down and do not add it to the group. For 116 this is expected, because its Redis process was just killed.

Once Redis on 116 is running again, log in to it:

    redis-cli -h 192.168.10.116 -p 8000 info
    # Replication
    role:slave
    master_host:192.168.10.117
    master_port:8000
    master_link_status:down
    master_last_io_seconds_ago:-1
    master_sync_in_progress:0
    slave_repl_offset:1328
    master_link_down_since_seconds:1720249310
    slave_priority:100
    slave_read_only:1

role:slave shows that 116 has now become a slave.
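Reading the new master out of a long log by eye gets tedious, so here is a small sketch that extracts it from the last +switch-master event. It assumes the event layout seen in the log above (master name, old IP, old port, new IP, new port); `new_master_from_log` is a hypothetical helper name.

```shell
#!/bin/sh
# new_master_from_log: read sentinel log lines on stdin and print
# "IP PORT" of the new master from the last +switch-master event, if any.
# Hypothetical helper; assumes the field layout
#   +switch-master <name> <old-ip> <old-port> <new-ip> <new-port>
new_master_from_log() {
    awk '{
        for (i = 1; i <= NF; i++)
            if ($i == "+switch-master") { ip = $(i + 4); port = $(i + 5) }
    }
    END { if (ip != "") print ip, port }'
}

printf '%s\n' \
    '14547:X 06 Jul 2024 14:48:01.860 # +failover-end master redismaster 192.168.10.116 8000' \
    '14547:X 06 Jul 2024 14:48:01.860 # +switch-master redismaster 192.168.10.116 8000 192.168.10.117 8000' |
    new_master_from_log
# prints: 192.168.10.117 8000
```

Against the real file this would be run as `new_master_from_log < /var/log/redis/sentinel.log`. It prints nothing if no failover has been logged yet.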
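Using a VIP with Sentinel mode
When clients connect to Redis for reads and writes, a Sentinel failover changes the IP address of the master they have to connect to.
The Redis deployment can use a VIP (virtual IP) instead: clients are only ever given the fixed VIP address, and on failover the VIP floats to the new master.
1. Add the following setting to sentinel.conf on every host:

    sentinel client-reconfig-script redismaster /root/redis-5.0.5/notifyvip.sh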
The client-reconfig-script option makes Sentinel run the script whenever it performs a failover (switch), passing it the following seven arguments: master-name, role, state, from-ip, from-port, to-ip, to-port.
/root/redis-5.0.5/notifyvip.sh

    #!/bin/bash
    MASTER_IP=$6                 # IP address of the new master (to-ip)
    LOCAL_IP='192.168.10.116'    # IP of the current host; set it per machine
    VIP='192.168.10.99'          # the VIP address
    NETMASK='24'
    INTERFACE='ens18'            # name of the interface the VIP is bound to

    if [[ "${MASTER_IP}" == "${LOCAL_IP}" ]]; then
        # This host is the master: bind the VIP to its own interface
        ip addr add ${VIP}/${NETMASK} dev ${INTERFACE}
        # Announce the VIP with gratuitous ARP so neighbours update their caches
        arping -q -c 3 -A ${VIP} -I ${INTERFACE}
        exit 0
    else
        # This host is not the master: remove the VIP from the interface
        ip addr del ${VIP}/${NETMASK} dev ${INTERFACE}
        exit 0
    fi
    exit 1
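Before wiring a script like this into Sentinel, it may help to try the argument handling without touching any network interface. The sketch below is a dry run under that assumption: `describe_failover` and `vip_action` are hypothetical helper names, and the sample role and state values are placeholders rather than values confirmed from Sentinel.

```shell
#!/bin/sh
# Dry-run helpers for a client-reconfig-script; nothing here changes the network.

# describe_failover: name the seven positional arguments and print them.
describe_failover() {
    master_name=$1 role=$2 state=$3
    from_ip=$4 from_port=$5 to_ip=$6 to_port=$7
    echo "master=$master_name role=$role state=$state from=$from_ip:$from_port to=$to_ip:$to_port"
}

# vip_action TO_IP LOCAL_IP
# Print "add" when this host is the new master, otherwise "del".
vip_action() {
    if [ "$1" = "$2" ]; then echo add; else echo del; fi
}

# Sample call; "leader" and "start" are placeholder values for role and state.
describe_failover redismaster leader start 192.168.10.117 8000 192.168.10.116 8000
vip_action 192.168.10.116 192.168.10.116   # prints: add
vip_action 192.168.10.116 192.168.10.117   # prints: del
```

Keeping the decision in a function that only prints add or del makes it easy to check each host's LOCAL_IP setting before the real `ip addr` commands are enabled.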
2. Restart the sentinel:

    pkill -9 redis-sentinel
    redis-sentinel /etc/sentinel.conf
3. Manually bind the VIP address on the current master:
    ip addr add 192.168.10.99/24 dev ens18

    ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether bc:24:11:6d:0e:53 brd ff:ff:ff:ff:ff:ff
        inet 192.168.10.117/24 brd 192.168.10.255 scope global noprefixroute ens18
           valid_lft forever preferred_lft forever
        inet 192.168.10.99/24 scope global secondary ens18
4. Test by shutting down Redis on the master node:

    pkill -9 redis-server
On the old master (117) the VIP is gone:

    2: ens18: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
        link/ether bc:24:11:6d:0e:53 brd ff:ff:ff:ff:ff:ff
        inet 192.168.10.117/24 brd 192.168.10.255 scope global noprefixroute ens18

The sentinel log shows that 116 was selected as the new master:

    15669:X 06 Jul 2024 20:18:19.673 # +switch-master redismaster 192.168.10.117 8000 192.168.10.116 8000

Checking the addresses on 116 shows that 192.168.10.99 has moved there:

    ip addr | grep -w inet
        inet 192.168.10.116/24 brd 192.168.10.255 scope global noprefixroute ens18
        inet 192.168.10.99/24 scope global secondary ens18
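The same check can be scripted so that it runs on every host after a failover. This is a minimal sketch; `has_vip` is a hypothetical helper that only parses text in the format `ip addr` prints, and the sample lines are taken from the output above.

```shell
#!/bin/sh
# has_vip VIP
# Read `ip addr` output on stdin and succeed if VIP is one of the
# configured IPv4 addresses. Hypothetical helper; it only parses text.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit !found }'
}

printf '%s\n' \
    '    inet 192.168.10.116/24 brd 192.168.10.255 scope global noprefixroute ens18' \
    '    inet 192.168.10.99/24 scope global secondary ens18' |
    has_vip 192.168.10.99 && echo "VIP is on this host"
```

On a real host the input would come from `ip -4 addr show dev ens18 | has_vip 192.168.10.99`. Exactly one host in the group should report the VIP at any time.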