Special mechanism: ASK routing
Error demonstration
Below is a Redis cluster:
192.168.10.116 redis1
192.168.10.117 redis2
192.168.10.118 redis3
Port 8001: master
Port 8002: replica
Test script:
PORT=8001
for (( i=1 ; i<=1002 ; i++ ))
do
    KEY="k_$i"
    VALUE="v_$i"
    redis-cli -p $PORT set "$KEY" "$VALUE"
done
Running this script produces errors:
(error) MOVED 6468 192.168.10.117:8001
(error) MOVED 10535 192.168.10.117:8001
(error) MOVED 14598 192.168.10.118:8001
OK
(error) MOVED 6592 192.168.10.117:8001
(error) MOVED 10659 192.168.10.117:8001
(error) MOVED 14722 192.168.10.118:8001
OK
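The error occurs because redis-cli sent the write to a node that does not own the slot the key hashes to.
That node cannot perform the write, so it replies with a redirection (MOVED, in the output above) naming the slot number and the node that owns it.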
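To make the slot number in the MOVED reply less opaque, here is a minimal bash sketch of how a key maps to a slot: CRC16 of the key, modulo 16384. The function names `crc16` and `key_slot` are made up for illustration, and the sketch assumes plain ASCII keys and ignores hash tags (`{...}`), so treat it as an approximation of what the server does rather than a replacement for `CLUSTER KEYSLOT`.

```bash
#!/usr/bin/env bash
# Sketch: compute the cluster hash slot of a key as CRC16(key) mod 16384.
# Assumptions: ASCII keys only; hash tags ({...}) are NOT handled.

# CRC16, XMODEM variant (polynomial 0x1021, initial value 0), bit by bit.
crc16() {
    local s=$1 crc=0 byte i j
    for (( i = 0; i < ${#s}; i++ )); do
        printf -v byte '%d' "'${s:i:1}"      # character -> byte value
        crc=$(( crc ^ (byte << 8) ))
        for (( j = 0; j < 8; j++ )); do
            if (( crc & 0x8000 )); then
                crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
            else
                crc=$(( (crc << 1) & 0xFFFF ))
            fi
        done
    done
    echo "$crc"
}

# Hypothetical helper: slot number (0..16383) for a key.
key_slot() {
    echo $(( $(crc16 "$1") % 16384 ))
}
```

For example, `key_slot 123456789` prints 12739, which matches the CRC16 reference value 0x31C3 for that string. On a live cluster, `redis-cli -p 8001 cluster keyslot <key>` is the authoritative check.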
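Solution: ASK routing
Adding -c to the redis-cli arguments enables cluster mode. In this mode redis-cli follows redirections automatically (both MOVED, as seen above, and ASK, which a node returns while a slot is being migrated) and writes to the Redis instance that owns the target slot: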
redis-cli -c -p $PORT set "$KEY" "$VALUE"
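What `-c` automates can be sketched by hand: read the redirection reply, pull out the target address, and retry there. The helper below is hypothetical (the name `parse_redirect` is not part of redis-cli) and only handles the reply format shown in the output above; it is meant to illustrate the idea, not to reproduce redis-cli's actual implementation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: extract "TYPE SLOT HOST PORT" from a redirection reply
# such as "(error) MOVED 6468 192.168.10.117:8001". Returns non-zero when the
# line is not a MOVED/ASK redirection (for example "OK").
parse_redirect() {
    local line=$1
    local re='(MOVED|ASK) ([0-9]+) ([^: ]+):([0-9]+)'
    if [[ $line =~ $re ]]; then
        echo "${BASH_REMATCH[1]} ${BASH_REMATCH[2]} ${BASH_REMATCH[3]} ${BASH_REMATCH[4]}"
        return 0
    fi
    return 1
}

# Usage sketch (not executed here): retry a failed write on the node named
# in the reply. For an ASK redirection a real client must also send ASKING
# on the same connection before the retry.
#   reply=$(redis-cli -p 8001 set k_1 v_1)
#   if read -r kind slot host port < <(parse_redirect "$reply"); then
#       redis-cli -h "$host" -p "$port" set k_1 v_1
#   fi
```

In practice `redis-cli -c` is the simpler choice; the sketch only shows why the retry lands on the right node.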
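Parameter: cluster-node-timeout
cluster-node-timeout 5000
The unit is milliseconds (ms). This value sets the master/replica failover time: once a master has been unresponsive for 5000 ms, the cluster treats it as failing and failover to its replica begins.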
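Scenario: a second master fails during the wait
If the timeout is set too long, the following can happen:
the first master becomes unresponsive, and a second master fails while the cluster-node-timeout wait for the first is still running.
The cluster state then looks like this: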
cluster nodes
c35ba3d5788e974ad6830f5dbdeaa5fa0303bd2e 192.168.10.118:8002@18002 slave 68d676f1a14073941cfeb46e88f281bc85f61c1d 0 1727107349000 11 connected
c37cd7a35beb35c7cf648d54c88f58514fd8dbdd 192.168.10.116:8001@18001 master,fail? - 1727107328357 1727107327000 9 disconnected 0-5460
68d676f1a14073941cfeb46e88f281bc85f61c1d 192.168.10.117:8001@18001 master,fail? - 1727107328860 1727107328760 11 disconnected 6826-12287
68c3eb97a6a9d531f83690af8f780f4ed2af73c4 192.168.10.118:8001@18001 myself,master - 0 1727107349000 10 connected 5461-6825 12288-16383
8324825e7ecf81cf4ae4bf2604b022b8a8f2e068 192.168.10.116:8002@18002 slave 68c3eb97a6a9d531f83690af8f780f4ed2af73c4 0 1727107349347 10 connected
8091088d4c52b74d29185729ff820f1e821bdb30 192.168.10.117:8002@18002 slave c37cd7a35beb35c7cf648d54c88f58514fd8dbdd 0 1727107350354 9 connected
cluster info
cluster_state:fail
cluster_slots_assigned:16384
cluster_slots_ok:5461
cluster_slots_pfail:10923
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
At this point no data has been lost yet.
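To spot this state quickly, the `cluster nodes` output can be filtered for nodes flagged `fail?` (suspected failure) or `fail` (confirmed failure). The following is a small sketch; the function name `failing_nodes` is made up, and it assumes the field layout shown above, where the address is the second field and the comma-separated flags are the third.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `cluster nodes` output on stdin and print
# "address flags" for every node whose flags include "fail?" or "fail".
# Flags are compared one by one so unrelated flags such as "nofailover"
# are not matched by accident.
failing_nodes() {
    awk '{
        n = split($3, flags, ",")
        for (i = 1; i <= n; i++) {
            if (flags[i] == "fail" || flags[i] == "fail?") {
                print $2, $3
                break
            }
        }
    }'
}

# Usage sketch (not executed here):
#   redis-cli -p 8001 cluster nodes | failing_nodes
```

Run against the state shown above, this would list the two masters on 192.168.10.116:8001 and 192.168.10.117:8001.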
Solution: restart the Redis processes directly. After the restart, both masters show as connected again:
cluster nodes
c35ba3d5788e974ad6830f5dbdeaa5fa0303bd2e 192.168.10.118:8002@18002 slave 68d676f1a14073941cfeb46e88f281bc85f61c1d 0 1727107543178 11 connected
c37cd7a35beb35c7cf648d54c88f58514fd8dbdd 192.168.10.116:8001@18001 master - 0 1727107544181 9 connected 0-5460
68d676f1a14073941cfeb46e88f281bc85f61c1d 192.168.10.117:8001@18001 master - 0 1727107544081 11 connected 6826-12287
68c3eb97a6a9d531f83690af8f780f4ed2af73c4 192.168.10.118:8001@18001 myself,master - 0 1727107542000 10 connected 5461-6825 12288-16383
8324825e7ecf81cf4ae4bf2604b022b8a8f2e068 192.168.10.116:8002@18002 slave 68c3eb97a6a9d531f83690af8f780f4ed2af73c4 0 1727107543580 10 connected
8091088d4c52b74d29185729ff820f1e821bdb30 192.168.10.117:8002@18002 slave c37cd7a35beb35c7cf648d54c88f58514fd8dbdd 0 1727107543679 9 connected
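Scenario: cross-datacenter / cross-region communication
Here network transit time becomes an additional variable.
Load-test Redis with the transit time and the message volume taken into account,
then choose the cluster-node-timeout value based on the results.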