How to install and configure a 3-master, 3-replica Redis cluster on NeoKylin V10 Linux servers, and how to fix the "Waiting for the cluster to join" problem
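Part 0 Requirements and scenario
An existing set of CentOS Linux servers runs a 3-master, 3-replica Redis cluster, and the memory of those 3 servers has to be reduced. The plan is to stop the Redis cluster, reduce the memory, reboot the 3 virtual machines, and then start the 3-node Redis cluster again. Because we had never performed this kind of operation before, I needed 3 machines on which to rehearse it: install and configure a Redis cluster, shut it down, and start it again.
This document records the following operations on 3 NeoKylin V10 Linux servers: installing a 3-master, 3-replica Redis cluster, and shutting down and starting that cluster. It also covers a problem met during installation: the cluster creation hanging at "Waiting for the cluster to join".
To stay as close to the production environment as possible, Redis runs on ports 6001 and 7001 here. In other words, each machine runs 2 Redis instances: one on port 6001 that acts as a master and one on port 7001 that acts as a replica.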
Part 1 Machine information
Machine IP | OS version | hostname | Machine spec | Ports |
---|---|---|---|---|
10.0.9.63 | Kylin Linux Advanced Server V10 (Lance) | czmaster | 8C32G | 6001 and 7001 |
10.0.9.64 | Kylin Linux Advanced Server V10 (Lance) | czworker1 | 8C16G | 6001 and 7001 |
10.0.9.65 | Kylin Linux Advanced Server V10 (Lance) | czworker2 | 16C24G | 6001 and 7001 |
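Part 2 Installation and configuration
Note 📢: steps 1-6 must be run on each of the 3 machines. Step 7 only needs to be run on any one of them.
1 Download and extract the Redis package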
    cd /data/
    wget http://download.redis.io/releases/redis-5.0.5.tar.gz
    tar -zxvf redis-5.0.5.tar.gz
2 Compile and install Redis
    cd redis-5.0.5/
    make
    make install PREFIX=/data/redis-5.0.5/
3 Create the log directory
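    # create the log directory
    mkdir -p /data/redis-5.0.5/logs
    # the pidfile configured in step 4 lives under /data/redis-5.0.5/pidfile, so create that directory too
    mkdir -p /data/redis-5.0.5/pidfile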
4 Edit the configuration files
    mkdir -p /data/redis-5.0.5/6001
    mkdir -p /data/redis-5.0.5/7001

    vi /data/redis-5.0.5/6001/redis.conf
    # contents:
    daemonize yes
    masterauth ETsb&11p
    requirepass ETsb&11p
    pidfile /data/redis-5.0.5/pidfile/redis_6001.pid
    port 6001
    tcp-backlog 511
    timeout 0
    tcp-keepalive 0
    loglevel notice
    logfile /data/redis-5.0.5/logs/redis_6001.log
    databases 16
    save 900 1
    save 300 10
    save 60 10000
    stop-writes-on-bgsave-error yes
    rdbcompression yes
    rdbchecksum yes
    dbfilename dump.rdb
    dir /data/redis-5.0.5/6001/
    replica-serve-stale-data yes
    replica-read-only yes
    repl-diskless-sync no
    repl-diskless-sync-delay 5
    repl-disable-tcp-nodelay no
    slave-priority 100
    appendonly yes
    appendfilename "appendonly.aof"
    appendfsync everysec
    no-appendfsync-on-rewrite no
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb
    aof-load-truncated yes
    lua-time-limit 5000
    cluster-enabled yes
    cluster-config-file nodes.conf
    cluster-node-timeout 5000
    cluster-require-full-coverage no
    slowlog-log-slower-than 10000
    slowlog-max-len 128
    latency-monitor-threshold 0
    notify-keyspace-events ""
    hash-max-ziplist-entries 512
    hash-max-ziplist-value 64
    set-max-intset-entries 512
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
    hll-sparse-max-bytes 3000
    activerehashing yes
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60
    hz 10
    aof-rewrite-incremental-fsync yes
    protected-mode no

    vi /data/redis-5.0.5/7001/redis.conf
    # contents:
    daemonize yes
    masterauth ETsb&11p
    requirepass ETsb&11p
    pidfile /data/redis-5.0.5/pidfile/redis_7001.pid
    port 7001
    tcp-backlog 511
    timeout 0
    tcp-keepalive 0
    loglevel notice
    logfile /data/redis-5.0.5/logs/redis_7001.log
    databases 16
    save 900 1
    save 300 10
    save 60 10000
    stop-writes-on-bgsave-error yes
    rdbcompression yes
    rdbchecksum yes
    dbfilename dump.rdb
    dir /data/redis-5.0.5/7001/
    replica-serve-stale-data yes
    replica-read-only yes
    repl-diskless-sync no
    repl-diskless-sync-delay 5
    repl-disable-tcp-nodelay no
    slave-priority 100
    appendonly yes
    appendfilename "appendonly.aof"
    appendfsync everysec
    no-appendfsync-on-rewrite no
    auto-aof-rewrite-percentage 100
    auto-aof-rewrite-min-size 64mb
    aof-load-truncated yes
    lua-time-limit 5000
    cluster-enabled yes
    cluster-config-file nodes.conf
    cluster-node-timeout 5000
    cluster-require-full-coverage no
    slowlog-log-slower-than 10000
    slowlog-max-len 128
    latency-monitor-threshold 0
    notify-keyspace-events ""
    hash-max-ziplist-entries 512
    hash-max-ziplist-value 64
    set-max-intset-entries 512
    zset-max-ziplist-entries 128
    zset-max-ziplist-value 64
    hll-sparse-max-bytes 3000
    activerehashing yes
    client-output-buffer-limit normal 0 0 0
    client-output-buffer-limit slave 256mb 64mb 60
    client-output-buffer-limit pubsub 32mb 8mb 60
    hz 10
    aof-rewrite-incremental-fsync yes
    protected-mode no
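The two configuration files differ only in the port number that appears in pidfile, port, logfile and dir, so the second file could be derived from the first rather than typed twice. A minimal sketch, assuming the 6001 file already exists; the function name derive_conf is hypothetical and not part of Redis:

```shell
# Rewrite every occurrence of one port number into another.
# Reads a redis.conf on stdin and writes the derived config to stdout.
# Assumption: the port number string appears only in the port-specific
# settings (true for the configuration shown in this step).
derive_conf() {
  from_port=$1
  to_port=$2
  sed "s/${from_port}/${to_port}/g"
}

# Example (not executed here):
#   derive_conf 6001 7001 < /data/redis-5.0.5/6001/redis.conf \
#     > /data/redis-5.0.5/7001/redis.conf
```

Because the substitution is a plain text replacement, it is worth diffing the two files afterwards to confirm that only the four port-specific lines changed.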
5 Add firewall rules to open ports 6001, 7001, 16001 and 17001
    firewall-cmd --zone=public --add-port=7001/tcp --permanent
    firewall-cmd --zone=public --add-port=6001/tcp --permanent
    firewall-cmd --zone=public --add-port=17001/tcp --permanent
    firewall-cmd --zone=public --add-port=16001/tcp --permanent
    firewall-cmd --reload
    firewall-cmd --list-ports
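Note: ports 16001 and 17001 are the cluster bus ports that the Redis cluster nodes use to communicate with each other, while 6001 and 7001 serve client traffic. The cluster bus port is the client port plus 10000; for example, with the default Redis port 6379, the cluster bus port is 16379.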
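The "client port plus 10000" rule can be turned into a small helper that lists every port the firewall has to allow. This is only a sketch: the function names cluster_ports and print_firewall_cmds are hypothetical, and the script prints the firewall-cmd commands instead of running them.

```shell
# Print the firewalld port specs needed for each Redis client port:
# the client port itself and its cluster bus port (client port + 10000).
cluster_ports() {
  for port in "$@"; do
    echo "${port}/tcp"
    echo "$((port + 10000))/tcp"
  done
}

# Print (rather than run) the matching firewall-cmd commands.
print_firewall_cmds() {
  cluster_ports "$@" | while read -r spec; do
    echo "firewall-cmd --zone=public --add-port=${spec} --permanent"
  done
  echo "firewall-cmd --reload"
}

print_firewall_cmds 6001 7001
```

Printing the commands first makes it easy to review them before pasting them into a root shell on each machine.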
6 Start the Redis services
    cd /data/redis-5.0.5/
    bin/redis-server 6001/redis.conf
    cd /data/redis-5.0.5/
    bin/redis-server 7001/redis.conf
Run the commands above on every machine.
10.0.9.63:
    [root@czmaster redis-5.0.5]# cd /data/redis-5.0.5/
    [root@czmaster redis-5.0.5]# bin/redis-server 6001/redis.conf
    [root@czmaster redis-5.0.5]#
    [root@czmaster redis-5.0.5]# cd /data/redis-5.0.5/
    [root@czmaster redis-5.0.5]# bin/redis-server 7001/redis.conf
    [root@czmaster redis-5.0.5]# ps -ef|grep redis
    root     2898809       1  0 11:51 ?        00:00:23 bin/redis-server *:6001 [cluster]
    root     2898818       1  0 11:51 ?        00:00:26 bin/redis-server *:7001 [cluster]
    root     2971444  660537  0 14:17 pts/0    00:00:00 grep redis
    [root@czmaster redis-5.0.5]#
10.0.9.64:
    [root@czworker1 redis-5.0.5]# cd /data/redis-5.0.5/
    [root@czworker1 redis-5.0.5]# bin/redis-server 6001/redis.conf
    [root@czworker1 redis-5.0.5]#
    [root@czworker1 redis-5.0.5]# cd /data/redis-5.0.5/
    [root@czworker1 redis-5.0.5]# bin/redis-server 7001/redis.conf
    [root@czworker1 redis-5.0.5]# ps -ef|grep redis
    root     3946026       1  0 11:51 ?        00:00:25 bin/redis-server *:6001 [cluster]
    root     3946031       1  0 11:51 ?        00:00:26 bin/redis-server *:7001 [cluster]
    root     3963948 3380448  0 14:16 pts/0    00:00:00 grep redis
    [root@czworker1 redis-5.0.5]#
10.0.9.65:
    [root@czworker2 redis-5.0.5]# cd /data/redis-5.0.5/
    [root@czworker2 redis-5.0.5]# bin/redis-server 6001/redis.conf
    [root@czworker2 redis-5.0.5]#
    [root@czworker2 redis-5.0.5]# cd /data/redis-5.0.5/
    [root@czworker2 redis-5.0.5]# bin/redis-server 7001/redis.conf
    [root@czworker2 redis-5.0.5]# ps -ef|grep redis
    root     2438906       1  0 11:51 ?        00:00:24 bin/redis-server *:6001 [cluster]
    root     2438913       1  0 11:51 ?        00:00:27 bin/redis-server *:7001 [cluster]
    root     2556635 3008695  0 14:17 pts/0    00:00:00 grep redis
    [root@czworker2 redis-5.0.5]#
7 Create the Redis cluster
Run this on 10.0.9.63 only:
    [root@czmaster redis-5.0.5]# bin/redis-cli -a "ETsb&11p" --cluster create 10.0.9.63:6001 10.0.9.63:7001 10.0.9.64:6001 10.0.9.64:7001 10.0.9.65:6001 10.0.9.65:7001 --cluster-replicas 1
    Warning: Using a password with '-a' or '-u' option on the command line interface may not be safe.
    >>> Performing hash slots allocation on 6 nodes...
    Master[0] -> Slots 0 - 5460
    Master[1] -> Slots 5461 - 10922
    Master[2] -> Slots 10923 - 16383
    Adding replica 10.0.9.64:7001 to 10.0.9.63:6001
    Adding replica 10.0.9.65:7001 to 10.0.9.64:6001
    Adding replica 10.0.9.63:7001 to 10.0.9.65:6001
    M: b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 10.0.9.63:6001
       slots:[0-5460] (5461 slots) master
    S: 1f9b3ccd35c5b1e400f2c60fceff19cad556fa68 10.0.9.63:7001
       replicates 70acb193c5eb2d2b4e9354136ea4025810541405
    M: 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 10.0.9.64:6001
       slots:[5461-10922] (5462 slots) master
    S: fec892994c3f418e216b919994bf77b79a5e0218 10.0.9.64:7001
       replicates b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d
    M: 70acb193c5eb2d2b4e9354136ea4025810541405 10.0.9.65:6001
       slots:[10923-16383] (5461 slots) master
    S: d5ac772fa57f7ba65409e28de337877e04d4ae0b 10.0.9.65:7001
       replicates 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111
    Can I set the above configuration? (type 'yes' to accept): yes
    >>> Nodes configuration updated
    >>> Assign a different config epoch to each node
    >>> Sending CLUSTER MEET messages to join the cluster
    Waiting for the cluster to join
    ...
    >>> Performing Cluster Check (using node 10.0.9.63:6001)
    M: b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 10.0.9.63:6001
       slots:[0-5460] (5461 slots) master
       1 additional replica(s)
    S: d5ac772fa57f7ba65409e28de337877e04d4ae0b 10.0.9.65:7001
       slots: (0 slots) slave
       replicates 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111
    S: 1f9b3ccd35c5b1e400f2c60fceff19cad556fa68 10.0.9.63:7001
       slots: (0 slots) slave
       replicates 70acb193c5eb2d2b4e9354136ea4025810541405
    M: 70acb193c5eb2d2b4e9354136ea4025810541405 10.0.9.65:6001
       slots:[10923-16383] (5461 slots) master
       1 additional replica(s)
    M: 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 10.0.9.64:6001
       slots:[5461-10922] (5462 slots) master
       1 additional replica(s)
    S: fec892994c3f418e216b919994bf77b79a5e0218 10.0.9.64:7001
       slots: (0 slots) slave
       replicates b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d
    [OK] All nodes agree about slots configuration.
    >>> Check for open slots...
    >>> Check slots coverage...
    [OK] All 16384 slots covered.
    [root@czmaster redis-5.0.5]#
At this point, the 3-master, 3-replica Redis cluster has been created across the 3 servers.
8 View cluster information
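The slot ranges in the creation output (0-5460, 5461-10922, 10923-16383) come from dividing the 16384 hash slots as evenly as possible among the masters. The sketch below reproduces those ranges; the rounding rule it uses is an assumption chosen to match the output above, not a quotation of the redis-cli source, and slot_ranges is a hypothetical name.

```shell
# Print an even split of the 16384 hash slots among N masters.
# Assumption: each range ends at round(cursor + per_master - 1), and the
# last master always ends at slot 16383.
slot_ranges() {
  awk -v n="$1" 'BEGIN {
    total = 16384
    per = total / n          # fractional slots per master
    cursor = 0
    first = 0
    for (i = 0; i < n; i++) {
      last = int(cursor + per - 1 + 0.5)   # round to nearest integer
      if (i == n - 1 || last > total - 1) last = total - 1
      printf "Master[%d] -> Slots %d - %d\n", i, first, last
      first = last + 1
      cursor += per
    }
  }'
}

slot_ranges 3
```

With 3 masters the middle one ends up with 5462 slots and the other two with 5461, which matches the "(5462 slots)" line reported for 10.0.9.64:6001.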
The cluster information can be viewed from any node by connecting to port 6001 and port 7001 in turn. For example, on 10.0.9.63, run:
    # port 6001, commands
    bin/redis-cli -p 6001
    auth ETsb&11p
    cluster nodes
    cluster info
    exit

    # port 7001, commands
    bin/redis-cli -p 7001
    auth ETsb&11p
    cluster nodes
    cluster info
    exit

    # port 6001, results
    [root@czmaster redis-5.0.5]# bin/redis-cli -p 6001
    127.0.0.1:6001>
    127.0.0.1:6001> auth ETsb&11p
    OK
    127.0.0.1:6001>
    127.0.0.1:6001> cluster nodes
    d5ac772fa57f7ba65409e28de337877e04d4ae0b 10.0.9.65:7001@17001 slave 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 0 1705302217859 6 connected
    1f9b3ccd35c5b1e400f2c60fceff19cad556fa68 10.0.9.63:7001@17001 slave 70acb193c5eb2d2b4e9354136ea4025810541405 0 1705302216855 5 connected
    70acb193c5eb2d2b4e9354136ea4025810541405 10.0.9.65:6001@16001 master - 0 1705302218000 5 connected 10923-16383
    393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 10.0.9.64:6001@16001 master - 0 1705302218000 3 connected 5461-10922
    fec892994c3f418e216b919994bf77b79a5e0218 10.0.9.64:7001@17001 slave b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 0 1705302218561 4 connected
    b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 10.0.9.63:6001@16001 myself,master - 0 1705302217000 1 connected 0-5460
    127.0.0.1:6001> cluster info
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:6
    cluster_size:3
    cluster_current_epoch:6
    cluster_my_epoch:1
    cluster_stats_messages_ping_sent:21160
    cluster_stats_messages_pong_sent:20906
    cluster_stats_messages_sent:42066
    cluster_stats_messages_ping_received:20901
    cluster_stats_messages_pong_received:21160
    cluster_stats_messages_meet_received:5
    cluster_stats_messages_received:42066
    127.0.0.1:6001> exit
    [root@czmaster redis-5.0.5]#

    # port 7001, results
    [root@czmaster redis-5.0.5]# bin/redis-cli -p 7001
    127.0.0.1:7001>
    127.0.0.1:7001> auth ETsb&11p
    OK
    127.0.0.1:7001>
    127.0.0.1:7001> cluster nodes
    fec892994c3f418e216b919994bf77b79a5e0218 10.0.9.64:7001@17001 slave b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 0 1705302032527 4 connected
    393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 10.0.9.64:6001@16001 master - 0 1705302031624 3 connected 5461-10922
    1f9b3ccd35c5b1e400f2c60fceff19cad556fa68 10.0.9.63:7001@17001 myself,slave 70acb193c5eb2d2b4e9354136ea4025810541405 0 1705302031000 2 connected
    70acb193c5eb2d2b4e9354136ea4025810541405 10.0.9.65:6001@16001 master - 0 1705302032127 5 connected 10923-16383
    d5ac772fa57f7ba65409e28de337877e04d4ae0b 10.0.9.65:7001@17001 slave 393d16c1ff2f7b9506eb4e3db3c8061f5bf40111 0 1705302032000 6 connected
    b9254c329d8e3ae3ac4e238bb707698aa9dc4b2d 10.0.9.63:6001@16001 master - 0 1705302032628 1 connected 0-5460
    127.0.0.1:7001> cluster info
    cluster_state:ok
    cluster_slots_assigned:16384
    cluster_slots_ok:16384
    cluster_slots_pfail:0
    cluster_slots_fail:0
    cluster_known_nodes:6
    cluster_size:3
    cluster_current_epoch:6
    cluster_my_epoch:5
    cluster_stats_messages_ping_sent:20180
    cluster_stats_messages_pong_sent:20366
    cluster_stats_messages_meet_sent:3
    cluster_stats_messages_sent:40549
    cluster_stats_messages_ping_received:20363
    cluster_stats_messages_pong_received:20183
    cluster_stats_messages_meet_received:3
    cluster_stats_messages_received:40549
    127.0.0.1:7001> exit
    [root@czmaster redis-5.0.5]#
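When the same check has to be repeated after every restart, the cluster nodes output can be summarized instead of read by eye. A minimal sketch: count_roles is a hypothetical helper that reads cluster nodes output on stdin and counts masters and replicas by looking at the flags column (the third field).

```shell
# Summarize "cluster nodes" output read from stdin.
# The third field holds comma-separated flags such as "myself,master" or "slave".
count_roles() {
  awk '
    $3 ~ /master/ { masters++ }
    $3 ~ /slave/  { replicas++ }
    END { printf "masters=%d replicas=%d\n", masters, replicas }
  '
}

# Example (not executed here; REDIS_PASS is an assumed environment variable
# holding the requirepass value):
#   bin/redis-cli -a "$REDIS_PASS" -p 6001 cluster nodes | count_roles
```

For the cluster built in this document, a healthy result would be 3 masters and 3 replicas.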
9 Shut down the Redis cluster
Shutting down the Redis instances on ports 6001 and 7001 on every node is all it takes to shut down the Redis cluster.
    # === shut down 7001
    bin/redis-cli -p 7001
    auth ETsb&11p
    shutdown save
    exit

    # shut down 6001
    bin/redis-cli -p 6001
    auth ETsb&11p
    shutdown save
    exit

    # verify that the ports are no longer listening
    netstat -anp|grep 7001
    netstat -anp|grep 6001
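The interactive shutdown can also be scripted so that it runs the same way on every machine. This is a sketch under assumptions: stop_nodes is a hypothetical helper, REDIS_PASS is an assumed environment variable holding the requirepass value, and the -a option is used as in the cluster creation command in step 7 (redis-cli warns that passing a password this way may not be safe).

```shell
# Path to redis-cli; override REDIS_CLI when running outside /data/redis-5.0.5/.
REDIS_CLI=${REDIS_CLI:-bin/redis-cli}

# Send "shutdown save" to each local port given as an argument.
stop_nodes() {
  for port in "$@"; do
    "$REDIS_CLI" -a "$REDIS_PASS" -p "$port" shutdown save
  done
}

# Example (not executed here): stop the replica first, then the master.
#   cd /data/redis-5.0.5/ && REDIS_PASS='...' stop_nodes 7001 6001
```

The port order in the example follows the manual procedure above, which shuts down 7001 before 6001.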
10 Restart the Redis cluster
Without recreating the Redis cluster, first shut it down as described above; then, on each server in turn, start the Redis instances on ports 6001 and 7001:
    # ==== start
    cd /data/redis-5.0.5/
    bin/redis-server 6001/redis.conf
    cd /data/redis-5.0.5/
    bin/redis-server 7001/redis.conf
    netstat -anp|grep 6001
    netstat -anp|grep 7001
On startup, each instance automatically reads its nodes.conf file, and the Redis cluster comes back up.
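After a restart it can take a moment before every node has rejoined, so a short polling loop is a convenient way to confirm the cluster is healthy again. A minimal sketch, assuming REDIS_PASS holds the requirepass value; wait_cluster_ok is a hypothetical helper that polls cluster info until it reports cluster_state:ok.

```shell
# Path to redis-cli; override REDIS_CLI when running outside /data/redis-5.0.5/.
REDIS_CLI=${REDIS_CLI:-bin/redis-cli}

# Poll "cluster info" on a local port until cluster_state is ok.
# $1 = port, $2 = maximum number of attempts (default 30, one per second).
wait_cluster_ok() {
  port=$1
  tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # cluster info lines end in CRLF, so strip \r before matching the whole line.
    if "$REDIS_CLI" -a "$REDIS_PASS" -p "$port" cluster info 2>/dev/null \
        | tr -d '\r' | grep -qx 'cluster_state:ok'; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Example (not executed here):
#   wait_cluster_ok 6001 && echo "cluster is back"
```

A non-zero return after the timeout is a hint to look at the logs under /data/redis-5.0.5/logs/ before going further.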
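Part 3 The "Waiting for the cluster to join" problem and how to fix it
While creating the Redis cluster, the command once hung at "Waiting for the cluster to join" and the cluster was never created. Analysis showed that the firewall had no rule allowing access to ports 16001 and 17001, which is why the creation failed. The fix, once those ports have been opened as in step 5, is as follows: first stop the Redis instances on ports 6001 and 7001 on all 3 nodes; then delete /data/redis-5.0.5/6001/nodes.conf and /data/redis-5.0.5/7001/nodes.conf on each of the 3 nodes. If the nodes.conf files are not deleted, Redis reads the previous nodes.conf again on the next start by default, so if the cluster was in a broken state before the restart, it loads the same broken cluster configuration and stays broken.
Resolution procedure: stop the Redis instances on ports 6001 and 7001 on all 3 nodes, delete the nodes.conf files, and then run the cluster creation command again.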
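Before recreating the cluster, it can help to confirm from each machine that the cluster bus ports of the other nodes are reachable. The following is only a sketch: bus_endpoints and probe_bus_ports are hypothetical helpers, and the probe relies on bash's /dev/tcp feature and the coreutils timeout command being available.

```shell
# Turn "host:port" client addresses into "host busport" pairs
# (cluster bus port = client port + 10000).
bus_endpoints() {
  for node in "$@"; do
    host=${node%:*}
    port=${node##*:}
    echo "${host} $((port + 10000))"
  done
}

# Try a TCP connect to each cluster bus port and report the result.
# "unreachable" can mean a firewall block or that Redis is not running.
probe_bus_ports() {
  bus_endpoints "$@" | while read -r host port; do
    if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      echo "reachable   ${host}:${port}"
    else
      echo "unreachable ${host}:${port}"
    fi
  done
}

# Example (not executed here):
#   probe_bus_ports 10.0.9.63:6001 10.0.9.63:7001 10.0.9.64:6001 \
#                   10.0.9.64:7001 10.0.9.65:6001 10.0.9.65:7001
```

The probe is only meaningful while the Redis instances are running, so it belongs after the restart and before the cluster creation command is run again.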
Part 4 References and links
How to restart a Redis cluster
https://blog.csdn.net/justry_deng/article/details/89205155
How to fix the "Waiting for the cluster to join" problem
https://linux.m2osw.com/redis-infamous-waiting-cluster-join-message