Installing Redis 5.0.8 in Sentinel Mode

1. Redis master-slave replication

1.1. Environment

Three servers: one master and two slaves.

Role     Host IP        Redis port   Sentinel port
master   10.69.105.78   7000         26379
slave1   10.69.105.79   7001         26380
slave2   10.69.105.80   7002         26381

1.2. Installing Redis

1.2.1. Check whether gcc is installed

rpm -qa | grep gcc
If gcc is not installed, configure a yum repository and install it with:
yum -y install 'gcc*'

1.2.2. Download and install

groupadd -g 54321 redis
useradd -u 54321 -g redis redis
tar xvfz redis-5.0.8.tar.gz -C /home/redis

Compile and install:
cd /home/redis/redis-5.0.8
make
make PREFIX=/usr/local/redis/ install
mkdir -p /usr/local/redis/{etc,var,log,data}
cd /home/redis/redis-5.0.8/utils/

[root@cf-myjfx-yy-3 utils]# ./install_server.sh
Welcome to the redis service installer
This script will help you easily set up a running redis server

Please select the redis port for this instance: [6379] 7000
Please select the redis config file name [/etc/redis/7000.conf] /usr/local/redis/etc/redis-7000.conf
Please select the redis log file name [/var/log/redis_7000.log] /usr/local/redis/log/redis_7000.log
Please select the data directory for this instance [/var/lib/redis/7000] /usr/local/redis/data/7000
Please select the redis executable path [] /usr/local/redis/bin/redis-server
Selected config:
Port : 7000
Config file : /usr/local/redis/etc/redis-7000.conf
Log file : /usr/local/redis/log/redis_7000.log
Data dir : /usr/local/redis/data/7000
Executable : /usr/local/redis/bin/redis-server
Cli Executable : /usr/local/redis/bin/redis-cli
Is this ok? Then press ENTER to go on or Ctrl-C to abort.
Copied /tmp/7000.conf => /etc/init.d/redis_7000
Installing service...
Successfully added to chkconfig!
Successfully added to runlevels 345!
Starting Redis server...
Installation successful!
[root@cf-myjfx-yy-3 utils]#

Install the other two nodes by following the same steps, changing the port number to 7001 and 7002 respectively.

1.3. Replication configuration

The configurations below enable AOF (append-only file) persistence on every node.

1.3.1. Master configuration file: redis-7000.conf

daemonize yes
port 7000
logfile 7000.log
dir ./
requirepass 123
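# masterauth is set on the master (105.78) as well: after Sentinel elects a new master, the 7000 instance must authenticate to rejoin replication as a slave, otherwise it fails with a permission error.
masterauth 123
# In production, bind specific IPs and never use 0.0.0.0. 127.0.0.1 must be included here, and listed last, otherwise the instance fails to start.
# Keep comments on their own lines: redis.conf does not accept a trailing comment after a directive.
bind 10.69.105.78 127.0.0.1

# AOF persistence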
appendonly yes
appendfilename aof-7000.aof
appendfsync everysec
no-appendfsync-on-rewrite yes
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

1.3.2. Configure the two slaves

On 105.79 the configuration file is redis-7001.conf:

port 7001
daemonize yes
logfile 7001.log
dir ./
requirepass 123
replicaof 10.69.105.78 7000
masterauth 123
bind 10.69.105.79 127.0.0.1

# AOF persistence
appendonly yes
appendfilename aof-7001.aof
appendfsync everysec
no-appendfsync-on-rewrite yes
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb

On 105.80 the configuration file is redis-7002.conf:

port 7002
daemonize yes
logfile 7002.log
dir ./
requirepass 123
replicaof 10.69.105.78 7000
masterauth 123
bind 10.69.105.80 127.0.0.1

# AOF persistence
appendonly yes
appendfilename aof-7002.aof
appendfsync everysec
no-appendfsync-on-rewrite yes
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
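
The three files above differ only in the port, the bind address, and whether `replicaof` is present, so they can be produced from one template. The following is a minimal Python sketch under that assumption; `render_redis_conf` and `NODES` are hypothetical names introduced here for illustration, and the password `123` is the example value used in this article.

```python
def render_redis_conf(ip, port, password, master=None):
    """Return redis.conf text for one node.

    `master` is an (ip, port) tuple for a slave and None for the master.
    """
    lines = [
        f"port {port}",
        "daemonize yes",
        f"logfile {port}.log",
        "dir ./",
        f"requirepass {password}",
    ]
    if master is not None:
        # Only slaves point at a master.
        lines.append(f"replicaof {master[0]} {master[1]}")
    # masterauth goes on every node so that a demoted master can rejoin later.
    lines.append(f"masterauth {password}")
    lines.append(f"bind {ip} 127.0.0.1")
    lines += [
        "# AOF persistence",
        "appendonly yes",
        f"appendfilename aof-{port}.aof",
        "appendfsync everysec",
        "no-appendfsync-on-rewrite yes",
        "auto-aof-rewrite-percentage 100",
        "auto-aof-rewrite-min-size 64mb",
    ]
    return "\n".join(lines) + "\n"


MASTER = ("10.69.105.78", 7000)
NODES = [
    ("10.69.105.78", 7000, None),
    ("10.69.105.79", 7001, MASTER),
    ("10.69.105.80", 7002, MASTER),
]

if __name__ == "__main__":
    for ip, port, master in NODES:
        print(f"--- redis-{port}.conf ---")
        print(render_redis_conf(ip, port, "123", master))
```

Every comment is emitted on its own line, because a directive followed by a trailing `#` comment is rejected when Redis loads the file.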

1.3.3. Start the three Redis services (one command per server)

/usr/local/redis/bin/redis-server redis-7000.conf
/usr/local/redis/bin/redis-server redis-7001.conf
/usr/local/redis/bin/redis-server redis-7002.conf

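1.3.4. Check that replication works

On a slave node, run info replication. If the role is slave and the master's IP, port, and link status are shown, replication is configured correctly.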

127.0.0.1:7001> info replication
# Replication
role:slave
master_host:10.69.105.78
master_port:7000
master_link_status:up
master_last_io_seconds_ago:6
master_sync_in_progress:0
slave_repl_offset:154
slave_priority:100
slave_read_only:1
connected_slaves:0
master_replid:9ba0b6800780220f0f85d5c483a02de43d9daafb
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:154
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:154
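
The same check can be scripted instead of read by eye. The sketch below assumes the `info replication` output has already been captured as text (for example by redirecting redis-cli output to a file); `parse_info` and `replica_is_healthy` are hypothetical helper names, and the sample is a shortened copy of the output above.

```python
def parse_info(text):
    """Parse INFO output (key:value lines, '#' section headers) into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def replica_is_healthy(info, master_host, master_port):
    """True if the node is a slave of the expected master and the link is up."""
    return (
        info.get("role") == "slave"
        and info.get("master_host") == master_host
        and info.get("master_port") == str(master_port)
        and info.get("master_link_status") == "up"
    )


SAMPLE = """\
# Replication
role:slave
master_host:10.69.105.78
master_port:7000
master_link_status:up
slave_read_only:1
"""

if __name__ == "__main__":
    print(replica_is_healthy(parse_info(SAMPLE), "10.69.105.78", 7000))
```

Checking `master_link_status` as well as `role` matters: a slave whose `masterauth` is wrong still reports `role:slave`, but its link stays down.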

On the master node, run: set name zhangsan

127.0.0.1:7000> set name zhangsan
OK
127.0.0.1:7000>

Read the key back on both slaves: get name

127.0.0.1:7001> get name
"zhangsan"
127.0.0.1:7002> get name
"zhangsan"

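2. Configuring Sentinel

2.1. Sentinel configuration file

On the 105.78 server, go to /usr/local/redis/etc and create a configuration file named sentinel-26379.conf with the following content: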

port 26379
daemonize yes
logfile "/usr/local/redis/log/sentinel-26379.log"
dir "/usr/local/redis/data"
sentinel monitor mymaster 10.69.105.78 7000 2
sentinel down-after-milliseconds mymaster 30000
sentinel parallel-syncs mymaster 1
sentinel failover-timeout mymaster 15000
sentinel auth-pass mymaster 123
bind 10.69.105.78
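
Each node runs its own Sentinel, and the files differ only in the port (which also appears in the log file name) and the bind address. A minimal Python sketch of that per-node variation follows; `render_sentinel_conf` is a hypothetical helper, and its default values are simply the ones used in the listing above.

```python
def render_sentinel_conf(bind_ip, port, master_ip="10.69.105.78", master_port=7000,
                         quorum=2, password="123", name="mymaster"):
    """Return sentinel.conf text for one node, in the same order as the listing."""
    lines = [
        f"port {port}",
        "daemonize yes",
        f'logfile "/usr/local/redis/log/sentinel-{port}.log"',
        'dir "/usr/local/redis/data"',
        f"sentinel monitor {name} {master_ip} {master_port} {quorum}",
        f"sentinel down-after-milliseconds {name} 30000",
        f"sentinel parallel-syncs {name} 1",
        f"sentinel failover-timeout {name} 15000",
        f"sentinel auth-pass {name} {password}",
        f"bind {bind_ip}",
    ]
    return "\n".join(lines) + "\n"


# Sentinel ports per host, taken from the table in section 1.1.
SENTINELS = [("10.69.105.78", 26379), ("10.69.105.79", 26380), ("10.69.105.80", 26381)]

if __name__ == "__main__":
    for ip, port in SENTINELS:
        print(f"--- sentinel-{port}.conf ---")
        print(render_sentinel_conf(ip, port))
```

All three files monitor the same master address; only the sentinel's own port and bind IP change.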

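Create the same Sentinel configuration file in the Redis directory on 105.79 and 105.80, changing only the port and the bind IP. The configuration directives are explained below.

sentinel monitor <master-name> <ip> <redis-port> <quorum>
Tells Sentinel to monitor the master at ip:port. master-name is a name of your choice. quorum is the number of sentinels that must agree the master is unreachable before it is marked objectively down (+odown). The quorum only governs failure detection: to actually run the failover, a sentinel must also be elected leader by a majority of all sentinels.

sentinel auth-pass <master-name> <password>
Sets the password used to connect to the master and its slaves. Sentinel cannot be given different passwords for the master and the slaves, so they must all use the same password.

sentinel down-after-milliseconds <master-name> <milliseconds>
The time a master must be unreachable before this sentinel subjectively considers it down (+sdown). The unit is milliseconds; the default is 30 seconds.

sentinel parallel-syncs <master-name> <numslaves>
The maximum number of slaves that may resynchronize with the new master at the same time during a failover. The smaller the number, the longer the failover takes to complete; the larger the number, the more slaves are unavailable at once because of replication. Setting it to 1 ensures that only one slave at a time is unable to serve commands.

sentinel failover-timeout <master-name> <milliseconds>
failover-timeout is used in the following ways:
1. The interval between two failovers of the same master by the same sentinel.
2. The time allowed, counted from when a slave starts replicating from a wrong master, until it is corrected to replicate from the right master.
3. The time needed to cancel a failover that is already in progress.
4. The maximum time allowed for all slaves to be reconfigured to point to the new master during a failover. After this timeout the slaves are still reconfigured to point to the new master, but no longer according to the parallel-syncs setting.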
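
The quorum and parallel-syncs settings above come down to a little arithmetic. The sketch below is an illustration rather than Sentinel's own code, and the function names are hypothetical. It assumes the documented Sentinel behavior that a failover leader needs votes from a majority of all sentinels, and from at least quorum sentinels when the quorum is larger than the majority.

```python
import math


def is_objectively_down(sdown_reports, quorum):
    """+odown is reached once at least `quorum` sentinels report the master down."""
    return sdown_reports >= quorum


def votes_needed_for_failover(total_sentinels, quorum):
    """Votes a sentinel needs to become leader and start the failover."""
    majority = total_sentinels // 2 + 1
    return max(majority, quorum)


def resync_rounds(slaves_to_reconfigure, parallel_syncs):
    """Number of batches needed to repoint the slaves at the new master."""
    return math.ceil(slaves_to_reconfigure / parallel_syncs)


if __name__ == "__main__":
    # The setup in this article: 3 sentinels, quorum 2, parallel-syncs 1.
    print(is_objectively_down(2, 2))
    print(votes_needed_for_failover(3, 2))
    # After the master fails, one surviving slave remains to be repointed.
    print(resync_rounds(1, 1))
```

With 3 sentinels and a quorum of 2, losing one sentinel together with the master still leaves enough votes to fail over; with only 2 sentinels, it would not.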

2.2. Starting Sentinel

2.2.1. Start Sentinel on the master node

[root@cf-myjfx-yy-3 bin]# ./redis-sentinel ../etc/sentinel-26379.conf 
[root@cf-myjfx-yy-3 bin]# tail -f ../log/sentinel-26379.log
65899:X 21 May 2020 19:13:08.857 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
65899:X 21 May 2020 19:13:08.857 # Redis version=5.0.8, bits=64, commit=00000000, modified=0, pid=65899, just started
65899:X 21 May 2020 19:13:08.857 # Configuration loaded
65900:X 21 May 2020 19:13:08.860 * Increased maximum number of open files to 10032 (it was originally set to 1024).
65900:X 21 May 2020 19:13:08.860 * Running mode=sentinel, port=26379.
65900:X 21 May 2020 19:13:08.860 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
65900:X 21 May 2020 19:13:08.862 # Sentinel ID is b16d07d27e85d3127b05cf7abd392f7eb11ba688
65900:X 21 May 2020 19:13:08.862 # +monitor master mymaster 10.69.105.78 7000 quorum 2
65900:X 21 May 2020 19:13:08.862 * +slave slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.78 7000
65900:X 21 May 2020 19:13:08.863 * +slave slave 10.69.105.80:7002 10.69.105.80 7002 @ mymaster 10.69.105.78 7000

2.2.2. Start Sentinel on the 105.79 node

[root@cf-myjfdx-yy-4 bin]# ./redis-sentinel ../etc/sentinel-26380.conf
[root@cf-myjfdx-yy-4 bin]# tail -f ../log/sentinel-26380.log
65550:X 21 May 2020 19:17:28.786 # Redis version=5.0.8, bits=64, commit=00000000, modified=0, pid=65550, just started
65550:X 21 May 2020 19:17:28.786 # Configuration loaded
65551:X 21 May 2020 19:17:28.787 * Increased maximum number of open files to 10032 (it was originally set to 1024).
65551:X 21 May 2020 19:17:28.788 * Running mode=sentinel, port=26380.
65551:X 21 May 2020 19:17:28.788 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
65551:X 21 May 2020 19:17:28.789 # Sentinel ID is 935e08bb931ee7734b04b48f7f8122f1f8e483cf
65551:X 21 May 2020 19:17:28.789 # +monitor master mymaster 10.69.105.78 7000 quorum 2
65551:X 21 May 2020 19:17:28.790 * +slave slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:17:28.792 * +slave slave 10.69.105.80:7002 10.69.105.80 7002 @ mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:17:30.577 * +sentinel sentinel b16d07d27e85d3127b05cf7abd392f7eb11ba688 10.69.105.78 26379 @ mymaster 10.69.105.78 7000

2.2.3. Start Sentinel on the 105.80 node

[root@cf-myjfdx-yy-5 bin]# ./redis-sentinel ../etc/sentinel-26381.conf 
[root@cf-myjfdx-yy-5 bin]# tail -f ../log/sentinel-26381.log
65860:X 21 May 2020 19:16:16.527 # Configuration loaded
65861:X 21 May 2020 19:16:16.531 * Increased maximum number of open files to 10032 (it was originally set to 1024).
65861:X 21 May 2020 19:16:16.533 * Running mode=sentinel, port=26381.
65861:X 21 May 2020 19:16:16.533 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
65861:X 21 May 2020 19:16:16.535 # Sentinel ID is 3e1929e27b04d1e4c9483c56a96c66e61db2abc6
65861:X 21 May 2020 19:16:16.535 # +monitor master mymaster 10.69.105.78 7000 quorum 2
65861:X 21 May 2020 19:16:16.536 * +slave slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.78 7000
65861:X 21 May 2020 19:16:16.537 * +slave slave 10.69.105.80:7002 10.69.105.80 7002 @ mymaster 10.69.105.78 7000
65861:X 21 May 2020 19:16:17.261 * +sentinel sentinel b16d07d27e85d3127b05cf7abd392f7eb11ba688 10.69.105.78 26379 @ mymaster 10.69.105.78 7000
65861:X 21 May 2020 19:16:17.433 * +sentinel sentinel 935e08bb931ee7734b04b48f7f8122f1f8e483cf 10.69.105.79 26380 @ mymaster 10.69.105.78 7000

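Once the sentinels are running, look at sentinel-26379.conf again: Sentinel has rewritten the file, and its contents now list the known slaves and sentinels. The listing below appears to have been captured after the failover test in section 3, which is why it monitors 10.69.105.80 7002 and lists 10.69.105.78 7000 as a replica.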

[root@cf-myjfx-yy-3 bin]# cat /usr/local/redis/etc/sentinel-26379.conf 
port 26379
daemonize yes
logfile "/usr/local/redis/log/sentinel-26379.log"
dir "/usr/local/redis/data"
sentinel myid b16d07d27e85d3127b05cf7abd392f7eb11ba688
sentinel deny-scripts-reconfig yes
sentinel monitor mymaster 10.69.105.80 7002 2
sentinel failover-timeout mymaster 15000
sentinel auth-pass mymaster 123456
bind 10.69.105.78
# Generated by CONFIG REWRITE
protected-mode no
sentinel config-epoch mymaster 1
sentinel leader-epoch mymaster 1
sentinel known-replica mymaster 10.69.105.78 7000
sentinel known-replica mymaster 10.69.105.79 7001
sentinel known-sentinel mymaster 10.69.105.80 26381 3e1929e27b04d1e4c9483c56a96c66e61db2abc6
sentinel known-sentinel mymaster 10.69.105.79 26380 935e08bb931ee7734b04b48f7f8122f1f8e483cf
sentinel current-epoch 1
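
Because Sentinel persists its view of the topology in this file, the file can be read back to see which master, slaves, and peer sentinels a node currently knows. A minimal Python sketch, assuming the directive layout shown in the listing above; `parse_sentinel_state` is a hypothetical helper and the sample is a shortened copy of that listing.

```python
def parse_sentinel_state(conf_text, name="mymaster"):
    """Extract the monitored master, known replicas and known sentinels."""
    state = {"master": None, "replicas": [], "sentinels": []}
    for line in conf_text.splitlines():
        parts = line.split()
        # Only lines of the form: sentinel <directive> <master-name> <args...>
        if len(parts) < 3 or parts[0] != "sentinel" or parts[2] != name:
            continue
        directive, args = parts[1], parts[3:]
        if directive == "monitor":
            state["master"] = (args[0], int(args[1]))
        elif directive == "known-replica":
            state["replicas"].append((args[0], int(args[1])))
        elif directive == "known-sentinel":
            state["sentinels"].append((args[0], int(args[1]), args[2]))
    return state


SAMPLE = """\
sentinel myid b16d07d27e85d3127b05cf7abd392f7eb11ba688
sentinel monitor mymaster 10.69.105.80 7002 2
sentinel config-epoch mymaster 1
sentinel known-replica mymaster 10.69.105.78 7000
sentinel known-replica mymaster 10.69.105.79 7001
sentinel known-sentinel mymaster 10.69.105.80 26381 3e1929e27b04d1e4c9483c56a96c66e61db2abc6
sentinel known-sentinel mymaster 10.69.105.79 26380 935e08bb931ee7734b04b48f7f8122f1f8e483cf
sentinel current-epoch 1
"""

if __name__ == "__main__":
    print(parse_sentinel_state(SAMPLE))
```

The file is only a snapshot written by Sentinel itself; it is read here, never edited, since Sentinel rewrites it whenever its state changes.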

3. Verifying failover

3.1. Shut down the master (105.78)

Check sentinel-26379.log on 105.78:

====== The master on 105.78 is detected as unavailable (sdown)
65900:X 21 May 2020 19:18:35.895 # +sdown master mymaster 10.69.105.78 7000
65900:X 21 May 2020 19:18:35.985 # +new-epoch 1
65900:X 21 May 2020 19:18:35.986 # +vote-for-leader 935e08bb931ee7734b04b48f7f8122f1f8e483cf 1
65900:X 21 May 2020 19:18:36.861 # +config-update-from sentinel 935e08bb931ee7734b04b48f7f8122f1f8e483cf 10.69.105.79 26380 @ mymaster 10.69.105.78 7000
65900:X 21 May 2020 19:18:36.862 # +switch-master mymaster 10.69.105.78 7000 10.69.105.80 7002
65900:X 21 May 2020 19:18:36.862 * +slave slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.80 7002
65900:X 21 May 2020 19:18:36.862 * +slave slave 10.69.105.78:7000 10.69.105.78 7000 @ mymaster 10.69.105.80 7002
65900:X 21 May 2020 19:19:06.883 # +sdown slave 10.69.105.78:7000 10.69.105.78 7000 @ mymaster 10.69.105.80 7002

3.2. Sentinel log on 105.79

65551:X 21 May 2020 19:22:31.133 # +sdown master mymaster 10.69.105.78 7000
===== After polling the other sentinels, 2 of them agree that the master is down (odown)
65551:X 21 May 2020 19:22:31.199 # +odown master mymaster 10.69.105.78 7000 #quorum 2/2
===== The current epoch is updated
65551:X 21 May 2020 19:22:31.199 # +new-epoch 1
===== The conditions for a failover are met; waiting to be elected by the other sentinels
65551:X 21 May 2020 19:22:31.199 # +try-failover master mymaster 10.69.105.78 7000
===== The sentinels vote to elect the leader sentinel that will run the failover
65551:X 21 May 2020 19:22:31.200 # +vote-for-leader 935e08bb931ee7734b04b48f7f8122f1f8e483cf 1
65551:X 21 May 2020 19:22:31.203 # b16d07d27e85d3127b05cf7abd392f7eb11ba688 voted for 935e08bb931ee7734b04b48f7f8122f1f8e483cf 1
65551:X 21 May 2020 19:22:31.204 # 3e1929e27b04d1e4c9483c56a96c66e61db2abc6 voted for 935e08bb931ee7734b04b48f7f8122f1f8e483cf 1
65551:X 21 May 2020 19:22:31.255 # +elected-leader master mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:22:31.255 # +failover-state-select-slave master mymaster 10.69.105.78 7000
===== A slave is selected to become the new master
65551:X 21 May 2020 19:22:31.308 # +selected-slave slave 10.69.105.80:7002 10.69.105.80 7002 @ mymaster 10.69.105.78 7000
===== The selected slave is promoted to master
65551:X 21 May 2020 19:22:31.308 * +failover-state-send-slaveof-noone slave 10.69.105.80:7002 10.69.105.80 7002 @ mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:22:31.408 * +failover-state-wait-promotion slave 10.69.105.80:7002 10.69.105.80 7002 @ mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:22:32.003 # +promoted-slave slave 10.69.105.80:7002 10.69.105.80 7002 @ mymaster 10.69.105.78 7000
===== The failover state changes to reconf-slaves
65551:X 21 May 2020 19:22:32.003 # +failover-state-reconf-slaves master mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:22:32.078 * +slave-reconf-sent slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:22:32.309 # -odown master mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:22:33.037 * +slave-reconf-inprog slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:22:33.037 * +slave-reconf-done slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.78 7000
65551:X 21 May 2020 19:22:33.092 # +failover-end master mymaster 10.69.105.78 7000
===== The master address changes
65551:X 21 May 2020 19:22:33.092 # +switch-master mymaster 10.69.105.78 7000 10.69.105.80 7002
===== Slaves are detected and added to the slave list
65551:X 21 May 2020 19:22:33.093 * +slave slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.80 7002
65551:X 21 May 2020 19:22:33.093 * +slave slave 10.69.105.78:7000 10.69.105.78 7000 @ mymaster 10.69.105.80 7002
65551:X 21 May 2020 19:23:03.158 # +sdown slave 10.69.105.78:7000 10.69.105.78 7000 @ mymaster 10.69.105.80 7002

3.3. Sentinel log on 105.80

65861:X 21 May 2020 19:20:47.262 # +sdown master mymaster 10.69.105.78 7000
65861:X 21 May 2020 19:20:47.292 # +new-epoch 1
65861:X 21 May 2020 19:20:47.293 # +vote-for-leader 935e08bb931ee7734b04b48f7f8122f1f8e483cf 1
65861:X 21 May 2020 19:20:47.345 # +odown master mymaster 10.69.105.78 7000 #quorum 3/2
65861:X 21 May 2020 19:20:47.345 # Next failover delay: I will not start a failover before Thu May 21 19:21:17 2020
65861:X 21 May 2020 19:20:48.168 # +config-update-from sentinel 935e08bb931ee7734b04b48f7f8122f1f8e483cf 10.69.105.79 26380 @ mymaster 10.69.105.78 7000
65861:X 21 May 2020 19:20:48.168 # +switch-master mymaster 10.69.105.78 7000 10.69.105.80 7002
65861:X 21 May 2020 19:20:48.169 * +slave slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.80 7002
65861:X 21 May 2020 19:20:48.169 * +slave slave 10.69.105.78:7000 10.69.105.78 7000 @ mymaster 10.69.105.80 7002
65861:X 21 May 2020 19:21:18.222 # +sdown slave 10.69.105.78:7000 10.69.105.78 7000 @ mymaster 10.69.105.80 7002
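
The three logs tell the same story from different viewpoints, and the decisive line in each is +switch-master. As a rough sketch, the events can be pulled out of a log with a regular expression. The pattern below assumes the line layout seen in the logs above (pid, role letter X, timestamp, a # or * marker, then the event name); `parse_events` and `find_switch` are hypothetical helpers.

```python
import re

# pid:X <day> <month> <year> <time> <#|*> <+event or -event> <details>
LINE_RE = re.compile(
    r"^(?P<pid>\d+):X (?P<ts>\d+ \w+ \d+ [\d:.]+) [#*] "
    r"(?P<event>[+-][\w-]+) (?P<details>.*)$"
)


def parse_events(log_text):
    """Return (timestamp, event, details) for every event line in the log."""
    events = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            events.append((m["ts"], m["event"], m["details"]))
    return events


def find_switch(events):
    """Return the old and new master from the first +switch-master event."""
    for _ts, event, details in events:
        if event == "+switch-master":
            name, old_ip, old_port, new_ip, new_port = details.split()
            return {
                "name": name,
                "old": (old_ip, int(old_port)),
                "new": (new_ip, int(new_port)),
            }
    return None


SAMPLE = """\
65861:X 21 May 2020 19:20:47.262 # +sdown master mymaster 10.69.105.78 7000
65861:X 21 May 2020 19:20:47.345 # Next failover delay: I will not start a failover before Thu May 21 19:21:17 2020
65861:X 21 May 2020 19:20:48.168 # +switch-master mymaster 10.69.105.78 7000 10.69.105.80 7002
65861:X 21 May 2020 19:20:48.169 * +slave slave 10.69.105.79:7001 10.69.105.79 7001 @ mymaster 10.69.105.80 7002
"""

if __name__ == "__main__":
    print(find_switch(parse_events(SAMPLE)))
```

Lines that do not start with a +event or -event name, such as the "Next failover delay" notice, are skipped rather than treated as errors.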