PostgreSQL高可用repmgr

repmgr简介

repmgr 是一个开源工具套件,用于管理 PostgreSQL 服务器集群中的复制和故障转移。它通过设置备用服务器、监控复制以及执行故障转移或手动切换操作等管理任务的工具增强了 PostgreSQL 的内置热备用功能。官方主页

repmgr构架

repmgr流复制管理工具对集群每个节点都有自己的repmgr.conf配置文件,用来记录本节点的ID、节点名称、连接信息、数据库PGDATA目录等配置参数。在配置好这些参数后就可以通过repmgr命令实现对集群节点的“一键式”部署。部署完成后,每个节点都有自己的repmgrd守护进程来监控节点数据库状态,且每个节点维护自己的元数据表,用于记录所有集群节点的信息。其中主节点守护进程主要用来监控本节点数据库服务状态,备节点守护进程主要用来监控主节点和本节点数据库服务状态。可以手动提升standby为primary和自动切换primary和standby角色。

下面我们就从零开始基于repmgr工具部署一套1主2从的高可用环境,并测试手动切换的场景。

环境准备

主机名 IP地址 操作系统版本 数据库版本 repmgr版本 角色
pg1 192.168.17.71 CentOS7.9 14.9 5.3.3 primary
pg2 192.168.17.72 CentOS7.9 14.9 5.3.3 standby
pg3 192.168.17.73 CentOS7.9 14.9 5.3.3 standby

下载软件地址

https://ftp.postgresql.org/pub/source/v14.9/postgresql-14.9.tar.gz

https://www.repmgr.org/download/repmgr-5.3.3.tar.gz

操作系统准备

安装CentOS7.9这里省略

关闭防火墙

systemctl stop firewalld systemctl disable firewalld

3台主机都要操作

[root@pg1 ~]# systemctl stop firewalld [root@pg1 ~]# systemctl disable firewalld Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service. Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service. [root@pg2 ~]# systemctl stop firewalld [root@pg2 ~]# systemctl disable firewalld Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service. Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service. [root@pg3 ~]# systemctl stop firewalld [root@pg3 ~]# systemctl disable firewalld Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service. Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service

关闭SELINUX

setenforce 0 sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

3台主机都要操作

[root@pg1 ~]# setenforce 0 [root@pg1 ~]# sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config [root@pg2 ~]# setenforce 0 [root@pg2 ~]# sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config [root@pg3 ~]# setenforce 0 [root@pg3 ~]# sed -ri 's/SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config

配置yum源

sudo sed -e 's|^mirrorlist=|#mirrorlist=|g' \ -e 's|^#baseurl=http://mirror.centos.org|baseurl=https://mirrors.tuna.tsinghua.edu.cn|g' \ -i.bak \ /etc/yum.repos.d/CentOS-*.repo

3台主机都要操作

[root@pg1 ~]# sudo sed -e 's|^mirrorlist=|#mirrorlist=|g' \ > -e 's|^#baseurl=http://mirror.centos.org|baseurl=https://mirrors.tuna.tsinghua.edu.cn|g' \ > -i.bak \ > /etc/yum.repos.d/CentOS-*.repo [root@pg1 ~]# [root@pg2 ~]# sudo sed -e 's|^mirrorlist=|#mirrorlist=|g' \ > -e 's|^#baseurl=http://mirror.centos.org|baseurl=https://mirrors.tuna.tsinghua.edu.cn|g' \ > -i.bak \ > /etc/yum.repos.d/CentOS-*.repo [root@pg2 ~]# [root@pg3 ~]# sudo sed -e 's|^mirrorlist=|#mirrorlist=|g' \ > -e 's|^#baseurl=http://mirror.centos.org|baseurl=https://mirrors.tuna.tsinghua.edu.cn|g' \ > -i.bak \ > /etc/yum.repos.d/CentOS-*.repo [root@pg3 ~]#

配置时间同步

yum install -y ntpdate ntpdate time.windows.com && hwclock -w

这里我们3台主机简单配置一下时间同步操作略

安装所需依赖的包

yum install -y readline readline-devel zlib zlib-devel gettext gettext-devel openssl openssl-devel pam pam-devel libxml2 libxml2-devel libxslt libxslt-devel perl perl-devel tcl-devel uuid-devel gcc gcc-c++ make flex bison perl-ExtUtils*

3台主机都要操作,操作略

设置hosts文件

cat >> /etc/hosts << EOF 192.168.17.71 pg1 192.168.17.72 pg2 192.168.17.73 pg3 EOF

3台主机都要操作

[root@pg1 ~]# cat >> /etc/hosts << EOF > > 192.168.17.71 pg1 > 192.168.17.72 pg2 > 192.168.17.73 pg3 > > EOF [root@pg1 ~]# [root@pg2 ~]# cat >> /etc/hosts << EOF > > 192.168.17.71 pg1 > 192.168.17.72 pg2 > 192.168.17.73 pg3 > > EOF [root@pg2 ~]# [root@pg3 ~]# cat >> /etc/hosts << EOF > > 192.168.17.71 pg1 > 192.168.17.72 pg2 > 192.168.17.73 pg3 > > EOF [root@pg3 ~]#

添加用户

groupadd postgres useradd -g postgres postgres echo "postgres"|passwd --stdin postgres

3台主机都要操作

[root@pg1 ~]# groupadd postgres [root@pg1 ~]# useradd -g postgres postgres [root@pg1 ~]# echo "postgres"|passwd --stdin postgres Changing password for user postgres. passwd: all authentication tokens updated successfully. [root@pg1 ~]# [root@pg2 ~]# groupadd postgres [root@pg2 ~]# useradd -g postgres postgres [root@pg2 ~]# echo "postgres"|passwd --stdin postgres Changing password for user postgres. passwd: all authentication tokens updated successfully. [root@pg2 ~]# [root@pg3 ~]# groupadd postgres [root@pg3 ~]# useradd -g postgres postgres [root@pg3 ~]# echo "postgres"|passwd --stdin postgres Changing password for user postgres. passwd: all authentication tokens updated successfully. [root@pg3 ~]#

创建目录

mkdir -p /opt/pg14 /opt/pgdata /opt/pgarchive chown -R postgres:postgres /opt chmod 0755 /opt/pg14 chmod 0700 /opt/pgdata /opt/pgarchive

3台主机都要操作

[root@pg1 ~]# mkdir -p /opt/pg14 /opt/pgdata /opt/pgarchive [root@pg1 ~]# chown -R postgres:postgres /opt [root@pg1 ~]# chmod 0755 /opt/pg14 [root@pg1 ~]# chmod 0700 /opt/pgdata /opt/pgarchive [root@pg1 ~]# [root@pg2 ~]# mkdir -p /opt/pg14 /opt/pgdata /opt/pgarchive [root@pg2 ~]# chown -R postgres:postgres /opt [root@pg2 ~]# chmod 0755 /opt/pg14 [root@pg2 ~]# chmod 0700 /opt/pgdata /opt/pgarchive [root@pg2 ~]# [root@pg3 ~]# mkdir -p /opt/pg14 /opt/pgdata /opt/pgarchive [root@pg3 ~]# chown -R postgres:postgres /opt [root@pg3 ~]# chmod 0755 /opt/pg14 [root@pg3 ~]# chmod 0700 /opt/pgdata /opt/pgarchive [root@pg3 ~]#

源码编译软件

安装PostgreSQL和repmgr

把软件包postgresql-14.9.tar.gz和repmgr-5.3.3.tar.gz上传到/home/postgres目录下

su - postgres tar zxf postgresql-14.9.tar.gz cd postgresql-14.9/ ./configure --prefix=/opt/pg14 gmake world && gmake install-world cd .. tar zxf repmgr-5.3.3.tar.gz cd repmgr-5.3.3/ ./configure PG_CONFIG=/opt/pg14/bin/pg_config make && make install

3台主机都要操作,操作略

配置环境变量

cd ~ cat >> /home/postgres/.bash_profile <<EOF #PostgreSQL settings export PGPORT=5432 export PGUSER=postgres export PGHOME=/opt/pg14 export PGDATA=/opt/pgdata export LD_LIBRARY_PATH=\$PGHOME/lib:\$LD_LIBRARY_PATH export PATH=\$PGHOME/bin:\$PATH EOF source /home/postgres/.bash_profile postgres --version repmgr --version

3台主机都要操作

[postgres@pg1 repmgr-5.3.3]$ cd ~ [postgres@pg1 ~]$ cat >> /home/postgres/.bash_profile <<EOF > > #PostgreSQL settings > export PGPORT=5432 > export PGUSER=postgres > export PGHOME=/opt/pg14 > export PGDATA=/opt/pgdata > export LD_LIBRARY_PATH=\$PGHOME/lib:\$LD_LIBRARY_PATH > export PATH=\$PGHOME/bin:\$PATH > > EOF [postgres@pg1 ~]$ source /home/postgres/.bash_profile [postgres@pg1 ~]$ postgres --version postgres (PostgreSQL) 14.9 [postgres@pg1 ~]$ repmgr --version repmgr 5.3.3 [postgres@pg1 ~]$ [postgres@pg2 repmgr-5.3.3]$ cd ~ [postgres@pg2 ~]$ cat >> /home/postgres/.bash_profile <<EOF > > #PostgreSQL settings > export PGPORT=5432 > export PGUSER=postgres > export PGHOME=/opt/pg14 > export PGDATA=/opt/pgdata > export LD_LIBRARY_PATH=\$PGHOME/lib:\$LD_LIBRARY_PATH > export PATH=\$PGHOME/bin:\$PATH > > EOF [postgres@pg2 ~]$ source /home/postgres/.bash_profile [postgres@pg2 ~]$ postgres --version postgres (PostgreSQL) 14.9 [postgres@pg2 ~]$ repmgr --version repmgr 5.3.3 [postgres@pg2 ~]$ [postgres@pg3 repmgr-5.3.3]$ cd ~ [postgres@pg3 ~]$ cat >> /home/postgres/.bash_profile <<EOF > > #PostgreSQL settings > export PGPORT=5432 > export PGUSER=postgres > export PGHOME=/opt/pg14 > export PGDATA=/opt/pgdata > export LD_LIBRARY_PATH=\$PGHOME/lib:\$LD_LIBRARY_PATH > export PATH=\$PGHOME/bin:\$PATH > > EOF [postgres@pg3 ~]$ source /home/postgres/.bash_profile [postgres@pg3 ~]$ postgres --version postgres (PostgreSQL) 14.9 [postgres@pg3 ~]$ repmgr --version repmgr 5.3.3 [postgres@pg3 ~]$

配置ssh互信

ssh-keygen -t rsa for i in pg{1..3}; do echo ">>> $i";ssh-copy-id $i;done

3台主机都要操作 ,输入密码postgres

[postgres@pg1 ~]$ ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/home/postgres/.ssh/id_rsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /home/postgres/.ssh/id_rsa. Your public key has been saved in /home/postgres/.ssh/id_rsa.pub. The key fingerprint is: SHA256:eCoDf0PPOzm8VWaagAsxx3nfGUtSWmV7JBoJAg+PQ50 postgres@pg1 The key's randomart image is: +---[RSA 2048]----+ | +o....+o+ .| | o *E. +.+ + | | o * + o + . .| | + = . + + . | | . . + S . B | | o o * . * | | + =.o.+ | | + .=o | | o+ | +----[SHA256]-----+ [postgres@pg1 ~]$ for i in pg{1..3}; do echo ">>> $i";ssh-copy-id $i;done >>> pg1 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg1 (192.168.17.71)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg1's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg1'" and check to make sure that only the key(s) you wanted were added. >>> pg2 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg2 (192.168.17.72)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg2's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg2'" and check to make sure that only the key(s) you wanted were added. >>> pg3 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg3 (192.168.17.73)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg3's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg3'" and check to make sure that only the key(s) you wanted were added. [postgres@pg1 ~]$ [postgres@pg2 ~]$ ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/home/postgres/.ssh/id_rsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /home/postgres/.ssh/id_rsa. Your public key has been saved in /home/postgres/.ssh/id_rsa.pub. The key fingerprint is: SHA256:eJZh5ISEk4NxN3YLKhoU9n9kDaTVz8JF+tS5lzFLIAE postgres@pg2 The key's randomart image is: +---[RSA 2048]----+ | +oo.+B==Eo+.. | |o o.=++Boo..o o | |. ...+ o*o+. o + | | o .. oo =oo o =| |. ...S .. . + | | .o . | | | | | | | +----[SHA256]-----+ [postgres@pg2 ~]$ for i in pg{1..3}; do echo ">>> $i";ssh-copy-id $i;done >>> pg1 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg1 (192.168.17.71)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg1's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg1'" and check to make sure that only the key(s) you wanted were added. >>> pg2 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg2 (192.168.17.72)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg2's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg2'" and check to make sure that only the key(s) you wanted were added. >>> pg3 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg3 (192.168.17.73)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg3's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg3'" and check to make sure that only the key(s) you wanted were added. [postgres@pg2 ~]$ [postgres@pg3 ~]$ ssh-keygen -t rsa Generating public/private rsa key pair. Enter file in which to save the key (/home/postgres/.ssh/id_rsa): Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /home/postgres/.ssh/id_rsa. Your public key has been saved in /home/postgres/.ssh/id_rsa.pub. The key fingerprint is: SHA256:PDQrCDLUUtVv04enNE5PhhYTlVJ3gFds42EQ6r0IeVk postgres@pg3 The key's randomart image is: +---[RSA 2048]----+ | .o.... .+*==o| |.. . . +o.o++| |o o + ..*Eo.o| | o . . o *oO+= . | | . . So*+O. | | . .oo... | | . . | | | | | +----[SHA256]-----+ [postgres@pg3 ~]$ for i in pg{1..3}; do echo ">>> $i";ssh-copy-id $i;done >>> pg1 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg1 (192.168.17.71)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg1's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg1'" and check to make sure that only the key(s) you wanted were added. >>> pg2 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg2 (192.168.17.72)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg2's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg2'" and check to make sure that only the key(s) you wanted were added. >>> pg3 /bin/ssh-copy-id: INFO: Source of key(s) to be installed: "/home/postgres/.ssh/id_rsa.pub" The authenticity of host 'pg3 (192.168.17.73)' can't be established. ECDSA key fingerprint is SHA256:TU9s7eIMfhdN7CEZ8rcEHCCcQsNGJOHQVmwGzMrPBbE. ECDSA key fingerprint is MD5:71:5d:3d:a6:d4:31:d1:e1:69:6b:2a:14:18:9f:02:be. Are you sure you want to continue connecting (yes/no)? yes /bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed /bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys postgres@pg3's password: Number of key(s) added: 1 Now try logging into the machine, with: "ssh 'pg3'" and check to make sure that only the key(s) you wanted were added. [postgres@pg3 ~]$

初始化数据库(只需主库)

initdb -D/opt/pgdata -EUTF8 -Upostgres -W

只需主库操作

[postgres@pg1 ~]$ initdb -D/opt/pgdata -EUTF8 -Upostgres -W The files belonging to this database system will be owned by user "postgres". This user must also own the server process. The database cluster will be initialized with locale "en_US.UTF-8". The default text search configuration will be set to "english". Data page checksums are disabled. Enter new superuser password: Enter it again: fixing permissions on existing directory /opt/pgdata ... ok creating subdirectories ... ok selecting dynamic shared memory implementation ... posix selecting default max_connections ... 100 selecting default shared_buffers ... 128MB selecting default time zone ... Asia/Shanghai creating configuration files ... ok running bootstrap script ... ok performing post-bootstrap initialization ... ok syncing data to disk ... ok initdb: warning: enabling "trust" authentication for local connections You can change this by editing pg_hba.conf or using the option -A, or --auth-local and --auth-host, the next time you run initdb. Success. You can now start the database server using: pg_ctl -D /opt/pgdata -l logfile start [postgres@pg1 ~]$

主库修改参数文件

cat >> /opt/pgdata/postgresql.conf <<EOF #add by custom listen_addresses = '*' port = 5432 max_connections = 2000 wal_level = replica archive_mode = on archive_command = 'test ! -f /opt/pgarchive/%f && cp %p /opt/pgarchive/%f' max_wal_senders = 10 wal_keep_size = 512 hot_standby = on wal_log_hints = on shared_preload_libraries = 'repmgr' EOF

仅主库操作

[postgres@pg1 ~]$ cat >> /opt/pgdata/postgresql.conf <<EOF > > #add by custom > listen_addresses = '*' > port = 5432 > max_connections = 2000 > wal_level = replica > archive_mode = on > archive_command = 'test ! -f /opt/pgarchive/%f && cp %p /opt/pgarchive/%f' > max_wal_senders = 10 > wal_keep_size = 512 > hot_standby = on > wal_log_hints = on > shared_preload_libraries = 'repmgr' > > EOF [postgres@pg1 ~]$

主库修改认证文件

cat >> /opt/pgdata/pg_hba.conf <<EOF #add by repmgr local all all trust host repmgr repmgr 192.168.17.71/32 trust host repmgr repmgr 192.168.17.72/32 trust host repmgr repmgr 192.168.17.73/32 trust host all all 0/0 md5 host all all ::/0 md5 #forbid self-replication its own IP local replication all reject host replication all 127.0.0.0/8 reject host replication all ::1/128 reject #allow any standby connection host replication repuser 0.0.0.0/0 trust EOF

仅在主库操作

[postgres@pg1 ~]$ cat >> /opt/pgdata/pg_hba.conf <<EOF > > #add by repmgr > local all all trust > host repmgr repmgr 192.168.17.71/32 trust > host repmgr repmgr 192.168.17.72/32 trust > host repmgr repmgr 192.168.17.73/32 trust > host all all 0/0 md5 > host all all ::/0 md5 > > #forbid self-replication its own IP > local replication all reject > host replication all 127.0.0.0/8 reject > host replication all ::1/128 reject > > #allow any standby connection > host replication repuser 0.0.0.0/0 trust > > EOF [postgres@pg1 ~]$

启动主库

pg_ctl -D /opt/pgdata -l logfile start

仅在主库操作

[postgres@pg1 ~]$ pg_ctl -D /opt/pgdata -l logfile start waiting for server to start.... done server started [postgres@pg1 ~]$

配置repmgr集群环境

创建repmgr库和用户、创建replication用户

psql
create user repmgr superuser password 'repmgr';
alter user repmgr set search_path to repmgr, "$user", public;
create database repmgr owner repmgr;
create user repuser replication;
\l

仅在主库操作

[postgres@pg1 ~]$ psql
psql (14.9)
Type "help" for help.

postgres=# create user repmgr superuser password 'repmgr';
CREATE ROLE
postgres=# alter user repmgr set search_path to repmgr, "$user", public;
ALTER ROLE
postgres=# create database repmgr owner repmgr;
CREATE DATABASE
postgres=# create user repuser replication;
CREATE ROLE
postgres=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 repmgr    | repmgr   | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

postgres=#

创建repmgr配置文件

cat >> /opt/pg14/repmgr.conf <<EOF #add by repmgr node_id = 1 node_name = 'pg1' conninfo = 'host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3' data_directory = '/opt/pgdata' replication_user = 'repuser' replication_type = 'physical' repmgr_bindir = '/opt/pg14/bin' pg_bindir = '/opt/pg14/bin' EOF

3台主机都要操作,配置文件根据主机写不同的node_id和IP地址

[postgres@pg1 ~]$ cat >> /opt/pg14/repmgr.conf <<EOF > > #add by repmgr > node_id = 1 > node_name = 'pg1' > conninfo = 'host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3' > data_directory = '/opt/pgdata' > replication_user = 'repuser' > replication_type = 'physical' > repmgr_bindir = '/opt/pg14/bin' > pg_bindir = '/opt/pg14/bin' > > > EOF [postgres@pg1 ~]$ [postgres@pg2 ~]$ cat >> /opt/pg14/repmgr.conf <<EOF > > #add by repmgr > node_id = 2 > node_name = 'pg2' > conninfo = 'host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3' > data_directory = '/opt/pgdata' > replication_user = 'repuser' > replication_type = 'physical' > repmgr_bindir = '/opt/pg14/bin' > pg_bindir = '/opt/pg14/bin' > > > EOF [postgres@pg2 ~]$ [postgres@pg3 ~]$ cat >> /opt/pg14/repmgr.conf <<EOF > > #add by repmgr > node_id = 3 > node_name = 'pg3' > conninfo = 'host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3' > data_directory = '/opt/pgdata' > replication_user = 'repuser' > replication_type = 'physical' > repmgr_bindir = '/opt/pg14/bin' > pg_bindir = '/opt/pg14/bin' > > > EOF [postgres@pg3 ~]$

注册主节点、查看元数据

repmgr -f /opt/pg14/repmgr.conf primary register psql -d repmgr SELECT * FROM repmgr.nodes;

仅在主库操作

[postgres@pg1 ~]$ repmgr -f /opt/pg14/repmgr.conf primary register
INFO: connecting to primary database...
NOTICE: attempting to install extension "repmgr"
NOTICE: "repmgr" extension successfully installed
NOTICE: primary node record (ID: 1) registered
[postgres@pg1 ~]$ psql -d repmgr
psql (14.9)
Type "help" for help.

repmgr=# SELECT * FROM repmgr.nodes;
 node_id | upstream_node_id | active | node_name |  type   | location | priority |                                         conninfo                                         | repluser | slot_name |      config_file
---------+------------------+--------+-----------+---------+----------+----------+------------------------------------------------------------------------------------------+----------+-----------+-----------------------
       1 |                  | t      | pg1       | primary | default  |      100 | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 | repuser  |           | /opt/pg14/repmgr.conf
(1 row)

repmgr=#

克隆standby、启动数据库并注册

repmgr -h 192.168.17.71 -U repmgr -p 5432 -d repmgr -f /opt/pg14/repmgr.conf standby clone pg_ctl -D /opt/pgdata start repmgr -f /opt/pg14/repmgr.conf standby register

在2台standby从库上分别操作

[postgres@pg2 ~]$ repmgr -h 192.168.17.71 -U repmgr -p 5432 -d repmgr -f /opt/pg14/repmgr.conf standby clone NOTICE: destination directory "/opt/pgdata" provided INFO: connecting to source node DETAIL: connection string is: host=192.168.17.71 user=repmgr port=5432 dbname=repmgr DETAIL: current installation size is 33 MB INFO: replication slot usage not requested; no replication slot will be set up for this standby NOTICE: checking for available walsenders on the source node (2 required) NOTICE: checking replication connections can be made to the source server (2 required) INFO: checking and correcting permissions on existing directory "/opt/pgdata" NOTICE: starting backup (using pg_basebackup)... HINT: this may take some time; consider using the -c/--fast-checkpoint option INFO: executing: /opt/pg14/bin/pg_basebackup -l "repmgr base backup" -D /opt/pgdata -h 192.168.17.71 -p 5432 -U repuser -X stream NOTICE: standby clone (using pg_basebackup) complete NOTICE: you can now start your PostgreSQL server HINT: for example: pg_ctl -D /opt/pgdata start HINT: after starting the server, you need to register this standby with "repmgr standby register" [postgres@pg2 ~]$ pg_ctl -D /opt/pgdata start waiting for server to start....2023-08-22 22:06:06.735 CST [20664] LOG: starting PostgreSQL 14.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit 2023-08-22 22:06:06.736 CST [20664] LOG: listening on IPv4 address "0.0.0.0", port 5432 2023-08-22 22:06:06.737 CST [20664] LOG: listening on IPv6 address "::", port 5432 2023-08-22 22:06:06.737 CST [20664] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2023-08-22 22:06:06.739 CST [20665] LOG: database system was interrupted; last known up at 2023-08-22 22:05:54 CST 2023-08-22 22:06:06.743 CST [20665] LOG: entering standby mode 2023-08-22 22:06:06.744 CST [20665] LOG: redo starts at 0/2000028 2023-08-22 22:06:06.744 CST [20665] LOG: consistent recovery state reached at 0/2000100 2023-08-22 22:06:06.744 CST [20664] LOG: database system is ready to accept read-only connections 2023-08-22 22:06:06.750 CST [20669] LOG: started streaming WAL from primary at 0/3000000 on timeline 1 done server started [postgres@pg2 ~]$ repmgr -f /opt/pg14/repmgr.conf standby register INFO: connecting to local node "pg2" (ID: 2) INFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1) INFO: standby registration complete NOTICE: standby node "pg2" (ID: 2) successfully registered [postgres@pg2 ~]$ [postgres@pg3 ~]$ repmgr -h 192.168.17.71 -U repmgr -p 5432 -d repmgr -f /opt/pg14/repmgr.conf standby clone NOTICE: destination directory "/opt/pgdata" provided INFO: connecting to source node DETAIL: connection string is: host=192.168.17.71 user=repmgr port=5432 dbname=repmgr DETAIL: current installation size is 33 MB INFO: replication slot usage not requested; no replication slot will be set up for this standby NOTICE: checking for available walsenders on the source node (2 required) NOTICE: checking replication connections can be made to the source server (2 required) INFO: checking and correcting permissions on existing directory "/opt/pgdata" NOTICE: starting backup (using pg_basebackup)... HINT: this may take some time; consider using the -c/--fast-checkpoint option INFO: executing: /opt/pg14/bin/pg_basebackup -l "repmgr base backup" -D /opt/pgdata -h 192.168.17.71 -p 5432 -U repuser -X stream NOTICE: standby clone (using pg_basebackup) complete NOTICE: you can now start your PostgreSQL server HINT: for example: pg_ctl -D /opt/pgdata start HINT: after starting the server, you need to register this standby with "repmgr standby register" [postgres@pg3 ~]$ pg_ctl -D /opt/pgdata start waiting for server to start....2023-08-22 22:08:09.997 CST [20715] LOG: starting PostgreSQL 14.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit 2023-08-22 22:08:09.998 CST [20715] LOG: listening on IPv4 address "0.0.0.0", port 5432 2023-08-22 22:08:09.998 CST [20715] LOG: listening on IPv6 address "::", port 5432 2023-08-22 22:08:09.999 CST [20715] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2023-08-22 22:08:10.001 CST [20716] LOG: database system was interrupted; last known up at 2023-08-22 22:08:02 CST 2023-08-22 22:08:10.004 CST [20716] LOG: entering standby mode 2023-08-22 22:08:10.005 CST [20716] LOG: redo starts at 0/4000028 2023-08-22 22:08:10.006 CST [20716] LOG: consistent recovery state reached at 0/4000100 2023-08-22 22:08:10.007 CST [20715] LOG: database system is ready to accept read-only connections 2023-08-22 22:08:10.013 CST [20720] LOG: started streaming WAL from primary at 0/5000000 on timeline 1 done server started [postgres@pg3 ~]$ repmgr -f /opt/pg14/repmgr.conf standby register INFO: connecting to local node "pg3" (ID: 3) INFO: connecting to primary database WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID: 1) INFO: standby registration complete NOTICE: standby node "pg3" (ID: 3) successfully registered [postgres@pg3 ~]$

查看集群状态及数据库元数据

repmgr -f /opt/pg14/repmgr.conf cluster show psql -d repmgr SELECT * FROM repmgr.nodes;

在任意节点查询都可以,至此集群已经搭建完毕

[postgres@pg1 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show
 ID | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------
 1  | pg1  | primary | * running |          | default  | 100      | 1        | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3
 2  | pg2  | standby |   running | pg1      | default  | 100      | 1        | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3
 3  | pg3  | standby |   running | pg1      | default  | 100      | 1        | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3
[postgres@pg1 ~]$ psql -d repmgr
psql (14.9)
Type "help" for help.

repmgr=# SELECT * FROM repmgr.nodes;
 node_id | upstream_node_id | active | node_name |  type   | location | priority |                                         conninfo                                         | repluser | slot_name |      config_file
---------+------------------+--------+-----------+---------+----------+----------+------------------------------------------------------------------------------------------+----------+-----------+-----------------------
       1 |                  | t      | pg1       | primary | default  |      100 | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 | repuser  |           | /opt/pg14/repmgr.conf
       2 |                1 | t      | pg2       | standby | default  |      100 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 | repuser  |           | /opt/pg14/repmgr.conf
       3 |                1 | t      | pg3       | standby | default  |      100 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 | repuser  |           | /opt/pg14/repmgr.conf
(3 rows)

repmgr=#

切换测试

测试手动提升standby为primary

关闭主库模拟宕机,手动提升 standby(pg2) 节点为 primary 节点,集群其它standby重新指向新primary(pg2),将宕机原primary(pg1)降级为standby节点。

#主库操作pg1停库模拟宕机 su - postgres repmgr -f /opt/pg14/repmgr.conf cluster show psql create table t (id int); insert into t values(1); select * from t; \q pg_ctl stop #从库操作pg2手动提升为 primary节点 repmgr -f /opt/pg14/repmgr.conf cluster show repmgr -f /opt/pg14/repmgr.conf standby promote repmgr -f /opt/pg14/repmgr.conf cluster show psql insert into t values(2); \q #其他从库操作pg3重新指向新primary(pg2) repmgr -f /opt/pg14/repmgr.conf standby follow repmgr -f /opt/pg14/repmgr.conf cluster show #原主库操作pg1降级为standby节点 repmgr -h 192.168.17.72 -U repmgr -p 5432 -d repmgr -f /opt/pg14/repmgr.conf node rejoin --force-rewind repmgr -f /opt/pg14/repmgr.conf cluster show

具体切换操作如下

[root@pg1 ~]# su - postgres Last login: Wed Aug 23 09:29:11 CST 2023 on pts/0 [postgres@pg1 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | primary | * running | | default | 100 | 1 | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | standby | running | pg1 | default | 100 | 1 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | pg1 | default | 100 | 1 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 [postgres@pg1 ~]$ psql psql (14.9) Type "help" for help. postgres=# create table t (id int); CREATE TABLE postgres=# insert into t values(1); INSERT 0 1 postgres=# select * from t; id ---- 1 (1 row) postgres=# \q [postgres@pg1 ~]$ pg_ctl stop waiting for server to shut down....2023-08-23 09:55:22.466 CST [1257] LOG: received fast shutdown request 2023-08-23 09:55:22.466 CST [1257] LOG: aborting any active transactions 2023-08-23 09:55:22.466 CST [1257] LOG: background worker "logical replication launcher" (PID 1265) exited with exit code 1 2023-08-23 09:55:22.466 CST [1259] LOG: shutting down 2023-08-23 09:55:23.292 CST [1355] FATAL: the database system is shutting down 2023-08-23 09:55:23.452 CST [1257] LOG: database system is shut down done server stopped [postgres@pg1 ~]$ [root@pg2 ~]# su - postgres Last login: Wed Aug 23 09:29:55 CST 2023 on pts/0 [postgres@pg2 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+---------------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | primary | ? unreachable | ? | default | 100 | | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | standby | running | ? pg1 | default | 100 | 1 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | ? pg1 | default | 100 | 1 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 WARNING: following issues were detected - unable to connect to node "pg1" (ID: 1) - node "pg1" (ID: 1) is registered as an active primary but is unreachable - unable to connect to node "pg2" (ID: 2)'s upstream node "pg1" (ID: 1) - unable to determine if node "pg2" (ID: 2) is attached to its upstream node "pg1" (ID: 1) - unable to connect to node "pg3" (ID: 3)'s upstream node "pg1" (ID: 1) - unable to determine if node "pg3" (ID: 3) is attached to its upstream node "pg1" (ID: 1) HINT: execute with --verbose option to see connection error messages [postgres@pg2 ~]$ repmgr -f /opt/pg14/repmgr.conf standby promote WARNING: 1 sibling nodes found, but option "--siblings-follow" not specified DETAIL: these nodes will remain attached to the current primary: pg3 (node ID: 3) NOTICE: promoting standby to primary DETAIL: promoting server "pg2" (ID: 2) using pg_promote() NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete NOTICE: STANDBY PROMOTE successful DETAIL: server "pg2" (ID: 2) was successfully promoted to primary [postgres@pg2 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | primary | - failed | ? | default | 100 | | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | primary | * running | | default | 100 | 2 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | ? pg1 | default | 100 | 1 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 WARNING: following issues were detected - unable to connect to node "pg1" (ID: 1) - unable to connect to node "pg3" (ID: 3)'s upstream node "pg1" (ID: 1) - unable to determine if node "pg3" (ID: 3) is attached to its upstream node "pg1" (ID: 1) HINT: execute with --verbose option to see connection error messages [postgres@pg2 ~]$ psql psql (14.9) Type "help" for help. postgres=# insert into t values(2); INSERT 0 1 postgres=# \q [postgres@pg2 ~]$ [root@pg3 ~]# su - postgres Last login: Wed Aug 23 09:30:04 CST 2023 on pts/0 [postgres@pg3 ~]$ repmgr -f /opt/pg14/repmgr.conf standby follow NOTICE: attempting to find and follow current primary INFO: local node 3 can attach to follow target node 2 DETAIL: local node's recovery point: 0/70000A0; follow target node's fork point: 0/70000A0 NOTICE: setting node 3's upstream to node 2 NOTICE: STANDBY FOLLOW successful DETAIL: standby attached to upstream node "pg2" (ID: 2) [postgres@pg3 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | primary | - failed | ? | default | 100 | | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | primary | * running | | default | 100 | 2 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | pg2 | default | 100 | 2 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 WARNING: following issues were detected - unable to connect to node "pg1" (ID: 1) HINT: execute with --verbose option to see connection error messages [postgres@pg3 ~]$ [postgres@pg1 ~]$ repmgr -h 192.168.17.72 -U repmgr -p 5432 -d repmgr -f /opt/pg14/repmgr.conf node rejoin --force-rewind NOTICE: rejoin target is node "pg2" (ID: 2) NOTICE: executing pg_rewind DETAIL: pg_rewind command is "/opt/pg14/bin/pg_rewind -D '/opt/pgdata' --source-server='host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3'" NOTICE: 0 files copied to /opt/pgdata NOTICE: setting node 1's upstream to node 2 WARNING: unable to ping "host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3" DETAIL: PQping() returned "PQPING_NO_RESPONSE" NOTICE: starting server using "/opt/pg14/bin/pg_ctl -w -D '/opt/pgdata' start" NOTICE: NODE REJOIN successful DETAIL: node 1 is now attached to node 2 [postgres@pg1 ~]$ psql psql (14.9) Type "help" for help. postgres=# select * from t; id ---- 1 2 (2 rows) postgres=# \q [postgres@pg1 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | standby | running | pg2 | default | 100 | 2 | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | primary | * running | | default | 100 | 2 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | pg2 | default | 100 | 2 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 [postgres@pg1 ~]$

测试手动将备用服务器提升为主服务器

在standby从库(pg1)上执行switchover命令提升为主库,切换后手动将其他 standby 节点(pg2,pg3)指向新的 primary 节点(pg1)

#pg1上操作 repmgr -f /opt/pg14/repmgr.conf cluster show repmgr -f /opt/pg14/repmgr.conf standby switchover #pg2上操作 repmgr -f /opt/pg14/repmgr.conf standby follow #pg3上操作 repmgr -f /opt/pg14/repmgr.conf standby follow repmgr -f /opt/pg14/repmgr.conf cluster show

具体操作如下

[postgres@pg1 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | standby | running | pg2 | default | 100 | 2 | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | primary | * running | | default | 100 | 2 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | pg2 | default | 100 | 2 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 [postgres@pg1 ~]$ repmgr -f /opt/pg14/repmgr.conf standby switchover NOTICE: executing switchover on node "pg1" (ID: 1) WARNING: 1 sibling nodes found, but option "--siblings-follow" not specified DETAIL: these nodes will remain attached to the current primary: pg3 (node ID: 3) NOTICE: attempting to pause repmgrd on 3 nodes NOTICE: local node "pg1" (ID: 1) will be promoted to primary; current primary "pg2" (ID: 2) will be demoted to standby NOTICE: stopping current primary node "pg2" (ID: 2) NOTICE: issuing CHECKPOINT on node "pg2" (ID: 2) DETAIL: executing server command "/opt/pg14/bin/pg_ctl -D '/opt/pgdata' -W -m fast stop" INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout") INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout") NOTICE: current primary has been cleanly shut down at location 0/8000028 NOTICE: promoting standby to primary DETAIL: promoting server "pg1" (ID: 1) using pg_promote() NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete NOTICE: STANDBY PROMOTE successful DETAIL: server "pg1" (ID: 1) was successfully promoted to primary NOTICE: node "pg1" (ID: 1) promoted to primary, node "pg2" (ID: 2) demoted to standby NOTICE: switchover was successful DETAIL: node "pg1" is now primary and node "pg2" is attached as standby NOTICE: STANDBY SWITCHOVER has completed successfully [postgres@pg1 ~]$ [postgres@pg2 ~]$ repmgr -f /opt/pg14/repmgr.conf standby follow NOTICE: attempting to find and follow current primary INFO: timelines are same, this server is not ahead DETAIL: local node lsn is 0/8000EB8, follow target lsn is 0/8000EB8 NOTICE: setting node 2's upstream to node 1 NOTICE: STANDBY FOLLOW successful DETAIL: standby attached to upstream node "pg1" (ID: 1) [postgres@pg2 ~]$ [postgres@pg3 ~]$ repmgr -f /opt/pg14/repmgr.conf standby follow NOTICE: attempting to find and follow current primary INFO: timelines are same, this server is not ahead DETAIL: local node lsn is 0/8001018, follow target lsn is 0/8001018 NOTICE: setting node 3's upstream to node 1 WARNING: node "pg3" attached in state "startup" NOTICE: STANDBY FOLLOW successful DETAIL: standby attached to upstream node "pg1" (ID: 1) [postgres@pg3 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | primary | * running | | default | 100 | 3 | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | standby | running | pg1 | default | 100 | 3 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | pg1 | default | 100 | 3 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 [postgres@pg3 ~]$

自动切换设置

修改3台repmgr.conf,启动后台守护进程repmgrd,查看repmgr和repmgrd进程

cat >> /opt/pg14/repmgr.conf <<EOF #add by autofailover failover='automatic' connection_check_type=ping reconnect_attempts=6 reconnect_interval=10 promote_command='/opt/pg14/bin/repmgr standby promote -f /opt/pg14/repmgr.conf --log-to-file' follow_command='/opt/pg14/bin/repmgr standby follow -f /opt/pg14/repmgr.conf --log-to-file --upstream-node-id=%n' EOF repmgrd -f /opt/pg14/repmgr.conf ps f -u postgres repmgr -f /opt/pg14/repmgr.conf cluster event --event=repmgrd_start

具体操作如下

[postgres@pg1 ~]$ cat >> /opt/pg14/repmgr.conf <<EOF > > > #add by autofailover > > failover='automatic' > connection_check_type=ping > reconnect_attempts=6 > reconnect_interval=10 > promote_command='/opt/pg14/bin/repmgr standby promote -f /opt/pg14/repmgr.conf --log-to-file' > follow_command='/opt/pg14/bin/repmgr standby follow -f /opt/pg14/repmgr.conf --log-to-file --upstream-node-id=%n' > > > EOF [postgres@pg1 ~]$ repmgrd -f /opt/pg14/repmgr.conf [2023-08-23 11:18:31] [NOTICE] repmgrd (repmgrd 5.3.3) starting up [2023-08-23 11:18:31] [INFO] connecting to database "host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3" [postgres@pg1 ~]$ INFO: set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid [2023-08-23 11:18:31] [NOTICE] starting monitoring of node "pg1" (ID: 1) [2023-08-23 11:18:31] [INFO] "connection_check_type" set to "ping" [2023-08-23 11:18:31] [NOTICE] monitoring cluster primary "pg1" (ID: 1) [2023-08-23 11:18:31] [INFO] child node "pg2" (ID: 2) is attached [2023-08-23 11:18:31] [INFO] child node "pg3" (ID: 3) is attached [postgres@pg1 ~]$ ps f -u postgres PID TTY STAT TIME COMMAND 9108 pts/1 S 0:00 -bash 9152 pts/1 R+ 0:00 \_ ps f -u postgres 1278 pts/0 S+ 0:00 -bash 9130 ? S 0:00 repmgrd -f /opt/pg14/repmgr.conf 1378 ? Ss 0:00 /opt/pg14/bin/postgres -D /opt/pgdata 1380 ? Ss 0:00 \_ postgres: checkpointer 1381 ? Ss 0:00 \_ postgres: background writer 1382 ? Ss 0:00 \_ postgres: stats collector 1446 ? Ss 0:00 \_ postgres: walwriter 1447 ? Ss 0:00 \_ postgres: autovacuum launcher 1448 ? Ss 0:00 \_ postgres: archiver last was 000000020000000000000008.partial 1449 ? Ss 0:00 \_ postgres: logical replication launcher 1459 ? Ss 0:00 \_ postgres: walsender repuser 192.168.17.72(38920) streaming 0/800B8D8 1472 ? Ss 0:00 \_ postgres: walsender repuser 192.168.17.73(34826) streaming 0/800B8D8 9128 ? Ss 0:00 \_ postgres: repmgr repmgr 192.168.17.71(49700) idle [postgres@pg1 ~]$ [postgres@pg2 ~]$ cat >> /opt/pg14/repmgr.conf <<EOF > > > #add by autofailover > > failover='automatic' > connection_check_type=ping > reconnect_attempts=6 > reconnect_interval=10 > promote_command='/opt/pg14/bin/repmgr standby promote -f /opt/pg14/repmgr.conf --log-to-file' > follow_command='/opt/pg14/bin/repmgr standby follow -f /opt/pg14/repmgr.conf --log-to-file --upstream-node-id=%n' > > > EOF [postgres@pg2 ~]$ repmgrd -f /opt/pg14/repmgr.conf [2023-08-23 11:21:00] [NOTICE] repmgrd (repmgrd 5.3.3) starting up [2023-08-23 11:21:00] [INFO] connecting to database "host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3" [postgres@pg2 ~]$ INFO: set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid [2023-08-23 11:21:00] [NOTICE] starting monitoring of node "pg2" (ID: 2) [2023-08-23 11:21:00] [INFO] "connection_check_type" set to "ping" [2023-08-23 11:21:00] [INFO] monitoring connection to upstream node "pg1" (ID: 1) [postgres@pg2 ~]$ ps f -u postgres PID TTY STAT TIME COMMAND 1303 pts/1 S 0:00 -bash 9135 pts/1 R+ 0:00 \_ ps f -u postgres 1233 pts/0 S+ 0:00 -bash 9121 ? S 0:00 repmgrd -f /opt/pg14/repmgr.conf 1588 ? Ss 0:00 /opt/pg14/bin/postgres -D /opt/pgdata 1589 ? Ss 0:00 \_ postgres: startup recovering 000000030000000000000008 1590 ? Ss 0:00 \_ postgres: checkpointer 1591 ? Ss 0:00 \_ postgres: background writer 1592 ? Ss 0:00 \_ postgres: stats collector 1593 ? Ss 0:01 \_ postgres: walreceiver streaming 0/800BAB0 9119 ? Ss 0:00 \_ postgres: repmgr repmgr 192.168.17.72(44720) idle [postgres@pg2 ~]$ [postgres@pg3 ~]$ cat >> /opt/pg14/repmgr.conf <<EOF > > > #add by autofailover > > failover='automatic' > connection_check_type=ping > reconnect_attempts=6 > reconnect_interval=10 > promote_command='/opt/pg14/bin/repmgr standby promote -f /opt/pg14/repmgr.conf --log-to-file' > follow_command='/opt/pg14/bin/repmgr standby follow -f /opt/pg14/repmgr.conf --log-to-file --upstream-node-id=%n' > > > EOF [postgres@pg3 ~]$ repmgrd -f /opt/pg14/repmgr.conf [2023-08-23 11:21:37] [NOTICE] repmgrd (repmgrd 5.3.3) starting up [2023-08-23 11:21:37] [INFO] connecting to database "host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3" [postgres@pg3 ~]$ INFO: set_repmgrd_pid(): provided pidfile is /tmp/repmgrd.pid [2023-08-23 11:21:37] [NOTICE] starting monitoring of node "pg3" (ID: 3) [2023-08-23 11:21:37] [INFO] "connection_check_type" set to "ping" [2023-08-23 11:21:37] [INFO] monitoring connection to upstream node "pg1" (ID: 1) [postgres@pg3 ~]$ ps f -u postgres PID TTY STAT TIME COMMAND 1348 pts/1 S 0:00 -bash 12456 pts/1 R+ 0:00 \_ ps f -u postgres 1233 pts/0 S+ 0:00 -bash 8951 ? S 0:01 repmgrd -f /opt/pg14/repmgr.conf 1254 ? Ss 0:01 /opt/pg14/bin/postgres 1255 ? Ss 0:00 \_ postgres: startup recovering 000000030000000000000008 1256 ? Ss 0:00 \_ postgres: checkpointer 1257 ? Ss 0:00 \_ postgres: background writer 1258 ? Ss 0:00 \_ postgres: stats collector 8911 ? Ss 0:03 \_ postgres: walreceiver streaming 0/800C370 8949 ? Ss 0:01 \_ postgres: repmgr repmgr 192.168.17.73(39552) idle [postgres@pg3 ~]$ [postgres@pg1 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster event --event=repmgrd_start Node ID | Name | Event | OK | Timestamp | Details ---------+------+---------------+----+---------------------+------------------------------------------------------ 3 | pg3 | repmgrd_start | t | 2023-08-23 11:21:37 | monitoring connection to upstream node "pg1" (ID: 1) 2 | pg2 | repmgrd_start | t | 2023-08-23 11:21:00 | monitoring connection to upstream node "pg1" (ID: 1) 1 | pg1 | repmgrd_start | t | 2023-08-23 11:18:31 | monitoring cluster primary "pg1" (ID: 1) [postgres@pg1 ~]$

自动failover

停止pg1数据库,过1分钟后主库自动切换到pg2

#pg1操作停止数据库 pg_ctl stop #pg2上观察日志 #pg3查看集群状态 repmgr -f /opt/pg14/repmgr.conf cluster show #pg1从新加入集群变成standby repmgr -h 192.168.17.72 -U repmgr -p 5432 -d repmgr -f /opt/pg14/repmgr.conf node rejoin --force-rewind

具体操作如下

[postgres@pg1 ~]$ pg_ctl stop waiting for server to shut down....2023-08-23 13:49:06.894 CST [1239] LOG: received fast shutdown request 2023-08-23 13:49:06.895 CST [1239] LOG: aborting any active transactions 2023-08-23 13:49:06.895 CST [1264] FATAL: terminating connection due to administrator command 2023-08-23 13:49:06.895 CST [1285] FATAL: terminating connection due to administrator command 2023-08-23 13:49:06.895 CST [1278] FATAL: terminating connection due to administrator command 2023-08-23 13:49:06.895 CST [1239] LOG: background worker "logical replication launcher" (PID 1247) exited with exit code 1 2023-08-23 13:49:06.896 CST [1241] LOG: shutting down 2023-08-23 13:49:07.021 CST [1354] FATAL: the database system is shutting down 2023-08-23 13:49:07.027 CST [1239] LOG: database system is shut down done server stopped [postgres@pg1 ~]$ [postgres@pg2 ~]$ 2023-08-23 13:49:06.041 CST [1219] LOG: replication terminated by primary server 2023-08-23 13:49:06.041 CST [1219] DETAIL: End of WAL reached on timeline 3 at 0/D0000A0. 2023-08-23 13:49:06.041 CST [1219] FATAL: could not send end-of-streaming message to primary: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. no COPY in progress 2023-08-23 13:49:06.041 CST [1215] LOG: invalid record length at 0/D0000A0: wanted 24, got 0 2023-08-23 13:49:06.044 CST [1252] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. [2023-08-23 13:49:07] [WARNING] unable to ping "host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3" [2023-08-23 13:49:07] [DETAIL] PQping() returned "PQPING_NO_RESPONSE" [2023-08-23 13:49:07] [WARNING] unable to connect to upstream node "pg1" (ID: 1) [2023-08-23 13:49:07] [INFO] checking state of node "pg1" (ID: 1), 1 of 6 attempts [2023-08-23 13:49:07] [WARNING] unable to ping "user=repmgr password=repmgr connect_timeout=3 dbname=repmgr host=192.168.17.71 port=5432 fallback_application_name=repmgr" [2023-08-23 13:49:07] [DETAIL] PQping() returned "PQPING_NO_RESPONSE" [2023-08-23 13:49:07] [INFO] sleeping up to 10 seconds until next reconnection attempt 2023-08-23 13:49:11.047 CST [1253] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? 2023-08-23 13:49:16.051 CST [1254] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? [2023-08-23 13:49:17] [INFO] checking state of node "pg1" (ID: 1), 2 of 6 attempts [2023-08-23 13:49:17] [WARNING] unable to ping "user=repmgr password=repmgr connect_timeout=3 dbname=repmgr host=192.168.17.71 port=5432 fallback_application_name=repmgr" [2023-08-23 13:49:17] [DETAIL] PQping() returned "PQPING_NO_RESPONSE" [2023-08-23 13:49:17] [INFO] sleeping up to 10 seconds until next reconnection attempt 2023-08-23 13:49:21.054 CST [1255] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? 2023-08-23 13:49:26.058 CST [1256] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? [2023-08-23 13:49:27] [INFO] checking state of node "pg1" (ID: 1), 3 of 6 attempts [2023-08-23 13:49:27] [WARNING] unable to ping "user=repmgr password=repmgr connect_timeout=3 dbname=repmgr host=192.168.17.71 port=5432 fallback_application_name=repmgr" [2023-08-23 13:49:27] [DETAIL] PQping() returned "PQPING_NO_RESPONSE" [2023-08-23 13:49:27] [INFO] sleeping up to 10 seconds until next reconnection attempt 2023-08-23 13:49:31.062 CST [1257] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? 2023-08-23 13:49:36.067 CST [1258] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? [2023-08-23 13:49:37] [INFO] checking state of node "pg1" (ID: 1), 4 of 6 attempts [2023-08-23 13:49:37] [WARNING] unable to ping "user=repmgr password=repmgr connect_timeout=3 dbname=repmgr host=192.168.17.71 port=5432 fallback_application_name=repmgr" [2023-08-23 13:49:37] [DETAIL] PQping() returned "PQPING_NO_RESPONSE" [2023-08-23 13:49:37] [INFO] sleeping up to 10 seconds until next reconnection attempt 2023-08-23 13:49:41.071 CST [1259] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? 2023-08-23 13:49:46.076 CST [1260] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? [2023-08-23 13:49:47] [INFO] checking state of node "pg1" (ID: 1), 5 of 6 attempts [2023-08-23 13:49:47] [WARNING] unable to ping "user=repmgr password=repmgr connect_timeout=3 dbname=repmgr host=192.168.17.71 port=5432 fallback_application_name=repmgr" [2023-08-23 13:49:47] [DETAIL] PQping() returned "PQPING_NO_RESPONSE" [2023-08-23 13:49:47] [INFO] sleeping up to 10 seconds until next reconnection attempt 2023-08-23 13:49:51.080 CST [1261] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? 2023-08-23 13:49:56.084 CST [1262] FATAL: could not connect to the primary server: connection to server at "192.168.17.71", port 5432 failed: Connection refused Is the server running on that host and accepting TCP/IP connections? [2023-08-23 13:49:57] [INFO] checking state of node "pg1" (ID: 1), 6 of 6 attempts [2023-08-23 13:49:57] [WARNING] unable to ping "user=repmgr password=repmgr connect_timeout=3 dbname=repmgr host=192.168.17.71 port=5432 fallback_application_name=repmgr" [2023-08-23 13:49:57] [DETAIL] PQping() returned "PQPING_NO_RESPONSE" [2023-08-23 13:49:57] [WARNING] unable to reconnect to node "pg1" (ID: 1) after 6 attempts [2023-08-23 13:49:57] [INFO] 1 active sibling nodes registered [2023-08-23 13:49:57] [INFO] 3 total nodes registered [2023-08-23 13:49:57] [INFO] primary node "pg1" (ID: 1) and this node have the same location ("default") [2023-08-23 13:49:57] [INFO] local node's last receive lsn: 0/D0000A0 [2023-08-23 13:49:57] [INFO] checking state of sibling node "pg3" (ID: 3) [2023-08-23 13:49:57] [INFO] node "pg3" (ID: 3) reports its upstream is node 1, last seen 52 second(s) ago [2023-08-23 13:49:57] [INFO] standby node "pg3" (ID: 3) last saw primary node 52 second(s) ago [2023-08-23 13:49:57] [INFO] last receive LSN for sibling node "pg3" (ID: 3) is: 0/D0000A0 [2023-08-23 13:49:57] [INFO] node "pg3" (ID: 3) has same LSN as current candidate "pg2" (ID: 2) [2023-08-23 13:49:57] [INFO] visible nodes: 2; total nodes: 2; no nodes have seen the primary within the last 4 seconds [2023-08-23 13:49:57] [NOTICE] promotion candidate is "pg2" (ID: 2) [2023-08-23 13:49:57] [NOTICE] this node is the winner, will now promote itself and inform other nodes [2023-08-23 13:49:57] [INFO] promote_command is: "/opt/pg14/bin/repmgr standby promote -f /opt/pg14/repmgr.conf --log-to-file" [2023-08-23 13:49:57] [WARNING] 1 sibling nodes found, but option "--siblings-follow" not specified [2023-08-23 13:49:57] [DETAIL] these nodes will remain attached to the current primary: pg3 (node ID: 3) [2023-08-23 13:49:57] [NOTICE] promoting standby to primary [2023-08-23 13:49:57] [DETAIL] promoting server "pg2" (ID: 2) using pg_promote() 2023-08-23 13:49:57.499 CST [1215] LOG: received promote request 2023-08-23 13:49:57.499 CST [1215] LOG: redo done at 0/D000028 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 315.46 s 2023-08-23 13:49:57.499 CST [1215] LOG: last completed transaction was at log time 2023-08-23 13:48:25.608586+08 [2023-08-23 13:49:57] [NOTICE] waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete 2023-08-23 13:49:57.500 CST [1215] LOG: selected new timeline ID: 4 2023-08-23 13:49:57.517 CST [1215] LOG: archive recovery complete 2023-08-23 13:49:57.518 CST [1214] LOG: database system is ready to accept connections [2023-08-23 13:49:58] [NOTICE] STANDBY PROMOTE successful [2023-08-23 13:49:58] [DETAIL] server "pg2" (ID: 2) was successfully promoted to primary [2023-08-23 13:49:58] [INFO] checking state of node 2, 1 of 6 attempts [2023-08-23 13:49:58] [NOTICE] node 2 has recovered, reconnecting [2023-08-23 13:49:58] [INFO] connection to node 2 succeeded [2023-08-23 13:49:58] [INFO] original connection is still available [2023-08-23 13:49:58] [INFO] 1 followers to notify [2023-08-23 13:49:58] [NOTICE] notifying node "pg3" (ID: 3) to follow node 2 INFO: node 3 received notification to follow node 2 [2023-08-23 13:49:58] [INFO] switching to primary monitoring mode [2023-08-23 13:49:58] [NOTICE] monitoring cluster primary "pg2" (ID: 2) [2023-08-23 13:50:04] [NOTICE] new standby "pg3" (ID: 3) has connected [2023-08-23 13:52:22] [NOTICE] new standby "pg1" (ID: 1) has connected [2023-08-23 13:54:58] [INFO] monitoring primary node "pg2" (ID: 2) in normal state [2023-08-23 13:59:59] [INFO] monitoring primary node "pg2" (ID: 2) in normal state [postgres@pg2 ~]$ [postgres@pg3 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | primary | - failed | ? | default | 100 | | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | primary | * running | | default | 100 | 4 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | pg2 | default | 100 | 3 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 WARNING: following issues were detected - unable to connect to node "pg1" (ID: 1) HINT: execute with --verbose option to see connection error messages [postgres@pg3 ~]$ [postgres@pg1 ~]$ repmgr -h 192.168.17.72 -U repmgr -p 5432 -d repmgr -f /opt/pg14/repmgr.conf node rejoin --force-rewind NOTICE: rejoin target is node "pg2" (ID: 2) NOTICE: pg_rewind execution required for this node to attach to rejoin target node 2 DETAIL: rejoin target server's timeline 4 forked off current database system timeline 3 before current recovery point 0/E000028 NOTICE: executing pg_rewind DETAIL: pg_rewind command is "/opt/pg14/bin/pg_rewind -D '/opt/pgdata' --source-server='host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3'" NOTICE: 0 files copied to /opt/pgdata NOTICE: setting node 1's upstream to node 2 WARNING: unable to ping "host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3" DETAIL: PQping() returned "PQPING_NO_RESPONSE" NOTICE: starting server using "/opt/pg14/bin/pg_ctl -w -D '/opt/pgdata' start" NOTICE: NODE REJOIN successful DETAIL: node 1 is now attached to node 2 [postgres@pg1 ~]$ [postgres@pg1 ~]$ repmgr -f /opt/pg14/repmgr.conf cluster show ID | Name | Role | Status | Upstream | Location | Priority | Timeline | Connection string ----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------------------------------------------ 1 | pg1 | standby | running | pg2 | default | 100 | 4 | host=192.168.17.71 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 2 | pg2 | primary | * running | | default | 100 | 4 | host=192.168.17.72 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 3 | pg3 | standby | running | pg2 | default | 100 | 4 | host=192.168.17.73 port=5432 user=repmgr password=repmgr dbname=repmgr connect_timeout=3 [postgres@pg1 ~]$

免责声明:

1、本站资源由自动抓取工具收集整理于网络。

2、本站不承担由于内容的合法性及真实性所引起的一切争议和法律责任。

3、电子书、小说等仅供网友预览使用,书籍版权归作者或出版社所有。

4、如作者、出版社认为资源涉及侵权,请联系本站,本站将在收到通知书后尽快删除您认为侵权的作品。

5、如果您喜欢本资源,请您支持作者,购买正版内容。

6、资源失效,请下方留言,欢迎分享资源链接

文章评论

0条评论