PostgreSQL

PostgreSQL 9.4.10执行pg_basebackup的bug及解决方案

一 故障现象

在对一个PostgreSQL 9.4.10数据库执行基于时间点儿的不完全恢复的场景是,当通过pg_basebackup对数据库执行基础备份时,遇到1个错误:

pg_basebackup: could not connect to server: FATAL: no pg_hba.conf entry for replication connection from host “[local]”, user “postgres”

pg_basebackup: could not connect to server: FATAL: no pg_hba.conf entry for replication connection from host “127.0.0.1”, user “postgres”

[postgres@centos-master ~]$ /postgres/pg9.4/bin/pg_basebackup -p 5433 -D /ssd2/pg9410_backups/
pg_basebackup: could not connect to server: FATAL: no pg_hba.conf entry for replication connection from host "[local]", user "postgres"

[postgres@centos-master ~]$ /postgres/pg9.4/bin/pg_basebackup -h 127.0.0.1 -p 5433 -D /ssd2/pg9410_backups/
pg_basebackup: could not connect to server: FATAL: no pg_hba.conf entry for replication connection from host "127.0.0.1", user "postgres"

[postgres@centos-master ~]$ /postgres/pg9.4/bin/pg_basebackup -h 127.0.0.1 -p 5433 -D /ssd2/pg9410_backups/ -U postgres
pg_basebackup: could not connect to server: FATAL: no pg_hba.conf entry for replication connection from host "127.0.0.1", user "postgres"

[postgres@centos-master ~]$ /postgres/pg9.4/bin/pg_basebackup -h 127.0.0.1 -p 5433 -D /ssd2/pg9410_backups/ -U postgres -W
Password:
pg_basebackup: could not connect to server: FATAL: no pg_hba.conf entry for replication connection from host "127.0.0.1", user "postgres"

[postgres@centos-master ~]$

二故障分析和解决

从故障上看,是pg_hba.conf配置文件没有添加访问权限。可是,我们用psql通过本地环路地址127.0.0.1却是可以用postgres用户正常登录数据库的。且:当前pg_hba.conf文件内容如下。

[postgres@centos-master pg9410data]$ cat  /ssd/pg9410data/pg_hba.conf
# PostgreSQL Client Authentication Configuration File
# ===================================================
...
# # "local" is for Unix domain socket connections only
local   all             all                                     trust
# # IPv4 local connections:
host   all             all             127.0.0.1/32           trust
# # IPv6 local connections:
# host   all             all             ::1/128                 trust
# # Allow replication connections from localhost, by a user with the
# # replication privilege.
# local   replication     all                                     trust
# host   replication     all             127.0.0.1/32           trust
# host   replication     all             ::1/128                 trust
[postgres@centos-master pg9410data]$

很显然,配置文件里已经添加了关于通过本地环路地址访问数据库的认证方式了。现在pg_basebackup却偏偏不能访问数据库,一直报错。

从网上寻找到的解决办法多数都是修改$PGDATA/postgres.conf文件,修改listen_addresses=’*’;以及修改$PGDATA/pg_hba.conf文件里添加host all all 127.0.0.1/32 trust。事实上,我这里的环境,已经都是这种配置了的。

postgres=# show listen_addresses ;
listen_addresses
------------------
*
(1 row)

postgres=#

经过反复测试验证搜索,发现该问题是个潜在的bug,毕竟现在还在使用PostgreSQL 9.4.10版本的数据库,确实比较old scholl了。同时,也有了解决方案,在pg_hba.conf文件里添加下述配置项:

local   replication     postgres trust

然后,重新加载pg_hba.conf文件就可以执行pg_basebackup了:

[postgres@centos-master ~]$ /postgres/pg9.4/bin/pg_basebackup  -p 5433 -D /ssd2/pg9410_backups/
NOTICE: pg_stop_backup complete, all required WAL segments have been archived
[postgres@centos-master ~]$ ll /ssd2/pg9410_backups/
总用量 148
-rw------- 1 postgres postgres   212 10月 29 15:11 backup_label
drwx------ 8 postgres postgres  4096 10月 29 15:35 base
drwx------ 2 postgres postgres  4096 10月 29 15:35 global
drwx------ 2 postgres postgres  4096 10月 29 15:35 pg_clog
drwx------ 2 postgres postgres  4096 10月 29 15:35 pg_dynshmem
-rw------- 1 postgres postgres  4522 10月 29 15:35 pg_hba.conf
-rw------- 1 postgres postgres  1636 10月 29 15:35 pg_ident.conf
drwx------ 2 postgres postgres 36864 10月 29 15:11 pg_log
drwx------ 4 postgres postgres  4096 10月 29 15:11 pg_logical
drwx------ 4 postgres postgres  4096 10月 29 15:35 pg_multixact
drwx------ 2 postgres postgres  4096 10月 29 15:11 pg_notify
drwx------ 2 postgres postgres  4096 10月 29 15:11 pg_replslot
drwx------ 2 postgres postgres  4096 10月 29 15:35 pg_serial
drwx------ 2 postgres postgres  4096 10月 29 15:11 pg_snapshots
drwx------ 2 postgres postgres  4096 10月 29 15:11 pg_stat
drwx------ 2 postgres postgres  4096 10月 29 15:35 pg_stat_tmp
drwx------ 2 postgres postgres  4096 10月 29 15:35 pg_subtrans
drwx------ 2 postgres postgres  4096 10月 29 15:35 pg_tblspc
drwx------ 2 postgres postgres  4096 10月 29 15:35 pg_twophase
-rw------- 1 postgres postgres     4 10月 29 15:35 PG_VERSION
drwx------ 3 postgres postgres  4096 10月 29 15:35 pg_xlog
-rw------- 1 postgres postgres    88 10月 29 15:35 postgresql.auto.conf
-rw------- 1 postgres postgres 20970 10月 29 15:11 postgresql.conf
[postgres@centos-master ~]$ du -sh /ssd2/pg9410_backups/
133G   /ssd2/pg9410_backups/
[postgres@centos-master ~]$

三 小结和参考链接

一个小的问题,花了我不少时间,原来是PostgreSQL数据库自身的1个bug,不过花时间找到问题的所在,以及参考网络上的解决方案,也算欣慰。

https://www.postgresql.org/message-id/1058774855.53202.1302373390729.JavaMail.root@mail-1.01.com

留言