1 源库配置

1.1 修改配置文件

  • 修改postgresql.conf文件
#逻辑级别日志
wal_level = logical
#如果需要远程读日志,需要设置,至少为1,就接受1个读日志线程
max_wal_senders = 10   #建议大于等于10 
#设置日志保留个数,设置太少了,日志量太大了日志自动删除后,容易读不到日志
wal_keep_segments = 10   #建议大于等于10

# wal_keep_size:要为日志复制进程保留多少日志量。旧版本使用的是wal_keep_segments参数(要为日志复制进程保留多少个日志文件),在pg 13中,该参数已被移除。wal_keep_size = wal_keep_segments * wal_segment_size
# 修改完成后需要重启数据库:pg_ctl -D /var/lib/postgresql/data -l logfile restart
# wal_level=logical 需要重启数据库
  • 验证是否成功
MogDB=#select name,setting,category from pg_settings where name in ('wal_level','wal_keep_segments','max_wal_senders','wal_keep_size');
       name        | setting |           category            
-------------------+---------+-------------------------------
 max_wal_senders   | 10      | Replication / Sending Servers
 wal_keep_segments | 10      | Replication / Sending Servers
 wal_level         | logical | Write-Ahead Log / Settings
(3 rows)
  • 如果远程读日志,需要设置读日志的权限,修改pg_hba.conf

    host    replication     all             0.0.0.0/0            ident/md5 ###可选
  • 验证文件是否成功

MogDB=#SELECT * FROM pg_read_file('pg_hba.conf', 0, 65535);
                              pg_read_file                               
-------------------------------------------------------------------------
 # PostgreSQL Client Authentication Configuration File                  +
 # ===================================================                  +
 #                                                                      +
 # Refer to the "Client Authentication" section in the PostgreSQL       +
 # documentation for a complete description of this file.  A short      +
 # synopsis follows.                                                    +
 #                                                                      +
 # This file controls: which hosts are allowed to connect, how clients  +
 # are authenticated, which PostgreSQL user names they can use, which   +
 # databases they can access.  Records take one of these forms:         +
 #                                                                      +
 # local      DATABASE  USER  METHOD  [OPTIONS]                         +
 # host       DATABASE  USER  ADDRESS  METHOD  [OPTIONS]                +
 # hostssl    DATABASE  USER  ADDRESS  METHOD  [OPTIONS]                +
 # hostnossl  DATABASE  USER  ADDRESS  METHOD  [OPTIONS]                +
 #                                                                      +
 # (The uppercase items must be replaced by actual values.)             +
 #                                                                      +
 # The first field is the connection type: "local" is a Unix-domain     +
 # socket, "host" is either a plain or SSL-encrypted TCP/IP socket,     +
 # "hostssl" is an SSL-encrypted TCP/IP socket, and "hostnossl" is a    +
 # plain TCP/IP socket.                                                 +
 #                                                                      +
 # DATABASE can be "all", "sameuser", "samerole", "replication", a      +
 # database name, or a comma-separated list thereof. The "all"          +
 # keyword does not match "replication". Access to replication          +
 # must be enabled in a separate record (see example below).            +
 #                                                                      +
 # USER can be "all", a user name, a group name prefixed with "+", or a +
 # comma-separated list thereof.  In both the DATABASE and USER fields  +
 # you can also write a file name prefixed with "@" to include names    +
 # from a separate file.                                                +
 #                                                                      +
 # ADDRESS specifies the set of hosts the record matches.  It can be a  +
 # host name, or it is made up of an IP address and a CIDR mask that is +
 # an integer (between 0 and 32 (IPv4) or 128 (IPv6) inclusive) that    +
 # specifies the number of significant bits in the mask.  A host name   +
 # that starts with a dot (.) matches a suffix of the actual host name. +
 # Alternatively, you can write an IP address and netmask in separate   +
 # columns to specify the set of hosts.  Instead of a CIDR-address, you +
 # can write "samehost" to match any of the server's own IP addresses,  +
 # or "samenet" to match any address in any subnet that the server is   +
 # directly connected to.                                               +
 #                                                                      +
 # METHOD can be "trust", "reject", "md5", "sha256" or "cert".          +
 # "sha256" is preferred since                                          +
                                                                        +
 #                                                                      +
 # OPTIONS are a set of options for the authentication in the format    +
 # NAME=VALUE.  The available options depend on the different           +
 # authentication methods -- refer to the "Client Authentication"       +
 # section in the documentation for a list of which options are         +
 # available for which authentication methods.                          +
 #                                                                      +
 # Database and user names containing spaces, commas, quotes and other  +
 # special characters must be quoted.  Quoting one of the keywords      +
 # "all", "sameuser", "samerole" or "replication" makes the name lose   +
 # its special character, and just match a database or username with    +
 # that name.                                                           +
 #                                                                      +
 # This file is read on server startup and when the postmaster receives +
 # a SIGHUP signal.  If you edit the file on a running system, you have +
 # to SIGHUP the postmaster for the changes to take effect.  You can    +
 # use "pg_ctl reload" to do that.                                      +
                                                                        +
 # Put your actual configuration here                                   +
 # ----------------------------------                                   +
 #                                                                      +
 # If you want to allow non-local connections, you need to add more     +
 # "host" records.  In that case you will also need to make PostgreSQL  +
 # listen on a non-local interface via the listen_addresses             +
 # configuration parameter, or via the -i or -h command line switches.  +
 #                                                                      +
 # NOTICE:                                                              +
 # ----------------------------------                                   +
 #                                                                      +
 # When you configure this file, please configure it from top to bottom,+
 # higher priority needs to be configured in front.                     +
                                                                        +
 # CAUTION: Configuring the system for local "trust" authentication     +
 # allows any local user to connect as any PostgreSQL user, including   +
 # the database sysadmin.  If you do not trust all your local users,    +
 # use another authentication method.                                   +
                                                                        +
                                                                        +
 # TYPE  DATABASE        USER            ADDRESS                 METHOD +
                                                                        +
 # "local" is for Unix domain socket connections only                   +
 local   all             all                                     trust  +
 # IPv4 local connections:                                              +
 host    all             all             127.0.0.1/32            trust  +
 # IPv6 local connections:                                              +
 host    all             all             ::1/128                 trust  +
 # Allow replication connections from localhost, by a user with the     +
 # replication privilege.                                               +
 #local   replication     omm                                trust      +
 #host    replication     omm        127.0.0.1/32            trust      +
 #host    replication     omm        ::1/128                 trust      +
                                                                        +
 host all all 0.0.0.0/0 md5                                             +
 host replication mogdb 0.0.0.0/0 md5                                   +
 host    replication     all             0.0.0.0/32            md5      +

(1 row)

以下过程中,web页面不要点击注册按键,所有配置手工完成!

1.2 创建用户及赋权

1.2.1 登录到同步库下执行

切换到同步库下,执行如下(数据库命令行执行):

create user zcbus with password  '*****';
grant connect on database *** to zcbus;  # ***为需要同步的数据库名称
create schema zcbus authorization zcbus;
GRANT ALL PRIVILEGES ON DATABASE testdb TO zcbus;
grant all on schema zcbus to zcbus;
grant all on schema public to zcbus;

备注:

  此处参考postgresql,只有owner或usersuper权限才能修改表结构或执行drop操作,请在执行操作前确定所同步表的Owner或用Superuser权限用户执行ALTER USER zcbus WITH SUPERUSER;

1.2.2 创建ddl_event_tbl表

create table zcbus.ddl_event_tbl (
event text,
tag text,
objid oid,
object_type text,
schema_name text,
object_identity text,
query text,
op_time timestamp default now()
);

alter table zcbus.ddl_event_tbl owner to zcbus;
  • 检查ddl_event_tbl表是否创建
select * from information_schema.tables where table_schema='zcbus' and table_type='BASE TABLE' and table_name='ddl_event_tbl';

1.2.5 为需要同步的表打开补充日志

ALTER TABLE schemaname.tabname REPLICA IDENTITY FULL;

---批量生成打开表补充日志sql语句参考
SELECT 'ALTER TABLE '||nspname||'.'||relname||' REPLICA IDENTITY FULL;' FROM pg_class c,pg_namespace n where relnamespace = n.oid and c.relkind='r' and nspname='public';

---注1:表所属用户或者超级用户才有开启REPLICA IDENTITY权限,否则报错:must be owner of table
---注2:关于分区表开启补充日志以及添加新的同步表,请查看本页2.1,2.2章节说明

1.2.6 用户授权

grant select,trigger on table zcbus.ddl_event_tbl  to zcbus;
grant execute ON FUNCTION zcbus.zcbus_func_ddl_command to zcbus;
alter user zcbus REPLICATION;
grant select on all tables in schema public to zcbus;
grant execute on function pg_hba_file_rules to zcbus;

1.3 添加槽位号说明(可选操作参考PG库)

(如发布数据库未配置SLOT_NAME参数,则此步骤添加槽位号跳过,目前软件仅支持添加物理复制槽)

1.3.1 创建槽位号(SQL/系统命令两种可选)

  • SQL命令
    • 创建槽位号
  select pg_create_physical_replication_slot('zcbus_slot_name');
- 删除槽位号
  select pg_drop_replication_slot('zcbus_slot_name');
  • 操作系统命令
    • 创建槽位号
  pg_receivewal --create-slot --slot=zcbus_slot_name
- 删除槽位号
  pg_receivewal --drop-slot --slot=zcbus_slot_name

1.3.2 查看槽位号信息

postgres=#  select * from pg_replication_slots;
    slot_name    | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn 
-----------------+--------+-----------+--------+----------+-----------+--------+------------+------+--------------+-------------+---------------------
 zcbus_slot_name |        | physical  |        |          | f         | t      |       2495 |      |              | 0/136FA6E0  | 
(1 row)

1.3.3 切换日志测试

  • PG10之前版本
  select pg_switch_xlog();
  • PG10之后版本
    select pg_switch_wal();

删除日志参考:

  • 检查可以删除wal的日志文件
     83e44cfdd569:~/data$ pg_controldata -D /var/lib/postgresql/data/ |grep "REDO WAL file"
     Latest checkpoint's REDO WAL file:    000000010000000100000067
  • 删除日志文件操作
     pg_archivecleanup /var/lib/postgresql/data/pg_wal/ 000000010000000100000067

2 补充说明

2.1 添加新的同步表

添加新表,根据schemaname.tablename,确认postgres账号开启表的补充日志模式状态开启成功

select relreplident from pg_class c , pg_tables tab where tab.schemaname = 'schemaname' and c.relname='tablename';

查询结果说明

d = 默认(主键)
n = 无
f = 所有列
i = 索引的indisreplident被设置或者默认
查询结果为f,则此表满足同步条件,开启表的补充日志模式成功

注:每次添加完新的同步表,需要再次对同步用户执行如下权限

grant select on all tables in schema public to zcbus;
grant execute on function pg_hba_file_rules to zcbus;

2.2 源表为分区表开启补充日志

SELECT
   parent.relname AS partitioned_table_name,
   parent.relkind AS partitioned_table_type,
   parent.relpartbound AS partition_bound,
   child.relname,
   child.relkind,
   child.relreplident,
   'ALTER TABLE ' || nspname || '.' || child.relname || ' REPLICA IDENTITY FULL' rif
FROM
 pg_class parent
LEFT JOIN pg_inherits ON pg_inherits.inhparent = parent.oid
LEFT JOIN pg_class child ON pg_inherits.inhrelid = child.oid
LEFT JOIN pg_namespace pns ON child.relnamespace = pns.oid
WHERE
 parent.relkind = 'p'
AND parent.relname = 'tabname';

---该语句能查询出分区表中的分区是否开启补充日志,若没有开启,可通过生成的批量SQL开启表补充日志

2.3 数据类型支持列表

 在抓取数据时,我们将PostgreSql数据类型与Zcbus支持的类型相匹配。如果我们不支持某个数据类型,我们会自动将该类型更改为最受支持的类型,或者在某些情况下,根本不加载该数据。我们的系统将会抛出异常、跳过我们不接受或转换的数据类型对应表。

PostgreSql类型 是否支持 描述
smallint 支持
integer 支持
bigint 支持
decimal(m,d) 支持
numeric(m,d) 支持
real 支持
double precision 支持
smallserial 支持
serial 支持
bigserial 支持
money 支持
character varying(n) 支持
varchar(n) 支持
character(n) 支持
char(n) 支持
text 支持
timestamp 支持
tmestamp with time zone 支持
date 支持
time 支持
time with time zone 支持
interval 支持
bit(n) 支持
bit varing(n) 支持
boolean 支持
point 支持
line 支持
lseg 支持
box 支持
path 支持
polygon 支持
circle 支持
cidr 支持
inet 支持
macaddr 支持
tsvector 支持
tsquery 支持
uuid 支持
xml 支持
文档更新时间: 2024-07-26 03:02   作者:liyue