Notes on psql logical replication

To enable blue-green deployment for a psql rds instance we have been looking into enabling logical replication. Here is a collection of notes and links which I found useful.

The logical replciation process starts by psql taking a snapshot of the publisher database (source) and copies it to the subscriber. After this initial snapshot, the publisher notifies of any data change so the subscriber can replicate it.

Logical replication restrictions

For a full list of restrictions got to psql docs. Here are the show stoppers for us:

  • Schema changes are not sync. The schema can be replicated with pg_dump but any changes in the publisher will have to be manually done in the subscriber.
  • Certain objects are not replicated, such as sequence data, views, materialized views or foreign tables.
  • Tables need to have primary keys, unique keys otherwise replica identity must be set to full.
  • Toasted columns are not replicated. They can be replicated by setting replica identity to full.

Updates and deletes will fail on the source, if the target table is published and it can not be replicated. An error such as the one below will appear:

ERROR: cannot delete from table "table-name" because it does not have a replica identity and publishes deletes
HINT: To enable deleting from the table, set REPLICA IDENTITY using ALTER TABLE.

Replica identity

The replica identiy is set on the published tables and is used to replicate update and delete operations on the subscriber. By default the tables primary key is used as the replica identity. Another unique key could be used under certain conditions. When there is no primary key or unique keys which match those conditions, the table replica identity value can be set to full. In this case the whole row is written to the WAL and the replication can be processed.

Writting the whole columns to the WAL implies more cpu used for writting, and more memory used to stored the logs. Keep in mind that the logs are rotated by default.

There are alternatives to setting the replica identity to full:

  • Create a primary key.
  • Create a unique index and let replica identity use it (link).
  • Dont replicate table.

Other configs to consider

The psql config of the publisher must be updated, at least the wal_level config must be set to logical. Psql docs also recommend to update the values of:

  • max_replication_slots
  • idle_replication_slot_timeout
  • max_wal_senders
  • wal_sender_timeout

The docs recommend updating the following config values of the subscriber:

  • max_active_replication_origin
  • max_logical_replication_workers
  • max_worker_processes
  • max_sync_workers_per_subscription
  • max_parallel_apply_workers_per_subscription
  • wal_receiver_timeout
  • wal_receiver_status_interval
  • wal_retrieve_retry_interval

Some links

  • Blog post by Omprakash replicating some of the logical replication errors.
  • Artie published a nice post with examples of messages. It seems to be removed from their web, so here is the link to the post in the internet archive.
  • On xata they tested the performance of setting replica identity full.