Replication with WAL (Write-Ahead Logging)

. 6 min read

Suhas:

Hey Gemini, can you give me a step by step guide to replicating two databases in two different AWS accounts with WAL?


Gemini 1.5 Pro:

Here’s a step-by-step guide to setting up PostgreSQL replication across two AWS accounts using WAL shipping:

Assumptions:

  • You have two AWS accounts (Account A and Account B).
  • PostgreSQL instances are already running in both accounts.
  • You have basic familiarity with AWS services like EC2, IAM, and S3.

Steps:

Account A (Primary Database):

  1. Configure PostgreSQL for Replication:

    • Edit postgresql.conf (usually located in /var/lib/pgsql/data):
      • Set wal_level = replica (or higher) to enable WAL archiving.
      • Set max_wal_senders = <number> (e.g., 5) to allow connections from standby servers.
      • Set wal_keep_segments = <number> (e.g., 1024) to retain enough WAL segments for replication.
      • Add a replication user:
        CREATE USER <replication_user> WITH REPLICATION PASSWORD '<strong_password>';
    • Edit pg_hba.conf to allow replication connections from the standby server’s IP address or CIDR block:
      host replication    <replication_user>    <standby_server_ip_address/cidr_block>    trust
    • Restart PostgreSQL.
  2. Set up WAL Archiving to S3:

    • Install the aws_s3 extension for pg_basebackup (refer to PostgreSQL documentation for installation instructions).
    • Create an IAM role in Account A with permissions to write to an S3 bucket.
    • Attach the IAM role to the EC2 instance hosting the primary PostgreSQL server.
    • Configure archive_command in postgresql.conf to archive WAL segments to the S3 bucket:
      archive_mode = on
      archive_command = 'pg_basebackup -b -D s3://<your-s3-bucket-name>/wal_archives/%f --format=tar --access-key-id=<your-access-key-id> --secret-access-key=<your-secret-access-key> --region=<your-s3-region>'
    • Replace placeholders with your actual values.

Account B (Standby Database):

  1. Create an IAM Role for Cross-Account Access:

    • Create an IAM role in Account B with permissions to:
      • Read objects from the S3 bucket in Account A where WAL segments are archived.
      • Assume the IAM role created in Account A (step 2 in Account A configuration).
    • Note the ARN of this newly created role.
  2. Configure the Standby Server:

    • Stop the PostgreSQL service on the standby server.
    • Take a base backup of the primary database and restore it on the standby server. You can use pg_basebackup with the aws_s3 extension and the IAM role ARN from step 1 to restore from the S3 bucket.
    • Edit postgresql.conf on the standby:
      • Set hot_standby = on to allow read-only queries on the standby.
    • Create recovery.conf (if it doesn’t exist) in the data directory and add:
      standby_mode = 'on'
      primary_conninfo = 'host=<primary_server_endpoint> port=5432 user=<replication_user> password=<strong_password>'
      restore_command = 'pg_basebackup -b -D s3://<your-s3-bucket-name>/wal_archives/%f --format=tar --access-key-id=<your-access-key-id> --secret-access-key=<your-secret-access-key> --region=<your-s3-region>'
      • Replace placeholders with actual values.
    • Start the PostgreSQL service on the standby server.

Verification:

  • Monitor the PostgreSQL logs on both the primary and standby servers to ensure replication is working as expected.
  • Run a SELECT pg_is_in_recovery(); query on the standby server. It should return true.

Security Considerations:

  • Use strong passwords for the replication user.
  • Configure your S3 bucket policies to restrict access only to authorized users and roles.
  • Consider encrypting data at rest in S3 and in transit using SSL/TLS.

This guide provides a basic framework for setting up cross-account replication with WAL shipping. You can customize it further based on your specific requirements and security needs. Remember to consult the PostgreSQL and AWS documentation for detailed instructions and best practices.


Crunchybridge to AWS RDS

Here’s a step-by-step guide to set up WAL replication between a Crunchy Bridge source database and an AWS RDS instance as the destination:

Assumptions:

  • You have an existing Crunchy Bridge PostgreSQL cluster.
  • You have an existing AWS RDS PostgreSQL instance (or are ready to create one).
  • You’re comfortable with both Crunchy Bridge and AWS management consoles.
  • You have necessary permissions in both environments to configure networking and security settings.

Steps:

1. Configure Crunchy Bridge (Source):

  • Enable logical decoding: In your Crunchy Bridge cluster settings, ensure the wal_level parameter is set to logical and max_replication_slots is at least 1. These settings control the level of detail captured in the WAL and allow for creating replication slots.
  • Create a replication slot: Create a new replication slot dedicated to your AWS RDS replica. This reserves WAL segments on the Crunchy Bridge side, ensuring they aren’t removed prematurely and the replica doesn’t lose data. Note the slot name.
  • Create a dedicated replication user: Create a PostgreSQL user account on your Crunchy Bridge cluster specifically for replication purposes. Grant it the REPLICATION privilege. Make note of the username and password.
  • Network connectivity: Ensure your AWS RDS instance has network access to your Crunchy Bridge cluster. This might involve configuring security groups or VPC peering depending on your AWS setup. Crunchy Bridge provides connection details, including hostname and port, that you’ll need for the next steps.

2. Configure AWS RDS (Destination):

  • Launch RDS Instance (if needed): If you don’t have an existing RDS PostgreSQL instance, create one. Choose an instance size appropriate for your workload and a PostgreSQL version compatible with your Crunchy Bridge cluster.
  • Modify RDS Parameter Group: Create a custom parameter group for your RDS instance or modify the default one. Set the following parameters:
    • wal_receiver_create_temp_slot to on (this allows RDS to create temporary slots for initial replication setup).
    • wal_receiver_status_interval to a lower value (e.g., 10) for faster replication monitoring.
  • Create a Security Group Rule: Add an inbound rule to your RDS instance’s security group that allows traffic on port 5432 (PostgreSQL) from the IP address or CIDR block of your Crunchy Bridge cluster.
  • Connect to the RDS instance: Use a tool like psql to connect to your RDS instance as the master user.

3. Initiate Replication from RDS:

  • Perform a base backup: While connected to your RDS instance, take a base backup of your Crunchy Bridge database. You can use pg_basebackup with the -X stream option to stream the backup over the network:
    pg_basebackup -h <crunchy-bridge-host> -p <crunchy-bridge-port> -U <replication-user> -D /path/to/backup/directory -X stream -P -W -S <slot-name>
    • Replace placeholders with actual values from your Crunchy Bridge cluster and the replication slot name.
  • Configure recovery.conf: Create or edit the recovery.conf file in the data directory of your RDS instance. Add the following lines, replacing placeholders with actual values:
    standby_mode = 'on'
    primary_conninfo = 'host=<crunchy-bridge-host> port=<crunchy-bridge-port> user=<replication-user> password=<replication-user-password> application_name=<descriptive-name>'
    recovery_target_timeline = 'latest'
  • Restart RDS Instance: Restart your RDS instance to apply the configuration changes and initiate the replication process.

4. Monitor Replication:

  • RDS Console: Monitor the replication status in the AWS RDS console. You should see your instance in a “replicating” state.
  • PostgreSQL Logs: Check the PostgreSQL logs on both your Crunchy Bridge cluster and RDS instance for any errors or warnings related to replication.

Additional Tips:

  • SSL/TLS: For enhanced security, configure SSL/TLS encryption for the replication connection between Crunchy Bridge and your RDS instance.

  • Monitoring: Set up monitoring for replication lag to ensure your RDS instance stays in sync with your Crunchy Bridge database.

  • Testing: Periodically test your failover scenario by promoting your RDS instance to a standalone primary to ensure everything works as expected.

    Important Notes:

  • Crunchy Bridge Limitations: Contact Crunchy Bridge support to confirm their recommended approach for setting up replication to external services and any limitations that might apply.

  • AWS RDS Restrictions: Be mindful of AWS RDS restrictions; for example, you cannot directly modify the pg_hba.conf file on an RDS instance.

  • Security Best Practices: Always follow security best practices, including using strong passwords, least privilege principles, and encrypting sensitive data in transit and at rest.