Postgres Write-Ahead Log Usage
Last updated December 12, 2022
Heroku Postgres uses write-ahead logging (WAL) as part of Continuous Protection, continuously archiving WAL to external, reliable storage.
This article covers how WAL is used, issues that can arise when the rate of WAL generation is greater than the rate of WAL archival, and strategies to avoid generating too much WAL.
What Is Write-Ahead Logging?
Write-ahead logging (WAL) is a core part of enabling Postgres’ durability and data consistency guarantees. All changes are written to this append-only log first, then propagated to the data files on disk.
There are exceptions, such as temporary and unlogged tables, where changes aren't written to the WAL first. As a result, these tables aren't crash-safe and can't be replicated.
Heroku Postgres persists WAL files to local disk first. If the WAL capacity fills completely, the database shuts down and is at risk of data loss.
Generally speaking, Postgres generates WAL when performing write operations (for example, INSERT, UPDATE). WAL capacity runs low when the rate of WAL generation exceeds the rate of WAL archival off-disk, or high database load results in lower throughput for the archiver.
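To get a rough sense of how much WAL a write produces, the WAL position can be compared before and after a statement. The following is a minimal sketch, assuming a psql session (for the \gset meta-command) on Postgres 10 or later. The wal_demo table is hypothetical, and the result also includes WAL written by any other sessions active at the same time.

```sql
-- Hypothetical scratch table used only for this measurement.
CREATE TABLE wal_demo (id bigint, note text);

-- Remember the current WAL position in a psql variable.
SELECT pg_current_wal_lsn() AS start_lsn \gset

-- The write being measured.
INSERT INTO wal_demo
SELECT g, 'row ' || g FROM generate_series(1, 100000) AS g;

-- Bytes of WAL written since the saved position, in readable units.
SELECT pg_size_pretty(
         pg_wal_lsn_diff(pg_current_wal_lsn(), :'start_lsn')
       ) AS wal_generated;

DROP TABLE wal_demo;
```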
What Can I Do?
Monitor WAL Capacity
In the Heroku Postgres metrics logs, Heroku emits a sample#wal-percentage-used metric. A healthy database has this metric at 0.75 or lower. When the metric is higher than 0.75, Heroku automatically limits connections to the Heroku Postgres instance.
When connections are limited, Heroku emits an additional log line, with the following structure:
source=DATABASE_URL addon=postgresql-rugged-12345 sample#wal-percentage-used=0.88 sample#max-connections=120 message=Database WAL usage is too high, throttling connections.
Some Heroku Add-on partners who provide monitoring can help with graphing and alerting on the sample#wal-percentage-used metric and the appearance of the additional log line.
Reduce WAL Generation
WAL generation can be reduced by limiting the rate of writes to Heroku Postgres. Some strategies are listed in this section. Which strategies apply depends on the specific workload, so Heroku is unable to provide detailed instructions for all strategies.
Use Postgres Partitioning to Split up Larger Tables
For any database with larger tables, using Postgres native partitioning to split up large tables into smaller ones can help significantly. This process allows you to remove and archive data by manipulating whole partitions rather than subsets of tables, which vastly decreases the amount of WAL produced.
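As an illustration, a table partitioned by time range could be declared as follows. This is a sketch only: the events table, its columns, and the monthly boundaries are hypothetical, and the right partition key and granularity depend on the workload.

```sql
-- Hypothetical parent table, partitioned by event time.
CREATE TABLE events (
    id          bigint      NOT NULL,
    occurred_at timestamptz NOT NULL,
    kind        text        NOT NULL
) PARTITION BY RANGE (occurred_at);

-- One partition per month; the upper bound is exclusive.
CREATE TABLE events_2022_11 PARTITION OF events
    FOR VALUES FROM ('2022-11-01') TO ('2022-12-01');
CREATE TABLE events_2022_12 PARTITION OF events
    FOR VALUES FROM ('2022-12-01') TO ('2023-01-01');

-- Rows inserted through the parent are routed to the matching partition.
INSERT INTO events (id, occurred_at, kind)
VALUES (1, '2022-12-12 09:00:00+00', 'signup');
```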
Use DROP and TRUNCATE for Deleting Data in Bulk
For datasets where a range, such as “the last N days/weeks/months”, is kept and periodically deleted, use table partitioning and partition by time range. Then use Postgres DROP TABLE or TRUNCATE to remove partitions. DROP TABLE and TRUNCATE generate a small amount of WAL that isn’t proportional to the amount of data removed, whereas DELETE generates an amount of WAL proportional to the amount of data removed.
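A sketch of expiring old data this way, assuming a hypothetical page_views table partitioned by month:

```sql
-- Hypothetical table holding a rolling window of data.
CREATE TABLE page_views (
    viewed_at timestamptz NOT NULL,
    path      text        NOT NULL
) PARTITION BY RANGE (viewed_at);

CREATE TABLE page_views_2022_10 PARTITION OF page_views
    FOR VALUES FROM ('2022-10-01') TO ('2022-11-01');
CREATE TABLE page_views_2022_11 PARTITION OF page_views
    FOR VALUES FROM ('2022-11-01') TO ('2022-12-01');

-- Expire October by removing the whole partition.
DROP TABLE page_views_2022_10;

-- Or keep a partition in place and empty it.
TRUNCATE TABLE page_views_2022_11;

-- The row-by-row equivalent that this replaces, where WAL grows
-- with the number of rows removed:
-- DELETE FROM page_views WHERE viewed_at < '2022-11-01';
```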
Don’t Delete and Then Reinsert the Same Data
A pattern Heroku has seen in Heroku Postgres customers who regularly exhaust WAL capacity is continually deleting a large proportion of their data, then reimporting it from an external source. This pattern generates a large amount of WAL, both for the delete and the reinsert. Instead, prefer updating or inserting only the rows that have changed.
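One way to write only the changed rows is an upsert that skips rows whose values are identical. The sketch below assumes hypothetical products and products_incoming tables, with the fresh copy of the external data already loaded into products_incoming. It isn't the only approach.

```sql
-- Hypothetical target table and a temporary table for the incoming data.
CREATE TABLE products (
    id    bigint  PRIMARY KEY,
    name  text    NOT NULL,
    price numeric NOT NULL
);
CREATE TEMPORARY TABLE products_incoming (
    id    bigint  PRIMARY KEY,
    name  text    NOT NULL,
    price numeric NOT NULL
);

-- Insert new rows, and update existing rows only when a value differs.
INSERT INTO products (id, name, price)
SELECT id, name, price FROM products_incoming
ON CONFLICT (id) DO UPDATE
    SET name = EXCLUDED.name,
        price = EXCLUDED.price
    WHERE (products.name, products.price)
          IS DISTINCT FROM (EXCLUDED.name, EXCLUDED.price);

-- Remove only the rows that no longer exist in the source.
DELETE FROM products p
WHERE NOT EXISTS (
    SELECT 1 FROM products_incoming i WHERE i.id = p.id
);
```

Unchanged rows are never rewritten, so they produce no row-change WAL records.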
Batch Large Volume Writes
Heroku strongly recommends using COPY and other bulk-insert mechanisms over issuing individual INSERT statements. Manipulating one row per query amplifies the WAL produced.
The Postgres documentation on populating a database has more information on using COPY for bulk inserts, as well as some other best practices for bulk data imports. Not all of the advice is applicable to Heroku Postgres.
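For illustration, the sketch below loads the same hypothetical readings table in three ways. It assumes the script is run through psql, which feeds the inline rows to COPY ... FROM STDIN until the \. terminator. From a client machine, psql's \copy meta-command can read a local file in the same way.

```sql
-- Hypothetical table for the example.
CREATE TABLE readings (
    sensor_id integer NOT NULL,
    value     numeric NOT NULL
);

-- Avoid: one statement (and one transaction) per row.
INSERT INTO readings (sensor_id, value) VALUES (1, 20.5);
INSERT INTO readings (sensor_id, value) VALUES (2, 21.0);

-- Better: several rows in a single statement.
INSERT INTO readings (sensor_id, value)
VALUES (3, 19.8), (4, 22.1), (5, 20.0);

-- Best for large loads: COPY, here reading inline CSV rows via psql.
COPY readings (sensor_id, value) FROM STDIN WITH (FORMAT csv);
6,18.9
7,23.4
\.
```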
Disable Triggers Causing Write Amplification
Postgres triggers are a powerful tool, but they can lead to write amplification, where a write to one table can cause many writes to other ones. This pattern isn’t good if you must update a large amount of data. It’s preferable to disable triggers when performing data loading or large amounts of writes. You can disable triggers via ALTER TABLE <table> DISABLE TRIGGER <trigger name> for a specific trigger, or ALTER TABLE <table> DISABLE TRIGGER ALL for all triggers.
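A sketch of disabling a trigger around a bulk load follows. The orders and orders_audit tables, the function, and the trigger name are all hypothetical. Note that DISABLE TRIGGER ALL also covers internally generated constraint triggers, which Postgres only lets superusers disable, so DISABLE TRIGGER USER (user-defined triggers only) may be the practical option where superuser access isn't available.

```sql
-- Hypothetical tables and an audit trigger that doubles every write.
CREATE TABLE orders (id bigint PRIMARY KEY, total numeric NOT NULL);
CREATE TABLE orders_audit (
    order_id   bigint,
    changed_at timestamptz DEFAULT now()
);

CREATE FUNCTION log_order_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO orders_audit (order_id) VALUES (NEW.id);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- EXECUTE FUNCTION needs Postgres 11+; older versions use EXECUTE PROCEDURE.
CREATE TRIGGER orders_audit_trigger
    AFTER INSERT OR UPDATE ON orders
    FOR EACH ROW EXECUTE FUNCTION log_order_change();

-- Disable the trigger only for the duration of the bulk load.
BEGIN;
ALTER TABLE orders DISABLE TRIGGER orders_audit_trigger;
INSERT INTO orders (id, total)
SELECT g, 0 FROM generate_series(1, 100000) AS g;
ALTER TABLE orders ENABLE TRIGGER orders_audit_trigger;
COMMIT;
```

Rows written while the trigger is disabled are not audited, so any work the trigger would have done must be skipped deliberately or applied afterwards in bulk.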
Don’t Use Postgres as an Object Store
Postgres is a transactional relational database, designed for online transaction processing (OLTP) of data. It isn’t designed to be an object store. As a result, storing binary data and large amounts of text/JSON/JSONB in a single column is an anti-pattern.
Use Temporary or Unlogged Tables for Data Loading
When loading a large amount of data, it can be helpful to “stage” the data in tables that aren’t written to WAL. This process has some risks, as these tables aren’t crash-safe, but it can be a useful tool as part of an Extract-Transform-Load process.
You can create temporary tables using CREATE TEMPORARY TABLE, and unlogged tables with CREATE UNLOGGED TABLE.
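A minimal staging sketch, assuming hypothetical customers and customers_staging tables. Here generate_series stands in for the real load step, which would normally be a COPY.

```sql
-- Hypothetical final (logged) table.
CREATE TABLE customers (
    id    bigint PRIMARY KEY,
    email text   NOT NULL
);

-- Staging table whose row changes are not written to WAL.
-- CREATE TEMPORARY TABLE works similarly but is visible only to
-- the current session.
CREATE UNLOGGED TABLE customers_staging (
    id    bigint,
    email text
);

-- Load the raw data (stand-in for COPY).
INSERT INTO customers_staging
SELECT g, 'User' || g || '@Example.com'
FROM generate_series(1, 1000) AS g;

-- Transform in place; these intermediate writes avoid WAL.
UPDATE customers_staging SET email = lower(email);

-- Only the final write into the logged table is fully WAL-logged.
INSERT INTO customers (id, email)
SELECT id, email FROM customers_staging;

DROP TABLE customers_staging;
```

If the database crashes before the final INSERT completes, the contents of the unlogged table are lost and the load must be restarted from the source.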
Ensure Follower Parity
Heroku Postgres followers can be up to two plans lower than the leader. A mismatch of plan sizes increases the chances of the follower being unable to keep up with WAL playback.