PostgreSQL Monitoring & Statistics — Section Overview

This subcategory is PostgreSQL’s introspection plane: the machinery that lets a backend report what it is doing right now and lets the engine accumulate what has happened so that pg_stat_* views, the autovacuum scheduler, and the planner’s “did this table change enough to re-analyze?” heuristics have numbers to read. Almost everything here lives under one source root — src/backend/utils/activity/ — and that physical grouping is the subcategory boundary.

Three pillars share that home, and the sharp line between them is live state versus accumulated state:

  • The cumulative statistics system (pgstat*.c) — counters that add up over time: tuples inserted/updated/dead, blocks hit/read, vacuum and analyze timestamps, WAL bytes, checkpointer and bgwriter activity, per-IO timings. Since PostgreSQL 15 this is a shared-memory subsystem; it replaced the old dedicated stats-collector auxiliary process that received counter deltas over a UDP socket and owned a private file. That redesign is the defining fact of this subcategory and the reason it exists as its own section rather than a footnote in server-architecture.
  • Wait events and backend status (backend_status.c, wait_event.c): a point-in-time report. Each backend publishes “I am running this query, in this state, currently waiting on this event” into its own slot, an update cheap enough to perform on every wait. This is the live view behind pg_stat_activity. The wait-event taxonomy is code-generated from wait_event_names.txt.
  • Progress reporting (backend_progress.c) — a small per-command progress array (VACUUM, CREATE INDEX, COPY, base backup, …) that a long-running command updates and pg_stat_progress_* views read, with a parallel-leader/worker aggregation path.
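
The progress-reporting pillar can be pictured with a small toy model. The Python sketch below is not PostgreSQL code: the class `ProgressSlot`, its methods, the helper `worker_report`, and the array size are hypothetical names and values chosen for illustration. Only the idea comes from the description above: a small per-command array of integer parameters that the running command updates, with increments from parallel workers folded into the leader's copy.

```python
from dataclasses import dataclass, field

# Hypothetical size for this sketch; the real array length is a
# compile-time constant inside PostgreSQL.
NUM_PROGRESS_PARAMS = 8


@dataclass
class ProgressSlot:
    """Toy stand-in for the per-backend progress fields that views read."""
    command: str = "invalid"
    params: list = field(default_factory=lambda: [0] * NUM_PROGRESS_PARAMS)

    def start_command(self, command: str) -> None:
        # Starting a command resets every parameter so that values from a
        # previous command never show up in the new command's report.
        self.command = command
        self.params = [0] * NUM_PROGRESS_PARAMS

    def update_param(self, index: int, value: int) -> None:
        # The running command overwrites a parameter with an absolute value.
        self.params[index] = value

    def incr_param(self, index: int, delta: int) -> None:
        # Some parameters are counters that only ever grow.
        self.params[index] += delta


def worker_report(leader: ProgressSlot, index: int, delta: int) -> None:
    """Assumed aggregation path: a worker's delta lands in the leader's slot."""
    leader.incr_param(index, delta)


# Usage: the leader does part of the work and a worker reports the rest.
leader = ProgressSlot()
leader.start_command("CREATE INDEX")
leader.update_param(0, 1000)   # e.g. total units of work
leader.incr_param(1, 100)      # units done by the leader itself
worker_report(leader, 1, 250)  # units done by a parallel worker
```

A reader of the slot sees one combined figure per parameter, which is why a view over it can report a single row per command rather than one row per worker.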

Boundaries — what this section hands off:

  • The processes that produce the numbers belong elsewhere. The checkpointer (which serializes the stats file at shutdown), the bgwriter, the startup process (which reloads it), and the backend lifecycle are owned by server-architecture. This subcategory owns the counters those processes emit, not the processes.
  • The substrate the counters describe belongs elsewhere too. SLRU stats counters live here, but the SLRU caches themselves belong to txn-recovery (postgres-slru.md); IO stats live here, but the async-IO and buffer machinery belong to storage-engine (postgres-aio.md, postgres-buffer-manager.md).
  • The shared-memory and DSA machinery the cumulative system is built on — the fixed segment, DSM/DSA, dshash, LWLocks — is server-architecture (postgres-shared-memory-ipc.md). This section explains how the stats system uses that substrate, not how the substrate works.
  • The pg_stat_* views and SQL functions that surface these numbers are catalog/SQL surface, not analyzed as their own module here; this section stops at the C reporting layer the views read.

The historical arc — stats-collector → shared-memory cumulative stats — is captured separately in postgres-evolution-statistics.md; this router names the current (REL_18) shape and defers the version-by-version story there.

The two module docs split cleanly along the live-vs-accumulated seam. The cumulative system has a per-backend pending → shared store → disk flow; the wait/status/progress system is a direct per-PGPROC publish with no accumulation. Both ride the same utils/activity/ home and the same shared-memory substrate owned by server-architecture.

flowchart TB
  subgraph BACKEND["any backend / aux process"]
    ACT["running command<br/>(query, vacuum, copy, ...)"]
    PEND["pending stats (process-local)<br/>PgStat_EntryRef cache + have_*_stats"]
    LIVE["live self-report<br/>MyBEEntry + MyProc wait slot"]
  end

  ACT --> PEND
  ACT --> LIVE

  subgraph CUMUL["postgres-cumulative-stats.md  (accumulated view)"]
    FLUSH["pgstat_report_stat()<br/>flush at xact end / timeout"]
    FIXED["fixed-numbered kinds<br/>plain shmem block<br/>(checkpointer, bgwriter, WAL, IO, SLRU, archiver)"]
    VAR["variable-numbered kinds<br/>DSA + dshash, keyed by PgStat_HashKey<br/>(per-relation, per-function, per-db, replslot, subscription, backend)"]
    FILE["pgstat.stat on disk<br/>(checkpointer writes at shutdown,<br/>startup reads / discards after crash)"]
  end

  PEND --> FLUSH
  FLUSH --> FIXED
  FLUSH --> VAR
  FIXED --> FILE
  VAR --> FILE

  subgraph LIVEDOC["postgres-wait-events-progress.md  (live view)"]
    STATUS["backend status<br/>backend_status.c -> PgBackendStatus"]
    WAIT["wait events<br/>wait_event.c, taxonomy codegen'd<br/>from wait_event_names.txt"]
    PROG["command progress<br/>backend_progress.c -> st_progress_param[]"]
  end

  LIVE --> STATUS
  LIVE --> WAIT
  LIVE --> PROG

  subgraph READERS["readers (out of scope here)"]
    VIEWS["pg_stat_* / pg_stat_activity /<br/>pg_stat_progress_* views"]
    AV["autovacuum scheduler"]
    PLAN["planner re-analyze heuristics"]
  end

  FIXED --> VIEWS
  VAR --> VIEWS
  VAR --> AV
  VAR --> PLAN
  STATUS --> VIEWS
  WAIT --> VIEWS
  PROG --> VIEWS
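
The accumulated half of the diagram (pending counters, then a flush, then the shared store) can be sketched as a toy model. This is a minimal Python illustration under stated assumptions, not the PostgreSQL implementation: `SharedStats`, `Backend`, `count`, `report_stat`, and the `FLUSH_MIN_INTERVAL` throttle value are all hypothetical. It shows only the batching idea: the hot path touches process-local memory, and the shared store is written when a flush is due.

```python
import time
from collections import defaultdict

# Hypothetical throttle for this sketch, in seconds.
FLUSH_MIN_INTERVAL = 1.0


class SharedStats:
    """Toy shared store: one dict of counters per stats key."""

    def __init__(self):
        self.entries = defaultdict(lambda: defaultdict(int))
        self.writes = 0  # number of times shared state was touched

    def apply(self, key, deltas):
        self.writes += 1
        for name, delta in deltas.items():
            self.entries[key][name] += delta


class Backend:
    """Toy backend holding process-local pending counters."""

    def __init__(self, shared, clock=time.monotonic):
        self.shared = shared
        self.clock = clock
        self.pending = defaultdict(lambda: defaultdict(int))
        self.last_flush = None

    def count(self, key, name, delta=1):
        # Hot path: only process-local memory is modified.
        self.pending[key][name] += delta

    def report_stat(self, force=False):
        """Flush pending deltas to the shared store; True if a flush ran."""
        if not self.pending:
            return False
        now = self.clock()
        too_soon = (self.last_flush is not None
                    and now - self.last_flush < FLUSH_MIN_INTERVAL)
        if too_soon and not force:
            return False  # keep accumulating locally until the next attempt
        for key, deltas in self.pending.items():
            self.shared.apply(key, deltas)
        self.pending.clear()
        self.last_flush = now
        return True


# Usage with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
shared = SharedStats()
backend = Backend(shared, clock=lambda: fake_now[0])
key = ("relation", 5, 16384)

for _ in range(1000):
    backend.count(key, "tuples_inserted")
flushed_first = backend.report_stat()   # first flush goes through
backend.count(key, "tuples_inserted")
flushed_early = backend.report_stat()   # throttled, delta stays pending
fake_now[0] = 2.0
flushed_late = backend.report_stat()    # interval elapsed, flush goes through
```

In this run 1001 increments reach the shared store through only two shared writes, which is the point of keeping the counters pending locally.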

The structural takeaways a reader should carry into the module docs:

  • Two storage classes inside one cumulative system. Fixed-numbered kinds (one or a handful of objects — checkpointer, bgwriter, WAL, IO, SLRU, archiver) live in plain shared memory carved at startup. Variable-numbered kinds (per-relation, per-function, per-database, replication slot, subscription, per-backend) live in dynamic shared memory reached through a dshash hash table keyed by PgStat_HashKey (kind + dboid + objid). The counters for variable kinds are allocated separately from the hash entry (the entry holds a pointer to a body), so different kinds share one table without bloating it.
  • Backends never write the shared store on the hot path. A backend bumps process-local pending counters and flushes them with pgstat_report_stat() at transaction end (or on a timeout), so the expensive shared-memory write is batched. This is the architectural payoff of the PG15 redesign over the old UDP-packet collector.
  • Live reporting is even cheaper and never accumulates. Wait-event and status updates write directly into the backend’s own PgBackendStatus / MyProc slot with no locking on the common path — designed to be lit up on every lock wait without measurable cost.
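
The first takeaway, two storage classes with a pointer from hash entry to body, can be sketched as follows. This Python model is only an illustration under assumptions: `HashKey`, `HashEntry`, `StatsStore`, and the integer `body_ref` are hypothetical stand-ins, and plain dicts replace shared memory, DSA, and dshash. The key shape (kind, dboid, objid) and the list of fixed kinds are taken from the text above.

```python
from dataclasses import dataclass
from typing import NamedTuple


class HashKey(NamedTuple):
    """Mirrors the (kind, dboid, objid) triple described for PgStat_HashKey."""
    kind: str
    dboid: int
    objid: int


@dataclass
class HashEntry:
    """Small uniform table entry; the counters live in a separate body."""
    body_ref: int  # stands in for a pointer into dynamically allocated memory


class StatsStore:
    FIXED_KINDS = ("archiver", "bgwriter", "checkpointer", "io", "slru", "wal")

    def __init__(self):
        # Fixed-numbered kinds: one block per kind, set up once at startup.
        self.fixed = {kind: {} for kind in self.FIXED_KINDS}
        # Variable-numbered kinds: a single table shared by every kind.
        self.table = {}    # HashKey -> HashEntry
        self.bodies = {}   # body_ref -> counters (shape depends on the kind)
        self._next_ref = 1

    def get_or_create(self, key: HashKey) -> dict:
        entry = self.table.get(key)
        if entry is None:
            entry = HashEntry(body_ref=self._next_ref)
            self._next_ref += 1
            self.table[key] = entry
            self.bodies[entry.body_ref] = {}
        return self.bodies[entry.body_ref]

    def drop(self, key: HashKey) -> None:
        # Dropping an object removes both the table entry and its body.
        entry = self.table.pop(key, None)
        if entry is not None:
            self.bodies.pop(entry.body_ref, None)


# Usage: two different kinds share one table; a fixed kind needs no lookup.
store = StatsStore()
rel = store.get_or_create(HashKey("relation", 5, 16384))
rel["tuples_inserted"] = rel.get("tuples_inserted", 0) + 10
func = store.get_or_create(HashKey("function", 5, 24576))
func["calls"] = 3
same = store.get_or_create(HashKey("relation", 5, 16384))
store.fixed["wal"]["bytes"] = 8192
```

Because every table entry has the same small shape regardless of kind, a kind with a large counter body does not enlarge the entries of kinds with small ones.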

Suggested reading order, most cross-referenced first: read the cumulative system before the live view, for two reasons. The live view’s status reporting (backend_status.c) is physically co-located with, and partly initialized alongside, the cumulative subsystem; and the “fixed vs variable kind / DSA dshash” model is the harder idea that the rest builds on.

  1. postgres-cumulative-stats.md — the PG15 shared-memory subsystem: the PgStat_Kind taxonomy, fixed-vs-variable storage, the per-backend pending → flush → dshash path, and the checkpointer/startup file lifecycle. Read postgres-shared-memory-ipc.md (server-architecture) first if the DSA/dshash/LWLock substrate is unfamiliar.
  2. postgres-wait-events-progress.md — the live self-report: backend status, the code-generated wait-event taxonomy, and the progress array. Lighter and more self-contained; safe to read second.

Then fan out to the readers: postgres-autovacuum.md and the planner docs (both consume variable-numbered relation stats), and postgres-evolution-statistics.md for the historical arc.

Forward references — these module docs may not exist yet. One-line scope each:

| Module doc | What it will cover |
| --- | --- |
| postgres-cumulative-stats.md | The PG15 shared-memory cumulative statistics system, i.e. the replacement for the old stats-collector process: the PgStat_Kind taxonomy and PgStat_KindInfo dispatch, fixed-numbered (plain shmem) vs variable-numbered (DSA + dshash, keyed by PgStat_HashKey) storage, the process-local pending-counter → pgstat_report_stat() flush model, and the file lifecycle (checkpointer serializes pgstat.stat at shutdown, startup reloads or discards after a crash). |
| postgres-wait-events-progress.md | The live self-report plane: PgBackendStatus and pg_stat_activity backing (backend_status.c), the wait-event class taxonomy and its code generation from wait_event_names.txt via generate-wait_event_types.pl (wait_event.c), and per-command progress reporting with parallel leader/worker aggregation (backend_progress.c). |

  • server-architecture (postgres-overview-server-architecture.md) — the closest neighbor and the biggest handoff. It owns the processes that feed and persist these stats (checkpointer, bgwriter, startup, the backend lifecycle) and the shared-memory / DSA / dshash / LWLock substrate the cumulative system is built on. This section owns the counters; that one owns the machinery that emits and stores them.
  • txn-recovery (postgres-overview-txn-recovery.md) — supplies the subjects of several fixed-numbered stat kinds: WAL activity, SLRU caches, and the checkpointer’s work. SLRU and WAL stats counters live here; the SLRU caches and the WAL machinery themselves live there.
  • storage-engine (postgres-overview-storage-engine.md) — the IO stat kind reports on buffer-manager and async-IO activity; the relation stat kind (n_tup_ins/upd/del, n_dead_tup) describes heap mutation. The mechanisms behind those numbers (postgres-buffer-manager.md, postgres-aio.md, postgres-heap-am.md) are owned there.
  • query-processing (postgres-overview-query-processing.md) — a consumer: the planner reads variable-numbered relation stats to decide when cached plans and stale ANALYZE data need refreshing, and progress reporting surfaces CREATE INDEX / parallel-query progress.