PostgreSQL System Catalog & Caches — Section Overview
What this section covers
PostgreSQL is a catalog-driven engine. Types, operators, functions,
access methods, index strategies, namespaces, and the relations themselves
are not hard-coded: they are rows in pg_* system tables (pg_class,
pg_attribute, pg_type, pg_proc, pg_am, pg_index, …). That single
fact is the reason the engine is extensible at runtime (Axis 7 of
postgres-architecture-overview.md): teaching PostgreSQL a new type or
operator means inserting catalog rows (through CREATE TYPE or CREATE OPERATOR), not recompiling the server.
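To make the catalog-driven idea concrete, here is a minimal Python sketch. It is not PostgreSQL code: two dictionaries stand in for type and operator catalog rows, and the names TYPE_CATALOG, OPERATOR_CATALOG, evaluate, and register_point are hypothetical, invented for this illustration. It tries to show that supporting a new type needs only new rows, while the evaluator itself stays unchanged.

```python
import operator
from collections import namedtuple

Point = namedtuple("Point", "x y")

# Hypothetical stand-ins for pg_type and pg_operator rows (toy model only).
TYPE_CATALOG = {int: "int4", str: "text"}
OPERATOR_CATALOG = {
    ("+", "int4", "int4"): operator.add,
    ("||", "text", "text"): operator.add,  # string concatenation
}


def evaluate(op_name, left, right):
    """Resolve the operator through the catalog dicts, with no per-type branches."""
    left_type = TYPE_CATALOG.get(type(left))
    right_type = TYPE_CATALOG.get(type(right))
    impl = OPERATOR_CATALOG.get((op_name, left_type, right_type))
    if impl is None:
        raise LookupError(f"operator does not exist: {left_type} {op_name} {right_type}")
    return impl(left, right)


def point_add(a, b):
    """Implementation function for the new operator."""
    return Point(a.x + b.x, a.y + b.y)


def register_point():
    """'Teach' the toy engine a new type and operator by adding rows only."""
    TYPE_CATALOG[Point] = "point"
    OPERATOR_CATALOG[("+", "point", "point")] = point_add
```

Before register_point() runs, evaluate("+", Point(1, 2), Point(3, 4)) raises LookupError; afterwards the same call succeeds, and evaluate was never edited. The real engine resolves operators with far more machinery (implicit casts, ambiguity rules), which this sketch leaves out.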
The catalog-driven design is also the reason this subcategory exists. If every plan, every expression, and every tuple deform had to re-read the catalog from the heap, the catalog would be the bottleneck. So PostgreSQL puts three caches in front of the catalog (the relcache, the catcache, and the syscache layered over the catcache) and a shared-invalidation loop behind them to keep those caches honest across backends. This section covers exactly that stack, plus the two services that sit on top of the catalog rather than caching it:
- The catalog layout itself: what the pg_* tables are, how a row gets there, and the bootstrap path (pg_*.dat → .bki → initdb) that populates them before the server can read SQL. → postgres-system-catalogs.md
- The relcache: per-backend Relation descriptors, each assembled from several catalogs (pg_class + pg_attribute + pg_index + AM info + …) and cached because they are read on every access to a table. → postgres-relcache.md
- The catcache and syscache: the catcache caches individual catalog tuples keyed by lookup; the syscache is the typed dispatch table that names each cache (TYPEOID, PROCOID, …) and lsyscache the convenience accessors over it. → postgres-catcache-syscache.md
- Cache invalidation (sinval): the loop that ties the caches back to the shared-memory substrate: a catalog mutation queues messages, every backend drains and invalidates. → postgres-cache-invalidation.md
- Dependency tracking: pg_depend / pg_shdepend, the object-reference graph that makes DROP ... CASCADE, RESTRICT, and pg_dump ordering work. → postgres-dependency-tracking.md
- Namespace / search_path: schema resolution, turning an unqualified foo into a specific catalog OID using the active search_path. → postgres-namespace-search-path.md
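The last item, resolving an unqualified name through search_path, can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the server's code: CATALOG, effective_path, and resolve are hypothetical names, the OIDs for the user relations are made up, and temp-schema handling is left out entirely. The one behaviour it does try to mirror is that pg_catalog is searched first when it is not listed in search_path explicitly.

```python
# Toy stand-in for pg_namespace + pg_class lookups: (schema, relation) -> OID.
# User-relation OIDs are illustrative only.
CATALOG = {
    ("pg_catalog", "pg_class"): 1259,
    ("public", "foo"): 16384,
    ("app", "foo"): 16390,
    ("app", "pg_class"): 16400,
}


def effective_path(search_path):
    """Return the schemas actually searched, in order."""
    # pg_catalog is searched implicitly, and first, unless the user
    # placed it somewhere in search_path themselves.
    if "pg_catalog" in search_path:
        return list(search_path)
    return ["pg_catalog", *search_path]


def resolve(name, search_path):
    """Map a possibly schema-qualified relation name to an OID, or None."""
    if "." in name:
        # Qualified name: search_path is not consulted at all.
        schema, relname = name.split(".", 1)
        return CATALOG.get((schema, relname))
    for schema in effective_path(search_path):
        oid = CATALOG.get((schema, name))
        if oid is not None:
            return oid  # first schema on the path that has the name wins
    return None
```

With search_path set to ["app", "public"], resolve("foo", ...) returns the app copy; swapping the order returns the public one. A table named pg_class in app is shadowed by the system catalog unless pg_catalog is listed after app in the path.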
The sharp boundaries — what this section is not:
- Not the shared-memory substrate. The sinval queue lives in the fixed shared-memory segment and is one of its structures; the segment itself, PGPROC, procsignal, and the IPC primitives that carry the invalidation signal belong to server-architecture (postgres-shared-memory-ipc.md). This section owns the cache-coherence protocol; it hands off the transport downward.
- Not DDL execution. A CREATE / ALTER / DROP statement is what mutates the catalog and emits the invalidation messages, but the command machinery (tcop/utility.c, commands/tablecmds.c, commands/indexcmds.c) belongs to ddl-schema (postgres-ddl-execution.md, postgres-alter-table.md). This section owns the catalog as a data structure and a cache; ddl-schema owns the writers. Dependency tracking is the seam: this section describes the pg_depend graph; ddl-schema invokes performDeletion over it.
- Not the catalog as a transactional heap. Catalog tables are ordinary MVCC heap relations: they obey the same visibility, WAL, and vacuum rules as user tables. Those mechanisms belong to txn-recovery and storage-engine (postgres-mvcc-snapshots.md, postgres-heap-am.md). This section assumes that floor and builds the caching layer on it.
- Not the bootstrap tooling. This section explains that catalog rows come from pg_*.dat via genbki.pl and initdb; the codegen pipeline and the initdb binary themselves are utilities (postgres-initdb-bootstrap-genbki.md).
The layering
The catalog is the bottom; the caches stack on top of it; sinval closes the
loop from a writer’s commit back to every reader’s caches. Dependency tracking
and namespace resolution are services layered beside the caches, reading the
same catalog rows.
flowchart TB
subgraph BOOT["bootstrap (initdb, once) — handed to utilities"]
DAT["pg_*.dat / .bki<br/>genbki.pl codegen"]
end
subgraph CAT["the catalog (pg_* tables on the MVCC heap)"]
direction LR
SYSCAT["postgres-system-catalogs.md<br/>pg_class, pg_attribute, pg_type,<br/>pg_proc, pg_am, pg_index, ..."]
end
DAT -. "populates at initdb" .-> SYSCAT
subgraph CACHES["per-backend caches (private copies)"]
direction LR
RELC["postgres-relcache.md<br/>Relation descriptors<br/>(assembled from several catalogs)"]
CATC["postgres-catcache-syscache.md<br/>catcache tuples + syscache dispatch<br/>+ lsyscache accessors"]
end
SYSCAT --> RELC
SYSCAT --> CATC
subgraph SVC["services over the catalog"]
direction LR
DEP["postgres-dependency-tracking.md<br/>pg_depend / pg_shdepend graph<br/>(DROP CASCADE, pg_dump order)"]
NS["postgres-namespace-search-path.md<br/>search_path -> OID resolution"]
end
SYSCAT --> DEP
SYSCAT --> NS
EXEC["executor / planner / tuple deform<br/>(query-processing)"]
RELC --> EXEC
CATC --> EXEC
NS --> EXEC
subgraph LOOP["coherence loop"]
DDL["a DDL / catalog mutation<br/>(ddl-schema writer)"]
INVAL["postgres-cache-invalidation.md<br/>CacheInvalidate* -> register messages"]
SINV["sinval queue<br/>(shared memory — server-architecture)"]
end
DDL --> SYSCAT
DDL --> INVAL
INVAL --> SINV
SINV -. "every backend drains at AcceptInvalidationMessages" .-> RELC
SINV -.-> CATC
Three things to read off the diagram:
- Caches are private, the catalog is shared. Each backend’s relcache and catcache are in its own memory contexts. Apart from the catalog itself, the only shared state in this subcategory is the sinval message queue, which is why invalidation is a messaging problem, not a shared-cache-eviction problem.
- The loop is the spine of this section. A writer (a DDL command) mutates a catalog row and registers invalidation messages; at commit those go to the shared queue; every other backend drains the queue at AcceptInvalidationMessages and drops the stale relcache / catcache entries. postgres-cache-invalidation.md owns this loop end-to-end.
- Dependency tracking and namespace resolution are read-mostly services. They consult the catalog (pg_depend, pg_namespace) but are not part of the cache stack; they are grouped here because they are catalog logic, not query or storage logic.
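The first two points can be illustrated with a small Python model of private caches plus a shared message queue. Everything here is an assumption made for illustration: SharedInvalQueue, Backend, and their methods are hypothetical names, the queue is an unbounded list rather than a fixed-size ring, and transactions are collapsed into a single commit_update step. The sketch only aims to show why a reader can hold a stale entry until it drains the queue.

```python
class SharedInvalQueue:
    """Stand-in for the shared sinval queue: an append-only list of cache keys."""

    def __init__(self):
        self.messages = []


class Backend:
    """Hypothetical backend with a private cache over a shared catalog dict."""

    def __init__(self, catalog, queue):
        self.catalog = catalog    # shared: stands in for the pg_* tables
        self.queue = queue        # shared: stands in for the sinval queue
        self.cache = {}           # private: this backend's cached rows
        self.next_msg = 0         # how far this backend has read the queue
        self.catalog_reads = 0    # counts cache misses that went to the catalog

    def lookup(self, key):
        """Serve from the private cache; read the catalog only on a miss."""
        if key not in self.cache:
            self.cache[key] = self.catalog[key]
            self.catalog_reads += 1
        return self.cache[key]

    def commit_update(self, key, row):
        """Toy commit: change the catalog row, then publish an invalidation."""
        self.catalog[key] = row
        self.cache.pop(key, None)          # drop the writer's own stale copy
        self.queue.messages.append(key)    # other backends see this when they drain

    def accept_invalidation_messages(self):
        """Drain unread messages and drop the cache entries they name."""
        for key in self.queue.messages[self.next_msg:]:
            self.cache.pop(key, None)
        self.next_msg = len(self.queue.messages)


def demo():
    """Return what a reader sees before the write, after it, and after draining."""
    catalog, queue = {"t": "v1"}, SharedInvalQueue()
    reader, writer = Backend(catalog, queue), Backend(catalog, queue)
    before = reader.lookup("t")
    writer.commit_update("t", "v2")
    stale = reader.lookup("t")             # still served from the private cache
    reader.accept_invalidation_messages()
    fresh = reader.lookup("t")             # cache entry dropped, catalog re-read
    return before, stale, fresh
```

In this model the reader keeps returning the old row until it calls accept_invalidation_messages, which mirrors the idea that coherence depends on when each backend drains, not on evicting anything from a shared cache.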
Reading order
Cross-referenced-first: read the thing other docs lean on before the docs that lean on it.

1. postgres-system-catalogs.md: start here. Everything else in this section is either a cache over the catalog or a service reading it, so you need the layout (what pg_class / pg_attribute / pg_type hold, how OIDs work, how a row is bootstrapped) first.
2. postgres-catcache-syscache.md: the simpler cache (individual tuples), and the one the relcache itself depends on. Read before the relcache.
3. postgres-relcache.md: the heavier cache (whole-relation descriptors, assembled partly via syscache lookups). It is the most-referenced doc in the engine after the buffer manager, because every table access touches it.
4. postgres-cache-invalidation.md: only meaningful once you know what is cached (docs 2–3). This is the coherence loop that keeps both caches honest; it forward-references postgres-shared-memory-ipc.md for transport.
5. postgres-dependency-tracking.md: independent of the cache stack; read when you turn to DDL semantics (DROP CASCADE, pg_dump ordering).
6. postgres-namespace-search-path.md: the thinnest doc; the name-resolution front end. Read last, or first if your question is purely “how does search_path pick which foo?”.
Detail-doc summaries
Forward references: these module docs may not exist yet; the summaries are predictive (the planned scope of each).
| Module doc | One-line scope |
|---|---|
| postgres-system-catalogs.md | The pg_* system tables as data structures: pg_class, pg_attribute, pg_type, pg_proc, pg_am, pg_index and friends; OID assignment, shared vs database-local catalogs, relmapper for nailed/mapped relations, and how a row is created (heap_create_with_catalog, InsertPgClassTuple) and bootstrapped (pg_*.dat → .bki). |
| postgres-relcache.md | The relation cache: building a Relation descriptor from several catalogs (RelationBuildDesc), the RelationIdGetRelation hot path, “nailed” system-catalog entries, the init file for fast backend startup, and how RelationClearRelation / RelationCacheInvalidate respond to sinval. |
| postgres-catcache-syscache.md | The system-catalog tuple cache: catcache hash buckets and negative caching (SearchCatCache), the syscache typed dispatch table (SearchSysCache1..4, the TYPEOID / PROCOID cache ids), and the lsyscache convenience accessors that wrap common lookups. |
| postgres-cache-invalidation.md | The shared-invalidation loop: CacheInvalidateHeapTuple / CacheInvalidateRelcache registering messages, transactional buffering through CommandEndInvalidationMessages, broadcast via the sinvaladt.c ring (SIInsertDataEntries / SIGetDataEntries), and consumption at AcceptInvalidationMessages. |
| postgres-dependency-tracking.md | The object-dependency graph: pg_depend (local) and pg_shdepend (shared/global) edges, ObjectAddress identity, recordDependencyOn, and the performDeletion / findDependentObjects recursion that implements DROP ... CASCADE vs RESTRICT and pg_dump ordering. |
| postgres-namespace-search-path.md | Schema resolution: the active search_path, recomputeNamespacePath, qualified vs unqualified name lookup (RangeVarGetRelid), temp-schema and pg_catalog precedence, and fetch_search_path for callers that need the resolved list. |
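As a rough illustration of the dependency-tracking row, the following Python sketch walks a tiny dependency graph to contrast CASCADE with RESTRICT. It is a simplified model, not the server's algorithm: DEPENDS_ON, find_dependents, and drop are hypothetical names, objects are plain strings rather than ObjectAddress values, and every edge is treated as an ordinary dependency (the distinct dependency types that pg_depend records are ignored).

```python
# Hypothetical edges in the spirit of pg_depend rows:
# dependent object -> the objects it references.
DEPENDS_ON = {
    "view v": ["table t"],
    "view v2": ["view v"],
    "view w": ["table t"],
}


def find_dependents(target):
    """Return everything that transitively depends on target.

    The result is ordered so that each object appears before the
    objects it references, which is a safe deletion order.
    """
    order, seen = [], set()

    def visit(obj):
        for dependent, referenced in DEPENDS_ON.items():
            if obj in referenced and dependent not in seen:
                seen.add(dependent)
                visit(dependent)         # handle its own dependents first
                order.append(dependent)

    visit(target)
    return order


def drop(target, cascade=False):
    """Return the deletion order, or refuse if RESTRICT semantics apply."""
    dependents = find_dependents(target)
    if dependents and not cascade:
        raise RuntimeError(
            f"cannot drop {target}: {len(dependents)} other object(s) depend on it"
        )
    return dependents + [target]         # the target itself goes last
```

Calling drop("table t") raises because three views depend on the table, while drop("table t", cascade=True) returns an order in which each view is removed before the object it reads from. The same dependents-first ordering, reversed, is the intuition behind dump ordering: referenced objects must be created before the objects that reference them.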
Adjacent sections
- server-architecture (postgres-overview-server-architecture.md): owns the shared-memory segment the sinval queue lives in, and the procsignal / latch machinery that wakes a backend to drain it. This section’s invalidation loop is a client of that substrate; the seam is postgres-shared-memory-ipc.md.
- ddl-schema (postgres-overview-ddl-schema.md): the writers of the catalog. Every CREATE / ALTER / DROP mutates a pg_* table and emits the invalidation messages this section’s loop carries; DROP ... CASCADE walks the dependency graph this section describes. ddl-schema owns the commands; this section owns the data and caches they touch.
- txn-recovery (postgres-overview-txn-recovery.md): catalog tables are MVCC heap relations; their reads obey snapshot visibility, their writes are WAL-logged, and they are vacuumed. Catalog access uses a catalog snapshot (postgres-mvcc-snapshots.md) distinct from the query snapshot, a seam worth knowing when invalidation timing matters.
- query-processing (postgres-overview-query-processing.md): the primary reader of this section’s caches. The planner pulls statistics and relation shape from the relcache; expression evaluation resolves functions and operators through the syscache; name resolution during analysis uses the namespace resolver here.
- utilities (postgres-overview-utilities.md): owns the bootstrap tooling (genbki.pl, initdb) that populates the catalog this section reads, and pg_dump, whose object-ordering depends on the dependency graph here.