Skip to content

PostgreSQL CatCache & SysCache — Per-Backend Catalog Tuple Cache, Negative Entries, and the sinval Invalidation Loop

Contents:

A database engine spends a surprising fraction of its CPU in its own data dictionary — the set of tables that describe tables, columns, types, functions, operators, and every other schema object. In PostgreSQL these are the system catalogs (pg_class, pg_attribute, pg_proc, pg_type, and dozens of others). Almost every node of a query plan requires at least one catalog lookup: the parser checks that a relation name exists; the type resolver fetches type OIDs and input functions; the planner queries pg_statistic for column statistics; the executor resolves operator OIDs to function pointers. Without a cache, every one of these queries would go to disk through the buffer manager — a prohibitive overhead for a multi-join plan that touches hundreds of catalog rows.

The general concept is the data dictionary cache (or catalog cache): a main-memory structure that retains recently accessed catalog tuples so that the same lookup can be answered in microseconds rather than milliseconds. Database System Concepts (Silberschatz, ch. 25 §“Buffer Management”) names catalog caching as a standard technique alongside the buffer pool; Database Internals (Petrov, ch. 6) ties it to the broader observation that metadata lookups are on the critical path of every query and must be made as cheap as possible.

Two axes define the design space for a catalog cache:

  1. Scope: per-process vs. shared. Should the cache live in the backend’s private heap (one copy per session), in shared memory (shared across all backends), or in both? Per-process caches are simpler and require no locking for reads, but duplicate work across many concurrent backends. Shared caches are memory-efficient but require concurrency control on every access.

  2. Invalidation: eager vs. lazy. When a DDL statement changes a catalog row, cached copies elsewhere are stale. Eager invalidation (broadcast-and-wait) keeps caches consistent immediately but couples all backends to the DDL transaction. Lazy invalidation (mark-and- sweep on next use) decouples DDL from readers but requires each backend to detect staleness before returning a cached value.

PostgreSQL chooses per-process private caches with lazy invalidation mediated by the shared invalidation message queue (sinval). The choice reflects the same “shared-memory machine, processes as inhabitants” philosophy that governs the rest of the architecture (see postgres-architecture-overview.md axis 6): no per-cache locking is needed for normal reads, and the invalidation mechanism integrates cleanly into the existing sinval infrastructure rather than introducing a second coherence path.

An important theoretical nuance is the negative cache entry: an explicit record that a lookup for a given key returned no tuple. In a naive cache, a “miss” simply returns to the catalog. In a catalog with hundreds of concurrent backends doing the same repeated lookup for a non-existent row (common when checking whether an optional catalog entry exists), the cache can never help. A negative entry converts the miss into a cache hit on subsequent calls — the entry represents the absence of a tuple as positively as a real entry represents its presence. Invalidation handles negative entries exactly like positive ones: if a new tuple is inserted, the negative entry is marked dead.

Nearly every production database engine with a system catalog — Oracle, SQL Server, MySQL InnoDB, DB2 — converges on a set of patterns that sit between the textbook model and the specific implementation.

The cache is a hash table. The key is the catalog identity: which relation is being queried plus the specific key values used to look up a row (an OID, a name, or a composite). The value is the cached tuple (or, for a negative entry, a sentinel). The hash gives O(1) average lookup time without loading the entire catalog into memory.

Cached tuples are not freed at end-of-query. They live until no code holds a reference to them and they have been invalidated. The pattern is: caller increments a reference count on retrieval; caller decrements it on release; the cache frees the tuple only when the refcount drops to zero and the entry is marked dead. This lets the caller hold a raw pointer into the cache without copying and without worrying about the tuple disappearing underneath it.

Positive and negative entries managed identically

Section titled “Positive and negative entries managed identically”

Most engines that implement negative caching represent both positive and negative entries with the same struct, differing only in a negative flag. This uniformity means that invalidation, eviction, and refcount logic is identical for both kinds. Only the return value to the caller differs: positive returns the tuple pointer, negative returns NULL.

When a transaction commits a change to a catalog row, it must notify all other backends that their cached copies of that row are stale. The common mechanism is a shared message queue from which every backend reads on its next “safe point” (typically at the start of a command or at a wait point). The message carries enough information — usually just a hash value of the affected key — for each backend to mark the corresponding cached entry dead. The entry is not freed immediately if another caller holds a refcount; it is freed when the refcount drops to zero.

Separate layers: generic cache engine vs. named-cache facade

Section titled “Separate layers: generic cache engine vs. named-cache facade”

In most production engines the catalog cache has two levels: a generic tuple-cache layer that knows about hash buckets, refcounts, and invalidation, but not about any specific catalog; and a facade layer that maps human-readable catalog names to integer IDs and exposes convenience functions (get_type_by_oid(oid) rather than lookup(cache_id, datum)). The facade layer is what the rest of the engine calls; the generic layer is what the facade layer uses. This split lets the generic layer be tested and reasoned about independently, and lets the facade evolve its API without touching the cache internals.

Theory conceptPostgreSQL name
Data dictionary cacheCatCache (generic) + SysCache[] array (named)
Hash bucket arrayCatCache.cc_bucket[] — array of dlist_head
Cached tuple (positive)CatCTup with negative = false
Negative cache entryCatCTup with negative = true
Multi-row partial-key resultCatCList
Reference countCatCTup.refcount / CatCList.refcount
Invalidation messagesinval message carrying (cache_id, hash_value)
Named-cache facadesyscache.c + SysCache[] indexed by enum from MAKE_SYSCACHE
Convenience wrapperslsyscache.cget_attname, get_func_name, …

PostgreSQL’s catalog cache has three layers, each in its own source file:

lsyscache.c ← convenience API (get_attname, get_func_name, …)
syscache.c ← named-ID facade (SearchSysCache1/2/3/4, cacheinfo[])
catcache.c ← generic hash-table engine (CatCache, CatCTup, CatCList)

catcache.c knows nothing about which catalog it is caching. It receives a relation OID, an index OID, a key-count, and key-column attribute numbers, and builds a hash table. syscache.c populates the SysCache[] array using the cacheinfo[] table (generated from MAKE_SYSCACHE declarations in pg_*.h headers), and exposes the SearchSysCache1/2/3/4 family. lsyscache.c wraps syscache.c with one function per common query pattern, so callers write get_func_name(funcid) rather than manipulating HeapTuple directly.

Each catalog header declares its caches with MAKE_SYSCACHE:

// MAKE_SYSCACHE — src/include/catalog/pg_proc.h (line 143)
MAKE_SYSCACHE(PROCOID, pg_proc_oid_index, 128);
// MAKE_SYSCACHE — src/include/catalog/pg_type.h (line 268)
MAKE_SYSCACHE(TYPEOID, pg_type_oid_index, 64);
// MAKE_SYSCACHE — src/include/catalog/pg_class.h (line 162)
MAKE_SYSCACHE(RELOID, pg_class_oid_index, 128);
MAKE_SYSCACHE(RELNAMENSP, pg_class_relname_nsp_index, 128);

The macro registers: a symbolic cache ID (e.g. PROCOID), the underlying unique index OID expression, and the initial bucket count (always a power of two). The cacheinfo[] array in syscache.c is populated by #include "catalog/syscache_info.h", which expands those declarations into struct cachedesc entries holding (reloid, indoid, nkeys, key[], nbuckets). InitCatalogCache iterates that array and calls InitCatCache for each entry, building the SysCache[] array.

// struct catcache — src/include/utils/catcache.h
typedef struct catcache
{
int id; /* cache identifier (e.g. PROCOID) */
int cc_nbuckets; /* # of hash buckets (power of 2) */
TupleDesc cc_tupdesc; /* copied from relcache at first use */
dlist_head *cc_bucket; /* per-bucket doubly linked lists */
CCHashFN cc_hashfunc[CATCACHE_MAXKEYS]; /* per-key hash fns */
CCFastEqualFN cc_fastequal[CATCACHE_MAXKEYS]; /* per-key eq fns */
int cc_keyno[CATCACHE_MAXKEYS]; /* attr numbers */
int cc_nkeys; /* 1..CATCACHE_MAXKEYS */
int cc_ntup; /* current tuple count */
int cc_nlist; /* current list count */
dlist_head *cc_lbucket; /* list hash buckets (allocated on demand) */
Oid cc_reloid; /* relation being cached */
Oid cc_indexoid; /* supporting unique index */
bool cc_relisshared; /* shared across databases? */
ScanKeyData cc_skey[CATCACHE_MAXKEYS]; /* precomputed scan key */
/* ... CATCACHE_STATS fields omitted ... */
} CatCache;

cc_tupdesc is NULL until first use; CatalogCacheInitializeCache fills it by opening the relation. This lazy init avoids opening catalog relations before the relcache is ready, which matters during bootstrap.

// struct catctup — src/include/utils/catcache.h
typedef struct catctup
{
int ct_magic; /* CT_MAGIC = 0x57261502 */
uint32 hash_value; /* precomputed hash of keys */
Datum keys[CATCACHE_MAXKEYS]; /* extracted key values */
dlist_node cache_elem; /* link in per-bucket list (LRU order) */
int refcount; /* active references */
bool dead; /* marked for removal */
bool negative; /* negative cache entry? */
HeapTupleData tuple; /* tuple management header */
struct catclist *c_list; /* owning CatCList, or NULL */
CatCache *my_cache; /* owning CatCache */
/* aligned tuple data follows (unless negative) */
} CatCTup;

The CatCTup and its tuple data are allocated in a single palloc in CacheMemoryContext, so the tuple bytes immediately follow the struct. For negative entries, no tuple data is stored; keys[] hold the searched-for values (separately palloc’d, freed when the entry is removed).

A key design choice: per-bucket dlist_head lists are maintained in LRU order. SearchCatCacheInternal moves a hit to the front of its bucket list with dlist_move_head, so the most recently used entries are always at the head. This is cheap (two pointer swaps) and effectively makes the cache approximate-LRU within each bucket.

When a caller needs all tuples sharing a partial key — for example, all pg_attribute rows for a given relidSearchCatCacheList builds a CatCList:

// struct catclist — src/include/utils/catcache.h
typedef struct catclist
{
int cl_magic; /* CL_MAGIC = 0x52765103 */
uint32 hash_value; /* hash of the partial key */
dlist_node cache_elem; /* link in per-catcache list bucket */
Datum keys[CATCACHE_MAXKEYS]; /* partial-key values */
int refcount;
bool dead;
bool ordered; /* members in index order? */
short nkeys; /* # of key columns specified */
int n_members;
CatCache *my_cache;
CatCTup *members[FLEXIBLE_ARRAY_MEMBER]; /* member tuples */
} CatCList;

A CatCList entry holds pointers to the individual CatCTup entries it contains; those entries also exist in the per-bucket hash lists for point lookups. A CatCTup can belong to at most one CatCList. The list hash buckets (cc_lbucket) are allocated lazily on the first list search, and grow by doubling when fill factor exceeds 2.

The hot path through the cache avoids any catalog access:

// SearchCatCacheInternal — src/backend/utils/cache/catcache.c (line 1402)
static inline HeapTuple
SearchCatCacheInternal(CatCache *cache, int nkeys,
Datum v1, Datum v2, Datum v3, Datum v4)
{
uint32 hashValue;
Index hashIndex;
dlist_iter iter;
CatCTup *ct;
ConditionalCatalogCacheInitializeCache(cache); /* one-time init */
hashValue = CatalogCacheComputeHashValue(cache, nkeys, v1, v2, v3, v4);
hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets); /* bitmask */
dlist_foreach(iter, &cache->cc_bucket[hashIndex])
{
ct = dlist_container(CatCTup, cache_elem, iter.cur);
if (ct->dead) continue;
if (ct->hash_value != hashValue) continue; /* early exit */
if (!CatalogCacheCompareTuple(...)) continue;
dlist_move_head(...); /* promote to LRU front */
if (!ct->negative)
{
ct->refcount++;
ResourceOwnerRememberCatCacheRef(...);
return &ct->tuple; /* ← cache HIT, positive */
}
return NULL; /* ← cache HIT, negative */
}
return SearchCatCacheMiss(...); /* ← cache MISS */
}

The function is pg_attribute_always_inline and is specialized into SearchCatCache1/2/3/4 with a fixed nkeys, allowing the compiler to unroll the hash computation loop. The hash is built by XOR-ing per-key hashes with fixed bit-rotations (8, 16, 24 bits for keys 2, 3, 4). Hash and equality functions are hard-coded fast-path variants for the common key types (OID, int4, int2, name, text) in GetCCHashEqFuncs, bypassing fmgr overhead on the critical path.

Figure 1 — Cache lookup decision tree

flowchart TD
    A[SearchSysCache1/2/3/4] --> B[ConditionalCatalogCacheInitializeCache]
    B --> C{cc_tupdesc == NULL?}
    C -- yes --> D[CatalogCacheInitializeCache<br/>open rel, copy TupleDesc]
    C -- no --> E[ComputeHashValue + HASH_INDEX]
    D --> E
    E --> F[Scan cc_bucket dlist]
    F --> G{dead or hash mismatch?}
    G -- skip --> F
    G -- match keys? --> H{negative?}
    H -- no --> I[bump refcount\nreturn &ct->tuple]
    H -- yes --> J[return NULL\nneg-cache HIT]
    F --> K{exhausted?}
    K -- yes --> L[SearchCatCacheMiss]
    L --> M[systable_beginscan on catalog index]
    M --> N{found?}
    N -- yes --> O[CatalogCacheCreateEntry\npositive entry]
    N -- no --> P[CatalogCacheCreateEntry\nnegative entry]
    O --> I
    P --> J

Figure 1 — The two-branch decision tree: hit on a positive entry returns the tuple directly; hit on a negative entry returns NULL without a catalog scan; miss calls SearchCatCacheMiss which opens the catalog via systable_beginscan.

On a cache miss, SearchCatCacheMiss opens the catalog relation with table_open(AccessShareLock) and performs an index scan via systable_beginscan. IndexScanOK gates whether an index scan is actually safe (it returns false for pg_am always, and for pg_index until criticalRelcachesBuilt, to avoid bootstrapping recursion):

// SearchCatCacheMiss — src/backend/utils/cache/catcache.c (line 1510)
relation = table_open(cache->cc_reloid, AccessShareLock);
do {
scandesc = systable_beginscan(relation, cache->cc_indexoid,
IndexScanOK(cache), NULL,
nkeys, cur_skey);
ct = NULL;
stale = false;
while (HeapTupleIsValid(ntp = systable_getnext(scandesc)))
{
ct = CatalogCacheCreateEntry(cache, ntp, NULL,
hashValue, hashIndex);
if (ct == NULL) { stale = true; break; } /* toast retry */
ct->refcount++;
break; /* unique index: at most one match */
}
systable_endscan(scandesc);
} while (stale);
table_close(relation, AccessShareLock);
if (ct == NULL) /* not found → negative entry */
ct = CatalogCacheCreateEntry(cache, NULL, arguments,
hashValue, hashIndex);

The stale retry loop handles a subtle race: if the tuple has out-of- line TOAST fields, CatalogCacheCreateEntry calls toast_flatten_tuple which can process AcceptInvalidationMessages, potentially marking the entry-in-progress dead. In that case CatalogCacheCreateEntry returns NULL and the scan restarts. This is the CatCInProgress stack mechanism: during entry creation a CatCInProgress node is pushed onto catcache_in_progress_stack; if an invalidation arrives while that node is live, it sets node->dead = true, which CatalogCacheCreateEntry checks before inserting the new entry.

// CatalogCacheCreateEntry — src/backend/utils/cache/catcache.c (line 2144)
ct = (CatCTup *) palloc(sizeof(CatCTup) + MAXIMUM_ALIGNOF + dtp->t_len);
ct->tuple.t_data = (HeapTupleHeader) MAXALIGN(((char *) ct) + sizeof(CatCTup));
memcpy((char *) ct->tuple.t_data, (const char *) dtp->t_data, dtp->t_len);
/* extract keys — they point into the tuple for positive entries */
for (i = 0; i < cache->cc_nkeys; i++)
ct->keys[i] = heap_getattr(&ct->tuple, cache->cc_keyno[i], ...);

The key insight: CatCTup + its tuple bytes are one allocation in CacheMemoryContext. The key Datum values point directly into the tuple data (for fixed-length by-value types they are copied inline; for variable-length they point into the heap tuple). For negative entries no tuple is stored, and the keys are separately palloc’d copies.

syscache.c wraps catcache.c with integer cache IDs:

// InitCatalogCache — src/backend/utils/cache/syscache.c (line 110)
void InitCatalogCache(void)
{
for (cacheId = 0; cacheId < SysCacheSize; cacheId++)
{
SysCache[cacheId] = InitCatCache(cacheId,
cacheinfo[cacheId].reloid,
cacheinfo[cacheId].indoid,
cacheinfo[cacheId].nkeys,
cacheinfo[cacheId].key,
cacheinfo[cacheId].nbuckets);
}
/* build sorted OID arrays for RelationHasSysCache / RelationSupportsSysCache */
qsort(SysCacheRelationOid, SysCacheRelationOidSize, sizeof(Oid), oid_compare);
/* ... de-dup ... */
CacheInitialized = true;
}

The public API is the SearchSysCache1/2/3/4 family, which delegates directly to SearchCatCache1/2/3/4:

// SearchSysCache1 — src/backend/utils/cache/syscache.c (line 221)
HeapTuple SearchSysCache1(int cacheId, Datum key1)
{
Assert(cacheId >= 0 && cacheId < SysCacheSize &&
PointerIsValid(SysCache[cacheId]));
Assert(SysCache[cacheId]->cc_nkeys == 1);
return SearchCatCache1(SysCache[cacheId], key1);
}

Convenience macros SearchSysCacheCopy1, SearchSysCacheExists1, GetSysCacheOid1, and GetSysCacheHashValue1 are defined in syscache.h as inline wrappers that pad unused key arguments to zero.

SearchSysCacheLocked1 (line 287) is a PG18-era addition for inplace-updated tables (pg_class, pg_database, pg_authid, etc.) that must be locked before modification. It loops SearchSysCache1 → LockTuple → re-fetch until the TID is stable, so callers get both the tuple and the tuple lock in one atomic step.

SysCacheGetAttr and SysCacheGetAttrNotNull

Section titled “SysCacheGetAttr and SysCacheGetAttrNotNull”
// SysCacheGetAttr — src/backend/utils/cache/syscache.c (line 600)
Datum SysCacheGetAttr(int cacheId, HeapTuple tup,
AttrNumber attributeNumber, bool *isNull)
{
/* Assert cache's TupleDesc matches the tuple */
Assert(...);
return heap_getattr(tup, attributeNumber,
SysCache[cacheId]->cc_tupdesc, isNull);
}

Because cached tuples live in CacheMemoryContext (not the transaction context), the returned Datum pointer is valid for the duration of the backend — unless the entry is invalidated. Callers that need to hold the value past the next catalog access must copy it with datumCopy.

lsyscache.c provides ~150 typed wrapper functions so callers do not manipulate HeapTuple directly. The pattern is uniform:

// get_func_name — src/backend/utils/cache/lsyscache.c (line 1786)
char *
get_func_name(Oid funcid)
{
HeapTuple tp;
char *result;
tp = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(tp))
return NULL;
result = pstrdup(NameStr(((Form_pg_proc) GETSTRUCT(tp))->proname));
ReleaseSysCache(tp);
return result;
}

The function: look up by OID, extract the typed struct with GETSTRUCT, copy the field out of the cache tuple into the caller’s context, release the cache reference, return. The pstrdup is critical: the cached NameStr pointer is into CacheMemoryContext — copying it into the current memory context ensures the value survives potential subsequent catalog access that might trigger invalidation.

Figure 2 — Layer call flow for a typical catalog lookup

sequenceDiagram
    participant E as Executor / Planner
    participant L as lsyscache.c
    participant S as syscache.c
    participant C as catcache.c
    participant K as Catalog (heap)

    E->>L: get_func_name(funcid)
    L->>S: SearchSysCache1(PROCOID, funcid)
    S->>C: SearchCatCache1(SysCache[PROCOID], funcid)
    C->>C: ComputeHashValue + bucket scan
    alt cache HIT
        C-->>S: &ct->tuple (refcount++)
        S-->>L: HeapTuple
        L->>S: ReleaseSysCache(tp)
        S->>C: ReleaseCatCache (refcount--)
        L-->>E: pstrdup(proname)
    else cache MISS
        C->>K: table_open + systable_beginscan
        K-->>C: HeapTuple from index scan
        C->>C: CatalogCacheCreateEntry (palloc in CacheMemoryContext)
        C-->>S: &ct->tuple (refcount++)
        S-->>L: HeapTuple
        L->>S: ReleaseSysCache(tp)
        S->>C: ReleaseCatCache (refcount--)
        L-->>E: pstrdup(proname)
    end

Figure 2 — Both hit and miss paths end with a ReleaseSysCache call that decrements the refcount. The caller receives a pstrdup copy of the name, not a pointer into the cache.

Invalidation: CatCacheInvalidate and the sinval loop

Section titled “Invalidation: CatCacheInvalidate and the sinval loop”

When a transaction commits a catalog change, inval.c queues a sinval message containing (cacheId, hashValue) for each affected cache entry. At the next safe point — AcceptInvalidationMessages, called from transaction start, CHECK_FOR_INTERRUPTS, and similar sites — each backend processes its queue and calls SysCacheInvalidateCatCacheInvalidate:

// CatCacheInvalidate — src/backend/utils/cache/catcache.c (line 636)
void CatCacheInvalidate(CatCache *cache, uint32 hashValue)
{
/* 1. Invalidate ALL CatCLists in this cache (too hard to be selective) */
for (int i = 0; i < cache->cc_nlbuckets; i++)
dlist_foreach_modify(iter, &cache->cc_lbucket[i])
{
CatCList *cl = ...;
if (cl->refcount > 0) cl->dead = true;
else CatCacheRemoveCList(cache, cl);
}
/* 2. Invalidate matching tuples in the specific hash bucket */
hashIndex = HASH_INDEX(hashValue, cache->cc_nbuckets);
dlist_foreach_modify(iter, &cache->cc_bucket[hashIndex])
{
CatCTup *ct = ...;
if (hashValue == ct->hash_value)
{
if (ct->refcount > 0 || (ct->c_list && ct->c_list->refcount > 0))
ct->dead = true; /* in use: mark, free later */
else
CatCacheRemoveCTup(cache, ct); /* not in use: free now */
}
}
/* 3. Mark any in-progress build entries dead */
for (CatCInProgress *e = catcache_in_progress_stack; e; e = e->next)
if (e->cache == cache && (e->list || e->hash_value == hashValue))
e->dead = true;
}

Three points to observe: first, all CatCList entries in the affected cache are invalidated regardless of key, because it is too expensive to determine which partial-key lists are affected. Second, a tuple with refcount > 0 is merely marked dead — it is not freed until ReleaseCatCache decrements the refcount to zero. Third, the match is on hash_value, not on the tuple TID; this accepts a small risk of false-positive invalidations (hash collisions) in exchange for correctness after VACUUM FULL (which changes TIDs).

Figure 3 — Invalidation state machine for a CatCTup entry

stateDiagram-v2
    [*] --> Live : CatalogCacheCreateEntry
    Live --> Live : SearchCatCacheInternal hit\nrefcount++
    Live --> Live : ReleaseCatCache\nrefcount--
    Live --> Dead_Referenced : CatCacheInvalidate\nrefcount > 0
    Live --> Freed : CatCacheInvalidate\nrefcount == 0
    Dead_Referenced --> Freed : ReleaseCatCache\nrefcount drops to 0
    Freed --> [*] : CatCacheRemoveCTup pfree

Figure 3 — An entry transitions from Live to Dead_Referenced when invalidated while in use, and is freed only when the last caller releases it.

Subsystems that maintain their own derived state — like the relcache and the type cache — register callbacks via RegisterSyscacheCallback. When CatCacheInvalidate marks entries dead, CallSyscacheCallbacks fires each registered callback for the affected cache ID. The relcache uses this to invalidate RelationData entries; the type cache uses it to discard derived type information. This is the mechanism by which a single sinval event propagates through all layers of the in-memory catalog hierarchy.

The bucket count starts at the value from cacheinfo[] and doubles when the load factor exceeds a heuristic threshold. RehashCatCache allocates a new bucket array, iterates all existing entries, re-places each by HASH_INDEX(ct->hash_value, newnbuckets), and frees the old array. This never requires blocking: it happens synchronously inside the backend that triggered the rehash. Similarly, RehashCatCacheLists doubles the list bucket array.

  • InitCatCache — allocates CatCache struct in CacheMemoryContext, sets cc_reloid, cc_indexoid, cc_nkeys, cc_keyno, cc_nbuckets; does not open the relation.
  • CatalogCacheInitializeCache — opens the catalog relation (once, on first use), copies TupleDesc, resolves per-key hash/eq functions via GetCCHashEqFuncs, fills cc_skey[].
  • SearchCatCacheInternal — inline hot path: hash → bucket scan → LRU promote → refcount. Dispatched from SearchCatCache1/2/3/4.
  • SearchCatCacheMisspg_noinline cold path: systable_beginscan on the index, CatalogCacheCreateEntry for positive entry or negative entry on not-found.
  • CatalogCacheCreateEntry — single-palloc for CatCTup + tuple bytes; handles TOAST flattening with CatCInProgress stale-detection.
  • SearchCatCacheList — partial-key multi-row fetch into CatCList; retries if an invalidation arrives mid-build via in_progress_ent.
  • CatCacheInvalidate — marks/frees entries by hash value; invalidates all lists; marks in-progress entries.
  • ReleaseCatCache / ReleaseCatCacheList — decrement refcount; free if dead and refcount==0.
  • RehashCatCache / RehashCatCacheLists — double bucket arrays, redistribute entries.
  • PrepareToInvalidateCacheTuple — called by inval.c on tuple insert/update/delete to compute the hash value to queue.
  • InitCatalogCache — builds SysCache[] from cacheinfo[]; populates sorted SysCacheRelationOid[] and SysCacheSupportingRelOid[] arrays.
  • SearchSysCache1/2/3/4 — thin wrappers calling SearchCatCache1/2/3/4.
  • SearchSysCacheLocked1 — acquire LOCKTAG_TUPLE at InplaceUpdateTupleLock then fetch; loop until TID is stable.
  • SysCacheGetAttr / SysCacheGetAttrNotNullheap_getattr with the cache’s cc_tupdesc.
  • SysCacheInvalidate — index into SysCache[], call CatCacheInvalidate; also fires CallSyscacheCallbacks.
  • RelationHasSysCache / RelationSupportsSysCache — binary search of SysCacheRelationOid[] / SysCacheSupportingRelOid[].

Representative wrappers:

  • get_attname(relid, attnum)ATTNUM cache, returns pstrdup of attname.
  • get_opname(opno)OPEROID cache, returns pstrdup of oprname.
  • get_func_name(funcid)PROCOID cache, returns pstrdup of proname.
  • get_rel_name(relid)RELOID cache, returns pstrdup of relname.
  • get_namespace_name(nspid)NAMESPACEOID cache, returns pstrdup of nspname.

Position hints (commit 273fe94, 2026-06-05)

Section titled “Position hints (commit 273fe94, 2026-06-05)”
SymbolFileLine
struct catcachesrc/include/utils/catcache.h44
struct catctupsrc/include/utils/catcache.h88
struct catclistsrc/include/utils/catcache.h159
struct catcacheheadersrc/include/utils/catcache.h184
InitCatCachesrc/backend/utils/cache/catcache.c889
CatalogCacheInitializeCachesrc/backend/utils/cache/catcache.c1126
SearchCatCacheInternalsrc/backend/utils/cache/catcache.c1402
SearchCatCacheMisssrc/backend/utils/cache/catcache.c1510
CatalogCacheCreateEntrysrc/backend/utils/cache/catcache.c2144
SearchCatCacheListsrc/backend/utils/cache/catcache.c1730
CatCacheInvalidatesrc/backend/utils/cache/catcache.c636
ReleaseCatCachesrc/backend/utils/cache/catcache.c1658
ReleaseCatCacheListsrc/backend/utils/cache/catcache.c2104
PrepareToInvalidateCacheTuplesrc/backend/utils/cache/catcache.c2387
InitCatalogCachesrc/backend/utils/cache/syscache.c110
SearchSysCache1src/backend/utils/cache/syscache.c221
SearchSysCacheLocked1src/backend/utils/cache/syscache.c287
SysCacheGetAttrsrc/backend/utils/cache/syscache.c600
SysCacheInvalidatesrc/backend/utils/cache/syscache.c698
RelationHasSysCachesrc/backend/utils/cache/syscache.c745
get_func_namesrc/backend/utils/cache/lsyscache.c1786
get_attnamesrc/backend/utils/cache/lsyscache.c957
get_rel_namesrc/backend/utils/cache/lsyscache.c2106
get_namespace_namesrc/backend/utils/cache/lsyscache.c3544
  • CatCTup and its tuple bytes are a single allocation in CacheMemoryContext. Verified in CatalogCacheCreateEntry (catcache.c:2216): palloc(sizeof(CatCTup) + MAXIMUM_ALIGNOF + dtp->t_len), with ct->tuple.t_data pointing to MAXALIGN(((char *) ct) + sizeof(CatCTup)). This means the tuple bytes are directly after the struct header at a cache-line-aligned address.

  • Negative entries store separately-palloc’d keys; positive entries point into the tuple. Verified: for positive entries (ntp != NULL), ct->keys[i] = heap_getattr(...) points into the inlined tuple. For negative entries, CatCacheCopyKeys allocates separate datums. CatCacheRemoveCTup calls CatCacheFreeKeys only for negative entries.

  • CatCList invalidation is total — all lists in the cache are invalidated on any invalidation event. Verified in CatCacheInvalidate (catcache.c:636): the function iterates all cc_nlbuckets list buckets unconditionally. The comment says “it’s too hard to tell which searches might still be correct, so just zap ‘em all.”

  • cc_lbucket (list hash buckets) is allocated lazily on first SearchCatCacheList call. Verified: InitCatCache sets cp->cc_lbucket = NULL (catcache.c:949). SearchCatCacheList checks if (cache->cc_lbucket == NULL) and allocates an initial 16-bucket array on the first call.

  • The CatCInProgress stack is process-local. Verified: declared as static CatCInProgress *catcache_in_progress_stack = NULL (catcache.c:61). There is no shared-memory component; the race it protects against is intra-process (invalidation arriving during toast_flatten_tuple).

  • SearchSysCacheLocked1 loops until TID stability. Verified in syscache.c:287–376: the function performs SearchSysCache1 → LockTuple → re-fetch → compare TID in a loop. This is the mechanism described in README.tuplock §“Locking to write inplace-updated tables.” Added as part of PG16/17 inplace-update hardening.

  • MAKE_SYSCACHE declarations live in individual pg_*.h catalog headers. Verified by grep: MAKE_SYSCACHE(PROCOID, ...) in pg_proc.h:143, MAKE_SYSCACHE(TYPEOID, ...) in pg_type.h:268, MAKE_SYSCACHE(RELOID, ...) in pg_class.h:162. The syscache_info.h file is generated/included to populate cacheinfo[].

  1. Total count of SysCacheSize at commit 273fe94. The exact count of cache IDs is in the generated catalog/syscache_ids.h; the file was not found at the expected path under src/include/catalog/ in the working tree. Investigation path: grep -r SysCacheSize src/include/ to locate the generated header or enum definition.

  2. RegisterSyscacheCallback capacity limit. inval.c stores registered callbacks in a fixed array. The capacity limit and what happens on overflow (assert vs. error vs. expand) is not verified. Investigation path: read CallSyscacheCallbacks in inval.c and the array declaration.

  3. debug_discard_caches GUC behavior. ResetCatalogCachesExt(true) is called when debug_discard_caches is set. The exact GUC name and whether it is accessible in PG18 release builds (it requires USE_ASSERT_CHECKING?) is not verified.

Beyond PostgreSQL — Comparative Designs & Research Frontiers

Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”
  • Oracle row cache (shared). Oracle’s data dictionary cache lives in the shared pool (SGA), shared across all sessions. This avoids per- session redundancy but requires a latch on every cache access. The tradeoff: Oracle’s heavy concurrent workloads make the shared approach worth the latch cost; PostgreSQL’s per-process model avoids any lock for reads at the cost of duplicating entries across sessions. A side- by-side measurement under high-concurrency DDL-read workloads would quantify the tradeoff.

  • MySQL InnoDB table definition cache (table_definition_cache). A shared LRU cache of TABLE_SHARE objects, also shared across threads. The sharing granularity is the whole table definition rather than individual catalog tuples.

  • CockroachDB lease-based descriptor cache. CockroachDB uses range-leased schema descriptors: a node holds a time-bounded lease on a schema version and serves all requests from its local copy until the lease expires. This is closer to the eager-invalidation end of the spectrum, with a lease protocol taking the place of the sinval queue. Contrast with PostgreSQL’s demand-driven lazy invalidation.

  • Research: fine-grained list invalidation. PostgreSQL’s CatCacheInvalidate invalidates all CatCList entries on any catalog change, regardless of which partial key was affected. This is conservative and sometimes excessive. A finer-grained approach — hash the partial key and invalidate only matching lists — would reduce unnecessary list rebuilds under DDL workloads on large catalogs. No known published study quantifies this for PostgreSQL specifically.

  • Research: shared catalog cache. PostgreSQL has long had proposals for a shared catalog cache to reduce per-backend memory usage and catalog-scan startup overhead (especially relevant for connection poolers and short-lived sessions). The implementation difficulty is managing refcounts and invalidation across a shared structure without introducing a new global latch. Some work in this area appeared in pgsql-hackers discussions (~2019–2022); the topic remains open.

Source files (commit 273fe94, REL_18_STABLE)

Section titled “Source files (commit 273fe94, REL_18_STABLE)”
  • src/backend/utils/cache/catcache.c — generic catalog cache engine
  • src/backend/utils/cache/syscache.c — named-ID facade and SysCache[] array
  • src/backend/utils/cache/lsyscache.c — typed convenience wrappers
  • src/include/utils/catcache.hCatCache, CatCTup, CatCList struct definitions
  • src/include/utils/syscache.hSearchSysCache* prototypes and macro wrappers
  • src/include/utils/lsyscache.hget_attname, get_func_name, … prototypes
  • src/include/catalog/pg_class.h, pg_proc.h, pg_type.h, pg_attribute.h, pg_namespace.hMAKE_SYSCACHE declarations
  • Database Internals (Petrov, 2019) — ch. 6 §“Metadata Management”: catalog cache as critical-path optimization.
  • Database System Concepts (Silberschatz, 7e) — ch. 25 §“Buffer Management”: data dictionary cache alongside buffer pool.
  • knowledge/code-analysis/postgres/postgres-cache-invalidation.md — the sinval message queue and inval.c in full detail.
  • knowledge/code-analysis/postgres/postgres-relcache.md — relcache as the higher-level consumer of catcache; RegisterSyscacheCallback usage.
  • knowledge/code-analysis/postgres/postgres-system-catalogs.md — catalog layout, MAKE_SYSCACHE macro, .bki bootstrap.
  • knowledge/code-analysis/postgres/postgres-memory-contexts.mdCacheMemoryContext lifetime and allocation patterns.