PostgreSQL Table Access Method — Pluggable Storage Dispatch via TableAmRoutine
Contents:
- Theoretical Background
- Common DBMS Design
- PostgreSQL’s Approach
- Source Walkthrough
- Source verification (as of 2026-06-05)
- Beyond PostgreSQL — Comparative Designs & Research Frontiers
- Sources
Theoretical Background
A database engine’s storage layer sits between the logical model (relations, tuples, snapshots) and the physical medium. Every relation needs a concrete storage implementation that answers: where is a tuple pinned in memory, how is it laid out on disk, how do scans advance, and what happens when a tuple is inserted, updated, or deleted. The design question is whether that implementation is fixed or replaceable.
In the fixed model the executor code calls storage functions directly. This is simple and fast but conflates two concerns: what the executor needs (a cursor over visible tuples, a way to insert a row) with how those needs are met (a heap of slotted pages with MVCC tuples). Separating the concerns requires an indirection layer: the executor calls a stable interface, and the storage implementation is selected at relation-open time and wired in through a dispatch table. The resulting architecture is a classic vtable or strategy object pattern from object-oriented design (Database Internals, Petrov, ch. 3 on pluggable components; Database System Concepts, Silberschatz et al., 7e, ch. 13 on storage structures).
The practical motivation for pluggable storage in an OLTP system is workload diversity. The no-overwrite MVCC heap that PostgreSQL has used since Stonebraker’s Berkeley POSTGRES lineage is excellent for mixed read-write OLTP but carries costs (tuple bloat, vacuum overhead, no native column compression) that columnar or in-memory alternatives can eliminate for analytic or append-only workloads. If the storage model is hardwired into the executor, replacing it means forking the whole engine. An indirection layer makes each storage implementation a plug-in that shares the query pipeline.
The classical treatment of pluggable storage is in Stonebraker & Rowe 1986
(The Design of POSTGRES), which envisioned an “abstract data manager”
interface separating query processing from storage. PostgreSQL 12 realized
a version of that vision with TableAmRoutine. The analogous Index Access
Method interface (IndexAmRoutine in amapi.h) predates it; the Table AM
interface is modeled on the same shape.
Common DBMS Design
Several recurring design conventions appear in pluggable storage interfaces across systems.
Function-pointer tables as the dispatch mechanism
The idiomatic C implementation of a vtable is a struct of function pointers.
Each slot in the struct corresponds to one operation on the abstraction; the
concrete implementation fills in the pointers with its own functions. A
relation (table object in memory) carries a pointer to the applicable vtable,
so dispatcher code reads relation->am_routine->some_op(...) without a
switch statement.
The virtues are: (1) the caller never needs to know which AM is in play; (2)
new AMs are registered without patching the core code; (3) the vtable is
typically a static const struct that lives for the life of the process, so
dispatch costs only two pointer chases (relation → vtable → function).
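A minimal sketch of the pattern in plain C. Every name here (ToyAmRoutine, ToyRelation, toy_heap_methods, toy_columnar_methods) is hypothetical and is not a PostgreSQL symbol, and the byte counts are illustrative only; the point is the two dereferences on the call path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vtable: one function pointer per operation. */
typedef struct ToyAmRoutine
{
    const char *name;
    int       (*tuple_width)(int natts);  /* bytes for a row of natts ints */
} ToyAmRoutine;

/* Hypothetical in-memory relation object carrying its vtable pointer. */
typedef struct ToyRelation
{
    int                 natts;
    const ToyAmRoutine *am_routine;
} ToyRelation;

/* Two illustrative implementations of the same slot. */
static int heap_width(int natts)     { return 23 + 4 * natts; }  /* header + payload */
static int columnar_width(int natts) { return 4 * natts; }       /* payload only */

static const ToyAmRoutine toy_heap_methods     = { "toy_heap", heap_width };
static const ToyAmRoutine toy_columnar_methods = { "toy_columnar", columnar_width };

/* Dispatcher: relation -> vtable -> function, with no switch on the AM kind. */
static int
toy_tuple_width(const ToyRelation *rel)
{
    assert(rel != NULL && rel->am_routine != NULL);
    return rel->am_routine->tuple_width(rel->natts);
}
```

The caller of toy_tuple_width never names an implementation; swapping the am_routine pointer is the only change needed to switch storage models.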
Mandatory vs. optional callbacks
Not every operation makes sense for every AM. An in-memory AM might have no concept of vacuum; a columnar AM might not support row-level locking. The convention is to mark a core set of callbacks as mandatory (asserted non-NULL by the routine validator) and allow the rest to be NULL-able (optional). The validator runs once at registration time and fails fast if a required slot is missing, giving a clean error instead of a null-pointer crash at runtime.
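The convention can be sketched as follows. This is a toy validator with hypothetical names (ToyAmRoutine, toy_validate_routine, toy_vacuum_if_supported), not PostgreSQL's validator; it assumes two mandatory slots and one optional slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vtable with two mandatory slots and one optional slot. */
typedef struct ToyAmRoutine
{
    int  (*scan_begin)(void);         /* mandatory */
    int  (*tuple_insert)(int value);  /* mandatory */
    void (*vacuum)(void);             /* optional: may be NULL */
} ToyAmRoutine;

/* Sample callbacks so the sketch can be exercised. */
static int toy_scan_begin(void)        { return 0; }
static int toy_tuple_insert(int value) { return value; }

/*
 * Registration-time check. Returns NULL when the routine is usable,
 * otherwise the name of the first missing mandatory slot.
 */
static const char *
toy_validate_routine(const ToyAmRoutine *r)
{
    if (r == NULL)
        return "routine";
    if (r->scan_begin == NULL)
        return "scan_begin";
    if (r->tuple_insert == NULL)
        return "tuple_insert";
    return NULL;                      /* vacuum is optional, so not checked */
}

/* Callers of an optional slot test for NULL before dispatching. */
static bool
toy_vacuum_if_supported(const ToyAmRoutine *r)
{
    assert(r != NULL);
    if (r->vacuum == NULL)
        return false;
    r->vacuum();
    return true;
}
```

Checking once at registration keeps the hot dispatch path free of NULL tests for mandatory slots; only optional slots pay for a test at the call site.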
A shared DML outcome vocabulary
When a DML operation (update, delete, lock) attempts to modify a tuple that another transaction is also touching, there are several distinct outcomes: the operation succeeded, the target was already modified by the same command, the target was updated by another committed transaction, the target is being modified by a concurrent in-progress transaction, or the operation would block when asked not to wait. A well-designed interface captures these outcomes in an enum and returns it to the caller rather than raising an error, so the caller can choose its own policy (retry, error out, skip).
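A small sketch of why returning an enum is useful: the same outcome code can be mapped to different actions by different callers. The enum values and both policy functions are hypothetical and only mirror the outcomes listed above.

```c
#include <assert.h>

/* Hypothetical outcome vocabulary shared by all storage implementations. */
typedef enum ToyResult
{
    TOY_OK,
    TOY_SELF_MODIFIED,
    TOY_UPDATED,
    TOY_BEING_MODIFIED,
    TOY_WOULD_BLOCK
} ToyResult;

/* What a caller decides to do with an outcome. */
typedef enum ToyAction { ACT_DONE, ACT_SKIP, ACT_RETRY, ACT_ERROR } ToyAction;

/* Policy A: a caller that tolerates concurrency by retrying or skipping. */
static ToyAction
toy_policy_tolerant(ToyResult r)
{
    switch (r)
    {
        case TOY_OK:             return ACT_DONE;
        case TOY_SELF_MODIFIED:  return ACT_SKIP;
        case TOY_UPDATED:        return ACT_RETRY;
        case TOY_BEING_MODIFIED: return ACT_RETRY;
        case TOY_WOULD_BLOCK:    return ACT_SKIP;
    }
    assert(0 && "unknown ToyResult");
    return ACT_ERROR;
}

/* Policy B: a caller for which anything but success is an error. */
static ToyAction
toy_policy_strict(ToyResult r)
{
    return (r == TOY_OK) ? ACT_DONE : ACT_ERROR;
}
```

Had the storage layer raised an error itself, only the strict policy would be expressible.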
Scan lifecycle: begin → getnextslot → rescan/end
Sequential scans follow a three-phase lifecycle across almost all storage
interfaces: begin allocates a scan descriptor and pins the relation; a
loop of getnextslot calls advances a cursor and returns one tuple at a
time into a caller-supplied slot; end releases resources. A rescan
operation restarts the cursor without tearing down the descriptor, which
amortizes the begin/end overhead when the same scan must be re-run (e.g.,
in a nested-loop join inner side).
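The lifecycle can be sketched over an in-memory array of integers. All names (ToyScanDesc, toy_beginscan, toy_getnextslot, toy_rescan, toy_endscan) are hypothetical, and the "slot" is simply an int supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical scan descriptor: the rows being scanned plus a cursor. */
typedef struct ToyScanDesc
{
    const int *rows;
    int        nrows;
    int        pos;
} ToyScanDesc;

/* begin: allocate the descriptor and position the cursor at the start. */
static ToyScanDesc *
toy_beginscan(const int *rows, int nrows)
{
    ToyScanDesc *scan = malloc(sizeof *scan);

    if (scan == NULL)
        return NULL;
    scan->rows = rows;
    scan->nrows = nrows;
    scan->pos = 0;
    return scan;
}

/* getnextslot: copy one row into the caller's slot; false when exhausted. */
static bool
toy_getnextslot(ToyScanDesc *scan, int *slot)
{
    if (scan->pos >= scan->nrows)
        return false;
    *slot = scan->rows[scan->pos++];
    return true;
}

/* rescan: restart the cursor without freeing the descriptor. */
static void toy_rescan(ToyScanDesc *scan) { scan->pos = 0; }

/* end: release the descriptor. */
static void toy_endscan(ToyScanDesc *scan) { free(scan); }

/* Sums all rows `passes` times, reusing one descriptor via rescan.
 * Returns -1 if the descriptor cannot be allocated (sketch-level handling). */
static long
toy_sum_passes(const int *rows, int nrows, int passes)
{
    ToyScanDesc *scan = toy_beginscan(rows, nrows);
    long         sum = 0;
    int          slot;

    if (scan == NULL)
        return -1;
    for (int p = 0; p < passes; p++)
    {
        if (p > 0)
            toy_rescan(scan);        /* restart without a new begin */
        while (toy_getnextslot(scan, &slot))
            sum += slot;
    }
    toy_endscan(scan);
    return sum;
}
```

toy_sum_passes plays the role of a join's inner side: one begin and one end bracket any number of passes.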
Theory ↔ PostgreSQL Table AM mapping
| Design concept | PostgreSQL Table AM name |
|---|---|
| Function-pointer vtable | TableAmRoutine (in tableam.h) |
| Vtable registered on relation | Relation.rd_tableam (relcache field) |
| DML outcome vocabulary | TM_Result enum |
| DML failure detail | TM_FailureData struct |
| Scan descriptor | TableScanDescData / TableScanDesc |
| Index-fetch state | IndexFetchTableData |
| Mandatory callback validator | GetTableAmRoutine (in tableamapi.c) |
| Only in-tree AM at REL_18 | heapam_methods (in heapam_handler.c) |
PostgreSQL’s Approach
How the AM gets wired onto a relation
When a backend opens a relation (table_open → relation_open →
RelationIdGetRelation), the relcache builds the entry if it is not already
cached. As part of that build it reads relam from the relation’s pg_class
row, looks up the matching pg_am row to obtain amhandler, calls
GetTableAmRoutine(amhandler), and stores the returned
const TableAmRoutine * in rel->rd_tableam. Every subsequent
table_* call on that relation dispatches through this pointer. The handler
OID for the built-in heap AM resolves to heap_tableam_handler, which simply
returns &heapam_methods.
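The resolve-once, cache-the-pointer flow can be sketched with a toy catalog. Everything here is hypothetical (ToyAmCatalogRow, toy_pg_am, toy_relation_get_am, the AM id 1001); the real lookup goes through the system catalogs and the relcache.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vtable, reduced to a name for this sketch. */
typedef struct ToyAmRoutine { const char *name; } ToyAmRoutine;

/* A handler is a function that returns a pointer to a vtable. */
typedef const ToyAmRoutine *(*ToyAmHandler)(void);

static const ToyAmRoutine toy_heap_methods = { "toy_heap" };
static const ToyAmRoutine *toy_heap_handler(void) { return &toy_heap_methods; }

/* Hypothetical stand-in for the AM catalog: AM id -> handler function. */
typedef struct ToyAmCatalogRow
{
    int          amoid;
    ToyAmHandler amhandler;
} ToyAmCatalogRow;

static const ToyAmCatalogRow toy_pg_am[] = { { 1001, toy_heap_handler } };

/* Hypothetical cached relation entry. */
typedef struct ToyRelation
{
    int                 relam;       /* AM id recorded for the relation */
    const ToyAmRoutine *rd_tableam;  /* cached vtable, NULL until resolved */
} ToyRelation;

/* Resolve relam -> handler -> routine once, then reuse the cached pointer. */
static const ToyAmRoutine *
toy_relation_get_am(ToyRelation *rel)
{
    if (rel->rd_tableam == NULL)
    {
        for (size_t i = 0; i < sizeof toy_pg_am / sizeof toy_pg_am[0]; i++)
        {
            if (toy_pg_am[i].amoid == rel->relam)
                rel->rd_tableam = toy_pg_am[i].amhandler();
        }
    }
    return rel->rd_tableam;          /* NULL if the AM id is unknown */
}
```

After the first call the handler is not invoked again for that entry, which is why dispatch later costs only pointer dereferences.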
```mermaid
flowchart TB
  REL["Relation (relcache)\nrd_tableam → &heapam_methods"]
  EXEC["Executor\n(nodeSeqscan, nodeIndexscan,\nexecModifyTable, vacuum.c)"]
  WRAP["table_* inline wrappers\n(tableam.h)"]
  RT["const TableAmRoutine *\n= rel->rd_tableam"]
  HM["heapam_methods\n(static const TableAmRoutine)"]
  IMPL["heapam_handler.c / heapam.c\n(heap implementation)"]
  EXEC --> WRAP
  WRAP --> RT
  RT --> HM
  HM --> IMPL
  REL --> RT
```
Figure 1 — Dispatch chain. The executor calls a table_* inline wrapper in
tableam.h; the wrapper reads rel->rd_tableam and calls through the
function-pointer slot. At REL_18, every ordinary table binds heapam_methods,
which delegates to heapam_handler.c and ultimately to heapam.c.
The TableAmRoutine struct — callback inventory by area
TableAmRoutine is declared in src/include/access/tableam.h starting at line
288. It has approximately 40 callback slots organized into seven functional areas.
```mermaid
flowchart LR
subgraph TAR["TableAmRoutine (tableam.h:288)"]
direction TB
S["Slot\nslot_callbacks"]
SC["Sequential scan\nscan_begin\nscan_end\nscan_rescan\nscan_getnextslot\nscan_set_tidrange\nscan_getnextslot_tidrange\nparallelscan_*"]
IF["Index fetch\nindex_fetch_begin\nindex_fetch_reset\nindex_fetch_end\nindex_fetch_tuple"]
TV["Tuple visibility\ntuple_fetch_row_version\ntuple_tid_valid\ntuple_get_latest_tid\ntuple_satisfies_snapshot\nindex_delete_tuples"]
DML["DML\ntuple_insert\ntuple_insert_speculative\ntuple_complete_speculative\nmulti_insert\ntuple_delete\ntuple_update\ntuple_lock\nfinish_bulk_insert"]
DDL["DDL / vacuum / analyze\nrelation_set_new_filelocator\nrelation_nontransactional_truncate\nrelation_copy_data\nrelation_copy_for_cluster\nrelation_vacuum\nscan_analyze_next_block\nscan_analyze_next_tuple\nindex_build_range_scan\nindex_validate_scan"]
MISC["Misc / planner\nrelation_size\nrelation_needs_toast_table\nrelation_toast_am\nrelation_fetch_toast_slice\nrelation_estimate_size\nscan_bitmap_next_tuple\nscan_sample_next_block\nscan_sample_next_tuple"]
end
```
Figure 2 — TableAmRoutine callback inventory. All ~40 slots grouped by area.
Most slots are mandatory (asserted by GetTableAmRoutine). The
finish_bulk_insert, scan_bitmap_next_tuple, relation_toast_am, and
relation_fetch_toast_slice callbacks are optional (may be NULL), as is the
scan_set_tidrange / scan_getnextslot_tidrange pair.
TM_Result — the shared DML outcome enum
TM_Result is the return type of tuple_delete, tuple_update, and
tuple_lock. It captures the outcome space that any MVCC engine must handle:

```c
// TM_Result: src/include/access/tableam.h
typedef enum TM_Result
{
    TM_Ok,              /* operation succeeded */
    TM_Invisible,       /* tuple not visible to relevant snapshot */
    TM_SelfModified,    /* tuple already modified by this backend */
    TM_Updated,         /* updated by another committed transaction */
    TM_Deleted,         /* deleted by another committed transaction */
    TM_BeingModified,   /* concurrent in-progress modification (nowait only) */
    TM_WouldBlock,      /* lock not available, nowait, skip (lock_tuple only) */
} TM_Result;
```

When a DML call returns anything other than TM_Ok, the TM_FailureData
struct (filled by the AM) carries ctid (the TID of the replacement version
if the tuple was updated, or the tuple's own TID if it was deleted), the
outdating xmax, and (for TM_SelfModified) the cmax of the conflicting
command. The executor uses these to decide whether to retry, error, or (in
READ COMMITTED) re-fetch the newer version and re-evaluate quals.
```mermaid
flowchart TD
  CALL["table_tuple_delete / update / lock"] --> AM["AM callback\ntuple_delete / update / lock"]
  AM --> OK["TM_Ok\n→ done"]
  AM --> INV["TM_Invisible\n→ tuple gone"]
  AM --> SELF["TM_SelfModified\n→ already done in this cmd"]
  AM --> UPD["TM_Updated\n→ executor re-fetches\nnewer ctid (READ COMMITTED)"]
  AM --> DEL["TM_Deleted\n→ tuple gone"]
  AM --> BM["TM_BeingModified\n→ wait or skip (nowait)"]
  AM --> WB["TM_WouldBlock\n→ lock_tuple nowait only"]
```
Figure 3 — TM_Result decision tree. The executor caller inspects the return
code and, for TM_Updated, may use TM_FailureData.ctid to locate the
successor version and retry (a TM_Deleted tuple has no successor). The heap AM fills TM_FailureData
from the tuple’s t_ctid and t_xmax. See simple_table_tuple_delete and
simple_table_tuple_update in tableam.c for the simplest callers.
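A toy model of the "follow ctid and retry" idea. It is deliberately simplified: there is no visibility or commit-status check and no qual re-evaluation, and every name (ToyVersion, ToyFailureData, toy_tuple_delete, toy_delete_latest) is hypothetical rather than taken from the source.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToyResult { TOY_OK, TOY_UPDATED, TOY_DELETED } ToyResult;

/* Hypothetical failure detail, loosely modeled on a ctid/xmax pair. */
typedef struct ToyFailureData
{
    int ctid;   /* successor index if updated, own index if deleted */
    int xmax;   /* id of the transaction that outdated the version */
} ToyFailureData;

/* One row version in a toy table addressed by array index. */
typedef struct ToyVersion
{
    int xmax;   /* 0 = live, otherwise outdated by this transaction id */
    int next;   /* index of the successor, or own index if none */
} ToyVersion;

/* AM side: delete the version at tid, or report why that is not possible. */
static ToyResult
toy_tuple_delete(ToyVersion *table, int tid, int my_xid, ToyFailureData *tmfd)
{
    ToyVersion *v = &table[tid];

    if (v->xmax == 0)
    {
        v->xmax = my_xid;            /* mark as deleted by us */
        v->next = tid;               /* a deleted version points at itself */
        return TOY_OK;
    }
    tmfd->ctid = v->next;
    tmfd->xmax = v->xmax;
    return (v->next != tid) ? TOY_UPDATED : TOY_DELETED;
}

/* Caller side: chase successors on TOY_UPDATED, give up on TOY_DELETED. */
static bool
toy_delete_latest(ToyVersion *table, int tid, int my_xid)
{
    ToyFailureData tmfd;

    for (;;)
    {
        switch (toy_tuple_delete(table, tid, my_xid, &tmfd))
        {
            case TOY_OK:
                return true;
            case TOY_UPDATED:
                tid = tmfd.ctid;     /* retry at the successor version */
                break;
            case TOY_DELETED:
                return false;        /* nothing left to delete */
        }
    }
}
```

The retry loop lives in the caller, not in the storage layer, which is the division of labor the outcome enum is meant to allow.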
Scan lifecycle and ScanOptions
scan_begin takes a bitmask of ScanOptions flags that communicate two
things: the scan type (exactly one of SO_TYPE_SEQSCAN,
SO_TYPE_BITMAPSCAN, SO_TYPE_SAMPLESCAN, SO_TYPE_TIDSCAN,
SO_TYPE_TIDRANGESCAN, SO_TYPE_ANALYZE) and behavior hints (zero or more
of SO_ALLOW_STRAT, SO_ALLOW_SYNC, SO_ALLOW_PAGEMODE). The AM may ignore
hints it does not support.
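The "exactly one type bit, any number of hint bits" rule can be checked with ordinary bit arithmetic. The flag names and bit positions below are hypothetical and are not the values used in tableam.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout; bit positions are arbitrary. */
enum
{
    TOY_SO_TYPE_SEQSCAN    = 1 << 0,
    TOY_SO_TYPE_BITMAPSCAN = 1 << 1,
    TOY_SO_TYPE_SAMPLESCAN = 1 << 2,
    TOY_SO_ALLOW_STRAT     = 1 << 8,
    TOY_SO_ALLOW_SYNC      = 1 << 9,
    TOY_SO_ALLOW_PAGEMODE  = 1 << 10
};

#define TOY_SO_TYPE_MASK 0xFFu   /* low byte holds the scan-type bits */

/* True when exactly one scan-type bit is set (power-of-two test). */
static bool
toy_scan_flags_valid(uint32_t flags)
{
    uint32_t type = flags & TOY_SO_TYPE_MASK;

    return type != 0 && (type & (type - 1)) == 0;
}

/* Keep the type bit, and keep only the hints this AM says it supports. */
static uint32_t
toy_apply_supported_hints(uint32_t flags, uint32_t supported_hints)
{
    assert(toy_scan_flags_valid(flags));
    return (flags & TOY_SO_TYPE_MASK) |
           (flags & ~TOY_SO_TYPE_MASK & supported_hints);
}
```

An AM that supports no hints at all would pass 0 as supported_hints and still receive a valid scan type.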
The lifecycle state machine is:
```mermaid
stateDiagram-v2
  [*] --> Scanning : table_beginscan\nScanOptions flags set
  Scanning --> Scanning : table_scan_getnextslot\nreturns true
  Scanning --> Done : table_scan_getnextslot\nreturns false
  Scanning --> Scanning : table_rescan\ncursor reset
  Done --> [*] : table_endscan\nresources released
  Scanning --> [*] : table_endscan\nearly exit
```
Figure 4 — Scan lifecycle state machine. The executor calls
table_beginscan, loops on table_scan_getnextslot, and finishes with
table_endscan. table_rescan restarts the cursor without tearing down the
descriptor — used by nested-loop join re-runs and ExecReScanSeqScan.
The table_beginscan inline wrapper sets the standard seq-scan flags:
```c
// table_beginscan: src/include/access/tableam.h
static inline TableScanDesc
table_beginscan(Relation rel, Snapshot snapshot,
                int nkeys, struct ScanKeyData *key)
{
    uint32 flags = SO_TYPE_SEQSCAN |
        SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE;

    return rel->rd_tableam->scan_begin(rel, snapshot, nkeys, key, NULL, flags);
}
```

table_scan_getnextslot stamps the slot’s tts_tableOid and delegates:

```c
// table_scan_getnextslot: src/include/access/tableam.h
static inline bool
table_scan_getnextslot(TableScanDesc sscan, ScanDirection direction,
                       TupleTableSlot *slot)
{
    slot->tts_tableOid = RelationGetRelid(sscan->rs_rd);
    /* ... CheckXidAlive guard for logical decoding ... */
    return sscan->rs_rd->rd_tableam->scan_getnextslot(sscan, direction, slot);
}
```

Executor call site: nodeSeqscan.c
SeqNext is the per-tuple workhorse inside ExecSeqScan. It calls exactly
two table AM functions: table_beginscan on the first call (lazy
initialization), then table_scan_getnextslot on every call:

```c
// SeqNext: src/backend/executor/nodeSeqscan.c (condensed)
static TupleTableSlot *
SeqNext(SeqScanState *node)
{
    TableScanDesc   scandesc = node->ss.ss_currentScanDesc;
    EState         *estate = node->ss.ps.state;
    ScanDirection   direction = estate->es_direction;
    TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;

    if (scandesc == NULL)
    {
        /* first call: start the scan lazily */
        scandesc = table_beginscan(node->ss.ss_currentRelation,
                                   estate->es_snapshot,
                                   0, NULL);
        node->ss.ss_currentScanDesc = scandesc;
    }

    /* fetch the next tuple from the table into the scan slot */
    if (table_scan_getnextslot(scandesc, direction, slot))
        return slot;
    return NULL;
}
```

ExecInitSeqScan sets the scan-tuple slot type via table_slot_callbacks,
which asks the AM which TupleTableSlotOps implementation to use — for heap
this returns TTSOpsBufferHeapTuple, a slot type that can hold a heap tuple
pinned in a buffer. This is how the executor avoids materializing a copy of
every tuple for the common read path.
ExecEndSeqScan simply calls table_endscan. ExecReScanSeqScan calls
table_rescan(scan, NULL), which resets the scan position without releasing
the descriptor.
Index-fetch call site — table_index_fetch_tuple
Index scans decouple the index traversal from the heap fetch. The AM provides
a separate IndexFetchTableData state object (begun by
table_index_fetch_begin, freed by table_index_fetch_end). For each TID
yielded by the index, the executor calls:
```c
// table_index_fetch_tuple: src/include/access/tableam.h
static inline bool
table_index_fetch_tuple(struct IndexFetchTableData *scan,
                        ItemPointer tid,
                        Snapshot snapshot,
                        TupleTableSlot *slot,
                        bool *call_again, bool *all_dead)
{
    /* ... CheckXidAlive guard ... */
    return scan->rel->rd_tableam->index_fetch_tuple(scan, tid, snapshot,
                                                    slot, call_again,
                                                    all_dead);
}
```

The call_again output parameter is the hook for HOT: because a single index
entry can reach multiple heap versions (the HOT chain), the AM sets
*call_again = true when there is another version to return for the same TID.
The all_dead output lets the AM tell the index AM that no backend could
possibly see the tuple, allowing the index entry to be marked dead and skipped
in future scans.
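The caller-side protocol for call_again can be sketched as a loop. This toy ignores snapshots and visibility entirely and returns every version in the chain; ToyFetch, toy_index_fetch_tuple, and toy_fetch_all are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical fetch state: the versions reachable from one index TID. */
typedef struct ToyFetch
{
    const int *chain;   /* row versions, oldest first */
    int        len;
    int        next;    /* position of the next version to return */
} ToyFetch;

/*
 * Returns one version per call. Sets *call_again to true when another
 * version remains for the same TID, so the caller should call again
 * before advancing the index.
 */
static bool
toy_index_fetch_tuple(ToyFetch *scan, int *slot, bool *call_again)
{
    if (scan->next >= scan->len)
    {
        *call_again = false;
        return false;
    }
    *slot = scan->chain[scan->next++];
    *call_again = (scan->next < scan->len);
    return true;
}

/* Caller loop: keep fetching for the same TID while call_again is set. */
static int
toy_fetch_all(ToyFetch *scan, int *out, int max)
{
    int  n = 0;
    int  slot;
    bool again = true;

    while (again && n < max && toy_index_fetch_tuple(scan, &slot, &again))
        out[n++] = slot;
    return n;
}
```

The index side stays unaware of how many versions hide behind a TID; it only needs to honor the flag.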
GetTableAmRoutine — registration and validation
GetTableAmRoutine (in tableamapi.c) is the factory function called at
relation-open time. It invokes the handler OID function, gets back a
TableAmRoutine *, and asserts every mandatory callback is non-NULL. The
type check is a runtime elog(ERROR) in every build; the Assert checks fire
only in assert-enabled builds:

```c
// GetTableAmRoutine: src/backend/access/table/tableamapi.c (condensed)
const TableAmRoutine *
GetTableAmRoutine(Oid amhandler)
{
    Datum datum = OidFunctionCall0(amhandler);
    const TableAmRoutine *routine = (TableAmRoutine *) DatumGetPointer(datum);

    if (routine == NULL || !IsA(routine, TableAmRoutine))
        elog(ERROR, "table access method handler %u did not return "
             "a TableAmRoutine struct", amhandler);

    Assert(routine->scan_begin != NULL);
    Assert(routine->scan_end != NULL);
    Assert(routine->scan_getnextslot != NULL);
    Assert(routine->index_fetch_begin != NULL);
    Assert(routine->index_fetch_tuple != NULL);
    Assert(routine->tuple_insert != NULL);
    Assert(routine->tuple_delete != NULL);
    Assert(routine->tuple_update != NULL);
    Assert(routine->relation_vacuum != NULL);
    /* ... remaining mandatory asserts ... */

    return routine;
}
```

heapam_methods: the reference implementation
The heap AM’s vtable is a static const struct in heapam_handler.c.
Every mandatory slot is filled; a few optional ones (for example
relation_toast_am and relation_fetch_toast_slice) are also provided. Key bindings:

```c
// heapam_methods: src/backend/access/heap/heapam_handler.c (condensed)
static const TableAmRoutine heapam_methods = {
    .type = T_TableAmRoutine,

    .slot_callbacks = heapam_slot_callbacks,

    .scan_begin = heap_beginscan,
    .scan_end = heap_endscan,
    .scan_rescan = heap_rescan,
    .scan_getnextslot = heap_getnextslot,

    .index_fetch_begin = heapam_index_fetch_begin,
    .index_fetch_reset = heapam_index_fetch_reset,
    .index_fetch_end = heapam_index_fetch_end,
    .index_fetch_tuple = heapam_index_fetch_tuple,

    .tuple_insert = heapam_tuple_insert,
    .multi_insert = heap_multi_insert,
    .tuple_delete = heapam_tuple_delete,
    .tuple_update = heapam_tuple_update,
    .tuple_lock = heapam_tuple_lock,

    .tuple_satisfies_snapshot = heapam_tuple_satisfies_snapshot,
    .index_delete_tuples = heap_index_delete_tuples,

    .relation_vacuum = heap_vacuum_rel,
    .relation_size = table_block_relation_size,
    .relation_estimate_size = heapam_estimate_rel_size,

    /* ... all remaining slots ... */
};
```

heap_tableam_handler (the SQL-visible amhandler function referenced by
pg_am) simply returns &heapam_methods. GetHeapamTableAmRoutine (used
internally by code that knows it is working with heap) returns the same
pointer directly.
table_tuple_insert — tracing a DML call end-to-end
The insert wrapper is a single dispatch into the AM:

```c
// table_tuple_insert: src/include/access/tableam.h
static inline void
table_tuple_insert(Relation rel, TupleTableSlot *slot, CommandId cid,
                   int options, struct BulkInsertStateData *bistate)
{
    rel->rd_tableam->tuple_insert(rel, slot, cid, options, bistate);
}
```

The heap implementation (heapam_tuple_insert in heapam_handler.c) fetches a
HeapTuple from the slot and calls heap_insert in heapam.c. The slot
carries the executor’s AM-neutral form of the row (per-attribute values and
null flags); the AM is responsible for
serializing it into whatever on-disk format it uses. For heap, that is the
23-byte HeapTupleHeaderData prefix plus data; see postgres-heap-am.md
for the full layout.
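The division of labor (executor hands over a slot, AM serializes it and reports the new TID back through the slot) can be sketched with a toy format. The 4-byte header, the ToySlot and ToyTable types, and both functions are hypothetical and bear no relation to the heap tuple layout.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAX_ATTS   8
#define TOY_MAX_ROWS   16
#define TOY_HEADER_LEN 4      /* hypothetical header: just an attribute count */

/* Hypothetical slot: the row as per-attribute values plus a TID field. */
typedef struct ToySlot
{
    int     natts;
    int32_t values[TOY_MAX_ATTS];
    int     tid;              /* filled in by the AM after insert */
} ToySlot;

/* Hypothetical table: fixed-size row buffers addressed by index. */
typedef struct ToyTable
{
    unsigned char rows[TOY_MAX_ROWS][TOY_HEADER_LEN + 4 * TOY_MAX_ATTS];
    int           nrows;
} ToyTable;

/* AM-side insert: serialize the slot, store it, report the TID via the slot. */
static void
toy_tuple_insert(ToyTable *tab, ToySlot *slot)
{
    unsigned char *dst;
    int32_t        natts = slot->natts;

    assert(tab->nrows < TOY_MAX_ROWS);
    assert(natts >= 0 && natts <= TOY_MAX_ATTS);
    dst = tab->rows[tab->nrows];
    memcpy(dst, &natts, TOY_HEADER_LEN);
    memcpy(dst + TOY_HEADER_LEN, slot->values,
           sizeof(int32_t) * (size_t) natts);
    slot->tid = tab->nrows++;
}

/* Reads one attribute back out of the serialized form. */
static int32_t
toy_fetch_attr(const ToyTable *tab, int tid, int attno)
{
    int32_t v;

    memcpy(&v, tab->rows[tid] + TOY_HEADER_LEN + 4 * attno, sizeof v);
    return v;
}
```

A different AM could keep the same toy_tuple_insert signature and store the values column by column; the caller would not change.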
Vacuum dispatch
table_relation_vacuum is the wrapper that routes VACUUM to the AM:

```c
// table_relation_vacuum: src/include/access/tableam.h
static inline void
table_relation_vacuum(Relation rel, struct VacuumParams *params,
                      BufferAccessStrategy bstrategy)
{
    rel->rd_tableam->relation_vacuum(rel, params, bstrategy);
}
```

vacuum.c calls this after acquiring the ShareUpdateExclusiveLock. For the
heap AM this routes to heap_vacuum_rel, which is defined in vacuumlazy.c
(via the heapam_methods slot). A custom AM that does not need vacuum (e.g., an
in-memory AM with no dead-tuple accumulation) would leave this slot wired to
a no-op or a minimal implementation.
Source Walkthrough
Anchor on symbol names, not line numbers. The position-hint table below records line numbers observed at commit 273fe94 (REL_18_STABLE, 2026-06-05) as quick hints only; use git grep -n '<symbol>' to find the current position.
Core interface (src/include/access/tableam.h)
- typedef enum TM_Result: DML outcome codes (TM_Ok, TM_Updated, TM_Deleted, TM_SelfModified, TM_BeingModified, TM_WouldBlock, TM_Invisible).
- typedef struct TM_FailureData: failure detail struct: ctid, xmax, cmax, traversed.
- typedef enum TU_UpdateIndexes: update-index hint returned by tuple_update: TU_None (HOT, no index update needed), TU_All, or TU_Summarizing.
- typedef enum ScanOptions: bitmask for scan_begin: SO_TYPE_* (scan type, exactly one) and SO_ALLOW_* (behavior hints, zero or more).
- typedef struct TableAmRoutine: the ~40-slot function-pointer vtable.
- table_beginscan / table_endscan / table_rescan / table_scan_getnextslot: the seq-scan lifecycle inline wrappers.
- table_index_fetch_begin / table_index_fetch_end / table_index_fetch_tuple: index-fetch lifecycle inline wrappers; call_again and all_dead output params.
- table_tuple_insert / table_tuple_delete / table_tuple_update / table_tuple_lock: DML inline wrappers returning void or TM_Result.
- table_relation_vacuum: vacuum dispatch wrapper.
- DEFAULT_TABLE_ACCESS_METHOD: compile-time constant "heap"; also the default value of the default_table_access_method GUC.
Registration and validation (src/backend/access/table/tableamapi.c)
- GetTableAmRoutine(Oid amhandler): calls the handler function, validates all mandatory callbacks are non-NULL, returns the const TableAmRoutine *.
- check_default_table_access_method: GUC check hook; validates that the named AM exists in pg_am.
Scan and parallel helpers (src/backend/access/table/tableam.c)
- table_beginscan_catalog: variant for catalog scans; registers a catalog snapshot automatically.
- simple_table_tuple_insert / simple_table_tuple_delete / simple_table_tuple_update: wrappers for callers that do not handle concurrent-update cases and want errors on any non-TM_Ok result.
- table_block_parallelscan_estimate / …_initialize / …_reinitialize / …_startblock_init / …_nextpage: shared helpers for block-oriented AMs implementing parallel seq scans. These are not in the vtable; AMs call them from their own parallelscan_* callbacks.
Scan descriptor (src/include/access/relscan.h)
- typedef struct TableScanDescData: the base scan descriptor embedded (or extended) by each AM. Contains rs_rd (Relation), rs_snapshot, rs_nkeys, rs_key, rs_flags (the ScanOptions bitmask), and rs_parallel (parallel scan state or NULL).
- typedef struct IndexFetchTableData: minimal base for index-fetch state; AMs embed it in a larger AM-specific struct.
Heap AM binding (src/backend/access/heap/heapam_handler.c)
- heapam_methods: static const TableAmRoutine; all ~40 slots filled.
- GetHeapamTableAmRoutine(): returns &heapam_methods; used by code that is heap-specific by assumption.
- heap_tableam_handler(PG_FUNCTION_ARGS): the SQL-visible amhandler function; returns &heapam_methods via PG_RETURN_POINTER.
- heapam_tuple_insert / heapam_tuple_delete / heapam_tuple_update / heapam_tuple_lock: thin wrappers: unpack TupleTableSlot, call heap_{insert,delete,update} in heapam.c, copy resulting TID back into slot.
- heapam_index_fetch_begin / heapam_index_fetch_tuple: allocate/drive an IndexFetchHeapData (the heap-specific struct that embeds IndexFetchTableData) for TID-based lookups; heapam_index_fetch_tuple calls heap_hot_search_buffer (see postgres-heap-am.md).
- heapam_slot_callbacks: returns &TTSOpsBufferHeapTuple; the slot type that pins a heap tuple in a buffer without copying.
Executor call sites
- SeqNext (in nodeSeqscan.c): calls table_beginscan (lazy), then table_scan_getnextslot once per invocation; the AM-facing layer of ExecSeqScan.
- ExecInitSeqScan (in nodeSeqscan.c): calls table_slot_callbacks to pick the right slot type at init time.
- ExecEndSeqScan / ExecReScanSeqScan (in nodeSeqscan.c): call table_endscan / table_rescan.
- Index scan TID resolution (driven from nodeIndexscan.c through the index_* routines in indexam.c): calls table_index_fetch_begin, loops on table_index_fetch_tuple with call_again, calls table_index_fetch_end.
Position hints (as of 2026-06-05, REL_18 273fe94)
| Symbol | File | Line |
|---|---|---|
| typedef enum TM_Result | access/tableam.h | 71 |
| typedef struct TM_FailureData | access/tableam.h | 146 |
| typedef struct TableAmRoutine | access/tableam.h | 288 |
| } TableAmRoutine | access/tableam.h | 843 |
| table_beginscan (inline) | access/tableam.h | 875 |
| table_endscan (inline) | access/tableam.h | 984 |
| table_scan_getnextslot (inline) | access/tableam.h | 1020 |
| table_index_fetch_begin (inline) | access/tableam.h | 1157 |
| table_index_fetch_tuple (inline) | access/tableam.h | 1206 |
| table_tuple_insert (inline) | access/tableam.h | 1367 |
| table_tuple_delete (inline) | access/tableam.h | 1456 |
| table_tuple_update (inline) | access/tableam.h | 1500 |
| table_relation_vacuum (inline) | access/tableam.h | 1674 |
| typedef struct TableScanDescData | access/relscan.h | 33 |
| typedef struct IndexFetchTableData | access/relscan.h | 121 |
| GetTableAmRoutine | table/tableamapi.c | 28 |
| table_beginscan_catalog | table/tableam.c | 113 |
| simple_table_tuple_insert | table/tableam.c | 277 |
| simple_table_tuple_delete | table/tableam.c | 291 |
| simple_table_tuple_update | table/tableam.c | 336 |
| static const TableAmRoutine heapam_methods | heap/heapam_handler.c | 2616 |
| GetHeapamTableAmRoutine | heap/heapam_handler.c | 2676 |
| heap_tableam_handler | heap/heapam_handler.c | 2682 |
| SeqNext | executor/nodeSeqscan.c | 51 |
| ExecSeqScan | executor/nodeSeqscan.c | 110 |
| ExecInitSeqScan | executor/nodeSeqscan.c | 207 |
| ExecEndSeqScan | executor/nodeSeqscan.c | 289 |
Source verification (as of 2026-06-05)
Each entry is a fact about the current source at commit 273fe94 (REL_18_STABLE). The trailing note shows how it was checked.
Verified facts
- TableAmRoutine is declared as a typedef struct at line 288 of tableam.h, and its closing brace is at line 843. Verified by reading the file directly. The struct body spans approximately 555 lines because each callback slot carries a multi-line comment block.
- GetTableAmRoutine asserts ~30 mandatory callbacks non-NULL. Verified by reading tableamapi.c: all Assert(routine->...) lines are present. Optional callbacks (finish_bulk_insert, scan_bitmap_next_tuple, relation_toast_am, relation_fetch_toast_slice) are not asserted. scan_set_tidrange and scan_getnextslot_tidrange are also not asserted (they must be provided together or neither).
- The only in-tree AM registered via GetTableAmRoutine at REL_18 is heap. Verified by git grep -r 'GetTableAmRoutine\|amhandler.*heap_tableam' in src/backend and src/include: only heapam_handler.c and tableamapi.c reference the function.
- DEFAULT_TABLE_ACCESS_METHOD is the string "heap" and is the initial value of the default_table_access_method GUC. Verified in tableam.h line 29 (#define DEFAULT_TABLE_ACCESS_METHOD "heap") and tableam.c line 49 (char *default_table_access_method = DEFAULT_TABLE_ACCESS_METHOD).
- table_beginscan sets SO_TYPE_SEQSCAN | SO_ALLOW_STRAT | SO_ALLOW_SYNC | SO_ALLOW_PAGEMODE. Verified in the table_beginscan inline in tableam.h (line 875–881). table_beginscan_bm substitutes SO_TYPE_BITMAPSCAN and keeps only SO_ALLOW_PAGEMODE of the three hints; table_beginscan_analyze uses SO_TYPE_ANALYZE with no SO_ALLOW_* hints.
- SeqNext calls table_beginscan lazily (only if scandesc == NULL), then table_scan_getnextslot once per invocation. Verified in nodeSeqscan.c lines 51–84, which match the condensed quote above.
- table_index_fetch_tuple has a call_again output parameter that the heap AM uses for HOT chains. Verified in the inline wrapper in tableam.h (line 1206) and confirmed in the index_fetch_tuple comment block (lines 436–459 of tableam.h): “If there potentially is another tuple matching the tid, *call_again needs to be set to true.”
- heapam_methods is a static const TableAmRoutine defined at line 2616 of heapam_handler.c. Verified by reading the file. heap_tableam_handler at line 2682 returns &heapam_methods via PG_RETURN_POINTER. There is no other TableAmRoutine struct in src/backend/.
- TU_UpdateIndexes is a separate enum from TM_Result. Verified in tableam.h lines 109–119. tuple_update returns TM_Result and writes TU_UpdateIndexes through an output pointer (update_indexes). The executor uses TU_None to skip index updates on HOT updates and TU_Summarizing to update only summarizing (BRIN) indexes.
Open questions
- Custom AM registration path at session start. A custom AM installed via CREATE ACCESS METHOD ... TYPE TABLE registers its handler in pg_am. On the next session that opens a relation using that AM, GetTableAmRoutine is called. The session-level caching (whether the TableAmRoutine * is re-fetched per relation-open or pinned) is governed by the relcache invalidation machinery. Exact caching behavior under concurrent ALTER TABLE SET ACCESS METHOD is not traced here.
- Interaction with logical decoding. The CheckXidAlive guards in table_scan_getnextslot and table_index_fetch_tuple reject calls during logical decoding. Whether a custom AM that performs its own tuple reconstruction (not going through table_scan_getnextslot) correctly participates in logical decoding is not spelled out in the interface contract.
Beyond PostgreSQL — Comparative Designs & Research Frontiers
Pointers, not analysis. Each bullet is a starting handle for a follow-up doc.
- The Index AM interface (IndexAmRoutine in amapi.h) is the sibling contract. The Index AM API is the older of the two; TableAmRoutine arrived in PostgreSQL 12 as the newer addition that rounded out the pluggable-storage picture. Understanding both interfaces together gives the full “PostgreSQL as an extensible database engine” picture. The nbtree doc (postgres-nbtree.md) covers IndexAmRoutine from the B-tree side.
- zheap, the stalled in-place AM. The zheap project (started at EnterpriseDB and later continued by CYBERTEC, circa 2017–2020) was an attempt to build an in-place storage AM with undo, using TableAmRoutine as the extension point, to eliminate heap bloat and vacuum overhead. Its design documents describe the hardest parts of the AM contract (visibility, freezing, TOAST, WAL) from the perspective of an AM author, and show what the interface was not yet flexible enough to express. The project stalled but remains the clearest demonstration that TableAmRoutine was designed with architectural intent.
- Columnar/append-only AMs. External projects (Citus columnar, Hydra, ParadeDB) implement TableAmRoutine for columnar storage. They expose which callbacks are genuinely AM-neutral (scan lifecycle, DML outcome codes) versus which embed block-oriented assumptions (table_block_relation_size, scan_bitmap_next_tuple). The block-oriented helpers in tableam.c (table_block_parallelscan_*) are provided precisely because many AMs share that assumption.
- Oracle’s pluggable storage (In-Memory Column Store). Oracle 12c introduced an in-memory columnar format as a dual-format option: the on-disk row store is unchanged, and the in-memory store is an additional representation populated by a background process. Queries can use either. This contrasts with PostgreSQL’s model where a table has exactly one AM. A PostgreSQL analog would require either a custom AM that internally maintains both formats or a foreign-table overlay, neither of which is currently clean.
- The Stonebraker 1986 vision. The Design of POSTGRES (Stonebraker & Rowe, 1986) included a section on the “abstract data manager” and envisioned separating query processing from storage management. TableAmRoutine is the closest production realization of that vision in PostgreSQL, arriving with PostgreSQL 12 in 2019, roughly 33 years later. Reading the 1986 paper alongside the current tableam.h is a worthwhile exercise in seeing how much the interface narrowed from the original vision.
Sources
In-tree documentation
- src/include/access/tableam.h: the primary source; every callback is documented inline above its slot definition. The header comment references tableam.sgml for higher-level documentation.
Textbook chapters (under knowledge/research/dbms-general/)
- Database Internals (Petrov), ch. 3 “File Formats”: pluggable storage layer concepts and vtable dispatch patterns.
- Database System Concepts (Silberschatz et al., 7e), ch. 13 “Data Storage Structures”: storage manager abstraction and access method interfaces.
PostgreSQL source (under /data/hgryoo/references/postgres/, REL_18 273fe94)
- src/include/access/tableam.h
- src/backend/access/table/tableam.c
- src/backend/access/table/tableamapi.c
- src/backend/access/heap/heapam_handler.c
- src/backend/executor/nodeSeqscan.c
- src/include/access/relscan.h
Cross-references (sibling docs)
- postgres-heap-am.md: the heap AM as the reference implementation: HeapTupleHeaderData, HOT, pruning, visibility, heap_insert/heap_update/heap_delete.
- postgres-executor.md: the full executor node tree and slot types that consume the Table AM API.
- postgres-mvcc-snapshots.md: snapshot construction; the snapshot passed to table_beginscan and table_index_fetch_tuple.
- postgres-vacuum.md: table_relation_vacuum callers, autovacuum scheduling.
- postgres-nbtree.md: the Index AM (IndexAmRoutine) sibling interface.
- postgres-page-layout.md: the page geometry assumed by block-oriented AMs.