PostgreSQL Hooks — The Function-Pointer Extension Points
Contents:
- Theoretical Background
- Common DBMS Design
- PostgreSQL’s Approach
- Source Walkthrough
- Source verification (as of 2026-06-05)
- Beyond PostgreSQL — Comparative Designs & Research Frontiers
- Sources
Theoretical Background
Every long-lived database engine faces the same tension. The core must stay small, auditable, and fast; yet real deployments want behavior the core authors never anticipated: query auditing, statement timing, workload-aware planning, custom authentication policies, per-extension shared state. The classic resolutions to that tension form a spectrum:
1. Fork the source. Copy the engine, patch it, maintain the divergence forever. Maximum power, maximum cost; every upstream release is a merge conflict.
2. Configuration knobs. Expose the anticipated variation as settings (GUCs in PostgreSQL). Cheap and safe, but only covers behavior the authors thought to parameterize.
3. Stored procedures / triggers. Let users inject logic at data-definition boundaries. Powerful for row-level policy, but it runs inside SQL semantics, not underneath them, so it cannot reach the planner or the wire protocol.
4. Extension points / hooks. Publish a small set of stable interception seams in the engine’s control flow and let externally-compiled code bind to them at load time. This is the middle path: more reach than a GUC, far less maintenance burden than a fork.
PostgreSQL leans heavily on option 4, and its chosen mechanism is the
hook: a global function-pointer variable, initialized to NULL,
that the core checks at a well-defined point. When a loadable module
sets the pointer to its own function, that function gets called instead
of (or wrapped around) the built-in behavior. There is no plugin
registry, no manifest, no dynamic dispatch table — just a C function
pointer and a disciplined calling convention.
This is the Hollywood Principle (“don’t call us, we’ll call you”)
realized with the lowest-overhead primitive C offers. The theoretical
appeal is that an unset hook costs exactly one predictable-branch
if (ptr) test — effectively free on the hot path — while a set hook
costs one indirect call. Servers that load no module at that seam pay no
abstraction tax.
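The cost argument above can be made concrete with a minimal, self-contained sketch. Every name here (compute_hook, standard_compute, plus_one_compute) is hypothetical and chosen for illustration; none is a PostgreSQL symbol. The sketch only shows the shape: a NULL-initialized global pointer, a dispatcher that tests it, and a default that stays callable.

```c
#include <stddef.h>

/* Hypothetical seam: all names in this sketch are illustrative only. */
typedef int (*compute_hook_type)(int input);

/* The hook: one global function pointer, NULL until a module claims it. */
compute_hook_type compute_hook = NULL;

/* The built-in behavior, kept callable so an interceptor can delegate to it. */
int
standard_compute(int input)
{
    return input * 2;
}

/* Dispatcher: one pointer test when unset, one indirect call when set. */
int
compute(int input)
{
    if (compute_hook)
        return (*compute_hook)(input);
    return standard_compute(input);
}

/* An example interceptor that wraps the default and adjusts its result. */
int
plus_one_compute(int input)
{
    return standard_compute(input) + 1;
}
```

Assigning compute_hook = plus_one_compute changes what compute() returns without touching the dispatcher; resetting it to NULL restores the built-in path.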
The design rests on three properties the core must guarantee:
- A stable seam. The hooked function’s signature and its position in the control flow must change rarely, because out-of-tree modules compile against it. PostgreSQL versions the ABI with PG_MODULE_MAGIC so a module built for the wrong major version is rejected at load rather than crashing at call time.
- A default that is the real implementation. The hook must wrap a function that already does the whole job, so a module that only wants to observe can call the default and add its own behavior before/after. PostgreSQL names these standard_Foo().
- A chaining convention. Because the pointer is a single global, two modules both wanting the same seam must cooperate. PostgreSQL’s unwritten-but-universal convention is: save the previous value, then call it from inside your replacement. This turns a single pointer into a linked stack of interceptors, ordered by load order.
The relevant theory anchor in the KB bibliography is Architecture of a
Database System (Hellerstein, Stonebraker & Hamilton, 2007;
dbms-papers/fntdb07-architecture.md), whose process-model and
query-lifecycle decomposition (parser → rewriter → planner → executor,
plus the shared-memory/process substrate) is exactly the spine along
which PostgreSQL drilled its hook seams. The hooks are not a separate
subsystem; they are taps on the lifecycle that paper describes. The
Berkeley POSTGRES extensibility lineage (Stonebraker & Kemnitz 1991,
“The POSTGRES Next-Generation DBMS”) established that an engine should
treat user-supplied access methods, types, and procedures as
first-class — the function-pointer hook is the in-process, C-level
descendant of that philosophy.
Common DBMS Design
Engines that support in-process extension converge on a small set of recurring techniques. Naming them makes PostgreSQL’s specific choices legible as one point in a shared design space.
A load-time entry point
Every dynamic-extension system needs a moment, just after the shared
object is mapped into the server’s address space, when the module runs
arbitrary setup code: register its hooks, define its settings, reserve
resources. Unix-family engines reach this via the dynamic linker —
dlopen() the .so, then dlsym() a conventionally-named init symbol
and call it. PostgreSQL’s symbol is _PG_init; MySQL/MariaDB plugins
use a descriptor struct with init/deinit callbacks; SQLite uses
sqlite3_*_init entry points discovered by the extension loader.
ABI guarding
Binding compiled code into a running server is unsafe if the struct
layouts disagree. The universal guard is a magic block: a versioned
descriptor the loader reads before trusting any other symbol, aborting
the load on mismatch. PostgreSQL’s PG_MODULE_MAGIC macro emits a
Pg_magic_func returning a Pg_magic_struct stamped with the build’s
ABI fields; the loader compares it and refuses incompatible libraries.
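A magic-block comparison of this kind can be sketched in a few lines. The struct names, field names, and the values 18/100/64 below are assumptions made up for the example, not PostgreSQL's actual definitions; the sketch only illustrates the "check size, then memcmp the ABI fields" pattern.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical ABI descriptor; field names are illustrative, not PostgreSQL's. */
typedef struct AbiValues
{
    int version;    /* major version the module was built against */
    int max_args;   /* a compile-time limit that affects struct layouts */
    int name_len;   /* another layout-affecting constant */
} AbiValues;

typedef struct MagicBlock
{
    int       len;  /* sizeof(MagicBlock) as the module saw it */
    AbiValues abi;
} MagicBlock;

/* The server's own values, fixed at build time (made-up numbers). */
static const MagicBlock server_magic = { sizeof(MagicBlock), { 18, 100, 64 } };

/* Accept a module only if its block has the same size and identical ABI fields. */
bool
magic_is_compatible(const MagicBlock *module_magic)
{
    if (module_magic == NULL)
        return false;   /* no magic block at all */
    if (module_magic->len != (int) sizeof(MagicBlock))
        return false;   /* the descriptor layout itself differs */
    return memcmp(&module_magic->abi, &server_magic.abi,
                  sizeof(AbiValues)) == 0;
}
```

Checking the length first matters: if the descriptor struct itself changed size between versions, the field-by-field comparison would be reading the wrong bytes.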
Interception seams as the unit of extensibility
The actual extension surface is a curated set of points in the control flow where third-party code may intervene. Two implementation styles dominate:
- Callback registries — an array/list of subscribers per event, invoked in registration order (event-listener pattern). Flexible ordering, but heavier: allocation, iteration, a registry data structure.
- Single function pointers — one global per seam, defaulting to the built-in. Zero allocation, one branch when unused. The cost is that multiplexing is pushed onto the modules (they must chain), not the core.
PostgreSQL deliberately picks the second style for nearly all of its hooks. The core stays trivial; the chaining burden falls on the modules, which follow a shared convention.
The wrap-the-default idiom
For an observer (timer, logger, auditor) to coexist with the real
operation, the seam must expose the real operation as a callable. The
common idiom is to split Foo() into a public dispatcher and a
standard_Foo() (or default_Foo()) that holds the logic, so an
interceptor can do work, delegate to the default, and do more work. This
is precisely PostgreSQL’s planner / standard_planner split.
Phase-gated resource hooks
Hooks that allocate shared resources cannot fire at an arbitrary time:
shared memory in a fork-based server must be sized before the segment
is created and populated after. Every such engine therefore splits the
resource hook into a request/sizing phase and a startup/init phase,
and forbids the request API outside its window. PostgreSQL enforces this
with process_shmem_requests_in_progress guarding
RequestAddinShmemSpace.
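The request-window fence can be illustrated with a small standalone sketch. All names here (request_space, process_requests, my_request) are hypothetical, and the sketch returns false on misuse where a real server would raise a fatal error; it is meant only to show how a flag turns a timing rule into a checked invariant.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this mirrors the request phase only. */
static bool   requests_in_progress = false;
static size_t total_requested = 0;

typedef void (*request_hook_type)(void);
request_hook_type request_hook = NULL;

/* Sizing API: legal only while the request window is open. */
bool
request_space(size_t size)
{
    if (!requests_in_progress)
        return false;   /* a real server would report a fatal error here */
    total_requested += size;
    return true;
}

/* Open the window, let the module ask for space, close the window. */
void
process_requests(void)
{
    requests_in_progress = true;
    if (request_hook)
        request_hook();
    requests_in_progress = false;
}

/* Total bytes requested so far, for sizing the segment afterwards. */
size_t
requested_total(void)
{
    return total_requested;
}

/* Example module callback: reserves 4096 bytes during the window. */
void
my_request(void)
{
    (void) request_space(4096);
}
```

A call to request_space() before or after process_requests() is rejected and leaves the total untouched, which is the property the sizing computation depends on.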
    flowchart TD
      subgraph core["Core engine"]
        disp["dispatcher Foo()<br/>if (Foo_hook) call hook<br/>else standard_Foo()"]
        std["standard_Foo()<br/>the real implementation"]
      end
      subgraph modA["Module A (_PG_init)"]
        a_save["prevA = Foo_hook"]
        a_set["Foo_hook = A_fn"]
        a_fn["A_fn(): work;<br/>prevA ? prevA() : standard_Foo()"]
      end
      subgraph modB["Module B (_PG_init, loaded later)"]
        b_save["prevB = Foo_hook (== A_fn)"]
        b_set["Foo_hook = B_fn"]
        b_fn["B_fn(): work;<br/>prevB ? prevB() : standard_Foo()"]
      end
      disp -->|hook unset| std
      disp -->|hook set| b_fn
      b_fn --> a_fn
      a_fn --> std
      a_save --> a_set --> a_fn
      b_save --> b_set --> b_fn
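The two-module chain in the diagram can be reproduced as a runnable sketch. The seam foo_hook and the modules module_a_init / module_b_init are hypothetical stand-ins, and the arithmetic each module performs is arbitrary; it is chosen only so that the call order is visible in the result.

```c
#include <stddef.h>

/* Hypothetical seam and modules; names are illustrative only. */
typedef int (*foo_hook_type)(int x);
foo_hook_type foo_hook = NULL;

/* The default implementation: identity. */
int
standard_foo(int x)
{
    return x;
}

/* Dispatcher: call the hook if one is installed, else the default. */
int
foo(int x)
{
    return foo_hook ? foo_hook(x) : standard_foo(x);
}

/* Module A: delegates down the chain, then adds 1. */
static foo_hook_type prev_a = NULL;
static int
a_fn(int x)
{
    return (prev_a ? prev_a(x) : standard_foo(x)) + 1;
}
void
module_a_init(void)
{
    prev_a = foo_hook;  /* save */
    foo_hook = a_fn;    /* chain in */
}

/* Module B: delegates down the chain, then multiplies by 10. */
static foo_hook_type prev_b = NULL;
static int
b_fn(int x)
{
    return (prev_b ? prev_b(x) : standard_foo(x)) * 10;
}
void
module_b_init(void)
{
    prev_b = foo_hook;  /* save: this is a_fn if A loaded first */
    foo_hook = b_fn;    /* chain in */
}
```

With A loaded before B, foo(2) evaluates as (2 + 1) * 10: the last module loaded is the outermost interceptor. Swapping the load order would give a different result, which is why load order is part of the contract between chained modules.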
PostgreSQL’s Approach
PostgreSQL’s hook mechanism has no central machinery at all. There is no
hooks.c. Each hook is a PGDLLIMPORT global declared in the header of
the subsystem it taps and defined (initialized to NULL) in that
subsystem’s .c file. The “system” is a convention, repeated
identically dozens of times across the tree.
The canonical dispatcher/default split
The query-optimizer entry point is the archetype. planner() is a
five-line dispatcher; standard_planner() is the thousand-line real
planner. The hook variable sits beside them, NULL until a module claims
it.
    // planner_hook + planner(): src/backend/optimizer/plan/planner.c
    /* Hook for plugins to get control in planner() */
    planner_hook_type planner_hook = NULL;

    PlannedStmt *
    planner(Query *parse, const char *query_string, int cursorOptions,
            ParamListInfo boundParams)
    {
        PlannedStmt *result;

        if (planner_hook)
            result = (*planner_hook) (parse, query_string, cursorOptions, boundParams);
        else
            result = standard_planner(parse, query_string, cursorOptions, boundParams);

        pgstat_report_plan_id(result->planId, false);
        return result;
    }

The hook’s type is published in the public header so an out-of-tree
module gets the exact signature and the PGDLLIMPORT storage-class
marker needed to bind the symbol on every platform:

    // planner_hook_type: src/include/optimizer/planner.h
    /* Hook for plugins to get control in planner() */
    typedef PlannedStmt *(*planner_hook_type) (Query *parse,
                                               const char *query_string,
                                               int cursorOptions,
                                               ParamListInfo boundParams);
    extern PGDLLIMPORT planner_hook_type planner_hook;

Note the in-source guidance to plugin authors right above planner():
“standard_planner() scribbles on its Query input, so you’d better copy
that data structure if you want to plan more than once.” The hook
contract includes such caveats because the module is now responsible for
the same invariants the core would otherwise uphold.
The executor’s four-phase hook set
The executor exposes the same idiom at each of its lifecycle phases.
ExecutorStart, ExecutorRun, ExecutorFinish, and ExecutorEnd each
have a paired standard_* and a NULL-initialized hook. A single module
typically claims all four to bracket a query’s execution with timing or
instrumentation.
    // Executor hook variables: src/backend/executor/execMain.c
    /* Hooks for plugins to get control in ExecutorStart/Run/Finish/End */
    ExecutorStart_hook_type ExecutorStart_hook = NULL;
    ExecutorRun_hook_type ExecutorRun_hook = NULL;
    ExecutorFinish_hook_type ExecutorFinish_hook = NULL;
    ExecutorEnd_hook_type ExecutorEnd_hook = NULL;

    /* Hook for plugin to get control in ExecCheckPermissions() */
    ExecutorCheckPerms_hook_type ExecutorCheckPerms_hook = NULL;

    // ExecutorRun() dispatcher: src/backend/executor/execMain.c
    void
    ExecutorRun(QueryDesc *queryDesc, ScanDirection direction, uint64 count)
    {
        if (ExecutorRun_hook)
            (*ExecutorRun_hook) (queryDesc, direction, count);
        else
            standard_ExecutorRun(queryDesc, direction, count);
    }

ExecutorCheckPerms_hook is a slightly different shape: it is not a
wrap-the-default hook but an augment-after-core hook. The core runs
its full permission check first, and only if the built-in check passed
does it consult the hook for an additional verdict. The module cannot
grant access the core denied; it can only add a denial (a row-level
security or auditing extension uses this to layer policy on top).
    // ExecCheckPermissions() tail: src/backend/executor/execMain.c
        foreach(l, rteperminfos)
        {
            RTEPermissionInfo *perminfo = lfirst_node(RTEPermissionInfo, l);

            result = ExecCheckOneRelPerms(perminfo);
            if (!result)
            {
                if (ereport_on_violation)
                    aclcheck_error(/* ... */);
                return false;
            }
        }

        if (ExecutorCheckPerms_hook)
            result = (*ExecutorCheckPerms_hook) (rangeTable, rteperminfos,
                                                 ereport_on_violation);
        return result;

The utility-command hook
DDL and other non-SELECT/INSERT/UPDATE/DELETE statements flow through
ProcessUtility(), which has the same dispatcher/standard split. This
is the seam audit and replication extensions tap to observe CREATE TABLE, DROP, GRANT, and the like.
    // ProcessUtility() dispatcher: src/backend/tcop/utility.c
    ProcessUtility_hook_type ProcessUtility_hook = NULL;

    void
    ProcessUtility(PlannedStmt *pstmt,
                   const char *queryString,
                   bool readOnlyTree,
                   ProcessUtilityContext context,
                   ParamListInfo params,
                   QueryEnvironment *queryEnv,
                   DestReceiver *dest,
                   QueryCompletion *qc)
    {
        /* ... asserts ... */
        if (ProcessUtility_hook)
            (*ProcessUtility_hook) (pstmt, queryString, readOnlyTree,
                                    context, params, queryEnv,
                                    dest, qc);
        else
            standard_ProcessUtility(pstmt, queryString, readOnlyTree,
                                    context, params, queryEnv,
                                    dest, qc);
    }

The header comment for ProcessUtility carries a sharp warning that the
same queryString may be passed to multiple invocations (one per
semicolon-separated statement), and that some commands recurse into
ProcessUtility for sub-statements — so a hook that wants to identify
“its” statement must use pstmt->stmt_location and pstmt->stmt_len,
not the raw string. Again, the hook contract pushes correctness
obligations onto the module.
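As a rough illustration of keying on location and length instead of the raw string, the helper below slices one statement out of a multi-statement query string. The function statement_text is hypothetical, not a PostgreSQL API. It assumes the convention that a negative location means "unknown" (treated here as the start of the string) and that a length of 0 means "to the end of the string".

```c
#include <stdlib.h>
#include <string.h>

/*
 * Return a freshly allocated copy of one statement's text, or NULL if
 * allocation fails. Hypothetical helper, not a PostgreSQL function.
 * Assumed convention: location is a byte offset (negative when unknown)
 * and len == 0 means "to the end of the string".
 */
char *
statement_text(const char *query_string, int location, int len)
{
    size_t total = strlen(query_string);
    size_t start = (location < 0) ? 0 : (size_t) location;
    size_t n;
    char  *copy;

    /* Clamp the start and the length so we never read past the string. */
    if (start > total)
        start = total;
    n = (len <= 0) ? total - start : (size_t) len;
    if (n > total - start)
        n = total - start;

    copy = malloc(n + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, query_string + start, n);
    copy[n] = '\0';
    return copy;
}
```

For the string "CREATE TABLE t(a int); DROP TABLE t", offset 0 with length 21 yields the first statement and offset 23 with length 0 yields the second; the caller owns the returned buffer and must free it.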
Diagram: where the query-path hooks sit on the lifecycle
    flowchart LR
      q["parsed + rewritten Query"] --> P["planner()<br/>planner_hook"]
      P --> sp["standard_planner()"]
      sp --> PS["PlannedStmt"]
      PS --> ES["ExecutorStart()<br/>ExecutorStart_hook"]
      ES --> CP["ExecCheckPermissions()<br/>ExecutorCheckPerms_hook"]
      CP --> ER["ExecutorRun()<br/>ExecutorRun_hook"]
      ER --> EF["ExecutorFinish()<br/>ExecutorFinish_hook"]
      EF --> EE["ExecutorEnd()<br/>ExecutorEnd_hook"]
      PS -.->|"CMD_UTILITY (DDL etc.)"| PU["ProcessUtility()<br/>ProcessUtility_hook"]
The query-path hooks are unconditional dispatchers: they fire on every
plan/execute regardless of when the module was loaded. The planner and
executor seams are detailed further in postgres-planner-overview.md and
postgres-executor.md; here the point is only the shape of the tap, not
the machinery it wraps.
The two shared-memory hooks are phase-gated
Most hooks are time-agnostic, but a module that wants its own slice of the main shared-memory segment cannot allocate it whenever it likes. In a fork-based server the segment is sized once, created once by the postmaster, and then inherited by every backend. So the shmem extension surface is split into two hooks fired at two distinct moments of postmaster startup, and the sizing API is fenced to its window.
The first is shmem_request_hook, fired from process_shmem_requests().
Its body is the whole “system”: flip a guard flag, call the hook, clear the
flag.
    // process_shmem_requests(): src/backend/utils/init/miscinit.c
    void
    process_shmem_requests(void)
    {
        process_shmem_requests_in_progress = true;
        if (shmem_request_hook)
            shmem_request_hook();
        process_shmem_requests_in_progress = false;
    }

That process_shmem_requests_in_progress flag is the fence.
RequestAddinShmemSpace() — the only legitimate way to enlarge the segment
— refuses to run outside the window, turning a timing rule into an
enforced invariant rather than a documentation footnote:
    // RequestAddinShmemSpace(): src/backend/storage/ipc/ipci.c
    void
    RequestAddinShmemSpace(Size size)
    {
        if (!process_shmem_requests_in_progress)
            elog(FATAL, "cannot request additional shared memory outside shmem_request_hook");
        total_addin_request = add_size(total_addin_request, size);
    }

The postmaster calls process_shmem_requests() at a precise point: after
InitializeMaxBackends() and InitializeFastPathLocks() have fixed the
backend count, but before InitializeShmemGUCs() and the actual segment
creation — so that every module’s request is folded into the one size
computation:
    // PostmasterMain() startup ordering: src/backend/postmaster/postmaster.c
        InitializeMaxBackends();
        InitPostmasterChildSlots();
        InitializeFastPathLocks();

        /* Give preloaded libraries a chance to request additional shared memory. */
        process_shmem_requests();

        /* ... InitializeShmemGUCs(); then later CreateSharedMemoryAndSemaphores() */

The second hook, shmem_startup_hook, fires at the tail of
CreateSharedMemoryAndSemaphores() — once the segment exists and the core
structures are laid in, the module gets its turn to carve out and
initialize the space it reserved in phase one (typically via
ShmemInitStruct under an AddinShmemInitLock):
    // CreateSharedMemoryAndSemaphores() tail: src/backend/storage/ipc/ipci.c
        /* Initialize subsystems */
        CreateOrAttachShmemStructs();

        /* Initialize dynamic shared memory facilities. */
        dsm_postmaster_startup(shim);

        /*
         * Now give loadable modules a chance to set up their shmem allocations
         */
        if (shmem_startup_hook)
            shmem_startup_hook();

Because both shmem hooks only fire during postmaster startup, a module that
installs them is only useful when listed in shared_preload_libraries.
The wider shared-memory and IPC substrate is the subject of
postgres-shared-memory-ipc.md; the hooks are merely its two extension
seams. pg_stat_statements is the canonical in-tree user of both —
shmem_request_hook to size its hash table and shmem_startup_hook to
attach it — but it is contrib/, out of scope here, named only as an
example of the pattern.
The authentication hook: post-verdict, observe-or-veto
ClientAuthentication_hook taps the very end of ClientAuthentication(),
after the core has computed an authentication status. Like
ExecutorCheckPerms_hook it is not a wrap-the-default seam — the core does
the whole authentication itself, then hands the module the resulting
Port and status. A module can log the attempt, enforce an extra policy,
or ereport(FATAL, ...) to veto an otherwise-successful login; it cannot
itself synthesize a STATUS_OK out of a failure.
    // ClientAuthentication() tail: src/backend/libpq/auth.c
        if (ClientAuthentication_hook)
            (*ClientAuthentication_hook) (port, status);

        if (status == STATUS_OK)
            sendAuthRequest(port, AUTH_REQ_OK, NULL, 0);
        else
            auth_failed(port, status, logdetail);

Its type erases nothing: it is (Port *, int), so the module sees both the
connection descriptor and the raw verdict:

    // ClientAuthentication_hook_type: src/include/libpq/auth.h
    typedef void (*ClientAuthentication_hook_type) (Port *, int);
    extern PGDLLIMPORT ClientAuthentication_hook_type ClientAuthentication_hook;

Installing a hook: _PG_init and the load chain
A hook variable is only useful if something assigns it. That something is
the module’s _PG_init(), the conventional entry point the loader calls
exactly once when the .so is first mapped. The whole load chain is in
internal_load_library(): dlopen the file, find and validate the
Pg_magic_func ABI block, and only then dlsym("_PG_init") and call it.
    // internal_load_library(), ABI check then _PG_init: src/backend/utils/fmgr/dfmgr.c
        /* Check the magic function to determine compatibility */
        magic_func = (PGModuleMagicFunction)
            dlsym(file_scanner->handle, PG_MAGIC_FUNCTION_NAME_STRING);
        if (magic_func)
        {
            const Pg_magic_struct *magic_data_ptr = (*magic_func) ();

            /* Check ABI compatibility fields */
            if (magic_data_ptr->len != sizeof(Pg_magic_struct) ||
                memcmp(&magic_data_ptr->abi_fields, &magic_data,
                       sizeof(Pg_abi_values)) != 0)
            {
                Pg_magic_struct module_magic_data = *magic_data_ptr;

                dlclose(file_scanner->handle);
                free(file_scanner);
                incompatible_module_error(libname, &module_magic_data.abi_fields);
            }
            file_scanner->magic = magic_data_ptr;
        }
        else
        {
            dlclose(file_scanner->handle);
            free(file_scanner);
            ereport(ERROR,
                    (errmsg("incompatible library \"%s\": missing magic block",
                            libname),
                     errhint("Extension libraries are required to use the PG_MODULE_MAGIC macro.")));
        }

        /* If the library has a _PG_init() function, call it. */
        PG_init = (PG_init_t) dlsym(file_scanner->handle, "_PG_init");
        if (PG_init)
            (*PG_init) ();

The PG_MODULE_MAGIC macro a module is required to write emits exactly the
Pg_magic_func the loader looks for; it bakes the build’s ABI fields
(major version, FUNC_MAX_ARGS, INDEX_MAX_KEYS, NAMEDATALEN,
FLOAT8PASSBYVAL) into the .so, which the memcmp above compares against
the server’s own magic_data:
    // PG_MODULE_MAGIC: src/include/fmgr.h
    #define PG_MODULE_MAGIC \
    extern PGDLLEXPORT const Pg_magic_struct *PG_MAGIC_FUNCTION_NAME(void); \
    const Pg_magic_struct * \
    PG_MAGIC_FUNCTION_NAME(void) \
    { \
        static const Pg_magic_struct Pg_magic_data = PG_MODULE_MAGIC_DATA(.name = NULL); \
        return &Pg_magic_data; \
    } \
    extern int no_such_variable

Inside _PG_init, the install follows the save-and-chain convention.
The module saves whatever value the hook currently holds (NULL, or a
previously-loaded module’s function) into a file-static prev_*, then
overwrites the global with its own function. Its function does its work and
then calls prev if set, else the standard_* default — so N modules form
a load-ordered interceptor stack threaded through a single pointer:
    // canonical save-and-chain idiom (shape used by every hook module)
    static planner_hook_type prev_planner_hook = NULL;

    /* Forward declaration so _PG_init can take the function's address. */
    static PlannedStmt *my_planner(Query *parse, const char *qs, int opts,
                                   ParamListInfo bp);

    void
    _PG_init(void)
    {
        prev_planner_hook = planner_hook;   /* save */
        planner_hook = my_planner;          /* chain in */
    }

    static PlannedStmt *
    my_planner(Query *parse, const char *qs, int opts, ParamListInfo bp)
    {
        PlannedStmt *result;

        /* ... pre-work ... */
        if (prev_planner_hook)
            result = prev_planner_hook(parse, qs, opts, bp);
        else
            result = standard_planner(parse, qs, opts, bp);
        /* ... post-work ... */
        return result;
    }

Two preload windows feed this chain. shared_preload_libraries is loaded
once in the postmaster before any fork — the only timing at which the two
shmem hooks are meaningful — by process_shared_preload_libraries().
session_preload_libraries / local_preload_libraries are loaded
per-backend by process_session_preload_libraries(); both ultimately call
the same load_libraries() → load_file() → internal_load_library()
chain shown above, and a module loaded at any of these times (or even
lazily by an explicit LOAD command) can install the query-path and auth
hooks, since those fire on every relevant operation.
Source Walkthrough
This section follows the hook machinery as a set of stable symbols,
grouped by the seam they implement. Every hook is the same triple: a
PGDLLIMPORT global pointer (NULL default) declared in a header, the
dispatcher that tests it, and — for wrap-the-default hooks — a
standard_* carrying the real logic. The load chain (internal_load_library
→ _PG_init) is shared by all of them.
Planner seam
- planner_hook: global pointer, defined = NULL in planner.c.
- planner_hook_type: typedef in planner.h; the published signature PlannedStmt *(*)(Query *, const char *, int, ParamListInfo).
- planner(): the dispatcher, if (planner_hook) (*planner_hook)(...) else standard_planner(...), then pgstat_report_plan_id().
- standard_planner(): the real optimizer entry, exported so a module can delegate. The in-source caveat (“scribbles on its Query input”) is part of the hook contract: an observer that re-plans must copyObject the Query.
Executor seam (four phases + permission augment)
- ExecutorStart_hook, ExecutorRun_hook, ExecutorFinish_hook, ExecutorEnd_hook: four globals defined together in execMain.c.
- standard_ExecutorStart/Run/Finish/End: the four defaults; the dispatchers ExecutorStart/Run/Finish/End each test their hook and fall back to the matching standard_*.
- ExecutorCheckPerms_hook: different shape. ExecCheckPermissions() runs the full built-in ACL check first and only consults the hook after a pass, so the hook can add a denial but never grant. Type bool (*)(List *rangeTable, List *rteperminfos, bool ereport_on_violation).
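The "add a denial but never grant" shape of the permission hook can be sketched without any PostgreSQL headers. Everything below is hypothetical: the check_hook pointer, the toy core_check rule, and the two example policies exist only to show that the hook is consulted after the built-in check has passed and never before.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; a sketch of "core first, hook may only deny". */
typedef bool (*check_hook_type)(int user_id, int object_id);
check_hook_type check_hook = NULL;

/* Toy built-in rule: a user may touch only objects whose id is <= their own. */
static bool
core_check(int user_id, int object_id)
{
    return object_id <= user_id;
}

bool
check_permissions(int user_id, int object_id)
{
    if (!core_check(user_id, object_id))
        return false;   /* core denial is final; the hook is never consulted */
    if (check_hook)
        return check_hook(user_id, object_id);
    return true;
}

/* Example module policy: additionally forbid object 0 to everyone. */
bool
deny_object_zero(int user_id, int object_id)
{
    (void) user_id;
    return object_id != 0;
}

/* A hook that always says yes, to show it cannot override a core denial. */
bool
allow_everything(int user_id, int object_id)
{
    (void) user_id;
    (void) object_id;
    return true;
}
```

Installing allow_everything does not widen access, because the early return on a core denial happens before the pointer is tested; installing deny_object_zero narrows it.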
Utility-command seam
- ProcessUtility_hook: global in utility.c; dispatcher ProcessUtility() falls back to standard_ProcessUtility(). The header warns the same queryString may be reused across statements and that commands recurse, so a hook keys on pstmt->stmt_location / pstmt->stmt_len, not the string.
Shared-memory seam (phase-gated)
- shmem_request_hook: global in miscinit.c, fired from process_shmem_requests(), which brackets the call with process_shmem_requests_in_progress = true/false.
- RequestAddinShmemSpace(): the sizing API, fenced by that flag; a call outside the window is elog(FATAL). Accumulates into total_addin_request.
- shmem_startup_hook: global in ipci.c, fired at the tail of CreateSharedMemoryAndSemaphores() (and the EXEC_BACKEND attach path) once the segment exists.
- Postmaster ordering: PostmasterMain() calls process_shmem_requests() after InitializeMaxBackends() / InitializeFastPathLocks() and before InitializeShmemGUCs(), so all requests fold into one size computation.
Authentication seam
- ClientAuthentication_hook: global in auth.c, called at the tail of ClientAuthentication() with (port, status) after the verdict is fixed. Observe-or-veto: it may ereport(FATAL) but cannot upgrade a failure to STATUS_OK.
Load + ABI machinery (shared by every hook module)
- _PG_init: the conventional per-module entry, centrally declared PGDLLEXPORT in fmgr.h; the loader dlsyms and calls it once per .so.
- internal_load_library(): the core loader. dlopen, find Pg_magic_func, memcmp its Pg_abi_values against the server’s magic_data, incompatible_module_error() on mismatch, then dlsym and call _PG_init.
- PG_MODULE_MAGIC / PG_MODULE_MAGIC_DATA / Pg_magic_struct / Pg_abi_values: the ABI block macro and structs; PG_MODULE_ABI_DATA stamps major version, FUNC_MAX_ARGS, INDEX_MAX_KEYS, NAMEDATALEN, FLOAT8PASSBYVAL.
- load_external_function() / load_file(): public entry points that wrap internal_load_library(); the former also dlsyms a named function.
- process_shared_preload_libraries() (postmaster, pre-fork) and process_session_preload_libraries() (per-backend): the two preload windows, both routing through load_libraries() → load_file().
    flowchart TD
      pre["shared_preload_libraries GUC"] --> psp["process_shared_preload_libraries()"]
      psp --> ll["load_libraries() -> load_file()"]
      ll --> ilib["internal_load_library()"]
      ilib --> dl["dlopen(.so)"]
      dl --> mg["Pg_magic_func ABI check<br/>memcmp vs server magic_data"]
      mg -->|mismatch| err["incompatible_module_error (ERROR)"]
      mg -->|match| pi["dlsym _PG_init; call it"]
      pi --> save["prev = the_hook;<br/>the_hook = my_fn"]
      save --> later["later: dispatcher fires my_fn<br/>my_fn calls prev or standard_*"]
Position hints (as of 2026-06-05, REL_18 273fe94)
| Symbol | File | Line |
|---|---|---|
| planner_hook (def) | src/backend/optimizer/plan/planner.c | 74 |
| planner() | src/backend/optimizer/plan/planner.c | 305 |
| standard_planner() | src/backend/optimizer/plan/planner.c | 321 |
| planner_hook_type (typedef) | src/include/optimizer/planner.h | 26 |
| ExecutorStart_hook (def) | src/backend/executor/execMain.c | 68 |
| ExecutorCheckPerms_hook (def) | src/backend/executor/execMain.c | 74 |
| ExecutorRun() | src/backend/executor/execMain.c | 297 |
| standard_ExecutorRun() | src/backend/executor/execMain.c | 307 |
| ExecCheckPermissions() | src/backend/executor/execMain.c | 582 |
| ProcessUtility_hook (def) | src/backend/tcop/utility.c | 70 |
| ProcessUtility() | src/backend/tcop/utility.c | 499 |
| standard_ProcessUtility() | src/backend/tcop/utility.c | 543 |
| shmem_startup_hook (def) | src/backend/storage/ipc/ipci.c | 58 |
| RequestAddinShmemSpace() | src/backend/storage/ipc/ipci.c | 74 |
| CreateSharedMemoryAndSemaphores() (hook tail) | src/backend/storage/ipc/ipci.c | 248 |
| shmem_request_hook (def) | src/backend/utils/init/miscinit.c | 1841 |
| process_shmem_requests_in_progress (def) | src/backend/utils/init/miscinit.c | 1842 |
| process_shmem_requests() | src/backend/utils/init/miscinit.c | 1931 |
| process_shared_preload_libraries() | src/backend/utils/init/miscinit.c | ~1900 |
| process_shmem_requests() call | src/backend/postmaster/postmaster.c | 962 |
| ClientAuthentication_hook (def) | src/backend/libpq/auth.c | 223 |
| ClientAuthentication_hook call | src/backend/libpq/auth.c | 663 |
| ClientAuthentication_hook_type (typedef) | src/include/libpq/auth.h | 45 |
| load_external_function() | src/backend/utils/fmgr/dfmgr.c | 95 |
| internal_load_library() | src/backend/utils/fmgr/dfmgr.c | 189 |
| _PG_init call | src/backend/utils/fmgr/dfmgr.c | 297 |
| _PG_init (central decl) | src/include/fmgr.h | 434 |
| PG_MODULE_MAGIC (macro) | src/include/fmgr.h | 520 |
| Pg_abi_values (struct) | src/include/fmgr.h | 467 |
Source verification (as of 2026-06-05)
Every claim and excerpt above was checked against the REL_18 working tree
at /data/hgryoo/references/postgres, commit 273fe94852b3a7e34fd171e8abdf1481beb302fa
(REL_18_STABLE, 2026-06-05). The verification points:
- planner() is a thin dispatcher. Confirmed at planner.c:305: the body is the if (planner_hook) ... else standard_planner(...) branch plus the pgstat_report_plan_id() call. standard_planner() begins at line 321. The planner_hook_type typedef and PGDLLIMPORT decl are at planner.h:26.
- Four executor hooks plus the permission hook are defined together. ExecutorStart_hook at execMain.c:68; ExecutorCheckPerms_hook at line 74. ExecutorRun() (line 297) falls back to standard_ExecutorRun() (line 307). ExecCheckPermissions() (line 582) runs the per-relation ExecCheckOneRelPerms() loop first and only calls (*ExecutorCheckPerms_hook)(...) after the built-in check passed, verifying the “augment-after-core, cannot grant” claim.
- ProcessUtility() mirrors the planner split. ProcessUtility_hook at utility.c:70, dispatcher at line 499, standard_ProcessUtility() at line 543.
- The shmem request fence is real. RequestAddinShmemSpace() (ipci.c:74) opens with if (!process_shmem_requests_in_progress) elog(FATAL, "cannot request additional shared memory outside shmem_request_hook");. The flag is toggled only inside process_shmem_requests() (miscinit.c:1931), whose three-line body is quoted verbatim above. shmem_request_hook is defined at miscinit.c:1841, immediately followed by the flag at line 1842.
- shmem_startup_hook fires after segment creation. Defined at ipci.c:58; invoked at the tail of CreateSharedMemoryAndSemaphores() (call at line 248) and on the EXEC_BACKEND attach path (line 190). The postmaster’s process_shmem_requests() call is at postmaster.c:962, sequenced after InitializeFastPathLocks() and before InitializeShmemGUCs() exactly as described.
- ClientAuthentication_hook is post-verdict. Defined at auth.c:223; the call (*ClientAuthentication_hook)(port, status) is at lines 663–664, after status is finalized and before sendAuthRequest / auth_failed. The (Port *, int) typedef is at auth.h:45.
- The load chain enforces the ABI check before _PG_init. internal_load_library() (dfmgr.c:189) dlsyms Pg_magic_func, memcmps Pg_abi_values against the server’s static magic_data (dfmgr.c:78), calls incompatible_module_error() on mismatch, and only then dlsyms and calls _PG_init (line 297). PG_MODULE_MAGIC (fmgr.h:520) expands to the Pg_magic_func definition; _PG_init is centrally declared PGDLLEXPORT at fmgr.h:434.
- Hook variables are PGDLLIMPORT. Spot-checked planner_hook (planner.h), ClientAuthentication_hook (auth.h), shmem_startup_hook (ipc.h:78), and shmem_request_hook (miscadmin.h:534): all carry the extern PGDLLIMPORT storage marker so out-of-tree modules bind them on every platform.
Note on the save-and-chain excerpt: the my_planner / _PG_init block in
PostgreSQL’s Approach is a composite showing the idiom every hook module
follows; it is not copied from one core file (the core defines the seam, not
the modules that bind it). All other C excerpts are condensed-but-verbatim
from the cited core files.
Scope note: pg_stat_statements, auto_explain, passwordcheck, and
pgaudit are named only as familiar users of these hooks. They live under
contrib/ and are out of scope for this core-only document.
Beyond PostgreSQL — Comparative Designs & Research Frontiers
PostgreSQL’s single-function-pointer hook is one resolution of the
extensibility tension framed in Architecture of a Database System
(dbms-papers/fntdb07-architecture.md). Placing it next to other engines
and to the research literature clarifies what the design buys and what it
forgoes.
MySQL / MariaDB: typed plugin descriptors and an audit API
Where PostgreSQL exposes raw C pointers and a save-and-chain convention,
MySQL’s plugin API is a registry. A plugin ships a descriptor struct
(st_mysql_plugin) naming its type (storage engine, full-text parser,
audit, authentication) with init/deinit callbacks, and the server keeps
a managed table of plugins per type. The audit plugin interface is the
closest analog to PostgreSQL’s observer hooks: the server multiplexes
event delivery to every registered audit plugin itself, so plugins never
chain through a shared pointer. The trade is explicit: MySQL pays a registry
data structure and per-event iteration to get core-managed ordering and
clean unload; PostgreSQL pays nothing on the unused hot path but pushes
multiplexing and lifetime onto module authors (which is why PostgreSQL hooks
are rarely uninstalled — there is no unload protocol, and
internal_load_library never dlcloses a successfully loaded module).
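The trade between the two delivery styles can be sketched in a few lines of standalone C. This is a toy model under stated assumptions, not MySQL or PostgreSQL code: event_fn, event_hook, registry_add(), and the two observers are hypothetical names chosen for illustration. In the first style each module saves the previous pointer and forwards to it; in the second the core owns a table and iterates over it.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*event_fn)(int event, int *counter);

/* Two toy observers; each bumps a counter by a distinct amount. */
static void observe_one(int event, int *counter) { (void) event; *counter += 1; }
static void observe_ten(int event, int *counter) { (void) event; *counter += 10; }

/* ---- Style 1: one shared pointer, modules save and chain ---- */
static event_fn event_hook = NULL;

static event_fn prev_one = NULL;
static void chained_one(int event, int *counter)
{
    observe_one(event, counter);
    if (prev_one)                       /* forwarding is the module's job */
        prev_one(event, counter);
}
static void install_one(void) { prev_one = event_hook; event_hook = chained_one; }

static event_fn prev_ten = NULL;
static void chained_ten(int event, int *counter)
{
    observe_ten(event, counter);
    if (prev_ten)
        prev_ten(event, counter);
}
static void install_ten(void) { prev_ten = event_hook; event_hook = chained_ten; }

/* Core call site: a single NULL test when nothing is installed. */
static void core_fire_hook(int event, int *counter)
{
    if (event_hook)
        event_hook(event, counter);
}

/* ---- Style 2: core-owned table, core iterates ---- */
#define MAX_PLUGINS 8
static event_fn registry[MAX_PLUGINS];
static size_t n_plugins = 0;

static int registry_add(event_fn fn)
{
    assert(fn != NULL);
    if (n_plugins == MAX_PLUGINS)
        return -1;                      /* table full */
    registry[n_plugins++] = fn;
    return 0;
}

/* Core call site: delivery order is the core's registration order. */
static void core_fire_registry(int event, int *counter)
{
    for (size_t i = 0; i < n_plugins; i++)
        registry[i](event, counter);
}
```

Both styles deliver the event to both observers here. The difference is where the bookkeeping lives: if chained_ten omitted its forwarding call, observe_one would silently stop receiving events, whereas the registry loop cannot be short-circuited by a plugin.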
SQLite: compile-time hooks and per-connection callbacks
SQLite, an in-process library rather than a server, exposes a different mix:
some seams are run-time per-connection callbacks registered through the API
(sqlite3_set_authorizer, sqlite3_trace_v2, sqlite3_commit_hook,
sqlite3_update_hook), and others are compile-time virtual-table and function
registrations. The authorizer callback is a striking parallel to
PostgreSQL’s ExecutorCheckPerms_hook — it is consulted during statement
preparation and may return SQLITE_DENY to veto, but cannot widen access.
The difference is granularity of scope: SQLite’s callbacks are attached to a
sqlite3* connection handle, not to a process-global pointer, because there
is no shared-memory server to coordinate.
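The scoping difference can be illustrated with a small standalone model. It deliberately does not call the SQLite API: conn, conn_set_authorizer(), conn_prepare(), and the AUTH_* codes are hypothetical stand-ins, assumed only for this sketch. The point is that the callback is stored on the handle, so two handles in the same process can carry different policies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { AUTH_OK = 0, AUTH_DENY = 1 };

typedef int (*authorizer_fn)(void *user_data, const char *table);

/* Hypothetical connection handle: the callback lives on the handle,
 * not in a process-global variable. */
typedef struct conn {
    authorizer_fn authorizer;
    void *user_data;
} conn;

static void conn_set_authorizer(conn *c, authorizer_fn fn, void *user_data)
{
    assert(c != NULL);
    c->authorizer = fn;
    c->user_data = user_data;
}

/* Consulted while "preparing" a statement that touches `table`.
 * With no callback installed the access is allowed; an installed
 * callback can only veto it. */
static int conn_prepare(const conn *c, const char *table)
{
    if (c->authorizer && c->authorizer(c->user_data, table) != AUTH_OK)
        return AUTH_DENY;
    return AUTH_OK;
}

/* Example policy: deny access to the one table named in user_data. */
static int deny_named_table(void *user_data, const char *table)
{
    const char *blocked = user_data;

    return strcmp(table, blocked) == 0 ? AUTH_DENY : AUTH_OK;
}
```

Installing deny_named_table on one handle leaves every other handle untouched, which is the property a process-global hook pointer cannot offer without extra per-session state inside the hook function.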
Extension density and the “thin core” thesis
The hook pattern is the mechanism behind PostgreSQL’s unusually deep
extension ecosystem — index access methods (pluggable index AMs), table
access methods, foreign data wrappers, custom scan providers, background
workers, and the planner/executor/utility taps documented here all rest on
the same “publish a stable seam, let .so code bind it” philosophy traced
to Berkeley POSTGRES (Stonebraker & Rowe 1986, “The Design of POSTGRES”).
The research-frontier tension is that hooks are uncoordinated: two modules
that both reorder the planner, or both rewrite plans, interact only through
load order, with no conflict detection. Academic work on composable query
optimizers and extensible cost models (e.g., the long line from Graefe’s
Volcano/Cascades framework onward) argues for a structured rule registry
where extensions declare what they transform — the inverse of PostgreSQL’s
deliberately unstructured pointer. PostgreSQL chooses the unstructured form
because it is auditable in five lines per seam and free when unused; the
cost is that correctness of composition is entirely the modules’
responsibility.
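The load-order dependence is easy to reproduce in miniature. The following standalone sketch assumes a hypothetical integer-valued cost_hook and two hypothetical modules, "dbl" and "pad"; none of these names or signatures come from PostgreSQL. Each module follows the save-and-chain idiom correctly, yet the composed result still depends on which one was loaded first.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*cost_hook_type)(int rows);
static cost_hook_type cost_hook = NULL;

/* Built-in behavior used when no module is installed. */
static int standard_cost(int rows) { return rows; }

/* Core call site: use the hook if set, else the built-in. */
static int core_cost(int rows)
{
    return cost_hook ? cost_hook(rows) : standard_cost(rows);
}

/* Module "dbl": doubles whatever the rest of the chain returns. */
static cost_hook_type dbl_prev = NULL;
static int dbl_cost(int rows)
{
    int inner = dbl_prev ? dbl_prev(rows) : standard_cost(rows);

    return inner * 2;
}
static void dbl_init(void) { dbl_prev = cost_hook; cost_hook = dbl_cost; }

/* Module "pad": adds 5 to whatever the rest of the chain returns. */
static cost_hook_type pad_prev = NULL;
static int pad_cost(int rows)
{
    int inner = pad_prev ? pad_prev(rows) : standard_cost(rows);

    return inner + 5;
}
static void pad_init(void) { pad_prev = cost_hook; cost_hook = pad_cost; }

/* Reset the toy state, load both modules in the requested order, and
 * return the cost the core would then compute. */
static int cost_after_loading(int dbl_first, int rows)
{
    assert(rows >= 0);
    cost_hook = NULL;
    dbl_prev = NULL;
    pad_prev = NULL;
    if (dbl_first) {
        dbl_init();
        pad_init();             /* pad is outermost: (rows * 2) + 5 */
    } else {
        pad_init();
        dbl_init();             /* dbl is outermost: (rows + 5) * 2 */
    }
    return core_cost(rows);
}
```

For rows = 10 the two orders give 25 and 30. Nothing in the pointer mechanism flags the disagreement, which is the gap a declarative rule registry is meant to close.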
Security surface
Because a hook is a function pointer assignable by any shared_preload_libraries
entry, the hook surface is a privileged-code-execution surface — loading a
library is equivalent to patching the server. This is why
shared_preload_libraries is a postmaster-only (PGC_POSTMASTER) GUC
settable only by the server operator, and why local_preload_libraries, the
one preload setting an unprivileged user can change, loads only from
$libdir/plugins/. The ClientAuthentication_hook is the sharpest example:
a single line in a preloaded module can audit or veto every login, which is
exactly its purpose for security extensions but also exactly why the load
path is operator-gated. The trust model is “whoever can edit
postgresql.conf and drop a .so in $libdir already owns the server” —
the same model as LD_PRELOAD for any Unix process.
Sources
- PostgreSQL REL_18 source (/data/hgryoo/references/postgres, commit 273fe94852b3a7e34fd171e8abdf1481beb302fa, 2026-06-05):
  - src/backend/optimizer/plan/planner.c: planner_hook, planner(), standard_planner().
  - src/backend/executor/execMain.c: the four executor hooks, ExecutorRun() / standard_ExecutorRun(), ExecutorCheckPerms_hook, ExecCheckPermissions().
  - src/backend/tcop/utility.c: ProcessUtility_hook, ProcessUtility(), standard_ProcessUtility().
  - src/backend/storage/ipc/ipci.c: shmem_startup_hook, RequestAddinShmemSpace(), CreateSharedMemoryAndSemaphores().
  - src/backend/utils/init/miscinit.c: shmem_request_hook, process_shmem_requests(), process_shared_preload_libraries().
  - src/backend/libpq/auth.c: ClientAuthentication_hook and its call site in ClientAuthentication().
  - src/backend/utils/fmgr/dfmgr.c: internal_load_library(), load_external_function(), the ABI check and _PG_init dispatch.
  - src/backend/postmaster/postmaster.c: process_shmem_requests() ordering in PostmasterMain().
  - Headers: src/include/optimizer/planner.h, src/include/executor/executor.h, src/include/tcop/utility.h, src/include/storage/ipc.h, src/include/miscadmin.h, src/include/libpq/auth.h, src/include/fmgr.h (PG_MODULE_MAGIC, Pg_abi_values, _PG_init).
- Theory anchors (KB bibliography, .omc/plans/postgres-paper-bibliography.md):
  - Hellerstein, Stonebraker & Hamilton, Architecture of a Database System (2007): knowledge/research/dbms-papers/fntdb07-architecture.md.
  - Stonebraker & Rowe, “The Design of POSTGRES” (SIGMOD 1986): Berkeley extensibility lineage.
  - Stonebraker & Kemnitz, “The POSTGRES Next-Generation DBMS” (1991).
- Adjacent KB code-analysis docs (cross-references, not duplicated here): postgres-planner-overview.md, postgres-executor.md, postgres-shared-memory-ipc.md, postgres-extensions.md, postgres-postmaster.md, postgres-backend-lifecycle.md.