PostgreSQL PL/pgSQL — Handler, Compilation to a PLpgSQL_function AST, and the Tree-Walking Statement Executor
Contents:
- Theoretical Background
- Common DBMS Design
- PostgreSQL’s Approach
- Source Walkthrough
- Source verification (as of 2026-06-05)
- Beyond PostgreSQL — Comparative Designs & Research Frontiers
- Sources
Theoretical Background
Section titled “Theoretical Background”A procedural language inside the database is the answer to a latency
problem that the relational model created. SQL is declarative and
set-oriented: it has no native loops, no local variables, no IF, no
structured exception handling. The moment an application needs to express
“read this row, branch on its value, update that row, and repeat until a
condition holds,” it must either pull the data across the client/server
boundary and drive the logic from application code — paying a network
round-trip per statement — or push the control flow down next to the data.
Stored procedures and server-side functions are the second answer: a small
imperative program that runs in the backend, calls SQL locally through an
in-process interface, and returns only the final result. Database System
Concepts (Silberschatz 7e, ch. 5 “Advanced SQL”, “Functions and
Procedures” / “Procedural Constructs”) frames this as the standard SQL/PSM
feature set — DECLARE, assignment, IF/CASE/LOOP/WHILE/FOR,
cursors, and condition handlers — and PL/pgSQL is PostgreSQL’s dialect of it,
closely modeled on Oracle’s PL/SQL.
The implementation question this raises is: how do you execute an imperative program whose primitive operations are SQL queries? There are three classic strategies, and they sit on a spectrum from “compile to machine code” to “interpret the syntax tree.” The most heavyweight is to compile the procedure into native code (or an intermediate bytecode) ahead of time; the lightest is a pure source-level interpreter that re-parses on every statement. In the middle — and where almost every database PL lives — is the tree-walking interpreter: parse the source once into an abstract syntax tree (AST) of statement nodes, cache that tree, and execute by recursively walking it, dispatching on each node’s type. The AST amortizes the parse cost across all future calls while keeping the executor simple and debuggable. Database Internals (Petrov 2019) discusses this parse-once/execute-many pattern in the context of query plan caching; the same logic applies to procedural bodies.
But a PL is not just a control-flow interpreter — its leaves are SQL, and
SQL is expensive to prepare (parse, rewrite, plan). So a second caching
layer is essential: each embedded query must be planned once and the plan
reused, with invalidation when the schema underneath it changes. This is
the same cached-plan machinery the rest of the backend uses for prepared
statements (see postgres-portals-prepared.md), and a well-designed PL
leans on it rather than reinventing it. The third design pressure is
variable binding: the PL’s local variables must be visible inside the
embedded SQL (SELECT ... WHERE id = my_var), which means the SQL parser
needs a hook to resolve an otherwise-unknown identifier to a PL datum, and
the executor needs a way to feed the datum’s current value into the running
query as a parameter. The elegance of a PL implementation is largely
measured by how cleanly it threads these three concerns — AST caching, plan
caching, and variable binding — without duplicating the engine’s own query
machinery.
There is also a scoping concern that the AST must encode. An imperative
PL has block-structured variable scopes (DECLARE ... BEGIN ... END,
possibly nested and labeled), and SQL embedded at a given point must see
exactly the variables in scope there — no more, no less. The compiler
therefore maintains a namespace stack while parsing: entering a block
pushes a scope, declaring a variable adds an entry, and each embedded query
snapshots the visible chain so that resolution at run time is deterministic
and matches lexical scope. The same machinery resolves loop labels for
EXIT/CONTINUE and disambiguates the perennial “is this identifier a
PL variable or a table column?” question (PostgreSQL exposes this as the
plpgsql.variable_conflict setting). Getting scoping right at compile time
means the run-time interpreter never has to search for a name — it indexes
straight into a flat array.
A final theoretical wrinkle is exception handling with rollback
semantics. SQL/PSM condition handlers and PL/SQL EXCEPTION blocks promise
that when a statement inside a block raises an error, the block can catch it
by SQLSTATE and continue — but the partial effects of the failed statement
must be undone, or the database is left inconsistent. The clean way to get
“undo the statements since the block began” is to make the block a
savepoint / nested transaction: enter the block, set a savepoint, run the
body, and on error roll back to the savepoint before dispatching to the
handler. This ties the PL’s structured-exception feature directly to the
engine’s subtransaction mechanism (postgres-xact.md,
postgres-two-phase-commit.md for the transaction state machine).
Common DBMS Design
Section titled “Common DBMS Design”Across engines, a server-side procedural language is almost always built as a pluggable component behind a stable call interface, not as a hardcoded part of the SQL executor. The recurring architecture has four pieces:
-
A language handler registered in the catalog. The engine defines a function-call ABI (“here is a tuple of arguments and a return slot”);
CREATE LANGUAGE(or a built-in registration) names a C entry point that the function manager calls whenever a function written in that language is invoked. The handler is the bridge: it receives the generic call, looks up the procedure’s source, and runs it. This is what lets PostgreSQL ship PL/Python, PL/Perl, PL/Tcl, and third-party PLs as ordinary extensions — they all implement the same handler contract. -
A compile step that produces a cached AST. The first call (or a validation pass at
CREATE FUNCTIONtime) parses the procedure body into an internal representation and caches it keyed by the function OID, so the parse cost is paid once per backend per function. -
An in-process SQL interface. The PL does not re-implement parsing, planning, or execution of SQL. It calls a thin internal API — PostgreSQL’s is the Server Programming Interface (SPI) — that wraps the normal parse/plan/execute pipeline and the cached-plan layer. The PL gets back result rows and a processed-count; it never touches
PlannedStmtor the executor nodes directly. -
A datum / parameter bridge. Local variables live in a PL-owned array; to make them visible in embedded SQL the PL installs a parser hook that turns identifier references into placeholder parameters, and a param fetch hook that supplies the live value when the executor evaluates the placeholder.
The design tensions that distinguish good implementations from naïve ones:
- Re-parse vs. cache. A naïve PL re-parses the body on every call; a good one caches the AST and the per-query plans, and handles plan invalidation.
- Always-via-executor vs. fast-path expressions. Routing every scalar
expression (
x := a + 1) through the full SQL executor is correct but slow; mature PLs special-case trivial table-less expressions and evaluate them directly through the expression interpreter, skipping the SPI/executor overhead. - Flat error vs. structured handlers + rollback. Supporting
EXCEPTION WHEN ... THENrequires hooking the subtransaction/savepoint machinery so partial work is rolled back before the handler runs.
PostgreSQL’s PL/pgSQL is a textbook instance of all four pieces, and it makes the “good implementation” choice on every tension above.
PostgreSQL’s Approach
Section titled “PostgreSQL’s Approach”PL/pgSQL is a loadable module (plpgsql.so) that registers three C functions
matching the handler contract: a call handler
(plpgsql_call_handler), an inline handler for anonymous DO blocks
(plpgsql_inline_handler), and a validator (plpgsql_validator) that
CREATE FUNCTION runs to syntax-check the body. The body text itself lives
in pg_proc.prosrc like any other function’s source.
// plpgsql_call_handler — src/pl/plpgsql/src/pl_handler.cSPI_connect_ext(nonatomic ? SPI_OPT_NONATOMIC : 0);/* Find or compile the function */func = plpgsql_compile(fcinfo, false);...func->cfunc.use_count++;PG_TRY();{ if (CALLED_AS_TRIGGER(fcinfo)) retval = PointerGetDatum(plpgsql_exec_trigger(func, ...)); else if (CALLED_AS_EVENT_TRIGGER(fcinfo)) plpgsql_exec_event_trigger(func, ...); else retval = plpgsql_exec_function(func, fcinfo, NULL, NULL, procedure_resowner, !nonatomic);}PG_FINALLY();{ func->cfunc.use_count--; func->cur_estate = save_cur_estate; ... }PG_END_TRY();SPI_finish();The shape is the whole story: connect to SPI, compile-or-fetch the
function, bump a use-count so it can’t be freed underneath us, dispatch to
the trigger / event-trigger / plain-function executor inside a PG_TRY, then
disconnect SPI. The nonatomic flag — set when the function is called as a
procedure via CALL in a context that permits transaction control — flows
into SPI_connect_ext so that COMMIT/ROLLBACK inside the procedure are
allowed. The use_count discipline is what makes the cached compiled
function safe to reuse across recursive and re-entrant calls.
The validator is the CREATE FUNCTION-time path. It rejects pseudotype
arguments/returns (except the legal TRIGGER, RECORD, VOID, polymorphic
cases) and, when check_function_bodies is on, sets up a fake fcinfo and
test-compiles the body so syntax errors surface at definition time rather
than first call:
// plpgsql_validator — src/pl/plpgsql/src/pl_handler.cif (check_function_bodies){ ... flinfo.fn_oid = funcoid; if (is_dml_trigger) { ... fake_fcinfo->context = (Node *) &trigdata; } else if (is_event_trigger) { ... fake_fcinfo->context = (Node *) &etrigdata; } /* Test-compile the function */ plpgsql_compile(fake_fcinfo, true); /* forValidator = true */ ...}The inline handler runs a DO block. It has no catalog entry, so it
compiles the source text directly via plpgsql_compile_inline, and — because
a DO block may COMMIT/ROLLBACK — it builds a private EState and
ResourceOwner for simple-expression evaluation that survive transaction
boundaries, then frees the throwaway compiled function afterward:
// plpgsql_inline_handler — src/pl/plpgsql/src/pl_handler.cfunc = plpgsql_compile_inline(codeblock->source_text);func->cfunc.use_count++;...simple_eval_estate = CreateExecutorState();simple_eval_resowner = ResourceOwnerCreate(NULL, "PL/pgSQL DO block simple expressions");PG_TRY();{ retval = plpgsql_exec_function(func, fake_fcinfo, simple_eval_estate, simple_eval_resowner, simple_eval_resowner, /* also the proc resowner */ codeblock->atomic);}PG_CATCH();{ ... plpgsql_free_function_memory(func); PG_RE_THROW(); }PG_END_TRY();plpgsql_free_function_memory(func);Compilation: source text to a PLpgSQL_function
Section titled “Compilation: source text to a PLpgSQL_function”plpgsql_compile is a thin cache wrapper. It delegates to funccache.c’s
cached_function_compile, which keys on the function OID (plus argument
types for polymorphism) and only calls the expensive plpgsql_compile_callback
on a miss. The result pointer is stashed on fcinfo->flinfo->fn_extra so the
same FmgrInfo skips even the hash lookup on repeat calls within a query:
// plpgsql_compile — src/pl/plpgsql/src/pl_comp.cfunction = (PLpgSQL_function *) cached_function_compile(fcinfo, fcinfo->flinfo->fn_extra, plpgsql_compile_callback, plpgsql_delete_callback, sizeof(PLpgSQL_function), false, forValidator);fcinfo->flinfo->fn_extra = function; /* avoid search next time */return function;The real work is in plpgsql_compile_callback. It runs entirely inside a
dedicated per-function memory context (func_cxt, “PL/pgSQL function”)
so every palloc during compilation has the function’s lifetime and the
whole tree can be reclaimed by deleting one context. It initializes the
scanner over prosrc, pushes the outermost namespace level (named after
the function, holding parameters and special variables like FOUND), builds
a PLpgSQL_var/PLpgSQL_rec datum for each declared argument, and then runs
the Bison grammar (plpgsql_yyparse) which populates function->action,
the root PLpgSQL_stmt_block:
// plpgsql_compile_callback — src/pl/plpgsql/src/pl_comp.cfunc_cxt = AllocSetContextCreate(CurrentMemoryContext, "PL/pgSQL function", ALLOCSET_DEFAULT_SIZES);plpgsql_compile_tmp_cxt = MemoryContextSwitchTo(func_cxt);function->fn_signature = format_procedure(fcinfo->flinfo->fn_oid);function->fn_cxt = func_cxt;function->resolve_option = plpgsql_variable_conflict;...plpgsql_ns_init();plpgsql_ns_push(NameStr(procStruct->proname), PLPGSQL_LABEL_BLOCK);plpgsql_start_datums();/* ... build a PLpgSQL_var/rec per argument, add to ns + datum list ... */The compiler’s two persistent outputs are the statement tree
(function->action) and the datum array (function->datums[]). Every
variable, row, record, and record-field the function knows about is a
PLpgSQL_datum referenced by its integer index dno. The grammar never
plans SQL: each embedded query is captured verbatim into a PLpgSQL_expr
whose query field holds the raw text and whose ns field snapshots the
namespace chain visible at that point — planning is deferred to first
execution.
flowchart TD A["CREATE FUNCTION ... LANGUAGE plpgsql<br/>body stored in pg_proc.prosrc"] --> B["first call:<br/>plpgsql_call_handler"] B --> C["plpgsql_compile<br/>(funccache: hit? reuse)"] C -->|miss| D["plpgsql_compile_callback<br/>per-function MemoryContext"] D --> E["scanner + Bison grammar<br/>plpgsql_yyparse"] E --> F["PLpgSQL_function:<br/>action = stmt tree<br/>datums[] = variables"] C -->|hit| F F --> G["cached on fn_extra<br/>use_count guards reuse"] G --> H["plpgsql_exec_function<br/>tree-walking executor"]
The datum / variable model
Section titled “The datum / variable model”PLpgSQL_datum is the common supertype. Four concrete kinds matter:
PLpgSQL_var (scalar, or a DTYPE_PROMISE lazy special variable),
PLpgSQL_row (a fixed list of variables, used for multi-column INTO and
multi-OUT-parameter returns), PLpgSQL_rec (a whole composite/RECORD
value, stored as an expanded record), and PLpgSQL_recfield (one field of
a record, resolved lazily against the record’s current tuple descriptor).
// PLpgSQL_var — src/pl/plpgsql/src/plpgsql.htypedef struct PLpgSQL_var{ PLpgSQL_datum_type dtype; /* PLPGSQL_DTYPE_VAR or _PROMISE */ int dno; /* index into datums[] */ char *refname; ... PLpgSQL_type *datatype; /* Fields below here can change at runtime */ Datum value; bool isnull; bool freeval; PLpgSQL_promise_type promise; /* PLPGSQL_PROMISE_NONE if normal var */} PLpgSQL_var;The promise mechanism is a small but characteristic optimization:
trigger context variables like TG_NAME, TG_OP, NEW, OLD are declared
as promise datums and only materialized the first time the function reads
them, so a trigger that never touches TG_TABLE_NAME pays nothing to compute
it. Records are always kept in expanded form (ExpandedRecordHeader),
which lets field access and field assignment mutate in place rather than
deconstructing and reconstructing a flat tuple each time.
Execution: the tree-walking interpreter
Section titled “Execution: the tree-walking interpreter”plpgsql_exec_function sets up a PLpgSQL_execstate, copies the function’s
datums into per-execution storage (so recursion and re-entrancy each get
their own variable values), installs an error-context callback for nice
CONTEXT: tracebacks, stores the actual argument values into the argument
datums, then walks the tree:
// plpgsql_exec_function — src/pl/plpgsql/src/pl_exec.cplpgsql_estate_setup(&estate, func, (ReturnSetInfo *) fcinfo->resultinfo, simple_eval_estate, simple_eval_resowner);estate.atomic = atomic;...copy_plpgsql_datums(&estate, func);/* Store the actual call argument values into the appropriate variables */for (i = 0; i < func->fn_nargs; i++){ int n = func->fn_argvarnos[i]; switch (estate.datums[n]->dtype) { case PLPGSQL_DTYPE_VAR: assign_simple_var(&estate, (PLpgSQL_var *) estate.datums[n], fcinfo->args[i].value, fcinfo->args[i].isnull, false); ... /* commandeer R/W expanded objects; force arrays to expanded form */ break; case PLPGSQL_DTYPE_REC: ... exec_move_row_from_datum(...); break; }}The interpreter core is two functions. exec_stmt_block initializes a
block’s locally-declared variables and runs its body; exec_stmts iterates a
statement list and switches on stmt->cmd_type, calling the matching
exec_stmt_* handler. Each handler returns one of four return codes —
PLPGSQL_RC_OK, PLPGSQL_RC_EXIT, PLPGSQL_RC_RETURN, PLPGSQL_RC_CONTINUE
— which propagate up the recursion to implement EXIT/CONTINUE/RETURN
and loop-label matching.
// exec_stmts — src/pl/plpgsql/src/pl_exec.cforeach(s, stmts){ PLpgSQL_stmt *stmt = (PLpgSQL_stmt *) lfirst(s); estate->err_stmt = stmt; if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->stmt_beg) ((*plpgsql_plugin_ptr)->stmt_beg) (estate, stmt); CHECK_FOR_INTERRUPTS(); switch (stmt->cmd_type) { case PLPGSQL_STMT_BLOCK: rc = exec_stmt_block(estate, ...); break; case PLPGSQL_STMT_ASSIGN: rc = exec_stmt_assign(estate, ...); break; case PLPGSQL_STMT_IF: rc = exec_stmt_if(estate, ...); break; case PLPGSQL_STMT_FORI: rc = exec_stmt_fori(estate, ...); break; case PLPGSQL_STMT_EXECSQL: rc = exec_stmt_execsql(estate, ...); break; case PLPGSQL_STMT_RETURN: rc = exec_stmt_return(estate, ...); break; ... /* one case per PLpgSQL_stmt_type */ } if (rc != PLPGSQL_RC_OK) return rc;}The *plpgsql_plugin_ptr rendezvous hook before each statement is the seam
that the pldebugger / profiler extensions tap (set up in _PG_init via
find_rendezvous_variable). CHECK_FOR_INTERRUPTS() per statement is what
lets a runaway PL/pgSQL loop be cancelled.
flowchart TD
S["exec_stmts(list)"] --> L{"for each stmt"}
L --> P["plugin stmt_beg hook<br/>CHECK_FOR_INTERRUPTS"]
P --> D{"switch cmd_type"}
D -->|ASSIGN| A["exec_stmt_assign<br/>exec_assign_expr -> exec_assign_value"]
D -->|IF/CASE/LOOP/FOR| C["control-flow handlers<br/>recurse into sub-blocks"]
D -->|EXECSQL| Q["exec_stmt_execsql<br/>SPI plan + execute"]
D -->|BLOCK + EXCEPTION| X["exec_stmt_block<br/>subtransaction + PG_TRY"]
D -->|RETURN| R["exec_stmt_return<br/>set estate.retval"]
A --> RC{"rc"}
C --> RC
Q --> RC
X --> RC
R --> RC
RC -->|RC_OK| L
RC -->|EXIT/RETURN/CONTINUE| U["propagate up the recursion"]
SQL inside PL/pgSQL: SPI and the simple-expression fast path
Section titled “SQL inside PL/pgSQL: SPI and the simple-expression fast path”PL/pgSQL never calls the planner or executor directly. Every embedded
statement is a PLpgSQL_expr that, on first execution, is prepared through
SPI with a parser-setup hook that lets the SQL parser resolve PL/pgSQL
variable references:
// exec_prepare_plan — src/pl/plpgsql/src/pl_exec.coptions.parserSetup = (ParserSetupHook) plpgsql_parser_setup;options.parserSetupArg = expr;options.parseMode = expr->parseMode;options.cursorOptions = cursorOptions;plan = SPI_prepare_extended(expr->query, &options);SPI_keepplan(plan);expr->plan = plan;/* Check to see if it's a simple expression */exec_simple_check_plan(estate, expr);A full-blown SQL statement (INSERT, multi-table SELECT, etc.) takes the
general path: exec_stmt_execsql prepares the plan, detects whether it is a
data-modifying command (by inspecting the cached CachedPlanSource’s command
tag), builds a ParamListInfo from the current datum values, and runs it via
SPI_execute_plan_with_paramlist, asking for at most two rows when there is
an INTO STRICT so it can detect “too many rows”:
// exec_stmt_execsql — src/pl/plpgsql/src/pl_exec.cif (expr->plan == NULL) exec_prepare_plan(estate, expr, CURSOR_OPT_PARALLEL_OK);...paramLI = setup_param_list(estate, expr);if (stmt->into) tcount = (stmt->strict || stmt->mod_stmt || too_many_rows_level) ? 2 : 1;else tcount = 0;rc = SPI_execute_plan_with_paramlist(expr->plan, paramLI, estate->readonly_func, tcount);The variable binding is the elegant part, and it has two halves: a
compile-time identifier resolver and a run-time value supplier. At
prepare time plpgsql_parser_setup installs plpgsql_param_ref /
plpgsql_post_column_ref, which consult the namespace chain snapshotted on
the PLpgSQL_expr (expr->ns) and rewrite any reference that matches a
PL/pgSQL datum into a PARAM_EXTERN Param. The key invariant is that the
Param’s paramid is exactly the datum’s dno + 1, so there is no separate
symbol table to consult at run time — the param id is the array index:
// plpgsql_param_ref — src/pl/plpgsql/src/pl_comp.csnprintf(pname, sizeof(pname), "$%d", pref->number);nse = plpgsql_ns_lookup(expr->ns, false, pname, NULL, NULL, NULL);if (nse == NULL) return NULL; /* name not known to plpgsql */return make_datum_param(expr, nse->itemno, pref->location);At run time, before a plan executes, setup_param_list decides whether the
query needs parameters at all. It reuses one shared ParamListInfo per
execution state (there is no materialized ParamExternData array — values
are fetched lazily) and points its parserSetupArg back at the active expr;
when the query references nothing, it returns NULL so plancache.c knows a
custom plan is pointless:
// setup_param_list — src/pl/plpgsql/src/pl_exec.cif (!bms_is_empty(expr->paramnos)){ paramLI = estate->paramLI; /* the common ParamListInfo */ paramLI->parserSetupArg = expr; /* where hook functions find the expr */}else paramLI = NULL; /* no params -> no custom plan */return paramLI;When the executor finally evaluates a Param, it calls back into PL/pgSQL
through plpgsql_param_eval_var, which reads the live value straight out of
the datum array — no copy, no intermediate structure:
// plpgsql_param_eval_var — src/pl/plpgsql/src/pl_exec.cint dno = op->d.cparam.paramid - 1;estate = (PLpgSQL_execstate *) econtext->ecxt_param_list_info->paramFetchArg;var = (PLpgSQL_var *) estate->datums[dno];/* inlined version of exec_eval_datum() */*op->resvalue = var->value;*op->resnull = var->isnull;For a scalar expression — x := a + 1, an IF condition, a loop bound —
routing through the SPI executor would be wasteful. PL/pgSQL detects when an
expression compiles to a table-less single-column SELECT with no aggs,
sublinks, sorts, or FROM clause, and marks it “simple”:
// exec_is_simple_query — src/pl/plpgsql/src/pl_exec.cif (query->commandType != CMD_SELECT) return false;if (query->rtable != NIL) return false;if (query->hasAggs || query->hasWindowFuncs || query->hasTargetSRFs || query->hasSubLinks || query->cteList || query->jointree->fromlist || query->jointree->quals || query->groupClause || ...) return false;if (list_length(query->targetList) != 1) return false;return true; /* treat it as a simple plan */Simple expressions are evaluated by exec_eval_simple_expr, which runs the
extracted ExprState directly against the function’s eval ExprContext,
bypassing the SPI tuple machinery entirely. The subtlety is cached-plan
revalidation: the simple expression caches a CachedPlan refcount per
local transaction, and on each use it checks CachedPlanIsSimplyValid; if a
DDL invalidation fired, it replans, re-tests simplicity, and falls back to
the general path if the expression is no longer simple. This is the single
biggest performance lever in PL/pgSQL — arithmetic and comparisons never
touch the executor proper.
EXECUTE '<dynamic string>' is the opposite extreme: exec_stmt_dynexecute
evaluates the string expression, then runs it with SPI_execute_extended
without saving a plan, because the text can differ every call. CALL
(exec_stmt_call) uses SPI_execute_plan_extended with
allow_nonatomic = true so the called procedure may commit/rollback, and
detects a transaction boundary by comparing MyProc->vxid.lxid before and
after.
Exception blocks as subtransactions
Section titled “Exception blocks as subtransactions”A BEGIN ... EXCEPTION WHEN ... END block is the place where the interpreter
touches the transaction manager. When block->exceptions is non-NULL,
exec_stmt_block opens an internal subtransaction, runs the body in a fresh
eval econtext under a PG_TRY, and on success releases the subtransaction;
on error it rolls the subtransaction back (undoing the body’s partial
effects), copies the ErrorData, and scans the handler list for a
SQLSTATE-condition match:
// exec_stmt_block (exception path) — src/pl/plpgsql/src/pl_exec.cBeginInternalSubTransaction(NULL);MemoryContextSwitchTo(oldcontext);PG_TRY();{ plpgsql_create_econtext(estate); /* econtext tied to the subxact */ rc = exec_stmts(estate, block->body); ... ReleaseCurrentSubTransaction(); /* commit inner subxact */ estate->eval_econtext = old_eval_econtext;}PG_CATCH();{ edata = CopyErrorData(); FlushErrorState(); RollbackAndReleaseCurrentSubTransaction(); /* undo the body */ ... foreach(e, block->exceptions->exc_list) { PLpgSQL_exception *exception = (PLpgSQL_exception *) lfirst(e); if (exception_matches_conditions(edata, exception->conditions)) { assign_text_var(estate, state_var, unpack_sql_state(edata->sqlerrcode)); assign_text_var(estate, errm_var, edata->message); estate->cur_error = edata; rc = exec_stmts(estate, exception->action); /* run handler */ break; } } if (e == NULL) ReThrowError(edata); /* no handler matched: re-raise */}PG_END_TRY();The cost is real: any function containing an EXCEPTION block pays
subtransaction setup/teardown on every entry to that block, which is why the
PL/pgSQL documentation warns against wrapping tight loops in exception
blocks. The magic variables SQLSTATE and SQLERRM are ordinary datums
(sqlstate_varno, sqlerrm_varno) populated from the captured ErrorData
before the handler runs.
flowchart TD
E["enter BEGIN...EXCEPTION block"] --> B["BeginInternalSubTransaction<br/>plpgsql_create_econtext"]
B --> T["PG_TRY: exec_stmts(body)"]
T -->|no error| OK["ReleaseCurrentSubTransaction<br/>(commit inner subxact)"]
T -->|error| Cat["PG_CATCH:<br/>CopyErrorData + FlushErrorState"]
Cat --> RB["RollbackAndReleaseCurrentSubTransaction<br/>(undo body effects)"]
RB --> M{"match exc_list<br/>by SQLSTATE?"}
M -->|match| H["set SQLSTATE/SQLERRM<br/>exec_stmts(handler.action)"]
M -->|no match| RT["ReThrowError(edata)<br/>propagate outward"]
OK --> Z["block return code handling"]
H --> Z
This subtransaction-per-block model defers the actual savepoint/XID
mechanics to the transaction manager (postgres-xact.md); PL/pgSQL only
orchestrates the Begin/Release/Rollback calls and the handler dispatch.
Returning values: scalar, composite, and set-returning
Section titled “Returning values: scalar, composite, and set-returning”exec_stmt_return sets estate->retval/retisnull/rettype and signals
PLPGSQL_RC_RETURN, which unwinds the whole recursion back to
plpgsql_exec_function, where the value is coerced to the declared return
type. There is a deliberate fast path: when the RETURN expression is a bare
variable reference (stmt->retvarno >= 0 — always true for functions with
OUT parameters), the executor copies the datum’s value into retval
directly rather than routing through exec_eval_expr, which lets a
read/write expanded object (a large array, say) be transferred to the
caller’s context cheaply instead of being flattened to a read-only copy:
// exec_stmt_return — src/pl/plpgsql/src/pl_exec.cif (estate->retisset) return PLPGSQL_RC_RETURN; /* SRF: final RETURN ends tuple production */...if (stmt->retvarno >= 0){ PLpgSQL_datum *retvar = estate->datums[stmt->retvarno]; switch (retvar->dtype) { case PLPGSQL_DTYPE_PROMISE: plpgsql_fulfill_promise(estate, (PLpgSQL_var *) retvar); /* FALL THRU */ case PLPGSQL_DTYPE_VAR: { PLpgSQL_var *var = (PLpgSQL_var *) retvar; estate->retval = var->value; /* R/W expanded value transfers cheaply */ estate->retisnull = var->isnull; estate->rettype = var->datatype->typoid; } ... }}Set-returning PL/pgSQL functions (RETURNS SETOF ...) work differently:
RETURN NEXT / RETURN QUERY append rows into a Tuplestorestate
(estate->tuple_store) rather than ending the function, and the final plain
RETURN just signals completion (the retisset early-return above). The
materialized tuplestore is handed back through the ReturnSetInfo the
function manager passed in — the same SFRM_Materialize protocol any
set-returning function uses. This is why a PL/pgSQL SRF buffers its entire
result before any row reaches the caller.
Source Walkthrough
Section titled “Source Walkthrough”The four .c files split cleanly by responsibility: pl_handler.c is the
function-manager boundary (handlers, validator, GUCs, xact callbacks);
pl_comp.c turns source text into the PLpgSQL_function AST and owns the
namespace and parser hooks; pl_exec.c is the tree-walking interpreter
plus the SPI/simple-expression bridge; pl_funcs.c holds the namespace
stack primitives, the AST tree-walker, memory-freeing, and the dump_* AST
pretty-printer. The shared type vocabulary lives in plpgsql.h.
Handler boundary (pl_handler.c)
Section titled “Handler boundary (pl_handler.c)”_PG_init— registers theplpgsql.*GUCs (variable_conflict,check_asserts,extra_warnings/extra_errors), the xact/subxact callbacks (plpgsql_xact_cb,plpgsql_subxact_cb), and thePLpgSQL_pluginrendezvous pointer.plpgsql_call_handler— thePG_FUNCTION_INFO_V1entry the function manager calls; connects SPI, compiles/fetches the function, dispatches toplpgsql_exec_function/plpgsql_exec_trigger/plpgsql_exec_event_trigger.plpgsql_inline_handler— runsDOblocks; compiles inline, builds a privateEState/ResourceOwner, frees the function after.plpgsql_validator—CREATE FUNCTION-time checks + optional test-compile undercheck_function_bodies.
Compilation and namespace (pl_comp.c)
Section titled “Compilation and namespace (pl_comp.c)”plpgsql_compile/plpgsql_compile_callback— cache wrapper + per-function-context AST builder.plpgsql_compile_inline— theDO-block compile path.plpgsql_parser_setup,plpgsql_pre_column_ref,plpgsql_post_column_ref,resolve_column_ref,plpgsql_param_ref,make_datum_param— the parser hooks that resolve SQL identifiers to PL/pgSQL datums and emitPARAM_EXTERNparams.plpgsql_build_variable,plpgsql_build_datatype,add_parameter_name,add_dummy_return— datum construction.
Namespace, tree-walk, dump, free (pl_funcs.c)
Section titled “Namespace, tree-walk, dump, free (pl_funcs.c)”plpgsql_ns_init,plpgsql_ns_push,plpgsql_ns_pop,plpgsql_ns_additem,plpgsql_ns_lookup,plpgsql_ns_lookup_label,plpgsql_ns_find_nearest_loop— the compile-time namespace stack (block-scoped variable resolution).plpgsql_stmt_typename,plpgsql_getdiag_kindname— human-readable names.plpgsql_statement_tree_walker_impl,plpgsql_mark_local_assignment_targets— generic AST traversal used for analysis.plpgsql_free_function_memory,plpgsql_delete_callback— context teardown.plpgsql_dumptreeand thedump_*family —DEBUGAST printer.
Execution (pl_exec.c)
Section titled “Execution (pl_exec.c)”plpgsql_exec_function,plpgsql_exec_trigger,plpgsql_exec_event_trigger— top-level entry points;copy_plpgsql_datums,plpgsql_estate_setup.exec_toplevel_block,exec_stmt_block,exec_stmts— the recursion core.exec_stmt_assign,exec_stmt_if,exec_stmt_case,exec_stmt_loop,exec_stmt_while,exec_stmt_fori,exec_stmt_fors,exec_stmt_forc,exec_stmt_foreach_a,exec_stmt_exit,exec_stmt_return,exec_stmt_return_next,exec_stmt_return_query,exec_stmt_raise,exec_stmt_assert,exec_stmt_getdiag— per-construct handlers.exec_stmt_execsql,exec_stmt_dynexecute,exec_stmt_dynfors,exec_stmt_open,exec_stmt_fetch,exec_stmt_close,exec_stmt_perform,exec_stmt_call— SQL and cursor statements.exec_prepare_plan,setup_param_list,exec_assign_expr,exec_assign_value,exec_eval_expr,exec_run_select,exec_move_row— the expression/assignment/row machinery.exec_eval_simple_expr,exec_simple_check_plan,exec_is_simple_query,exec_save_simple_expr,plpgsql_param_eval_var,plpgsql_param_compile— the simple-expression fast path.plpgsql_create_econtext,plpgsql_destroy_econtext,assign_simple_var,assign_text_var,exception_matches_conditions— runtime support.
Position hints (as of 2026-06-05, REL_18 273fe94)
Section titled “Position hints (as of 2026-06-05, REL_18 273fe94)”| Symbol | File | Line |
|---|---|---|
plpgsql_call_handler | src/pl/plpgsql/src/pl_handler.c | 224 |
plpgsql_inline_handler | src/pl/plpgsql/src/pl_handler.c | 316 |
plpgsql_validator | src/pl/plpgsql/src/pl_handler.c | 442 |
_PG_init | src/pl/plpgsql/src/pl_handler.c | 148 |
plpgsql_compile | src/pl/plpgsql/src/pl_comp.c | 106 |
plpgsql_compile_callback | src/pl/plpgsql/src/pl_comp.c | 167 |
plpgsql_compile_inline | src/pl/plpgsql/src/pl_comp.c | 739 |
plpgsql_param_ref | src/pl/plpgsql/src/pl_comp.c | 1056 |
resolve_column_ref | src/pl/plpgsql/src/pl_comp.c | 1083 |
plpgsql_ns_lookup | src/pl/plpgsql/src/pl_funcs.c | 130 |
plpgsql_ns_find_nearest_loop | src/pl/plpgsql/src/pl_funcs.c | 214 |
plpgsql_stmt_typename | src/pl/plpgsql/src/pl_funcs.c | 232 |
plpgsql_free_function_memory | src/pl/plpgsql/src/pl_funcs.c | 716 |
plpgsql_exec_function | src/pl/plpgsql/src/pl_exec.c | 493 |
exec_stmt_block | src/pl/plpgsql/src/pl_exec.c | 1663 |
exec_stmts | src/pl/plpgsql/src/pl_exec.c | 1996 |
exec_stmt_call | src/pl/plpgsql/src/pl_exec.c | 2197 |
exec_stmt_fori | src/pl/plpgsql/src/pl_exec.c | 2696 |
exec_stmt_return | src/pl/plpgsql/src/pl_exec.c | 3197 |
exec_stmt_raise | src/pl/plpgsql/src/pl_exec.c | 3725 |
exec_prepare_plan | src/pl/plpgsql/src/pl_exec.c | 4173 |
exec_stmt_execsql | src/pl/plpgsql/src/pl_exec.c | 4208 |
exec_stmt_dynexecute | src/pl/plpgsql/src/pl_exec.c | 4440 |
exec_assign_value | src/pl/plpgsql/src/pl_exec.c | 5061 |
exec_eval_expr | src/pl/plpgsql/src/pl_exec.c | 5665 |
exec_run_select | src/pl/plpgsql/src/pl_exec.c | 5753 |
exec_eval_simple_expr | src/pl/plpgsql/src/pl_exec.c | 6019 |
setup_param_list | src/pl/plpgsql/src/pl_exec.c | 6250 |
plpgsql_param_eval_var | src/pl/plpgsql/src/pl_exec.c | 6672 |
exec_is_simple_query | src/pl/plpgsql/src/pl_exec.c | 8205 |
plpgsql_create_econtext | src/pl/plpgsql/src/pl_exec.c | 8610 |
PLpgSQL_datum_type (enum) | src/pl/plpgsql/src/plpgsql.h | 62 |
PLpgSQL_stmt_type (enum) | src/pl/plpgsql/src/plpgsql.h | 103 |
PLpgSQL_expr (struct) | src/pl/plpgsql/src/plpgsql.h | 230 |
PLpgSQL_var (struct) | src/pl/plpgsql/src/plpgsql.h | 332 |
PLpgSQL_rec (struct) | src/pl/plpgsql/src/plpgsql.h | 412 |
PLpgSQL_stmt_block (struct) | src/pl/plpgsql/src/plpgsql.h | 525 |
PLpgSQL_function (struct) | src/pl/plpgsql/src/plpgsql.h | 958 |
PLpgSQL_execstate (struct) | src/pl/plpgsql/src/plpgsql.h | 1012 |
Source verification (as of 2026-06-05)
Section titled “Source verification (as of 2026-06-05)”Verified against /data/hgryoo/references/postgres at REL_18_STABLE,
commit 273fe94.
- Handler trio + validator.
plpgsql_call_handler,plpgsql_inline_handler, andplpgsql_validatorare present inpl_handler.cwith thePG_FUNCTION_INFO_V1macros; the dispatch onCALLED_AS_TRIGGER/CALLED_AS_EVENT_TRIGGER, theuse_countguard, and theSPI_connect_ext(nonatomic ? SPI_OPT_NONATOMIC : 0)call are quoted verbatim.PG_MODULE_MAGIC_EXT(.name = "plpgsql", .version = PG_VERSION)confirms the PG18 module-magic form. - Compile cache.
plpgsql_compiledelegates tocached_function_compile(funccache.c) — the PG17+ unification of the old per-PL caching — passingplpgsql_compile_callback/plpgsql_delete_callback. Confirmed the result is stored onfcinfo->flinfo->fn_extra. - AST / datum model. The
PLpgSQL_datum_typeenum lists exactlyVAR, ROW, REC, RECFIELD, PROMISE;PLpgSQL_stmt_typelists the 26PLPGSQL_STMT_*constants used by theexec_stmtsswitch.PLpgSQL_varcarries the runtimevalue/isnull/promise;PLpgSQL_recstores anExpandedRecordHeader *erh(records always expanded). All four struct shapes match the quoted excerpts. - Simple-expression path.
exec_is_simple_query’s rejection list (hasAggs,hasWindowFuncs,hasTargetSRFs,hasSubLinks,cteList, non-emptyrtable/fromlist/quals, etc.) and the single-target-list requirement are quoted directly.exec_eval_simple_expr’s per-LXIDCachedPlanIsSimplyValidrevalidation and replan-or-fall-back logic is present as shown.plpgsql_param_eval_varreadsestate->datums[dno]inline. - Exception subtransactions.
exec_stmt_blockcallsBeginInternalSubTransaction(NULL)underPG_TRY, and thePG_CATCHperformsCopyErrorData+FlushErrorState+RollbackAndReleaseCurrentSubTransaction, then iteratesblock->exceptions->exc_listcallingexception_matches_conditions, andReThrowError(edata)when no handler matches. Verbatim. - No PG19-only claims. This doc asserts only REL_18 facts. It makes no
reference to any post-18 rmgr, BackendType, or executor change; the
funccache-based compile cache and
PG_MODULE_MAGIC_EXTare PG17/PG18 features that are present in this tree.
Two simplifications are intentional and noted here rather than asserted as
full coverage: the cursor statements (OPEN/FETCH/CLOSE,
exec_stmt_forc) and the trigger/event-trigger executors are named in the
walkthrough but not excerpted, because their SPI-cursor mechanics belong with
postgres-spi.md; and the read/write expanded-object assignment optimization
(expr_rwopt, PLPGSQL_RWOPT_*) is described at the level of “why arrays are
forced to expanded form” without tracing every exec_assign_value branch.
Beyond PostgreSQL — Comparative Designs & Research Frontiers
Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”-
PL/pgSQL vs. Oracle PL/SQL. PL/pgSQL’s surface syntax is deliberately PL/SQL-like (block structure,
%TYPE/%ROWTYPE,EXCEPTION WHEN), but the execution models differ sharply. Oracle compiles PL/SQL to a bytecode (or, with native compilation, to C/machine code) executed by a dedicated PL/SQL virtual machine, and PL/SQL shares a tightly-integrated runtime with the SQL engine. PL/pgSQL is a pure tree-walking interpreter that reaches the SQL engine only through the arms-length SPI boundary. A focused comparison of the per-statement dispatch cost (PL/pgSQL’sswitch+ handler call vs. a bytecodegoto) would quantify what PostgreSQL trades for implementation simplicity and the ability to ship the PL as a loadable extension. -
The SPI boundary as an extensibility seam. Because every PL talks to the engine through SPI and the handler ABI, PostgreSQL supports PL/Python, PL/Perl, PL/Tcl, PL/v8, and arbitrary third-party PLs with no core changes. This is the Stonebraker-era “extensible type system” philosophy (POSTGRES design papers, see
dbms-papers/) extended to languages. The cost is that no PL can do cross-statement optimizations the engine can’t see; the benefit is a clean plug-in model. Mapping the exact handler contract against PL/Python’s handler would sharpen what the ABI actually requires. -
Simple-expression fast path vs. JIT. PL/pgSQL’s biggest performance win is keeping scalar expressions out of the SPI executor (
exec_eval_simple_expr). The natural research frontier is whether the backend’s LLVM JIT (postgres-expression-eval.md,jit/) could compile a whole PL/pgSQL function body — not just individualExprStates — into native code, the way Oracle native compilation and SQL Server’s natively-compiled stored procedures (Hekaton) do. PostgreSQL deliberately does not; the interpreter overhead is judged acceptable next to SQL execution cost for most workloads, but compute-heavy PL/pgSQL is where this assumption breaks. -
Exception blocks and subtransaction cost. Tying
EXCEPTIONto internal subtransactions is correct and simple but imposes per-block savepoint overhead and consumes subtransaction XIDs (thepg_subtransSLRU,postgres-multixact.md/postgres-slru.md). Engines with cheaper nested rollback (e.g., undo-log MVCC like Oracle or MySQL/InnoDB) can offer exception handling without this tax. A measurement ofBeginInternalSubTransactioncost in a tight loop versus the equivalent in an undo-based engine would frame the trade. -
CUBRID stored procedures. CUBRID historically leaned on Java stored procedures executing in an external JVM bridged to the engine, a very different point in the design space from an in-process tree-walking interpreter. A side-by-side of the call/marshalling path (CUBRID’s JVM-bridge round trip vs. PL/pgSQL’s in-backend SPI call) would highlight what in-process execution buys in latency. (See the CUBRID analyses in the cubrid tree.)
Sources
Section titled “Sources”In-tree source files (REL_18_STABLE, commit 273fe94)
Section titled “In-tree source files (REL_18_STABLE, commit 273fe94)”src/pl/plpgsql/src/pl_handler.c—_PG_init, the GUC/xact-callback registration, and theplpgsql_call_handler/plpgsql_inline_handler/plpgsql_validatortrio that implements the function-manager handler ABI.src/pl/plpgsql/src/pl_comp.c—plpgsql_compile,plpgsql_compile_callback,plpgsql_compile_inline, the parser hooks (plpgsql_parser_setup,plpgsql_param_ref,resolve_column_ref,make_datum_param), and datum/datatype construction.src/pl/plpgsql/src/pl_exec.c—plpgsql_exec_function,exec_stmt_block,exec_stmts, the fullexec_stmt_*family, the SPI bridge (exec_prepare_plan,exec_stmt_execsql,setup_param_list), and the simple-expression fast path (exec_eval_simple_expr,exec_is_simple_query,plpgsql_param_eval_var).src/pl/plpgsql/src/pl_funcs.c— the compile-time namespace stack, the AST tree-walker,plpgsql_free_function_memory, and thedump_*AST printer.src/pl/plpgsql/src/plpgsql.h—PLpgSQL_datum_type/PLpgSQL_stmt_typeenums, and thePLpgSQL_expr,PLpgSQL_var/row/rec/recfield,PLpgSQL_stmt_block,PLpgSQL_function, andPLpgSQL_execstatestructs.
Papers and textbook chapters
Section titled “Papers and textbook chapters”- Database System Concepts (Silberschatz, Korth, Sudarshan, 7e), ch. 5
“Advanced SQL” — SQL/PSM functions and procedures, procedural constructs,
and condition handlers, the standard PL/pgSQL implements
(
knowledge/research/dbms-general/database-system-concepts.md). - Database Internals (Petrov 2019) — parse-once/execute-many plan caching
and the cost model that motivates AST + plan caching
(
knowledge/research/dbms-general/database-internals.md). - Stonebraker & Rowe (1986), “The Design of POSTGRES” and the POSTGRES
data-model/implementation papers — the extensible-type-system lineage that
the handler/SPI plug-in model for procedural languages descends from
(cited in
.omc/plans/postgres-paper-bibliography.md).
Sibling docs (cross-references — mechanism owned there, not duplicated here)
Section titled “Sibling docs (cross-references — mechanism owned there, not duplicated here)”postgres-spi.md— the Server Programming Interface (SPI_connect,SPI_prepare_extended,SPI_execute_plan_with_paramlist, SPI cursors) that every embedded query in PL/pgSQL routes through.postgres-fmgr.md— the V1 function-call ABI (PG_FUNCTION_ARGS,fcinfo,FmgrInfo.fn_extra) and the language-handler registration thatplpgsql_call_handlerplugs into.postgres-portals-prepared.md— the cached-plan /CachedPlanSourcemachinery that PL/pgSQL’s per-PLpgSQL_exprplans and simple-expression revalidation depend on.postgres-expression-eval.md— theExprState/ExprContextinterpreter thatexec_eval_simple_exprruns directly.postgres-xact.md— the (sub)transaction state machine thatBeginInternalSubTransaction/RollbackAndReleaseCurrentSubTransactiondrive forEXCEPTIONblocks.postgres-triggers.md/postgres-event-triggers.md— the trigger and event-trigger firing paths thatplpgsql_exec_trigger/plpgsql_exec_event_triggerserve.postgres-architecture-overview.md— Axis (extensibility): pluggable procedural languages as loadable extensions behind the handler ABI.