Skip to content

PostgreSQL PL/pgSQL — Handler, Compilation to a PLpgSQL_function AST, and the Tree-Walking Statement Executor

Contents:

A procedural language inside the database is the answer to a latency problem that the relational model created. SQL is declarative and set-oriented: it has no native loops, no local variables, no IF, no structured exception handling. The moment an application needs to express “read this row, branch on its value, update that row, and repeat until a condition holds,” it must either pull the data across the client/server boundary and drive the logic from application code — paying a network round-trip per statement — or push the control flow down next to the data. Stored procedures and server-side functions are the second answer: a small imperative program that runs in the backend, calls SQL locally through an in-process interface, and returns only the final result. Database System Concepts (Silberschatz 7e, ch. 5 “Advanced SQL”, “Functions and Procedures” / “Procedural Constructs”) frames this as the standard SQL/PSM feature set — DECLARE, assignment, IF/CASE/LOOP/WHILE/FOR, cursors, and condition handlers — and PL/pgSQL is PostgreSQL’s dialect of it, closely modeled on Oracle’s PL/SQL.

The implementation question this raises is: how do you execute an imperative program whose primitive operations are SQL queries? There are three classic strategies, and they sit on a spectrum from “compile to machine code” to “interpret the syntax tree.” The most heavyweight is to compile the procedure into native code (or an intermediate bytecode) ahead of time; the lightest is a pure source-level interpreter that re-parses on every statement. In the middle — and where almost every database PL lives — is the tree-walking interpreter: parse the source once into an abstract syntax tree (AST) of statement nodes, cache that tree, and execute by recursively walking it, dispatching on each node’s type. The AST amortizes the parse cost across all future calls while keeping the executor simple and debuggable. Database Internals (Petrov 2019) discusses this parse-once/execute-many pattern in the context of query plan caching; the same logic applies to procedural bodies.

But a PL is not just a control-flow interpreter — its leaves are SQL, and SQL is expensive to prepare (parse, rewrite, plan). So a second caching layer is essential: each embedded query must be planned once and the plan reused, with invalidation when the schema underneath it changes. This is the same cached-plan machinery the rest of the backend uses for prepared statements (see postgres-portals-prepared.md), and a well-designed PL leans on it rather than reinventing it. The third design pressure is variable binding: the PL’s local variables must be visible inside the embedded SQL (SELECT ... WHERE id = my_var), which means the SQL parser needs a hook to resolve an otherwise-unknown identifier to a PL datum, and the executor needs a way to feed the datum’s current value into the running query as a parameter. The elegance of a PL implementation is largely measured by how cleanly it threads these three concerns — AST caching, plan caching, and variable binding — without duplicating the engine’s own query machinery.

There is also a scoping concern that the AST must encode. An imperative PL has block-structured variable scopes (DECLARE ... BEGIN ... END, possibly nested and labeled), and SQL embedded at a given point must see exactly the variables in scope there — no more, no less. The compiler therefore maintains a namespace stack while parsing: entering a block pushes a scope, declaring a variable adds an entry, and each embedded query snapshots the visible chain so that resolution at run time is deterministic and matches lexical scope. The same machinery resolves loop labels for EXIT/CONTINUE and disambiguates the perennial “is this identifier a PL variable or a table column?” question (PostgreSQL exposes this as the plpgsql.variable_conflict setting). Getting scoping right at compile time means the run-time interpreter never has to search for a name — it indexes straight into a flat array.

A final theoretical wrinkle is exception handling with rollback semantics. SQL/PSM condition handlers and PL/SQL EXCEPTION blocks promise that when a statement inside a block raises an error, the block can catch it by SQLSTATE and continue — but the partial effects of the failed statement must be undone, or the database is left inconsistent. The clean way to get “undo the statements since the block began” is to make the block a savepoint / nested transaction: enter the block, set a savepoint, run the body, and on error roll back to the savepoint before dispatching to the handler. This ties the PL’s structured-exception feature directly to the engine’s subtransaction mechanism (postgres-xact.md, postgres-two-phase-commit.md for the transaction state machine).

Across engines, a server-side procedural language is almost always built as a pluggable component behind a stable call interface, not as a hardcoded part of the SQL executor. The recurring architecture has four pieces:

  1. A language handler registered in the catalog. The engine defines a function-call ABI (“here is a tuple of arguments and a return slot”); CREATE LANGUAGE (or a built-in registration) names a C entry point that the function manager calls whenever a function written in that language is invoked. The handler is the bridge: it receives the generic call, looks up the procedure’s source, and runs it. This is what lets PostgreSQL ship PL/Python, PL/Perl, PL/Tcl, and third-party PLs as ordinary extensions — they all implement the same handler contract.

  2. A compile step that produces a cached AST. The first call (or a validation pass at CREATE FUNCTION time) parses the procedure body into an internal representation and caches it keyed by the function OID, so the parse cost is paid once per backend per function.

  3. An in-process SQL interface. The PL does not re-implement parsing, planning, or execution of SQL. It calls a thin internal API — PostgreSQL’s is the Server Programming Interface (SPI) — that wraps the normal parse/plan/execute pipeline and the cached-plan layer. The PL gets back result rows and a processed-count; it never touches PlannedStmt or the executor nodes directly.

  4. A datum / parameter bridge. Local variables live in a PL-owned array; to make them visible in embedded SQL the PL installs a parser hook that turns identifier references into placeholder parameters, and a param fetch hook that supplies the live value when the executor evaluates the placeholder.

The design tensions that distinguish good implementations from naïve ones:

  • Re-parse vs. cache. A naïve PL re-parses the body on every call; a good one caches the AST and the per-query plans, and handles plan invalidation.
  • Always-via-executor vs. fast-path expressions. Routing every scalar expression (x := a + 1) through the full SQL executor is correct but slow; mature PLs special-case trivial table-less expressions and evaluate them directly through the expression interpreter, skipping the SPI/executor overhead.
  • Flat error vs. structured handlers + rollback. Supporting EXCEPTION WHEN ... THEN requires hooking the subtransaction/savepoint machinery so partial work is rolled back before the handler runs.

PostgreSQL’s PL/pgSQL is a textbook instance of all four pieces, and it makes the “good implementation” choice on every tension above.

PL/pgSQL is a loadable module (plpgsql.so) that registers three C functions matching the handler contract: a call handler (plpgsql_call_handler), an inline handler for anonymous DO blocks (plpgsql_inline_handler), and a validator (plpgsql_validator) that CREATE FUNCTION runs to syntax-check the body. The body text itself lives in pg_proc.prosrc like any other function’s source.

// plpgsql_call_handler — src/pl/plpgsql/src/pl_handler.c
SPI_connect_ext(nonatomic ? SPI_OPT_NONATOMIC : 0);
/* Find or compile the function */
func = plpgsql_compile(fcinfo, false);
...
func->cfunc.use_count++;
PG_TRY();
{
if (CALLED_AS_TRIGGER(fcinfo))
retval = PointerGetDatum(plpgsql_exec_trigger(func, ...));
else if (CALLED_AS_EVENT_TRIGGER(fcinfo))
plpgsql_exec_event_trigger(func, ...);
else
retval = plpgsql_exec_function(func, fcinfo,
NULL, NULL,
procedure_resowner, !nonatomic);
}
PG_FINALLY();
{ func->cfunc.use_count--; func->cur_estate = save_cur_estate; ... }
PG_END_TRY();
SPI_finish();

The shape is the whole story: connect to SPI, compile-or-fetch the function, bump a use-count so it can’t be freed underneath us, dispatch to the trigger / event-trigger / plain-function executor inside a PG_TRY, then disconnect SPI. The nonatomic flag — set when the function is called as a procedure via CALL in a context that permits transaction control — flows into SPI_connect_ext so that COMMIT/ROLLBACK inside the procedure are allowed. The use_count discipline is what makes the cached compiled function safe to reuse across recursive and re-entrant calls.

The validator is the CREATE FUNCTION-time path. It rejects pseudotype arguments/returns (except the legal TRIGGER, RECORD, VOID, polymorphic cases) and, when check_function_bodies is on, sets up a fake fcinfo and test-compiles the body so syntax errors surface at definition time rather than first call:

// plpgsql_validator — src/pl/plpgsql/src/pl_handler.c
if (check_function_bodies)
{
...
flinfo.fn_oid = funcoid;
if (is_dml_trigger) { ... fake_fcinfo->context = (Node *) &trigdata; }
else if (is_event_trigger) { ... fake_fcinfo->context = (Node *) &etrigdata; }
/* Test-compile the function */
plpgsql_compile(fake_fcinfo, true); /* forValidator = true */
...
}

The inline handler runs a DO block. It has no catalog entry, so it compiles the source text directly via plpgsql_compile_inline, and — because a DO block may COMMIT/ROLLBACK — it builds a private EState and ResourceOwner for simple-expression evaluation that survive transaction boundaries, then frees the throwaway compiled function afterward:

// plpgsql_inline_handler — src/pl/plpgsql/src/pl_handler.c
func = plpgsql_compile_inline(codeblock->source_text);
func->cfunc.use_count++;
...
simple_eval_estate = CreateExecutorState();
simple_eval_resowner = ResourceOwnerCreate(NULL, "PL/pgSQL DO block simple expressions");
PG_TRY();
{
retval = plpgsql_exec_function(func, fake_fcinfo,
simple_eval_estate,
simple_eval_resowner,
simple_eval_resowner, /* also the proc resowner */
codeblock->atomic);
}
PG_CATCH();
{ ... plpgsql_free_function_memory(func); PG_RE_THROW(); }
PG_END_TRY();
plpgsql_free_function_memory(func);

Compilation: source text to a PLpgSQL_function

Section titled “Compilation: source text to a PLpgSQL_function”

plpgsql_compile is a thin cache wrapper. It delegates to funccache.c’s cached_function_compile, which keys on the function OID (plus argument types for polymorphism) and only calls the expensive plpgsql_compile_callback on a miss. The result pointer is stashed on fcinfo->flinfo->fn_extra so the same FmgrInfo skips even the hash lookup on repeat calls within a query:

// plpgsql_compile — src/pl/plpgsql/src/pl_comp.c
function = (PLpgSQL_function *)
cached_function_compile(fcinfo,
fcinfo->flinfo->fn_extra,
plpgsql_compile_callback,
plpgsql_delete_callback,
sizeof(PLpgSQL_function),
false,
forValidator);
fcinfo->flinfo->fn_extra = function; /* avoid search next time */
return function;

The real work is in plpgsql_compile_callback. It runs entirely inside a dedicated per-function memory context (func_cxt, “PL/pgSQL function”) so every palloc during compilation has the function’s lifetime and the whole tree can be reclaimed by deleting one context. It initializes the scanner over prosrc, pushes the outermost namespace level (named after the function, holding parameters and special variables like FOUND), builds a PLpgSQL_var/PLpgSQL_rec datum for each declared argument, and then runs the Bison grammar (plpgsql_yyparse) which populates function->action, the root PLpgSQL_stmt_block:

// plpgsql_compile_callback — src/pl/plpgsql/src/pl_comp.c
func_cxt = AllocSetContextCreate(CurrentMemoryContext,
"PL/pgSQL function", ALLOCSET_DEFAULT_SIZES);
plpgsql_compile_tmp_cxt = MemoryContextSwitchTo(func_cxt);
function->fn_signature = format_procedure(fcinfo->flinfo->fn_oid);
function->fn_cxt = func_cxt;
function->resolve_option = plpgsql_variable_conflict;
...
plpgsql_ns_init();
plpgsql_ns_push(NameStr(procStruct->proname), PLPGSQL_LABEL_BLOCK);
plpgsql_start_datums();
/* ... build a PLpgSQL_var/rec per argument, add to ns + datum list ... */

The compiler’s two persistent outputs are the statement tree (function->action) and the datum array (function->datums[]). Every variable, row, record, and record-field the function knows about is a PLpgSQL_datum referenced by its integer index dno. The grammar never plans SQL: each embedded query is captured verbatim into a PLpgSQL_expr whose query field holds the raw text and whose ns field snapshots the namespace chain visible at that point — planning is deferred to first execution.

flowchart TD
  A["CREATE FUNCTION ... LANGUAGE plpgsql<br/>body stored in pg_proc.prosrc"] --> B["first call:<br/>plpgsql_call_handler"]
  B --> C["plpgsql_compile<br/>(funccache: hit? reuse)"]
  C -->|miss| D["plpgsql_compile_callback<br/>per-function MemoryContext"]
  D --> E["scanner + Bison grammar<br/>plpgsql_yyparse"]
  E --> F["PLpgSQL_function:<br/>action = stmt tree<br/>datums[] = variables"]
  C -->|hit| F
  F --> G["cached on fn_extra<br/>use_count guards reuse"]
  G --> H["plpgsql_exec_function<br/>tree-walking executor"]

PLpgSQL_datum is the common supertype. Four concrete kinds matter: PLpgSQL_var (scalar, or a DTYPE_PROMISE lazy special variable), PLpgSQL_row (a fixed list of variables, used for multi-column INTO and multi-OUT-parameter returns), PLpgSQL_rec (a whole composite/RECORD value, stored as an expanded record), and PLpgSQL_recfield (one field of a record, resolved lazily against the record’s current tuple descriptor).

// PLpgSQL_var — src/pl/plpgsql/src/plpgsql.h
typedef struct PLpgSQL_var
{
PLpgSQL_datum_type dtype; /* PLPGSQL_DTYPE_VAR or _PROMISE */
int dno; /* index into datums[] */
char *refname;
...
PLpgSQL_type *datatype;
/* Fields below here can change at runtime */
Datum value;
bool isnull;
bool freeval;
PLpgSQL_promise_type promise; /* PLPGSQL_PROMISE_NONE if normal var */
} PLpgSQL_var;

The promise mechanism is a small but characteristic optimization: trigger context variables like TG_NAME, TG_OP, NEW, OLD are declared as promise datums and only materialized the first time the function reads them, so a trigger that never touches TG_TABLE_NAME pays nothing to compute it. Records are always kept in expanded form (ExpandedRecordHeader), which lets field access and field assignment mutate in place rather than deconstructing and reconstructing a flat tuple each time.

plpgsql_exec_function sets up a PLpgSQL_execstate, copies the function’s datums into per-execution storage (so recursion and re-entrancy each get their own variable values), installs an error-context callback for nice CONTEXT: tracebacks, stores the actual argument values into the argument datums, then walks the tree:

// plpgsql_exec_function — src/pl/plpgsql/src/pl_exec.c
plpgsql_estate_setup(&estate, func, (ReturnSetInfo *) fcinfo->resultinfo,
simple_eval_estate, simple_eval_resowner);
estate.atomic = atomic;
...
copy_plpgsql_datums(&estate, func);
/* Store the actual call argument values into the appropriate variables */
for (i = 0; i < func->fn_nargs; i++)
{
int n = func->fn_argvarnos[i];
switch (estate.datums[n]->dtype)
{
case PLPGSQL_DTYPE_VAR:
assign_simple_var(&estate, (PLpgSQL_var *) estate.datums[n],
fcinfo->args[i].value, fcinfo->args[i].isnull, false);
... /* commandeer R/W expanded objects; force arrays to expanded form */
break;
case PLPGSQL_DTYPE_REC:
... exec_move_row_from_datum(...);
break;
}
}

The interpreter core is two functions. exec_stmt_block initializes a block’s locally-declared variables and runs its body; exec_stmts iterates a statement list and switches on stmt->cmd_type, calling the matching exec_stmt_* handler. Each handler returns one of four return codesPLPGSQL_RC_OK, PLPGSQL_RC_EXIT, PLPGSQL_RC_RETURN, PLPGSQL_RC_CONTINUE — which propagate up the recursion to implement EXIT/CONTINUE/RETURN and loop-label matching.

// exec_stmts — src/pl/plpgsql/src/pl_exec.c
foreach(s, stmts)
{
PLpgSQL_stmt *stmt = (PLpgSQL_stmt *) lfirst(s);
estate->err_stmt = stmt;
if (*plpgsql_plugin_ptr && (*plpgsql_plugin_ptr)->stmt_beg)
((*plpgsql_plugin_ptr)->stmt_beg) (estate, stmt);
CHECK_FOR_INTERRUPTS();
switch (stmt->cmd_type)
{
case PLPGSQL_STMT_BLOCK: rc = exec_stmt_block(estate, ...); break;
case PLPGSQL_STMT_ASSIGN: rc = exec_stmt_assign(estate, ...); break;
case PLPGSQL_STMT_IF: rc = exec_stmt_if(estate, ...); break;
case PLPGSQL_STMT_FORI: rc = exec_stmt_fori(estate, ...); break;
case PLPGSQL_STMT_EXECSQL: rc = exec_stmt_execsql(estate, ...); break;
case PLPGSQL_STMT_RETURN: rc = exec_stmt_return(estate, ...); break;
... /* one case per PLpgSQL_stmt_type */
}
if (rc != PLPGSQL_RC_OK) return rc;
}

The *plpgsql_plugin_ptr rendezvous hook before each statement is the seam that the pldebugger / profiler extensions tap (set up in _PG_init via find_rendezvous_variable). CHECK_FOR_INTERRUPTS() per statement is what lets a runaway PL/pgSQL loop be cancelled.

flowchart TD
  S["exec_stmts(list)"] --> L{"for each stmt"}
  L --> P["plugin stmt_beg hook<br/>CHECK_FOR_INTERRUPTS"]
  P --> D{"switch cmd_type"}
  D -->|ASSIGN| A["exec_stmt_assign<br/>exec_assign_expr -> exec_assign_value"]
  D -->|IF/CASE/LOOP/FOR| C["control-flow handlers<br/>recurse into sub-blocks"]
  D -->|EXECSQL| Q["exec_stmt_execsql<br/>SPI plan + execute"]
  D -->|BLOCK + EXCEPTION| X["exec_stmt_block<br/>subtransaction + PG_TRY"]
  D -->|RETURN| R["exec_stmt_return<br/>set estate.retval"]
  A --> RC{"rc"}
  C --> RC
  Q --> RC
  X --> RC
  R --> RC
  RC -->|RC_OK| L
  RC -->|EXIT/RETURN/CONTINUE| U["propagate up the recursion"]

SQL inside PL/pgSQL: SPI and the simple-expression fast path

Section titled “SQL inside PL/pgSQL: SPI and the simple-expression fast path”

PL/pgSQL never calls the planner or executor directly. Every embedded statement is a PLpgSQL_expr that, on first execution, is prepared through SPI with a parser-setup hook that lets the SQL parser resolve PL/pgSQL variable references:

// exec_prepare_plan — src/pl/plpgsql/src/pl_exec.c
options.parserSetup = (ParserSetupHook) plpgsql_parser_setup;
options.parserSetupArg = expr;
options.parseMode = expr->parseMode;
options.cursorOptions = cursorOptions;
plan = SPI_prepare_extended(expr->query, &options);
SPI_keepplan(plan);
expr->plan = plan;
/* Check to see if it's a simple expression */
exec_simple_check_plan(estate, expr);

A full-blown SQL statement (INSERT, multi-table SELECT, etc.) takes the general path: exec_stmt_execsql prepares the plan, detects whether it is a data-modifying command (by inspecting the cached CachedPlanSource’s command tag), builds a ParamListInfo from the current datum values, and runs it via SPI_execute_plan_with_paramlist, asking for at most two rows when there is an INTO STRICT so it can detect “too many rows”:

// exec_stmt_execsql — src/pl/plpgsql/src/pl_exec.c
if (expr->plan == NULL)
exec_prepare_plan(estate, expr, CURSOR_OPT_PARALLEL_OK);
...
paramLI = setup_param_list(estate, expr);
if (stmt->into) tcount = (stmt->strict || stmt->mod_stmt || too_many_rows_level) ? 2 : 1;
else tcount = 0;
rc = SPI_execute_plan_with_paramlist(expr->plan, paramLI,
estate->readonly_func, tcount);

The variable binding is the elegant part, and it has two halves: a compile-time identifier resolver and a run-time value supplier. At prepare time plpgsql_parser_setup installs plpgsql_param_ref / plpgsql_post_column_ref, which consult the namespace chain snapshotted on the PLpgSQL_expr (expr->ns) and rewrite any reference that matches a PL/pgSQL datum into a PARAM_EXTERN Param. The key invariant is that the Param’s paramid is exactly the datum’s dno + 1, so there is no separate symbol table to consult at run time — the param id is the array index:

// plpgsql_param_ref — src/pl/plpgsql/src/pl_comp.c
snprintf(pname, sizeof(pname), "$%d", pref->number);
nse = plpgsql_ns_lookup(expr->ns, false, pname, NULL, NULL, NULL);
if (nse == NULL)
return NULL; /* name not known to plpgsql */
return make_datum_param(expr, nse->itemno, pref->location);

At run time, before a plan executes, setup_param_list decides whether the query needs parameters at all. It reuses one shared ParamListInfo per execution state (there is no materialized ParamExternData array — values are fetched lazily) and points its parserSetupArg back at the active expr; when the query references nothing, it returns NULL so plancache.c knows a custom plan is pointless:

// setup_param_list — src/pl/plpgsql/src/pl_exec.c
if (!bms_is_empty(expr->paramnos))
{
paramLI = estate->paramLI; /* the common ParamListInfo */
paramLI->parserSetupArg = expr; /* where hook functions find the expr */
}
else
paramLI = NULL; /* no params -> no custom plan */
return paramLI;

When the executor finally evaluates a Param, it calls back into PL/pgSQL through plpgsql_param_eval_var, which reads the live value straight out of the datum array — no copy, no intermediate structure:

// plpgsql_param_eval_var — src/pl/plpgsql/src/pl_exec.c
int dno = op->d.cparam.paramid - 1;
estate = (PLpgSQL_execstate *) econtext->ecxt_param_list_info->paramFetchArg;
var = (PLpgSQL_var *) estate->datums[dno];
/* inlined version of exec_eval_datum() */
*op->resvalue = var->value;
*op->resnull = var->isnull;

For a scalar expression — x := a + 1, an IF condition, a loop bound — routing through the SPI executor would be wasteful. PL/pgSQL detects when an expression compiles to a table-less single-column SELECT with no aggs, sublinks, sorts, or FROM clause, and marks it “simple”:

// exec_is_simple_query — src/pl/plpgsql/src/pl_exec.c
if (query->commandType != CMD_SELECT) return false;
if (query->rtable != NIL) return false;
if (query->hasAggs || query->hasWindowFuncs || query->hasTargetSRFs ||
query->hasSubLinks || query->cteList || query->jointree->fromlist ||
query->jointree->quals || query->groupClause || ...) return false;
if (list_length(query->targetList) != 1) return false;
return true; /* treat it as a simple plan */

Simple expressions are evaluated by exec_eval_simple_expr, which runs the extracted ExprState directly against the function’s eval ExprContext, bypassing the SPI tuple machinery entirely. The subtlety is cached-plan revalidation: the simple expression caches a CachedPlan refcount per local transaction, and on each use it checks CachedPlanIsSimplyValid; if a DDL invalidation fired, it replans, re-tests simplicity, and falls back to the general path if the expression is no longer simple. This is the single biggest performance lever in PL/pgSQL — arithmetic and comparisons never touch the executor proper.

EXECUTE '<dynamic string>' is the opposite extreme: exec_stmt_dynexecute evaluates the string expression, then runs it with SPI_execute_extended without saving a plan, because the text can differ every call. CALL (exec_stmt_call) uses SPI_execute_plan_extended with allow_nonatomic = true so the called procedure may commit/rollback, and detects a transaction boundary by comparing MyProc->vxid.lxid before and after.

A BEGIN ... EXCEPTION WHEN ... END block is the place where the interpreter touches the transaction manager. When block->exceptions is non-NULL, exec_stmt_block opens an internal subtransaction, runs the body in a fresh eval econtext under a PG_TRY, and on success releases the subtransaction; on error it rolls the subtransaction back (undoing the body’s partial effects), copies the ErrorData, and scans the handler list for a SQLSTATE-condition match:

// exec_stmt_block (exception path) — src/pl/plpgsql/src/pl_exec.c
BeginInternalSubTransaction(NULL);
MemoryContextSwitchTo(oldcontext);
PG_TRY();
{
plpgsql_create_econtext(estate); /* econtext tied to the subxact */
rc = exec_stmts(estate, block->body);
...
ReleaseCurrentSubTransaction(); /* commit inner subxact */
estate->eval_econtext = old_eval_econtext;
}
PG_CATCH();
{
edata = CopyErrorData();
FlushErrorState();
RollbackAndReleaseCurrentSubTransaction(); /* undo the body */
...
foreach(e, block->exceptions->exc_list)
{
PLpgSQL_exception *exception = (PLpgSQL_exception *) lfirst(e);
if (exception_matches_conditions(edata, exception->conditions))
{
assign_text_var(estate, state_var, unpack_sql_state(edata->sqlerrcode));
assign_text_var(estate, errm_var, edata->message);
estate->cur_error = edata;
rc = exec_stmts(estate, exception->action); /* run handler */
break;
}
}
if (e == NULL) ReThrowError(edata); /* no handler matched: re-raise */
}
PG_END_TRY();

The cost is real: any function containing an EXCEPTION block pays subtransaction setup/teardown on every entry to that block, which is why the PL/pgSQL documentation warns against wrapping tight loops in exception blocks. The magic variables SQLSTATE and SQLERRM are ordinary datums (sqlstate_varno, sqlerrm_varno) populated from the captured ErrorData before the handler runs.

flowchart TD
  E["enter BEGIN...EXCEPTION block"] --> B["BeginInternalSubTransaction<br/>plpgsql_create_econtext"]
  B --> T["PG_TRY: exec_stmts(body)"]
  T -->|no error| OK["ReleaseCurrentSubTransaction<br/>(commit inner subxact)"]
  T -->|error| Cat["PG_CATCH:<br/>CopyErrorData + FlushErrorState"]
  Cat --> RB["RollbackAndReleaseCurrentSubTransaction<br/>(undo body effects)"]
  RB --> M{"match exc_list<br/>by SQLSTATE?"}
  M -->|match| H["set SQLSTATE/SQLERRM<br/>exec_stmts(handler.action)"]
  M -->|no match| RT["ReThrowError(edata)<br/>propagate outward"]
  OK --> Z["block return code handling"]
  H --> Z

This subtransaction-per-block model defers the actual savepoint/XID mechanics to the transaction manager (postgres-xact.md); PL/pgSQL only orchestrates the Begin/Release/Rollback calls and the handler dispatch.

Returning values: scalar, composite, and set-returning

Section titled “Returning values: scalar, composite, and set-returning”

exec_stmt_return sets estate->retval/retisnull/rettype and signals PLPGSQL_RC_RETURN, which unwinds the whole recursion back to plpgsql_exec_function, where the value is coerced to the declared return type. There is a deliberate fast path: when the RETURN expression is a bare variable reference (stmt->retvarno >= 0 — always true for functions with OUT parameters), the executor copies the datum’s value into retval directly rather than routing through exec_eval_expr, which lets a read/write expanded object (a large array, say) be transferred to the caller’s context cheaply instead of being flattened to a read-only copy:

// exec_stmt_return — src/pl/plpgsql/src/pl_exec.c
if (estate->retisset)
return PLPGSQL_RC_RETURN; /* SRF: final RETURN ends tuple production */
...
if (stmt->retvarno >= 0)
{
PLpgSQL_datum *retvar = estate->datums[stmt->retvarno];
switch (retvar->dtype)
{
case PLPGSQL_DTYPE_PROMISE:
plpgsql_fulfill_promise(estate, (PLpgSQL_var *) retvar);
/* FALL THRU */
case PLPGSQL_DTYPE_VAR:
{
PLpgSQL_var *var = (PLpgSQL_var *) retvar;
estate->retval = var->value; /* R/W expanded value transfers cheaply */
estate->retisnull = var->isnull;
estate->rettype = var->datatype->typoid;
}
...
}
}

Set-returning PL/pgSQL functions (RETURNS SETOF ...) work differently: RETURN NEXT / RETURN QUERY append rows into a Tuplestorestate (estate->tuple_store) rather than ending the function, and the final plain RETURN just signals completion (the retisset early-return above). The materialized tuplestore is handed back through the ReturnSetInfo the function manager passed in — the same SFRM_Materialize protocol any set-returning function uses. This is why a PL/pgSQL SRF buffers its entire result before any row reaches the caller.

The four .c files split cleanly by responsibility: pl_handler.c is the function-manager boundary (handlers, validator, GUCs, xact callbacks); pl_comp.c turns source text into the PLpgSQL_function AST and owns the namespace and parser hooks; pl_exec.c is the tree-walking interpreter plus the SPI/simple-expression bridge; pl_funcs.c holds the namespace stack primitives, the AST tree-walker, memory-freeing, and the dump_* AST pretty-printer. The shared type vocabulary lives in plpgsql.h.

  • _PG_init — registers the plpgsql.* GUCs (variable_conflict, check_asserts, extra_warnings/extra_errors), the xact/subxact callbacks (plpgsql_xact_cb, plpgsql_subxact_cb), and the PLpgSQL_plugin rendezvous pointer.
  • plpgsql_call_handler — the PG_FUNCTION_INFO_V1 entry the function manager calls; connects SPI, compiles/fetches the function, dispatches to plpgsql_exec_function / plpgsql_exec_trigger / plpgsql_exec_event_trigger.
  • plpgsql_inline_handler — runs DO blocks; compiles inline, builds a private EState/ResourceOwner, frees the function after.
  • plpgsql_validatorCREATE FUNCTION-time checks + optional test-compile under check_function_bodies.
  • plpgsql_compile / plpgsql_compile_callback — cache wrapper + per-function-context AST builder.
  • plpgsql_compile_inline — the DO-block compile path.
  • plpgsql_parser_setup, plpgsql_pre_column_ref, plpgsql_post_column_ref, resolve_column_ref, plpgsql_param_ref, make_datum_param — the parser hooks that resolve SQL identifiers to PL/pgSQL datums and emit PARAM_EXTERN params.
  • plpgsql_build_variable, plpgsql_build_datatype, add_parameter_name, add_dummy_return — datum construction.

Namespace, tree-walk, dump, free (pl_funcs.c)

Section titled “Namespace, tree-walk, dump, free (pl_funcs.c)”
  • plpgsql_ns_init, plpgsql_ns_push, plpgsql_ns_pop, plpgsql_ns_additem, plpgsql_ns_lookup, plpgsql_ns_lookup_label, plpgsql_ns_find_nearest_loop — the compile-time namespace stack (block-scoped variable resolution).
  • plpgsql_stmt_typename, plpgsql_getdiag_kindname — human-readable names.
  • plpgsql_statement_tree_walker_impl, plpgsql_mark_local_assignment_targets — generic AST traversal used for analysis.
  • plpgsql_free_function_memory, plpgsql_delete_callback — context teardown.
  • plpgsql_dumptree and the dump_* family — DEBUG AST printer.
  • plpgsql_exec_function, plpgsql_exec_trigger, plpgsql_exec_event_trigger — top-level entry points; copy_plpgsql_datums, plpgsql_estate_setup.
  • exec_toplevel_block, exec_stmt_block, exec_stmts — the recursion core.
  • exec_stmt_assign, exec_stmt_if, exec_stmt_case, exec_stmt_loop, exec_stmt_while, exec_stmt_fori, exec_stmt_fors, exec_stmt_forc, exec_stmt_foreach_a, exec_stmt_exit, exec_stmt_return, exec_stmt_return_next, exec_stmt_return_query, exec_stmt_raise, exec_stmt_assert, exec_stmt_getdiag — per-construct handlers.
  • exec_stmt_execsql, exec_stmt_dynexecute, exec_stmt_dynfors, exec_stmt_open, exec_stmt_fetch, exec_stmt_close, exec_stmt_perform, exec_stmt_call — SQL and cursor statements.
  • exec_prepare_plan, setup_param_list, exec_assign_expr, exec_assign_value, exec_eval_expr, exec_run_select, exec_move_row — the expression/assignment/row machinery.
  • exec_eval_simple_expr, exec_simple_check_plan, exec_is_simple_query, exec_save_simple_expr, plpgsql_param_eval_var, plpgsql_param_compile — the simple-expression fast path.
  • plpgsql_create_econtext, plpgsql_destroy_econtext, assign_simple_var, assign_text_var, exception_matches_conditions — runtime support.

Position hints (as of 2026-06-05, REL_18 273fe94)

Section titled “Position hints (as of 2026-06-05, REL_18 273fe94)”
SymbolFileLine
plpgsql_call_handlersrc/pl/plpgsql/src/pl_handler.c224
plpgsql_inline_handlersrc/pl/plpgsql/src/pl_handler.c316
plpgsql_validatorsrc/pl/plpgsql/src/pl_handler.c442
_PG_initsrc/pl/plpgsql/src/pl_handler.c148
plpgsql_compilesrc/pl/plpgsql/src/pl_comp.c106
plpgsql_compile_callbacksrc/pl/plpgsql/src/pl_comp.c167
plpgsql_compile_inlinesrc/pl/plpgsql/src/pl_comp.c739
plpgsql_param_refsrc/pl/plpgsql/src/pl_comp.c1056
resolve_column_refsrc/pl/plpgsql/src/pl_comp.c1083
plpgsql_ns_lookupsrc/pl/plpgsql/src/pl_funcs.c130
plpgsql_ns_find_nearest_loopsrc/pl/plpgsql/src/pl_funcs.c214
plpgsql_stmt_typenamesrc/pl/plpgsql/src/pl_funcs.c232
plpgsql_free_function_memorysrc/pl/plpgsql/src/pl_funcs.c716
plpgsql_exec_functionsrc/pl/plpgsql/src/pl_exec.c493
exec_stmt_blocksrc/pl/plpgsql/src/pl_exec.c1663
exec_stmtssrc/pl/plpgsql/src/pl_exec.c1996
exec_stmt_callsrc/pl/plpgsql/src/pl_exec.c2197
exec_stmt_forisrc/pl/plpgsql/src/pl_exec.c2696
exec_stmt_returnsrc/pl/plpgsql/src/pl_exec.c3197
exec_stmt_raisesrc/pl/plpgsql/src/pl_exec.c3725
exec_prepare_plansrc/pl/plpgsql/src/pl_exec.c4173
exec_stmt_execsqlsrc/pl/plpgsql/src/pl_exec.c4208
exec_stmt_dynexecutesrc/pl/plpgsql/src/pl_exec.c4440
exec_assign_valuesrc/pl/plpgsql/src/pl_exec.c5061
exec_eval_exprsrc/pl/plpgsql/src/pl_exec.c5665
exec_run_selectsrc/pl/plpgsql/src/pl_exec.c5753
exec_eval_simple_exprsrc/pl/plpgsql/src/pl_exec.c6019
setup_param_listsrc/pl/plpgsql/src/pl_exec.c6250
plpgsql_param_eval_varsrc/pl/plpgsql/src/pl_exec.c6672
exec_is_simple_querysrc/pl/plpgsql/src/pl_exec.c8205
plpgsql_create_econtextsrc/pl/plpgsql/src/pl_exec.c8610
PLpgSQL_datum_type (enum)src/pl/plpgsql/src/plpgsql.h62
PLpgSQL_stmt_type (enum)src/pl/plpgsql/src/plpgsql.h103
PLpgSQL_expr (struct)src/pl/plpgsql/src/plpgsql.h230
PLpgSQL_var (struct)src/pl/plpgsql/src/plpgsql.h332
PLpgSQL_rec (struct)src/pl/plpgsql/src/plpgsql.h412
PLpgSQL_stmt_block (struct)src/pl/plpgsql/src/plpgsql.h525
PLpgSQL_function (struct)src/pl/plpgsql/src/plpgsql.h958
PLpgSQL_execstate (struct)src/pl/plpgsql/src/plpgsql.h1012

Verified against /data/hgryoo/references/postgres at REL_18_STABLE, commit 273fe94.

  • Handler trio + validator. plpgsql_call_handler, plpgsql_inline_handler, and plpgsql_validator are present in pl_handler.c with the PG_FUNCTION_INFO_V1 macros; the dispatch on CALLED_AS_TRIGGER / CALLED_AS_EVENT_TRIGGER, the use_count guard, and the SPI_connect_ext(nonatomic ? SPI_OPT_NONATOMIC : 0) call are quoted verbatim. PG_MODULE_MAGIC_EXT(.name = "plpgsql", .version = PG_VERSION) confirms the PG18 module-magic form.
  • Compile cache. plpgsql_compile delegates to cached_function_compile (funccache.c) — the PG17+ unification of the old per-PL caching — passing plpgsql_compile_callback/plpgsql_delete_callback. Confirmed the result is stored on fcinfo->flinfo->fn_extra.
  • AST / datum model. The PLpgSQL_datum_type enum lists exactly VAR, ROW, REC, RECFIELD, PROMISE; PLpgSQL_stmt_type lists the 26 PLPGSQL_STMT_* constants used by the exec_stmts switch. PLpgSQL_var carries the runtime value/isnull/promise; PLpgSQL_rec stores an ExpandedRecordHeader *erh (records always expanded). All four struct shapes match the quoted excerpts.
  • Simple-expression path. exec_is_simple_query’s rejection list (hasAggs, hasWindowFuncs, hasTargetSRFs, hasSubLinks, cteList, non-empty rtable/fromlist/quals, etc.) and the single-target-list requirement are quoted directly. exec_eval_simple_expr’s per-LXID CachedPlanIsSimplyValid revalidation and replan-or-fall-back logic is present as shown. plpgsql_param_eval_var reads estate->datums[dno] inline.
  • Exception subtransactions. exec_stmt_block calls BeginInternalSubTransaction(NULL) under PG_TRY, and the PG_CATCH performs CopyErrorData + FlushErrorState + RollbackAndReleaseCurrentSubTransaction, then iterates block->exceptions->exc_list calling exception_matches_conditions, and ReThrowError(edata) when no handler matches. Verbatim.
  • No PG19-only claims. This doc asserts only REL_18 facts. It makes no reference to any post-18 rmgr, BackendType, or executor change; the funccache-based compile cache and PG_MODULE_MAGIC_EXT are PG17/PG18 features that are present in this tree.

Two simplifications are intentional and noted here rather than asserted as full coverage: the cursor statements (OPEN/FETCH/CLOSE, exec_stmt_forc) and the trigger/event-trigger executors are named in the walkthrough but not excerpted, because their SPI-cursor mechanics belong with postgres-spi.md; and the read/write expanded-object assignment optimization (expr_rwopt, PLPGSQL_RWOPT_*) is described at the level of “why arrays are forced to expanded form” without tracing every exec_assign_value branch.

Beyond PostgreSQL — Comparative Designs & Research Frontiers

Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”
  • PL/pgSQL vs. Oracle PL/SQL. PL/pgSQL’s surface syntax is deliberately PL/SQL-like (block structure, %TYPE/%ROWTYPE, EXCEPTION WHEN), but the execution models differ sharply. Oracle compiles PL/SQL to a bytecode (or, with native compilation, to C/machine code) executed by a dedicated PL/SQL virtual machine, and PL/SQL shares a tightly-integrated runtime with the SQL engine. PL/pgSQL is a pure tree-walking interpreter that reaches the SQL engine only through the arms-length SPI boundary. A focused comparison of the per-statement dispatch cost (PL/pgSQL’s switch + handler call vs. a bytecode goto) would quantify what PostgreSQL trades for implementation simplicity and the ability to ship the PL as a loadable extension.

  • The SPI boundary as an extensibility seam. Because every PL talks to the engine through SPI and the handler ABI, PostgreSQL supports PL/Python, PL/Perl, PL/Tcl, PL/v8, and arbitrary third-party PLs with no core changes. This is the Stonebraker-era “extensible type system” philosophy (POSTGRES design papers, see dbms-papers/) extended to languages. The cost is that no PL can do cross-statement optimizations the engine can’t see; the benefit is a clean plug-in model. Mapping the exact handler contract against PL/Python’s handler would sharpen what the ABI actually requires.

  • Simple-expression fast path vs. JIT. PL/pgSQL’s biggest performance win is keeping scalar expressions out of the SPI executor (exec_eval_simple_expr). The natural research frontier is whether the backend’s LLVM JIT (postgres-expression-eval.md, jit/) could compile a whole PL/pgSQL function body — not just individual ExprStates — into native code, the way Oracle native compilation and SQL Server’s natively-compiled stored procedures (Hekaton) do. PostgreSQL deliberately does not; the interpreter overhead is judged acceptable next to SQL execution cost for most workloads, but compute-heavy PL/pgSQL is where this assumption breaks.

  • Exception blocks and subtransaction cost. Tying EXCEPTION to internal subtransactions is correct and simple but imposes per-block savepoint overhead and consumes subtransaction XIDs (the pg_subtrans SLRU, postgres-multixact.md / postgres-slru.md). Engines with cheaper nested rollback (e.g., undo-log MVCC like Oracle or MySQL/InnoDB) can offer exception handling without this tax. A measurement of BeginInternalSubTransaction cost in a tight loop versus the equivalent in an undo-based engine would frame the trade.

  • CUBRID stored procedures. CUBRID historically leaned on Java stored procedures executing in an external JVM bridged to the engine, a very different point in the design space from an in-process tree-walking interpreter. A side-by-side of the call/marshalling path (CUBRID’s JVM-bridge round trip vs. PL/pgSQL’s in-backend SPI call) would highlight what in-process execution buys in latency. (See the CUBRID analyses in the cubrid tree.)

In-tree source files (REL_18_STABLE, commit 273fe94)

Section titled “In-tree source files (REL_18_STABLE, commit 273fe94)”
  • src/pl/plpgsql/src/pl_handler.c_PG_init, the GUC/xact-callback registration, and the plpgsql_call_handler / plpgsql_inline_handler / plpgsql_validator trio that implements the function-manager handler ABI.
  • src/pl/plpgsql/src/pl_comp.cplpgsql_compile, plpgsql_compile_callback, plpgsql_compile_inline, the parser hooks (plpgsql_parser_setup, plpgsql_param_ref, resolve_column_ref, make_datum_param), and datum/datatype construction.
  • src/pl/plpgsql/src/pl_exec.cplpgsql_exec_function, exec_stmt_block, exec_stmts, the full exec_stmt_* family, the SPI bridge (exec_prepare_plan, exec_stmt_execsql, setup_param_list), and the simple-expression fast path (exec_eval_simple_expr, exec_is_simple_query, plpgsql_param_eval_var).
  • src/pl/plpgsql/src/pl_funcs.c — the compile-time namespace stack, the AST tree-walker, plpgsql_free_function_memory, and the dump_* AST printer.
  • src/pl/plpgsql/src/plpgsql.hPLpgSQL_datum_type / PLpgSQL_stmt_type enums, and the PLpgSQL_expr, PLpgSQL_var/row/rec/recfield, PLpgSQL_stmt_block, PLpgSQL_function, and PLpgSQL_execstate structs.
  • Database System Concepts (Silberschatz, Korth, Sudarshan, 7e), ch. 5 “Advanced SQL” — SQL/PSM functions and procedures, procedural constructs, and condition handlers, the standard PL/pgSQL implements (knowledge/research/dbms-general/database-system-concepts.md).
  • Database Internals (Petrov 2019) — parse-once/execute-many plan caching and the cost model that motivates AST + plan caching (knowledge/research/dbms-general/database-internals.md).
  • Stonebraker & Rowe (1986), “The Design of POSTGRES” and the POSTGRES data-model/implementation papers — the extensible-type-system lineage that the handler/SPI plug-in model for procedural languages descends from (cited in .omc/plans/postgres-paper-bibliography.md).

Sibling docs (cross-references — mechanism owned there, not duplicated here)

Section titled “Sibling docs (cross-references — mechanism owned there, not duplicated here)”
  • postgres-spi.md — the Server Programming Interface (SPI_connect, SPI_prepare_extended, SPI_execute_plan_with_paramlist, SPI cursors) that every embedded query in PL/pgSQL routes through.
  • postgres-fmgr.md — the V1 function-call ABI (PG_FUNCTION_ARGS, fcinfo, FmgrInfo.fn_extra) and the language-handler registration that plpgsql_call_handler plugs into.
  • postgres-portals-prepared.md — the cached-plan / CachedPlanSource machinery that PL/pgSQL’s per-PLpgSQL_expr plans and simple-expression revalidation depend on.
  • postgres-expression-eval.md — the ExprState / ExprContext interpreter that exec_eval_simple_expr runs directly.
  • postgres-xact.md — the (sub)transaction state machine that BeginInternalSubTransaction / RollbackAndReleaseCurrentSubTransaction drive for EXCEPTION blocks.
  • postgres-triggers.md / postgres-event-triggers.md — the trigger and event-trigger firing paths that plpgsql_exec_trigger / plpgsql_exec_event_trigger serve.
  • postgres-architecture-overview.md — Axis (extensibility): pluggable procedural languages as loadable extensions behind the handler ABI.