Skip to content

PostgreSQL SPI — The Server Programming Interface: How PLs and Extensions Run SQL From Inside the Backend

Contents:


Every relational engine eventually faces the same architectural question: how does code running inside the server submit SQL? The wire protocol (PQexec, the extended-query Parse/Bind/Execute messages) is the answer for an external client — a separate process that talks to the backend over a socket. But a stored procedure, a trigger function, or a procedural-language handler is not a separate process. It is C code linked into the backend, running on the backend’s own stack, sharing the backend’s transaction, its snapshot, its lock table, its memory contexts. For such code, going back out to the socket would be absurd: it would mean serializing a query to bytes, shipping it to itself, and re-parsing the result. What it needs instead is a function-call interface to the very parser/planner/executor pipeline it is already sitting inside.

This is the embedded query problem, and it is as old as stored procedures themselves. The SQL standard’s answer is embedded SQL (the EXEC SQL preprocessor of ESQL/C) and the module language; the practical answer that every major engine ships is a C-level library with three irreducible primitives:

  1. A session/connection handle — a way to say “I am now a query-issuing agent; allocate me whatever state I need,” paired with a teardown that frees it. Embedded code can nest (a function calls a function that issues SQL), so this handle is naturally a stack.
  2. Prepare/execute separation — the ability to parse-and-plan a statement once and execute it many times with different parameter values, because the archetypal consumer is a loop body inside a PL function that runs the same INSERT ten thousand times.
  3. A result cursor — a way to walk a large result set one row (or one batch) at a time, rather than materializing millions of rows into the function’s address space at once.

The deeper theory here is about isolation of execution contexts. When function f (issuing query Q1) calls function g (issuing query Q2), the two queries must not corrupt each other’s notion of “the current result,” “the current snapshot,” or “the memory in which my tuples live.” A textbook treatment (see Database System Concepts, the chapters on application development and on the storage/buffer interface captured in dbms-general/database-system-concepts.md) frames this as the classic problem of reentrancy with shared global state: the API exposes globals for ergonomics (SPI_tuptable is a plain global so callers don’t thread a handle through every line), but the implementation must save-and-restore those globals across every nesting boundary, exactly as a CPU saves callee registers.

The second theoretical axis is plan caching. Preparing a statement produces a generic or custom plan that can be reused, but reuse across executions spanning DDL, search_path changes, or statistics churn requires revalidation: the cached plan must be checked-and-possibly-replanned on each use. PostgreSQL factors this into a separate plancache subsystem (postgres-portals-prepared.md), and SPI is one of its principal clients — so “how SPI runs SQL” is inseparable from “how the plan cache decides whether to reuse or rebuild.”

The third axis — the one that makes SPI genuinely hard rather than merely clerical — is snapshot and transaction management across the embedded boundary. A read-only function (IMMUTABLE/STABLE) must see a stable snapshot for the duration of the surrounding query, so it should not take a fresh snapshot per statement. A VOLATILE function, by contrast, must see the effects of its own prior modifications, so it advances the command counter and re-snapshots. And a procedure called via CALL (or a DO block) may legally COMMIT mid-body, which means SPI must be able to tear down and restart the ambient transaction without losing the procedure’s own state. These three snapshot regimes, plus the transaction-control case, are the substance of the SPI executor loop.


Before looking at PostgreSQL’s symbols, it is worth naming the conventions that every server-side SQL API converges on, so the PostgreSQL specifics read as choices within a shared design space.

Server-side SQL APIs are reentrant: embedded code calls embedded code. The universal solution is a stack of execution frames, one pushed per connect, popped per finish, where each frame owns the resources allocated on its behalf. Oracle’s OCI/OCIStmt handles, SQL Server’s SQLCLR SqlContext, DB2’s SQL-routine runtime — all maintain per-invocation state that nests. The reason it is a stack and not a flat handle is that the outer frame’s results must survive intact while the inner frame runs and is torn down.

For ergonomics, these APIs expose the “most recent result” as ambient state rather than forcing the caller to thread a handle through every accessor. The implementation cost is that the connect path must snapshot the outer globals and the finish path must restore them, so a nested call is transparent to its caller. This is the embedded-SQL analog of caller/callee register save conventions.

Two execution modes are universal:

  • One-shot: hand over a query string, and the API parses, plans, and executes it, discarding the plan afterward. Cheap to call, expensive per call. The right choice for a statement run once.
  • Prepared: parse-and-plan once into a reusable handle, then execute the handle repeatedly with bound parameters. Expensive to set up, cheap to repeat. The right choice for a loop body.

A mature API additionally distinguishes transient prepared plans (live only until the current operation ends) from saved plans (live across calls, parented in a long-lived cache context). The lifetime decision is explicit because it trades memory for replanning cost.

Two delivery shapes recur. A materializing receiver accumulates the whole result set into a table-like structure in the caller’s memory — simple, but unbounded in size. A cursor/portal executes the plan lazily and lets the caller pull rows in batches — bounded memory, suitable for streaming a large result through a loop. Server-side APIs offer both and let the PL choose.

The embedded query usually runs under the caller’s transaction and, for read-only callers, the caller’s snapshot — it does not start its own transaction. Whether to take a fresh snapshot (to see one’s own prior writes) or reuse the existing one (to stay stable) is the single most consequential correctness knob, and it is driven by the volatility classification of the function issuing the SQL.

flowchart TD
  PL["PL handler / extension C code<br/>(already inside a backend)"] -->|"SPI_connect"| STK["push _SPI_connection frame<br/>(stack, snapshots globals)"]
  STK -->|"SPI_execute(src)"| ONE["one-shot:<br/>parse + rewrite + plan + execute"]
  STK -->|"SPI_prepare(src)"| PREP["build CachedPlanSource<br/>(transient SPIPlan)"]
  PREP -->|"SPI_keepplan"| SAVE["reparent into CacheMemoryContext<br/>(saved plan, reusable)"]
  SAVE -->|"SPI_execute_plan(plan, params)"| RUN["revalidate plan + execute"]
  STK -->|"SPI_cursor_open(plan)"| CUR["hand plan to Portal<br/>fetch incrementally"]
  ONE --> TT["SPITupleTable<br/>(SPI_processed / SPI_tuptable)"]
  RUN --> TT
  CUR -->|"SPI_cursor_fetch"| TT
  TT -->|"SPI_finish"| POP["pop frame, free contexts,<br/>restore outer globals"]

PostgreSQL implements all of the above conventions in a single file, src/backend/executor/spi.c, atop a small private header src/include/executor/spi_priv.h. The whole design hangs off four data structures and a global stack.

The connection stack. SPI keeps a process-global, dynamically grown array _SPI_stack of _SPI_connection frames, with _SPI_connected indexing the top and _SPI_current pointing at it. Each SPI_connect pushes a frame; each SPI_finish pops it. Crucially, the stack can move (it is repalloc’d when it grows), so any code that re-enters SPI must re-fetch _SPI_current after a nested call — a hazard the source calls out explicitly in _SPI_cursor_operation.

Two memory contexts per frame. Every frame owns a procCxt (“SPI Proc”) and an execCxt (“SPI Exec”). The split is the heart of SPI’s lifetime model: the exec context is reset after every SPI call (it holds parse/plan/execute scratch), while the proc context survives until SPI_finish (it holds results the caller will read, including SPITupleTables and transient plans). In atomic contexts these are children of TopTransactionContext; in non-atomic contexts (procedures that may commit) they hang off PortalContext so they outlive a transaction boundary.

Three API globals, snapshotted. SPI_processed (row count), SPI_tuptable (the result table), and SPI_result (error code) are plain globals. SPI_connect_ext saves the outer values into outer_processed, outer_tuptable, outer_result and zeroes the live ones; SPI_finish restores them. This is what makes nesting transparent.

The SPITupleTable. Results materialize into a SPITupleTable: a palloc0’d struct living in its own child context (tuptabcxt, “SPI TupTable”), holding a growable vals[] array of HeapTuple and a tupdesc. The DestSPI receiver — spi_dest_startup allocates the table, spi_printtup appends each tuple — is how the executor streams rows into it. Every live tuptable is linked onto the frame’s tuptables slist so subtransaction abort can free orphans.

Plans are plancache entries. An SPIPlanPtr (_SPI_plan) is a thin wrapper around a list of CachedPlanSources. A transient plan lives in execCxt and dies at end-of-call; _SPI_make_plan_non_temp copies it into procCxt; SPI_keepplan/_SPI_save_plan reparents it into CacheMemoryContext and marks the plancache entries saved, making the plan reusable across calls and immune to transaction end. Execution always goes through GetCachedPlan, which revalidates and replans as needed.

The snapshot regimes. _SPI_execute_plan is where SPI earns its complexity. Its header comment enumerates four behaviors keyed on (snapshot supplied?) × (read_only?). For read-only callers it reuses the active snapshot; for read-write it advances the command counter and takes a fresh transaction snapshot per statement; for non-atomic execution it leans on the Portal’s snapshot so a mid-procedure COMMIT works.

flowchart TD
  EP["_SPI_execute_plan(plan, options,<br/>snapshot, crosscheck)"] --> Q{"snapshot != InvalidSnapshot?"}
  Q -->|"yes, read_only"| PUSH1["PushActiveSnapshot(snapshot)<br/>one push for whole plan"]
  Q -->|"yes, read-write"| PUSH2["PushCopiedSnapshot(snapshot)<br/>advance CID per querytree"]
  Q -->|"no (InvalidSnapshot)"| LOOP["per CachedPlanSource:"]
  LOOP --> NEED{"stmt requires snapshot?"}
  NEED -->|"no"| SKIP["run with no snapshot"]
  NEED -->|"yes"| ENS["EnsurePortalSnapshotExists()"]
  ENS --> RW{"read-write &amp; atomic?"}
  RW -->|"yes"| FRESH["PushActiveSnapshot(GetTransactionSnapshot())<br/>fresh per statement-list"]
  RW -->|"no / nonatomic"| PORTAL["use Portal snapshot unmodified"]
  PUSH1 --> EXEC["foreach stmt: CreateQueryDesc + _SPI_pquery<br/>or ProcessUtility"]
  PUSH2 --> EXEC
  FRESH --> EXEC
  PORTAL --> EXEC
  SKIP --> EXEC

The remaining sections walk the actual code for each of these moving parts.


All excerpts are condensed from REL_18 (commit 273fe94, captured 2026-06-05). Each block leads with a // symbol — path comment; exact line numbers are collected in the position-hint table at the end of this section.

1. Pushing a frame: SPI_connect / SPI_connect_ext

Section titled “1. Pushing a frame: SPI_connect / SPI_connect_ext”

SPI_connect is a thin wrapper; all the work is in SPI_connect_ext, which grows the stack, initializes the frame, and snapshots the outer globals.

// SPI_connect_ext — src/backend/executor/spi.c
int
SPI_connect_ext(int options)
{
int newdepth;
/* Enlarge stack if necessary */
if (_SPI_stack == NULL)
{
if (_SPI_connected != -1 || _SPI_stack_depth != 0)
elog(ERROR, "SPI stack corrupted");
newdepth = 16;
_SPI_stack = (_SPI_connection *)
MemoryContextAlloc(TopMemoryContext,
newdepth * sizeof(_SPI_connection));
_SPI_stack_depth = newdepth;
}
else
{
if (_SPI_stack_depth == _SPI_connected + 1)
{
newdepth = _SPI_stack_depth * 2;
_SPI_stack = (_SPI_connection *)
repalloc(_SPI_stack, newdepth * sizeof(_SPI_connection));
_SPI_stack_depth = newdepth;
}
}
/* Enter new stack level */
_SPI_connected++;
_SPI_current = &(_SPI_stack[_SPI_connected]);
_SPI_current->processed = 0;
_SPI_current->tuptable = NULL;
slist_init(&_SPI_current->tuptables);
_SPI_current->connectSubid = GetCurrentSubTransactionId();
_SPI_current->atomic = (options & SPI_OPT_NONATOMIC ? false : true);
_SPI_current->internal_xact = false;
_SPI_current->outer_processed = SPI_processed;
_SPI_current->outer_tuptable = SPI_tuptable;
_SPI_current->outer_result = SPI_result;

Note the three outer_* fields: the caller’s results are stashed here so a nested SPI user cannot see them change underneath it. The atomic flag, derived from SPI_OPT_NONATOMIC, decides whether this frame may later commit a transaction.

The same function then builds the two memory contexts — and the choice of parent is the crux of SPI’s transaction-spanning ability:

// SPI_connect_ext (continued) — src/backend/executor/spi.c
/*
* In atomic contexts (the normal case), we use TopTransactionContext,
* otherwise PortalContext, so that it lives across transaction
* boundaries.
*/
_SPI_current->procCxt = AllocSetContextCreate(
_SPI_current->atomic ? TopTransactionContext : PortalContext,
"SPI Proc", ALLOCSET_DEFAULT_SIZES);
_SPI_current->execCxt = AllocSetContextCreate(
_SPI_current->atomic ? TopTransactionContext : _SPI_current->procCxt,
"SPI Exec", ALLOCSET_DEFAULT_SIZES);
/* ... and switch to procedure's context */
_SPI_current->savedcxt = MemoryContextSwitchTo(_SPI_current->procCxt);
/* Reset API global variables ... */
SPI_processed = 0;
SPI_tuptable = NULL;
SPI_result = 0;
return SPI_OK_CONNECT;
}

A non-atomic frame parents both contexts on PortalContext (not TopTransactionContext) precisely so that when the procedure’s body does COMMIT, the proc/exec contexts survive the transaction reset.

SPI_finish is the mirror image: switch back to the saved context, delete the two contexts (which frees every tuptable and transient plan), restore the outer globals, and pop the stack index.

// SPI_finish — src/backend/executor/spi.c
int
SPI_finish(void)
{
int res;
res = _SPI_begin_call(false); /* just check we're connected */
if (res < 0)
return res;
/* Restore memory context as it was before procedure call */
MemoryContextSwitchTo(_SPI_current->savedcxt);
/* Release memory used in procedure call (including tuptables) */
MemoryContextDelete(_SPI_current->execCxt);
_SPI_current->execCxt = NULL;
MemoryContextDelete(_SPI_current->procCxt);
_SPI_current->procCxt = NULL;
/* Restore outer API variables ... */
SPI_processed = _SPI_current->outer_processed;
SPI_tuptable = _SPI_current->outer_tuptable;
SPI_result = _SPI_current->outer_result;
/* Exit stack level */
_SPI_connected--;
if (_SPI_connected < 0)
_SPI_current = NULL;
else
_SPI_current = &(_SPI_stack[_SPI_connected]);
return SPI_OK_FINISH;
}

Because procCxt owns every SPITupleTable, deleting it reclaims all result memory in one stroke — there is no per-tuple bookkeeping at teardown. The comment “SPI_tuptable which is probably pointing at a just-deleted tuptable” is why the restore happens after the deletes.

3. The per-call boundary: _SPI_begin_call / _SPI_end_call

Section titled “3. The per-call boundary: _SPI_begin_call / _SPI_end_call”

Every public SPI entry point brackets its work with these two helpers. They switch into execCxt on entry and, on exit, switch back to procCxt and reset execCxt — this is what bounds the lifetime of parse/plan scratch to a single call.

// _SPI_begin_call / _SPI_end_call — src/backend/executor/spi.c
static int
_SPI_begin_call(bool use_exec)
{
if (_SPI_current == NULL)
return SPI_ERROR_UNCONNECTED;
if (use_exec)
{
_SPI_current->execSubid = GetCurrentSubTransactionId();
_SPI_execmem(); /* switch to executor context */
}
return 0;
}
static int
_SPI_end_call(bool use_exec)
{
if (use_exec)
{
_SPI_procmem(); /* switch back to procedure context */
_SPI_current->execSubid = InvalidSubTransactionId;
MemoryContextReset(_SPI_current->execCxt); /* free executor memory */
}
return 0;
}

_SPI_execmem and _SPI_procmem are one-liners — MemoryContextSwitchTo on execCxt and procCxt respectively — but they encode the entire “scratch dies per call, results survive per frame” discipline.

SPI_execute builds a stack-local _SPI_plan, parses it as a oneshot plan, and hands it straight to _SPI_execute_plan with InvalidSnapshot (meaning “SPI, manage the snapshot yourself per the read_only flag”).

// SPI_execute — src/backend/executor/spi.c
int
SPI_execute(const char *src, bool read_only, long tcount)
{
_SPI_plan plan;
SPIExecuteOptions options;
int res;
if (src == NULL || tcount < 0)
return SPI_ERROR_ARGUMENT;
res = _SPI_begin_call(true);
if (res < 0)
return res;
memset(&plan, 0, sizeof(_SPI_plan));
plan.magic = _SPI_PLAN_MAGIC;
plan.parse_mode = RAW_PARSE_DEFAULT;
plan.cursor_options = CURSOR_OPT_PARALLEL_OK;
_SPI_prepare_oneshot_plan(src, &plan);
memset(&options, 0, sizeof(options));
options.read_only = read_only;
options.tcount = tcount;
res = _SPI_execute_plan(&plan, &options,
InvalidSnapshot, InvalidSnapshot, true);
_SPI_end_call(true);
return res;
}

The plan lives on the C stack and its subsidiary data in execCxt, so it evaporates at _SPI_end_call. CURSOR_OPT_PARALLEL_OK lets a one-shot SELECT use parallel workers.

5. Preparing a reusable plan: SPI_prepare and _SPI_make_plan_non_temp

Section titled “5. Preparing a reusable plan: SPI_prepare and _SPI_make_plan_non_temp”

SPI_prepare (via SPI_prepare_cursor) parses into a transient _SPI_plan, then copies it into procCxt so it survives the call:

// SPI_prepare_cursor — src/backend/executor/spi.c
SPIPlanPtr
SPI_prepare_cursor(const char *src, int nargs, Oid *argtypes,
int cursorOptions)
{
_SPI_plan plan;
SPIPlanPtr result;
if (src == NULL || nargs < 0 || (nargs > 0 && argtypes == NULL))
{
SPI_result = SPI_ERROR_ARGUMENT;
return NULL;
}
SPI_result = _SPI_begin_call(true);
if (SPI_result < 0)
return NULL;
memset(&plan, 0, sizeof(_SPI_plan));
plan.magic = _SPI_PLAN_MAGIC;
plan.parse_mode = RAW_PARSE_DEFAULT;
plan.cursor_options = cursorOptions;
plan.nargs = nargs;
plan.argtypes = argtypes;
_SPI_prepare_plan(src, &plan);
/* copy plan to procedure context */
result = _SPI_make_plan_non_temp(&plan);
_SPI_end_call(true);
return result;
}

_SPI_make_plan_non_temp is the reparenting trick. Rather than deep-copying the plan trees, it creates an “SPI Plan” context under procCxt and reparents the existing CachedPlanSources into it, then unlinks them from the temporary plan so _SPI_end_call’s execCxt reset cannot touch them:

// _SPI_make_plan_non_temp — src/backend/executor/spi.c
static SPIPlanPtr
_SPI_make_plan_non_temp(SPIPlanPtr plan)
{
SPIPlanPtr newplan;
MemoryContext parentcxt = _SPI_current->procCxt;
MemoryContext plancxt;
ListCell *lc;
Assert(plan->magic == _SPI_PLAN_MAGIC);
Assert(!plan->oneshot); /* one-shot plans can't be saved */
plancxt = AllocSetContextCreate(parentcxt, "SPI Plan",
ALLOCSET_SMALL_SIZES);
MemoryContextSwitchTo(plancxt);
newplan = (SPIPlanPtr) palloc0(sizeof(_SPI_plan));
newplan->magic = _SPI_PLAN_MAGIC;
newplan->plancxt = plancxt;
/* ... copy parse_mode, cursor_options, argtypes ... */
foreach(lc, plan->plancache_list)
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc);
CachedPlanSetParentContext(plansource, parentcxt); /* reparent */
newplan->plancache_list = lappend(newplan->plancache_list, plansource);
}
/* For safety, unlink the CachedPlanSources from the temporary plan */
plan->plancache_list = NIL;
return newplan;
}

At this point the plan is unsaved: it lives in procCxt and dies at SPI_finish. To make it reusable across calls (the PL/pgSQL plan cache does this for every statement in a function body), the caller invokes SPI_keepplan.

6. Promoting to a saved plan: SPI_keepplan

Section titled “6. Promoting to a saved plan: SPI_keepplan”

SPI_keepplan reparents the plan context up to CacheMemoryContext — which outlives transactions entirely — and marks each CachedPlanSource saved so the plancache will track invalidations for it:

// SPI_keepplan — src/backend/executor/spi.c
int
SPI_keepplan(SPIPlanPtr plan)
{
ListCell *lc;
if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC ||
plan->saved || plan->oneshot)
return SPI_ERROR_ARGUMENT;
/*
* Mark it saved, reparent it under CacheMemoryContext, and mark all the
* component CachedPlanSources as saved. This sequence cannot fail
* partway through, so there's no risk of long-term memory leakage.
*/
plan->saved = true;
MemoryContextSetParent(plan->plancxt, CacheMemoryContext);
foreach(lc, plan->plancache_list)
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc);
SaveCachedPlan(plansource);
}
return 0;
}

The “cannot fail partway through” comment matters: a half-saved plan would leak into CacheMemoryContext permanently, so the sequence is deliberately allocation-free.

7. Executing a saved plan: SPI_execute_plan

Section titled “7. Executing a saved plan: SPI_execute_plan”

SPI_execute_plan converts the caller’s positional Values/Nulls arrays into a ParamListInfo and dispatches to _SPI_execute_plan — again with InvalidSnapshot so SPI manages snapshots:

// SPI_execute_plan — src/backend/executor/spi.c
int
SPI_execute_plan(SPIPlanPtr plan, Datum *Values, const char *Nulls,
bool read_only, long tcount)
{
SPIExecuteOptions options;
int res;
if (plan == NULL || plan->magic != _SPI_PLAN_MAGIC || tcount < 0)
return SPI_ERROR_ARGUMENT;
if (plan->nargs > 0 && Values == NULL)
return SPI_ERROR_PARAM;
res = _SPI_begin_call(true);
if (res < 0)
return res;
memset(&options, 0, sizeof(options));
options.params = _SPI_convert_params(plan->nargs, plan->argtypes,
Values, Nulls);
options.read_only = read_only;
options.tcount = tcount;
res = _SPI_execute_plan(plan, &options,
InvalidSnapshot, InvalidSnapshot, true);
_SPI_end_call(true);
return res;
}

8. The engine room: _SPI_execute_plan snapshot management

Section titled “8. The engine room: _SPI_execute_plan snapshot management”

This is the function that implements the four snapshot regimes. The leading case handles a caller-supplied snapshot (used internally, e.g. by SPI_execute_snapshot):

// _SPI_execute_plan — src/backend/executor/spi.c
static int
_SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
Snapshot snapshot, Snapshot crosscheck_snapshot,
bool fire_triggers)
{
bool allow_nonatomic;
bool pushed_active_snap = false;
ResourceOwner plan_owner = options->owner;
CachedPlan *cplan = NULL;
ListCell *lc1;
allow_nonatomic = options->allow_nonatomic &&
!_SPI_current->atomic && !IsSubTransaction();
if (snapshot != InvalidSnapshot)
{
Assert(!options->allow_nonatomic);
if (options->read_only)
{
PushActiveSnapshot(snapshot); /* use exactly this snapshot */
pushed_active_snap = true;
}
else
{
PushCopiedSnapshot(snapshot); /* private copy to advance CID */
pushed_active_snap = true;
}
}
/* ... ensure resource owner iff plan->saved ... */

When the snapshot is InvalidSnapshot (the public-API case), the per-statement loop decides whether each statement-list needs a snapshot and, for read-write atomic execution, takes a fresh transaction snapshot per statement-list:

// _SPI_execute_plan (per-plansource loop) — src/backend/executor/spi.c
foreach(lc1, plan->plancache_list)
{
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
/* ... one-shot plans get parse-analyzed here via CompleteCachedPlan ... */
/* Replan if needed, and increment plan refcount. */
cplan = GetCachedPlan(plansource, options->params,
plan_owner, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
if (snapshot == InvalidSnapshot &&
(list_length(stmt_list) > 1 ||
(list_length(stmt_list) == 1 &&
PlannedStmtRequiresSnapshot(linitial_node(PlannedStmt,
stmt_list)))))
{
/* back-fill a Portal snapshot in case prior op was COMMIT/ROLLBACK */
EnsurePortalSnapshotExists();
if (!options->read_only && !allow_nonatomic)
{
if (pushed_active_snap)
PopActiveSnapshot();
PushActiveSnapshot(GetTransactionSnapshot());
pushed_active_snap = true;
}
}
/* ... inner loop over individual PlannedStmts ... */
}

The inner statement loop is where read-write execution advances the command counter so each statement sees the previous one’s effects, then dispatches to either the executor (_SPI_pquery) or ProcessUtility:

// _SPI_execute_plan (inner stmt loop) — src/backend/executor/spi.c
foreach(lc2, stmt_list)
{
PlannedStmt *stmt = lfirst_node(PlannedStmt, lc2);
DestReceiver *dest;
_SPI_current->processed = 0;
_SPI_current->tuptable = NULL;
/* read-only callers may not run a volatile command */
if (options->read_only && !CommandIsReadOnly(stmt))
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("%s is not allowed in a non-volatile function",
CreateCommandName((Node *) stmt))));
/* read-write: advance command counter + snapshot CID per command */
if (!options->read_only && pushed_active_snap)
{
CommandCounterIncrement();
UpdateActiveSnapshotCommandId();
}
if (!stmt->canSetTag)
dest = CreateDestReceiver(DestNone);
else if (options->dest)
dest = options->dest;
else
dest = CreateDestReceiver(DestSPI);
if (stmt->utilityStmt == NULL)
{
QueryDesc *qdesc;
Snapshot snap = ActiveSnapshotSet() ? GetActiveSnapshot()
: InvalidSnapshot;
qdesc = CreateQueryDesc(stmt, plansource->query_string,
snap, crosscheck_snapshot, dest,
options->params,
_SPI_current->queryEnv, 0);
res = _SPI_pquery(qdesc, fire_triggers,
stmt->canSetTag ? options->tcount : 0);
FreeQueryDesc(qdesc);
}
else
/* ProcessUtility(...) for DDL/utility statements */ ;
}

The CommandIsReadOnly guard is SPI’s enforcement of function volatility: a STABLE/IMMUTABLE function that tries to run an INSERT is rejected here with “not allowed in a non-volatile function.”

9. Result delivery: spi_dest_startup and spi_printtup

Section titled “9. Result delivery: spi_dest_startup and spi_printtup”

The DestSPI receiver materializes executor output into a SPITupleTable. spi_dest_startup allocates the table in a fresh child of procCxt and links it onto the frame’s tuptables list:

// spi_dest_startup — src/backend/executor/spi.c
void
spi_dest_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
{
SPITupleTable *tuptable;
MemoryContext oldcxt, tuptabcxt;
if (_SPI_current == NULL)
elog(ERROR, "spi_dest_startup called while not connected to SPI");
if (_SPI_current->tuptable != NULL)
elog(ERROR, "improper call to spi_dest_startup");
oldcxt = _SPI_procmem(); /* switch to procedure memory context */
tuptabcxt = AllocSetContextCreate(CurrentMemoryContext, "SPI TupTable",
ALLOCSET_DEFAULT_SIZES);
MemoryContextSwitchTo(tuptabcxt);
_SPI_current->tuptable = tuptable = (SPITupleTable *)
palloc0(sizeof(SPITupleTable));
tuptable->tuptabcxt = tuptabcxt;
tuptable->subid = GetCurrentSubTransactionId();
/* put it on the SPI context's tuptables list (so abort can free it) */
slist_push_head(&_SPI_current->tuptables, &tuptable->next);
tuptable->alloced = 128;
tuptable->vals = (HeapTuple *) palloc(tuptable->alloced * sizeof(HeapTuple));
tuptable->numvals = 0;
tuptable->tupdesc = CreateTupleDescCopy(typeinfo);
MemoryContextSwitchTo(oldcxt);
}

spi_printtup is called once per result row; it grows vals[] geometrically and copies the slot’s tuple into the tuptable’s context:

// spi_printtup — src/backend/executor/spi.c
bool
spi_printtup(TupleTableSlot *slot, DestReceiver *self)
{
SPITupleTable *tuptable = _SPI_current->tuptable;
MemoryContext oldcxt;
if (tuptable == NULL)
elog(ERROR, "improper call to spi_printtup");
oldcxt = MemoryContextSwitchTo(tuptable->tuptabcxt);
if (tuptable->numvals >= tuptable->alloced)
{
uint64 newalloced = tuptable->alloced * 2; /* double the array */
tuptable->vals = (HeapTuple *)
repalloc_huge(tuptable->vals, newalloced * sizeof(HeapTuple));
tuptable->alloced = newalloced;
}
tuptable->vals[tuptable->numvals] = ExecCopySlotHeapTuple(slot);
(tuptable->numvals)++;
MemoryContextSwitchTo(oldcxt);
return true;
}

Because every tuple is copied into tuptabcxt, the result table is self-contained: it survives executor teardown and is freed only when its context is deleted (by SPI_freetuptable, SPI_finish, or abort).

SPI_freetuptable lets a caller reclaim one result table early. It guards against double-free by requiring the table to be on the current frame’s tuptables list, then deletes its context:

// SPI_freetuptable — src/backend/executor/spi.c
void
SPI_freetuptable(SPITupleTable *tuptable)
{
bool found = false;
if (tuptable == NULL)
return;
if (_SPI_current != NULL)
{
slist_mutable_iter siter;
slist_foreach_modify(siter, &_SPI_current->tuptables)
{
SPITupleTable *tt = slist_container(SPITupleTable, next, siter.cur);
if (tt == tuptable)
{
slist_delete_current(&siter);
found = true;
break;
}
}
}
if (!found)
{
elog(WARNING, "attempt to delete invalid SPITupleTable %p", tuptable);
return;
}
/* reset globals that might point at it, then free its memory */
if (tuptable == _SPI_current->tuptable)
_SPI_current->tuptable = NULL;
if (tuptable == SPI_tuptable)
SPI_tuptable = NULL;
MemoryContextDelete(tuptable->tuptabcxt);
}

Cursors give a PL incremental fetch by handing the plan to the Portal machinery (postgres-portals-prepared.md) instead of materializing everything. SPI_cursor_open_internal creates a portal, copies the plan in, sets up the snapshot exactly as the executor would, and calls PortalStart:

// SPI_cursor_open_internal — src/backend/executor/spi.c
Portal
SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
ParamListInfo paramLI, bool read_only)
{
CachedPlanSource *plansource;
CachedPlan *cplan;
List *stmt_list;
Portal portal;
Snapshot snapshot;
if (!SPI_is_cursor_plan(plan))
ereport(ERROR,
(errcode(ERRCODE_INVALID_CURSOR_DEFINITION),
errmsg("cannot open non-SELECT as cursor")));
plansource = (CachedPlanSource *) linitial(plan->plancache_list);
if (_SPI_begin_call(true) < 0)
elog(ERROR, "SPI_cursor_open called while not connected");
SPI_processed = 0;
SPI_tuptable = NULL;
/* create the portal (named or anonymous) */
if (name == NULL || name[0] == '\0')
portal = CreateNewPortal();
else
portal = CreatePortal(name, false, false);
/* revalidate plan, take a refcount for the portal */
cplan = GetCachedPlan(plansource, paramLI, NULL, _SPI_current->queryEnv);
stmt_list = cplan->stmt_list;
/* ... copy unsaved plans into portalContext; PortalDefineQuery(...) ... */
/* Set up the snapshot exactly like the executor would. */
if (read_only)
snapshot = GetActiveSnapshot();
else
{
CommandCounterIncrement();
snapshot = GetTransactionSnapshot();
}
PortalStart(portal, paramLI, 0, snapshot);
Assert(portal->strategy != PORTAL_MULTI_QUERY);
_SPI_end_call(true);
return portal;
}

Fetching then goes through _SPI_cursor_operation, which contains the famous re-fetch hazard: running the portal may run PL code that itself uses SPI, moving _SPI_stack via repalloc, so _SPI_current must be re-read after PortalRunFetch:

// _SPI_cursor_operation — src/backend/executor/spi.c
static void
_SPI_cursor_operation(Portal portal, FetchDirection direction, long count,
DestReceiver *dest)
{
uint64 nfetched;
if (!PortalIsValid(portal))
elog(ERROR, "invalid portal in SPI cursor operation");
if (_SPI_begin_call(true) < 0)
elog(ERROR, "SPI cursor operation called while not connected");
SPI_processed = 0;
SPI_tuptable = NULL;
_SPI_current->processed = 0;
_SPI_current->tuptable = NULL;
nfetched = PortalRunFetch(portal, direction, count, dest);
/*
* ... the portal may move _SPI_stack around; re-fetch _SPI_current
* after the call before storing into it.
*/
_SPI_current->processed = nfetched;
if (dest->mydest == DestSPI && _SPI_checktuples())
elog(ERROR, "consistency check on SPI tuple count failed");
SPI_processed = _SPI_current->processed;
SPI_tuptable = _SPI_current->tuptable;
_SPI_current->tuptable = NULL; /* now caller's responsibility */
_SPI_end_call(true);
}

12. Transaction-boundary cleanup: AtEOXact_SPI

Section titled “12. Transaction-boundary cleanup: AtEOXact_SPI”

At transaction end, any SPI frames left dangling (a PL that errored without calling SPI_finish) are popped here. The loop stops at an internal_xact frame — the frame belonging to the SPI_commit/SPI_rollback caller, which manages its own lifetime:

// AtEOXact_SPI — src/backend/executor/spi.c
void
AtEOXact_SPI(bool isCommit)
{
bool found = false;
while (_SPI_connected >= 0)
{
_SPI_connection *connection = &(_SPI_stack[_SPI_connected]);
if (connection->internal_xact)
break; /* belongs to SPI_commit/SPI_rollback caller */
found = true;
/* contexts go away with their parent; just restore globals + pop */
SPI_processed = connection->outer_processed;
SPI_tuptable = connection->outer_tuptable;
SPI_result = connection->outer_result;
_SPI_connected--;
_SPI_current = (_SPI_connected < 0) ? NULL
: &(_SPI_stack[_SPI_connected]);
}
if (found && isCommit)
ereport(WARNING,
(errcode(ERRCODE_WARNING),
errmsg("transaction left non-empty SPI stack"),
errhint("Check for missing \"SPI_finish\" calls.")));
}

A warning on commit (but silent on abort) encodes the contract: leaving a frame open across a successful transaction is a bug in the PL/extension; leaving one open across an error is expected (the error short-circuited the SPI_finish).

The companion AtEOSubXact_SPI does the same at subtransaction boundaries but explicitly deletes each frame’s exec/proc contexts (a subxact abort cannot rely on the parent context disappearing) and uses the subid stamped on each tuptable to free only those created in the aborting subtransaction.

Position hints (as of 2026-06-05, REL_18 273fe94)

Section titled “Position hints (as of 2026-06-05, REL_18 273fe94)”
SymbolFileLine
SPI_connectsrc/backend/executor/spi.c94
SPI_connect_extsrc/backend/executor/spi.c100
SPI_finishsrc/backend/executor/spi.c182
AtEOXact_SPIsrc/backend/executor/spi.c428
AtEOSubXact_SPIsrc/backend/executor/spi.c482
SPI_executesrc/backend/executor/spi.c596
SPI_execute_extendedsrc/backend/executor/spi.c637
SPI_execute_plansrc/backend/executor/spi.c672
SPI_preparesrc/backend/executor/spi.c860
SPI_prepare_cursorsrc/backend/executor/spi.c866
SPI_keepplansrc/backend/executor/spi.c976
SPI_saveplansrc/backend/executor/spi.c1003
SPI_copytuplesrc/backend/executor/spi.c1047
SPI_getvaluesrc/backend/executor/spi.c1220
SPI_freetuptablesrc/backend/executor/spi.c1386
SPI_cursor_opensrc/backend/executor/spi.c1445
SPI_cursor_open_internalsrc/backend/executor/spi.c1577
SPI_cursor_fetchsrc/backend/executor/spi.c1806
spi_dest_startupsrc/backend/executor/spi.c2123
spi_printtupsrc/backend/executor/spi.c2171
_SPI_prepare_plansrc/backend/executor/spi.c2221
_SPI_execute_plansrc/backend/executor/spi.c2399
_SPI_cursor_operationsrc/backend/executor/spi.c3007
_SPI_begin_callsrc/backend/executor/spi.c3077
_SPI_end_callsrc/backend/executor/spi.c3101
_SPI_make_plan_non_tempsrc/backend/executor/spi.c3141
_SPI_save_plansrc/backend/executor/spi.c3209
SPI_register_relationsrc/backend/executor/spi.c3297
_SPI_connection (struct)src/include/executor/spi_priv.h22
_SPI_plan (struct)src/include/executor/spi_priv.h90

Claims in this document were checked against the REL_18 tree at /data/hgryoo/references/postgres (commit 273fe94, captured 2026-06-05) as follows:

  • Connection stack is a growable global array. Confirmed: _SPI_stack, _SPI_stack_depth, _SPI_connected, _SPI_current are file-scope statics in spi.c; SPI_connect_ext allocates 16 frames initially and doubles via repalloc. The “stack can move” hazard is documented in _SPI_cursor_operation’s comment (“_SPI_stack is likely to move around … re-fetch the pointer”).
  • Two contexts per frame, parent chosen by atomicity. Confirmed in SPI_connect_ext: procCxt/execCxt are created with parent TopTransactionContext (atomic) or PortalContext/procCxt (non-atomic). The comment block spelling out the rationale is present verbatim.
  • Globals snapshotted at the boundary. Confirmed: outer_processed, outer_tuptable, outer_result fields in _SPI_connection (spi_priv.h), saved in SPI_connect_ext and restored in SPI_finish, AtEOXact_SPI, and AtEOSubXact_SPI.
  • One-shot path uses InvalidSnapshot. Confirmed: SPI_execute and SPI_execute_extended both call _SPI_prepare_oneshot_plan then _SPI_execute_plan(..., InvalidSnapshot, InvalidSnapshot, true).
  • Four snapshot regimes. Confirmed by the header comment block inside _SPI_execute_plan enumerating the (snapshot supplied?) × (read_only?) matrix, and by the code branches: PushActiveSnapshot/PushCopiedSnapshot for supplied snapshots, EnsurePortalSnapshotExists + PushActiveSnapshot(GetTransactionSnapshot()) for the managed read-write case.
  • Volatility enforcement. Confirmed: _SPI_execute_plan raises ERRCODE_FEATURE_NOT_SUPPORTED (“%s is not allowed in a non-volatile function”) when options->read_only && !CommandIsReadOnly(stmt).
  • SPITupleTable geometric growth + per-tuple copy. Confirmed: spi_dest_startup sets alloced = 128; spi_printtup doubles via repalloc_huge and copies with ExecCopySlotHeapTuple.
  • SPI_keepplan reparents to CacheMemoryContext. Confirmed: MemoryContextSetParent(plan->plancxt, CacheMemoryContext) + SaveCachedPlan loop, with the “cannot fail partway through” comment.
  • Transaction control restricted to non-atomic, non-subxact. Confirmed: _SPI_commit raises ERRCODE_INVALID_TRANSACTION_TERMINATION when _SPI_current->atomic or IsSubTransaction().

Line numbers in the position-hint table were read directly from the file on 2026-06-05; they are approximate anchors and may drift with future commits — the symbol names are the stable references.

Scope caveat: this document covers src/backend/executor/spi.c and its private header. It does not assert anything about PL/pgSQL’s internal state machine (pl_exec.c), the executor’s plan-node machinery, or portal/prepared-statement lifecycle beyond the SPI touch-points; those are deferred to postgres-plpgsql.md, postgres-executor.md, and postgres-portals-prepared.md respectively. contrib/ modules are out of scope.


Beyond PostgreSQL — Comparative Designs & Research Frontiers

Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”

SPI vs. the wire protocol — two front doors to one engine

Section titled “SPI vs. the wire protocol — two front doors to one engine”

The cleanest way to situate SPI is to contrast it with PostgreSQL’s other query-submission path, the extended-query protocol (postgres-wire-protocol.md, postgres-portals-prepared.md). The wire path runs exec_parse_messageexec_bind_messageexec_execute_message and is driven by an external client; SPI runs SPI_prepareSPI_execute_plan and is driven by in-backend C code. Both ultimately call into the same plancache and the same executor — SPI is best understood as a thin in-process facade that reuses the heavy machinery without the socket. The notable asymmetry is snapshot ownership: the wire path always starts its own command and snapshot, whereas SPI usually inherits the caller’s transaction and (for read-only callers) the caller’s snapshot. This inheritance is exactly what lets a STABLE function see a consistent view of the database for the lifetime of the outer query.

Comparison with other engines’ server-side SQL APIs

Section titled “Comparison with other engines’ server-side SQL APIs”
  • Oracle (OCI / PL/SQL’s EXECUTE IMMEDIATE). Oracle’s PL/SQL runs in the same process as the SQL engine and shares cursors through a shared-cursor cache keyed by SQL text; PostgreSQL’s per-statement CachedPlanSource + SPI_keepplan is the analog, but PostgreSQL’s plan cache is per-backend (no cross-session sharing of plans), trading memory for simplicity and isolation. Oracle’s bind-by-name versus PostgreSQL’s positional $1..$n/Values[] is a surface difference; the deeper one is that Oracle shares compiled plans across sessions in the shared pool, a design PostgreSQL deliberately avoids.
  • SQL Server (SQLCLR SqlContext/context connection). Managed code running inside SQL Server obtains a “context connection” that reuses the host’s transaction and session — conceptually identical to SPI inheriting the backend’s transaction. The SPI_OPT_NONATOMIC distinction (can this code commit?) mirrors SQL Server’s rules about which CLR contexts may control transactions.
  • SQLite (sqlite3_exec / virtual tables). SQLite has no separate-process boundary at all, so its “embedded SQL” is the only API; its prepared-statement object (sqlite3_stmt) is the moral equivalent of an SPIPlanPtr, and its sqlite3_step cursor maps to SPI_cursor_fetch. The absence of a connection stack reflects SQLite’s single-writer model versus PostgreSQL’s reentrant PL-calls-PL nesting.

Why the connection-stack-with-globals design persists

Section titled “Why the connection-stack-with-globals design persists”

A recurring critique of SPI is its reliance on process globals (SPI_tuptable et al.), which forces the save/restore dance and makes the API non-thread-safe by construction. The justification is ergonomic: PL handler authors write SPI_execute("..."); n = SPI_processed; without threading a handle through every call, and PostgreSQL’s one-backend-per-connection process model means there is never more than one SPI user per address space at a time anyway. The stack-of-frames pattern is the minimal machinery that makes those globals safe under nesting — a pragmatic choice that has survived essentially unchanged for two decades precisely because the process model removes the thread-safety pressure that would otherwise condemn it.

  • Compiling away the SPI boundary. Projects on SQL-to-native compilation (HyPer/Umbra-style data-centric code generation, and PostgreSQL’s own JIT for expression evaluation, postgres-expression-eval.md) raise the question of whether the parse→plan→SPITupleTable→copy pipeline could be short-circuited for hot PL loops by fusing the PL body and the embedded query into one compiled unit, eliminating the per-row ExecCopySlotHeapTuple. No production PostgreSQL does this, but it is the obvious next efficiency frontier for PL-heavy workloads.
  • Plan-cache sharing. Whether per-backend plan caches should become shared-memory-resident (as in Oracle’s shared pool) is a perennial design debate; the SPI layer would be a principal beneficiary, since every PL statement is a SPI_keepplan consumer. The counter-argument — invalidation complexity and cross-session contention — is why PostgreSQL has not moved.
  • Non-atomic execution and stored procedures. The SPI_OPT_NONATOMIC path, added to support transaction-controlling procedures (CALL), is the youngest and least-settled part of SPI; the interaction of mid-procedure COMMIT with pinned portals, held snapshots (HoldPinnedPortals, ForgetPortalSnapshots), and parallel workers remains an area of ongoing refinement.

Primary source (REL_18, commit 273fe94, captured 2026-06-05):

  • src/backend/executor/spi.c — the entire SPI implementation: connection stack, prepare/execute/cursor paths, SPITupleTable receiver, snapshot and transaction management.
  • src/include/executor/spi.h — public API surface, SPITupleTable, SPIExecuteOptions/SPIPrepareOptions, return-code constants.
  • src/include/executor/spi_priv.h_SPI_connection and _SPI_plan (SPIPlanPtr) private structs, _SPI_PLAN_MAGIC.
  • src/backend/utils/cache/plancache.cGetCachedPlan, SaveCachedPlan, CachedPlanSetParentContext; SPI is a primary client (see postgres-portals-prepared.md).
  • src/backend/tcop/pquery.cPortalStart, PortalRunFetch used by the SPI cursor path.

Theory and cross-references in this knowledge base:

  • knowledge/research/dbms-general/database-system-concepts.md — application development / embedded SQL / stored procedures framing.
  • knowledge/research/dbms-general/database-internals.md — execution-context and buffer/storage interface background.
  • knowledge/code-analysis/postgres/postgres-portals-prepared.md — portal and prepared-statement lifecycle (SPI cursors hand off here).
  • knowledge/code-analysis/postgres/postgres-executor.md — the executor proper that SPI drives via CreateQueryDesc/_SPI_pquery.
  • knowledge/code-analysis/postgres/postgres-plpgsql.md — the principal SPI client; PL/pgSQL’s own state machine is deferred there.
  • knowledge/code-analysis/postgres/postgres-mvcc-snapshots.md — the snapshot semantics SPI’s four regimes manipulate.
  • knowledge/code-analysis/postgres/postgres-memory-contexts.md — the procCxt/execCxt/tuptabcxt lifetime model rests on this.
  • knowledge/code-analysis/postgres/postgres-extensions.md — how extensions link against and call the SPI API.