Skip to content

PostgreSQL Portals, Prepared Statements, and the Plan Cache

Contents:

A SQL statement does not go straight from text to tuples. Database System Concepts (Silberschatz, 7e, ch. 15 “Query Processing”, §15.1) frames query processing as a three-stage pipeline: parsing and translation turn the SQL text into a relational-algebra parse tree; optimization turns that tree into an annotated evaluation plan — “a sequence of primitive operations that can be used to evaluate a query”; and evaluation runs the plan against the stored data to produce the answer. The textbook is explicit that the middle stage is expensive: the optimizer “examines several plans, estimates the cost of each, and chooses the cheapest” (§15.1, §16.5), and that cost estimation and plan enumeration dominate the compile-time budget for all but the most trivial queries.

That cost asymmetry is the entire reason the structures in this document exist. When the same query shape is run many times — the canonical case being an application that issues SELECT ... WHERE id = $1 for thousands of different id values — paying the parse-and-plan cost on every execution is pure waste. The textbook calls the response stored / parameterized queries: compile the query once into a plan with placeholders for the parts that vary (the constants), then reuse the compiled plan for each new set of placeholder values. The ANSI/ISO SQL standard exposes this as the PREPARE / EXECUTE statement pair, and every client protocol worth using (JDBC PreparedStatement, libpq’s extended-query protocol, ODBC) layers a prepared-statement API on top of it.

Two distinct concepts are folded into “prepared statement”, and keeping them separate is the key to reading the PostgreSQL source:

  1. The compiled query — the parse-analyzed, rewritten query tree plus the metadata needed to plan it (parameter types, the result row’s column types). This is created once by PREPARE and is logically immutable except when the schema it references changes.
  2. The execution plan — the physical operator tree the executor actually runs. There is a genuine choice here that the textbook glosses over: the plan can be built once, parameter-independently and reused (cheap to reuse, but the optimizer must commit to operator choices without knowing the actual constant values), or built fresh for each execution with the real constants substituted (expensive to build, but the optimizer can use the constants for selectivity estimation and index selection).

The second choice is a real cost/benefit tradeoff, not a detail. Consider WHERE status = $1 on a column where 99% of rows have status = 'active'. A plan built without knowing $1 must assume an average selectivity and might pick a sequential scan; a plan built knowing $1 = 'closed' (the rare 1%) can use an index. But if the application always passes common values, the parameter-independent plan is just as good and avoids re-optimization. No static rule wins universally — which is why PostgreSQL’s plan cache makes the decision adaptively, at run time, based on observed costs. That adaptive generic-vs-custom decision is the intellectual centerpiece of this document.

The third structure — the portal — has no direct textbook name because it is an engineering abstraction rather than a relational-algebra one. It is the runtime container that holds the execution state of one running command: which plan, what parameters, where the cursor currently sits, what the result columns look like. The textbook’s iterator model (§15.7.2.1, covered in depth by postgres-executor.md) assumes “the system makes repeated requests for tuples from the operation at the top of the pipeline” — but something has to own the iterator between requests, remember that this is the 50th FETCH on a cursor, and hold the snapshot pinned across all those requests. That owner is the portal. SQL cursors (DECLARE ... CURSOR, FETCH, MOVE) are the user-visible face of portals; the wire protocol’s Bind/Execute messages are the other face.

The textbook gives the why — compile once, reuse, and choose plans by cost. This section names the engineering conventions that production engines adopt to realize prepared statements, plan caching, and an execution container. PostgreSQL’s specific choices in ## PostgreSQL's Approach are one point in this shared design space.

Separate the compiled query from the execution plan

Section titled “Separate the compiled query from the execution plan”

A naive plan cache stores “SQL text → plan”. That is wrong for two reasons. First, the compiled query (parse tree + parameter types) is invalidated by a different set of events than the plan: a SET search_path or a role change can invalidate planning decisions without touching the parse analysis, while DROP INDEX invalidates the plan but not the parse tree. Second, one compiled query can spawn several plans over its life — a generic one and a sequence of custom ones — so the plan cannot be a field of the query. The convention is two reference-counted objects: a long-lived compiled query object that owns the parse tree and dependency lists, and shorter-lived plan objects that hang off it and can be discarded and rebuilt independently.

Generic vs. custom plans, decided adaptively

Section titled “Generic vs. custom plans, decided adaptively”

Given parameters, an engine can plan generically (treat $1 as an unknown of its declared type, plan once, reuse forever) or custom (substitute the literal values, replan each time). The mature convention is not to pick one statically but to measure: run custom plans for a warm-up period, record their average cost including the planning cost, and switch to the generic plan only once it is demonstrably no more expensive than re-optimizing. Folding the planning cost into the custom-plan side is the subtle part — it is what lets a generic plan win even when its raw execution cost is slightly higher, because the generic plan never pays to replan.

A cached plan references tables, indexes, functions, and operators by OID. DDL on any of them — DROP INDEX, ALTER TABLE, replacing a function — can make the plan wrong or illegal. The convention is dependency tracking plus invalidation messages: when a plan is built, record the OIDs it depends on; when DDL commits, broadcast a shared-invalidation (sinval) message; a callback keyed on those OIDs flips an is_valid flag on every affected cached object. Invalidation is lazy — the flag is just set; the actual re-analysis and replan happen on the next use of the object, not at invalidation time, so a burst of DDL does not trigger a storm of replanning for queries no one is currently running.

For one-row-at-a-time consumption — cursors, the protocol’s Execute with a row limit — the engine cannot run the plan to completion and buffer everything; it must be able to suspend the iterator mid-stream and resume on the next request. This demands a container that survives between requests and holds: the live executor state, the registered snapshot (so MVCC sees a consistent set of rows from first FETCH to last), the cursor position, and the result descriptor. Update-type commands (INSERT/UPDATE/DELETE) generally cannot be suspended safely — AFTER triggers and the rule rewriter assume the statement runs atomically — so the container classifies its contents and only allows suspension for plain SELECTs.

Snapshot and resource lifetime tied to the container

Section titled “Snapshot and resource lifetime tied to the container”

The container owns a snapshot and a set of locks/buffers for as long as it is runnable. The convention is to attach these to a resource owner scoped to the container, and to register the snapshot so vacuum cannot reclaim row versions the cursor can still see. Tearing the container down releases all of it in one shot — there is no retail bookkeeping of individual locks.

By the time you reach a named symbol in ## Source Walkthrough, you should already know what kind of thing it is:

Theory / conventionPostgreSQL name
Compiled query (parse tree + param types)CachedPlanSource (header calls it “CachedQuery”)
Execution plan object, refcountedCachedPlan (a list of PlannedStmt)
Generic plan slot on the compiled queryCachedPlanSource.gplan
Generic-vs-custom decisionchoose_custom_plan (called from GetCachedPlan)
Custom-plan warm-up countnum_custom_plans < 5
Planning-cost charge folded into custom costcached_plan_cost(plan, include_planner = true)
Cached generic plan costCachedPlanSource.generic_cost
Lazy invalidation flagis_valid on CachedPlanSource / CachedPlan
Re-analyze/replan on next useRevalidateCachedQuery + CheckCachedPlan
sinval-driven invalidation callbackPlanCacheRelCallback / PlanCacheObjectCallback
Execution containerPortal (PortalData)
Container classificationChoosePortalStrategyPortalStrategy
”Build the machine” for a portalPortalStart
”Pull tuples / run” the portalPortalRun / PortalRunSelect / PortalRunFetch
Tear down the containerPortalDrop
Prepared-statement registryprepared_queries hash table in prepare.c
PREPARE / EXECUTE utility commandsPrepareQuery / ExecuteQuery

The executor itself — ExecutorStart/Run/Finish/End, the PlanState tree, the TupleTableSlot — is owned by postgres-executor.md; this document stops at ExecutorStart/ExecutorRun calls and does not re-describe what happens inside them. How a backend reads a wire message and decides to call PortalStart/PortalRun is owned by postgres-wire-protocol.md and postgres-backend-lifecycle.md. This document covers the container, the prepared-statement registry, and the plan cache’s generic-vs-custom logic.

PostgreSQL renders the conventions above as three cooperating subsystems, each with its own source file:

  1. plancache.c owns the compiled query (CachedPlanSource), the plan objects (CachedPlan), the generic-vs-custom decision, and invalidation.
  2. prepare.c owns the SQL-level PREPARE/EXECUTE/DEALLOCATE commands and the per-backend hash table of named prepared statements. It is a thin client of plancache.c.
  3. pquery.c (plus portalmem.c for allocation) owns the portal — the execution container — its five strategies, and the run loop.

PortalData is the struct. The fields that matter for execution are the plan (stmts — a list of PlannedStmt — and the cplan it came from), the parameters (portalParams), the result descriptor (tupDesc), the cursor position (atStart / atEnd / portalPos), and the resource owner and snapshot:

// PortalData — src/include/utils/portal.h (condensed)
typedef struct PortalData
{
const char *name; /* portal's name */
MemoryContext portalContext; /* subsidiary memory for portal */
ResourceOwner resowner; /* resources owned by portal */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
ParamListInfo portalParams; /* params to pass to query */
PortalStrategy strategy; /* PORTAL_ONE_SELECT, ... */
PortalStatus status; /* PORTAL_NEW ... PORTAL_DONE/FAILED */
QueryDesc *queryDesc; /* executor invocation state, or NULL */
TupleDesc tupDesc; /* descriptor for result tuples */
Snapshot portalSnapshot; /* active snapshot, or NULL */
Tuplestorestate *holdStore; /* store for held / non-suspendable results */
bool atStart, atEnd; /* cursor position flags */
uint64 portalPos; /* rows fetched so far */
} PortalData;

A portal moves through a small status machine. CreatePortal makes a PORTAL_NEW one; PortalDefineQuery attaches the plan list and moves it to PORTAL_DEFINED; PortalStart classifies and prepares it, reaching PORTAL_READY; PortalRun marks it PORTAL_ACTIVE while running and then either PORTAL_READY (suspended, can resume) or PORTAL_DONE; an error sends it to PORTAL_FAILED. PortalDrop frees it.

flowchart TB
  NEW["PORTAL_NEW<br/>CreatePortal"]
  DEF["PORTAL_DEFINED<br/>PortalDefineQuery attaches stmts + cplan"]
  RDY["PORTAL_READY<br/>PortalStart: ChoosePortalStrategy,<br/>snapshot, ExecutorStart (ONE_SELECT)"]
  ACT["PORTAL_ACTIVE<br/>PortalRun / PortalRunSelect"]
  DONE["PORTAL_DONE<br/>run to completion / MULTI_QUERY"]
  FAIL["PORTAL_FAILED<br/>uncaught error"]
  NEW --> DEF --> RDY --> ACT
  ACT -->|"forward fetch, more rows"| RDY
  ACT -->|"atEnd or MULTI_QUERY"| DONE
  ACT -->|"error"| FAIL
  DONE --> DROP["PortalDrop:<br/>cleanup hook (ExecutorEnd),<br/>drop cplan refcount, free context"]
  FAIL --> DROP

Figure 1 — Portal status machine. The ACTIVE→READY back-edge is what makes cursor FETCH n and protocol Execute with a row limit possible: a PORTAL_ONE_SELECT portal suspends after returning n rows and can be resumed.

PortalStart calls ChoosePortalStrategy on the plan list to pick one of five PortalStrategy values. The strategy decides whether the portal can suspend (only PORTAL_ONE_SELECT runs the executor incrementally) and where results live (directly streamed, or buffered in a holdStore tuplestore). The classification is purely structural — it inspects command type, canSetTag, modifying-CTE and RETURNING flags:

// ChoosePortalStrategy — src/backend/tcop/pquery.c (condensed)
if (list_length(stmts) == 1)
{
/* ... for a single PlannedStmt that canSetTag ... */
if (pstmt->commandType == CMD_SELECT)
{
if (pstmt->hasModifyingCTE)
return PORTAL_ONE_MOD_WITH; /* SELECT with data-modifying CTE */
else
return PORTAL_ONE_SELECT; /* the suspendable case */
}
if (pstmt->commandType == CMD_UTILITY)
{
if (UtilityReturnsTuples(pstmt->utilityStmt))
return PORTAL_UTIL_SELECT; /* EXPLAIN, SHOW, ... */
return PORTAL_MULTI_QUERY;
}
}
/* exactly one canSetTag query with a RETURNING list: */
if (nSetTag == 1)
return PORTAL_ONE_RETURNING;
/* everything else (multi-statement, plain DML, ...) */
return PORTAL_MULTI_QUERY;

Only PORTAL_ONE_SELECT is read incrementally. The others run to completion on first call — PORTAL_ONE_RETURNING, PORTAL_ONE_MOD_WITH, and PORTAL_UTIL_SELECT dump their output into the portal’s holdStore tuplestore (because partial execution of a data-modifying or RETURNING query would break AFTER triggers), and PORTAL_MULTI_QUERY simply executes every statement. The portal header (portal.h) states the rule directly: suspension “is allowed only for portals that contain a single SELECT-type query. We do not want to let the client suspend an update-type query partway through!”

For a parameterized prepared statement, the interesting work happens inside GetCachedPlanchoose_custom_plan. This is PostgreSQL’s adaptive answer to the textbook tradeoff: it builds custom plans for the first five executions to gather cost data, then keeps using the generic plan only if its cost is no worse than the average custom plan (where “custom cost” deliberately includes a charge for the planning effort itself). The next section walks the code; the shape of the decision is:

flowchart TB
  GET["GetCachedPlan(plansource, boundParams)"]
  REV["RevalidateCachedQuery:<br/>re-analyze/replan if !is_valid;<br/>acquire planner locks"]
  CC["choose_custom_plan?"]
  GET --> REV --> CC
  CC -->|"no params, or FORCE_GENERIC"| GEN
  CC -->|"is_oneshot, or num_custom_plans &lt; 5,<br/>or FORCE_CUSTOM"| CUST
  CC -->|"num_custom_plans &gt;= 5:<br/>generic_cost &lt; avg custom cost?"| DECIDE
  DECIDE -->|"yes: generic cheaper"| GEN["generic plan:<br/>CheckCachedPlan reuses gplan,<br/>else BuildCachedPlan(NULL params)"]
  DECIDE -->|"no"| CUST["custom plan:<br/>BuildCachedPlan(boundParams),<br/>accumulate total_custom_cost"]
  GEN --> RET["bump refcount, return CachedPlan"]
  CUST --> RET

Figure 2 — The generic-vs-custom decision inside GetCachedPlan. Note the warm-up gate (num_custom_plans < 5) and that even after choosing generic, GetCachedPlan re-checks the decision once generic_cost becomes known, so a freshly-built generic plan that turns out expensive is discarded in favor of a custom plan before execution.

This section follows the lifetime of a parameterized prepared statement end to end: PREPARE builds the compiled query, EXECUTE binds parameters and asks the plan cache for a plan, the plan cache decides generic-vs-custom, a portal runs the chosen plan, and PortalDrop plus invalidation clean up. Symbols are the canonical anchors; the position-hint table at the end pairs each with a (file, line) as of the document’s updated: date.

PREPARE: building the CachedPlanSource (prepare.c + plancache.c)

Section titled “PREPARE: building the CachedPlanSource (prepare.c + plancache.c)”

PrepareQuery is the PREPARE name (types) AS query handler. The ordering is deliberate and called out in a comment: the CachedPlanSource is created from the raw (un-analyzed) parse tree first, because the plancache wants to own the unmodified tree, and only then is parse analysis run with the declared parameter types:

// PrepareQuery — src/backend/commands/prepare.c (condensed)
plansource = CreateCachedPlan(rawstmt, pstate->p_sourcetext,
CreateCommandTag(stmt->query));
/* ... build argtypes[] from the declared TypeNames ... */
query_list = pg_analyze_and_rewrite_varparams(rawstmt, pstate->p_sourcetext,
&argtypes, &nargs, NULL);
CompleteCachedPlan(plansource, query_list, NULL,
argtypes, nargs, NULL, NULL,
CURSOR_OPT_PARALLEL_OK, /* allow parallel mode */
true); /* fixed_result */
StorePreparedStatement(stmt->name, plansource, true);

CreateCachedPlan allocates a private memory context (named after the query text, which is why pg_backend_memory_contexts shows query strings) and a zeroed CachedPlanSource with the all-important counters initialized: generic_cost = -1 (“not yet known”), num_custom_plans = 0, total_custom_cost = 0:

// CreateCachedPlan — src/backend/utils/cache/plancache.c (condensed)
source_context = AllocSetContextCreate(CurrentMemoryContext,
"CachedPlanSource",
ALLOCSET_START_SMALL_SIZES);
plansource = (CachedPlanSource *) palloc0(sizeof(CachedPlanSource));
plansource->magic = CACHEDPLANSOURCE_MAGIC;
plansource->raw_parse_tree = copyObject(raw_parse_tree);
plansource->query_string = pstrdup(query_string);
plansource->generic_cost = -1; /* generic plan cost not yet known */
plansource->total_custom_cost = 0;
plansource->num_generic_plans = 0;
plansource->num_custom_plans = 0;

CompleteCachedPlan then stores the analyzed-and-rewritten query_list, the parameter types, and the result tupdesc, and — for statements that can be invalidated — runs extract_query_dependencies to record the OIDs the query depends on, saving them into relationOids / invalItems for the invalidation machinery. It sets is_complete = is_valid = true. Finally StorePreparedStatement inserts a PreparedStatement entry into the per-backend prepared_queries hash table and calls SaveCachedPlan to move the source into long-lived CacheMemoryContext so it survives the transaction:

// StorePreparedStatement — src/backend/commands/prepare.c (condensed)
entry = (PreparedStatement *) hash_search(prepared_queries, stmt_name,
HASH_ENTER, &found);
if (found)
ereport(ERROR, (errcode(ERRCODE_DUPLICATE_PSTATEMENT),
errmsg("prepared statement \"%s\" already exists", stmt_name)));
entry->plansource = plansource;
entry->from_sql = from_sql;
entry->prepare_time = cur_ts;
SaveCachedPlan(plansource); /* now safe to move to permanent memory */

The protocol-level Parse message (exec_parse_message in postgres.c, owned by postgres-wire-protocol.md) reaches the same CachedPlanSource state by a slightly different path, but the resulting object is identical.

EXECUTE: binding parameters and getting a plan (prepare.c)

Section titled “EXECUTE: binding parameters and getting a plan (prepare.c)”

ExecuteQuery handles EXECUTE name (params). It looks up the prepared statement, evaluates the parameter expressions into a ParamListInfo, creates a portal, and — critically — calls GetCachedPlan to obtain the plan before defining the portal’s query. The comment warning against throwing an error between GetCachedPlan and PortalDefineQuery is load-bearing: GetCachedPlan has already incremented the plan’s refcount, and only PortalDefineQuery ties that refcount to the portal’s lifetime, so an error in between leaks it:

// ExecuteQuery — src/backend/commands/prepare.c (condensed)
entry = FetchPreparedStatement(stmt->name, true);
if (entry->plansource->num_params > 0)
{
estate = CreateExecutorState();
estate->es_param_list_info = params;
paramLI = EvaluateParams(pstate, entry, stmt->params, estate);
}
portal = CreateNewPortal();
/* Replan if needed, and increment plan refcount for portal */
cplan = GetCachedPlan(entry->plansource, paramLI, NULL, NULL);
plan_list = cplan->stmt_list;
/* DO NOT add logic that could throw between GetCachedPlan and PortalDefineQuery */
PortalDefineQuery(portal, NULL, query_string,
entry->plansource->commandTag, plan_list, cplan);
/* ... */
PortalStart(portal, paramLI, eflags, GetActiveSnapshot());
(void) PortalRun(portal, count, false, dest, dest, qc);
PortalDrop(portal, false);

EvaluateParams checks the parameter count, parse-analyzes each parameter expression, and coerces it to the declared type — raising ERRCODE_DATATYPE_MISMATCH if a value cannot be coerced. The resulting ParamListInfo is what GetCachedPlan inspects to decide whether a custom plan is even worthwhile (no parameters ⇒ never custom).

GetCachedPlan: revalidate, then choose (plancache.c)

Section titled “GetCachedPlan: revalidate, then choose (plancache.c)”

GetCachedPlan is the hinge of the whole subsystem. It first calls RevalidateCachedQuery (re-analyze/replan the query tree if invalidation has fired, and acquire parse-time locks), then calls choose_custom_plan to make the generic-vs-custom decision:

// GetCachedPlan — src/backend/utils/cache/plancache.c (condensed)
qlist = RevalidateCachedQuery(plansource, queryEnv);
customplan = choose_custom_plan(plansource, boundParams);
if (!customplan)
{
if (CheckCachedPlan(plansource))
plan = plansource->gplan; /* reuse existing valid generic plan */
else
{
plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv); /* NULL = generic */
plansource->gplan = plan;
plan->refcount++;
/* Update generic_cost whenever we make a new generic plan */
plansource->generic_cost = cached_plan_cost(plan, false);
/* Re-check now that generic_cost is known */
customplan = choose_custom_plan(plansource, boundParams);
qlist = NIL;
}
}
if (customplan)
{
plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv); /* with params */
plansource->total_custom_cost += cached_plan_cost(plan, true); /* include planner */
plansource->num_custom_plans++;
}
else
plansource->num_generic_plans++;
plan->refcount++; /* flag the plan as in use by caller */

Two subtleties: (1) the double call to choose_custom_plan. The first call decides whether to even attempt generic; if it says generic but no generic plan exists yet, BuildCachedPlan(NULL) makes one, and then generic_cost is known for the first time, so choose_custom_plan is called again — if the freshly-measured generic plan is a loser, the function falls through to build a custom plan instead, discarding the generic one’s role for this execution. (2) cached_plan_cost(plan, true) for custom plans passes include_planner = true, so the custom side of the comparison is penalized by the planning cost; the generic side (cached_plan_cost(plan, false)) is not.

choose_custom_plan is the policy itself, and it is short enough to read whole:

// choose_custom_plan — src/backend/utils/cache/plancache.c (condensed)
if (plansource->is_oneshot)
return true; /* one-shot ⇒ always custom */
if (boundParams == NULL)
return false; /* no params ⇒ generic is just as good */
if (!StmtPlanRequiresRevalidation(plansource))
return false; /* planning would be a no-op */
if (plan_cache_mode == PLAN_CACHE_MODE_FORCE_GENERIC_PLAN)
return false;
if (plan_cache_mode == PLAN_CACHE_MODE_FORCE_CUSTOM_PLAN)
return true;
/* ... CURSOR_OPT_GENERIC_PLAN / CURSOR_OPT_CUSTOM_PLAN forcing ... */
/* Generate custom plans until we have done at least 5 (arbitrary) */
if (plansource->num_custom_plans < 5)
return true;
avg_custom_cost = plansource->total_custom_cost / plansource->num_custom_plans;
/* Prefer generic plan if it's less expensive than the average custom plan. */
if (plansource->generic_cost < avg_custom_cost)
return false;
return true;

The < 5 warm-up is the famous “five executions” rule: a parameterized statement is replanned for its first five EXECUTEs regardless, building up total_custom_cost, and only from the sixth does the cost comparison kick in. The plan_cache_mode GUC (auto / force_generic_plan / force_custom_plan) lets an operator override the heuristic outright when they know better than the cost model.

The planning-cost charge lives in cached_plan_cost. It sums the total_cost of every non-utility PlannedStmt, and when include_planner is set adds a crude per-relation planning penalty:

// cached_plan_cost — src/backend/utils/cache/plancache.c (condensed)
result += plannedstmt->planTree->total_cost;
if (include_planner)
{
/* crude estimate of planning effort: linear in rangetable size */
int nrelations = list_length(plannedstmt->rtable);
result += 1000.0 * cpu_operator_cost * (nrelations + 1);
}

The comment is refreshingly candid that this is a guess: “The current multiplier of 1000 * cpu_operator_cost is probably on the low side.” This single constant is the entire model of “how much does replanning cost relative to executing”, and it is why a query over many tables (large rangetable) leans toward the generic plan sooner.

Revalidation and invalidation (plancache.c)

Section titled “Revalidation and invalidation (plancache.c)”

RevalidateCachedQuery is the lazy-invalidation responder. If the query is still valid it merely acquires planner locks and rechecks (covering the race where an sinval message arrives between the validity check and the lock); if invalid, it discards the query tree and re-runs parse analysis and rewrite. The fast exit for statements that cannot be invalidated (e.g. transaction-control commands) avoids touching the catalog at all:

// RevalidateCachedQuery — src/backend/utils/cache/plancache.c (condensed)
if (plansource->is_oneshot || !StmtPlanRequiresRevalidation(plansource))
return NIL; /* nothing to revalidate */
if (plansource->is_valid)
{
AcquirePlannerLocks(plansource->query_list, true);
if (plansource->is_valid) /* still valid after locking? */
return NIL; /* yes: reuse query tree as-is */
AcquirePlannerLocks(plansource->query_list, false); /* race: undo locks */
}
/* invalid: discard and re-analyze below ... */
plansource->is_valid = false;
plansource->query_list = NIL;

CheckCachedPlan is the analogous validity gate for the plan (not the query): it acquires executor locks on the generic plan, re-checks is_valid, and also invalidates a transient plan if TransactionXmin has advanced past the plan’s saved_xmin (a plan built against a not-yet-committed catalog change must be rebuilt once that change is globally visible):

// CheckCachedPlan — src/backend/utils/cache/plancache.c (condensed)
if (plan->is_valid)
{
AcquireExecutorLocks(plan->stmt_list, true);
if (plan->is_valid &&
TransactionIdIsValid(plan->saved_xmin) &&
!TransactionIdEquals(plan->saved_xmin, TransactionXmin))
plan->is_valid = false; /* transient plan aged out */
if (plan->is_valid)
return true; /* good, locked, reusable */
AcquireExecutorLocks(plan->stmt_list, false); /* race: release */
}
ReleaseGenericPlan(plansource); /* drop the dead generic plan */
return false;

The actual invalidation is driven by shared-invalidation callbacks registered at InitPlanCache: PlanCacheRelCallback (relation OID changed) and PlanCacheObjectCallback (other catalog object changed) walk the saved plan list and flip is_valid = false on any CachedPlanSource / CachedPlan whose relationOids / invalItems match. This is the lazy convention from ## Common DBMS Design made concrete — the callbacks set flags only; RevalidateCachedQuery / CheckCachedPlan do the rebuild on next use. The sinval mechanism itself is owned by postgres-cache-invalidation.md.

GetCachedPlan has a low-overhead sibling for hot paths: CachedPlanAllowsSimpleValidityCheck / CachedPlanIsSimplyValid let a caller that already holds a plan reference re-verify it without re-acquiring locks — checking only is_valid, gplan identity, and search_path — used by SPI and the new-style SQL function executor.

PortalStart: classify, snapshot, ExecutorStart (pquery.c)

Section titled “PortalStart: classify, snapshot, ExecutorStart (pquery.c)”

Back in the portal. PortalStart runs ChoosePortalStrategy, then for the suspendable PORTAL_ONE_SELECT case pushes a snapshot, builds a QueryDesc, and calls ExecutorStart — but does not run the executor yet. It records the result tupdesc and resets the cursor position, leaving the portal PORTAL_READY:

// PortalStart — src/backend/tcop/pquery.c (condensed, PORTAL_ONE_SELECT arm)
portal->strategy = ChoosePortalStrategy(portal->stmts);
switch (portal->strategy)
{
case PORTAL_ONE_SELECT:
if (snapshot)
PushActiveSnapshot(snapshot);
else
PushActiveSnapshot(GetTransactionSnapshot());
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
portal->sourceText, GetActiveSnapshot(),
InvalidSnapshot, None_Receiver,
params, portal->queryEnv, 0);
if (portal->cursorOptions & CURSOR_OPT_SCROLL)
myeflags = eflags | EXEC_FLAG_REWIND | EXEC_FLAG_BACKWARD;
else
myeflags = eflags;
ExecutorStart(queryDesc, myeflags); /* build PlanState tree; don't run */
portal->queryDesc = queryDesc;
portal->tupDesc = queryDesc->tupDesc;
portal->atStart = true; portal->atEnd = false; portal->portalPos = 0;
PopActiveSnapshot();
break;
/* ONE_RETURNING / ONE_MOD_WITH: set tupDesc only, run later */
/* UTIL_SELECT: tupDesc from UtilityTupleDescriptor */
/* MULTI_QUERY: tupDesc = NULL, nothing to do now */
}
portal->status = PORTAL_READY;

The CURSOR_OPT_SCROLL branch is where scrollable cursors get their power: the executor is told EXEC_FLAG_REWIND | EXEC_FLAG_BACKWARD, which makes the planner/executor support backward and repositioned fetches (the planner having stacked a Material node atop the plan if needed). All of PortalStart runs inside a PG_TRY/PG_CATCH that marks the portal PORTAL_FAILED on any error.

PortalRun and PortalRunSelect: the suspendable pull (pquery.c)

Section titled “PortalRun and PortalRunSelect: the suspendable pull (pquery.c)”

PortalRun is the per-fetch driver. For the four single-query strategies it calls PortalRunSelect; for PORTAL_MULTI_QUERY it calls PortalRunMulti once and marks the portal done. The crucial difference from a one-shot execution is that for PORTAL_ONE_SELECT it does not fill a tuplestore — it runs the executor directly for count rows and leaves the portal resumable:

// PortalRun — src/backend/tcop/pquery.c (condensed, single-query arm)
MarkPortalActive(portal);
switch (portal->strategy)
{
case PORTAL_ONE_SELECT:
case PORTAL_ONE_RETURNING:
case PORTAL_ONE_MOD_WITH:
case PORTAL_UTIL_SELECT:
/* non-SELECT strategies materialize into holdStore first */
if (portal->strategy != PORTAL_ONE_SELECT && !portal->holdStore)
FillPortalStore(portal, isTopLevel);
nprocessed = PortalRunSelect(portal, true, count, dest);
/* ... copy command tag ... */
portal->status = PORTAL_READY; /* resumable: back to READY */
result = portal->atEnd; /* DONE only if we hit the end */
break;
case PORTAL_MULTI_QUERY:
PortalRunMulti(portal, isTopLevel, false, dest, altdest, qc);
MarkPortalDone(portal);
result = true;
break;
}

PortalRunSelect is where the iterator is actually advanced. For a held cursor or a completed utility query it reads from the tuplestore via RunFromStore; otherwise it pushes the portal’s snapshot and calls ExecutorRun for the requested direction and count, then updates the cursor position from es_processed:

// PortalRunSelect — src/backend/tcop/pquery.c (condensed, forward arm)
queryDesc = portal->queryDesc;
if (forward)
{
if (portal->atEnd || count <= 0)
direction = NoMovementScanDirection, count = 0;
else
direction = ForwardScanDirection;
if (count == FETCH_ALL) count = 0; /* executor: 0 = all rows */
if (portal->holdStore)
nprocessed = RunFromStore(portal, direction, (uint64) count, dest);
else
{
PushActiveSnapshot(queryDesc->snapshot);
ExecutorRun(queryDesc, direction, (uint64) count);
nprocessed = queryDesc->estate->es_processed;
PopActiveSnapshot();
}
if (!ScanDirectionIsNoMovement(direction))
{
if (nprocessed > 0) portal->atStart = false; /* can go backward now */
if (count == 0 || nprocessed < (uint64) count)
portal->atEnd = true; /* retrieved them all */
portal->portalPos += nprocessed;
}
}

Because ExecutorRun honors a tuple count and the executor’s PlanState tree keeps its iteration state between calls (see postgres-executor.md), calling PortalRun(portal, n, ...) repeatedly on the same PORTAL_ONE_SELECT portal advances through the result set n rows at a time — exactly the cursor FETCH n and protocol Execute(n) semantics. The backward arm enforces CURSOR_OPT_NO_SCROLL by raising an error on any backward fetch.

Non-sequential cursor operations (FETCH ABSOLUTE, FETCH BACKWARD ALL, MOVE) go through PortalRunFetchDoPortalRunFetch, which translates the FetchDirection and count into a sequence of forward/backward PortalRunSelect calls (and DoPortalRewind to reposition to the start). These build on the same suspend/resume primitive.

PortalDrop is the symmetric teardown. It refuses to drop a pinned or still PORTAL_ACTIVE portal, runs the cleanup hook (which calls ExecutorEnd to tear down the PlanState tree if the executor is still live), unregisters any held snapshot, drops the cached-plan refcount, and deletes the portal’s memory context:

// PortalDrop — src/backend/utils/mmgr/portalmem.c (condensed)
if (portal->portalPinned)
ereport(ERROR, (errcode(ERRCODE_INVALID_CURSOR_STATE),
errmsg("cannot drop pinned portal \"%s\"", portal->name)));
if (portal->status == PORTAL_ACTIVE)
ereport(ERROR, (errcode(ERRCODE_INVALID_CURSOR_STATE),
errmsg("cannot drop active portal \"%s\"", portal->name)));
if (PointerIsValid(portal->cleanup))
{
portal->cleanup(portal); /* PortalCleanup → ExecutorEnd */
portal->cleanup = NULL;
}
PortalHashTableDelete(portal);
PortalReleaseCachedPlan(portal); /* drop the CachedPlan refcount */

PortalReleaseCachedPlan is the other half of the refcount contract from ExecuteQuery: the refcount GetCachedPlan incremented and PortalDefineQuery took ownership of is released here, and if it was the last reference the CachedPlan is freed. For a generic plan that the CachedPlanSource still links, the source’s own reference keeps it alive for the next EXECUTE.

Position hints (as of 2026-06-05, REL_18 273fe94)

Section titled “Position hints (as of 2026-06-05, REL_18 273fe94)”
SymbolFileLine
PortalData (struct)src/include/utils/portal.h115
PortalStrategy (enum)src/include/utils/portal.h89
PortalStatus (enum)src/include/utils/portal.h103
CachedPlanSource (struct)src/include/utils/plancache.h105
CachedPlan (struct)src/include/utils/plancache.h159
PlanCacheMode (enum)src/include/utils/plancache.h31
ChoosePortalStrategysrc/backend/tcop/pquery.c210
PortalStartsrc/backend/tcop/pquery.c434
PortalRunsrc/backend/tcop/pquery.c685
PortalRunSelectsrc/backend/tcop/pquery.c864
FillPortalStoresrc/backend/tcop/pquery.c995
RunFromStoresrc/backend/tcop/pquery.c1056
PortalRunMultisrc/backend/tcop/pquery.c1185
PortalRunFetchsrc/backend/tcop/pquery.c1377
DoPortalRunFetchsrc/backend/tcop/pquery.c1475
DoPortalRewindsrc/backend/tcop/pquery.c1669
PrepareQuerysrc/backend/commands/prepare.c59
ExecuteQuerysrc/backend/commands/prepare.c150
EvaluateParamssrc/backend/commands/prepare.c281
StorePreparedStatementsrc/backend/commands/prepare.c392
FetchPreparedStatementsrc/backend/commands/prepare.c434
DropPreparedStatementsrc/backend/commands/prepare.c519
CreateCachedPlansrc/backend/utils/cache/plancache.c183
CompleteCachedPlansrc/backend/utils/cache/plancache.c391
SaveCachedPlansrc/backend/utils/cache/plancache.c530
RevalidateCachedQuerysrc/backend/utils/cache/plancache.c667
CheckCachedPlansrc/backend/utils/cache/plancache.c935
BuildCachedPlansrc/backend/utils/cache/plancache.c1019
choose_custom_plansrc/backend/utils/cache/plancache.c1158
cached_plan_costsrc/backend/utils/cache/plancache.c1215
GetCachedPlansrc/backend/utils/cache/plancache.c1280
ReleaseCachedPlansrc/backend/utils/cache/plancache.c1403
CachedPlanIsSimplyValidsrc/backend/utils/cache/plancache.c1563
PlanCacheRelCallbacksrc/backend/utils/cache/plancache.c2098
PlanCacheObjectCallbacksrc/backend/utils/cache/plancache.c2182
CreatePortalsrc/backend/utils/mmgr/portalmem.c175
PortalDefineQuerysrc/backend/utils/mmgr/portalmem.c282
PortalDropsrc/backend/utils/mmgr/portalmem.c468

All claims below were checked against the REL_18_STABLE working tree at commit 273fe94852b3a7e34fd171e8abdf1481beb302fa under /data/hgryoo/references/postgres.

  • Five PortalStrategy valuesPortalStrategy in portal.h enumerates exactly PORTAL_ONE_SELECT, PORTAL_ONE_RETURNING, PORTAL_ONE_MOD_WITH, PORTAL_UTIL_SELECT, PORTAL_MULTI_QUERY. Verified.
  • Six PortalStatus valuesPORTAL_NEW, PORTAL_DEFINED, PORTAL_READY, PORTAL_ACTIVE, PORTAL_DONE, PORTAL_FAILED. The ACTIVE→READY back-edge is documented in the portal.h header comment (“It is possible to transit from ACTIVE back to READY if the query is not run to completion”). Verified.
  • Only PORTAL_ONE_SELECT runs incrementallyPortalRun calls FillPortalStore for every single-query strategy except PORTAL_ONE_SELECT (portal->strategy != PORTAL_ONE_SELECT && !portal->holdStore). Verified in pquery.c.
  • The “5” warm-up constantchoose_custom_plan returns true while plansource->num_custom_plans < 5. The literal 5 with the comment “Generate custom plans until we have done at least 5 (arbitrary)” is present. Verified.
  • Generic chosen iff generic_cost < avg_custom_cost — the final comparison in choose_custom_plan is if (plansource->generic_cost < avg_custom_cost) return false; where avg_custom_cost = total_custom_cost / num_custom_plans. Verified.
  • Planning-cost chargecached_plan_cost adds 1000.0 * cpu_operator_cost * (nrelations + 1) only when include_planner is true, and GetCachedPlan calls it with true for custom plans and false for the generic plan. Verified.
  • Double choose_custom_plan callGetCachedPlan calls choose_custom_plan once before building anything and a second time immediately after building a generic plan and setting generic_cost. Verified.
  • plan_cache_mode GUCPlanCacheMode enum has PLAN_CACHE_MODE_AUTO, PLAN_CACHE_MODE_FORCE_GENERIC_PLAN, PLAN_CACHE_MODE_FORCE_CUSTOM_PLAN; registered in guc_tables.c with default PLAN_CACHE_MODE_AUTO. Verified.
  • Refcount contractGetCachedPlan does plan->refcount++ before returning; ExecuteQuery carries the warning comment “DO NOT add any logic that could possibly throw an error between GetCachedPlan and PortalDefineQuery”; PortalDrop calls PortalReleaseCachedPlan. Verified.
  • Lazy invalidationPlanCacheRelCallback / PlanCacheObjectCallback set is_valid = false (they do not rebuild); RevalidateCachedQuery and CheckCachedPlan perform the rebuild on next use. Verified.
  • fixed_result for PREPAREPrepareQuery passes true for fixed_result to CompleteCachedPlan, and ExecuteQuery errors with “EXECUTE does not support variable-result cached plans” if it is false. Verified.

Not separately re-verified (described at the level the source comments assert, deferred to sibling docs): the internals of ExecutorStart/ExecutorRun (postgres-executor.md); the sinval delivery mechanism behind the plan-cache callbacks (postgres-cache-invalidation.md); the wire-protocol Parse/Bind/ Execute path that reaches these same functions (postgres-wire-protocol.md).

Beyond PostgreSQL — Comparative Designs & Research Frontiers

Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”

PostgreSQL’s plan cache makes one of the more interesting design choices in the engine, and it sits at a junction of several active research and engineering threads. This section places the generic-vs-custom heuristic in context.

The cost of getting the decision wrong: parameter-sensitive plans

Section titled “The cost of getting the decision wrong: parameter-sensitive plans”

The whole generic-vs-custom problem is a special case of what the literature calls the parameter-sensitive query (PSQ) or parametric query optimization (PQO) problem: one query template whose optimal plan depends on the bind values. The classic failure mode is “skewed selectivity” — a generic plan chosen for the average parameter is catastrophic for an outlier parameter, or vice versa. PostgreSQL’s heuristic is deliberately crude: it does not model which parameters were seen, only the average cost of the custom plans it built. This means it cannot detect that, say, half the executions want plan A and half want plan B; it will average their costs and may settle on a mediocre generic plan that serves neither well. Commercial systems attack this more aggressively:

  • SQL Server uses parameter sniffing — it plans on the first set of bind values and caches that plan, which is fast but pathological when the first value is unrepresentative (the infamous “parameter sniffing problem”). Recent versions add Query Store and adaptive joins to mitigate it, and a OPTIMIZE FOR UNKNOWN hint that is the moral equivalent of forcing a generic plan.
  • Oracle ships adaptive cursor sharing and cardinality feedback: it caches multiple plans per statement keyed on bind-value buckets, escalating from a single shared cursor to bind-aware cursors when it observes that the same SQL produces very different cardinalities for different binds. This is strictly more powerful than PostgreSQL’s single generic plan, at the cost of a larger plan cache and more bookkeeping.
  • Bind-variable peeking in DB2 LUW occupies similar ground.

PostgreSQL has periodically debated multi-plan caching (keyed on parameter ranges) on pgsql-hackers; the recurring objection is that the bookkeeping and cache-bloat cost is hard to justify against the simple “≥5 then compare averages” rule, which is cheap and good-enough for the common case. The fact that the constant is literally 5 with the comment “(arbitrary)” is a candid acknowledgement that this is engineering pragmatism, not a tuned optimum.

Research: learned and feedback-driven plan selection

Section titled “Research: learned and feedback-driven plan selection”

The academic frontier has moved well past static cost comparison. Adaptive query processing (Deshpande, Ives, Raman, FnT in Databases 2007) reframes the whole question: instead of committing to one plan, re-route tuples between operators at run time based on observed selectivities — the eddy operator being the canonical example. Learned cost models and learned optimizers (Marcus et al., “Neo”, VLDB 2019; “Bao”, SIGMOD 2021) replace the analytic cost model entirely with a model trained on past executions, and can in principle learn the parameter-sensitivity that PostgreSQL’s averaging discards. None of this is in core PostgreSQL, but pg_hint_plan, the AQO extension (aqo, which feeds back actual cardinalities into the cost model), and the planner-hook surface that choose_custom_plan is reachable from make PostgreSQL a common host for these experiments. The plan cache’s clean separation of CachedPlanSource (the template) from CachedPlan (a concrete plan) is exactly the seam such extensions need.

Container design: portals vs. server-side cursors elsewhere

Section titled “Container design: portals vs. server-side cursors elsewhere”

The portal abstraction — a named, suspendable execution container with its own resource owner and snapshot — is unusually explicit in PostgreSQL. Many engines fold “cursor”, “prepared statement handle”, and “active statement” into a single client-visible handle and keep the suspend/resume state inside the network layer. PostgreSQL’s choice to make the portal a first-class server object is what lets the same machinery serve SQL DECLARE CURSOR, the protocol’s unnamed and named portals, and PL/pgSQL’s implicit cursors uniformly — and it is why pg_cursors can expose them as a system view. The cost is the careful snapshot- and resource-owner bookkeeping visible in PortalDrop and HoldPinnedPortals (the latter converts a pinned portal into a held cursor — results materialized into a tuplestore — so it can survive transaction commit, a feature WITH HOLD cursors depend on).

Prepared statements and connection poolers

Section titled “Prepared statements and connection poolers”

A practical frontier sits outside the engine entirely: connection poolers. Because prepared_queries is a per-backend hash table, a prepared statement created on one pooled backend does not exist on another. Transaction-pooling poolers (PgBouncer in transaction mode) historically broke server-side prepared statements for exactly this reason; PgBouncer 1.21+ added a prepared-statement emulation layer that tracks Parse/Bind/Close at the protocol level and re-prepares on the actual backend. This is a direct consequence of the design documented here: the plan cache’s lifetime is the backend’s, and nothing in plancache.c is shared across backends (unlike the buffer pool or the lock table). A genuinely shared plan cache — proposed several times — would need to solve cross-backend invalidation and memory accounting, which is why it has not landed.

  • No cross-statement plan sharing. Two textually-identical ad-hoc queries from different backends each plan from scratch; there is no global SQL-text → plan cache the way Oracle’s shared pool or SQL Server’s plan cache work. The plan cache is opt-in (via PREPARE / the extended protocol / SPI), not automatic for the simple-query path.
  • No plan stability / plan baselines. There is no built-in mechanism to pin a known-good plan against regression (Oracle’s SQL Plan Management). The closest levers are plan_cache_mode and pg_hint_plan.
  • No per-parameter-bucket plans. As discussed, one generic plan, period.

These are coherent with PostgreSQL’s general philosophy of a small, predictable core with extension points rather than a large feature surface — the same philosophy visible in the executor’s hook design (postgres-executor.md).

  • Code (REL_18_STABLE, commit 273fe94, under /data/hgryoo/references/postgres):
    • src/backend/tcop/pquery.c — portal run loop: ChoosePortalStrategy, PortalStart, PortalRun, PortalRunSelect, PortalRunMulti, PortalRunFetch, FillPortalStore.
    • src/backend/commands/prepare.c — SQL PREPARE/EXECUTE/DEALLOCATE: PrepareQuery, ExecuteQuery, EvaluateParams, StorePreparedStatement, FetchPreparedStatement, the prepared_queries hash table.
    • src/backend/utils/cache/plancache.c — the plan cache: CreateCachedPlan, CompleteCachedPlan, GetCachedPlan, choose_custom_plan, cached_plan_cost, RevalidateCachedQuery, CheckCachedPlan, BuildCachedPlan, the invalidation callbacks.
    • src/backend/utils/mmgr/portalmem.c — portal allocation and teardown: CreatePortal, PortalDefineQuery, PortalDrop, HoldPinnedPortals.
    • src/include/utils/portal.h, src/include/utils/plancache.h — the PortalData, CachedPlanSource, CachedPlan structs and their header comments (the header comments are themselves design documentation).
  • Theory anchor: Database System Concepts (Silberschatz, Korth, Sudarshan, 7th ed.), ch. 15 “Query Processing” (§15.1 the parse/optimize/ evaluate pipeline; §15.7.2 pipelined evaluation) and ch. 16 “Query Optimization” (§16.5 cost-based plan choice) — captured under knowledge/research/dbms-general/. See also .omc/plans/postgres-paper-bibliography.md.
  • Cross-references (this KB): postgres-executor.md (what ExecutorStart/ ExecutorRun do inside the portal), postgres-backend-lifecycle.md (how a backend dispatches to PortalRun), postgres-wire-protocol.md (the Parse/Bind/Execute extended-query path that reaches the same plan cache), postgres-cache-invalidation.md (the sinval mechanism behind the plan-cache callbacks), postgres-planner-overview.md (what BuildCachedPlan invokes).
  • Comparative / frontier reading (named, not quoted): Deshpande, Ives, Raman, “Adaptive Query Processing” (Foundations and Trends in Databases, 2007); Marcus et al., “Neo: A Learned Query Optimizer” (VLDB 2019) and “Bao” (SIGMOD 2021); the PostgreSQL aqo and pg_hint_plan extensions; SQL Server parameter-sniffing / Query Store and Oracle adaptive cursor sharing documentation.