Skip to content

PostgreSQL Plan Creation — From Winning Path to Executable Plan Tree

Contents:

A cost-based optimizer is, in the textbook framing, a search over the space of equivalent evaluation plans for one query. Database System Concepts (Silberschatz, 7e, ch. 16 “Query Optimization”) describes the job as: enumerate logically equivalent expressions, annotate each with a physical operator and an access method, estimate the cost of each, and “choose an evaluation plan that has lowest estimated cost” (§16.1). The chapter is careful to separate two artifacts that beginners conflate:

  • An evaluation-plan expression — the annotated operator tree the optimizer reasons about. Internally PostgreSQL calls these Paths. A Path is cheap to make and throw away: it records only what the optimizer needs to compare alternatives — an operator type, the inputs, the sort order it produces, the rows it will emit, and the cost to produce them. Thousands of Paths are generated and discarded during join-order search.
  • An evaluation plan — the concrete recipe the execution engine runs. DSC §16.1 notes that the chosen expression must still be “annotated with the method to be used for evaluating each operation” and turned into something executable. This is the Plan in PostgreSQL. There is exactly one, and it is built only for the single Path that won the cost tournament.

The transformation from “the cheapest expression” to “the executable plan” is its own phase, and it is the subject of this document. DSC treats it almost in passing — the textbook’s interest is the search, not the lowering — but the lowering is where a real engine does a surprising amount of work. Three problems must be solved before the operator tree can be handed to the iterator engine of postgres-executor.md:

  1. Operator selection and node materialization. The winning Path says “hash join here, sequential scan there.” Each abstract operator must become a concrete node carrying everything the executor needs: the join condition stripped to bare expressions, the qualification clauses ordered for cheapest-first short-circuiting, the output column list, and the frozen cost estimates for EXPLAIN.

  2. Variable resolution. DSC §15.3 and §16 reason about columns by name (r.a, s.b). The parser numbers them as (range-table-index, attribute-number) pairs. But the executor does not have a range table per operator — at a join, the inputs are two anonymous tuple slots, and the only sensible way to name a column is “field k of the outer tuple” or “field k of the inner tuple.” Resolving every Var from catalog identity to slot position is a second, independent tree walk.

  3. Nested sub-queries. A scalar sub-SELECT in a WHERE clause is a whole second query tree. DSC §16.4.4 (“Optimizing Nested Subqueries”) observes that a correlated subquery — one that references the outer row — is logically a function evaluated once per outer tuple, whereas an uncorrelated subquery can be “evaluated once” and its result reused. That dichotomy maps directly onto PostgreSQL’s SubPlan (per-tuple) versus initPlan (once-per-query) distinction.

The remaining sections trace how PostgreSQL’s createplan.c, setrefs.c, and subselect.c solve these three problems on the REL_18 source. The upstream phases — join enumeration and Path costing — are covered in postgres-planner-overview.md and postgres-path-generation.md; the downstream consumer of the finished Plan is postgres-executor.md.

The textbook gives the what (lower the chosen expression to something runnable). This section names the engineering conventions almost every production optimizer adopts for the lowering, so PostgreSQL’s specific choices read as one point in a shared design space.

Two object models: a throwaway plan-space node and a durable plan node

Section titled “Two object models: a throwaway plan-space node and a durable plan node”

Optimizers that do real search keep the search-space node deliberately thin and a separate, fatter node for the survivor. The search node (Path / Group expression in Cascades / AbstractRelNode in some academic systems) carries only comparison-relevant fields so that allocating millions of them is cheap. The survivor is expanded — once — into a node that carries execution detail (resolved expressions, serialization hints, parallel-worker counts). Conflating the two would make search allocate execution metadata it will throw away 99.99% of the time. PostgreSQL’s Path vs Plan split is the textbook instance.

Top-down structural recursion with a “what tlist do you owe me?” contract

Section titled “Top-down structural recursion with a “what tlist do you owe me?” contract”

The lowering walks the chosen tree from the root down. A subtlety almost every engine hits: a parent operator sometimes needs its child to emit exactly a certain set of columns in a certain order (a sort needs its sort keys present; a set-operation needs matching column counts), and sometimes does not care because the parent will project anyway (a nested-loop join re-projects its output, so its children may emit whatever is convenient). Engines therefore thread a small “tlist requirement” flag down the recursion so a child can choose the cheapest target list it is allowed to. Emitting a wider physical tuple when allowed lets the scan skip a projection step entirely.

Quals: order them, strip them, and split one-time from per-tuple

Section titled “Quals: order them, strip them, and split one-time from per-tuple”

A node’s qualification clauses are reorganized at lowering time, not run time. Two transformations are universal: (a) order the clauses so the cheapest and most selective run first, since AND short-circuits; and (b) strip the optimizer’s bookkeeping wrapper off each clause (PostgreSQL’s RestrictInfo) to leave the bare boolean expression the evaluator wants. A third, more specialized split pulls out pseudoconstant clauses — conditions that do not depend on the current row (WHERE 1=0, WHERE $1 IS NULL) — so they can be evaluated once as a gate rather than re-tested per tuple.

This is the deepest convention and the one most invisible to users. Inside the executor, an operator receives its inputs as opaque tuple slots; it must address a column as “slot, offset k,” not as “table T, column c.” So the lowering performs a global rewrite: at a scan leaf, a Var keeps its identity but its range-table index is shifted into a single flattened namespace; at every interior node, a Var that referred to a base column is rewritten to point at the position of that column in the child’s output tuple. After this pass, no operator needs the catalog to find a column — it dereferences an array. The price is a second full tree walk and a per-node index structure to make the “which output position is this Var?” lookup fast.

Subqueries: once-per-query vs once-per-row

Section titled “Subqueries: once-per-query vs once-per-row”

Every engine must decide, for each nested sub-SELECT, whether it can be hoisted to run a single time (its result cached in a parameter slot) or must be re-evaluated for each outer row because it depends on that row. The hoistable case is strictly cheaper and is taken whenever the subquery has no correlation to the current level. The per-row case is wired as a callable sub-plan that reads its correlation values from parameter slots the parent fills in before each invocation.

The cost and row-count estimates computed during search are copied onto the plan node and frozen. They are no longer used to make decisions — the decisions are over — but EXPLAIN surfaces them, and a few late rewrites (inserting a Sort or Material node not present as its own Path) re-derive a local estimate to keep EXPLAIN self-consistent.

ConventionPostgreSQL realization
Throwaway search nodePath (and subtypes) in pathnodes.h
Durable execution nodePlan (and subtypes) in plannodes.h
Top-down loweringcreate_plancreate_plan_recurse
Tlist requirement flagCP_EXACT_TLIST / CP_SMALL_TLIST / CP_LABEL_TLIST / CP_IGNORE_TLIST
Wider physical tlistuse_physical_tlistbuild_physical_tlist
Order + strip qualsorder_qual_clauses + extract_actual_clauses
One-time qual gateget_gating_qualscreate_gating_plan (a Result)
Positional variable bindingset_plan_referencesfix_scan_expr / fix_join_expr / fix_upper_expr
Per-row subquerySubPlan (build_subplan, isInitPlan = false)
Once-per-query subqueryinitPlan (isInitPlan = true, SS_attach_initplans)
Frozen estimatescopy_generic_path_info

PostgreSQL splits plan creation into two cleanly separated passes over the tree, with subquery wiring threaded through both.

Pass 1 — createplan.c: Path tree → Plan tree (still in parser Var numbering). create_plan is the entry point. It is handed the single best_path chosen by the join search and recurses down it, materializing one Plan node per Path node. Crucially, the comment at the head of the file is explicit that this pass leaves variable numbering alone:

// createplan.c file header — src/backend/optimizer/plan/createplan.c
* The tlists and quals in the plan tree are still in planner format,
* ie, Vars still correspond to the parser's numbering. This will be
* fixed later by setrefs.c.

The recursion dispatches purely on best_path->pathtype, and the switch is the table of contents for the whole module:

// create_plan_recurse — src/backend/optimizer/plan/createplan.c
switch (best_path->pathtype)
{
case T_SeqScan:
case T_IndexScan:
/* ... all scan types ... */
case T_CustomScan:
plan = create_scan_plan(root, best_path, flags);
break;
case T_HashJoin:
case T_MergeJoin:
case T_NestLoop:
plan = create_join_plan(root, (JoinPath *) best_path);
break;
/* ... Append, Sort, Agg, Material, Gather, ModifyTable, Limit ... */
}

The flags argument is the “tlist requirement” contract from the Common Design section. Its values are defined at the top of the file:

// flag definitions — src/backend/optimizer/plan/createplan.c
#define CP_EXACT_TLIST 0x0001 /* Plan must return specified tlist */
#define CP_SMALL_TLIST 0x0002 /* Prefer narrower tlists */
#define CP_LABEL_TLIST 0x0004 /* tlist must contain sortgrouprefs */
#define CP_IGNORE_TLIST 0x0008 /* caller will replace tlist */

create_plan seeds the recursion with CP_EXACT_TLIST because the top-level plan must emit exactly the query’s output columns; interior nodes relax this for their children when they themselves can project.

Pass 2 — setrefs.c: resolve every Var to a slot position. After create_plan returns, the caller (standard_plannerset_plan_references) walks the finished Plan tree a second time. This pass flattens all per-subquery range tables into one global finalrtable, shifts every scan node’s Vars by an rtoffset, and — the heart of the pass — rewrites each join/upper node’s Vars so they address the output position of the child tuple rather than a base column.

Subquery wiring — subselect.c. Sub-SELECTs were turned into SubPlan/initPlan structures earlier (during expression preprocessing and build_subplan), but plan creation is where they are stitched onto the tree: SS_attach_initplans hangs the level’s initPlans on the topmost node, and setrefs.c resolves the PARAM_EXEC references that connect a sub-plan’s output to the place it is consumed.

create_scan_plan is the worker for all leaf relations. It does four things in order: pick the restriction clauses, decide on a target list, build the typed node, and (if needed) wrap a gating Result. The target-list decision is the interesting part — when the parent does not demand an exact tlist, the scan prefers a physical tlist (all columns in storage order) so the executor can skip projection:

// create_scan_plan — src/backend/optimizer/plan/createplan.c
if (flags == CP_IGNORE_TLIST)
{
tlist = NULL;
}
else if (use_physical_tlist(root, best_path, flags))
{
/* ... index-only scan uses the index's tlist; else: */
tlist = build_physical_tlist(root, rel);
if (tlist == NIL) /* dropped cols -> fall back */
tlist = build_path_tlist(root, best_path);
}
else
{
tlist = build_path_tlist(root, best_path);
}

The normal target list is built by build_path_tlist, which simply wraps each expression of the Path’s pathtarget in a TargetEntry with a sequential resno, and — for a parameterized path — replaces any lateral references with nestloop Params on the way:

// build_path_tlist — src/backend/optimizer/plan/createplan.c
foreach(v, path->pathtarget->exprs)
{
Node *node = (Node *) lfirst(v);
TargetEntry *tle;
if (path->param_info)
node = replace_nestloop_params(root, node);
tle = makeTargetEntry((Expr *) node, resno, NULL, false);
if (sortgrouprefs)
tle->ressortgroupref = sortgrouprefs[resno - 1];
tlist = lappend(tlist, tle);
resno++;
}

use_physical_tlist is the gate: it refuses the wide-tuple shortcut whenever the parent demanded an exact or small tlist, whenever the relation is not a simple base/subquery/function/values/CTE rel, and whenever system columns or whole-row Vars are needed (those cannot be addressed positionally in setrefs.c):

// use_physical_tlist — src/backend/optimizer/plan/createplan.c
if (flags & (CP_EXACT_TLIST | CP_SMALL_TLIST))
return false;
if (rel->rtekind != RTE_RELATION &&
rel->rtekind != RTE_SUBQUERY &&
/* ... RTE_FUNCTION, RTE_TABLEFUNC, RTE_VALUES, RTE_CTE ... */ )
return false;
/* Can't do it if any system columns or whole-row Vars are requested. */
for (i = rel->min_attr; i <= 0; i++)
if (!bms_is_empty(rel->attr_needed[i - rel->min_attr]))
return false;

Once the tlist is chosen, the per-type creator runs. The sequential-scan case is the cleanest illustration of the whole node-building idiom — order quals, strip them, replace outer Vars, allocate, copy estimates:

// create_seqscan_plan — src/backend/optimizer/plan/createplan.c
/* Sort clauses into best execution order */
scan_clauses = order_qual_clauses(root, scan_clauses);
/* Reduce RestrictInfo list to bare expressions; ignore pseudoconstants */
scan_clauses = extract_actual_clauses(scan_clauses, false);
/* Replace any outer-relation variables with nestloop params */
if (best_path->param_info)
scan_clauses = (List *)
replace_nestloop_params(root, (Node *) scan_clauses);
scan_plan = make_seqscan(tlist, scan_clauses, scan_relid);
copy_generic_path_info(&scan_plan->scan.plan, best_path);

make_seqscan is a pure allocator — no logic, just field assignment — which keeps the policy (above) separate from the node shape:

// make_seqscan — src/backend/optimizer/plan/createplan.c
SeqScan *node = makeNode(SeqScan);
Plan *plan = &node->scan.plan;
plan->targetlist = qptlist;
plan->qual = qpqual;
plan->lefttree = NULL;
plan->righttree = NULL;
node->scan.scanrelid = scanrelid;

And copy_generic_path_info is where the search-phase estimates are frozen onto the durable node:

// copy_generic_path_info — src/backend/optimizer/plan/createplan.c
dest->disabled_nodes = src->disabled_nodes;
dest->startup_cost = src->startup_cost;
dest->total_cost = src->total_cost;
dest->plan_rows = src->rows;
dest->plan_width = src->pathtarget->width;
dest->parallel_aware = src->parallel_aware;
dest->parallel_safe = src->parallel_safe;

The overall control flow of Pass 1 for a two-table join with a correlated nested loop:

flowchart TD
    A["create_plan(root, best_path)<br/>seed flags = CP_EXACT_TLIST"] --> B["create_plan_recurse<br/>dispatch on pathtype"]
    B -->|T_NestLoop| C["create_join_plan<br/>-> create_nestloop_plan"]
    C --> D["recurse: outer child<br/>flags = 0 (NL can project)"]
    C --> E["push outer relids into<br/>root->curOuterRels"]
    E --> F["recurse: inner child<br/>flags = 0"]
    D -->|T_SeqScan| G["create_scan_plan<br/>physical tlist allowed?"]
    F -->|T_IndexScan| H["create_scan_plan<br/>fix indexquals, replace outer Vars"]
    G --> I["make_seqscan +<br/>copy_generic_path_info"]
    H --> J["make_indexscan +<br/>copy_generic_path_info"]
    C --> K["identify_current_nestloop_params<br/>build nestParams list"]
    K --> L["make_nestloop(tlist, joinclauses,<br/>nestParams, outer, inner)"]
    B --> M["apply_tlist_labeling on top node"]
    M --> N["SS_attach_initplans(root, plan)"]
    N --> O["Plan tree in PARSER Var numbering"]

create_join_plan dispatches the three join methods, then checks for pseudoconstant gating quals exactly as the scan path does:

// create_join_plan — src/backend/optimizer/plan/createplan.c
switch (best_path->path.pathtype)
{
case T_MergeJoin: plan = (Plan *) create_mergejoin_plan(root, (MergePath *) best_path); break;
case T_HashJoin: plan = (Plan *) create_hashjoin_plan(root, (HashPath *) best_path); break;
case T_NestLoop: plan = (Plan *) create_nestloop_plan(root, (NestPath *) best_path); break;
}
gating_clauses = get_gating_quals(root, best_path->joinrestrictinfo);
if (gating_clauses)
plan = create_gating_plan(root, (Path *) best_path, plan, gating_clauses);

create_nestloop_plan is where lateral correlation becomes machinery. A nested-loop join can re-scan its inner side once per outer row, so the inner subtree is allowed to reference outer columns. Those references are carried as Vars during search; here they are converted to NestLoopParams. The routine pushes the outer relids into root->curOuterRels before recursing into the inner child, so that replace_nestloop_params (called deep inside the inner scan’s clause processing) knows which Vars belong to the outer side:

// create_nestloop_plan — src/backend/optimizer/plan/createplan.c
/* NestLoop can project, so no need to be picky about child tlists */
outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath, 0);
/* For a nestloop, include outer relids in curOuterRels for inner side */
outerrelids = best_path->jpath.outerjoinpath->parent->relids;
root->curOuterRels = bms_union(root->curOuterRels, outerrelids);
inner_plan = create_plan_recurse(root, best_path->jpath.innerjoinpath, 0);
/* Restore curOuterRels */
bms_free(root->curOuterRels);
root->curOuterRels = saveOuterRels;

After both children exist, the join clauses are ordered, stripped (with an outer-join split so that nulled-side quals become otherclauses), and the surviving nestloop params are identified and removed from root->curOuterParams:

// create_nestloop_plan — src/backend/optimizer/plan/createplan.c
if (best_path->jpath.path.param_info)
{
joinclauses = (List *) replace_nestloop_params(root, (Node *) joinclauses);
otherclauses = (List *) replace_nestloop_params(root, (Node *) otherclauses);
}
nestParams = identify_current_nestloop_params(root, outerrelids,
PATH_REQ_OUTER((Path *) best_path));
join_plan = make_nestloop(tlist, joinclauses, otherclauses, nestParams,
outer_plan, inner_plan,
best_path->jpath.jointype, best_path->jpath.inner_unique);

The replace_nestloop_params mutator itself is the rule: a Var is turned into a PARAM_EXEC parameter only if its varno is a member of the current outer relids; otherwise it passes through untouched. This is why the curOuterRels book-keeping above is load-bearing:

// replace_nestloop_params_mutator — src/backend/optimizer/plan/createplan.c
if (IsA(node, Var))
{
Var *var = (Var *) node;
/* If not to be replaced, we can just return the Var unmodified */
if (IS_SPECIAL_VARNO(var->varno) ||
!bms_is_member(var->varno, root->curOuterRels))
return node;
/* Replace the Var with a nestloop Param */
return (Node *) replace_nestloop_param_var(root, var);
}

Projection. create_projection_plan shows the tlist-requirement contract paying off: if the parent does not care about the exact tlist (use_physical_tlist), or if the child is itself projection-capable, no separate Result node is built — the projection is folded into the child by simply reassigning its targetlist. A Result is materialized only when a non-projecting child must have its output reshaped:

// create_projection_plan — src/backend/optimizer/plan/createplan.c
if (!needs_result_node)
{
/* Don't need a separate Result, just assign tlist to subplan */
plan = subplan;
plan->targetlist = tlist;
/* Label plan with the estimated costs we actually used */
plan->startup_cost = best_path->path.startup_cost;
/* ... */
}
else
{
/* We need a Result node */
plan = (Plan *) make_result(tlist, NULL, subplan);
copy_generic_path_info(plan, (Path *) best_path);
}

Pass 1: handing subqueries to a fresh planner context

Section titled “Pass 1: handing subqueries to a fresh planner context”

When a scan is over a sub-SELECT (RTE_SUBQUERY), create_subqueryscan_plan does not call create_plan_recurse; it calls create_plan afresh on the subquery’s own PlannerInfo (rel->subroot), because the subquery is a separate range-table namespace that must be lowered independently before being spliced in:

// create_subqueryscan_plan — src/backend/optimizer/plan/createplan.c
/*
* Recursively create Plan from Path for subquery. Since we are entering
* a different planner context (subroot), recurse to create_plan not
* create_plan_recurse.
*/
subplan = create_plan(rel->subroot, best_path->subpath);
/* ... */
if (best_path->path.param_info)
{
process_subquery_nestloop_params(root, rel->subplan_params);
scan_clauses = (List *)
replace_nestloop_params(root, (Node *) scan_clauses);
}
scan_plan = make_subqueryscan(tlist, scan_clauses, scan_relid, subplan);

Scalar sub-SELECTs in expressions (not in FROM) are a different mechanism entirely, handled in subselect.c. build_subplan classifies each one. The key decision is initPlan vs SubPlan: if the sub-SELECT has no parameters supplied by the current level (parParam == NIL) and is of a hoistable type (EXISTS, EXPR, ARRAY, ROWCOMPARE, MULTIEXPR), it becomes an initPlan that runs once and stashes its result in a PARAM_EXEC slot, and a bare Param is returned to stand in for it in the parent expression:

// build_subplan — src/backend/optimizer/plan/subselect.c
if (splan->parParam == NIL && subLinkType == EXPR_SUBLINK)
{
TargetEntry *te = linitial(plan->targetlist);
Param *prm;
prm = generate_new_exec_param(root,
exprType((Node *) te->expr),
exprTypmod((Node *) te->expr),
exprCollation((Node *) te->expr));
splan->setParam = list_make1_int(prm->paramid);
isInitPlan = true;
result = (Node *) prm; /* parent sees a Param, not the SubPlan */
}

Otherwise — a correlated subquery, or an ANY/ALL test that must scan the output per outer tuple — it stays a SubPlan that the executor invokes per row, optionally hash-tabled or materialized:

// build_subplan — src/backend/optimizer/plan/subselect.c
if (subLinkType == ANY_SUBLINK &&
splan->parParam == NIL &&
subplan_is_hashable(plan) &&
testexpr_is_hashable(splan->testexpr, splan->paramIds))
splan->useHashTable = true;
else if (splan->parParam == NIL && enable_material &&
!ExecMaterializesOutput(nodeTag(plan)))
plan = materialize_finished_plan(plan);
result = (Node *) splan;
isInitPlan = false;

Either way the finished sub-plan is registered in the global glob->subplans list (indexed by plan_id), and initPlans additionally go on root->init_plans so SS_attach_initplans can later hang them on the topmost node:

// build_subplan (tail) — src/backend/optimizer/plan/subselect.c
root->glob->subplans = lappend(root->glob->subplans, plan);
splan->plan_id = list_length(root->glob->subplans);
if (isInitPlan)
root->init_plans = lappend(root->init_plans, splan);
splan->plan_name = psprintf("%s %d",
isInitPlan ? "InitPlan" : "SubPlan",
splan->plan_id);

The second pass, set_plan_references, is the conceptual climax. The file header enumerates its nine jobs; the two that matter here are flattening the range table and resolving Vars. It begins by recording the current size of the global range table as rtoffset, then appends this query level’s RTEs to it — so every scan node in this level will need its varno shifted up by rtoffset to address the flattened table:

// set_plan_references — src/backend/optimizer/plan/setrefs.c
int rtoffset = list_length(glob->finalrtable);
/* Add all the query's RTEs to the flattened rangetable. */
add_rtes_to_flat_rtable(root, false);
/* ... adjust rowmarks, appendrels, alt-subplans ... */
/* Now fix the Plan tree */
result = set_plan_refs(root, plan, rtoffset);

set_plan_refs recurses the Plan tree, assigns each node a unique plan_node_id, and dispatches on node type. For a scan leaf, the fix is purely additive: shift scanrelid and every Var’s varno by rtoffset, because a scan Var still names a base column — only its table index moves into the global namespace:

// set_plan_refs (T_SeqScan case) — src/backend/optimizer/plan/setrefs.c
SeqScan *splan = (SeqScan *) plan;
splan->scan.scanrelid += rtoffset;
splan->scan.plan.targetlist =
fix_scan_list(root, splan->scan.plan.targetlist,
rtoffset, NUM_EXEC_TLIST(plan));
splan->scan.plan.qual =
fix_scan_list(root, splan->scan.plan.qual,
rtoffset, NUM_EXEC_QUAL(plan));

fix_scan_expr does the Var shift via its mutator. Note it asserts the scan never sees an INNER_VAR/OUTER_VAR — those special varnos only appear above a join, never at a leaf:

// fix_scan_expr_mutator — src/backend/optimizer/plan/setrefs.c
if (IsA(node, Var))
{
Var *var = copyVar((Var *) node);
Assert(var->varno != INNER_VAR);
Assert(var->varno != OUTER_VAR);
Assert(var->varno != ROWID_VAR);
if (!IS_SPECIAL_VARNO(var->varno))
var->varno += context->rtoffset;
if (var->varnosyn > 0)
var->varnosyn += context->rtoffset;
return (Node *) var;
}

At a join, the rewrite is fundamentally different. A join’s output Vars must no longer name base columns; they must name positions in the inner or outer child’s output tuple. set_join_references builds an indexed_tlist over each child’s target list, then rewrites the join’s own clauses and tlist against those indexes. The special varnos INNER_VAR (-1) and OUTER_VAR (-2) are the executor’s way of saying “this column comes from the inner / outer slot”:

// INNER_VAR / OUTER_VAR / INDEX_VAR — src/include/nodes/primnodes.h
#define INNER_VAR (-1) /* reference to inner subplan */
#define OUTER_VAR (-2) /* reference to outer subplan */
#define INDEX_VAR (-3) /* reference to index column */
// set_join_references — src/backend/optimizer/plan/setrefs.c
outer_itlist = build_tlist_index(outer_plan->targetlist);
inner_itlist = build_tlist_index(inner_plan->targetlist);
join->joinqual = fix_join_expr(root, join->joinqual,
outer_itlist, inner_itlist,
(Index) 0, rtoffset, NRM_EQUAL, ...);
/* ... join-type-specific: mergeclauses / hashclauses / nestParams ... */
join->plan.targetlist = fix_join_expr(root, join->plan.targetlist,
outer_itlist, inner_itlist,
(Index) 0, rtoffset,
(join->jointype == JOIN_INNER ?
NRM_EQUAL : NRM_SUPERSET), ...);

build_tlist_index is the lookup accelerator: it flattens a child’s tlist into an array of (varno, varattno) -> resno triples so that matching a parent Var to a child output position is a linear scan over a compact array rather than over TargetEntry nodes:

// build_tlist_index — src/backend/optimizer/plan/setrefs.c
foreach(l, tlist)
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
if (tle->expr && IsA(tle->expr, Var))
{
Var *var = (Var *) tle->expr;
vinfo->varno = var->varno;
vinfo->varattno = var->varattno;
vinfo->resno = tle->resno;
vinfo->varnullingrels = var->varnullingrels;
vinfo++;
}
else if (tle->expr && IsA(tle->expr, PlaceHolderVar))
itlist->has_ph_vars = true;
else
itlist->has_non_vars = true;
}

fix_join_expr_mutator is the per-Var rewriter: it searches the outer index first, then the inner, and on a hit returns a copy of the Var with varno set to OUTER_VAR/INNER_VAR and varattno set to the child’s output resno — the column’s position, not its catalog attribute:

// fix_join_expr_mutator — src/backend/optimizer/plan/setrefs.c
if (context->outer_itlist)
{
newvar = search_indexed_tlist_for_var(var, context->outer_itlist,
OUTER_VAR, context->rtoffset,
context->nrm_match);
if (newvar) return (Node *) newvar;
}
if (context->inner_itlist)
{
newvar = search_indexed_tlist_for_var(var, context->inner_itlist,
INNER_VAR, context->rtoffset,
context->nrm_match);
if (newvar) return (Node *) newvar;
}
/* ... PlaceHolderVar / non-Var / acceptable_rel fallbacks ... */
elog(ERROR, "variable not found in subplan target lists");

And search_indexed_tlist_for_var is where the substitution is made concrete — the matched Var’s varno becomes the special varno and its varattno becomes the child output slot’s resno:

// search_indexed_tlist_for_var — src/backend/optimizer/plan/setrefs.c
if (vinfo->varno == varno && vinfo->varattno == varattno)
{
Var *newvar = copyVar(var);
/* ... varnullingrels cross-check ... */
newvar->varno = newvarno; /* OUTER_VAR or INNER_VAR */
newvar->varattno = vinfo->resno; /* position in child tuple */
if (newvar->varnosyn > 0)
newvar->varnosyn += rtoffset;
return newvar;
}

For single-input upper nodes (Agg, Group, Result), the analogous routine set_upper_references builds one index over the lone child and rewrites Vars to OUTER_VAR. The data-flow of Pass 2 for the same join:

flowchart TD
    A["set_plan_references<br/>rtoffset = len(finalrtable)"] --> B["add_rtes_to_flat_rtable<br/>append this level's RTEs"]
    B --> C["set_plan_refs<br/>recurse, assign plan_node_id"]
    C -->|leaf scan| D["scanrelid += rtoffset"]
    D --> E["fix_scan_expr<br/>Var.varno += rtoffset<br/>(still a base-column Var)"]
    C -->|join node| F["build_tlist_index<br/>over outer + inner tlists"]
    F --> G["fix_join_expr on<br/>joinqual, hashclauses, tlist"]
    G --> H["search_indexed_tlist_for_var"]
    H --> I["match outer?<br/>varno = OUTER_VAR<br/>varattno = child resno"]
    H --> J["match inner?<br/>varno = INNER_VAR<br/>varattno = child resno"]
    C -->|upper node| K["set_upper_references<br/>build_tlist_index over lefttree"]
    K --> L["fix_upper_expr -> OUTER_VAR"]
    I --> M["Plan tree: every Var is a<br/>(special varno, slot offset)"]
    J --> M
    L --> M
    E --> M

The plan-creation machinery lives in three files under src/backend/optimizer/plan/. The symbols below are grouped by the phase they belong to; each is a stable anchor (line numbers are hints in the table at the end).

  • create_plan — module entry point. Seeds the recursion with CP_EXACT_TLIST, calls create_plan_recurse, then does three finishing touches on the root node: apply_tlist_labeling (restore original column names for the result descriptor), SS_attach_initplans (hang this level’s initPlans), and an assertion that all NestLoopParams were assigned.
  • create_plan_recurse — the dispatch switch on best_path->pathtype. Guards against deep trees with check_stack_depth(). Returns one Plan * per Path.
  • create_scan_plan — leaf-relation worker. Selects scan_clauses (using indrestrictinfo for index paths), folds in param_info->ppi_clauses for parameterized scans, decides the tlist via use_physical_tlist, dispatches to the typed creator, and wraps a gating Result if get_gating_quals returned pseudoconstants.
  • create_join_plan — dispatches create_mergejoin_plan, create_hashjoin_plan, create_nestloop_plan; appends a gating Result for pseudoconstant joinrestrictinfo.
  • build_path_tlist — wraps each pathtarget->exprs element in a TargetEntry; applies replace_nestloop_params for parameterized paths; copies sortgrouprefs.
  • use_physical_tlist — gate for emitting a storage-order (all-columns) tlist so the executor can skip projection. Refuses when CP_EXACT_TLIST/CP_SMALL_TLIST set, for non-simple rtekinds, for inheritance/Custom paths, and when system columns or whole-row Vars are needed.
  • create_seqscan_plan / make_seqscan — the canonical node-build idiom (order → strip → replace outer Vars → allocate → copy estimates) and its pure allocator.
  • create_indexscan_plan — additionally calls fix_indexqual_references to rewrite index-key Vars to INDEX_VAR attribute numbers and to strip RestrictInfo wrappers.
  • create_nestloop_plan — manages root->curOuterRels around the inner-child recursion; calls identify_current_nestloop_params; builds the node with make_nestloop.
  • create_projection_plan / make_result / inject_projection_plan — fold projection into the child when possible, else materialize a Result.
  • create_subqueryscan_plan — re-enters create_plan on rel->subroot (a separate planner context), calls process_subquery_nestloop_params for lateral refs.
  • replace_nestloop_params / replace_nestloop_params_mutator — convert outer-relation Vars and PlaceHolderVars into PARAM_EXEC parameters, keyed on root->curOuterRels membership.
  • copy_generic_path_info — freeze cost/rows/width/parallel flags from the Path onto the Plan. copy_plan_costsize does the same for nodes inserted without their own Path (e.g. Sort, Material). change_plan_targetlist swaps a child’s tlist when a nestloop must publish a PHV in the outer output.

Pass 2 — setrefs.c: positional variable binding

Section titled “Pass 2 — setrefs.c: positional variable binding”
  • set_plan_references — pass entry. Computes rtoffset = list_length(glob->finalrtable), flattens RTEs (add_rtes_to_flat_rtable), adjusts rowmarks/appendrels, then calls set_plan_refs.
  • set_plan_refs — recursive per-node dispatch. Assigns plan_node_id; for scans shifts scanrelid and Vars by rtoffset; for joins delegates to set_join_references; for single-input upper nodes delegates to set_upper_references. Also deletes no-op SubqueryScan/Append/MergeAppend nodes (trivial_subqueryscan).
  • fix_scan_expr / fix_scan_expr_mutator / fix_scan_expr_walker — scan-level Var shift. The walker (no-copy) variant is taken when rtoffset == 0 and there are no params/PHVs to rewrite, saving cycles on trivial queries. Calls fix_param_node for PARAM_MULTIEXPRPARAM_EXEC substitution.
  • set_join_references — builds indexed_tlists over both children, rewrites joinqual, method-specific clauses (mergeclauses, hashclauses, hashkeys), nestParams, then targetlist/qual via fix_join_expr / fix_upper_expr.
  • set_upper_references — single-child analogue (Agg, Group, Result), rewriting Vars to OUTER_VAR.
  • set_indexonlyscan_references — special-cases index-only scans so Vars reference INDEX_VAR columns.
  • build_tlist_index / build_tlist_index_other_vars — flatten a child tlist into a tlist_vinfo array for O(n) Var matching; set has_ph_vars / has_non_vars flags.
  • search_indexed_tlist_for_var — the core substitution: (varno, varattno)(newvarno, resno) with a varnullingrels cross-check governed by NullingRelsMatch. Companions search_indexed_tlist_for_phv and search_indexed_tlist_for_non_var handle PHVs and whole expressions pushed into a child tlist.
  • fix_join_expr / fix_join_expr_mutator — outer-then-inner Var search; errors with "variable not found in subplan target lists" on miss. fix_upper_expr / fix_upper_expr_mutator — the single-input variant.
  • set_dummy_tlist_references — for nodes (Sort, Material, etc.) whose output is exactly their input: rewrite the tlist to a list of OUTER_VAR references by position.
  • make_subplan — top-level conversion of a SubLink to a plan; runs the subquery planner, then calls build_subplan.
  • build_subplan — the initPlan-vs-SubPlan classifier. Builds parParam/args, decides isInitPlan, allocates the output PARAM_EXEC via generate_new_exec_param / generate_subquery_params, optionally sets useHashTable or materializes, registers in glob->subplans and (for initPlans) root->init_plans, and labels for EXPLAIN.
  • generate_subquery_params / convert_testexpr — build the PARAM_EXEC list standing for the subquery’s output columns and splice them into the test expression.
  • SS_attach_initplans — assign plan->initPlan = root->init_plans on the topmost node (called from create_plan).
  • SS_process_ctes — turn WITH CTEs into initplans / CTE scans.
  • SS_finalize_plan / finalize_plan — final recursive pass computing each node’s extParam/allParam sets (run after set_plan_references).

Position hints (as of 2026-06-05, REL_18 273fe94):

SymbolFileLine
create_plansrc/backend/optimizer/plan/createplan.c337
create_plan_recursesrc/backend/optimizer/plan/createplan.c388
create_scan_plansrc/backend/optimizer/plan/createplan.c559
build_path_tlistsrc/backend/optimizer/plan/createplan.c825
use_physical_tlistsrc/backend/optimizer/plan/createplan.c865
get_gating_qualssrc/backend/optimizer/plan/createplan.c1002
create_gating_plansrc/backend/optimizer/plan/createplan.c1022
create_join_plansrc/backend/optimizer/plan/createplan.c1081
create_projection_plansrc/backend/optimizer/plan/createplan.c2015
inject_projection_plansrc/backend/optimizer/plan/createplan.c2117
change_plan_targetlistsrc/backend/optimizer/plan/createplan.c2149
create_seqscan_plansrc/backend/optimizer/plan/createplan.c2910
create_indexscan_plansrc/backend/optimizer/plan/createplan.c2999
create_nestloop_plansrc/backend/optimizer/plan/createplan.c4341
create_mergejoin_plansrc/backend/optimizer/plan/createplan.c4493
create_hashjoin_plansrc/backend/optimizer/plan/createplan.c4847
replace_nestloop_paramssrc/backend/optimizer/plan/createplan.c5036
replace_nestloop_params_mutatorsrc/backend/optimizer/plan/createplan.c5043
fix_indexqual_referencessrc/backend/optimizer/plan/createplan.c5121
copy_generic_path_infosrc/backend/optimizer/plan/createplan.c5514
copy_plan_costsizesrc/backend/optimizer/plan/createplan.c5530
make_seqscansrc/backend/optimizer/plan/createplan.c5643
make_nestloopsrc/backend/optimizer/plan/createplan.c6083
make_resultsrc/backend/optimizer/plan/createplan.c7129
set_plan_referencessrc/backend/optimizer/plan/setrefs.c288
add_rtes_to_flat_rtablesrc/backend/optimizer/plan/setrefs.c396
set_plan_refssrc/backend/optimizer/plan/setrefs.c619
set_indexonlyscan_referencessrc/backend/optimizer/plan/setrefs.c1333
trivial_subqueryscansrc/backend/optimizer/plan/setrefs.c1476
fix_param_nodesrc/backend/optimizer/plan/setrefs.c2125
fix_scan_exprsrc/backend/optimizer/plan/setrefs.c2212
fix_scan_expr_mutatorsrc/backend/optimizer/plan/setrefs.c2247
set_join_referencessrc/backend/optimizer/plan/setrefs.c2332
set_upper_referencessrc/backend/optimizer/plan/setrefs.c2481
set_dummy_tlist_referencessrc/backend/optimizer/plan/setrefs.c2692
build_tlist_indexsrc/backend/optimizer/plan/setrefs.c2759
build_tlist_index_other_varssrc/backend/optimizer/plan/setrefs.c2810
search_indexed_tlist_for_varsrc/backend/optimizer/plan/setrefs.c2868
search_indexed_tlist_for_phvsrc/backend/optimizer/plan/setrefs.c2933
search_indexed_tlist_for_non_varsrc/backend/optimizer/plan/setrefs.c2986
fix_join_exprsrc/backend/optimizer/plan/setrefs.c3104
fix_join_expr_mutatorsrc/backend/optimizer/plan/setrefs.c3126
fix_upper_exprsrc/backend/optimizer/plan/setrefs.c3278
fix_upper_expr_mutatorsrc/backend/optimizer/plan/setrefs.c3298
make_subplansrc/backend/optimizer/plan/subselect.c162
build_subplansrc/backend/optimizer/plan/subselect.c319
generate_subquery_paramssrc/backend/optimizer/plan/subselect.c582
convert_testexprsrc/backend/optimizer/plan/subselect.c644
SS_process_ctessrc/backend/optimizer/plan/subselect.c880
SS_attach_initplanssrc/backend/optimizer/plan/subselect.c2353
SS_finalize_plansrc/backend/optimizer/plan/subselect.c2368
INNER_VAR / OUTER_VAR / INDEX_VARsrc/include/nodes/primnodes.h242
IS_SPECIAL_VARNOsrc/include/nodes/primnodes.h247

All claims below were checked against the REL_18 working tree at commit 273fe94 on 2026-06-05.

  • Two passes, parser numbering preserved through Pass 1. Verified via the create_plan header comment (“Vars still correspond to the parser’s numbering. This will be fixed later by setrefs.c”) and by the fact that no createplan.c routine writes INNER_VAR/OUTER_VAR into a Var’s varno — those constants are written only in search_indexed_tlist_for_var / set_dummy_tlist_references in setrefs.c.
  • CP_* flags. The four flags and their hex values are defined at the top of createplan.c (lines 70–73) exactly as quoted; CP_IGNORE_TLIST is 0x0008 per the header comment block.
  • Physical-tlist gate. use_physical_tlist returns false on CP_EXACT_TLIST | CP_SMALL_TLIST and on system-column / whole-row Var requirements, as quoted. The fallback to build_path_tlist on build_physical_tlist returning NIL (dropped columns) is present in create_scan_plan.
  • NestLoop param machinery. create_nestloop_plan pushes outerrelids into root->curOuterRels before recursing into the inner child and restores it afterward; replace_nestloop_params_mutator keys replacement on bms_is_member(var->varno, root->curOuterRels). Both verified verbatim.
  • Special varnos. INNER_VAR = -1, OUTER_VAR = -2, INDEX_VAR = -3 and IS_SPECIAL_VARNO(varno) = ((int)(varno) < 0) confirmed in primnodes.h (lines 242–247).
  • Var → slot rewrite. search_indexed_tlist_for_var sets newvar->varno = newvarno (the passed-in OUTER_VAR/INNER_VAR) and newvar->varattno = vinfo->resno (the child output position), verified at the match branch.
  • initPlan vs SubPlan. build_subplan sets isInitPlan = true for parParam == NIL plus EXISTS/EXPR/ARRAY/ROWCOMPARE and the parParam == NIL arm of MULTIEXPR; ANY_SUBLINK uncorrelated may set useHashTable; the final lappend(root->init_plans, splan) runs only when isInitPlan. Verified.
  • SS_attach_initplans is a one-liner assigning plan->initPlan = root->init_plans; confirmed at line 2353. The comment in create_plan notes initPlans are attached only to the topmost node of the query level, which SS_finalize_plan’s header comment also relies on.
  • REL_18 specifics. The varreturningtype check in fix_join_expr_mutator (for RETURNING old/new Vars, a PG 18 MERGE/ RETURNING feature) is present and quoted; no removed-feature symbols (e.g. no XLOG2 rmgr, no B_DATACHECKSUMSWORKER_*) appear in these files, consistent with the REL_18 constraint.

One caveat worth flagging: line numbers in the position table are hints scoped to this revision. The symbol names are the durable anchors; a reformatting commit will move the numbers without touching the names.

Beyond PostgreSQL — Comparative Designs & Research Frontiers

Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”

PostgreSQL’s “lower the winning Path, then resolve Vars positionally” pipeline is one well-trodden point in a larger design space. The contrasts below sharpen what is essential versus what is a PostgreSQL convention.

Volcano/Cascades: lowering is implementation rules, not a separate phase

Section titled “Volcano/Cascades: lowering is implementation rules, not a separate phase”

In a Cascades-style optimizer (Microsoft SQL Server, Greenplum’s ORCA, Apache Calcite’s Volcano planner), there is no clean “Path tree → Plan tree” lowering step. The search memo already holds physical group expressions, and the chosen physical plan is extracted directly from the memo. Operator selection happens during search via implementation rules, not afterward. PostgreSQL instead keeps a deliberately simple System-R-style bottom-up join search (postgres-planner-overview.md) that produces lightweight Paths, and concentrates all the “make it runnable” work into createplan.c. The trade is clear: PostgreSQL’s lowering is simpler to read and reason about, but it cannot make operator-implementation decisions that depend on properties only known after the global plan shape is fixed.

Positional binding vs. name-based execution

Section titled “Positional binding vs. name-based execution”

The Var → (INNER_VAR/OUTER_VAR, resno) rewrite is the executor’s performance contract: at run time, reading a column is an array index, never a catalog lookup or a name comparison. Some interpreted or research engines keep columns name-addressed for flexibility and pay the lookup cost per access; PostgreSQL pays a whole second tree walk at plan time (setrefs.c) to buy O(1) column access at execution time, every tuple, forever. This is the same philosophy as JIT-compiled tuple deforming in postgres-expression-eval.md: spend plan-time cycles to make per-tuple cycles cheap.

HyPer and Umbra (Neumann, Efficiently Compiling Efficient Query Plans for Modern Hardware, VLDB 2011) push the idea further: instead of producing an interpreted operator tree, they compile the plan to machine code, fusing pipelines so that a tuple is pushed through a chain of operators with no per-operator virtual call. In that world the “plan” is a code-generation IR, and the analogue of setrefs.c’s positional binding is register/stack-slot allocation in the generated code. PostgreSQL’s REL_18 baseline remains an interpreted iterator tree (its LLVM JIT compiles expressions, not the operator tree), so the Plan node stays a durable interpretable structure — which is precisely why positional Var resolution must be baked in at plan time.

Because set_plan_references produces a Plan whose every reference is resolved to offsets and whose range table is flattened and frozen, the finished Plan is immutable and reusable. This is what lets plancache.c cache a prepared statement’s plan and re-execute it across transactions and parallel workers — each execution builds a fresh mutable PlanState tree (postgres-executor.md) over the shared read-only Plan. The dependency lists set_plan_references accumulates (glob->relationOids, glob->invalItems) are exactly what plan-cache invalidation keys on when a referenced relation or function changes. Engines without a separate immutable-plan artifact (those that re-plan on every execution) avoid this bookkeeping but forfeit the caching win.

PostgreSQL’s initPlan/SubPlan split (subselect.c) is a local decision: a subquery is hoisted to an initPlan only when it is already uncorrelated. It does not aggressively decorrelate — rewrite a correlated subquery into a join — beyond the pull-up transforms done earlier in the rewriter/planner for the cases it recognizes (ANY/EXISTS → semijoin). Research and commercial systems (the classic work is Neumann & Kemper, Unnesting Arbitrary Queries, BTW 2015) decorrelate arbitrary nested queries into a single dependent join that the cost optimizer can then plan freely, often a large win for analytic workloads. PostgreSQL’s per-tuple SubPlan is the conservative fallback when decorrelation does not fire — correct always, but potentially O(outer rows) re-executions of the subquery.

PostgreSQL deliberately favors a readable, debuggable, two-pass lowering with an immutable cacheable output over the fused-search-and-codegen designs of compilation-first engines. For the workloads it targets — mixed OLTP with moderate analytics, heavy reliance on prepared-statement plan caching — the separation of createplan.c (structure) from setrefs.c (binding) from subselect.c (subquery wiring) keeps each concern independently testable, which matches the engineering values of the rest of the tree documented in this code-analysis/postgres/ series.

  • Source tree: /data/hgryoo/references/postgres, REL_18_STABLE at commit 273fe94852b3a7e34fd171e8abdf1481beb302fa, read 2026-06-05. Core only (no contrib).
    • src/backend/optimizer/plan/createplan.c — Path → Plan lowering (Pass 1).
    • src/backend/optimizer/plan/setrefs.cset_plan_references and positional Var binding (Pass 2).
    • src/backend/optimizer/plan/subselect.c — initPlan/SubPlan construction and SS_attach_initplans.
    • src/include/nodes/primnodes.hINNER_VAR/OUTER_VAR/INDEX_VAR, IS_SPECIAL_VARNO, Var struct.
    • src/include/nodes/plannodes.h, pathnodes.hPlan / Path node definitions.
  • Silberschatz, Korth, Sudarshan, Database System Concepts, 7e, ch. 15 “Query Processing” and ch. 16 “Query Optimization” (esp. §16.1 cost-based plan choice, §16.4.4 nested-subquery optimization). Capture: knowledge/research/dbms-general/database-system-concepts.md.
  • Selinger et al. 1979, Access Path Selection in a Relational Database Management System (System R) — the bottom-up join search whose Paths this module lowers. See .omc/plans/postgres-paper-bibliography.mddbms-papers/systemr-optimizer.md.
  • Neumann, Efficiently Compiling Efficient Query Plans for Modern Hardware, VLDB 2011 — compilation-first contrast.
  • Neumann & Kemper, Unnesting Arbitrary Queries, BTW 2015 — subquery decorrelation contrast.
  • Cross-references within this series: postgres-planner-overview.md (join search that produces the Path), postgres-path-generation.md (how individual Paths are built and costed), postgres-executor.md (the consumer of the finished Plan), postgres-expression-eval.md (how the resolved expressions are evaluated at run time).