Skip to content

PostgreSQL Path Generation — Scan and Join Path Builders, add_path Pruning, and Pathkeys

Contents:

A cost-based optimizer must enumerate physical access strategies for each relation before it can rank them. The System R optimizer (Selinger et al., “Access Path Selection in a Relational Database Management System,” SIGMOD 1979, captured at research/dbms-papers/systemr-optimizer.md) named the enumeration unit the access path: for a base relation, a path is a particular combination of scan method, applicable predicates, and output sort order; for a join, a path is a particular pairing of outer and inner paths with a specific join algorithm. The key insight in System R is that paths are cheap to generate and discard — you evaluate thousands of candidates but build only the one winner into an executable plan.

Database Internals (Petrov, ch. 7) frames this as plan-space enumeration: the optimizer builds a lattice of candidate strategies bottom-up, keeping only the Pareto-optimal frontier at each level (cheapest total cost, cheapest startup cost, cheapest for each useful sort order). The Pareto-optimality criterion — keep a candidate only if no other candidate dominates it on every relevant dimension — is the pruning rule that makes the search tractable.

Two dimensions that every access-path system must manage are:

  1. Physical access strategies for a single table. For a heap relation, the choices are sequential scan, index scan (one per applicable index), index-only scan, bitmap index scan (possibly merging multiple indexes via AND/OR), and TID scan. Each has a different startup cost, total cost, and output sort order.

  2. Sort order as a first-class property. A plan that is more expensive than a competitor but emits rows pre-sorted on a join key or an ORDER BY column may still win overall by eliminating a downstream sort. System R calls these interesting sort orders and explicitly keeps a cheapest path for each one alongside the cheapest unordered path. This is the conceptual seed of PostgreSQL’s pathkeys.

The access-path layer sits between two sibling concerns: cost estimation (which assigns numeric cost to each candidate — costsize.c, postgres-cost-model.md) and join-order enumeration (which pairs and sequences the base-rel paths into join paths — joinpath.c, joinrels.c, postgres-join-ordering.md). This document covers only the access-path generation and pruning machinery — the scan-level builders, add_path, and the pathkeys system.

Per-relation candidate list with dominance pruning

Section titled “Per-relation candidate list with dominance pruning”

Every cost-based optimizer maintains a candidate list per relation that accumulates competing strategies as they are discovered and prunes dominated ones on the fly. The canonical shape is: a list (or priority queue) ordered by total cost; when a new candidate arrives, scan the list and discard any existing entry that the new candidate dominates on every relevant dimension. If any existing entry dominates the new candidate, discard the new one instead. The surviving frontier is then available to higher join levels.

The dimensions on which dominance is assessed vary by engine: cost (startup and total), output sort order, required outer relations (parameterization), row count, and in modern engines also parallel-safety. The wider the set of dimensions, the more candidates survive (because it is harder to dominate on all axes simultaneously), but the more thorough the search.

Separation of scan-path generation from join-path generation

Section titled “Separation of scan-path generation from join-path generation”

Practically all serious optimizers treat base-relation scanning as a distinct sub-problem from join ordering. Base-rel paths are fully generated and pruned first; join paths then compose from the surviving base-rel winners. This separation is clean because base-rel paths carry no dependency on the join order — a sequential scan of table A costs the same regardless of which table it will later be joined to, with one exception: parameterized paths (index scans whose index clauses reference values from another relation) do depend on which outer relation is available. Most optimizers therefore treat unparameterized base-rel paths as globally reusable and parameterized ones as join-level artifacts.

Index path generation: matching clauses to index columns

Section titled “Index path generation: matching clauses to index columns”

Index path generation is the most complex part of access-path enumeration. The standard engineering pattern has three steps:

  1. Match WHERE/ON predicates to index columns. A predicate of the form col OP constant (or col OP col2 for parameterized paths) can use index column col if the operator OP is in the index’s operator class. Only predicates that match the index’s key-column prefix can actually prune tuples via the index.

  2. Decide between regular scan, index-only scan, and bitmap scan. A regular index scan follows the index leaf entries and fetches heap pages one at a time in index order. An index-only scan reads the index without visiting the heap at all (possible when the index covers all needed columns). A bitmap scan first collects all matching TIDs into a bitmap, then fetches heap pages in physical order — a better strategy when many rows match and the selectivity is moderate. Engines typically generate all three variants and let the cost model pick.

  3. Handle partial indexes and EquivalenceClass-derived predicates. A partial index (one with a WHERE predicate in its definition) is only useful if the query’s predicates imply the index predicate. An EquivalenceClass-derived predicate (e.g., if a = b and b = 5, we can generate a = 5 for an index on a) can open additional index paths not explicitly named in the original query.

Pathkeys: sort order as a canonical algebraic structure

Section titled “Pathkeys: sort order as a canonical algebraic structure”

The classical treatment of interesting sort orders (System R) is ad hoc: the planner tests a small set of pre-identified orderings. A more principled approach canonicalizes sort order into a data structure that represents equivalence classes of expressions known to be equal: if a.x = b.y (a join predicate or filter), then any sort order on a.x is the same as a sort order on b.y. An engine that tracks this can recognize that an index scan on a.x satisfies an ORDER BY b.y without an extra sort, because a.x ≡ b.y in the relevant join context.

Theory / patternPostgreSQL name
Per-relation candidate listRelOptInfo.pathlist
Dominance pruning on addadd_path
Cheapest-path cache per dimensioncheapest_startup_path, cheapest_total_path, cheapest_parameterized_paths (in RelOptInfo)
Scan path for base rel (sequential)Path with pathtype = T_SeqScan; built by create_seqscan_path
Scan path for base rel (index)IndexPath; built by build_index_paths inside create_index_paths
Bitmap multi-index scanBitmapHeapPath wrapping BitmapAndPath/BitmapOrPath of IndexPaths
Parameterized path (index join clause)IndexPath with non-NULL param_info
TID scanTidPath / TidRangePath; built by create_tidscan_paths
Parallel scan fragmentpartial path in RelOptInfo.partial_pathlist; gathered by create_gather_path
Interesting sort orderPath.pathkeys — list of PathKey
Canonical sort-order expressionPathKey referencing an EquivalenceClass
Equivalence class of equal expressionsEquivalenceClass in PlannerInfo.eq_classes

Overview: the two-pass structure of make_one_rel

Section titled “Overview: the two-pass structure of make_one_rel”

make_one_rel (allpaths.c:171) is the function query_planner calls to produce the final join RelOptInfo. It has a strictly ordered two-pass structure:

// make_one_rel — src/backend/optimizer/path/allpaths.c
RelOptInfo *
make_one_rel(PlannerInfo *root, List *joinlist)
{
/* Pass 1: size estimates (needed by parameterized path cost) */
set_base_rel_sizes(root);
/* Accumulate total_table_pages for cost model */
total_pages = 0;
for (rti = 1; rti < root->simple_rel_array_size; rti++) { ... }
root->total_table_pages = total_pages;
/* Pass 2: generate all scan paths for base rels */
set_base_rel_pathlists(root);
/* Join search: produce the final joinrel */
rel = make_rel_from_joinlist(root, joinlist);
return rel;
}

set_base_rel_sizes must run before set_base_rel_pathlists because parameterized index paths need the cardinality estimate of the outer relation to compute their loop-count-adjusted cost. make_rel_from_joinlist is the entry point for join ordering (covered by postgres-join-ordering.md).

set_base_rel_pathlists: the per-RTE dispatch

Section titled “set_base_rel_pathlists: the per-RTE dispatch”

set_base_rel_pathlists (allpaths.c:333) iterates root->simple_rel_array and calls set_rel_pathlist for each base rel. set_rel_pathlist (allpaths.c:469) dispatches by RTE kind:

// set_rel_pathlist — src/backend/optimizer/path/allpaths.c
static void
set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte)
{
if (IS_DUMMY_REL(rel))
/* already proved empty, nothing to do */;
else if (rte->inh)
set_append_rel_pathlist(root, rel, rti, rte); /* partitioned/inherited */
else {
switch (rel->rtekind) {
case RTE_RELATION:
if (rte->relkind == RELKIND_FOREIGN_TABLE)
set_foreign_pathlist(root, rel, rte);
else if (rte->tablesample != NULL)
set_tablesample_rel_pathlist(root, rel, rte);
else
set_plain_rel_pathlist(root, rel, rte); /* ordinary heap */
break;
case RTE_FUNCTION: set_function_pathlist(...); break;
case RTE_VALUES: set_values_pathlist(...); break;
/* RTE_SUBQUERY, RTE_CTE, etc: handled in set_rel_size */
/* ... */
}
}
/* Plugin hook: extensions can add/remove paths here */
if (set_rel_pathlist_hook)
(*set_rel_pathlist_hook)(root, rel, rti, rte);
/* Collect partial paths into Gather paths (parallelism) */
if (rel->reloptkind == RELOPT_BASEREL &&
!bms_equal(rel->relids, root->all_query_rels))
generate_useful_gather_paths(root, rel, false);
/* Pin cheapest paths into RelOptInfo fields */
set_cheapest(rel);
}

Three things to notice. First, set_rel_pathlist_hook is the extensibility point for custom scan providers (CustomPath). Second, parallel partial paths are collected here via generate_useful_gather_paths — this wraps any partial_pathlist entries in GatherPath or GatherMergePath nodes and adds them to the main pathlist. Third, set_cheapest runs at the end of every base-rel dispatch, pinning cheapest_total_path before the join search begins.

set_plain_rel_pathlist: the three scan families

Section titled “set_plain_rel_pathlist: the three scan families”

For an ordinary heap table, set_plain_rel_pathlist (allpaths.c:768) adds exactly three families of paths:

// set_plain_rel_pathlist — src/backend/optimizer/path/allpaths.c
static void
set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel, RangeTblEntry *rte)
{
Relids required_outer = rel->lateral_relids;
/* TID scan (forced if CurrentOfExpr is present; returns if so) */
if (create_tidscan_paths(root, rel))
return;
/* Sequential scan — always added */
add_path(rel, create_seqscan_path(root, rel, required_outer, 0));
/* Parallel sequential scan fragments */
if (rel->consider_parallel && required_outer == NULL)
create_plain_partial_paths(root, rel);
/* All applicable index paths */
create_index_paths(root, rel);
}

The sequential scan is unconditional (it is always a valid, if expensive, choice). create_tidscan_paths returns true only when a CurrentOfExpr is present (cursor-positioned update/delete), in which case no other path is valid and the function returns immediately. create_index_paths is where most of the work happens.

create_index_paths: the index path generator

Section titled “create_index_paths: the index path generator”

create_index_paths (indxpath.c:240) iterates rel->indexlist (the list of IndexOptInfo structs describing every index on the relation) and for each index runs a three-phase clause-matching pipeline:

// create_index_paths — src/backend/optimizer/path/indxpath.c
void
create_index_paths(PlannerInfo *root, RelOptInfo *rel)
{
/* For each index on this relation: */
foreach(lc, rel->indexlist)
{
IndexOptInfo *index = (IndexOptInfo *) lfirst(lc);
/* Skip partial indexes whose predicate the query doesn't satisfy */
if (index->indpred != NIL && !index->predOK)
continue;
/* Phase 1: match WHERE/baserestrictinfo to index columns */
match_restriction_clauses_to_index(root, index, &rclauseset);
get_index_paths(root, rel, index, &rclauseset, &bitindexpaths);
/* Phase 2: match join clauses (loose, not yet in an EC) */
match_join_clauses_to_index(root, rel, index,
&jclauseset, &joinorclauses);
/* Phase 3: match EquivalenceClass-derived join predicates */
match_eclass_clauses_to_index(root, index, &eclauseset);
/* Combine phases 2+3 into parameterized index paths */
if (jclauseset.nonempty || eclauseset.nonempty)
consider_index_join_clauses(root, rel, index,
&rclauseset, &jclauseset,
&eclauseset, &bitjoinpaths);
}
/* Merge restriction bitmaps into one BitmapHeapPath */
if (bitindexpaths != NIL) {
bitmapqual = choose_bitmap_and(root, rel, bitindexpaths);
bpath = create_bitmap_heap_path(root, rel, bitmapqual,
rel->lateral_relids, 1.0, 0);
add_path(rel, (Path *) bpath);
if (rel->consider_parallel && rel->lateral_relids == NULL)
create_partial_bitmap_paths(root, rel, bitmapqual);
}
/* Merge join bitmaps, one BitmapHeapPath per distinct parameterization */
if (bitjoinpaths != NIL) { /* ... for each distinct outer-rel set ... */ }
}

Phase 1 (match_restriction_clauses_to_index) walks rel->baserestrictinfo and checks each RestrictInfo against each index key column. A match requires the operator to be in the index’s operator family and the indexable column to be on the appropriate side. get_index_paths then calls build_index_paths which emits IndexPath and IndexPath (index-only) objects via add_path for regular scans, and accumulates IndexPath objects into bitindexpaths for later bitmap aggregation.

Phases 2 and 3 find parameterized index paths — index scans where the index clause references a value from another relation (e.g., WHERE a.x = b.y can use an index on a.x if b.y is available from an outer rel in a nested-loop join). Phase 2 matches explicit join clauses; phase 3 matches predicates that can be derived from the query’s EquivalenceClass graph.

The bitmap path assembly at the end (choose_bitmap_and) is important: when multiple indexes each cover a different subset of the qualifying rows, combining their TID bitmaps with AND reduces the set of heap pages to fetch. The cost model in costsize.c evaluates whether the combined bitmap scan is cheaper than any individual index scan.

Figure 1 — create_index_paths clause pipeline per index

flowchart TD
  A["create_index_paths\n(rel->indexlist loop)"]
  B["match_restriction_clauses_to_index\n(WHERE / baserestrictinfo)"]
  C["get_index_paths → build_index_paths\n(IndexPath, IndexOnlyScan, bitmap)"]
  D["match_join_clauses_to_index\n(explicit join clauses)"]
  E["match_eclass_clauses_to_index\n(EC-derived join predicates)"]
  F["consider_index_join_clauses\n(parameterized IndexPath + bitmap)"]
  G["choose_bitmap_and\n→ BitmapHeapPath (restriction)"]
  H["choose_bitmap_and × parameterization\n→ BitmapHeapPath (join)"]

  A --> B --> C
  A --> D
  A --> E
  D & E --> F
  C -->|"bitindexpaths"| G
  F -->|"bitjoinpaths"| H

Figure 1 — The three-phase clause-matching pipeline inside create_index_paths. Each pass through the outer foreach(lc, rel->indexlist) loop runs phases 1–3 for one index. Bitmap paths from all indexes accumulate across the loop and are assembled into a single BitmapHeapPath (restriction) or one per parameterization (join) at the end.

Every path is a Path node with a pathtype tag that identifies the concrete scan or join algorithm. Specialized paths embed Path as their first member (the standard C struct-inheritance pattern):

// Path (base) — src/include/nodes/pathnodes.h
typedef struct Path {
NodeTag type;
NodeTag pathtype; /* T_SeqScan, T_IndexScan, T_NestLoop, … */
RelOptInfo *parent; /* the relation this path produces */
PathTarget *pathtarget; /* output column list (default = rel->reltarget) */
ParamPathInfo *param_info; /* NULL = unparameterized */
bool parallel_aware;
bool parallel_safe;
int parallel_workers;
Cardinality rows;
int disabled_nodes; /* count of GUC-disabled plan nodes used */
Cost startup_cost;
Cost total_cost;
List *pathkeys; /* sort order: list of PathKey */
} Path;
// IndexPath — src/include/nodes/pathnodes.h
typedef struct IndexPath {
Path path; /* pathtype = T_IndexScan or T_IndexOnlyScan */
IndexOptInfo *indexinfo;
List *indexclauses; /* list of IndexClause */
List *indexorderbys; /* ORDER BY expressions usable by index AM */
ScanDirection indexscandir;
Cost indextotalcost;
Selectivity indexselectivity;
} IndexPath;
// BitmapHeapPath — src/include/nodes/pathnodes.h
typedef struct BitmapHeapPath {
Path path; /* pathtype = T_BitmapHeapScan; pathkeys = NIL */
Path *bitmapqual; /* IndexPath, BitmapAndPath, or BitmapOrPath */
} BitmapHeapPath;

The disabled_nodes field (new in recent PG releases) counts how many nodes in the path tree use a GUC-disabled plan type (e.g., enable_seqscan = off). add_path treats disabled_nodes as a higher-order cost component: a path with fewer disabled nodes always beats one with more, regardless of estimated cost. This is how enable_seqscan = off actually works — it does not remove seqscan paths from the pathlist; it adds a large disabled_nodes penalty so they lose the add_path comparison.

Join paths embed a JoinPath struct (which extends Path with outer/inner subpaths and join type), and specializations for each algorithm:

// NestPath / MergePath / HashPath — src/include/nodes/pathnodes.h
typedef struct NestPath { JoinPath jpath; } NestPath;
typedef struct MergePath {
JoinPath jpath;
List *path_mergeclauses;
List *outersortkeys; /* NIL = outer is already sorted */
List *innersortkeys; /* NIL = inner is already sorted */
int outer_presorted_keys; /* for incremental sort eligibility */
bool skip_mark_restore;
bool materialize_inner;
} MergePath;
typedef struct HashPath {
JoinPath jpath;
List *path_hashclauses;
int num_batches;
Cardinality inner_rows_total;
} HashPath;

add_path (pathnode.c:464) is called every time a new path candidate is constructed. It implements the Pareto-dominance test against every existing path in rel->pathlist. A new path dominates an old path if the new one is at least as good on all of cost, pathkeys, parameterization, row count, and parallel-safety — and strictly better on at least one:

// add_path — src/backend/optimizer/util/pathnode.c
void
add_path(RelOptInfo *parent_rel, Path *new_path)
{
bool accept_new = true;
/* Parameterized paths are treated as having NIL pathkeys —
they cannot win on sort order, only on cost */
new_path_pathkeys = new_path->param_info ? NIL : new_path->pathkeys;
foreach(p1, parent_rel->pathlist) {
Path *old_path = (Path *) lfirst(p1);
bool remove_old = false;
costcmp = compare_path_costs_fuzzily(new_path, old_path, STD_FUZZ_FACTOR);
/* If costs differ in both directions, keep both (no dominance) */
if (costcmp != COSTS_DIFFERENT) {
keyscmp = compare_pathkeys(new_path_pathkeys, old_path_pathkeys);
if (keyscmp != PATHKEYS_DIFFERENT) {
outercmp = bms_subset_compare(PATH_REQ_OUTER(new_path),
PATH_REQ_OUTER(old_path));
/* check all combinations of costcmp, keyscmp, outercmp,
rows, parallel_safe to set remove_old or accept_new=false */
/* ... */
}
}
if (remove_old) { /* delete old from pathlist, pfree unless IndexPath */ }
else if (new_path->total_cost >= old_path->total_cost) insert_at++;
if (!accept_new) break;
}
if (accept_new)
parent_rel->pathlist = list_insert_nth(parent_rel->pathlist,
insert_at, new_path);
else
if (!IsA(new_path, IndexPath)) pfree(new_path);
}

Three design choices embedded in add_path are worth noting:

  1. Parameterized paths get NIL pathkeys. A parameterized path cannot win on sort order because doing so would allow a less-parameterized path to be discarded just because its parameterized competitor happens to be sorted — which would leave the planner without an unparameterized option when the join topology changes. Treating parameterized paths as unsorted limits how many of them survive, keeping the pathlist tractable.

  2. Fuzzy cost comparison. compare_path_costs_fuzzily uses a small fuzz factor (STD_FUZZ_FACTOR = 1.01) so that two paths whose costs differ by less than 1% are treated as equal. This prevents platform-specific floating-point rounding from producing spuriously different plans.

  3. IndexPath nodes are never pfree’d. IndexPath objects may be referenced both as standalone paths and as children of BitmapHeapPath nodes. add_path therefore skips pfree for IndexPath nodes when rejecting them (to avoid a dangling pointer in any BitmapHeapPath that already holds a pointer to it).

After add_path has been called for all paths of a relation, set_cheapest (pathnode.c:272) scans the surviving pathlist and pins four pointers into RelOptInfo:

// set_cheapest — src/backend/optimizer/util/pathnode.c
void
set_cheapest(RelOptInfo *parent_rel)
{
/* Scan pathlist (sorted by disabled_nodes then total_cost) */
foreach(p, parent_rel->pathlist) {
Path *path = (Path *) lfirst(p);
if (path->param_info) {
parameterized_paths = lappend(parameterized_paths, path);
/* track best_param_path by (least parameterization, then cost) */
} else {
/* update cheapest_startup_path and cheapest_total_path */
}
}
/* If no unparameterized path exists, use the best parameterized one */
parent_rel->cheapest_startup_path = cheapest_startup_path;
parent_rel->cheapest_total_path = cheapest_total_path;
parent_rel->cheapest_unique_path = NULL; /* computed on demand */
parent_rel->cheapest_parameterized_paths = parameterized_paths;
}

cheapest_parameterized_paths is a list (not a single pointer) because different outer-rel parameterizations are incomparable — a path parameterized on {A} and a path parameterized on {B} are both kept because neither is a subset of the other. The join-path builders will pick from this list based on which outer rel they are currently considering.

A PathKey (pathnodes.h:1595) represents one sort key in a path’s output ordering:

// PathKey — src/include/nodes/pathnodes.h
typedef struct PathKey {
NodeTag type;
EquivalenceClass *pk_eclass; /* EC containing the sort expression */
Oid pk_opfamily; /* btree opfamily defining the ordering */
CompareType pk_cmptype; /* ascending or descending */
bool pk_nulls_first;
} PathKey;

The core insight is that pk_eclass is an EquivalenceClass — a set of expressions that the optimizer has proved are mutually equal in the query context. If a.x = b.y is a join condition, both a.x and b.y belong to the same EC, so a PathKey referencing that EC represents a sort order on either expression. An index scan on a.x and an index scan on b.y will produce PathKey nodes that are pointer-equal (because make_canonical_pathkey returns a canonical shared instance per EC), so compare_pathkeys can determine they match by pointer comparison:

// make_canonical_pathkey — src/backend/optimizer/path/pathkeys.c
PathKey *
make_canonical_pathkey(PlannerInfo *root,
EquivalenceClass *eclass, Oid opfamily,
CompareType cmptype, bool nulls_first)
{
/* Chase ec_merged links to the canonical EC */
while (eclass->ec_merged) eclass = ecase->ec_merged;
/* Return pre-existing canonical PathKey if one matches */
foreach(lc, root->canon_pathkeys) {
pk = (PathKey *) lfirst(lc);
if (eclass == pk->pk_eclass && opfamily == pk->pk_opfamily &&
cmptype == pk->pk_cmptype && nulls_first == pk->pk_nulls_first)
return pk;
}
/* Allocate a new canonical PathKey in planner_cxt */
pk = makeNode(PathKey); /* ... fill fields ... */
root->canon_pathkeys = lappend(root->canon_pathkeys, pk);
return pk;
}

This canonicalization means pathkeys_contained_in (pathkeys.c:343) can determine whether one path’s sort order satisfies another’s requirements in O(n) list comparisons with pointer equality, rather than deep expression tree comparisons.

Figure 2 — PathKey canonicalization and EC-based sort-order matching

flowchart LR
  subgraph EC["EquivalenceClass {a.x, b.y}"]
    AX["a.x"]
    BY["b.y"]
  end
  PK["PathKey\npk_eclass → EC\npk_opfamily = int4_ops\npk_cmptype = ASC"]
  IX["IndexPath on a.x\npathkeys = [PK]"]
  IY["IndexPath on b.y\npathkeys = [PK]"]
  MJ["MergePath\noutersortkeys = [PK]\ninnersortkeys = [PK]"]
  OB["ORDER BY a.x\nrequired_pathkeys = [PK]"]

  EC --> PK
  PK --> IX
  PK --> IY
  PK --> MJ
  PK --> OB

Figure 2 — A single canonical PathKey is shared by every Path node whose sort output is a.x or b.y (equivalent expressions). The merge join and the ORDER BY reference the same PathKey pointer, so pathkeys_contained_in matches them by pointer equality in O(1) per key.

For partitioned tables and inheritance hierarchies, set_append_rel_pathlist (allpaths.c:1251) recurses into each child relation (calling set_rel_pathlist per child), then assembles the results into AppendPath or MergeAppendPath. A MergeAppendPath is used when all children produce output in the same sort order (e.g., each child has a matching index) and the merge can avoid a top-level sort. This is a key optimization for partition-wise operations: if the query has an ORDER BY that matches the partition key, the planner can avoid sorting by using MergeAppend over per-partition index scans.

Partition pruning happens before path generation, during set_rel_size — partitions whose ranges cannot overlap the query’s filter are marked as dummy rels. When set_append_rel_pathlist iterates the children, it skips dummy rels, so no paths are ever generated for pruned partitions.

set_foreign_pathlist (allpaths.c:938) calls the FDW’s GetForeignPaths callback, which creates ForeignPath nodes using create_foreignscan_path (pathnode.c). The FDW decides how many paths to submit; it must call add_path for each. The FDW hook pattern is the same add_path API as core scan builders — custom paths compete directly against heap-scan and index-scan paths in the same pathlist.

Custom scan providers (set_rel_pathlist_hook) use the same mechanism. CustomPath embeds a methods pointer to a CustomPathMethods struct whose PlanCustomPath callback is invoked by create_plan to emit the executable CustomScan node.

allpaths.c — make_one_rel and per-RTE dispatch

Section titled “allpaths.c — make_one_rel and per-RTE dispatch”
  • make_one_rel — top-level entry for base+join path generation; drives size estimation pass then path-generation pass; calls make_rel_from_joinlist for join ordering.
  • set_base_rel_sizes — first pass over base rels: calls set_rel_size per rel to populate rows, pages, consider_parallel.
  • set_base_rel_pathlists — second pass: calls set_rel_pathlist per base rel.
  • set_rel_pathlist — dispatches by rtekind/relkind to the appropriate path builder; calls set_rel_pathlist_hook, generate_useful_gather_paths, and set_cheapest at the end.
  • set_plain_rel_pathlist — for ordinary heap tables: TID, seqscan, parallel seqscan, index paths.
  • create_plain_partial_paths — emits parallel-worker-count partial seqscan paths into rel->partial_pathlist.
  • set_append_rel_pathlist — for partitioned/inherited tables: per-child recursion, then AppendPath/MergeAppendPath assembly.
  • set_foreign_pathlist — calls FDW GetForeignPaths callback.
  • set_subquery_pathlist — for subquery RTEs: selects a pushdown-safe plan from the recursively planned subquery, applies qual pushdown.
  • create_index_paths — outer loop over rel->indexlist; coordinates three clause-matching phases; assembles bitmap paths.
  • get_index_paths — for one index and one clause set, decides scan direction and calls build_index_paths for both forward and backward scan.
  • build_index_paths — emits IndexPath (IndexScan or IndexOnlyScan) via add_path; accumulates IndexPath to bitmap candidates list.
  • match_restriction_clauses_to_index — matches rel->baserestrictinfo against index key columns; fills IndexClauseSet.
  • match_join_clauses_to_index — matches rel->joininfo (loose join clauses) against index key columns; fills a separate IndexClauseSet.
  • match_eclass_clauses_to_index — for each EquivalenceClass that mentions this rel’s columns, generates indexable join predicates; fills eclauseset.
  • consider_index_join_clauses — for each distinct parameterization derived from phases 2+3, calls build_index_paths to produce parameterized IndexPath nodes.

pathnode.c — path constructors and pruning

Section titled “pathnode.c — path constructors and pruning”
  • set_cheapest — scans pathlist after all paths are added; sets cheapest_startup_path, cheapest_total_path, cheapest_parameterized_paths.
  • add_path — dominance pruning; inserts or discards a new path; the sorted insertion order (by disabled_nodes then total_cost) makes the scan fast.
  • add_partial_path — same pruning for partial_pathlist (parallel fragments).
  • create_seqscan_path — builds a plain Path node for a sequential scan with pathtype = T_SeqScan; cost from cost_seqscan.
  • create_nestloop_path — builds a NestPath; requires outer and inner path as inputs; cost from cost_nestloop.
  • create_mergejoin_path — builds a MergePath; computes sort requirements for both inputs; cost from cost_mergejoin.
  • create_hashjoin_path — builds a HashPath; cost from cost_hashjoin.
  • create_gather_path — wraps a partial path in a GatherPath for parallel execution; cost from cost_gather.

pathkeys.c — canonical sort-order tracking

Section titled “pathkeys.c — canonical sort-order tracking”
  • make_canonical_pathkey — returns (or creates) the unique PathKey for a given (EC, opfamily, cmptype, nulls_first) tuple; stored in root->canon_pathkeys.
  • pathkeys_contained_in — tests whether pathkeys list keys1 is a prefix of (or equal to) keys2; used to decide whether a sort is redundant.
  • build_join_pathkeys — derives the pathkeys for a join path from the outer path’s pathkeys and the available ECs after the join.
  • make_pathkeys_for_sortclauses — converts a SQL ORDER BY / GROUP BY clause list into canonical PathKey nodes for use as a required ordering.

Position hints (as of 2026-06-05, commit 273fe94)

Section titled “Position hints (as of 2026-06-05, commit 273fe94)”
SymbolFileLine
make_one_relsrc/backend/optimizer/path/allpaths.c171
set_base_rel_sizessrc/backend/optimizer/path/allpaths.c290
set_base_rel_pathlistssrc/backend/optimizer/path/allpaths.c333
set_rel_pathlistsrc/backend/optimizer/path/allpaths.c469
set_plain_rel_pathlistsrc/backend/optimizer/path/allpaths.c768
create_plain_partial_pathssrc/backend/optimizer/path/allpaths.c806
set_foreign_pathlistsrc/backend/optimizer/path/allpaths.c938
set_append_rel_pathlistsrc/backend/optimizer/path/allpaths.c1251
set_subquery_pathlistsrc/backend/optimizer/path/allpaths.c2529
create_index_pathssrc/backend/optimizer/path/indxpath.c240
get_index_pathssrc/backend/optimizer/path/indxpath.c716
build_index_pathssrc/backend/optimizer/path/indxpath.c810
match_restriction_clauses_to_indexsrc/backend/optimizer/path/indxpath.c2466
match_join_clauses_to_indexsrc/backend/optimizer/path/indxpath.c2482
match_eclass_clauses_to_indexsrc/backend/optimizer/path/indxpath.c2516
set_cheapestsrc/backend/optimizer/util/pathnode.c272
add_pathsrc/backend/optimizer/util/pathnode.c464
add_partial_pathsrc/backend/optimizer/util/pathnode.c798
create_seqscan_pathsrc/backend/optimizer/util/pathnode.c986
create_nestloop_pathsrc/backend/optimizer/util/pathnode.c2671
create_mergejoin_pathsrc/backend/optimizer/util/pathnode.c2768
create_hashjoin_pathsrc/backend/optimizer/util/pathnode.c2836
create_gather_pathsrc/backend/optimizer/util/pathnode.c2180
make_canonical_pathkeysrc/backend/optimizer/path/pathkeys.c56
pathkeys_contained_insrc/backend/optimizer/path/pathkeys.c343
build_join_pathkeyssrc/backend/optimizer/path/pathkeys.c1295
make_pathkeys_for_sortclausessrc/backend/optimizer/path/pathkeys.c1336
RelOptInfo (struct)src/include/nodes/pathnodes.h883
Path (struct)src/include/nodes/pathnodes.h1753
IndexPath (struct)src/include/nodes/pathnodes.h1842
BitmapHeapPath (struct)src/include/nodes/pathnodes.h1917
EquivalenceClass (struct)src/include/nodes/pathnodes.h1442
PathKey (struct)src/include/nodes/pathnodes.h1595
NestPath (struct)src/include/nodes/pathnodes.h2225
MergePath (struct)src/include/nodes/pathnodes.h2271
HashPath (struct)src/include/nodes/pathnodes.h2292
  • make_one_rel performs a strict two-pass structure: sizes before paths. Confirmed by reading allpaths.c:171–232: set_base_rel_sizes is called unconditionally before set_base_rel_pathlists, and the total_table_pages accumulation loop runs between them. This ordering is required because parameterized index path costs depend on the cardinality of outer rels.

  • set_rel_pathlist_hook fires after the built-in builders but before set_cheapest. Confirmed at allpaths.c:540–562: the hook call and the generate_useful_gather_paths call both precede set_cheapest(rel). This means custom scan providers see all built-in paths already in pathlist and can add to or remove from it before the cheapest pointers are set.

  • add_path never pfrees IndexPath nodes. Confirmed at pathnode.c (~line 650): the free call is if (!IsA(old_path, IndexPath)) pfree(old_path). The comment explains the reason: IndexPath objects are shared with BitmapHeapPath children.

  • disabled_nodes is a higher-order cost component. Confirmed by reading add_path: the insert_at logic uses new_path->disabled_nodes > old_path->disabled_nodes as the primary sort key before comparing total_cost. This means enable_seqscan = off does not eliminate seqscan paths from the search — it just ensures they lose to any enabled alternative.

  • Parameterized paths get NIL pathkeys in add_path. Confirmed at pathnode.c:506: new_path_pathkeys = new_path->param_info ? NIL : new_path->pathkeys. The comment explains the design rationale.

  • make_canonical_pathkey chases ec_merged links to find the canonical EC before lookup. Confirmed at pathkeys.c:56–104. This ensures that after two ECs are merged (because a transitive equality was discovered), all PathKey nodes built before the merge still resolve to the same canonical EC.

  • create_index_paths skips partial indexes whose predOK flag is false. Confirmed at indxpath.c:268: if (index->indpred != NIL && !index->predOK) continue. predOK is set by check_index_predicates inside set_plain_rel_size (before set_plain_rel_pathlist is called), which tests whether the query’s baserestrictinfo implies the partial index’s predicate.

  1. Partial bitmap path generation threshold. create_partial_bitmap_paths is called only when rel->consider_parallel && rel->lateral_relids == NULL. The condition for lateral_relids == NULL is clear (LATERAL references cannot be parallelized at the base-rel level), but it is not immediately clear from the code why a LATERAL reference on a bitmap heap path specifically prevents partial execution — investigation path: trace the required_outer handling in create_partial_bitmap_paths and create_bitmap_heap_path.

  2. choose_bitmap_and cost model accuracy. The function selects the “best AND combination” of bitmap index paths using a greedy heuristic (not exhaustive search). The cost model for combined bitmap scans (in cost_bitmap_heap_scan) uses an approximation for correlated pages. Whether this approximation is accurate for highly correlated columns (e.g., a composite index scenario) is an open question. Investigation path: cost_bitmap_and_node in costsize.c.

  3. outer_presorted_keys in MergePath and incremental sort interaction. The outer_presorted_keys field enables incremental sort when the outer path is partially sorted. The planner sets this when building merge paths, but the interaction with IncrementalSortPath generation (which happens in grouping_planner for the upper rels) is not verified — it is unclear whether base-rel merge paths can also generate IncrementalSortPath on their outer input or whether that only applies to upper-rel paths.

Beyond PostgreSQL — Comparative Designs & Research Frontiers

Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”
  • Cascades / Columbia / Orca memo architecture. PostgreSQL uses a per-relation pathlist (System R style), where candidates are pruned as they arrive and the structure is destroyed as the search moves up. Cascades-family optimizers (Graefe 1995; implemented in SQL Server, Orca/Greenplum) use a memo that preserves all logical and physical expressions in a hash table of groups, enabling top-down search with memoization and allowing the search to revisit lower groups when a higher-level cost context changes. This is strictly more powerful for queries with many equivalent logical transformations but has higher memory overhead. A side-by-side comparison of add_path’s bottom-up pruning versus Cascades’ top-down rule-fire model would illustrate the tradeoff.

  • Adaptive query processing and re-optimization. PostgreSQL’s path generation is entirely static — all costs are estimated from statistics before execution begins. Systems like Eddies (Avnur & Hellerstein, SIGMOD 2000) and more recent work on runtime re-optimization (e.g., mid-query plan switches in SQL Server) replace or augment the static search with feedback from actual runtime row counts. PG18’s async I/O (postgres-aio.md) improves I/O throughput for plans the static optimizer already chose, but does not feed back actual I/O costs into re-planning.

  • ML-based cardinality estimation and learned cost models. The accuracy of path selection in create_index_paths depends entirely on the cardinality estimates from the statistics subsystem. Recent research (e.g., Kipf et al., “Learned Cardinalities,” CIDR 2019; Marcus et al., “Bao,” SIGMOD 2021) replaces or augments histogram-based estimates with ML models trained on query execution feedback. PostgreSQL’s extended statistics (postgres-extended- statistics.md) is the production step in this direction within the current architecture.

  • Partial index exploitation depth. PostgreSQL’s predOK check (check_index_predicates) tests whether the query’s filter implies the index predicate. A deeper exploitation would also generate filters to make more partial indexes applicable — this is related to the “semantic query optimization” literature (e.g., using integrity constraints to generate new predicates). No current RDBMS does this fully.

  • Partition-wise join and aggregate path generation. set_append_rel_pathlist currently generates AppendPath/MergeAppendPath. When consider_partitionwise_join is set on a RelOptInfo, the join path builders in joinpath.c generate partition-wise join paths (introduced in PG11). The interaction between pathkey canonicalization and partition key expressions (which must be in the same EC for partition-wise join to be legal) is an active area of development.

  • src/backend/optimizer/README — the authoritative design document for the entire optimizer; covers path/join-tree construction, EquivalenceClasses, outer-join ordering, Vars/PlaceHolderVars, and the optimizer function outline.
  • Selinger et al., “Access Path Selection in a Relational Database Management System,” SIGMOD 1979 — the System R template; captured at knowledge/research/dbms-papers/systemr-optimizer.md.
  • Database Internals (Petrov, ch. 7) — cost-based search, plan-space enumeration, Pareto-optimal frontier.
  • Database System Concepts (Silberschatz, 7e, ch. 16) — query optimization overview; cost estimation; dynamic programming join enumeration.
  • src/backend/optimizer/path/allpaths.cmake_one_rel, all per-RTE path builders
  • src/backend/optimizer/path/indxpath.c — index path generation pipeline
  • src/backend/optimizer/path/pathkeys.c — canonical pathkey construction and comparison
  • src/backend/optimizer/util/pathnode.cadd_path, set_cheapest, all path constructors
  • src/include/nodes/pathnodes.hRelOptInfo, Path, IndexPath, BitmapHeapPath, EquivalenceClass, PathKey, join path structs