Skip to content

PostgreSQL Query Processing — Section Overview

Contents:

This subcategory is the read/compute path: everything between a SQL string arriving in a backend and a Plan tree producing tuples. It is the classic five-stage pipeline — parse → analyze → rewrite → plan → execute — plus the four things that hang off that spine: parallel query, prepared statements and portals, planner statistics, and the Node infrastructure the whole engine builds its trees from.

The per-stage entry points are functions in tcop/postgres.c (pg_parse_query, pg_analyze_and_rewrite_fixedparams, pg_rewrite_query, pg_plan_queries), so that file is the cheapest place to see the pipeline laid out as straight-line code before diving into any one stage.

Scope, with the sharp boundaries this section does not cross:

  • Where the SQL came from is out of scope. The wire protocol that delivers the query string, the simple-vs-extended query messages, and authentication belong to client-protocol. This section starts once a backend has a query_string in hand.
  • What the executor’s scans actually touch is out of scope. When a SeqScan or IndexScan node pulls a tuple, it calls a table access method or index access method — and those, plus the buffer manager and on-page layout below them, are owned by storage-engine. This section describes the node that drives the scan; storage-engine owns the heap page it lands on.
  • Visibility and locking under the scan are out of scope. Whether a tuple a scan reads is visible to the current snapshot, and what locks the executor takes, belong to txn-recovery (MVCC, snapshots, procarray) and server-architecture (the lock manager). The executor asks; those sections answer.
  • DDL and utility statements are out of scope. pg_plan_queries only plans optimizable statements (SELECT/INSERT/UPDATE/DELETE/ MERGE); a CREATE TABLE is a utility statement routed through ProcessUtility, which lives in ddl-schema. This section is the optimizable-query path only.
  • The catalog the planner reads is out of scope as a subsystem. The optimizer’s cost model consumes per-column statistics and relation metadata, but the relcache/catcache machinery that serves them belongs to system-catalog. This section owns the consumer (selectivity estimation, extended statistics), not the cache.

Within scope, the load-bearing claims a reader should carry forward: plan generation has two representations — cheapest-first Path nodes during search, then a single executable Plan tree built from the winner; join ordering is dynamic programming with a GEQO genetic fallback once the join count crosses geqo_threshold; and the executor is a demand-pull tree where every node answers ExecProcNode by pulling from its children.

flowchart TB
  SQL["SQL text (query_string)"]

  subgraph FRONT["front of the pipeline — tcop/postgres.c drives it"]
    PARSE["postgres-parser<br/>pg_parse_query: scan.l + gram.y<br/>-> raw parse tree (RawStmt)"]
    ANALYZE["postgres-analyze-transform<br/>parse_analyze: name resolution,<br/>type coercion -> Query tree"]
    REWRITE["postgres-rewriter<br/>rule system: view expansion,<br/>RLS, DO INSTEAD -> Query list"]
  end
  SQL --> PARSE --> ANALYZE --> REWRITE

  subgraph PLAN["the planner / optimizer — optimizer/README"]
    PREP["postgres-prep-rewrites<br/>subquery pull-up, qual canon"]
    PATHS["postgres-path-generation<br/>candidate Paths, cheapest-first"]
    JOINS["postgres-join-ordering<br/>DP join search + GEQO fallback"]
    COST["postgres-cost-model<br/>costsize.c + selectivity"]
    CREATE["postgres-plan-creation<br/>winning Path -> executable Plan tree"]
    STATS["postgres-extended-statistics<br/>multi-column stats for selectivity"]
  end
  REWRITE --> PREP --> PATHS
  PATHS --> JOINS
  COST -. feeds .-> PATHS
  COST -. feeds .-> JOINS
  STATS -. feeds .-> COST
  JOINS --> CREATE

  subgraph EXEC["the executor — executor/README"]
    PORTAL["postgres-portals-prepared<br/>PortalStart/PortalRun, plan cache"]
    DRIVER["postgres-executor<br/>ExecProcNode demand-pull tree"]
    EXPR["postgres-expression-eval<br/>switch-threaded interpreter"]
    SCANS["postgres-scan-nodes"]
    JOINNODES["postgres-join-nodes"]
    AGGSORT["postgres-agg-sort-nodes"]
    PARALLEL["postgres-parallel-query<br/>Gather forks workers over shm_mq"]
    JIT["postgres-jit<br/>optional LLVM compile of exprs"]
    SORT["postgres-tuplesort<br/>external merge sort / spill"]
  end
  CREATE --> PORTAL --> DRIVER
  DRIVER --> SCANS
  DRIVER --> JOINNODES
  DRIVER --> AGGSORT
  DRIVER --> PARALLEL
  DRIVER --> EXPR
  EXPR -. optionally .-> JIT
  AGGSORT -. spills via .-> SORT

  NODES["postgres-node-trees<br/>the Node/List substrate every tree above is built from"]
  NODES -. underpins .-> FRONT
  NODES -. underpins .-> PLAN
  NODES -. underpins .-> EXEC

  SCANS --> AM["table / index access methods (storage-engine)"]

Three structural facts the diagram encodes:

  • Trees, not streams. Each stage hands the next a tree of Nodes: the raw parse tree, the Query tree, the Plan tree, the runtime PlanState tree. postgres-node-trees owns the Node/List machinery (the gen_node_support.pl-generated copy/equal/out/read functions) that every other doc in this section assumes.
  • The planner has two grammars. During search it builds and compares Path nodes (cheap to make, cheapest-first); only the winner is turned into a Plan by postgres-plan-creation. Cost (postgres-cost-model) and statistics (postgres-extended-statistics) are inputs to search, not stages — hence the dotted “feeds” edges.
  • Parallel query is a fork of the same tree. A Gather/GatherMerge node launches background workers that run a copy of the sub-plan and stream tuples back over shm_mq; it is not a separate execution model.

Cross-referenced-first, so each doc leans on ones already read:

  1. postgres-node-trees — read this first. Everything downstream is a tree of Nodes; the copy/equal/out/read and List conventions are assumed by every other doc here.
  2. The front of the pipelinepostgres-parserpostgres-analyze-transformpostgres-rewriter. Short, linear, and they establish the Query tree the planner consumes.
  3. The plannerpostgres-planner-overview first (it frames the whole optimizer and is the home of optimizer/README), then postgres-path-generation and postgres-join-ordering for the search, postgres-cost-model and postgres-extended-statistics for the inputs, and postgres-plan-creation for Path→Plan. postgres-prep-rewrites can trail (it is the planner’s preprocessing).
  4. The executorpostgres-executor (the demand-pull driver) before the per-node docs postgres-scan-nodes, postgres-join-nodes, postgres-agg-sort-nodes. postgres-expression-eval pairs with the executor; postgres-tuplesort is the spill machinery agg/sort nodes lean on.
  5. The cross-cutting fourpostgres-parallel-query, postgres-portals-prepared, and (optional, leaf) postgres-jit last, since they presuppose the plan and executor models.

Forward references — these module docs may not exist yet. One line each on what each will own.

DocOne-line scope
postgres-parser.mdLexer (scan.l) and bison grammar (gram.y) turning SQL text into a raw parse tree; what is and isn’t decided at parse time.
postgres-analyze-transform.mdParse analysis (analyze.c, parse_*): name/relation resolution, type coercion, and the transformation of a raw parse tree into a Query.
postgres-rewriter.mdThe rule system (rewriteHandler.c): view expansion, DO INSTEAD rules, and row-level-security qual injection on the Query tree.
postgres-planner-overview.mdstandard_planner/subquery_planner top-level flow; the Path-vs-Plan model and how the optimizer’s pieces fit (home of optimizer/README).
postgres-path-generation.mdBuilding candidate access Paths per relation (allpaths.c, indxpath.c): seqscan/indexscan/bitmap paths, pathkeys, cheapest-first bookkeeping.
postgres-join-ordering.mdJoin-order search: dynamic programming over join relations (joinrels.c, joinpath.c) and the GEQO genetic fallback past geqo_threshold.
postgres-cost-model.mdThe cost functions (costsize.c) and selectivity estimation (selfuncs.c): how a Path’s total_cost is computed from row estimates and *_cost GUCs.
postgres-plan-creation.mdTurning the winning Path into an executable Plan tree (createplan.c, setrefs.c, subselect.c): var fixup, subplan handling.
postgres-prep-rewrites.mdPlanner preprocessing (optimizer/prep/*): subquery pull-up/flattening, qual canonicalization, set-operation and target-list prep.
postgres-extended-statistics.mdMulti-column extended statistics (statistics/*): ndistinct, functional dependencies, MCV lists, and how ANALYZE populates them for better selectivity.
postgres-executor.mdThe demand-pull engine: ExecInitNode/ExecProcNode/ExecEndNode, the PlanState tree, tuple slots, and the executor lifecycle.
postgres-expression-eval.mdExpression compilation to an opcode steps array (execExpr.c) and the switch-threaded interpreter (execExprInterp.c, ExecInterpExpr).
postgres-scan-nodes.mdLeaf scan executor nodes: seqscan, indexscan, index-only, bitmap-heap, tid and tid-range scans, and how they call the access methods.
postgres-join-nodes.mdJoin executor nodes: nested loop, hash join (nodeHash/nodeHashjoin), and merge join — algorithms, batching, and memory behavior.
postgres-agg-sort-nodes.mdGrouping/ordering nodes: hash and sorted aggregation (nodeAgg), sort, incremental sort, group, and window aggregation.
postgres-parallel-query.mdParallel execution: Gather/GatherMerge, execParallel worker setup, the shm_mq tuple queue, and parallel-aware nodes (per README.parallel).
postgres-portals-prepared.mdPortals (pquery.c) and prepared statements (prepare.c, plancache.c): plan caching, generic vs custom plans, cursor execution.
postgres-jit.mdOptional LLVM JIT (jit/llvm/*): when expressions and tuple deforming get compiled instead of interpreted, and the cost thresholds that trigger it.
postgres-tuplesort.mdGeneral-purpose sorting (tuplesort.c) and tuplestore: in-memory quicksort, external merge with logtape, and the spill path agg/sort/merge-join use.
postgres-node-trees.mdThe Node/List substrate (nodes/*, gen_node_support.pl): tagged nodes, copy/equal/out/read codegen, and Bitmapset/tidbitmap helpers.
  • storage-engine (postgres-overview-storage-engine.md) — the section below execution. The executor’s scan nodes stop at the table/index access-method boundary; storage-engine owns the AMs, the buffer manager, the page layout, and the heap they read. The handoff is the single most trafficked seam in the engine.
  • txn-recovery (postgres-overview-txn-recovery.md) — supplies the snapshot a scan reads against and the visibility rules that decide which tuple versions a query sees. The executor asks “is this tuple visible to my snapshot?”; txn-recovery answers.
  • system-catalog (postgres-overview-system-catalog.md) — the planner’s cost model and the parser’s name resolution both read catalog metadata (relations, types, operators, statistics) through the relcache/catcache. This section consumes those caches; system-catalog owns them.
  • client-protocol (postgres-overview-client-protocol.md) — the section before parse. It delivers the query_string and, via the extended query protocol, drives the Parse/Bind/Execute lifecycle that maps onto prepared statements and portals here.
  • server-architecture (postgres-overview-server-architecture.md) — the backend that runs this entire pipeline is a B_BACKEND, and parallel query’s workers are background workers attached to the shared-memory machine. Process model, locks, and the shm_mq/DSM substrate parallel query rides on all live there.