PostgreSQL Partitioning — From Table Inheritance to Declarative Partitioning
Contents:
- Why this subsystem had to evolve (the original limitation)
- Timeline
- Era 0 — Table inheritance + constraint exclusion (pre-10)
- Era 1 — Declarative partitioning syntax + tuple routing (PG10)
- Era 2 — Hash partitioning, default partition, partitionwise join/aggregate (PG11)
- Era 3 — Runtime pruning (PG11)
- Era 4 — Faster plan-time pruning + ATTACH/DETACH CONCURRENTLY + FK/index inheritance (PG12)
- Era 5 — Planner polish, triggers, logical replication (PG13–16)
- Where it stands at REL_18
- Sources
Why this subsystem had to evolve (the original limitation)
Section titled “Why this subsystem had to evolve (the original limitation)”For most of PostgreSQL’s history there was no such thing as a “partitioned
table.” There was only table inheritance — a Postgres-the-research-system
feature in which a child table CREATE TABLE measurement_y2006m02 () INHERITS (measurement) shares the parent’s column definitions and is automatically
scanned whenever the parent is queried. Partitioning was a recipe built on
top of inheritance, documented in the manual and reproduced in thousands of
production schemas, not a feature the engine understood as a unit.
The recipe had four moving parts, each of which the DBA had to assemble and keep in sync by hand:
- A parent table with no rows of its own, plus one child table per
partition declared
INHERITS (parent). - A
CHECKconstraint on every child spelling out which rows it is allowed to hold —CHECK (logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01'). This constraint was the only thing that told the planner what the child contained. - An
INSERTtrigger (or rule) on the parent that inspected each new row andINSERTed it into the correct child. Routing was user-written PL/pgSQL: a cascade ofIF logdate >= ... AND logdate < ... THEN INSERT INTO ...child...branches, re-edited every time a partition was added. constraint_exclusionturned on, so the planner would try to prove, for each child, that the child’sCHECKconstraint is refuted by the query’sWHEREclause, and skip the child’s scan if so.
Every one of these was a liability. The trigger was an O(N) IF ladder that
the DBA maintained as a string of SQL, with no help from the engine if a
branch was wrong or missing — a misrouted row silently landed in the wrong
child or errored. The CHECK constraints were free-form predicates: there was
no guarantee they were mutually exclusive or that they covered the key space,
so two children could legally claim the same row, or a row could match no
child and fall back into the (supposedly empty) parent.
The planning side was worse for performance. constraint_exclusion worked by
theorem proving: for each child relation, relation_excluded_by_constraints
fed the child’s CHECK predicate and the query’s restriction clauses to the
predicate-refutation prover (predicate_refuted_by) and asked “can these be
simultaneously true?” This is general and elegant — it handles any CHECK the
DBA writes — but it is also per-child, at plan time, and proportional to the
complexity of the predicates. A thousand-child table meant a thousand
refutation proofs on every plan, each one running the general SAT-like prover.
Constraint exclusion also only fired for constants known at plan time; a
parameter (logdate = $1) or a value from a join could not be used to exclude
children, because the prover had no constant to reason about.
The architectural diagnosis: the knowledge of “which partition holds which rows” lived in user SQL — trigger bodies and CHECK predicates — instead of in catalog metadata the engine could index. Routing was an interpreted IF ladder; pruning was a general predicate prover applied blindly to every child. Neither scaled to hundreds or thousands of partitions, and neither could use a runtime value. The fix was to move that knowledge into a structured, binary-searchable catalog representation and write dedicated C to read it — which is exactly what the declarative-partitioning project did, release by release.
Timeline
Section titled “Timeline”flowchart LR E0["Pre-10<br/>Table inheritance<br/>CHECK + INSERT trigger<br/>constraint_exclusion proofs"] E1["PG10<br/>PARTITION BY syntax<br/>PartitionBoundInfo catalog<br/>C tuple routing"] E2["PG11<br/>HASH partitioning<br/>DEFAULT partition<br/>partitionwise join/agg<br/>UPDATE row movement"] E3["PG11<br/>Runtime pruning<br/>PARAM_EXEC drives<br/>Append subplan skip"] E4["PG12<br/>Fast plan-time pruning<br/>ATTACH/DETACH CONCURRENTLY<br/>FK + index inheritance"] E5["PG13-16<br/>BEFORE-row triggers<br/>logical-rep publish root<br/>planner refinements"] E6["REL_18<br/>partbounds.c<br/>partprune.c<br/>execPartition.c"] E0 --> E1 --> E2 --> E3 --> E4 --> E5 --> E6
The thread running through every era is the same migration: partition
resolution moves out of user-written SQL (trigger bodies, CHECK predicates,
the general refutation prover) and into catalog metadata read by purpose-built
C in src/backend/partitioning/. Era 1 builds the metadata and the routing
reader; Era 2 widens the strategies and starts pushing operators below the
partition boundary; Era 3 lets pruning run after parameters are bound; Era 4
makes pruning cheap and partition maintenance online; Era 5 closes the
remaining feature gaps. The rest of this doc walks each era: what changed, why,
and the before/after code shape.
Era 0 — Table inheritance + constraint exclusion (pre-10)
Section titled “Era 0 — Table inheritance + constraint exclusion (pre-10)”Release: baseline — inheritance dates to the earliest PostgreSQL;
constraint_exclusion as a partition-pruning tool was added in PG 8.1
(2005) and made smarter (the partition mode that only fires for inheritance
children and UNION ALL) in PG 8.4.
What it looked like. A “partitioned” table was a manually wired tree:
-- Era 0: partitioning by convention, not by the engineCREATE TABLE measurement (logdate date NOT NULL, peaktemp int);
CREATE TABLE measurement_y2006m02 ( CHECK (logdate >= '2006-02-01' AND logdate < '2006-03-01')) INHERITS (measurement);CREATE TABLE measurement_y2006m03 ( CHECK (logdate >= '2006-03-01' AND logdate < '2006-04-01')) INHERITS (measurement);
-- routing is a hand-written triggerCREATE FUNCTION measurement_insert_trigger() RETURNS trigger AS $$BEGIN IF NEW.logdate >= '2006-02-01' AND NEW.logdate < '2006-03-01' THEN INSERT INTO measurement_y2006m02 VALUES (NEW.*); ELSIF NEW.logdate >= '2006-03-01' AND NEW.logdate < '2006-04-01' THEN INSERT INTO measurement_y2006m03 VALUES (NEW.*); ELSE RAISE EXCEPTION 'date out of range'; -- or silently lose the row END IF; RETURN NULL;END; $$ LANGUAGE plpgsql;How pruning worked — and why it was slow. When the planner expanded the
parent measurement into its inheritance children, it asked, for each child,
whether the query’s WHERE clause refutes that child’s CHECK constraint. The
mechanism is the general predicate prover, still present at REL_18 as the
constraint_exclusion GUC and relation_excluded_by_constraints in
plancat.c:
// relation_excluded_by_constraints — src/backend/optimizer/util/plancat.c// (still present at REL_18; this is the Era-0 mechanism)// ... gather the rel's CHECK constraints into safe_constraints ...// if (predicate_refuted_by(safe_constraints, rel->baserestrictinfo, false))// return true; // prove WHERE contradicts the CHECK -> excludeThe cost model of Era 0:
- Routing: O(N) interpreted per row. Each INSERT runs the PL/pgSQL trigger, walking the IF ladder until a branch matches. N partitions means up to N comparisons per row, executed by the PL/pgSQL interpreter.
- Pruning: O(N) refutation proofs per plan. Each child invokes the general
predicate_refuted_byprover. The prover is not a binary search — it is a structural proof over arbitrary boolean predicates, so it is both per-child and per-clause expensive. - No runtime values. A
WHERE logdate = $1prepared statement, or alogdatevalue arriving from a join, gives the prover nothing to refute against, so no children are excluded — every child is scanned.
Why it had to change. The two costs above are linear in the partition count at the wrong times (every INSERT, every plan), and the prover is general where a partitioned table wants something special: the partition map is exhaustive and disjoint by construction, so resolving a key to a partition should be a binary search, not a theorem. Era 0 had no place to store an ordered bound map because the engine did not model “this is a partition of that” — it only modeled “this inherits from that, and happens to carry a CHECK.” Everything after PG10 is about giving the engine that missing model.
Cross-link: the general refutation prover and the surviving constraint_exclusion
GUC are described in the planner mechanism doc,
postgres-planner-overview.md; the
predicate-driven scan choice it feeds is in
postgres-scan-nodes.md.
Era 1 — Declarative partitioning syntax + tuple routing (PG10)
Section titled “Era 1 — Declarative partitioning syntax + tuple routing (PG10)”Release: PostgreSQL 10 (2017). The headline feature: PARTITION BY and
PARTITION OF syntax, a real catalog model of partitions, and C-level tuple
routing on INSERT. PG10 shipped RANGE and LIST strategies (hash came
one release later).
What changed in the syntax. The four hand-wired pieces of Era 0 collapse into two DDL statements the engine understands as a unit:
-- Era 1: the engine owns the partition mapCREATE TABLE measurement (logdate date NOT NULL, peaktemp int) PARTITION BY RANGE (logdate);
CREATE TABLE measurement_y2006m02 PARTITION OF measurement FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');CREATE TABLE measurement_y2006m03 PARTITION OF measurement FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');-- no CHECK, no trigger: INSERT INTO measurement just worksThe CHECK constraint and the INSERT trigger are gone. The FOR VALUES
clause is parsed into a catalog row, and the engine derives routing and pruning
from it.
The new catalog model. PG10 added pg_partitioned_table (the parent’s
strategy and key) and stored each partition’s bound in pg_class.relpartbound.
At plan/exec time the relcache builds, per partitioned relation, a
PartitionDesc — an ordered oids[] array plus a PartitionBoundInfo that
canonicalizes the LIST/RANGE bounds into a sorted datums[] array with an
indexes[] map. This is the structured, binary-searchable representation that
Era 0 lacked. (The full structure is documented in the module doc; this evolution
doc names it but does not re-derive it.)
Tuple routing replaces the trigger. Routing moved from interpreted PL/pgSQL
into C in the new src/backend/executor/execPartition.c. The before/after of
the INSERT path:
BEFORE (Era 0): INSERT INTO parent -> BEFORE INSERT trigger fires -> PL/pgSQL interpreter walks IF/ELSIF ladder (O(N)) -> INSERT INTO matched child
AFTER (Era 1): INSERT INTO partitioned_table -> ExecFindPartition() -> FormPartitionKeyDatum() extracts the key -> get_partition_for_tuple() binary-searches boundinfo -> tuple stored in the chosen leaf ResultRelInfo// ExecFindPartition / get_partition_for_tuple — src/backend/executor/execPartition.c// (the C routing that replaced the Era-0 INSERT trigger)// ExecFindPartition(): FormPartitionKeyDatum(...) then// partidx = get_partition_for_tuple(dispatch, values, isnull);// get_partition_for_tuple(): binary search of the bound info// (partition_range_datum_bsearch / partition_list_bsearch)Plan-time pruning, first cut. PG10 still leaned on the existing constraint
machinery for pruning at plan time — the declarative bounds were internally
turned into the equivalent of the implicit partition constraint, and the planner
used the same expansion-plus-exclusion path. The dedicated, fast pruning engine
(partprune.c) had not arrived yet; that is Era 3/4. So PG10’s win was
correctness and ergonomics of the model plus fast C routing, not yet fast
pruning.
Why this was the pivotal release. PG10 is where partition knowledge stops
being user SQL and becomes catalog metadata the engine can index. Once the
bounds live in a sorted PartitionBoundInfo, both routing (a key -> partition
lookup) and pruning (a predicate -> set-of-partitions lookup) become
binary-search problems instead of, respectively, an interpreted IF ladder and a
per-child theorem. Every later era builds on this representation.
Cross-link: the PartitionDesc / PartitionBoundInfo catalog structure and the
ExecFindPartition / get_partition_for_tuple routing flow are detailed in the
module doc postgres-partitioning.md.
Era 2 — Hash partitioning, default partition, partitionwise join/aggregate (PG11)
Section titled “Era 2 — Hash partitioning, default partition, partitionwise join/aggregate (PG11)”Release: PostgreSQL 11 (2018). PG11 is the release that made declarative partitioning usable at scale rather than merely correct. Four largely independent additions landed together; the fifth, runtime pruning, is large enough to get its own era below.
1. HASH partitioning. PG10 had RANGE and LIST. PG11 added the third
textbook strategy, PARTITION BY HASH, with partitions declared by modulus and
remainder:
CREATE TABLE orders (order_id bigint, ...) PARTITION BY HASH (order_id);CREATE TABLE orders_p0 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 0);CREATE TABLE orders_p1 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 1);-- ... p2, p3Hash partitioning has no natural ordering, so it could never have been expressed
as Era-0 range CHECKs without an explicit hashfn(key) % n = r constraint that
the prover could not exploit. With declarative bounds, routing a row to a hash
partition is a modulo computation rather than a search:
// compute_partition_hash_value path — src/backend/partitioning/partbounds.c// PARTITION_STRATEGY_HASH: hash the key columns, take modulus/remainder.// case PARTITION_STRATEGY_HASH: (partbounds.c bound-cmp + routing switch)This is the clearest demonstration of why the catalog model mattered: hash partitioning is only practical once the engine, not a user CHECK, owns the mapping function.
2. DEFAULT partition. PG11 added PARTITION OF parent DEFAULT — a catch-all
for rows matching no explicit bound. In Era 0 a row matching no child’s CHECK
silently fell into the empty parent or errored in the trigger; declarative
partitioning in PG10 rejected such a row with an error. The default partition
restores the catch-all behavior safely, as a first-class, engine-known
partition with its own entry in the bound info (LIST and RANGE only; a hash
partitioned table is total by construction and has no default).
3. Partitionwise join. Controlled by enable_partitionwise_join, this lets
the planner join two compatibly-partitioned tables by joining their matching
partitions pairwise and Append-ing the results, instead of appending both
sides first and joining the unioned relations. The enabler is bound matching:
the planner calls partition_bounds_merge (new in PG11, in partbounds.c) to
check that the two inputs’ partition boundaries line up, and if they do, it
pushes the join below the Append.
4. Partitionwise aggregate. Controlled by enable_partitionwise_aggregate,
the dual of the above for GROUP BY: when the grouping key is the partition
key, each partition can be aggregated independently and the partial results
concatenated, turning one big hash/sort aggregate into N small ones — and, with
parallelism, N parallel ones.
The structural shift of partitionwise operators, before vs. after:
flowchart TB
subgraph Before["Before PG11 — join above Append"]
A1["Append (table A: a0,a1)"] --> J1["Join A x B"]
B1["Append (table B: b0,b1)"] --> J1
end
subgraph After["PG11 — partitionwise join below Append"]
J2a["Join a0 x b0"] --> AP["Append"]
J2b["Join a1 x b1"] --> AP
end
The win is twofold: each per-partition join works on a fraction of the data (so
a hash join’s build side fits in memory where the whole-table build would
spill), and the N independent joins parallelize cleanly. The requirement —
matching bounds — is why partition_bounds_merge had to exist: the planner must
prove a0 only ever joins b0, never b1, which is exactly a bound-overlap check.
5. UPDATE row movement. Also in PG11: an UPDATE that changes a row’s
partition key so the row no longer belongs in its current partition is now
handled by moving the row (DELETE from the old partition, INSERT into the new
one) instead of erroring. This is the update-path counterpart to INSERT tuple
routing, again in execPartition.c (the ExecCrossPartitionUpdate path).
Why these landed together. PG10 proved the catalog model worked; PG11 cashed it in. Hash partitioning and the default partition complete the strategy surface (all three textbook strategies plus a catch-all). Partitionwise join and aggregate are the first features to exploit the bound info for something beyond a single point lookup — they merge two bound sets — and they mark the moment the optimizer starts treating partition boundaries as a structure to push operators through, not just a pruning hint.
Cross-link: partition_bounds_merge and the partitionwise mechanism are in
postgres-partitioning.md; the cost/path choices that
decide whether to use a partitionwise plan are in
postgres-planner-overview.md.
Era 3 — Runtime pruning (PG11)
Section titled “Era 3 — Runtime pruning (PG11)”Release: PostgreSQL 11 (2018), shipped alongside the Era-2 features but conceptually distinct enough to call out separately: it is the first pruning mechanism that works on values not known at plan time.
The gap it closed. Constraint exclusion (Era 0) and PG10’s pruning both required a plan-time constant. Two enormous classes of query had no such constant:
- Generic prepared plans / parameters.
PREPARE p AS SELECT * FROM measurement WHERE logdate = $1; EXECUTE p('2006-02-15'). A generic plan is built once with$1unknown, so no child could be excluded — every partition was scanned on every execution. - Nested-loop join inner side.
... JOIN measurement ON measurement.logdate = outer.d. The value ofouter.dis only known per outer row, at run time, so plan-time pruning could exclude nothing.
The mechanism. PG11 introduced a pruning representation that can be
re-executed with bound parameter values: the planner compiles the partition
key clauses into a list of PartitionPruneStep nodes — PartitionPruneStepOp
(apply an operator against the bound info) and PartitionPruneStepCombine
(intersect/union the per-clause results) — and attaches them to the Append /
MergeAppend plan node. These steps are interpretable at run time, reading
PARAM_EXEC values that did not exist at plan time, to compute the surviving set
of subplans.
Two run-time pruning points were added:
- Initial pruning (
ExecDoInitialPruning/ExecFindMatchingSubPlansinexecPartition.c): runs once at executor startup, after external parameters (theEXECUTEargs of a generic plan) are bound, to drop subplans that cannot match. This is what makes a generic prepared plan prune. - Per-scan / exec pruning: re-runs the steps when a
PARAM_EXECvalue changes — e.g. each new outer tuple of a nested loop — so anAppendunder a nested loop only scans the partition(s) the current outer value can match.
The before/after of a parameterized scan:
BEFORE (PG10): EXECUTE p($1='2006-02-15') -> generic plan: Append over ALL partitions -> every leaf scanned, results discarded by filter
AFTER (PG11): EXECUTE p($1='2006-02-15') -> ExecDoInitialPruning() runs the prune steps with $1 bound -> get_matching_partitions() -> {measurement_y2006m02} -> Append scans ONLY the surviving subplan// ExecDoInitialPruning / ExecFindMatchingSubPlans — execPartition.c// run the compiled PartitionPruneStep list once params are known,// producing the bitmap of valid (surviving) Append subplans.// get_matching_partitions — partprune.c// the shared engine that both plan-time and run-time pruning call.Why it is structurally important. Runtime pruning is the point where the
pruning logic becomes a reusable, parameterizable program (the prune-step
list) rather than a one-shot plan-time decision. The same compiled steps serve
plan-time pruning (run with constants, before the plan is finalized) and runtime
pruning (run with PARAM_EXEC values, during execution). That unification —
“compile the predicate once into steps, run the steps whenever the values are
known” — is what partprune.c provides, and it is the foundation the Era-4
plan-time speedup is built on.
Cross-link: the prune-step compilation and the get_matching_partitions engine
are in postgres-partitioning.md; how the surviving
subplans feed the Append / MergeAppend executor nodes is in
postgres-scan-nodes.md.
Era 4 — Faster plan-time pruning + ATTACH/DETACH CONCURRENTLY + FK/index inheritance (PG12)
Section titled “Era 4 — Faster plan-time pruning + ATTACH/DETACH CONCURRENTLY + FK/index inheritance (PG12)”Release: PostgreSQL 12 (2019). PG12 is the “make it fast and make it
operable” release. The partitioning project was now feature-rich but had two
remaining pain points: plan-time pruning was still slow for high partition
counts, and every partition-maintenance operation (ATTACH, DETACH) took a
strong lock that blocked queries.
1. Fast plan-time pruning. PG11’s runtime-pruning machinery (partprune.c,
the prune-step program, get_matching_partitions) was generalized so that
plan-time pruning also runs through it, via
prune_append_rel_partitions, instead of the old expand-every-child-then-prove
path. The practical effect, widely reported at the time: planning a SELECT
against a table with thousands of partitions went from seconds to milliseconds,
because the planner binary-searches the bound info for the matching partitions
instead of running a refutation proof per child. PG12 also raised the practical
ceiling on partition count from “hundreds hurt” to “thousands are fine.”
// prune_append_rel_partitions — src/backend/partitioning/partprune.c// plan-time entry: compile the rel's restriction clauses into prune steps,// run get_matching_partitions(), return the surviving partition set —// the same engine Era 3 introduced for runtime, now used at plan time too.This closes the loop opened in Era 3: a single pruning engine
(partprune.c / get_matching_partitions) now serves plan-time pruning
(prune_append_rel_partitions) and runtime pruning (ExecFindMatchingSubPlans),
retiring constraint-exclusion-style per-child proofs for declarative tables.
constraint_exclusion remains in the tree for legacy inheritance and UNION ALL,
but declarative partitioned tables no longer depend on it.
2. ATTACH PARTITION (lighter lock) and DETACH … CONCURRENTLY. PG12 reduced
the locking impact of ALTER TABLE ... ATTACH PARTITION so attaching a new
partition no longer required an ACCESS EXCLUSIVE lock that blocked readers of
the whole table. DETACH PARTITION CONCURRENTLY — the online detach that uses a
two-transaction protocol so a long-running detach does not block queries — was
finalized in PG14; PG12 is where the locking-reduction work began. The
detach-concurrently logic is still visible in tablecmds.c, including the rule
that you cannot detach concurrently when a default partition exists (the default
would have to be revalidated under a strong lock anyway):
// DETACH PARTITION CONCURRENTLY — src/backend/commands/tablecmds.c// "cannot detach partitions concurrently when a default partition exists"// second transaction sets inhdetachpending and waits out concurrent snapshots3. Foreign keys and indexes inherit through partitions. Earlier declarative
partitioning had sharp edges: you could not create an index on a partitioned
parent and have it apply to all partitions, and a foreign key referencing a
partitioned table was not supported. PG11 began closing these (indexes declared
on the parent cascade to children; partitioned-table primary keys and unique
constraints); PG12 completed foreign-key support so a partitioned table can be
the referenced side of an FK. The DDL-cascade-to-children machinery lives in
tablecmds.c and the constraints subsystem; the evolution point is that
partition children stopped being independent tables you had to index/constrain
one by one and became managed members of a hierarchy.
Why this was the maturation release. After PG12, declarative partitioning is no longer a feature with a “but” attached. Pruning is cheap at any reasonable partition count, partition maintenance no longer requires an outage-grade lock, and the surrounding schema objects (indexes, PKs, FKs) inherit through the hierarchy. The single shared pruning engine introduced in Era 3 and generalized here is the design that REL_18 still ships.
Cross-link: prune_append_rel_partitions and the plan-time pruning flow are in
postgres-planner-overview.md and
postgres-partitioning.md; the resulting partition
scans are in postgres-scan-nodes.md.
Era 5 — Planner polish, triggers, logical replication (PG13–16)
Section titled “Era 5 — Planner polish, triggers, logical replication (PG13–16)”Releases: PostgreSQL 13 (2020) through 16 (2023). By PG13 the core architecture was settled, so this era is a long tail of gap-closing: each release removed a remaining “you can’t do X on a partitioned table” restriction and made the planner smarter about the structure it now had. Highlights, attributed by release:
- PG13 — partitionwise reach and BEFORE-row triggers. PG13 let
partition-key-preserving operations propagate more widely and, notably, added
support for
BEFORE ROWtriggers on partitioned tables declared on the parent and inherited by children (Era 0/PG10 only allowed statement-level or per-child row triggers). PG13 also improved pruning to handle more clause shapes (e.g. it can prune using a list of values /IN), and reduced memory/locking overhead when a query touches only a few of many partitions. - PG13 — logical replication of partitioned tables. A partitioned table can
be added to a
PUBLICATIONdirectly, andpublish_via_partition_rootlets the publisher send changes as if they belonged to the root table, so a partitioned table on one side can replicate to a non-partitioned (or differently partitioned) table on the other. Before this, you published the leaf tables individually. - PG14 —
DETACH PARTITION CONCURRENTLY. The online detach finalized: a two-transaction protocol marks the partitioninhdetachpendingand waits out in-flight snapshots so the detach does not block concurrent queries (the mechanism noted under Era 4, completed here). PG14 also further reduced per-partition planner/executor overhead for queries that prune down to a few partitions. - PG15 — partitionwise refinements and ordered scans. Continued planner work:
better use of
MergeAppendto preserve ordering across partitions (so anORDER BYon the partition key can stream from per-partition ordered scans without a top-level sort), and more cases where partitionwise join/aggregate applies. - PG16 — more aggressive pruning and partitionwise joins. PG16 extended partitionwise join to more situations where the partition bounds are compatible but not identical, and broadened the clause forms the pruning steps can exploit. General theme of the release: squeeze more plans into the partitionwise and pruned-scan shapes that earlier eras made possible.
The shape of this era. Unlike PG10–12, no single PG13–16 change rewrote the
architecture. Each is a delta against the same three pillars: the
PartitionBoundInfo catalog model (PG10), the bound-merge / partitionwise
optimizer hooks (PG11), and the unified partprune.c pruning engine
(PG11 runtime + PG12 plan-time). The work is breadth — more trigger types, more
replication scenarios, more clause shapes pruned, more join geometries pushed
below the Append — on a foundation that did not move.
flowchart TB Cat["PartitionBoundInfo<br/>catalog model (PG10)"] Merge["bound-merge / partitionwise<br/>optimizer hooks (PG11)"] Prune["unified prune engine<br/>partprune.c (PG11 rt + PG12 pt)"] P13["PG13: BEFORE-row triggers<br/>publish_via_partition_root"] P14["PG14: DETACH CONCURRENTLY<br/>lower per-partition overhead"] P15["PG15: MergeAppend ordering<br/>more partitionwise cases"] P16["PG16: compatible-bound joins<br/>broader pruning clauses"] Cat --> P13 Merge --> P15 Merge --> P16 Prune --> P14 Prune --> P16 Cat --> P13
Cross-link: the current trigger, replication, and ordered-scan behaviors are described in their own module docs; for the partitioning-specific mechanism see postgres-partitioning.md, and for the planner choices that drive partitionwise and ordered-append plans see postgres-planner-overview.md and postgres-scan-nodes.md.
Where it stands at REL_18
Section titled “Where it stands at REL_18”At REL_18 (273fe94, PostgreSQL 18.x) declarative partitioning is the mature,
catalog-backed subsystem the PG10–12 arc produced, with the PG13–16 breadth
folded in. The three source files that define it today are exactly the ones the
evolution converged on:
src/backend/partitioning/partbounds.c— canonicalizes LIST / RANGE / HASH bounds into the sortedPartitionBoundInfo, runs the bound comparisons used by routing, and merges two bound sets (partition_bounds_merge,merge_list_bounds,merge_range_bounds) for partitionwise join/aggregate.src/backend/partitioning/partprune.c— the single pruning engine.prune_append_rel_partitionsis the plan-time entry;get_matching_partitionsruns the compiledPartitionPruneStepOp/PartitionPruneStepCombineprogram; the same steps are re-run at run time. This one module serves both plan-time and runtime pruning for declarative tables, having retired per-child constraint proofs for them.src/backend/executor/execPartition.c— tuple routing on INSERT/UPDATE (ExecFindPartition->get_partition_for_tuple, with thePARTITION_CACHED_FIND_THRESHOLD = 16last-found cache) and runtime pruning ofAppend/MergeAppendsubplans (ExecDoInitialPruning,ExecFindMatchingSubPlans), plus cross-partitionUPDATErow movement.
constraint_exclusion survives in plancat.c / guc_tables.c but only for
legacy table-inheritance trees and UNION ALL — the Era-0 mechanism, kept for
backward compatibility, no longer on the path for declarative partitioned tables.
The mechanism-level detail of all of the above is in the current-state module
doc postgres-partitioning.md, with the planner and
scan-node context in postgres-planner-overview.md
and postgres-scan-nodes.md; this evolution doc does not
duplicate it.
Next step (PG19, just released). Work continues on the same foundation — incremental partitionwise-plan and pruning coverage and partition-maintenance ergonomics — rather than another architectural rewrite; the PG10–12 design is the one PostgreSQL is still extending, not replacing.
Sources
Section titled “Sources”PostgreSQL release notes (feature attribution):
- PostgreSQL 10 release notes — declarative table partitioning (
PARTITION BYRANGE/LIST,PARTITION OF, tuple routing on INSERT). - PostgreSQL 11 release notes — hash partitioning, default partition, partitionwise
join (
enable_partitionwise_join), partitionwise aggregate (enable_partitionwise_aggregate), UPDATE row movement, runtime partition pruning, partitioned-table indexes / primary keys. - PostgreSQL 12 release notes — faster plan-time pruning, reduced ATTACH locking, foreign keys referencing partitioned tables, COPY into partitioned tables improvements.
- PostgreSQL 13 release notes — BEFORE row triggers on partitioned tables, logical
replication of partitioned tables /
publish_via_partition_root, improved pruning and reduced overhead. - PostgreSQL 14 release notes —
DETACH PARTITION ... CONCURRENTLY, further partition planning/execution overhead reductions. - PostgreSQL 15 / 16 release notes — ordered (MergeAppend) partition scans, broader partitionwise join cases, broader pruning clause support.
Current-state module docs (mechanism — cited, not duplicated):
- postgres-partitioning.md —
PartitionDesc/PartitionBoundInfo, tuple routing, prune steps, partitionwise merge. - postgres-planner-overview.md — path/cost
selection,
constraint_exclusion,prune_append_rel_partitions. - postgres-scan-nodes.md — Append/MergeAppend and the scan nodes that surviving partitions feed.
Key source files (observable on REL_18, commit 273fe94):
src/backend/partitioning/partbounds.c— bound canonicalization, comparison, merge.src/backend/partitioning/partprune.c—prune_append_rel_partitions,get_matching_partitions, prune-step generation.src/backend/partitioning/partdesc.c—PartitionDescbuild.src/backend/executor/execPartition.c—ExecFindPartition,get_partition_for_tuple,ExecDoInitialPruning,ExecFindMatchingSubPlans.src/backend/optimizer/util/plancat.c—relation_excluded_by_constraints(the legacy constraint-exclusion mechanism).src/backend/commands/tablecmds.c— ATTACH / DETACH (incl. CONCURRENTLY) and DDL cascade to partitions.src/backend/utils/misc/guc_tables.c—constraint_exclusion,enable_partitionwise_join,enable_partitionwise_aggregateGUCs.