PostgreSQL Node System — NodeTag, copyObject, and the gen_node_support Codegen
Contents:
- Theoretical Background
- Common DBMS Design
- PostgreSQL’s Approach
- Source Walkthrough
- Source verification (as of 2026-06-05)
- Beyond PostgreSQL — Comparative Designs & Research Frontiers
- Sources
Theoretical Background
Section titled “Theoretical Background”A SQL statement is reborn several times on its way through a database backend: first as a parse tree (a faithful syntactic transcription of the text), then as a rewritten query tree, then as an optimized plan tree (a tree of physical operators), and finally as an execution state tree that the engine walks to produce rows. Database System Concepts (Silberschatz, 7e, ch. 15–16) describes this pipeline as a sequence of intermediate representations, each a tree whose nodes are relational-algebra-flavoured operators, and the recurring engineering question is: what is the in-memory representation of a tree node, and how do generic algorithms walk it?
Two classic answers compete:
- The object-oriented answer. Each node type is a class; copy, compare, and serialize are virtual methods; the compiler’s vtable does the dispatch. Clean, but it bakes a C++ (or Java) object model into the core, and serialization across process boundaries — needed the moment you ship a plan to a parallel worker — still has to be written or generated per class.
- The tagged-union answer. Every node carries a small integer tag
identifying its concrete type. A generic routine reads the tag and
switches to type-specific code. This is the Lisp heritage: an s-expression is a tagged cell, andcopy/equal/print/readare generic functions that recurse over the tag. PostgreSQL descends literally from a Lisp ancestor (POSTGRES was prototyped partly in Lisp), and the node system is the fossil record of that lineage.
The tagged approach has a well-known cost: the switch over tags, and the
per-type copy/compare/serialize code, must be kept in lock-step with the
struct definitions. Miss a field in the copy routine and you get a
shallow copy that aliases shared substructure; miss it in the compare
routine and the rewriter thinks two different expressions are equal. The
manual maintenance of this lock-step is exactly the kind of mechanical,
error-prone bookkeeping that a code generator should own — which is the
arc of this document: PostgreSQL keeps the tagged-union representation but
derives the per-type plumbing from the struct definitions themselves.
Two more theory anchors matter here:
-
Representation of a list. Relational plans are full of variable-arity sequences — a target list, a qual list, a join’s child list. The textbook treats these abstractly as sets/sequences; the implementation needs a concrete container with cheap append and cheap indexed access. The naive Lisp answer is a linked list of cons cells; the cache-aware answer is an expansible array. PostgreSQL migrated from the former to the latter (v13) while keeping the Lisp-flavoured API (
lappend,lfirst,foreach), a textbook case of preserving an interface across a representation change. -
Deep vs. shallow copy and structural equality. Trees that share substructure are dangerous to mutate. The planner copies a
Querybefore rewriting; a rule application copies an expression before splicing it elsewhere. “Copy” here must mean deep copy — clone the whole reachable subgraph — and “equal” must mean structural equality, recursing field by field, not pointer identity. These are the two generic recursions the node system must provide, and both are tag-dispatched.
Common DBMS Design
Section titled “Common DBMS Design”Stripped to its essentials, a tagged-node tree system in any DBMS makes a handful of recurring design decisions.
A discriminator field, first, always
Section titled “A discriminator field, first, always”The tag must be at a fixed, known offset so that a routine holding only
an untyped pointer can read it before it knows the concrete type. The
universal convention is offset zero: the tag is the first field of
every node, so *(Tag *) ptr always yields the type. PostgreSQL’s Node
is literally struct { NodeTag type; } and nodeTag(p) is a cast-and-read
of that first word.
Inheritance by struct prefix
Section titled “Inheritance by struct prefix”C has no classes, so “B is-a A” is encoded by making an A the first
field of B. Because A’s first field is the tag, B’s first field is
still the tag, recursively — so a B* can be cast to A* or to Node*
and every cast sees the right discriminator. PostgreSQL uses this for the
whole plan hierarchy: SeqScan embeds Scan embeds Plan embeds the
NodeTag.
Allocate-and-stamp construction
Section titled “Allocate-and-stamp construction”A constructor must (a) allocate the right number of bytes and (b) write
the correct tag. PostgreSQL’s makeNode(T) is a macro that expands to
newNode(sizeof(T), T_T), which palloc0s (zero-filled) and sets the
tag. Zero-fill matters: every unset field is a clean NULL/0, so the
generic deep-copy and serialize routines never read garbage.
Generic traversal driven by the tag
Section titled “Generic traversal driven by the tag”Copy, compare, serialize, deserialize, expression-walk, and
expression-mutate are all the same shape: read the tag, switch, do
type-specific work, recurse into child nodes. The open question every
implementation faces is whether that per-type work is hand-written or
generated. Hand-written is simple to read but rots; generated needs a
build-time tool but stays correct by construction.
Variable-arity children as a list type
Section titled “Variable-arity children as a list type”Operators with a variable number of children (an Append, an AND of N
quals) hold a list. The list itself is usually a node type so that the
generic copy/compare can recurse into it uniformly. PostgreSQL’s List
is a node (T_List), and copyObject/equal special-case it rather
than generating per-element code.
Theory ↔ PostgreSQL mapping
Section titled “Theory ↔ PostgreSQL mapping”| Concept (DSC ch. 15–16 / Lisp heritage) | PostgreSQL realization |
|---|---|
| Tagged cell / discriminator | NodeTag type as first field of every node (nodes.h) |
cons/car/cdr list, later an array | List node + ListCell union; lappend/lfirst/foreach (pg_list.h, list.c) |
| Single inheritance | struct-prefix embedding: SeqScan ⊃ Scan ⊃ Plan ⊃ NodeTag (plannodes.h) |
| Constructor | makeNode(T) → newNode (palloc0 + tag) (nodes.h) |
| Generic deep copy | copyObject() → copyObjectImpl → tag switch (copyfuncs.c) |
| Generic structural equality | equal() → tag switch (equalfuncs.c) |
| Serialize / deserialize | nodeToString/outNode, stringToNode/nodeRead (outfuncs.c, readfuncs.c) |
| Keeping plumbing in lock-step with structs | gen_node_support.pl emits the _copyT/_equalT/_outT/_readT from the headers |
PostgreSQL’s Approach
Section titled “PostgreSQL’s Approach”PostgreSQL’s node system is best read as one invariant plus one code generator. The tag threads every representation of a statement through the backend, and the generated support functions are what let each stage hand a tree to the next generically:
flowchart LR
SQL["SQL text"] -->|gram.y / raw parse| RAW["RawStmt<br/>SelectStmt, A_Expr, ColumnRef ..."]
RAW -->|analyze| Q["Query node<br/>rtable, jointree, targetList"]
Q -->|rewriter: copyObject + equal| Q2["Query (rewritten)"]
Q2 -->|planner| PLAN["PlannedStmt<br/>Plan tree: SeqScan / NestLoop / Agg"]
PLAN -->|nodeToString| WIRE["serialized text<br/>shipped to parallel worker"]
WIRE -->|stringToNode| PLAN2["Plan tree (worker copy)"]
PLAN -->|ExecInitNode| EST["PlanState tree<br/>executor state (nodetag_only)"]
classDef gen fill:#eef,stroke:#446;
class Q,Q2,PLAN,PLAN2 gen;
Every box above is a node tree; every arrow that copies, compares, or serializes is one of the four generated passes.
The invariant: the first field of every node is a NodeTag. Everything
else — casting, dispatch, inheritance, construction — is a consequence.
From nodes.h:
// Node, nodeTag, makeNode — src/include/nodes/nodes.htypedef struct Node{ NodeTag type;} Node;
#define nodeTag(nodeptr) (((const Node*)(nodeptr))->type)
static inline Node *newNode(size_t size, NodeTag tag){ Node *result;
Assert(size >= sizeof(Node)); /* need the tag, at least */ result = (Node *) palloc0(size); result->type = tag; return result;}
#define makeNode(_type_) ((_type_ *) newNode(sizeof(_type_),T_##_type_))#define IsA(nodeptr,_type_) (nodeTag(nodeptr) == T_##_type_)makeNode(Query) therefore allocates sizeof(Query) bytes, zero-fills
them, and writes T_Query into the first word. IsA(p, Query) is a
single integer compare. castNode(Query, p) is the same compare wrapped
in an Assert (only in assert-enabled builds), giving a typed pointer
with a cheap runtime type check for free in debug builds.
The NodeTag enum is a generated, ABI-frozen list
Section titled “The NodeTag enum is a generated, ABI-frozen list”NodeTag is not hand-maintained. nodes.h opens the enum and then
#includes a generated file:
// NodeTag — src/include/nodes/nodes.htypedef enum NodeTag{ T_Invalid = 0,
#include "nodes/nodetags.h"} NodeTag;nodetags.h is produced by gen_node_support.pl. Because the numeric
value of every tag is implied by its position in this list, inserting or
removing a node type renumbers everything after it — harmless during
development (the numbers never hit disk for plan nodes), but an ABI break
for extensions in a released branch. The generator therefore carries a
stable-branch guard: in REL_18 it asserts that the last auto-numbered tag
is WindowObjectData at number 479, and pg_node_attr(nodetag_number(N))
exists precisely to let a back-patched node claim a new number without
shifting the existing ones.
Inheritance: the plan hierarchy as struct prefixes
Section titled “Inheritance: the plan hierarchy as struct prefixes”The whole plan-node family is built by embedding. Plan is the abstract
base; Scan embeds a Plan; SeqScan embeds a Scan; Join embeds a
Plan; NestLoop embeds a Join:
// Plan, Scan, SeqScan, Join, NestLoop — src/include/nodes/plannodes.htypedef struct Plan{ pg_node_attr(abstract, no_equal, no_query_jumble) NodeTag type; int disabled_nodes; Cost startup_cost; /* ... total_cost, plan_rows, targetlist, qual, lefttree, righttree ... */} Plan;
typedef struct Scan{ pg_node_attr(abstract) Plan plan; Index scanrelid; /* index into the range table */} Scan;
typedef struct SeqScan{ Scan scan;} SeqScan;
typedef struct Join{ pg_node_attr(abstract) Plan plan; JoinType jointype; bool inner_unique; List *joinqual;} Join;A SeqScan * can be cast to Scan *, Plan *, or Node *; each cast
sees the right first field because NodeTag stays at offset 0 through the
whole chain. The pg_node_attr(abstract) markers tell the generator not
to emit a T_Scan/T_Plan/T_Join tag — these are supertypes you can
never instantiate, only inherit from. Plan additionally carries
no_equal, no_query_jumble: plan nodes are copied and serialized (they
ship to parallel workers) but never compared by equal(), so the
generator skips equal support for the entire hierarchy.
The parse-tree root, Query, is a plain node (tag first, no embedding):
// Query — src/include/nodes/parsenodes.htypedef struct Query{ NodeTag type; CmdType commandType; /* select|insert|update|delete|merge|utility */ QuerySource querySource pg_node_attr(query_jumble_ignore); /* ... rtable, jointree, targetList, ... */} Query;Note the field-level pg_node_attr(query_jumble_ignore): attributes
attach either to a struct (after the opening brace) or to a single field
(end of its line), and they steer exactly one of the four generated
passes.
List is a tagged expansible array, not a linked list
Section titled “List is a tagged expansible array, not a linked list”The name “List” and the API (lappend, lcons, lfirst, foreach) are
Lisp inheritance, but since v13 the representation is a flat, expansible
array. The header comment in pg_list.h is explicit that this is a
deliberate interface-preserving rewrite. The struct:
// ListCell, List — src/include/nodes/pg_list.htypedef union ListCell{ void *ptr_value; int int_value; Oid oid_value; TransactionId xid_value;} ListCell;
typedef struct List{ NodeTag type; /* T_List, T_IntList, T_OidList, or T_XidList */ int length; /* number of elements currently present */ int max_length; /* allocated length of elements[] */ ListCell *elements; /* re-allocatable array of cells */ ListCell initial_elements[FLEXIBLE_ARRAY_MEMBER];} List;
#define NIL ((List *) NULL)Four facts encode the whole design:
Listis a node — its first field is aNodeTag, so it dispatches throughcopyObject/equallike any other node.- The empty list is
NIL == NULL. A non-NIL list always haslength >= 1; there is no valid empty-but-non-NULL list. This is the one concession kept from the cons-cell era. - One union, four flavours. A cell holds a pointer, an int, an Oid, or
an Xid; the
typefield says which. Pointer lists (T_List) usually holdNode*, but the cell is declaredvoid *to spare callers a cast. - Header and first cells share one palloc.
initial_elements[]is a flexible array member co-allocated with the header, so a short list is one allocation andelementspoints back into the header block until the list outgrows it.
foreach is the canonical traversal, and it is now an index walk over the
array, not a pointer chase:
// foreach — src/include/nodes/pg_list.h#define foreach(cell, lst) \ for (ForEachState cell##__state = {(lst), 0}; \ (cell##__state.l != NIL && \ cell##__state.i < cell##__state.l->length) ? \ (cell = &cell##__state.l->elements[cell##__state.i], true) : \ (cell = NULL, false); \ cell##__state.i++)lfirst(cell) reads cell->ptr_value; lfirst_int/lfirst_oid read the
other union members. list_length(l) is l ? l->length : 0 — NIL-safe by
construction. The lesson: the array rewrite made indexed access O(1) and
iteration cache-friendly, but because callers only ever touched the macro
API, almost no call sites changed.
Source Walkthrough
Section titled “Source Walkthrough”Anchor on symbol names, not line numbers. Use
git grep -n '<symbol>' src/backend/nodes/to locate the current position; the position-hint table at the end of this section records the line numbers observed at commit273fe94(REL_18_STABLE).
The generic deep copy: copyObject and the tag switch
Section titled “The generic deep copy: copyObject and the tag switch”copyObject() is a type-preserving macro over copyObjectImpl, the single
generic entry point. The whole of the hand-written copyfuncs.c is: the
field-copy macros, a few custom_copy_equal routines, and this dispatch:
// copyObjectImpl — src/backend/nodes/copyfuncs.cvoid *copyObjectImpl(const void *from){ void *retval;
if (from == NULL) return NULL;
/* Guard against stack overflow due to overly complex expressions */ check_stack_depth();
switch (nodeTag(from)) {#include "copyfuncs.switch.c" /* generated: case T_Foo: ... _copyFoo(from) */
case T_List: retval = list_copy_deep(from); break;
case T_IntList: case T_OidList: case T_XidList: retval = list_copy(from); /* scalar cells: shallow is enough */ break;
default: elog(ERROR, "unrecognized node type: %d", (int) nodeTag(from)); retval = 0; /* keep compiler quiet */ break; } return retval;}Three things to notice. First, NULL copies to NULL — so a routine
can copy an optional child without a guard, which is why the palloc0
zero-fill in makeNode is load-bearing. Second, the big switch body is
#included from copyfuncs.switch.c, a generated file: one
case T_Foo: retval = _copyFoo(from); break; per copy-supporting node.
Third, List is special-cased: pointer lists deep-copy each element
(itself a copyObjectImpl call), while int/Oid/Xid lists only need a
shallow memcpy of cells because the cells hold values, not pointers.
The per-node copy routines live in copyfuncs.funcs.c (also generated)
and use the field macros. The macros hard-wire the local names newnode
and from:
// field-copy macros — src/backend/nodes/copyfuncs.c#define COPY_SCALAR_FIELD(fldname) \ (newnode->fldname = from->fldname)
#define COPY_NODE_FIELD(fldname) \ (newnode->fldname = copyObjectImpl(from->fldname)) /* recurse */
#define COPY_STRING_FIELD(fldname) \ (newnode->fldname = from->fldname ? pstrdup(from->fldname) : NULL)
#define COPY_BITMAPSET_FIELD(fldname) \ (newnode->fldname = bms_copy(from->fldname))
#define COPY_POINTER_FIELD(fldname, sz) \ do { \ Size _size = (sz); \ if (_size > 0) \ { \ newnode->fldname = palloc(_size); \ memcpy(newnode->fldname, from->fldname, _size); \ } \ } while (0)A scalar field is copied by value; a node-pointer field recurses through
copyObjectImpl (this is what makes the copy deep); a string is
pstrdup’d; a Bitmapset is bms_copy’d; a sized buffer is palloc +
memcpy. The generator’s job is purely to pick the right macro per field
from the field’s C type — which is the subject of the codegen subsection.
The recursion of a single copyObject(query) through the tag dispatch and
the List special-case looks like this:
flowchart TD
A["copyObject(node)"] --> B["copyObjectImpl(from)"]
B --> C{"from == NULL?"}
C -->|yes| D["return NULL"]
C -->|no| E["switch nodeTag(from)"]
E -->|T_Query, T_SeqScan, ...| F["_copyFoo(from)<br/>generated, field by field"]
E -->|T_List| G["list_copy_deep"]
E -->|T_IntList / T_OidList / T_XidList| H["list_copy<br/>shallow memcpy of cells"]
E -->|unknown| I["elog ERROR unrecognized node type"]
F -->|COPY_NODE_FIELD on each child| B
G -->|copyObjectImpl on each element| B
The two back-edges (COPY_NODE_FIELD and the per-element loop in
list_copy_deep) are the recursion that makes the copy deep.
Custom copy: when generated code is not enough
Section titled “Custom copy: when generated code is not enough”A handful of node types cannot be copied field-by-field and carry
pg_node_attr(custom_copy_equal), telling the generator to emit only the
case and call a hand-written routine. Const is the archetype, because
a by-reference datum needs a true value copy, not a pointer copy:
// _copyConst — src/backend/nodes/copyfuncs.cstatic Const *_copyConst(const Const *from){ Const *newnode = makeNode(Const);
COPY_SCALAR_FIELD(consttype); COPY_SCALAR_FIELD(consttypmod); COPY_SCALAR_FIELD(constcollid); COPY_SCALAR_FIELD(constlen);
if (from->constbyval || from->constisnull) newnode->constvalue = from->constvalue; /* by value: just copy */ else newnode->constvalue = datumCopy(from->constvalue, /* by ref: clone */ from->constbyval, from->constlen);
COPY_SCALAR_FIELD(constisnull); COPY_SCALAR_FIELD(constbyval); COPY_LOCATION_FIELD(location); return newnode;}ExtensibleNode is the other instructive custom case: it dispatches
through a registered ExtensibleNodeMethods vtable (methods->nodeCopy),
the one place the node system does fall back to virtual-method-style
dispatch — for node types defined by extensions, which the build-time
generator cannot see.
The generic structural compare: equal
Section titled “The generic structural compare: equal”equal() mirrors copyObjectImpl exactly, but answers a boolean and never
allocates. Its dispatch:
// equal — src/backend/nodes/equalfuncs.cboolequal(const void *a, const void *b){ bool retval;
if (a == b) return true; /* same pointer (covers NULL==NULL) */ if (a == NULL || b == NULL) return false; /* exactly one NULL */ if (nodeTag(a) != nodeTag(b)) return false; /* different concrete types */
check_stack_depth();
switch (nodeTag(a)) {#include "equalfuncs.switch.c" /* generated: case T_Foo: _equalFoo(a,b) */
case T_List: case T_IntList: case T_OidList: case T_XidList: retval = _equalList(a, b); break;
default: elog(ERROR, "unrecognized node type: %d", (int) nodeTag(a)); retval = false; break; } return retval;}The compare macros parallel the copy macros — COMPARE_SCALAR_FIELD,
COMPARE_NODE_FIELD (recurses via equal), COMPARE_STRING_FIELD
(strcmp, NULL-aware), COMPARE_BITMAPSET_FIELD (bms_equal) — and each
returns false on the first mismatch. The most important macro is the one
that does nothing:
// COMPARE_LOCATION_FIELD — src/backend/nodes/equalfuncs.c/* Compare a parse location field (this is a no-op, per note above) */#define COMPARE_LOCATION_FIELD(fldname) \ ((void) 0)Parse-location fields (the byte offset into the source text) are
deliberately not compared, so two references to column x at different
places in the query string are equal(). This is the whole reason
ParseLoc is its own typedef and location-named integer fields get
special generator treatment: location is metadata, not identity.
List is special-cased the same way it is in copy — _equalList first
rejects on differing type or length (cheap scalar checks), then walks
both arrays in lockstep with forboth, comparing pointer cells with
equal() and scalar cells with !=:
// _equalList (pointer-list arm) — src/backend/nodes/equalfuncs.cCOMPARE_SCALAR_FIELD(type);COMPARE_SCALAR_FIELD(length);switch (a->type){ case T_List: forboth(item_a, a, item_b, b) { if (!equal(lfirst(item_a), lfirst(item_b))) return false; } break; case T_IntList: forboth(item_a, a, item_b, b) { if (lfirst_int(item_a) != lfirst_int(item_b)) return false; } break; /* ... T_OidList, T_XidList ... */}List internals: allocate-with-slack, enlarge, deep-copy
Section titled “List internals: allocate-with-slack, enlarge, deep-copy”new_list is where the co-allocation trick lives. It rounds the cell
count up to a power of two so that palloc (which rounds to powers of two
anyway) hands back usable slack, and points elements at the inline
initial_elements[]:
// new_list — src/backend/nodes/list.cstatic List *new_list(NodeTag type, int min_size){ List *newlist; int max_size;
Assert(min_size > 0);
/* round the allocation up to a power of two for free slack */ max_size = pg_nextpower2_32(Max(8, min_size + LIST_HEADER_OVERHEAD)); max_size -= LIST_HEADER_OVERHEAD;
newlist = (List *) palloc(offsetof(List, initial_elements) + max_size * sizeof(ListCell)); newlist->type = type; newlist->length = min_size; newlist->max_length = max_size; newlist->elements = newlist->initial_elements; /* still inline */ return newlist;}lappend is the workhorse. It creates a one-cell list from NIL, else
makes room for one more tail cell (enlarging if at capacity) and writes the
datum into the new last cell. The return-value discipline — always use the
returned pointer — exists because enlargement may move the cell array:
// lappend — src/backend/nodes/list.cList *lappend(List *list, void *datum){ Assert(IsPointerList(list));
if (list == NIL) list = new_list(T_List, 1); else new_tail_cell(list); /* enlarge_list() if length == max_length */
llast(list) = datum; check_list_invariants(list); return list;}When the inline cells are exhausted, enlarge_list migrates from the
co-allocated initial_elements[] to a separately MemoryContextAlloc’d
array (memcpy’ing the live cells), or repallocs an already-separate
array — in both cases growing to the next power of two. After that point
the header stays put (so a retained List* is still valid) but the cell
array has moved (so retained ListCell* pointers are not). list_concat
appends one array onto another with a single memcpy; list_copy is a
shallow memcpy of all cells; and list_copy_deep is the pointer-list
deep copy that copyObjectImpl calls for T_List:
// list_copy_deep — src/backend/nodes/list.cList *list_copy_deep(const List *oldlist){ List *newlist;
if (oldlist == NIL) return NIL;
Assert(IsA(oldlist, List)); /* only sensible for pointer Lists */
newlist = new_list(oldlist->type, oldlist->length); for (int i = 0; i < newlist->length; i++) lfirst(&newlist->elements[i]) = copyObjectImpl(lfirst(&oldlist->elements[i])); /* recurse */
check_list_invariants(newlist); return newlist;}This closes the recursion: a node’s COPY_NODE_FIELD on a List* calls
copyObjectImpl, which routes T_List to list_copy_deep, which calls
copyObjectImpl on every element — so a target list of TargetEntry
nodes, each holding an expression subtree, is cloned in full by a single
top-level copyObject(query).
How the plumbing is generated: gen_node_support.pl
Section titled “How the plumbing is generated: gen_node_support.pl”The per-node _copyT/_equalT bodies and the switch.c files are emitted
at build time by gen_node_support.pl, which parses a fixed, ordered list
of node header files:
# canonical, order-stable input list — src/backend/nodes/gen_node_support.plmy @all_input_files = qw( nodes/nodes.h nodes/primnodes.h nodes/parsenodes.h nodes/pathnodes.h nodes/plannodes.h nodes/execnodes.h # ... access/*, commands/*, executor/tuptable.h, foreign/fdwapi.h, nodes/value.h ...);The order is load-bearing: it fixes the NodeTag numbering, and the script
back-stops that with the stable-branch ABI guard:
# ABI stability guard (REL_18) — src/backend/nodes/gen_node_support.plmy $last_nodetag = 'WindowObjectData';my $last_nodetag_no = 479;For each struct, the generator walks its fields and selects a copy/compare
macro from the field’s C type, modulated by any pg_node_attr. The
core of the type dispatch:
# field-type → macro selection (abridged) — gen_node_support.plif ($t eq 'char*') { print $cff "\tCOPY_STRING_FIELD($f);\n" unless $copy_ignore; print $eff "\tCOMPARE_STRING_FIELD($f);\n" unless $equal_ignore;}elsif ($t eq 'Bitmapset*' || $t eq 'Relids') { print $cff "\tCOPY_BITMAPSET_FIELD($f);\n" unless $copy_ignore; print $eff "\tCOMPARE_BITMAPSET_FIELD($f);\n" unless $equal_ignore;}elsif ($t eq 'ParseLoc') { print $cff "\tCOPY_LOCATION_FIELD($f);\n" unless $copy_ignore; print $eff "\tCOMPARE_LOCATION_FIELD($f);\n" unless $equal_ignore;}elsif (elem $t, @scalar_types or elem $t, @enum_types) { print $cff "\tCOPY_SCALAR_FIELD($f);\n" unless $copy_ignore; # ... COMPARE_SCALAR_FIELD, or equal_ignore_if_zero handling ...}# node type: pointer to a known node → recurseelsif (($t =~ /^(\w+)\*$/ or $t =~ /^struct\s+(\w+)\*$/) and elem $1, @node_types) { print $cff "\tCOPY_NODE_FIELD($f);\n" unless $copy_ignore; print $eff "\tCOMPARE_NODE_FIELD($f);\n" unless $equal_ignore;}So a char * becomes COPY_STRING_FIELD, a ParseLoc becomes the no-op
location handling, a pointer-to-known-node becomes the recursive
COPY_NODE_FIELD, and a plain scalar/enum becomes COPY_SCALAR_FIELD. Per
field, pg_node_attr can override: copy_as_scalar forces a pointer to be
copied shallowly, equal_ignore drops a field from comparison,
equal_ignore_if_zero ignores it only when zero, array_size(other) says
“this is a palloc’d array whose length is in field other” so the
generator emits a COPY_POINTER_FIELD sized from that companion field. Per
struct, no_copy/no_equal/nodetag_only/custom_copy_equal decide
whether any body is emitted at all. The dispatch case itself is gated the
same way:
# emit the switch case unless the struct opts out — gen_node_support.plprint $cfs "\t\tcase T_${n}:\n" . "\t\t\tretval = _copy${n}(from);\n" . "\t\t\tbreak;\n" unless $struct_no_copy;The generated _copySeqScan therefore looks like this (reconstructed from
the generator’s rules; the actual file is copyfuncs.funcs.c, produced
only in a built tree). It walks the embedded Scan→Plan fields because
inheritance is just a prefix the generator flattens:
// _copySeqScan (generated form) — copyfuncs.funcs.cstatic SeqScan *_copySeqScan(const SeqScan *from){ SeqScan *newnode = makeNode(SeqScan);
/* fields flattened from Plan, then Scan */ COPY_SCALAR_FIELD(scan.plan.disabled_nodes); COPY_SCALAR_FIELD(scan.plan.startup_cost); /* ... total_cost, plan_rows, plan_width ... */ COPY_NODE_FIELD(scan.plan.targetlist); COPY_NODE_FIELD(scan.plan.qual); COPY_NODE_FIELD(scan.plan.lefttree); COPY_NODE_FIELD(scan.plan.righttree); COPY_SCALAR_FIELD(scan.scanrelid);
return newnode;}Position hints (as of 2026-06-05, REL_18 273fe94)
Section titled “Position hints (as of 2026-06-05, REL_18 273fe94)”Anchor on symbol names, not line numbers — names are the stable handle across releases. Locate the current position with
git grep -n '<symbol>' src/backend/nodes/. Line numbers below were observed at commit273fe94(REL_18_STABLE) and are quick hints only.
| Symbol | File | Line |
|---|---|---|
typedef enum NodeTag | src/include/nodes/nodes.h | 26 |
typedef struct Node | src/include/nodes/nodes.h | 134 |
newNode | src/include/nodes/nodes.h | 150 |
makeNode (macro) | src/include/nodes/nodes.h | 161 |
IsA (macro) | src/include/nodes/nodes.h | 164 |
castNode (macro) | src/include/nodes/nodes.h | 180 |
copyObject (macro) | src/include/nodes/nodes.h | 228 |
extern bool equal | src/include/nodes/nodes.h | 236 |
typedef int ParseLoc | src/include/nodes/nodes.h | 246 |
typedef enum CmdType | src/include/nodes/nodes.h | 268 |
typedef union ListCell | src/include/nodes/pg_list.h | 45 |
typedef struct List | src/include/nodes/pg_list.h | 53 |
#define NIL | src/include/nodes/pg_list.h | 68 |
#define lfirst | src/include/nodes/pg_list.h | 172 |
#define list_make1 | src/include/nodes/pg_list.h | 212 |
#define foreach | src/include/nodes/pg_list.h | 373 |
new_list | src/backend/nodes/list.c | 91 |
enlarge_list | src/backend/nodes/list.c | 155 |
list_make1_impl | src/backend/nodes/list.c | 236 |
lappend | src/backend/nodes/list.c | 339 |
lcons | src/backend/nodes/list.c | 495 |
list_concat | src/backend/nodes/list.c | 561 |
list_member | src/backend/nodes/list.c | 661 |
list_copy | src/backend/nodes/list.c | 1573 |
list_copy_deep | src/backend/nodes/list.c | 1639 |
#include "copyfuncs.funcs.c" | src/backend/nodes/copyfuncs.c | 65 |
_copyConst | src/backend/nodes/copyfuncs.c | 73 |
_copyExtensibleNode | src/backend/nodes/copyfuncs.c | 147 |
copyObjectImpl | src/backend/nodes/copyfuncs.c | 177 |
#include "copyfuncs.switch.c" | src/backend/nodes/copyfuncs.c | 189 |
#include "equalfuncs.funcs.c" | src/backend/nodes/equalfuncs.c | 88 |
_equalConst | src/backend/nodes/equalfuncs.c | 96 |
_equalList | src/backend/nodes/equalfuncs.c | 156 |
equal | src/backend/nodes/equalfuncs.c | 223 |
#include "equalfuncs.switch.c" | src/backend/nodes/equalfuncs.c | 247 |
@all_input_files | src/backend/nodes/gen_node_support.pl | 53 |
$last_nodetag (ABI guard) | src/backend/nodes/gen_node_support.pl | 110 |
typedef struct PlannedStmt | src/include/nodes/plannodes.h | 46 |
typedef struct Plan | src/include/nodes/plannodes.h | 158 |
typedef struct Scan | src/include/nodes/plannodes.h | 470 |
typedef struct Join | src/include/nodes/plannodes.h | 916 |
typedef struct Query | src/include/nodes/parsenodes.h | 117 |
Source verification (as of 2026-06-05)
Section titled “Source verification (as of 2026-06-05)”Verified against /data/hgryoo/references/postgres at commit 273fe94
(REL_18_STABLE). Method: read each cited file directly; symbol line numbers
from grep -n.
- The NodeTag-first invariant.
struct Nodeinnodes.his exactly{ NodeTag type; };nodeTag()casts toconst Node *and reads->type;makeNodeexpands tonewNode(sizeof(T), T_T)whichpalloc0s and stamps the tag. Confirmed verbatim. - NodeTag is generated and ABI-guarded.
nodes.hline 30#include "nodes/nodetags.h"inside the enum;gen_node_support.plpins$last_nodetag = 'WindowObjectData'/$last_nodetag_no = 479for REL_18. TheT_XLOG2rmgr andB_DATACHECKSUMSWORKER_*symbols (forbidden by task scope) do not appear here — confirmed not present. - Inheritance by prefix.
Plancarriespg_node_attr(abstract, no_equal, no_query_jumble)withNodeTag typefirst;ScanembedsPlanand isabstract;SeqScanembedsScan;JoinembedsPlan;NestLoopembedsJoin. Confirmed inplannodes.h. - List representation.
ListCellis the four-member union andListis{ NodeTag; int length; int max_length; ListCell *elements; ListCell initial_elements[FLEXIBLE_ARRAY_MEMBER]; };NILis((List *) NULL).foreachis an index walk. Confirmed inpg_list.h. - Dispatch shape.
copyObjectImplandequalboth#includea generated*.switch.c, special-case the four list tags, andelog(ERROR, "unrecognized node type")in the default arm. Confirmed incopyfuncs.c/equalfuncs.c. - Custom copy/equal.
_copyConstusesdatumCopyfor by-reference datums;_copyExtensibleNode/_equalExtensibleNodedispatch throughExtensibleNodeMethods._equalConstusesdatumIsEqual. Confirmed. - Location no-op.
COMPARE_LOCATION_FIELD(fldname)is((void) 0); the file header comment explains why (soxequalsx). Confirmed. - List internals.
new_listrounds topg_nextpower2_32, pointselementsatinitial_elements;lappendreturns a possibly-moved list;enlarge_listmigrates inline→separate allocation;list_copyis a shallowmemcpy;list_copy_deeprecurses viacopyObjectImpl. Confirmed inlist.c. - Codegen.
@all_input_fileslists the headers in fixed order; the field-type→macro dispatch and thecase T_${n}emission are as quoted. Confirmed ingen_node_support.pl. - Caveat — generated artifacts.
copyfuncs.funcs.c,copyfuncs.switch.c,equalfuncs.funcs.c,equalfuncs.switch.c, andnodetags.hexist only in a built tree; the reference checkout is clean, so the_copySeqScanexcerpt is shown as a reconstructed generated form derived from the generator’s documented rules, not a byte-for-byte file capture. Every hand-written excerpt is verbatim.
Beyond PostgreSQL — Comparative Designs & Research Frontiers
Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”Where the node system sits in the backend. This document deliberately
stops at the representation and its generic plumbing. The producers and
consumers of these trees are covered elsewhere: the raw-parse and analysis
stages that build the parse/Query tree are in
postgres-parser.md; the path-and-plan construction
that turns a Query into a PlannedStmt is in
postgres-planner-overview.md and
postgres-path-generation.md; the
PlanState tree that ExecInitNode builds from the Plan tree — the
fourth tree in the first diagram, marked nodetag_only because it needs no
copy/equal/serialize — is in
postgres-executor.md. Expression nodes
(Var, OpExpr, Const) and their compiled evaluation live in
postgres-expression-eval.md.
Codegen vs. hand-maintenance. PostgreSQL’s choice — keep a tagged-union
representation but generate the copy/equal/out/read passes — is the
interesting design point. Before v15 these files were hand-written and a
recurring source of “forgot to copy the new field” bugs; the move to
gen_node_support.pl made the struct definition the single source of
truth and turned a class of latent aliasing bugs into compile errors. The
trade is a Perl build dependency and a layer of indirection (you read the
generator and the pg_node_attr markers, not the emitted code). Other
systems make different cuts:
- C++ engines (DuckDB, ClickHouse). Use real class hierarchies with
virtual
Copy()/Equals()/Serialize()methods. The compiler’s vtable replaces the tagswitch, and RTTI/dynamic_castreplacesIsA. Serialization for distributed/parallel execution is still written or generated per class, so the underlying problem PostgreSQL’s codegen solves does not vanish — it moves into a serialization framework. - Apache Calcite (JVM). Represents relational operators as
RelNodeobjects and immutableRexNodeexpressions, with structural equality via “digests” and copy-on-write transformation rules. Immutability sidesteps the deep-copy-before-mutate discipline that PostgreSQL enforces manually withcopyObject. - Substrait / protobuf plan IRs. The modern cross-engine trend is to
define the plan tree in a schema language (protobuf) and let the
serializer/deserializer be generated from the schema — essentially
PostgreSQL’s
gen_node_support.plidea generalized to a portable, language-neutral IR, so a plan produced by one engine can be executed by another.
The List rewrite as a case study. The v13 migration from cons cells to
expansible arrays (visible in this code as initial_elements[],
enlarge_list, and the index-based foreach) is a widely-cited example of
changing a data structure’s representation under a stable API for cache
locality. The cost it imposes — ListCell pointers are invalidated by
growth, so the codebase had to be audited for retained-cell assumptions
(DEBUG_LIST_MEMORY_USAGE exists precisely to flush those out) — is the
recurring tax of array-backed sequences and is the same reason
lappend/lcons insist callers use the return value.
Frontiers. (1) ABI-stable node evolution — the
nodetag_number(VALUE) attribute and the $last_nodetag guard are a
pragmatic answer to “add a node type in a minor release without breaking
extensions”; a cleaner answer (decoupling tag identity from enum position)
remains open. (2) Whole-tree interning / structural sharing — equal()
is O(tree size) and the planner calls it a lot; engines that hash-cons or
intern subexpressions get O(1) equality at the cost of a global table,
which PostgreSQL has so far declined for its memory-context simplicity.
(3) Schema-driven IRs — whether a future PostgreSQL plan format could be
emitted in a portable IR (for cross-version or cross-engine execution)
without losing the tight palloc/memory-context coupling that makes the
current node system fast.
Sources
Section titled “Sources”src/backend/nodes/README— the canonical overview: node concept, inheritance-by-convention, which support functions each node class needs, and the “Steps to Add a Node” checklist.src/include/nodes/nodes.h—NodeTag,Node,newNode,makeNode,IsA,castNode,copyObject/equaldeclarations,pg_node_attrdocumentation,ParseLoc.src/include/nodes/pg_list.h—List/ListCell,NIL, thelfirst/linitial/llastaccessors, and theforeachfamily.src/backend/nodes/list.c—new_list,enlarge_list,lappend/lcons,list_concat,list_copy,list_copy_deep,list_member.src/backend/nodes/copyfuncs.c— copy macros,copyObjectImpldispatch,_copyConst/_copyExtensibleNodecustom routines.src/backend/nodes/equalfuncs.c— compare macros (incl. the location/coercionform no-ops),equaldispatch,_equalList,_equalConst.src/backend/nodes/gen_node_support.pl— the build-time generator: input-file order, ABI guard, field-type→macro dispatch, switch emission.src/include/nodes/plannodes.h,src/include/nodes/parsenodes.h— thePlan/Scan/SeqScan/Join/NestLoopinheritance chain,PlannedStmt, andQuery.- Database System Concepts, Silberschatz et al., 7e, ch. 15–16 — query
representation and the intermediate-tree pipeline (theory anchor; see
.omc/plans/postgres-paper-bibliography.md). - Cross-references:
postgres-parser.md,postgres-planner-overview.md,postgres-executor.md,postgres-expression-eval.md.