PostgreSQL GiST — Generalized Search Trees and Opclass Support Functions
Contents:
- Theoretical Background
- Common DBMS Design
- PostgreSQL’s Approach
- Source Walkthrough
- Source verification (as of 2026-06-05)
- Beyond PostgreSQL — Comparative Designs & Research Frontiers
- Sources
Theoretical Background
Section titled “Theoretical Background”A B-tree (postgres-nbtree.md) is the right index for data that has a
total order: integers, strings, timestamps. Its entire machinery — the
separator keys, the “move right to recover,” the binary search inside a page
— rests on the assumption that any two keys can be compared with <. But a
vast amount of interesting data has no natural total order. A geometric
box is not “less than” another box; a point in the plane has no single
linear rank that preserves nearness; an IP subnet, a range of integers, a
tsvector of lexemes, a trigram set of a string — none of these sit on a
number line. What these datatypes do have is a notion of containment
or overlap: a query box overlaps an indexed box; a query point is
contained in an indexed bounding rectangle; a query range is contained in an
indexed range. An index that could prune the search tree using overlap
rather than order would serve all of them.
The structure that generalizes the B-tree from order to overlap is the
R-tree (Guttman 1984; the variant PostgreSQL effectively implements for
spatial keys is closer to the R*-tree of Beckmann, Kriegel, Schneider &
Seeger 1990, captured in dbms-papers/rstar-tree.md). An R-tree is a
height-balanced tree in which every entry is a bounding predicate — for
spatial data, a minimum bounding rectangle (MBR) — paired with a pointer. A
leaf entry’s MBR bounds a single object; an internal entry’s MBR bounds the
union of all MBRs in the subtree it points at. The two invariants that make
it a search tree are:
- Coverage. An internal entry’s predicate covers (contains) the predicates of every child beneath it. A query predicate that is disjoint from an internal entry can therefore safely skip the entire subtree.
- Balance. All leaves are at the same depth, so lookups, inserts, and
the R*-tree’s forced-reinsertion stay
O(log N).
Unlike a B-tree, an R-tree’s bounding predicates may overlap between siblings, so a point query can descend into more than one subtree. The quality of an R-tree — how little it overlaps, how tightly its MBRs hug the data — is determined entirely by two heuristics: the insert path choice (“which child’s MBR do I enlarge least to swallow the new entry?”) and the node split (“when a page overflows, how do I partition its entries into two groups that minimize total area / overlap?”). Guttman’s quadratic and linear split heuristics, and the R*-tree’s later margin-minimizing split plus forced reinsertion, are exactly these two knobs tuned for rectangles in the plane.
The conceptual leap that produced GiST is the observation that the tree
skeleton, the balance maintenance, the concurrency protocol, and the
crash-recovery logic do not depend on the keys being rectangles at all.
Hellerstein, Naughton & Pfeffer 1995, “Generalized Search Trees for
Database Systems” (SIGMOD; the foundational reference cited at the top of
src/backend/access/gist/README) made this explicit. Their thesis: a B-tree
and an R-tree are both instances of one balanced, multiway, bounding-
predicate tree that differs only in (a) what a “key” is and (b) four
abstract operations on keys. Pull those four operations out into a
user-supplied interface and you get a single, reusable index template
that an extension author can specialize for any datatype with a notion of
containment — not just rectangles, but ranges, full-text, network types,
even exotica an extension invents. The four canonical GiST methods, in the
1995 paper’s vocabulary, are:
- Consistent(E, q) — given an index entry
E(a predicate) and a query predicateq, return false only ifEandqcan provably never both hold for any object. This is the pruning primitive: a false lets the search skipE’s subtree; a true (possibly over-cautious) forces descent. - Union(E1..En) — return a single predicate that holds for every object
satisfying any of
E1..En. This is what an internal downlink stores: the bounding predicate of its children. - Penalty(E1, E2) — a number measuring the cost of inserting
E2into the subtree rooted atE1(typically the increase inE1’s “size,” e.g. area enlargement for a rectangle). This drives the insert path choice. - PickSplit(P) — given an overfull set of entries
P, partition them into two groups. This drives the node split.
Two more methods round out the practical interface (the paper hints at
them, PostgreSQL formalizes them): Compress/Decompress, which let the
on-disk key be a lossy or compact representation of the indexed value (an
MBR instead of a polygon), and Equal, used during downlink maintenance
to detect when a recomputed union is unchanged. PostgreSQL adds Distance
(for nearest-neighbor ORDER BY), Fetch (to reconstruct the original
value for index-only scans), Options (opclass-parameter parsing), and
SortSupport (to enable the bottom-up sorted build).
The payoff is enormous leverage: PostgreSQL’s box_ops, range_ops,
inet_ops, tsvector_ops, PostGIS’s spatial index, pg_trgm’s
similarity index, and btree_gist’s order-emulating opclasses are all the
same access method, differing only in a dozen small C functions. The cost
is that the access method cannot reason about the keys: it cannot binary-
search a page (no order), it must call a user function for every pruning
decision, and a sloppy opclass (a Consistent that returns true too often,
a PickSplit that produces overlapping groups) silently degrades to a
sequential scan dressed up as an index. GiST trades the B-tree’s
self-contained efficiency for extensibility, and the rest of this document
is about how PostgreSQL makes that trade crash-safe and concurrent.
Common DBMS Design
Section titled “Common DBMS Design”Before reading PostgreSQL’s symbols, it helps to name the recurring design moves that any extensible, overlap-based, balanced tree makes. These are the conventions GiST instantiates.
The “template method” index: skeleton in the engine, semantics in a plug-in
Section titled “The “template method” index: skeleton in the engine, semantics in a plug-in”The defining pattern is the classic Template Method: the database engine
owns the algorithm skeleton (descend, split, propagate, lock, log, recover)
and calls out to a small set of hooks that supply the datatype semantics.
Every extensible index converges on roughly the same hook set — a match
predicate for pruning, a bounding operation for downlinks, a cost
function for path selection, and a partition function for splitting —
because those are exactly the four decision points a balanced bounding-
predicate tree must make and the only four that depend on what a key means.
Oracle’s Extensible Indexing (the ODCIIndex interface), Informix’s
Datablade virtual indexes, and Microsoft’s CLR-based spatial indexes are
all the same idea under different names; GiST is the open-source canonical
form, and the 1995 paper is its specification.
Overlap, not order: the consequences
Section titled “Overlap, not order: the consequences”Because sibling predicates may overlap, three things differ from a B-tree:
- Search may fan out. A single query can match the Consistent test of several children of one page, so the search is not a single root-to-leaf path but a traversal of a frontier. Implementations keep a work queue (or stack) of pages still to visit rather than a single descent pointer.
- There is no high key to “move right” against. A B-tree recovers from a concurrent split by comparing the search key to the page’s high key. With no order, that test is unavailable — so the recovery signal must be made explicit (a sequence number per page) rather than derived from the key.
- Split quality is a heuristic, not a guarantee. A B-tree split is trivially balanced (cut the sorted run in half). An overlap tree’s split must choose a partition that minimizes future false descents, and the choice is datatype-specific — hence PickSplit is a user hook, and a bad one merely produces a slow index, not a wrong one.
Adjust-on-the-way-down inserts
Section titled “Adjust-on-the-way-down inserts”A naive bounding-predicate insert descends to a leaf, inserts, then walks back up enlarging every ancestor’s predicate to cover the new entry — two passes, and a crash mid-way leaves ancestors that under-cover their subtrees. The robust alternative, which PostgreSQL adopts, is to enlarge the chosen downlink during the downward pass: at each internal page, before recursing into the chosen child, replace the child’s downlink with the union of the old downlink and the new key. When the leaf is reached, every ancestor already covers the new entry, so the insert is a single pass and the tree is self-consistent after writing the leaf — no upward fix-up, which is what makes crash recovery simple.
Right-links + a sequence number for lock-light concurrency
Section titled “Right-links + a sequence number for lock-light concurrency”The B-link tree’s great trick (postgres-nbtree.md) — each level is a
right-linked list, a search holds one page latch at a time, and a search that
lands on a just-split page chases the right-link — transfers directly to an
overlap tree, except for the split-detection signal. Lacking a high key,
the overlap tree stamps each page with a node sequence number (NSN): when
a split posts its downlink to the parent, it copies the parent-update’s LSN
into the children’s NSN. A search remembers the parent page’s LSN when it
read the downlink; on reaching the child it compares that remembered LSN to
the child’s NSN. If the child’s NSN is newer, the child split after the
search saw the parent, so the search must also follow the child’s right-link
to the moved-away entries. This is Kornacker’s adaptation of L&Y to GiST
(“Access Methods for Next-Generation Database Systems,” the README’s
concurrency reference), and it is the exact mechanism PostgreSQL implements.
Crash-safe multi-page mutations via WAL-atomic steps
Section titled “Crash-safe multi-page mutations via WAL-atomic steps”As in any WAL engine, a split spanning N pages is decomposed into atomic log records each leaving a valid tree. The canonical decomposition: (1) atomically write the new right page(s), set the old page’s right-link, and flag the old page “my right sibling has no parent downlink yet”; (2) atomically post the downlink(s) to the parent and clear the flag, stamping the NSN. A crash between (1) and (2) leaves a fully searchable tree — the flag tells both searches and the next inserter to chase the right-link — and the next descent repairs the incomplete split lazily.
Bulk-loading: sorted vs. buffered
Section titled “Bulk-loading: sorted vs. buffered”Inserting a billion tuples one at a time into a disk tree is random-I/O bound: each insert touches a root-to-leaf path that is mostly not cached. Two standard accelerations apply. Sorted build (B-tree style): if the keys can be linearized into a sort order that preserves locality, sort them and pack leaves bottom-up, building each parent level from the level below. Buffering build (Arge, Hinrichs, Vahrenhold & Vitter, the README’s build reference): attach in-memory/temp-file buffers to internal nodes, push each incoming tuple only as far down as the first buffered node, and flush a buffer en masse when it fills — converting depth-first random descents into breadth-first batched ones sized to fit the cache.
Theory ↔ PostgreSQL mapping
Section titled “Theory ↔ PostgreSQL mapping”| Theory / convention | PostgreSQL name (gist) |
|---|---|
| Consistent (pruning predicate) | GIST_CONSISTENT_PROC → giststate->consistentFn, called in gistindex_keytest |
| Union (bounding predicate of children) | GIST_UNION_PROC → unionFn, via gistMakeUnionItVec |
| Penalty (insert-path cost) | GIST_PENALTY_PROC → penaltyFn, via gistpenalty / gistchoose |
| PickSplit (node partition) | GIST_PICKSPLIT_PROC → picksplitFn, via gistUserPicksplit / gistSplitByKey |
| Compress / Decompress (on-disk key form) | GIST_COMPRESS_PROC / GIST_DECOMPRESS_PROC → compressFn / decompressFn |
| Equal (unchanged-union detection) | GIST_EQUAL_PROC → equalFn, via gistgetadjusted |
| Distance (k-NN ORDER BY) | GIST_DISTANCE_PROC → distanceFn, k-NN queue in gistget.c |
| Fetch (index-only reconstruction) | GIST_FETCH_PROC → fetchFn, via gistFetchTuple |
| SortSupport (bottom-up build) | GIST_SORTSUPPORT_PROC, gates GIST_SORTED_BUILD |
| Bounding-predicate downlink | inner IndexTuple whose key is a union; t_tid is the child block |
| Adjust-on-the-way-down | gistgetadjusted + gistinserttuple inside gistdoinsert |
| Node sequence number (split signal) | GistPageGetNSN / GistPageSetNSN, GistNSN = XLogRecPtr |
| Right-link | GISTPageOpaqueData.rightlink |
| ”Right sibling lacks downlink” flag | F_FOLLOW_RIGHT (GistFollowRight / GistMarkFollowRight) |
| Split step 1 / step 2 | gistplacetopage (link+flag) / gistfinishsplit (post downlink, set NSN) |
| Lazy incomplete-split repair | gistfixsplit (triggered on F_FOLLOW_RIGHT) |
| Re-find migrated parent | gistFindCorrectParent / gistFindPath |
| Work queue for fan-out search | GISTScanOpaque->queue (a pairingheap) |
| Sorted bulk build | gist_indexsortbuild (GIST_SORTED_BUILD) |
| Buffered bulk build | gistInitBuffering + GISTBuildBuffers (GIST_BUFFERING_ACTIVE) |
| AM dispatch table | gisthandler returns IndexAmRoutine (see postgres-index-am.md) |
PostgreSQL’s Approach
Section titled “PostgreSQL’s Approach”PostgreSQL’s src/backend/access/gist/ is, in the README’s words, “an
implementation of GiST indexing,” concurrency-hardened per Kornacker and
bulk-load-accelerated per Arge et al. The on-disk tree is a height-balanced,
right-linked tree of BLCKSZ pages; the AM owns descent, split, WAL, and
concurrency; an opclass supplies the datatype semantics through up to
twelve support functions. The whole personality of the AM is declared in one
handler.
The AM handler: what GiST can and cannot do
Section titled “The AM handler: what GiST can and cannot do”// gisthandler — src/backend/access/gist/gist.cDatumgisthandler(PG_FUNCTION_ARGS){ IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
amroutine->amstrategies = 0; // strategy numbers are opclass-defined amroutine->amsupport = GISTNProcs; // 12 support procs (consistent..sortsupport) amroutine->amcanorder = false; // no total order -> can't drive ORDER BY col amroutine->amcanorderbyop = true; // CAN drive ORDER BY <-> (k-NN) amroutine->amcanmulticol = true; // multi-column GiST amroutine->amoptionalkey = true; amroutine->amsearchnulls = true; amroutine->amstorage = true; // on-disk key type != indexed type (compress) amroutine->amclusterable = true; amroutine->ampredlocks = true; // SERIALIZABLE predicate locking supported amroutine->amcanparallel = false; amroutine->amcaninclude = true; // INCLUDE columns // ... method pointers ... amroutine->ambuild = gistbuild; amroutine->aminsert = gistinsert; amroutine->ambulkdelete = gistbulkdelete; amroutine->amgettuple = gistgettuple; amroutine->amgetbitmap = gistgetbitmap; amroutine->amcanreturn = gistcanreturn; // index-only scans (if Fetch defined) amroutine->amtranslatecmptype = gisttranslatecmptype; PG_RETURN_POINTER(amroutine);}The two telling flags are amcanorder = false (GiST keys have no total
order, so a GiST index can never satisfy a plain ORDER BY col) and
amcanorderbyop = true (it can satisfy ORDER BY col <-> point, the
k-nearest-neighbor operator, because the Distance support function provides a
lower-bound metric). amstorage = true reflects Compress: the stored key
type may differ from the indexed type. amsupport = GISTNProcs (12) is the
size of the support-function array the opclass may fill.
Caching the opclass: GISTSTATE
Section titled “Caching the opclass: GISTSTATE”Every operation on a GiST index runs against a GISTSTATE, a per-statement
cache of the resolved support-function FmgrInfos, built once by
initGISTstate and stashed in IndexInfo->ii_AmCache:
// initGISTstate — src/backend/access/gist/gist.cfor (i = 0; i < IndexRelationGetNumberOfKeyAttributes(index); i++){ fmgr_info_copy(&(giststate->consistentFn[i]), index_getprocinfo(index, i + 1, GIST_CONSISTENT_PROC), scanCxt); fmgr_info_copy(&(giststate->unionFn[i]), index_getprocinfo(index, i + 1, GIST_UNION_PROC), scanCxt);
/* opclasses are not required to provide a Compress method */ if (OidIsValid(index_getprocid(index, i + 1, GIST_COMPRESS_PROC))) fmgr_info_copy(&(giststate->compressFn[i]), ... , scanCxt); else giststate->compressFn[i].fn_oid = InvalidOid; // ... penalty, picksplit, equal are mandatory; decompress/distance/fetch optional ...}consistent, union, penalty, picksplit, and equal are mandatory;
compress, decompress, distance, and fetch are optional and detected
by OidIsValid. The nonLeafTupdesc is a truncated copy of the index tuple
descriptor without INCLUDE columns, used when forming downlink (internal)
tuples — internal keys carry only the key columns, never the payload.
The two key-shapes: leaf entries vs. downlinks
Section titled “The two key-shapes: leaf entries vs. downlinks”A GiST leaf IndexTuple stores the (possibly Compress-ed) key for one heap
row, with t_tid pointing at the heap tuple. An internal IndexTuple stores
a union key — the bounding predicate of its child page — with t_tid
holding the child block number. The Consistent function is invoked with a
flag telling it whether it is looking at a leaf entry (an exact object) or an
internal entry (a bounding predicate), because “does the query overlap this
box” is asked of both, but the answer’s meaning differs.
Descending by least penalty, adjusting downlinks on the way down
Section titled “Descending by least penalty, adjusting downlinks on the way down”gistdoinsert walks from the root holding one page at a time. At each
internal page it calls gistchoose, which evaluates penaltyFn for every
downlink and returns the offset of least penalty (ties broken with a small
randomization for space/cache balance). Before recursing, it asks whether the
chosen downlink already covers the new key; if not, it enlarges it in place:
// gistdoinsert — src/backend/access/gist/gist.c (internal-page case)downlinkoffnum = gistchoose(state.r, stack->page, itup, giststate);iid = PageGetItemId(stack->page, downlinkoffnum);idxtuple = (IndexTuple) PageGetItem(stack->page, iid);childblkno = ItemPointerGetBlockNumber(&(idxtuple->t_tid));
/* Does the child's downlink already cover the key we're inserting? */newtup = gistgetadjusted(state.r, idxtuple, itup, giststate);if (newtup){ /* No: enlarge the downlink in place (may itself split this page). */ if (gistinserttuple(&state, stack, giststate, newtup, downlinkoffnum)) { /* the parent split; retry from the parent */ }}gistgetadjusted calls unionFn to compute union(old downlink, new key)
and equalFn to check whether the union actually changed; it returns NULL
(skip the update) when the downlink already covers the key. Because this
runs on the downward pass, by the time we reach the leaf every ancestor
covers the new entry — the single-pass property the README emphasizes.
Choosing the subtree: gistchoose / gistpenalty
Section titled “Choosing the subtree: gistchoose / gistpenalty”// gistpenalty — src/backend/access/gist/gistutil.cfloatgistpenalty(GISTSTATE *giststate, int attno, GISTENTRY *orig, bool isNullOrig, GISTENTRY *add, bool isNullAdd){ float penalty = 0.0; if (giststate->penaltyFn[attno].fn_strict == false || (isNullOrig == false && isNullAdd == false)) { FunctionCall3Coll(&giststate->penaltyFn[attno], giststate->supportCollation[attno], PointerGetDatum(orig), PointerGetDatum(add), PointerGetDatum(&penalty)); if (isnan(penalty) || penalty < 0.0) // sanitize a misbehaving opclass penalty = 0.0; } else if (isNullOrig && isNullAdd) penalty = 0.0; else penalty = get_float4_infinity(); // avoid mixing null and non-null return penalty;}For box_ops the penalty is the area enlargement of the bounding box;
gistchoose picks the downlink with the smallest enlargement, walking
multi-column indexes column-by-column with earlier columns dominating.
NSN, right-links, and the F_FOLLOW_RIGHT flag
Section titled “NSN, right-links, and the F_FOLLOW_RIGHT flag”The page header’s opaque area carries the concurrency machinery:
// gist.h — page flags, NSN type, and accessors#define F_LEAF (1 << 0) /* leaf page */#define F_DELETED (1 << 1) /* the page has been deleted */#define F_FOLLOW_RIGHT (1 << 3) /* page to the right has no downlink */
typedef XLogRecPtr GistNSN; /* node sequence number == a special LSN */
#define GistFollowRight(page) (GistPageGetOpaque(page)->flags & F_FOLLOW_RIGHT)#define GistPageGetNSN(page) (PageXLogRecPtrGet(GistPageGetOpaque(page)->nsn))#define GistPageSetNSN(page, val) (PageXLogRecPtrSet(GistPageGetOpaque(page)->nsn, val))GistNSN is literally an XLogRecPtr; the NSN of a child is set to the LSN
of the WAL record that inserted its downlink into the parent. A search
compares “the parent LSN I remembered” against “this child’s NSN” — if the
NSN is larger, the split happened after I read the parent, so I must chase
the right-link. This is the order-free replacement for the B-tree’s
high-key test.
Search as a frontier traversal with a pairing heap
Section titled “Search as a frontier traversal with a pairing heap”Because predicates overlap, a search is not a single descent. gistgettuple
seeds a queue with the root and pulls items off it; gistScanPage scans one
page, runs Consistent on each entry, and pushes matching children (or, for
k-NN, matching heap tuples with their distances) back onto the queue. For
plain scans the queue keeps heap tuples at the front for depth-first, LIMIT-
friendly output; for ORDER BY <-> it is a distance-ordered pairingheap
so the nearest tuple always emerges next.
flowchart TD
A["gistgettuple<br/>(root pushed to queue)"] --> B{"pull next<br/>queue item"}
B -->|"heap tuple"| C["return heapPtr<br/>(recheck if lossy)"]
B -->|"index page"| D["gistScanPage(page)"]
D --> E{"NSN > remembered<br/>parent LSN?"}
E -->|"yes: concurrent split"| F["push right sibling<br/>to front of queue"]
E -->|"no"| G["scan entries"]
F --> G
G --> H{"Consistent(entry, query)?"}
H -->|"false"| I["skip subtree"]
H -->|"true, leaf"| J["enqueue heap tuple<br/>(+ distance for k-NN)"]
H -->|"true, internal"| K["enqueue child page<br/>(remember this page's LSN)"]
J --> B
K --> B
I --> B
C --> B
This is the same “one page locked at a time, chase the right-link on a concurrent split” idiom as nbtree, with three differences forced by the overlap model: the descent is a queue not a pointer, the split signal is the NSN not a high key, and pruning is a user function call not a key comparison.
Source Walkthrough
Section titled “Source Walkthrough”This section walks the call flows in gist.c, gistget.c, gistsplit.c,
and gistbuild.c, anchored on stable symbol names. Line numbers live only in
the position-hint table at the end.
The support-function contract (gist.c, gist.h)
Section titled “The support-function contract (gist.c, gist.h)”The twelve support-procedure numbers are macros in gist.h:
GIST_CONSISTENT_PROC (1), GIST_UNION_PROC (2), GIST_COMPRESS_PROC (3),
GIST_DECOMPRESS_PROC (4), GIST_PENALTY_PROC (5), GIST_PICKSPLIT_PROC
(6), GIST_EQUAL_PROC (7), GIST_DISTANCE_PROC (8), GIST_FETCH_PROC (9),
GIST_OPTIONS_PROC (10), GIST_SORTSUPPORT_PROC (11), with GISTNProcs
= 12. gisthandler exposes the AM; initGISTstate resolves each procedure’s
OID to an FmgrInfo and caches it in the GISTSTATE, marking optional
procedures with InvalidOid when absent. gistvalidate (not quoted here)
is the amvalidate hook that CREATE OPERATOR CLASS runs to check an
opclass supplies a coherent set.
The Consistent function is reached through gistindex_keytest, which is the
literal pruning gate of every scan:
// gistindex_keytest — src/backend/access/gist/gistget.crecheck = true; // safest assumption if Consistent forgets to set ittest = FunctionCall5Coll(&key->sk_func, key->sk_collation, PointerGetDatum(&de), // index datum as GISTENTRY* key->sk_argument, // the query value Int16GetDatum(key->sk_strategy), ObjectIdGetDatum(key->sk_subtype), PointerGetDatum(&recheck));if (!DatumGetBool(test)) return false; // provably disjoint: prune this entry/subtree*recheck_p |= recheck; // lossy match: caller must recheck against heapTwo subtleties live here. First, recheck: a lossy opclass (e.g.
pg_trgm, or a Compress that stored a bounding box for a polygon) returns
true with recheck = true, meaning “this might match — confirm against the
real heap value.” The executor re-evaluates the original operator on the heap
tuple. Second, NULL handling: on an internal page a NULL key cannot prove the
subtree is null-free (GiST’s rule is union(VAL, NULL) = VAL), so
SK_SEARCHNULL only prunes on leaf pages — the README’s NULL-safety
guarantee made concrete.
Insert: gistinsert → gistdoinsert → gistplacetopage
Section titled “Insert: gistinsert → gistdoinsert → gistplacetopage”gistinsert is a thin wrapper: it lazily builds the GISTSTATE, forms the
leaf IndexTuple via gistFormTuple (which runs Compress), stamps t_tid
with the heap CTID, and calls gistdoinsert. gistdoinsert is the descent
engine. It starts at GIST_ROOT_BLKNO, grabs a shared lock optimistically,
and at each page checks, in order: (a) an incomplete split to repair, (b) a
concurrent split/deletion to retry from the parent, (c) leaf vs. internal.
// gistdoinsert — src/backend/access/gist/gist.c (concurrency checks)/* (a) crashed mid-split? finish it before touching the page. */if (GistFollowRight(stack->page)){ gistfixsplit(&state, giststate); state.stack = stack = stack->parent; // retry from parent continue;}/* (b) page split or was deleted since we read its parent downlink? */if ((stack->blkno != GIST_ROOT_BLKNO && stack->parent->lsn < GistPageGetNSN(stack->page)) || GistPageIsDeleted(stack->page)){ state.stack = stack = stack->parent; // rechoose best child continue;}When the leaf is reached, gistinserttuple/gistinserttuples call the
workhorse gistplacetopage, which either fits the tuple onto the page or
splits it. The split path is where the opclass PickSplit and the WAL-atomic
decomposition meet:
// gistplacetopage — src/backend/access/gist/gist.c (split path)is_split = gistnospace(page, itup, ntup, oldoffnum, freespace);if (is_split && GistPageIsLeaf(page) && GistPageHasGarbage(page)){ gistprunepage(rel, page, buffer, heapRel); // reclaim LP_DEAD first is_split = gistnospace(page, itup, ntup, oldoffnum, freespace);}if (is_split){ itvec = gistextractpage(page, &tlen); // all existing tuples itvec = gistjoinvector(itvec, &tlen, itup, ntup); dist = gistSplit(rel, page, itvec, tlen, giststate); // calls PickSplit if (npage > GIST_MAX_SPLIT_PAGES) elog(ERROR, "GiST page split into too many halves (%d, maximum %d)", ...); // ... allocate buffers, distribute tuples, set rightlinks ...}Note the order: GiST tries to prune dead tuples (gistprunepage, the
LP_DEAD reclamation) before splitting, exactly as nbtree deletes-or-dedups
before splitting. Note also GIST_MAX_SPLIT_PAGES: because keys are
variable-length and PickSplit has no size information, one split may need to
produce more than two pages, and the recursion in gistSplit handles that.
Linking the split and the F_FOLLOW_RIGHT / NSN protocol
Section titled “Linking the split and the F_FOLLOW_RIGHT / NSN protocol”After PickSplit decides the partition, gistplacetopage wires the new pages
into a right-linked chain and flags every non-rightmost page:
// gistplacetopage — src/backend/access/gist/gist.c (linking + flagging)/* Set up rightlinks */if (ptr->next && ptr->block.blkno != GIST_ROOT_BLKNO) GistPageGetOpaque(ptr->page)->rightlink = ptr->next->block.blkno;else GistPageGetOpaque(ptr->page)->rightlink = oldrlink; // tail keeps old rightlink
/* Flag all but the rightmost: "my right sibling has no parent downlink yet" */if (ptr->next && !is_rootsplit && markfollowright) GistMarkFollowRight(ptr->page);else GistClearFollowRight(ptr->page);
/* Copy the original page's NSN to every fragment so scans chase rightlinks */GistPageSetNSN(ptr->page, oldnsn);This is step 1 of the split: the new geometry is durable and searchable
(a reader chases the right-link because F_FOLLOW_RIGHT is set), but the
parent does not yet have downlinks for the new pages. gistplacetopage
returns the new downlink tuples to its caller via *splitinfo. Step 2 is
gistfinishsplit, which posts those downlinks to the parent (recursing up if
the parent itself splits) and, in doing so, sets each child’s NSN to the LSN
of the downlink-insertion record and clears F_FOLLOW_RIGHT. A crash between
the two steps leaves an index that searches traverse correctly and that the
next descent repairs through gistfixsplit.
flowchart TD
A["gistplacetopage:<br/>page full"] --> B["gistSplit ->gistSplitByKey<br/>(opclass PickSplit)"]
B --> C["allocate right pages,<br/>distribute tuples"]
C --> D["set rightlinks,<br/>GistMarkFollowRight,<br/>copy NSN — STEP 1"]
D --> E["XLogInsert split record<br/>(WAL-atomic)"]
E --> F["return *splitinfo<br/>(new downlinks) to caller"]
F --> G["gistfinishsplit:<br/>post downlinks to parent"]
G --> H{"parent full?"}
H -->|"yes"| A
H -->|"no"| I["set child NSN = downlink LSN,<br/>GistClearFollowRight — STEP 2"]
I --> J["split complete"]
D -.->|"crash here"| K["F_FOLLOW_RIGHT left set;<br/>next descent calls<br/>gistfixsplit to repair"]
Re-finding a migrated parent: gistFindCorrectParent / gistFindPath
Section titled “Re-finding a migrated parent: gistFindCorrectParent / gistFindPath”During an upward downlink insertion, the parent page recorded in the descent
stack may itself have split and migrated rightward. gistFindCorrectParent
checks whether the parent’s LSN changed since descent; if so it walks the
parent level’s right-links searching for the page holding the child’s
downlink, and falls back to gistFindPath — a breadth-first re-search from
the root following the README’s findPath pseudocode — when right-link
walking is insufficient. This is the GiST analogue of nbtree’s
_bt_getstackbuf.
PickSplit and the multi-column / secondary-split machinery
Section titled “PickSplit and the multi-column / secondary-split machinery”gistSplit (the recursive splitter) calls gistSplitByKey, which decides
which column to split on and invokes the user PickSplit through
gistUserPicksplit:
// gistUserPicksplit — src/backend/access/gist/gistsplit.cFunctionCall2Coll(&giststate->picksplitFn[attno], giststate->supportCollation[attno], PointerGetDatum(entryvec), // entries to partition PointerGetDatum(sv)); // GIST_SPLITVEC out-paramif (sv->spl_nleft == 0 || sv->spl_nright == 0){ /* opclass PickSplit put everything on one side: complain, then cope */ ereport(DEBUG1, (errmsg("picksplit method for column %d of index \"%s\" failed", ...))); genericPickSplit(giststate, entryvec, sv, attno); // fall back to a 50/50 split}Several robustness layers are visible. A PickSplit that degenerately dumps
everything on one side is caught and replaced with genericPickSplit (a
size-balanced halving) so a buggy opclass yields a poor but correct index.
For a multi-column GiST, gistSplitByKey recurses: if splitting on column 0
leaves the left and right union keys equal (gistKeyIsEQ) the split is
useless on that column, so it identifies don’t-care tuples (those equally
placeable on either side per findDontCares) and re-splits them on the next
column — the README’s “secondary split.” NULL keys are routed wholesale to
one side (gistSplitByKey’s null handling) so a page never mixes NULL and
non-NULL keys, preserving the union(VAL, NULL) = VAL invariant. After the
recursion, gistunionsubkey recomputes the left/right union datums for every
column so the two new downlinks correctly bound their pages.
Search: gistgettuple, gistgetbitmap, and k-NN
Section titled “Search: gistgettuple, gistgetbitmap, and k-NN”gistgettuple drives ordered and plain scans; gistgetbitmap drives bitmap
scans. Both funnel through gistScanPage, already shown. The k-NN path is
distinctive: gistindex_keytest also evaluates the Distance support function
for each ORDER BY x <-> q clause, giving heap entries an exact distance
and internal entries a lower-bound distance (the closest any child could
be). The pairingheap orders the queue by distance, so pulling items in heap
order yields tuples in increasing true distance — an incremental
nearest-neighbor scan that never has to materialize or sort the whole result.
If Distance set its recheck flag, the distance is only a lower bound and the
executor re-sorts with the exact value.
gistgettuple also performs lazy dead-tuple marking: when the executor
reports kill_prior_tuple, the offset is stashed in so->killedItems and
later flushed by gistkillitems, which sets LP_DEAD on the index entries —
the same hint-bit reclamation tier as nbtree’s _bt_killitems.
Build: three strategies in gistbuild
Section titled “Build: three strategies in gistbuild”gistbuild chooses among three strategies up front:
// gistbuild — src/backend/access/gist/gistbuild.c (strategy selection)if (options && options->buffering_mode == GIST_OPTION_BUFFERING_ON) buildstate.buildMode = GIST_BUFFERING_STATS; // forced bufferingelse buildstate.buildMode = GIST_BUFFERING_AUTO; // decide later
if (buildstate.buildMode != GIST_BUFFERING_STATS){ /* Use the sorted build if EVERY key column has a sortsupport proc. */ bool hasallsortsupports = true; for (int i = 0; i < keyscount; i++) if (!OidIsValid(index_getprocid(index, i + 1, GIST_SORTSUPPORT_PROC))) { hasallsortsupports = false; break; } if (hasallsortsupports) buildstate.buildMode = GIST_SORTED_BUILD;}-
Sorted build (
GIST_SORTED_BUILD), chosen when every key column’s opclass providesGIST_SORTSUPPORT_PROC. The heap is scanned into atuplesort(tuplesort_begin_index_gist), sorted by the opclass’s linearization (for points, typically a space-filling curve), thengist_indexsortbuildpacks leaves bottom-up and builds each parent level from the level below — the B-tree-style bottom-up build. This is the fastest path and the modern default forpoint_opsand PostGIS. -
Buffering build (
GIST_BUFFERING_ACTIVE), used for opclasses without sortsupport once the index outgrowseffective_cache_size.gistbuildstarts in plain insert mode, watches the index size ingistBuildCallback, and switches viagistInitBufferingwhen the tree no longer fits in cache. Buffers attached to internal nodes (GISTBuildBuffers) absorb incoming tuples;gistProcessEmptyingQueueflushes half-full buffers down a level at a time, converting random descents into cache-sized breadth-first batches (Arge et al.). -
Plain insert build, the fallback:
gistBuildCallbackcallsgistdoinsertdirectly for each heap tuple, exactly like run-time inserts but with WAL skipped (is_build = true, pages stamped withGistBuildLSN).
// gistBuildCallback — src/backend/access/gist/gistbuild.c (per-heap-tuple)itup = gistFormTuple(buildstate->giststate, index, values, isnull, true);itup->t_tid = *tid;if (buildstate->buildMode == GIST_BUFFERING_ACTIVE) gistBufferingBuildInsert(buildstate, itup); // push into node bufferselse gistdoinsert(index, itup, buildstate->freespace, buildstate->giststate, buildstate->heaprel, true); // directPosition hints (as of 2026-06-05, REL_18 273fe94)
Section titled “Position hints (as of 2026-06-05, REL_18 273fe94)”| Symbol | File | Line |
|---|---|---|
gisthandler | src/backend/access/gist/gist.c | 59 |
gistinsert | src/backend/access/gist/gist.c | 165 |
gistplacetopage | src/backend/access/gist/gist.c | 230 |
gistdoinsert | src/backend/access/gist/gist.c | 639 |
gistFindPath | src/backend/access/gist/gist.c | 914 |
gistFindCorrectParent | src/backend/access/gist/gist.c | 1027 |
gistformdownlink | src/backend/access/gist/gist.c | 1140 |
gistfixsplit | src/backend/access/gist/gist.c | 1200 |
gistinserttuple | src/backend/access/gist/gist.c | 1260 |
gistSplit | src/backend/access/gist/gist.c | 1466 |
initGISTstate | src/backend/access/gist/gist.c | 1537 |
gistindex_keytest | src/backend/access/gist/gistget.c | 125 |
gistScanPage | src/backend/access/gist/gistget.c | 328 |
gistkillitems | src/backend/access/gist/gistget.c | 38 |
gistgettuple | src/backend/access/gist/gistget.c | 612 |
gistgetbitmap | src/backend/access/gist/gistget.c | 745 |
gistcanreturn | src/backend/access/gist/gistget.c | 797 |
gistunionsubkey | src/backend/access/gist/gistsplit.c | 80 |
findDontCares | src/backend/access/gist/gistsplit.c | 113 |
supportSecondarySplit | src/backend/access/gist/gistsplit.c | 258 |
genericPickSplit | src/backend/access/gist/gistsplit.c | 344 |
gistUserPicksplit | src/backend/access/gist/gistsplit.c | 415 |
gistSplitHalf | src/backend/access/gist/gistsplit.c | 585 |
gistSplitByKey | src/backend/access/gist/gistsplit.c | 623 |
gistbuild | src/backend/access/gist/gistbuild.c | 179 |
gist_indexsortbuild | src/backend/access/gist/gistbuild.c | 400 |
gistInitBuffering | src/backend/access/gist/gistbuild.c | 626 |
gistBuildCallback | src/backend/access/gist/gistbuild.c | 822 |
gistProcessItup | src/backend/access/gist/gistbuild.c | 925 |
gistbufferinginserttuples | src/backend/access/gist/gistbuild.c | 1056 |
gistchoose | src/backend/access/gist/gistutil.c | 374 |
gistpenalty | src/backend/access/gist/gistutil.c | 724 |
gistMakeUnionItVec | src/backend/access/gist/gistutil.c | 155 |
gistdentryinit | src/backend/access/gist/gistutil.c | 547 |
gistFetchTuple | src/backend/access/gist/gistutil.c | 667 |
GIST_CONSISTENT_PROC..GISTNProcs | src/include/access/gist.h | 32–44 |
F_LEAF / F_DELETED / F_FOLLOW_RIGHT | src/include/access/gist.h | 49–53 |
GistNSN / GistBuildLSN | src/include/access/gist.h | 65 / 72 |
GistFollowRight / GistPageGetNSN | src/include/access/gist.h | 185 / 189 |
GISTSTATE (support FmgrInfo cache) | src/include/access/gist_private.h | 75 |
Source verification (as of 2026-06-05)
Section titled “Source verification (as of 2026-06-05)”All claims below were checked against the REL_18_STABLE working tree at commit 273fe94.
- Twelve support procedures,
GISTNProcs == 12. Verified insrc/include/access/gist.h:GIST_CONSISTENT_PROC(1) throughGIST_SORTSUPPORT_PROC(11),GIST_OPTIONS_PROC(10),#define GISTNProcs 12.gisthandlersetsamsupport = GISTNProcs. amcanorder = false,amcanorderbyop = true. Verified verbatim ingisthandler— GiST cannot satisfyORDER BY colbut can satisfyORDER BY col <-> qvia the Distance proc.- Mandatory vs. optional procs.
initGISTstatecopiesconsistentFn,unionFn,penaltyFn,picksplitFn,equalFnunconditionally;compressFn,decompressFn,distanceFn,fetchFnare guarded byOidIsValid(index_getprocid(...))and leftInvalidOidwhen absent. GistNSNis anXLogRecPtr. Verifiedtypedef XLogRecPtr GistNSN;ingist.h. The split detectorstack->parent->lsn < GistPageGetNSN(...)ingistdoinsertand the symmetric test ingistScanPageboth compare a remembered parent LSN to the child NSN.- F_FOLLOW_RIGHT lifecycle.
gistplacetopagecallsGistMarkFollowRighton non-rightmost split fragments (step 1);gistfinishsplit/downlink insertion clears it and sets the NSN (step 2);gistdoinsertandgistScanPageboth honor it;gistfixsplitrepairs a page left flagged by a crash. All verified ingist.c/gistget.c. - Prune-before-split.
gistplacetopagecallsgistprunepageto reclaimGistPageHasGarbageLP_DEAD tuples before deciding it must split — verified in the split path. GIST_MAX_SPLIT_PAGESguard.gistplacetopageraiseselog(ERROR, "GiST page split into too many halves ...")— verified; a single split can legitimately produce more than two pages because keys are variable length.- Degenerate-PickSplit fallback.
gistUserPicksplitfalls back togenericPickSplitand emits aDEBUG1“picksplit method … failed” when the opclass returns an empty side — verified ingistsplit.c. - Secondary split on multi-column.
gistSplitByKeyrecurses to the next column on a degenerate split, usingfindDontCaresand recomputing unions viagistunionsubkey— verified. - Three build strategies.
gistbuildselectsGIST_SORTED_BUILDiff every key column hasGIST_SORTSUPPORT_PROC; otherwise buffering (GIST_BUFFERING_AUTO/ACTIVEviagistInitBuffering) or plain inserts throughgistBuildCallback→gistdoinsert. Verified. - Predicate locking / SERIALIZABLE.
gistScanPagecallsPredicateLockPage, andgisthandlersetsampredlocks = true— verified. - k-NN via pairing heap.
gistgettupleroutesnumberOfOrderBys > 0togetNextNearest;gistScanPagepushes items intoso->queue, apairingheap, ordered by Distance-proc results. Verified. - No parallel scan/build.
gisthandlersetsamcanparallel = falseandamcanbuildparallel = falseon REL_18. Verified (distinguishes GiST from nbtree/BRIN which do support parallel builds).
Beyond PostgreSQL — Comparative Designs & Research Frontiers
Section titled “Beyond PostgreSQL — Comparative Designs & Research Frontiers”-
GiST 1995 vs. the production implementation. Hellerstein, Naughton & Pfeffer’s paper specifies the interface (Consistent / Union / Penalty / PickSplit, plus the search and insert templates) but is silent on concurrency, recovery, and bulk loading — the original Berkeley prototype (Hellerstein & Aoki, the README notes) was a single-user “university project.” Everything that makes PostgreSQL’s GiST a production index — the NSN + right-link protocol, the two-step WAL-atomic split, the
gistfixsplitlazy repair, the buffering build — is outside the 1995 paper. A focused note mapping each production mechanism back to the (absent) paper guarantee it had to invent would sharpen exactly what “generalize the B-tree” leaves unspecified. Paper: Hellerstein/Naughton/Pfeffer 1995 (the README’s foundational reference). -
NSN concurrency vs. Lehman & Yao’s high key. GiST’s split-detection signal is Kornacker’s NSN, an explicit per-page sequence number, precisely because an overlap tree has no high key to “move right” against (
postgres-nbtree.mduses the high-key test). The two protocols are isomorphic — one page latched at a time, chase the right-link on a concurrent split, repair the incomplete split lazily — but GiST pays an 8-byte NSN per page where nbtree reuses the separator it already stores. A side-by-side of_bt_moveright(key comparison) againstgistScanPage’sparentlsn < GistPageGetNSNtest is the cleanest way to see what “order-free Blink-tree” costs. Paper: Kornacker, “Access Methods for Next-Generation Database Systems” (the README’s concurrency reference). -
R-tree heuristics live entirely in the opclass. Guttman’s quadratic / linear splits and the R*-tree’s margin-minimizing split + forced reinsertion are not in
gistsplit.c— they are choices abox_ops/point_opsPickSplit could make. PostgreSQL’s in-core geometric PickSplit is a Guttman-style double-sort; the R*-tree’s forced reinsertion is not implemented (GiST never re-inserts entries from an overflowing page). An opclass that added R*-style reinsertion, or used a Hilbert/Z-order linearization for split, is an unrealized quality improvement the framework permits but does not provide. Paper: Beckmann, Kriegel, Schneider & Seeger 1990, R*-tree (dbms-papers/rstar-tree.md). -
Buffering build vs. sorted build vs. STR bulk-load. PostgreSQL’s two fast builds bracket the bulk-load literature: the sorted build (
GIST_SORTED_BUILD, gated on SortSupport) is the GiST analogue of bottom-up B-tree packing and, for a space-filling-curve linearization, approximates the Sort-Tile-Recursive (STR) bulk-load of spatial trees; the buffering build is Arge/Hinrichs/Vahrenhold/Vitter’s cache-aware algorithm for opclasses that cannot be linearized. The crossover point — when buffering beats plain inserts — is theeffective_cache_sizethreshold thatgistInitBufferingwatches. Comparing GiST’s heuristic switch against STR’s one-shot tiling would frame when “sort once, pack bottom-up” wins over “buffer and batch.” Paper: Arge et al., “Efficient Bulk Operations on Dynamic R-trees” (the README’s build reference). -
GiST vs. SP-GiST vs. BRIN — three answers to “no total order.” GiST is the balanced, overlapping-bounding-predicate answer; SP-GiST (
postgres-spgist.md) is the space-partitioned, non-overlapping answer (quadtrees, k-d trees, radix tries) where partitions are disjoint so a point query follows a single path; BRIN (postgres-brin.md) is the lossy block-range summary answer that gives up the tree entirely for a tiny min/max-per-range map. The three share the “engine owns the skeleton, opclass owns the semantics” template but differ in whether the partition overlaps (GiST yes, SP-GiST no) and whether there is a tree at all (BRIN no). A unifying note on the extensible-index family is the natural parent doc to all three. -
CUBRID has no GiST analogue. CUBRID ships a B+-tree and specialized spatial support but no general extensible-key index framework; an extension author who wants a range or full-text or trigram index in CUBRID has no “supply four C functions” path. Contrasting PostgreSQL’s opclass-pluggable AM against CUBRID’s fixed index set is the sharpest illustration of what the GiST abstraction buys an ecosystem. (See the CUBRID index analyses in the cubrid tree.)
Sources
Section titled “Sources”In-tree READMEs and source files (REL_18_STABLE, commit 273fe94)
Section titled “In-tree READMEs and source files (REL_18_STABLE, commit 273fe94)”src/backend/access/gist/README— the design document: the GiST interface (Consistent/Union/Compress/Decompress/Penalty/PickSplit/Equal/Distance/ Fetch), thefindPathre-search pseudocode, the NSN +F_FOLLOW_RIGHTconcurrency protocol, the incomplete-split repair, the buffering build, and the sorted build — plus the foundational paper citations quoted in section 6.src/include/access/gist.h—GIST_CONSISTENT_PROC..GIST_SORTSUPPORT_PROCandGISTNProcs, theF_LEAF/F_DELETED/F_FOLLOW_RIGHTpage flags,GistNSN(typedef XLogRecPtr),GistBuildLSN, and theGistFollowRight/GistPageGetNSN/GistPageSetNSNaccessors.src/include/access/gist_private.h—GISTSTATE(the per-statement support-FmgrInfocache),GISTScanOpaque(thepairingheapqueue andkilledItems),GISTBuildBuffers, and the split/insert helper prototypes.src/backend/access/gist/gist.c—gisthandler,gistinsert,gistdoinsert,gistplacetopage,gistfinishsplit,gistfixsplit,gistFindCorrectParent,gistFindPath,gistformdownlink,gistinserttuple(s),gistSplit, andinitGISTstate.src/backend/access/gist/gistget.c—gistindex_keytest(the Consistent gate),gistScanPage,gistgettuple,gistgetbitmap,getNextNearest(k-NN),gistkillitems,gistcanreturn.src/backend/access/gist/gistsplit.c—gistSplitByKey,gistUserPicksplit,genericPickSplit,findDontCares,supportSecondarySplit,gistunionsubkey,gistSplitHalf.src/backend/access/gist/gistbuild.c—gistbuild(strategy selection),gist_indexsortbuild,gistInitBuffering,gistBuildCallback,gistProcessItup,gistbufferinginserttuples.src/backend/access/gist/gistutil.c—gistchoose,gistpenalty,gistMakeUnionItVec,gistgetadjusted,gistFetchTuple,gistdentryinit.
Papers and textbook chapters
Section titled “Papers and textbook chapters”- Hellerstein, J. M., Naughton, J. F. & Pfeffer, A. (1995). “Generalized Search Trees for Database Systems.” Proc. VLDB 1995, 562-573. The GiST interface and search/insert templates; the README’s foundational reference.
- Kornacker, M. “Access Methods for Next-Generation Database Systems.” The NSN + right-link concurrency protocol PostgreSQL implements; the README’s concurrency reference.
- Arge, L., Hinrichs, K., Vahrenhold, J. & Vitter, J. S. “Efficient Bulk Operations on Dynamic R-trees.” The cache-aware buffering build; the README’s build reference.
- Guttman, A. (1984). “R-Trees: A Dynamic Index Structure for Spatial
Searching.” SIGMOD 1984, 47-57. The overlapping-MBR tree GiST generalizes;
background in
knowledge/research/dbms-papers/btree.md/rstar-tree.md. - Beckmann, N., Kriegel, H.-P., Schneider, R. & Seeger, B. (1990). “The
R*-tree: An Efficient and Robust Access Method for Points and Rectangles.”
SIGMOD 1990, 322-331. The split / forced-reinsertion heuristics a geometric
opclass may implement; captured in
knowledge/research/dbms-papers/rstar-tree.md. - Database Internals (Petrov 2019), ch. 2-7 — B-tree concurrency, WAL
framing, and the bulk-loading discussion that frames the build strategies
(
knowledge/research/dbms-general/).
Sibling docs (cross-references — mechanism owned there, not duplicated here)
Section titled “Sibling docs (cross-references — mechanism owned there, not duplicated here)”postgres-index-am.md— theIndexAmRoutinedispatch contract thatgisthandlerfills in (amgettuple/amgetbitmap/ambuild/aminsert).postgres-spgist.md— the space-partitioned, non-overlapping cousin in the extensible-index family.postgres-brin.md— the lossy block-range-summary answer to “no total order,” the third extensible AM.postgres-nbtree.md— the Lehman & Yao high-key concurrency protocol whose order-free adaptation is GiST’s NSN.postgres-buffer-manager.md— buffer pins and content-lock (LWLock) latching that GiST’s one-page-at-a-time descent is built on.postgres-ssi-predicate-locking.md— the predicate locks thatgistScanPage’sPredicateLockPage/CheckForSerializableConflictInfeed (ampredlocks = true).