Skip to content

CUBRID Query Executor — XASL Interpretation, Iterator Model, and Heap/Index Scan Operators

Contents:

A query executor is the runtime that turns a physical plan tree into a stream of result tuples. Database Internals (Petrov, Ch. 12 “Query Execution”) and Garcia-Molina/Ullman/Widom Database Systems: The Complete Book (Ch. 16 “The Query Compiler” and Ch. 15 “Query Execution”) frame it as a single design problem with two competing solution families: pull-based iterator models (Volcano), where each operator is asked for tuples one at a time, and push-based / vectorised models (MonetDB/X100, modern HyPer), where each operator emits batches into the next.

Goetz Graefe’s Volcano — An Extensible and Parallel Query Evaluation System (TKDE 1994) is the canonical reference for the iterator model. Its core is the three-method contract every operator implements:

  1. open() — initialise scan state, allocate buffers, open underlying files and child operators recursively.
  2. next() — return the next tuple (or end-of-stream), pulling from children as needed.
  3. close() — release scan state, close child operators recursively.

The contract has three structural consequences that show up in every Volcano-style engine, including CUBRID:

  • Composition is by tree, not by graph. Each operator names its children directly; the same tree shape that the optimiser produced is the shape the executor walks.
  • Pipelining is automatic for non-blocking operators. A Filter over a Scan over an Index produces no intermediate materialisation; control simply walks down the tree on every next().
  • Blocking operators (sort, hash-build, group-by, top-N) materialise. They consume their entire input inside open() (or on the first next()), write to a temporary file or in-memory structure, and become a leaf for everything above them.

Three textbook scan operators sit at the leaves of every plan tree:

  • Heap scan — sequential walk over a row-oriented heap file. The cheapest leaf when the predicate is unselective and there is no usable index. CUBRID’s heap scan walks HEAP_CHAIN.next_vpid and dispatches per slot via record_type (REC_HOME / REC_RELOCATION / REC_BIGONE); see cubrid-heap-manager.md.
  • Index scan — B+-tree range walk yielding (key, OID) pairs followed by a heap point-read per OID (or no heap read at all when the index is covering — all referenced columns are in the key).
  • List scan — read back a previously materialised tuple stream. The result of open()-ing a Sort or a sub-block is a list file; the parent operator reads it as a list scan. List scan is what makes the hierarchical XASL tree work: every materialising sub-XASL produces a list, and its parent consumes that list as if it were any other relation.

The pull-based model has one well-known cost — per-tuple virtual call overhead — and one well-known benefit — trivial composition. CUBRID pays the cost (a function pointer per operator per tuple, in scan_next_scanscan_next_scan_localscan_next_<type>_scan) and reaps the benefit (~600 KB of query_executor.c glue is enough to handle the entire SQL surface, including correlated subqueries, group-by, analytic windows, hierarchical queries, CTEs, and partitioned classes). The vectorised alternative would require restructuring every operator’s interface; that work has not happened, and the document accordingly tracks the plan-tree iterator model as it stands.

Every iterator-model executor — PostgreSQL, MySQL/InnoDB, Oracle, SQL Server, CUBRID — adopts the same patterns on top of Graefe’s three-method contract. Naming this vocabulary first makes the CUBRID-specific choices in ## CUBRID's Approach legible as dial settings, not inventions.

Each operator carries a state object that lives across open / next / close. PostgreSQL splits the immutable plan (Plan) from the per-execution plan state (PlanState); ExecProcNode(PlanState *) reads type and jumps into the per-operator next (ExecSeqScan, ExecIndexScan, ExecHashJoin, …) via a function pointer set in ExecInitNode. MySQL’s handler API (handler::rnd_init / rnd_next / rnd_end) plays the analogous role for storage-side scans. CUBRID’s plan is the XASL node tree (xasl_node); the per-operator state for a scan is the SCAN_ID tagged union (one of HEAP_SCAN_ID, INDX_SCAN_ID, LLIST_SCAN_ID, SHOWSTMT_SCAN_ID, …), and dispatch is switch (xasl->type) in qexec_execute_mainblock_internal over BUILDLIST_PROC / BUILDVALUE_PROC / UNION_PROC / MERGELIST_PROC / UPDATE_PROC / DELETE_PROC / INSERT_PROC / MERGE_PROC / CONNECTBY_PROC / CTE_PROC / DO_PROC and a few more. Row-producing types fall through to a generic pull interpreter, qexec_intprt_fnc.

Sort, hash-build, group-by, and sub-block boundaries cannot stream. The standard implementation gives each blocking operator a backing temporary file and a scan over that file as its output. PostgreSQL uses Tuplesortstate / Tuplestorestate. CUBRID writes list files (QFILE_LIST_ID, list_file.c); the parent reads them back as a list scan (scan_open_list_scan). Every BUILDLIST XASL materialises into xasl->list_id; the final query result is the top XASL’s list file, returned by qexec_execute_query.

Where a subquery hangs in the operator tree determines when it executes. PostgreSQL distinguishes InitPlan (uncorrelated, once-only) from SubPlan (correlated, per-row); CUBRID encodes the same distinction directly on the XASL node:

  • xasl->aptr_list — uncorrelated subqueries. Executed once before the outer scan; result is referenced via a REGU_VARIABLE of type CONSTANT whose dbvalptr points into the materialised tuple.
  • xasl->dptr_list — correlated subqueries. Executed once per outer row from inside qexec_intprt_fnc after the outer’s correlation columns are bound.
  • xasl->scan_ptr — the chained scan operator the outer iterates against. Multi-table joins are unrolled as a chain outer.scan_ptr = middle, middle.scan_ptr = inner.

When the leaf is an index scan, predicates split three ways: range (bounds the index walk), key (applied during the walk to columns in the index), data (applied after the heap fetch). CUBRID’s INDX_SCAN_ID stores them as range_pred / key_pred / scan_pred, each with its own regu_list and pr_eval_fnc. The same split appears in MySQL’s range/ref/Using where and PostgreSQL’s IndexQual/IndexFilter/Filter.

A heap scan calls mvcc_satisfies_snapshot on each fetched record (via heap_nextheap_get_visible_version_internal). An index scan additionally re-checks via locator_lock_and_get_object_with_evaluation when mvcc_select_lock_needed is set (SELECT ... FOR UPDATE).

Theoretical conceptCUBRID name
Operator plan nodexasl_node / XASL_NODE (src/xasl/xasl_node.hpp)
Plan-state / per-exec statexasl_state + per-spec SCAN_ID (src/query/scan_manager.h)
Dispatch by operator typeswitch (xasl->type) in qexec_execute_mainblock_internal (query_executor.c)
Generic pull-loop interpreterqexec_intprt_fnc (query_executor.c)
Operator open()qexec_open_scan (query_executor.c) → scan_open_<type>_scan (scan_manager.c)
Operator next()scan_next_scanscan_next_scan_localscan_next_<type>_scan (scan_manager.c)
Operator close()scan_end_scan + scan_close_scan (scan_manager.c)
Materialised outputQFILE_LIST_ID (list_file.c); per XASL on xasl->list_id
Uncorrelated subqueryxasl->aptr_list (executed once, pre-scan)
Correlated subqueryxasl->dptr_list (executed per outer row)
Join / nested operator chainxasl->scan_ptr (single-link chain, walked top-down)
Index range / key / data predicatesINDX_SCAN_ID::range_pred / key_pred / scan_pred
Predicate evaluator function pointerSCAN_PRED::pr_eval_fnc set by eval_fnc (query_evaluator.c)
Outer-block iteration entryqexec_next_scan_block_iterations (query_executor.c)
Outer-block result writeqexec_end_one_iteration (query_executor.c)
Group-byqexec_groupby / qexec_hash_gby_* (query_executor.c)
Order by + distinctqexec_orderby_distinct (query_executor.c)
Analytic functionsqexec_execute_analytic (query_executor.c)
Server-side query entryxqmgr_execute_queryqmgr_process_queryqexec_execute_query
Heap-scan operatorS_HEAP_SCAN SCAN_ID type, opened by scan_open_heap_scan
Index-scan operatorS_INDX_SCAN SCAN_ID type, opened by scan_open_index_scan
List-scan operatorS_LIST_SCAN SCAN_ID type, opened by scan_open_list_scan
Parallel heap scanS_PARALLEL_HEAP_SCAN (scan_open_parallel_heap_scan, query/parallel/)
Per-scan heap stateHEAP_SCANCACHE (cached page latch + snapshot — see cubrid-heap-manager.md)
Per-record decoderHEAP_CACHE_ATTRINFO (heap_attrinfo.h; heap_attrinfo_read_dbvalues)

The executor does four things: (1) walk the XASL tree and dispatch each node by xasl->type; (2) for the row-producing types, run a uniform open/next/close loop over the spec’s SCAN_ID; (3) for blocking nodes, materialise output into a list file and read it back via list scan; (4) for boundary nodes (group-by, order-by, analytic), drive a separate phase after the main loop. The four are interleaved by the execution model rather than sequential; the rest of this section walks them in the order an actual query traverses them.

Server-side entry — from xqmgr_execute_query to qexec_execute_mainblock_internal

Section titled “Server-side entry — from xqmgr_execute_query to qexec_execute_mainblock_internal”

A query reaches the executor through the query manager. xqmgr_execute_query looks up the cached XASL by XASL_ID, unpacks the host-variable buffer, and calls qmgr_process_query, which unpacks the XASL stream (if not already a tree) and calls qexec_execute_query. The latter initialises an XASL_STATE (carrying the VAL_DESCR with host vars and timezone) and dispatches to qexec_execute_mainblock, which is the recursion entry point that everything else (subqueries, joined blocks, CTEs) calls back into:

// qexec_execute_mainblock — src/query/query_executor.c (condensed)
int
qexec_execute_mainblock (THREAD_ENTRY * thread_p, xasl_node * xasl,
xasl_state * xstate,
UPDDEL_CLASS_INSTANCE_LOCK_INFO * p_class_lock_info)
{
// ... recursion-depth guard, trace timer setup ...
thread_inc_recursion_depth (thread_p);
error = qexec_execute_mainblock_internal (thread_p, xasl, xstate, p_class_lock_info);
// ... trace timer stop, perfmon stats accumulate ...
thread_dec_recursion_depth (thread_p);
return error;
}

qexec_execute_mainblock_internal is the operator-type dispatch:

// qexec_execute_mainblock_internal — src/query/query_executor.c (condensed)
static int
qexec_execute_mainblock_internal (THREAD_ENTRY * thread_p, XASL_NODE * xasl,
XASL_STATE * xasl_state,
UPDDEL_CLASS_INSTANCE_LOCK_INFO * p_lock_info)
{
// ... limit-clause early exit ...
switch (xasl->type)
{
case UPDATE_PROC: error = qexec_execute_update (thread_p, xasl, false, xasl_state); break;
case DELETE_PROC: error = qexec_execute_delete (thread_p, xasl, xasl_state); break;
case INSERT_PROC: error = qexec_execute_insert (thread_p, xasl, xasl_state, false); break;
case MERGE_PROC: error = qexec_execute_merge (thread_p, xasl, xasl_state); break;
case CTE_PROC: error = qexec_execute_cte (thread_p, xasl, xasl_state); break;
case DO_PROC: error = qexec_execute_do_stmt(thread_p, xasl, xasl_state); break;
case CONNECTBY_PROC: /* fall through to special hierarchical interpreter */ break;
/* BUILDLIST_PROC / BUILDVALUE_PROC / UNION_PROC / ... fall through to pull-loop below */
}
// ... pre-processing (open scan, run aptr)
// ... processing (qexec_intprt_fnc loop)
// ... post-processing (group-by, order-by, analytic)
}

DML proc types route to dedicated functions because they have to drive an outer scan and perform the modification with proper locking. Row-producing types share the same generic pull interpreter.

XASL tree shape — aptr / dptr / scan_ptr

Section titled “XASL tree shape — aptr / dptr / scan_ptr”

A single XASL node carries three pointers to other XASLs that frame its surroundings:

flowchart TB
  subgraph X["xasl_node"]
    T["type:<br/>BUILDLIST / BUILDVALUE /<br/>UNION / SCAN / ..."]
    SP["spec_list<br/>(ACCESS_SPEC_TYPE chain —<br/>tables/indexes/list-files for this block)"]
    OP["outptr_list<br/>(REGU_VARIABLE list — output projection)"]
    A["aptr_list →<br/>uncorrelated subqueries<br/>(executed pre-scan, once)"]
    D["dptr_list →<br/>correlated subqueries<br/>(executed per outer row)"]
    S["scan_ptr →<br/>next operator in chain<br/>(joined block / nested scan)"]
    L["list_id →<br/>materialised output<br/>(QFILE_LIST_ID)"]
  end
  A -. recurse .-> X1["aptr xasl"]
  D -. recurse .-> X2["dptr xasl"]
  S -. recurse .-> X3["scan_ptr xasl"]

The example from the v0.7 deck makes the relationship concrete:

SELECT (select col1 from tab c where col1 = a.col1) -- DPTR (correlated)
FROM tab a, tab b -- SCAN_PTR chain
WHERE a.col2 = b.col2
AND a.col1 = (select col1 from tab d); -- APTR (uncorrelated)
flowchart LR
  TopXASL["BUILDLIST<br/>(SELECT)<br/>spec=tab a"]
  AptrXASL["BUILDLIST<br/>spec=tab d<br/>flag=XASL_LINK_TO_REGU_VARIABLE"]
  DptrXASL["BUILDLIST<br/>spec=tab c<br/>flag=XASL_LINK_TO_REGU_VARIABLE"]
  ScanPtrXASL["SCAN_PROC<br/>spec=tab b"]
  TopXASL -- aptr_list --> AptrXASL
  TopXASL -- dptr_list --> DptrXASL
  TopXASL -- scan_ptr --> ScanPtrXASL

Plan-evaluation order: aptr → outer scan starts → for each outer row → dptr (executes once per row) → scan_ptr (joins), then end_one_iteration writes the projected row to the outer’s list_id.

For BUILDLIST / BUILDVALUE / SCAN_PROC nodes the executor runs the generic interpreter:

// qexec_intprt_fnc — src/query/query_executor.c (condensed)
static SCAN_CODE
qexec_intprt_fnc (THREAD_ENTRY * thread_p, XASL_NODE * xasl,
XASL_STATE * xasl_state, QFILE_TUPLE_RECORD * tplrec,
XASL_SCAN_FNC_PTR next_scan_fnc)
{
while ((xb_scan = qexec_next_scan_block_iterations (thread_p, xasl)) == S_SUCCESS)
{
while ((ls_scan = scan_next_scan (thread_p, &xasl->curr_spec->s_id)) == S_SUCCESS)
{
/* per-row: bptr (path expressions) */
for (xptr = xasl->bptr_list; qualified && xptr; xptr = xptr->next)
qexec_execute_obj_fetch (thread_p, xptr, xasl_state, scan_op);
/* per-row: dptr (correlated subqueries) */
for (xptr = xasl->dptr_list; xptr; xptr = xptr->next)
qexec_execute_mainblock (thread_p, xptr, xasl_state, NULL);
/* per-row: scan_ptr (joined block, recursive) */
if (xasl->scan_ptr)
qexec_execute_scan (thread_p, xasl->scan_ptr, xasl_state, tplrec, next_scan_fnc);
/* per-row: predicates not pushed to the scan */
if (xasl->if_pred)
ev_res = eval_pred (thread_p, xasl->if_pred, &xasl_state->vd, NULL);
/* write the qualified row to xasl->list_id */
if (qualified)
qexec_end_one_iteration (thread_p, xasl, xasl_state, tplrec);
}
}
}

Three nested loops: outer over scan blocks (each next_scan_block resets the SCAN_IDs for a new combination of access specs — see “Block iterations” below), middle over scan_next_scan for tuples within the current block, inner over the per-row work (path expressions, correlated subqueries, join chain, residual predicates). The loop is the literal Volcano shape, just with three guard rings instead of one.

Open / Next / Close — SCAN_ID is the operator state

Section titled “Open / Next / Close — SCAN_ID is the operator state”

The SCAN_ID is a tagged union over all the scan kinds CUBRID supports. Its type field selects which arm of the union is live:

// SCAN_ID — src/query/scan_manager.h (sketch)
typedef enum {
S_HEAP_SCAN, S_HEAP_SCAN_RECORD_INFO, S_HEAP_PAGE_SCAN, S_HEAP_SAMPLING_SCAN,
S_INDX_SCAN, S_INDX_KEY_INFO_SCAN, S_INDX_NODE_INFO_SCAN,
S_LIST_SCAN,
S_CLASS_ATTR_SCAN,
S_SET_SCAN, S_JSON_TABLE_SCAN,
S_VALUES_SCAN,
S_DBLINK_SCAN,
S_PARALLEL_HEAP_SCAN,
S_SHOWSTMT_SCAN,
...
} SCAN_TYPE;
struct scan_id_struct
{
SCAN_TYPE type;
SCAN_OPERATION_TYPE scan_op_type;
bool fixed; /* PEEK ok? (page latch held across calls) */
bool grouped; /* iterating in groups (UPDATE/DELETE join helper) */
bool mvcc_select_lock_needed;
S_DIRECTION direction; /* S_FORWARD / S_BACKWARD */
S_POSITION position; /* S_BEFORE / S_ON / S_AFTER */
S_STATUS status; /* S_OPENED / S_STARTED / S_ENDED / S_CLOSED */
QPROC_QUALIFICATION qualification;
VAL_LIST *val_list; /* output projection target */
VAL_DESCR *vd;
union {
HEAP_SCAN_ID hsid;
INDX_SCAN_ID isid;
LLIST_SCAN_ID llsid; /* list scan */
SET_SCAN_ID ssid;
REGU_VALUES_SCAN_ID rvsid; /* VALUES (...), VALUES (...) */
PARALLEL_HEAP_SCAN_ID phsid;
/* ... */
} s;
};

The lifecycle of a SCAN_ID is the textbook five-step sequence the heap-scan deck spelled out:

stateDiagram-v2
  [*]            --> S_OPENED   : scan_open_<type>_scan
  S_OPENED       --> S_STARTED  : scan_start_scan (fix initial pages, allocate caches)
  S_STARTED      --> S_STARTED  : scan_next_scan loop
  S_STARTED      --> S_ENDED    : scan_end_scan (release per-block resources)
  S_ENDED        --> S_OPENED   : (next access-spec / block iteration)
  S_ENDED        --> [*]        : scan_close_scan (release file/list/temp)

The split between start/end and open/close exists because of block iterations: when a query has multiple access specs (e.g. UNION-ALL of two tables, or a partitioned class with multiple sub-classes), each block walks start_scan → next_scan*loop → end_scan while the file-level resources (allocated by open_scan) are kept across blocks and released only at close_scan.

scan_open_heap_scan builds a HEAP_SCAN_ID from the access spec’s regu lists:

// scan_open_heap_scan — src/query/scan_manager.c (condensed)
int
scan_open_heap_scan (THREAD_ENTRY * thread_p, SCAN_ID * scan_id,
bool mvcc_select_lock_needed, SCAN_OPERATION_TYPE op_type,
int fixed, int grouped, QPROC_SINGLE_FETCH single_fetch,
DB_VALUE * join_dbval, val_list_node * val_list, VAL_DESCR * vd,
OID * cls_oid, HFID * hfid,
regu_variable_list_node * regu_list_pred, PRED_EXPR * pr,
regu_variable_list_node * regu_list_rest, ...)
{
HEAP_SCAN_ID *hsidp = &scan_id->s.hsid;
scan_id->type = scan_type; /* S_HEAP_SCAN */
scan_init_scan_id (scan_id, mvcc_select_lock_needed, op_type, fixed, grouped, ...);
COPY_OID (&hsidp->cls_oid, cls_oid);
hsidp->hfid = *hfid;
UT_CAST_TO_NULL_HEAP_OID (&hsidp->hfid, &hsidp->curr_oid); /* iterator cursor */
/* attach scan predicate + its compiled evaluator (eval_fnc picks the fastest
* pr_eval_fnc for this PRED_EXPR shape) */
scan_init_scan_pred (&hsidp->scan_pred, regu_list_pred, pr,
((pr) ? eval_fnc (thread_p, pr, &single_node_type) : NULL));
scan_init_scan_attrs (&hsidp->pred_attrs, num_attrs_pred, attrids_pred, cache_pred);
hsidp->rest_regu_list = regu_list_rest;
scan_init_scan_attrs (&hsidp->rest_attrs, num_attrs_rest, attrids_rest, cache_rest);
hsidp->scancache_inited = false; /* HEAP_SCANCACHE allocated lazily in start_scan */
hsidp->scanrange_inited = false;
return NO_ERROR;
}

The shape is uniform across scan types: copy identifiers and predicates from the optimiser-produced access spec into the SCAN_ID’s union, install a compiled predicate evaluator, and defer the actual page-fix / latch acquisition to scan_start_scan. eval_fnc (query_evaluator.c) is the operator-fusion entry point: it inspects the PRED_EXPR shape and returns one of a handful of optimised function pointers (single-comparison, conjunction, disjunction, between, …) so the per-row inner loop pays one indirect call instead of recursing through a generic predicate tree.

scan_next_heap_scan is the per-tuple next(). The structure is exactly the one the heap-scan deck describes:

// scan_next_heap_scan — src/query/scan_manager.c (condensed)
static SCAN_CODE
scan_next_heap_scan (THREAD_ENTRY * thread_p, SCAN_ID * scan_id)
{
HEAP_SCAN_ID *hsidp = &scan_id->s.hsid;
FILTER_INFO data_filter = { &hsidp->scan_pred, &hsidp->pred_attrs,
NULL, NULL, scan_id->val_list, scan_id->vd,
&hsidp->cls_oid, ... };
while (1)
{
/* 1. Pull the next visible tuple from the heap. */
sp_scan = heap_next (thread_p, &hsidp->hfid, &hsidp->cls_oid, &hsidp->curr_oid,
&recdes, &hsidp->scan_cache, is_peeking);
if (sp_scan != S_SUCCESS)
return (sp_scan == S_END) ? S_END : S_ERROR;
/* 2. Evaluate the WHERE-pushed predicate against the tuple's pred_attrs. */
ev_res = eval_data_filter (thread_p, p_current_oid, &recdes,
&hsidp->scan_cache, &data_filter);
if (ev_res != V_TRUE) continue; /* not qualified — try next */
/* 3. If FOR UPDATE / non-MVCC class, lock and re-fetch — re-evaluate predicate. */
if (scan_id->mvcc_select_lock_needed)
{
sp_scan = locator_lock_and_get_object_with_evaluation (...);
if (mvcc_reev_data.filter_result == V_FALSE) continue;
}
/* 4. Fetch the rest of the columns (those not needed for the predicate). */
if (hsidp->rest_regu_list)
fetch_val_list (thread_p, hsidp->rest_regu_list, scan_id->vd,
&hsidp->cls_oid, p_current_oid, NULL, COPY);
return S_SUCCESS;
}
}

The four-step inner shape — next obj → predicate → maybe-lock → rest values — is the executor-side amplifier of the predicate-attribute split: only pred_attrs are decoded before the predicate is evaluated, so a row that fails the WHERE clause never pays for decoding the projection columns. heap_attrinfo_read_dbvalues (called inside eval_data_filter) decodes only the pred_attrs; the rest_attrs decoder runs only after the row qualifies.

heap_next itself is the heap-side iterator that the executor calls into:

// heap_next — src/storage/heap_file.c (entry; condensed)
SCAN_CODE
heap_next (THREAD_ENTRY * thread_p, const HFID * hfid, OID * class_oid,
OID * next_oid, RECDES * recdes, HEAP_SCANCACHE * scan_cache,
int ispeeking)
{
return heap_next_internal (thread_p, hfid, class_oid, next_oid, recdes,
scan_cache, ispeeking, false /*reversed*/, NULL, NULL);
}
// heap_next_internal — src/storage/heap_file.c (condensed)
static SCAN_CODE
heap_next_internal (THREAD_ENTRY * thread_p, const HFID * hfid, OID * class_oid,
OID * next_oid, RECDES * recdes, HEAP_SCANCACHE * scan_cache,
bool ispeeking, bool reversed_direction, ...)
{
while (true)
{
while (true)
{
/* Reuse the cached page latch if it covers oid; otherwise unfix
* the old one and fix the new vpid via heap_scan_pb_lock_and_fetch. */
if (scan_cache->page_watcher.pgptr != NULL && !VPID_EQ (&vpid, vpidptr_incache))
pgbuf_replace_watcher (thread_p, &scan_cache->page_watcher, &old_page_watcher);
if (scan_cache->page_watcher.pgptr == NULL)
scan_cache->page_watcher.pgptr =
heap_scan_pb_lock_and_fetch (thread_p, &vpid, OLD_PAGE_PREVENT_DEALLOC,
S_LOCK, scan_cache, &scan_cache->page_watcher);
/* Walk to next slot on the page; if we ran off the page, follow
* HEAP_CHAIN.next_vpid and continue. */
...
}
/* Apply mvcc_satisfies_snapshot and follow REC_RELOCATION / REC_BIGONE
* chains to materialise the visible record body into recdes. */
...
}
}

The cache_last_fix_page pattern (the page latch is kept across heap_next calls when consecutive next-OIDs land on the same page) is what gives a heap scan amortised O(1)-per-tuple page-fix cost. See cubrid-heap-manager.md for record-type dispatch (REC_HOME / REC_RELOCATIONREC_NEWHOME / REC_BIGONE → overflow) inside heap_next_internal.

Index scan path — predicate split + heap point-read

Section titled “Index scan path — predicate split + heap point-read”

scan_open_index_scan builds an INDX_SCAN_ID whose three predicate slots — range_pred, key_pred, scan_pred — each carry a regu_list and an eval_fnc-compiled evaluator. scan_next_index_scan walks the B+-tree under btree_range_search, evaluates the range and key predicates during the walk, and for each surviving OID either (a) returns from the covering index directly when the projection is fully covered, or (b) calls locator_attribute_info_force / heap_get_visible_version to fetch the heap row and apply scan_pred against it.

flowchart LR
  Spec["ACCESS_SPEC<br/>(TARGET_CLASS, ACCESS_METHOD_INDEX)"] --> Open["scan_open_index_scan"]
  Open --> SID["INDX_SCAN_ID<br/>{range_pred, key_pred, scan_pred,<br/>cls_regu_list_rest, indx_cov, ...}"]
  SID --> Start["scan_start_scan"]
  Start --> Next["scan_next_index_scan"]
  Next -->|btree_range_search| Btree["B+-tree leaf walk"]
  Btree --> KeyEval["evaluate key_pred"]
  KeyEval -- pass --> CovTest{covering<br/>index?}
  CovTest -- yes --> ListWrite["copy from index key<br/>directly to val_list"]
  CovTest -- no --> Heap["heap_get_visible_version (oid)"]
  Heap --> ScanEval["evaluate scan_pred"]
  ScanEval -- pass --> ListWrite
  ListWrite --> Next

The benefit of pushing predicates this finely is workload-dependent: a query whose key_pred rejects 90 % of the index range pays no heap-fetch tax for those keys, only a B+-tree leaf comparison.

List scan path — feeding parents from materialised children

Section titled “List scan path — feeding parents from materialised children”

When the optimiser produces a sub-block — a derived table, a sorted intermediate, an aptr that can’t be inlined, an aggregation result — the sub-block’s XASL is its own BUILDLIST_PROC that materialises into xasl->list_id. The parent’s access spec is TARGET_LIST and points at the sub-XASL:

SELECT a.col1
FROM (select col1 from tab a where col2 = 3) a, tab b
WHERE a.col1 = b.col1;
flowchart LR
  TopXASL["Outer BUILDLIST<br/>spec_list=[TARGET_LIST(a), TARGET_CLASS(b)]"]
  AptrSub["Sub-XASL<br/>BUILDLIST(tab a)<br/>materialises → list_id"]
  TopXASL -- aptr_list --> AptrSub
  TopXASL --> ScanA["scan_open_list_scan over a.list_id"]
  TopXASL --> ScanB["scan_open_heap_scan over tab b"]
  ScanA --> Next1["scan_next_list_scan tuple"]
  ScanB --> Next2["scan_next_heap_scan tuple"]

scan_next_list_scan walks the list file via qfile_scan_list_next, applies scan_pred against the materialised tuple, and projects via rest_regu_list — exactly the same shape as heap-scan but with qfile_scan_list_next instead of heap_next and no MVCC step (the list file already holds projected DB_VALUEs).

The interpreter never calls a scan-type-specific next directly; it goes through scan_next_scan:

// scan_next_scan — src/query/scan_manager.c (condensed)
SCAN_CODE
scan_next_scan (THREAD_ENTRY * thread_p, SCAN_ID * s_id)
{
return scan_handle_single_scan (thread_p, s_id, scan_next_scan_local);
}

scan_handle_single_scan adds wrappers that handle (a) single_fetch semantics for SELECT-INTO-style queries, (b) per-tuple statistics counters, and (c) the qualification resolution loop. scan_next_scan_local is the actual dispatch:

// scan_next_scan_local — src/query/scan_manager.c (condensed)
static SCAN_CODE
scan_next_scan_local (THREAD_ENTRY * thread_p, SCAN_ID * scan_id)
{
switch (scan_id->type)
{
case S_HEAP_SCAN:
case S_HEAP_SCAN_RECORD_INFO:
case S_HEAP_SAMPLING_SCAN:
return scan_next_heap_scan (thread_p, scan_id);
case S_INDX_SCAN:
return scan_next_index_scan (thread_p, scan_id);
case S_LIST_SCAN:
return scan_next_list_scan (thread_p, scan_id);
case S_CLASS_ATTR_SCAN:
return scan_next_class_attr_scan (thread_p, scan_id);
case S_SHOWSTMT_SCAN:
return scan_next_showstmt_scan (thread_p, scan_id);
case S_SET_SCAN:
return scan_next_set_scan (thread_p, scan_id);
case S_JSON_TABLE_SCAN:
return scan_next_json_table_scan (thread_p, scan_id);
case S_VALUES_SCAN:
return scan_next_value_scan (thread_p, scan_id);
case S_DBLINK_SCAN:
return scan_next_dblink_scan (thread_p, scan_id);
case S_HEAP_PAGE_SCAN:
return scan_next_heap_page_scan (thread_p, scan_id);
case S_INDX_KEY_INFO_SCAN:
return scan_next_index_key_info_scan (thread_p, scan_id);
/* ... S_PARALLEL_HEAP_SCAN: returned by a different path, see below ... */
default:
return S_ERROR;
}
}

Adding a new scan operator is a closed surgery: extend SCAN_TYPE, add the union arm, add an open / next / start / end / close set, and append a case to this switch.

SELECT col1, col2 FROM t1 WHERE col1 = 1;

The plan is a single BUILDLIST_PROC with one TARGET_CLASS access spec on t1. Optimiser chose ACCESS_METHOD_SEQUENTIAL (no usable index, or seqscan cost wins). The runtime path is:

sequenceDiagram
  participant CL as Client
  participant QM as qmgr_process_query
  participant QE as qexec_execute_query
  participant MB as qexec_execute_mainblock_internal
  participant IPF as qexec_intprt_fnc
  participant SC as scan_next_scan / scan_next_heap_scan
  participant HF as heap_next_internal
  participant LF as list_file
  CL->>QM: xqmgr_execute_query (xasl_id, host_vars)
  QM->>QE: qexec_execute_query (xasl_p)
  QE->>MB: qexec_execute_mainblock_internal
  Note over MB: Pre-processing
  MB->>MB: qexec_open_scan(spec=t1) → scan_open_heap_scan
  MB->>MB: aptr loop (none here)
  MB->>IPF: dispatch BUILDLIST_PROC
  Note over IPF: Processing — pull loop
  loop while qexec_next_scan_block_iterations == S_SUCCESS
    IPF->>SC: scan_start_scan + scan_next_scan
    loop while scan_next_scan == S_SUCCESS
      SC->>HF: heap_next (next OID, fetch RECDES)
      HF-->>SC: RECDES + page latch (cached)
      SC->>SC: heap_attrinfo_read_dbvalues (pred_attrs)
      SC->>SC: pr_eval_fnc(scan_pred) — col1 = 1 ?
      alt qualified
        SC->>SC: fetch_val_list (rest_regu_list — col2)
        SC-->>IPF: S_SUCCESS
        IPF->>IPF: dptr loop, scan_ptr loop, if_pred
        IPF->>LF: qexec_end_one_iteration → write tuple
      else not qualified
        SC->>SC: continue (no rest fetch)
      end
    end
    SC->>SC: scan_end_scan
  end
  Note over MB: Post-processing
  MB->>MB: groupby? no. orderby? no. analytic? no.
  MB->>MB: qexec_close_scan
  MB-->>QE: NO_ERROR
  QE-->>QM: xasl->list_id
  QM-->>CL: cloned QFILE_LIST_ID

The dotted-line property the deck spelled out: nothing about this loop changes for a join — a join just makes xasl->scan_ptr non-NULL, and the per-row body in qexec_intprt_fnc recursively calls qexec_execute_scan on the inner block. Likewise, a correlated subquery just adds a dptr_list entry that runs once per qualified outer row.

Block iterations — (a ∪ b) ⋈ (c ∪ d)

Section titled “Block iterations — (a ∪ b) ⋈ (c ∪ d)”

When an access spec covers multiple targets — typically a partitioned class (ALL super_class) or an OID-list spec — the spec carries a chain of ACCESS_SPEC_TYPE nodes. The interpreter handles this through qexec_next_scan_block_iterations, which advances the spec chain top-down and resets every joined-block scan whenever an outer block resets.

flowchart LR
  Out0["outer access_spec<br/>= (a, b)"] -- block 1: a --> InRing0
  Out0 -- block 2: b --> InRing1
  subgraph InRing0[" "]
    direction LR
    In0a["inner access_spec<br/>= (c, d)<br/>block 1: c"] --> In0b["inner block 2: d"]
  end
  subgraph InRing1[" "]
    direction LR
    In1a["inner access_spec<br/>= (c, d)<br/>block 1: c (reset)"] --> In1b["inner block 2: d"]
  end

The four-block expansion (a∪b)⋈(c∪d) = (a⋈c)∪(a⋈d)∪(b⋈c)∪(b⋈d) falls out by running the outer block selector through qexec_next_scan_block on the outermost XASL, then recursively resetting all scan_ptr chains beneath it via the loop the v0.7 deck reproduces.

// qexec_next_scan_block_iterations — src/query/query_executor.c (sketch)
static SCAN_CODE
qexec_next_scan_block_iterations (THREAD_ENTRY * thread_p, XASL_NODE * xasl)
{
/* find the deepest scan_ptr that still has a qualified block */
for (last_xptr = xasl; last_xptr->scan_ptr; last_xptr = last_xptr->scan_ptr)
if (!last_xptr->next_scan_block_on || (...))
break;
/* advance that scan_ptr to its next block */
xs_scan = qexec_next_scan_block (thread_p, last_xptr);
/* reset every scan_ptr below it back to its first block */
for (xptr2 = last_xptr; xptr2; xptr2 = xptr2->scan_ptr)
qexec_next_scan_block (thread_p, xptr2->scan_ptr);
/* reset every scan above it to re-run on the new combination */
while (last_xptr != xasl) { /* unwind back up */ }
}

Subqueries — APTR runs once, DPTR runs per row, REGU_VARIABLE binds the result

Section titled “Subqueries — APTR runs once, DPTR runs per row, REGU_VARIABLE binds the result”

Both subquery kinds end up as separate XASL trees and execute via recursive qexec_execute_mainblock. The difference is when and how the result reaches the parent.

For an APTR (xasl->aptr_list), the outer’s pre-processing runs qexec_execute_mainblock(aptr) once. The result list is referenced via a REGU_VARIABLE of type CONSTANT whose value.dbvalptr points to a slot the executor populates from the materialised tuple. The aptr’s XASL has XASL_LINK_TO_REGU_VARIABLE set so that when the executor walks the aptr_list later, it skips it (the result is already materialised and bound).

For a DPTR (xasl->dptr_list), the per-row loop runs qexec_execute_mainblock(dptr) for each outer row. Inside the dptr’s scan, the outer’s correlation columns are visible because they live in the shared VAL_DESCR (via the XASL_STATE).

For a derived-table-style FROM-clause subquery, the outer’s spec is TARGET_LIST pointing at the inner XASL; pre-processing runs the inner once via aptr, and the outer reads the result with scan_open_list_scan.

Post-processing — group-by, order-by, analytic

Section titled “Post-processing — group-by, order-by, analytic”

After the main pull loop completes, the executor runs three optional post-processing phases on xasl->list_id:

// post-processing tail — qexec_execute_mainblock_internal (sketch)
{
/* ... main loop drained ... */
/* GROUP BY — sort or hash, then evaluate aggregates per group */
if (xasl->type == BUILDLIST_PROC && xasl->proc.buildlist.groupby_list)
qexec_groupby (thread_p, xasl, xasl_state, &tplrec);
/* analytic functions (window) — separate post-pass over list_id */
if (xasl->type == BUILDLIST_PROC && xasl->proc.buildlist.a_eval_list)
qexec_execute_analytic (thread_p, xasl, xasl_state, ...);
/* ORDER BY / DISTINCT — sort_listfile with SORT_ELIM_DUP or SORT_DUP */
if (xasl->orderby_list || (XASL_IS_FLAGED (xasl, XASL_HAS_DISTINCT)))
qexec_orderby_distinct (thread_p, xasl, query_opt_mode, &tplrec);
}

Group-by has two implementations — sort group-by and hash group-by — chosen by the optimiser:

flowchart LR
  subgraph Sort["sort group-by (qexec_groupby)"]
    SR["result list"] --> SS["sort_listfile"]
    SS --> SI["iterate sorted run"]
    SI --> SA["accumulate ACC<br/>(qdata_evaluate_aggregate_list)"]
    SA --> SO["result list"]
  end
  subgraph Hash["hash group-by"]
    HM["per-row evaluation<br/>(qexec_hash_gby_agg_tuple)"] --> HACC["in-memory hash ACC"]
    HACC -- spill --> HPL["partial list (sorted)"]
    HPL --> HM2["sort_listfile"]
    HM2 --> HMG["merge eq groups<br/>(qdata_aggregate_accumulator_to_accumulator)"]
    HMG --> HSA["accumulate residual ACC"]
    HSA --> HO["result list"]
  end

Hash group-by switches to the spill path when the in-memory hash crosses the configured budget (prm_get_integer_value(PRM_ID_GBY_HASH_TO_SCAN_THRESHOLD)); the spilled partial list is then merged via the same qdata_aggregate_accumulator_to_accumulator path. Order-by and DISTINCT both flow through sort_listfile with the SORT_ELIM_DUP flag for DISTINCT.

Parallel heap scan — the one push-side detour

Section titled “Parallel heap scan — the one push-side detour”

For sequential heap scans on large heaps, CUBRID can split the file across multiple worker threads. qexec_open_scan first tries scan_open_parallel_heap_scan; if it fails (small heap, partition too small to split), it falls back to single-thread scan_open_heap_scan. The parallel implementation lives under src/query/parallel/; the SCAN_ID arm is S_PARALLEL_HEAP_SCAN and produces results that the rest of the executor reads as if they were a list scan.

// scan_open_parallel_heap_scan — src/query/scan_manager.c (sketch via qexec_open_scan)
#if SERVER_MODE && !WINDOWS
error_code = scan_open_parallel_heap_scan (thread_p, s_id,
mvcc_select_lock_needed, fixed,
grouped, vd, curr_spec,
&ACCESS_SPEC_CLS_OID (curr_spec),
&ACCESS_SPEC_HFID (curr_spec),
xasl, query_id);
if (s_id->type == S_PARALLEL_HEAP_SCAN)
{
assert (s_id->s.phsid.manager != nullptr);
}
else
{
/* fallback — small heap; revert to S_HEAP_SCAN */
scan_open_heap_scan (thread_p, s_id, ...);
}
#endif

Parallel-heap is the one place the executor briefly leaves pure pull-mode: the workers push partial result tuples into a coordinator queue that the parent reads back in pull style.

Anchor on symbol names. Line numbers are scoped to this document’s updated: date and are intended only as quick hints; the symbol names are stable.

  • xqmgr_execute_query (query_manager.c) — wire-side server entry; looks up the cached XASL.
  • qmgr_process_query (query_manager.c) — unpack stream → XASL tree, call qexec_execute_query, clone/return list id.
  • qexec_execute_query (query_executor.c) — top entry; sets up XASL_STATE, returns the top-level QFILE_LIST_ID.
  • qexec_execute_mainblock / qexec_execute_mainblock_internal — recursion-depth wrapper + the operator-type dispatch switch.
  • qexec_intprt_fnc — generic pull-loop interpreter (BUILDLIST/BUILDVALUE/SCAN_PROC).
  • qexec_execute_scan / qexec_execute_scan_ptr — recursive entry for xasl->scan_ptr (joined-block chain).
  • qexec_open_scan — fan-out from access spec to scan_open_<type>_scan.
  • qexec_next_scan_block / qexec_next_scan_block_iterations — advance access-spec chain across UNION-ALL of partitions or sub-classes.
  • qexec_end_one_iteration / qexec_end_mainblock_iterations — write qualified row, close down the main block’s scans.
  • qexec_execute_update / qexec_execute_delete / qexec_execute_insert / qexec_execute_merge — pair a BUILDLIST row source with the mutation in locator_* / heap_*.
  • qexec_execute_dblink_queryTARGET_DBLINK leaf.
  • qexec_execute_do_stmt — DO-statement (no result, side-effects only).
  • qexec_execute_connect_by — hierarchical query (CONNECT BY).
  • qexec_execute_cte — CTE materialisation/recursion (xasl->max_iterations ceiling).
  • qexec_groupby / qexec_groupby_index — sort and index-driven group-by.
  • qexec_hash_gby_agg_tuple / qexec_hash_gby_get_next / qexec_hash_gby_put_next — hash group-by per-row, cursor wrappers.
  • qdata_evaluate_aggregate_list / qdata_aggregate_accumulator_to_accumulator — accumulator update; merge for hash spill.
  • qdata_save_agg_htable_to_list / qdata_load_agg_hvalue_in_agg_list — spill out / load back hash accumulators.
  • qexec_orderby_distinct / qexec_orderby_distinct_by_sorting — order-by + distinct via sort_listfile (SORT_ELIM_DUP / SORT_DUP).
  • qexec_execute_analytic — analytic-function (window) post-pass.

Scan management (src/query/scan_manager.{c,h})

Section titled “Scan management (src/query/scan_manager.{c,h})”
  • SCAN_ID — operator state (tagged union); SCAN_TYPE enum lists all scan kinds (S_HEAP_SCAN, S_INDX_SCAN, S_LIST_SCAN, S_CLASS_ATTR_SCAN, S_SET_SCAN, S_JSON_TABLE_SCAN, S_VALUES_SCAN, S_SHOWSTMT_SCAN, S_DBLINK_SCAN, S_PARALLEL_HEAP_SCAN, S_HEAP_PAGE_SCAN, S_INDX_KEY_INFO_SCAN).
  • S_STATUS lifecycle (S_OPENED / S_STARTED / S_ENDED / S_CLOSED); S_DIRECTION, S_POSITION.
  • scan_init_scan_id / scan_init_scan_pred / scan_init_scan_attrs — common initialisers.
  • scan_open_heap_scan / scan_open_index_scan / scan_open_list_scan — the three principal opens.
  • scan_open_class_attr_scan / scan_open_set_scan / scan_open_values_scan / scan_open_json_table_scan / scan_open_dblink_scan / scan_open_showstmt_scan / scan_open_heap_page_scan — secondary leaves.
  • scan_open_parallel_heap_scan — worker-pool-driven heap scan; falls back to single-thread scan_open_heap_scan.
  • scan_start_scan / scan_end_scan / scan_close_scan — per-block / file-level resource lifecycle.
  • scan_next_scan_block — advance access-spec chain inside one SCAN_ID.
  • scan_next_scanscan_handle_single_scanscan_next_scan_local — outer dispatcher chain.
  • scan_next_heap_scan / scan_next_index_scan / scan_next_list_scan — the three principal nexts.
  • scan_next_class_attr_scan / scan_next_set_scan / scan_next_value_scan / scan_next_json_table_scan / scan_next_dblink_scan / scan_next_heap_page_scan / scan_next_index_key_info_scan — secondary nexts.

Heap-side helpers (src/storage/heap_file.c)

Section titled “Heap-side helpers (src/storage/heap_file.c)”
  • heap_next / heap_prev — forward/backward iterators; thin wrappers over heap_next_internal (page-fix, slot walk, MVCC visibility, REC_RELOCATION / REC_BIGONE chain).
  • heap_next_sampling / heap_next_record_info — sampling and record-info variants.
  • heap_first / heap_last — single-fetch / positional-index entry points.
  • heap_get_visible_version / heap_get_visible_version_internal / heap_prepare_object_page — point-read with snapshot evaluation and version-chain walk.
  • heap_attrinfo_start / heap_attrinfo_end / heap_attrinfo_read_dbvalues — decoder cache lifecycle and RECDES → DB_VALUE conversion.
  • eval_data_filter / eval_fnc / eval_pred* (query_evaluator.c) — FILTER_INFO bridge, per-shape compiled predicate, fast paths.
  • fetch_val_list / fetch_peek_dbval (fetch.c) — REGU_VARIABLE → DB_VALUE materialisation.
  • qexec_execute_obj_fetch — bptr (path-expression) fetcher.
  • QFILE_LIST_ID, qfile_open_list / qfile_close_list, qfile_scan_list_next, qfile_save_tuple, qfile_clone_list_id, sort_listfile (list_file.c) — list-file I/O.
  • xasl_node — the operator/plan node.
  • proc_typeBUILDLIST_PROC / BUILDVALUE_PROC / UNION_PROC / INTERSECTION_PROC / DIFFERENCE_PROC / MERGELIST_PROC / OBJFETCH_PROC / SCAN_PROC / UPDATE_PROC / DELETE_PROC / INSERT_PROC / MERGE_PROC / CONNECTBY_PROC / CTE_PROC / DO_PROC.
  • ACCESS_SPEC_TYPETARGET_CLASS / TARGET_LIST / TARGET_REGUVAL_LIST / TARGET_SET / TARGET_DBLINK / TARGET_JSON_TABLE / TARGET_VALUES_LIST / TARGET_SHOWSTMT.
  • ACCESS_METHODACCESS_METHOD_SEQUENTIAL / ACCESS_METHOD_INDEX / ACCESS_METHOD_SEQUENTIAL_RECORD_INFO / ..._SAMPLING_SCAN.
SymbolFileLine
xqmgr_execute_queryquery_manager.c1299
qmgr_process_queryquery_manager.c1177
qmgr_get_query_entryquery_manager.c543
qexec_execute_queryquery_executor.c16292
qexec_execute_mainblockquery_executor.c14900
qexec_execute_mainblock_internalquery_executor.c15058
qexec_intprt_fncquery_executor.c9092
qexec_open_scanquery_executor.c7303
qexec_execute_scanquery_executor.c8300
qexec_execute_scan_ptrquery_executor.c8261
qexec_next_scan_blockquery_executor.c7851
qexec_next_scan_block_iterationsquery_executor.c7979
qexec_end_one_iterationquery_executor.c1130
qexec_end_mainblock_iterationsquery_executor.c14701
qexec_groupbyquery_executor.c5247
qexec_groupby_indexquery_executor.c20526
qexec_hash_gby_agg_tuplequery_executor.c4537
qexec_hash_gby_get_nextquery_executor.c4778
qexec_hash_gby_put_nextquery_executor.c4793
qexec_orderby_distinctquery_executor.c3936
qexec_orderby_distinct_by_sortingquery_executor.c4000
qexec_execute_analyticquery_executor.c21007
qexec_execute_obj_fetchquery_executor.c13182
qexec_execute_connect_byquery_executor.c16716
qexec_execute_ctequery_executor.c17472
scan_open_heap_scanscan_manager.c2846
scan_open_index_scanscan_manager.c3067
scan_open_list_scanscan_manager.c3715
scan_start_scanscan_manager.c4136
scan_next_scan_blockscan_manager.c4606
scan_end_scanscan_manager.c4749
scan_close_scanscan_manager.c4882
scan_next_scan_localscan_manager.c5193
scan_next_heap_scanscan_manager.c5361
scan_next_index_scanscan_manager.c5909
scan_next_list_scanscan_manager.c6647
scan_next_scanscan_manager.c7329
heap_next_internalstorage/heap_file.c7902
heap_nextstorage/heap_file.c19427
heap_firststorage/heap_file.c8483
heap_laststorage/heap_file.c8511
heap_attrinfo_read_dbvaluesstorage/heap_file.c10683
heap_get_visible_versionstorage/heap_file.c25456
heap_get_visible_version_internalstorage/heap_file.c25577
heap_prepare_object_pagestorage/heap_file.c25856

The raw analysis (analysis_QP_executor_0.7.pptx, dated 2020-03-16) is version 0.7 — explicitly an early exploratory pass — and tracks the executor as it was around CUBRID 10.x / 11-early. The Heap scan.pdf companion is more recent and aligned with the current source. Several aspects have shifted since the v0.7 deck was authored; flag these when reading the deck against the current tree:

  • The deck’s narrative entry point is qexec_execute_mainblock(). The current code adds an _internal indirection: qexec_execute_mainblock is just a recursion-depth + perfmon wrapper around qexec_execute_mainblock_internal. Symbol names quoted from the deck still resolve, but the dispatch switch lives one frame deeper.
  • SELECT_PROC is no longer a separate proc type. The deck’s slide on “EXECUTOR (qexec_execute_mainblock())” lists pre/processing/post-processing as the only code path, implicitly assuming a SELECT_PROC. The current code uses BUILDLIST_PROC and BUILDVALUE_PROC (and several others) and routes them through the same qexec_intprt_fnc interpreter; the proc type is what the optimiser writes, not what the executor branches on at the top level.
  • DML proc types (UPDATE_PROC/DELETE_PROC/INSERT_PROC/MERGE_PROC). The deck’s flowchart shows only the SELECT path. The current qexec_execute_mainblock_internal has dedicated arms for DML proc types that route to qexec_execute_update / _delete / _insert / _merge; these are not covered by the v0.7 narrative.
  • Hierarchical / CTE proc types. CONNECTBY_PROC and CTE_PROC are dispatched separately — neither is in the deck. The CTE recursive-iteration counter (xasl->max_iterations) drives a cte_offset_read_tuple hand-shaking inside qexec_intprt_fnc that postdates the deck.
  • Parallel heap scan. S_PARALLEL_HEAP_SCAN and scan_open_parallel_heap_scan are recent (SERVER_MODE && !WINDOWS only). The deck’s heap-scan flow assumes a single-threaded S_HEAP_SCAN. For large heaps with no usable index, the dispatch in qexec_open_scan first tries parallel and falls back to single-thread.
  • S_JSON_TABLE_SCAN / S_DBLINK_SCAN / S_VALUES_SCAN / S_SHOWSTMT_SCAN. Modern leaf scan types not present in the v0.7 deck. The same open/start/next/end/close shape applies, but each has its own scan_open_* / scan_next_* set.
  • Hash group-by spill mechanics. The deck’s diagram shows the partial-list-merge logic but does not name PRM_ID_GBY_HASH_TO_SCAN_THRESHOLD or the exact spill predicate. Current code uses a budgeted in-memory hash with explicit spill via qdata_save_agg_htable_to_list when the budget is exceeded; the merge step uses qdata_aggregate_accumulator_to_accumulator to combine equal-key accumulators.
  • grouped / single_fetch semantics on SCAN_ID. The deck does not detail these flags, but they meaningfully alter scan_next_heap_scan (fixed page latch + PEEK semantics under grouped) and the higher wrapper scan_handle_single_scan (single-tuple SELECT-INTO).
  • MVCC visibility was reworked. The deck pre-dates heap_get_visible_version / heap_get_visible_version_internal in their current form; in particular, the prev_version_lsa chain walking the undo log for too-new versions (see cubrid-mvcc.md / cubrid-heap-manager.md) is materially different from the deck’s brief MVCC-snapshot reference.
  • The heap-scan PDF’s five-step lifecycle (scan_open / scan_start / scan_next / scan_end / scan_close) is correct as of current source. The PDF’s description of the four-step inner loop in scan_next_heap_scan (next obj → predicate → maybe-lock → rest) also matches the current code body.
  1. Cost of the per-tuple virtual call. scan_next_scanscan_handle_single_scanscan_next_scan_localscan_next_<type>_scan is at least three indirect calls per tuple, on top of pr_eval_fnc. For TPC-H-shaped scans on cold caches, the cache-miss cost on the indirect targets is plausibly the bottleneck. Investigation path: PMU profiling on scan_next_heap_scan under sustained scans; consider the cost of a vector-batched alternative.
  2. cache_last_fix_page invariants under MVCC re-fetch. When mvcc_select_lock_needed == true and the row needs a re-fetch with the lock, the page LSA may have advanced (PGBUF_IS_PAGE_CHANGED), at which point the code drops PEEK to COPY and restarts. Are there cases where this loop does not converge — for instance, a heavily-updated heap page under repeated SELECT FOR UPDATE?
  3. Hash group-by spill threshold default. PRM_ID_GBY_HASH_TO_SCAN_THRESHOLD controls when the in-memory hash spills. The current default has not been re-tuned against modern memory sizes; benchmark sweep would surface a better default (or argue for a workload-adaptive policy).
  4. xasl->max_iterations for recursive CTE. The interpreter terminates a recursive CTE when recursive_iterations >= xasl->max_iterations (qexec_intprt_fnc). What does the optimiser set this to? Is it user-tunable, or is it a hard ceiling? Investigation path: track all assignments to xasl_node::max_iterations in xasl_generation/.
  5. count_star_with_iscan_opt correctness. The optimisation that skips per-key iteration for SELECT COUNT(*) FROM t when there is a usable index relies on specp->s_id.s.isid.need_count_only = true and a single oids_count accumulation per scan block. Are there filter shapes that silently bypass this and quietly produce wrong counts?
  6. Parallel heap scan fallback boundary. scan_open_parallel_heap_scan falls back to single-thread when the heap is small or the partition makes parallelism uneconomic. The exact boundary is internal; document the predicate and how it interacts with the optimiser’s cost model.
  7. S_DBLINK_SCAN push-down. Predicates that would normally split into key/scan/data for a local index — what fraction of those are pushed to the remote? Are there predicate shapes (NLS comparisons, JSON ops) that are silently re-evaluated locally?
  8. SELECT_PROC versus BUILDLIST_PROC history. Was there ever a distinct SELECT_PROC proc type that was later folded into BUILDLIST? The v0.7 deck is ambiguous; git history should resolve this. (A git log --diff-filter=D -S 'SELECT_PROC' pass on xasl/ would settle it.)
  9. XASL_LINK_TO_REGU_VARIABLE lifecycle. When an aptr’s result is bound into a parent’s REGU_VARIABLE, re-execution of the same XASL (e.g. inside a UNION-ALL spec) must not re-run the aptr. The flag suppresses re-execution but the ownership semantics of the bound DB_VALUE are subtle; investigate corner cases when an aptr returns a SET-typed value.

Raw analyses (raw/code-analysis/cubrid/query-processing/)

Section titled “Raw analyses (raw/code-analysis/cubrid/query-processing/)”
  • analysis_QP_executor_0.7.pptx — early (2020-03) executor overview deck; main-block flow, sub-query placement, group-by sketch.
  • Heap scan.pdf — heap-scan deep dive; the five-step scan lifecycle, four-step inner loop, HEAP_SCANCACHE/HEAP_CACHE_ATTRINFO local-cache framing.
  • _converted/analysisqpexecutor0.7.pptx.md — markitdown extract of the PPTX (used inline above).
  • _converted/heap-scan.pdf.md — markitdown extract of the PDF.
  • knowledge/code-analysis/cubrid/cubrid-heap-manager.md — record-type vocabulary (REC_HOME / REC_RELOCATION / REC_BIGONE), heap_next page-walk, and the five caches that HEAP_SCANCACHE participates in.
  • knowledge/code-analysis/cubrid/cubrid-mvcc.mdmvcc_satisfies_snapshot, prev_version_lsa chain, and how visibility plugs into the heap iterator.
  • Database Internals (Petrov), Ch. 12 “Query Execution” — pull vs push, operator state, materialisation strategies.
  • Database Systems: The Complete Book (Garcia-Molina/Ullman/Widom), Ch. 15 “Query Execution” + Ch. 16 “The Query Compiler” — Volcano model framing, sort/hash/group operator pseudocode.
  • Graefe, Goetz. Volcano — An Extensible and Parallel Query Evaluation System, IEEE TKDE 6(1), 1994 — the canonical iterator-model paper; introduces open/next/close and the exchange operator.
  • Graefe, Goetz. Query Evaluation Techniques for Large Databases, ACM Computing Surveys 25(2), 1993 — broader survey including blocking-operator design and join algorithms.

CUBRID source (/data/hgryoo/references/cubrid/)

Section titled “CUBRID source (/data/hgryoo/references/cubrid/)”
  • src/query/query_executor.c — the executor (~28 KLOC). Entry: qexec_execute_mainblock_internal.
  • src/query/query_manager.c — server-side query plumbing; entry: xqmgr_execute_query.
  • src/query/scan_manager.c — the per-tuple iterator dispatcher; entry: scan_next_scan.
  • src/query/scan_manager.hSCAN_ID, SCAN_TYPE, S_STATUS definitions.
  • src/query/fetch.cfetch_val_list / fetch_peek_dbval regu-variable materialisation.
  • src/query/query_evaluator.ceval_fnc, eval_data_filter, the per-shape predicate fast paths.
  • src/query/list_file.c / list_file.hQFILE_LIST_ID, qfile_scan_list_next, sort_listfile.
  • src/query/parallel/ — parallel heap-scan worker pool implementation.
  • src/storage/heap_file.cheap_next / heap_next_internal and the heap-side scan helpers used from scan_next_heap_scan.
  • src/xasl/xasl_node.hppxasl_node, proc type constants, ACCESS_SPEC_TYPE.