CUBRID Query Optimizer — Query Graph, Cost Model, Join Enumeration, and Compiled Plan
Contents:
- Theoretical Background
- Common DBMS Design
- CUBRID’s Approach
- Source Walkthrough
- Cross-check Notes
- Open Questions
- Sources
Theoretical Background
Section titled “Theoretical Background”A query optimizer turns a declarative SQL statement into a procedural execution plan by exploring the space of equivalent plans and picking the one with the smallest estimated cost. Database Internals (Petrov, ch. 12 “Anti-Entropy and Dissemination” is unrelated; the relevant chapters are 5–6 on B-trees and storage and ch. “Query Processing and Optimization” in the print edition; in the open-source canon the same material appears as Hellerstein–Stonebraker–Hamilton, Architecture of a Database System, §6 “Relational Query Processor”). Three problems sit underneath the work:
-
Plan-space enumeration. For an N-way join, the number of join trees is super-exponential (Catalan(N-1) shapes × N! orderings). The two standard tractable approximations are the left-deep dynamic program of Selinger et al., Access Path Selection in a Relational Database Management System (SIGMOD 1979) — which builds best plans for every subset of size k from best plans for subsets of size k−1, in O(2^N · N) time and O(2^N) space — and bushy-plan search à la Cascades (Graefe, The Cascades Framework for Query Optimization, IEEE Data Eng. Bull. 1995) which uses memoised transformation rules on a Volcano-style search graph. Above ~12 tables both fall back to heuristic search: greedy operator ordering (GOO), Iterative Improvement, Simulated Annealing, or genetic search (PostgreSQL’s
geqo). -
Cardinality / cost estimation. A plan’s cost is the sum of per-operator costs, each modelled as a function of the operator’s input cardinality and per-row work. Cardinalities are propagated bottom-up using selectivity assumptions — independence, uniformity, principle of inclusion — that are known to be unreliable but cheap. The “fixed vs. variable” distinction in the cost model (Selinger §3.2) splits costs into a startup component paid once per scan (e.g., B-tree-root descent) and a per-row component scaled by estimated cardinality.
-
Access-path selection. For each base table, decide between sequential scan, single-column index, multi-column index, covering index, index skip-scan, etc. For each pair of already-joined sub-plans, decide between nested-loop, index-nested-loop, sort-merge, hash join. These are the “physical operator” decisions that the cost model ranks.
The Selinger paper introduced two specific tactics worth naming because they shape every later optimizer, including CUBRID’s:
- Interesting orders. A sub-plan that produces tuples
already sorted on a join column (because it used an ordered
index, or because a previous merge join sorted them) is worth
keeping even if its cost is higher than an unordered alternative,
because a later sort-merge join can avoid an explicit sort.
CUBRID materialises this as a per-equivalence-class
planvecperQO_INFOnode — every interesting order keeps its own best plan. - Pruning by dominance. When two plans cover the same
subset of base tables and produce the same interesting order,
the strictly more expensive one is discarded immediately.
CUBRID’s
qo_check_plan_on_infois the dominance test.
CUBRID’s plan-search structure follows Selinger directly with
two CUBRID-specific deviations: (a) on a partition graph that is
not strongly connected, each connected sub-graph is optimised
independently and the results are combined by Cartesian product
(qo_search_partition / qo_combine_partitions); and (b)
above eight tables the planner switches from full DP to a
partial-join “first-node-then-extend” heuristic, scoping the
DP window to a small join_unit (8 → 3 → 2 as table count
grows). This is documented inline in qo_search_partition_join
as a Sybase Adaptive Server-style adaptation.
Common DBMS Design
Section titled “Common DBMS Design”Every cost-based optimizer — System R, PostgreSQL, MySQL, Oracle, SQL Server, CUBRID — adopts the same core vocabulary on top of the textbook framing. Naming the conventions first makes the CUBRID-specific identifiers in the next section read as one set of dials within a shared design space, not as inventions.
Query graph + cost-annotated relational algebra
Section titled “Query graph + cost-annotated relational algebra”The optimizer’s working representation is a graph in which
nodes are base relations or derived tables and edges are
join predicates. Sargable single-relation predicates hang off
the node; non-sargable predicates (subqueries, complex
expressions) are deferred to post-join filters. PostgreSQL calls
this the RelOptInfo / RestrictInfo graph; SQL Server calls
it the “memo”; Oracle calls it the “join graph”; CUBRID calls it
the Query Graph with QO_NODE for relations, QO_TERM for
predicates and QO_SEGMENT for column references between them.
Bitset-keyed dynamic programming
Section titled “Bitset-keyed dynamic programming”Selinger DP keys plans on the subset of relations they
cover, encoded as an integer bitset over the relation IDs.
For N relations, the join-info table has 2^N entries.
PostgreSQL keeps this in root->join_rel_list (linked) plus a
hash; CUBRID allocates a flat join_info array of
QO_JOIN_INFO_SIZE = 1 << bitset_cardinality(partition.nodes)
slots, indexed directly by the partition-relative bitset.
Per-equivalence-class plan vector
Section titled “Per-equivalence-class plan vector”Within one DP slot, the optimizer can keep multiple non-dominated
plans, one per interesting order (one per equivalence class of
join columns). CUBRID’s QO_PLANVEC holds up to NPLANS = 4
plans per (info, order) pair; PostgreSQL’s pathlist is the
analogous collection.
Cost model: fixed + variable, CPU + IO
Section titled “Cost model: fixed + variable, CPU + IO”The standard model splits cost into four slots:
fixed-cpu, fixed-io, variable-cpu, variable-io. “Fixed” is paid
once per operator activation (e.g., descending the B-tree root
to the first leaf); “variable” scales with rows. Selectivity is
the independence-assumption product of per-predicate
selectivities (clamped from below by 1 / pkeys[i] to avoid
under-estimation), and cardinalities propagate by multiplying
relation cardinality by aggregate selectivity. CUBRID names
the four cost slots exactly this way on qo_plan and on
qo_summary.
Access-path candidates and join-method candidates
Section titled “Access-path candidates and join-method candidates”For each QO_NODE the optimizer enumerates {seq scan, one
index scan per usable B-tree index, group-by/order-by index
scan, loose / index-skip scan}. For each pair of sub-plans it
enumerates {nested-loop, index-correlated-NL, sort-merge,
hash-list}. CUBRID’s three qo_examine_* routines —
qo_examine_nl_join, qo_examine_idx_join,
qo_examine_merge_join — are the dispatch points; PostgreSQL’s
add_paths_to_joinrel plays the same role, calling match_unsorted_outer, sort_inner_and_outer, etc.
Three engine families and where CUBRID sits
Section titled “Three engine families and where CUBRID sits”- PostgreSQL uses left-deep DP up to
geqo_threshold(default 12), then switches to a genetic algorithm. The search structure is bottom-up only; no rule-based rewrites during search. - MySQL uses a greedy nested-loop ordering with
straight_joinhints. The cost model is simpler (no interesting orders; hash join was added only in 8.0). Bushy plans are not considered. - SQL Server (and Apache Calcite, MemSQL, CockroachDB) use a Cascades-style memoised search: rules transform sub-plans in a memo until cost converges. This produces bushy plans naturally but is harder to bound.
- Oracle uses a multi-strategy optimizer: a Selinger-like DP for small joins, with rule-based access-path elimination before search and a genetic / heuristic fallback for very large joins.
CUBRID sits in the System R lineage: bottom-up DP with
interesting orders, no transformation rules during search, but
with the partial-join window of late-Sybase variants for joins
beyond eight tables. The novel mechanic is the two-pass
search inside one partition: a partial-join sweep extends a
prefix one node at a time until join_unit tables are visited,
then if no plan is found, falls back to a full join search.
Theory ↔ CUBRID mapping
Section titled “Theory ↔ CUBRID mapping”| Theoretical concept | CUBRID name |
|---|---|
| Per-relation graph node | QO_NODE (one per PT_SPEC in FROM) |
| Inter-relation graph edge (join predicate) | QO_TERM with term_class == QO_TC_JOIN |
| Per-relation single-row predicate (sargable) | QO_TERM with term_class == QO_TC_SARG, indexed by QO_NODE_SARGS(node) |
| Column reference (segment) in the graph | QO_SEGMENT (one per attribute use, with head = owning node) |
| Equivalence class of join columns (interesting order) | QO_EQCLASS |
| Disconnected sub-graph (Cartesian-product partition) | QO_PARTITION |
| Selinger DP table indexed by node-subset bitset | planner->join_info[QO_INFO_INDEX(M_offset, rel_nodes)] |
| Per-DP-slot non-dominated plan list | QO_PLANVEC (NPLANS = 4) |
| Plan tree node (operator) | QO_PLAN with plan_type ∈ {SCAN, SORT, JOIN, FOLLOW, WORST} |
| Cost vector | qo_plan::{fixed,variable}_{cpu,io}_cost |
| Access-path enumeration | qo_search_planner per-node loop over node_index |
| Index/NL/merge join dispatch | qo_examine_idx_join, qo_examine_nl_join, qo_examine_merge_join |
| Selinger left-deep DP | planner_permutate → planner_visit_node |
| Partial-join window | planner->join_unit ∈ {2, 3, 8} |
| Plan-tree → physical plan lowering | qo_to_xasl (plan_generation.c) |
CUBRID’s Approach
Section titled “CUBRID’s Approach”CUBRID’s optimizer is a three-stage walk: (1) PT_NODE → Query
Graph builds QO_ENV from the semantically-checked parse
tree; (2) cost model + join enumeration runs Selinger DP on
the QG with the four-slot cost model; (3) Query Graph →
Compiled Graph lowers the surviving QO_PLAN tree into the
XASL access-spec tree that the server executes. The optimizer
runs client-side in CAS (#if !defined(SERVER_MODE)); the
plan that ships to the server is XASL, not the QO_PLAN tree.
Pipeline
Section titled “Pipeline”flowchart LR
SQL["SQL string"] --> P["Parser<br/>(csql_grammar.y)"]
P --> AST["PT_NODE tree<br/>(parse_tree.h)"]
AST --> SC["Semantic check<br/>(semantic_check.c)<br/>name resolution, type check"]
SC --> RW["Rewriter<br/>(mq_rewrite, parse_tree_cl.c)<br/>CNF, view merge,<br/>oracle-style outer join,<br/>predicate push"]
RW --> OPT["qo_optimize_query<br/>(query_graph.c)"]
OPT --> EH["qo_optimize_helper"]
EH --> QG["Query Graph<br/>QO_ENV{nodes, segs, terms,<br/>eqclasses, partitions}"]
QG --> AN["qo_analyze_term<br/>qo_classify_outerjoin_terms<br/>qo_discover_edges<br/>qo_assign_eq_classes<br/>qo_discover_indexes<br/>qo_discover_partitions"]
AN --> PLN["qo_planner_search<br/>· qo_alloc_planner"]
PLN --> SP["qo_search_planner<br/>per-node access-path<br/>scan plans"]
SP --> SPP["qo_search_partition (per partition)"]
SPP --> SPJ["qo_search_partition_join<br/>(planner_permutate +<br/>partial/total join window)"]
SPJ --> CMB["qo_combine_partitions<br/>(Cartesian)"]
CMB --> FIN["qo_plan_finalize"]
FIN --> X["qo_to_xasl<br/>(plan_generation.c)"]
X --> XASL["XASL_NODE tree"]
XASL --> WIRE["xasl_to_stream → server"]
The rewriter stage (between semantic check and the
optimizer entry point) is intentionally elided here: mq_rewrite
and the rewrite walks rooted in parse_tree_cl.c perform CNF
normalisation, single-row constant folding, view merging,
Oracle-style (+) outer-join conversion to ANSI form, and
predicate-push transformations before qo_optimize_query is
called. The optimizer therefore consumes a normalized PT_NODE
in which the WHERE clause is a flat top-level AND-list, every
SPEC is ANSI, and view subqueries have been merged when
algebraically safe. The companion analysis under
raw/code-analysis/cubrid/query-processing/[CUBRID QP] Parsing to Query Graph.pptx documents this stage in depth; the present
document picks up at qo_optimize_query.
Stage 1 — PT_NODE → Query Graph
Section titled “Stage 1 — PT_NODE → Query Graph”The entry point is qo_optimize_query (parser, tree):
// qo_optimize_query — src/optimizer/query_graph.cQO_PLAN *qo_optimize_query (PARSER_CONTEXT * parser, PT_NODE * tree){ QO_ENV *env; int level;
qo_get_optimization_param (&level, QO_PARAM_LEVEL); if (!OPTIMIZATION_ENABLED (level)) /* PRM_OPTIMIZATION_LEVEL == 0 */ return NULL; if (tree->node_type != PT_SELECT) /* DML/DDL bypass the optimizer */ return NULL;
env = qo_env_init (parser, tree); /* allocate QO_ENV, walk tree, count */ if (env == NULL) return NULL;
switch (setjmp (env->catch_)) { case 0: return qo_optimize_helper (env); /* normal path */ case 1: case 2: default: /* OOM, failed assertion, or unknown — error already set */ break; } qo_env_free (env); return NULL;}Two design points are visible in this prologue. (a) The
optimizer is opt-in: PRM_OPTIMIZATION_LEVEL can be set to 0
and the caller (XASL generator) falls back to a default plan.
(b) The whole optimization runs inside a setjmp / longjmp
catch frame — QO_ABORT(env) and QO_ASSERT(env, cond) (in
optimizer.h) longjmp back to this switch, allowing any
deeply-nested helper to bail out without unwinding by hand.
This is unusual in modern C/C++ but predates exceptions and is
internally consistent because every QO_ENV-allocated structure
is freed by qo_env_free.
qo_env_init (declared in query_graph.c) walks the tree once
to count nnodes/nsegs/nterms, allocates the four parallel
arrays, then qo_optimize_helper is the per-stage driver:
// qo_optimize_helper — src/optimizer/query_graph.c (condensed)static QO_PLAN *qo_optimize_helper (QO_ENV * env){ PARSER_CONTEXT *parser = QO_ENV_PARSER (env); PT_NODE *tree = QO_ENV_PT_TREE (env); PT_NODE *spec, *conj, *next; QO_TERM *term; QO_NODE *node, *p_node; BITSET nodeset; QO_PLAN *plan;
/* (1) Add QO_NODE per FROM spec, QO_SEGMENT per referenced attribute. */ parser_walk_tree (parser, tree, build_query_graph, &env, build_query_graph_post, &env); parser_walk_tree (parser, tree, build_query_graph_function_index, &env, NULL, NULL); add_hint (env, tree); add_using_index (env, tree->info.query.q.select.using_index);
/* (2) Dependent-derived-table dependency edges (correlated subqueries). */ for (n = 0; n < env->nnodes; n++) { node = QO_ENV_NODE (env, n); spec = QO_NODE_ENTITY_SPEC (node); if (is_dependent_table (spec)) { /* Find segments referenced by the derived-table expression and * back-resolve them to the nodes they live on. Establish * artificial dep_set edges so the planner respects ordering. */ qo_expr_segs (env, spec->info.spec.derived_table, &dependencies); if (!bitset_is_empty (&dependencies)) { qo_seg_nodes (env, &dependencies, &antecedents); qo_add_dep_term (node, &antecedents, &dependencies, env); } } }
/* (3) Add ON-clause terms (one location per ANSI join) and WHERE terms. */ for (spec = tree->info.query.q.select.from; spec; spec = spec->next) if (spec->node_type == PT_SPEC && spec->info.spec.on_cond) for (conj = spec->info.spec.on_cond; conj; conj = conj->next) { term = qo_add_term (conj, PREDICATE_TERM, env); if (QO_TERM_CLASS (term) == QO_TC_JOIN) bitset_add (&nodeset, QO_NODE_IDX (QO_TERM_TAIL (term))); } for (conj = tree->info.query.q.select.where; conj; conj = conj->next) qo_add_term (conj, PREDICATE_TERM, env);
/* (4) ANSI join without an explicit edge → DUMMY_JOIN term to keep * order stable; right-outer-join → outer_dep_set propagation. */ for (n = 1; n < env->nnodes; n++) { node = QO_ENV_NODE (env, n); if (QO_NODE_IS_ANSI_JOIN (node) && !BITSET_MEMBER (nodeset, n)) qo_add_dummy_join_term (env, QO_ENV_NODE (env, n - 1), node); if (QO_NODE_PT_JOIN_TYPE (node) == PT_JOIN_RIGHT_OUTER) for (k = n - 1; k >= 0 && QO_NODE_IS_ANSI_JOIN (p_node); k--) QO_ADD_OUTER_DEP_SET (node, p_node); }
qo_classify_outerjoin_terms (env); /* SARG → DURING_JOIN / AFTER_JOIN */
/* (5) Final-segment discovery (segments that survive past the join). */ parser_walk_tree (parser, tree->info.query.q.select.list, qo_add_final_segment, &env, pt_continue_walk, NULL); parser_walk_tree (parser, tree->info.query.q.select.group_by, qo_add_final_segment, &env, pt_continue_walk, NULL); parser_walk_tree (parser, tree->info.query.q.select.having, qo_add_final_segment, &env, pt_continue_walk, NULL);
/* (6) Edge / eq-class / index / partition / sort-limit discovery. */ qo_discover_edges (env); qo_assign_eq_classes (env); get_local_subqueries (env, tree); get_rank (env); qo_discover_indexes (env); qo_discover_partitions (env); qo_discover_sort_limit_nodes (env);
/* (7) Plan search. */ plan = qo_planner_search (env);
if (plan == NULL) qo_env_free (env); return plan;}Every step has a definite shape:
QO_NODE — one per FROM spec
Section titled “QO_NODE — one per FROM spec”A QO_NODE is a thick shell around the PT_SPEC for a relation.
Its statistical fields (ncard, tcard) are read from
QO_CLASS_INFO, which the prefetch stage cached in the CAS
workspace. The dependency bitsets (dep_set, outer_dep_set)
are how the planner enforces ordering constraints that survive
permutation: a correlated derived table or a right-outer-join
right-hand side cannot be visited before its prerequisites.
// qo_node — src/optimizer/query_graph.hstruct qo_node{ QO_ENV *env; PT_NODE *entity_spec; /* original PT_SPEC */
BITSET segs; /* QO_SEGMENT idxs emanating from this node */ BITSET eqclasses; /* QO_EQCLASS idxs joined to this node */ QO_PARTITION *partition; /* connected sub-graph this node lives in */ QO_SEGMENT *oid_seg; /* virtual OID column for joins */
BITSET dep_set; /* nodes this one depends on (corr. subq) */ BITSET sargs; /* QO_TERMs of class TC_SARG on this node */ double selectivity; /* product of QO_TERM_SELECTIVITY in sargs */ BITSET subqueries; /* QO_SUBQUERY idxs to evaluate per row */
QO_CLASS_INFO *info; unsigned long int ncard; /* est. row count (System R "ncard") */ unsigned long int tcard; /* est. page count ("tcard") */
const char *class_name; int idx; /* env-wide id (used in BITSETs) */ int rel_idx; /* partition-relative id (used in join_info BITSETs) */
QO_NODE_INDEX *indexes; /* indexes on the underlying class hierarchy */ QO_USING_INDEX *using_index; /* USING INDEX hint */ BITSET outer_dep_set; /* outer-join-induced predecessors */ PT_HINT_ENUM hint; bool sargable; bool sort_limit_candidate;};The idx / rel_idx distinction is non-obvious: idx is the
node’s slot in env->nodes[] and is used in BITSETs that span
the whole environment (e.g., QO_TERM_NODES). rel_idx is the
node’s position within its partition and is used in the
join_info bitsets that key the DP table — so the DP keys
collapse to a single partition-relative bitspace and the
join_info array stays a flat 2^|partition| vector even when
the query has multiple partitions.
The two cardinality fields ship the System R bequest: the
comments in query_graph.h:329-340 literally say “These silly
names are historical artifacts that correspond to the names in
the 197? IBM report on query optimization in System R.”
QO_SEGMENT — one per attribute reference
Section titled “QO_SEGMENT — one per attribute reference”A QO_SEGMENT represents a use of a column inside a query.
Two segments for the same column (e.g., the a in
SELECT a FROM t WHERE a = 1) are typically merged (lookup_seg
does the lookup and reuses the existing segment) but the
identity is column-occurrence, not column-definition. The
head field is the node from which the segment emanates; for
ORDBMS-style path expressions (u.direct_groups) the segment
also has a tail (the linked-class node) and is the seam at
which a PATH term lives.
// qo_segment — src/optimizer/query_graph.hstruct qo_segment{ QO_ENV *env; PT_NODE *pt_node; QO_NODE *head, *tail; /* tail != NULL iff path/join segment */ QO_SEGMENT *eq_root; /* union-find root for eq-class grouping */ QO_EQCLASS *eqclass; const char *name; bool set_valued, class_attr, shared_attr; bool index_term_eq_expr; QO_ATTR_INFO *info; /* per-class statistics */ BITSET index_terms; /* terms that can use an index on this seg */ int idx; bool is_function_index; bool is_not_null;};QO_TERM — predicate (sargable, join, fake)
Section titled “QO_TERM — predicate (sargable, join, fake)”Terms come from three places: WHERE conjuncts, ANSI ON
conjuncts (one term per conjunct, marked with non-zero
location), and synthetic terms the optimizer manufactures
itself (path-conjuncts on dot-expressions, dependent-link
terms for correlated derived tables, and dummy join terms for
ANSI joins without a join predicate).
// qo_term — src/optimizer/query_graph.hstruct qo_term{ QO_ENV *env; BITSET nodes; /* nodes this term references */ BITSET segments; double selectivity; /* product of leaf selectivities */ QO_TERMCLASS term_class; int rank; PT_NODE *pt_expr; /* original conjunct */ BITSET subqueries; JOIN_TYPE join_type; int can_use_index; QO_SEGMENT *index_seg[2]; /* sides the optimizer will treat as keys */ QO_SEGMENT *seg, *oid_seg; /* PATH-term-specific */ QO_NODE *head, *tail; QO_EQCLASS *eqclass; QO_SEGMENT *nominal_seg; int flag; /* equal_op? rangelist? mergeable_edge? */ int idx; short location; /* 0 for WHERE, > 0 for ON-conjuncts */ int *multi_col_segs; int multi_col_cnt; int pred_order; /* for view-merge / pred-push ordering */};The class taxonomy in QO_TERMCLASS is bit-encoded so that
QO_IS_PATH_TERM, QO_IS_EDGE_TERM, and QO_IS_FAKE_TERM
are single-bit tests. The class taxonomy is the ten-row table:
// QO_TERMCLASS — src/optimizer/query_graph.htypedef enum{ QO_TC_PATH = 0x30, /* 1 1 0 000 */ QO_TC_JOIN = 0x11, /* 0 1 0 001 */ QO_TC_SARG = 0x02, /* 0 0 0 010 */ QO_TC_OTHER = 0x03, /* 0 0 0 011 */ QO_TC_DEP_LINK = 0x1c, /* 0 1 1 100 */ QO_TC_DEP_JOIN = 0x1d, /* 0 1 1 101 */ QO_TC_DURING_JOIN = 0x04, /* 0 0 0 100 */ QO_TC_AFTER_JOIN = 0x05, /* 0 0 0 101 */ QO_TC_TOTALLY_AFTER_JOIN = 0x06, /* 0 0 0 110 */ QO_TC_DUMMY_JOIN = 0x1f /* 0 1 1 111 */} QO_TERMCLASS;
#define QO_IS_PATH_TERM(t) (QO_TERM_CLASS(t) & 0x20)#define QO_IS_EDGE_TERM(t) (QO_TERM_CLASS(t) & 0x10)#define QO_IS_FAKE_TERM(t) (QO_TERM_CLASS(t) & 0x08)The four columns of the comment are path | edge | fake | num;
the bit packing ensures that any term that participates in
join-graph traversal has the 0x10 bit and any synthetic term
(“FAKE”, manufactured by the optimizer) has 0x08. The
distinction matters for cost: fake terms cannot serve as merge
edges (fake_terms set in QO_ENV excludes them from
qo_examine_merge_join).
The most relevant subset — by frequency in real workloads — is
QO_TC_SARG (single-relation predicate, e.g. a.col2 = 1),
QO_TC_JOIN (inter-relation predicate, e.g. a.col1 = b.col1),
QO_TC_DURING_JOIN (predicate on the right-hand side of a
left-outer join, evaluated during the join with NULL-extension
semantics), and QO_TC_AFTER_JOIN (predicate on the right-hand
side of a left-outer join in the WHERE clause, evaluated
after the join). The DURING_JOIN vs. AFTER_JOIN distinction
is the canonical Codd-mistake-trap: putting a predicate in WHERE
versus ON for a left-outer join changes the result.
QO_EQCLASS — equivalence classes / interesting orders
Section titled “QO_EQCLASS — equivalence classes / interesting orders”Whenever an inner-equi-join term a.x = b.y is added, the two
segments are merged into one equivalence class via union-find
(eq_root pointers); transitively, a.x = b.y and b.y = c.z
collapse (a.x, b.y, c.z) into one class. After all terms are
added, qo_assign_eq_classes materialises one QO_EQCLASS
object per class. Each class is one interesting order for the
DP search: any sub-plan that produces output sorted on the class
will keep its own slot in the per-info planvec.
QO_PARTITION — connected sub-graph
Section titled “QO_PARTITION — connected sub-graph”qo_discover_partitions runs a connected-component sweep over
the graph (nodes connected by any non-fake edge term); each
connected component becomes a QO_PARTITION and its
M_offset records the start of its slice in the global
join_info array. Disconnected components imply an explicit
Cartesian product, which is rare and expensive; CUBRID
optimises each partition independently, then merges them via
qo_combine_partitions in dependency order.
Worked example — three-way inner join
Section titled “Worked example — three-way inner join”SELECT 1 FROM a, b, c WHERE a.col1 = b.col1 AND b.col1 = c.col1 AND a.col2 = 1 AND b.col2 = 1 AND c.col2 = 1flowchart LR
subgraph QGEX["QO_ENV (one partition)"]
direction LR
A["QO_NODE a [idx=0]<br/>ncard, tcard<br/>sargs={t3: a.col2=1}"]
B["QO_NODE b [idx=1]<br/>sargs={t4: b.col2=1}"]
C["QO_NODE c [idx=2]<br/>sargs={t5: c.col2=1}"]
T1["QO_TERM t1: a.col1=b.col1<br/>class=JOIN, eqclass=E0"]
T2["QO_TERM t2: b.col1=c.col1<br/>class=JOIN, eqclass=E0"]
A -- t1 --- B
B -- t2 --- C
A --- T1
B --- T1
B --- T2
C --- T2
end
E0["QO_EQCLASS E0<br/>{a.col1, b.col1, c.col1}"]
T1 -. eqclass .-> E0
T2 -. eqclass .-> E0
Three nodes; three sargable terms (one per node) which lower
the per-node cardinality estimate; two join terms which both
end up in the same equivalence class because the predicates
chain. There is one partition because the graph is connected.
The DP table will have 2^3 = 8 slots (a, b, c, ab, ac,
bc, abc, plus the empty slot which is unused).
Stage 2 — cost model and join enumeration
Section titled “Stage 2 — cost model and join enumeration”After the graph is finished, qo_planner_search allocates the
QO_PLANNER skeleton and runs qo_search_planner:
// qo_planner_search — src/optimizer/query_planner.cQO_PLAN *qo_planner_search (QO_ENV * env){ QO_PLANNER *planner = qo_alloc_planner (env); QO_PLAN *plan;
if (planner == NULL) return NULL;
qo_info_nodes_init (env); qo_plans_init (env); plan = qo_search_planner (planner); qo_clean_planner (planner); return plan;}qo_alloc_planner does the bookkeeping that fixes the DP
table’s shape: it sums M_offset + 2^|partition| across all
partitions, allocates planner->join_info of that size, and
caches it in planner->M.
qo_search_planner then runs three passes:
2a — per-node access-path enumeration
Section titled “2a — per-node access-path enumeration”For each base relation, generate a sequential scan and one
index scan per usable index, and record the cheapest in
planner->node_info[i]->planvec:
// qo_search_planner — query_planner.c (condensed, scan-plan loop)for (i = 0; i < planner->N; i++) { node = &planner->node[i]; info = planner->node_info[QO_NODE_IDX (node)];
if (PT_SPEC_SPECIAL_INDEX_SCAN (QO_NODE_ENTITY_SPEC (node))) { /* B-tree key/node info introspection — only one plan possible. */ ni_entry = QO_NI_ENTRY (QO_NODE_INDEXES (node), 0); qo_check_plan_on_info ( info, qo_index_scan_new (info, node, ni_entry, QO_SCANMETHOD_INDEX_SCAN_INSPECT, NULL, NULL)); continue; }
if (QO_NODE_INDEXES (node) != NULL) for (j = 0; j < QO_NI_N (QO_NODE_INDEXES (node)); j++) { ni_entry = QO_NI_ENTRY (QO_NODE_INDEXES (node), j); index_entry = ni_entry->head; if (index_entry->force < 0) continue; /* disabled index */
/* Walk index columns left-to-right collecting equal+other terms; * stop at the first column with no equal terms. */ for (nsegs = start_column; nsegs < index_entry->nsegs; nsegs++) { bitset_union (&seg_terms, &index_entry->seg_equal_terms[nsegs]); bitset_union (&seg_terms, &index_entry->seg_other_terms[nsegs]); if (bitset_is_empty (&index_entry->seg_equal_terms[nsegs])) { if (!bitset_is_empty (&index_entry->seg_other_terms[nsegs])) nsegs++; break; } } bitset_intersect (&seg_terms, &QO_NODE_SARGS (node));
if (!bitset_is_empty (&seg_terms)) qo_generate_index_scan (info, node, ni_entry, nsegs); else if (index_entry->constraints->filter_predicate && index_entry->force > 0) qo_check_plan_on_info ( info, qo_index_scan_new (info, node, ni_entry, QO_SCANMETHOD_INDEX_SCAN, &seg_terms, NULL)); else if (index_entry->ils_prefix_len > 0) qo_generate_loose_index_scan (info, node, ni_entry); else { /* try INDEX_GROUPBY_SCAN / INDEX_ORDERBY_SCAN if helpful */ } }
qo_generate_seq_scan (info, node); /* always considered */ }The matching of segments against index columns is left-to-right
and stops at the first column with no equality term — this is
the standard “leading-column” rule that means a B-tree on
(a,b,c) with a query WHERE a=1 AND c=3 only uses the a and
c columns (with b un-bound), and a query WHERE b=2 AND c=3
uses no index column at all. The is_iss_candidate flag is
the index-skip-scan opt-in, which lets the planner relax
this rule for low-cardinality leading columns.
qo_check_plan_on_info is the dominance test mentioned in the
theoretical background — it is what keeps planvec from
exploding. A plan is added if it is not dominated by, and
does not dominate, any existing plan in the same (info, order)
slot; if it dominates, the existing plan is replaced.
2b — cost computation
Section titled “2b — cost computation”The cost model evaluates four numbers per plan, all in synthetic units. For an index scan:
// qo_iscan_cost — src/optimizer/query_planner.c (condensed)static voidqo_iscan_cost (QO_PLAN * planp){ QO_NODE *nodep = planp->plan_un.scan.node; QO_NODE_INDEX_ENTRY *ni_entryp = planp->plan_un.scan.index; QO_INDEX_ENTRY *index_entryp = ni_entryp->head; QO_ATTR_CUM_STATS *cum_statsp = &ni_entryp->cum_stats;
/* (a) Selectivity of key-range terms (independence assumption). */ sel = 1.0; for (t = bitset_iterate (&planp->plan_un.scan.terms, &iter); t != -1; ...) sel *= QO_TERM_SELECTIVITY (QO_ENV_TERM (..., t)); sel = MIN (sel, 1.0);
/* (b) Lower-bound from B-tree partial-key statistics ("pkeys[i] = # * distinct values for the prefix of the first i columns"). */ if (i <= pkeys_num && cum_statsp->pkeys[index] >= 1) sel_limit = 1.0 / (double) cum_statsp->pkeys[index]; else if (cum_statsp->keys >= 1) sel_limit = 1.0 / (double) cum_statsp->keys; sel = MAX (sel, sel_limit);
/* (c) Selectivity of key-filter terms (terms that can be evaluated * against the index entry without going to the heap). */ filter_sel = 1.0; for (t = bitset_iterate (&planp->plan_un.scan.kf_terms, &iter); ...) filter_sel *= QO_TERM_SELECTIVITY (QO_ENV_TERM (..., t));
leaf_access = sel * QO_NODE_NCARD (nodep); height = MAX (cum_statsp->height - 1, 0); leaves = ceil (sel * cum_statsp->leafs); index_IO = (ni_entryp->n * height) + leaves;
/* Index-skip-scan adds K extra B-tree probes. */ if (qo_is_index_iss_scan (planp)) index_IO += cum_statsp->pkeys[0];
if (qo_is_index_covering_scan (planp)) { object_IO = 1.0; heap_access = 0; } else { object_IO = MAX (1.0, QO_NODE_TCARD (nodep) * sel * filter_sel); heap_access = QO_NODE_NCARD (nodep) * sel * filter_sel * (double) ISCAN_OID_ACCESS_OVERHEAD; }
planp->fixed_cpu_cost = 0.0; planp->fixed_io_cost = index_IO; planp->variable_cpu_cost = (leaf_access + heap_access) * (double) QO_CPU_WEIGHT; planp->variable_io_cost = object_IO; planp->info->scan_rows = MAX (1, QO_NODE_NCARD (nodep) * sel * filter_sel);}The named constants are the dials (all in
query_planner.c:78-90):
| Macro | Value | Meaning |
|---|---|---|
QO_CPU_WEIGHT | 0.0025 | CPU cost per row (calibration constant) |
ISCAN_OID_ACCESS_OVERHEAD | 20 | extra CPU per heap fetch on an index scan |
ISCAN_IO_HIT_RATIO | 0.5 | fraction of inner-NL fetches assumed already cached |
GUESSED_BIND_LIMIT_CARD | 2000 | cardinality guess for LIMIT ? |
For a nested-loop join:
// qo_nljoin_cost — src/optimizer/query_planner.c (condensed)static voidqo_nljoin_cost (QO_PLAN * planp){ QO_PLAN *outer = planp->plan_un.join.outer; QO_PLAN *inner = planp->plan_un.join.inner; double guessed_result_cardinality;
planp->fixed_cpu_cost = outer->fixed_cpu_cost + inner->fixed_cpu_cost; planp->fixed_io_cost = outer->fixed_io_cost + inner->fixed_io_cost;
if (QO_PLAN_HAS_LIMIT (planp) && /* limit pushdown applicable */ (planp->info->planner->can_apply_limit_card || qo_plan_is_orderby_skip_candidate (planp))) { /* Treat outer cardinality as min(limit / hit_prob, outer_card). */ limit_val = QO_PLAN_HAS_CONSTANT_LIMIT (planp) ? db_get_bigint (&QO_ENV_LIMIT_VALUE (planp->info->env)) : GUESSED_BIND_LIMIT_CARD; planp->limit_nljoin_guessed_card = MAX (limit_val / outer->info->hit_prob, 1.0); guessed_result_cardinality = MIN (planp->limit_nljoin_guessed_card, outer->info->cardinality); } else guessed_result_cardinality = outer->info->cardinality;
inner_cpu_cost = guessed_result_cardinality * inner->variable_cpu_cost; if (qo_is_iscan (inner)) inner_io_cost = guessed_result_cardinality * inner->variable_io_cost * (1 - ISCAN_IO_HIT_RATIO); else inner_io_cost = (guessed_result_cardinality + SSCAN_DEFAULT_CARD) * inner->variable_io_cost;
planp->variable_cpu_cost = inner_cpu_cost + outer->variable_cpu_cost; planp->variable_io_cost = inner_io_cost + outer->variable_io_cost;}The two specific tactics here are textbook: (a) the inner side
is multiplied by outer cardinality rather than by outer’s
total cost — this is the standard “rerun-the-inner-once-per-
outer-row” formula; and (b) when the inner is an index scan the
IO is weighted by (1 - ISCAN_IO_HIT_RATIO), modelling the fact
that successive index probes from the same outer batch hit the
buffer pool at a stable rate. The LIMIT interaction in the
middle is CUBRID-specific — it lets a LIMIT k query short-
circuit the cardinality estimate and prefer plans that produce
results in order with low fixed cost.
2c — join enumeration on the QG
Section titled “2c — join enumeration on the QG”qo_search_partition_join is the DP loop:
// qo_search_partition_join — src/optimizer/query_planner.c (condensed)static QO_INFO *qo_search_partition_join (QO_PLANNER * planner, QO_PARTITION * partition, BITSET * remaining_subqueries){ bitset_assign (&remaining_nodes, &QO_PARTITION_NODES (partition)); nodes_cnt = bitset_cardinality (&remaining_nodes);
/* Useful terms: every term whose nodes-set is a subset of this * partition's nodes. */ for (i = 0; i < planner->T; i++) { term = &planner->term[i]; if (QO_TERM_CLASS (term) == QO_TC_TOTALLY_AFTER_JOIN) continue; if (bitset_subset (&remaining_nodes, &QO_TERM_NODES (term))) { bitset_add (&remaining_terms, i); if (QO_TERM_CLASS (term) == QO_TC_PATH) num_path_inner++; } }
/* Set the join window. Bigger queries → smaller partial DP. */ if (num_path_inner || (hint & PT_HINT_ORDERED)) planner->join_unit = nodes_cnt; /* full DP, no fan-out */ else planner->join_unit = (nodes_cnt <= 25) ? MIN (8, nodes_cnt) : (nodes_cnt <= 37) ? 3 : 2;
/* STEP 1 — partial-join sweep with first-node selection. */ while (1) { planner_permutate (planner, partition, hint, /* prev_head: */ node, &visited_nodes, &visited_rel_nodes, &visited_terms, &nested_path_nodes, &remaining_nodes, &remaining_terms, remaining_subqueries, num_path_inner, (planner->join_unit < nodes_cnt) ? &node_idx : NULL); if (planner->best_info) break; /* OK, found best plan */
if (planner->join_unit >= nodes_cnt) break; /* full search exhausted */
if (node_idx == -1) { /* STEP 2 — partial search failed; fall back to total search. */ bitset_union (&remaining_nodes, &visited_nodes); bitset_union (&remaining_terms, &visited_terms); BITSET_CLEAR (visited_nodes); BITSET_CLEAR (visited_rel_nodes); BITSET_CLEAR (visited_terms); planner->join_unit = nodes_cnt; continue; }
/* Promote the partial-best's outermost node. */ node = QO_ENV_NODE (env, node_idx); bitset_add (&visited_nodes, node_idx); bitset_add (&visited_rel_nodes, QO_NODE_REL_IDX (node)); bitset_remove (&remaining_nodes, node_idx); visited_info = (bitset_cardinality (&visited_nodes) == 1) ? planner->node_info[node_idx] : planner->join_info[ QO_INFO_INDEX (QO_PARTITION_M_OFFSET (partition), visited_rel_nodes)]; if (visited_info == NULL) break; bitset_assign (&visited_terms, &visited_info->terms); bitset_difference (&remaining_terms, &visited_info->terms); planner->join_unit++; } return planner->best_info;}The two-pass shape is the load-bearing structure. Pass 1 picks
one outermost node for the prefix, builds the best
join_unit-sized prefix that starts there, and then anchors that
node and repeats — promoting one node per outer-loop iteration
until either join_unit reaches nodes_cnt or all nodes are
consumed. Pass 2 (only triggered if Pass 1 fails to find any
plan, which happens when dependency or outer-join constraints
make every prefix infeasible at the partial-join window) widens
join_unit to nodes_cnt and runs a full DP.
planner_permutate itself is the inner loop that, for each
candidate head node, calls planner_visit_node which is the
true DP step:
// planner_permutate — src/optimizer/query_planner.c (condensed)for (i = bitset_iterate (remaining_nodes, &bi); i != -1; ...) { head_node = QO_ENV_NODE (planner->env, i);
/* Dependency / outer-dependency check. */ if (!bitset_subset (visited_nodes, &QO_NODE_DEP_SET (head_node))) continue; if (!bitset_subset (visited_nodes, &QO_NODE_OUTER_DEP_SET (head_node))) continue;
if (bitset_is_empty (visited_nodes)) { /* This is the first node of the join. Iterate over all * candidate second-nodes and call planner_visit_node for each. */ bitset_add (visited_nodes, QO_NODE_IDX (head_node)); for (j = bitset_iterate (remaining_nodes, &bj); j != -1; ...) { tail_node = QO_ENV_NODE (planner->env, j); if (!bitset_subset (visited_nodes, &QO_NODE_DEP_SET (tail_node))) continue; if (!bitset_subset (visited_nodes, &QO_NODE_OUTER_DEP_SET (tail_node))) continue; planner_visit_node (planner, partition, hint, head_node, tail_node, visited_nodes, visited_rel_nodes, visited_terms, nested_path_nodes, remaining_nodes, remaining_terms, remaining_subqueries, num_path_inner); if (hint & PT_HINT_ORDERED) break; } /* Restore on backtrack. */ BITSET_CLEAR (*visited_nodes); } else { /* Not the first node — extend prefix with head_node as new tail. */ planner_visit_node (planner, partition, hint, prev_head_node, head_node, ...); } if (hint & PT_HINT_ORDERED) break; /* /*+ ORDERED */ disables search */ }Inside planner_visit_node (not shown in full — it spans
several hundred lines) the per-pair join-method enumeration is
the dispatch:
flowchart TD
V["planner_visit_node<br/>(outer = visited prefix's plan,<br/>inner = next single node)"]
V --> J{"join term exists between<br/>visited prefix and inner?"}
J -- "yes (TC_JOIN edge)" --> A["qo_examine_idx_join<br/>→ qo_examine_correlated_index<br/>(inner becomes index scan with<br/>outer values pushed as range)"]
J -- "yes" --> B["qo_examine_nl_join<br/>(inner = best plan on info,<br/>outer-card-times-inner-cost cost)"]
J -- "yes (mergeable edge)" --> C["qo_examine_merge_join<br/>(both sides ordered on eqclass)"]
J -- "no (Cartesian)" --> D["skip — not joinable<br/>at this DP level"]
A --> CHK["qo_check_plan_on_info<br/>per (joined-set, eqclass)"]
B --> CHK
C --> CHK
CHK --> NEXT["save into<br/>planner->join_info[joined_set]"]
Each qo_examine_* routine builds a candidate QO_PLAN of the
appropriate type, asks qo_plan_compute_cost to populate its
four cost fields, and submits it to the dominance test.
qo_examine_idx_join is the cheapest case — it fires only
when (a) the inner is a single-class scan, (b) the inner has an
index whose leading column matches a join term against the
outer, and (c) join hints permit. qo_examine_correlated_index
then builds an index scan plan in which the join term is used
as the key range:
// qo_examine_idx_join — src/optimizer/query_planner.c (condensed)static intqo_examine_idx_join (QO_INFO * info, JOIN_TYPE join_type, QO_INFO * outer, QO_INFO * inner, BITSET * afj_terms, BITSET * sarged_terms, BITSET * pinned_subqueries){ /* For RIGHT outer joins we converse outer/inner. */ inner_node = (join_type == JOIN_RIGHT) ? QO_ENV_NODE (outer->env, bitset_first_member (&outer->nodes)) : QO_ENV_NODE (inner->env, bitset_first_member (&inner->nodes));
/* Hint gating: USE_MERGE / USE_HASH / USE_NL with non-IDX disable us. */ if (QO_NODE_HINT (inner_node) & (PT_HINT_USE_IDX | PT_HINT_USE_NL)) {} else if (QO_NODE_HINT (inner_node) & PT_HINT_USE_MERGE) goto exit; else if ((QO_NODE_HINT (inner_node) & PT_HINT_USE_HASH) && !(QO_NODE_HINT (inner_node) & PT_HINT_NO_USE_HASH)) goto exit;
if (join_type == JOIN_RIGHT) n = qo_examine_correlated_index (info, JOIN_LEFT, inner, outer, afj_terms, sarged_terms, pinned_subqueries); else n = qo_examine_correlated_index (info, join_type, outer, inner, afj_terms, sarged_terms, pinned_subqueries);exit: return n;}qo_examine_nl_join is the always-applicable fallback. It picks
the best inner plan for the inner sub-info and pairs it with the
outer to form a QO_JOINMETHOD_NL_JOIN. qo_examine_merge_join
is gated on the outer and inner both being orderable on the
join-term’s equivalence class — which is why QO_EQCLASS is
attached to every mergeable term during graph construction.
Stage 3 — Query Graph → Compiled Graph (XASL)
Section titled “Stage 3 — Query Graph → Compiled Graph (XASL)”When qo_search_planner returns, planner->best_info holds
the winning DP slot for each partition; qo_combine_partitions
concatenates them via Cartesian product (rare; usually one
partition); qo_plan_finalize resolves QO_PLANTYPE_SORT
wrappers and the descending-iscan post-processing.
The result is a QO_PLAN tree whose physical-operator nodes
are not yet a CUBRID-server-executable artefact. The caller
(XASL generator, xasl_generation.c) calls qo_to_xasl:
// qo_to_xasl — src/optimizer/plan_generation.c (condensed)xasl_node *qo_to_xasl (QO_PLAN * plan, xasl_node * xasl){ QO_ENV *env; XASL_NODE *lastxasl;
if (plan && xasl && (env = plan->info->env)) { xasl = gen_outer (env, plan, &EMPTY_SET, NULL, NULL, xasl);
/* Walk to the deepest scan/fetch and push the SELECT-list * subqueries' dptrs there. */ for (lastxasl = xasl; lastxasl; ) { if (lastxasl->scan_ptr) lastxasl = lastxasl->scan_ptr; else if (lastxasl->fptr_list) lastxasl = lastxasl->fptr_list; else break; } pt_set_dptr (env->parser, env->pt_tree->info.query.q.select.list, lastxasl, MATCH_ALL);
xasl = preserve_info (env, plan, xasl); } else xasl = NULL; return xasl;}gen_outer recursively emits the access-spec tree for the
outer side of every join; for joins it then hangs the inner
side’s access-spec off the corresponding XASL node. The
mapping between QO_PLANTYPE_* and XASL is direct: a
QO_PLANTYPE_SCAN becomes one ACCESS_SPEC_TYPE slot inside
the parent’s XASL_NODE::spec_list; a QO_PLANTYPE_JOIN
becomes two access-specs on the same XASL plus a hierarchy of
fptr_list / scan_ptr linkages controlling the join’s loop
structure; a QO_PLANTYPE_SORT becomes a BUILDLIST_PROC
that materialises a temp file. The selectivity bookkeeping
preserve_info saves the plan’s cost summary into qo_summary
hung off the original PT_SELECT so that an enclosing query
(if this was a subquery) can incorporate the cost.
flowchart LR
subgraph QOPLAN["QO_PLAN tree (chosen)"]
direction TB
JN["PLANTYPE_JOIN<br/>method=IDX_JOIN<br/>type=INNER"]
SC1["PLANTYPE_SCAN<br/>method=SEQ_SCAN<br/>node=a, sargs={a.col2=1}"]
SC2["PLANTYPE_SCAN<br/>method=INDEX_SCAN<br/>node=b, idx=(b.col1)<br/>range_terms={a.col1=b.col1}"]
JN --> SC1
JN --> SC2
end
QOPLAN --> XGEN["qo_to_xasl<br/>gen_outer"]
XGEN --> XASL2["XASL_NODE tree"]
subgraph XASLR["XASL"]
direction TB
BL["BUILDLIST_PROC (top)<br/>spec_list, val_list,<br/>outptr_list"]
AS1["ACCESS_SPEC (a)<br/>SCAN_TYPE_HEAP<br/>where_pred=a.col2=1"]
AS2["ACCESS_SPEC (b)<br/>SCAN_TYPE_INDEX<br/>btid=idx_b_col1<br/>key_range=a.col1=b.col1"]
BL --> AS1
AS1 -. scan_ptr .-> AS2
end
After qo_to_xasl, the caller serialises the XASL tree to a
byte stream (xasl_to_stream.c) and ships it to cub_server,
where stream_to_xasl.c rehydrates it for query_executor.c.
The optimizer’s role ends at the byte stream.
End-to-end pipeline (full picture)
Section titled “End-to-end pipeline (full picture)”sequenceDiagram
participant U as User
participant P as Parser
participant SC as Semantic check
participant RW as Rewriter
participant QG as build_query_graph
participant AN as analyze/discover
participant PL as qo_planner_search
participant XL as qo_to_xasl
participant SV as cub_server
U->>P: SQL string
P->>P: csql_grammar.y → PT_NODE
P->>SC: PT_NODE
SC->>SC: pt_flat_spec_pre, bind_names,<br/>resolve_*, eval_type
SC->>RW: resolved PT_NODE
RW->>RW: CNF, view-merge,<br/>oracle-style outer-join,<br/>predicate-push
RW->>QG: optimized PT_NODE
QG->>QG: qo_add_node × N<br/>qo_add_term × T<br/>qo_add_dummy_join_term
QG->>AN: QO_ENV
AN->>AN: qo_classify_outerjoin_terms<br/>qo_discover_edges<br/>qo_assign_eq_classes<br/>qo_discover_indexes<br/>qo_discover_partitions
AN->>PL: QO_ENV (annotated)
PL->>PL: qo_alloc_planner<br/>per-node access-paths<br/>per-partition join DP<br/>combine partitions<br/>qo_plan_finalize
PL->>XL: QO_PLAN tree
XL->>XL: gen_outer recursive,<br/>access-spec emit
XL->>SV: XASL byte stream
SV->>U: result rows
Source Walkthrough
Section titled “Source Walkthrough”Anchor on symbol names, not line numbers. Position hints at the bottom of this section reflect the source as observed when the document was last
updated:.
Optimizer entry & environment (src/optimizer/)
Section titled “Optimizer entry & environment (src/optimizer/)”qo_optimize_query(query_graph.c) — public entry point called fromxasl_generation.c; runs the setjmp/longjmp catch frame.qo_optimize_helper(query_graph.c) — the per-stage driver: build graph, classify, discover edges/indexes/partitions, callqo_planner_search.qo_env_init/qo_env_free(query_graph.c) — allocate and tear downQO_ENV.qo_validate(query_graph.c) — sanity check after graph build.
Query graph types (query_graph.h)
Section titled “Query graph types (query_graph.h)”struct qo_env— top-level container with parallel arrays of nodes / segs / terms / eqclasses / partitions and thetmp_bitsetscratch slot.struct qo_node— per-relation graph node (withQO_NODE_*accessor macros). Statistics inncard/tcard, dependency bitsets indep_set/outer_dep_set, per-relation index list inindexes.struct qo_segment— per-attribute-occurrence graph segment (withQO_SEG_*macros).struct qo_term— predicate graph term. The ten-valueQO_TERMCLASSenum and the bit-encodedQO_IS_PATH_TERM/QO_IS_EDGE_TERM/QO_IS_FAKE_TERMpredicates live here.struct qo_eqclass— interesting-order equivalence class.struct qo_partition— connected sub-graph; carriesM_offsetfor join-info indexing.struct qo_subquery— correlated-subquery descriptor.struct qo_index_entry,qo_node_index_entry,qo_node_index— per-index meta plus per-segment term lookup tables (seg_equal_terms,seg_other_terms).struct qo_using_index—USING INDEXhint payload.
Query graph builders (query_graph.c)
Section titled “Query graph builders (query_graph.c)”build_query_graph/build_query_graph_post/build_query_graph_function_index—parser_walk_treecallbacks that iterate the PT_NODE.build_graph_for_entity— per-PT_SPEC builder (QO_BUILD_STATUSfinite-state stepper).qo_add_node— allocate aQO_NODEand link it to its PT_SPEC.qo_insert_segment/qo_join_segment/lookup_seg— segment alloc / join-segment alloc / lookup.qo_add_final_segment— walks the SELECT/GROUP BY/HAVING list to mark which segments must survive past joins.qo_add_term— promote one PT_NODE conjunct to aQO_TERM, classify it, and link it to its nodes/segs.qo_add_dep_term— synthesise aQO_TC_DEP_LINKfor a correlated derived table.qo_add_dummy_join_term— synthesise aQO_TC_DUMMY_JOINfor an ANSI join with no WHERE/ON predicate.qo_analyze_term— setterm_class,can_use_index,index_seg, equal-op / rangelist flags.qo_classify_outerjoin_terms— partition WHERE-clause predicates over outer-joined relations intoQO_TC_DURING_JOIN/QO_TC_AFTER_JOIN/QO_TC_TOTALLY_AFTER_JOIN.qo_discover_edges— re-sortenv->termsby predicate ordering so that edge-terms come before sarg-terms.qo_assign_eq_classes— union-find on segments, then materialise oneQO_EQCLASSper class.qo_discover_indexes— per-node, populateQO_NODE_INDEXwith the indexes on the underlying class hierarchy.qo_discover_partitions— connected-component sweep on the graph; oneQO_PARTITIONper component.qo_discover_sort_limit_nodes— flag nodes that must participate in any LIMIT plan.
Bitset primitive (query_bitset.{h,c})
Section titled “Bitset primitive (query_bitset.{h,c})”BITSETopaque type plusbitset_init,bitset_assign,bitset_union,bitset_intersect,bitset_difference,bitset_subset,BITSET_MEMBER,bitset_iterate,bitset_next_member,bitset_first_member,bitset_cardinality,bitset_delset. Used pervasively for {nodes, segs, terms, eqclasses, subqueries} sets and for the DP table key.
Planner types (query_planner.h)
Section titled “Planner types (query_planner.h)”enum QO_PLANTYPE—SCAN | SORT | JOIN | FOLLOW | WORST.enum QO_SCANMETHOD—SEQ_SCAN | INDEX_SCAN | INDEX_ORDERBY_SCAN | INDEX_GROUPBY_SCAN | INDEX_SCAN_INSPECT.enum QO_JOINMETHOD—NL_JOIN | IDX_JOIN | MERGE_JOIN | HASH_JOIN.struct qo_plan— physical-plan node with theplan_ununion (scan / sort / join / follow). Carries the four cost fields,sarged_terms,subqueries,order.struct qo_planvec— up toNPLANS=4non-dominated plans per (info, order) slot.struct qo_info— DP-table cell (one per node-subset); carriescardinality,scan_rows,total_rows,hit_prob, theplanvecarray, and abest_no_ordershortcut.struct qo_planner— top-level planner:node,term,node_info,join_info,cp_info,partition,best_info,worst_plan,worst_info.
Planner core (query_planner.c)
Section titled “Planner core (query_planner.c)”qo_planner_search— public entry; allocates planner, initialises plans, callsqo_search_planner, cleans up.qo_alloc_planner— copiesQO_ENV’s arrays into the planner, sets up partitionM_offsets, sumsM.qo_search_planner— the per-node access-path loop, the per-partition join driver, the partition combine, and the descending-index post-processing.qo_search_partition— single-partition driver (delegates toqo_search_partition_joinfor >1-node partitions).qo_search_partition_join— partial-then-total DP loop withjoin_unitwindow scaling.planner_permutate— for each candidate head node, callplanner_visit_nodewith each candidate inner.planner_visit_node(not excerpted) — the DP step: retrieves the visited prefix’sQO_INFO, dispatches intoqo_examine_*_join, writes the result back intojoin_info[QO_INFO_INDEX(...)].qo_examine_idx_join— index-correlated join (the cheapest case).qo_examine_nl_join— generic nested-loop fallback.qo_examine_merge_join— sort-merge on a shared equivalence class.qo_examine_correlated_index— builds an inner index-scan plan that uses the outer’s join values as the key range.qo_check_plan_on_info— dominance check; returns the number of plans surviving in the planvec after this candidate is considered.qo_seq_scan_new— allocate a sequential-scan plan and call its cost fn.qo_index_scan_new— allocate an index-scan plan (range, filter, kf terms set up) and call its cost fn.qo_join_new— allocate a join plan with method selected by caller; computes order propagation.qo_worst_new— allocate a sentinel infinitely-expensive plan (used when an examine routine cannot build anything).qo_plan_finalize— top-level finalisation (descending-index propagation, sort-list pruning).qo_combine_partitions— Cartesian-product combining of per-partition winners.sort_partitions— order partitions by inter-partition dependency before combining.
Cost model (query_planner.c)
Section titled “Cost model (query_planner.c)”qo_plan_compute_cost— vtbl dispatcher; callsplan->vtbl->cost_fn(plan)and adds correlated-subquery cost.qo_iscan_cost— index-scan four-slot cost.qo_seq_scan_cost— sequential-scan four-slot cost.qo_nljoin_cost— nested-loop join four-slot cost (with LIMIT pushdown branch).qo_mjoin_cost— sort-merge join four-slot cost.qo_sort_cost,qo_follow_cost,qo_worst_cost,qo_zero_cost— dispatch tails.qo_plan_compute_subquery_cost— one-shot subquery cost (uses the cachedQO_SUMMARY).qo_plan_cmp— three-way plan comparator (PLAN_COMP_LT/EQ/GT/UNK); used by the planvec dominance check.qo_plan_cmp_prefer_covering_index— covering-index tiebreaker.
Selectivity (query_planner.c)
Section titled “Selectivity (query_planner.c)”qo_expr_selectivity— recursive selectivity walker over PT_EXPR; dispatches per-op.qo_equal_selectivity,qo_comp_selectivity,qo_between_selectivity,qo_range_selectivity,qo_or_selectivity,qo_and_selectivity,qo_not_selectivity— per-op leaves.
XASL emission (plan_generation.c)
Section titled “XASL emission (plan_generation.c)”qo_to_xasl— public entry called fromxasl_generation.cafter the optimizer returns.gen_outer— recursive emit of the outer side of a plan; hands off togen_innerfor the inner.make_scan_proc,make_buildlist_proc— top-level XASL skeleton constructors.add_access_spec— populate oneACCESS_SPEC_TYPEfrom one scan plan.make_pred_expr— lower aPT_EXPRinto a runtime predicate expression.make_outptr_list— build the projection list from the SELECT clause.preserve_info— saveQO_SUMMARYon the parent PT_SELECT so callers can re-cost.
Position hints as of 2026-04-30
Section titled “Position hints as of 2026-04-30”| Symbol | File | Line |
|---|---|---|
qo_optimize_query | optimizer/query_graph.c | 353 |
qo_optimize_helper | optimizer/query_graph.c | 424 |
qo_env_init | optimizer/query_graph.c | 633 |
qo_validate | optimizer/query_graph.c | 762 |
build_query_graph | optimizer/query_graph.c | 942 |
build_query_graph_post | optimizer/query_graph.c | 982 |
build_query_graph_function_index | optimizer/query_graph.c | 1010 |
build_graph_for_entity | optimizer/query_graph.c | 1107 |
qo_add_node | optimizer/query_graph.c | 1243 |
lookup_node | optimizer/query_graph.c | 1374 |
qo_insert_segment | optimizer/query_graph.c | 1493 |
qo_join_segment | optimizer/query_graph.c | 1599 |
lookup_seg | optimizer/query_graph.c | 1620 |
qo_add_final_segment | optimizer/query_graph.c | 1690 |
qo_add_term | optimizer/query_graph.c | 1718 |
qo_add_dep_term | optimizer/query_graph.c | 1837 |
struct qo_node | optimizer/query_graph.h | 265 |
struct qo_segment | optimizer/query_graph.h | 436 |
struct qo_eqclass | optimizer/query_graph.h | 527 |
enum QO_TERMCLASS | optimizer/query_graph.h | 562 |
struct qo_term | optimizer/query_graph.h | 591 |
struct qo_partition | optimizer/query_graph.h | 797 |
struct qo_env | optimizer/query_graph.h | 849 |
enum QO_PLANTYPE | optimizer/query_planner.h | 42 |
enum QO_SCANMETHOD | optimizer/query_planner.h | 51 |
enum QO_JOINMETHOD | optimizer/query_planner.h | 60 |
struct qo_plan | optimizer/query_planner.h | 113 |
struct qo_planvec | optimizer/query_planner.h | 248 |
struct qo_info | optimizer/query_planner.h | 256 |
struct qo_planner | optimizer/query_planner.h | 337 |
QO_CPU_WEIGHT | optimizer/query_planner.c | 79 |
ISCAN_OID_ACCESS_OVERHEAD | optimizer/query_planner.c | 80 |
ISCAN_IO_HIT_RATIO | optimizer/query_planner.c | 85 |
GUESSED_BIND_LIMIT_CARD | optimizer/query_planner.c | 87 |
qo_plan_compute_cost | optimizer/query_planner.c | 750 |
qo_seq_scan_new | optimizer/query_planner.c | 1678 |
qo_index_scan_new | optimizer/query_planner.c | 1784 |
qo_iscan_cost | optimizer/query_planner.c | 2078 |
qo_join_new | optimizer/query_planner.c | 2844 |
qo_nljoin_cost | optimizer/query_planner.c | 3315 |
qo_mjoin_cost | optimizer/query_planner.c | 3465 |
qo_plan_cmp_prefer_covering_index | optimizer/query_planner.c | 4044 |
qo_plan_cmp | optimizer/query_planner.c | 4098 |
qo_plan_finalize | optimizer/query_planner.c | 5059 |
qo_check_plan_on_info | optimizer/query_planner.c | 5942 |
qo_examine_idx_join | optimizer/query_planner.c | 6151 |
qo_examine_nl_join | optimizer/query_planner.c | 6234 |
qo_examine_merge_join | optimizer/query_planner.c | 6404 |
qo_alloc_planner | optimizer/query_planner.c | 7044 |
planner_permutate | optimizer/query_planner.c | 8190 |
qo_planner_search | optimizer/query_planner.c | 8352 |
qo_generate_join_index_scan | optimizer/query_planner.c | 8389 |
qo_generate_index_scan | optimizer/query_planner.c | 8639 |
qo_search_planner | optimizer/query_planner.c | 8879 |
qo_search_partition_join | optimizer/query_planner.c | 9309 |
qo_search_partition | optimizer/query_planner.c | 9471 |
qo_combine_partitions | optimizer/query_planner.c | 9576 |
qo_expr_selectivity | optimizer/query_planner.c | 9710 |
make_scan_proc | optimizer/plan_generation.c | 133 |
make_buildlist_proc | optimizer/plan_generation.c | 728 |
add_access_spec | optimizer/plan_generation.c | 895 |
qo_to_xasl | optimizer/plan_generation.c | 2915 |
Cross-check Notes
Section titled “Cross-check Notes”The three raw analyses are slide decks; their inevitable imprecisions and their drift from the current source are flagged here so a reader holding the slides next to the code knows where to trust which.
analysis_QP_optimizer_1.0.pptxis from 2019-07-25 and predates several refinements:- The deck’s slide 22 puts the join_info vector at exactly
2^Nslots. The current source allocatesM_offset + 2^|partition|per partition (qo_alloc_planner,query_planner.c:7044), which is the same thing only when there is one partition. For Cartesian-product workloadsplanner->Mis the sum across partitions. - Slide 25 (“first node 선별”) notes that the first-node
selection logic does not consider join-relevant indexes
when comparing cost — “join 관계의 index가 고려되지 않은
로직”. The current source’s
planner_permutatestill has this property: cost isbest_plan->fixed + variableplusplanner_nodeset_join_costfor unvisited nodes. The remark stays valid. - Slide 27 documents three join methods (
idx_join,nl_join,merge_join); the current source also hasQO_JOINMETHOD_HASH_JOIN(inQO_JOINMETHODsince the hash-list join feature) but it is gated through different code paths (hash_termspropagation throughqo_join_new) rather than through a separateqo_examine_hash_join. - Slide 28’s plan-comparison priority table
(
plan_type_scan>plan_type_sort, etc.) still describesqo_plan_cmpaccurately at the single-plan level, but the currentqo_plan_cmp_prefer_covering_indexis a separate routine that runs on top of this and was added later.
- The deck’s slide 22 puts the join_info vector at exactly
[CUBRID QP] Parsing to Query Graph.pptx(2025-07) is recent and largely matches the source. One area to watch: the deck conflates “Query Graph” with the post-rewriter PT_NODE tree at slides 4–6; in the source, “build_query_graph” is the parser_walk callback that createsQO_ENVfrom the rewritten PT_NODE — the QG isQO_ENV, not a richer PT_NODE.[CUBRID QP] QG to CG.pptx(2025-08) mirrors the earlier deck’s structural slides 1–32 verbatim and only diverges at the “Plan enumeration” / “Code generation” sections (slides 33+, not excerpted here). The plan-enum description matchesqo_search_partition_join’s two-pass structure faithfully.- PT_NODE rewriter is out of scope of the optimizer module
per the source layout but inside scope per the slide deck.
mq_rewritelives inparser/, notoptimizer/.qo_optimize_helperdoes no large-scale tree rewriting; it only analyses. All view-merging and CNF normalisation happens upstream — the optimizer assumes a normalized PT_NODE. - The “first node selection” logic comment in slide 25 is worth preserving in this document because it materially affects plan quality on join-heavy queries with selective per-index conditions on a non-first table.
QO_JOINMETHOD_HASH_JOINis reachable but not enumerated by a dedicatedqo_examine_hash_join. The hash list scan is selected via thehash_termsbitset propagated throughqo_examine_nl_join→qo_join_newwhen the relevant hash hint is set. The deck’s “three join methods” claim therefore reads as three core methods — accurate at the cost-model granularity but a slight under-count at the join-method enum granularity.
Open Questions
Section titled “Open Questions”-
Is the
setjmp/longjmpcatch frame inqo_optimize_querystill needed? Modern C++17 exceptions are not used in engine code (the global anti-pattern inCLAUDE.md), but modern error-return style is. Investigation path: trace all call sites ofQO_ABORTandQO_ASSERT; check whether any are reachable from C++17 code that couldthrowinstead oflongjmp. -
Why is the partial-join
join_unitcapped at 8 / 3 / 2 without a runtime parameter?qo_search_partition_joinhardcodesMIN(8, nodes_cnt)for ≤25-table joins, 3 for 26–37, and 2 above. The Sybase Adaptive Server lineage referenced in the source comment suggests these numbers are rule-of-thumb. Investigation path: build a benchmark sweep over varyingnodes_cntand instrumentplanner->best_info->costto see whether widening the window improves real plans. -
QO_JOINMETHOD_HASH_JOINenumeration coverage. Hash joins are emitted viaqo_join_newwith a non-emptyhash_termsbitset, which is computed inside the merge/idx examine routines. Are there join shapes (e.g., Cartesian product with a single big sargable filter) where the planner never gets a chance to propose a hash-list scan? Investigation path: readqo_examine_*for thehash_termspropagation logic and check the join-hint gating. -
ISCAN_IO_HIT_RATIO = 0.5is a guess. The constant models the buffer-pool hit rate of inner index probes inside a nested-loop join. The actual rate depends on workload locality, buffer-pool size, and outer cardinality. Has anyone measured the cost-model error this introduces? Investigation path: instrumentqo_nljoin_costand compare predicted vs. observed inner I/O on TPC-H or similar. -
QO_CPU_WEIGHT = 0.0025calibration. This constant trades CPU and I/O units in the four-slot cost model. Modern hardware (NVMe, large DRAM) has a much higher I/O-to-CPU ratio than the 1990s-era hardware the constant was originally calibrated for. Investigation path: re-fit the constant on a representative modern workload; check whether reducing it makes the planner prefer index plans too aggressively or appropriately. -
Plan-cache feedback loop. XASL caching means the same plan is re-used across executions; bind-variable values can drift, making the cardinality estimate that produced the plan stale. Does the optimizer ever invalidate cached plans when statistics change beyond a threshold, or only on explicit DDL/
UPDATE STATISTICS? Investigation path: read the XASL-cache invalidation hooks inquery/query_executor.c. -
qo_check_plan_on_infodominance with NPLANS=4. Why exactly four? Empirically: enough to keep three or four useful interesting orders simultaneously (one unordered, one per likely sort), but the source carries no commentary on the choice. Is there a workload where NPLANS=4 is insufficient? Investigation path: instrumentqo_check_plan_on_infowith overflow-counting and run a wide-table benchmark. -
qo_classify_outerjoin_termscorrectness on nested outer joins. TheDURING_JOIN/AFTER_JOIN/TOTALLY_AFTER_JOINdistinction is delicate when outer joins nest (e.g.,a LEFT JOIN (b LEFT JOIN c ON …) ON …). The slide deck does not exercise this case. Investigation path: read the routine end-to-end, build adversarial test cases, diff against PostgreSQL’s outer-join planner (prepjointree.c).
Sources
Section titled “Sources”Raw analyses (raw/code-analysis/cubrid/query-processing/)
Section titled “Raw analyses (raw/code-analysis/cubrid/query-processing/)”analysis_QP_optimizer_1.0.pptx— original optimizer overview deck (2019-07-25, team2). 32 slides covering parser → semantic check → rewriter → optimizer → QO_ENV → planner → plan; the canonical source for the term-class taxonomy and the join-method comparison.[CUBRID QP] Parsing to Query Graph.pptx— Hyung-Gyu Ryoo, 2025-07. Deeper-than-deck-1 walkthrough of the PT_NODE → QO_ENV pipeline.[CUBRID QP] QG to CG.pptx— same author, 2025-08; covers plan enumeration and XASL emission._converted/analysisqpoptimizer1.0.pptx.md,_converted/cubrid-qp-parsing-to-query-graph.pptx.md,_converted/cubrid-qp-qg-to-cg.pptx.md— markitdown text extracts.
Sibling docs in this knowledge base
Section titled “Sibling docs in this knowledge base”knowledge/code-analysis/cubrid/cubrid-mvcc.md— MVCC visibility predicates that the index-scan and seq-scan access specs invoke per row.knowledge/code-analysis/cubrid/cubrid-heartbeat.md— unrelated module; cross-link only.
Textbook chapters / papers
Section titled “Textbook chapters / papers”- Selinger, Astrahan, Chamberlin, Lorie, Price, Access Path Selection in a Relational Database Management System, SIGMOD 1979 — the canonical reference for the System R DP optimizer that CUBRID’s planner is a descendant of. Specifically: §3.2 “Cost formulas”, §3.3 “Order preservation in joins” (interesting orders), §4 “Multi-relation queries” (left-deep DP, dominance pruning).
- Graefe, The Cascades Framework for Query Optimization, IEEE Data Eng. Bull. 1995 — the rule-based optimizer family CUBRID is not part of, included for contrast.
- Hellerstein, Stonebraker, Hamilton, Architecture of a Database System, FnT in Databases 2007, §6 “Relational Query Processor” — modern survey of the design space.
- Database Internals (Petrov), Ch. “Query Processing and Optimization” — textbook framing for this document.
CUBRID source (/data/hgryoo/references/cubrid/)
Section titled “CUBRID source (/data/hgryoo/references/cubrid/)”src/optimizer/optimizer.h— public types and entry prototypes.src/optimizer/query_graph.h—QO_NODE,QO_SEGMENT,QO_TERM,QO_EQCLASS,QO_PARTITION,QO_ENVplus accessor macros.src/optimizer/query_graph.c— graph builders,qo_optimize_*, edge/eq-class/index/partition discovery, term analysis.src/optimizer/query_planner.h—QO_PLAN,QO_PLANVEC,QO_INFO,QO_PLANNERplus enums.src/optimizer/query_planner.c— DP search, cost model,qo_examine_*routines, selectivity calculations.src/optimizer/plan_generation.c—qo_to_xasland the per-plan-type lowerings into XASL_NODE / ACCESS_SPEC_TYPE.src/optimizer/query_bitset.{c,h}— set-of-int primitive used by every part of the optimizer.src/parser/parse_tree.h—PT_NODEdefinitions consumed by the optimizer entry point.src/query/xasl.h—XASL_NODE/ACCESS_SPEC_TYPEproduced byqo_to_xasl.src/xasl/— modular XASL type headers.src/parser/xasl_generation.c— caller ofqo_optimize_queryandqo_to_xasl; not underoptimizer/but the entry surface.