CUBRID Query Rewrite — Pre-Optimization Tree Transformations and the LIMIT-Clause Case Study

Contents:

Theoretical Background
Common DBMS Design
CUBRID’s Approach
Source Walkthrough
Cross-check Notes
Open Questions
Sources

Theoretical Background

Query rewriting is the layer of query processing that lives between parsing (which only enforces syntax) and cost-based optimization (which enumerates physical plans and prices them). Its job is logical: take the parse tree the user wrote and produce an equivalent — but more amenable — parse tree, before the optimizer ever scores a plan. Database Internals (Petrov, Ch. 12 “Query Planning and Optimization”) frames it as the “rule-based” stage; Database System Concepts (Silberschatz et al., Ch. 16) calls it “transformation of relational expressions”. Either way, the contract is equivalence- preserving syntactic surgery.

Goetz Graefe’s The Cascades Framework for Query Optimization (IEEE Data Eng. Bull., 1995) makes the dichotomy explicit: transformation rules (rewrites that always improve, applied unconditionally) versus implementation rules (alternatives that the optimizer must cost). Predicate pushdown is the canonical transformation — it is never worse to filter early — so it lives in the rewrite layer. Join-order enumeration is the canonical implementation — it has to be costed — so it lives in the optimizer proper. CUBRID adheres to this split: the optimizer’s rewriter/ directory implements transformations, and query_planner.c / plan_generation.c implement enumeration and costing.

The classic transformation catalogue:

Predicate pushdown — move WHERE conjuncts as close to the scan as the schema allows. Reduces tuples earlier in the pipeline.
View inlining (mq_translate, in CUBRID) — substitute a view’s defining query for the view reference, exposing the underlying tables to subsequent rewrites.
Subquery flattening — turn an uncorrelated IN (SELECT ...) into a join with a derived table, exposing the inner tables to join enumeration.
Outer-join reduction — promote a left/right outer join to an inner join when a predicate above it is null-rejecting on the optional side. Database Internals §12.3 walks through why this is safe.
CNF normalisation — convert the WHERE filter to conjunctive normal form so each conjunct is a candidate for pushdown and for index-driven evaluation.
Constant folding / equality reduction — replace WHERE x = 5 AND ... x ... with 5 everywhere x dominated by that conjunct appears, exposing more constants for further folding.
Redundant-join elimination — drop a joined table whose columns are not referenced and whose join condition is FK→PK, i.e., the join cannot remove or duplicate rows.
LIMIT pushdown / row-counting predicate lowering — the topic of this document. The user-visible LIMIT n becomes a row-numbering predicate on the scan side, so early termination is expressible inside the existing predicate machinery rather than requiring a new “stop after n” plan operator.

The last item — LIMIT pushdown — is interesting because it is not pure pushdown. A LIMIT n over an unsorted scan is trivially expressible as INST_NUM() <= n on the scan’s tuple counter. A LIMIT n after a sort is not a scan-side predicate; it must sit on the sort output, where the counter is well-defined. A LIMIT n after a GROUP BY similarly sits on the group output. CUBRID encodes the choice via three different numbering operators (PT_INST_NUM, PT_ORDERBY_NUM, PT_GROUPBY_NUM) and routes the LIMIT into the right predicate slot (where, orderby_for, or having) based on the surrounding clauses. The lowering is a decision tree, walked once during semantic checking and (for some shapes) revisited during mq_rewrite.

Common DBMS Design

PostgreSQL’s subquery_planner (in src/backend/optimizer/plan/ planner.c) is structured as an ordered sequence of rewrite passes: pull_up_sublinks → inline_set_returning_functions → pull_up_subqueries → flatten_simple_union_all → reduce_outer_joins → remove_useless_result_rtes → preprocess_expression (which does constant folding, canonicalize_qual, eval_const_expressions) → preprocess_qual_ conditions. LIMIT lives as parse->limitOffset / parse->limitCount and is consumed by the executor’s ExecLimit node — Postgres does not lower it into a row- counting predicate at rewrite time; it stays a top-of-plan operator throughout.

MySQL’s optimizer entrypoint JOIN::optimize calls optimize_cond (a CNF-and-constant-folding pass), then prune_partitions, then make_join_statistics, optimize_keyuse, and finally cost-based join order. LIMIT n in MySQL is also a top-of-plan operator (limit field on SELECT_LEX), with two known exceptions where the rewrite layer peeks at it: index-condition pushdown and the loose-index-scan optimization for MIN/MAX over indexes — both of which can short-circuit a scan when LIMIT is small. There is no general- purpose lowering of LIMIT into a counting predicate.

Oracle’s view-merging (the heuristic version, before the cost- based variant arrived in 10g) is the closest precedent for CUBRID’s mq_translate: an inline-eligible view is substituted into its parent query, exposing the underlying tables to the optimizer. Oracle’s _complex_view_merging parameter and the MERGE / NO_MERGE hints are the operator-visible knobs.

CUBRID’s choice — to lower LIMIT into a row-counting predicate during semantic checking, then re-run a rewrite pass over the resulting expression — sits between Postgres (“LIMIT stays structural”) and the index-scan-internal LIMIT optimizations (“LIMIT influences the bottom of the plan”). The choice has two consequences:

LIMIT participates in CNF, constant folding, and auto- parameterization for free, because by the time those passes run, LIMIT is already a PT_LE expression with a PT_VALUE right-hand side.
The optimizer must un-recognise the lowered form when it wants to apply the multi-range index-scan LIMIT optimization. CUBRID does this in qo_check_iscan_for_multi_ range_opt (in plan_generation.c), which inspects query->info.query.orderby_for — the post-lowering slot — to decide whether the LIMIT is in the right shape to drive the multi-range optimization at scan time.

CUBRID’s Approach

The query-processing pipeline that surrounds rewrite is a six-stage path from SQL text to executable XASL. Rewrite is distributed across stages 3 and 5, not localised in one pass:

flowchart LR
  SQL["SQL text"] --> LX["Lexer<br/>csql_lexer.l"]
  LX --> PR["Parser<br/>csql_grammar.y<br/>→ PT_NODE tree"]
  PR --> NR["Name resolution<br/>pt_resolve_names"]
  NR --> SC["Semantic check<br/>pt_semantic_check<br/>type-check + LIMIT lowering<br/>(pt_eval_type_pre)"]
  SC --> VT["View inlining<br/>mq_translate"]
  VT --> RW["Rewrite pass<br/>mq_rewrite (qo_rewrite_queries)<br/>CNF, predicate pushdown,<br/>subquery flattening,<br/>residual LIMIT lowering,<br/>auto-parameterize"]
  RW --> OPT["Cost optimizer<br/>qo_optimize_statement"]
  OPT --> XG["XASL generation<br/>xasl_generation.c"]
  XG --> EX["Execution<br/>query_executor.c"]
  classDef rw fill:#fff5d6,stroke:#a07000;
  class SC,VT,RW rw;

The yellow stages all touch the parse tree; the rewrite layer proper is the third one (mq_rewrite). The earlier two are relevant because they (a) lower LIMIT into a numbering predicate and (b) produce a tree where view references have been substituted with their definitions. Both happen before mq_rewrite runs, so the rewrite layer can assume LIMIT is already a predicate in most cases and that it sees real tables.

The driver is in src/compat/db_vdb.c:

// db_compile_statement_local — src/compat/db_vdb.c (condensed)
int
db_compile_statement_local (DB_SESSION * session)
{
  // ... parser already populated session->statements ...

  /* prefetch and lock classes to avoid deadlock */
  (void) pt_class_pre_fetch (parser, statement);

  /* semantic check + LIMIT lowering happens inside */
  statement_result = pt_compile (parser, statement);

  /* view substitution + post-translate rewrites */
  statement_result = mq_translate (parser, statement);
  // ... mq_translate calls mq_rewrite internally for SELECTs ...

  /* re-prefetch translated real classes */
  (void) pt_class_pre_fetch (parser, statement);

  // ... do_prepare_statement → XASL generation ...
}

pt_compile is the entrypoint to semantic checking; mq_translate is the entrypoint to view inlining and post-translate rewrite (it calls mq_rewrite for the resulting SELECTs). Between them, all parse-tree shape changes complete before XASL generation.

LIMIT lowering during semantic checking

The first rewrite the LIMIT clause undergoes happens inside pt_eval_type_pre (in type_checking.c), which is called as the pre-walk callback of pt_semantic_type on every node in the tree. The decision tree is the heart of the LIMIT case study:

flowchart TD
  S["LIMIT clause<br/>node.info.query.limit != NULL<br/>flag.rewrite_limit == 1"]
  S -->|node_type| K{node_type?}
  K -->|PT_SELECT| SEL{has<br/>ORDER BY?}
  K -->|PT_UNION/<br/>PT_INTERSECTION/<br/>PT_DIFFERENCE| UNI{has<br/>ORDER BY?}
  K -->|PT_DELETE| DEL["INST_NUM lowering →<br/>delete_.search_cond"]
  K -->|PT_UPDATE| UPD{has<br/>ORDER BY?}
  SEL -->|yes,<br/>no order_siblings| OBN["ORDERBY_NUM lowering →<br/>query.orderby_for"]
  SEL -->|no,<br/>has GROUP BY| GBN["GROUPBY_NUM lowering →<br/>q.select.having"]
  SEL -->|no GBY,<br/>has DISTINCT| OBN2["ORDERBY_NUM lowering →<br/>query.orderby_for"]
  SEL -->|none of the above| INST["INST_NUM lowering →<br/>q.select.where"]
  UNI -->|yes| OBN3["ORDERBY_NUM lowering →<br/>query.orderby_for"]
  UNI -->|no| RESID["leave for mq_rewrite<br/>(see qo_rewrite_queries)"]
  UPD -->|yes| OBN4["ORDERBY_NUM lowering →<br/>update.orderby_for"]
  UPD -->|no| INST2["INST_NUM lowering →<br/>update.search_cond"]

The branch order is critical: ORDER BY beats GROUP BY beats DISTINCT beats “nothing”, and only the first match is taken. A SELECT with both GROUP BY and ORDER BY uses ORDERBY_NUM, not GROUPBY_NUM. The cheat-sheet’s enumeration of the cases matches the source one-for-one; the pt_eval_type_pre arm for PT_SELECT is:

// pt_eval_type_pre (PT_SELECT arm) — src/parser/type_checking.c
case PT_SELECT:
  /* rewrite limit clause as numbering expression and add it to the corresponding predicate */
  if (node->info.query.limit && node->info.query.flag.rewrite_limit)
    {
      PT_NODE *limit, *t_node;
      PT_NODE **expr_pred;

      if (node->info.query.order_by && !node->info.query.flag.order_siblings)
        {
          expr_pred = &node->info.query.orderby_for;
          limit = pt_limit_to_numbering_expr (parser, node->info.query.limit, PT_ORDERBY_NUM, false);
        }
      else if (node->info.query.q.select.group_by)
        {
          expr_pred = &node->info.query.q.select.having;
          limit = pt_limit_to_numbering_expr (parser, node->info.query.limit, PT_LAST_OPCODE, true);
        }
      else if (node->info.query.all_distinct == PT_DISTINCT)
        {
          expr_pred = &node->info.query.orderby_for;
          limit = pt_limit_to_numbering_expr (parser, node->info.query.limit, PT_ORDERBY_NUM, false);
        }
      else
        {
          expr_pred = &node->info.query.q.select.where;
          limit = pt_limit_to_numbering_expr (parser, node->info.query.limit, PT_INST_NUM, false);
        }

      if (limit != NULL)
        {
          /* append `limit` to the chosen predicate slot */
          // ... condensed: walk expr_pred to its tail, splice in `limit` ...
          node->info.query.flag.rewrite_limit = 0;
        }
    }
  break;

Two non-obvious facts about this arm. First, order_siblings (a CONNECT BY hierarchy modifier) intentionally vetoes the ORDER BY branch — siblings ordering is per-level, not global, so ORDERBY_NUM would count across siblings and overshoot the caller’s intent. Second, the GROUP BY case uses PT_LAST_OPCODE (a sentinel) for the operator and passes is_gby_num = true — the helper interprets this as “build a PT_GROUPBY_NUM function node”, not a PT_EXPR operator, because GROUPBY_NUM is a function in the parse tree taxonomy.

The flag node->info.query.flag.rewrite_limit is set to 1 by the grammar when LIMIT is present (csql_grammar.y writes it in the SELECT/UPDATE/DELETE production rules) and cleared to 0 once the lowering succeeds. After clearing, the LIMIT expression is still on node->info.query.limit for diagnostic printing, but it no longer participates in execution — the real predicate is the lowered numbering expression in the chosen predicate slot.

parser_print_tree checks flag.rewrite_limit and suppresses LIMIT output when the flag is 0 — i.e., after lowering, the printed query has no LIMIT clause but the equivalent WHERE inst_num() <= n (or its sibling). The cheat-sheet’s PT_CONVERT_RANGE formatting variant exposes the lowered form explicitly; this is what the raw analysis quotes for the select code, name FROM athlete ... LIMIT 1 example.

How the lowered predicate is constructed

pt_limit_to_numbering_expr (in parser_support.c) is a small factory. The user-facing LIMIT has up to two values: [offset,] row_count. The function builds either lhs <= row_count (when no offset) or lhs > offset AND lhs <= offset + row_count (when offset is present). The lhs is one of:

PT_EXPR with op = PT_INST_NUM — counts rows after the scan.
PT_EXPR with op = PT_ORDERBY_NUM — counts rows after sort.
PT_FUNCTION with function_type = PT_GROUPBY_NUM — counts groups after grouping.

// pt_limit_to_numbering_expr — src/parser/parser_support.c (condensed)
PT_NODE *
pt_limit_to_numbering_expr (PARSER_CONTEXT * parser, PT_NODE * limit,
                            PT_OP_TYPE num_op, bool is_gby_num)
{
  PT_NODE *lhs, *part1, *part2, *node;

  /* Build the LHS: GROUPBY_NUM is a function node, the others are expr nodes. */
  if (is_gby_num)
    {
      lhs = parser_new_node (parser, PT_FUNCTION);
      lhs->type_enum = PT_TYPE_INTEGER;
      lhs->info.function.function_type = PT_GROUPBY_NUM;
    }
  else
    {
      lhs = parser_new_node (parser, PT_EXPR);
      lhs->info.expr.op = num_op;        /* PT_INST_NUM or PT_ORDERBY_NUM */
    }

  if (limit->next == NULL)
    {
      /* `LIMIT n` → lhs <= n */
      node = parser_new_node (parser, PT_EXPR);
      node->info.expr.op = PT_LE;
      node->info.expr.arg1 = lhs;
      node->info.expr.arg2 = parser_copy_tree (parser, limit);
    }
  else
    {
      /* `LIMIT off, n` → lhs > off AND lhs <= off + n */
      part1->info.expr.op = PT_GT;
      part1->info.expr.arg1 = lhs;
      part1->info.expr.arg2 = parser_copy_tree (parser, limit);     /* offset */

      part2->info.expr.op = PT_LE;
      part2->info.expr.arg1 = parser_copy_tree (parser, lhs);
      /* part2->arg2 = (offset + row_count); folded if both are PT_VALUE */
      // ... condensed: build PT_PLUS sum, fold if both literals ...

      node = parser_new_node (parser, PT_EXPR);
      node->info.expr.op = PT_AND;
      node->info.expr.arg1 = part1;
      node->info.expr.arg2 = part2;
    }
  return node;
}

The PT_VALUE && PT_VALUE constant-folding branch in the two-arg case is a small win: LIMIT 100, 10 becomes inst_num () > 100 AND inst_num() <= 110 directly, with 110 already computed at rewrite time. If either bound is a host variable, the PT_PLUS expression node is kept and folded later.

The mq_rewrite stage — what happens after view inlining

mq_translate substitutes view references with their defining queries. After substitution, new SELECTs may appear in the tree that did not exist before semantic checking, and these SELECTs may carry a LIMIT that came from the view definition or that was attached by mq_translate itself. The mq_rewrite pass at the bottom of mq_translate therefore has to:

Re-lower any residual LIMIT clauses that were not lowered in semantic checking (the UNION case, primarily).
CNF-normalise predicates (which now include the lowered numbering predicates from semantic checking).
Push predicates down where safe.
Flatten uncorrelated subqueries to joins with derived tables.
Reduce equality terms (constant propagation through =).
Reduce outer joins to inner joins where safe.
Auto-parameterize constants for the XASL plan cache.

The driver is mq_rewrite, which parser_walk_trees the statement with qo_rewrite_queries as the pre-callback and qo_rewrite_queries_post as the post-callback:

// mq_rewrite — src/optimizer/rewriter/query_rewrite.c
PT_NODE *
mq_rewrite (PARSER_CONTEXT * parser, PT_NODE * statement)
{
  return parser_walk_tree (parser, statement,
                           qo_rewrite_queries, NULL,
                           qo_rewrite_queries_post, NULL);
}

qo_rewrite_queries is a 500-line dispatch with three phases.

Phase 1 — pre-rewrite

For each statement type, the pre-rewrite phase identifies the predicate slots that subsequent phases will operate on (wherep, havingp, orderby_for_p, etc.) and applies shape-only transformations:

qo_move_on_of_explicit_join_to_where — move ON-clause predicates of explicit joins into the WHERE list, tagged with their original location. The post-callback uses the location tag to undo this for outer joins. Inner joins keep the move; outer joins get the predicate restored to the ON clause unless it has been transformed into a copy-pushed term.
qo_rewrite_index_hints — drop hints that reference dropped tables (after view inlining, the hint may no longer apply).
For PT_UNION / PT_DIFFERENCE / PT_INTERSECTION, the residual LIMIT lowering runs here. This is the case semantic checking left for later: a UNION with LIMIT but no ORDER BY:

// qo_rewrite_queries (PT_UNION arm, residual LIMIT) — query_rewrite.c
case PT_UNION:
case PT_DIFFERENCE:
case PT_INTERSECTION:
  // ... condensed: rewrite hidden columns of arg1, arg2 as derived ...
  if (node->info.query.limit && node->info.query.flag.rewrite_limit)
    {
      limit = pt_limit_to_numbering_expr (parser, node->info.query.limit,
                                          PT_INST_NUM, false);
      if (limit != NULL)
        {
          node->info.query.flag.rewrite_limit = 0;

          /* Push limit into UNION arms when safe (no ORDER BY at top, no DISTINCT,
             no NO_PUSH_PRED hint). */
          if (node->info.query.order_by == NULL
              && !qo_check_distinct_union (parser, node)
              && !qo_check_hint_union (parser, node, PT_HINT_NO_PUSH_PRED))
            {
              node = qo_push_limit_to_union (parser, node, limit_node);
            }
          /* Wrap the UNION as a derived table and apply the lowered predicate
             on top: SELECT * FROM (UNION) T WHERE INST_NUM() <= n. */
          derived = mq_rewrite_query_as_derived (parser, node);
          derived->info.query.q.select.where = limit;
          node = derived;
        }
    }
  orderby_for_p = &node->info.query.orderby_for;
  break;

Two transformations stack here. The first is per-arm LIMIT push: qo_push_limit_to_union walks each branch of the UNION tree and copies the LIMIT (or LIMIT offset+row_count, summed to a single bound) onto each arm, so each arm independently stops after enough rows. The second is wrap-as-derived: the UNION becomes the FROM clause of a synthetic outer SELECT whose WHERE is the lowered INST_NUM() <= n, so the final top-of-pipeline cap is also expressible. The combination means UNION-with-LIMIT can short-circuit each arm independently and also stop the merged pipeline.

Phase 2 — optimization rewrites

Phase 2 runs only when OPTIMIZATION_ENABLED(level) is true (which it is for normal queries; the parameter optimization_ level can disable it for debugging).

// qo_rewrite_queries (phase 2 sketch) — query_rewrite.c
if (OPTIMIZATION_ENABLED (level))
  {
    if (node->node_type == PT_SELECT)
      {
        int continue_walk, idx = 0;
        qo_rewrite_subqueries (parser, node, &idx, &continue_walk);
      }

    /* convert predicate lists to CNF */
    if (*wherep)        *wherep        = pt_cnf (parser, *wherep);
    if (*havingp)       *havingp       = pt_cnf (parser, *havingp);
    if (*startwithp)    *startwithp    = pt_cnf (parser, *startwithp);
    if (*connectbyp)    *connectbyp    = pt_cnf (parser, *connectbyp);
    if (*aftercbfilterp)*aftercbfilterp= pt_cnf (parser, *aftercbfilterp);
    if (*orderby_for_p) *orderby_for_p = pt_cnf (parser, *orderby_for_p);
    // ... merge_upd_wherep, merge_ins_wherep, merge_del_wherep ...

    /* in HAVING with GROUP BY, move non-aggregate terms to WHERE */
    if (PT_IS_SELECT (node) && node->info.query.q.select.group_by && *havingp)
      {
        // ... condensed: HAVING-to-WHERE pushdown for non-aggregate terms ...
      }

    /* replace `WHERE x = const AND ... x ...` with `const` everywhere */
    if (*wherep) {
      if (PT_IS_SELECT (node))
        parser_walk_tree (parser, node, NULL, NULL,
                          qo_reduce_equality_terms_post, NULL);
      else
        QO_CHECK_AND_REDUCE_EQUALITY_TERMS (parser, node, wherep);
    }
    if (*havingp) QO_CHECK_AND_REDUCE_EQUALITY_TERMS (parser, node, havingp);

    /* per-clause term rewrites: range merging, OR→IN promotion, etc. */
    qo_rewrite_terms (parser, spec, wherep);
    qo_rewrite_terms (parser, spec, havingp);
    qo_rewrite_terms (parser, spec, startwithp);
    qo_rewrite_terms (parser, spec, connectbyp);
    qo_rewrite_terms (parser, spec, aftercbfilterp);
    qo_rewrite_terms (parser, spec, merge_upd_wherep);
    qo_rewrite_terms (parser, spec, merge_ins_wherep);
    qo_rewrite_terms (parser, spec, merge_del_wherep);

    /* outer-join reduction, predicate copy, redundant-join elimination */
    if (node->node_type == PT_SELECT)
      {
        if (!qo_rewrite_select_queries (parser, &node, wherep, &seqno))
          return node;
      }
  }

The HAVING-to-WHERE move on line “in HAVING with GROUP BY” deserves a closer look. A HAVING conjunct that does not reference an aggregate and does not reference a non-grouped column can be evaluated before grouping — moving it to WHERE is a textbook predicate pushdown. CUBRID’s variant adds two guards: the conjunct must not contain a pseudocolumn (CONNECT BY pseudocolumns are level-sensitive), and the GROUP BY must not have ROLLUP (ROLLUP groups depend on the pre-aggregate row count, so eliminating rows changes the result).

qo_rewrite_subqueries is the subquery-flattening pass. For each WHERE conjunct of the form attr op (SELECT ...) where the operator is one of =, IN, =SOME, >SOME, >=SOME, <SOME, <=SOME, the subquery’s columns are indexable, and the subquery is uncorrelated (correlation_level == 0), the pass:

Wraps the subquery as a derived table spec and appends it to the FROM list.
Replaces the operator with =, >, >=, <, or <=.
For the *SOME variants, replaces the subquery’s select list with MIN() or MAX() of the original column — > SOME becomes > MIN, < SOME becomes < MAX.

This trades a correlated lookup (one subquery execution per outer row) for a single derived-table join, which the join enumerator can then index-drive.

qo_rewrite_select_queries (in query_rewrite_select.c) does three things in order: rewrite outer joins to inner joins where a null-rejecting predicate dominates the optional side (qo_rewrite_outerjoin), reduce outer-joined tables when the join is uniquely keyed and unreferenced (qo_reduce_outer_joined_tbls), and remove tables joined by FK→PK whose columns aren’t projected (qo_reduce_joined_tbls_ ref_by_fk). The latter two are the redundant-join elimination pass.

Phase 3 — auto-parameterization

After all structural rewrites, qo_auto_parameterize walks each predicate list and replaces literal constants with input host variables (input markers). The motivation is the XASL plan cache: a cached plan keyed on the lowered parse tree is reusable across executions that differ only in literal values, which dramatically improves cache hit rate. The pass is gated by `prm_get_integer_value (PRM_ID_XASL_CACHE_MAX_ENTRIES)

0andnode->flag.cannot_prepare == 0`, and it runs only when the parser is not in static SQL or skip-auto-parameterize mode.

// qo_auto_parameterize_limit_clause — query_rewrite_auto_parameterize.c (condensed)
void
qo_auto_parameterize_limit_clause (PARSER_CONTEXT * parser, PT_NODE * node)
{
  PT_NODE *limit_offsetp, *limit_row_countp;

  /* Pull (offset, row_count) out of the per-statement-type slot. */
  // ... condensed: switch on node_type, set *_offsetp, *_row_countp ...

  /* Replace each literal with `pt_rewrite_to_auto_param` (creates an input host var). */
  if (limit_offsetp && pt_is_const_not_hostvar (limit_offsetp))
    new_limit_offsetp = pt_rewrite_to_auto_param (parser, limit_offsetp);
  if (limit_row_countp && pt_is_const_not_hostvar (limit_row_countp))
    new_limit_row_countp = pt_rewrite_to_auto_param (parser, limit_row_countp);

  /* Splice back into the slot. */
  // ... condensed: switch on node_type again to write back ...
}

Note that qo_auto_parameterize_limit_clause operates on the original node->info.query.limit slot, even though the lowered numbering predicate has already been built. Both copies exist after rewrite: the limit slot holds the host-var- parameterized LIMIT ? for diagnostic and printing purposes, while the lowered INST_NUM() <= ? (also host-var- parameterized via qo_auto_parameterize walking the WHERE list) is what XASL actually evaluates. This is the “NOTE: Dummy로 가지고 있고…” comment in the raw analysis — the question of whether keeping the dummy LIMIT is the right call is left open there, and remains so here.

The example, end-to-end

The raw analysis walks the example SELECT code, name INTO c, m FROM athlete WHERE gender = 'M' AND nation_code = 'KOR' LIMIT 1 through the three stages. With the source verified:

Stage 1 — parse. The grammar production for SELECT ... LIMIT 1 writes node->info.query.limit = PT_VALUE(1) and node->info.query.flag.rewrite_limit = 1. The original gender = 'M' AND nation_code = 'KOR' predicate goes on q.select.where.

Stage 2 — semantic check (pt_eval_type_pre, PT_SELECT arm). The query has no ORDER BY, no GROUP BY, no DISTINCT; the fall-through arm runs:

expr_pred = &node->info.query.q.select.where;
limit = pt_limit_to_numbering_expr (parser, node->info.query.limit, PT_INST_NUM, false);

pt_limit_to_numbering_expr returns PT_LE(PT_INST_NUM(), PT_VALUE(1)), which is appended to q.select.where. Both the existing gender and nation_code conjuncts and the new inst_num() <= 1 conjunct are now in the WHERE list. The rewrite_limit flag is cleared. Printing the tree with PT_CONVERT_RANGE formatting shows:

SELECT [public.athlete].code, [public.athlete].[name]
FROM   [public.athlete] [public.athlete]
WHERE  (inst_num() <= 1) AND [public.athlete].gender = 'M'
   AND [public.athlete].nation_code = 'KOR'

Stage 3 — mq_rewrite. CNF-normalises (already CNF), reduces equality terms (no opportunity here), then auto-parameterizes all literals:

SELECT [public.athlete].code, [public.athlete].[name]
FROM   [public.athlete] [public.athlete]
WHERE  (inst_num() <= ?:0) AND [public.athlete].gender = ?:1
   AND [public.athlete].nation_code = ?:2

The plan cache key is now insensitive to the specific values 1, ‘M’, ‘KOR’.

Multi-range LIMIT optimization at plan-generation time

The lowered LIMIT is consumed by the optimizer in two ways. The first is the ordinary one: INST_NUM() <= n participates in the scan’s predicate evaluation, and the executor stops calling the scan once enough rows are collected. The second is a special case for index scans with ORDER BY: when the ordering matches an index prefix and the LIMIT is small, CUBRID can drive the scan with a priority queue of index ranges, evaluating only enough of each range to satisfy the LIMIT. This is the “multi-range optimization” (MRO) and lives in qo_check_iscan_for_multi_range_opt (in plan_generation.c).

// qo_check_iscan_for_multi_range_opt — src/optimizer/plan_generation.c (condensed)
bool
qo_check_iscan_for_multi_range_opt (QO_PLAN * plan)
{
  if (!qo_is_iscan (plan)) return false;
  if (QO_NODE_IS_CLASS_HIERARCHY (plan->plan_un.scan.node)) return false;

  query = QO_ENV_PT_TREE (env);
  if (!PT_IS_SELECT (query)) return false;
  if (query->info.query.q.select.hint & PT_HINT_NO_MULTI_RANGE_OPT) return false;
  if (pt_has_aggregate (parser, query)) return false;            /* CBRD-22696 */
  if (query->info.query.order_by == NULL) return false;
  if (query->info.query.all_distinct == PT_DISTINCT) return false;
  if (query->info.query.orderby_for == NULL) return false;       /* must have lowered LIMIT */

  /* the order_by columns must occupy consecutive index positions, in matching
   * direction (or all reversed), starting at first_col_idx_pos */
  // ... build orderby_nodes from select_list using sort_spec.pos_descr.pos_no ...
  qo_check_plan_index_for_multi_range_opt (orderby_nodes, order_by, plan,
                                           &can_optimize, &first_col_idx_pos, &reverse);

  /* scan terms / key filter terms must allow MRO; subqueries must not depend on the LIMIT */
  qo_check_terms_for_multiple_range_opt (plan, first_col_idx_pos, &can_optimize);
  can_optimize = qo_check_subqueries_for_multi_range_opt (plan, first_col_idx_pos);

  /* upper bound must be < PRM_ID_MULTI_RANGE_OPT_LIMIT system parameter */
  if (!pt_check_ordby_num_for_multi_range_opt (parser, query,
                                               &env->multi_range_opt_candidate, NULL))
    return false;

  return true;
}

The MRO predicate explicitly checks for the post-lowering shape: orderby_for non-null means pt_eval_type_pre lowered LIMIT into the ORDERBY_NUM slot, which is the only slot MRO can act on. Equivalently, MRO is dead for the INST_NUM and GROUPBY_NUM lowerings; the optimization only applies when the LIMIT sits above an index-driven sort.

The pt_check_ordby_num_for_multi_range_opt predicate itself (in parser_support.c:9743) inspects the predicate shape: the upper bound has to be less than prm_get_integer_value (PRM_ID_MULTI_RANGE_OPT_LIMIT) (default in system_parameter.c), the predicate has to look like ORDERBY_NUM <= n or ORDERBY_NUM < n or n > ORDERBY_NUM or n >= ORDERBY_NUM, AND-combined; lower bounds are not allowed because MRO has no efficient way to skip leading rows of a priority queue.

Why the rewrite-time lowering matters

Three downstream consequences fall out of LIMIT being a predicate by the time mq_rewrite runs:

CNF and equality reduction work on it for free. The INST_NUM() <= n conjunct is a normal CNF term; if the user wrote LIMIT 1 and a prior rewrite folded another conjunct into a tautology, the CNF simplifier sees the survivor as inst_num() <= 1, not as a special LIMIT clause.
Auto-parameterization works on it for free. The literal n is recognised and replaced with a host-variable marker by the same walk that processes user-written constants.
Index-condition planning sees it as just another predicate — the optimizer’s usual machinery for identifying index-driven evaluation does not need a LIMIT- specific code path. The MRO machinery is the one place that does require LIMIT-awareness, and it is opt-in.

The price is the duplication noted earlier: the original info.query.limit field is preserved (for printing and for auto-parameterization of the offset/row_count host vars used by the keylimit and other downstream code paths), even though it is functionally a dummy after lowering. The cheat-sheet flags this as an open question; the source does not resolve it either.

Source Walkthrough

Anchor on symbol names, not line numbers. The CUBRID source moves; the position table at the end is scoped to the doc’s updated: date.

Parse-tree slots and flags

pt_query_info::limit (parse_tree.h) — the user-written LIMIT [offset,] row_count chain, list of PT_VALUE / PT_HOST_VAR / PT_EXPR nodes. Survives lowering.
pt_query_info::flag.rewrite_limit:1 (parse_tree.h) — set by the grammar when LIMIT is present, cleared by the lowering arm in pt_eval_type_pre (or in qo_rewrite_queries for the UNION-without-ORDER-BY case).
pt_delete_info::limit / pt_delete_info::rewrite_limit:1 (parse_tree.h) — same idea, DELETE-side.
pt_update_info::limit / pt_update_info::rewrite_limit:1 (parse_tree.h) — same idea, UPDATE-side.
PT_INST_NUM / PT_ORDERBY_NUM (parse_tree.h) — the two numbering operators in the PT_OP_TYPE enum.
PT_GROUPBY_NUM (parse_tree.h) — the numbering function in the FUNC_TYPE enum (functions and operators are different PT_NODE shapes; GROUPBY_NUM is a function because it is evaluated post-grouping).
PT_EXPR_INFO_GROUPBYNUM_LIMIT (parse_tree.h) — flag set on each conjunct of a GROUPBY_NUM-based lowered LIMIT, so the HAVING-to-WHERE move logic knows not to relocate it.

Grammar-side LIMIT handling

csql_grammar.y — three productions write flag.rewrite_ limit = 1: the SELECT/UNION limit production, the UPDATE limit production (sets update.rewrite_limit = 1), and the DELETE limit production (sets delete_.rewrite_limit = 1).

Semantic-check entry points

pt_compile (compile.c) — top-level entry, calls pt_semantic_check and returns. Owns the longjmp environment for parse-time errors.
pt_semantic_check / pt_check_with_info (semantic_check.c) — per-node-type semantic checks, dispatches to pt_resolve_names, pt_check_where, pt_semantic_type.
pt_semantic_type (type_checking.c) — owner of the pre/post walk that runs pt_eval_type_pre and pt_eval_type over the tree. Where LIMIT lowering lives.
pt_eval_type_pre (type_checking.c) — pre-walk callback, contains the LIMIT-lowering decision tree (PT_SELECT, PT_UPDATE, PT_DELETE, PT_UNION/PT_DIFFERENCE/PT_INTERSECTION arms).
pt_limit_to_numbering_expr (parser_support.c) — factory that builds the lhs <= row_count or lhs > offset AND lhs <= offset + row_count lowered expression.

View inlining and rewrite driver

mq_translate (view_transform.c) — top-level view-inlining and post-translate rewrite entry. Calls mq_rewrite for SELECT-shaped statements.
mq_rewrite (query_rewrite.c) — the parser_walk_tree driver of the rewrite layer. Wraps qo_rewrite_queries (pre) and qo_rewrite_queries_post (post).
qo_rewrite_queries (query_rewrite.c) — the three-phase rewrite dispatcher (pre-rewrite, optimization rewrites, auto-parameterize). The 500-line core.
qo_rewrite_queries_post (query_rewrite.c) — moves ON-clause predicates back to the join spec for outer joins; removes COPYPUSH terms.

Subquery flattening

qo_rewrite_subqueries (query_rewrite_subquery.c) — flatten uncorrelated subqueries in WHERE to derived-table joins.
qo_rewrite_hidden_col_as_derived (query_rewrite_subquery.c) — when an ORDER BY in a subquery would lose its hidden columns on the outer-projection side, wrap as a derived table so the order is preserved.
qo_add_limit_clause (query_rewrite_subquery.c) — injection used by EXISTS-rewriting: if a subquery would benefit from LIMIT 1, attach it. Bumps rewrite_limit back to 1 so the residual UNION-arm path picks it up.
mq_rewrite_query_as_derived (view_transform.c) — wrap a query as the FROM-list derived table of a synthetic outer SELECT.
mq_make_derived_spec (view_transform.c) — build the PT_SPEC node + new attribute names for a derived table.

CNF and equality reduction

pt_cnf (cnf.c) — convert a predicate list to conjunctive normal form, tagging taggable terms.
pt_do_cnf (cnf.c) — top-level CNF wrapper that walks all predicate-bearing nodes of a statement.
qo_reduce_equality_terms / qo_reduce_equality_terms_post (query_rewrite_term.c) — WHERE x = 5 AND p(x) → WHERE x = 5 AND p(5).
QO_CHECK_AND_REDUCE_EQUALITY_TERMS (query_rewrite.h) — guard macro that skips re-running reduction on the same node.
qo_rewrite_terms (query_rewrite_term.c) — per-clause range-merging, OR-to-IN, redundant-disjunct removal.

Outer-join reduction and redundant joins

qo_rewrite_select_queries (query_rewrite_select.c) — driver for the SELECT-only structural rewrites.
qo_rewrite_outerjoin (query_rewrite_select.c) — promote outer to inner when a null-rejecting predicate dominates.
qo_rewrite_innerjoin (query_rewrite_select.c) — symmetric cleanup.
qo_reduce_outer_joined_tbls (query_rewrite_select.c) — drop a uniquely-keyed outer-joined table that is not projected.
qo_reduce_joined_tbls_ref_by_fk (query_rewrite_select.c) — drop FK→PK joins when the parent’s columns are unreferenced.
qo_reduce_predicate_for_parent_spec (query_rewrite_select.c) — companion that adjusts surviving predicates after a join is reduced.

LIMIT-specific helpers

qo_push_limit_to_union (query_rewrite_set.c) — copy LIMIT onto each branch of a UNION when the top-level has no ORDER BY, no DISTINCT, and no NO_PUSH_PRED hint.
qo_check_distinct_union / qo_check_hint_union (query_rewrite_set.c) — the safety predicates for qo_push_limit_to_union.
qo_auto_parameterize_limit_clause (query_rewrite_auto_parameterize.c) — auto-parameterize the preserved LIMIT host vars.
qo_auto_parameterize_keylimit_clause (query_rewrite_auto_parameterize.c) — same for the per-index keylimit hint.
qo_auto_parameterize (query_rewrite_auto_parameterize.c) — generic literal-to-host-var pass over CNF terms.
pt_rewrite_to_auto_param (parse_tree_cl.c) — the individual literal-to-host-var transform.

Multi-range LIMIT optimization (plan-gen time)

qo_check_iscan_for_multi_range_opt (plan_generation.c) — the “can this index scan use MRO?” predicate; gates on query.orderby_for non-null (i.e., post-lowering shape).
qo_check_plan_index_for_multi_range_opt (plan_generation.c) — checks that ORDER BY columns occupy consecutive index positions, all in matching direction.
pt_check_ordby_num_for_multi_range_opt (parser_support.c) — checks the lowered ORDERBY_NUM ≤ n predicate shape and the PRM_ID_MULTI_RANGE_OPT_LIMIT upper-bound system parameter.
qo_plan_multi_range_opt (plan_generation.c) — boolean on a finished plan: was MRO actually applied?
PRM_ID_MULTI_RANGE_OPT_LIMIT (system_parameter.h) — upper bound on n for MRO eligibility.

Per-statement compile driver

db_compile_statement_local (db_vdb.c) — top-of-pipeline driver: parser → pt_class_pre_fetch → pt_compile → mq_translate → pt_class_pre_fetch again → do_prepare_statement. Where the rewrite layer fits into the broader compile flow.

Position hints as of 2026-04-30

Symbol	File	Line
`pt_query_info::flag.rewrite_limit:1`	`src/parser/parse_tree.h`	2141
`pt_delete_info::rewrite_limit:1`	`src/parser/parse_tree.h`	2850
`pt_update_info::rewrite_limit:1`	`src/parser/parse_tree.h`	2972
`PT_INST_NUM` / `PT_ORDERBY_NUM` enumerators	`src/parser/parse_tree.h`	1399
`pt_compile`	`src/parser/compile.c`	381
`pt_class_pre_fetch`	`src/parser/compile.c`	432
`pt_eval_type_pre` (PT_SELECT LIMIT arm)	`src/parser/type_checking.c`	7179
`pt_eval_type_pre` (PT_UNION LIMIT arm)	`src/parser/type_checking.c`	7138
`pt_eval_type_pre` (PT_DELETE LIMIT arm)	`src/parser/type_checking.c`	7242
`pt_eval_type_pre` (PT_UPDATE LIMIT arm)	`src/parser/type_checking.c`	7274
`pt_limit_to_numbering_expr`	`src/parser/parser_support.c`	4567
`pt_check_ordby_num_for_multi_range_opt`	`src/parser/parser_support.c`	9743
`pt_cnf`	`src/parser/cnf.c`	941
`pt_do_cnf`	`src/parser/cnf.c`	1183
`mq_translate`	`src/parser/view_transform.c`	(top)
`mq_rewrite`	`src/optimizer/rewriter/query_rewrite.c`	736
`qo_rewrite_queries`	`src/optimizer/rewriter/query_rewrite.c`	44
`qo_rewrite_queries_post`	`src/optimizer/rewriter/query_rewrite.c`	586
`qo_rewrite_queries` (PT_UNION residual LIMIT branch)	`src/optimizer/rewriter/query_rewrite.c`	173
`qo_rewrite_subqueries`	`src/optimizer/rewriter/query_rewrite_subquery.c`	40
`qo_rewrite_hidden_col_as_derived`	`src/optimizer/rewriter/query_rewrite_subquery.c`	331
`qo_add_limit_clause`	`src/optimizer/rewriter/query_rewrite_subquery.c`	473
`qo_push_limit_to_union`	`src/optimizer/rewriter/query_rewrite_set.c`	40
`qo_check_distinct_union`	`src/optimizer/rewriter/query_rewrite_set.c`	101
`qo_check_hint_union`	`src/optimizer/rewriter/query_rewrite_set.c`	132
`qo_rewrite_select_queries`	`src/optimizer/rewriter/query_rewrite_select.c`	62
`qo_rewrite_index_hints`	`src/optimizer/rewriter/query_rewrite_select.c`	151
`qo_move_on_of_explicit_join_to_where`	`src/optimizer/rewriter/query_rewrite_select.c`	516
`qo_reduce_outer_joined_tbls`	`src/optimizer/rewriter/query_rewrite_select.c`	1909
`qo_reduce_joined_tbls_ref_by_fk`	`src/optimizer/rewriter/query_rewrite_select.c`	2105
`qo_rewrite_outerjoin`	`src/optimizer/rewriter/query_rewrite_select.c`	3382
`qo_rewrite_nonnull_count_select_list`	`src/optimizer/rewriter/query_rewrite_select.c`	3777
`qo_auto_parameterize`	`src/optimizer/rewriter/query_rewrite_auto_parameterize.c`	41
`qo_auto_parameterize_limit_clause`	`src/optimizer/rewriter/query_rewrite_auto_parameterize.c`	129
`qo_auto_parameterize_keylimit_clause`	`src/optimizer/rewriter/query_rewrite_auto_parameterize.c`	293
`qo_check_iscan_for_multi_range_opt`	`src/optimizer/plan_generation.c`	4030
`qo_check_plan_index_for_multi_range_opt`	`src/optimizer/plan_generation.c`	4199
`qo_plan_multi_range_opt`	`src/optimizer/plan_generation.c`	3141
`db_compile_statement_local`	`src/compat/db_vdb.c`	531
`PRM_ID_MULTI_RANGE_OPT_LIMIT`	`src/base/system_parameter.h`	295

Cross-check Notes

The raw analysis says qo_optimize_queries runs the rewrite phases. That symbol no longer exists; the actual driver is mq_rewrite (query_rewrite.c:736), which calls qo_rewrite_queries (query_rewrite.c:44) as the parser_walk_tree pre-callback. The function name carries the original “optimize” framing in the PDF; the source has since renamed it to “rewrite” (the optimizer-proper now lives in query_planner.c and plan_generation.c).
The cheat-sheet’s qo_do_auto_parameterize is now qo_auto_parameterize. Both the WHERE-walker and the LIMIT-clause helper were renamed when the rewriter was modularised into src/optimizer/rewriter/. The behaviour is the same.
The cheat-sheet lists the parameter checks for is_parsing_static_sql / is_skip_auto_parameterize only on the LIMIT-clause helper. The current source applies them consistently — the WHERE-walking variant is gated by call_auto_parameterize in qo_rewrite_queries, which evaluates the same conditions.
The raw analysis treats LIMIT lowering as a single stage inside semantic checking. That is true for SELECT, UPDATE, and DELETE. UNION/DIFFERENCE/INTERSECTION are split: the ORDER-BY case is lowered in pt_eval_type_pre, but the no-ORDER-BY case is left for qo_rewrite_queries (which picks it up in the PT_UNION arm and additionally pushes the limit into each arm). The cheat-sheet acknowledges UNION’s “ORDER BY가 있는 경우” branch but not the residual handling.
The deck quotes the where_type() / eval_expr_type() walkers for SELECT in pt_eval_type. The LIMIT-lowering decision tree fired by the pt_eval_type_pre pre-callback runs before those walkers — i.e., LIMIT is already a numbering predicate by the time type evaluation per se starts on the surrounding WHERE.
The “Dummy LIMIT host variable” question raised in the raw analysis — whether to keep the original LIMIT after lowering — is still unresolved in the source. Both the semantic-check arms and qo_auto_parameterize_limit_clause preserve the original info.query.limit. The likely reason is parser_print_tree — operators dump the original LIMIT for diagnostics — but if no diagnostic mode reads it after auto-parameterization, the field is dead. Investigation would require auditing all readers of info.query.limit after mq_rewrite.

Open Questions

What exactly invokes mq_rewrite inside mq_translate? The raw cheat-sheet sketches mq_translate calls mq_rewrite “for SELECT”, but mq_translate has multiple per-type variants (mq_translate_select, mq_translate_update, mq_translate_delete, mq_translate_merge). Tracing each variant’s call into mq_rewrite would clarify whether DELETE and UPDATE LIMITs re-enter qo_rewrite_queries or only see semantic-check-time lowering.
PT_LAST_OPCODE as a sentinel. The GROUP BY arm passes PT_LAST_OPCODE as num_op. The helper ignores num_op when is_gby_num is true (it builds a function node, not an expr node), so the value is harmless. But is there a place where the sentinel could leak — e.g., into a copied expression — and confuse a downstream switch on op?
The “two LIMIT representations” trade-off. Why does qo_auto_parameterize_limit_clause parameterize the original info.query.limit separately from the numbering-predicate path? The lowered predicate already carries the value (auto-parameterized by the WHERE walker). Is the dual representation still needed for printing, keylimit, or some other path the raw analysis did not identify?
Interaction with CONNECT BY / hierarchical queries. The order_siblings veto in the ORDER BY arm prevents lowering into ORDERBY_NUM, leaving the LIMIT as info.query.limit. Does qo_rewrite_queries re-pick this up, or does it stay a non-lowered LIMIT through XASL generation, evaluated by a different mechanism? The cheat-sheet treats CONNECT BY as “분석 생략” (analysis skipped); the source path needs tracing.
MRO and PRM_ID_MULTI_RANGE_OPT_LIMIT defaults. The default value of the system parameter governs how large a LIMIT can be while still benefiting from MRO. The default should be checked against the system_parameter.c entry (line 3197). Workloads with LIMIT 1000 may or may not trigger MRO depending on the chosen default.
DBLINK and remote DML LIMIT. db_compile_statement_local skips mq_translate for DBLINK DML queries (PT_IS_DBLINK_DML_QUERY); the LIMIT is not lowered in that path either, because the remote server is expected to parse and execute the query itself. Does the local rewrite layer ever see DBLINK SELECTs whose LIMITs need lowering for the local optimization side? The grammar production for DBLINK SELECTs and the call into pt_rewrite_for_dblink would clarify.

Sources

Raw analyses (`raw/code-analysis/cubrid/query-processing/`)

code_analysis_Query_rewrite-LIMIT_clause.pdf — focused Korean analysis of LIMIT-clause rewriting with the per-statement-type decision tree and the select ... LIMIT 1 walked example.
code_analysis_Query_Processing_Cheat_Sheet.pdf — global parser-and-semantic-check map; locates LIMIT lowering inside pt_eval_type_pre and the auto-parameterization inside what it calls qo_optimize_queries.
_converted/codeanalysisqueryrewrite-limitclause.pdf.md — markitdown extract of the LIMIT PDF.
_converted/codeanalysisqueryprocessingcheatsheet.pdf.md — markitdown extract of the cheat-sheet PDF.

Sibling docs

knowledge/code-analysis/cubrid/cubrid-btree.md — the index scan that the lowered INST_NUM/ORDERBY_NUM predicate guards; MRO sits in the same plan-generation neighbourhood.
knowledge/code-analysis/cubrid/cubrid-mvcc.md — the visibility check the scan applies to each row that survives the lowered predicate.

Textbook chapters / papers

Database Internals (Petrov), Ch. 12 “Query Planning and Optimization” — rule-based vs cost-based split, predicate pushdown, view inlining.
Database System Concepts (Silberschatz, Korth, Sudarshan, 7th ed.), Ch. 16 “Query Optimization” — equivalence rules, the relational-algebra-level transformation catalogue.
Goetz Graefe, The Cascades Framework for Query Optimization, IEEE Data Eng. Bull. 18(3), 1995 — the formal split between transformation rules (always-improving rewrites) and implementation rules (cost-priced alternatives).
Goetz Graefe, Query Evaluation Techniques for Large Databases, ACM Computing Surveys 25(2), 1993 — historical context for the index-driven LIMIT optimizations CUBRID’s MRO inherits from.

CUBRID source (`/data/hgryoo/references/cubrid/`)

src/parser/csql_grammar.y — grammar productions that set rewrite_limit = 1 on SELECT/UNION/UPDATE/DELETE LIMIT clauses.
src/parser/parse_tree.h — the LIMIT-bearing structs and the PT_INST_NUM / PT_ORDERBY_NUM / PT_GROUPBY_NUM enumerations.
src/parser/compile.c — pt_compile entry; semantic check driver.
src/parser/type_checking.c — pt_semantic_type and pt_eval_type_pre; the LIMIT-lowering decision tree.
src/parser/parser_support.c — pt_limit_to_numbering_expr, the lowering factory; pt_check_ordby_num_for_multi_range_opt, the MRO predicate-shape check.
src/parser/cnf.c — pt_cnf, pt_do_cnf.
src/parser/view_transform.c — mq_translate (view inlining) and mq_rewrite_query_as_derived / mq_make_derived_spec (derived-table wrappers used by subquery flattening and UNION-with-LIMIT).
src/optimizer/rewriter/query_rewrite.c — mq_rewrite and qo_rewrite_queries; the rewrite-layer driver.
src/optimizer/rewriter/query_rewrite_subquery.c — qo_rewrite_subqueries, qo_rewrite_hidden_col_as_derived, qo_add_limit_clause.
src/optimizer/rewriter/query_rewrite_set.c — qo_push_limit_to_union, qo_check_distinct_union, qo_check_hint_union.
src/optimizer/rewriter/query_rewrite_select.c — qo_rewrite_select_queries, the outer-join-reduction and redundant-join helpers; qo_rewrite_index_hints.
src/optimizer/rewriter/query_rewrite_term.c — qo_reduce_equality_terms, qo_rewrite_terms.
src/optimizer/rewriter/query_rewrite_auto_parameterize.c — qo_auto_parameterize, qo_auto_parameterize_limit_clause, qo_auto_parameterize_keylimit_clause.
src/optimizer/plan_generation.c — qo_check_iscan_for_multi_range_opt and the MRO machinery that consumes the lowered ORDERBY_NUM predicate.
src/base/system_parameter.{c,h} — PRM_ID_MULTI_RANGE_OPT_LIMIT and related parameters.
src/compat/db_vdb.c — db_compile_statement_local, the per-statement compile driver that wires parser → semantic check → view inlining → rewrite → prepare.