CUBRID Runtime Memoization — Subquery Cache, Filter-Predicate Cache, and Per-Query Memoize Helpers
Contents:
- Theoretical Background
- Common DBMS Design
- CUBRID’s Approach
- Source Walkthrough
- Cross-check Notes
- Open Questions
- Sources
Theoretical Background
A query plan is not the smallest unit at which a database can avoid
redundant work. The XASL cache (cubrid-xasl-cache.md) memoises the
compile pipeline so the second EXECUTE of a PREPAREd statement skips
parse, semantic-check, optimise, and code-gen. But that cache stops at
the plan boundary: every execution still walks the same scan operators,
evaluates the same predicates row by row, and re-executes the same
uncorrelated subquery once per outer tuple unless something inside the
runtime intercepts the work and remembers the result.
That something is runtime memoization. The pattern goes back to Michie’s 1968
“Memo functions and machine learning”: rewrite f(args) → result so that
the first call computes the result and stores it in a hash keyed by
the arguments; subsequent calls return the stored result. In a query
engine the “pure functions” are not pure — they read pages and take
latches — but they are idempotent under stable state: while a
transaction holds the necessary locks and no concurrent DDL has fired,
calling the function twice with the same arguments yields the same
result. That idempotency is why a runtime cache is safe.
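To make the contract concrete, here is a minimal, generic sketch of the pattern (not CUBRID code; the names are invented for illustration):

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical expensive computation: stands in for "run the subquery",
// "deserialise the predicate", or "scan the inner side of the join".
static int64_t expensive (int64_t arg)
{
  return arg * arg;               // placeholder for real work (page reads, latches)
}

// Memoized wrapper: the first call per argument computes and stores the result;
// later calls with the same argument return the stored value. Safe only while
// the underlying state is stable (locks held, no concurrent DDL).
static int64_t memoized (int64_t arg)
{
  static std::unordered_map<int64_t, int64_t> cache;
  auto it = cache.find (arg);
  if (it != cache.end ())
    {
      return it->second;          // hit: skip recomputation
    }
  int64_t result = expensive (arg);
  cache.emplace (arg, result);    // miss: compute once, remember
  return result;
}
```

CUBRID’s three caches elaborate this shape: the argument becomes a DB_VALUE array, the map gets a byte budget, and a hit-ratio guard can switch the wrapper off.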
Three places in the per-row hot path are memoization candidates.
Uncorrelated scalar subqueries. WHERE salary > (SELECT AVG(salary) FROM emp) contains an inner block independent of outer columns. A
naive executor invokes it once per outer row; a smart one invokes it
once and caches the scalar. Bind values form the key. The optimisation
generalises to correlated subqueries when correlation columns take few
distinct values: N inner executions become K (distinct correlation
tuples seen).
Compiled filter predicates. A function-based or partial index
carries its predicate as a serialised XASL byte stream in the B-tree.
Every index operation — INSERT, UPDATE, DELETE, range scan — must
turn it into a runnable PRED_EXPR_WITH_CONTEXT. Caching the
deserialised tree per-BTID amortises across queries; pre-compiled
regex (RLIKE) is reused for free.
Inner tuples of nested-loop joins. For A NLJ B ON A.x = B.x,
naive execution scans B once per outer tuple. If A.x has few
distinct values, the inner scan repeats redundantly. Caching the inner
tuple set keyed by the join-column value collapses redundant scans, at
the cost of materialising tuples streaming would have avoided.
Vectorised engines (DuckDB, Velox) get this for free through batched hash
builds; row-stores need an explicit memoization pass.
All three are instances of the same pattern: a hash keyed by a
DB_VALUE array, populated lazily, bounded by memory cap, with a
hit-ratio guard. The differences are about what gets memoised,
what the key is, and who owns the lifetime. fpcache is
server-wide and cross-query; sq_cache and memoize::storage are
per-XASL and per-execution. Only the cross-query cache needs DDL
invalidation — the per-XASL caches are implicitly invalidated by the
XASL node’s destruction.
Common DBMS Design
Every engine picks the same three knobs: unit of memoization, key, staleness.
Subquery caching. Postgres distinguishes InitPlan (uncorrelated,
once-before) from SubPlan (correlated, inline); SubPlan’s
useHashTable flag enables a work_mem hash keyed by correlation
parameters. Oracle’s scalar subquery cache wraps every SELECT (...)
scalar in a 256-entry LRU hash. SQL Server transforms scalar subqueries
into joins at optimise time. MySQL materialises uncorrelated subqueries;
correlated ones get no hashing. CUBRID’s subquery_cache sits closest
to Oracle: per-call hash keyed by free constants, byte-bounded,
fail-on-full, hit-ratio guard.
Filter-predicate caching. Postgres caches deserialised plan trees
in the per-backend PlanCache; function-index predicate state is
invalidated through RelationCacheInvalidateEntry. Oracle hides this
in the cursor / library cache. CUBRID’s fpcache is more granular —
function-based / partial indexes carry their predicate in the B-tree,
and fpcache keeps a per-BTID pool of deserialised expressions.
Generic memoization. PostgreSQL ≥ 14’s Memoize plan node
(nodeMemoize.c) caches inner output of parametrised NLJs keyed by
the join key, work_mem-bounded, LRU-evicted, optimiser-chosen.
CUBRID’s memoize::storage is the analogue but a runtime helper, not
a plan node: attached at execute time when shape satisfies the static
memoize::possible_check; runtime fall-back to recompute on
exhaustion or poor hit ratio.
Theory ↔ CUBRID mapping
| Theoretical concept | CUBRID symbol |
|---|---|
| Memoize-by-arguments | All three: sq_cache, fpcache, memoize::storage |
| Per-execution scalar-subquery cache | SQ_CACHE on XASL_NODE::sq_cache |
| Subquery key / value | SQ_KEY { DB_VALUE **dbv_array } / SQ_VAL { SQ_REGU_VALUE val } |
| Subquery hit-ratio guard | SQ_CACHE_MIN_HIT_RATIO (=9, fired at 60 % capacity) |
| Cross-query filter-predicate cache | fpcache_Hashmap (lockfree, BTID-keyed) |
| Filter-pred entry / clone pool | FPCACHE_ENTRY / PRED_EXPR_WITH_CONTEXT **clone_stack |
| Filter-pred eviction / invalidation | fpcache_cleanup (LRU heap) / fpcache_remove_by_class |
| NLJ inner-side memoization | memoize::storage |
| Memoize key / value / bucket | memoize::key / value / std::unordered_multimap<key*, value*, …> |
| Memoize key extractor | key_maker<TARGET_CLASS> / key_maker<TARGET_LIST> |
| Memoize applicability / hit-ratio guard | possible_check / MEMOIZE_HIT_RATIO_THRESHOLD = 0.5 after 1000 accesses |
| Server params (sq / fp / memoize) | PRM_ID_MAX_SUBQUERY_CACHE_SIZE / PRM_ID_FILTER_PRED_MAX_CACHE_* / PRM_ID_MEMOIZE_MEMORY_LIMIT |
CUBRID’s Approach
The three caches form a layered defence against redundant runtime work,
sharing one playbook (DB_VALUE-array hash key, fail-on-full memory
budget, hit-ratio guard) at three lifecycle scopes — per-call,
per-query, server-wide.
Where the caches sit in the query lifecycle
flowchart TB
subgraph PARSE["1) Parse + plan (client)"]
P2["xasl_generation.c<br/>XASL_SET_FLAG (xasl, XASL_USES_SQ_CACHE)"]
end
subgraph PREP["2) Prepare (server)"]
Q1["xqmgr_prepare_query + xcache_insert"]
end
subgraph EXEC["3) Execute (server)"]
E2["qexec_execute_scan<br/>(per-block driver)"]
M1[("memoize::storage<br/>per-XASL multi-tuple<br/>NLJ correlation key")]
SQ1[("SQ_CACHE<br/>per-XASL scalar<br/>free-constant key")]
EVAL["eval_pred<br/>fetch_peek_dbval"]
INDEX["btree_range_search<br/>· filter pred"]
FP1[("fpcache_Hashmap<br/>server-wide<br/>BTID key")]
end
subgraph DDL["4) DDL"]
INV["locator_sr.c<br/>fpcache_remove_by_class"]
end
P2 --> Q1 --> E2
E2 --> M1
E2 --> EVAL --> SQ1
E2 --> INDEX --> FP1
FP1 --> EVAL
INV --> FP1
Three boundaries.
(eval ↔ sq_cache) In eval_pred and fetch_peek_dbval, when the
executor reaches a TYPE_CONSTANT regu variable pointing at an XASL
subquery (regu_var->xasl != NULL) flagged XASL_USES_SQ_CACHE, the
evaluator probes sq_cache first; on miss it executes the subquery,
then puts the scalar result. Same pattern for R_EXISTS predicates.
(scan ↔ fpcache) In locator_eval_filter_predicate, before
evaluating a function-based or partial index predicate, the locator
calls fpcache_claim to obtain a deserialised
PRED_EXPR_WITH_CONTEXT — a clone if available, fresh deserialisation
otherwise. After evaluation, fpcache_retire returns the clone (or
frees it if the stack is full).
(scan ↔ memoize) In qexec_execute_scan, when the inner XASL of
a nested-loop join carries a memoize::storage, the driver calls
memoize_get to probe for cached tuples for the current join key. On
hit, cached tuples stream into the next scan function and the inner
scan is skipped. On miss, the driver runs the normal scan and calls
memoize_put per qualified tuple plus memoize_put_nullptr at scan
end.
Subquery cache — SQ_CACHE on the XASL node
The cache lives on the XASL node it caches:
```c
// SQ_CACHE — src/query/subquery_cache.h
struct sq_cache
{
  SQ_KEY *sq_key_struct;                /* prebuilt key shape */
#if defined (SERVER_MODE) || defined (SA_MODE)
  MHT_TABLE *ht;
  UINT64 size_max;                      /* size budget (bytes) */
  UINT64 size;
  bool enabled;                         /* hit-ratio guard */
  struct { int hit; int miss; } stats;
#endif
};

struct sq_key { DB_VALUE **dbv_array; int n_elements; };
struct sq_val { SQ_REGU_VALUE val; REGU_DATATYPE type; };
```

The key is the free-constant array. At XASL-generation, when the
planner identifies a subquery whose only varying inputs are binds or
correlation columns, it sets XASL_USES_SQ_CACHE and stores an
SQ_KEY shape in sq_key_struct — an array of pointers into the
live regu variables. At execute, sq_make_key walks that array,
deep-copies each DB_VALUE via db_value_copy, and returns a fresh
SQ_KEY.
The hash function rotates and XORs mht_get_hash_number of each
key element. The 13-bit rotation prevents collisions when equal
elements are swapped — the same idiom used by memoize::key::hash:
```c
// sq_hash_func — src/query/subquery_cache.c
for (int i = 0; i < k->n_elements; i++)
  {
    h = ROTL32 (h, 13);
    h ^= mht_get_hash_number (ht_size, k->dbv_array[i]);
  }
```

The value is one of two shapes. TYPE_CONSTANT (scalar subquery):
deep-copied DB_VALUE. TYPE_LIST_ID (EXISTS): just a boolean —
whether the inner result has any tuple. The boolean suffices because
EXISTS only asks if the set is non-empty:
```c
// sq_make_val — src/query/subquery_cache.c (condensed)
case TYPE_CONSTANT:
  ret->val.dbvalptr = db_value_copy (val->value.dbvalptr);
  break;
case TYPE_LIST_ID:
  ret->val.exists = (val->value.srlist_id->list_id->tuple_cnt > 0);
  break;
```

Memory budget and hit-ratio guard. Sized in bytes, not entries.
PRM_ID_MAX_SUBQUERY_CACHE_SIZE is the hard cap; sq_put tracks
bytes used, refusing oversized inserts and disabling the cache. The
hit-ratio guard in sq_get fires once occupancy exceeds 60 %:
```c
// sq_get — src/query/subquery_cache.c (condensed)
if ((double) SQ_CACHE_SIZE (xasl) > (double) SQ_CACHE_SIZE_MAX (xasl) * 0.6)
  if (SQ_CACHE_HIT (xasl) / SQ_CACHE_MISS (xasl) < SQ_CACHE_MIN_HIT_RATIO)
    {
      SQ_CACHE_ENABLED (xasl) = false;
      return false;
    }
```

Integer division floors hit/miss; the header comment reads “9, it means 90 %”, but the actual cutoff is biased low.
Wired into row evaluation in two sites, both following the same
probe-or-execute pattern — query_evaluator.c::eval_pred for
R_EXISTS, and fetch.c::fetch_peek_dbval for TYPE_CONSTANT:
```c
// fetch_peek_dbval — src/query/fetch.c (condensed)
if (xasl && XASL_IS_FLAGED (xasl, XASL_USES_SQ_CACHE)
    && !(SQ_CACHE_HT (xasl) && !SQ_CACHE_ENABLED (xasl)))
  {
    if (!sq_get (thread_p, SQ_CACHE_KEY_STRUCT (xasl), xasl, regu_var))
      {
        EXECUTE_REGU_VARIABLE_XASL (thread_p, regu_var, vd);
        if ((key = sq_make_key (thread_p, xasl)) == NULL)
          {
            XASL_CLEAR_FLAG (xasl, XASL_USES_SQ_CACHE);
            goto exit_on_error;
          }
        if (sq_put (thread_p, key, xasl, regu_var) == ER_FAILED)
          {
            sq_free_key (thread_p, key);
          }
      }
  }
```

The disable check !(SQ_CACHE_HT && !SQ_CACHE_ENABLED) means: either
uninitialised (try lazily) or enabled (use it); only “initialised but
disabled” skips the cache.
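As a standalone illustration of the rotation idiom (not CUBRID code; std::hash stands in for mht_get_hash_number), plain XOR of the element hashes collides when two key elements are swapped, while rotate-then-XOR keeps the order visible:

```cpp
#include <cstdint>
#include <cstdio>
#include <functional>

// Rotate-left by 13 bits, the idiom sq_hash_func and memoize::key::hash
// apply before XOR-ing in the next element hash.
static uint32_t rotl32 (uint32_t x, int r)
{
  return (x << r) | (x >> (32 - r));
}

static uint32_t combine_plain (uint32_t a, uint32_t b)
{
  return a ^ b;                         // order-insensitive: (a, b) and (b, a) collide
}

static uint32_t combine_rot_xor (uint32_t a, uint32_t b)
{
  uint32_t h = 0;
  h = rotl32 (h, 13); h ^= a;
  h = rotl32 (h, 13); h ^= b;           // earlier elements end up rotated, later ones not
  return h;
}

int main ()
{
  uint32_t ha = (uint32_t) std::hash<int> {} (7);
  uint32_t hb = (uint32_t) std::hash<int> {} (42);
  std::printf ("plain xor:    %u vs %u\n", combine_plain (ha, hb), combine_plain (hb, ha));
  std::printf ("rotate + xor: %u vs %u\n", combine_rot_xor (ha, hb), combine_rot_xor (hb, ha));
  return 0;
}
```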
Filter-predicate cache — fpcache_Hashmap
The only server-wide cache. Function-based and partial indexes carry
their predicate in the B-tree definition, and every index operation
needs it as a runnable tree. fpcache caches deserialised trees by
BTID.
```cpp
// fpcache_ent — src/query/filter_pred_cache.c
struct fpcache_ent
{
  BTID btid;

  /* Latch-free stuff. */
  FPCACHE_ENTRY *stack, *next;
  pthread_mutex_t mutex;
  UINT64 del_id;

  OID class_oid;
  struct timeval time_last_used;              /* LRU timestamp */

  PRED_EXPR_WITH_CONTEXT **clone_stack;       /* deserialised tree pool */
  INT32 clone_stack_head;
};
```

Two levels of memoization. Outer: the hashmap, BTID-keyed. Inner:
the clone stack — each entry holds a bounded pool of deserialised
PRED_EXPR_WITH_CONTEXT trees (sized by
PRM_ID_FILTER_PRED_MAX_CACHE_CLONES). Concurrent readers claim a
clone on entry and retire on exit; empty stack on claim triggers
fresh deserialisation; full stack on retire frees the clone. The
pattern solves “deserialised structure cannot be shared between
threads” — each thread gets its own private copy without
re-deserialising. Statistics distinguish four outcomes: hashmap
hit/miss, clone hit/miss.
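A condensed sketch of the clone-stack idea, assuming simplified types (the real fpcache_ent also carries the lock-free chain fields, the class OID, timestamps, and statistics):

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Stand-in for PRED_EXPR_WITH_CONTEXT: expensive to build, unsafe to share.
struct pred_clone { /* deserialised predicate tree */ };

static pred_clone *deserialize_clone () { return new pred_clone (); }
static void free_clone (pred_clone *c)  { delete c; }

// Per-BTID pool: a thread claims a private clone, uses it, and retires it,
// so no thread shares a tree and no thread re-deserialises on every row.
struct clone_pool
{
  std::mutex mutex;
  std::vector<pred_clone *> stack;
  std::size_t max_clones = 8;               // PRM_ID_FILTER_PRED_MAX_CACHE_CLONES analogue

  pred_clone *claim ()
  {
    {
      std::lock_guard<std::mutex> guard (mutex);
      if (!stack.empty ())
        {
          pred_clone *c = stack.back ();    // clone hit
          stack.pop_back ();
          return c;
        }
    }
    return deserialize_clone ();            // clone miss: build a fresh copy
  }

  void retire (pred_clone *c)
  {
    bool kept;
    {
      std::lock_guard<std::mutex> guard (mutex);
      kept = stack.size () < max_clones;
      if (kept)
        {
          stack.push_back (c);              // keep it for the next claimer
        }
    }
    if (!kept)
      {
        free_clone (c);                     // stack full: drop the clone
      }
  }
};
```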
Concurrency. cubthread::lockfree_hashmap<BTID, fpcache_ent> —
latch-free open-chained with deferred deletion (del_id). Readers do
not block writers; writers serialise through fpcache_entry->mutex
when modifying the clone stack.
Claim path.
```cpp
// fpcache_claim — src/query/filter_pred_cache.c (condensed)
fpcache_entry = fpcache_Hashmap.find (thread_p, *btid);
if (fpcache_entry != NULL)
  {
    ATOMIC_INC_64 (&fpcache_Stat_hit, 1);
    if (fpcache_entry->clone_stack_head >= 0)
      {
        *filter_pred = fpcache_entry->clone_stack[fpcache_entry->clone_stack_head--];
        ATOMIC_INC_64 (&fpcache_Stat_clone_hit, 1);
      }
    pthread_mutex_unlock (&fpcache_entry->mutex);
  }

if (*filter_pred == NULL)
  {
    HL_HEAPID old_heap = db_change_private_heap (thread_p, 0);
    stx_map_stream_to_filter_pred (thread_p, filter_pred,
                                   or_pred->pred_stream, or_pred->pred_stream_size);
    db_change_private_heap (thread_p, old_heap);
  }
```

Key detail: deserialisation runs on the global heap
(db_change_private_heap (thread_p, 0)), not the per-thread heap.
Cached clones outlive any single transaction; a private-heap
allocation would be reaped at transaction end, leaving dangling
pointers.
Retire mirrors claim: find-or-insert, push clone if room else
free. A fresh-entry insert triggers cleanup when
fpcache_Entry_counter >= fpcache_Soft_capacity.
Eviction. fpcache_cleanup is LRU-by-binary-heap: walk all
entries, sort the oldest 20 % (FPCACHE_CLEANUP_RATIO = 0.2) by
time_last_used, erase. fpcache_Cleanup_flag serialises one cleaner
at a time. The cap is soft — briefly exceeded during cleanup.
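A sketch of selecting the eviction victims with a bounded heap, under invented stand-in types; k would be FPCACHE_CLEANUP_RATIO times the current entry count:

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Stand-in for a hashmap entry reference plus its time_last_used.tv_sec.
struct entry_ref
{
  int id;
  int64_t last_used;
};

// Max-heap on last_used: top() is the newest of the candidates kept so far.
struct newer_first
{
  bool operator() (const entry_ref &a, const entry_ref &b) const
  {
    return a.last_used < b.last_used;
  }
};

// Return the k oldest entries without sorting the whole population.
static std::vector<entry_ref> oldest_k (const std::vector<entry_ref> &all, std::size_t k)
{
  std::priority_queue<entry_ref, std::vector<entry_ref>, newer_first> heap;
  for (const entry_ref &e : all)
    {
      heap.push (e);
      if (heap.size () > k)
        {
          heap.pop ();          // evict the newest candidate, keep the k oldest
        }
    }
  std::vector<entry_ref> victims;
  while (!heap.empty ())
    {
      victims.push_back (heap.top ());
      heap.pop ();
    }
  return victims;               // these would then be erased from the hashmap
}
```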
Invalidation. fpcache_remove_by_class is called from
locator_sr.c:5679 (class drop), locator_sr.c:6302 (class alter),
log_manager.c:5105 (DDL recovery). Two-phase walk: collect BTIDs
(cannot delete during iteration), then erase.
FPCACHE_DELETE_BTIDS_SIZE = 1024 bounds per-pass collection; the
outer loop restarts iteration if the buffer fills.
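The same collect-then-erase pattern in miniature, using an ordinary std::unordered_map as a stand-in for the lock-free hashmap (names invented):

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// int stands in for BTID; 'owner' for the class OID being dropped or altered.
struct cached_pred { int owner; /* clone stack, timestamps, ... */ };

static void remove_by_owner (std::unordered_map<int, cached_pred> &map, int owner)
{
  constexpr std::size_t BATCH = 1024;            // FPCACHE_DELETE_BTIDS_SIZE analogue
  bool restart = true;

  while (restart)
    {
      restart = false;
      std::vector<int> victims;
      victims.reserve (BATCH);

      // Phase 1: iterate and only collect keys; erasing mid-iteration would
      // invalidate the walk (in CUBRID, the lock-free iterator).
      for (const auto &kv : map)
        {
          if (kv.second.owner == owner)
            {
              victims.push_back (kv.first);
              if (victims.size () == BATCH)
                {
                  restart = true;                // buffer full: erase, then rescan
                  break;
                }
            }
        }

      // Phase 2: erase the collected keys.
      for (int key : victims)
        {
          map.erase (key);
        }
    }
}
```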
Memoize — memoize::storage per XASL
The newest of the three (29 KB of code, versus 15 KB and 23 KB for the other two). Memoises the inner side of nested-loop joins: for each distinct correlation-key value seen by the outer scan, the inner scan’s tuples are cached so subsequent same-key outer rows read from the cache.
```cpp
// storage — src/query/memoize.hpp (condensed)
class storage
{
  public:
    static storage *new_storage (THREAD_ENTRY *, size_t max_storage_size, xasl_node *);
    result_code get ();
    result_code put ();
    result_code put_nullptr ();
    void set_key_changed ();

    size_t hit;
    size_t miss;

  private:
    std::unordered_multimap<key *, value *, key::hash, key::equal> m_key_value_map;
    fixed_allocator<key> m_key_fixed_allocator;
    /* ... */
};
```

Multi-tuple values. Unlike sq_cache (one scalar per key) and
fpcache (one tree per key plus clone pool), memoize uses an
unordered_multimap: each key maps to zero or more values (column
lists for inner tuples). The empty case (no inner match) is encoded
as a nullptr value inserted by put_nullptr at scan end.
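A simplified sketch of that encoding (invented types; the real values are memoize::value column lists, and the probe runs inside storage::get):

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// One join-key value can map to several cached inner tuples; a nullptr entry
// records "scanned, produced nothing" so the empty case is also a cache hit.
using tuple = std::vector<std::string>;
static std::unordered_multimap<int, tuple *> cache;

static void remember_tuple (int key, const tuple &t)
{
  cache.emplace (key, new tuple (t));    // one entry per qualified inner tuple
}

static void remember_empty (int key)
{
  cache.emplace (key, nullptr);          // put_nullptr analogue
}

enum class probe { MISS, EMPTY, TUPLES };

// Distinguishes "never scanned" (miss) from "scanned, no match" (hit, ended).
static probe lookup (int key)
{
  auto range = cache.equal_range (key);
  if (range.first == range.second)
    {
      return probe::MISS;                // inner scan must run
    }
  if (range.first->second == nullptr)
    {
      return probe::EMPTY;               // hit: inner side known to be empty
    }
  return probe::TUPLES;                  // hit: stream the cached tuples
}
```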
Two-step interaction with the scan loop. qexec_execute_scan
probes the cache before running the inner scan; on hit it streams
cached values into the next scan function. On miss it runs the normal
scan, calls put per qualified tuple, and put_nullptr at end.
The state machine of storage::get has three branches keyed on
(key_changed, has_range):
- key_changed = true (outer row advanced via set_key_changed): build a new key, look it up in m_key_value_map. No entries → miss++, return NOT_FOUND (inner scan must run); entries found → hit++, dequeue the first value (nullptr → ENDED, else SUCCESS).
- key_changed = false, has_range = true: pop the next cached tuple from m_current_value_list; empty → has_range = false, ENDED; else SUCCESS.
- key_changed = false, has_range = false: NOT_FOUND.
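A condensed restatement of those branches as code, under simplified, invented types (the real get() also copies the dequeued value into the scan's output variables):

```cpp
#include <cstddef>
#include <deque>
#include <utility>

enum class result_code { SUCCESS, ENDED, NOT_FOUND };

struct get_sketch
{
  bool key_changed = true;               // set by set_key_changed per outer row
  bool has_range = false;                // cached tuples remain for the current key
  std::deque<const void *> values;       // stand-in for m_current_value_list
  std::size_t hit = 0, miss = 0;

  // 'found' stands in for building the key and probing m_key_value_map.
  result_code get (std::deque<const void *> found)
  {
    if (key_changed)                      // branch 1: the outer row advanced
      {
        key_changed = false;
        values = std::move (found);
        if (values.empty ())
          {
            miss++;
            return result_code::NOT_FOUND;        // inner scan must run
          }
        hit++;
        has_range = true;
        const void *v = values.front ();
        values.pop_front ();
        return v == nullptr ? result_code::ENDED : result_code::SUCCESS;
      }
    if (has_range)                        // branch 2: keep streaming cached tuples
      {
        if (values.empty ())
          {
            has_range = false;
            return result_code::ENDED;
          }
        values.pop_front ();
        return result_code::SUCCESS;
      }
    return result_code::NOT_FOUND;        // branch 3: nothing cached, nothing pending
  }
};
```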
Key extraction is structural. memoize::key_maker<TARGET_TYPE>
walks the inner XASL’s predicate tree (where_key, where_pred,
where_range, if_pred, after_join_pred, plus dptr_list) and
collects all TYPE_CONSTANT regu variables — the correlation columns
from the outer scan. Then it removes any TYPE_CONSTANT whose
dbvalptr matches a spec regu list pointer
(cls_regu_list_pred/rest/range/key for TARGET_CLASS, the
list-node equivalents for TARGET_LIST): those are the inner scan’s
own bindings, not correlation columns. The remainder becomes the
cache key. The if constexpr branch chooses the right union field
per spec type; the logic is otherwise identical.
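A sketch of the collect-then-subtract step under simplified types (the names are invented; the real walker visits the predicate trees listed above and compares dbvalptr pointers):

```cpp
#include <unordered_set>
#include <vector>

struct DB_VALUE;                 // opaque for the sketch

// constants_in_predicates: every TYPE_CONSTANT dbvalptr found in where_key,
// where_pred, where_range, if_pred, after_join_pred, dptr_list.
// spec_own_regus: the dbvalptrs of the spec's own regu lists
// (cls_regu_list_pred/rest/range/key or the list-node equivalents).
static std::vector<DB_VALUE *>
correlation_key (const std::vector<DB_VALUE *> &constants_in_predicates,
                 const std::vector<DB_VALUE *> &spec_own_regus)
{
  std::unordered_set<DB_VALUE *> own (spec_own_regus.begin (), spec_own_regus.end ());

  std::vector<DB_VALUE *> key;
  for (DB_VALUE *dbv : constants_in_predicates)
    {
      if (own.count (dbv) == 0)
        {
          key.push_back (dbv);   // not the scan's own binding: a correlation column
        }
    }
  return key;                    // becomes the memoize key for this inner scan
}
```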
Applicability is gated by memoize::possible_check, which walks
the XASL node and rejects unsafe shapes: null spec/val_list,
spec->next non-null (multi-spec), bptr_list/fptr_list non-null,
unsupported spec->type (CLASS_ATTR, SET, JSON_TABLE, METHOD,
REGUVAL_LIST, SHOWSTMT, DBLINK), system or MVCC-disabled class,
spec->access not SEQUENTIAL/INDEX, set/multiset/sequence
domains in val_list, dptr_list member missing
XASL_LINK_TO_REGU_VARIABLE. The recursive level parameter gates
aptr_list recursion at ≥ 1 (level-0 aptr_list carries
optimiser-hoisted uncorrelated subqueries, irrelevant to inner-scan
memoizability).
Hit-ratio guard. Same shape as sq_cache. After 1000 accesses
(MEMOIZE_FREE_ITERATION_LIMIT), if hit ratio < 0.5
(MEMOIZE_HIT_RATIO_THRESHOLD), the storage disables itself:
```cpp
// storage::put — src/query/memoize.cpp (condensed)
if (disabled || get_current_size () >= m_max_storage_size)
  {
    disabled = true;
    return result_code::FULL;
  }
if (hit + miss > MEMOIZE_FREE_ITERATION_LIMIT)
  {
    if (((double) hit) / (hit + miss) < MEMOIZE_HIT_RATIO_THRESHOLD)
      {
        disabled = true;
        return result_code::FULL;
      }
  }
```

The threshold is more permissive than sq_cache’s 90 % because a
50 % hit on a 100-tuple inner scan saves substantial CPU/IO.
Allocation pools. Keys from m_key_fixed_allocator (slab
allocator, recycles fixed-size blocks). Values from malloc. Two
allocators because keys are numerous (one per outer row) and benefit
from slab recycling, while values vary in size.
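A minimal free-list sketch of the fixed-size recycling idea (illustrative only, not the actual fixed_allocator):

```cpp
#include <cstdlib>
#include <vector>

// Every allocation has the same size, so released blocks can be handed back
// out without another malloc. Outstanding blocks must be released before the
// pool is destroyed in this simplified version.
template <typename T>
class fixed_pool
{
  public:
    void *allocate ()
    {
      if (!m_free.empty ())
        {
          void *p = m_free.back ();     // recycle a block released earlier
          m_free.pop_back ();
          return p;
        }
      return std::malloc (sizeof (T));  // slow path: grow the pool
    }

    void release (void *p)
    {
      m_free.push_back (p);             // kept for reuse, not returned to malloc
    }

    ~fixed_pool ()
    {
      for (void *p : m_free)
        {
          std::free (p);
        }
    }

  private:
    std::vector<void *> m_free;
};
```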
Lifecycle wiring. In qexec_execute_mainblock near line 15770,
when spec_level == 0 && level >= 1 && !mvcc_select_lock_needed, the
executor calls new_memoize_storage on the inner XASL.
qexec_execute_scan enters qexec_execute_nljoin_with_memoize
whenever xasl->memoize_storage is non-null; the cached fast-path
returns directly on hit, otherwise the normal scan loop runs and
calls memoize_put per qualified tuple plus memoize_put_nullptr
at scan end. The storage is destroyed by clear_memoize_storage on
the next install (iteration reset) or by qexec_clear_xasl at
teardown.
Comparing the three caches
| Axis | sq_cache | fpcache | memoize::storage |
|---|---|---|---|
| Scope | Per-XASL | Server-wide | Per-XASL |
| Key | DB_VALUE array (free const) | BTID | DB_VALUE array (correlation cols) |
| Value | DB_VALUE or boolean | PRED_EXPR_WITH_CONTEXT clone pool | std::vector<DB_VALUE> (multi-tuple) |
| Hash table | MHT_TABLE (private) | cubthread::lockfree_hashmap | std::unordered_multimap |
| Concurrency | Single-thread | Latch-free + per-entry mutex | Single-thread |
| Eviction | Fail-on-full | LRU via binary heap | Fail-on-full |
| Invalidation | XASL teardown | fpcache_remove_by_class | XASL teardown |
| Hit-ratio guard | 90 % at 60 % capacity | None (cleanup handles staleness) | 50 % after 1000 accesses |
| Size budget | PRM_ID_MAX_SUBQUERY_CACHE_SIZE | PRM_ID_FILTER_PRED_MAX_CACHE_* | PRM_ID_MEMOIZE_MEMORY_LIMIT |
| Use case | Uncorrelated scalar subquery | Function/partial index predicate | NLJ inner side |
| Plan-time flag | XASL_USES_SQ_CACHE | None (always-on if enabled) | None (runtime possible_check) |
| Init | Lazy (first put/get) | Boot (fpcache_initialize) | Lazy (per-execute) |
All three derive their key from a DB_VALUE array and use a
rotation-XOR hash; all three treat memory as the governing budget;
all three include a self-disabling guard. The differences come from
what is memoised (scalar vs tree pool vs tuple stream) and where
the cache sits in the lifecycle.
Source Walkthrough
Symbols grouped by file and call-flow; position table at the end.
subquery_cache.c
Key/value lifecycle: sq_make_key, sq_make_val, sq_free_key,
sq_free_val, sq_unpack_val. Hash policy: sq_hash_func,
sq_cmp_func, sq_rem_func. Lifecycle: sq_cache_initialize (lazy
table creation), sq_cache_destroy, sq_put (bytes-budget guard,
mht_put_if_not_exists), sq_get (60 %-capacity hit-ratio guard).
Macros: SQ_CACHE_MIN_HIT_RATIO, SQ_CACHE_EXPECTED_ENTRY_SIZE, and
the SQ_CACHE_*(xasl) accessors for hash table, enabled flag, key
shape, hit/miss counters, size, and size cap.
filter_pred_cache.c
Static globals: fpcache_Enabled, fpcache_Soft_capacity,
fpcache_Hashmap, fpcache_Entry_counter, fpcache_Clone_counter,
fpcache_Clone_stack_size, fpcache_Cleanup_flag, fpcache_Cleanup_bh,
the fpcache_Stat_* counter family.
Latch-free entry descriptor callbacks: fpcache_entry_alloc,
fpcache_entry_free, fpcache_entry_init, fpcache_entry_uninit,
fpcache_copy_key, all wired into fpcache_Entry_descriptor with
LF_EM_USING_MUTEX.
Public API: fpcache_initialize, fpcache_finalize, fpcache_claim,
fpcache_retire, fpcache_remove_by_class, fpcache_drop_all,
fpcache_dump. Internal: fpcache_cleanup,
fpcache_compare_cleanup_candidates (orders by tv_sec only).
Constants: FPCACHE_CLEANUP_RATIO = 0.2,
FPCACHE_DELETE_BTIDS_SIZE = 1024.
The only external claim/retire consumer is
locator_eval_filter_predicate in locator_sr.c;
fpcache_remove_by_class is fired by locator_drop_class,
locator_check_btid, and the DDL log replay in log_manager.c.
memoize.cpp
Namespace constants: MEMOIZE_FREE_ITERATION_LIMIT = 1000,
MEMOIZE_HIT_RATIO_THRESHOLD = 0.5, hash_entry_sz.
Types: result_code enum (SUCCESS/ENDED/NOT_FOUND/FULL/ERROR), key
(vector of DB_VALUE + hash + equal), value (vector of DB_VALUE),
storage class. Static singletons: checker (possible_check),
cls_key_maker, list_key_maker (key_maker<TARGET_TYPE>
instantiations).
key_maker<TARGET_TYPE> is a function-object template with overloads
for every visitable XASL/predicate node — top-level
(thread_p, xasl, key_ptr_src), recursive subquery walker,
pred_expr*, pred*, eval_term, comp_eval_term, alsm_eval_term,
like_eval_term, rlike_eval_term, regu_variable_node*,
ARITH_TYPE*, REGU_VARIABLE_LIST, REGU_VALUE_LIST*,
function_node*, sp_node*. It collects TYPE_CONSTANT regu
pointers, subtracts spec own-regus, returns the remainder count.
storage member functions: new_storage (factory), constructor,
destructor, init, get, put, put_nullptr, get_key,
get_value, set_value, set_key_changed, start_timer/stop_timer
(TSC, only when thread_p->on_trace), get_current_size. C-bridge
functions: new_memoize_storage, clear_memoize_storage,
memoize_get, memoize_put, memoize_put_nullptr — translating
result_code to (success, is_ended) booleans and clearing storage
on FULL.
Cross-references
query_executor.c: qexec_execute_nljoin_with_memoize (8164),
qexec_execute_scan (8299), qexec_execute_mainblock (15770 install
site), qexec_clear_xasl (teardown sites at ~2335, 2396, 2914).
query_evaluator.c: eval_pred R_EXISTS branch (1849).
fetch.c: fetch_peek_dbval TYPE_CONSTANT branch (4002).
locator_sr.c: locator_eval_filter_predicate (~8170);
fpcache_remove_by_class callers (5679, 6302). log_manager.c:
DDL recovery call (5105). xasl_generation.c: XASL_USES_SQ_CACHE
set sites (1681, 6996, 14010).
Position hints as of this revision
| Symbol | File | Line |
|---|---|---|
SQ_KEY, SQ_VAL, SQ_CACHE | subquery_cache.h | 51-95 |
SQ_CACHE_MIN_HIT_RATIO | subquery_cache.h | 40 |
sq_make_key, sq_make_val | subquery_cache.c | 67, 103 |
sq_free_key, sq_free_val | subquery_cache.c | 143, 162 |
sq_unpack_val, sq_hash_func | subquery_cache.c | 191, 227 |
sq_cmp_func, sq_rem_func | subquery_cache.c | 251, 283 |
sq_cache_initialize | subquery_cache.c | 301 |
sq_put, sq_get | subquery_cache.c | 337, 407 |
sq_cache_destroy | subquery_cache.c | 461 |
FPCACHE_ENTRY (struct fpcache_ent) | filter_pred_cache.c | 41 |
fpcache_Hashmap, fpcache_Entry_descriptor | filter_pred_cache.c | 74, 118 |
fpcache_initialize, fpcache_finalize | filter_pred_cache.c | 146, 209 |
fpcache_entry_alloc/free/init/uninit | filter_pred_cache.c | 250-335 |
fpcache_claim, fpcache_retire | filter_pred_cache.c | 362, 430 |
fpcache_remove_by_class, fpcache_dump | filter_pred_cache.c | 506, 594 |
fpcache_cleanup | filter_pred_cache.c | 645 |
fpcache_compare_cleanup_candidates | filter_pred_cache.c | 733 |
fpcache_drop_all | filter_pred_cache.c | 760 |
MEMOIZE_FREE_ITERATION_LIMIT/THRESHOLD | memoize.hpp | 32-33 |
memoize::key / value / storage | memoize.hpp | 47, 67, 81 |
memoize::possible_check | memoize.cpp | 39 |
memoize::key_maker<TARGET_TYPE> | memoize.cpp | 163 |
memoize::key::hash/equal | memoize.cpp | 668, 679 |
memoize::storage::new_storage/dtor | memoize.cpp | 693, 765 |
memoize::storage::get/put/put_nullptr | memoize.cpp | 813, 893, 937 |
memoize::storage::get_key/get_value | memoize.cpp | 981, 1003 |
C bridge: new_memoize_storage/clear_*/memoize_* | memoize.cpp | 1045-1186 |
XASL_USES_SQ_CACHE, sq_cache, memoize_storage | xasl.h | 518, 1167, 1194 |
eval_pred (sq_cache wiring) | query_evaluator.c | 1849 |
fetch_peek_dbval (sq_cache wiring) | fetch.c | 4002 |
qexec_execute_nljoin_with_memoize | query_executor.c | 8164 |
qexec_execute_scan (memoize hooks) | query_executor.c | 8299 |
new_memoize_storage install site | query_executor.c | 15770 |
locator_eval_filter_predicate | locator_sr.c | ~8170 |
fpcache_remove_by_class callers | locator_sr.c / log_manager.c | 5679, 6302 / 5105 |
Cross-check Notes
- Subquery-cache integer truncation. The guard SQ_CACHE_HIT / SQ_CACHE_MISS < SQ_CACHE_MIN_HIT_RATIO divides integers; if MISS == 0 it divides by zero. The guard runs only after 60 % capacity, so practically some miss has occurred — but a perfectly-cached workload up to 60 % would crash. No defensive MISS == 0 check.
- SQ_CACHE_MIN_HIT_RATIO comment vs value. Header reads “9, it means 90 %”; the constant is integer 9. Cutoff depends on whether hits/misses ∈ [8, 9) (disabled) or [9, 10) (kept) — biased low.
- Filter-pred soft cap. fpcache_Entry_counter can exceed Soft_capacity between insert and cleanup. The assert (false) in fpcache_remove_by_class for “entry not in hashmap” relies on no concurrent invalidation; a high-DDL workload could fire this in debug builds.
- Memoize hit-ratio guard fires on put, not get. A read-only cache (inner side fully populated) is never disabled. On warmup the bias matters.
- sq_cache_initialize called twice in sq_get — once via null-check, again inside if (!SQ_CACHE_HT). The hit-ratio block references SQ_CACHE_HIT/MISS before lazy init; harmless if zeroed, but the order is fragile.
- Server-mode-only fields. SQ_CACHE’s hash and stats are guarded by SERVER_MODE/SA_MODE; in CS_MODE the cache is just sq_key_struct. Keys are computed client-side, the hash table is server-side; XASL_USES_SQ_CACHE travels with the serialised XASL.
- fpcache_compare_cleanup_candidates ignores tv_usec. Only tv_sec is compared. Sub-second retires get undefined ordering; benchmark churn produces noisy eviction.
- Memoize aptr_list recursion gate at level >= 1. At level 0, the outer aptr_list carries optimiser-hoisted uncorrelated subqueries — irrelevant. At ≥ 1 it is part of the inner block’s structure.
- Two hashmap declarations in filter_pred_cache.c. Lines 74-76 declare fpcache_Hashmap (C++ lockfree_hashmap), fpcache_Ht (C LF_HASH_TABLE), and fpcache_Ht_freelist. Only the C++ map is used; the C-style ones look vestigial.
Open Questions
- sq_cache and attribute-correlated subqueries. The cache is keyed by free constants only. Does the optimiser rewrite attribute-correlated subqueries to constant form, or does XASL_USES_SQ_CACHE simply not get set? The set-sites in xasl_generation.c (1681, 6996, 14010) would answer.
- memoize::storage outliving DDL? The XASL cache invalidates entries via xcache_remove_by_oid on DDL; in-flight executions complete with the storage destroyed at execution end. Quick reading shows no race; formal proof needs closer tracing.
- Why isn’t fpcache keyed by (class_oid, btid)? The current key is the BTID alone. Two classes could in principle share a BTID across schema reuse — the cache trusts uniqueness and asserts via OID_EQ (class_oid) in fpcache_retire.
- Are the hit-ratio thresholds tuned? 9 (sq_cache), 0.5 (memoize), and “no threshold” (fpcache) look like judgement picks rather than measurements.
- Does fpcache serve MERGE / UPSERT? locator_eval_filter_predicate runs on every B-tree mutation. MERGE issues INSERT and UPDATE paths against the index; the cache should serve both, but the claim/retire order across a MERGE-inserted-then-updated row is not traced here.
- set_value re-clones on every hit. For wide value lists this is a per-row copy cost even when the new value matches the old. Comparison-and-skip would help only if comparison is cheaper than pr_clone_value — unmeasured.
Sources
CUBRID source files (informational):
- src/query/subquery_cache.{c,h} — scalar-subquery cache; SQ_CACHE, SQ_KEY, SQ_VAL, SQ_TYPE definitions; budget macros.
- src/query/filter_pred_cache.{c,h} — server-wide filter-predicate cache; public fpcache_* API.
- src/query/memoize.{hpp,cpp} — generic per-XASL memoization; storage, key, value, result_code, C bridge.
- src/query/query_executor.c — wires memoize_* and sq_cache_destroy into qexec_execute_scan / mainblock / clear_xasl.
- src/query/query_evaluator.c, src/query/fetch.c — sq_cache probe at eval_pred / fetch_peek_dbval.
- src/transaction/locator_sr.c, src/transaction/log_manager.c — fpcache_claim/retire callers and fpcache_remove_by_class triggers (DDL + recovery).
- src/parser/xasl_generation.c — XASL_USES_SQ_CACHE set sites.
Cross-document references:
- cubrid-xasl-cache.md — the plan-level cache, complementary: the XASL cache memoises whole compiled plans; the runtime caches memoise pieces of execution under those plans. Both share the latch-free hashmap pattern via cubthread::lockfree_hashmap.
- cubrid-query-evaluator.md — eval_pred, fetch_peek_dbval, and the PRED_EXPR walker, one level above the sq_cache probe.
- cubrid-btree.md — owner of the BTIDs used as fpcache keys.