CUBRID Runtime Memoization — Subquery Cache, Filter-Predicate Cache, and Per-Query Memoize Helpers
Contents:
- Theoretical Background
- Common DBMS Design
- CUBRID’s Approach
- Source Walkthrough
- Cross-check Notes
- Open Questions
- Sources
Theoretical Background
A query plan is not the smallest unit at which a database can avoid
redundant work. The XASL cache (cubrid-xasl-cache.md) memoises the
compile pipeline so the second EXECUTE of a PREPAREd statement skips
parse, semantic-check, optimise, and code-gen. But that cache stops at
the plan boundary: every execution still walks the same scan operators,
evaluates the same predicates row by row, and re-executes the same
uncorrelated subquery once per outer tuple unless something inside the
runtime intercepts the work and remembers the result.
That something is runtime memoization. The pattern goes back to Michie’s 1968
“Memo functions and machine learning”: rewrite f(args) → result so that
the first call computes the result and stores it in a hash keyed by
the arguments; subsequent calls return the stored result. In a query
engine the “pure functions” are not pure — they read pages and take
latches — but they are idempotent under stable state: while a
transaction holds the necessary locks and no concurrent DDL has fired,
calling the function twice with the same arguments yields the same
result. That idempotency is why a runtime cache is safe.
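To make the contract concrete, here is a minimal, generic sketch of the pattern (not CUBRID code; the names are invented for illustration):

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical expensive computation: stands in for "run the subquery",
// "deserialise the predicate", or "scan the inner side of the join".
static int64_t expensive (int64_t arg)
{
  return arg * arg;               // placeholder for real work (page reads, latches)
}

// Memoized wrapper: the first call per argument computes and stores the result;
// later calls with the same argument return the stored value. Safe only while
// the underlying state is stable (locks held, no concurrent DDL).
static int64_t memoized (int64_t arg)
{
  static std::unordered_map<int64_t, int64_t> cache;
  auto it = cache.find (arg);
  if (it != cache.end ())
    {
      return it->second;          // hit: skip recomputation
    }
  int64_t result = expensive (arg);
  cache.emplace (arg, result);    // miss: compute once, remember
  return result;
}
```

CUBRID’s three caches elaborate this shape: the argument becomes a DB_VALUE array, the map gets a byte budget, and a hit-ratio guard can switch the wrapper off.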
Three places in the per-row hot path are memoization candidates.
Uncorrelated scalar subqueries. WHERE salary > (SELECT AVG(salary) FROM emp) contains an inner block independent of outer columns. A
naive executor invokes it once per outer row; a smart one invokes it
once and caches the scalar. Bind values form the key. The optimisation
generalises to correlated subqueries when correlation columns take few
distinct values: N inner executions become K (distinct correlation
tuples seen).
Compiled filter predicates. A function-based or partial index
carries its predicate as a serialised XASL byte stream in the B-tree.
Every index operation — INSERT, UPDATE, DELETE, range scan — must
turn it into a runnable PRED_EXPR_WITH_CONTEXT. Caching the
deserialised tree per-BTID amortises across queries; pre-compiled
regex (RLIKE) is reused for free.
Inner tuples of nested-loop joins. For A NLJ B ON A.x = B.x,
naive execution scans B once per outer tuple. If A.x has few
distinct values, the inner scan repeats redundantly. Caching the inner
tuple set keyed by the join-column value collapses redundant scans, at
the cost of materialising tuples streaming would have avoided.
Vectorised engines (DuckDB, Velox) get this for free through batched hash
builds; row-stores need an explicit memoization pass.
All three are instances of the same pattern: a hash keyed by a
DB_VALUE array, populated lazily, bounded by memory cap, with a
hit-ratio guard. The differences are about what gets memoised,
what the key is, and who owns the lifetime. fpcache is
server-wide and cross-query; sq_cache and memoize::storage are
per-XASL and per-execution. Only the cross-query cache needs DDL
invalidation — the per-XASL caches are implicitly invalidated by the
XASL node’s destruction.
Common DBMS Design
Every engine picks the same three knobs: unit of memoization, key, staleness.
Subquery caching. Postgres distinguishes InitPlan (uncorrelated,
once-before) from SubPlan (correlated, inline); SubPlan’s
useHashTable flag enables a work_mem hash keyed by correlation
parameters. Oracle’s scalar subquery cache wraps every SELECT (...)
scalar in a 256-entry LRU hash. SQL Server transforms scalar subqueries
into joins at optimise time. MySQL materialises uncorrelated subqueries;
correlated ones get no hashing. CUBRID’s subquery_cache sits closest
to Oracle: per-call hash keyed by free constants, byte-bounded,
fail-on-full, hit-ratio guard.
Filter-predicate caching. Postgres caches deserialised plan trees
in the per-backend PlanCache; function-index predicate state is
invalidated through RelationCacheInvalidateEntry. Oracle hides this
in the cursor / library cache. CUBRID’s fpcache is more granular —
function-based / partial indexes carry their predicate in the B-tree,
and fpcache keeps a per-BTID pool of deserialised expressions.
Generic memoization. PostgreSQL ≥ 14’s Memoize plan node
(nodeMemoize.c) caches inner output of parametrised NLJs keyed by
the join key, work_mem-bounded, LRU-evicted, optimiser-chosen.
CUBRID’s memoize::storage is the analogue but a runtime helper, not
a plan node: attached at execute time when shape satisfies the static
memoize::possible_check; runtime fall-back to recompute on
exhaustion or poor hit ratio.
Theory ↔ CUBRID mapping
| Theoretical concept | CUBRID symbol |
|---|---|
| Memoize-by-arguments | All three: sq_cache, fpcache, memoize::storage |
| Per-execution scalar-subquery cache | SQ_CACHE on XASL_NODE::sq_cache |
| Subquery key / value | SQ_KEY { DB_VALUE **dbv_array } / SQ_VAL { SQ_REGU_VALUE val } |
| Subquery hit-ratio guard | SQ_CACHE_MIN_HIT_RATIO (=9, fired at 60 % capacity) |
| Cross-query filter-predicate cache | fpcache_Hashmap (lockfree, BTID-keyed) |
| Filter-pred entry / clone pool | FPCACHE_ENTRY / PRED_EXPR_WITH_CONTEXT **clone_stack |
| Filter-pred eviction / invalidation | fpcache_cleanup (LRU heap) / fpcache_remove_by_class |
| NLJ inner-side memoization | memoize::storage |
| Memoize key / value / bucket | memoize::key / value / std::unordered_multimap<key*, value*, …> |
| Memoize key extractor | key_maker<TARGET_CLASS> / key_maker<TARGET_LIST> |
| Memoize applicability / hit-ratio guard | possible_check / MEMOIZE_HIT_RATIO_THRESHOLD = 0.5 after 1000 accesses |
| Server params (sq / fp / memoize) | PRM_ID_MAX_SUBQUERY_CACHE_SIZE / PRM_ID_FILTER_PRED_MAX_CACHE_* / PRM_ID_MEMOIZE_MEMORY_LIMIT |
CUBRID’s Approach
The three caches form a layered defence against redundant runtime work,
sharing one playbook (DB_VALUE-array hash key, fail-on-full memory
budget, hit-ratio guard) at three lifecycle scopes — per-call,
per-query, server-wide.
Where the caches sit in the query lifecycle
flowchart TB
subgraph PARSE["1) Parse + plan (client)"]
P2["xasl_generation.c<br/>XASL_SET_FLAG (xasl, XASL_USES_SQ_CACHE)"]
end
subgraph PREP["2) Prepare (server)"]
Q1["xqmgr_prepare_query + xcache_insert"]
end
subgraph EXEC["3) Execute (server)"]
E2["qexec_execute_scan<br/>(per-block driver)"]
M1[("memoize::storage<br/>per-XASL multi-tuple<br/>NLJ correlation key")]
SQ1[("SQ_CACHE<br/>per-XASL scalar<br/>free-constant key")]
EVAL["eval_pred<br/>fetch_peek_dbval"]
INDEX["btree_range_search<br/>· filter pred"]
FP1[("fpcache_Hashmap<br/>server-wide<br/>BTID key")]
end
subgraph DDL["4) DDL"]
INV["locator_sr.c<br/>fpcache_remove_by_class"]
end
P2 --> Q1 --> E2
E2 --> M1
E2 --> EVAL --> SQ1
E2 --> INDEX --> FP1
FP1 --> EVAL
INV --> FP1
Three boundaries.
(eval ↔ sq_cache) In eval_pred and fetch_peek_dbval, when the
executor reaches a TYPE_CONSTANT regu variable pointing at an XASL
subquery (regu_var->xasl != NULL) flagged XASL_USES_SQ_CACHE, the
evaluator probes sq_cache first; on miss it executes the subquery,
then puts the scalar result. Same pattern for R_EXISTS predicates.
(scan ↔ fpcache) In locator_eval_filter_predicate, before
evaluating a function-based or partial index predicate, the locator
calls fpcache_claim to obtain a deserialised
PRED_EXPR_WITH_CONTEXT — a clone if available, fresh deserialisation
otherwise. After evaluation, fpcache_retire returns the clone (or
frees it if the stack is full).
(scan ↔ memoize) In qexec_execute_scan, when the inner XASL of
a nested-loop join carries a memoize::storage, the driver calls
memoize_get to probe for cached tuples for the current join key. On
hit, cached tuples stream into the next scan function and the inner
scan is skipped. On miss, the driver runs the normal scan and calls
memoize_put per qualified tuple plus memoize_put_nullptr at scan
end.
Subquery cache — SQ_CACHE on the XASL node
The cache lives on the XASL node it caches:
```c
// SQ_CACHE — src/query/subquery_cache.h
struct sq_cache
{
  SQ_KEY *sq_key_struct;                /* prebuilt key shape */
#if defined (SERVER_MODE) || defined (SA_MODE)
  MHT_TABLE *ht;
  UINT64 size_max;                      /* size budget (bytes) */
  UINT64 size;
  bool enabled;                         /* hit-ratio guard */
  struct { int hit; int miss; } stats;
#endif
};

struct sq_key { DB_VALUE **dbv_array; int n_elements; };
struct sq_val { SQ_REGU_VALUE val; REGU_DATATYPE type; };
```

The key is the free-constant array. At XASL-generation, when the
planner identifies a subquery whose only varying inputs are binds or
correlation columns, it sets XASL_USES_SQ_CACHE and stores an
SQ_KEY shape in sq_key_struct — an array of pointers into the
live regu variables. At execute, sq_make_key walks that array,
deep-copies each DB_VALUE via db_value_copy, and returns a fresh
SQ_KEY.
The hash function rotates and XORs mht_get_hash_number of each
key element. The 13-bit rotation prevents collisions when equal
elements are swapped — the same idiom used by memoize::key::hash:
```c
// sq_hash_func — src/query/subquery_cache.c
for (int i = 0; i < k->n_elements; i++)
  {
    h = ROTL32 (h, 13);
    h ^= mht_get_hash_number (ht_size, k->dbv_array[i]);
  }
```

The value is one of two shapes. TYPE_CONSTANT (scalar subquery):
deep-copied DB_VALUE. TYPE_LIST_ID (EXISTS): just a boolean —
whether the inner result has any tuple. The boolean suffices because
EXISTS only asks if the set is non-empty:
```c
// sq_make_val — src/query/subquery_cache.c (condensed)
case TYPE_CONSTANT:
  ret->val.dbvalptr = db_value_copy (val->value.dbvalptr);
  break;
case TYPE_LIST_ID:
  ret->val.exists = (val->value.srlist_id->list_id->tuple_cnt > 0);
  break;
```

Memory budget and hit-ratio guard. Sized in bytes, not entries.
PRM_ID_MAX_SUBQUERY_CACHE_SIZE is the hard cap; sq_put tracks
bytes used, refusing oversized inserts and disabling the cache. The
hit-ratio guard in sq_get fires once occupancy exceeds 60 %:
```c
// sq_get — src/query/subquery_cache.c (condensed)
if ((double) SQ_CACHE_SIZE (xasl) > (double) SQ_CACHE_SIZE_MAX (xasl) * 0.6)
  if (SQ_CACHE_HIT (xasl) / SQ_CACHE_MISS (xasl) < SQ_CACHE_MIN_HIT_RATIO)
    {
      SQ_CACHE_ENABLED (xasl) = false;
      return false;
    }
```

Integer division floors hit/miss; the header comment reads “9, it means 90 %”, but the actual cutoff is biased low.
Wired into row evaluation in two sites, both following the same
probe-or-execute pattern — query_evaluator.c::eval_pred for
R_EXISTS, and fetch.c::fetch_peek_dbval for TYPE_CONSTANT:
```c
// fetch_peek_dbval — src/query/fetch.c (condensed)
if (xasl && XASL_IS_FLAGED (xasl, XASL_USES_SQ_CACHE)
    && !(SQ_CACHE_HT (xasl) && !SQ_CACHE_ENABLED (xasl)))
  {
    if (!sq_get (thread_p, SQ_CACHE_KEY_STRUCT (xasl), xasl, regu_var))
      {
        EXECUTE_REGU_VARIABLE_XASL (thread_p, regu_var, vd);
        if ((key = sq_make_key (thread_p, xasl)) == NULL)
          {
            XASL_CLEAR_FLAG (xasl, XASL_USES_SQ_CACHE);
            goto exit_on_error;
          }
        if (sq_put (thread_p, key, xasl, regu_var) == ER_FAILED)
          {
            sq_free_key (thread_p, key);
          }
      }
  }
```

The disable check !(SQ_CACHE_HT && !SQ_CACHE_ENABLED) means: either
uninitialised (try lazily) or enabled (use it); only “initialised but
disabled” skips the cache.
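As a standalone illustration of the rotation idiom (not CUBRID code; std::hash stands in for mht_get_hash_number), plain XOR of the element hashes collides when two key elements are swapped, while rotate-then-XOR keeps the order visible:

```cpp
#include <cstdint>
#include <cstdio>
#include <functional>

// Rotate-left by 13 bits, the idiom sq_hash_func and memoize::key::hash
// apply before XOR-ing in the next element hash.
static uint32_t rotl32 (uint32_t x, int r)
{
  return (x << r) | (x >> (32 - r));
}

static uint32_t combine_plain (uint32_t a, uint32_t b)
{
  return a ^ b;                         // order-insensitive: (a, b) and (b, a) collide
}

static uint32_t combine_rot_xor (uint32_t a, uint32_t b)
{
  uint32_t h = 0;
  h = rotl32 (h, 13); h ^= a;
  h = rotl32 (h, 13); h ^= b;           // earlier elements end up rotated, later ones not
  return h;
}

int main ()
{
  uint32_t ha = (uint32_t) std::hash<int> {} (7);
  uint32_t hb = (uint32_t) std::hash<int> {} (42);
  std::printf ("plain xor:    %u vs %u\n", combine_plain (ha, hb), combine_plain (hb, ha));
  std::printf ("rotate + xor: %u vs %u\n", combine_rot_xor (ha, hb), combine_rot_xor (hb, ha));
  return 0;
}
```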
Filter-predicate cache — fpcache_Hashmap
The only server-wide cache. Function-based and partial indexes carry
their predicate in the B-tree definition, and every index operation
needs it as a runnable tree. fpcache caches deserialised trees by
BTID.
```cpp
// fpcache_ent — src/query/filter_pred_cache.c
struct fpcache_ent
{
  BTID btid;

  /* Latch-free stuff. */
  FPCACHE_ENTRY *stack, *next;
  pthread_mutex_t mutex;
  UINT64 del_id;

  OID class_oid;
  struct timeval time_last_used;              /* LRU timestamp */

  PRED_EXPR_WITH_CONTEXT **clone_stack;       /* deserialised tree pool */
  INT32 clone_stack_head;
};
```

Two levels of memoization. Outer: the hashmap, BTID-keyed. Inner:
the clone stack — each entry holds a bounded pool of deserialised
PRED_EXPR_WITH_CONTEXT trees (sized by
PRM_ID_FILTER_PRED_MAX_CACHE_CLONES). Concurrent readers claim a
clone on entry and retire on exit; empty stack on claim triggers
fresh deserialisation; full stack on retire frees the clone. The
pattern solves “deserialised structure cannot be shared between
threads” — each thread gets its own private copy without
re-deserialising. Statistics distinguish four outcomes: hashmap
hit/miss, clone hit/miss.
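A condensed sketch of the clone-stack idea, assuming simplified types (the real fpcache_ent also carries the lock-free chain fields, the class OID, timestamps, and statistics):

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Stand-in for PRED_EXPR_WITH_CONTEXT: expensive to build, unsafe to share.
struct pred_clone { /* deserialised predicate tree */ };

static pred_clone *deserialize_clone () { return new pred_clone (); }
static void free_clone (pred_clone *c)  { delete c; }

// Per-BTID pool: a thread claims a private clone, uses it, and retires it,
// so no thread shares a tree and no thread re-deserialises on every row.
struct clone_pool
{
  std::mutex mutex;
  std::vector<pred_clone *> stack;
  std::size_t max_clones = 8;               // PRM_ID_FILTER_PRED_MAX_CACHE_CLONES analogue

  pred_clone *claim ()
  {
    {
      std::lock_guard<std::mutex> guard (mutex);
      if (!stack.empty ())
        {
          pred_clone *c = stack.back ();    // clone hit
          stack.pop_back ();
          return c;
        }
    }
    return deserialize_clone ();            // clone miss: build a fresh copy
  }

  void retire (pred_clone *c)
  {
    bool kept;
    {
      std::lock_guard<std::mutex> guard (mutex);
      kept = stack.size () < max_clones;
      if (kept)
        {
          stack.push_back (c);              // keep it for the next claimer
        }
    }
    if (!kept)
      {
        free_clone (c);                     // stack full: drop the clone
      }
  }
};
```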
Concurrency. cubthread::lockfree_hashmap<BTID, fpcache_ent> —
latch-free open-chained with deferred deletion (del_id). Readers do
not block writers; writers serialise through fpcache_entry->mutex
when modifying the clone stack.
Claim path.
```cpp
// fpcache_claim — src/query/filter_pred_cache.c (condensed)
fpcache_entry = fpcache_Hashmap.find (thread_p, *btid);
if (fpcache_entry != NULL)
  {
    ATOMIC_INC_64 (&fpcache_Stat_hit, 1);
    if (fpcache_entry->clone_stack_head >= 0)
      {
        *filter_pred = fpcache_entry->clone_stack[fpcache_entry->clone_stack_head--];
        ATOMIC_INC_64 (&fpcache_Stat_clone_hit, 1);
      }
    pthread_mutex_unlock (&fpcache_entry->mutex);
  }

if (*filter_pred == NULL)
  {
    HL_HEAPID old_heap = db_change_private_heap (thread_p, 0);
    stx_map_stream_to_filter_pred (thread_p, filter_pred,
                                   or_pred->pred_stream, or_pred->pred_stream_size);
    db_change_private_heap (thread_p, old_heap);
  }
```

Key detail: deserialisation runs on the global heap
(db_change_private_heap (thread_p, 0)), not the per-thread heap.
Cached clones outlive any single transaction; a private-heap
allocation would be reaped at transaction end, leaving dangling
pointers.
Retire mirrors claim: find-or-insert, push clone if room else
free. A fresh-entry insert triggers cleanup when
fpcache_Entry_counter >= fpcache_Soft_capacity.
Eviction. fpcache_cleanup is LRU-by-binary-heap: walk all
entries, sort the oldest 20 % (FPCACHE_CLEANUP_RATIO = 0.2) by
time_last_used, erase. fpcache_Cleanup_flag serialises one cleaner
at a time. The cap is soft — briefly exceeded during cleanup.
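A sketch of selecting the eviction victims with a bounded heap, under invented stand-in types; k would be FPCACHE_CLEANUP_RATIO times the current entry count:

```cpp
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

// Stand-in for a hashmap entry reference plus its time_last_used.tv_sec.
struct entry_ref
{
  int id;
  int64_t last_used;
};

// Max-heap on last_used: top() is the newest of the candidates kept so far.
struct newer_first
{
  bool operator() (const entry_ref &a, const entry_ref &b) const
  {
    return a.last_used < b.last_used;
  }
};

// Return the k oldest entries without sorting the whole population.
static std::vector<entry_ref> oldest_k (const std::vector<entry_ref> &all, std::size_t k)
{
  std::priority_queue<entry_ref, std::vector<entry_ref>, newer_first> heap;
  for (const entry_ref &e : all)
    {
      heap.push (e);
      if (heap.size () > k)
        {
          heap.pop ();          // evict the newest candidate, keep the k oldest
        }
    }
  std::vector<entry_ref> victims;
  while (!heap.empty ())
    {
      victims.push_back (heap.top ());
      heap.pop ();
    }
  return victims;               // these would then be erased from the hashmap
}
```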
Invalidation. fpcache_remove_by_class is called from
locator_sr.c:5679 (class drop), locator_sr.c:6302 (class alter),
log_manager.c:5105 (DDL recovery). Two-phase walk: collect BTIDs
(cannot delete during iteration), then erase.
FPCACHE_DELETE_BTIDS_SIZE = 1024 bounds per-pass collection; the
outer loop restarts iteration if the buffer fills.
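The same collect-then-erase pattern in miniature, using an ordinary std::unordered_map as a stand-in for the lock-free hashmap (names invented):

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// int stands in for BTID; 'owner' for the class OID being dropped or altered.
struct cached_pred { int owner; /* clone stack, timestamps, ... */ };

static void remove_by_owner (std::unordered_map<int, cached_pred> &map, int owner)
{
  constexpr std::size_t BATCH = 1024;            // FPCACHE_DELETE_BTIDS_SIZE analogue
  bool restart = true;

  while (restart)
    {
      restart = false;
      std::vector<int> victims;
      victims.reserve (BATCH);

      // Phase 1: iterate and only collect keys; erasing mid-iteration would
      // invalidate the walk (in CUBRID, the lock-free iterator).
      for (const auto &kv : map)
        {
          if (kv.second.owner == owner)
            {
              victims.push_back (kv.first);
              if (victims.size () == BATCH)
                {
                  restart = true;                // buffer full: erase, then rescan
                  break;
                }
            }
        }

      // Phase 2: erase the collected keys.
      for (int key : victims)
        {
          map.erase (key);
        }
    }
}
```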
Memoize — memoize::storage per XASL
The newest of the three (29 KB of code, versus 15 KB and 23 KB for the other two). Memoises the inner side of nested-loop joins: for each distinct correlation-key value seen by the outer scan, the inner scan’s tuples are cached so subsequent same-key outer rows read from the cache.
```cpp
// storage — src/query/memoize.hpp (condensed)
class storage
{
  public:
    static storage *new_storage (THREAD_ENTRY *, size_t max_storage_size, xasl_node *);
    result_code get ();
    result_code put ();
    result_code put_nullptr ();
    void set_key_changed ();

    size_t hit;
    size_t miss;

  private:
    std::unordered_multimap<key *, value *, key::hash, key::equal> m_key_value_map;
    fixed_allocator<key> m_key_fixed_allocator;
    /* ... */
};
```

Multi-tuple values. Unlike sq_cache (one scalar per key) and
fpcache (one tree per key plus clone pool), memoize uses an
unordered_multimap: each key maps to zero or more values (column
lists for inner tuples). The empty case (no inner match) is encoded
as a nullptr value inserted by put_nullptr at scan end.
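A simplified sketch of that encoding (invented types; the real values are memoize::value column lists, and the probe runs inside storage::get):

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// One join-key value can map to several cached inner tuples; a nullptr entry
// records "scanned, produced nothing" so the empty case is also a cache hit.
using tuple = std::vector<std::string>;
static std::unordered_multimap<int, tuple *> cache;

static void remember_tuple (int key, const tuple &t)
{
  cache.emplace (key, new tuple (t));    // one entry per qualified inner tuple
}

static void remember_empty (int key)
{
  cache.emplace (key, nullptr);          // put_nullptr analogue
}

enum class probe { MISS, EMPTY, TUPLES };

// Distinguishes "never scanned" (miss) from "scanned, no match" (hit, ended).
static probe lookup (int key)
{
  auto range = cache.equal_range (key);
  if (range.first == range.second)
    {
      return probe::MISS;                // inner scan must run
    }
  if (range.first->second == nullptr)
    {
      return probe::EMPTY;               // hit: inner side known to be empty
    }
  return probe::TUPLES;                  // hit: stream the cached tuples
}
```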
Two-step interaction with the scan loop. qexec_execute_scan
probes the cache before running the inner scan; on hit it streams
cached values into the next scan function. On miss it runs the normal
scan, calls put per qualified tuple, and put_nullptr at end.
The state machine of storage::get has three branches keyed on
(key_changed, has_range):
- key_changed = true (outer row advanced via set_key_changed): build a new key, look it up in m_key_value_map. No entries → miss++, return NOT_FOUND (inner scan must run); entries found → hit++, dequeue the first value (nullptr → ENDED, else SUCCESS).
- key_changed = false, has_range = true: pop the next cached tuple from m_current_value_list; empty → has_range = false, ENDED; else SUCCESS.
- key_changed = false, has_range = false: NOT_FOUND.
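A condensed restatement of those branches as code, under simplified, invented types (the real get() also copies the dequeued value into the scan's output variables):

```cpp
#include <cstddef>
#include <deque>
#include <utility>

enum class result_code { SUCCESS, ENDED, NOT_FOUND };

struct get_sketch
{
  bool key_changed = true;               // set by set_key_changed per outer row
  bool has_range = false;                // cached tuples remain for the current key
  std::deque<const void *> values;       // stand-in for m_current_value_list
  std::size_t hit = 0, miss = 0;

  // 'found' stands in for building the key and probing m_key_value_map.
  result_code get (std::deque<const void *> found)
  {
    if (key_changed)                      // branch 1: the outer row advanced
      {
        key_changed = false;
        values = std::move (found);
        if (values.empty ())
          {
            miss++;
            return result_code::NOT_FOUND;        // inner scan must run
          }
        hit++;
        has_range = true;
        const void *v = values.front ();
        values.pop_front ();
        return v == nullptr ? result_code::ENDED : result_code::SUCCESS;
      }
    if (has_range)                        // branch 2: keep streaming cached tuples
      {
        if (values.empty ())
          {
            has_range = false;
            return result_code::ENDED;
          }
        values.pop_front ();
        return result_code::SUCCESS;
      }
    return result_code::NOT_FOUND;        // branch 3: nothing cached, nothing pending
  }
};
```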
Key extraction is structural. memoize::key_maker<TARGET_TYPE>
walks the inner XASL’s predicate tree (where_key, where_pred,
where_range, if_pred, after_join_pred, plus dptr_list) and
collects all TYPE_CONSTANT regu variables — the correlation columns
from the outer scan. Then it removes any TYPE_CONSTANT whose
dbvalptr matches a spec regu list pointer
(cls_regu_list_pred/rest/range/key for TARGET_CLASS, the
list-node equivalents for TARGET_LIST): those are the inner scan’s
own bindings, not correlation columns. The remainder becomes the
cache key. The if constexpr branch chooses the right union field
per spec type; the logic is otherwise identical.
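A sketch of the collect-then-subtract step under simplified types (the names are invented; the real walker visits the predicate trees listed above and compares dbvalptr pointers):

```cpp
#include <unordered_set>
#include <vector>

struct DB_VALUE;                 // opaque for the sketch

// constants_in_predicates: every TYPE_CONSTANT dbvalptr found in where_key,
// where_pred, where_range, if_pred, after_join_pred, dptr_list.
// spec_own_regus: the dbvalptrs of the spec's own regu lists
// (cls_regu_list_pred/rest/range/key or the list-node equivalents).
static std::vector<DB_VALUE *>
correlation_key (const std::vector<DB_VALUE *> &constants_in_predicates,
                 const std::vector<DB_VALUE *> &spec_own_regus)
{
  std::unordered_set<DB_VALUE *> own (spec_own_regus.begin (), spec_own_regus.end ());

  std::vector<DB_VALUE *> key;
  for (DB_VALUE *dbv : constants_in_predicates)
    {
      if (own.count (dbv) == 0)
        {
          key.push_back (dbv);   // not the scan's own binding: a correlation column
        }
    }
  return key;                    // becomes the memoize key for this inner scan
}
```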
Applicability is gated by memoize::possible_check, which walks
the XASL node and rejects unsafe shapes: null spec/val_list,
spec->next non-null (multi-spec), bptr_list/fptr_list non-null,
unsupported spec->type (CLASS_ATTR, SET, JSON_TABLE, METHOD,
REGUVAL_LIST, SHOWSTMT, DBLINK), system or MVCC-disabled class,
spec->access not SEQUENTIAL/INDEX, set/multiset/sequence
domains in val_list, dptr_list member missing
XASL_LINK_TO_REGU_VARIABLE. The recursive level parameter gates
aptr_list recursion at ≥ 1 (level-0 aptr_list carries
optimiser-hoisted uncorrelated subqueries, irrelevant to inner-scan
memoizability).
Hit-ratio guard. Same shape as sq_cache. After 1000 accesses
(MEMOIZE_FREE_ITERATION_LIMIT), if hit ratio < 0.5
(MEMOIZE_HIT_RATIO_THRESHOLD), the storage disables itself:
```cpp
// storage::put — src/query/memoize.cpp (condensed)
if (disabled || get_current_size () >= m_max_storage_size)
  {
    disabled = true;
    return result_code::FULL;
  }
if (hit + miss > MEMOIZE_FREE_ITERATION_LIMIT)
  {
    if (((double) hit) / (hit + miss) < MEMOIZE_HIT_RATIO_THRESHOLD)
      {
        disabled = true;
        return result_code::FULL;
      }
  }
```

The threshold is more permissive than sq_cache’s 90 % because a
50 % hit on a 100-tuple inner scan saves substantial CPU/IO.
Allocation pools. Keys from m_key_fixed_allocator (slab
allocator, recycles fixed-size blocks). Values from malloc. Two
allocators because keys are numerous (one per outer row) and benefit
from slab recycling, while values vary in size.
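A minimal free-list sketch of the fixed-size recycling idea (illustrative only, not the actual fixed_allocator):

```cpp
#include <cstdlib>
#include <vector>

// Every allocation has the same size, so released blocks can be handed back
// out without another malloc. Outstanding blocks must be released before the
// pool is destroyed in this simplified version.
template <typename T>
class fixed_pool
{
  public:
    void *allocate ()
    {
      if (!m_free.empty ())
        {
          void *p = m_free.back ();     // recycle a block released earlier
          m_free.pop_back ();
          return p;
        }
      return std::malloc (sizeof (T));  // slow path: grow the pool
    }

    void release (void *p)
    {
      m_free.push_back (p);             // kept for reuse, not returned to malloc
    }

    ~fixed_pool ()
    {
      for (void *p : m_free)
        {
          std::free (p);
        }
    }

  private:
    std::vector<void *> m_free;
};
```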
Lifecycle wiring. In qexec_execute_mainblock near line 15770,
when spec_level == 0 && level >= 1 && !mvcc_select_lock_needed, the
executor calls new_memoize_storage on the inner XASL.
qexec_execute_scan enters qexec_execute_nljoin_with_memoize
whenever xasl->memoize_storage is non-null; the cached fast-path
returns directly on hit, otherwise the normal scan loop runs and
calls memoize_put per qualified tuple plus memoize_put_nullptr
at scan end. The storage is destroyed by clear_memoize_storage on
the next install (iteration reset) or by qexec_clear_xasl at
teardown.
Comparing the three caches
| Axis | sq_cache | fpcache | memoize::storage |
|---|---|---|---|
| Scope | Per-XASL | Server-wide | Per-XASL |
| Key | DB_VALUE array (free const) | BTID | DB_VALUE array (correlation cols) |
| Value | DB_VALUE or boolean | PRED_EXPR_WITH_CONTEXT clone pool | std::vector<DB_VALUE> (multi-tuple) |
| Hash table | MHT_TABLE (private) | cubthread::lockfree_hashmap | std::unordered_multimap |
| Concurrency | Single-thread | Latch-free + per-entry mutex | Single-thread |
| Eviction | Fail-on-full | LRU via binary heap | Fail-on-full |
| Invalidation | XASL teardown | fpcache_remove_by_class | XASL teardown |
| Hit-ratio guard | 90 % at 60 % capacity | None (cleanup handles staleness) | 50 % after 1000 accesses |
| Size budget | PRM_ID_MAX_SUBQUERY_CACHE_SIZE | PRM_ID_FILTER_PRED_MAX_CACHE_* | PRM_ID_MEMOIZE_MEMORY_LIMIT |
| Use case | Uncorrelated scalar subquery | Function/partial index predicate | NLJ inner side |
| Plan-time flag | XASL_USES_SQ_CACHE | None (always-on if enabled) | None (runtime possible_check) |
| Init | Lazy (first put/get) | Boot (fpcache_initialize) | Lazy (per-execute) |
All three derive their key from a DB_VALUE array and use a
rotation-XOR hash; all three treat memory as the governing budget;
all three include a self-disabling guard. The differences come from
what is memoised (scalar vs tree pool vs tuple stream) and where
the cache sits in the lifecycle.
Source Walkthrough
Symbols grouped by file and call-flow; position table at the end.
subquery_cache.c
Key/value lifecycle: sq_make_key, sq_make_val, sq_free_key,
sq_free_val, sq_unpack_val. Hash policy: sq_hash_func,
sq_cmp_func, sq_rem_func. Lifecycle: sq_cache_initialize (lazy
table creation), sq_cache_destroy, sq_put (bytes-budget guard,
mht_put_if_not_exists), sq_get (60 %-capacity hit-ratio guard).
Macros: SQ_CACHE_MIN_HIT_RATIO, SQ_CACHE_EXPECTED_ENTRY_SIZE, and
the SQ_CACHE_*(xasl) accessors for hash table, enabled flag, key
shape, hit/miss counters, size, and size cap.
filter_pred_cache.c
Static globals: fpcache_Enabled, fpcache_Soft_capacity,
fpcache_Hashmap, fpcache_Entry_counter, fpcache_Clone_counter,
fpcache_Clone_stack_size, fpcache_Cleanup_flag, fpcache_Cleanup_bh,
the fpcache_Stat_* counter family.
Latch-free entry descriptor callbacks: fpcache_entry_alloc,
fpcache_entry_free, fpcache_entry_init, fpcache_entry_uninit,
fpcache_copy_key, all wired into fpcache_Entry_descriptor with
LF_EM_USING_MUTEX.
Public API: fpcache_initialize, fpcache_finalize, fpcache_claim,
fpcache_retire, fpcache_remove_by_class, fpcache_drop_all,
fpcache_dump. Internal: fpcache_cleanup,
fpcache_compare_cleanup_candidates (orders by tv_sec only).
Constants: FPCACHE_CLEANUP_RATIO = 0.2,
FPCACHE_DELETE_BTIDS_SIZE = 1024.
The only external claim/retire consumer is
locator_eval_filter_predicate in locator_sr.c;
fpcache_remove_by_class is fired by locator_drop_class,
locator_check_btid, and the DDL log replay in log_manager.c.
memoize.cpp
Namespace constants: MEMOIZE_FREE_ITERATION_LIMIT = 1000,
MEMOIZE_HIT_RATIO_THRESHOLD = 0.5, hash_entry_sz.
Types: result_code enum (SUCCESS/ENDED/NOT_FOUND/FULL/ERROR), key
(vector of DB_VALUE + hash + equal), value (vector of DB_VALUE),
storage class. Static singletons: checker (possible_check),
cls_key_maker, list_key_maker (key_maker<TARGET_TYPE>
instantiations).
key_maker<TARGET_TYPE> is a function-object template with overloads
for every visitable XASL/predicate node — top-level
(thread_p, xasl, key_ptr_src), recursive subquery walker,
pred_expr*, pred*, eval_term, comp_eval_term, alsm_eval_term,
like_eval_term, rlike_eval_term, regu_variable_node*,
ARITH_TYPE*, REGU_VARIABLE_LIST, REGU_VALUE_LIST*,
function_node*, sp_node*. It collects TYPE_CONSTANT regu
pointers, subtracts spec own-regus, returns the remainder count.
storage member functions: new_storage (factory), constructor,
destructor, init, get, put, put_nullptr, get_key,
get_value, set_value, set_key_changed, start_timer/stop_timer
(TSC, only when thread_p->on_trace), get_current_size. C-bridge
functions: new_memoize_storage, clear_memoize_storage,
memoize_get, memoize_put, memoize_put_nullptr — translating
result_code to (success, is_ended) booleans and clearing storage
on FULL.
Cross-references
query_executor.c: qexec_execute_nljoin_with_memoize (8164),
qexec_execute_scan (8299), qexec_execute_mainblock (15770 install
site), qexec_clear_xasl (teardown sites at ~2335, 2396, 2914).
query_evaluator.c: eval_pred R_EXISTS branch (1849).
fetch.c: fetch_peek_dbval TYPE_CONSTANT branch (4002).
locator_sr.c: locator_eval_filter_predicate (~8170);
fpcache_remove_by_class callers (5679, 6302). log_manager.c:
DDL recovery call (5105). xasl_generation.c: XASL_USES_SQ_CACHE
set sites (1681, 6996, 14010).
Position hints as of this revision
| Symbol | File | Line |
|---|---|---|
SQ_KEY, SQ_VAL, SQ_CACHE | subquery_cache.h | 51-95 |
SQ_CACHE_MIN_HIT_RATIO | subquery_cache.h | 40 |
sq_make_key, sq_make_val | subquery_cache.c | 67, 103 |
sq_free_key, sq_free_val | subquery_cache.c | 143, 162 |
sq_unpack_val, sq_hash_func | subquery_cache.c | 191, 227 |
sq_cmp_func, sq_rem_func | subquery_cache.c | 251, 283 |
sq_cache_initialize | subquery_cache.c | 301 |
sq_put, sq_get | subquery_cache.c | 337, 407 |
sq_cache_destroy | subquery_cache.c | 461 |
FPCACHE_ENTRY (struct fpcache_ent) | filter_pred_cache.c | 41 |
fpcache_Hashmap, fpcache_Entry_descriptor | filter_pred_cache.c | 74, 118 |
fpcache_initialize, fpcache_finalize | filter_pred_cache.c | 146, 209 |
fpcache_entry_alloc/free/init/uninit | filter_pred_cache.c | 250-335 |
fpcache_claim, fpcache_retire | filter_pred_cache.c | 362, 430 |
fpcache_remove_by_class, fpcache_dump | filter_pred_cache.c | 506, 594 |
fpcache_cleanup | filter_pred_cache.c | 645 |
fpcache_compare_cleanup_candidates | filter_pred_cache.c | 733 |
fpcache_drop_all | filter_pred_cache.c | 760 |
MEMOIZE_FREE_ITERATION_LIMIT/THRESHOLD | memoize.hpp | 32-33 |
memoize::key / value / storage | memoize.hpp | 47, 67, 81 |
memoize::possible_check | memoize.cpp | 39 |
memoize::key_maker<TARGET_TYPE> | memoize.cpp | 163 |
memoize::key::hash/equal | memoize.cpp | 668, 679 |
memoize::storage::new_storage/dtor | memoize.cpp | 693, 765 |
memoize::storage::get/put/put_nullptr | memoize.cpp | 813, 893, 937 |
memoize::storage::get_key/get_value | memoize.cpp | 981, 1003 |
C bridge: new_memoize_storage/clear_*/memoize_* | memoize.cpp | 1045-1186 |
XASL_USES_SQ_CACHE, sq_cache, memoize_storage | xasl.h | 518, 1167, 1194 |
eval_pred (sq_cache wiring) | query_evaluator.c | 1849 |
fetch_peek_dbval (sq_cache wiring) | fetch.c | 4002 |
qexec_execute_nljoin_with_memoize | query_executor.c | 8164 |
qexec_execute_scan (memoize hooks) | query_executor.c | 8299 |
new_memoize_storage install site | query_executor.c | 15770 |
locator_eval_filter_predicate | locator_sr.c | ~8170 |
fpcache_remove_by_class callers | locator_sr.c / log_manager.c | 5679, 6302 / 5105 |
Cross-check Notes
- Subquery-cache integer truncation. The guard SQ_CACHE_HIT / SQ_CACHE_MISS < SQ_CACHE_MIN_HIT_RATIO divides integers; if MISS == 0 it divides by zero. The guard runs only after 60 % capacity, so practically some miss has occurred — but a perfectly-cached workload up to 60 % would crash. No defensive MISS == 0 check.
- SQ_CACHE_MIN_HIT_RATIO comment vs value. Header reads “9, it means 90 %”; the constant is integer 9. Cutoff depends on whether hits/misses ∈ [8, 9) (disabled) or [9, 10) (kept) — biased low.
- Filter-pred soft cap. fpcache_Entry_counter can exceed Soft_capacity between insert and cleanup. The assert (false) in fpcache_remove_by_class for “entry not in hashmap” relies on no concurrent invalidation; a high-DDL workload could fire this in debug builds.
- Memoize hit-ratio guard fires on put, not get. A read-only cache (inner side fully populated) is never disabled. On warmup the bias matters.
- sq_cache_initialize called twice in sq_get — once via null-check, again inside if (!SQ_CACHE_HT). The hit-ratio block references SQ_CACHE_HIT/MISS before lazy init; harmless if zeroed, but the order is fragile.
- Server-mode-only fields. SQ_CACHE’s hash and stats are guarded by SERVER_MODE/SA_MODE; in CS_MODE the cache is just sq_key_struct. Keys are computed client-side, the hash table is server-side; XASL_USES_SQ_CACHE travels with the serialised XASL.
- fpcache_compare_cleanup_candidates ignores tv_usec. Only tv_sec is compared. Sub-second retires get undefined ordering; benchmark churn produces noisy eviction.
- Memoize aptr_list recursion gate at level >= 1. At level 0, the outer aptr_list carries optimiser-hoisted uncorrelated subqueries — irrelevant. At ≥ 1 it is part of the inner block’s structure.
- Two hashmap declarations in filter_pred_cache.c. Lines 74-76 declare fpcache_Hashmap (C++ lockfree_hashmap), fpcache_Ht (C LF_HASH_TABLE), and fpcache_Ht_freelist. Only the C++ map is used; the C-style ones look vestigial.
Open Questions
- sq_cache and attribute-correlated subqueries. The cache is keyed by free constants only. Does the optimiser rewrite attribute-correlated subqueries to constant form, or does XASL_USES_SQ_CACHE simply not get set? The set-sites in xasl_generation.c (1681, 6996, 14010) would answer.
- memoize::storage outliving DDL? The XASL cache invalidates entries via xcache_remove_by_oid on DDL; in-flight executions complete with the storage destroyed at execution end. Quick reading shows no race; formal proof needs closer tracing.
- Why isn’t fpcache keyed by (class_oid, btid)? The current key is the BTID alone. Two classes could in principle share a BTID across schema reuse — the cache trusts uniqueness and asserts via OID_EQ (class_oid) in fpcache_retire.
- Are the hit-ratio thresholds tuned? 9 (sq_cache), 0.5 (memoize), and “no threshold” (fpcache) look like judgement picks rather than measurements.
- Does fpcache serve MERGE / UPSERT? locator_eval_filter_predicate runs on every B-tree mutation. MERGE issues INSERT and UPDATE paths against the index; the cache should serve both, but the claim/retire order across a MERGE-inserted-then-updated row is not traced here.
- set_value re-clones on every hit. For wide value lists this is a per-row copy cost even when the new value matches the old. Comparison-and-skip would help only if comparison is cheaper than pr_clone_value — unmeasured.
Sources
CUBRID source files (informational):
- src/query/subquery_cache.{c,h} — scalar-subquery cache; SQ_CACHE, SQ_KEY, SQ_VAL, SQ_TYPE definitions; budget macros.
- src/query/filter_pred_cache.{c,h} — server-wide filter-predicate cache; public fpcache_* API.
- src/query/memoize.{hpp,cpp} — generic per-XASL memoization; storage, key, value, result_code, C bridge.
- src/query/query_executor.c — wires memoize_* and sq_cache_destroy into qexec_execute_scan / mainblock / clear_xasl.
- src/query/query_evaluator.c, src/query/fetch.c — sq_cache probe at eval_pred / fetch_peek_dbval.
- src/transaction/locator_sr.c, src/transaction/log_manager.c — fpcache_claim/retire callers and fpcache_remove_by_class triggers (DDL + recovery).
- src/parser/xasl_generation.c — XASL_USES_SQ_CACHE set sites.
Cross-document references:
- cubrid-xasl-cache.md — the plan-level cache, complementary: the XASL cache memoises whole compiled plans; the runtime caches memoise pieces of execution under those plans. Both share the latch-free hashmap pattern via cubthread::lockfree_hashmap.
- cubrid-query-evaluator.md — eval_pred, fetch_peek_dbval, and the PRED_EXPR walker, one level above the sq_cache probe.
- cubrid-btree.md — owner of the BTIDs used as fpcache keys.