CUBRID Architecture Overview — Process Model, Layered Stack, and the Map Into the Detail Docs
Contents:
- Who this is for
- Process model
- Layered storage stack
- Query pipeline
- Concurrency, logging, recovery
- Distribution
- PL family
- Cross-cutting infrastructure
- Where to start reading
- Subcategory map
Who this is for
This document is the front door for the CUBRID code-analysis tree.
It is written for someone who has read zero lines of CUBRID source
and wants the high-level shape — the long-lived processes, the
layered stack inside each process, the dataflow of one query, the
sub-systems that keep concurrency and durability honest, and the
distribution surface that turns a single node into a cluster — before
deciding which of the ~70 detail docs to open. Every section is
deliberately schematic; the deep code analysis lives in the per-module
docs and is referenced by file path (cubrid-X.md) at every
opportunity. If a section feels thin, that is by design: this is a
router, not a duplicate.
A second audience is the engineer who has read several detail docs
and now wants to fit them together. The cross-cutting axes —
“how does a SELECT travel from the broker to the heap and back?”,
“what fires when cub_master declares this node the new HA primary?”,
“why does the page buffer talk to the log manager before flushing?” —
each cut across multiple subsystems. The diagrams here name the
boundaries those axes cross, so you can pick the right pair of
detail docs to read in tandem.
A third audience is anyone trying to understand why CUBRID
looks the way it does. CUBRID is an object-relational engine with
an OODB lineage (UniSQL → CUBRID), an HA story built on a separate
cub_master rather than embedded consensus, a broker tier that pre-
forks CAS workers and rendezvous-passes file descriptors, and a PL
family that runs Java in a sibling JVM process rather than embedded.
Each of these is a deliberate design choice with its own trade-offs;
the detail docs treat them in depth, and cubrid-design-philosophy.md
collects the rationale. This overview names the choices and points
at the docs that explain them.
Process model
CUBRID is a multi-process engine. Four long-lived process types and one short-lived utility class cover every operational scenario, and the set of processes a database is using right now is the first mental model worth having.
- cub_master — the per-host supervisor. One per host. Owns the master shared-memory anchor, listens on a Unix-domain socket for in-host process registration (servers, copylogdb, applylogdb, brokers register here), accepts inbound TCP connections from remote clients on the well-known port and hands the file descriptor to the right cub_server over its UDS, gossips liveness with peer cub_masters over UDP for HA, and runs the failover/failback FSM. Its work is supervisory — it does not own database state itself. Detail: cubrid-heartbeat.md, cubrid-network-protocol.md.
- cub_server — the database engine. One per database, in general. Owns volumes, the page buffer, log files, lock table, catalog, the optimizer, the XASL executor, MVCC bookkeeping, vacuum, and the entire storage stack. All client connections ultimately terminate here. Detail: cubrid-boot.md, cubrid-network-protocol.md, cubrid-thread-worker-pool.md, cubrid-thread-manager-ng.md.
- cub_pl — the PL JVM process. One per cub_server when Java/PL execution is enabled. Started by the server, runs Java stored procedures and PL/CSQL bytecode, talks back to the server’s session over a Unix-domain socket using the same CSS framing used elsewhere. Detail: cubrid-pl-javasp.md, cubrid-pl-plcsql.md.
- cub_broker + cub_cas — the broker tier. cub_broker is a parent process that pre-forks a fixed pool of cub_cas workers, exposes a TCP listener for JDBC/CCI/ODBC clients, rendezvous-passes accepted client sockets to an idle CAS via SCM_RIGHTS over a Unix-domain channel, and tracks the pool through a SysV shared-memory region. Each cub_cas is a long-lived per-client-session worker that proxies CSS-framed traffic to cub_server. Detail: cubrid-broker.md.
- SA-mode utilities — short-lived processes (cubrid loaddb, cubrid unloaddb, cubrid backupdb, cubrid compactdb, cubrid restoredb, …) that link the entire engine in-process via libcubridsa.so and operate on the on-disk database without a running cub_server. The same source tree compiles three ways (server / SA / CS) and a runtime dlopen picks the variant. Detail: cubrid-sa-cs-runtime.md, cubrid-loaddb.md, cubrid-backup-restore.md, cubrid-compactdb.md.
flowchart LR
subgraph CLIENT["Client side"]
APP["JDBC / CCI / ODBC / Python / PHP app"]
CSQL["csql (CS-mode)"]
UTILS["cubrid loaddb / backupdb / unloaddb<br/>(SA-mode utility)"]
end
subgraph BROKER["cub_broker host"]
BPARENT["cub_broker<br/>(TCP listener)"]
CAS1["cub_cas #1"]
CAS2["cub_cas #2"]
CASN["cub_cas #N"]
BSHM["SysV shm<br/>(broker control)"]
end
subgraph DBHOST["cub_server host"]
MASTER["cub_master<br/>(supervisor)"]
SERVER["cub_server<br/>(engine)"]
PL["cub_pl<br/>(JVM)"]
end
subgraph PEERS["Peer hosts (HA cluster)"]
PEER["cub_master @ peer"]
end
APP -- TCP --> BPARENT
BPARENT -. SCM_RIGHTS over UDS .-> CAS1
BPARENT -. SCM_RIGHTS over UDS .-> CAS2
BPARENT -. SCM_RIGHTS over UDS .-> CASN
BPARENT --- BSHM
CAS1 --- BSHM
CAS2 --- BSHM
CASN --- BSHM
CSQL -- TCP --> MASTER
CAS1 -- TCP --> MASTER
CAS2 -- TCP --> MASTER
CASN -- TCP --> MASTER
MASTER -. UDS / SCM_RIGHTS .-> SERVER
SERVER -- UDS --> PL
MASTER <-- UDP heartbeat --> PEER
UTILS -. dlopen libcubridsa .-> SERVER
UTILS -- direct volume I/O --> SERVER
Three properties make the diagram less mysterious than it looks.
(a) The TCP listener is on cub_master, not cub_server. Every
inbound connection — whether from csql, from a CAS, or from
another tool — first lands on cub_master’s well-known port; the
master parses the destination database name, finds the matching
cub_server in its registered-process table, and forwards the
file descriptor. The server then reads the protocol header off
the freshly-handed-to-it socket and continues. This indirection
is what allows multiple cub_server processes (one per database)
to share a single host without each owning a port.
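The handoff mechanism itself is standard SCM_RIGHTS descriptor passing over a Unix-domain socket. A minimal Python sketch of the idea (a toy stand-in, not CUBRID code; `pass_fd`, `receive_fd`, and the `demodb` tag are invented for illustration, and a pipe stands in for the accepted client socket):

```python
import os
import socket

def pass_fd(channel: socket.socket, fd: int, tag: bytes) -> None:
    """Hand an open file descriptor to a peer process over a Unix-domain
    socket, the way the master forwards an accepted client connection:
    the fd travels as SCM_RIGHTS ancillary data alongside a small tag
    (here modelling the destination database name)."""
    socket.send_fds(channel, [tag], [fd])

def receive_fd(channel: socket.socket) -> tuple[bytes, int]:
    """Receive one descriptor plus its tag; the receiver can then read
    the protocol header off the freshly-handed descriptor and continue."""
    msg, fds, _flags, _addr = socket.recv_fds(channel, 1024, 1)
    return msg, fds[0]

# Demo: simulate the master handing a connection to a server over a UDS pair.
master_side, server_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
r, w = os.pipe()                      # stand-in for an accepted client socket
pass_fd(master_side, r, b"demodb")    # "route this client to database demodb"
tag, handed_fd = receive_fd(server_side)
os.write(w, b"hello")                 # bytes written on the original side...
data = os.read(handed_fd, 5)          # ...arrive through the duplicated fd
```

The kernel duplicates the descriptor into the receiving process, so after the handoff both sides hold a live reference to the same open connection.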
(b) The broker is a separate tier from the server. It runs on its own host (or co-located, but logically separate), with its own pool size, its own connection routing, its own ACL, and its own monitoring. The CAS-to-server hop is itself a TCP connection that happens to terminate inside the same data center, and there is nothing stopping you from operating multiple brokers fanning into one or more servers.
(c) cub_pl is a child of cub_server, not of cub_master.
Java SPs and PL/CSQL execute inside their JVM; the server-side
SP call serialises the arguments and walks them across to the
JVM over a per-session UDS, then waits for the JVM to ship
results back. The server treats the JVM as a co-process, not as
a remote service.
Cross-references for the process model: cubrid-broker.md for
the CAS pool, rendezvous protocol, and broker control plane;
cubrid-heartbeat.md for the master-to-master UDP gossip;
cubrid-pl-javasp.md for the JVM IPC; cubrid-network-protocol.md
for the CSS framing and NRP dispatch; cubrid-sa-cs-runtime.md
for the three-way build variant and the utility classification.
Layered storage stack
Inside cub_server, the storage stack is a strict layering. Each
layer treats the layer below it as an abstraction and exports a
narrower abstraction upward. Understanding the layering is
the prerequisite for understanding any read or write path.
flowchart TB
subgraph CLIENT_WORKSPACE["Client-side workspace"]
WSP["MOP table<br/>(in-memory OID -> object cache)"]
SMC["SM_CLASS graph<br/>(in-memory schema)"]
end
subgraph SERVER["cub_server"]
LOC["locator_*_force<br/>(server-side bridge)"]
CAT["catalog_manager<br/>(_db_class, _db_attribute, ...)"]
HEAP["heap_manager<br/>(slotted pages, MVCC headers)"]
BTREE["btree<br/>(latch-coupled B+Tree)"]
EHASH["extendible_hash<br/>(directory + buckets)"]
OFLOW["overflow_file<br/>(big-record / overflow-OID chain)"]
PB["page_buffer<br/>(BCB array, three-zone LRU)"]
DWB["double_write_buffer<br/>(torn-write protection)"]
DM["disk_manager<br/>(volumes, sectors, files, pages)"]
end
VOLS[("On-disk volumes<br/>(_dbname / _dbname_t / _lgar*)")]
WSP --> LOC
SMC --> LOC
LOC --> HEAP
LOC --> BTREE
LOC --> CAT
CAT --> HEAP
CAT --> BTREE
CAT --> EHASH
HEAP --> OFLOW
BTREE --> OFLOW
HEAP --> PB
BTREE --> PB
EHASH --> PB
OFLOW --> PB
CAT --> PB
PB --> DWB
DWB --> DM
PB --> DM
DM --> VOLS
Reading bottom-up:
- disk_manager owns volumes, sectors, files, and pages. A volume is one OS file; a sector is 64 contiguous pages and the allocation unit; a file is a sector bundle; a page is the I/O unit. The disk manager hands out and reclaims sectors, extends volumes when the disk cache says we are out of room, and separates permanent and temporary purposes so that temp files cannot starve the permanent space. Detail: cubrid-disk-manager.md.
- page_buffer + double_write_buffer sit directly on top. The page buffer maps (VPID -> BCB -> in-memory frame) through a per-bucket hash, runs a three-zone LRU split into per-thread private lists with adjustable quotas plus a shared list, hands victims directly to sleeping waiters via lock-free queues, and protects each BCB with a custom read/write/flush latch. Every dirty page goes through DWB before it lands at its home location, so a torn write at the home page is recoverable from the DWB copy. Detail: cubrid-page-buffer-manager.md, cubrid-double-write-buffer.md.
- heap_manager / btree / extendible_hash are the three on-page record organisations that sit on the page buffer. heap_manager stores variable-length user records in slotted pages with MVCC headers (insert MVCCID, delete MVCCID, prev-version chain). btree is a latch-coupled B+Tree with key||OID concatenation and unique-constraint enforcement at the OID-suffix level. extendible_hash is a Fagin-style hash file with a doubling directory used for class-name lookup, repr-id lookup, and a few internal dedup tables. Big records and big OID lists spill into overflow_file chains. Detail: cubrid-heap-manager.md, cubrid-btree.md, cubrid-extendible-hash.md, cubrid-overflow-file.md, cubrid-tde.md (encrypts page contents in/out of the buffer).
- catalog_manager stores per-class disk representation and statistics in a dedicated catalog file (anchored by CTID), with a parallel set of user-visible system classes (_db_class, _db_attribute, _db_index, _db_serial, _db_user, _db_authorization, _db_trigger) bootstrapped from a fixed root-class OID. Everything from SHOW STATS to authorization resolves through the catalog. Detail: cubrid-catalog-manager.md, cubrid-statistics.md, cubrid-authentication.md, cubrid-serial.md.
- SM_CLASS (class-object) is the in-memory schema graph that the workspace materialises from the catalog. It carries attributes, methods, partitions, constraints, triggers, and the partition-rule descriptor. The OODB lineage shows here: the schema is an object graph, not a flat relation, and DDL manipulates it before persisting. Detail: cubrid-class-object.md, cubrid-ddl-execution.md, cubrid-trigger.md, cubrid-partition.md.
- locator is the bridge between the in-memory side and the on-disk side. The client-side workspace batches dirty objects into LC_COPYAREA buffers and ships them to a server-side locator_*_force family that fans out into heap, btree, lock, log, FK, and replication paths through one canonical entry point. Triggers and integrity rules also fire from here. Detail: cubrid-locator.md.
- Client-side workspace is the topmost layer. The MOP table is the in-memory OID-to-object cache; it carries dirty bits, pin counts, and the materialised SM_CLASS graphs the application sees. ESQL/embedded-SQL, the DBI/CCI clients, and the broker’s CAS all read and write through this workspace. Detail: cubrid-class-object.md, cubrid-dbi-cci.md.
The crucial structural fact is that every record-organisation layer talks to the page buffer, never to the disk manager directly. This is what lets the buffer manager enforce the WAL ordering: a heap page mutation produces a log record first (routed through the prior list to the log manager), and the buffer manager refuses to flush the dirty page until the matching log LSA is durable. Same for B+Tree, ehash, and catalog mutations.
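The WAL-before-flush rule can be sketched in a few lines of Python (a toy model under invented names, not CUBRID's structures; `WalLog` stands in for the log manager, `BufferManager` for the page buffer):

```python
class WalLog:
    """Toy log manager: append returns an LSA; flush advances durability."""
    def __init__(self):
        self.next_lsa = 0
        self.durable_lsa = -1
        self.records = []

    def append(self, rec) -> int:
        lsa = self.next_lsa
        self.records.append(rec)
        self.next_lsa += 1
        return lsa

    def flush(self) -> None:
        self.durable_lsa = self.next_lsa - 1   # everything appended is now durable

class BufferManager:
    """Enforces write-ahead logging: a dirty page may reach disk only
    after the log record describing its mutation is durable."""
    def __init__(self, log: WalLog):
        self.log = log
        self.pages = {}                        # vpid -> (data, page_lsa, dirty)

    def mutate(self, vpid, data):
        lsa = self.log.append((vpid, data))    # log record first...
        self.pages[vpid] = (data, lsa, True)   # ...then dirty the page

    def try_flush(self, vpid) -> bool:
        data, page_lsa, dirty = self.pages[vpid]
        if page_lsa > self.log.durable_lsa:
            return False                       # WAL rule: refuse until the log catches up
        self.pages[vpid] = (data, page_lsa, False)
        return True
```

A `try_flush` right after `mutate` fails; it succeeds only once `log.flush()` has made the matching record durable, which is exactly the ordering the real buffer manager enforces per-page via the log LSA.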
Query pipeline
A SELECT or DML statement walks a long pipeline from text to result rows. The pipeline is split between client side and server side, with serialisation across the wire at well-defined points, but the conceptual stages are uniform.
flowchart LR
SQL["SQL text"]
PARSER["parser<br/>(Flex+Bison -> PT_NODE)"]
SC["semantic_check<br/>(name resolve, type check, CNF)"]
REW["query_rewrite<br/>(LIMIT lowering, view inlining,<br/>subquery flatten, predicate reduce)"]
OPT["query_optimizer<br/>(QO_ENV graph, DP join enum,<br/>System R cost model)"]
XGEN["xasl_generator<br/>(QO_PLAN -> XASL_NODE tree)"]
XCACHE["xasl_cache<br/>(SHA-1 keyed plan cache)"]
XEXEC["query_executor<br/>(Volcano open/next/close)"]
SCAN["scan_manager<br/>(SCAN_ID dispatch)"]
AM["access methods<br/>(heap / btree / list / set /<br/>value / json-table / show / dblink)"]
EVAL["query_evaluator<br/>(PRED_EXPR, regu_variable)"]
POST["post_processing<br/>(group by / aggregates /<br/>window / order by)"]
LF["list_file<br/>(QFILE_LIST_ID, spill to FILE_TEMP)"]
CUR["cursor / dbi-cci<br/>(client fetch handle)"]
ROWS["result rows"]
SQL --> PARSER --> SC --> REW --> OPT --> XGEN --> XCACHE
XCACHE --> XEXEC
XEXEC --> SCAN --> AM
XEXEC --> EVAL
AM --> EVAL
XEXEC --> POST
POST --> LF
AM --> LF
XEXEC --> LF
LF --> CUR --> ROWS
REW -. cache hit shortcut .-> XCACHE
The pipeline has six structural facts worth naming:
(a) Compilation runs once, execution runs many. Stages 1-5
(parse → semantic check → rewrite → optimize → XASL generate)
produce a serialised XASL tree that is cached server-wide in
xasl_cache, keyed on a SHA-1 of the rewritten SQL. The second
execute of the same SQL skips compilation and pulls the XASL
straight from the cache. The cache also tracks per-class OIDs so
DDL on any referenced class invalidates dependent entries.
Detail: cubrid-xasl-cache.md.
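The cache-plus-invalidation shape can be sketched as follows (a toy model; `XaslCache` and its methods are invented names, and real entries hold serialised XASL plus pin counts, not strings):

```python
import hashlib

class XaslCache:
    """Toy plan cache keyed on the SHA-1 of the (rewritten) SQL text;
    each entry remembers the classes it reads so DDL on any of them
    can invalidate every dependent plan."""
    def __init__(self):
        self.entries = {}        # sha1 hex digest -> (plan, classes)
        self.by_class = {}       # class oid -> set of dependent digests

    @staticmethod
    def key(sql: str) -> str:
        return hashlib.sha1(sql.encode()).hexdigest()

    def put(self, sql, plan, classes):
        k = self.key(sql)
        self.entries[k] = (plan, frozenset(classes))
        for c in classes:
            self.by_class.setdefault(c, set()).add(k)

    def get(self, sql):
        hit = self.entries.get(self.key(sql))
        return hit[0] if hit else None           # None models a compile-required miss

    def invalidate_class(self, class_oid):
        """DDL path: drop every plan that referenced the altered class."""
        for k in self.by_class.pop(class_oid, set()):
            self.entries.pop(k, None)
```

The second `get` of the same SQL is a pure hash lookup; only an `invalidate_class` (DDL) forces recompilation.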
(b) The optimizer’s plan and the executor’s plan are
different IRs. Optimization works over a QO_PLAN tree of
QO_NODE/QO_SEGMENT/QO_TERM graph objects with a System-R-
style fixed-cpu/io plus variable-cpu/io cost model. The XASL
generator then lowers the surviving QO_PLAN into a
recursively-shaped XASL_NODE tree with aptr/dptr/scan_ptr
slots, REGU_VARIABLE IRs for value derivation, ACCESS_SPEC
for scan parameters, and an OUTPTR_LIST for the result row.
Only XASL is ever serialised across the wire and persisted in
the cache. Detail: cubrid-query-optimizer.md,
cubrid-xasl-generator.md.
(c) The executor is Volcano-style. qexec_execute_main_block
dispatches by xasl->type and drives a uniform open/next/close
loop over SCAN_ID operators. The scan manager is the polymorphic
access-method catalogue: heap, B+Tree, list-file, set, value,
JSON-table, dblink, show, parallel-heap, and method scans all
present the same protocol. Detail: cubrid-query-executor.md,
cubrid-scan-manager.md.
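The open/next/close contract is easy to see in miniature (a generic Volcano sketch, not CUBRID's operator structs; `HeapScan` and `Filter` are illustrative names):

```python
class HeapScan:
    """Leaf operator: yields rows from an in-memory 'heap'."""
    def __init__(self, rows):
        self.rows = rows

    def open(self):
        self.pos = 0

    def next(self):
        if self.pos >= len(self.rows):
            return None                      # exhausted
        row = self.rows[self.pos]
        self.pos += 1
        return row

    def close(self):
        pass

class Filter:
    """Interior operator: pulls from its child, keeps qualifying rows.
    Every operator presents the same three-call protocol, so the parent
    never knows what access method sits below it."""
    def __init__(self, child, pred):
        self.child, self.pred = child, pred

    def open(self):
        self.child.open()

    def next(self):
        while (row := self.child.next()) is not None:
            if self.pred(row):
                return row
        return None

    def close(self):
        self.child.close()

def run(plan):
    """Drive the root operator to completion, Volcano style."""
    plan.open()
    out = []
    while (row := plan.next()) is not None:
        out.append(row)
    plan.close()
    return out
```

`run(Filter(HeapScan([...]), pred))` pulls one tuple at a time through the whole tree, which is why adding a new scan type only means implementing the same three calls.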
(d) Predicates evaluate through a separate engine. Each
pulled tuple is filtered by eval_pred walking a PRED_EXPR
tree of T_PRED boolean nodes and T_EVAL_TERM leaves under three-
valued logic; every leaf calls fetch_peek_dbval which
dispatches on REGU_VARIABLE::type (constant, attribute fetch,
list-file position, arithmetic expression, function call, host
variable, OID, list-id) into a path-specific resolver. The
arithmetic and string operators that the regu-variable engine
calls into are themselves a separate operator-primitive layer.
Detail: cubrid-query-evaluator.md,
cubrid-scalar-functions.md.
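The three-valued filtering rule itself is small enough to sketch (Kleene logic; these helper names are invented and the real eval_pred works over PRED_EXPR trees, not Python values):

```python
UNKNOWN = None   # comparisons against SQL NULL yield the third truth value

def and3(a, b):
    """Kleene AND: False dominates; otherwise UNKNOWN propagates."""
    if a is False or b is False:
        return False
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return True

def or3(a, b):
    """Kleene OR: True dominates; otherwise UNKNOWN propagates."""
    if a is True or b is True:
        return True
    if a is UNKNOWN or b is UNKNOWN:
        return UNKNOWN
    return False

def cmp_eq(x, y):
    """Equality against NULL is UNKNOWN, never True or False."""
    return UNKNOWN if x is None or y is None else x == y

def row_qualifies(pred_result):
    """A WHERE clause keeps a row only when the predicate is True;
    both False and UNKNOWN filter it out."""
    return pred_result is True
```

This is why `WHERE col = NULL` matches nothing: the comparison is UNKNOWN, and UNKNOWN does not qualify.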
(e) Materialisation is uniform via list-file. Every
sub-query result, sort output, hash-build side, group-by
accumulator, and final query result is one QFILE_LIST_ID
abstraction backed by a per-query QMGR_TEMP_FILE (membuf-
then-FILE_TEMP) substrate. Operators read upstream and write
downstream through the same open/add/scan/close contract.
Detail: cubrid-list-file.md.
(f) Specialised post-processing. Group-by, aggregates,
and analytic/window functions live in their own pass that
chooses sort-based or hash-based GROUP BY at runtime and falls
back to external sort when the hash table outgrows budget;
hash join is similarly a separate Build/Probe driver with three
table-layout strategies and grace-style spilling; parallel
query is yet another orchestrator on top of a global parallel-
query worker pool. Detail: cubrid-post-processing.md,
cubrid-hash-join.md, cubrid-parallel-query.md,
cubrid-external-sort.md, cubrid-runtime-memoization.md.
The client-facing tail is the cursor: db_query_first_tuple /
db_query_get_tuple_value (cubrid-dbi-cci.md) and
cubrid-cursor.md cover the broker- and DBI-side handle that
locks onto the server’s list file and pages tuples one
network-page at a time.
Concurrency, logging, recovery
CUBRID is an MVCC engine with row-level locks for write conflicts and ARIES three-pass restart for crash recovery. Three timelines co-exist: the transactional timeline of MVCCIDs and locks, the physical timeline of WAL records and LSAs, and the page timeline of dirty buffer slots and flushes. The recovery story glues them.
flowchart LR
TX["TDES (transaction descriptor)<br/>cubrid-transaction.md"]
MVCC["MVCC table<br/>(active MVCCIDs, snapshots)<br/>cubrid-mvcc.md"]
LOCK["lock manager<br/>(per-OID multi-granularity)<br/>cubrid-lock-manager.md"]
PRIOR["prior_list<br/>(per-tx WAL queue)<br/>cubrid-prior-list.md"]
LOG["log_manager<br/>(LSA, append page, archive)<br/>cubrid-log-manager.md"]
CHKPT["checkpoint<br/>(fuzzy ARIES, redo-LSA hint)<br/>cubrid-checkpoint.md"]
RECOV["recovery_manager<br/>(analysis / redo / undo)<br/>cubrid-recovery-manager.md"]
VAC["vacuum<br/>(reclaim dead versions via WAL replay)<br/>cubrid-vacuum.md"]
PB["page_buffer + DWB"]
HEAP["heap / btree / catalog"]
TX --> MVCC
TX --> LOCK
TX --> PRIOR
PRIOR --> LOG
LOG --> PB
HEAP -- mvcc header writes --> MVCC
HEAP -- WAL records --> PRIOR
CHKPT -- LOG_START_CHKPT / LOG_END_CHKPT --> LOG
CHKPT -- redo-LSA hint --> PB
RECOV -- replays --> LOG
RECOV -- restores --> PB
VAC -- forward log walk --> LOG
VAC -- physical reclaim --> PB
VAC --> MVCC
The transactional core is the TDES (transaction descriptor) —
one per active transaction, kept in a server-wide trantable. It
carries the transaction’s MVCCID, isolation level, savepoint
stack, lock list, and tail of WAL records. MVCC and the lock
manager both index off the TDES. Isolation levels (READ COMMITTED,
REPEATABLE READ, SERIALIZABLE) are dispatched through the snapshot
construction in mvcctable::build_mvcc_info plus the lock-mode
mapping in lock_manager. Detail: cubrid-transaction.md,
cubrid-mvcc.md, cubrid-lock-manager.md.
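The snapshot-visibility rule that underpins all isolation levels can be sketched compactly (a toy model with invented names; the real check lives in the MVCC table and heap-record headers):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Snapshot:
    """What a reader captured at snapshot time: the next MVCCID to be
    handed out, and the set of MVCCIDs that were still in flight."""
    next_mvccid: int          # ids >= this started after the snapshot
    active: frozenset         # ids in flight when the snapshot was taken

def committed_in(snap: Snapshot, mvccid: int) -> bool:
    """An MVCCID is committed w.r.t. the snapshot iff it was assigned
    before the snapshot and was not still active at snapshot time."""
    return mvccid < snap.next_mvccid and mvccid not in snap.active

def visible(snap: Snapshot, ins_id: int, del_id=None) -> bool:
    """A row version is visible when its inserter committed in the
    snapshot and its deleter (if any) did not."""
    if not committed_in(snap, ins_id):
        return False
    return del_id is None or not committed_in(snap, del_id)
```

READ COMMITTED rebuilds the snapshot per statement while REPEATABLE READ keeps one per transaction, but both evaluate exactly this predicate against the row's insert/delete MVCCIDs.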
The WAL pipeline has a deliberately split shape. Every record
mutation produces a LOG_PRIOR_NODE on the per-transaction prior
list — a singly-linked queue protected by a single short-held
mutex, drained periodically by the log-flush daemon. Group commit
emerges naturally from queue batching: when a transaction commits,
its commit record joins the same prior list and the next drain
flushes the whole batch under one log critical-section pass.
Detail: cubrid-prior-list.md, cubrid-log-manager.md.
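The batching that yields group commit can be sketched as a steal-the-queue drain (a toy model; `PriorList` and its fields are invented, and the real drain appends to the log page under the log critical section):

```python
import threading

class PriorList:
    """Toy prior list: producers append under a short-held mutex; one
    drain steals the whole queue and makes it durable in a single pass.
    Commit records ride the same queue, so every drain that carries N
    commit records is an N-way group commit."""
    def __init__(self):
        self.mutex = threading.Lock()
        self.pending = []        # appended, not yet durable
        self.durable = []        # stand-in for the on-disk log

    def append(self, node):
        with self.mutex:         # short critical section: just a link-in
            self.pending.append(node)

    def drain(self) -> int:
        with self.mutex:         # steal the whole queue...
            batch, self.pending = self.pending, []
        self.durable.extend(batch)   # ...then do the slow I/O outside the lock
        return len(batch)
```

Three transactions appending a data record and a commit record each are flushed by one drain, which is the "group commit emerges from queue batching" claim in miniature.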
Checkpoint is fuzzy-ARIES. A periodic daemon emits a
LOG_START_CHKPT / LOG_END_CHKPT pair carrying an active-
transaction snapshot and a redo-LSA hint derived from the page-
buffer’s dirty list. The hint advances log_Gl.hdr.chkpt_lsa so
that the next analysis pass can skip everything below it. The
checkpoint does not force-flush all dirty pages; it captures
the snapshot and trusts the bgwriter / page-buffer victim path
to keep up. Detail: cubrid-checkpoint.md.
Recovery is three passes. The analysis pass scans forward
from chkpt_lsa reconstructing the active-transaction table.
The redo pass replays page-mutating records on each affected
page, parallelised through a per-page worker pool. The undo
pass walks the active-transactions’ WAL backward applying
compensating log records. Per-record-type dispatch is through
the RV_fun[] table — every record carries an RVCODE whose
recovery handlers are statically registered. Detail:
cubrid-recovery-manager.md, cubrid-2pc.md (in-doubt
transactions surface from the analysis pass).
Vacuum is the MVCC reclamation engine. It walks the WAL
forward in fixed-size blocks below an oldest-visible-MVCCID
watermark, dispatching per-block jobs from a master to a worker
pool. Dead versions identified by their delete-MVCCID below the
watermark are physically removed from heap pages and B+Tree
leaves; dropped files are tracked separately so vacuum does not
chase pages that no longer belong to any class. Detail:
cubrid-vacuum.md.
Double-write buffer completes the durability story. Every
dirty page is staged into the sequential, fixed-size DWB
volume and fsync’d before the home write is issued. A torn
write at the home page is recoverable from the DWB copy on
restart, before the log replay begins. Detail:
cubrid-double-write-buffer.md.
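The stage-then-repair sequence can be sketched with two dicts standing in for the DWB volume and the home pages (a toy model; `tear_home` simulates a crash mid home-write, and the real DWB is a fixed-size sequential area with its own fsync discipline):

```python
class DoubleWriteBuffer:
    """Toy torn-write protection: every dirty page is staged into the
    DWB area (and made durable there) before the home-location write;
    on restart, any torn home page is repaired from its DWB copy."""
    def __init__(self):
        self.dwb = {}        # staging area (sequential DWB volume stand-in)
        self.home = {}       # home locations; None models a torn write

    def write_page(self, vpid, data, tear_home=False):
        self.dwb[vpid] = data                    # stage + fsync first...
        self.home[vpid] = None if tear_home else data   # ...then home write

    def recover(self):
        """Runs before log replay: repair torn home pages from DWB copies."""
        for vpid, data in self.dwb.items():
            if self.home.get(vpid) is None:
                self.home[vpid] = data
```

Because the DWB copy is complete before the home write starts, a crash can tear at most one of the two locations, and recovery always finds an intact copy.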
The non-obvious cross-cuts: (i) MVCCID assignment is
transactionless for read-only — only writes consume MVCCIDs
from the global counter, which keeps the active-MVCCID set bounded
even under heavy read workloads; (ii) lock latches and page latches
are intentionally separate (page latches are short, embedded in the
BCB, unrelated to isolation — see cubrid-page-buffer-manager.md’s
“Lock vs latch separation” note); (iii) backup, restore, and
flashback all pivot on the same WAL the recovery manager replays
(cubrid-backup-restore.md, cubrid-flashback.md).
Distribution
CUBRID’s distribution layer turns a single cub_server into one
member of a master/standby HA cluster, exposes change-streaming and
flashback over the WAL, and supports XA-driven 2PC.
flowchart TB
subgraph CLUSTER["HA cluster"]
M["master cub_master + cub_server"]
S1["slave cub_master + cub_server"]
S2["slave cub_master + cub_server"]
M <-- UDP gossip --> S1
M <-- UDP gossip --> S2
S1 <-- UDP gossip --> S2
end
subgraph REPLICATION["Logical-log replication"]
CL["copylogdb<br/>(ships log volumes master->slave)"]
AL["applylogdb<br/>(la_apply_log_file)"]
REPLD["LOG_REPLICATION_DATA / _STATEMENT"]
M -- WAL --> CL
CL -- log archives --> S1
S1 -- reads archives --> AL
AL -- per-record dispatch --> S1
M -- emits --> REPLD
REPLD -. carried in WAL .-> AL
end
subgraph CDC["Change Data Capture"]
CDCAPI["cdc_* API<br/>(LOG_SUPPLEMENTAL_INFO walker)"]
CDCCLIENT["downstream consumer"]
M -- log_reader --> CDCAPI
CDCAPI --> CDCCLIENT
end
subgraph TWOPC["XA 2PC"]
COORD["coordinator<br/>(XA tm or internal)"]
TPCFSM["LOG_2PC_PREPARE / _COMMIT_DECISION"]
INDOUBT["in-doubt recovery<br/>(analysis pass)"]
COORD --> TPCFSM
TPCFSM --> M
INDOUBT -. surfaces from .-> TPCFSM
end
subgraph BACKUP["Backup / Restore / Flashback"]
BKP["backupdb<br/>(start_lsa marker + log archive)"]
RST["restoredb<br/>(volume restore + log replay)"]
FB["flashback<br/>(per-tx summary + replay)"]
M -- volumes --> BKP
M -- WAL --> BKP
BKP --> RST
M -- WAL --> FB
end
The distribution decisions follow CUBRID’s “local decision over quorum consensus” stance:
- Heartbeat is independent per node. Each cub_master computes a peer-table score on a timer, picks the minimum score as master, and falls into failover via TO_BE_MASTER with a second score-recompute gate. There is no consensus protocol — every node decides locally from its own peer table. Witness hosts (ha_ping_hosts) are the split-brain guard. Detail: cubrid-heartbeat.md.
- HA replication is logical-log based. The master engine emits LOG_REPLICATION_DATA / LOG_REPLICATION_STATEMENT records alongside its physiological WAL during DML. A separate copylogdb daemon ships archived log volumes to the slave; applylogdb walks them forward through la_apply_log_file and dispatches per-record-type back into the storage layer for serialised, transactionally consistent replay. Detail: cubrid-ha-replication.md.
- CDC walks the same WAL forward looking at LOG_SUPPLEMENTAL_INFO records through log_reader. The modern cdc_* API exposes this to downstream consumers; the older la_* HA applier is the same shape internally. Detail: cubrid-cdc.md.
- Two-phase commit uses prepared-state log records that survive crash. The coordinator and participant FSMs are encoded in LOG_2PC_* records; in-doubt transactions surface during the analysis pass of recovery and either commit or abort based on the recorded decision. XA-driven distributed transactions and internal nested coordinators share the same machinery. Detail: cubrid-2pc.md.
- Flashback answers “what did transactions T1..Tn do between time A and B” by walking the log forward in two phases — a per-transaction summary, then a per-transaction detailed log-info pull — sharing the CDC entry format but read against archived log volumes. Detail: cubrid-flashback.md.
- Backup/restore is online physical backup: a snapshot of data volumes plus the log records bracketed by a start_lsa marker and the next checkpoint. Restore replays the log forward up to a user-supplied stop time for point-in-time recovery. Detail: cubrid-backup-restore.md.
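The local-decision election rule can be sketched in a few lines (a deliberately simplified model: function names, the tie-break, and the score values are invented, and the real FSM adds the TO_BE_MASTER recompute gate and witness-host checks):

```python
def elect_master(peer_scores: dict) -> str:
    """Local master election, minimum-score style: every node applies the
    same deterministic rule to its own peer table, so all healthy peers
    reach the same answer without exchanging votes. Ties break on node
    name to keep the choice deterministic."""
    return min(peer_scores, key=lambda name: (peer_scores[name], name))

def failover_view(peer_scores: dict, dead: set) -> str:
    """On missed heartbeats, each node independently drops the dead peer
    from its own table and re-applies the same rule; no consensus round
    is run."""
    alive = {n: s for n, s in peer_scores.items() if n not in dead}
    return elect_master(alive)
```

The trade-off named above is visible here: agreement depends on every node seeing the same peer table, which is why witness hosts are needed to guard against a partitioned node electing itself.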
The unifying observation is that the WAL is CUBRID’s single event log. Recovery, vacuum, replication, CDC, flashback, and backup all read the same record stream; they differ only in which record types they care about and in which direction they walk it.
PL family
Procedural extensions in CUBRID — Java stored procedures and PL/CSQL — run in a sibling JVM process, not embedded in the server. The decision to put PL in its own process makes the CUBRID server immune to JVM stalls, GC pauses, and user-code crashes.
flowchart LR
subgraph CSERVER["cub_server (C)"]
SP_CALL["SP call site<br/>(qexec_execute_proc)"]
SP_BRIDGE["sp_bridge / sp_send_call_info"]
SP_CTL["sp_pl_socket<br/>(per-session UDS)"]
end
subgraph CUBPL["cub_pl (JVM)"]
LISTENER["listener thread<br/>(accepts session UDS)"]
DISPATCH["dispatch handler"]
JAVASP["JavaSP runtime<br/>(reflection on user JAR)"]
PLCSQL["PL/CSQL runtime<br/>(compiled to Java AST -> JAR)"]
JDBC["server-side JDBC bridge<br/>(callbacks to cub_server)"]
CLASSLOADER["classloader hierarchy<br/>(parent: system, child: SP-class)"]
end
CATALOG[("_db_stored_procedure<br/>_db_stored_procedure_code")]
SP_CALL --> SP_BRIDGE --> SP_CTL
SP_CTL -- UDS / CSS framing --> LISTENER
LISTENER --> DISPATCH
DISPATCH --> JAVASP
DISPATCH --> PLCSQL
JAVASP --> CLASSLOADER
PLCSQL --> CLASSLOADER
JAVASP -- queries / DML --> JDBC
PLCSQL -- queries / DML --> JDBC
JDBC -- CSS frames --> SP_CTL
CSERVER -- catalog rows --> CATALOG
CUBPL -. reads via JDBC .-> CATALOG
Three properties shape the PL family:
(a) Two languages, one runtime. Both Java SPs and PL/CSQL
execute in the same cub_pl JVM. PL/CSQL is parsed by an ANTLR 4
grammar inside pl_server, lowered to a CUBRID-specific Java AST
(DeclProgram / StmtBlock / ExprBinaryOp / loopOpt),
emitted as Java source by a visitor, compiled by an in-process
javax.tools.JavaCompiler, packaged as a JAR (Base64), and
returned to the C-side compile_handler so
sp_add_stored_procedure_code can persist it next to the Java
SP catalog rows. At call time both routes look identical to the
dispatch handler. Detail: cubrid-pl-plcsql.md,
cubrid-pl-javasp.md.
(b) The JVM calls back to the server through JDBC. When a stored procedure issues a query or DML, the JVM does not call the C-side server in-process; it goes through a server-side JDBC driver that routes through the same per-session UDS that received the SP call. From the engine’s point of view, PL is a privileged client — same CSS framing, same NRP dispatch, same prepared- statement cache.
(c) The classloader hierarchy is JavaSP-specific. PL/CSQL classes are generated by CUBRID and trusted; user JARs are loaded under a child classloader and a security manager that restricts what the SP code can do. The reflective dispatch on user JARs is JavaSP’s responsibility alone.
Detail docs: cubrid-pl-javasp.md for the Java SP wire
protocol and JVM IPC; cubrid-pl-plcsql.md for the ANTLR
grammar, AST, emitter, and JAR packaging.
Cross-cutting infrastructure
These subsystems do not fit cleanly into any single layer above because they are touched by every layer. Each gets a short paragraph here; the full coverage lives in the per-doc files.
- Boot. cubrid-boot.md covers cub_server startup — first-time createdb formats volumes and bootstraps the root-class catalog; restart hands off to log_recovery’s three-pass replay; the client side wires boot_restart_client to xboot_register_client over the network. Boot is also where every subsystem’s init function fires in order, and where the thread pools, page buffer, log manager, lock manager, and catalog all come online.
- Server session. cubrid-server-session.md describes the per-client server-side state container — a SESSION_STATE keyed by an integer session id in a lock-free hash, cached on the connection entry for O(1) request lookup, and bound to the per-thread TDES so that every server request lands on its rightful transaction descriptor, prepared-statement cache, and parameter set.
- Thread + worker pool (legacy). cubrid-thread-worker-pool.md describes how every server thread of execution structures itself — the per-thread cubthread::entry context, the worker_pool template (cores → workers → task queue) that runs queries / vacuum / loaddb / parallel-redo, the daemon + looper pattern that drives every periodic background flush and detect, the lock-free hashmap shared by lock manager and page buffer, and the heavyweight csect RW primitive with its per-thread tracker.
- Thread manager NG (CBRD-26177). cubrid-thread-manager-ng.md describes the connection/worker pool redesign — bounded epoll-driven connection workers, a coordinator brokering rebalancing and auto-scaling, send/recv budgets, per-worker context freelists, and atomic-free statistics — replacing the legacy thread-per-connection plus max_clients-task-worker layout.
- Network protocol. cubrid-network-protocol.md covers how every server entry point is framed as one NET_SERVER_* opcode dispatched through a static (action_attribute, handler) table — connections accepted by cub_master, handed to cub_server workers via master::connector over a Unix-domain socket, then driven by an epoll-based connection worker reading CSS-framed packets and delegating to symmetric or_pack_* / or_unpack_* request marshalling on both sides.
- Broker. cubrid-broker.md covers the CAS pool — cub_broker parent forks a fixed pool of cub_cas workers, exposes a single TCP listener, hands accepted client sockets to an idle CAS through a Unix-domain rendezvous channel using SCM_RIGHTS file-descriptor passing, and lets each CAS proxy CSS-framed traffic upstream to cub_server — all coordinated through one SysV shared-memory segment.
- Error management. cubrid-error-management.md covers the global ER_* enum, the per-thread cuberr::context with its base er_message and nested-error stack, the er_set family with printf-style format compilation, the localised cubrid.msg / csql.msg / utils.msg catalogs in NetBSD/FreeBSD nl_catd format, the cubrid_*.err log file with size-based rotation, and the wire format that flattens an error to three OR_INT fields plus the message string for client-server propagation.
- System parameters. cubrid-system-parameters.md covers the prm_Def[] registry, the cubrid.conf INI parser with section selection, environment-variable overrides, the db_set_system_parameters SQL path, and the per-session SESSION_PARAM array — the one ordered resolution flow that every other subsystem reads through prm_get_*_value.
- Monitoring. cubrid-monitoring.md covers two layered counter systems — a C++ template-based cubmonitor library that registers groups of statistics and supports per-transaction sheets, and the older C perf_monitor / pstat_Metadata array used by SHOW STATS and statdump — plus per-subsystem monitors (e.g., the per-vacuum-worker overflow-page threshold tracker).
- DBI / CCI. cubrid-dbi-cci.md covers the client API — the single db_* C API on top of boot_cl and network_cl that walks every statement through a four-stage FSM (Initial → Compiled → Prepared → Executed) inside a DB_SESSION, and the broker-side wire driver (ux_database_connect, ux_prepare, ux_execute, ux_fetch, ux_end_tran) dispatched by a flat server_fn_table so JDBC, CCI, ODBC, Python, and PHP all reach the engine through the same db_* core.
- SA / CS runtime. cubrid-sa-cs-runtime.md covers how the same source tree compiles three times — cub_server (SERVER_MODE), libcubridsa (SA_MODE), libcubridcs (CS_MODE) — so admin utilities can either embed the entire engine in-process and operate on the on-disk database directly, or talk over CSS to a separately running daemon, with the choice driven by per-utility classification (SA_ONLY / CS_ONLY / SA_CS) and a runtime dlopen of either libcubridsa.so or libcubridcs.so.
Four more cross-cutting docs round out the surface but are
narrower in scope: cubrid-charset-collation.md for codeset
conversion and locale-aware comparison; cubrid-timezone.md
for IANA tzdata compilation and DATETIMETZ/TIMESTAMPTZ
conversion; cubrid-show-commands.md for the introspection
virtual scans; cubrid-runtime-memoization.md for the trio of
runtime caches sharing one playbook.
The base / infrastructure substrate — custom memory allocators
and lock-free primitives — has its own subcategory under
cubrid-overview-base-infra.md. AREA (the slab pool for
fixed-size objects), the per-thread Lea-heap private allocator, and
the six lock-free docs (overview, transactional reclamation,
bitmap, freelist, hashmap, circular queue) all live there.
This is the family every layer above composes with rather than
depending on.
Where to start reading
Different goals lead into the tree at different points. The table below names the canonical first doc and the next two or three to follow.
| Goal | Start at | Then |
|---|---|---|
| Understand a SELECT end to end | cubrid-rpath-select.md | cubrid-broker.md -> cubrid-network-protocol.md -> cubrid-parser.md -> cubrid-semantic-check.md -> cubrid-query-rewrite.md -> cubrid-query-optimizer.md -> cubrid-xasl-generator.md -> cubrid-query-executor.md -> cubrid-scan-manager.md -> cubrid-list-file.md -> cubrid-cursor.md |
| Understand a COMMIT (write path) | cubrid-rpath-write.md | cubrid-locator.md -> cubrid-heap-manager.md -> cubrid-btree.md -> cubrid-mvcc.md -> cubrid-lock-manager.md -> cubrid-prior-list.md -> cubrid-log-manager.md -> cubrid-double-write-buffer.md -> cubrid-page-buffer-manager.md |
| Understand a server restart | cubrid-rpath-recovery.md | cubrid-boot.md -> cubrid-checkpoint.md -> cubrid-recovery-manager.md -> cubrid-2pc.md -> cubrid-vacuum.md |
| Understand DDL on a class | cubrid-ddl-execution.md | cubrid-class-object.md -> cubrid-catalog-manager.md -> cubrid-locator.md -> cubrid-xasl-cache.md -> cubrid-trigger.md -> cubrid-partition.md |
| Understand HA failover | cubrid-heartbeat.md | cubrid-ha-replication.md -> cubrid-cdc.md -> cubrid-2pc.md -> cubrid-flashback.md -> cubrid-backup-restore.md |
| Understand a stored procedure call | cubrid-pl-javasp.md | cubrid-pl-plcsql.md -> cubrid-network-protocol.md -> cubrid-server-session.md -> cubrid-dbi-cci.md |
| Understand a backup / restore | cubrid-backup-restore.md | cubrid-log-manager.md -> cubrid-checkpoint.md -> cubrid-recovery-manager.md -> cubrid-flashback.md |
| Understand bulk load | cubrid-loaddb.md | cubrid-locator.md -> cubrid-heap-manager.md -> cubrid-btree.md -> cubrid-statistics.md -> cubrid-sa-cs-runtime.md |
| Understand CCI / JDBC client | cubrid-dbi-cci.md | cubrid-broker.md -> cubrid-network-protocol.md -> cubrid-cursor.md -> cubrid-server-session.md |
| Understand WHY CUBRID looks this way | cubrid-design-philosophy.md | this overview, then the subsystem of interest |
Subcategory map
The cubrid tree is organised into nine subcategories. Each
detail doc declares its subcategory in its frontmatter; the
table below is a navigation aid mirroring that taxonomy. The
summary column is extracted (verbatim or lightly compressed)
from the summary field of each doc’s frontmatter.
server-architecture (process-level shape)
| Doc | Summary |
|---|---|
cubrid-boot.md | Server startup, first-time createdb, restart-recovery dispatch, and client connect — every subsystem’s init firing in order. |
cubrid-broker.md | cub_broker parent + cub_cas worker pool, SCM_RIGHTS file-descriptor rendezvous, SysV shared-memory control plane, ACL, monitoring. |
cubrid-dbi-cci.md | The unified db_* client API (Initial -> Compiled -> Prepared -> Executed FSM in DB_SESSION) and the broker-side wire driver (ux_*) dispatched by server_fn_table. |
cubrid-error-management.md | ER_* enum, per-thread cuberr::context, nested-error stack, er_set family, nl_catd message catalog, _latest-symlinked rotating error log, wire propagation format. |
cubrid-locator.md | OID workspace -> server-side locator_*_force bridge for batched insert/update/delete, fan-out to heap/btree/lock/log/FK/replication. |
cubrid-loaddb.md | Bulk loader — tokenise CUBRID-format object file, batch ship to server-side worker pool under Bulk-Update lock, direct-path locator_multi_insert_force, post-load statistics rebuild. |
cubrid-monitoring.md | C++ cubmonitor library + legacy perf_monitor / pstat_Metadata for SHOW STATS / statdump + per-subsystem ad-hoc monitors. |
cubrid-network-protocol.md | NET_SERVER_* opcode table, cub_master accept + master::connector UDS handoff, epoll-driven connection worker, symmetric or_pack_* marshalling. |
cubrid-sa-cs-runtime.md | One source tree, three builds (SERVER_MODE / SA_MODE / CS_MODE), per-utility classification SA_ONLY / CS_ONLY / SA_CS, runtime dlopen of libcubridsa.so or libcubridcs.so. |
cubrid-server-session.md | SESSION_STATE lock-free hash by integer session id, cached on the connection entry, bound to per-thread TDES. |
cubrid-system-parameters.md | prm_Def[] registry, cubrid.conf INI parsing with section selection, env overrides, db_set_system_parameters SQL path, per-session SESSION_PARAM. |
cubrid-thread-worker-pool.md | cubthread::entry context, worker_pool template, daemon+looper pattern, lock-free hashmap, heavyweight csect RW primitive. |
cubrid-thread-manager-ng.md | CBRD-26177 redesign — bounded epoll connection workers, coordinator-driven rebalancing, send/recv budgets, per-worker freelists. |
storage-engine (the layered storage stack)
| Doc | Summary |
|---|---|
cubrid-disk-manager.md | Volumes / sectors / files / pages, two-step sector reservation, permanent vs temporary disk cache, adaptive volume extension, three extensible-data tables. |
cubrid-page-buffer-manager.md | BCB array, three-zone LRU (private + shared) with quotas, direct victim handoff via lock-free queues, custom read/write/flush latch per BCB. |
cubrid-double-write-buffer.md | Sequential staging volume fsync’d before home write — torn-write protection between page buffer and data files. |
cubrid-heap-manager.md | Slotted pages, nine record types, INSERT/UPDATE/DELETE/READ flow, MVCC versioning inside the record header, hot-path caches. |
cubrid-btree.md | Slotted-page nodes, key |
cubrid-extendible-hash.md | EHID-rooted directory file with doubling pointer count, slotted bucket pages with binary search, system-op-bracketed splits/merges, RVEH_* WAL records. |
cubrid-overflow-file.md | Heap big-record and B+Tree overflow-OID page chains, FILE_MULTIPAGE_OBJECT_HEAP / FILE_BTREE_OVERFLOW_KEY / per-tree OID overflow, WAL discipline for crash safety. |
cubrid-lob.md | BLOB/CLOB stored as files outside the data volume, locator-URI naming, per-transaction red-black tree on TDES, commit/rollback file-system reconciliation. |
cubrid-tde.md | Two-level key hierarchy, AES-256-CTR or ARIA-256-CTR with per-page nonces, encrypt-on-flush / decrypt-on-read hooks, separate <db>_keys master-key file, per-file TDE flag. |
base-infra (custom allocators + lock-free primitives)
| Doc | Summary |
|---|---|
cubrid-private-allocator.md | Per-thread Lea-heap arena (Doug Lea’s dlmalloc vendored under customheaps) instantiated once per THREAD_ENTRY, fronted by db_private_alloc / _free / _realloc macros that route SERVER_MODE allocations to the thread’s heap, CS_MODE to the workspace, and SA_MODE through a PRIVATE_MALLOC_HEADER-tagged dispatch. C++ STL wrapper cubmem::private_allocator<T>, private_unique_ptr<T>, PRIVATE_BLOCK_ALLOCATOR. |
cubrid-common-area.md | Slab-style pool allocator — chained 256-block BLOCKSET arrays of fixed-cell blocks, lock-free per-block bitmap, single hint pointer for the common case; serves DB_VALUE, TP_DOMAIN, OBJ_TEMPLATE, DB_OBJLIST, set objects, etc. |
cubrid-lockfree-overview.md | Map of CUBRID’s lock-free primitives — legacy C lock_free.{h,c} family and modern C++ lockfree::* namespace — anchored on a single transactional reclamation spine. |
cubrid-lockfree-transaction.md | System / table / descriptor / address-marker reclamation — per-data-structure transaction id, per-thread descriptors that bracket reads, periodic minimum-active-id scan that tells the freelist when a retired node is no longer reachable from any live reader. |
cubrid-lockfree-bitmap.md | Chunked atomic-word bitmap — std::atomic<unsigned int> chunks, two chunking styles, CAS bit-flip, round-robin start hint that bumps atomically per get_entry under SERVER_MODE. |
cubrid-lockfree-circular-queue.md | Bounded MPMC ring with two cursor atomics and a per-slot block-flag word — used for vacuum log-block dispatch, page-buffer victim handoff, and CDC log-info forwarding. |
cubrid-lockfree-freelist.md | Typed freelist<T> with a single available stack, a one-block back-buffer that swaps in lazily, an on_reclaim payload hook, and a clearly-documented ABA window in the pop path bounded by the back-buffer time. |
cubrid-lockfree-hashmap.md | Harris–Michael chained hash with optional per-entry mutex, in two parallel implementations (legacy C lf_hash_* and modern C++ lockfree::hashmap<K,T>) bridged by cubthread::lockfree_hashmap<K,T> whose m_type ∈ {OLD, NEW} is decided at init by PRM_ID_ENABLE_NEW_LFHASH. |
query-processing (parse → execute → return)
| Doc | Summary |
|---|---|
cubrid-parser.md | Flex/Bison pipeline — single-buffer YY_INPUT, GLR Bison grammar, parser_new_node, polymorphic-tagged PT_NODE with three function-pointer arrays, per-PARSER_CONTEXT block allocator. |
cubrid-semantic-check.md | pt_check_with_info driver chains four passes (name resolution, where-clause aggregate check, host-variable replacement, statement-aware semantic_check_local) then pt_cnf for predicate CNF. |
cubrid-query-rewrite.md | LIMIT lowering into INST_NUM/ORDERBY_NUM/GROUPBY_NUM, view inlining, subquery flattening, predicate reduction, auto-parameterization, plan-time multi-range LIMIT optimization. |
cubrid-query-optimizer.md | QO_ENV query graph (QO_NODE / QO_SEGMENT / QO_TERM), partial-then-total dynamic-programming join enumeration over 2^N join_info vector, System-R fixed-cpu/io+variable-cpu/io cost model, QO_PLAN finalisation. |
cubrid-xasl-generator.md | QO_PLAN -> XASL_NODE tree, recursive gen_outer/gen_inner walk, aptr/dptr/scan_ptr slots, REGU_VARIABLE / ACCESS_SPEC / OUTPTR_LIST sub-IRs, xts_* offset-table serialisation. |
cubrid-xasl-cache.md | Server-wide latch-free hashmap keyed on SHA-1 of rewritten SQL plus time_stored, single 32-bit cache_flag refcounting, recompile-threshold (RT) drift guard, per-class OID dependent-list invalidation on DDL. |
cubrid-query-executor.md | Volcano-style XASL interpreter — qexec_execute_main_block dispatches by xasl->type, drives uniform open/next/close over SCAN_ID operators, pushes results into per-XASL list files. |
cubrid-scan-manager.md | Polymorphic SCAN_ID handle + open/start/next/end/close protocol, per-SCAN_TYPE dispatch into heap, B+Tree, list-file, set, value, JSON-table, dblink, show, parallel-heap, method scans. |
cubrid-query-evaluator.md | eval_pred walks PRED_EXPR tree of T_PRED / T_EVAL_TERM under three-valued logic, fetch_peek_dbval dispatches on REGU_VARIABLE::type, eval_fnc pre-compiles fast single-shape predicate. |
cubrid-scalar-functions.md | Operator-primitive layer — arithmetic.c, numeric_opfunc.c, string_opfunc.c, query_opfunc.c, crypt_opfunc.c, string_regex_*; BCD numeric, collation-aware string, RE2/std::regex switching. |
cubrid-list-file.md | QFILE_LIST_ID linked-page abstraction, per-query QMGR_TEMP_FILE membuf-then-FILE_TEMP substrate, uniform open/add/scan/close contract used by all materialised tuple streams. |
cubrid-post-processing.md | qexec_groupby and qexec_execute_analytic — sort-based vs hash-based GROUP BY at runtime, fallback to external sort when hash exceeds max_agg_hash_size. |
cubrid-hash-join.md | Build/Probe driver in query_hash_join.c reusing HASH_LIST_SCAN, three table layouts (in-memory mht_hls, hybrid memory+file, extendible FHS), grace-style equi-hash partitioning on spill. |
cubrid-parallel-query.md | One global parallel-query worker pool, compute_parallel_degree() keyed on page count, three operator-specific orchestrators (parallel heap-scan, hash-join build/execute, query-execute fan-out). |
cubrid-external-sort.md | Two-phase replacement-selection-style run generator (sort_inphase_sort) + balanced k-way merge (sort_exphase_merge) over FILE_TEMP runs, single callback-driven entry point. |
cubrid-cursor.md | Client-side CURSOR_ID over server-side QFILE_LIST_ID, qfile_get_list_file_page paging, length-prefixed packed-row decoding, OID prefetch, holdability via session-scoped holdable-cursor list. |
cubrid-runtime-memoization.md | Three independent caches sharing one playbook (DB_VALUE-array hash key, fail-on-full budget, hit-ratio guard) at three lifecycle scopes — per-XASL sq_cache, per-BTID fpcache, per-XASL memoize::storage. |
cubrid-partition.md | Master class + N child classes, per-partition rule (range / hash / list) on master SM_PARTITION, server-side PRUNING_CONTEXT for optimize-time elimination + execute-time route + per-partition scan dispatch. |
cubrid-serial.md | _db_serial row-per-sequence, exclusive-OID-lock advance with optional client-side caching, AUTO_INCREMENT columns through synthesised <class>_ai_<attr> serials. |
txn-recovery (concurrency, logging, recovery)
| Doc | Summary |
|---|---|
cubrid-transaction.md | TDES descriptor in server-wide trantable, isolation-level dispatch (SI / lock-based), savepoint-driven nested partial-rollback boundaries via system ops. |
cubrid-mvcc.md | MVCCID assignment, per-transaction snapshot construction in mvcctable::build_mvcc_info, active-MVCCID tracking, vacuum coordination. |
cubrid-lock-manager.md | Multi-granularity per-OID lock grant/convert/revoke, transaction waits-for graph for deadlock detection. |
cubrid-prior-list.md | Singly-linked LOG_PRIOR_NODE producer queue, single short-held mutex, log-flush daemon drain under log critical-section, group-commit-by-batching. |
cubrid-log-manager.md | WAL record layout, LSA naming, in-memory prior-list / append-page pipeline, archive volumes, ACID-D underwriting. |
cubrid-checkpoint.md | Fuzzy ARIES checkpoint — periodic daemon LOG_START_CHKPT/LOG_END_CHKPT pair with active-tx snapshot + redo-LSA hint, log_Gl.hdr.chkpt_lsa advance. |
cubrid-recovery-manager.md | Three-pass restart anchored on most-recent checkpoint LSA, per-record-type dispatch via RV_fun[], parallelised redo via per-page worker pool. |
cubrid-vacuum.md | Forward WAL walk in fixed-size blocks below oldest-visible-MVCCID watermark, master->worker job dispatch, dropped-files tracker. |
cubrid-2pc.md | Coordinator + participant FSMs through LOG_2PC_EXECUTE, prepared-state log records survive crash, in-doubt recovery during ARIES analysis pass. |
cubrid-backup-restore.md | Online physical backup — snapshot data volumes + log records bracketed by start_lsa and next checkpoint, point-in-time forward replay on restore. |
ddl-schema (catalog, schema graph, authorization, statistics)
| Doc | Summary |
|---|---|
cubrid-catalog-manager.md | Per-class disk representation + statistics in dedicated catalog (CTID), parallel system classes (_db_class, _db_attribute, _db_index, …) bootstrapped from fixed root-class OID. |
cubrid-class-object.md | In-memory SM_CLASS graph in client-side workspace — attributes / methods / partitions / constraints / triggers; catcls_* mediates between graph and on-disk catalog records. |
cubrid-ddl-execution.md | do_statement dispatch into do_create_entity / do_alter / do_drop, SM_TEMPLATE build, sm_finish_class -> update_class -> install_new_representation, locator_add_class + catcls_insert_catalog_classes, sm_bump_local_schema_version. |
cubrid-trigger.md | SQL-99 ECA active rules — _db_trigger instances + TR_TRIGGER cache on SM_CLASS, tr_prepare_class / tr_before_object / tr_after_object triplet, OID-stack statement-level recursion control. |
cubrid-statistics.md | xstats_update_statistics heap+B+Tree walk produces cardinality / NDV / leaf-page counts / partial-key fanouts; persisted on latest disk repr; client-side qo_get_attr_info feeds qo_iscan_cost / qo_sscan_cost / qo_equal_selectivity / qo_range_selectivity. |
cubrid-authentication.md | Users / passwords / per-object privileges as MOP-keyed rows in db_user, db_password, _db_auth, db_authorization; au_login / au_fetch_class / per-class AU_CLASS_CACHE collapse SELECT-time grant lookup into one bitmask test. |
replication-ha (distribution, log streaming, change capture)
| Doc | Summary |
|---|---|
cubrid-heartbeat.md | UDP gossip cluster liveness, per-node independent calc_score master election, slave -> to-be-master -> master FSM, job-queue with four worker threads, witness-host (ha_ping_hosts) split-brain guard. |
cubrid-ha-replication.md | Master emits LOG_REPLICATION_DATA / LOG_REPLICATION_STATEMENT alongside physiological WAL; copylogdb ships archives; applylogdb (la_apply_log_file) walks them forward and per-record-type dispatches into the storage layer. |
cubrid-cdc.md | cdc_* API walks LOG_SUPPLEMENTAL_INFO records via log_reader; legacy la_* HA applier shares the same record format internally. |
cubrid-flashback.md | Two-phase forward log walk (per-tx summary then per-tx detailed log-info pull), shares CDC entry format, reads against archived log volumes. |
pl-language (procedural extensions in the JVM)
| Doc | Summary |
|---|---|
cubrid-pl-javasp.md | JavaSP runtime in cub_pl JVM — reflective dispatch on user JARs, classloader hierarchy, security sandbox; shares catalog rows + transport with PL/CSQL. |
cubrid-pl-plcsql.md | PL/CSQL parsed by ANTLR 4 in pl_server JVM, lowered to CUBRID Java AST, emitted as Java source, compiled in-process by javax.tools.JavaCompiler, packaged as Base64 JAR, persisted next to JavaSP catalog rows. |
i18n-specialty (specialised features)
| Doc | Summary |
|---|---|
cubrid-charset-collation.md | Four codesets (binary, ISO-8859-1, EUC-KR, UTF-8), LDML locale rules compiled to UCA-weight shared libraries, function-pointer LANG_COLLATION vtable consumed by B+Tree, sort, and string operators. |
cubrid-timezone.md | Compiles IANA tzdata into generated timezones.c + libcubrid_timezones.so, packs (zone, gmt-offset-rule, ds-rule) triple into 32-bit TZ_ID, resolves wall-clock via tz_datetime_utc_conv with LOCAL_STD/LOCAL_WALL/UTC qualifiers. |
cubrid-json-table.md | C++ scanner whose cursor stack walks parser-built cubxasl::json_table::node tree, expands input JSON via db_json_iterator_* per NESTED PATH, emits rows at leaves; SCAN_TYPE consumed by scan_next_scan. |
cubrid-show-commands.md | SHOW <name> rewritten into SELECT * FROM (PT_SHOWSTMT), dispatched through S_SHOWSTMT_SCAN, tuples synthesised on demand from per-SHOWSTMT_TYPE start/next/end function pointers. |
cubrid-compactdb.md | Offline compaction — walk each class heap, NULL dangling OID references, reclaim empty heap pages, drop obsolete catalog representations, defragment heap files; client-driven, scoped per class, three numbered passes. |