CUBRID Architecture Overview — Process Model, Layered Stack, and the Map Into the Detail Docs

This document is the front door for the CUBRID code-analysis tree. It is written for someone who has read zero lines of CUBRID source and wants the high-level shape — the long-lived processes, the layered stack inside each process, the dataflow of one query, the sub-systems that keep concurrency and durability honest, and the distribution surface that turns a single node into a cluster — before deciding which of the ~70 detail docs to open. Every section is deliberately schematic; the deep code analysis lives in the per-module docs and is referenced by file path (cubrid-X.md) at every opportunity. If a section feels thin, that is by design: this is a router, not a duplicate.

A second audience is the engineer who has read several detail docs and now wants to fit them together. The cross-cutting axes — “how does a SELECT travel from the broker to the heap and back?”, “what fires when cub_master declares this node the new HA primary?”, “why does the page buffer talk to the log manager before flushing?” — each cut across multiple subsystems. The diagrams here name the boundaries those axes cross, so you can pick the right pair of detail docs to read in tandem.

A third audience is anyone trying to understand why CUBRID looks the way it does. CUBRID is an object-relational engine with an OODB lineage (UniSQL → CUBRID), an HA story built on a separate cub_master rather than embedded consensus, a broker tier that pre-forks CAS workers and rendezvous-passes file descriptors, and a PL family that runs Java in a sibling JVM process rather than embedded. Each of these is a deliberate design choice with its own trade-offs; the detail docs treat them in depth, and cubrid-design-philosophy.md collects the rationale. This overview names the choices and points at the docs that explain them.

CUBRID is a multi-process engine. Four long-lived process types and one short-lived utility class cover every operational scenario, and the set of processes a database is using right now is the first mental model worth having.

  • cub_master — the per-host supervisor. One per host. Owns the master shared-memory anchor, listens on a Unix-domain socket for in-host process registration (servers, copylogdb, applylogdb, brokers register here), accepts inbound TCP connections from remote clients on the well-known port and hands the file descriptor to the right cub_server over its UDS, gossips liveness with peer cub_masters over UDP for HA, and runs the failover/failback FSM. Its work is supervisory — it does not own database state itself. Detail: cubrid-heartbeat.md, cubrid-network-protocol.md.
  • cub_server — the database engine. One per database, in general. Owns volumes, the page buffer, log files, lock table, catalog, the optimizer, the XASL executor, MVCC bookkeeping, vacuum, and the entire storage stack. All client connections ultimately terminate here. Detail: cubrid-boot.md, cubrid-network-protocol.md, cubrid-thread-worker-pool.md, cubrid-thread-manager-ng.md.
  • cub_pl — the PL JVM process. One per cub_server when Java/PL execution is enabled. Started by the server, runs Java stored procedures and PL/CSQL bytecode, talks back to the server’s session over a Unix-domain socket using the same CSS framing used elsewhere. Detail: cubrid-pl-javasp.md, cubrid-pl-plcsql.md.
  • cub_broker + cub_cas — the broker tier. cub_broker is a parent process that pre-forks a fixed pool of cub_cas workers, exposes a TCP listener for JDBC/CCI/ODBC clients, rendezvous-passes accepted client sockets to an idle CAS via SCM_RIGHTS over a Unix-domain channel, and tracks the pool through a SysV shared-memory region. Each cub_cas is a long-lived per-client-session worker that proxies CSS-framed traffic to cub_server. Detail: cubrid-broker.md.
  • SA-mode utilities — short-lived processes (cubrid loaddb, cubrid unloaddb, cubrid backupdb, cubrid compactdb, cubrid restoredb, …) that link the entire engine in-process via libcubridsa.so and operate on the on-disk database without a running cub_server. The same source tree compiles three ways (server / SA / CS) and a runtime dlopen picks the variant. Detail: cubrid-sa-cs-runtime.md, cubrid-loaddb.md, cubrid-backup-restore.md, cubrid-compactdb.md.

```mermaid
flowchart LR
  subgraph CLIENT["Client side"]
    APP["JDBC / CCI / ODBC / Python / PHP app"]
    CSQL["csql (CS-mode)"]
    UTILS["cubrid loaddb / backupdb / unloaddb<br/>(SA-mode utility)"]
  end

  subgraph BROKER["cub_broker host"]
    BPARENT["cub_broker<br/>(TCP listener)"]
    CAS1["cub_cas #1"]
    CAS2["cub_cas #2"]
    CASN["cub_cas #N"]
    BSHM["SysV shm<br/>(broker control)"]
  end

  subgraph DBHOST["cub_server host"]
    MASTER["cub_master<br/>(supervisor)"]
    SERVER["cub_server<br/>(engine)"]
    PL["cub_pl<br/>(JVM)"]
    DBVOL[("data + log volumes")]
  end

  subgraph PEERS["Peer hosts (HA cluster)"]
    PEER["cub_master @ peer"]
  end

  APP -- TCP --> BPARENT
  BPARENT -. SCM_RIGHTS over UDS .-> CAS1
  BPARENT -. SCM_RIGHTS over UDS .-> CAS2
  BPARENT -. SCM_RIGHTS over UDS .-> CASN
  BPARENT --- BSHM
  CAS1 --- BSHM
  CAS2 --- BSHM
  CASN --- BSHM

  CSQL -- TCP --> MASTER
  CAS1 -- TCP --> MASTER
  CAS2 -- TCP --> MASTER
  CASN -- TCP --> MASTER

  MASTER -. UDS / SCM_RIGHTS .-> SERVER
  SERVER -- UDS --> PL

  MASTER <-- UDP heartbeat --> PEER

  SERVER -- volume I/O --> DBVOL
  UTILS -- "direct volume I/O<br/>(engine in-process via dlopen libcubridsa)" --> DBVOL
```

Three properties make the diagram less mysterious than it looks.

(a) The TCP listener is on cub_master, not cub_server. Every inbound connection — whether from csql, from a CAS, or from another tool — first lands on cub_master’s well-known port; the master parses the destination database name, finds the matching cub_server in its registered-process table, and forwards the file descriptor. The server then reads the protocol header off the freshly handed-off socket and continues. This indirection is what allows multiple cub_server processes (one per database) to share a single host without each owning a port.

(b) The broker is a separate tier from the server. It runs on its own host (or co-located, but logically separate), with its own pool size, its own connection routing, its own ACL, and its own monitoring. The CAS-to-server hop is itself a TCP connection that happens to terminate inside the same data center, and there is nothing stopping you from operating multiple brokers fanning into one or more servers.

(c) cub_pl is a child of cub_server, not of cub_master. Java SPs and PL/CSQL execute inside their JVM; the server-side SP call serialises the arguments and walks them across to the JVM over a per-session UDS, then waits for the JVM to ship results back. The server treats the JVM as a co-process, not as a remote service.
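
Both rendezvous handoffs (the master-to-server forward in (a) and the broker-to-CAS pass in the first diagram) ride on the same POSIX mechanism: an SCM_RIGHTS control message on a Unix-domain socket, which makes the kernel duplicate a descriptor into the receiving process. A minimal sketch; the function names are illustrative, not CUBRID's:

```cpp
#include <sys/socket.h>
#include <sys/uio.h>
#include <cstring>

// Sender side: ship `fd` across the connected Unix-domain socket `chan`.
int send_fd(int chan, int fd) {
  char byte = 'F';                         // at least one data byte must ride along
  iovec iov{&byte, 1};
  alignas(cmsghdr) char buf[CMSG_SPACE(sizeof(int))];
  msghdr msg{};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = buf;
  msg.msg_controllen = sizeof(buf);
  cmsghdr *cm = CMSG_FIRSTHDR(&msg);
  cm->cmsg_level = SOL_SOCKET;
  cm->cmsg_type = SCM_RIGHTS;              // kernel dup()s the fd into the receiver
  cm->cmsg_len = CMSG_LEN(sizeof(int));
  std::memcpy(CMSG_DATA(cm), &fd, sizeof(int));
  return sendmsg(chan, &msg, 0) < 0 ? -1 : 0;
}

// Receiver side: the returned fd is a fresh descriptor in this process.
int recv_fd(int chan) {
  char byte;
  iovec iov{&byte, 1};
  alignas(cmsghdr) char buf[CMSG_SPACE(sizeof(int))];
  msghdr msg{};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = buf;
  msg.msg_controllen = sizeof(buf);
  if (recvmsg(chan, &msg, 0) <= 0) return -1;
  cmsghdr *cm = CMSG_FIRSTHDR(&msg);
  int fd = -1;
  if (cm && cm->cmsg_type == SCM_RIGHTS)
    std::memcpy(&fd, CMSG_DATA(cm), sizeof(int));
  return fd;
}
```

The receiving process ends up owning a socket it never accepted, which is exactly what lets one listener feed many workers.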

Cross-references for the process model: cubrid-broker.md for the CAS pool, rendezvous protocol, and broker control plane; cubrid-heartbeat.md for the master-to-master UDP gossip; cubrid-pl-javasp.md for the JVM IPC; cubrid-network-protocol.md for the CSS framing and NRP dispatch; cubrid-sa-cs-runtime.md for the three-way build variant and the utility classification.

Inside cub_server, the storage stack is a strict layering. Each layer treats the layer below it as an abstraction and exports a narrower abstraction upward. Understanding the layering is the prerequisite for understanding any read or write path.

```mermaid
flowchart TB
  subgraph CLIENT_WORKSPACE["Client-side workspace"]
    WSP["MOP table<br/>(in-memory OID -> object cache)"]
    SMC["SM_CLASS graph<br/>(in-memory schema)"]
  end

  subgraph SERVER["cub_server"]
    LOC["locator_∗_force<br/>(server-side bridge)"]
    CAT["catalog_manager<br/>(_db_class, _db_attribute, ...)"]
    HEAP["heap_manager<br/>(slotted pages, MVCC headers)"]
    BTREE["btree<br/>(latch-coupled B+Tree)"]
    EHASH["extendible_hash<br/>(directory + buckets)"]
    OFLOW["overflow_file<br/>(big-record / overflow-OID chain)"]
    PB["page_buffer<br/>(BCB array, three-zone LRU)"]
    DWB["double_write_buffer<br/>(torn-write protection)"]
    DM["disk_manager<br/>(volumes, sectors, files, pages)"]
  end

  VOLS[("On-disk volumes<br/>(_dbname / _dbname_t / _lgar*)")]

  WSP --> LOC
  SMC --> LOC

  LOC --> HEAP
  LOC --> BTREE
  LOC --> CAT
  CAT --> HEAP
  CAT --> BTREE
  CAT --> EHASH

  HEAP --> OFLOW
  BTREE --> OFLOW

  HEAP --> PB
  BTREE --> PB
  EHASH --> PB
  OFLOW --> PB
  CAT --> PB

  PB --> DWB
  DWB --> DM
  PB --> DM
  DM --> VOLS
```

Reading bottom-up:

  • disk_manager owns volumes, sectors, files, and pages. A volume is one OS file; a sector is 64 contiguous pages and the allocation unit; a file is a sector bundle; a page is the I/O unit. The disk manager hands out and reclaims sectors, extends volumes when the disk cache says we are out of room, and separates permanent and temporary purposes so that temp files cannot starve the permanent space. Detail: cubrid-disk-manager.md.
  • page_buffer + double_write_buffer sit directly on top. The page buffer maps (VPID -> BCB -> in-memory frame) through a per-bucket hash, runs a three-zone LRU split into per-thread private lists with adjustable quotas plus a shared list, hands victims directly to sleeping waiters via lock-free queues, and protects each BCB with a custom read/write/flush latch. Every dirty page goes through DWB before it lands at its home location, so a torn write at the home page is recoverable from the DWB copy. Detail: cubrid-page-buffer-manager.md, cubrid-double-write-buffer.md.
  • heap_manager / btree / extendible_hash are the three on-page record organisations that sit on the page buffer. heap_manager stores variable-length user records in slotted pages with MVCC headers (insert MVCCID, delete MVCCID, prev-version chain). btree is a latch-coupled B+Tree with key||OID concatenation and unique-constraint enforcement at the OID-suffix level. extendible_hash is a Fagin-style hash file with a doubling directory used for class-name lookup, repr-id lookup, and a few internal dedup tables. Big records and big OID lists spill into overflow_file chains. Detail: cubrid-heap-manager.md, cubrid-btree.md, cubrid-extendible-hash.md, cubrid-overflow-file.md, cubrid-tde.md (encrypts page contents in/out of the buffer).
  • catalog_manager stores per-class disk representation and statistics in a dedicated catalog file (anchored by CTID), with a parallel set of user-visible system classes (_db_class, _db_attribute, _db_index, _db_serial, _db_user, _db_authorization, _db_trigger) bootstrapped from a fixed root-class OID. Everything from SHOW STATS to authorization resolves through the catalog. Detail: cubrid-catalog-manager.md, cubrid-statistics.md, cubrid-authentication.md, cubrid-serial.md.
  • SM_CLASS (class-object) is the in-memory schema graph that the workspace materialises from the catalog. It carries attributes, methods, partitions, constraints, triggers, and the partition-rule descriptor. The OODB lineage shows here: the schema is an object graph, not a flat relation, and DDL manipulates it before persisting. Detail: cubrid-class-object.md, cubrid-ddl-execution.md, cubrid-trigger.md, cubrid-partition.md.
  • locator is the bridge between the in-memory side and the on-disk side. The client-side workspace batches dirty objects into LC_COPYAREA buffers and ships them to a server-side locator_*_force family that fans out into heap, btree, lock, log, FK, and replication paths through one canonical entry point. Triggers and integrity rules also fire from here. Detail: cubrid-locator.md.
  • Client-side workspace is the topmost layer. The MOP table is the in-memory OID-to-object cache; it carries dirty bits, pin counts, and the materialised SM_CLASS graphs the application sees. ESQL/embedded-SQL, the DBI/CCI clients, and the broker’s CAS all read and write through this workspace. Detail: cubrid-class-object.md, cubrid-dbi-cci.md.

The crucial structural fact is that every record-organisation layer talks to the page buffer, never to the disk manager directly. This is what lets the buffer manager enforce the WAL ordering: a heap page mutation produces a log record first (routed through the prior list to the log manager), and the buffer manager refuses to flush the dirty page until the matching log LSA is durable. Same for B+Tree, ehash, and catalog mutations.
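
That ordering rule is compact enough to show. A minimal sketch of the flush-side check; LogLsa, flush_log_up_to, and the durable-LSA bookkeeping are hypothetical names, not the engine's identifiers:

```cpp
#include <cstdio>

struct LogLsa { long pageid = 0; int offset = 0; };
struct Page { LogLsa last_mutation_lsa; bool dirty = true; };

static LogLsa g_durable{10, 0};                 // highest LSA known durable on disk

bool lsa_le(const LogLsa &a, const LogLsa &b) {
  return a.pageid < b.pageid || (a.pageid == b.pageid && a.offset <= b.offset);
}

void flush_log_up_to(const LogLsa &lsa) {       // stand-in for the log flush
  g_durable = lsa;
  std::printf("log flushed through (%ld,%d)\n", lsa.pageid, lsa.offset);
}

void write_page_to_disk(const Page &) { std::puts("page written"); }

// The page buffer's flush path: the log record describing the change must be
// durable before the page image may overwrite the old version on disk.
void flush_dirty_page(const Page &page) {
  if (!lsa_le(page.last_mutation_lsa, g_durable))
    flush_log_up_to(page.last_mutation_lsa);    // enforce WAL ordering first
  write_page_to_disk(page);                     // only now is the flush safe
}

int main() { flush_dirty_page(Page{{12, 40}}); }
```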

A SELECT or DML statement walks a long pipeline from text to result rows. The pipeline is split between client side and server side, with serialisation across the wire at well-defined points, but the conceptual stages are uniform.

```mermaid
flowchart LR
  SQL["SQL text"]
  PARSER["parser<br/>(Flex+Bison -> PT_NODE)"]
  SC["semantic_check<br/>(name resolve, type check, CNF)"]
  REW["query_rewrite<br/>(LIMIT lowering, view inlining,<br/>subquery flatten, predicate reduce)"]
  OPT["query_optimizer<br/>(QO_ENV graph, DP join enum,<br/>System R cost model)"]
  XGEN["xasl_generator<br/>(QO_PLAN -> XASL_NODE tree)"]
  XCACHE["xasl_cache<br/>(SHA-1 keyed plan cache)"]
  XEXEC["query_executor<br/>(Volcano open/next/close)"]
  SCAN["scan_manager<br/>(SCAN_ID dispatch)"]
  AM["access methods<br/>(heap / btree / list / set /<br/>value / json-table / show / dblink)"]
  EVAL["query_evaluator<br/>(PRED_EXPR, regu_variable)"]
  POST["post_processing<br/>(group by / aggregates /<br/>window / order by)"]
  LF["list_file<br/>(QFILE_LIST_ID, spill to FILE_TEMP)"]
  CUR["cursor / dbi-cci<br/>(client fetch handle)"]
  ROWS["result rows"]

  SQL --> PARSER --> SC --> REW --> OPT --> XGEN --> XCACHE
  XCACHE --> XEXEC
  XEXEC --> SCAN --> AM
  XEXEC --> EVAL
  AM --> EVAL
  XEXEC --> POST
  POST --> LF
  AM --> LF
  XEXEC --> LF
  LF --> CUR --> ROWS

  REW -. cache hit shortcut .-> XCACHE
```

The pipeline has six structural facts worth naming:

(a) Compilation runs once, execution runs many. Stages 1-5 (parse → semantic check → rewrite → optimize → XASL generate) produce a serialised XASL tree that is cached server-wide in xasl_cache, keyed on a SHA-1 of the rewritten SQL. The second execute of the same SQL skips compilation and pulls the XASL straight from the cache. The cache also tracks per-class OIDs so DDL on any referenced class invalidates dependent entries. Detail: cubrid-xasl-cache.md.
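
The cache's fast path is just a digest lookup. A toy reduction (std::hash stands in for the real SHA-1 digest, and every name here is illustrative):

```cpp
#include <string>
#include <unordered_map>
#include <memory>
#include <iostream>

struct Xasl { std::string plan; };                       // stand-in for XASL_NODE

std::unordered_map<std::string, std::shared_ptr<Xasl>> g_xasl_cache;

std::string digest(const std::string &rewritten_sql) {
  return std::to_string(std::hash<std::string>{}(rewritten_sql));  // SHA-1 stand-in
}

std::shared_ptr<Xasl> get_plan(const std::string &rewritten_sql) {
  const std::string key = digest(rewritten_sql);
  if (auto it = g_xasl_cache.find(key); it != g_xasl_cache.end())
    return it->second;                                   // hit: skip stages 1-5
  auto plan = std::make_shared<Xasl>(Xasl{"compiled:" + rewritten_sql});
  g_xasl_cache.emplace(key, plan);                       // miss: compile once, cache
  return plan;
}

int main() {
  get_plan("SELECT * FROM t WHERE a = ?");               // compiles
  get_plan("SELECT * FROM t WHERE a = ?");               // cache hit
  std::cout << g_xasl_cache.size() << " cached plan(s)\n";
}
```

The real cache additionally keys on time_stored and tracks per-class OIDs so DDL can invalidate dependent entries (cubrid-xasl-cache.md).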

(b) The optimizer’s plan and the executor’s plan are different IRs. Optimization works over a QO_PLAN tree of QO_NODE/QO_SEGMENT/QO_TERM graph objects with a System-R-style fixed-cpu/io plus variable-cpu/io cost model. The XASL generator then lowers the surviving QO_PLAN into a recursively-shaped XASL_NODE tree with aptr/dptr/scan_ptr slots, REGU_VARIABLE IRs for value derivation, ACCESS_SPEC for scan parameters, and an OUTPTR_LIST for the result row. Only XASL is ever serialised across the wire and persisted in the cache. Detail: cubrid-query-optimizer.md, cubrid-xasl-generator.md.

(c) The executor is Volcano-style. qexec_execute_main_block dispatches by xasl->type and drives a uniform open/next/close loop over SCAN_ID operators. The scan manager is the polymorphic access-method catalogue: heap, B+Tree, list-file, set, value, JSON-table, dblink, show, parallel-heap, and method scans all present the same protocol. Detail: cubrid-query-executor.md, cubrid-scan-manager.md.
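
The contract every operator implements is small. A toy Volcano pipeline under the same open/next/close protocol (the operators and the int tuple type are invented for illustration; a real SCAN_ID carries far more state):

```cpp
#include <optional>
#include <vector>
#include <iostream>

struct Operator {                         // every access method presents this shape
  virtual void open() = 0;
  virtual std::optional<int> next() = 0;  // one tuple per call, nullopt at end
  virtual void close() = 0;
  virtual ~Operator() = default;
};

struct VectorScan : Operator {            // a toy "heap scan" over a vector
  std::vector<int> rows{1, 2, 3};
  size_t pos = 0;
  void open() override { pos = 0; }
  std::optional<int> next() override {
    return pos < rows.size() ? std::optional<int>(rows[pos++]) : std::nullopt;
  }
  void close() override {}
};

struct FilterScan : Operator {            // a parent operator driving its child
  Operator &child;
  explicit FilterScan(Operator &c) : child(c) {}
  void open() override { child.open(); }
  std::optional<int> next() override {
    while (auto t = child.next())
      if (*t % 2 == 1) return t;          // predicate: keep odd tuples
    return std::nullopt;
  }
  void close() override { child.close(); }
};

int main() {
  VectorScan scan;
  FilterScan filter(scan);
  filter.open();
  while (auto t = filter.next()) std::cout << *t << '\n';
  filter.close();
}
```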

(d) Predicates evaluate through a separate engine. Each pulled tuple is filtered by eval_pred walking a PRED_EXPR tree of T_PRED boolean nodes and T_EVAL_TERM leaves under three-valued logic; every leaf calls fetch_peek_dbval which dispatches on REGU_VARIABLE::type (constant, attribute fetch, list-file position, arithmetic expression, function call, host variable, OID, list-id) into a path-specific resolver. The arithmetic and string operators that the regu-variable engine calls into are themselves a separate operator-primitive layer. Detail: cubrid-query-evaluator.md, cubrid-scalar-functions.md.
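
Three-valued logic is the part most worth internalising: NULL comparisons yield UNKNOWN, and only TRUE survives a WHERE filter. A minimal sketch of the three-valued AND/OR combiners; the enum and helpers are illustrative:

```cpp
#include <iostream>

enum class TV { FALSE_, UNKNOWN, TRUE_ };

TV tv_and(TV a, TV b) {
  if (a == TV::FALSE_ || b == TV::FALSE_) return TV::FALSE_;   // FALSE dominates
  if (a == TV::UNKNOWN || b == TV::UNKNOWN) return TV::UNKNOWN;
  return TV::TRUE_;
}

TV tv_or(TV a, TV b) {
  if (a == TV::TRUE_ || b == TV::TRUE_) return TV::TRUE_;      // TRUE dominates
  if (a == TV::UNKNOWN || b == TV::UNKNOWN) return TV::UNKNOWN;
  return TV::FALSE_;
}

int main() {
  // WHERE (a = 1) AND (b = <NULL comparison>): the row is filtered out,
  // because UNKNOWN is not TRUE, and only TRUE passes the filter.
  TV result = tv_and(TV::TRUE_, TV::UNKNOWN);
  std::cout << (result == TV::TRUE_ ? "keep row" : "drop row") << '\n';
}
```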

(e) Materialisation is uniform via list-file. Every sub-query result, sort output, hash-build side, group-by accumulator, and final query result is one QFILE_LIST_ID abstraction backed by a per-query QMGR_TEMP_FILE (membuf-then-FILE_TEMP) substrate. Operators read upstream and write downstream through the same open/add/scan/close contract. Detail: cubrid-list-file.md.
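
The contract is small enough to show in miniature. A hypothetical reduction of the open/add/scan/close shape; a std::deque stands in for the membuf-then-FILE_TEMP spill substrate:

```cpp
#include <deque>
#include <optional>
#include <string>
#include <iostream>

class ListFile {
  std::deque<std::string> tuples;   // stand-in for membuf, then FILE_TEMP spill
  size_t cursor = 0;
public:
  void add(std::string tuple) { tuples.push_back(std::move(tuple)); }
  void open_scan() { cursor = 0; }
  std::optional<std::string> scan_next() {
    return cursor < tuples.size() ? std::optional(tuples[cursor++]) : std::nullopt;
  }
  void close_scan() {}
};

int main() {
  ListFile lf;                      // e.g., a sort operator's output
  lf.add("tuple-1");
  lf.add("tuple-2");
  lf.open_scan();                   // the downstream operator reads the same file
  while (auto t = lf.scan_next()) std::cout << *t << '\n';
  lf.close_scan();
}
```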

(f) Specialised post-processing. Group-by, aggregates, and analytic/window functions live in their own pass that chooses sort-based or hash-based GROUP BY at runtime and falls back to external sort when the hash table outgrows budget; hash join is similarly a separate Build/Probe driver with three table-layout strategies and grace-style spilling; parallel query is yet another orchestrator on top of a global parallel-query worker pool. Detail: cubrid-post-processing.md, cubrid-hash-join.md, cubrid-parallel-query.md, cubrid-external-sort.md, cubrid-runtime-memoization.md.

The client-facing tail is the cursor: db_query_first_tuple / db_query_get_tuple_value (cubrid-dbi-cci.md) and cubrid-cursor.md cover the broker- and DBI-side handle that locks onto the server’s list file and pages tuples one network-page at a time.

CUBRID is an MVCC engine with row-level locks for write conflicts and ARIES three-pass restart for crash recovery. Three timelines co-exist: the transactional timeline of MVCCIDs and locks, the physical timeline of WAL records and LSAs, and the page timeline of dirty buffer slots and flushes. The recovery story glues them.

```mermaid
flowchart LR
  TX["TDES (transaction descriptor)<br/>cubrid-transaction.md"]
  MVCC["MVCC table<br/>(active MVCCIDs, snapshots)<br/>cubrid-mvcc.md"]
  LOCK["lock manager<br/>(per-OID multi-granularity)<br/>cubrid-lock-manager.md"]
  PRIOR["prior_list<br/>(per-tx WAL queue)<br/>cubrid-prior-list.md"]
  LOG["log_manager<br/>(LSA, append page, archive)<br/>cubrid-log-manager.md"]
  CHKPT["checkpoint<br/>(fuzzy ARIES, redo-LSA hint)<br/>cubrid-checkpoint.md"]
  RECOV["recovery_manager<br/>(analysis / redo / undo)<br/>cubrid-recovery-manager.md"]
  VAC["vacuum<br/>(reclaim dead versions via WAL replay)<br/>cubrid-vacuum.md"]
  PB["page_buffer + DWB"]
  HEAP["heap / btree / catalog"]

  TX --> MVCC
  TX --> LOCK
  TX --> PRIOR
  PRIOR --> LOG
  LOG --> PB
  HEAP -- mvcc header writes --> MVCC
  HEAP -- WAL records --> PRIOR
  CHKPT -- LOG_START_CHKPT / LOG_END_CHKPT --> LOG
  CHKPT -- redo-LSA hint --> PB
  RECOV -- replays --> LOG
  RECOV -- restores --> PB
  VAC -- forward log walk --> LOG
  VAC -- physical reclaim --> PB
  VAC --> MVCC
```

The transactional core is the TDES (transaction descriptor) — one per active transaction, kept in a server-wide trantable. It carries the transaction’s MVCCID, isolation level, savepoint stack, lock list, and tail of WAL records. MVCC and the lock manager both index off the TDES. Isolation levels (READ COMMITTED, REPEATABLE READ, SERIALIZABLE) are dispatched through the snapshot construction in mvcctable::build_mvcc_info plus the lock-mode mapping in lock_manager. Detail: cubrid-transaction.md, cubrid-mvcc.md, cubrid-lock-manager.md.
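
The visibility rule those snapshots implement reduces to a few lines. A sketch with invented field names; the real test lives in the MVCC table and also treats the transaction's own MVCCID specially:

```cpp
#include <set>
#include <cstdint>
#include <iostream>

using MvccId = std::uint64_t;

struct Snapshot {
  MvccId lowest_active;          // everything below is settled
  MvccId highest_committed;      // everything at/above is "in the future"
  std::set<MvccId> active;       // in-flight ids inside the window
  bool sees(MvccId id) const {
    if (id >= highest_committed) return false;   // started after the snapshot
    return id < lowest_active || !active.count(id);
  }
};

struct RowVersion { MvccId ins; MvccId del; };   // del == 0 means "not deleted"

bool visible(const RowVersion &v, const Snapshot &s) {
  if (!s.sees(v.ins)) return false;              // inserter not yet visible
  if (v.del != 0 && s.sees(v.del)) return false; // deletion already visible
  return true;
}

int main() {
  Snapshot s{100, 200, {150}};
  std::cout << visible({90, 0}, s)    // inserted long ago, never deleted: 1
            << visible({150, 0}, s)   // inserter still active: 0
            << visible({90, 120}, s)  // deleted by a committed tx: 0
            << '\n';
}
```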

The WAL pipeline has a deliberately split shape. Every record mutation produces a LOG_PRIOR_NODE on the per-transaction prior list — a singly-linked queue protected by a single short-held mutex, drained periodically by the log-flush daemon. Group commit emerges naturally from queue batching: when a transaction commits, its commit record joins the same prior list and the next drain flushes the whole batch under one log critical-section pass. Detail: cubrid-prior-list.md, cubrid-log-manager.md.
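
The shape is easy to miniaturise. A sketch of the append/drain split; a vector stands in for the linked queue and the durable write is reduced to a print:

```cpp
#include <mutex>
#include <vector>
#include <string>
#include <iostream>

struct PriorNode { std::string record; };

std::mutex g_prior_mutex;
std::vector<PriorNode> g_prior_list;     // stand-in for the singly-linked queue

void log_append(std::string rec) {       // called by every mutating transaction
  std::lock_guard<std::mutex> g(g_prior_mutex);
  g_prior_list.push_back({std::move(rec)});
}

void flush_daemon_tick() {               // periodic drain by the log flusher
  std::vector<PriorNode> batch;
  {
    std::lock_guard<std::mutex> g(g_prior_mutex);
    batch.swap(g_prior_list);            // detach the whole queue at once
  }
  // One write + fsync covers every record in the batch, including any commit
  // records queued since the last tick: group commit by batching.
  std::cout << "flushed " << batch.size() << " records in one pass\n";
}

int main() {
  log_append("UPDATE t ...");
  log_append("COMMIT tx1");
  log_append("COMMIT tx2");
  flush_daemon_tick();                   // both commits become durable together
}
```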

Checkpoint is fuzzy ARIES. A periodic daemon emits a LOG_START_CHKPT / LOG_END_CHKPT pair carrying an active-transaction snapshot and a redo-LSA hint derived from the page-buffer’s dirty list. The hint advances log_Gl.hdr.chkpt_lsa so that the next analysis pass can skip everything below it. The checkpoint does not force-flush all dirty pages; it captures the snapshot and trusts the bgwriter / page-buffer victim path to keep up. Detail: cubrid-checkpoint.md.

Recovery is three passes. The analysis pass scans forward from chkpt_lsa reconstructing the active-transaction table. The redo pass replays page-mutating records on each affected page, parallelised through a per-page worker pool. The undo pass walks each active transaction’s WAL backward, applying compensating log records. Per-record-type dispatch is through the RV_fun[] table — every record carries an RVCODE whose recovery handlers are statically registered. Detail: cubrid-recovery-manager.md, cubrid-2pc.md (in-doubt transactions surface from the analysis pass).
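
The dispatch idea in miniature: the record's code indexes a static table of redo/undo handlers, so the recovery passes never interpret record contents themselves. Every name and record type below is invented for illustration:

```cpp
#include <cstdio>

struct RecoveryEntry {
  const char *name;
  void (*redo)(const char *data);
  void (*undo)(const char *data);
};

void heap_insert_redo(const char *d) { std::printf("redo heap insert: %s\n", d); }
void heap_insert_undo(const char *d) { std::printf("undo heap insert: %s\n", d); }
void btree_add_redo(const char *d)   { std::printf("redo btree add: %s\n", d); }
void btree_add_undo(const char *d)   { std::printf("undo btree add: %s\n", d); }

// Index == the record's code; handlers are registered once, statically.
constexpr RecoveryEntry rv_table[] = {
  {"RV_HEAP_INSERT", heap_insert_redo, heap_insert_undo},
  {"RV_BTREE_ADD",   btree_add_redo,   btree_add_undo},
};

int main() {
  int rvcode = 1;                        // read from the log record header
  rv_table[rvcode].redo("key=42");       // redo pass
  rv_table[rvcode].undo("key=42");       // undo pass (compensation)
}
```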

Vacuum is the MVCC reclamation engine. It walks the WAL forward in fixed-size blocks below an oldest-visible-MVCCID watermark, dispatching per-block jobs from a master to a worker pool. Dead versions identified by their delete-MVCCID below the watermark are physically removed from heap pages and B+Tree leaves; dropped files are tracked separately so vacuum does not chase pages that no longer belong to any class. Detail: cubrid-vacuum.md.
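
The watermark rule reduces to one comparison. A sketch with hypothetical structures; the real walk is over WAL blocks dispatched to workers, not an in-memory vector:

```cpp
#include <cstdint>
#include <vector>
#include <iostream>

struct DeadCandidate { std::uint64_t del_mvccid; const char *what; };

void vacuum_pass(const std::vector<DeadCandidate> &cands,
                 std::uint64_t oldest_visible) {
  for (const auto &c : cands) {
    if (c.del_mvccid < oldest_visible)   // no snapshot can see it anymore
      std::cout << "reclaim " << c.what << '\n';
    else
      std::cout << "keep    " << c.what << " (still visible to someone)\n";
  }
}

int main() {
  std::vector<DeadCandidate> cands{{90, "heap slot A"},
                                   {150, "btree leaf entry B"}};
  vacuum_pass(cands, /*oldest_visible=*/120);   // reclaims A, keeps B
}
```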

Double-write buffer completes the durability story. Every dirty page is staged into the sequential, fixed-size DWB volume and fsync’d before the home write is issued. A torn write at the home page is recoverable from the DWB copy on restart, before the log replay begins. Detail: cubrid-double-write-buffer.md.
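
The ordering is the whole trick: staged copy durable first, home write second. A sketch using real POSIX calls over hypothetical file handles and offsets:

```cpp
#include <unistd.h>
#include <sys/types.h>
#include <cstddef>

// Hypothetical handles: dwb_fd is the double-write volume, data_fd the data
// volume; both writes are positioned with pwrite.
bool flush_through_dwb(int dwb_fd, off_t dwb_slot,
                       int data_fd, off_t home_off,
                       const char *page, size_t page_size) {
  if (pwrite(dwb_fd, page, page_size, dwb_slot) != (ssize_t) page_size)
    return false;
  if (fsync(dwb_fd) != 0)                 // the staged copy is durable *first*
    return false;
  // Only now may the home location be overwritten: a crash mid-write leaves
  // either the old home page or a torn one, and the DWB copy repairs the tear.
  return pwrite(data_fd, page, page_size, home_off) == (ssize_t) page_size;
}
```

The usual double-write trade applies: every page is written twice, but the staging area is small, sequential, and fixed-size, which keeps the first write cheap.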

The non-obvious cross-cuts: (i) MVCCID assignment is skipped for read-only work — only writes consume MVCCIDs from the global counter, which keeps the active-MVCCID set bounded even under heavy read workloads; (ii) locks and page latches are intentionally separate (page latches are short, embedded in the BCB, unrelated to isolation — see cubrid-page-buffer-manager.md’s “Lock vs latch separation” note); (iii) backup, restore, and flashback all pivot on the same WAL the recovery manager replays (cubrid-backup-restore.md, cubrid-flashback.md).

CUBRID’s distribution layer turns a single cub_server into one member of a master/standby HA cluster, exposes change-streaming and flashback over the WAL, and supports XA-driven 2PC.

```mermaid
flowchart TB
  subgraph CLUSTER["HA cluster"]
    M["master cub_master + cub_server"]
    S1["slave cub_master + cub_server"]
    S2["slave cub_master + cub_server"]
    M <-- UDP gossip --> S1
    M <-- UDP gossip --> S2
    S1 <-- UDP gossip --> S2
  end

  subgraph REPLICATION["Logical-log replication"]
    CL["copylogdb<br/>(ships log volumes master->slave)"]
    AL["applylogdb<br/>(la_apply_log_file)"]
    REPLD["LOG_REPLICATION_DATA / _STATEMENT"]
    M -- WAL --> CL
    CL -- log archives --> S1
    S1 -- reads archives --> AL
    AL -- per-record dispatch --> S1
    M -- emits --> REPLD
    REPLD -. carried in WAL .-> AL
  end

  subgraph CDC["Change Data Capture"]
    CDCAPI["cdc_∗ API<br/>(LOG_SUPPLEMENTAL_INFO walker)"]
    CDCCLIENT["downstream consumer"]
    M -- log_reader --> CDCAPI
    CDCAPI --> CDCCLIENT
  end

  subgraph TWOPC["XA 2PC"]
    COORD["coordinator<br/>(XA tm or internal)"]
    TPCFSM["LOG_2PC_PREPARE / _COMMIT_DECISION"]
    INDOUBT["in-doubt recovery<br/>(analysis pass)"]
    COORD --> TPCFSM
    TPCFSM --> M
    INDOUBT -. surfaces from .-> TPCFSM
  end

  subgraph BACKUP["Backup / Restore / Flashback"]
    BKP["backupdb<br/>(start_lsa marker + log archive)"]
    RST["restoredb<br/>(volume restore + log replay)"]
    FB["flashback<br/>(per-tx summary + replay)"]
    M -- volumes --> BKP
    M -- WAL --> BKP
    BKP --> RST
    M -- WAL --> FB
  end
```

The distribution decisions follow CUBRID’s “local decision over quorum consensus” stance:

  • Heartbeat is independent per node. Each cub_master computes a peer-table score on a timer, picks the minimum score as master, and falls into failover via TO_BE_MASTER with a second score-recompute gate. There is no consensus protocol — every node decides locally from its own peer table (a sketch follows this list). Witness hosts (ha_ping_hosts) are the split-brain guard. Detail: cubrid-heartbeat.md.
  • HA replication is logical-log based. The master engine emits LOG_REPLICATION_DATA / LOG_REPLICATION_STATEMENT records alongside its physiological WAL during DML. A separate copylogdb daemon ships archived log volumes to the slave; applylogdb walks them forward through la_apply_log_file and dispatches per-record-type back into the storage layer for serialised, transactionally consistent replay. Detail: cubrid-ha-replication.md.
  • CDC walks the same WAL forward looking at LOG_SUPPLEMENTAL_INFO records through log_reader. The modern cdc_* API exposes this to downstream consumers; the older la_* HA applier is the same shape internally. Detail: cubrid-cdc.md.
  • Two-phase commit uses prepared-state log records that survive crash. The coordinator and participant FSMs are encoded in LOG_2PC_* records; in-doubt transactions surface during the analysis pass of recovery and either commit or abort based on the recorded decision. XA-driven distributed transactions and internal nested coordinators share the same machinery. Detail: cubrid-2pc.md.
  • Flashback answers “what did transactions T1..Tn do between time A and B” by walking the log forward in two phases — a per-transaction summary, then a per-transaction detailed log-info pull — sharing the CDC entry format but read against archived log volumes. Detail: cubrid-flashback.md.
  • Backup/restore is online physical backup: a snapshot of data volumes plus the log records bracketed by a start_lsa marker and the next checkpoint. Restore replays the log forward up to a user-supplied stop time for point-in-time recovery. Detail: cubrid-backup-restore.md.
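
The election rule in the first bullet is small enough to sketch. An illustrative reduction (Peer, its fields, and the promotion check are hypothetical names; the real FSM recomputes the score once more before leaving TO_BE_MASTER):

```cpp
#include <string>
#include <vector>
#include <iostream>

struct Peer { std::string host; int score; bool alive; };

// Lower score wins, dead peers do not count. No quorum, no consensus round:
// each node evaluates only its own peer table.
bool i_should_be_master(const std::string &self, const std::vector<Peer> &table) {
  const Peer *best = nullptr;
  for (const auto &p : table)
    if (p.alive && (!best || p.score < best->score)) best = &p;
  return best && best->host == self;
}

int main() {
  std::vector<Peer> table{{"node-a", 1, false},   // old master, stopped gossiping
                          {"node-b", 2, true},
                          {"node-c", 3, true}};
  // node-b's view: it now holds the minimum live score, so it enters
  // TO_BE_MASTER and re-checks before promoting itself.
  std::cout << (i_should_be_master("node-b", table) ? "promote" : "stay slave")
            << '\n';
}
```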

The unifying observation is that the WAL is CUBRID’s single event log. Recovery, vacuum, replication, CDC, flashback, and backup all read the same record stream; they differ only in which record types they care about and in which direction they walk it.

Procedural extensions in CUBRID — Java stored procedures and PL/CSQL — run in a sibling JVM process, not embedded in the server. The decision to put PL in its own process makes the CUBRID server immune to JVM stalls, GC pauses, and user-code crashes.

```mermaid
flowchart LR
  subgraph CSERVER["cub_server (C)"]
    SP_CALL["SP call site<br/>(qexec_execute_proc)"]
    SP_BRIDGE["sp_bridge / sp_send_call_info"]
    SP_CTL["sp_pl_socket<br/>(per-session UDS)"]
  end

  subgraph CUBPL["cub_pl (JVM)"]
    LISTENER["listener thread<br/>(accepts session UDS)"]
    DISPATCH["dispatch handler"]
    JAVASP["JavaSP runtime<br/>(reflection on user JAR)"]
    PLCSQL["PL/CSQL runtime<br/>(compiled to Java AST -> JAR)"]
    JDBC["server-side JDBC bridge<br/>(callbacks to cub_server)"]
    CLASSLOADER["classloader hierarchy<br/>(parent: system, child: SP-class)"]
  end

  CATALOG[("_db_stored_procedure<br/>_db_stored_procedure_code")]

  SP_CALL --> SP_BRIDGE --> SP_CTL
  SP_CTL -- UDS / CSS framing --> LISTENER
  LISTENER --> DISPATCH
  DISPATCH --> JAVASP
  DISPATCH --> PLCSQL
  JAVASP --> CLASSLOADER
  PLCSQL --> CLASSLOADER
  JAVASP -- queries / DML --> JDBC
  PLCSQL -- queries / DML --> JDBC
  JDBC -- CSS frames --> SP_CTL
  CSERVER -- catalog rows --> CATALOG
  CUBPL -. reads via JDBC .-> CATALOG
```

Three properties shape the PL family:

(a) Two languages, one runtime. Both Java SPs and PL/CSQL execute in the same cub_pl JVM. PL/CSQL is parsed by an ANTLR 4 grammar inside pl_server, lowered to a CUBRID-specific Java AST (DeclProgram / StmtBlock / ExprBinaryOp / loopOpt), emitted as Java source by a visitor, compiled by an in-process javax.tools.JavaCompiler, packaged as a JAR (Base64), and returned to the C-side compile_handler so sp_add_stored_procedure_code can persist it next to the Java SP catalog rows. At call time both routes look identical to the dispatch handler. Detail: cubrid-pl-plcsql.md, cubrid-pl-javasp.md.

(b) The JVM calls back to the server through JDBC. When a stored procedure issues a query or DML, the JVM does not call the C-side server in-process; it goes through a server-side JDBC driver that routes through the same per-session UDS that received the SP call. From the engine’s point of view, PL is a privileged client — same CSS framing, same NRP dispatch, same prepared-statement cache.
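
The round trip in (b) can be miniaturised as one framed request followed by callback service until the result arrives. Everything below (frame layout, opcodes, names) is an illustrative reduction, not the CSS protocol; a socketpair-plus-fork stands in for the real per-session UDS and the JVM:

```cpp
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>
#include <cstdio>
#include <string>

enum : std::uint8_t { SP_CALL, SP_CALLBACK_QUERY, SP_ROWS, SP_RESULT };
struct Frame { std::uint8_t kind; std::string payload; };

static void write_frame(int fd, const Frame &f) {
  std::uint32_t len = f.payload.size();          // length-prefixed framing
  write(fd, &f.kind, 1);
  write(fd, &len, 4);
  write(fd, f.payload.data(), len);
}

static Frame read_frame(int fd) {
  Frame f{};
  std::uint32_t len = 0;
  read(fd, &f.kind, 1);
  read(fd, &len, 4);
  f.payload.resize(len);
  read(fd, f.payload.data(), len);
  return f;
}

// Server side: issue the call, then service callback queries until the result.
static std::string invoke_sp(int uds, const std::string &args) {
  write_frame(uds, {SP_CALL, args});
  for (;;) {
    Frame f = read_frame(uds);
    if (f.kind == SP_RESULT) return f.payload;
    // The JVM acts as a privileged client: run its query on this session's
    // transaction and stream the rows back over the same socket.
    write_frame(uds, {SP_ROWS, "rows-for:" + f.payload});
  }
}

int main() {
  int sv[2];
  socketpair(AF_UNIX, SOCK_STREAM, 0, sv);
  if (fork() == 0) {                             // child plays the cub_pl JVM
    Frame call = read_frame(sv[1]);
    write_frame(sv[1], {SP_CALLBACK_QUERY, "SELECT ..."});
    Frame rows = read_frame(sv[1]);
    write_frame(sv[1], {SP_RESULT, call.payload + "+" + rows.payload});
    _exit(0);
  }
  std::printf("result: %s\n", invoke_sp(sv[0], "arg1").c_str());
}
```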

(c) The classloader hierarchy is JavaSP-specific. PL/CSQL classes are generated by CUBRID and trusted; user JARs are loaded under a child classloader and a security manager that restricts what the SP code can do. The reflective dispatch on user JARs is JavaSP’s responsibility alone.

Detail docs: cubrid-pl-javasp.md for the Java SP wire protocol and JVM IPC; cubrid-pl-plcsql.md for the ANTLR grammar, AST, emitter, and JAR packaging.

These subsystems do not fit cleanly into any single layer above because they are touched by every layer. Each gets a short paragraph here; the full coverage lives in the per-doc files.

  • Boot. cubrid-boot.md covers cub_server startup — first-time createdb formats volumes and bootstraps the root-class catalog; restart hands off to log_recovery’s three-pass replay; the client side wires boot_restart_client to xboot_register_client over the network. Boot is also where every subsystem’s init function fires in order, and where the thread pools, page buffer, log manager, lock manager, and catalog all come online.
  • Server session. cubrid-server-session.md describes the per-client server-side state container — a SESSION_STATE keyed by an integer session id in a lock-free hash, cached on the connection entry for O(1) request lookup, and bound to the per-thread TDES so that every server request lands on its rightful transaction descriptor, prepared-statement cache, and parameter set.
  • Thread + worker pool (legacy). cubrid-thread-worker-pool.md describes how every server thread of execution structures itself — the per-thread cubthread::entry context, the worker_pool template (cores → workers → task queue) that runs queries / vacuum / loaddb / parallel-redo, the daemon + looper pattern that drives every periodic background flush and detection job, the lock-free hashmap shared by lock manager and page buffer, and the heavyweight csect RW primitive with its per-thread tracker.
  • Thread manager NG (CBRD-26177). cubrid-thread-manager-ng.md describes the connection/worker pool redesign — bounded epoll-driven connection workers, a coordinator brokering rebalancing and auto-scaling, send/recv budgets, per-worker context freelists, and atomic-free statistics — replacing the legacy thread-per-connection plus max_clients-task-worker layout.
  • Network protocol. cubrid-network-protocol.md covers how every server entry point is framed as one NET_SERVER_* opcode dispatched through a static (action_attribute, handler) table — connections accepted by cub_master, handed to cub_server workers via master::connector over a Unix-domain socket, then driven by an epoll-based connection worker reading CSS-framed packets and delegating to symmetric or_pack_* / or_unpack_* request marshalling on both sides.
  • Broker. cubrid-broker.md covers the CAS pool — cub_broker parent forks a fixed pool of cub_cas workers, exposes a single TCP listener, hands accepted client sockets to an idle CAS through a Unix-domain rendezvous channel using SCM_RIGHTS file-descriptor passing, and lets each CAS proxy CSS-framed traffic upstream to cub_server — all coordinated through one SysV shared-memory segment.
  • Error management. cubrid-error-management.md covers the global ER_* enum, the per-thread cuberr::context with its base er_message and nested-error stack, the er_set family with printf-style format compilation, the localised cubrid.msg / csql.msg / utils.msg catalogs in NetBSD/FreeBSD nl_catd format, the cubrid_*.err log file with size-based rotation, and the wire format that flattens an error to three OR_INT fields plus the message string for client-server propagation.
  • System parameters. cubrid-system-parameters.md covers the prm_Def[] registry, the cubrid.conf INI parser with section selection, environment-variable overrides, the db_set_system_parameters SQL path, and the per-session SESSION_PARAM array — the one ordered resolution flow that every other subsystem reads through prm_get_*_value.
  • Monitoring. cubrid-monitoring.md covers two layered counter systems — a C++ template-based cubmonitor library that registers groups of statistics and supports per-transaction sheets, and the older C perf_monitor / pstat_Metadata array used by SHOW STATS and statdump — plus per-subsystem monitors (e.g., the per-vacuum-worker overflow-page threshold tracker).
  • DBI / CCI. cubrid-dbi-cci.md covers the client API — the single db_* C API on top of boot_cl and network_cl that walks every statement through a four-stage FSM (Initial → Compiled → Prepared → Executed) inside a DB_SESSION, and the broker-side wire driver (ux_database_connect, ux_prepare, ux_execute, ux_fetch, ux_end_tran) dispatched by a flat server_fn_table so JDBC, CCI, ODBC, Python, and PHP all reach the engine through the same db_* core.
  • SA / CS runtime. cubrid-sa-cs-runtime.md covers how the same source tree compiles three times — cub_server (SERVER_MODE), libcubridsa (SA_MODE), libcubridcs (CS_MODE) — so admin utilities can either embed the entire engine in-process and operate on the on-disk database directly, or talk over CSS to a separately running daemon, with the choice driven by per-utility classification (SA_ONLY / CS_ONLY / SA_CS) and a runtime dlopen of either libcubridsa.so or libcubridcs.so.

A few more cross-cutting docs round out the surface but are narrower in scope: cubrid-charset-collation.md for codeset conversion and locale-aware comparison; cubrid-timezone.md for IANA tzdata compilation and DATETIMETZ/TIMESTAMPTZ conversion; cubrid-show-commands.md for the introspection virtual scans; cubrid-runtime-memoization.md for the trio of runtime caches sharing one playbook.

The base / infrastructure substrate — custom memory allocators and lock-free primitives — has its own subcategory under cubrid-overview-base-infra.md. AREA (the slab pool for fixed-size objects), the per-thread Lea-heap private allocator, and the six lock-free docs (overview, transactional reclamation, bitmap, freelist, hashmap, circular queue) all live there. This is the family every layer above composes with rather than depending on.

Different goals lead into the tree at different points. The table below names the canonical first-doc and the next two or three to follow.

| Goal | Start at | Then |
| --- | --- | --- |
| Understand a SELECT end to end | cubrid-rpath-select.md | cubrid-broker.md -> cubrid-network-protocol.md -> cubrid-parser.md -> cubrid-semantic-check.md -> cubrid-query-rewrite.md -> cubrid-query-optimizer.md -> cubrid-xasl-generator.md -> cubrid-query-executor.md -> cubrid-scan-manager.md -> cubrid-list-file.md -> cubrid-cursor.md |
| Understand a COMMIT (write path) | cubrid-rpath-write.md | cubrid-locator.md -> cubrid-heap-manager.md -> cubrid-btree.md -> cubrid-mvcc.md -> cubrid-lock-manager.md -> cubrid-prior-list.md -> cubrid-log-manager.md -> cubrid-double-write-buffer.md -> cubrid-page-buffer-manager.md |
| Understand a server restart | cubrid-rpath-recovery.md | cubrid-boot.md -> cubrid-checkpoint.md -> cubrid-recovery-manager.md -> cubrid-2pc.md -> cubrid-vacuum.md |
| Understand DDL on a class | cubrid-ddl-execution.md | cubrid-class-object.md -> cubrid-catalog-manager.md -> cubrid-locator.md -> cubrid-xasl-cache.md -> cubrid-trigger.md -> cubrid-partition.md |
| Understand HA failover | cubrid-heartbeat.md | cubrid-ha-replication.md -> cubrid-cdc.md -> cubrid-2pc.md -> cubrid-flashback.md -> cubrid-backup-restore.md |
| Understand a stored procedure call | cubrid-pl-javasp.md | cubrid-pl-plcsql.md -> cubrid-network-protocol.md -> cubrid-server-session.md -> cubrid-dbi-cci.md |
| Understand a backup / restore | cubrid-backup-restore.md | cubrid-log-manager.md -> cubrid-checkpoint.md -> cubrid-recovery-manager.md -> cubrid-flashback.md |
| Understand bulk load | cubrid-loaddb.md | cubrid-locator.md -> cubrid-heap-manager.md -> cubrid-btree.md -> cubrid-statistics.md -> cubrid-sa-cs-runtime.md |
| Understand CCI / JDBC client | cubrid-dbi-cci.md | cubrid-broker.md -> cubrid-network-protocol.md -> cubrid-cursor.md -> cubrid-server-session.md |
| Understand WHY CUBRID looks this way | cubrid-design-philosophy.md | this overview, then the subsystem of interest |

The cubrid tree is organised into eight subcategories. Each detail doc declares its subcategory in its frontmatter; the tables below are a navigation aid mirroring that taxonomy. The summary column is extracted (verbatim or lightly compressed) from each doc’s frontmatter summary field.

| Doc | Summary |
| --- | --- |
| cubrid-boot.md | Server startup, first-time createdb, restart-recovery dispatch, and client connect — every subsystem’s init firing in order. |
| cubrid-broker.md | cub_broker parent + cub_cas worker pool, SCM_RIGHTS file-descriptor rendezvous, SysV shared-memory control plane, ACL, monitoring. |
| cubrid-dbi-cci.md | The unified db_* client API (Initial -> Compiled -> Prepared -> Executed FSM in DB_SESSION) and the broker-side wire driver (ux_*) dispatched by server_fn_table. |
| cubrid-error-management.md | ER_* enum, per-thread cuberr::context, nested-error stack, er_set family, nl_catd message catalog, _latest-symlinked rotating error log, wire propagation format. |
| cubrid-locator.md | OID workspace -> server-side locator_*_force bridge for batched insert/update/delete, fan-out to heap/btree/lock/log/FK/replication. |
| cubrid-loaddb.md | Bulk loader — tokenise CUBRID-format object file, batch ship to server-side worker pool under Bulk-Update lock, direct-path locator_multi_insert_force, post-load statistics rebuild. |
| cubrid-monitoring.md | C++ cubmonitor library + legacy perf_monitor / pstat_Metadata for SHOW STATS / statdump + per-subsystem ad-hoc monitors. |
| cubrid-network-protocol.md | NET_SERVER_* opcode table, cub_master accept + master::connector UDS handoff, epoll-driven connection worker, symmetric or_pack_* marshalling. |
| cubrid-sa-cs-runtime.md | One source tree, three builds (SERVER_MODE / SA_MODE / CS_MODE), per-utility classification SA_ONLY / CS_ONLY / SA_CS, runtime dlopen of libcubridsa.so or libcubridcs.so. |
| cubrid-server-session.md | SESSION_STATE lock-free hash by integer session id, cached on the connection entry, bound to per-thread TDES. |
| cubrid-system-parameters.md | prm_Def[] registry, cubrid.conf INI parsing with section selection, env overrides, db_set_system_parameters SQL path, per-session SESSION_PARAM. |
| cubrid-thread-worker-pool.md | cubthread::entry context, worker_pool template, daemon+looper pattern, lock-free hashmap, heavyweight csect RW primitive. |
| cubrid-thread-manager-ng.md | CBRD-26177 redesign — bounded epoll connection workers, coordinator-driven rebalancing, send/recv budgets, per-worker freelists. |

storage-engine (the layered storage stack)

| Doc | Summary |
| --- | --- |
| cubrid-disk-manager.md | Volumes / sectors / files / pages, two-step sector reservation, permanent vs temporary disk cache, adaptive volume extension, three extensible-data tables. |
| cubrid-page-buffer-manager.md | BCB array, three-zone LRU (private + shared) with quotas, direct victim handoff via lock-free queues, custom read/write/flush latch per BCB. |
| cubrid-double-write-buffer.md | Sequential staging volume fsync’d before home write — torn-write protection between page buffer and data files. |
| cubrid-heap-manager.md | Slotted pages, nine record types, INSERT/UPDATE/DELETE/READ flow, MVCC versioning inside the record header, hot-path caches. |
| cubrid-btree.md | Slotted-page nodes, key\|\|OID concatenation, latch-coupled descent, unique-constraint enforcement at the OID suffix. |
| cubrid-extendible-hash.md | EHID-rooted directory file with doubling pointer count, slotted bucket pages with binary search, system-op-bracketed splits/merges, RVEH_* WAL records. |
| cubrid-overflow-file.md | Heap big-record and B+Tree overflow-OID page chains, FILE_MULTIPAGE_OBJECT_HEAP / FILE_BTREE_OVERFLOW_KEY / per-tree OID overflow, WAL discipline for crash safety. |
| cubrid-lob.md | BLOB/CLOB stored as files outside the data volume, locator-URI naming, per-transaction red-black tree on TDES, commit/rollback file-system reconciliation. |
| cubrid-tde.md | Two-level key hierarchy, AES-256-CTR or ARIA-256-CTR with per-page nonces, encrypt-on-flush / decrypt-on-read hooks, separate <db>_keys master-key file, per-file TDE flag. |

base-infra (custom allocators + lock-free primitives)

| Doc | Summary |
| --- | --- |
| cubrid-private-allocator.md | Per-thread Lea-heap arena (Doug Lea’s dlmalloc vendored under customheaps) instantiated once per THREAD_ENTRY, fronted by db_private_alloc / _free / _realloc macros that route SERVER_MODE allocations to the thread’s heap, CS_MODE to the workspace, and SA_MODE through a PRIVATE_MALLOC_HEADER-tagged dispatch. C++ STL wrapper cubmem::private_allocator<T>, private_unique_ptr<T>, PRIVATE_BLOCK_ALLOCATOR. |
| cubrid-common-area.md | Slab-style pool allocator — chained 256-block BLOCKSET arrays of fixed-cell blocks, lock-free per-block bitmap, single hint pointer for the common case; serves DB_VALUE, TP_DOMAIN, OBJ_TEMPLATE, DB_OBJLIST, set objects, etc. |
| cubrid-lockfree-overview.md | Map of CUBRID’s lock-free primitives — legacy C lock_free.{h,c} family and modern C++ lockfree::* namespace — anchored on a single transactional reclamation spine. |
| cubrid-lockfree-transaction.md | System / table / descriptor / address-marker reclamation — per-data-structure transaction id, per-thread descriptors that bracket reads, periodic minimum-active-id scan that tells the freelist when a retired node is no longer reachable from any live reader. |
| cubrid-lockfree-bitmap.md | Chunked atomic-word bitmap — std::atomic<unsigned int> chunks, two chunking styles, CAS bit-flip, round-robin start hint that bumps atomically per get_entry under SERVER_MODE. |
| cubrid-lockfree-circular-queue.md | Bounded MPMC ring with two cursor atomics and a per-slot block-flag word — used for vacuum log-block dispatch, page-buffer victim handoff, and CDC log-info forwarding. |
| cubrid-lockfree-freelist.md | Typed freelist<T> with a single available stack, a one-block back-buffer that swaps in lazily, an on_reclaim payload hook, and a clearly-documented ABA window in the pop path bounded by the back-buffer time. |
| cubrid-lockfree-hashmap.md | Harris–Michael chained hash with optional per-entry mutex, in two parallel implementations (legacy C lf_hash_* and modern C++ lockfree::hashmap<K,T>) bridged by cubthread::lockfree_hashmap<K,T> whose m_type ∈ {OLD, NEW} is decided at init by PRM_ID_ENABLE_NEW_LFHASH. |

query-processing (parse → execute → return)

| Doc | Summary |
| --- | --- |
| cubrid-parser.md | Flex/Bison pipeline — single-buffer YY_INPUT, GLR Bison grammar, parser_new_node, polymorphic-tagged PT_NODE with three function-pointer arrays, per-PARSER_CONTEXT block allocator. |
| cubrid-semantic-check.md | pt_check_with_info driver chains four passes (name resolution, where-clause aggregate check, host-variable replacement, statement-aware semantic_check_local) then pt_cnf for predicate CNF. |
| cubrid-query-rewrite.md | LIMIT lowering into INST_NUM/ORDERBY_NUM/GROUPBY_NUM, view inlining, subquery flattening, predicate reduction, auto-parameterization, plan-time multi-range LIMIT optimization. |
| cubrid-query-optimizer.md | QO_ENV query graph (QO_NODE / QO_SEGMENT / QO_TERM), partial-then-total dynamic-programming join enumeration over 2^N join_info vector, System-R fixed-cpu/io+variable-cpu/io cost model, QO_PLAN finalisation. |
| cubrid-xasl-generator.md | QO_PLAN -> XASL_NODE tree, recursive gen_outer/gen_inner walk, aptr/dptr/scan_ptr slots, REGU_VARIABLE / ACCESS_SPEC / OUTPTR_LIST sub-IRs, xts_* offset-table serialisation. |
| cubrid-xasl-cache.md | Server-wide latch-free hashmap keyed on SHA-1 of rewritten SQL plus time_stored, single 32-bit cache_flag refcounting, recompile-threshold (RT) drift guard, per-class OID dependent-list invalidation on DDL. |
| cubrid-query-executor.md | Volcano-style XASL interpreter — qexec_execute_main_block dispatches by xasl->type, drives uniform open/next/close over SCAN_ID operators, pushes results into per-XASL list files. |
| cubrid-scan-manager.md | Polymorphic SCAN_ID handle + open/start/next/end/close protocol, per-SCAN_TYPE dispatch into heap, B+Tree, list-file, set, value, JSON-table, dblink, show, parallel-heap, method scans. |
| cubrid-query-evaluator.md | eval_pred walks PRED_EXPR tree of T_PRED / T_EVAL_TERM under three-valued logic, fetch_peek_dbval dispatches on REGU_VARIABLE::type, eval_fnc pre-compiles fast single-shape predicate. |
| cubrid-scalar-functions.md | Operator-primitive layer — arithmetic.c, numeric_opfunc.c, string_opfunc.c, query_opfunc.c, crypt_opfunc.c, string_regex_*; BCD numeric, collation-aware string, RE2/std::regex switching. |
| cubrid-list-file.md | QFILE_LIST_ID linked-page abstraction, per-query QMGR_TEMP_FILE membuf-then-FILE_TEMP substrate, uniform open/add/scan/close contract used by all materialised tuple streams. |
| cubrid-post-processing.md | qexec_groupby and qexec_execute_analytic — sort-based vs hash-based GROUP BY at runtime, fallback to external sort when hash exceeds max_agg_hash_size. |
| cubrid-hash-join.md | Build/Probe driver in query_hash_join.c reusing HASH_LIST_SCAN, three table layouts (in-memory mht_hls, hybrid memory+file, extendible FHS), grace-style equi-hash partitioning on spill. |
| cubrid-parallel-query.md | One global parallel-query worker pool, compute_parallel_degree() keyed on page count, three operator-specific orchestrators (parallel heap-scan, hash-join build/execute, query-execute fan-out). |
| cubrid-external-sort.md | Two-phase replacement-selection-style run generator (sort_inphase_sort) + balanced k-way merge (sort_exphase_merge) over FILE_TEMP runs, single callback-driven entry point. |
| cubrid-cursor.md | Client-side CURSOR_ID over server-side QFILE_LIST_ID, qfile_get_list_file_page paging, length-prefixed packed-row decoding, OID prefetch, holdability via session-scoped holdable-cursor list. |
| cubrid-runtime-memoization.md | Three independent caches sharing one playbook (DB_VALUE-array hash key, fail-on-full budget, hit-ratio guard) at three lifecycle scopes — per-XASL sq_cache, per-BTID fpcache, per-XASL memoize::storage. |
| cubrid-partition.md | Master class + N child classes, per-partition rule (range / hash / list) on master SM_PARTITION, server-side PRUNING_CONTEXT for optimize-time elimination + execute-time route + per-partition scan dispatch. |
| cubrid-serial.md | _db_serial row-per-sequence, exclusive-OID-lock advance with optional client-side caching, AUTO_INCREMENT columns through synthesised <class>_ai_<attr> serials. |

txn-recovery (concurrency, logging, recovery)

| Doc | Summary |
| --- | --- |
| cubrid-transaction.md | TDES descriptor in server-wide trantable, isolation-level dispatch (SI / lock-based), savepoint-driven nested partial-rollback boundaries via system ops. |
| cubrid-mvcc.md | MVCCID assignment, per-transaction snapshot construction in mvcctable::build_mvcc_info, active-MVCCID tracking, vacuum coordination. |
| cubrid-lock-manager.md | Multi-granularity per-OID lock grant/convert/revoke, transaction waits-for graph for deadlock detection. |
| cubrid-prior-list.md | Singly-linked LOG_PRIOR_NODE producer queue, single short-held mutex, log-flush daemon drain under log critical-section, group-commit-by-batching. |
| cubrid-log-manager.md | WAL record layout, LSA naming, in-memory prior-list / append-page pipeline, archive volumes, ACID-D underwriting. |
| cubrid-checkpoint.md | Fuzzy ARIES checkpoint — periodic daemon LOG_START_CHKPT/LOG_END_CHKPT pair with active-tx snapshot + redo-LSA hint, log_Gl.hdr.chkpt_lsa advance. |
| cubrid-recovery-manager.md | Three-pass restart anchored on most-recent checkpoint LSA, per-record-type dispatch via RV_fun[], parallelised redo via per-page worker pool. |
| cubrid-vacuum.md | Forward WAL walk in fixed-size blocks below oldest-visible-MVCCID watermark, master->worker job dispatch, dropped-files tracker. |
| cubrid-2pc.md | Coordinator + participant FSMs through LOG_2PC_EXECUTE, prepared-state log records survive crash, in-doubt recovery during ARIES analysis pass. |
| cubrid-backup-restore.md | Online physical backup — snapshot data volumes + log records bracketed by start_lsa and next checkpoint, point-in-time forward replay on restore. |

ddl-schema (catalog, schema graph, authorization, statistics)

| Doc | Summary |
| --- | --- |
| cubrid-catalog-manager.md | Per-class disk representation + statistics in dedicated catalog (CTID), parallel system classes (_db_class, _db_attribute, _db_index, …) bootstrapped from fixed root-class OID. |
| cubrid-class-object.md | In-memory SM_CLASS graph in client-side workspace — attributes / methods / partitions / constraints / triggers; catcls_* mediates between graph and on-disk catalog records. |
| cubrid-ddl-execution.md | do_statement dispatch into do_create_entity / do_alter / do_drop, SM_TEMPLATE build, sm_finish_class -> update_class -> install_new_representation, locator_add_class + catcls_insert_catalog_classes, sm_bump_local_schema_version. |
| cubrid-trigger.md | SQL-99 ECA active rules — _db_trigger instances + TR_TRIGGER cache on SM_CLASS, tr_prepare_class / tr_before_object / tr_after_object triplet, OID-stack statement-level recursion control. |
| cubrid-statistics.md | xstats_update_statistics heap+B+Tree walk produces cardinality / NDV / leaf-page counts / partial-key fanouts; persisted on latest disk repr; client-side qo_get_attr_info feeds qo_iscan_cost / qo_sscan_cost / qo_equal_selectivity / qo_range_selectivity. |
| cubrid-authentication.md | Users / passwords / per-object privileges as MOP-keyed rows in db_user, db_password, _db_auth, db_authorization; au_login / au_fetch_class / per-class AU_CLASS_CACHE collapse SELECT-time grant lookup into one bitmask test. |

replication-ha (distribution, log streaming, change capture)

| Doc | Summary |
| --- | --- |
| cubrid-heartbeat.md | UDP gossip cluster liveness, per-node independent calc_score master election, slave -> to-be-master -> master FSM, job-queue with four worker threads, witness-host (ha_ping_hosts) split-brain guard. |
| cubrid-ha-replication.md | Master emits LOG_REPLICATION_DATA / LOG_REPLICATION_STATEMENT alongside physiological WAL; copylogdb ships archives; applylogdb (la_apply_log_file) walks them forward and per-record-type dispatches into the storage layer. |
| cubrid-cdc.md | cdc_* API walks LOG_SUPPLEMENTAL_INFO records via log_reader; legacy la_* HA applier shares the same record format internally. |
| cubrid-flashback.md | Two-phase forward log walk (per-tx summary then per-tx detailed log-info pull), shares CDC entry format, reads against archived log volumes. |

pl-language (procedural extensions in the JVM)

| Doc | Summary |
| --- | --- |
| cubrid-pl-javasp.md | JavaSP runtime in cub_pl JVM — reflective dispatch on user JARs, classloader hierarchy, security sandbox; shares catalog rows + transport with PL/CSQL. |
| cubrid-pl-plcsql.md | PL/CSQL parsed by ANTLR 4 in pl_server JVM, lowered to CUBRID Java AST, emitted as Java source, compiled in-process by javax.tools.JavaCompiler, packaged as Base64 JAR, persisted next to JavaSP catalog rows. |

| Doc | Summary |
| --- | --- |
| cubrid-charset-collation.md | Four codesets (binary, ISO-8859-1, EUC-KR, UTF-8), LDML locale rules compiled to UCA-weight shared libraries, function-pointer LANG_COLLATION vtable consumed by B+Tree, sort, and string operators. |
| cubrid-timezone.md | Compiles IANA tzdata into generated timezones.c + libcubrid_timezones.so, packs (zone, gmt-offset-rule, ds-rule) triple into 32-bit TZ_ID, resolves wall-clock via tz_datetime_utc_conv with LOCAL_STD/LOCAL_WALL/UTC qualifiers. |
| cubrid-json-table.md | C++ scanner whose cursor stack walks parser-built cubxasl::json_table::node tree, expands input JSON via db_json_iterator_* per NESTED PATH, emits rows at leaves; SCAN_TYPE consumed by scan_next_scan. |
| cubrid-show-commands.md | SHOW <name> rewritten into SELECT * FROM (PT_SHOWSTMT), dispatched through S_SHOWSTMT_SCAN, tuples synthesised on demand from per-SHOWSTMT_TYPE start/next/end function pointers. |
| cubrid-compactdb.md | Offline compaction — walk each class heap, NULL dangling OID references, reclaim empty heap pages, drop obsolete catalog representations, defragment heap files; client-driven, scoped per class, three numbered passes. |