
CUBRID Reading Path — How a Write Commits End-to-End


A single client connection sends INSERT INTO t VALUES (...) and then COMMIT. The reading path that follows is the durable path — how those two statements turn into bytes on three stable-storage targets (active log volume, heap data volume, overflow file) so the client’s commit acknowledgment is the engine’s promise that the row will survive any subsequent crash. We thread through the detail docs in order: parser, semantic-check, statement dispatch, locator *_force fan-in, heap manager, overflow file, B+Tree, MVCC, trigger, prior list, log manager, lock manager, HA replication, transaction state machine, page buffer, DWB, checkpoint, vacuum. The trip ends not at “COMMIT returned” but at the eventual page flush and eventually-eventual vacuum reclamation — calling the write “done” requires the LOG_COMMIT record durable, the row reachable through every relevant index under every isolation level, and the dirty heap+btree pages either flushed or covered by a torn-write defense.

Identical to the SELECT path: the application calls db_* (CCI, ODBC, JDBC, or a CAS broker shim), the broker ships the SQL text plus bind parameters over the network protocol to the server’s request dispatcher, and a worker thread picks it up. The client-side workspace (work_space.c, locator_cl.c) holds MOPs that will later receive permanent OIDs. See cubrid-rpath-select.md steps 1–3. The only INSERT-specific note is that the workspace will mark the new MOP dirty with LC_FLUSH_INSERT so the eventual flush packs it into an LC_COPYAREA — but for the executor-driven path that this doc follows (INSERT ... VALUES (...) parsed and run inside one statement), the workspace flush is bypassed and the executor calls locator_attribute_info_force directly on the server with locally-built attribute-info bundles.

Step 2 — Parse + semantic-check + statement dispatch


The server’s parser phase is shared with SELECT: the GLR Bison grammar in csql_grammar.y reduces the INSERT INTO t VALUES (...) text into a PT_INSERT node — a PT_NODE whose node_type is the disambiguating tag and whose info.insert arm carries the target class, the value list, and the optional ON DUPLICATE KEY UPDATE clause. The lexer is a Flex DFA with start-condition states; the parser is built with %glr-parser so it can absorb SQL’s historic ambiguities. Semantic check then runs (cubrid-semantic-check.md): name resolution binds t to its class OID, the value-list types are unified against the table’s column types via the type-coercion rules, and any user-specified DEFAULT values are filled in. See cubrid-parser.md for the parse-tree shape and cubrid-semantic-check.md for the resolution and type-checking passes.
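
The tagged-union shape just described can be sketched in a few lines; the names below only loosely mirror the real parse_tree.h (a toy, not the CUBRID definition):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy PT_NODE: one struct, a node_type tag, a per-kind info arm. */
typedef enum { PT_SELECT, PT_INSERT, PT_UPDATE } pt_node_type;

typedef struct pt_node PT_NODE;

typedef struct
{
  const char *target_class;   /* "t" in INSERT INTO t VALUES (...) */
  PT_NODE *value_list;        /* linked list of value expressions */
  PT_NODE *odku;              /* ON DUPLICATE KEY UPDATE arm, or NULL */
} pt_insert_info;

struct pt_node
{
  pt_node_type node_type;     /* the disambiguating tag */
  union
  {
    pt_insert_info insert;    /* valid only when node_type == PT_INSERT */
  } info;
};

/* Downstream code always tests the tag before touching an arm. */
static const char *
target_class_of (const PT_NODE *node)
{
  return node->node_type == PT_INSERT ? node->info.insert.target_class : NULL;
}
```

Dispatch code downstream (do_statement and its per-kind handlers) follows the same discipline as the accessor here: switch on node_type, then read the matching info arm.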

The post-semantic-check statement is then dispatched. The single entry point is do_statement in src/query/execute_statement.c, which is also the dispatcher for DDL — do_statement is one big switch on PT_NODE.node_type that routes to per-kind handlers. For PT_INSERT the handler is do_insert (and its prepared-statement sibling do_execute_statement re-enters the same switch on a re-prepared PT_NODE). do_insert builds an XASL fragment for the value-list, runs it through the executor, and for each produced row calls into the locator’s force family. See cubrid-ddl-execution.md §“Top-level dispatch — do_statement and the DDL switch” for the switch’s structure; the DML arms sit alongside the DDL arms in the same dispatch.

locator_attribute_info_force (locator_sr.c) is the canonical fan-in for every server-side row mutation. Its body is a switch (operation) on LC_COPYAREA_OPERATION. For INSERT the LC_FLUSH_INSERT arm builds a RECDES from HEAP_CACHE_ATTRINFO via locator_allocate_copy_area_by_attr_info and dispatches to locator_insert_force. UPDATE falls through into the same encoding step after reading the existing record; INSERT has no existing version and skips the snapshot-aware read.

locator_insert_force drives six responsibilities in order: (1) partition pruning to pick the actual partition class, (2) heap insert via heap_insert_logical (which decides the OID), (3) the per-index loop in locator_add_or_remove_index that touches every B+Tree on the class, (4) FK checks via locator_check_foreign_key, and two side effects: (5) HA replication via repl_log_insert and (6) WAL via the heap and btree primitives’ own log_append_* calls. See cubrid-locator.md §“The ‘force’ family”.

heap_insert_logical (heap_file.c) is the slotted-page side. (a) It stamps the record’s MVCC header — mvcc_ins_id is assigned to the transaction’s MVCCID (lazily allocated via mvcctable::get_new_mvccid if this is the first write; see cubrid-mvcc.md §“MVCCID assignment policy”). A brand-new INSERT’s mvcc_rec_header flag byte carries only VALID_INSID. (b) If the record is too big for any heap page, the row body goes to the overflow file via heap_ovf_insert → overflow_insert (cubrid-overflow-file.md), laid across a chain of OVERFLOW_FIRST_PART + OVERFLOW_REST_PART pages under a log_sysop_start / log_sysop_attach_to_outer bracket; the heap home slot stores a REC_BIGONE forwarding record. Each overflow page emits an RVOVF_NEWPAGE_INSERT redo record; a LOG_DUMMY_OVF_RECORD on the head page anchors an LSN for HA replication and vacuum. (c) Otherwise the record is REC_HOME; the heap manager finds a target home page via HEAP_STATS_BESTSPACE_CACHE, falling back to HEAP_HDR_STATS.estimates.best[], a bounded scan, and finally heap_alloc_new_page. (d) A slot is allocated; the slot id + (volid, pageid) is the row’s permanent OID. FILE_HEAP pages are ANCHORED_DONT_REUSE_SLOTS, so the OID never aliases. (e) Per-page stats and the bestspace cache update; the page is dirty. See cubrid-heap-manager.md §“Insert flow”.
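
The target-page choice in (c) reduces to “try the cached best-space entries, then fall back to allocation”. A minimal sketch, with a fixed array standing in for HEAP_STATS_BESTSPACE_CACHE and all names hypothetical:

```c
#include <assert.h>

#define NUM_BEST 4

typedef struct { int pageid; int freespace; } best_entry;

/* Return a pageid with room for recsize bytes. Cached best-space entries
   first; the bounded scan is omitted; new-page allocation is modeled as
   handing out the next fresh pageid. */
static int
heap_pick_page (best_entry best[NUM_BEST], int recsize, int *next_new_pageid)
{
  for (int i = 0; i < NUM_BEST; i++)
    {
      if (best[i].freespace >= recsize)
        {
          best[i].freespace -= recsize;   /* per-page stats update */
          return best[i].pageid;
        }
    }
  return (*next_new_pageid)++;            /* heap_alloc_new_page fallback */
}
```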

For each B+Tree on the class, the locator’s locator_add_or_remove_index extracts the key columns from the new record via heap_attrvalue_get_key, and calls btree_insert with the (key, OID) pair. The btree side traverses the tree under latch-coupling discipline: the descent fixes parent with S latch, fixes child, releases parent, all the way down — except on the write path where it escalates the leaf to an X latch. If the leaf is full and a split is needed, btree_insert_helper opens a log_sysop_start system-op bracket, calls btree_split_node (or btree_split_root for a height-growing split), promotes a separator key into the parent, and closes with log_sysop_end_logical_undo — so abort can re-merge by replaying logical undo rather than reverse-applying physical page movement. See cubrid-btree.md §“Splits — split point, key promotion, parent update”.
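
The split shape can be shown with a toy midpoint split over int keys; the real btree_split_node picks the split point by byte size, not key count, and this sketch invents its own signatures:

```c
#include <assert.h>
#include <string.h>

/* Split a full "leaf" of n sorted keys at the midpoint. The right half
   keeps keys >= the separator; a copy of the separator is what gets
   promoted into the parent node. */
static int
leaf_split (const int *keys, int n, int *left, int *nleft,
            int *right, int *nright)
{
  int mid = n / 2;
  memcpy (left, keys, mid * sizeof (int));
  memcpy (right, keys + mid, (n - mid) * sizeof (int));
  *nleft = mid;
  *nright = n - mid;
  return right[0];      /* separator key for the parent */
}
```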

The leaf itself stores LEAF_REC (a fixed prefix of VPID ovfl + short key_len) followed by the key bytes followed by an OID list. For non-unique keys the OID list grows inline up to a per-page threshold; spillover crosses into a per-key overflow chain allocated from BTID_INT::ovfid: PAGE_BTREE-typed pages headed by a BTREE_OVERFLOW_HEADER whose slot 0 carries next_vpid and whose slot 1 carries the OIDs sorted by OID for binary search. New overflow pages are linked at the head (the new page’s next_vpid becomes the old first overflow page) so subsequent inserts don’t pay tail-walk cost. See cubrid-overflow-file.md §“Walk: B+Tree overflow OID list”.
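
Two of these mechanics are easy to sketch with ints standing in for OIDs: the O(1) head-link of a fresh overflow page, and the binary search the OID-sorted slot layout enables. Types and names are illustrative, not the real btree.c structures:

```c
#include <assert.h>
#include <stddef.h>

typedef struct ovf_page { int vpid; struct ovf_page *next_vpid; } ovf_page;

/* New page points at the old head; the caller re-points the per-key
   chain head at the returned page. O(1), no tail walk. */
static ovf_page *
ovf_link_at_head (ovf_page *old_first, ovf_page *fresh)
{
  fresh->next_vpid = old_first;
  return fresh;
}

/* The sorted OID slot buys a log2(n) membership probe. */
static int
oid_list_contains (const int *oids, int n, int oid)
{
  int lo = 0, hi = n - 1;
  while (lo <= hi)
    {
      int mid = lo + (hi - lo) / 2;
      if (oids[mid] == oid) return 1;
      if (oids[mid] < oid) lo = mid + 1; else hi = mid - 1;
    }
  return 0;
}
```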

Unique-key check happens at insert time, under the leaf’s X latch, dispatched via btree_find_oid_and_its_page with BTREE_OP_PURPOSE = INSERT. Because the OIDs in a leaf record’s suffix list are sorted by OID, a unique index has at most one OID per key — finding any OID for the key already there is the duplicate. The check is gated by the BTREE_NEED_UNIQUE_CHECK macro: it runs only on active transactions, never on recovery redo (recovery never inserts a duplicate because the original insert was already validated). See cubrid-btree.md §“Unique-key handling — OID-list and stats”.

After heap+index, locator_insert_force runs the constraint-orchestration helpers from the locator. Three are relevant for INSERT.

locator_add_or_remove_index (already invoked in step 5) is itself the unique-key check loop — btree_insert returns ER_BTREE_UNIQUE_FAILED if the key already exists in a unique B+Tree, and the locator propagates the error. locator_check_unique_btree_entries is the deeper integrity check used by CHECKDB and post-restore consistency, not on the hot insert path. locator_check_foreign_key walks the FK list on the class representation, extracts the referencing-column key from the new record, and probes the parent class’s PK B+Tree via btree_keyoid_checks; on miss the insert is rejected with ER_FK_INVALID. See cubrid-locator.md §“Constraint orchestration”.

The order is deliberate: heap insert first (to pin the OID), then indexes (to populate every key including the PK that other FKs might reference), then FK check (the parent might also be a row that this transaction is inserting earlier in the same batch). Within locator_check_foreign_key, the parent lookup is a regular btree fetch with the transaction’s own snapshot, so a parent inserted earlier in this transaction is visible by virtue of “my own writes are visible to me” — see cubrid-mvcc.md’s mvcc_satisfies_snapshot truth table for the MVCC_IS_REC_INSERTED_BY_ME arm.
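
A reduced sketch of that visibility arm, assuming a simplified snapshot with only a lowest-active watermark; the real mvcc_satisfies_snapshot also consults the snapshot's list of in-flight MVCCIDs, elided here:

```c
#include <assert.h>
#include <stdbool.h>

typedef struct
{
  unsigned long lowest_active_mvccid;  /* every id below this has committed */
  unsigned long my_mvccid;             /* this transaction's own id */
} snapshot_sketch;

/* Is a record with insert-MVCCID ins_id visible under this snapshot?
   My own writes are visible before commit; otherwise the inserter must
   have committed before the snapshot was taken. */
static bool
rec_insid_visible (unsigned long ins_id, const snapshot_sketch *snap)
{
  if (ins_id == snap->my_mvccid)
    return true;                               /* inserted by me */
  return ins_id < snap->lowest_active_mvccid;  /* committed before snapshot */
}
```

The FK probe described above rides on exactly the first branch: a parent row inserted earlier in the same transaction carries my_mvccid and passes.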

If the target class has BEFORE INSERT or AFTER INSERT triggers defined, they fire from the client side — not the server. Trigger firing happens in obt_apply_assignments (object_template.c) before the dirty MOP is packed into the LC_COPYAREA. The dispatch is gated by sm_active_triggers, which short-circuits the no-trigger case in O(1). When triggers do exist, tr_prepare_class builds a TR_STATE, the BEFORE pass runs via tr_before_object → tr_execute_activities (calling eval_action on each trigger in priority order), the heap mutation happens, and tr_after_object runs the AFTER pass and queues DEFERRED triggers onto the per-transaction tr_Deferred_activities chain. Recursion is bounded two ways: a depth counter (tr_Current_depth ≤ 32) catches infinite row-level recursion, and an OID stack (tr_Stack) silently skips re-entry of a STATEMENT-level trigger. See cubrid-trigger.md §“Firing path” and §“Recursion control”.
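
The two guards can be sketched as follows; the globals stand in for tr_Current_depth and tr_Stack, and the return convention (1 fire, 0 silently skip, -1 depth error) is invented for the example:

```c
#include <assert.h>

#define TR_MAX_DEPTH 32

static int tr_depth = 0;
static int tr_stack[TR_MAX_DEPTH];

/* Called before firing a trigger on object oid. */
static int
tr_enter (int oid)
{
  if (tr_depth >= TR_MAX_DEPTH)
    return -1;                       /* runaway row-level recursion */
  for (int i = 0; i < tr_depth; i++)
    if (tr_stack[i] == oid)
      return 0;                      /* already firing here: skip re-entry */
  tr_stack[tr_depth++] = oid;
  return 1;                          /* fire */
}

/* Called after the trigger's action finishes. */
static void
tr_exit (void)
{
  if (tr_depth > 0)
    tr_depth--;
}
```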

For the executor-driven insert path that this doc traces, the trigger fires before the DML reaches locator_attribute_info_force, so by the time we are at steps 4–6 the trigger has either accepted, rejected (raising ER_TR_REJECTED and rolling back to the statement boundary), or invalidated the transaction (tr_Invalid_transaction = true so the eventual COMMIT becomes ABORT). AFTER triggers run after the heap mutation, on the next visit to tr_after_object. The locator’s force family knows nothing about triggers — cubrid-locator.md’s server side explicitly does heap, lock, btree, FK, log, and replication, but not triggers; this is the trigger_manager.h #error Does not belong to server module guard made manifest.

Every page change in steps 4–5 calls the WAL append API (log_append_undoredo_data, log_append_redo_data, log_append_undo_data). MVCC-flavored variants (LOG_MVCC_UNDOREDO_DATA, etc.) carry the writer’s MVCCID and a LOG_VACUUM_INFO whose prev_mvcc_op_log_lsa chains MVCC operations into a list the vacuum subsystem can walk without re-reading every record.

Each call funnels into prior_lsa_alloc_and_copy_data / _crumbs (log_append.cpp:273/:410), which mallocs a LOG_PRIOR_NODE outside any global mutex, optionally zlib-compresses payloads over log_Zip_min_size_to_compress, and returns the node. Then prior_lsa_next_record assigns the LSA from log_Gl.prior_info.prior_lsa under prior_lsa_mutex, links the node onto the tail, bumps list_size, unlocks. The mutex is held only across O(1) link manipulation and LSN arithmetic — the expensive compression and memcpy happen outside, so N producers build in parallel. See cubrid-prior-list.md §“Producer step 2”.

The drain runs separately. log_Flush_daemon (and any backpressure self-help) calls logpb_prior_lsa_append_all_list under LOG_CS_OWN_WRITE_MODE, which detaches the prior list under the mutex (swap head/tail/size to NULL/NULL/0), releases, then walks the detached list with logpb_append_next_record to copy each node’s bytes into the authoritative LOG_PAGE buffer. The disk write is a separate stage — see step 12.
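
The producer/drain split can be condensed into one runnable sketch, with names approximating log_append.cpp: the malloc and payload copy run unlocked, only the LSA assignment and tail-link sit under the mutex, and the drain detaches the whole list before walking it:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

typedef struct prior_node
{
  unsigned long lsa;
  char payload[64];
  struct prior_node *next;
} prior_node;

static struct
{
  pthread_mutex_t mtx;
  unsigned long next_lsa;
  prior_node *head, *tail;
  int list_size;
} prior = { PTHREAD_MUTEX_INITIALIZER, 0, NULL, NULL, 0 };

static unsigned long
prior_append (const char *data)
{
  prior_node *node = calloc (1, sizeof *node);             /* expensive... */
  strncpy (node->payload, data, sizeof node->payload - 1); /* ...unlocked  */

  pthread_mutex_lock (&prior.mtx);       /* O(1) work under the mutex */
  node->lsa = prior.next_lsa++;
  if (prior.tail) prior.tail->next = node; else prior.head = node;
  prior.tail = node;
  prior.list_size++;
  pthread_mutex_unlock (&prior.mtx);
  return node->lsa;
}

static int
prior_drain (void (*consume) (prior_node *))
{
  pthread_mutex_lock (&prior.mtx);       /* detach: swap list to empty */
  prior_node *walk = prior.head;
  int n = prior.list_size;
  prior.head = prior.tail = NULL;
  prior.list_size = 0;
  pthread_mutex_unlock (&prior.mtx);

  while (walk)                           /* copy-out stage, unlocked */
    {
      prior_node *next = walk->next;
      consume (walk);
      free (walk);
      walk = next;
    }
  return n;
}
```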

For a single INSERT into a class with two indexes, the prior list receives LOG_MVCC_UNDOREDO (from heap_insert_logical), two LOG_UNDOREDO_DATA (from btree_insert × 2), plus a LOG_DUMMY_OVF_RECORD if the row spilled to overflow. Each carries LOG_RECORD_HEADER { prev_tranlsa, back_lsa, forw_lsa, trid, type } — the triple-LSA layout ARIES needs: prev_tranlsa chains records of the same transaction so undo walks backward; back_lsa/forw_lsa chain records in physical log order so redo scans forward.
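
The prev_tranlsa chain in miniature, with a toy log addressed by integer LSAs and -1 as the null LSA; this is the walk undo performs without scanning other transactions' records:

```c
#include <assert.h>

typedef struct
{
  int prev_tranlsa;   /* previous record of the same transaction, or -1 */
  int trid;
} log_rec_sketch;

/* Count one transaction's records, starting from its latest LSA and
   following prev_tranlsa backward to the chain's end. */
static int
undo_chain_length (const log_rec_sketch *log, int last_lsa)
{
  int n = 0;
  for (int lsa = last_lsa; lsa != -1; lsa = log[lsa].prev_tranlsa)
    n++;
  return n;
}
```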

If the server is configured as an HA master, the same locator_*_force flow that emitted the WAL records also calls repl_log_insert (replication.c), which appends a LOG_REPL_RECORD to the per-transaction staging array tdes->repl_records[]. The staging entry is intentionally minimal: the repl_data payload is | packed_pkey_size | class_name | pkey_dbvalue | — class name plus primary-key value, no full row image. The slave will re-fetch the row from the master’s heap when it applies the event. This keeps the per-transaction staging cost bounded even for batch inserts touching millions of rows.

The actual LOG_REPLICATION_DATA log record is not appended at this point — the entries sit on tdes->repl_records[] until commit time. See cubrid-ha-replication.md §“Master side — LOG_REPL_RECORD and the staging array”. The CDC channel is populated separately: every DML emits a LOG_SUPPLEMENTAL_INFO record (record type 52) inline with the WAL via log_append_supplemental_*, carrying a richer self-describing payload (table OID, before/after image, transaction user) so external pull-style consumers can decode without consulting the catalog. See cubrid-cdc.md §“LOG_SUPPLEMENTAL_INFO — the modern event format” and cubrid-log-manager.md §“LOG_SUPPLEMENTAL_INFO is the channel CDC uses”.

Locks flow through the locator path. For INSERT, the row’s OID is decided during heap_insert_logical when the slot is allocated, so the X-lock is acquired inside the heap path rather than upstream — INSERT is one of the few ops that takes its row lock inside the heap primitive. The lock manager’s public entry is lock_object (lock_manager.c:5945), delegating to lock_internal_perform_lock_object for the hash → resource → compatibility-check sequence.

lock_object finds-or-inserts an LK_RES keyed by LK_RES_KEY{type=INSTANCE, oid, class_oid}, then either grants immediately (the resource is fresh), grants by adding to the holder list (the requested mode is compatible with total_holders_mode | total_waiters_mode), or splices the entry into the waiter list and suspends. The compatibility check is one O(1) matrix lookup against aggregated mode bits. CUBRID’s 12-mode vocabulary (NA … SCH-M) carries IX for the parent class (taken upstream in the executor) and X for the row OID. The new OID has no prior holders, so the LK_RES is fresh and granting is trivial.
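
The one-lookup check, sketched over a reduced mode set (the real matrix covers all 12 modes, NA through SCH-M):

```c
#include <assert.h>
#include <stdbool.h>

enum lk_mode { M_NULL, M_IX, M_S, M_X, M_COUNT };

/* Row = requested mode, column = aggregated mode of current holders. */
static const bool lock_compat[M_COUNT][M_COUNT] = {
  /* holders:    NULL   IX     S      X    */
  /* NULL */  { true,  true,  true,  true  },
  /* IX   */  { true,  true,  false, false },
  /* S    */  { true,  false, true,  false },
  /* X    */  { true,  false, false, false },
};

static bool
lock_grantable (enum lk_mode requested, enum lk_mode holders_aggregate)
{
  return lock_compat[requested][holders_aggregate];
}
```

Our INSERT's two acquisitions both hit the trivial cases: IX on the class is compatible with other writers' IX, and X on the fresh row OID meets an M_NULL aggregate.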

Index-key locks are not taken on the inline OID-list — CUBRID relies on MVCC + row-OID locking for non-SERIALIZABLE isolation. Under SERIALIZABLE, key-range locks are taken at scan boundaries. Under READ COMMITTED the instance lock is short-duration (released at statement end via lock_unlock_object_by_isolation); under REPEATABLE READ / SERIALIZABLE it is long-duration. See cubrid-lock-manager.md §“Lock acquisition flow”.

The client sends COMMIT. xtran_server_commit (transaction_sr.c:71) forwards to log_commit (log_manager.c:5352), which delegates to log_commit_local:

  1. Commit-side triggers. tr_check_commit_triggers runs any user-trigger TR_EVENT_COMMIT and drains the per-transaction tr_Deferred_activities queue. If a deferred action raises tr_Invalid_transaction, commit converts to abort (ER_TR_TRANSACTION_INVALIDATED).
  2. Drain postpones. If LOG_POSTPONE records were buffered, LOG_COMMIT_WITH_POSTPONE is appended, log_do_postpone replays them, state moves to TRAN_UNACTIVE_COMMITTED_WITH_POSTPONE.
  3. Atomic repl + commit emission. log_append_repl_info_and_commit_log takes prior_lsa_mutex once and appends every tdes->repl_records[] entry as LOG_REPLICATION_DATA (or _STATEMENT) records atomically with the commit record — no peer transaction’s commit can slip between them. See cubrid-ha-replication.md §“Atomic emission”.
  4. Append LOG_COMMIT. The commit record’s LSA (commit_lsa) is the transaction’s promise handle.
  5. Wait for durability. logpb_flush_pages(commit_lsa) parks the committer on gc_cond (under default async_commit=false, group_commit=true); the log-flush daemon ticks every log_get_log_group_commit_interval and broadcasts; the committer wakes when nxio_lsa >= commit_lsa. See cubrid-prior-list.md §“Commit waiters”.
  6. Transition + release. TDES state → TRAN_UNACTIVE_COMMITTED, logtb_complete_mvcc flips the bit in the active set, locks released (or retained if retain_lock), trantable index freed via logtb_release_tran_index. See cubrid-transaction.md.
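
Step 5's handshake is a textbook condition-variable watermark wait; a sketch with names approximating gc_cond and nxio_lsa (the timed-wait and group-commit interval are elided):

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t gc_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gc_cond = PTHREAD_COND_INITIALIZER;
static unsigned long nxio_lsa = 0;  /* lowest LSA not yet on stable storage */

/* The committer parks until the flush watermark reaches its commit LSA. */
static void
committer_wait_durable (unsigned long commit_lsa)
{
  pthread_mutex_lock (&gc_mtx);
  while (nxio_lsa < commit_lsa)     /* loop guards spurious wakeups */
    pthread_cond_wait (&gc_cond, &gc_mtx);
  pthread_mutex_unlock (&gc_mtx);
}

/* The flush daemon advances the watermark and wakes every waiter. */
static void
daemon_advance_watermark (unsigned long flushed_lsa)
{
  pthread_mutex_lock (&gc_mtx);
  if (flushed_lsa > nxio_lsa)
    nxio_lsa = flushed_lsa;
  pthread_cond_broadcast (&gc_cond);
  pthread_mutex_unlock (&gc_mtx);
}
```

Because the predicate is re-checked under the mutex, the ordering between "committer starts waiting" and "daemon broadcasts" does not matter: a late waiter sees the watermark already past its LSA and never sleeps.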

log_Flush_daemon (log_manager.c::log_flush_execute) puts the bytes on disk. Each tick (timer or on-demand wakeup):

// log_flush_execute — log_manager.c (condensed)
LOG_CS_ENTER (&thread_ref);
logpb_flush_pages_direct (&thread_ref);
// → logpb_prior_lsa_append_all_list (drain prior list → LOG_PAGE buffer)
// → logpb_flush_all_append_pages (write LOG_PAGE → active log + fsync)
LOG_CS_EXIT (&thread_ref);
pthread_cond_broadcast (&log_Gl.group_commit_info.gc_cond);

logpb_flush_all_append_pages walks the dirty LOG_PAGE list, issues fileio_write_pages on the active log volume, and advances log_append_info::nxio_lsa — the lowest LSA not yet on stable storage. The two-step flush of partial records (everything except the page where the most-recent record header lives, then the header page last) makes the write resilient to a crash mid-flush: the on-disk log always ends at either an old end-of-log marker or a new one, never a dangling forward pointer. See cubrid-log-manager.md §“Flush”.

After the daemon’s broadcast, every committer whose commit_lsa <= nxio_lsa wakes, observes the watermark, and returns the acknowledgment to the client. This is the moment the durability promise crystallizes — once log_commit returns TRAN_UNACTIVE_COMMITTED, the row is reachable through every index and constraint after any subsequent crash. What is not yet on disk: the heap page’s modified slot, the btree leaf’s new entry, the overflow chain. Only the WAL is durable; the data pages catch up later.

Step 13 — Eventually: dirty-page flush + DWB


The dirty heap and btree pages are flushed lazily by the page buffer’s three daemons (see cubrid-page-buffer-manager.md): Page Flush Daemon picks dirty BCBs and writes them at a rate adapted to the dirty ratio; Page Post-Flush Daemon post-processes flushed BCBs and hands them to direct-victim waiters; Page Maintenance Daemon adjusts per-private-LRU quotas every 100 ms.

Every dirty data-page write goes through the double-write buffer to defend against torn writes (cubrid-double-write-buffer.md). Producer-side: dwb_acquire_next_slot CAS-bumps the position counter to claim a slot in the in-memory DWB block, dwb_set_data_on_next_slot copies the page bytes in, the page is inserted into dwb_Global.slots_hashmap (so a concurrent reader finds it via dwb_read_page instead of re-reading a possibly-torn home), and when the block fills the dwb-flush-block daemon writes the block sequentially to the DWB volume, fsyncs, then writes each slot’s contents to its home volume.
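
The slot claim can be sketched with C11 atomics; a plain fetch-add stands in here for the CAS bump, and the real counter packing in double_write_buffer.cpp is more involved:

```c
#include <assert.h>
#include <stdatomic.h>

#define DWB_BLOCK_SLOTS 8   /* slots per in-memory DWB block (toy size) */

static atomic_uint dwb_position;

/* Each producer atomically claims the next position; block number and
   slot-within-block fall out of the division. The claimed slot is where
   the caller memcpys the page image. */
static unsigned
dwb_claim_slot (unsigned *block_no)
{
  unsigned pos = atomic_fetch_add (&dwb_position, 1);
  *block_no = pos / DWB_BLOCK_SLOTS;
  return pos % DWB_BLOCK_SLOTS;
}
```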

In lockstep with the WAL invariant: before any data page is written home, pgbuf_flush_check_log_lsa ensures nxio_lsa >= page->lsa, i.e. the page’s covering WAL records are already durable. Step 12’s commit force satisfies this for our INSERT’s pages, so their flush is unconditional from this point on.

The next checkpoint records that the flush has happened. logpb_checkpoint’s pgbuf_flush_checkpoint(newchkpt_lsa, ...) returns tmp_chkpt.redo_lsa = the smallest oldest_unflush_lsa remaining; this advances the recovery anchor. The checkpoint then walks the trantable, packs an active-transaction snapshot into LOG_REC_CHKPT, emits LOG_END_CHKPT, fsyncs, and updates log_Gl.hdr.chkpt_lsa in the active log header. See cubrid-checkpoint.md §“Top-level flow”. A crash after the checkpoint and before subsequent dirty-page flush replays only redo records below the new redo-LSA.
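
The redo-LSA computation is a minimum over still-dirty buffers; a sketch assuming 0 marks a clean buffer, with all names hypothetical:

```c
#include <assert.h>

/* The new recovery anchor is the smallest oldest_unflush_lsa remaining
   among dirty buffers; with nothing dirty it sits at the checkpoint's
   own LSA. Recovery redo starts scanning from this value. */
static unsigned long
checkpoint_redo_lsa (const unsigned long *oldest_unflush, int nbuffers,
                     unsigned long checkpoint_lsa)
{
  unsigned long redo = checkpoint_lsa;
  for (int i = 0; i < nbuffers; i++)
    if (oldest_unflush[i] != 0 && oldest_unflush[i] < redo)
      redo = oldest_unflush[i];
  return redo;
}
```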

Step 14 — MVCC vacuum eventually reclaims dead versions


For an INSERT specifically, vacuum has little to do — there is no prior version to reclaim. But to close the loop on the MVCC machinery: once the oldest active snapshot moves past our commit’s MVCCID the row is universally visible. The transaction’s LOG_MVCC_UNDOREDO record is nonetheless visible to vacuum’s per-block scanner. vacuum_consume_buffer_log_blocks sweeps the log forward, chunking it into blocks of 31 log pages (VACUUM_LOG_BLOCK_PAGES_DEFAULT); our record’s MVCCID feeds the block’s newest_mvccid. Once the global oldest_visible_mvccid exceeds newest_mvccid, the block becomes dispatchable. vacuum_master_task picks it up via its vacuum_job_cursor, CAS-transitions AVAILABLE → IN_PROGRESS, and hands it to a vacuum_worker from the pool of up to 50.
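
The block bookkeeping reduces to integer division plus one comparison; a sketch using the 31-page block size named above:

```c
#include <assert.h>
#include <stdbool.h>

#define VACUUM_LOG_BLOCK_PAGES 31

/* Which vacuum block does a log page belong to? */
static long
vacuum_block_of_page (long log_pageid)
{
  return log_pageid / VACUUM_LOG_BLOCK_PAGES;
}

/* A block is dispatchable once the globally oldest visible MVCCID has
   passed everything the block recorded. */
static bool
vacuum_block_dispatchable (unsigned long newest_mvccid,
                           unsigned long oldest_visible_mvccid)
{
  return oldest_visible_mvccid > newest_mvccid;
}
```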

The worker walks the block’s MVCC chain backward via LOG_VACUUM_INFO::prev_mvcc_op_log_lsa, decompresses the undo image into its per-thread log_zip_p, builds VACUUM_HEAP_OBJECT { vfid, oid } per candidate, transitions to EXECUTE, fixes the target page through its private LRU and removes dead versions. For our INSERT the work is “walk past, alive”; material work happens for later DELETEs/UPDATEs that touch this row. vacuum_is_file_dropped short-circuits if the class was dropped. On success the block transitions to VACUUMED; on interrupt, INTERRUPTED + AVAILABLE so the master re-dispatches. See cubrid-vacuum.md §“Worker”.

flowchart TB
  CLIENT["client: INSERT INTO t VALUES (...);<br/>then COMMIT"]

  CLIENT -->|"net protocol<br/>(cubrid-rpath-select.md steps 1-3)"| PARSE

  subgraph SERVER["cub_server worker thread"]
    direction TB
    PARSE["Parse<br/>(cubrid-parser.md)<br/>PT_INSERT node"]
    SEM["Semantic check<br/>(cubrid-semantic-check.md)<br/>name resolution + type unify"]
    DISP["do_statement → do_insert<br/>(cubrid-ddl-execution.md)"]
    LOC["locator_attribute_info_force<br/>switch (LC_FLUSH_INSERT)<br/>(cubrid-locator.md)"]
    INS["locator_insert_force"]
    HI["heap_insert_logical<br/>(cubrid-heap-manager.md)"]
    HSTAMP["MVCC stamp<br/>mvcc_ins_id<br/>(cubrid-mvcc.md)"]
    OVF["heap_ovf_insert → overflow_insert<br/>(cubrid-overflow-file.md)<br/>only if record &gt; page"]
    BTI["btree_insert × N indexes<br/>(cubrid-btree.md)<br/>latch-coupling, unique check"]
    CONS["locator_check_unique_btree_entries<br/>locator_check_foreign_key<br/>(cubrid-locator.md)"]
    LK["lock_object<br/>X on row OID<br/>(cubrid-lock-manager.md)"]
    REPL["repl_log_insert<br/>tdes->repl_records[]<br/>(cubrid-ha-replication.md)"]
    SUP["log_append_supplemental_*<br/>(cubrid-cdc.md)"]
  end

  CLIENT_TR["BEFORE/AFTER triggers<br/>(client side, cubrid-trigger.md)<br/>obt_apply_assignments"]
  CLIENT -. "if class has triggers" .-> CLIENT_TR
  CLIENT_TR -.-> DISP

  PARSE --> SEM --> DISP --> LOC --> INS
  INS --> HI --> HSTAMP
  INS --> BTI
  HI --> OVF
  HI -.OID assigned.-> LK
  INS --> CONS
  INS --> REPL
  INS --> SUP

  subgraph WAL["Per page change → WAL"]
    direction TB
    APP["log_append_undoredo_data<br/>log_append_redo_data<br/>(cubrid-log-manager.md)"]
    PRA["prior_lsa_alloc_and_copy_data<br/>malloc node, zlib outside mutex<br/>(cubrid-prior-list.md)"]
    PRN["prior_lsa_next_record<br/>assign LSN, link tail<br/>under prior_lsa_mutex"]
    PL["prior_list<br/>singly-linked queue"]
    APP --> PRA --> PRN --> PL
  end
  HI --> APP
  BTI --> APP
  OVF --> APP

  CLIENT -->|"second statement: COMMIT"| COMMIT
  COMMIT["log_commit_local<br/>(cubrid-transaction.md, cubrid-log-manager.md)"]
  COMMIT --> TRDC["tr_check_commit_triggers<br/>drain tr_Deferred_activities<br/>(cubrid-trigger.md)"]
  COMMIT --> POST["replay LOG_POSTPONE if any"]
  COMMIT --> RPC["log_append_repl_info_and_commit_log<br/>flush tdes->repl_records[]<br/>· append LOG_COMMIT<br/>under one prior_lsa_mutex hold"]
  RPC --> PRA
  COMMIT --> WAIT["logpb_flush_pages(commit_lsa)<br/>park on gc_cond timed-wait"]

  subgraph DAEMON["log_Flush_daemon"]
    DR["logpb_prior_lsa_append_all_list<br/>detach prior list under mutex<br/>copy nodes into LOG_PAGE buffer"]
    FLU["logpb_flush_all_append_pages<br/>fileio_write_pages → active log volume<br/>fsync; advance nxio_lsa"]
    BC["pthread_cond_broadcast(gc_cond)"]
    DR --> FLU --> BC
  end
  PL --> DR
  WAIT --> BC
  BC -->|"nxio_lsa &gt;= commit_lsa"| ACK["return TRAN_UNACTIVE_COMMITTED<br/>release locks<br/>logtb_release_tran_index"]
  ACK --> CLIENT_OK["COMMIT acknowledgment to client"]

  subgraph LATER["Eventually — page-buffer flush daemons"]
    direction TB
    PFD["Page Flush Daemon<br/>(cubrid-page-buffer-manager.md)"]
    PPF["Page Post-Flush Daemon"]
    PMD["Page Maintenance Daemon<br/>quota adjust every 100ms"]
    DWB["dwb_acquire_next_slot<br/>dwb_add_page<br/>(cubrid-double-write-buffer.md)<br/>sequential write + fsync"]
    HOME["fileio_write_pages → home volume"]
    PFD --> DWB --> HOME
    PPF -.-> DWB
  end
  HI -. "dirty heap pages" .-> PFD
  BTI -. "dirty btree pages" .-> PFD

  subgraph CHK["Periodic — log-checkpoint daemon"]
    CHKD["logpb_checkpoint<br/>(cubrid-checkpoint.md)<br/>LOG_START_CHKPT → pgbuf_flush_checkpoint<br/>→ LOG_END_CHKPT → log header fsync"]
  end
  HOME -. "advances redo-LSA" .-> CHKD

  subgraph VACUUM["Eventually-eventually — vacuum"]
    VC["vacuum_consume_buffer_log_blocks<br/>(cubrid-vacuum.md)<br/>chunk WAL into 31-page blocks"]
    VM["vacuum_master_task<br/>cursor over vacuum_Data<br/>dispatch IN_PROGRESS"]
    VW["vacuum_worker × ≤ 50<br/>walk MVCC chain backward<br/>fix page, remove dead version"]
    VC --> VM --> VW
  end
  PL -. "LOG_MVCC_* records" .-> VC
  VW -. "next visit" .-> HOME

Each arrow is annotated with the detail doc that owns its mechanism. Steps that share one thread of execution (parse → semantic check → dispatch → locator → heap → btree → WAL append) collapse into the upper region; the durability transition (commit wait → daemon flush → broadcast) is the boundary the client crosses. The bottom half — page-buffer flush, DWB, checkpoint, vacuum — happens outside the client’s commit acknowledgment.

The path above is the single-row, single-statement, single-server INSERT + COMMIT. Adjacent paths intentionally out of scope:

  • Bulk INSERT via loaddb: BU class-lock + bottom-up B+Tree build via xbtree_load_index. See cubrid-loaddb.md, cubrid-btree.md §“Bulk load”.
  • DELETE specifics — sets mvcc_del_id; vacuum reclaims later. See cubrid-heap-manager.md §“Delete flow”.
  • UPDATE specifics — read old + encode new + diff-driven index update filtered by att_id[]; may relocate or push to overflow. See cubrid-locator.md §“locator_update_force”, cubrid-heap-manager.md §“Update flow”.
  • Trigger internals — ECA model, action PT_NODE lazy compile, recursion counter + OID stack, deferred drain. See cubrid-trigger.md.
  • Deadlock detection: LK_WFG_EDGE on conflict; lock_detect_local_deadlock aborts the most-recently-blocked transaction. See cubrid-lock-manager.md §“Deadlock detection”.
  • Two-phase commit (cross-server XA): LOG_2PC_* records, LOG_TDES::coord/gtrinfo, separate state machine on top of TRAN_STATE. See cubrid-2pc.md.
  • Replication apply / CDC consumer. Slave-side applylogdb/la_apply_log_file and pull-style cdc_make_loginfo. See cubrid-ha-replication.md, cubrid-cdc.md.
  • Crash recovery. Three-pass ARIES (analysis/redo/undo) anchored on log_Gl.hdr.chkpt_lsa. See cubrid-recovery-manager.md, cubrid-checkpoint.md §“Recovery integration”.

CUBRID source (/data/hgryoo/references/cubrid/)

  • src/parser/csql_grammar.y, parse_tree.h: PT_INSERT.
  • src/query/execute_statement.c: do_statement switch, do_insert, do_execute_statement.
  • src/transaction/locator_sr.c: locator_attribute_info_force, locator_insert_force, locator_add_or_remove_index, locator_check_foreign_key.
  • src/storage/heap_file.c: heap_insert_logical, heap_ovf_insert, heap_set_mvcc_rec_header_on_overflow.
  • src/storage/btree.c, btree_load.c: btree_insert, btree_split_node, btree_find_oid_and_its_page, btree_start_overflow_page.
  • src/storage/overflow_file.c: overflow_insert, RVOVF_NEWPAGE_INSERT.
  • src/transaction/mvcc_table.cpp: mvcctable::get_new_mvccid, complete_mvcc.
  • src/object/trigger_manager.c: tr_prepare_class, tr_before_object, tr_after_object, tr_check_commit_triggers.
  • src/transaction/replication.c: repl_log_insert, repl_add_update_lsa.
  • src/transaction/log_manager.c, log_append.cpp, log_page_buffer.c: log_append_*, prior_lsa_alloc_and_copy_data, prior_lsa_next_record, logpb_prior_lsa_append_all_list, logpb_flush_all_append_pages, log_flush_execute, log_commit, log_append_repl_info_and_commit_log.
  • src/transaction/transaction_sr.c, log_tran_table.c: xtran_server_commit, logtb_release_tran_index, logtb_complete_mvcc.
  • src/transaction/lock_manager.c: lock_object, lock_internal_perform_lock_object, lock_detect_local_deadlock.
  • src/storage/page_buffer.c: pgbuf_flush_check_log_lsa, pgbuf_flush_victim_candidates, the three flush daemons.
  • src/storage/double_write_buffer.cpp: dwb_acquire_next_slot, dwb_add_page, dwb_flush_block.
  • src/query/vacuum.c: vacuum_consume_buffer_log_blocks, vacuum_master_task, vacuum_process_log_block.
  • cubrid-rpath-select.md — read path; steps 1–3 reused above.

Detail docs threaded through this synthesis


cubrid-parser.md, cubrid-semantic-check.md, cubrid-ddl-execution.md, cubrid-locator.md, cubrid-heap-manager.md, cubrid-overflow-file.md, cubrid-btree.md, cubrid-mvcc.md, cubrid-trigger.md, cubrid-lock-manager.md, cubrid-prior-list.md, cubrid-log-manager.md, cubrid-ha-replication.md, cubrid-cdc.md, cubrid-transaction.md, cubrid-page-buffer-manager.md, cubrid-double-write-buffer.md, cubrid-checkpoint.md, cubrid-vacuum.md.