
CUBRID Vacuum — Reclaiming Dead MVCC Versions Through Log Replay


A multi-version concurrency control (MVCC) system creates dead versions: every UPDATE writes a new version of a row and leaves the previous one in place; every DELETE marks a row deleted but does not physically reclaim it; every aborted INSERT leaves a tombstone. Left unchecked, the heap and the indexes accumulate these tombstones and old versions until storage and scans are dominated by garbage. The vacuum subsystem is the engine’s garbage collector for MVCC.

Database Internals (Petrov, ch. 5 §“MVCC”) frames the reclamation problem in a single sentence: a version is reclaimable when no in-flight or future transaction’s snapshot can see it. The mechanics depend on what the engine does next:

  • The engine knows the oldest visible MVCCID — every transaction’s snapshot bounds visibility, and the smallest of those bounds across all live transactions is the threshold.
  • A version with xmax < oldest_visible_mvccid, or an aborted-insertion tombstone with xmin < oldest_visible_mvccid, is dead and reclaimable.
  • The reclamation work happens on data pages (heap rows, B+Tree leaf entries, OID list compaction) but is driven by the log: every MVCC operation has emitted a LOG_MVCC_* record, and the vacuum subsystem walks those records to learn what to clean.
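The reclaimability rule in the bullets above can be written as a single predicate. This is a minimal sketch under stated assumptions: `version_info`, `MVCCID_NULL`, and `is_reclaimable` are hypothetical names chosen for illustration, and the flattened struct is not CUBRID's record header layout.

```cpp
#include <cstdint>

using MVCCID = std::uint64_t;

// Hypothetical marker: the version has not been deleted or superseded.
constexpr MVCCID MVCCID_NULL = 0;

// Hypothetical, simplified view of one row version (not CUBRID's header).
struct version_info
{
  MVCCID xmin;          // MVCCID of the inserting transaction
  MVCCID xmax;          // MVCCID of the deleting transaction, or MVCCID_NULL
  bool insert_aborted;  // true if the inserting transaction rolled back
};

// A version is reclaimable once no current or future snapshot can see it.
bool
is_reclaimable (const version_info &v, MVCCID oldest_visible_mvccid)
{
  if (v.insert_aborted)
    {
      // Aborted-insertion tombstone: compare the inserter's id.
      return v.xmin < oldest_visible_mvccid;
    }
  // Deleted or superseded version: compare the deleter's id.
  return v.xmax != MVCCID_NULL && v.xmax < oldest_visible_mvccid;
}
```

A version with `xmax` equal to the watermark is kept, because the comparison is strict.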

Two implementation choices the model leaves open shape every real engine and frame the rest of this document:

  1. What drives reclamation: data-side scan or log-side replay? PostgreSQL’s autovacuum scans heap files; InnoDB’s purge thread walks the undo log. CUBRID is in the second camp — vacuum walks the WAL itself, fixing target pages on demand. The trade-off: data-side scan visits each tuple once per pass; log-side replay visits each modification once. Log-side wins on workloads where most tuples are never updated, loses on long-running scans without modifications.
  2. What granularity of work? Per-tuple visit, per-page sweep, or per-block log replay? CUBRID picks per-block — the log is chunked into fixed-size vacuum blocks (default 31 log pages), and one block is one work unit dispatched to a worker.

After the choices are named, every CUBRID-specific structure in this document either implements one of them or makes the implementation faster.

Every MVCC engine ships some form of garbage collector, and the shapes converge on a small handful of patterns.

Reclamation cannot proceed past the smallest active-snapshot MVCCID. Every engine maintains this watermark, recomputed when transactions begin or end. PostgreSQL calls it OldestXmin (a horizon that the vacuum_defer_cleanup_age setting could hold back further in releases that still had it); InnoDB calls it purge_view; CUBRID calls it oldest_visible_mvccid, kept on the log header and inside each TDES’s mvccinfo.

A single master picks work, multiple workers do it. The master needs to be cheap so it can scan candidate work continuously; the workers need their own state because vacuuming pages requires page fixes, log reads, and undo-data buffers. PostgreSQL’s autovacuum launcher + workers, InnoDB’s coordinator + workers, CUBRID’s master + workers are all the same architecture.

When a table or index is dropped while old MVCC versions still reference it, vacuum must know not to follow the file ID into a freed extent. Every engine keeps a separate map from dropped-file ID to the MVCCID at which it was dropped; a vacuum job consults the map before chasing a record into a missing file. CUBRID’s vacuum_dropped_files_page is the structure.

A “block” or “batch” or “page” of log records is the unit a worker consumes. The size is a tuning knob: larger blocks amortise per-job overhead; smaller blocks parallelise better. PostgreSQL uses heap pages as the unit; InnoDB uses undo-log batches; CUBRID uses 31 log pages by default (VACUUM_LOG_BLOCK_PAGES_DEFAULT).

On a single target page, vacuum operations must be applied in LSN order to keep the page state consistent. Across pages, vacuum is embarrassingly parallel. The buffer manager’s page-fix is the natural synchronisation primitive — a single page can only have one writer at a time, so a worker that touches it serialises against any other worker that targets the same page.

| Theoretical concept | CUBRID name |
| --- | --- |
| Oldest-visible-MVCCID watermark | log_Gl.hdr.oldest_visible_mvccid (log_storage.hpp); per-TDES mvccinfo |
| Vacuum master | vacuum_master_task : public cubthread::entry_task (vacuum.c:813) |
| Vacuum worker | VACUUM_WORKER struct, max VACUUM_MAX_WORKER_COUNT = 50 |
| Worker state | VACUUM_WORKER_STATE { INACTIVE, PROCESS_LOG, EXECUTE } (vacuum.h) |
| Block of work | VACUUM_DATA_ENTRY { blockid, start_lsa, oldest_visible_mvccid, newest_mvccid } |
| Block size in log pages | VACUUM_LOG_BLOCK_PAGES_DEFAULT = 31 |
| Vacuum data file | vacuum_Data global with first_page/last_page cached |
| Per-page block list | VACUUM_DATA_PAGE { next_page, index_unvacuumed, index_free, data[] } |
| Block-status bit-pack | Top 3 bits of blockid: a 2-bit status (VACUUMED, IN_PROGRESS, AVAILABLE) plus the INTERRUPTED flag bit |
| Job cursor | vacuum_job_cursor class, which tracks progress across blockid relocations |
| Heap-side target list | VACUUM_HEAP_OBJECT { vfid, oid } array per worker |
| Dropped-file table | vacuum_dropped_files_page + vacuum_dropped_file records (vacuum.c:580) |
| Per-block log link in WAL | LOG_VACUUM_INFO::prev_mvcc_op_log_lsa (log_record.hpp) |
| Log block boundary on the log header | log_header::vacuum_last_blockid and does_block_need_vacuum |
| Per-record dispatch | Reuses RV_fun[] with vacuum-side undo / mvcc-undo paths |

The vacuum subsystem has four moving parts: vacuum data (the on-disk catalogue of work to do), the master task (the selector that picks the next block), the worker pool (the parallel executors that consume blocks), and the dropped-file table (the catalogue of files to skip). We walk them in that order.

flowchart LR
  subgraph LOG["WAL log (cubrid-log-manager.md)"]
    LR1["MVCC op record\nblock B-1"]
    LR2["MVCC op record\nblock B"]
    LR3["MVCC op record\nblock B+1"]
  end
  subgraph VD["vacuum_Data (vacuum data file)"]
    VP1["VACUUM_DATA_PAGE 1\nentries..."]
    VP2["VACUUM_DATA_PAGE 2\nentries..."]
    VPn["..."]
  end
  subgraph M["Master (vacuum_master_task)"]
    CUR["vacuum_job_cursor"]
    SEL["select next block\nbelow watermark"]
  end
  subgraph W["Worker pool (≤ 50 VACUUM_WORKER)"]
    W1["worker 1\nstate=PROCESS_LOG"]
    W2["worker 2\nstate=EXECUTE"]
    Wn["..."]
  end
  subgraph DF["Dropped files (vacuum_dropped_files_page)"]
    DFP["vfid → mvccid map"]
  end
  subgraph TGT["Target heap and B+Tree pages"]
    HP["heap page"]
    BT["btree leaf"]
  end
  LOG -->|consume_buffer_log_blocks| VD
  VD -->|cursor visit| M
  M -->|dispatch block| W
  W -->|read MVCC ops in block| LOG
  W -->|consult before chase| DF
  W -->|fix + clean| TGT

The figure encodes three loops. In the producer loop, the WAL emits MVCC operation records and vacuum_consume_buffer_log_blocks periodically translates them into vacuum-data entries. In the master loop, the master walks vacuum_Data looking for blocks below the oldest-visible watermark and dispatches them. In the worker loop, workers fetch a block, walk its log records, and clean target pages.

Vacuum data — the catalogue of pending work


Each entry is one block of log to vacuum.

// VACUUM_DATA_ENTRY — src/query/vacuum.c
struct vacuum_data_entry
{
  VACUUM_LOG_BLOCKID blockid;       /* blockid + flags packed in top bits */
  LOG_LSA start_lsa;                /* LSA of last MVCC op log record in block */
  MVCCID oldest_visible_mvccid;     /* threshold at the time the block was logged */
  MVCCID newest_mvccid;             /* newest MVCCID in this block */

  vacuum_data_entry () = default;
  vacuum_data_entry (const log_lsa &lsa, MVCCID oldest, MVCCID newest);
  vacuum_data_entry (const log_header &hdr);

  VACUUM_LOG_BLOCKID get_blockid () const;

  bool is_available () const;
  bool is_vacuumed () const;
  bool is_job_in_progress () const;
  bool was_interrupted () const;

  void set_vacuumed ();
  void set_job_in_progress ();
  void set_interrupted ();
};

The packing deserves a closer look. blockid is 64 bits wide. The top two bits carry the status (AVAILABLE, IN_PROGRESS, or VACUUMED, which leaves one of the four combinations unused), and the third-from-top bit carries the INTERRUPTED flag. Together these are the three bits covered by VACUUM_DATA_ENTRY_FLAG_MASK; the remaining 61 bits carry the actual block id. The macros encode this:

// Block status macros — src/query/vacuum.c
#define VACUUM_DATA_ENTRY_FLAG_MASK 0xE000000000000000
#define VACUUM_DATA_ENTRY_BLOCKID_MASK 0x1FFFFFFFFFFFFFFF
#define VACUUM_BLOCK_STATUS_VACUUMED 0x8000000000000000
#define VACUUM_BLOCK_STATUS_IN_PROGRESS_VACUUM 0x4000000000000000
#define VACUUM_BLOCK_STATUS_AVAILABLE 0x0000000000000000
#define VACUUM_BLOCK_FLAG_INTERRUPTED 0x2000000000000000

The bit-pack avoids separate per-entry int status and bool interrupted fields, which would otherwise add 8 bytes per entry once padded. Across millions of entries the saving is real.
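Reading the mask values above as code makes the layout concrete. The sketch below copies the constants from the macros but uses shortened, hypothetical names; `STATUS_MASK` and the helper functions are assumptions introduced for illustration and are not quoted from vacuum.c.

```cpp
#include <cstdint>

// Values copied from the macros above; the names are shortened for the sketch.
constexpr std::uint64_t FLAG_MASK = 0xE000000000000000ULL;
constexpr std::uint64_t BLOCKID_MASK = 0x1FFFFFFFFFFFFFFFULL;
constexpr std::uint64_t STATUS_VACUUMED = 0x8000000000000000ULL;
constexpr std::uint64_t STATUS_IN_PROGRESS = 0x4000000000000000ULL;
constexpr std::uint64_t STATUS_AVAILABLE = 0x0000000000000000ULL;
constexpr std::uint64_t FLAG_INTERRUPTED = 0x2000000000000000ULL;

// Assumption: the status occupies the two highest bits (derived from the
// three status values above, not taken from the source).
constexpr std::uint64_t STATUS_MASK = 0xC000000000000000ULL;

// Hypothetical helpers operating on the packed 64-bit field.
constexpr std::uint64_t get_blockid (std::uint64_t packed)
{
  return packed & BLOCKID_MASK;
}

constexpr std::uint64_t get_status (std::uint64_t packed)
{
  return packed & STATUS_MASK;
}

constexpr bool was_interrupted (std::uint64_t packed)
{
  return (packed & FLAG_INTERRUPTED) != 0;
}

// Replace the status bits, keeping the block id and the INTERRUPTED flag.
constexpr std::uint64_t set_status (std::uint64_t packed, std::uint64_t status)
{
  return (packed & ~STATUS_MASK) | status;
}

constexpr std::uint64_t set_interrupted (std::uint64_t packed)
{
  return packed | FLAG_INTERRUPTED;
}
```

Because the status and the flag live in disjoint bits, changing one never disturbs the other or the block id underneath.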

The blocks live in pages of the vacuum data file:

// VACUUM_DATA_PAGE — src/query/vacuum.c
struct vacuum_data_page
{
  VPID next_page;               /* Linked list of pages */
  INT16 index_unvacuumed;       /* First not-yet-vacuumed index in data[] */
  INT16 index_free;             /* First free index in data[] */
  VACUUM_DATA_ENTRY data[1];    /* Variable-size array */

  bool is_empty () const;
  bool is_index_valid (INT16 index) const;
  INT16 get_index_of_blockid (VACUUM_LOG_BLOCKID blockid) const;
  VACUUM_LOG_BLOCKID get_first_blockid () const;
};

The index_unvacuumed / index_free cursors mean that vacuumed entries at the head of a page can be cleared without reshuffling the live entries; growth happens at the tail. When index_unvacuumed catches up with index_free, the page is empty and unlinked.
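The two-cursor scheme can be illustrated with a small in-memory model. This is a sketch under assumptions, not the on-disk page: `data_page_sketch` and its methods are hypothetical, and a per-slot boolean stands in for the status bits packed into each entry.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical in-memory stand-in for one vacuum data page.
struct data_page_sketch
{
  std::int16_t index_unvacuumed = 0;  // first entry still awaiting vacuum
  std::int16_t index_free = 0;        // first unused slot
  std::vector<bool> vacuumed;         // stand-in for each entry's status bits

  explicit data_page_sketch (std::int16_t capacity)
    : vacuumed (capacity, false)
  {
  }

  bool is_empty () const
  {
    return index_unvacuumed == index_free;
  }

  // Append a new entry at the tail; returns false when the page is full.
  bool append ()
  {
    if (index_free >= static_cast<std::int16_t> (vacuumed.size ()))
      {
        return false;
      }
    vacuumed[index_free++] = false;
    return true;
  }

  // Mark one slot as done, then advance the head cursor past every
  // leading slot that is already vacuumed. Live entries are never moved.
  void mark_vacuumed (std::int16_t index)
  {
    vacuumed[index] = true;
    while (index_unvacuumed < index_free && vacuumed[index_unvacuumed])
      {
        ++index_unvacuumed;
      }
  }
};
```

Blocks may finish out of order; the head cursor only moves once the slots in front of it are all done, which is why a page becomes empty exactly when the two cursors meet.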

The vacuum_Data global keeps the first and last pages of this list permanently fixed in the buffer pool — vacuum reads them on every cycle, so the per-fix overhead would be prohibitive. The vacuum_fix_data_page macro short-circuits to the cached pages when the requested VPID matches.

Vacuum data construction — log → blocks


vacuum_consume_buffer_log_blocks (vacuum.c:5096) is the bridge from log to vacuum data. It runs whenever the log has accumulated unprocessed MVCC operations:

  1. Read the log forward from the last consumed LSA. (Note the direction contrast: this construction phase scans the WAL forward; the per-block worker walk in §“Worker” walks the block’s MVCC chain backward via prev_mvcc_op_log_lsa.)
  2. For each LOG_MVCC_* record encountered, find or create the block this record falls into (block id = pageid / vacuum_Data.log_block_npages — the runtime field initialised from VACUUM_LOG_BLOCK_PAGES_DEFAULT but overridable via PRM_ID_VACUUM_LOG_BLOCK_PAGES, so the divisor is the live value, not the macro).
  3. Update the block’s start_lsa (last MVCC op seen), newest_mvccid, and oldest_visible_mvccid (the watermark captured at the time the record was logged — not now).
  4. When a block is filled (no more MVCC ops will fall into it because the log has moved past it), write its entry to vacuum data with status AVAILABLE.

The captured oldest_visible_mvccid is the watermark from the time of logging, not the time of consumption. This matters: a block can be dispatched as soon as the current watermark exceeds the block’s newest_mvccid, regardless of how the watermark has moved since the block was logged.
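The four construction steps amount to a fold over the MVCC records in forward log order. The sketch below is illustrative only: the record and entry structs are hypothetical flattenings, an LSA is split into a plain page id and offset, and keeping the first watermark seen in a block is an assumption about how the per-block value is chosen.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

using MVCCID = std::uint64_t;

// Hypothetical, flattened view of one MVCC log record.
struct mvcc_log_record
{
  std::int64_t pageid;           // log page holding the record
  std::int64_t offset;           // offset inside that page
  MVCCID mvccid;                 // MVCCID of the operation
  MVCCID oldest_visible_at_log;  // watermark when the record was logged
};

// Hypothetical stand-in for a vacuum data entry.
struct block_entry_sketch
{
  std::int64_t start_pageid = -1;  // position of the last MVCC op in the block
  std::int64_t start_offset = -1;
  MVCCID oldest_visible_mvccid = 0;
  MVCCID newest_mvccid = 0;
};

// Fold records (given in forward log order) into one entry per block.
std::map<std::int64_t, block_entry_sketch>
build_blocks (const std::vector<mvcc_log_record> &records,
              std::int64_t log_block_npages)
{
  std::map<std::int64_t, block_entry_sketch> blocks;
  for (const mvcc_log_record &rec : records)
    {
      const std::int64_t blockid = rec.pageid / log_block_npages;
      auto found = blocks.find (blockid);
      if (found == blocks.end ())
        {
          block_entry_sketch fresh;
          // Assumption: the block keeps the watermark of its first record.
          fresh.oldest_visible_mvccid = rec.oldest_visible_at_log;
          fresh.newest_mvccid = rec.mvccid;
          found = blocks.emplace (blockid, fresh).first;
        }
      block_entry_sketch &entry = found->second;
      // The latest record seen becomes the start of the backward walk.
      entry.start_pageid = rec.pageid;
      entry.start_offset = rec.offset;
      entry.newest_mvccid = std::max (entry.newest_mvccid, rec.mvccid);
    }
  return blocks;
}
```

With the default of 31 pages per block, log pages 0 to 30 fall into block 0, pages 31 to 61 into block 1, and so on.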

Master task — picking blocks, dispatching jobs


The master is a cubthread::entry_task subclass:

// vacuum_master_task — src/query/vacuum.c:813
class vacuum_master_task : public cubthread::entry_task
{
  public:
    void execute (cubthread::entry &thread_ref) override;

  private:
    bool check_shutdown () const;
    bool is_task_queue_full () const;
    bool should_interrupt_iteration () const;
    bool is_cursor_entry_ready_to_vacuum () const;
    bool is_cursor_entry_available () const;
    void start_job_on_cursor_entry ();
    bool should_force_data_update () const;
    void increase_outstanding_job ();
    void decrease_outstanding_job (int count);

    vacuum_job_cursor m_cursor;   /* Where in vacuum data we are */
    // ... condensed ...
};

vacuum_master_task::execute is the master loop. Each tick:

  1. Check shutdown / queue-full / interrupt conditions.
  2. Advance the cursor to the next AVAILABLE entry whose newest_mvccid is less than the current oldest-visible watermark.
  3. Atomically transition the entry to IN_PROGRESS.
  4. Increment the outstanding-job counter.
  5. Dispatch the entry to a worker via the thread pool.

The cursor is a separate class (vacuum_job_cursor, vacuum.c:277) because vacuum data pages can be added or removed between ticks (a fully-vacuumed page is freed; new blocks always land on the last page). The cursor’s readjust_to_vacuum_data_changes relocates the cursor’s blockid → page mapping after such changes.
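The selection and status-transition steps of the tick can be sketched as a single function. Everything here is an assumption made for illustration: the entries live in a flat vector rather than in paged vacuum data, the cursor is a plain index, the transition uses `std::atomic::compare_exchange_strong`, and the loop stops at the first block that is not yet below the watermark on the assumption that later blocks are newer. None of the names come from vacuum.c.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

using MVCCID = std::uint64_t;

// Assumption: status occupies the two highest bits of the packed blockid.
constexpr std::uint64_t STATUS_MASK = 0xC000000000000000ULL;
constexpr std::uint64_t STATUS_IN_PROGRESS = 0x4000000000000000ULL;
constexpr std::uint64_t STATUS_AVAILABLE = 0x0000000000000000ULL;

// Hypothetical stand-in for one vacuum data entry.
struct entry_sketch
{
  std::atomic<std::uint64_t> blockid;  // block id plus packed status bits
  MVCCID newest_mvccid;
};

// Returns the index of the entry that was moved to IN_PROGRESS, or -1 if
// nothing is ready. The cursor ends just past the dispatched entry.
int
dispatch_next (std::vector<entry_sketch> &entries, std::size_t &cursor,
               MVCCID oldest_visible_mvccid)
{
  for (; cursor < entries.size (); ++cursor)
    {
      entry_sketch &e = entries[cursor];
      std::uint64_t seen = e.blockid.load ();
      if ((seen & STATUS_MASK) != STATUS_AVAILABLE)
        {
          continue;  // already vacuumed or being vacuumed
        }
      if (e.newest_mvccid >= oldest_visible_mvccid)
        {
          return -1;  // a live snapshot may still need this block
        }
      const std::uint64_t wanted = (seen & ~STATUS_MASK) | STATUS_IN_PROGRESS;
      if (e.blockid.compare_exchange_strong (seen, wanted))
        {
          return static_cast<int> (cursor++);
        }
      // The status changed underneath us; move on to the next entry.
    }
  return -1;
}
```

A real master would also increment the outstanding-job counter and hand the block to the thread pool after a successful transition; the sketch stops at the status change.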

Worker — per-block log replay and target cleanup


vacuum_process_log_block (vacuum.c:3251) is the worker entry point. Given a block, it:

  1. Sets the worker’s state to PROCESS_LOG.
  2. Walks the block’s log records backward via LOG_VACUUM_INFO::prev_mvcc_op_log_lsa chains. The chain exists exactly because the log manager is courteous to the vacuum subsystem: every MVCC record carries a back-pointer to the previous MVCC record (cubrid-log-manager.md §“MVCC-flavoured records”).
  3. For each record, decompresses the undo image (using the worker’s per-thread log_zip_p).
  4. Builds a VACUUM_HEAP_OBJECT (vfid + oid) for each candidate to clean.
  5. Switches to state EXECUTE. Fixes target pages, removes dead versions, compacts B+Tree OID lists.
  6. On success, sets the block’s status to VACUUMED. On failure (interrupted, page latch contention, error), sets INTERRUPTED so the master will re-dispatch.
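Steps 2 and 4 above, the backward walk and the target list, can be sketched as follows. The model is deliberately simplified and every name is hypothetical: an LSA is a single integer, the log is a map keyed by LSA, and the undo-image decompression of step 3 is omitted.

```cpp
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical simplified LSA: one integer position in the log.
using lsa_t = std::int64_t;
constexpr lsa_t NULL_LSA = -1;

// Hypothetical MVCC log record carrying the back-pointer described above.
struct mvcc_record_sketch
{
  lsa_t prev_mvcc_op_log_lsa;  // previous MVCC record, or NULL_LSA
  std::int32_t file_id;        // stand-in for the vfid
  std::int64_t object_id;      // stand-in for the oid
};

// Hypothetical stand-in for VACUUM_HEAP_OBJECT.
struct heap_object_sketch
{
  std::int32_t file_id;
  std::int64_t object_id;
};

// Walk the chain backward from the block's start_lsa and collect one target
// per record, stopping once the chain leaves the block.
std::vector<heap_object_sketch>
collect_block_targets (const std::map<lsa_t, mvcc_record_sketch> &log,
                       lsa_t start_lsa, lsa_t block_first_lsa)
{
  std::vector<heap_object_sketch> targets;
  lsa_t lsa = start_lsa;
  while (lsa != NULL_LSA && lsa >= block_first_lsa)
    {
      auto it = log.find (lsa);
      if (it == log.end ())
        {
          break;  // broken chain: stop rather than guess
        }
      targets.push_back ({it->second.file_id, it->second.object_id});
      lsa = it->second.prev_mvcc_op_log_lsa;
    }
  return targets;
}
```

Only MVCC records are visited; every other log record inside the block is skipped by following the back-pointer.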

The worker maintains buffers reused across jobs to avoid allocation:

// VACUUM_WORKER — src/query/vacuum.h
struct vacuum_worker
{
  VACUUM_WORKER_STATE state;          /* INACTIVE / PROCESS_LOG / EXECUTE */
  INT32 drop_files_version;           /* Last seen dropped-files version */
  struct log_zip *log_zip_p;          /* Decompression context */
  VACUUM_HEAP_OBJECT *heap_objects;   /* Targets to clean this job */
  int heap_objects_capacity;
  int n_heap_objects;
  char *undo_data_buffer;
  int undo_data_buffer_capacity;
  int private_lru_index;              /* Per-worker LRU list in page buffer */
  char *prefetch_log_buffer;          /* Prefetched log pages */
  LOG_PAGEID prefetch_first_pageid;
  LOG_PAGEID prefetch_last_pageid;
  bool allocated_resources;
  int idx;                            /* -1 for master; sequence for workers */
};

The private LRU index is worth noting. CUBRID’s buffer manager supports per-thread LRU lists (cubrid-page-buffer-manager.md §“Quota and private lists”); vacuum workers each get their own list so a vacuum scan doesn’t pollute the global hot list. The prefetch buffer is a stash of upcoming log pages so the worker can chain through MVCC records without per-page fault latency.

When a class is dropped while old MVCC versions still reference its file, the vacuum worker must not follow the file id into freed storage. The vacuum_dropped_file table maps vfid to the MVCCID at which the file was dropped:

// vacuum_dropped_file — src/query/vacuum.c:580
struct vacuum_dropped_file
{
  VFID vfid;
  MVCCID mvccid;
};

struct vacuum_dropped_files_page
{
  VPID next_page;
  INT16 n_dropped_files;
  vacuum_dropped_file dropped_files[1];   /* variable-size */
};

A worker calls vacuum_is_file_dropped (vacuum.c:6587) before chasing a vfid into a heap; if the answer is yes and the version predates the drop MVCCID, the version is implicitly dead and the worker skips it.
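The check described above reduces to a lookup plus one comparison. The sketch assumes an in-memory map instead of the paged table, an integer in place of a VFID, and a plain `<` for the "predates" test; the function name mirrors the text but the signature is hypothetical and is not the one in vacuum.c.

```cpp
#include <cstdint>
#include <unordered_map>

using MVCCID = std::uint64_t;
using file_id_t = std::int32_t;  // hypothetical stand-in for VFID

// Hypothetical in-memory form of the table: file id -> drop MVCCID.
using dropped_files_map = std::unordered_map<file_id_t, MVCCID>;

// True when a version written by `version_mvccid` into file `fid` can be
// skipped because the file was dropped after that version was created.
bool
is_file_dropped (const dropped_files_map &dropped, file_id_t fid,
                 MVCCID version_mvccid)
{
  auto it = dropped.find (fid);
  if (it == dropped.end ())
    {
      return false;  // file was never dropped
    }
  return version_mvccid < it->second;
}
```

A version whose MVCCID is not older than the drop MVCCID is not skipped by this test.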

The dropped-files page list is updated by vacuum_log_add_dropped_file (vacuum.c:6121). The selector VACUUM_LOG_ADD_DROPPED_FILE_POSTPONE vs. VACUUM_LOG_ADD_DROPPED_FILE_UNDO (these are plain bool values passed as the pospone_or_undo argument, not OR-able flag bits) distinguishes between “this file was dropped at commit; vacuum it on commit-side postpone replay” and “this file was created and then aborted; vacuum it on undo replay”.

vacuum_data_load_and_recover (vacuum.c:4183) is the post-restart entry point. After log_recovery finishes the three ARIES passes, this function:

  1. Reloads vacuum_Data from its on-disk pages.
  2. Walks any blocks that were IN_PROGRESS at crash and resets them to AVAILABLE (with INTERRUPTED flag set), so the master picks them up again.
  3. Calls vacuum_recover_lost_block_data (vacuum.c:5465) to patch any blocks that were in flight in the WAL but not yet recorded in vacuum data — this can happen if the crash occurred between log emission of an MVCC record and the next vacuum_consume_buffer_log_blocks tick.
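Step 2 of the list above, resetting blocks that were mid-job at the crash, can be sketched on the packed blockid values alone. The constants repeat the status macros shown earlier; `STATUS_MASK` and `reset_in_progress_blocks` are hypothetical names, and the flat vector stands in for the paged vacuum data.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: status occupies the two highest bits of the packed blockid.
constexpr std::uint64_t STATUS_MASK = 0xC000000000000000ULL;
constexpr std::uint64_t STATUS_IN_PROGRESS = 0x4000000000000000ULL;
constexpr std::uint64_t STATUS_AVAILABLE = 0x0000000000000000ULL;
constexpr std::uint64_t FLAG_INTERRUPTED = 0x2000000000000000ULL;

// Reset every IN_PROGRESS entry to AVAILABLE and mark it INTERRUPTED so the
// master dispatches it again. Returns how many entries were reset.
std::size_t
reset_in_progress_blocks (std::vector<std::uint64_t> &packed_blockids)
{
  std::size_t reset_count = 0;
  for (std::uint64_t &packed : packed_blockids)
    {
      if ((packed & STATUS_MASK) == STATUS_IN_PROGRESS)
        {
          packed = (packed & ~STATUS_MASK) | STATUS_AVAILABLE | FLAG_INTERRUPTED;
          ++reset_count;
        }
    }
  return reset_count;
}
```

Entries that were already VACUUMED or AVAILABLE are left untouched, so the pass is safe to repeat.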
sequenceDiagram
  participant LM as log_manager
  participant CB as vacuum_consume_buffer_log_blocks
  participant VD as vacuum_Data file
  participant M  as vacuum_master_task
  participant W  as vacuum_worker
  participant PG as page buffer / heap / btree

  LM->>LM: append LOG_MVCC_* record
  Note over LM: every record's prev_mvcc_op_log_lsa\nlinks to previous MVCC record
  CB->>LM: read since last consumed LSA
  CB->>VD: append AVAILABLE entry for filled block
  loop master tick
    M->>VD: cursor next AVAILABLE entry below watermark
    M->>VD: CAS status → IN_PROGRESS
    M->>W: dispatch (block)
    W->>LM: walk MVCC chain in block
    W->>W: build VACUUM_HEAP_OBJECT list
    W->>PG: fix + remove dead versions / compact OID lists
    alt success
      W->>VD: CAS status → VACUUMED
    else interrupt / error
      W->>VD: CAS flag INTERRUPTED, status → AVAILABLE
    end
  end

Anchor on symbol names, not line numbers.

  • vacuum_worker (vacuum.h) — per-worker bookkeeping.
  • vacuum_worker_state enum (vacuum.h) — INACTIVE / PROCESS_LOG / EXECUTE.
  • VACUUM_HEAP_OBJECT (vacuum.h) — heap-side target.
  • VACUUM_LOG_BLOCK_PAGES_DEFAULT (vacuum.h) — block size.
  • vacuum_data_entry (vacuum.c) — one block of pending work.
  • vacuum_data_page (vacuum.c) — page of entries.
  • vacuum_data (vacuum.c) — global state.
  • vacuum_dropped_file / vacuum_dropped_files_page (vacuum.c) — skip list.
  • vacuum_master_task (vacuum.c) — master loop class.
  • vacuum_job_cursor (vacuum.c) — relocation-tolerant cursor.
  • vacuum_initialize (vacuum.c) — boot-time init.
  • vacuum_finalize (vacuum.c) — shutdown.
  • vacuum_data_load_and_recover (vacuum.c) — post-recovery reload.
  • vacuum_recover_lost_block_data (vacuum.c) — patch in-flight blocks the consumer didn’t see before crash.
  • vacuum_consume_buffer_log_blocks (vacuum.c) — log → vacuum data.
  • vacuum_master_task::execute (vacuum.c) — master tick.
  • vacuum_master_task::start_job_on_cursor_entry (vacuum.c) — CAS to IN_PROGRESS + dispatch.
  • vacuum_process_log_block (vacuum.c) — worker entry.
  • vacuum_worker_allocate_resources (vacuum.c) — first-touch allocation of log_zip_p, buffers.
  • vacuum_log_add_dropped_file (vacuum.c) — register.
  • vacuum_is_file_dropped (vacuum.c) — query.

Worker-state inline accessors (in vacuum.h)

  • vacuum_get_vacuum_worker, vacuum_is_thread_vacuum, vacuum_is_thread_vacuum_worker, vacuum_is_thread_vacuum_master, vacuum_get_worker_state, vacuum_set_worker_state, vacuum_worker_state_is_*. All __attribute__ ((ALWAYS_INLINE)) because they sit on the worker hot path.
| Symbol | File | Line |
| --- | --- | --- |
| VACUUM_WORKER (struct) | vacuum.h | 106 |
| VACUUM_WORKER_STATE enum | vacuum.h | 85 |
| VACUUM_LOG_BLOCK_PAGES_DEFAULT | vacuum.h | 82 |
| VACUUM_MAX_WORKER_COUNT | vacuum.h | 132 |
| vacuum_data_entry (struct) | vacuum.c | 104 |
| vacuum_data_page (struct) | vacuum.c | 194 |
| vacuum_data (struct) | vacuum.c | 350 |
| vacuum_dropped_file (struct) | vacuum.c | 580 |
| vacuum_dropped_files_page (struct) | vacuum.c | 588 |
| vacuum_job_cursor (class) | vacuum.c | 277 |
| vacuum_master_task (class) | vacuum.c | 813 |
| vacuum_master_task::execute | vacuum.c | 3002 |
| vacuum_initialize | vacuum.c | 1180 |
| vacuum_finalize | vacuum.c | 1416 |
| vacuum_process_log_block | vacuum.c | 3251 |
| vacuum_worker_allocate_resources | vacuum.c | 3620 |
| vacuum_finalize_worker | vacuum.c | 3689 |
| vacuum_data_load_and_recover | vacuum.c | 4183 |
| vacuum_consume_buffer_log_blocks | vacuum.c | 5096 |
| vacuum_recover_lost_block_data | vacuum.c | 5465 |
| vacuum_log_add_dropped_file | vacuum.c | 6121 |
| vacuum_is_file_dropped | vacuum.c | 6587 |
  • Vacuum sources live under src/query/, not src/transaction/. Verified by find: src/query/vacuum.{c,h} exist and there is no src/transaction/vacuum.*. Implication: the references: entries in the meta and frontmatter of this doc were corrected at draft time (the original skeleton placed them under src/transaction/ by analogy with cubrid-mvcc).

  • Block size is 31 log pages by default, encoded as VACUUM_LOG_BLOCK_PAGES_DEFAULT. Verified at vacuum.h:82. The corresponding runtime parameter is prm_get_integer_value (PRM_ID_VACUUM_LOG_BLOCK_PAGES) (covered in the vacuum_initialize body); 31 is the default but it can be overridden at boot.

  • Worker pool is capped at 50. Verified by VACUUM_MAX_WORKER_COUNT = 50 (vacuum.h:132). The actual count is configurable via a server parameter; the macro is the hard upper bound.

  • Block status is bit-packed into the top bits of the 64-bit blockid: the top two bits hold the status and the third bit holds the INTERRUPTED flag. Verified at vacuum.c:135-186. Available status is 0x0000000000000000 (top bits zero); vacuumed is 0x8000000000000000; in-progress is 0x4000000000000000; the INTERRUPTED flag is 0x2000000000000000. The BLOCKID_MASK 0x1FFFFFFFFFFFFFFF extracts the actual id: 61 bits, about 2.3 × 10^18 blocks before exhaustion.

  • vacuum_Data.first_page and vacuum_Data.last_page are permanently fixed in the buffer pool. Verified at vacuum.c:223 (vacuum_fix_data_page macro short-circuits to the cached pages). Implication: page-buffer eviction never touches them, so master ticks pay no fix-overhead.

  • Workers maintain a private LRU list in the page buffer. Verified at vacuum.h:122 (VACUUM_WORKER::private_lru_index). This prevents vacuum scans from polluting the global hot list. (Cross-doc: cubrid-page-buffer-manager.md describes the per-thread LRU mechanism.)

  • Workers prefetch upcoming log pages. Verified at vacuum.h:124-126 (prefetch_log_buffer, prefetch_first_pageid, prefetch_last_pageid). The buffer is per-worker, sized at first allocation in vacuum_worker_allocate_resources.

  • The MVCC log chain is what drives backward walking inside a block. Verified by reading vacuum_process_log_block and the LOG_VACUUM_INFO::prev_mvcc_op_log_lsa field (cubrid-log-manager.md §“MVCC-flavoured records”). The chain is the only reason vacuum is faster than a full forward log walk.

  • The dropped-files table is paged like vacuum data, not inline in vacuum data. Verified at vacuum.c:588 (vacuum_dropped_files_page). Implication: dropped files survive vacuum data page churn, and vacuum cleanup can run even after vacuum data has been compacted.

  • Worker recovery on crash: in-progress blocks are reset to AVAILABLE with INTERRUPTED. Verified by reading vacuum_data_load_and_recover and the was_interrupted / set_interrupted accessors on vacuum_data_entry. The flag signals to the master that this block was already partially done, so the worker can skip records the previous attempt marked as cleaned (target pages already have advanced LSA).

  1. Master tick interval and adaptive throttling. The master loop’s wake interval and any backpressure mechanism (slow down when the system is busy) were not located. Investigation path: read the cubthread daemon registration of vacuum_master_task plus the should_interrupt_iteration / is_task_queue_full methods.

  2. Watermark advancement triggers. When does log_Gl.hdr.oldest_visible_mvccid get recomputed? Per transaction commit / abort? Periodically? Both? Investigation path: grep for writes to oldest_visible_mvccid; cross-ref with logtb_complete_mvcc (cubrid-transaction.md).

  3. Heap-vs-btree dispatch. A VACUUM_HEAP_OBJECT is just (vfid, oid). How does the worker decide that a vfid points to a heap rather than a B+Tree, and does it call the right per-subsystem cleanup function? Investigation path: read the body of the vacuum-execute path (around the worker’s state transition to EXECUTE).

  4. Interaction with online schema changes. If a B+Tree is dropped while vacuum is mid-job on a block that touches it, the dropped-files table prevents chasing the file — but what about the in-flight VACUUM_HEAP_OBJECT list? Are entries filtered against dropped-files, or does the worker have to handle ER_FILE_DROPPED at the page-fix layer? Investigation path: look for vacuum_is_file_dropped callers inside the execute path.

  5. Page-buffer private LRU semantics under contention. If a worker’s private LRU is full and another worker needs the same page, what happens? Hand off, or share via global LRU? Investigation path: cubrid-page-buffer-manager.md and pgbuf_*_private_lru_* paths.

  6. Recovery of vacuum_recover_lost_block_data. What exactly are “lost blocks” — blocks where the vacuum consumer was running at crash? Or blocks where the log emitted MVCC records but the consumer never ran? The function name suggests both. Investigation path: read its body and look for the ranges it patches over.

Beyond CUBRID — Comparative Designs & Research Frontiers


Pointers, not analysis. Each bullet is a starting handle for a follow-up doc.

  • PostgreSQL VACUUM — heap and index passes scan data pages, not the WAL. The dead-tuple bitmap is computed per-relation and consulted during cleanup. Cost: full scan of the relation; benefit: no log-side dependency. CUBRID’s log-driven design is closer to InnoDB’s.

  • InnoDB purge thread — walks the undo log (rollback segments) backward, removing dead versions when MVCCID watermark permits. CUBRID’s log-driven walk is structurally similar but uses redo log, not undo log, because CUBRID logs MVCC undo inside the same LOG_MVCC_* records.

  • Hekaton garbage collection (Larson et al., VLDB 2011) — epoch-based, lock-free, runs on a per-thread basis after the oldest active transaction’s epoch has retired. CUBRID’s master/worker model is the disk-resident analogue of the same idea.

  • Aurora’s MVCC at storage layer — versions are reclaimed by the storage engine, not the compute node, eliminating vacuum-on-compute. CUBRID is process-local; this is more a structural contrast than a feature gap.

  • Self-tuning autovacuum (PostgreSQL 16+) — dynamic block size and worker count based on dirty-page rate. CUBRID’s VACUUM_LOG_BLOCK_PAGES_DEFAULT is static; an adaptive variant would be a useful CBRD ticket follow-up.

  • VACUUM-as-replication-source (Debezium-style) — vacuuming emits a log stream of “this row went away”. CUBRID’s supplemental log records (cubrid-log-manager.md §“Supplemental records”) could be repurposed for this; the cubrid-cdc.md doc is the natural follow-up.

Raw analyses (raw/code-analysis/cubrid/storage/vacuum/)

  • vacuum.pdf
  • vacuum.pptx
  • knowledge/code-analysis/cubrid/cubrid-mvcc.md: visibility model and oldest_visible_mvccid watermark.
  • knowledge/code-analysis/cubrid/cubrid-log-manager.md: LOG_MVCC_* records and LOG_VACUUM_INFO::prev_mvcc_op_log_lsa chain.
  • knowledge/code-analysis/cubrid/cubrid-heap-manager.md: per-record vacuum on the heap side.
  • knowledge/code-analysis/cubrid/cubrid-page-buffer-manager.md: per-thread LRU lists vacuum workers use.
  • knowledge/code-analysis/cubrid/cubrid-recovery-manager.md: vacuum_data_load_and_recover runs after the three-pass restart.

Textbook chapters (under knowledge/research/dbms-general/)

  • Database Internals (Petrov), Ch. 5 §“MVCC”, §“Garbage collection in MVCC”.

CUBRID source (/data/hgryoo/references/cubrid/)

  • src/query/vacuum.{c,h}
  • src/transaction/mvcc.{c,h}