CUBRID Lock Manager — Multi-Granularity Locking, Conversion, and Deadlock Detection

Contents:

Theoretical Background
Common DBMS Design
CUBRID’s Approach
Source Walkthrough
Source verification (as of 2026-04-29)
Beyond CUBRID — Comparative Designs & Research Frontiers
Sources

Theoretical Background

A lock manager is the component that enforces logical serialization between transactions by granting and revoking locks on database objects. Database Internals (Petrov, ch. 5) is careful to distinguish locks from latches: locks protect transactional data integrity, are held for transaction duration, and are stored outside the data structure itself; latches protect physical structure (e.g., a page during a B-tree split) and are short-lived. CUBRID’s lock manager is the lock side; the page buffer’s PGBUF latches are the latch side.

The classical lock-based concurrency control protocol is two-phase locking (2PL): every transaction has a growing phase in which it may acquire locks, followed by a shrinking phase in which it may release them, with no further acquisitions allowed once any lock has been released. 2PL guarantees serializable schedules. The textbook also notes the structural cost of 2PL — contention and deadlocks — and the standard remedies: timeouts, conservative 2PL (acquire all locks up front), or a waits-for graph maintained by the transaction manager that is periodically scanned for cycles, with one transaction per cycle aborted as the deadlock victim ([WEIKUM01]). Avoidance schemes that use transaction timestamps (wait-die, wound-wait; [RAMAKRISHNAN03]) are an alternative to detection. CUBRID uses detection via a waits-for graph plus timeout as a fallback.

Two more theoretical building blocks shape the implementation:

Lock modes and a compatibility matrix. Beyond shared (S) and exclusive (X), real systems add intention modes (IS, IX, SIX) for multi-granularity locking — taking a coarse lock at one level (e.g., table) implies you are about to take finer locks at a deeper level (e.g., row). Intention modes let two transactions both lock different rows of the same table without false conflict at the table level. CUBRID extends this set with BU (bulk update), SCH-S/SCH-M (schema stability/modification), and a special NON_2PL marker.
Lock conversion. A transaction holding a lock in mode A that requests mode B is granted the least upper bound of the two — e.g., S + IX = SIX. A square conversion table replaces a search over per-mode rules.

The rest of this document tracks how CUBRID realizes these two pieces plus the waits-for-graph deadlock detector in src/transaction/lock_manager.{h,c}.

Common DBMS Design

The textbook gives 2PL, multi-granularity locking, and the waits-for graph; this section names the engineering conventions almost every DBMS lock manager (Oracle, MySQL InnoDB, SQL Server, PostgreSQL, CUBRID) adopts in some form. CUBRID’s specific choices in ## CUBRID's Approach are best read as one set of dials within this shared design space, not as inventions.

Lock vs latch separation

Locks are logical: tied to a transaction, externalized in a lock table, durable in their effect on serialization order. Latches are physical: short, embedded in pages or in-memory structures, guarding against concurrent corruption. Almost every modern DBMS has both. Conflating them produces the wrong design — you end up either holding logical locks during low-level page work, or guarding transaction state with latches. CUBRID’s LK_ENTRY is the lock side; the page buffer’s PGBUF_LATCH is the latch side.

Resource hashing

Lockable resources live in a hash table keyed by some object identifier. The identifier shape varies — OID (volid, pageid, slotid) in CUBRID, page+slot+xid range in Oracle, a descriptor of (relation, key range) for predicate locking in PostgreSQL. The per-resource state is a list of holders plus a list of waiters.

Aggregate-mode cache

A naive compatibility check is O(holders): walk the list and intersect modes. The standard optimization is to cache the aggregate of all granted modes (total_holders_mode) and of all waiting modes (total_waiters_mode) on the resource record itself, turning the check into one matrix lookup. CUBRID does this; Oracle calls the equivalent the consolidated mode; the pattern recurs.

Dual-view threading

The same lock fact is queried from two angles: per-resource (“who holds R?”) and per-transaction (“what does T hold?”). The cleanest implementation is one record threaded into both lists — the lock table indexes the resource view, and a per-transaction table indexes the transaction view, but a single struct carries both sets of pointers. CUBRID’s LK_ENTRY is the canonical example.

Multi-granularity locking + intention modes

Gray’s hierarchy: IS, IX, S, SIX, X. Taking a coarse lock implies you are about to take finer locks underneath, and intention modes let two transactions lock different rows of the same table without false conflict at the table level. The compatibility matrix grows from 2×2 (S, X) to 5×5 or 7×7. CUBRID extends the vocabulary to 12 modes by adding BU (bulk update), SCH-S / SCH-M (schema stability / modification), U (update), and a NON2PL marker for released-early locks.

Lock conversion via square table

Re-requesting a lock you already hold should upgrade, not deadlock with yourself. The new mode is the least upper bound of held and requested — S + IX = SIX, IS + S = S, etc. Engines either compute this on the fly or look it up in a square table indexed by (held, requested). CUBRID uses a table.

FIFO wait queue + starvation guard

Granting based on holders alone allows a steady stream of S requests to starve a waiting X. The standard fix: the grant test is holders ∧ waiters compat — a new request must clear both the currently-granted aggregate and the currently-waiting aggregate. CUBRID enforces this; PostgreSQL has the same pattern.

Deadlock handling: three schools

Timeout (cheap, false-positive prone). Used as a fallback in most engines, including CUBRID.
Waits-for graph cycle scan (CUBRID, MySQL InnoDB). A background detector walks the WFG, breaks cycles by aborting a victim. Ordering choices: youngest, most-recently-blocked, fewest locks held, etc.
Timestamp-based avoidance — wait-die, wound-wait. Textbook classic; less common in production because it interacts poorly with long-running analytics.

Lock duration policy = isolation level

The lock acquisition mechanism is uniform across isolation levels; isolation is encoded in the release path. Under Read Committed, instance read locks are short-duration — released as the statement finishes, often tracked in a side list (CUBRID’s non2pl) so that later in the same transaction we can still detect a re-read conflict. Under Repeatable Read / Serializable, locks are long-duration: held until commit. Class and schema locks are always long-duration.

Theory ↔ CUBRID mapping

The textbook concepts of §“Theoretical Background” map to CUBRID’s named entities as follows. ## CUBRID's Approach is the slow zoom into each row.

Theory	CUBRID name
Lockable resource	`LK_RES` keyed by `LK_RES_KEY{type, OID, class_OID}`
Held / pending lock instance	`LK_ENTRY` (`granted_mode` + `blocked_mode`)
Lock modes (S / X + intention modes)	12-mode vocabulary in `lock_table.h` (NA…SCH-M)
Compatibility matrix	12×12 (NA, NON2PL, U omitted in printed table) lookup driven by `total_holders_mode` / `_waiters_mode`
Conversion (lub of held + requested)	square table in `lock_table.c` consulted on re-request
Per-transaction state	`LK_TRAN_LOCK` (root / class / inst hold lists + local pool)
Global registry	`lk_global_data` (hash → `LK_RES`, per-tran table, free list)
Waits-for graph	`LK_WFG_NODE` / `LK_WFG_EDGE`, scanned by deadlock daemon
Deadlock victim	`LOCK_RESUMED_ABORTED` set on victim’s wait state
Released-early lock tracker (RC)	`non2pl` list, `NON2PL` mode marker

CUBRID’s Approach

CUBRID instantiates the conventions above with a single global lk_global_data, a per-transaction LK_TRAN_LOCK table, and a shared LK_ENTRY record threaded into both views (resource and transaction). The distinguishing choices are: (1) OID-shaped resource keys that match the on-disk record identity exactly; (2) a 12-mode vocabulary (NA…SCH-M) extending Gray’s MGL with BU, SCH-S / SCH-M, U, and NON2PL; (3) a two-API split between lock_scan (intent locks at class grain — see Source Walkthrough for the symbol; not excerpted here) and lock_object (real locks at instance grain); (4) a periodic WFG cycle scan with “most-recently-blocked” victim policy plus per-request timeout as a fallback.

How a lock request flows

A query touches the lock manager through two API surfaces, one per grain. The pattern is deliberate: take an intention lock at the class first, then the real lock at the row.

flowchart LR
  Q["query / DML"] --> S["lock_scan(class_oid, mode)\n• IS for SELECT\n• IX for INSERT/UPDATE/DELETE\n• S for SERIALIZABLE SELECT"]
  S --> O["lock_object(record_oid, class_oid, mode)\n• S for SELECT\n• X for write"]
  O --> R{"compat with\nholders ∧ waiters?"}
  R -- "yes" --> W["statement runs"]
  R -- "no" --> B["suspend → WFG edge\n→ deadlock daemon scans cycles"]
  B --> W
  W --> U{"isolation\nlevel?"}
  U -- "READ COMMITTED" --> RC["lock_unlock_object(\n  move_to_non2pl=true)\nper statement (instance locks)"]
  U -- "RR / SERIALIZABLE" --> EOT["release at commit/rollback\n(strict 2PL)"]

Each labeled box is unpacked in the subsections below — the OID-shaped resource hierarchy, the LK_ENTRY dual-list layout, the 12-mode vocabulary and 12×12 compatibility matrix, the acquisition state machine, escalation, and the WFG deadlock detector. The boxes do not move; only the level of detail increases.

Resource hierarchy and OID-shaped keys

Every lockable thing in CUBRID — database, class, instance — is named by an OID = (volid, pageid, slotid). The hierarchy used by intention locking is literally the database OID hierarchy:

flowchart TB
  DB["Database / Root Class\n(OID_Root_class_oid)"]
  C1["Class C_a\n(class OID)"]
  C2["Class C_b"]
  I1a["Instance I_a1\n(instance OID, class_oid = C_a)"]
  I1b["Instance I_a2"]
  I1n["..."]
  I2a["Instance I_b1"]
  I2n["..."]
  DB --> C1
  DB --> C2
  C1 --> I1a
  C1 --> I1b
  C1 --> I1n
  C2 --> I2a
  C2 --> I2n

A lock resource key, an LK_RES, and one acquired-or-pending lock, LK_ENTRY, are the three core types:

// struct lk_res_key — src/transaction/lock_manager.h
struct lk_res_key
{
  LOCK_RESOURCE_TYPE type;     /* class, instance, root_class */
  OID oid;
  OID class_oid;
};

// struct lk_res — src/transaction/lock_manager.h
struct lk_res
{
  LK_RES_KEY key;              /* hash key */
  LOCK total_holders_mode;     /* aggregate lock mode of all holders */
  LOCK total_waiters_mode;     /* aggregate of all waiters */
  LK_ENTRY *holder;            /* per-resource holder list */
  LK_ENTRY *waiter;            /* per-resource waiter list */
  LK_ENTRY *non2pl;            /* released-early-but-still-tracked list */
  pthread_mutex_t res_mutex;
  LK_RES *hash_next;           /* hash chain */
  LK_RES *stack;               /* freelist */
  UINT64 del_id;               /* lock-free reclamation */
};

// struct lk_entry — src/transaction/lock_manager.h
struct lk_entry
{
  struct lk_res *res_head;     /* back to the resource */
  THREAD_ENTRY *thrd_entry;
  int tran_index;
  LOCK granted_mode;
  LOCK blocked_mode;
  int count;                   /* re-entrant request counter */
  UINT64 del_id;
  LK_ENTRY *stack;
  LK_ENTRY *next;              /* per-resource list (holder or waiter) */
  LK_ENTRY *tran_next;         /* per-transaction list */
  LK_ENTRY *tran_prev;
  LK_ENTRY *class_entry;       /* parent class's LK_ENTRY (for instances) */
  int ngranules;               /* count of finer-grain children below */
  int instant_lock_count;
  int bind_index_in_tran;
  XASL_ID xasl_id;
};

Figure 1 — OID structure

Figure 1 — Top: CUBRID’s lock granularity hierarchy — the database (root class) DB contains classes Cₐ, Cᵦ, each containing instances Iₐ₁..Iₐₙ, Iᵦ₁..Iᵦₙ. Bottom: any node in that hierarchy is represented as an OID = (volid, pageid, slotid) record, backed by the db_identifier C struct (int pageid, short slotid, short volid). The example arrow shows instance Iᵦₙ mapping to its OID, but the same 3-tuple identifies a database, a class, or an instance — so the lock manager treats all three uniformly. (Source: original lock manager presentation deck, slides 4–5.)

Two views of the same lock state

The same LK_ENTRY (one acquired or pending lock) is threaded into two different lists so the same lock state can be queried from either side without scanning:

flowchart LR
  subgraph TT["transaction view (LK_TRAN_LOCK)"]
    direction TB
    R["root_class_hold"]
    C["class_hold_list"]
    I["inst_hold_list"]
    R -.-> C
    C -.-> I
  end

  subgraph RT["resource view (LK_RES, hashed by LK_RES_KEY)"]
    direction TB
    H["holder list (granted)"]
    W["waiter list (blocked)"]
  end

  E1["LK_ENTRY #1\n(tran 7, X-mode)\nres = R_91"]
  E2["LK_ENTRY #2\n(tran 7, IX-mode)\nres = C_a"]
  E3["LK_ENTRY #3\n(tran 9, blocked X)\nres = R_91"]

  I -- "tran_next/prev" --> E1
  C -- "tran_next/prev" --> E2
  H -- "next" --> E1
  W -- "next" --> E3

The tran_next/tran_prev doubly-linked list is per-transaction and separates root, class, and instance locks into three sub-lists. The single next pointer threads holders or waiters per-resource. One LK_ENTRY is in exactly one resource list (holder or waiter) and exactly one transaction list at any moment.

Figure 2 — Dual-view threading

Figure 2 — Dual-view threading made concrete. Top row: three per-transaction tables (LK_TRAN_LOCK 0/1/2), each pointing into its own inst_hold_list/class_hold_list/root_class_hold. Bottom box: the global hash table of LK_RES records, each with its own holder/waiter lists. The arrows from top to bottom and within the bottom group show that a single LK_ENTRY is reachable from both views — tran_next/tran_prev from the transaction side, next from the resource side. This is what lets release-at-commit walk one list and compatibility checks walk another, both in O(1) amortized. (Source: deck slide 35.)

The lock table — `lk_global_data`

A single global struct holds:

m_obj_hash_table — hash from LK_RES_KEY (OID-derived) to LK_RES. The hash function combines pageid and slotid modulo htsize = MAX_NTRANS * 300; collisions resolve into a chain via LK_RES.hash_next.
tran_lock_table[tran_index] — per-transaction LK_TRAN_LOCK.
obj_free_entry_list — global free list of recyclable LK_ENTRYs, used after the per-transaction local pool (lk_tran_lock.lk_entry_pool, default 10 entries) is exhausted.

flowchart TB
  HT["m_obj_hash_table\n(LK_RES_KEY → LK_RES)"]
  TLT["tran_lock_table[]\n(per tran_index)"]
  FREE["obj_free_entry_list\n(global LK_ENTRY freelist)"]

  HT --> R0["LK_RES (slot 0)\nholder, waiter, hash_next"]
  R0 --> R0H["holder LK_ENTRY chain"]
  R0 --> R0W["waiter LK_ENTRY chain"]
  R0 -. "hash collision" .-> R1["LK_RES (slot 0, chained)"]

  TLT --> TL0["LK_TRAN_LOCK[0]\n• inst_hold_list\n• class_hold_list\n• root_class_hold\n• lk_entry_pool[10]"]
  TLT --> TLN["LK_TRAN_LOCK[N]"]

  R0H -. "same LK_ENTRY,\ndifferent linkage" .-> TL0

Lock modes (12) and the compatibility matrix

CUBRID’s lock vocabulary, ordered by strength:

Value	Mode	Meaning
0	NA	no lock
1	NON2PL	incompatible 2PL marker (released early)
2	NULL	placeholder
3	SCH-S	schema stability (DDL must wait)
4	IS	intention shared
5	S	shared
6	IX	intention exclusive
7	BU	bulk update
8	SIX	shared + intention exclusive
9	U	update (S that intends to upgrade to X)
10	X	exclusive
11	SCH-M	schema modification

The compatibility matrix (12×12; NA, NON2PL, U omitted from the printed 9×9 below) is a static lookup table. From Lock_Manager.pdf, slide 20:

	NULL	SCH-S	IS	S	IX	BU	SIX	X	SCH-M
NULL	✓	✓	✓	✓	✓	✓	✓	✓	✓
SCH-S	✓	✓	✓	✓	✓	✓	✓	✓	✗
IS	✓	✓	✓	✓	✓	✗	✓	✗	✗
S	✓	✓	✓	✓	✗	✗	✗	✗	✗
IX	✓	✓	✓	✗	✓	✗	✗	✗	✗
BU	✓	✓	✗	✗	✗	✓	✗	✗	✗
SIX	✓	✓	✓	✗	✗	✗	✗	✗	✗
X	✓	✓	✗	✗	✗	✗	✗	✗	✗
SCH-M	✓	✗	✗	✗	✗	✗	✗	✗	✗

A new request must be compatible with both the resource’s total_holders_mode and total_waiters_mode:

The holders test prevents conflicts with currently-granted locks.
The waiters test is the starvation guard: a stream of S requests cannot leapfrog a waiting X request just because they happen to be compatible with the current holders.

Aggregated mode bits replace per-holder list scans, turning each compatibility test into O(1).

Wake reasons for blocked threads

A blocked thread waits with a per-request timeout and resumes for one of these reasons:

// enum LOCK_WAIT_STATE — src/transaction/lock_manager.c
typedef enum
{
  LOCK_SUSPENDED,
  LOCK_RESUMED,                  /* lock granted */
  LOCK_RESUMED_TIMEOUT,
  LOCK_RESUMED_DEADLOCK_TIMEOUT, /* picked as deadlock victim */
  LOCK_RESUMED_ABORTED,          /* must abort due to deadlock */
  LOCK_RESUMED_ABORTED_FIRST,
  LOCK_RESUMED_ABORTED_OTHER,
  LOCK_RESUMED_INTERRUPT
} LOCK_WAIT_STATE;

Initializing an entry as granted vs. blocked

Two helpers diverge on which mode field is populated:

// lock_initialize_entry_as_granted — src/transaction/lock_manager.c
static void
lock_initialize_entry_as_granted (LK_ENTRY * entry_ptr, int tran_index, LK_RES * res, LOCK lock)
{
  entry_ptr->tran_index    = tran_index;
  entry_ptr->res_head      = res;
  entry_ptr->granted_mode  = lock;
  entry_ptr->blocked_mode  = NULL_LOCK;
  entry_ptr->count         = 1;
  /* ... list pointers nulled ... */
}

// lock_initialize_entry_as_blocked — src/transaction/lock_manager.c
static void
lock_initialize_entry_as_blocked (LK_ENTRY * entry_ptr, THREAD_ENTRY * thread_p,
                                  int tran_index, LK_RES * res, LOCK lock)
{
  entry_ptr->tran_index    = tran_index;
  entry_ptr->thrd_entry    = thread_p;
  entry_ptr->res_head      = res;
  entry_ptr->granted_mode  = NULL_LOCK;
  entry_ptr->blocked_mode  = lock;
  entry_ptr->count         = 1;
  /* ... list pointers nulled ... */
}

For an entry that lives in the holder list, granted_mode is the useful field; for a waiter, blocked_mode is. Conversion-in-progress holders use both: granted_mode is the currently held mode, blocked_mode is the upgrade being awaited.

Lock acquisition flow — `lock_object` → `lock_internal_perform_lock_object`

// lock_internal_perform_lock_object — src/transaction/lock_manager.c (annotated excerpt)
static int
lock_internal_perform_lock_object (THREAD_ENTRY *thread_p, int tran_index,
                                   const OID *oid, const OID *class_oid,
                                   LOCK lock, int wait_msecs,
                                   LK_ENTRY **entry_addr_ptr,
                                   LK_ENTRY *class_entry)
{
  /* ... */

start:
  if (class_oid != NULL && !OID_IS_ROOTOID (class_oid))
    {
      /* instance lock request */
      ret_val = lock_escalate_if_needed (thread_p, class_entry, tran_index);
      /* ... if class lock now subsumes the request, return granted ... */
    }
  else
    {
      /* class lock request — try to find existing entry first */
      entry_ptr = lock_find_class_entry (tran_index, oid);
      if (entry_ptr != NULL)
        {
          res_ptr = entry_ptr->res_head;
          goto lock_tran_lk_entry;
        }
    }

  /* find or add the lockable object in the lock table */
  search_key = lock_create_search_key ((OID *) oid, (OID *) class_oid);
  (void) lk_Gl.m_obj_hash_table.find_or_insert (thread_p, search_key, res_ptr);
  is_res_mutex_locked = true;

  if (res_ptr->holder == NULL && res_ptr->waiter == NULL && res_ptr->non2pl == NULL)
    {
      /* fresh resource: grant unconditionally */
      /* ... lock_initialize_resource_as_allocated, allocate granted entry ... */
    }
  else
    {
      /* check compatibility with total_holders_mode | total_waiters_mode */
      /* ... if compatible: splice into holder list, update aggregates ...   */
      /* ... else: splice as waiter, update aggregates, suspend thread ...   */
    }
}

End-to-end:

flowchart TD
  A["lock_object(thread, oid, class_oid, mode)"]
  A --> B{"class lock\nor instance lock?"}
  B -- "instance" --> C["lock_escalate_if_needed\n(parent class)"]
  C --> D{"class lock\nsubsumes request?"}
  D -- "yes" --> Z1["return LK_GRANTED"]
  D -- "no" --> H
  B -- "class" --> E{"existing entry\nfor this tran?"}
  E -- "yes" --> J["bump count / request conversion"]
  E -- "no" --> H["m_obj_hash_table.find_or_insert(LK_RES_KEY)"]
  H --> F{"res empty\n(no holder/waiter)?"}
  F -- "yes" --> G["grant fresh:\ninitialize_entry_as_granted,\nsplice into holder list"]
  G --> Z2["return LK_GRANTED"]
  F -- "no" --> K{"compat with holders|waiters?"}
  K -- "yes" --> L["splice into holder list,\nupdate total_holders_mode"]
  L --> Z3["return LK_GRANTED"]
  K -- "no" --> M["splice into waiter list,\nupdate total_waiters_mode,\nsuspend thread"]
  M --> N{"resume reason"}
  N -- "LOCK_RESUMED" --> Z4["granted"]
  N -- "LOCK_RESUMED_TIMEOUT" --> Z5["LK_NOTGRANTED_DUE_TIMEOUT"]
  N -- "LOCK_RESUMED_ABORTED\n(deadlock victim)" --> Z6["abort transaction"]

Figure 3 — Acquisition scenario walkthrough

Figure 3 — Concrete acquisition scenario. Transaction 10 requests X_LOCK on table A. The hash table already contains LK_RES 1 for table A with two holders: tran 7 holding IS_LOCK, tran 11 currently waiting with blocked_mode = X_LOCK. The aggregate fields carry the answer: total_holders_mode = IX_LOCK (LUB of currently-granted IS+IX), total_waiters_mode = X_LOCK. A new X_LOCK request fails compatibility against the holders side, fails against the waiters side, and so is enqueued; if wait_msecs == LK_ZERO_WAIT, the call returns LK_NOTGRANTED_DUE_TIMEOUT instead of suspending. (Source: deck slide 110.)

Lock release — duration and isolation level

CUBRID tracks lock durations and releases at different points depending on isolation level:

Instance locks under READ COMMITTED — released as soon as the statement that acquired them finishes (short-duration). This is what keeps RC’s locking footprint small.
Instance locks under REPEATABLE READ / SERIALIZABLE — released at transaction end. Standard 2PL.
Class locks — released at transaction end regardless of isolation level (intention locks have to outlive their children).
Schema locks (SCH-S/SCH-M) — special-cased; SCH-M conflicts with every other mode except NULL.

The release path is lock_unlock_object → lock_internal_perform_unlock_object (both in src/transaction/lock_manager.c). lock_unlock_object_by_isolation implements the per-isolation policy.

Figure 4 — Unlock decision flow

Figure 4 — lock_internal_perform_unlock_object decision tree. release_flag = true is the unconditional path used at transaction end. With release_flag = false, the function checks for a re-entrant request (count > 0 and a pending conversion in blocked_mode); if more requests are outstanding, the lock is kept and count is just decremented. Only when count == 0 and blocked_mode == NULL_LOCK does the entry actually leave the holder list. (Source: deck slide 180.)

Lock escalation

Per-transaction lock counts are bounded. When a transaction acquires more than lock_escalation_threshold instance locks on a single class, lock_escalate_if_needed (in src/transaction/lock_manager.c) replaces the per-row locks with a single class-level lock and frees the instance entries. This is a memory-pressure relief valve and is also the reason LK_ENTRY carries ngranules (count of children for an intention lock) and a class_entry back-pointer.

Deadlock detection — waits-for graph

CUBRID uses a waits-for graph (WFG) maintained in src/transaction/wait_for_graph.{h,c} and walked periodically by a dedicated checker. The two graph types live in lock_manager.c:

// TWFG (transaction wait-for graph) types — src/transaction/lock_manager.c
/* TWFG (transaction wait-for graph) entry and edge */
typedef struct lk_WFG_node LK_WFG_NODE;
struct lk_WFG_node
{
  int first_edge;
  bool candidate;
  int current;
  int ancestor;
  INT64 thrd_wait_stime;
  int tran_edge_seq_num;
  bool checked_by_deadlock_detector;
  bool DL_victim;
};

typedef struct lk_WFG_edge LK_WFG_EDGE;
struct lk_WFG_edge
{
  int to_tran_index;
  int edge_seq_num;
  int holder_flag;
  int next;
  INT64 edge_wait_stime;
  LK_ENTRY *holder;
  LK_ENTRY *waiter;
};

When transaction T1 blocks waiting for a lock currently held by T2, an LK_WFG_EDGE is added from T1’s LK_WFG_NODE to T2’s. The detector (lock_detect_local_deadlock) walks the graph looking for cycles, picks an LK_DEADLOCK_VICTIM (the most recently blocked transaction), and marks it LOCK_RESUMED_ABORTED; the lock-acquisition loop in the victim then sees the abort flag and propagates it as a rollback.

flowchart LR
  subgraph Cycle["deadlock cycle (T1 ↔ T3)"]
    T1["T1\nholds R1, waits R2"]
    T2["T2\nholds R2, waits R3"]
    T3["T3\nholds R3, waits R1"]
    T1 -- "wants R2 held by" --> T2
    T2 -- "wants R3 held by" --> T3
    T3 -- "wants R1 held by" --> T1
  end

  DET["lock_detect_local_deadlock\n(periodic)"]
  DET -- "DFS over WFG" --> Cycle
  DET --> VICTIM["LK_DEADLOCK_VICTIM\n= most recently blocked\n(e.g., T3)"]
  VICTIM --> ABORT["set LOCK_RESUMED_ABORTED\non victim's lk_entry"]

The “local” qualifier is intentional: distributed deadlock detection is out of scope for the per-node manager. Timeout-based deadlock breaking is the fallback when graph maintenance lags.

Figure 5 — Wait-For Graph cycle detection

Figure 5 — Wait-For Graph cycle detection step by step. Three transactions (3, 6, 7) and four edges form the WFG; the algorithm performs a DFS marking each LK_WFG_NODE.current and .ancestor, and detects a cycle when a follow-up walk returns to a node whose ancestor == -2 (start sentinel). The exact field semantics shown on the right are the same first_edge, current, ancestor, to_tran_index, edge_seq_num, holder_flag, next fields walked by lock_detect_local_deadlock. (Source: deck slide 160.)

Lifecycle

lock_initialize (in src/transaction/lock_manager.c) allocates the hash table, per-tran tables, free entry list, and the deadlock-detection state.
lock_finalize tears them down, with lock_finalize_tran_lock_table handling the per-tran half.

Source Walkthrough

Anchor on symbol names, not line numbers. The CUBRID source moves; a function name (or struct/enum tag) is the stable handle. Use git grep -n '<symbol>' src/transaction/ to locate the current position. The position-hint table at the end of this section was observed when the document was last updated:.

Type definitions (`src/transaction/`)

struct lk_entry (in lock_manager.h) — one acquired or pending lock; threaded into both a resource list and a transaction list.
struct lk_res_key (in lock_manager.h) — OID-shaped hash key.
struct lk_res (in lock_manager.h) — per-resource state with total_holders_mode, total_waiters_mode, holder/waiter lists.
enum LOCK_WAIT_STATE (in lock_manager.c) — wake reasons (LOCK_RESUMED, LOCK_RESUMED_TIMEOUT, LOCK_RESUMED_DEADLOCK_TIMEOUT, LOCK_RESUMED_ABORTED, …).
struct lk_WFG_node, struct lk_WFG_edge, struct lk_deadlock_victim (in lock_manager.c) — waits-for graph nodes/edges plus victim record.

Initialization (`src/transaction/lock_manager.c`)

lock_initialize — entry point; calls each of the helpers below.
lock_initialize_tran_lock_table — per-tran tables.
lock_initialize_object_hash_table — m_obj_hash_table (MAX_NTRANS * 300 slots).
lock_initialize_object_lock_entry_list — global free list of LK_ENTRY.
lock_initialize_deadlock_detection — WFG state.

Acquisition / conversion / release (`src/transaction/lock_manager.c`)

lock_object — public entry.
lock_object_wait_msecs — variant with explicit timeout.
lock_internal_perform_lock_object — real work (excerpted above).
lock_internal_perform_unlock_object.
lock_unlock_object.
lock_unlock_objects_lock_set — bulk release for LC_LOCKSET.
lock_initialize_entry_as_granted / lock_initialize_entry_as_blocked / lock_initialize_entry_as_non2pl — entry constructors.

Escalation and deadlock (`src/transaction/lock_manager.c`)

lock_escalate_if_needed — escalate per-instance to per-class.
lock_detect_local_deadlock — periodic WFG cycle scan.
#include "wait_for_graph.h" near the top of the file — the WFG abstraction lives in src/transaction/wait_for_graph.{h,c}.

Lock-mode tables

src/transaction/lock_table.h — declarations for the compatibility and conversion matrices.
src/transaction/lock_table.c — the actual 12×12 tables.

Position hints as of this revision

These line numbers held when the document was last updated:. If you land at a different definition, the symbol name above is authoritative; update the table on your way through.

Symbol	File	Line
`struct lk_entry`	`lock_manager.h`	79
`struct lk_res_key`	`lock_manager.h`	164
`struct lk_res`	`lock_manager.h`	175
`enum LOCK_WAIT_STATE`	`lock_manager.c`	180
`struct lk_WFG_node`/`_edge`/`deadlock_victim`	`lock_manager.c`	256
`lock_initialize_entry_as_granted`	`lock_manager.c`	920
`lock_initialize_entry_as_blocked`	`lock_manager.c`	940
`lock_initialize_tran_lock_table`	`lock_manager.c`	1061
`lock_initialize_object_hash_table`	`lock_manager.c`	1121
`lock_initialize_object_lock_entry_list`	`lock_manager.c`	1152
`lock_initialize_deadlock_detection`	`lock_manager.c`	1180
`lock_escalate_if_needed`	`lock_manager.c`	2974
`lock_internal_perform_lock_object`	`lock_manager.c`	3223
`lock_internal_perform_unlock_object`	`lock_manager.c`	3945
`lock_initialize`	`lock_manager.c`	5627
`lock_finalize_tran_lock_table`	`lock_manager.c`	5702
`lock_finalize`	`lock_manager.c`	5859
`lock_object`	`lock_manager.c`	5945
`lock_object_wait_msecs`	`lock_manager.c`	6273
`lock_unlock_object`	`lock_manager.c`	6630
`lock_unlock_objects_lock_set`	`lock_manager.c`	6722

Source verification (as of 2026-04-29)

Each entry is a fact about the current source — readable without the original analysis materials. The trailing note shows how it was checked and, where relevant, historical drift or limits of verification. Open questions follow as the curator’s recorded gaps; future readers should treat them as starting points, not as known bugs.

Verified facts

LK_ENTRY, LK_RES_KEY, and LK_RES are the three core types in src/transaction/lock_manager.h. Symbol names are the stable handle; line numbers in the position-hint table above are scoped to this revision’s updated: date and will drift with reformatting.
The lock vocabulary has 12 modes (NA, NON2PL, NULL, SCH-S, IS, S, IX, BU, SIX, U, X, SCH-M). BU (bulk update) is incompatible with IS, S, IX, SIX, X, SCH-M — verified in lock_table.c on 2026-04-29. (Whether the bulk-update path is actively exercised on a current build vs. legacy is an open question.)
The object hash size is MAX_NTRANS * 300, hard-coded. Verified in lock_initialize_object_hash_table (lock_manager.c) on 2026-04-29. The “300” is a literal multiplier in the source, not a parameter. Tuning it requires a source change, not a config setting.
The per-transaction local entry pool size is LOCK_TRAN_LOCAL_POOL_MAX_SIZE = 10. Verified on 2026-04-29. lock_get_new_entry / lock_free_entry search the local pool before the global free list — the local-first lookup ordering matters because hot OLTP transactions can satisfy most lock needs without touching the shared free-list lock.
LK_ENTRY.class_entry is a one-hop pointer, not a multi-step chain. For instances, it points to the parent class’s LK_ENTRY; for classes, to the root-class LK_ENTRY. Verified in lock_internal_perform_lock_object and lock_escalate_if_needed on 2026-04-29. Designed for fast escalation, not for hierarchical traversal.
Wake reasons for blocked threads are enumerated in enum LOCK_WAIT_STATE (lock_manager.c). Confirmed members include LOCK_RESUMED, LOCK_RESUMED_TIMEOUT, LOCK_RESUMED_DEADLOCK_TIMEOUT, LOCK_RESUMED_ABORTED, LOCK_RESUMED_INTERRUPT. Verified on 2026-04-29.
Deadlock victim policy is “most-recently-blocked”. Verified by reading lock_detect_local_deadlock on 2026-04-29. The policy is implicit in WFG insertion order rather than a named constant, and not visible in the original analysis decks. (See Open Questions §1 — whether this is intentional or an artifact is unverified.)

Open questions

Is “most-recently-blocked” the intended victim policy, or an artifact of WFG insertion order? The source produces this ordering implicitly; no comment explicitly chooses it. Investigation path: read the original WFG design intent in src/transaction/wait_for_graph.{h,c} and check the CBRD ticket trail around lock_detect_local_deadlock.
Distributed deadlocks. lock_detect_local_deadlock is named “local” deliberately; CUBRID distributed setups (HA / shard load balancing) presumably handle cross-node deadlocks via timeout. Worth verifying once the heartbeat / CDC analyses are pulled in.
NON2PL exact protocol. lock_initialize_entry_as_non2pl marks locks released early under READ COMMITTED that other transactions may still want to “see” for serialization purposes. The exact handshake — which transactions check non2pl when, and what happens when a non2pl entry “wins” a conflict — is worth tracing through the unlock path.
Interaction with MVCC. Pure SI does not need locks for read-read or read-write conflicts, but CUBRID still acquires instance locks on the write path. The boundary between MVCC visibility and lock-based serialization (especially under SERIALIZABLE) is the highest-value follow-up — bridges this analysis with cubrid-mvcc.md.
Bulk-update path liveness. BU is in the vocabulary and the matrix, but is the bulk-update code path actually exercised on current builds, or has it become legacy? Investigation path: git log -S BU_LOCK -- src/transaction/ plus a runtime trace of any loaddb-style ingest.

Beyond CUBRID — Comparative Designs & Research Frontiers

Pointers, not analysis. Each bullet is a starting handle for a follow-up doc; depth here is intentionally shallow.

Bamboo: releasing locks early. Guo et al. (2021) — release exclusive locks before commit when subsequent readers can be re-validated, reducing tail latency on contended OLTP. CUBRID’s NON2PL machinery already tracks early-released RC locks for conflict detection; Bamboo generalizes the pattern.
Brook-2PL: deadlock-free 2PL. Pre-analyzes a static dependency ordering to make deadlocks impossible. The trade-off is the pre-analysis cost — interesting only for stable, repeating workloads.
TXSQL: lock optimizations for contended workloads. Adaptive lock-mode adjustment, batching, and contention-aware scheduling. Cross-pollination candidates with CUBRID’s lock_escalate_if_needed.
Optimistic CC without a central lock manager. Hekaton, Silo, Cicada eliminate the lock table — validation at commit time replaces acquisition. Different design point; relevant for in-memory workloads and the OCC-vs-2PL debate at high core counts.
VLL — Very Lightweight Locking. Reduces lock-acquisition cost through partitioning. Useful pattern when the lock table itself is the contention hotspot.
HTM-assisted lock managers. Hardware transactional memory fast-paths short transactions to avoid lock acquisition entirely. Still niche in production but explored for OLTP.
Predicate locking for serializability (PostgreSQL SSI). The alternative to CUBRID’s “fall back to S / IX table locks under SERIALIZABLE” approach. A side-by-side cost comparison would clarify when each is cheaper.
Formal models of lock escalation. KAIST DBLAB (Chang 2005) formalizes the trade-offs between memory pressure relief and unintended serialization caused by escalation. Relevant for tuning CUBRID’s lock_escalation_threshold.
Recent paper trail (from Notion’s “Storage – Concurrency” overview): Cahill et al., Serializable Snapshot Isolation in PostgreSQL (2008/2012); Yu et al., Staring into the Abyss (VLDB 15); Wu et al., empirical MVCC (VLDB 17); Bamboo (2021); Brook-2PL (2025); TXSQL (2025).

The intent of this section is to seed next documents, not to analyze. Each bullet should become its own curated note when its turn comes.

Sources

Raw analyses (under `raw/code-analysis/cubrid/storage/lock_manager/`)

Lock_Manager.pdf — primary deck; lock modes, hash table, resource/entry layout, compatibility/conversion matrices.
lock manager_발표.pdf and .pptx — presentation deck with step-by-step compatibility/conversion examples (e.g., an S request arriving with mixed holders).
LOG_TDES_LOCK_Data.pdf — narrative description of lk_global_data, lk_tran_lock, lk_entry, lk_res and their interrelations from the transaction-descriptor side.

Textbook chapters (under `knowledge/research/dbms-general/`)

Database Internals (Petrov), Ch. 5: §“Locks vs Latches” (≈ line 4404), §“Two-Phase Locking” (≈ line 4321), §“Deadlocks” (≈ line 4344) including waits-for graphs and the wait-die / wound-wait avoidance schemes.
Ch. 1, §“Lock manager” (≈ line 777) — high-level role of the component within DBMS architecture.

Notion (CUBRID DEV WIKI)

Storage – Concurrency 코드 분석 — module-level positioning that frames Lock Manager alongside MVCC and Vacuum.
Lock Manager 코드 분석 — 자료구조·초기화·잠금 획득 흐름 — authoritative for the LK_GLOBAL_DATA, LK_TRAN_LOCK, and the lock_internal_perform_lock_object state machine; the dual-view framing in §“Mental model” is distilled from this page.
Lock Manager 코드 분석 — Isolation Level별 잠금 동작 — the lock_unlock_object isolation branch; basis for the “isolation is policy, not mechanism” claim in §“How a lock request flows”.
Lock Manager 코드 분석 — Deadlock Detection / Lock Escalation / Lock Suspend·Resume / Composite Lock / Instant Lock Mode / non2pl 리스트 상세 / 디버깅 가이드 / lock_scan() — auxiliary chapters in the same workspace that drill into specific subsystems referenced here.

CUBRID source (under `/data/hgryoo/references/cubrid/`)

src/transaction/lock_manager.h
src/transaction/lock_manager.c
src/transaction/lock_table.h
src/transaction/lock_table.c
src/transaction/wait_for_graph.h (deadlock-detection abstraction)
src/transaction/wait_for_graph.c (deadlock-detection implementation)