CUBRID 2PC — Two-Phase Commit and In-Doubt Recovery
Contents:
- Theoretical Background
- Common DBMS Design
- CUBRID’s Approach
- Source Walkthrough
- Source verification (as of 2026-04-30)
- Beyond CUBRID — Comparative Designs & Research Frontiers
- Sources
Theoretical Background
The two-phase commit (2PC) protocol is the canonical answer to “how do N independent sites agree to commit or abort one distributed transaction without a central truthkeeper”. Jim Gray named it in Notes on Database Operating Systems (1978); the JTA/XA specification (X/Open, then Java JSR 907) made it the interoperability standard between transaction managers and resource managers. Database Internals (Petrov, ch. 13 “Distributed Transactions”) gives the textbook treatment.
The protocol has two roles and two phases:
- Coordinator drives the transaction; participants are the resource managers that hold parts of the state.
- Phase 1 (prepare): coordinator asks each participant “ready to commit?”. Each participant either votes YES (and promises not to abort unilaterally) or NO. The vote is durable before the participant replies.
- Phase 2 (decision): if all voted YES, coordinator decides COMMIT and tells each participant; otherwise ABORT. The decision is durable before the coordinator sends. Participants ack and the coordinator forgets the gtrid.
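The voting rule in phase 2 can be sketched in a few lines of C. This is an illustration of the generic protocol only, not CUBRID code: `vote_t` and `decide_commit` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vote type for illustration; not a CUBRID identifier. */
typedef enum { VOTE_NO = 0, VOTE_YES = 1 } vote_t;

/* Returns true (COMMIT) only if every participant voted YES.
 * Any NO vote forces ABORT. With zero participants the loop is
 * skipped and the result is COMMIT. */
bool
decide_commit (const vote_t *votes, size_t num_particps)
{
  assert (votes != NULL || num_particps == 0);
  for (size_t i = 0; i < num_particps; i++)
    {
      if (votes[i] != VOTE_YES)
        {
          return false;
        }
    }
  return true;
}
```

A real coordinator would also treat a missing vote (timeout) as NO; the sketch assumes every slot in `votes` has already been filled.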
The ugly state in 2PC is in-doubt: a participant has voted YES but has not yet heard the coordinator’s decision when it crashes. On restart, the participant must keep holding the in-doubt transaction’s locks until the decision arrives, and it asks the coordinator for the outcome. If the coordinator is also gone, operator intervention is needed. This is why XA defines heuristic decisions and a separate XID identity space.
Two implementation choices the model leaves open shape every real engine and frame the rest of this document:
- Same site as coordinator and participant? A single CUBRID server can be both: a query that updates this server and a linked server is a distributed transaction whose coordinator is this server and participant is the linked server. CUBRID’s LOG_2PC_EXECUTE enum dispatches by role.
- How is a prepared transaction identified across crashes? The protocol needs a global ID (gtrid/XID) that survives the local TDES being recycled. CUBRID assigns gtrids at log_2pc_start and stores them on the TDES; in-doubt recovery rebuilds the gtrid → tid map from prepared-state log records.
After the choices are named, every CUBRID-specific structure in this document either implements one of them or makes the protocol durable.
Common DBMS Design
Every engine that supports 2PC adopts the same set of patterns on top of Gray’s protocol.
Forced log writes at decision boundaries
The prepared state and the decision must be on stable storage before the network message is sent. Both records are force-flushed (cubrid-log-manager.md §“Force-at-commit”). PostgreSQL, InnoDB, and Oracle all share this discipline.
gtrid as a separate identifier from local trid
A local trid is reused after the transaction terminates.
gtrid survives — until in-doubt recovery completes or the
TM heuristically forgets it. Engines store the gtrid in a
side-channel (TDES field, separate table) so the local trid
can be recycled without losing the prepared transaction.
Coordinator info attached to TDES
The coordinator side of a distributed transaction needs to
remember the participant list and ack state. Storing this on
the TDES (rather than a separate coordinator table) means a
crash recovers it together with the rest of the TDES via
LOG_2PC_START / LOG_2PC_PREPARE log records.
Presumed-abort optimisation
If a coordinator crashes after sending PREPARE but before deciding, the outcome must be ABORT: there is no commit-decision record for recovery to find. The standard “presumed-abort” optimisation builds on this. The coordinator does not force-log abort decisions and can forget an aborted transaction immediately; when a participant inquires about a gtrid for which the coordinator has no record, the answer is ABORT. A participant that voted YES still cannot abort on its own; it has to ask.
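Seen from the coordinator’s side, presumed abort reduces to a lookup with a default. The sketch below assumes a hypothetical in-memory list of gtrids whose COMMIT decision has been made durable; `answer_inquiry` and `outcome_t` are illustrative names, and nothing here is taken from CUBRID, whose use of this optimisation is an open question later in this document.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome type for illustration. */
typedef enum { OUTCOME_ABORT = 0, OUTCOME_COMMIT = 1 } outcome_t;

/* Answer a participant's inquiry about `gtrid`.
 * `committed_gtrids` stands in for the set of durable COMMIT decisions
 * (an assumption of this sketch). Under presumed abort, a gtrid with no
 * recorded decision is answered with ABORT. */
outcome_t
answer_inquiry (const int *committed_gtrids, size_t n, int gtrid)
{
  assert (committed_gtrids != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (committed_gtrids[i] == gtrid)
        {
          return OUTCOME_COMMIT;
        }
    }
  return OUTCOME_ABORT;		/* no record: presumed abort */
}
```

The design choice is that only commit decisions need a forced log write; an abort costs nothing to forget.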
XA bridge
Java/JTA + the X/Open XA spec define a TM↔RM contract. CUBRID
exposes XA through tran_2pc_* client APIs; gtrid here
becomes XA’s XID. The CUBRID server is both an RM (when
attached as a participant) and an internal coordinator (when
driving its own dependent participants).
Theory ↔ CUBRID mapping
| Theoretical concept | CUBRID name |
|---|---|
| Coordinator/participant role enum | LOG_2PC_EXECUTE { FULL, PREPARE, COMMIT_DECISION, ABORT_DECISION } |
| Global transaction id | gtrid field on LOG_TDES; LOG_2PC_NULL_GTRID = -1 |
| Coordinator state on TDES | LOG_2PC_COORDINATOR { num_particps, particp_id_length, block_particps_ids } |
| Global tran user info (XID payload) | LOG_2PC_GTRINFO { info_length, info_data } |
| Lock-acquire flag on read-prepare | LOG_2PC_OBTAIN_LOCKS = true / LOG_2PC_DONT_OBTAIN_LOCKS = false |
| Phase 1 commit | log_2pc_commit_first_phase (log_2pc.c:437) |
| Phase 2 commit | log_2pc_commit_second_phase (log_2pc.c:503) |
| Phase dispatch | log_2pc_commit (log_2pc.c:632) |
| Prepared log record | LOG_2PC_PREPARE + LOG_REC_2PC_PREPCOMMIT (log_record.hpp:387) |
| Start record | LOG_2PC_START + LOG_REC_2PC_START (log_record.hpp:399) |
| Decision records | LOG_2PC_COMMIT_DECISION, LOG_2PC_ABORT_DECISION |
| Inform-participants records | LOG_2PC_COMMIT_INFORM_PARTICPS, LOG_2PC_ABORT_INFORM_PARTICPS |
| Ack record | LOG_2PC_RECV_ACK + LOG_REC_2PC_PARTICP_ACK (log_record.hpp:412) |
| Prepared TDES state | TRAN_UNACTIVE_2PC_PREPARE |
| In-doubt-collecting state | TRAN_UNACTIVE_2PC_COLLECTING_PARTICIPANT_VOTES |
| Phase-2 decision states | TRAN_UNACTIVE_2PC_COMMIT_DECISION / _ABORT_DECISION |
| Informing-after-decision states | TRAN_UNACTIVE_COMMITTED_INFORMING_PARTICIPANTS / _ABORTED_INFORMING_* |
| In-doubt recovery | log_2pc_recovery (log_2pc.h:96) |
| In-doubt analysis-pass annotation | log_2pc_recovery_analysis_info (log_2pc.h:95) |
| XA prepared-list query | log_2pc_recovery_prepared (log_2pc.c:915) |
| XA attach by gtrid | log_2pc_attach_global_tran (log_2pc.c:1036) |
| XA prepare for an attached gtrid | log_2pc_prepare_global_tran (log_2pc.c:1126) |
CUBRID’s Approach
The 2PC module has four moving parts: the role-dispatch machinery that routes one TDES through different code paths depending on coordinator/participant role, the prepared-state log records that make the protocol durable, the in-doubt recovery that brings prepared transactions back at restart, and the XA bridge that lets external transaction managers drive the protocol. We walk them in that order.
Overall structure
```mermaid
flowchart LR
  subgraph CL["Client / TM (XA)"]
    XA["xa_prepare\nxa_commit\nxa_rollback"]
    TC["tran_2pc_∗\n(transaction_cl.c)"]
    XA --> TC
  end
  subgraph SR["Server (transaction_sr + log_2pc)"]
    XSC["xtran_2pc_∗"]
    L2C["log_2pc_∗"]
    XSC --> L2C
  end
  subgraph TDES["log_tdes (per-tran state)"]
    GT["gtrid"]
    GI["gtrinfo (XID payload)"]
    CO["coord (NULL if not coordinator)"]
    ST["state (TRAN_UNACTIVE_2PC_∗)"]
  end
  subgraph LOG["WAL records"]
    R0["LOG_2PC_PREPARE"]
    R1["LOG_2PC_START"]
    R2["LOG_2PC_COMMIT_DECISION"]
    R3["LOG_2PC_ABORT_DECISION"]
    R4["LOG_2PC_∗_INFORM_PARTICPS"]
    R5["LOG_2PC_RECV_ACK"]
  end
  subgraph PART["Participants"]
    P1["site B"]
    P2["site C"]
  end
  TC -->|RPC| XSC
  L2C --> TDES
  L2C --> LOG
  L2C -->|prepare| PART
  PART -->|vote| L2C
  L2C -->|commit / abort| PART
  PART -->|ack| L2C
```
The figure encodes three boundaries:
- Client / server: the XA / tran_2pc_* API is the client face; the server-side log_2pc_* is the implementation.
- TDES / log: the TDES holds the live state (gtrid, coord, isolation-relevant fields); the log holds the durable trail from which recovery re-establishes that state.
- Coordinator / participant: the same TDES can be either, dispatched by the LOG_2PC_EXECUTE enum.
Role dispatch — LOG_2PC_EXECUTE
```c
// LOG_2PC_EXECUTE (src/transaction/log_2pc.h:45)
enum log_2pc_execute
{
  LOG_2PC_EXECUTE_FULL,            /* The root coordinator */
  LOG_2PC_EXECUTE_PREPARE,         /* Participant that is also a non-root coordinator running phase 1 */
  LOG_2PC_EXECUTE_COMMIT_DECISION, /* Participant + non-root coordinator running phase 2 (commit) */
  LOG_2PC_EXECUTE_ABORT_DECISION   /* Same but abort, possibly without phase 1 */
};
typedef enum log_2pc_execute LOG_2PC_EXECUTE;
```

The four values map to four roles a single CUBRID server can play in a distributed transaction:
- FULL — this server is the root coordinator. It drives prepare, collects votes, decides, and informs all participants.
- PREPARE — this server is somewhere in the middle of the tree. It is a participant from the perspective of a higher coordinator, and a coordinator for participants below it. Phase 1 from above triggers phase 1 below.
- COMMIT_DECISION / ABORT_DECISION — same middle position, but executing phase 2.
The dispatch happens in log_2pc_commit (log_2pc.c:632):
```c
// log_2pc_commit (src/transaction/log_2pc.c), signature only
TRAN_STATE
log_2pc_commit (THREAD_ENTRY *thread_p, log_tdes *tdes,
                LOG_2PC_EXECUTE execute_2pc_type, bool *decision);
```

The execute_2pc_type argument selects the path; *decision is filled with the local outcome (true = commit, false = abort), which propagates up to a parent coordinator, if any.
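One way to read the role descriptions above is as a mapping from role to the phases that run locally. The sketch below is an interpretation of that prose, not a transcription of the log_2pc_commit body: the enum is redeclared so the block compiles on its own, and `phase_plan` and `plan_for` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Redeclared from the enum quoted above so the sketch is self-contained. */
typedef enum
{
  LOG_2PC_EXECUTE_FULL,
  LOG_2PC_EXECUTE_PREPARE,
  LOG_2PC_EXECUTE_COMMIT_DECISION,
  LOG_2PC_EXECUTE_ABORT_DECISION
} LOG_2PC_EXECUTE;

/* Hypothetical helper type: which phases a given role executes locally. */
struct phase_plan
{
  bool run_phase1;
  bool run_phase2;
};

struct phase_plan
plan_for (LOG_2PC_EXECUTE type)
{
  struct phase_plan plan = { false, false };

  assert ((int) type >= 0 && (int) type <= (int) LOG_2PC_EXECUTE_ABORT_DECISION);
  switch (type)
    {
    case LOG_2PC_EXECUTE_FULL:	/* root: prepare, decide, inform */
      plan.run_phase1 = true;
      plan.run_phase2 = true;
      break;
    case LOG_2PC_EXECUTE_PREPARE:	/* middle node, phase 1 only */
      plan.run_phase1 = true;
      break;
    case LOG_2PC_EXECUTE_COMMIT_DECISION:	/* middle node, phase 2 only */
    case LOG_2PC_EXECUTE_ABORT_DECISION:
      plan.run_phase2 = true;
      break;
    }
  return plan;
}
```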
Coordinator state on TDES
When a TDES acts as a coordinator, the coord pointer points
to a LOG_2PC_COORDINATOR block:
```c
// LOG_2PC_COORDINATOR (src/transaction/log_2pc.h:64)
struct log_2pc_coordinator
{
  int num_particps;           /* Number of participating sites */
  int particp_id_length;      /* Length of one participant identifier */
  void *block_particps_ids;   /* Block of N × particp_id_length bytes */
#ifdef LOG_2PC_ACK_RECV_REQUIRED
  bool *ack_received;         /* Per-participant ack vector */
#endif
};
```

block_particps_ids is a flat byte block of N participant IDs of length particp_id_length each: a network address, a name, or whatever the calling code passes to log_2pc_alloc_coord_info. Storing it as a flat block (rather than an array of pointers) means it serialises directly into the LOG_2PC_START record:
```c
// LOG_REC_2PC_START (src/transaction/log_record.hpp:399)
struct log_rec_2pc_start
{
  char user_name[DB_MAX_USER_LENGTH + 1];
  int gtrid;
  int num_particps;
  int particp_id_length;
  /* immediately followed by num_particps × particp_id_length bytes */
};
```

The LOG_2PC_ACK_RECV_REQUIRED #ifdef controls whether the coordinator tracks per-participant acks. When defined, the ack vector is populated by LOG_2PC_RECV_ACK records during phase 2.

log_2pc_alloc_coord_info (declared at log_2pc.h:93) attaches this struct to a TDES; log_2pc_free_coord_info releases it on transaction end.
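The flat-block layout makes participant lookup plain pointer arithmetic. A minimal sketch, assuming fixed-length identifiers packed back to back with no padding: `coord_sketch` repeats only the three unconditional fields of the struct quoted above, and `particp_id_at` and `particp_block_size` are hypothetical helpers, not CUBRID functions.

```c
#include <assert.h>
#include <stddef.h>

/* Subset of LOG_2PC_COORDINATOR, redeclared so the sketch compiles alone. */
struct coord_sketch
{
  int num_particps;
  int particp_id_length;
  void *block_particps_ids;
};

/* Pointer to the i-th participant id inside the flat block,
 * or NULL when the index is out of range. */
const void *
particp_id_at (const struct coord_sketch *coord, int i)
{
  assert (coord != NULL);
  if (i < 0 || i >= coord->num_particps)
    {
      return NULL;
    }
  return (const char *) coord->block_particps_ids
    + (size_t) i * (size_t) coord->particp_id_length;
}

/* Size in bytes of the variable-length tail that would follow the
 * fixed-size start-record header. */
size_t
particp_block_size (const struct coord_sketch *coord)
{
  assert (coord != NULL);
  return (size_t) coord->num_particps * (size_t) coord->particp_id_length;
}
```

Because the block is contiguous, writing the start record’s tail is a single copy of `particp_block_size` bytes, which is the serialisation benefit the text describes.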
Global transaction identification
A gtrid is an int handed out at log_2pc_start and stored in log_tdes::gtrid (log_impl.h:499). The companion LOG_2PC_GTRINFO carries the XA-style payload:

```c
// LOG_2PC_GTRINFO (src/transaction/log_2pc.h:57)
struct log_2pc_gtrinfo
{
  int info_length;
  void *info_data;   /* opaque to the engine: XID payload */
};
```

log_2pc_set_global_tran_info (log_2pc.c:705) writes the payload onto the TDES; log_2pc_get_global_tran_info (log_2pc.c:772) reads it back.
log_2pc_make_global_tran_id (log_2pc.c:323) generates a new
gtrid; log_2pc_check_duplicate_global_tran_id
(log_2pc.c:407) guards against gtrid collision (used during
in-doubt recovery to ensure a recovered gtrid doesn’t clash
with a freshly assigned one).
Phase 1 — log_2pc_commit_first_phase
The first-phase function is called via log_2pc_commit (..., FULL, &decision) or log_2pc_commit (..., PREPARE, &decision) and proceeds as follows:
- Append LOG_2PC_START record (only at the root) listing the participants.
- Send PREPARE to each participant (log_2pc_send_prepare, log_2pc.c:190).
- Append LOG_2PC_PREPARE for the local TDES (LOG_REC_2PC_PREPCOMMIT payload).
- Force-flush the log.
- Transition state to TRAN_UNACTIVE_2PC_COLLECTING_PARTICIPANT_VOTES.
- Wait for participant votes.
- If all voted YES, set *decision = true; transition to TRAN_UNACTIVE_2PC_COMMIT_DECISION. If any voted NO, set *decision = false; transition to TRAN_UNACTIVE_2PC_ABORT_DECISION.
The local prepared record carries the lock catalogue:
```c
// LOG_REC_2PC_PREPCOMMIT (src/transaction/log_record.hpp:387)
struct log_rec_2pc_prepcommit
{
  char user_name[DB_MAX_USER_LENGTH + 1];
  int gtrid;
  int gtrinfo_length;            /* length of XID payload that follows */
  unsigned int num_object_locks;
  unsigned int num_page_locks;
  /* followed by gtrinfo bytes, object-lock list, page-lock list */
};
```

The lock catalogue matters because in-doubt recovery re-acquires locks before exposing the prepared transaction; otherwise a freshly restarted server could let a concurrent transaction read or modify objects the prepared transaction holds.
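The variable-length tail of the prepared record can be located from the three counts in the header. The sketch below assumes the three regions are packed in the order the struct comment gives, with no alignment padding, and it takes the per-entry lock sizes as parameters because the document does not state them. `prep_layout` and `prep_payload_layout` are hypothetical names; offsets are relative to the end of the fixed-size header.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of where each region of the tail starts. */
struct prep_layout
{
  size_t gtrinfo_off;
  size_t object_locks_off;
  size_t page_locks_off;
  size_t total_size;
};

/* Compute region offsets for the tail: gtrinfo bytes, then the
 * object-lock list, then the page-lock list. Entry sizes are
 * caller-supplied assumptions. */
struct prep_layout
prep_payload_layout (int gtrinfo_length, unsigned int num_object_locks,
		     unsigned int num_page_locks, size_t object_lock_size,
		     size_t page_lock_size)
{
  struct prep_layout l;

  assert (gtrinfo_length >= 0);
  l.gtrinfo_off = 0;
  l.object_locks_off = (size_t) gtrinfo_length;
  l.page_locks_off = l.object_locks_off
    + (size_t) num_object_locks * object_lock_size;
  l.total_size = l.page_locks_off + (size_t) num_page_locks * page_lock_size;
  return l;
}
```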
Phase 2 — log_2pc_commit_second_phase
After phase 1 produces a decision:
- Append the decision record: LOG_2PC_COMMIT_DECISION (log_record.hpp enum value 30) or LOG_2PC_ABORT_DECISION (31).
- Force-flush.
- Transition to TRAN_UNACTIVE_*_INFORMING_PARTICIPANTS.
- Send the decision to each participant (log_2pc_send_commit_decision / _send_abort_decision, log_2pc.c:222 / 261).
- Append LOG_2PC_*_INFORM_PARTICPS (32 / 33).
- Wait for participant acks.
- As each ack arrives, append LOG_2PC_RECV_ACK (34) with the acknowledging participant’s index.
- When all acks are received, transition to TRAN_UNACTIVE_COMMITTED / _ABORTED and release locks.
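The ack bookkeeping in the last three steps can be sketched with a plain bool vector, in the spirit of the optional ack_received field. The helpers below are hypothetical (`record_ack`, `missing_acks`) and assume zero-based participant indexes; they do not reproduce CUBRID’s own handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mark the ack from participant `particp_index`.
 * Returns false when the index is out of range. */
bool
record_ack (bool *ack_received, int num_particps, int particp_index)
{
  assert (ack_received != NULL);
  if (particp_index < 0 || particp_index >= num_particps)
    {
      return false;
    }
  ack_received[particp_index] = true;
  return true;
}

/* Write the indexes of participants that have not acked into `missing`
 * (which must have room for num_particps entries) and return how many
 * there are. Zero means phase 2 can finish and locks can be released. */
int
missing_acks (const bool *ack_received, int num_particps, int *missing)
{
  int n = 0;

  assert (ack_received != NULL && missing != NULL);
  for (int i = 0; i < num_particps; i++)
    {
      if (!ack_received[i])
	{
	  missing[n++] = i;
	}
    }
  return n;
}
```

The same list of missing indexes is what a restarted coordinator would need in order to re-send the decision only to participants that have not acknowledged it.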
Prepared-state durability — log records
Seven record types form the durable 2PC trail:
| Type number | Name | Purpose |
|---|---|---|
| 28 | LOG_2PC_PREPARE | Local prepared state with lock catalogue |
| 29 | LOG_2PC_START | Coordinator’s record of participants |
| 30 | LOG_2PC_COMMIT_DECISION | Phase-2 commit decision |
| 31 | LOG_2PC_ABORT_DECISION | Phase-2 abort decision |
| 32 | LOG_2PC_COMMIT_INFORM_PARTICPS | Sent commit to participants |
| 33 | LOG_2PC_ABORT_INFORM_PARTICPS | Sent abort to participants |
| 34 | LOG_2PC_RECV_ACK | Received ack from one participant |
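Because the type numbers are contiguous, a log scanner can recognise 2PC records with a range check. The enum below copies the names and values from the table; the three helper functions are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>

/* Names and values as listed in the table above. */
enum
{
  LOG_2PC_PREPARE = 28,
  LOG_2PC_START = 29,
  LOG_2PC_COMMIT_DECISION = 30,
  LOG_2PC_ABORT_DECISION = 31,
  LOG_2PC_COMMIT_INFORM_PARTICPS = 32,
  LOG_2PC_ABORT_INFORM_PARTICPS = 33,
  LOG_2PC_RECV_ACK = 34
};

/* True for any of the seven 2PC record types. */
bool
is_2pc_record (int type)
{
  return type >= LOG_2PC_PREPARE && type <= LOG_2PC_RECV_ACK;
}

/* True only for the two phase-2 decision records. */
bool
is_2pc_decision (int type)
{
  return type == LOG_2PC_COMMIT_DECISION || type == LOG_2PC_ABORT_DECISION;
}

/* Zero-based position inside the 2PC range, e.g. for a lookup table. */
int
index_of_2pc_record (int type)
{
  assert (is_2pc_record (type));
  return type - LOG_2PC_PREPARE;
}
```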
The order on the log of a successful distributed commit:
```
LOG_2PC_START
  ...participant work records...
LOG_2PC_PREPARE                 (force flush, send prepare)
                                (collect votes)
LOG_2PC_COMMIT_DECISION         (force flush, send decision)
LOG_2PC_COMMIT_INFORM_PARTICPS
LOG_2PC_RECV_ACK                (×N)
LOG_COMMIT
```

LOG_2PC_RECV_ACK carries a LOG_REC_2PC_PARTICP_ACK { particp_index } (log_record.hpp:412): just the index into the start record’s participant block.
In-doubt recovery — the analysis-pass dance
The recovery analysis pass (cubrid-recovery-manager.md §“Analysis pass”) classifies every TRANID. For 2PC, the classification depends on what records are present:
- LOG_2PC_PREPARE but no decision record → state TRAN_UNACTIVE_2PC_PREPARE. In-doubt. The recovery must hold locks and wait for the coordinator’s decision.
- LOG_2PC_COMMIT_DECISION but LOG_2PC_*_INFORM_PARTICPS not seen for some participants → state TRAN_UNACTIVE_COMMITTED_INFORMING_PARTICIPANTS. The decision is durable; we need to re-send to the missed participants and collect acks.
- LOG_2PC_RECV_ACK for all participants → done; transition to TRAN_UNACTIVE_COMMITTED.
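The three cases above amount to a small decision function over which records the analysis pass has seen. The sketch below uses hypothetical names throughout (`seen_2pc`, `sk_state`, `classify`) and simplifies the second case to a count of acks seen. The abort arm is an assumption made by symmetry with the commit arm; the text above only spells out the commit side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the TRAN_UNACTIVE_* states named above. */
typedef enum
{
  SK_NOT_2PC,
  SK_2PC_PREPARE,		/* in-doubt */
  SK_COMMITTED_INFORMING,
  SK_ABORTED_INFORMING,		/* assumed symmetric to the commit case */
  SK_COMMITTED,
  SK_ABORTED
} sk_state;

/* What the analysis pass has seen for one transaction (hypothetical). */
struct seen_2pc
{
  bool prepare;
  bool commit_decision;
  bool abort_decision;
  int acks_seen;
  int num_particps;
};

sk_state
classify (const struct seen_2pc *s)
{
  assert (s != NULL);
  if (s->commit_decision)
    {
      return s->acks_seen >= s->num_particps
	? SK_COMMITTED : SK_COMMITTED_INFORMING;
    }
  if (s->abort_decision)
    {
      return s->acks_seen >= s->num_particps
	? SK_ABORTED : SK_ABORTED_INFORMING;
    }
  /* Prepared with no decision record: in-doubt. */
  return s->prepare ? SK_2PC_PREPARE : SK_NOT_2PC;
}
```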
log_2pc_recovery_analysis_info (log_2pc.h:95) is called
from the analysis pass for each 2PC-bearing TDES. After the
analysis pass, log_2pc_recovery (log_2pc.h:96) walks the
in-doubt set:
- For each TRAN_UNACTIVE_2PC_PREPARE: re-acquire locks (log_2pc_read_prepare reads the lock catalogue from the prepared record, and LOG_2PC_OBTAIN_LOCKS = true makes it acquire them); the transaction stays in-doubt until the coordinator (or an operator) decides.
- For each TRAN_UNACTIVE_*_INFORMING_PARTICIPANTS: resume inform-and-ack; re-send the decision to participants whose ack is missing.
The fifth recovery phase LOG_RECOVERY_FINISH_2PC_PHASE
(declared in log_impl.h:631) is the named slot for this work,
even though the current log_recovery driver in
cubrid-recovery-manager.md doesn’t call it — open question §4
in this doc.
XA bridge — external transaction managers
The XA APIs (xa_prepare, xa_commit, xa_rollback, xa_recover) flow through tran_2pc_* on the client (transaction_cl.h) into xtran_2pc_* on the server. The key entry points:
- tran_2pc_start → log_2pc_start (log_2pc.c:833): generate a gtrid, install on TDES.
- tran_2pc_prepare → log_2pc_prepare (log_2pc.c:877): run phase 1 with LOG_2PC_EXECUTE_FULL if the local server is the root, else LOG_2PC_EXECUTE_PREPARE.
- tran_2pc_recovery_prepared → log_2pc_recovery_prepared (log_2pc.c:915): xa_recover equivalent; return the list of in-doubt gtrids the TM should resolve.
- tran_2pc_attach_global_tran → log_2pc_attach_global_tran (log_2pc.c:1036): xa_start resume; attach to an existing gtrid (used after a connection-failover or thread switch in a thread-per-request server).
- tran_2pc_prepare_global_tran → log_2pc_prepare_global_tran (log_2pc.c:1126): drive prepare on a previously attached gtrid.
log_2pc_find_tran_descriptor (log_2pc.c:952) is the
gtrid → TDES lookup used by every attach-style call.
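A gtrid → TDES lookup of this kind can be as simple as a scan over the transaction table. The sketch below is an assumption about the shape of such a lookup, not the body of log_2pc_find_tran_descriptor: `tdes_sketch`, `SK_NULL_GTRID`, and `find_by_gtrid` are hypothetical names, and the sentinel value -1 is the one the document gives for LOG_2PC_NULL_GTRID.

```c
#include <assert.h>
#include <stddef.h>

/* Same value the document lists for LOG_2PC_NULL_GTRID. */
#define SK_NULL_GTRID (-1)

/* Hypothetical, heavily reduced transaction descriptor. */
struct tdes_sketch
{
  int trid;			/* local id, recycled after termination */
  int gtrid;			/* global id, SK_NULL_GTRID if not distributed */
};

/* Linear scan for the descriptor that carries `gtrid`.
 * Returns NULL when nothing matches or when asked for the sentinel. */
struct tdes_sketch *
find_by_gtrid (struct tdes_sketch *table, size_t n, int gtrid)
{
  assert (table != NULL || n == 0);
  if (gtrid == SK_NULL_GTRID)
    {
      return NULL;
    }
  for (size_t i = 0; i < n; i++)
    {
      if (table[i].gtrid == gtrid)
	{
	  return &table[i];
	}
    }
  return NULL;
}
```

Rejecting the sentinel up front keeps non-distributed transactions, which all share the null gtrid, from ever matching an attach request.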
One distributed commit, end to end
```mermaid
sequenceDiagram
  participant TM as Transaction Manager (XA)
  participant CO as Coordinator (CUBRID server)
  participant LM as log_manager
  participant P1 as Participant 1
  participant P2 as Participant 2
  TM->>CO: xa_start (gtrid)
  CO->>CO: log_2pc_start: assign gtrid, install on TDES
  Note over CO: ...transaction work happens...
  TM->>CO: xa_prepare
  CO->>LM: append LOG_2PC_START (participant block)
  CO->>P1: send PREPARE
  CO->>P2: send PREPARE
  CO->>LM: append LOG_2PC_PREPARE (local lock catalogue)
  CO->>LM: force flush
  CO->>CO: state = TRAN_UNACTIVE_2PC_COLLECTING_PARTICIPANT_VOTES
  P1-->>CO: vote YES
  P2-->>CO: vote YES
  CO->>CO: state = TRAN_UNACTIVE_2PC_COMMIT_DECISION
  TM->>CO: xa_commit
  CO->>LM: append LOG_2PC_COMMIT_DECISION
  CO->>LM: force flush
  CO->>P1: send COMMIT
  CO->>P2: send COMMIT
  CO->>LM: append LOG_2PC_COMMIT_INFORM_PARTICPS
  P1-->>CO: ack
  CO->>LM: append LOG_2PC_RECV_ACK (idx=1)
  P2-->>CO: ack
  CO->>LM: append LOG_2PC_RECV_ACK (idx=2)
  CO->>LM: append LOG_COMMIT
  CO->>CO: state = TRAN_UNACTIVE_COMMITTED
  CO->>CO: release locks
```
Source Walkthrough
Anchor on symbol names, not line numbers.
Header types and constants
- LOG_2PC_NULL_GTRID (log_2pc.h): sentinel for “no gtrid”.
- LOG_2PC_OBTAIN_LOCKS / LOG_2PC_DONT_OBTAIN_LOCKS (log_2pc.h): flags for log_2pc_read_prepare.
- LOG_2PC_EXECUTE enum (log_2pc.h): role dispatch.
- LOG_2PC_GTRINFO (log_2pc.h): XA payload wrapper.
- LOG_2PC_COORDINATOR (log_2pc.h): coordinator state on TDES.
- LOG_REC_2PC_PREPCOMMIT (log_record.hpp): prepared record payload.
- LOG_REC_2PC_START (log_record.hpp): start record payload.
- LOG_REC_2PC_PARTICP_ACK (log_record.hpp): ack payload.
Coordinator path
- log_2pc_start (log_2pc.c): assign gtrid.
- log_2pc_make_global_tran_id (log_2pc.c): gtrid generator.
- log_2pc_check_duplicate_global_tran_id (log_2pc.c): recovery-time guard.
- log_2pc_send_prepare (log_2pc.c): phase-1 send.
- log_2pc_send_commit_decision / log_2pc_send_abort_decision (log_2pc.c): phase-2 send.
- log_2pc_alloc_coord_info (log_2pc.h): attach LOG_2PC_COORDINATOR to TDES.
- log_2pc_free_coord_info (log_2pc.h): release.
Phase orchestration
- log_2pc_commit_first_phase (log_2pc.c).
- log_2pc_commit_second_phase (log_2pc.c).
- log_2pc_commit (log_2pc.c): top-level dispatcher.
- log_2pc_prepare (log_2pc.c): XA prepare entry.
- log_2pc_append_start (log_2pc.c).
- log_2pc_append_decision (log_2pc.c).
Recovery
- log_2pc_recovery_analysis_info (log_2pc.h): per-TDES classification during analysis pass.
- log_2pc_recovery (log_2pc.h): post-analysis driver for in-doubt and informing-participants TDES.
- log_2pc_read_prepare (log_2pc.h): read prepared record; optionally reacquire locks.
XA / global-tran helpers
- log_2pc_set_global_tran_info / log_2pc_get_global_tran_info (log_2pc.c).
- log_2pc_recovery_prepared (log_2pc.c): xa_recover equivalent.
- log_2pc_find_tran_descriptor (log_2pc.c).
- log_2pc_attach_client (log_2pc.c): bind a client to a TDES.
- log_2pc_attach_global_tran (log_2pc.c): xa_start resume.
- log_2pc_prepare_global_tran (log_2pc.c).
Helpers
- log_2pc_get_num_participants (log_2pc.c).
- log_2pc_dump_participants / log_2pc_dump_gtrinfo / log_2pc_dump_acqobj_locks (log_2pc.c): debug dumps.
- log_2pc_is_tran_distributed (log_2pc.h): bool query.
- log_2pc_clear_and_is_tran_distributed (log_2pc.h).
Position hints as of 2026-04-30
| Symbol | File | Line |
|---|---|---|
| LOG_2PC_EXECUTE enum | log_2pc.h | 45 |
| LOG_2PC_GTRINFO (struct) | log_2pc.h | 58 |
| LOG_2PC_COORDINATOR (struct) | log_2pc.h | 65 |
| log_2pc_get_num_participants | log_2pc.c | 132 |
| log_2pc_dump_participants | log_2pc.c | 162 |
| log_2pc_send_prepare | log_2pc.c | 190 |
| log_2pc_send_commit_decision | log_2pc.c | 222 |
| log_2pc_send_abort_decision | log_2pc.c | 261 |
| log_2pc_make_global_tran_id | log_2pc.c | 323 |
| log_2pc_check_duplicate_global_tran_id | log_2pc.c | 407 |
| log_2pc_commit_first_phase | log_2pc.c | 437 |
| log_2pc_commit_second_phase | log_2pc.c | 503 |
| log_2pc_commit | log_2pc.c | 632 |
| log_2pc_set_global_tran_info | log_2pc.c | 705 |
| log_2pc_get_global_tran_info | log_2pc.c | 772 |
| log_2pc_start | log_2pc.c | 833 |
| log_2pc_prepare | log_2pc.c | 877 |
| log_2pc_recovery_prepared | log_2pc.c | 915 |
| log_2pc_find_tran_descriptor | log_2pc.c | 952 |
| log_2pc_attach_client | log_2pc.c | 984 |
| log_2pc_attach_global_tran | log_2pc.c | 1036 |
| log_2pc_prepare_global_tran | log_2pc.c | 1126 |
| log_2pc_read_prepare (LSA variant) | log_2pc.c | 1313 |
| log_2pc_read_prepare (reader variant) | log_2pc.c | 1389 |
| log_2pc_dump_gtrinfo | log_2pc.c | 1476 |
| log_2pc_dump_acqobj_locks | log_2pc.c | 1491 |
| log_2pc_append_start | log_2pc.c | 1513 |
| log_2pc_append_decision | log_2pc.c | 1570 |
Source verification (as of 2026-04-30)
Verified facts
1. The LOG_2PC_EXECUTE enum has four values, three of them for non-root coordinators. Verified at log_2pc.h:45. FULL is the root path; the other three correspond to a middle-of-tree node that is both a participant from above and a coordinator below.
2. Coordinator info is attached to the TDES, not stored separately. Verified at log_impl.h:506 (LOG_TDES::coord of type LOG_2PC_COORDINATOR *) plus log_2pc.h:65. The pointer is NULL when this site is not the coordinator; non-NULL when it owns the participant block.
3. Per-participant ack tracking is #ifdef LOG_2PC_ACK_RECV_REQUIRED. Verified at log_2pc.h:70. The macro is presumably defined on builds that need conservative ack tracking; the alternative is to skip per-participant acks and rely on the LOG_2PC_*_INFORM_PARTICPS records’ sequencing.
4. Seven log record types form the 2PC durable trail (28-34). Verified at log_record.hpp:99-107: LOG_2PC_PREPARE (28), LOG_2PC_START (29), LOG_2PC_COMMIT_DECISION (30), LOG_2PC_ABORT_DECISION (31), LOG_2PC_COMMIT_INFORM_PARTICPS (32), LOG_2PC_ABORT_INFORM_PARTICPS (33), LOG_2PC_RECV_ACK (34). The values are stable: old archived logs use the same numbers as current ones.
5. The prepared record carries the full lock catalogue. Verified at log_record.hpp:387-396 (LOG_REC_2PC_PREPCOMMIT::num_object_locks and num_page_locks). After the fixed-size header, the record carries the gtrinfo bytes followed by the lock list. This is what log_2pc_read_prepare reads at recovery time to reacquire locks.
6. There are two overloads of log_2pc_read_prepare. Verified at log_2pc.h:88-90: one takes a LOG_LSA * + LOG_PAGE *, the other takes a log_reader &. The two exist for compatibility: older code paths use the explicit LSA, newer code paths use the log_reader class (cubrid-recovery-manager.md).
7. In-doubt recovery is a separate phase named LOG_RECOVERY_FINISH_2PC_PHASE. Verified at log_impl.h:631. The phase is named in the enum but is not called from the log_recovery body sketched in cubrid-recovery-manager.md; it is invoked from the analysis / undo passes via log_2pc_recovery (open question §4).
8. gtrid is an int, not an opaque XID. Verified at log_impl.h:499 (LOG_TDES::gtrid is int) and log_2pc.h:41 (LOG_2PC_NULL_GTRID = -1). The XA-style XID payload travels through LOG_2PC_GTRINFO::info_data separately.
9. log_2pc_recovery_prepared is the xa_recover equivalent. Verified by signature (int gtrids[], int size) and name. Returns a list of currently in-doubt gtrids the external TM should resolve.
10. log_2pc_attach_global_tran resumes a transaction by gtrid. Verified at log_2pc.c:1036. Used by the XA xa_start resume path when a previously suspended transaction is being re-attached, possibly on a different thread.
11. Lock acquisition during prepare is controlled by a flag. Verified at log_2pc.h:42-43: LOG_2PC_OBTAIN_LOCKS = true / LOG_2PC_DONT_OBTAIN_LOCKS = false. The flag is passed to log_2pc_read_prepare. False is for diagnostic dumping of a prepared record; true is for actual recovery use.
Open questions
1. Heuristic abort / heuristic commit handling. XA defines xa_forget for resolved heuristic decisions. CUBRID’s API surface (tran_2pc_*) does not obviously expose the heuristic-decision record type. Investigation path: search tran_2pc_* and xtran_2pc_* for a forget call.
2. Presumed-abort optimisation. The standard “no abort log record on coordinator timeout” pattern: does CUBRID implement it? log_2pc_send_abort_decision appends a record before sending; whether this is force-flushed or skipped on coordinator timeout was not traced. Investigation path: read log_2pc_send_abort_decision body.
3. Multi-level coordination tree. LOG_2PC_EXECUTE_PREPARE handles “I am a participant and coordinator below me”. How does the protocol handle 3+ levels? Are votes propagated serially? Investigation path: log_2pc_commit_first_phase for LOG_2PC_EXECUTE_PREPARE arm.
4. LOG_RECOVERY_FINISH_2PC_PHASE invocation. The phase is named in log_impl.h:631 but the log_recovery driver in cubrid-recovery-manager.md does not call into it explicitly. Where exactly does log_2pc_recovery get invoked? Investigation path: grep for log_2pc_recovery callers.
5. Coordinator-down-during-decision recovery. If the root coordinator crashes after LOG_2PC_COMMIT_DECISION but before LOG_2PC_COMMIT_INFORM_PARTICPS, the participants are in-doubt and the coordinator’s restart must re-send. The state TRAN_UNACTIVE_COMMITTED_INFORMING_PARTICIPANTS captures this, but the re-send timing (how often, how long) was not traced. Investigation path: log_2pc_recovery body.
6. gtrid space exhaustion. gtrid is an int (~2 billion). Recycling vs. exhaustion behaviour wasn’t traced. Investigation path: log_2pc_make_global_tran_id and log_2pc_check_duplicate_global_tran_id.
Beyond CUBRID — Comparative Designs & Research Frontiers
Pointers, not analysis.
- Paxos commit (Gray & Lamport, 2006): replaces the blocking 2PC with a non-blocking protocol via Paxos consensus among coordinators. CUBRID’s LOG_2PC_* is classical 2PC; a Paxos-commit follow-up doc would document what CUBRID gives up by not running multiple coordinators.
- Spanner’s 2PC over Paxos groups (Corbett et al., OSDI 2012): globally-distributed 2PC where each participant is itself a Paxos group. The protocol is the same, but the participant side is replicated. Out of scope for CUBRID, but a useful contrast for the failure model.
- Presumed-abort and presumed-commit optimisations: ARIES/PA, ARIES/PC. Reduce log volume in the common case. CUBRID’s discipline appears to be “log everything”; an audit of whether the optimisation could apply would be a good follow-up.
- JTA/XA (X/Open CAE Spec C193, 1991): the canonical resource-manager-to-transaction-manager contract. CUBRID supports it through the C XA library and the JDBC driver’s XADataSource.
- Spanner’s TrueTime + commit wait: uses bounded clock uncertainty to externalise serializability. CUBRID’s 2PC has no clock-based ordering; cross-server reads can see inconsistent times.
- eXtended Architecture for distributed transactions (D-XA, P-XA): extensions for parallel and pipelined 2PC. Modern CUBRID could in principle pipeline phase 1 across participants more aggressively.
Sources
Raw analyses (raw/code-analysis/cubrid/storage/transaction/)
- Transaction Internals.pdf
- Transaction Internals.pptx: the 2PC chapters; the document is shared with cubrid-transaction.md, with scope-decisions in .meta/cubrid-2pc.yaml documenting the split.
Sibling docs
- knowledge/code-analysis/cubrid/cubrid-transaction.md (parent): TDES, isolation, savepoints. The lifecycle states TRAN_UNACTIVE_2PC_* are listed there in full.
- knowledge/code-analysis/cubrid/cubrid-log-manager.md: the seven 2PC log record types’ on-disk format.
- knowledge/code-analysis/cubrid/cubrid-recovery-manager.md: the analysis pass that classifies in-doubt and informing TDES.
- knowledge/code-analysis/cubrid/cubrid-lock-manager.md: the lock manager whose lock catalogue the prepared record serialises.
Textbook chapters (under knowledge/research/dbms-general/)
- Database Internals (Petrov), Ch. 13 “Distributed Transactions”: 2PC, Paxos commit, presumed abort/commit.
- Gray, Notes on Database Operating Systems, 1978 — the original 2PC protocol description.
- Concurrency Control and Recovery in Database Systems (Bernstein et al.), Ch. 7 “Distributed Recovery”.
CUBRID source (/data/hgryoo/references/cubrid/)
- src/transaction/log_2pc.{c,h}
- src/transaction/log_record.hpp: LOG_REC_2PC_* payload structs.
- src/transaction/log_recovery.c: analysis-pass classification of 2PC records.
- src/transaction/log_tran_table.c: TDES allocation (gtrid lives on the TDES).
- src/transaction/transaction_{cl,sr}.{h,c}: the public tran_2pc_* / xtran_2pc_* API.