CUBRID Replication & HA — Section Overview
What this section covers
CUBRID’s high-availability story is a primary/standby cluster
with asynchronous logical replication and a local-liveness leader
election. Each node runs a cub_master process that gossips
UDP heartbeats with its peers and reaches an independent verdict
about who the master is — there is no Raft, no Paxos, no quorum
round trip. On top of that liveness substrate the master engine
emits auxiliary LOG_REPLICATION_DATA / LOG_REPLICATION_STATEMENT
records into the same WAL stream that drives crash recovery, a
copylogdb daemon ships those volumes to the slave host, and an
applylogdb worker (la_apply_log_file) walks them forward and
re-executes the row events through the storage layer. The same
WAL is the input to a second consumer — the modern cdc_* API,
which streams LOG_SUPPLEMENTAL_INFO records to external
subscribers (Kafka, search indexes, audit pipelines). Three
detail docs in this subcategory unpack the three layers; this
overview is the map.
The HA stack
The replication-ha subcategory is best read as three stacked layers, each with its own detail doc, each consuming the output of the layer below it. The shared invariant is that the WAL is the only durable cross-node contract: it drives crash recovery on a single node, it drives slave catch-up on a peer node, and it drives external CDC consumers — all from the same on-disk byte stream, just with different downstream readers.
- Liveness — cubrid-heartbeat.md. The `cub_master` process on each node pushes a UDP heartbeat packet to every peer it knows about and updates its local view of each peer’s `(state, priority)` score from the packets it receives. Failure detection is two-signal (gap counter for symmetric loss, last-heard timestamp for asymmetric loss); leader election is local only (each node decides independently from its own peer table; there is no global agreement step). A small job-queue FSM drives four worker threads through the `slave → to-be-master → master` and `master → slave` transitions that implement failover, failback, and the resource-failure demote. The split-brain guards live here too: `ha_ping_hosts` (an external reachability witness) and the `is_isolated` predicate (don’t promote if every non-replica peer is `HB_NSTATE_UNKNOWN`).
- Replication — cubrid-ha-replication.md. Once liveness has named a master and a slave, the actual data flow is a logical-log pipeline running parallel to the physiological WAL. At DML time the master emits two record families into the same log file it already writes for crash recovery: physiological undo/redo records (page-level, byte-level, what the recovery manager replays) and auxiliary replication records (`LOG_REPLICATION_DATA` for row events, `LOG_REPLICATION_STATEMENT` for DDL and trigger-bound statements). The `copylogdb` daemon on the slave host opens a TCP connection to the master and ships log volumes byte-for-byte into the slave’s local archive directory. A separate `applylogdb` daemon (`la_apply_log_file` in `log_applier.c`) tails those archives forward, parses each replication record, and dispatches per-record-type into the storage layer (`heap_*`, `btree_*`, `locator_*`) for serialised replay. The slave is equivalent to the master under the same query workload, but it is not byte-identical: page layouts, free-space distribution, and B-tree split shape may diverge.
- Change capture — cubrid-cdc.md. The same WAL feeds a second consumer. The modern `cdc_*` API (`src/api/cubrid_log.c`) is a pull-style request/response interface: a downstream consumer asks “give me the next batch starting at LSA X” and the server walks `log_reader` forward through `LOG_SUPPLEMENTAL_INFO` records — explicit logical records the engine emits at DML time, intentionally rich enough that consumers don’t need a catalog lookup. The legacy `la_*` log-applier path (the same code that drives HA replication’s `applylogdb`) coexists with the new API; both read the same on-disk log, just with different cursor and framing semantics. DDL travels inline as `LOG_SUPPLEMENT_DDL`, so a CDC consumer can maintain its own schema cache without ever calling back into the catalog.
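The two failure signals in the liveness bullet can be condensed into a few lines of C. This is a minimal illustration, not CUBRID code: the struct and function names below are invented, and the real per-peer bookkeeping lives in `HB_NODE_ENTRY` in `master_heartbeat.c`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-peer bookkeeping (the real fields differ in naming). */
typedef struct {
    int     missed_in_a_row;  /* gap counter: consecutive send intervals with no reply */
    int64_t last_heard_ms;    /* wall-clock time of the last packet received */
} peer_view;

/* Signal 1: symmetric loss -- we sent N heartbeats and heard nothing back. */
static bool gap_exceeded(const peer_view *p, int max_gap)
{
    return p->missed_in_a_row >= max_gap;
}

/* Signal 2: asymmetric loss -- our sends may be arriving, but nothing has
 * come back for longer than the staleness window. */
static bool stale(const peer_view *p, int64_t now_ms, int64_t window_ms)
{
    return (now_ms - p->last_heard_ms) > window_ms;
}

/* A peer is suspected dead when either signal fires; the two together
 * catch both "the link is down" and "only the return path is down". */
bool peer_suspect(const peer_view *p, int64_t now_ms, int max_gap,
                  int64_t window_ms)
{
    return gap_exceeded(p, max_gap) || stale(p, now_ms, window_ms);
}
```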
```mermaid
flowchart TB
subgraph Master["Master node — cub_master"]
MWAL[("WAL (physiological + LOG_REPL + LOG_SUPPL)")]
MHB[heartbeat sender]
MWAL --> SHIP[copylogdb sender]
end
subgraph Slave["Slave node — cub_master"]
SHB[heartbeat receiver]
SCOPY[copylogdb receiver]
SAPPLY[applylogdb / la_apply_log_file]
SCOPY --> SARCH[(local log archive)]
SARCH --> SAPPLY
SAPPLY --> SSTORAGE[(storage: heap, btree, locator)]
end
subgraph Peer["Replica / witness — cub_master"]
PHB[heartbeat peer]
end
subgraph CDC["External CDC consumers"]
CDCAPI[cdc_make_loginfo via cubrid_log.c]
end
MHB <-. UDP heartbeat .-> SHB
MHB <-. UDP heartbeat .-> PHB
SHB <-. UDP heartbeat .-> PHB
SHIP -- TCP log shipping --> SCOPY
MWAL -- pull by LSA cursor --> CDCAPI
classDef liveness fill:#fde,stroke:#a36
classDef repl fill:#dfe,stroke:#3a6
classDef capture fill:#def,stroke:#36a
class MHB,SHB,PHB liveness
class SHIP,SCOPY,SAPPLY,SARCH,SSTORAGE repl
class CDCAPI capture
```
The diagram makes the stacking explicit. The pink layer
(heartbeat) decides who is master. The green layer
(replication) ships the master’s WAL to the slave and replays it.
The blue layer (CDC) lets external systems read the same WAL on a
different cursor. Nothing in the green layer or the blue layer
talks to the pink layer directly — they discover the master’s
identity through the ordinary client-routing path (the broker
reroutes connections after failover; see
cubrid-architecture-overview.md and cubrid-broker.md).
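The role transitions the pink layer drives can be condensed into a toy state machine: failover walks slave → to-be-master → master, while failback and the resource-failure demote walk master → slave. The state and event names below are illustrative; the real implementation is the four-thread job-queue FSM in `master_heartbeat.c`.

```c
/* Toy role FSM. Names are invented for illustration; the real machine
 * runs as queued jobs processed by worker threads, not a single function. */
enum ha_role { ROLE_SLAVE, ROLE_TO_BE_MASTER, ROLE_MASTER };

enum ha_event {
    EV_MASTER_DEAD,   /* liveness layer stopped seeing the master */
    EV_PROMOTE_DONE,  /* promotion work (log catch-up, role switch) finished */
    EV_DEMOTE         /* failback or resource-failure demote */
};

/* Return the next role, or the current one if the event does not apply. */
enum ha_role step(enum ha_role r, enum ha_event e)
{
    switch (r) {
    case ROLE_SLAVE:
        return e == EV_MASTER_DEAD ? ROLE_TO_BE_MASTER : r;
    case ROLE_TO_BE_MASTER:
        return e == EV_PROMOTE_DONE ? ROLE_MASTER : r;
    case ROLE_MASTER:
        return e == EV_DEMOTE ? ROLE_SLAVE : r;
    }
    return r;
}
```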
Reading order
The detail docs make the most sense bottom-up, because each layer’s invariants are inputs to the layer above it.
- cubrid-heartbeat.md first. Read this to see how a node decides “I am master” or “I am slave” from purely local information. The `(state, priority)` scoring, the gap-counter / last-heard dual signals, the `ha_ping_hosts` split-brain guard, and the four-thread FSM that drives state transitions are the vocabulary the other two docs use without re-deriving. The doc also frames CUBRID’s design choice — local decision instead of consensus — against Raft / ZAB and names the trade-off (cheaper, no quorum round-trip; inherits asymmetric-partition risk that consensus systems sidestep).
- cubrid-ha-replication.md second. Read this once you know who the master is, because this doc is the actual data flow. The three-axis framing (physical vs. logical; statement vs. row; sync vs. async) places CUBRID’s choice — asynchronous, logical, hybrid statement+row — in the replication design space alongside MySQL row-based binlog, PostgreSQL logical decoding, and Oracle GoldenGate. The walkthrough names every symbol on the producer side (`LOG_REPLICATION_DATA` emission inside `heap_*` and `btree_*`), the shipping side (`copylogdb`), and the consumer side (`la_apply_log_file` and its per-record-type dispatch).
- cubrid-cdc.md last. Once you’ve seen the full HA pipeline, CDC is the same WAL with a different cursor and a different framing. The doc contrasts the modern pull-style `cdc_*` API against the legacy push-style `la_*` daemon, explains why both coexist (incremental migration), and walks through the `LOG_SUPPLEMENTAL_INFO` record family — including the `SUPPLEMENT_REC_TYPE` enum, transaction grouping at commit boundaries via `tran_user`, and the `cdc_min_log_pageid_to_keep` watermark that stops the archive remover from deleting log volumes a consumer still needs.
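The pull-style cursor discipline that distinguishes the modern CDC path can be sketched as a loop. Everything below is a stand-in: the real request/response types live in `src/api/cubrid_log.c` and differ in shape. The point being illustrated is that the consumer owns the LSA and the server answers one batch per request, stateless between calls.

```c
#include <stdint.h>

/* Hypothetical shapes standing in for the cdc_* request/response cycle. */
typedef struct { int64_t pageid; int16_t offset; } lsa;

typedef struct {
    lsa next;        /* cursor the consumer must pass on the next request */
    int n_records;   /* how many log-info entries this batch carried */
} batch;

/* Stand-in for the server side: walk forward from `cursor`, frame a batch,
 * and report where the next request should start. Deterministic dummy here. */
batch fetch_batch(lsa cursor)
{
    batch b;
    b.n_records = 4;                   /* pretend we decoded 4 records */
    b.next.pageid = cursor.pageid + 1; /* pretend they ended on the next page */
    b.next.offset = 0;
    return b;
}

/* The consumer loop: carry the cursor, ask for the next batch, repeat.
 * Returns the total number of records consumed over `rounds` requests. */
int64_t drain(lsa start, int rounds)
{
    int64_t total = 0;
    lsa cur = start;
    for (int i = 0; i < rounds; i++) {
        batch b = fetch_batch(cur);
        total += b.n_records;
        cur = b.next;  /* only the consumer advances the cursor */
    }
    return total;
}
```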
If you only have time for one doc, read cubrid-heartbeat.md.
The local-liveness decision is the single design choice that
distinguishes CUBRID’s HA story from textbook descriptions, and
the rest of the section is comprehensible in outline once the
liveness model is in your head.
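A toy version of the local election makes that design choice concrete. The names and the exact comparison below are invented (the real scoring is `node->score` in `master_heartbeat.c`), but the essential property holds: the verdict is computed from the local peer table alone, with no message exchange and no quorum.

```c
#include <stdbool.h>

enum node_state { ST_UNKNOWN = 0, ST_SLAVE = 1, ST_MASTER = 2 };

/* One row of the local peer table (illustrative layout). */
typedef struct {
    int  id;
    enum node_state state;
    int  priority;   /* lower value = preferred, as with ha_node_list order */
    bool alive;
} node_view;

/* Return the id of the node this local table says should be master, or -1.
 * Prefer a peer already in the master state, then the lowest priority value.
 * No messages are exchanged: the verdict is purely local. */
int elect_local(const node_view *peers, int n)
{
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (!peers[i].alive || peers[i].state == ST_UNKNOWN)
            continue;
        if (best < 0
            || peers[i].state > peers[best].state
            || (peers[i].state == peers[best].state
                && peers[i].priority < peers[best].priority))
            best = i;
    }
    return best < 0 ? -1 : peers[best].id;
}
```

Because each node runs this against its own table, two nodes with different tables can reach different verdicts, which is exactly why the split-brain guards described below exist.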
Cross-cutting concerns
A handful of invariants thread through all three docs. They are the reason this is one section instead of three loosely related ones.
One log, two consumers
Both replication and CDC piggy-back on the same on-disk WAL —
the same byte stream the recovery manager reads on crash restart
(see cubrid-recovery-manager.md and cubrid-log-manager.md).
HA replication consumes LOG_REPLICATION_DATA and
LOG_REPLICATION_STATEMENT; CDC consumes
LOG_SUPPLEMENTAL_INFO. The records are interleaved with the
physiological undo/redo records the recovery manager needs;
nothing in the producer pipeline knows or cares which downstream
consumer will read them. This is why the detail docs share a
“Source Walkthrough” symbol set with the log-manager and
recovery-manager docs in the txn-recovery section: the producer
side is not separable from the WAL it lives in.
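The interleaving can be illustrated with a toy scanner that walks one stream and routes each record family to its consumer. The enum tags mirror the real record families named above; the struct and counters are invented for illustration.

```c
enum rec_type {
    REC_UNDOREDO,          /* physiological -- only crash recovery replays these */
    REC_REPLICATION_DATA,  /* LOG_REPLICATION_DATA -- HA applier */
    REC_REPLICATION_STMT,  /* LOG_REPLICATION_STATEMENT -- HA applier */
    REC_SUPPLEMENTAL_INFO  /* LOG_SUPPLEMENTAL_INFO -- CDC */
};

typedef struct { int ha_seen, cdc_seen, recovery_seen; } tallies;

/* One pass over one interleaved stream; each consumer picks out only the
 * record types it understands and skips the rest. */
tallies scan(const enum rec_type *log, int n)
{
    tallies t = { 0, 0, 0 };
    for (int i = 0; i < n; i++) {
        switch (log[i]) {
        case REC_REPLICATION_DATA:
        case REC_REPLICATION_STMT:
            t.ha_seen++;        /* applylogdb replays these */
            break;
        case REC_SUPPLEMENTAL_INFO:
            t.cdc_seen++;       /* cdc_* API frames these for subscribers */
            break;
        case REC_UNDOREDO:
            t.recovery_seen++;  /* recovery manager territory */
            break;
        }
    }
    return t;
}
```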
Local-liveness implies a split-brain guard
Because each cub_master reaches its master/slave verdict
independently from its own peer table, a network partition
that hides the true master from a slave will, with timeout-only
logic, cause the slave to unilaterally promote and create
split-brain. The heartbeat doc names two guards built into the
liveness layer to avoid that:
- `ha_ping_hosts` — an explicit list of external addresses (typically gateways or sibling-cluster sentinels) that the local node must be able to reach before it trusts its own promotion decision. If the witness is unreachable, the slave refuses to promote even though every peer looks dead from its vantage.
- The `is_isolated` predicate — the local node is isolated if every non-replica peer is in `HB_NSTATE_UNKNOWN`. An isolated node will not promote: by definition it cannot distinguish “everyone died” from “I am the one who got partitioned away”.
These are the parts of the heartbeat layer that the replication and CDC layers tacitly rely on — neither doc re-derives the split-brain story, but both assume the master they observe is unique.
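A minimal sketch of the two guards, assuming invented struct and helper names (the real predicate and the ping-host check live in `master_heartbeat.c`):

```c
#include <stdbool.h>

enum hb_nstate { NSTATE_UNKNOWN, NSTATE_SLAVE, NSTATE_MASTER };

/* One row of the local peer table (illustrative layout). */
typedef struct { enum hb_nstate state; bool is_replica; } peer;

/* Isolated: every non-replica peer reads UNKNOWN from here, so "everyone
 * died" and "I got partitioned away" are indistinguishable. */
bool is_isolated(const peer *peers, int n)
{
    for (int i = 0; i < n; i++) {
        if (peers[i].is_replica)
            continue;                     /* replicas don't count as witnesses */
        if (peers[i].state != NSTATE_UNKNOWN)
            return false;                 /* at least one real peer is visible */
    }
    return true;
}

/* Promotion gate: require both a reachable external witness (ha_ping_hosts)
 * and a non-isolated local view before trusting the local verdict. */
bool may_promote(bool witness_reachable, const peer *peers, int n)
{
    return witness_reachable && !is_isolated(peers, n);
}
```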
Failover dovetails with backup-restore
Failover and failback move the role label, but they do not
themselves perform media recovery. If a slave’s local archive is
torn (e.g., the node was offline long enough that the master’s
archive remover deleted log volumes the slave still needed), the
HA replication layer can no longer catch up by tailing log
files. The recovery path then has to fall back to a full
restore from a fresh backup taken at the master, after which
log shipping resumes from the post-restore LSA. That hand-off
lives in the txn-recovery section — see
cubrid-backup-restore.md for the restore mechanics and
cubrid-checkpoint.md for how restore points relate to the LSA
cursor that applylogdb carries. The replication doc names the
hand-off but does not duplicate the restore mechanics; the CDC
doc shares the same archive-retention pressure, governed by
cdc_min_log_pageid_to_keep.
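That shared archive-retention pressure reduces to a floor computation: the remover may only delete a volume once every consumer has moved past it. A sketch, in which every name except the `cdc_min_log_pageid_to_keep` parameter itself is invented:

```c
#include <stdbool.h>
#include <stdint.h>

/* The lowest page any consumer may still need: the minimum over all
 * consumer cursors and the CDC keep-watermark. */
int64_t retention_floor(const int64_t *consumer_pageids, int n,
                        int64_t cdc_min_log_pageid_to_keep)
{
    int64_t floor = cdc_min_log_pageid_to_keep;
    for (int i = 0; i < n; i++)
        if (consumer_pageids[i] < floor)
            floor = consumer_pageids[i];
    return floor;
}

/* A volume whose last page sits strictly below the floor is safe to remove;
 * deleting anything at or above it would strand a consumer, forcing the
 * full-restore fallback described above. */
bool can_remove_volume(int64_t volume_last_pageid, int64_t floor)
{
    return volume_last_pageid < floor;
}
```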
Logical replay is not idempotent at the byte level
Both applylogdb and the legacy la_* log applier replay
logical row events through the storage layer rather than
memcpy-ing pages. That has two consequences worth keeping in
mind when reading the detail docs. First, the slave’s pages can
diverge from the master’s in their physical layout — same rows,
different free-space distribution, different B-tree split
points. Second, replay is not crash-idempotent in the trivial
“reapply the same redo record” sense the recovery manager uses;
the applier carries a durable last_committed_lsa cursor and,
on restart, resumes by skipping every record at or below it.
Both detail docs name the cursor; neither reproduces the full
recovery-manager redo discipline (see cubrid-recovery-manager.md
for that).
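The cursor discipline can be sketched as a resume loop: skip everything at or below the durable cursor, apply and advance past it. The LSA comparison mirrors the (pageid, offset) pair ordering; the apply step is a stand-in comment, and all names here are illustrative rather than copied from `log_applier.c`.

```c
#include <stdint.h>

typedef struct { int64_t pageid; int16_t offset; } lsa;

/* Total order on LSAs: page first, then offset within the page. */
static int lsa_cmp(lsa a, lsa b)
{
    if (a.pageid != b.pageid) return a.pageid < b.pageid ? -1 : 1;
    if (a.offset != b.offset) return a.offset < b.offset ? -1 : 1;
    return 0;
}

/* Walk a batch of record LSAs after a crash: apply only those strictly
 * beyond the durable cursor, advancing the cursor as we go. Returns how
 * many records were actually applied (re-applied records would corrupt
 * the slave, since logical replay is not redo-idempotent). */
int resume_apply(const lsa *records, int n, lsa *last_committed_lsa)
{
    int applied = 0;
    for (int i = 0; i < n; i++) {
        if (lsa_cmp(records[i], *last_committed_lsa) <= 0)
            continue;  /* already applied before the crash */
        /* ... re-execute the row event through the storage layer ... */
        *last_committed_lsa = records[i];
        applied++;
    }
    return applied;
}
```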
Detail-doc summaries
| Detail doc | Layer | Primary symbols / source files | One-line summary |
|---|---|---|---|
| cubrid-heartbeat.md | Liveness | master_heartbeat.c/h, connection/heartbeat.c/h, util_service.c, commdb.c; HB_NODE_ENTRY, HB_NSTATE_*, node->score, four-thread job-queue FSM | UDP heartbeat between cub_master peers; per-peer scoring with priority+state; gap-counter and last-heard dual staleness signals; local master election (no consensus); ha_ping_hosts and is_isolated as split-brain guards; FSM drives slave→to-be-master→master and master→slave for failover/failback/demote. |
| cubrid-ha-replication.md | Replication | log_applier.c/h, log_writer.c/h, replication.c/h, log_record.hpp, heap_file.c, btree.c, locator_sr.c; LOG_REPLICATION_DATA, LOG_REPLICATION_STATEMENT, la_apply_log_file, copylogdb | At DML time the master emits auxiliary LOG_REPL* records alongside the physiological WAL; copylogdb ships log volumes to the slave host; applylogdb (la_apply_log_file) walks them forward and re-executes per-record-type through the storage layer; asynchronous, logical, hybrid statement+row; slave is workload-equivalent but not byte-identical. |
| cubrid-cdc.md | Change capture | log_manager.c/h, log_applier.c/h, log_applier_sql_log.c/h, log_reader.cpp/hpp, api/cubrid_log.c; LOG_SUPPLEMENTAL_INFO, SUPPLEMENT_REC_TYPE, cdc_make_loginfo, cdc_min_log_pageid_to_keep | Same WAL, second consumer: the modern cdc_* API is pull-style (consumer carries an LSA cursor, asks for the next batch); the legacy la_* daemon path is push-style and coexists; DDL travels inline as LOG_SUPPLEMENT_DDL; transaction grouping at commit boundaries via tran_user; archive remover gated by cdc_min_log_pageid_to_keep. |
Adjacent sections
The replication-ha section sits between two other clusters in this code-analysis tree, and most multi-doc reading paths cross those boundaries.
- Transaction & Recovery — the WAL that feeds every layer in this section is produced there. Read cubrid-log-manager.md for the on-disk record format (including the `LOG_REPL*` and `LOG_SUPPLEMENTAL_INFO` families that this section’s consumers parse), cubrid-recovery-manager.md for the redo/undo discipline that the same WAL drives on a single node, cubrid-checkpoint.md for how the LSA cursor relates to durable restart points, and cubrid-backup-restore.md for the media-recovery hand-off that catches a slave whose log has been truncated past its catch-up point. The replication and CDC docs both name these adjacencies but do not reproduce the WAL semantics.
- Server Architecture — the `cub_master` process that carries the heartbeat layer is part of the server-process tree documented in cubrid-architecture-overview.md, and the broker (see cubrid-broker.md) is the component that uses failover output: when heartbeat moves the master label to a new host, the broker is what reroutes client connections. The heartbeat doc explains how the role is decided; the broker doc explains how clients discover the new role. Together they close the loop from “master died” to “client traffic reaches the new master”.
In short: read cubrid-log-manager.md if you want to know
where the records come from, this section if you want to know
where they go, and cubrid-broker.md if you want to know how
clients follow them after a failover.