CUBRID Server Architecture — Section Overview
What this section covers
This section is everything that makes cub_server a server — the
process-level scaffolding around the storage, query, and replication
engines, not those engines themselves. It describes how the server
process comes alive, how a request travels from a client driver into a
worker thread, where per-client state lives between requests, what
threading primitives execute the work, how that work crosses into the
storage layer, and which cross-cutting subsystems (configuration, error,
monitoring) every other module depends on.
The thirteen detail docs split along five orthogonal concerns: process
lifecycle (boot order, SA vs CS compile-time choice), the client
surface (the C client API, the broker pool that fronts every TCP
client, the CSS framing and NRP dispatch that carries each request),
per-client and concurrency state inside the server (the
SESSION_STATE container plus the thread/worker pools, including the
CBRD-26177 NG redesign), the bridge to data (the locator OID
workspace), and three cross-cutting modules (system parameters,
error management, monitoring) that every other subsystem reads from or
writes into. A specialised utility doc (loaddb) is included because it
illustrates the SA-mode embedding plus a server-mode worker pool path.
This section is not about how the storage engine, query processor, DDL executor, or replication stack work internally — those are their own sections. It is the layer beneath all of them.
The process and request flow
The thirteen docs cluster by their role in the live cub_server request
path:
Lifecycle. cubrid-boot.md describes the strict
topological order in which cub_server brings every subsystem online
(error reporter → sysparams → memory → locale → page buffer → log → lock
→ MVCC → catalog → schema → networking) and the corresponding
boot_restart_client walk on the client side, including the special
first-time createdb path and the restart log_recovery three-pass
replay. cubrid-sa-cs-runtime.md explains why
the same source tree is compiled into three artefacts — cub_server
(SERVER_MODE daemon), libcubridsa (SA_MODE in-process engine), and
libcubridcs (CS_MODE wire driver) — and how a per-utility
SA_ONLY/CS_ONLY/SA_CS classification plus a runtime dlopen of the
right .so lets one CLI binary either embed the engine or talk over
CSS.
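The launcher decision can be sketched in a few lines. The enum values echo the SA_ONLY/CS_ONLY/SA_CS classification from the doc, but the function name and the idea of returning a library path are illustrative assumptions, not CUBRID's actual launcher code:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical simplification of the per-utility classification the
// sa-cs-runtime doc describes; real CUBRID code is organised differently.
enum class UtilMode { SA_ONLY, CS_ONLY, SA_CS };

// Pick which shared object a launcher would dlopen() for a utility,
// given its classification and whether the user asked for CS mode.
std::string pick_engine_library(UtilMode mode, bool want_client_server) {
    switch (mode) {
    case UtilMode::SA_ONLY: return "libcubridsa.so";  // embed the engine in-process
    case UtilMode::CS_ONLY: return "libcubridcs.so";  // talk to cub_server over CSS
    case UtilMode::SA_CS:                             // user chooses at run time
        return want_client_server ? "libcubridcs.so" : "libcubridsa.so";
    }
    throw std::logic_error("unreachable");
}
```

The point of the design is visible in the signature: the caller's db_* code is identical either way; only the loaded substrate changes.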
Client surface. cubrid-dbi-cci.md is the
client-side C API (db_open_buffer, db_compile_statement_local,
db_execute_statement, db_query_first_tuple, db_close_session) that
JDBC, CCI, ODBC, Python, and PHP all reach through — covering the
four-stage statement FSM (Initial → Compiled → Prepared → Executed)
and how it sits on top of boot_cl and network_cl.
cubrid-broker.md describes the
front-of-house: a cub_broker parent that exposes one TCP listener,
forks a fixed pool of cub_cas worker processes, hands each accepted
client to an idle CAS via a Unix-domain rendezvous channel using
SCM_RIGHTS file-descriptor passing, then lets that CAS proxy the
client’s CSS-framed traffic upstream to cub_server — all coordinated
through one SysV shared-memory segment carrying job queues, ACL state,
and monitoring counters. cubrid-network-protocol.md covers the wire on both sides: CSS length-prefixed packet framing, the
NRP opcode space (NET_SERVER_*), the static dispatch table of
(action_attribute, handler) records that turns each opcode into a
server-side handler, and the symmetric or_pack_* / or_unpack_*
marshalling shared by client and server.
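To make the dispatch-table idea concrete, here is a hedged sketch of length-prefixed framing plus an opcode → (action_attribute, handler) table. The types, the big-endian four-byte prefix, and the flag semantics are assumptions for illustration, not the real CSS/NRP definitions:

```cpp
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Payload = std::vector<std::uint8_t>;
using Handler = std::function<std::string(const Payload &)>;

// One record per opcode, in the spirit of the static NRP dispatch table:
// an attribute bitmask plus the server-side handler to invoke.
struct DispatchRecord {
    unsigned action_attribute;  // e.g. "needs a transaction", "check shutdown"
    Handler handler;
};

// CSS-style framing: a length prefix followed by the payload bytes.
Payload frame(const Payload &body) {
    Payload out;
    std::uint32_t len = static_cast<std::uint32_t>(body.size());
    for (int i = 0; i < 4; ++i)  // big-endian length prefix (an assumption)
        out.push_back(static_cast<std::uint8_t>(len >> (24 - 8 * i)));
    out.insert(out.end(), body.begin(), body.end());
    return out;
}

// Turn a received opcode into a handler call via the static table.
std::string dispatch(const std::map<int, DispatchRecord> &table,
                     int opcode, const Payload &body) {
    auto it = table.find(opcode);
    if (it == table.end()) return "ER_UNKNOWN_OPCODE";
    return it->second.handler(body);
}
```

The real table is an array indexed by NET_SERVER_* opcode rather than a map, but the lookup-then-invoke shape is the same.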
Per-client state. cubrid-server-session.md is the per-client container that lives inside cub_server: a
lock-free hash keyed by SESSION_ID that holds prepared-statement
caches, autocommit mode, last-insert-id, and SET-variable bindings, with
the resolved SESSION_STATE * cached on the connection entry for O(1)
per-request lookup, and bound to the per-thread TDES so every handler
lands on its rightful transaction descriptor.
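The lookup-then-cache pattern can be sketched as follows. A mutex-guarded std::unordered_map stands in for CUBRID's lock-free hash, and every name here is a hypothetical simplification:

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Stand-in for SESSION_STATE: the per-client state that outlives a request.
struct SessionState {
    std::uint64_t last_insert_id = 0;
    bool autocommit = true;
    std::unordered_map<std::string, std::string> set_variables;
};

class SessionRegistry {
public:
    std::shared_ptr<SessionState> find_or_create(std::uint32_t session_id) {
        std::lock_guard<std::mutex> g(lock_);  // the real hash is lock-free
        auto &slot = sessions_[session_id];
        if (!slot) slot = std::make_shared<SessionState>();
        return slot;
    }
private:
    std::mutex lock_;
    std::unordered_map<std::uint32_t, std::shared_ptr<SessionState>> sessions_;
};

// Stand-in for the connection entry: resolve once, then O(1) per request.
struct ConnectionEntry {
    std::uint32_t session_id = 0;
    std::shared_ptr<SessionState> cached;

    SessionState &session(SessionRegistry &reg) {
        if (!cached) cached = reg.find_or_create(session_id);  // hash walk once
        return *cached;                                        // cached thereafter
    }
};
```

Two connections carrying the same SESSION_ID resolve to the same shared state, which is what lets a client reconnect and keep its prepared statements.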
Concurrency primitives. cubrid-thread-worker-pool.md is the legacy/baseline thread layer: the per-thread
cubthread::entry context, the cubthread::worker_pool template
(cores → workers → task queue) that runs queries, vacuum, loaddb, and
parallel-redo, the daemon + looper pattern behind every periodic
flush and detect, the lock-free hashmap shared by lock manager and page
buffer, and the heavyweight csect RW primitive with its per-thread
tracker. cubrid-thread-manager-ng.md is
the live state on Guava: the CBRD-26177 redesign — bounded epoll-driven
connection_workers, a coordinator brokering rebalancing and
auto-scaling, send/recv budgets, per-worker context freelists, and
atomic-free statistics — that replaces the legacy
thread-per-connection plus max_clients-task-worker layout. Both
documents are required to understand the running engine: the NG redesign
sits on top of the legacy primitives, not in place of them.
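As a baseline mental model for the cores → workers → task-queue shape, here is a minimal pool. It is a teaching sketch under stated assumptions, not cubthread::worker_pool (no cores, no per-worker entry contexts, no statistics):

```cpp
#include <atomic>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class WorkerPool {
public:
    explicit WorkerPool(std::size_t n_workers) {
        for (std::size_t i = 0; i < n_workers; ++i)
            workers_.emplace_back([this] { run(); });
    }
    ~WorkerPool() {  // drain the queue, then join every worker
        {
            std::lock_guard<std::mutex> g(m_);
            stopping_ = true;
        }
        cv_.notify_all();
        for (auto &w : workers_) w.join();
    }
    void execute(std::function<void()> task) {  // producer side
        {
            std::lock_guard<std::mutex> g(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }
private:
    void run() {  // each worker loops: wait, pop, run
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> g(m_);
                cv_.wait(g, [this] { return stopping_ || !tasks_.empty(); });
                if (stopping_ && tasks_.empty()) return;
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};
```

The vacuum, loaddb, and parallel-redo pools described above all instantiate this shape; the NG connection workers replace it only on the networking side.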
Bridge to data. cubrid-locator.md is the
boundary between this section and the Storage Engine section. It is the
OID workspace — a client-side workspace that batches dirty objects into
LC_COPYAREA buffers, paired with a server-side locator_*_force
family that fans out into heap, btree, lock, log, FK, and replication
paths through one canonical entry point. It belongs here because it is
a server-side fan-in (every DML request lands on it before reaching
storage), even though its work executes against storage.
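The workspace half of that design, batching dirty objects so the server sees one forced batch instead of one round trip per object, can be sketched like this. DirtyObject and Workspace are invented stand-ins for the real LC_COPYAREA machinery:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A dirty object waiting to be shipped: an OID plus its serialised bytes.
struct DirtyObject { std::string oid; std::string bytes; };

class Workspace {
public:
    explicit Workspace(std::size_t batch_limit) : limit_(batch_limit) {}

    // Mark an object dirty; ship the batch automatically once it fills.
    int touch(DirtyObject obj) {
        dirty_.push_back(std::move(obj));
        if (dirty_.size() >= limit_) return flush();
        return 0;
    }
    // One "locator force" round trip for the whole accumulated batch.
    int flush() {
        int sent = static_cast<int>(dirty_.size());
        round_trips_ += (sent > 0);  // one server call per non-empty batch
        dirty_.clear();
        return sent;
    }
    int round_trips() const { return round_trips_; }
private:
    std::size_t limit_;
    std::vector<DirtyObject> dirty_;
    int round_trips_ = 0;
};
```

Seven dirty objects with a batch limit of three cost three round trips, not seven; that amortisation is the reason the workspace lives on the client in CS mode.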
Cross-cutting infrastructure. Three modules are touched by
practically every other subsystem and so are kept here rather than
scattered: cubrid-system-parameters.md
is the prm_Def[] registry plus cubrid.conf parser, environment
overrides, db_set_system_parameters SQL path, and per-session
SESSION_PARAM array, all read through prm_get_*_value;
cubrid-error-management.md is the
per-thread cuberr::context, the er_set family with printf-style
specs, the localised cubrid.msg / csql.msg / utils.msg nl_catd
catalogs, the rotating cubrid_*.err log, and the
er_get_area_error / er_set_area_error wire format that flattens an
error to OR_INT triples for cross-process propagation;
cubrid-monitoring.md is the layered counter
system — a C++ template cubmonitor library with per-transaction
sheets plus the older C perf_monitor / pstat_Metadata array used by
SHOW STATS and statdump, alongside per-subsystem monitors such as the
vacuum overflow-page threshold tracker.
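The error wire format's idea, flattening an error to fixed-width integers plus the message so it survives a process hop, can be sketched as below. The exact OR_INT layout and field order here are assumptions, not CUBRID's:

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for the per-thread error state that must cross the wire.
struct ErrorState { int code; int severity; std::string msg; };

static void pack_int(std::vector<std::uint8_t> &buf, std::int32_t v) {
    std::uint32_t u = static_cast<std::uint32_t>(v);
    for (int i = 0; i < 4; ++i)  // big-endian, OR_INT-style (an assumption)
        buf.push_back(static_cast<std::uint8_t>(u >> (24 - 8 * i)));
}
static std::int32_t unpack_int(const std::uint8_t *p) {
    std::uint32_t u = 0;
    for (int i = 0; i < 4; ++i) u = (u << 8) | p[i];
    return static_cast<std::int32_t>(u);
}

// Flatten on the server side (the er_get_area_error role).
std::vector<std::uint8_t> flatten(const ErrorState &e) {
    std::vector<std::uint8_t> buf;
    pack_int(buf, e.code);
    pack_int(buf, e.severity);
    pack_int(buf, static_cast<std::int32_t>(e.msg.size()));
    buf.insert(buf.end(), e.msg.begin(), e.msg.end());
    return buf;
}

// Restore on the client side (the er_set_area_error role).
ErrorState restore(const std::vector<std::uint8_t> &buf) {
    ErrorState e;
    e.code = unpack_int(buf.data());
    e.severity = unpack_int(buf.data() + 4);
    std::int32_t len = unpack_int(buf.data() + 8);
    e.msg.assign(buf.begin() + 12, buf.begin() + 12 + len);
    return e;
}
```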
Bulk loading. cubrid-loaddb.md is the
bulk-loader utility — primarily an SA-mode binary that tokenises a
CUBRID-format object file, splits it into batches, holds a Bulk-Update
lock, writes through locator_multi_insert_force into heap pages, and
finishes with a class-by-class statistics rebuild. It illustrates the
SA-mode embedding (in-process engine, no CSS hop) and exercises a
server-mode worker pool path for parallel batches.
The diagram below is the full happy-path picture: a JDBC driver inside
an application reaches the broker over TCP, gets paired with a CAS, the
CAS opens a CSS connection upstream to cub_server, and once inside
cub_server the request is dispatched through NRP to a worker thread
that sees its session state and uses the locator to reach storage.
flowchart LR
subgraph App["Application process"]
JDBC["JDBC / CCI / ODBC<br/>(driver)"]
end
subgraph Broker["cub_broker host"]
BR["cub_broker<br/>(TCP listener<br/>· shm)"]
CAS["cub_cas worker<br/>(forked, holds DB_SESSION,<br/>SQL log, ACL)"]
end
subgraph CubServer["cub_server (one DB)"]
NET["network_sr<br/>(CSS framer + NRP<br/>dispatch table)"]
WP["Worker pool<br/>(legacy worker_pool<br/>OR NG connection_worker<br/>· coordinator/epoll)"]
SESS["SESSION_STATE<br/>(prepared stmts,<br/>SET vars, TDES bind)"]
LOC["Locator<br/>(OID workspace,<br/>locator_force family)"]
XCUT["Cross-cutting<br/>sysparam · error_manager<br/>· monitor / perfmon"]
end
Storage[("Storage engine<br/>heap · btree · log · lock · MVCC")]
JDBC -->|TCP, CCI/JDBC wire| BR
BR -->|"SCM_RIGHTS<br/>fd handoff"| CAS
CAS -->|CSS framed,<br/>NRP opcodes| NET
NET --> WP
WP --> SESS
SESS --> LOC
LOC --> Storage
XCUT -.read by.-> NET
XCUT -.read by.-> WP
XCUT -.read by.-> SESS
XCUT -.read by.-> LOC
The dotted arrows are the cross-cutting modules: every box in
cub_server reads sysparams, raises errors through er_set, and
increments perf counters.
Reading order
For a newcomer to the CUBRID server, the documents in this section are best taken in this order rather than alphabetically:
- cubrid-boot.md — start here. The boot order is the dependency graph; if you understand which subsystem comes up before which, the rest of the engine becomes legible. The doc also covers createdb and recovery dispatch, which give the “first-ever” and “after a crash” flavours of the same flow.
- cubrid-network-protocol.md and cubrid-broker.md — read these as a pair. The broker doc explains the process topology (broker, CAS, cub_server) and how a TCP socket becomes a CAS-owned engine handle; the network doc explains the protocol (CSS framing, NRP opcodes, dispatch table) used inside that handle. Either alone is incomplete.
- cubrid-server-session.md — once you know how a request arrives, this is where it lands. The session is the per-client state container, the bridge between a connection entry and a transaction descriptor.
- cubrid-thread-worker-pool.md — who executes the request. Read this for the conceptual baseline (workers, daemons, lock-free hash, csect) before tackling the NG redesign. The legacy primitives are still in the binary.
- cubrid-locator.md — the bridge to storage. Every DML, every fetch, every flush of the workspace goes through it. Read it last in the request-flow group because it is what the handler eventually calls into.
- cubrid-thread-manager-ng.md — the modern redesign on top. Once the legacy worker pool is internalised, this doc walks through the CBRD-26177 connection-worker / coordinator / epoll architecture, send/recv budgets, auto-scaling, and per-worker freelists. It is the live state on Guava.
- Cross-cutting per need. Pull these in when a specific question comes up:
  - cubrid-system-parameters.md when reading any code that calls prm_get_*_value or any subsystem whose tuning matters (buffer sizes, timeouts, thresholds).
  - cubrid-error-management.md when reading any code that calls er_set, ER_*, or returns a CUBRID error code — i.e. essentially every server file.
  - cubrid-monitoring.md when reading code that uses perfmon_*, cubmonitor::*, or pstat_*, or when investigating SHOW STATS / statdump output.
- cubrid-dbi-cci.md — pick this up when you start reading client-side code (CCI, JDBC native bridge, csql, utility binaries that talk to a running server). It is also the surface that the broker’s CAS calls into through ux_*, so it is useful as soon as you start tracing a query out of the CAS.
- cubrid-sa-cs-runtime.md — read when you need to understand a utility binary (loaddb, unloaddb, csql in --SA-mode, backupdb, restoredb, compactdb). The two-mode compilation explains why the same db_* API can either embed the engine or talk to it remotely.
- cubrid-loaddb.md — specialised, but illustrative: it is the most concrete example of the SA-mode path end-to-end (parser, batches, locator-bypass writes, statistics rebuild) and a good consolidation read after the rest of the section.
If you need only one mental model to start, read just cubrid-boot.md
plus cubrid-network-protocol.md plus the diagram above; that is
sufficient to start tracing real requests in the source.
Cross-cutting concerns
A few facts cut across the detail docs and are easy to miss if each is read in isolation:
The CBRD-26177 NG redesign sits on top of the legacy worker pool, not
in place of it. Both cubrid-thread-worker-pool.md and cubrid-thread-manager-ng.md are required to understand the running engine. The
template cubthread::worker_pool is still the substrate for the
in-server task pools (vacuum workers, loaddb workers, parallel-redo);
the NG redesign replaces only the connection-handling layer — the
old “thread-per-connection plus max_clients task workers” arrangement
becomes “bounded epoll-driven connection_workers plus a coordinator
that brokers rebalancing and auto-scaling”. Anyone trying to read the
networking code without the NG doc, or the vacuum/loaddb code without
the legacy doc, will get half the picture.
The locator is the boundary between this section and the storage engine. It is placed here, in server architecture, because it is a server-side fan-in — the canonical entry point that every DML reaches before storage — and because its workspace half lives on the client in CS mode. But its actual work executes against the storage layer (heap pages, btree leaves, lock table, log records, FK enforcement, replication tap-off). When tracing a write, the locator doc is the hinge: from there, the next stop is the Storage Engine section.
Sysparam, error, and monitoring touch every other subsystem.
cubrid-system-parameters.md is read by
every module on every request (timeouts, buffer sizes, feature
toggles); cubrid-error-management.md is
the way every module reports failure (the er_set calls are the
cross-section through the entire codebase); cubrid-monitoring.md is the way every module exposes counters
(perfmon_inc_stat, cubmonitor::*). They are documented separately
because they are conceptually distinct, but in practice every detail
doc in every section of the project assumes their existence.
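A registry read through typed getters, in the spirit of prm_Def[] and prm_get_*_value, might look like the sketch below. The class, its single override path, and the variant storage are all illustrative assumptions (CUBRID layers conf file, environment, and the SQL SET path separately):

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <variant>

// Each parameter holds one typed value; CUBRID's prm_Def[] also records
// defaults, bounds, and scope flags, which this sketch omits.
using PrmValue = std::variant<int, bool, std::string>;

class ParamRegistry {
public:
    void define(const std::string &name, PrmValue default_value) {
        values_[name] = std::move(default_value);
    }
    // One override path standing in for conf file < environment < SET.
    void set(const std::string &name, PrmValue v) {
        if (!values_.count(name))
            throw std::invalid_argument("unknown parameter: " + name);
        values_[name] = std::move(v);
    }
    // Typed getters, echoing the prm_get_*_value family.
    int get_int(const std::string &name) const {
        return std::get<int>(values_.at(name));
    }
    bool get_bool(const std::string &name) const {
        return std::get<bool>(values_.at(name));
    }
private:
    std::map<std::string, PrmValue> values_;
};
```

Rejecting unknown names at set() time mirrors why a typo in cubrid.conf is caught at boot rather than silently ignored.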
The boot doc covers two flows that look like one. “First-ever
createdb” formats volumes and bootstraps the root-class catalog;
“restart of an existing database” runs log_recovery’s three-pass
replay before opening for clients. Both are described in
cubrid-boot.md, but they take very different paths
through the code. When something goes wrong at startup, knowing which
flow you are in is the first diagnostic question.
SA mode and CS mode are the same db_* API on top of different
substrates. cubrid-sa-cs-runtime.md
explains the per-utility classification and the dlopen choice; once
you internalise that, you can read any utility binary by checking which
.so the launcher loads and following the same db_* calls into
either an in-process engine (SA) or a CSS wire (CS). The client-side
db_* API documented in cubrid-dbi-cci.md is
identical in both cases — that is the point of the design.
The broker’s CAS uses the same db_* API as any external client.
The broker is not a privileged path. It calls db_compile_statement_local,
db_execute_statement, db_query_* exactly like CCI or JDBC native
would. The “CAS” abstraction is purely about process-level pooling
(forked workers, fd passing, ACL, SQL log) — see cubrid-broker.md and cubrid-dbi-cci.md together for
the full picture.
Detail-doc summaries
| # | Doc | Module | One-line role |
|---|---|---|---|
| 1 | cubrid-boot.md | boot | Topological subsystem startup, createdb, restart-recovery dispatch, client connect handshake |
| 2 | cubrid-sa-cs-runtime.md | sa-cs-runtime | Three-way compile (cub_server, libcubridsa, libcubridcs) and per-utility dlopen choice |
| 3 | cubrid-dbi-cci.md | dbi-cci | The db_* C client API and the four-stage statement FSM under JDBC/CCI/ODBC/CSQL |
| 4 | cubrid-broker.md | broker | cub_broker + forked cub_cas pool, fd-passing rendezvous, SQL log, ACL, SysV shm |
| 5 | cubrid-network-protocol.md | network-protocol | CSS length-prefixed framing and the NRP NET_SERVER_* dispatch table |
| 6 | cubrid-server-session.md | server-session | Per-client SESSION_STATE lock-free hash + TDES binding |
| 7 | cubrid-thread-worker-pool.md | thread-worker-pool | Legacy thread layer: cubthread::entry, worker pool, daemons, csect, lock-free hashmap |
| 8 | cubrid-thread-manager-ng.md | thread-manager-ng | CBRD-26177 NG redesign: epoll connection workers, coordinator, budgets, auto-scaling |
| 9 | cubrid-locator.md | locator | OID workspace + locator_*_force server fan-in into heap/btree/lock/log/FK/replication |
| 10 | cubrid-system-parameters.md | system-parameters | prm_Def[] registry, conf/env/URL parsing, per-session scope, prm_get_*_value |
| 11 | cubrid-error-management.md | error-management | Per-thread cuberr::context, er_set family, message catalog, log rotation, wire format |
| 12 | cubrid-monitoring.md | monitoring | cubmonitor C++ library and legacy perf_monitor/pstat_Metadata C array |
| 13 | cubrid-loaddb.md | loaddb | SA-mode bulk loader: parser, batches, locator-bypass insert, statistics rebuild |
Adjacent sections
Every other section of the CUBRID code-analysis tree depends on this one — server architecture is the floor on which the rest of the engine stands. The dependencies fall into a few groups:
- Storage Engine (heap, btree, page buffer, log, lock, MVCC, recovery, double-write buffer, disk manager, extendible hash, external sort, list file, overflow file, prior list, checkpoint, backup-restore) is reached through the locator described here. Every locator_*_force call lands in heap or btree code. Read cubrid-locator.md as the hand-off point.
- Query Processing (parser, optimizer, executor, evaluator, hash join, parallel query, list file, post-processing, partition, JSON table, cursor) executes inside server worker threads tracked by cubrid-thread-worker-pool.md, with state in the cubrid-server-session.md prepared-statement cache, dispatched through one of the NET_SERVER_QM_* opcodes documented in cubrid-network-protocol.md.
- DDL and Schema (catalog manager, class object, ddl-execution, authentication, charset/collation) consumes the same boot-order cubrid-boot.md catalog bootstrap and reaches the storage layer through cubrid-locator.md for every catalog-class write.
- Replication and HA (ha-replication, heartbeat, CDC, 2PC, flashback) routes its own traffic through a separate side of the CSS protocol described in cubrid-network-protocol.md, is monitored by cub_master and the broker cluster, and depends on the same boot order; the log-shipping and replica-apply flows are layered on top of the worker-pool primitives in cubrid-thread-worker-pool.md and the locator’s locator_force family.
- PL family (pl-javasp, pl-plcsql) interacts with the JVM-hosted pl_server over a connection that originates from cub_server — the request still arrives through the broker / CAS / CSS path documented here, but is delegated outward to a co-process.
If you are reading any of those sections and need to know how a request got to the code in question, the answer is in this section.