CUBRID TDE — Transparent Page-Level Encryption With Master-Key-Wrapped DEK
Contents:
- Theoretical Background
- Common DBMS Design
- CUBRID’s Approach
- Source Walkthrough
- Cross-check Notes
- Open Questions
- Sources
Theoretical Background
A page-level engine that wants to defend its on-disk corpus against attackers with raw filesystem access must answer four questions before writing a single byte: what cipher, which key, what nonce, and where the key lives. Each answer leaks into every other layer (the buffer pool, the WAL, the recovery code, the backup tool, the DBA’s runbook), so the design space is much smaller than it looks from a distance.
Threat model. Encryption-at-rest defends against an adversary who can copy the data files, the WAL, the backups, or the disk image, but who cannot read live process memory. It is not designed to defend against a compromised DBA, memory-scraping malware on the running host, or a side-channel in the cipher implementation. The textbook account in Database Internals (Petrov, Ch. 5 “Transaction Processing and Recovery”, and the encryption-at-rest digression in the closing chapters) frames TDE as a file-system-level defense recast at page granularity so the engine can decrypt only the pages it actually needs.
Cipher choice. AES (FIPS 197) is the default everywhere because of hardware support: AES-NI on x86, ARMv8 Crypto Extensions on ARM, and the corresponding intrinsics in OpenSSL bring per-page overhead down to a few cycles per byte. The mode of operation matters as much as the cipher. NIST SP 800-38A enumerates the options:
- CBC chains blocks; the last block depends on every previous block. Good diffusion, but the encrypt path is sequential within a page, and a one-bit flip in ciphertext propagates as a full-block garbage-in-plaintext on decrypt — which is fine because the page has its own checksum, but bad if you wanted partial-page rewrites.
- CTR generates a key-stream by encrypting nonce || counter and XORing it with the plaintext. Encryption is fully parallel within a page, ciphertext has the same length as plaintext (no padding), and partial in-place rewrites are trivially supported. The only hard rule is that (key, nonce) must never repeat; otherwise XORing two ciphertexts produces the XOR of the two plaintexts. CUBRID picks CTR.
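The CTR structure described in the list above can be sketched in a few lines. This is a minimal illustration, not CUBRID code: `toy_block` is a hypothetical stand-in for the block cipher (it is not AES and offers no security), and the counter is modeled as the 16-byte nonce incremented as a big-endian integer, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 16

/* HYPOTHETICAL stand-in for a block cipher. NOT AES and NOT secure;
   it exists only so the mode logic below can run without a crypto library. */
static void toy_block(const uint8_t key[BLOCK], const uint8_t in[BLOCK],
                      uint8_t out[BLOCK])
{
    uint8_t acc = 0x5a;
    for (size_t i = 0; i < BLOCK; i++) {        /* forward mixing pass */
        acc = (uint8_t)(acc * 31u + (uint8_t)(in[i] ^ key[i]) + (uint8_t)i);
        out[i] = acc;
    }
    for (size_t i = BLOCK; i-- > 0;) {          /* backward mixing pass */
        acc = (uint8_t)(acc * 17u + out[i]);
        out[i] ^= acc;
    }
}

/* CTR: key-stream block j is E_key(nonce + j); the data is XORed with it.
   The same function encrypts and decrypts, and works in place (in == out). */
static void ctr_xcrypt(const uint8_t key[BLOCK], const uint8_t nonce[BLOCK],
                       const uint8_t *in, uint8_t *out, size_t len)
{
    uint8_t ctr[BLOCK], ks[BLOCK];

    assert(in != NULL && out != NULL);
    memcpy(ctr, nonce, BLOCK);
    for (size_t off = 0; off < len; off += BLOCK) {
        size_t n = (len - off < BLOCK) ? len - off : BLOCK;

        toy_block(key, ctr, ks);
        for (size_t i = 0; i < n; i++)
            out[off + i] = in[off + i] ^ ks[i];
        for (int i = BLOCK - 1; i >= 0; i--)    /* big-endian counter increment */
            if (++ctr[i] != 0)
                break;
    }
}
```

Calling `ctr_xcrypt` twice with the same key and nonce returns the original bytes. Encrypting two different buffers under the same (key, nonce) yields ciphertexts whose XOR equals the XOR of the plaintexts, which is exactly the reuse hazard the text warns about.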
Key hierarchy. A flat scheme — encrypt every page with the user’s master key — is hostile to key rotation: rotating the key means re-encrypting every page in the database. The textbook fix is the two-level hierarchy popularized by Oracle TDE:
- A Master Key (MK), supplied by the DBA at startup or via a key-management server. Lives in volatile memory while the server runs.
- One or more Data Encryption Keys (DEK), one per encryption purpose (table data, temp space, WAL). Generated once at database creation, stored on disk wrapped (i.e., encrypted) by the MK.
Rotating the MK then means decrypting the wrapped DEK with the old MK, re-wrapping with the new MK, and writing the new wrapped blob back. The data pages are untouched. Rotating a DEK still means re-encrypting every page that depends on it, so engines avoid that unless absolutely necessary.
Nonce per page. With CTR mode, the nonce-uniqueness invariant is a hard constraint that determines the granularity of the key. If a single DEK encrypts every page, the nonces must be globally unique across the entire database. The natural choices are:
- LSN/LSA: the page’s most-recent log sequence number. Unique for permanent pages because every modification writes a new log record before the page can be flushed (WAL).
- Page identifier: (volid, pageid) is unique in space but does not change across rewrites of the same page; using it alone with a fixed key violates uniqueness as soon as the page is rewritten with different content.
- Monotonic counter: works for pages whose LSA is meaningless (temp files have LSA pinned to a sentinel) but requires atomic increment.
- Logical page id of WAL: the per-database increasing sequence number of log pages.
CUBRID picks LSA for permanent pages, atomic counter for temp pages, logical pageid for log pages — three different policies on top of three different DEKs.
Where the wrapped DEK lives. Two camps:
- Inside the database. Postgres’ in-development TDE, MySQL InnoDB TDE, and CUBRID all store the wrapped DEK inside the database’s own files: in a system table, a tablespace header, or a dedicated heap. The trade-off is that backing up the database backs up the wrapped DEK, but the master key is still required to unwrap it.
- Beside the database. The master key file itself must live somewhere: either a hardware module (HSM/KMS) or an OS file owned by the DBA, separate from the database files. CUBRID uses an OS file <db>_keys, co-located by default with the database but configurable via tde_keys_file_path.
Performance budget. AES-256-CTR with AES-NI is around 1 GB/s per core; an encrypt or decrypt of a 16 KB page costs roughly 16 microseconds of CPU. For a buffer pool dominated by hits this is invisible; for a write-heavy workload that hits the disk on every flush it adds maybe 5-10% to the total write path. The numbers get worse without AES-NI or with smaller pages. The cost falls entirely on the I/O path: a buffer-pool hit on an already-decrypted page is free.
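The per-page figure follows directly from the assumed throughput. As a worked estimate (the 1 GB/s rate comes from the paragraph above; the flush rate of 10^4 pages per second is an illustrative assumption, not a measurement):

```latex
t_{\text{page}} \approx \frac{16\,384\ \text{B}}{10^{9}\ \text{B/s}} \approx 16.4\ \mu\text{s},
\qquad
10^{4}\ \tfrac{\text{pages}}{\text{s}} \times 16.4\ \mu\text{s} \approx 0.16\ \text{s of CPU per second}
```

Under those assumptions, sustained flushing at that rate would occupy roughly a sixth of one core.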
Common DBMS Design
Five engines (Oracle, SQL Server, MySQL InnoDB, PostgreSQL, CUBRID) make broadly similar choices for the same reason: they all sit between an OS file system and a fixed-size page abstraction, so the encryption boundary lands at “the byte that crosses the buffer-pool / disk frontier”.
Oracle TDE pioneered the two-level hierarchy in 2005. Each tablespace has its own DEK, and all DEKs are wrapped by a master key held in an Oracle Wallet (a PKCS #12 file) or an HSM. The CTR-mode choice came later; older versions used CBC. Granularity is per tablespace, which means per-table is not natively supported but is often arranged by giving sensitive tables their own tablespace.
SQL Server TDE uses a deeper hierarchy: the Service Master Key (SMK, machine-bound) protects the Database Master Key (DMK) of the master database, which protects a certificate that in turn wraps the per-database Database Encryption Key (DEK). The cipher is typically AES-256. (Always Encrypted is a separate feature that encrypts at the column level inside the client driver.) Granularity is database-level: the entire database or nothing.
MySQL InnoDB TDE — per-tablespace DEK wrapped by a master key managed by the keyring plugin (file, AWS KMS, HashiCorp Vault, etc.). AES-256-CBC by default, with per-page IV derived from the tablespace ID and page number. The wrapped DEK lives in the tablespace header. Rotation of the master key re-wraps the per-tablespace keys without rewriting page data.
PostgreSQL TDE — historically not supported in-tree; ongoing patches (the “TDE Patch” by Cybertec / EnterpriseDB) target a cluster-level encryption with a single DEK derived from a passphrase, plus a separate cluster file encryption key (CFEK) for WAL. The upstream design is still in flux as of the early 2026 timeframe. Several forks (Percona, Cybertec) ship pre-release TDE.
CUBRID sits in the same space as MySQL InnoDB and Oracle: per-DB
DEKs (three of them — perm, temp, log), one master key per database,
AES-256-CTR (or ARIA-256-CTR for Korean compliance), with the master
key file living outside the database tree. The granularity dial
is per file (i.e., per heap, per B-tree) rather than per
tablespace, exposed via SQL CREATE TABLE ... ENCRYPT=AES. The
implementation is roughly 1700 lines in tde.c plus thin hooks in
the page buffer, the log page buffer, the file manager, and the boot
code.
CUBRID’s Approach
Key hierarchy
CUBRID uses two levels (a Master Key wraps a Data Key Set) and splits the second level into three independent Data Encryption Keys:
```c
// TDE_DATA_KEY_SET (src/storage/tde.h)
typedef struct tde_data_key_set
{
  unsigned char perm_key[TDE_DATA_KEY_LENGTH];  // permanent data pages
  unsigned char temp_key[TDE_DATA_KEY_LENGTH];  // temporary file pages
  unsigned char log_key[TDE_DATA_KEY_LENGTH];   // WAL pages
} TDE_DATA_KEY_SET;
```
All three DEKs are 256-bit (TDE_DATA_KEY_LENGTH = 32), as is the
master key (TDE_MASTER_KEY_LENGTH = 32). They are generated once,
at database creation, by tde_create_dk() calling OpenSSL’s
RAND_bytes() directly — no derivation, no passphrase. The reason
for splitting into three is nonce policy: each kind of page has
a different identifier that can serve as a nonce, and giving each
kind its own DEK isolates a hypothetical nonce-reuse bug to one page
class.
The on-disk record that materializes this hierarchy is TDE_KEYINFO:
```c
// TDE_KEYINFO (src/storage/tde.h)
typedef struct tde_keyinfo
{
  int mk_index;                                  // index in master-key file
  time_t created_time;                           // of the master key
  time_t set_time;                               // last MK change on this DB
  unsigned char mk_hash[TDE_MASTER_KEY_LENGTH];  // SHA-256(MK), for validation
  unsigned char dk_perm[TDE_DATA_KEY_LENGTH];    // wrapped: enc_MK(perm_key)
  unsigned char dk_temp[TDE_DATA_KEY_LENGTH];    // wrapped: enc_MK(temp_key)
  unsigned char dk_log[TDE_DATA_KEY_LENGTH];     // wrapped: enc_MK(log_key)
} TDE_KEYINFO;
```
Three observations about this layout:
- The DEKs are stored in their wrapped form. The MK is required to unwrap them. They never touch disk in plaintext.
- The MK is stored as a hash, not a value. The hash is enough to validate a candidate MK (“does this MK match the one used to wrap these DEKs?”) without storing the MK itself. This is important for the open-the-database flow: present a MK, hash it, compare against mk_hash, reject on mismatch.
- The mk_index is a back-pointer into the master-key file <db>_keys, which holds the actual MK byte stream.
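The hash-then-compare check from the second observation can be sketched as follows. This is an illustration under stated assumptions: the digest is passed in as a callback so the sketch has no library dependency, `toy_hash` is a hypothetical placeholder (not SHA-256), and the constant-time comparison is a variation chosen for the sketch rather than something the source is known to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MK_LEN 32   /* mirrors TDE_MASTER_KEY_LENGTH from the text */

/* HYPOTHETICAL digest callback type. The real digest is SHA-256; the caller
   supplies the function here so the sketch stays self-contained. */
typedef void (*hash32_fn)(const uint8_t *in, size_t len, uint8_t out[MK_LEN]);

/* Compare two digests without exiting early on the first mismatch. */
static int digest_equal(const uint8_t a[MK_LEN], const uint8_t b[MK_LEN])
{
    uint8_t diff = 0;
    for (size_t i = 0; i < MK_LEN; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);
    return diff == 0;
}

/* Returns 1 when hash(candidate) equals the stored mk_hash, otherwise 0. */
static int validate_mk(const uint8_t candidate[MK_LEN],
                       const uint8_t stored_hash[MK_LEN], hash32_fn hash)
{
    uint8_t h[MK_LEN];

    assert(hash != NULL);
    hash(candidate, MK_LEN, h);
    return digest_equal(h, stored_hash);
}

/* HYPOTHETICAL toy digest for demonstration only. NOT SHA-256. */
static void toy_hash(const uint8_t *in, size_t len, uint8_t out[MK_LEN])
{
    uint8_t acc = 0x9e;
    for (size_t i = 0; i < MK_LEN; i++) {
        acc = (uint8_t)(acc * 33u + in[i % len] + (uint8_t)i);
        out[i] = acc;
    }
}
```

A caller would compute the stored hash once when the keyinfo is generated and call `validate_mk` with each candidate key read from the key file; a mismatch means the wrong master key was presented.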
The TDE_KEYINFO blob lives in a small CUBRID heap file inside the
database, hand-rolled to avoid MVCC adjustments
(tde_initialize() prepends a dummy repid_and_flag_bits int to the
record specifically to “prevent the record from adjustment in
vacuum_rv_check_at_undo() while UNDOing”). The HFID of that heap
is captured in boot_Db_parm.tde_keyinfo_hfid and threaded through
to tde_cipher_initialize() at restart.
The master key file itself sits at <db_full_name>_keys by default
(or under tde_keys_file_path if set), opened with mode 0600, and
holds a magic header followed by up to 128 fixed-size slots:
```c
// TDE_MK_FILE_ITEM (src/storage/tde.h)
typedef struct tde_mk_file_item
{
  time_t created_time;                              // -1 if slot is invalid
  unsigned char master_key[TDE_MASTER_KEY_LENGTH];
} TDE_MK_FILE_ITEM;

#define TDE_MK_FILE_ITEM_COUNT_MAX 128
```
Slot zero starts at offset CUBRID_MAGIC_MAX_LENGTH (skipping the magic). A slot whose created_time == -1 is treated as deleted: tde_delete_mk() overwrites the timestamp with -1 but does not shrink the file. tde_add_mk() reuses the first deleted slot, or appends a new slot when none is free, up to the TDE_MK_FILE_ITEM_COUNT_MAX limit.
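The slot policy just described (reuse the first deleted slot, otherwise append, never shrink) can be modeled in memory. This is a sketch, not the CUBRID implementation: `mk_file`, `mk_add`, `mk_delete`, and `mk_slot_offset` are hypothetical names, and `magic_len` stands in for CUBRID_MAGIC_MAX_LENGTH, whose value is not given here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

#define MK_LEN 32          /* TDE_MASTER_KEY_LENGTH */
#define MK_SLOTS_MAX 128   /* TDE_MK_FILE_ITEM_COUNT_MAX */

typedef struct {
    time_t created_time;               /* (time_t)-1 marks a deleted slot */
    unsigned char master_key[MK_LEN];
} mk_item;

/* HYPOTHETICAL in-memory model of the key file body (after the magic). */
typedef struct {
    mk_item slots[MK_SLOTS_MAX];
    int used;                          /* slots written so far, deleted ones included */
} mk_file;

/* Byte offset of slot idx; magic_len stands in for CUBRID_MAGIC_MAX_LENGTH. */
static size_t mk_slot_offset(size_t magic_len, int idx)
{
    assert(idx >= 0 && idx < MK_SLOTS_MAX);
    return magic_len + (size_t)idx * sizeof(mk_item);
}

/* Reuse the first deleted slot, else append. Returns the slot index,
   or -1 when all MK_SLOTS_MAX slots hold valid keys. */
static int mk_add(mk_file *f, const unsigned char key[MK_LEN], time_t now)
{
    int idx = -1;

    for (int i = 0; i < f->used; i++) {
        if (f->slots[i].created_time == (time_t)-1) {
            idx = i;
            break;
        }
    }
    if (idx < 0) {
        if (f->used == MK_SLOTS_MAX)
            return -1;
        idx = f->used++;
    }
    f->slots[idx].created_time = now;
    memcpy(f->slots[idx].master_key, key, MK_LEN);
    return idx;
}

/* Mark a slot deleted in place; the model, like the file, never shrinks. */
static int mk_delete(mk_file *f, int idx)
{
    if (idx < 0 || idx >= f->used || f->slots[idx].created_time == (time_t)-1)
        return -1;
    f->slots[idx].created_time = (time_t)-1;
    return 0;
}
```

After adding three keys and deleting the one at index 1, the next `mk_add` returns 1 again, and the one after that returns 3.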
```mermaid
graph TD
    DBA[tde_create_mk / DBA admin tools] -- writes --> MKF[<db>_keys file<br/>mode 0600<br/>up to 128 MK slots]
    MKF -- 32-byte MK at slot mk_index --> MK[Master Key in memory<br/>32 bytes]
    MK -- AES-256-CTR via TDE_DK_ALGORITHM --> WRAP{wrap}
    PERM[perm_key 32B] -.gen at DB create.-> WRAP
    TEMP[temp_key 32B] -.gen at DB create.-> WRAP
    LOG[log_key 32B] -.gen at DB create.-> WRAP
    WRAP --> KI[TDE_KEYINFO heap record<br/>inside database]
    KI -- mk_index --> MKF
    KI -- mk_hash --> MK
    MK -- unwrap on restart --> CIPHER[tde_Cipher.data_keys<br/>plaintext DEKs<br/>process memory only]
    CIPHER --> ENC[encrypt / decrypt page]
```
Encryption mode and IV derivation
The cipher selection is a single switch over the algorithm in tde_encrypt_internal():
```c
// tde_encrypt_internal (src/storage/tde.c)
switch (tde_algo)
  {
  case TDE_ALGORITHM_AES:
    cipher_type = EVP_aes_256_ctr ();
    break;
  case TDE_ALGORITHM_ARIA:
    cipher_type = EVP_aria_256_ctr ();
    break;
  case TDE_ALGORITHM_NONE:
  default:
    assert (false);
    goto cleanup;
  }
```
Both algorithms use 256-bit keys and CTR mode. ARIA is a Korean
national cipher (KS X 1213), included to satisfy KISA cryptographic
compliance for Korean financial and government installations. The
TDE_DK_ALGORITHM macro at the top of tde.h hard-codes AES for
DEK wrapping (the master key always wraps DEKs with AES-256-CTR
regardless of the data-page algorithm).
The IV — labeled nonce everywhere in the code — is a 16-byte
buffer constructed differently for each page class:
| Page class | DEK | Nonce source | Length |
|---|---|---|---|
| Permanent data page | perm_key | iopage_plain->prv.lsa (8 bytes) zero-padded | 16 B |
| Temporary data page | temp_key | tde_Cipher.temp_write_counter, ATOMIC_INC_64, 8 bytes zero-padded | 16 B |
| WAL page | log_key | logpage_plain->hdr.logical_pageid (8 bytes) zero-padded | 16 B |
| Wrapped DEK | (master key) | tde_dk_nonce(dk_type) — fixed pattern 0/1/2 by type | 16 B |
For permanent pages the LSA already advances on every modification
because of WAL: every undo/redo log append produces a fresh LSA,
which is then stamped into the page. The pair (perm_key, lsa) is
therefore unique per (page, version), which is the CTR invariant.
The nonce is also stored in the page's plaintext header at
prv.tde_nonce so the reader can recover it.
For temporary pages the LSA is permanently set to a sentinel
(pgbuf_init_temp_page_lsa), so the LSA cannot serve. A 64-bit
atomic counter local to tde_Cipher increments on every encrypt;
the value is written into prv.tde_nonce and read back on decrypt.
The counter resets to zero at server start because temporary files
are recreated on restart, so wrap-around within a server lifetime is
the only concern. At one increment per page write, a 64-bit counter
would take hundreds of thousands of years to exhaust even at a
million writes per second.
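As a worked estimate of that wrap-around horizon, assuming a sustained rate of one million page writes per second (an illustrative figure, far above what a temp volume would normally see):

```latex
\frac{2^{64}}{10^{6}\ \text{s}^{-1}} \approx 1.84 \times 10^{13}\ \text{s}
\approx \frac{1.84 \times 10^{13}\ \text{s}}{3.15 \times 10^{7}\ \text{s/yr}}
\approx 5.8 \times 10^{5}\ \text{years}
```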
For log pages the logical pageid is the WAL’s own monotonic page
number; combined with the dedicated log_key, no log page ever
shares a (key, nonce) pair with any other.
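The three per-page nonce sources share one shape: an 8-byte value placed in a zeroed 16-byte buffer. A minimal sketch under stated assumptions follows: `nonce_from_8bytes` and `next_temp_nonce` are hypothetical helpers, the byte order (a plain memcpy of the host representation) is assumed, and the counter here returns the pre-increment value, which may differ from what ATOMIC_INC_64 does in the source.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define NONCE_LEN 16

/* Place an 8-byte value (LSA, counter, or logical pageid) at the start of a
   zeroed 16-byte nonce. Byte order is an ASSUMPTION of this sketch. */
static void nonce_from_8bytes(const void *src, unsigned char nonce[NONCE_LEN])
{
    assert(src != NULL);
    memset(nonce, 0, NONCE_LEN);
    memcpy(nonce, src, 8);
}

/* HYPOTHETICAL stand-in for tde_Cipher.temp_write_counter. */
static _Atomic uint64_t temp_write_counter = 0;

/* Temp pages: take the next counter value atomically. The caller would store
   the returned value in prv.tde_nonce so a reader can rebuild the nonce. */
static uint64_t next_temp_nonce(unsigned char nonce[NONCE_LEN])
{
    uint64_t v = atomic_fetch_add(&temp_write_counter, 1);

    nonce_from_8bytes(&v, nonce);
    return v;
}
```

For a permanent page the caller would pass the address of the page's 8-byte LSA, and for a log page the address of its 64-bit logical pageid; only the temp path needs the atomic counter.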
For DEKs themselves, the nonce is a fixed pattern dictated by
tde_dk_nonce():
```c
// tde_dk_nonce (src/storage/tde.c)
case TDE_DATA_KEY_TYPE_PERM:
  memset (dk_nonce, 0, ...);
  break;
case TDE_DATA_KEY_TYPE_TEMP:
  memset (dk_nonce, 1, ...);
  break;
case TDE_DATA_KEY_TYPE_LOG:
  memset (dk_nonce, 2, ...);
  break;
```
This is safe because a given (MK, dk_type) pair only ever encrypts one plaintext: the DEK of that type, which is generated once and never changes. Each DEK is wrapped once at DB-create and once more at each MK change, and every MK change switches to a different master key, so a fixed nonce per type never exposes two different plaintexts under the same key stream. If the same MK ever wraps the same DEK twice, the wrapped output is byte-identical (CTR with the same key and nonce produces the same key stream), which reveals nothing new.
Page encryption boundary
CUBRID’s I/O page is FILEIO_PAGE, a fixed-size struct:
```c
// FILEIO_PAGE (src/storage/file_io.h)
struct fileio_page_reserved
{
  LOG_LSA lsa;              // page LSN
  INT32 pageid;
  INT16 volid;
  unsigned char ptype;
  unsigned char pflag;      // bit 0x1 = AES, bit 0x2 = ARIA, mask 0x3
  INT32 p_reserve_1;
  INT32 p_reserve_2;
  INT64 tde_nonce;          // counter (temp) or LSA copy (perm)
};

struct fileio_page
{
  FILEIO_PAGE_RESERVED prv;     // header: NEVER encrypted
  char page[1];                 // body: encrypted on-disk
  FILEIO_PAGE_WATERMARK prv2;   // tail-LSA: NEVER encrypted
};
```
The encryption boundary is set by two macros:
```c
#define TDE_DATA_PAGE_ENC_OFFSET sizeof (FILEIO_PAGE_RESERVED)
#define TDE_DATA_PAGE_ENC_LENGTH DB_PAGESIZE
```
Plaintext on-disk: FILEIO_PAGE_RESERVED at the head and
FILEIO_PAGE_WATERMARK at the tail. Ciphertext on-disk: everything
in between, length DB_PAGESIZE. This split is required:
- The header must remain plaintext because the buffer manager reads it before knowing whether to decrypt: pflag itself is the flag that says “this page is encrypted”, so encrypting it would make the decision to decrypt depend on data that is only readable after decryption.
- The watermark at the tail (a duplicate of the LSA) must remain plaintext for the same reason it was added to the design in the first place: torn-write detection. CUBRID compares prv.lsa == prv2.lsa to detect a partial write on a page, and this check has to work on the encrypted form too.
- The tde_nonce field (inside prv) is the IV, which does not need to be secret and must be readable before decryption.
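The three points above can be seen together in a small model of the page. This is a simplified sketch: `io_page` and `page_header` are hypothetical reductions of FILEIO_PAGE, `BODY_SIZE` stands in for DB_PAGESIZE, and `xor_body` is a placeholder transform (a one-byte XOR, not a cipher) that only marks which bytes the real CTR pass would touch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BODY_SIZE 64   /* stand-in for DB_PAGESIZE, kept tiny for the demo */

/* HYPOTHETICAL, simplified header; the real FILEIO_PAGE_RESERVED has more fields. */
typedef struct {
    int64_t lsa;        /* head copy of the page LSA (plaintext) */
    uint8_t pflag;      /* encryption algorithm bits (plaintext) */
    int64_t tde_nonce;  /* IV material (plaintext) */
} page_header;

typedef struct {
    page_header prv;           /* never encrypted */
    uint8_t body[BODY_SIZE];   /* the only encrypted region */
    int64_t prv2_lsa;          /* tail watermark, never encrypted */
} io_page;

/* Placeholder transform: header and watermark are copied verbatim, only the
   body is changed. NOT a cipher; applying it twice restores the body. */
static void xor_body(const io_page *in, io_page *out, uint8_t pad)
{
    assert(in != NULL && out != NULL);
    out->prv = in->prv;
    out->prv2_lsa = in->prv2_lsa;
    for (size_t i = 0; i < BODY_SIZE; i++)
        out->body[i] = in->body[i] ^ pad;
}

/* Torn-write check: usable on the transformed page because both LSA copies
   stay in plaintext. */
static int page_is_torn(const io_page *p)
{
    return p->prv.lsa != p->prv2_lsa;
}
```

After `xor_body`, `pflag` and both LSA copies are still readable, so the reader can decide whether to decrypt and can run the torn-write comparison without any key.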
Log pages have an analogous split:
```c
#define TDE_LOG_PAGE_ENC_OFFSET sizeof (LOG_HDRPAGE)
#define TDE_LOG_PAGE_ENC_LENGTH ((LOG_PAGESIZE) - (TDE_LOG_PAGE_ENC_OFFSET))
```
The LOG_HDRPAGE (16 bytes: logical pageid, offset, flags, checksum)
stays plaintext. The flags field carries the encryption marker:
```c
#define LOG_HDRPAGE_FLAG_ENCRYPTED_AES 0x1
#define LOG_HDRPAGE_FLAG_ENCRYPTED_ARIA 0x2
#define LOG_IS_PAGE_TDE_ENCRYPTED(p) \
  ((p)->hdr.flags & LOG_HDRPAGE_FLAG_ENCRYPTED_AES \
   || (p)->hdr.flags & LOG_HDRPAGE_FLAG_ENCRYPTED_ARIA)
```
This flag mirrors pflag for data pages but lives in the log header.
Per-file granularity and the tablespace dial
Encryption is per-file, not global. Each FILE_HEADER carries
its own TDE_ALGORITHM, manipulated via file_set_tde_algorithm().
When a page is first allocated to a file, file_alloc() reads the
header’s algorithm and stamps it onto the new page’s pflag:
```c
// file_alloc (src/storage/file_manager.c)
tde_algo = file_get_tde_algorithm_internal (fhead);
// ... debug logging ...
pgbuf_set_tde_algorithm (thread_p, page_alloc, tde_algo, FILE_IS_TEMPORARY (fhead));
```
Turning on encryption for an existing file is file_apply_tde_algorithm(),
which sets the file header’s algorithm and then walks every existing
user page to set its pflag, latching each page write-mode
unconditionally. Because each page must be re-flagged (and therefore
will be re-encrypted on its next flush), this is an O(file-pages)
operation that must run before the file is concurrently accessed.
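The header-then-every-page walk can be sketched with the pflag bit layout given earlier (0x1 = AES, 0x2 = ARIA, mask 0x3). This is an illustration only: `file_model`, `apply_algo`, and the pflag helpers are hypothetical, latching and logging are omitted, and the assumption that other pflag bits must be preserved is the sketch's own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values follow the pflag comment in the text: 0x1 = AES, 0x2 = ARIA. */
typedef enum { ALGO_NONE = 0x0, ALGO_AES = 0x1, ALGO_ARIA = 0x2 } tde_algo;
#define PFLAG_ENC_MASK 0x3u

static tde_algo pflag_get_algo(uint8_t pflag)
{
    return (tde_algo)(pflag & PFLAG_ENC_MASK);
}

/* Replace only the two algorithm bits; any other pflag bits are preserved. */
static uint8_t pflag_set_algo(uint8_t pflag, tde_algo algo)
{
    assert(algo == ALGO_NONE || algo == ALGO_AES || algo == ALGO_ARIA);
    return (uint8_t)((pflag & ~PFLAG_ENC_MASK) | (unsigned)algo);
}

/* HYPOTHETICAL model of a file: one algorithm in the header, one pflag per page. */
typedef struct {
    tde_algo header_algo;
    uint8_t *page_flags;
    size_t npages;
} file_model;

/* Set the header algorithm, then re-flag every existing page: O(npages).
   Each re-flagged page would be encrypted on its next flush. */
static void apply_algo(file_model *f, tde_algo algo)
{
    f->header_algo = algo;
    for (size_t i = 0; i < f->npages; i++)
        f->page_flags[i] = pflag_set_algo(f->page_flags[i], algo);
}
```

Pages allocated afterwards pick up the algorithm from the header, so only the pages that already existed need the walk.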
The SQL surface is CREATE TABLE ... ENCRYPT=AES, which translates
into a file_apply_tde_algorithm() on the table’s heap file plus
each of its overflow files (the calls at heap_create_internal time
appear at file_apply_tde_algorithm() line 12401 and 12410 of
file_manager.c). The user-visible knob is per-table; the internal
mechanism is per-file.
Pages outside any file — most notably volume headers and the
file-table pages of the file manager itself — are never encrypted.
Their pflag stays at zero. This means an attacker with the data
files can still see the volume layout, the sector bitmap, and the
file allocation table. The defense applies only to user data — heap
pages, overflow pages, B-tree pages — and to the WAL.
Encrypt-on-flush — data pages
Every dirty data page passes through pgbuf_bcb_flush_with_wal() on
its way to disk. The encryption hook is at the very top of the
“copy the page out” sequence:
```c
// pgbuf_bcb_flush_with_wal (src/storage/page_buffer.c)
iopage = (FILEIO_PAGE *) PTR_ALIGN (page_buf, MAX_ALIGNMENT);
CAST_BFPTR_TO_PGPTR (pgptr, bufptr);
tde_algo = pgbuf_get_tde_algorithm (pgptr);
if (tde_algo != TDE_ALGORITHM_NONE)
  {
    error = tde_encrypt_data_page (&bufptr->iopage_buffer->iopage, tde_algo, is_temp, iopage);
    // ...
  }
else
  {
    memcpy (iopage, &bufptr->iopage_buffer->iopage, IO_PAGESIZE);
  }
if (uses_dwb)
  {
    error = dwb_set_data_on_next_slot (thread_p, iopage, ..., &dwb_slot);
    // ...
  }
```
The flow:
- Allocate an aligned scratch buffer iopage of IO_MAX_PAGE_SIZE on the stack.
- Read the per-page TDE algorithm from pflag. If zero, memcpy the in-memory page to the scratch buffer and continue.
- If non-zero, call tde_encrypt_data_page(), which copies header and watermark in plaintext, computes the nonce (LSA for perm, atomic counter for temp), writes the nonce into iopage->prv.tde_nonce, and CTR-encrypts the body into the scratch buffer.
- The DWB (double-write buffer), if active, stores the ciphertext page. The DWB is purely a torn-write defense; it sees only already-encrypted bytes.
- After the DWB hands control back, fileio_write writes the ciphertext page to the data volume.
```mermaid
sequenceDiagram
    participant T as Worker thread
    participant PB as Page buffer
    participant TDE as tde_encrypt_data_page
    participant DWB as Double-write buffer
    participant FIO as fileio_write
    participant DSK as Data volume
    T->>PB: pgbuf_bcb_flush_with_wal
    PB->>PB: read pflag → tde_algo
    alt tde_algo != NONE
        PB->>TDE: encrypt(plain, algo, is_temp, cipher)
        TDE->>TDE: nonce = LSA (perm) or atomic++ (temp)
        TDE->>TDE: copy header + watermark plaintext
        TDE->>TDE: AES-256-CTR body → cipher buffer
        TDE-->>PB: cipher buffer
    else tde_algo == NONE
        PB->>PB: memcpy plain → cipher
    end
    PB->>DWB: dwb_set_data_on_next_slot(cipher)
    DWB->>FIO: fileio_write(cipher)
    FIO->>DSK: write page
```
Decrypt-on-read — data pages
The mirror image lives in pgbuf_claim_bcb_for_fix(), where a page
miss triggers a disk read followed by an in-place decrypt:
```c
// pgbuf_claim_bcb_for_fix (src/storage/page_buffer.c)
if (dwb_read_page (thread_p, vpid, &bufptr->iopage_buffer->iopage, &success) != NO_ERROR)
  {
    ...
  }
else if (success == true)
  {
    /* copied from DWB */
  }
else if (fileio_read (thread_p, ..., &bufptr->iopage_buffer->iopage, vpid->pageid, IO_PAGESIZE) == NULL)
  {
    ...
  }

CAST_IOPGPTR_TO_PGPTR (pgptr, &bufptr->iopage_buffer->iopage);
tde_algo = pgbuf_get_tde_algorithm (pgptr);
if (tde_algo != TDE_ALGORITHM_NONE)
  {
    if (tde_decrypt_data_page (&bufptr->iopage_buffer->iopage, tde_algo,
                               pgbuf_is_temporary_volume (vpid->volid),
                               &bufptr->iopage_buffer->iopage) != NO_ERROR)
      {
        // ...
      }
  }
```
Decrypt happens in place: tde_decrypt_data_page() reads from
and writes to the same bufptr->iopage_buffer->iopage buffer,
because CTR is a stream cipher (the inverse of encryption is a second
XOR with the same key-stream, which is the same operation).
The DWB is consulted first: if a write to this VPID is currently
parked in the DWB, dwb_read_page() returns the DWB’s copy
(ciphertext, since the DWB holds encrypted pages). The decrypt
hook then runs unchanged. This is what makes DWB transparent to TDE.
```mermaid
sequenceDiagram
    participant T as Worker thread
    participant PB as Page buffer
    participant DWB as Double-write buffer
    participant FIO as fileio_read
    participant DSK as Data volume
    participant TDE as tde_decrypt_data_page
    T->>PB: pgbuf_fix(vpid)
    PB->>DWB: dwb_read_page(vpid)
    alt page in DWB
        DWB-->>PB: cipher (from DWB slot)
    else not in DWB
        PB->>FIO: fileio_read(vpid)
        FIO->>DSK: read page
        DSK-->>FIO: cipher bytes
        FIO-->>PB: cipher
    end
    PB->>PB: pgbuf_get_tde_algorithm(pflag)
    alt tde_algo != NONE
        PB->>TDE: decrypt(cipher, algo, is_temp, plain)
        TDE->>TDE: nonce = prv.tde_nonce
        TDE->>TDE: AES-256-CTR body → plaintext (in place)
        TDE-->>PB: plaintext page
    end
    PB-->>T: PAGE_PTR
```
Log encryption
Log pages are encrypted on the same WAL flush path that already
exists, with two hooks: the appending page is marked encrypted via
logpb_set_tde_algorithm if any record on it carries user data, and
the flushing path encrypts immediately before write.
The append-side decision is made per log record, then promoted to the
page. prior_set_tde_encrypted() flags a log_prior_node as
TDE-relevant (called for record types in LOG_MAY_CONTAIN_USER_DATA,
the macro in tde.h listing every heap and B-tree user-data
recovery index). When the page is finalized,
logpb_next_append_page() and logpb_start_append() check
log_Gl.append.appending_page_tde_encrypted and stamp the
new page’s flags from PRM_ID_TDE_DEFAULT_ALGORITHM:
```c
// logpb_next_append_page (src/transaction/log_page_buffer.c)
if (log_Gl.append.appending_page_tde_encrypted)
  {
    TDE_ALGORITHM tde_algo = (TDE_ALGORITHM) prm_get_integer_value (PRM_ID_TDE_DEFAULT_ALGORITHM);
    logpb_set_tde_algorithm (thread_p, log_Gl.append.log_pgptr, tde_algo);
    logpb_set_dirty (thread_p, log_Gl.append.log_pgptr);
  }
```
The flush-side encryption is in logpb_writev_append_pages() and
logpb_write_page_to_disk():
```c
// logpb_writev_append_pages (src/transaction/log_page_buffer.c)
log_pgptr = to_flush[i];
if (LOG_IS_PAGE_TDE_ENCRYPTED (log_pgptr))
  {
    if (tde_encrypt_log_page (log_pgptr, logpb_get_tde_algorithm (log_pgptr), enc_pgptr) != NO_ERROR)
      {
        logpb_set_tde_algorithm (thread_p, log_pgptr, TDE_ALGORITHM_NONE);
        // ... raise ER_TDE_ENCRYPTION_LOGPAGE_ERORR_AND_OFF_TDE ...
      }
    else
      {
        log_pgptr = enc_pgptr;
      }
  }
```
The fallback on encryption failure is fail-open: clear the flag,
write the page in plaintext, raise an error. The intent (commented
in logpb_write_page_to_disk()) is “once it fails, the page always
spills user data un-encrypted from then [on]” — accepting a privacy
breach over a database stall. This is a deliberate trade-off and a
candidate for review.
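The fail-open decision described above can be isolated into a small function to make the trade-off visible. This is a sketch with hypothetical names (`log_page`, `choose_page_to_write`, `enc_ok`, `enc_fail`); the two callbacks are demonstration stubs, not real ciphers, and the error reporting is reduced to a flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_BODY 32

/* HYPOTHETICAL simplified log page: a flags word plus a body. */
typedef struct {
    int flags;                      /* 0 = plaintext, non-zero = TDE algorithm */
    unsigned char body[LOG_BODY];
} log_page;

/* Encrypt callback supplied by the caller; returns 0 on success. */
typedef int (*encrypt_fn)(const log_page *plain, log_page *cipher);

/* Pick the buffer to write. On encryption failure this follows the fail-open
   policy from the text: clear the flag, report through *failed, and fall
   back to the plaintext page. */
static const log_page *choose_page_to_write(log_page *pg, log_page *scratch,
                                            encrypt_fn enc, int *failed)
{
    assert(pg != NULL && scratch != NULL && enc != NULL && failed != NULL);
    *failed = 0;
    if (pg->flags == 0)
        return pg;                  /* page was never marked encrypted */
    if (enc(pg, scratch) != 0) {
        pg->flags = 0;              /* the page goes to disk unencrypted */
        *failed = 1;
        return pg;
    }
    return scratch;
}

/* Demonstration stubs only: a byte-flip "cipher" and an always-failing one. */
static int enc_ok(const log_page *plain, log_page *cipher)
{
    cipher->flags = plain->flags;
    for (size_t i = 0; i < LOG_BODY; i++)
        cipher->body[i] = (unsigned char)(plain->body[i] ^ 0xff);
    return 0;
}

static int enc_fail(const log_page *plain, log_page *cipher)
{
    (void)plain;
    (void)cipher;
    return -1;
}
```

A fail-closed variant would return NULL from the failure branch so the caller writes nothing and stalls, which is the alternative the text weighs against.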
The decrypt-on-read mirror sits in logpb_read_page_from_active_log()
and logpb_read_page_from_file(). The latter has an interesting
asymmetry: pages fetched from the archive are already plaintext
(archive log files store decrypted content, see below) so no decrypt
is needed; pages fetched from the active log carry their flag and
get decrypted in place.
```c
// logpb_read_page_from_file (src/transaction/log_page_buffer.c)
TDE_ALGORITHM tde_algo = logpb_get_tde_algorithm ((LOG_PAGE *) log_pgptr);
if (tde_algo != TDE_ALGORITHM_NONE)
  {
    if (tde_decrypt_log_page ((LOG_PAGE *) log_pgptr, tde_algo, (LOG_PAGE *) log_pgptr) != NO_ERROR)
      {
        ...
      }
  }
```
The archive store-as-plaintext choice is unusual: it means archived
logs are not encrypted at rest, only the active log is. The
in-source comment explains this is because TDE for replication log is
disabled (UNSTABLE_TDE_FOR_REPLICATION_LOG); the archive files are
reused for both archival and replication, and replication is not
allowed to depend on TDE state.
DWB interaction — pass-through
The double-write buffer sees the page only after the buffer manager
has applied encryption. There are zero tde_* calls in
double_write_buffer.cpp. The DWB simply slots the bytes it is
given, computes its own checksum on whatever bytes those happen to
be, and on a later flush writes them out twice (DWB volume first,
then the data volume). On crash recovery, the DWB volume is checked
for incomplete data-volume writes and the DWB’s copy (still
ciphertext) is replayed onto the data volume; the page-buffer code
will decrypt it normally on the next read.
This is a clean separation: the DWB cares about torn writes, not about content.
Master-key file lifecycle
tde_initialize() is called once at database creation
(boot_create_volume_dirs). It:
- Builds the master-key file path via tde_make_keys_file_fullname().
- Calls tde_create_keys_file(), which opens the file with O_CREAT | O_RDWR and mode 0600, writes CUBRID_MAGIC_KEYS at offset 0, and fsyncs. Signals (other than fatal ones) are blocked across the create+write pair.
- Generates a random MK via tde_create_mk() (RAND_bytes(default_mk, 32)) plus the current time(NULL).
- Writes the MK at slot 0 of the key file via tde_add_mk().
- Generates three DEKs (tde_create_dk() × 3, each RAND_bytes(32)).
- Builds a TDE_KEYINFO via tde_generate_keyinfo() with the MK index, the SHA-256(MK), and the three wrapped DEKs.
- Inserts the TDE_KEYINFO blob into the special TDE keyinfo heap.
tde_cipher_initialize() runs at every server restart via
boot_restart_server → tde_cipher_initialize (call sites at
boot_sr.c:2324 and boot_sr.c:5233):
- Mount <db>_keys (try the user-supplied path first, then the default).
- Validate the magic header.
- Read the TDE_KEYINFO from the keyinfo heap.
- Read the MK at keyinfo.mk_index from the key file.
- Hash the MK and compare to keyinfo.mk_hash. Reject on mismatch; this is how a wrong-master-key situation is detected.
- tde_load_dks() decrypts the three DEKs into tde_Cipher.data_keys.
- Reset tde_Cipher.temp_write_counter = 0.
- Set tde_Cipher.is_loaded = true.
- Dismount the key file. The MK has now done its job; only the plaintext DEKs remain in process memory.
tde_Cipher is a single global of type TDE_CIPHER, holding the
three DEKs and the temp-counter, declared extern in tde.h and
defined in tde.c. Every encrypt/decrypt call reads through this
global; there is no per-thread copy.
Master-key rotation
tde_change_mk() (and its admin wrapper xtde_change_mk_without_flock,
exposed via the cubrid tde --change-key=N admin command; see
util_admin.c TDE_CHANGE_KEY_S / util_cs.c:3941) replaces the
master key without re-encrypting any data:
- Look up the new MK by index in the key file.
- Validate it is not the same key already loaded.
- Validate the previous key still exists in the file (so the DBA does not strand the database without an unwrappable key path).
- Generate a new TDE_KEYINFO with the new mk_index, new mk_hash, and re-wrapped DEKs (the in-memory plaintext DEKs are re-encrypted with the new MK).
- heap_update_logical overwrites the old TDE_KEYINFO record.
- heap_flush() is mandatory. Without it, the new keyinfo could exist only in the page buffer when a crash takes down the server, leaving the on-disk keyinfo still pointing at the old MK, which the DBA may have deleted in the meantime. The flush guarantees that after this call returns, the DBA can delete the previous MK from the key file and the database will still open.
What is not rotated by this command:
- The DEKs themselves — they remain the same byte values, just re-wrapped. Every page on disk continues to be encrypted with the same DEKs as before.
- Any data page — the existing ciphertext stays valid because the DEK that produced it is unchanged.
Rotating a DEK would require re-encrypting every page that uses it. The current source has no such operation; it is a candidate for future work.
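Why master-key rotation leaves data pages alone can be shown with a toy wrap function. This is a sketch under loud assumptions: `toy_keystream` is a hypothetical placeholder for AES-256-CTR (it is not secure), and `rewrap_dek` unwraps with the old MK for illustration, whereas the text says CUBRID re-wraps from the plaintext DEKs already held in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEY_LEN 32

/* HYPOTHETICAL toy key stream derived from (mk, type). NOT AES-256-CTR and
   NOT secure; it only keeps the sketch free of library dependencies. */
static void toy_keystream(const uint8_t mk[KEY_LEN], uint8_t type,
                          uint8_t ks[KEY_LEN])
{
    uint8_t acc = (uint8_t)(0xa7u + type);
    for (size_t i = 0; i < KEY_LEN; i++) {
        acc = (uint8_t)(acc * 29u + mk[i] + type);
        ks[i] = acc;
    }
}

/* Wrap and unwrap are the same XOR, as with any stream construction. */
static void wrap_dek(const uint8_t mk[KEY_LEN], uint8_t type,
                     const uint8_t in[KEY_LEN], uint8_t out[KEY_LEN])
{
    uint8_t ks[KEY_LEN];

    toy_keystream(mk, type, ks);
    for (size_t i = 0; i < KEY_LEN; i++)
        out[i] = in[i] ^ ks[i];
}

/* Rotation: recover the DEK with the old MK, wrap it again with the new one.
   The plaintext DEK, and therefore every data page, is unchanged. */
static void rewrap_dek(const uint8_t old_mk[KEY_LEN],
                       const uint8_t new_mk[KEY_LEN],
                       uint8_t type, uint8_t wrapped[KEY_LEN])
{
    uint8_t dek[KEY_LEN];

    assert(wrapped != NULL);
    wrap_dek(old_mk, type, wrapped, dek);   /* unwrap */
    wrap_dek(new_mk, type, dek, wrapped);   /* re-wrap in place */
    memset(dek, 0, sizeof dek);             /* best-effort scrub of the local copy */
}
```

Only the 32-byte wrapped blob changes; unwrapping it with the new MK returns the same DEK bytes as before, so existing ciphertext pages stay decryptable.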
Performance and AES-NI
Both encrypt paths go through OpenSSL’s EVP_* interface, which
auto-selects AES-NI on x86 and ARM Crypto Extensions on ARM. The
hot loop is one EVP_EncryptUpdate per page: for a 16 KB page in
CTR mode, this expands to 16 KB / 16 B = 1024 AES block operations.
Each AES-256 block runs 14 rounds, so even with pipelined AES-NI a
block costs more than a single cycle. At the roughly 1 GB/s per core
assumed in the performance budget earlier, a page takes on the order
of 16 microseconds, and faster hardware brings that down to a few
microseconds. CTR mode would also let a multi-page flush parallelize
trivially across cores, though CUBRID does not exploit this: the
encrypt is inline on the flushing thread.
The cost falls on:
- Page-buffer miss on disk read: one decrypt per missed page.
- Buffer-pool flush: one encrypt per flushed dirty page.
- WAL flush: one encrypt per log page that contains user data.
A buffer-pool hit is free. A workload that fits in the buffer pool sees TDE mainly on its log-flush path, plus the background flushes of dirty pages; an OLTP workload that thrashes the buffer pool pays on every flush and every miss.
```mermaid
graph LR
    subgraph Hot["hot path — buffer-pool hit"]
        Q1[Query] --> H1[pgbuf_fix]
        H1 --> M1[plaintext in memory]
    end
    subgraph Miss["cold path — buffer-pool miss"]
        Q2[Query] --> H2[pgbuf_fix]
        H2 --> R2[fileio_read cipher]
        R2 --> D2[tde_decrypt_data_page]
        D2 --> M2[plaintext in memory]
    end
    subgraph Flush["flush path — encrypt"]
        F1[pgbuf_bcb_flush_with_wal] --> E1[tde_encrypt_data_page]
        E1 --> W1[fileio_write cipher]
    end
    subgraph WAL["WAL path — log encrypt"]
        L1[logpb_writev_append_pages] --> EL[tde_encrypt_log_page]
        EL --> WL[fileio_write log cipher]
    end
```
Source Walkthrough
Symbols grouped by subsystem. Each symbol’s role is summarized in two lines or fewer; full position hints are in the table at the end of this section.
Key-file management
- tde_create_keys_file (tde.c): O_CREAT|O_RDWR open with mode 0600, write CUBRID_MAGIC_KEYS, fsync, close. Signals blocked across the create.
- tde_validate_keys_file (tde.c): read magic, compare to CUBRID_MAGIC_KEYS. Used both at restart and before any read from the key file.
- tde_make_keys_file_fullname (tde.c): resolves the master-key file path. Either <db>_keys co-located with the database files or ${tde_keys_file_path}/<base>_keys if the system parameter is set.
- tde_copy_keys_file (tde.c): block-copy the key file (used during cubrid copydb and backup).
- tde_add_mk (tde.c): append or fill-first-deleted-slot. Caps at TDE_MK_FILE_ITEM_COUNT_MAX = 128 keys per database.
- tde_find_mk (tde.c): seek to slot, read item, validate created_time != -1.
- tde_find_first_mk (tde.c): linear scan from slot 0 returning the first valid item. Used at DB-create to pick up an existing key on ER_BO_VOLUME_EXISTS.
- tde_delete_mk (tde.c): overwrite created_time with -1. In-place; the file does not shrink.
- tde_dump_mks (tde.c): admin scan, prints index and creation-time for every valid slot.
- tde_create_mk (tde.c): RAND_bytes(master_key, 32) plus time(NULL).
- tde_print_mk (tde.c): hex print of an MK; used by admin.
Keyinfo (in-database wrapped-DEK record)
- tde_initialize (tde.c): DB-create entry; create key file, MK, three DEKs, build keyinfo, insert into TDE keyinfo heap.
- tde_cipher_initialize (tde.c): server-restart entry; mount key file, fetch keyinfo, validate MK, decrypt DEKs into tde_Cipher.
- tde_get_keyinfo (tde.c): heap_first-based fetch of the single record in the TDE keyinfo heap.
- tde_update_keyinfo (tde.c): heap_update_logical with UPDATE_INPLACE_CURRENT_MVCCID. Hacks class_oid to bypass heap_scancache_check_with_hfid.
- tde_generate_keyinfo (tde.c): populate a TDE_KEYINFO with SHA-256(MK) and three encrypted DEKs.
- tde_change_mk (tde.c): re-wrap DEKs with a new MK and heap_flush the new keyinfo. Forces flush so a subsequent crash does not strand the database.
- xtde_get_mk_info (tde.c): admin RPC; returns the loaded keyinfo’s mk_index, created_time, set_time.
- xtde_change_mk_without_flock (tde.c): admin RPC; reads the key file with fileio_open (no fcntl lock, because the client side held it), validates, calls tde_change_mk.
Master-key validation and DEK wrap
- tde_load_mk (tde.c): read MK from key file at given index; hash and compare to mk_hash from keyinfo.
- tde_validate_mk (tde.c): SHA-256 the candidate MK, memcmp to stored hash. The comparison is not constant-time (it uses memcmp).
- tde_make_mk_hash (tde.c): SHA-256 over 32-byte MK. Note the static assertion that SHA256_DIGEST_LENGTH == TDE_MASTER_KEY_LENGTH.
- tde_load_dks (tde.c): decrypt all three DEKs from keyinfo into tde_Cipher.data_keys.
- tde_create_dk (tde.c): RAND_bytes(data_key, 32).
- tde_encrypt_dk (tde.c): wrap a DEK with the MK using TDE_DK_ALGORITHM (= AES) and the type-derived nonce.
- tde_decrypt_dk (tde.c): unwrap.
- tde_dk_nonce (tde.c): the fixed 0/1/2-byte-fill pattern by TDE_DATA_KEY_TYPE.
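The hash-and-compare validation and the fixed DEK nonce pattern can be sketched as follows. This is a minimal illustration, not the CUBRID code: the function names are hypothetical, hmac.compare_digest is swapped in to show what a constant-time comparison looks like (the source uses memcmp), and the mapping of PERM, TEMP, LOG to fill bytes 0, 1, 2 is an assumption based on the enum order.

```python
import hashlib
import hmac
import secrets

MASTER_KEY_LENGTH = 32   # mirrors TDE_MASTER_KEY_LENGTH
DK_NONCE_LENGTH = 16     # mirrors TDE_DK_NONCE_LENGTH

# Assumed mapping of key type to fill byte, following the enum order PERM, TEMP, LOG.
DK_FILL_BYTE = {"PERM": 0, "TEMP": 1, "LOG": 2}


def make_mk_hash(master_key: bytes) -> bytes:
    """SHA-256 over the 32-byte master key; the digest is also 32 bytes."""
    assert len(master_key) == MASTER_KEY_LENGTH
    return hashlib.sha256(master_key).digest()


def validate_mk(candidate: bytes, stored_hash: bytes) -> bool:
    """Hash the candidate and compare in constant time (unlike raw memcmp)."""
    return hmac.compare_digest(make_mk_hash(candidate), stored_hash)


def dk_nonce(key_type: str) -> bytes:
    """Fixed nonce for wrapping a DEK: 16 bytes all set to the type's fill byte."""
    return bytes([DK_FILL_BYTE[key_type]]) * DK_NONCE_LENGTH


master_key = secrets.token_bytes(MASTER_KEY_LENGTH)
stored_hash = make_mk_hash(master_key)            # what keyinfo would hold
wrong_key = bytes(b ^ 0xFF for b in master_key)   # guaranteed to differ
```

Because each key type gets a distinct nonce, wrapping the three DEKs under one master key never reuses a (key, nonce) pair within a single keyinfo record.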
Page encrypt / decrypt — internal
- tde_encrypt_internal (tde.c): single OpenSSL EVP path: EVP_CIPHER_CTX_new → EVP_EncryptInit_ex → EVP_EncryptUpdate → EVP_EncryptFinal_ex → EVP_CIPHER_CTX_free. Asserts cipher_len == length because CTR mode behaves as a stream cipher and adds no padding.
- tde_decrypt_internal (tde.c): mirror.
Page encrypt / decrypt — public
- tde_encrypt_data_page (tde.c): copy header (32 bytes FILEIO_PAGE_RESERVED) plaintext, copy watermark plaintext, derive nonce (LSA for perm, atomic-counter for temp), stamp nonce into prv.tde_nonce, CTR-encrypt body.
- tde_decrypt_data_page (tde.c): mirror; reads nonce from prv.tde_nonce.
- tde_encrypt_log_page (tde.c): analogous, but offset is sizeof(LOG_HDRPAGE) and nonce is hdr.logical_pageid.
- tde_decrypt_log_page (tde.c): mirror.
- tde_get_algorithm_name (tde.c): string for log lines; "NONE", "AES", "ARIA".
- tde_is_loaded (tde.c): getter for tde_Cipher.is_loaded.
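The shape of the data-page path (header left readable, nonce derived and stamped into the header, body encrypted, decrypt reading the nonce back from the header) can be sketched as below. Everything here is an assumption made for illustration: the Page class is hypothetical, the nonce packing is invented, and a SHA-256-based toy keystream stands in for AES-CTR so the example needs only the standard library.

```python
import hashlib
import struct
from dataclasses import dataclass, replace


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in for AES-CTR: hash (key, nonce, counter) blocks into a keystream."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + struct.pack(">Q", counter)).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(data: bytes, stream: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, stream))


@dataclass(frozen=True)
class Page:
    """Hypothetical page: a plaintext header (lsa, tde_nonce) plus a body."""
    lsa: tuple          # (pageid, offset) of the last log record applied
    tde_nonce: int      # stamped at encrypt time, read back at decrypt time
    body: bytes


def encrypt_data_page(key: bytes, page: Page) -> Page:
    """Header stays readable; the nonce is derived from the LSA and stamped."""
    pageid, offset = page.lsa
    nonce = (pageid << 16) | offset   # assumed packing, for illustration only
    stream = toy_keystream(key, struct.pack(">Q", nonce), len(page.body))
    return replace(page, tde_nonce=nonce, body=xor_bytes(page.body, stream))


def decrypt_data_page(key: bytes, page: Page) -> Page:
    """Mirror of encrypt: the nonce comes from the page header, not the LSA."""
    stream = toy_keystream(key, struct.pack(">Q", page.tde_nonce), len(page.body))
    return replace(page, body=xor_bytes(page.body, stream))


key = bytes(range(32))
plain = Page(lsa=(7, 128), tde_nonce=0, body=b"user row data" * 4)
cipher = encrypt_data_page(key, plain)
restored = decrypt_data_page(key, cipher)
```

Storing the nonce in the header is what lets the temp-page variant work: a temp page has no meaningful LSA, so the atomic-counter value has to travel with the page.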
Page-buffer hooks (data pages)
- pgbuf_set_tde_algorithm (page_buffer.c): write pflag bit; emit a RVPGBUF_SET_TDE_ALGORITHM log record unless skip_logging is true (true for temp pages).
- pgbuf_get_tde_algorithm (page_buffer.c): read pflag bit back as TDE_ALGORITHM_AES / _ARIA / _NONE.
- pgbuf_rv_set_tde_algorithm (page_buffer.c): recovery; replay a RVPGBUF_SET_TDE_ALGORITHM log record onto a page.
- pgbuf_bcb_flush_with_wal (page_buffer.c): flush hot path; hosts the encrypt-on-flush hook before DWB.
- pgbuf_claim_bcb_for_fix (page_buffer.c): fix cold path; hosts the decrypt-on-read hook after DWB / fileio_read.
- pgbuf_copy_from_area (page_buffer.c): special path for direct page-area writes; takes a TDE_ALGORITHM argument and applies it to the new page.
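The pflag accessors amount to masking two bits. A minimal sketch, using the flag values listed under Constants and types; the Python function names are hypothetical, and treating the invalid bit pattern 0x3 as NONE is an assumption rather than documented behaviour.

```python
from enum import IntEnum

# Values quoted from the constants list in this document (file_io.h).
FILEIO_PAGE_FLAG_ENCRYPTED_AES = 0x1
FILEIO_PAGE_FLAG_ENCRYPTED_ARIA = 0x2
FILEIO_PAGE_FLAG_ENCRYPTED_MASK = 0x3


class TdeAlgorithm(IntEnum):
    NONE = 0
    AES = 1
    ARIA = 2


def set_tde_algorithm(pflag: int, algo: TdeAlgorithm) -> int:
    """Clear the two TDE bits, then set the bit for the requested algorithm."""
    pflag &= ~FILEIO_PAGE_FLAG_ENCRYPTED_MASK
    if algo == TdeAlgorithm.AES:
        pflag |= FILEIO_PAGE_FLAG_ENCRYPTED_AES
    elif algo == TdeAlgorithm.ARIA:
        pflag |= FILEIO_PAGE_FLAG_ENCRYPTED_ARIA
    return pflag


def get_tde_algorithm(pflag: int) -> TdeAlgorithm:
    """Decode the TDE bits; unrelated flag bits are ignored."""
    bits = pflag & FILEIO_PAGE_FLAG_ENCRYPTED_MASK
    if bits == FILEIO_PAGE_FLAG_ENCRYPTED_AES:
        return TdeAlgorithm.AES
    if bits == FILEIO_PAGE_FLAG_ENCRYPTED_ARIA:
        return TdeAlgorithm.ARIA
    return TdeAlgorithm.NONE   # assumed fallback, including for the pattern 0x3
```

Clearing the mask before setting a bit matters when a page switches from one algorithm to the other; without it both bits would end up set.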
File-manager hooks (per-file granularity)
- file_set_tde_algorithm (file_manager.c): write the algorithm into the FILE_HEADER; emit a RVFL_FHEAD_SET_TDE_ALGORITHM log record (unless temp).
- file_get_tde_algorithm_internal (file_manager.c): read it back.
- file_set_tde_algorithm_internal (file_manager.c): write it without logging (used by recovery).
- file_get_tde_algorithm (file_manager.c): public read with page latch.
- file_apply_tde_algorithm (file_manager.c): set on file header, then walk every user page and stamp pflag. Caller must guarantee no concurrent access (uses PGBUF_UNCONDITIONAL_LATCH).
- file_alloc (file_manager.c): at allocation, copy file’s algo to the new page’s pflag. The line that does it: pgbuf_set_tde_algorithm (thread_p, page_alloc, tde_algo, FILE_IS_TEMPORARY (fhead));.
- file_rv_set_tde_algorithm (file_manager.c): recovery replay.
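The per-file propagation described above (header holds the algorithm, new pages inherit it at allocation, an explicit apply pass restamps existing pages, temp files skip logging) can be modelled in a few lines. ToyFile, ToyPage, and both functions are hypothetical names for this sketch; the log list only records that a log record would be emitted.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPage:
    tde_algo: str = "NONE"      # stands in for the pflag bits


@dataclass
class ToyFile:
    tde_algo: str = "NONE"      # stands in for the FILE_HEADER field
    is_temporary: bool = False
    pages: list = field(default_factory=list)
    log: list = field(default_factory=list)   # records that would be logged


def toy_file_alloc(f: ToyFile) -> ToyPage:
    """A new page inherits the file's algorithm at allocation time."""
    page = ToyPage(tde_algo=f.tde_algo)
    f.pages.append(page)
    return page


def toy_apply_tde_algorithm(f: ToyFile, algo: str) -> None:
    """Set the header, then stamp every existing page (no concurrent access)."""
    f.tde_algo = algo
    if not f.is_temporary:
        f.log.append(("set_file_tde_algorithm", algo))   # temp files skip logging
    for page in f.pages:
        page.tde_algo = algo


perm_file = ToyFile()
old_page = toy_file_alloc(perm_file)          # allocated before TDE was enabled
toy_apply_tde_algorithm(perm_file, "AES")
new_page = toy_file_alloc(perm_file)          # allocated afterwards
```

Both the pre-existing page and the newly allocated one end up stamped AES, which is why the apply pass has to walk the file rather than rely on allocation alone.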
Log-page hooks
- logpb_set_tde_algorithm (log_page_buffer.c): set LOG_HDRPAGE_FLAG_ENCRYPTED_* flag in log page header.
- logpb_get_tde_algorithm (log_page_buffer.c): read back.
- LOG_IS_PAGE_TDE_ENCRYPTED (log_storage.hpp): fast macro test.
- prior_set_tde_encrypted (log_append.cpp): mark a prior node; refuses if tde_is_loaded() is false.
- prior_is_tde_encrypted (log_append.cpp): read back.
- LOG_MAY_CONTAIN_USER_DATA (tde.h): macro listing recovery indices that imply user data on the page; the trigger for prior_set_tde_encrypted.
- logpb_next_append_page (log_page_buffer.c): when a new append page is allocated, propagate appending_page_tde_encrypted into the page’s flags.
- logpb_start_append (log_page_buffer.c): same, on the first record of a page.
- logpb_writev_append_pages (log_page_buffer.c): flush hot path for active log; encrypts each TDE-flagged page into a scratch buffer before write.
- logpb_write_page_to_disk (log_page_buffer.c): single-page flush variant; same encrypt logic.
- logpb_read_page_from_active_log (log_page_buffer.c): bulk read; iterates pages and decrypts each TDE-flagged one in place.
- logpb_read_page_from_file (log_page_buffer.c): single-page read with active/archive split; archive pages are already plaintext.
Boot integration
- tde_initialize call site: boot_sr.c:5104, inside boot_create_volume_dirs (DB-create flow).
- tde_cipher_initialize call sites: boot_sr.c:2324 (restart with explicit MK path, used by restoredb) and boot_sr.c:5233 (normal restart, default MK path).
- boot_Db_parm.tde_keyinfo_hfid: the HFID of the TDE keyinfo heap, persisted in the DB control block.
Admin / utility
- TDE_CHANGE_KEY_S (util_admin.c): short option for cubrid tde --change-key=N; passed through to the server side which calls xtde_change_mk_without_flock.
- TDE_CHANGE_KEY_L (util_admin.c): long option.
- tde admin command body: util_cs.c:3941 reads the option and RPCs to the server.
Constants and types
- TDE_ALGORITHM enum (tde.h): NONE (0), AES (1), ARIA (2).
- TDE_DATA_KEY_TYPE enum (tde.h): PERM, TEMP, LOG.
- TDE_DATA_KEY_SET (tde.h): three 32-byte DEKs in a struct.
- TDE_KEYINFO (tde.h): on-disk keyinfo blob.
- TDE_MK_FILE_ITEM (tde.h): on-disk master-key file slot.
- TDE_CIPHER (tde.h): the in-memory singleton.
- TDE_DATA_PAGE_ENC_OFFSET / _LENGTH (tde.h): encrypted region of a data page.
- TDE_LOG_PAGE_ENC_OFFSET / _LENGTH (tde.h): encrypted region of a log page.
- TDE_DATA_PAGE_NONCE_LENGTH = 16, TDE_LOG_PAGE_NONCE_LENGTH = 16, TDE_DK_NONCE_LENGTH = 16 (tde.h).
- TDE_MASTER_KEY_LENGTH = 32, TDE_DATA_KEY_LENGTH = 32 (tde.h).
- TDE_MK_FILE_ITEM_COUNT_MAX = 128 (tde.h).
- LOG_DBTDE_KEYS_VOLID (log_volids.hpp): synthetic volid assigned to the master-key file when mounted via fileio_mount.
- FILEIO_PAGE_FLAG_ENCRYPTED_AES = 0x1, _ARIA = 0x2, _MASK = 0x3 (file_io.h).
- LOG_HDRPAGE_FLAG_ENCRYPTED_AES = 0x1, _ARIA = 0x2, _MASK = 0x3 (log_storage.hpp).
- PRM_ID_TDE_DEFAULT_ALGORITHM (system_parameter): tde_default_algorithm, the algorithm used for log encryption (data pages take their algorithm from the file header, set per-table).
- PRM_ID_TDE_KEYS_FILE_PATH (system_parameter): tde_keys_file_path, optional override of the <db>_keys location.
- FILEIO_SUFFIX_KEYS = "_keys" (file_io.h): the default master-key-file suffix.
Position hints (as of 2026-05-01)
| Symbol | File | Line |
|---|---|---|
| tde_initialize | src/storage/tde.c | 107 |
| tde_cipher_initialize | src/storage/tde.c | 233 |
| tde_create_keys_file | src/storage/tde.c | 321 |
| tde_validate_keys_file | src/storage/tde.c | 370 |
| tde_copy_keys_file | src/storage/tde.c | 410 |
| tde_make_keys_file_fullname | src/storage/tde.c | 504 |
| tde_generate_keyinfo | src/storage/tde.c | 532 |
| tde_get_keyinfo | src/storage/tde.c | 569 |
| tde_update_keyinfo | src/storage/tde.c | 607 |
| tde_change_mk | src/storage/tde.c | 661 |
| tde_load_mk | src/storage/tde.c | 705 |
| tde_load_dks | src/storage/tde.c | 742 |
| tde_validate_mk | src/storage/tde.c | 774 |
| tde_make_mk_hash | src/storage/tde.c | 794 |
| tde_create_dk | src/storage/tde.c | 814 |
| tde_encrypt_dk | src/storage/tde.c | 838 |
| tde_decrypt_dk | src/storage/tde.c | 858 |
| tde_dk_nonce | src/storage/tde.c | 875 |
| tde_encrypt_data_page | src/storage/tde.c | 908 |
| tde_decrypt_data_page | src/storage/tde.c | 961 |
| tde_encrypt_log_page | src/storage/tde.c | 1009 |
| tde_decrypt_log_page | src/storage/tde.c | 1039 |
| tde_encrypt_internal | src/storage/tde.c | 1074 |
| tde_decrypt_internal | src/storage/tde.c | 1153 |
| xtde_get_mk_info | src/storage/tde.c | 1226 |
| xtde_change_mk_without_flock | src/storage/tde.c | 1258 |
| tde_create_mk | src/storage/tde.c | 1323 |
| tde_add_mk | src/storage/tde.c | 1363 |
| tde_find_mk | src/storage/tde.c | 1449 |
| tde_find_first_mk | src/storage/tde.c | 1516 |
| tde_delete_mk | src/storage/tde.c | 1581 |
| tde_dump_mks | src/storage/tde.c | 1641 |
| tde_get_algorithm_name | src/storage/tde.c | 1706 |
| TDE_CIPHER (struct) | src/storage/tde.h | 148 |
| TDE_KEYINFO (struct) | src/storage/tde.h | 160 |
| TDE_MK_FILE_ITEM (struct) | src/storage/tde.h | 92 |
| TDE_DATA_KEY_SET (struct) | src/storage/tde.h | 85 |
| LOG_MAY_CONTAIN_USER_DATA (macro) | src/storage/tde.h | 107 |
| FILEIO_PAGE_FLAG_ENCRYPTED_AES | src/storage/file_io.h | 63 |
| FILEIO_PAGE_RESERVED (struct) | src/storage/file_io.h | 165 |
| LOG_HDRPAGE_FLAG_ENCRYPTED_AES | src/transaction/log_storage.hpp | 42 |
| LOG_IS_PAGE_TDE_ENCRYPTED | src/transaction/log_storage.hpp | 47 |
| LOG_DBTDE_KEYS_VOLID | src/transaction/log_volids.hpp | 41 |
| pgbuf_set_tde_algorithm | src/storage/page_buffer.c | 4880 |
| pgbuf_rv_set_tde_algorithm | src/storage/page_buffer.c | 4933 |
| pgbuf_get_tde_algorithm | src/storage/page_buffer.c | 4953 |
| pgbuf_claim_bcb_for_fix (decrypt hook) | src/storage/page_buffer.c | 8277 |
| pgbuf_bcb_flush_with_wal (encrypt hook) | src/storage/page_buffer.c | 10532 |
| file_set_tde_algorithm | src/storage/file_manager.c | 5823 |
| file_get_tde_algorithm | src/storage/file_manager.c | 5929 |
| file_apply_tde_algorithm | src/storage/file_manager.c | 6003 |
| file_alloc (TDE stamp) | src/storage/file_manager.c | 5503 |
| prior_set_tde_encrypted | src/transaction/log_append.cpp | 1564 |
| prior_is_tde_encrypted | src/transaction/log_append.cpp | 1580 |
| logpb_get_tde_algorithm | src/transaction/log_page_buffer.c | 11564 |
| logpb_set_tde_algorithm | src/transaction/log_page_buffer.c | 11592 |
| logpb_writev_append_pages (encrypt hook) | src/transaction/log_page_buffer.c | 2819 |
| logpb_write_page_to_disk (encrypt hook) | src/transaction/log_page_buffer.c | 2303 |
| logpb_read_page_from_active_log (decrypt) | src/transaction/log_page_buffer.c | 2201 |
| logpb_read_page_from_file (decrypt) | src/transaction/log_page_buffer.c | 2110 |
| tde_initialize call site | src/transaction/boot_sr.c | 5104 |
| tde_cipher_initialize call sites | src/transaction/boot_sr.c | 2324, 5233 |
Cross-check Notes
Against cubrid-page-buffer-manager.md. The page-buffer doc
describes the LRU/AOUT replacement and the dirty-flag accounting in
pgbuf_bcb_flush_with_wal, but does not mention the TDE hook
sitting at the very top of the flush sequence. The encrypt path
allocates a stack-aligned scratch IO_MAX_PAGE_SIZE buffer, copies
the in-memory plaintext into it via tde_encrypt_data_page (which
internally memcpys the FILEIO_PAGE_RESERVED header and watermark
plaintext, then CTR-encrypts the body), and from then on
DWB/fileio_write see only the cipher copy. The page in the BCB
itself is never modified — it stays plaintext in RAM, and only
the scratch copy that goes to disk gets encrypted. This means a
buffer-pool hit incurs zero TDE cost.
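The scratch-copy behaviour can be made concrete with a toy buffer pool that counts cipher invocations. This is a sketch under loud assumptions: ToyBufferPool is hypothetical, the cipher is a SHA-256-based XOR keystream standing in for AES-CTR, and the page id is used as the nonce purely to keep the example short (the real code derives the nonce from the page LSA, which changes on every modification).

```python
import hashlib


def toy_cipher(key: bytes, nonce: int, data: bytes) -> bytes:
    """XOR with a SHA-256-derived keystream; a stand-in for AES-CTR."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        seed = key + nonce.to_bytes(8, "big") + counter.to_bytes(8, "big")
        stream += hashlib.sha256(seed).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class ToyBufferPool:
    """Hypothetical pool: plaintext in memory, ciphertext on 'disk'."""

    def __init__(self, key: bytes):
        self.key = key
        self.memory = {}        # page id -> plaintext bytes (the BCB contents)
        self.disk = {}          # page id -> ciphertext bytes
        self.crypto_calls = 0   # how many times the cipher ran

    def fix(self, page_id: int) -> bytes:
        if page_id not in self.memory:          # miss: read and decrypt
            self.crypto_calls += 1
            self.memory[page_id] = toy_cipher(self.key, page_id, self.disk[page_id])
        return self.memory[page_id]             # hit: no crypto at all

    def flush(self, page_id: int) -> None:
        self.crypto_calls += 1
        scratch = toy_cipher(self.key, page_id, self.memory[page_id])
        self.disk[page_id] = scratch            # in-memory page stays plaintext


pool = ToyBufferPool(key=bytes(32))
pool.memory[1] = b"plaintext page body"
pool.flush(1)                        # one encrypt
hit = pool.fix(1)                    # hit: counter unchanged
calls_after_hit = pool.crypto_calls
del pool.memory[1]                   # simulate eviction
miss = pool.fix(1)                   # miss: one decrypt
```

The counter moves only on flush and on miss, which is the cost model the paragraph above describes.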
Against cubrid-double-write-buffer.md. The DWB doc claims
content-agnosticism; this is accurate for TDE. The DWB sees the
already-encrypted bytes from pgbuf_bcb_flush_with_wal, computes its
own checksum on those bytes, and writes them twice. On recovery,
the DWB replays its (cipher) copy onto the data volume; the next
read of that page goes through pgbuf_claim_bcb_for_fix which
applies the decrypt hook normally. There is exactly one place this
matters: the DWB volume itself is not encrypted as a unit, but every
TDE-flagged page sitting in a DWB slot is already in its encrypted
form. An attacker reading the DWB file gets ciphertext for those
pages, not plaintext.
Against cubrid-log-manager.md (anticipated). The log manager
doc, when it exists, should note that:
- The TDE flag is a per-page property (LOG_HDRPAGE_FLAG_ENCRYPTED_*), not a per-record one.
- The decision to encrypt is made per-record (via prior_set_tde_encrypted on records covering user data) and promotes to the page that holds the record. Once any user-data record lands on a page, the whole page is encrypted on flush.
- Archive log files store plaintext. The active log’s TDE flag is cleared by logpb_archive_active_log before the page moves to archive (the relevant code path uses tde_decrypt_log_page and emits an unencrypted page into the archive). This is because TDE for replication log is currently disabled (UNSTABLE_TDE_FOR_REPLICATION_LOG) and the archive files are reused for replication.
- The fail-open behaviour on encrypt error in the flush path (raising ER_TDE_ENCRYPTION_LOGPAGE_ERORR_AND_OFF_TDE and clearing the flag) is a known trade-off documented in the source comment.
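The record-to-page promotion can be sketched as a pair of flags. The record kinds below are invented for illustration (the real trigger is the LOG_MAY_CONTAIN_USER_DATA macro over recovery indices), and LogRecord, LogPage, mark_record, and append are hypothetical names, not CUBRID symbols.

```python
from dataclasses import dataclass, field

# Hypothetical record kinds; the real trigger is the LOG_MAY_CONTAIN_USER_DATA macro.
USER_DATA_KINDS = {"heap_insert", "heap_update", "btree_insert"}


@dataclass
class LogRecord:
    kind: str
    tde_encrypted: bool = False   # the prior-node flag


@dataclass
class LogPage:
    records: list = field(default_factory=list)
    tde_encrypted: bool = False   # the per-page header flag


def mark_record(record: LogRecord, tde_loaded: bool) -> None:
    """Flag a record only if it may carry user data and the cipher is loaded."""
    if tde_loaded and record.kind in USER_DATA_KINDS:
        record.tde_encrypted = True


def append(page: LogPage, record: LogRecord) -> None:
    """One flagged record is enough to mark the whole page for encryption."""
    page.records.append(record)
    if record.tde_encrypted:
        page.tde_encrypted = True


page = LogPage()
for kind in ("commit", "heap_insert", "commit"):
    record = LogRecord(kind)
    mark_record(record, tde_loaded=True)
    append(page, record)
```

After the loop the page carries three records, only one of them flagged, and the page-level flag is set; the flush path would therefore encrypt the whole page body.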
Against cubrid-disk-manager.md. The disk manager doc covers
volume headers, sector tables, file allocation tables. None of
these are encrypted under TDE: pflag on those pages stays at zero.
The TDE boundary applies to user pages allocated via file_alloc,
which inherits the TDE flag from the file header. Volume layout is
visible to a filesystem-level attacker.
Master-key file location. Note the implicit assumption that the
master-key file resides on the same host as the database. This is
the default — <db>_keys co-located with the database. A KMS or
HSM integration would require replacing tde_create_keys_file and
the file-based tde_find_mk/tde_add_mk with calls to a key
manager. No such integration exists in the current source.
Replication / HA implications. The header notes a disabled
UNSTABLE_TDE_FOR_REPLICATION_LOG symbol. With this disabled, an HA
replica receives plaintext archive logs — which means that a
secondary that mirrors a TDE-enabled primary stores the data
encrypted (it has its own TDE) but receives the WAL stream
unencrypted across the network. This is a cross-cutting concern
that the code comments call out explicitly. Network-level
encryption (TLS) is the workaround.
Open Questions
- HSM / KMS integration. All of tde_load_mk, tde_find_mk, tde_add_mk, tde_delete_mk operate directly on a POSIX file via read/write/lseek. The cleanest extension point is a virtualization of those four functions with a pluggable backend; nothing else in the code path assumes file-system semantics for the master key. No such abstraction exists today.
- Per-table re-keying / online rekey. tde_change_mk only re-wraps the DEKs. Rotating the actual DEKs would require reading every page, decrypting with the old DEK, and rewriting with the new DEK. This is analogous to what file_apply_tde_algorithm does for the algorithm flag, but with full body re-encryption. The current source has no such operation, and it is a non-trivial design problem because the page LSA serves as the nonce: re-encrypting a page in place under the same DEK without changing its LSA would violate CTR uniqueness, so a rotation scheme also has to track which DEK each page is encrypted under.
- Per-column or per-row encryption. TDE encrypts the entire page body. Attribute-level encryption (analogous to SQL Server Always Encrypted) would require client-side support, which CUBRID does not currently expose. The TDE module is purely page-level.
- Constant-time MK comparison. tde_validate_mk uses raw memcmp to compare the SHA-256 of a candidate MK against the stored hash. The timing side channel is small (32 bytes) but not zero. CRYPTO_memcmp from OpenSSL would close it.
- Archive-log encryption. Disabling TDE on archive logs is a conscious choice to allow plaintext replication; if HA-link encryption is expected to be done at the network layer (TLS), the trade-off is fine. Otherwise a re-encrypt-at-archive path would be needed.
- Key-file slot reclamation. tde_delete_mk marks slots invalid but does not compact the file. After many key rotations the file grows linearly until the 128-slot ceiling. No compaction routine exists.
- Fail-open on log encrypt error. The ER_TDE_ENCRYPTION_LOGPAGE_ERORR_AND_OFF_TDE path silently writes plaintext when encryption fails. From a privacy standpoint, fail-closed (refuse to flush, halt the server) might be safer. The current choice prioritizes availability.
- TDE for the keyinfo heap itself. The TDE_KEYINFO blob is stored in a CUBRID heap file. That heap file is not TDE-flagged (it would be circular: you need the DEK to decrypt it, but it contains the wrapped DEKs). Confirm by inspection that the keyinfo HFID’s file header has tde_algo == NONE; a defensive assertion would be worthwhile.
- tde_print_mk in production builds. This function prints master-key material to stdout. Exposed for admin debugging (tde_dump_mks with print_value), but worth confirming it is guarded against accidental invocation in non-admin contexts.
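The CTR uniqueness rule that the re-keying item leans on can be demonstrated with a toy keystream. The sketch below is not AES: a SHA-256-based keystream stands in for AES-CTR so the example runs on the standard library alone, and the page bodies and nonce value are made up. It shows that two bodies encrypted under the same (key, nonce) leak the XOR of their plaintexts, while switching the key removes the leak even if the nonce is unchanged.

```python
import hashlib


def toy_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in for AES-CTR: hash (key, nonce, counter) blocks into a keystream."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


old_key = bytes(range(32))
new_key = bytes(range(32, 64))
nonce = (42).to_bytes(16, "big")       # e.g. derived from an unchanged page LSA

old_body = b"alice balance=0100"
new_body = b"alice balance=9999"

# Same key, same nonce, different plaintext: the keystream cancels out.
c_old = xor_bytes(old_body, toy_keystream(old_key, nonce, len(old_body)))
c_bad = xor_bytes(new_body, toy_keystream(old_key, nonce, len(new_body)))
leak = xor_bytes(c_old, c_bad)         # equals old_body XOR new_body

# New key, same nonce: the two ciphertexts no longer share a keystream.
c_new = xor_bytes(new_body, toy_keystream(new_key, nonce, len(new_body)))
no_leak = xor_bytes(c_old, c_new)
```

In the leaking case the first 14 bytes of the XOR are zero, which tells an observer with both ciphertexts exactly which bytes of the page changed without any key material.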
Sources
Code paths consulted:
- src/storage/tde.h
- src/storage/tde.c
- src/storage/page_buffer.c: encrypt-on-flush, decrypt-on-read, pflag accessors, recovery hook.
- src/storage/file_io.h: FILEIO_PAGE layout, pflag constants, watermark/sanity helpers.
- src/storage/file_io.c: only one TDE touchpoint (io_page->prv.tde_nonce = 0; in page-format scrub).
- src/storage/file_manager.h, file_manager.c: per-file TDE algorithm; page allocation propagates it.
- src/storage/double_write_buffer.cpp: confirmed to contain no TDE-aware code; DWB is content-agnostic by design.
- src/transaction/log_page_buffer.c: log page encrypt/decrypt hooks on flush, read, archive boundary.
- src/transaction/log_storage.hpp: LOG_HDRPAGE_FLAG_*, LOG_IS_PAGE_TDE_ENCRYPTED.
- src/transaction/log_append.cpp, log_append.hpp: prior-node TDE flag (prior_set_tde_encrypted / prior_is_tde_encrypted).
- src/transaction/log_volids.hpp: synthetic volid for the master-key file mount.
- src/transaction/boot_sr.c: DB-create and restart entry to the TDE module.
- src/executables/util_admin.c, util_cs.c: cubrid tde admin command parsing and dispatch.
Theoretical references:
- FIPS 197 (AES standard) — block cipher.
- NIST SP 800-38A — modes of operation. CUBRID uses CTR.
- KS X 1213 (ARIA) — Korean national cipher; the alternate algorithm offered alongside AES.
- Database Internals (Petrov, 2019) — encryption-at-rest discussion in the storage chapters.
- Comparative engines: Oracle TDE (per-tablespace DEK + Wallet master), SQL Server TDE (three-level SMK/DMK/DEK hierarchy), MySQL InnoDB TDE (per-tablespace DEK + keyring plugin), PostgreSQL TDE (in development, cluster-wide DEK with optional separate WAL key).