
CUBRID migrate — One-Shot 9.1→9.2 In-Place Format Upgrader for Volume Headers, Active Log Codeset, and Collation Sync

A disk-format upgrader is the tool a database engine ships when its on-disk layout changes incompatibly between versions. Two design choices dominate:

  1. In-place vs. dump-and-reload. An in-place upgrader rewrites the existing volume files. It is faster, because it needs neither a second copy of the data on disk nor a re-import, but it requires a per-format converter and a robust crash-recovery story. Dump-and-reload (export with the old binaries, import with the new ones) is slower but conceptually simpler, and it reuses the standard export/import path.
  2. Single-version-pair vs. multi-version-chain. A single-pair tool migrates strictly from version N to N+1; a chain tool understands every historical format and converts from any to current. Chains are easier on operators (one tool per upgrade) but explode the test matrix; pairs are simple and explicit but require operators to traverse intermediate versions.

CUBRID’s migrate picks in-place + single-pair (9.1→9.2). The hard guard rel_disk_compatible() != V9_2_LEVEL at startup rejects the migrator if it’s been linked against a different CUBRID version, making it impossible to accidentally run the 9.1→9.2 converter on a 9.0 or 9.3 install. Every later version upgrade has its own pair tool (or a release-note instruction to dump-and-reload).

The in-place choice forces the migrator to maintain an undo journal: each volume header is read and copied to a side buffer before it is rewritten in the new format. If a later step fails or the operator interrupts the run, the migrator rolls back by calling undo_fix_volume_header on each journal entry. The journal lives only in memory, so it does not survive a crash of the migrator process itself.

migrate <db_name> runs a strict four-phase sequence:

  1. Volume header rewrite. Walk every data volume listed in the database’s volume-info file; for each volume, read the v9.1 r91_disk_var_header (struct in migrate.c:115), reshape into the v9.2 layout, write back. Each pre-rewrite header is captured into an in-memory undo list (vol_undo_info, max UNDO_LIST_SIZE = 32).
  2. Codeset patch in active log. The active log header carries a codeset field; if it disagrees with the catalog (db_root.codeset), fix_codeset_in_active_log patches it in place so the log’s interpretation of textual records matches the catalog after migration.
  3. Catalog reconcile via db_restart + synccoll_force. Boot the database with the new binary, walk every collation row in db_collation (via catcls_get_db_collation), compare each against the locale library’s view of the same collation — id match, name match, codeset match, checksum match (unless contractions are present, in which case the checksum is allowed to drift because the on-disk COLL_CONTRACTION representation was reorganised in 9.2). Then synccoll_force rewrites the collation rows from the locale library.
  4. Volume statistics refresh. file_update_used_pages_of_vol_header scans every volume to refresh the used-pages counters in the header — these were computed differently in 9.1 and need to be rebuilt at 9.2 semantics.
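The comparison rule in phase 3 can be sketched as a small predicate. This is a minimal illustration, not the code in migrate.c: the coll_view struct and the collation_matches function are hypothetical names, the checksum is modelled as a string, and the sketch assumes the contraction count is taken from the locale-library side.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one collation, used for both the
   catalog row and the locale-library entry. */
typedef struct {
    int id;
    const char *name;
    int codeset;
    const char *checksum;   /* modelled as a digest string */
    int count_contr;        /* number of contractions */
} coll_view;

/* Returns true when the catalog row agrees with the locale library.
   The checksum is compared only for collations without contractions
   (assumption: the count is read from the locale-library side). */
bool
collation_matches (const coll_view *db, const coll_view *lib)
{
    assert (db != NULL && lib != NULL);
    if (db->id != lib->id)
        return false;
    if (strcmp (db->name, lib->name) != 0)
        return false;
    if (db->codeset != lib->codeset)
        return false;
    if (lib->count_contr == 0 && strcmp (db->checksum, lib->checksum) != 0)
        return false;
    return true;
}
```

Under this rule a checksum difference is fatal for a plain collation but tolerated for one with contractions, which matches the allowance described in step 3.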

If any phase fails, the migrator unwinds: error_undo_vol_header re-applies every entry in the undo list in reverse order, restoring the original v9.1 headers; then error_undo_compat reverts the compat-level marker (check_and_fix_compat_level) so the database can be reopened with 9.1 binaries.
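The run-then-unwind control flow can be sketched as follows. This is an illustrative skeleton under stated assumptions, not the structure of main in migrate.c: migration_ctx, demo_phase, unwind, and run_phases are hypothetical names, and the demo phase only counts calls instead of touching any files.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for the demo. */
typedef struct {
    int fail_at;       /* 1-based phase that should fail, 0 = never */
    int phases_run;    /* how many phases were entered */
    int rolled_back;   /* set once the unwind has run */
} migration_ctx;

typedef int (*phase_fn) (migration_ctx *ctx);   /* returns 0 on success */

/* Stand-in for a real phase: fails when the demo knob says so. */
int
demo_phase (migration_ctx *ctx)
{
    ctx->phases_run++;
    return (ctx->phases_run == ctx->fail_at) ? -1 : 0;
}

/* Stand-in for the unwind: the real tool restores the saved headers in
   reverse order and then reverts the compat-level marker. */
void
unwind (migration_ctx *ctx)
{
    ctx->rolled_back = 1;
}

/* Runs the phases in order and stops at the first failure.
   Returns 0 on success, or the 1-based index of the failed phase. */
int
run_phases (const phase_fn *phases, size_t n, migration_ctx *ctx)
{
    assert (phases != NULL && ctx != NULL);
    for (size_t i = 0; i < n; i++) {
        if (phases[i] (ctx) != 0) {
            unwind (ctx);
            return (int) (i + 1);
        }
    }
    return 0;
}
```

Keeping the unwind in one place means every phase can fail the same way, by returning non-zero, without knowing what has to be restored.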

// migrate.c (paraphrased)
fix_all_volume_header (db_path) {
    walk volume info;
    for each volume {
        undo_page = make_volume_header_undo_page (vol_path, size);  // read + journal
        if (undo_page) {
            vol_undo_info[vol_undo_count++] = (vol_path, undo_page, size);
        }
        fix_volume_header (vol_path);  // in-place rewrite
    }
}

error_undo_vol_header:
    for (i = vol_undo_count - 1; i >= 0; i--) {
        undo_fix_volume_header (vol_undo_info[i].filename,
                                vol_undo_info[i].page,
                                vol_undo_info[i].page_size);
    }
    free_volume_header_undo_list ();

The journal is in-memory only — a process crash mid-rewrite loses the journal and leaves the volumes in mixed state. Operators are expected to take a cold backup before running migrate so a process-crash recovery path exists outside the tool.
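A runnable miniature of such an in-memory journal is sketched below. It is not the migrate.c implementation: undo_entry, undo_journal, journal_snapshot, and journal_rollback are hypothetical names, the entry holds an open FILE * where the real record stores a filename, and refusing new entries once the cap is reached is an assumption about how a full list could be handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define UNDO_CAP 32   /* mirrors the documented UNDO_LIST_SIZE */

typedef struct {
    FILE *vol;             /* open volume (the real entry stores a filename) */
    unsigned char *page;   /* saved copy of the original header bytes */
    size_t size;
} undo_entry;

typedef struct {
    undo_entry entries[UNDO_CAP];
    int count;
} undo_journal;

/* Saves the first `size` bytes of the volume in the journal.
   Returns 0 on success, -1 on I/O error or when the journal is full. */
int
journal_snapshot (undo_journal *j, FILE *vol, size_t size)
{
    assert (j != NULL && vol != NULL && size > 0);
    if (j->count >= UNDO_CAP)
        return -1;
    unsigned char *page = malloc (size);
    if (page == NULL)
        return -1;
    if (fseek (vol, 0L, SEEK_SET) != 0 || fread (page, 1, size, vol) != size) {
        free (page);
        return -1;
    }
    j->entries[j->count].vol = vol;
    j->entries[j->count].page = page;
    j->entries[j->count].size = size;
    j->count++;
    return 0;
}

/* Writes the saved headers back, newest first, and empties the journal.
   Returns the number of entries that could not be restored. */
int
journal_rollback (undo_journal *j)
{
    int failed = 0;
    assert (j != NULL);
    for (int i = j->count - 1; i >= 0; i--) {
        undo_entry *e = &j->entries[i];
        if (fseek (e->vol, 0L, SEEK_SET) != 0
            || fwrite (e->page, 1, e->size, e->vol) != e->size
            || fflush (e->vol) != 0)
            failed++;
        free (e->page);
        e->page = NULL;
    }
    j->count = 0;
    return failed;
}
```

Because the saved pages exist only on the heap, killing the process discards them, which is exactly the limitation described above.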

The signal handler (intr_handler, installed for SIGINT and Windows console-control) sets a flag the loop checks between volumes, allowing a graceful Ctrl-C that triggers the same undo sequence rather than aborting mid-volume.
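The flag-polling pattern looks roughly like this in standard C. The sketch covers only the SIGINT side (no Windows console-control handling), and every name in it (interrupted, intr_flag_handler, install_interrupt_handler, process_volumes, demo_rewrite) is hypothetical rather than taken from migrate.c.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Set by the handler, polled by the loop. sig_atomic_t is the only
   object type the C standard guarantees can be written safely here. */
static volatile sig_atomic_t interrupted = 0;

void
intr_flag_handler (int sig_no)
{
    (void) sig_no;
    interrupted = 1;
}

void
install_interrupt_handler (void)
{
    signal (SIGINT, intr_flag_handler);
}

typedef int (*volume_fn) (size_t index);   /* returns 0 on success */

/* Processes volumes in order and checks the flag only between volumes,
   so a volume that has been started is always finished.
   Returns the number of completed volumes; *aborted is set to 1 when
   the loop stopped early and the caller should run the undo sequence. */
size_t
process_volumes (size_t n, volume_fn fn, int *aborted)
{
    size_t done = 0;
    assert (fn != NULL && aborted != NULL);
    *aborted = 0;
    for (size_t i = 0; i < n; i++) {
        if (interrupted || fn (i) != 0) {
            *aborted = 1;
            break;
        }
        done++;
    }
    return done;
}

/* Demo step: simulates Ctrl-C arriving while volume 1 is being rewritten. */
int
demo_rewrite (size_t index)
{
    if (index == 1)
        raise (SIGINT);
    return 0;
}
```

With five volumes and the demo step, the loop completes volumes 0 and 1 and then stops before volume 2, leaving the caller to run the same rollback it would use for any other failure.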

r91_disk_var_header — the legacy on-disk shape

// migrate.c: the format being migrated FROM
struct r91_disk_var_header {
    char magic[CUBRID_MAGIC_MAX_LENGTH];
    INT16 iopagesize;
    INT16 volid;
    DISK_VOLPURPOSE purpose;
    INT32 sect_npgs;
    INT32 total_sects;
    INT32 free_sects;
    INT32 hint_allocsect;
    INT32 total_pages;
    INT32 free_pages;
    INT32 sect_alloctb_npages;
    INT32 page_alloctb_npages;
    INT32 sect_alloctb_page1;
    INT32 page_alloctb_page1;
    INT32 sys_lastpage;
    INT64 db_creation;
    INT32 max_npages;
    INT32 dummy;
    LOG_LSA chkpt_lsa;
    HFID boot_hfid;
    INT16 offset_to_vol_fullname;
    INT16 offset_to_next_vol_fullname;
    INT16 offset_to_vol_remarks;
    char var_fields[1];
};

The 9.2 header (defined in disk_manager.{c,h}) reorganises this into a different field layout — sector-allocation tracking moves to a separate per-volume metadata page, the dummy field is repurposed, and several offsets are recalculated. fix_volume_header contains the field-by-field translation; the migrator is the only place in the codebase that knows the v9.1 layout.
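The shape of a field-by-field translation can be shown with two toy layouts. Both structs below are invented for illustration: toy_old_header borrows a handful of field names from r91_disk_var_header, while toy_new_header is entirely hypothetical and does not reflect the real 9.2 header in disk_manager.{c,h}.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy "old" layout: a small subset of the legacy fields. */
typedef struct {
    int16_t iopagesize;
    int16_t volid;
    int32_t total_pages;
    int32_t free_pages;
    int32_t dummy;
} toy_old_header;

/* Toy "new" layout: hypothetical field order, NOT the real 9.2 header. */
typedef struct {
    int16_t volid;
    int16_t iopagesize;
    int32_t total_pages;
    int32_t free_pages;
    int32_t reserved;
} toy_new_header;

/* Rewrites the header stored at the start of `page` in place. */
void
toy_fix_header (unsigned char *page)
{
    toy_old_header old_hdr;
    toy_new_header new_hdr;

    assert (page != NULL);
    /* Copy out first, so the in-place write cannot clobber unread fields. */
    memcpy (&old_hdr, page, sizeof old_hdr);
    memset (&new_hdr, 0, sizeof new_hdr);

    new_hdr.volid = old_hdr.volid;
    new_hdr.iopagesize = old_hdr.iopagesize;
    new_hdr.total_pages = old_hdr.total_pages;
    new_hdr.free_pages = old_hdr.free_pages;
    new_hdr.reserved = 0;   /* the old filler value is dropped */

    memcpy (page, &new_hdr, sizeof new_hdr);
}
```

Reading the whole old header into a local struct before writing anything is what makes the in-place rewrite safe even when the two layouts overlap byte for byte.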

check_and_fix_compat_level writes a sentinel to the database indicating “migration in progress.” The sentinel is cleared when the migrator finishes successfully or is interrupted cleanly. While the sentinel is set, neither 9.1 nor 9.2 binaries will mount the database, which prevents double-migration and accidental access during the migration window.
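The mount rule implied by the sentinel can be modelled in a few lines. The enum values and the can_mount function are hypothetical; the sketch assumes the on-disk marker takes one of three states and that a binary mounts only a database whose marker equals its own level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker states; the real encoding lives in the CUBRID sources. */
typedef enum {
    COMPAT_V9_1,
    COMPAT_MIGRATING,   /* the "migration in progress" sentinel */
    COMPAT_V9_2
} compat_marker;

/* A server binary mounts only a database whose marker equals its own
   level, so the sentinel state is rejected by both versions. */
bool
can_mount (compat_marker on_disk, compat_marker binary_level)
{
    assert (binary_level != COMPAT_MIGRATING);
    return on_disk == binary_level;
}
```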

// migrate.c::main (paraphrased)
if (rel_disk_compatible () != V9_2_LEVEL) {
    printf ("CUBRID library version is invalid.\n"
            "Please upgrade to CUBRID 9.2 and retry migrate.\n");
    return EXIT_FAILURE;
}

rel_disk_compatible() returns the linked CUBRID library’s disk-compat level. The migrator refuses to start unless the linked level is exactly 9.2 — preventing the binary from being mistakenly used to migrate to a different target version. This is why migrate is a strictly-version-locked tool: it’s not just “upgrade to whatever is current”; it’s “9.1 → 9.2 and nothing else.”

| Symbol | Role |
| --- | --- |
| main | Entry; arg check; version guard; four-phase orchestration; error rollback |
| intr_handler | SIGINT handler; sets interrupt flag for graceful abort |
| fix_all_volume_header | Phase 1 driver: walks volume info, calls fix_volume_header per volume |
| fix_volume_header | Per-volume header rewrite (v9.1 → v9.2) |
| make_volume_header_undo_page | Read existing header into a side buffer for the undo journal |
| undo_fix_volume_header | Restore one entry from the undo journal |
| free_volume_header_undo_list | Drop the in-memory undo journal after success |
| get_active_log_vol_path | Find the active log path from the volume-info file |
| get_db_path | Resolve database name → full path (via databases.txt) |
| check_and_fix_compat_level | Write the migration-in-progress sentinel |
| fix_codeset_in_active_log | Phase 2: patch the active log header’s codeset field |
| get_codeset_from_db_root | Read the catalog’s authoritative codeset for the patch |
| r91_disk_var_header (struct) | Legacy v9.1 volume-header layout |
| VOLUME_UNDO_INFO (struct) | Per-undo-entry record (filename + page bytes + size) |
| vol_undo_info (global array) | The undo journal, max 32 entries |

| Symbol | Path |
| --- | --- |
| main | src/executables/migrate.c:404 |
| r91_disk_var_header (struct) | src/executables/migrate.c:115 |
| fix_all_volume_header (declaration) | src/executables/migrate.c:155 |
| get_active_log_vol_path | src/executables/migrate.c:167 |
| fix_codeset_in_active_log | src/executables/migrate.c:200 |
| V9_1_LEVEL / V9_2_LEVEL (defines) | src/executables/migrate.c:49–50 |

Symbol names are the canonical anchor; line numbers are hints that were accurate as of the document's updated: date.

  • One-pair scope by design. Modern CUBRID (10.x and later) uses a different upgrade path — release notes typically prescribe dump-and-reload via unloaddb / loaddb rather than an in-place migrator. This binary is preserved for archaeological / disaster-recovery scenarios where someone is upgrading a pre-9.2 database.
  • In-memory undo journal limits. UNDO_LIST_SIZE = 32 means a database with more than 32 volumes cannot be rolled back beyond the most recent 32 in-flight rewrites. This is acceptable in practice because mid-migration aborts for crash reasons (not signal reasons) lose the journal anyway.
  • Collation contraction allow-list. The count_contr == 0 check around the checksum compare exists because 9.2’s COLL_CONTRACTION struct was reorganised; collations with contractions deliberately have a different on-disk checksum in 9.2 even though the user-visible behaviour is unchanged. Bypassing the checksum compare for those collations is a deliberate trust call.
  • AU_DISABLE_PASSWORDS then db_login("DBA", NULL). The migrator disables password checks and then logs in as DBA with a NULL password, so the login is nominal rather than authenticated. This is safe because the migrator operates locally on the database files, and the compat-level sentinel blocks normal mounts, so no second connection can exist.
  • No cubrid migrate verb. Unlike most utilities, migrate is a standalone binary, not registered in ua_Utility_Map (see cubrid-cub-admin.md). Operators invoke it directly as migrate <db_name>. This deliberate omission matches its narrow scope — it doesn’t fit the general admin-CLI shape.
  • Modern format upgrade path. No documented equivalent of this tool exists for, e.g., 10.x → 11.x. The dump-and-reload path is implicit in release notes; whether a future version would benefit from a similar in-place migrator is undecided.
  • Cold-backup requirement. Operators must take a cold backup before running migrate; the tool itself doesn’t enforce this. A pre-flight check that refuses to run unless a recent backup is detected would harden the process.
  • The 32-entry undo cap is large enough for typical databases but small for large multi-volume installs. Whether to widen it (or move the journal to an on-disk file) is a scoping question that hasn’t come up because the tool isn’t in active development.
  • src/executables/migrate.c — the entire utility (single file, ~830 lines)
  • src/executables/AGENTS.md — agent guide
  • Adjacent docs: cubrid-disk-manager.md (the modern volume-header format that migrate produces), cubrid-charset-collation.md (collation infrastructure that phase 3 reconciles), cubrid-log-manager.md (the active-log header that phase 2 patches), cubrid-boot.md (the database-boot path that phase 3 exercises via db_restart), cubrid-cub-admin.md (the unified admin CLI; migrate is not in ua_Utility_Map)