CUBRID unloaddb — Schema and Data Export, Four-File Output Layout, and the Per-Class Multi-Thread · Multi-Process Driver

A logical exporter is the database utility that walks the catalog and writes out a textual or near-textual representation of the schema and data, suitable for re-import by the engine’s loader into another database (same engine, possibly different version, or a fresh empty database for cross-version migration / DR drills / test-fixture builds). Every relational engine has one — pg_dump, mysqldump, expdp, mongodump — and every implementation makes the same set of design choices.

Three axes dominate:

  1. Schema-first vs. data-first ordering. Almost every loader needs the schema to exist before the data can be inserted, so exporters emit schema first and data second. The interesting question is what counts as schema: table definitions are obvious; views depend on tables and must come after; triggers reference tables (and may reference each other); indexes can be built from already-loaded data and are therefore deferable. The ordering choice — and which artefacts are deferred to a post-load pass — is the export’s contract with the loader.

  2. One file vs. many files. A single export file is simple to ship around and easy to replay (cat dump.sql | psql). Many files (one per phase, or one per class) parallelise both export and import, and let operators selectively reload pieces. The tradeoff is operational: many files mean an exporter process coordinates output streams, and an importer needs to know the filename convention.

  3. Snapshot semantics. What counts as “the database” at the moment of export? A naive read sees committed-but-then-changed rows differently for early-extracted vs. late-extracted classes. Engines with MVCC let the exporter pin a single snapshot (PostgreSQL pg_dump uses repeatable-read; CUBRID exposes the --latest-image toggle to choose between the last committed image vs. the snapshot at export start). Exporters that don’t pin a snapshot rely on the operator to suspend writes — which is sometimes acceptable for cold backups, never for online ones.

The unloaddb tool addresses all three with fixed choices for ordering (schema → triggers → indexes → data), a configurable split (one file per phase by default; one file per class as an option), and an explicit --latest-image snapshot toggle.

| Engine | Phase ordering | File layout | Concurrency | Snapshot |
|---|---|---|---|---|
| PostgreSQL pg_dump | data is bracketed by --section=pre-data (schema), --section=data, --section=post-data (indexes, FKs, triggers) | one file (text/custom/directory format); directory format → one file per relation | --jobs N (directory format only) — parallel per-table COPY | repeatable-read snapshot pinned at start |
| MySQL mysqldump | DDL inline before each INSERT | one file by default; --tab makes per-table .sql + .txt pairs | none in mysqldump itself; mysqlpump is the parallel variant | --single-transaction pins via REPEATABLE READ; otherwise no snapshot |
| Oracle expdp (Data Pump) | METADATA_ONLY / DATA_ONLY phases; full export emits both with metadata first inside the dumpfile | one or many .dmp dumpfiles in a directory object; format is binary | PARALLEL=N worker processes | FLASHBACK_TIME / FLASHBACK_SCN |
| MongoDB mongodump | metadata .metadata.json + BSON .bson per collection | one directory per database, one file pair per collection | implicit per-collection parallelism | none by default; --snapshot for collection-scan consistency |
| CUBRID unloaddb | schema → triggers → indexes → data (loaddb replays data before indexes) | four files (_schema/_trigger/_indexes/_objects) by default; --datafile-per-class adds one data file per class | --thread-count N (per-class fetch parallelism); --mt-process N/M (whole-process partition by class) | --latest-image toggle: latest committed vs. start-of-export |

CUBRID’s distinguishing choice is the two-axis concurrency: either thread-pool inside one process (--thread-count) for network-bound fetch parallelism, or multiple processes (--mt-process N/M) where each process exports a disjoint subset of the class list and writes to a per-process file prefix. The two modes are mutually exclusive at invocation time but address different bottlenecks — threads for network round-trip latency, processes for CPU-bound text serialisation.

A vanilla unloaddb mydb produces four files in the output directory:

| File | Contents | Loaded by |
|---|---|---|
| <prefix>_schema | CREATE CLASS ... for tables, CREATE VCLASS ... for views, in a topologically-correct order; column definitions, primary keys, owner / auth grants when do_auth = 1 | loaddb phase 1 (schema) |
| <prefix>_trigger | CREATE TRIGGER ... statements for every trigger; emitted after schema because triggers reference classes | loaddb phase 2 (triggers) |
| <prefix>_indexes | CREATE INDEX ... statements for every secondary index; emitted last so indexes can be built post-load over fully populated heaps (avoids the per-row index-maintenance cost during the data insert) | loaddb phase 4 (after data) |
| <prefix>_objects | The CUBRID object-loader format — one record per row, schema-aware text encoding | loaddb phase 3 (data) |

<prefix> defaults to the database name and is overridden by --output-prefix / -p. --output-path / -od selects the output directory.

The phase order in the export (schema → trigger → index → data) is not the load order. Loaddb consumes them as: schema first (must be present for any data), then triggers (must be present before data so AFTER triggers fire correctly during the load — unless triggers are explicitly disabled with --no-trigger), then data, then indexes (built over the now-populated heaps in one batch, much faster than maintaining them per-row).

The unloaddb function in unloaddb.c is the SA/CS dual entry loaded by the unified cubrid admin CLI (see cubrid-cub-admin.md §“The database-admin arm”). It takes a UTIL_FUNCTION_ARG * and unpacks ~25 options grouped into five families:

```c
// unloaddb — unloaddb.c
input_filename = utility_get_option_string_value (arg_map, UNLOAD_INPUT_CLASS_FILE_S, 0);
include_references = utility_get_option_bool_value (arg_map, UNLOAD_INCLUDE_REFERENCE_S);
required_class_only = utility_get_option_bool_value (arg_map, UNLOAD_INPUT_CLASS_ONLY_S);
datafile_per_class = utility_get_option_bool_value (arg_map, UNLOAD_DATAFILE_PER_CLASS_S);
est_size = utility_get_option_int_value (arg_map, UNLOAD_ESTIMATED_SIZE_S);
cached_pages = utility_get_option_int_value (arg_map, UNLOAD_CACHED_PAGES_S);
output_dirname = utility_get_option_string_value (arg_map, UNLOAD_OUTPUT_PATH_S, 0);
do_schema = utility_get_option_bool_value (arg_map, UNLOAD_SCHEMA_ONLY_S);
do_objects = utility_get_option_bool_value (arg_map, UNLOAD_DATA_ONLY_S);
latest_image_flag = utility_get_option_bool_value (arg_map, UNLOAD_LATEST_IMAGE_S);
output_prefix = utility_get_option_string_value (arg_map, UNLOAD_OUTPUT_PREFIX_S, 0);
hash_filename = utility_get_option_string_value (arg_map, UNLOAD_HASH_FILE_S, 0);
verbose_flag = utility_get_option_bool_value (arg_map, UNLOAD_VERBOSE_S);
order = utility_get_option_bool_value (arg_map, UNLOAD_KEEP_STORAGE_ORDER_S)
  ? FOLLOW_STORAGE_ORDER : FOLLOW_ATTRIBUTE_ORDER;
split_schema_files = utility_get_option_string_value (arg_map, UNLOAD_SPLIT_SCHEMA_FILES_S, 0);
is_as_dba = utility_get_option_string_value (arg_map, UNLOAD_AS_DBA_S, 0);
g_pre_alloc_varchar_size = utility_get_option_int_value (arg_map, UNLOAD_STRING_BUFFER_SIZE_S);
g_request_pages = utility_get_option_int_value (arg_map, UNLOAD_REQUEST_PAGES_S);
thread_count = utility_get_option_int_value (arg_map, UNLOAD_THREAD_COUNT_S);
sampling_records = utility_get_option_int_value (arg_map, UNLOAD_SAMPLING_TEST_S);
/* --mt-process N/M parsed as "process_idx/process_total" */
```

Five families:

  • What to extract. --schema-only / --data-only, --input-class-file, --input-class-only, --include-reference.
  • Where to put it. --output-path, --output-prefix, --datafile-per-class, --split-schema-files.
  • How to fetch. --cached-pages, --request-pages, --string-buffer-size, --latest-image, --keep-storage-order.
  • How to parallelise. --thread-count or --mt-process N/M.
  • Auth / verbose. --user, --password, --as-dba, --verbose, --enhanced-estimates, --sampling-test, --estimated-size, --hash-file.

After option parse and db_restart_ex, the unloaddb function runs in two halves gated by do_schema / do_objects:

```c
// unloaddb (paraphrased)
if (!status && (do_schema || !do_objects))      // schema half
  {
    if (g_parallel_process_cnt <= 1 || g_parallel_process_idx == 1)
      {
        create_filename_trigger (output_dirname, output_prefix,
                                 trigger_output_filename, ...);
        create_filename_indexes (output_dirname, output_prefix,
                                 indexes_output_filename, ...);
        unload_context.do_auth = 1;
        unload_context.storage_order = order;
        unload_context.exec_name = exec_name;
        unload_context.login_user = user;
        unload_context.output_prefix = output_prefix;
        extract_classes_to_file (unload_context);                           // _schema
        extract_triggers_to_file (unload_context, trigger_output_filename); // _trigger
        extract_indexes_to_file (unload_context, indexes_output_filename);  // _indexes
        unload_context.clear_schema_workspace ();
      }
  }

AU_SAVE_AND_ENABLE (au_save);
if (!status && (do_objects || !do_schema))      // data half
  {
    extract_objects (unload_context, output_dirname,
                     thread_count, sampling_records, enhanced_estimates);
  }
AU_RESTORE (au_save);
db_shutdown ();
```
Two important conditions:

  • Schema half runs only in process 1 of an --mt-process N/M partition. There’s only one _schema / _trigger / _indexes file regardless of process count; emitting it from every process would race the file system. Process 1 emits, processes 2..N skip and proceed straight to the data half.
  • Auth toggle around the data half. AU_SAVE_AND_ENABLE / AU_RESTORE brackets the data extraction so that even if the caller arrived with auth disabled (e.g., from an internal script), the actual row-by-row read enforces the connecting user’s authorisation. The schema half doesn’t need this because its emitter explicitly checks is_dba_user / is_dba_group_member via extract_context.

extract_context (declared in extract_schema.hpp) is the bag threaded through every emitter function:

| Field | Purpose |
|---|---|
| output_dirname | Where files go |
| output_prefix | Per-database filename prefix |
| exec_name | "unloaddb" — for error messages |
| login_user | The connecting user |
| is_dba_user / is_dba_group_member | Authorisation flags resolved at startup |
| do_auth | Whether to emit GRANT statements (1 = yes) |
| storage_order | FOLLOW_STORAGE_ORDER (column order on disk) vs. FOLLOW_ATTRIBUTE_ORDER (declaration order from catalog); affects column ordering in CREATE CLASS and the data file |
| classes | DB_OBJLIST of class objects to export — populated by extract_classes_to_file for the schema half, reused for triggers/indexes |
| has_indexes | Cached flag set during schema extraction |
| vclass_list_has_using_index | View-level USING INDEX tracker |
| clear_schema_workspace() | Frees the per-half workspace |

The struct’s role is to keep the emitter functions stateless across classes — every call passes the context, and the context carries all the policy.

Schema extraction (extract_classes_to_file → emit_schema)
extract_classes_to_file (defined in unload_schema.c:1378) opens <prefix>_schema, populates ctxt.classes from the catalog, emits per-user-class headers, and calls emit_schema(ctxt, output, EXTRACT_CLASS) to write tables — then a second call with EXTRACT_VCLASS to write views. The two-pass split is what enforces “tables before views”; if any view references a table not yet emitted, the second pass fails with an explicit error rather than producing a non-replayable file.

emit_schema itself iterates ctxt.classes and per class produces the full CREATE CLASS statement: column list with CHAR(N) / NUMERIC(p,s) / etc. domain renderers; constraints (PRIMARY KEY, UNIQUE, NOT NULL inline; FOREIGN KEY at end); inheritance / partition clauses; OWNER name and GRANT ... TO ... when do_auth. The storage_order setting picks between two column-iteration helpers: attribute-order is the catalog declaration order (matches the user’s mental model), storage-order matches the on-disk slot layout (matters for cross-platform binary-tier replication).

extract_triggers_to_file and extract_indexes_to_file follow the same pattern: open output, walk the catalog, emit CREATE TRIGGER / CREATE INDEX per object. Triggers are emitted with their condition expressions verbatim; indexes preserve INDEX hint and filtered-index WHERE clauses.

Data extraction (extract_objects → process_class)

extract_objects (defined in unload_object.c:699) is the data half’s entry. It:

  1. Re-walks the catalog (or the input class file if -i was given) to build the class list.
  2. If --mt-process N/M, partitions the class list — class i is handled by process (i mod M) + 1. The partition is over the class list, so each process touches a disjoint subset of tables; combined output is the union of per-process <prefix>_objects files (or per-prefix-and-class files when --datafile-per-class is also on).
  3. For each class assigned to this process, calls process_class.

process_class (line 1773) is the per-class loop. It:

  • Issues a server-side fetch for the class’s heap pages via db_get_all_objects / cursor APIs, batched by --cached-pages (default 100) and --request-pages (default 100).
  • For each fetched object, walks the class’s attribute list per storage_order and writes one record to the _objects file in the loader’s text format: %id|attr1=val1, attr2=val2, ... with type-specific encoders for OID references, set / sequence / multiset, JSON, LOB locators, MVCC visibility metadata.
  • If --datafile-per-class, the per-class output stream is a separate file <prefix>_<class>_objects; otherwise everything appends to the single _objects file with class headers separating sections.

When --thread-count > 1, process_class splits the per-class fetch across the configured number of worker threads — each thread fetches a disjoint range of OIDs and serialises to a per-thread buffer, then the buffers are concatenated under a class-level lock before being flushed to the output file. This parallelises the network round-trip, which is the dominant cost on remote-database exports.

--latest-image toggles between two visibility modes for the fetch:

  • Default (off). Fetch with the snapshot taken at the start of extract_objects — every class sees a consistent view of committed-as-of-start data.
  • --latest-image on. Fetch the most recent committed version of each row, regardless of when other classes were extracted. Faster (no snapshot tracking) but produces an export that may reflect transactions committed during the export run, which would surprise a downstream consumer expecting a coherent snapshot.

Class selection (-i input-class-file, -io input-class-only, -ir include-reference)

-i <file> restricts the export to a list of class names read from a file (one per line). -io (input-class-only) further restricts to only those classes — the default behaviour is to also pull in classes referenced from the listed ones (foreign keys, set element domains, view base classes), recursively, until closure.

-ir (include-reference) is the explicit form of the reference crawl — used when the user wants the reference-following behaviour without using -i at all (i.e., on a class list derived elsewhere).

required_class_only = true (set from -io) is checked inside extract_classes_to_file at the catalog walk: classes not in the input list are skipped even if they’re referenced from listed ones.

By default, unloaddb extracts only classes owned by the connecting user (or all classes if the user is DBA). --as-dba asks for cross-owner extraction — useful when a non-DBA admin wants to dump another user’s schema for debugging or migration. The flag is gated:

```c
// unloaddb (paraphrased)
if (is_as_dba)
  {
    unload_context.is_dba_group_member = au_is_dba_group_member (Au_user);
    if (!unload_context.is_dba_group_member)
      {
        fprintf (stderr, "\n--%s is an option available only when "
                 "the user is a DBA Group.\n", UNLOAD_AS_DBA_L);
        goto end;
      }
  }
else
  {
    unload_context.is_dba_user = ws_is_same_object (Au_dba_user, Au_user);
  }
```

A user not in the DBA group invoking --as-dba is rejected before any extraction; the option is therefore an explicit operator opt-in rather than something that silently degrades.

Estimation aids (--sampling-test, --enhanced-estimates)

--sampling-test N instructs extract_objects to fetch only the first N records per class — used for export-pipeline timing and disk-budget estimation without paying the full extraction cost. The normal _objects output is produced; downstream tooling that just wants byte counts can use this and divide by class cardinality to estimate full-export size.

--enhanced-estimates (only meaningful with --verbose) adds fine-grained timing breakdowns per class: catalog read time, per-row serialisation time, write time, lock time. It costs ~5% overhead and is gated behind --verbose because the extra rows in the report would otherwise pollute clean export logs.

When the schema half produces tens or hundreds of CREATE / GRANT statements, --split-schema-files writes one schema file per class instead of one big <prefix>_schema. Useful for downstream tooling (diffs, PR review, partial reload) that wants to operate on per-class units. The trigger and index files are not split.

--mt-process N/M means “this process is partition N of M total, doing the data half only”. Examples:

  • unloaddb --mt-process 1/4 mydb — process 1 of 4; emits the schema half (because it’s process 1) and the data half for classes assigned to partition 1.
  • unloaddb --mt-process 2/4 mydb — process 2 of 4; skips the schema half (only process 1 emits it); data half for classes assigned to partition 2.

Each process gets its own <prefix>_objects (or per-class if --datafile-per-class) named with the partition index in the prefix. The downstream loaddb concatenates them or loads them sequentially.

The per-class assignment is deterministic on the class list order returned by the catalog walk: class index i goes to partition (i mod M) + 1. This means a class added to the catalog between two unloaddb runs may end up in a different partition the second time — operators should not rely on per-partition class membership across runs.

The MAX_PROCESS_COUNT = 36 bound is hard-coded; the --thread-count bound is MAX_THREAD_COUNT = 127. The two flags are mutually exclusive at invocation time — --thread-count > 1 combined with --mt-process N/M (M > 1) is rejected.

Driver (unloaddb.c):

| Symbol | Role |
|---|---|
| unloaddb | Library entry; option parse; auth setup; orchestrates schema-half then data-half; calls db_shutdown at end |
| unload_usage | Prints usage from message catalog |
| do_multi_processing (gated by MULTI_PROCESSING_UNLOADDB_WITH_FORK) | Optional fork-spawn of N children when in fork-mode multi-process; not the default code path |
| output_prefix, output_dirname, etc. (file-scope globals) | Carry the option values into helpers that don't take the context |

Schema emitters (unload_schema.c):

| Symbol | Role |
|---|---|
| extract_classes_to_file | Opens <prefix>_schema; walks catalog; calls emit_schema(EXTRACT_CLASS) then emit_schema(EXTRACT_VCLASS) |
| emit_schema | Per-class emitter; renders CREATE CLASS / CREATE VCLASS with columns, constraints, owner, grants |
| extract_triggers_to_file | Opens <prefix>_trigger; per-trigger CREATE TRIGGER emit |
| extract_indexes_to_file | Opens <prefix>_indexes; calls emit_indexes |
| emit_indexes | Per-class index iteration; emits CREATE INDEX with hint / filtered clauses preserved |
| Per-domain renderers (emit_domain, emit_attribute_def, emit_constraint_def, emit_partition_def, etc.) | Type-specific emitters used by emit_schema |

Data extraction (unload_object.c):

| Symbol | Role |
|---|---|
| extract_objects | Data-half entry; partitions class list per --mt-process; loops over classes calling process_class |
| process_class | Per-class fetch + serialise loop; honours --cached-pages, --request-pages, --thread-count, --latest-image |
| Per-DB_TYPE writers (object/set/sequence/multiset/JSON/OID/LOB) | Type-specific text encoders into the loader format |

Output-file plumbing (unload_object_file.{c,h})

| Symbol | Role |
|---|---|
| Output-file open / close / flush helpers | Used by the data-half writers; abstracts per-thread buffering and the eventual write to <prefix>_objects (or the per-class file) |
| Filename builders (create_filename_*) | <prefix>_schema, <prefix>_trigger, <prefix>_indexes, <prefix>_objects, plus the per-class variants when --datafile-per-class |

Context struct (extract_schema.{cpp,hpp}):

| Symbol | Role |
|---|---|
| extract_context (struct) | The bag threaded through every emitter; fields described above |
| extract_context::clear_schema_workspace | Frees ctxt.classes and per-half scratch state |
| Symbol | Path |
|---|---|
| unloaddb (entry) | src/executables/unloaddb.c:115 |
| unload_usage | src/executables/unloaddb.c:99 |
| Option-unpack block | src/executables/unloaddb.c:144–227 |
| Schema-half driver | src/executables/unloaddb.c:404–455 |
| Data-half driver | src/executables/unloaddb.c:457–492 |
| extract_classes_to_file | src/executables/unload_schema.c:1378 |
| extract_triggers_to_file | src/executables/unload_schema.c:1534 |
| extract_indexes_to_file | src/executables/unload_schema.c:1606 |
| emit_schema | src/executables/unload_schema.c:1708 |
| emit_indexes | src/executables/unload_schema.c:1671 |
| extract_objects | src/executables/unload_object.c:699 |
| process_class | src/executables/unload_object.c:1773 |
| process_class declaration | src/executables/unload_object.c:394 |
| extract_context (struct) | src/executables/extract_schema.hpp |

Symbol names are the canonical anchor; line numbers are hints valid as of the document's updated: date.

  • Loaddb’s load order is not unloaddb’s emit order. unloaddb emits schema → trigger → index → data; loaddb consumes schema → trigger → data → index. Index creation is deferred at load time to avoid per-row maintenance, which is why the index file is third in the unload but fourth in the load.
  • Schema-only and data-only are not strictly inverse predicates. --schema-only runs the schema half and skips the data half; --data-only runs the data half and skips the schema half; neither set means both run. The default is both. Expressing schema-and-trigger-but-no-index isn’t directly available — the operator would have to redirect or split files post-hoc.
  • --mt-process is the only multi-process mode actually used. An older MULTI_PROCESSING_UNLOADDB_WITH_FORK code path exists in the source (gated by a build define) that did fork-spawning inside one invocation; it’s not in the default build. The modern path is multiple invocations of the binary with different --mt-process N/M values, each producing per-partition output, coordinated by an outside script.
  • --use-delimiter is deprecated. Still parsed (the value is read and discarded) for backward compatibility with older scripts; has no effect.
  • Owner / grant emission is gated on do_auth = 1 which is hard-coded in the schema-half driver. There’s no flag to suppress authorization in the schema export — the schema file always includes OWNER and GRANT clauses if the connecting user can see them.
  • --latest-image and --mt-process interact unintuitively. Without --latest-image, each process pins its own snapshot, so partitions can see different snapshot-of-database states unless scheduled to start within the same wall-clock instant. With --latest-image, every process sees latest committed; coherence is the operator’s problem either way for multi-process exports.
  • storage_order = FOLLOW_STORAGE_ORDER changes both the schema column declaration order and the data file’s attribute-list order — the loader reads columns positionally, so the two must agree. If they disagree (e.g., from a manually edited schema file), loaddb mis-assigns values silently. This is why the flag is rarely used in practice — it’s there for cross-platform binary-tier replication where on-disk order matters and the schema file is regenerated.
  • AU_SAVE_AND_ENABLE around the data half uses the snapshot of Au_user taken at startup, not whatever Au_user happens to be at half-boundary. A schema half that ran with --as-dba doesn’t re-elevate the data half — both halves run with the same Au_user, just with the data half’s authorisation force-enabled.
  • Per-class data-file naming with --mt-process + --datafile-per-class. The collision rules between the partition-index-prefixed naming and the per-class-suffixed naming aren’t documented; in practice the output is <prefix>_<partidx>_<class>_objects but the filename builders aren’t called out in AGENTS.md.
  • Server-side cost of --thread-count. The client opens N parallel cursors, and each cursor consumes server resources (a thread-pool worker per fetch). If the server's cubrid.conf thread_count is lower than --thread-count, the export is bottlenecked at server capacity; the client does not warn about this interaction.
  • Resumability. A failed export leaves a partial _objects file; there’s no resume mode. Operators rerun from scratch. Some engines (Oracle Data Pump) support RESUME jobs; CUBRID does not.
  • Compression. No built-in compression; the operator pipes the output through gzip post-hoc. mongodump and pg_dump’s custom format both support inline compression.
  • Cross-version compatibility of _objects format. The loaddb format is documented as stable across CUBRID versions but the exact compatibility matrix (especially around new data types like JSON Tables, TDE-encrypted classes) isn’t recorded.
  • src/executables/unloaddb.c — entry, option parse, two-half orchestration, multi-process partition logic
  • src/executables/unloaddb.h — option-string constants, shared types
  • src/executables/unload_schema.c — schema / trigger / index emitters; per-domain and per-constraint renderers
  • src/executables/unload_object.c — data extraction driver and per-class loop
  • src/executables/unload_object_file.{c,h} — output-file plumbing, per-thread buffering, filename builders
  • src/executables/extract_schema.{cpp,hpp} — extract_context struct definition and helpers
  • src/executables/AGENTS.md — agent guide; binary→source mapping
  • Adjacent docs: cubrid-loaddb.md (the inverse — loaddb is the consumer of every file unloaddb produces and its phase order is what dictates the unload’s emit order), cubrid-cub-admin.md (the unified cubrid admin entry that loads the unloaddb symbol; the SA/CS routing per --SA-mode / --CS-mode), cubrid-csql.md (the --loaddb-output mode that lets csql emit data in the same loader format unloaddb produces, useful for ad-hoc partial exports without invoking unloaddb at all), cubrid-class-object.md (the catalog walk that schema extraction relies on), cubrid-locator.md (the OID materialisation that data extraction depends on), cubrid-mvcc.md (the snapshot semantics underlying --latest-image)