CUBRID Lock-Free Bitmap — Chunked Atomic Allocator for Per-Thread Indexes and Slot Pools
Contents:
- Theoretical Background
- Common DBMS Design
- CUBRID’s Approach
- Source Walkthrough
- Source verification (as of 2026-05-07)
- Beyond CUBRID — Comparative Designs & Research Frontiers
- Sources
Theoretical Background
A bitmap allocator is the simplest concurrent slot-allocator shape: one bit per slot, set means “in use,” cleared means “free.” Allocation finds a clear bit and CAS-flips it; release clears the bit. The textbook references are short — every operating-systems text mentions bitmap allocators in passing (Tanenbaum, Modern Operating Systems, ch. 4 on memory management) and concurrent-data-structures texts treat them as the warm-up for harder structures.
The interesting design decisions are around scaling:
- Word size. A single word’s worth of bits (32 or 64) gives a single CAS for any in-word allocation. A larger pool needs multiple words, and the allocator must visit them.
- Search strategy. Linear scan from word 0 every time produces a hot first word (cache-line ping-pong on a busy pool); rotating the start position spreads the load.
- Full-pool detection. Cheap O(1) check: an atomic counter `entry_count_in_use` compared against the capacity.
- Padding the last word. If the bit count is not a multiple of the word size, the trailing bits (which represent non-existent slots) must be permanently set to “in use” so a `~chunk == 0` test correctly excludes them.
CUBRID’s lockfree::bitmap uses 32-bit words (unsigned int),
linear scan with a rotating start hint, an explicit
entry_count_in_use for LIST_OF_CHUNKS style, and pre-set
trailing bits in the last word.
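Before the multi-word version, the one-word textbook shape is worth seeing concretely. The sketch below is illustrative, not CUBRID code (the type and method names are ours): a single 32-bit word, a linear scan for a clear bit, and a CAS to flip it.

```cpp
#include <atomic>
#include <cassert>

// Illustrative single-word bitmap allocator: one 32-bit word, one bit per
// slot; a set bit means "in use". Not CUBRID code -- a sketch of the
// textbook shape described above.
struct tiny_bitmap
{
  std::atomic<unsigned int> word {0};

  // Find a clear bit and CAS-flip it; return the slot index, or -1 if full.
  int allocate ()
  {
    unsigned int cur = word.load ();
    for (;;)
      {
        if (~cur == 0)
          {
            return -1;              // every bit set: pool is full
          }
        int bit = 0;
        while (cur & (1u << bit))   // linear scan for the first clear bit
          {
            bit++;
          }
        if (word.compare_exchange_weak (cur, cur | (1u << bit)))
          {
            return bit;             // CAS won: the bit is ours
          }
        // CAS failed: cur was reloaded by compare_exchange_weak; retry.
      }
  }

  void release (int bit)
  {
    word.fetch_and (~(1u << bit));  // clear the bit; slot is free again
  }
};

tiny_bitmap g_bm;
```

A freed slot becomes immediately visible to the next allocation, which is exactly the cache-line ping-pong concern the search-strategy bullet above describes.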
Why a “wait-free process” label?
The `lf_bitmap_get_entry` source carries a comment marking the
retry loop: `restart: /* wait-free process */`. The claim is that
in the LIST_OF_CHUNKS style, the operation completes in a
bounded number of steps even under contention — a CAS failure
sends control back to the start label, but each iteration makes
progress because some other thread successfully claimed the
slot we were targeting. Under ONE_CHUNK the behavior is
lock-free rather than wait-free (the “this should never happen”
assert at full-capacity tells us the caller dimensioned the
bitmap to never exhaust, so practical wait-freedom holds).
Strictly, the label is closer to lock-freedom with a bounded-contention argument than to wait-freedom: a single thread can be repeatedly preempted between finding a clear bit and CAS-claiming it, so in the worst case its loop runs indefinitely. But under any bounded contention, every restart means some other thread claimed the slot we were targeting, and the loop terminates within O(N_threads) iterations — the bounded-steps condition the wait-free label informally appeals to.
Round-robin hint
A common optimization for concurrent bitmaps: keep a start index that rotates with each allocation, so consecutive threads do not all CAS-fight for the same first word. Implementations differ:
- Per-thread start hint. Thread-local start index. No contention but cache-unfriendly (thread A’s recently-freed bit is invisible to thread B).
- Global atomic start hint. A single `std::atomic<unsigned>` bumped on every allocation. Contention on the bump but excellent distribution.
- Sharded start hints. One per CPU. A hybrid; not used by CUBRID.
CUBRID picks the global atomic in SERVER_MODE (server-side
process where many threads contend) and a single non-atomic
“last allocated chunk” in client mode (single-threaded process,
no contention).
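The SERVER_MODE choice can be modeled in a few lines. This is a sketch with our own names, mirroring only the shape of the hint bump:

```cpp
#include <atomic>
#include <cassert>

// Illustrative global round-robin start hint (mirrors the shape of the
// SERVER_MODE branch; names are ours, not CUBRID's).
static std::atomic<unsigned int> g_start_idx {0};

// Each allocation bumps the shared hint, so consecutive callers begin
// their scan at different chunk words instead of all hammering word 0.
unsigned int next_start_chunk (unsigned int chunk_count)
{
  return g_start_idx++ % chunk_count;
}
```

Four consecutive calls with four chunks visit words 0, 1, 2, 3 and then wrap — the even distribution that a fixed start index lacks.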
Common DBMS Design
Bitmap as transaction-id index pool
Every DBMS that has a fixed-size pool of “transaction slots” (or “thread slots,” “session slots,” …) needs an allocator for those slots. The shape is identical: a bitmap of N bits, an allocate primitive, a free primitive.
- PostgreSQL uses a `BackendId` per backend, allocated from a per-slot array with a sentinel (`InvalidBackendId = -1`); not lock-free.
- MySQL uses a `THD *` per connection, allocated from a freelist under `LOCK_thread_count`. Mutex-protected.
- CUBRID uses `lockfree::bitmap` for both the legacy `LF_TRAN_SYSTEM::lf_bitmap` (one per global tran-system) and the modern `lockfree::tran::system::m_tran_idx_map`. Plus the legacy `area_alloc.{h,c}` reuses `LF_BITMAP` for general-purpose slot allocation in client-mode contexts.
Bitmap as page allocator
Bitmaps are also the natural representation of disk page
allocation — one bit per page in a file, set means allocated.
CUBRID’s disk_manager uses a different (non-lock-free)
representation for this on-disk structure (see
cubrid-disk-manager.md); the lock-free bitmap is restricted to
in-memory slot pools.
Theory ↔ CUBRID mapping
| Concept | CUBRID name |
|---|---|
| Bitmap allocator | lockfree::bitmap (typedef LF_BITMAP) |
| Bit array | bitfield[] (std::atomic<unsigned int> *) |
| Bit-per-slot, in-use semantics | set bit ⇔ slot allocated |
| Word size | LF_BITFIELD_WORD_SIZE = 32 (sizeof(unsigned int) * 8) |
| Style enum | chunking_style ∈ {ONE_CHUNK, LIST_OF_CHUNKS} |
| Full-usage threshold | FULL_USAGE_RATIO = 1.0f |
| 95-percentile usage threshold | NINTETYFIVE_PERCENTILE_USAGE_RATIO = 0.95f |
| Round-robin start hint | start_idx (std::atomic<unsigned int>) |
| In-use count | entry_count_in_use (std::atomic<int>, LIST_OF_CHUNKS only) |
| Capacity | entry_count |
| Allocation | get_entry() — returns slot index or -1 |
| Release | free_entry(idx) |
| Full check | is_full() |
| Aligned-to-word capacity helper | LF_BITMAP_COUNT_ALIGN(count) macro |
CUBRID’s Approach
The class
```cpp
// lockfree::bitmap — src/base/lockfree_bitmap.hpp:31
class bitmap
{
  public:
    static const float FULL_USAGE_RATIO;                    // 1.0f
    static const float NINTETYFIVE_PERCENTILE_USAGE_RATIO;  // 0.95f

    enum chunking_style { ONE_CHUNK = 0, LIST_OF_CHUNKS };

    bitmap ();
    ~bitmap ();

    void init (chunking_style style, int entries_count, float usage_ratio);
    void destroy ();
    int get_entry ();                 // returns -1 on full
    void free_entry (int entry_idx);
    bool is_full () const;

    // public fields (todo comments say to make these private):
    std::atomic<unsigned int> *bitfield;
    int entry_count;
    std::atomic<int> entry_count_in_use;
    chunking_style style;
    float usage_threshold;
    std::atomic<unsigned int> start_idx;
};
```

The class is a thin wrapper over four free functions
(`lf_bitmap_init` / `destroy` / `get_entry` / `free_entry`) defined
`static` inside the .cpp. Public fields are flagged as
“todo: make private” in the header — the wrapper is a partial
C-to-C++ migration step.
Two chunking styles
The `chunking_style` enum has only two values:
- `ONE_CHUNK` — `usage_threshold` must be `1.0f` (full usage). The bitmap has no usage cap; every bit is allocatable. When all bits are taken, `get_entry()` returns -1 (and asserts in debug, because `ONE_CHUNK` is intended for sized-to-never-exhaust use cases like the transaction index pool).
- `LIST_OF_CHUNKS` — `usage_threshold` may be any value in (0, 1]. A separate atomic counter `entry_count_in_use` is maintained; `is_full()` returns true when the in-use count reaches `usage_threshold * entry_count`. The “list of chunks” name historically referred to a planned multi-chunk extension that was never finished — the current implementation is a single contiguous bitfield in both styles.
The two style choices represent two consumer patterns:
| Style | Used for | Reason |
|---|---|---|
| ONE_CHUNK | Per-thread transaction-index allocation (tran::system, legacy LF_TRAN_SYSTEM) | Pool size is bounded by max_threads; exhaustion is a configuration error, never normal operation. The bitmap cannot afford the extra atomic on a counter. |
| LIST_OF_CHUNKS | Generic slot allocation (area_alloc) where soft fullness matters | The 95-percentile threshold lets the allocator say “full” before the array is genuinely full, leaving headroom for in-flight allocations to complete. |
```cpp
// lf_bitmap_init — src/base/lockfree_bitmap.cpp:83
static void
lf_bitmap_init (LF_BITMAP *bitmap, LF_BITMAP_STYLE style, int entries_cnt, float usage_threshold)
{
  /* We only allow full usage for LF_BITMAP_ONE_CHUNK. */
  assert (style == LF_BITMAP_LIST_OF_CHUNKS || usage_threshold == 1.0f);

  bitmap->style = style;
  bitmap->entry_count = entries_cnt;
  bitmap->entry_count_in_use = 0;
  bitmap->usage_threshold = usage_threshold;
  if (usage_threshold < 0.0f || usage_threshold > 1.0f)
    {
      bitmap->usage_threshold = 1.0f;
    }
  bitmap->start_idx = 0;

  /* initialize bitfield */
  size_t chunk_count = CEIL_PTVDIV (entries_cnt, LF_BITFIELD_WORD_SIZE);
  bitmap->bitfield = new std::atomic<unsigned int>[chunk_count] ();
  for (size_t it = 0; it < chunk_count; it++)
    {
      bitmap->bitfield[it] = 0;
    }

  /* pad out the rest bits with 1; it will simplify the code in lf_bitmap_get_entry () */
  if (entries_cnt % LF_BITFIELD_WORD_SIZE != 0)
    {
      unsigned int chunk = 0, mask = 1;
      int i = entries_cnt % LF_BITFIELD_WORD_SIZE;
      for (mask <<= i; i < LF_BITFIELD_WORD_SIZE; i++, mask <<= 1)
        {
          chunk |= mask;
        }
      bitmap->bitfield[chunk_count - 1] = chunk;
    }
}
```

Two observations:
- The trailing-bit padding is the design’s neat trick. If the allocator’s bit count is, say, 100, the last word represents bits 96..127 (32 bits) but only 96..99 are real slots. Bits 100..127 are pre-set to 1 so the “find a 0 bit” search correctly skips them.
- The defensive clamp `usage_threshold < 0.0f || > 1.0f → 1.0f` is belt-and-suspenders against caller mis-configuration; in practice the only callers use `FULL_USAGE_RATIO (1.0f)` and `NINTETYFIVE_PERCENTILE_USAGE_RATIO (0.95f)`.
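The padding loop is easy to check in isolation. This standalone copy of its mask computation (the function name is ours) reproduces the 100-entry example:

```cpp
#include <cassert>

// Reproduce lf_bitmap_init's trailing-bit padding for a hypothetical
// entries_cnt-bit bitmap. Only meaningful when entries_cnt is NOT a
// multiple of word_size (init guards the real call the same way).
unsigned int trailing_pad_mask (int entries_cnt, int word_size)
{
  unsigned int chunk = 0, mask = 1;
  int i = entries_cnt % word_size;   // index of the first non-existent slot
  for (mask <<= i; i < word_size; i++, mask <<= 1)
    {
      chunk |= mask;                 // set every bit above the real slots
    }
  return chunk;
}
```

For 100 entries with 32-bit words, slots 96..99 stay clear and bits 4..31 of the last word come back pre-set, i.e. 0xFFFFFFF0.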
get_entry
```cpp
// lf_bitmap_get_entry — src/base/lockfree_bitmap.cpp:138
static int
lf_bitmap_get_entry (LF_BITMAP *bitmap)
{
  int chunk_count = CEIL_PTVDIV (bitmap->entry_count, LF_BITFIELD_WORD_SIZE);
  unsigned int mask, chunk, start_idx;
  int chunk_idx, slot_idx, i;

restart:	/* wait-free process */
  chunk_idx = -1;
  slot_idx = -1;

  if (LF_BITMAP_IS_FULL (bitmap))
    {
      return -1;
    }

#if defined (SERVER_MODE)
  /* round-robin: bump global hint, mod chunk count */
  start_idx = bitmap->start_idx++;
  start_idx = start_idx % chunk_count;
#else
  /* iterate from the last allocated chunk */
  start_idx = bitmap->start_idx;
#endif

  /* find a chunk with at least one zero bit */
  i = start_idx;
  do
    {
      chunk = bitmap->bitfield[i].load ();
      if (~chunk)		// any bit clear?
        {
          chunk_idx = i;
          break;
        }
      i++;
      if (i >= chunk_count)
        {
          i = 0;
        }
    }
  while (i != (int) start_idx);

  if (chunk_idx == -1)		// wrapped around — full
    {
      if (bitmap->style == LF_BITMAP_ONE_CHUNK)
        {
          assert (false);
        }
      return -1;
    }

  /* find first zero bit in chunk */
  for (i = 0, mask = 1; i < LF_BITFIELD_WORD_SIZE; i++, mask <<= 1)
    {
      if ((~chunk) & mask)
        {
          slot_idx = i;
          break;
        }
    }

  if (slot_idx == -1)
    {
      goto restart;		// chunk filled in the meantime
    }

  /* CAS-flip the bit */
  do
    {
      chunk = bitmap->bitfield[chunk_idx].load ();
      if (chunk & mask)
        {
          goto restart;		// somebody else took this bit
        }
    }
  while (!bitmap->bitfield[chunk_idx].compare_exchange_weak (chunk, chunk | mask));

  if (bitmap->style == LF_BITMAP_LIST_OF_CHUNKS)
    {
      bitmap->entry_count_in_use++;
    }

#if !defined (SERVER_MODE)
  bitmap->start_idx = chunk_idx;
#endif

  return chunk_idx * LF_BITFIELD_WORD_SIZE + slot_idx;
}
```

The protocol:
- Early out on full. `is_full()` is a cheap snapshot compare of the in-use counter (LIST_OF_CHUNKS) or always-false (ONE_CHUNK). It errs in the safe direction — it may return false even when the bitmap is actually full; the loop catches that case with the wrap-around test.
- Pick a start chunk. SERVER_MODE bumps a global atomic `start_idx` and takes it modulo the chunk count; non-SERVER_MODE reads the current value. The global bump is a contention point, but a cheap one.
- Linear scan for a chunk with a clear bit. `~chunk != 0` ⇔ chunk has at least one zero bit. Wrap from the array end back to `start_idx`.
- Find the first clear bit. Linear scan within the chunk, `mask = 1 << i`.
- CAS the bit set. Compare-exchange-weak on the chunk word, going from the current value to `current | mask`. Loop until success, or restart if the bit was set under us (someone else won).
- Maintain the in-use counter (LIST_OF_CHUNKS only) and return the slot index.
The nested retry pattern (the inner do { ... } while (!CAS_weak) and
the outer goto restart) is the standard CAS-retry protocol: a weak CAS
may fail spuriously, so we loop; if the bit turns out to have been set
by another thread, we abandon the chunk and restart from start_idx.
free_entry
```cpp
// lf_bitmap_free_entry — src/base/lockfree_bitmap.cpp:239
static void
lf_bitmap_free_entry (LF_BITMAP *bitmap, int entry_idx)
{
  int pos = entry_idx / LF_BITFIELD_WORD_SIZE;
  int bit = entry_idx % LF_BITFIELD_WORD_SIZE;
  unsigned int inverse_mask = (unsigned int) (1 << bit);
  unsigned int mask = ~inverse_mask;
  unsigned int curr;

  do
    {
      curr = bitmap->bitfield[pos].load ();
      if (! (curr & inverse_mask))
        {
          assert (false);	// already free — caller bug
          return;
        }
    }
  while (!bitmap->bitfield[pos].compare_exchange_weak (curr, curr & mask));

  if (bitmap->style == LF_BITMAP_LIST_OF_CHUNKS)
    {
      bitmap->entry_count_in_use--;
    }

#if !defined (SERVER_MODE)
  bitmap->start_idx = pos;	/* hint for a free slot */
#endif
}
```

A single CAS-loop. The asserted invariant `curr & inverse_mask`
catches caller errors (double-free); production builds with
NDEBUG will silently no-op the second free, which is wrong
but bounded.
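The index decomposition at the top of free_entry is worth seeing concretely. The struct and names below are illustrative helpers, not part of the CUBRID source:

```cpp
#include <cassert>

// The word/bit decomposition free_entry performs for a slot index,
// shown standalone (illustrative struct and names, not CUBRID's).
struct slot_pos
{
  int word;                    // which bitfield word holds the slot
  unsigned int inverse_mask;   // the slot's bit within that word
};

slot_pos decompose (int entry_idx)
{
  slot_pos p;
  p.word = entry_idx / 32;                 // LF_BITFIELD_WORD_SIZE
  p.inverse_mask = 1u << (entry_idx % 32);
  return p;
}
```

Slot 37, for example, lives in word 1 at bit 5, so the CAS clears bit 0x20 of `bitfield[1]`.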
is_full
```cpp
bool
bitmap::is_full () const
{
  return ((float) entry_count_in_use.load ()) >= usage_threshold * entry_count;
}
```

Cheap O(1). Even for ONE_CHUNK style this evaluates
`(0 >= 1.0 * entry_count)`, which is always false (since
`entry_count > 0`); thus `is_full()` is a no-op for ONE_CHUNK,
matching the design contract that a ONE_CHUNK bitmap is sized
to never exhaust.
Note: entry_count_in_use is not maintained in ONE_CHUNK
style (the increment/decrement at the end of get_entry /
free_entry is gated by style == LIST_OF_CHUNKS). For a
ONE_CHUNK bitmap, querying entry_count_in_use returns 0
unconditionally; full detection in get_entry instead relies on
the wrap-around-without-finding-a-clear-bit signal.
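The threshold arithmetic is easy to check standalone. This minimal model (our function name) mirrors the comparison is_full performs:

```cpp
#include <cassert>

// Model of is_full's comparison: in-use count against
// usage_threshold * entry_count (illustrative, mirrors the shape only).
bool model_is_full (int in_use, float threshold, int entry_count)
{
  return (float) in_use >= threshold * entry_count;
}
```

With a 100-entry LIST_OF_CHUNKS bitmap at the 0.95 threshold, the 95th allocation flips the answer; a ONE_CHUNK bitmap's counter never moves off 0, so the answer stays false.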
Component overview
```mermaid
flowchart LR
    Caller["Caller"]
    Init["init(style, count, ratio)"]
    Caller --> Init
    Init --> Bitfield["bitfield[]:<br/>std::atomic<uint> * chunk_count"]
    Init -.pad.-> LastWord["last word:<br/>trailing 1s"]
    Caller --> Get["get_entry()"]
    Get --> StartIdx["start_idx (atomic)"]
    StartIdx -.SERVER_MODE: bump.-> Get
    Get --> Scan["scan chunks for ~chunk != 0"]
    Scan --> Find["find first clear bit"]
    Find --> CAS["CAS bit 0->1"]
    CAS -- success --> Counter["if LIST_OF_CHUNKS:<br/>entry_count_in_use++"]
    Counter --> Return["return chunk_idx * 32 + bit"]
    Caller --> Free["free_entry(idx)"]
    Free --> CAS2["CAS bit 1->0"]
    CAS2 --> Counter2["if LIST_OF_CHUNKS:<br/>entry_count_in_use--"]
```
Consumer survey
| Consumer | Style | Capacity | Purpose |
|---|---|---|---|
| lockfree::tran::system::m_tran_idx_map | ONE_CHUNK | max_tran_count | Modern per-thread tran-index dispenser |
| LF_TRAN_SYSTEM::lf_bitmap (legacy) | ONE_CHUNK | max_threads aligned up | Per-thread LF_TRAN_ENTRY index in legacy reclamation |
| area_alloc (in area_alloc.{h,c}) | LIST_OF_CHUNKS | per-area | General-purpose slot allocator (legacy memory pool) |
The first two are the dominant uses. area_alloc is a small
client-mode helper not exercised on the SERVER_MODE hot path.
bitmap is also exposed as LF_BITMAP
The header carries:
```cpp
using LF_BITMAP = lockfree::bitmap;
using LF_BITMAP_STYLE = lockfree::bitmap::chunking_style;
static const LF_BITMAP_STYLE LF_BITMAP_ONE_CHUNK = LF_BITMAP_STYLE::ONE_CHUNK;
static const LF_BITMAP_STYLE LF_BITMAP_LIST_OF_CHUNKS = LF_BITMAP_STYLE::LIST_OF_CHUNKS;

#define LF_BITMAP_FULL_USAGE_RATIO lockfree::bitmap::FULL_USAGE_RATIO
#define LF_BITMAP_95PERCENTILE_USAGE_RATIO lockfree::bitmap::NINTETYFIVE_PERCENTILE_USAGE_RATIO
#define LF_BITFIELD_WORD_SIZE (int) (sizeof (unsigned int) * 8)
#define LF_BITMAP_IS_FULL(bitmap) (bitmap)->is_full ()
#define LF_BITMAP_COUNT_ALIGN(count) \
  (((count) + (LF_BITFIELD_WORD_SIZE) - 1) & ~((LF_BITFIELD_WORD_SIZE) - 1))
```

This is the dual-vocabulary support — code in lock_free.c and
area_alloc.c references the LF_BITMAP_* macros, while modern
C++ code uses the lockfree::bitmap::* names directly. Both
resolve to the same class.
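The alignment macro's arithmetic can be exercised directly. This standalone copy (names shortened, otherwise the same round-up-to-word-multiple expression) shows the effect:

```cpp
#include <cassert>

// Standalone copy of LF_BITMAP_COUNT_ALIGN's arithmetic with shortened
// names: round count up to the next multiple of the 32-bit word size.
#define WORD_SIZE 32
#define COUNT_ALIGN(count) (((count) + WORD_SIZE - 1) & ~(WORD_SIZE - 1))
```

This is the "max_threads aligned up" computation from the consumer survey: a 100-slot request becomes a 128-bit bitfield whose last 28 bits are the padded trailing ones.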
Memory ordering posture
Every atomic access on bitfield[] and start_idx is
seq_cst (the default for std::atomic operations). The
operations are word-sized and cache-line-friendly; on x86 the
seq_cst overhead is negligible. On ARM the additional fences
are paid; tightening to acquire/release pairs would be a
focused optimization (the fetch_or/fetch_and pattern is
naturally an RMW with memory_order_acq_rel).
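A tightened variant of the bit-claim CAS might look like the sketch below. This is purely an illustration of the suggested optimization with our own names, not the shipped code, and it has not been reasoned through for every reclamation interaction:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical acquire/release version of the bit-claim CAS, replacing
// the default seq_cst orderings. A sketch of the optimization above.
bool claim_bit (std::atomic<unsigned int> &word, unsigned int mask)
{
  unsigned int cur = word.load (std::memory_order_acquire);
  do
    {
      if (cur & mask)
        {
          return false;   // bit already taken by someone else
        }
    }
  while (!word.compare_exchange_weak (cur, cur | mask,
				      std::memory_order_acq_rel,
				      std::memory_order_acquire));
  return true;
}

std::atomic<unsigned int> g_word {0};
```

On x86 this compiles to the same lock-prefixed RMW either way; the win would only show up on weakly ordered targets, which is why it stays a "focused optimization" rather than a done deal.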
A note on the typo NINTETYFIVE_PERCENTILE_USAGE_RATIO
The constant misspells “ninety-five” as NINTETYFIVE (a stray T
where “NINETY” should be). Every call site uses the typo
spelling, so it is now load-bearing; renaming requires a
codebase-wide sweep. Flagged for future cleanup.
Source Walkthrough
Public surface
- lockfree::bitmap (class) — lockfree_bitmap.hpp:31
- bitmap::init — lockfree_bitmap.cpp:53
- bitmap::destroy — :59
- bitmap::get_entry — :65
- bitmap::free_entry — :71
- bitmap::is_full — :77
Static helpers
- lf_bitmap_init — lockfree_bitmap.cpp:83
- lf_bitmap_destroy — :124
- lf_bitmap_get_entry — :138
- lf_bitmap_free_entry — :239
Macros and aliases (in the header)
- LF_BITFIELD_WORD_SIZE — 32 (assumes sizeof(unsigned int) == 4)
- LF_BITMAP_COUNT_ALIGN(count) — round up to a multiple of the word size
- LF_BITMAP_IS_FULL(bm) — alias for (bm)->is_full()
- LF_BITMAP_FULL_USAGE_RATIO / LF_BITMAP_95PERCENTILE_USAGE_RATIO
Position hints (as of 2026-05-07)
| Symbol | File | Line |
|---|---|---|
| lockfree::bitmap (class) | src/base/lockfree_bitmap.hpp | 31 |
| chunking_style enum | src/base/lockfree_bitmap.hpp | 37 |
| bitmap::is_full | src/base/lockfree_bitmap.cpp | 77 |
| lf_bitmap_init | src/base/lockfree_bitmap.cpp | 83 |
| lf_bitmap_get_entry | src/base/lockfree_bitmap.cpp | 138 |
| lf_bitmap_free_entry | src/base/lockfree_bitmap.cpp | 239 |
| LF_BITMAP_COUNT_ALIGN macro | src/base/lockfree_bitmap.hpp | 89 |
| LF_BITFIELD_WORD_SIZE macro | src/base/lockfree_bitmap.hpp | 85 |
| tran::system::system (uses bitmap) | src/base/lockfree_transaction_system.cpp | 29 |
| lf_tran_system_init (legacy uses bitmap) | src/base/lock_free.c | 194 |
Source verification (as of 2026-05-07)
Verified facts
- Word size is 32 bits, hard-coded by the `unsigned int` choice. `LF_BITFIELD_WORD_SIZE = (int) (sizeof (unsigned int) * 8)` at lockfree_bitmap.hpp:85. On every platform CUBRID supports, `sizeof(unsigned int) == 4`, so 32 bits per chunk word.
- `ONE_CHUNK` requires `FULL_USAGE_RATIO == 1.0f`. Asserted at lockfree_bitmap.cpp:91: `assert (style == LF_BITMAP_LIST_OF_CHUNKS || usage_threshold == 1.0f)`.
- Trailing bits pre-set to 1 in the last word. `lf_bitmap_init` at :112-121 builds a mask of all bits above the requested count and writes it into the last word.
- SERVER_MODE uses an atomic global round-robin start. `lf_bitmap_get_entry` at :160-167 increments `bitmap->start_idx` then takes it modulo the chunk count; the non-SERVER_MODE branch reads `start_idx` without bumping, and the free function writes `start_idx = pos` as a hint.
- `entry_count_in_use` is LIST_OF_CHUNKS-only. The increment at :228 and decrement at :269 are gated by `bitmap->style == LF_BITMAP_LIST_OF_CHUNKS`. A `ONE_CHUNK` bitmap leaves the counter at 0 forever.
- CAS uses `compare_exchange_weak`. Both `get_entry` (:224) and `free_entry` (:265) use weak CAS, so spurious failures on weak-memory architectures retry inside the loop.
- `is_full` is O(1). Comparison of a single atomic load against a multiplied float; no scan.
- The LIST_OF_CHUNKS “list of chunks” name is historical; the implementation is a single contiguous bitfield. Confirmed by reading `lf_bitmap_init` — `bitmap->bitfield = new std::atomic<unsigned int>[chunk_count]()` is one contiguous allocation regardless of style.
Open questions
- Does CUBRID’s bitmap ever exhaust under realistic ONE_CHUNK use? The `assert (false)` in the wrap-around branch fires only if the configured pool was under-sized. Investigation: stress-test with a high thread count to confirm `tran::system::assign_index` never returns INVALID_INDEX in normal operation.
- Is the global-atomic start_idx bump a measured bottleneck? Every `get_entry` in SERVER_MODE bumps it. On a 64-core machine with a hashmap that retires aggressively, the cache line for `start_idx` is hot. A per-CPU sharded variant would avoid this; no profiling data exists.
- Why is free_entry’s assertion silent in production builds? A double-free in a LIST_OF_CHUNKS bitmap leaves `entry_count_in_use` decremented twice, driving it negative; repeated double-frees could eventually underflow the signed int, which is undefined behavior. A noisy `er_set` would be safer than an `assert` that compiles out.
- Word-size assumption portability. `LF_BITFIELD_WORD_SIZE = sizeof(unsigned int) * 8` assumes `unsigned int` is 32 bits. On hypothetical 16-bit platforms the LF_BITMAP_COUNT_ALIGN macro and trailing-bit padding would still work, but the `(int) (...)` cast could overflow for large pools. CUBRID does not target such platforms; flagged for clarity, not action.
- Refactor to make fields private. The header’s `// todo: make private fields` comment has been there long enough that “todo” reads more like “won’t” than “will.” Promoting to private would force `lf_tran_system_init` (which reaches into `sys->lf_bitmap.bitfield[i]` directly, lock_free.c:387) through accessors — a non-zero refactor cost.
Beyond CUBRID — Comparative Designs & Research Frontiers
- Sharded start hints. A per-CPU start_idx (sharded by `sched_getcpu()` or by thread index) would eliminate the global atomic bump’s contention on many-core machines.
- 64-bit chunk words. Switching to `uint64_t` doubles the in-word allocation density and halves the chunk count for the same capacity. The CAS becomes 64-bit, still a single instruction on x86-64.
- Hierarchical bitmap. A multi-level bitmap (the top level marks “this child has at least one zero bit”) reduces the scan from O(chunks) to O(log chunks). Worth it only for very large pools, which CUBRID does not have.
- `__builtin_ctz` for the first-zero-bit search. The current inner loop is a linear scan from bit 0. Replacing it with `__builtin_ctz (~chunk)` finds the first clear bit in O(1). A trivial micro-optimization; deferred because the inner loop is not a measured hotspot.
- Replace the bitmap with a per-CPU freelist for transaction indexes. A free list of tran::index per CPU, with steal-on-empty, would eliminate the bitmap entirely and remove the round-robin bump. The trade is more code for less contention.
- Promote the typo to a clean rename. NINTETYFIVE_PERCENTILE_USAGE_RATIO is universally typoed; a grep-replace pass would fix the spelling without semantic change.
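The ctz suggestion is easy to validate against the existing linear scan. Both helpers below are illustrative (our names); `__builtin_ctz` is a GCC/Clang builtin counting trailing zero bits:

```cpp
#include <cassert>

// First-clear-bit search two ways: the linear scan shape the source uses
// today, and the __builtin_ctz shortcut suggested above. Illustrative.
int first_clear_linear (unsigned int chunk)
{
  for (int i = 0; i < 32; i++)
    {
      if (! (chunk & (1u << i)))
        {
          return i;
        }
    }
  return -1;   // no clear bit (chunk == 0xFFFFFFFF)
}

int first_clear_ctz (unsigned int chunk)
{
  // ~chunk has a 1 exactly where chunk has a clear bit; ctz of that
  // is the first clear bit's index. Guard the all-set case, since
  // __builtin_ctz(0) is undefined.
  return (~chunk == 0) ? -1 : __builtin_ctz (~chunk);
}
```

The two agree on every input, which is what a replacement patch would assert before swapping the inner loop.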
Sources
- Source files (in the CUBRID source tree at /data/hgryoo/references/cubrid/):
  - src/base/lockfree_bitmap.{hpp,cpp} (the entire implementation)
  - src/base/lock_free.h (legacy aliases and macros)
  - src/base/lock_free.c (legacy LF_TRAN_SYSTEM-side use)
  - src/base/lockfree_transaction_system.{hpp,cpp} (modern use)
  - src/base/area_alloc.h (legacy slot allocator using LIST_OF_CHUNKS)
- Textbook chapters cited:
- Tanenbaum, Modern Operating Systems, ch. 4 §“Memory Management” — the bitmap-vs-freelist tradeoff for slot allocators.
- Herlihy & Shavit, The Art of Multiprocessor Programming, ch. 5 §“Wait-Free Synchronization” — progress conditions and the wait-free amortization argument.
- Sibling docs in this repo:
  - cubrid-lockfree-overview.md — umbrella.
  - cubrid-lockfree-transaction.md — the dominant consumer of the bitmap (per-thread index allocation).