CUBRID Lock-Free Bitmap — Chunked Atomic Allocator for Per-Thread Indexes and Slot Pools

A bitmap allocator is the simplest concurrent slot-allocator shape: one bit per slot, set means “in use,” cleared means “free.” Allocation finds a clear bit and CAS-flips it; release clears the bit. The textbook references are short — every operating-systems text mentions bitmap allocators in passing (Tanenbaum, Modern Operating Systems, ch. 4 on memory management) and concurrent-data-structures texts treat them as the warm-up for harder structures.

The interesting design decisions are around scaling:

  1. Word size. A single word’s worth of bits (32 or 64) gives a single CAS for any in-word allocation. A larger pool needs multiple words, and the allocator must visit them.
  2. Search strategy. Linear scan from word 0 every time produces a hot first word (cache-line ping-pong on a busy pool); rotating the start position spreads the load.
  3. Full-pool detection. Cheap O(1) check: an atomic counter entry_count_in_use compared against the capacity.
  4. Padding the last word. If the bit count is not a multiple of the word size, the trailing bits (which represent non-existent slots) must be permanently set to “in use” so a ~chunk == 0 test correctly excludes them.

CUBRID’s lockfree::bitmap uses 32-bit words (unsigned int), linear scan with a rotating start hint, an explicit entry_count_in_use for LIST_OF_CHUNKS style, and pre-set trailing bits in the last word.

The lf_bitmap_get_entry source carries a comment marking the loop as restart: /* wait-free process */. The claim is that in the LIST_OF_CHUNKS style, the operation completes in a bounded number of steps even under contention — a CAS failure sends control back to the start label, but each iteration makes progress because some other thread successfully claimed the slot we were targeting. Under ONE_CHUNK the behavior is lock-free rather than wait-free (the “this should never happen” assert at full-capacity tells us the caller dimensioned the bitmap to never exhaust, so practical wait-freedom holds).

The wait-free claim is actually an amortized argument — a single thread can be repeatedly preempted between finding a clear bit and CAS-claiming it. In the worst case the thread loops indefinitely. But for any bounded contention, the loop terminates in O(N_threads) iterations, which is the bounded condition wait-free demands.

A common optimization for concurrent bitmaps: keep a start index that rotates with each allocation, so consecutive threads do not all CAS-fight for the same first word. Implementations differ:

  • Per-thread start hint. Thread-local start index. No contention but cache-unfriendly (thread A’s recently-freed bit is invisible to thread B).
  • Global atomic start hint. A single std::atomic<unsigned> bumped on every allocation. Contention on the bump but excellent distribution.
  • Sharded start hints. One per CPU. Hybrid; not used by CUBRID.

CUBRID picks the global atomic in SERVER_MODE (server-side process where many threads contend) and a single non-atomic “last allocated chunk” in client mode (single-threaded process, no contention).

Every DBMS that has a fixed-size pool of “transaction slots” (or “thread slots,” “session slots,” …) needs an allocator for those slots. The shape is identical: a bitmap of N bits, an allocate primitive, a free primitive.

  • PostgreSQL allocates a BackendId by scanning a per-slot array for a sentinel value (InvalidBackendId, defined as -1); not lock-free.
  • MySQL uses a THD * per connection, allocated from a freelist under LOCK_thread_count. Mutex-protected.
  • CUBRID uses lockfree::bitmap for both the legacy LF_TRAN_SYSTEM::lf_bitmap (one per global tran-system) and the modern lockfree::tran::system::m_tran_idx_map. Plus the legacy area_alloc.{h,c} reuses LF_BITMAP for general-purpose slot allocation in client-mode contexts.

Bitmaps are also the natural representation of disk page allocation — one bit per page in a file, set means allocated. CUBRID’s disk_manager uses a different (non-lock-free) representation for this on-disk structure (see cubrid-disk-manager.md); the lock-free bitmap is restricted to in-memory slot pools.

| Concept | CUBRID name |
| --- | --- |
| Bitmap allocator | lockfree::bitmap (typedef LF_BITMAP) |
| Bit array | bitfield[] (std::atomic<unsigned int> *) |
| Bit-per-slot, in-use semantics | set bit ⇔ slot allocated |
| Word size | LF_BITFIELD_WORD_SIZE = 32 (sizeof(unsigned int) * 8) |
| Style enum | chunking_style ∈ {ONE_CHUNK, LIST_OF_CHUNKS} |
| Full-usage threshold | FULL_USAGE_RATIO = 1.0f |
| 95-percentile usage threshold | NINTETYFIVE_PERCENTILE_USAGE_RATIO = 0.95f |
| Round-robin start hint | start_idx (std::atomic<unsigned int>) |
| In-use count | entry_count_in_use (std::atomic<int>, LIST_OF_CHUNKS only) |
| Capacity | entry_count |
| Allocation | get_entry() — returns slot index or -1 |
| Release | free_entry(idx) |
| Full check | is_full() |
| Aligned-to-word capacity helper | LF_BITMAP_COUNT_ALIGN(count) macro |

```cpp
// lockfree::bitmap — src/base/lockfree_bitmap.hpp:31
class bitmap
{
  public:
    static const float FULL_USAGE_RATIO;                    // 1.0f
    static const float NINTETYFIVE_PERCENTILE_USAGE_RATIO;  // 0.95f

    enum chunking_style { ONE_CHUNK = 0, LIST_OF_CHUNKS };

    bitmap ();
    ~bitmap ();

    void init (chunking_style style, int entries_count, float usage_ratio);
    void destroy ();

    int get_entry ();                 // returns -1 on full
    void free_entry (int entry_idx);
    bool is_full () const;

    // public fields (TODO comments to make private):
    std::atomic<unsigned int> *bitfield;
    int entry_count;
    std::atomic<int> entry_count_in_use;
    chunking_style style;
    float usage_threshold;
    std::atomic<unsigned int> start_idx;
};
```

The class is a thin wrapper over four free functions (lf_bitmap_init / destroy / get_entry / free_entry) defined static inside the .cpp. Public fields are flagged as “todo: make private” in the header — the wrapper is a partial C-to-C++ migration step.

The chunking_style enum has only two values:

  • ONE_CHUNK — usage_threshold must be 1.0f (full usage). The bitmap has no usage cap; every bit is allocatable. When all bits are taken, get_entry() returns -1 (and asserts in debug, because ONE_CHUNK is intended for sized-to-never-exhaust use cases like the transaction index pool).
  • LIST_OF_CHUNKS — usage_threshold may be any value in (0, 1]. A separate atomic counter entry_count_in_use is maintained; is_full() returns true when the in-use count reaches usage_threshold * entry_count. The “list of chunks” name historically referred to a planned multi-chunk extension that was never finished — the current impl is a single contiguous bitfield in both styles.

The two style choices represent two consumer patterns:

| Style | Used for | Reason |
| --- | --- | --- |
| ONE_CHUNK | Per-thread transaction-index allocation (tran::system, legacy LF_TRAN_SYSTEM) | Pool size is bounded by max_threads; exhaustion is a configuration error, never normal operation. Skipping the usage counter saves one atomic RMW per allocation. |
| LIST_OF_CHUNKS | Generic slot allocation (area_alloc) where soft fullness matters | The 95-percentile threshold lets the allocator report "full" before the array is genuinely full, leaving headroom for in-flight allocations to complete. |

```cpp
// lf_bitmap_init — src/base/lockfree_bitmap.cpp:83
static void lf_bitmap_init (LF_BITMAP *bitmap, LF_BITMAP_STYLE style,
                            int entries_cnt, float usage_threshold)
{
  /* We only allow full usage for LF_BITMAP_ONE_CHUNK. */
  assert (style == LF_BITMAP_LIST_OF_CHUNKS || usage_threshold == 1.0f);

  bitmap->style = style;
  bitmap->entry_count = entries_cnt;
  bitmap->entry_count_in_use = 0;
  bitmap->usage_threshold = usage_threshold;
  if (usage_threshold < 0.0f || usage_threshold > 1.0f)
    bitmap->usage_threshold = 1.0f;
  bitmap->start_idx = 0;

  /* initialize bitfield */
  size_t chunk_count = CEIL_PTVDIV (entries_cnt, LF_BITFIELD_WORD_SIZE);
  bitmap->bitfield = new std::atomic<unsigned int>[chunk_count] ();
  for (size_t it = 0; it < chunk_count; it++)
    bitmap->bitfield[it] = 0;

  /* pad out the rest bits with 1, It will simplify the code in lf_bitmap_get_entry() */
  if (entries_cnt % LF_BITFIELD_WORD_SIZE != 0)
    {
      unsigned int chunk = 0, mask = 1;
      int i = entries_cnt % LF_BITFIELD_WORD_SIZE;

      for (mask <<= i; i < LF_BITFIELD_WORD_SIZE; i++, mask <<= 1)
        chunk |= mask;
      bitmap->bitfield[chunk_count - 1] = chunk;
    }
}
```

Two observations:

  • The trailing-bit padding is the design’s neat trick. If the allocator’s bit count is, say, 100, the last word represents bits 96..127 — 32 bits — but only 96..99 are real slots. Bits 100..127 are pre-set to 1 so the “find a 0 bit” search correctly skips them.
  • The defensive clamp usage_threshold < 0.0f || > 1.0f → 1.0f is a belt-and-suspenders against caller mis-config; in practice the only callers use FULL_USAGE_RATIO (1.0f) and NINTETYFIVE_PERCENTILE_USAGE_RATIO (0.95f).
```cpp
// lf_bitmap_get_entry — src/base/lockfree_bitmap.cpp:138
static int lf_bitmap_get_entry (LF_BITMAP *bitmap)
{
  int chunk_count = CEIL_PTVDIV (bitmap->entry_count, LF_BITFIELD_WORD_SIZE);
  unsigned int mask, chunk, start_idx;
  int chunk_idx, slot_idx, i;

restart:                                 /* wait-free process */
  chunk_idx = -1;
  slot_idx = -1;

  if (LF_BITMAP_IS_FULL (bitmap))
    return -1;

#if defined (SERVER_MODE)
  /* round-robin: bump global hint, mod chunk count */
  start_idx = bitmap->start_idx++;
  start_idx = start_idx % chunk_count;
#else
  /* iterate from the last allocated chunk */
  start_idx = bitmap->start_idx;
#endif

  /* find a chunk with at least one zero bit */
  i = start_idx;
  do
    {
      chunk = bitmap->bitfield[i].load ();
      if (~chunk)                        // any bit clear?
        {
          chunk_idx = i;
          break;
        }
      i++;
      if (i >= chunk_count)
        i = 0;
    }
  while (i != (int) start_idx);

  if (chunk_idx == -1)                   // wrapped around — full
    {
      if (bitmap->style == LF_BITMAP_ONE_CHUNK)
        assert (false);
      return -1;
    }

  /* find first zero bit in chunk */
  for (i = 0, mask = 1; i < LF_BITFIELD_WORD_SIZE; i++, mask <<= 1)
    if ((~chunk) & mask)
      {
        slot_idx = i;
        break;
      }
  if (slot_idx == -1)
    goto restart;                        // chunk filled in the meantime

  /* CAS-flip the bit */
  do
    {
      chunk = bitmap->bitfield[chunk_idx].load ();
      if (chunk & mask)
        goto restart;                    // somebody else took this bit
    }
  while (!bitmap->bitfield[chunk_idx].compare_exchange_weak (chunk, chunk | mask));

  if (bitmap->style == LF_BITMAP_LIST_OF_CHUNKS)
    bitmap->entry_count_in_use++;

#if !defined (SERVER_MODE)
  bitmap->start_idx = chunk_idx;
#endif

  return chunk_idx * LF_BITFIELD_WORD_SIZE + slot_idx;
}
```

The protocol:

  1. Early out on full. is_full() is a cheap snapshot compare of the in-use counter (LIST_OF_CHUNKS) or always-false (ONE_CHUNK). It can only err toward "not full": a stale snapshot returns false even when the pool is actually full, and the loop catches that case with the wrap-around test.
  2. Pick a start chunk. SERVER_MODE bumps a global atomic start_idx and modulos by chunk count; non-SERVER_MODE reads the current value. The global bump is a contention point but a cheap one.
  3. Linear scan for a chunk with a clear bit. ~chunk != 0 ⇔ chunk has at least one zero bit. Wrap from end back to start_idx if we hit the array end.
  4. Find the first clear bit. Linear scan within the chunk, mask = 1 << i.
  5. CAS the bit set. Compare-exchange-weak on the chunk word, going from “current value” to “current value | mask.” Loop until success or detect the bit was set under us (someone else won).
  6. Maintain in-use counter (LIST_OF_CHUNKS only) and return the slot index.

The double-CAS pattern (the inner do { ... } while (!CAS_weak) plus the outer goto restart) is the standard optimistic CAS-retry protocol, the same shape as a Treiber-stack push: weak CAS may fail spuriously, so we loop; if the bit was set by another thread, we abandon the chunk and restart from start_idx.

```cpp
// lf_bitmap_free_entry — src/base/lockfree_bitmap.cpp:239
static void lf_bitmap_free_entry (LF_BITMAP *bitmap, int entry_idx)
{
  int pos = entry_idx / LF_BITFIELD_WORD_SIZE;
  int bit = entry_idx % LF_BITFIELD_WORD_SIZE;
  unsigned int inverse_mask = (unsigned int) (1 << bit);
  unsigned int mask = ~inverse_mask;
  unsigned int curr;

  do
    {
      curr = bitmap->bitfield[pos].load ();
      if (! (curr & inverse_mask))
        {
          assert (false);              // already free — caller bug
          return;
        }
    }
  while (!bitmap->bitfield[pos].compare_exchange_weak (curr, curr & mask));

  if (bitmap->style == LF_BITMAP_LIST_OF_CHUNKS)
    bitmap->entry_count_in_use--;

#if !defined (SERVER_MODE)
  bitmap->start_idx = pos;             /* hint for a free slot */
#endif
}
```

A single CAS-loop. The asserted invariant curr & inverse_mask catches caller errors (double-free); production builds with NDEBUG will silently no-op the second free, which is wrong but bounded.

```cpp
// bitmap::is_full — src/base/lockfree_bitmap.cpp:77
bool bitmap::is_full () const
{
  return ((float) entry_count_in_use.load ()) >= usage_threshold * entry_count;
}
```

Cheap O(1). Even for ONE_CHUNK style this returns (0 >= 1.0 * entry_count) which is always false (since entry_count > 0); thus is_full() is a no-op for ONE_CHUNK, matching the design contract that a ONE_CHUNK bitmap is sized to never exhaust.

Note: entry_count_in_use is not maintained in ONE_CHUNK style (the increment/decrement at the end of get_entry / free_entry is gated by style == LIST_OF_CHUNKS). For a ONE_CHUNK bitmap, querying entry_count_in_use returns 0 unconditionally; the in-use detection in get_entry instead relies on the wrap-around-without-finding-clear-bit signal.

```mermaid
flowchart LR
  Caller["Caller"]
  Init["init(style, count, ratio)"]
  Caller --> Init
  Init --> Bitfield["bitfield[]:<br/>std::atomic&lt;uint&gt; * chunk_count"]
  Init -.pad.-> LastWord["last word:<br/>trailing 1s"]

  Caller --> Get["get_entry()"]
  Get --> StartIdx["start_idx (atomic)"]
  StartIdx -.SERVER_MODE: bump.-> Get
  Get --> Scan["scan chunks for ~chunk != 0"]
  Scan --> Find["find first clear bit"]
  Find --> CAS["CAS bit 0->1"]
  CAS -- success --> Counter["if LIST_OF_CHUNKS:<br/>entry_count_in_use++"]
  Counter --> Return["return chunk_idx * 32 + bit"]

  Caller --> Free["free_entry(idx)"]
  Free --> CAS2["CAS bit 1->0"]
  CAS2 --> Counter2["if LIST_OF_CHUNKS:<br/>entry_count_in_use--"]
```

| Consumer | Style | Capacity | Purpose |
| --- | --- | --- | --- |
| lockfree::tran::system::m_tran_idx_map | ONE_CHUNK | max_tran_count | Modern per-thread tran-index dispenser |
| LF_TRAN_SYSTEM::lf_bitmap (legacy) | ONE_CHUNK | max_threads aligned up | Per-thread LF_TRAN_ENTRY index in legacy reclamation |
| area_alloc (in area_alloc.{h,c}) | LIST_OF_CHUNKS | per-area | General-purpose slot allocator (legacy memory pool) |

The first two are the dominant uses. area_alloc is a small client-mode helper not exercised on the SERVER_MODE hot path.

The header carries:

```cpp
using LF_BITMAP = lockfree::bitmap;
using LF_BITMAP_STYLE = lockfree::bitmap::chunking_style;

static const LF_BITMAP_STYLE LF_BITMAP_ONE_CHUNK = LF_BITMAP_STYLE::ONE_CHUNK;
static const LF_BITMAP_STYLE LF_BITMAP_LIST_OF_CHUNKS = LF_BITMAP_STYLE::LIST_OF_CHUNKS;

#define LF_BITMAP_FULL_USAGE_RATIO lockfree::bitmap::FULL_USAGE_RATIO
#define LF_BITMAP_95PERCENTILE_USAGE_RATIO lockfree::bitmap::NINTETYFIVE_PERCENTILE_USAGE_RATIO

#define LF_BITFIELD_WORD_SIZE (int) (sizeof (unsigned int) * 8)
#define LF_BITMAP_IS_FULL(bitmap) (bitmap)->is_full ()
#define LF_BITMAP_COUNT_ALIGN(count) \
  (((count) + (LF_BITFIELD_WORD_SIZE) - 1) & ~((LF_BITFIELD_WORD_SIZE) - 1))
```

This is the dual-vocabulary support — code in lock_free.c and area_alloc.c references LF_BITMAP_* macros, while modern C++ code references lockfree::bitmap::* direct names. Both resolve to the same class.

Every atomic access on bitfield[] and start_idx is seq_cst (the default for std::atomic operations). The operations are word-sized and cache-line-friendly; on x86 the seq_cst overhead is negligible. On ARM the additional fences are paid; tightening to acquire/release pairs would be a focused optimization (the fetch_or/fetch_and pattern is naturally an RMW with memory_order_acq_rel).

A note on the typo NINTETYFIVE_PERCENTILE_USAGE_RATIO

The constant is spelled NINTETYFIVE where NINETYFIVE is meant (an extra T inserted into “NINETY”). Every call site uses the typo spelling, so it is now load-bearing; renaming requires a codebase-wide sweep. Flagged for future cleanup.

  • lockfree::bitmap (class) — lockfree_bitmap.hpp:31
  • bitmap::init — lockfree_bitmap.cpp:53
  • bitmap::destroy:59
  • bitmap::get_entry:65
  • bitmap::free_entry:71
  • bitmap::is_full:77
  • lf_bitmap_init — lockfree_bitmap.cpp:83
  • lf_bitmap_destroy:124
  • lf_bitmap_get_entry:138
  • lf_bitmap_free_entry:239
  • LF_BITFIELD_WORD_SIZE — 32 (assumes sizeof(unsigned int) == 4)
  • LF_BITMAP_COUNT_ALIGN(count) — round up to multiple of word size
  • LF_BITMAP_IS_FULL(bm) — alias for (bm)->is_full()
  • LF_BITMAP_FULL_USAGE_RATIO / LF_BITMAP_95PERCENTILE_USAGE_RATIO
| Symbol | File | Line |
| --- | --- | --- |
| lockfree::bitmap (class) | src/base/lockfree_bitmap.hpp | 31 |
| chunking_style enum | src/base/lockfree_bitmap.hpp | 37 |
| bitmap::is_full | src/base/lockfree_bitmap.cpp | 77 |
| lf_bitmap_init | src/base/lockfree_bitmap.cpp | 83 |
| lf_bitmap_get_entry | src/base/lockfree_bitmap.cpp | 138 |
| lf_bitmap_free_entry | src/base/lockfree_bitmap.cpp | 239 |
| LF_BITMAP_COUNT_ALIGN macro | src/base/lockfree_bitmap.hpp | 89 |
| LF_BITFIELD_WORD_SIZE macro | src/base/lockfree_bitmap.hpp | 85 |
| tran::system::system (uses bitmap) | src/base/lockfree_transaction_system.cpp | 29 |
| lf_tran_system_init (legacy uses bitmap) | src/base/lock_free.c | 194 |
  • Word size is 32 bits, hard-coded by the unsigned int choice. LF_BITFIELD_WORD_SIZE = (int) (sizeof (unsigned int) * 8) at lockfree_bitmap.hpp:85. On every platform CUBRID supports, sizeof(unsigned int) == 4, so 32 bits per chunk word.
  • ONE_CHUNK requires FULL_USAGE_RATIO == 1.0f. Asserted at lockfree_bitmap.cpp:91: assert (style == LF_BITMAP_LIST_OF_CHUNKS || usage_threshold == 1.0f).
  • Trailing bits pre-set to 1 in the last word. lf_bitmap_init at :112-121 builds a mask of all bits above the requested count and writes it into the last word.
  • SERVER_MODE uses an atomic global round-robin start. lf_bitmap_get_entry at :160-167 increments bitmap->start_idx then modulos by chunk count; the non-SERVER_MODE branch reads start_idx without bumping and the free function writes start_idx = pos as a hint.
  • entry_count_in_use is LIST_OF_CHUNKS-only. Increment at :228 and decrement at :269 are gated by bitmap->style == LF_BITMAP_LIST_OF_CHUNKS. A ONE_CHUNK bitmap leaves the counter at 0 forever.
  • CAS uses compare_exchange_weak. Both get_entry (:224) and free_entry (:265) use weak CAS, so spurious failures on weak-memory architectures retry inside the loop.
  • is_full is O(1). Comparison of a single atomic load against a multiplied float; no scan.
  • The LIST_OF_CHUNKS “list of chunks” name is historical; the impl is a single contiguous bitfield. Confirmed by reading lf_bitmap_init: bitmap->bitfield = new std::atomic<unsigned int>[chunk_count]() is one contiguous allocation regardless of style.
  1. Does CUBRID’s bitmap ever exhaust under realistic ONE_CHUNK use? The assertion assert (false) in the wrap-around branch fires only if the configured pool was under-sized. Investigation: stress-test with high thread count to confirm tran::system::assign_index never returns INVALID_INDEX in normal operation.
  2. Is the global-atomic start_idx bump a measured bottleneck? Every get_entry in SERVER_MODE bumps it, so on a 64-core machine under heavy allocation traffic (e.g., a lock-free hashmap retiring entries aggressively) the cache line holding start_idx stays hot. A per-CPU sharded variant would avoid this; no profiling data exists.
  3. Why is free_entry’s assertion silent on production builds? A double-free in a LIST_OF_CHUNKS bitmap decrements entry_count_in_use twice, so the counter drifts below the true in-use count and can even go negative, after which is_full() under-reports fullness. A noisy er_set would be safer than an assert that compiles out.
  4. Word-size assumption portability. LF_BITFIELD_WORD_SIZE = sizeof(unsigned int) * 8 assumes unsigned int is 32 bits. On hypothetical 16-bit platforms, the LF_BITMAP_COUNT_ALIGN macro and trailing-bit padding would still work, but the cast (int) (...) could overflow for large pools. CUBRID does not target such platforms; flagged for clarity, not action.
  5. Refactor to make fields private. The header’s // todo: make private fields comment has been there long enough that “todo” reads more like “won’t” than “will.” Promoting to private would force lf_tran_system_init (which reaches into sys->lf_bitmap.bitfield[i] directly, lock_free.c:387) through accessors — non-zero refactor cost.

Beyond CUBRID — Comparative Designs & Research Frontiers

  • Sharded start hints. A per-CPU start_idx (sharded by sched_getcpu() or by thread index) would eliminate the global atomic bump’s contention on many-core machines.
  • 64-bit chunk words. Switching to uint64_t doubles the in-word allocation density and halves the chunk count for the same capacity. The CAS becomes 64-bit, also free on x86-64.
  • Hierarchical bitmap. A multi-level bitmap (top level marks “this child has at least one zero bit”) reduces the scan from O(chunks) to O(log chunks). Worth it only for very large pools, which CUBRID does not have.
  • __builtin_ctz for first-zero-bit search. The current inner loop is a linear scan from bit 0. Replacing with __builtin_ctz (~chunk) finds the first clear bit in O(1). Trivial micro-optimization; deferred because the inner loop is not a measured hotspot.
  • Replace bitmap with per-CPU freelist for transaction indexes. A free list of tran::index per CPU, with steal-on-empty, would eliminate the bitmap entirely and remove the round-robin bump. The trade is more code for less contention.
  • Promote the typo to a clean rename. NINTETYFIVE_PERCENTILE_USAGE_RATIO is universally typoed. A grep-replace pass would fix the spelling without semantic change.
  • Source files (in the CUBRID source tree at /data/hgryoo/references/cubrid/):
    • src/base/lockfree_bitmap.{hpp,cpp} (the entire implementation)
    • src/base/lock_free.h (legacy aliases and macros)
    • src/base/lock_free.c (legacy LF_TRAN_SYSTEM-side use)
    • src/base/lockfree_transaction_system.{hpp,cpp} (modern use)
    • src/base/area_alloc.h (legacy slot allocator using LIST_OF_CHUNKS)
  • Textbook chapters cited:
    • Tanenbaum, Modern Operating Systems, ch. 4 §“Memory Management” — the bitmap-vs-freelist tradeoff for slot allocators.
    • Herlihy & Shavit, The Art of Multiprocessor Programming, ch. 5 §“Wait-Free Synchronization” — progress conditions and the wait-free amortization argument.
  • Sibling docs in this repo:
    • cubrid-lockfree-overview.md — umbrella.
    • cubrid-lockfree-transaction.md — the dominant consumer of the bitmap (per-thread index allocation).