This content was automatically converted from the project's wiki Markdown to HTML. See the Basis Universal GitHub wiki for the latest content.
Copyright (C) 2025-2026 Binomial LLC. All rights reserved except as granted under the Apache 2.0 LICENSE. If you modify the Basis Universal source code, specifications, or wiki documents and redistribute the files, you must cause any modified files to carry prominent notices stating that you changed the files (see Apache 2.0 §4(b)).
Note: "Range Coding" and "Arithmetic Coding" are used interchangeably in the XUASTC LDR specification.
This document provides a specification for implementing the range coder used by XUASTC LDR, focusing exclusively on decompression (decoding). The decoder is based on the range coding approach described by Amir Said in his Hewlett-Packard Laboratories technical report HPL-2004-76, "Introduction to Arithmetic Coding - Theory and Practice" (April 21, 2004), and on his open-source implementations on GitHub. (Also see the Lossless Compression Handbook, ed. Khalid Sayood, 2002.) This specification is derived directly from the independent implementation in the Basis Universal compression library, which closely follows Said's original design with extensions and minor improvements. A compatible encoder can be created by following Said's technical report and open-source code.
The decoder supports adaptive binary and multi-symbol models, as well as specialized encoding schemes like truncated binary, Rice, and Gamma codes. It uses 32-bit unsigned integer arithmetic to avoid overflows, with renormalization to maintain precision. The design assumes input streams of at least 5 bytes and handles adaptive model updates during decoding.
At a high level, the decoder works by maintaining a current range (defined by a base value and length) within a large integer space, progressively narrowing this range based on decoded symbols or bits. As the range shrinks below a minimum threshold, renormalization expands it by shifting in more bytes from the input stream. Adaptive models (binary or multi-symbol) are used to estimate probabilities, which guide how the range is subdivided for each decoding step. These models update their statistics after each decode to adapt to the data distribution, improving compression efficiency over time.
Specialized decoders like Gamma or Rice build on basic bit decoding but incorporate adaptive contexts or fixed structures for efficient representation of non-negative integers. The overall system is modular: the core decoder state interacts with models during adaptive decoding, and non-adaptive routines serve as building blocks for higher-level ones.
This spec is structured from the lowest level (constants and basic structures) to the highest level (specialized decoding methods).
These constants define the fundamental limits and scaling factors for
the entire decoder system. They ensure consistent behavior across models
and decoding operations, preventing overflows and setting thresholds for
renormalization. For example, shifts like DM_LEN_SHIFT and
BM_LEN_SHIFT determine the precision of probability scaling
in multi-symbol and binary models, respectively, while
ARITH_MIN_LEN triggers renormalization in the core decoder
to maintain numerical stability. These values are used throughout the
classes and routines to bound arrays, scale probabilities, and control
loop behaviors.
ARITH_MAX_SYMS: 2048 (maximum symbols in a multi-symbol
model).
DM_LEN_SHIFT: 15 (shift for multi-symbol probabilities;
DM_MAX_COUNT = 1 << 15 = 32768).
BM_LEN_SHIFT: 13 (shift for binary probabilities;
BM_MAX_COUNT = 1 << 13 = 8192).
ARITH_MIN_LEN: 1 << 24 = 16777216 (threshold for
renormalization).
ARITH_MAX_LEN: 4294967295 (UINT32_MAX, initial/full range
length).
ARITH_MIN_EXPECTED_DATA_BUF_SIZE: 5 (minimum input buffer
size; shorter streams are invalid).
MAX_PUT_BITS_LEN: 20 (maximum bits for direct multi-bit
decoding, as a safety limit).
ARITH_GAMMA_MAX_PREFIX_CTX: 3 (contexts for Gamma
prefix).
ARITH_GAMMA_MAX_TAIL_CTX: 4 (contexts for Gamma tail).
All arithmetic uses unsigned 32-bit integers (uint32_t) unless noted otherwise.
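The constants above can be collected into a single C header fragment. This is a sketch: the identifier spellings follow this document, not necessarily the actual Basis Universal headers.

```c
#include <stdint.h>

/* Decoder-wide constants from the specification above. */
enum {
    ARITH_MAX_SYMS = 2048,                /* max symbols per multi-symbol model */
    DM_LEN_SHIFT   = 15,                  /* multi-symbol probability precision */
    DM_MAX_COUNT   = 1 << DM_LEN_SHIFT,   /* 32768 */
    BM_LEN_SHIFT   = 13,                  /* binary probability precision */
    BM_MAX_COUNT   = 1 << BM_LEN_SHIFT,   /* 8192 */
    ARITH_MIN_LEN  = 1 << 24,             /* renormalization threshold */
    ARITH_MIN_EXPECTED_DATA_BUF_SIZE = 5, /* shorter streams are invalid */
    MAX_PUT_BITS_LEN = 20,                /* max bits per get_bits call */
    ARITH_GAMMA_MAX_PREFIX_CTX = 3,       /* Gamma prefix contexts */
    ARITH_GAMMA_MAX_TAIL_CTX   = 4        /* Gamma tail contexts */
};

/* ARITH_MAX_LEN is UINT32_MAX, which doesn't fit in a plain int enum. */
static const uint32_t ARITH_MAX_LEN = 0xFFFFFFFFu;
```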
These data structures form the backbone of adaptive decoding, storing probability estimates and counts that evolve as data is decoded. The binary model handles simple 0/1 decisions, the multi-symbol model manages larger alphabets, and gamma contexts provide specialized binary models for Gamma coding. They interact with the core decoder by providing probability splits for range subdivision during decoding, and they are updated post-decode to refine estimates. Initialization and reset routines prepare them for use, often called at the start of decoding or when switching contexts in an application.
Adaptive Binary Bit Model (arith_bit_model)

The binary model is an adaptive probability estimator for bits (0 or 1). It maintains live counts of observed bits and a snapshot probability used for decoding. During decoding, the core decoder queries the model's probability to split the range, decodes the bit based on where the current value falls, then updates the counts. If the update countdown reaches zero, the model rescales and recomputes its probability. This adaptation allows it to learn from the data stream, improving accuracy for skewed distributions. It interacts with the decoder's adaptive bit routine and is used in contexts like Gamma coding.
bit0_count: uint32_t, count of 0 bits (live counter).
bit_count: uint32_t, total bits (live counter).
bit0_prob: uint32_t, scaled probability of 0 (snapshot from last update).
bits_until_update: int, countdown to next update.
update_interval: uint32_t, interval between updates.

This routine resets the model to a uniform prior (equal chance for 0 and 1), setting initial counts and intervals. It's called when creating a new model or resetting statistics, ensuring a clean start before decoding begins.
function reset_binary_model(model):
model.bit0_count = 1
model.bit_count = 2
model.bit0_prob = 1 << (BM_LEN_SHIFT - 1) // 4096
model.update_interval = 4
model.bits_until_update = 4
This routine is invoked after a certain number of bits have been decoded (when the countdown hits zero). It checks for count overflow and halves them if needed to prevent saturation, then recomputes the probability snapshot using scaling. The interval is adjusted to grow over time, delaying updates as the model stabilizes. This interacts with the adaptive bit decoder, which decrements the countdown after each decode.
function update_binary_model(model):
if model.bit_count >= BM_MAX_COUNT:
model.bit_count = (model.bit_count + 1) >> 1
model.bit0_count = (model.bit0_count + 1) >> 1
if model.bit0_count == model.bit_count:
model.bit_count += 1
scale = 0x80000000U / model.bit_count
model.bit0_prob = (model.bit0_count * scale) >> (31 - BM_LEN_SHIFT)
model.update_interval = clamp((5 * model.update_interval) >> 2, 4, 128)
model.bits_until_update = model.update_interval
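The binary model routines above translate directly into C. The sketch below follows the pseudocode; the struct and function names are this document's, not necessarily the library's. Note that right after a reset, an update leaves bit0_prob at 4096 (the uniform prior), since bit0_count / bit_count = 1/2.

```c
#include <stdint.h>

#define BM_LEN_SHIFT 13
#define BM_MAX_COUNT (1u << BM_LEN_SHIFT)

typedef struct {
    uint32_t bit0_count;       /* live count of decoded 0 bits */
    uint32_t bit_count;        /* live total bit count */
    uint32_t bit0_prob;        /* snapshot probability of a 0 bit */
    int      bits_until_update;
    uint32_t update_interval;
} arith_bit_model;

static uint32_t clamp_u32(uint32_t v, uint32_t lo, uint32_t hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

static void reset_binary_model(arith_bit_model *m) {
    m->bit0_count = 1;
    m->bit_count = 2;
    m->bit0_prob = 1u << (BM_LEN_SHIFT - 1); /* 4096: uniform prior */
    m->update_interval = 4;
    m->bits_until_update = 4;
}

static void update_binary_model(arith_bit_model *m) {
    if (m->bit_count >= BM_MAX_COUNT) {      /* rescale to avoid saturation */
        m->bit_count = (m->bit_count + 1) >> 1;
        m->bit0_count = (m->bit0_count + 1) >> 1;
        if (m->bit0_count == m->bit_count)
            m->bit_count += 1;
    }
    /* bit0_count * scale stays below 2^31, so uint32_t math is safe. */
    uint32_t scale = 0x80000000u / m->bit_count;
    m->bit0_prob = (m->bit0_count * scale) >> (31 - BM_LEN_SHIFT);
    m->update_interval = clamp_u32((5 * m->update_interval) >> 2, 4, 128);
    m->bits_until_update = (int)m->update_interval;
}
```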
Adaptive Multi-Symbol Data Model (arith_data_model)

The multi-symbol model estimates probabilities for an alphabet of up to 2048 symbols. It keeps live frequency counts and a snapshot of cumulative probabilities for efficient range splitting. During symbol decoding, the core decoder uses the cumulatives to find the symbol via binary search, then updates the frequency and total. Updates rescale frequencies on overflow and rebuild cumulatives. This model interacts with the adaptive symbol decoder, adapting to symbol distributions in the stream for better compression.
num_data_syms: uint32_t, number of symbols (0 if uninitialized).
total_sym_freq: uint32_t, sum of frequencies (live).
sym_freqs: array of uint32_t, frequencies per symbol (size num_data_syms, live histogram).
cum_sym_freqs: array of uint32_t, cumulative frequencies (size num_data_syms + 1, snapshot from last update).
update_interval: uint32_t, interval between updates.
num_syms_until_next_update: int, countdown to next update.

This sets up the model with a given number of symbols, allocates the arrays, and calls reset. It's the entry point for creating a model instance, preparing it for use in decoding.
function init_multi_model(model, num_syms, faster_update):
model.num_data_syms = num_syms
allocate model.sym_freqs[num_syms]
allocate model.cum_sym_freqs[num_syms + 1]
reset_multi_model(model, faster_update)
This fully clears the model's state, releasing arrays and resetting
to an uninitialized form. It's complementary to init for
object reuse.
function clear_multi_model(model):
clear model.cum_sym_freqs // e.g., resize to 0 or free
clear model.sym_freqs // e.g., resize to 0 or free
model.num_data_syms = 0
model.total_sym_freq = 0
model.update_interval = 0
model.num_syms_until_next_update = 0
This reinitializes frequencies to uniform (1 each), resets totals and
intervals, and triggers an initial update to build cumulatives. The
faster_update flag accelerates adaptation for small
alphabets. It's used to restart the model mid-stream or at the
beginning.
function reset_multi_model(model, faster_update):
if model.num_data_syms == 0:
return
for i = 0 to model.num_data_syms - 1:
model.sym_freqs[i] = 1
model.total_sym_freq = model.num_data_syms
model.update_interval = model.num_data_syms
model.num_syms_until_next_update = 0
update_multi_model(model)
if faster_update:
model.update_interval = clamp((model.num_data_syms + 7) / 8, 4, (model.num_data_syms + 6) << 3)
model.num_syms_until_next_update = model.update_interval
Invoked after decoding enough symbols (countdown zero), this halves frequencies on overflow, recomputes scaled cumulatives for probability mapping, and adjusts the update interval. It ensures the model doesn't saturate and adapts its update frequency.
function update_multi_model(model):
while model.total_sym_freq >= DM_MAX_COUNT:
model.total_sym_freq = 0
for n = 0 to model.num_data_syms - 1:
model.sym_freqs[n] = (model.sym_freqs[n] + 1) >> 1
model.total_sym_freq += model.sym_freqs[n]
scale = 0x80000000U / model.total_sym_freq
sum = 0
for i = 0 to model.num_data_syms - 1:
model.cum_sym_freqs[i] = (scale * sum) >> (31 - DM_LEN_SHIFT)
sum += model.sym_freqs[i]
model.cum_sym_freqs[model.num_data_syms] = DM_MAX_COUNT
model.update_interval = clamp((5 * model.update_interval) >> 2, 4, (model.num_data_syms + 6) << 3)
model.num_syms_until_next_update = model.update_interval
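The reset and update routines above can be sketched in C. This is a simplified illustration, not the library's code: it uses a small fixed-size array instead of heap allocation, and the update-interval growth/clamp logic is omitted (see the pseudocode above for the full behavior). For a freshly reset 4-symbol model, the rebuilt cumulative table is {0, 8192, 16384, 24576, 32768}.

```c
#include <stdint.h>

#define DM_LEN_SHIFT 15
#define DM_MAX_COUNT (1u << DM_LEN_SHIFT)
#define MAX_SYMS 16 /* small bound for this sketch; the spec allows up to 2048 */

typedef struct {
    uint32_t num_data_syms;
    uint32_t total_sym_freq;
    uint32_t sym_freqs[MAX_SYMS];
    uint32_t cum_sym_freqs[MAX_SYMS + 1];
    uint32_t update_interval;
    int      num_syms_until_next_update;
} arith_data_model;

static void update_multi_model(arith_data_model *m) {
    /* Halve frequencies until the total fits below DM_MAX_COUNT. */
    while (m->total_sym_freq >= DM_MAX_COUNT) {
        m->total_sym_freq = 0;
        for (uint32_t n = 0; n < m->num_data_syms; n++) {
            m->sym_freqs[n] = (m->sym_freqs[n] + 1) >> 1;
            m->total_sym_freq += m->sym_freqs[n];
        }
    }
    /* Rebuild the scaled cumulative table used for range subdivision.
       scale * sum stays below 2^31, so uint32_t math is safe. */
    uint32_t scale = 0x80000000u / m->total_sym_freq;
    uint32_t sum = 0;
    for (uint32_t i = 0; i < m->num_data_syms; i++) {
        m->cum_sym_freqs[i] = (scale * sum) >> (31 - DM_LEN_SHIFT);
        sum += m->sym_freqs[i];
    }
    m->cum_sym_freqs[m->num_data_syms] = DM_MAX_COUNT;
    /* Interval growth/clamp omitted in this sketch; see the pseudocode. */
    m->num_syms_until_next_update = (int)m->update_interval;
}

static void reset_multi_model(arith_data_model *m, uint32_t num_syms) {
    m->num_data_syms = num_syms;
    for (uint32_t i = 0; i < num_syms; i++)
        m->sym_freqs[i] = 1; /* uniform prior */
    m->total_sym_freq = num_syms;
    m->update_interval = num_syms;
    update_multi_model(m); /* build the initial cumulative table */
}
```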
Gamma Decoding Contexts (arith_gamma_contexts)

This structure groups multiple binary models for contextual Gamma decoding: prefix models for the unary length, tail models for the binary bits. Each context is a separate binary model, allowing adaptation per bit position. It interacts with the Gamma decoder, which selects a model based on bit position for better prediction in variable-length codes.
ctx_prefix: array of arith_bit_model (size ARITH_GAMMA_MAX_PREFIX_CTX = 3), for the unary prefix.
ctx_tail: array of arith_bit_model (size ARITH_GAMMA_MAX_TAIL_CTX = 4), for the binary suffix.

Usage: Reset each arith_bit_model individually using the binary reset method.
function init_gamma_contexts(ctxs):
for i = 0 to ARITH_GAMMA_MAX_PREFIX_CTX - 1:
reset_binary_model(ctxs.ctx_prefix[i])
for i = 0 to ARITH_GAMMA_MAX_TAIL_CTX - 1:
reset_binary_model(ctxs.ctx_tail[i])
Decoder State (arith_dec)

The decoder state encapsulates the input buffer and current range (value and length). It serves as the central hub, providing the context for all decoding operations. Basic bit decoders modify the range directly, while adaptive ones use models to guide splits. Renormalization is called from within decoding routines when the length drops too low, loading more input bytes.
p_data_buf: const uint8_t*, start of input buffer.
p_data_buf_last_byte: const uint8_t*, last byte of buffer.
p_data_buf_cur: const uint8_t*, current read position.
data_buf_size: size_t, buffer size.
value: uint32_t, current code value.
length: uint32_t, current range length.

This clears all state, preparing for a new initialization. It's useful for reusing the decoder object.
function clear_decoder(dec):
dec.p_data_buf = null
dec.p_data_buf_last_byte = null
dec.p_data_buf_cur = null
dec.data_buf_size = 0
dec.value = 0
dec.length = 0
This sets up the decoder with an input buffer, loading the initial 32-bit value and setting the full range. It skips the first 4 bytes for the initial value load, assuming the stream starts with encoded data. This is the starting point for any decoding session.
function init_decoder(dec, pBuf, buf_size):
if buf_size < 5:
return false
dec.p_data_buf = pBuf
dec.p_data_buf_last_byte = pBuf + buf_size - 1
dec.p_data_buf_cur = dec.p_data_buf + 4
dec.data_buf_size = buf_size
dec.value = (pBuf[0] << 24) | (pBuf[1] << 16) | (pBuf[2] << 8) | pBuf[3]
dec.length = ARITH_MAX_LEN
return true
Renormalization (renorm)

Renormalization is a core utility routine that expands the decoding
range when it becomes too narrow for precise subdivision. It's called
inline from all decoding operations whenever length falls below
ARITH_MIN_LEN, shifting in 8 bits (a byte) from the input
stream each time. If the end of the buffer is reached, it pads with 0s.
This maintains the decoder's precision and allows continued operation on
long streams without overflow.
function renorm(dec):
while dec.length < ARITH_MIN_LEN:
next_byte = if dec.p_data_buf_cur > dec.p_data_buf_last_byte then 0 else *dec.p_data_buf_cur++
dec.value = (dec.value << 8) | next_byte
dec.length <<= 8
These routines provide fundamental non-adaptive bit extraction, serving as primitives for higher-level decoders. They directly manipulate the decoder state's range without models, narrowing it based on the decoded bits and triggering renormalization as needed. They are building blocks for truncated binary, Rice, and adaptive methods.
Decode a Raw Bit (get_bit)

This decodes a single bit by halving the range and checking where the value falls. If it's in the upper half, the bit is a 1 and the range shifts down; otherwise it's a 0. Renormalization follows if needed. It's the simplest operation, used extensively in unary prefixes and binary tails.
function get_bit(dec):
dec.length >>= 1
bit = if dec.value >= dec.length then 1 else 0
if bit == 1:
dec.value -= dec.length
if dec.length < ARITH_MIN_LEN:
renorm(dec)
return bit
Decode Raw Bits (get_bits, num_bits from 1 to MAX_PUT_BITS_LEN = 20)

This extracts a fixed-length integer by dividing the range into 2^num_bits equal parts and finding the sub-range containing the value. It updates the state accordingly and renormalizes. It's efficient for small fixed-width fields, like the remainders in Rice coding.
function get_bits(dec, num_bits):
if num_bits < 1 or num_bits > 20:
return 0 // or error
dec.length >>= num_bits
v = dec.value / dec.length
dec.value -= dec.length * v
if dec.length < ARITH_MIN_LEN:
renorm(dec)
return v
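The decoder state, renormalization, and the two raw-bit primitives combine into a small runnable C sketch. The behavior follows directly from the pseudocode: for example, on the stream 80 00 00 00 00 the first call to get_bit yields 1, simply because the initial code value lands in the upper half of the range.

```c
#include <stdint.h>
#include <stddef.h>

#define ARITH_MIN_LEN (1u << 24)
#define ARITH_MAX_LEN 0xFFFFFFFFu

typedef struct {
    const uint8_t *p_data_buf, *p_data_buf_last_byte, *p_data_buf_cur;
    size_t data_buf_size;
    uint32_t value, length;
} arith_dec;

static int init_decoder(arith_dec *d, const uint8_t *p, size_t n) {
    if (n < 5) return 0;          /* streams shorter than 5 bytes are invalid */
    d->p_data_buf = p;
    d->p_data_buf_last_byte = p + n - 1;
    d->p_data_buf_cur = p + 4;    /* first 4 bytes seed the code value */
    d->data_buf_size = n;
    d->value = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
               ((uint32_t)p[2] << 8) | p[3];
    d->length = ARITH_MAX_LEN;
    return 1;
}

static void renorm(arith_dec *d) {
    while (d->length < ARITH_MIN_LEN) { /* shift in a byte; pad with 0 past EOF */
        uint32_t b = (d->p_data_buf_cur > d->p_data_buf_last_byte)
                         ? 0 : *d->p_data_buf_cur++;
        d->value = (d->value << 8) | b;
        d->length <<= 8;
    }
}

static uint32_t get_bit(arith_dec *d) {
    d->length >>= 1;              /* split the range in half */
    uint32_t bit = (d->value >= d->length);
    if (bit) d->value -= d->length;
    if (d->length < ARITH_MIN_LEN) renorm(d);
    return bit;
}

static uint32_t get_bits(arith_dec *d, uint32_t num_bits) {
    d->length >>= num_bits;       /* split into 2^num_bits equal parts */
    uint32_t v = d->value / d->length;
    d->value -= d->length * v;
    if (d->length < ARITH_MIN_LEN) renorm(d);
    return v;
}
```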
These build on basic bit operations to decode variable-length codes
for non-negative integers, without adaptation. They are self-contained
but can be used in contexts where models aren't needed, and they
interact with the decoder state via get_bit and
get_bits.
Truncated Binary Decoding (decode_truncated_binary, n >= 2)

This efficiently decodes a uniform value from 0 to n-1 using a variable number of bits (floor(log2(n)) or one more). It computes a threshold and potentially decodes an extra bit, minimizing bits for non-power-of-2 ranges. It's a low-level optimizer for bounded integers.
function decode_truncated_binary(dec, n):
k = floor_log2(n)
u = (1 << (k + 1)) - n
result = get_bits(dec, k)
if result >= u:
result = (result << 1) | get_bits(dec, 1)
result -= u
return result
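floor_log2 is the only helper decode_truncated_binary needs; a common C implementation uses a count-leading-zeros intrinsic (GCC/Clang's __builtin_clz here). The sketch also shows how the threshold u partitions the n codewords: for n = 5, k = 2 and u = 3, so values 0..2 take 2 bits and values 3..4 take 3 bits; for a power of two like n = 8, u = n and the extra-bit branch never triggers.

```c
#include <stdint.h>

/* floor(log2(n)) for n >= 1, via a count-leading-zeros builtin
   (GCC/Clang; use an equivalent intrinsic on other compilers). */
static uint32_t floor_log2(uint32_t n) {
    return 31u - (uint32_t)__builtin_clz(n);
}

/* Truncated binary threshold: codewords below u use k bits, the rest k+1. */
static uint32_t truncated_binary_threshold(uint32_t n) {
    uint32_t k = floor_log2(n);
    return (1u << (k + 1)) - n;
}
```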
Rice Decoding (decode_rice, m > 0)

Rice coding decodes large values efficiently: a unary quotient
(sequence of 1s ended by 0) followed by a fixed m-bit remainder. It uses
get_bit for the unary part and get_bits for
the binary, with a sanity limit on quotient size. Useful for geometric
distributions.
function decode_rice(dec, m):
q = 0
while true:
k = get_bit(dec)
if k == 0:
break
q += 1
if q > 64:
return 0 // error
return (q << m) + get_bits(dec, m)
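The value the Rice decoder reconstructs is just (q << m) + r. For example, with m = 3 the value 13 splits into quotient q = 1 and remainder r = 5. The encoder-side split can be sketched as a small hypothetical helper (for illustration only; it is not part of the decoder):

```c
#include <stdint.h>

/* Split a value into the Rice quotient/remainder pair that
   decode_rice reassembles as (q << m) + r. */
static void rice_split(uint32_t v, uint32_t m, uint32_t *q, uint32_t *r) {
    *q = v >> m;              /* encoded as q 1-bits followed by a 0 */
    *r = v & ((1u << m) - 1); /* encoded as m raw bits via get_bits */
}
```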
These top-level routines incorporate adaptation or contexts, updating models after each decode. They represent the full power of the system for compressed streams.
Adaptive Bit Decoding (decode_bit, using an arith_bit_model)

This decodes a bit using the model's current probability to split the range asymmetrically. After decoding, it updates the counts and potentially the model. It combines range coding with adaptation for binary streams.
function decode_bit(dec, dm):
x = dm.bit0_prob * (dec.length >> BM_LEN_SHIFT)
bit = if dec.value >= x then 1 else 0
if bit == 0:
dec.length = x
dm.bit0_count += 1
else:
dec.value -= x
dec.length -= x
dm.bit_count += 1
if dec.length < ARITH_MIN_LEN:
renorm(dec)
dm.bits_until_update -= 1
if dm.bits_until_update <= 0:
update_binary_model(dm)
return bit
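The asymmetric split in decode_bit can be sanity-checked numerically. With a freshly reset model (bit0_prob = 4096, i.e., probability 1/2 at 13-bit precision) and a full range (length = 0xFFFFFFFF), the split point x lands just below the range midpoint; a 3/4 probability (bit0_prob = 6144) lands near the three-quarter point. A minimal sketch of just the split computation:

```c
#include <stdint.h>

#define BM_LEN_SHIFT 13

/* The decode_bit split point: the range below x maps to bit 0,
   the range at or above x maps to bit 1. */
static uint32_t split_point(uint32_t bit0_prob, uint32_t length) {
    return bit0_prob * (length >> BM_LEN_SHIFT);
}
```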
Adaptive Gamma Decoding (decode_gamma, using arith_gamma_contexts)

Gamma decoding handles variable-length codes for positive integers: an adaptive unary prefix (using the prefix contexts) determines the bit length k, then an adaptive binary suffix (using the tail contexts) fills in the value. It uses decode_bit for each bit, adapting per position for better efficiency on power-law data.
function decode_gamma(dec, ctxs):
k = 0
while decode_bit(dec, ctxs.ctx_prefix[min(k, ARITH_GAMMA_MAX_PREFIX_CTX-1)] ) == 1:
k += 1
if k > 16:
return 0 // error
n = 1 << k
for i = k-1 downto 0:
bit = decode_bit(dec, ctxs.ctx_tail[min(i, ARITH_GAMMA_MAX_TAIL_CTX-1)] )
n |= (bit << i)
return n
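The gamma code maps each n in [2^k, 2^(k+1)) to a k-bit unary prefix plus k tail bits; for example n = 6 has k = 2 and tail bits '10'. The encoder-side split can be sketched as a hypothetical helper (for illustration; not part of the decoder):

```c
#include <stdint.h>

/* Split n >= 1 into the (k, tail) pair that decode_gamma
   reconstructs as (1 << k) | tail. Uses GCC/Clang's clz builtin. */
static void gamma_split(uint32_t n, uint32_t *k, uint32_t *tail) {
    *k = 31u - (uint32_t)__builtin_clz(n); /* bit length minus 1 */
    *tail = n & ((1u << *k) - 1);          /* bits below the leading 1 */
}
```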
Adaptive Symbol Decoding (decode_sym, using an arith_data_model)

This decodes a symbol by binary-searching the model's cumulatives to find the matching sub-range, then narrowing the decoder's range. Post-decode, it increments the symbol's frequency and potentially updates the model. Ideal for arbitrary alphabets with changing probabilities.
function decode_sym(dec, dm):
x = 0
y = dec.length
dec.length >>= DM_LEN_SHIFT
low_idx = 0
hi_idx = dm.num_data_syms
mid_idx = hi_idx >> 1
do:
z = dec.length * dm.cum_sym_freqs[mid_idx]
if z > dec.value:
hi_idx = mid_idx
y = z
else:
low_idx = mid_idx
x = z
mid_idx = (low_idx + hi_idx) >> 1
while mid_idx != low_idx
dec.value -= x
dec.length = y - x
if dec.length < ARITH_MIN_LEN:
renorm(dec)
dm.sym_freqs[low_idx] += 1
dm.total_sym_freq += 1
dm.num_syms_until_next_update -= 1
if dm.num_syms_until_next_update <= 0:
update_multi_model(dm)
return low_idx
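The binary search in decode_sym can be exercised standalone. With a uniform 4-symbol cumulative table {0, 8192, 16384, 24576, 32768} (as built by the multi-model update for four equal frequencies) and a full range, a code value at the range midpoint selects symbol 2. This sketch mirrors only the search loop, with the range-narrowing and model update omitted:

```c
#include <stdint.h>

#define DM_LEN_SHIFT 15

/* Find the symbol whose scaled cumulative interval contains `value`,
   mirroring the do/while search in decode_sym. `cum` has num_syms + 1
   entries; scaled * cum[i] stays below 2^32 by construction. */
static uint32_t find_sym(const uint32_t *cum, uint32_t num_syms,
                         uint32_t length, uint32_t value) {
    uint32_t scaled = length >> DM_LEN_SHIFT;
    uint32_t low = 0, hi = num_syms, mid = hi >> 1;
    do {
        if (scaled * cum[mid] > value)
            hi = mid;   /* value lies below this cumulative boundary */
        else
            low = mid;  /* value lies at or above it */
        mid = (low + hi) >> 1;
    } while (mid != low);
    return low;
}
```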
Note: floor_log2(n) may be implemented with a count-leading-zeros intrinsic (e.g., 31 - clz(n) in C). The reference decoder implementation can be found in transcoder/basisu_transcoder_internal.h.