GPU Textures as Universal Latents

This content was automatically converted from the project's wiki Markdown to HTML. See the Basis Universal GitHub wiki for the latest content.

Overview

Basis Universal formats can be best understood as a family of universal texture latents. Latents in this context are explicitly engineered representations, not learned or neural.

Rather than treating GPU texture formats (ASTC, BCn, ETC, PVRTC, etc.) as canonical assets, Basis Universal defines a set of compact, structured latent representations that are:

stored and distributed as the canonical data
perceptually optimized
format-agnostic
and rapidly projected into vendor GPU texture formats at load time

In this model, GPU textures are derived (compiled) data, not primary assets. GPU block formats were always latent models. We just treated them as opaque compression schemes, until now.

What Is a Latent?

A latent variable is a compact, unobserved representation from which observed data can be derived — a concept originating in statistical factor analysis over a century ago. Basis Universal applies this idea directly: GPU texture blocks are treated as observed data, while the universal formats define structured latents from which those blocks are synthesized.

  latent  →  GPU texture blocks  →  decoded texels

A latent is a coordinate system for texture information that:

is not pixel data
is not a specific GPU block format
preserves the information needed to synthesize GPU blocks directly
supports fast, deterministic projection into multiple target formats

Unlike traditional intermediates, Basis Universal latents are:

compact
block-native
structured for prediction, transforms, and entropy coding
designed for direct block synthesis, not decode → recompress workflows

The Basis Universal latents are structured, entropy-coded intermediate representations for standardized GPU generative decoders. These decoders are globally standardized, engineered generative models implemented in silicon, and Basis Universal now wraps an intermediate representation (IR) around them.

Basis Universal Image/Texture Pipeline

Source Image
    ↓
[Encode to Universal Latent]
    ↓ (Analysis-by-Synthesis encoder, hyper-adaptive to image/texture content via block-local PCA)
ETC1S / UASTC / XUASTC / UASTC HDR
    ↓ (stored/transmitted)
[Transcode to GPU Format, Potentially with Deblocking for non-ASTC]
    ↓ (direct latent to latent, or analytically encoded with no search)
ASTC / BC7 / BC6H / ETC1 / etc.
    ↓
GPU Hardware: Standardized Generative Decoders Already in Billions of Devices
    ↓
Pixels (Potentially Deblocked Using a Pixel Shader for ASTC)
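
The "block-local PCA" step in the diagram can be illustrated with a generic endpoint fit. This is a minimal sketch, not the Basis Universal encoder: it assumes a block is a plain list of RGB tuples, and the helper names `principal_axis` and `fit_endpoints` are hypothetical. It finds the dominant color axis of one block by power iteration on the 3x3 covariance matrix, then projects the texels onto that axis to obtain two candidate endpoints.

```python
def principal_axis(colors, iters=32):
    """Return (mean, unit axis) of the dominant color direction in one block."""
    n = len(colors)
    mean = [sum(c[k] for c in colors) / n for k in range(3)]
    # Unnormalized 3x3 covariance of the block's colors.
    cov = [[sum((c[i] - mean[i]) * (c[j] - mean[j]) for c in colors)
            for j in range(3)] for i in range(3)]
    # Start from the covariance row of the channel with the largest variance.
    start = max(range(3), key=lambda i: cov[i][i])
    if cov[start][start] == 0.0:
        return mean, [0.0, 0.0, 0.0]  # flat block: no direction to find
    v = list(cov[start])
    for _ in range(iters):  # power iteration converges to the principal axis
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return mean, v


def fit_endpoints(colors):
    """Project texels onto the principal axis and return the two extremes."""
    mean, axis = principal_axis(colors)
    t = [sum((c[k] - mean[k]) * axis[k] for k in range(3)) for c in colors]
    lo = [mean[k] + min(t) * axis[k] for k in range(3)]
    hi = [mean[k] + max(t) * axis[k] for k in range(3)]
    return lo, hi


# A 16-texel gray ramp: the fitted endpoints land on black and white.
ramp = [(i * 17, i * 17, i * 17) for i in range(16)]
lo, hi = fit_endpoints(ramp)
```

A real encoder would go on to quantize these endpoints and choose per-texel weights inside an analysis-by-synthesis loop; the sketch stops at the axis fit.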

Key Insight

GPU formats are execution formats.
Universal latents are distribution formats.

The latent is closer to the reconstruction model than to pixels.
Transform coding the latent is more efficient than coding pixels directly.

This decouples:

Authoring (encode once)
Distribution (very compact, universal)
Execution (GPU-specific, potentially cached locally - like shader binaries)
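
The "cached locally, like shader binaries" idea can be sketched as a small memoizing layer. This is an illustration under stated assumptions, not part of Basis Universal: `TranscodeCache` and `fake_transcode` are hypothetical names, and the transcoder is passed in as a plain callable. The latent stays the canonical asset, while GPU-format data is a derived artifact keyed by the latent's content hash and the target format.

```python
import hashlib


class TranscodeCache:
    """Memoizes derived GPU data keyed by (latent content hash, target format)."""

    def __init__(self, transcode_fn):
        self._transcode = transcode_fn  # hypothetical: (bytes, str) -> bytes
        self._store = {}

    def get(self, latent, target_format):
        key = (hashlib.sha256(latent).hexdigest(), target_format)
        if key not in self._store:
            # Cache miss: derive the GPU data from the canonical latent.
            self._store[key] = self._transcode(latent, target_format)
        return self._store[key]


calls = []


def fake_transcode(latent, target_format):
    # Stand-in for a real transcoder; records the call and tags the bytes.
    calls.append(target_format)
    return target_format.encode() + b":" + latent


cache = TranscodeCache(fake_transcode)
cache.get(b"latent-bytes", "BC7")
cache.get(b"latent-bytes", "BC7")   # served from the cache
cache.get(b"latent-bytes", "ASTC")  # different target, transcoded again
```

Because the key depends only on the latent bytes and the target format, the cache can be discarded at any time and rebuilt from the distributed latent.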

The Basis Universal Latent Family

Each Basis Universal format defines a latent optimized for a specific region of the design space (bitrate, quality, HDR/LDR, block size, transcoding throughput/complexity, specification complexity, transcoder WASM code size).

ETC1S

Very low-rate, simple latent

Highly constrained grammar based on ETC1: identical to ETC1 except that subblocks are not used, a subset we named "ETC1S"
Low bitrate: roughly 0.75-2 bpp over network
Optimized for tiny payloads, very fast and simple transcoding to vendor formats even in plain WASM
Trades expressiveness for entropy friendliness
Supports temporal supercompression; the BC7 transcoder adds real-time chroma filtering
Ideal for large-scale distribution of geospatial texture data

UASTC LDR 4x4

High-quality, low-distortion latent for textures

High bitrate: roughly 4.5-8 bpp over network, 4 bpp or 8 bpp in memory
Latent grammar consists of 19 modes
Minimal transforms: the latent is a subset of ASTC LDR, constrained for direct transcoding to BC7 with no pixel-wise recompression
Latent has hint bits to accelerate transcoding to other common formats (BC1, ETC1)
Fast, mostly analytical transcodes
Suitable for high-quality universal delivery

XUASTC LDR 4x4-12x12

Hyper-adaptive and scalable transform-domain latent

Explicit prediction, Weight Grid DCT (the first supercompressed format to use it), and multiple entropy coding profiles
Latent grammar consists of 13,659 ASTC LDR configurations
Bitrate range: roughly 0.3-5.7 bpp over network, 0.89-8.00 bpp in memory (on ASTC devices, otherwise 4 bpp or 8 bpp)
Tunable bitrate / quality tradeoffs, adaptive deblocking at larger block sizes
Supports all 14 ASTC block sizes, with fast direct latent to latent transcoding to BC7 for the common 4x4, 6x6, and 8x6 block sizes
Highly analytical AbS (analysis by synthesis) encoder uses DCT to measure weight grid downsampling error, with closed form calculations to estimate weight/endpoint quantization errors
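
The in-memory figures above follow from a fixed property of ASTC: every block occupies 128 bits regardless of its footprint, so the bitrate is 128 / (width × height) bits per pixel. The sketch below lists the 14 two-dimensional ASTC block footprints and computes that rate; the function names are illustrative only.

```python
# The 14 two-dimensional ASTC block footprints (width, height).
ASTC_2D_BLOCK_SIZES = [
    (4, 4), (5, 4), (5, 5), (6, 5), (6, 6), (8, 5), (8, 6),
    (8, 8), (10, 5), (10, 6), (10, 8), (10, 10), (12, 10), (12, 12),
]


def astc_bpp(block_w, block_h):
    """Bits per pixel in memory: each ASTC block is always 128 bits."""
    return 128 / (block_w * block_h)


def astc_texture_bytes(width, height, block_w, block_h):
    """Size of one mip level; partial blocks at the edges are rounded up."""
    blocks_x = (width + block_w - 1) // block_w
    blocks_y = (height + block_h - 1) // block_h
    return blocks_x * blocks_y * 16  # 16 bytes per block


for bw, bh in ASTC_2D_BLOCK_SIZES:
    print(f"{bw}x{bh}: {astc_bpp(bw, bh):.2f} bpp")
```

The two extremes, 4x4 and 12x12, give the 8.00 and 0.89 bpp endpoints of the in-memory range quoted above.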

UASTC HDR 4×4

Strict HDR latent

100% standard ASTC HDR 4x4 block data on disk/over network
Carefully chosen, constrained ASTC HDR latent grammar (24 modes), fast to encode
Very high quality HDR at fixed bitrate: 8 bpp over network/in memory (less if Zstd compression applied in container)
Directly compatible with ASTC HDR, with direct latent to latent transcode to BC6H
HDR AbS encoder results in predictable quality

UASTC HDR 6×6i (Intermediate)

Photographic HDR latent

Much richer latent grammar than HDR 4×4 (75 modes), but still practical to encode
Perceptual Lagrangian encoder uses a modern HDR color difference metric (ΔE ITP) and SSIM; targets roughly 0.75-3 bpp over network, 3.56 bpp (ASTC) or 8.0 bpp (BC6H) in memory
Expands into standard ASTC HDR 6×6 blocks, transcodes to full BC6H in real-time using an analytical encoder
Optimized for photographic HDR content at scale, also tested thoroughly with upconverted LDR/SDR content
HDR AbS encoder results in predictable quality
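
To compare the family members on a concrete asset, the approximate network bitrates listed in this section can be turned into payload estimates. This is a rough sketch: the ranges are the "roughly" figures quoted above, the estimate covers a single mip level with no container overhead, and `NETWORK_BPP`, `payload_bytes`, and `payload_range` are hypothetical names.

```python
# Approximate over-network bitrates (bits per pixel) quoted in this section.
NETWORK_BPP = {
    "ETC1S": (0.75, 2.0),
    "UASTC LDR 4x4": (4.5, 8.0),
    "XUASTC LDR": (0.3, 5.7),
    "UASTC HDR 4x4": (8.0, 8.0),
    "UASTC HDR 6x6i": (0.75, 3.0),
}


def payload_bytes(width, height, bpp):
    """Estimated size in bytes of one mip level at the given bitrate."""
    return round(width * height * bpp / 8)


def payload_range(fmt, width, height):
    """(low, high) byte estimate for one mip level in the named format."""
    lo, hi = NETWORK_BPP[fmt]
    return payload_bytes(width, height, lo), payload_bytes(width, height, hi)


for name in NETWORK_BPP:
    lo, hi = payload_range(name, 2048, 2048)
    print(f"{name}: {lo / 1024:.0f} to {hi / 1024:.0f} KiB")
```

For a 2048x2048 texture, ETC1S comes out at roughly 384 KiB to 1 MiB under these assumptions; actual sizes depend on content and encoder settings.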

The Key Property: Projection Speed

Traditional pipelines operate as:

  compressed → full pixels → recompress

Basis Universal latents operate as:

  latent → direct block synthesis

  or

  latent → transcoded latent → direct block synthesis

This means:

no expensive per-block searches
no iterative re-encoding
often no floating point
frequently no iteration at all

For this reason, “transcode” is the correct term — GPU formats are produced by projection, not by decoding and re-encoding pixels.
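
The difference between projection and re-encoding can be shown with a toy example. The latent here is hypothetical and much simpler than any real Basis Universal format: two RGB endpoints plus sixteen 2-bit selectors, where selector 0 means the low endpoint and 3 means the high endpoint. Projecting it to a BC1 block needs only endpoint quantization to RGB565 and a 4-entry selector remap, with no pixel decode and no search. The BC1 layout used (two little-endian 16-bit colors followed by 32 bits of 2-bit indices) is the standard one; the function names are illustrative.

```python
import struct


def to_565(rgb):
    """Quantize an 8-bit RGB triple to a packed RGB565 value."""
    r, g, b = rgb
    r5 = (r * 31 + 127) // 255
    g6 = (g * 63 + 127) // 255
    b5 = (b * 31 + 127) // 255
    return (r5 << 11) | (g6 << 5) | b5


# BC1 four-color mode (color0 > color1): index 0 = color0, 1 = color1,
# 2 = (2*color0 + color1)/3, 3 = (color0 + 2*color1)/3.
HI_FIRST = (1, 3, 2, 0)  # remap when color0 holds the high endpoint
LO_FIRST = (0, 2, 3, 1)  # remap when color0 holds the low endpoint


def project_to_bc1(lo, hi, selectors):
    """Synthesize an 8-byte BC1 block directly from the toy latent."""
    c_lo, c_hi = to_565(lo), to_565(hi)
    if c_lo == c_hi:
        c0 = c1 = c_lo
        table = (0, 0, 0, 0)  # solid block: every texel uses color0
    elif c_hi > c_lo:
        c0, c1, table = c_hi, c_lo, HI_FIRST
    else:
        c0, c1, table = c_lo, c_hi, LO_FIRST
    bits = 0
    for i, s in enumerate(selectors):
        bits |= table[s] << (2 * i)  # texel i occupies bits 2i..2i+1
    return struct.pack("<HHI", c0, c1, bits)


block = project_to_bc1((0, 0, 0), (255, 255, 255), [0, 1, 2, 3] * 4)
```

The endpoint ordering is chosen so that color0 > color1, which keeps BC1 in its four-color mode; the remap tables absorb that swap, so the selectors never need to be recomputed from pixels.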

Why This Is Different From Classic Intermediates

Historically, intermediate formats were:

large
pixel-based (they operate on, or decompress to, pixels)
slow to re-encode

Basis Universal latents are:

compact
lower dimensional
structured
block-native

They live between pixels and GPU formats, not close to either extreme.
As a result, the same latent can efficiently serve:

ASTC
BCn
ETC
PVRTC
or uncompressed RGBA output
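
Serving several targets from one latent implies a late, per-device format choice. The sketch below shows one way such a choice could look for LDR content. The capability strings, the preference order, and the `choose_target` function are all assumptions made for illustration; they are not the Basis Universal API, and a real application would pick an order suited to its own quality and memory goals.

```python
# Hypothetical preference order for LDR content, best first.
LDR_PREFERENCE = ["ASTC", "BC7", "ETC1", "PVRTC"]


def choose_target(device_formats):
    """Pick the first preferred format the device reports as supported.

    device_formats: set of hypothetical capability strings, e.g. {"BC7"}.
    Falls back to uncompressed RGBA when no block format is available.
    """
    for fmt in LDR_PREFERENCE:
        if fmt in device_formats:
            return fmt
    return "RGBA32"


desktop = choose_target({"BC7"})
mobile = choose_target({"ASTC", "ETC1"})
fallback = choose_target(set())
```

The stored latent is the same in all three cases; only the projection applied at load time differs.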

Consistency Across LDR and HDR

Once the system is viewed as latent design rather than encoder design:

HDR is no longer a special case
LDR is no longer forgiving
perceptual metrics become mandatory
grammar constraints become obvious, not arbitrary

HDR support was not added by extending a codec — it was added by introducing another latent into the same architectural framework.

The Core Implication

Basis Universal decouples asset distribution from GPU format choice.

This enables pipelines where textures are:

authored once
stored once
streamed once
and adapted late, cheaply, and deterministically

This is the foundation for scalable pipelines operating at terabyte or petabyte scale, where GPU texture formats are treated as cached, derived artifacts.

Summary

Basis Universal defines a set of universal texture latents, each targeting a specific Pareto-optimal point in the design space.

These latents:

are the canonical stored representation
project extremely efficiently into real GPU texture formats
scale consistently across LDR and HDR
and form a new, stable layer in the graphics stack