diffulex.utils
diffulex.utils contains shared helpers that are not tied to a single model
family, strategy, or serving path. The modules here are still part of the
runtime path: checkpoint loading, tokenizer construction, and output accounting
all flow through this package.
| Module | Main responsibility |
|---|---|
| checkpoint | Small dataclasses used by checkpoint weight resolution. |
| loader | Base-model and LoRA weight loading. |
| output | Generation trajectories, text conversion, and benchmark metrics. |
| registry | Display helpers for registry factories. |
| tokenizer | Robust Hugging Face tokenizer construction. |
diffulex.utils.checkpoint
checkpoint defines the value objects used by model and layer code when a
checkpoint tensor needs custom loading behavior.
| Symbol | How to use it | What it does |
|---|---|---|
| | Receive it inside a module’s | Carries the engine |
| | Return it from | Describes where a tensor should go: a parameter, a buffer, a custom loader, a transform, a shard id, or a skip marker. |
Use ResolvedWeight when checkpoint names do not map cleanly to PyTorch
parameter names. It keeps family-specific mapping logic near the model or layer
that owns the weight, while loader handles the actual copy.
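As a rough illustration of that division of labor, the sketch below uses a stand-in dataclass, not the real ResolvedWeight. The field names, the `ToyAttention` layer, and its `resolve` method are all hypothetical and chosen only to mirror the destinations listed in the table above.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ResolvedWeightSketch:
    """Hypothetical stand-in; the real ResolvedWeight fields may differ."""

    param_name: Optional[str] = None     # destination parameter, if any
    buffer_name: Optional[str] = None    # destination buffer, if any
    loader: Optional[Callable] = None    # custom copy routine
    transform: Optional[Callable] = None # applied to the tensor before copying
    shard_id: Optional[str] = None       # which slice of a packed parameter
    skip: bool = False                   # tensor is intentionally ignored


class ToyAttention:
    """Hypothetical layer whose checkpoint names differ from its parameters."""

    # Checkpoint suffix -> (local parameter name, shard id).
    _ALIASES = {
        "q_proj.weight": ("qkv_proj.weight", "q"),
        "k_proj.weight": ("qkv_proj.weight", "k"),
        "v_proj.weight": ("qkv_proj.weight", "v"),
    }

    def resolve(self, suffix: str) -> Optional[ResolvedWeightSketch]:
        if suffix == "rotary_emb.inv_freq":
            # Recomputed at runtime in this toy layer, so mark it as skipped.
            return ResolvedWeightSketch(skip=True)
        alias = self._ALIASES.get(suffix)
        if alias is None:
            return None  # no opinion: let the loader try its other paths
        param_name, shard_id = alias
        return ResolvedWeightSketch(param_name=param_name, shard_id=shard_id)
```

The layer only describes the destination. Copying the tensor stays with the loader, which keeps the mapping logic small and local to the code that owns the weight.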
diffulex.utils.loader
loader is responsible for reading .safetensors checkpoints, applying custom
weight resolvers, handling packed module mappings, and optionally loading LoRA
adapters.
| Symbol | How to use it | What it does |
|---|---|---|
| | Pass a LoRA adapter directory. | Reads |
| | Pass a model and LoRA config before loading weights. | Calls |
| | Use as the fallback parameter loader. | Copies the loaded tensor into |
| | Pass model, checkpoint tensor name, and config. | Walks modules from most specific prefix to root and asks |
| | Pass a | Applies transforms, custom loaders, parameter shards, buffers, or skip behavior. |
| | Pass model, tensor name, and loaded tensor. | Attempts direct parameter or buffer loading by exact name. |
| | Pass model, packed mapping, tensor name, loaded tensor, and config. | Handles packed projections such as merged QKV or model-family-specific aliases. |
| | Pass an initialized model and | Loads base |
| | Pass a LoRA-enabled model and adapter path. | Finds LoRA A/B tensors, handles TP sharding for supported layers, and optionally pre-merges adapters. |
The load order is deliberate. Custom resolvers get the first chance to map a checkpoint tensor, packed-module mappings run second, and exact parameter/buffer names are tried last. This makes unusual model-family layouts explicit without breaking the simple case.
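The three-stage order can be sketched with plain dictionaries. This is a simplified illustration, not the Diffulex loader: `load_one`, the shape of `packed_mapping`, and the use of lists in place of tensors are assumptions made to keep the example self-contained.

```python
from typing import Callable, Dict, List, Optional, Tuple


def load_one(
    name: str,
    tensor: object,
    params: Dict[str, object],
    resolvers: List[Callable[[str], Optional[str]]],
    packed_mapping: Dict[str, Tuple[str, str]],
) -> str:
    """Place one checkpoint tensor and report which stage handled it."""
    # Stage 1: custom resolvers get the first chance to claim the tensor.
    for resolve in resolvers:
        target = resolve(name)
        if target is not None:
            params[target] = tensor
            return "resolver"

    # Stage 2: packed-module mappings, e.g. q_proj -> shard "q" of qkv_proj.
    for source, (packed, shard_id) in packed_mapping.items():
        packed_name = name.replace(source, packed)
        if source in name and packed_name in params:
            params[packed_name][shard_id] = tensor
            return "packed"

    # Stage 3: exact parameter or buffer name.
    if name in params:
        params[name] = tensor
        return "exact"

    raise KeyError(f"no destination for checkpoint tensor {name!r}")
```

Because the exact-name lookup runs last, a model with an ordinary layout never needs a resolver or a packed mapping, while an unusual layout can override the default placement explicitly.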
diffulex.utils.output
output stores generation results and computes the metrics shown by offline
inference and benchmarks. It keeps both token-level trajectories and aggregate
throughput counters.
| Symbol | How to use it | What it does |
|---|---|---|
| | Pass a tokenizer and token IDs. | Decodes normally first, then falls back to token conversion for tokenizers with stricter decode signatures. |
| | Created for each scheduled engine step. | Records step time, prefill/decode mode, generated token count, running tokens, buffer block IDs, and optional block trace. |
| | One item per prompt. | Stores final token IDs, full response token IDs, truncation flags, completion reason, text, and per-step trajectory. |
| | Created by the engine for a batch. | Accumulates trajectories and exposes metrics such as TPF, TTFT, TPOT, throughput, prefill throughput, and decode throughput. |
| | Call after each engine step. | Updates batch counters and appends |
| | Pass the tokenizer after generation finishes. | Decodes truncated and full token responses into text. |
| | Call before returning benchmark-compatible data. | Produces |
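The decode-then-fall-back behavior in the first row might look roughly like the sketch below. The function name `decode_ids`, the choice of `TypeError` as the trigger, and the `StrictTokenizer` class are assumptions for illustration. `convert_ids_to_tokens` and `convert_tokens_to_string` are the usual Hugging Face tokenizer method names.

```python
from typing import List


def decode_ids(tokenizer, token_ids: List[int]) -> str:
    """Decode normally, falling back to token conversion on a signature error."""
    try:
        return tokenizer.decode(token_ids, skip_special_tokens=True)
    except TypeError:
        # Assumption: a stricter decode() rejects the keyword argument.
        tokens = tokenizer.convert_ids_to_tokens(token_ids)
        return tokenizer.convert_tokens_to_string(tokens)


class StrictTokenizer:
    """Hypothetical tokenizer whose decode() accepts no keyword arguments."""

    _vocab = {0: "hello", 1: " world"}

    def decode(self, token_ids):
        return "".join(self._vocab[i] for i in token_ids)

    def convert_ids_to_tokens(self, token_ids):
        return [self._vocab[i] for i in token_ids]

    def convert_tokens_to_string(self, tokens):
        return "".join(tokens)


print(decode_ids(StrictTokenizer(), [0, 1]))  # takes the fallback path
```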
Set DIFFULEX_SAVE_TRACE=0 when block-level traces are not needed. Leaving it
enabled records per-block status and mask ratios, which is useful for debugging
decoding behavior but adds more data to each trajectory.
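One plausible way to read such a flag is sketched below. Only the variable name DIFFULEX_SAVE_TRACE and the meaning of `0` come from the text above. The helper name, the default of enabled, and the extra falsy spellings are assumptions.

```python
import os
from typing import Mapping


def save_trace_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return False only when the flag is explicitly switched off."""
    # Assumption: tracing stays on unless the variable is set to a falsy value.
    raw = environ.get("DIFFULEX_SAVE_TRACE", "1")
    return raw.strip().lower() not in {"0", "false", "off"}


# Disable block traces for the current process before the engine starts.
os.environ["DIFFULEX_SAVE_TRACE"] = "0"
print(save_trace_enabled())  # False
```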
diffulex.utils.registry
registry contains small helpers used by the registry classes for readable
errors and diagnostics.
| Symbol | How to use it | What it does |
|---|---|---|
| | Pass a class, function, | Returns a stable module-qualified display name after unwrapping decorators and partials. |
Use this helper when a registry needs to describe which factory is currently bound without assuming the factory is a plain class.
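A minimal sketch of that unwrapping, using only the standard library, is shown here. The function name `display_name` and the exact output format are assumptions, and the real helper may handle more cases.

```python
import functools
import inspect


def display_name(factory) -> str:
    """Return a module-qualified name for a class, function, or partial."""
    target = factory
    while True:
        if isinstance(target, functools.partial):
            target = target.func  # peel off the partial application
            continue
        unwrapped = inspect.unwrap(target)  # follows __wrapped__ chains
        if unwrapped is target:
            break
        target = unwrapped

    module = getattr(target, "__module__", None)
    qualname = getattr(target, "__qualname__", None) or getattr(
        target, "__name__", None
    )
    if qualname is None:
        return repr(target)  # last resort for objects without a name
    return f"{module}.{qualname}" if module else qualname


print(display_name(functools.partial(int, base=2)))  # builtins.int
```

Resolving the name from the innermost callable means an error message names the factory that will actually run, not the wrapper around it.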
diffulex.utils.tokenizer
tokenizer wraps Hugging Face AutoTokenizer.from_pretrained with a fallback
for tokenizer configs that store extra_special_tokens as a list. Some
tokenizer versions expect a dict, so Diffulex coerces the list into stable
generated token names before retrying.
| Symbol | How to use it | What it does |
|---|---|---|
| | Use instead of calling | Loads the tokenizer, and if necessary retries with coerced |
The fallback only runs for the known extra_special_tokens shape error. Other
tokenizer loading failures are re-raised so configuration problems remain
visible.
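The retry pattern can be sketched independently of Hugging Face by injecting the loader as a callable. Everything named here is hypothetical: the generated key format, the way the shape error is recognized, and the `load_with_fallback` signature are assumptions, and `load` merely stands in for a call such as AutoTokenizer.from_pretrained.

```python
from typing import Any, Callable, Dict, List


def coerce_extra_special_tokens(tokens: List[str]) -> Dict[str, str]:
    """Turn a list of tokens into a dict with stable generated names."""
    # Assumption: index-based names, so the same list always maps the same way.
    return {f"extra_special_token_{i}": tok for i, tok in enumerate(tokens)}


def is_shape_error(exc: Exception) -> bool:
    """Assumed signature of the list-vs-dict failure."""
    return isinstance(exc, AttributeError) and "'list' object" in str(exc)


def load_with_fallback(
    load: Callable[..., Any], path: str, raw_tokens: List[str], **kwargs: Any
) -> Any:
    """Try a normal load, then retry once with coerced extra_special_tokens."""
    try:
        return load(path, **kwargs)
    except Exception as exc:
        if not is_shape_error(exc):
            raise  # unrelated failures stay visible to the caller
        coerced = coerce_extra_special_tokens(raw_tokens)
        return load(path, extra_special_tokens=coerced, **kwargs)
```

Keeping the error check narrow is the important design choice: a missing file or a bad model name propagates unchanged, and only the one recognized failure triggers a retry.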