# diffulex.utils

`diffulex.utils` contains shared helpers that do not belong to one model family, strategy, or serving path. The modules here are still part of the runtime path: checkpoint loading, tokenizer construction, and output accounting all flow through this package.

| Module | Main responsibility |
| --- | --- |
| `diffulex.utils.checkpoint` | Small dataclasses used by checkpoint weight resolution. |
| `diffulex.utils.loader` | Base-model and LoRA weight loading. |
| `diffulex.utils.output` | Generation trajectories, text conversion, and benchmark metrics. |
| `diffulex.utils.registry` | Display helpers for registry factories. |
| `diffulex.utils.tokenizer` | Robust Hugging Face tokenizer construction. |

## diffulex.utils.checkpoint

`checkpoint` defines the value objects used by model and layer code when a checkpoint tensor needs custom loading behavior.

| Symbol | How to use it | What it does |
| --- | --- | --- |
| `LoadContext` | Receive it inside a module's `resolve_checkpoint_weight` hook. | Carries the engine `Config` and the full checkpoint tensor name being resolved. |
| `ResolvedWeight` | Return it from `resolve_checkpoint_weight`. | Describes where a tensor should go: a parameter, a buffer, a custom loader, a transform, a shard id, or a skip marker. |

Use `ResolvedWeight` when checkpoint names do not map cleanly to PyTorch parameter names. It keeps family-specific mapping logic near the model or layer that owns the weight, while `loader` handles the actual copy.

## diffulex.utils.loader

`loader` is responsible for reading `.safetensors` checkpoints, applying custom weight resolvers, handling packed module mappings, and optionally loading LoRA adapters.

| Symbol | How to use it | What it does |
| --- | --- | --- |
| `load_lora_config` | Pass a LoRA adapter directory. | Reads `adapter_config.json` when present; otherwise returns an empty dict. |
| `enable_lora_for_model` | Pass a model and LoRA config before loading weights. | Calls `__init_lora__` on matching modules so LoRA tensors exist. |
| `default_weight_loader` | Use as the fallback parameter loader. | Copies the loaded tensor into `param.data`. |
| `resolve_weight_spec` | Pass model, checkpoint tensor name, and config. | Walks modules from most specific prefix to root and asks `resolve_checkpoint_weight` hooks for a `ResolvedWeight`. |
| `apply_resolved_weight` | Pass a `ResolvedWeight` and loaded tensor. | Applies transforms, custom loaders, parameter shards, buffers, or skip behavior. |
| `try_load_direct` | Pass model, tensor name, and loaded tensor. | Attempts direct parameter or buffer loading by exact name. |
| `try_load_via_packed_mapping` | Pass model, packed mapping, tensor name, loaded tensor, and config. | Handles packed projections such as merged QKV or model-family-specific aliases. |
| `load_model` | Pass an initialized model and `Config`. | Loads base `.safetensors` files, enables LoRA when requested, then loads LoRA weights. |
| `load_lora_weights` | Pass a LoRA-enabled model and adapter path. | Finds LoRA A/B tensors, handles TP sharding for supported layers, and optionally pre-merges adapters. |

The load order is deliberate. Custom resolvers get the first chance to map a checkpoint tensor, packed-module mappings run second, and exact parameter/buffer names are tried last. This makes unusual model-family layouts explicit without breaking the simple case.

## diffulex.utils.output

`output` stores generation results and computes the metrics shown by offline inference and benchmarks. It keeps both token-level trajectories and aggregate throughput counters.

| Symbol | How to use it | What it does |
| --- | --- | --- |
| `decode_token_ids_robust` | Pass a tokenizer and token IDs. | Decodes normally first, then falls back to token conversion for tokenizers with stricter decode signatures. |
| `ReqStep` | Created for each scheduled engine step. | Records step time, prefill/decode mode, generated token count, running tokens, buffer block IDs, and optional block trace. |
| `ReqTrajectory` | One item per prompt. | Stores final token IDs, full response token IDs, truncation flags, completion reason, text, and per-step trajectory. |
| `GenerationOutputs` | Created by the engine for a batch. | Accumulates trajectories and exposes metrics such as TPF, TTFT, TPOT, throughput, prefill throughput, and decode throughput. |
| `GenerationOutputs.record_step` | Call after each engine step. | Updates batch counters and appends `ReqStep` records for each active request. |
| `GenerationOutputs.convert_to_text` | Pass the tokenizer after generation finishes. | Decodes truncated and full token responses into text. |
| `GenerationOutputs.to_benchmark_format` | Call before returning benchmark-compatible data. | Produces `{text, full_text, token_ids, nfe}` dictionaries. |

Set `DIFFULEX_SAVE_TRACE=0` when block-level traces are not needed. Leaving it enabled records per-block status and mask ratios, which is useful for debugging decoding behavior but adds more data to each trajectory.

## diffulex.utils.registry

`registry` contains small helpers used by the registry classes for readable errors and diagnostics.

| Symbol | How to use it | What it does |
| --- | --- | --- |
| `fetch_factory_name` | Pass a class, function, `functools.partial`, or callable object. | Returns a stable module-qualified display name after unwrapping decorators and partials. |

Use this helper when a registry needs to describe which factory is currently bound without assuming the factory is a plain class.

## diffulex.utils.tokenizer

`tokenizer` wraps Hugging Face `AutoTokenizer.from_pretrained` with a fallback for tokenizer configs that store `extra_special_tokens` as a list. Some tokenizer versions expect a dict, so Diffulex coerces the list into stable generated token names before retrying.
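
The list-to-dict coercion described above can be illustrated with a minimal sketch. The helper name `coerce_extra_special_tokens` and the `extra_special_token_<index>` key scheme are assumptions made for illustration; the names Diffulex actually generates may differ.

```python
from typing import Dict, List, Union


def coerce_extra_special_tokens(
    value: Union[List[str], Dict[str, str]],
) -> Dict[str, str]:
    """Return a dict form of `extra_special_tokens` (hypothetical helper)."""
    if isinstance(value, dict):
        # Already in the expected shape: return a shallow copy unchanged.
        return dict(value)
    # Index-based keys are deterministic for a given list order, so the
    # same config always produces the same generated names.
    return {f"extra_special_token_{i}": tok for i, tok in enumerate(value)}


coerced = coerce_extra_special_tokens(["<|mask|>", "<|eot|>"])
```

Deriving the keys from list position keeps the result reproducible across runs, which is one plausible reading of "stable" here.
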

| Symbol | How to use it | What it does |
| --- | --- | --- |
| `auto_tokenizer_from_pretrained` | Use instead of calling `AutoTokenizer.from_pretrained` directly in Diffulex code. | Loads the tokenizer, and if necessary retries with coerced `extra_special_tokens`. |

The fallback only runs for the known `extra_special_tokens` shape error. Other tokenizer loading failures are re-raised so configuration problems remain visible.
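
The narrow retry behavior can be sketched without depending on `transformers`. Everything below is hypothetical: `load_with_fallback`, `is_known_shape_error`, the stand-in loaders, and the assumed error signature (an `AttributeError` mentioning a `list` object) are illustrative and are not taken from the Diffulex source.

```python
from typing import Any, Callable, List


def is_known_shape_error(exc: Exception) -> bool:
    # Assumption: a list-valued `extra_special_tokens` surfaces as an
    # AttributeError that mentions a 'list' object.
    return isinstance(exc, AttributeError) and "'list' object" in str(exc)


def load_with_fallback(
    load: Callable[..., Any], name: str, extra_tokens: List[str]
) -> Any:
    """Try `load(name)`; retry once with a dict only for the known error."""
    try:
        return load(name)
    except Exception as exc:
        if not is_known_shape_error(exc):
            raise  # unrelated failures stay visible to the caller
        coerced = {f"extra_special_token_{i}": t for i, t in enumerate(extra_tokens)}
        return load(name, extra_special_tokens=coerced)


def fake_load(name: str, extra_special_tokens: Any = None) -> dict:
    # Stand-in loader that fails until it receives a dict override.
    if extra_special_tokens is None:
        raise AttributeError("'list' object has no attribute 'keys'")
    return {"name": name, "extra_special_tokens": extra_special_tokens}


def broken_load(name: str, extra_special_tokens: Any = None) -> dict:
    # Stand-in loader with an unrelated failure that must not be retried.
    raise OSError("missing tokenizer files")


result = load_with_fallback(fake_load, "demo", ["<|mask|>"])
```

Checking the exception before retrying means a missing file or a bad model path still fails on the first attempt instead of being masked by the fallback.
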