# diffulex.engine `diffulex.engine` contains the core inference lifecycle: request creation, scheduling, KV cache management, model-runner execution, sampling, output recording, and worker cleanup. Decoding strategies plug into this package through registries rather than by changing the public engine API. | Module | Role | | --- | --- | | `diffulex.engine.dllm_block` | Block and block-buffer state used by diffusion decoding requests. | | `diffulex.engine.engine` | Main `DiffulexEngine` implementation and worker process entry point. | | `diffulex.engine.kv_cache_manager` | Base KV cache manager contract, page objects, and registry. | | `diffulex.engine.model_runner` | Base model runner contract and registry. | | `diffulex.engine.request` | Base request state and request registry. | | `diffulex.engine.scheduler` | Scheduler base classes, data-parallel wrapper, and scheduler registry. | | `diffulex.engine.status` | Request, block, and block-type enums. | | `diffulex.engine.strategy_registry` | Shared strategy registry implementation used by request, scheduler, cache, and runner registries. | ## diffulex.engine.dllm_block This module models diffusion decoding at block granularity. It tracks which tokens are masked, filled, semi-complete, or complete and stores per-block state needed by multi-block and edit-style strategies. | Symbol | Purpose | | --- | --- | | `DllmBlock` | Per-block token state, masks, counters, and status transitions. | | `DllmBlockBuffer` | Active block buffer used by multi-block request state. | Strategy templates build on these classes instead of duplicating block-state bookkeeping. ## diffulex.engine.engine This module contains `DiffulexEngine`, the in-process engine behind the public `diffulex.Diffulex` alias. It validates configuration, loads tokenizer metadata, spawns nonzero-rank model runners, constructs strategy components, submits requests, steps the scheduler/model/sampler loop, and records outputs. | Symbol | Purpose | | --- | --- | | `DiffulexEngine` | Main engine implementation for offline generation and lower-level step APIs. | | `_run_model_runner_worker` | Worker process entry point for nonzero ranks. | Most users should import `Diffulex` from the package root. Engine internals are useful when extending scheduling, worker lifecycle, or profiling behavior. ## diffulex.engine.kv_cache_manager This module defines how strategies allocate, append, release, and reuse KV cache pages. Concrete strategies register their cache manager implementations under a decoding-strategy key. | Symbol | Purpose | | --- | --- | | `Page` | KV cache page descriptor. | | `KVCacheManagerBase` | Abstract cache manager contract used by schedulers. | | `AutoKVCacheManager` | Strategy registry for concrete cache managers. | Cache manager changes should be paired with attention metadata checks because page allocation and kernel layout must agree. ## diffulex.engine.model_runner Model runners prepare tensors, initialize attention metadata, execute model forward passes, call samplers, and optionally capture CUDA graph paths. The base runner owns common model/sampler construction and worker lifecycle behavior. | Symbol | Purpose | | --- | --- | | `ModelRunnerBase` | Common runner functionality for model loading, sampler loading, execution setup, and worker control. | | `AutoModelRunner` | Strategy registry for concrete model runners. | New strategies usually subclass a strategy template model runner instead of subclassing `ModelRunnerBase` directly. ## diffulex.engine.request This module provides the base request object and request registry. A request tracks prompt tokens, generated tokens, sampling parameters, output state, and strategy-specific lifecycle fields supplied by mixins or templates. | Symbol | Purpose | | --- | --- | | `DllmReq` | Base request state used by scheduler and model runner code. | | `AutoReq` | Strategy registry for concrete request classes. | Strategy-specific request classes should keep only request-local decoding state here; scheduler policy belongs in scheduler classes. ## diffulex.engine.scheduler Schedulers decide which requests can prefill, decode, append blocks, finish, or abort during each engine step. The data-parallel wrapper coordinates multiple request-processing groups. | Symbol | Purpose | | --- | --- | | `SchedulerBase` | Abstract scheduler contract used by `DiffulexEngine`. | | `DataParallelScheduler` | Wrapper for data-parallel scheduling. | | `AutoScheduler` | Strategy registry for concrete schedulers. | Scheduler changes should preserve the contract between request state, cache manager decisions, and model runner tensor preparation. ## diffulex.engine.status This module contains status enums shared by requests and block buffers. | Symbol | Purpose | | --- | --- | | `DllmBlockType` | Identifies block categories. | | `DllmBlockStatus` | Tracks block lifecycle state. | | `DllmReqStatus` | Tracks request lifecycle state. | Use these enums instead of ad hoc strings in request or scheduler code. ## diffulex.engine.strategy_registry This module implements the registry pattern used by `AutoReq`, `AutoScheduler`, `AutoKVCacheManager`, and `AutoModelRunner`. | Symbol | Purpose | | --- | --- | | `DiffulexStrategyRegistry` | Base class for strategy-keyed registries with aliases, defaults, and factory lookup. | When adding a strategy, every registered component should use the same decoding strategy key so `Config.decoding_strategy` resolves a coherent component set.