diffulex.engine

diffulex.engine contains the core inference lifecycle: request creation, scheduling, KV cache management, model-runner execution, sampling, output recording, and worker cleanup. Decoding strategies plug into this package through registries rather than by changing the public engine API.

| Module | Role |
| --- | --- |
| diffulex.engine.dllm_block | Block and block-buffer state used by diffusion decoding requests. |
| diffulex.engine.engine | Main DiffulexEngine implementation and worker process entry point. |
| diffulex.engine.kv_cache_manager | Base KV cache manager contract, page objects, and registry. |
| diffulex.engine.model_runner | Base model runner contract and registry. |
| diffulex.engine.request | Base request state and request registry. |
| diffulex.engine.scheduler | Scheduler base classes, data-parallel wrapper, and scheduler registry. |
| diffulex.engine.status | Request, block, and block-type enums. |
| diffulex.engine.strategy_registry | Shared strategy registry implementation used by request, scheduler, cache, and runner registries. |

diffulex.engine.dllm_block

This module models diffusion decoding at block granularity. It tracks which tokens are masked, filled, semi-complete, or complete and stores per-block state needed by multi-block and edit-style strategies.

| Symbol | Purpose |
| --- | --- |
| DllmBlock | Per-block token state, masks, counters, and status transitions. |
| DllmBlockBuffer | Active block buffer used by multi-block request state. |

Strategy templates build on these classes instead of duplicating block-state bookkeeping.
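
The kind of bookkeeping described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `ToyBlock`, `ToyBlockStatus`, and the `MASK` placeholder are hypothetical names, and the real DllmBlock fields, statuses, and transitions may differ.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class ToyBlockStatus(Enum):
    # Hypothetical statuses; the real values live in diffulex.engine.status.
    MASKED = auto()
    SEMI_COMPLETE = auto()
    COMPLETE = auto()


MASK = -1  # hypothetical placeholder for a masked position


@dataclass
class ToyBlock:
    """Illustrative stand-in for per-block token state (not the real DllmBlock)."""

    size: int
    tokens: list[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every position starts masked.
        self.tokens = [MASK] * self.size

    @property
    def num_masked(self) -> int:
        return sum(1 for t in self.tokens if t == MASK)

    @property
    def status(self) -> ToyBlockStatus:
        # Status is derived from the mask counter rather than stored separately.
        if self.num_masked == self.size:
            return ToyBlockStatus.MASKED
        if self.num_masked > 0:
            return ToyBlockStatus.SEMI_COMPLETE
        return ToyBlockStatus.COMPLETE

    def fill(self, position: int, token_id: int) -> None:
        # Unmask one position with a decoded token.
        self.tokens[position] = token_id


block = ToyBlock(size=4)
block.fill(0, 101)
block.fill(2, 103)
print(block.status, block.num_masked)
```

Deriving the status from the token list keeps the counter and the status from drifting apart, which is one reason to centralize this state in a single class.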

diffulex.engine.engine

This module contains DiffulexEngine, the in-process engine behind the public diffulex.Diffulex alias. It validates configuration, loads tokenizer metadata, spawns worker processes for the nonzero-rank model runners, constructs strategy components, submits requests, steps the scheduler/model/sampler loop, and records outputs.

| Symbol | Purpose |
| --- | --- |
| DiffulexEngine | Main engine implementation for offline generation and lower-level step APIs. |
| _run_model_runner_worker | Worker process entry point for nonzero ranks. |

Most users should import Diffulex from the package root. Engine internals are useful when extending scheduling, worker lifecycle, or profiling behavior.
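
The submit, step, and record flow can be sketched with a toy loop. This is not the DiffulexEngine API: `ToyEngine`, `add_request`, `step`, and `generate` are hypothetical names chosen for illustration, and the "model" is faked by counting steps.

```python
from collections import deque


class ToyEngine:
    """Hypothetical stand-in showing the submit/step/record shape only."""

    def __init__(self, steps_per_request: int = 2) -> None:
        self.waiting = deque()
        self.outputs = {}
        self.steps_per_request = steps_per_request

    def add_request(self, req_id: str, prompt: str) -> None:
        # Submission only queues work; nothing runs until step() is called.
        self.waiting.append({"id": req_id, "prompt": prompt, "steps": 0})

    def step(self) -> None:
        # 1. "Schedule": pick the next request (a real scheduler batches many).
        req = self.waiting.popleft()
        # 2. "Model + sampler": faked here by counting one decoding step.
        req["steps"] += 1
        # 3. Record the output when finished, otherwise requeue the request.
        if req["steps"] >= self.steps_per_request:
            self.outputs[req["id"]] = f"{req['prompt']} <done after {req['steps']} steps>"
        else:
            self.waiting.append(req)

    def generate(self, prompts: list[str]) -> list[str]:
        # Offline generation is just "submit everything, then step until idle".
        for i, prompt in enumerate(prompts):
            self.add_request(f"req-{i}", prompt)
        while self.waiting:
            self.step()
        return [self.outputs[f"req-{i}"] for i in range(len(prompts))]


results = ToyEngine().generate(["hello", "world"])
```

Keeping `step` separate from `generate` is what makes a lower-level step API possible: a caller can interleave its own logic, such as profiling, between iterations.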

diffulex.engine.kv_cache_manager

This module defines how strategies allocate, append, release, and reuse KV cache pages. Concrete strategies register their cache manager implementations under a decoding-strategy key.

| Symbol | Purpose |
| --- | --- |
| Page | KV cache page descriptor. |
| KVCacheManagerBase | Abstract cache manager contract used by schedulers. |
| AutoKVCacheManager | Strategy registry for concrete cache managers. |

Cache manager changes should be paired with attention metadata checks because page allocation and kernel layout must agree.
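
To make the allocate, append, release, and reuse cycle concrete, here is a minimal free-list sketch. It does not reproduce the KVCacheManagerBase contract: `ToyPage`, `ToyPageAllocator`, and all of their fields and methods are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ToyPage:
    # Hypothetical page descriptor; the real Page fields may differ.
    page_id: int
    ref_count: int = 0


class ToyPageAllocator:
    """Illustrative free-list allocator, not the KVCacheManagerBase contract."""

    def __init__(self, num_pages: int, page_size: int) -> None:
        self.page_size = page_size
        self.pages = [ToyPage(i) for i in range(num_pages)]
        self.free_ids = list(range(num_pages))
        self.page_tables: dict[str, list[int]] = {}

    def pages_needed(self, num_tokens: int) -> int:
        # Ceiling division: a partially filled page still occupies a full page.
        return -(-num_tokens // self.page_size)

    def allocate(self, req_id: str, num_tokens: int) -> list[int]:
        # Covers both the first allocation and later appends for the same request.
        table = self.page_tables.get(req_id, [])
        missing = self.pages_needed(num_tokens) - len(table)
        if missing > len(self.free_ids):
            raise MemoryError("not enough free KV cache pages")
        for _ in range(max(missing, 0)):
            page_id = self.free_ids.pop()
            self.pages[page_id].ref_count = 1
            table.append(page_id)
        self.page_tables[req_id] = table
        return table

    def release(self, req_id: str) -> None:
        # Return every page owned by the request to the free list for reuse.
        for page_id in self.page_tables.pop(req_id, []):
            self.pages[page_id].ref_count = 0
            self.free_ids.append(page_id)


alloc = ToyPageAllocator(num_pages=4, page_size=16)
alloc.allocate("req-0", num_tokens=20)  # 20 tokens need 2 pages of 16
alloc.allocate("req-0", num_tokens=40)  # growing to 40 tokens adds a third page
alloc.release("req-0")
```

In this sketch the per-request page table is the only thing an attention kernel would need to locate cached tokens, so any change to `page_size` or to the order of page IDs would have to be mirrored wherever that table is consumed.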

diffulex.engine.model_runner

Model runners prepare tensors, initialize attention metadata, execute model forward passes, call samplers, and optionally capture CUDA graph paths. The base runner owns common model/sampler construction and worker lifecycle behavior.

| Symbol | Purpose |
| --- | --- |
| ModelRunnerBase | Common runner functionality for model loading, sampler loading, execution setup, and worker control. |
| AutoModelRunner | Strategy registry for concrete model runners. |

New strategies usually subclass a strategy template model runner instead of subclassing ModelRunnerBase directly.

diffulex.engine.request

This module provides the base request object and request registry. A request tracks prompt tokens, generated tokens, sampling parameters, output state, and strategy-specific lifecycle fields supplied by mixins or templates.

| Symbol | Purpose |
| --- | --- |
| DllmReq | Base request state used by scheduler and model runner code. |
| AutoReq | Strategy registry for concrete request classes. |

Strategy-specific request classes should keep only request-local decoding state here; scheduler policy belongs in scheduler classes.

diffulex.engine.scheduler

Schedulers decide which requests can prefill, decode, append blocks, finish, or abort during each engine step. The data-parallel wrapper coordinates multiple request-processing groups.

| Symbol | Purpose |
| --- | --- |
| SchedulerBase | Abstract scheduler contract used by DiffulexEngine. |
| DataParallelScheduler | Wrapper for data-parallel scheduling. |
| AutoScheduler | Strategy registry for concrete schedulers. |

Scheduler changes should preserve the contract between request state, cache manager decisions, and model runner tensor preparation.
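
A single scheduling decision can be sketched as a pure function over request state and cache capacity. The policy below is an assumption made for illustration, not the SchedulerBase behavior: `ToyReq`, `ToyReqStatus`, and `schedule_step` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum, auto


class ToyReqStatus(Enum):
    # Hypothetical statuses; the real ones live in diffulex.engine.status.
    WAITING = auto()
    RUNNING = auto()
    FINISHED = auto()


@dataclass
class ToyReq:
    req_id: str
    prompt_len: int
    status: ToyReqStatus = ToyReqStatus.WAITING


def schedule_step(reqs: list[ToyReq], free_pages: int, page_size: int):
    """Hypothetical policy: running requests decode; waiting ones prefill if pages fit."""
    prefill, decode = [], []
    for req in reqs:
        if req.status is ToyReqStatus.RUNNING:
            decode.append(req.req_id)
        elif req.status is ToyReqStatus.WAITING:
            needed = -(-req.prompt_len // page_size)  # ceiling division
            if needed <= free_pages:
                # Admission consumes cache capacity and changes request state together.
                free_pages -= needed
                req.status = ToyReqStatus.RUNNING
                prefill.append(req.req_id)
    return prefill, decode


reqs = [
    ToyReq("a", prompt_len=30),
    ToyReq("b", prompt_len=100),
    ToyReq("c", prompt_len=10, status=ToyReqStatus.RUNNING),
]
prefill, decode = schedule_step(reqs, free_pages=3, page_size=16)
```

Here request "a" is admitted because its two pages fit, "b" stays waiting because it would need seven, and "c" continues decoding. The sketch shows why the three pieces are coupled: the admission check reads cache capacity, updates request status, and determines which batches a runner would have to prepare.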

diffulex.engine.status

This module contains status enums shared by requests and block buffers.

| Symbol | Purpose |
| --- | --- |
| DllmBlockType | Identifies block categories. |
| DllmBlockStatus | Tracks block lifecycle state. |
| DllmReqStatus | Tracks request lifecycle state. |

Use these enums instead of ad hoc strings in request or scheduler code.
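
The benefit over ad hoc strings is easiest to see in a small example. The enum below is a hypothetical stand-in; the member names of DllmReqStatus are not listed in this reference and may differ.

```python
from enum import Enum, auto


class ToyReqStatus(Enum):
    # Hypothetical members; check diffulex.engine.status for the real ones.
    WAITING = auto()
    RUNNING = auto()
    FINISHED = auto()


def is_done(status: ToyReqStatus) -> bool:
    # Identity comparison against an enum member. A typo such as
    # ToyReqStatus.FINSHED raises AttributeError immediately, whereas the
    # string comparison status == "finshed" would silently return False.
    return status is ToyReqStatus.FINISHED


def is_done_stringly(status: str) -> bool:
    # Anti-pattern kept for contrast: the misspelling goes unnoticed.
    return status == "finshed"
```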

diffulex.engine.strategy_registry

This module implements the registry pattern used by AutoReq, AutoScheduler, AutoKVCacheManager, and AutoModelRunner.

| Symbol | Purpose |
| --- | --- |
| DiffulexStrategyRegistry | Base class for strategy-keyed registries with aliases, defaults, and factory lookup. |

When adding a strategy, every registered component should use the same decoding strategy key so Config.decoding_strategy resolves a coherent component set.
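
The registry pattern with aliases, a default key, and factory lookup can be sketched as follows. This is a generic illustration under stated assumptions, not the DiffulexStrategyRegistry API: `ToyStrategyRegistry`, its `register`, `resolve`, and `create` methods, and the `"toy_strategy"` key are all hypothetical.

```python
from typing import Callable, Dict, Optional


class ToyStrategyRegistry:
    """Hypothetical strategy-keyed registry; not the DiffulexStrategyRegistry API."""

    def __init__(self, default: Optional[str] = None) -> None:
        self._factories: Dict[str, Callable[..., object]] = {}
        self._aliases: Dict[str, str] = {}
        self._default = default

    def register(self, key: str, *, aliases: tuple = ()):
        # Used as a class decorator so registration sits next to the definition.
        def decorator(cls):
            self._factories[key] = cls
            for alias in aliases:
                self._aliases[alias] = key
            return cls

        return decorator

    def resolve(self, key: Optional[str]) -> str:
        # Fall back to the default, then map an alias to its canonical key.
        key = key or self._default
        key = self._aliases.get(key, key)
        if key not in self._factories:
            raise KeyError(f"unknown decoding strategy: {key!r}")
        return key

    def create(self, key: Optional[str] = None, **kwargs):
        return self._factories[self.resolve(key)](**kwargs)


# One registry per component kind, all keyed by the same strategy name.
toy_schedulers = ToyStrategyRegistry(default="toy_strategy")
toy_runners = ToyStrategyRegistry(default="toy_strategy")


@toy_schedulers.register("toy_strategy", aliases=("toy",))
class ToyScheduler:
    pass


@toy_runners.register("toy_strategy", aliases=("toy",))
class ToyRunner:
    pass


scheduler = toy_schedulers.create("toy")  # alias resolves to "toy_strategy"
runner = toy_runners.create()  # falls back to the default key
```

Because both registries use the same key, a single strategy name selects a matching scheduler and runner. If one component were registered under a different key, the lookup for that component would raise `KeyError` in this sketch instead of silently mixing strategies.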