# diffulex.strategy.templates

`diffulex.strategy.templates` contains reusable strategy building blocks. A concrete strategy package usually subclasses or directly reuses one of these templates, then registers request, scheduler, KV cache manager, model runner, and attention metadata classes under a strategy name.

The templates are organized by decoding layout:

| Template package | Use it when | Core classes |
| --- | --- | --- |
| `diffulex.strategy.templates.dual_cache` | A strategy needs the multi-block lifecycle but wants a separate cache-layout variant. | Dual-cache extension points. |
| `diffulex.mixin.multi_block` | A strategy decodes with a rolling buffer of fixed-size diffusion blocks. | Multi-block request, scheduler, KV cache, runner, attention metadata, and full-static runner templates. |
| `diffulex.strategy.templates.token_merge` | A strategy is multi-block and also passes token-merge distributions into attention. | Token-merge request, scheduler, KV cache, runner, and attention metadata templates. |

For extension work, start from the smallest template that already matches the decoding lifecycle. Most new block-diffusion strategies should begin with the multi-block template; only use token-merge templates when attention truly needs per-token merge descriptors.

## dual_cache

The dual-cache template package is currently a set of named extension points for future strategies that need separate cache views or cache lifecycles. The planned Dual Cache mechanism is tracked separately from the standard multi-block runtime.

## multi_block

The multi-block template represents each request as a chain of `DllmBlock` objects, keeps a rolling block buffer, schedules prefill/decode work against the KV cache, and prepares attention metadata for paged attention.

The request template owns most lifecycle semantics: waiting, prefilling, decoding, completed, preempted, EOS, `max_tokens`, `max_model_len`, `max_nfe`, and `max_repetition_run`.
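
As a rough illustration of the lifecycle items listed above, the sketch below models the states as an enum and the limits as a stop-condition check. Every name in it (`RequestStatus`, `StopLimits`, `longest_trailing_run`, `stop_reason`) is hypothetical, and so is the reading of each limit: it assumes `max_nfe` caps the number of forward evaluations and `max_repetition_run` caps a run of identical trailing tokens. It is not the request template's actual API.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Sequence


class RequestStatus(Enum):
    """Hypothetical status names mirroring the lifecycle states named in the text."""
    WAITING = auto()
    PREFILLING = auto()
    DECODING = auto()
    COMPLETED = auto()
    PREEMPTED = auto()


@dataclass
class StopLimits:
    """Hypothetical container for per-request limits (semantics are assumed)."""
    eos_token_id: int
    max_tokens: int          # assumed: cap on generated tokens
    max_model_len: int       # assumed: cap on prompt + generated tokens
    max_nfe: int             # assumed: cap on number of forward evaluations
    max_repetition_run: int  # assumed: cap on identical trailing tokens


def longest_trailing_run(tokens: Sequence[int]) -> int:
    """Return the length of the run of identical tokens at the end of `tokens`."""
    if not tokens:
        return 0
    last = tokens[-1]
    run = 0
    for tok in reversed(tokens):
        if tok != last:
            break
        run += 1
    return run


def stop_reason(prompt_len: int, generated: Sequence[int], nfe: int,
                limits: StopLimits) -> Optional[str]:
    """Return the first limit that is hit, or None if decoding may continue."""
    if limits.eos_token_id in generated:
        return "eos"
    if len(generated) >= limits.max_tokens:
        return "max_tokens"
    if prompt_len + len(generated) >= limits.max_model_len:
        return "max_model_len"
    if nfe >= limits.max_nfe:
        return "max_nfe"
    if longest_trailing_run(generated) >= limits.max_repetition_run:
        return "max_repetition_run"
    return None
```

In a layout like this, a request would move to `RequestStatus.COMPLETED` once `stop_reason` returns a non-`None` value; the order in which the limits are checked is also an assumption.
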
The scheduler template remains narrow: it chooses which requests run this step, asks the cache manager whether pages are available, and applies sampler writes.

## token_merge

Token-merge templates extend multi-block decoding with per-position token-merge metadata. The DMax strategy uses this family to pass top-k token IDs, top-k probabilities, residual probabilities, merge mode, weight, and renormalization settings into attention metadata.
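
To make the per-position metadata more concrete, here is a minimal sketch of how such a descriptor could be assembled from a single position's probability distribution. The `TokenMergeDescriptor` class, its field names, the `build_descriptor` helper, the `"placeholder"` merge mode, and the choice to zero the residual after renormalization are all assumptions for illustration; the actual DMax attention metadata may be laid out quite differently (for example as batched tensors).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class TokenMergeDescriptor:
    """Hypothetical per-position descriptor; field names are illustrative only."""
    topk_ids: List[int]
    topk_probs: List[float]
    residual_prob: float
    merge_mode: str
    merge_weight: float
    renormalize: bool


def build_descriptor(probs: Sequence[float], k: int,
                     merge_mode: str = "placeholder",
                     merge_weight: float = 1.0,
                     renormalize: bool = False) -> TokenMergeDescriptor:
    """Keep the k most probable token IDs and record the leftover probability mass."""
    if not 0 < k <= len(probs):
        raise ValueError("k must be between 1 and the vocabulary size")
    # Rank token IDs by probability, highest first, and keep the top k.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    topk_probs = [probs[i] for i in ranked]
    mass = sum(topk_probs)
    # Residual is the mass not covered by the top-k entries.
    residual = max(0.0, 1.0 - mass)
    if renormalize and mass > 0:
        # Assumed behavior: rescale the top-k entries to sum to 1 and drop the residual.
        topk_probs = [p / mass for p in topk_probs]
        residual = 0.0
    return TokenMergeDescriptor(
        topk_ids=ranked,
        topk_probs=topk_probs,
        residual_prob=residual,
        merge_mode=merge_mode,
        merge_weight=merge_weight,
        renormalize=renormalize,
    )
```

For a toy distribution `[0.1, 0.6, 0.3]` with `k=2`, the helper keeps token IDs 1 and 2 and records roughly 0.1 as the residual probability.
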