# Decoding Strategies

Diffulex selects strategy-specific request, scheduler, KV cache manager, model runner, and attention metadata components through registries. The strategy is chosen by `decoding_strategy`.

## Decoding Strategy

Set `decoding_strategy` to one of `d2f`, `multi_bd`, `fast_dllm_v2`, `dmax`, or `diffusion_gemma`. Benchmark config input also normalizes older aliases `multi_block_diffusion`, `block_diffusion`, and `fast_dllm` to `multi_bd`.

The choice changes more than the sampler name:

| Strategy | Behavior |
| --- | --- |
| `d2f` | Forces full-prefix multi-block behavior and disables prefix caching. |
| `multi_bd` | Implements Multi-Block Diffusion (MultiBD): a bounded active block set with block-causal visibility and prefix caching enabled when compatible. |
| `fast_dllm_v2` | Implements Fast-dLLM-v2 dual-cache decoding: 3-mode FSM (full-buffer init, sub-block refine, final commit) with per-mode CUDA graphs. |
| `dmax` | Enables DMax-style token merging on supported edit-sampling models. |
| `diffusion_gemma` | Uses the native DiffusionGemma canvas/block runtime. |

## Decoding Thresholds

Thresholds tune when a strategy adds, releases, accepts, edits, or remasks tokens and blocks.

| Key | How to set it | What it does |
| --- | --- | --- |
| `add_block_threshold` | Start from the default `0.1`; tune as a float for block-add behavior. | Controls when another decoding block can be added. |
| `semi_complete_threshold` | Start from the default `0.9`; tune as a float for block advancement. | Controls when semi-complete block state can advance. |
| `accept_threshold` | Use a confidence value from `0` to `1`. The default is `0.9`. | Accepts mask-to-token updates once confidence is high enough. |
| `edit_threshold` | Use a confidence value from `0` to `1`. The default is `0.0`. | Accepts token-to-token edits in edit-style decoding. |
| `remask_threshold` | Use a confidence value from `0` to `1`. The default is `0.4`. | Remasks filled tokens that fall below the confidence threshold. |
| `token_stability_threshold` | Use a stability ratio from `0` to `1`. The default is `0.0`. | Controls DMax-style edit-block progress. |

Keep thresholds in YAML when comparing experiments. Use CLI overrides for short ad hoc runs.

## Sampling Mode

Set `sampling_mode` to `naive` for the standard sampler path or `edit` for edit-style decoding.

`sampling_mode="edit"` is restricted to edit-sampling model names:

| Compatible `model_name` |
| --- |
| `llada2` |
| `llada2_moe` |
| `llada2_mini` |
| `llada2dot1_mini` |
| `llada2_mini_dmax` |

`decoding_strategy="dmax"` requires `sampling_mode="edit"` and one of the compatible model names.

## Related Arguments

| Surface | Names | Notes |
| --- | --- | --- |
| Python/config | `decoding_strategy`, `sampling_mode`, `decoding_thresholds` | Use these in `Config`, YAML, or Python construction. |
| CLI | `--decoding-strategy`, `--sampling-mode`, `--add-block-threshold`, `--semi-complete-threshold`, `--accept-threshold`, `--edit-threshold`, `--remask-threshold`, `--token-stability-threshold` | Use CLI overrides for short experiment runs. |
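
The compatibility rules and defaults described on this page can be checked before launching a run. The sketch below is a standalone illustration and not part of Diffulex: `normalize_strategy` and `check_settings` are hypothetical helper names, and the constants are copied from the tables on this page. It assumes that alias normalization, which this page documents only for benchmark config input, is acceptable for your use, and it range-checks only the four keys documented as values from `0` to `1`.

```python
from typing import Dict, Optional

# Values copied from the tables on this page. This helper is hypothetical
# and does not call any Diffulex API.
STRATEGIES = {"d2f", "multi_bd", "fast_dllm_v2", "dmax", "diffusion_gemma"}

# Older aliases that benchmark config input maps to multi_bd.
ALIASES = {
    "multi_block_diffusion": "multi_bd",
    "block_diffusion": "multi_bd",
    "fast_dllm": "multi_bd",
}

# Model names that may use sampling_mode="edit".
EDIT_MODELS = {
    "llada2", "llada2_moe", "llada2_mini", "llada2dot1_mini", "llada2_mini_dmax",
}

DEFAULT_THRESHOLDS = {
    "add_block_threshold": 0.1,
    "semi_complete_threshold": 0.9,
    "accept_threshold": 0.9,
    "edit_threshold": 0.0,
    "remask_threshold": 0.4,
    "token_stability_threshold": 0.0,
}

# Keys documented as confidence values or ratios from 0 to 1.
UNIT_INTERVAL_KEYS = {
    "accept_threshold", "edit_threshold",
    "remask_threshold", "token_stability_threshold",
}


def normalize_strategy(name: str) -> str:
    """Map an older alias to its current name and reject unknown strategies."""
    name = ALIASES.get(name, name)
    if name not in STRATEGIES:
        raise ValueError(f"unknown decoding_strategy: {name!r}")
    return name


def check_settings(
    model_name: str,
    decoding_strategy: str,
    sampling_mode: str = "naive",
    thresholds: Optional[Dict[str, float]] = None,
) -> dict:
    """Validate one combination and return it with defaults filled in."""
    strategy = normalize_strategy(decoding_strategy)
    if sampling_mode not in ("naive", "edit"):
        raise ValueError(f"unknown sampling_mode: {sampling_mode!r}")
    if sampling_mode == "edit" and model_name not in EDIT_MODELS:
        raise ValueError(f"sampling_mode='edit' is not available for {model_name!r}")
    if strategy == "dmax" and sampling_mode != "edit":
        raise ValueError("decoding_strategy='dmax' requires sampling_mode='edit'")

    merged = dict(DEFAULT_THRESHOLDS)
    for key, value in (thresholds or {}).items():
        if key not in DEFAULT_THRESHOLDS:
            raise ValueError(f"unknown threshold key: {key!r}")
        value = float(value)
        if key in UNIT_INTERVAL_KEYS and not 0.0 <= value <= 1.0:
            raise ValueError(f"{key} must be between 0 and 1, got {value}")
        merged[key] = value

    return {
        "model_name": model_name,
        "decoding_strategy": strategy,
        "sampling_mode": sampling_mode,
        "decoding_thresholds": merged,
    }


# Example: a DMax run that overrides a single threshold.
settings = check_settings(
    "llada2_mini_dmax", "dmax", "edit", {"token_stability_threshold": 0.5}
)
```

The returned dictionary uses the Python/config names from the Related Arguments table, so it could serve as a starting point for a YAML file or for keyword arguments, subject to how `Config` actually accepts them.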