Installation

Diffulex is installed from source. The supported production path today is Linux with NVIDIA CUDA GPUs and local Hugging Face-style checkpoint directories.

Environment Requirements

Component	Requirement
OS	Linux. Other platforms are not a supported runtime target.
Python	Python 3.11 or newer.
GPU	NVIDIA CUDA GPU visible to PyTorch. H200, H100, A100, RTX 4090, and RTX 3090 have been used in development.
Checkpoints	Local checkpoint directories. Diffulex examples do not download model weights at runtime.
vLLM	vllm==0.23.0 is the tested version for optimized layer/MoE backends and vLLM baseline presets.

If PyTorch or vLLM cannot see the GPU, fix that environment first. Diffulex cannot recover from a broken CUDA/PyTorch installation.
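The OS and Python rows of the table above can be checked before anything is installed. The following is a minimal sketch rather than part of Diffulex: the helper name `preflight_problems` is hypothetical, and it deliberately skips the GPU row, because that check needs a working PyTorch install.

```python
import platform
import sys


def preflight_problems(system, python_version):
    """Return a list of human-readable problems; an empty list means OK.

    `system` is a platform.system() string and `python_version` is a
    (major, minor) tuple. Hypothetical helper, not part of Diffulex.
    """
    problems = []
    if system != "Linux":
        problems.append(f"unsupported OS: {system} (Linux is required)")
    if tuple(python_version[:2]) < (3, 11):
        found = ".".join(str(part) for part in python_version[:2])
        problems.append(f"Python {found} is too old (3.11 or newer is required)")
    return problems


if __name__ == "__main__":
    issues = preflight_problems(platform.system(), sys.version_info[:2])
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("platform ok")
```

Taking the OS name and version as arguments, instead of reading them inside the function, keeps the check easy to exercise on any machine.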

Create the Environment

From the repository root:

uv venv --python 3.11 --seed
source .venv/bin/activate
uv pip install -e .

Install the tested vLLM build in the same environment if you plan to use the optimized vLLM layer backends, MoE kernels, or the vLLM baseline scripts:

uv pip install vllm==0.23.0

On clusters with strict CUDA wheel requirements, install a PyTorch build that matches the driver and CUDA runtime the cluster provides. Diffulex currently validates its vLLM-backed paths against vllm==0.23.0; use other vLLM versions only when you are intentionally testing compatibility. Keep Diffulex, PyTorch, and vLLM in the same Python environment unless you are only using the external vLLM baseline launcher.
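One way to confirm that the active environment carries the tested vLLM build is to read the installed package metadata, which avoids importing vLLM itself. This is a sketch under stated assumptions: the helper names are hypothetical, and ignoring a local build suffix such as "+cu128" when comparing versions is a choice made here, not Diffulex policy.

```python
from importlib import metadata
from typing import Optional

TESTED_VLLM = "0.23.0"  # the version this page states is tested


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def vllm_pin_status(installed: Optional[str], tested: str = TESTED_VLLM) -> str:
    """Classify an installed vLLM version string against the tested pin."""
    if installed is None:
        return "missing"
    # Assumption: a local build suffix ("+...") does not change the verdict.
    public_version = installed.split("+")[0]
    return "tested" if public_version == tested else "untested"


if __name__ == "__main__":
    found = installed_version("vllm")
    print(f"vllm: {found} ({vllm_pin_status(found)})")
```

A result of "untested" is not an error in itself; it only signals that the environment falls outside the combination this page describes as validated.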

Verify the Environment

Check that the package imports:

python -c "from diffulex import Diffulex, SamplingParams; print('diffulex ok')"

Check CUDA visibility:

python -c "import torch; print(torch.cuda.is_available(), torch.cuda.device_count())"

If vLLM-backed paths will be used, check vLLM separately:

python -c "import vllm; print(vllm.__version__)"
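The three one-liners above stop at the first failure. When it is more convenient to see every result at once, they can be folded into a small script. The sketch below is illustrative only: `probe` is a hypothetical helper, and the script assumes nothing about Diffulex beyond the package name `diffulex` used above.

```python
import importlib


def probe(module_name):
    """Try to import a module; return (ok, detail) instead of raising."""
    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # ImportError, or any error raised during import
        return False, f"{type(exc).__name__}: {exc}"
    return True, str(getattr(module, "__version__", "version unknown"))


if __name__ == "__main__":
    for name in ("diffulex", "torch", "vllm"):
        ok, detail = probe(name)
        print(f"{name:10s} {'ok' if ok else 'FAILED'}  {detail}")

    # Only query CUDA when PyTorch itself imported cleanly.
    torch_ok, _ = probe("torch")
    if torch_ok:
        import torch

        print("cuda available:", torch.cuda.is_available(),
              "devices:", torch.cuda.device_count())
```

A FAILED line for vllm matters only if the vLLM-backed paths will be used.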

Model Paths

Most configs in this repository use paths from the development cluster, for example /data/ckpts/inclusionAI/LLaDA2.0-mini. Replace those paths with the local checkpoint directory in your environment.

For one-off runs, prefer environment-variable overrides on the command line:

MODEL_PATH=/path/to/LLaDA2.0-mini \
DATASET_LIMIT=10 \
CUDA_VISIBLE_DEVICES=0 \
script/run_llada2_mini_gsm8k.sh

For repeated runs, edit or copy the YAML config under diffulex_bench/configs/.
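Because checkpoints are never downloaded at runtime, a mistyped path only surfaces once a run starts. A quick check of the directory beforehand can catch that earlier. The sketch below assumes the usual Hugging Face layout (a config.json plus *.safetensors or *.bin weight files); `checkpoint_problems` is a hypothetical helper, and Diffulex itself may require additional files.

```python
import os
from pathlib import Path


def checkpoint_problems(model_path):
    """Return problems found with a local checkpoint directory.

    Assumes the Hugging Face layout: config.json plus weight files
    (*.safetensors or *.bin). Hypothetical helper, not part of Diffulex.
    """
    root = Path(model_path)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json is missing")
    has_weights = any(root.glob("*.safetensors")) or any(root.glob("*.bin"))
    if not has_weights:
        problems.append("no *.safetensors or *.bin weight files found")
    return problems


if __name__ == "__main__":
    # MODEL_PATH is the same variable the one-off command above overrides.
    path = os.environ.get("MODEL_PATH", "/path/to/LLaDA2.0-mini")
    issues = checkpoint_problems(path)
    for issue in issues:
        print("problem:", issue)
    if not issues:
        print("checkpoint directory looks plausible:", path)
```

An empty result means only that the directory looks plausible; it does not verify that the weights load.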

Optional vLLM Baseline Environment

The DiffusionGemma vLLM baseline launcher can use a separate editable vLLM environment. By default it looks under /data/jyj/vllm-env; set VLLM_ENV_DIR when your vLLM checkout lives elsewhere:

VLLM_ENV_DIR=/path/to/vllm-env \
CUDA_VISIBLE_DEVICES=0 \
script/run_vllm_diffusion_gemma_gsm8k.sh

The vLLM install used by that script must support DiffusionGemmaForBlockDiffusion. The pinned Diffulex environment uses vllm==0.23.0; a separate editable vLLM checkout is only needed when you are testing unreleased baseline behavior.

Build the Documentation

Install documentation dependencies, then build the Sphinx site:

uv pip install -r docs/requirements.txt
python -m sphinx -b html docs docs/_build/html

The generated site is written to docs/_build/html.