diffulex.layer

diffulex.layer contains reusable neural-network layers and backend adapters used by model implementations. The package keeps tensor-parallel layout, optional LoRA handling, rotary embeddings, activation fusion, and vLLM-backed fallbacks outside individual model files.

| Module | Role |
| --- | --- |
| diffulex.layer.activation | Fused gated activations with native and optional vLLM-backed paths. |
| diffulex.layer.embed_head | Tensor-parallel vocabulary embeddings and LM heads. |
| diffulex.layer.layernorm | RMSNorm and fused add-RMSNorm wrappers. |
| diffulex.layer.linear | Replicated, column-parallel, row-parallel, QKV, and merged linear layers with LoRA support. |
| diffulex.layer.rotary_embedding | Rotary embedding construction and application helpers. |
| diffulex.layer.vllm_backend | Runtime toggles and lazy accessors for optional vLLM layer implementations. |

diffulex.layer.activation

This module provides gated activation blocks used by MLP implementations. It prefers vLLM fused operators when they are enabled and available; otherwise it falls back to native PyTorch implementations.

| Symbol | Purpose |
| --- | --- |
| SiluAndMul | SiLU-gated activation block for SwiGLU-style MLPs. |
| GeluAndMul | GELU-tanh gated activation block. |

Use these modules in model code instead of open-coding chunk/split activation logic.
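As a rough illustration of what a gated activation block computes, the sketch below works on plain Python lists rather than tensors. It assumes the input's last dimension holds the gate half followed by the up half, which is the convention vLLM's SiluAndMul uses; whether Diffulex orders the halves the same way is an assumption here. The helper names (silu, gelu_tanh, gated_activation) are hypothetical and not part of the package.

```python
import math

def silu(x: float) -> float:
    """SiLU (swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))

def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))

def gated_activation(row, act):
    """Split a row of width 2*d into gate and up halves; return act(gate) * up."""
    if len(row) % 2 != 0:
        raise ValueError("last dimension must be even")
    d = len(row) // 2
    gate, up = row[:d], row[d:]  # assumed layout: [gate | up]
    return [act(g) * u for g, u in zip(gate, up)]

row = [0.0, 1.0, 2.0, 3.0]                   # gate = [0, 1], up = [2, 3]
silu_out = gated_activation(row, silu)       # SiluAndMul-style result
gelu_out = gated_activation(row, gelu_tanh)  # GeluAndMul-style result
```

The output is half as wide as the input, which is why the preceding projection in a SwiGLU-style MLP typically produces twice the intermediate size.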

diffulex.layer.embed_head

This module handles vocabulary sharding for embeddings and output projection. It gathers or reduces tensor-parallel outputs as needed so model code can share the same layer abstractions across single-GPU and tensor-parallel execution.

| Symbol | Purpose |
| --- | --- |
| VocabParallelEmbedding | Sharded embedding table with tensor-parallel rank handling. |
| ParallelLMHead | Output head built on the same vocabulary-parallel layout. |

Use these layers when model vocab weights need to be partitioned across tensor parallel ranks.
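The idea behind vocabulary sharding can be sketched without any distributed runtime. The example below assumes each rank owns a contiguous, equally sized slice of the vocabulary, returns zeros for tokens it does not own, and relies on a sum across ranks to recover the full lookup. All function names are hypothetical, and the real layers may pad the vocabulary or use a different partitioning.

```python
def shard_range(vocab_size, tp_size, rank):
    """Return [start, end) of the vocabulary rows owned by `rank`."""
    if vocab_size % tp_size != 0:
        raise ValueError("this sketch assumes vocab_size is divisible by tp_size")
    per_rank = vocab_size // tp_size
    return rank * per_rank, (rank + 1) * per_rank

def local_lookup(token_ids, table_shard, start, end, dim):
    """Look up tokens on one rank; tokens owned by other ranks yield zeros."""
    out = []
    for tok in token_ids:
        if start <= tok < end:
            out.append(list(table_shard[tok - start]))
        else:
            out.append([0.0] * dim)
    return out

def all_reduce_sum(per_rank_outputs):
    """Stand-in for a tensor-parallel all-reduce: elementwise sum over ranks."""
    return [
        [sum(vals) for vals in zip(*rows)]
        for rows in zip(*per_rank_outputs)
    ]

vocab_size, dim, tp_size = 8, 2, 2
full_table = [[float(i), float(i) * 10.0] for i in range(vocab_size)]
token_ids = [1, 6, 3]

partials = []
for rank in range(tp_size):
    start, end = shard_range(vocab_size, tp_size, rank)
    shard = full_table[start:end]  # the only rows this rank stores
    partials.append(local_lookup(token_ids, shard, start, end, dim))

embedded = all_reduce_sum(partials)  # matches a lookup in the unsharded table
```

Because exactly one rank contributes a non-zero row per token, the summed result equals the single-GPU lookup.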

diffulex.layer.layernorm

This module provides RMSNorm implementations with optional fused vLLM paths. The wrapper keeps model code independent of the selected backend.

| Symbol | Purpose |
| --- | --- |
| RMSNorm | RMSNorm module with optional fused add+norm path. |

Use RMSNorm in model implementations when the checkpoint architecture expects RMS normalization.
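For reference, the arithmetic behind RMS normalization and its fused add variant can be sketched as follows. This is a list-based illustration, not the module's implementation: the function names, the eps default, and the convention of returning the summed residual alongside the normalized output are assumptions.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned weight."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]

def add_rms_norm(x, residual, weight, eps=1e-6):
    """Add the residual first, then normalize; return (normed, new_residual)."""
    summed = [a + b for a, b in zip(x, residual)]
    return rms_norm(summed, weight, eps), summed

weight = [1.0, 1.0]
normed = rms_norm([3.0, 4.0], weight)

# The fused-style call folds the residual addition into the same step.
normed_fused, new_residual = add_rms_norm([1.0, 1.0], [2.0, 3.0], weight)
```

Returning the summed residual lets the next layer reuse it without recomputing the addition, which is the usual motivation for fusing the two operations.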

diffulex.layer.linear

This module contains the common linear-layer variants used by model families. It combines tensor-parallel splitting/gathering with optional LoRA weight loading hooks.

| Symbol | Purpose |
| --- | --- |
| LoRAMixin | Adapter-loading behavior shared by linear variants. |
| LinearBase | Common base class for Diffulex linear layers. |
| ReplicatedLinear | Non-sharded linear layer. |
| ColumnParallelLinear | Column-sharded tensor-parallel linear layer. |
| MergedColumnParallelLinear | Column-parallel layer for merged projections. |
| QKVParallelLinear | Specialized QKV projection layer. |
| RowParallelLinear | Row-sharded tensor-parallel linear layer. |

Choose the layer variant that matches the checkpoint’s weight layout and the model’s tensor-parallel split.
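The difference between the column-parallel and row-parallel splits can be shown with a small simulation. The sketch below assumes weights are stored as [out_features][in_features], as in PyTorch, so a column-parallel layer shards the output dimension and a row-parallel layer shards the input dimension. The function names are hypothetical, and the loops over ranks stand in for real collective operations.

```python
def matvec(weight, x):
    """y = W @ x for W stored as [out_features][in_features]."""
    return [sum(w * v for w, v in zip(row, x)) for row in weight]

def column_parallel(weight, x, tp_size):
    """Shard output features across ranks; concatenate the per-rank outputs."""
    rows_per_rank = len(weight) // tp_size
    out = []
    for rank in range(tp_size):
        shard = weight[rank * rows_per_rank:(rank + 1) * rows_per_rank]
        out.extend(matvec(shard, x))  # concatenation stands in for an all-gather
    return out

def row_parallel(weight, x, tp_size):
    """Shard input features across ranks; sum the per-rank partial outputs."""
    cols_per_rank = len(x) // tp_size
    total = [0.0] * len(weight)
    for rank in range(tp_size):
        lo, hi = rank * cols_per_rank, (rank + 1) * cols_per_rank
        shard = [row[lo:hi] for row in weight]
        partial = matvec(shard, x[lo:hi])
        total = [t + p for t, p in zip(total, partial)]  # stands in for an all-reduce
    return total

weight = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0]]
x = [1.0, 0.0, -1.0, 2.0]

reference = matvec(weight, x)                    # unsharded result
col_out = column_parallel(weight, x, tp_size=2)
row_out = row_parallel(weight, x, tp_size=2)
```

Both splits reproduce the unsharded result. A common pairing is a column-parallel projection followed by a row-parallel one, since the first leaves its output already split along the dimension the second expects as input.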

diffulex.layer.rotary_embedding

This module builds and applies rotary position embeddings. It includes standard rotary embeddings, partial rotary embeddings, Gemma-style proportional rotary scaling, and adapters to vLLM rotary implementations.

| Symbol | Purpose |
| --- | --- |
| RotaryEmbedding | Standard rotary embedding module. |
| PartialRotaryEmbedding | Rotary embedding for models that rotate only part of the head dimension. |
| Gemma4ProportionalRotaryEmbedding | Gemma-style proportional rotary scaling. |
| VllmRotaryEmbeddingAdapter | Adapter around vLLM rotary implementations. |
| get_rope | Cached rotary embedding factory. |
| get_gemma4_proportional_rope | Cached Gemma proportional rotary factory. |

Use the factory helpers instead of constructing rotary modules manually whenever model code should reuse the factories' caching.
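To make the rotation concrete, here is a minimal sketch of standard and partial rotary application on a single head vector, with a cached frequency helper that loosely mirrors the cached-factory idea. It assumes the half-split pairing (element i is paired with element i + rotary_dim/2) and a base of 10000; the pairing and defaults used by Diffulex may differ. The names get_inv_freq and apply_rope are hypothetical.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def get_inv_freq(rotary_dim, base=10000.0):
    """Cached per-pair inverse frequencies: base ** (-2i / rotary_dim)."""
    return tuple(base ** (-2.0 * i / rotary_dim) for i in range(rotary_dim // 2))

def apply_rope(vec, position, rotary_dim=None, base=10000.0):
    """Rotate the first rotary_dim entries of vec; pass the remainder through."""
    rotary_dim = len(vec) if rotary_dim is None else rotary_dim
    half = rotary_dim // 2
    inv_freq = get_inv_freq(rotary_dim, base)
    out = list(vec)
    for i in range(half):
        angle = position * inv_freq[i]
        c, s = math.cos(angle), math.sin(angle)
        x1, x2 = vec[i], vec[i + half]  # assumed half-split pairing
        out[i] = x1 * c - x2 * s
        out[i + half] = x2 * c + x1 * s
    return out

q = [1.0, 2.0, 3.0, 4.0]
full = apply_rope(q, position=5)                   # rotate the whole head dimension
partial = apply_rope(q, position=5, rotary_dim=2)  # rotate only the first two entries
```

Each pair undergoes a plain 2D rotation, so the vector's norm is unchanged, and the partial variant leaves entries beyond rotary_dim untouched.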

diffulex.layer.vllm_backend

This module owns the runtime switch for optional vLLM-backed common layers. It keeps imports lazy so environments without the relevant vLLM symbols can still import Diffulex.

| Symbol | Purpose |
| --- | --- |
| set_vllm_layers_enabled | Enables or disables vLLM-backed layer paths. |
| is_vllm_layers_enabled | Reports whether vLLM-backed paths are active. |
| clear_vllm_layer_caches | Clears cached backend lookups. |
| get_vllm_silu_and_mul_cls / get_vllm_gelu_and_mul_cls | Lazy activation backend accessors. |
| get_vllm_rmsnorm_cls | Lazy RMSNorm backend accessor. |
| get_vllm_rope_fn | Lazy rotary backend accessor. |

Use these helpers inside layer wrappers, not directly from model code.