Rain-Bus ffd2defdfc add Chinese annotations to all source files for learning purposes
Annotated 16 source files covering the full architecture:
engine (scheduler, block manager, model runner), layers (attention,
linear, sampler, etc.), model (qwen3), and utils.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 21:33:15 +08:00


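from dataclasses import dataclass


@dataclass(slots=True)
class SamplingParams:
    """Sampling parameters for generation.

    Args:
        temperature: Sampling temperature; controls how random the output is.
            Larger values are more random, values near 0 are more deterministic.
            Note: this project does not support temperature=0 (greedy decoding);
            the value must be greater than 1e-10.
        max_tokens: Maximum number of tokens generated for a single request.
        ignore_eos: Whether to ignore the EOS token. When True, generation
            continues past the end-of-sequence token until max_tokens is exhausted.
            Used in benchmarks so that every request generates a fixed number of tokens.
    """

    temperature: float = 1.0
    max_tokens: int = 64
    ignore_eos: bool = False

    def __post_init__(self):
        # Reject temperatures at or near zero: greedy decoding is not supported.
        assert self.temperature > 1e-10, "greedy sampling is not permitted"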