18 Commits

Author SHA1 Message Date
Rain-Bus ffd2defdfc add Chinese annotations to all source files for learning purposes
Annotated 16 source files covering the full architecture:
engine (scheduler, block manager, model runner), layers (attention,
linear, sampler, etc.), model (qwen3), and utils.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 21:33:15 +08:00
GeekExplorer 8d63a98c03 support chunked prefill and fix minor bug 2026-04-14 03:05:35 +08:00
GeekExplorer 9e8507ef41 minor simplify 2026-04-13 22:09:46 +08:00
Anai-Guo bf99453d90 fix RowParallelLinear weight_loader crash when bias is enabled
When RowParallelLinear has bias=True, the weight_loader crashes with an
IndexError because it calls param_data.size(tp_dim) where tp_dim=1, but
the bias tensor is 1D and only has dimension 0.

The bias in RowParallelLinear is not sharded (all ranks hold the full
bias, only rank 0 applies it), so skip the sharding logic for 1D params.

Fixes GeeeekExplorer/nano-vllm#125
2026-04-12 23:10:52 -07:00
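Note on bf99453d90: the failure mode described in that commit message can be sketched without any tensor library. This is a minimal illustration, not the repository's code: it models a parameter's shape as a plain tuple, and the function names `shard_range_unguarded` and `shard_range` are hypothetical. It assumes, as the message states, that weights are sharded along `tp_dim=1` while the 1D bias is left unsharded.

```python
from typing import Optional, Tuple


def shard_range_unguarded(shape: Tuple[int, ...], tp_dim: int,
                          tp_rank: int, tp_size: int) -> Tuple[int, int]:
    """Return (start, length) of this rank's shard along tp_dim.

    Hypothetical stand-in for the pre-fix logic: it indexes the shape at
    tp_dim unconditionally, so a 1D shape with tp_dim=1 raises IndexError.
    """
    shard_size = shape[tp_dim] // tp_size
    return tp_rank * shard_size, shard_size


def shard_range(shape: Tuple[int, ...], tp_dim: int,
                tp_rank: int, tp_size: int) -> Optional[Tuple[int, int]]:
    """Same as above, but skip sharding for 1D parameters such as a bias.

    Returns None to signal "load the full tensor", which is the behavior
    the commit message describes for 1D params.
    """
    if len(shape) == 1:
        return None
    return shard_range_unguarded(shape, tp_dim, tp_rank, tp_size)


# A 2D weight of shape (8, 16), split along dim 1 across 2 ranks.
weight_shard = shard_range((8, 16), tp_dim=1, tp_rank=1, tp_size=2)  # (8, 8)

# A 1D bias of shape (8,) is not sharded.
bias_shard = shard_range((8,), tp_dim=1, tp_rank=1, tp_size=2)  # None
```

Calling `shard_range_unguarded((8,), 1, 0, 2)` raises `IndexError`, which mirrors the crash reported for `param_data.size(tp_dim)` on a 1D bias.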
GeeeekExplorer 6ef2a4f630 compile random sampling 2025-08-31 22:55:34 +08:00
GeeeekExplorer df99418f7d simplify 2025-08-31 20:02:51 +08:00
GeeeekExplorer 38baf0bbe4 remove assert shape 2025-06-27 23:00:30 +08:00
GeeeekExplorer 1caeec8dfa same as vllm 2025-06-27 18:50:56 +08:00
GeeeekExplorer 658520b788 warmup and allocate 2025-06-27 01:51:57 +08:00
xiaohajiayou 054aec852d Fix: Division-by-Zero Risk and Typo 2025-06-24 02:02:33 +08:00
GeeeekExplorer cde3fc22c2 simplify 2025-06-21 17:19:15 +08:00
GeeeekExplorer 7e42fa6f63 fix 2025-06-15 13:28:29 +08:00
GeeeekExplorer fc778a4da9 better 2025-06-15 10:36:45 +08:00
cheunglei 53b3ef2e32 support tensor parallel 2025-06-15 01:31:24 +08:00
GeeeekExplorer 08c84ec08d multi file loader 2025-06-12 01:00:09 +08:00
GeeeekExplorer 386290d69e refactor 2025-06-11 21:12:57 +08:00
GeeeekExplorer b98e1ca305 fix 2025-06-10 21:25:54 +08:00
GeeeekExplorer a5a4909e6a init commit 2025-06-10 00:27:01 +08:00