15 Commits

Author SHA1 Message Date
Rain-Bus ffd2defdfc add Chinese annotations to all source files for learning purposes
Annotated 16 source files covering the full architecture:
engine (scheduler, block manager, model runner), layers (attention,
linear, sampler, etc.), model (qwen3), and utils.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 21:33:15 +08:00
GeekExplorer f64d821c20 fix chunked prefill bugs and refactor 2026-04-26 02:53:06 +08:00
GeekExplorer 8d63a98c03 support chunked prefill and fix minor bug 2026-04-14 03:05:35 +08:00
Xingkai Yu 52d2215911 Merge pull request #148 from guodongxiaren/main
remove hard code for block_size
2026-04-13 20:36:32 +08:00
guodongxiaren 55c64e7fdf remove hard code for block_size 2025-12-30 01:55:17 +08:00
Mengqi 82f5ca244f fix bug for tp 2025-12-18 01:28:25 +08:00
GeeeekExplorer 658520b788 warmup and allocate 2025-06-27 01:51:57 +08:00
GeeeekExplorer 03cfc13bb3 faster pickle 2025-06-23 00:51:52 +08:00
GeeeekExplorer bc0ad5a116 better 2025-06-17 23:33:38 +08:00
cheunglei 53b3ef2e32 support tensor parallel 2025-06-15 01:31:24 +08:00
GeeeekExplorer b6136383c9 support fast pickle 2025-06-14 13:36:57 +08:00
GeeeekExplorer 4a8aa090a7 fix 2025-06-14 00:56:07 +08:00
GeeeekExplorer 386290d69e refactor 2025-06-11 21:12:57 +08:00
GeeeekExplorer b98e1ca305 fix 2025-06-10 21:25:54 +08:00
GeeeekExplorer a5a4909e6a init commit 2025-06-10 00:27:01 +08:00