daily-paper

Author	SHA1	Message	Date
Rain-Bus	1fc6303e09	feat: refactor PDF extraction to caption-based screenshots, add upvote refresh, clean up UI - PDF extractor: rewrite from embedded bitmap extraction to caption-based page region screenshots. Finds Figure/Table captions via regex, captures the page region above/below, handles compound figures and vector graphics. - Upvote refresh: new crawler.refresh_upvotes() re-fetches upvotes for the most recent N days without inserting new papers. Scheduler runs daily 30 min after pipeline. - Admin: add /admin/refresh-upvotes endpoint and dashboard button. - UI: remove date quick nav, show upvote update time on detail/card pages, clean up CSS date-chip styles. - Utils: add recent_date_strs() helper.	2026-06-09 18:01:01 +08:00
Rain-Bus	18f44ac244	feat: improve PDF image extraction with caption-based labeling and fallback matching - Enhance pdf_image_extractor with caption text extraction near images/tables - Add figure/table type correction based on caption content - Implement sequential numbering fallback for unmatched items - Improve figure linking in pages with manifest ID matching and fallback strategies - Remove docling dependency, add dev dependency group	2026-06-09 14:07:21 +08:00
Rain-Bus	32978b3fc5	feat: add admin dashboard, pipeline service, lightbox, and update dependencies	2026-06-09 09:32:10 +08:00
Rain-Bus	0d293422ac	feat: enhance UI, refactor services, improve templates and tests - Replace image_extractor with pdf_image_extractor service - Enhance pi_client with expanded API capabilities - Improve summarizer service with additional features - Update admin routes with more endpoints - Add login page template - Enhance detail page with comprehensive layout - Improve search and trends pages - Update base template with additional elements - Refactor tests for better coverage - Add validate_summary script - Update project configuration and dependencies	2026-06-07 19:38:58 +08:00

4 Commits
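
The caption-based approach described in commit 1fc6303e09 (find Figure/Table captions via regex, then capture the page region above or below) might look roughly like the sketch below. This is not the repository's code: `CAPTION_RE`, `Caption`, `find_captions`, and `region_side` are hypothetical names, the pattern is a guess at what a caption line looks like, and the above/below rule assumes the common layout where figure captions sit under the figure and table captions sit over the table.

```python
import re
from typing import List, NamedTuple

# Hypothetical pattern: a line that starts with "Figure 3:", "Fig. 3.", or "Table 2:".
CAPTION_RE = re.compile(r"^\s*(Figure|Fig\.?|Table)\s+(\d+)\s*[:.]", re.IGNORECASE)


class Caption(NamedTuple):
    kind: str        # "figure" or "table"
    number: int      # the number printed in the caption
    line_index: int  # index of the caption line within the page text


def find_captions(lines: List[str]) -> List[Caption]:
    """Return the captions found at the start of any line of page text."""
    captions = []
    for i, line in enumerate(lines):
        m = CAPTION_RE.match(line)
        if m:
            kind = "table" if m.group(1).lower().startswith("tab") else "figure"
            captions.append(Caption(kind, int(m.group(2)), i))
    return captions


def region_side(kind: str) -> str:
    """Assumed convention: the figure is above its caption, the table below it."""
    return "above" if kind == "figure" else "below"


page_lines = [
    "Intro text that mentions nothing special.",
    "Figure 1: Overview of the pipeline.",
    "Table 2. Main results.",
    "As shown, see Figure 3 for details.",  # in-text reference, not a caption
]
found = find_captions(page_lines)
```

Anchoring the pattern at the start of the line is what keeps in-text references such as "see Figure 3" from being treated as captions; a real extractor would then map each caption's position to page coordinates before rendering the region.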