feat: refactor PDF extraction to caption-based screenshots, add upvote refresh, clean up UI
- PDF extractor: rewrite from embedded-bitmap extraction to caption-based screenshots of page regions. Finds Figure/Table captions via regex, crops the page region above or below the caption, and handles compound figures and vector graphics.
- Upvote refresh: new crawler.refresh_upvotes() re-fetches upvotes for papers from the most recent N days without inserting new papers. The scheduler runs it daily, 30 minutes after the pipeline.
- Admin: add /admin/refresh-upvotes endpoint and dashboard button.
- UI: remove the date quick nav, show the upvote update time on detail/card pages, clean up CSS date-chip styles.
- Utils: add recent_date_strs() helper.
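
The caption-detection step could be sketched roughly as follows. The commit does not show its actual regex, so the pattern `CAPTION_RE` and the helper `find_captions()` are hypothetical, and the sketch works on plain text lines rather than on PDF page objects:

```python
import re

# Hypothetical pattern: the regex used in the commit is not shown.
# Matches lines that begin with "Figure 3:", "Fig. 3.", "Table 2:", etc.
CAPTION_RE = re.compile(r"^(Figure|Fig\.|Table)\s+(\d+)\s*[:.]", re.IGNORECASE)


def find_captions(lines):
    """Return (kind, number, line_index) for each line that starts a caption."""
    found = []
    for i, line in enumerate(lines):
        m = CAPTION_RE.match(line.strip())
        if m:
            kind = "table" if m.group(1).lower().startswith("tab") else "figure"
            found.append((kind, int(m.group(2)), i))
    return found
```

Anchoring the pattern at the start of the line keeps in-text references such as "see Table 2 for details" from being treated as captions. A real extractor would then map each hit to a page region to screenshot.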
@@ -41,6 +41,7 @@ class Settings(BaseSettings):
    SCHEDULE_HOUR: int = 4
    SCHEDULE_MINUTE: int = 0
    APP_WORKERS: int = 1
    UPVOTE_REFRESH_DAYS: int = 7  # Refresh upvotes for papers from the last N days

    # Database
    DATABASE_URL: str = "sqlite:///data/db/papers.db"
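
As a rough illustration of how these settings might be consumed, the sketch below derives the list of dates to refresh and the refresh time from the values above. The name `recent_date_strs()` comes from the commit message, but its signature and ISO date format are assumptions, and `refresh_time()` is a hypothetical helper, not code from the commit:

```python
from datetime import date, datetime, timedelta

# Values mirrored from the Settings hunk above.
UPVOTE_REFRESH_DAYS = 7
SCHEDULE_HOUR = 4
SCHEDULE_MINUTE = 0


def recent_date_strs(days, today=None):
    """Return ISO date strings for the last `days` days, newest first.

    Assumed signature: the commit names this helper but does not show it.
    """
    today = today or date.today()
    return [(today - timedelta(days=i)).isoformat() for i in range(days)]


def refresh_time(hour, minute, offset_minutes=30):
    """Hypothetical helper: (hour, minute) of the refresh job, offset after the pipeline."""
    t = datetime(2000, 1, 1, hour, minute) + timedelta(minutes=offset_minutes)
    return t.hour, t.minute
```

With the defaults shown, `refresh_time(SCHEDULE_HOUR, SCHEDULE_MINUTE)` gives `(4, 30)`. Going through `datetime` arithmetic rather than adding 30 to the minute field keeps the result valid when the offset crosses an hour or midnight boundary.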