feat: refactor PDF extraction to caption-based screenshots, add upvote refresh, clean up UI
- PDF extractor: rewrite from embedded bitmap extraction to caption-based page region screenshots. Finds Figure/Table captions via regex, captures the page region above or below the caption, and handles compound figures and vector graphics.
- Upvote refresh: new crawler.refresh_upvotes() re-fetches upvotes for the most recent N days without inserting new papers. The scheduler runs it daily, 30 minutes after the pipeline.
- Admin: add /admin/refresh-upvotes endpoint and dashboard button.
- UI: remove date quick nav, show upvote update time on detail/card pages, clean up CSS date-chip styles.
- Utils: add recent_date_strs() helper.
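
To illustrate the caption-finding step described for the PDF extractor, here is a minimal sketch using only the standard `re` module. The pattern, the name `CAPTION_RE`, and the function `find_captions` are hypothetical and are not taken from this commit; the actual extractor may use a different regex and will also need page geometry to crop the region around each caption.

```python
import re

# Hypothetical pattern: a caption is a line that starts with
# "Figure", "Fig." or "Table", followed by a number and ":" or ".".
CAPTION_RE = re.compile(
    r"^(?P<kind>Figure|Fig\.|Table)\s+(?P<num>\d+)\s*[:.]",
    re.IGNORECASE | re.MULTILINE,
)


def find_captions(page_text: str) -> list[tuple[str, int]]:
    """Return (kind, number) for each caption-like line in the page text."""
    found = []
    for match in CAPTION_RE.finditer(page_text):
        # Normalise "Figure" / "Fig." to "figure" and "Table" to "table".
        kind = "table" if match.group("kind").lower().startswith("tab") else "figure"
        found.append((kind, int(match.group("num"))))
    return found
```

Anchoring the pattern at the start of a line keeps in-text references such as "as shown in Figure 1" from being treated as captions.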
@@ -57,6 +57,13 @@ def yesterday_str() -> str:
     return yesterday.isoformat()
 
 
+def recent_date_strs(n: int) -> list[str]:
+    """List of date strings for the most recent N days (including today, per APP_TIMEZONE)."""
+    tz = ZoneInfo(settings.APP_TIMEZONE)
+    today = datetime.now(tz).date()
+    return [(today - timedelta(days=i)).isoformat() for i in range(n)]
+
+
 def latest_paper_date(db) -> str:
     """Query the latest paper_date in the database; fall back to today_str() when there is no data."""
     from sqlalchemy import func, select
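
For readers who want to try the new helper outside the project, the following is a self-contained sketch of the same logic. It makes two assumptions that are not part of the commit: the time zone is passed as a `tz_name` parameter instead of being read from `settings.APP_TIMEZONE`, and an optional `now` argument is added so the result can be checked against a fixed instant.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def recent_date_strs(
    n: int, tz_name: str = "UTC", now: Optional[datetime] = None
) -> list[str]:
    """ISO date strings for the most recent n days, newest first, including today."""
    tz = ZoneInfo(tz_name)
    # `now` is a hypothetical testing hook; it must be timezone-aware.
    current = now.astimezone(tz) if now is not None else datetime.now(tz)
    today = current.date()
    return [(today - timedelta(days=i)).isoformat() for i in range(n)]


# Usage with a fixed instant (2024 is a leap year, so Feb 29 exists).
fixed = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)
dates = recent_date_strs(3, "UTC", fixed)
```

The list runs from today backwards, so `dates[0]` is always the current date in the chosen time zone, and `n = 0` yields an empty list.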