feat: refactor PDF extraction to caption-based screenshots, add upvote refresh, clean up UI
- PDF extractor: rewrite from embedded-bitmap extraction to caption-based page-region screenshots. Finds Figure/Table captions via regex, captures the page region above/below the caption, and handles compound figures and vector graphics.
- Upvote refresh: new crawler.refresh_upvotes() re-fetches upvotes for papers from the most recent N days without inserting new papers. The scheduler runs it daily, 30 min after the pipeline.
- Admin: add /admin/refresh-upvotes endpoint and dashboard button.
- UI: remove date quick nav, show upvote update time on detail/card pages, clean up CSS date-chip styles.
- Utils: add recent_date_strs() helper.
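As a rough illustration of two pieces named in this message, here is a minimal sketch of regex-based caption detection and of a date-string helper. None of it is taken from the commit: the pattern `CAPTION_RE`, the function `find_captions`, and the signature and return format of `recent_date_strs` (ISO dates, newest first, optional `today` argument) are all assumptions made for the example. The real extractor also has to map each caption to a page region and render it, which is not shown.

```python
import re
from datetime import date, timedelta

# Assumed pattern: a caption starts a line with "Figure 3", "Fig. 3" or
# "Table 2", followed by a colon or a period. The commit's regex may differ.
CAPTION_RE = re.compile(
    r"^(Figure|Fig\.|Table)\s+(\d+)\s*[:.]",
    re.IGNORECASE | re.MULTILINE,
)


def find_captions(page_text):
    """Return a (kind, number) tuple for each caption-like line in the text."""
    results = []
    for match in CAPTION_RE.finditer(page_text):
        label = match.group(1).lower()
        kind = "table" if label.startswith("tab") else "figure"
        results.append((kind, int(match.group(2))))
    return results


def recent_date_strs(days, today=None):
    """Return ISO date strings for the last `days` days, newest first.

    Hypothetical signature; the helper added in the commit may differ.
    """
    today = today or date.today()
    return [(today - timedelta(days=i)).isoformat() for i in range(days)]
```

Anchoring the pattern at the start of a line keeps in-text references such as "as shown in Figure 2" from being mistaken for captions, at the cost of missing captions that the PDF text layer does not place at a line start.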
@@ -20,7 +20,7 @@
 {% endif %}
 </a>
 </h2>
-<span class="paper-upvotes">👍 {{ paper.upvotes }}</span>
+<span class="paper-upvotes" title="数据更新于 {{ paper.crawled_at.strftime('%m-%d %H:%M') if paper.crawled_at else '' }}">👍 {{ paper.upvotes }}</span>
 {% if variant == 'search' and distances and paper.arxiv_id in distances %}
 <span class="similarity-score" title="语义相似度距离">
 🎯 {{ "%.3f"|format(distances[paper.arxiv_id]) }}