feat: refactor summarizer and PDF extraction pipeline

- Split summarizer into summary_generator and summary_persister modules
- Refactor pdf_image_extractor to two-phase pipeline with PicoDet layout detection
- Add layout_detector service for PicoDet-S_layout_3cls integration
- Add exceptions module with ConflictError and NotFoundError
- Improve admin dashboard with better statistics and task management
- Add design review document with system optimization suggestions
- Add new tests for crawler, pdf_downloader, pipeline, and summary_utils
- Update dependencies and configuration
- Clean up dead code and improve error handling
@@ -207,11 +207,14 @@ async def delete_papers_by_date_range(
         completed_at=utc_now(),
         papers_found=total,
         papers_new=deleted,
-        details_json=json.dumps({
-            "total_before": total,
-            "deleted": deleted,
-            "failed": len(failed_items),
-        }, ensure_ascii=False),
+        details_json=json.dumps(
+            {
+                "total_before": total,
+                "deleted": deleted,
+                "failed": len(failed_items),
+            },
+            ensure_ascii=False,
+        ),
         error=job_error,
     )
     db.add(log_entry)