feat: refactor summarizer and PDF extraction pipeline

- Split summarizer into summary_generator and summary_persister modules
- Refactor pdf_image_extractor to two-phase pipeline with PicoDet layout detection
- Add layout_detector service for PicoDet-S_layout_3cls integration
- Add exceptions module with ConflictError and NotFoundError
- Improve admin dashboard with better statistics and task management
- Add design review document with system optimization suggestions
- Add new tests for crawler, pdf_downloader, pipeline, and summary_utils
- Update dependencies and configuration
- Clean up dead code and improve error handling
2026-06-13 13:16:47 +08:00
parent e2f0e1a8be
commit 21f16e6756
43 changed files with 3304 additions and 1494 deletions
@@ -207,11 +207,14 @@ async def delete_papers_by_date_range(
        completed_at=utc_now(),
        papers_found=total,
        papers_new=deleted,
-        details_json=json.dumps({
-            "total_before": total,
-            "deleted": deleted,
-            "failed": len(failed_items),
-        }, ensure_ascii=False),
+        details_json=json.dumps(
+            {
+                "total_before": total,
+                "deleted": deleted,
+                "failed": len(failed_items),
+            },
+            ensure_ascii=False,
+        ),
        error=job_error,
    )
    db.add(log_entry)