Context
Current operations are concentrated in backend/cli.py with a single force-fetch command and no unified admin maintenance suite. Operational actions such as archive cleanup, translation regeneration, image refresh, and cache/news reset require manual code/DB operations. Existing backend services already contain reusable primitives: ingestion (process_and_store_news), archival helpers (archive_old_news, delete_archived_news), and translation generation pipelines in backend/news_service.py.
Goals / Non-Goals
Goals:
- Introduce an admin command suite that consolidates common maintenance and recovery actions.
- Implement queued image refetch for the latest 30 items, processed sequentially with exponential backoff.
- Improve image refresh relevance by combining keyword and mood/sentiment cues with deterministic fallback behavior.
- Provide safe destructive operations (clear-news, clean-archive, cache clear) with operator guardrails.
- Add translation regeneration and parameterized fetch count command to reduce manual intervention.
Non-Goals:
- Replacing the scheduled ingestion model.
- Introducing external queue infrastructure (RabbitMQ/Redis workers) for this phase.
- Redesigning storage models or adding new DB tables unless strictly necessary.
- Building a web-based admin dashboard in this change.
Decisions
Decision: Extend existing CLI with subcommands
Decision: Expand backend/cli.py into a multi-subcommand admin command suite.
Rationale:
- Reuses existing deployment/runtime assumptions.
- Keeps operations scriptable via terminal/cron and avoids UI scope expansion.
Alternatives considered:
- New standalone admin binary: rejected due to duplicated bootstrapping/runtime checks.
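A minimal sketch of how a multi-subcommand parser could be laid out with the standard library's argparse. The handler functions and their return values are hypothetical placeholders; only the force-fetch and clean-archive command names come from this document, and the real backend/cli.py may be organized differently.

```python
import argparse
from typing import Optional, Sequence


def handle_force_fetch(args: argparse.Namespace) -> str:
    # Placeholder: the real handler would call the existing force-fetch path.
    return "ran force-fetch"


def handle_clean_archive(args: argparse.Namespace) -> str:
    # Placeholder: the real handler would call the archival helpers.
    return "ran clean-archive"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cli", description="Admin maintenance commands")
    # required=True makes argparse exit with an error when no subcommand is given.
    subparsers = parser.add_subparsers(dest="command", required=True)

    force = subparsers.add_parser("force-fetch", help="Run the existing force-fetch flow")
    force.set_defaults(handler=handle_force_fetch)

    clean = subparsers.add_parser("clean-archive", help="Delete archived news")
    clean.set_defaults(handler=handle_clean_archive)
    return parser


def main(argv: Optional[Sequence[str]] = None) -> str:
    args = build_parser().parse_args(argv)
    # Each subparser stores its own handler, so dispatch is a single call.
    return args.handler(args)
```

Attaching the handler with set_defaults keeps dispatch out of a growing if/elif chain, so adding a subcommand touches only build_parser and one new function.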
Decision: Queue image refetch in-process with sequential workers
Decision: Build a bounded in-memory queue for the latest 30 items and process them one by one.
Rationale:
- Meets rate-limit resilience requirement without new infrastructure.
- Deterministic and easy to monitor in command output.
Alternatives considered:
- Parallel refetch workers: rejected due to higher provider throttling risk.
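The bounded, sequential queue described above could look roughly like the following. This is a sketch under assumptions: latest_ids is taken to be ordered newest first, and refetch_one is a hypothetical callable standing in for whatever performs the image lookup for a single item.

```python
from collections import deque
from itertools import islice
from typing import Callable, Dict, Iterable

MAX_ITEMS = 30  # bound taken from the "latest 30 items" requirement


def refetch_images(
    latest_ids: Iterable[int],
    refetch_one: Callable[[int], bool],
    log: Callable[[str], None] = print,
) -> Dict[str, int]:
    # Take at most MAX_ITEMS ids from the front (assumed newest first).
    queue = deque(islice(latest_ids, MAX_ITEMS))
    total = len(queue)
    results = {"ok": 0, "failed": 0}
    position = 0
    while queue:
        item_id = queue.popleft()
        position += 1
        try:
            ok = refetch_one(item_id)
        except Exception as exc:
            # One failing item must not abort the rest of the run.
            ok = False
            log(f"item {item_id} raised {exc!r}")
        results["ok" if ok else "failed"] += 1
        log(f"[{position}/{total}] item {item_id}: {'ok' if ok else 'failed'}")
    return results
```

Processing one item at a time keeps at most one outstanding provider call, and the per-item log line gives the operator a running position within the queue.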
Decision: Exponential backoff for external image calls
Decision: Apply exponential backoff with capped retries for rate-limited or transient failures.
Rationale:
- Reduces burst retry amplification.
- Improves success rate under API pressure.
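As an illustration of capped exponential backoff, the sketch below doubles the delay after each transient failure and gives up after a fixed number of retries. TransientError, the default base delay, the cap, and the retry count are all assumptions for the example, not values specified by this design.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for rate-limited or temporary provider failures."""


def call_with_backoff(
    func: Callable[[], T],
    *,
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_retries:
                # Retries exhausted: surface the failure to the caller.
                raise
            # Delay grows as base_delay * 2**attempt, capped at max_delay.
            delay = min(base_delay * 2 ** attempt, max_delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

With the defaults, the waits between attempts are 1, 2, 4, 8, and 16 seconds. Passing sleep as a parameter lets tests record the delays without actually waiting.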
Decision: Safety-first destructive command ergonomics
Decision: Destructive operations require explicit confirmation/flags and support dry-run where meaningful.
Rationale:
- Prevents accidental data loss.
- Makes admin actions auditable and predictable.
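One way the confirmation and dry-run gate could be shared across destructive commands is sketched below. The function name, the count_affected and action callables, and the exit codes are hypothetical; the sketch also assumes that a dry run takes precedence when both flags are given.

```python
from typing import Callable


def run_destructive(
    name: str,
    count_affected: Callable[[], int],
    action: Callable[[], None],
    *,
    confirm: bool = False,
    dry_run: bool = False,
    log: Callable[[str], None] = print,
) -> int:
    """Return a process exit code: 0 on success or dry run, 2 when refused."""
    affected = count_affected()
    if dry_run:
        # Report the impact without touching any data.
        log(f"[dry-run] {name} would affect {affected} item(s)")
        return 0
    if not confirm:
        log(f"{name} refused: {affected} item(s) would be affected; pass --confirm to proceed")
        return 2
    action()
    log(f"{name} completed: {affected} item(s) affected")
    return 0
```

Routing every destructive command through one gate means the refusal message, the dry-run report, and the completion line share a format, which helps when reading command output after the fact.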
Decision: Fetch-N command reuses ingestion pipeline
Decision: Add a fetch-count option that drives existing ingestion/fetch flow rather than building a second implementation.
Rationale:
- Preserves deduplication/retry logic and minimizes divergence.
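A small sketch of how the fetch count could be validated at the parser and then handed to the existing pipeline. The signature of process_and_store_news is not given in this document, so the pipeline is represented here by a hypothetical ingest callable that accepts a count and returns the number of items stored.

```python
import argparse
from typing import Callable


def positive_int(value: str) -> int:
    # argparse calls this with the raw string and reports ArgumentTypeError cleanly.
    try:
        number = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{value!r} is not an integer")
    if number < 1:
        raise argparse.ArgumentTypeError("count must be at least 1")
    return number


def handle_fetch_n(count: int, ingest: Callable[[int], int]) -> int:
    # Delegate to the existing ingestion flow; no second fetch implementation.
    return ingest(count)


def build_fetch_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="fetch-n")
    parser.add_argument("count", type=positive_int, help="number of items to fetch")
    return parser
```

Because the handler only forwards the count, deduplication and retry behavior stay wherever the ingestion pipeline already implements them.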
Risks / Trade-offs
- [Risk] Operator misuse of destructive commands -> Mitigation: confirmation gate + explicit flags + dry-run.
- [Risk] Backoff can increase command runtime -> Mitigation: cap retries and print per-item progress with ETA-style output.
- [Risk] Queue processing interruption mid-run -> Mitigation: idempotent per-item updates and resumable reruns.
- [Trade-off] In-process queue is simpler but non-distributed -> Mitigation: acceptable for admin-invoked maintenance scope.
Migration Plan
- Extend CLI parser with admin subcommands and argument validation.
- Add reusable maintenance handlers (archive clean, cache clear, clear news, rebuild, regenerate translations, fetch-n).
- Implement queued image-refetch handler with exponential backoff and per-item progress logs.
- Add safeguards (--confirm, optional --dry-run) for destructive operations.
- Document command usage and examples in README.
Rollback:
- Keep existing force-fetch path intact.
- Revert new subcommands while preserving unaffected ingestion pipeline.
Open Questions
- What cache layers are considered in-scope for clear-cache (in-memory only vs additional filesystem cache)?
- Should rebuild-site chain all maintenance actions or remain a defined subset with explicit steps?
- Should fetch n enforce an upper bound to avoid accidental high-cost runs?