Initial Commit
This commit is contained in:
2
openspec/changes/p02-force-fetch-command/.openspec.yaml
Normal file
2
openspec/changes/p02-force-fetch-command/.openspec.yaml
Normal file
@@ -0,0 +1,2 @@
|
||||
schema: spec-driven
|
||||
created: 2026-02-12
|
||||
93
openspec/changes/p02-force-fetch-command/design.md
Normal file
93
openspec/changes/p02-force-fetch-command/design.md
Normal file
@@ -0,0 +1,93 @@
|
||||
## Context
|
||||
|
||||
ClawFort currently performs automated hourly news ingestion through APScheduler (`scheduled_news_fetch()` in `backend/news_service.py`) and the same pipeline handles retries, deduplication, image optimization, and persistence. There is no operator-facing command to run this pipeline on demand.
|
||||
|
||||
The change adds an explicit manual trigger path for operations use cases:
|
||||
- first-time bootstrap (populate content immediately after setup)
|
||||
- recovery after failed external API calls
|
||||
- ad-hoc operational refresh without waiting for scheduler cadence
|
||||
|
||||
Constraints:
|
||||
- Reuse existing fetch pipeline to avoid logic drift
|
||||
- Keep behavior idempotent with existing duplicate detection
|
||||
- Preserve scheduler behavior; manual runs must not mutate scheduler configuration
|
||||
|
||||
## Goals / Non-Goals
|
||||
|
||||
**Goals:**
|
||||
- Provide a Python command to force an immediate news fetch.
|
||||
- Reuse existing retry, dedup, and storage logic.
|
||||
- Return clear terminal output and process exit status for automation.
|
||||
- Keep command safe to run repeatedly.
|
||||
|
||||
**Non-Goals:**
|
||||
- Replacing APScheduler-based hourly fetch.
|
||||
- Introducing new API endpoints for manual triggering.
|
||||
- Changing data schema or retention policy.
|
||||
- Building a full operator dashboard.
|
||||
|
||||
## Decisions
|
||||
|
||||
### Decision: Add a dedicated CLI entrypoint module
|
||||
**Decision:** Add a small CLI entrypoint under backend (for example `backend/cli.py`) with a subcommand that invokes the fetch pipeline.
|
||||
|
||||
**Rationale:**
|
||||
- Keeps operational workflow explicit and scriptable.
|
||||
- Avoids coupling manual trigger behavior to HTTP routes.
|
||||
- Works in local dev and containerized runtime.
|
||||
|
||||
**Alternatives considered:**
|
||||
- Add an admin HTTP endpoint: rejected due to unnecessary security exposure.
|
||||
- Trigger APScheduler internals directly: rejected to avoid scheduler-state side effects.
|
||||
|
||||
### Decision: Invoke the existing news pipeline directly
|
||||
**Decision:** The command should call `process_and_store_news()` (or the existing sync wrapper) instead of implementing parallel fetch logic.
|
||||
|
||||
**Rationale:**
|
||||
- Guarantees parity with scheduled runs.
|
||||
- Reuses retry/backoff, fallback provider behavior, image handling, and dedup checks.
|
||||
- Minimizes maintenance overhead.
|
||||
|
||||
**Alternatives considered:**
|
||||
- New command-specific fetch implementation: rejected due to drift risk.
|
||||
|
||||
### Decision: Standardize command exit semantics
|
||||
**Decision:** Exit code `0` for successful command execution (including zero new items), non-zero for operational failures (for example unhandled exceptions or fatal setup errors).
|
||||
|
||||
**Rationale:**
|
||||
- Enables CI/cron/operator scripts to react deterministically.
|
||||
- Matches common CLI conventions.
|
||||
|
||||
**Alternatives considered:**
|
||||
- Exit non-zero when zero new items were inserted: rejected because dedup can make zero-item runs valid.
|
||||
|
||||
### Decision: Keep manual and scheduled paths independent
|
||||
**Decision:** Manual command does not reconfigure or trigger scheduler jobs; it performs a one-off run only.
|
||||
|
||||
**Rationale:**
|
||||
- Avoids race-prone manipulation of scheduler internals.
|
||||
- Reduces complexity and risk in production runtime.
|
||||
|
||||
**Alternatives considered:**
|
||||
- Temporarily altering scheduler trigger times: rejected as brittle and harder to reason about.
|
||||
|
||||
## Risks / Trade-offs
|
||||
|
||||
- **[Risk] Overlapping manual and scheduled runs may happen at boundary times** -> Mitigation: document operational guidance and keep dedup checks as safety net.
|
||||
- **[Risk] External API failures still occur during forced runs** -> Mitigation: existing retry/backoff plus fallback provider path and explicit error output.
|
||||
- **[Trade-off] Command success does not guarantee new rows** -> Mitigation: command output reports inserted count so operators can distinguish no-op vs failure.
|
||||
|
||||
## Migration Plan
|
||||
|
||||
1. Add CLI module and force-fetch subcommand wired to existing pipeline.
|
||||
2. Add command result reporting and exit code behavior.
|
||||
3. Document usage in README for bootstrap and recovery flows.
|
||||
4. Validate command in local runtime and container runtime.
|
||||
|
||||
Rollback:
|
||||
- Remove CLI entrypoint and related docs; scheduler-based hourly behavior remains unchanged.
|
||||
|
||||
## Open Questions
|
||||
|
||||
- Should force-fetch support an optional `--max-attempts` override, or stay fixed to pipeline defaults for v1?
|
||||
- Should concurrent-run prevention use a process lock in this phase, or remain a documented operational constraint?
|
||||
35
openspec/changes/p02-force-fetch-command/proposal.md
Normal file
35
openspec/changes/p02-force-fetch-command/proposal.md
Normal file
@@ -0,0 +1,35 @@
|
||||
## Why
|
||||
|
||||
ClawFort currently fetches news on a fixed hourly schedule, which is not enough during first-time setup or after a failed API cycle. Operators need a reliable way to force an immediate news pull so they can bootstrap content quickly and recover without waiting for the next scheduled run.
|
||||
|
||||
## What Changes
|
||||
|
||||
- **New Capabilities:**
|
||||
- Add a manual Python command to trigger an immediate news fetch on demand.
|
||||
- Add command output that clearly reports success/failure, number of fetched/stored items, and error details.
|
||||
- Add safe invocation behavior so manual runs reuse existing fetch/retry/dedup logic.
|
||||
- **Backend:**
|
||||
- Add a CLI entrypoint/script for force-fetch execution.
|
||||
- Wire the command to existing news aggregation pipeline used by scheduled jobs.
|
||||
- Return non-zero exit codes on command failure for operational automation.
|
||||
- **Operations:**
|
||||
- Document how and when to run the force-fetch command (initial setup and recovery scenarios).
|
||||
|
||||
## Capabilities
|
||||
|
||||
### New Capabilities
|
||||
- `force-fetch-command`: Provide a Python command that triggers immediate news aggregation outside the hourly scheduler.
|
||||
- `fetch-run-reporting`: Provide operator-facing command output and exit semantics for successful runs and failures.
|
||||
- `manual-fetch-recovery`: Support manual recovery workflow after failed or partial API fetch cycles.
|
||||
|
||||
### Modified Capabilities
|
||||
- None.
|
||||
|
||||
## Impact
|
||||
|
||||
- **Code:** New CLI command module/entrypoint plus minimal integration with existing `news_service` execution path.
|
||||
- **APIs:** No external API contract changes.
|
||||
- **Dependencies:** No required new runtime dependencies expected.
|
||||
- **Infrastructure:** No deployment topology change; command runs in the same container/runtime.
|
||||
- **Environment:** Reuses existing env vars (`PERPLEXITY_API_KEY`, `OPENROUTER_API_KEY`, `IMAGE_QUALITY`).
|
||||
- **Data:** No schema changes; command writes through existing dedup + persistence flow.
|
||||
@@ -0,0 +1,27 @@
|
||||
## ADDED Requirements
|
||||
|
||||
### Requirement: Command reports run outcome to operator
|
||||
The system SHALL present operator-facing output that describes whether the forced run succeeded or failed.
|
||||
|
||||
#### Scenario: Successful run reporting
|
||||
- **WHEN** a forced fetch command completes without fatal errors
|
||||
- **THEN** the command output includes a success indication
|
||||
- **AND** includes the number of items stored in that run
|
||||
|
||||
#### Scenario: Failed run reporting
|
||||
- **WHEN** a forced fetch command encounters a fatal execution error
|
||||
- **THEN** the command output includes a failure indication
|
||||
- **AND** includes actionable error details for operator diagnosis
|
||||
|
||||
### Requirement: Command exposes automation-friendly exit semantics
|
||||
The system SHALL return deterministic process exit codes for command success and failure.
|
||||
|
||||
#### Scenario: Exit code on success
|
||||
- **WHEN** the force-fetch command execution completes successfully
|
||||
- **THEN** the process exits with code 0
|
||||
- **AND** automation tooling can treat the run as successful
|
||||
|
||||
#### Scenario: Exit code on fatal failure
|
||||
- **WHEN** the force-fetch command execution fails fatally
|
||||
- **THEN** the process exits with a non-zero code
|
||||
- **AND** automation tooling can detect the failure state
|
||||
@@ -0,0 +1,27 @@
|
||||
## ADDED Requirements
|
||||
|
||||
### Requirement: Operator can trigger immediate news fetch via Python command
|
||||
The system SHALL provide a Python command that triggers one immediate news aggregation run outside of the hourly scheduler.
|
||||
|
||||
#### Scenario: Successful forced fetch invocation
|
||||
- **WHEN** an operator runs the documented force-fetch command with valid runtime configuration
|
||||
- **THEN** the system executes one full fetch cycle using the existing aggregation pipeline
|
||||
- **AND** the command terminates after the run completes
|
||||
|
||||
#### Scenario: Command does not reconfigure scheduler
|
||||
- **WHEN** an operator runs the force-fetch command while the service scheduler exists
|
||||
- **THEN** the command performs a one-off run only
|
||||
- **AND** scheduler job definitions and cadence remain unchanged
|
||||
|
||||
### Requirement: Forced fetch reuses existing aggregation behavior
|
||||
The system SHALL use the same retry, fallback, deduplication, image processing, and persistence logic as scheduled fetch runs.
|
||||
|
||||
#### Scenario: Retry and fallback parity
|
||||
- **WHEN** the primary news provider request fails during a forced run
|
||||
- **THEN** the system applies the configured retry behavior
|
||||
- **AND** uses the configured fallback provider path if available
|
||||
|
||||
#### Scenario: Deduplication parity
|
||||
- **WHEN** fetched headlines match existing duplicate rules
|
||||
- **THEN** duplicate items are skipped according to existing deduplication policy
|
||||
- **AND** only eligible items are persisted
|
||||
@@ -0,0 +1,22 @@
|
||||
## ADDED Requirements
|
||||
|
||||
### Requirement: Manual command supports bootstrap and recovery workflows
|
||||
The system SHALL allow operators to run the forced fetch command during first-time setup and after failed scheduled cycles.
|
||||
|
||||
#### Scenario: Bootstrap content population
|
||||
- **WHEN** the system is newly deployed and contains no current news items
|
||||
- **THEN** an operator can run the force-fetch command immediately
|
||||
- **AND** the command attempts to populate the dataset without waiting for the next hourly schedule
|
||||
|
||||
#### Scenario: Recovery after failed scheduled fetch
|
||||
- **WHEN** a prior scheduled fetch cycle failed or produced incomplete results
|
||||
- **THEN** an operator can run the force-fetch command on demand
|
||||
- **AND** the system performs a fresh one-off fetch attempt
|
||||
|
||||
### Requirement: Repeated manual runs remain operationally safe
|
||||
The system SHALL support repeated operator-triggered runs without corrupting data integrity.
|
||||
|
||||
#### Scenario: Repeated invocation in same day
|
||||
- **WHEN** an operator runs the force-fetch command multiple times within the same day
|
||||
- **THEN** existing deduplication behavior prevents duplicate persistence for matching items
|
||||
- **AND** each command run completes with explicit run status output
|
||||
28
openspec/changes/p02-force-fetch-command/tasks.md
Normal file
28
openspec/changes/p02-force-fetch-command/tasks.md
Normal file
@@ -0,0 +1,28 @@
|
||||
## 1. CLI Command Foundation
|
||||
|
||||
- [x] 1.1 Create `backend/cli.py` with command parsing for force-fetch execution
|
||||
- [x] 1.2 Add a force-fetch command entrypoint that can be invoked via Python module execution
|
||||
- [x] 1.3 Ensure command initializes required runtime context (env + database readiness)
|
||||
|
||||
## 2. Force-Fetch Execution Path
|
||||
|
||||
- [x] 2.1 Wire command to existing news aggregation execution path (`process_and_store_news` or sync wrapper)
|
||||
- [x] 2.2 Ensure command runs as a one-off operation without changing scheduler job configuration
|
||||
- [x] 2.3 Preserve existing deduplication, retry, fallback, and image processing behavior during manual runs
|
||||
|
||||
## 3. Operator Reporting and Exit Semantics
|
||||
|
||||
- [x] 3.1 Add success output that includes stored item count for the forced run
|
||||
- [x] 3.2 Add failure output with actionable error details when fatal execution errors occur
|
||||
- [x] 3.3 Return exit code `0` on success and non-zero on fatal failures
|
||||
|
||||
## 4. Recovery Workflow and Validation
|
||||
|
||||
- [x] 4.1 Validate bootstrap workflow: force-fetch on a fresh deployment with no current items
|
||||
- [x] 4.2 Validate recovery workflow: force-fetch after simulated failed scheduled cycle
|
||||
- [x] 4.3 Validate repeated same-day manual runs do not create duplicate records under dedup policy
|
||||
|
||||
## 5. Documentation
|
||||
|
||||
- [x] 5.1 Update `README.md` with force-fetch command usage for first-time setup
|
||||
- [x] 5.2 Document recovery-run usage and expected command output/exit behavior
|
||||
Reference in New Issue
Block a user