Purpose
Canonical specification for news-aggregator requirements synced from OpenSpec change deltas.
Requirements
Requirement: News aggregation via Perplexity API
The system SHALL fetch AI news hourly from Perplexity API and store it with full attribution.
Scenario: Hourly news fetch
- WHEN the scheduled job runs every hour
- THEN the system calls Perplexity API with query "latest AI news"
- AND stores the response with headline, summary, source URL, and timestamp
Scenario: API error handling
- WHEN Perplexity API returns an error or timeout
- THEN the system logs the error with cost tracking
- AND retries with exponential backoff up to 3 times
- AND falls back to OpenRouter API if OPENROUTER_API_KEY is configured
- AND continues using cached content if all retries and fallback fail
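The error-handling scenario above can be sketched as follows. This is a minimal illustration, not the project's implementation: `fetch_with_fallback`, the `primary`, `fallback`, and `cached` callables, and the 1-second base delay are all hypothetical names and values. It also assumes "retries up to 3 times" means one initial attempt plus three retries, and it leaves cost tracking out.

```python
import logging
import os
import time
from typing import Callable, Optional

logger = logging.getLogger("news_fetch")  # hypothetical logger name

MAX_RETRIES = 3  # "up to 3 times" per the scenario


def fetch_with_fallback(
    primary: Callable[[], dict],
    fallback: Optional[Callable[[], dict]],
    cached: Callable[[], dict],
    base_delay: float = 1.0,  # assumed base delay in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Try the primary API with retries, then the fallback, then the cache."""
    # One initial attempt plus up to MAX_RETRIES retries (an assumption).
    for attempt in range(MAX_RETRIES + 1):
        try:
            return primary()
        except Exception as exc:
            logger.warning("primary fetch failed (attempt %d): %s", attempt + 1, exc)
            if attempt < MAX_RETRIES:
                # Exponential backoff: base_delay, then 2x, then 4x.
                sleep(base_delay * 2 ** attempt)

    # The fallback is used only when OPENROUTER_API_KEY is configured.
    if fallback is not None and os.environ.get("OPENROUTER_API_KEY"):
        try:
            return fallback()
        except Exception as exc:
            logger.warning("fallback fetch failed: %s", exc)

    # Last resort: keep serving cached content.
    return cached()
```

Passing `sleep` as a parameter keeps the backoff schedule testable without real waiting.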
Requirement: Featured image generation
The system SHALL generate or fetch a relevant featured image for each news item.
Scenario: Image acquisition
- WHEN a new news item is fetched
- THEN the system SHALL request a relevant image URL from Perplexity
- AND download and optimize the image locally using Pillow
- AND apply quality compression based on IMAGE_QUALITY env var (1-100, default 85)
- AND store the optimized image path and original image credit/source information
Scenario: Image optimization configuration
- WHEN the system processes an image
- THEN it SHALL read IMAGE_QUALITY from environment (default: 85)
- AND apply JPEG compression at specified quality level
- AND resize images exceeding 1200px width while maintaining aspect ratio
- AND store optimized images in /app/static/images/ directory
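A minimal sketch of the optimization steps above, assuming Pillow is installed. The function names (`read_image_quality`, `target_size`, `optimize_image`) are hypothetical. Clamping out-of-range values to 1-100 and falling back to 85 on unparsable input are assumptions; the spec does not say how invalid values are handled.

```python
import os
from typing import Mapping, Tuple

MAX_WIDTH = 1200                    # width limit from the scenario
DEFAULT_QUALITY = 85                # default from the scenario
OUTPUT_DIR = "/app/static/images/"  # directory from the scenario


def read_image_quality(env: Mapping[str, str] = os.environ) -> int:
    """Read IMAGE_QUALITY, clamped to 1-100; 85 if unset or not an integer."""
    raw = env.get("IMAGE_QUALITY")
    if raw is None:
        return DEFAULT_QUALITY
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_QUALITY  # assumption: invalid input uses the default
    return max(1, min(100, value))


def target_size(width: int, height: int, max_width: int = MAX_WIDTH) -> Tuple[int, int]:
    """Scale down to max_width, keeping the aspect ratio; never upscale."""
    if width <= max_width:
        return width, height
    return max_width, max(1, round(height * max_width / width))


def optimize_image(src_path: str, dest_name: str, out_dir: str = OUTPUT_DIR) -> str:
    """Resize and re-encode an image as JPEG; return the stored path."""
    from PIL import Image  # Pillow, imported lazily so the helpers work without it

    quality = read_image_quality()
    with Image.open(src_path) as img:
        img = img.convert("RGB")  # JPEG has no alpha channel
        new_size = target_size(*img.size)
        if new_size != img.size:
            img = img.resize(new_size, Image.LANCZOS)
        os.makedirs(out_dir, exist_ok=True)
        dest = os.path.join(out_dir, dest_name)
        img.save(dest, format="JPEG", quality=quality, optimize=True)
    return dest
```

Pillow's documentation advises against JPEG quality values above 95, so an implementation may want to cap the effective value even though the spec allows up to 100.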
Scenario: Image fallback
- WHEN image generation fails or returns no result
- THEN the system SHALL use a default ClawFort branded placeholder image
Requirement: News data persistence with retention
The system SHALL keep news items active for 30 days, archive them automatically after that, and permanently delete archived items after 60 days.
Scenario: News storage
- WHEN a news item is fetched from Perplexity
- THEN the system SHALL store it in SQLite with fields: id, headline, summary, source_url, image_url, image_credit, published_at, created_at
- AND set archived=false by default
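One possible shape for the storage scenario above, using Python's built-in `sqlite3` module. The table name `news_items`, the column types, the `NOT NULL` constraints, and the `created_at` default are assumptions. The `archived` column is stored as an integer because SQLite has no dedicated boolean type.

```python
import sqlite3

# Hypothetical schema; the field names come from the scenario, the rest is assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS news_items (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    headline     TEXT NOT NULL,
    summary      TEXT NOT NULL,
    source_url   TEXT NOT NULL,
    image_url    TEXT,
    image_credit TEXT,
    published_at TEXT NOT NULL,
    created_at   TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
    archived     INTEGER NOT NULL DEFAULT 0
);
"""


def init_db(conn: sqlite3.Connection) -> None:
    """Create the table if it does not exist yet."""
    conn.executescript(SCHEMA)


def insert_news(conn: sqlite3.Connection, item: dict) -> int:
    """Insert one news item; archived defaults to 0 (false)."""
    params = {
        "headline": item["headline"],
        "summary": item["summary"],
        "source_url": item["source_url"],
        "image_url": item.get("image_url"),
        "image_credit": item.get("image_credit"),
        "published_at": item["published_at"],
    }
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO news_items "
            "(headline, summary, source_url, image_url, image_credit, published_at) "
            "VALUES (:headline, :summary, :source_url, :image_url, "
            ":image_credit, :published_at)",
            params,
        )
    return cur.lastrowid
```

Timestamps are kept as ISO 8601 UTC strings here so that plain string comparison orders them chronologically.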
Scenario: Automatic archiving
- WHEN a nightly cleanup job runs
- THEN the system SHALL mark all news items older than 30 days as archived=true
- AND permanently delete archived items older than 60 days
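The nightly job above might look like the sketch below. It assumes a hypothetical `news_items` table with an integer `archived` column and a `created_at` column holding ISO 8601 UTC strings, and it assumes age is measured from `created_at`; the spec does not say which timestamp defines "older than".

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

FMT = "%Y-%m-%dT%H:%M:%SZ"  # assumed storage format for created_at


def run_cleanup(conn: sqlite3.Connection, now: Optional[datetime] = None) -> Tuple[int, int]:
    """Archive items older than 30 days, then delete archived items older than 60.

    Returns (rows_archived, rows_deleted).
    """
    now = now or datetime.now(timezone.utc)
    archive_cutoff = (now - timedelta(days=30)).strftime(FMT)
    delete_cutoff = (now - timedelta(days=60)).strftime(FMT)

    with conn:  # both statements run in one transaction
        archived = conn.execute(
            "UPDATE news_items SET archived = 1 "
            "WHERE archived = 0 AND created_at < ?",
            (archive_cutoff,),
        ).rowcount
        deleted = conn.execute(
            "DELETE FROM news_items WHERE archived = 1 AND created_at < ?",
            (delete_cutoff,),
        ).rowcount
    return archived, deleted
```

Because archiving runs first, an item that was never archived but is already past 60 days is archived and deleted in the same run.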
Scenario: Duplicate prevention
- WHEN fetching news that matches an existing headline (within 24 hours)
- THEN the system SHALL skip insertion to prevent duplicates
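A hedged sketch of the duplicate check above. The helper name `is_duplicate` and the `news_items` table are hypothetical. Treating "matches" as a case-insensitive, whitespace-trimmed comparison and measuring the 24-hour window from `created_at` are both assumptions; an exact-match rule would be just as consistent with the spec.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional

FMT = "%Y-%m-%dT%H:%M:%SZ"  # assumed storage format for created_at


def is_duplicate(
    conn: sqlite3.Connection,
    headline: str,
    now: Optional[datetime] = None,
    window_hours: int = 24,
) -> bool:
    """Return True if the same headline was stored within the window."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(hours=window_hours)).strftime(FMT)
    row = conn.execute(
        "SELECT 1 FROM news_items "
        "WHERE lower(trim(headline)) = lower(trim(?)) AND created_at >= ? "
        "LIMIT 1",
        (headline, cutoff),
    ).fetchone()
    return row is not None
```

A caller would run this check before inserting and skip the item when it returns True. Note that SQLite's built-in `lower()` folds ASCII characters only, so headlines with non-ASCII letters would need normalization in Python instead.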