clawfort/openspec/specs/news-aggregator/spec.md at main

Santhosh Janardhanan bf4a40f533 bulk commit changes!

2026-02-13 02:32:06 -05:00

Purpose

Canonical specification for news-aggregator requirements synced from OpenSpec change deltas.

Requirements

Requirement: News aggregation via Perplexity API

The system SHALL fetch AI news hourly from the Perplexity API and store it with full attribution.

Scenario: Hourly news fetch

WHEN the scheduled job runs every hour
THEN the system calls the Perplexity API with the query "latest AI news"
AND stores the response with headline, summary, source URL, and timestamp

Scenario: API error handling

WHEN the Perplexity API returns an error or times out
THEN the system logs the error along with cost-tracking data
AND retries with exponential backoff up to 3 times
AND falls back to OpenRouter API if OPENROUTER_API_KEY is configured
AND continues using cached content if all retries and fallback fail
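
The error-handling scenario above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `fetch_with_fallback`, its parameters, and the 1-second base delay are hypothetical, and "up to 3 times" is read here as one initial attempt plus three retries. The provider calls are passed in as plain callables so that no Perplexity or OpenRouter client API has to be assumed, and cost tracking is left out because the spec does not define it.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("news_aggregator")


def fetch_with_fallback(
    primary: Callable[[], list],
    fallback: Optional[Callable[[], list]],
    cached: list,
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Try the primary provider with backoff, then the fallback, then the cache."""
    # One initial attempt plus up to max_retries retries (an assumed reading).
    for attempt in range(max_retries + 1):
        try:
            return primary()
        except Exception as exc:  # a real client would catch narrower error types
            logger.error("primary fetch failed (attempt %d): %s", attempt + 1, exc)
            if attempt < max_retries:
                # Exponential backoff: 1s, 2s, 4s with the default base_delay.
                sleep(base_delay * 2 ** attempt)

    # The caller would pass a fallback only when OPENROUTER_API_KEY is configured.
    if fallback is not None:
        try:
            return fallback()
        except Exception as exc:
            logger.error("fallback fetch failed: %s", exc)

    # Every attempt failed: keep serving the content that is already cached.
    return cached
```

Injecting `sleep` keeps the backoff schedule testable without real waiting; a caller would typically pass `fallback=None` when `OPENROUTER_API_KEY` is unset.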

Requirement: Featured image generation

The system SHALL generate or fetch a relevant featured image for each news item.

Scenario: Image acquisition

WHEN a new news item is fetched
THEN the system SHALL request a relevant image URL from Perplexity
AND download and optimize the image locally using Pillow
AND apply quality compression based on the IMAGE_QUALITY environment variable (1-100, default 85)
AND store the optimized image path and original image credit/source information

Scenario: Image optimization configuration

WHEN the system processes an image
THEN it SHALL read IMAGE_QUALITY from the environment (default: 85)
AND apply JPEG compression at the specified quality level
AND resize images wider than 1200px down to 1200px wide while maintaining aspect ratio
AND store optimized images in /app/static/images/ directory
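
A minimal sketch of the optimization steps above, assuming Pillow is installed. The function names, the clamping of out-of-range values, the fallback to 85 on unparsable input, and the choice of `.jpg` output filenames are illustrative assumptions; the spec fixes only the environment variable, the 1200px width limit, JPEG compression, and the output directory.

```python
import os
from pathlib import Path

MAX_WIDTH = 1200
OUTPUT_DIR = Path("/app/static/images")


def read_image_quality(env=os.environ) -> int:
    """Read IMAGE_QUALITY, defaulting to 85 and clamping to the 1-100 range."""
    try:
        quality = int(env.get("IMAGE_QUALITY", "85"))
    except ValueError:
        quality = 85  # assumption: unparsable values fall back to the default
    return max(1, min(100, quality))


def target_size(width: int, height: int, max_width: int = MAX_WIDTH) -> tuple:
    """Scale down to max_width while keeping the aspect ratio; never upscale."""
    if width <= max_width:
        return width, height
    return max_width, max(1, round(height * max_width / width))


def optimize_image(src: Path, output_dir: Path = OUTPUT_DIR) -> Path:
    """Resize and re-encode src as JPEG, returning the optimized file's path."""
    from PIL import Image  # imported lazily so the helpers above work without Pillow

    output_dir.mkdir(parents=True, exist_ok=True)
    dest = output_dir / (src.stem + ".jpg")
    with Image.open(src) as img:
        img = img.convert("RGB")  # JPEG cannot store alpha or palette modes
        new_size = target_size(*img.size)
        if new_size != img.size:
            img = img.resize(new_size, Image.LANCZOS)
        img.save(dest, "JPEG", quality=read_image_quality(), optimize=True)
    return dest
```

For example, a 2400x1600 source would be written out at 1200x800, while an 800x600 source keeps its dimensions and is only re-encoded.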

Scenario: Image fallback

WHEN image acquisition fails or returns no result
THEN the system SHALL use a default ClawFort branded placeholder image

Requirement: News data persistence with retention

The system SHALL keep news items active for 30 days, then archive them automatically and permanently delete archived items once they are older than 60 days.

Scenario: News storage

WHEN a news item is fetched from Perplexity
THEN the system SHALL store it in SQLite with fields: id, headline, summary, source_url, image_url, image_credit, published_at, created_at
AND set its archived flag to false by default
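
One way the storage scenario could look with Python's built-in `sqlite3` module is sketched below. The table name `news_items`, the column types, the nullability choices, and the helper names are assumptions; the spec lists only the field names and the `archived` default. Timestamps are stored as ISO-8601 UTC strings and `archived` as an integer, since SQLite has no native datetime or boolean type.

```python
import sqlite3
from datetime import datetime, timezone

# Column names follow the spec; types and constraints are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS news_items (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    headline     TEXT NOT NULL,
    summary      TEXT NOT NULL,
    source_url   TEXT NOT NULL,
    image_url    TEXT,
    image_credit TEXT,
    published_at TEXT NOT NULL,
    created_at   TEXT NOT NULL,
    archived     INTEGER NOT NULL DEFAULT 0
);
"""


def init_db(conn: sqlite3.Connection) -> None:
    """Create the news table if it does not exist yet."""
    conn.executescript(SCHEMA)


def insert_news_item(
    conn: sqlite3.Connection,
    headline: str,
    summary: str,
    source_url: str,
    published_at: str,
    image_url: str = None,
    image_credit: str = None,
) -> int:
    """Insert one item; archived is left to its default of 0 (false)."""
    created_at = datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO news_items "
        "(headline, summary, source_url, image_url, image_credit, published_at, created_at) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (headline, summary, source_url, image_url, image_credit, published_at, created_at),
    )
    conn.commit()
    return cur.lastrowid
```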

Scenario: Automatic archiving

WHEN a nightly cleanup job runs
THEN the system SHALL mark all news items older than 30 days as archived=true
AND delete archived items older than 60 days permanently
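
The nightly job above might be sketched like this. It assumes a `news_items` table whose `created_at` column holds ISO-8601 UTC strings in a uniform format (so string comparison orders them correctly), and it measures age from `created_at`; the spec does not say whether `created_at` or `published_at` is meant, nor how the job is scheduled. `run_cleanup` is a hypothetical name.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def run_cleanup(conn: sqlite3.Connection, now: datetime = None) -> tuple:
    """Archive items older than 30 days, then delete archived items older than 60."""
    now = now or datetime.now(timezone.utc)
    archive_cutoff = (now - timedelta(days=30)).isoformat()
    delete_cutoff = (now - timedelta(days=60)).isoformat()

    # Step 1: flag everything past the 30-day retention window.
    archived = conn.execute(
        "UPDATE news_items SET archived = 1 WHERE archived = 0 AND created_at < ?",
        (archive_cutoff,),
    ).rowcount

    # Step 2: permanently remove archived rows past the 60-day mark.
    deleted = conn.execute(
        "DELETE FROM news_items WHERE archived = 1 AND created_at < ?",
        (delete_cutoff,),
    ).rowcount

    conn.commit()
    return archived, deleted
```

Passing `now` explicitly makes the job deterministic in tests; in production it would default to the current UTC time.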

Scenario: Duplicate prevention

WHEN a fetched news item's headline matches that of an existing item stored within the previous 24 hours
THEN the system SHALL skip insertion to prevent duplicates
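
A duplicate check along these lines could run before each insert. This is a sketch under stated assumptions: the spec does not define what "matches" means, so the comparison here is case-insensitive and whitespace-trimmed, and the 24-hour window is measured against a `created_at` column holding ISO-8601 UTC strings in a `news_items` table. `is_duplicate` is a hypothetical helper name.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def is_duplicate(
    conn: sqlite3.Connection,
    headline: str,
    now: datetime = None,
    window_hours: int = 24,
) -> bool:
    """Return True if the same headline was stored within the last window_hours."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(hours=window_hours)).isoformat()
    row = conn.execute(
        # SQLite's lower() folds ASCII letters only; that is enough for a sketch.
        "SELECT 1 FROM news_items "
        "WHERE lower(trim(headline)) = lower(trim(?)) AND created_at >= ? "
        "LIMIT 1",
        (headline, cutoff),
    ).fetchone()
    return row is not None
```

A fetch loop would then skip any item for which `is_duplicate` returns `True` and insert the rest.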