Purpose

Canonical specification for news-aggregator requirements synced from OpenSpec change deltas.

Requirements

Requirement: News aggregation via Perplexity API

The system SHALL fetch AI news hourly from Perplexity API and store it with full attribution.

Scenario: Hourly news fetch

  • WHEN the scheduled job runs every hour
  • THEN the system calls Perplexity API with query "latest AI news"
  • AND stores the response with headline, summary, source URL, and timestamp

Scenario: API error handling

  • WHEN Perplexity API returns an error or timeout
  • THEN the system logs the error together with its cost-tracking data
  • AND retries with exponential backoff up to 3 times
  • AND falls back to OpenRouter API if OPENROUTER_API_KEY is configured
  • AND continues using cached content if all retries and fallback fail
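
The retry, fallback, and cache order in this scenario can be sketched as follows. This is a minimal illustration, not the project's implementation: `fetch_with_fallback`, its parameters, and the 1-second base delay are hypothetical names and values chosen for the example, and cost tracking is not modeled.

```python
import logging
import time
from typing import Callable, Optional

logger = logging.getLogger("news_fetch")


def fetch_with_fallback(
    primary: Callable[[], dict],
    fallback: Optional[Callable[[], dict]],
    cached: dict,
    max_retries: int = 3,
    base_delay: float = 1.0,  # assumed value; the spec does not set a delay
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Try the primary source, retry with backoff, then fallback, then cache."""
    # One initial call plus up to `max_retries` retries.
    for attempt in range(max_retries + 1):
        try:
            return primary()
        except Exception as exc:
            logger.error("primary fetch failed (attempt %d): %s", attempt + 1, exc)
            if attempt < max_retries:
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * 2 ** attempt)

    # The fallback is only passed in when it is configured.
    if fallback is not None:
        try:
            return fallback()
        except Exception as exc:
            logger.error("fallback fetch failed: %s", exc)

    # Everything failed: keep serving what is already cached.
    return cached
```

A caller would pass a fallback callable only when `OPENROUTER_API_KEY` is present in the environment, and `None` otherwise. Injecting `sleep` keeps the backoff testable without real waiting.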

The system SHALL generate or fetch a relevant featured image for each news item.

Scenario: Image acquisition

  • WHEN a new news item is fetched
  • THEN the system SHALL request a relevant image URL from Perplexity
  • AND download and optimize the image locally using Pillow
  • AND apply quality compression based on IMAGE_QUALITY env var (1-100, default 85)
  • AND store the optimized image path and original image credit/source information

Scenario: Image optimization configuration

  • WHEN the system processes an image
  • THEN it SHALL read IMAGE_QUALITY from environment (default: 85)
  • AND apply JPEG compression at specified quality level
  • AND resize images exceeding 1200px width while maintaining aspect ratio
  • AND store optimized images in /app/static/images/ directory
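
One way to satisfy the optimization steps above with Pillow is sketched below. The function names are hypothetical, and two behaviors are assumptions the spec does not state: an unset or non-numeric `IMAGE_QUALITY` falls back to 85, and out-of-range values are clamped to 1-100.

```python
import os
from pathlib import Path
from typing import Mapping, Tuple

MAX_WIDTH = 1200
DEFAULT_QUALITY = 85


def read_image_quality(env: Mapping[str, str] = os.environ) -> int:
    """Return IMAGE_QUALITY clamped to 1-100, or 85 when unset or invalid."""
    raw = env.get("IMAGE_QUALITY", "")
    try:
        value = int(raw)
    except ValueError:
        return DEFAULT_QUALITY
    return max(1, min(100, value))


def target_size(width: int, height: int, max_width: int = MAX_WIDTH) -> Tuple[int, int]:
    """Scale down to max_width, preserving aspect ratio; never scale up."""
    if width <= max_width:
        return width, height
    return max_width, max(1, round(height * max_width / width))


def optimize_image(src: Path, dest_dir: Path, quality: int) -> Path:
    """Resize if needed and save as JPEG at the given quality."""
    # Pillow is imported here so the helpers above need only the stdlib.
    from PIL import Image

    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / (src.stem + ".jpg")
    with Image.open(src) as img:
        img = img.convert("RGB")  # JPEG has no alpha channel
        new_size = target_size(*img.size)
        if new_size != img.size:
            img = img.resize(new_size, Image.LANCZOS)
        img.save(dest, format="JPEG", quality=quality, optimize=True)
    return dest
```

In this sketch a call such as `optimize_image(Path("download.png"), Path("/app/static/images"), read_image_quality())` would write the optimized file into the directory named in the scenario; `download.png` is a placeholder filename.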

Scenario: Image fallback

  • WHEN image generation or fetching fails or returns no result
  • THEN the system SHALL use a default ClawFort branded placeholder image

Requirement: News data persistence with retention

The system SHALL keep news items active for 30 days, then archive them automatically and delete them once they are older than 60 days.

Scenario: News storage

  • WHEN a news item is fetched from Perplexity
  • THEN the system SHALL store it in SQLite with fields: id, headline, summary, source_url, image_url, image_credit, published_at, created_at
  • AND set archived=false by default
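
A possible SQLite layout for the fields listed above is sketched here. The table name `news_items`, the column types, and the choice to store timestamps as ISO-8601 UTC strings are assumptions made for illustration; the spec names only the fields and the `archived` default.

```python
import sqlite3
from datetime import datetime, timezone

# Column types and the table name are assumptions; the spec lists field names only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS news_items (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    headline     TEXT NOT NULL,
    summary      TEXT NOT NULL,
    source_url   TEXT,
    image_url    TEXT,
    image_credit TEXT,
    published_at TEXT,
    created_at   TEXT NOT NULL,
    archived     INTEGER NOT NULL DEFAULT 0  -- SQLite has no boolean type; 0 = false
)
"""


def init_db(conn: sqlite3.Connection) -> None:
    """Create the table if it does not exist yet."""
    conn.execute(SCHEMA)
    conn.commit()


def store_item(conn: sqlite3.Connection, item: dict) -> int:
    """Insert one news item; archived is left to its default of 0."""
    row = dict(item, created_at=datetime.now(timezone.utc).isoformat(timespec="seconds"))
    cur = conn.execute(
        """
        INSERT INTO news_items
            (headline, summary, source_url, image_url, image_credit,
             published_at, created_at)
        VALUES
            (:headline, :summary, :source_url, :image_url, :image_credit,
             :published_at, :created_at)
        """,
        row,
    )
    conn.commit()
    return cur.lastrowid
```

`store_item` expects a dict containing all six caller-supplied keys; `created_at` is filled in at insert time.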

Scenario: Automatic archiving

  • WHEN a nightly cleanup job runs
  • THEN the system SHALL mark all news items older than 30 days as archived=true
  • AND permanently delete archived items older than 60 days
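
The nightly cleanup could look roughly like the sketch below. It assumes a `news_items` table whose `created_at` column holds ISO-8601 UTC strings in a single format, and it measures age from `created_at`; the spec does not say whether age is based on `created_at` or `published_at`. The function name `run_cleanup` is hypothetical.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def run_cleanup(
    conn: sqlite3.Connection, now: Optional[datetime] = None
) -> Tuple[int, int]:
    """Archive items older than 30 days, then delete archived items older than 60.

    Returns (rows_archived, rows_deleted).
    """
    now = now or datetime.now(timezone.utc)
    archive_cutoff = (now - timedelta(days=30)).isoformat(timespec="seconds")
    delete_cutoff = (now - timedelta(days=60)).isoformat(timespec="seconds")

    # ISO-8601 strings in one fixed format sort chronologically,
    # so plain string comparison is enough here.
    archived = conn.execute(
        "UPDATE news_items SET archived = 1 WHERE archived = 0 AND created_at < ?",
        (archive_cutoff,),
    ).rowcount
    deleted = conn.execute(
        "DELETE FROM news_items WHERE archived = 1 AND created_at < ?",
        (delete_cutoff,),
    ).rowcount
    conn.commit()
    return archived, deleted
```

Archiving runs before deletion, matching the order of the scenario steps, so an unarchived item that is already past 60 days is archived and removed in the same run.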

Scenario: Duplicate prevention

  • WHEN a fetched news item has the same headline as an item stored within the previous 24 hours
  • THEN the system SHALL skip insertion to prevent duplicates
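
A minimal sketch of the duplicate check follows. It treats "matches" as an exact headline match after trimming whitespace, which is an assumption; the spec does not define the matching rule. The table name `news_items`, the `created_at` column format, and the function name `insert_if_new` are likewise illustrative.

```python
import sqlite3
from datetime import datetime, timedelta, timezone
from typing import Optional


def insert_if_new(
    conn: sqlite3.Connection,
    headline: str,
    summary: str,
    now: Optional[datetime] = None,
) -> bool:
    """Insert the item unless the same headline was stored in the last 24 hours.

    Returns True when a row was inserted, False when it was skipped.
    """
    now = now or datetime.now(timezone.utc)
    headline = headline.strip()
    window_start = (now - timedelta(hours=24)).isoformat(timespec="seconds")

    duplicate = conn.execute(
        "SELECT 1 FROM news_items WHERE headline = ? AND created_at >= ? LIMIT 1",
        (headline, window_start),
    ).fetchone()
    if duplicate is not None:
        return False  # skip insertion to prevent a duplicate

    conn.execute(
        "INSERT INTO news_items (headline, summary, created_at) VALUES (?, ?, ?)",
        (headline, summary, now.isoformat(timespec="seconds")),
    )
    conn.commit()
    return True
```

Because the check is limited to a 24-hour window, the same headline can be stored again once the earlier copy is more than a day old.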