astro-website/openspec/specs/social-content-aggregation/spec.md
2026-02-10 01:20:58 -05:00

ADDED Requirements

Requirement: Normalized content items

The system MUST normalize all ingested items (YouTube videos, Instagram posts, podcast episodes) into a single internal schema so the website can render them consistently.

The normalized item MUST include at least the following fields (thumbnailUrl is the only optional one):

  • id (stable within its source)
  • source (youtube, instagram, or podcast)
  • url
  • title
  • publishedAt (ISO-8601)
  • thumbnailUrl (optional)
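
As a rough illustration, the field list above could be expressed as a TypeScript type with a runtime guard. The names `ContentSource`, `NormalizedItem`, and `isNormalizedItem` are assumptions made for this sketch; only the field names come from the requirement. The date check relies on `Date.parse`, which is looser than strict ISO-8601 validation.

```typescript
// Hypothetical names; only the field names are taken from the requirement.
type ContentSource = "youtube" | "instagram" | "podcast";

interface NormalizedItem {
  id: string; // stable within its source
  source: ContentSource;
  url: string;
  title: string;
  publishedAt: string; // ISO-8601 timestamp
  thumbnailUrl?: string; // optional
}

const KNOWN_SOURCES: string[] = ["youtube", "instagram", "podcast"];

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.length > 0;
}

// Runtime guard for the required fields. The publishedAt check only confirms
// that the string parses as a date; it is not a strict ISO-8601 validator.
function isNormalizedItem(value: unknown): value is NormalizedItem {
  if (typeof value !== "object" || value === null) return false;
  const { id, source, url, title, publishedAt, thumbnailUrl } =
    value as Record<string, unknown>;
  return (
    isNonEmptyString(id) &&
    isNonEmptyString(source) &&
    KNOWN_SOURCES.indexOf(source) !== -1 &&
    isNonEmptyString(url) &&
    isNonEmptyString(title) &&
    isNonEmptyString(publishedAt) &&
    !isNaN(Date.parse(publishedAt)) &&
    (thumbnailUrl === undefined || typeof thumbnailUrl === "string")
  );
}
```

A guard like this could run at the boundary between ingestion and rendering, so that malformed items are dropped before they reach the site.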

Scenario: Normalizing a YouTube video

  • WHEN the system ingests a YouTube video item
  • THEN it produces a normalized item containing id, source: youtube, url, title, and publishedAt

Scenario: Normalizing a podcast episode

  • WHEN the system ingests a podcast RSS episode
  • THEN it produces a normalized item containing id, source: podcast, url, title, and publishedAt
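
The two scenarios above might be sketched as a pair of mapping functions. The raw input shapes (`RawYouTubeVideo`, `RawPodcastEpisode`) and the function names are hypothetical; the spec does not say what the upstream fetchers return. The sketch assumes podcast dates arrive in the RFC 822 style common in RSS feeds and converts them to ISO-8601.

```typescript
// Hypothetical input shapes: what an upstream fetcher might hand over.
interface RawYouTubeVideo {
  videoId: string;
  title: string;
  publishedAt: string; // any date string that Date can parse
  thumbnailUrl?: string;
}

interface RawPodcastEpisode {
  guid: string;
  link: string;
  title: string;
  pubDate: string; // e.g. "Tue, 10 Feb 2026 06:20:58 GMT"
}

interface NormalizedItem {
  id: string;
  source: "youtube" | "instagram" | "podcast";
  url: string;
  title: string;
  publishedAt: string; // ISO-8601
  thumbnailUrl?: string;
}

// Converts a parseable date string to ISO-8601.
// toISOString throws a RangeError if the input cannot be parsed.
function toIso(dateText: string): string {
  return new Date(dateText).toISOString();
}

function normalizeYouTubeVideo(video: RawYouTubeVideo): NormalizedItem {
  const item: NormalizedItem = {
    id: video.videoId,
    source: "youtube",
    url: "https://www.youtube.com/watch?v=" + encodeURIComponent(video.videoId),
    title: video.title.trim(),
    publishedAt: toIso(video.publishedAt),
  };
  if (video.thumbnailUrl) item.thumbnailUrl = video.thumbnailUrl;
  return item;
}

function normalizePodcastEpisode(episode: RawPodcastEpisode): NormalizedItem {
  return {
    id: episode.guid,
    source: "podcast",
    url: episode.link,
    title: episode.title.trim(),
    publishedAt: toIso(episode.pubDate),
  };
}
```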

Requirement: YouTube ingestion with stats when available

The system MUST support ingesting YouTube videos for the channel youtube.com/santhoshj.

When a YouTube API key is configured, the system MUST ingest video metadata and MUST ingest the view count (and MAY ingest like and comment counts if available) so that high-performing videos can be identified.

When no YouTube API key is configured, the system MUST still ingest the latest videos using an unauthenticated mechanism (for example, the channel RSS feed) but MUST omit performance stats.

Scenario: API key configured

  • WHEN a YouTube API key is configured
  • THEN the system ingests video metadata and includes metrics.views for each ingested video when available from the API

Scenario: No API key configured

  • WHEN no YouTube API key is configured
  • THEN the system ingests the latest videos and does not require metrics.views to be present
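
One possible way to express the two modes is a small planning function plus a mapper for the stats. This is a sketch under assumptions: `planYouTubeIngestion`, `toMetrics`, and the `Metrics` shape beyond `views` are hypothetical names. The `YouTubeStatistics` shape mirrors the `statistics` part of a YouTube Data API v3 video resource, where counts arrive as strings.

```typescript
// Hypothetical metrics shape; the spec only names metrics.views explicitly.
interface Metrics {
  views?: number;
  likes?: number;
  comments?: number;
}

// Counts in the YouTube Data API v3 `statistics` part are strings.
interface YouTubeStatistics {
  viewCount?: string;
  likeCount?: string;
  commentCount?: string;
}

type IngestionPlan =
  | { mode: "api"; apiKey: string; includeMetrics: true }
  | { mode: "rss"; includeMetrics: false };

// Chooses API ingestion when a key is configured, otherwise the
// unauthenticated fallback without performance stats.
function planYouTubeIngestion(apiKey: string | undefined): IngestionPlan {
  const key = (apiKey === undefined ? "" : apiKey).trim();
  if (key.length > 0) {
    return { mode: "api", apiKey: key, includeMetrics: true };
  }
  return { mode: "rss", includeMetrics: false };
}

// Parses one count; returns undefined when the value is missing or invalid.
function toCount(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const parsed = Number(raw);
  return isFinite(parsed) && parsed >= 0 ? parsed : undefined;
}

// Builds metrics only from the stats the API actually returned.
function toMetrics(stats: YouTubeStatistics | undefined): Metrics | undefined {
  if (stats === undefined) return undefined;
  const metrics: Metrics = {};
  const views = toCount(stats.viewCount);
  const likes = toCount(stats.likeCount);
  const comments = toCount(stats.commentCount);
  if (views !== undefined) metrics.views = views;
  if (likes !== undefined) metrics.likes = likes;
  if (comments !== undefined) metrics.comments = comments;
  return metrics;
}
```

Keeping every metric optional lets the renderer treat a missing `metrics.views` the same way in both modes.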

Requirement: Podcast RSS ingestion

The system MUST ingest the Irregular Mind podcast RSS feed and produce normalized items representing podcast episodes.

Scenario: RSS feed fetch succeeds

  • WHEN the system fetches the podcast RSS feed successfully
  • THEN it produces one normalized item per episode with source: podcast

Requirement: Instagram content support via embed-first approach

The system MUST support representing Instagram posts for @santhoshjanan in the site content surface.

If API-based ingestion is not configured or not available, the system MUST support an embed-first representation where the normalized item contains a url to the Instagram post and any additional embed metadata needed by the renderer.

Scenario: Embed-first mode

  • WHEN Instagram API ingestion is not configured
  • THEN the system provides normalized Instagram items that contain a public post url suitable for embedding
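
A minimal sketch of embed-first mode, assuming post URLs are curated by hand and that the title and publish date are supplied alongside them (the spec does not say where these come from without an API). The `embed` field, `InstagramEmbedItem`, and `toInstagramEmbedItem` are hypothetical names for illustration.

```typescript
interface InstagramEmbedItem {
  id: string;
  source: "instagram";
  url: string;
  title: string;
  publishedAt: string; // ISO-8601, supplied by the caller in this sketch
  // Hypothetical extra field carrying what a renderer might need for an embed.
  embed: { kind: "post" | "reel"; shortcode: string };
}

// Matches public permalinks such as https://www.instagram.com/p/<shortcode>/
const INSTAGRAM_PERMALINK =
  /^https:\/\/(?:www\.)?instagram\.com\/(p|reel)\/([A-Za-z0-9_-]+)\/?(?:\?.*)?$/;

// Returns a normalized item for a public post URL, or null when the URL
// does not look like an Instagram post or reel permalink.
function toInstagramEmbedItem(
  postUrl: string,
  title: string,
  publishedAt: string
): InstagramEmbedItem | null {
  const match = INSTAGRAM_PERMALINK.exec(postUrl.trim());
  if (match === null) return null;
  const kind = match[1] === "reel" ? "reel" : "post";
  const shortcode = match[2];
  return {
    id: shortcode,
    source: "instagram",
    // Canonical form without tracking query parameters.
    url: "https://www.instagram.com/" + match[1] + "/" + shortcode + "/",
    title: title,
    publishedAt: publishedAt,
    embed: { kind: kind, shortcode: shortcode },
  };
}
```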

Requirement: Refresh and caching

The system MUST cache the latest successful ingestion output and MUST serve the cached data to the site renderer.

The system MUST support periodic refresh on a schedule (at minimum daily) and MUST support a manual refresh trigger.

On ingestion failure, the system MUST continue serving the most recent cached data.

The ingestion pipeline MUST use the cache layer (when configured and reachable) to reduce repeated network and parsing work for external sources (for example, YouTube API/RSS and podcast RSS).

Scenario: Scheduled refresh fails

  • WHEN a scheduled refresh run fails to fetch one or more sources
  • THEN the site continues to use the most recent successfully cached dataset

Scenario: Manual refresh requested

  • WHEN a manual refresh is triggered
  • THEN the system attempts ingestion immediately and updates the cache if ingestion succeeds
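
The two refresh scenarios above share one rule: the cached dataset is replaced only when ingestion succeeds. A minimal sketch of that rule, with hypothetical names (`Snapshot`, `RefreshOutcome`, `applyRefresh`), could look like this. It treats a run that fails for any source as a failed run, which is one reading of the scheduled-refresh scenario.

```typescript
// The dataset currently served to the site renderer.
interface Snapshot<T> {
  items: T[];
  refreshedAt: string; // ISO-8601 time of the last successful ingestion
}

// Result of one ingestion attempt (scheduled or manual).
type RefreshOutcome<T> =
  | { status: "ok"; items: T[] }
  | { status: "failed"; reason: string };

// Returns the snapshot to serve after a refresh attempt.
// On success the cache is replaced; on failure the previous snapshot is kept.
function applyRefresh<T>(
  current: Snapshot<T> | null,
  outcome: RefreshOutcome<T>,
  now: Date
): Snapshot<T> | null {
  if (outcome.status === "ok") {
    return { items: outcome.items, refreshedAt: now.toISOString() };
  }
  return current;
}
```

Because the function is pure, the scheduler and the manual trigger can both call it and differ only in when they run.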

Scenario: Cache hit avoids refetch

  • WHEN a refresh run is executed within the cache TTL for a given combination of source and parameters
  • THEN the ingestion pipeline uses cached data for that source instead of refetching over the network
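
The cache-hit scenario could be sketched as a small in-memory TTL store keyed by source and parameters. This is an illustration only: the spec does not name a cache backend, and `TtlStore` and `sourceKey` are hypothetical. The current time is passed in explicitly so the expiry logic is easy to test.

```typescript
interface TtlEntry<T> {
  value: T;
  storedAt: number; // epoch milliseconds
}

// Builds a stable cache key from a source name and its parameters.
// Parameter names are sorted so the key does not depend on insertion order.
function sourceKey(source: string, params: Record<string, string>): string {
  const parts = Object.keys(params)
    .sort()
    .map(function (name) {
      return name + "=" + params[name];
    });
  return [source].concat(parts).join("|");
}

class TtlStore<T> {
  private entries: Record<string, TtlEntry<T>> = {};
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached value while it is younger than the TTL, else undefined.
  get(key: string, nowMs: number): T | undefined {
    if (!Object.prototype.hasOwnProperty.call(this.entries, key)) {
      return undefined;
    }
    const entry = this.entries[key];
    return nowMs - entry.storedAt < this.ttlMs ? entry.value : undefined;
  }

  // Stores a freshly fetched value together with the time it was fetched.
  set(key: string, value: T, nowMs: number): void {
    this.entries[key] = { value: value, storedAt: nowMs };
  }
}
```

An ingestion step would call `get` before going to the network and call `set` only after a successful fetch, so a failed fetch never overwrites a usable entry.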