## Context

The site is an Astro static build served via nginx. Content is gathered by build-time ingestion (`site/scripts/fetch-content.ts`) that reads/writes a repo-local cache file (`site/content/cache/content.json`). Today, repeated ingestion runs can re-hit external sources (YouTube API/RSS, podcast RSS, WordPress `wp-json`) and redo normalization work. We want a shared caching layer to reduce IO and network load and to make repeated runs faster and more predictable.

## Goals / Non-Goals

**Goals:**

- Add a Redis-backed cache layer usable from Node scripts (ingestion) with TTL-based invalidation.
- Use the cache layer to reduce repeated network/API calls and parsing work for:
  - social content ingestion (YouTube/podcast/Instagram list)
  - WordPress `wp-json` ingestion
- Provide a default "industry standard" TTL with an environment override.
- Add a manual cache-clear command/script.
- Provide verification (tests and/or logs) that cache hits occur and that TTL expiration behaves as expected.

**Non-Goals:**

- Adding a runtime server for the site (the site remains static HTML served by nginx).
- Caching browser requests to nginx (no CDN/edge cache configuration in this change).
- Perfect cache coherence across multiple machines/environments (dev + Docker is the target).

## Decisions

- **Decision: Use Redis as the shared cache backend (docker-compose service).**
  - Rationale: Redis is widely adopted, lightweight, supports TTLs natively, and is easy to run in dev via Docker.
  - Alternative considered: a local file-based cache only. Rejected because it doesn't provide a shared service and is harder to invalidate consistently.
- **Decision: Cache at the "source fetch" and "normalized dataset" boundaries.**
  - Rationale: The biggest cost is network plus parsing/normalization. Caching raw API responses (or normalized outputs) keyed by source + params gives the best win.
  - Approach:
    - Use cache keys like `youtube:api::`, `podcast:rss:`, `wp:posts`, `wp:pages`, `wp:categories`.
    - Store JSON values, set a TTL, and log hit/miss per key.
- **Decision: Default TTL = 1 hour (3600 s), configurable via env.**
  - Rationale: A 1 h TTL is a common baseline for balancing content freshness against load, and it aligns with typical ingestion schedules (hourly/daily).
  - Allow overrides for local testing and production tuning.
- **Decision: The cache-clear script runs Redis `FLUSHDB` against the configured Redis database.**
  - Rationale: A simple manual operation that is easy to verify.
  - Guardrail: Use a dedicated Redis DB index (`0` by default) so the script's scope is limited to the cache database.

## Risks / Trade-offs

- [Risk] Redis introduces a new dependency and operational moving part. -> Mitigation: keep Redis optional; ingestion should fall back to no-cache mode if Redis is not reachable.
- [Risk] Stale content if the TTL is too long. -> Mitigation: default to 1 h and allow env override; provide a manual clear command.
- [Risk] Cache-key mistakes lead to wrong content reuse. -> Mitigation: centralize key generation and add tests for key uniqueness and TTL behavior.
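The manual clear decision could reduce to a one-line script. This is a sketch of the operation, not a decided interface: the `REDIS_URL` and `REDIS_DB` variable names are hypothetical, and it assumes `redis-cli` is available wherever the script runs.

```shell
# Hypothetical cache-clear script: flush only the configured cache DB.
# -u selects the server, -n selects the DB index (0 by default per the
# guardrail above), so FLUSHDB cannot touch other databases.
redis-cli -u "${REDIS_URL:-redis://localhost:6379}" -n "${REDIS_DB:-0}" FLUSHDB
```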
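For the dev/Docker setup named in the first decision, the Redis service might look like the fragment below. The image tag, port mapping, and volume name are assumptions for illustration, not pinned choices.

```yaml
# docker-compose fragment (sketch): single Redis service for the cache layer.
services:
  redis:
    image: redis:7-alpine        # assumed tag; any recent Redis works
    ports:
      - "6379:6379"              # expose for local Node ingestion runs
    volumes:
      - redis-data:/data         # optional persistence across restarts

volumes:
  redis-data:
```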
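The fetch-boundary decision above (JSON values, per-key TTL, hit/miss logging, no-cache fallback) could be sketched as follows. The `CacheStore` interface and `InMemoryStore` are illustrative stand-ins so the sketch is self-contained; in the real ingestion script the store would wrap a Redis client (`GET` / `SET ... EX`), and the `CONTENT_CACHE_TTL` env var name is a hypothetical, not a decided name.

```typescript
// Minimal TTL-capable store interface; a Redis-backed implementation
// would map get/set onto the client's GET and SET-with-EX commands.
interface CacheStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in with the same expiry semantics as Redis SET ... EX.
class InMemoryStore implements CacheStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry || Date.now() >= entry.expiresAt) return null;
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Hypothetical env var name for the TTL override; default is the 1 h decision.
const DEFAULT_TTL_SECONDS = Number(process.env.CONTENT_CACHE_TTL ?? "3600");

// Fetch-through cache: serialize to JSON, set a TTL, log hit/miss per key,
// and fall back to a direct fetch (store === null) when Redis is unreachable.
async function cachedFetch<T>(
  store: CacheStore | null,
  key: string,
  fetcher: () => Promise<T>,
  ttlSeconds: number = DEFAULT_TTL_SECONDS,
): Promise<T> {
  if (store) {
    const hit = await store.get(key);
    if (hit !== null) {
      console.log(`[cache] hit  ${key}`);
      return JSON.parse(hit) as T;
    }
    console.log(`[cache] miss ${key}`);
  }
  const fresh = await fetcher();
  if (store) {
    await store.set(key, JSON.stringify(fresh), ttlSeconds);
  }
  return fresh;
}
```

Keeping all key strings (`wp:posts`, `podcast:rss:`, etc.) in one module that feeds `cachedFetch` is one way to address the cache-key risk noted above, since key construction becomes unit-testable in isolation.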