better cache

2026-02-10 01:20:58 -05:00
parent c773affbc8
commit f056e67eae
39 changed files with 830 additions and 17 deletions
--- a/openspec/changes/archive/2026-02-10-better-cache/.openspec.yaml
+++ b/openspec/changes/archive/2026-02-10-better-cache/.openspec.yaml
--- a/openspec/changes/archive/2026-02-10-better-cache/design.md
+++ b/openspec/changes/archive/2026-02-10-better-cache/design.md
@@ -0,0 +1,48 @@
+## Context
+
+The site is an Astro static build served via nginx. Content is gathered by build-time ingestion (`site/scripts/fetch-content.ts`) that reads/writes a repo-local cache file (`site/content/cache/content.json`).
+
+Today, repeated ingestion runs can re-hit external sources (YouTube API/RSS, podcast RSS, WordPress `wp-json`) and re-do normalization work. We want a shared caching layer to reduce IO and network load and to make repeated runs faster and more predictable.
+
+## Goals / Non-Goals
+
+**Goals:**
+- Add a Redis-backed cache layer usable from Node scripts (ingestion) with TTL-based invalidation.
+- Use the cache layer to reduce repeated network/API calls and parsing work for:
+  - social content ingestion (YouTube/podcast/Instagram list)
+  - WordPress `wp-json` ingestion
+- Provide a sensible default TTL (1 hour, see Decisions) with an environment override.
+- Add a manual cache clear command/script.
+- Provide verification (tests and/or logs) that cache hits occur and TTL expiration behaves as expected.
+
+**Non-Goals:**
+- Adding a runtime server for the site (the site remains static HTML served by nginx).
+- Caching browser requests to nginx (no CDN/edge cache configuration in this change).
+- Perfect cache coherence across multiple machines/environments (dev+docker is the target).
+
+## Decisions
+
+- **Decision: Use Redis as the shared cache backend (docker-compose service).**
+  - Rationale: Redis is widely adopted, lightweight, supports TTLs natively, and is easy to run in dev via Docker.
+  - Alternative considered: Local file-based cache only. Rejected because it doesn’t provide a shared service and is harder to invalidate consistently.
+
+- **Decision: Cache at the “source fetch” and “normalized dataset” boundaries.**
+  - Rationale: The biggest cost is network I/O plus parsing/normalization. Caching raw API responses (or normalized outputs), keyed by source and parameters, avoids both on repeated runs.
+  - Approach:
+    - Cache keys like `youtube:api:<channelId>:<limit>`, `podcast:rss:<url>`, `wp:posts`, `wp:pages`, `wp:categories`.
+    - Store JSON values, set TTL, and log hit/miss per key.
+
+- **Decision: Default TTL = 1 hour (3600s), configurable via env.**
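+  - Rationale: A 1h TTL is a common baseline that balances content freshness against load on the external sources. Repeated runs within the hour (local rebuilds, retries) reuse cached data, while daily scheduled runs still fetch fresh content.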
+  - Allow overrides for local testing and production tuning.
+
+- **Decision: Cache clear script uses Redis `FLUSHDB` in the configured Redis database.**
+  - Rationale: Simple manual operation and easy to verify.
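+  - Guardrail: Make the Redis DB index configurable (`0` by default) so the script flushes only that database. Because `FLUSHDB` removes every key in the selected database, point it at an index that no other application shares.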
+
+## Risks / Trade-offs
+
+- [Risk] Redis introduces a new dependency and operational moving part. -> Mitigation: Keep Redis optional; ingestion should fall back to no-cache mode if Redis is not reachable.
+- [Risk] Content goes stale if the TTL is too long. -> Mitigation: Default to 1h and allow env override; provide manual clear command.
+- [Risk] Cache key mistakes lead to wrong content reuse. -> Mitigation: Centralize key generation and add tests for key uniqueness and TTL behavior.
+
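The "centralize key generation" mitigation and the key shapes listed under Approach could be sketched as a single helper. This is a minimal illustration, not the wrapper added by this commit: `buildCacheKey` and `KeyPart` are hypothetical names, and percent-encoding the parameters (so a `:` inside a feed URL cannot be mistaken for a separator) is an assumption that the design does not state.

```typescript
// Hypothetical helper: builds namespaced keys of the form
// "<source>:<kind>[:<param>...]", e.g. "youtube:api:<channelId>:<limit>".
type KeyPart = string | number;

function buildCacheKey(source: string, kind: string, ...params: KeyPart[]): string {
  // Assumption: parameters are percent-encoded so that ":" inside a value
  // (for example a URL) cannot collide with the ":" separator.
  const encoded = params.map((p) => encodeURIComponent(String(p)));
  const parts = [source, kind, ...encoded];
  for (const part of parts) {
    if (part === "") {
      throw new Error("cache key parts must be non-empty");
    }
  }
  return parts.join(":");
}
```

With this shape, `buildCacheKey("wp", "posts")` yields `wp:posts`, and two YouTube requests that differ only in `limit` get distinct keys.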
--- a/openspec/changes/archive/2026-02-10-better-cache/proposal.md
+++ b/openspec/changes/archive/2026-02-10-better-cache/proposal.md
@@ -0,0 +1,28 @@
+## Why
+
+Reduce IO and external fetch load by adding a shared caching layer so repeated requests for the same content do not re-hit disk/network unnecessarily.
+
+## What Changes
+
+- Add a caching layer (Redis or similar lightweight cache) used by the site’s data/ingestion flows.
+- Add a cache service to `docker-compose.yml`.
+- Define a cache invalidation interval (TTL) with a sensible default and allow it to be configured via environment variables.
+- Add a script/command to manually clear the cache on demand.
+- Add verification that the cache is working (cache hits/misses and TTL behavior).
+
+## Capabilities
+
+### New Capabilities
+- `cache-layer`: Provide a shared caching service (Redis or equivalent) with TTL-based invalidation and a manual clear operation for the website’s data flows.
+
+### Modified Capabilities
+- `social-content-aggregation`: Use the cache layer to avoid re-fetching or re-processing external content sources on repeated runs/requests.
+- `wordpress-content-source`: Use the cache layer to reduce repeated `wp-json` fetches and parsing work.
+
+## Impact
+
+- Deployment/local dev: add Redis (or equivalent) to `docker-compose.yml` and wire environment/config for connection + TTL.
+- Scripts/services: update ingestion/build-time fetch to read/write via cache and log hit/miss for verification.
+- Tooling: add a cache-clear script/command (and document usage).
+- Testing: add tests or a lightweight verification step proving cached reads are used and expire as expected.
+
--- a/openspec/changes/archive/2026-02-10-better-cache/specs/cache-layer/spec.md
+++ b/openspec/changes/archive/2026-02-10-better-cache/specs/cache-layer/spec.md
@@ -0,0 +1,38 @@
+## ADDED Requirements
+
+### Requirement: Redis-backed cache service
+The system MUST provide a Redis-backed cache service for use by ingestion and content processing flows.
+
+The cache service MUST be runnable in local development via Docker Compose.
+
+#### Scenario: Cache service available in Docker
+- **WHEN** the Docker Compose stack is started
+- **THEN** a Redis service is available to other services/scripts on the internal network
+
+### Requirement: TTL-based invalidation
+Cached entries MUST support TTL-based invalidation.
+
+The system MUST define a default TTL and MUST allow overriding the TTL via environment/config.
+
+#### Scenario: Default TTL applies
+- **WHEN** a cached entry is written without an explicit TTL override
+- **THEN** it expires after the configured default TTL
+
+#### Scenario: TTL override applies
+- **WHEN** a TTL override is configured via environment/config
+- **THEN** new cached entries use that TTL for expiration
+
+### Requirement: Cache key namespace
+Cache keys MUST be namespaced by source and parameters so that different data requests do not collide.
+
+#### Scenario: Two different sources do not collide
+- **WHEN** the system caches a YouTube fetch and a WordPress fetch
+- **THEN** they use different key namespaces and do not overwrite each other
+
+### Requirement: Manual cache clear
+The system MUST provide a script/command to manually clear the cache.
+
+#### Scenario: Manual clear executed
+- **WHEN** a developer runs the cache clear command
+- **THEN** the cache is cleared and subsequent ingestion runs produce cache misses
+
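The two TTL scenarios above (default applies, override applies) can be illustrated with a small resolver. This is a sketch under assumptions: `CACHE_TTL_SECONDS` is a placeholder variable name (the real one is documented in `site/.env.example`), and falling back to the default on invalid input is one possible policy, not something the spec requires.

```typescript
// Default from the design document: 1 hour.
const DEFAULT_TTL_SECONDS = 3600;

// Hypothetical resolver. The env object is passed in (rather than reading
// process.env directly) so the function stays pure and easy to test.
function resolveTtlSeconds(env: Record<string, string | undefined>): number {
  // Assumed variable name, for illustration only.
  const raw = env["CACHE_TTL_SECONDS"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_TTL_SECONDS;
  }
  const parsed = Number(raw);
  // Assumed policy: non-integer, zero, or negative values fall back to the default.
  if (!Number.isInteger(parsed) || parsed <= 0) {
    return DEFAULT_TTL_SECONDS;
  }
  return parsed;
}
```

In a Node script this would typically be called once at startup as `resolveTtlSeconds(process.env)`.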
--- a/openspec/changes/archive/2026-02-10-better-cache/specs/social-content-aggregation/spec.md
+++ b/openspec/changes/archive/2026-02-10-better-cache/specs/social-content-aggregation/spec.md
@@ -0,0 +1,23 @@
+## MODIFIED Requirements
+
+### Requirement: Refresh and caching
+The system MUST cache the latest successful ingestion output and MUST serve the cached data to the site renderer.
+
+The system MUST support periodic refresh on a schedule (at minimum daily) and MUST support a manual refresh trigger.
+
+On ingestion failure, the system MUST continue serving the most recent cached data.
+
+The ingestion pipeline MUST use the cache layer (when configured and reachable) to reduce repeated network and parsing work for external sources (for example, YouTube API/RSS and podcast RSS).
+
+#### Scenario: Scheduled refresh fails
+- **WHEN** a scheduled refresh run fails to fetch one or more sources
+- **THEN** the site continues to use the most recent successfully cached dataset
+
+#### Scenario: Manual refresh requested
+- **WHEN** a manual refresh is triggered
+- **THEN** the system attempts ingestion immediately and updates the cache if ingestion succeeds
+
+#### Scenario: Cache hit avoids refetch
+- **WHEN** a refresh run is executed within the cache TTL for a given source+parameters
+- **THEN** the ingestion pipeline uses cached data for that source instead of refetching over the network
+
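The "Cache hit avoids refetch" scenario, combined with the "when configured and reachable" condition, amounts to a cache-aside read that degrades to a plain fetch. The sketch below is illustrative only: `CacheStore` and `getOrFetch` are hypothetical names, the log wording is invented, and a real implementation would back `CacheStore` with a Redis client.

```typescript
// Minimal store contract; a Redis client wrapper could implement this.
interface CacheStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Hypothetical cache-aside helper. Passing store = null means "cache not configured".
async function getOrFetch<T>(
  store: CacheStore | null,
  key: string,
  ttlSeconds: number,
  fetcher: () => Promise<T>,
  log: (message: string) => void = console.log,
): Promise<T> {
  if (store !== null) {
    try {
      const cached = await store.get(key);
      if (cached !== null) {
        const value = JSON.parse(cached) as T;
        log(`cache hit ${key}`);
        return value;
      }
      log(`cache miss ${key}`);
    } catch {
      // Read failure or unparsable entry: fall through to a direct fetch.
      log(`cache unavailable for ${key}, fetching directly`);
    }
  }
  const fresh = await fetcher();
  if (store !== null) {
    try {
      await store.set(key, JSON.stringify(fresh), ttlSeconds);
    } catch {
      // A failed cache write must not fail ingestion.
    }
  }
  return fresh;
}
```

A second call with the same key inside the TTL returns the stored JSON without invoking `fetcher`, which is the behavior the hit/miss logs are meant to confirm.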
--- a/openspec/changes/archive/2026-02-10-better-cache/specs/wordpress-content-source/spec.md
+++ b/openspec/changes/archive/2026-02-10-better-cache/specs/wordpress-content-source/spec.md
@@ -0,0 +1,19 @@
+## MODIFIED Requirements
+
+### Requirement: Build-time caching
+WordPress posts, pages, and categories MUST be written into the repo-local content cache used by the site build.
+
+If the WordPress fetch fails, the system MUST NOT crash the entire build pipeline; it MUST either:
+- keep the last-known-good cached WordPress content (if present), or
+- store an empty WordPress dataset and allow the rest of the site to build.
+
+When the cache layer is configured and reachable, the WordPress ingestion MUST cache `wp-json` responses (or normalized outputs) using a TTL so repeated ingestion runs avoid unnecessary network requests and parsing work.
+
+#### Scenario: WordPress fetch fails
+- **WHEN** a WordPress API request fails
+- **THEN** the site build can still complete and the blog surface renders a graceful empty state
+
+#### Scenario: Cache hit avoids wp-json refetch
+- **WHEN** WordPress ingestion is executed within the configured cache TTL
+- **THEN** it uses cached data instead of refetching from `wp-json`
+
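The failure requirement above (keep last-known-good content if present, otherwise store an empty dataset) could be expressed as follows. This is a sketch with assumed names: `WordPressDataset`, `emptyWordPressDataset`, and `loadWordPress` are hypothetical, and the dataset shape is simplified to three untyped arrays.

```typescript
// Simplified, assumed shape of the normalized WordPress content.
interface WordPressDataset {
  posts: unknown[];
  pages: unknown[];
  categories: unknown[];
}

// Returns a fresh object each time so callers cannot mutate shared state.
function emptyWordPressDataset(): WordPressDataset {
  return { posts: [], pages: [], categories: [] };
}

// Hypothetical loader: never throws, so a WordPress outage cannot crash the build.
async function loadWordPress(
  fetchAll: () => Promise<WordPressDataset>,
  lastKnownGood: WordPressDataset | null,
  log: (message: string) => void = console.warn,
): Promise<WordPressDataset> {
  try {
    return await fetchAll();
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    const fallback = lastKnownGood !== null ? "last-known-good" : "empty";
    log(`WordPress fetch failed (${reason}); using ${fallback} dataset`);
    return lastKnownGood ?? emptyWordPressDataset();
  }
}
```

Here `lastKnownGood` would come from the repo-local content cache when it already contains WordPress data.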
--- a/openspec/changes/archive/2026-02-10-better-cache/tasks.md
+++ b/openspec/changes/archive/2026-02-10-better-cache/tasks.md
@@ -0,0 +1,26 @@
+## 1. Cache Service And Config
+
+- [x] 1.1 Add Redis service to `docker-compose.yml` and wire basic health/ports for local dev
+- [x] 1.2 Add cache env/config variables (Redis URL/host+port, DB index, default TTL seconds) and document in `site/.env.example`
+
+## 2. Cache Client And Utilities
+
+- [x] 2.1 Add a small Redis cache client wrapper (get/set JSON with TTL, namespaced keys) for Node scripts
+- [x] 2.2 Add logging for cache hit/miss per key to support verification
+- [x] 2.3 Ensure caching is optional: if Redis is unreachable, ingestion proceeds without caching
+
+## 3. Integrate With Ingestion
+
+- [x] 3.1 Cache YouTube fetches (API and/or RSS) by source+params and reuse within TTL
+- [x] 3.2 Cache podcast RSS fetch by URL and reuse within TTL
+- [x] 3.3 Cache WordPress `wp-json` fetches (posts/pages/categories) and reuse within TTL
+
+## 4. Cache Invalidation
+
+- [x] 4.1 Add a command/script to manually clear the cache (scoped to configured Redis DB)
+- [x] 4.2 Document the cache clear command usage
+
+## 5. Verification
+
+- [x] 5.1 Add a test that exercises the cache wrapper (set/get JSON + TTL expiration behavior)
+- [x] 5.2 Add a test or build verification that a second ingestion run within TTL produces cache hits
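The behavior that task 5.1 verifies (set/get JSON plus TTL expiration) can be illustrated without a running Redis by using an in-memory stand-in with an injectable clock. This is not the commit's Redis wrapper: `InMemoryTtlCache` and its method names are hypothetical, and with real Redis the server performs expiry itself.

```typescript
// In-memory stand-in for the cache wrapper. The clock is injectable so that
// TTL expiry can be exercised without sleeping.
class InMemoryTtlCache {
  private entries = new Map<string, { json: string; expiresAtMs: number }>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Stores the value as JSON and records when it should expire.
  setJson(key: string, value: unknown, ttlSeconds: number): void {
    const expiresAtMs = this.now() + ttlSeconds * 1000;
    this.entries.set(key, { json: JSON.stringify(value), expiresAtMs });
  }

  // Returns the parsed value, or null if the key is absent or has expired.
  getJson<T>(key: string): T | null {
    const entry = this.entries.get(key);
    if (entry === undefined) return null;
    if (this.now() >= entry.expiresAtMs) {
      this.entries.delete(key);
      return null;
    }
    return JSON.parse(entry.json) as T;
  }

  // Manual clear; the in-memory analogue of flushing the configured database.
  clear(): void {
    this.entries.clear();
  }
}
```

A test can then advance the fake clock past the TTL and assert that a read which previously hit now returns `null`.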
--- a/openspec/changes/archive/2026-02-10-blogs-section/.openspec.yaml
+++ b/openspec/changes/archive/2026-02-10-blogs-section/.openspec.yaml
--- a/openspec/changes/archive/2026-02-10-blogs-section/design.md
+++ b/openspec/changes/archive/2026-02-10-blogs-section/design.md
--- a/openspec/changes/archive/2026-02-10-blogs-section/proposal.md
+++ b/openspec/changes/archive/2026-02-10-blogs-section/proposal.md
--- a/openspec/changes/archive/2026-02-10-blogs-section/specs/blog-section-surface/spec.md
+++ b/openspec/changes/archive/2026-02-10-blogs-section/specs/blog-section-surface/spec.md
--- a/openspec/changes/archive/2026-02-10-blogs-section/specs/seo-content-surface/spec.md
+++ b/openspec/changes/archive/2026-02-10-blogs-section/specs/seo-content-surface/spec.md
--- a/openspec/changes/archive/2026-02-10-blogs-section/specs/wordpress-content-source/spec.md
+++ b/openspec/changes/archive/2026-02-10-blogs-section/specs/wordpress-content-source/spec.md
--- a/openspec/changes/archive/2026-02-10-blogs-section/tasks.md
+++ b/openspec/changes/archive/2026-02-10-blogs-section/tasks.md
--- a/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/.openspec.yaml
+++ b/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/.openspec.yaml
@@ -0,0 +1,2 @@
+schema: spec-driven
+created: 2026-02-10
--- a/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/design.md
+++ b/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/design.md
--- a/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/proposal.md
+++ b/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/proposal.md
--- a/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/specs/homepage-content-modules/spec.md
+++ b/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/specs/homepage-content-modules/spec.md
--- a/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/tasks.md
+++ b/openspec/changes/archive/2026-02-10-hide-ig-if-no-data/tasks.md