better cache
48
openspec/changes/archive/2026-02-10-better-cache/design.md
Normal file
@@ -0,0 +1,48 @@
## Context

The site is an Astro static build served via nginx. Content is gathered by build-time ingestion (`site/scripts/fetch-content.ts`) that reads/writes a repo-local cache file (`site/content/cache/content.json`).

Today, repeated ingestion runs can re-hit external sources (YouTube API/RSS, podcast RSS, WordPress `wp-json`) and re-do normalization work. We want a shared caching layer to reduce IO and network load and to make repeated runs faster and more predictable.

## Goals / Non-Goals

**Goals:**
- Add a Redis-backed cache layer usable from Node scripts (ingestion) with TTL-based invalidation.
- Use the cache layer to reduce repeated network/API calls and parsing work for:
  - social content ingestion (YouTube/podcast/Instagram list)
  - WordPress `wp-json` ingestion
- Provide a sensible default TTL (1 hour, see Decisions) with an environment override.
- Add a manual cache clear command/script.
- Provide verification (tests and/or logs) that cache hits occur and TTL expiration behaves as expected.

**Non-Goals:**
- Adding a runtime server for the site (the site remains static HTML served by nginx).
- Caching browser requests to nginx (no CDN/edge cache configuration in this change).
- Perfect cache coherence across multiple machines/environments (dev+docker is the target).

## Decisions

- **Decision: Use Redis as the shared cache backend (docker-compose service).**
  - Rationale: Redis is widely adopted, lightweight, supports TTLs natively, and is easy to run in dev via Docker.
  - Alternative considered: Local file-based cache only. Rejected because it doesn’t provide a shared service and is harder to invalidate consistently.

- **Decision: Cache at the “source fetch” and “normalized dataset” boundaries.**
  - Rationale: The biggest cost is network + parsing/normalization. Caching raw API responses (or normalized outputs) by source+params yields the largest savings on repeated runs.
  - Approach:
    - Cache keys like `youtube:api:<channelId>:<limit>`, `podcast:rss:<url>`, `wp:posts`, `wp:pages`, `wp:categories`.
    - Store JSON values, set TTL, and log hit/miss per key.

- **Decision: Default TTL = 1 hour (3600s), configurable via env.**
  - Rationale: A 1h TTL is a common baseline that balances content freshness against load. It also aligns with typical ingestion schedules (hourly/daily).
  - Allow overrides for local testing and production tuning.

- **Decision: Cache clear script uses Redis `FLUSHDB` in the configured Redis database.**
  - Rationale: Simple manual operation and easy to verify.
  - Guardrail: `FLUSHDB` removes every key in the selected database, so the cache uses a dedicated Redis DB index (`0` by default) that nothing else writes to. This keeps the script scoped to cache data.

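The "store JSON values, set TTL, and log hit/miss per key" approach can be sketched as a read-through helper. This is an illustration only, not the wrapper added by this change: `CacheStore`, `MemoryStore`, and `getOrFetch` are hypothetical names, and an in-memory map stands in for Redis so the snippet runs without a server.

```typescript
// Hypothetical minimal store interface. A Redis-backed implementation would
// map get/set onto a string read and a string write with an expiry.
interface CacheStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in for Redis (assumption: illustration and tests only).
class MemoryStore implements CacheStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();
  private now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now; // injectable clock so TTL expiry can be tested
  }

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: behave like a missing key
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}

// Read-through helper: return the cached JSON value on a hit; otherwise call
// the fetcher, store its result with a TTL, and log hit/miss for the key.
async function getOrFetch<T>(
  store: CacheStore,
  key: string,
  ttlSeconds: number,
  fetcher: () => Promise<T>,
  log: (msg: string) => void = console.log,
): Promise<T> {
  const cached = await store.get(key);
  if (cached !== null) {
    log(`cache hit ${key}`);
    return JSON.parse(cached) as T;
  }
  log(`cache miss ${key}`);
  const fresh = await fetcher();
  await store.set(key, JSON.stringify(fresh), ttlSeconds);
  return fresh;
}
```

Keeping the store behind a small interface means the ingestion code does not need to know whether Redis, a test double, or no cache at all sits behind it.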
## Risks / Trade-offs

- [Risk] Redis introduces a new dependency and operational moving part. -> Mitigation: Keep Redis optional; ingestion should fall back to no-cache mode if Redis is not reachable.
- [Risk] Stale content if TTL too long. -> Mitigation: Default to 1h and allow env override; provide manual clear command.
- [Risk] Cache key mistakes lead to wrong content reuse. -> Mitigation: Centralize key generation and add tests for key uniqueness and TTL behavior.

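One way to keep Redis optional is to wrap the cache client so that connection errors degrade to a miss instead of failing ingestion. The sketch below assumes a hypothetical `CacheStore` interface and a hypothetical `failOpen` helper; neither name comes from this change.

```typescript
// Hypothetical minimal store interface (same shape a Redis wrapper could expose).
interface CacheStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Wrap a store so that any cache error degrades to "no cache":
// reads report a miss and writes are dropped, so ingestion still proceeds.
function failOpen(
  inner: CacheStore,
  warn: (msg: string) => void = console.warn,
): CacheStore {
  return {
    async get(key) {
      try {
        return await inner.get(key);
      } catch (err) {
        warn(`cache unavailable on get ${key}: ${String(err)}`);
        return null; // treated as a miss by the caller
      }
    },
    async set(key, value, ttlSeconds) {
      try {
        await inner.set(key, value, ttlSeconds);
      } catch (err) {
        warn(`cache unavailable on set ${key}: ${String(err)}`);
      }
    },
  };
}
```

With this shape the caller always follows the same code path; an unreachable Redis only shows up as warnings and extra fetches.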
28
openspec/changes/archive/2026-02-10-better-cache/proposal.md
Normal file
@@ -0,0 +1,28 @@
## Why

Reduce IO and external fetch load by adding a shared caching layer so repeated requests for the same content do not re-hit disk/network unnecessarily.

## What Changes

- Add a caching layer (Redis or similar lightweight cache) used by the site’s data/ingestion flows.
- Add a cache service to `docker-compose.yml`.
- Define an industry-standard cache invalidation interval (TTL) with a sensible default and allow it to be configured via environment variables.
- Add a script/command to manually clear the cache on demand.
- Add verification that the cache is working (cache hits/misses and TTL behavior).

## Capabilities

### New Capabilities
- `cache-layer`: Provide a shared caching service (Redis or equivalent) with TTL-based invalidation and a manual clear operation for the website’s data flows.

### Modified Capabilities
- `social-content-aggregation`: Use the cache layer to avoid re-fetching or re-processing external content sources on repeated runs/requests.
- `wordpress-content-source`: Use the cache layer to reduce repeated `wp-json` fetches and parsing work.

## Impact

- Deployment/local dev: add Redis (or equivalent) to `docker-compose.yml` and wire environment/config for connection + TTL.
- Scripts/services: update ingestion/build-time fetch to read/write via cache and log hit/miss for verification.
- Tooling: add a cache-clear script/command (and document usage).
- Testing: add tests or a lightweight verification step proving cached reads are used and expire as expected.

@@ -0,0 +1,38 @@
## ADDED Requirements

### Requirement: Redis-backed cache service
The system MUST provide a Redis-backed cache service for use by ingestion and content processing flows.

The cache service MUST be runnable in local development via Docker Compose.

#### Scenario: Cache service available in Docker
- **WHEN** the Docker Compose stack is started
- **THEN** a Redis service is available to other services/scripts on the internal network

### Requirement: TTL-based invalidation
Cached entries MUST support TTL-based invalidation.

The system MUST define a default TTL and MUST allow overriding the TTL via environment/config.

#### Scenario: Default TTL applies
- **WHEN** a cached entry is written without an explicit TTL override
- **THEN** it expires after the configured default TTL

#### Scenario: TTL override applies
- **WHEN** a TTL override is configured via environment/config
- **THEN** new cached entries use that TTL for expiration

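The two TTL scenarios can be illustrated with a small resolver. This is a sketch under assumptions: the 3600 second default is taken from the design decision, while the variable name `CACHE_TTL_SECONDS` and the function `resolveTtlSeconds` are hypothetical.

```typescript
// Default from the design decision: 1 hour.
const DEFAULT_TTL_SECONDS = 3600;

// Resolve the TTL from an environment-like map.
// CACHE_TTL_SECONDS is a hypothetical variable name used for illustration.
function resolveTtlSeconds(env: Record<string, string | undefined>): number {
  const raw = env["CACHE_TTL_SECONDS"];
  if (raw === undefined || raw.trim() === "") return DEFAULT_TTL_SECONDS;
  const parsed = Number(raw);
  // Ignore overrides that are not positive integers instead of
  // silently caching forever or not at all.
  if (!Number.isInteger(parsed) || parsed <= 0) return DEFAULT_TTL_SECONDS;
  return parsed;
}
```

In a Node script this would typically be called once at startup as `resolveTtlSeconds(process.env)` and the result passed to every cache write.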
### Requirement: Cache key namespace
Cache keys MUST be namespaced by source and parameters so that different data requests do not collide.

#### Scenario: Two different sources do not collide
- **WHEN** the system caches a YouTube fetch and a WordPress fetch
- **THEN** they use different key namespaces and do not overwrite each other

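A centralized key builder is one way to satisfy this requirement. The sketch below is hypothetical (`cacheKey` is not a name from this change) and makes one choice the spec does not mandate: parameters are percent-encoded so that a `:` inside a value, such as a feed URL, cannot be confused with the separator.

```typescript
// Build a namespaced key such as "youtube:api:<channelId>:<limit>".
// Assumption of this sketch: source and kind are fixed identifiers chosen by
// the caller, while params are arbitrary values and therefore encoded.
function cacheKey(
  source: string,
  kind: string,
  ...params: Array<string | number>
): string {
  const encoded = params.map((p) => encodeURIComponent(String(p)));
  return [source, kind, ...encoded].join(":");
}

// Example keys for the sources named in the design.
const youtubeKey = cacheKey("youtube", "api", "UC123", 10);
const podcastKey = cacheKey("podcast", "rss", "https://example.com/feed.xml");
const wpPostsKey = cacheKey("wp", "posts");
```

Because every key starts with the source name, a YouTube entry and a WordPress entry can never share a key, and the encoding step keeps two different parameter lists from collapsing into the same string.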
### Requirement: Manual cache clear
The system MUST provide a script/command to manually clear the cache.

#### Scenario: Manual clear executed
- **WHEN** a developer runs the cache clear command
- **THEN** the cache is cleared and subsequent ingestion runs produce cache misses

@@ -0,0 +1,23 @@
## MODIFIED Requirements

### Requirement: Refresh and caching
The system MUST cache the latest successful ingestion output and MUST serve the cached data to the site renderer.

The system MUST support periodic refresh on a schedule (at minimum daily) and MUST support a manual refresh trigger.

On ingestion failure, the system MUST continue serving the most recent cached data.

The ingestion pipeline MUST use the cache layer (when configured and reachable) to reduce repeated network and parsing work for external sources (for example, YouTube API/RSS and podcast RSS).

#### Scenario: Scheduled refresh fails
- **WHEN** a scheduled refresh run fails to fetch one or more sources
- **THEN** the site continues to use the most recent successfully cached dataset

#### Scenario: Manual refresh requested
- **WHEN** a manual refresh is triggered
- **THEN** the system attempts ingestion immediately and updates the cache if ingestion succeeds

#### Scenario: Cache hit avoids refetch
- **WHEN** a refresh run is executed within the cache TTL for a given source+parameters
- **THEN** the ingestion pipeline uses cached data for that source instead of refetching over the network

@@ -0,0 +1,19 @@
## MODIFIED Requirements

### Requirement: Build-time caching
WordPress posts, pages, and categories MUST be written into the repo-local content cache used by the site build.

If the WordPress fetch fails, the system MUST NOT crash the entire build pipeline; it MUST either:
- keep the last-known-good cached WordPress content (if present), or
- store an empty WordPress dataset and allow the rest of the site to build.

When the cache layer is configured and reachable, the WordPress ingestion MUST cache `wp-json` responses (or normalized outputs) using a TTL so repeated ingestion runs avoid unnecessary network requests and parsing work.

#### Scenario: WordPress fetch fails
- **WHEN** a WordPress API request fails
- **THEN** the site build can still complete and the blog surface renders a graceful empty state

#### Scenario: Cache hit avoids wp-json refetch
- **WHEN** WordPress ingestion is executed within the configured cache TTL
- **THEN** it uses cached data instead of refetching from `wp-json`

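The failure rule in this requirement (keep last-known-good content, otherwise store an empty dataset) can be sketched as follows. The type `WordPressDataset` and the function `loadWordPress` are hypothetical names chosen for the example; how the previous dataset is read from the repo-local cache is left out.

```typescript
// Hypothetical dataset shape; item types are left opaque for the sketch.
interface WordPressDataset {
  posts: unknown[];
  pages: unknown[];
  categories: unknown[];
}

type DatasetSource = "fresh" | "last-known-good" | "empty";

// Run the fetch; on failure prefer the last-known-good dataset, otherwise
// fall back to an empty dataset so the rest of the site can still build.
async function loadWordPress(
  fetchAll: () => Promise<WordPressDataset>,
  lastKnownGood: WordPressDataset | null,
): Promise<{ dataset: WordPressDataset; source: DatasetSource }> {
  try {
    return { dataset: await fetchAll(), source: "fresh" };
  } catch {
    if (lastKnownGood !== null) {
      return { dataset: lastKnownGood, source: "last-known-good" };
    }
    return { dataset: { posts: [], pages: [], categories: [] }, source: "empty" };
  }
}
```

Returning the `source` alongside the dataset lets the build log which branch was taken without throwing.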
26
openspec/changes/archive/2026-02-10-better-cache/tasks.md
Normal file
@@ -0,0 +1,26 @@
## 1. Cache Service And Config

- [x] 1.1 Add Redis service to `docker-compose.yml` and wire basic health/ports for local dev
- [x] 1.2 Add cache env/config variables (Redis URL/host+port, DB index, default TTL seconds) and document in `site/.env.example`

## 2. Cache Client And Utilities

- [x] 2.1 Add a small Redis cache client wrapper (get/set JSON with TTL, namespaced keys) for Node scripts
- [x] 2.2 Add logging for cache hit/miss per key to support verification
- [x] 2.3 Ensure caching is optional: if Redis is unreachable, ingestion proceeds without caching

## 3. Integrate With Ingestion

- [x] 3.1 Cache YouTube fetches (API and/or RSS) by source+params and reuse within TTL
- [x] 3.2 Cache podcast RSS fetch by URL and reuse within TTL
- [x] 3.3 Cache WordPress `wp-json` fetches (posts/pages/categories) and reuse within TTL

## 4. Cache Invalidation

- [x] 4.1 Add a command/script to manually clear the cache (scoped to configured Redis DB)
- [x] 4.2 Document the cache clear command usage

## 5. Verification

- [x] 5.1 Add a test that exercises the cache wrapper (set/get JSON + TTL expiration behavior)
- [x] 5.2 Add a test or build verification that a second ingestion run within TTL produces cache hits

@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-02-10
66
openspec/specs/blog-section-surface/spec.md
Normal file
@@ -0,0 +1,66 @@
## Purpose

Expose a blog section on the site backed by cached WordPress content, including listing, detail pages, and category browsing.

## ADDED Requirements

### Requirement: Primary navigation entry
The site MUST add a header navigation link to the blog index at `/blog` labeled "Blog".

#### Scenario: Blog link in header
- **WHEN** a user views any page
- **THEN** the header navigation includes a "Blog" link that navigates to `/blog`

### Requirement: Blog index listing (posts)
The site MUST provide a blog index page at `/blog` that lists WordPress posts as cards containing:
- featured image (when available)
- title
- excerpt/summary

The listing MUST be ordered by publish date descending (newest first).

#### Scenario: Blog index lists posts
- **WHEN** the cached WordPress dataset contains posts
- **THEN** `/blog` renders a list of post cards ordered by publish date descending

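The ordering rule can be expressed as a small pure function. This is a sketch that assumes each cached post carries an ISO 8601 timestamp in a field named `publishedAt`; that field name and `sortNewestFirst` are hypothetical.

```typescript
// Return a new array ordered by publish date descending (newest first).
// Assumes publishedAt is an ISO 8601 string that Date.parse understands.
function sortNewestFirst<T extends { publishedAt: string }>(
  posts: readonly T[],
): T[] {
  // Copy first so the cached dataset itself is not mutated.
  return [...posts].sort(
    (a, b) => Date.parse(b.publishedAt) - Date.parse(a.publishedAt),
  );
}

const cards = sortNewestFirst([
  { title: "Older post", publishedAt: "2026-01-01T00:00:00Z" },
  { title: "Newer post", publishedAt: "2026-02-10T00:00:00Z" },
]);
```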
### Requirement: Blog post detail
The site MUST provide a blog post detail page for each WordPress post that renders:
- title
- publish date
- featured image (when available)
- full post content

#### Scenario: Post detail renders
- **WHEN** a user navigates to a blog post detail page
- **THEN** the page renders the full post content from the cached WordPress dataset

### Requirement: WordPress pages support
The blog section MUST support WordPress pages by rendering page detail routes that show:
- title
- featured image (when available)
- full page content

#### Scenario: Page detail renders
- **WHEN** a user navigates to a WordPress page detail route
- **THEN** the page renders the full page content from the cached WordPress dataset

### Requirement: Category-based secondary navigation
The blog section MUST render a secondary navigation under the header derived from the cached WordPress categories.

Selecting a category MUST navigate to a category listing page showing only posts in that category.

#### Scenario: Category nav present
- **WHEN** the cached WordPress dataset contains categories
- **THEN** the blog section shows a secondary navigation with those categories

#### Scenario: Category listing filters posts
- **WHEN** a user navigates to a category listing page
- **THEN** only posts assigned to that category are listed

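The category listing rule amounts to a filter over the cached posts. The sketch below assumes each post stores the IDs of its categories in a `categoryIds` array; that field name and `postsInCategory` are hypothetical.

```typescript
// Keep only the posts assigned to the given category.
// Assumes each post lists its category IDs in categoryIds.
function postsInCategory<T extends { categoryIds: number[] }>(
  posts: readonly T[],
  categoryId: number,
): T[] {
  return posts.filter((post) => post.categoryIds.includes(categoryId));
}

const listed = postsInCategory(
  [
    { slug: "redis-notes", categoryIds: [3, 5] },
    { slug: "travel-diary", categoryIds: [7] },
  ],
  3,
);
```

A category with no matching posts yields an empty array, which the listing page can render as an empty state.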
### Requirement: Graceful empty states
If there are no WordPress posts available, the blog index MUST render a non-broken empty state and MUST still render header/navigation.

#### Scenario: No posts available
- **WHEN** the cached WordPress dataset contains no posts
- **THEN** `/blog` renders a helpful empty state

42
openspec/specs/cache-layer/spec.md
Normal file
@@ -0,0 +1,42 @@
## Purpose

Provide a shared caching layer (Redis-backed) for ingestion and content processing flows, with TTL-based invalidation and manual cache clearing.

## ADDED Requirements

### Requirement: Redis-backed cache service
The system MUST provide a Redis-backed cache service for use by ingestion and content processing flows.

The cache service MUST be runnable in local development via Docker Compose.

#### Scenario: Cache service available in Docker
- **WHEN** the Docker Compose stack is started
- **THEN** a Redis service is available to other services/scripts on the internal network

### Requirement: TTL-based invalidation
Cached entries MUST support TTL-based invalidation.

The system MUST define a default TTL and MUST allow overriding the TTL via environment/config.

#### Scenario: Default TTL applies
- **WHEN** a cached entry is written without an explicit TTL override
- **THEN** it expires after the configured default TTL

#### Scenario: TTL override applies
- **WHEN** a TTL override is configured via environment/config
- **THEN** new cached entries use that TTL for expiration

### Requirement: Cache key namespace
Cache keys MUST be namespaced by source and parameters so that different data requests do not collide.

#### Scenario: Two different sources do not collide
- **WHEN** the system caches a YouTube fetch and a WordPress fetch
- **THEN** they use different key namespaces and do not overwrite each other

### Requirement: Manual cache clear
The system MUST provide a script/command to manually clear the cache.

#### Scenario: Manual clear executed
- **WHEN** a developer runs the cache clear command
- **THEN** the cache is cleared and subsequent ingestion runs produce cache misses

@@ -38,7 +38,8 @@ When `metrics.views` is not available, the system MUST render the high-performin
 ### Requirement: Graceful empty and error states
 If a module has no content to display, the homepage MUST render a non-broken empty state for that module and MUST still render the rest of the page.

+The Instagram module is an exception: if there are no Instagram items to display, the homepage MUST omit the Instagram module entirely (no empty state block) and MUST still render the rest of the page.
+
 #### Scenario: No Instagram items available
 - **WHEN** the cached dataset contains no Instagram items
-- **THEN** the Instagram-related module renders an empty state and the homepage still renders other modules
-
+- **THEN** the Instagram-related module is not rendered and the homepage still renders other modules

@@ -45,9 +45,19 @@ The site MUST provide:
 - `sitemap.xml` enumerating indexable pages
 - `robots.txt` that allows indexing of indexable pages

+The sitemap MUST include the blog surface routes:
+- `/blog`
+- blog post detail routes
+- blog page detail routes
+- blog category listing routes
+
 #### Scenario: Sitemap is available
 - **WHEN** a crawler requests `/sitemap.xml`
-- **THEN** the server returns an XML sitemap listing `/`, `/videos`, `/podcast`, and `/about`
+- **THEN** the server returns an XML sitemap listing `/`, `/videos`, `/podcast`, `/about`, and `/blog`

+#### Scenario: Blog URLs appear in sitemap
+- **WHEN** WordPress content is available in the cache at build time
+- **THEN** the generated sitemap includes the blog detail URLs for those items
+
 ### Requirement: Structured data
 The site MUST support structured data (JSON-LD) for Video and Podcast content when detail pages exist, and MUST ensure the JSON-LD is valid JSON.

@@ -57,6 +57,8 @@ The system MUST support periodic refresh on a schedule (at minimum daily) and MU

 On ingestion failure, the system MUST continue serving the most recent cached data.

+The ingestion pipeline MUST use the cache layer (when configured and reachable) to reduce repeated network and parsing work for external sources (for example, YouTube API/RSS and podcast RSS).
+
 #### Scenario: Scheduled refresh fails
 - **WHEN** a scheduled refresh run fails to fetch one or more sources
 - **THEN** the site continues to use the most recent successfully cached dataset
@@ -65,3 +67,6 @@ On ingestion failure, the system MUST continue serving the most recent cached da
 - **WHEN** a manual refresh is triggered
 - **THEN** the system attempts ingestion immediately and updates the cache if ingestion succeeds

+#### Scenario: Cache hit avoids refetch
+- **WHEN** a refresh run is executed within the cache TTL for a given source+parameters
+- **THEN** the ingestion pipeline uses cached data for that source instead of refetching over the network

69
openspec/specs/wordpress-content-source/spec.md
Normal file
@@ -0,0 +1,69 @@
## Purpose

Provide a build-time content source backed by a WordPress site via the `wp-json` REST APIs.

## ADDED Requirements

### Requirement: WordPress API configuration
The system MUST allow configuring a WordPress content source using environment/config values:
- WordPress base URL
- credentials (username + password or application password) when required by the WordPress instance

The WordPress base URL MUST be used to construct requests to the WordPress `wp-json` REST APIs.

#### Scenario: Config provided
- **WHEN** WordPress configuration values are provided
- **THEN** the system can attempt to fetch WordPress content via `wp-json`

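Constructing request URLs from the configured base URL might look like the sketch below. It assumes the standard WordPress REST route prefix `/wp-json/wp/v2/` and uses the core `per_page` and `_embed` query parameters; the function name `wpJsonUrl` is hypothetical, and pagination beyond the first page is not handled.

```typescript
type WpResource = "posts" | "pages" | "categories";

// Build a wp-json collection URL from the configured WordPress base URL.
function wpJsonUrl(baseUrl: string, resource: WpResource, perPage = 100): string {
  // Ensure a trailing slash so a site installed under a sub-path keeps it.
  const base = baseUrl.endsWith("/") ? baseUrl : `${baseUrl}/`;
  const url = new URL(`wp-json/wp/v2/${resource}`, base);
  url.searchParams.set("per_page", String(perPage));
  if (resource !== "categories") {
    // Ask WordPress to embed linked resources such as featured media.
    url.searchParams.set("_embed", "1");
  }
  return url.toString();
}

const postsUrl = wpJsonUrl("https://example.com", "posts");
const categoriesUrl = wpJsonUrl("https://example.com/blog/", "categories", 50);
```

When credentials are required, they would be sent as request headers rather than placed in the URL.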
### Requirement: Fetch posts
The system MUST fetch the latest WordPress posts via `wp-json` and map them into an internal representation with:
- stable ID
- slug
- title
- excerpt/summary
- content HTML
- featured image URL when available
- publish date/time and last modified date/time
- category assignments (IDs and slugs when available)

#### Scenario: Posts fetched successfully
- **WHEN** the WordPress posts endpoint returns a non-empty list
- **THEN** the system stores the mapped post items in the content cache for rendering

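The mapping step could be sketched as below. `WpPost` lists only the `wp-json` fields used here (as returned by the posts endpoint when `_embed` is requested); the internal `BlogPost` shape and the function `mapPost` are hypothetical, and category slugs are omitted because resolving them needs the separately fetched categories.

```typescript
// Subset of the wp-json post object used by this sketch.
interface WpPost {
  id: number;
  slug: string;
  date_gmt: string;
  modified_gmt: string;
  title: { rendered: string };
  excerpt: { rendered: string };
  content: { rendered: string };
  categories: number[];
  _embedded?: { "wp:featuredmedia"?: Array<{ source_url?: string }> };
}

// Hypothetical internal representation.
interface BlogPost {
  id: string;
  slug: string;
  title: string;
  excerpt: string;
  contentHtml: string;
  featuredImageUrl: string | null;
  publishedAt: string;
  modifiedAt: string;
  categoryIds: number[];
}

function mapPost(post: WpPost): BlogPost {
  const media = post._embedded?.["wp:featuredmedia"]?.[0];
  return {
    id: String(post.id),
    slug: post.slug,
    title: post.title.rendered,
    excerpt: post.excerpt.rendered,
    contentHtml: post.content.rendered,
    featuredImageUrl: media?.source_url ?? null,
    // date_gmt and modified_gmt are UTC without a zone suffix;
    // appending "Z" is a choice made by this sketch.
    publishedAt: `${post.date_gmt}Z`,
    modifiedAt: `${post.modified_gmt}Z`,
    categoryIds: [...post.categories],
  };
}
```

Note that `title.rendered` and `excerpt.rendered` contain HTML, so templates need to treat them as markup rather than plain text.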
### Requirement: Fetch pages
The system MUST fetch WordPress pages via `wp-json` and map them into an internal representation with:
- stable ID
- slug
- title
- excerpt/summary when available
- content HTML
- featured image URL when available
- publish date/time and last modified date/time

#### Scenario: Pages fetched successfully
- **WHEN** the WordPress pages endpoint returns a non-empty list
- **THEN** the system stores the mapped page items in the content cache for rendering

### Requirement: Fetch categories
The system MUST fetch WordPress categories via `wp-json` and store them for rendering a category-based secondary navigation under the blog section.

#### Scenario: Categories fetched successfully
- **WHEN** the WordPress categories endpoint returns a list of categories
- **THEN** the system stores categories (ID, slug, name) in the content cache for blog navigation

### Requirement: Build-time caching
WordPress posts, pages, and categories MUST be written into the repo-local content cache used by the site build.

If the WordPress fetch fails, the system MUST NOT crash the entire build pipeline; it MUST either:
- keep the last-known-good cached WordPress content (if present), or
- store an empty WordPress dataset and allow the rest of the site to build.

When the cache layer is configured and reachable, the WordPress ingestion MUST cache `wp-json` responses (or normalized outputs) using a TTL so repeated ingestion runs avoid unnecessary network requests and parsing work.

#### Scenario: WordPress fetch fails
- **WHEN** a WordPress API request fails
- **THEN** the site build can still complete and the blog surface renders a graceful empty state

#### Scenario: Cache hit avoids wp-json refetch
- **WHEN** WordPress ingestion is executed within the configured cache TTL
- **THEN** it uses cached data instead of refetching from `wp-json`