better cache
openspec/changes/archive/2026-02-10-better-cache/design.md
@@ -0,0 +1,48 @@
## Context

The site is an Astro static build served via nginx. Content is gathered by build-time ingestion (`site/scripts/fetch-content.ts`) that reads/writes a repo-local cache file (`site/content/cache/content.json`).

Today, repeated ingestion runs can re-hit external sources (YouTube API/RSS, podcast RSS, WordPress `wp-json`) and re-do normalization work. We want a shared caching layer to reduce IO and network load and to make repeated runs faster and more predictable.

## Goals / Non-Goals

**Goals:**

- Add a Redis-backed cache layer usable from Node scripts (ingestion) with TTL-based invalidation.
- Use the cache layer to reduce repeated network/API calls and parsing work for:
  - social content ingestion (YouTube/podcast/Instagram list)
  - WordPress `wp-json` ingestion
- Provide a conventional default TTL (1 hour, see Decisions) with an environment override.
- Add a manual cache clear command/script.
- Provide verification (tests and/or logs) that cache hits occur and TTL expiration behaves as expected.

**Non-Goals:**

- Adding a runtime server for the site (the site remains static HTML served by nginx).
- Caching browser requests to nginx (no CDN/edge cache configuration in this change).
- Perfect cache coherence across multiple machines/environments (dev+docker is the target).

## Decisions

- **Decision: Use Redis as the shared cache backend (docker-compose service).**
  - Rationale: Redis is widely adopted, lightweight, supports TTLs natively, and is easy to run in dev via Docker.
  - Alternative considered: Local file-based cache only. Rejected because it doesn’t provide a shared service and is harder to invalidate consistently.

- **Decision: Cache at the “source fetch” and “normalized dataset” boundaries.**
  - Rationale: The biggest cost is network + parsing/normalization. Caching raw API responses (or normalized outputs) by source+params avoids the most repeated work.
  - Approach:
    - Cache keys like `youtube:api:<channelId>:<limit>`, `podcast:rss:<url>`, `wp:posts`, `wp:pages`, `wp:categories`.
    - Store JSON values, set TTL, and log hit/miss per key.

- **Decision: Default TTL = 1 hour (3600s), configurable via env.**
  - Rationale: A 1-hour TTL is a common baseline that balances content freshness against load. It also aligns with typical ingestion schedules (hourly/daily).
  - Allow overrides for local testing and production tuning.

- **Decision: Cache clear script uses Redis `FLUSHDB` in the configured Redis database.**
  - Rationale: Simple manual operation and easy to verify.
  - Guardrail: `FLUSHDB` removes every key in the selected database, so use a dedicated Redis DB index (configurable, `0` by default) to keep the script scoped to cache data.
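
The decisions above could translate into a small wrapper along these lines. This is a minimal sketch, not the project's actual client: `KeyValueStore`, `InMemoryStore`, and `JsonCache` are hypothetical names, and the in-memory store stands in for Redis so the example runs without a server. A real implementation would presumably back `KeyValueStore` with a Redis client and let Redis enforce the expiry.

```typescript
// Hypothetical minimal store interface; a Redis-backed implementation would
// map these onto GET and SET with an expiry.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// In-memory stand-in for Redis so the sketch runs without a server.
// The clock is injectable so expiry can be exercised deterministically.
class InMemoryStore implements KeyValueStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  constructor(private now: () => number = () => Date.now()) {}

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // lazily drop expired entries
      return null;
    }
    return entry.value;
  }

  async set(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}

// Stores JSON values with a TTL and logs hit/miss per key.
class JsonCache {
  constructor(
    private store: KeyValueStore,
    private defaultTtlSeconds = 3600, // 1 hour, per the TTL decision
    private log: (message: string) => void = console.log,
  ) {}

  async getJson<T>(key: string): Promise<T | null> {
    const raw = await this.store.get(key);
    this.log(`[cache] ${raw === null ? "miss" : "hit"} ${key}`);
    return raw === null ? null : (JSON.parse(raw) as T);
  }

  async setJson<T>(key: string, value: T, ttlSeconds = this.defaultTtlSeconds): Promise<void> {
    await this.store.set(key, JSON.stringify(value), ttlSeconds);
  }
}
```

Keeping the store behind an interface is one way to make TTL behavior testable with a fake clock instead of real sleeps.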

## Risks / Trade-offs

- [Risk] Redis introduces a new dependency and operational moving part. -> Mitigation: Keep Redis optional; ingestion should fall back to no-cache mode if Redis is not reachable.
- [Risk] Stale content if TTL too long. -> Mitigation: Default to 1h and allow env override; provide manual clear command.
- [Risk] Cache key mistakes lead to wrong content reuse. -> Mitigation: Centralize key generation and add tests for key uniqueness and TTL behavior.
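
The first mitigation (keeping Redis optional) can be sketched as a cache-aside helper that treats any cache error as a miss. This is an illustration under assumptions: `OptionalCache` and `fetchWithOptionalCache` are hypothetical names, and passing `null` models the "cache not configured" case.

```typescript
// Hypothetical cache shape; either method may reject if Redis is unreachable.
interface OptionalCache {
  getJson<T>(key: string): Promise<T | null>;
  setJson<T>(key: string, value: T): Promise<void>;
}

// Cache-aside read that degrades to a plain fetch when the cache is
// missing or failing, so ingestion never depends on Redis being up.
async function fetchWithOptionalCache<T>(
  cache: OptionalCache | null,
  key: string,
  fetchFresh: () => Promise<T>,
): Promise<T> {
  if (cache) {
    try {
      const cached = await cache.getJson<T>(key);
      if (cached !== null) return cached;
    } catch (error) {
      console.warn(`[cache] read failed for ${key}; continuing without cache`, error);
    }
  }

  const fresh = await fetchFresh();

  if (cache) {
    try {
      await cache.setJson(key, fresh);
    } catch (error) {
      console.warn(`[cache] write failed for ${key}; result not cached`, error);
    }
  }
  return fresh;
}
```

Errors from `fetchFresh` itself are deliberately not caught here; only cache failures are downgraded to warnings.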
openspec/changes/archive/2026-02-10-better-cache/proposal.md
@@ -0,0 +1,28 @@
## Why

Reduce IO and external fetch load by adding a shared caching layer so repeated requests for the same content do not re-hit disk/network unnecessarily.

## What Changes

- Add a caching layer (Redis or similar lightweight cache) used by the site’s data/ingestion flows.
- Add a cache service to `docker-compose.yml`.
- Define an industry-standard cache invalidation interval (TTL) with a sensible default and allow it to be configured via environment variables.
- Add a script/command to manually clear the cache on demand.
- Add verification that the cache is working (cache hits/misses and TTL behavior).

## Capabilities

### New Capabilities

- `cache-layer`: Provide a shared caching service (Redis or equivalent) with TTL-based invalidation and a manual clear operation for the website’s data flows.

### Modified Capabilities

- `social-content-aggregation`: Use the cache layer to avoid re-fetching or re-processing external content sources on repeated runs/requests.
- `wordpress-content-source`: Use the cache layer to reduce repeated `wp-json` fetches and parsing work.

## Impact

- Deployment/local dev: add Redis (or equivalent) to `docker-compose.yml` and wire environment/config for connection + TTL.
- Scripts/services: update ingestion/build-time fetch to read/write via cache and log hit/miss for verification.
- Tooling: add a cache-clear script/command (and document usage).
- Testing: add tests or a lightweight verification step proving cached reads are used and expire as expected.

@@ -0,0 +1,38 @@
## ADDED Requirements

### Requirement: Redis-backed cache service

The system MUST provide a Redis-backed cache service for use by ingestion and content processing flows.

The cache service MUST be runnable in local development via Docker Compose.

#### Scenario: Cache service available in Docker

- **WHEN** the Docker Compose stack is started
- **THEN** a Redis service is available to other services/scripts on the internal network

### Requirement: TTL-based invalidation

Cached entries MUST support TTL-based invalidation.

The system MUST define a default TTL and MUST allow overriding the TTL via environment/config.

#### Scenario: Default TTL applies

- **WHEN** a cached entry is written without an explicit TTL override
- **THEN** it expires after the configured default TTL

#### Scenario: TTL override applies

- **WHEN** a TTL override is configured via environment/config
- **THEN** new cached entries use that TTL for expiration
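
One way the default-plus-override rule could be resolved is sketched below. The variable name `CACHE_TTL_SECONDS` and the function `resolveTtlSeconds` are assumptions made for illustration; the spec only requires that some environment/config override exists.

```typescript
const DEFAULT_TTL_SECONDS = 3600; // 1 hour, the default chosen in the design

// Reads the TTL override from an env-like object.
// CACHE_TTL_SECONDS is a hypothetical variable name used for illustration.
function resolveTtlSeconds(
  env: Record<string, string | undefined>,
  fallback = DEFAULT_TTL_SECONDS,
): number {
  const raw = env.CACHE_TTL_SECONDS;
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  // Reject non-numeric, fractional, zero, and negative values.
  return Number.isInteger(parsed) && parsed > 0 ? parsed : fallback;
}
```

In a Node script this would typically be called with `process.env`. Falling back to the default on invalid input, rather than throwing, is a design choice that keeps a typo in the environment from breaking ingestion.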

### Requirement: Cache key namespace

Cache keys MUST be namespaced by source and parameters so that different data requests do not collide.

#### Scenario: Two different sources do not collide

- **WHEN** the system caches a YouTube fetch and a WordPress fetch
- **THEN** they use different key namespaces and do not overwrite each other
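
A centralized key builder is one way to satisfy this requirement. The sketch below is illustrative: `buildCacheKey` is a hypothetical helper, and percent-encoding each segment is an assumption added here so that a colon inside a value (for example in a feed URL) cannot blur segment boundaries.

```typescript
type KeyPart = string | number;

// Builds a colon-delimited key: source, kind, then any parameters.
// Each part is percent-encoded so separators inside values stay unambiguous.
function buildCacheKey(source: string, kind: string, ...params: KeyPart[]): string {
  return [source, kind, ...params]
    .map((part) => encodeURIComponent(String(part)))
    .join(":");
}

// Example keys for two different sources (channel ID is a placeholder).
const youtubeKey = buildCacheKey("youtube", "api", "UC123", 10);
const wordpressKey = buildCacheKey("wp", "posts");
```

With this encoding, `youtubeKey` is `youtube:api:UC123:10` and `wordpressKey` is `wp:posts`, so the two sources land in separate namespaces.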

### Requirement: Manual cache clear

The system MUST provide a script/command to manually clear the cache.

#### Scenario: Manual clear executed

- **WHEN** a developer runs the cache clear command
- **THEN** the cache is cleared and subsequent ingestion runs produce cache misses
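
The core of such a clear command might look like the sketch below. It is written against a hypothetical `FlushableClient` interface rather than a specific library so that it runs without a Redis server; the assumption is that the real Redis client, connected to the configured DB index, exposes equivalent connect, flush, and quit operations.

```typescript
// Minimal client shape this sketch needs (hypothetical interface).
interface FlushableClient {
  connect(): Promise<unknown>;
  flushDb(): Promise<unknown>; // expected to issue Redis FLUSHDB
  quit(): Promise<unknown>;
}

// Clears only the database the client is connected to, and always
// closes the connection afterwards, even if the flush fails.
async function clearCache(client: FlushableClient): Promise<void> {
  await client.connect();
  try {
    await client.flushDb();
    console.log("[cache] cleared configured Redis database");
  } finally {
    await client.quit();
  }
}
```

Because `FLUSHDB` affects only the selected database, scoping depends entirely on the client being created with the cache's DB index.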
@@ -0,0 +1,23 @@
## MODIFIED Requirements

### Requirement: Refresh and caching

The system MUST cache the latest successful ingestion output and MUST serve the cached data to the site renderer.

The system MUST support periodic refresh on a schedule (at minimum daily) and MUST support a manual refresh trigger.

On ingestion failure, the system MUST continue serving the most recent cached data.

The ingestion pipeline MUST use the cache layer (when configured and reachable) to reduce repeated network and parsing work for external sources (for example, YouTube API/RSS and podcast RSS).

#### Scenario: Scheduled refresh fails

- **WHEN** a scheduled refresh run fails to fetch one or more sources
- **THEN** the site continues to use the most recent successfully cached dataset

#### Scenario: Manual refresh requested

- **WHEN** a manual refresh is triggered
- **THEN** the system attempts ingestion immediately and updates the cache if ingestion succeeds

#### Scenario: Cache hit avoids refetch

- **WHEN** a refresh run is executed within the cache TTL for a given source+parameters
- **THEN** the ingestion pipeline uses cached data for that source instead of refetching over the network

@@ -0,0 +1,19 @@
## MODIFIED Requirements

### Requirement: Build-time caching

WordPress posts, pages, and categories MUST be written into the repo-local content cache used by the site build.

If the WordPress fetch fails, the system MUST NOT crash the entire build pipeline; it MUST either:

- keep the last-known-good cached WordPress content (if present), or
- store an empty WordPress dataset and allow the rest of the site to build.
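
The two permitted fallbacks can be sketched as follows. This is an illustration only: `WordPressDataset` is a hypothetical shape (the real cache file may carry more fields), and `loadWordPress` is a made-up name for the step that decides what gets written to the content cache.

```typescript
// Hypothetical dataset shape used only for this sketch.
interface WordPressDataset {
  posts: unknown[];
  pages: unknown[];
  categories: unknown[];
}

const EMPTY_WORDPRESS: WordPressDataset = { posts: [], pages: [], categories: [] };

// Returns fresh data when the fetch succeeds; otherwise falls back to the
// last-known-good dataset, or to an empty dataset if none exists.
async function loadWordPress(
  fetchFresh: () => Promise<WordPressDataset>,
  lastKnownGood: WordPressDataset | null,
): Promise<WordPressDataset> {
  try {
    return await fetchFresh();
  } catch (error) {
    console.warn("[wordpress] fetch failed; build continues with fallback data", error);
    return lastKnownGood ?? EMPTY_WORDPRESS;
  }
}
```

Either branch returns a valid dataset, so the build step that consumes it never sees an exception from the WordPress source.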

When the cache layer is configured and reachable, the WordPress ingestion MUST cache `wp-json` responses (or normalized outputs) using a TTL so repeated ingestion runs avoid unnecessary network requests and parsing work.

#### Scenario: WordPress fetch fails

- **WHEN** a WordPress API request fails
- **THEN** the site build can still complete and the blog surface renders a graceful empty state

#### Scenario: Cache hit avoids wp-json refetch

- **WHEN** WordPress ingestion is executed within the configured cache TTL
- **THEN** it uses cached data instead of refetching from `wp-json`
openspec/changes/archive/2026-02-10-better-cache/tasks.md
@@ -0,0 +1,26 @@
## 1. Cache Service And Config

- [x] 1.1 Add Redis service to `docker-compose.yml` and wire basic health/ports for local dev
- [x] 1.2 Add cache env/config variables (Redis URL/host+port, DB index, default TTL seconds) and document in `site/.env.example`

## 2. Cache Client And Utilities

- [x] 2.1 Add a small Redis cache client wrapper (get/set JSON with TTL, namespaced keys) for Node scripts
- [x] 2.2 Add logging for cache hit/miss per key to support verification
- [x] 2.3 Ensure caching is optional: if Redis is unreachable, ingestion proceeds without caching

## 3. Integrate With Ingestion

- [x] 3.1 Cache YouTube fetches (API and/or RSS) by source+params and reuse within TTL
- [x] 3.2 Cache podcast RSS fetch by URL and reuse within TTL
- [x] 3.3 Cache WordPress `wp-json` fetches (posts/pages/categories) and reuse within TTL

## 4. Cache Invalidation

- [x] 4.1 Add a command/script to manually clear the cache (scoped to configured Redis DB)
- [x] 4.2 Document the cache clear command usage

## 5. Verification

- [x] 5.1 Add a test that exercises the cache wrapper (set/get JSON + TTL expiration behavior)
- [x] 5.2 Add a test or build verification that a second ingestion run within TTL produces cache hits

@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-02-10