## Context
The site is an Astro static build served via nginx. Content is gathered by build-time ingestion (`site/scripts/fetch-content.ts`), which reads and writes a repo-local cache file (`site/content/cache/content.json`).

Today, repeated ingestion runs can re-hit external sources (the YouTube API/RSS, podcast RSS, WordPress `wp-json`) and redo normalization work. We want a shared caching layer to reduce I/O and network load and to make repeated runs faster and more predictable.
## Goals / Non-Goals
**Goals:**

- Add a Redis-backed cache layer usable from Node scripts (ingestion) with TTL-based invalidation.
- Use the cache layer to reduce repeated network/API calls and parsing work for:
  - social content ingestion (the YouTube/podcast/Instagram list)
  - WordPress `wp-json` ingestion
- Provide a default “industry standard” TTL with an environment override.
- Add a manual cache-clear command/script.
- Provide verification (tests and/or logs) that cache hits occur and that TTL expiration behaves as expected.

**Non-Goals:**

- Adding a runtime server for the site (the site remains static HTML served by nginx).
- Caching browser requests to nginx (no CDN/edge-cache configuration in this change).
- Perfect cache coherence across multiple machines/environments (dev + docker is the target).
## Decisions
- **Decision: Use Redis as the shared cache backend (docker-compose service).**
  - Rationale: Redis is widely adopted, lightweight, supports TTLs natively, and is easy to run in dev via Docker.
  - Alternative considered: a local file-based cache only. Rejected because it doesn’t provide a shared service and is harder to invalidate consistently.
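A minimal sketch of what the compose service could look like; the service name, image tag, and port mapping are assumptions, not decided here:

```yaml
# Hypothetical docker-compose addition (names/versions are illustrative).
services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    # The cache is disposable by design, so disable persistence.
    command: ["redis-server", "--save", "", "--appendonly", "no"]
```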
- **Decision: Cache at the “source fetch” and “normalized dataset” boundaries.**
  - Rationale: The biggest cost is network plus parsing/normalization. Caching raw API responses (or normalized outputs) by source + params gives the best win.
  - Approach:
    - Use cache keys like `youtube:api:<channelId>:<limit>`, `podcast:rss:<url>`, `wp:posts`, `wp:pages`, `wp:categories`.
    - Store JSON values, set a TTL, and log hit/miss per key.
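The key scheme and hit/miss logging could live in one small helper, sketched here against a minimal client interface (a real client such as `ioredis` would fit behind a thin adapter). Function and interface names are illustrative, not existing code:

```typescript
// Minimal surface the cache helper needs from a Redis client.
interface CacheClient {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// Centralized key builder: one place to review for uniqueness,
// e.g. cacheKey(["youtube", "api", channelId, limit]).
function cacheKey(parts: Array<string | number>): string {
  return parts.map(String).join(":");
}

// Read-through helper: return cached JSON on a hit; otherwise run the
// fetcher, store the result with a TTL, and return it. Logs hit/miss per key.
async function getOrFetch<T>(
  client: CacheClient,
  key: string,
  ttlSeconds: number,
  fetcher: () => Promise<T>,
): Promise<T> {
  const cached = await client.get(key);
  if (cached !== null) {
    console.log(`cache hit  ${key}`);
    return JSON.parse(cached) as T;
  }
  console.log(`cache miss ${key}`);
  const fresh = await fetcher();
  await client.set(key, JSON.stringify(fresh), ttlSeconds);
  return fresh;
}
```

Because `getOrFetch` only depends on the small `CacheClient` interface, the hit/miss and key-uniqueness tests called for under Goals can run against an in-memory stub, no Redis required.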
- **Decision: Default TTL = 1 hour (3600 s), configurable via env.**
  - Rationale: A 1 h TTL is a common baseline for balancing content freshness against load, and it aligns with typical ingestion schedules (hourly/daily).
  - Allow overrides for local testing and production tuning.
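The override could be resolved with a small guard; the variable name `CACHE_TTL_SECONDS` is an assumption, not an existing setting:

```typescript
const DEFAULT_TTL_SECONDS = 3600; // 1 hour baseline

// Resolve the TTL from an env map; fall back to the default on
// missing, non-numeric, or non-positive values. The env var name
// CACHE_TTL_SECONDS is hypothetical.
function resolveTtlSeconds(env: Record<string, string | undefined>): number {
  const raw = env["CACHE_TTL_SECONDS"];
  const parsed = raw === undefined ? NaN : Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_TTL_SECONDS;
}
```

Ingestion would call `resolveTtlSeconds(process.env)` once at startup, so a bad override degrades to the default instead of disabling caching.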
- **Decision: The cache-clear script uses Redis `FLUSHDB` on the configured Redis database.**
  - Rationale: A simple manual operation that is easy to verify.
  - Guardrail: Use a dedicated Redis DB index (e.g., `0` by default) so the script is scoped to the cache.
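The clear command could be a one-line npm script wrapping `redis-cli`; the script name `cache:clear` and the `REDIS_DB` env var are assumptions:

```json
{
  "scripts": {
    "cache:clear": "redis-cli -n \"${REDIS_DB:-0}\" FLUSHDB"
  }
}
```

`redis-cli -n` selects the DB index, so the flush only touches the dedicated cache database, never another database on the same Redis instance.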
## Risks / Trade-offs
- [Risk] Redis introduces a new dependency and operational moving part. -> Mitigation: keep Redis optional; ingestion falls back to a no-cache mode if Redis is not reachable.
- [Risk] Stale content if the TTL is too long. -> Mitigation: default to 1 h and allow an env override; provide a manual clear command.
- [Risk] Cache-key mistakes lead to wrong content reuse. -> Mitigation: centralize key generation and add tests for key uniqueness and TTL behavior.