better cache

2026-02-10 01:20:58 -05:00
parent c773affbc8
commit f056e67eae
39 changed files with 830 additions and 17 deletions


@@ -6,3 +6,13 @@ services:
ports:
- "8080:80"
redis:
image: redis:7-alpine
ports:
# Use 6380 to avoid colliding with any locally installed Redis on 6379.
- "6380:6379"
healthcheck:
test: ["CMD", "redis-cli", "ping"]
interval: 5s
timeout: 3s
retries: 20


@@ -0,0 +1,48 @@
## Context
The site is an Astro static build served via nginx. Content is gathered by build-time ingestion (`site/scripts/fetch-content.ts`) that reads/writes a repo-local cache file (`site/content/cache/content.json`).
Today, repeated ingestion runs can re-hit external sources (YouTube API/RSS, podcast RSS, WordPress `wp-json`) and re-do normalization work. We want a shared caching layer to reduce IO and network load and to make repeated runs faster and more predictable.
## Goals / Non-Goals
**Goals:**
- Add a Redis-backed cache layer usable from Node scripts (ingestion) with TTL-based invalidation.
- Use the cache layer to reduce repeated network/API calls and parsing work for:
- social content ingestion (YouTube/podcast/Instagram list)
- WordPress `wp-json` ingestion
- Provide a default “industry standard” TTL with environment override.
- Add a manual cache clear command/script.
- Provide verification (tests and/or logs) that cache hits occur and TTL expiration behaves as expected.
**Non-Goals:**
- Adding a runtime server for the site (the site remains static HTML served by nginx).
- Caching browser requests to nginx (no CDN/edge cache configuration in this change).
- Perfect cache coherence across multiple machines/environments (dev+docker is the target).
## Decisions
- **Decision: Use Redis as the shared cache backend (docker-compose service).**
- Rationale: Redis is widely adopted, lightweight, supports TTLs natively, and is easy to run in dev via Docker.
- Alternative considered: Local file-based cache only. Rejected because it doesn't provide a shared service and is harder to invalidate consistently.
- **Decision: Cache at the “source fetch” and “normalized dataset” boundaries.**
- Rationale: The biggest cost is network + parsing/normalization. Caching raw API responses (or normalized outputs) by source+params gives the best win.
- Approach:
- Cache keys like `youtube:api:<channelId>:<limit>`, `podcast:rss:<url>`, `wp:posts`, `wp:pages`, `wp:categories`.
- Store JSON values, set TTL, and log hit/miss per key.
- **Decision: Default TTL = 1 hour (3600s), configurable via env.**
- Rationale: A 1h TTL is a common baseline for content freshness vs load. It also aligns with typical ingestion schedules (hourly/daily).
- Allow overrides for local testing and production tuning.
- **Decision: Cache clear script uses Redis `FLUSHDB` in the configured Redis database.**
- Rationale: Simple manual operation and easy to verify.
- Guardrail: Use a dedicated Redis DB index (e.g., `0` by default) so the script is scoped.
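The key scheme in the second decision can be sketched as a small helper. This is an illustrative sketch only: `cacheKey` is a hypothetical name, and the `fast-website` namespace here mirrors the one used elsewhere in this change, not necessarily a final API.

```typescript
// Hypothetical sketch of namespaced cache-key construction for the
// sources listed above. Names are illustrative, not the shipped API.
type KeyParts = (string | number)[];

function cacheKey(namespace: string, source: string, ...parts: KeyParts): string {
  // Join with ":" so keys read as namespace:source:part1:part2,
  // e.g. fast-website:youtube:api:<channelId>:<limit>
  return [namespace, source, ...parts.map(String)].join(":");
}

const k1 = cacheKey("fast-website", "youtube", "api", "UC123", 25);
const k2 = cacheKey("fast-website", "wp", "posts");
```

Centralizing key construction like this keeps the "source + params" shape consistent, which is what makes the per-key hit/miss logging meaningful.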
## Risks / Trade-offs
- [Risk] Redis introduces a new dependency and operational moving part. -> Mitigation: Keep Redis optional; ingestion should fall back to no-cache mode if Redis is not reachable.
- [Risk] Stale content if TTL too long. -> Mitigation: Default to 1h and allow env override; provide manual clear command.
- [Risk] Cache key mistakes lead to wrong content reuse. -> Mitigation: Centralize key generation and add tests for key uniqueness and TTL behavior.
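The first mitigation (keep Redis optional) can be sketched as a connect-or-fallback wrapper. The names below (`cacheOrNoop`, `noopCache`) are hypothetical; the point is only the shape of the fallback.

```typescript
// Sketch of the "Redis optional" mitigation: try to build the real cache,
// and fall back to a no-op store when the connection fails, so ingestion
// proceeds without caching. Names are illustrative.
type CacheStore = {
  getJson<T>(key: string): Promise<T | undefined>;
  setJson(key: string, value: unknown, ttlSeconds?: number): Promise<void>;
};

function noopCache(): CacheStore {
  return {
    async getJson() {
      return undefined; // always a miss, so callers simply recompute
    },
    async setJson() {
      // drop writes silently
    },
  };
}

async function cacheOrNoop(connect: () => Promise<CacheStore>): Promise<CacheStore> {
  try {
    return await connect();
  } catch {
    return noopCache(); // Redis unreachable: degrade to no-cache mode
  }
}
```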


@@ -0,0 +1,28 @@
## Why
Reduce IO and external fetch load by adding a shared caching layer so repeated requests for the same content do not re-hit disk/network unnecessarily.
## What Changes
- Add a caching layer (Redis or similar lightweight cache) used by the site's data/ingestion flows.
- Add a cache service to `docker-compose.yml`.
- Define an industry-standard cache invalidation interval (TTL) with a sensible default and allow it to be configured via environment variables.
- Add a script/command to manually clear the cache on demand.
- Add verification that the cache is working (cache hits/misses and TTL behavior).
## Capabilities
### New Capabilities
- `cache-layer`: Provide a shared caching service (Redis or equivalent) with TTL-based invalidation and a manual clear operation for the website's data flows.
### Modified Capabilities
- `social-content-aggregation`: Use the cache layer to avoid re-fetching or re-processing external content sources on repeated runs/requests.
- `wordpress-content-source`: Use the cache layer to reduce repeated `wp-json` fetches and parsing work.
## Impact
- Deployment/local dev: add Redis (or equivalent) to `docker-compose.yml` and wire environment/config for connection + TTL.
- Scripts/services: update ingestion/build-time fetch to read/write via cache and log hit/miss for verification.
- Tooling: add a cache-clear script/command (and document usage).
- Testing: add tests or a lightweight verification step proving cached reads are used and expire as expected.


@@ -0,0 +1,38 @@
## ADDED Requirements
### Requirement: Redis-backed cache service
The system MUST provide a Redis-backed cache service for use by ingestion and content processing flows.
The cache service MUST be runnable in local development via Docker Compose.
#### Scenario: Cache service available in Docker
- **WHEN** the Docker Compose stack is started
- **THEN** a Redis service is available to other services/scripts on the internal network
### Requirement: TTL-based invalidation
Cached entries MUST support TTL-based invalidation.
The system MUST define a default TTL and MUST allow overriding the TTL via environment/config.
#### Scenario: Default TTL applies
- **WHEN** a cached entry is written without an explicit TTL override
- **THEN** it expires after the configured default TTL
#### Scenario: TTL override applies
- **WHEN** a TTL override is configured via environment/config
- **THEN** new cached entries use that TTL for expiration
### Requirement: Cache key namespace
Cache keys MUST be namespaced by source and parameters so that different data requests do not collide.
#### Scenario: Two different sources do not collide
- **WHEN** the system caches a YouTube fetch and a WordPress fetch
- **THEN** they use different key namespaces and do not overwrite each other
### Requirement: Manual cache clear
The system MUST provide a script/command to manually clear the cache.
#### Scenario: Manual clear executed
- **WHEN** a developer runs the cache clear command
- **THEN** the cache is cleared and subsequent ingestion runs produce cache misses


@@ -0,0 +1,23 @@
## MODIFIED Requirements
### Requirement: Refresh and caching
The system MUST cache the latest successful ingestion output and MUST serve the cached data to the site renderer.
The system MUST support periodic refresh on a schedule (at minimum daily) and MUST support a manual refresh trigger.
On ingestion failure, the system MUST continue serving the most recent cached data.
The ingestion pipeline MUST use the cache layer (when configured and reachable) to reduce repeated network and parsing work for external sources (for example, YouTube API/RSS and podcast RSS).
#### Scenario: Scheduled refresh fails
- **WHEN** a scheduled refresh run fails to fetch one or more sources
- **THEN** the site continues to use the most recent successfully cached dataset
#### Scenario: Manual refresh requested
- **WHEN** a manual refresh is triggered
- **THEN** the system attempts ingestion immediately and updates the cache if ingestion succeeds
#### Scenario: Cache hit avoids refetch
- **WHEN** a refresh run is executed within the cache TTL for a given source+parameters
- **THEN** the ingestion pipeline uses cached data for that source instead of refetching over the network


@@ -0,0 +1,19 @@
## MODIFIED Requirements
### Requirement: Build-time caching
WordPress posts, pages, and categories MUST be written into the repo-local content cache used by the site build.
If the WordPress fetch fails, the system MUST NOT crash the entire build pipeline; it MUST either:
- keep the last-known-good cached WordPress content (if present), or
- store an empty WordPress dataset and allow the rest of the site to build.
When the cache layer is configured and reachable, the WordPress ingestion MUST cache `wp-json` responses (or normalized outputs) using a TTL so repeated ingestion runs avoid unnecessary network requests and parsing work.
#### Scenario: WordPress fetch fails
- **WHEN** a WordPress API request fails
- **THEN** the site build can still complete and the blog surface renders a graceful empty state
#### Scenario: Cache hit avoids wp-json refetch
- **WHEN** WordPress ingestion is executed within the configured cache TTL
- **THEN** it uses cached data instead of refetching from `wp-json`


@@ -0,0 +1,26 @@
## 1. Cache Service And Config
- [x] 1.1 Add Redis service to `docker-compose.yml` and wire basic health/ports for local dev
- [x] 1.2 Add cache env/config variables (Redis URL/host+port, DB index, default TTL seconds) and document in `site/.env.example`
## 2. Cache Client And Utilities
- [x] 2.1 Add a small Redis cache client wrapper (get/set JSON with TTL, namespaced keys) for Node scripts
- [x] 2.2 Add logging for cache hit/miss per key to support verification
- [x] 2.3 Ensure caching is optional: if Redis is unreachable, ingestion proceeds without caching
## 3. Integrate With Ingestion
- [x] 3.1 Cache YouTube fetches (API and/or RSS) by source+params and reuse within TTL
- [x] 3.2 Cache podcast RSS fetch by URL and reuse within TTL
- [x] 3.3 Cache WordPress `wp-json` fetches (posts/pages/categories) and reuse within TTL
## 4. Cache Invalidation
- [x] 4.1 Add a command/script to manually clear the cache (scoped to configured Redis DB)
- [x] 4.2 Document the cache clear command usage
## 5. Verification
- [x] 5.1 Add a test that exercises the cache wrapper (set/get JSON + TTL expiration behavior)
- [x] 5.2 Add a test or build verification that a second ingestion run within TTL produces cache hits


@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-02-10


@@ -0,0 +1,66 @@
## Purpose
Expose a blog section on the site backed by cached WordPress content, including listing, detail pages, and category browsing.
## ADDED Requirements
### Requirement: Primary navigation entry
The site MUST add a header navigation link to the blog index at `/blog` labeled "Blog".
#### Scenario: Blog link in header
- **WHEN** a user views any page
- **THEN** the header navigation includes a "Blog" link that navigates to `/blog`
### Requirement: Blog index listing (posts)
The site MUST provide a blog index page at `/blog` that lists WordPress posts as cards containing:
- featured image (when available)
- title
- excerpt/summary
The listing MUST be ordered by publish date descending (newest first).
#### Scenario: Blog index lists posts
- **WHEN** the cached WordPress dataset contains posts
- **THEN** `/blog` renders a list of post cards ordered by publish date descending
### Requirement: Blog post detail
The site MUST provide a blog post detail page for each WordPress post that renders:
- title
- publish date
- featured image (when available)
- full post content
#### Scenario: Post detail renders
- **WHEN** a user navigates to a blog post detail page
- **THEN** the page renders the full post content from the cached WordPress dataset
### Requirement: WordPress pages support
The blog section MUST support WordPress pages by rendering page detail routes that show:
- title
- featured image (when available)
- full page content
#### Scenario: Page detail renders
- **WHEN** a user navigates to a WordPress page detail route
- **THEN** the page renders the full page content from the cached WordPress dataset
### Requirement: Category-based secondary navigation
The blog section MUST render a secondary navigation under the header derived from the cached WordPress categories.
Selecting a category MUST navigate to a category listing page showing only posts in that category.
#### Scenario: Category nav present
- **WHEN** the cached WordPress dataset contains categories
- **THEN** the blog section shows a secondary navigation with those categories
#### Scenario: Category listing filters posts
- **WHEN** a user navigates to a category listing page
- **THEN** only posts assigned to that category are listed
### Requirement: Graceful empty states
If there are no WordPress posts available, the blog index MUST render a non-broken empty state and MUST still render header/navigation.
#### Scenario: No posts available
- **WHEN** the cached WordPress dataset contains no posts
- **THEN** `/blog` renders a helpful empty state


@@ -0,0 +1,42 @@
## Purpose
Provide a shared caching layer (Redis-backed) for ingestion and content processing flows, with TTL-based invalidation and manual cache clearing.
## ADDED Requirements
### Requirement: Redis-backed cache service
The system MUST provide a Redis-backed cache service for use by ingestion and content processing flows.
The cache service MUST be runnable in local development via Docker Compose.
#### Scenario: Cache service available in Docker
- **WHEN** the Docker Compose stack is started
- **THEN** a Redis service is available to other services/scripts on the internal network
### Requirement: TTL-based invalidation
Cached entries MUST support TTL-based invalidation.
The system MUST define a default TTL and MUST allow overriding the TTL via environment/config.
#### Scenario: Default TTL applies
- **WHEN** a cached entry is written without an explicit TTL override
- **THEN** it expires after the configured default TTL
#### Scenario: TTL override applies
- **WHEN** a TTL override is configured via environment/config
- **THEN** new cached entries use that TTL for expiration
### Requirement: Cache key namespace
Cache keys MUST be namespaced by source and parameters so that different data requests do not collide.
#### Scenario: Two different sources do not collide
- **WHEN** the system caches a YouTube fetch and a WordPress fetch
- **THEN** they use different key namespaces and do not overwrite each other
### Requirement: Manual cache clear
The system MUST provide a script/command to manually clear the cache.
#### Scenario: Manual clear executed
- **WHEN** a developer runs the cache clear command
- **THEN** the cache is cleared and subsequent ingestion runs produce cache misses


@@ -38,7 +38,8 @@ When `metrics.views` is not available, the system MUST render the high-performin
### Requirement: Graceful empty and error states
If a module has no content to display, the homepage MUST render a non-broken empty state for that module and MUST still render the rest of the page.
The Instagram module is an exception: if there are no Instagram items to display, the homepage MUST omit the Instagram module entirely (no empty state block) and MUST still render the rest of the page.
#### Scenario: No Instagram items available
- **WHEN** the cached dataset contains no Instagram items
- **THEN** the Instagram-related module is not rendered and the homepage still renders other modules


@@ -45,9 +45,19 @@ The site MUST provide:
- `sitemap.xml` enumerating indexable pages
- `robots.txt` that allows indexing of indexable pages
The sitemap MUST include the blog surface routes:
- `/blog`
- blog post detail routes
- blog page detail routes
- blog category listing routes
#### Scenario: Sitemap is available
- **WHEN** a crawler requests `/sitemap.xml`
- **THEN** the server returns an XML sitemap listing `/`, `/videos`, `/podcast`, `/about`, and `/blog`
#### Scenario: Blog URLs appear in sitemap
- **WHEN** WordPress content is available in the cache at build time
- **THEN** the generated sitemap includes the blog detail URLs for those items
### Requirement: Structured data
The site MUST support structured data (JSON-LD) for Video and Podcast content when detail pages exist, and MUST ensure the JSON-LD is valid JSON.


@@ -57,6 +57,8 @@ The system MUST support periodic refresh on a schedule (at minimum daily) and MU
On ingestion failure, the system MUST continue serving the most recent cached data.
The ingestion pipeline MUST use the cache layer (when configured and reachable) to reduce repeated network and parsing work for external sources (for example, YouTube API/RSS and podcast RSS).
#### Scenario: Scheduled refresh fails
- **WHEN** a scheduled refresh run fails to fetch one or more sources
- **THEN** the site continues to use the most recent successfully cached dataset
@@ -65,3 +67,6 @@ On ingestion failure, the system MUST continue serving the most recent cached da
- **WHEN** a manual refresh is triggered
- **THEN** the system attempts ingestion immediately and updates the cache if ingestion succeeds
#### Scenario: Cache hit avoids refetch
- **WHEN** a refresh run is executed within the cache TTL for a given source+parameters
- **THEN** the ingestion pipeline uses cached data for that source instead of refetching over the network


@@ -0,0 +1,69 @@
## Purpose
Provide a build-time content source backed by a WordPress site via the `wp-json` REST APIs.
## ADDED Requirements
### Requirement: WordPress API configuration
The system MUST allow configuring a WordPress content source using environment/config values:
- WordPress base URL
- credentials (username + password or application password) when required by the WordPress instance
The WordPress base URL MUST be used to construct requests to the WordPress `wp-json` REST APIs.
#### Scenario: Config provided
- **WHEN** WordPress configuration values are provided
- **THEN** the system can attempt to fetch WordPress content via `wp-json`
### Requirement: Fetch posts
The system MUST fetch the latest WordPress posts via `wp-json` and map them into an internal representation with:
- stable ID
- slug
- title
- excerpt/summary
- content HTML
- featured image URL when available
- publish date/time and last modified date/time
- category assignments (IDs and slugs when available)
#### Scenario: Posts fetched successfully
- **WHEN** the WordPress posts endpoint returns a non-empty list
- **THEN** the system stores the mapped post items in the content cache for rendering
### Requirement: Fetch pages
The system MUST fetch WordPress pages via `wp-json` and map them into an internal representation with:
- stable ID
- slug
- title
- excerpt/summary when available
- content HTML
- featured image URL when available
- publish date/time and last modified date/time
#### Scenario: Pages fetched successfully
- **WHEN** the WordPress pages endpoint returns a non-empty list
- **THEN** the system stores the mapped page items in the content cache for rendering
### Requirement: Fetch categories
The system MUST fetch WordPress categories via `wp-json` and store them for rendering a category-based secondary navigation under the blog section.
#### Scenario: Categories fetched successfully
- **WHEN** the WordPress categories endpoint returns a list of categories
- **THEN** the system stores categories (ID, slug, name) in the content cache for blog navigation
### Requirement: Build-time caching
WordPress posts, pages, and categories MUST be written into the repo-local content cache used by the site build.
If the WordPress fetch fails, the system MUST NOT crash the entire build pipeline; it MUST either:
- keep the last-known-good cached WordPress content (if present), or
- store an empty WordPress dataset and allow the rest of the site to build.
When the cache layer is configured and reachable, the WordPress ingestion MUST cache `wp-json` responses (or normalized outputs) using a TTL so repeated ingestion runs avoid unnecessary network requests and parsing work.
#### Scenario: WordPress fetch fails
- **WHEN** a WordPress API request fails
- **THEN** the site build can still complete and the blog surface renders a graceful empty state
#### Scenario: Cache hit avoids wp-json refetch
- **WHEN** WordPress ingestion is executed within the configured cache TTL
- **THEN** it uses cached data instead of refetching from `wp-json`


@@ -19,3 +19,17 @@ WORDPRESS_BASE_URL=
# Optional credentials (prefer an Application Password). Leave blank if your WP endpoints are public.
WORDPRESS_USERNAME=
WORDPRESS_APP_PASSWORD=
# Cache layer (optional; used by ingestion scripts)
# If unset, caching is disabled.
#
# Using docker-compose redis:
# CACHE_REDIS_URL=redis://localhost:6380/0
CACHE_REDIS_URL=
# Alternative config if you prefer host/port/db:
CACHE_REDIS_HOST=localhost
CACHE_REDIS_PORT=6380
CACHE_REDIS_DB=0
# Default cache TTL (seconds). 3600 = 1 hour.
CACHE_DEFAULT_TTL_SECONDS=3600
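A resolver for `CACHE_DEFAULT_TTL_SECONDS` might look like the following sketch. The function name mirrors the helper exported by the cache module, but the body shown here is an assumed implementation: fall back to 3600 when the variable is unset or not a positive number.

```typescript
// Sketch: resolve the default TTL from the environment, falling back to
// 3600 seconds (1 hour) when the variable is unset or invalid. Assumed
// behavior, not necessarily the exact shipped code.
function resolveDefaultTtlSecondsFromEnv(env: Record<string, string | undefined>): number {
  const raw = env.CACHE_DEFAULT_TTL_SECONDS;
  const n = raw ? Number(raw) : NaN;
  return Number.isFinite(n) && n > 0 ? Math.floor(n) : 3600;
}
```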


@@ -1,5 +1,5 @@
{
"generatedAt": "2026-02-10T06:01:51.379Z",
"generatedAt": "2026-02-10T06:16:26.031Z",
"items": [
{
"id": "gPGbtfQdaw4",
@@ -31,7 +31,7 @@
"publishedAt": "2026-02-05T04:31:18.000Z",
"thumbnailUrl": "https://i.ytimg.com/vi/9t8cBpZLHUo/hqdefault.jpg",
"metrics": {
"views": 328
"views": 325
}
},
{

site/package-lock.json (generated)

@@ -10,6 +10,7 @@
"dependencies": {
"@astrojs/sitemap": "^3.7.0",
"astro": "^5.17.1",
"redis": "^4.7.1",
"rss-parser": "^3.13.0",
"zod": "^3.25.76"
},
@@ -1241,6 +1242,65 @@
"integrity": "sha512-70wQhgYmndg4GCPxPPxPGevRKqTIJ2Nh4OkiMWmDAVYsTQ+Ta7Sq+rPevXyXGdzr30/qZBnyOalCszoMxlyldQ==",
"license": "MIT"
},
"node_modules/@redis/bloom": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/@redis/bloom/-/bloom-1.2.0.tgz",
"integrity": "sha512-HG2DFjYKbpNmVXsa0keLHp/3leGJz1mjh09f2RLGGLQZzSHpkmZWuwJbAvo3QcRY8p80m5+ZdXZdYOSBLlp7Cg==",
"license": "MIT",
"peerDependencies": {
"@redis/client": "^1.0.0"
}
},
"node_modules/@redis/client": {
"version": "1.6.1",
"resolved": "https://registry.npmjs.org/@redis/client/-/client-1.6.1.tgz",
"integrity": "sha512-/KCsg3xSlR+nCK8/8ZYSknYxvXHwubJrU82F3Lm1Fp6789VQ0/3RJKfsmRXjqfaTA++23CvC3hqmqe/2GEt6Kw==",
"license": "MIT",
"dependencies": {
"cluster-key-slot": "1.1.2",
"generic-pool": "3.9.0",
"yallist": "4.0.0"
},
"engines": {
"node": ">=14"
}
},
"node_modules/@redis/graph": {
"version": "1.1.1",
"resolved": "https://registry.npmjs.org/@redis/graph/-/graph-1.1.1.tgz",
"integrity": "sha512-FEMTcTHZozZciLRl6GiiIB4zGm5z5F3F6a6FZCyrfxdKOhFlGkiAqlexWMBzCi4DcRoyiOsuLfW+cjlGWyExOw==",
"license": "MIT",
"peerDependencies": {
"@redis/client": "^1.0.0"
}
},
"node_modules/@redis/json": {
"version": "1.0.7",
"resolved": "https://registry.npmjs.org/@redis/json/-/json-1.0.7.tgz",
"integrity": "sha512-6UyXfjVaTBTJtKNG4/9Z8PSpKE6XgSyEb8iwaqDcy+uKrd/DGYHTWkUdnQDyzm727V7p21WUMhsqz5oy65kPcQ==",
"license": "MIT",
"peerDependencies": {
"@redis/client": "^1.0.0"
}
},
"node_modules/@redis/search": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/@redis/search/-/search-1.2.0.tgz",
"integrity": "sha512-tYoDBbtqOVigEDMAcTGsRlMycIIjwMCgD8eR2t0NANeQmgK/lvxNAvYyb6bZDD4frHRhIHkJu2TBRvB0ERkOmw==",
"license": "MIT",
"peerDependencies": {
"@redis/client": "^1.0.0"
}
},
"node_modules/@redis/time-series": {
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/@redis/time-series/-/time-series-1.1.0.tgz",
"integrity": "sha512-c1Q99M5ljsIuc4YdaCwfUEXsofakb9c8+Zse2qxTadu8TalLXuAESzLvFAvNVbkmSlvlzIQOLpBCmWI9wTOt+g==",
"license": "MIT",
"peerDependencies": {
"@redis/client": "^1.0.0"
}
},
"node_modules/@rollup/pluginutils": {
"version": "5.3.0",
"resolved": "https://registry.npmjs.org/@rollup/pluginutils/-/pluginutils-5.3.0.tgz",
@@ -2515,6 +2575,15 @@
"node": ">=6"
}
},
"node_modules/cluster-key-slot": {
"version": "1.1.2",
"resolved": "https://registry.npmjs.org/cluster-key-slot/-/cluster-key-slot-1.1.2.tgz",
"integrity": "sha512-RMr0FhtfXemyinomL4hrWcYJxmX6deFdCxpJzhDttxgO1+bcCnkk+9drydLVDmAMG7NE6aN/fl4F7ucU/90gAA==",
"license": "Apache-2.0",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/color-convert": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/color-convert/-/color-convert-2.0.1.tgz",
@@ -3090,6 +3159,15 @@
"node": "^8.16.0 || ^10.6.0 || >=11.0.0"
}
},
"node_modules/generic-pool": {
"version": "3.9.0",
"resolved": "https://registry.npmjs.org/generic-pool/-/generic-pool-3.9.0.tgz",
"integrity": "sha512-hymDOu5B53XvN4QT9dBmZxPX4CWhBPPLguTZ9MMFeFa/Kg0xWVfylOVNlJji/E7yTZWFd/q9GO5TxDLq156D7g==",
"license": "MIT",
"engines": {
"node": ">= 4"
}
},
"node_modules/get-caller-file": {
"version": "2.0.5",
"resolved": "https://registry.npmjs.org/get-caller-file/-/get-caller-file-2.0.5.tgz",
@@ -4687,6 +4765,23 @@
"url": "https://paulmillr.com/funding/"
}
},
"node_modules/redis": {
"version": "4.7.1",
"resolved": "https://registry.npmjs.org/redis/-/redis-4.7.1.tgz",
"integrity": "sha512-S1bJDnqLftzHXHP8JsT5II/CtHWQrASX5K96REjWjlmWKrviSOLWmM7QnRLstAWsu1VBBV1ffV6DzCvxNP0UJQ==",
"license": "MIT",
"workspaces": [
"./packages/*"
],
"dependencies": {
"@redis/bloom": "1.2.0",
"@redis/client": "1.6.1",
"@redis/graph": "1.1.1",
"@redis/json": "1.0.7",
"@redis/search": "1.2.0",
"@redis/time-series": "1.1.0"
}
},
"node_modules/regex": {
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/regex/-/regex-6.1.0.tgz",
@@ -6745,6 +6840,12 @@
"node": ">=10"
}
},
"node_modules/yallist": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/yallist/-/yallist-4.0.0.tgz",
"integrity": "sha512-3wdGidZyq5PB084XLES5TpOSRA3wjXAlIWMhum2kRcv/41Sn2emQ0dycQW4uZXLejwKvg6EsvbdlVL+FYEct7A==",
"license": "ISC"
},
"node_modules/yaml": {
"version": "2.8.2",
"resolved": "https://registry.npmjs.org/yaml/-/yaml-2.8.2.tgz",


@@ -7,6 +7,7 @@
"build": "astro build",
"preview": "astro preview",
"fetch-content": "tsx scripts/fetch-content.ts",
"cache:clear": "tsx scripts/cache-clear.ts",
"verify:blog": "npm run build && tsx scripts/verify-blog-build.ts",
"typecheck": "astro check",
"format": "prettier -w .",
@@ -17,6 +18,7 @@
"dependencies": {
"@astrojs/sitemap": "^3.7.0",
"astro": "^5.17.1",
"redis": "^4.7.1",
"rss-parser": "^3.13.0",
"zod": "^3.25.76"
},


@@ -0,0 +1,22 @@
import "dotenv/config";
import { createCacheFromEnv } from "../src/lib/cache";
function log(msg: string) {
// eslint-disable-next-line no-console
console.log(`[cache-clear] ${msg}`);
}
async function main() {
const cache = await createCacheFromEnv(process.env, { namespace: "fast-website", log });
await cache.flush();
await cache.close();
log("ok");
}
main().catch((e) => {
// eslint-disable-next-line no-console
console.error(`[cache-clear] failed: ${String(e)}`);
process.exitCode = 1;
});


@@ -4,6 +4,8 @@ import { promises as fs } from "node:fs";
import path from "node:path";
import { getIngestConfigFromEnv } from "../src/lib/config";
import { createCacheFromEnv } from "../src/lib/cache";
import { cachedCompute } from "../src/lib/cache/memoize";
import type { ContentCache, ContentItem } from "../src/lib/content/types";
import { readInstagramEmbedPosts } from "../src/lib/ingest/instagram";
import { fetchPodcastRss } from "../src/lib/ingest/podcast";
@@ -42,6 +44,11 @@ async function main() {
const all: ContentItem[] = [];
const outPath = path.join(process.cwd(), "content", "cache", "content.json");
const kv = await createCacheFromEnv(process.env, {
namespace: "fast-website",
log,
});
// Read the existing cache so we can keep last-known-good sections if a source fails.
let existing: ContentCache | undefined;
try {
@@ -56,17 +63,29 @@ async function main() {
log("YouTube: skipped (missing YOUTUBE_CHANNEL_ID)");
} else if (cfg.youtubeApiKey) {
try {
const cacheKey = `youtube:api:${cfg.youtubeChannelId}:25`;
const { value: items, cached } = await cachedCompute(kv, cacheKey, () =>
fetchYoutubeViaApi(cfg.youtubeChannelId!, cfg.youtubeApiKey!, 25),
);
log(`YouTube: API ${cached ? "cache" : "live"} (${items.length} items)`);
all.push(...items);
} catch (e) {
log(`YouTube: API failed (${String(e)}), falling back to RSS`);
const cacheKey = `youtube:rss:${cfg.youtubeChannelId}:25`;
const { value: items, cached } = await cachedCompute(kv, cacheKey, () =>
fetchYoutubeViaRss(cfg.youtubeChannelId!, 25),
);
log(`YouTube: RSS ${cached ? "cache" : "live"} (${items.length} items)`);
all.push(...items);
}
} else {
const cacheKey = `youtube:rss:${cfg.youtubeChannelId}:25`;
const { value: items, cached } = await cachedCompute(kv, cacheKey, () =>
fetchYoutubeViaRss(cfg.youtubeChannelId!, 25),
);
log(`YouTube: RSS ${cached ? "cache" : "live"} (${items.length} items)`);
all.push(...items);
}
@@ -76,7 +95,11 @@ async function main() {
log("Podcast: skipped (missing PODCAST_RSS_URL)");
} else {
try {
const cacheKey = `podcast:rss:${cfg.podcastRssUrl}:50`;
const { value: items, cached } = await cachedCompute(kv, cacheKey, () =>
fetchPodcastRss(cfg.podcastRssUrl!, 50),
);
log(`Podcast: RSS ${cached ? "cache" : "live"} (${items.length} items)`);
all.push(...items);
} catch (e) {
@@ -103,11 +126,17 @@ async function main() {
wordpress = existing?.wordpress || wordpress;
} else {
try {
const cacheKey = `wp:content:${cfg.wordpressBaseUrl}`;
const { value: wp, cached } = await cachedCompute(kv, cacheKey, () =>
fetchWordpressContent({
baseUrl: cfg.wordpressBaseUrl!,
username: cfg.wordpressUsername,
appPassword: cfg.wordpressAppPassword,
}),
);
log(
`WordPress: wp-json ${cached ? "cache" : "live"} (${wp.posts.length} posts, ${wp.pages.length} pages, ${wp.categories.length} categories)`,
);
wordpress = wp;
@@ -119,14 +148,16 @@ async function main() {
}
}
const contentCache: ContentCache = {
generatedAt,
items: dedupe(all),
wordpress,
};
await writeAtomic(outPath, JSON.stringify(contentCache, null, 2));
log(`Wrote cache: ${outPath} (${contentCache.items.length} total items)`);
await kv.close();
}
main().catch((e) => {

site/src/lib/cache/index.ts (new file)

@@ -0,0 +1,28 @@
import type { CacheLogFn, CacheStore } from "./redis-cache";
import {
createRedisCache,
resolveDefaultTtlSecondsFromEnv,
resolveRedisUrlFromEnv,
} from "./redis-cache";
import { createNoopCache } from "./noop-cache";
export async function createCacheFromEnv(
env: NodeJS.ProcessEnv,
opts?: { namespace?: string; log?: CacheLogFn },
): Promise<CacheStore> {
const url = resolveRedisUrlFromEnv(env);
if (!url) return createNoopCache(opts?.log);
try {
return await createRedisCache({
url,
defaultTtlSeconds: resolveDefaultTtlSecondsFromEnv(env),
namespace: opts?.namespace,
log: opts?.log,
});
} catch (e) {
opts?.log?.(`cache: disabled (redis connect failed: ${String(e)})`);
return createNoopCache(opts?.log);
}
}

site/src/lib/cache/memoize.ts (new file)

@@ -0,0 +1,16 @@
import type { CacheStore } from "./redis-cache";
export async function cachedCompute<T>(
cache: CacheStore,
key: string,
compute: () => Promise<T>,
ttlSeconds?: number,
): Promise<{ value: T; cached: boolean }> {
const hit = await cache.getJson<T>(key);
if (hit !== undefined) return { value: hit, cached: true };
const value = await compute();
await cache.setJson(key, value, ttlSeconds);
return { value, cached: false };
}
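
The read-through pattern `cachedCompute` implements can be demonstrated stand-alone. The sketch below inlines a minimal Map-backed store as a stand-in for the Redis-backed `CacheStore`; the names `demoStore` and `fetchFeed` are illustrative only, not part of the commit.

```typescript
// Minimal stand-in for CacheStore: just the two methods cachedCompute needs.
type Store = {
  getJson<T>(key: string): Promise<T | undefined>;
  setJson(key: string, value: unknown, ttlSeconds?: number): Promise<void>;
};

function demoStore(): Store {
  const m = new Map<string, string>();
  return {
    async getJson<T>(key: string) {
      const raw = m.get(key);
      return raw === undefined ? undefined : (JSON.parse(raw) as T);
    },
    async setJson(key: string, value: unknown) {
      m.set(key, JSON.stringify(value));
    },
  };
}

// Same shape as the commit's cachedCompute: return the hit if present,
// otherwise run the computation once and store the result.
async function cachedCompute<T>(
  store: Store,
  key: string,
  compute: () => Promise<T>,
): Promise<{ value: T; cached: boolean }> {
  const hit = await store.getJson<T>(key);
  if (hit !== undefined) return { value: hit, cached: true };
  const value = await compute();
  await store.setJson(key, value);
  return { value, cached: false };
}

async function demo() {
  const store = demoStore();
  let fetches = 0;
  const fetchFeed = async () => {
    fetches++; // stands in for the expensive network call in ingestion
    return { items: [1, 2, 3] };
  };
  const first = await cachedCompute(store, "feed:main", fetchFeed);
  const second = await cachedCompute(store, "feed:main", fetchFeed);
  return { fetches, first, second };
}
```

In the ingestion script the same pattern wraps each external fetch (YouTube, podcast RSS, `wp-json`), so a second run inside the TTL skips the network entirely and the `cached` flag drives the "cache" vs "live" log lines.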

site/src/lib/cache/memory-cache.ts (new file, 49 lines)

@@ -0,0 +1,49 @@
import type { CacheStore } from "./redis-cache";
type Entry = { value: string; expiresAt: number };
export function createMemoryCache(defaultTtlSeconds: number): CacheStore {
const store = new Map<string, Entry>();
function nowMs() {
return Date.now();
}
function isExpired(e: Entry) {
return e.expiresAt !== 0 && nowMs() > e.expiresAt;
}
return {
async getJson<T>(key: string) {
const e = store.get(key);
if (!e) return undefined;
if (isExpired(e)) {
store.delete(key);
return undefined;
}
try {
return JSON.parse(e.value) as T;
} catch {
store.delete(key);
return undefined;
}
},
async setJson(key: string, value: unknown, ttlSeconds?: number) {
const ttl = Math.max(1, Math.floor(ttlSeconds ?? defaultTtlSeconds));
store.set(key, {
value: JSON.stringify(value),
expiresAt: nowMs() + ttl * 1000,
});
},
async flush() {
store.clear();
},
async close() {
// no-op
},
};
}

site/src/lib/cache/noop-cache.ts (new file, 19 lines)

@@ -0,0 +1,19 @@
import type { CacheLogFn, CacheStore } from "./redis-cache";
export function createNoopCache(log?: CacheLogFn): CacheStore {
return {
async getJson() {
return undefined;
},
async setJson() {
// no-op
},
async flush() {
log?.("cache: noop flush");
},
async close() {
// no-op
},
};
}

site/src/lib/cache/redis-cache.ts (new file, 92 lines)

@@ -0,0 +1,92 @@
import { createClient } from "redis";
export type CacheLogFn = (msg: string) => void;
export type CacheStore = {
getJson<T>(key: string): Promise<T | undefined>;
setJson(key: string, value: unknown, ttlSeconds?: number): Promise<void>;
flush(): Promise<void>;
close(): Promise<void>;
};
type RedisCacheOptions = {
url: string;
defaultTtlSeconds: number;
namespace?: string;
log?: CacheLogFn;
};
function nsKey(namespace: string | undefined, key: string) {
return namespace ? `${namespace}:${key}` : key;
}
export async function createRedisCache(opts: RedisCacheOptions): Promise<CacheStore> {
const log = opts.log;
const client = createClient({ url: opts.url });
client.on("error", (err) => {
log?.(`cache: redis error (${String(err)})`);
});
await client.connect();
return {
async getJson<T>(key: string) {
const k = nsKey(opts.namespace, key);
const raw = await client.get(k);
if (raw == null) {
log?.(`cache: miss ${k}`);
return undefined;
}
log?.(`cache: hit ${k}`);
try {
return JSON.parse(raw) as T;
} catch {
// Bad cache entry: treat as miss.
return undefined;
}
},
async setJson(key: string, value: unknown, ttlSeconds?: number) {
const k = nsKey(opts.namespace, key);
const ttl = Math.max(1, Math.floor(ttlSeconds ?? opts.defaultTtlSeconds));
const raw = JSON.stringify(value);
await client.set(k, raw, { EX: ttl });
},
async flush() {
await client.flushDb();
},
async close() {
try {
await client.quit();
} catch {
// ignore
}
},
};
}
export function resolveRedisUrlFromEnv(env: NodeJS.ProcessEnv): string | undefined {
const url = env.CACHE_REDIS_URL;
if (url) return url;
const host = env.CACHE_REDIS_HOST;
const port = env.CACHE_REDIS_PORT;
const db = env.CACHE_REDIS_DB;
if (!host) return undefined;
const p = port ? Number(port) : 6379;
const d = db ? Number(db) : 0;
if (!Number.isFinite(p) || !Number.isFinite(d)) return undefined;
return `redis://${host}:${p}/${d}`;
}
export function resolveDefaultTtlSecondsFromEnv(env: NodeJS.ProcessEnv): number {
const raw = env.CACHE_DEFAULT_TTL_SECONDS;
const n = raw ? Number(raw) : NaN;
if (Number.isFinite(n) && n > 0) return Math.floor(n);
return 3600;
}
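
The precedence in `resolveRedisUrlFromEnv` is worth making explicit: a full `CACHE_REDIS_URL` always wins; otherwise a URL is composed from `CACHE_REDIS_HOST` (required), `CACHE_REDIS_PORT` (default 6379), and `CACHE_REDIS_DB` (default 0), with malformed numbers disabling the cache rather than producing a bad URL. The stand-alone restatement below mirrors that logic for illustration:

```typescript
type Env = Record<string, string | undefined>;

// Mirrors resolveRedisUrlFromEnv: URL takes precedence, then host/port/db,
// and non-numeric port/db values yield undefined (cache disabled).
function resolveUrl(env: Env): string | undefined {
  if (env.CACHE_REDIS_URL) return env.CACHE_REDIS_URL;
  if (!env.CACHE_REDIS_HOST) return undefined;
  const port = env.CACHE_REDIS_PORT ? Number(env.CACHE_REDIS_PORT) : 6379;
  const db = env.CACHE_REDIS_DB ? Number(env.CACHE_REDIS_DB) : 0;
  if (!Number.isFinite(port) || !Number.isFinite(db)) return undefined;
  return `redis://${env.CACHE_REDIS_HOST}:${port}/${db}`;
}
```

With the compose file in this commit mapping host port 6380 to the container's 6379, a dev shell would typically set `CACHE_REDIS_URL=redis://localhost:6380/0`, while a containerized run can reach the service directly via `CACHE_REDIS_HOST=redis`.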


@@ -14,6 +14,11 @@ type IngestConfig = {
wordpressBaseUrl?: string;
wordpressUsername?: string;
wordpressAppPassword?: string;
cacheRedisUrl?: string;
cacheRedisHost?: string;
cacheRedisPort?: number;
cacheRedisDb?: number;
cacheDefaultTtlSeconds?: number;
};
export function getPublicConfig(): PublicConfig {
@@ -37,5 +42,12 @@ export function getIngestConfigFromEnv(env: NodeJS.ProcessEnv): IngestConfig {
wordpressBaseUrl: env.WORDPRESS_BASE_URL,
wordpressUsername: env.WORDPRESS_USERNAME,
wordpressAppPassword: env.WORDPRESS_APP_PASSWORD,
cacheRedisUrl: env.CACHE_REDIS_URL,
cacheRedisHost: env.CACHE_REDIS_HOST,
cacheRedisPort: env.CACHE_REDIS_PORT ? Number(env.CACHE_REDIS_PORT) : undefined,
cacheRedisDb: env.CACHE_REDIS_DB ? Number(env.CACHE_REDIS_DB) : undefined,
cacheDefaultTtlSeconds: env.CACHE_DEFAULT_TTL_SECONDS
? Number(env.CACHE_DEFAULT_TTL_SECONDS)
: undefined,
};
}
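
One caveat in the config diff above: `Number(env.CACHE_REDIS_PORT)` returns `NaN` for malformed input, so `IngestConfig` can carry a `NaN` port. The cache path itself guards this with `Number.isFinite` in `resolveRedisUrlFromEnv`, but a small helper could centralize the guard. This helper is a suggestion, not part of the commit; the name `parseOptionalInt` is hypothetical.

```typescript
// Hypothetical helper: parse an optional numeric env var, returning
// undefined for missing or malformed values instead of NaN.
function parseOptionalInt(raw: string | undefined): number | undefined {
  if (raw === undefined || raw === "") return undefined;
  const n = Number(raw);
  return Number.isFinite(n) ? Math.floor(n) : undefined;
}
```

With it, `cacheRedisPort: parseOptionalInt(env.CACHE_REDIS_PORT)` would never store `NaN` in the config object.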


@@ -0,0 +1,40 @@
import { describe, expect, it } from "vitest";
import { createMemoryCache } from "../src/lib/cache/memory-cache";
import { cachedCompute } from "../src/lib/cache/memoize";
function sleep(ms: number) {
return new Promise((r) => setTimeout(r, ms));
}
describe("cache wrapper", () => {
it("set/get JSON and expires by TTL", async () => {
const cache = createMemoryCache(1);
await cache.setJson("k", { a: 1 }, 1);
const v1 = await cache.getJson<{ a: number }>("k");
expect(v1).toEqual({ a: 1 });
await sleep(1100);
const v2 = await cache.getJson("k");
expect(v2).toBeUndefined();
});
it("cachedCompute hits on second call within TTL", async () => {
const cache = createMemoryCache(60);
let calls = 0;
const compute = async () => {
calls++;
return { ok: true, n: calls };
};
const r1 = await cachedCompute(cache, "x", compute, 60);
const r2 = await cachedCompute(cache, "x", compute, 60);
expect(r1.cached).toBe(false);
expect(r2.cached).toBe(true);
expect(calls).toBe(1);
expect(r2.value).toEqual(r1.value);
});
});