Obsidian RAG Plugin — Work Breakdown Structure
Date: 2026-04-10
Based on: Technical Design Document v1.0
WBS Overview
The work is decomposed into 6 phases (0–5), 20 work areas, and 72 work packages. Phases are sequenced by dependency: foundation first, then bottom-up through the protocol layers, then integration and hardening.
Each work package follows the format:
- ID: Hierarchical code (e.g., 1.1.2)
- Name: Imperative, action-oriented title
- Delivers: Concrete artifact or behavior
- Depends on: Prerequisite WBS IDs
- Effort: S/M/L relative sizing (S=1-2 sessions, M=3-5 sessions, L=6+ sessions)
Phase 0: Project Scaffolding & Environment
0.1 Repository & Build Setup
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 0.1.1 | Initialize TypeScript project structure | package.json, tsconfig.json, src/ directory skeleton | — | S |
| 0.1.2 | Initialize Python package structure | pyproject.toml, obsidian_rag/ module skeleton | — | S |
| 0.1.3 | Create development config file | ./obsidian-rag/config.json with ./KnowledgeVault/Default | 0.1.1 | S |
| 0.1.4 | Set up OpenClaw plugin manifest | openclaw.plugin.json with tool declarations | 0.1.1 | S |
| 0.1.5 | Configure test runners | vitest config (TS), pytest config (Python) | 0.1.1, 0.1.2 | S |
0.2 Environment Validation
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 0.2.1 | Verify Ollama + mxbai-embed-large | Script that calls /api/embed and returns 1024-dim vector | — | S |
| 0.2.2 | Verify LanceDB Python package | Script that creates a table, inserts, queries | — | S |
| 0.2.3 | Verify sample vault accessibility | Script that walks ./KnowledgeVault/Default and counts .md files | — | S |
Phase 1: Data Layer (Python Indexer)
The data layer is the foundation: every later phase depends on its ability to index and store vectors.
1.1 Configuration (Python)
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 1.1.1 | Implement config loader | config.py — reads JSON config, resolves paths cross-platform, validates schema | 0.1.2 | S |
| 1.1.2 | Write config tests | test_config.py — valid/invalid config, path resolution, defaults | 1.1.1 | S |
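As an illustration of 1.1.1, the following is a minimal sketch of a config loader, not the design document's implementation. The key names `vault_path` and `local_only` appear elsewhere in this WBS, but the defaults, the type table, and the function name `load_config` are assumptions made for the example.

```python
import json
from pathlib import Path

# Hypothetical defaults and schema; the WBS does not define either.
DEFAULTS = {"vault_path": "./KnowledgeVault/Default", "local_only": True}
REQUIRED_TYPES = {"vault_path": str, "local_only": bool}


def load_config(config_file: Path) -> dict:
    """Read a JSON config, apply defaults, check types, resolve vault_path."""
    raw = json.loads(config_file.read_text(encoding="utf-8"))
    if not isinstance(raw, dict):
        raise ValueError("config root must be a JSON object")
    config = {**DEFAULTS, **raw}
    for key, expected in REQUIRED_TYPES.items():
        if not isinstance(config[key], expected):
            raise ValueError(f"{key} must be of type {expected.__name__}")
    # Relative paths are resolved against the config file's directory,
    # so the result does not depend on the current working directory.
    vault = Path(config["vault_path"])
    if not vault.is_absolute():
        vault = config_file.parent / vault
    config["vault_path"] = vault.resolve()
    return config
```

Resolving against the config file's location rather than the working directory is one possible choice; it keeps behavior stable when the indexer is spawned as a subprocess from a different directory.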
1.2 Security (Python)
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 1.2.1 | Implement path traversal prevention | security.py — validate_path() rejects ../, absolute, symlinks outside vault | 1.1.1 | S |
| 1.2.2 | Implement input sanitization | security.py — sanitize_text() strips HTML, code blocks, normalizes whitespace, caps length | 1.1.1 | S |
| 1.2.3 | Implement sensitive content detection | security.py — detect_sensitive() returns categories matched (health/financial/relations) | 1.1.1 | S |
| 1.2.4 | Implement directory access control | security.py — should_index_dir() applies deny/allow lists | 1.1.1 | S |
| 1.2.5 | Write security tests | test_security.py — path traversal vectors (incl. Windows), sanitization, sensitive detection, dir control | 1.2.1–1.2.4 | M |
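The path traversal check in 1.2.1 could look roughly like the sketch below. The signature of `validate_path()` is an assumption (the WBS only names the function), and checking the candidate under both POSIX and Windows path rules is one conservative way to cover the Windows vectors mentioned in 1.2.5.

```python
from pathlib import Path, PureWindowsPath


def validate_path(vault_root: Path, candidate: str) -> Path:
    """Return the resolved path if it stays inside vault_root, else raise ValueError."""
    native_view = Path(candidate)
    windows_view = PureWindowsPath(candidate)
    # Reject absolute paths, drive letters (C:), UNC shares and rooted paths,
    # regardless of which platform the check runs on.
    if native_view.is_absolute() or windows_view.drive or windows_view.root:
        raise ValueError("absolute paths are not allowed")
    if ".." in native_view.parts or ".." in windows_view.parts:
        raise ValueError("parent-directory segments are not allowed")
    root = vault_root.resolve()
    # resolve() follows symlinks, so a link pointing outside the vault
    # resolves to a location outside root and fails the containment check.
    resolved = (root / native_view).resolve()
    if not resolved.is_relative_to(root):  # requires Python 3.9+
        raise ValueError("path escapes the vault")
    return resolved
```

The check is deliberately strict: a POSIX filename that happens to start with a drive-like prefix such as `a:` is rejected as well, which seems an acceptable trade-off for a vault of markdown notes.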
1.3 Chunking
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 1.3.1 | Implement markdown parser | chunker.py — parse frontmatter, headings, tags, date from filename | 0.1.2 | S |
| 1.3.2 | Implement structured chunker | chunker.py — split by section headers, each section = chunk with metadata | 1.3.1 | M |
| 1.3.3 | Implement sliding window chunker | chunker.py — 500 token window, 100 overlap, for unstructured notes | 1.3.1 | S |
| 1.3.4 | Implement chunk router | chunker.py — detect structured vs unstructured, route to correct chunker | 1.3.2, 1.3.3 | S |
| 1.3.5 | Write chunker tests | test_chunker.py — section splitting, sliding window, metadata, edge cases (empty, single-line) | 1.3.4 | M |
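The sliding window described in 1.3.3 (500-token window, 100-token overlap) can be sketched as follows. This is an illustration only: it treats whitespace-separated words as tokens, which is an assumption, since the WBS does not say which tokenizer defines a "token". The function name is hypothetical.

```python
def sliding_window_chunks(text: str, window: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks of at most `window` tokens."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be non-negative and smaller than window")
    tokens = text.split()  # assumption: whitespace words stand in for model tokens
    step = window - overlap  # each new window starts `step` tokens after the previous one
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + window]))
        if start + window >= len(tokens):
            break  # this window already reached the end of the text
    return chunks
```

With the default sizes, consecutive chunks share 100 tokens, and the edge cases listed in 1.3.5 behave simply: an empty note yields no chunks and a single-line note yields exactly one.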
1.4 Embedding
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 1.4.1 | Implement Ollama embedder | embedder.py — call /api/embed, batch 64 chunks, handle errors/retries | 1.1.1 | M |
| 1.4.2 | Implement embedding cache | embedder.py — optional file-based cache to avoid re-embedding unchanged chunks | 1.4.1 | S |
| 1.4.3 | Write embedder tests | test_embedder.py — mocked Ollama, batch handling, error recovery, cache hit/miss | 1.4.1, 1.4.2 | S |
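The batching and retry behavior in 1.4.1 might be structured along these lines. The sketch keeps the HTTP call to /api/embed out of scope by taking an injected `embed_fn`, which also matches the mocked-Ollama approach in 1.4.3. The function names, the retry count, and the exponential backoff are assumptions; only the batch size of 64 comes from the WBS.

```python
import time
from typing import Callable, Sequence

Vector = list[float]
EmbedFn = Callable[[Sequence[str]], list[Vector]]


def embed_in_batches(
    chunks: Sequence[str],
    embed_fn: EmbedFn,
    batch_size: int = 64,
    max_retries: int = 3,    # assumption: not specified in the WBS
    backoff_s: float = 0.5,  # assumption: not specified in the WBS
) -> list[Vector]:
    """Embed chunks in fixed-size batches, retrying a batch on connection errors."""
    vectors: list[Vector] = []
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        for attempt in range(max_retries + 1):
            try:
                result = embed_fn(batch)
                break
            except ConnectionError:
                if attempt == max_retries:
                    raise  # give up after the final retry
                time.sleep(backoff_s * 2 ** attempt)  # exponential backoff
        if len(result) != len(batch):
            raise ValueError("embedder returned a different number of vectors than inputs")
        vectors.extend(result)
    return vectors
```

In the real embedder, `embed_fn` would wrap the Ollama request; in tests it can be a plain function that returns fixed vectors or raises `ConnectionError` on demand.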
1.5 Vector Store
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 1.5.1 | Implement LanceDB table creation | vector_store.py — create obsidian_chunks table with schema | 0.2.2 | S |
| 1.5.2 | Implement vector upsert | vector_store.py — add/update chunks by chunk_id | 1.5.1 | S |
| 1.5.3 | Implement vector delete | vector_store.py — remove chunks by source_file (for deleted files) | 1.5.1 | S |
| 1.5.4 | Implement vector search | vector_store.py — query by embedding vector with filters (directory, date, tags) | 1.5.1 | M |
| 1.5.5 | Write vector store tests | test_vector_store.py — CRUD, upsert idempotency, search with filters, temp directory cleanup | 1.5.2–1.5.4 | M |
1.6 Indexer Pipeline & CLI
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 1.6.1 | Implement full index pipeline | indexer.py — scan → parse → chunk → enrich → embed → store, for all vault files | 1.2.4, 1.3.4, 1.4.1, 1.5.2 | M |
| 1.6.2 | Implement incremental sync | indexer.py — compare mtime, process only changed/deleted files | 1.6.1, 1.5.3 | M |
| 1.6.3 | Implement reindex (nuke + rebuild) | indexer.py — drop table, run full index | 1.6.1 | S |
| 1.6.4 | Implement sync-result.json writer | indexer.py — write atomic .tmp + rename with index stats | 1.6.1 | S |
| 1.6.5 | Implement CLI entry point | cli.py — obsidian-rag index/sync/reindex/status commands, NDJSON progress on stdout | 1.6.1, 1.6.2, 1.6.3 | M |
| 1.6.6 | Write indexer tests | test_indexer.py — full pipeline with mock embedder, incremental sync, reindex, CLI arg parsing | 1.6.5 | M |
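The atomic write in 1.6.4 (.tmp file followed by a rename) is a standard pattern; a minimal sketch is shown below. The function name and the shape of the stats dictionary are assumptions, as the WBS does not list the fields of sync-result.json.

```python
import json
import os
from pathlib import Path


def write_sync_result(target: Path, stats: dict) -> None:
    """Write stats as JSON so readers never observe a partially written file."""
    tmp = target.with_name(target.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(stats, fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())  # push the bytes to disk before the rename
    # os.replace overwrites an existing target and is atomic when the
    # temporary file and the target live on the same filesystem.
    os.replace(tmp, target)
```

Because the temporary file sits next to the target, the rename stays on one filesystem, so the TypeScript reader in 2.3.2 should see either the previous result or the complete new one.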
Phase 2: Data Layer (TypeScript Client)
2.1 Configuration (TypeScript)
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 2.1.1 | Implement config loader | config.ts — read JSON config, validate schema, resolve relative paths | 0.1.1 | S |
| 2.1.2 | Implement config types | config.ts — TypeScript interfaces for all config sections | 2.1.1 | S |
2.2 LanceDB Client
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 2.2.1 | Implement LanceDB query client | lancedb.ts — connect to existing table, perform vector search with filters | 0.1.1 | M |
| 2.2.2 | Implement full-text search fallback | lancedb.ts — LanceDB scalar query when Ollama is down (degraded mode) | 2.2.1 | S |
2.3 Indexer Bridge
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 2.3.1 | Implement subprocess spawner | indexer-bridge.ts — spawn python -m obsidian_rag.cli, parse NDJSON progress | 0.1.1 | M |
| 2.3.2 | Implement sync-result reader | indexer-bridge.ts — read sync-result.json, parse and return | 2.3.1 | S |
| 2.3.3 | Implement job tracking | indexer-bridge.ts — track active job (job_id, mode, progress), detect completion | 2.3.1 | S |
Phase 3: Session & Transport Layers (TypeScript)
3.1 Health State Machine
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 3.1.1 | Implement health prober | services/health.ts — probe Ollama (/api/tags), probe LanceDB (table exists), probe vault (dir exists) | 2.1.1, 2.2.1 | S |
| 3.1.2 | Implement state machine | services/health.ts — HEALTHY/DEGRADED/UNAVAILABLE transitions, 30s re-probe timer | 3.1.1 | S |
| 3.1.3 | Implement staleness detector | services/health.ts — if last sync >1h and vault changed, set degraded | 3.1.2, 2.3.2 | S |
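The state logic in 3.1.2 and 3.1.3 can be illustrated as a pure function from probe results to a health state. The sketch is written in Python purely for illustration, although the WBS places this code in services/health.ts. The mapping is partly an assumption: the WBS states that an Ollama outage and a stale index both mean degraded, while treating a missing LanceDB table or vault as UNAVAILABLE is a guess. The 30s re-probe timer is omitted.

```python
from dataclasses import dataclass
from enum import Enum


class Health(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNAVAILABLE = "unavailable"


@dataclass(frozen=True)
class ProbeResult:
    """Hypothetical container for one round of probes."""
    ollama_ok: bool
    lancedb_ok: bool
    vault_ok: bool
    seconds_since_sync: float
    vault_changed: bool


STALE_AFTER_S = 3600  # "last sync >1h" from 3.1.3


def derive_state(probe: ProbeResult) -> Health:
    # Assumption: without the table or the vault, no tool can work at all.
    if not (probe.lancedb_ok and probe.vault_ok):
        return Health.UNAVAILABLE
    # Ollama down: search can fall back to the non-vector query (2.2.2).
    if not probe.ollama_ok:
        return Health.DEGRADED
    # Staleness rule: old sync AND pending vault changes.
    if probe.seconds_since_sync > STALE_AFTER_S and probe.vault_changed:
        return Health.DEGRADED
    return Health.HEALTHY
```

Keeping the derivation free of timers and I/O makes the transitions easy to unit test; the periodic re-probe would simply call it with fresh probe results.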
3.2 Vault Watcher
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 3.2.1 | Implement file watcher | vault-watcher.ts — chokidar watch on vault_path, respect deny/allow dirs | 2.1.1 | S |
| 3.2.2 | Implement debounce & batching | vault-watcher.ts — 2s debounce, 5s collect window, group into changeset | 3.2.1 | M |
| 3.2.3 | Implement auto-sync trigger | vault-watcher.ts — after batch, spawn indexer sync, update health on result | 3.2.2, 2.3.1, 3.1.2 | M |
| 3.2.4 | Write vault watcher tests | vault-watcher.test.ts — mock chokidar events, debounce timing, batch grouping | 3.2.3 | M |
3.3 Response Envelope & Error Normalization
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 3.3.1 | Implement response envelope factory | utils/response.ts — build {status, data, error, meta} from tool results | 0.1.1 | S |
| 3.3.2 | Implement error normalizer | utils/response.ts — map exceptions/codes to error codes, status, recoverable flag | 3.3.1 | S |
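The envelope and error normalizer in 3.3.1 and 3.3.2 could take roughly the following shape. The sketch is in Python purely for illustration, although the WBS places this code in utils/response.ts. The four envelope keys come from the WBS; the status strings, the error codes, and the exception-to-code table are all hypothetical.

```python
from typing import Any

# Hypothetical mapping: exception type -> (error code, recoverable flag).
ERROR_TABLE: dict[type, tuple[str, bool]] = {
    ConnectionError: ("OLLAMA_UNREACHABLE", True),
    FileNotFoundError: ("VAULT_NOT_FOUND", False),
    ValueError: ("INVALID_PARAMS", False),
}


def ok(data: Any, **meta: Any) -> dict:
    """Wrap a successful tool result in the {status, data, error, meta} envelope."""
    return {"status": "ok", "data": data, "error": None, "meta": meta}


def fail(exc: Exception, **meta: Any) -> dict:
    """Normalize an exception into the same envelope shape."""
    for exc_type, (code, recoverable) in ERROR_TABLE.items():
        if isinstance(exc, exc_type):
            break
    else:
        # No table entry matched: report a generic, non-recoverable error.
        code, recoverable = "INTERNAL_ERROR", False
    error = {"code": code, "message": str(exc), "recoverable": recoverable}
    return {"status": "error", "data": None, "error": error, "meta": meta}
```

Returning the same four keys on both paths means callers can read `status` first and never have to check whether a key exists.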
3.4 Security Guard (TypeScript)
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 3.4.1 | Implement directory filter validator | security-guard.ts — validate directory_filter against known vault dirs | 2.1.1 | S |
| 3.4.2 | Implement sensitive content flag | security-guard.ts — set sensitive_detected, generate memory_suggestion | 3.4.1 | S |
| 3.4.3 | Write security guard tests | security-guard.test.ts — invalid dirs, sensitive patterns, suggestion generation | 3.4.2 | S |
Phase 4: Tool Layer (TypeScript)
4.1 Tool Implementations
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 4.1.1 | Implement obsidian_rag_search tool | tools/search.ts — validate params, call LanceDB search, apply filters, flag sensitive, return envelope | 2.2.1, 3.3.1, 3.4.2 | M |
| 4.1.2 | Implement obsidian_rag_index tool | tools/index.ts — validate mode, spawn indexer, return job_id, track progress | 2.3.1, 2.3.3, 3.3.1 | M |
| 4.1.3 | Implement obsidian_rag_status tool | tools/status.ts — return health state, index stats, active job, ollama status | 3.1.2, 2.3.2, 3.3.1 | S |
| 4.1.4 | Implement obsidian_rag_memory_store tool | tools/memory.ts — validate key/value/source, persist to OpenClaw memory | 3.3.1 | S |
| 4.1.5 | Write tool unit tests | search.test.ts, index.test.ts, memory.test.ts — param validation, filter logic, response shape | 4.1.1–4.1.4 | M |
4.2 Plugin Registration
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 4.2.1 | Implement plugin entry point | index.ts — Plugin.onLoad (probe deps, start watcher), register tools, Plugin.onUnload | 4.1.1–4.1.4, 3.2.3, 3.1.2 | M |
| 4.2.2 | Verify OpenClaw plugin lifecycle | Manual test: install → register → call tools → shutdown | 4.2.1 | S |
Phase 5: Integration & Hardening
5.1 Integration Tests
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 5.1.1 | Full pipeline integration test | Index KnowledgeVault → search → verify results | 1.6.5, 4.2.1 | M |
| 5.1.2 | Sync cycle integration test | Modify vault file → auto-sync → search returns updated content | 3.2.3, 5.1.1 | M |
| 5.1.3 | Health state integration test | Stop Ollama → verify degraded → restart → verify healthy | 3.1.2, 5.1.1 | S |
| 5.1.4 | OpenClaw protocol integration test | Agent calls all 4 tools, validates envelope, error paths | 4.2.1 | M |
5.2 Security Test Suite
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 5.2.1 | Path traversal tests | ../, symlinks, absolute paths, encoded paths, Windows-specific (C:, UNC) | 1.2.1, 3.4.1 | S |
| 5.2.2 | XSS prevention tests | HTML/script injection in chunk_text, response rendering | 1.2.2 | S |
| 5.2.3 | Prompt injection tests | Malicious vault note content attempting agent manipulation | 4.1.1 | S |
| 5.2.4 | Network audit test | Verify zero outbound requests when local_only=true | 1.4.1 | S |
| 5.2.5 | Sensitive content tests | Pattern detection, flagging in search results, blocking on external API | 1.2.3, 3.4.2 | S |
5.3 Documentation & Publishing
| ID | Name | Delivers | Depends on | Effort |
|---|---|---|---|---|
| 5.3.1 | Write README | Usage, setup, config reference, CLI commands, OpenClaw integration | 4.2.1 | S |
| 5.3.2 | Create SKILL.md | Skill manifest for ClawHub publishing | 4.2.1 | S |
| 5.3.3 | Publish to ClawHub | clawhub skill publish + clawhub package publish | 5.1.1–5.2.5 | S |
Dependency Map (Critical Path)
Critical path: 0.1 → 1.1 → 1.3 → 1.6 → 2.2 → 3.1 → 3.2 → 4.1 → 4.2 → 5.1
Parallelizable work:
- 1.2 (Python security) can run parallel with 1.3 (chunker) after 1.1
- 1.4 (embedder) can run parallel with 1.3 after 1.1
- 1.5 (vector store) can run parallel with 1.3–1.4 after 0.2.2
- 2.1 (TS config) can run parallel with Phase 1 after 0.1.1
- 3.3 (response envelope) can run parallel with 3.1–3.2 after 0.1.1
- 3.4 (security guard) can run parallel with 3.1–3.2 after 2.1.1
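The parallelization notes above can be cross-checked mechanically. The sketch below condenses the "Depends on" columns to work-area level (one reading of the tables, not an authoritative graph) and uses Python's standard `graphlib` to group areas into waves that could proceed in parallel. The function name `parallel_waves` is hypothetical.

```python
from graphlib import TopologicalSorter

# Work area -> work areas it depends on, condensed from the package tables.
AREA_DEPS = {
    "0.1": [], "0.2": [],
    "1.1": ["0.1"], "1.2": ["1.1"], "1.3": ["0.1"], "1.4": ["1.1"],
    "1.5": ["0.2"], "1.6": ["1.2", "1.3", "1.4", "1.5"],
    "2.1": ["0.1"], "2.2": ["0.1"], "2.3": ["0.1"],
    "3.1": ["2.1", "2.2", "2.3"], "3.2": ["2.1", "2.3", "3.1"],
    "3.3": ["0.1"], "3.4": ["2.1"],
    "4.1": ["2.2", "2.3", "3.1", "3.3", "3.4"], "4.2": ["3.1", "3.2", "4.1"],
    "5.1": ["1.6", "3.1", "3.2", "4.2"], "5.2": ["1.2", "1.4", "3.4", "4.1"],
    "5.3": ["4.2", "5.1", "5.2"],
}


def parallel_waves(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group nodes into waves; every node in a wave has all dependencies done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(parallel_waves(AREA_DEPS), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Each wave lists areas whose prerequisites are already complete, so areas within a wave are candidates for parallel work. A cycle accidentally introduced into the "Depends on" columns would surface as a `CycleError` rather than as a scheduling surprise later.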
Effort Summary
| Phase | Work Packages | S | M | L | Estimated Sessions |
|---|---|---|---|---|---|
| 0: Scaffolding | 8 | 8 | 0 | 0 | 4–8 |
| 1: Python Data Layer | 26 | 16 | 10 | 0 | 25–40 |
| 2: TS Data Client | 7 | 5 | 2 | 0 | 9–15 |
| 3: Session & Transport | 12 | 9 | 3 | 0 | 13–20 |
| 4: Tool Layer | 7 | 3 | 4 | 0 | 10–18 |
| 5: Integration & Hardening | 12 | 9 | 3 | 0 | 15–22 |
| Total | 72 | 50 | 22 | 0 | 76–123 sessions |
Risk Items
| Risk | Impact | Mitigation |
|---|---|---|
| Ollama mxbai-embed-large model not pulled | Blocks embedding pipeline | WBS 0.2.1 validates early; pull model before Phase 1 |
| LanceDB Python API breaking changes | Schema/query code breaks | Pin lancedb version in pyproject.toml |
| OpenClaw plugin SDK not available/stable | Plugin registration fails | Stub plugin interfaces for development; defer 4.2.2 until SDK confirmed |
| Windows path handling edge cases | Security bypass or crashes | Dedicated Windows test vectors in 5.2.1 |
| chokidar unreliable on Windows | Auto-sync misses changes | Integration test 5.1.2 validates on actual Windows FS; fallback to polling if needed |
| 677 files take too long to embed | UX poor on first index | Batch embedding (64 chunks per batch) + NDJSON progress; measure actual time in 1.6.1 |