AGENTS.md
Stack
Two independent packages in one repo:
| Directory | Role | Entry | Build |
|---|---|---|---|
| src/ | TypeScript OpenClaw plugin | src/index.ts | esbuild → dist/index.js |
| python/ | Python CLI indexer | obsidian_rag/cli.py | pip install -e |
Commands
TypeScript (OpenClaw plugin):
npm run build # esbuild → dist/index.js
npm run typecheck # tsc --noEmit
npm run test # vitest run
Python (RAG indexer):
pip install -e python/ # editable install
obsidian-rag index|sync|reindex|status # CLI
pytest python/ # tests
ruff check python/ # lint
OpenClaw Plugin Install
Plugin package.json MUST have:
"openclaw": {
"extensions": ["./dist/index.js"],
"hook": []
}
- extensions = array of string paths
- hook = singular, not hooks
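The two manifest rules above can be checked before installing. This is a minimal sketch: `check_openclaw_manifest` is a hypothetical helper, not part of this repo or of OpenClaw, and it encodes only the rules stated in this section.

```python
import json


def check_openclaw_manifest(package_json_text: str) -> list[str]:
    """Return the problems found in the "openclaw" block (empty list if none)."""
    manifest = json.loads(package_json_text).get("openclaw")
    if not isinstance(manifest, dict):
        return ['missing "openclaw" object']

    problems = []
    extensions = manifest.get("extensions")
    # "extensions" must be an array of string paths, not a bare string.
    if not isinstance(extensions, list) or not all(isinstance(p, str) for p in extensions):
        problems.append('"extensions" must be an array of string paths')

    # The key is singular: "hook", not "hooks".
    if "hooks" in manifest:
        problems.append('use "hook" (singular), not "hooks"')
    if "hook" not in manifest:
        problems.append('missing "hook" key')
    return problems
```

Pass the contents of the plugin's package.json as a string; an empty result means the block matches the shape shown above.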
Config
User config lives at ~/.obsidian-rag/config.json; a dev config can be placed in ./obsidian-rag/.
Key indexing fields:
- indexing.chunk_size: sliding window chunk size (default 500)
- indexing.chunk_overlap: overlap between chunks (default 100)
- indexing.max_section_chars: max chars per section before hierarchical split (default 4000)
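As an illustration of how these fields and their defaults fit together, here is a minimal sketch of a loader. `IndexingConfig` and `load_indexing_config` are hypothetical names, not the actual code in obsidian_rag. The check that overlap is smaller than chunk size is also an assumption, added because a sliding window cannot advance otherwise.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class IndexingConfig:
    chunk_size: int = 500          # sliding-window chunk size
    chunk_overlap: int = 100       # overlap between consecutive chunks
    max_section_chars: int = 4000  # sections longer than this are split further


def load_indexing_config(config_text: str) -> IndexingConfig:
    """Parse config JSON, falling back to the documented defaults."""
    raw = json.loads(config_text).get("indexing", {})
    known_names = {f.name for f in fields(IndexingConfig)}
    # Ignore keys this sketch does not know about instead of failing on them.
    cfg = IndexingConfig(**{k: v for k, v in raw.items() if k in known_names})
    if cfg.chunk_overlap >= cfg.chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    return cfg
```

A config containing only `{"indexing": {"chunk_size": 300}}` keeps the default overlap of 100 and the default section limit of 4000.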
Key security fields:
- security.require_confirmation_for: list of categories (e.g. ["health", "financial_debt"]). An empty list disables the guard.
- security.auto_approve_sensitive: true bypasses sensitive content prompts.
- security.local_only: true blocks non-localhost Ollama.
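The local_only rule can be pictured as a host check on the Ollama URL. This is a minimal sketch under assumptions: the function names and the set of hosts treated as local are illustrative, and the real check in obsidian_rag may differ.

```python
from urllib.parse import urlparse

# Assumption: these loopback names and addresses count as "localhost".
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_ollama_url(url: str) -> bool:
    """True if the URL's host is a loopback name or address."""
    return urlparse(url).hostname in LOCAL_HOSTS


def check_ollama_url(url: str, local_only: bool) -> None:
    """Raise if local_only is set and the URL points off this machine."""
    if local_only and not is_local_ollama_url(url):
        raise ValueError(f"security.local_only is true; refusing non-local Ollama URL: {url}")
```

With local_only set to false, any URL passes; with it set to true, only loopback hosts such as http://localhost:11434 are accepted.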
Ollama Context Length
python/obsidian_rag/embedder.py truncates chunks at MAX_CHUNK_CHARS = 8000 before embedding. If Ollama returns a 500 error, lower max_section_chars (so that large sections are split sooner) or reduce chunk_size in config.
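The truncation step amounts to a length cap applied just before the embedding call. This is a minimal sketch: `truncate_for_embedding` is a hypothetical name, and only the constant and its value come from the text above.

```python
MAX_CHUNK_CHARS = 8000  # value stated above


def truncate_for_embedding(text: str, limit: int = MAX_CHUNK_CHARS) -> str:
    """Cut text to at most `limit` characters before it is sent to the embedder."""
    return text if len(text) <= limit else text[:limit]
```

Anything past the limit is silently dropped from the embedding, which is why keeping chunks well below the cap through config is preferable to relying on truncation.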
Hierarchical Chunking
Structured notes (date-named files) are split by section first. Any section longer than max_section_chars is then broken into sliding-window sub-chunks, each keeping the parent section heading. Smaller sections stay intact.
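The two-level strategy can be sketched as follows. This is an illustration, not the repo's implementation. It assumes that sections are delimited by Markdown `#` headings, that all sizes are measured in characters, and that the heading is prepended to every sub-chunk. All function names are hypothetical.

```python
import re


def sliding_window(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` chars that share `overlap` chars with the previous one."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def split_sections(note: str) -> list[tuple[str, str]]:
    """Split a note on Markdown headings into (heading, body) pairs."""
    sections, heading, body = [], "", []
    for line in note.splitlines():
        if re.match(r"#{1,6}\s", line):
            if heading or body:
                sections.append((heading, "\n".join(body).strip()))
            heading, body = line.strip(), []
        else:
            body.append(line)
    if heading or body:
        sections.append((heading, "\n".join(body).strip()))
    return sections


def hierarchical_chunks(note: str, chunk_size: int = 500, chunk_overlap: int = 100,
                        max_section_chars: int = 4000) -> list[str]:
    """Section-split first, then sliding-window only inside oversized sections."""
    chunks = []
    for heading, body in split_sections(note):
        if len(body) <= max_section_chars:
            chunks.append(f"{heading}\n{body}".strip())  # small section stays intact
        else:
            for piece in sliding_window(body, chunk_size, chunk_overlap):
                chunks.append(f"{heading}\n{piece}".strip())  # heading preserved
    return chunks
```

Repeating the heading in every sub-chunk costs a few characters per chunk but keeps each embedded piece tied to its section.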
Sensitive Content Guard
Triggered by categories in require_confirmation_for. Raises SensitiveContentError from obsidian_rag/indexer.py.
To disable: set require_confirmation_for: [] or auto_approve_sensitive: true in config.
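The interaction between the two settings can be sketched like this. It is a simplified stand-in, not the code in obsidian_rag/indexer.py: how a note's categories are detected is not covered here, so `note_categories` is assumed to be supplied by the caller, and `guard_sensitive` is a hypothetical name.

```python
class SensitiveContentError(Exception):
    """Stand-in for the error raised from obsidian_rag/indexer.py."""


def guard_sensitive(note_categories, require_confirmation_for, auto_approve_sensitive=False):
    """Raise if the note falls in a guarded category and nothing approves it."""
    # Either setting disables the guard: auto-approval, or an empty category list.
    if auto_approve_sensitive or not require_confirmation_for:
        return
    flagged = sorted(set(note_categories) & set(require_confirmation_for))
    if flagged:
        raise SensitiveContentError(f"confirmation required for: {', '.join(flagged)}")
```

A note tagged "health" passes when the list is empty or auto-approval is on, and raises when "health" is listed in require_confirmation_for.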
Architecture
User query → OpenClaw (TypeScript plugin src/index.ts)
→ obsidian_rag_* tools (python/obsidian_rag/)
→ Ollama embeddings (http://localhost:11434)
→ LanceDB vector store