obsidian-rag/openclaw.plugin.json
Santhosh Janardhanan 5c281165c7 Sprint 0-1: Python indexer, TS plugin scaffolding, and test suite
## What's new

**Python indexer (`python/obsidian_rag/`)** — full pipeline from scan to LanceDB:
- `config.py` — JSON config loader with cross-platform path resolution
- `security.py` — path traversal prevention, HTML stripping, sensitive content detection, dir allow/deny lists
- `chunker.py` — section-split for journal entries (date-named files), sliding-window for unstructured notes
- `embedder.py` — Ollama `/api/embeddings` client with batched requests and timeout/error handling
- `vector_store.py` — LanceDB schema, upsert (merge_insert), delete, search with filters, stats
- `indexer.py` — full/sync/reindex pipeline orchestrator with progress yields
- `cli.py` — `index | sync | reindex | status` CLI commands
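
The two chunking strategies named for `chunker.py` can be sketched roughly as below. This is an illustration, not the code in the commit: the function names, the split on markdown headings, the `YYYY-MM-DD.md` filename pattern, and the 500/100 character window and overlap are all assumptions.

```python
import re

# Assumed journal naming convention, e.g. 2025-03-15.md
DATE_NAMED = re.compile(r"^\d{4}-\d{2}-\d{2}\.md$")


def sliding_window(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size windows that overlap by `overlap` characters."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def split_sections(text: str) -> list[str]:
    """Split on markdown headings, keeping each heading with its body."""
    parts = re.split(r"(?m)^(?=#{1,6} )", text)
    return [p.strip() for p in parts if p.strip()]


def chunk_note(filename: str, text: str) -> list[str]:
    """Date-named files are split by section, everything else by sliding window."""
    if DATE_NAMED.match(filename):
        return split_sections(text)
    return sliding_window(text)
```

The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.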

**TypeScript plugin (`src/`)** — OpenClaw plugin scaffold:
- `utils/` — config loader, TypeScript types, response envelope factory, LanceDB client
- `services/` — health state machine (HEALTHY/DEGRADED/UNAVAILABLE), vault watcher with debounce/batching, indexer bridge (subprocess spawner)
- `tools/` — 4 tool stubs: search, index, status, memory_store (OpenClaw wiring pending)
- `index.ts` — plugin entry point with health probe + vault watcher startup

**Config** (`obsidian-rag/config.json`, `openclaw.plugin.json`):
- Dev vault indexed with this config: 627 files, 3764 chunks

**Tests: 76 passing**
- Python: 64 pytest tests (chunker, security, vector_store, config)
- TypeScript: 12 vitest tests (lancedb client, response envelope)

## Bugs fixed

- LanceDB `tags` column filter: `LIKE '%tag%'` → `list_contains(tags, 'tag')` (List<String> column)
- LanceDB JS `db.list_tables()` returns `ListTablesResponse` object, not plain array
- LanceDB JS result score field: `_score` → `_distance`
- TypeScript: unescaped `/` inside the regex literal used for path resolution
- Python: `create_table_if_not_exists` identity check → name comparison
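
The first fix above changes how a tag filter is expressed against a list-typed column. A minimal sketch of building such a predicate string follows; `tags_filter` is a hypothetical helper, and joining multiple tags with `AND` (match all) is an assumption the commit does not confirm.

```python
def tags_filter(tags: list[str]) -> str:
    """Build a SQL-style predicate matching rows whose `tags` list contains every given tag."""
    clauses = []
    for tag in tags:
        # Double any single quote so the string literal stays well-formed
        escaped = tag.replace("'", "''")
        clauses.append(f"list_contains(tags, '{escaped}')")
    return " AND ".join(clauses)


print(tags_filter(["#mentalhealth", "#therapy"]))
```

A `LIKE '%tag%'` pattern only makes sense on a string column, which is why the list-typed `tags` column needs a containment function instead. An empty tag list yields an empty string, so a caller would presumably skip the filter in that case.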

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-10 22:56:50 -04:00

{
  "schema_version": "1.0",
  "name": "obsidian-rag",
  "version": "0.1.0",
  "description": "Semantic search through Obsidian vault notes using RAG. Powers natural language queries like 'How was my mental health in 2024?' across journal entries, financial records, health data, and more.",
  "author": "Santhosh Janardhanan",
  "tools": [
    {
      "name": "obsidian_rag_search",
      "description": "Primary semantic search tool. Given a natural language query, searches the Obsidian vault index and returns the most relevant note chunks ranked by semantic similarity. Supports filtering by directory, date range, and tags.",
      "parameter_schema": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string",
            "description": "Natural language question or topic to search for"
          },
          "max_results": {
            "type": "integer",
            "description": "Maximum number of chunks to return",
            "default": 5,
            "minimum": 1,
            "maximum": 50
          },
          "directory_filter": {
            "type": "array",
            "description": "Limit search to specific vault subdirectories (e.g. ['Journal', 'Finance'])",
            "items": { "type": "string" }
          },
          "date_range": {
            "type": "object",
            "description": "Filter by date range",
            "properties": {
              "from": { "type": "string", "description": "Start date (YYYY-MM-DD)" },
              "to": { "type": "string", "description": "End date (YYYY-MM-DD)" }
            }
          },
          "tags": {
            "type": "array",
            "description": "Filter by hashtags found in notes (e.g. ['#mentalhealth', '#therapy'])",
            "items": { "type": "string" }
          }
        },
        "required": ["query"]
      },
      "required_permissions": []
    },
    {
      "name": "obsidian_rag_index",
      "description": "Trigger indexing of the Obsidian vault. Use 'full' for first-time setup, 'sync' for incremental updates, 'reindex' to force a clean rebuild.",
      "parameter_schema": {
        "type": "object",
        "properties": {
          "mode": {
            "type": "string",
            "description": "Indexing mode",
            "enum": ["full", "sync", "reindex"]
          }
        },
        "required": ["mode"]
      },
      "required_permissions": []
    },
    {
      "name": "obsidian_rag_status",
      "description": "Check the health of the Obsidian RAG plugin — index statistics, last sync time, unindexed files, and Ollama status. Call this first when unsure if the index is ready.",
      "parameter_schema": {
        "type": "object",
        "properties": {}
      },
      "required_permissions": []
    },
    {
      "name": "obsidian_rag_memory_store",
      "description": "Commit an important fact from search results to OpenClaw's memory for faster future retrieval. Use after finding significant information (e.g. 'I owe Sreenivas $50') that should be remembered.",
      "parameter_schema": {
        "type": "object",
        "properties": {
          "key": {
            "type": "string",
            "description": "Identifier for the fact (e.g. 'debt_to_sreenivas')"
          },
          "value": {
            "type": "string",
            "description": "The fact to remember"
          },
          "source": {
            "type": "string",
            "description": "Source file path in the vault (e.g. 'Journal/2025-03-15.md')"
          }
        },
        "required": ["key", "value", "source"]
      },
      "required_permissions": []
    }
  ]
}
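
The `parameter_schema` for `obsidian_rag_search` implies a handful of checks a handler would likely run before querying the index. The sketch below uses only the standard library; `normalize_search_args` is a hypothetical helper, and rejecting a reversed date range is an assumption that the schema itself does not state.

```python
from datetime import date


def normalize_search_args(args: dict) -> dict:
    """Apply the schema's default and bounds to raw tool arguments."""
    query = args.get("query")
    if not isinstance(query, str):
        raise ValueError("'query' is required and must be a string")

    max_results = args.get("max_results", 5)  # default taken from the schema
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("'max_results' must be an integer")
    if not 1 <= max_results <= 50:
        raise ValueError("'max_results' must be between 1 and 50")

    out = {"query": query, "max_results": max_results}

    rng = args.get("date_range") or {}
    # date.fromisoformat raises ValueError on malformed dates
    start = date.fromisoformat(rng["from"]) if "from" in rng else None
    end = date.fromisoformat(rng["to"]) if "to" in rng else None
    if start and end and start > end:
        # Assumption: a reversed range is treated as an error
        raise ValueError("'date_range.from' must not be after 'date_range.to'")
    out["date_range"] = (start, end)

    for key in ("directory_filter", "tags"):
        value = args.get(key, [])
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            raise ValueError(f"'{key}' must be a list of strings")
        out[key] = value
    return out
```

Only `query` is required, so a call such as `normalize_search_args({"query": "How was my mental health in 2024?"})` falls back to five results with no directory, date, or tag filters.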