## What's new

**Python indexer (`python/obsidian_rag/`)**, full pipeline from scan to LanceDB:

- `config.py`: JSON config loader with cross-platform path resolution
- `security.py`: path traversal prevention, HTML stripping, sensitive content detection, dir allow/deny lists
- `chunker.py`: section-split for journal entries (date-named files), sliding-window for unstructured notes
- `embedder.py`: Ollama `/api/embeddings` client with batched requests and timeout/error handling
- `vector_store.py`: LanceDB schema, upsert (merge_insert), delete, search with filters, stats
- `indexer.py`: full/sync/reindex pipeline orchestrator with progress yields
- `cli.py`: `index | sync | reindex | status` CLI commands

**TypeScript plugin (`src/`)**, OpenClaw plugin scaffold:

- `utils/`: config loader, TypeScript types, response envelope factory, LanceDB client
- `services/`: health state machine (HEALTHY/DEGRADED/UNAVAILABLE), vault watcher with debounce/batching, indexer bridge (subprocess spawner)
- `tools/`: 4 tool stubs: search, index, status, memory_store (OpenClaw wiring pending)
- `index.ts`: plugin entry point with health probe + vault watcher startup

**Config** (`obsidian-rag/config.json`, `openclaw.plugin.json`):

- 627 files / 3764 chunks indexed in dev vault

**Tests: 76 passing**

- Python: 64 pytest tests (chunker, security, vector_store, config)
- TypeScript: 12 vitest tests (lancedb client, response envelope)

## Bugs fixed

- LanceDB `tags` column filter: `LIKE '%tag%'` → `list_contains(tags, 'tag')` (List<String> column)
- LanceDB JS `db.list_tables()` returns `ListTablesResponse` object, not plain array
- LanceDB JS result score field: `_score` → `_distance`
- TypeScript regex literal with unescaped `/` in path-resolve regex
- Python: `create_table_if_not_exists` identity check → name comparison

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
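The two chunking strategies named for `chunker.py` can be illustrated with a minimal sketch. This is not the code in the commit: the function names, the ISO-date filename rule, the heading-based section split, and the word-based window sizes are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumption: a "date-named" file is one whose stem is an ISO date, e.g. 2025-03-15.md.
DATE_STEM = re.compile(r"\d{4}-\d{2}-\d{2}")
# Assumption: sections start at Markdown headings (1 to 6 '#' followed by a space).
HEADING = re.compile(r"^#{1,6}[ \t]+\S", re.MULTILINE)


def is_journal(path: str) -> bool:
    """True when the file name (without extension) is exactly YYYY-MM-DD."""
    return DATE_STEM.fullmatch(Path(path).stem) is not None


def split_sections(text: str) -> list[str]:
    """Split at headings; text before the first heading becomes its own chunk."""
    starts = [m.start() for m in HEADING.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    parts = (text[a:b].strip() for a, b in zip(bounds, bounds[1:]))
    return [p for p in parts if p]


def sliding_window(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Word-based windows; consecutive windows share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def chunk(path: str, text: str) -> list[str]:
    """Pick the strategy from the file name, as the commit message describes."""
    return split_sections(text) if is_journal(path) else sliding_window(text)
```

For example, `chunk("Journal/2025-03-15.md", text)` would split on headings, while `chunk("Notes/ideas.md", text)` would fall back to overlapping windows. The overlap keeps a sentence that straddles a window boundary retrievable from at least one chunk.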
97 lines
3.6 KiB
JSON
{
  "schema_version": "1.0",
  "name": "obsidian-rag",
  "version": "0.1.0",
  "description": "Semantic search through Obsidian vault notes using RAG. Powers natural language queries like 'How was my mental health in 2024?' across journal entries, financial records, health data, and more.",
  "author": "Santhosh Janardhanan",
  "tools": [
    {
      "name": "obsidian_rag_search",
      "description": "Primary semantic search tool. Given a natural language query, searches the Obsidian vault index and returns the most relevant note chunks ranked by semantic similarity. Supports filtering by directory, date range, and tags.",
      "parameter_schema": {
        "type": "object",
        "properties": {
          "query": {
            "type": "string",
            "description": "Natural language question or topic to search for"
          },
          "max_results": {
            "type": "integer",
            "description": "Maximum number of chunks to return",
            "default": 5,
            "minimum": 1,
            "maximum": 50
          },
          "directory_filter": {
            "type": "array",
            "description": "Limit search to specific vault subdirectories (e.g. ['Journal', 'Finance'])",
            "items": { "type": "string" }
          },
          "date_range": {
            "type": "object",
            "description": "Filter by date range",
            "properties": {
              "from": { "type": "string", "description": "Start date (YYYY-MM-DD)" },
              "to": { "type": "string", "description": "End date (YYYY-MM-DD)" }
            }
          },
          "tags": {
            "type": "array",
            "description": "Filter by hashtags found in notes (e.g. ['#mentalhealth', '#therapy'])",
            "items": { "type": "string" }
          }
        },
        "required": ["query"]
      },
      "required_permissions": []
    },
    {
      "name": "obsidian_rag_index",
      "description": "Trigger indexing of the Obsidian vault. Use 'full' for first-time setup, 'sync' for incremental updates, 'reindex' to force a clean rebuild.",
      "parameter_schema": {
        "type": "object",
        "properties": {
          "mode": {
            "type": "string",
            "description": "Indexing mode",
            "enum": ["full", "sync", "reindex"]
          }
        },
        "required": ["mode"]
      },
      "required_permissions": []
    },
    {
      "name": "obsidian_rag_status",
      "description": "Check the health of the Obsidian RAG plugin — index statistics, last sync time, unindexed files, and Ollama status. Call this first when unsure if the index is ready.",
      "parameter_schema": {
        "type": "object",
        "properties": {}
      },
      "required_permissions": []
    },
    {
      "name": "obsidian_rag_memory_store",
      "description": "Commit an important fact from search results to OpenClaw's memory for faster future retrieval. Use after finding significant information (e.g. 'I owe Sreenivas $50') that should be remembered.",
      "parameter_schema": {
        "type": "object",
        "properties": {
          "key": {
            "type": "string",
            "description": "Identifier for the fact (e.g. 'debt_to_sreenivas')"
          },
          "value": {
            "type": "string",
            "description": "The fact to remember"
          },
          "source": {
            "type": "string",
            "description": "Source file path in the vault (e.g. 'Journal/2025-03-15.md')"
          }
        },
        "required": ["key", "value", "source"]
      },
      "required_permissions": []
    }
  ]
}
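The constraints that the manifest places on `obsidian_rag_search` arguments (required `query`, `max_results` defaulting to 5 and bounded to 1..50, string arrays for `directory_filter` and `tags`, `YYYY-MM-DD` strings in `date_range`) can be checked before a call is dispatched. The sketch below is a hand-written check using only the standard library; `build_search_args` and its helpers are hypothetical names, not part of the plugin, and treating the `YYYY-MM-DD` hint in the descriptions as a hard rule is an assumption.

```python
import re
from datetime import date

ISO_DAY = re.compile(r"\d{4}-\d{2}-\d{2}")


def _check_day(label, value):
    """Assumption: the YYYY-MM-DD hint in the manifest is enforced strictly."""
    if not isinstance(value, str) or not ISO_DAY.fullmatch(value):
        raise ValueError(f"{label} must be a YYYY-MM-DD string")
    date.fromisoformat(value)  # raises ValueError for impossible dates


def _check_str_list(label, value):
    if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
        raise ValueError(f"{label} must be a list of strings")


def build_search_args(query, max_results=5, directory_filter=None,
                      date_range=None, tags=None):
    """Return an argument dict for obsidian_rag_search, or raise ValueError."""
    if not isinstance(query, str):
        raise ValueError("query must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("max_results must be an integer")
    if not 1 <= max_results <= 50:
        raise ValueError("max_results must be between 1 and 50")

    args = {"query": query, "max_results": max_results}
    if directory_filter is not None:
        _check_str_list("directory_filter", directory_filter)
        args["directory_filter"] = directory_filter
    if date_range is not None:
        if not isinstance(date_range, dict):
            raise ValueError("date_range must be an object")
        for key in ("from", "to"):
            if key in date_range:
                _check_day(f"date_range.{key}", date_range[key])
        args["date_range"] = dict(date_range)
    if tags is not None:
        _check_str_list("tags", tags)
        args["tags"] = tags
    return args
```

A call such as `build_search_args("How was my mental health in 2024?", tags=["#therapy"])` returns a dict ready to serialize, while `max_results=51` is rejected up front instead of failing inside the tool.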