OpenClaw install instructions added

2026-04-11 15:17:44 -04:00
parent 4e991c329e
commit 90d6f83937

671
INSTALL.md Normal file

@@ -0,0 +1,671 @@
# Obsidian-RAG — Installation Guide for OpenClaw
**What this plugin does:** Indexes an Obsidian vault into LanceDB using Ollama embeddings, then powers four OpenClaw tools — `obsidian_rag_search`, `obsidian_rag_index`, `obsidian_rag_status`, and `obsidian_rag_memory_store` — so OpenClaw can answer natural-language questions over your personal notes (journal, finance, health, relationships, etc.).
**Stack:**
- Python 3.11+ CLI → LanceDB vector store + Ollama embeddings
- TypeScript/OpenClaw plugin → OpenClaw agent tools
- Ollama (local) → embedding inference
---
## Table of Contents
1. [Prerequisites](#1-prerequisites)
2. [Clone the Repository](#2-clone-the-repository)
3. [Install Ollama + Embedding Model](#3-install-ollama--embedding-model)
4. [Install Python CLI (Indexer)](#4-install-python-cli-indexer)
5. [Install Node.js / TypeScript Plugin](#5-install-nodejs--typescript-plugin)
6. [Configure the Plugin](#6-configure-the-plugin)
7. [Run the Initial Index](#7-run-the-initial-index)
8. [Register the Plugin with OpenClaw](#8-register-the-plugin-with-openclaw)
9. [Verify Everything Works](#9-verify-everything-works)
10. [Keeping the Index Fresh](#10-keeping-the-index-fresh)
11. [Troubleshooting](#11-troubleshooting)
---
## 1. Prerequisites
| Component | Required Version | Why |
|---|---|---|
| Python | ≥ 3.11 | Async I/O, modern type hints |
| Node.js | ≥ 18 | ESM modules, `node:` imports |
| npm | any recent | installs TypeScript deps |
| Ollama | running on `localhost:11434` | local embedding inference |
| Disk space | ~500 MB free | LanceDB store grows with vault |
**Verify your environment:**
```bash
python --version # → Python 3.11.x or higher
node --version # → v18.x.x or higher
npm --version # → 9.x.x or higher
curl http://localhost:11434/api/tags # → {"models": [...]} if Ollama is running
```
If Ollama is not running yet, skip to [§3](#3-install-ollama--embedding-model) before continuing.
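
The version checks above can also be scripted. The following is a minimal sketch, assuming the version strings look like the ones in the comments above (`Python 3.11.x`, `v18.x.x`); `parse_version` and `meets_minimum` are illustrative helpers and not part of obsidian-rag.

```python
import re


def parse_version(text: str) -> tuple[int, ...]:
    """Extract the first dotted number from strings like 'Python 3.11.4' or 'v18.17.0'."""
    match = re.search(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(text: str, minimum: tuple[int, ...]) -> bool:
    """True if the version in `text` is at least `minimum` (tuple comparison, left to right)."""
    return parse_version(text) >= minimum


# Minimums taken from the prerequisites table.
print(meets_minimum("Python 3.11.4", (3, 11)))  # → True
print(meets_minimum("v16.20.2", (18,)))         # → False
```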
---
## 2. Clone the Repository
```bash
# Replace DESTINATION with where you want the project to live.
# The project root must be writable (not inside /System or a read-only mount).
DESTINATION="$HOME/dev/obsidian-rag"
mkdir -p "$HOME/dev"
git clone https://github.com/YOUR_GITHUB_USER/obsidian-rag.git "$DESTINATION"
cd "$DESTINATION"
```
> **Important:** `obsidian-rag/config.json`, `obsidian-rag/vectors.lance/`, and `obsidian-rag/sync-result.json` are created at runtime below the project root, so choose a destination you have write access to.
> **Note for existing clones:** If you are re-running this guide on an already-cloned copy, pull the latest changes first:
> ```bash
> git pull origin main
> ```
---
## 3. Install Ollama + Embedding Model
The plugin requires Ollama running locally with the `mxbai-embed-large:335m` embedding model.
### 3.1 Install Ollama
**macOS / Linux:**
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
**Windows:** Download the installer from https://ollama.com/download
**Verify:**
```bash
ollama --version
```
### 3.2 Start Ollama
```bash
ollama serve &
# Give it 2 seconds to bind to port 11434
sleep 2
curl http://localhost:11434/api/tags
# → {"models": []}
```
> **Auto-start tip:** On macOS, consider installing Ollama as a LaunchAgent so it survives reboots.
> On Linux systemd: `sudo systemctl enable ollama`
### 3.3 Pull the Embedding Model
```bash
ollama pull mxbai-embed-large:335m
```
This downloads ~335 MB. Expected output:
```
pulling manifest
pulling 4a5b... 100%
verifying sha256 digest
writing manifest
success
```
**Verify the model is available:**
```bash
ollama list
# → NAME ID SIZE MODIFIED
# → mxbai-embed-large:335m 7c6d... 335 MB 2026-04-...
```
> **Model note:** The config (`obsidian-rag/config.json`) defaults to `mxbai-embed-large:335m`. If you use a different model, update `embedding.model` and `embedding.dimensions` in the config file (see [§6](#6-configure-the-plugin)).
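
The two fields have to change together because the vector length is fixed by the model. As a hedged sketch, the check below compares a returned embedding against `embedding.dimensions`; `check_embedding_dimensions` is a hypothetical helper, and the value 1024 is assumed here because that is the output size of `mxbai-embed-large` (the shipped default in `config.json` is not shown in this guide).

```python
def check_embedding_dimensions(config: dict, vector: list[float]) -> None:
    """Raise ValueError if an embedding does not match the configured dimensions."""
    expected = config["embedding"]["dimensions"]
    if len(vector) != expected:
        model = config["embedding"]["model"]
        raise ValueError(
            f"{model} returned {len(vector)} dimensions, config expects {expected}"
        )


# Assumed config values; a real vector would come from Ollama.
config = {"embedding": {"model": "mxbai-embed-large:335m", "dimensions": 1024}}
check_embedding_dimensions(config, [0.0] * 1024)  # matches, so nothing is raised
```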
---
## 4. Install Python CLI (Indexer)
The Python CLI (`obsidian-rag`) handles all vault scanning, chunking, embedding, and LanceDB storage.
### 4.1 Create a Virtual Environment
Using a virtual environment isolates this project's dependencies from your system Python.
**macOS / Linux:**
```bash
cd "$DESTINATION"
python -m venv .venv
source .venv/bin/activate
```
**Windows (PowerShell):**
```powershell
cd "$DESTINATION"
python -m venv .venv
.venv\Scripts\Activate.ps1
```
**Windows (CMD):**
```cmd
cd %DESTINATION%
python -m venv .venv
.venv\Scripts\activate.bat
```
You should now see `(.venv)` prepended to your shell prompt.
### 4.2 Install the Package in Editable Mode
```bash
pip install -e python/
```
This installs all runtime dependencies:
- `lancedb` — vector database
- `httpx` — HTTP client for Ollama
- `pyyaml` — config file parsing
- `python-frontmatter` — YAML frontmatter extraction
**Verify the CLI is accessible:**
```bash
obsidian-rag --help
```
Expected output:
```
usage: obsidian-rag [-h] {index,sync,reindex,status}
positional arguments:
{index,sync,reindex,status}
index Full vault index (scan → chunk → embed → store)
sync Incremental sync (only changed files)
reindex Force clean rebuild (deletes existing index)
status Show index health and statistics
```
> **Python path tip:** The CLI entry point (`obsidian-rag`) is installed into `.venv/bin/` (`.venv\Scripts\` on Windows). Always activate the venv before running CLI commands:
> ```bash
> source .venv/bin/activate # macOS/Linux
> .venv\Scripts\Activate.ps1 # Windows PowerShell
> ```
> **Without venv:** If you prefer a system-wide install instead of a venv, skip step 4.1 and run `pip install -e python/` directly. Not recommended if you have other Python projects with conflicting dependencies.
---
## 5. Install Node.js / TypeScript Plugin
The TypeScript plugin registers the OpenClaw tools (`obsidian_rag_search`, `obsidian_rag_index`, `obsidian_rag_status`, `obsidian_rag_memory_store`).
### 5.1 Install npm Dependencies
```bash
cd "$DESTINATION"
npm install
```
This installs into `node_modules/` and writes `package-lock.json`. Packages include:
- `openclaw` — plugin framework
- `@lancedb/lancedb` — vector DB client (Node.js bindings)
- `chokidar` — file system watcher for auto-sync
- `yaml` — config file parsing
### 5.2 Build the Plugin
```bash
npm run build
```
This compiles `src/index.ts` → `dist/index.js` (a single ESM bundle, ~131 KB).
Expected output:
```
dist/index.js 131.2kb
Done in ~1s
```
> **Watch mode (development):** Run `npm run dev` to rebuild automatically on file changes.
> **Type checking (optional but recommended):**
> ```bash
> npm run typecheck
> ```
> Should produce no errors.
---
## 6. Configure the Plugin
All configuration lives in `obsidian-rag/config.json` relative to the project root.
### 6.1 Inspect the Default Config
```bash
cat "$DESTINATION/obsidian-rag/config.json"
```
### 6.2 Key Fields to Customize
| Field | Default | Change if… |
|---|---|---|
| `vault_path` | `"./KnowledgeVault/Default"` | Your vault is in a different location |
| `embedding.model` | `"mxbai-embed-large:335m"` | You pulled a different Ollama model |
| `embedding.base_url` | `"http://localhost:11434"` | Ollama runs on a different host/port |
| `vector_store.path` | `"./obsidian-rag/vectors.lance"` | You want data in a different directory |
| `deny_dirs` | `[".obsidian", ".trash", ...]` | You want to skip or allow additional directories |
### 6.3 Set Your Vault Path
**Option A — Relative to the project root (recommended):**
Symlink or place your vault relative to the project:
```bash
# Example: your vault sits next to the project, at ~/dev/obsidian-vault
# In config.json:
"vault_path": "../obsidian-vault"
```
**Option B — Absolute path:**
```json
"vault_path": "/Users/yourusername/obsidian-vault"
```
**Option C — Windows absolute path:**
```json
"vault_path": "C:\\Users\\YourUsername\\obsidian-vault"
```
> **Path validation:** The CLI validates that `vault_path` exists on the filesystem before indexing. You can verify manually:
> ```bash
> cd "$DESTINATION" # run from the project root so a relative vault_path resolves correctly
> ls obsidian-rag/config.json
> python3 -c "
> import json, os
> with open('obsidian-rag/config.json') as f:
>     cfg = json.load(f)
> assert os.path.isdir(cfg['vault_path']), 'vault_path does not exist'
> print('Vault path OK:', cfg['vault_path'])
> "
> ```
---
## 7. Run the Initial Index
This is a one-time step that scans every `.md` file in your vault, chunks them, embeds them via Ollama, and stores them in LanceDB.
```bash
# Make sure the venv is active
source .venv/bin/activate # macOS/Linux
# .venv\Scripts\activate # Windows
obsidian-rag index
```
**Expected output (truncated):**
```json
{
"type": "complete",
"indexed_files": 627,
"total_chunks": 3764,
"duration_ms": 45230,
"errors": []
}
```
### What happens during `index`:
1. **Vault walk** — traverses all subdirectories, skipping `deny_dirs` (`.obsidian`, `.trash`, `zzz-Archive`, etc.)
2. **Frontmatter parse** — extracts YAML frontmatter, headings, tags, and dates from each `.md` file
3. **Chunking** — structured notes (journal entries) split by `# heading`; unstructured notes use a 500-token sliding window with 100-token overlap
4. **Embedding** — batches of 64 chunks sent to Ollama `/api/embeddings` endpoint
5. **Storage** — vectors upserted into LanceDB at `obsidian-rag/vectors.lance/`
6. **Sync record** — writes `obsidian-rag/sync-result.json` with timestamp and stats
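
The sliding-window rule from step 3 can be sketched as follows. This is an illustration under stated assumptions, not the code in `chunker.py`: whitespace splitting stands in for whatever tokenizer the indexer actually uses, and `sliding_window` is a hypothetical name.

```python
def sliding_window(tokens: list[str], size: int = 500, overlap: int = 100) -> list[list[str]]:
    """Split tokens into windows of `size`, each sharing `overlap` tokens with the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # each new window starts this many tokens later
    windows = []
    for start in range(0, len(tokens), step):
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end of the note
    return windows


# Whitespace tokens are an assumption; the real tokenizer may count differently.
tokens = ("word " * 1200).split()
chunks = sliding_window(tokens)
print([len(chunk) for chunk in chunks])  # → [500, 500, 400]
```

With a 500-token window and a 100-token overlap, each window starts 400 tokens after the previous one, so a 1200-token note yields three chunks.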
> **Time estimate:** ~30–60 seconds for 500–700 files on a modern machine. The embedding step is the bottleneck; Ollama must process each batch sequentially.
>
> **Batch size tuning:** If embedding is slow, reduce `embedding.batch_size` in `config.json` (e.g., `"batch_size": 32`).
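
To see what changing `batch_size` does to the number of embedding requests, here is a small sketch. `batched` is an illustrative helper rather than the indexer's own function, and the chunk count is taken from the sample output above. Whether smaller batches actually run faster depends on the machine.

```python
from collections.abc import Iterator


def batched(items: list[str], batch_size: int = 64) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


chunks = [f"chunk {i}" for i in range(3764)]  # total_chunks from the sample output
print(len(list(batched(chunks, 64))))  # → 59
print(len(list(batched(chunks, 32))))  # → 118
```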
---
## 8. Register the Plugin with OpenClaw
OpenClaw auto-discovers plugins by reading the `openclaw.plugin.json` manifest in the project root and loading `dist/index.js`.
### 8.1 Register via OpenClaw's Plugin Manager
```bash
# OpenClaw CLI — register the local plugin
openclaw plugin add "$DESTINATION"
# or, if OpenClaw has a specific register command:
openclaw plugins register --path "$DESTINATION/dist/index.js" --name obsidian-rag
```
> **Note:** The exact command depends on your OpenClaw version. Check `openclaw --help` or `openclaw plugin --help` for the correct syntax. The plugin manifest (`openclaw.plugin.json`) in this project already declares all four tools.
### 8.2 Alternative — Register by Path in OpenClaw Config
If your OpenClaw installation uses a config file (e.g., `~/.openclaw/config.json` or `~/.openclaw/plugins.json`), add this project's built bundle:
```json
{
"plugins": [
{
"name": "obsidian-rag",
"path": "/full/path/to/obsidian-rag/dist/index.js"
}
]
}
```
### 8.3 Confirm the Plugin Loaded
```bash
openclaw plugins list
# → obsidian-rag 0.1.0 (loaded)
```
Or, if OpenClaw has a status command:
```bash
openclaw status
# → Plugin: obsidian-rag ✓ loaded
```
---
## 9. Verify Everything Works
### 9.1 Check Index Health
```bash
source .venv/bin/activate # macOS/Linux
obsidian-rag status
```
Expected:
```json
{
"total_docs": 627,
"total_chunks": 3764,
"last_sync": "2026-04-11T00:30:00Z"
}
```
### 9.2 Test Semantic Search (via Node)
```bash
node --input-type=module -e "
import { loadConfig } from './src/utils/config.js';
import { searchVectorDb } from './src/utils/lancedb.js';
const config = loadConfig();
console.log('Searching for: how was my mental health in 2024');
const results = await searchVectorDb(config, 'how was my mental health in 2024', { max_results: 3 });
for (const r of results) {
console.log('---');
console.log('[' + r.score.toFixed(3) + '] ' + r.source_file + ' | ' + (r.section || '(no section)'));
console.log(' ' + r.chunk_text.slice(0, 180) + '...');
}
"
```
Expected: ranked list of relevant note chunks with cosine similarity scores.
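For readers unfamiliar with the score, cosine similarity measures the angle between the query vector and a chunk vector, ignoring their lengths. A minimal pure-Python sketch follows; it is not how LanceDB computes it internally, and `cosine_similarity` is an illustrative helper.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product of a and b divided by the product of their Euclidean norms."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # → 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0
```

Scores closer to 1 mean the chunk points in nearly the same direction as the query in embedding space.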
### 9.3 Test DEGRADED Mode (Ollama Down)
If Ollama is unavailable, the plugin falls back to BM25 full-text search on `chunk_text`. Verify this:
```bash
# Stop Ollama
pkill -f ollama # macOS/Linux
# taskkill /F /IM ollama.exe # Windows
# Run the same search — should still return results via FTS
node --input-type=module -e "
import { searchVectorDb } from './src/utils/lancedb.js';
import { loadConfig } from './src/utils/config.js';
const config = loadConfig();
const results = await searchVectorDb(config, 'mental health', { max_results: 3 });
results.forEach(r => console.log('[' + r.score.toFixed(4) + '] ' + r.source_file));
"
# Restart Ollama
ollama serve &
```
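The fallback behaviour can be pictured with the sketch below. It is a hypothetical illustration, not the plugin's TypeScript implementation: `search_with_fallback`, the `NORMAL` mode label, and both stub search functions are assumptions, and the stub keyword search uses plain substring matching rather than BM25.

```python
from typing import Callable

SearchFn = Callable[[str, int], list[dict]]


def search_with_fallback(
    query: str, max_results: int, vector_search: SearchFn, fts_search: SearchFn
) -> dict:
    """Try vector search first; if the embedding server is unreachable, use full-text search."""
    try:
        return {"mode": "NORMAL", "results": vector_search(query, max_results)}
    except ConnectionError:  # also covers ConnectionRefusedError
        return {"mode": "DEGRADED", "results": fts_search(query, max_results)}


def offline_vector_search(query: str, max_results: int) -> list[dict]:
    """Stub that behaves like a vector search while Ollama is down."""
    raise ConnectionRefusedError("Ollama is not reachable")


def keyword_search(query: str, max_results: int) -> list[dict]:
    """Stub full-text search over three made-up chunks (substring match, not BM25)."""
    notes = ["mental health check-in", "tax receipts", "health insurance"]
    return [{"chunk_text": note} for note in notes if query in note][:max_results]


outcome = search_with_fallback("health", 3, offline_vector_search, keyword_search)
print(outcome["mode"], len(outcome["results"]))  # → DEGRADED 2
```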
### 9.4 Test OpenClaw Tools Directly
Ask OpenClaw to use the plugin:
```
Ask OpenClaw: "How was my mental health in 2024?"
```
OpenClaw should invoke `obsidian_rag_search` with your query and return ranked results from your journal.
```
Ask OpenClaw: "Run obsidian_rag_status"
```
OpenClaw should invoke `obsidian_rag_status` and display index stats.
---
## 10. Keeping the Index Fresh
### 10.1 Manual Incremental Sync
After editing or adding notes, run:
```bash
source .venv/bin/activate # macOS/Linux
obsidian-rag sync
```
This only re-indexes files whose `mtime` changed since the last sync. Typically <5 seconds for a handful of changed files.
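The idea behind an `mtime`-based sync can be sketched as below. This is an illustration only, assuming the last sync time is available as a Unix timestamp; `changed_notes` is a hypothetical helper and the CLI's actual bookkeeping in `sync-result.json` may differ.

```python
import os
from pathlib import Path


def changed_notes(vault: Path, last_sync: float, deny_dirs: set[str]) -> list[Path]:
    """Return .md files under `vault` whose mtime is newer than `last_sync`."""
    changed = []
    for root, dirs, files in os.walk(vault):
        # Prune denied directories in place so os.walk never descends into them.
        dirs[:] = [d for d in dirs if d not in deny_dirs]
        for name in files:
            path = Path(root) / name
            if path.suffix == ".md" and path.stat().st_mtime > last_sync:
                changed.append(path)
    return sorted(changed)
```

Only the returned files would need to be re-chunked and re-embedded, which is why a sync after a few edits is much faster than a full index.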
### 10.2 Automatic Sync via File Watcher
The TypeScript plugin includes a `VaultWatcher` service (using `chokidar`) that monitors the vault directory and auto-triggers incremental syncs on file changes.
To enable the watcher, call the watcher initialization in your OpenClaw setup or run:
```bash
node --input-type=module -e "
import { startVaultWatcher } from './src/services/vault-watcher.js';
import { loadConfig } from './src/utils/config.js';
const config = loadConfig();
const watcher = startVaultWatcher(config);
console.log('Watching vault for changes...');
// Keep process alive
setInterval(() => {}, 10000);
"
```
> **Note:** The watcher runs as a long-lived background process. Terminate it when shutting down.
### 10.3 Force Rebuild
If the index becomes corrupted or you change the chunking strategy:
```bash
obsidian-rag reindex
```
This drops the LanceDB table and rebuilds it from scratch, which is equivalent to deleting the existing index and then running `obsidian-rag index`.
### 10.4 After Upgrading the Plugin
If you pull a new version of this plugin that changed the LanceDB schema or added new indexes (e.g., the FTS index on `chunk_text`), always reindex:
```bash
obsidian-rag reindex
```
---
## 11. Troubleshooting
### `FileNotFoundError: config.json`
The CLI searches for config at:
1. `./obsidian-rag/config.json` (relative to project root, where you run `obsidian-rag`)
2. `~/.obsidian-rag/config.json` (home directory fallback)
**Fix:** Ensure you run `obsidian-rag` from the project root (`$DESTINATION`), or verify the config file exists:
```bash
ls "$DESTINATION/obsidian-rag/config.json"
```
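The lookup order described above can be reproduced with a short sketch, which may help when debugging which file is picked up. `find_config` is a hypothetical helper that mirrors the two documented locations; it is not the loader in `config.py`.

```python
from pathlib import Path


def find_config(cwd: Path, home: Path) -> Path:
    """Return the first existing config file, checking the project directory before the home directory."""
    candidates = [
        cwd / "obsidian-rag" / "config.json",
        home / ".obsidian-rag" / "config.json",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    searched = ", ".join(str(candidate) for candidate in candidates)
    raise FileNotFoundError(f"config.json not found; looked in: {searched}")


try:
    print("Using config:", find_config(Path.cwd(), Path.home()))
except FileNotFoundError as err:
    print(err)
```

Because the first location depends on the current working directory, running the CLI from outside the project root silently switches to the home-directory fallback, or fails if that file does not exist either.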
### `ERROR: Index not found. Run 'obsidian-rag index' first.`
LanceDB table doesn't exist. This is normal on first install.
**Fix:**
```bash
source .venv/bin/activate
obsidian-rag index
```
### `ConnectionRefusedError` / `Ollama connection refused`
Ollama is not running.
**Fix:**
```bash
ollama serve &
sleep 2
curl http://localhost:11434/api/tags # must return JSON
```
If Ollama runs on a remote machine, update `embedding.base_url` in `config.json`:
```json
"base_url": "http://192.168.1.100:11434"
```
### Vector search returns 0 results
1. Check the index exists: `obsidian-rag status`
2. Check Ollama model is available: `ollama list`
3. Rebuild the index: `obsidian-rag reindex`
### FTS (DEGRADED mode) not working after upgrade
The FTS index on `chunk_text` was added in a recent change. **Reindex to rebuild with FTS:**
```bash
obsidian-rag reindex
```
### `npm run build` fails with TypeScript errors
```bash
npm run typecheck
```
Fix any type errors in `src/`, then rebuild. Common causes: missing type declarations, outdated `openclaw` package.
### Permission errors (Windows)
Run your terminal as Administrator, or install Python/Ollama to user-writable directories (not `C:\Program Files`).
### Very slow embedding (~minutes for 500 files)
- Reduce `batch_size` in `config.json` to `32` or `16`
- Ensure no other heavy processes are competing for CPU
- Ollama embedding is CPU-bound on machines without AVX2/AVX512
### Vault path contains spaces or special characters
Use an absolute path. Spaces need no escaping inside a JSON string; only backslashes and double quotes must be escaped.
**macOS/Linux:**
```json
"vault_path": "/Users/your name/Documents/My Vault"
```
**Windows:**
```json
"vault_path": "C:\\Users\\yourname\\Documents\\My Vault"
```
### Plugin not appearing in `openclaw plugins list`
1. Confirm `dist/index.js` exists (`ls -la dist/`)
2. Check that `openclaw.plugin.json` is valid JSON
3. Try re-registering: `openclaw plugin add "$DESTINATION"`
4. Check OpenClaw's plugin discovery path — it may need to be in a specific directory like `~/.openclaw/plugins/`
---
## Quick Reference — All Commands in Order
```bash
# 1. Clone
git clone https://github.com/YOUR_GITHUB_USER/obsidian-rag.git ~/dev/obsidian-rag
cd ~/dev/obsidian-rag
# 2. Install Ollama (if not installed)
curl -fsSL https://ollama.com/install.sh | sh
ollama serve &
ollama pull mxbai-embed-large:335m
# 3. Python venv + CLI
python -m venv .venv
source .venv/bin/activate
pip install -e python/
# 4. Node.js plugin
npm install
npm run build
# 5. Edit config: set vault_path in obsidian-rag/config.json
# 6. First-time index
obsidian-rag index
# 7. Register with OpenClaw
openclaw plugin add ~/dev/obsidian-rag
# 8. Verify
obsidian-rag status
openclaw plugins list
```
---
## Project Layout Reference
```
obsidian-rag/ # Project root (git-cloned)
├── .git/ # Git history
├── .venv/ # Python virtual environment (created in step 4)
├── dist/
│ └── index.js # Built plugin bundle (created by npm run build)
├── node_modules/ # npm packages (created by npm install)
├── obsidian-rag/ # Runtime data directory (created on first index)
│ ├── config.json # Plugin configuration
│ ├── vectors.lance/ # LanceDB vector store (created on first index)
│ └── sync-result.json # Last sync metadata
├── openclaw.plugin.json # Plugin manifest (do not edit — auto-generated)
├── python/
│ ├── obsidian_rag/ # Python package source
│ │ ├── cli.py # CLI entry point
│ │ ├── config.py # Config loader
│ │ ├── indexer.py # Full indexing pipeline
│ │ ├── chunker.py # Text chunking
│ │ ├── embedder.py # Ollama client
│ │ ├── vector_store.py # LanceDB CRUD
│ │ └── security.py # Path traversal, HTML strip
│ └── tests/ # 64 pytest tests
├── src/
│ ├── index.ts # OpenClaw plugin entry (definePluginEntry)
│ ├── tools/ # Tool registrations + implementations
│ ├── services/ # Health, watcher, indexer bridge
│ └── utils/ # Config, LanceDB, types, response
├── package.json
├── tsconfig.json
└── vitest.config.ts
```
---
*Last updated: 2026-04-11 — obsidian-rag v0.1.0*