docs: add comprehensive README and module documentation

docs/rag.md
# RAG Module Documentation

The RAG (Retrieval-Augmented Generation) module provides semantic search over your Obsidian vault. It handles document chunking, embedding generation, and vector similarity search.

## Architecture

```
Vault Markdown Files
         ↓
┌─────────────────┐
│     Chunker     │ - Split by strategy (sliding window / section)
│  (chunker.py)   │ - Extract metadata (tags, dates, sections)
└────────┬────────┘
         ↓
┌─────────────────┐
│    Embedder     │ - HTTP client for Ollama API
│  (embedder.py)  │ - Batch processing with retries
└────────┬────────┘
         ↓
┌─────────────────┐
│  Vector Store   │ - LanceDB persistence
│(vector_store.py)│ - Upsert, delete, search
└────────┬────────┘
         ↓
┌─────────────────┐
│     Indexer     │ - Full/incremental sync
│  (indexer.py)   │ - File watching
└─────────────────┘
```

## Components

### Chunker (`companion.rag.chunker`)

Splits markdown files into searchable chunks.

```python
from pathlib import Path

from companion.rag.chunker import chunk_file, ChunkingRule

rules = {
    "default": ChunkingRule(strategy="sliding_window", chunk_size=500, chunk_overlap=100),
    "Journal/**": ChunkingRule(strategy="section", section_tags=["#DayInShort"], chunk_size=300, chunk_overlap=50),
}

chunks = chunk_file(
    file_path=Path("journal/2026-04-12.md"),
    vault_root=Path("~/vault"),
    rules=rules,
    modified_at=1234567890.0,
)

for chunk in chunks:
    print(f"{chunk.source_file}:{chunk.chunk_index}")
    print(f"Text: {chunk.text[:100]}...")
    print(f"Tags: {chunk.tags}")
    print(f"Date: {chunk.date}")
```

#### Chunking Strategies

**Sliding Window**
- Fixed-size chunks with overlap
- Best for: Longform text, articles

```python
ChunkingRule(
    strategy="sliding_window",
    chunk_size=500,    # words per chunk
    chunk_overlap=100, # words of overlap between chunks
)
```

**Section-Based**
- Split on section headers (tags)
- Best for: Structured journals, daily notes

```python
ChunkingRule(
    strategy="section",
    section_tags=["#DayInShort", "#mentalhealth", "#work"],
    chunk_size=300,
    chunk_overlap=50,
)
```

#### Metadata Extraction

Each chunk includes:
- `source_file` - Relative path from vault root
- `source_directory` - Top-level directory
- `section` - Section header (for section strategy)
- `date` - Parsed from filename
- `tags` - Hashtags and wikilinks
- `chunk_index` - Position in document
- `modified_at` - File mtime for sync

### Embedder (`companion.rag.embedder`)

Generates embeddings via the Ollama API.

```python
from companion.rag.embedder import OllamaEmbedder

embedder = OllamaEmbedder(
    base_url="http://localhost:11434",
    model="mxbai-embed-large",
    batch_size=32,
)

# Single embedding
embeddings = embedder.embed(["Hello world"])
print(len(embeddings[0]))  # 1024 dimensions

# Batch embedding (with automatic batching)
texts = ["text 1", "text 2", "text 3", ...]  # 100 texts
embeddings = embedder.embed(texts)  # Automatically batches
```

#### Features

- **Batching**: Automatically splits large requests
- **Retries**: Exponential backoff on failures
- **Context Manager**: Proper resource cleanup

```python
with OllamaEmbedder(...) as embedder:
    embeddings = embedder.embed(texts)
```
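
The retry behavior can be pictured roughly like this (a simplified sketch, not the embedder's actual code; the attempt count, delays, and exception handling are assumptions):

```python
import time

def with_retries(fn, max_attempts: int = 3, base_delay: float = 0.5):
    """Call fn(), retrying with exponential backoff on failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

Each embedding batch goes through a wrapper like this, so transient Ollama hiccups don't abort a long indexing run.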

### Vector Store (`companion.rag.vector_store`)

LanceDB wrapper for vector storage.

```python
from companion.rag.vector_store import VectorStore

store = VectorStore(
    uri="~/.companion/vectors.lance",
    dimensions=1024,
)

# Upsert chunks
store.upsert(
    ids=["file.md::0", "file.md::1"],
    texts=["chunk 1", "chunk 2"],
    embeddings=[[0.1, ...], [0.2, ...]],
    metadatas=[
        {"source_file": "file.md", "source_directory": "docs"},
        {"source_file": "file.md", "source_directory": "docs"},
    ],
)

# Search
results = store.search(
    query_vector=[0.1, ...],
    top_k=8,
    filters={"source_directory": "Journal"},
)
```

#### Schema

| Field | Type | Nullable |
|-------|------|----------|
| id | string | No |
| text | string | No |
| vector | list[float32] | No |
| source_file | string | No |
| source_directory | string | No |
| section | string | Yes |
| date | string | Yes |
| tags | list[string] | Yes |
| chunk_index | int32 | No |
| total_chunks | int32 | No |
| modified_at | float64 | Yes |
| rule_applied | string | No |

### Indexer (`companion.rag.indexer`)

Orchestrates vault indexing.

```python
from companion.config import load_config
from companion.rag.indexer import Indexer
from companion.rag.vector_store import VectorStore

config = load_config()
store = VectorStore(
    uri=config.rag.vector_store.path,
    dimensions=config.rag.embedding.dimensions,
)

indexer = Indexer(config, store)

# Full reindex (clear + rebuild)
indexer.full_index()

# Incremental sync (only changed files)
indexer.sync()

# Get status
status = indexer.status()
print(f"Total chunks: {status['total_chunks']}")
print(f"Unindexed files: {status['unindexed_files']}")
```

### Search (`companion.rag.search`)

High-level search interface.

```python
from companion.rag.search import SearchEngine

engine = SearchEngine(
    vector_store=store,
    embedder_base_url="http://localhost:11434",
    embedder_model="mxbai-embed-large",
    default_top_k=8,
    similarity_threshold=0.75,
    hybrid_search_enabled=False,
)

results = engine.search(
    query="What did I learn about friendships?",
    top_k=8,
    filters={"source_directory": "Journal"},
)

for result in results:
    print(f"Source: {result['source_file']}")
    print(f"Relevance: {1 - result['_distance']:.2f}")
```

## CLI Commands

```bash
# Full index
python -m companion.indexer_daemon.cli index

# Incremental sync
python -m companion.indexer_daemon.cli sync

# Check status
python -m companion.indexer_daemon.cli status

# Reindex (same as index)
python -m companion.indexer_daemon.cli reindex
```

## Performance Tips

1. **Chunk Size**: Smaller chunks improve retrieval precision; larger chunks preserve more context
2. **Batch Size**: 32 is a good default for Ollama embeddings
3. **Filters**: Use directory filters to narrow the search scope
4. **Sync vs Index**: Use `sync` for daily updates and `index` for full rebuilds
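
To see how chunk size and overlap interact, note that each new sliding-window chunk advances by `chunk_size - chunk_overlap` words, so the chunk count for a document can be estimated as follows (illustrative arithmetic, assuming word-based windows as described above):

```python
import math

def estimated_chunks(total_words: int, chunk_size: int, chunk_overlap: int) -> int:
    """Rough number of sliding-window chunks for a document of total_words words."""
    if total_words <= chunk_size:
        return 1  # document fits in a single chunk
    step = chunk_size - chunk_overlap  # words the window advances each time
    return math.ceil((total_words - chunk_overlap) / step)

print(estimated_chunks(1000, chunk_size=500, chunk_overlap=100))  # 3
```

With the defaults (500/100), a 1000-word note yields about 3 chunks; halving the chunk size roughly doubles the number of vectors to embed and store.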

## Troubleshooting

**Slow indexing**
- Check that Ollama is running: `ollama ps`
- Reduce `batch_size` if you hit out-of-memory errors

**No results**
- Verify the vault path in your config
- Check `indexer.status()` for unindexed files

**Duplicate chunks**
- Each chunk ID is `{source_file}::{chunk_index}`
- Use `full_index()` to clear and rebuild
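
Because of that ID scheme, any row in the store can be traced back to its source file. A small sketch of building and parsing these IDs (the helper names are illustrative, not part of the module's API):

```python
def make_chunk_id(source_file: str, chunk_index: int) -> str:
    """Build a chunk ID in the documented {source_file}::{chunk_index} format."""
    return f"{source_file}::{chunk_index}"

def parse_chunk_id(chunk_id: str) -> tuple[str, int]:
    """Split a chunk ID back into (source_file, chunk_index)."""
    source_file, _, index = chunk_id.rpartition("::")
    return source_file, int(index)

print(parse_chunk_id(make_chunk_id("Journal/2026-04-12.md", 2)))  # ('Journal/2026-04-12.md', 2)
```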