docs: add comprehensive README and module documentation

docs/rag.md
# RAG Module Documentation

The RAG (Retrieval-Augmented Generation) module provides semantic search over your Obsidian vault. It handles document chunking, embedding generation, and vector similarity search.

## Architecture

```
Vault Markdown Files
         ↓
┌─────────────────┐
│     Chunker     │ - Split by strategy (sliding window / section)
│  (chunker.py)   │ - Extract metadata (tags, dates, sections)
└────────┬────────┘
         ↓
┌─────────────────┐
│    Embedder     │ - HTTP client for Ollama API
│  (embedder.py)  │ - Batch processing with retries
└────────┬────────┘
         ↓
┌─────────────────┐
│  Vector Store   │ - LanceDB persistence
│(vector_store.py)│ - Upsert, delete, search
└────────┬────────┘
         ↓
┌─────────────────┐
│     Indexer     │ - Full/incremental sync
│  (indexer.py)   │ - File watching
└─────────────────┘
```

## Components

### Chunker (`companion.rag.chunker`)

Splits markdown files into searchable chunks.

```python
from pathlib import Path

from companion.rag.chunker import chunk_file, ChunkingRule

rules = {
    "default": ChunkingRule(strategy="sliding_window", chunk_size=500, chunk_overlap=100),
    "Journal/**": ChunkingRule(strategy="section", section_tags=["#DayInShort"], chunk_size=300, chunk_overlap=50),
}

chunks = chunk_file(
    file_path=Path("journal/2026-04-12.md"),
    vault_root=Path("~/vault"),
    rules=rules,
    modified_at=1234567890.0,
)

for chunk in chunks:
    print(f"{chunk.source_file}:{chunk.chunk_index}")
    print(f"Text: {chunk.text[:100]}...")
    print(f"Tags: {chunk.tags}")
    print(f"Date: {chunk.date}")
```

#### Chunking Strategies

**Sliding Window**
- Fixed-size chunks with overlap
- Best for: Longform text, articles

```python
ChunkingRule(
    strategy="sliding_window",
    chunk_size=500,    # words per chunk
    chunk_overlap=100, # words of overlap between chunks
)
```

**Section-Based**
- Split on section headers (tags)
- Best for: Structured journals, daily notes

```python
ChunkingRule(
    strategy="section",
    section_tags=["#DayInShort", "#mentalhealth", "#work"],
    chunk_size=300,
    chunk_overlap=50,
)
```

#### Metadata Extraction

Each chunk includes:
- `source_file` - Relative path from vault root
- `source_directory` - Top-level directory
- `section` - Section header (for section strategy)
- `date` - Parsed from filename
- `tags` - Hashtags and wikilinks
- `chunk_index` - Position in document
- `modified_at` - File mtime for sync

### Embedder (`companion.rag.embedder`)

Generates embeddings via the Ollama API.

```python
from companion.rag.embedder import OllamaEmbedder

embedder = OllamaEmbedder(
    base_url="http://localhost:11434",
    model="mxbai-embed-large",
    batch_size=32,
)

# Single embedding
embeddings = embedder.embed(["Hello world"])
print(len(embeddings[0]))  # 1024 dimensions

# Batch embedding (with automatic batching)
texts = ["text 1", "text 2", "text 3", ...]  # 100 texts
embeddings = embedder.embed(texts)  # Automatically batches
```

#### Features

- **Batching**: Automatically splits large requests
- **Retries**: Exponential backoff on failures
- **Context Manager**: Proper resource cleanup

```python
with OllamaEmbedder(...) as embedder:
    embeddings = embedder.embed(texts)
```
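
The retry behavior can be pictured roughly like this (a simplified sketch, not the embedder's actual code; the attempt count, delays, and exception handling are assumptions):

```python
import time

def with_retries(fn, max_attempts: int = 3, base_delay: float = 0.5):
    """Call fn(), retrying with exponential backoff on failure."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

Each embedding batch goes through a wrapper like this, so transient Ollama hiccups don't abort a long indexing run.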

### Vector Store (`companion.rag.vector_store`)

LanceDB wrapper for vector storage.

```python
from companion.rag.vector_store import VectorStore

store = VectorStore(
    uri="~/.companion/vectors.lance",
    dimensions=1024,
)

# Upsert chunks
store.upsert(
    ids=["file.md::0", "file.md::1"],
    texts=["chunk 1", "chunk 2"],
    embeddings=[[0.1, ...], [0.2, ...]],
    metadatas=[
        {"source_file": "file.md", "source_directory": "docs"},
        {"source_file": "file.md", "source_directory": "docs"},
    ],
)

# Search
results = store.search(
    query_vector=[0.1, ...],
    top_k=8,
    filters={"source_directory": "Journal"},
)
```

#### Schema

| Field | Type | Nullable |
|-------|------|----------|
| id | string | No |
| text | string | No |
| vector | list[float32] | No |
| source_file | string | No |
| source_directory | string | No |
| section | string | Yes |
| date | string | Yes |
| tags | list[string] | Yes |
| chunk_index | int32 | No |
| total_chunks | int32 | No |
| modified_at | float64 | Yes |
| rule_applied | string | No |

### Indexer (`companion.rag.indexer`)

Orchestrates vault indexing.

```python
from companion.config import load_config
from companion.rag.indexer import Indexer
from companion.rag.vector_store import VectorStore

config = load_config()
store = VectorStore(
    uri=config.rag.vector_store.path,
    dimensions=config.rag.embedding.dimensions,
)

indexer = Indexer(config, store)

# Full reindex (clear + rebuild)
indexer.full_index()

# Incremental sync (only changed files)
indexer.sync()

# Get status
status = indexer.status()
print(f"Total chunks: {status['total_chunks']}")
print(f"Unindexed files: {status['unindexed_files']}")
```

### Search (`companion.rag.search`)

High-level search interface.

```python
from companion.rag.search import SearchEngine

engine = SearchEngine(
    vector_store=store,
    embedder_base_url="http://localhost:11434",
    embedder_model="mxbai-embed-large",
    default_top_k=8,
    similarity_threshold=0.75,
    hybrid_search_enabled=False,
)

results = engine.search(
    query="What did I learn about friendships?",
    top_k=8,
    filters={"source_directory": "Journal"},
)

for result in results:
    print(f"Source: {result['source_file']}")
    print(f"Relevance: {1 - result['_distance']:.2f}")
```

## CLI Commands

```bash
# Full index
python -m companion.indexer_daemon.cli index

# Incremental sync
python -m companion.indexer_daemon.cli sync

# Check status
python -m companion.indexer_daemon.cli status

# Reindex (same as index)
python -m companion.indexer_daemon.cli reindex
```

## Performance Tips

1. **Chunk Size**: Smaller chunks improve retrieval precision; larger chunks preserve more context
2. **Batch Size**: 32 is a good default for Ollama embeddings
3. **Filters**: Use directory filters to narrow the search scope
4. **Sync vs Index**: Use `sync` for daily updates and `index` for full rebuilds
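
To see how chunk size and overlap interact, note that each new sliding-window chunk advances by `chunk_size - chunk_overlap` words, so the chunk count for a document can be estimated as follows (illustrative arithmetic, assuming word-based windows as described above):

```python
import math

def estimated_chunks(total_words: int, chunk_size: int, chunk_overlap: int) -> int:
    """Rough number of sliding-window chunks for a document of total_words words."""
    if total_words <= chunk_size:
        return 1  # document fits in a single chunk
    step = chunk_size - chunk_overlap  # words the window advances each time
    return math.ceil((total_words - chunk_overlap) / step)

print(estimated_chunks(1000, chunk_size=500, chunk_overlap=100))  # 3
```

With the defaults (500/100), a 1000-word note yields about 3 chunks; halving the chunk size roughly doubles the number of vectors to embed and store.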

## Troubleshooting

**Slow indexing**
- Check that Ollama is running: `ollama ps`
- Reduce `batch_size` if you hit out-of-memory errors

**No results**
- Verify the vault path in your config
- Check `indexer.status()` for unindexed files

**Duplicate chunks**
- Each chunk ID is `{source_file}::{chunk_index}`
- Use `full_index()` to clear and rebuild
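
Because of that ID scheme, any row in the store can be traced back to its source file. A small sketch of building and parsing these IDs (the helper names are illustrative, not part of the module's API):

```python
def make_chunk_id(source_file: str, chunk_index: int) -> str:
    """Build a chunk ID in the documented {source_file}::{chunk_index} format."""
    return f"{source_file}::{chunk_index}"

def parse_chunk_id(chunk_id: str) -> tuple[str, int]:
    """Split a chunk ID back into (source_file, chunk_index)."""
    source_file, _, index = chunk_id.rpartition("::")
    return source_file, int(index)

print(parse_chunk_id(make_chunk_id("Journal/2026-04-12.md", 2)))  # ('Journal/2026-04-12.md', 2)
```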