# Personal Companion AI

A fully local, privacy-first AI companion trained on your Obsidian vault. Combines fine-tuned reasoning with RAG-powered memory to answer questions about your life, relationships, and experiences.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                    Personal Companion AI                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐    ┌─────────────────┐    ┌──────────┐  │
│  │  React UI    │◄──►│  FastAPI        │◄──►│  Ollama  │  │
│  │  (Vite)      │    │  Backend        │    │  Models  │  │
│  └──────────────┘    └─────────────────┘    └──────────┘  │
│                              │                              │
│        ┌─────────────────────┼─────────────────────┐       │
│        ↓                     ↓                     ↓       │
│  ┌──────────────┐    ┌─────────────────┐    ┌──────────┐  │
│  │  Fine-tuned  │    │  RAG Engine     │    │  Vault   │  │
│  │  7B Model    │    │  (LanceDB)      │    │  Indexer │  │
│  │              │    │                 │    │          │  │
│  │  Quarterly   │    │  • semantic     │    │  • watch │  │
│  │  retrain     │    │    search       │    │  • chunk │  │
│  │              │    │  • hybrid       │    │  • embed │  │
│  │              │    │    filters      │    │          │  │
│  │              │    │  • relationship │    │  Daily   │  │
│  │              │    │    graph        │    │ auto-sync│  │
│  └──────────────┘    └─────────────────┘    └──────────┘  │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

## Quick Start

### Prerequisites

- Python 3.11+
- Node.js 18+ (for the UI)
- Ollama running locally
- RTX 5070 or equivalent (12GB+ VRAM, for fine-tuning)

### Installation

```bash
# Clone the repository, then enter it and install
cd kv-rag
pip install -e ".[dev]"

# Install UI dependencies
cd ui && npm install && cd ..

# Pull the required Ollama models
ollama pull mxbai-embed-large
ollama pull llama3.1:8b
```

### Configuration

Copy `config.json` and customize it:

```json
{
  "vault": {
    "path": "/path/to/your/obsidian/vault"
  },
  "companion": {
    "name": "SAN"
  }
}
```

See docs/config.md for full configuration reference.
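As a sketch, the backend might load these settings like so. The `Config` dataclass, `load_config` helper, and the default companion name are illustrative assumptions, not the actual `companion.config` API:

```python
import json
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Config:
    vault_path: Path
    companion_name: str

def load_config(path: str = "config.json") -> Config:
    """Read config.json and pull out the fields shown above."""
    raw = json.loads(Path(path).read_text())
    return Config(
        vault_path=Path(raw["vault"]["path"]),
        # Hypothetical fallback name; the real default may differ.
        companion_name=raw["companion"].get("name", "Companion"),
    )
```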

### Running

**Terminal 1 - Backend:**

```bash
python -m uvicorn companion.api:app --host 0.0.0.0 --port 7373
```

**Terminal 2 - Frontend:**

```bash
cd ui && npm run dev
```

**Terminal 3 - Indexer (optional):**

```bash
# One-time full index
python -m companion.indexer_daemon.cli index

# Or continuous file watching
python -m companion.indexer_daemon.watcher
```

Then open http://localhost:5173 in your browser.

## Usage

### Chat Interface

Type messages naturally. The companion will:

- Retrieve relevant context from your vault
- Reference past events, relationships, and decisions
- Provide reflective, companion-style responses
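Conceptually, the retrieval step ranks vault chunks by embedding similarity to your message. A toy version of that ranking (the real engine uses LanceDB and `mxbai-embed-large`; the two-dimensional vectors below are made up for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query: list[float], chunks: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k chunks most similar to the query embedding."""
    ranked = sorted(chunks, key=lambda cid: cosine(query, chunks[cid]), reverse=True)
    return ranked[:k]
```

The top-ranked chunks are then injected into the prompt as context for the model.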

### Indexing Your Vault

```bash
# Full reindex
python -m companion.indexer_daemon.cli index

# Incremental sync
python -m companion.indexer_daemon.cli sync

# Check status
python -m companion.indexer_daemon.cli status
```
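Under the hood, indexing splits each note into overlapping chunks before embedding them. A minimal illustration of that idea (the sizes and overlap here are made-up defaults, not the actual `companion.rag.chunker` settings):

```python
def chunk_text(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks with overlap, so content that
    straddles a boundary still appears whole in at least one chunk."""
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks
```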

### Fine-Tuning (Optional)

Train a custom model that reasons like you:

```bash
# Extract training examples from vault reflections
python -m companion.forge.cli extract

# Train with QLoRA (4-6 hours on RTX 5070)
python -m companion.forge.cli train --epochs 3

# Reload the fine-tuned model
python -m companion.forge.cli reload ~/.companion/training/final
```
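The extract step turns vault reflections into instruction-style training pairs. A rough sketch of the shape of one JSONL record (the prompt wording and field names are illustrative, not the actual `companion.forge` schema):

```python
import json

def to_training_example(title: str, reflection: str) -> str:
    """Convert one vault reflection into a JSONL line in a
    chat-style instruction format (hypothetical schema)."""
    record = {
        "messages": [
            {"role": "user", "content": f"Reflect on: {title}"},
            {"role": "assistant", "content": reflection},
        ]
    }
    return json.dumps(record)
```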

## Modules

| Module | Purpose | Documentation |
| --- | --- | --- |
| `companion.config` | Configuration management | docs/config.md |
| `companion.rag` | RAG engine (chunk, embed, search) | docs/rag.md |
| `companion.forge` | Fine-tuning pipeline | docs/forge.md |
| `companion.api` | FastAPI backend | docs/api.md |
| `ui/` | React frontend | docs/ui.md |

## Project Structure

```
kv-rag/
├── companion/              # Python backend
│   ├── __init__.py
│   ├── api.py              # FastAPI app
│   ├── config.py           # Configuration
│   ├── memory.py           # Session memory (SQLite)
│   ├── orchestrator.py     # Chat orchestration
│   ├── prompts.py          # Prompt templates
│   ├── rag/                # RAG modules
│   │   ├── chunker.py
│   │   ├── embedder.py
│   │   ├── indexer.py
│   │   ├── search.py
│   │   └── vector_store.py
│   ├── forge/              # Fine-tuning
│   │   ├── extract.py
│   │   ├── train.py
│   │   ├── export.py
│   │   └── reload.py
│   └── indexer_daemon/     # File watching
│       ├── cli.py
│       └── watcher.py
├── ui/                     # React frontend
│   ├── src/
│   │   ├── App.tsx
│   │   ├── components/
│   │   └── hooks/
│   └── package.json
├── tests/                  # Test suite
├── config.json             # Configuration file
├── docs/                   # Documentation
└── README.md
```

## Testing

```bash
# Run all tests
pytest tests/ -v

# Run a specific module's tests
pytest tests/test_chunker.py -v
```

## Privacy & Security

- **Fully Local**: No data leaves your machine
- **Vault Data**: Never sent to external APIs for training
- **Config**: `local_only: true` blocks external API calls
- **Sensitive Tags**: Configurable patterns for health, finance, etc.
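A sensitive-tag filter can be as simple as matching a note's tags against configurable wildcard patterns before indexing. A sketch of that check (the pattern list and function name here are examples, not the actual config keys):

```python
import fnmatch

def is_sensitive(tags: list[str], patterns: list[str]) -> bool:
    """True if any tag matches a configured sensitive pattern
    (fnmatch-style wildcards), so the note can be skipped at index time."""
    return any(fnmatch.fnmatch(tag, pat) for tag in tags for pat in patterns)
```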

## License

MIT License - see the LICENSE file.

## Acknowledgments

- Built with Unsloth for efficient fine-tuning
- Uses LanceDB for vector storage
- UI inspired by Obsidian aesthetics