
# Personal Companion AI

A fully local, privacy-first AI companion trained on your Obsidian vault. Combines fine-tuned reasoning with RAG-powered memory to answer questions about your life, relationships, and experiences.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                    Personal Companion AI                     │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  ┌──────────────┐    ┌─────────────────┐    ┌──────────┐  │
│  │  React UI    │◄──►│  FastAPI        │◄──►│  Ollama  │  │
│  │  (Vite)      │    │  Backend        │    │  Models  │  │
│  └──────────────┘    └─────────────────┘    └──────────┘  │
│                              │                              │
│        ┌─────────────────────┼─────────────────────┐       │
│        ↓                     ↓                     ↓       │
│  ┌──────────────┐    ┌─────────────────┐    ┌──────────┐  │
│  │  Fine-tuned  │    │  RAG Engine     │    │  Vault   │  │
│  │  7B Model    │    │  (LanceDB)      │    │  Indexer │  │
│  │              │    │                 │    │          │  │
│  │  Quarterly   │    │  • semantic     │    │  • watch │  │
│  │  retrain     │    │    search       │    │  • chunk │  │
│  │              │    │  • hybrid       │    │  • embed │  │
│  │              │    │    filters      │    │          │  │
│  │              │    │  • relationship │    │  Daily   │  │
│  │              │    │    graph        │    │ auto-sync│  │
│  └──────────────┘    └─────────────────┘    └──────────┘  │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

## Quick Start

### Prerequisites

- Python 3.11+
- Node.js 18+ (for UI)
- Ollama running locally
- NVIDIA RTX 5070 or equivalent GPU (12GB+ VRAM for the optional fine-tuning step)
- For RTX 50-series setup, see the [GPU Compatibility Guide](docs/gpu-compatibility.md)

### Installation

```bash
# After cloning the repository, install the backend from the project root
cd kv-rag
pip install -e ".[dev]"

# Install UI dependencies
cd ui && npm install && cd ..

# Pull required Ollama models
ollama pull mxbai-embed-large
ollama pull llama3.1:8b
```

### Configuration

Copy `config.json` and customize it for your setup, for example:

```json
{
  "vault": {
    "path": "/path/to/your/obsidian/vault"
  },
  "companion": {
    "name": "SAN"
  }
}
```

See [docs/config.md](docs/config.md) for full configuration reference.

### Running

**Terminal 1 - Backend:**
```bash
python -m companion.api
```

**Terminal 2 - Frontend:**
```bash
cd ui && npm run dev
```

**Terminal 3 - Indexer (optional):**
```bash
# One-time full index
python -m companion.indexer_daemon.cli index

# Or continuous file watching
python -m companion.indexer_daemon.watcher
```

Open http://localhost:5173 in your browser (5173 is Vite's default dev server port).

## Usage

### Chat Interface

Type messages naturally. The companion will:
- Retrieve relevant context from your vault
- Reference past events, relationships, decisions
- Provide reflective, companion-style responses
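
The retrieval step in the list above can be pictured with a toy example. This is a sketch under stated assumptions, not the project's RAG engine: it ranks in-memory chunks by cosine similarity instead of querying LanceDB, the vectors are made-up values rather than real embeddings, and `Chunk`, `retrieve`, and `build_prompt` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str          # note path inside the vault
    text: str            # chunk content
    vector: list[float]  # embedding (toy values in this sketch)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vector: list[float], chunks: list[Chunk], top_k: int = 2) -> list[Chunk]:
    """Return the top_k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, c.vector), reverse=True)
    return ranked[:top_k]


def build_prompt(question: str, context: list[Chunk]) -> str:
    """Prepend retrieved notes to the user's question."""
    notes = "\n".join(f"[{c.source}] {c.text}" for c in context)
    return f"Context from the vault:\n{notes}\n\nQuestion: {question}"


# Toy data: in a real setup the vectors would come from an embedding model.
chunks = [
    Chunk("journal/2024-03-01.md", "Dinner with Sam.", [1.0, 0.0]),
    Chunk("projects/garden.md", "Planted tomatoes.", [0.0, 1.0]),
    Chunk("journal/2024-03-08.md", "Sam moved to Oslo.", [0.9, 0.1]),
]
prompt = build_prompt("What is new with Sam?", retrieve([1.0, 0.0], chunks))
```

The resulting `prompt` string is what would be passed to the language model, so the answer can cite specific notes rather than rely on the model's weights alone.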

### Indexing Your Vault

```bash
# Full reindex
python -m companion.indexer_daemon.cli index

# Incremental sync
python -m companion.indexer_daemon.cli sync

# Check status
python -m companion.indexer_daemon.cli status
```
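
One common way to make a sync incremental is to compare content hashes against those recorded at the last run, so only new or edited notes are re-embedded. The sketch below illustrates that idea; whether the `sync` command works this way is an assumption, and `file_digest` and `plan_sync` are hypothetical names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used to detect edits."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_sync(vault: Path, known: dict[str, str]) -> tuple[list[str], list[str]]:
    """Compare the vault against digests recorded at the previous sync.

    `known` maps vault-relative paths to their last indexed digest.
    Returns (changed_or_new, deleted) as vault-relative POSIX paths.
    """
    current = {
        p.relative_to(vault).as_posix(): file_digest(p)
        for p in sorted(vault.rglob("*.md"))
    }
    changed = [rel for rel, digest in current.items() if known.get(rel) != digest]
    deleted = [rel for rel in known if rel not in current]
    return changed, deleted
```

Files in `changed` would be re-chunked and re-embedded, and entries in `deleted` would be removed from the vector store; unchanged notes are skipped entirely.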

### Fine-Tuning (Optional)

Train a custom model that reasons like you:

```bash
# Extract training examples from vault reflections
python -m companion.forge.cli extract

# Train with QLoRA (4-6 hours on RTX 5070)
python -m companion.forge.cli train --epochs 3

# Reload the fine-tuned model
python -m companion.forge.cli reload ~/.companion/training/final
```

## Modules

| Module | Purpose | Documentation |
|--------|---------|---------------|
| `companion.config` | Configuration management | [docs/config.md](docs/config.md) |
| `companion.rag` | RAG engine (chunk, embed, search) | [docs/rag.md](docs/rag.md) |
| `companion.forge` | Fine-tuning pipeline | [docs/forge.md](docs/forge.md) |
| `companion.api` | FastAPI backend | This README |
| `ui/` | React frontend | [docs/ui.md](docs/ui.md) |
| **GPU Setup** | RTX 50-series compatibility | [docs/gpu-compatibility.md](docs/gpu-compatibility.md) |

## Project Structure

```
kv-rag/
├── companion/                 # Python backend
│   ├── __init__.py
│   ├── api.py              # FastAPI app
│   ├── config.py           # Configuration
│   ├── memory.py           # Session memory (SQLite)
│   ├── orchestrator.py     # Chat orchestration
│   ├── prompts.py          # Prompt templates
│   ├── rag/                # RAG modules
│   │   ├── chunker.py
│   │   ├── embedder.py
│   │   ├── indexer.py
│   │   ├── search.py
│   │   └── vector_store.py
│   ├── forge/              # Fine-tuning
│   │   ├── extract.py
│   │   ├── train.py
│   │   ├── export.py
│   │   └── reload.py
│   └── indexer_daemon/     # File watching
│       ├── cli.py
│       └── watcher.py
├── ui/                      # React frontend
│   ├── src/
│   │   ├── App.tsx
│   │   ├── components/
│   │   └── hooks/
│   └── package.json
├── tests/                   # Test suite
├── config.json             # Configuration file
├── docs/                   # Documentation
└── README.md
```

## Testing

```bash
# Run all tests
pytest tests/ -v

# Run specific module
pytest tests/test_chunker.py -v
```
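
For orientation, a test in a file such as `tests/test_chunker.py` might look like the sketch below. The `chunk_by_heading` function is a hypothetical stand-in defined inline so the example is self-contained; the real chunker's name and behavior may differ. pytest collects functions whose names start with `test_`.

```python
import re


def chunk_by_heading(markdown: str) -> list[str]:
    """Split a note into chunks, starting a new chunk at each Markdown heading.

    Simplification: headings inside fenced code blocks are not special-cased.
    """
    chunks: list[str] = []
    current: list[str] = []
    for line in markdown.splitlines():
        if re.match(r"#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return [c for c in chunks if c]


def test_chunk_by_heading_splits_on_headings():
    note = "# Monday\nWalked the dog.\n\n## Evening\nCalled Sam."
    assert chunk_by_heading(note) == [
        "# Monday\nWalked the dog.",
        "## Evening\nCalled Sam.",
    ]


def test_chunk_by_heading_handles_empty_note():
    assert chunk_by_heading("") == []
```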

## Privacy & Security

- **Fully Local**: No data leaves your machine
- **Vault Data**: Never sent to external APIs for training
- **Config**: `local_only: true` blocks external API calls
- **Sensitive Tags**: Configurable patterns for health, finance, etc.
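
To make the last two points concrete, here is a rough sketch of how a `local_only` guard and tag-pattern matching could work. It is illustrative only: `check_url_allowed`, `is_sensitive`, the loopback host list, and the glob-style pattern syntax are assumptions, not the project's documented behavior.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Hosts treated as "this machine" in this sketch.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def check_url_allowed(url: str, local_only: bool) -> None:
    """Raise PermissionError if local_only is set and the URL is not loopback."""
    host = urlparse(url).hostname
    if local_only and host not in LOOPBACK_HOSTS:
        raise PermissionError(f"local_only is enabled; refusing to call {host!r}")


def is_sensitive(tags: list[str], patterns: list[str]) -> bool:
    """True if any tag matches any glob-style pattern (for example "health/*")."""
    return any(fnmatchcase(tag, pattern) for tag in tags for pattern in patterns)
```

In this sketch a call to a local Ollama server at `http://localhost:11434` passes the check, while any other host raises before a request is made. Notes for which `is_sensitive` returns true could then be excluded from training data or handled separately.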

## License

MIT License - See LICENSE file

## Acknowledgments

- Built with [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning
- Uses [LanceDB](https://lancedb.github.io/) for vector storage
- UI inspired by [Obsidian](https://obsidian.md/) aesthetics