# Personal Companion AI

A fully local, privacy-first AI companion trained on your Obsidian vault. Combines fine-tuned reasoning with RAG-powered memory to answer questions about your life, relationships, and experiences.

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                    Personal Companion AI                    │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│   ┌──────────────┐    ┌─────────────────┐    ┌──────────┐   │
│   │   React UI   │◄──►│     FastAPI     │◄──►│  Ollama  │   │
│   │    (Vite)    │    │     Backend     │    │  Models  │   │
│   └──────────────┘    └─────────────────┘    └──────────┘   │
│                                │                            │
│          ┌─────────────────────┼─────────────────────┐      │
│          ↓                     ↓                     ↓      │
│   ┌──────────────┐    ┌─────────────────┐    ┌──────────┐   │
│   │  Fine-tuned  │    │   RAG Engine    │    │  Vault   │   │
│   │   7B Model   │    │    (LanceDB)    │    │ Indexer  │   │
│   │              │    │                 │    │          │   │
│   │  Quarterly   │    │ • semantic      │    │ • watch  │   │
│   │  retrain     │    │   search        │    │ • chunk  │   │
│   │              │    │ • hybrid        │    │ • embed  │   │
│   │              │    │   filters       │    │          │   │
│   │              │    │ • relationship  │    │ Daily    │   │
│   │              │    │   graph         │    │ auto-sync│   │
│   └──────────────┘    └─────────────────┘    └──────────┘   │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```

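
The chunk, embed, and search stages in the diagram can be illustrated with a toy sketch. This is not the code in `companion.rag`: the function names, the word-window chunking, and the bag-of-words "embedding" are all assumptions chosen so the example runs without Ollama or LanceDB. The real pipeline would use an embedding model and a vector store instead.

```python
import math
from collections import Counter


def chunk_text(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that overlap by `overlap` words."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Return the `top_k` chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:top_k]
```

The overlap keeps a sentence that straddles a window boundary retrievable from at least one chunk, at the cost of storing some text twice.
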
## Quick Start

### Prerequisites

- Python 3.11+
- Node.js 18+ (for UI)
- Ollama running locally
- RTX 5070 or equivalent (12 GB+ VRAM for fine-tuning)
  - See [GPU Compatibility Guide](docs/gpu-compatibility.md) for RTX 50-series setup

### Installation

```bash
# Enter the cloned repository and install the package in editable mode
cd kv-rag
pip install -e ".[dev]"

# Install UI dependencies
cd ui && npm install && cd ..

# Pull required Ollama models
ollama pull mxbai-embed-large
ollama pull llama3.1:8b
```

### Configuration

Copy `config.json` and customize it:

```json
{
  "vault": {
    "path": "/path/to/your/obsidian/vault"
  },
  "companion": {
    "name": "SAN"
  }
}
```

See [docs/config.md](docs/config.md) for the full configuration reference.

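
As a rough illustration of how a file like this might be read and checked before use, here is a minimal sketch. `load_config` is a hypothetical helper, not the API of `companion.config`; it relies only on the two keys shown above (`vault.path` and `companion.name`).

```python
import json
from pathlib import Path


def load_config(path: str = "config.json") -> dict:
    """Read the JSON config and make sure the vault path is present.

    Hypothetical helper for illustration; the project's own loader
    lives in `companion.config` and may behave differently.
    """
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    vault_path = config.get("vault", {}).get("path")
    if not vault_path:
        raise ValueError("config is missing vault.path")
    # Expand "~" so a home-relative vault path works as written.
    config["vault"]["path"] = str(Path(vault_path).expanduser())
    return config
```

Failing early on a missing `vault.path` gives a clearer error than letting the indexer start against an undefined directory.
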
### Running

**Terminal 1 - Backend:**
```bash
python -m companion.api
```

**Terminal 2 - Frontend:**
```bash
cd ui && npm run dev
```

**Terminal 3 - Indexer (optional):**
```bash
# One-time full index
python -m companion.indexer_daemon.cli index

# Or continuous file watching
python -m companion.indexer_daemon.watcher
```

Then open http://localhost:5173 in your browser.

## Usage

### Chat Interface

Type messages naturally. The companion will:
- Retrieve relevant context from your vault
- Reference past events, relationships, decisions
- Provide reflective, companion-style responses

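
One way the retrieved context could be combined with a message is sketched below. The function name, the chunk shape (`source` and `text` keys), the character budget, and the prompt wording are all assumptions for illustration; the project's actual templates live in `companion/prompts.py`.

```python
def build_prompt(question: str, chunks: list[dict], max_chars: int = 2000) -> str:
    """Assemble a prompt from retrieved chunks shaped like {"source": ..., "text": ...}.

    Hypothetical sketch: chunks are assumed to arrive ordered by relevance,
    and are added until the character budget would be exceeded.
    """
    context_lines = []
    used = 0
    for chunk in chunks:
        entry = f"[{chunk['source']}] {chunk['text']}"
        if used + len(entry) > max_chars:
            break
        context_lines.append(entry)
        used += len(entry)
    context = "\n".join(context_lines) or "(no relevant notes found)"
    return (
        "Use the notes below to answer as a reflective companion.\n\n"
        f"Notes:\n{context}\n\n"
        f"Question: {question}"
    )
```

Tagging each entry with its source note lets the response point back to where a remembered event came from.
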
### Indexing Your Vault

```bash
# Full reindex
python -m companion.indexer_daemon.cli index

# Incremental sync
python -m companion.indexer_daemon.cli sync

# Check status
python -m companion.indexer_daemon.cli status
```

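
The difference between a full reindex and an incremental sync can be sketched as follows. This assumes the sync compares content hashes recorded on the previous run; the README does not say how `sync` detects changes, so `plan_sync` and the `known` digest map are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, used to detect edits."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_sync(vault: Path, known: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (changed, deleted) note paths relative to the vault.

    `known` maps relative paths to the digests recorded on the last run
    (a hypothetical bookkeeping format). New notes count as changed.
    """
    current = {
        p.relative_to(vault).as_posix(): file_digest(p)
        for p in sorted(vault.rglob("*.md"))
    }
    changed = [rel for rel, digest in current.items() if known.get(rel) != digest]
    deleted = sorted(rel for rel in known if rel not in current)
    return changed, deleted
```

Only the `changed` notes need to be re-chunked and re-embedded, and the `deleted` ones removed from the index, which is what makes a sync cheaper than a full reindex.
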
### Fine-Tuning (Optional)

Train a custom model that reasons like you:

```bash
# Extract training examples from vault reflections
python -m companion.forge.cli extract

# Train with QLoRA (4-6 hours on RTX 5070)
python -m companion.forge.cli train --epochs 3

# Reload the fine-tuned model
python -m companion.forge.cli reload ~/.companion/training/final
```

## Modules

| Module | Purpose | Documentation |
|--------|---------|---------------|
| `companion.config` | Configuration management | [docs/config.md](docs/config.md) |
| `companion.rag` | RAG engine (chunk, embed, search) | [docs/rag.md](docs/rag.md) |
| `companion.forge` | Fine-tuning pipeline | [docs/forge.md](docs/forge.md) |
| `companion.api` | FastAPI backend | This README |
| `ui/` | React frontend | [docs/ui.md](docs/ui.md) |
| **GPU Setup** | RTX 50-series compatibility | [docs/gpu-compatibility.md](docs/gpu-compatibility.md) |

## Project Structure

```
kv-rag/
├── companion/              # Python backend
│   ├── __init__.py
│   ├── api.py              # FastAPI app
│   ├── config.py           # Configuration
│   ├── memory.py           # Session memory (SQLite)
│   ├── orchestrator.py     # Chat orchestration
│   ├── prompts.py          # Prompt templates
│   ├── rag/                # RAG modules
│   │   ├── chunker.py
│   │   ├── embedder.py
│   │   ├── indexer.py
│   │   ├── search.py
│   │   └── vector_store.py
│   ├── forge/              # Fine-tuning
│   │   ├── extract.py
│   │   ├── train.py
│   │   ├── export.py
│   │   └── reload.py
│   └── indexer_daemon/     # File watching
│       ├── cli.py
│       └── watcher.py
├── ui/                     # React frontend
│   ├── src/
│   │   ├── App.tsx
│   │   ├── components/
│   │   └── hooks/
│   └── package.json
├── tests/                  # Test suite
├── config.json             # Configuration file
├── docs/                   # Documentation
└── README.md
```

## Testing

```bash
# Run all tests
pytest tests/ -v

# Run specific module
pytest tests/test_chunker.py -v
```

## Privacy & Security

- **Fully Local**: No data leaves your machine
- **Vault Data**: Never sent to external APIs for training
- **Config**: `local_only: true` blocks external API calls
- **Sensitive Tags**: Configurable patterns for health, finance, etc.

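
The last two points can be pictured with a small sketch. Both helpers are hypothetical and are not taken from the codebase: the glob-style pattern syntax, the tag names, and the loopback-host allowlist are assumptions about how such checks could work.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Hosts treated as "local" in this sketch (an assumption, not project config).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_sensitive(tags: list[str], patterns: list[str]) -> bool:
    """True if any tag matches any glob-style pattern, ignoring case."""
    return any(
        fnmatchcase(tag.lower(), pattern.lower())
        for tag in tags
        for pattern in patterns
    )


def is_url_allowed(url: str, local_only: bool) -> bool:
    """With local_only enabled, permit only requests to loopback hosts."""
    if not local_only:
        return True
    return urlparse(url).hostname in LOCAL_HOSTS
```

For example, a note tagged `health/therapy` would match the pattern `health/*`, and a request to `http://localhost:11434` (Ollama's default port) would pass the check while any remote host would not.
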
## License

MIT License. See the LICENSE file.

## Acknowledgments

- Built with [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning
- Uses [LanceDB](https://lancedb.github.io/) for vector storage
- UI inspired by [Obsidian](https://obsidian.md/) aesthetics