From e77fa69b31da65c3c0938cb3a7ffee366fc144c4 Mon Sep 17 00:00:00 2001 From: Santhosh Janardhanan Date: Mon, 13 Apr 2026 15:35:22 -0400 Subject: [PATCH] docs: add comprehensive README and module documentation --- README.md | 207 ++++++++++++++ config-schema.json | 667 +++++++++++++++++++++++++++++++++++++++++++++ docs/config.md | 278 +++++++++++++++++++ docs/forge.md | 288 +++++++++++++++++++ docs/rag.md | 269 ++++++++++++++++++ docs/ui.md | 408 +++++++++++++++++++++++++++ 6 files changed, 2117 insertions(+) create mode 100644 README.md create mode 100644 config-schema.json create mode 100644 docs/config.md create mode 100644 docs/forge.md create mode 100644 docs/rag.md create mode 100644 docs/ui.md diff --git a/README.md b/README.md new file mode 100644 index 0000000..83bee53 --- /dev/null +++ b/README.md @@ -0,0 +1,207 @@ +# Personal Companion AI + +A fully local, privacy-first AI companion trained on your Obsidian vault. Combines fine-tuned reasoning with RAG-powered memory to answer questions about your life, relationships, and experiences. 
+ +## Architecture + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Personal Companion AI │ +├─────────────────────────────────────────────────────────────┤ +│ │ +│ ┌──────────────┐ ┌─────────────────┐ ┌──────────┐ │ +│ │ React UI │◄──►│ FastAPI │◄──►│ Ollama │ │ +│ │ (Vite) │ │ Backend │ │ Models │ │ +│ └──────────────┘ └─────────────────┘ └──────────┘ │ +│ │ │ +│ ┌─────────────────────┼─────────────────────┐ │ +│ ↓ ↓ ↓ │ +│ ┌──────────────┐ ┌─────────────────┐ ┌──────────┐ │ +│ │ Fine-tuned │ │ RAG Engine │ │ Vault │ │ +│ │ 7B Model │ │ (LanceDB) │ │ Indexer │ │ +│ │ │ │ │ │ │ │ +│ │ Quarterly │ │ • semantic │ │ • watch │ │ +│ │ retrain │ │ search │ │ • chunk │ │ +│ │ │ │ • hybrid │ │ • embed │ │ +│ │ │ │ filters │ │ │ │ +│ │ │ │ • relationship │ │ Daily │ │ +│ │ │ │ graph │ │ auto-sync│ │ +│ └──────────────┘ └─────────────────┘ └──────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────┘ +``` + +## Quick Start + +### Prerequisites + +- Python 3.11+ +- Node.js 18+ (for UI) +- Ollama running locally +- RTX 5070 or equivalent (12GB+ VRAM for fine-tuning) + +### Installation + +```bash +# Clone and setup +cd kv-rag +pip install -e ".[dev]" + +# Install UI dependencies +cd ui && npm install && cd .. + +# Pull required Ollama models +ollama pull mxbai-embed-large +ollama pull llama3.1:8b +``` + +### Configuration + +Copy `config.json` and customize: + +```json +{ + "vault": { + "path": "/path/to/your/obsidian/vault" + }, + "companion": { + "name": "SAN" + } +} +``` + +See [docs/config.md](docs/config.md) for full configuration reference. 
+ +### Running + +**Terminal 1 - Backend:** +```bash +python -m uvicorn companion.api:app --host 0.0.0.0 --port 7373 +``` + +**Terminal 2 - Frontend:** +```bash +cd ui && npm run dev +``` + +**Terminal 3 - Indexer (optional):** +```bash +# One-time full index +python -m companion.indexer_daemon.cli index + +# Or continuous file watching +python -m companion.indexer_daemon.watcher +``` + +Open http://localhost:5173 + +## Usage + +### Chat Interface + +Type messages naturally. The companion will: +- Retrieve relevant context from your vault +- Reference past events, relationships, decisions +- Provide reflective, companion-style responses + +### Indexing Your Vault + +```bash +# Full reindex +python -m companion.indexer_daemon.cli index + +# Incremental sync +python -m companion.indexer_daemon.cli sync + +# Check status +python -m companion.indexer_daemon.cli status +``` + +### Fine-Tuning (Optional) + +Train a custom model that reasons like you: + +```bash +# Extract training examples from vault reflections +python -m companion.forge.cli extract + +# Train with QLoRA (4-6 hours on RTX 5070) +python -m companion.forge.cli train --epochs 3 + +# Reload the fine-tuned model +python -m companion.forge.cli reload ~/.companion/training/final +``` + +## Modules + +| Module | Purpose | Documentation | +|--------|---------|---------------| +| `companion.config` | Configuration management | [docs/config.md](docs/config.md) | +| `companion.rag` | RAG engine (chunk, embed, search) | [docs/rag.md](docs/rag.md) | +| `companion.forge` | Fine-tuning pipeline | [docs/forge.md](docs/forge.md) | +| `companion.api` | FastAPI backend | [docs/api.md](docs/api.md) | +| `ui/` | React frontend | [docs/ui.md](docs/ui.md) | + +## Project Structure + +``` +kv-rag/ +├── companion/ # Python backend +│ ├── __init__.py +│ ├── api.py # FastAPI app +│ ├── config.py # Configuration +│ ├── memory.py # Session memory (SQLite) +│ ├── orchestrator.py # Chat orchestration +│ ├── prompts.py # Prompt 
templates +│ ├── rag/ # RAG modules +│ │ ├── chunker.py +│ │ ├── embedder.py +│ │ ├── indexer.py +│ │ ├── search.py +│ │ └── vector_store.py +│ ├── forge/ # Fine-tuning +│ │ ├── extract.py +│ │ ├── train.py +│ │ ├── export.py +│ │ └── reload.py +│ └── indexer_daemon/ # File watching +│ ├── cli.py +│ └── watcher.py +├── ui/ # React frontend +│ ├── src/ +│ │ ├── App.tsx +│ │ ├── components/ +│ │ └── hooks/ +│ └── package.json +├── tests/ # Test suite +├── config.json # Configuration file +├── docs/ # Documentation +└── README.md +``` + +## Testing + +```bash +# Run all tests +pytest tests/ -v + +# Run specific module +pytest tests/test_chunker.py -v +``` + +## Privacy & Security + +- **Fully Local**: No data leaves your machine +- **Vault Data**: Never sent to external APIs for training +- **Config**: `local_only: true` blocks external API calls +- **Sensitive Tags**: Configurable patterns for health, finance, etc. + +## License + +MIT License - See LICENSE file + +## Acknowledgments + +- Built with [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning +- Uses [LanceDB](https://lancedb.github.io/) for vector storage +- UI inspired by [Obsidian](https://obsidian.md/) aesthetics diff --git a/config-schema.json b/config-schema.json new file mode 100644 index 0000000..508014a --- /dev/null +++ b/config-schema.json @@ -0,0 +1,667 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://companion.ai/config-schema.json", + "title": "Companion AI Configuration", + "description": "Configuration schema for Personal Companion AI", + "type": "object", + "required": ["companion", "vault", "rag", "model", "api", "ui", "logging", "security"], + "properties": { + "companion": { + "type": "object", + "title": "Companion Settings", + "required": ["name", "persona", "memory", "chat"], + "properties": { + "name": { + "type": "string", + "description": "Display name for the companion", + "default": "SAN" + }, + "persona": { + "type": "object", + 
"required": ["role", "tone", "style", "boundaries"], + "properties": { + "role": { + "type": "string", + "description": "Role of the companion", + "enum": ["companion", "advisor", "reflector"], + "default": "companion" + }, + "tone": { + "type": "string", + "description": "Communication tone", + "enum": ["reflective", "supportive", "analytical", "mixed"], + "default": "reflective" + }, + "style": { + "type": "string", + "description": "Interaction style", + "enum": ["questioning", "supportive", "direct", "mixed"], + "default": "questioning" + }, + "boundaries": { + "type": "array", + "description": "Behavioral guardrails", + "items": { + "type": "string", + "enum": [ + "does_not_impersonate_user", + "no_future_predictions", + "no_medical_or_legal_advice" + ] + }, + "default": ["does_not_impersonate_user", "no_future_predictions", "no_medical_or_legal_advice"] + } + } + }, + "memory": { + "type": "object", + "required": ["session_turns", "persistent_store", "summarize_after"], + "properties": { + "session_turns": { + "type": "integer", + "description": "Messages to keep in context", + "minimum": 1, + "maximum": 100, + "default": 20 + }, + "persistent_store": { + "type": "string", + "description": "SQLite database path", + "default": "~/.companion/memory.db" + }, + "summarize_after": { + "type": "integer", + "description": "Summarize history after N turns", + "minimum": 5, + "maximum": 50, + "default": 10 + } + } + }, + "chat": { + "type": "object", + "required": ["streaming", "max_response_tokens", "default_temperature", "allow_temperature_override"], + "properties": { + "streaming": { + "type": "boolean", + "description": "Stream responses in real-time", + "default": true + }, + "max_response_tokens": { + "type": "integer", + "description": "Max tokens per response", + "minimum": 256, + "maximum": 8192, + "default": 2048 + }, + "default_temperature": { + "type": "number", + "description": "Creativity level (0.0=deterministic, 2.0=creative)", + "minimum": 0.0, + 
"maximum": 2.0, + "default": 0.7 + }, + "allow_temperature_override": { + "type": "boolean", + "description": "Let users adjust temperature", + "default": true + } + } + } + } + }, + "vault": { + "type": "object", + "title": "Vault Settings", + "required": ["path", "indexing", "chunking_rules"], + "properties": { + "path": { + "type": "string", + "description": "Absolute path to Obsidian vault root" + }, + "indexing": { + "type": "object", + "required": ["auto_sync", "auto_sync_interval_minutes", "watch_fs_events", "file_patterns", "deny_dirs", "deny_patterns"], + "properties": { + "auto_sync": { + "type": "boolean", + "description": "Enable automatic syncing", + "default": true + }, + "auto_sync_interval_minutes": { + "type": "integer", + "description": "Minutes between full syncs", + "minimum": 60, + "maximum": 10080, + "default": 1440 + }, + "watch_fs_events": { + "type": "boolean", + "description": "Watch for file system changes", + "default": true + }, + "file_patterns": { + "type": "array", + "description": "File patterns to index", + "items": { "type": "string" }, + "default": ["*.md"] + }, + "deny_dirs": { + "type": "array", + "description": "Directories to skip", + "items": { "type": "string" }, + "default": [".obsidian", ".trash", "zzz-Archive", ".git", ".logseq"] + }, + "deny_patterns": { + "type": "array", + "description": "File patterns to ignore", + "items": { "type": "string" }, + "default": ["*.tmp", "*.bak", "*conflict*", ".*"] + } + } + }, + "chunking_rules": { + "type": "object", + "description": "Per-directory chunking rules (key: glob pattern, value: rule)", + "additionalProperties": { + "type": "object", + "required": ["strategy", "chunk_size", "chunk_overlap"], + "properties": { + "strategy": { + "type": "string", + "enum": ["sliding_window", "section"], + "description": "Chunking strategy" + }, + "chunk_size": { + "type": "integer", + "description": "Target chunk size in words", + "minimum": 50, + "maximum": 2000 + }, + "chunk_overlap": { + 
"type": "integer", + "description": "Overlap between chunks in words", + "minimum": 0, + "maximum": 500 + }, + "section_tags": { + "type": "array", + "description": "Tags that mark sections (for section strategy)", + "items": { "type": "string" } + } + } + } + } + } + }, + "rag": { + "type": "object", + "title": "RAG Settings", + "required": ["embedding", "vector_store", "search"], + "properties": { + "embedding": { + "type": "object", + "required": ["provider", "model", "base_url", "dimensions", "batch_size"], + "properties": { + "provider": { + "type": "string", + "description": "Embedding service provider", + "enum": ["ollama"], + "default": "ollama" + }, + "model": { + "type": "string", + "description": "Model name for embeddings", + "enum": ["mxbai-embed-large", "nomic-embed-text", "all-minilm"], + "default": "mxbai-embed-large" + }, + "base_url": { + "type": "string", + "description": "Provider API endpoint", + "format": "uri", + "default": "http://localhost:11434" + }, + "dimensions": { + "type": "integer", + "description": "Embedding vector size", + "enum": [384, 768, 1024], + "default": 1024 + }, + "batch_size": { + "type": "integer", + "description": "Texts per embedding batch", + "minimum": 1, + "maximum": 256, + "default": 32 + } + } + }, + "vector_store": { + "type": "object", + "required": ["type", "path"], + "properties": { + "type": { + "type": "string", + "description": "Vector database type", + "enum": ["lancedb"], + "default": "lancedb" + }, + "path": { + "type": "string", + "description": "Storage path", + "default": "~/.companion/vectors.lance" + } + } + }, + "search": { + "type": "object", + "required": ["default_top_k", "max_top_k", "similarity_threshold", "hybrid_search", "filters"], + "properties": { + "default_top_k": { + "type": "integer", + "description": "Default results to retrieve", + "minimum": 1, + "maximum": 100, + "default": 8 + }, + "max_top_k": { + "type": "integer", + "description": "Maximum allowed results", + "minimum": 1, + 
"maximum": 100, + "default": 20 + }, + "similarity_threshold": { + "type": "number", + "description": "Minimum relevance score (0-1)", + "minimum": 0.0, + "maximum": 1.0, + "default": 0.75 + }, + "hybrid_search": { + "type": "object", + "required": ["enabled", "keyword_weight", "semantic_weight"], + "properties": { + "enabled": { + "type": "boolean", + "description": "Combine keyword + semantic search", + "default": true + }, + "keyword_weight": { + "type": "number", + "description": "Keyword search weight", + "minimum": 0.0, + "maximum": 1.0, + "default": 0.3 + }, + "semantic_weight": { + "type": "number", + "description": "Semantic search weight", + "minimum": 0.0, + "maximum": 1.0, + "default": 0.7 + } + } + }, + "filters": { + "type": "object", + "required": ["date_range_enabled", "tag_filter_enabled", "directory_filter_enabled"], + "properties": { + "date_range_enabled": { + "type": "boolean", + "description": "Enable date range filtering", + "default": true + }, + "tag_filter_enabled": { + "type": "boolean", + "description": "Enable tag filtering", + "default": true + }, + "directory_filter_enabled": { + "type": "boolean", + "description": "Enable directory filtering", + "default": true + } + } + } + } + } + } + }, + "model": { + "type": "object", + "title": "Model Settings", + "required": ["inference", "fine_tuning", "retrain_schedule"], + "properties": { + "inference": { + "type": "object", + "required": ["backend", "model_path", "context_length", "gpu_layers", "batch_size", "threads"], + "properties": { + "backend": { + "type": "string", + "description": "Inference engine", + "enum": ["llama.cpp", "vllm"], + "default": "llama.cpp" + }, + "model_path": { + "type": "string", + "description": "Path to GGUF or HF model", + "default": "~/.companion/models/companion-7b-q4.gguf" + }, + "context_length": { + "type": "integer", + "description": "Max context tokens", + "minimum": 2048, + "maximum": 32768, + "default": 8192 + }, + "gpu_layers": { + "type": "integer", 
+ "description": "Layers to offload to GPU (0 for CPU-only)", + "minimum": 0, + "maximum": 100, + "default": 35 + }, + "batch_size": { + "type": "integer", + "description": "Inference batch size", + "minimum": 1, + "maximum": 2048, + "default": 512 + }, + "threads": { + "type": "integer", + "description": "CPU threads for inference", + "minimum": 1, + "maximum": 64, + "default": 8 + } + } + }, + "fine_tuning": { + "type": "object", + "required": ["base_model", "output_dir", "lora_rank", "lora_alpha", "learning_rate", "batch_size", "gradient_accumulation_steps", "num_epochs", "warmup_steps", "save_steps", "eval_steps", "training_data_path", "validation_split"], + "properties": { + "base_model": { + "type": "string", + "description": "Base model for fine-tuning", + "default": "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit" + }, + "output_dir": { + "type": "string", + "description": "Training outputs directory", + "default": "~/.companion/training" + }, + "lora_rank": { + "type": "integer", + "description": "LoRA rank (higher = more capacity, more VRAM)", + "minimum": 4, + "maximum": 128, + "default": 16 + }, + "lora_alpha": { + "type": "integer", + "description": "LoRA alpha (scaling factor, typically 2x rank)", + "minimum": 8, + "maximum": 256, + "default": 32 + }, + "learning_rate": { + "type": "number", + "description": "Training learning rate", + "minimum": 1e-6, + "maximum": 1e-3, + "default": 0.0002 + }, + "batch_size": { + "type": "integer", + "description": "Per-device batch size", + "minimum": 1, + "maximum": 32, + "default": 4 + }, + "gradient_accumulation_steps": { + "type": "integer", + "description": "Steps to accumulate before update", + "minimum": 1, + "maximum": 64, + "default": 4 + }, + "num_epochs": { + "type": "integer", + "description": "Training epochs", + "minimum": 1, + "maximum": 20, + "default": 3 + }, + "warmup_steps": { + "type": "integer", + "description": "Learning rate warmup steps", + "minimum": 0, + "maximum": 10000, + "default": 100 + 
}, + "save_steps": { + "type": "integer", + "description": "Checkpoint frequency", + "minimum": 10, + "maximum": 10000, + "default": 500 + }, + "eval_steps": { + "type": "integer", + "description": "Evaluation frequency", + "minimum": 10, + "maximum": 10000, + "default": 250 + }, + "training_data_path": { + "type": "string", + "description": "Training data directory", + "default": "~/.companion/training_data/" + }, + "validation_split": { + "type": "number", + "description": "Fraction of data for validation", + "minimum": 0.0, + "maximum": 0.5, + "default": 0.1 + } + } + }, + "retrain_schedule": { + "type": "object", + "required": ["auto_reminder", "default_interval_days", "reminder_channels"], + "properties": { + "auto_reminder": { + "type": "boolean", + "description": "Enable retrain reminders", + "default": true + }, + "default_interval_days": { + "type": "integer", + "description": "Days between retrain reminders", + "minimum": 30, + "maximum": 365, + "default": 90 + }, + "reminder_channels": { + "type": "array", + "description": "Where to show reminders", + "items": { + "type": "string", + "enum": ["chat_stream", "log", "ui"] + }, + "default": ["chat_stream", "log"] + } + } + } + } + }, + "api": { + "type": "object", + "title": "API Settings", + "required": ["host", "port", "cors_origins", "auth"], + "properties": { + "host": { + "type": "string", + "description": "Bind address (use 0.0.0.0 for LAN access)", + "default": "127.0.0.1" + }, + "port": { + "type": "integer", + "description": "HTTP port", + "minimum": 1, + "maximum": 65535, + "default": 7373 + }, + "cors_origins": { + "type": "array", + "description": "Allowed CORS origins", + "items": { + "type": "string", + "format": "uri" + }, + "default": ["http://localhost:5173"] + }, + "auth": { + "type": "object", + "required": ["enabled"], + "properties": { + "enabled": { + "type": "boolean", + "description": "Enable API key authentication", + "default": false + } + } + } + } + }, + "ui": { + "type": 
"object", + "title": "UI Settings", + "required": ["web", "cli"], + "properties": { + "web": { + "type": "object", + "required": ["enabled", "theme", "features"], + "properties": { + "enabled": { + "type": "boolean", + "description": "Enable web interface", + "default": true + }, + "theme": { + "type": "string", + "description": "UI theme", + "enum": ["obsidian"], + "default": "obsidian" + }, + "features": { + "type": "object", + "required": ["streaming", "citations", "source_preview"], + "properties": { + "streaming": { + "type": "boolean", + "description": "Real-time response streaming", + "default": true + }, + "citations": { + "type": "boolean", + "description": "Show source citations", + "default": true + }, + "source_preview": { + "type": "boolean", + "description": "Preview source snippets", + "default": true + } + } + } + } + }, + "cli": { + "type": "object", + "required": ["enabled", "rich_output"], + "properties": { + "enabled": { + "type": "boolean", + "description": "Enable CLI interface", + "default": true + }, + "rich_output": { + "type": "boolean", + "description": "Rich terminal formatting", + "default": true + } + } + } + } + }, + "logging": { + "type": "object", + "title": "Logging Settings", + "required": ["level", "file", "max_size_mb", "backup_count"], + "properties": { + "level": { + "type": "string", + "description": "Log level", + "enum": ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], + "default": "INFO" + }, + "file": { + "type": "string", + "description": "Log file path", + "default": "~/.companion/logs/companion.log" + }, + "max_size_mb": { + "type": "integer", + "description": "Max log file size in MB", + "minimum": 10, + "maximum": 1000, + "default": 100 + }, + "backup_count": { + "type": "integer", + "description": "Number of rotated backups", + "minimum": 1, + "maximum": 20, + "default": 5 + } + } + }, + "security": { + "type": "object", + "title": "Security Settings", + "required": ["local_only", "vault_path_traversal_check", 
"sensitive_content_detection", "sensitive_patterns", "require_confirmation_for_external_apis"], + "properties": { + "local_only": { + "type": "boolean", + "description": "Block external API calls", + "default": true + }, + "vault_path_traversal_check": { + "type": "boolean", + "description": "Prevent path traversal attacks", + "default": true + }, + "sensitive_content_detection": { + "type": "boolean", + "description": "Tag sensitive content", + "default": true + }, + "sensitive_patterns": { + "type": "array", + "description": "Tags considered sensitive", + "items": { "type": "string" }, + "default": ["#mentalhealth", "#physicalhealth", "#finance", "#Relations"] + }, + "require_confirmation_for_external_apis": { + "type": "boolean", + "description": "Confirm before external API calls", + "default": true + } + } + } + } +} diff --git a/docs/config.md b/docs/config.md new file mode 100644 index 0000000..628eb4a --- /dev/null +++ b/docs/config.md @@ -0,0 +1,278 @@ +# Configuration Reference + +Complete reference for `config.json` configuration options. + +## Overview + +The configuration file uses JSON format with support for: +- Path expansion (`~` expands to home directory) +- Type validation via Pydantic models +- Environment-specific overrides + +## Schema Validation + +Validate your config against the schema: + +```bash +python -c "from companion.config import load_config; load_config('config.json')" +``` + +Or use the JSON Schema directly: [config-schema.json](../config-schema.json) + +## Configuration Sections + +### companion + +Core companion personality and behavior settings. 
+ +```json +{ + "companion": { + "name": "SAN", + "persona": { + "role": "companion", + "tone": "reflective", + "style": "questioning", + "boundaries": [ + "does_not_impersonate_user", + "no_future_predictions", + "no_medical_or_legal_advice" + ] + }, + "memory": { + "session_turns": 20, + "persistent_store": "~/.companion/memory.db", + "summarize_after": 10 + }, + "chat": { + "streaming": true, + "max_response_tokens": 2048, + "default_temperature": 0.7, + "allow_temperature_override": true + } + } +} +``` + +#### Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `name` | string | "SAN" | Display name for the companion | +| `persona.role` | string | "companion" | Role description (companion/advisor/reflector) | +| `persona.tone` | string | "reflective" | Communication tone (reflective/supportive/analytical) | +| `persona.style` | string | "questioning" | Interaction style (questioning/supportive/direct) | +| `persona.boundaries` | string[] | [...] | Behavioral guardrails | +| `memory.session_turns` | int | 20 | Messages to keep in context | +| `memory.persistent_store` | string | "~/.companion/memory.db" | SQLite database path | +| `memory.summarize_after` | int | 10 | Summarize history after N turns | +| `chat.streaming` | bool | true | Stream responses in real-time | +| `chat.max_response_tokens` | int | 2048 | Max tokens per response | +| `chat.default_temperature` | float | 0.7 | Creativity (0.0=deterministic, 2.0=creative) | +| `chat.allow_temperature_override` | bool | true | Let users adjust temperature | + +--- + +### vault + +Obsidian vault indexing configuration. 
+ +```json +{ + "vault": { + "path": "~/KnowledgeVault/Default", + "indexing": { + "auto_sync": true, + "auto_sync_interval_minutes": 1440, + "watch_fs_events": true, + "file_patterns": ["*.md"], + "deny_dirs": [".obsidian", ".trash", "zzz-Archive", ".git"], + "deny_patterns": ["*.tmp", "*.bak", "*conflict*"] + }, + "chunking_rules": { + "default": { + "strategy": "sliding_window", + "chunk_size": 500, + "chunk_overlap": 100 + }, + "Journal/**": { + "strategy": "section", + "section_tags": ["#DayInShort", "#mentalhealth", "#work"], + "chunk_size": 300, + "chunk_overlap": 50 + } + } + } +} +``` + +--- + +### rag + +RAG (Retrieval-Augmented Generation) engine configuration. + +```json +{ + "rag": { + "embedding": { + "provider": "ollama", + "model": "mxbai-embed-large", + "base_url": "http://localhost:11434", + "dimensions": 1024, + "batch_size": 32 + }, + "vector_store": { + "type": "lancedb", + "path": "~/.companion/vectors.lance" + }, + "search": { + "default_top_k": 8, + "max_top_k": 20, + "similarity_threshold": 0.75, + "hybrid_search": { + "enabled": true, + "keyword_weight": 0.3, + "semantic_weight": 0.7 + }, + "filters": { + "date_range_enabled": true, + "tag_filter_enabled": true, + "directory_filter_enabled": true + } + } + } +} +``` + +--- + +### model + +LLM configuration for inference and fine-tuning. 
+ +```json +{ + "model": { + "inference": { + "backend": "llama.cpp", + "model_path": "~/.companion/models/companion-7b-q4.gguf", + "context_length": 8192, + "gpu_layers": 35, + "batch_size": 512, + "threads": 8 + }, + "fine_tuning": { + "base_model": "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit", + "output_dir": "~/.companion/training", + "lora_rank": 16, + "lora_alpha": 32, + "learning_rate": 0.0002, + "batch_size": 4, + "gradient_accumulation_steps": 4, + "num_epochs": 3, + "warmup_steps": 100, + "save_steps": 500, + "eval_steps": 250, + "training_data_path": "~/.companion/training_data/", + "validation_split": 0.1 + }, + "retrain_schedule": { + "auto_reminder": true, + "default_interval_days": 90, + "reminder_channels": ["chat_stream", "log"] + } + } +} +``` + +--- + +### api + +FastAPI backend configuration. + +```json +{ + "api": { + "host": "127.0.0.1", + "port": 7373, + "cors_origins": ["http://localhost:5173"], + "auth": { + "enabled": false + } + } +} +``` + +--- + +### ui + +Web UI configuration. + +```json +{ + "ui": { + "web": { + "enabled": true, + "theme": "obsidian", + "features": { + "streaming": true, + "citations": true, + "source_preview": true + } + }, + "cli": { + "enabled": true, + "rich_output": true + } + } +} +``` + +--- + +### logging + +Logging configuration. + +```json +{ + "logging": { + "level": "INFO", + "file": "~/.companion/logs/companion.log", + "max_size_mb": 100, + "backup_count": 5 + } +} +``` + +--- + +### security + +Security and privacy settings. + +```json +{ + "security": { + "local_only": true, + "vault_path_traversal_check": true, + "sensitive_content_detection": true, + "sensitive_patterns": [ + "#mentalhealth", + "#physicalhealth", + "#finance", + "#Relations" + ], + "require_confirmation_for_external_apis": true + } +} +``` + +--- + +## Full Example + +See [config.json](../config.json) for a complete working configuration. 
diff --git a/docs/forge.md b/docs/forge.md new file mode 100644 index 0000000..e6fc04f --- /dev/null +++ b/docs/forge.md @@ -0,0 +1,288 @@ +# FORGE Module Documentation + +The FORGE module handles fine-tuning of the companion model. It extracts training examples from your vault reflections and trains a custom LoRA adapter using QLoRA on your local GPU. + +## Architecture + +``` +Vault Reflections + ↓ +┌─────────────────┐ +│ Extract │ - Scan for #reflection, #insight tags +│ (extract.py) │ - Parse reflection patterns +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Curate │ - Manual review (optional) +│ (curate.py) │ - Deduplication +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Train │ - QLoRA fine-tuning +│ (train.py) │ - Unsloth + transformers +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Export │ - Merge LoRA weights +│ (export.py) │ - Convert to GGUF +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Reload │ - Hot-swap in API +│ (reload.py) │ - No restart needed +└─────────────────┘ +``` + +## Requirements + +- **GPU**: RTX 5070 or equivalent (12GB+ VRAM) +- **Dependencies**: Install with `pip install -e ".[train]"` +- **Time**: 4-6 hours for full training run + +## Workflow + +### 1. Extract Training Data + +Scan your vault for reflection patterns: + +```bash +python -m companion.forge.cli extract +``` + +This scans for: +- Tags: `#reflection`, `#insight`, `#learning`, `#decision`, etc. +- Patterns: "I think", "I realize", "Looking back", "What if" +- Section headers in journal entries + +Output: `~/.companion/training_data/extracted.jsonl` + +**Example extracted data:** + +```json +{ + "messages": [ + {"role": "system", "content": "You are a thoughtful, reflective companion."}, + {"role": "user", "content": "I'm facing a decision. 
How should I think through this?"}, + {"role": "assistant", "content": "#reflection I think I need to slow down..."} + ], + "source_file": "Journal/2026/04/2026-04-12.md", + "tags": ["#reflection", "#DayInShort"], + "date": "2026-04-12" +} +``` + +### 2. Train Model + +Run QLoRA fine-tuning: + +```bash +python -m companion.forge.cli train --epochs 3 --lr 2e-4 +``` + +**Hyperparameters (from config):** + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `lora_rank` | 16 | LoRA rank (8-64) | +| `lora_alpha` | 32 | LoRA scaling factor | +| `learning_rate` | 2e-4 | Optimizer learning rate | +| `num_epochs` | 3 | Training epochs | +| `batch_size` | 4 | Per-device batch | +| `gradient_accumulation_steps` | 4 | Steps before update | + +**Training Output:** +- Checkpoints: `~/.companion/training/checkpoint-*/` +- Final model: `~/.companion/training/final/` +- Logs: Training loss, eval metrics + +### 3. Reload Model + +Hot-swap without restarting API: + +```bash +python -m companion.forge.cli reload ~/.companion/training/final +``` + +Or via API: + +```bash +curl -X POST http://localhost:7373/admin/reload-model \ + -H "Content-Type: application/json" \ + -d '{"model_path": "~/.companion/training/final"}' +``` + +## Components + +### Extractor (`companion.forge.extract`) + +```python +from companion.forge.extract import TrainingDataExtractor, extract_training_data + +# Extract from vault +extractor = TrainingDataExtractor(config) +examples = extractor.extract() + +# Get statistics +stats = extractor.get_stats() +print(f"Extracted {stats['total']} examples") + +# Save to JSONL +extractor.save_to_jsonl(Path("training.jsonl")) +``` + +**Reflection Detection:** + +- **Tags**: `#reflection`, `#learning`, `#insight`, `#decision`, `#analysis`, `#takeaway`, `#realization` +- **Patterns**: "I think", "I feel", "I realize", "I wonder", "Looking back", "On one hand...", "Ultimately decided" + +### Trainer (`companion.forge.train`) + +```python +from 
companion.forge.train import train + +final_path = train( + data_path=Path("training.jsonl"), + output_dir=Path("~/.companion/training"), + base_model="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit", + lora_rank=16, + lora_alpha=32, + learning_rate=2e-4, + num_epochs=3, +) +``` + +**Base Models:** + +- `unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit` - Recommended +- `unsloth/llama-3-8b-bnb-4bit` - Alternative + +**Target Modules:** + +LoRA is applied to: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` + +### Exporter (`companion.forge.export`) + +```python +from companion.forge.export import merge_only + +# Merge LoRA into base model +merged_path = merge_only( + checkpoint_path=Path("~/.companion/training/checkpoint-500"), + output_path=Path("~/.companion/models/merged"), +) +``` + +### Reloader (`companion.forge.reload`) + +```python +from companion.forge.reload import reload_model, get_model_status + +# Check current model +status = get_model_status(config) +print(f"Model size: {status['size_mb']} MB") + +# Reload with new model +new_path = reload_model( + config=config, + new_model_path=Path("~/.companion/training/final"), + backup=True, +) +``` + +## CLI Reference + +```bash +# Extract training data +companion.forge.cli extract [--output PATH] + +# Train model +companion.forge.cli train \ + [--data PATH] \ + [--output PATH] \ + [--epochs N] \ + [--lr FLOAT] + +# Check model status +companion.forge.cli status + +# Reload model +companion.forge.cli reload MODEL_PATH [--no-backup] +``` + +## Training Tips + +**Dataset Size:** +- Minimum: 50 examples +- Optimal: 100-500 examples +- More is not always better - quality over quantity + +**Epochs:** +- Start with 3 epochs +- Increase if underfitting (high loss) +- Decrease if overfitting (loss increases on eval) + +**LoRA Rank:** +- `8` - Quick experiments +- `16` - Balanced (recommended) +- `32-64` - High capacity, more VRAM + +**Overfitting Signs:** +- Training loss decreasing, eval loss 
increasing +- Model repeats exact phrases from training data +- Responses feel "memorized" not "learned" + +## VRAM Usage (RTX 5070, 12GB) + +| Config | VRAM | Batch Size | +|--------|------|------------| +| Rank 16, 8-bit adam | ~10GB | 4 | +| Rank 32, 8-bit adam | ~11GB | 4 | +| Rank 64, 8-bit adam | OOM | - | + +Use `gradient_accumulation_steps` to increase effective batch size. + +## Troubleshooting + +**CUDA Out of Memory** +- Reduce `lora_rank` to 8 +- Reduce `batch_size` to 2 +- Increase `gradient_accumulation_steps` + +**Training Loss Not Decreasing** +- Check data quality (reflections present?) +- Increase learning rate to 5e-4 +- Check for data formatting issues + +**Model Not Loading After Reload** +- Check path exists: `ls -la ~/.companion/models/` +- Verify model format (GGUF vs HF) +- Check API logs for errors + +**Slow Training** +- Expected: ~6 hours for 3 epochs on RTX 5070 +- Enable gradient checkpointing (enabled by default) +- Close other GPU applications + +## Advanced: Custom Training Script + +```python +# custom_train.py +from companion.forge.train import train +from companion.config import load_config + +config = load_config() + +final_path = train( + data_path=config.model.fine_tuning.training_data_path / "curated.jsonl", + output_dir=config.model.fine_tuning.output_dir, + base_model=config.model.fine_tuning.base_model, + lora_rank=32, # Higher capacity + lora_alpha=64, + learning_rate=3e-4, # Slightly higher + num_epochs=5, # More epochs + batch_size=2, # Smaller batches + gradient_accumulation_steps=8, # Effective batch = 16 +) + +print(f"Model saved to: {final_path}") +``` diff --git a/docs/rag.md b/docs/rag.md new file mode 100644 index 0000000..f76a70d --- /dev/null +++ b/docs/rag.md @@ -0,0 +1,269 @@ +# RAG Module Documentation + +The RAG (Retrieval-Augmented Generation) module provides semantic search over your Obsidian vault. It handles document chunking, embedding generation, and vector similarity search. 
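To make the chunking stage concrete, here is a minimal word-based sliding-window splitter. This is a simplified sketch, not the actual `companion.rag.chunker` implementation (which also extracts tags, dates, and sections); the function name is illustrative.

```python
def sliding_window_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into chunks of chunk_size words, each overlapping the previous by chunk_overlap words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already covers the tail of the document
    return chunks
```

With the defaults, consecutive chunks share 100 words of context, which helps retrieval when a relevant passage straddles a chunk boundary.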
+ +## Architecture + +``` +Vault Markdown Files + ↓ +┌─────────────────┐ +│ Chunker │ - Split by strategy (sliding window / section) +│ (chunker.py) │ - Extract metadata (tags, dates, sections) +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Embedder │ - HTTP client for Ollama API +│ (embedder.py) │ - Batch processing with retries +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Vector Store │ - LanceDB persistence +│(vector_store.py)│ - Upsert, delete, search +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Indexer │ - Full/incremental sync +│ (indexer.py) │ - File watching +└─────────────────┘ +``` + +## Components + +### Chunker (`companion.rag.chunker`) + +Splits markdown files into searchable chunks. + +```python +from companion.rag.chunker import chunk_file, ChunkingRule + +rules = { + "default": ChunkingRule(strategy="sliding_window", chunk_size=500, chunk_overlap=100), + "Journal/**": ChunkingRule(strategy="section", section_tags=["#DayInShort"], chunk_size=300, chunk_overlap=50), +} + +chunks = chunk_file( + file_path=Path("journal/2026-04-12.md"), + vault_root=Path("~/vault"), + rules=rules, + modified_at=1234567890.0, +) + +for chunk in chunks: + print(f"{chunk.source_file}:{chunk.chunk_index}") + print(f"Text: {chunk.text[:100]}...") + print(f"Tags: {chunk.tags}") + print(f"Date: {chunk.date}") +``` + +#### Chunking Strategies + +**Sliding Window** +- Fixed-size chunks with overlap +- Best for: Longform text, articles + +```python +ChunkingRule( + strategy="sliding_window", + chunk_size=500, # words per chunk + chunk_overlap=100, # words overlap between chunks +) +``` + +**Section-Based** +- Split on section headers (tags) +- Best for: Structured journals, daily notes + +```python +ChunkingRule( + strategy="section", + section_tags=["#DayInShort", "#mentalhealth", "#work"], + chunk_size=300, + chunk_overlap=50, +) +``` + +#### Metadata Extraction + +Each chunk includes: +- `source_file` - Relative path from vault root +- `source_directory` - Top-level 
directory +- `section` - Section header (for section strategy) +- `date` - Parsed from filename +- `tags` - Hashtags and wikilinks +- `chunk_index` - Position in document +- `modified_at` - File mtime for sync + +### Embedder (`companion.rag.embedder`) + +Generates embeddings via Ollama API. + +```python +from companion.rag.embedder import OllamaEmbedder + +embedder = OllamaEmbedder( + base_url="http://localhost:11434", + model="mxbai-embed-large", + batch_size=32, +) + +# Single embedding +embeddings = embedder.embed(["Hello world"]) +print(len(embeddings[0])) # 1024 dimensions + +# Batch embedding (with automatic batching) +texts = ["text 1", "text 2", "text 3", ...] # 100 texts +embeddings = embedder.embed(texts) # Automatically batches +``` + +#### Features + +- **Batching**: Automatically splits large requests +- **Retries**: Exponential backoff on failures +- **Context Manager**: Proper resource cleanup + +```python +with OllamaEmbedder(...) as embedder: + embeddings = embedder.embed(texts) +``` + +### Vector Store (`companion.rag.vector_store`) + +LanceDB wrapper for vector storage. 
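Conceptually, a vector search scores every stored embedding against the query embedding and returns the closest matches. The brute-force sketch below illustrates that idea; LanceDB does the same thing at scale with indexed search, and the helper names here are purely illustrative.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query: list[float], stored: dict[str, list[float]], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank stored vectors by similarity to the query; return the top_k (chunk_id, score) pairs."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```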
+ +```python +from companion.rag.vector_store import VectorStore + +store = VectorStore( + uri="~/.companion/vectors.lance", + dimensions=1024, +) + +# Upsert chunks +store.upsert( + ids=["file.md::0", "file.md::1"], + texts=["chunk 1", "chunk 2"], + embeddings=[[0.1, ...], [0.2, ...]], + metadatas=[ + {"source_file": "file.md", "source_directory": "docs"}, + {"source_file": "file.md", "source_directory": "docs"}, + ], +) + +# Search +results = store.search( + query_vector=[0.1, ...], + top_k=8, + filters={"source_directory": "Journal"}, +) +``` + +#### Schema + +| Field | Type | Nullable | +|-------|------|----------| +| id | string | No | +| text | string | No | +| vector | list[float32] | No | +| source_file | string | No | +| source_directory | string | No | +| section | string | Yes | +| date | string | Yes | +| tags | list[string] | Yes | +| chunk_index | int32 | No | +| total_chunks | int32 | No | +| modified_at | float64 | Yes | +| rule_applied | string | No | + +### Indexer (`companion.rag.indexer`) + +Orchestrates vault indexing. + +```python +from companion.config import load_config +from companion.rag.indexer import Indexer +from companion.rag.vector_store import VectorStore + +config = load_config() +store = VectorStore( + uri=config.rag.vector_store.path, + dimensions=config.rag.embedding.dimensions, +) + +indexer = Indexer(config, store) + +# Full reindex (clear + rebuild) +indexer.full_index() + +# Incremental sync (only changed files) +indexer.sync() + +# Get status +status = indexer.status() +print(f"Total chunks: {status['total_chunks']}") +print(f"Unindexed files: {status['unindexed_files']}") +``` + +### Search (`companion.rag.search`) + +High-level search interface. 
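The usage example computes relevance as `1 - _distance`, and the engine takes a `similarity_threshold` parameter, which presumably drops weak matches below that score. A sketch of that filtering step, assuming results shaped like the raw search output (the helper name is illustrative, not the actual implementation):

```python
def filter_by_threshold(results: list[dict], similarity_threshold: float = 0.75) -> list[dict]:
    """Keep only results whose relevance (1 - _distance) meets the threshold."""
    kept = []
    for result in results:
        relevance = 1.0 - result["_distance"]
        if relevance >= similarity_threshold:
            # Annotate the kept result with its computed relevance score
            kept.append({**result, "relevance": relevance})
    return kept
```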
+ +```python +from companion.rag.search import SearchEngine + +engine = SearchEngine( + vector_store=store, + embedder_base_url="http://localhost:11434", + embedder_model="mxbai-embed-large", + default_top_k=8, + similarity_threshold=0.75, + hybrid_search_enabled=False, +) + +results = engine.search( + query="What did I learn about friendships?", + top_k=8, + filters={"source_directory": "Journal"}, +) + +for result in results: + print(f"Source: {result['source_file']}") + print(f"Relevance: {1 - result['_distance']:.2f}") +``` + +## CLI Commands + +```bash +# Full index +python -m companion.indexer_daemon.cli index + +# Incremental sync +python -m companion.indexer_daemon.cli sync + +# Check status +python -m companion.indexer_daemon.cli status + +# Reindex (same as index) +python -m companion.indexer_daemon.cli reindex +``` + +## Performance Tips + +1. **Chunk Size**: Smaller chunks = better retrieval, larger = more context +2. **Batch Size**: 32 is optimal for Ollama embeddings +3. **Filters**: Use directory filters to narrow search scope +4. **Sync vs Index**: Use `sync` for daily updates, `index` for full rebuilds + +## Troubleshooting + +**Slow indexing** +- Check Ollama is running: `ollama ps` +- Reduce batch size if OOM + +**No results** +- Verify vault path in config +- Check `indexer.status()` for unindexed files + +**Duplicate chunks** +- Each chunk ID is `{source_file}::{chunk_index}` +- Use `full_index()` to clear and rebuild diff --git a/docs/ui.md b/docs/ui.md new file mode 100644 index 0000000..3f23384 --- /dev/null +++ b/docs/ui.md @@ -0,0 +1,408 @@ +# UI Module Documentation + +The UI is a React + Vite frontend for the companion chat interface. It provides real-time streaming chat with a clean, Obsidian-inspired dark theme. 
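The streaming side rides on the SSE protocol detailed below. As a minimal sketch, a single `data:` line from the stream can be decoded like this (a hypothetical helper for illustration; the real logic lives in `useChatStream.ts`):

```typescript
// One event from the chat stream, mirroring the documented SSE payloads.
type StreamEvent =
  | { type: 'chunk'; content: string }
  | { type: 'sources'; sources: Array<{ file: string; section?: string; date?: string }> }
  | { type: 'done'; session_id: string }

function parseSSELine(line: string): StreamEvent | null {
  // SSE data lines look like: data: {"type": "chunk", "content": "Hi"}
  // Anything else (blank lines, comments) is ignored.
  if (!line.startsWith('data: ')) return null
  return JSON.parse(line.slice('data: '.length)) as StreamEvent
}
```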
+ +## Architecture + +``` +HTTP/SSE + ↓ +┌─────────────────┐ +│ App.tsx │ - State management +│ Message state │ - User/assistant messages +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ MessageList │ - Render messages +│ (components/) │ - User/assistant styling +└─────────────────┘ +┌─────────────────┐ +│ ChatInput │ - Textarea + send +│ (components/) │ - Auto-resize, hotkeys +└─────────────────┘ + ↓ +┌─────────────────┐ +│ useChatStream │ - SSE streaming +│ (hooks/) │ - Session management +└─────────────────┘ +``` + +## Project Structure + +``` +ui/ +├── src/ +│ ├── main.tsx # React entry point +│ ├── App.tsx # Main app component +│ ├── App.css # App layout styles +│ ├── index.css # Global styles +│ ├── components/ +│ │ ├── MessageList.tsx # Message display +│ │ ├── MessageList.css # Message styling +│ │ ├── ChatInput.tsx # Input textarea +│ │ └── ChatInput.css # Input styling +│ └── hooks/ +│ └── useChatStream.ts # SSE streaming hook +├── index.html # HTML template +├── vite.config.ts # Vite configuration +├── tsconfig.json # TypeScript config +└── package.json # Dependencies +``` + +## Components + +### App.tsx + +Main application state management: + +```typescript +interface Message { + role: 'user' | 'assistant' + content: string +} + +// State +const [messages, setMessages] = useState([]) +const [input, setInput] = useState('') +const [isLoading, setIsLoading] = useState(false) + +// Handlers +const handleSend = async () => { /* ... 
*/ } +const handleKeyDown = (e) => { /* Enter to send, Shift+Enter newline */ } +``` + +**Features:** +- Auto-scroll to bottom on new messages +- Keyboard shortcuts (Enter to send, Shift+Enter for newline) +- Loading state with animation +- Message streaming in real-time + +### MessageList.tsx + +Renders the chat history: + +```typescript +interface MessageListProps { + messages: Message[] + isLoading: boolean +} +``` + +**Layout:** +- User messages: Right-aligned, blue background +- Assistant messages: Left-aligned, gray background with border +- Loading indicator: Three animated dots +- Empty state: Prompt text when no messages + +**Styling:** +- Max-width 800px, centered +- Smooth scroll behavior +- Avatar-less design (clean, text-focused) + +### ChatInput.tsx + +Textarea input with send button: + +```typescript +interface ChatInputProps { + value: string + onChange: (value: string) => void + onSend: () => void + onKeyDown: (e: KeyboardEvent) => void + disabled: boolean +} +``` + +**Features:** +- Auto-resizing textarea +- Send button with loading state +- Placeholder text +- Disabled during streaming + +## Hooks + +### useChatStream.ts + +Manages SSE streaming connection: + +```typescript +interface UseChatStreamReturn { + sendMessage: ( + message: string, + onChunk: (chunk: string) => void + ) => Promise + sessionId: string | null +} + +const { sendMessage, sessionId } = useChatStream() +``` + +**Usage:** + +```typescript +await sendMessage("Hello", (chunk) => { + // Append chunk to current response + setMessages(prev => { + const last = prev[prev.length - 1] + if (last?.role === 'assistant') { + last.content += chunk + return [...prev] + } + return [...prev, { role: 'assistant', content: chunk }] + }) +}) +``` + +**SSE Protocol:** + +The API streams events in this format: + +``` +data: {"type": "chunk", "content": "Hello"} + +data: {"type": "chunk", "content": " world"} + +data: {"type": "sources", "sources": [{"file": "journal.md"}]} + +data: {"type": 
"done", "session_id": "uuid"} +``` + +## Styling + +### Design System + +Based on Obsidian's dark theme: + +```css +:root { + --bg-primary: #0d1117; /* App background */ + --bg-secondary: #161b22; /* Header/footer */ + --bg-tertiary: #21262d; /* Input background */ + + --text-primary: #c9d1d9; /* Main text */ + --text-secondary: #8b949e; /* Placeholder */ + + --accent-primary: #58a6ff; /* Primary blue */ + --accent-secondary: #79c0ff;/* Lighter blue */ + + --border: #30363d; /* Borders */ + --user-bg: #1f6feb; /* User message */ + --assistant-bg: #21262d; /* Assistant message */ +} +``` + +### Message Styling + +**User Message:** +- Blue background (`--user-bg`) +- White text +- Border radius: 12px (12px 12px 4px 12px) +- Max-width: 80% + +**Assistant Message:** +- Gray background (`--assistant-bg`) +- Light text (`--text-primary`) +- Border: 1px solid `--border` +- Border radius: 12px (12px 12px 12px 4px) + +### Loading Animation + +Three bouncing dots using CSS keyframes: + +```css +@keyframes bounce { + 0%, 80%, 100% { transform: scale(0.6); } + 40% { transform: scale(1); } +} +``` + +## Development + +### Setup + +```bash +cd ui +npm install +``` + +### Dev Server + +```bash +npm run dev +# Opens http://localhost:5173 +``` + +### Build + +```bash +npm run build +# Output: ui/dist/ +``` + +### Preview Production Build + +```bash +npm run preview +``` + +## Configuration + +### Vite Config + +`vite.config.ts`: + +```typescript +export default defineConfig({ + plugins: [react()], + server: { + port: 5173, + proxy: { + '/api': { + target: 'http://localhost:7373', + changeOrigin: true, + }, + }, + }, +}) +``` + +**Proxy Setup:** +- Frontend: `http://localhost:5173` +- API: `http://localhost:7373` +- `/api/*` → `http://localhost:7373/api/*` + +This allows using relative API paths in the code: + +```typescript +const API_BASE = '/api' // Not http://localhost:7373/api +``` + +## TypeScript + +### Types + +```typescript +// Message role +type Role = 'user' | 'assistant' 
+
+// Message object
+interface Message {
+  role: Role
+  content: string
+}
+
+// Chat request
+type ChatRequest = {
+  message: string
+  session_id?: string
+  temperature?: number
+  stream?: boolean
+}
+
+// SSE chunk
+type ChunkEvent = {
+  type: 'chunk'
+  content: string
+}
+
+type SourcesEvent = {
+  type: 'sources'
+  sources: Array<{
+    file: string
+    section?: string
+    date?: string
+  }>
+}
+
+type DoneEvent = {
+  type: 'done'
+  session_id: string
+}
+```
+
+## API Integration
+
+### Chat Endpoint
+
+```typescript
+const response = await fetch('/api/chat', {
+  method: 'POST',
+  headers: { 'Content-Type': 'application/json' },
+  body: JSON.stringify({
+    message: userInput,
+    session_id: sessionId, // null for new session
+    stream: true,
+  }),
+})
+
+// Read SSE stream
+const reader = response.body?.getReader()
+if (!reader) throw new Error('Response has no body')
+const decoder = new TextDecoder()
+
+while (true) {
+  const { done, value } = await reader.read()
+  if (done) break
+
+  const chunk = decoder.decode(value, { stream: true })
+  // Parse SSE lines
+}
+```
+
+### Session Persistence
+
+The backend maintains conversation history via `session_id`:
+
+1. First message: `session_id: null` → backend creates a UUID
+2. Response: the `X-Session-ID` header carries that UUID
+3. Subsequent messages: include the returned `session_id`
+4. Conversation history is retrieved automatically
+
+## Customization
+
+### Themes
+
+Modify `App.css` and `index.css`:
+
+```css
+/* Custom accent color */
+--accent-primary: #ff6b6b;
+--user-bg: #ff6b6b;
+```
+
+### Fonts
+
+Update `index.css`:
+
+```css
+body {
+  font-family: 'Inter', -apple-system, sans-serif;
+}
+```
+
+### Message Layout
+
+Modify `MessageList.css`:
+
+```css
+.message-content {
+  max-width: 90%; /* Wider messages */
+  font-size: 16px; /* Larger text */
+}
+```
+
+## Troubleshooting
+
+**CORS errors**
+- Check `vite.config.ts` proxy configuration
+- Verify backend CORS origins include `http://localhost:5173`
+
+**Stream not updating**
+- Check the browser network tab for SSE events
+- Verify the backend returns an `EventSourceResponse`
+
+**Messages not appearing**
+- Check React DevTools for state updates
+- Verify the `messages` state is updated for each streamed chunk
+
+**Build fails**
+- Check for TypeScript errors: `npx tsc --noEmit`
+- Update dependencies: `npm update`