From e77fa69b31da65c3c0938cb3a7ffee366fc144c4 Mon Sep 17 00:00:00 2001 From: Santhosh Janardhanan Date: Mon, 13 Apr 2026 15:35:22 -0400 Subject: [PATCH] docs: add comprehensive README and module documentation --- README.md | 207 ++++++++++++++ config-schema.json | 667 +++++++++++++++++++++++++++++++++++++++++++++ docs/config.md | 278 +++++++++++++++++++ docs/forge.md | 288 +++++++++++++++++++ docs/rag.md | 269 ++++++++++++++++++ docs/ui.md | 408 +++++++++++++++++++++++++++ 6 files changed, 2117 insertions(+) create mode 100644 README.md create mode 100644 config-schema.json create mode 100644 docs/config.md create mode 100644 docs/forge.md create mode 100644 docs/rag.md create mode 100644 docs/ui.md diff --git a/README.md b/README.md new file mode 100644 index 0000000..83bee53 --- /dev/null +++ b/README.md @@ -0,0 +1,207 @@ +# Personal Companion AI + +A fully local, privacy-first AI companion trained on your Obsidian vault. Combines fine-tuned reasoning with RAG-powered memory to answer questions about your life, relationships, and experiences. 
+ +## Architecture + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Personal Companion AI │ +├─────────────────────────────────────────────────────────────┤ +│ │ +│ ┌──────────────┐ ┌─────────────────┐ ┌──────────┐ │ +│ │ React UI │◄──►│ FastAPI │◄──►│ Ollama │ │ +│ │ (Vite) │ │ Backend │ │ Models │ │ +│ └──────────────┘ └─────────────────┘ └──────────┘ │ +│ │ │ +│ ┌─────────────────────┼─────────────────────┐ │ +│ ↓ ↓ ↓ │ +│ ┌──────────────┐ ┌─────────────────┐ ┌──────────┐ │ +│ │ Fine-tuned │ │ RAG Engine │ │ Vault │ │ +│ │ 7B Model │ │ (LanceDB) │ │ Indexer │ │ +│ │ │ │ │ │ │ │ +│ │ Quarterly │ │ • semantic │ │ • watch │ │ +│ │ retrain │ │ search │ │ • chunk │ │ +│ │ │ │ • hybrid │ │ • embed │ │ +│ │ │ │ filters │ │ │ │ +│ │ │ │ • relationship │ │ Daily │ │ +│ │ │ │ graph │ │ auto-sync│ │ +│ └──────────────┘ └─────────────────┘ └──────────┘ │ +│ │ +└─────────────────────────────────────────────────────────────┘ +``` + +## Quick Start + +### Prerequisites + +- Python 3.11+ +- Node.js 18+ (for UI) +- Ollama running locally +- RTX 5070 or equivalent (12GB+ VRAM for fine-tuning) + +### Installation + +```bash +# Clone and setup +cd kv-rag +pip install -e ".[dev]" + +# Install UI dependencies +cd ui && npm install && cd .. + +# Pull required Ollama models +ollama pull mxbai-embed-large +ollama pull llama3.1:8b +``` + +### Configuration + +Copy `config.json` and customize: + +```json +{ + "vault": { + "path": "/path/to/your/obsidian/vault" + }, + "companion": { + "name": "SAN" + } +} +``` + +See [docs/config.md](docs/config.md) for full configuration reference. 
+ +### Running + +**Terminal 1 - Backend:** +```bash +python -m uvicorn companion.api:app --host 0.0.0.0 --port 7373 +``` + +**Terminal 2 - Frontend:** +```bash +cd ui && npm run dev +``` + +**Terminal 3 - Indexer (optional):** +```bash +# One-time full index +python -m companion.indexer_daemon.cli index + +# Or continuous file watching +python -m companion.indexer_daemon.watcher +``` + +Open http://localhost:5173 + +## Usage + +### Chat Interface + +Type messages naturally. The companion will: +- Retrieve relevant context from your vault +- Reference past events, relationships, decisions +- Provide reflective, companion-style responses + +### Indexing Your Vault + +```bash +# Full reindex +python -m companion.indexer_daemon.cli index + +# Incremental sync +python -m companion.indexer_daemon.cli sync + +# Check status +python -m companion.indexer_daemon.cli status +``` + +### Fine-Tuning (Optional) + +Train a custom model that reasons like you: + +```bash +# Extract training examples from vault reflections +python -m companion.forge.cli extract + +# Train with QLoRA (4-6 hours on RTX 5070) +python -m companion.forge.cli train --epochs 3 + +# Reload the fine-tuned model +python -m companion.forge.cli reload ~/.companion/training/final +``` + +## Modules + +| Module | Purpose | Documentation | +|--------|---------|---------------| +| `companion.config` | Configuration management | [docs/config.md](docs/config.md) | +| `companion.rag` | RAG engine (chunk, embed, search) | [docs/rag.md](docs/rag.md) | +| `companion.forge` | Fine-tuning pipeline | [docs/forge.md](docs/forge.md) | +| `companion.api` | FastAPI backend | [docs/api.md](docs/api.md) | +| `ui/` | React frontend | [docs/ui.md](docs/ui.md) | + +## Project Structure + +``` +kv-rag/ +├── companion/ # Python backend +│ ├── __init__.py +│ ├── api.py # FastAPI app +│ ├── config.py # Configuration +│ ├── memory.py # Session memory (SQLite) +│ ├── orchestrator.py # Chat orchestration +│ ├── prompts.py # Prompt 
templates +│ ├── rag/ # RAG modules +│ │ ├── chunker.py +│ │ ├── embedder.py +│ │ ├── indexer.py +│ │ ├── search.py +│ │ └── vector_store.py +│ ├── forge/ # Fine-tuning +│ │ ├── extract.py +│ │ ├── train.py +│ │ ├── export.py +│ │ └── reload.py +│ └── indexer_daemon/ # File watching +│ ├── cli.py +│ └── watcher.py +├── ui/ # React frontend +│ ├── src/ +│ │ ├── App.tsx +│ │ ├── components/ +│ │ └── hooks/ +│ └── package.json +├── tests/ # Test suite +├── config.json # Configuration file +├── docs/ # Documentation +└── README.md +``` + +## Testing + +```bash +# Run all tests +pytest tests/ -v + +# Run specific module +pytest tests/test_chunker.py -v +``` + +## Privacy & Security + +- **Fully Local**: No data leaves your machine +- **Vault Data**: Never sent to external APIs for training +- **Config**: `local_only: true` blocks external API calls +- **Sensitive Tags**: Configurable patterns for health, finance, etc. + +## License + +MIT License - See LICENSE file + +## Acknowledgments + +- Built with [Unsloth](https://github.com/unslothai/unsloth) for efficient fine-tuning +- Uses [LanceDB](https://lancedb.github.io/) for vector storage +- UI inspired by [Obsidian](https://obsidian.md/) aesthetics diff --git a/config-schema.json b/config-schema.json new file mode 100644 index 0000000..508014a --- /dev/null +++ b/config-schema.json @@ -0,0 +1,667 @@ +{ + "$schema": "http://json-schema.org/draft-07/schema#", + "$id": "https://companion.ai/config-schema.json", + "title": "Companion AI Configuration", + "description": "Configuration schema for Personal Companion AI", + "type": "object", + "required": ["companion", "vault", "rag", "model", "api", "ui", "logging", "security"], + "properties": { + "companion": { + "type": "object", + "title": "Companion Settings", + "required": ["name", "persona", "memory", "chat"], + "properties": { + "name": { + "type": "string", + "description": "Display name for the companion", + "default": "SAN" + }, + "persona": { + "type": "object", + 
"required": ["role", "tone", "style", "boundaries"], + "properties": { + "role": { + "type": "string", + "description": "Role of the companion", + "enum": ["companion", "advisor", "reflector"], + "default": "companion" + }, + "tone": { + "type": "string", + "description": "Communication tone", + "enum": ["reflective", "supportive", "analytical", "mixed"], + "default": "reflective" + }, + "style": { + "type": "string", + "description": "Interaction style", + "enum": ["questioning", "supportive", "direct", "mixed"], + "default": "questioning" + }, + "boundaries": { + "type": "array", + "description": "Behavioral guardrails", + "items": { + "type": "string", + "enum": [ + "does_not_impersonate_user", + "no_future_predictions", + "no_medical_or_legal_advice" + ] + }, + "default": ["does_not_impersonate_user", "no_future_predictions", "no_medical_or_legal_advice"] + } + } + }, + "memory": { + "type": "object", + "required": ["session_turns", "persistent_store", "summarize_after"], + "properties": { + "session_turns": { + "type": "integer", + "description": "Messages to keep in context", + "minimum": 1, + "maximum": 100, + "default": 20 + }, + "persistent_store": { + "type": "string", + "description": "SQLite database path", + "default": "~/.companion/memory.db" + }, + "summarize_after": { + "type": "integer", + "description": "Summarize history after N turns", + "minimum": 5, + "maximum": 50, + "default": 10 + } + } + }, + "chat": { + "type": "object", + "required": ["streaming", "max_response_tokens", "default_temperature", "allow_temperature_override"], + "properties": { + "streaming": { + "type": "boolean", + "description": "Stream responses in real-time", + "default": true + }, + "max_response_tokens": { + "type": "integer", + "description": "Max tokens per response", + "minimum": 256, + "maximum": 8192, + "default": 2048 + }, + "default_temperature": { + "type": "number", + "description": "Creativity level (0.0=deterministic, 2.0=creative)", + "minimum": 0.0, + 
"maximum": 2.0, + "default": 0.7 + }, + "allow_temperature_override": { + "type": "boolean", + "description": "Let users adjust temperature", + "default": true + } + } + } + } + }, + "vault": { + "type": "object", + "title": "Vault Settings", + "required": ["path", "indexing", "chunking_rules"], + "properties": { + "path": { + "type": "string", + "description": "Absolute path to Obsidian vault root" + }, + "indexing": { + "type": "object", + "required": ["auto_sync", "auto_sync_interval_minutes", "watch_fs_events", "file_patterns", "deny_dirs", "deny_patterns"], + "properties": { + "auto_sync": { + "type": "boolean", + "description": "Enable automatic syncing", + "default": true + }, + "auto_sync_interval_minutes": { + "type": "integer", + "description": "Minutes between full syncs", + "minimum": 60, + "maximum": 10080, + "default": 1440 + }, + "watch_fs_events": { + "type": "boolean", + "description": "Watch for file system changes", + "default": true + }, + "file_patterns": { + "type": "array", + "description": "File patterns to index", + "items": { "type": "string" }, + "default": ["*.md"] + }, + "deny_dirs": { + "type": "array", + "description": "Directories to skip", + "items": { "type": "string" }, + "default": [".obsidian", ".trash", "zzz-Archive", ".git", ".logseq"] + }, + "deny_patterns": { + "type": "array", + "description": "File patterns to ignore", + "items": { "type": "string" }, + "default": ["*.tmp", "*.bak", "*conflict*", ".*"] + } + } + }, + "chunking_rules": { + "type": "object", + "description": "Per-directory chunking rules (key: glob pattern, value: rule)", + "additionalProperties": { + "type": "object", + "required": ["strategy", "chunk_size", "chunk_overlap"], + "properties": { + "strategy": { + "type": "string", + "enum": ["sliding_window", "section"], + "description": "Chunking strategy" + }, + "chunk_size": { + "type": "integer", + "description": "Target chunk size in words", + "minimum": 50, + "maximum": 2000 + }, + "chunk_overlap": { + 
"type": "integer", + "description": "Overlap between chunks in words", + "minimum": 0, + "maximum": 500 + }, + "section_tags": { + "type": "array", + "description": "Tags that mark sections (for section strategy)", + "items": { "type": "string" } + } + } + } + } + } + }, + "rag": { + "type": "object", + "title": "RAG Settings", + "required": ["embedding", "vector_store", "search"], + "properties": { + "embedding": { + "type": "object", + "required": ["provider", "model", "base_url", "dimensions", "batch_size"], + "properties": { + "provider": { + "type": "string", + "description": "Embedding service provider", + "enum": ["ollama"], + "default": "ollama" + }, + "model": { + "type": "string", + "description": "Model name for embeddings", + "enum": ["mxbai-embed-large", "nomic-embed-text", "all-minilm"], + "default": "mxbai-embed-large" + }, + "base_url": { + "type": "string", + "description": "Provider API endpoint", + "format": "uri", + "default": "http://localhost:11434" + }, + "dimensions": { + "type": "integer", + "description": "Embedding vector size", + "enum": [384, 768, 1024], + "default": 1024 + }, + "batch_size": { + "type": "integer", + "description": "Texts per embedding batch", + "minimum": 1, + "maximum": 256, + "default": 32 + } + } + }, + "vector_store": { + "type": "object", + "required": ["type", "path"], + "properties": { + "type": { + "type": "string", + "description": "Vector database type", + "enum": ["lancedb"], + "default": "lancedb" + }, + "path": { + "type": "string", + "description": "Storage path", + "default": "~/.companion/vectors.lance" + } + } + }, + "search": { + "type": "object", + "required": ["default_top_k", "max_top_k", "similarity_threshold", "hybrid_search", "filters"], + "properties": { + "default_top_k": { + "type": "integer", + "description": "Default results to retrieve", + "minimum": 1, + "maximum": 100, + "default": 8 + }, + "max_top_k": { + "type": "integer", + "description": "Maximum allowed results", + "minimum": 1, + 
"maximum": 100, + "default": 20 + }, + "similarity_threshold": { + "type": "number", + "description": "Minimum relevance score (0-1)", + "minimum": 0.0, + "maximum": 1.0, + "default": 0.75 + }, + "hybrid_search": { + "type": "object", + "required": ["enabled", "keyword_weight", "semantic_weight"], + "properties": { + "enabled": { + "type": "boolean", + "description": "Combine keyword + semantic search", + "default": true + }, + "keyword_weight": { + "type": "number", + "description": "Keyword search weight", + "minimum": 0.0, + "maximum": 1.0, + "default": 0.3 + }, + "semantic_weight": { + "type": "number", + "description": "Semantic search weight", + "minimum": 0.0, + "maximum": 1.0, + "default": 0.7 + } + } + }, + "filters": { + "type": "object", + "required": ["date_range_enabled", "tag_filter_enabled", "directory_filter_enabled"], + "properties": { + "date_range_enabled": { + "type": "boolean", + "description": "Enable date range filtering", + "default": true + }, + "tag_filter_enabled": { + "type": "boolean", + "description": "Enable tag filtering", + "default": true + }, + "directory_filter_enabled": { + "type": "boolean", + "description": "Enable directory filtering", + "default": true + } + } + } + } + } + } + }, + "model": { + "type": "object", + "title": "Model Settings", + "required": ["inference", "fine_tuning", "retrain_schedule"], + "properties": { + "inference": { + "type": "object", + "required": ["backend", "model_path", "context_length", "gpu_layers", "batch_size", "threads"], + "properties": { + "backend": { + "type": "string", + "description": "Inference engine", + "enum": ["llama.cpp", "vllm"], + "default": "llama.cpp" + }, + "model_path": { + "type": "string", + "description": "Path to GGUF or HF model", + "default": "~/.companion/models/companion-7b-q4.gguf" + }, + "context_length": { + "type": "integer", + "description": "Max context tokens", + "minimum": 2048, + "maximum": 32768, + "default": 8192 + }, + "gpu_layers": { + "type": "integer", 
+ "description": "Layers to offload to GPU (0 for CPU-only)", + "minimum": 0, + "maximum": 100, + "default": 35 + }, + "batch_size": { + "type": "integer", + "description": "Inference batch size", + "minimum": 1, + "maximum": 2048, + "default": 512 + }, + "threads": { + "type": "integer", + "description": "CPU threads for inference", + "minimum": 1, + "maximum": 64, + "default": 8 + } + } + }, + "fine_tuning": { + "type": "object", + "required": ["base_model", "output_dir", "lora_rank", "lora_alpha", "learning_rate", "batch_size", "gradient_accumulation_steps", "num_epochs", "warmup_steps", "save_steps", "eval_steps", "training_data_path", "validation_split"], + "properties": { + "base_model": { + "type": "string", + "description": "Base model for fine-tuning", + "default": "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit" + }, + "output_dir": { + "type": "string", + "description": "Training outputs directory", + "default": "~/.companion/training" + }, + "lora_rank": { + "type": "integer", + "description": "LoRA rank (higher = more capacity, more VRAM)", + "minimum": 4, + "maximum": 128, + "default": 16 + }, + "lora_alpha": { + "type": "integer", + "description": "LoRA alpha (scaling factor, typically 2x rank)", + "minimum": 8, + "maximum": 256, + "default": 32 + }, + "learning_rate": { + "type": "number", + "description": "Training learning rate", + "minimum": 1e-6, + "maximum": 1e-3, + "default": 0.0002 + }, + "batch_size": { + "type": "integer", + "description": "Per-device batch size", + "minimum": 1, + "maximum": 32, + "default": 4 + }, + "gradient_accumulation_steps": { + "type": "integer", + "description": "Steps to accumulate before update", + "minimum": 1, + "maximum": 64, + "default": 4 + }, + "num_epochs": { + "type": "integer", + "description": "Training epochs", + "minimum": 1, + "maximum": 20, + "default": 3 + }, + "warmup_steps": { + "type": "integer", + "description": "Learning rate warmup steps", + "minimum": 0, + "maximum": 10000, + "default": 100 + 
}, + "save_steps": { + "type": "integer", + "description": "Checkpoint frequency", + "minimum": 10, + "maximum": 10000, + "default": 500 + }, + "eval_steps": { + "type": "integer", + "description": "Evaluation frequency", + "minimum": 10, + "maximum": 10000, + "default": 250 + }, + "training_data_path": { + "type": "string", + "description": "Training data directory", + "default": "~/.companion/training_data/" + }, + "validation_split": { + "type": "number", + "description": "Fraction of data for validation", + "minimum": 0.0, + "maximum": 0.5, + "default": 0.1 + } + } + }, + "retrain_schedule": { + "type": "object", + "required": ["auto_reminder", "default_interval_days", "reminder_channels"], + "properties": { + "auto_reminder": { + "type": "boolean", + "description": "Enable retrain reminders", + "default": true + }, + "default_interval_days": { + "type": "integer", + "description": "Days between retrain reminders", + "minimum": 30, + "maximum": 365, + "default": 90 + }, + "reminder_channels": { + "type": "array", + "description": "Where to show reminders", + "items": { + "type": "string", + "enum": ["chat_stream", "log", "ui"] + }, + "default": ["chat_stream", "log"] + } + } + } + } + }, + "api": { + "type": "object", + "title": "API Settings", + "required": ["host", "port", "cors_origins", "auth"], + "properties": { + "host": { + "type": "string", + "description": "Bind address (use 0.0.0.0 for LAN access)", + "default": "127.0.0.1" + }, + "port": { + "type": "integer", + "description": "HTTP port", + "minimum": 1, + "maximum": 65535, + "default": 7373 + }, + "cors_origins": { + "type": "array", + "description": "Allowed CORS origins", + "items": { + "type": "string", + "format": "uri" + }, + "default": ["http://localhost:5173"] + }, + "auth": { + "type": "object", + "required": ["enabled"], + "properties": { + "enabled": { + "type": "boolean", + "description": "Enable API key authentication", + "default": false + } + } + } + } + }, + "ui": { + "type": 
"object", + "title": "UI Settings", + "required": ["web", "cli"], + "properties": { + "web": { + "type": "object", + "required": ["enabled", "theme", "features"], + "properties": { + "enabled": { + "type": "boolean", + "description": "Enable web interface", + "default": true + }, + "theme": { + "type": "string", + "description": "UI theme", + "enum": ["obsidian"], + "default": "obsidian" + }, + "features": { + "type": "object", + "required": ["streaming", "citations", "source_preview"], + "properties": { + "streaming": { + "type": "boolean", + "description": "Real-time response streaming", + "default": true + }, + "citations": { + "type": "boolean", + "description": "Show source citations", + "default": true + }, + "source_preview": { + "type": "boolean", + "description": "Preview source snippets", + "default": true + } + } + } + } + }, + "cli": { + "type": "object", + "required": ["enabled", "rich_output"], + "properties": { + "enabled": { + "type": "boolean", + "description": "Enable CLI interface", + "default": true + }, + "rich_output": { + "type": "boolean", + "description": "Rich terminal formatting", + "default": true + } + } + } + } + }, + "logging": { + "type": "object", + "title": "Logging Settings", + "required": ["level", "file", "max_size_mb", "backup_count"], + "properties": { + "level": { + "type": "string", + "description": "Log level", + "enum": ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"], + "default": "INFO" + }, + "file": { + "type": "string", + "description": "Log file path", + "default": "~/.companion/logs/companion.log" + }, + "max_size_mb": { + "type": "integer", + "description": "Max log file size in MB", + "minimum": 10, + "maximum": 1000, + "default": 100 + }, + "backup_count": { + "type": "integer", + "description": "Number of rotated backups", + "minimum": 1, + "maximum": 20, + "default": 5 + } + } + }, + "security": { + "type": "object", + "title": "Security Settings", + "required": ["local_only", "vault_path_traversal_check", 
"sensitive_content_detection", "sensitive_patterns", "require_confirmation_for_external_apis"], + "properties": { + "local_only": { + "type": "boolean", + "description": "Block external API calls", + "default": true + }, + "vault_path_traversal_check": { + "type": "boolean", + "description": "Prevent path traversal attacks", + "default": true + }, + "sensitive_content_detection": { + "type": "boolean", + "description": "Tag sensitive content", + "default": true + }, + "sensitive_patterns": { + "type": "array", + "description": "Tags considered sensitive", + "items": { "type": "string" }, + "default": ["#mentalhealth", "#physicalhealth", "#finance", "#Relations"] + }, + "require_confirmation_for_external_apis": { + "type": "boolean", + "description": "Confirm before external API calls", + "default": true + } + } + } + } +} diff --git a/docs/config.md b/docs/config.md new file mode 100644 index 0000000..628eb4a --- /dev/null +++ b/docs/config.md @@ -0,0 +1,278 @@ +# Configuration Reference + +Complete reference for `config.json` configuration options. + +## Overview + +The configuration file uses JSON format with support for: +- Path expansion (`~` expands to home directory) +- Type validation via Pydantic models +- Environment-specific overrides + +## Schema Validation + +Validate your config against the schema: + +```bash +python -c "from companion.config import load_config; load_config('config.json')" +``` + +Or use the JSON Schema directly: [config-schema.json](../config-schema.json) + +## Configuration Sections + +### companion + +Core companion personality and behavior settings. 
+ +```json +{ + "companion": { + "name": "SAN", + "persona": { + "role": "companion", + "tone": "reflective", + "style": "questioning", + "boundaries": [ + "does_not_impersonate_user", + "no_future_predictions", + "no_medical_or_legal_advice" + ] + }, + "memory": { + "session_turns": 20, + "persistent_store": "~/.companion/memory.db", + "summarize_after": 10 + }, + "chat": { + "streaming": true, + "max_response_tokens": 2048, + "default_temperature": 0.7, + "allow_temperature_override": true + } + } +} +``` + +#### Fields + +| Field | Type | Default | Description | +|-------|------|---------|-------------| +| `name` | string | "SAN" | Display name for the companion | +| `persona.role` | string | "companion" | Role description (companion/advisor/reflector) | +| `persona.tone` | string | "reflective" | Communication tone (reflective/supportive/analytical) | +| `persona.style` | string | "questioning" | Interaction style (questioning/supportive/direct) | +| `persona.boundaries` | string[] | [...] | Behavioral guardrails | +| `memory.session_turns` | int | 20 | Messages to keep in context | +| `memory.persistent_store` | string | "~/.companion/memory.db" | SQLite database path | +| `memory.summarize_after` | int | 10 | Summarize history after N turns | +| `chat.streaming` | bool | true | Stream responses in real-time | +| `chat.max_response_tokens` | int | 2048 | Max tokens per response | +| `chat.default_temperature` | float | 0.7 | Creativity (0.0=deterministic, 2.0=creative) | +| `chat.allow_temperature_override` | bool | true | Let users adjust temperature | + +--- + +### vault + +Obsidian vault indexing configuration. 
+ +```json +{ + "vault": { + "path": "~/KnowledgeVault/Default", + "indexing": { + "auto_sync": true, + "auto_sync_interval_minutes": 1440, + "watch_fs_events": true, + "file_patterns": ["*.md"], + "deny_dirs": [".obsidian", ".trash", "zzz-Archive", ".git"], + "deny_patterns": ["*.tmp", "*.bak", "*conflict*"] + }, + "chunking_rules": { + "default": { + "strategy": "sliding_window", + "chunk_size": 500, + "chunk_overlap": 100 + }, + "Journal/**": { + "strategy": "section", + "section_tags": ["#DayInShort", "#mentalhealth", "#work"], + "chunk_size": 300, + "chunk_overlap": 50 + } + } + } +} +``` + +--- + +### rag + +RAG (Retrieval-Augmented Generation) engine configuration. + +```json +{ + "rag": { + "embedding": { + "provider": "ollama", + "model": "mxbai-embed-large", + "base_url": "http://localhost:11434", + "dimensions": 1024, + "batch_size": 32 + }, + "vector_store": { + "type": "lancedb", + "path": "~/.companion/vectors.lance" + }, + "search": { + "default_top_k": 8, + "max_top_k": 20, + "similarity_threshold": 0.75, + "hybrid_search": { + "enabled": true, + "keyword_weight": 0.3, + "semantic_weight": 0.7 + }, + "filters": { + "date_range_enabled": true, + "tag_filter_enabled": true, + "directory_filter_enabled": true + } + } + } +} +``` + +--- + +### model + +LLM configuration for inference and fine-tuning. 
+ +```json +{ + "model": { + "inference": { + "backend": "llama.cpp", + "model_path": "~/.companion/models/companion-7b-q4.gguf", + "context_length": 8192, + "gpu_layers": 35, + "batch_size": 512, + "threads": 8 + }, + "fine_tuning": { + "base_model": "unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit", + "output_dir": "~/.companion/training", + "lora_rank": 16, + "lora_alpha": 32, + "learning_rate": 0.0002, + "batch_size": 4, + "gradient_accumulation_steps": 4, + "num_epochs": 3, + "warmup_steps": 100, + "save_steps": 500, + "eval_steps": 250, + "training_data_path": "~/.companion/training_data/", + "validation_split": 0.1 + }, + "retrain_schedule": { + "auto_reminder": true, + "default_interval_days": 90, + "reminder_channels": ["chat_stream", "log"] + } + } +} +``` + +--- + +### api + +FastAPI backend configuration. + +```json +{ + "api": { + "host": "127.0.0.1", + "port": 7373, + "cors_origins": ["http://localhost:5173"], + "auth": { + "enabled": false + } + } +} +``` + +--- + +### ui + +Web UI configuration. + +```json +{ + "ui": { + "web": { + "enabled": true, + "theme": "obsidian", + "features": { + "streaming": true, + "citations": true, + "source_preview": true + } + }, + "cli": { + "enabled": true, + "rich_output": true + } + } +} +``` + +--- + +### logging + +Logging configuration. + +```json +{ + "logging": { + "level": "INFO", + "file": "~/.companion/logs/companion.log", + "max_size_mb": 100, + "backup_count": 5 + } +} +``` + +--- + +### security + +Security and privacy settings. + +```json +{ + "security": { + "local_only": true, + "vault_path_traversal_check": true, + "sensitive_content_detection": true, + "sensitive_patterns": [ + "#mentalhealth", + "#physicalhealth", + "#finance", + "#Relations" + ], + "require_confirmation_for_external_apis": true + } +} +``` + +--- + +## Full Example + +See [config.json](../config.json) for a complete working configuration. 
diff --git a/docs/forge.md b/docs/forge.md new file mode 100644 index 0000000..e6fc04f --- /dev/null +++ b/docs/forge.md @@ -0,0 +1,288 @@ +# FORGE Module Documentation + +The FORGE module handles fine-tuning of the companion model. It extracts training examples from your vault reflections and trains a custom LoRA adapter using QLoRA on your local GPU. + +## Architecture + +``` +Vault Reflections + ↓ +┌─────────────────┐ +│ Extract │ - Scan for #reflection, #insight tags +│ (extract.py) │ - Parse reflection patterns +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Curate │ - Manual review (optional) +│ (curate.py) │ - Deduplication +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Train │ - QLoRA fine-tuning +│ (train.py) │ - Unsloth + transformers +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Export │ - Merge LoRA weights +│ (export.py) │ - Convert to GGUF +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Reload │ - Hot-swap in API +│ (reload.py) │ - No restart needed +└─────────────────┘ +``` + +## Requirements + +- **GPU**: RTX 5070 or equivalent (12GB+ VRAM) +- **Dependencies**: Install with `pip install -e ".[train]"` +- **Time**: 4-6 hours for full training run + +## Workflow + +### 1. Extract Training Data + +Scan your vault for reflection patterns: + +```bash +python -m companion.forge.cli extract +``` + +This scans for: +- Tags: `#reflection`, `#insight`, `#learning`, `#decision`, etc. +- Patterns: "I think", "I realize", "Looking back", "What if" +- Section headers in journal entries + +Output: `~/.companion/training_data/extracted.jsonl` + +**Example extracted data:** + +```json +{ + "messages": [ + {"role": "system", "content": "You are a thoughtful, reflective companion."}, + {"role": "user", "content": "I'm facing a decision. 
How should I think through this?"}, + {"role": "assistant", "content": "#reflection I think I need to slow down..."} + ], + "source_file": "Journal/2026/04/2026-04-12.md", + "tags": ["#reflection", "#DayInShort"], + "date": "2026-04-12" +} +``` + +### 2. Train Model + +Run QLoRA fine-tuning: + +```bash +python -m companion.forge.cli train --epochs 3 --lr 2e-4 +``` + +**Hyperparameters (from config):** + +| Parameter | Default | Description | +|-----------|---------|-------------| +| `lora_rank` | 16 | LoRA rank (8-64) | +| `lora_alpha` | 32 | LoRA scaling factor | +| `learning_rate` | 2e-4 | Optimizer learning rate | +| `num_epochs` | 3 | Training epochs | +| `batch_size` | 4 | Per-device batch | +| `gradient_accumulation_steps` | 4 | Steps before update | + +**Training Output:** +- Checkpoints: `~/.companion/training/checkpoint-*/` +- Final model: `~/.companion/training/final/` +- Logs: Training loss, eval metrics + +### 3. Reload Model + +Hot-swap without restarting API: + +```bash +python -m companion.forge.cli reload ~/.companion/training/final +``` + +Or via API: + +```bash +curl -X POST http://localhost:7373/admin/reload-model \ + -H "Content-Type: application/json" \ + -d '{"model_path": "~/.companion/training/final"}' +``` + +## Components + +### Extractor (`companion.forge.extract`) + +```python +from companion.forge.extract import TrainingDataExtractor, extract_training_data + +# Extract from vault +extractor = TrainingDataExtractor(config) +examples = extractor.extract() + +# Get statistics +stats = extractor.get_stats() +print(f"Extracted {stats['total']} examples") + +# Save to JSONL +extractor.save_to_jsonl(Path("training.jsonl")) +``` + +**Reflection Detection:** + +- **Tags**: `#reflection`, `#learning`, `#insight`, `#decision`, `#analysis`, `#takeaway`, `#realization` +- **Patterns**: "I think", "I feel", "I realize", "I wonder", "Looking back", "On one hand...", "Ultimately decided" + +### Trainer (`companion.forge.train`) + +```python +from 
companion.forge.train import train + +final_path = train( + data_path=Path("training.jsonl"), + output_dir=Path("~/.companion/training"), + base_model="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit", + lora_rank=16, + lora_alpha=32, + learning_rate=2e-4, + num_epochs=3, +) +``` + +**Base Models:** + +- `unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit` - Recommended +- `unsloth/llama-3-8b-bnb-4bit` - Alternative + +**Target Modules:** + +LoRA is applied to: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj` + +### Exporter (`companion.forge.export`) + +```python +from companion.forge.export import merge_only + +# Merge LoRA into base model +merged_path = merge_only( + checkpoint_path=Path("~/.companion/training/checkpoint-500"), + output_path=Path("~/.companion/models/merged"), +) +``` + +### Reloader (`companion.forge.reload`) + +```python +from companion.forge.reload import reload_model, get_model_status + +# Check current model +status = get_model_status(config) +print(f"Model size: {status['size_mb']} MB") + +# Reload with new model +new_path = reload_model( + config=config, + new_model_path=Path("~/.companion/training/final"), + backup=True, +) +``` + +## CLI Reference + +```bash +# Extract training data +companion.forge.cli extract [--output PATH] + +# Train model +companion.forge.cli train \ + [--data PATH] \ + [--output PATH] \ + [--epochs N] \ + [--lr FLOAT] + +# Check model status +companion.forge.cli status + +# Reload model +companion.forge.cli reload MODEL_PATH [--no-backup] +``` + +## Training Tips + +**Dataset Size:** +- Minimum: 50 examples +- Optimal: 100-500 examples +- More is not always better - quality over quantity + +**Epochs:** +- Start with 3 epochs +- Increase if underfitting (high loss) +- Decrease if overfitting (loss increases on eval) + +**LoRA Rank:** +- `8` - Quick experiments +- `16` - Balanced (recommended) +- `32-64` - High capacity, more VRAM + +**Overfitting Signs:** +- Training loss decreasing, eval loss 
increasing +- Model repeats exact phrases from training data +- Responses feel "memorized" not "learned" + +## VRAM Usage (RTX 5070, 12GB) + +| Config | VRAM | Batch Size | +|--------|------|------------| +| Rank 16, 8-bit adam | ~10GB | 4 | +| Rank 32, 8-bit adam | ~11GB | 4 | +| Rank 64, 8-bit adam | OOM | - | + +Use `gradient_accumulation_steps` to increase effective batch size. + +## Troubleshooting + +**CUDA Out of Memory** +- Reduce `lora_rank` to 8 +- Reduce `batch_size` to 2 +- Increase `gradient_accumulation_steps` + +**Training Loss Not Decreasing** +- Check data quality (reflections present?) +- Increase learning rate to 5e-4 +- Check for data formatting issues + +**Model Not Loading After Reload** +- Check path exists: `ls -la ~/.companion/models/` +- Verify model format (GGUF vs HF) +- Check API logs for errors + +**Slow Training** +- Expected: ~6 hours for 3 epochs on RTX 5070 +- Enable gradient checkpointing (enabled by default) +- Close other GPU applications + +## Advanced: Custom Training Script + +```python +# custom_train.py +from companion.forge.train import train +from companion.config import load_config + +config = load_config() + +final_path = train( + data_path=config.model.fine_tuning.training_data_path / "curated.jsonl", + output_dir=config.model.fine_tuning.output_dir, + base_model=config.model.fine_tuning.base_model, + lora_rank=32, # Higher capacity + lora_alpha=64, + learning_rate=3e-4, # Slightly higher + num_epochs=5, # More epochs + batch_size=2, # Smaller batches + gradient_accumulation_steps=8, # Effective batch = 16 +) + +print(f"Model saved to: {final_path}") +``` diff --git a/docs/rag.md b/docs/rag.md new file mode 100644 index 0000000..f76a70d --- /dev/null +++ b/docs/rag.md @@ -0,0 +1,269 @@ +# RAG Module Documentation + +The RAG (Retrieval-Augmented Generation) module provides semantic search over your Obsidian vault. It handles document chunking, embedding generation, and vector similarity search. 
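To make the chunking stage concrete, here is a minimal word-based sliding-window splitter. This is a simplified sketch, not the actual `companion.rag.chunker` implementation (which also extracts tags, dates, and sections); the function name is illustrative.

```python
def sliding_window_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 100) -> list[str]:
    """Split text into chunks of chunk_size words, each overlapping the previous by chunk_overlap words."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already covers the tail of the document
    return chunks
```

With the defaults, consecutive chunks share 100 words of context, which helps retrieval when a relevant passage straddles a chunk boundary.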
+ +## Architecture + +``` +Vault Markdown Files + ↓ +┌─────────────────┐ +│ Chunker │ - Split by strategy (sliding window / section) +│ (chunker.py) │ - Extract metadata (tags, dates, sections) +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Embedder │ - HTTP client for Ollama API +│ (embedder.py) │ - Batch processing with retries +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Vector Store │ - LanceDB persistence +│(vector_store.py)│ - Upsert, delete, search +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ Indexer │ - Full/incremental sync +│ (indexer.py) │ - File watching +└─────────────────┘ +``` + +## Components + +### Chunker (`companion.rag.chunker`) + +Splits markdown files into searchable chunks. + +```python +from companion.rag.chunker import chunk_file, ChunkingRule + +rules = { + "default": ChunkingRule(strategy="sliding_window", chunk_size=500, chunk_overlap=100), + "Journal/**": ChunkingRule(strategy="section", section_tags=["#DayInShort"], chunk_size=300, chunk_overlap=50), +} + +chunks = chunk_file( + file_path=Path("journal/2026-04-12.md"), + vault_root=Path("~/vault"), + rules=rules, + modified_at=1234567890.0, +) + +for chunk in chunks: + print(f"{chunk.source_file}:{chunk.chunk_index}") + print(f"Text: {chunk.text[:100]}...") + print(f"Tags: {chunk.tags}") + print(f"Date: {chunk.date}") +``` + +#### Chunking Strategies + +**Sliding Window** +- Fixed-size chunks with overlap +- Best for: Longform text, articles + +```python +ChunkingRule( + strategy="sliding_window", + chunk_size=500, # words per chunk + chunk_overlap=100, # words overlap between chunks +) +``` + +**Section-Based** +- Split on section headers (tags) +- Best for: Structured journals, daily notes + +```python +ChunkingRule( + strategy="section", + section_tags=["#DayInShort", "#mentalhealth", "#work"], + chunk_size=300, + chunk_overlap=50, +) +``` + +#### Metadata Extraction + +Each chunk includes: +- `source_file` - Relative path from vault root +- `source_directory` - Top-level 
directory +- `section` - Section header (for section strategy) +- `date` - Parsed from filename +- `tags` - Hashtags and wikilinks +- `chunk_index` - Position in document +- `modified_at` - File mtime for sync + +### Embedder (`companion.rag.embedder`) + +Generates embeddings via Ollama API. + +```python +from companion.rag.embedder import OllamaEmbedder + +embedder = OllamaEmbedder( + base_url="http://localhost:11434", + model="mxbai-embed-large", + batch_size=32, +) + +# Single embedding +embeddings = embedder.embed(["Hello world"]) +print(len(embeddings[0])) # 1024 dimensions + +# Batch embedding (with automatic batching) +texts = ["text 1", "text 2", "text 3", ...] # 100 texts +embeddings = embedder.embed(texts) # Automatically batches +``` + +#### Features + +- **Batching**: Automatically splits large requests +- **Retries**: Exponential backoff on failures +- **Context Manager**: Proper resource cleanup + +```python +with OllamaEmbedder(...) as embedder: + embeddings = embedder.embed(texts) +``` + +### Vector Store (`companion.rag.vector_store`) + +LanceDB wrapper for vector storage. 
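Conceptually, a vector search scores every stored embedding against the query embedding and returns the closest matches. The brute-force sketch below illustrates that idea; LanceDB does the same thing at scale with indexed search, and the helper names here are purely illustrative.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def brute_force_search(query: list[float], stored: dict[str, list[float]], top_k: int = 2) -> list[tuple[str, float]]:
    """Rank stored vectors by similarity to the query; return the top_k (chunk_id, score) pairs."""
    scored = [(chunk_id, cosine_similarity(query, vec)) for chunk_id, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]
```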
+ +```python +from companion.rag.vector_store import VectorStore + +store = VectorStore( + uri="~/.companion/vectors.lance", + dimensions=1024, +) + +# Upsert chunks +store.upsert( + ids=["file.md::0", "file.md::1"], + texts=["chunk 1", "chunk 2"], + embeddings=[[0.1, ...], [0.2, ...]], + metadatas=[ + {"source_file": "file.md", "source_directory": "docs"}, + {"source_file": "file.md", "source_directory": "docs"}, + ], +) + +# Search +results = store.search( + query_vector=[0.1, ...], + top_k=8, + filters={"source_directory": "Journal"}, +) +``` + +#### Schema + +| Field | Type | Nullable | +|-------|------|----------| +| id | string | No | +| text | string | No | +| vector | list[float32] | No | +| source_file | string | No | +| source_directory | string | No | +| section | string | Yes | +| date | string | Yes | +| tags | list[string] | Yes | +| chunk_index | int32 | No | +| total_chunks | int32 | No | +| modified_at | float64 | Yes | +| rule_applied | string | No | + +### Indexer (`companion.rag.indexer`) + +Orchestrates vault indexing. + +```python +from companion.config import load_config +from companion.rag.indexer import Indexer +from companion.rag.vector_store import VectorStore + +config = load_config() +store = VectorStore( + uri=config.rag.vector_store.path, + dimensions=config.rag.embedding.dimensions, +) + +indexer = Indexer(config, store) + +# Full reindex (clear + rebuild) +indexer.full_index() + +# Incremental sync (only changed files) +indexer.sync() + +# Get status +status = indexer.status() +print(f"Total chunks: {status['total_chunks']}") +print(f"Unindexed files: {status['unindexed_files']}") +``` + +### Search (`companion.rag.search`) + +High-level search interface. 
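The usage example computes relevance as `1 - _distance`, and the engine takes a `similarity_threshold` parameter, which presumably drops weak matches below that score. A sketch of that filtering step, assuming results shaped like the raw search output (the helper name is illustrative, not the actual implementation):

```python
def filter_by_threshold(results: list[dict], similarity_threshold: float = 0.75) -> list[dict]:
    """Keep only results whose relevance (1 - _distance) meets the threshold."""
    kept = []
    for result in results:
        relevance = 1.0 - result["_distance"]
        if relevance >= similarity_threshold:
            # Annotate the kept result with its computed relevance score
            kept.append({**result, "relevance": relevance})
    return kept
```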
+ +```python +from companion.rag.search import SearchEngine + +engine = SearchEngine( + vector_store=store, + embedder_base_url="http://localhost:11434", + embedder_model="mxbai-embed-large", + default_top_k=8, + similarity_threshold=0.75, + hybrid_search_enabled=False, +) + +results = engine.search( + query="What did I learn about friendships?", + top_k=8, + filters={"source_directory": "Journal"}, +) + +for result in results: + print(f"Source: {result['source_file']}") + print(f"Relevance: {1 - result['_distance']:.2f}") +``` + +## CLI Commands + +```bash +# Full index +python -m companion.indexer_daemon.cli index + +# Incremental sync +python -m companion.indexer_daemon.cli sync + +# Check status +python -m companion.indexer_daemon.cli status + +# Reindex (same as index) +python -m companion.indexer_daemon.cli reindex +``` + +## Performance Tips + +1. **Chunk Size**: Smaller chunks = better retrieval, larger = more context +2. **Batch Size**: 32 is optimal for Ollama embeddings +3. **Filters**: Use directory filters to narrow search scope +4. **Sync vs Index**: Use `sync` for daily updates, `index` for full rebuilds + +## Troubleshooting + +**Slow indexing** +- Check Ollama is running: `ollama ps` +- Reduce batch size if OOM + +**No results** +- Verify vault path in config +- Check `indexer.status()` for unindexed files + +**Duplicate chunks** +- Each chunk ID is `{source_file}::{chunk_index}` +- Use `full_index()` to clear and rebuild diff --git a/docs/ui.md b/docs/ui.md new file mode 100644 index 0000000..3f23384 --- /dev/null +++ b/docs/ui.md @@ -0,0 +1,408 @@ +# UI Module Documentation + +The UI is a React + Vite frontend for the companion chat interface. It provides real-time streaming chat with a clean, Obsidian-inspired dark theme. 
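The streaming side rides on the SSE protocol detailed below. As a minimal sketch, a single `data:` line from the stream can be decoded like this (a hypothetical helper for illustration; the real logic lives in `useChatStream.ts`):

```typescript
// One event from the chat stream, mirroring the documented SSE payloads.
type StreamEvent =
  | { type: 'chunk'; content: string }
  | { type: 'sources'; sources: Array<{ file: string; section?: string; date?: string }> }
  | { type: 'done'; session_id: string }

function parseSSELine(line: string): StreamEvent | null {
  // SSE data lines look like: data: {"type": "chunk", "content": "Hi"}
  // Anything else (blank lines, comments) is ignored.
  if (!line.startsWith('data: ')) return null
  return JSON.parse(line.slice('data: '.length)) as StreamEvent
}
```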
+ +## Architecture + +``` +HTTP/SSE + ↓ +┌─────────────────┐ +│ App.tsx │ - State management +│ Message state │ - User/assistant messages +└────────┬────────┘ + ↓ +┌─────────────────┐ +│ MessageList │ - Render messages +│ (components/) │ - User/assistant styling +└─────────────────┘ +┌─────────────────┐ +│ ChatInput │ - Textarea + send +│ (components/) │ - Auto-resize, hotkeys +└─────────────────┘ + ↓ +┌─────────────────┐ +│ useChatStream │ - SSE streaming +│ (hooks/) │ - Session management +└─────────────────┘ +``` + +## Project Structure + +``` +ui/ +├── src/ +│ ├── main.tsx # React entry point +│ ├── App.tsx # Main app component +│ ├── App.css # App layout styles +│ ├── index.css # Global styles +│ ├── components/ +│ │ ├── MessageList.tsx # Message display +│ │ ├── MessageList.css # Message styling +│ │ ├── ChatInput.tsx # Input textarea +│ │ └── ChatInput.css # Input styling +│ └── hooks/ +│ └── useChatStream.ts # SSE streaming hook +├── index.html # HTML template +├── vite.config.ts # Vite configuration +├── tsconfig.json # TypeScript config +└── package.json # Dependencies +``` + +## Components + +### App.tsx + +Main application state management: + +```typescript +interface Message { + role: 'user' | 'assistant' + content: string +} + +// State +const [messages, setMessages] = useState([]) +const [input, setInput] = useState('') +const [isLoading, setIsLoading] = useState(false) + +// Handlers +const handleSend = async () => { /* ... 
*/ } +const handleKeyDown = (e) => { /* Enter to send, Shift+Enter newline */ } +``` + +**Features:** +- Auto-scroll to bottom on new messages +- Keyboard shortcuts (Enter to send, Shift+Enter for newline) +- Loading state with animation +- Message streaming in real-time + +### MessageList.tsx + +Renders the chat history: + +```typescript +interface MessageListProps { + messages: Message[] + isLoading: boolean +} +``` + +**Layout:** +- User messages: Right-aligned, blue background +- Assistant messages: Left-aligned, gray background with border +- Loading indicator: Three animated dots +- Empty state: Prompt text when no messages + +**Styling:** +- Max-width 800px, centered +- Smooth scroll behavior +- Avatar-less design (clean, text-focused) + +### ChatInput.tsx + +Textarea input with send button: + +```typescript +interface ChatInputProps { + value: string + onChange: (value: string) => void + onSend: () => void + onKeyDown: (e: KeyboardEvent) => void + disabled: boolean +} +``` + +**Features:** +- Auto-resizing textarea +- Send button with loading state +- Placeholder text +- Disabled during streaming + +## Hooks + +### useChatStream.ts + +Manages SSE streaming connection: + +```typescript +interface UseChatStreamReturn { + sendMessage: ( + message: string, + onChunk: (chunk: string) => void + ) => Promise + sessionId: string | null +} + +const { sendMessage, sessionId } = useChatStream() +``` + +**Usage:** + +```typescript +await sendMessage("Hello", (chunk) => { + // Append chunk to current response + setMessages(prev => { + const last = prev[prev.length - 1] + if (last?.role === 'assistant') { + last.content += chunk + return [...prev] + } + return [...prev, { role: 'assistant', content: chunk }] + }) +}) +``` + +**SSE Protocol:** + +The API streams events in this format: + +``` +data: {"type": "chunk", "content": "Hello"} + +data: {"type": "chunk", "content": " world"} + +data: {"type": "sources", "sources": [{"file": "journal.md"}]} + +data: {"type": 
"done", "session_id": "uuid"} +``` + +## Styling + +### Design System + +Based on Obsidian's dark theme: + +```css +:root { + --bg-primary: #0d1117; /* App background */ + --bg-secondary: #161b22; /* Header/footer */ + --bg-tertiary: #21262d; /* Input background */ + + --text-primary: #c9d1d9; /* Main text */ + --text-secondary: #8b949e; /* Placeholder */ + + --accent-primary: #58a6ff; /* Primary blue */ + --accent-secondary: #79c0ff;/* Lighter blue */ + + --border: #30363d; /* Borders */ + --user-bg: #1f6feb; /* User message */ + --assistant-bg: #21262d; /* Assistant message */ +} +``` + +### Message Styling + +**User Message:** +- Blue background (`--user-bg`) +- White text +- Border radius: 12px (12px 12px 4px 12px) +- Max-width: 80% + +**Assistant Message:** +- Gray background (`--assistant-bg`) +- Light text (`--text-primary`) +- Border: 1px solid `--border` +- Border radius: 12px (12px 12px 12px 4px) + +### Loading Animation + +Three bouncing dots using CSS keyframes: + +```css +@keyframes bounce { + 0%, 80%, 100% { transform: scale(0.6); } + 40% { transform: scale(1); } +} +``` + +## Development + +### Setup + +```bash +cd ui +npm install +``` + +### Dev Server + +```bash +npm run dev +# Opens http://localhost:5173 +``` + +### Build + +```bash +npm run build +# Output: ui/dist/ +``` + +### Preview Production Build + +```bash +npm run preview +``` + +## Configuration + +### Vite Config + +`vite.config.ts`: + +```typescript +export default defineConfig({ + plugins: [react()], + server: { + port: 5173, + proxy: { + '/api': { + target: 'http://localhost:7373', + changeOrigin: true, + }, + }, + }, +}) +``` + +**Proxy Setup:** +- Frontend: `http://localhost:5173` +- API: `http://localhost:7373` +- `/api/*` → `http://localhost:7373/api/*` + +This allows using relative API paths in the code: + +```typescript +const API_BASE = '/api' // Not http://localhost:7373/api +``` + +## TypeScript + +### Types + +```typescript +// Message role +type Role = 'user' | 'assistant' 
+
+// Message object
+interface Message {
+  role: Role
+  content: string
+}
+
+// Chat request
+type ChatRequest = {
+  message: string
+  session_id?: string
+  temperature?: number
+  stream?: boolean
+}
+
+// SSE chunk
+type ChunkEvent = {
+  type: 'chunk'
+  content: string
+}
+
+type SourcesEvent = {
+  type: 'sources'
+  sources: Array<{
+    file: string
+    section?: string
+    date?: string
+  }>
+}
+
+type DoneEvent = {
+  type: 'done'
+  session_id: string
+}
+```
+
+## API Integration
+
+### Chat Endpoint
+
+```typescript
+const response = await fetch('/api/chat', {
+  method: 'POST',
+  headers: { 'Content-Type': 'application/json' },
+  body: JSON.stringify({
+    message: userInput,
+    session_id: sessionId, // null for new session
+    stream: true,
+  }),
+})
+
+// Read SSE stream
+const reader = response.body?.getReader()
+if (!reader) throw new Error('Response has no body')
+const decoder = new TextDecoder()
+
+while (true) {
+  const { done, value } = await reader.read()
+  if (done) break
+
+  const chunk = decoder.decode(value, { stream: true })
+  // Parse SSE lines
+}
+```
+
+### Session Persistence
+
+The backend maintains conversation history via `session_id`:
+
+1. First message: `session_id: null` → backend creates a UUID
+2. Response: the `X-Session-ID` header carries that UUID
+3. Subsequent messages: include the returned `session_id`
+4. Conversation history is retrieved automatically
+
+## Customization
+
+### Themes
+
+Modify `App.css` and `index.css`:
+
+```css
+/* Custom accent color */
+--accent-primary: #ff6b6b;
+--user-bg: #ff6b6b;
+```
+
+### Fonts
+
+Update `index.css`:
+
+```css
+body {
+  font-family: 'Inter', -apple-system, sans-serif;
+}
+```
+
+### Message Layout
+
+Modify `MessageList.css`:
+
+```css
+.message-content {
+  max-width: 90%; /* Wider messages */
+  font-size: 16px; /* Larger text */
+}
+```
+
+## Troubleshooting
+
+**CORS errors**
+- Check `vite.config.ts` proxy configuration
+- Verify backend CORS origins include `http://localhost:5173`
+
+**Stream not updating**
+- Check the browser network tab for SSE events
+- Verify the backend returns an `EventSourceResponse`
+
+**Messages not appearing**
+- Check React DevTools for state updates
+- Verify the `messages` state is updated for each streamed chunk
+
+**Build fails**
+- Check for TypeScript errors: `npx tsc --noEmit`
+- Update dependencies: `npm update`