From 1d0ea4f2cf957e7af8b3c998921d5c1a472be094 Mon Sep 17 00:00:00 2001
From: Santhosh Janardhanan <santhoshj@gmail.com>
Date: Mon, 13 Apr 2026 17:23:58 -0400
Subject: [PATCH] docs: update README and forge documentation

- README: fix backend start command, link the GPU compatibility guide
- forge.md: invoke CLI via `python -m`, fix train flag (--output-dir)
- forge.md: add GPU troubleshooting for RTX 50-series
---
 README.md     |  6 ++++--
 docs/forge.md | 23 +++++++++++++++--------
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/README.md b/README.md
index 83bee53..2acd7cb 100644
--- a/README.md
+++ b/README.md
@@ -39,6 +39,7 @@ A fully local, privacy-first AI companion trained on your Obsidian vault. Combin
 - Node.js 18+ (for UI)
 - Ollama running locally
 - RTX 5070 or equivalent (12GB+ VRAM for fine-tuning)
+- See [GPU Compatibility Guide](docs/gpu-compatibility.md) for RTX 50-series setup
 
 ### Installation
 
@@ -76,7 +77,7 @@ See [docs/config.md](docs/config.md) for full configuration reference.
 
 **Terminal 1 - Backend:**
 ```bash
-python -m uvicorn companion.api:app --host 0.0.0.0 --port 7373
+python -m companion.api
 ```
 
 **Terminal 2 - Frontend:**
@@ -139,8 +140,9 @@ python -m companion.forge.cli reload ~/.companion/training/final
 | `companion.config` | Configuration management | [docs/config.md](docs/config.md) |
 | `companion.rag` | RAG engine (chunk, embed, search) | [docs/rag.md](docs/rag.md) |
 | `companion.forge` | Fine-tuning pipeline | [docs/forge.md](docs/forge.md) |
-| `companion.api` | FastAPI backend | [docs/api.md](docs/api.md) |
+| `companion.api` | FastAPI backend | This README |
 | `ui/` | React frontend | [docs/ui.md](docs/ui.md) |
+| **GPU Setup** | RTX 50-series compatibility | [docs/gpu-compatibility.md](docs/gpu-compatibility.md) |
 
 ## Project Structure
 
diff --git a/docs/forge.md b/docs/forge.md
index e6fc04f..b4da61d 100644
--- a/docs/forge.md
+++ b/docs/forge.md
@@ -193,22 +193,24 @@ new_path = reload_model(
 
 ```bash
 # Extract training data
-companion.forge.cli extract [--output PATH]
+python -m companion.forge.cli extract [--output PATH]
 
 # Train model
-companion.forge.cli train \
-  [--data PATH] \
-  [--output PATH] \
-  [--epochs N] \
-  [--lr FLOAT]
+python -m companion.forge.train \
+  --data PATH \
+  --output-dir PATH \
+  --epochs N \
+  --lr FLOAT
 
 # Check model status
-companion.forge.cli status
+python -m companion.forge.cli status
 
 # Reload model
-companion.forge.cli reload MODEL_PATH [--no-backup]
+python -m companion.forge.cli reload MODEL_PATH [--no-backup]
 ```
 
+**Note:** Use `--output-dir` (or `--output`) to specify the training output directory.
+
 ## Training Tips
 
 **Dataset Size:**
@@ -243,6 +245,11 @@ Use `gradient_accumulation_steps` to increase effective batch size.
 
 ## Troubleshooting
 
+**GPU Not Detected / CUDA Not Available**
+- See [GPU Compatibility Guide](gpu-compatibility.md)
+- Common issue on RTX 50-series: the installed PyTorch build does not support the GPU. Install a CUDA 12.8 build (PyTorch 2.7 or later): `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128`
+- Verify: `python -c "import torch; print(torch.cuda.is_available())"`
+
 **CUDA Out of Memory**
 - Reduce `lora_rank` to 8
 - Reduce `batch_size` to 2