# FORGE Module Documentation

The FORGE module handles fine-tuning of the companion model. It extracts training examples from your vault reflections and trains a custom LoRA adapter using QLoRA on your local GPU.

## Architecture

```
Vault Reflections
        ↓
┌─────────────────┐
│     Extract     │   - Scan for #reflection, #insight tags
│  (extract.py)   │   - Parse reflection patterns
└────────┬────────┘
         ↓
┌─────────────────┐
│     Curate      │   - Manual review (optional)
│   (curate.py)   │   - Deduplication
└────────┬────────┘
         ↓
┌─────────────────┐
│      Train      │   - QLoRA fine-tuning
│   (train.py)    │   - Unsloth + transformers
└────────┬────────┘
         ↓
┌─────────────────┐
│     Export      │   - Merge LoRA weights
│   (export.py)   │   - Convert to GGUF
└────────┬────────┘
         ↓
┌─────────────────┐
│     Reload      │   - Hot-swap in API
│   (reload.py)   │   - No restart needed
└─────────────────┘
```

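The stages above map one-to-one onto CLI subcommands, so a full run can be scripted. A minimal sketch, assuming only the documented `python -m companion.forge.cli <subcommand>` invocation (the runner itself is hypothetical, not part of the module):

```python
import subprocess
import sys

# Each pipeline stage is one documented CLI subcommand plus its arguments.
STAGES = [
    ["extract"],
    ["train", "--epochs", "3", "--lr", "2e-4"],
]

def build_commands(stages=STAGES):
    """Build the full command line for each pipeline stage."""
    return [[sys.executable, "-m", "companion.forge.cli", *stage] for stage in stages]

def run_pipeline(stages=STAGES):
    """Run the stages in order, stopping at the first failure."""
    for cmd in build_commands(stages):
        subprocess.run(cmd, check=True)
```

`check=True` makes a failed stage raise `CalledProcessError`, so a broken extract step never feeds bad data into training.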
## Requirements

- **GPU**: RTX 5070 or equivalent (12GB+ VRAM)
- **Dependencies**: Install with `pip install -e ".[train]"`
- **Time**: 4-6 hours for a full training run

## Workflow

### 1. Extract Training Data

Scan your vault for reflection patterns:

```bash
python -m companion.forge.cli extract
```

This scans for:

- Tags: `#reflection`, `#insight`, `#learning`, `#decision`, etc.
- Patterns: "I think", "I realize", "Looking back", "What if"
- Section headers in journal entries

Output: `~/.companion/training_data/extracted.jsonl`

**Example extracted data:**

```json
{
  "messages": [
    {"role": "system", "content": "You are a thoughtful, reflective companion."},
    {"role": "user", "content": "I'm facing a decision. How should I think through this?"},
    {"role": "assistant", "content": "#reflection I think I need to slow down..."}
  ],
  "source_file": "Journal/2026/04/2026-04-12.md",
  "tags": ["#reflection", "#DayInShort"],
  "date": "2026-04-12"
}
```
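
Each line of the output file is one standalone JSON object in the shape above, so the dataset can be inspected with a few lines of stdlib Python. A sketch (the `load_examples` helper is illustrative, not part of the module):

```python
import json
from pathlib import Path

def load_examples(path):
    """Read one chat-format training example per line from a JSONL file."""
    examples = []
    with Path(path).expanduser().open(encoding="utf-8") as f:
        for line in f:
            if line.strip():  # tolerate blank lines
                examples.append(json.loads(line))
    return examples

# Example: count examples and peek at the roles in the first one.
# examples = load_examples("~/.companion/training_data/extracted.jsonl")
# roles = [m["role"] for m in examples[0]["messages"]]
```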

### 2. Train Model

Run QLoRA fine-tuning:

```bash
python -m companion.forge.cli train --epochs 3 --lr 2e-4
```

**Hyperparameters (from config):**

| Parameter | Default | Description |
|-----------|---------|-------------|
| `lora_rank` | 16 | LoRA rank (8-64) |
| `lora_alpha` | 32 | LoRA scaling factor |
| `learning_rate` | 2e-4 | Optimizer learning rate |
| `num_epochs` | 3 | Training epochs |
| `batch_size` | 4 | Per-device batch size |
| `gradient_accumulation_steps` | 4 | Steps before each optimizer update |

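With the defaults above, gradients are accumulated over 4 steps before each optimizer update, so the effective batch size is `batch_size * gradient_accumulation_steps`. A quick sanity check:

```python
def effective_batch_size(batch_size: int, gradient_accumulation_steps: int) -> int:
    """Number of examples contributing to each optimizer update."""
    return batch_size * gradient_accumulation_steps

# Defaults from the table: 4 * 4 = 16 examples per update.
```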

**Training Output:**
- Checkpoints: `~/.companion/training/checkpoint-*/`
- Final model: `~/.companion/training/final/`
- Logs: training loss, eval metrics

### 3. Reload Model

Hot-swap the model without restarting the API:

```bash
python -m companion.forge.cli reload ~/.companion/training/final
```

Or via the API:

```bash
curl -X POST http://localhost:7373/admin/reload-model \
  -H "Content-Type: application/json" \
  -d '{"model_path": "~/.companion/training/final"}'
```

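The same endpoint can be called from Python. One caveat worth handling: `~` inside a JSON payload may not be expanded server-side, so expand it client-side first. A stdlib sketch (the endpoint and port are the ones shown above; the helper itself is hypothetical):

```python
import json
from pathlib import Path
from urllib import request

def build_reload_request(model_path, base_url="http://localhost:7373"):
    """Build a POST request for /admin/reload-model with ~ expanded client-side."""
    payload = {"model_path": str(Path(model_path).expanduser())}
    return request.Request(
        f"{base_url}/admin/reload-model",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# To send it (requires the API to be running):
# with request.urlopen(build_reload_request("~/.companion/training/final")) as resp:
#     print(resp.status)
```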
## Components

### Extractor (`companion.forge.extract`)

```python
from pathlib import Path

from companion.forge.extract import TrainingDataExtractor, extract_training_data

# Extract from vault
extractor = TrainingDataExtractor(config)
examples = extractor.extract()

# Get statistics
stats = extractor.get_stats()
print(f"Extracted {stats['total']} examples")

# Save to JSONL
extractor.save_to_jsonl(Path("training.jsonl"))
```

**Reflection Detection:**

- **Tags**: `#reflection`, `#learning`, `#insight`, `#decision`, `#analysis`, `#takeaway`, `#realization`
- **Patterns**: "I think", "I feel", "I realize", "I wonder", "Looking back", "On one hand...", "Ultimately decided"

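The detection rules above amount to tag lookup plus phrase matching. A simplified, self-contained sketch of the idea (the real extractor's logic may differ):

```python
import re

REFLECTION_TAGS = {"#reflection", "#learning", "#insight", "#decision",
                   "#analysis", "#takeaway", "#realization"}
REFLECTION_PATTERNS = [r"\bI think\b", r"\bI feel\b", r"\bI realize\b",
                       r"\bI wonder\b", r"\bLooking back\b",
                       r"\bOn one hand\b", r"\bUltimately decided\b"]
_PATTERN_RE = re.compile("|".join(REFLECTION_PATTERNS), re.IGNORECASE)

def looks_reflective(text: str) -> bool:
    """True if the text carries a reflection tag or a reflective phrase."""
    tags = set(re.findall(r"#\w+", text))
    return bool(tags & REFLECTION_TAGS) or bool(_PATTERN_RE.search(text))
```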
### Trainer (`companion.forge.train`)

```python
from pathlib import Path

from companion.forge.train import train

final_path = train(
    data_path=Path("training.jsonl"),
    output_dir=Path("~/.companion/training").expanduser(),  # expand ~ explicitly
    base_model="unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit",
    lora_rank=16,
    lora_alpha=32,
    learning_rate=2e-4,
    num_epochs=3,
)
```

**Base Models:**

- `unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit` - Recommended
- `unsloth/llama-3-8b-bnb-4bit` - Alternative

**Target Modules:**

LoRA is applied to: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`

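With the Hugging Face `peft` library, that target-module list corresponds to a `LoraConfig` roughly like the following. This is illustrative only: Unsloth configures LoRA through its own `FastLanguageModel.get_peft_model` wrapper, and the exact arguments used by `train()` may differ.

```python
from peft import LoraConfig

# Illustrative config fragment matching the documented rank/alpha/targets.
lora_config = LoraConfig(
    r=16,                       # lora_rank
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.0,
    bias="none",
    task_type="CAUSAL_LM",
)
```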
### Exporter (`companion.forge.export`)

```python
from pathlib import Path

from companion.forge.export import merge_only

# Merge LoRA into base model
merged_path = merge_only(
    checkpoint_path=Path("~/.companion/training/checkpoint-500").expanduser(),
    output_path=Path("~/.companion/models/merged").expanduser(),
)
```

### Reloader (`companion.forge.reload`)

```python
from pathlib import Path

from companion.forge.reload import reload_model, get_model_status

# Check current model
status = get_model_status(config)
print(f"Model size: {status['size_mb']} MB")

# Reload with new model
new_path = reload_model(
    config=config,
    new_model_path=Path("~/.companion/training/final").expanduser(),
    backup=True,
)
```

## CLI Reference

```bash
# Extract training data
python -m companion.forge.cli extract [--output PATH]

# Train model
python -m companion.forge.cli train \
  [--data PATH] \
  [--output PATH] \
  [--epochs N] \
  [--lr FLOAT]

# Check model status
python -m companion.forge.cli status

# Reload model
python -m companion.forge.cli reload MODEL_PATH [--no-backup]
```

## Training Tips

**Dataset Size:**
- Minimum: 50 examples
- Optimal: 100-500 examples
- More is not always better - quality over quantity

**Epochs:**
- Start with 3 epochs
- Increase if underfitting (training loss stays high)
- Decrease if overfitting (eval loss increases)

**LoRA Rank:**
- `8` - Quick experiments
- `16` - Balanced (recommended)
- `32-64` - High capacity, more VRAM

**Overfitting Signs:**
- Training loss decreasing while eval loss increases
- Model repeats exact phrases from training data
- Responses feel "memorized", not "learned"

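The first sign can be checked mechanically from the logged loss curves. A small sketch (a hypothetical helper, not part of the module):

```python
def eval_loss_rising(eval_losses, patience: int = 2) -> bool:
    """True if eval loss has increased for `patience` consecutive evaluations,
    a common trigger for early stopping."""
    if len(eval_losses) <= patience:
        return False
    recent = eval_losses[-(patience + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```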
## VRAM Usage (RTX 5070, 12GB)

| Config | VRAM | Batch Size |
|--------|------|------------|
| Rank 16, 8-bit Adam | ~10GB | 4 |
| Rank 32, 8-bit Adam | ~11GB | 4 |
| Rank 64, 8-bit Adam | OOM | - |

Use `gradient_accumulation_steps` to increase the effective batch size.

## Troubleshooting

**CUDA Out of Memory**
- Reduce `lora_rank` to 8
- Reduce `batch_size` to 2
- Increase `gradient_accumulation_steps`

**Training Loss Not Decreasing**
- Check data quality (are reflections actually present?)
- Increase the learning rate to 5e-4
- Check for data formatting issues

**Model Not Loading After Reload**
- Check that the path exists: `ls -la ~/.companion/models/`
- Verify the model format (GGUF vs HF)
- Check the API logs for errors

**Slow Training**
- Expected: ~6 hours for 3 epochs on an RTX 5070
- Enable gradient checkpointing (enabled by default)
- Close other GPU applications

## Advanced: Custom Training Script

```python
# custom_train.py
from companion.forge.train import train
from companion.config import load_config

config = load_config()

final_path = train(
    data_path=config.model.fine_tuning.training_data_path / "curated.jsonl",
    output_dir=config.model.fine_tuning.output_dir,
    base_model=config.model.fine_tuning.base_model,
    lora_rank=32,                    # Higher capacity
    lora_alpha=64,
    learning_rate=3e-4,              # Slightly higher
    num_epochs=5,                    # More epochs
    batch_size=2,                    # Smaller batches
    gradient_accumulation_steps=8,   # Effective batch = 16
)

print(f"Model saved to: {final_path}")
```