2026-04-10 19:00:38 -04:00
parent b8996d2ecb
commit 2c976bb75b
10 changed files with 957 additions and 0 deletions

@@ -0,0 +1,126 @@
<h2>Security & Privacy</h2>
<p class="subtitle">Path Traversal Prevention, Input Sanitization, Sensitive Content Guards & Local-Only Enforcement</p>
<div class="section">
<h3>Security Architecture: Four Layers</h3>
<div class="mockup">
<div class="mockup-header">Defense in depth: each layer catches what the previous one might miss</div>
<div class="mockup-body" style="font-family: monospace; font-size: 13px; padding: 16px;">
<div style="display: flex; flex-direction: column; gap: 10px;">
<div style="display: flex; gap: 12px; align-items: flex-start;">
<div style="background: #1a3a5c; border: 2px solid #53a8b6; border-radius: 8px; padding: 10px; min-width: 80px; text-align: center;">
<div style="color: #53a8b6; font-weight: bold;">LAYER 1</div>
<div style="color: #888; font-size: 10px;">Boundary</div>
</div>
<div style="flex: 1;">
<div style="color: #53a8b6; font-weight: bold;">Path Traversal Prevention</div>
<div style="color: #eee; margin-top: 4px;">All file reads restricted to vault_path. Reject:</div>
<div style="color: #f0a500; margin-left: 12px;">• Any path containing ".." components</div>
<div style="color: #f0a500; margin-left: 12px;">• Absolute paths not under vault_path</div>
<div style="color: #f0a500; margin-left: 12px;">• Symlinks resolving outside vault_path</div>
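<div style="color: #eee; margin-top: 4px;">Implementation: resolve symlinks first with <code>fs.realpathSync(path.resolve(vault_path, input))</code>, then require the result to equal <code>vault_path</code> or start with <code>vault_path + path.sep</code>. A bare prefix check would accept sibling paths such as <code>/vault-backup</code>, and <code>path.resolve</code> alone does not follow symlinks.</div>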
</div>
</div>
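<div>
<p>The three Layer 1 rejection rules can be sketched in Python for the indexer side. This is a minimal sketch, not the project's implementation: the function name <code>is_within_vault</code> is hypothetical, and it assumes the vault directory exists on disk so that symlinks can be resolved.</p>

```python
from pathlib import Path


def is_within_vault(vault_path: str, user_input: str) -> bool:
    """Return True only if user_input resolves to a location inside vault_path.

    Hypothetical helper; the name and signature are assumptions.
    """
    vault = Path(vault_path).resolve()

    # Rule 1: reject any path containing ".." components outright.
    if ".." in Path(user_input).parts:
        return False

    # Joining with an absolute user_input discards the vault prefix, so
    # absolute paths are checked by the same containment test (rule 2).
    # resolve() follows symlinks, which covers rule 3.
    candidate = (vault / user_input).resolve()

    # Compare whole path components, not string prefixes, so that a sibling
    # such as "/data/vault-evil" is not mistaken for "/data/vault".
    return candidate == vault or vault in candidate.parents
```

<p>Comparing path components instead of raw strings is the design choice that matters here: a string prefix test passes for any sibling directory whose name merely begins with the vault's name.</p>
</div>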
<div style="display: flex; gap: 12px; align-items: flex-start;">
<div style="background: #3d3200; border: 2px solid #f0a500; border-radius: 8px; padding: 10px; min-width: 80px; text-align: center;">
<div style="color: #f0a500; font-weight: bold;">LAYER 2</div>
<div style="color: #888; font-size: 10px;">Sanitize</div>
</div>
<div style="flex: 1;">
<div style="color: #f0a500; font-weight: bold;">Input Sanitization</div>
<div style="color: #eee; margin-top: 4px;">All vault content treated as untrusted before embedding:</div>
<div style="color: #eee; margin-left: 12px;">• Strip HTML tags (prevent XSS in chunk_text)</div>
<div style="color: #eee; margin-left: 12px;">• Remove executable code blocks (```...```)</div>
<div style="color: #eee; margin-left: 12px;">• Normalize whitespace (prevent injection via formatting)</div>
<div style="color: #eee; margin-left: 12px;">• Cap chunk_text at 2000 chars (prevent oversized payloads)</div>
</div>
</div>
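<div>
<p>The four Layer 2 steps could be chained roughly as follows. This is a sketch under stated assumptions: <code>sanitize_chunk</code> and the regex constants are hypothetical names, and the tag-stripping regex is deliberately naive (it would also remove prose such as <code>a &lt; b and c &gt; d</code>), so a real HTML parser may be a better fit.</p>

```python
import re

MAX_CHUNK_CHARS = 2000  # cap stated in Layer 2

# Hypothetical patterns; names are assumptions, not taken from the codebase.
FENCED_BLOCK = re.compile(r"```.*?```", re.DOTALL)  # fenced code blocks
HTML_TAG = re.compile(r"<[^>]+>")                   # naive tag matcher
WHITESPACE = re.compile(r"\s+")                     # any whitespace run


def sanitize_chunk(text: str) -> str:
    """Apply the Layer 2 steps in order and return the cleaned chunk text."""
    # Remove code blocks first so markup inside them never reaches later steps.
    text = FENCED_BLOCK.sub(" ", text)
    # Strip tags; the text between an opening and closing tag is kept.
    text = HTML_TAG.sub("", text)
    # Collapse newlines, tabs and repeated spaces into single spaces.
    text = WHITESPACE.sub(" ", text).strip()
    # Truncate last so the cap applies to the final, normalized text.
    return text[:MAX_CHUNK_CHARS]
```

<p>Truncation runs last on purpose: capping before normalization would count characters that are later removed.</p>
</div>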
<div style="display: flex; gap: 12px; align-items: flex-start;">
<div style="background: #3d0000; border: 2px solid #e94560; border-radius: 8px; padding: 10px; min-width: 80px; text-align: center;">
<div style="color: #e94560; font-weight: bold;">LAYER 3</div>
<div style="color: #888; font-size: 10px;">Guard</div>
</div>
<div style="flex: 1;">
<div style="color: #e94560; font-weight: bold;">Sensitive Content Guard</div>
<div style="color: #eee; margin-top: 4px;">Detect and gate sensitive content in search results:</div>
<div style="color: #eee; margin-left: 12px;">• Health: #mentalhealth, #physicalhealth, medication, therapy</div>
<div style="color: #eee; margin-left: 12px;">• Financial: owe, owed, debt, paid, $, spent</div>
<div style="color: #eee; margin-left: 12px;">• Relations: #Relations (configurable section list)</div>
<div style="color: #eee; margin-top: 4px;">Action: Set <code style="color: #f0a500;">sensitive_detected=true</code> in response</div>
<div style="color: #eee; margin-left: 12px;">→ Agent must confirm before displaying to user</div>
<div style="color: #eee; margin-left: 12px;">→ Never transmit sensitive content to external APIs</div>
</div>
</div>
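<div>
<p>A pattern-based guard along the lines of Layer 3 might look like this. The keyword lists come from the layer description; the function names and the <code>sensitive_categories</code> field are hypothetical additions for illustration, and only <code>sensitive_detected</code> is named in the design.</p>

```python
import re

# Keywords mirror the Layer 3 lists; the regexes themselves are an assumption.
PATTERNS = {
    "health": re.compile(
        r"#(?:mentalhealth|physicalhealth)\b|\b(?:medication|therapy)\b",
        re.IGNORECASE,
    ),
    "financial": re.compile(r"\$|\b(?:owe|owed|debt|paid|spent)\b", re.IGNORECASE),
    "relations": re.compile(r"#relations\b", re.IGNORECASE),
}


def detect_sensitive(chunk_text: str) -> list[str]:
    """Return the names of every category whose pattern matches the chunk."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(chunk_text)]


def build_response(chunks: list[str]) -> dict:
    """Wrap search results and flag them if any chunk looks sensitive."""
    categories = sorted({c for chunk in chunks for c in detect_sensitive(chunk)})
    return {
        "results": chunks,
        "sensitive_detected": bool(categories),
        "sensitive_categories": categories,  # hypothetical extra field
    }
```

<p>The guard only flags results. Deciding whether to display them stays with the agent and the user, as the layer describes.</p>
</div>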
<div style="display: flex; gap: 12px; align-items: flex-start;">
<div style="background: #1a472a; border: 2px solid #2ecc71; border-radius: 8px; padding: 10px; min-width: 80px; text-align: center;">
<div style="color: #2ecc71; font-weight: bold;">LAYER 4</div>
<div style="color: #888; font-size: 10px;">Network</div>
</div>
<div style="flex: 1;">
<div style="color: #2ecc71; font-weight: bold;">Local-Only Enforcement</div>
<div style="color: #eee; margin-top: 4px;">All data stays on the local machine:</div>
<div style="color: #eee; margin-left: 12px;">• Ollama: localhost:11434 only</div>
<div style="color: #eee; margin-left: 12px;">• LanceDB: local filesystem only</div>
<div style="color: #eee; margin-left: 12px;">• Config: <code style="color: #f0a500;">"local_only": true</code> enforced</div>
<div style="color: #eee; margin-left: 12px;">• Network audit test: verify no outbound requests</div>
<div style="color: #e94560; margin-top: 4px;">If external embedding endpoint configured:</div>
<div style="color: #e94560; margin-left: 12px;">→ Require explicit user confirmation</div>
<div style="color: #e94560; margin-left: 12px;">→ Block if sensitive content detected in payload</div>
</div>
</div>
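<div>
<p>The Layer 4 decision rules can be expressed as a small gate in front of every embedding request. This is a sketch only: <code>is_local_endpoint</code> and <code>may_send</code> are hypothetical names, the loopback host set is an assumption, and URLs are expected to include a scheme.</p>

```python
from urllib.parse import urlparse

# Hosts treated as "local machine"; this set is an assumption.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def is_local_endpoint(url: str) -> bool:
    """True if the URL's host is a loopback name or address."""
    return urlparse(url).hostname in LOOPBACK_HOSTS


def may_send(url: str, *, local_only: bool, user_confirmed: bool,
             sensitive_detected: bool) -> bool:
    """Decide whether a payload may be sent to the given embedding endpoint."""
    if is_local_endpoint(url):
        return True  # data never leaves the machine
    if local_only:
        return False  # "local_only": true forbids any external endpoint
    # External endpoint: needs explicit confirmation and a clean payload.
    return user_confirmed and not sensitive_detected
```

<p>Calling such a gate at request time would complement the network audit test, which only checks behavior when it is run.</p>
</div>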
</div>
</div>
</div>
</div>
<div class="section" style="margin-top: 20px;">
<h3>Directory Access Control</h3>
<div class="mockup">
<div class="mockup-header">deny_dirs and allow_dirs enforcement</div>
<div class="mockup-body" style="font-family: monospace; font-size: 12px; line-height: 1.7; padding: 16px;">
<div style="color: #53a8b6; font-weight: bold; margin-bottom: 6px;">DENY LIST (default)</div>
<div style="color: #eee; margin-left: 12px;">.obsidian, .trash, zzz-Archive, .git</div>
<div style="color: #888; margin-left: 12px;">Always excluded — system dirs, trash, archives, VCS</div>
<div style="color: #53a8b6; font-weight: bold; margin-top: 10px; margin-bottom: 6px;">ALLOW LIST (optional)</div>
<div style="color: #eee; margin-left: 12px;">If set: ONLY these directories are indexed</div>
<div style="color: #eee; margin-left: 12px;">If empty: all dirs except deny list are indexed</div>
<div style="color: #888; margin-left: 12px;">e.g. ["Journal", "Finance"] → only these two</div>
<div style="color: #53a8b6; font-weight: bold; margin-top: 10px; margin-bottom: 6px;">FILTER LOGIC (Python indexer)</div>
<div style="color: #eee; margin-left: 12px;">1. If allow_dirs is non-empty → only walk those dirs</div>
<div style="color: #eee; margin-left: 12px;">2. Skip any path matching deny_dirs patterns</div>
<div style="color: #eee; margin-left: 12px;">3. Skip hidden dirs (starting with .)</div>
<div style="color: #53a8b6; font-weight: bold; margin-top: 10px; margin-bottom: 6px;">FILTER LOGIC (TS plugin)</div>
<div style="color: #eee; margin-left: 12px;">directory_filter parameter validates against known dirs</div>
<div style="color: #eee; margin-left: 12px;">Reject unknown dirs → prevent probing vault structure</div>
</div>
</div>
</div>
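<div class="section" style="margin-top: 20px;">
<p>The three-step filter logic for the Python indexer could be sketched with <code>os.walk</code>. The function name <code>iter_indexable_dirs</code> is hypothetical, and the sketch assumes <code>deny_dirs</code> entries are exact directory names; if the real configuration uses glob patterns, the membership test would need <code>fnmatch</code> instead.</p>

```python
import os

DEFAULT_DENY_DIRS = (".obsidian", ".trash", "zzz-Archive", ".git")


def iter_indexable_dirs(vault_path, allow_dirs=(), deny_dirs=DEFAULT_DENY_DIRS):
    """Yield vault-relative directories that pass the allow/deny filter."""
    # Step 1: a non-empty allow list restricts the walk to those roots.
    roots = [os.path.join(vault_path, d) for d in allow_dirs] or [vault_path]
    for root in roots:
        for dirpath, dirnames, _filenames in os.walk(root):
            # Steps 2 and 3: prune denied and hidden directories in place so
            # os.walk never descends into them.
            dirnames[:] = sorted(
                d for d in dirnames
                if d not in deny_dirs and not d.startswith(".")
            )
            yield os.path.relpath(dirpath, vault_path)
```

<p>Pruning <code>dirnames</code> in place, instead of filtering results afterwards, means excluded trees are never read at all.</p>
</div>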
<div class="section" style="margin-top: 20px;">
<h3>Key Design Points</h3>
<div class="pros-cons">
<div class="pros"><h4>Why this works</h4>
<ul>
<li>Four independent layers: if one is bypassed, the others still apply</li>
<li>Sensitive content is flagged, not blocked — agent decides with user</li>
<li>Local-only is default; external APIs require explicit opt-in + confirmation</li>
<li>Directory controls applied in both Python (indexing) and TS (querying)</li>
</ul>
</div>
<div class="cons"><h4>Trade-offs</h4>
<ul>
<li>Sensitive content detection is pattern-based, so it can produce both false positives and false negatives</li>
<li>Stripping code blocks discards the content of technical notes</li>
<li>The network audit test must be run manually; nothing blocks outbound requests at runtime</li>
</ul>
</div>
</div>
</div>