Security & Privacy

Path Traversal Prevention, Input Sanitization, Sensitive Content Guards & Local-Only Enforcement

Security Architecture: Four Layers

Defense in depth — each layer blocks what the previous might miss
LAYER 1
Boundary
Path Traversal Prevention
All file reads restricted to vault_path. Reject:
• Any path containing ".." components
• Absolute paths not under vault_path
• Symlinks resolving outside vault_path
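Implementation: resolve symlinks first (path.resolve alone does not follow them; use fs.realpath), then require the result to equal vault_path or start with vault_path + path separator. A bare prefix check would accept a sibling such as /vault-backup.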
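The three rejection rules can be sketched in Python as follows. This is a minimal illustration under assumptions, not the project's actual code: resolve_in_vault and PathRejected are hypothetical names, and the containment test compares path components rather than string prefixes.

```python
from pathlib import Path


class PathRejected(ValueError):
    """Raised when a requested path would escape the vault (hypothetical name)."""


def resolve_in_vault(vault_path: str, user_input: str) -> Path:
    """Return the real path of user_input if it stays inside vault_path."""
    # Rule 1: reject ".." components outright, even ones that would
    # still land inside the vault after normalization.
    if ".." in Path(user_input).parts:
        raise PathRejected("path contains '..'")

    vault = Path(vault_path).resolve()
    # Joining with an absolute input discards the vault prefix; the
    # containment check below then rejects it unless it is under the vault.
    # resolve() follows symlinks, which covers rule 3.
    candidate = (vault / user_input).resolve()

    # Rules 2 and 3: component-wise containment. A string prefix test would
    # wrongly accept a sibling directory such as "<vault>-backup".
    if candidate != vault and vault not in candidate.parents:
        raise PathRejected("path resolves outside the vault")
    return candidate
```

Callers would then open only the returned path, never the raw input.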
LAYER 2
Sanitize
Input Sanitization
All vault content treated as untrusted before embedding:
• Strip HTML tags (prevent XSS in chunk_text)
• Remove fenced code blocks (```...```)
• Normalize whitespace (prevent injection via formatting)
• Cap chunk_text at 2000 chars (prevent oversized payloads)
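A minimal Python sketch of these four steps, assuming regex-based stripping is acceptable. sanitize_chunk and the pattern names are hypothetical, and a real HTML parser would be more robust than the tag regex shown here.

```python
import re

MAX_CHUNK_CHARS = 2000  # cap taken from the list above

# Hypothetical patterns for illustration only.
FENCED_BLOCK = re.compile(r"```.*?```", re.DOTALL)
HTML_TAG = re.compile(r"</?[A-Za-z][^>]*>")
WHITESPACE = re.compile(r"\s+")


def sanitize_chunk(text: str) -> str:
    """Apply the four sanitization steps to untrusted vault text."""
    # Remove fenced code first so tags inside code are dropped with it.
    text = FENCED_BLOCK.sub(" ", text)
    # Strip anything that looks like an HTML tag; "1 < 2" is left alone
    # because a tag must start with a letter right after "<".
    text = HTML_TAG.sub(" ", text)
    # Collapse every whitespace run (newlines, tabs) into a single space.
    text = WHITESPACE.sub(" ", text).strip()
    # Enforce the length cap last, after the text has shrunk.
    return text[:MAX_CHUNK_CHARS]
```

Applying the cap last means the limit counts only the characters that survive the earlier steps.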
LAYER 3
Guard
Sensitive Content Guard
Detect and gate sensitive content in search results:
• Health: #mentalhealth, #physicalhealth, medication, therapy
• Financial: owe, owed, debt, paid, $, spent
• Relations: #Relations (configurable section list)
Action: Set sensitive_detected=true in response
→ Agent must confirm before displaying to user
→ Never transmit sensitive content to external APIs
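The guard could look roughly like the Python sketch below. It assumes whole-word and whole-tag matching on lowercased text; is_sensitive, build_response and SearchResponse are hypothetical names, and only the sensitive_detected field comes from the description above.

```python
import re
from dataclasses import dataclass

# Pattern lists copied from the categories above; in practice they would
# come from configuration (assumption).
SENSITIVE_TAGS = {"#mentalhealth", "#physicalhealth", "#relations"}
SENSITIVE_WORDS = {"medication", "therapy", "owe", "owed", "debt", "paid", "spent"}

TAG_RE = re.compile(r"#\w+")
WORD_RE = re.compile(r"[a-z]+")


@dataclass
class SearchResponse:
    results: list
    sensitive_detected: bool = False


def is_sensitive(text: str) -> bool:
    """Return True if the text matches any sensitive tag, word, or '$'."""
    lowered = text.lower()
    if "$" in lowered:
        return True
    if any(tag in SENSITIVE_TAGS for tag in TAG_RE.findall(lowered)):
        return True
    # Whole-word match, so "overpaid" does not trigger on "paid".
    return any(word in SENSITIVE_WORDS for word in WORD_RE.findall(lowered))


def build_response(chunks: list) -> SearchResponse:
    """Flag the response if any returned chunk looks sensitive."""
    flagged = any(is_sensitive(chunk) for chunk in chunks)
    return SearchResponse(results=chunks, sensitive_detected=flagged)
```

The agent would check sensitive_detected before showing results and ask the user to confirm when it is set.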
LAYER 4
Network
Local-Only Enforcement
All data stays on the local machine:
• Ollama: localhost:11434 only
• LanceDB: local filesystem only
• Config: "local_only": true enforced
• Network audit test: verify no outbound requests
If external embedding endpoint configured:
→ Require explicit user confirmation
→ Block if sensitive content detected in payload
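One way to express the endpoint rules is sketched below in Python. is_local_endpoint and may_send are hypothetical helpers, and the flags user_confirmed and sensitive_detected are assumed to be supplied by the caller.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(url: str) -> bool:
    """True if the URL's host is 'localhost' or a loopback IP address."""
    host = urlparse(url).hostname  # None when the URL has no "//host" part
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (e.g. "127.0.0.1.example.com"): treat as external.
        return False


def may_send(url: str, *, local_only: bool, user_confirmed: bool,
             sensitive_detected: bool) -> bool:
    """Decide whether a payload may be sent to the given endpoint."""
    if is_local_endpoint(url):
        return True
    if local_only:
        return False  # external endpoints are never allowed in local-only mode
    # External endpoint: needs explicit confirmation and a clean payload.
    return user_confirmed and not sensitive_detected
```

For example, may_send("http://localhost:11434", local_only=True, user_confirmed=False, sensitive_detected=True) is allowed because the data never leaves the machine.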

Directory Access Control

deny_dirs and allow_dirs enforcement
DENY LIST (default)
.obsidian, .trash, zzz-Archive, .git
Always excluded — system dirs, trash, archives, VCS
ALLOW LIST (optional)
If set: ONLY these directories are indexed
If empty: all dirs except deny list are indexed
e.g. ["Journal", "Finance"] → only these two
FILTER LOGIC (Python indexer)
1. If allow_dirs is non-empty → only walk those dirs
2. Skip any path matching deny_dirs patterns
3. Skip hidden dirs (starting with .)
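The three indexer steps might be implemented along these lines. This sketch assumes deny_dirs entries are glob patterns matched against each directory name; iter_indexable_files is a hypothetical name.

```python
import os
from fnmatch import fnmatch

DEFAULT_DENY_DIRS = [".obsidian", ".trash", "zzz-Archive", ".git"]


def _denied(name: str, deny_dirs) -> bool:
    # Steps 2 and 3: deny-list match or hidden directory.
    return name.startswith(".") or any(fnmatch(name, pat) for pat in deny_dirs)


def iter_indexable_files(vault_path, allow_dirs=(), deny_dirs=DEFAULT_DENY_DIRS):
    """Yield file paths under vault_path that pass the directory filters."""
    # Step 1: a non-empty allow list restricts the walk to those directories.
    if allow_dirs:
        roots = [os.path.join(vault_path, d) for d in allow_dirs
                 if not _denied(d, deny_dirs)]
    else:
        roots = [vault_path]

    for root in roots:
        for dirpath, dirnames, filenames in os.walk(root):
            # Prune in place so os.walk never descends into excluded dirs.
            dirnames[:] = [d for d in dirnames if not _denied(d, deny_dirs)]
            for name in filenames:
                yield os.path.join(dirpath, name)
```

Pruning dirnames in place keeps excluded directories from being read at all, rather than filtering their files afterwards.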
FILTER LOGIC (TS plugin)
directory_filter parameter validates against known dirs
Reject unknown dirs → prevent probing vault structure
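The query-side check is small. The sketch below shows the logic in Python for consistency with the other examples, although the plugin itself is TypeScript; validate_directory_filter and known_dirs are hypothetical names, with known_dirs assumed to be the set of directories recorded at index time.

```python
def validate_directory_filter(directory_filter, known_dirs):
    """Return the filter if it names an indexed directory, else raise."""
    if directory_filter is None:
        return None  # no filter requested: search everything indexed
    if directory_filter not in known_dirs:
        # Generic message: do not echo the known list or hint at what
        # exists on disk, so callers cannot probe the vault structure.
        raise ValueError("unknown directory_filter")
    return directory_filter
```

Validating against the indexed set, rather than checking the filesystem, means the response is the same for a non-existent directory and an excluded one.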

Key Design Points

Why this works

  • Four independent layers — bypass one and the others still protect
  • Sensitive content in local results is flagged, not blocked: the agent confirms with the user before display. Only transmission to external APIs is blocked outright
  • Local-only is default; external APIs require explicit opt-in + confirmation
  • Directory controls applied in both Python (indexing) and TS (querying)

Trade-offs

  • Sensitive content detection is pattern-based — may have false positives/negatives
  • Stripping code blocks loses technical notes content
  • Network audit test must be run manually; the local_only setting is checked in configuration, but nothing blocks outbound requests at runtime