Testing Strategy

Python Tests, TypeScript Tests, Integration Tests & Security Test Suites

Test Architecture: Four Suites

Each suite tests a distinct concern
PYTHON UNIT
pytest / python/tests/
  • test_chunker.py → Section splitting, sliding window, metadata
  • test_embedder.py → Mocked Ollama, batch embed, error handling
  • test_vector_store.py → LanceDB CRUD, upsert, delete by source
  • test_security.py → Path traversal, sanitization, sensitive detection
  • test_indexer.py → Full pipeline, incremental sync, config

TYPESCRIPT UNIT
vitest / tests/
  • search.test.ts → Parameter validation, filter logic, response shape
  • index.test.ts → Mode validation, subprocess spawn, progress parse
  • memory.test.ts → Key/value storage, auto-suggest patterns
  • vault-watcher.test.ts → Chokidar events, debounce, batching
  • security-guard.test.ts → Directory filter validation, sensitive flag

INTEGRATION
E2E / tests/integration/
  • full_pipeline.test.ts → Index vault → search → verify results
  • sync_cycle.test.ts → Modify file → auto-sync → search updated
  • health_state.test.ts → Stop Ollama → degraded → restart → healthy
  • openclaw_protocol.test.ts → Agent calls tools, validates envelope

SECURITY
Dedicated / tests/security/
  • path_traversal.test.ts → ../, symlinks, absolute, encoded paths
  • xss_prevention.test.ts → HTML/script injection in chunk text
  • prompt_injection.test.ts → Malicious content in vault notes
  • network_audit.test.ts → Verify zero outbound when local_only=true
  • sensitive_content.test.ts → Pattern detection, flagging, blocking
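
The path traversal cases listed above (../, symlinks, absolute, encoded paths) can be illustrated with a small containment check. This is a minimal sketch in Python, not the project's actual guard: the function name is_inside_vault and its signature are assumptions made for illustration, and it decodes percent-encoding only once.

```python
from pathlib import Path
from urllib.parse import unquote


def is_inside_vault(vault_root: Path, requested: str) -> bool:
    """Return True only if `requested` resolves to a location inside `vault_root`.

    Hypothetical helper for illustration; not taken from the project source.
    """
    # Decode percent-encoded input first so "%2e%2e%2f" is treated as "../".
    decoded = unquote(requested)
    root = vault_root.resolve()
    # Joining with an absolute path discards the root, so "/etc/passwd" stays absolute.
    # resolve() collapses ".." segments and follows symlinks.
    candidate = (root / decoded).resolve()
    return candidate == root or root in candidate.parents
```

A test in the style of the security suite would build a temporary vault, then assert that "notes/a.md" is accepted while "../secret.txt", "/etc/passwd", "%2e%2e%2fsecret.txt", and a symlink pointing outside the vault are all rejected.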

Testing Approach by Layer

What gets mocked vs what gets hit for real
Component          Unit Test                          Integration Test
Ollama embedding   Mocked: fixed 1024-dim vectors     Real: requires Ollama running
LanceDB            Real: temp directory, cleaned up   Real: temp directory, cleaned up
Obsidian vault     Mocked: fixture markdown files     Real: temp vault with real files
Python CLI         Mocked subprocess                  Real: actual CLI invocation
Chokidar watcher   Mocked events                      Real: actual file system events
OpenClaw agent     N/A                                Real: tool call envelope validation
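
As an illustration of the "Mocked: fixed 1024-dim vectors" row, a unit test can swap the Ollama call for a deterministic stand-in. The sketch below is one possible reading of "fixed" (the same text always yields the same vector). The class name FakeEmbedder and the method embed_batch are hypothetical and not taken from the project.

```python
import hashlib
from typing import List

EMBEDDING_DIM = 1024  # dimension stated in the table above


class FakeEmbedder:
    """Hypothetical stand-in for the Ollama-backed embedder in unit tests."""

    def embed_batch(self, texts: List[str]) -> List[List[float]]:
        # One vector per input text, in the same order as the input.
        return [self._embed(text) for text in texts]

    def _embed(self, text: str) -> List[float]:
        # Derive values from a hash so the same text always yields the same vector.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        # Repeat the 32 digest bytes to fill every dimension, scaled into [0, 1].
        return [digest[i % len(digest)] / 255.0 for i in range(EMBEDDING_DIM)]
```

Because the vectors depend only on the input text, assertions about batch size, dimension, and ordering stay stable across runs without a model being loaded.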

Key Design Points

Why this works

  • The four suites together cover all four security layers and both languages (Python and TypeScript)
  • LanceDB uses real temp dirs — no mock drift for vector operations
  • Integration tests require Ollama but skip gracefully if unavailable
  • Security suite is standalone: it always runs and has no external dependencies
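
The "skip gracefully if unavailable" behaviour can be sketched as a reachability probe that gates the integration tests. The integration suite itself is TypeScript; the version below shows the same idea in Python with unittest. The helper service_reachable, the class name, and the use of Ollama's default local port 11434 are assumptions for illustration.

```python
import socket
import unittest


def service_reachable(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


# Assumption: Ollama listens on its default local port, 11434.
OLLAMA_UP = service_reachable("127.0.0.1", 11434)


@unittest.skipUnless(OLLAMA_UP, "Ollama is not running; skipping integration test")
class FullPipelineTest(unittest.TestCase):
    def test_index_then_search(self) -> None:
        # Placeholder body: a real test would index a temp vault and search it.
        self.assertTrue(OLLAMA_UP)
```

With this shape, a machine without Ollama reports the test as skipped rather than failed, so the rest of the run stays green.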

Trade-offs

  • Integration tests need Ollama installed locally
  • File watcher integration tests are timing-dependent (debounce/batch)
  • Sensitive content detection is pattern-based, so test coverage is only as good as the patterns themselves
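
The last trade-off can be made concrete with a small pattern-based detector. This is a sketch only: the three patterns and the function find_sensitive are hypothetical examples, not the project's real pattern list.

```python
import re
from typing import List

# Hypothetical patterns for illustration; the project's real list is not shown here.
SENSITIVE_PATTERNS = {
    "aws_access_key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)\bpassword\s*[:=]\s*\S+"),
}


def find_sensitive(text: str) -> List[str]:
    """Return the names of all patterns that match the given chunk text."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items() if pattern.search(text)]
```

With these patterns, a note containing "password: hunter2" is flagged, while "my passphrase is hunter2" passes undetected. A test suite built around the pattern list shares that blind spot, which is the limitation the bullet describes.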