## Context

The project already has API, DB, and contract-style tests, but recent regressions (theme contrast/state accessibility and contact tooltip behavior) exposed a gap in browser-native UI/UX regression detection. Current checks do not consistently validate interaction states across themes, breakpoints, and keyboard/pointer paths using a real rendering engine.

This change introduces a Playwright-centered UI/UX testing design that covers existing product capabilities end-to-end, while staying compatible with current quality-gate flow and OpenSpec requirements.

## Goals / Non-Goals

**Goals:**
- Establish a deterministic Playwright test architecture for UI/UX regression prevention.
- Build a capability-to-test coverage matrix mapped to existing OpenSpec specs.
- Validate all critical interaction states: default, hover, focus-visible, active/pressed, and visited where applicable.
- Validate key journeys across light, dark, and contrast themes and across mobile/tablet/desktop breakpoints.
- Integrate Playwright execution into CI quality gates with actionable artifacts (trace, video/screenshot on failure).

**Non-Goals:**
- Replacing existing API/DB tests.
- Pixel-perfect visual diff for every component in this phase.
- Introducing new product behavior unrelated to testing.

## Decisions

### Decision 1: Use capability-driven Playwright test organization
- **Choice:** Organize tests by capability groups aligned to OpenSpec specs (e.g., `summary-modal-experience`, `share-and-contact-microinteractions`).
- **Why:** Makes regression ownership and traceability explicit and keeps specs/test synchronization straightforward.
- **Alternative considered:** Organize by page-only files. Rejected because cross-capability assertions become fragmented and harder to audit.

### Decision 2: Add a reusable test harness with deterministic fixtures
- **Choice:** Build shared Playwright fixtures for theme selection, viewport profiles, seeded content, and stable selectors.
- **Why:** Prevents flaky setup differences and reduces duplicated orchestration logic.
- **Alternative considered:** Per-test custom setup. Rejected due to drift and higher maintenance.

### Decision 3: Use interaction-state assertions, not just presence assertions
- **Choice:** Assert computed styles and behavior transitions for focus/hover/active/link states on key controls.
- **Why:** The reported regressions were state-specific and escaped static presence checks.
- **Alternative considered:** DOM existence-only assertions. Rejected as insufficient for UX quality.

### Decision 4: Add tiered execution profile for CI speed and depth
- **Choice:** Define smoke profile (PR) and full profile (main/nightly) while both remain gating where required by policy.
- **Why:** Balances quick feedback with comprehensive coverage.
- **Alternative considered:** Always run full matrix on every PR. Rejected due to avoidable CI latency.

### Decision 5: Capture failure diagnostics by default
- **Choice:** Enable trace-on-retry and screenshot/video on failure.
- **Why:** UI failures are often environment- and timing-sensitive; diagnostics reduce mean time to fix.
- **Alternative considered:** Logs-only. Rejected for weak debuggability.

## Risks / Trade-offs

- **[Risk] Flaky tests from network/data timing** -> **Mitigation:** deterministic fixtures, route stubbing/controlled seeds for unstable dependencies, explicit wait-for conditions tied to user-visible state.
- **[Risk] CI runtime increase** -> **Mitigation:** split smoke/full profiles and parallel workers with bounded retries.
- **[Risk] Selector fragility during UI refinements** -> **Mitigation:** standardize test-id/role-based selectors for critical controls and avoid brittle CSS-path selectors.
- **[Risk] Theme/breakpoint matrix explosion** -> **Mitigation:** capability-priority matrix (must-cover flows across all themes; lower-risk flows sampled strategically).

## Migration Plan

1. Add Playwright config and shared fixtures with environment-safe defaults.
2. Implement capability-mapped smoke scenarios first (high-risk UX/accessibility flows).
3. Expand to complete matrix scenarios for themes and breakpoints.
4. Wire smoke profile into PR quality gate and full profile into main/nightly gate.
5. Publish test strategy/coverage mapping documentation and triage rules.

Rollback strategy:
- If CI instability occurs, temporarily gate on smoke profile while isolating flaky cases; do not remove previously passing coverage without replacement.

## Open Questions

- Which environment should provide canonical seeded data for full-profile runs (local fixture API vs shared staging snapshot)?
- Should visual snapshots be introduced in this phase for selected high-risk components (hero, modal, footer controls), or deferred?
- What retry policy threshold should fail fast versus quarantine for flaky-only diagnostics?
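
The capability-priority matrix named as the mitigation for theme/breakpoint matrix explosion could be generated along the following lines. This is a minimal sketch, not a settled design: the `must-cover` and `sampled` priority labels, the diagonal sampling rule, the `buildMatrix` helper, and the `low-risk-example` capability name are all assumptions made for illustration. Only the three themes, the three breakpoints, and the `summary-modal-experience` capability name come from the text above.

```typescript
// Sketch of a capability-priority matrix. Priority labels and the sampling
// rule are illustrative assumptions, not part of the design decisions.
type Theme = "light" | "dark" | "contrast";
type Breakpoint = "mobile" | "tablet" | "desktop";
type Priority = "must-cover" | "sampled";

interface MatrixCell {
  capability: string;
  theme: Theme;
  breakpoint: Breakpoint;
}

const THEMES: Theme[] = ["light", "dark", "contrast"];
const BREAKPOINTS: Breakpoint[] = ["mobile", "tablet", "desktop"];

// must-cover: every theme x breakpoint combination (9 cells).
// sampled: only the "diagonal" cells, so each theme and each breakpoint
// still appears exactly once (3 cells).
function buildMatrix(capability: string, priority: Priority): MatrixCell[] {
  const cells: MatrixCell[] = [];
  THEMES.forEach((theme, themeIndex) => {
    BREAKPOINTS.forEach((breakpoint, breakpointIndex) => {
      const onDiagonal = themeIndex % BREAKPOINTS.length === breakpointIndex;
      if (priority === "must-cover" || onDiagonal) {
        cells.push({ capability, theme, breakpoint });
      }
    });
  });
  return cells;
}

// "low-risk-example" is a hypothetical capability name.
const modalCells = buildMatrix("summary-modal-experience", "must-cover");
const sampledCells = buildMatrix("low-risk-example", "sampled");
console.log(modalCells.length, sampledCells.length); // prints: 9 3
```

Generating the cells from data instead of hand-writing each combination keeps the coverage mapping auditable, and the sampling rule can be swapped out without touching individual tests.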
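
The tiered profiles from Decision 4 and the failure diagnostics from Decision 5 might map onto a Playwright configuration roughly as shown below. Treat it as a sketch under stated assumptions: the `./tests/ui` directory, the `@smoke` title tag, the project names, and the retry and worker counts are hypothetical, and "trace-on-retry" is interpreted here as Playwright's `on-first-retry` trace mode.

```typescript
// playwright.config.ts (sketch; paths, tag names, and counts are assumptions)
import { defineConfig, devices } from "@playwright/test";

export default defineConfig({
  testDir: "./tests/ui", // hypothetical location for capability-grouped specs
  // Bounded retries and parallel workers in CI only (values are placeholders).
  retries: process.env.CI ? 1 : 0,
  workers: process.env.CI ? 4 : undefined,
  use: {
    trace: "on-first-retry", // record a trace when a failed test is retried
    screenshot: "only-on-failure",
    video: "retain-on-failure",
  },
  projects: [
    // Smoke profile: only tests whose title contains the hypothetical @smoke tag.
    { name: "smoke-desktop", grep: /@smoke/, use: { ...devices["Desktop Chrome"] } },
    // Full profile: every test, on a desktop and a mobile device descriptor.
    { name: "full-desktop", use: { ...devices["Desktop Chrome"] } },
    { name: "full-mobile", use: { ...devices["Pixel 5"] } },
  ],
});
```

With a layout like this, a PR gate could run `npx playwright test --project=smoke-desktop`, while the main/nightly gate runs the `full-*` projects. Theme selection is left out on purpose, since the design assigns it to the shared fixtures in Decision 2.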