Santhosh Janardhanan e2406bf978
Pushing to live
2026-02-13 10:19:01 -05:00

Context

The project already has API, DB, and contract-style tests, but recent regressions (theme contrast/state accessibility and contact tooltip behavior) exposed a gap in browser-native UI/UX regression detection. Current checks do not consistently validate interaction states across themes, breakpoints, and keyboard/pointer paths using a real rendering engine.

This change introduces a Playwright-centered UI/UX testing design that covers existing product capabilities end-to-end, while staying compatible with current quality-gate flow and OpenSpec requirements.

Goals / Non-Goals

Goals:

  • Establish a deterministic Playwright test architecture for UI/UX regression prevention.
  • Build a capability-to-test coverage matrix mapped to existing OpenSpec specs.
  • Validate all critical interaction states: default, hover, focus-visible, active/pressed, and visited where applicable.
  • Validate key journeys across light, dark, and contrast themes and across mobile/tablet/desktop breakpoints.
  • Integrate Playwright execution into CI quality gates with actionable artifacts (trace, video/screenshot on failure).

Non-Goals:

  • Replacing existing API/DB tests.
  • Pixel-perfect visual diff for every component in this phase.
  • Introducing new product behavior unrelated to testing.

Decisions

Decision 1: Use capability-driven Playwright test organization

  • Choice: Organize tests by capability groups aligned to OpenSpec specs (e.g., summary-modal-experience, share-and-contact-microinteractions).
  • Why: Makes regression ownership and traceability explicit and keeps specs and tests straightforward to synchronize.
  • Alternative considered: Organize by page-only files. Rejected because cross-capability assertions become fragmented and harder to audit.
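
One way to make the capability-driven layout auditable is a small registry that maps each capability to the test files covering it. The sketch below is illustrative only: the two capability names come from the examples above, while the `tests/ui/...` path, the `CapabilityEntry` shape, and the `uncoveredCapabilities` helper are hypothetical and not taken from the project.

```typescript
// Hypothetical registry: one entry per OpenSpec capability.
interface CapabilityEntry {
  capability: string;  // OpenSpec capability name
  testFiles: string[]; // Playwright spec files claimed to cover it (assumed paths)
}

const registry: CapabilityEntry[] = [
  {
    capability: "summary-modal-experience",
    testFiles: ["tests/ui/summary-modal-experience.spec.ts"],
  },
  {
    // Left empty on purpose to show how a coverage gap surfaces.
    capability: "share-and-contact-microinteractions",
    testFiles: [],
  },
];

// Returns the capabilities that have no test file mapped to them yet.
function uncoveredCapabilities(entries: CapabilityEntry[]): string[] {
  return entries
    .filter((entry) => entry.testFiles.length === 0)
    .map((entry) => entry.capability);
}

console.log(uncoveredCapabilities(registry));
```

A check like this could run in CI so that a new spec without a matching test group is flagged early, which is harder to do when tests are grouped by page.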

Decision 2: Add a reusable test harness with deterministic fixtures

  • Choice: Build shared Playwright fixtures for theme selection, viewport profiles, seeded content, and stable selectors.
  • Why: Prevents flaky setup differences and reduces duplicated orchestration logic.
  • Alternative considered: Per-test custom setup. Rejected due to drift and higher maintenance.
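
As a rough illustration of the shared harness, the sketch below builds one named profile per theme and viewport combination so every test starts from the same setup. The theme and breakpoint names come from this document; the pixel dimensions, the `ProjectProfile` shape, and `buildProfiles` are assumptions made for the example.

```typescript
type Theme = "light" | "dark" | "contrast";
type ViewportName = "mobile" | "tablet" | "desktop";

interface ViewportSize {
  width: number;
  height: number;
}

// Assumed pixel sizes: the design names the breakpoints but not their dimensions.
const viewportProfiles: Record<ViewportName, ViewportSize> = {
  mobile: { width: 390, height: 844 },
  tablet: { width: 768, height: 1024 },
  desktop: { width: 1280, height: 800 },
};

const themes: Theme[] = ["light", "dark", "contrast"];

interface ProjectProfile {
  name: string; // e.g. "dark-tablet"
  theme: Theme;
  viewport: ViewportSize;
}

// Cross every theme with every viewport so setups are named and repeatable.
function buildProfiles(): ProjectProfile[] {
  const profiles: ProjectProfile[] = [];
  const viewportNames = Object.keys(viewportProfiles) as ViewportName[];
  for (const theme of themes) {
    for (const viewportName of viewportNames) {
      profiles.push({
        name: `${theme}-${viewportName}`,
        theme,
        viewport: viewportProfiles[viewportName],
      });
    }
  }
  return profiles;
}

console.log(buildProfiles().map((profile) => profile.name));
```

Each profile could then feed a Playwright project entry (its name plus `use.viewport`), with the theme applied by a shared fixture. How the application actually switches themes is not specified here, so that step is left out.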

Decision 3: Use interaction-state assertions, not just presence assertions

  • Choice: Assert computed styles and behavior transitions for focus/hover/active/link states on key controls.
  • Why: The reported regressions were state-specific and escaped static presence checks.
  • Alternative considered: DOM existence-only assertions. Rejected as insufficient for UX quality.
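
For the theme-contrast side of these assertions, a test needs something concrete to compare computed styles against. The sketch below implements the WCAG 2.x contrast-ratio formula over `rgb(...)` strings of the kind `getComputedStyle` returns. The helper names are hypothetical, alpha is ignored for simplicity, and any threshold a test enforces (for example WCAG AA's 4.5:1 for normal text) is a project decision this document does not fix.

```typescript
// Parse "rgb(r, g, b)" or "rgba(r, g, b, a)"; the alpha channel is ignored here.
function parseRgb(color: string): [number, number, number] {
  const match = color.match(/rgba?\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)/);
  if (!match) throw new Error(`Unsupported color format: ${color}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// WCAG 2.x relative luminance of an sRGB color with 0-255 channels.
function relativeLuminance(rgb: [number, number, number]): number {
  const linearize = (channel: number): number => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return (
    0.2126 * linearize(rgb[0]) +
    0.7152 * linearize(rgb[1]) +
    0.0722 * linearize(rgb[2])
  );
}

// Contrast ratio (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(foreground: string, background: string): number {
  const l1 = relativeLuminance(parseRgb(foreground));
  const l2 = relativeLuminance(parseRgb(background));
  return (Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05);
}

console.log(contrastRatio("rgb(0, 0, 0)", "rgb(255, 255, 255)").toFixed(2));
```

In a Playwright test, the two color strings would be read from the element's computed style after hovering or focusing it, so the same helper can check each interaction state in each theme.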

Decision 4: Add a tiered execution profile for CI speed and depth

  • Choice: Define a smoke profile (PR) and a full profile (main/nightly); both remain gating wherever policy requires it.
  • Why: Balances quick feedback with comprehensive coverage.
  • Alternative considered: Always run full matrix on every PR. Rejected due to avoidable CI latency.
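
The smoke/full split can be expressed as a small selector that turns the CI trigger into Playwright CLI arguments. This is a sketch under assumptions: the `@smoke` tag convention, the `CiEvent` names, and both helper functions are hypothetical. The `--grep` flag itself is a standard `playwright test` option.

```typescript
type CiEvent = "pull_request" | "main" | "nightly";

interface ExecutionProfile {
  name: "smoke" | "full";
  grep?: string; // tag filter; "@smoke" is an assumed tagging convention
}

// PRs get the fast smoke subset; main and nightly runs get everything.
function profileFor(event: CiEvent): ExecutionProfile {
  if (event === "pull_request") {
    return { name: "smoke", grep: "@smoke" };
  }
  return { name: "full" };
}

// Builds the argument list a CI step could pass to `npx`.
function cliArgs(profile: ExecutionProfile): string[] {
  const args = ["playwright", "test"];
  if (profile.grep) {
    args.push("--grep", profile.grep);
  }
  return args;
}

console.log(cliArgs(profileFor("pull_request")).join(" "));
console.log(cliArgs(profileFor("nightly")).join(" "));
```

Keeping the selection in one function means the PR gate and the main/nightly gate cannot drift apart in how they define "smoke".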

Decision 5: Capture failure diagnostics by default

  • Choice: Enable trace-on-retry and screenshot/video on failure.
  • Why: UI failures are often environment- and timing-sensitive; diagnostics reduce mean time to fix.
  • Alternative considered: Logs-only. Rejected for weak debuggability.
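
The diagnostic settings might look like the following. The option values (`on-first-retry`, `only-on-failure`, `retain-on-failure`) are ones Playwright documents for its `trace`, `screenshot`, and `video` options; the local `DiagnosticsOptions` type, the `diagnosticsFor` helper, and the retry count of 2 are assumptions for illustration. The types are declared locally so the sketch runs without the Playwright package.

```typescript
// Local type mirroring a subset of Playwright's documented option values.
interface DiagnosticsOptions {
  trace: "off" | "on" | "retain-on-failure" | "on-first-retry";
  screenshot: "off" | "on" | "only-on-failure";
  video: "off" | "on" | "retain-on-failure" | "on-first-retry";
}

const diagnostics: DiagnosticsOptions = {
  trace: "on-first-retry",      // record a trace when a failed test is retried
  screenshot: "only-on-failure", // capture a screenshot for failing tests only
  video: "retain-on-failure",    // record video, keep it only for failures
};

// Trace-on-retry only produces output when retries are enabled,
// so the retry count is set together with the diagnostics.
// The CI retry count of 2 is an assumed value.
function diagnosticsFor(isCi: boolean): { retries: number; use: DiagnosticsOptions } {
  return { retries: isCi ? 2 : 0, use: diagnostics };
}

console.log(JSON.stringify(diagnosticsFor(true)));
```

The returned object has the same shape as the `retries` and `use` keys of a Playwright config, so it could be spread into one. With zero retries locally, no trace is recorded under `on-first-retry`, which keeps local runs light.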

Risks / Trade-offs

  • [Risk] Flaky tests from network/data timing -> Mitigation: deterministic fixtures, route stubbing/controlled seeds for unstable dependencies, explicit wait-for conditions tied to user-visible state.
  • [Risk] CI runtime increase -> Mitigation: split smoke/full profiles and parallel workers with bounded retries.
  • [Risk] Selector fragility during UI refinements -> Mitigation: standardize test-id/role-based selectors for critical controls and avoid brittle CSS-path selectors.
  • [Risk] Theme/breakpoint matrix explosion -> Mitigation: capability-priority matrix (must-cover flows across all themes; lower-risk flows sampled strategically).
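
The capability-priority mitigation for matrix explosion can be sketched as follows: must-cover flows expand to the full theme and breakpoint cross product, while lower-risk flows run on a fixed sample in which each theme and each breakpoint appears once. The flow names, the `Priority` labels, and the particular sample are hypothetical choices for the example.

```typescript
type Priority = "must-cover" | "lower-risk";

interface Flow {
  name: string;
  priority: Priority;
}

interface Combo {
  theme: string;
  breakpoint: string;
}

const themes = ["light", "dark", "contrast"];
const breakpoints = ["mobile", "tablet", "desktop"];

// Full cross product: every theme at every breakpoint (9 combinations).
function fullMatrix(): Combo[] {
  const combos: Combo[] = [];
  for (const theme of themes) {
    for (const breakpoint of breakpoints) {
      combos.push({ theme, breakpoint });
    }
  }
  return combos;
}

// Assumed sample: each theme and each breakpoint is exercised exactly once.
const sampledMatrix: Combo[] = [
  { theme: "light", breakpoint: "mobile" },
  { theme: "dark", breakpoint: "tablet" },
  { theme: "contrast", breakpoint: "desktop" },
];

function combosFor(flow: Flow): Combo[] {
  return flow.priority === "must-cover" ? fullMatrix() : sampledMatrix;
}

// Hypothetical flows used only to show the difference in scenario counts.
const flows: Flow[] = [
  { name: "contact tooltip", priority: "must-cover" },
  { name: "footer links", priority: "lower-risk" },
];

for (const flow of flows) {
  console.log(`${flow.name}: ${combosFor(flow).length} scenarios`);
}
```

With one flow of each priority, this yields 12 scenarios instead of 18, and the saving grows with the number of lower-risk flows.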

Migration Plan

  1. Add Playwright config and shared fixtures with environment-safe defaults.
  2. Implement capability-mapped smoke scenarios first (high-risk UX/accessibility flows).
  3. Expand to complete matrix scenarios for themes and breakpoints.
  4. Wire smoke profile into PR quality gate and full profile into main/nightly gate.
  5. Publish test strategy/coverage mapping documentation and triage rules.

Rollback strategy:

  • If CI instability occurs, temporarily gate on smoke profile while isolating flaky cases; do not remove previously passing coverage without replacement.

Open Questions

  • Which environment should provide canonical seeded data for full-profile runs (local fixture API vs shared staging snapshot)?
  • Should visual snapshots be introduced in this phase for selected high-risk components (hero, modal, footer controls), or deferred?
  • At what retry threshold should a failing test fail the run immediately, rather than be quarantined and kept only for flaky-test diagnostics?