clawfort/openspec/changes/p04-summary/design.md
2026-02-13 00:49:22 -05:00

Context

ClawFort currently stores and displays full headline/summary text from the ingestion pipeline and renders feed content directly in cards/hero. There is no dedicated concise summary format, modal reading experience, or summary-specific analytics lifecycle.

This change introduces a structured summary artifact per fetched article, with template-driven rendering and event instrumentation.

Constraints:

  • Reuse existing Perplexity integration for generation.
  • Keep source attribution visible and preserved.
  • Prefer royalty-free image retrieval via MCP integration when available, with a deterministic fallback path.
  • Ensure modal interactions are fully tagged in Umami.

Goals / Non-Goals

Goals:

  • Generate and persist concise summary content at ingestion time.
  • Persist and return template-compatible summary fields and image metadata.
  • Present summary in a modal dialog with required visual structure.
  • Track modal open/close/link-out analytics events consistently.

Non-Goals:

  • Replacing the existing core feed API model end-to-end.
  • Building a full long-form article reader.
  • Introducing user-authored summary editing workflows.
  • Supporting arbitrary analytics providers beyond current Umami hooks.

Decisions

Decision: Persist structured summary fields alongside article records

Decision: Store summary artifacts as explicit fields (TL;DR bullets, summary body, citation/source, summary image URL/credit) linked to each article.

Rationale:

  • Enables deterministic API response shape for modal rendering.
  • Keeps summary retrieval simple at read time.
  • Avoids dynamic prompt regeneration during page interactions.

Alternatives considered:

  • Generate summary on-demand at modal open: rejected due to latency and cost spikes.
  • Store a single blob markdown string only: rejected due to weaker field-level control and analytics granularity.
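
As an illustration of the explicit-field approach, here is a minimal sketch of what the persisted artifact could look like. The class name `ArticleSummary` and every field name are assumptions made for this example; the design does not fix a schema yet.

```python
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class ArticleSummary:
    """Hypothetical shape of the summary artifact linked to one article."""

    article_id: int
    tldr_bullets: List[str]
    summary_body: str
    source_citation: str
    # Image metadata is optional so ingestion is never blocked on it.
    image_url: Optional[str] = None
    image_credit: Optional[str] = None

    def to_modal_payload(self) -> dict:
        """Return a fixed-shape dict that a modal could render field by field."""
        return asdict(self)


summary = ArticleSummary(
    article_id=42,
    tldr_bullets=["Point one", "Point two", "Point three"],
    summary_body="A short paragraph summarizing the article.",
    source_citation="https://example.com/article",
)
payload = summary.to_modal_payload()
```

Because every field is named, the API response keeps the same keys whether or not an image was found, which is the deterministic shape the rationale above relies on.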

Decision: Use Perplexity for summary generation with strict output schema

Decision: Prompt Perplexity to return machine-parseable JSON fields that map directly to the template sections.

Rationale:

  • Reuses the existing Perplexity integration and the operational familiarity that comes with it.
  • Structured output reduces frontend parsing fragility.

Alternatives considered:

  • Free-form text generation then regex parsing: rejected as brittle.
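
The strict-schema idea can be sketched as a small validator on the ingestion side. This is only an illustration: the field names in `REQUIRED_FIELDS`, the `is_fallback` flag, and the function name are assumptions, and the actual Perplexity call is left out because it goes through the existing integration.

```python
import json

# Hypothetical required fields and their expected JSON types.
REQUIRED_FIELDS = {
    "tldr_bullets": list,
    "summary_body": str,
    "source_citation": str,
}


def parse_summary_response(raw: str, fallback_text: str) -> dict:
    """Validate provider output; use plain fallback text on any mismatch."""
    fallback = {
        "tldr_bullets": [],
        "summary_body": fallback_text,
        "source_citation": "",
        "is_fallback": True,
    }
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return fallback
    if not isinstance(data, dict):
        return fallback
    # Every required field must be present with the expected type.
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected_type):
            return fallback
    if not all(isinstance(bullet, str) for bullet in data["tldr_bullets"]):
        return fallback
    # Keep only the known fields so unexpected keys never reach storage.
    result = {name: data[name] for name in REQUIRED_FIELDS}
    result["is_fallback"] = False
    return result
```

Validating once at ingestion means the frontend never has to parse or repair provider output.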

Decision: Prefer MCP royalty-free image sourcing, falling back to a deterministic non-MCP source path

Decision: When MCP image retrieval integration is configured, use it first; otherwise use a configured royalty-free provider path, with a placeholder as the final fallback.

Rationale:

  • Satisfies preference for MCP leverage while preserving reliability.
  • Maintains legal/licensing constraints and avoids blocked ingestion.

Alternatives considered:

  • Hard dependency on MCP only: rejected due to availability/runtime coupling risk.
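
One way to picture the ordering is a resolver that walks the sources in a fixed sequence. In this sketch the fetchers are injected callables, so no real MCP or provider API is assumed; `resolve_image`, the `PLACEHOLDER` values, and the result keys are hypothetical. It also tries the provider when a configured MCP fetch fails, which is a slightly broader reading of the decision.

```python
from typing import Callable, Optional

# A fetcher takes a search query and returns {"url": ..., "credit": ...} or None.
ImageFetcher = Callable[[str], Optional[dict]]

# Hypothetical placeholder used when no source yields an image.
PLACEHOLDER = {"url": "/static/placeholder.png", "credit": "", "source": "placeholder"}


def resolve_image(
    query: str,
    mcp_fetch: Optional[ImageFetcher],
    provider_fetch: Optional[ImageFetcher],
) -> dict:
    """Try MCP first, then the configured provider, then the placeholder."""
    for source, fetch in (("mcp", mcp_fetch), ("provider", provider_fetch)):
        if fetch is None:
            continue  # this source is not configured
        try:
            result = fetch(query)
        except Exception:
            result = None  # a failing source must not block ingestion
        if result and result.get("url"):
            return {
                "url": result["url"],
                "credit": result.get("credit", ""),
                "source": source,
            }
    return dict(PLACEHOLDER)
```

The order is fixed and every failure path ends at the placeholder, so the same inputs always produce the same outcome.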

Decision: Add modal-specific analytics event contract

Decision: Define and emit explicit Umami events for summary modal open, close, and source link-out clicks.

Rationale:

  • Makes summary engagement measurable independently of feed interactions.
  • Prevents implicit/ambiguous event interpretation.

Alternatives considered:

  • Reusing existing generic card click events only: rejected due to insufficient modal-level observability.
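
The event contract can be sketched independently of the frontend framework. The version below is a language-neutral illustration in Python: the event names, the payload keys, and the `SummaryModalTracker` class are assumptions, and the injected `emit` callable stands in for whatever Umami hook the frontend already uses.

```python
from typing import Callable, Optional

# Hypothetical event names; the real contract is still to be defined.
EVENT_OPEN = "summary-modal-open"
EVENT_CLOSE = "summary-modal-close"
EVENT_LINK_OUT = "summary-modal-link-out"


class SummaryModalTracker:
    """Emit at most one open and one close event per modal session."""

    def __init__(self, emit: Callable[[str, dict], None]) -> None:
        self._emit = emit
        self._open_article_id: Optional[int] = None

    def open(self, article_id: int) -> None:
        if self._open_article_id == article_id:
            return  # already open for this article: do not count again
        if self._open_article_id is not None:
            self.close()  # switching articles closes the previous session
        self._open_article_id = article_id
        self._emit(EVENT_OPEN, {"article_id": article_id})

    def close(self) -> None:
        if self._open_article_id is None:
            return  # nothing is open: ignore stray close triggers
        self._emit(EVENT_CLOSE, {"article_id": self._open_article_id})
        self._open_article_id = None

    def link_out(self, url: str) -> None:
        if self._open_article_id is None:
            return  # link-outs only count while the modal is open
        self._emit(EVENT_LINK_OUT, {"article_id": self._open_article_id, "url": url})
```

Routing every trigger through one stateful object is a simple way to keep repeated toggles from inflating the open and close counts.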

Risks / Trade-offs

  • [Risk] Summary generation adds ingest latency -> Mitigation: bounded retries and skip/fallback behavior.
  • [Risk] Provider output schema drift breaks parser -> Mitigation: strict validation + fallback summary text behavior.
  • [Risk] Royalty-free image selection may be semantically weak -> Mitigation: relevance prompt constraints and placeholder fallback.
  • [Trade-off] Additional stored fields increase row size -> Mitigation: concise field limits and optional archival policy alignment.
  • [Risk] Event overcount from repeated modal toggles -> Mitigation: standardize open/close trigger boundaries and dedupe rules in frontend logic.
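
The "bounded retries and skip/fallback" mitigation for ingest latency could look roughly like the sketch below. The function name, the default of three attempts, and the exponential backoff delays are illustrative assumptions rather than agreed values.

```python
import time
from typing import Callable, Optional


def generate_with_retries(
    generate: Callable[[], dict],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call `generate` up to `max_attempts` times; return None to signal skip."""
    for attempt in range(1, max_attempts + 1):
        try:
            return generate()
        except Exception:
            if attempt == max_attempts:
                break  # retry budget exhausted: let the caller fall back
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    return None
```

Returning `None` instead of raising lets the ingestion loop store the article without a summary and move on, so a slow or failing provider adds only a bounded delay per article.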

Migration Plan

  1. Add summary/image metadata fields, or a related model, for persisted summary artifacts.
  2. Extend ingestion flow to generate structured summary + citation via Perplexity.
  3. Integrate royalty-free image retrieval with MCP-preferred flow and fallback.
  4. Extend API payloads to return summary-modal-ready data.
  5. Implement frontend modal rendering with exact template and analytics tags.
  6. Validate event tagging correctness and rendering fallback behavior.

Rollback:

  • Disable the modal entrypoint and restore existing feed behavior while retaining stored summary data.
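
A rollback of this kind is often implemented as a flag checked at response-build time. The sketch below assumes a hypothetical `SUMMARY_MODAL_ENABLED` environment variable and a hypothetical `build_feed_item` helper; neither is part of the current codebase as described here.

```python
import os
from typing import Mapping, Optional


def summary_modal_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Read the hypothetical SUMMARY_MODAL_ENABLED flag; default is enabled."""
    env = os.environ if env is None else env
    value = env.get("SUMMARY_MODAL_ENABLED", "true")
    return value.strip().lower() in {"1", "true", "yes", "on"}


def build_feed_item(
    article: dict,
    summary: Optional[dict],
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Attach summary data only when the flag is on; stored data is untouched."""
    item = dict(article)
    if summary is not None and summary_modal_enabled(env):
        item["summary"] = summary
    else:
        item["summary"] = None  # frontend hides the modal entrypoint
    return item
```

Turning the flag off changes only what the API returns, so the stored summaries remain available if the modal is re-enabled later.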

Open Questions

  • Should the TL;DR bullet count be fixed (for example, 3) or provider-adaptive within a bounded range?
  • Should the summary modal open on card click only, or should each card have an explicit "Read Summary" CTA?
  • Which royalty-free provider should be the default when MCP is unavailable?