Purpose

Canonical specification for image-query-refinement requirements synced from OpenSpec change deltas.

Requirements

Requirement: Keyword extraction from headline

The system SHALL extract relevant keywords from article headlines for image search.

Scenario: Extract keywords from standard headline

  • WHEN headline is "OpenAI Announces GPT-5 with Revolutionary Reasoning Capabilities"
  • THEN extracted query is "OpenAI GPT-5 Revolutionary Reasoning"
  • AND low-information words such as "Announces", "with", and "Capabilities" are removed

Scenario: Handle short headline

  • WHEN headline is "AI Breakthrough"
  • THEN extracted query is "AI Breakthrough"
  • AND no keywords are removed (headline too short)

Scenario: Handle headline with special characters

  • WHEN headline is "Tesla's Self-Driving AI: 99.9% Accuracy Achieved!"
  • THEN extracted query is "Tesla Self-Driving AI Accuracy"
  • AND the possessive "'s", the colon, and other punctuation are stripped; the numeric token "99.9%" and the word "Achieved" do not appear in the query
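
The three scenarios above can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the names extract_query, LOW_SIGNAL_WORDS, and SHORT_HEADLINE_WORDS are hypothetical, the word list contains only the words the scenarios show being dropped, and the two-word cutoff for a "short" headline is an assumption, since the spec gives no threshold.

```python
import re

# Hypothetical word list: only the words the scenarios show being dropped.
LOW_SIGNAL_WORDS = {"announces", "with", "capabilities", "achieved"}

# Assumed cutoff: headlines of this many words or fewer pass through unfiltered.
SHORT_HEADLINE_WORDS = 2


def extract_query(headline: str) -> str:
    """Reduce a headline to a space-separated keyword query."""
    words = headline.split()
    if len(words) <= SHORT_HEADLINE_WORDS:
        return " ".join(words)

    keywords = []
    for word in words:
        # Strip punctuation from both ends, then a trailing possessive.
        token = word.strip(".,:;!?%\"'()[]")
        token = re.sub(r"['’]s$", "", token)
        # Drop tokens with no letters, such as "99.9".
        if not re.search(r"[A-Za-z]", token):
            continue
        if token.lower() in LOW_SIGNAL_WORDS:
            continue
        keywords.append(token)
    return " ".join(keywords)


print(extract_query("Tesla's Self-Driving AI: 99.9% Accuracy Achieved!"))
# Tesla Self-Driving AI Accuracy
```

Hyphens inside a token are kept, so "Self-Driving" and "GPT-5" survive as single keywords.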

Requirement: Stop word removal

The system SHALL remove common English stop words from search queries.

Scenario: Remove articles and prepositions

  • WHEN headline is "The Future of AI in the Healthcare Industry"
  • THEN extracted query is "Future AI Healthcare Industry"
  • AND "The", "of", "in", "the" are removed

Scenario: Preserve technical terms

  • WHEN headline is "How Machine Learning Models Learn from Data"
  • THEN extracted query is "Machine Learning Models Learn Data"
  • AND technical terms "Machine", "Learning", "Models" are preserved
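
Stop word removal along these lines could look like the sketch below. The STOP_WORDS set is a small hypothetical sample chosen to cover the two scenarios; a real deployment would presumably use a fuller list. Matching is case-insensitive, and the original casing of the kept words is preserved.

```python
# Hypothetical sample; the spec does not enumerate the stop word list.
STOP_WORDS = frozenset(
    {"a", "an", "and", "but", "from", "how", "in", "of", "or", "the"}
)


def remove_stop_words(text: str) -> str:
    """Drop stop words, keeping the remaining words in their original order."""
    kept = [word for word in text.split() if word.lower() not in STOP_WORDS]
    return " ".join(kept)


print(remove_stop_words("The Future of AI in the Healthcare Industry"))
# Future AI Healthcare Industry
print(remove_stop_words("How Machine Learning Models Learn from Data"))
# Machine Learning Models Learn Data
```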

Requirement: Query length limit

The system SHALL limit search query length to optimize API results.

Scenario: Truncate long query

  • WHEN extracted keywords exceed 10 words
  • THEN query is limited to the 5 most significant keywords
  • AND remaining keywords are dropped

Scenario: Preserve short query

  • WHEN extracted keywords are 5 words or fewer
  • THEN all keywords are included in query
  • AND no truncation occurs
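
A literal reading of the two scenarios can be sketched as below. Two assumptions are built in. First, the keyword list is taken to be already ordered by significance, because the spec does not define how significance is measured. Second, lists of 6 to 10 keywords are left untouched, because neither scenario covers that range. The names limit_query, TRUNCATE_ABOVE, and MAX_KEYWORDS are hypothetical.

```python
TRUNCATE_ABOVE = 10  # truncation applies only when there are more keywords than this
MAX_KEYWORDS = 5     # number of keywords kept after truncation


def limit_query(keywords: list[str]) -> list[str]:
    """Keep the first MAX_KEYWORDS keywords when the list exceeds TRUNCATE_ABOVE.

    Assumes `keywords` is already sorted from most to least significant.
    """
    if len(keywords) > TRUNCATE_ABOVE:
        return keywords[:MAX_KEYWORDS]
    return list(keywords)


long_list = [f"k{i}" for i in range(11)]
print(limit_query(long_list))
# ['k0', 'k1', 'k2', 'k3', 'k4']
```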

Requirement: URL-safe query encoding

The system SHALL URL-encode queries before sending to provider APIs.

Scenario: Encode spaces and special characters

  • WHEN query is "AI Machine Learning"
  • THEN encoded query is "AI+Machine+Learning" or "AI%20Machine%20Learning"
  • AND query is safe for HTTP GET parameters

Scenario: Handle Unicode characters

  • WHEN query contains non-ASCII characters, as in "AI für Deutschland"
  • THEN non-ASCII characters are percent-encoded as UTF-8 bytes (for example, "ü" becomes "%C3%BC")
  • AND API request succeeds without encoding errors
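
Both accepted encodings are available in Python's standard library, as the sketch below shows. The wrapper encode_query and its plus_for_space flag are hypothetical names; quote and quote_plus are the actual urllib.parse functions, and both encode non-ASCII characters as UTF-8 by default.

```python
from urllib.parse import quote, quote_plus


def encode_query(query: str, plus_for_space: bool = True) -> str:
    """Percent-encode a query for use as an HTTP GET parameter value."""
    if plus_for_space:
        # Form-style encoding: spaces become "+".
        return quote_plus(query)
    # Strict percent-encoding: spaces become "%20"; safe="" also encodes "/".
    return quote(query, safe="")


print(encode_query("AI Machine Learning"))
# AI+Machine+Learning
print(encode_query("AI Machine Learning", plus_for_space=False))
# AI%20Machine%20Learning
print(encode_query("AI für Deutschland"))
# AI+f%C3%BCr+Deutschland
```

Which form a given provider expects is not stated in the spec, so the flag is left to the caller.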

Requirement: Empty query handling

The system SHALL handle edge cases where no keywords can be extracted.

Scenario: Headline with only stop words

  • WHEN headline is "The and a or but"
  • THEN system uses fallback query "news technology"
  • AND image search proceeds with generic query

Scenario: Empty headline

  • WHEN headline is an empty string or contains only whitespace
  • THEN system uses fallback query "news technology"
  • AND image search proceeds with generic query
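
The fallback behaviour can be sketched as a guard around the filtering step. The function name query_or_fallback and the small STOP_WORDS sample are hypothetical; only the fallback string "news technology" comes from the scenarios.

```python
FALLBACK_QUERY = "news technology"

# Hypothetical sample covering the words used in the scenario.
STOP_WORDS = frozenset({"a", "an", "and", "but", "of", "or", "the"})


def query_or_fallback(headline: str) -> str:
    """Return the filtered query, or the fallback when nothing survives."""
    # str.split() with no arguments yields [] for empty or whitespace-only input.
    keywords = [w for w in headline.split() if w.lower() not in STOP_WORDS]
    return " ".join(keywords) if keywords else FALLBACK_QUERY


print(query_or_fallback("The and a or but"))
# news technology
print(query_or_fallback("   "))
# news technology
```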