## Purpose
Canonical specification for image-query-refinement requirements synced from OpenSpec change deltas.
## Requirements
### Requirement: Keyword extraction from headline
The system SHALL extract relevant keywords from article headlines for image search.
#### Scenario: Extract keywords from standard headline
- **WHEN** headline is "OpenAI Announces GPT-5 with Revolutionary Reasoning Capabilities"
- **THEN** extracted query is "OpenAI GPT-5 Revolutionary Reasoning"
- **AND** low-signal words such as "Announces", "with", and "Capabilities" are removed
#### Scenario: Handle short headline
- **WHEN** headline is "AI Breakthrough"
- **THEN** extracted query is "AI Breakthrough"
- **AND** no keywords are removed (headline too short)
#### Scenario: Handle headline with special characters
- **WHEN** headline is "Tesla's Self-Driving AI: 99.9% Accuracy Achieved!"
- **THEN** extracted query is "Tesla Self-Driving AI Accuracy"
- **AND** special characters such as apostrophes, colons, and other punctuation are normalized, and the numeric token "99.9%" and the low-signal word "Achieved" are dropped
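
The three scenarios above can be reproduced with a small tokenizer. This is a minimal sketch, not the specified implementation: the `LOW_SIGNAL` word list, the `MIN_TOKENS_TO_FILTER` threshold, and the rule that purely numeric tokens are dropped are all assumptions chosen to match the example headlines.

```python
import re

# Hypothetical low-signal word list; the spec does not define the real one.
LOW_SIGNAL = {"announces", "with", "capabilities", "achieved"}

# Assumed threshold: headlines with fewer tokens than this are left unfiltered.
MIN_TOKENS_TO_FILTER = 3

# A token is a run of letters/digits, optionally joined by internal hyphens,
# so "Self-Driving" and "GPT-5" each stay in one piece.
TOKEN_RE = re.compile(r"[^\W_]+(?:-[^\W_]+)*")


def extract_keywords(headline: str) -> str:
    """Return a space-separated image-search query built from a headline."""
    # Drop possessive suffixes ("Tesla's" -> "Tesla") before tokenizing.
    text = re.sub(r"['’]s\b", "", headline)
    tokens = TOKEN_RE.findall(text)
    if len(tokens) < MIN_TOKENS_TO_FILTER:
        return " ".join(tokens)
    kept = [
        t for t in tokens
        if t.lower() not in LOW_SIGNAL and not t.isdigit()
    ]
    return " ".join(kept)


print(extract_keywords("Tesla's Self-Driving AI: 99.9% Accuracy Achieved!"))
```

Punctuation never reaches the token list because the regular expression only matches letters, digits, and internal hyphens, which is one way to read "normalized" in the scenario.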
### Requirement: Stop word removal
The system SHALL remove common English stop words from search queries.
#### Scenario: Remove articles and prepositions
- **WHEN** headline is "The Future of AI in the Healthcare Industry"
- **THEN** extracted query is "Future AI Healthcare Industry"
- **AND** "The", "of", "in", "the" are removed
#### Scenario: Preserve technical terms
- **WHEN** headline is "How Machine Learning Models Learn from Data"
- **THEN** extracted query is "Machine Learning Models Learn Data"
- **AND** "How" and "from" are removed while the technical terms "Machine", "Learning", and "Models" are preserved
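
As a rough illustration of this requirement, stop word removal can be a case-insensitive set lookup. The `STOP_WORDS` set below is a hypothetical, deliberately small list; the spec does not say which stop word list the system uses.

```python
# Hypothetical stop word list; a real system would likely use a larger one.
STOP_WORDS = frozenset({
    "a", "an", "the", "of", "in", "on", "for", "from",
    "to", "how", "and", "or", "but", "with",
})


def remove_stop_words(words: list[str]) -> list[str]:
    """Drop stop words case-insensitively, keeping the original word order."""
    return [w for w in words if w.lower() not in STOP_WORDS]


headline = "How Machine Learning Models Learn from Data"
print(" ".join(remove_stop_words(headline.split())))
```

Because only exact matches against the list are removed, domain terms such as "Machine" or "Learning" pass through untouched unless someone adds them to the list.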
### Requirement: Query length limit
The system SHALL limit search query length to optimize API results.
#### Scenario: Truncate long query
- **WHEN** extracted keywords exceed 5 words
- **THEN** query is limited to the 5 most significant keywords
- **AND** remaining keywords are dropped
#### Scenario: Preserve short query
- **WHEN** extracted keywords are 5 words or fewer
- **THEN** all keywords are included in query
- **AND** no truncation occurs
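
The spec does not define "most significant", so the sketch below makes two assumptions: by default, earlier keywords count as more significant, and a caller may pass a hypothetical `score` function to rank keywords instead. The names `limit_keywords`, `max_keywords`, and `score` are illustrative only.

```python
from typing import Callable, Optional


def limit_keywords(
    keywords: list[str],
    max_keywords: int = 5,
    score: Optional[Callable[[str], float]] = None,
) -> list[str]:
    """Keep at most max_keywords keywords, preserving headline order."""
    if len(keywords) <= max_keywords:
        return list(keywords)  # short query: nothing is truncated
    if score is None:
        # Assumed default: position in the headline stands in for significance.
        return list(keywords[:max_keywords])
    # Rank positions by score (highest first; ties keep headline order),
    # keep the top positions, then restore the original word order.
    ranked = sorted(range(len(keywords)), key=lambda i: score(keywords[i]), reverse=True)
    keep = sorted(ranked[:max_keywords])
    return [keywords[i] for i in keep]


words = ["OpenAI", "GPT-5", "Revolutionary", "Reasoning", "Model", "Launch", "Today"]
print(limit_keywords(words))
```

Restoring headline order after ranking keeps multi-word phrases readable in the final query, which tends to matter for image search relevance.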
### Requirement: URL-safe query encoding
The system SHALL URL-encode queries before sending them to provider APIs.
#### Scenario: Encode spaces and special characters
- **WHEN** query is "AI Machine Learning"
- **THEN** encoded query is "AI+Machine+Learning" or "AI%20Machine%20Learning"
- **AND** query is safe for HTTP GET parameters
#### Scenario: Handle Unicode characters
- **WHEN** query contains non-ASCII characters, such as "AI für Deutschland"
- **THEN** non-ASCII characters are percent-encoded as UTF-8 bytes (for example, "ü" becomes "%C3%BC")
- **AND** API request succeeds without encoding errors
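
Both encodings named in the first scenario are available in Python's standard library: `urllib.parse.quote_plus` produces the `+` form and `urllib.parse.quote` produces the `%20` form. The sketch below is illustrative; the endpoint URL and the `q` parameter name are placeholders, not part of any real provider API.

```python
from urllib.parse import quote, quote_plus, urlencode

# Placeholder endpoint and parameter name, used only for illustration.
SEARCH_ENDPOINT = "https://images.example.com/search"


def build_search_url(query: str) -> str:
    """Build a GET URL with the query encoded as a form-style parameter."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query})}"


print(quote_plus("AI Machine Learning"))  # AI+Machine+Learning
print(quote("AI Machine Learning"))       # AI%20Machine%20Learning
print(quote("AI für Deutschland"))        # AI%20f%C3%BCr%20Deutschland
print(build_search_url("AI für Deutschland"))
```

`urlencode` applies `quote_plus` to each value by default, so non-ASCII characters are encoded as UTF-8 bytes without extra handling.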
### Requirement: Empty query handling
The system SHALL handle edge cases where no keywords can be extracted.
#### Scenario: Headline with only stop words
- **WHEN** headline is "The and a or but"
- **THEN** system uses fallback query "news technology"
- **AND** image search proceeds with generic query
#### Scenario: Empty headline
- **WHEN** headline is an empty string or contains only whitespace
- **THEN** system uses fallback query "news technology"
- **AND** image search proceeds with generic query
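
Both fallback scenarios reduce to the same check: if nothing survives filtering, return the generic query. A minimal sketch, assuming whitespace tokenization and a hypothetical `STOP_WORDS` set (only the fallback string "news technology" comes from the spec):

```python
FALLBACK_QUERY = "news technology"

# Hypothetical stop word list, kept small for the example.
STOP_WORDS = frozenset({"a", "an", "the", "and", "or", "but", "of", "in"})


def build_query(headline: str) -> str:
    """Return the filtered query, or the fallback when no keywords remain."""
    # str.split() with no arguments also handles empty and whitespace-only input.
    keywords = [w for w in headline.split() if w.lower() not in STOP_WORDS]
    return " ".join(keywords) if keywords else FALLBACK_QUERY


print(build_query("The and a or but"))
print(build_query("   "))
```

Checking the filtered keyword list rather than the raw headline means the stop-words-only case and the empty case share one code path.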