ADDED Requirements
Requirement: Keyword extraction from headline
The system SHALL extract relevant keywords from article headlines for image search.
Scenario: Extract keywords from standard headline
- WHEN headline is "OpenAI Announces GPT-5 with Revolutionary Reasoning Capabilities"
- THEN extracted query is "OpenAI GPT-5 Revolutionary Reasoning"
- AND stop words and low-signal words like "Announces", "with", "Capabilities" are removed
Scenario: Handle short headline
- WHEN headline is "AI Breakthrough"
- THEN extracted query is "AI Breakthrough"
- AND no keywords are removed (headline too short)
Scenario: Handle headline with special characters
- WHEN headline is "Tesla's Self-Driving AI: 99.9% Accuracy Achieved!"
- THEN extracted query is "Tesla Self-Driving AI Accuracy"
- AND special characters such as apostrophes, colons, and other punctuation are stripped, and the tokens "99.9%" and "Achieved" are dropped
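The three scenarios above can be sketched in Python as follows. This is an illustration under stated assumptions, not the system's actual extractor: the function name `extract_query`, the token pattern, and the `LOW_SIGNAL_WORDS` list are all hypothetical. The expected queries drop words such as "Announces" and "Achieved" that are not conventional stop words, so the list here was chosen only to reproduce those outputs. The sketch also has no explicit short-headline bypass; "AI Breakthrough" passes through simply because neither word is on the list.

```python
import re

# Hypothetical low-signal list, chosen only to reproduce the scenarios above.
LOW_SIGNAL_WORDS = frozenset({"announces", "with", "capabilities", "achieved"})

# A token is a run of letters/digits, optionally joined by internal hyphens,
# so "GPT-5" and "Self-Driving" stay intact.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def extract_query(headline: str) -> str:
    """Return a space-separated search query built from a headline."""
    # Strip possessive suffixes so "Tesla's" becomes "Tesla".
    text = re.sub(r"['\u2019]s\b", "", headline)
    tokens = TOKEN_RE.findall(text)
    # Discard digit-only tokens, such as the fragments left over from "99.9%".
    tokens = [t for t in tokens if not t.isdigit()]
    # Compare in lower case but keep the original casing in the output.
    kept = [t for t in tokens if t.lower() not in LOW_SIGNAL_WORDS]
    return " ".join(kept)


print(extract_query("Tesla's Self-Driving AI: 99.9% Accuracy Achieved!"))
```

Matching on lower-cased tokens while emitting the original casing keeps acronyms such as "AI" readable in the final query.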
Requirement: Stop word removal
The system SHALL remove common English stop words from search queries.
Scenario: Remove articles and prepositions
- WHEN headline is "The Future of AI in the Healthcare Industry"
- THEN extracted query is "Future AI Healthcare Industry"
- AND "The", "of", "in", "the" are removed
Scenario: Preserve technical terms
- WHEN headline is "How Machine Learning Models Learn from Data"
- THEN extracted query is "Machine Learning Models Learn Data"
- AND technical terms "Machine", "Learning", "Models" are preserved
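A minimal sketch of case-insensitive stop word filtering that satisfies both scenarios above. The `STOP_WORDS` set is a small hypothetical sample, not the list the system uses, and `remove_stop_words` is an illustrative name.

```python
# Hypothetical sample of English stop words; a real list would be longer.
STOP_WORDS = frozenset({
    "a", "an", "and", "but", "from", "how", "in", "of", "or", "the", "to",
})


def remove_stop_words(headline: str) -> str:
    """Drop stop words regardless of case, preserving the order of the rest."""
    kept = [word for word in headline.split() if word.lower() not in STOP_WORDS]
    return " ".join(kept)


print(remove_stop_words("The Future of AI in the Healthcare Industry"))
print(remove_stop_words("How Machine Learning Models Learn from Data"))
```

Because only exact stop word matches are removed, terms such as "Machine", "Learning", and "Models" pass through unchanged.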
Requirement: Query length limit
The system SHALL limit search query length to optimize API results.
Scenario: Truncate long query
- WHEN extracted keywords exceed 10 words
- THEN query is limited to the 5 most significant keywords
- AND remaining keywords are dropped
Scenario: Preserve short query
- WHEN extracted keywords are 5 words or fewer
- THEN all keywords are included in query
- AND no truncation occurs
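The length rule might be sketched as below. Two points are assumptions, because the scenarios do not settle them: keywords are taken to arrive already ordered by significance, so the first five are kept, and queries of 6 to 10 keywords are passed through unchanged, following a literal reading of the "exceed 10 words" trigger. The names `limit_query`, `MAX_KEYWORDS`, and `TRUNCATE_ABOVE` are hypothetical.

```python
MAX_KEYWORDS = 5      # cap taken from the "Truncate long query" scenario
TRUNCATE_ABOVE = 10   # trigger taken from the same scenario


def limit_query(keywords: list[str]) -> list[str]:
    """Keep the leading keywords when the list is longer than the trigger."""
    if len(keywords) > TRUNCATE_ABOVE:
        # Assumption: the input is ordered from most to least significant.
        return keywords[:MAX_KEYWORDS]
    # Lists at or below the trigger are returned as an unmodified copy.
    return list(keywords)


long_list = [f"kw{i}" for i in range(12)]
print(limit_query(long_list))
```

If significance is instead computed by a scoring function, the slice would be replaced by a sort on that score followed by the same cap.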
Requirement: URL-safe query encoding
The system SHALL URL-encode queries before sending them to provider APIs.
Scenario: Encode spaces and special characters
- WHEN query is "AI Machine Learning"
- THEN encoded query is "AI+Machine+Learning" or "AI%20Machine%20Learning"
- AND query is safe for HTTP GET parameters
Scenario: Handle Unicode characters
- WHEN query contains non-ASCII characters, as in "AI für Deutschland"
- THEN non-ASCII characters are percent-encoded as UTF-8 bytes (for example, "ü" becomes "%C3%BC")
- AND API request succeeds without encoding errors
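Both accepted encodings can be produced with Python's standard `urllib.parse` module. The wrapper `encode_query` and its `plus_for_spaces` flag are hypothetical; which form a given provider expects is not stated here and would need to be checked against that provider's documentation.

```python
from urllib.parse import quote, quote_plus, urlencode


def encode_query(query: str, plus_for_spaces: bool = True) -> str:
    """Encode a query for use as an HTTP GET parameter value."""
    if plus_for_spaces:
        # Form-style encoding: spaces become "+".
        return quote_plus(query)
    # Percent-encoding only: spaces become "%20"; safe="" also encodes "/".
    return quote(query, safe="")


print(encode_query("AI Machine Learning"))
print(encode_query("AI Machine Learning", plus_for_spaces=False))
# Non-ASCII characters are encoded as UTF-8 bytes by default.
print(encode_query("AI für Deutschland"))
# urlencode builds a full query string from a mapping of parameters.
print(urlencode({"query": "AI Machine Learning"}))
```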
Requirement: Empty query handling
The system SHALL handle edge cases where no keywords can be extracted.
Scenario: Headline with only stop words
- WHEN headline is "The and a or but"
- THEN system uses fallback query "news technology"
- AND image search proceeds with generic query
Scenario: Empty headline
- WHEN headline is an empty string or whitespace only
- THEN system uses fallback query "news technology"
- AND image search proceeds with generic query
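The fallback behaviour can be sketched as a final step applied to whatever keywords survive extraction. The fallback string comes from the scenarios above; the function name `finalize_query` and the idea of running it as the last stage are assumptions made for illustration.

```python
FALLBACK_QUERY = "news technology"  # generic query named in the scenarios


def finalize_query(keywords: list[str]) -> str:
    """Join keywords into a query, or return the fallback if none remain."""
    # Ignore empty or whitespace-only entries before deciding.
    cleaned = [k.strip() for k in keywords if k and k.strip()]
    return " ".join(cleaned) if cleaned else FALLBACK_QUERY


# An empty keyword list covers both cases: a headline made only of stop
# words, and an empty or whitespace-only headline.
print(finalize_query([]))
print(finalize_query(["AI", "Breakthrough"]))
```

Placing the check after extraction means a single code path covers both the stop-words-only headline and the empty headline.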