bulk commit changes!

2026-02-13 02:32:06 -05:00
parent c8f98c54c9
commit bf4a40f533
152 changed files with 2210 additions and 19 deletions


@@ -0,0 +1,2 @@
schema: spec-driven
created: 2026-02-12


@@ -0,0 +1,94 @@
## Context
ClawFort currently stores and serves article content in a single-language flow. The news creation path fetches English content via Perplexity and persists one record per article, while frontend hero/feed rendering consumes that single-language payload.
This change introduces multilingual support for Tamil and Malayalam with language-aware rendering and persistent user preference.
Constraints:
- Keep existing English behavior as default and fallback.
- Reuse current Perplexity integration for translation generation.
- Keep API and frontend changes minimal and backward-compatible where possible.
- Persist user language preference client-side so returning users keep their choice.
## Goals / Non-Goals
**Goals:**
- Generate Tamil and Malayalam translations at article creation time.
- Persist translation variants linked to the base article.
- Serve language-specific content in hero/feed API responses.
- Add landing-page language selector and persist preference across sessions.
**Non-Goals:**
- Supporting arbitrary language expansion in this phase.
- Introducing user accounts/server-side profile preferences.
- Building editorial translation workflows or manual override UI.
- Replacing Perplexity as translation provider.
## Decisions
### Decision: Model translations as child records linked to a base article
**Decision:** Keep one source article and store translation rows keyed by article ID + language code.
**Rationale:**
- Avoids duplicating non-language metadata (source URL, image attribution, timestamps).
- Supports language lookup with deterministic fallback to English.
- Eases future language additions without schema redesign.
**Alternatives considered:**
- Inline columns on article table (`headline_ta`, `headline_ml`): rejected as rigid and harder to extend.
- Fully duplicated article rows per language: rejected due to dedup and feed-order complexity.
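
A minimal sketch of the child-record layout, assuming SQLite as the store. The table and column names (`articles`, `article_translations`, `headline`, `summary`, `source_url`) are illustrative assumptions and are not taken from the ClawFort codebase.

```python
import sqlite3

# Hypothetical schema: one base article row, plus one translation row per
# (article_id, language) pair. Non-language metadata stays on the base row.
SCHEMA = """
CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    headline TEXT NOT NULL,
    summary TEXT NOT NULL,
    source_url TEXT
);
CREATE TABLE article_translations (
    id INTEGER PRIMARY KEY,
    article_id INTEGER NOT NULL REFERENCES articles(id),
    language TEXT NOT NULL,
    headline TEXT NOT NULL,
    summary TEXT NOT NULL,
    UNIQUE (article_id, language)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

conn.execute(
    "INSERT INTO articles (id, headline, summary, source_url) VALUES (?, ?, ?, ?)",
    (1, "English headline", "English summary", "https://example.com/a"),
)
conn.execute(
    "INSERT INTO article_translations (article_id, language, headline, summary) "
    "VALUES (?, ?, ?, ?)",
    (1, "ta", "<Tamil headline>", "<Tamil summary>"),
)

# Translation rows remain queryable by article ID and language code.
rows = conn.execute(
    "SELECT language, headline FROM article_translations WHERE article_id = ?",
    (1,),
).fetchall()
```

Adding a further language in this layout would mean inserting rows with a new language code rather than altering the table.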
### Decision: Translate immediately after article creation in ingestion pipeline
**Decision:** For each newly accepted article, request Tamil and Malayalam translations and persist before ingestion cycle completes.
**Rationale:**
- Keeps article and translations synchronized.
- Avoids delayed jobs and partial language availability in normal flow.
- Fits existing per-article processing loop.
**Alternatives considered:**
- Asynchronous background translation queue: rejected for higher complexity in this phase.
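
The synchronous per-article step could look roughly like the sketch below. The `Article` class, the `ingest_article` function, and the `(text, language_code) -> str` translator signature are all hypothetical; the real Perplexity call is replaced here by an injected function so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

TARGET_LANGUAGES = ("ta", "ml")

# Assumed translator signature: (text, language_code) -> translated text.
Translator = Callable[[str, str], str]


@dataclass
class Article:
    headline: str
    summary: str
    # language code -> {"headline": ..., "summary": ...}
    translations: Dict[str, Dict[str, str]] = field(default_factory=dict)


def ingest_article(article: Article, translate: Translator) -> Article:
    """Translate one article into each target language within the same cycle."""
    for lang in TARGET_LANGUAGES:
        try:
            article.translations[lang] = {
                "headline": translate(article.headline, lang),
                "summary": translate(article.summary, lang),
            }
        except Exception:
            # A failed language is left out; the English base article is
            # still kept and the ingestion cycle continues.
            continue
    return article


def fake_translate(text: str, lang: str) -> str:
    """Stand-in for the provider call; fails for Malayalam to show fallback."""
    if lang == "ml":
        raise RuntimeError("provider unavailable")
    return f"[{lang}] {text}"


result = ingest_article(Article("Headline", "Summary"), fake_translate)
```

In this run only the Tamil variant is stored, and the article itself is still returned with its English fields intact.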
### Decision: Add optional language input to read APIs with English fallback
**Decision:** Add language selection input (query param) on existing read endpoints; if the translation is missing, return the English source text.
**Rationale:**
- Preserves endpoint footprint and frontend integration simplicity.
- Guarantees response completeness even when translation fails.
- Supports progressive rollout without breaking existing consumers.
**Alternatives considered:**
- New language-specific endpoints: rejected as unnecessary API surface growth.
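
One way to picture the fallback is a small function that picks the translated fields when they exist and the English source otherwise. The function name `localize`, the dictionary shapes, and the extra `language` field in the payload are assumptions made for illustration only.

```python
from typing import Dict, Mapping


def localize(
    article: Mapping[str, str],
    translations: Mapping[str, Mapping[str, str]],
    lang: str = "en",
) -> Dict[str, str]:
    """Return a payload with the same shape regardless of language."""
    variant = translations.get(lang)
    source = variant if variant else article
    return {
        "headline": source["headline"],
        "summary": source["summary"],
        # Non-language metadata always comes from the base article.
        "source_url": article["source_url"],
        # Hypothetical field: reports which language was actually served.
        "language": lang if variant else "en",
    }


base = {"headline": "Headline", "summary": "Summary", "source_url": "https://example.com/a"}
variants = {"ta": {"headline": "[ta] Headline", "summary": "[ta] Summary"}}

tamil = localize(base, variants, "ta")
malayalam = localize(base, variants, "ml")  # no Malayalam row, so English is served
default = localize(base, variants)
```

Because the language argument defaults to English, callers that never pass it see the same fields they received before.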
### Decision: Persist frontend language preference in localStorage with cookie fallback
**Decision:** Primary persistence in `localStorage`; optional cookie fallback for constrained browsers.
**Rationale:**
- Simple client-only persistence without backend session dependencies.
- Matches one-page app architecture and current no-auth model.
**Alternatives considered:**
- Cookie-only preference: rejected as less ergonomic for JS state hydration.
## Risks / Trade-offs
- **[Risk] Translation generation increases API cost/latency per ingestion cycle** -> Mitigation: bounded retries, fall back to English when a translation is unavailable.
- **[Risk] Partial translation failures create mixed-language feed** -> Mitigation: deterministic fallback to English for missing translation rows.
- **[Trade-off] Translation-at-ingest adds synchronous processing time** -> Mitigation: keep language set fixed to two targets in this phase.
- **[Risk] Language preference desynchronization between tabs/devices** -> Mitigation: accept per-browser persistence scope in current architecture.
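
The "bounded retries" mitigation can be sketched as a small wrapper around the translation call. The helper name `translate_with_retries`, the default of two attempts, and the translator signature are assumptions; the document does not fix a retry count.

```python
from typing import Callable, Optional


def translate_with_retries(
    translate: Callable[[str, str], str],
    text: str,
    lang: str,
    max_attempts: int = 2,  # assumed bound, not specified in the design
) -> Optional[str]:
    """Try the call at most max_attempts times; None means fall back to English."""
    for _ in range(max_attempts):
        try:
            return translate(text, lang)
        except Exception:
            continue
    return None


calls = {"count": 0}


def flaky_translate(text: str, lang: str) -> str:
    """Stand-in provider that fails on its first call only."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("temporary failure")
    return f"[{lang}] {text}"


def always_failing(text: str, lang: str) -> str:
    raise RuntimeError("provider unavailable")


recovered = translate_with_retries(flaky_translate, "Headline", "ta")
gave_up = translate_with_retries(always_failing, "Headline", "ml")
```

Keeping the bound small limits the extra latency each article can add to a synchronous ingestion cycle.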
## Migration Plan
1. Add translation persistence model and migration path.
2. Extend ingestion pipeline to request/store Tamil and Malayalam translations.
3. Add language-aware API response behavior with fallback.
4. Implement frontend language selector + preference persistence.
5. Validate language switching, fallback, and returning-user preference behavior.
Rollback:
- Disable language selection in frontend and return English-only payload while retaining translation data safely.
## Open Questions
- Should translation failures be retried independently per language within the same cycle, or skipped after one failed language call?
- Should unsupported language requests return 400 or silently fall back to English in v1?


@@ -0,0 +1,37 @@
## Why
ClawFort currently publishes content in a single language, which limits accessibility for regional audiences. Adding multilingual delivery now improves usability for Tamil and Malayalam readers while keeping the current English workflow intact.
## What Changes
- **New Capabilities:**
  - Persist fetched articles locally in the database.
  - Generate Tamil and Malayalam translations for each newly created article using Perplexity.
  - Store translated variants as language-specific content items linked to the same base article.
  - Add a language selector on the landing page to switch article rendering language.
  - Persist user language preference in browser storage (local storage or cookie) and restore it for returning users.
- **Frontend:**
  - Add visible language switcher UI on the one-page experience.
  - Render hero and feed content in selected language when translation exists.
- **Backend:**
  - Extend content generation flow to request and save multilingual outputs.
  - Serve language-specific content for existing API reads.
## Capabilities
### New Capabilities
- `article-translations-ml-tm`: Create and store Tamil and Malayalam translated content variants for each article at creation time.
- `language-aware-content-delivery`: Return and render language-specific article fields based on selected language.
- `language-preference-persistence`: Persist and restore user-selected language across sessions for returning users.
### Modified Capabilities
- None.
## Impact
- **Code:** Backend aggregation/storage flow, API response handling, and frontend rendering/state management will be updated.
- **APIs:** Existing read endpoints will need language-aware response behavior or language selection input handling.
- **Dependencies:** Reuses Perplexity integration; no mandatory new external provider expected.
- **Infrastructure:** No deployment topology changes.
- **Environment:** Uses existing Perplexity configuration; may introduce optional translation toggles/settings later.
- **Data:** Adds translation data model/fields linked to each source article.


@@ -0,0 +1,27 @@
## ADDED Requirements
### Requirement: System generates Tamil and Malayalam translations at article creation time
The system SHALL generate Tamil (`ta`) and Malayalam (`ml`) translations for each newly created article during ingestion.
#### Scenario: Translation generation for new article
- **WHEN** a new source article is accepted for storage
- **THEN** the system requests Tamil and Malayalam translations for headline and summary
- **AND** translation generation occurs in the same ingestion flow for that article
#### Scenario: Translation failure fallback
- **WHEN** translation generation fails for one or both target languages
- **THEN** the system stores the base article in English
- **AND** marks missing translations as unavailable without failing the whole ingestion cycle
### Requirement: System stores translation variants linked to the same article
The system SHALL persist language-specific translated content as translation items associated with the base article.
#### Scenario: Persist linked translations
- **WHEN** Tamil and Malayalam translations are generated successfully
- **THEN** the system stores them as language-specific content variants linked to the base article identifier
- **AND** translation records remain queryable by language code
#### Scenario: No duplicate translation variants per language
- **WHEN** translation storage is attempted for an article-language pair that already exists
- **THEN** the system avoids creating duplicate translation items for the same language
- **AND** preserves one authoritative translation variant per article per language in this phase
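
A sketch of how the no-duplicate scenario might be enforced at the storage layer, assuming SQLite and a hypothetical `article_translations` table. It relies on a `UNIQUE (article_id, language)` constraint together with `INSERT OR IGNORE`, so a second write for the same pair leaves the first row untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the UNIQUE constraint is what prevents duplicates.
conn.execute(
    """
    CREATE TABLE article_translations (
        article_id INTEGER NOT NULL,
        language TEXT NOT NULL,
        headline TEXT NOT NULL,
        summary TEXT NOT NULL,
        UNIQUE (article_id, language)
    )
    """
)


def store_translation(conn, article_id, language, headline, summary):
    """Insert a variant; silently skip if the article-language pair exists."""
    conn.execute(
        "INSERT OR IGNORE INTO article_translations "
        "(article_id, language, headline, summary) VALUES (?, ?, ?, ?)",
        (article_id, language, headline, summary),
    )


store_translation(conn, 1, "ml", "first headline", "first summary")
store_translation(conn, 1, "ml", "second headline", "second summary")

stored = conn.execute(
    "SELECT headline FROM article_translations "
    "WHERE article_id = 1 AND language = 'ml'"
).fetchall()
```

Whether a later write should be ignored or should replace the existing variant is a design choice the requirement leaves open; this sketch keeps the first one.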


@@ -0,0 +1,27 @@
## ADDED Requirements
### Requirement: API supports language-aware content retrieval
The system SHALL support language-aware content delivery for hero and feed reads using selected language input.
#### Scenario: Language-specific latest article response
- **WHEN** a client requests latest article data with a supported language selection
- **THEN** the system returns headline and summary in the selected language when available
- **AND** includes the corresponding base article metadata and media attribution
#### Scenario: Language-specific paginated feed response
- **WHEN** a client requests paginated feed data with a supported language selection
- **THEN** the system returns each feed item's headline and summary in the selected language when available
- **AND** preserves existing pagination behavior and ordering semantics
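
For the paginated feed, one possible approach is a `LEFT JOIN` against the translation rows with `COALESCE` for the text fields, so ordering and paging depend only on the base article. The schema, the `published_at` ordering column, and the `fetch_feed` helper are assumptions; the existing ClawFort ordering rules are not described in this document.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE articles (
        id INTEGER PRIMARY KEY,
        headline TEXT NOT NULL,
        summary TEXT NOT NULL,
        published_at TEXT NOT NULL
    );
    CREATE TABLE article_translations (
        article_id INTEGER NOT NULL REFERENCES articles(id),
        language TEXT NOT NULL,
        headline TEXT NOT NULL,
        summary TEXT NOT NULL,
        UNIQUE (article_id, language)
    );
    INSERT INTO articles VALUES
        (1, 'Oldest', 's1', '2026-02-10'),
        (2, 'Middle', 's2', '2026-02-11'),
        (3, 'Newest', 's3', '2026-02-12');
    INSERT INTO article_translations VALUES (3, 'ta', '[ta] Newest', '[ta] s3');
    """
)

# Translated text is used when a row exists for the requested language;
# otherwise COALESCE falls through to the English columns.
FEED_QUERY = """
SELECT a.id,
       COALESCE(t.headline, a.headline) AS headline,
       COALESCE(t.summary, a.summary) AS summary
FROM articles AS a
LEFT JOIN article_translations AS t
       ON t.article_id = a.id AND t.language = ?
ORDER BY a.published_at DESC, a.id DESC
LIMIT ? OFFSET ?
"""


def fetch_feed(conn, lang, page, per_page):
    offset = (page - 1) * per_page
    return conn.execute(FEED_QUERY, (lang, per_page, offset)).fetchall()


page_one = fetch_feed(conn, "ta", page=1, per_page=2)
page_two = fetch_feed(conn, "ta", page=2, per_page=2)
```

Here the first page mixes a translated item with an English fallback item, while item order and page boundaries match what an English request would return.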
### Requirement: Language fallback to English is deterministic
The system SHALL return English source content when the requested translation is unavailable.
#### Scenario: Missing translation fallback
- **WHEN** a client requests Tamil or Malayalam content for an article lacking that translation
- **THEN** the system returns the English headline and summary for that article
- **AND** response shape remains consistent with language-aware responses
#### Scenario: Unsupported language handling
- **WHEN** a client requests a language outside supported values (`en`, `ta`, `ml`)
- **THEN** the system applies the defined default language behavior for this phase
- **AND** avoids breaking existing consumers of news endpoints
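
The "defined default language behavior" is left open in this spec. The sketch below shows both candidate behaviors behind a hypothetical `strict` flag: a silent fallback to English, or an error that an endpoint could map to a 400 response. The names `normalize_language` and `UnsupportedLanguage` are illustrative.

```python
from typing import Optional

SUPPORTED_LANGUAGES = ("en", "ta", "ml")
DEFAULT_LANGUAGE = "en"


class UnsupportedLanguage(ValueError):
    """Raised in strict mode; a handler could translate this into HTTP 400."""


def normalize_language(raw: Optional[str], strict: bool = False) -> str:
    # A missing or empty parameter keeps the pre-existing English behavior.
    if raw is None or raw.strip() == "":
        return DEFAULT_LANGUAGE
    code = raw.strip().lower()
    if code in SUPPORTED_LANGUAGES:
        return code
    if strict:
        raise UnsupportedLanguage(code)
    return DEFAULT_LANGUAGE


lenient = normalize_language("fr")
omitted = normalize_language(None)
tamil = normalize_language(" TA ")

try:
    normalize_language("fr", strict=True)
    rejected = False
except UnsupportedLanguage:
    rejected = True
```

Either mode leaves requests without a language parameter unchanged, which is the case existing consumers depend on.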


@@ -0,0 +1,27 @@
## ADDED Requirements
### Requirement: Landing page provides language selector
The system SHALL display a language selector on the landing page that allows switching between English, Tamil, and Malayalam content views.
#### Scenario: User selects language from landing page
- **WHEN** a user chooses Tamil or Malayalam from the language selector
- **THEN** hero and feed content re-render in the selected language where translations are available
- **AND** subsequent API requests use the selected language context
#### Scenario: User switches back to English
- **WHEN** a user selects English in the language selector
- **THEN** content renders in English
- **AND** language state updates immediately in the frontend view
### Requirement: User language preference is persisted and restored
The system SHALL persist selected language preference in client-side storage and restore it for returning users.
#### Scenario: Persist language selection
- **WHEN** a user selects a supported language on the landing page
- **THEN** the selected language code is stored in local storage or a client cookie
- **AND** the persisted value is used as preferred language for future visits on the same browser
#### Scenario: Restore preference on return visit
- **WHEN** a returning user opens the landing page
- **THEN** the system reads persisted language preference from client storage
- **AND** initializes the UI and content requests with that language by default


@@ -0,0 +1,40 @@
## 1. Translation Data Model and Persistence
- [x] 1.1 Add translation persistence model linked to base article with language code (`en`, `ta`, `ml`)
- [x] 1.2 Update database initialization/migration path to create translation storage structures
- [x] 1.3 Add repository operations to create/read translation variants by article and language
- [x] 1.4 Enforce no duplicate translation variant for the same article-language pair
## 2. Ingestion Pipeline Translation Generation
- [x] 2.1 Extend ingestion flow to trigger Tamil and Malayalam translation generation for each new article
- [x] 2.2 Reuse Perplexity integration for translation calls with language-specific prompts
- [x] 2.3 Persist generated translations as linked variants during the same ingestion cycle
- [x] 2.4 Implement graceful fallback when translation generation fails (store English base, continue cycle)
## 3. Language-Aware API Delivery
- [x] 3.1 Add language selection input handling to latest-news endpoint
- [x] 3.2 Add language selection input handling to paginated feed endpoint
- [x] 3.3 Return translated headline/summary when available and fall back to English when missing
- [x] 3.4 Define and implement behavior for unsupported language requests in this phase
## 4. Frontend Language Selector and Rendering
- [x] 4.1 Add landing-page language selector UI with English, Tamil, and Malayalam options
- [x] 4.2 Update hero data fetch/render flow to request and display selected language content
- [x] 4.3 Update feed pagination fetch/render flow to request and display selected language content
- [x] 4.4 Keep existing attribution/media rendering behavior intact across language switches
## 5. Preference Persistence and Returning User Behavior
- [x] 5.1 Persist user-selected language in localStorage with cookie fallback
- [x] 5.2 Restore persisted language on page load before initial content fetch
- [x] 5.3 Initialize selector state and API language requests from restored preference
## 6. Validation and Documentation
- [x] 6.1 Validate translation creation and retrieval for Tamil and Malayalam on new articles
- [x] 6.2 Validate fallback behavior for missing translation variants and unsupported language input
- [x] 6.3 Validate returning-user language persistence across browser sessions
- [x] 6.4 Update README with multilingual behavior, language selector usage, and persistence details