Initial Commit
openspec/changes/p03-languages-ml-tm/design.md
## Context

ClawFort currently stores and serves article content in a single-language flow. The news creation path fetches English content via Perplexity and persists one record per article, while the frontend hero/feed rendering consumes that single-language payload.

This change introduces multilingual support for Tamil and Malayalam, with language-aware rendering and a persistent user preference.

Constraints:
- Keep existing English behavior as the default and the fallback.
- Reuse the current Perplexity integration for translation generation.
- Keep API and frontend changes minimal and backward-compatible where possible.
- Persist the user's language preference client-side so returning users keep their choice.

## Goals / Non-Goals

**Goals:**
- Generate Tamil and Malayalam translations at article creation time.
- Persist translation variants linked to the base article.
- Serve language-specific content in hero/feed API responses.
- Add a landing-page language selector and persist the preference across sessions.

**Non-Goals:**
- Supporting arbitrary language expansion in this phase.
- Introducing user accounts or server-side profile preferences.
- Building editorial translation workflows or a manual override UI.
- Replacing Perplexity as the translation provider.

## Decisions

### Decision: Model translations as child records linked to a base article

**Decision:** Keep one source article and store translation rows keyed by article ID + language code.

**Rationale:**
- Avoids duplicating non-language metadata (source URL, image attribution, timestamps).
- Supports language lookup with deterministic fallback to English.
- Eases future language additions without schema redesign.

**Alternatives considered:**
- Inline columns on the article table (`headline_ta`, `headline_ml`): rejected as rigid and harder to extend.
- Fully duplicated article rows per language: rejected because it complicates deduplication and feed ordering.

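The child-record model above could be sketched roughly as follows. This is a minimal in-memory illustration, not the actual schema: the type names, the `summary` field, and the `TranslationStore` class are assumptions made for the example. Only the composite key (article ID + language code) comes from the decision itself.

```typescript
// Hypothetical types; only the (articleId, language) key is taken from the design.
type LanguageCode = "en" | "ta" | "ml";

interface Article {
  id: number;
  headline: string; // English source text
  summary: string; // assumed field, for illustration only
  sourceUrl: string; // non-language metadata lives on the base article only
  publishedAt: string;
}

interface ArticleTranslation {
  articleId: number; // reference to Article.id
  language: Exclude<LanguageCode, "en">; // English stays on the base article
  headline: string;
  summary: string;
}

// Composite key: at most one translation row per (article, language).
function translationKey(articleId: number, language: LanguageCode): string {
  return `${articleId}:${language}`;
}

class TranslationStore {
  private rows: Record<string, ArticleTranslation> = {};

  // Insert or overwrite the row for this article and language.
  upsert(row: ArticleTranslation): void {
    this.rows[translationKey(row.articleId, row.language)] = row;
  }

  // Returns undefined when no translation exists, so callers can fall back to English.
  find(articleId: number, language: LanguageCode): ArticleTranslation | undefined {
    return this.rows[translationKey(articleId, language)];
  }
}

const translations = new TranslationStore();
translations.upsert({
  articleId: 1,
  language: "ta",
  headline: "Tamil headline (placeholder)",
  summary: "Tamil summary (placeholder)",
});
```

Under this shape, adding a language means adding rows rather than columns, which is the extensibility argument made in the rationale.
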
### Decision: Translate immediately after article creation in ingestion pipeline

**Decision:** For each newly accepted article, request Tamil and Malayalam translations and persist them before the ingestion cycle completes.

**Rationale:**
- Keeps article and translations synchronized.
- Avoids delayed jobs and partial language availability in the normal flow.
- Fits the existing per-article processing loop.

**Alternatives considered:**
- Asynchronous background translation queue: rejected for higher complexity in this phase.

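A rough sketch of the translate-at-ingest step with bounded retries is shown below. The `Translate` function type is a hypothetical stand-in for the Perplexity call, and the retry count of 2 is an arbitrary assumption. The sketch also retries each language independently, which is only one of the options left open under Open Questions, not a settled choice.

```typescript
type TargetLanguage = "ta" | "ml";
const TARGET_LANGUAGES: readonly TargetLanguage[] = ["ta", "ml"];

// Hypothetical signature standing in for the real translation request.
type Translate = (text: string, language: TargetLanguage) => Promise<string>;

// Try a translation up to maxAttempts times; return null if every attempt fails.
async function translateWithRetry(
  translate: Translate,
  text: string,
  language: TargetLanguage,
  maxAttempts: number = 2, // assumed bound, not specified by the design
): Promise<string | null> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await translate(text, language);
    } catch {
      // Ignore the failure and retry until the attempts are exhausted.
    }
  }
  return null;
}

// Translate one headline into every target language before the cycle moves on.
// Languages that fail are simply absent, so reads fall back to English.
async function translateArticle(
  translate: Translate,
  headline: string,
): Promise<Partial<Record<TargetLanguage, string>>> {
  const result: Partial<Record<TargetLanguage, string>> = {};
  for (const language of TARGET_LANGUAGES) {
    const translated = await translateWithRetry(translate, headline, language);
    if (translated !== null) {
      result[language] = translated;
    }
  }
  return result;
}
```

Because a failed language is omitted rather than raised, a translation outage slows the ingestion cycle by at most the retry budget but never blocks the English article from being stored.
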
### Decision: Add optional language input to read APIs with English fallback

**Decision:** Add a language selection input (query param) on existing read endpoints; if the requested translation is missing, return the English source text.

**Rationale:**
- Preserves endpoint footprint and frontend integration simplicity.
- Guarantees response completeness even when translation fails.
- Supports progressive rollout without breaking existing consumers.

**Alternatives considered:**
- New language-specific endpoints: rejected as unnecessary API surface growth.

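The fallback rule for read endpoints might look like the sketch below. The record and response shapes, the function names, and the `language` field on the response are all assumptions for illustration. Treating an unsupported code as English is likewise only one of the two behaviors still listed under Open Questions.

```typescript
type LanguageCode = "en" | "ta" | "ml";
const SUPPORTED_LANGUAGES: readonly string[] = ["en", "ta", "ml"];

// Hypothetical stored shape: English on the article, translations keyed by language.
interface ArticleRecord {
  id: number;
  headline: string;
  translations: Partial<Record<"ta" | "ml", { headline: string }>>;
}

// Hypothetical response shape; `language` reports what was actually served.
interface ArticleResponse {
  id: number;
  headline: string;
  language: LanguageCode;
}

// Normalize the optional query param. Absent or unsupported values resolve to English
// (an assumption; the design leaves 400 vs. silent fallback undecided).
function resolveLanguage(param: string | undefined): LanguageCode {
  const normalized = (param ?? "en").trim().toLowerCase();
  return SUPPORTED_LANGUAGES.indexOf(normalized) >= 0
    ? (normalized as LanguageCode)
    : "en";
}

// Serve the requested translation when present, otherwise the English source text.
function renderArticle(article: ArticleRecord, param?: string): ArticleResponse {
  const requested = resolveLanguage(param);
  if (requested !== "en") {
    const translation = article.translations[requested];
    if (translation) {
      return { id: article.id, headline: translation.headline, language: requested };
    }
  }
  return { id: article.id, headline: article.headline, language: "en" };
}
```

Existing consumers that send no language parameter take the final `return` and receive the same English payload as before, which is what keeps the change backward-compatible.
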
### Decision: Persist frontend language preference in localStorage with cookie fallback

**Decision:** Primary persistence in `localStorage`; optional cookie fallback for constrained browsers.

**Rationale:**
- Simple client-only persistence without backend session dependencies.
- Matches the one-page app architecture and the current no-auth model.

**Alternatives considered:**
- Cookie-only preference: rejected as less ergonomic for JS state hydration.

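One way to structure the primary/fallback persistence is sketched here. The storage key, the `PreferenceStore` interface, and the function names are hypothetical. The interface is a deliberately small subset of the Web Storage API so that `window.localStorage` can be passed as the primary store and a thin cookie-backed adapter (not shown) as the fallback.

```typescript
type LanguageCode = "en" | "ta" | "ml";

const STORAGE_KEY = "clawfort_language"; // hypothetical key name

// Minimal subset of the Web Storage API; window.localStorage satisfies it.
interface PreferenceStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

function isLanguageCode(value: string | null): value is LanguageCode {
  return value === "en" || value === "ta" || value === "ml";
}

// Write to the primary store; if that throws (storage blocked or full), use the fallback.
function savePreference(
  language: LanguageCode,
  primary: PreferenceStore,
  fallback: PreferenceStore,
): void {
  try {
    primary.setItem(STORAGE_KEY, language);
  } catch {
    fallback.setItem(STORAGE_KEY, language);
  }
}

// Read the first valid stored value, checking primary then fallback; default to English.
function loadPreference(primary: PreferenceStore, fallback: PreferenceStore): LanguageCode {
  for (const store of [primary, fallback]) {
    try {
      const value = store.getItem(STORAGE_KEY);
      if (isLanguageCode(value)) {
        return value;
      }
    } catch {
      // Unreadable store: move on to the next one.
    }
  }
  return "en";
}
```

Validating the stored value on read means a stale or hand-edited entry degrades to English instead of being sent to the API as an unsupported language.
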
## Risks / Trade-offs

- **[Risk] Translation generation increases API cost/latency per ingestion cycle** -> Mitigation: bounded retries; fall back to English when a translation is unavailable.
- **[Risk] Partial translation failures create a mixed-language feed** -> Mitigation: deterministic fallback to English for missing translation rows.
- **[Trade-off] Translation-at-ingest adds synchronous processing time** -> Mitigation: keep the language set fixed to two targets in this phase.
- **[Risk] Language preference desynchronization between tabs/devices** -> Mitigation: accept per-browser persistence scope in the current architecture.

## Migration Plan

1. Add the translation persistence model and migration path.
2. Extend the ingestion pipeline to request and store Tamil and Malayalam translations.
3. Add language-aware API response behavior with fallback.
4. Implement the frontend language selector and preference persistence.
5. Validate language switching, fallback, and returning-user preference behavior.

Rollback:
- Disable language selection in the frontend and return an English-only payload, leaving stored translation data in place.

## Open Questions

- Should translation failures be retried independently per language within the same cycle, or should the remaining languages be skipped after one failed language call?
- Should unsupported language requests return 400 or silently fall back to English in v1?