Add offline merchant category suggestions

2026-03-23 13:28:00 -04:00
parent 12c72ddcad
commit 696d393fca
11 changed files with 352 additions and 31 deletions
--- a/openspec/changes/monthly-expense-tracker-v1/design.md
+++ b/openspec/changes/monthly-expense-tracker-v1/design.md
@@ -1,6 +1,6 @@
 ## Context

-The repository starts with a product plan and OpenSpec configuration but no application code. The first version needs a complete local-first implementation using `Next.js`, `Prisma`, `SQLite`, and `OpenAI`, while keeping scope intentionally narrow: one user, manual data entry, fixed categories, and dashboard-only insights. Month boundaries are based on the local machine timezone, which affects date parsing, monthly aggregation, and paycheck coverage calculations.
+The repository starts with a product plan and OpenSpec configuration but no application code. The first version needs a complete local-first implementation using `Next.js`, `Prisma`, `SQLite`, and a fully offline local LLM runtime, while keeping scope intentionally narrow: one user, manual data entry, fixed categories, merchant-assisted categorization, and dashboard-only insights. Month boundaries are based on the local machine timezone, which affects date parsing, monthly aggregation, and paycheck coverage calculations.

 ## Goals / Non-Goals

@@ -8,12 +8,13 @@ The repository starts with a product plan and OpenSpec configuration but no appl
 - Build a single deployable `Next.js` app with UI views and server routes in one codebase.
 - Persist expenses, paychecks, and generated monthly insights in a local SQLite database managed by Prisma.
 - Centralize monthly aggregation logic so dashboard reads and AI generation use the same numbers.
-- Keep AI integration isolated behind a small service layer that prepares structured monthly context and calls `OpenAI`.
+- Keep AI integration isolated behind a small service layer that prepares structured monthly context and calls a fully offline local inference runtime.
 - Make v1 testable with deterministic validation, aggregation, and safe fallback behavior for sparse data.
+- Add privacy-preserving merchant category suggestion with deterministic merchant mappings before model inference.

 **Non-Goals:**
 - Authentication, multi-user support, bank sync, receipt scanning, background jobs, or email delivery.
-- Automatic categorization, editing data through AI, or free-form custom categories in v1.
+- Fully automatic uncapped categorization without user review for ambiguous merchants, editing data through AI, or free-form custom categories in v1.
 - Complex financial forecasting beyond simple next-month guidance derived from recent activity.

 ## Decisions
@@ -34,16 +35,25 @@ The repository starts with a product plan and OpenSpec configuration but no appl
 - Rationale: dashboard totals, paycheck coverage, category breakdowns, and AI snapshots must stay consistent across endpoints.
 - Alternative considered: separate logic per route. Rejected because it risks drift between dashboard and insight generation.

+### Use `Ollama` with a local Qwen-class instruct model
+- Rationale: privacy is a primary product requirement, and the target machine can comfortably run a recent local model for lightweight categorization and summary generation.
+- Alternative considered: hosted `OpenAI`. Rejected because it violates the privacy-first goal for personal financial data.
+
 ### Add an AI service boundary with structured prompt input and fallback responses
-- Rationale: the app needs provider isolation, predictable prompt shape, and safe messaging when data is too sparse for useful advice.
-- Alternative considered: calling `OpenAI` directly from a route handler with raw records. Rejected because it couples prompting, aggregation, and transport too tightly.
+- Rationale: the app needs runtime isolation, predictable prompt shape, and safe messaging when local inference is unavailable or data is too sparse for useful advice.
+- Alternative considered: calling the local model directly from a route handler with raw records. Rejected because it couples prompting, aggregation, and transport too tightly.
+
+### Use merchant rules first and local-model fallback second for category suggestion
+- Rationale: most repeated merchants can be categorized deterministically, and more quickly than by model inference, while unknown merchants still benefit from local AI assistance.
+- Alternative considered: model-only categorization. Rejected because it is slower, less predictable, and unnecessary for common merchants.

 ## Risks / Trade-offs

 - [Local timezone handling differs by machine] -> Normalize month calculations around stored local-date strings and test month edges explicitly.
 - [SQLite limits concurrency] -> Acceptable for single-user local-first v1; no mitigation beyond keeping writes simple.
 - [AI output quality varies with sparse or noisy data] -> Add minimum-data fallback logic and keep prompts grounded in structured aggregates.
-- [OpenAI dependency requires API key management] -> Read configuration from environment variables and keep failure messages explicit in the UI/API.
+- [Local model may be unavailable or not yet pulled] -> Detect runtime/model readiness and return explicit offline setup guidance in the UI/API.
+- [Merchant names can be ambiguous] -> Use auto-fill only for known deterministic mappings and require user confirmation for fallback suggestions.

 ## Migration Plan

@@ -51,13 +61,13 @@ The repository starts with a product plan and OpenSpec configuration but no appl
 2. Add the Prisma schema, create the initial SQLite migration, and generate the client.
 3. Implement CRUD routes and UI forms for expenses and paychecks.
 4. Implement dashboard aggregation and month filtering.
-5. Add the AI insight service and persistence for generated monthly insights.
+5. Add the offline AI service, merchant-category suggestion flow, and persistence for generated monthly insights.
 6. Run automated tests, then exercise the main flows in the browser.

 Rollback is straightforward in early development: revert the code change and reset the local SQLite database if schema changes become invalid.

 ## Open Questions

-- Which `OpenAI` model should be the initial default for monthly insight generation?
+- Which exact local Qwen model tag should be the initial default in `Ollama`?
 - Should generated monthly insights overwrite prior insights for the same month or create a historical trail of regenerated summaries?
 - Do we want soft confirmation in the UI before deleting expenses or paychecks, or is immediate deletion acceptable for v1?
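
The "merchant rules first, local-model fallback second" decision in this change could be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the category names, the `MERCHANT_RULES` entries, and the `normalizeMerchant`, `suggestCategory`, and `ModelFallback` names are all assumptions, and the local model call is injected as a plain function so the sketch does not depend on any particular `Ollama` client.

```typescript
// Hypothetical sketch; every name and mapping here is illustrative only.
type Category = "Groceries" | "Dining" | "Transport" | "Other";

interface Suggestion {
  category: Category;
  source: "rule" | "model";
  // Rule hits may be auto-filled; model guesses must be confirmed by the user.
  requiresConfirmation: boolean;
}

// Assumed deterministic mappings, keyed by normalized merchant name.
const MERCHANT_RULES = new Map<string, Category>([
  ["trader joes", "Groceries"],
  ["uber", "Transport"],
]);

// Lowercase, drop punctuation, and collapse whitespace so that
// "Trader Joe's" and "TRADER JOES" resolve to the same key.
function normalizeMerchant(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9 ]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Stand-in for the local inference call; resolves to null when the
// model has no usable answer.
type ModelFallback = (merchant: string) => Promise<Category | null>;

async function suggestCategory(
  merchant: string,
  fallback: ModelFallback,
): Promise<Suggestion | null> {
  // 1. Deterministic rules first: fast, predictable, no inference needed.
  const ruled = MERCHANT_RULES.get(normalizeMerchant(merchant));
  if (ruled !== undefined) {
    return { category: ruled, source: "rule", requiresConfirmation: false };
  }
  // 2. Local-model fallback second, always flagged for user review.
  try {
    const guess = await fallback(merchant);
    return guess === null
      ? null
      : { category: guess, source: "model", requiresConfirmation: true };
  } catch {
    // Local runtime unavailable: return no suggestion rather than failing.
    return null;
  }
}
```

Under these assumptions, `suggestCategory("Trader Joe's", fallback)` resolves from the rule table without ever invoking `fallback`, while an unknown merchant yields a suggestion with `requiresConfirmation: true`, or `null` if the local runtime throws.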