Add offline merchant category suggestions

2026-03-23 13:28:00 -04:00
parent 12c72ddcad
commit 696d393fca
11 changed files with 352 additions and 31 deletions
--- a/plan.md
+++ b/plan.md
@@ -1,9 +1,9 @@
 # Monthly Expense Tracker With AI Insights

 ## Summary
-Build a single-user, local-first web app for manually recording daily expenses and biweekly paychecks, then generating month-to-date and end-of-month spending insights with next-month guidance.
+Build a single-user, local-first web app for manually recording daily expenses and biweekly paychecks, then generating month-to-date and end-of-month spending insights with next-month guidance while keeping all AI features fully offline.

-The first version is optimized for fast daily entry and a dashboard-first review flow. It uses fixed starter categories, a simple local database, and an in-app AI summary rather than email or exports.
+The first version is optimized for fast daily entry and a dashboard-first review flow. It uses fixed starter categories, a simple local database, a fully offline local LLM for private AI features, and an in-app AI summary rather than email or exports.

 ## Implementation Changes
 - App shape:
@@ -16,29 +16,33 @@ The first version is optimized for fast daily entry and a dashboard-first review
 - Categories:
  - Ship with fixed starter categories such as `Rent`, `Food`, `Transport`, `Bills`, `Shopping`, `Health`, `Entertainment`, `Misc`.
  - Store category as a controlled value so monthly summaries can group reliably.
+  - Support merchant-name-based category suggestion: apply deterministic merchant rules first, then use the local LLM only for unknown merchants.
+  - Treat AI categorization as assistive: known merchants may auto-fill a category, but unknown-merchant suggestions should be confirmed before save.
 - Dashboard behavior:
  - Show current month totals for expenses, category breakdown, paycheck total, and net cash flow.
  - Include month-to-date charts and simple comparisons like highest category, largest single expense, average daily spend, and spend vs paycheck coverage.
  - Provide a `Generate Insights` action that works any time during the month, not only at month-end.
 - AI insight generation:
-  - Build a summarization pipeline that prepares structured monthly aggregates plus recent transaction samples, then sends that context to the AI model.
+  - Build a summarization pipeline that prepares structured monthly aggregates plus recent transaction samples, then sends that context to a fully offline local model.
  - Ask the model to return:
    - spending pattern summary
    - unusual categories or spikes
    - paycheck-to-spend timing observations
    - practical next-month suggestions
-  - Keep AI read-only in v1: it does not edit data or auto-categorize entries.
+  - Use AI for merchant-category suggestion as well as monthly summaries, but keep the final saved category under user control for ambiguous merchants.
 - Storage and architecture:
  - Use a simple embedded database for local-first persistence, preferably SQLite.
  - Implement the app with `Next.js` for the web UI and server routes.
  - Use `Prisma` for the data layer and migrations.
  - Keep the AI integration behind a small service boundary so the model/provider can be swapped later without changing UI code.
-  - Use `OpenAI` for insight generation in v1.
+  - Use `Ollama` with a local Qwen-class instruct model for offline inference in v1.
+  - Keep the app functional when the local model is unavailable by returning a clear fallback message instead of failing silently.
 - Public interfaces / APIs:
  - `POST /expenses`, `GET /expenses`, `DELETE /expenses/:id`
  - `POST /paychecks`, `GET /paychecks`, `DELETE /paychecks/:id`
  - `GET /dashboard?month=YYYY-MM`
  - `POST /insights/generate?month=YYYY-MM`
+  - `POST /categories/suggest` with merchant/shop name input for local category suggestion
  - Insight response should include structured fields for totals and a rendered narrative summary for the dashboard.

 ## Implementation Checklist
@@ -48,9 +52,10 @@ The first version is optimized for fast daily entry and a dashboard-first review
 - [ ] Implement expense CRUD routes and forms.
 - [ ] Implement paycheck CRUD routes and forms.
 - [ ] Build dashboard aggregation logic for totals, categories, cash flow, and comparisons.
-- [ ] Add the insight generation service boundary and `OpenAI` integration.
+- [ ] Add the insight generation service boundary and offline `Ollama` integration.
+- [ ] Add merchant-name category suggestion using merchant rules first and local-model fallback second.
 - [ ] Render AI insight output in the dashboard with fallback behavior for sparse months.
-- [ ] Add tests for validation, aggregates, persistence, and insight generation.
+- [ ] Add tests for validation, aggregates, persistence, local-model fallback behavior, and category suggestion.
 - [ ] Verify all month-boundary behavior using local timezone dates.

 ## Test Plan
@@ -67,6 +72,11 @@ The first version is optimized for fast daily entry and a dashboard-first review
  - AI request uses aggregated monthly inputs plus transaction samples.
  - Manual generation works for current month and prior months with existing data.
  - Empty or near-empty months return a safe fallback message instead of low-quality advice.
+  - App returns a clear fallback message when `Ollama` or the local model is unavailable.
+- Category suggestion:
+  - Known merchants resolve deterministically to the expected category.
+  - Unknown merchants fall back to the local model and return a suggested category.
+  - Ambiguous suggestions require user confirmation before save.
 - Persistence:
  - Data remains available after app restart.
  - Deleting an expense or paycheck updates dashboard and future insight results correctly.
@@ -79,3 +89,5 @@ The first version is optimized for fast daily entry and a dashboard-first review
 - Fixed starter categories are sufficient for v1; custom categories can be added later.
 - Income is modeled as discrete biweekly paychecks because that materially affects next-month guidance and intra-month cash-flow interpretation.
 - Month and paycheck boundaries use the local machine timezone.
+- Privacy matters more than hosted-model quality for this app, so AI features should stay fully offline.
+- A recent local Qwen instruct model running through `Ollama` is the default model family for v1.
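
Note on the category-suggestion flow described in this change: the plan applies deterministic merchant rules first, asks the local model only for unknown merchants, requires confirmation for model suggestions, and returns a clear fallback message when the model is unavailable. A minimal TypeScript sketch of that flow might look like the following. Everything named here is an assumption for illustration and does not come from plan.md: the `Suggestion` shape, the sample merchant rules, the helper names, and the `qwen2.5:7b-instruct` model tag are hypothetical. The only external call is Ollama's `/api/generate` endpoint on its default local port, used with `stream: false`.

```typescript
// Hypothetical sketch; names, rules, and the model tag are assumptions, not part of plan.md.
const CATEGORIES = ["Rent", "Food", "Transport", "Bills", "Shopping", "Health", "Entertainment", "Misc"] as const;
type Category = (typeof CATEGORIES)[number];

interface Suggestion {
  category: Category | null;
  source: "rule" | "model" | "unavailable";
  needsConfirmation: boolean; // true whenever the user should confirm before save
  message?: string;
}

// Illustrative rules only; a real list would be larger and probably user-editable.
const MERCHANT_RULES: Array<[RegExp, Category]> = [
  [/\b(uber|lyft)\b/i, "Transport"],
  [/\b(netflix|spotify)\b/i, "Entertainment"],
  [/\b(cvs|walgreens)\b/i, "Health"],
];

// Step 1: deterministic lookup for known merchants.
function matchMerchantRule(merchant: string): Category | null {
  for (const [pattern, category] of MERCHANT_RULES) {
    if (pattern.test(merchant)) return category;
  }
  return null;
}

// Map the model's free-form reply onto the controlled category list.
// Anything that is not exactly one known category name is rejected.
function parseModelCategory(text: string): Category | null {
  const cleaned = text.toLowerCase().replace(/[^a-z]/g, "");
  return CATEGORIES.find((c) => c.toLowerCase() === cleaned) ?? null;
}

type AskModel = (prompt: string) => Promise<string>;

// Default model call: Ollama's local HTTP API, non-streaming.
async function askOllama(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // The model tag is an assumption; use whichever local Qwen instruct model is installed.
    body: JSON.stringify({ model: "qwen2.5:7b-instruct", prompt, stream: false }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const data = (await res.json()) as { response: string };
  return data.response;
}

// Step 2: rules first, model second, explicit fallback when the model cannot be reached.
async function suggestCategory(merchant: string, askModel: AskModel = askOllama): Promise<Suggestion> {
  const ruled = matchMerchantRule(merchant);
  if (ruled) return { category: ruled, source: "rule", needsConfirmation: false };
  try {
    const reply = await askModel(
      `Pick exactly one category for the merchant "${merchant}" from: ${CATEGORIES.join(", ")}. ` +
        "Reply with the category name only."
    );
    return { category: parseModelCategory(reply), source: "model", needsConfirmation: true };
  } catch {
    return {
      category: null,
      source: "unavailable",
      needsConfirmation: true,
      message: "Local model is unavailable; please choose a category manually.",
    };
  }
}
```

Passing the model call in as a parameter keeps the provider behind the small service boundary the plan asks for, and lets tests cover the unknown-merchant and model-unavailable paths without a running Ollama instance. A `POST /categories/suggest` handler could then return the `Suggestion` object as its JSON body.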