docs(28-01): complete ollama service + routes plan — detectOllama, listOllamaModels, model catalog, HTTP routes

2026-04-02 16:57:03 +00:00
commit ed085737e3
parent 9e7c2890d5
4 changed files with 130 additions and 22 deletions
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -12,11 +12,11 @@

 ## Ollama Integration (5)

-- [ ] **OLLA-01** — Nexus detects whether Ollama is installed locally
-- [ ] **OLLA-02** — User can see a list of available Ollama models when configuring a Hermes agent
+- [x] **OLLA-01** — Nexus detects whether Ollama is installed locally
+- [x] **OLLA-02** — User can see a list of available Ollama models when configuring a Hermes agent
 - [ ] **OLLA-03** — User can configure a Hermes agent with any local Ollama model
-- [ ] **OLLA-04** — Model recommendation based on RAM/VRAM from a shipped catalog
-- [ ] **OLLA-05** — If Ollama is not present, user is offered installation instructions
+- [x] **OLLA-04** — Model recommendation based on RAM/VRAM from a shipped catalog
+- [x] **OLLA-05** — If Ollama is not present, user is offered installation instructions

 ## Default Provider Logic (4)

@@ -48,11 +48,11 @@ None deferred — all PRD items included in this milestone.
 | HERM-05 | Phase 28 | Pending |
 | HERM-06 | Phase 28 | Pending |
 | HERM-07 | Phase 28 | Pending |
-| OLLA-01 | Phase 28 | Pending |
-| OLLA-02 | Phase 28 | Pending |
+| OLLA-01 | Phase 28 | Complete |
+| OLLA-02 | Phase 28 | Complete |
 | OLLA-03 | Phase 28 | Pending |
-| OLLA-04 | Phase 28 | Pending |
-| OLLA-05 | Phase 28 | Pending |
+| OLLA-04 | Phase 28 | Complete |
+| OLLA-05 | Phase 28 | Complete |
 | DFLT-01 | Phase 29 | Pending |
 | DFLT-02 | Phase 29 | Pending |
 | DFLT-03 | Phase 29 | Pending |
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -43,9 +43,9 @@ Plans:
  5. The agent config page shows Nexus-managed skills alongside Hermes native skills in a single unified list
  6. The dashboard agent card for a Hermes agent shows model name, memory usage, and native skill count
  7. Token usage and estimated model cost are recorded per heartbeat and surfaced in the cost tracking view
-**Plans:** 3 plans
+**Plans:** 1/3 plans executed
 Plans:
-- [ ] 28-01-PLAN.md — Ollama service, routes, model catalog, and unit tests
+- [x] 28-01-PLAN.md — Ollama service, routes, model catalog, and unit tests
 - [ ] 28-02-PLAN.md — UI model selector dropdown, install callout, Hermes skill badge
 - [ ] 28-03-PLAN.md — Hermes stateJson runtime data and dashboard HermesRuntimeCard
 **UI hint**: yes
@@ -93,5 +93,5 @@ All 16 v1 requirements are mapped to exactly one phase. No orphans.
 | Phase | Milestone | Plans Complete | Status | Completed |
 |-------|-----------|----------------|--------|-----------|
 | 27. Hermes Adapter | v1.4 | 1/1 | Complete    | 2026-04-02 |
-| 28. Ollama Integration & Agent Surface | v1.4 | 0/3 | Not started | - |
+| 28. Ollama Integration & Agent Surface | v1.4 | 1/3 | In Progress|  |
 | 29. Default Provider & End-to-End | v1.4 | 0/? | Not started | - |
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -2,15 +2,15 @@
 gsd_state_version: 1.0
 milestone: v1.4
 milestone_name: milestone
-status: verifying
-stopped_at: Completed 27-hermes-adapter-27-01-PLAN.md
-last_updated: "2026-04-02T16:31:58.709Z"
+status: executing
+stopped_at: Completed 28-ollama-integration-28-01-PLAN.md
+last_updated: "2026-04-02T16:56:46.973Z"
 last_activity: 2026-04-02
 progress:
  total_phases: 3
  completed_phases: 1
-  total_plans: 1
-  completed_plans: 1
+  total_plans: 4
+  completed_plans: 2
  percent: 0
 ---

@@ -21,13 +21,13 @@ progress:
 See: .planning/PROJECT.md (updated 2026-04-02)

 **Core value:** Nexus works out of the box without any paid subscription or API key.
-**Current focus:** Phase 27 — hermes-adapter
+**Current focus:** Phase 28 — ollama-integration

 ## Current Position

-Phase: 28
-Plan: Not started
-Status: Phase complete — ready for verification
+Phase: 28 (ollama-integration) — EXECUTING
+Plan: 2 of 3
+Status: Ready to execute
 Last activity: 2026-04-02

 Progress: [__________] 0%
@@ -91,6 +91,7 @@ Progress: [__________] 0%
 | Phase 26-pwa-performance P02 | 20 | 2 tasks | 8 files |
 | Phase 26-pwa-performance P04 | 15 | 2 tasks | 10 files |
 | Phase 27-hermes-adapter P01 | 2 | 3 tasks | 3 files |
+| Phase 28-ollama-integration P01 | 3 | 2 tasks | 6 files |

 ## Accumulated Context

@@ -180,6 +181,8 @@ Recent decisions affecting current work:
 - [Phase 26-pwa-performance]: NotificationPermissionPrompt engagement gate: agentResponseCount >= 3 derived via useMemo from messages with role === assistant
 - [Phase 27-hermes-adapter]: Toolsets field moved inside !isCreate guard — new agents get default toolsets; edit form uses adapterConfig.toolsets correctly
 - [Phase 27-hermes-adapter]: Hermes session codec has no cwd field (unlike claude/codex/cursor/gemini) — only sessionId tracked
+- [Phase 28-ollama-integration]: Force-added server/src/data/ with git add -f — source catalog JSON is not generated data despite data/ gitignore pattern
+- [Phase 28-ollama-integration]: getRecommendedModel uses QUALITY_RANK map (best>reasoning>balanced>fast) to pick highest quality variant within 75% RAM budget

 ### Pending Todos

@@ -191,6 +194,6 @@ None identified yet.

 ## Session Continuity

-Last session: 2026-04-02T16:26:30.124Z
-Stopped at: Completed 27-hermes-adapter-27-01-PLAN.md
+Last session: 2026-04-02T16:56:46.970Z
+Stopped at: Completed 28-ollama-integration-28-01-PLAN.md
 Resume file: None
--- a/.planning/phases/28-ollama-integration/28-01-SUMMARY.md
+++ b/.planning/phases/28-ollama-integration/28-01-SUMMARY.md
@@ -0,0 +1,105 @@
+---
+phase: 28-ollama-integration
+plan: "01"
+subsystem: server
+tags: [ollama, model-catalog, service, routes, unit-tests]
+dependency_graph:
+  requires: []
+  provides: [ollamaService, ollamaRoutes, ollama-model-catalog]
+  affects: [server/src/app.ts, server/src/routes/index.ts]
+tech_stack:
+  added: []
+  patterns: [AbortController-timeout, catalog-based-recommendation, assertCompanyAccess-authz]
+key_files:
+  created:
+    - server/src/services/ollama.ts
+    - server/src/data/ollama-model-catalog.json
+    - server/src/__tests__/ollama-service.test.ts
+    - server/src/routes/ollama.ts
+  modified:
+    - server/src/routes/index.ts
+    - server/src/app.ts
+decisions:
+  - "Force-added server/src/data/ with git add -f because root .gitignore has data/ pattern — source catalog is not generated data"
+  - "Used loadCatalog() with fs.readFileSync + fileURLToPath for reliable ESM-compatible JSON loading"
+  - "getRecommendedModel picks highest quality-ranked variant within 75% RAM budget using QUALITY_RANK map"
+  - "listOllamaModels includes its own AbortController timeout — guards against Ollama going down mid-request"
+metrics:
+  duration: "3 minutes"
+  completed_date: "2026-04-02"
+  tasks_completed: 2
+  files_modified: 6
+requirements_satisfied: [OLLA-01, OLLA-02, OLLA-04, OLLA-05]
+---
+
+# Phase 28 Plan 01: Ollama Service, Routes, and Model Catalog Summary
+
+**One-liner:** Ollama detection + model listing service with AbortController timeouts, static 5-family model catalog for RAM-based recommendations, and Express routes at `/companies/:companyId/ollama/status` and `/models`.
+
+## Tasks Completed
+
+| Task | Name | Commit | Files |
+|------|------|--------|-------|
+| TDD RED | Add failing tests for ollama service | 2169a21e | server/src/__tests__/ollama-service.test.ts |
+| TDD GREEN (Task 1) | ollamaService + model catalog | 4fce48e1 | server/src/services/ollama.ts, server/src/data/ollama-model-catalog.json |
+| Task 2 | Ollama HTTP routes + app mount | e45a2578 | server/src/routes/ollama.ts, routes/index.ts, app.ts |
+
+## What Was Built
+
+### ollamaService (`server/src/services/ollama.ts`)
+
+- `detectOllama()`: Probes `OLLAMA_BASE_URL/api/version` with a 3s AbortController timeout. Returns `{ installed: true, version }` on success, `{ installed: false, installUrl }` on any error or timeout.
+- `listOllamaModels()`: Fetches `OLLAMA_BASE_URL/api/tags`, maps Ollama's native response (with `details.parameter_size`, `details.quantization_level`, `details.family`) to `OllamaModel[]`. Returns `[]` on any error.
+- `getRecommendedModel(models, systemRamBytes)`: Reads the static catalog, computes usable RAM as 75% of total, ranks catalog variants by quality tier (best > reasoning > balanced > fast), and marks the single best-fitting model as `recommended: true` with a human-readable `recommendationReason`.
+- Respects `process.env.OLLAMA_BASE_URL` override — never hard-codes `localhost:11434`.
+
+### Model Catalog (`server/src/data/ollama-model-catalog.json`)
+
+5 families with 11 total variants:
+- **qwen2**: qwen2.5-coder 7b/14b/32b
+- **llama**: llama3.2 3b, llama3.1 8b/70b
+- **mistral**: mistral 7b/22b
+- **phi**: phi4 14b
+- **deepseek**: deepseek-r1 7b/32b
+
+### Ollama Routes (`server/src/routes/ollama.ts`)
+
+- `GET /companies/:companyId/ollama/status` — returns `OllamaStatus` JSON
+- `GET /companies/:companyId/ollama/models` — returns `{ models: OllamaModel[], ramGb: number }`. Short-circuits to `{ models: [], ramGb: 0 }` if Ollama not installed.
+- Both routes gated with `assertCompanyAccess(req, companyId)`.
+- Mounted in `app.ts` as `api.use(ollamaRoutes())` after `agentRoutes`.
+
+## Test Coverage
+
+12 unit tests (all passing):
+- `detectOllama`: success, ECONNREFUSED failure, AbortController timeout, non-ok response
+- `listOllamaModels`: success with full OllamaTagsResponse shape, ECONNREFUSED, non-ok
+- `getRecommendedModel`: 8GB → 7b, 32GB → 32b, unknown models → all false, empty input, RAM too low → no recommendation
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 3 - Blocking] server/src/data/ gitignored by root .gitignore**
+- **Found during:** Task 1 commit
+- **Issue:** Root `.gitignore` has `data/` pattern; `server/src/data/ollama-model-catalog.json` was silently ignored
+- **Fix:** Used `git add -f` to force-track the file. The catalog is source code (not generated data), so this is correct behavior.
+- **Files modified:** `.gitignore` not modified — file force-added
+- **Commit:** 4fce48e1
+
+## Known Stubs
+
+None — all functions return real data structures. Routes wire directly to service functions. No placeholder values in the response paths.
+
+## Self-Check: PASSED
+
+Files exist:
+- server/src/services/ollama.ts: FOUND
+- server/src/data/ollama-model-catalog.json: FOUND
+- server/src/__tests__/ollama-service.test.ts: FOUND
+- server/src/routes/ollama.ts: FOUND
+
+Commits exist:
+- 2169a21e: FOUND (test RED)
+- 4fce48e1: FOUND (feat GREEN + catalog)
+- e45a2578: FOUND (feat routes)
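
Note on the summary above: the service source is not part of this diff, so the following is only a minimal sketch of the two techniques the summary describes, namely a probe of `/api/version` guarded by an AbortController timeout and a recommendation that picks the highest quality tier within 75% of system RAM. Everything named here (`detectOllamaSketch`, `pickRecommended`, `CatalogVariant`, `minRamGb`, `tier`, `FetchLike`, the install URL) is hypothetical and may not match `server/src/services/ollama.ts` or the shipped catalog schema.

```typescript
// Hypothetical shapes; the real types in server/src/services/ollama.ts are not shown in this commit.
type OllamaStatus =
  | { installed: true; version: string }
  | { installed: false; installUrl: string };

type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

const INSTALL_URL = "https://ollama.com/download"; // assumed install link

// Probe <baseUrl>/api/version and treat any error, non-ok status, or timeout as "not installed".
async function detectOllamaSketch(
  baseUrl: string,
  fetchImpl: FetchLike,
  timeoutMs = 3000,
): Promise<OllamaStatus> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${baseUrl}/api/version`, { signal: controller.signal });
    if (!res.ok) return { installed: false, installUrl: INSTALL_URL };
    const body = (await res.json()) as { version?: string };
    return { installed: true, version: body.version ?? "unknown" };
  } catch {
    // Covers connection refused as well as the abort triggered by the timer.
    return { installed: false, installUrl: INSTALL_URL };
  } finally {
    clearTimeout(timer);
  }
}

type Tier = "best" | "reasoning" | "balanced" | "fast";

// Ordering taken from the summary: best > reasoning > balanced > fast.
const QUALITY_RANK: Record<Tier, number> = { best: 4, reasoning: 3, balanced: 2, fast: 1 };

// Assumed catalog entry shape; field names are illustrative only.
interface CatalogVariant {
  name: string;
  minRamGb: number;
  tier: Tier;
}

// Return the name of the installed variant with the highest tier that fits in 75% of RAM,
// or null when nothing installed is in the catalog or nothing fits.
function pickRecommended(
  installed: string[],
  catalog: CatalogVariant[],
  systemRamBytes: number,
): string | null {
  const budgetGb = (systemRamBytes / 1024 ** 3) * 0.75;
  const candidates = catalog.filter(
    (v) => installed.includes(v.name) && v.minRamGb <= budgetGb,
  );
  if (candidates.length === 0) return null;
  // Higher tier first; break ties by preferring the larger variant.
  candidates.sort(
    (a, b) => QUALITY_RANK[b.tier] - QUALITY_RANK[a.tier] || b.minRamGb - a.minRamGb,
  );
  return candidates[0].name;
}
```

Passing the fetch implementation in as a parameter is a choice made for this sketch so the probe can be exercised without a running Ollama instance; the summary's unit tests may instead mock the global `fetch`.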