docs(28-01): complete ollama service + routes plan — detectOllama, listOllamaModels, model catalog, HTTP routes

Nexus Dev 2026-04-02 16:57:03 +00:00
parent 9e7c2890d5
commit ed085737e3
4 changed files with 130 additions and 22 deletions


@ -12,11 +12,11 @@
## Ollama Integration (5)
- [ ] **OLLA-01** — Nexus detects whether Ollama is installed locally
- [ ] **OLLA-02** — User can see a list of available Ollama models when configuring a Hermes agent
- [x] **OLLA-01** — Nexus detects whether Ollama is installed locally
- [x] **OLLA-02** — User can see a list of available Ollama models when configuring a Hermes agent
- [ ] **OLLA-03** — User can configure a Hermes agent with any local Ollama model
- [ ] **OLLA-04** — Model recommendation based on RAM/VRAM from a shipped catalog
- [ ] **OLLA-05** — If Ollama is not present, user is offered installation instructions
- [x] **OLLA-04** — Model recommendation based on RAM/VRAM from a shipped catalog
- [x] **OLLA-05** — If Ollama is not present, user is offered installation instructions
## Default Provider Logic (4)
@ -48,11 +48,11 @@ None deferred — all PRD items included in this milestone.
| HERM-05 | Phase 28 | Pending |
| HERM-06 | Phase 28 | Pending |
| HERM-07 | Phase 28 | Pending |
| OLLA-01 | Phase 28 | Pending |
| OLLA-02 | Phase 28 | Pending |
| OLLA-01 | Phase 28 | Complete |
| OLLA-02 | Phase 28 | Complete |
| OLLA-03 | Phase 28 | Pending |
| OLLA-04 | Phase 28 | Pending |
| OLLA-05 | Phase 28 | Pending |
| OLLA-04 | Phase 28 | Complete |
| OLLA-05 | Phase 28 | Complete |
| DFLT-01 | Phase 29 | Pending |
| DFLT-02 | Phase 29 | Pending |
| DFLT-03 | Phase 29 | Pending |


@ -43,9 +43,9 @@ Plans:
5. The agent config page shows Nexus-managed skills alongside Hermes native skills in a single unified list
6. The dashboard agent card for a Hermes agent shows model name, memory usage, and native skill count
7. Token usage and estimated model cost are recorded per heartbeat and surfaced in the cost tracking view
**Plans:** 3 plans
**Plans:** 1/3 plans executed
Plans:
- [ ] 28-01-PLAN.md — Ollama service, routes, model catalog, and unit tests
- [x] 28-01-PLAN.md — Ollama service, routes, model catalog, and unit tests
- [ ] 28-02-PLAN.md — UI model selector dropdown, install callout, Hermes skill badge
- [ ] 28-03-PLAN.md — Hermes stateJson runtime data and dashboard HermesRuntimeCard
**UI hint**: yes
@ -93,5 +93,5 @@ All 16 v1 requirements are mapped to exactly one phase. No orphans.
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 27. Hermes Adapter | v1.4 | 1/1 | Complete | 2026-04-02 |
| 28. Ollama Integration & Agent Surface | v1.4 | 0/3 | Not started | - |
| 28. Ollama Integration & Agent Surface | v1.4 | 1/3 | In Progress | |
| 29. Default Provider & End-to-End | v1.4 | 0/? | Not started | - |


@ -2,15 +2,15 @@
gsd_state_version: 1.0
milestone: v1.4
milestone_name: milestone
status: verifying
stopped_at: Completed 27-hermes-adapter-27-01-PLAN.md
last_updated: "2026-04-02T16:31:58.709Z"
status: executing
stopped_at: Completed 28-ollama-integration-28-01-PLAN.md
last_updated: "2026-04-02T16:56:46.973Z"
last_activity: 2026-04-02
progress:
total_phases: 3
completed_phases: 1
total_plans: 1
completed_plans: 1
total_plans: 4
completed_plans: 2
percent: 0
---
@ -21,13 +21,13 @@ progress:
See: .planning/PROJECT.md (updated 2026-04-02)
**Core value:** Nexus works out of the box without any paid subscription or API key.
**Current focus:** Phase 27 — hermes-adapter
**Current focus:** Phase 28 — ollama-integration
## Current Position
Phase: 28
Plan: Not started
Status: Phase complete — ready for verification
Phase: 28 (ollama-integration) — EXECUTING
Plan: 2 of 3
Status: Ready to execute
Last activity: 2026-04-02
Progress: [__________] 0%
@ -91,6 +91,7 @@ Progress: [__________] 0%
| Phase 26-pwa-performance P02 | 20 | 2 tasks | 8 files |
| Phase 26-pwa-performance P04 | 15 | 2 tasks | 10 files |
| Phase 27-hermes-adapter P01 | 2 | 3 tasks | 3 files |
| Phase 28-ollama-integration P01 | 3 | 2 tasks | 6 files |
## Accumulated Context
@ -180,6 +181,8 @@ Recent decisions affecting current work:
- [Phase 26-pwa-performance]: NotificationPermissionPrompt engagement gate: agentResponseCount >= 3 derived via useMemo from messages with role === assistant
- [Phase 27-hermes-adapter]: Toolsets field moved inside !isCreate guard — new agents get default toolsets; edit form uses adapterConfig.toolsets correctly
- [Phase 27-hermes-adapter]: Hermes session codec has no cwd field (unlike claude/codex/cursor/gemini) — only sessionId tracked
- [Phase 28-ollama-integration]: Force-added server/src/data/ with git add -f — source catalog JSON is not generated data despite data/ gitignore pattern
- [Phase 28-ollama-integration]: getRecommendedModel uses QUALITY_RANK map (best>reasoning>balanced>fast) to pick highest quality variant within 75% RAM budget
### Pending Todos
@ -191,6 +194,6 @@ None identified yet.
## Session Continuity
Last session: 2026-04-02T16:26:30.124Z
Stopped at: Completed 27-hermes-adapter-27-01-PLAN.md
Last session: 2026-04-02T16:56:46.970Z
Stopped at: Completed 28-ollama-integration-28-01-PLAN.md
Resume file: None


@ -0,0 +1,105 @@
---
phase: 28-ollama-integration
plan: "01"
subsystem: server
tags: [ollama, model-catalog, service, routes, unit-tests]
dependency_graph:
requires: []
provides: [ollamaService, ollamaRoutes, ollama-model-catalog]
affects: [server/src/app.ts, server/src/routes/index.ts]
tech_stack:
added: []
patterns: [AbortController-timeout, catalog-based-recommendation, assertCompanyAccess-authz]
key_files:
created:
- server/src/services/ollama.ts
- server/src/data/ollama-model-catalog.json
- server/src/__tests__/ollama-service.test.ts
- server/src/routes/ollama.ts
modified:
- server/src/routes/index.ts
- server/src/app.ts
decisions:
- "Force-added server/src/data/ with git add -f because root .gitignore has data/ pattern — source catalog is not generated data"
- "Used loadCatalog() with fs.readFileSync + fileURLToPath for reliable ESM-compatible JSON loading"
- "getRecommendedModel picks highest quality-ranked variant within 75% RAM budget using QUALITY_RANK map"
- "listOllamaModels includes its own AbortController timeout — guards against Ollama going down mid-request"
metrics:
duration: "3 minutes"
completed_date: "2026-04-02"
tasks_completed: 2
files_modified: 6
requirements_satisfied: [OLLA-01, OLLA-02, OLLA-04, OLLA-05]
---
# Phase 28 Plan 01: Ollama Service, Routes, and Model Catalog Summary
**One-liner:** Ollama detection + model listing service with AbortController timeouts, static 5-family model catalog for RAM-based recommendations, and Express routes at `/companies/:companyId/ollama/status` and `/companies/:companyId/ollama/models`.
## Tasks Completed
| Task | Name | Commit | Files |
|------|------|--------|-------|
| TDD RED | Add failing tests for ollama service | 2169a21e | server/src/__tests__/ollama-service.test.ts |
| TDD GREEN (Task 1) | ollamaService + model catalog | 4fce48e1 | server/src/services/ollama.ts, server/src/data/ollama-model-catalog.json |
| Task 2 | Ollama HTTP routes + app mount | e45a2578 | server/src/routes/ollama.ts, server/src/routes/index.ts, server/src/app.ts |
## What Was Built
### ollamaService (`server/src/services/ollama.ts`)
- `detectOllama()`: Probes `OLLAMA_BASE_URL/api/version` with a 3s AbortController timeout. Returns `{ installed: true, version }` on success, `{ installed: false, installUrl }` on any error or timeout.
- `listOllamaModels()`: Fetches `OLLAMA_BASE_URL/api/tags`, maps Ollama's native response (with `details.parameter_size`, `details.quantization_level`, `details.family`) to `OllamaModel[]`. Returns `[]` on any error.
- `getRecommendedModel(models, systemRamBytes)`: Reads the static catalog, computes usable RAM as 75% of total, ranks catalog variants by quality tier (best > reasoning > balanced > fast), and marks the single best-fitting model as `recommended: true` with a human-readable `recommendationReason`.
- Respects `process.env.OLLAMA_BASE_URL` override — never hard-codes `localhost:11434`.
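
The timeout-and-fallback behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `server/src/services/ollama.ts`: the `FetchLike` type, the `fetchJsonWithTimeout` helper, the camelCase field names on `OllamaModel`, the install URL, and passing `fetchFn` and `baseUrl` as parameters (so a fake can be injected in tests) are all assumptions. The summary states a 3s timeout only for `detectOllama`; reusing the same value for `listOllamaModels` is also an assumption.

```typescript
// Sketch only: type names and field shapes are assumed from the summary above.
type OllamaStatus =
  | { installed: true; version: string }
  | { installed: false; installUrl: string };

interface OllamaModel {
  name: string;
  parameterSize?: string;
  quantizationLevel?: string;
  family?: string;
}

// Subset of Ollama's /api/tags response that the summary mentions.
interface OllamaTagsResponse {
  models?: Array<{
    name: string;
    details?: { parameter_size?: string; quantization_level?: string; family?: string };
  }>;
}

// Hypothetical minimal fetch signature so tests can pass a fake.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

const INSTALL_URL = "https://ollama.com/download"; // assumed value
const TIMEOUT_MS = 3000;

// Aborts the request after TIMEOUT_MS and treats non-ok responses as errors.
async function fetchJsonWithTimeout(fetchFn: FetchLike, url: string): Promise<unknown> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), TIMEOUT_MS);
  try {
    const res = await fetchFn(url, { signal: controller.signal });
    if (!res.ok) throw new Error(`unexpected response from ${url}`);
    return await res.json();
  } finally {
    clearTimeout(timer); // always release the timer, even on failure
  }
}

async function detectOllama(fetchFn: FetchLike, baseUrl: string): Promise<OllamaStatus> {
  try {
    const body = (await fetchJsonWithTimeout(fetchFn, `${baseUrl}/api/version`)) as {
      version?: string;
    };
    return { installed: true, version: String(body.version ?? "unknown") };
  } catch {
    // Any error, timeout, or non-ok response is reported as "not installed".
    return { installed: false, installUrl: INSTALL_URL };
  }
}

async function listOllamaModels(fetchFn: FetchLike, baseUrl: string): Promise<OllamaModel[]> {
  try {
    const body = (await fetchJsonWithTimeout(fetchFn, `${baseUrl}/api/tags`)) as OllamaTagsResponse;
    return (body.models ?? []).map((m) => ({
      name: m.name,
      parameterSize: m.details?.parameter_size,
      quantizationLevel: m.details?.quantization_level,
      family: m.details?.family,
    }));
  } catch {
    return []; // listing never throws; callers just see no models
  }
}
```

In a real service the base URL would presumably come from `process.env.OLLAMA_BASE_URL` and `fetchFn` would be the global `fetch`; both are kept as parameters here only to make the sketch easy to exercise without a running Ollama instance.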
### Model Catalog (`server/src/data/ollama-model-catalog.json`)
5 families with 11 total variants:
- **qwen2**: qwen2.5-coder 7b/14b/32b
- **llama**: llama3.2 3b, llama3.1 8b/70b
- **mistral**: mistral 7b/22b
- **phi**: phi4 14b
- **deepseek**: deepseek-r1 7b/32b
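
To make the recommendation rule concrete, here is a hedged sketch of how a catalog-driven pick within a 75% RAM budget might look. The `CatalogVariant` shape, the `minRamGb` figures, the tier assigned to each tag, and the three-entry `SAMPLE_CATALOG` are illustrative assumptions; the shipped `ollama-model-catalog.json` may use different fields and numbers. Only the tier ordering (best > reasoning > balanced > fast) and the 75% budget come from the summary.

```typescript
// Hypothetical catalog shape; the real JSON catalog may differ.
type Quality = "best" | "reasoning" | "balanced" | "fast";

interface CatalogVariant {
  tag: string; // e.g. "qwen2.5-coder:7b"
  minRamGb: number; // illustrative numbers, not taken from the shipped catalog
  quality: Quality;
}

interface RankedModel {
  name: string;
  recommended: boolean;
  recommendationReason?: string;
}

// Higher number = preferred tier (ordering taken from the summary).
const QUALITY_RANK: Record<Quality, number> = { best: 4, reasoning: 3, balanced: 2, fast: 1 };

const SAMPLE_CATALOG: CatalogVariant[] = [
  { tag: "llama3.2:3b", minRamGb: 3, quality: "fast" },
  { tag: "qwen2.5-coder:7b", minRamGb: 5, quality: "balanced" },
  { tag: "qwen2.5-coder:32b", minRamGb: 20, quality: "best" },
];

function getRecommendedModel(
  models: { name: string }[],
  systemRamBytes: number,
  catalog: CatalogVariant[] = SAMPLE_CATALOG,
): RankedModel[] {
  // Only 75% of total RAM is treated as usable for the model.
  const usableGb = (systemRamBytes * 0.75) / 1024 ** 3;

  let pick: CatalogVariant | undefined;
  for (const model of models) {
    const variant = catalog.find((c) => c.tag === model.name);
    if (!variant || variant.minRamGb > usableGb) continue; // unknown or too large
    if (!pick || QUALITY_RANK[variant.quality] > QUALITY_RANK[pick.quality]) pick = variant;
  }

  const chosen = pick;
  return models.map((model) =>
    chosen && model.name === chosen.tag
      ? {
          name: model.name,
          recommended: true,
          recommendationReason: `Highest tier (${chosen.quality}) that fits in ${usableGb.toFixed(1)} GB usable RAM`,
        }
      : { name: model.name, recommended: false },
  );
}
```

With these sample numbers an 8 GB machine has 6 GB usable and lands on the 7b variant, while a 32 GB machine has 24 GB usable and lands on the 32b variant. Tie-breaking between variants of the same tier is left unspecified in this sketch.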
### Ollama Routes (`server/src/routes/ollama.ts`)
- `GET /companies/:companyId/ollama/status` — returns `OllamaStatus` JSON
- `GET /companies/:companyId/ollama/models` — returns `{ models: OllamaModel[], ramGb: number }`. Short-circuits to `{ models: [], ramGb: 0 }` if Ollama is not installed.
- Both routes gated with `assertCompanyAccess(req, companyId)`.
- Mounted in `app.ts` as `api.use(ollamaRoutes())` after `agentRoutes`.
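
The short-circuit in the `/models` route can be illustrated with a framework-free sketch of the handler body. Everything named here (`ModelsRouteDeps`, `buildModelsResponse`, `RouteModel`, and rounding `ramGb` to whole gigabytes) is hypothetical; the actual route presumably calls `assertCompanyAccess(req, companyId)` first and then wires the service functions directly, which is omitted because those signatures are project-specific.

```typescript
// Hypothetical types for illustration; not the project's real definitions.
interface RouteModel {
  name: string;
  recommended: boolean;
}

interface ModelsResponse {
  models: RouteModel[];
  ramGb: number;
}

// Dependencies are injected so the logic can be exercised without Express or Ollama.
interface ModelsRouteDeps {
  detect(): Promise<{ installed: boolean }>;
  list(): Promise<{ name: string }[]>;
  recommend(models: { name: string }[], systemRamBytes: number): RouteModel[];
  totalRamBytes(): number; // e.g. os.totalmem() in Node
}

async function buildModelsResponse(deps: ModelsRouteDeps): Promise<ModelsResponse> {
  const status = await deps.detect();
  if (!status.installed) {
    // Short-circuit: skip listing entirely when Ollama is absent.
    return { models: [], ramGb: 0 };
  }
  const ramBytes = deps.totalRamBytes();
  const models = deps.recommend(await deps.list(), ramBytes);
  // Rounding to whole GB is an assumption about how ramGb is reported.
  return { models, ramGb: Math.round(ramBytes / 1024 ** 3) };
}
```

Keeping the response logic separate from the Express handler is a design choice of this sketch, made so the not-installed path can be checked with plain fakes.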
## Test Coverage
12 unit tests (all passing):
- `detectOllama`: success, ECONNREFUSED failure, AbortController timeout, non-ok response
- `listOllamaModels`: success with full OllamaTagsResponse shape, ECONNREFUSED, non-ok
- `getRecommendedModel`: 8GB → 7b, 32GB → 32b, unknown models → all false, empty input, RAM too low → no recommendation
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 3 - Blocking] server/src/data/ gitignored by root .gitignore**
- **Found during:** Task 1 commit
- **Issue:** Root `.gitignore` has `data/` pattern; `server/src/data/ollama-model-catalog.json` was silently ignored
- **Fix:** Used `git add -f` to force-track the file. The catalog is source code (not generated data), so this is correct behavior.
- **Files modified:** `.gitignore` not modified — file force-added
- **Commit:** 4fce48e1
## Known Stubs
None — all functions return real data structures. Routes wire directly to service functions. No placeholder values in the response paths.
## Self-Check: PASSED
Files exist:
- server/src/services/ollama.ts: FOUND
- server/src/data/ollama-model-catalog.json: FOUND
- server/src/__tests__/ollama-service.test.ts: FOUND
- server/src/routes/ollama.ts: FOUND
Commits exist:
- 2169a21e: FOUND (test RED)
- 4fce48e1: FOUND (feat GREEN + catalog)
- e45a2578: FOUND (feat routes)