nexus/.planning/phases/28-ollama-integration/28-01-SUMMARY.md
Nexus Dev 51eb2edf0b chore: complete v1.5 Smart Onboarding + Personal AI Assistant milestone
6 phases, 13 plans, 21 requirements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 23:03:46 +00:00

---
phase: 28-ollama-integration
plan: "01"
subsystem: server
tags: [ollama, model-catalog, service, routes, unit-tests]
dependency_graph:
  requires: []
  provides: [ollamaService, ollamaRoutes, ollama-model-catalog]
  affects: [server/src/app.ts, server/src/routes/index.ts]
tech_stack:
  added: []
  patterns: [AbortController-timeout, catalog-based-recommendation, assertCompanyAccess-authz]
key_files:
  created:
    - server/src/services/ollama.ts
    - server/src/data/ollama-model-catalog.json
    - server/src/__tests__/ollama-service.test.ts
    - server/src/routes/ollama.ts
  modified:
    - server/src/routes/index.ts
    - server/src/app.ts
decisions:
- "Force-added server/src/data/ with git add -f because root .gitignore has data/ pattern — source catalog is not generated data"
- "Used loadCatalog() with fs.readFileSync + fileURLToPath for reliable ESM-compatible JSON loading"
- "getRecommendedModel picks highest quality-ranked variant within 75% RAM budget using QUALITY_RANK map"
- "listOllamaModels includes its own AbortController timeout — guards against Ollama going down mid-request"
metrics:
  duration: "3 minutes"
  completed_date: "2026-04-02"
  tasks_completed: 2
  files_modified: 6
requirements_satisfied: [OLLA-01, OLLA-02, OLLA-04, OLLA-05]
---
# Phase 28 Plan 01: Ollama Service, Routes, and Model Catalog Summary
**One-liner:** Ollama detection and model-listing service with AbortController timeouts, a static 5-family model catalog for RAM-based recommendations, and Express routes at `/companies/:companyId/ollama/status` and `/companies/:companyId/ollama/models`.
## Tasks Completed
| Task | Name | Commit | Files |
|------|------|--------|-------|
| TDD RED | Add failing tests for ollama service | 2169a21e | server/src/__tests__/ollama-service.test.ts |
| TDD GREEN (Task 1) | ollamaService + model catalog | 4fce48e1 | server/src/services/ollama.ts, server/src/data/ollama-model-catalog.json |
| Task 2 | Ollama HTTP routes + app mount | e45a2578 | server/src/routes/ollama.ts, server/src/routes/index.ts, server/src/app.ts |
## What Was Built
### ollamaService (`server/src/services/ollama.ts`)
- `detectOllama()`: Probes `OLLAMA_BASE_URL/api/version` with a 3s AbortController timeout. Returns `{ installed: true, version }` on success, `{ installed: false, installUrl }` on any error or timeout.
- `listOllamaModels()`: Fetches `OLLAMA_BASE_URL/api/tags`, maps Ollama's native response (with `details.parameter_size`, `details.quantization_level`, `details.family`) to `OllamaModel[]`. Returns `[]` on any error.
- `getRecommendedModel(models, systemRamBytes)`: Reads the static catalog, computes usable RAM as 75% of total, ranks catalog variants by quality tier (best > reasoning > balanced > fast), and marks the single best-fitting model as `recommended: true` with a human-readable `recommendationReason`.
- Respects `process.env.OLLAMA_BASE_URL` override — never hard-codes `localhost:11434`.
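
The timeout pattern described for `detectOllama()` can be sketched roughly as follows. This is a minimal illustration, not the code in `server/src/services/ollama.ts`: the `FetchLike` type, the injected `fetchImpl` and `baseUrl` parameters, and the `INSTALL_URL` value are assumptions made so the sketch runs without a live Ollama instance. The actual service reads its base URL from `process.env.OLLAMA_BASE_URL`.

```typescript
// Hypothetical status shape, modelled on the return values listed above.
type OllamaStatus =
  | { installed: true; version: string }
  | { installed: false; installUrl: string };

// Minimal fetch signature so a stub can be injected in tests (assumption).
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Assumed value; the real installUrl is not stated in this summary.
const INSTALL_URL = "https://ollama.com/download";

async function detectOllama(
  fetchImpl: FetchLike,
  baseUrl: string, // the real service takes this from OLLAMA_BASE_URL
  timeoutMs = 3000,
): Promise<OllamaStatus> {
  const controller = new AbortController();
  // Abort the in-flight request once the timeout elapses.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${baseUrl}/api/version`, {
      signal: controller.signal,
    });
    if (!res.ok) return { installed: false, installUrl: INSTALL_URL };
    const body = (await res.json()) as { version?: string };
    return { installed: true, version: body.version ?? "unknown" };
  } catch {
    // Connection refused, abort, and malformed JSON all count as "not installed".
    return { installed: false, installUrl: INSTALL_URL };
  } finally {
    clearTimeout(timer);
  }
}
```

Injecting the fetch function keeps the success, connection-refused, timeout, and non-ok cases testable with plain stubs.
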
### Model Catalog (`server/src/data/ollama-model-catalog.json`)
5 families with 11 total variants:
- **qwen2**: qwen2.5-coder 7b/14b/32b
- **llama**: llama3.2 3b, llama3.1 8b/70b
- **mistral**: mistral 7b/22b
- **phi**: phi4 14b
- **deepseek**: deepseek-r1 7b/32b
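
To illustrate how a catalog like this can drive the RAM-based recommendation, here is a rough sketch. Everything specific in it is an assumption: the `CatalogVariant` shape, the `minRamGb` figures, the tier assigned to each variant, the numeric `QUALITY_RANK` values, and the function name `recommendFromCatalog` are all hypothetical. Only the 75% usable-RAM budget and the tier ordering (best > reasoning > balanced > fast) come from the description above.

```typescript
type Tier = "best" | "reasoning" | "balanced" | "fast";

// Hypothetical catalog entry; the real JSON schema is not shown in this summary.
interface CatalogVariant {
  name: string;
  minRamGb: number;
  tier: Tier;
}

// Higher number means higher quality, matching best > reasoning > balanced > fast.
const QUALITY_RANK: Record<Tier, number> = { best: 4, reasoning: 3, balanced: 2, fast: 1 };

// Illustrative subset with made-up RAM requirements and tiers.
const CATALOG: CatalogVariant[] = [
  { name: "qwen2.5-coder:7b", minRamGb: 5, tier: "fast" },
  { name: "qwen2.5-coder:14b", minRamGb: 10, tier: "balanced" },
  { name: "qwen2.5-coder:32b", minRamGb: 22, tier: "best" },
];

interface RankedModel {
  name: string;
  recommended: boolean;
  recommendationReason?: string;
}

function recommendFromCatalog(models: { name: string }[], systemRamBytes: number): RankedModel[] {
  // Only 75% of total RAM is treated as usable for the model.
  const usableGb = (systemRamBytes / 1024 ** 3) * 0.75;
  let best: CatalogVariant | undefined;
  for (const m of models) {
    const variant = CATALOG.find((c) => c.name === m.name);
    if (!variant || variant.minRamGb > usableGb) continue; // unknown or too large
    if (!best || QUALITY_RANK[variant.tier] > QUALITY_RANK[best.tier]) best = variant;
  }
  return models.map((m) =>
    best && m.name === best.name
      ? {
          name: m.name,
          recommended: true,
          recommendationReason: `Highest-quality variant that fits in ${usableGb.toFixed(1)} GB usable RAM`,
        }
      : { name: m.name, recommended: false },
  );
}
```

With these illustrative numbers, an 8 GiB machine has a 6 GB budget and only the 7b variant fits, while a 32 GiB machine has a 24 GB budget and the 32b variant wins on tier. Models missing from the catalog are never marked as recommended.
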
### Ollama Routes (`server/src/routes/ollama.ts`)
- `GET /companies/:companyId/ollama/status`: returns `OllamaStatus` JSON.
- `GET /companies/:companyId/ollama/models`: returns `{ models: OllamaModel[], ramGb: number }`. Short-circuits to `{ models: [], ramGb: 0 }` if Ollama is not installed.
- Both routes gated with `assertCompanyAccess(req, companyId)`.
- Mounted in `app.ts` as `api.use(ollamaRoutes())` after `agentRoutes`.
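
The control flow of the `/models` route (access check, short-circuit when Ollama is absent, then list and rank) might look roughly like the framework-free sketch below. It is an illustration under assumptions: the `ModelsDeps` interface and `handleModels` name are hypothetical, `assertCompanyAccess` is assumed to throw on denial and is simplified to take only the company id (the real call is `assertCompanyAccess(req, companyId)`), and rounding `ramGb` to whole GiB is a guess.

```typescript
// Hypothetical shapes; the real OllamaModel type carries more fields.
interface OllamaModel {
  name: string;
  recommended: boolean;
}

interface ModelsResponse {
  models: OllamaModel[];
  ramGb: number;
}

// Dependencies are injected so the sketch runs without Express or a live Ollama.
interface ModelsDeps {
  assertCompanyAccess(companyId: string): void; // assumed to throw when access is denied
  detectOllama(): Promise<{ installed: boolean }>;
  listOllamaModels(): Promise<{ name: string }[]>;
  getRecommendedModel(models: { name: string }[], systemRamBytes: number): OllamaModel[];
  totalRamBytes(): number; // e.g. os.totalmem() in Node
}

async function handleModels(companyId: string, deps: ModelsDeps): Promise<ModelsResponse> {
  // Authorization runs first, so a denied caller learns nothing about the host.
  deps.assertCompanyAccess(companyId);

  // Short-circuit: no point listing models if Ollama is not reachable.
  const status = await deps.detectOllama();
  if (!status.installed) return { models: [], ramGb: 0 };

  const ramBytes = deps.totalRamBytes();
  const models = deps.getRecommendedModel(await deps.listOllamaModels(), ramBytes);
  return { models, ramGb: Math.round(ramBytes / 1024 ** 3) };
}
```

In an Express route, the returned object would be passed to `res.json(...)`; that wiring is omitted here to keep the sketch free of external dependencies.
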
## Test Coverage
12 unit tests (all passing):
- `detectOllama`: success, ECONNREFUSED failure, AbortController timeout, non-ok response
- `listOllamaModels`: success with full OllamaTagsResponse shape, ECONNREFUSED, non-ok
- `getRecommendedModel`: 8GB → 7b, 32GB → 32b, unknown models → all false, empty input, RAM too low → no recommendation
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 3 - Blocking] server/src/data/ gitignored by root .gitignore**
- **Found during:** Task 1 commit
- **Issue:** Root `.gitignore` has `data/` pattern; `server/src/data/ollama-model-catalog.json` was silently ignored
- **Fix:** Used `git add -f` to force-track the file. The catalog is source code (not generated data), so this is correct behavior.
- **Files modified:** None (`.gitignore` was left unchanged; the catalog file was force-added)
- **Commit:** 4fce48e1
## Known Stubs
None — all functions return real data structures. Routes wire directly to service functions. No placeholder values in the response paths.
## Self-Check: PASSED
Files exist:
- server/src/services/ollama.ts: FOUND
- server/src/data/ollama-model-catalog.json: FOUND
- server/src/__tests__/ollama-service.test.ts: FOUND
- server/src/routes/ollama.ts: FOUND
Commits exist:
- 2169a21e: FOUND (test RED)
- 4fce48e1: FOUND (feat GREEN + catalog)
- e45a2578: FOUND (feat routes)