nexus/.planning/milestones/v1.4-phases/28-ollama-integration/28-01-PLAN.md at 006cc44d85a29d28dade56edb4aa5f4ccc76d57f

Nexus Dev 8ae8e526d9 chore: complete v1.4 Hermes Default Provider milestone

3 phases, 6 plans, 16 requirements. Archives copied to milestones/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-04 03:55:49 +00:00

10 KiB


---
phase: 28-ollama-integration
plan:
type: execute
wave:
depends_on:
files_modified:
  - server/src/services/ollama.ts
  - server/src/routes/ollama.ts
  - server/src/routes/index.ts
  - server/src/app.ts
  - server/src/data/ollama-model-catalog.json
  - server/src/__tests__/ollama-service.test.ts
autonomous: true
requirements:
  - OLLA-01
  - OLLA-02
  - OLLA-04
  - OLLA-05
must_haves:
  truths:
    - "detectOllama() returns installed:true + version when Ollama responds at /api/version"
    - "detectOllama() returns installed:false + installUrl when Ollama is absent or times out"
    - "listOllamaModels() returns model list with parameterSize, family, quantization from /api/tags"
    - "getRecommendedModel() returns highest-quality model that fits within 75% system RAM"
    - "GET /companies/:companyId/ollama/status returns OllamaStatus JSON"
    - "GET /companies/:companyId/ollama/models returns OllamaModel[] + ramGb"
  artifacts:
    - path: server/src/services/ollama.ts
      provides: "Ollama detection, model listing, recommendation logic"
      exports:
        - detectOllama
        - listOllamaModels
        - OllamaStatus
        - OllamaModel
    - path: server/src/routes/ollama.ts
      provides: "HTTP routes for Ollama status and model listing"
      exports:
        - ollamaRoutes
    - path: server/src/data/ollama-model-catalog.json
      provides: "Static model catalog for RAM-based recommendations"
      contains: "qwen2.5-coder"
    - path: server/src/__tests__/ollama-service.test.ts
      provides: "Unit tests for ollamaService"
      min_lines: 60
  key_links:
    - from: server/src/routes/ollama.ts
      to: server/src/services/ollama.ts
      via: "import detectOllama, listOllamaModels"
      pattern: "import.from.services/ollama"
    - from: server/src/app.ts
      to: server/src/routes/ollama.ts
      via: "api.use(ollamaRoutes())"
      pattern: "ollamaRoutes"
---

Create the server-side Ollama integration service, HTTP routes, model catalog, and unit tests.

Purpose: Provides the backend API surface that the UI (Plan 02) will consume to detect Ollama, list available models, and recommend a model based on system RAM.

Output: Working server routes at /companies/:companyId/ollama/status and /companies/:companyId/ollama/models, plus unit tests.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/28-ollama-integration/28-RESEARCH.md

@server/src/app.ts @server/src/routes/index.ts @server/src/routes/agents.ts (for assertCompanyAccess pattern) @ui/src/api/client.ts (for API client pattern reference)

Task 1: Create ollamaService + model catalog + unit tests

Files: server/src/services/ollama.ts, server/src/data/ollama-model-catalog.json, server/src/__tests__/ollama-service.test.ts

Read first:
- server/src/services/heartbeat.ts (lines 1-30 for import patterns)
- .planning/phases/28-ollama-integration/28-RESEARCH.md (Pattern 1, Pattern 5, Code Examples)

Behavior:
- detectOllama returns { installed: true, version: "0.5.x" } when fetch to /api/version succeeds
- detectOllama returns { installed: false, version: null, installUrl: "https://ollama.com/download" } when fetch rejects (ECONNREFUSED)
- detectOllama returns { installed: false, version: null, installUrl } when fetch times out (AbortController)
- listOllamaModels returns OllamaModel[] mapped from /api/tags response with name, parameterSize, quantization, sizeBytes, family
- listOllamaModels returns empty array when Ollama is absent
- getRecommendedModel marks the highest-quality model that fits within 75% of given RAM budget as recommended=true
- getRecommendedModel with 8GB RAM recommends a 7b model, not a 32b model
- Model catalog JSON contains at least qwen2, llama, mistral, phi, deepseek families

Action:

1. Create `server/src/data/ollama-model-catalog.json` with the static catalog from RESEARCH Pattern 5. Include families: qwen2, llama, mistral, phi, deepseek. Each variant has name, ramGb, vramGb, quality fields.

2. Create `server/src/services/ollama.ts`:
   - `const OLLAMA_BASE_URL = process.env.OLLAMA_BASE_URL ?? "http://localhost:11434"`
   - `const OLLAMA_TIMEOUT_MS = 3000`
   - `const INSTALL_URL = "https://ollama.com/download"`
   - Export interface `OllamaStatus { installed: boolean; version: string | null; installUrl: string }`
   - Export interface `OllamaModel { name: string; parameterSize: string; quantization: string; sizeBytes: number; family: string; recommended: boolean; recommendationReason: string | null }`
   - Export `async function detectOllama(): Promise<OllamaStatus>` — fetch /api/version with AbortController 3s timeout. On success return installed:true + version. On any error return installed:false + installUrl.
   - Export `async function listOllamaModels(): Promise<OllamaModel[]>` — fetch /api/tags, map response.models to OllamaModel[]. On error return [].
   - Export `function getRecommendedModel(models: OllamaModel[], systemRamBytes: number): OllamaModel[]` — reads catalog JSON, computes usableRamGb = (systemRamBytes / 1024^3) * 0.75, for each model checks if a catalog entry matches by name and fits in RAM. Returns models with `recommended` field set. The highest-quality model within budget gets recommended=true + a recommendationReason string. All others get recommended=false.
   - Anti-pattern: Do NOT poll in a loop. Do NOT hard-code localhost:11434 — always use OLLAMA_BASE_URL.

3. Create `server/src/__tests__/ollama-service.test.ts`:
   - Mock global fetch using vi.stubGlobal("fetch", vi.fn())
   - Test detectOllama success case (mock returns { version: "0.5.1" })
   - Test detectOllama failure case (mock rejects with ECONNREFUSED)
   - Test detectOllama timeout case (mock never resolves, verify AbortController fires)
   - Test listOllamaModels success with mock /api/tags response matching OllamaTagsResponse shape from RESEARCH
   - Test listOllamaModels returns [] on fetch error
   - Test getRecommendedModel with 8GB RAM → recommends 7b-class model
   - Test getRecommendedModel with 32GB RAM → recommends 32b-class model
   - Test getRecommendedModel with models not in catalog → recommended=false for all
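The detection and listing behavior described in step 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from RESEARCH: the `fetchJson` helper, the `FetchLike` and `TagsResponse` types, and the optional `fetchImpl` / `timeoutMs` parameters are hypothetical additions that make the timeout path easy to exercise without a running Ollama. The `/api/tags` field names used here (`size`, `details.family`, `details.parameter_size`, `details.quantization_level`) follow Ollama's public API; the authoritative shape for this plan is the OllamaTagsResponse in 28-RESEARCH.md.

```typescript
// Sketch only. fetchJson, FetchLike, TagsResponse and the optional parameters are hypothetical.
const OLLAMA_BASE_URL = process.env.OLLAMA_BASE_URL ?? "http://localhost:11434";
const OLLAMA_TIMEOUT_MS = 3000;
const INSTALL_URL = "https://ollama.com/download";

export interface OllamaStatus { installed: boolean; version: string | null; installUrl: string }
export interface OllamaModel {
  name: string; parameterSize: string; quantization: string; sizeBytes: number;
  family: string; recommended: boolean; recommendationReason: string | null;
}

// Subset of the /api/tags payload that the mapping below relies on.
interface TagsResponse {
  models?: Array<{
    name: string;
    size: number;
    details?: { family?: string; parameter_size?: string; quantization_level?: string };
  }>;
}

// Minimal structural type so tests can pass a fake instead of the global fetch.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// One request, one timer: abort the request if it has not settled within timeoutMs.
async function fetchJson<T>(path: string, fetchImpl: FetchLike, timeoutMs: number): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(`${OLLAMA_BASE_URL}${path}`, { signal: controller.signal });
    if (!res.ok) throw new Error(`Ollama returned a non-2xx status for ${path}`);
    return (await res.json()) as T;
  } finally {
    clearTimeout(timer); // never leave the timer pending once the request has settled
  }
}

export async function detectOllama(
  fetchImpl: FetchLike = fetch,
  timeoutMs: number = OLLAMA_TIMEOUT_MS,
): Promise<OllamaStatus> {
  try {
    const body = await fetchJson<{ version?: string }>("/api/version", fetchImpl, timeoutMs);
    if (typeof body.version !== "string") throw new Error("version missing from response");
    return { installed: true, version: body.version, installUrl: INSTALL_URL };
  } catch {
    // Connection refused, timeout, bad status, malformed body: all map to "not installed".
    return { installed: false, version: null, installUrl: INSTALL_URL };
  }
}

export async function listOllamaModels(
  fetchImpl: FetchLike = fetch,
  timeoutMs: number = OLLAMA_TIMEOUT_MS,
): Promise<OllamaModel[]> {
  try {
    const body = await fetchJson<TagsResponse>("/api/tags", fetchImpl, timeoutMs);
    return (body.models ?? []).map((m) => ({
      name: m.name,
      parameterSize: m.details?.parameter_size ?? "",
      quantization: m.details?.quantization_level ?? "",
      sizeBytes: m.size,
      family: m.details?.family ?? "",
      recommended: false, // filled in later by getRecommendedModel
      recommendationReason: null,
    }));
  } catch {
    return []; // absent or unreachable Ollama yields an empty list rather than an error
  }
}
```

Injecting `fetchImpl` is one possible design; the plan's own tests stub the global with `vi.stubGlobal("fetch", vi.fn())`, which works equally well with the zero-argument signatures in step 2. Either way, a fake used for the timeout case needs to reject when the abort signal fires, otherwise the promise never settles.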

Verify: `cd /opt/nexus/server && npx vitest run src/__tests__/ollama-service.test.ts`

Acceptance criteria:
- `grep -q "detectOllama" server/src/services/ollama.ts`
- `grep -q "listOllamaModels" server/src/services/ollama.ts`
- `grep -q "getRecommendedModel" server/src/services/ollama.ts`
- `grep -q "OLLAMA_BASE_URL" server/src/services/ollama.ts`
- `grep -q "qwen2.5-coder" server/src/data/ollama-model-catalog.json`
- `grep -q "detectOllama" server/src/__tests__/ollama-service.test.ts`
- `grep -q "getRecommendedModel" server/src/__tests__/ollama-service.test.ts`

Done: All ollama service tests pass. detectOllama, listOllamaModels, and getRecommendedModel functions exported and tested. Model catalog JSON file exists with 5+ model families.

Task 2: Create Ollama HTTP routes and mount in app

Files: server/src/routes/ollama.ts, server/src/routes/index.ts, server/src/app.ts

Read first:
- server/src/routes/agents.ts (lines 1-60 for route pattern, assertCompanyAccess usage)
- server/src/app.ts (lines 134-170 for route mounting pattern)
- server/src/routes/index.ts

Action:

1. Create `server/src/routes/ollama.ts`:
   - Import Router from express, import assertCompanyAccess from "./authz.js", import detectOllama/listOllamaModels/getRecommendedModel from "../services/ollama.js"
   - Import `os` for `os.totalmem()`
   - Export function `ollamaRoutes()` returning Router
   - `GET /companies/:companyId/ollama/status`:
     - Call `assertCompanyAccess(req, companyId)`
     - Call `detectOllama()`, return JSON response
   - `GET /companies/:companyId/ollama/models`:
     - Call `assertCompanyAccess(req, companyId)`
     - Call `detectOllama()` first; if not installed, return `{ models: [], ramGb: 0 }`
     - Call `listOllamaModels()`, then `getRecommendedModel(models, os.totalmem())`
     - Return `{ models: enrichedModels, ramGb: Math.round(os.totalmem() / 1073741824) }`
   - Wrap each handler in try/catch, return 500 on unexpected errors

2. Add `export { ollamaRoutes } from "./ollama.js"` to `server/src/routes/index.ts`

3. In `server/src/app.ts`:
   - Add import: `import { ollamaRoutes } from "./routes/ollama.js"`
   - Add `api.use(ollamaRoutes())` after the `api.use(agentRoutes(db))` line (around line 152)
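The `/models` route passes `os.totalmem()` to `getRecommendedModel` and reports `ramGb` alongside the result. A rough sketch of that RAM-budget selection is shown below. It is an illustration only: the flat `CATALOG` array, its `ramGb` and `quality` numbers, the numeric meaning of `quality`, the optional `catalog` parameter, and the `systemRamGb` helper are all assumptions made for the example. The real catalog lives in `server/src/data/ollama-model-catalog.json` and takes its shape and values from RESEARCH Pattern 5.

```typescript
import { totalmem } from "node:os";

export interface OllamaModel {
  name: string; parameterSize: string; quantization: string; sizeBytes: number;
  family: string; recommended: boolean; recommendationReason: string | null;
}

// Hypothetical catalog entry shape; field names come from the plan, the layout is assumed.
interface CatalogEntry { name: string; ramGb: number; vramGb: number; quality: number }

// Placeholder values for illustration only, not measured requirements.
const CATALOG: CatalogEntry[] = [
  { name: "qwen2.5-coder:7b", ramGb: 5, vramGb: 5, quality: 2 },
  { name: "qwen2.5-coder:32b", ramGb: 22, vramGb: 22, quality: 4 },
];

const BYTES_PER_GIB = 1024 ** 3; // 1073741824, the divisor used for ramGb in the route

export function getRecommendedModel(
  models: OllamaModel[],
  systemRamBytes: number,
  catalog: CatalogEntry[] = CATALOG,
): OllamaModel[] {
  // Only 75% of system RAM counts as usable budget.
  const usableRamGb = (systemRamBytes / BYTES_PER_GIB) * 0.75;

  // Pick the highest-quality installed model whose catalog entry fits the budget.
  let best: CatalogEntry | undefined;
  for (const model of models) {
    const entry = catalog.find((c) => c.name === model.name);
    if (entry && entry.ramGb <= usableRamGb && (!best || entry.quality > best.quality)) {
      best = entry;
    }
  }
  const winner = best;

  // Return copies with the recommendation fields set; models outside the catalog stay false.
  return models.map((model) =>
    winner && model.name === winner.name
      ? {
          ...model,
          recommended: true,
          recommendationReason:
            `Needs about ${winner.ramGb} GB, within the ${usableRamGb.toFixed(1)} GB usable budget`,
        }
      : { ...model, recommended: false, recommendationReason: null },
  );
}

// Whole-GiB figure reported as ramGb by the /models route.
export function systemRamGb(totalBytes: number = totalmem()): number {
  return Math.round(totalBytes / BYTES_PER_GIB);
}
```

With these placeholder numbers, an 8 GiB machine has a 6 GB budget and only the 7b entry fits, while a 32 GiB machine has a 24 GB budget and the 32b entry wins on quality. Matching by exact model name is the simplest reading of "matches by name"; tags such as `:latest` would need a looser match if the catalog does not list them.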

Verify: `cd /opt/nexus/server && npx tsc --noEmit 2>&1 | head -20`

Acceptance criteria:
- `grep -q "ollamaRoutes" server/src/routes/ollama.ts`
- `grep -q "assertCompanyAccess" server/src/routes/ollama.ts`
- `grep -q "/companies/:companyId/ollama/status" server/src/routes/ollama.ts`
- `grep -q "/companies/:companyId/ollama/models" server/src/routes/ollama.ts`
- `grep -q "ollamaRoutes" server/src/routes/index.ts`
- `grep -q "ollamaRoutes" server/src/app.ts`
- `grep -q "os.totalmem" server/src/routes/ollama.ts`

Done: Ollama routes mounted at /companies/:companyId/ollama/status and /companies/:companyId/ollama/models. TypeScript compiles without errors. Routes use assertCompanyAccess for auth and os.totalmem() for RAM detection.

Verification:
- `cd /opt/nexus/server && npx vitest run src/__tests__/ollama-service.test.ts`: all tests pass
- `cd /opt/nexus/server && npx tsc --noEmit`: no type errors
- Ollama service gracefully returns installed:false when Ollama is not running (no crashes)

<success_criteria>

- Ollama detection service exists with detectOllama, listOllamaModels, getRecommendedModel
- Model catalog JSON ships with 5+ model families
- HTTP routes mounted and accessible at /companies/:companyId/ollama/status and /companies/:companyId/ollama/models
- Unit tests cover success, failure, timeout, and recommendation scenarios
- All code compiles without TypeScript errors

</success_criteria>

After completion, create `.planning/phases/28-ollama-integration/28-01-SUMMARY.md`
