# Phase 28: Ollama Integration & Agent Surface - Research

**Researched:** 2026-04-01
**Domain:** Ollama HTTP API, Hermes adapter extension, agent dashboard UI, cost tracking
**Confidence:** HIGH

## Summary

Phase 28 adds three distinct capabilities on top of the completed Phase 27 Hermes adapter: (1) Ollama detection and model catalog: Nexus queries `localhost:11434` to detect Ollama, lists available models, and ships a static JSON catalog for hardware-aware recommendations; (2) Hermes config surface extension: the model field in `config-fields.tsx` becomes a dropdown fed by live Ollama discovery rather than a free-text input, and a new `base_url`/`provider: custom` adapterConfig field routes Hermes to the local endpoint; (3) Hermes runtime data in the dashboard: `stateJson` in `agentRuntimeState` is the right place to store Hermes-specific runtime metadata (model name, native skill count, memory usage from Ollama's `/api/ps`), and the `AgentOverview` component in `AgentDetail.tsx` is the right insertion point.

The most important finding is that **Hermes does not have a native "ollama" provider**. Ollama is configured as a custom OpenAI-compatible endpoint: `provider: custom`, `base_url: http://localhost:11434/v1`. The model field passes the Ollama model name bare (e.g. `qwen2.5-coder:32b`). This shapes OLLA-02, OLLA-03, and the `config-fields.tsx` changes.

For cost tracking (HERM-06): `hermes-paperclip-adapter@0.2.1` already parses `token_usage` and `cost` regex patterns from Hermes stdout. When Hermes returns non-zero usage, `heartbeat.ts:updateRuntimeState` already calls `costService.createEvent`. The only gap is that Hermes running local Ollama models will have `costUsd = undefined` (no billing). The infrastructure handles this correctly: a zero-cost event is suppressed when `additionalCostCents === 0 && !hasTokenUsage`. No cost tracking code changes are needed for local models; the planner just needs to verify the regex path works end-to-end.

For HERM-05 (skill visibility): `syncHermesNativeSkills` already exists in `skillRegistryService` and is already called from the `GET /skill-registry/agents/:agentId/skills` route when `adapterType === "hermes_local"`. The Hermes adapter's `listHermesSkills` function merges Paperclip-managed and native skills. The integration is already complete at the data layer. What is missing is the UI surface in the Skills tab that renders the `originLabel: "Hermes skill"` / `readOnly: true` entries distinctly from managed skills.

**Primary recommendation:** Implement as four focused plans: (P01) server-side Ollama service + routes; (P02) Hermes config-fields UI extension for Ollama model selection; (P03) dashboard Hermes runtime info card; (P04) model catalog JSON + recommendation logic.

---

## User Constraints (from CONTEXT.md)

### Locked Decisions

All implementation choices are at Claude's discretion; the discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.

### Claude's Discretion

All implementation choices are at Claude's discretion.

### Deferred Ideas (OUT OF SCOPE)

None (discuss phase skipped). Refer to REQUIREMENTS.md for in-scope requirements.

Out of scope per REQUIREMENTS.md:

- Multi-provider model routing (Hermes can use OpenRouter/Anthropic/OpenAI but that's Hermes config, not Nexus)
- Hermes MCP server management
- Custom Hermes skill authoring UI
- DFLT-01 through DFLT-04 (Phase 29)

---

## Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| OLLA-01 | Nexus detects whether Ollama is installed locally | HTTP probe to `localhost:11434/api/version`; new server service `ollamaService` |
| OLLA-02 | User can see list of available Ollama models when configuring a Hermes agent | `GET /api/tags` from Ollama HTTP API; new server route `GET /companies/:id/ollama/models`; config-fields.tsx dropdown |
| OLLA-03 | User can configure a Hermes agent with any local Ollama model | Sets `adapterConfig.model` to the bare Ollama model name, `adapterConfig.provider = "custom"`, `adapterConfig.base_url = "http://localhost:11434/v1"` |
| OLLA-04 | Model recommendation based on RAM/VRAM from a shipped catalog | Static JSON catalog in `server/src/data/ollama-model-catalog.json`; server reads `os.totalmem()` to filter; returned with model list |
| OLLA-05 | If Ollama is not present, user is offered installation instructions | Ollama status endpoint returns `installed: false` + `installUrl`; UI shows callout in Hermes config-fields |
| HERM-05 | Nexus-managed skills visible alongside Hermes native skills in agent config | Already wired at data layer; UI Skills tab needs `originLabel: "Hermes skill"` rendering distinction |
| HERM-06 | Cost tracking captures token usage and model costs for Hermes agents | Infrastructure already handles this; verify end-to-end with local Ollama (zero cost is correct, no change needed) |
| HERM-07 | Dashboard shows Hermes-specific info (model name, memory usage, native skill count) | Store in `agentRuntimeState.stateJson`; render in `AgentOverview` component |

---

## Standard Stack

### Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| Node.js `os` module | built-in | Read total system RAM | Already used in heartbeat.ts; no new dep |
| Node.js `fetch` | Node 18+ built-in | HTTP calls to Ollama API at localhost:11434 | Already confirmed available in runtime |
| `hermes-paperclip-adapter` | 0.2.1 (installed) | Hermes execution, skill sync, model detection | Already wired into adapter registry |

### No New Dependencies Required

All capabilities needed for Phase 28 are achievable with existing infrastructure:

- Ollama HTTP API is probed with `fetch` (built-in Node 18+)
- Model catalog is a static JSON file in the server package
- RAM reading uses `os.totalmem()` (built-in)
- Hermes Ollama configuration uses existing `adapterConfig` fields

## Architecture Patterns

### Recommended Project Structure (new files)

```
server/src/services/ollama.ts                  # ollamaService - detect + list models
server/src/routes/ollama.ts                    # HTTP routes: /companies/:id/ollama/status, /models
server/src/data/ollama-model-catalog.json      # shipped catalog for OLLA-04
server/src/__tests__/ollama-service.test.ts    # unit tests for ollamaService
ui/src/api/ollama.ts                           # ollamaApi client - wraps server routes
```

### Pattern 1: Ollama Service (server-side)

```typescript
// server/src/services/ollama.ts
const OLLAMA_BASE_URL = process.env.OLLAMA_BASE_URL ?? "http://localhost:11434";
const OLLAMA_TIMEOUT_MS = 3000;
// Assumed value: the snippet uses INSTALL_URL but this research does not define it.
const INSTALL_URL = "https://ollama.com/download";

export interface OllamaStatus {
  installed: boolean;
  version: string | null;
  installUrl: string;
}

export interface OllamaModel {
  name: string;            // e.g. "qwen2.5-coder:32b"
  parameterSize: string;   // e.g. "32.8B" from /api/tags details
  quantization: string;    // e.g. "Q4_K_M"
  sizeBytes: number;
  family: string;          // e.g. "qwen2"
  recommended: boolean;    // from catalog match + RAM check
  recommendationReason: string | null;
}

export async function detectOllama(): Promise<OllamaStatus> {
  // Abort the probe after OLLAMA_TIMEOUT_MS so an absent daemon cannot hang the request.
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), OLLAMA_TIMEOUT_MS);
  try {
    const res = await fetch(`${OLLAMA_BASE_URL}/api/version`, {
      signal: controller.signal,
    });
    if (!res.ok) return { installed: false, version: null, installUrl: INSTALL_URL };
    const body = (await res.json()) as { version?: string };
    return { installed: true, version: body.version ?? null, installUrl: INSTALL_URL };
  } catch {
    // Connection refused, timeout, or invalid JSON: all are treated as "not installed".
    return { installed: false, version: null, installUrl: INSTALL_URL };
  } finally {
    clearTimeout(timeout);
  }
}
```

**Why this pattern:** Matches the existing codex-models.ts pattern: HTTP fetch with timeout, and graceful failure that returns empty/false rather than throwing. The 3s timeout prevents hanging requests when Ollama is not installed.

### Pattern 2: Ollama Routes (mounted under /companies/:companyId)

```
GET /companies/:companyId/ollama/status
  → { installed: boolean, version: string|null, installUrl: string }

GET /companies/:companyId/ollama/models
  → { models: OllamaModel[], ramGb: number }
```

Both routes use the existing `assertCompanyAccess(req, companyId)` authz pattern from `agents.ts`. Mount in `server/src/routes/index.ts` alongside the existing `agentsRoutes`.

### Pattern 3: Hermes Config-Fields Enhancement

The existing `HermesLocalConfigFields` in `config-fields.tsx` has a free-text `Model` input. For Ollama support, it becomes a hybrid: dropdown (when Ollama is present) + manual entry fallback.

```tsx
// Fetch Ollama status + models (only for hermes_local adapter)
const { data: ollamaStatus } = useQuery({
  queryKey: ["ollama", "status", companyId],
  queryFn: () => ollamaApi.status(companyId!),
  enabled: Boolean(companyId),
});

const { data: ollamaModels } = useQuery({
  queryKey: ["ollama", "models", companyId],
  queryFn: () => ollamaApi.models(companyId!),
  enabled: Boolean(companyId && ollamaStatus?.installed),
});
```

When `ollamaStatus.installed === false`, render an install callout (OLLA-05) instead of the dropdown.

When a local Ollama model is selected, `buildHermesConfig` (or `mark`) must also set `provider: "custom"` and `base_url: "http://localhost:11434/v1"` in `adapterConfig`. This is the critical mapping from OLLA-03.

### Pattern 4: Hermes Runtime Data in stateJson (HERM-07)

`agentRuntimeState.stateJson` is a `jsonb` column typed as a string-keyed `Record`. The heartbeat service writes this via `updateRuntimeState`. The Hermes adapter's `execute.ts` already returns `resultJson` with `session_id`, `usage`, and `cost_usd`.

For HERM-07 runtime data (model name, native skill count, memory usage), the server-side approach is:

- After a Hermes run completes, read `resultJson.result` and extract/store model + detected skill count into `stateJson`
- Optionally query Ollama `/api/ps` (running models) to get `size_vram` for memory usage display

**Insertion point for stateJson patch:** `heartbeat.ts:updateRuntimeState` already calls `db.update(agentRuntimeState).set(...)`. Add a `stateJson` merge here when `adapterType === "hermes_local"`.

**UI insertion point:** `AgentOverview` component in `AgentDetail.tsx` (line ~1183).
Add a `HermesRuntimeCard` component after the charts section, gated by `agent.adapterType === "hermes_local"`:

```tsx
{agent.adapterType === "hermes_local" && runtimeState && (
  // Prop name is an assumption; the card reads runtimeState.stateJson.
  <HermesRuntimeCard runtimeState={runtimeState} />
)}
```

### Pattern 5: Model Catalog JSON (OLLA-04)

```json
// server/src/data/ollama-model-catalog.json
{
  "models": [
    {
      "family": "qwen2",
      "variants": [
        { "name": "qwen2.5-coder:7b", "ramGb": 5, "vramGb": 5, "quality": "fast" },
        { "name": "qwen2.5-coder:32b", "ramGb": 22, "vramGb": 22, "quality": "best" }
      ]
    },
    {
      "family": "llama",
      "variants": [
        { "name": "llama3.2:3b", "ramGb": 3, "vramGb": 3, "quality": "fast" },
        { "name": "llama3.1:8b", "ramGb": 6, "vramGb": 6, "quality": "balanced" },
        { "name": "llama3.1:70b", "ramGb": 48, "vramGb": 48, "quality": "best" }
      ]
    },
    {
      "family": "mistral",
      "variants": [
        { "name": "mistral:7b", "ramGb": 5, "vramGb": 5, "quality": "balanced" },
        { "name": "mistral:22b", "ramGb": 14, "vramGb": 14, "quality": "best" }
      ]
    },
    {
      "family": "phi",
      "variants": [
        { "name": "phi4:14b", "ramGb": 10, "vramGb": 10, "quality": "balanced" }
      ]
    },
    {
      "family": "deepseek",
      "variants": [
        { "name": "deepseek-r1:7b", "ramGb": 5, "vramGb": 5, "quality": "reasoning" },
        { "name": "deepseek-r1:32b", "ramGb": 22, "vramGb": 22, "quality": "reasoning" }
      ]
    }
  ]
}
```

Recommendation logic: `os.totalmem()` gives total RAM in bytes. Treat 75% of it as the usable RAM budget (leaving headroom for the OS). Filter catalog entries where `ramGb <= totalRamGb * 0.75`. Return the highest-quality variant within budget plus a `recommendationReason` string.

### Anti-Patterns to Avoid

- **Polling Ollama in a loop:** Use a 60-second TTL in-memory cache (same as codex-models.ts `MODELS_CACHE_TTL_MS`). Do not re-probe on every API call.
- **Blocking server startup on Ollama check:** Ollama detection is on-demand (per-request), not at startup.
- **Hard-coding `localhost:11434`:** Always read from `process.env.OLLAMA_BASE_URL ?? "http://localhost:11434"` so users with non-standard ports work.
- **Requiring Ollama for Hermes:** All Ollama paths are optional. Hermes without Ollama continues to work unchanged. Never throw when Ollama is absent.
- **Overwriting all of stateJson:** Merge into stateJson using spread, never replace: `stateJson: { ...existingState, hermesModel: ..., hermesNativeSkillCount: ... }`.

---

## Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Ollama connectivity check | Custom TCP socket probe | `fetch` to `/api/version` with AbortController timeout | Reuses existing pattern from codex-models.ts |
| YAML config parsing | Full YAML parser | Existing `parseModelFromConfig` in hermes adapter | Already ships in hermes-paperclip-adapter/dist |
| System RAM reading | Shell commands | `os.totalmem()` | Built-in, no dep, works cross-platform |
| Token cost tracking | New billing logic | Existing `costService.createEvent` + `updateRuntimeState` | Already handles Hermes via regex-extracted usage |

---

## Common Pitfalls

### Pitfall 1: Hermes Does Not Have an "ollama" Provider

**What goes wrong:** Setting `adapterConfig.provider = "ollama"` causes Hermes to fail, because "ollama" is not an entry in `VALID_PROVIDERS` in `constants.js`.

**Why it happens:** Ollama mimics the OpenAI API, so Hermes treats it as `provider: "custom"` with `base_url: "http://localhost:11434/v1"`.

**How to avoid:** When a user selects an Ollama model, always write `provider: "custom"` and `base_url: "http://localhost:11434/v1"` into `adapterConfig`. These fields are already in the Hermes config schema (see `agentConfigurationDoc`).

**Warning signs:** Hermes stderr shows "unknown provider" or authentication errors during local model runs.

### Pitfall 2: Ollama API Returns Models at `/api/tags`, Not `/v1/models`

**What goes wrong:** Using the OpenAI-compat endpoint `/v1/models` to list models misses the `details` object (`parameter_size`, `quantization_level`, `family`) needed for OLLA-04.

**Why it happens:** `/v1/models` is OpenAI-compat, `/api/tags` is Ollama-native with richer data.

**How to avoid:** Use `GET localhost:11434/api/tags` for model listing (returns `details.parameter_size`, `details.family`). Use `/v1/models` only if passing through to Hermes.

### Pitfall 3: stateJson Merge Requires Read-Modify-Write

**What goes wrong:** `db.update(agentRuntimeState).set({ stateJson: newData })` overwrites other fields stored by other parts of the system.

**Why it happens:** Drizzle `.set()` replaces the entire column value.

**How to avoid:** Use a Postgres jsonb merge, ``stateJson: sql`${agentRuntimeState.stateJson} || ${JSON.stringify(patch)}::jsonb` ``, or read the existing `stateJson` first and then spread. The existing `ensureRuntimeState` call in `updateRuntimeState` already reads the row.

### Pitfall 4: HermesLocalConfigFields Uses adapterConfig for Both Create and Edit Modes

**What goes wrong:** Setting `provider` and `base_url` only in create mode loses the values on edit, or vice versa.

**Why it happens:** The `isCreate` flag switches between `set!({ model: v })` (create) and `mark("adapterConfig", "model", v)` (edit). Both paths must update all three fields (model, provider, base_url) when an Ollama model is selected.

**How to avoid:** When an Ollama model is selected, call the setter for all three config fields atomically. For create mode: `set!({ model, provider: "custom", base_url: "http://localhost:11434/v1" })`. For edit mode: three `mark()` calls or a compound helper.

### Pitfall 5: Ollama /api/ps Probe May Have No Models Running

**What goes wrong:** `/api/ps` returns an empty `models: []` when no model is currently loaded. This does not mean Ollama is absent.

**Why it happens:** Ollama only shows models in `/api/ps` when they are actively loaded in memory.

**How to avoid:** Use `/api/version` for detection (OLLA-01), `/api/tags` for the model list (OLLA-02), and `/api/ps` only for the optional "memory usage" metric in HERM-07, handling the empty case as "not currently loaded".

### Pitfall 6: HERM-06 Cost Tracking - Ollama Models Return Zero Cost

**What goes wrong:** Expecting a `cost_usd` value from runs using local Ollama models, when there is no external billing.

**Why it happens:** Hermes does not know the user's GPU/CPU cost. The `COST_REGEX` will not match if Hermes does not emit a cost line.

**How to avoid:** This is correct behavior. `normalizeBilledCostCents(undefined, "unknown")` returns `0`. Token usage may still be captured if Hermes emits token counts. Accept that Ollama-based runs show $0.00 in the cost UI; that is accurate.

---

## Code Examples

### Ollama /api/tags Response Shape (verified)

```typescript
// Source: https://docs.ollama.com/api/tags (verified 2026-04-01)
interface OllamaTagsResponse {
  models: Array<{
    name: string;        // "qwen2.5-coder:32b"
    model: string;       // same as name
    modified_at: string;
    size: number;        // bytes
    digest: string;
    details: {
      parent_model: string;
      format: string;              // "gguf"
      family: string;              // "qwen2"
      families: string[];
      parameter_size: string;      // "32.8B"
      quantization_level: string;  // "Q4_K_M"
    };
  }>;
}
```

### Ollama /api/ps Response Shape (verified)

```typescript
// Source: https://docs.ollama.com/api/tags (verified 2026-04-01)
interface OllamaPsResponse {
  models: Array<{
    name: string;
    model: string;
    size: number;
    digest: string;
    details: { /* same as tags */ };
    expires_at: string;
    size_vram: number;   // bytes used in VRAM
  }>;
}
```

### Reading hermes-adapter stateJson Hermes fields

```typescript
// In AgentDetail.tsx HermesRuntimeCard - read from runtimeState.stateJson
const hermesModel = runtimeState.stateJson?.hermesModel as string | undefined;
const hermesNativeSkillCount = runtimeState.stateJson?.hermesNativeSkillCount as number | undefined;
const hermesMemoryBytes =
  runtimeState.stateJson?.hermesMemoryBytes as number | undefined;
```

### Hermes Ollama adapterConfig (what to write)

```typescript
// When user selects an Ollama model in config-fields.tsx:
//   model    = "qwen2.5-coder:32b"   (bare Ollama model name)
//   provider = "custom"              (OpenAI-compatible endpoint)
//   base_url = "http://localhost:11434/v1"

// For create mode:
set!({ model, provider: "custom", base_url: "http://localhost:11434/v1" })

// For edit mode:
mark("adapterConfig", "model", model);
mark("adapterConfig", "provider", "custom");
mark("adapterConfig", "base_url", "http://localhost:11434/v1");
```

### Cost Tracking - Already Wired (HERM-06 context)

```typescript
// Source: server/src/services/heartbeat.ts:updateRuntimeState
// Hermes execute.ts returns:
//   result.usage = { inputTokens, outputTokens }  (from regex)
//   result.costUsd = number | undefined            (from regex, usually undefined for local)
//
// heartbeat.ts normalizes:
const usage = normalizeUsageTotals(result.usage);
const additionalCostCents = normalizeBilledCostCents(result.costUsd, billingType);

// Then:
if (additionalCostCents > 0 || hasTokenUsage) {
  await costs.createEvent(companyId, { ... model: result.model ?? "unknown" ... });
}
// → For Ollama: costCents=0, but inputTokens/outputTokens may be > 0 → cost event recorded
// → If Hermes doesn't emit token counts: no event recorded (correct behavior)
```

---

## HERM-05: Skill Visibility - What Is Already Done vs. What Is Missing

### Already Done (data layer is complete)

- `skillRegistryService.syncHermesNativeSkills(agentId)` scans `~/.hermes/skills/` and inserts `source: "native"` rows
- Called automatically from `GET /skill-registry/agents/:agentId/skills` when `adapterType === "hermes_local"`
- Returns `AgentSkillEntry[]` with `{ skillId, source, installedAt }`, covering both `"native"` and `"managed"` source values
- Hermes adapter `listHermesSkills` returns snapshot with `originLabel: "Hermes skill"` and `readOnly: true` for native skills

### What Is Missing (UI rendering in AgentSkillsTab)

The `unmanagedSkillRows` section in `AgentSkillsTab` (AgentDetail.tsx:2566) renders read-only adapter entries. It uses `entry.originLabel` and `entry.locationLabel` for display. Hermes native skills already flow through this path.

The gap: the UI may not clearly distinguish "Hermes skill" entries from other unmanaged entries. The `originLabel: "Hermes skill"` badge rendering and skill count display are the UI additions needed. This is a targeted render update to `AgentSkillsTab`, not a new data flow.

---

## HERM-07: Dashboard Hermes Runtime Info

### What to Store in stateJson

```typescript
// Written by heartbeat.ts updateRuntimeState after a Hermes run
{
  hermesModel: string;               // e.g. "qwen2.5-coder:32b" or "anthropic/claude-sonnet-4"
  hermesNativeSkillCount: number;    // from skillRegistryService query
  hermesMemoryBytes: number | null;  // from /api/ps size_vram, null if unavailable
}
```

### Where to Write stateJson

In `heartbeat.ts:updateRuntimeState`, after the existing `db.update(agentRuntimeState).set(...)` call, add a second update that merges hermes-specific fields when `agent.adapterType === "hermes_local"`. Read `result.model` for `hermesModel`. Query `skillRegistryDb` for `hermesNativeSkillCount`. Query Ollama `/api/ps` for `hermesMemoryBytes` (non-blocking, fire-and-forget).
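
The merge and the `/api/ps` lookup described above could be sketched as a few pure helpers. This is a minimal sketch under stated assumptions: the helper names (`findLoadedModelVram`, `mergeHermesState`, `formatMemoryBytes`) are hypothetical and do not exist in the codebase, and the 1024³ divisor for the "X.X GB" label is a guess. Only the `stateJson` field names and the `size_vram` field come from this research.

```typescript
// Hypothetical helpers for the HERM-07 stateJson patch; names are illustrative only.
interface HermesStatePatch {
  hermesModel: string;
  hermesNativeSkillCount: number;
  hermesMemoryBytes: number | null;
}

// Subset of one /api/ps entry that this sketch needs.
interface OllamaPsModel {
  name: string;
  size_vram: number;
}

// Returns the VRAM bytes for the model if it is currently loaded, otherwise null.
// An empty /api/ps list means "not loaded", not "Ollama absent".
function findLoadedModelVram(psModels: OllamaPsModel[], modelName: string): number | null {
  const match = psModels.find((m) => m.name === modelName);
  return match ? match.size_vram : null;
}

// Spread-merge so keys written by other parts of the system survive the update.
function mergeHermesState(
  existing: Record<string, unknown> | null | undefined,
  patch: HermesStatePatch,
): Record<string, unknown> {
  return { ...(existing ?? {}), ...patch };
}

// Formats bytes for the runtime card; assumes 1 GB = 1024^3 bytes.
function formatMemoryBytes(bytes: number | null | undefined): string {
  if (bytes == null) return "Not loaded";
  return `${(bytes / 1024 ** 3).toFixed(1)} GB`;
}
```

Keeping these pure makes them easy to cover in `ollama-service.test.ts` without a database or a running Ollama daemon; the actual write would still go through `updateRuntimeState`.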

### What to Render

A `HermesRuntimeCard` component in `AgentOverview` (gated by `adapterType === "hermes_local"`):

- Model name (from stateJson.hermesModel)
- Native skill count (from stateJson.hermesNativeSkillCount)
- Memory usage (from stateJson.hermesMemoryBytes, formatted as "X.X GB" or "Not loaded")

---

## Environment Availability

| Dependency | Required By | Available | Version | Fallback |
|------------|------------|-----------|---------|----------|
| Ollama daemon | OLLA-01 through OLLA-05 | No (not installed) | - | All paths degrade gracefully; UI shows install instructions |
| hermes-paperclip-adapter | HERM-05, HERM-06, HERM-07 | Yes | 0.2.1 | - |
| Node.js fetch | Ollama HTTP probing | Yes | built-in (Node 18+) | - |
| Node.js os module | OLLA-04 RAM reading | Yes | built-in | - |
| Vitest | Tests | Yes | (server vitest.config.ts) | - |

**Missing dependencies with no fallback:** None. All Ollama features degrade gracefully when Ollama is absent.

**Pre-existing test failures (not Phase 28 regressions):** 4 test files failing before Phase 28 begins:

- `app-hmr-port.test.ts`
- `plugin-worker-manager.test.ts`
- `heartbeat-workspace-session.test.ts` (5 tests)
- `skill-registry-routes.test.ts` (1 test)

---

## Validation Architecture

### Test Framework

| Property | Value |
|----------|-------|
| Framework | Vitest (server) |
| Config file | `server/vitest.config.ts` |
| Quick run command | `cd server && npx vitest run src/__tests__/ollama-service.test.ts` |
| Full suite command | `cd server && npx vitest run` |

### Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| OLLA-01 | `detectOllama()` returns `installed: false` when Ollama absent | unit | `npx vitest run src/__tests__/ollama-service.test.ts` | No (Wave 0) |
| OLLA-01 | `detectOllama()` returns `installed: true` + version when Ollama present | unit | same | No (Wave 0) |
| OLLA-01 | `detectOllama()` times out cleanly (AbortController) | unit | same | No (Wave 0) |
| OLLA-02 | `listOllamaModels()` returns AdapterModel[] from /api/tags | unit | same | No (Wave 0) |
| OLLA-04 | `buildModelRecommendation()` returns correct model for given RAM budget | unit | same | No (Wave 0) |
| OLLA-05 | Routes return `installUrl` when Ollama absent | unit | same | No (Wave 0) |
| HERM-05 | Skills tab renders `originLabel: "Hermes skill"` badge | manual-only | - | - |
| HERM-06 | `updateRuntimeState` records cost event when Hermes emits token data | unit (existing pattern) | `npx vitest run src/__tests__/costs-service.test.ts` | Yes |
| HERM-07 | stateJson receives hermesModel/hermesNativeSkillCount after run | unit | `npx vitest run src/__tests__/ollama-service.test.ts` | No (Wave 0) |

### Sampling Rate

- **Per task commit:** `cd server && npx vitest run src/__tests__/ollama-service.test.ts`
- **Per wave merge:** `cd server && npx vitest run`
- **Phase gate:** Full suite green before `/gsd:verify-work` (excluding 4 pre-existing failures)

### Wave 0 Gaps

- [ ] `server/src/__tests__/ollama-service.test.ts`: covers OLLA-01, OLLA-02, OLLA-04, OLLA-05, HERM-07 stateJson logic
- [ ] Test stubs use mock fetch (AbortController pattern); no real Ollama needed

---

## State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Manual text entry for Hermes model | Dropdown fed from Ollama + manual fallback | Phase 28 | Better UX for local models |
| stateJson unused for Hermes | stateJson stores hermesModel, skillCount, memoryBytes | Phase 28 | Dashboard can show runtime info |
| Hermes native skills in separate table only | Skills tab renders both managed + native in unified view | Phase 28 (HERM-05 completion) | Unified skill surface |

---

## Open Questions

1. **Should Ollama route be gated to hermes_local only?**
   - What we know: Only Hermes uses the Ollama custom endpoint pattern currently
   - What's unclear: Future adapters (Phase 29 defaults) may also use Ollama
   - Recommendation: Mount under `/companies/:companyId/ollama/*` without adapter-type gating. The endpoint is useful generically and Pi/OpenCode adapters may benefit in Phase 29

2. **Should listOllamaModels also extend the hermes adapter's `listModels` function?**
   - What we know: `listAdapterModels("hermes_local")` already calls `adapter.listModels()` if present; hermes adapter has no `listModels` implementation (returns `models: []`)
   - What's unclear: Whether to add `listModels` to hermes adapter (requires adapter package change) or use a separate Ollama API route in Nexus
   - Recommendation: Use a separate Nexus route (`/companies/:companyId/ollama/models`). Avoids changing the hermes-paperclip-adapter package (external dependency). The config-fields.tsx component can call the Nexus route directly. **Do not modify the hermes-paperclip-adapter package.**

3. **stateJson hermesNativeSkillCount: count from skillRegistry or from adapter snapshot?**
   - What we know: `skillRegistryDb` is a separate libSQL DB; querying it in `updateRuntimeState` adds cross-DB complexity
   - What's unclear: Is the extra query worth it for a display-only count?
   - Recommendation: Store the count from `result.resultJson` if Hermes emits it, or derive from the adapter skill snapshot after run. Alternatively, skip native skill count from stateJson and derive it in the UI from `agentsApi.skills(agentId)` query. The UI approach avoids cross-DB concerns in heartbeat.
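
The UI-side option in open question 3 could look roughly like the following. It is a sketch, not the actual component code: it assumes the skills query resolves to the `AgentSkillEntry[]` shape listed under HERM-05 (`skillId`, `source`, `installedAt`), that `installedAt` is a string, and the helper name `countNativeSkills` is hypothetical.

```typescript
// Assumed shape of one entry from the agent skills query (see the HERM-05 section).
interface AgentSkillEntry {
  skillId: string;
  source: "native" | "managed";
  installedAt: string; // assumed to be an ISO timestamp string
}

// Counts Hermes native skills from the already-fetched skill list,
// so heartbeat.ts never has to query the separate skill registry DB.
function countNativeSkills(entries: AgentSkillEntry[]): number {
  return entries.filter((entry) => entry.source === "native").length;
}
```

With this approach `hermesNativeSkillCount` would not need to live in `stateJson` at all; the card would compute it from the same query the Skills tab already issues.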

---

## Sources

### Primary (HIGH confidence)

- hermes-paperclip-adapter@0.2.1 dist source code (`execute.js`, `skills.js`, `detect-model.js`, `test.js`, `constants.js`), read directly from `/opt/nexus/server/node_modules/hermes-paperclip-adapter/dist/`
- Nexus codebase (`server/src/services/heartbeat.ts`, `server/src/services/costs.ts`, `server/src/services/skill-registry.ts`, `ui/src/pages/AgentDetail.tsx`, `ui/src/adapters/hermes-local/config-fields.tsx`), read directly
- Ollama REST API, `https://docs.ollama.com/api/tags`: verified /api/tags response shape with `details.parameter_size`, `details.family`, `details.quantization_level`
- Node.js built-ins (`os.totalmem()`, `fetch` with AbortController): confirmed available in Node 18+ runtime

### Secondary (MEDIUM confidence)

- Hermes Agent provider docs, `https://hermes-agent.nousresearch.com/docs/integrations/providers/`: verified "ollama uses custom provider + localhost:11434/v1 base_url"
- Hermes Agent + Ollama guide (Medium/Substack articles cross-referencing official docs): confirmed custom endpoint configuration steps

### Tertiary (LOW confidence)

- Ollama model RAM requirements (catalog), from community sources + Ollama model page tags: use conservative estimates; verify against https://ollama.com/library model pages before shipping

---

## Metadata

**Confidence breakdown:**

- Ollama API: HIGH (verified from official docs, response shapes confirmed)
- Hermes + Ollama provider mapping: HIGH (verified from official Hermes provider docs)
- Standard stack: HIGH (all existing infrastructure confirmed from source code)
- Architecture patterns: HIGH (follow existing codex-models.ts, heartbeat.ts, config-fields.tsx patterns exactly)
- HERM-05 data layer status: HIGH (verified syncHermesNativeSkills exists and is already called)
- HERM-06 cost tracking: HIGH (execute.js returns usage/costUsd, heartbeat.ts wires it to costService)
- Pitfalls: HIGH (derived from actual source code analysis)

**Research date:** 2026-04-01
**Valid until:** 2026-05-01 (Ollama API is stable; hermes-paperclip-adapter may receive new releases)