nexus/.planning/milestones/v1.4-phases/28-ollama-integration/28-RESEARCH.md

Phase 28: Ollama Integration & Agent Surface - Research

Researched: 2026-04-01
Domain: Ollama HTTP API, Hermes adapter extension, agent dashboard UI, cost tracking
Confidence: HIGH

Summary

Phase 28 adds three distinct capabilities on top of the completed Phase 27 Hermes adapter: (1) Ollama detection and model catalog: Nexus queries localhost:11434 to detect Ollama, lists available models, and ships a static JSON catalog for hardware-aware recommendations; (2) Hermes config surface extension: the model field in config-fields.tsx becomes a dropdown fed by live Ollama discovery rather than a free-text input, and two adapterConfig fields (provider: custom and base_url) route Hermes to the local endpoint; (3) Hermes runtime data in the dashboard: stateJson in agentRuntimeState is the right place to store Hermes-specific runtime metadata (model name, native skill count, memory usage from Ollama's /api/ps), and the AgentOverview component in AgentDetail.tsx is the right insertion point.

The most important finding is that Hermes does not have a native "ollama" provider. Ollama is configured as a custom OpenAI-compatible endpoint: provider: custom, base_url: http://localhost:11434/v1. The model field passes the Ollama model name bare (e.g. qwen2.5-coder:32b). This shapes OLLA-02, OLLA-03, and the config-fields.tsx changes.

For cost tracking (HERM-06): hermes-paperclip-adapter@0.2.1 already parses token_usage and cost regex patterns from Hermes stdout. When Hermes returns non-zero usage, heartbeat.ts:updateRuntimeState already calls costService.createEvent. The only gap is that Hermes running local Ollama models will have costUsd = undefined (no billing) — the infrastructure handles this correctly (zero cost event is suppressed when additionalCostCents === 0 && !hasTokenUsage). No cost tracking code changes are needed for local models; the planner just needs to verify the regex path works end-to-end.

For HERM-05 (skill visibility): syncHermesNativeSkills already exists in skillRegistryService and is already called from the GET /skill-registry/agents/:agentId/skills route when adapterType === "hermes_local". The Hermes adapter's listHermesSkills function merges Paperclip-managed and native skills. The integration is already complete at the data layer. What is missing is the UI surface in the Skills tab that renders the originLabel: "Hermes skill" / readOnly: true entries distinctly from managed skills.

Primary recommendation: Implement as four focused plans — (P01) server-side Ollama service + routes; (P02) Hermes config-fields UI extension for Ollama model selection; (P03) dashboard Hermes runtime info card; (P04) model catalog JSON + recommendation logic.


<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.

Claude's Discretion

All implementation choices are at Claude's discretion.

Deferred Ideas (OUT OF SCOPE)

None — discuss phase skipped. Refer to REQUIREMENTS.md for in-scope requirements.

Out of scope per REQUIREMENTS.md:

  • Multi-provider model routing (Hermes can use OpenRouter/Anthropic/OpenAI but that's Hermes config, not Nexus)
  • Hermes MCP server management
  • Custom Hermes skill authoring UI
  • DFLT-01 through DFLT-04 (Phase 29)

</user_constraints>

<phase_requirements>

Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| OLLA-01 | Nexus detects whether Ollama is installed locally | HTTP probe to localhost:11434/api/version; new server service ollamaService |
| OLLA-02 | User can see list of available Ollama models when configuring a Hermes agent | GET /api/tags from Ollama HTTP API; new server route GET /companies/:id/ollama/models; config-fields.tsx dropdown |
| OLLA-03 | User can configure a Hermes agent with any local Ollama model | Sets adapterConfig.model = <model-name>, adapterConfig.provider = "custom", adapterConfig.base_url = "http://localhost:11434/v1" |
| OLLA-04 | Model recommendation based on RAM/VRAM from a shipped catalog | Static JSON catalog in server/src/data/ollama-model-catalog.json; server reads os.totalmem() to filter; returned with model list |
| OLLA-05 | If Ollama is not present, user is offered installation instructions | Ollama status endpoint returns installed: false + installUrl; UI shows callout in Hermes config-fields |
| HERM-05 | Nexus-managed skills visible alongside Hermes native skills in agent config | Already wired at data layer; UI Skills tab needs originLabel: "Hermes skill" rendering distinction |
| HERM-06 | Cost tracking captures token usage and model costs for Hermes agents | Infrastructure already handles this; verify end-to-end with local Ollama (zero cost is correct, no change needed) |
| HERM-07 | Dashboard shows Hermes-specific info (model name, memory usage, native skill count) | Store in agentRuntimeState.stateJson; render in AgentOverview component |
</phase_requirements>

Standard Stack

Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| Node.js os module | built-in | Read total system RAM | Already used in heartbeat.ts; no new dep |
| Node.js fetch | Node 18+ built-in | HTTP calls to Ollama API at localhost:11434 | Already confirmed available in runtime |
| hermes-paperclip-adapter | 0.2.1 (installed) | Hermes execution, skill sync, model detection | Already wired into adapter registry |

No New Dependencies Required

All capabilities needed for Phase 28 are achievable with existing infrastructure:

  • Ollama HTTP API is probed with fetch (built-in Node 18+)
  • Model catalog is a static JSON file in the server package
  • RAM reading uses os.totalmem() (built-in)
  • Hermes Ollama configuration uses existing adapterConfig fields

Architecture Patterns

server/src/services/ollama.ts          # ollamaService — detect + list models
server/src/routes/ollama.ts            # HTTP routes: /companies/:id/ollama/status, /models
server/src/data/ollama-model-catalog.json  # shipped catalog for OLLA-04
server/src/__tests__/ollama-service.test.ts  # unit tests for ollamaService
ui/src/api/ollama.ts                   # ollamaApi client — wraps server routes

Pattern 1: Ollama Service (server-side)

// server/src/services/ollama.ts
const OLLAMA_BASE_URL = process.env.OLLAMA_BASE_URL ?? "http://localhost:11434";
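const OLLAMA_TIMEOUT_MS = 3000;
// Assumed install page for the OLLA-05 callout; confirm the URL before shipping.
const INSTALL_URL = "https://ollama.com/download";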

export interface OllamaStatus {
  installed: boolean;
  version: string | null;
  installUrl: string;
}

export interface OllamaModel {
  name: string;           // e.g. "qwen2.5-coder:32b"
  parameterSize: string;  // e.g. "32.8B" from /api/tags details
  quantization: string;   // e.g. "Q4_K_M"
  sizeBytes: number;
  family: string;         // e.g. "qwen2"
  recommended: boolean;   // from catalog match + RAM check
  recommendationReason: string | null;
}

export async function detectOllama(): Promise<OllamaStatus> {
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), OLLAMA_TIMEOUT_MS);
  try {
    const res = await fetch(`${OLLAMA_BASE_URL}/api/version`, {
      signal: controller.signal,
    });
    if (!res.ok) return { installed: false, version: null, installUrl: INSTALL_URL };
    const body = await res.json() as { version?: string };
    return { installed: true, version: body.version ?? null, installUrl: INSTALL_URL };
  } catch {
    return { installed: false, version: null, installUrl: INSTALL_URL };
  } finally {
    clearTimeout(timeout);
  }
}

Why this pattern: Matches the existing codex-models.ts pattern — HTTP fetch with timeout, graceful failure returns empty/false rather than throwing. The 3s timeout prevents hanging requests when Ollama is not installed.

Pattern 2: Ollama Routes (mounted under /companies/:companyId)

GET /companies/:companyId/ollama/status
  → { installed: boolean, version: string|null, installUrl: string }

GET /companies/:companyId/ollama/models
  → { models: OllamaModel[], ramGb: number }

Both routes use existing assertCompanyAccess(req, companyId) authz pattern from agents.ts.

Mount in server/src/routes/index.ts alongside the existing agentsRoutes.

Pattern 3: Hermes Config-Fields Enhancement

The existing HermesLocalConfigFields in config-fields.tsx has a free-text Model input. For Ollama support, it becomes a hybrid: dropdown (when Ollama is present) + manual entry fallback.

// Fetch Ollama status + models (only for hermes_local adapter)
const { data: ollamaStatus } = useQuery({
  queryKey: ["ollama", "status", companyId],
  queryFn: () => ollamaApi.status(companyId!),
  enabled: Boolean(companyId),
});

const { data: ollamaModels } = useQuery({
  queryKey: ["ollama", "models", companyId],
  queryFn: () => ollamaApi.models(companyId!),
  enabled: Boolean(companyId && ollamaStatus?.installed),
});

When ollamaStatus.installed === false, render an install callout (OLLA-05) instead of the dropdown.

When a local Ollama model is selected, buildHermesConfig (or mark) must also set provider: "custom" and base_url: "http://localhost:11434/v1" in adapterConfig. This is the critical mapping from OLLA-03.

Pattern 4: Hermes Runtime Data in stateJson (HERM-07)

agentRuntimeState.stateJson is jsonb typed as Record<string, unknown>. The heartbeat service writes this via updateRuntimeState. The Hermes adapter's execute.ts already returns resultJson with session_id, usage, and cost_usd.

For HERM-07 runtime data (model name, native skill count, memory usage), the server-side approach is:

  • After a Hermes run completes, read resultJson.result and extract/store model + detected skill count into stateJson
  • Optionally query Ollama /api/ps (running models) to get size_vram for memory usage display

Insertion point for stateJson patch: heartbeat.ts:updateRuntimeState already calls db.update(agentRuntimeState).set(...). Add a stateJson merge here when adapterType === "hermes_local".

UI insertion point: AgentOverview component in AgentDetail.tsx (line ~1183). Add a HermesRuntimeCard component after the charts section, gated by agent.adapterType === "hermes_local":

{agent.adapterType === "hermes_local" && runtimeState && (
  <HermesRuntimeCard runtimeState={runtimeState} />
)}

Pattern 5: Model Catalog JSON (OLLA-04)

// server/src/data/ollama-model-catalog.json
{
  "models": [
    {
      "family": "qwen2",
      "variants": [
        { "name": "qwen2.5-coder:7b",  "ramGb": 5,  "vramGb": 5,  "quality": "fast" },
        { "name": "qwen2.5-coder:32b", "ramGb": 22, "vramGb": 22, "quality": "best" }
      ]
    },
    {
      "family": "llama",
      "variants": [
        { "name": "llama3.2:3b",  "ramGb": 3,  "vramGb": 3,  "quality": "fast" },
        { "name": "llama3.1:8b",  "ramGb": 6,  "vramGb": 6,  "quality": "balanced" },
        { "name": "llama3.1:70b", "ramGb": 48, "vramGb": 48, "quality": "best" }
      ]
    },
    {
      "family": "mistral",
      "variants": [
        { "name": "mistral:7b",   "ramGb": 5,  "vramGb": 5,  "quality": "balanced" },
        { "name": "mistral-small:22b", "ramGb": 14, "vramGb": 14, "quality": "best" }
      ]
    },
    {
      "family": "phi",
      "variants": [
        { "name": "phi4:14b",    "ramGb": 10, "vramGb": 10, "quality": "balanced" }
      ]
    },
    {
      "family": "deepseek",
      "variants": [
        { "name": "deepseek-r1:7b",  "ramGb": 5,  "vramGb": 5,  "quality": "reasoning" },
        { "name": "deepseek-r1:32b", "ramGb": 22, "vramGb": 22, "quality": "reasoning" }
      ]
    }
  ]
}

Recommendation logic: os.totalmem() gives total RAM. Use 75% as usable RAM budget (leave OS headroom). Filter catalog entries where ramGb <= totalRamGb * 0.75. Return the highest-quality variant within budget plus a recommendationReason string.
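The recommendation rule can be sketched as a pure function, which keeps it easy to unit test without touching os.totalmem(). This is a minimal sketch under stated assumptions: the QUALITY_RANK ordering (in particular where "reasoning" sits) and the tie-break toward the larger model are illustrative choices, not something the catalog defines.

```typescript
// Shapes mirror server/src/data/ollama-model-catalog.json.
interface CatalogVariant {
  name: string;
  ramGb: number;
  vramGb: number;
  quality: string;
}

interface ModelCatalog {
  models: Array<{ family: string; variants: CatalogVariant[] }>;
}

// Assumption: relative ranking of quality tiers. "reasoning" is ranked with "balanced" here.
const QUALITY_RANK: Record<string, number> = { fast: 1, balanced: 2, reasoning: 2, best: 3 };

const BYTES_PER_GB = 1024 * 1024 * 1024;
const USABLE_RAM_FRACTION = 0.75; // leave headroom for the OS

export function buildModelRecommendation(
  catalog: ModelCatalog,
  totalRamBytes: number,
): { name: string; recommendationReason: string } | null {
  const budgetGb = (totalRamBytes / BYTES_PER_GB) * USABLE_RAM_FRACTION;
  const rank = (variant: CatalogVariant): number => QUALITY_RANK[variant.quality] ?? 0;

  let best: CatalogVariant | null = null;
  for (const family of catalog.models) {
    for (const variant of family.variants) {
      if (variant.ramGb > budgetGb) continue; // does not fit in the RAM budget
      if (
        best === null ||
        rank(variant) > rank(best) ||
        (rank(variant) === rank(best) && variant.ramGb > best.ramGb)
      ) {
        best = variant;
      }
    }
  }

  if (best === null) return null;
  return {
    name: best.name,
    recommendationReason: `Needs about ${best.ramGb} GB; ${budgetGb.toFixed(1)} GB usable on this machine`,
  };
}
```

In the service, the second argument would come from os.totalmem(). Returning null when nothing fits lets the UI fall back to the plain model list without a recommendation.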

Anti-Patterns to Avoid

  • Polling Ollama in a loop: Use a 60-second TTL in-memory cache (same as codex-models.ts MODELS_CACHE_TTL_MS). Do not re-probe on every API call.
  • Blocking server startup on Ollama check: Ollama detection is on-demand (per-request), not at startup.
  • Hard-coding localhost:11434: Always read from process.env.OLLAMA_BASE_URL ?? "http://localhost:11434" so users with non-standard ports work.
  • Requiring Ollama for Hermes: All Ollama paths are optional. Hermes without Ollama continues to work unchanged. Never throw when Ollama is absent.
  • Overwriting all of stateJson: Merge into stateJson using spread, never replace: stateJson: { ...existingState, hermesModel: ..., hermesNativeSkillCount: ... }.
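The 60-second cache from the first anti-pattern could look roughly like the helper below. createTtlCache is a hypothetical name, and the injectable clock exists only to make the sketch testable; the actual codex-models.ts cache may be structured differently.

```typescript
// Hypothetical helper: wraps a loader in a time-based in-memory cache.
export function createTtlCache<T>(
  load: () => T,
  ttlMs: number,
  now: () => number = Date.now,
): () => T {
  let cached: { value: T; expiresAt: number } | null = null;

  return () => {
    const current = now();
    if (cached !== null && current < cached.expiresAt) {
      return cached.value; // still fresh: skip the probe
    }
    const value = load();
    cached = { value, expiresAt: current + ttlMs };
    return value;
  };
}
```

When load returns a promise (for example a wrapper around detectOllama), the promise itself is cached, so concurrent requests inside the TTL share a single probe. That is only safe for loaders that resolve to a fallback value instead of rejecting.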

Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Ollama connectivity check | Custom TCP socket probe | fetch to /api/version with AbortController timeout | Reuses existing pattern from codex-models.ts |
| YAML config parsing | Full YAML parser | Existing parseModelFromConfig in hermes adapter | Already ships in hermes-paperclip-adapter/dist |
| System RAM reading | Shell commands | os.totalmem() | Built-in, no dep, works cross-platform |
| Token cost tracking | New billing logic | Existing costService.createEvent + updateRuntimeState | Already handles Hermes via regex-extracted usage |

Common Pitfalls

Pitfall 1: Hermes Does Not Have an "ollama" Provider

What goes wrong: Setting adapterConfig.provider = "ollama" causes Hermes to fail. "ollama" is not a valid VALID_PROVIDERS entry in constants.js.
Why it happens: Ollama mimics the OpenAI API, so Hermes treats it as provider: "custom" with base_url: "http://localhost:11434/v1".
How to avoid: When a user selects an Ollama model, always write provider: "custom" and base_url: "http://localhost:11434/v1" into adapterConfig. These fields are already in the Hermes config schema (see agentConfigurationDoc).
Warning signs: Hermes stderr shows "unknown provider" or authentication errors during local model runs.

Pitfall 2: Ollama API Returns Models at /api/tags, Not /v1/models

What goes wrong: Using the OpenAI-compat endpoint /v1/models to list models misses the details object (parameter_size, quantization_level, family) needed for OLLA-04.
Why it happens: /v1/models is OpenAI-compat, while /api/tags is Ollama-native with richer data.
How to avoid: Use GET localhost:11434/api/tags for model listing (returns details.parameter_size, details.quantization_level, details.family). Use /v1/models only if passing through to Hermes.

Pitfall 3: stateJson Merge Requires Read-Modify-Write

What goes wrong: db.update(agentRuntimeState).set({ stateJson: newData }) overwrites other fields stored by other parts of the system.
Why it happens: Drizzle .set() replaces the entire column value.
How to avoid: Use a Postgres jsonb merge, stateJson: sql`${agentRuntimeState.stateJson} || ${JSON.stringify(patch)}::jsonb`, or read the existing stateJson first, then spread. The existing ensureRuntimeState call in updateRuntimeState already reads the row.
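The read-then-spread variant can be isolated in a small pure function so the merge rule is testable apart from the database call. This is a sketch: mergeHermesState and HermesStatePatch are hypothetical names, and the field names follow the stateJson keys proposed for HERM-07.

```typescript
type StateJson = Record<string, unknown>;

// Hypothetical patch shape for the Hermes-specific runtime fields.
interface HermesStatePatch {
  hermesModel?: string;
  hermesNativeSkillCount?: number;
  hermesMemoryBytes?: number | null;
}

export function mergeHermesState(
  existing: StateJson | null | undefined,
  patch: HermesStatePatch,
): StateJson {
  // Spread the existing state first so unrelated keys survive;
  // keys present in the patch win.
  return { ...(existing ?? {}), ...patch };
}
```

The result is what would be passed to .set({ stateJson: ... }). Unlike the jsonb || operator, this approach is not atomic: a concurrent writer between the read and the write can still be overwritten.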

Pitfall 4: HermesLocalConfigFields Uses adapterConfig for Both Create and Edit Modes

What goes wrong: Setting provider and base_url only in create mode loses the values on edit, or vice versa.
Why it happens: The isCreate flag switches between set!({ model: v }) (create) and mark("adapterConfig", "model", v) (edit). Both paths must update all three fields (model, provider, base_url) when an Ollama model is selected.
How to avoid: When an Ollama model is selected, call the setter for all three config fields atomically. For create mode: set!({ model, provider: "custom", base_url: "http://localhost:11434/v1" }). For edit mode: three mark() calls or a compound helper.
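A compound helper of the kind mentioned above might look like this. The CreateSetter and EditMarker signatures are assumptions modeled on the set!/mark calls described in this pitfall; the real props in config-fields.tsx may be typed differently.

```typescript
// Hypothetical setter shapes, modeled on the set!/mark calls in config-fields.tsx.
type CreateSetter = (values: Record<string, string>) => void;
type EditMarker = (group: "adapterConfig", field: string, value: string) => void;

const OLLAMA_OPENAI_BASE_URL = "http://localhost:11434/v1";

export function applyOllamaModel(
  model: string,
  mode: { isCreate: true; set: CreateSetter } | { isCreate: false; mark: EditMarker },
): void {
  const fields = { model, provider: "custom", base_url: OLLAMA_OPENAI_BASE_URL };

  if (mode.isCreate) {
    // Create mode: one call carries all three fields.
    mode.set(fields);
    return;
  }

  // Edit mode: mark each field so none of the three can be forgotten.
  mode.mark("adapterConfig", "model", fields.model);
  mode.mark("adapterConfig", "provider", fields.provider);
  mode.mark("adapterConfig", "base_url", fields.base_url);
}
```

Routing both modes through one function means a future fourth field only has to be added in one place.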

Pitfall 5: Ollama /api/ps Probe May Have No Models Running

What goes wrong: /api/ps returns an empty models: [] when no model is currently loaded. This does not mean Ollama is absent.
Why it happens: Ollama only shows models in /api/ps when they are actively loaded in memory.
How to avoid: Use /api/version for detection (OLLA-01), /api/tags for the model list (OLLA-02), and /api/ps only for the optional "memory usage" metric in HERM-07, handling the empty case as "not currently loaded".

Pitfall 6: HERM-06 Cost Tracking — Ollama Models Return Zero Cost

What goes wrong: Expecting a cost_usd value from runs using local Ollama models, where there is no external billing.
Why it happens: Hermes does not know the user's GPU/CPU cost. The COST_REGEX will not match if Hermes does not emit a cost line.
How to avoid: This is correct behavior. normalizeBilledCostCents(undefined, "unknown") returns 0. Token usage may still be captured if Hermes emits token counts. Accept that Ollama-based runs show $0.00 in the cost UI; that is accurate.


Code Examples

Ollama /api/tags Response Shape (verified)

// Source: https://docs.ollama.com/api/tags (verified 2026-04-01)
interface OllamaTagsResponse {
  models: Array<{
    name: string;             // "qwen2.5-coder:32b"
    model: string;            // same as name
    modified_at: string;
    size: number;             // bytes
    digest: string;
    details: {
      parent_model: string;
      format: string;         // "gguf"
      family: string;         // "qwen2"
      families: string[];
      parameter_size: string; // "32.8B"
      quantization_level: string; // "Q4_K_M"
    };
  }>;
}
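Mapping the /api/tags payload onto the OllamaModel fields from Pattern 1 could be sketched as below. mapTagsToModels and ListedModel are hypothetical names, and the recommended/recommendationReason fields are left out because they depend on the catalog match.

```typescript
// Subset of the /api/tags response that the mapping needs.
interface TagsModel {
  name: string;
  size: number;
  details: {
    family: string;
    parameter_size: string;
    quantization_level: string;
  };
}

// OllamaModel without the recommendation fields (hypothetical intermediate type).
interface ListedModel {
  name: string;
  parameterSize: string;
  quantization: string;
  sizeBytes: number;
  family: string;
}

export function mapTagsToModels(response: { models?: TagsModel[] }): ListedModel[] {
  const listed: ListedModel[] = [];
  // A missing models array is treated as "no models" instead of an error.
  for (const model of response.models ?? []) {
    listed.push({
      name: model.name,
      parameterSize: model.details.parameter_size,
      quantization: model.details.quantization_level,
      sizeBytes: model.size,
      family: model.details.family,
    });
  }
  return listed;
}
```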

Ollama /api/ps Response Shape (verified)

// Source: Ollama API docs, /api/ps endpoint (verified 2026-04-01)
interface OllamaPsResponse {
  models: Array<{
    name: string;
    model: string;
    size: number;
    digest: string;
    details: { /* same as tags */ };
    expires_at: string;
    size_vram: number;  // bytes used in VRAM
  }>;
}
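Extracting the memory metric for one model from this payload could be sketched as follows. vramBytesForModel is a hypothetical name, and matching on the name field is an assumption about how the configured model is identified.

```typescript
// Subset of the /api/ps response needed for the memory metric.
interface PsModel {
  name: string;
  size_vram: number;
}

export function vramBytesForModel(
  ps: { models?: PsModel[] },
  modelName: string,
): number | null {
  for (const running of ps.models ?? []) {
    if (running.name === modelName) return running.size_vram;
  }
  // Empty or non-matching list: the model is not currently loaded.
  return null;
}
```

A null result maps directly to hermesMemoryBytes: null, so an idle Ollama is never mistaken for a missing one.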

Reading hermes-adapter stateJson Hermes fields

// In AgentDetail.tsx HermesRuntimeCard — read from runtimeState.stateJson
const hermesModel = runtimeState.stateJson?.hermesModel as string | undefined;
const hermesNativeSkillCount = runtimeState.stateJson?.hermesNativeSkillCount as number | undefined;
const hermesMemoryBytes = runtimeState.stateJson?.hermesMemoryBytes as number | undefined;
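Because stateJson is typed as Record<string, unknown>, the casts above trust whatever is stored. A slightly more defensive reader, sketched here with the hypothetical name readHermesRuntime, checks each value at runtime and falls back to null.

```typescript
interface HermesRuntimeInfo {
  model: string | null;
  nativeSkillCount: number | null;
  memoryBytes: number | null;
}

export function readHermesRuntime(
  stateJson: Record<string, unknown> | null | undefined,
): HermesRuntimeInfo {
  const state = stateJson ?? {};
  const model = state["hermesModel"];

  // Accept only finite numbers; anything else is reported as missing.
  const asNumber = (value: unknown): number | null =>
    typeof value === "number" && isFinite(value) ? value : null;

  return {
    model: typeof model === "string" ? model : null,
    nativeSkillCount: asNumber(state["hermesNativeSkillCount"]),
    memoryBytes: asNumber(state["hermesMemoryBytes"]),
  };
}
```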

Hermes Ollama adapterConfig (what to write)

// When user selects an Ollama model in config-fields.tsx:
// model = "qwen2.5-coder:32b"  (bare Ollama model name)
// provider = "custom"           (OpenAI-compatible endpoint)
// base_url = "http://localhost:11434/v1"

// For create mode:
set!({ model, provider: "custom", base_url: "http://localhost:11434/v1" })

// For edit mode:
mark("adapterConfig", "model", model);
mark("adapterConfig", "provider", "custom");
mark("adapterConfig", "base_url", "http://localhost:11434/v1");

Cost Tracking — Already Wired (HERM-06 context)

// Source: server/src/services/heartbeat.ts:updateRuntimeState
// Hermes execute.ts returns:
//   result.usage = { inputTokens, outputTokens }  (from regex)
//   result.costUsd = number | undefined           (from regex, usually undefined for local)
//
// heartbeat.ts normalizes:
const usage = normalizeUsageTotals(result.usage);
const additionalCostCents = normalizeBilledCostCents(result.costUsd, billingType);
// Then:
if (additionalCostCents > 0 || hasTokenUsage) {
  await costs.createEvent(companyId, { ... model: result.model ?? "unknown" ... });
}
// → For Ollama: costCents=0, but inputTokens/outputTokens may be > 0 → cost event recorded
// → If Hermes doesn't emit token counts: no event recorded (correct behavior)
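The gating condition can be restated as a small predicate for the end-to-end check that HERM-06 asks for. This is a sketch: shouldRecordCostEvent is a hypothetical name, and the definition of hasTokenUsage (any non-zero input or output token count) is an assumption about what heartbeat.ts computes.

```typescript
interface UsageTotals {
  inputTokens: number;
  outputTokens: number;
}

export function shouldRecordCostEvent(
  additionalCostCents: number,
  usage: UsageTotals | null,
): boolean {
  // Assumption: "has token usage" means at least one non-zero token count.
  const hasTokenUsage =
    usage !== null && (usage.inputTokens > 0 || usage.outputTokens > 0);
  return additionalCostCents > 0 || hasTokenUsage;
}
```

For a local Ollama run the first argument is 0, so whether an event is recorded depends entirely on Hermes emitting token counts.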

HERM-05: Skill Visibility — What Is Already Done vs. What Is Missing

Already Done (data layer is complete)

  • skillRegistryService.syncHermesNativeSkills(agentId) scans ~/.hermes/skills/ and inserts source: "native" rows
  • Called automatically from GET /skill-registry/agents/:agentId/skills when adapterType === "hermes_local"
  • Returns AgentSkillEntry[] with { skillId, source, installedAt } — both "native" and "managed" source values
  • Hermes adapter listHermesSkills returns snapshot with originLabel: "Hermes skill" and readOnly: true for native skills

What Is Missing (UI rendering in AgentSkillsTab)

The unmanagedSkillRows section in AgentSkillsTab (AgentDetail.tsx:2566) renders read-only adapter entries. It uses entry.originLabel and entry.locationLabel for display. Hermes native skills already flow through this path.

The gap: the UI may not clearly distinguish "Hermes skill" entries from other unmanaged entries. The originLabel: "Hermes skill" badge rendering and skill count display are the UI additions needed. This is a targeted render update to AgentSkillsTab, not a new data flow.


HERM-07: Dashboard Hermes Runtime Info

What to Store in stateJson

// Written by heartbeat.ts updateRuntimeState after a Hermes run
{
  hermesModel: string;         // e.g. "qwen2.5-coder:32b" or "anthropic/claude-sonnet-4"
  hermesNativeSkillCount: number;  // from skillRegistryService query
  hermesMemoryBytes: number | null; // from /api/ps size_vram, null if unavailable
}

Where to Write stateJson

In heartbeat.ts:updateRuntimeState, after the existing db.update(agentRuntimeState).set(...) call, add a second update that merges Hermes-specific fields when agent.adapterType === "hermes_local". Read result.model for hermesModel. Query skillRegistryDb for hermesNativeSkillCount. Query Ollama /api/ps for hermesMemoryBytes on a best-effort basis: the probe must not block or fail the heartbeat update, and a failed or empty probe stores null.

What to Render

A HermesRuntimeCard component in AgentOverview (gated by adapterType === "hermes_local"):

  • Model name (from stateJson.hermesModel)
  • Native skill count (from stateJson.hermesNativeSkillCount)
  • Memory usage (from stateJson.hermesMemoryBytes, formatted as "X.X GB" or "Not loaded")
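The memory formatting rule for the card could be sketched as below. formatHermesMemory is a hypothetical name, and treating 1 GB as 1024^3 bytes is an assumption; switch the divisor if the UI uses decimal gigabytes elsewhere.

```typescript
// Assumption: binary gigabytes (1024^3 bytes).
const BYTES_PER_GB = 1024 * 1024 * 1024;

export function formatHermesMemory(bytes: number | null | undefined): string {
  // null/undefined means /api/ps did not list the model.
  if (bytes === null || bytes === undefined) return "Not loaded";
  return `${(bytes / BYTES_PER_GB).toFixed(1)} GB`;
}
```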

Environment Availability

| Dependency | Required By | Available | Version | Fallback |
|------------|-------------|-----------|---------|----------|
| Ollama daemon | OLLA-01 through OLLA-05 | No (not installed) | - | All paths degrade gracefully; UI shows install instructions |
| hermes-paperclip-adapter | HERM-05, HERM-06, HERM-07 | Yes | 0.2.1 | - |
| Node.js fetch | Ollama HTTP probing | Yes | built-in (Node 18+) | - |
| Node.js os module | OLLA-04 RAM reading | Yes | built-in | - |
| Vitest | Tests | Yes | (server vitest.config.ts) | - |

Missing dependencies with no fallback: None — all Ollama features degrade gracefully when Ollama is absent.

Pre-existing test failures (not Phase 28 regressions): 4 test files failing before Phase 28 begins:

  • app-hmr-port.test.ts
  • plugin-worker-manager.test.ts
  • heartbeat-workspace-session.test.ts (5 tests)
  • skill-registry-routes.test.ts (1 test)

Validation Architecture

Test Framework

| Property | Value |
|----------|-------|
| Framework | Vitest (server) |
| Config file | server/vitest.config.ts |
| Quick run command | cd server && npx vitest run src/__tests__/ollama-service.test.ts |
| Full suite command | cd server && npx vitest run |

Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|--------------|
| OLLA-01 | detectOllama() returns installed: false when Ollama absent | unit | npx vitest run src/__tests__/ollama-service.test.ts | No (Wave 0) |
| OLLA-01 | detectOllama() returns installed: true + version when Ollama present | unit | same | No (Wave 0) |
| OLLA-01 | detectOllama() times out cleanly (AbortController) | unit | same | No (Wave 0) |
| OLLA-02 | listOllamaModels() returns AdapterModel[] from /api/tags | unit | same | No (Wave 0) |
| OLLA-04 | buildModelRecommendation() returns correct model for given RAM budget | unit | same | No (Wave 0) |
| OLLA-05 | Routes return installUrl when Ollama absent | unit | same | No (Wave 0) |
| HERM-05 | Skills tab renders originLabel: "Hermes skill" badge | manual-only | - | - |
| HERM-06 | updateRuntimeState records cost event when Hermes emits token data | unit (existing pattern) | npx vitest run src/__tests__/costs-service.test.ts | Yes |
| HERM-07 | stateJson receives hermesModel/hermesNativeSkillCount after run | unit | npx vitest run src/__tests__/ollama-service.test.ts | No (Wave 0) |

Sampling Rate

  • Per task commit: cd server && npx vitest run src/__tests__/ollama-service.test.ts
  • Per wave merge: cd server && npx vitest run
  • Phase gate: Full suite green before /gsd:verify-work (excluding 4 pre-existing failures)

Wave 0 Gaps

  • server/src/__tests__/ollama-service.test.ts — covers OLLA-01, OLLA-02, OLLA-04, OLLA-05, HERM-07 stateJson logic
  • Test stubs use mock fetch (AbortController pattern); no real Ollama needed
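One way to exercise the timeout path without a real Ollama is to inject the fetch function and hand the probe a fake that never answers until it is aborted. The sketch below is framework-free so it does not assume any Vitest mocking API; probeOllama, FetchLike, and both fakes are hypothetical names, and the version string is a placeholder.

```typescript
// Minimal fetch shape the probe needs; lets tests pass a fake.
type FetchLike = (
  url: string,
  init: { signal: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

export async function probeOllama(fetchImpl: FetchLike, timeoutMs: number): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl("http://localhost:11434/api/version", {
      signal: controller.signal,
    });
    return res.ok;
  } catch {
    // Connection refused, abort, or any other failure: treat as "not installed".
    return false;
  } finally {
    clearTimeout(timer);
  }
}

// Fake that hangs until the AbortController fires, like an unresponsive port.
export const hangingFetch: FetchLike = (_url, init) =>
  new Promise((_resolve, reject) => {
    init.signal.addEventListener("abort", () => reject(new Error("aborted")));
  });

// Fake that answers immediately with a placeholder version payload.
export const okFetch: FetchLike = async () => ({
  ok: true,
  json: async () => ({ version: "0.0.0" }),
});
```

In a Vitest file the same fakes can be passed directly, or the global fetch can be stubbed with whatever mocking approach the existing server tests already use.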

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Manual text entry for Hermes model | Dropdown fed from Ollama + manual fallback | Phase 28 | Better UX for local models |
| stateJson unused for Hermes | stateJson stores hermesModel, skillCount, memoryBytes | Phase 28 | Dashboard can show runtime info |
| Hermes native skills in separate table only | Skills tab renders both managed + native in unified view | Phase 28 (HERM-05 completion) | Unified skill surface |

Open Questions

  1. Should Ollama route be gated to hermes_local only?

    • What we know: Only Hermes uses the Ollama custom endpoint pattern currently
    • What's unclear: Future adapters (Phase 29 defaults) may also use Ollama
    • Recommendation: Mount under /companies/:companyId/ollama/* without adapter-type gating — the endpoint is useful generically and Pi/OpenCode adapters may benefit in Phase 29
  2. Should listOllamaModels also extend the hermes adapter's listModels function?

    • What we know: listAdapterModels("hermes_local") already calls adapter.listModels() if present; hermes adapter has no listModels implementation (returns models: [])
    • What's unclear: Whether to add listModels to hermes adapter (requires adapter package change) or use a separate Ollama API route in Nexus
    • Recommendation: Use a separate Nexus route (/companies/:companyId/ollama/models). Avoids changing the hermes-paperclip-adapter package (external dependency). The config-fields.tsx component can call the Nexus route directly. Do not modify the hermes-paperclip-adapter package.
  3. stateJson hermesNativeSkillCount — count from skillRegistry or from adapter snapshot?

    • What we know: skillRegistryDb is a separate libSQL DB; querying it in updateRuntimeState adds cross-DB complexity
    • What's unclear: Is the extra query worth it for a display-only count?
    • Recommendation: Store the count from result.resultJson if Hermes emits it, or derive from the adapter skill snapshot after run. Alternatively, skip native skill count from stateJson and derive it in the UI from agentsApi.skills(agentId) query. The UI approach avoids cross-DB concerns in heartbeat.
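If the UI-side option from question 3 is chosen, the count reduces to a filter over the skill entries the Skills tab already fetches. A minimal sketch, assuming the { skillId, source, installedAt } entry shape described earlier (the installedAt type is a guess) and a hypothetical function name:

```typescript
// Entry shape as described for the skills route; installedAt type is assumed.
interface AgentSkillEntry {
  skillId: string;
  source: string; // "native" or "managed"
  installedAt: string;
}

export function countNativeSkills(entries: AgentSkillEntry[]): number {
  let count = 0;
  for (const entry of entries) {
    if (entry.source === "native") count += 1;
  }
  return count;
}
```

Deriving the number this way keeps heartbeat.ts free of the cross-DB query, at the cost of the count only being available once the skills query has loaded.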

Sources

Primary (HIGH confidence)

  • hermes-paperclip-adapter@0.2.1 dist source code — execute.js, skills.js, detect-model.js, test.js, constants.js — read directly from /opt/nexus/server/node_modules/hermes-paperclip-adapter/dist/
  • Nexus codebase — server/src/services/heartbeat.ts, server/src/services/costs.ts, server/src/services/skill-registry.ts, ui/src/pages/AgentDetail.tsx, ui/src/adapters/hermes-local/config-fields.tsx — read directly
  • Ollama REST API — https://docs.ollama.com/api/tags — verified /api/tags response shape with details.parameter_size, details.family, details.quantization_level
  • Node.js built-ins — os.totalmem(), fetch with AbortController — confirmed available in Node 18+ runtime

Secondary (MEDIUM confidence)

  • Hermes Agent provider docs — https://hermes-agent.nousresearch.com/docs/integrations/providers/ — verified "ollama uses custom provider + localhost:11434/v1 base_url"
  • Hermes Agent + Ollama guide — Medium/Substack articles cross-referencing official docs — confirmed custom endpoint configuration steps

Tertiary (LOW confidence)

  • Ollama model RAM requirements (catalog) — community sources + Ollama model page tags — use conservative estimates; verify against https://ollama.com/library model pages before shipping

Metadata

Confidence breakdown:

  • Ollama API: HIGH — verified from official docs, response shapes confirmed
  • Hermes + Ollama provider mapping: HIGH — verified from official Hermes provider docs
  • Standard stack: HIGH — all existing infrastructure confirmed from source code
  • Architecture patterns: HIGH — follow existing codex-models.ts, heartbeat.ts, config-fields.tsx patterns exactly
  • HERM-05 data layer status: HIGH — verified syncHermesNativeSkills exists and is already called
  • HERM-06 cost tracking: HIGH — execute.js returns usage/costUsd, heartbeat.ts wires it to costService
  • Pitfalls: HIGH — derived from actual source code analysis

Research date: 2026-04-01
Valid until: 2026-05-01 (Ollama API is stable; hermes-paperclip-adapter may receive new releases)