Phase 28: Ollama Integration & Agent Surface - Research
Researched: 2026-04-01
Domain: Ollama HTTP API, Hermes adapter extension, agent dashboard UI, cost tracking
Confidence: HIGH
Summary
Phase 28 adds three distinct capabilities on top of the completed Phase 27 Hermes adapter: (1) Ollama detection and model catalog — Nexus queries localhost:11434 to detect Ollama, lists available models, and ships a static JSON catalog for hardware-aware recommendations; (2) Hermes config surface extension — the model field in config-fields.tsx becomes a dropdown fed by live Ollama discovery rather than a free-text input, and a new base_url/provider: custom adapterConfig field routes Hermes to the local endpoint; (3) Hermes runtime data in the dashboard — stateJson in agentRuntimeState is the right place to store Hermes-specific runtime metadata (model name, native skill count, memory usage from Ollama's /api/ps), and the AgentOverview component in AgentDetail.tsx is the right insertion point.
The most important finding is that Hermes does not have a native "ollama" provider. Ollama is configured as a custom OpenAI-compatible endpoint: provider: custom, base_url: http://localhost:11434/v1. The model field passes the Ollama model name bare (e.g. qwen2.5-coder:32b). This shapes OLLA-02, OLLA-03, and the config-fields.tsx changes.
For cost tracking (HERM-06): hermes-paperclip-adapter@0.2.1 already parses token_usage and cost regex patterns from Hermes stdout. When Hermes returns non-zero usage, heartbeat.ts:updateRuntimeState already calls costService.createEvent. The only gap is that Hermes running local Ollama models will have costUsd = undefined (no billing) — the infrastructure handles this correctly (zero cost event is suppressed when additionalCostCents === 0 && !hasTokenUsage). No cost tracking code changes are needed for local models; the planner just needs to verify the regex path works end-to-end.
For HERM-05 (skill visibility): syncHermesNativeSkills already exists in skillRegistryService and is already called from the GET /skill-registry/agents/:agentId/skills route when adapterType === "hermes_local". The Hermes adapter's listHermesSkills function merges Paperclip-managed and native skills. The integration is already complete at the data layer. What is missing is the UI surface in the Skills tab that renders the originLabel: "Hermes skill" / readOnly: true entries distinctly from managed skills.
Primary recommendation: Implement as four focused plans — (P01) server-side Ollama service + routes; (P02) Hermes config-fields UI extension for Ollama model selection; (P03) dashboard Hermes runtime info card; (P04) model catalog JSON + recommendation logic.
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.
Claude's Discretion
All implementation choices are at Claude's discretion.
Deferred Ideas (OUT OF SCOPE)
None — discuss phase skipped. Refer to REQUIREMENTS.md for in-scope requirements.
Out of scope per REQUIREMENTS.md:
- Multi-provider model routing (Hermes can use OpenRouter/Anthropic/OpenAI but that's Hermes config, not Nexus)
- Hermes MCP server management
- Custom Hermes skill authoring UI
- DFLT-01 through DFLT-04 (Phase 29)
</user_constraints>
<phase_requirements>
Phase Requirements
| ID | Description | Research Support |
|---|---|---|
| OLLA-01 | Nexus detects whether Ollama is installed locally | HTTP probe to localhost:11434/api/version; new server service ollamaService |
| OLLA-02 | User can see list of available Ollama models when configuring a Hermes agent | GET /api/tags from Ollama HTTP API; new server route GET /companies/:id/ollama/models; config-fields.tsx dropdown |
| OLLA-03 | User can configure a Hermes agent with any local Ollama model | Sets adapterConfig.model = <model-name>, adapterConfig.provider = "custom", adapterConfig.base_url = "http://localhost:11434/v1" |
| OLLA-04 | Model recommendation based on RAM/VRAM from a shipped catalog | Static JSON catalog in server/src/data/ollama-model-catalog.json; server reads os.totalmem() to filter; returned with model list |
| OLLA-05 | If Ollama is not present, user is offered installation instructions | Ollama status endpoint returns installed: false + installUrl; UI shows callout in Hermes config-fields |
| HERM-05 | Nexus-managed skills visible alongside Hermes native skills in agent config | Already wired at data layer — UI Skills tab needs originLabel: "Hermes skill" rendering distinction |
| HERM-06 | Cost tracking captures token usage and model costs for Hermes agents | Infrastructure already handles this; verify end-to-end with local Ollama (zero cost is correct, no change needed) |
| HERM-07 | Dashboard shows Hermes-specific info (model name, memory usage, native skill count) | Store in agentRuntimeState.stateJson; render in AgentOverview component |
</phase_requirements>
Standard Stack
Core
| Library | Version | Purpose | Why Standard |
|---|---|---|---|
| Node.js os module | built-in | Read total system RAM | Already used in heartbeat.ts; no new dep |
| Node.js fetch | Node 18+ built-in | HTTP calls to Ollama API at localhost:11434 | Already confirmed available in runtime |
| hermes-paperclip-adapter | 0.2.1 (installed) | Hermes execution, skill sync, model detection | Already wired into adapter registry |
No New Dependencies Required
All capabilities needed for Phase 28 are achievable with existing infrastructure:
- Ollama HTTP API is probed with fetch (built-in Node 18+)
- Model catalog is a static JSON file in the server package
- RAM reading uses os.totalmem() (built-in)
- Hermes Ollama configuration uses existing adapterConfig fields
Architecture Patterns
Recommended Project Structure (new files)
server/src/services/ollama.ts # ollamaService — detect + list models
server/src/routes/ollama.ts # HTTP routes: /companies/:id/ollama/status, /models
server/src/data/ollama-model-catalog.json # shipped catalog for OLLA-04
server/src/__tests__/ollama-service.test.ts # unit tests for ollamaService
ui/src/api/ollama.ts # ollamaApi client — wraps server routes
Pattern 1: Ollama Service (server-side)
// server/src/services/ollama.ts
const OLLAMA_BASE_URL = process.env.OLLAMA_BASE_URL ?? "http://localhost:11434";
const OLLAMA_TIMEOUT_MS = 3000;
// Assumed value: the original snippet references INSTALL_URL without defining it.
const INSTALL_URL = "https://ollama.com/download";

export interface OllamaStatus {
  installed: boolean;
  version: string | null;
  installUrl: string;
}

export interface OllamaModel {
  name: string;            // e.g. "qwen2.5-coder:32b"
  parameterSize: string;   // e.g. "32.8B" from /api/tags details
  quantization: string;    // e.g. "Q4_K_M"
  sizeBytes: number;
  family: string;          // e.g. "qwen2"
  recommended: boolean;    // from catalog match + RAM check
  recommendationReason: string | null;
}

export async function detectOllama(): Promise<OllamaStatus> {
  // Abort the probe after OLLAMA_TIMEOUT_MS so an absent daemon cannot hang the request.
  const controller = new AbortController();
  const timeout = setTimeout(() => controller.abort(), OLLAMA_TIMEOUT_MS);
  try {
    const res = await fetch(`${OLLAMA_BASE_URL}/api/version`, {
      signal: controller.signal,
    });
    if (!res.ok) return { installed: false, version: null, installUrl: INSTALL_URL };
    const body = await res.json() as { version?: string };
    return { installed: true, version: body.version ?? null, installUrl: INSTALL_URL };
  } catch {
    // Connection refused, timeout, or invalid JSON all count as "not installed".
    return { installed: false, version: null, installUrl: INSTALL_URL };
  } finally {
    clearTimeout(timeout);
  }
}
Why this pattern: Matches the existing codex-models.ts pattern — HTTP fetch with timeout, graceful failure returns empty/false rather than throwing. The 3s timeout prevents hanging requests when Ollama is not installed.
Pattern 2: Ollama Routes (mounted under /companies/:companyId)
GET /companies/:companyId/ollama/status
→ { installed: boolean, version: string|null, installUrl: string }
GET /companies/:companyId/ollama/models
→ { models: OllamaModel[], ramGb: number }
Both routes use existing assertCompanyAccess(req, companyId) authz pattern from agents.ts.
Mount in server/src/routes/index.ts alongside the existing agentsRoutes.
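The models route has to translate the Ollama-native /api/tags payload into the OllamaModel shape from Pattern 1. A minimal sketch of that mapping follows; the helper name mapTagsToModels and the "unknown" fallbacks are assumptions for illustration, not existing code, and the recommendation fields are left unset for the catalog step to fill in.

```typescript
// Subset of the /api/tags entry shape that the mapping needs.
interface OllamaTagEntry {
  name: string;
  size: number; // bytes
  details?: {
    family?: string;
    parameter_size?: string;
    quantization_level?: string;
  };
}

// Same shape as OllamaModel in Pattern 1, repeated so the sketch is self-contained.
interface OllamaModel {
  name: string;
  parameterSize: string;
  quantization: string;
  sizeBytes: number;
  family: string;
  recommended: boolean;
  recommendationReason: string | null;
}

// Hypothetical helper: converts raw tag entries into OllamaModel rows.
// Missing detail fields fall back to "unknown" (an assumed convention).
function mapTagsToModels(entries: OllamaTagEntry[]): OllamaModel[] {
  return entries.map((entry) => ({
    name: entry.name,
    parameterSize: entry.details?.parameter_size ?? "unknown",
    quantization: entry.details?.quantization_level ?? "unknown",
    sizeBytes: entry.size,
    family: entry.details?.family ?? "unknown",
    recommended: false,
    recommendationReason: null,
  }));
}
```

Keeping the mapping pure (no fetch inside) would let the unit tests in ollama-service.test.ts exercise it without a mock HTTP layer.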
Pattern 3: Hermes Config-Fields Enhancement
The existing HermesLocalConfigFields in config-fields.tsx has a free-text Model input. For Ollama support, it becomes a hybrid: dropdown (when Ollama is present) + manual entry fallback.
// Fetch Ollama status + models (only for hermes_local adapter)
const { data: ollamaStatus } = useQuery({
  queryKey: ["ollama", "status", companyId],
  queryFn: () => ollamaApi.status(companyId!),
  enabled: Boolean(companyId),
});
const { data: ollamaModels } = useQuery({
  queryKey: ["ollama", "models", companyId],
  queryFn: () => ollamaApi.models(companyId!),
  enabled: Boolean(companyId && ollamaStatus?.installed),
});
When ollamaStatus.installed === false, render an install callout (OLLA-05) instead of the dropdown.
When a local Ollama model is selected, buildHermesConfig (or mark) must also set provider: "custom" and base_url: "http://localhost:11434/v1" in adapterConfig. This is the critical mapping from OLLA-03.
Pattern 4: Hermes Runtime Data in stateJson (HERM-07)
agentRuntimeState.stateJson is jsonb typed as Record<string, unknown>. The heartbeat service writes this via updateRuntimeState. The Hermes adapter's execute.ts already returns resultJson with session_id, usage, and cost_usd.
For HERM-07 runtime data (model name, native skill count, memory usage), the server-side approach is:
- After a Hermes run completes, read resultJson.result and extract/store model + detected skill count into stateJson
- Optionally query Ollama /api/ps (running models) to get size_vram for memory usage display
Insertion point for stateJson patch: heartbeat.ts:updateRuntimeState already calls db.update(agentRuntimeState).set(...). Add a stateJson merge here when adapterType === "hermes_local".
UI insertion point: AgentOverview component in AgentDetail.tsx (line ~1183). Add a HermesRuntimeCard component after the charts section, gated by agent.adapterType === "hermes_local":
{agent.adapterType === "hermes_local" && runtimeState && (
<HermesRuntimeCard runtimeState={runtimeState} />
)}
Pattern 5: Model Catalog JSON (OLLA-04)
// server/src/data/ollama-model-catalog.json
{
  "models": [
    {
      "family": "qwen2",
      "variants": [
        { "name": "qwen2.5-coder:7b", "ramGb": 5, "vramGb": 5, "quality": "fast" },
        { "name": "qwen2.5-coder:32b", "ramGb": 22, "vramGb": 22, "quality": "best" }
      ]
    },
    {
      "family": "llama",
      "variants": [
        { "name": "llama3.2:3b", "ramGb": 3, "vramGb": 3, "quality": "fast" },
        { "name": "llama3.1:8b", "ramGb": 6, "vramGb": 6, "quality": "balanced" },
        { "name": "llama3.1:70b", "ramGb": 48, "vramGb": 48, "quality": "best" }
      ]
    },
    {
      "family": "mistral",
      "variants": [
        { "name": "mistral:7b", "ramGb": 5, "vramGb": 5, "quality": "balanced" },
        { "name": "mistral:22b", "ramGb": 14, "vramGb": 14, "quality": "best" }
      ]
    },
    {
      "family": "phi",
      "variants": [
        { "name": "phi4:14b", "ramGb": 10, "vramGb": 10, "quality": "balanced" }
      ]
    },
    {
      "family": "deepseek",
      "variants": [
        { "name": "deepseek-r1:7b", "ramGb": 5, "vramGb": 5, "quality": "reasoning" },
        { "name": "deepseek-r1:32b", "ramGb": 22, "vramGb": 22, "quality": "reasoning" }
      ]
    }
  ]
}
Recommendation logic: os.totalmem() gives total RAM. Use 75% as usable RAM budget (leave OS headroom). Filter catalog entries where ramGb <= totalRamGb * 0.75. Return the highest-quality variant within budget plus a recommendationReason string.
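The recommendation rule can be sketched as a pure function over the catalog. The quality ordering below (fast, then balanced and reasoning, then best), the tie-break on larger ramGb, and the name recommendVariant are assumptions made for illustration; the research only specifies the 75% budget and "highest-quality variant within budget".

```typescript
type Quality = "fast" | "balanced" | "best" | "reasoning";

interface CatalogVariant {
  name: string;
  ramGb: number;
  vramGb: number;
  quality: Quality;
}

interface CatalogFamily {
  family: string;
  variants: CatalogVariant[];
}

interface Recommendation {
  name: string;
  reason: string;
}

const USABLE_RAM_FRACTION = 0.75; // leave headroom for the OS
const BYTES_PER_GIB = 1024 * 1024 * 1024;

// Assumed ranking: the catalog does not define how "reasoning" compares to the others.
const QUALITY_RANK: Record<Quality, number> = {
  fast: 0,
  balanced: 1,
  reasoning: 1,
  best: 2,
};

// Hypothetical helper: picks the highest-ranked variant that fits the RAM budget.
function recommendVariant(
  families: CatalogFamily[],
  totalRamBytes: number,
): Recommendation | null {
  const totalRamGb = totalRamBytes / BYTES_PER_GIB;
  const budgetGb = totalRamGb * USABLE_RAM_FRACTION;

  let best: CatalogVariant | null = null;
  for (const family of families) {
    for (const variant of family.variants) {
      if (variant.ramGb > budgetGb) continue; // does not fit
      if (best === null) {
        best = variant;
        continue;
      }
      const rankDiff = QUALITY_RANK[variant.quality] - QUALITY_RANK[best.quality];
      // Prefer higher quality; on a tie, prefer the larger model.
      if (rankDiff > 0 || (rankDiff === 0 && variant.ramGb > best.ramGb)) {
        best = variant;
      }
    }
  }
  if (best === null) return null;
  return {
    name: best.name,
    reason:
      "Needs about " + best.ramGb + " GB; " + budgetGb.toFixed(1) +
      " GB usable of " + totalRamGb.toFixed(1) + " GB total",
  };
}
```

On the server, totalRamBytes would come from os.totalmem(). Returning null when nothing fits lets the UI fall back to manual model entry instead of recommending a model that would swap.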
Anti-Patterns to Avoid
- Polling Ollama in a loop: Use a 60-second TTL in-memory cache (same as codex-models.ts MODELS_CACHE_TTL_MS). Do not re-probe on every API call.
- Blocking server startup on Ollama check: Ollama detection is on-demand (per-request), not at startup.
- Hard-coding localhost:11434: Always read from process.env.OLLAMA_BASE_URL ?? "http://localhost:11434" so users with non-standard ports work.
- Requiring Ollama for Hermes: All Ollama paths are optional. Hermes without Ollama continues to work unchanged. Never throw when Ollama is absent.
- Overwriting all of stateJson: Merge into stateJson using spread, never replace: stateJson: { ...existingState, hermesModel: ..., hermesNativeSkillCount: ... }.
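The 60-second TTL cache from the first anti-pattern could look roughly like the sketch below. The factory name createTtlCache and the injectable clock are assumptions for illustration; the actual cache in codex-models.ts was not reproduced here and may be structured differently.

```typescript
const MODELS_CACHE_TTL_MS = 60000; // 60 seconds, as stated in the list above

interface TtlCache<T> {
  get(load: () => T): T;
  clear(): void;
}

// Hypothetical single-slot cache. T may be a Promise, in which case
// concurrent callers share one in-flight Ollama probe.
function createTtlCache<T>(ttlMs: number, now: () => number = Date.now): TtlCache<T> {
  let cached: { value: T; expiresAt: number } | null = null;
  return {
    get(load: () => T): T {
      const current = now();
      if (cached !== null && current < cached.expiresAt) {
        return cached.value; // still fresh: skip the probe
      }
      const value = load();
      cached = { value, expiresAt: current + ttlMs };
      return value;
    },
    clear(): void {
      cached = null;
    },
  };
}
```

One caveat of caching a Promise this way: a rejected promise would be served until the TTL expires. That is acceptable only if the loader, like detectOllama, resolves with a fallback value instead of throwing.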
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| Ollama connectivity check | Custom TCP socket probe | fetch to /api/version with AbortController timeout | Reuses existing pattern from codex-models.ts |
| YAML config parsing | Full YAML parser | Existing parseModelFromConfig in hermes adapter | Already ships in hermes-paperclip-adapter/dist |
| System RAM reading | Shell commands | os.totalmem() | Built-in, no dep, works cross-platform |
| Token cost tracking | New billing logic | Existing costService.createEvent + updateRuntimeState | Already handles Hermes via regex-extracted usage |
Common Pitfalls
Pitfall 1: Hermes Does Not Have an "ollama" Provider
What goes wrong: Setting adapterConfig.provider = "ollama" causes Hermes to fail because "ollama" is not an entry in VALID_PROVIDERS in constants.js.
Why it happens: Ollama mimics the OpenAI API, so Hermes treats it as provider: "custom" with base_url: "http://localhost:11434/v1".
How to avoid: When a user selects an Ollama model, always write provider: "custom" and base_url: "http://localhost:11434/v1" into adapterConfig. These fields are already in the Hermes config schema (see agentConfigurationDoc).
Warning signs: Hermes stderr shows "unknown provider" or authentication errors during local model runs.
Pitfall 2: Ollama API Returns Models at /api/tags, Not /v1/models
What goes wrong: Using the OpenAI-compat endpoint /v1/models to list models misses the details object (parameter_size, quantization_level, family) needed for OLLA-04.
Why it happens: /v1/models is OpenAI-compat, /api/tags is Ollama-native with richer data.
How to avoid: Use GET localhost:11434/api/tags for model listing (returns details.parameter_size, details.family). Use /v1/models only if passing through to Hermes.
Pitfall 3: stateJson Merge Requires Read-Modify-Write
What goes wrong: db.update(agentRuntimeState).set({ stateJson: newData }) overwrites other fields stored by other parts of the system.
Why it happens: Drizzle .set() replaces the entire column value.
How to avoid: Use Postgres jsonb merge: stateJson: sql`${agentRuntimeState.stateJson} || ${JSON.stringify(patch)}::jsonb`, or read the existing stateJson first, then spread. The existing ensureRuntimeState call in updateRuntimeState already reads the row.
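The read-then-spread variant can be sketched as a small pure function. The name mergeStateJson is an assumption for illustration. Like the Postgres jsonb || operator, this merge is shallow: top-level keys in the patch replace existing keys and nested objects are not combined.

```typescript
type StateJson = Record<string, unknown>;

// Hypothetical helper: shallow-merges a patch into the existing stateJson
// without dropping keys written by other parts of the system.
function mergeStateJson(
  existing: StateJson | null | undefined,
  patch: StateJson,
): StateJson {
  return { ...(existing ?? {}), ...patch };
}
```

The result would then be passed to .set({ stateJson: merged }) in place of the raw patch. Note that the read and the write are separate statements, so two concurrent runs for the same agent could still race; the SQL-side merge avoids that.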
Pitfall 4: HermesLocalConfigFields Uses adapterConfig for Both Create and Edit Modes
What goes wrong: Setting provider and base_url only in create mode loses the values on edit, or vice versa.
Why it happens: The isCreate flag switches between set!({ model: v }) (create) and mark("adapterConfig", "model", v) (edit) — both paths must update all three fields (model, provider, base_url) when an Ollama model is selected.
How to avoid: When Ollama model is selected, call the setter for all three config fields atomically. For create mode: set!({ model, provider: "custom", base_url: "http://localhost:11434/v1" }). For edit mode: three mark() calls or a compound helper.
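A compound helper for both modes might look like the sketch below. The setter types are inferred from the set!({ model: v }) and mark("adapterConfig", "model", v) calls quoted above, and the helper name applyOllamaModelSelection is hypothetical; the real signatures in config-fields.tsx may differ.

```typescript
// Default OpenAI-compatible Ollama endpoint, as used throughout this document.
const OLLAMA_OPENAI_BASE_URL = "http://localhost:11434/v1";

// Assumed setter shapes (see lead-in).
type SetFn = (patch: Record<string, unknown>) => void;
type MarkFn = (group: "adapterConfig", field: string, value: unknown) => void;

interface OllamaSelectionArgs {
  isCreate: boolean;
  model: string;
  set?: SetFn; // only available in create mode
  mark: MarkFn;
}

// Hypothetical helper: always writes model, provider, and base_url together
// so neither mode can end up with a partial Ollama configuration.
function applyOllamaModelSelection(args: OllamaSelectionArgs): void {
  const fields = {
    model: args.model,
    provider: "custom",
    base_url: OLLAMA_OPENAI_BASE_URL,
  };
  if (args.isCreate) {
    if (!args.set) throw new Error("create mode requires a set function");
    args.set(fields);
    return;
  }
  args.mark("adapterConfig", "model", fields.model);
  args.mark("adapterConfig", "provider", fields.provider);
  args.mark("adapterConfig", "base_url", fields.base_url);
}
```

Routing both modes through one function means a later change, such as deriving base_url from a configured Ollama host, only has to be made in one place.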
Pitfall 5: Ollama /api/ps Probe May Have No Models Running
What goes wrong: /api/ps returns an empty models: [] when no model is currently loaded — this does not mean Ollama is absent.
Why it happens: Ollama only shows models in /api/ps when they are actively loaded in memory.
How to avoid: Use /api/version for detection (OLLA-01), /api/tags for the model list (OLLA-02), and /api/ps only for the optional "memory usage" metric in HERM-07 — handling the empty case as "not currently loaded".
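Handling the empty /api/ps case can be sketched as a lookup that returns null when the agent's model is not loaded. The helper name findLoadedModelVramBytes is an assumption, and matching on the exact name string is a simplification; tag aliases such as an implicit ":latest" suffix are not handled here.

```typescript
// Subset of the /api/ps entry shape needed for the memory metric.
interface OllamaPsEntry {
  name: string;
  size_vram: number; // bytes used in VRAM
}

interface OllamaPsResponse {
  models: OllamaPsEntry[];
}

// Hypothetical helper: returns VRAM bytes for the named model,
// or null when that model is not currently loaded.
function findLoadedModelVramBytes(
  ps: OllamaPsResponse,
  modelName: string,
): number | null {
  const matches = ps.models.filter((entry) => entry.name === modelName);
  return matches.length > 0 ? matches[0].size_vram : null;
}
```

A null result maps directly onto hermesMemoryBytes: null in stateJson, which the dashboard can render as "Not loaded" without implying that Ollama itself is absent.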
Pitfall 6: HERM-06 Cost Tracking — Ollama Models Return Zero Cost
What goes wrong: Expecting a cost_usd value from runs using local Ollama models — there is no external billing.
Why it happens: Hermes does not know the user's GPU/CPU cost. The COST_REGEX will not match if Hermes does not emit a cost line.
How to avoid: This is correct behavior. normalizeBilledCostCents(undefined, "unknown") returns 0. Token usage may still be captured if Hermes emits token counts. Accept that Ollama-based runs show $0.00 in the cost UI — that is accurate.
Code Examples
Ollama /api/tags Response Shape (verified)
// Source: https://docs.ollama.com/api/tags (verified 2026-04-01)
interface OllamaTagsResponse {
  models: Array<{
    name: string;        // "qwen2.5-coder:32b"
    model: string;       // same as name
    modified_at: string;
    size: number;        // bytes
    digest: string;
    details: {
      parent_model: string;
      format: string;             // "gguf"
      family: string;             // "qwen2"
      families: string[];
      parameter_size: string;     // "32.8B"
      quantization_level: string; // "Q4_K_M"
    };
  }>;
}
Ollama /api/ps Response Shape (verified)
// Source: https://docs.ollama.com/api/tags (verified 2026-04-01)
interface OllamaPsResponse {
  models: Array<{
    name: string;
    model: string;
    size: number;
    digest: string;
    details: { /* same as tags */ };
    expires_at: string;
    size_vram: number; // bytes used in VRAM
  }>;
}
Reading hermes-adapter stateJson Hermes fields
// In AgentDetail.tsx HermesRuntimeCard — read from runtimeState.stateJson
const hermesModel = runtimeState.stateJson?.hermesModel as string | undefined;
const hermesNativeSkillCount = runtimeState.stateJson?.hermesNativeSkillCount as number | undefined;
const hermesMemoryBytes = runtimeState.stateJson?.hermesMemoryBytes as number | undefined;
Hermes Ollama adapterConfig (what to write)
// When user selects an Ollama model in config-fields.tsx:
// model = "qwen2.5-coder:32b" (bare Ollama model name)
// provider = "custom" (OpenAI-compatible endpoint)
// base_url = "http://localhost:11434/v1"
// For create mode:
set!({ model, provider: "custom", base_url: "http://localhost:11434/v1" })
// For edit mode:
mark("adapterConfig", "model", model);
mark("adapterConfig", "provider", "custom");
mark("adapterConfig", "base_url", "http://localhost:11434/v1");
Cost Tracking — Already Wired (HERM-06 context)
// Source: server/src/services/heartbeat.ts:updateRuntimeState
// Hermes execute.ts returns:
// result.usage = { inputTokens, outputTokens } (from regex)
// result.costUsd = number | undefined (from regex, usually undefined for local)
//
// heartbeat.ts normalizes:
const usage = normalizeUsageTotals(result.usage);
const additionalCostCents = normalizeBilledCostCents(result.costUsd, billingType);
// Then:
if (additionalCostCents > 0 || hasTokenUsage) {
await costs.createEvent(companyId, { ... model: result.model ?? "unknown" ... });
}
// → For Ollama: costCents=0, but inputTokens/outputTokens may be > 0 → cost event recorded
// → If Hermes doesn't emit token counts: no event recorded (correct behavior)
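The guard in the snippet above can be restated as a standalone predicate to make the three local-model cases explicit. This is a simplified model, not the code in heartbeat.ts: the function name shouldRecordCostEvent and the definition of hasTokenUsage (any token count greater than zero) are assumptions.

```typescript
interface UsageTotals {
  inputTokens: number;
  outputTokens: number;
}

// Hypothetical restatement of the guard: record an event when there is
// a billed cost or when any tokens were counted.
function shouldRecordCostEvent(
  additionalCostCents: number,
  usage: UsageTotals | null,
): boolean {
  const hasTokenUsage =
    usage !== null && (usage.inputTokens > 0 || usage.outputTokens > 0);
  return additionalCostCents > 0 || hasTokenUsage;
}
```

Under these assumptions, a local Ollama run with token counts yields a zero-cost event, a run with no counts yields no event, and a billed remote run always yields an event.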
HERM-05: Skill Visibility — What Is Already Done vs. What Is Missing
Already Done (data layer is complete)
- skillRegistryService.syncHermesNativeSkills(agentId) scans ~/.hermes/skills/ and inserts source: "native" rows
- Called automatically from GET /skill-registry/agents/:agentId/skills when adapterType === "hermes_local"
- Returns AgentSkillEntry[] with { skillId, source, installedAt }; both "native" and "managed" source values are included
- Hermes adapter listHermesSkills returns snapshot with originLabel: "Hermes skill" and readOnly: true for native skills
What Is Missing (UI rendering in AgentSkillsTab)
The unmanagedSkillRows section in AgentSkillsTab (AgentDetail.tsx:2566) renders read-only adapter entries. It uses entry.originLabel and entry.locationLabel for display. Hermes native skills already flow through this path.
The gap: the UI may not clearly distinguish "Hermes skill" entries from other unmanaged entries. The originLabel: "Hermes skill" badge rendering and skill count display are the UI additions needed. This is a targeted render update to AgentSkillsTab, not a new data flow.
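One way to feed the badge and the count is to split the unmanaged rows before rendering. The sketch below assumes an entry shape with name, originLabel, and readOnly; only originLabel and readOnly are confirmed by the adapter snapshot described above, and the function name groupUnmanagedSkills is hypothetical.

```typescript
interface SkillSnapshotEntry {
  name: string; // assumed display field
  originLabel?: string;
  readOnly?: boolean;
}

interface UnmanagedSkillGroups {
  hermesNative: SkillSnapshotEntry[];
  otherUnmanaged: SkillSnapshotEntry[];
}

// True for entries the Hermes adapter reports as native, read-only skills.
function isHermesNativeSkill(entry: SkillSnapshotEntry): boolean {
  return entry.readOnly === true && entry.originLabel === "Hermes skill";
}

// Hypothetical helper: separates Hermes native skills from other unmanaged rows.
function groupUnmanagedSkills(entries: SkillSnapshotEntry[]): UnmanagedSkillGroups {
  return {
    hermesNative: entries.filter((entry) => isHermesNativeSkill(entry)),
    otherUnmanaged: entries.filter((entry) => !isHermesNativeSkill(entry)),
  };
}
```

The length of hermesNative gives the native skill count directly in the UI, which is the option the third open question leans toward.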
HERM-07: Dashboard Hermes Runtime Info
What to Store in stateJson
// Written by heartbeat.ts updateRuntimeState after a Hermes run
{
  hermesModel: string;               // e.g. "qwen2.5-coder:32b" or "anthropic/claude-sonnet-4"
  hermesNativeSkillCount: number;    // from skillRegistryService query
  hermesMemoryBytes: number | null;  // from /api/ps size_vram, null if unavailable
}
Where to Write stateJson
In heartbeat.ts:updateRuntimeState, after the existing db.update(agentRuntimeState).set(...) call, add a second update that merges hermes-specific fields when agent.adapterType === "hermes_local". Read result.model for hermesModel. Query skillRegistryDb for hermesNativeSkillCount. Query Ollama /api/ps for hermesMemoryBytes (non-blocking, fire-and-forget).
What to Render
A HermesRuntimeCard component in AgentOverview (gated by adapterType === "hermes_local"):
- Model name (from stateJson.hermesModel)
- Native skill count (from stateJson.hermesNativeSkillCount)
- Memory usage (from stateJson.hermesMemoryBytes, formatted as "X.X GB" or "Not loaded")
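The memory line of the card needs a small formatter. The sketch below assumes binary gigabytes (1024^3 bytes) and treats only a missing value as "Not loaded"; both choices and the name formatHermesMemory are assumptions, since the research only specifies the "X.X GB" and "Not loaded" output forms.

```typescript
const BYTES_PER_GB = 1024 * 1024 * 1024; // assumed: binary gigabytes

// Hypothetical formatter for stateJson.hermesMemoryBytes.
function formatHermesMemory(bytes: number | null | undefined): string {
  if (bytes === null || bytes === undefined) {
    return "Not loaded";
  }
  return (bytes / BYTES_PER_GB).toFixed(1) + " GB";
}
```

A value of 0 is formatted as "0.0 GB" and not as "Not loaded", on the assumption that a model running fully on CPU would report zero VRAM while still being loaded.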
Environment Availability
| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| Ollama daemon | OLLA-01 through OLLA-05 | No (not installed) | — | All paths degrade gracefully; UI shows install instructions |
| hermes-paperclip-adapter | HERM-05, HERM-06, HERM-07 | Yes | 0.2.1 | — |
| Node.js fetch | Ollama HTTP probing | Yes | built-in (Node 18+) | — |
| Node.js os module | OLLA-04 RAM reading | Yes | built-in | — |
| Vitest | Tests | Yes | (server vitest.config.ts) | — |
Missing dependencies with no fallback: None — all Ollama features degrade gracefully when Ollama is absent.
Pre-existing test failures (not Phase 28 regressions): 4 test files failing before Phase 28 begins:
- app-hmr-port.test.ts
- plugin-worker-manager.test.ts
- heartbeat-workspace-session.test.ts (5 tests)
- skill-registry-routes.test.ts (1 test)
Validation Architecture
Test Framework
| Property | Value |
|---|---|
| Framework | Vitest (server) |
| Config file | server/vitest.config.ts |
| Quick run command | cd server && npx vitest run src/__tests__/ollama-service.test.ts |
| Full suite command | cd server && npx vitest run |
Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|---|---|---|---|---|
| OLLA-01 | detectOllama() returns installed: false when Ollama absent | unit | npx vitest run src/__tests__/ollama-service.test.ts | No (Wave 0) |
| OLLA-01 | detectOllama() returns installed: true + version when Ollama present | unit | same | No (Wave 0) |
| OLLA-01 | detectOllama() times out cleanly (AbortController) | unit | same | No (Wave 0) |
| OLLA-02 | listOllamaModels() returns AdapterModel[] from /api/tags | unit | same | No (Wave 0) |
| OLLA-04 | buildModelRecommendation() returns correct model for given RAM budget | unit | same | No (Wave 0) |
| OLLA-05 | Routes return installUrl when Ollama absent | unit | same | No (Wave 0) |
| HERM-05 | Skills tab renders originLabel: "Hermes skill" badge | manual-only | n/a | n/a |
| HERM-06 | updateRuntimeState records cost event when Hermes emits token data | unit (existing pattern) | npx vitest run src/__tests__/costs-service.test.ts | Yes |
| HERM-07 | stateJson receives hermesModel/hermesNativeSkillCount after run | unit | npx vitest run src/__tests__/ollama-service.test.ts | No (Wave 0) |
Sampling Rate
- Per task commit: cd server && npx vitest run src/__tests__/ollama-service.test.ts
- Per wave merge: cd server && npx vitest run
- Phase gate: Full suite green before /gsd:verify-work (excluding 4 pre-existing failures)
Wave 0 Gaps
- server/src/__tests__/ollama-service.test.ts: covers OLLA-01, OLLA-02, OLLA-04, OLLA-05, HERM-07 stateJson logic
- Test stubs use mock fetch (AbortController pattern); no real Ollama needed
State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| Manual text entry for Hermes model | Dropdown fed from Ollama + manual fallback | Phase 28 | Better UX for local models |
| stateJson unused for Hermes | stateJson stores hermesModel, skillCount, memoryBytes | Phase 28 | Dashboard can show runtime info |
| Hermes native skills in separate table only | Skills tab renders both managed + native in unified view | Phase 28 (HERM-05 completion) | Unified skill surface |
Open Questions
- Should Ollama route be gated to hermes_local only?
  - What we know: Only Hermes uses the Ollama custom endpoint pattern currently
  - What's unclear: Future adapters (Phase 29 defaults) may also use Ollama
  - Recommendation: Mount under /companies/:companyId/ollama/* without adapter-type gating; the endpoint is useful generically and Pi/OpenCode adapters may benefit in Phase 29
- Should listOllamaModels also extend the hermes adapter's listModels function?
  - What we know: listAdapterModels("hermes_local") already calls adapter.listModels() if present; hermes adapter has no listModels implementation (returns models: [])
  - What's unclear: Whether to add listModels to hermes adapter (requires adapter package change) or use a separate Ollama API route in Nexus
  - Recommendation: Use a separate Nexus route (/companies/:companyId/ollama/models). Avoids changing the hermes-paperclip-adapter package (external dependency). The config-fields.tsx component can call the Nexus route directly. Do not modify the hermes-paperclip-adapter package.
- stateJson hermesNativeSkillCount: count from skillRegistry or from adapter snapshot?
  - What we know: skillRegistryDb is a separate libSQL DB; querying it in updateRuntimeState adds cross-DB complexity
  - What's unclear: Is the extra query worth it for a display-only count?
  - Recommendation: Store the count from result.resultJson if Hermes emits it, or derive from the adapter skill snapshot after run. Alternatively, skip native skill count from stateJson and derive it in the UI from agentsApi.skills(agentId) query. The UI approach avoids cross-DB concerns in heartbeat.
Sources
Primary (HIGH confidence)
- hermes-paperclip-adapter@0.2.1 dist source code (execute.js, skills.js, detect-model.js, test.js, constants.js), read directly from /opt/nexus/server/node_modules/hermes-paperclip-adapter/dist/
- Nexus codebase (server/src/services/heartbeat.ts, server/src/services/costs.ts, server/src/services/skill-registry.ts, ui/src/pages/AgentDetail.tsx, ui/src/adapters/hermes-local/config-fields.tsx), read directly
- Ollama REST API (https://docs.ollama.com/api/tags): verified /api/tags response shape with details.parameter_size, details.family, details.quantization_level
- Node.js built-ins (os.totalmem(), fetch with AbortController): confirmed available in Node 18+ runtime
Secondary (MEDIUM confidence)
- Hermes Agent provider docs (https://hermes-agent.nousresearch.com/docs/integrations/providers/): verified "ollama uses custom provider + localhost:11434/v1 base_url"
- Hermes Agent + Ollama guide (Medium/Substack articles cross-referencing official docs): confirmed custom endpoint configuration steps
Tertiary (LOW confidence)
- Ollama model RAM requirements (catalog) — community sources + Ollama model page tags — use conservative estimates; verify against https://ollama.com/library model pages before shipping
Metadata
Confidence breakdown:
- Ollama API: HIGH — verified from official docs, response shapes confirmed
- Hermes + Ollama provider mapping: HIGH — verified from official Hermes provider docs
- Standard stack: HIGH — all existing infrastructure confirmed from source code
- Architecture patterns: HIGH — follow existing codex-models.ts, heartbeat.ts, config-fields.tsx patterns exactly
- HERM-05 data layer status: HIGH — verified syncHermesNativeSkills exists and is already called
- HERM-06 cost tracking: HIGH — execute.js returns usage/costUsd, heartbeat.ts wires it to costService
- Pitfalls: HIGH — derived from actual source code analysis
Research date: 2026-04-01
Valid until: 2026-05-01 (Ollama API is stable; hermes-paperclip-adapter may receive new releases)