nexus/.planning/milestone-queue/v1.4-REQUIREMENTS.md
Mikkel Georgsen 6c4272ce85 [nexus] chore: migrate .planning/ from agent repo to nexus repo
Planning artifacts (milestones v1.0-v1.2.1, v1.3 queue, PROJECT.md,
STATE.md, config) now live alongside the code they describe.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 03:55:42 +00:00

Requirements: v1.4 Hermes as Default Inference Provider + Web Control Plane

Version: 1.4
Status: Queued
Depends on: v1.2.0.1 (nxr), v1.2.1 (Universal Skills), v1.3 (Web Chat)
Source PRD: ~/Downloads/nexus-v1.4-prd.md


Summary

Category Count
ONBOARD 8
SCORE 6
UPGRADE 6
WEBCP 14
SCAN 5
SKILL 7
NXR 9
DATA 7
Total 62

ONBOARD — Free-by-Default Onboarding

  • ONBOARD-01 CLI onboarding (npx buildthis) prompts the user to optionally install Hermes Agent, explaining it runs entirely on free models with zero API key required.
  • ONBOARD-02 CLI onboarding detects available local AI capabilities at install time, namely Ollama (loaded model, GPU/RAM) and faster-whisper, and reports their availability both in the output and in the onboarding DB record.
  • ONBOARD-03 When the user accepts Hermes install, onboarding automatically creates three default agents (PM, Engineer, Hermes) with role-appropriate free models assigned via the scoring algorithm.
  • ONBOARD-04 Onboarding output displays each created agent's name, assigned model, model metadata (context window, capabilities), and pre-loaded skillset.
  • ONBOARD-05 The zero-key experience allows all agents to run without an OpenRouter API key by using the openrouter/free auto-router; free-tier rate limits (50 requests/day without credits, 1000 requests/day with $10 or more in credits) are communicated clearly at the end of onboarding.
  • ONBOARD-06 When local Ollama is detected during onboarding, auxiliary tasks (compression, vision, web extract) are automatically configured to route locally at zero cost with no rate limits.
  • ONBOARD-07 The web onboarding wizard (/setup) is feature-equivalent to the CLI flow, with five steps (workspace name, Hermes install pitch, auto-create agents, optional API key paste, dashboard redirect), and is idempotent with respect to the CLI flow.
  • ONBOARD-08 Running both the CLI onboarding and web wizard produces the same default state and causes no duplication — the onboarding table record prevents re-execution.
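
The idempotency guard in ONBOARD-08 could be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the onboarding record lives in a SQLite-compatible table, uses a reduced column set, and introduces a hypothetical `create_agents` callback and a single-row `CHECK (id = 1)` constraint that do not appear in the requirements.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Callable, List


def ensure_onboarded(
    conn: sqlite3.Connection,
    workspace_name: str,
    create_agents: Callable[[], List[str]],  # hypothetical callback
) -> bool:
    """Run onboarding at most once; return True if this call performed it."""
    # ASSUMPTION: reduced schema; CHECK (id = 1) limits the table to one row.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS onboarding (
               id INTEGER PRIMARY KEY CHECK (id = 1),
               completed_at TEXT NOT NULL,
               workspace_name TEXT NOT NULL,
               agents_created TEXT NOT NULL
           )"""
    )
    with conn:  # commits on success, rolls back on exception
        if conn.execute("SELECT 1 FROM onboarding").fetchone():
            return False  # CLI or web wizard already completed onboarding
        agent_ids = create_agents()
        conn.execute(
            "INSERT INTO onboarding (id, completed_at, workspace_name, agents_created) "
            "VALUES (1, ?, ?, ?)",
            (datetime.now(timezone.utc).isoformat(), workspace_name, json.dumps(agent_ids)),
        )
    return True
```

Both entry points (CLI and /setup) would call the same guard, so whichever runs second becomes a no-op.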

SCORE — Model Scoring Engine

  • SCORE-01 A deterministic scoring function exists in nxr that assigns a numeric score to any free model for a given role using the formula: (context_length * 0.3) + (has_tools * 30) + (has_reasoning * 20) + (throughput * 0.1).
  • SCORE-02 The scoring function filters the or_model_catalog to only is_free = 1 AND is_available = 1 models, applies role requirements (tools support, vision, reasoning, minimum context window), and returns the highest-scoring model for the role.
  • SCORE-03 If no model satisfies the role requirements, the function falls back to the openrouter/free auto-router; if Ollama is available, local Qwen3.5 9B serves as the emergency fallback.
  • SCORE-04 The scoring function covers all six role categories: pm, engineer, hermes, creative, research, and custom — each with documented minimum requirements.
  • SCORE-05 Scoring results are persisted to the model_scores libSQL table with model_id, role, composite score, sub-scores (context, tools, reasoning, throughput), is_free, and scored_at timestamp.
  • SCORE-06 The scoring function is callable from both the nxr CLI (nxr agents rescore) and from the Go HTTP backend API (POST /api/models/rescore), producing identical output for the same catalog state.
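
SCORE-01 through SCORE-03 can be illustrated with a short sketch. The formula and the `is_free`/`is_available` filter come from the requirements; everything else is an assumption made for illustration: the class and function names, the units (context length in thousands of tokens, throughput in tokens per second, neither of which the requirements specify), the tie-break on model ID, and the example model IDs. The local Ollama emergency fallback is left out.

```python
from dataclasses import dataclass
from typing import Iterable

FALLBACK_ROUTER = "openrouter/free"  # SCORE-03 fallback target


@dataclass(frozen=True)
class CatalogModel:
    model_id: str
    context_length: float  # ASSUMPTION: thousands of tokens
    throughput: float      # ASSUMPTION: tokens per second
    has_tools: bool
    has_reasoning: bool
    has_vision: bool
    is_free: bool
    is_available: bool


@dataclass(frozen=True)
class RoleRequirements:
    needs_tools: bool = False
    needs_reasoning: bool = False
    needs_vision: bool = False
    min_context_length: float = 0.0


def score(m: CatalogModel) -> float:
    """SCORE-01 formula, applied to the fields as given."""
    return (m.context_length * 0.3 + int(m.has_tools) * 30
            + int(m.has_reasoning) * 20 + m.throughput * 0.1)


def meets(m: CatalogModel, req: RoleRequirements) -> bool:
    """SCORE-02 filter: free, available, and satisfying the role requirements."""
    return (m.is_free and m.is_available
            and (m.has_tools or not req.needs_tools)
            and (m.has_reasoning or not req.needs_reasoning)
            and (m.has_vision or not req.needs_vision)
            and m.context_length >= req.min_context_length)


def pick_model(catalog: Iterable[CatalogModel], req: RoleRequirements) -> str:
    candidates = [m for m in catalog if meets(m, req)]
    if not candidates:
        return FALLBACK_ROUTER
    # Highest score wins; ties break on model_id so the result is deterministic.
    return min(candidates, key=lambda m: (-score(m), m.model_id)).model_id


# Hypothetical catalog entries, for illustration only.
catalog = [
    CatalogModel("example/a:free", 128, 50, True, False, False, True, True),
    CatalogModel("example/b:free", 32, 100, True, True, False, True, True),
    CatalogModel("example/c", 200, 200, True, True, True, False, True),
]
print(pick_model(catalog, RoleRequirements(needs_tools=True)))
```

Keeping the score a pure function of catalog fields is what lets the CLI and the HTTP backend produce identical output for the same catalog state (SCORE-06).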

UPGRADE — API Key Upgrade Flow

  • UPGRADE-01 nxr upgrade (interactive mode) validates the provided OpenRouter API key, detects account balance, and displays the free tier unlock status (50/day → 1000/day).
  • UPGRADE-02 After key validation, nxr upgrade presents a per-agent upgrade preview showing each agent's current free model and its recommended paid replacement with price per million tokens and context window.
  • UPGRADE-03 The user can respond Y (upgrade all), n (store key only, keep free models), or custom (per-agent model picker TUI) — each path executes correctly and writes to config atomically.
  • UPGRADE-04 nxr upgrade --all upgrades all agents to their recommended paid models non-interactively; nxr upgrade --agent "<name>" upgrades a single named agent.
  • UPGRADE-05 nxr upgrade --revert switches all agents back to their last-recorded free model assignments; agent memory, skills, and session history are preserved across all upgrade and revert operations.
  • UPGRADE-06 The web agent manager (/agents/manage) exposes a "Bulk Upgrade" button and per-agent upgrade controls that call POST /api/agents/bulk-upgrade and GET /api/agents/:id/recommend respectively, with the same logic as the CLI flow.
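
UPGRADE-03 requires that config writes be atomic. One common way to get that on a POSIX-style filesystem is to write a temporary file in the same directory and rename it over the target. The sketch below shows that technique; the requirements do not say how nxr implements atomic writes, and the helper name `write_atomic` is hypothetical.

```python
import os
import tempfile


def write_atomic(path: str, text: str) -> None:
    """Replace `path` with `text` so readers see the old or new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # push data to disk before the rename
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

The same helper would serve any path that rewrites the config, including the revert flow, since a crash mid-write leaves the previous file intact.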

WEBCP — Web Control Plane (Nexus Hub)

Process Control (/hermes)

  • WEBCP-01 The /hermes page displays live Hermes process status: PID, uptime, current model, and tmux viewer count; data is fetched from GET /api/hermes/ps.
  • WEBCP-02 The /hermes page provides Start, Stop, and Restart buttons that call POST /api/hermes/up, POST /api/hermes/down, and POST /api/hermes/restart respectively, with visible success/failure feedback.
  • WEBCP-03 The /hermes page shows an Ollama status card with loaded model, GPU usage, and throughput when Ollama is available.

Model Switcher (/models/switch)

  • WEBCP-04 The /models/switch page renders a visual slot editor showing all 7 routing slots (primary, fallback, simple, vision, web_extract, approval, compression) with their currently assigned models.
  • WEBCP-05 Clicking any slot opens a model picker modal with fuzzy search and filter toggles (free, tools, vision, reasoning, MoE); selecting a model and confirming calls PUT /api/routing/:slot and writes config.yaml atomically.
  • WEBCP-06 The model picker modal displays a price comparison side panel showing cost per million input/output tokens for the currently selected model vs. the active slot model.

Agent Manager (/agents/manage)

  • WEBCP-07 The /agents/manage page lists all agents with per-agent model assignment, role-aware model recommendation (from GET /api/agents/:id/recommend), last heartbeat, message count, cost, and error rate.
  • WEBCP-08 The /agents/manage page allows skill assignment per agent using category templates, calling POST /api/agents/:id/skills.
  • WEBCP-09 The /agents/manage page includes a "Create Agent" flow: category picker → auto model assignment → auto skill suggestion → name input → confirm — calling the existing agent creation API with the new role, skillset, and model_auto_assigned fields.

Budget Dashboard (/budget)

  • WEBCP-10 The /budget page shows a real-time free tier gauge (requests used / daily limit) sourced from hermes_tracking.db usage data.
  • WEBCP-11 The /budget page shows per-agent cost breakdown for today, last 7 days, and last 30 days, plus a cost projection graph and a rate limit event log.
  • WEBCP-12 The /budget page provides an "Export as CSV" action that downloads the usage data for the selected time range.

Notifications Center (/notifications)

  • WEBCP-13 The /notifications page displays all notification types from v1.2.1 plus new v1.4 types: rate limit warnings, agent auto-upgrade events, and model availability alerts; each notification can be marked read via PUT /api/notifications/:id/read.
  • WEBCP-14 The /notifications page includes per-type Telegram forwarding toggles that persist via POST /api/notifications/settings.

SCAN — Scanner Updates

  • SCAN-01 After each 6-hourly OpenRouter catalog scan, the scanner re-scores all free models for every role category using the SCORE-01 algorithm and persists results to model_scores.
  • SCAN-02 If a re-score reveals a better free model for an active agent's role, the scanner creates a notification with the old model name, new model name, role, and score delta — plus a one-click upgrade action.
  • SCAN-03 A config flag auto_upgrade_free_models in ~/.hermes/config.yaml (default false) controls whether the scanner auto-switches agents to better free models or only notifies.
  • SCAN-04 The scanner queries model_usage to compute average free requests per hour for the current day and projects whether the workspace will hit the daily limit before midnight UTC; this projection is surfaced in the dashboard and nxr budget.
  • SCAN-05 Rate limit threshold warnings are triggered at 70% (dashboard warning), 90% (Telegram notification if gateway configured), and 100% (agents queue tasks until midnight UTC reset, or route to local Qwen if available).
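
The projection in SCAN-04 and the thresholds in SCAN-05 amount to simple arithmetic, sketched below. The function names and returned labels are hypothetical, and the sketch assumes the projection is a linear extrapolation of today's average hourly rate, which is one reading of the requirement rather than a confirmed algorithm.

```python
from datetime import datetime, timezone


def project_daily_usage(used_today: int, now: datetime) -> float:
    """Extrapolate today's average hourly rate to a full UTC day.

    `now` must be timezone-aware.
    """
    now = now.astimezone(timezone.utc)
    hours_elapsed = now.hour + now.minute / 60 + now.second / 3600
    if hours_elapsed == 0:
        return float(used_today)  # no elapsed time to average over yet
    return used_today / hours_elapsed * 24


def will_hit_limit(used_today: int, daily_limit: int, now: datetime) -> bool:
    return project_daily_usage(used_today, now) >= daily_limit


def threshold_action(used_today: int, daily_limit: int) -> str:
    """Map usage to the SCAN-05 tiers using integer math to avoid float edge cases."""
    if used_today >= daily_limit:
        return "queue_or_route_local"   # 100%: queue until reset or use local model
    if used_today * 10 >= daily_limit * 9:
        return "telegram_notification"  # 90%
    if used_today * 10 >= daily_limit * 7:
        return "dashboard_warning"      # 70%
    return "ok"
```

For example, 30 requests used by 12:00 UTC projects to 60 for the day, which exceeds the 50 requests/day limit even though the 70% threshold has not yet been crossed.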

SKILL — Default Skillsets and Agent Templates

  • SKILL-01 The PM agent is created with exactly 8 pre-loaded skills: planning, task-breakdown, prioritization, status-reporting, dependency-mapping, sprint-planning, risk-assessment, stakeholder-comms.
  • SKILL-02 The Engineer agent is created with exactly 8 pre-loaded skills: coding, debugging, git-workflow, testing, code-review, refactoring, architecture, documentation.
  • SKILL-03 The Hermes agent is created with exactly 8 pre-loaded skills: memory, web-search, file-ops, cron, usage-tracker, model-scanner, skill-creator, session-search.
  • SKILL-04 All agent skill assignments go through Hermes's skill assignment system so that the existing listSkills/syncSkills adapter API sees the skills correctly.
  • SKILL-05 When creating a custom agent, the user can select a role category (tech, creative, business, research, media, personal); each category has a suggested skill template that pre-populates the skill selector.
  • SKILL-06 Custom category skill templates are defined for all 6 categories: tech (coding, debugging, git-workflow, testing, architecture), creative (creative-writing, screenwriting, worldbuilding, dialogue), business (strategy, proposal-writing, market-analysis, financial-modeling), research (paper-analysis, literature-review, data-analysis, methodology), media (journalism, copywriting, social-media, content-strategy), personal (goal-setting, language-tutoring, fitness, travel-planning).
  • SKILL-07 Skills assigned during onboarding or agent creation are suggestions only — users can add or remove skills freely after creation, and skills from agentskills.io are installable via hermes skills search.
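
The skill lists above map naturally onto a lookup table. The sketch below copies the skill names from SKILL-01 through SKILL-06; the dictionary names and the `suggest_skills` helper are hypothetical and only show how a template might pre-populate a selector without constraining later edits (SKILL-07).

```python
from typing import Dict, List

# Default skillsets for the three onboarding agents (SKILL-01 to SKILL-03).
DEFAULT_SKILLSETS: Dict[str, List[str]] = {
    "pm": ["planning", "task-breakdown", "prioritization", "status-reporting",
           "dependency-mapping", "sprint-planning", "risk-assessment",
           "stakeholder-comms"],
    "engineer": ["coding", "debugging", "git-workflow", "testing", "code-review",
                 "refactoring", "architecture", "documentation"],
    "hermes": ["memory", "web-search", "file-ops", "cron", "usage-tracker",
               "model-scanner", "skill-creator", "session-search"],
}

# Suggested templates for custom agents (SKILL-06).
CATEGORY_TEMPLATES: Dict[str, List[str]] = {
    "tech": ["coding", "debugging", "git-workflow", "testing", "architecture"],
    "creative": ["creative-writing", "screenwriting", "worldbuilding", "dialogue"],
    "business": ["strategy", "proposal-writing", "market-analysis",
                 "financial-modeling"],
    "research": ["paper-analysis", "literature-review", "data-analysis",
                 "methodology"],
    "media": ["journalism", "copywriting", "social-media", "content-strategy"],
    "personal": ["goal-setting", "language-tutoring", "fitness", "travel-planning"],
}


def suggest_skills(category: str) -> List[str]:
    """Return a copy of the template so callers can add or remove skills freely."""
    return list(CATEGORY_TEMPLATES.get(category, []))
```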

NXR — nxr Additions

  • NXR-01 nxr init runs the interactive onboarding wizard: workspace name, Hermes install prompt, auto-creates PM + Engineer + Hermes with free models, optional API key prompt, and displays the ready summary.
  • NXR-02 nxr init --free skips the API key prompt and runs in pure free mode without interactive prompts beyond workspace name.
  • NXR-03 nxr init --key <sk-or-...> accepts a pre-set API key and uses it during init without prompting.
  • NXR-04 nxr upgrade runs the interactive upgrade picker as described in UPGRADE-01 through UPGRADE-03.
  • NXR-05 nxr upgrade --all and nxr upgrade --agent "<name>" run non-interactively as described in UPGRADE-04.
  • NXR-06 nxr upgrade --revert reverts all agents to free models as described in UPGRADE-05.
  • NXR-07 nxr agents recommend prints a table showing the recommended model for each agent based on its role, pulled from the scoring algorithm.
  • NXR-08 nxr agents rescore re-runs the free model scoring algorithm for all agents and updates model_scores; output shows any model changes.
  • NXR-09 nxr agents create --role <role> --name <name> --free creates a new agent with auto-selected free model and auto-applied skill template for the given role; TUI Tab 5 gains a "Create Agent" wizard with the same category → model → skills → name flow.

DATA — Data Model Changes

  • DATA-01 The agents table in hermes_tracking.db gains a role TEXT column storing one of: pm, engineer, hermes, custom.
  • DATA-02 The agents table gains a skillset TEXT column storing a JSON array of skill name strings.
  • DATA-03 The agents table gains a model_score REAL DEFAULT 0 column storing the auto-calculated quality score at the time of last model assignment.
  • DATA-04 The agents table gains a model_auto_assigned INTEGER DEFAULT 0 column; value 1 indicates nxr selected the model automatically.
  • DATA-05 The agents table gains workspace_id TEXT and is_default INTEGER DEFAULT 0 columns; is_default = 1 marks agents created during onboarding.
  • DATA-06 A new model_scores table is created in hermes_tracking.db (libSQL) with columns: id, model_id, role, score, context_score, tools_score, reasoning_score, throughput_score, is_free, scored_at; unique constraint on (model_id, role, scored_at); indexes on role and is_free.
  • DATA-07 A new onboarding table is created in hermes_tracking.db (libSQL) with columns: id, completed_at, workspace_name, has_openrouter_key, has_ollama, has_whisper, agents_created (JSON array of agent IDs), initial_free_models (JSON snapshot of model assignments at creation).
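
The schema changes in DATA-01 through DATA-07 could look roughly like the migration below, run here against an in-memory SQLite database (libSQL is a fork of SQLite, so the DDL should carry over). Column names, the stated defaults, the unique constraint, and the indexes come from the requirements; the column types for the two new tables, the NOT NULL choices, the index names, and the minimal stand-in `agents` table are assumptions.

```python
import sqlite3

MIGRATION = """
ALTER TABLE agents ADD COLUMN role TEXT;                           -- DATA-01
ALTER TABLE agents ADD COLUMN skillset TEXT;                       -- DATA-02 (JSON array)
ALTER TABLE agents ADD COLUMN model_score REAL DEFAULT 0;          -- DATA-03
ALTER TABLE agents ADD COLUMN model_auto_assigned INTEGER DEFAULT 0; -- DATA-04
ALTER TABLE agents ADD COLUMN workspace_id TEXT;                   -- DATA-05
ALTER TABLE agents ADD COLUMN is_default INTEGER DEFAULT 0;        -- DATA-05

CREATE TABLE model_scores (                                        -- DATA-06
    id INTEGER PRIMARY KEY,
    model_id TEXT NOT NULL,
    role TEXT NOT NULL,
    score REAL NOT NULL,
    context_score REAL,
    tools_score REAL,
    reasoning_score REAL,
    throughput_score REAL,
    is_free INTEGER NOT NULL DEFAULT 1,
    scored_at TEXT NOT NULL,
    UNIQUE (model_id, role, scored_at)
);
CREATE INDEX idx_model_scores_role ON model_scores (role);
CREATE INDEX idx_model_scores_is_free ON model_scores (is_free);

CREATE TABLE onboarding (                                          -- DATA-07
    id INTEGER PRIMARY KEY,
    completed_at TEXT,
    workspace_name TEXT,
    has_openrouter_key INTEGER DEFAULT 0,
    has_ollama INTEGER DEFAULT 0,
    has_whisper INTEGER DEFAULT 0,
    agents_created TEXT,       -- JSON array of agent IDs
    initial_free_models TEXT   -- JSON snapshot of model assignments
);
"""

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the existing agents table.
conn.execute("CREATE TABLE agents (id TEXT PRIMARY KEY, name TEXT NOT NULL)")
conn.executescript(MIGRATION)
```

Because `scored_at` is part of the unique key, each rescore appends a new row per model and role rather than overwriting the previous one, which preserves score history.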

Out of Scope

The following are explicitly excluded from v1.4 per PRD Section 13:

  • Multi-user auth for web dashboard — single-user, localhost only; future milestone
  • Self-hosted model registries beyond Ollama — future consideration
  • Model training or fine-tuning — models used as-is from OpenRouter
  • Auto-spending user money — free-by-default; paid models require explicit opt-in every time
  • Guaranteed free model availability — scanner detects and adapts when models leave the free tier; no SLA
  • WebSocket live terminal stream (WS /api/hermes/stream) — stretch goal; nxr watch is the primary observation path; ship without and add later
  • Paperclip agent orchestration replacement — Hermes provides inference, Paperclip provides orchestration; they are complementary
  • Blog auto-generation triggers — PRD Section 11 is informational context for v1.2.1 auto-blogging system; not a v1.4 implementation requirement

Traceability

Requirement Phase Status
ONBOARD-01 Phase 30 Pending
ONBOARD-02 Phase 30 Pending
ONBOARD-03 Phase 30 Pending
ONBOARD-04 Phase 30 Pending
ONBOARD-05 Phase 30 Pending
ONBOARD-06 Phase 30 Pending
ONBOARD-07 Phase 30 Pending
ONBOARD-08 Phase 30 Pending
SCORE-01 Phase 28 Pending
SCORE-02 Phase 28 Pending
SCORE-03 Phase 28 Pending
SCORE-04 Phase 28 Pending
SCORE-05 Phase 28 Pending
SCORE-06 Phase 28 Pending
UPGRADE-01 Phase 31 Pending
UPGRADE-02 Phase 31 Pending
UPGRADE-03 Phase 31 Pending
UPGRADE-04 Phase 31 Pending
UPGRADE-05 Phase 31 Pending
UPGRADE-06 Phase 31 Pending
WEBCP-01 Phase 34 Pending
WEBCP-02 Phase 34 Pending
WEBCP-03 Phase 34 Pending
WEBCP-04 Phase 34 Pending
WEBCP-05 Phase 34 Pending
WEBCP-06 Phase 34 Pending
WEBCP-07 Phase 34 Pending
WEBCP-08 Phase 34 Pending
WEBCP-09 Phase 34 Pending
WEBCP-10 Phase 34 Pending
WEBCP-11 Phase 34 Pending
WEBCP-12 Phase 34 Pending
WEBCP-13 Phase 34 Pending
WEBCP-14 Phase 34 Pending
SCAN-01 Phase 32 Pending
SCAN-02 Phase 32 Pending
SCAN-03 Phase 32 Pending
SCAN-04 Phase 32 Pending
SCAN-05 Phase 32 Pending
SKILL-01 Phase 29 Pending
SKILL-02 Phase 29 Pending
SKILL-03 Phase 29 Pending
SKILL-04 Phase 29 Pending
SKILL-05 Phase 29 Pending
SKILL-06 Phase 29 Pending
SKILL-07 Phase 29 Pending
NXR-01 Phase 30 Pending
NXR-02 Phase 30 Pending
NXR-03 Phase 30 Pending
NXR-04 Phase 31 Pending
NXR-05 Phase 31 Pending
NXR-06 Phase 31 Pending
NXR-07 Phase 33 Pending
NXR-08 Phase 33 Pending
NXR-09 Phase 33 Pending
DATA-01 Phase 27 Pending
DATA-02 Phase 27 Pending
DATA-03 Phase 27 Pending
DATA-04 Phase 27 Pending
DATA-05 Phase 27 Pending
DATA-06 Phase 27 Pending
DATA-07 Phase 27 Pending