nexus/.planning/milestone-queue/v1.4-REQUIREMENTS.md
Mikkel Georgsen 6c4272ce85 [nexus] chore: migrate .planning/ from agent repo to nexus repo
Planning artifacts (milestones v1.0-v1.2.1, v1.3 queue, PROJECT.md,
STATE.md, config) now live alongside the code they describe.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 03:55:42 +00:00


# Requirements: v1.4 Hermes as Default Inference Provider + Web Control Plane
**Version:** 1.4
**Status:** Queued
**Depends on:** v1.2.0.1 (nxr), v1.2.1 (Universal Skills), v1.3 (Web Chat)
**Source PRD:** `~/Downloads/nexus-v1.4-prd.md`
---
## Summary
| Category | Count |
|----------|-------|
| ONBOARD | 8 |
| SCORE | 6 |
| UPGRADE | 6 |
| WEBCP | 14 |
| SCAN | 5 |
| SKILL | 7 |
| NXR | 9 |
| DATA | 7 |
| **Total** | **62** |
---
## ONBOARD — Free-by-Default Onboarding
- [ ] **ONBOARD-01** CLI onboarding (`npx buildthis`) prompts the user to optionally install Hermes Agent, explaining it runs entirely on free models with zero API key required.
- [ ] **ONBOARD-02** CLI onboarding detects available local AI capabilities at install time (Ollama, including the loaded model and GPU/RAM, and faster-whisper) and records their availability both in the output and in the `onboarding` DB record.
- [ ] **ONBOARD-03** When the user accepts Hermes install, onboarding automatically creates three default agents (PM, Engineer, Hermes) with role-appropriate free models assigned via the scoring algorithm.
- [ ] **ONBOARD-04** Onboarding output displays each created agent's name, assigned model, model metadata (context window, capabilities), and pre-loaded skillset.
- [ ] **ONBOARD-05** The zero-key experience allows all agents to run without an OpenRouter API key using the `openrouter/free` auto-router; free tier rate limits (50 requests/day without credits, 1,000 requests/day with $10+ in credits) are communicated clearly at the end of onboarding.
- [ ] **ONBOARD-06** When local Ollama is detected during onboarding, auxiliary tasks (compression, vision, web extract) are automatically configured to route locally at zero cost with no rate limits.
- [ ] **ONBOARD-07** The web onboarding wizard (`/setup`) is feature-equivalent to the CLI flow, with five steps (workspace name, Hermes install pitch, auto-create agents, optional API key paste, dashboard redirect), and is idempotent with respect to the CLI flow.
- [ ] **ONBOARD-08** Running both the CLI onboarding and web wizard produces the same default state and causes no duplication — the `onboarding` table record prevents re-execution.
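
The guard described in ONBOARD-08 could look roughly like the sketch below. It assumes Python's standard `sqlite3` module as a stand-in for libSQL, uses the column names from DATA-07 with assumed column types, and treats any row with a non-null `completed_at` as "already onboarded". `run_onboarding_once` and the `create_default_agents` callback are hypothetical names, and the sketch does not handle the CLI and web flows racing at the same instant.

```python
import json
import sqlite3
from datetime import datetime, timezone

# Column names follow DATA-07; the column types are assumptions.
ONBOARDING_DDL = """
CREATE TABLE IF NOT EXISTS onboarding (
    id INTEGER PRIMARY KEY,
    completed_at TEXT,
    workspace_name TEXT,
    has_openrouter_key INTEGER DEFAULT 0,
    has_ollama INTEGER DEFAULT 0,
    has_whisper INTEGER DEFAULT 0,
    agents_created TEXT,
    initial_free_models TEXT
)
"""


def run_onboarding_once(conn, workspace_name, create_default_agents):
    """Run onboarding unless a completed record exists. Returns True if it ran."""
    conn.execute(ONBOARDING_DDL)
    done = conn.execute(
        "SELECT 1 FROM onboarding WHERE completed_at IS NOT NULL LIMIT 1"
    ).fetchone()
    if done:
        return False  # CLI or web wizard already finished; do nothing

    # Hypothetical callback: returns {agent_id: assigned_model_id}.
    assignments = create_default_agents()

    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT INTO onboarding "
            "(completed_at, workspace_name, agents_created, initial_free_models) "
            "VALUES (?, ?, ?, ?)",
            (
                datetime.now(timezone.utc).isoformat(),
                workspace_name,
                json.dumps(sorted(assignments)),  # JSON array of agent IDs
                json.dumps(assignments),          # JSON snapshot of assignments
            ),
        )
    return True
```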
---
## SCORE — Model Scoring Engine
- [ ] **SCORE-01** A deterministic scoring function exists in `nxr` that assigns a numeric score to any free model for a given role using the formula: `(context_length * 0.3) + (has_tools * 30) + (has_reasoning * 20) + (throughput * 0.1)`.
- [ ] **SCORE-02** The scoring function filters the `or_model_catalog` to only `is_free = 1 AND is_available = 1` models, applies role requirements (tools support, vision, reasoning, minimum context window), and returns the highest-scoring model for the role.
- [ ] **SCORE-03** If no model satisfies the role requirements, the function falls back to the `openrouter/free` auto-router; if Ollama is available, local Qwen3.5 9B is the emergency fallback.
- [ ] **SCORE-04** The scoring function covers all six role categories: `pm`, `engineer`, `hermes`, `creative`, `research`, and `custom` — each with documented minimum requirements.
- [ ] **SCORE-05** Scoring results are persisted to the `model_scores` libSQL table with `model_id`, `role`, composite score, sub-scores (context, tools, reasoning, throughput), `is_free`, and `scored_at` timestamp.
- [ ] **SCORE-06** The scoring function is callable from both the `nxr` CLI (`nxr agents rescore`) and from the Go HTTP backend API (`POST /api/models/rescore`), producing identical output for the same catalog state.
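
As an illustration of SCORE-01 through SCORE-03, the following is a minimal sketch rather than the `nxr` implementation. The `Model` and `RoleRequirements` types, their field names, and the tie-break on `model_id` are assumptions introduced for the example. The source does not state units for `context_length` or `throughput`; if `context_length` is a raw token count, the context term will dominate the other three, so a real implementation may need to normalise it. The Ollama emergency fallback is left out.

```python
from dataclasses import dataclass

FALLBACK_MODEL = "openrouter/free"  # auto-router named in SCORE-03


@dataclass(frozen=True)
class Model:  # hypothetical shape of one or_model_catalog row
    model_id: str
    context_length: int
    has_tools: bool
    has_reasoning: bool
    has_vision: bool
    throughput: float
    is_free: bool
    is_available: bool


@dataclass(frozen=True)
class RoleRequirements:  # hypothetical per-role minimums (SCORE-02)
    needs_tools: bool = False
    needs_vision: bool = False
    needs_reasoning: bool = False
    min_context: int = 0


def score(m: Model) -> float:
    """The SCORE-01 formula, applied term by term."""
    return (
        m.context_length * 0.3
        + int(m.has_tools) * 30
        + int(m.has_reasoning) * 20
        + m.throughput * 0.1
    )


def meets(m: Model, req: RoleRequirements) -> bool:
    return (
        (m.has_tools or not req.needs_tools)
        and (m.has_vision or not req.needs_vision)
        and (m.has_reasoning or not req.needs_reasoning)
        and m.context_length >= req.min_context
    )


def best_model(catalog: list[Model], req: RoleRequirements) -> str:
    """Return the highest-scoring free, available model, or the fallback."""
    candidates = [
        m for m in catalog if m.is_free and m.is_available and meets(m, req)
    ]
    if not candidates:
        return FALLBACK_MODEL
    # Ties are broken by model_id so the result stays deterministic.
    return min(candidates, key=lambda m: (-score(m), m.model_id)).model_id
```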
---
## UPGRADE — API Key Upgrade Flow
- [ ] **UPGRADE-01** `nxr upgrade` (interactive mode) validates the provided OpenRouter API key, detects account balance, and displays the free tier unlock status (50/day → 1000/day).
- [ ] **UPGRADE-02** After key validation, `nxr upgrade` presents a per-agent upgrade preview showing each agent's current free model and its recommended paid replacement with price per million tokens and context window.
- [ ] **UPGRADE-03** The user can respond `Y` (upgrade all), `n` (store key only, keep free models), or `custom` (per-agent model picker TUI) — each path executes correctly and writes to config atomically.
- [ ] **UPGRADE-04** `nxr upgrade --all` upgrades all agents to their recommended paid models non-interactively; `nxr upgrade --agent "<name>"` upgrades a single named agent.
- [ ] **UPGRADE-05** `nxr upgrade --revert` switches all agents back to their last-recorded free model assignments; agent memory, skills, and session history are preserved across all upgrade and revert operations.
- [ ] **UPGRADE-06** The web agent manager (`/agents/manage`) exposes a "Bulk Upgrade" button and per-agent upgrade controls that call `POST /api/agents/bulk-upgrade` and `GET /api/agents/:id/recommend` respectively, with the same logic as the CLI flow.
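
UPGRADE-03 requires config writes to be atomic. One common way to get that property is to write a temporary file in the same directory and rename it over the target, as sketched below. The helper name `write_atomically` is hypothetical, and the sketch assumes the config is handled as already-serialised text; it says nothing about how `nxr` actually writes its config.

```python
import os
import tempfile


def write_atomically(path: str, text: str) -> None:
    """Replace `path` with `text` so readers never see a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target,
    # otherwise the final rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic swap over any existing file
    except BaseException:
        # Clean up the temp file if anything failed before the swap.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```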
---
## WEBCP — Web Control Plane (Nexus Hub)
### Process Control (`/hermes`)
- [ ] **WEBCP-01** The `/hermes` page displays live Hermes process status: PID, uptime, current model, and tmux viewer count; data is fetched from `GET /api/hermes/ps`.
- [ ] **WEBCP-02** The `/hermes` page provides Start, Stop, and Restart buttons that call `POST /api/hermes/up`, `POST /api/hermes/down`, and `POST /api/hermes/restart` respectively, with visible success/failure feedback.
- [ ] **WEBCP-03** The `/hermes` page shows an Ollama status card with loaded model, GPU usage, and throughput when Ollama is available.
### Model Switcher (`/models/switch`)
- [ ] **WEBCP-04** The `/models/switch` page renders a visual slot editor showing all 7 routing slots (primary, fallback, simple, vision, web_extract, approval, compression) with their currently assigned models.
- [ ] **WEBCP-05** Clicking any slot opens a model picker modal with fuzzy search and filter toggles (free, tools, vision, reasoning, MoE); selecting a model and confirming calls `PUT /api/routing/:slot` and writes `config.yaml` atomically.
- [ ] **WEBCP-06** The model picker modal displays a price comparison side panel showing cost per million input/output tokens for the currently selected model vs. the active slot model.
### Agent Manager (`/agents/manage`)
- [ ] **WEBCP-07** The `/agents/manage` page lists all agents with per-agent model assignment, role-aware model recommendation (from `GET /api/agents/:id/recommend`), last heartbeat, message count, cost, and error rate.
- [ ] **WEBCP-08** The `/agents/manage` page allows skill assignment per agent using category templates, calling `POST /api/agents/:id/skills`.
- [ ] **WEBCP-09** The `/agents/manage` page includes a "Create Agent" flow: category picker → auto model assignment → auto skill suggestion → name input → confirm — calling the existing agent creation API with the new `role`, `skillset`, and `model_auto_assigned` fields.
### Budget Dashboard (`/budget`)
- [ ] **WEBCP-10** The `/budget` page shows a real-time free tier gauge (requests used / daily limit) sourced from `hermes_tracking.db` usage data.
- [ ] **WEBCP-11** The `/budget` page shows per-agent cost breakdown for today, last 7 days, and last 30 days, plus a cost projection graph and a rate limit event log.
- [ ] **WEBCP-12** The `/budget` page provides an "Export as CSV" action that downloads the usage data for the selected time range.
### Notifications Center (`/notifications`)
- [ ] **WEBCP-13** The `/notifications` page displays all notification types from v1.2.1 plus new v1.4 types: rate limit warnings, agent auto-upgrade events, and model availability alerts; each notification can be marked read via `PUT /api/notifications/:id/read`.
- [ ] **WEBCP-14** The `/notifications` page includes per-type Telegram forwarding toggles that persist via `POST /api/notifications/settings`.
---
## SCAN — Scanner Updates
- [ ] **SCAN-01** After each 6-hourly OpenRouter catalog scan, the scanner re-scores all free models for every role category using the SCORE-01 algorithm and persists results to `model_scores`.
- [ ] **SCAN-02** If a re-score reveals a better free model for an active agent's role, the scanner creates a notification with the old model name, new model name, role, and score delta — plus a one-click upgrade action.
- [ ] **SCAN-03** A config flag `auto_upgrade_free_models` in `~/.hermes/config.yaml` (default `false`) controls whether the scanner auto-switches agents to better free models or only notifies.
- [ ] **SCAN-04** The scanner queries `model_usage` to compute average free requests per hour for the current day and projects whether the workspace will hit the daily limit before midnight UTC; this projection is surfaced in the dashboard and `nxr budget`.
- [ ] **SCAN-05** Rate limit threshold warnings are triggered at 70% (dashboard warning), 90% (Telegram notification if gateway configured), and 100% (agents queue tasks until midnight UTC reset, or route to local Qwen if available).
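
The projection in SCAN-04 and the thresholds in SCAN-05 reduce to a few lines of arithmetic. The sketch below assumes the request count for the current UTC day has already been read from `model_usage`; the function names and the returned action labels are hypothetical.

```python
from datetime import datetime, timezone


def project_daily_usage(used_today: int, now: datetime) -> float:
    """Extrapolate today's average hourly rate to a full 24-hour UTC day."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    hours_elapsed = (now - midnight).total_seconds() / 3600
    if hours_elapsed <= 0:
        return float(used_today)  # nothing to extrapolate from yet
    rate_per_hour = used_today / hours_elapsed
    return rate_per_hour * 24


def will_hit_limit(used_today: int, daily_limit: int, now: datetime) -> bool:
    """True if the projected total reaches the limit before midnight UTC."""
    return project_daily_usage(used_today, now) >= daily_limit


def threshold_action(used_today: int, daily_limit: int) -> str:
    """Map usage to the SCAN-05 tiers (labels are illustrative)."""
    percent = 100 * used_today / daily_limit
    if percent >= 100:
        return "queue_or_route_local"
    if percent >= 90:
        return "telegram_notification"
    if percent >= 70:
        return "dashboard_warning"
    return "ok"
```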
---
## SKILL — Default Skillsets and Agent Templates
- [ ] **SKILL-01** The PM agent is created with exactly 8 pre-loaded skills: `planning`, `task-breakdown`, `prioritization`, `status-reporting`, `dependency-mapping`, `sprint-planning`, `risk-assessment`, `stakeholder-comms`.
- [ ] **SKILL-02** The Engineer agent is created with exactly 8 pre-loaded skills: `coding`, `debugging`, `git-workflow`, `testing`, `code-review`, `refactoring`, `architecture`, `documentation`.
- [ ] **SKILL-03** The Hermes agent is created with exactly 8 pre-loaded skills: `memory`, `web-search`, `file-ops`, `cron`, `usage-tracker`, `model-scanner`, `skill-creator`, `session-search`.
- [ ] **SKILL-04** All agent skill assignments go through Hermes's skill assignment system so that the existing `listSkills`/`syncSkills` adapter API sees the skills correctly.
- [ ] **SKILL-05** When creating a custom agent, the user can select a role category (tech, creative, business, research, media, personal); each category has a suggested skill template that pre-populates the skill selector.
- [ ] **SKILL-06** Custom category skill templates are defined for all 6 categories: tech (coding, debugging, git-workflow, testing, architecture), creative (creative-writing, screenwriting, worldbuilding, dialogue), business (strategy, proposal-writing, market-analysis, financial-modeling), research (paper-analysis, literature-review, data-analysis, methodology), media (journalism, copywriting, social-media, content-strategy), personal (goal-setting, language-tutoring, fitness, travel-planning).
- [ ] **SKILL-07** Skills assigned during onboarding or agent creation are suggestions only — users can add or remove skills freely after creation, and skills from `agentskills.io` are installable via `hermes skills search`.
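
For illustration, the default skillsets (SKILL-01 to SKILL-03) and category templates (SKILL-06) could be held in a plain lookup table like the one below. The skill names are taken from the requirements above; the constant names and the `suggested_skills` helper are hypothetical.

```python
DEFAULT_AGENT_SKILLS = {
    "pm": ["planning", "task-breakdown", "prioritization", "status-reporting",
           "dependency-mapping", "sprint-planning", "risk-assessment",
           "stakeholder-comms"],
    "engineer": ["coding", "debugging", "git-workflow", "testing",
                 "code-review", "refactoring", "architecture", "documentation"],
    "hermes": ["memory", "web-search", "file-ops", "cron", "usage-tracker",
               "model-scanner", "skill-creator", "session-search"],
}

CATEGORY_TEMPLATES = {
    "tech": ["coding", "debugging", "git-workflow", "testing", "architecture"],
    "creative": ["creative-writing", "screenwriting", "worldbuilding", "dialogue"],
    "business": ["strategy", "proposal-writing", "market-analysis",
                 "financial-modeling"],
    "research": ["paper-analysis", "literature-review", "data-analysis",
                 "methodology"],
    "media": ["journalism", "copywriting", "social-media", "content-strategy"],
    "personal": ["goal-setting", "language-tutoring", "fitness",
                 "travel-planning"],
}


def suggested_skills(category: str) -> list[str]:
    """Return a copy of the template so callers can edit it freely (SKILL-07)."""
    return list(CATEGORY_TEMPLATES.get(category, []))
```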
---
## NXR — nxr Additions
- [ ] **NXR-01** `nxr init` runs the interactive onboarding wizard: it asks for a workspace name, offers the Hermes install, auto-creates the PM, Engineer, and Hermes agents with free models, optionally prompts for an API key, and displays the ready summary.
- [ ] **NXR-02** `nxr init --free` skips the API key prompt and runs in pure free mode with no interactive prompts beyond the workspace name.
- [ ] **NXR-03** `nxr init --key <sk-or-...>` accepts a pre-set API key and uses it during init without prompting.
- [ ] **NXR-04** `nxr upgrade` runs the interactive upgrade picker as described in UPGRADE-01 through UPGRADE-03.
- [ ] **NXR-05** `nxr upgrade --all` and `nxr upgrade --agent "<name>"` run non-interactively as described in UPGRADE-04.
- [ ] **NXR-06** `nxr upgrade --revert` reverts all agents to free models as described in UPGRADE-05.
- [ ] **NXR-07** `nxr agents recommend` prints a table showing the recommended model for each agent based on its role, pulled from the scoring algorithm.
- [ ] **NXR-08** `nxr agents rescore` re-runs the free model scoring algorithm for all agents and updates `model_scores`; output shows any model changes.
- [ ] **NXR-09** `nxr agents create --role <role> --name <name> --free` creates a new agent with auto-selected free model and auto-applied skill template for the given role; TUI Tab 5 gains a "Create Agent" wizard with the same category → model → skills → name flow.
---
## DATA — Data Model Changes
- [ ] **DATA-01** The `agents` table in `hermes_tracking.db` gains a `role TEXT` column storing one of: `pm`, `engineer`, `hermes`, `custom`.
- [ ] **DATA-02** The `agents` table gains a `skillset TEXT` column storing a JSON array of skill name strings.
- [ ] **DATA-03** The `agents` table gains a `model_score REAL DEFAULT 0` column storing the auto-calculated quality score at the time of last model assignment.
- [ ] **DATA-04** The `agents` table gains a `model_auto_assigned INTEGER DEFAULT 0` column; value `1` indicates nxr selected the model automatically.
- [ ] **DATA-05** The `agents` table gains `workspace_id TEXT` and `is_default INTEGER DEFAULT 0` columns; `is_default = 1` marks agents created during onboarding.
- [ ] **DATA-06** A new `model_scores` table is created in `hermes_tracking.db` (libSQL) with columns: `id`, `model_id`, `role`, `score`, `context_score`, `tools_score`, `reasoning_score`, `throughput_score`, `is_free`, `scored_at`; unique constraint on `(model_id, role, scored_at)`; indexes on `role` and `is_free`.
- [ ] **DATA-07** A new `onboarding` table is created in `hermes_tracking.db` (libSQL) with columns: `id`, `completed_at`, `workspace_name`, `has_openrouter_key`, `has_ollama`, `has_whisper`, `agents_created` (JSON array of agent IDs), `initial_free_models` (JSON snapshot of model assignments at creation).
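
A possible migration for DATA-01 through DATA-06 is sketched below, again using Python's `sqlite3` as a stand-in for libSQL. Column names, defaults, the unique constraint, and the indexes come from the requirements; the remaining column types, the `NOT NULL` constraints, the index names, and the `migrate` helper are assumptions. The existing `agents` table is assumed to be present already.

```python
import sqlite3

MODEL_SCORES_DDL = """
CREATE TABLE IF NOT EXISTS model_scores (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    model_id TEXT NOT NULL,
    role TEXT NOT NULL,
    score REAL NOT NULL,
    context_score REAL,
    tools_score REAL,
    reasoning_score REAL,
    throughput_score REAL,
    is_free INTEGER NOT NULL DEFAULT 1,
    scored_at TEXT NOT NULL,
    UNIQUE (model_id, role, scored_at)
);
CREATE INDEX IF NOT EXISTS idx_model_scores_role ON model_scores(role);
CREATE INDEX IF NOT EXISTS idx_model_scores_is_free ON model_scores(is_free);
"""

# New `agents` columns from DATA-01 to DATA-05.
AGENT_COLUMNS = [
    ("role", "TEXT"),
    ("skillset", "TEXT"),  # JSON array of skill names
    ("model_score", "REAL DEFAULT 0"),
    ("model_auto_assigned", "INTEGER DEFAULT 0"),
    ("workspace_id", "TEXT"),
    ("is_default", "INTEGER DEFAULT 0"),
]


def migrate(conn: sqlite3.Connection) -> None:
    """Apply the v1.4 schema changes; safe to run more than once."""
    existing = {row[1] for row in conn.execute("PRAGMA table_info(agents)")}
    for name, decl in AGENT_COLUMNS:
        if name not in existing:  # SQLite has no ADD COLUMN IF NOT EXISTS
            conn.execute(f"ALTER TABLE agents ADD COLUMN {name} {decl}")
    conn.executescript(MODEL_SCORES_DDL)
    conn.commit()
```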
---
## Out of Scope
The following are explicitly excluded from v1.4 per PRD Section 13:
- **Multi-user auth for web dashboard** — single-user, localhost only; future milestone
- **Self-hosted model registries beyond Ollama** — future consideration
- **Model training or fine-tuning** — models used as-is from OpenRouter
- **Auto-spending user money** — free-by-default; paid models require explicit opt-in every time
- **Guaranteed free model availability** — scanner detects and adapts when models leave the free tier; no SLA
- **WebSocket live terminal stream (`WS /api/hermes/stream`)** — stretch goal; `nxr watch` is the primary observation path; ship without and add later
- **Paperclip agent orchestration replacement** — Hermes provides inference, Paperclip provides orchestration; they are complementary
- **Blog auto-generation triggers** — PRD Section 11 is informational context for the v1.2.1 auto-blogging system; not a v1.4 implementation requirement
---
## Traceability
| Requirement | Phase | Status |
|-------------|-------|--------|
| ONBOARD-01 | Phase 30 | Pending |
| ONBOARD-02 | Phase 30 | Pending |
| ONBOARD-03 | Phase 30 | Pending |
| ONBOARD-04 | Phase 30 | Pending |
| ONBOARD-05 | Phase 30 | Pending |
| ONBOARD-06 | Phase 30 | Pending |
| ONBOARD-07 | Phase 30 | Pending |
| ONBOARD-08 | Phase 30 | Pending |
| SCORE-01 | Phase 28 | Pending |
| SCORE-02 | Phase 28 | Pending |
| SCORE-03 | Phase 28 | Pending |
| SCORE-04 | Phase 28 | Pending |
| SCORE-05 | Phase 28 | Pending |
| SCORE-06 | Phase 28 | Pending |
| UPGRADE-01 | Phase 31 | Pending |
| UPGRADE-02 | Phase 31 | Pending |
| UPGRADE-03 | Phase 31 | Pending |
| UPGRADE-04 | Phase 31 | Pending |
| UPGRADE-05 | Phase 31 | Pending |
| UPGRADE-06 | Phase 31 | Pending |
| WEBCP-01 | Phase 34 | Pending |
| WEBCP-02 | Phase 34 | Pending |
| WEBCP-03 | Phase 34 | Pending |
| WEBCP-04 | Phase 34 | Pending |
| WEBCP-05 | Phase 34 | Pending |
| WEBCP-06 | Phase 34 | Pending |
| WEBCP-07 | Phase 34 | Pending |
| WEBCP-08 | Phase 34 | Pending |
| WEBCP-09 | Phase 34 | Pending |
| WEBCP-10 | Phase 34 | Pending |
| WEBCP-11 | Phase 34 | Pending |
| WEBCP-12 | Phase 34 | Pending |
| WEBCP-13 | Phase 34 | Pending |
| WEBCP-14 | Phase 34 | Pending |
| SCAN-01 | Phase 32 | Pending |
| SCAN-02 | Phase 32 | Pending |
| SCAN-03 | Phase 32 | Pending |
| SCAN-04 | Phase 32 | Pending |
| SCAN-05 | Phase 32 | Pending |
| SKILL-01 | Phase 29 | Pending |
| SKILL-02 | Phase 29 | Pending |
| SKILL-03 | Phase 29 | Pending |
| SKILL-04 | Phase 29 | Pending |
| SKILL-05 | Phase 29 | Pending |
| SKILL-06 | Phase 29 | Pending |
| SKILL-07 | Phase 29 | Pending |
| NXR-01 | Phase 30 | Pending |
| NXR-02 | Phase 30 | Pending |
| NXR-03 | Phase 30 | Pending |
| NXR-04 | Phase 31 | Pending |
| NXR-05 | Phase 31 | Pending |
| NXR-06 | Phase 31 | Pending |
| NXR-07 | Phase 33 | Pending |
| NXR-08 | Phase 33 | Pending |
| NXR-09 | Phase 33 | Pending |
| DATA-01 | Phase 27 | Pending |
| DATA-02 | Phase 27 | Pending |
| DATA-03 | Phase 27 | Pending |
| DATA-04 | Phase 27 | Pending |
| DATA-05 | Phase 27 | Pending |
| DATA-06 | Phase 27 | Pending |
| DATA-07 | Phase 27 | Pending |