nexus/.planning/milestone-queue/v1.4-ROADMAP.md
Mikkel Georgsen 6c4272ce85 [nexus] chore: migrate .planning/ from agent repo to nexus repo
Planning artifacts (milestones v1.0-v1.2.1, v1.3 queue, PROJECT.md,
STATE.md, config) now live alongside the code they describe.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 03:55:42 +00:00

# Roadmap: v1.4 Hermes as Default Inference Provider + Web Control Plane
**Milestone:** v1.4
**Phases:** 27-34
**Coverage:** 62/62 requirements mapped
**Depends on:** v1.2.0.1 (nxr), v1.2.1 (Universal Skills), v1.3 (Web Chat)

---
## Phases
- [ ] **Phase 27: Data Model** — libSQL schema additions for agents table, model_scores table, and onboarding table
- [ ] **Phase 28: Model Scoring Engine** — Deterministic scoring function, per-role selection algorithm, scoring API
- [ ] **Phase 29: Default Skillsets and Agent Templates** — Curated skill assignments for PM, Engineer, Hermes, and custom category templates
- [ ] **Phase 30: Free-by-Default Onboarding** — CLI and web wizard flows that create three agents on free models with zero API key
- [ ] **Phase 31: API Key Upgrade Flow** — nxr upgrade commands and web bulk-upgrade that switch agents from free to paid models
- [ ] **Phase 32: Scanner Updates** — Post-scan rescoring, better-model notifications, rate limit prediction
- [ ] **Phase 33: nxr Agent Commands** — nxr agents recommend, rescore, create; TUI Tab 5 create wizard
- [ ] **Phase 34: Web Control Plane** — Nexus Hub pages for process control, model switching, agent management, budget, and notifications
---
## Phase Details
### Phase 27: Data Model
**Goal**: The libSQL database in `hermes_tracking.db` has all schema additions v1.4 requires — agents table columns for role and skill metadata, a model_scores table for scoring history, and an onboarding table for wizard state — so every subsequent phase can read and write without migration surprises
**Depends on**: v1.2.0.1 (existing hermes_tracking.db schema)
**Requirements**: DATA-01, DATA-02, DATA-03, DATA-04, DATA-05, DATA-06, DATA-07
**Success Criteria** (what must be TRUE):
1. The `agents` table has all six new columns (`role`, `skillset`, `model_score`, `model_auto_assigned`, `workspace_id`, `is_default`) with correct types and defaults, addable via `ALTER TABLE` without touching upstream Paperclip migrations
2. The `model_scores` table exists with all defined columns, the `UNIQUE(model_id, role, scored_at)` constraint, and indexes on `role` and `is_free`; queries against it return zero rows on a fresh DB without errors
3. The `onboarding` table exists with all defined columns; inserting a record and querying `completed_at IS NOT NULL` works as expected
4. All schema additions use libSQL (not modernc.org/sqlite or any other driver) and are applied via the same migration mechanism used by the rest of nxr
**Plans**: TBD
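
The `ALTER TABLE` requirement in criterion 1 could be approached with a small, re-runnable migration helper. The sketch below is only illustrative: the column names come from this roadmap, but the SQL types, the defaults, and the `pendingAlters` helper are assumptions. SQLite-family engines do not support `ADD COLUMN IF NOT EXISTS`, so the helper skips columns that already exist.

```go
package main

import "fmt"

// column describes one new agents column. The names come from the roadmap;
// the SQL types and defaults are assumptions for illustration only.
type column struct {
	Name string
	Decl string
}

var agentColumns = []column{
	{"role", "TEXT"},
	{"skillset", "TEXT"}, // assumed: JSON-encoded list of skill names
	{"model_score", "REAL"},
	{"model_auto_assigned", "INTEGER NOT NULL DEFAULT 0"},
	{"workspace_id", "TEXT"},
	{"is_default", "INTEGER NOT NULL DEFAULT 0"},
}

// pendingAlters returns ALTER TABLE statements for the columns that are not
// already present, so re-running the migration adds nothing twice.
func pendingAlters(existing map[string]bool) []string {
	var stmts []string
	for _, c := range agentColumns {
		if existing[c.Name] {
			continue
		}
		stmts = append(stmts, fmt.Sprintf("ALTER TABLE agents ADD COLUMN %s %s;", c.Name, c.Decl))
	}
	return stmts
}

func main() {
	// Pretend `role` was added by an earlier, partially applied run.
	for _, s := range pendingAlters(map[string]bool{"role": true}) {
		fmt.Println(s)
	}
}
```

In a real migration the `existing` set would presumably be read from the database (for example via `PRAGMA table_info(agents)`) before the statements are executed.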
### Phase 28: Model Scoring Engine
**Goal**: A deterministic Go scoring function can take any agent role and the current free model catalog, apply the scoring formula, enforce role-specific requirements, and return the best available model — callable from both the CLI and the HTTP API
**Depends on**: Phase 27
**Requirements**: SCORE-01, SCORE-02, SCORE-03, SCORE-04, SCORE-05, SCORE-06
**Success Criteria** (what must be TRUE):
1. Given identical catalog input, the scoring function always returns the same model for the same role (deterministic output verified by test)
2. Running the scorer for each of the six roles (pm, engineer, hermes, creative, research, custom) returns a model that satisfies that role's minimum requirements (tools support, context window, etc.); if no model qualifies, the fallback chain (`openrouter/free` → local Qwen) is used
3. Scoring results are written to the `model_scores` libSQL table with all sub-scores populated and a `scored_at` timestamp
4. `POST /api/models/rescore` triggers the scorer and returns the updated best model per role; `GET /api/models/best-free?role=<role>` returns the current top pick for that role
5. `GET /api/models/roster` returns all free models with their scores for all roles in a single response
**Plans**: TBD
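
One way to satisfy the determinism criterion is to filter by role requirements, then sort with an explicit tie-breaker so equal scores never depend on input order. The following is a minimal sketch under stated assumptions: the `Model` and `Requirement` types, their fields, and the `local/qwen` identifier are hypothetical, and the actual scoring formula is not shown because this roadmap does not define it.

```go
package main

import (
	"fmt"
	"sort"
)

// Model is a hypothetical catalog entry; the real catalog fields are not
// specified in this roadmap.
type Model struct {
	ID            string
	Score         float64 // assumed to be precomputed by the scoring formula
	SupportsTools bool
	ContextWindow int
}

// Requirement is an assumed shape for a role's minimum requirements.
type Requirement struct {
	NeedsTools bool
	MinContext int
}

// fallbackChain mirrors the roadmap's `openrouter/free` → local Qwen order.
// "local/qwen" is a placeholder, not a confirmed model ID.
var fallbackChain = []string{"openrouter/free", "local/qwen"}

// pickBest returns the highest-scoring model that meets the requirement.
// Ties are broken by model ID, so the result does not depend on catalog order.
// If nothing qualifies it returns the first fallback entry; checking whether
// that fallback is reachable is left out of this sketch.
func pickBest(catalog []Model, req Requirement) string {
	var ok []Model
	for _, m := range catalog {
		if req.NeedsTools && !m.SupportsTools {
			continue
		}
		if m.ContextWindow < req.MinContext {
			continue
		}
		ok = append(ok, m)
	}
	if len(ok) == 0 {
		return fallbackChain[0]
	}
	sort.Slice(ok, func(i, j int) bool {
		if ok[i].Score != ok[j].Score {
			return ok[i].Score > ok[j].Score
		}
		return ok[i].ID < ok[j].ID
	})
	return ok[0].ID
}

func main() {
	catalog := []Model{
		{ID: "b/model", Score: 0.9, SupportsTools: true, ContextWindow: 32000},
		{ID: "a/model", Score: 0.9, SupportsTools: true, ContextWindow: 32000},
	}
	fmt.Println(pickBest(catalog, Requirement{NeedsTools: true, MinContext: 16000}))
}
```

Because the comparison is a total order over (score, ID), shuffling the catalog cannot change the winner, which is the property the test in criterion 1 would check.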
### Phase 29: Default Skillsets and Agent Templates
**Goal**: When any agent is created — through onboarding, `nxr agents create`, or the web wizard — the correct curated skillset is automatically applied based on role, and the Paperclip adapter can see those skills via `listSkills`/`syncSkills`
**Depends on**: Phase 27
**Requirements**: SKILL-01, SKILL-02, SKILL-03, SKILL-04, SKILL-05, SKILL-06, SKILL-07
**Success Criteria** (what must be TRUE):
1. A newly created PM agent has exactly 8 skills in its `skillset` column (`planning`, `task-breakdown`, `prioritization`, `status-reporting`, `dependency-mapping`, `sprint-planning`, `risk-assessment`, `stakeholder-comms`)
2. A newly created Engineer agent has exactly 8 skills (`coding`, `debugging`, `git-workflow`, `testing`, `code-review`, `refactoring`, `architecture`, `documentation`) and Hermes agent has exactly 8 skills (`memory`, `web-search`, `file-ops`, `cron`, `usage-tracker`, `model-scanner`, `skill-creator`, `session-search`)
3. The `listSkills` adapter API call on any default agent returns the correct skill list so Paperclip heartbeats can read it
4. Selecting a custom agent category (tech, creative, business, research, media, personal) during creation pre-populates the skill selector with that category's defined template; all 6 categories have templates
5. Adding or removing skills after creation is possible and does not break the `syncSkills` round-trip
**Plans**: TBD
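
The three default skillsets listed above could be held in a simple role-keyed template table. This is a sketch only: the skill names and role keys are taken from this roadmap, while the `defaultSkillsets` map and `skillsFor` helper are hypothetical names.

```go
package main

import "fmt"

// defaultSkillsets holds the curated skills per default role, as listed in
// the Phase 29 success criteria.
var defaultSkillsets = map[string][]string{
	"pm": {
		"planning", "task-breakdown", "prioritization", "status-reporting",
		"dependency-mapping", "sprint-planning", "risk-assessment", "stakeholder-comms",
	},
	"engineer": {
		"coding", "debugging", "git-workflow", "testing",
		"code-review", "refactoring", "architecture", "documentation",
	},
	"hermes": {
		"memory", "web-search", "file-ops", "cron",
		"usage-tracker", "model-scanner", "skill-creator", "session-search",
	},
}

// skillsFor returns a copy of the template for a role, so later edits to an
// agent's skills never mutate the shared template.
func skillsFor(role string) ([]string, bool) {
	tmpl, ok := defaultSkillsets[role]
	if !ok {
		return nil, false
	}
	out := make([]string, len(tmpl))
	copy(out, tmpl)
	return out, true
}

func main() {
	skills, _ := skillsFor("pm")
	fmt.Println(len(skills), skills)
}
```

Returning a copy is one way to keep criterion 5 (adding or removing skills after creation) from leaking changes back into the defaults.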
### Phase 30: Free-by-Default Onboarding
**Goal**: A user can run `nxr init` or open `/setup` in the browser and within 5 minutes have three working agents (PM, Engineer, Hermes) running on free models with zero API key, zero cost, and zero manual configuration
**Depends on**: Phase 28, Phase 29
**Requirements**: ONBOARD-01, ONBOARD-02, ONBOARD-03, ONBOARD-04, ONBOARD-05, ONBOARD-06, ONBOARD-07, ONBOARD-08, NXR-01, NXR-02, NXR-03
**Success Criteria** (what must be TRUE):
1. `nxr init` on a fresh install completes in under 5 minutes, creates all three default agents with free models assigned by the scoring algorithm, and prints each agent's name, model, and skillset in the confirmation output
2. `nxr init --free` runs without prompting for an API key; `nxr init --key sk-or-...` accepts a key non-interactively and uses it during setup
3. The CLI onboarding detects Ollama and faster-whisper availability and records the findings in the `onboarding` DB table; auxiliary task routing is configured for local processing when Ollama is found
4. The web wizard at `/setup` completes the same five steps and results in the same DB state as the CLI flow; running both flows does not create duplicate agents
5. The `onboarding` table record is created with `completed_at`, `agents_created`, and `initial_free_models` populated; re-running either flow detects the existing record and skips agent creation
6. The zero-key free tier limits are displayed clearly at the end of both flows, with instructions for upgrading via `nxr config set openrouter-key`
**Plans**: TBD
**UI hint**: yes
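
Criteria 4 and 5 amount to an idempotency check: whichever flow runs second should find the completed record and create nothing. A minimal in-memory sketch follows; the `Store` and `OnboardingRecord` types are hypothetical stand-ins for the `onboarding` and `agents` tables, and only two of the record's fields are modeled.

```go
package main

import (
	"fmt"
	"time"
)

// OnboardingRecord is an assumed in-memory stand-in for one onboarding row.
// CompletedAt is Unix seconds; 0 means onboarding has not completed.
type OnboardingRecord struct {
	CompletedAt   int64
	AgentsCreated []string
}

// Store is a hypothetical stand-in for the database.
type Store struct {
	Record *OnboardingRecord
	Agents []string
}

// RunOnboarding creates the three default agents unless a completed record
// already exists. It reports whether agents were created on this call.
func RunOnboarding(s *Store, now int64) bool {
	if s.Record != nil && s.Record.CompletedAt != 0 {
		return false // CLI or web flow already finished; skip agent creation
	}
	defaults := []string{"pm", "engineer", "hermes"}
	s.Agents = append(s.Agents, defaults...)
	s.Record = &OnboardingRecord{CompletedAt: now, AgentsCreated: defaults}
	return true
}

func main() {
	s := &Store{}
	fmt.Println(RunOnboarding(s, time.Now().Unix())) // first flow creates agents
	fmt.Println(RunOnboarding(s, time.Now().Unix())) // second flow is skipped
	fmt.Println(len(s.Agents))
}
```

With a real database, the check and the inserts would presumably run inside one transaction so that a CLI run and a web run started at the same time cannot both pass the check.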
### Phase 31: API Key Upgrade Flow
**Goal**: A user with a working free-tier workspace can add an OpenRouter API key in one command or one button click, see per-agent upgrade recommendations, and switch all agents to paid models — with a revert path if they change their mind
**Depends on**: Phase 28, Phase 30
**Requirements**: UPGRADE-01, UPGRADE-02, UPGRADE-03, UPGRADE-04, UPGRADE-05, UPGRADE-06, NXR-04, NXR-05, NXR-06
**Success Criteria** (what must be TRUE):
1. `nxr upgrade` validates the OpenRouter API key, detects the account balance, and displays the correct free-tier limit (50/day or 1000/day) based on credit balance
2. The interactive upgrade prompt shows every agent's current free model alongside its recommended paid replacement with pricing; the user can select Y, n, or custom and each path executes without error
3. `nxr upgrade --all` switches all agents to paid models non-interactively; `nxr upgrade --agent "<name>"` upgrades exactly one agent
4. `nxr upgrade --revert` switches all agents back to their last free model assignments; agent memory, skills, and session history are intact after both upgrade and revert
5. The web agent manager's "Bulk Upgrade" button calls the API and shows the same per-agent recommendations and confirmation; individual agent upgrade controls work per-agent
6. After any upgrade or revert, `nxr agents recommend` output reflects the current model assignments correctly
**Plans**: TBD
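
The revert path in criterion 4 implies that each agent remembers the free model it was using before the upgrade. The sketch below shows one possible shape; the `Agent` type and its `LastFreeModel` field are assumptions, not part of the schema defined in Phase 27.

```go
package main

import "fmt"

// Agent is a hypothetical in-memory view of an agents row; only the fields
// needed to show the upgrade and revert paths are included.
type Agent struct {
	Name          string
	Model         string
	LastFreeModel string // assumed field: remembered so revert can restore it
	Skills        []string
}

// Upgrade switches the agent to a paid model. The free model is recorded only
// on the first upgrade, so upgrading twice still reverts to the free model.
func (a *Agent) Upgrade(paidModel string) {
	if a.LastFreeModel == "" {
		a.LastFreeModel = a.Model
	}
	a.Model = paidModel
}

// Revert restores the last free model. It reports false when there is no
// recorded free model to go back to.
func (a *Agent) Revert() bool {
	if a.LastFreeModel == "" {
		return false
	}
	a.Model = a.LastFreeModel
	a.LastFreeModel = ""
	return true
}

func main() {
	a := &Agent{Name: "PM", Model: "free/model-a", Skills: []string{"planning"}}
	a.Upgrade("paid/model-x")
	fmt.Println(a.Model)
	a.Revert()
	fmt.Println(a.Model, a.Skills)
}
```

Neither method touches `Skills`, which reflects the requirement that memory, skills, and session history survive both directions.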
### Phase 32: Scanner Updates
**Goal**: The 6-hourly OpenRouter scanner automatically rescores all models after each scan, creates actionable notifications when better free models are available, and proactively warns the workspace before it hits daily rate limits
**Depends on**: Phase 27, Phase 28
**Requirements**: SCAN-01, SCAN-02, SCAN-03, SCAN-04, SCAN-05
**Success Criteria** (what must be TRUE):
1. After each scan run, the scanner writes fresh rows to `model_scores` for every role category; the `scored_at` timestamps in the table match the scan time
2. When a re-score reveals a higher-scoring free model for an active agent's role, a notification record is created with the old model name, new model name, role, and score delta; the notification includes a one-click upgrade action
3. The `auto_upgrade_free_models` config flag (default `false`) controls whether the scanner auto-switches agents or only creates notifications; setting it `true` and triggering a re-score automatically updates agent model assignments
4. `nxr budget` displays the projected daily request total based on current usage rate and hours remaining, with a clear indicator if the workspace is on track to hit the limit
5. Rate limit warnings appear in the dashboard at 70% consumption, a Telegram notification fires at 90% (when gateway is configured), and agents queue tasks at 100% rather than erroring
**Plans**: TBD
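
Criteria 4 and 5 describe a linear projection and three consumption thresholds. A minimal sketch, assuming the projection is a simple extrapolation of the average rate so far (the roadmap does not specify the method) and using hypothetical function and constant names:

```go
package main

import "fmt"

// projectDailyTotal extrapolates the average request rate so far to the end
// of the day. This linear model is an assumption of this sketch.
func projectDailyTotal(used int, hoursElapsed, hoursRemaining float64) float64 {
	if hoursElapsed <= 0 {
		return float64(used)
	}
	rate := float64(used) / hoursElapsed
	return float64(used) + rate*hoursRemaining
}

// Action names the response at each threshold from criterion 5.
type Action string

const (
	ActionNone             Action = "none"
	ActionDashboardWarning Action = "dashboard-warning" // at 70%
	ActionTelegramNotify   Action = "telegram-notify"   // at 90%, if a gateway is configured
	ActionQueueTasks       Action = "queue-tasks"       // at 100%
)

// actionFor returns the most severe action reached for the given usage.
// Integer arithmetic avoids rounding at the exact threshold values.
func actionFor(used, limit int) Action {
	switch {
	case limit <= 0:
		return ActionNone
	case used >= limit:
		return ActionQueueTasks
	case used*100 >= limit*90:
		return ActionTelegramNotify
	case used*100 >= limit*70:
		return ActionDashboardWarning
	default:
		return ActionNone
	}
}

func main() {
	fmt.Println(projectDailyTotal(20, 8, 16), actionFor(45, 50))
}
```

This version reports only the highest threshold crossed; whether the dashboard warning should stay visible once the Telegram threshold is reached is a design choice left open here.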
### Phase 33: nxr Agent Commands
**Goal**: Users can ask nxr for model recommendations per agent, re-run scoring on demand, and create new agents with auto-selected models and skill templates — all from the terminal — and the TUI Tab 5 provides the same create flow visually
**Depends on**: Phase 28, Phase 29, Phase 31
**Requirements**: NXR-07, NXR-08, NXR-09
**Success Criteria** (what must be TRUE):
1. `nxr agents recommend` prints a table with one row per agent showing: agent name, current model, recommended model for its role, and whether an upgrade is available
2. `nxr agents rescore` re-runs the scoring algorithm for all agents, writes updated rows to `model_scores`, and reports any agents where the recommended model has changed since the last score
3. `nxr agents create --role <role> --name <name> --free` creates a new agent, assigns the best free model for the role, applies the role's skill template, and confirms creation with a summary line
4. The TUI Tab 5 "Create Agent" wizard walks through: category selection → auto model recommendation display → skill template pre-fill → name input → confirm; the created agent appears in the agent list immediately
**Plans**: TBD
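
The table described in criterion 1 can be derived from the agent list and the current best model per role. The sketch below assumes "upgrade available" simply means the recommended model differs from the current one; the `Agent` and `Recommendation` types and the `recommend` helper are hypothetical.

```go
package main

import "fmt"

// Agent is a hypothetical minimal view of an agents row.
type Agent struct {
	Name, Role, Model string
}

// Recommendation is one row of the `nxr agents recommend` table.
type Recommendation struct {
	Agent, Current, Recommended string
	UpgradeAvailable            bool
}

// recommend builds one row per agent, in input order. When no best model is
// known for a role, the current model is kept as the recommendation.
func recommend(agents []Agent, bestByRole map[string]string) []Recommendation {
	rows := make([]Recommendation, 0, len(agents))
	for _, a := range agents {
		best, ok := bestByRole[a.Role]
		if !ok {
			best = a.Model
		}
		rows = append(rows, Recommendation{
			Agent:            a.Name,
			Current:          a.Model,
			Recommended:      best,
			UpgradeAvailable: best != a.Model,
		})
	}
	return rows
}

func main() {
	agents := []Agent{
		{Name: "PM", Role: "pm", Model: "free/model-a"},
		{Name: "Engineer", Role: "engineer", Model: "free/model-b"},
	}
	best := map[string]string{"pm": "free/model-a", "engineer": "free/model-c"}
	for _, r := range recommend(agents, best) {
		fmt.Printf("%-10s %-14s %-14s %v\n", r.Agent, r.Current, r.Recommended, r.UpgradeAvailable)
	}
}
```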
### Phase 34: Web Control Plane
**Goal**: Every capability available in the `nxr` terminal TUI is also accessible from the Nexus Hub browser — process control, model slot switching, agent management with recommendations, budget tracking, and notification management
**Depends on**: Phase 30, Phase 31, Phase 32, Phase 33
**Requirements**: WEBCP-01, WEBCP-02, WEBCP-03, WEBCP-04, WEBCP-05, WEBCP-06, WEBCP-07, WEBCP-08, WEBCP-09, WEBCP-10, WEBCP-11, WEBCP-12, WEBCP-13, WEBCP-14
**Success Criteria** (what must be TRUE):
1. The `/hermes` page shows live process status (PID, uptime, model, tmux viewers) and Start/Stop/Restart buttons that work correctly; the Ollama status card appears when Ollama is running
2. The `/models/switch` page slot editor shows all 7 routing slots with current assignments; clicking a slot opens a fuzzy-search model picker with filter toggles and a price comparison panel; confirming a pick writes `config.yaml` atomically
3. The `/agents/manage` page lists all agents with role-aware model recommendations, heartbeat recency, cost, and error rate; skill assignment and the "Create Agent" wizard produce the same result as `nxr agents create`
4. The `/budget` page shows the free tier gauge updating in real time, per-agent cost breakdowns for today/7d/30d, the cost projection graph, and the rate limit event log; CSV export downloads correctly
5. The `/notifications` page shows all notification types including v1.4 additions (rate limit warnings, auto-upgrade events, model availability alerts); Telegram forwarding toggles persist correctly; unread count badge in navigation updates after marking read
**Plans**: TBD
**UI hint**: yes

---
## Coverage Validation
All 62 v1.4 requirements are mapped to exactly one phase. No orphans.
| Requirement | Phase |
|-------------|-------|
| DATA-01 | Phase 27 |
| DATA-02 | Phase 27 |
| DATA-03 | Phase 27 |
| DATA-04 | Phase 27 |
| DATA-05 | Phase 27 |
| DATA-06 | Phase 27 |
| DATA-07 | Phase 27 |
| SCORE-01 | Phase 28 |
| SCORE-02 | Phase 28 |
| SCORE-03 | Phase 28 |
| SCORE-04 | Phase 28 |
| SCORE-05 | Phase 28 |
| SCORE-06 | Phase 28 |
| SKILL-01 | Phase 29 |
| SKILL-02 | Phase 29 |
| SKILL-03 | Phase 29 |
| SKILL-04 | Phase 29 |
| SKILL-05 | Phase 29 |
| SKILL-06 | Phase 29 |
| SKILL-07 | Phase 29 |
| ONBOARD-01 | Phase 30 |
| ONBOARD-02 | Phase 30 |
| ONBOARD-03 | Phase 30 |
| ONBOARD-04 | Phase 30 |
| ONBOARD-05 | Phase 30 |
| ONBOARD-06 | Phase 30 |
| ONBOARD-07 | Phase 30 |
| ONBOARD-08 | Phase 30 |
| NXR-01 | Phase 30 |
| NXR-02 | Phase 30 |
| NXR-03 | Phase 30 |
| UPGRADE-01 | Phase 31 |
| UPGRADE-02 | Phase 31 |
| UPGRADE-03 | Phase 31 |
| UPGRADE-04 | Phase 31 |
| UPGRADE-05 | Phase 31 |
| UPGRADE-06 | Phase 31 |
| NXR-04 | Phase 31 |
| NXR-05 | Phase 31 |
| NXR-06 | Phase 31 |
| SCAN-01 | Phase 32 |
| SCAN-02 | Phase 32 |
| SCAN-03 | Phase 32 |
| SCAN-04 | Phase 32 |
| SCAN-05 | Phase 32 |
| NXR-07 | Phase 33 |
| NXR-08 | Phase 33 |
| NXR-09 | Phase 33 |
| WEBCP-01 | Phase 34 |
| WEBCP-02 | Phase 34 |
| WEBCP-03 | Phase 34 |
| WEBCP-04 | Phase 34 |
| WEBCP-05 | Phase 34 |
| WEBCP-06 | Phase 34 |
| WEBCP-07 | Phase 34 |
| WEBCP-08 | Phase 34 |
| WEBCP-09 | Phase 34 |
| WEBCP-10 | Phase 34 |
| WEBCP-11 | Phase 34 |
| WEBCP-12 | Phase 34 |
| WEBCP-13 | Phase 34 |
| WEBCP-14 | Phase 34 |
---
## Progress
| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 27. Data Model | v1.4 | 0/? | Not started | - |
| 28. Model Scoring Engine | v1.4 | 0/? | Not started | - |
| 29. Default Skillsets and Agent Templates | v1.4 | 0/? | Not started | - |
| 30. Free-by-Default Onboarding | v1.4 | 0/? | Not started | - |
| 31. API Key Upgrade Flow | v1.4 | 0/? | Not started | - |
| 32. Scanner Updates | v1.4 | 0/? | Not started | - |
| 33. nxr Agent Commands | v1.4 | 0/? | Not started | - |
| 34. Web Control Plane | v1.4 | 0/? | Not started | - |