# Roadmap: v1.4 Hermes as Default Inference Provider + Web Control Plane

**Milestone:** v1.4
**Phases:** 27–34
**Coverage:** 62/62 requirements mapped
**Depends on:** v1.2.0.1 (nxr), v1.2.1 (Universal Skills), v1.3 (Web Chat)

---

## Phases

- [ ] **Phase 27: Data Model**: libSQL schema additions for agents table, model_scores table, and onboarding table
- [ ] **Phase 28: Model Scoring Engine**: Deterministic scoring function, per-role selection algorithm, scoring API
- [ ] **Phase 29: Default Skillsets and Agent Templates**: Curated skill assignments for PM, Engineer, Hermes, and custom category templates
- [ ] **Phase 30: Free-by-Default Onboarding**: CLI and web wizard flows that create three agents on free models with zero API key
- [ ] **Phase 31: API Key Upgrade Flow**: nxr upgrade commands and web bulk-upgrade that switch agents from free to paid models
- [ ] **Phase 32: Scanner Updates**: Post-scan rescoring, better-model notifications, rate limit prediction
- [ ] **Phase 33: nxr Agent Commands**: nxr agents recommend, rescore, create; TUI Tab 5 create wizard
- [ ] **Phase 34: Web Control Plane**: Nexus Hub pages for process control, model switching, agent management, budget, and notifications

---

## Phase Details

### Phase 27: Data Model
**Goal**: The libSQL database in `hermes_tracking.db` has all schema additions v1.4 requires (agents table columns for role and skill metadata, a `model_scores` table for scoring history, and an `onboarding` table for wizard state), so every subsequent phase can read and write without migration surprises
**Depends on**: v1.2.0.1 (existing hermes_tracking.db schema)
**Requirements**: DATA-01, DATA-02, DATA-03, DATA-04, DATA-05, DATA-06, DATA-07
**Success Criteria** (what must be TRUE):
1. The `agents` table has all six new columns (`role`, `skillset`, `model_score`, `model_auto_assigned`, `workspace_id`, `is_default`) with correct types and defaults, added via `ALTER TABLE` without touching upstream Paperclip migrations
2. The `model_scores` table exists with all defined columns, the `UNIQUE(model_id, role, scored_at)` constraint, and indexes on `role` and `is_free`; queries against it return zero rows on a fresh DB without errors
3. The `onboarding` table exists with all defined columns; inserting a record and querying `completed_at IS NOT NULL` works as expected
4. All schema additions use libSQL (not modernc.org/sqlite or any other driver) and are applied via the same migration mechanism used by the rest of nxr
**Plans**: TBD

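To make criterion 1 of Phase 27 concrete, here is a minimal Go sketch that generates one `ALTER TABLE ... ADD COLUMN` statement per new `agents` column. The column names come from the roadmap; the SQL types, defaults, and the helper names (`column`, `alterStatements`) are assumptions for illustration only, and the sketch prints the statements rather than running them through the real nxr migration mechanism.

```go
package main

import "fmt"

// column is one addition to the agents table. Names come from the roadmap;
// the declarations (types and defaults) are assumptions, not the real schema.
type column struct {
	Name string
	Decl string
}

// agentColumns lists the six new columns named in the Phase 27 criteria.
var agentColumns = []column{
	{"role", "TEXT NOT NULL DEFAULT 'custom'"},
	{"skillset", "TEXT NOT NULL DEFAULT '[]'"},
	{"model_score", "REAL"},
	{"model_auto_assigned", "INTEGER NOT NULL DEFAULT 0"},
	{"workspace_id", "TEXT"},
	{"is_default", "INTEGER NOT NULL DEFAULT 0"},
}

// alterStatements builds one ALTER TABLE statement per column, so existing
// migrations never have to be edited.
func alterStatements(table string, cols []column) []string {
	out := make([]string, 0, len(cols))
	for _, c := range cols {
		out = append(out, fmt.Sprintf("ALTER TABLE %s ADD COLUMN %s %s;", table, c.Name, c.Decl))
	}
	return out
}

func main() {
	for _, stmt := range alterStatements("agents", agentColumns) {
		fmt.Println(stmt)
	}
}
```

Every `NOT NULL` column in the sketch carries a default, because SQLite-compatible engines reject `ADD COLUMN ... NOT NULL` without one when the table already has rows.
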
### Phase 28: Model Scoring Engine
**Goal**: A deterministic Go scoring function can take any agent role and the current free model catalog, apply the scoring formula, enforce role-specific requirements, and return the best available model; it is callable from both the CLI and the HTTP API
**Depends on**: Phase 27
**Requirements**: SCORE-01, SCORE-02, SCORE-03, SCORE-04, SCORE-05, SCORE-06
**Success Criteria** (what must be TRUE):
1. Given identical catalog input, the scoring function always returns the same model for the same role (deterministic output verified by test)
2. Running the scorer for each of the six roles (pm, engineer, hermes, creative, research, custom) returns a model that satisfies that role's minimum requirements (tools support, context window, etc.); if no model qualifies, the fallback chain (`openrouter/free` → local Qwen) is used
3. Scoring results are written to the `model_scores` libSQL table with all sub-scores populated and a `scored_at` timestamp
4. `POST /api/models/rescore` triggers the scorer and returns the updated best model per role; `GET /api/models/best-free?role=<role>` returns the current top pick for that role
5. `GET /api/models/roster` returns all free models with their scores for all roles in a single response
**Plans**: TBD

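One way to get the determinism that criterion 1 of Phase 28 asks for is to filter by role requirements, then sort with an explicit tie-break. The Go sketch below assumes each model already carries a precomputed `Score` (the roadmap does not spell out the scoring formula), and the `Model` and `Requirements` types, the `pickBest` function, and the `example/...` model IDs are all hypothetical. Only the first step of the fallback chain is modelled.

```go
package main

import (
	"fmt"
	"sort"
)

// Model is a hypothetical catalog entry. Score is assumed to be precomputed
// by the scoring formula, which is not shown here.
type Model struct {
	ID            string
	Score         float64
	SupportsTools bool
	ContextWindow int
}

// Requirements are the per-role minimums (tools support, context window).
type Requirements struct {
	NeedsTools bool
	MinContext int
}

// fallbackModel is the first entry of the fallback chain named in the roadmap.
const fallbackModel = "openrouter/free"

// pickBest returns the highest-scoring model that meets the requirements.
// Ties are broken by ID so the result does not depend on catalog order.
func pickBest(catalog []Model, req Requirements) string {
	var ok []Model
	for _, m := range catalog {
		if req.NeedsTools && !m.SupportsTools {
			continue
		}
		if m.ContextWindow < req.MinContext {
			continue
		}
		ok = append(ok, m)
	}
	if len(ok) == 0 {
		return fallbackModel
	}
	sort.Slice(ok, func(i, j int) bool {
		if ok[i].Score != ok[j].Score {
			return ok[i].Score > ok[j].Score
		}
		return ok[i].ID < ok[j].ID
	})
	return ok[0].ID
}

func main() {
	catalog := []Model{
		{ID: "example/model-b", Score: 0.8, SupportsTools: true, ContextWindow: 32000},
		{ID: "example/model-a", Score: 0.8, SupportsTools: true, ContextWindow: 32000},
		{ID: "example/model-c", Score: 0.9, SupportsTools: false, ContextWindow: 8000},
	}
	fmt.Println(pickBest(catalog, Requirements{NeedsTools: true, MinContext: 16000}))
	fmt.Println(pickBest(nil, Requirements{}))
}
```

Without the ID tie-break, two models with equal scores could swap places between runs, since `sort.Slice` is not stable.
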
### Phase 29: Default Skillsets and Agent Templates
**Goal**: When any agent is created (through onboarding, `nxr agents create`, or the web wizard), the correct curated skillset is automatically applied based on role, and the Paperclip adapter can see those skills via `listSkills`/`syncSkills`
**Depends on**: Phase 27
**Requirements**: SKILL-01, SKILL-02, SKILL-03, SKILL-04, SKILL-05, SKILL-06, SKILL-07
**Success Criteria** (what must be TRUE):
1. A newly created PM agent has exactly 8 skills in its `skillset` column (`planning`, `task-breakdown`, `prioritization`, `status-reporting`, `dependency-mapping`, `sprint-planning`, `risk-assessment`, `stakeholder-comms`)
2. A newly created Engineer agent has exactly 8 skills (`coding`, `debugging`, `git-workflow`, `testing`, `code-review`, `refactoring`, `architecture`, `documentation`), and a newly created Hermes agent has exactly 8 skills (`memory`, `web-search`, `file-ops`, `cron`, `usage-tracker`, `model-scanner`, `skill-creator`, `session-search`)
3. The `listSkills` adapter API call on any default agent returns the correct skill list so Paperclip heartbeats can read it
4. Selecting a custom agent category (tech, creative, business, research, media, personal) during creation pre-populates the skill selector with that category's defined template; all 6 categories have templates
5. Adding or removing skills after creation is possible and does not break the `syncSkills` round-trip
**Plans**: TBD

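The three default skillsets in the Phase 29 criteria can be held in a plain lookup table. The Go sketch below uses the skill names exactly as listed in the roadmap; the `defaultSkills` map and `skillsFor` helper are hypothetical names, and the role keys (`pm`, `engineer`, `hermes`) follow the role list in Phase 28.

```go
package main

import "fmt"

// defaultSkills maps a role to its curated skillset, as listed in the roadmap.
var defaultSkills = map[string][]string{
	"pm": {
		"planning", "task-breakdown", "prioritization", "status-reporting",
		"dependency-mapping", "sprint-planning", "risk-assessment", "stakeholder-comms",
	},
	"engineer": {
		"coding", "debugging", "git-workflow", "testing",
		"code-review", "refactoring", "architecture", "documentation",
	},
	"hermes": {
		"memory", "web-search", "file-ops", "cron",
		"usage-tracker", "model-scanner", "skill-creator", "session-search",
	},
}

// skillsFor returns a copy of the template for a role, so callers can add or
// remove skills on one agent without mutating the shared defaults.
func skillsFor(role string) ([]string, bool) {
	tmpl, ok := defaultSkills[role]
	if !ok {
		return nil, false
	}
	return append([]string(nil), tmpl...), true
}

func main() {
	for _, role := range []string{"pm", "engineer", "hermes"} {
		skills, _ := skillsFor(role)
		fmt.Printf("%s: %d skills\n", role, len(skills))
	}
}
```

Returning a copy matters for criterion 5: editing one agent's skills after creation should not change the template used for the next agent.
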
### Phase 30: Free-by-Default Onboarding
**Goal**: A user can run `nxr init` or open `/setup` in the browser and within 5 minutes have three working agents (PM, Engineer, Hermes) running on free models with zero API key, zero cost, and zero manual configuration
**Depends on**: Phase 28, Phase 29
**Requirements**: ONBOARD-01, ONBOARD-02, ONBOARD-03, ONBOARD-04, ONBOARD-05, ONBOARD-06, ONBOARD-07, ONBOARD-08, NXR-01, NXR-02, NXR-03
**Success Criteria** (what must be TRUE):
1. `nxr init` on a fresh install completes in under 5 minutes, creates all three default agents with free models assigned by the scoring algorithm, and prints each agent's name, model, and skillset in the confirmation output
2. `nxr init --free` runs without prompting for an API key; `nxr init --key sk-or-...` accepts a key non-interactively and uses it during setup
3. The CLI onboarding detects Ollama and faster-whisper availability and records the findings in the `onboarding` DB table; auxiliary task routing is configured for local processing when Ollama is found
4. The web wizard at `/setup` completes the same five steps and results in the same DB state as the CLI flow; running both flows does not create duplicate agents
5. The `onboarding` table record is created with `completed_at`, `agents_created`, and `initial_free_models` populated; re-running either flow detects the existing record and skips agent creation
6. The zero-key free tier limits are displayed clearly at the end of both flows, with instructions for upgrading via `nxr config set openrouter-key`
**Plans**: TBD
**UI hint**: yes

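Criteria 4 and 5 of Phase 30 describe an idempotent flow: whichever entry point runs second must notice the existing `onboarding` record and skip agent creation. The Go sketch below illustrates that check with an in-memory stand-in for the table. The `store` and `onboardingRecord` types and the `runOnboarding` method are hypothetical; a real implementation would read and write the libSQL table inside a transaction.

```go
package main

import (
	"fmt"
	"time"
)

// onboardingRecord mirrors two of the columns named in the roadmap.
type onboardingRecord struct {
	CompletedAt   time.Time
	AgentsCreated []string
}

// store is an in-memory stand-in for the onboarding table (an assumption
// made to keep the sketch self-contained).
type store struct {
	record *onboardingRecord
}

// runOnboarding creates the three default agents once. A second call, from
// either the CLI or the web wizard, finds the record and skips creation.
func (s *store) runOnboarding() (created []string, skipped bool) {
	if s.record != nil && !s.record.CompletedAt.IsZero() {
		return nil, true
	}
	agents := []string{"pm", "engineer", "hermes"}
	s.record = &onboardingRecord{CompletedAt: time.Now(), AgentsCreated: agents}
	return agents, false
}

func main() {
	s := &store{}
	first, skipped := s.runOnboarding()
	fmt.Println("first run:", first, "skipped:", skipped)
	second, skipped := s.runOnboarding()
	fmt.Println("second run:", second, "skipped:", skipped)
}
```
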
### Phase 31: API Key Upgrade Flow
**Goal**: A user with a working free-tier workspace can add an OpenRouter API key in one command or one button click, see per-agent upgrade recommendations, and switch all agents to paid models, with a revert path if they change their mind
**Depends on**: Phase 28, Phase 30
**Requirements**: UPGRADE-01, UPGRADE-02, UPGRADE-03, UPGRADE-04, UPGRADE-05, UPGRADE-06, NXR-04, NXR-05, NXR-06
**Success Criteria** (what must be TRUE):
1. `nxr upgrade` validates the OpenRouter API key, detects account balance, and displays the correct free-tier limit (50/day or 1000/day) based on credit balance
2. The interactive upgrade prompt shows every agent's current free model alongside its recommended paid replacement with pricing; the user can select Y, n, or custom and each path executes without error
3. `nxr upgrade --all` switches all agents to paid models non-interactively; `nxr upgrade --agent "<name>"` upgrades exactly one agent
4. `nxr upgrade --revert` switches all agents back to their last free model assignments; agent memory, skills, and session history are intact after both upgrade and revert
5. The web agent manager's "Bulk Upgrade" button calls the API and shows the same per-agent recommendations and confirmation; individual agent upgrade controls work per-agent
6. After any upgrade or revert, `nxr agents recommend` output reflects the current model assignments correctly
**Plans**: TBD

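Criterion 1 of Phase 31 reduces to a threshold check on the credit balance. The Go sketch below takes the two limits (50/day and 1000/day) from the roadmap; the credit threshold that separates them is not stated there, so it is passed in as a parameter, and the value `10` used in `main` is an assumption to confirm against OpenRouter's documentation. The function name `dailyFreeLimit` is hypothetical.

```go
package main

import "fmt"

// Daily request limits for free models, as stated in the roadmap.
const (
	limitLowCredit  = 50
	limitHighCredit = 1000
)

// dailyFreeLimit returns the free-tier daily request limit for a credit
// balance. The threshold is a parameter because the roadmap does not give it.
func dailyFreeLimit(credits, threshold float64) int {
	if credits >= threshold {
		return limitHighCredit
	}
	return limitLowCredit
}

func main() {
	const assumedThreshold = 10.0 // assumption, not taken from the roadmap
	for _, credits := range []float64{0, 9.99, 10, 25} {
		fmt.Printf("credits=%.2f -> %d requests/day\n", credits, dailyFreeLimit(credits, assumedThreshold))
	}
}
```
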
### Phase 32: Scanner Updates
**Goal**: The 6-hourly OpenRouter scanner automatically rescores all models after each scan, creates actionable notifications when better free models are available, and proactively warns the workspace before it hits daily rate limits
**Depends on**: Phase 27, Phase 28
**Requirements**: SCAN-01, SCAN-02, SCAN-03, SCAN-04, SCAN-05
**Success Criteria** (what must be TRUE):
1. After each scan run, the scanner writes fresh rows to `model_scores` for every role category; the `scored_at` timestamps in the table match the scan time
2. When a re-score reveals a higher-scoring free model for an active agent's role, a notification record is created with the old model name, new model name, role, and score delta; the notification includes a one-click upgrade action
3. The `auto_upgrade_free_models` config flag (default `false`) controls whether the scanner auto-switches agents or only creates notifications; setting it `true` and triggering a re-score automatically updates agent model assignments
4. `nxr budget` displays the projected daily request total based on current usage rate and hours remaining, with a clear indicator if the workspace is on track to hit the limit
5. Rate limit warnings appear in the dashboard at 70% consumption, a Telegram notification fires at 90% (when gateway is configured), and agents queue tasks at 100% rather than erroring
**Plans**: TBD

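Criteria 4 and 5 of Phase 32 involve two small calculations: a linear projection of the daily request total, and a mapping from consumption to the 70%, 90%, and 100% thresholds. The Go sketch below shows one plausible reading. It assumes the usage rate is simply requests so far divided by hours elapsed since the daily reset, and the function names and alert labels (`dashboard`, `telegram`, `queue`) are hypothetical.

```go
package main

import "fmt"

// projectDaily extrapolates the end-of-day request total from the usage so
// far, assuming the current average rate holds for the hours remaining.
func projectDaily(used int, hoursElapsed float64) float64 {
	if hoursElapsed <= 0 || hoursElapsed >= 24 {
		return float64(used)
	}
	rate := float64(used) / hoursElapsed
	return float64(used) + rate*(24-hoursElapsed)
}

// alertLevel maps consumption to the thresholds in the roadmap: dashboard
// warning at 70%, Telegram notification at 90%, task queueing at 100%.
func alertLevel(used, limit int) string {
	switch {
	case limit <= 0:
		return "none"
	case used >= limit:
		return "queue"
	case used*100 >= 90*limit:
		return "telegram"
	case used*100 >= 70*limit:
		return "dashboard"
	default:
		return "none"
	}
}

func main() {
	used, limit := 30, 50
	projected := projectDaily(used, 12)
	fmt.Printf("used %d of %d, projected %.0f by end of day\n", used, limit, projected)
	fmt.Println("on track to hit limit:", projected >= float64(limit))
	fmt.Println("alert level now:", alertLevel(used, limit))
}
```

Comparing `used*100` against `90*limit` keeps the threshold checks in integer arithmetic, which avoids rounding surprises right at a boundary.
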
### Phase 33: nxr Agent Commands
**Goal**: Users can ask nxr for model recommendations per agent, re-run scoring on demand, and create new agents with auto-selected models and skill templates, all from the terminal, and the TUI Tab 5 provides the same create flow visually
**Depends on**: Phase 28, Phase 29, Phase 31
**Requirements**: NXR-07, NXR-08, NXR-09
**Success Criteria** (what must be TRUE):
1. `nxr agents recommend` prints a table with one row per agent showing: agent name, current model, recommended model for its role, and whether an upgrade is available
2. `nxr agents rescore` re-runs the scoring algorithm for all agents, writes updated rows to `model_scores`, and reports any agents where the recommended model has changed since the last score
3. `nxr agents create --role <role> --name <name> --free` creates a new agent, assigns the best free model for the role, applies the role's skill template, and confirms creation with a summary line
4. The TUI Tab 5 "Create Agent" wizard walks through: category selection → auto model recommendation display → skill template pre-fill → name input → confirm; the created agent appears in the agent list immediately
**Plans**: TBD

### Phase 34: Web Control Plane
**Goal**: Every capability available in the `nxr` terminal TUI is also accessible from the Nexus Hub browser: process control, model slot switching, agent management with recommendations, budget tracking, and notification management
**Depends on**: Phase 30, Phase 31, Phase 32, Phase 33
**Requirements**: WEBCP-01, WEBCP-02, WEBCP-03, WEBCP-04, WEBCP-05, WEBCP-06, WEBCP-07, WEBCP-08, WEBCP-09, WEBCP-10, WEBCP-11, WEBCP-12, WEBCP-13, WEBCP-14
**Success Criteria** (what must be TRUE):
1. The `/hermes` page shows live process status (PID, uptime, model, tmux viewers) and Start/Stop/Restart buttons that work correctly; the Ollama status card appears when Ollama is running
2. The `/models/switch` page slot editor shows all 7 routing slots with current assignments; clicking a slot opens a fuzzy-search model picker with filter toggles and a price comparison panel; confirming a pick writes `config.yaml` atomically
3. The `/agents/manage` page lists all agents with role-aware model recommendations, heartbeat recency, cost, and error rate; skill assignment and the "Create Agent" wizard produce the same result as `nxr agents create`
4. The `/budget` page shows the free tier gauge updating in real time, per-agent cost breakdowns for today/7d/30d, the cost projection graph, and the rate limit event log; CSV export downloads correctly
5. The `/notifications` page shows all notification types including v1.4 additions (rate limit warnings, auto-upgrade events, model availability alerts); Telegram forwarding toggles persist correctly; unread count badge in navigation updates after marking read
**Plans**: TBD
**UI hint**: yes

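Criterion 2 of Phase 34 requires that `config.yaml` be written atomically. A common way to do this, sketched below in Go, is to write the new content to a temporary file in the same directory and then rename it over the target, so readers see either the old file or the new one and never a partial write. The helper names (`writeAtomic`, `demo`) and the `slot:` key in the sample content are hypothetical, and rename atomicity is assumed to hold because both paths sit on the same filesystem.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// writeAtomic writes data to a temp file next to path, flushes it to disk,
// and renames it into place. The temp file is created with mode 0600.
func writeAtomic(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".config-*.tmp")
	if err != nil {
		return err
	}
	tmpName := tmp.Name()
	// Clean up on failure; after a successful rename this removes nothing.
	defer os.Remove(tmpName)

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmpName, path)
}

// demo writes two versions of a config file into a scratch directory and
// returns what ends up on disk.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "atomic-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "config.yaml")
	for _, body := range []string{"slot: old\n", "slot: new\n"} {
		if err := writeAtomic(path, []byte(body)); err != nil {
			return "", err
		}
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(got)
}
```
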
---

## Coverage Validation

All 62 v1.4 requirements are mapped to exactly one phase. No orphans.

| Requirement | Phase |
|-------------|-------|
| DATA-01 | Phase 27 |
| DATA-02 | Phase 27 |
| DATA-03 | Phase 27 |
| DATA-04 | Phase 27 |
| DATA-05 | Phase 27 |
| DATA-06 | Phase 27 |
| DATA-07 | Phase 27 |
| SCORE-01 | Phase 28 |
| SCORE-02 | Phase 28 |
| SCORE-03 | Phase 28 |
| SCORE-04 | Phase 28 |
| SCORE-05 | Phase 28 |
| SCORE-06 | Phase 28 |
| SKILL-01 | Phase 29 |
| SKILL-02 | Phase 29 |
| SKILL-03 | Phase 29 |
| SKILL-04 | Phase 29 |
| SKILL-05 | Phase 29 |
| SKILL-06 | Phase 29 |
| SKILL-07 | Phase 29 |
| ONBOARD-01 | Phase 30 |
| ONBOARD-02 | Phase 30 |
| ONBOARD-03 | Phase 30 |
| ONBOARD-04 | Phase 30 |
| ONBOARD-05 | Phase 30 |
| ONBOARD-06 | Phase 30 |
| ONBOARD-07 | Phase 30 |
| ONBOARD-08 | Phase 30 |
| NXR-01 | Phase 30 |
| NXR-02 | Phase 30 |
| NXR-03 | Phase 30 |
| UPGRADE-01 | Phase 31 |
| UPGRADE-02 | Phase 31 |
| UPGRADE-03 | Phase 31 |
| UPGRADE-04 | Phase 31 |
| UPGRADE-05 | Phase 31 |
| UPGRADE-06 | Phase 31 |
| NXR-04 | Phase 31 |
| NXR-05 | Phase 31 |
| NXR-06 | Phase 31 |
| SCAN-01 | Phase 32 |
| SCAN-02 | Phase 32 |
| SCAN-03 | Phase 32 |
| SCAN-04 | Phase 32 |
| SCAN-05 | Phase 32 |
| NXR-07 | Phase 33 |
| NXR-08 | Phase 33 |
| NXR-09 | Phase 33 |
| WEBCP-01 | Phase 34 |
| WEBCP-02 | Phase 34 |
| WEBCP-03 | Phase 34 |
| WEBCP-04 | Phase 34 |
| WEBCP-05 | Phase 34 |
| WEBCP-06 | Phase 34 |
| WEBCP-07 | Phase 34 |
| WEBCP-08 | Phase 34 |
| WEBCP-09 | Phase 34 |
| WEBCP-10 | Phase 34 |
| WEBCP-11 | Phase 34 |
| WEBCP-12 | Phase 34 |
| WEBCP-13 | Phase 34 |
| WEBCP-14 | Phase 34 |

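The "exactly one phase, no orphans" claim can be checked mechanically. The Go sketch below rebuilds the mapping from the ID ranges in the coverage table, fails on any duplicate, and counts the result. The `block` type and `buildCoverage` function are hypothetical helpers, not part of nxr.

```go
package main

import "fmt"

// block is a contiguous run of requirement IDs assigned to one phase,
// transcribed from the coverage table.
type block struct {
	Prefix   string
	From, To int
	Phase    int
}

var blocks = []block{
	{"DATA", 1, 7, 27},
	{"SCORE", 1, 6, 28},
	{"SKILL", 1, 7, 29},
	{"ONBOARD", 1, 8, 30},
	{"NXR", 1, 3, 30},
	{"UPGRADE", 1, 6, 31},
	{"NXR", 4, 6, 31},
	{"SCAN", 1, 5, 32},
	{"NXR", 7, 9, 33},
	{"WEBCP", 1, 14, 34},
}

// buildCoverage expands the blocks into an ID-to-phase map and returns an
// error if any requirement is mapped more than once.
func buildCoverage(bs []block) (map[string]int, error) {
	m := make(map[string]int)
	for _, b := range bs {
		for n := b.From; n <= b.To; n++ {
			id := fmt.Sprintf("%s-%02d", b.Prefix, n)
			if prev, dup := m[id]; dup {
				return nil, fmt.Errorf("%s mapped to phases %d and %d", id, prev, b.Phase)
			}
			m[id] = b.Phase
		}
	}
	return m, nil
}

func main() {
	m, err := buildCoverage(blocks)
	if err != nil {
		fmt.Println("coverage error:", err)
		return
	}
	fmt.Printf("%d requirements mapped\n", len(m))
}
```
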
---

## Progress

| Phase | Milestone | Plans Complete | Status | Completed |
|-------|-----------|----------------|--------|-----------|
| 27. Data Model | v1.4 | 0/? | Not started | - |
| 28. Model Scoring Engine | v1.4 | 0/? | Not started | - |
| 29. Default Skillsets and Agent Templates | v1.4 | 0/? | Not started | - |
| 30. Free-by-Default Onboarding | v1.4 | 0/? | Not started | - |
| 31. API Key Upgrade Flow | v1.4 | 0/? | Not started | - |
| 32. Scanner Updates | v1.4 | 0/? | Not started | - |
| 33. nxr Agent Commands | v1.4 | 0/? | Not started | - |
| 34. Web Control Plane | v1.4 | 0/? | Not started | - |