nexus/.planning/phases/30-hardware-detection-mode-selection/30-01-PLAN.md
2026-04-04 03:55:49 +00:00

---
phase: 30-hardware-detection-mode-selection
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- server/src/services/hardware.ts
- server/src/services/nexus-settings.ts
- server/src/routes/hardware.ts
- server/src/routes/nexus-settings.ts
- server/src/app.ts
- server/src/data/ollama-model-catalog.json
- server/src/services/ollama.ts
- server/src/__tests__/30-hardware-detection.test.ts
autonomous: true
requirements:
- ONBD-02
- ONBD-03
- ONBD-01
must_haves:
truths:
- "GET /api/system/providers returns 200 with hardware info without any auth token"
- "Apple Silicon is detected via CPU brand string and returns unifiedMemory: true with hardwareTier: apple_silicon"
- "GPU detection via systeminformation has a 3-second timeout; failure degrades to cpu_only tier"
- "nexusSettingsService persists mode to data/nexus-settings.json and reads it back"
- "PATCH /api/nexus/settings requires board auth and persists the mode value"
- "Model catalog contains tier field on every variant and includes qwen3:8b family"
- "getRecommendedModel filters by hardware tier when tier data is present"
artifacts:
- path: "server/src/services/hardware.ts"
provides: "hardwareService with detect() returning HardwareInfo"
exports: ["hardwareService", "HardwareInfo", "HardwareTier"]
- path: "server/src/services/nexus-settings.ts"
provides: "File-backed nexus settings persistence"
exports: ["nexusSettingsService", "NexusMode", "NEXUS_MODES"]
- path: "server/src/routes/hardware.ts"
provides: "Unauthenticated GET /api/system/providers"
exports: ["hardwareRoutes"]
- path: "server/src/routes/nexus-settings.ts"
provides: "Board-auth-gated GET/PATCH /api/nexus/settings"
exports: ["nexusSettingsRoutes"]
- path: "server/src/data/ollama-model-catalog.json"
provides: "Extended model catalog with tier arrays and qwen3 family"
contains: "qwen3"
- path: "server/src/__tests__/30-hardware-detection.test.ts"
provides: "Unit tests for hardware service, settings service, routes, and catalog"
key_links:
- from: "server/src/routes/hardware.ts"
to: "server/src/services/hardware.ts"
via: "hardwareService().detect()"
pattern: "hardwareService.*detect"
- from: "server/src/app.ts"
to: "server/src/routes/hardware.ts"
via: "app.use before api router"
pattern: "hardwareRoutes"
- from: "server/src/services/ollama.ts"
to: "server/src/data/ollama-model-catalog.json"
via: "loadCatalog()"
pattern: "loadCatalog"
---
<objective>
Build the server-side hardware detection, mode persistence, and model catalog infrastructure for Phase 30.
Purpose: Provides the unauthenticated hardware probe endpoint, file-backed mode persistence, and tier-aware model catalog that the onboarding UI (Plan 02) will consume. These are the foundational APIs for the entire v1.5 onboarding stack.
Output: Five new server files (hardware service, hardware route, nexus-settings service, nexus-settings route, tests) and three modified files (app.ts for route mounting, ollama-model-catalog.json for tier data and the qwen3 family, ollama.ts for tier-aware recommendations).
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/30-hardware-detection-mode-selection/30-RESEARCH.md
@.planning/phases/30-hardware-detection-mode-selection/30-CONTEXT.md
@server/src/app.ts
@server/src/services/ollama.ts
@server/src/data/ollama-model-catalog.json
@server/src/home-paths.ts
@server/src/routes/ollama.ts
<interfaces>
<!-- Key types and contracts the executor needs. Extracted from codebase. -->
From server/src/services/ollama.ts:
```typescript
interface CatalogVariant {
name: string;
ramGb: number;
vramGb: number;
quality: string;
}
interface CatalogFamily {
family: string;
variants: CatalogVariant[];
}
interface ModelCatalog {
models: CatalogFamily[];
}
export function getRecommendedModel(models: OllamaModel[], systemRamBytes: number): OllamaModel[]
```
From server/src/home-paths.ts:
```typescript
export function resolvePaperclipHomeDir(): string;
export function resolvePaperclipInstanceRoot(instanceId?: string): string;
```
From server/src/middleware/auth.ts:
```typescript
// req.actor.type === "board" | "agent" | "none"
// assertBoard(req) throws 403 if not board
```
From server/src/app.ts (mounting pattern — line ~129):
```typescript
app.use(llmRoutes(db)); // mounted before api router
// ...
const api = Router();
api.use(boardMutationGuard());
// ... all authenticated routes on api ...
app.use("/api", api);
```
</interfaces>
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: Hardware service, nexus-settings service, model catalog, and tests</name>
<files>
server/src/services/hardware.ts
server/src/services/nexus-settings.ts
server/src/data/ollama-model-catalog.json
server/src/services/ollama.ts
server/src/__tests__/30-hardware-detection.test.ts
</files>
<read_first>
server/src/services/ollama.ts
server/src/data/ollama-model-catalog.json
server/src/home-paths.ts
server/src/services/instance-settings.ts
</read_first>
<behavior>
- Test: hardwareService().detect() returns HardwareInfo with all required fields (totalGb, freeGb, usableGb, platform, gpuName, gpuVramGb, unifiedMemory, hardwareTier, cpuModel)
- Test: When os.cpus()[0].model starts with "Apple" and platform is "darwin", returns unifiedMemory: true, hardwareTier: "apple_silicon", gpuVramGb: null
- Test: When si.graphics() returns a controller with vram >= 4096 MB, returns hardwareTier: "gpu" with gpuVramGb set
- Test: When si.graphics() returns no controllers (or throws), returns hardwareTier: "cpu_only"
- Test: si.graphics() is wrapped in Promise.race with 3000ms timeout; if it times out, returns cpu_only tier
- Test: nexusSettingsService().get() returns { mode: "both" } when no file exists (default)
- Test: nexusSettingsService().set({ mode: "personal_ai" }) writes to disk and subsequent get() returns "personal_ai"
- Test: nexusSettingsService().set({ mode: "invalid" as any }) throws Zod validation error
- Test: Extended catalog JSON contains a "qwen3" family with variant "qwen3:8b" having tier array ["gpu", "apple_silicon", "cpu_only"]
- Test: Every variant in catalog has a "tier" array (no variant without tier)
- Test: getRecommendedModel with tier "gpu" only recommends models whose tier includes "gpu"
</behavior>
<action>
**1. Create `server/src/services/hardware.ts`:**
Export types `HardwareTier = "gpu" | "apple_silicon" | "cpu_only"` and `HardwareInfo` interface with fields: `totalGb: number`, `freeGb: number`, `usableGb: number`, `platform: NodeJS.Platform`, `gpuName: string | null`, `gpuVramGb: number | null`, `unifiedMemory: boolean`, `hardwareTier: HardwareTier`, `cpuModel: string | null`.
Export `hardwareService()` factory function returning `{ detect }`. Implementation:
- Get `totalBytes = os.totalmem()`, `freeBytes = os.freemem()`, compute `totalGb`, `freeGb`, `usableGb = freeGb * 0.75` (all rounded to 1 decimal).
- Get `cpuModel = os.cpus()[0]?.model ?? null`.
- Detect Apple Silicon: `process.platform === "darwin" && cpuModel?.startsWith("Apple")`.
- If Apple Silicon: set `gpuName: null`, `gpuVramGb: null`, `unifiedMemory: true`, `hardwareTier: "apple_silicon"`. Do NOT call si.graphics().
- If not Apple Silicon: call `si.graphics()` wrapped in `Promise.race()` with a 3000ms timeout. On success, read `controllers[0].model` for `gpuName` and `controllers[0].vram / 1024` for `gpuVramGb` (`vram` is reported in MB). If `gpuVramGb >= 4`, set `hardwareTier: "gpu"`; otherwise `"cpu_only"`. On failure, timeout, or an empty `controllers` array, set `gpuName: null`, `gpuVramGb: null`, `hardwareTier: "cpu_only"`.
- Cache result for 5 minutes (same pattern as in RESEARCH.md: `cache` variable + `cacheExpiry` timestamp).
- Import: `import os from "node:os"; import si from "systeminformation";`
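The timeout and tier rules above can be sketched as follows. This is a minimal illustration, not the planned implementation: the GPU probe is passed in as a parameter (`probe`, a hypothetical name) so the sketch runs without `systeminformation`, and the helper names `withTimeout`, `detectGpu`, and `isAppleSilicon` are assumptions. Caching and memory fields are left out.

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

// Minimal shape of what the plan reads from si.graphics(); vram is in MB.
interface GraphicsProbeResult {
  controllers: Array<{ model: string; vram: number | null }>;
}

interface GpuSummary {
  gpuName: string | null;
  gpuVramGb: number | null;
  hardwareTier: HardwareTier;
}

const CPU_ONLY: GpuSummary = { gpuName: null, gpuVramGb: null, hardwareTier: "cpu_only" };

// Rejects if `work` has not settled within `ms`; the timer is always cleared.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// `probe` stands in for () => si.graphics(); injected here for illustration only.
async function detectGpu(
  probe: () => Promise<GraphicsProbeResult>,
  timeoutMs = 3000,
): Promise<GpuSummary> {
  try {
    const { controllers } = await withTimeout(probe(), timeoutMs);
    const first = controllers[0];
    if (!first) return CPU_ONLY; // no controllers reported
    const gpuVramGb =
      first.vram == null ? null : Math.round((first.vram / 1024) * 10) / 10;
    const hardwareTier: HardwareTier =
      gpuVramGb !== null && gpuVramGb >= 4 ? "gpu" : "cpu_only";
    return { gpuName: first.model, gpuVramGb, hardwareTier };
  } catch {
    return CPU_ONLY; // probe threw or timed out
  }
}

// Apple Silicon check from the plan: darwin platform plus an "Apple" CPU brand string.
function isAppleSilicon(platform: string, cpuModel: string | null): boolean {
  return platform === "darwin" && (cpuModel?.startsWith("Apple") ?? false);
}
```

Injecting the probe also makes the timeout path easy to exercise in tests: pass a promise that never settles together with a short `timeoutMs`.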
**2. Create `server/src/services/nexus-settings.ts`:**
Export `NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const`, `NexusMode` type, and `nexusSettingsService()` factory.
Use Zod schema: `z.object({ mode: z.enum(NEXUS_MODES).default("both") })`.
`resolveNexusSettingsPath()`: `path.resolve(resolvePaperclipInstanceRoot(), "data", "nexus-settings.json")`.
Methods:
- `get()`: Read file, parse with Zod. On any error (file missing, invalid JSON), return `{ mode: "both" }`.
- `set(patch)`: Load current, merge patch, validate with Zod, write JSON to disk (mkdirSync recursive for data dir).
Import `resolvePaperclipInstanceRoot` from `"../home-paths.js"`.
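A rough sketch of the `get()`/`set()` contract described above. To stay dependency-free it swaps the Zod schema for a small hand-rolled validator (`parseSettings`, a stand-in only) and takes the file path as a parameter instead of calling `resolvePaperclipInstanceRoot()`. The factory name `createSettingsStore` and the temp-dir demo are hypothetical.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const;
type NexusMode = (typeof NEXUS_MODES)[number];
interface NexusSettings {
  mode: NexusMode;
}

const DEFAULTS: NexusSettings = { mode: "both" };

// Stand-in for the Zod schema: a missing mode defaults to "both", an unknown mode throws.
function parseSettings(raw: unknown): NexusSettings {
  const mode = (raw as { mode?: unknown } | null)?.mode ?? DEFAULTS.mode;
  if (!NEXUS_MODES.includes(mode as NexusMode)) {
    throw new Error(`invalid mode: ${String(mode)}`);
  }
  return { mode: mode as NexusMode };
}

// `filePath` is a parameter here; the plan resolves it from the instance root.
function createSettingsStore(filePath: string) {
  function get(): NexusSettings {
    try {
      return parseSettings(JSON.parse(fs.readFileSync(filePath, "utf8")));
    } catch {
      return { ...DEFAULTS }; // missing file, invalid JSON, or invalid stored mode
    }
  }

  function set(patch: Partial<NexusSettings>): NexusSettings {
    // Validation runs before any write, so a bad patch leaves the file untouched.
    const next = parseSettings({ ...get(), ...patch });
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    fs.writeFileSync(filePath, JSON.stringify(next, null, 2) + "\n", "utf8");
    return next;
  }

  return { get, set };
}

// Example: a store rooted in a throwaway temp directory (illustrative path only).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "nexus-settings-"));
const demoStore = createSettingsStore(path.join(demoDir, "data", "nexus-settings.json"));
```

Validating the merged object before writing keeps the on-disk file valid even when a caller sends a bad patch.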
**3. Extend `server/src/data/ollama-model-catalog.json`:**
Add a `"tier"` array to every existing variant. Add one new family, `qwen3`:
```json
{
"family": "qwen3",
"variants": [
{ "name": "qwen3:8b", "ramGb": 5, "vramGb": 5, "quality": "balanced", "tier": ["gpu", "apple_silicon", "cpu_only"] }
]
}
```
Tier assignments for existing variants:
- qwen2.5-coder:7b → ["gpu", "apple_silicon", "cpu_only"]
- qwen2.5-coder:14b → ["gpu", "apple_silicon"]
- qwen2.5-coder:32b → ["gpu"]
- llama3.2:3b → ["gpu", "apple_silicon", "cpu_only"]
- llama3.1:8b → ["gpu", "apple_silicon", "cpu_only"]
- llama3.1:70b → ["gpu"]
- mistral:7b → ["gpu", "apple_silicon", "cpu_only"]
- mistral:22b → ["gpu", "apple_silicon"]
- phi4:14b → ["gpu", "apple_silicon"]
- deepseek-r1:7b → ["gpu", "apple_silicon", "cpu_only"]
- deepseek-r1:32b → ["gpu", "apple_silicon"]
**4. Update `server/src/services/ollama.ts`:**
Update `CatalogVariant` interface: add optional `tier?: string[]` field.
Update `getRecommendedModel` signature to accept an optional third parameter `hardwareTier?: HardwareTier`:
```typescript
export function getRecommendedModel(
models: OllamaModel[],
systemRamBytes: number,
hardwareTier?: "gpu" | "apple_silicon" | "cpu_only",
): OllamaModel[]
```
In the loop that finds `bestEntry`, add a tier filter: if `hardwareTier` is provided AND `entry.tier` exists AND `!entry.tier.includes(hardwareTier)`, skip that entry. Existing behavior (no hardwareTier passed) is unchanged.
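The skip condition can be illustrated with a simplified selection loop. The real `getRecommendedModel` operates on `OllamaModel[]` and its selection rule is not shown in this plan, so `pickBestVariant` below is a hypothetical helper that assumes "best" means the largest variant whose `ramGb` fits in system RAM. The `ramGb`/`vramGb` numbers in `sample` are illustrative, not catalog values.

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

interface CatalogVariant {
  name: string;
  ramGb: number;
  vramGb: number;
  quality: string;
  tier?: string[];
}

// Assumed rule: pick the largest variant that fits in RAM, after the tier filter.
function pickBestVariant(
  variants: CatalogVariant[],
  systemRamGb: number,
  hardwareTier?: HardwareTier,
): CatalogVariant | null {
  let best: CatalogVariant | null = null;
  for (const entry of variants) {
    // Skip only when a tier was requested AND the entry declares tiers AND none match.
    if (hardwareTier && entry.tier && !entry.tier.includes(hardwareTier)) continue;
    if (entry.ramGb > systemRamGb) continue;
    if (!best || entry.ramGb > best.ramGb) best = entry;
  }
  return best;
}

// Illustrative data only; sizes are made up for the example.
const sample: CatalogVariant[] = [
  { name: "qwen2.5-coder:7b", ramGb: 5, vramGb: 5, quality: "balanced", tier: ["gpu", "apple_silicon", "cpu_only"] },
  { name: "qwen2.5-coder:14b", ramGb: 10, vramGb: 10, quality: "high", tier: ["gpu", "apple_silicon"] },
  { name: "legacy:1b", ramGb: 1, vramGb: 1, quality: "basic" }, // no tier field: never filtered out
];
```

Because an entry without a `tier` field is never skipped, catalogs that predate the tier extension keep their old behavior, which matches the "existing behavior is unchanged" requirement.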
**5. Create `server/src/__tests__/30-hardware-detection.test.ts`:**
Use Vitest. Mock `os` and `systeminformation` with `vi.mock()`.
Test groups:
- `describe("hardwareService")` — test detect() for Apple Silicon, GPU, CPU-only, and timeout scenarios
- `describe("nexusSettingsService")` — test default, set/get, and validation error (use a temp dir via `vi.mock` of home-paths or `os.tmpdir()`)
- `describe("model catalog")` — load the JSON file, verify every variant has `tier` array, verify qwen3:8b exists
- `describe("getRecommendedModel with tier")` — test that tier filtering works correctly
Prerequisite: before creating hardware.ts, the executor must install systeminformation by running `pnpm --filter server add systeminformation@5`.
</action>
<verify>
<automated>cd /opt/nexus && pnpm --filter server test --run -- 30-hardware-detection</automated>
</verify>
<acceptance_criteria>
- server/src/services/hardware.ts exports `hardwareService`, `HardwareInfo`, `HardwareTier`
- server/src/services/hardware.ts contains `Promise.race` with `3000` timeout for si.graphics
- server/src/services/hardware.ts contains `cpuModel?.startsWith("Apple")`
- server/src/services/hardware.ts contains `usableGb = freeGb * 0.75` (or equivalent `freeBytes * 0.75`)
- server/src/services/nexus-settings.ts exports `nexusSettingsService`, `NexusMode`, `NEXUS_MODES`
- server/src/services/nexus-settings.ts contains `z.enum(NEXUS_MODES).default("both")`
- server/src/services/nexus-settings.ts contains `resolvePaperclipInstanceRoot`
- server/src/data/ollama-model-catalog.json contains `"qwen3"` family
- server/src/data/ollama-model-catalog.json every variant object contains `"tier"` key
- server/src/services/ollama.ts CatalogVariant interface contains `tier`
- server/src/services/ollama.ts getRecommendedModel accepts `hardwareTier` parameter
- server/src/__tests__/30-hardware-detection.test.ts exists and exits 0
</acceptance_criteria>
<done>Hardware detection service returns correct tier for Apple Silicon, GPU, and CPU-only. Nexus settings service persists mode to disk. Model catalog has tier arrays on every variant. getRecommendedModel filters by hardware tier. All tests pass.</done>
</task>
<task type="auto">
<name>Task 2: Hardware and nexus-settings routes, app.ts mounting</name>
<files>
server/src/routes/hardware.ts
server/src/routes/nexus-settings.ts
server/src/app.ts
</files>
<read_first>
server/src/app.ts
server/src/routes/ollama.ts
server/src/routes/instance-settings.ts
server/src/middleware/auth.ts
server/src/services/hardware.ts
server/src/services/nexus-settings.ts
</read_first>
<action>
**1. Create `server/src/routes/hardware.ts`:**
Export `hardwareRoutes()` function returning an Express Router.
Single route: `router.get("/system/providers", async (_req, res) => { ... })`.
Call `hardwareService().detect()`. On success, return `res.json(info)`. On error, return a graceful degradation JSON with `os.totalmem()`, `os.freemem()`, `platform`, all GPU fields null, `hardwareTier: "cpu_only"` (exact shape from RESEARCH.md Pattern 1 fallback).
This route is intentionally unauthenticated. Add a code comment: `// Unauthenticated — hardware is a property of the machine, not the user. Safe: read-only, no mutation, no secrets.`
Also add a `GET /system/providers/recommendation` route that:
- Calls `hardwareService().detect()` to get the hardware info
- Calls `loadCatalog()` from ollama service (or reads the catalog directly) to get model families
- Returns `{ hardwareInfo, recommendedModels }` where `recommendedModels` is a filtered list of catalog entries matching the detected hardware tier
- This gives the UI a single endpoint to show "what model do we recommend for your hardware" without needing Ollama installed
Import: `import os from "node:os"`, `import { hardwareService } from "../services/hardware.js"`.
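One possible shape for the graceful-degradation payload of `/system/providers`, built only from `node:os`. The authoritative shape is the RESEARCH.md Pattern 1 fallback, which is not reproduced here, so several details are assumptions: the helper names `toGb` and `fallbackHardwareInfo`, the GiB divisor (`1024 ** 3`), `unifiedMemory: false`, and `cpuModel: null`.

```typescript
import * as os from "node:os";

interface HardwareInfo {
  totalGb: number;
  freeGb: number;
  usableGb: number;
  platform: string;
  gpuName: string | null;
  gpuVramGb: number | null;
  unifiedMemory: boolean;
  hardwareTier: "gpu" | "apple_silicon" | "cpu_only";
  cpuModel: string | null;
}

// Bytes to GB, rounded to one decimal. The 1024^3 divisor is an assumption.
const toGb = (bytes: number): number => Math.round((bytes / 1024 ** 3) * 10) / 10;

// Used when detect() throws: memory figures only, every GPU field null.
function fallbackHardwareInfo(): HardwareInfo {
  const freeGb = toGb(os.freemem());
  return {
    totalGb: toGb(os.totalmem()),
    freeGb,
    usableGb: Math.round(freeGb * 0.75 * 10) / 10,
    platform: process.platform,
    gpuName: null,
    gpuVramGb: null,
    unifiedMemory: false, // assumption: unknown hardware is treated as non-unified
    hardwareTier: "cpu_only",
    cpuModel: null, // assumption: not re-probed on the error path
  };
}
```

In the route handler this would sit in the `catch` branch, for example `res.json(fallbackHardwareInfo())`, so the endpoint still answers 200 when detection fails.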
**2. Create `server/src/routes/nexus-settings.ts`:**
Export `nexusSettingsRoutes()` function returning an Express Router.
Two routes:
- `GET /nexus/settings` — calls `nexusSettingsService().get()`, returns JSON. Guard with `assertBoard(req)`.
- `PATCH /nexus/settings` — reads `req.body`, calls `nexusSettingsService().set(req.body)`, returns updated settings. Guard with `assertBoard(req)`.
Import `assertBoard` from `"./authz.js"` (same pattern as `instanceSettingsRoutes`).
**3. Modify `server/src/app.ts`:**
Add import at top:
```typescript
import { hardwareRoutes } from "./routes/hardware.js";
import { nexusSettingsRoutes } from "./routes/nexus-settings.js";
```
Mount hardware routes BEFORE the `const api = Router()` block, right after `app.use(llmRoutes(db));` (line ~129 in the current file). At that point actorMiddleware has already run, but the route never calls assertBoard, so it stays reachable without an auth token:
```typescript
app.use("/api", hardwareRoutes());
```
CRITICAL: The hardware route must come BEFORE `app.use("/api", api)` so it is reached without boardMutationGuard. The llmRoutes mount point (line ~129) is the correct insertion location — right after it.
Mount nexus settings routes on the `api` Router (authenticated):
```typescript
api.use(nexusSettingsRoutes());
```
Place this after `api.use(instanceSettingsRoutes(db));` for logical grouping.
</action>
<verify>
<automated>cd /opt/nexus && pnpm --filter server test --run -- 30-hardware-detection</automated>
</verify>
<acceptance_criteria>
- server/src/routes/hardware.ts exports `hardwareRoutes`
- server/src/routes/hardware.ts contains `router.get("/system/providers"`
- server/src/routes/hardware.ts contains comment with "Unauthenticated"
- server/src/routes/nexus-settings.ts exports `nexusSettingsRoutes`
- server/src/routes/nexus-settings.ts contains `assertBoard`
- server/src/routes/nexus-settings.ts contains `router.get("/nexus/settings"`
- server/src/routes/nexus-settings.ts contains `router.patch("/nexus/settings"`
- server/src/app.ts contains `import { hardwareRoutes }` from `"./routes/hardware.js"`
- server/src/app.ts contains `import { nexusSettingsRoutes }` from `"./routes/nexus-settings.js"`
- server/src/app.ts contains `app.use("/api", hardwareRoutes())` BEFORE the `const api = Router()` line
- server/src/app.ts contains `api.use(nexusSettingsRoutes())`
</acceptance_criteria>
<done>Hardware probe endpoint returns 200 without auth. Nexus settings endpoints require board auth. Both are correctly mounted in app.ts. All existing tests still pass.</done>
</task>
</tasks>
<verification>
Run the full server test suite to ensure no regressions:
```bash
cd /opt/nexus && pnpm --filter server test --run
```
Verify the hardware probe is unauthenticated by checking that `hardwareRoutes` is mounted before `boardMutationGuard`:
```bash
grep -n "hardwareRoutes\|const api = Router\|boardMutationGuard" server/src/app.ts
```
Verify the model catalog has tier on every variant:
```bash
node -e "const c = require('./server/src/data/ollama-model-catalog.json'); const all = c.models.flatMap(f => f.variants); const missing = all.filter(v => !v.tier); console.log(missing.length === 0 ? 'OK: all variants have tier' : 'FAIL: ' + missing.length + ' variants missing tier')"
```
</verification>
<success_criteria>
1. `pnpm --filter server test --run -- 30-hardware-detection` exits 0
2. `pnpm --filter server test --run` exits 0 (no regressions)
3. `server/src/services/hardware.ts` exists with hardwareService, HardwareInfo, HardwareTier exports
4. `server/src/services/nexus-settings.ts` exists with nexusSettingsService, NexusMode, NEXUS_MODES exports
5. `server/src/routes/hardware.ts` exists with unauthenticated GET /system/providers
6. `server/src/routes/nexus-settings.ts` exists with board-auth-gated GET/PATCH
7. Model catalog has tier arrays and qwen3 family
8. getRecommendedModel supports optional hardwareTier parameter
</success_criteria>
<output>
After completion, create `.planning/phases/30-hardware-detection-mode-selection/30-01-SUMMARY.md`
</output>