nexus/.planning/phases/30-hardware-detection-mode-selection/30-01-PLAN.md
2026-04-04 03:55:49 +00:00

---
phase: 30-hardware-detection-mode-selection
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - server/src/services/hardware.ts
  - server/src/services/nexus-settings.ts
  - server/src/routes/hardware.ts
  - server/src/routes/nexus-settings.ts
  - server/src/app.ts
  - server/src/data/ollama-model-catalog.json
  - server/src/services/ollama.ts
  - server/src/__tests__/30-hardware-detection.test.ts
autonomous: true
requirements:
  - ONBD-02
  - ONBD-03
  - ONBD-01
must_haves:
  truths:
    - "GET /api/system/providers returns 200 with hardware info without any auth token"
    - "Apple Silicon is detected via CPU brand string and returns unifiedMemory: true with hardwareTier: apple_silicon"
    - "GPU detection via systeminformation has a 3-second timeout; failure degrades to cpu_only tier"
    - "nexusSettingsService persists mode to data/nexus-settings.json and reads it back"
    - "PATCH /api/nexus/settings requires board auth and persists the mode value"
    - "Model catalog contains tier field on every variant and includes qwen3:8b family"
    - "getRecommendedModel filters by hardware tier when tier data is present"
  artifacts:
    - path: server/src/services/hardware.ts
      provides: "hardwareService with detect() returning HardwareInfo"
      exports:
        - hardwareService
        - HardwareInfo
        - HardwareTier
    - path: server/src/services/nexus-settings.ts
      provides: "File-backed nexus settings persistence"
      exports:
        - nexusSettingsService
        - NexusMode
        - NEXUS_MODES
    - path: server/src/routes/hardware.ts
      provides: "Unauthenticated GET /api/system/providers"
      exports:
        - hardwareRoutes
    - path: server/src/routes/nexus-settings.ts
      provides: "Board-auth-gated GET/PATCH /api/nexus/settings"
      exports:
        - nexusSettingsRoutes
    - path: server/src/data/ollama-model-catalog.json
      provides: "Extended model catalog with tier arrays and qwen3 family"
      contains: "qwen3"
    - path: server/src/__tests__/30-hardware-detection.test.ts
      provides: "Unit tests for hardware service, settings service, routes, and catalog"
  key_links:
    - from: server/src/routes/hardware.ts
      to: server/src/services/hardware.ts
      via: "hardwareService().detect()"
      pattern: "hardwareService.*detect"
    - from: server/src/app.ts
      to: server/src/routes/hardware.ts
      via: "app.use before api router"
      pattern: "hardwareRoutes"
    - from: server/src/services/ollama.ts
      to: server/src/data/ollama-model-catalog.json
      via: "loadCatalog()"
      pattern: "loadCatalog"
---

Build the server-side hardware detection, mode persistence, and model catalog infrastructure for Phase 30.

Purpose: Provides the unauthenticated hardware probe endpoint, file-backed mode persistence, and tier-aware model catalog that the onboarding UI (Plan 02) will consume. These are the foundational APIs for the entire v1.5 onboarding stack.

Output: Five new server files (hardware service, hardware route, nexus-settings service, nexus-settings route, tests), two modified files (app.ts mount, ollama-model-catalog.json extension), and one updated file (ollama.ts for tier-aware recommendations).

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/30-hardware-detection-mode-selection/30-RESEARCH.md @.planning/phases/30-hardware-detection-mode-selection/30-CONTEXT.md

@server/src/app.ts @server/src/services/ollama.ts @server/src/data/ollama-model-catalog.json @server/src/home-paths.ts @server/src/routes/ollama.ts

From server/src/services/ollama.ts:

```typescript
interface CatalogVariant {
  name: string;
  ramGb: number;
  vramGb: number;
  quality: string;
}
interface CatalogFamily {
  family: string;
  variants: CatalogVariant[];
}
interface ModelCatalog {
  models: CatalogFamily[];
}
export function getRecommendedModel(models: OllamaModel[], systemRamBytes: number): OllamaModel[]
```

From server/src/home-paths.ts:

```typescript
export function resolvePaperclipHomeDir(): string;
export function resolvePaperclipInstanceRoot(instanceId?: string): string;
```

From server/src/middleware/auth.ts:

```typescript
// req.actor.type === "board" | "agent" | "none"
// assertBoard(req) throws 403 if not board
```

From server/src/app.ts (mounting pattern — line ~129):

```typescript
app.use(llmRoutes(db));  // mounted before api router
// ...
const api = Router();
api.use(boardMutationGuard());
// ... all authenticated routes on api ...
app.use("/api", api);
```

Task 1: Hardware service, nexus-settings service, model catalog, and tests

Files: server/src/services/hardware.ts, server/src/services/nexus-settings.ts, server/src/data/ollama-model-catalog.json, server/src/services/ollama.ts, server/src/__tests__/30-hardware-detection.test.ts

Read first: server/src/services/ollama.ts, server/src/data/ollama-model-catalog.json, server/src/home-paths.ts, server/src/services/instance-settings.ts

- Test: hardwareService().detect() returns HardwareInfo with all required fields (totalGb, freeGb, usableGb, platform, gpuName, gpuVramGb, unifiedMemory, hardwareTier, cpuModel)
- Test: When os.cpus()[0].model starts with "Apple" and platform is "darwin", returns unifiedMemory: true, hardwareTier: "apple_silicon", gpuVramGb: null
- Test: When si.graphics() returns a controller with vram >= 4096 MB, returns hardwareTier: "gpu" with gpuVramGb set
- Test: When si.graphics() returns no controllers (or throws), returns hardwareTier: "cpu_only"
- Test: si.graphics() is wrapped in Promise.race with 3000ms timeout; if it times out, returns cpu_only tier
- Test: nexusSettingsService().get() returns { mode: "both" } when no file exists (default)
- Test: nexusSettingsService().set({ mode: "personal_ai" }) writes to disk and subsequent get() returns "personal_ai"
- Test: nexusSettingsService().set({ mode: "invalid" as any }) throws Zod validation error
- Test: Extended catalog JSON contains a "qwen3" family with variant "qwen3:8b" having tier array ["gpu", "apple_silicon", "cpu_only"]
- Test: Every variant in catalog has a "tier" array (no variant without tier)
- Test: getRecommendedModel with tier "gpu" only recommends models whose tier includes "gpu"

**1. Create `server/src/services/hardware.ts`:**
Export types `HardwareTier = "gpu" | "apple_silicon" | "cpu_only"` and `HardwareInfo` interface with fields: `totalGb: number`, `freeGb: number`, `usableGb: number`, `platform: NodeJS.Platform`, `gpuName: string | null`, `gpuVramGb: number | null`, `unifiedMemory: boolean`, `hardwareTier: HardwareTier`, `cpuModel: string | null`.

Export `hardwareService()` factory function returning `{ detect }`. Implementation:
- Get `totalBytes = os.totalmem()`, `freeBytes = os.freemem()`, compute `totalGb`, `freeGb`, `usableGb = freeGb * 0.75` (all rounded to 1 decimal).
- Get `cpuModel = os.cpus()[0]?.model ?? null`.
- Detect Apple Silicon: `process.platform === "darwin" && cpuModel?.startsWith("Apple")`.
- If Apple Silicon: set `gpuName: null`, `gpuVramGb: null`, `unifiedMemory: true`, `hardwareTier: "apple_silicon"`. Do NOT call si.graphics().
- If not Apple Silicon: call `si.graphics()` wrapped in `Promise.race()` with a 3000ms timeout. On success, read `controllers[0].model` for `gpuName` and `controllers[0].vram / 1024` for `gpuVramGb`. If `gpuVramGb >= 4`, set `hardwareTier: "gpu"`. Otherwise `"cpu_only"`. On failure/timeout, set `gpuName: null`, `gpuVramGb: null`, `hardwareTier: "cpu_only"`.
- Cache result for 5 minutes (same pattern as in RESEARCH.md: `cache` variable + `cacheExpiry` timestamp).
- Import: `import os from "node:os"; import si from "systeminformation";`
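The detection steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the final `hardware.ts`: the GPU probe is passed in as a function (`graphics`, standing in for `si.graphics()`) so the block needs no third-party import, and `toGb`, `withTimeout`, `classifyGpu`, `probeGpu`, and `NO_GPU` are illustrative names. Treating "Gb" as GiB and guarding against a null VRAM figure are also assumptions the plan does not spell out. Caching is left out.

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

// Only the fields this sketch reads; VRAM is taken to be in MB, as in the plan.
interface GpuController {
  model: string;
  vram: number | null;
}

interface GpuResult {
  gpuName: string | null;
  gpuVramGb: number | null;
  hardwareTier: HardwareTier;
}

const GPU_PROBE_TIMEOUT_MS = 3000;
const MIN_GPU_VRAM_GB = 4;
const NO_GPU: GpuResult = { gpuName: null, gpuVramGb: null, hardwareTier: "cpu_only" };

// Assumption: "Gb" means GiB (1024^3 bytes), rounded to one decimal.
function toGb(bytes: number): number {
  return Math.round((bytes / 1024 ** 3) * 10) / 10;
}

function isAppleSilicon(platform: string, cpuModel: string | null): boolean {
  return platform === "darwin" && cpuModel !== null && cpuModel.startsWith("Apple");
}

// Rejects if `work` has not settled within `ms`; the timer is always cleared.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer);
  }
}

function classifyGpu(controllers: GpuController[]): GpuResult {
  const first = controllers[0];
  if (first === undefined) return NO_GPU;
  // Assumption: a controller without a VRAM figure cannot qualify for the gpu tier.
  const gpuVramGb = first.vram === null ? null : first.vram / 1024;
  const hardwareTier: HardwareTier =
    gpuVramGb !== null && gpuVramGb >= MIN_GPU_VRAM_GB ? "gpu" : "cpu_only";
  return { gpuName: first.model, gpuVramGb, hardwareTier };
}

// `graphics` stands in for si.graphics(); failure or timeout degrades to cpu_only.
async function probeGpu(
  graphics: () => Promise<{ controllers: GpuController[] }>,
  timeoutMs: number = GPU_PROBE_TIMEOUT_MS,
): Promise<GpuResult> {
  try {
    const { controllers } = await withTimeout(graphics(), timeoutMs);
    return classifyGpu(controllers);
  } catch {
    return NO_GPU;
  }
}
```

Injecting the probe also keeps the timeout path easy to exercise in the Vitest suite: a probe that never resolves, combined with a short `timeoutMs`, should come back as `cpu_only` without waiting three seconds.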

**2. Create `server/src/services/nexus-settings.ts`:**

Export `NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const`, `NexusMode` type, and `nexusSettingsService()` factory.

Use Zod schema: `z.object({ mode: z.enum(NEXUS_MODES).default("both") })`.

`resolveNexusSettingsPath()`: `path.resolve(resolvePaperclipInstanceRoot(), "data", "nexus-settings.json")`.

Methods:
- `get()`: Read file, parse with Zod. On any error (file missing, invalid JSON), return `{ mode: "both" }`.
- `set(patch)`: Load current, merge patch, validate with Zod, write JSON to disk (mkdirSync recursive for data dir).

Import `resolvePaperclipInstanceRoot` from `"../home-paths.js"`.
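
The `get()` / `set()` contract described above might look roughly like this. It is a sketch under assumptions rather than the planned implementation: a hand-written guard (`parseSettings`) stands in for the Zod schema so the block runs on the Node standard library alone, and the file path is passed in as a parameter instead of being resolved through `resolvePaperclipInstanceRoot()`. The demo at the end writes only to a temporary directory.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const;
type NexusMode = (typeof NEXUS_MODES)[number];

interface NexusSettings {
  mode: NexusMode;
}

const DEFAULTS: NexusSettings = { mode: "both" };

// Stand-in for the Zod schema: a missing mode defaults to "both",
// an unknown mode (or a non-object payload) throws.
function parseSettings(raw: unknown): NexusSettings {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("settings must be an object");
  }
  const candidate = (raw as { mode?: unknown }).mode;
  const mode = candidate === undefined ? DEFAULTS.mode : candidate;
  const known = NEXUS_MODES.find((m) => m === mode);
  if (known === undefined) {
    throw new Error(`invalid nexus mode: ${String(mode)}`);
  }
  return { mode: known };
}

// Hypothetical signature: the plan's factory takes no arguments and resolves the path itself.
function nexusSettingsService(filePath: string) {
  function get(): NexusSettings {
    try {
      return parseSettings(JSON.parse(fs.readFileSync(filePath, "utf8")));
    } catch {
      // Missing file, invalid JSON, or invalid content all fall back to defaults.
      return { ...DEFAULTS };
    }
  }

  function set(patch: Partial<NexusSettings>): NexusSettings {
    // Validate before touching the disk so a bad patch never overwrites good data.
    const next = parseSettings({ ...get(), ...patch });
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    fs.writeFileSync(filePath, JSON.stringify(next, null, 2) + "\n", "utf8");
    return next;
  }

  return { get, set };
}

// Demo against a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "nexus-settings-"));
const demo = nexusSettingsService(path.join(demoDir, "data", "nexus-settings.json"));
const beforeWrite = demo.get(); // no file yet, so the default applies
demo.set({ mode: "personal_ai" });
const afterWrite = demo.get();
```

Validating the merged object before writing means an invalid patch leaves the previous file in place, which is the behavior the "throws Zod validation error" test relies on.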

**3. Extend `server/src/data/ollama-model-catalog.json`:**

Add a `"tier"` array to every existing variant, and add the new `qwen3` family:

```json
{
  "family": "qwen3",
  "variants": [
    { "name": "qwen3:8b", "ramGb": 5, "vramGb": 5, "quality": "balanced", "tier": ["gpu", "apple_silicon", "cpu_only"] }
  ]
}
```

Tier assignments for existing variants:
- qwen2.5-coder:7b → ["gpu", "apple_silicon", "cpu_only"]
- qwen2.5-coder:14b → ["gpu", "apple_silicon"]
- qwen2.5-coder:32b → ["gpu"]
- llama3.2:3b → ["gpu", "apple_silicon", "cpu_only"]
- llama3.1:8b → ["gpu", "apple_silicon", "cpu_only"]
- llama3.1:70b → ["gpu"]
- mistral:7b → ["gpu", "apple_silicon", "cpu_only"]
- mistral:22b → ["gpu", "apple_silicon"]
- phi4:14b → ["gpu", "apple_silicon"]
- deepseek-r1:7b → ["gpu", "apple_silicon", "cpu_only"]
- deepseek-r1:32b → ["gpu", "apple_silicon"]

**4. Update `server/src/services/ollama.ts`:**

Update `CatalogVariant` interface: add optional `tier?: string[]` field.

Update `getRecommendedModel` signature to accept an optional third parameter `hardwareTier?: HardwareTier`:
```typescript
export function getRecommendedModel(
  models: OllamaModel[],
  systemRamBytes: number,
  hardwareTier?: "gpu" | "apple_silicon" | "cpu_only",
): OllamaModel[]
```

In the loop that finds `bestEntry`, add a tier filter: if `hardwareTier` is provided AND `entry.tier` exists AND `!entry.tier.includes(hardwareTier)`, skip that entry. Existing behavior (no hardwareTier passed) is unchanged.
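
The skip rule above can be sketched as a small predicate. The surrounding loop is an assumption: this document does not show how the existing `getRecommendedModel` chooses `bestEntry`, so the sketch assumes "largest variant whose `ramGb` fits the budget". The names `isExcludedByTier` and `pickBestEntry` and the `example:*` variants are hypothetical and exist only for illustration.

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

interface CatalogVariant {
  name: string;
  ramGb: number;
  vramGb: number;
  quality: string;
  tier?: string[];
}

// Skip an entry only when a tier was requested AND the entry carries tier data
// AND that data excludes the requested tier.
function isExcludedByTier(entry: CatalogVariant, hardwareTier?: HardwareTier): boolean {
  return (
    hardwareTier !== undefined &&
    entry.tier !== undefined &&
    !entry.tier.includes(hardwareTier)
  );
}

// Assumption: "best" means the largest variant whose ramGb fits the budget.
function pickBestEntry(
  variants: CatalogVariant[],
  ramBudgetGb: number,
  hardwareTier?: HardwareTier,
): CatalogVariant | undefined {
  let best: CatalogVariant | undefined;
  for (const entry of variants) {
    if (isExcludedByTier(entry, hardwareTier)) continue;
    if (entry.ramGb > ramBudgetGb) continue;
    if (best === undefined || entry.ramGb > best.ramGb) best = entry;
  }
  return best;
}

// Hypothetical catalog entries, for illustration only.
const exampleVariants: CatalogVariant[] = [
  { name: "example:small", ramGb: 5, vramGb: 5, quality: "balanced", tier: ["gpu", "apple_silicon", "cpu_only"] },
  { name: "example:medium", ramGb: 10, vramGb: 10, quality: "high", tier: ["gpu", "apple_silicon"] },
  { name: "example:large", ramGb: 20, vramGb: 20, quality: "best", tier: ["gpu"] },
];
```

With a 64 GB budget, `pickBestEntry(exampleVariants, 64, "cpu_only")` selects `example:small`, while omitting the tier argument selects `example:large`, so callers that do not pass a tier see no change.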

**5. Create `server/src/__tests__/30-hardware-detection.test.ts`:**

Use Vitest. Mock `os` and `systeminformation` with `vi.mock()`.

Test groups:
- `describe("hardwareService")` — test detect() for Apple Silicon, GPU, CPU-only, and timeout scenarios
- `describe("nexusSettingsService")` — test default, set/get, and validation error (use a temp dir via `vi.mock` of home-paths or `os.tmpdir()`)
- `describe("model catalog")` — load the JSON file, verify every variant has `tier` array, verify qwen3:8b exists
- `describe("getRecommendedModel with tier")` — test that tier filtering works correctly

Install systeminformation: the executor must run `pnpm --filter server add systeminformation@5` before creating hardware.ts.

Verify: `cd /opt/nexus && pnpm --filter server test --run -- 30-hardware-detection`

Acceptance criteria:

- server/src/services/hardware.ts exports `hardwareService`, `HardwareInfo`, `HardwareTier`
- server/src/services/hardware.ts contains `Promise.race` with `3000` timeout for si.graphics
- server/src/services/hardware.ts contains `cpuModel?.startsWith("Apple")`
- server/src/services/hardware.ts contains `usableGb = freeGb * 0.75` (or equivalent `freeBytes * 0.75`)
- server/src/services/nexus-settings.ts exports `nexusSettingsService`, `NexusMode`, `NEXUS_MODES`
- server/src/services/nexus-settings.ts contains `z.enum(NEXUS_MODES).default("both")`
- server/src/services/nexus-settings.ts contains `resolvePaperclipInstanceRoot`
- server/src/data/ollama-model-catalog.json contains `"qwen3"` family
- server/src/data/ollama-model-catalog.json every variant object contains `"tier"` key
- server/src/services/ollama.ts CatalogVariant interface contains `tier`
- server/src/services/ollama.ts getRecommendedModel accepts `hardwareTier` parameter
- server/src/__tests__/30-hardware-detection.test.ts exists and exits 0

Done: Hardware detection service returns correct tier for Apple Silicon, GPU, and CPU-only. Nexus settings service persists mode to disk. Model catalog has tier arrays on every variant. getRecommendedModel filters by hardware tier. All tests pass.

Task 2: Hardware and nexus-settings routes, app.ts mounting

Files: server/src/routes/hardware.ts, server/src/routes/nexus-settings.ts, server/src/app.ts

Read first: server/src/app.ts, server/src/routes/ollama.ts, server/src/routes/instance-settings.ts, server/src/middleware/auth.ts, server/src/services/hardware.ts, server/src/services/nexus-settings.ts

**1. Create `server/src/routes/hardware.ts`:**
Export `hardwareRoutes()` function returning an Express Router.

Single route: `router.get("/system/providers", async (_req, res) => { ... })`.

Call `hardwareService().detect()`. On success, return `res.json(info)`. On error, return a graceful degradation JSON with `os.totalmem()`, `os.freemem()`, `platform`, all GPU fields null, `hardwareTier: "cpu_only"` (exact shape from RESEARCH.md Pattern 1 fallback).

This route is intentionally unauthenticated. Add a code comment: `// Unauthenticated — hardware is a property of the machine, not the user. Safe: read-only, no mutation, no secrets.`

Also add a `GET /system/providers/recommendation` route that:
- Calls `hardwareService().detect()` to get the hardware info
- Calls `loadCatalog()` from ollama service (or reads the catalog directly) to get model families
- Returns `{ hardwareInfo, recommendedModels }` where `recommendedModels` is a filtered list of catalog entries matching the detected hardware tier
- This gives the UI a single endpoint to show "what model do we recommend for your hardware" without needing Ollama installed

Import: `import os from "node:os"`, `import { hardwareService } from "../services/hardware.js"`.
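
The `recommendedModels` part of the recommendation response could be assembled along these lines. This is a sketch, assuming the route flattens the catalog and keeps variants whose `tier` includes the detected tier. `recommendForTier` and `RecommendedModel` are hypothetical names, and keeping untagged variants mirrors the `getRecommendedModel` rule rather than anything this section states for the route. In the sample catalog, the `qwen3:8b` entry comes from this plan; the `example` family is made up.

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

interface CatalogVariant {
  name: string;
  ramGb: number;
  vramGb: number;
  quality: string;
  tier?: string[];
}
interface CatalogFamily {
  family: string;
  variants: CatalogVariant[];
}
interface ModelCatalog {
  models: CatalogFamily[];
}

// Hypothetical response item: the variant plus the family it belongs to.
interface RecommendedModel extends CatalogVariant {
  family: string;
}

function recommendForTier(catalog: ModelCatalog, tier: HardwareTier): RecommendedModel[] {
  const recommended: RecommendedModel[] = [];
  for (const family of catalog.models) {
    for (const variant of family.variants) {
      // Assumption: variants without tier data are kept, as in getRecommendedModel.
      if (variant.tier !== undefined && !variant.tier.includes(tier)) continue;
      recommended.push({ family: family.family, ...variant });
    }
  }
  return recommended;
}

const sampleCatalog: ModelCatalog = {
  models: [
    {
      family: "qwen3",
      variants: [
        { name: "qwen3:8b", ramGb: 5, vramGb: 5, quality: "balanced", tier: ["gpu", "apple_silicon", "cpu_only"] },
      ],
    },
    {
      // Hypothetical family, for illustration only.
      family: "example",
      variants: [{ name: "example:large", ramGb: 20, vramGb: 20, quality: "best", tier: ["gpu"] }],
    },
  ],
};
```

For the sample catalog, a `cpu_only` machine is offered only `qwen3:8b`, while a `gpu` machine is offered both entries. Because the filter reads only the catalog and the detected tier, it works without Ollama installed.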

**2. Create `server/src/routes/nexus-settings.ts`:**

Export `nexusSettingsRoutes()` function returning an Express Router.

Two routes:
- `GET /nexus/settings` — calls `nexusSettingsService().get()`, returns JSON. Guard with `assertBoard(req)`.
- `PATCH /nexus/settings` — reads `req.body`, calls `nexusSettingsService().set(req.body)`, returns updated settings. Guard with `assertBoard(req)`.

Import `assertBoard` from `"./authz.js"` (same pattern as `instanceSettingsRoutes`).

**3. Modify `server/src/app.ts`:**

Add import at top:
```typescript
import { hardwareRoutes } from "./routes/hardware.js";
import { nexusSettingsRoutes } from "./routes/nexus-settings.js";
```

Mount the hardware routes BEFORE the `const api = Router()` block, right after `app.use(llmRoutes(db));` (line ~129 in the current file). At that point actorMiddleware has already run, but the route itself never calls assertBoard:
```typescript
app.use("/api", hardwareRoutes());
```

CRITICAL: The hardware route must come BEFORE `app.use("/api", api)` so it is reached without boardMutationGuard. The llmRoutes mount point (line ~129) is the correct insertion location — right after it.

Mount nexus settings routes on the `api` Router (authenticated):
```typescript
api.use(nexusSettingsRoutes());
```

Place this after `api.use(instanceSettingsRoutes(db));` for logical grouping.

Verify: `cd /opt/nexus && pnpm --filter server test --run -- 30-hardware-detection`

Acceptance criteria:

- server/src/routes/hardware.ts exports `hardwareRoutes`
- server/src/routes/hardware.ts contains `router.get("/system/providers"`
- server/src/routes/hardware.ts contains comment with "Unauthenticated"
- server/src/routes/nexus-settings.ts exports `nexusSettingsRoutes`
- server/src/routes/nexus-settings.ts contains `assertBoard`
- server/src/routes/nexus-settings.ts contains `router.get("/nexus/settings"`
- server/src/routes/nexus-settings.ts contains `router.patch("/nexus/settings"`
- server/src/app.ts contains `import { hardwareRoutes }` from `"./routes/hardware.js"`
- server/src/app.ts contains `import { nexusSettingsRoutes }` from `"./routes/nexus-settings.js"`
- server/src/app.ts contains `app.use("/api", hardwareRoutes())` BEFORE the `const api = Router()` line
- server/src/app.ts contains `api.use(nexusSettingsRoutes())`

Done: Hardware probe endpoint returns 200 without auth. Nexus settings endpoints require board auth. Both are correctly mounted in app.ts. All existing tests still pass.

Run the full server test suite to ensure no regressions:

```bash
cd /opt/nexus && pnpm --filter server test --run
```

Verify the hardware probe is unauthenticated by checking that hardwareRoutes is mounted before boardMutationGuard:

```bash
grep -n "hardwareRoutes\|const api = Router\|boardMutationGuard" server/src/app.ts
```

Verify the model catalog has tier on every variant:

```bash
node -e "const c = require('./server/src/data/ollama-model-catalog.json'); const all = c.models.flatMap(f => f.variants); const missing = all.filter(v => !v.tier); console.log(missing.length === 0 ? 'OK: all variants have tier' : 'FAIL: ' + missing.length + ' variants missing tier')"
```

<success_criteria>

  1. pnpm --filter server test --run -- 30-hardware-detection exits 0
  2. pnpm --filter server test --run exits 0 (no regressions)
  3. server/src/services/hardware.ts exists with hardwareService, HardwareInfo, HardwareTier exports
  4. server/src/services/nexus-settings.ts exists with nexusSettingsService, NexusMode exports
  5. server/src/routes/hardware.ts exists with unauthenticated GET /system/providers
  6. server/src/routes/nexus-settings.ts exists with board-auth-gated GET/PATCH
  7. Model catalog has tier arrays and qwen3 family
  8. getRecommendedModel supports optional hardwareTier parameter

</success_criteria>

After completion, create `.planning/phases/30-hardware-detection-mode-selection/30-01-SUMMARY.md`