Nexus Dev 69517b373e [nexus] docs(30): research phase 30 — hardware detection + mode selection

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-04 03:55:49 +00:00


Phase 30: Hardware Detection + Mode Selection - Research

Researched: 2026-04-02
Domain: Server-side hardware probing (Node.js os + systeminformation v5), unauthenticated Express route, Apple Silicon unified memory, model catalog extension, Nexus mode state, NexusOnboardingWizard multi-step preparation
Confidence: HIGH

Summary

Phase 30 is the foundation of the v1.5 onboarding stack. It adds four things: (1) an unauthenticated hardware probe endpoint that works before any board auth token exists; (2) Apple Silicon unified memory handling with the 0.75 multiplier and correct copy; (3) an extended model recommendation catalog keyed to hardware tier (GPU / Apple Silicon / CPU-only); and (4) a mode selector (Personal AI Assistant / Project Builder / Both) whose choice is persisted and gates downstream UI.

The existing codebase has a solid base: os.totalmem() is already used in ollamaRoutes and getRecommendedModel(), the 0.75 multiplier is already applied in getRecommendedModel(), and the Ollama model catalog is an on-disk JSON file that can be extended. Two gaps need to be closed before the next phase: the probe endpoint for hardware detection must work without board auth (Pitfall 14 from PITFALLS.md), and there is no nexus_mode persistence layer yet.

The state constraint is hard: no new DB tables. Mode is stored as a Nexus-namespaced key inside a new file-backed JSON at data/nexus-settings.json in the instance root, read/written by a new nexusSettingsService. This avoids touching the .strict() Zod schema on instance_settings.general (adding a key to that schema would require changes to both @paperclipai/shared and the routes — an unnecessary upstream conflict surface). The file-backed approach mirrors the config.json pattern already present in the project.

Primary recommendation: Add GET /api/system/providers (unauthenticated) for hardware probe; create server/src/services/hardware.ts using os + systeminformation@5 for GPU detection; extend ollama-model-catalog.json with hardware tier + PRD models; add server/src/services/nexus-settings.ts for file-backed mode persistence; build ModeSelector + HardwareSummaryStep as new onboarding step components.

<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.

Claude's Discretion

All implementation choices are at Claude's discretion.

Deferred Ideas (OUT OF SCOPE)

None (discuss phase skipped).

</user_constraints>

Additional locked decisions from STATE.md (established at roadmap):

No DB schema changes — all state in existing JSONB fields or file-backed JSON
Apple Silicon: use os.freemem() × 0.75 for VRAM estimate; label as "unified memory" not "VRAM"; use systeminformation v5 (not v6)
Unauthenticated GET /system/providers endpoint required for pre-auth hardware probe
Mode persisted in instance_settings.general.nexus namespace (ARCHITECTURE.md) — however, .strict() constraint means a file-backed alternative is safer (see below)

<phase_requirements>

Phase Requirements

ID	Description	Research Support
ONBD-01	User can select mode (Personal AI Assistant / Project Builder / Both) during onboarding	New `ModeSelector` component in `NexusOnboardingWizard`; mode persisted via `nexusSettingsService`
ONBD-02	System auto-detects GPU, RAM, and Apple Silicon unified memory within 5 seconds	New `hardwareService` + `GET /api/system/providers` (unauthenticated); `systeminformation@5` for GPU on Linux/macOS; Apple Silicon flagged via CPU brand string
ONBD-03	System recommends best local model from pre-built JSON database based on detected hardware	Extend `ollama-model-catalog.json` with PRD models (Bonsai, Qwen 3) and tier field; update `getRecommendedModel()` to use hardware tier
ONBD-07	Local AI framed as privacy premium ("runs entirely on your machine, no accounts, works offline")	`HardwareSummaryStep` component renders PRD copy verbatim; copy is gated to local AI path only
</phase_requirements>

Standard Stack

Core

Library	Version	Purpose	Why Standard
Node.js `os`	built-in	`totalmem()`, `freemem()`, `cpus()` for RAM + CPU brand	Already used in `ollamaRoutes`; zero-cost
`systeminformation`	5.31.5 (latest v5)	`graphics()` for GPU name + VRAM on Linux/macOS/Windows	STATE.md locked to v5 (not v6 — API breakage risk); not yet installed
React	project version	UI components	Project standard
Zod	project version	Schema validation for new settings	Already used throughout

Supporting

Library	Version	Purpose	When to Use
`@tanstack/react-query`	project version	`useQuery` for hardware info hook	All other API queries use this

Alternatives Considered

Instead of	Could Use	Tradeoff
`systeminformation`	`system_profiler` shell out (macOS only)	`systeminformation` works cross-platform; shell out is macOS-only and adds timeout logic
File-backed `nexus-settings.json`	`instance_settings.general.nexus` key	`.strict()` Zod schema blocks adding new keys without upstream changes; file-backed is safer for Nexus-only state

Installation:

pnpm --filter server add systeminformation@5

Version verification: npm view systeminformation version → 5.31.5 (confirmed 2026-04-02). v6 exists but STATE.md explicitly locks to v5.

Architecture Patterns

Recommended Project Structure

Changes for this phase:

server/src/
├── services/
│   ├── hardware.ts              # NEW — hardwareService: detect GPU/RAM/Apple Silicon
│   └── nexus-settings.ts        # NEW — nexusSettingsService: file-backed mode persistence
├── routes/
│   └── hardware.ts              # NEW — GET /api/system/providers (unauthenticated)
├── data/
│   └── ollama-model-catalog.json # MODIFIED — add tier field + PRD models (Bonsai, Qwen3)
└── app.ts                        # MODIFIED — mount hardwareRoutes()

ui/src/
├── components/
│   ├── NexusOnboardingWizard.tsx  # MODIFIED — add mode selector step + hardware step
│   └── onboarding/                # NEW directory
│       ├── ModeSelector.tsx        # NEW — Personal AI / Project Builder / Both cards
│       └── HardwareSummaryStep.tsx # NEW — displays GPU/RAM/unified memory + model rec
├── api/
│   └── hardware.ts                # NEW — typed fetch wrapper for /api/system/providers
└── hooks/
    └── useHardwareInfo.ts         # NEW — useQuery wrapper for hardware data

Pattern 1: Unauthenticated Hardware Probe Route

What: GET /api/system/providers returns hardware detection results without requiring any auth. It is mounted before the actorMiddleware check, or with an explicit no-auth bypass.

When to use: During initial onboarding, before any board auth exists.

Key insight: In local_trusted deploymentMode, actorMiddleware already sets req.actor = { type: "board" } implicitly, so the probe works for free in local installs. In authenticated mode (a fresh install before board claim), the probe must explicitly allow unauthenticated access, because req.actor.type === "none" until login.

Approach: Mount a dedicated route before the api router that does NOT call assertBoard. Return an empty/safe result if hardware detection fails.

// server/src/routes/hardware.ts
import os from "node:os";
import { Router } from "express";
import { hardwareService } from "../services/hardware.js";

export function hardwareRoutes() {
  const router = Router();
  const svc = hardwareService();

  // Unauthenticated — intentional. Hardware is a property of the machine, not the user.
  // Safe: returns read-only system info, no mutation, no secrets.
  router.get("/system/providers", async (_req, res) => {
    try {
      const info = await svc.detect();
      res.json(info);
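    } catch {
      // Graceful degradation: return the full HardwareInfo shape from os-only data
      const gib = 1024 ** 3;
      const freeGb = os.freemem() / gib;
      res.json({
        totalGb: Math.round((os.totalmem() / gib) * 10) / 10,
        freeGb: Math.round(freeGb * 10) / 10,
        usableGb: Math.round(freeGb * 0.75 * 10) / 10, // same 0.75 budget as detect()
        platform: os.platform(),
        gpuName: null,
        gpuVramGb: null,
        unifiedMemory: false,
        hardwareTier: "cpu_only",
        cpuModel: os.cpus()[0]?.model ?? null,
      });
    }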
  });

  return router;
}

Mounting in app.ts — add BEFORE app.use("/api", api):

// Unauthenticated probe — must come before the /api router (which requires actorMiddleware)
app.use("/api", hardwareRoutes());

Pattern 2: Hardware Detection Service

What: hardwareService uses Node.js os for RAM and systeminformation v5 for GPU. Apple Silicon detection via CPU brand string.

Apple Silicon identification: os.cpus()[0].model on M-series Macs returns "Apple M1", "Apple M4", etc. Check process.platform === "darwin" AND cpuModel.startsWith("Apple").

Unified memory handling: On Apple Silicon, os.totalmem() IS the unified memory (shared between CPU and GPU). Use os.freemem() * 0.75 as usable headroom; this is the same 0.75 multiplier that getRecommendedModel already applies, but taken from free rather than total memory. Label it as "unified memory" in the UI, never "VRAM".

// server/src/services/hardware.ts
import os from "node:os";
import si from "systeminformation";

export type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

export interface HardwareInfo {
  totalGb: number;
  freeGb: number;
  usableGb: number;        // freeGb * 0.75 — budget for model loading
  platform: NodeJS.Platform;
  gpuName: string | null;
  gpuVramGb: number | null;
  unifiedMemory: boolean;  // true on Apple Silicon
  hardwareTier: HardwareTier;
  cpuModel: string | null;
}

export function hardwareService() {
  let cache: HardwareInfo | null = null;
  let cacheExpiry = 0;
  const CACHE_TTL_MS = 5 * 60 * 1000;

  async function detect(): Promise<HardwareInfo> {
    if (cache && Date.now() < cacheExpiry) return cache;

    const totalBytes = os.totalmem();
    const freeBytes = os.freemem();
    const totalGb = totalBytes / (1024 ** 3);
    const freeGb = freeBytes / (1024 ** 3);
    const usableGb = freeGb * 0.75;
    const cpuModel = os.cpus()[0]?.model ?? null;

    const isAppleSilicon =
      process.platform === "darwin" &&
      (cpuModel?.startsWith("Apple") ?? false);

    let gpuName: string | null = null;
    let gpuVramGb: number | null = null;

    if (!isAppleSilicon) {
      try {
        const graphics = await si.graphics();
        const controller = graphics.controllers?.[0];
        if (controller) {
          gpuName = controller.model ?? null;
          // si.graphics returns vram in MB
          gpuVramGb = controller.vram ? controller.vram / 1024 : null;
        }
      } catch {
        // systeminformation not available or GPU detection failed — graceful
      }
    }

    let hardwareTier: HardwareTier;
    if (isAppleSilicon) {
      hardwareTier = "apple_silicon";
    } else if (gpuVramGb && gpuVramGb >= 4) {
      hardwareTier = "gpu";
    } else {
      hardwareTier = "cpu_only";
    }

    const result: HardwareInfo = {
      totalGb: Math.round(totalGb * 10) / 10,
      freeGb: Math.round(freeGb * 10) / 10,
      usableGb: Math.round(usableGb * 10) / 10,
      platform: process.platform,
      gpuName,
      gpuVramGb: gpuVramGb ? Math.round(gpuVramGb * 10) / 10 : null,
      unifiedMemory: isAppleSilicon,
      hardwareTier,
      cpuModel,
    };

    cache = result;
    cacheExpiry = Date.now() + CACHE_TTL_MS;
    return result;
  }

  return { detect };
}

Pattern 3: Nexus Settings Service (File-Backed Mode Persistence)

What: A new file-backed JSON service stores Nexus-specific settings (starting with nexus_mode) in {instanceRoot}/data/nexus-settings.json. This avoids modifying the .strict() Zod schema in @paperclipai/shared.

Why not instance_settings.general: The schema at packages/shared/src/validators/instance.ts uses .strict(). Adding a new key would require changes in @paperclipai/shared (upstream package) to both the Zod schema and the TypeScript interface. That creates rebase conflict surface. File-backed JSON is identical to the existing config.json and ollama-model-catalog.json patterns.

// server/src/services/nexus-settings.ts
import fs from "node:fs";
import path from "node:path";
import { z } from "zod";
import { resolvePaperclipInstanceRoot } from "../home-paths.js";

export const NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const;
export type NexusMode = (typeof NEXUS_MODES)[number];

const nexusSettingsSchema = z.object({
  mode: z.enum(NEXUS_MODES).default("both"),
});

export type NexusSettings = z.infer<typeof nexusSettingsSchema>;

function resolveNexusSettingsPath(): string {
  return path.resolve(resolvePaperclipInstanceRoot(), "data", "nexus-settings.json");
}

export function nexusSettingsService() {
  function load(): NexusSettings {
    const filePath = resolveNexusSettingsPath();
    try {
      const raw = JSON.parse(fs.readFileSync(filePath, "utf-8"));
      return nexusSettingsSchema.parse(raw);
    } catch {
      return nexusSettingsSchema.parse({});
    }
  }

  function save(settings: NexusSettings): void {
    const filePath = resolveNexusSettingsPath();
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    fs.writeFileSync(filePath, JSON.stringify(settings, null, 2), "utf-8");
  }

  return {
    get: () => load(),
    set: (patch: Partial<NexusSettings>) => {
      const current = load();
      const next = nexusSettingsSchema.parse({ ...current, ...patch });
      save(next);
      return next;
    },
  };
}
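One caveat with the save() above: it writes the file in place, and load() falls back to defaults on any parse error, so a write interrupted halfway could silently reset the mode to "both". A minimal sketch of one way to reduce that risk is to write to a temporary file and rename it over the target. The helper name writeJsonAtomic and the temp-file naming are assumptions for illustration, not part of the existing codebase; rename is atomic on POSIX filesystems when both paths are on the same volume, and behaviour on other platforms should be verified.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: write JSON to a sibling temp file, then rename it
// over the destination so readers never observe a half-written file.
function writeJsonAtomic(filePath: string, value: unknown): void {
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  const tmpPath = `${filePath}.${process.pid}.tmp`;
  fs.writeFileSync(tmpPath, JSON.stringify(value, null, 2), "utf-8");
  fs.renameSync(tmpPath, filePath);
}

// Demo against a throwaway directory (not the real instance root).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "nexus-settings-"));
const demoFile = path.join(demoDir, "data", "nexus-settings.json");
writeJsonAtomic(demoFile, { mode: "personal_ai" });
const roundTrip = JSON.parse(fs.readFileSync(demoFile, "utf-8")) as { mode: string };
```

If adopted, save() could delegate to such a helper without changing the service's public get/set surface.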

Pattern 4: Extended Model Catalog with Hardware Tier

What: The existing ollama-model-catalog.json needs: (a) PRD models added (Bonsai 1.7B, Qwen 3 8B), (b) a tier field per variant so the recommendation service can filter by hardwareTier.

Current catalog gap: The catalog has ramGb and vramGb but no tier field. The success criteria require the model recommendation to match an entry "for the detected hardware tier (GPU / Apple Silicon / CPU-only)". The catalog must express this.

Approach: Add an optional tier array to each variant: "tier": ["gpu", "apple_silicon", "cpu_only"] (if absent, variant is valid for all tiers). Also add the PRD models missing from current catalog: Bonsai 1.7B (hf.co/unsloth/Bonsai-1.7B-1M-GGUF or custom), Qwen 3 8B.

Note: Bonsai (1-bit quantization) is listed in the PRD but may not be in the official Ollama registry under that name. Use the closest available name or add as a catalog-only entry with a downloadUrl field for future use. For Phase 30, the catalog is extended for recommendation display even if the model isn't pullable yet.
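The tier rule described above (an absent tier array means the variant is valid everywhere) can be sketched as a small filter. This is an illustration under stated assumptions, not the existing getRecommendedModel() logic: the function name variantsForHardware is hypothetical, and comparing the budget against vramGb for the gpu tier and ramGb otherwise is an assumption about how the two catalog fields would be used.

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

interface CatalogVariant {
  name: string;
  ramGb: number;
  vramGb: number;
  quality: string;
  tier?: HardwareTier[]; // absent means "valid for every tier"
}

interface CatalogFamily {
  family: string;
  variants: CatalogVariant[];
}

// Hypothetical helper: keep variants allowed on `tier` that fit in `budgetGb`.
// Assumption: the gpu tier is budgeted against vramGb, other tiers against ramGb.
function variantsForHardware(
  families: CatalogFamily[],
  tier: HardwareTier,
  budgetGb: number,
): CatalogVariant[] {
  const result: CatalogVariant[] = [];
  for (const family of families) {
    for (const variant of family.variants) {
      const tierOk = variant.tier === undefined || variant.tier.indexOf(tier) !== -1;
      const needGb = tier === "gpu" ? variant.vramGb : variant.ramGb;
      if (tierOk && needGb <= budgetGb) result.push(variant);
    }
  }
  return result;
}

// Sample data taken from the qwen2 family in the extended catalog.
const sampleCatalog: CatalogFamily[] = [
  {
    family: "qwen2",
    variants: [
      { name: "qwen2.5-coder:7b", ramGb: 5, vramGb: 5, quality: "fast", tier: ["gpu", "apple_silicon", "cpu_only"] },
      { name: "qwen2.5-coder:14b", ramGb: 10, vramGb: 10, quality: "balanced", tier: ["gpu", "apple_silicon"] },
      { name: "qwen2.5-coder:32b", ramGb: 22, vramGb: 22, quality: "best", tier: ["gpu"] },
    ],
  },
];
```

With a 12 GB budget, this sketch would keep the 7b and 14b variants on apple_silicon but only the 7b variant on cpu_only, because the 14b entry does not list that tier.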

Pattern 5: Mode Selector UI Component

What: ModeSelector.tsx presents three cards (Personal AI Assistant, Project Builder, and Both, the default) using the existing shadcn/ui Card pattern. The selected mode is passed up to NexusOnboardingWizard and saved via a PATCH /api/nexus/settings call on wizard completion.

// ui/src/components/onboarding/ModeSelector.tsx
import { cn } from "@/lib/utils";

type NexusMode = "personal_ai" | "project_builder" | "both";

interface ModeSelectorProps {
  value: NexusMode;
  onChange: (mode: NexusMode) => void;
}

const MODES = [
  {
    id: "personal_ai" as NexusMode,
    label: "Personal AI Assistant",
    description: "Always available, persistent memory, private.",
  },
  {
    id: "project_builder" as NexusMode,
    label: "Project Builder",
    description: "Brainstorm → PM → Engineer → shipped product.",
  },
  {
    id: "both" as NexusMode,
    label: "Both (recommended)",
    description: "A conversation becomes a project with one click.",
  },
];

export function ModeSelector({ value, onChange }: ModeSelectorProps) {
  return (
    <div className="grid gap-3">
      {MODES.map((mode) => (
        <button
          key={mode.id}
          type="button"
          onClick={() => onChange(mode.id)}
          className={cn(
            "flex flex-col gap-1 rounded-lg border p-4 text-left transition-colors",
            value === mode.id
              ? "border-primary bg-primary/5"
              : "border-border hover:border-muted-foreground/50",
          )}
        >
          <span className="font-medium text-sm">{mode.label}</span>
          <span className="text-xs text-muted-foreground">{mode.description}</span>
        </button>
      ))}
    </div>
  );
}

Pattern 6: Hardware Summary Step

What: HardwareSummaryStep.tsx calls GET /api/system/providers, renders detected hardware, and shows the local AI privacy frame from the PRD.
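The typed fetch wrapper planned for ui/src/api/hardware.ts is not shown in this document. The following is a minimal sketch of what it might look like, assuming the response matches the HardwareInfo shape from Pattern 2. The function name fetchHardwareInfo, the FetchLike type, the injected fetchImpl parameter, and the choice to return null on any failure are all assumptions made for illustration; the 5-second default mirrors the ONBD-02 budget.

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

interface HardwareInfo {
  totalGb: number;
  freeGb: number;
  usableGb: number;
  platform: string;
  gpuName: string | null;
  gpuVramGb: number | null;
  unifiedMemory: boolean;
  hardwareTier: HardwareTier;
  cpuModel: string | null;
}

// Minimal fetch signature (hypothetical) so the sketch can be exercised
// with a stub; the browser's fetch satisfies it.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

// Returns the probe result, or null on HTTP error, network error, or timeout,
// so the wizard can fall back to safe defaults instead of throwing.
async function fetchHardwareInfo(
  fetchImpl: FetchLike,
  timeoutMs = 5000,
): Promise<HardwareInfo | null> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl("/api/system/providers", { signal: controller.signal });
    if (!res.ok) return null;
    // Assumption: the body is trusted to match HardwareInfo; no runtime validation here.
    return (await res.json()) as HardwareInfo;
  } catch {
    return null;
  } finally {
    clearTimeout(timer);
  }
}
```

A useHardwareInfo hook could then pass a function like this as the queryFn of useQuery, in line with the project's other API queries.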

ONBD-07 copy requirement (PRD verbatim, display when local AI is viable):

Local AI (recommended for privacy)
Runs entirely on your machine.
No accounts. No tracking. Works offline.

Display rules:

Apple Silicon: show unified memory GB, use "unified memory" label (not VRAM)
GPU: show GPU name + VRAM, label as "GPU VRAM"
CPU-only: show RAM, warn "slower than GPU-accelerated models", recommend cloud
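The display rules above can be expressed as a pure function that the component renders from, which keeps the labelling logic testable without a DOM. This is a sketch only: summarizeHardware and the HardwareSummaryView fields are hypothetical names, and the exact value formatting is an assumption.

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

interface HardwareSummaryInput {
  totalGb: number;
  gpuName: string | null;
  gpuVramGb: number | null;
  hardwareTier: HardwareTier;
}

// Hypothetical view model; field names are illustrative only.
interface HardwareSummaryView {
  label: string;
  value: string;
  warning: string | null;
  recommendCloud: boolean;
}

function summarizeHardware(info: HardwareSummaryInput): HardwareSummaryView {
  if (info.hardwareTier === "apple_silicon") {
    // Never label this as VRAM: the memory is shared between CPU and GPU.
    return { label: "Unified memory", value: `${info.totalGb} GB`, warning: null, recommendCloud: false };
  }
  if (info.hardwareTier === "gpu") {
    const name = info.gpuName ?? "Unknown GPU";
    const vram = info.gpuVramGb === null ? "unknown" : `${info.gpuVramGb} GB`;
    return { label: "GPU VRAM", value: `${name}, ${vram}`, warning: null, recommendCloud: false };
  }
  // CPU-only: show RAM, warn about speed, and steer toward cloud.
  return {
    label: "RAM",
    value: `${info.totalGb} GB`,
    warning: "Slower than GPU-accelerated models",
    recommendCloud: true,
  };
}
```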

Anti-Patterns to Avoid

Gating GET /system/providers on board auth: This creates the Pitfall 14 failure — fresh install gets 403, hardware probe silently fails, wizard shows wrong defaults.
Using os.totalmem() directly as "available for models": On Apple Silicon, the OS + apps consume 30–40% of unified memory. Always apply 0.75 multiplier to freemem() (not totalmem()).
Adding nexus key to instanceGeneralSettingsSchema: The schema uses .strict() — any extra key throws a Zod validation error. Use the file-backed service instead.
Reporting Apple Silicon VRAM as a separate number from RAM: Apple M-series chips have unified memory. Do not report gpuVramGb for Apple Silicon — set it to null, set unifiedMemory: true, and use totalGb/usableGb for recommendations.
Using systeminformation v6: STATE.md explicitly locks to v5. v6 has breaking changes.
Including Bonsai models in the Ollama recommendation if they are not in the Ollama registry: The catalog can list them for display, but the recommendation engine should only mark a model recommended: true if it can actually be pulled via Ollama.

Don't Hand-Roll

Problem	Don't Build	Use Instead	Why
GPU name + VRAM detection	Custom `nvidia-smi`/`lspci` output parsing or WMI queries	`systeminformation@5` `si.graphics()`	Cross-platform; handles NVIDIA/AMD/Intel and already accounts for platform differences
RAM detection	Any third-party RAM library	`os.totalmem()` + `os.freemem()`	Built-in, zero deps, accurate
Mode persistence as new DB table	Drizzle migration + new table	`nexusSettingsService` file-backed JSON	No DB schema changes constraint; file pattern already established
Model recommendation filtering	Custom tier logic	Extend existing `getRecommendedModel()`	Logic already correct; add tier filter as one additional condition
Onboarding step components	Monolithic wizard with inline UI	Sub-components in `ui/src/components/onboarding/`	ARCHITECTURE.md established this pattern; Phase 32 adds more steps

Key insight: The hardware probe and model catalog are the only genuinely new functionality. Mode persistence is a simple file write. Most of the work is wiring existing pieces together correctly and avoiding the auth pitfall.

Common Pitfalls

Pitfall 1: Hardware Probe Blocked by Board Auth (Pitfall 14 from PITFALLS.md)

What goes wrong: Fresh install, no board auth token yet. GET /api/system/providers returns 403. The wizard falls back to the cpu_only tier for model recommendation. A Mac Mini M4 user is told to use cloud because GPU/unified memory was not detected.

Why it happens: All routes under /api in app.ts are mounted behind actorMiddleware. In authenticated deploymentMode, req.actor.type === "none" for unauthenticated requests.

How to avoid: Mount the hardware route with app.use("/api", hardwareRoutes()) before app.use("/api", api) in app.ts. In the route handler, do NOT call assertBoard or check req.actor. The route returns read-only machine information only.

Warning signs: Browser network tab shows 403 on /api/system/providers during onboarding.

Pitfall 2: Apple Silicon Reported as "0 GB VRAM"

What goes wrong: systeminformation on macOS with Apple Silicon may return vram: 0 for the GPU controller because there is no discrete VRAM chip; the GPU uses system RAM. The UI then shows "0 GB VRAM", or the model recommendation uses the wrong memory figure.

Why it happens: si.graphics() returns vram: 0 for the Apple Silicon integrated GPU. This is technically correct but misleading for model recommendations.

How to avoid: When isAppleSilicon is true, do not call si.graphics() at all. Set gpuVramGb: null and unifiedMemory: true. The recommendation engine uses usableGb (from freemem() * 0.75) instead of gpuVramGb.

Warning signs: UI shows "GPU VRAM: 0 GB" on an M4 Mac Mini.

Pitfall 3: `.strict()` Schema Blocks Nexus Mode Persistence via instance_settings

What goes wrong: An attempt to store mode in instance_settings.general.nexus fails with a Zod validation error because the schema is z.object({ censorUsernameInLogs: z.boolean() }).strict(). Any key not in the schema is rejected.

Why it happens: The shared package uses .strict() on both the general and experimental settings schemas to prevent accumulation of unknown keys in the DB.

How to avoid: Use nexusSettingsService (file-backed JSON at {instanceRoot}/data/nexus-settings.json). Add GET /api/nexus/settings and PATCH /api/nexus/settings routes. These ARE board-auth-gated (mode setting happens after the user is set up).

Warning signs: The server logs a Zod error when the wizard tries to save mode, or updateGeneral() silently discards the nexus key.

Pitfall 4: Model Catalog Recommends Bonsai but Ollama Cannot Pull It

What goes wrong: The PRD lists "Bonsai 1.7B (1-bit)" as a model. If it is added to the catalog with name: "bonsai:1.7b" and Ollama has no such model, getRecommendedModel() never finds a match (it only marks models the user already has installed as recommended). But if the catalog is used to generate a "we suggest pulling this" recommendation before pull, a non-pullable name breaks the Ollama pull command.

Why it happens: The PRD model list mixes "models in Ollama registry" with "models we wish were in Ollama registry". Bonsai 1-bit quantization may only be available via a Hugging Face GGUF, not via ollama pull.

How to avoid: For Phase 30, add Bonsai as a catalog entry with a source: "huggingface" flag (or just omit it from the recommendation engine). The catalog is displayed to users, but getRecommendedModel() only recommends models the user has already pulled. Phase 30 does not need to pull models; it only displays what the hardware can run.

Warning signs: ollama pull bonsai:1.7b returns 404; recommendation shows models with pull errors.

Pitfall 5: 5-Second Timeout Not Met Due to `si.graphics()` on Linux

What goes wrong: The success criterion requires the probe to return within 5 seconds. On Linux, si.graphics() may shell out to lspci or nvidia-smi. If those commands are not installed or produce slow output, the probe hangs.

Why it happens: systeminformation uses platform-specific shell commands as a fallback on Linux for GPU detection. Slow GPU drivers or a missing lspci cause timeouts.

How to avoid: Wrap si.graphics() in a Promise.race() with a 3-second timeout. If it times out, return gpuName: null, gpuVramGb: null, hardwareTier: "cpu_only" and continue. The 5-second budget for the overall probe response is achievable even with a 3-second GPU probe.

Warning signs: /api/system/providers takes 6–10 seconds on Linux; hardwareTier always shows cpu_only even when a GPU is present.
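The Promise.race() guard recommended for this pitfall does not appear in the Pattern 2 code, so here is a minimal sketch of it. The helper name withTimeout is an assumption for illustration. Note that Promise.race() only stops waiting: it does not cancel the underlying si.graphics() call, which may keep running in the background until it settles on its own.

```typescript
// Hypothetical helper: resolve with `fallback` if `work` has not settled
// within `ms` milliseconds. Rejections from `work` are passed through.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<T>((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  return Promise.race([work, timeout]).then(
    (value) => {
      clearTimeout(timer); // avoid keeping the event loop alive
      return value;
    },
    (error) => {
      clearTimeout(timer);
      throw error;
    },
  );
}

// Hypothetical usage inside detect(), assuming `si` is systeminformation:
//   const graphics = await withTimeout(si.graphics(), 3000, null);
//   const controller = graphics?.controllers?.[0];
```

Because detect() caches its result for five minutes, a timed-out probe would also be cached as cpu_only for that window; whether to skip caching in that case is a design decision this sketch leaves open.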

Pitfall 6: NexusOnboardingWizard Drift from Upstream OnboardingWizard

What goes wrong: Phase 30 extends NexusOnboardingWizard.tsx with new steps. Upstream adds new props or context dependencies to OnboardingWizard.tsx. After the next upstream rebase, NexusOnboardingWizard.tsx silently misses those changes.

Why it happens: The Vite alias src/components/OnboardingWizard → NexusOnboardingWizard fully replaces the upstream component. Any upstream improvement is silently discarded.

How to avoid: Phase 30 modifications to NexusOnboardingWizard.tsx must maintain the same export signature as OnboardingWizard.tsx. After each upstream rebase, diff OnboardingWizard.tsx for new hook usage.

Warning signs: pnpm dev fails with "cannot find module" after rebase; wizard missing features added to upstream.

Code Examples

Verified patterns from existing codebase:

Existing RAM + Recommendation Pattern (confirm before extending)

// server/src/services/ollama.ts (existing, confirmed)
export function getRecommendedModel(models: OllamaModel[], systemRamBytes: number): OllamaModel[] {
  const usableRamGb = (systemRamBytes / (1024 * 1024 * 1024)) * 0.75;
  // ... catalog-based matching ...
}
// Called in ollamaRoutes.ts:
const enrichedModels = getRecommendedModel(models, os.totalmem());
// NOTE: Phase 30 updates this to use os.freemem() for Apple Silicon path

Mounting Unauthenticated Routes Before the Protected api Router

// server/src/app.ts (MODIFIED pattern — add before app.use("/api", api))
// Source: existing health route pattern (health is also accessible without deep auth)
app.use("/api", hardwareRoutes());  // unauthenticated — must come first
app.use("/api", api);               // authenticated api router

File-Backed JSON Service Pattern

// Source: config-file.ts + ollama.ts catalog load pattern (confirmed in codebase)
import fs from "node:fs";
import path from "node:path";
import { resolvePaperclipInstanceRoot } from "../home-paths.js";

function resolveNexusSettingsPath(): string {
  return path.resolve(resolvePaperclipInstanceRoot(), "data", "nexus-settings.json");
}

systeminformation v5 Graphics Call

// Source: systeminformation v5 npm docs (verified: npm view systeminformation version → 5.31.5)
import si from "systeminformation";

const graphics = await si.graphics();
// graphics.controllers[0].model  → GPU name string
// graphics.controllers[0].vram   → VRAM in MB (integer)
// graphics.controllers is an empty array if no GPU is detected
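Pattern 2 reads controllers[0]. On machines that report more than one controller (for example an integrated GPU alongside a discrete one), the first entry is not guaranteed to be the one with the most VRAM. As a hedged alternative, the controller with the largest reported vram could be chosen. The GpuController shape below is a local, minimal stand-in for the two fields this document relies on, and pickPrimaryController is a hypothetical name.

```typescript
// Minimal local shape (assumption): only the fields used in this document.
interface GpuController {
  model?: string;
  vram?: number | null; // MB, as reported by si.graphics()
}

// Hypothetical helper: choose the controller with the largest reported VRAM.
// Returns null when the list is empty. Ties keep the earlier entry.
function pickPrimaryController(controllers: GpuController[]): GpuController | null {
  let best: GpuController | null = null;
  for (const candidate of controllers) {
    if (best === null || (candidate.vram ?? 0) > (best.vram ?? 0)) {
      best = candidate;
    }
  }
  return best;
}
```

Whether this is preferable to index 0 depends on how the target machines report their controllers, which should be checked on real hardware before changing the tier logic.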

NexusMode Constants (shared between server + UI)

// server/src/services/nexus-settings.ts
export const NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const;
export type NexusMode = (typeof NEXUS_MODES)[number];

// UI: ui/src/api/hardware.ts
export type NexusMode = "personal_ai" | "project_builder" | "both";
// (duplicated in UI since @paperclipai/shared is upstream-owned)

Extended Model Catalog JSON

// server/src/data/ollama-model-catalog.json (MODIFIED — add tier + PRD models)
{
  "models": [
    {
      "family": "qwen2",
      "variants": [
        { "name": "qwen2.5-coder:7b",  "ramGb": 5,  "vramGb": 5,  "quality": "fast",      "tier": ["gpu", "apple_silicon", "cpu_only"] },
        { "name": "qwen2.5-coder:14b", "ramGb": 10, "vramGb": 10, "quality": "balanced",  "tier": ["gpu", "apple_silicon"] },
        { "name": "qwen2.5-coder:32b", "ramGb": 22, "vramGb": 22, "quality": "best",      "tier": ["gpu"] }
      ]
    },
    {
      "family": "qwen3",
      "variants": [
        { "name": "qwen3:8b",           "ramGb": 5,  "vramGb": 5,  "quality": "balanced",  "tier": ["gpu", "apple_silicon", "cpu_only"] }
      ]
    },
    {
      "family": "llama",
      "variants": [
        { "name": "llama3.2:3b",  "ramGb": 3,  "vramGb": 3,  "quality": "fast",      "tier": ["gpu", "apple_silicon", "cpu_only"] },
        { "name": "llama3.1:8b",  "ramGb": 6,  "vramGb": 6,  "quality": "balanced",  "tier": ["gpu", "apple_silicon", "cpu_only"] },
        { "name": "llama3.1:70b", "ramGb": 48, "vramGb": 48, "quality": "best",      "tier": ["gpu"] }
      ]
    }
  ]
}

State of the Art

Old Approach	Current Approach	When Changed	Impact
Ollama routes require companyId (no pre-auth probe)	New `GET /api/system/providers` requires no auth	Phase 30 (this phase)	Enables pre-auth hardware detection
`getRecommendedModel` uses `totalmem()` only	Use `freemem() * 0.75` for Apple Silicon, `totalmem() * 0.75` for GPU/CPU	Phase 30	More accurate for loaded systems
Single-step `NexusOnboardingWizard`	Multi-step with `ModeSelector` + `HardwareSummaryStep`	Phase 30	Foundation for Phase 32 full wizard
Model catalog: no tier field	Catalog has `tier` array per variant	Phase 30	Enables tier-filtered recommendations

Deprecated/outdated:

getRecommendedModel() calling os.totalmem() directly — Phase 30 changes the call site to pass os.freemem() for Apple Silicon path; existing behavior preserved for non-Apple-Silicon.
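Since getRecommendedModel() applies the 0.75 multiplier itself, the call-site change amounts to choosing which byte count to pass in. A small worked sketch, with the hypothetical helper names ramBytesForRecommendation and usableGbFromBytes, shows the effect on a machine with 16 GiB total and 8 GiB free:

```typescript
type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

const GIB = 1024 * 1024 * 1024;

// Hypothetical helper: Apple Silicon budgets from free memory,
// every other tier keeps the existing total-memory behaviour.
function ramBytesForRecommendation(
  tier: HardwareTier,
  totalBytes: number,
  freeBytes: number,
): number {
  return tier === "apple_silicon" ? freeBytes : totalBytes;
}

// Mirrors the multiplier that getRecommendedModel() applies internally.
function usableGbFromBytes(bytes: number): number {
  return (bytes / GIB) * 0.75;
}

const total = 16 * GIB;
const free = 8 * GIB;
const appleBudgetGb = usableGbFromBytes(ramBytesForRecommendation("apple_silicon", total, free)); // 8 * 0.75 = 6
const otherBudgetGb = usableGbFromBytes(ramBytesForRecommendation("cpu_only", total, free)); // 16 * 0.75 = 12
```

On this example machine the Apple Silicon budget is half the non-Apple budget, which is the intended, more conservative outcome on a loaded system.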

Open Questions

1. Is qwen3:8b available in Ollama as of April 2026?
   - What we know: Qwen 3 is listed in the PRD. Qwen 2.5 is in the current catalog. The Ollama registry is a moving target.
   - What's unclear: Whether the exact model tag is qwen3:8b or something else.
   - Recommendation: Add qwen3:8b to the catalog with a note that the tag should be verified against the Ollama registry at ship time. The recommendation engine only marks models the user has pulled as recommended, so a wrong tag just means the model won't be auto-recommended until the user pulls it.
2. Should the Nexus settings route (PATCH /api/nexus/settings) be board-auth-gated?
   - What we know: Mode selection happens during onboarding. In local_trusted mode, board auth is always present. In authenticated mode, the user has logged in by the time they see the wizard.
   - Recommendation: Yes, gate on board auth. The hardware probe is unauthenticated; mode persistence is not. The wizard saves mode on the final wizard-complete action, not on mode card click.
3. Does the mode selector need to appear in settings post-onboarding?
   - What we know: ROADMAP success criteria say the mode is "persisted" and "assistant-specific UI is hidden when Project Builder-only is chosen."
   - What's unclear: Whether Phase 30 needs a settings page entry point or just onboarding.
   - Recommendation: Phase 30 delivers mode selection in the onboarding wizard only. A settings page entry point is deferred to Phase 33 (which introduces PersonalAssistantPage and mode-gated UI).

Environment Availability

Dependency	Required By	Available	Version	Fallback
Node.js `os`	RAM/CPU detection	✓	built-in	—
`systeminformation`	GPU name + VRAM	✗ (not installed)	5.31.5 (latest v5)	Omit GPU name, return `null`, tier defaults to `cpu_only`
`system_profiler` (macOS only)	Apple Silicon GPU model	✓ on macOS, ✗ on Linux	macOS built-in	Use CPU brand string alone
React	UI components	✓	project version	—
Zod	Settings schema	✓	project version	—
shadcn/ui `Card`, `Button`	ModeSelector UI	✓	project version	—

Missing dependencies with no fallback:

None that block execution. systeminformation absence degrades gracefully to cpu_only tier.

Missing dependencies with fallback:

systeminformation: probe route gracefully omits GPU data if detection fails; hardware tier becomes cpu_only; model recommendation still works using RAM budget.

Validation Architecture

Test Framework

Property	Value
Framework	Vitest
Config file	`server/vitest.config.ts`
Quick run command	`pnpm --filter server test --run`
Full suite command	`pnpm --filter server test --run && pnpm --filter ui test --run`

Phase Requirements → Test Map

Req ID	Behavior	Test Type	Automated Command	File Exists?
ONBD-02	`hardwareService.detect()` returns `unifiedMemory: true` when CPU brand is "Apple M4"	unit	`pnpm --filter server test --run -- 30-hardware-detection`	❌ Wave 0
ONBD-02	`hardwareService.detect()` returns `hardwareTier: "cpu_only"` when no GPU detected	unit	`pnpm --filter server test --run -- 30-hardware-detection`	❌ Wave 0
ONBD-02	`GET /api/system/providers` returns 200 without board auth (unauthenticated request)	unit	`pnpm --filter server test --run -- 30-hardware-detection`	❌ Wave 0
ONBD-02	Probe returns within 5 seconds even when `si.graphics()` is unavailable	unit	`pnpm --filter server test --run -- 30-hardware-detection`	❌ Wave 0
ONBD-03	Extended catalog contains `qwen3:8b` and `tier` field	unit	`pnpm --filter server test --run -- 30-hardware-detection`	❌ Wave 0
ONBD-03	`getRecommendedModel()` with `gpu` tier only recommends GPU-tier models	unit	`pnpm --filter server test --run -- 30-hardware-detection`	❌ Wave 0
ONBD-01	`nexusSettingsService.set({ mode: "personal_ai" })` persists and is readable	unit	`pnpm --filter server test --run -- 30-hardware-detection`	❌ Wave 0
ONBD-07	`HardwareSummaryStep` renders privacy copy when tier is not `cpu_only`	unit (React Testing Library or Vitest)	`pnpm --filter ui test --run -- HardwareSummaryStep`	❌ Wave 0

Sampling Rate

Per task commit: pnpm --filter server test --run
Per wave merge: pnpm --filter server test --run && pnpm --filter ui test --run
Phase gate: Full suite green before /gsd:verify-work

Wave 0 Gaps

server/src/__tests__/30-hardware-detection.test.ts — covers ONBD-01, ONBD-02, ONBD-03 server-side
ui/src/components/onboarding/HardwareSummaryStep.test.tsx — covers ONBD-07 copy render

Sources

Primary (HIGH confidence)

/opt/nexus/server/src/services/ollama.ts — existing getRecommendedModel(), 0.75 multiplier, os.totalmem() usage (confirmed)
/opt/nexus/server/src/routes/ollama.ts — existing company-scoped ollama routes; confirmed no unauthenticated pattern
/opt/nexus/server/src/middleware/auth.ts — actorMiddleware behavior in local_trusted vs authenticated mode (confirmed)
/opt/nexus/server/src/app.ts — route mounting order; confirmed /api router structure (confirmed)
/opt/nexus/server/src/services/instance-settings.ts — updateGeneral() uses .strict() schema; adding new keys would fail (confirmed)
/opt/nexus/packages/shared/src/validators/instance.ts — .strict() confirmed on line 5
/opt/nexus/server/src/home-paths.ts — resolvePaperclipInstanceRoot() for file-backed JSON path (confirmed)
/opt/nexus/server/src/data/ollama-model-catalog.json — current catalog structure (confirmed; no tier field, no Bonsai/Qwen3)
/opt/nexus/ui/src/components/NexusOnboardingWizard.tsx — current single-step wizard; mode selector is absent (confirmed)
/opt/nexus/.planning/STATE.md — locked decisions: systeminformation v5, freemem() * 0.75, GET /system/providers unauthenticated
/opt/nexus/.planning/research/ARCHITECTURE.md — component map, hardwareService design, nexus namespace in instance settings (confirmed architecture intent)
/opt/nexus/.planning/research/PITFALLS.md — Pitfall 13 (Apple Silicon VRAM), Pitfall 14 (probe auth level)
npm view systeminformation version → 5.31.5 (confirmed current latest v5)

Secondary (MEDIUM confidence)

/home/mikkel/upload/nexus-v1.5-prd-onboarding-assistant.md — PRD model list (Bonsai, Qwen 3, tier scenarios), ONBD-07 copy requirement
systeminformation v5 npm documentation — si.graphics() returns controllers[].vram in MB

Metadata

Confidence breakdown:

Standard stack: HIGH — os built-in confirmed; systeminformation version confirmed via npm; not yet installed (needs pnpm add)
Architecture: HIGH — all integration points confirmed via direct codebase reading; .strict() schema trap confirmed
Pitfalls: HIGH — all identified from direct code reading and confirmed PITFALLS.md analysis

Research date: 2026-04-02
Valid until: 2026-05-02 (stable domain; systeminformation API stable in v5)
