diff --git a/.planning/research/ARCHITECTURE.md b/.planning/research/ARCHITECTURE.md index ec8cdf13..1fe48c50 100644 --- a/.planning/research/ARCHITECTURE.md +++ b/.planning/research/ARCHITECTURE.md @@ -1,333 +1,544 @@ -# Architecture Patterns: Display-Layer Fork Isolation +# Architecture Research -**Domain:** TypeScript monorepo fork (Paperclip → Nexus) -**Researched:** 2026-03-30 -**Confidence:** HIGH — based on direct codebase inspection + verified patterns +**Domain:** Smart Onboarding + Personal AI Assistant (v1.5) — integration with existing Nexus/Paperclip monorepo +**Researched:** 2026-04-02 +**Confidence:** HIGH — based on direct codebase inspection + verified current documentation --- -## Recommended Architecture +## System Overview -The core constraint is: **every file Nexus touches is a potential rebase conflict site.** -The architecture goal is therefore to minimize the number of upstream files modified by concentrating all fork-specific content into new files that upstream will never create. - -### Isolation Strategy: Minimal-Touch with Fork Overlay +The v1.5 features layer on top of the existing monorepo without touching DB schema, API routes, or TypeScript identifiers. Every new service, component, and data flow hooks into the existing extension points: the adapter registry, the secrets service, the instance settings JSONB columns, the chat SSE pipeline, and the onboarding wizard overlay. 
``` -Upstream files Fork overlay files -───────────── ────────────────── -constants.ts → [keep AGENT_ROLE_LABELS.ceo = "CEO", change display via wrapper] -CompanyRail.tsx → (modify inline — unavoidable, low risk) -OnboardingWizard → nexus/OnboardingWizard.nexus.tsx (new file, rewire import) -onboard.ts (CLI) → modify inline strings only (no logic change) -SOUL.md / AGENTS → replace file content (same path, different content) +┌──────────────────────────────────────────────────────────────────────────┐ +│ UI Layer (React/Vite) │ +│ │ +│ ┌─────────────────────────────────────────┐ ┌────────────────────────┐ │ +│ │ NexusOnboardingWizard (MODIFIED) │ │ PersonalAssistantPage │ │ +│ │ ┌──────────────┐ ┌──────────────────┐ │ │ (NEW — lazy loaded) │ │ +│ │ │ ModeSelector │ │ HardwareSummary │ │ │ ┌──────────────────┐ │ │ +│ │ │ (NEW) │ │ (NEW) │ │ │ │ AssistantChatHub │ │ │ +│ │ └──────────────┘ └──────────────────┘ │ │ │ (MODIFIED │ │ │ +│ │ ┌──────────────┐ ┌──────────────────┐ │ │ │ ChatPanel) │ │ │ +│ │ │ProviderSetup │ │ VoiceSetupStep │ │ │ └──────────────────┘ │ │ +│ │ │ (NEW) │ │ (NEW) │ │ └────────────────────────┘ │ +│ │ └──────────────┘ └──────────────────┘ │ │ +│ └─────────────────────────────────────────┘ │ +│ │ +│ ┌───────────────────────────────────────────────────────────────────┐ │ +│ │ Existing Extension Points │ │ +│ │ ChatPanel • ChatInput • useStreamingChat • ChatAgentSelector │ │ +│ └───────────────────────────────────────────────────────────────────┘ │ +└──────────────────────────────────────────────────────────────────────────┘ + ↕ REST + SSE +┌──────────────────────────────────────────────────────────────────────────┐ +│ Server Layer (Express) │ +│ │ +│ NEW routes mounted in app.ts: │ +│ ┌────────────────────┐ ┌───────────────────┐ ┌──────────────────────┐ │ +│ │ /api/hardware │ │ /api/puter-proxy │ │ /api/voice │ │ +│ │ (hardware detect) │ │ (Puter.js relay) │ │ (Whisper + Piper) │ │ +│ └────────────────────┘ └───────────────────┘ 
└──────────────────────┘ │ +│ ┌────────────────────┐ ┌───────────────────┐ │ +│ │ /api/memory │ │ Existing routes: │ │ +│ │ (assistant memory) │ │ /ollama • /chat │ │ +│ └────────────────────┘ │ /secrets • /llms │ │ +│ └───────────────────┘ │ +│ │ +│ NEW services (named-export pattern, no classes): │ +│ hardwareService • puterProxyService • voiceService • memoryService │ +└──────────────────────────────────────────────────────────────────────────┘ + ↕ Drizzle ORM +┌──────────────────────────────────────────────────────────────────────────┐ +│ Data Layer (PostgreSQL) │ +│ │ +│ NO new tables — all v1.5 state lives in existing extension columns: │ +│ ┌────────────────────────────────────────────────────────────────────┐ │ +│ │ instance_settings.general JSONB (onboarding mode, voice config) │ │ +│ │ company_secrets table (OAuth tokens, Puter token) │ │ +│ │ chat_conversations table (no change — re-used as-is) │ │ +│ └────────────────────────────────────────────────────────────────────┘ │ +│ │ +│ NEW file-based storage (server data dir, no migration needed): │ +│ ┌─────────────────────────────────────────────────────────────────────┐ │ +│ │ data/memory/.json (assistant memory store) │ │ +│ │ data/whisper-models/ (downloaded .bin files) │ │ +│ │ data/piper-voices/ (downloaded .onnx voice files) │ │ +│ └─────────────────────────────────────────────────────────────────────┘ │ +└──────────────────────────────────────────────────────────────────────────┘ + ↕ npx +┌──────────────────────────────────────────────────────────────────────────┐ +│ CLI Layer (Commander.js) │ +│ │ +│ NEW standalone package: packages/buildthis/ │ +│ ┌──────────────────────────────────────────────────────────────────────┐│ +│ │ npx buildthis → detects if Nexus running → opens browser ││ +│ │ OR runs nexus onboard wizard → starts server ││ +│ └──────────────────────────────────────────────────────────────────────┘│ +└──────────────────────────────────────────────────────────────────────────┘ ``` -**Two 
categories of change, each with a different isolation strategy:** +--- -| Category | Strategy | Conflict Risk | -|----------|----------|--------------| -| New files added by Nexus | Add-only (upstream never touches these) | Zero | -| Upstream files with string changes | Inline edit, minimal diff | Low — strings rarely conflict | -| Upstream files requiring logic changes | Wrapper/replacement file, rewire import | Medium — requires Vite alias or import swap | +## Component Responsibilities + +| Component | Responsibility | New or Modified | Where | +|-----------|----------------|-----------------|-------| +| `NexusOnboardingWizard` | Multi-step onboarding: mode, hardware, provider, voice, summary | MODIFIED (replace single-step) | `ui/src/components/NexusOnboardingWizard.tsx` | +| `ModeSelector` | Card picker: Personal AI / Project Builder / Both | NEW | `ui/src/components/onboarding/ModeSelector.tsx` | +| `HardwareSummaryStep` | Displays detected GPU/RAM/Unified Memory result | NEW | `ui/src/components/onboarding/HardwareSummaryStep.tsx` | +| `ProviderTierStep` | Puter.js auth button, OAuth tier, API key entry | NEW | `ui/src/components/onboarding/ProviderTierStep.tsx` | +| `VoiceSetupStep` | Whisper model picker + Piper voice picker | NEW | `ui/src/components/onboarding/VoiceSetupStep.tsx` | +| `OnboardingSummaryStep` | Final summary before launch | NEW | `ui/src/components/onboarding/OnboardingSummaryStep.tsx` | +| `PersonalAssistantPage` | Full-screen chat experience for assistant mode | NEW | `ui/src/pages/PersonalAssistant.tsx` | +| `AssistantMemoryBar` | Shows memory slots / recall indicator in chat | NEW | `ui/src/components/AssistantMemoryBar.tsx` | +| `hardwareService` | Reads `os.totalmem()`, runs `system_profiler` on macOS for GPU info | NEW | `server/src/services/hardware.ts` | +| `puterProxyService` | Wraps Puter.js Node.js client; relays AI calls through SSE | NEW | `server/src/services/puter-proxy.ts` | +| `voiceService` | Manages Whisper (via 
`whisper-node`) + Piper (via `@mintplex-labs/piper-tts-web` server-side) | NEW | `server/src/services/voice.ts` | +| `memoryService` | CRUD on file-based JSON memory store; injects context into system prompt | NEW | `server/src/services/memory.ts` | +| `hardwareRoutes` | `GET /api/hardware/info` | NEW | `server/src/routes/hardware.ts` | +| `puterProxyRoutes` | `POST /api/puter-proxy/chat` (SSE), `POST /api/puter-proxy/auth` | NEW | `server/src/routes/puter-proxy.ts` | +| `voiceRoutes` | `POST /api/voice/transcribe`, `POST /api/voice/speak`, `GET /api/voice/status` | NEW | `server/src/routes/voice.ts` | +| `memoryRoutes` | `GET/POST/DELETE /api/companies/:id/memory` | NEW | `server/src/routes/memory.ts` | +| `buildthis` package | `npx buildthis` entry point — detect/launch Nexus | NEW | `packages/buildthis/` | --- -## Component Boundaries +## Recommended Project Structure -| Component | Responsibility | Fork Change Type | -|-----------|---------------|-----------------| -| `ui/src/lib/nexus-labels.ts` | Central display-string registry (NEW file) | New file — zero conflict risk | -| `ui/src/components/OnboardingWizard.tsx` | Multi-step first-run UX | Inline rewrite — file is owned by Nexus entirely | -| `packages/shared/src/constants.ts` | `AGENT_ROLE_LABELS` map | Inline string change only — change `ceo: "CEO"` to `ceo: "Project Manager"` | -| `ui/src/pages/Companies.tsx` | "New Company" button, "Companies" breadcrumb | Inline string change — 2-3 occurrences | -| `cli/src/commands/onboard.ts` | Terminal output strings | Inline string change — no logic change | -| `server/src/onboarding-assets/ceo/` | PM agent template content | File content replacement — same paths | -| `server/src/home-paths.ts` | `.paperclip` → `.nexus` home dir | Inline constant change — single string | -| `ui/src/components/CompanyRail.tsx` | Sidebar rail icon (`Paperclip` lucide icon) | Single import swap | +``` +packages/ +├── buildthis/ # NEW — npx buildthis entry point +│ ├── src/ +│ │ └── 
index.ts # bin entry: detect running Nexus, open browser or run onboard +│ └── package.json # name: "buildthis", bin: { buildthis: "./dist/index.js" } + +server/src/ +├── services/ +│ ├── hardware.ts # NEW — detect GPU/RAM/Apple Silicon +│ ├── puter-proxy.ts # NEW — Puter.js Node.js client wrapper +│ ├── voice.ts # NEW — Whisper + Piper lifecycle +│ └── memory.ts # NEW — file-based JSON assistant memory +├── routes/ +│ ├── hardware.ts # NEW — GET /api/hardware/info +│ ├── puter-proxy.ts # NEW — POST /api/puter-proxy/chat (SSE) +│ ├── voice.ts # NEW — POST /api/voice/transcribe, /speak +│ └── memory.ts # NEW — GET/POST/DELETE /companies/:id/memory +└── app.ts # MODIFIED — mount 4 new route sets + +ui/src/ +├── components/ +│ ├── NexusOnboardingWizard.tsx # MODIFIED — multi-step replaces single-step +│ ├── AssistantMemoryBar.tsx # NEW +│ └── onboarding/ # NEW directory — onboarding step components +│ ├── ModeSelector.tsx +│ ├── HardwareSummaryStep.tsx +│ ├── ProviderTierStep.tsx +│ ├── VoiceSetupStep.tsx +│ └── OnboardingSummaryStep.tsx +├── pages/ +│ └── PersonalAssistant.tsx # NEW — full-screen assistant page +├── hooks/ +│ ├── useHardwareInfo.ts # NEW — query /api/hardware/info +│ ├── usePuterChat.ts # NEW — SSE streaming from puter-proxy +│ ├── useVoiceInput.ts # NEW — Whisper transcription hook +│ ├── useVoiceSpeech.ts # NEW — Piper TTS hook +│ └── useAssistantMemory.ts # NEW — memory CRUD hook +└── api/ + ├── hardware.ts # NEW — typed fetch wrappers + ├── puter-proxy.ts # NEW + ├── voice.ts # NEW + └── memory.ts # NEW +``` + +### Structure Rationale + +- **`packages/buildthis/`:** Standalone package with its own `package.json` and `bin` field — publishable to npm as `buildthis` independently. Does not depend on the monorepo server package at runtime; it only detects a running Nexus instance via HTTP or launches the CLI onboard flow. 
+- **`server/src/services/` additions:** All follow the existing named-export pattern (`export function hardwareService() { return { ... } }`). No classes. Dependencies injected as parameters. Drizzle `db` is only accepted if the service actually queries the DB. +- **`ui/src/components/onboarding/`:** Sub-directory isolates the 5 new step components from the main components directory. `NexusOnboardingWizard.tsx` imports them. This limits the upstream-conflict surface to the single wizard file. +- **`ui/src/pages/PersonalAssistant.tsx`:** New route registered in App.tsx routing (the only modification needed in the routing layer). The page re-uses `ChatPanel` with an `assistantMode` prop. --- -## Isolation Pattern 1: Central Label Registry (New File) +## Architectural Patterns -Create `ui/src/lib/nexus-labels.ts` as a new file. This file is pure Nexus — upstream will never create it, so it can never conflict. +### Pattern 1: Hardware Detection via Server-Side Shell Probe + +**What:** `hardwareService` runs on the Express server where it has access to `os.totalmem()` and can shell out to `system_profiler SPDisplaysDataType` on macOS to get GPU details. Apple Silicon unified memory is detected by checking the `cpu_brand_string` for "Apple M". Results are cached in memory (5-minute TTL) so the onboarding wizard can poll cheaply. + +**When to use:** Any time the onboarding wizard needs to display hardware capabilities to make model recommendations. + +**Trade-offs:** Server-side only — the UI cannot do this itself in the browser. The route is scoped to `assertBoard` (existing auth middleware), so it's protected. Apple Silicon reports unified memory as both RAM and VRAM; the service returns `{ unifiedMemory: true, totalBytes }` instead of separate fields. 
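The unified-memory rule described in the Trade-offs above can be sketched as a small pure function. This is a minimal sketch under stated assumptions, not the service's actual code: `MemoryProfile`, `isAppleSilicon`, and `describeMemory` are hypothetical names, and the brand string is read from Node's `os.cpus()[0].model` here rather than from `system_profiler`.

```typescript
import * as os from "node:os";

// Hypothetical result shape: a unified-memory machine has a single pool,
// so no separate VRAM figure is reported for it.
type MemoryProfile =
  | { unifiedMemory: true; totalBytes: number }
  | { unifiedMemory: false; totalBytes: number; vramBytes: number | null };

// Assumption: Apple Silicon brand strings begin with "Apple M" (e.g. "Apple M4").
export function isAppleSilicon(cpuBrand: string): boolean {
  return cpuBrand.trim().startsWith("Apple M");
}

export function describeMemory(
  cpuBrand: string,
  totalBytes: number,
  vramBytes: number | null = null,
): MemoryProfile {
  if (isAppleSilicon(cpuBrand)) {
    return { unifiedMemory: true, totalBytes };
  }
  return { unifiedMemory: false, totalBytes, vramBytes };
}

// Usage: classify the machine this process is running on.
const brand = os.cpus()[0]?.model ?? "";
const profile = describeMemory(brand, os.totalmem());
console.log(profile.unifiedMemory ? "unified memory" : "discrete memory", profile.totalBytes);
```

Keeping the classification separate from the shell probe means it can be unit-tested with fixed brand strings, without a Mac in CI.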
+ +**Example:** +```typescript +// server/src/services/hardware.ts +import os from "node:os"; + +// HardwareInfo and probeGpu() are assumed to be defined elsewhere in this module. +export function hardwareService() { + let cache: HardwareInfo | null = null; + let cacheExpiry = 0; + + return { + async detect(): Promise<HardwareInfo> { + if (cache && Date.now() < cacheExpiry) return cache; + const totalBytes = os.totalmem(); + const gpuInfo = await probeGpu(); // shells system_profiler on macOS, /proc/driver/nvidia on Linux + cache = { totalBytes, gpu: gpuInfo, platform: process.platform }; + cacheExpiry = Date.now() + 5 * 60 * 1000; // 5-minute TTL + return cache; + } + }; +} +``` + +### Pattern 2: Puter.js as a Server-Side Adapter (not browser-direct) + +**What:** Puter.js supports Node.js via `@heyputer/puter.js` with `init(authToken)`. The server acts as a proxy: it holds the Puter auth token (stored in `company_secrets` via the existing `secretService`), forwards chat requests to `puter.ai.chat(messages, { stream: true })`, and pipes the async iterable back to the browser as SSE — exactly the same format the existing `useStreamingChat` hook already consumes. + +**Why not browser-direct:** The existing chat architecture is server-mediated (all agent messages go through Express SSE). Bypassing this would require forking the streaming infrastructure. Using the server as a proxy re-uses `useStreamingChat` unchanged and keeps the Puter token off the client. + +**When to use:** During onboarding, when the user selects the "Puter.js cloud" tier and authenticates. The Puter auth flow opens a browser popup (`puter.auth.signIn()` must be user-initiated from the UI), receives a token, then POSTs it to `/api/puter-proxy/auth` for server storage. + +**Trade-offs:** One extra round-trip compared to browser-direct, but avoids token exposure and re-uses the existing SSE pipeline. Puter.js Node.js usage requires `@heyputer/puter.js` as a server dependency (not currently in the monorepo). 
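The relay step, in which each chunk of the async iterable becomes one SSE event, can be illustrated without Puter.js at all. The sketch below is a hedged illustration: `toSseFrame` and `parseSseFrames` are hypothetical helper names, and the `{ text }` chunk shape in the usage comment is an assumption, since this document does not specify what the stream yields.

```typescript
// Serialize one chunk as a single SSE "data:" event.
// JSON.stringify escapes newlines, so the payload always fits on one data line;
// the blank line ("\n\n") terminates the event.
export function toSseFrame(chunk: Record<string, unknown>): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Client-side counterpart: split a received buffer into parsed payloads.
// Returns the complete events plus any trailing partial frame to carry over
// into the next read.
export function parseSseFrames(buffer: string): { events: unknown[]; rest: string } {
  const parts = buffer.split("\n\n");
  // The last piece is incomplete unless the buffer ended exactly on "\n\n".
  const rest = parts.pop() ?? "";
  const events = parts
    .filter((part) => part.startsWith("data: "))
    .map((part) => JSON.parse(part.slice("data: ".length)) as unknown);
  return { events, rest };
}

// Usage (assumed chunk shape): toSseFrame({ text: "Hel" }) on the server,
// parseSseFrames(received) in the browser, keeping `rest` for the next chunk.
```

Carrying `rest` forward matters because network reads can split an event anywhere, including in the middle of a JSON payload.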
+ +**Example (server-side relay):** +```typescript +// server/src/routes/puter-proxy.ts +router.post("/api/puter-proxy/chat", async (req, res) => { + const token = await svc.getStoredToken(companyId); + const puter = init(token); + res.setHeader("Content-Type", "text/event-stream"); + const stream = await puter.ai.chat(req.body.messages, { stream: true }); + for await (const chunk of stream) { + res.write(`data: ${JSON.stringify(chunk)}\n\n`); + } + res.end(); +}); +``` + +### Pattern 3: Whisper on Server, Piper in Browser (Hybrid Voice) + +**What:** Voice input (speech-to-text) runs server-side via `whisper-node` (Node.js bindings for whisper.cpp). The UI records audio via `MediaRecorder`, POSTs a blob to `POST /api/voice/transcribe`, and gets back a transcript string. Voice output (text-to-speech) uses `@mintplex-labs/piper-tts-web` which runs client-side via WebAssembly — no server round-trip needed for TTS. + +**Why this split:** whisper.cpp requires native binaries that work on CPU-only hardware, which the server controls. Piper TTS web runs via WASM in the browser and has no native dependency — this keeps TTS latency low (no network round-trip) and works even if the server is slow. + +**When to use:** When user selects "voice mode" in onboarding (VoiceSetupStep). Whisper runs only if the user chooses a local Whisper model (downloaded to `data/whisper-models/`); as a fallback, the browser's native `webkitSpeechRecognition` / `SpeechRecognition` API is used. + +**Trade-offs:** Whisper download adds 75MB–1.5GB to first-run setup. For CPU-only hardware, whisper-tiny.en (75MB) transcribes in ~2s for a 10s clip on M4 — acceptable. Piper WASM download is ~20MB (models ~30-100MB each). 
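The fallback order described for voice input (local Whisper if a model was chosen and downloaded, otherwise the browser's speech API) can be written as one decision function. This is a sketch under stated assumptions: `VoiceEnvironment`, `Transcriber`, and `pickTranscriber` are hypothetical names, and the "unavailable" outcome is an added case the text does not discuss.

```typescript
type Transcriber = "whisper-server" | "browser-speech" | "unavailable";

interface VoiceEnvironment {
  // Selected local model, e.g. "whisper-tiny.en"; null when the user chose none.
  whisperModel: string | null;
  // True once the model file exists under data/whisper-models/.
  whisperModelDownloaded: boolean;
  // True when window.SpeechRecognition or window.webkitSpeechRecognition exists.
  browserSpeechRecognition: boolean;
}

export function pickTranscriber(env: VoiceEnvironment): Transcriber {
  // Prefer the local model: audio stays on the user's machine.
  if (env.whisperModel !== null && env.whisperModelDownloaded) {
    return "whisper-server";
  }
  // Otherwise fall back to the browser's native recognition, if present.
  if (env.browserSpeechRecognition) {
    return "browser-speech";
  }
  // Assumption: with neither option the mic button should be hidden or disabled.
  return "unavailable";
}
```

A hook such as `useVoiceInput` could call this once on mount and branch on the result instead of scattering capability checks through the recording code.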
+ +**Example (voice input hook):** +```typescript +// ui/src/hooks/useVoiceInput.ts +export function useVoiceInput() { + // Records with MediaRecorder → blob → POST /api/voice/transcribe + // Falls back to window.SpeechRecognition if whisper not configured +} +``` + +### Pattern 4: Persistent Memory via File-Backed JSON (No New DB Table) + +**What:** The assistant memory store is a per-workspace JSON file at `data/memory/.json`. Each memory entry has `{ id, content, createdAt, tags }`. The `memoryService` reads this file on startup (lazy-loaded per companyId), keeps it in-process, and writes on mutation. Memory injection works by prepending a formatted memory block to the system prompt at chat-send time in the existing chat service. + +**Why not PostgreSQL:** Adding a new table violates the "no DB schema changes" constraint for upstream rebase safety. File-backed JSON with an in-process cache is fast for a single-user setup (sub-millisecond reads) and requires no migration. + +**When to use:** Personal AI Assistant mode only. Project Builder mode does not use the memory service. + +**Trade-offs:** Not transactional. For a single-user local deployment, this is acceptable. File writes are atomic via write-then-rename pattern. Memory search is linear scan (no vector embeddings in v1.5 — semantic search is a future enhancement). 
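The write-then-rename pattern mentioned in the Trade-offs can be sketched with Node's synchronous `fs` calls. This is an illustrative sketch, not the service's code: `MemoryStore`, `saveStoreAtomic`, and `loadStore` are hypothetical names, and the entry shape simply mirrors the `{ id, content, createdAt, tags }` fields listed above.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical shapes mirroring the fields named in the text.
interface MemoryEntry {
  id: string;
  content: string;
  createdAt: string;
  tags: string[];
}
interface MemoryStore {
  entries: MemoryEntry[];
}

// Write to a temp file in the same directory, then rename it over the target.
// A rename within one filesystem replaces the destination in a single step,
// so a reader sees either the old file or the new one, never a partial write.
export function saveStoreAtomic(filePath: string, store: MemoryStore): void {
  const dir = path.dirname(filePath);
  fs.mkdirSync(dir, { recursive: true });
  const tmpPath = path.join(dir, `.${path.basename(filePath)}.${process.pid}.tmp`);
  fs.writeFileSync(tmpPath, JSON.stringify(store, null, 2), "utf8");
  fs.renameSync(tmpPath, filePath);
}

// A missing file is treated as an empty store rather than an error.
export function loadStore(filePath: string): MemoryStore {
  if (!fs.existsSync(filePath)) return { entries: [] };
  return JSON.parse(fs.readFileSync(filePath, "utf8")) as MemoryStore;
}
```

The temp file lives next to the target on purpose: a rename across filesystems is not atomic, so writing to a system temp directory would defeat the pattern.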
+ +**Example:** +```typescript +// server/src/services/memory.ts +// load() is assumed to be defined elsewhere in this module (not shown). +export function memoryService() { + const cache = new Map(); + + return { + async inject(companyId: string, systemPrompt: string): Promise<string> { + const store = await load(companyId); + if (store.entries.length === 0) return systemPrompt; + const block = store.entries.map(e => `- ${e.content}`).join("\n"); + return `${systemPrompt}\n\n## What I remember about you:\n${block}`; + } + }; +} +``` + +### Pattern 5: Onboarding State via instance_settings.general JSONB + +**What:** All onboarding configuration (selected mode, voice config, active provider tier) is stored in the existing `instance_settings.general` JSONB column. The `instanceSettingsService` already handles arbitrary JSONB keys. Nexus adds its config under a `nexus` namespace key to avoid upstream key collisions. + +**When to use:** Reading/writing onboarding mode, voice model selection, and provider tier configuration. No new table, no migration. + +**Example:** +```typescript +// instance_settings.general.nexus = { +// mode: "personal_ai" | "project_builder" | "both", +// voiceModel: "whisper-tiny.en" | "whisper-base" | null, +// piperVoice: "en_US-amy-medium" | null, +// providerTier: "local" | "puter" | "oauth_gemini" | "api_key", +// } +``` + +### Pattern 6: OAuth Token Storage via Existing Secrets Service + +**What:** OAuth tokens (Google Gemini, OpenAI) and the Puter.js auth token are stored via the existing `secretService` using the `local_encrypted` provider. The onboarding wizard calls `POST /api/companies/:id/secrets` with a well-known name (e.g., `nexus_puter_token`, `nexus_gemini_token`). Adapters read these at spawn time. + +**When to use:** Any time an OAuth flow completes and a token needs persistence. + +**Trade-offs:** Secrets are per-company (workspace), not per-instance. This is fine for a single-user setup. The existing secrets UI lets users view/rotate tokens manually. 
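Pattern 5's namespacing rule, where Nexus writes only under `general.nexus`, amounts to a shallow merge that leaves every other key alone. The sketch below assumes the config shape from the Pattern 5 example; `GeneralSettings` and `withNexusConfig` are hypothetical names, and how the result is persisted through `instanceSettingsService` is left out because its API is not shown here.

```typescript
type NexusMode = "personal_ai" | "project_builder" | "both";
type ProviderTier = "local" | "puter" | "oauth_gemini" | "api_key";

interface NexusConfig {
  mode: NexusMode;
  voiceModel: string | null;
  piperVoice: string | null;
  providerTier: ProviderTier;
}

// The JSONB column may hold arbitrary upstream keys next to the nexus namespace.
interface GeneralSettings {
  nexus?: Partial<NexusConfig>;
  [key: string]: unknown;
}

// Return a new settings object with `patch` merged into general.nexus.
// Upstream keys are copied through untouched and the input is not mutated.
export function withNexusConfig(
  general: GeneralSettings,
  patch: Partial<NexusConfig>,
): GeneralSettings {
  return {
    ...general,
    nexus: { ...(general.nexus ?? {}), ...patch },
  };
}
```

Each wizard step can then contribute only its own fields, for example `withNexusConfig(current, { providerTier: "puter" })`, without needing to know what earlier steps wrote.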
+ +--- + +## Data Flow + +### Onboarding Wizard Data Flow + +``` +User opens Nexus (no workspace yet) + ↓ +NexusOnboardingWizard renders + ↓ +Step 1: ModeSelector → user picks "Personal AI" / "Project Builder" / "Both" + ↓ +Step 2: HardwareSummaryStep + → GET /api/hardware/info (new route) + → hardwareService.detect() → os.totalmem() + system_profiler + → returns { totalGb, gpuName, unifiedMemory, platform } + → wizard shows model tier recommendations + ↓ +Step 3: ProviderTierStep + → Local: already detected via existing Hermes probe + → Puter.js: user clicks "Connect" → puter.auth.signIn() popup + → UI POSTs token to POST /api/puter-proxy/auth + → server stores in secretService("nexus_puter_token") + → OAuth (Gemini/OpenAI): OAuth PKCE flow in a popup window + → callback captured by temp local server or redirect + → token stored via secretService + → API Key: direct input → stored via secretService + ↓ +Step 4: VoiceSetupStep (optional, skippable) + → GET /api/voice/status → check if whisper binary present + → User picks model → POST /api/voice/download (async download + SSE progress) + → User picks Piper voice → stored in instance_settings.general.nexus + ↓ +Step 5: OnboardingSummaryStep + → Creates workspace + agents (existing companiesApi + agentsApi flow) + → Writes nexus config to instance_settings.general.nexus + → Navigates to PersonalAssistant page OR Dashboard based on mode +``` + +### Personal AI Assistant Chat Data Flow + +``` +User types message in AssistantChatHub + ↓ +ChatInput → useStreamingChat.startStream(conversationId, message) + ↓ +POST /api/companies/:id/chat/conversations/:convId/messages + ↓ (existing chat route, no change) +chat route detects "personal_assistant" agent type + ↓ +memoryService.inject(companyId, systemPrompt) ← NEW injection point + ↓ +Route selects provider based on instance_settings.general.nexus.providerTier: + - "local" → existing Hermes adapter (no change) + - "puter" → puterProxyService.chat() → Puter.js Node client → 
SSE relay + - "oauth_*" → respective provider API with stored OAuth token → SSE relay + ↓ +SSE events stream to UI via existing /api/chat/stream endpoint pattern + ↓ +useStreamingChat receives chunks → ChatMessageList renders them +``` + +### Voice Input Data Flow + +``` +User presses mic button in ChatInput (MODIFIED) + ↓ +useVoiceInput starts MediaRecorder → records WebM/Opus blob + ↓ +User releases mic → blob POSTed to POST /api/voice/transcribe + ↓ +voiceService.transcribe(audioBuffer) + → whisper-node.transcribe(path) → returns text + ↓ +Text injected into ChatInput.value + ↓ +User reviews → sends normally +``` + +### npx buildthis Data Flow + +``` +Developer runs: npx buildthis + ↓ +buildthis/src/index.ts checks for running Nexus: + GET http://localhost:4000/api/health → 200? + YES → open browser to http://localhost:4000 + NO → run nexus onboard wizard (delegates to paperclipai onboard) + OR detect Docker → suggest docker-compose up +``` + +--- + +## Integration Points: New vs Modified + +### Server Routes — app.ts (MODIFIED) + +One file to add 4 route mounts. Minimal conflict surface with upstream. ```typescript -// ui/src/lib/nexus-labels.ts [NEXUS-OWNED FILE] -// Central display vocabulary. Never referenced by upstream. -// All UI components import from here instead of hardcoding strings. 
- -export const NEXUS_LABELS = { - // Entity names - workspace: "Workspace", - workspaces: "Workspaces", - projectManager: "Project Manager", - owner: "Owner", - - // Actions - addAgent: "Add Agent", - removeAgent: "Remove Agent", - - // Onboarding - onboardingRootPrompt: "Choose your root directory", - onboardingTitle: "Welcome to Nexus", - - // App identity - appName: "Nexus", - cliCommand: "nexus", -} as const; +// In server/src/app.ts — add after ollamaRoutes(): +app.use(hardwareRoutes()); +app.use(voiceRoutes()); +app.use(memoryRoutes(db)); +app.use(puterProxyRoutes(db)); ``` -**Usage pattern in existing components:** Import `NEXUS_LABELS` and replace the hardcoded string. The diff in the upstream file is minimal — a one-line import addition and a string substitution. +### Chat Route — MODIFIED for memory injection -**Conflict profile:** The import addition is a single new line at the top of the file. String substitutions are isolated to specific JSX attributes. These lines are unlikely to be touched by upstream changes because upstream will not add an import from `nexus-labels`. +The existing chat service (`server/src/services/chat.ts`) needs one injection point: when building the system prompt for a conversation, call `memoryService.inject()`. This is scoped to conversations where the agent has `adapterConfig.assistantMode === true`. + +**Risk:** This touches an upstream file. The injection is a 3-line addition inside the message-send handler. Low conflict probability — upstream rarely modifies this section. + +### NexusOnboardingWizard.tsx — REPLACED + +The current single-step wizard becomes a multi-step wizard. Since this file is already a Nexus replacement (not an upstream file), there is zero conflict risk — it will never exist in upstream. + +### App.tsx routing — MODIFIED (one new route) + +Add the `PersonalAssistant` page as a new lazy-loaded route. Minimal upstream conflict (routing section rarely changes). 
+ +### ChatInput.tsx — MODIFIED (voice button) + +Add a microphone button that triggers `useVoiceInput`. This is an upstream file — the modification is additive (new button, no existing logic changed). Conflict risk: LOW, as upstream rarely modifies ChatInput. --- -## Isolation Pattern 2: Inline String Replacement (Low-Conflict Edits) +## Anti-Patterns -For files with a small number of hardcoded display strings, edit inline with targeted changes. Prefix all changed lines with a `// [nexus]` comment on the preceding line so they are trivially identified during rebase conflict resolution. +### Anti-Pattern 1: Browser-Direct Puter.js -**Example — `packages/shared/src/constants.ts` line 53:** -```typescript -// [nexus] display label override -ceo: "Project Manager", -``` +**What people do:** Import `@heyputer/puter.js` in the React frontend and call `puter.ai.chat()` directly from the browser. -**Example — `ui/src/pages/Companies.tsx` line 72:** -```typescript -// [nexus] breadcrumb rename -setBreadcrumbs([{ label: "Workspaces" }]); -``` +**Why it's wrong:** Exposes the Puter auth token in browser storage/network. Bypasses the existing SSE pipeline, requiring a second streaming implementation. Breaks the memory injection pattern (no server-side hook). Cannot use the existing `useStreamingChat` hook. -**Example — `ui/src/pages/Companies.tsx` line 96:** -```typescript -// [nexus] button rename -New Workspace -``` +**Do this instead:** Use the server proxy pattern (Pattern 2). The UI sends messages to `/api/puter-proxy/chat` exactly like any other chat endpoint. -The `// [nexus]` marker serves three purposes: -1. Identifies fork-owned lines during `git diff` triage -2. Signals to the developer during a rebase conflict which side is Nexus vs upstream -3. 
Enables `grep -r '\[nexus\]'` to produce a complete inventory of modified lines at any time +### Anti-Pattern 2: New PostgreSQL Tables for Memory + +**What people do:** Create an `assistant_memories` migration with a proper relational schema. + +**Why it's wrong:** Violates the hard constraint: no DB migrations, no schema changes, to keep upstream rebase clean. A migration file created in Nexus will conflict every time upstream adds a migration. + +**Do this instead:** File-backed JSON in the server's data directory (Pattern 4). The single-user M4 Mini deployment is unlikely to hit performance limits with this approach. + +### Anti-Pattern 3: Multi-Step Wizard as Modified OnboardingWizard.tsx + +**What people do:** Modify the upstream `OnboardingWizard.tsx` directly to add v1.5 steps. + +**Why it's wrong:** The upstream wizard is actively maintained (120+ upstream commits since fork). Touching it creates guaranteed rebase conflicts. + +**Do this instead:** Continue the existing pattern — `NexusOnboardingWizard.tsx` is already the Nexus replacement via Vite alias. All v1.5 changes go there. The upstream file stays untouched. + +### Anti-Pattern 4: OAuth in the Browser via Redirect + +**What people do:** Redirect the main app window to the OAuth provider and handle the callback via `window.location`. + +**Why it's wrong:** Loses React state mid-flow. It is also hard to handle the callback URL on a local server that may not have a publicly routable HTTPS endpoint. + +**Do this instead:** Use a popup window for OAuth (`window.open`). The popup handles the full OAuth redirect. On callback, the popup calls `window.opener.postMessage` with the token, closes itself, and the main window receives it. For Puter.js specifically, `puter.auth.signIn()` handles the popup internally. --- -## Isolation Pattern 3: File Content Replacement (Onboarding Assets) +## Scaling Considerations -The `server/src/onboarding-assets/ceo/` files (SOUL.md, AGENTS.md, HEARTBEAT.md, TOOLS.md) are plain prose. 
They have no code entanglement. Replace their content entirely. +This is a single-user local deployment on an M4 Mini. Scaling is not a concern for v1.5. The architecture is designed for correctness and upstream merge-ability, not horizontal scale. -**Strategy:** Keep the same file paths. Write Nexus-specific content. Upstream changes to these files will produce conflicts, but: -- Upstream changes to `ceo/SOUL.md` are relatively rare (onboarding prose is stable) -- When conflicts occur, resolution is manual prose review — not code logic -- The directory itself is not renamed (`ceo/` stays `ceo/`) to avoid path-level conflicts - -**PM and Engineer templates:** Add new template subdirectories under `server/src/onboarding-assets/`: -- `server/src/onboarding-assets/pm/` — new directory, zero conflict risk -- `server/src/onboarding-assets/engineer/` — new directory, zero conflict risk +| Concern | Single User (M4 Mini) | +|---------|----------------------| +| Hardware detection | os.totalmem() + sync shell probe, cached 5min — negligible | +| Puter.js relay | One connection at a time, no pooling needed | +| Whisper transcription | ~2s for 10s clip on M4, sequential queue sufficient | +| Memory store | File JSON, <10ms read, no contention | +| Voice TTS | WASM in browser, zero server load | --- -## Isolation Pattern 4: Build-Time File Swap via Vite Alias (High-Complexity Components) +## Build Order (Dependency Graph) -For components that require substantial structural changes (primarily `OnboardingWizard.tsx`), use Vite's `resolve.alias` to swap the import at build time. This keeps the upstream file untouched. 
- -**Existing Vite config** (`ui/vite.config.ts`) already uses `resolve.alias`: -```typescript -resolve: { - alias: { - "@": path.resolve(__dirname, "./src"), - lexical: path.resolve(__dirname, "./node_modules/lexical/Lexical.mjs"), - }, -}, -``` - -**Add a Nexus override alias:** -```typescript -// ui/vite.config.ts [nexus] -resolve: { - alias: { - "@": path.resolve(__dirname, "./src"), - lexical: path.resolve(__dirname, "./node_modules/lexical/Lexical.mjs"), - // [nexus] component overrides - "@/components/OnboardingWizard": path.resolve( - __dirname, "./src/nexus/OnboardingWizard.tsx" - ), - }, -}, -``` - -**New file:** `ui/src/nexus/OnboardingWizard.tsx` — entirely Nexus-owned, never conflicts. - -**Upstream file:** `ui/src/components/OnboardingWizard.tsx` — left unmodified. Any upstream updates to it are absorbed without conflict because the alias bypasses it. - -**Tradeoff:** This pattern is only worth the complexity for large rewrites (100+ lines changed). For small string changes, inline edits are lower overhead. Apply to `OnboardingWizard.tsx` only. - -**Confidence:** HIGH — Vite alias file swapping is a documented pattern used in white-label React apps. The existing config already demonstrates the alias syntax. - ---- - -## Isolation Pattern 5: Home Directory Pointer Mechanism - -The `~/.nexus` pointer file is Nexus-specific infrastructure. The approach: - -1. Modify `server/src/home-paths.ts` — change the single default string `".paperclip"` to `".nexus"`. This is a one-line change; conflict risk is minimal because upstream rarely changes default paths. - -2. Create `~/.nexus` as a single-line text file containing the root path. This is runtime data, not code. - -3. The `PAPERCLIP_HOME` env var override still works — Nexus does not rename it (display-only constraint). 
- -**Inline change in `server/src/home-paths.ts`:** -```typescript -// [nexus] home dir rename -const DEFAULT_HOME = ".nexus"; -``` - ---- - -## Data Flow: How Changes Propagate +The build order matters because later phases consume services built in earlier ones. ``` -nexus-labels.ts (NEW) - └── imported by: Companies.tsx, CompanyRail.tsx, InstanceSidebar.tsx, etc. - └── display strings centralized — upstream files only gain one import line +Phase 1: Hardware Detection + → hardwareService (server) + → GET /api/hardware/info (route) + → useHardwareInfo hook (UI) + → HardwareSummaryStep component (UI) + No dependencies on other new phases. -constants.ts (MODIFIED, minimal) - └── AGENT_ROLE_LABELS.ceo = "Project Manager" - └── used by: AgentConfigForm.tsx, NewAgent.tsx, ApprovalPayload.tsx - └── no other changes needed in those files +Phase 2: Provider Tiers (depends on Phase 1 for display) + → puterProxyService (server) — Puter.js Node client + → secretService integration for token storage (uses EXISTING service) + → POST /api/puter-proxy/auth (route) + → ProviderTierStep component (UI) + → OAuth popup flow (UI) -OnboardingWizard.nexus.tsx (NEW) - └── aliased via vite.config.ts (one-line alias addition) - └── upstream OnboardingWizard.tsx untouched +Phase 3: Multi-Step Onboarding Wizard (depends on Phases 1+2) + → ModeSelector, OnboardingSummaryStep components (UI) + → Refactor NexusOnboardingWizard.tsx into multi-step + → instance_settings.general.nexus config write -onboarding-assets/ceo/*.md (MODIFIED content, same paths) - └── loaded by default-agent-instructions.ts (unchanged) +Phase 4: Persistent Memory + Assistant Mode (depends on Phase 3) + → memoryService (server) + → Memory injection in chat route (MODIFIED — highest risk step) + → GET/POST/DELETE /api/companies/:id/memory (routes) + → PersonalAssistantPage (UI) + → useAssistantMemory hook (UI) -onboarding-assets/pm/ (NEW directory) -onboarding-assets/engineer/ (NEW directory) - └── loaded by new template 
selector in OnboardingWizard.nexus.tsx +Phase 5: Voice (depends on Phase 3, independent of Phase 4) + → voiceService (server) — whisper-node + piper setup + → POST /api/voice/transcribe, /speak, /status (routes) + → VoiceSetupStep in onboarding (UI) + → useVoiceInput, useVoiceSpeech hooks (UI) + → ChatInput microphone button (MODIFIED — upstream file, low risk) + +Phase 6: npx buildthis (independent of all above) + → packages/buildthis/ new package + → package.json bin field setup + → npm publish configuration ``` ---- - -## Anti-Patterns to Avoid - -### Anti-Pattern 1: Renaming Upstream Files or Directories - -**What:** Renaming `CompanyRail.tsx` → `WorkspaceRail.tsx`, or `onboarding-assets/ceo/` → `onboarding-assets/pm/` -**Why bad:** Git tracks renames as delete + add. During `git rebase upstream/master`, if upstream makes changes to `CompanyRail.tsx`, the patch will not apply to `WorkspaceRail.tsx`. You get an unresolved conflict that requires manual merge of the upstream diff into the renamed file. -**Instead:** Keep all upstream file paths. Use wrapper files or content replacement. Reserve new names for new files only. - -### Anti-Pattern 2: Renaming TypeScript Identifiers in Upstream Files - -**What:** Renaming `companyService` → `workspaceService`, `CompanyContext` → `WorkspaceContext` -**Why bad:** Any upstream commit touching those files produces a merge conflict on every renamed symbol. The conflict surface grows proportionally to how many usages exist (currently hundreds of import sites). -**Instead:** Leave all identifiers unchanged. The mapping from internal name to display name happens in `nexus-labels.ts` and the `AGENT_ROLE_LABELS` constant only. - -### Anti-Pattern 3: Squashing All Nexus Commits - -**What:** Maintaining Nexus changes as a single squashed "fork" commit -**Why bad:** During `git rebase upstream/master`, all conflicts appear in one commit resolution session, making them impossible to isolate. 
A single upstream change to `constants.ts` forces you to re-resolve every Nexus change in that file simultaneously. -**Instead:** Keep one atomic `[nexus]` commit per change area (labels, onboarding, home dir, templates). Small commits rebase cleanly. Conflicts are isolated. - -### Anti-Pattern 4: Package Name Renames - -**What:** `@paperclipai/shared` → `@nexusai/shared` -**Why bad:** Every upstream file that imports from `@paperclipai/*` will conflict because Nexus has rewritten the import path. This is effectively every file in the monorepo. -**Instead:** Keep all `@paperclipai/*` package names. This is explicitly in scope as "out of scope" in PROJECT.md. - -### Anti-Pattern 5: Centralizing All Changes in One File - -**What:** Putting all Nexus overrides in `constants.ts` or `App.tsx` -**Why bad:** High-traffic upstream files accumulate the most conflicts. Concentrating fork changes there maximizes conflict exposure. -**Instead:** Prefer adding new files (zero conflict risk) over modifying high-traffic upstream files. +**Recommended sequence:** 1 → 2 → 3 → 4 → 5 → 6. Phase 4 (memory injection into chat route) is the highest-risk upstream-file modification and should come after onboarding is validated. 
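The dependency graph above can be checked mechanically. The following is a minimal sketch, assuming the phase numbers and "depends on" notes are taken exactly as written in the build order; the `Phase` type and the `buildOrder` helper are hypothetical names introduced only for illustration and do not exist in the codebase. It runs a simple topological ordering and breaks ties by phase number.

```typescript
// Hypothetical sketch: ids and dependsOn lists mirror the build-order graph;
// the Phase type and buildOrder helper are illustrative, not part of Nexus.
interface Phase {
  id: number;
  name: string;
  dependsOn: number[];
}

const phases: Phase[] = [
  { id: 1, name: "Hardware Detection", dependsOn: [] },
  { id: 2, name: "Provider Tiers", dependsOn: [1] },
  { id: 3, name: "Multi-Step Onboarding Wizard", dependsOn: [1, 2] },
  { id: 4, name: "Persistent Memory + Assistant Mode", dependsOn: [3] },
  { id: 5, name: "Voice", dependsOn: [3] },
  { id: 6, name: "npx buildthis", dependsOn: [] },
];

// Repeatedly pick the lowest-numbered phase whose dependencies are all done.
function buildOrder(all: Phase[]): number[] {
  const done = new Set<number>();
  const order: number[] = [];
  while (order.length < all.length) {
    const ready = all
      .filter((p) => !done.has(p.id) && p.dependsOn.every((d) => done.has(d)))
      .sort((a, b) => a.id - b.id);
    if (ready.length === 0) {
      // Nothing can be scheduled: a cycle or a reference to an unknown phase.
      throw new Error("Dependency cycle or unknown phase id in build order");
    }
    done.add(ready[0].id);
    order.push(ready[0].id);
  }
  return order;
}

console.log(buildOrder(phases).join(" → ")); // prints "1 → 2 → 3 → 4 → 5 → 6"
```

Because Phase 6 has no dependencies, it could be scheduled at any point. The lowest-id tie-break simply reproduces the recommended sequence and places it last.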
--- -## Scalability Considerations +## Integration Points: External Services -| Concern | Now (v1) | Future upstream rebases | -|---------|----------|------------------------| -| Label changes | 1 constants.ts edit + nexus-labels.ts | nexus-labels.ts never conflicts; constants.ts conflict is isolated to 1 line | -| Onboarding | OnboardingWizard aliased via Vite | Upstream OnboardingWizard changes ignored automatically | -| Template content | ceo/ files replaced in-place | Manual prose merge if upstream edits ceo/ — rare | -| New upstream entities | Zero action needed | New upstream files have no Nexus edits | -| New Nexus features | Add to nexus/ directory | Zero conflict risk — new files only | - ---- - -## Implementation Order (Least to Most Conflict Risk) - -This order ensures each phase can be validated and rebased independently before the next phase adds more change surface. - -### Phase 1 — Foundation (zero upstream file changes) -1. Create `ui/src/nexus/` directory -2. Create `ui/src/lib/nexus-labels.ts` with full label registry -3. Create `server/src/onboarding-assets/pm/` and `engineer/` template directories -4. Add `[nexus]` commit: "add nexus overlay directory and label registry" - -**Why first:** Establishes the containment structure with no upstream file touches. Safe to rebase at any point. - -### Phase 2 — Constants and Labels (1 upstream file, 1-line change) -1. Edit `packages/shared/src/constants.ts` — change `ceo: "CEO"` to `ceo: "Project Manager"` in `AGENT_ROLE_LABELS` -2. Add `[nexus]` commit: "rename CEO display label to Project Manager" - -**Why second:** Single file, single line. Easiest conflict to resolve if upstream touches the same line. - -### Phase 3 — Home Directory (1 upstream file, 1-line change) -1. Edit `server/src/home-paths.ts` — change default home dir string to `.nexus` -2. Edit `cli/src/config/home.ts` — same change -3. Add `[nexus]` commit: "change default home dir from .paperclip to .nexus" - -**Why third:** Low-risk lines. 
Home dir defaults are very rarely changed by upstream. - -### Phase 4 — UI String Renames (several upstream files, inline strings only) -1. Edit `ui/src/pages/Companies.tsx` — rename "Companies" breadcrumb and "New Company" button to "Workspaces" / "New Workspace" -2. Edit `ui/src/components/CompanyRail.tsx` — swap `Paperclip` lucide icon for a different icon -3. Edit `ui/src/pages/CompanySettings.tsx`, `InstanceSidebar.tsx` — display-string renames -4. Edit `cli/src/commands/onboard.ts` — terminal output strings -5. One `[nexus]` commit per file changed - -**Why fourth:** More files touched, but changes are string-only. Each commit is independently rebaseable. `// [nexus]` markers make conflict resolution mechanical. - -### Phase 5 — Onboarding Redesign (Vite alias + new file) -1. Add Vite alias in `ui/vite.config.ts` pointing `OnboardingWizard` to `nexus/OnboardingWizard.tsx` -2. Write `ui/src/nexus/OnboardingWizard.tsx` as a full replacement (root dir picker, PM + Engineer auto-create) -3. Replace `server/src/onboarding-assets/ceo/` file content with PM-framed prose -4. One `[nexus]` commit: "redesign onboarding for single-dev workspace flow" - -**Why last:** Most complex change. The Vite alias approach means upstream `OnboardingWizard.tsx` can evolve freely without conflicting. Template content is the highest natural-language conflict risk but lowest structural risk. - ---- - -## Rebase Workflow - -```bash -# Pull upstream changes -git fetch upstream -git rebase upstream/master - -# For each [nexus] commit, git will pause on conflicts. 
-# Expected conflict files per phase: -# Phase 2: packages/shared/src/constants.ts (1 line) -# Phase 3: server/src/home-paths.ts, cli/src/config/home.ts (1 line each) -# Phase 4: ui/src/pages/*.tsx, cli/src/commands/onboard.ts (string lines) -# Phase 5: server/src/onboarding-assets/ceo/*.md (prose), ui/vite.config.ts (1 line) -# -# Resolution rule: keep [nexus] version for any line marked // [nexus] -# accept upstream for everything else - -# After rebase, verify no Nexus labels reverted: -grep -r '\[nexus\]' /Volumes/UsbNvme/repos/nexus --include="*.ts" --include="*.tsx" -``` +| Service | Integration Pattern | Auth Storage | Notes | +|---------|---------------------|--------------|-------| +| Puter.js | Server-side Node.js client proxy, SSE relay | `company_secrets` table | Token obtained via browser popup on first connect | +| Google Gemini OAuth | PKCE popup flow, access token + refresh token | `company_secrets` table | Policy risk: using Gemini CLI OAuth with third-party apps may trigger abuse detection — use only if user has an active Gemini subscription | +| OpenAI OAuth | PKCE flow via auth.openai.com | `company_secrets` table | Only for free tier / ChatGPT Plus users | +| Whisper (whisper-node) | Native binary, spawned by voiceService | N/A — local binary | Download on first use, cached in data/whisper-models/ | +| Piper TTS | @mintplex-labs/piper-tts-web WASM, runs in browser | N/A — client-side | Model files downloaded to browser cache | +| Ollama | Existing integration (v1.4) — no changes | N/A | ollama.ts service and /ollama routes unchanged | --- ## Sources -- Codebase inspection: `/Volumes/UsbNvme/repos/nexus/` (direct analysis, HIGH confidence) -- Vite resolve.alias documentation: https://vite.dev/config/shared-options (HIGH confidence) -- White-label file-swap pattern: https://krasimirtsonev.com/blog/article/whitelabel-react-apps (MEDIUM confidence — describes Webpack, pattern is equivalent in Vite) -- Fork rebase best practices: 
https://joaquimrocha.com/2024/09/22/how-to-fork/ (MEDIUM confidence) -- Atomic commit strategy for forks: https://medium.com/@ruthmpardee/git-fork-workflow-using-rebase-587a144be470 (MEDIUM confidence) +- Codebase inspection: `/opt/nexus/server/src/`, `/opt/nexus/ui/src/`, `/opt/nexus/packages/` +- Puter.js Node.js support: https://docs.puter.com/supported-platforms/ +- Puter.js chat streaming API: https://docs.puter.com/AI/chat/ +- Puter.js auth flow: https://developer.puter.com/blog/browser-based-auth-puter-js-node/ +- whisper-node npm package: https://www.npmjs.com/package/whisper-node +- Piper TTS WASM: https://www.npmjs.com/package/@mintplex-labs/piper-tts-web +- @xenova/transformers Node.js audio guide: https://huggingface.co/docs/transformers.js/main/en/guides/node-audio-processing +- Google Gemini OAuth: https://ai.google.dev/gemini-api/docs/oauth +- Google Gemini OAuth policy risk: https://github.com/google-gemini/gemini-cli/issues/21866 +- Vectra local vector DB (future memory enhancement): https://github.com/Stevenic/vectra +- Apple Silicon unified memory: https://eclecticlight.co/2022/03/01/making-sense-of-m1-memory-use/ + +--- +*Architecture research for: Nexus v1.5 Smart Onboarding + Personal AI Assistant* +*Researched: 2026-04-02* diff --git a/.planning/research/FEATURES.md b/.planning/research/FEATURES.md index 28491fb4..d3b9f366 100644 --- a/.planning/research/FEATURES.md +++ b/.planning/research/FEATURES.md @@ -1,193 +1,317 @@ -# Feature Landscape +# Feature Research -**Domain:** Personal AI Agent Orchestration Platform (solo-developer fork of Paperclip) -**Researched:** 2026-03-30 -**Confidence:** HIGH for Paperclip base features (read from codebase); MEDIUM for ecosystem positioning (web research); LOW for subjective UX judgments +**Domain:** Smart Onboarding + Personal AI Assistant (Nexus v1.5) +**Researched:** 2026-04-02 +**Confidence:** MEDIUM overall — Puter.js confirmed current, hardware detection patterns confirmed, personal AI assistant 
patterns from active ecosystem; UX recommendations inferred from patterns --- -## Context +## Milestone Scope -This analysis is scoped to the Nexus fork of Paperclip. The upstream already ships a comprehensive engine — heartbeats, task lifecycle, multi-adapter support, cost budgets, approval gates, plugin system. The fork question is not "what to build" but "what to rename, what to surface, and what to hide." Features are evaluated against that lens. +This document covers only the NEW features in v1.5. Existing features (NexusOnboardingWizard, Hermes adapter, Ollama integration, chat interface, PWA, voice input via Whisper) are already built and are dependencies, not deliverables. -**Two distinct tracks:** -1. **Engine features** — what the orchestration runtime does (mostly inherited from upstream, not to be changed) -2. **Display-layer features** — what the UI and CLI communicate to the human operator (primary fork scope) +**New features being researched:** +- Hardware detection with pre-built model database +- Tiered provider setup: local (Ollama) → zero-config cloud (Puter.js) → OAuth cloud (Gemini, OpenAI) → API key / subscription (Hermes, Claude Code, OpenClaw) +- Personal AI Assistant mode with persistent memory, MCP connections, voice (Whisper + Piper) +- Project handoff: assistant conversation → PM agent with context transfer +- `npx buildthis` CLI entry point +- Every step skippable --- -## Table Stakes +## Feature Landscape -Features the fork must get right for Nexus to feel complete and usable. Missing or broken = product feels unfinished. +### Table Stakes (Users Expect These) + +Features users assume exist in a modern AI onboarding flow. Missing these makes onboarding feel broken or untrustworthy. 
| Feature | Why Expected | Complexity | Notes | |---------|--------------|------------|-------| -| Dashboard with live agent status | Users need to see what agents are doing at a glance | Low | Upstream has SSE-backed live updates; display rename only (Company → Workspace) | -| Real-time run logs / heartbeat transcript | "Is my agent running or stuck?" is the first question every time | Low | Upstream streams stdout/stderr per heartbeat; UI already exists; display polish needed | -| Cost visibility per agent | Without it users have no feedback loop on spend | Low | Upstream tracks `cost_events`; display rename + cleanup | -| Task (issue) list with status | Core work item visibility | Low | Upstream has full issue model; display rename only | -| Agent status indicators (idle/running/paused) | Know agent state without opening logs | Low | Upstream has agent `status` field; surface in sidebar/card | -| One-command startup | `nexus run` → working dashboard | Low | CLI command exists as `paperclipai run`; display rename only | -| Human approval workflow | Agents can request approval before acting; critical for trust | Low | Upstream has `approvals` table and routes; display rename only | -| Agent configuration page | View and edit adapter type, model, instructions file | Medium | Upstream has config revisions and rollback; display cleanup needed | -| Sub-task / issue hierarchy | Agents create sub-issues; user needs to see nesting | Low | Upstream has `parentId` and `requestDepth`; display only | -| Project grouping | Issues are grouped under Projects; navigation must reflect this | Low | Upstream has `projects` entity; display rename only (no collision — Workspace > Project > Issue) | -| Scheduled task creation (routines) | Recurring tasks without manual triggering | Low | Upstream has `routines` model with cron; display rename only | -| CLI help text that uses Nexus vocabulary | Every `--help` output that says "Paperclip" or "company" breaks the mental model | Medium | 
All CLI display strings need `[nexus]` overrides | +| Hardware auto-detection on first run | Any local AI tool probes GPU/RAM; users expect "it just knows" | MEDIUM | Node.js can read `/proc/meminfo`, spawn `nvidia-smi`, detect Apple Silicon via `os.arch()`; Ollama's `/api/tags` endpoint also reveals loaded models | +| RAM-aware model recommendations | Ollama and LM Studio both do this; users have been trained to expect it | LOW | Pre-built lookup table: <8GB RAM → 3B-7B, 8-16GB → 7B-13B, 16GB+ → 30B+; VRAM takes priority over system RAM | +| Step-skippable onboarding | Any wizard that forces completion feels hostile; Clerk, Vercel, and Postman all allow skip | LOW | Each step needs a "skip" or "set up later" affordance; final summary shows what was skipped | +| Progress indicator | Multi-step wizards without progress indicators cause anxiety ("how many more steps?") | LOW | Step counter or progress bar; 5-7 max steps total | +| Summary screen before entering app | Users need to understand what was set up before being dropped in the dashboard | LOW | Show: mode selected, provider configured, models available; "Start chatting" CTA | +| "Test connection" before saving | Every API key entry form should validate before proceeding | LOW | Quick `/health` or echo call to configured provider; show latency | +| Persisted onboarding state | Refreshing mid-wizard should not restart from step 1 | LOW | LocalStorage or DB; existing NexusOnboardingWizard already handles this pattern | +| Voice input/output toggle | Users who selected voice features expect them to work immediately | MEDIUM | Whisper already exists (v1.3); Piper TTS is the new addition; toggle in assistant settings | +| Persistent conversation memory | Any "personal AI assistant" product ships some form of memory (ChatGPT, Claude Projects, Gemini) | HIGH | Users compare against ChatGPT memory; table stakes for the mode to feel meaningful | +| MCP-style external connections | Power users expect the assistant to 
connect to their tools (files, git, search) | MEDIUM | MCP is now a universal standard (Anthropic, OpenAI, Google all adopted it); STDIO and HTTP transport both needed | -**Assessment:** Every table-stakes feature already exists in the upstream engine. The work is entirely display-layer: surface the right label, hide corporate metaphor, keep the behavior. Estimated risk: LOW — no functional code changes required. +### Differentiators (Competitive Advantage) ---- +Features that make Nexus v1.5 worth using over ChatGPT, Claude Projects, or bare Ollama. -## Differentiators +| Feature | Value Proposition | Complexity | Notes | +|---------|-------------------|------------|-------| +| Puter.js as zero-config cloud tier | No API key, no sign-up, 500+ models including GPT-4.1, Claude Sonnet 4, Gemini 2.5 — user pays via their Puter account | MEDIUM | Puter uses a "user-pays" model: each user authenticates against Puter and consumes their own credits. Developer (Mikkel) pays nothing. Implementation: drop in `puter.js` script, call `puter.ai.chat()`; requires user to have/create a Puter account (free tier exists) | +| Local-first framed as privacy premium | Most tools push cloud. Nexus frames local Ollama as the privacy-respecting choice, not the budget option | LOW | Copy/UX decision: "Your data never leaves your machine" for local tier. No code change needed | +| Hardware detection → instant model recommendation | Instead of listing 100 models and asking the user to pick, Nexus says "Given your M4 Mac Mini with 16GB unified memory, we recommend llama3.2:3b for assistant tasks" | MEDIUM | Pre-built model database (JSON lookup): Apple Silicon tiers, NVIDIA VRAM tiers, AMD VRAM tiers, CPU-only tier. 
Cross-reference with Ollama model library metadata | +| Project handoff from assistant to PM agent | "Turn this conversation into a project" — one button to create a Paperclip Project with issues extracted from the conversation, with full chat context transferred to PM agent | HIGH | Novel UX pattern; no off-the-shelf solution; requires: summary extraction from conversation (LLM call), Project entity creation via existing API, agent prompt injection with context summary | +| `npx buildthis` CLI entry point | Zero-install UX: `npx buildthis` downloads and runs the Nexus server + opens browser. Same pattern as `create-react-app`, `shadcn`, etc. | MEDIUM | Commander.js CLI already exists; `npx` entry requires: `bin` field in package.json, published to npm (or private registry), auto-open browser after server starts | +| Voice + local LLM = fully offline assistant | Whisper (STT) + Piper TTS + Ollama (LLM) = zero cloud dependency for voice interaction. Rare in consumer tools | HIGH | Piper is CPU-capable, fast enough on Apple Silicon. Integration complexity: audio pipeline (mic → Whisper → Ollama → Piper → speaker); streaming TTS for lower latency | +| Mode selection: Personal AI / Project Builder / Both | Most tools are either a chat assistant or a project manager. Nexus surfaces both modes with explicit switching | LOW | UI mode toggle stored in workspace settings; affects which features are surfaced in sidebar/dashboard | +| Google OAuth cloud tier (no API key) | Users with Google accounts can use Gemini without managing API keys — mirrors how Opencode handles Gemini OAuth | MEDIUM | Google OAuth flow → exchange for short-lived AI Studio token; already proven pattern in Opencode | -Features that make Nexus feel personal and purpose-built for a solo developer, versus Paperclip's "zero-human company" framing. 
+### Anti-Features (Commonly Requested, Often Problematic) -### D1: Zero-Question Onboarding -**Value:** Paperclip's onboarding asks for company name, mission, CEO name, adapter config, and then creates a task to "hire a founding engineer." None of this maps to a solo developer with a root directory of projects. Nexus asks for ONE thing (root directory), auto-creates PM + Engineer agents with sane templates, and drops the user in the dashboard. +Features that seem like good additions but create maintenance debt, scope creep, or user confusion. -**Why it matters:** Paperclip's own product notes flag "getting from install to first task in under 5 minutes" as a stated goal, not yet achieved consistently. This is the single highest-impact UX change. - -**Complexity:** Medium (requires rewriting `OnboardingWizard.tsx` step sequence and `onboard.ts` CLI wizard; no schema changes) - -**Dependencies:** Predefined agent templates (D2) must exist before onboarding can auto-create them. - ---- - -### D2: Predefined Agent Templates (PM + Engineer) -**Value:** Instead of asking "what should I name my CEO and what adapter should it use?", Nexus ships two templates that are immediately useful: a Project Manager agent wired to delegate and coordinate, and an Engineer agent wired to execute code tasks. - -**Why it matters:** The upstream's default first task ("hire a founding engineer, write a hiring plan") is designed for a multi-agent org-building flow. Solo developers do not want to bootstrap an org — they want to point agents at work. Opinionated defaults remove the blank-canvas paralysis. - -**Complexity:** Low (template content in AGENTS.md / HEARTBEAT.md / SOUL.md / TOOLS.md files; no schema changes; one new UI dropdown in "Add Agent" dialog) - -**Dependencies:** None — these are static files bundled with the fork. 
- ---- - -### D3: Workspace-First Mental Model -**Value:** Replacing the Company/CEO metaphor with Workspace/Project Manager throughout every user-facing surface creates a consistent mental model. When every button, heading, and CLI response uses the same vocabulary, the user stops translating and starts working. - -**Why it matters:** Every time a user sees "CEO" or "Company" in the Nexus UI, it costs cognitive load. Multiplied across hundreds of daily interactions, this friction accumulates. The rename is not cosmetic — it removes a persistent mismatch between the user's world model and the tool's communication. - -**Complexity:** Medium (systematic string audit across `ui/src/`, `cli/src/`, agent template files; the work is large in surface area but each change is trivial) - -**Dependencies:** None — display-only. Each component can be renamed independently. - ---- - -### D4: Human-Readable Agent Directories Under User Root -**Value:** Instead of `~/.paperclip/` opaque config, Nexus stores everything under the user-chosen root directory with human-readable names. An agent called "Engineer" lives at `~/RaglanWork/agents/engineer/`. The user can `ls` their agent setup. - -**Why it matters:** Solo developers inspect their file system. Opaque hidden directories make tooling feel like a black box. Transparent directory layout builds trust and makes debugging obvious. - -**Complexity:** Medium (requires updating config resolution in CLI and server to respect `~/.nexus` pointer file; no DB changes) - -**Dependencies:** Zero-question onboarding (D1) — the root directory picker sets the base path. - ---- - -### D5: Nexus Branding Throughout -**Value:** Consistent logo, color, app name, tab title, CLI program name (`nexus` not `paperclipai`), and absence of any upstream branding. - -**Why it matters:** Every occurrence of "Paperclip" in a tool you use daily is a reminder that you are using someone else's thing. Branding the fork removes that friction. 
- -**Complexity:** Low (HTML `<title>`, favicon, logo asset swap, CLI binary name in `package.json`, help text strings) - -**Dependencies:** None — purely presentational. - ---- - -### D6: "Add Agent" Dialog with Template Dropdown -**Value:** The current upstream flow says "hire" an agent. Nexus replaces this with "Add Agent" with a dropdown of predefined templates (PM, Engineer, custom). Users pick a template and get a pre-configured agent immediately. - -**Why it matters:** The hiring metaphor forces users through a corporate onboarding flow. The template dropdown reduces the mental model to "pick what kind of agent you want." - -**Complexity:** Low (UI-only change to the dialog component; templates are static config) - -**Dependencies:** Predefined agent templates (D2) must be defined. - ---- - -## Anti-Features - -Things to deliberately NOT change in v1. Each has a reason. - -| Anti-Feature | Why Avoid | What to Do Instead | -|--------------|-----------|-------------------| -| Rename DB columns (`company_id`, `companies` table) | Breaks upstream rebase permanently; any `git rebase upstream/master` creates hundreds of conflicts with zero benefit | Accept the mismatch; translate at the display layer | -| Rename API routes (`/api/companies`) | UI already translates; server staying upstream-compatible means zero merge conflicts on route changes | Keep routes; update only the client-side labels | -| Rename TypeScript identifiers (`companyService`, `boardAuthService`) | Mechanical but enormous merge conflict surface; thousands of import statements | Leave unchanged; the identifier is not user-visible | -| Rename environment variables (`PAPERCLIP_*`) | Would break every existing deployment config and upstream docs | Keep env vars; update only the user-facing config documentation | -| Rename plugin API contracts (`company.created` events) | Would break any existing plugins silently | Leave event names unchanged; document the mismatch for plugin authors | -| Rename 
`.paperclip.yaml` export format | Would break import compatibility with upstream instances | Keep format; rename only the CLI command description, not the file format | -| Full Catppuccin Mocha theme | High visual complexity for v1; risk of breaking responsive layout | Treat as stretch goal; focus on vocabulary rename first | -| Multi-workspace support UI overhaul | The upstream multi-company feature already works; it's just renamed | Rename "Companies" → "Workspaces" in the switcher; don't rebuild the underlying logic | -| Telegram Channels integration | Separate project scope | Defer entirely | -| Recipe Registry plugin | Separate project scope | Defer entirely | -| MCP connector layer | Upstream adapter system already handles this via the adapter registry and process/http adapters | Do not add a new abstraction layer on top | -| Agent observability / tracing / OTEL | Enterprise-grade monitoring is overkill for a single-developer Mac Mini deployment | The upstream heartbeat logs + SSE updates are sufficient | +| Feature | Why Requested | Why Problematic | Alternative | +|---------|---------------|-----------------|-------------| +| "Sync memory to cloud" for personal assistant | Users want memory accessible across devices | Requires auth system, cloud storage, privacy policy, GDPR compliance — enormous scope for a personal tool | Local SQLite memory is sufficient for Mac Mini single-user; defer cloud sync to a future milestone | +| Automatic MCP server discovery | Users want zero-config MCP like Bluetooth discovery | MCP servers expose arbitrary capabilities; auto-discovery without user approval is a security risk | Curated list of common MCP servers (filesystem, git, web search) with one-click add; user approves each | +| Real-time provider cost display during chat | Visible per-message token cost feels responsive | Puter.js explicitly does not expose cost to developer (user-pays model); cost calculation would require hardcoding token prices that drift | Show 
estimated costs for API-key providers only; for Puter.js, show "costs charged to your Puter account" | +| Streaming TTS (word-by-word) | Reduces perceived latency of voice responses | Browser audio API makes true word-by-word streaming complex; sentence-by-sentence is the practical optimum | Buffer by sentence (split on `.!?`); start playing first sentence while next is synthesizing | +| Multi-user onboarding / team setup | Looks natural to "extend" to teams | Nexus is intentionally single-user (Mac Mini, local_trusted mode); team features require auth overhaul | Explicitly document single-user scope; defer team features until upstream Paperclip ships them | +| AI provider auto-negotiation (pick best available) | Transparent provider switching sounds smart | Silent model switches confuse users ("why did my assistant suddenly get dumber?"); debugging becomes impossible | Show active provider in UI always; let user set preferred priority order; never switch silently | --- ## Feature Dependencies ``` -D2 (Agent Templates) - → D1 (Zero-Question Onboarding) [onboarding auto-creates templates; templates must exist first] - → D6 (Add Agent Dialog w/ templates) [dropdown requires templates to be defined] +Hardware Detection + └──feeds──> Model Recommendation DB + └──feeds──> Local AI Setup (Ollama tier) -D1 (Zero-Question Onboarding) - → D4 (Human-Readable Directories) [root directory picker sets the base; directory layout flows from it] +Puter.js Integration + └──requires──> Puter account (user-side; not a Nexus dependency) + └──requires──> Client-side script inclusion (no server-side secrets) -D5 (Branding) [no dependencies; can ship independently] -D3 (Workspace Mental Model) [no dependencies; can ship incrementally per surface] +Personal AI Assistant Mode + └──requires──> Mode Selection (Personal / Project Builder / Both) + └──requires──> Persistent Memory Store (SQLite via existing DB) + └──requires──> Existing Chat Interface (v1.3 ChatPanel) [already built] + +MCP 
Connections + └──requires──> Personal AI Assistant Mode (MCP is an assistant-mode feature) + └──requires──> STDIO transport (Node.js child_process, already available in CLI) + +Voice (Piper TTS) + └──requires──> Existing Whisper STT (v1.3) [already built] + └──enhances──> Personal AI Assistant Mode + +Project Handoff + └──requires──> Personal AI Assistant Mode (conversation context exists there) + └──requires──> Existing PM Agent Template (v1.4) [already built] + └──requires──> Existing Project entity (upstream Paperclip) [already built] + └──requires──> LLM summarization call (any configured provider) + +npx buildthis + └──requires──> Existing CLI (Commander.js) [already built] + └──requires──> npm publish or private registry setup + +Google OAuth Cloud Tier + └──requires──> OAuth flow (Google Sign-In) + └──independent──> other provider tiers (each tier is additive) ``` -**Critical path:** D2 → D1 → D4. Templates first, then onboarding wizard, then directory structure. D3, D5, D6 can ship in any order alongside or after. +### Dependency Notes + +- **Persistent memory requires existing DB:** Paperclip already uses SQLite/Postgres; a `memory` table (key/value or embedding store) can be added. No ORM change needed if using raw SQL in a new file. +- **MCP requires assistant mode to be active:** MCP connections are scoped to the Personal AI Assistant mode, not the Project Builder. They should not be surfaced during project management workflows. +- **Hardware detection is a one-time onboarding concern:** Results should be cached; re-detection should be available in Settings but not re-run on every launch. +- **Puter.js has no server-side dependency:** The entire integration is client-side JavaScript. This is both a strength (zero backend changes) and a constraint (Puter auth happens in the browser, not on the Nexus server). --- -## MVP Recommendation +## MVP Definition -Prioritize in this order: +### Launch With (v1.5 Milestone) -1. 
**D2 — Predefined agent templates** (AGENTS.md, HEARTBEAT.md, SOUL.md, TOOLS.md for PM + Engineer) -2. **D1 — Zero-question onboarding** (rewrite wizard to use root dir + auto-create from templates) -3. **D3 — Workspace mental model rename** (systematic string pass across UI + CLI) -4. **D5 — Nexus branding** (logo, title, CLI binary name) -5. **D6 — Add Agent dialog** (template dropdown) -6. **D4 — Human-readable directories** (`.nexus` pointer file + root-relative paths) +Minimum viable set to validate the milestone goals. -**Defer to v2:** -- Full Catppuccin Mocha theme (stretch, high visual risk) -- Telegram integration (separate project) -- Recipe Registry (separate project) -- Any plugin API renames (breaks plugins) +- [ ] **Mode selection UI** — Personal AI / Project Builder / Both selector in onboarding + settings. Why essential: gates all assistant-specific features. +- [ ] **Hardware detection + model recommendation** — Detect RAM/VRAM, recommend Ollama model. Why essential: the primary UX claim of "smart onboarding." +- [ ] **Puter.js cloud tier** — Zero-config provider for users without local AI. Why essential: removes the "I have to install Ollama" barrier. +- [ ] **Personal AI Assistant chat with persistent memory** — Conversations that remember previous sessions. Why essential: defines the Personal AI Assistant mode as meaningfully different from existing chat. +- [ ] **Summary screen → straight into chat** — After onboarding completes, land in chat not dashboard. Why essential: closes the onboarding funnel. +- [ ] **Every step skippable** — Including hardware detection, cloud setup, MCP config. Why essential: PROJECT.md explicitly requires this. +- [ ] **Piper TTS** — Text-to-speech for assistant responses. Why essential: completes the voice loop that Whisper STT already started. + +### Add After Validation (v1.5.x) + +Features to add once core assistant mode is working. + +- [ ] **Project handoff** — "Turn this conversation into a project" button. 
Trigger: assistant mode is stable and used regularly. +- [ ] **MCP server connections** — Curated list with one-click add. Trigger: users request specific tool integrations. +- [ ] **Google OAuth cloud tier** — Gemini without API key. Trigger: Puter.js limitations surface (rate limits, cost surprises for users). +- [ ] **`npx buildthis` CLI entry point** — Zero-install UX. Trigger: sharing Nexus with others becomes a use case. + +### Future Consideration (v2+) + +Features to defer until post-v1.5. + +- [ ] **OpenAI OAuth tier** — OpenAI free tier via OAuth; rate limits are aggressive and UX is complex. +- [ ] **Subscription/API key auto-detection** — Scan environment for `ANTHROPIC_API_KEY`, etc. Low user value vs. complexity. +- [ ] **Memory export/import** — Portable memory across reinstalls. Needs file format design. +- [ ] **Multi-MCP orchestration** — Parallel MCP server calls, result merging. Enterprise complexity for personal tool. --- -## Confidence Assessment +## Feature Prioritization Matrix -| Area | Confidence | Notes | -|------|------------|-------| -| Table stakes features | HIGH | Derived directly from codebase analysis; features exist and are verified | -| Differentiator prioritization | MEDIUM | Based on Paperclip's own stated onboarding goals (product docs) + ecosystem research | -| Anti-feature list | HIGH | Based on explicit PROJECT.md constraints and merge-conflict risk analysis | -| UX claims (cognitive load, blank-canvas friction) | LOW | Reasonable inference from UX research but not validated against actual users | -| Complexity estimates | MEDIUM | Based on reading the codebase; no actual implementation attempted | +| Feature | User Value | Implementation Cost | Priority | +|---------|------------|---------------------|----------| +| Mode selection UI | HIGH | LOW | P1 | +| Hardware detection + model recommendation | HIGH | MEDIUM | P1 | +| Puter.js zero-config cloud | HIGH | MEDIUM | P1 | +| Persistent memory (SQLite) | HIGH | MEDIUM | P1 
| +| Summary screen → chat | HIGH | LOW | P1 | +| Every step skippable | HIGH | LOW | P1 | +| Piper TTS | MEDIUM | MEDIUM | P1 | +| Project handoff | HIGH | HIGH | P2 | +| MCP connections (curated) | MEDIUM | MEDIUM | P2 | +| Google OAuth cloud tier | MEDIUM | MEDIUM | P2 | +| `npx buildthis` | LOW | MEDIUM | P2 | +| OpenAI free tier OAuth | LOW | HIGH | P3 | +| API key auto-detection | LOW | MEDIUM | P3 | + +**Priority key:** +- P1: Must have for v1.5 launch +- P2: Should have, add in v1.5.x +- P3: Nice to have, v2+ + +--- + +## Competitor Feature Analysis + +| Feature | ChatGPT | Claude Projects | Bare Ollama | Nexus v1.5 Approach | +|---------|---------|-----------------|-------------|---------------------| +| Persistent memory | Yes (cloud) | Yes (project instructions) | No | SQLite local; no cloud required | +| Hardware-aware setup | No | No | No | Pre-built model database; auto-recommend | +| Zero-config cloud | No (API key) | No (API key) | N/A | Puter.js user-pays model | +| Local/offline operation | No | No | Yes (manual) | Ollama + Piper + Whisper; fully offline | +| Voice I/O | Yes (cloud) | No | No | Whisper STT (existing) + Piper TTS (new) | +| Tool connections | Yes (plugins) | Yes (Projects) | No | MCP servers (curated list) | +| Project handoff | No | Partial (copy-paste) | No | One-button conversation → PM agent | +| Mode switching | No | No | No | Personal AI / Project Builder / Both | + +--- + +## Provider Tier Architecture + +The onboarding should present providers as a tiered funnel, not a flat list. 
Users land in the highest-comfort tier: + +``` +Tier 0: Already have Hermes / Claude Code / OpenClaw running + └──detect via env vars or local port scan──> skip straight to summary + +Tier 1: Local AI (most private, no cost) + └──Ollama installed?──> detect models, recommend based on hardware + └──Ollama not installed?──> show install prompt with one-liner + +Tier 2: Zero-config cloud (easiest, user-pays) + └──Puter.js──> "Sign in with Puter" → 500+ models, no API key + └──User creates/logs into free Puter account + +Tier 3: OAuth cloud (Google account required, free quota) + └──Google Gemini──> OAuth flow → Gemini 2.0 Flash free tier + └──Free tier as of 2026: reduced but functional (Gemini 2.0 Flash) + +Tier 4: API key / subscription + └──Hermes (existing) + └──Claude Code (ANTHROPIC_API_KEY) + └──OpenClaw (custom) + └──OpenAI (OPENAI_API_KEY) +``` + +**Key insight:** Users should be steered toward Tier 0 or 1 first (most private, most robust for single-user Mac Mini). Puter.js (Tier 2) is the escape hatch for users who won't install Ollama, not the default recommendation. + +--- + +## Puter.js Integration Notes + +**Confidence:** MEDIUM — confirmed working from official docs, but production reliability and rate limit specifics are not publicly documented. + +- Integration is entirely client-side: `<script src="https://js.puter.com/v2/"></script>` then `puter.ai.chat(model, message)` +- Supports 500+ models including GPT-4.1, Claude Sonnet 4, Gemini 2.5 Flash, Llama 3.x +- User authenticates against Puter (free account); developer incurs zero cost +- Rate limits: not publicly documented; Puter says "no restrictions" but this is unverified at scale +- Limitation: requires user to create/have a Puter account — this is friction vs. 
"truly zero-config" +- Risk: Puter's pricing model is described as "still being worked out" — future cost surprises for users possible +- Mitigation: Show clear messaging that Puter costs are the user's own account costs, not Nexus costs + +--- + +## Hardware Detection Implementation Notes + +**Confidence:** HIGH — patterns well-established across Ollama, LM Studio, llm-checker. + +Detection sources (Node.js server-side, run once at onboarding): +1. `os.totalmem()` — system RAM (always available) +2. Spawn `nvidia-smi --query-gpu=memory.total --format=csv,noheader` — NVIDIA VRAM +3. `system_profiler SPDisplaysDataType` (macOS) — Apple Silicon unified memory +4. Ollama `/api/tags` endpoint — detect already-running models +5. `/proc/driver/nvidia/gpus/` (Linux) — alternative NVIDIA detection + +Model recommendation lookup table (simplified): +``` +CPU-only / <8GB RAM: phi3:mini (3.8B), llama3.2:1b +8-16GB RAM: llama3.2:3b, mistral:7b, phi3:medium +16-24GB unified: llama3.1:8b, qwen2.5:7b +24GB+ unified / GPU: llama3.1:70b (quantized), qwen2.5:32b +``` + +--- + +## Persistent Memory Implementation Notes + +**Confidence:** MEDIUM — standard pattern, but the specific storage mechanism in Paperclip's DB needs verification. + +Standard patterns in production personal AI assistants: +1. **Summary-based memory:** After each conversation, run an LLM call to extract key facts → store as `memory` rows. On next conversation, inject relevant memories into system prompt. +2. **Verbatim storage:** Store full conversation history, retrieve last N messages or vector-search for relevant passages. +3. **Hybrid:** Store both summaries (for long-term preferences) and recent verbatim context (for continuity). + +Recommended for Nexus: Summary-based for long-term memory (preferences, ongoing projects, user facts) + last 10 messages as verbatim context. Avoids needing a vector database. Uses existing SQLite schema with a new `assistant_memories` table. 
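The recommended memory scheme (summaries plus the last 10 messages) can be sketched as a pure context-assembly step. This is a minimal illustration, not the Nexus implementation: `MemoryRow`, `ChatMessage`, and `buildAssistantContext` are hypothetical names, and the rows are assumed to have already been read from the proposed `assistant_memories` table.

```typescript
// Hypothetical row shape for the proposed `assistant_memories` table.
interface MemoryRow {
  fact: string;
}

// Hypothetical chat message shape; the real ChatPanel types may differ.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Verbatim window size taken from the recommendation above.
const VERBATIM_WINDOW = 10;

// Inject summarized facts into the system prompt and keep only the most
// recent messages as verbatim context.
function buildAssistantContext(
  basePrompt: string,
  memories: MemoryRow[],
  history: ChatMessage[],
): { systemPrompt: string; recentMessages: ChatMessage[] } {
  const facts = memories.map((m) => `- ${m.fact}`).join("\n");
  const systemPrompt =
    facts.length > 0
      ? `${basePrompt}\n\nKnown facts about the user:\n${facts}`
      : basePrompt;
  return { systemPrompt, recentMessages: history.slice(-VERBATIM_WINDOW) };
}
```

Keeping this step free of I/O means the summarization call and the storage layer can change without touching how the prompt is assembled.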
+ +**MCP-compatible storage:** The MCP memory pattern (used by Penfield, mcp-memory-service) stores memories as MCP tool call results — same summary pattern, just with MCP as the transport. Nexus does not need to implement MCP just for memory; MCP is for external tool connections. + +--- + +## Voice Architecture Notes + +**Confidence:** MEDIUM — Piper confirmed CPU-capable and fast on Apple Silicon; full pipeline integration complexity is estimated, not measured. + +Pipeline for full voice I/O: +``` +Microphone → MediaRecorder (browser) → Whisper (existing, v1.3) → LLM (any provider) + ↓ +Speaker ← Web Audio API ← Piper TTS (new) ← Text response ←────────────┘ +``` + +Piper TTS: +- Open-source (rhasspy/piper), MIT license +- Runs on CPU; Apple Silicon M4 handles it in real-time +- Node.js integration: spawn `piper` binary with text via stdin, read WAV from stdout +- Voice models: compact (few MB) per language/voice; ship one English voice as default +- Streaming: buffer by sentence for lower perceived latency (start playing sentence 1 while sentence 2 synthesizes) + +Whisper is already integrated (v1.3). Piper adds the TTS half to complete the loop. --- ## Sources -- Paperclip codebase analysis: `/Volumes/UsbNvme/agent/.planning/codebase/ARCHITECTURE.md` -- Project context: `/Volumes/UsbNvme/agent/.planning/PROJECT.md` -- [Paperclip GitHub README](https://github.com/paperclipai/paperclip) -- [Paperclip AI Review (The 4th Path, 2026)](https://www.the4thpath.com/2026/03/paperclip-ai-review-if-agents-are.html) -- [Paperclip Review 2026 — AI Agent Teams as Companies (VibeCoding)](https://vibecoding.app/blog/paperclip-review) -- [What Is an AI Agent Orchestration Platform? 
(Teneo, 2026)](https://www.teneo.ai/blog/what-is-an-ai-agent-orchestration-platform-benefits-features-use-cases-2026) -- [Designing For Agentic AI: Practical UX Patterns (Smashing Magazine, 2026)](https://www.smashingmagazine.com/2026/02/designing-agentic-ai-practical-ux-patterns/) -- [AI Agent Monitoring: Best Practices, Tools, and Metrics (UptimeRobot, 2026)](https://uptimerobot.com/knowledge-hub/monitoring/ai-agent-monitoring-best-practices-tools-and-metrics/) -- [Learnings From Forking an Open Source Project (Echobind)](https://echobind.com/post/learnings-from-forking-an-open-source-project) -- [Top 5 AI Agent Observability Platforms (o-mega, 2026)](https://o-mega.ai/articles/top-5-ai-agent-observability-platforms-the-ultimate-2026-guide) +- [Puter.js Free AI API (developer.puter.com)](https://developer.puter.com/tutorials/free-unlimited-ai-api/) +- [Puter.js Free LLM API (developer.puter.com)](https://developer.puter.com/tutorials/free-llm-api/) +- [Puter User-Pays Model (docs.puter.com)](https://docs.puter.com/user-pays-model/) +- [Ollama Hardware Detection and GPU Support (deepwiki.com)](https://deepwiki.com/ollama/ollama/6-gpu-and-hardware-support) +- [Ollama VRAM Requirements 2026 (localllm.in)](https://localllm.in/blog/ollama-vram-requirements-for-local-llms) +- [AI Hardware Guide 2026 (localaimaster.com)](https://localaimaster.com/blog/ai-hardware-requirements-2025-complete-guide) +- [Model Context Protocol Wikipedia](https://en.wikipedia.org/wiki/Model_Context_Protocol) +- [MCP for Persistent Memory (medium.com)](https://medium.com/mynextdeveloper/how-to-set-up-model-context-protocol-mcp-for-persistent-memory-in-your-ai-app-9c2f819f5c21) +- [Piper TTS GitHub (rhasspy/piper)](https://github.com/rhasspy/piper) +- [Voice Chat with Local LLMs: Whisper + TTS (insiderllm.com)](https://www.insiderllm.com/guides/voice-chat-local-llms-whisper-tts/) +- [Google Gemini API Free Tier 2026 
(aifreeapi.com)](https://www.aifreeapi.com/en/posts/google-gemini-api-free-tier) +- [Google Gemini OAuth via Opencode (syntackle.com)](https://syntackle.com/blog/google-gemini-ai-subscription-with-opencode/) +- [AI Handoff Patterns in Multi-Agent Systems (towardsdatascience.com)](https://towardsdatascience.com/how-agent-handoffs-work-in-multi-agent-systems/) +- [Building an NPX CLI Tool (johnsedlak.com)](https://johnsedlak.com/blog/2025/03/building-an-npx-cli-tool) +- [Postman Onboarding UX Lessons (candu.ai)](https://www.candu.ai/blog/postman-onboarding-ux-lessons) + +--- +*Feature research for: Nexus v1.5 Smart Onboarding + Personal AI Assistant* +*Researched: 2026-04-02* diff --git a/.planning/research/PITFALLS.md b/.planning/research/PITFALLS.md index b3285aab..b6569f12 100644 --- a/.planning/research/PITFALLS.md +++ b/.planning/research/PITFALLS.md @@ -1,277 +1,636 @@ # Domain Pitfalls — Nexus Fork of Paperclip **Domain:** Forked open-source project with display-layer renames, no i18n layer -**Researched:** 2026-03-30 -**Confidence:** HIGH — based primarily on direct codebase analysis of `/Volumes/UsbNvme/repos/nexus` via CONCERNS.md, supplemented by fork maintenance community research +**Researched:** 2026-04-02 (updated for v1.5 milestone: smart onboarding, multi-provider, voice TTS, persistent memory, assistant mode, `npx buildthis`) +**Confidence:** HIGH — based on direct codebase analysis of `/opt/nexus/` plus targeted research on each new integration domain --- -## Critical Pitfalls +## About This Document -Mistakes that cause data loss, broken upstream rebase, or irreversible divergence. +This file covers pitfalls for the **v1.5 milestone additions**. The original pitfalls (Pitfalls 1–11) covering fork hygiene, display-layer rename discipline, and upstream sync remain valid and are preserved below. Pitfalls 12–26 are new for v1.5. 
+ +--- + +## Critical Pitfalls (Fork Hygiene — v1.0–1.4, still active) --- ### Pitfall 1: Renaming a Code Identifier That Is Also a Stored DB Value -**What goes wrong:** You rename a TypeScript constant, CLI command, or function to use the new Nexus vocabulary, not realising the same string is also stored as a literal value in database rows. The app breaks for any existing installation because the server checks `approval.type === "hire_agent"` but the DB still has `"hire_agent"` rows. Or worse: you change the constant on one side (server) but not the other (CLI) and the two sides silently disagree. +**What goes wrong:** You rename a TypeScript constant, CLI command, or function to use the new Nexus vocabulary, not realising the same string is also stored as a literal value in database rows. The app breaks for any existing installation because the server checks `approval.type === "hire_agent"` but the DB still has `"hire_agent"` rows. -**Why it happens:** In Paperclip the same string serves double duty: it is both a TypeScript constant/enum and a persisted DB value. The CONCERNS.md audit identifies these dual-purpose strings explicitly: -- `"ceo"` — stored in `agents.role` column AND used in TypeScript `AGENT_ROLES` array -- `"hire_agent"` — stored in `approvals.type` column AND checked in route handlers -- `"approve_ceo_strategy"` — stored in `approvals.type` column AND displayed in `ApprovalPayload.tsx` -- `"bootstrap_ceo"` — stored in `invites.invite_type` column AND checked in `InviteLanding.tsx` -- `"company"` — stored as a value in `goals.level` column AND used as a string literal in constants -- `"board"` — stored in `cli_auth_challenges.requested_access` column AND used in auth middleware +**Why it happens:** In Paperclip the same string serves double duty: it is both a TypeScript constant/enum and a persisted DB value. 
The CONCERNS.md audit identifies these dual-purpose strings explicitly: `"ceo"`, `"hire_agent"`, `"approve_ceo_strategy"`, `"bootstrap_ceo"`, `"company"` in goal levels, `"board"` in auth challenges. -**Consequences:** Silent data incompatibility on existing installations. New rows written with the renamed value, old rows still have the old value. Code that does `WHERE type = $new_value` misses all old rows. A fresh install works; an existing install silently loses data or shows empty lists. +**How to avoid:** +1. Treat every string in the Summary Risk Table (CONCERNS.md) marked "Critical" as immutable. +2. For display renaming only: change label maps (`AGENT_ROLE_LABELS`, `ApprovalPayload` display maps) without touching the underlying constant value. +3. Before touching any string, grep for it in `packages/db/src/schema/` and migration files. -**Prevention:** -1. Treat every string in the Summary Risk Table (CONCERNS.md) marked "Critical" as immutable. Do not rename them, even in display contexts, without a data migration. -2. For display renaming only: change the label map (`AGENT_ROLE_LABELS`, `ApprovalPayload` display maps) without touching the underlying constant value. Rename `ceo: "CEO"` to `ceo: "Project Manager"` — the key `ceo` stays, the display label changes. -3. Before touching any string, grep for it in the schema directory (`packages/db/src/schema/`) and migration files. If it appears there, it is a stored value, not just a display string. 
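The label-map approach in step 2 can be sketched as follows. This is an illustrative reduction, not the actual contents of `constants.ts`: the `"engineer"` role value and the `roleLabel` helper are assumptions added for the example, while `AGENT_ROLES`, `AGENT_ROLE_LABELS`, and the `ceo` key come from the audit above.

```typescript
// Stored values: these strings are persisted in the agents.role column,
// so they stay exactly as upstream defines them.
// ("engineer" is an assumed second role, included only for illustration.)
const AGENT_ROLES = ["ceo", "engineer"] as const;
type AgentRole = (typeof AGENT_ROLES)[number];

// Display labels: the only place the Nexus vocabulary appears.
// The key `ceo` is unchanged; only the label differs from upstream.
const AGENT_ROLE_LABELS: Record<AgentRole, string> = {
  ceo: "Project Manager",
  engineer: "Engineer",
};

// Hypothetical helper: UI code asks for a label and never hardcodes one.
function roleLabel(role: AgentRole): string {
  return AGENT_ROLE_LABELS[role];
}
```

Because queries and route handlers still compare against `"ceo"`, existing rows keep matching and the rebase surface is limited to the label map.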
+**Warning signs:** +- Any string appearing in `packages/db/src/schema/` or migration files +- Approval, invite, and goal lists empty on existing install but work on fresh install -**Detection (warning signs):** -- Any string that also appears in `packages/db/src/schema/` or `packages/db/src/migrations/` is a stored value -- Approval, invite, and goal lists that show empty on an existing install but work on a fresh install -- TypeScript constants in `APPROVAL_TYPES`, `INVITE_TYPES`, `GOAL_LEVELS`, `AGENT_ROLES` — these feed directly into DB queries - -**Phase:** Phase 1 (Display Rename). Must be resolved before any rename touches these identifiers. +**Phase to address:** Phase 1 (Display Rename) --- ### Pitfall 2: Treating "Display-Only Rename" as a Simple Find-Replace -**What goes wrong:** You run a bulk `sed` or IDE find-replace on "company" → "workspace" across the entire codebase to get the strings right fast. The rename touches service files, route files, schema files, and test files indiscriminately. The next `git rebase upstream/master` has conflicts on hundreds of files, most of which were upstream-compatible before. +**What goes wrong:** Bulk `sed` or IDE find-replace on "company" → "workspace" across the entire codebase. Touches service files, route files, schema files, and test files indiscriminately. The next `git rebase upstream/master` has conflicts on hundreds of files. -**Why it happens:** "Display-only" is a *policy* decision, not a property the codebase enforces. Nothing in the TypeScript source distinguishes a user-facing label string from an internal identifier. Both are just string literals. A naive find-replace cannot tell `<h1>Company Settings</h1>` (display — safe to rename) from `companyService()` (code identifier — must not be renamed) from `"company"` in `GOAL_LEVELS` (stored DB value — renaming breaks data). +**Why it happens:** "Display-only" is a policy decision, not a property the codebase enforces. 
Nothing in the TypeScript source distinguishes a user-facing label string from an internal identifier. -**Consequences:** Blown upstream sync. Every file that had `company` as a code identifier now has a conflict on rebase. The entire maintenance advantage of display-only renaming is lost. Recovering requires reverting the bulk rename and redoing it file-by-file. +**How to avoid:** +1. Establish a strict three-zone taxonomy: Zone A (display strings, safe), Zone B (code identifiers, do not rename), Zone C (dual-purpose stored values, label map only). +2. Never run a global find-replace. Work file-by-file. -**Prevention:** -1. Establish a strict three-zone taxonomy before touching any string: - - **Zone A — Display strings**: JSX text nodes, `p.log()` CLI output, Markdown prose in onboarding assets, comment text. These are in scope. - - **Zone B — Code identifiers**: TypeScript variable names, function names, class names, file names, import paths, package names. These are OUT of scope. - - **Zone C — Dual-purpose stored values**: strings that are both code constants and stored in the DB (see Pitfall 1). OUT of scope for value; label-map only for display. -2. Never run a global find-replace. Work file-by-file with the zone taxonomy applied per file. -3. When unsure, ask: "Would upstream Paperclip have to change this file to fix a bug?" If yes, minimise changes to it. +**Warning signs:** +- PR diff touching `server/src/services/`, `server/src/routes/`, or `packages/db/` with rename changes +- Diff showing TypeScript identifier name changes (not JSX string literals) -**Detection (warning signs):** -- A PR diff that touches `server/src/services/`, `server/src/routes/`, or `packages/db/` with rename changes is a red flag -- A diff that shows changes to TypeScript identifier names (not string literals in JSX) is a Zone B violation -- Rebase producing conflicts in files not intentionally modified by Nexus - -**Phase:** Phase 1 (Display Rename). 
The zone taxonomy must be documented and applied from the first commit. +**Phase to address:** Phase 1 (Display Rename) --- ### Pitfall 3: Diverging the Onboarding Assets Directory Name From Upstream -**What goes wrong:** You rename the `server/src/onboarding-assets/ceo/` directory to `server/src/onboarding-assets/pm/` (or similar) to match the new PM vocabulary. Upstream changes a file inside `ceo/` in a future commit. `git rebase` cannot reconcile a file renamed on one side with a content edit on the other — it presents as a delete/modify conflict and the upstream change is silently dropped. +**What goes wrong:** Renaming `server/src/onboarding-assets/ceo/` to `pm/`. Upstream changes a file inside `ceo/` in a future commit. Git cannot reconcile rename-on-one-side with content-edit-on-other. -**Why it happens:** Git rename detection is heuristic. When you rename a directory AND upstream edits a file within that directory, git frequently misidentifies this as "deleted old file + created new file" rather than "renamed file + edited renamed file." The merge resolves by keeping your renamed version and discarding upstream's content edit. +**How to avoid:** Do not rename the `ceo/` directory. Change file *content* only. The directory path is Zone B. -**Consequences:** You silently miss upstream improvements to agent instructions. If upstream fixes a security or correctness issue in the default agent template, your fork never gets it. +**Warning signs:** Rebase conflict shows a file as "deleted" that you expected to be "modified." -**Prevention:** -1. Do not rename the `ceo/` directory. Keep the directory path as `onboarding-assets/ceo/` in the filesystem. Only change the file *content* (the Markdown prose that says "You are the CEO"). -2. The directory name `ceo/` is an internal asset path loaded by `default-agent-instructions.ts` — it is Zone B. The prose inside `SOUL.md`, `AGENTS.md`, `HEARTBEAT.md` is Zone A. -3. 
If a directory rename is truly necessary, document it explicitly and set up a post-rebase hook that verifies the content was not silently dropped. - -**Detection (warning signs):** -- Rebase conflict shows a file as "deleted" that you expected to be "modified" -- Upstream changelog mentions onboarding asset changes but your fork's onboarding assets are unchanged after rebase - -**Phase:** Phase 1 (Onboarding Redesign). Address before modifying any asset file. +**Phase to address:** Phase 1 (Onboarding Redesign) --- ### Pitfall 4: Changing the `localStorage` Key or `~/.paperclip` Config Path Without a Migration -**What goes wrong:** The UI stores the selected company/workspace ID in `localStorage` under the key `"paperclip.selectedCompanyId"` (identified in `CompanyContext.tsx`). If you rename this key to `"nexus.selectedWorkspaceId"`, every existing browser session loses its selected workspace on next load. Similarly, if `~/.paperclip` config path is changed to `~/.nexus` without migrating existing data, the server starts as if it were a fresh install, losing all existing agents, API keys, and worktrees. +**What goes wrong:** Renaming `"paperclip.selectedCompanyId"` localStorage key or `~/.paperclip` config path drops all existing state. -**Why it happens:** These are persisted-state keys — they survive across deploys. Unlike code, they cannot be "renamed" by changing source; existing data already written under the old key must be read and migrated or the old key must continue to be read as a fallback. +**How to avoid:** Keep key names unchanged OR implement a read-both-paths fallback that migrates existing values on boot before deleting the old key. -**Consequences:** On `~/.paperclip` rename: complete data loss for the running installation. All agents, projects, API keys, and worktrees appear to vanish. On `localStorage` key rename: users are logged out of the UI on next load (minor but disorienting). 
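For the `localStorage` case, the read-both fallback might look like the sketch below. The two key names are the ones discussed above; the `KeyValueStore` interface and `readSelectedWorkspaceId` are hypothetical, introduced so the logic can run against `window.localStorage` in the browser or an in-memory stand-in in tests.

```typescript
// Minimal subset of the Web Storage interface that the migration needs.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const LEGACY_KEY = "paperclip.selectedCompanyId";
const NEW_KEY = "nexus.selectedWorkspaceId";

// Prefer the new key; otherwise migrate the legacy value before removing it.
function readSelectedWorkspaceId(store: KeyValueStore): string | null {
  const current = store.getItem(NEW_KEY);
  if (current !== null) return current;

  const legacy = store.getItem(LEGACY_KEY);
  if (legacy === null) return null;

  // Write the new key first so a failure in between never loses the value.
  store.setItem(NEW_KEY, legacy);
  store.removeItem(LEGACY_KEY);
  return legacy;
}
```

The same ordering (read new, fall back to old, write new, then delete old) would apply to a `~/.nexus` versus `~/.paperclip` config lookup.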
+**Warning signs:** Server logs "no config found, starting fresh" on a machine with existing data. -**Prevention:** -1. For `~/.paperclip`: Keep the default path OR implement a read-both-paths fallback (check `~/.nexus` first, fall back to `~/.paperclip`, emit a deprecation log). The `~/.nexus` pointer-file mechanism described in PROJECT.md should write to `~/.nexus` but read from `~/.paperclip` if `~/.nexus` does not exist. -2. For `localStorage`: Either keep the key name `"paperclip.selectedCompanyId"` (it is internal, users never see it), or write a migration on app boot that reads the old key and writes the new key before deleting the old one. -3. Treat `PAPERCLIP_*` environment variable names as immutable for the same reason — existing Docker configs and systemd units use them. - -**Detection (warning signs):** -- After deploy, server logs show "no config found, starting fresh" on a machine with existing data -- UI shows empty workspace list on first load after deploy -- `docker-compose.untrusted-review.yml` still references `PAPERCLIP_HOME` after an env var rename - -**Phase:** Phase 2 (Directory Restructure / `~/.nexus` Pointer). Must have migration or fallback before shipping. +**Phase to address:** Phase 2 (Directory Restructure) --- ### Pitfall 5: Upstream Rebase Cadence Slipping Below Weekly -**What goes wrong:** The fork is deployed and working. A busy week becomes two, then a month. Upstream ships 15 commits. Now the rebase involves resolving conflicts in files you modified for display renames AND new logic added upstream to the same files. What was a 10-minute weekly rebase becomes a 4-hour archaeology session. This compounds: the next month is even harder. +**What goes wrong:** Fork drift. Upstream has 120+ commits since fork. Waiting accumulates compound conflicts. A 10-minute weekly rebase becomes 4 hours after a month gap. -**Why it happens:** Fork drift is non-linear. 
Each upstream commit that touches a file you also modified adds another conflict to resolve. When upstream commits accumulate faster than you rebase, the conflict count grows faster than linearly because upstream changes begin to interact with each other in ways that are opaque without context. +**How to avoid:** Rebase at minimum weekly. `[nexus]` commit prefix strictly enforced. CI alert on `git rebase upstream/master` failures in a test branch. -**Consequences:** Either you stop rebasing (fork permanently diverges, missing security patches and new features) or you spend disproportionate time on merge archaeology. Community research confirms: "initial updates took minutes; later attempts required an hour or two." +**Warning signs:** Last rebase more than 2 weeks ago; `git log HEAD..upstream/master` shows more than 20 upstream commits unmerged. -**Prevention:** -1. Rebase against `upstream/master` at minimum weekly, ideally on a fixed schedule (e.g., every Sunday). -2. Keep a `[nexus]` commit prefix convention strictly — every Nexus-specific commit is prefixed. This makes it trivial to identify which commits are yours vs. rebased upstream commits during conflict resolution. -3. Run a CI check (even a local cron) that attempts `git rebase upstream/master` on a test branch and alerts on failure. Catch conflicts before they accumulate. -4. If an upstream commit touches a file you have also modified, resolve it immediately rather than deferring. - -**Detection (warning signs):** -- Last rebase was more than 2 weeks ago -- `git log upstream/master..HEAD` shows more than 20 upstream commits unmerged -- Rebase produces conflicts in more than 5 files at once - -**Phase:** Ongoing. Establish cadence in Phase 1; automate alert in Phase 2.
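The warning-sign thresholds for rebase cadence can be expressed as a small check that a cron job or CI step could call. This is a sketch under assumptions: `DriftInput` and `driftWarnings` are hypothetical names, and the commit count is assumed to come from something like `git rev-list --count HEAD..upstream/master`, which counts commits reachable from upstream but not yet in the fork.

```typescript
// Hypothetical input gathered by the caller (for example, from git output).
interface DriftInput {
  daysSinceLastRebase: number;
  unmergedUpstreamCommits: number;
}

// Thresholds taken from the warning signs: 2 weeks, 20 commits.
const MAX_DAYS_SINCE_REBASE = 14;
const MAX_UNMERGED_COMMITS = 20;

// Return one human-readable warning per threshold that is exceeded.
function driftWarnings(input: DriftInput): string[] {
  const warnings: string[] = [];
  if (input.daysSinceLastRebase > MAX_DAYS_SINCE_REBASE) {
    warnings.push(`Last rebase was ${input.daysSinceLastRebase} days ago`);
  }
  if (input.unmergedUpstreamCommits > MAX_UNMERGED_COMMITS) {
    warnings.push(`${input.unmergedUpstreamCommits} upstream commits are unmerged`);
  }
  return warnings;
}
```

A non-empty result is the signal to rebase now instead of deferring to the next scheduled slot.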
+**Phase to address:** Ongoing from Phase 1 --- -## Moderate Pitfalls +### Pitfall 6: Renaming the CLI Binary Name Without a Shim ---- +**What goes wrong:** Renaming to `nexus` without updating all four locations where `paperclipai` appears as an instructional string. -### Pitfall 6: Renaming the CLI Binary Name (`paperclipai` → `nexus`) Without a Shim +**How to avoid:** Add `nexus` as an alias; keep `paperclipai` binary working. If renaming, atomic commit covering all instructional copy. -**What goes wrong:** The CLI binary is currently invoked as `pnpm paperclipai run`. The UI (`App.tsx`, `startup-banner.ts`) renders the literal string `pnpm paperclipai auth bootstrap-ceo` as instructional copy. If you rename the binary to `nexus` but forget to update every UI string that mentions `paperclipai`, users see a mix of `nexus` and `paperclipai` commands in the UI, causing confusion and failed copy-paste attempts. - -**Why it happens:** The binary name appears in at least four distinct locations: `package.json` bin entry, `startup-banner.ts`, `App.tsx`, and `onboard.ts` terminal output. These are not linked by a constant. Changing the binary name in `package.json` alone does not update the rendered copy. - -**Prevention:** -1. Inventory every occurrence of `paperclipai` as a user-facing command string (not package name) before renaming. -2. Consider keeping the binary named `paperclipai` and adding a `nexus` alias, so existing muscle memory and documented commands continue to work. The alias can be the primary name in Nexus docs while `paperclipai` continues to work. -3. If renaming, treat it as an atomic change: rename binary, update all instructional strings, update docs, and test the smoke tests in one commit. - -**Detection (warning signs):** -- `startup-banner.ts` still says `paperclipai` after binary rename -- `ui/src/pages/App.tsx` shows mixed command names - -**Phase:** Phase 1 (CLI String Updates). 
+**Phase to address:** Phase 1 (CLI String Updates) --- ### Pitfall 7: Partial Rename — Changing Some Occurrences But Not All -**What goes wrong:** You rename "CEO" → "Project Manager" in the `OnboardingWizard.tsx` default task description and the `AGENT_ROLE_LABELS` constant, but miss the `DEFAULT_TASK_DESCRIPTION` which starts "You are the CEO." You also miss `InviteLanding.tsx` which checks `invite.inviteType === "bootstrap_ceo"` and renders "Bootstrap your Paperclip instance." Users see a mix of "CEO" and "Project Manager" in different parts of the UI. +**What goes wrong:** "CEO" renamed in 8 of 12 files. Users see mixed vocabulary. -**Why it happens:** With no i18n layer, there is no single source of truth for any display string. "CEO" appears in at least 12 distinct files. A partial search (only checking one or two obvious files) will miss the rest. There is no compile-time check that a string has been fully replaced. +**How to avoid:** Post-rename `grep -ri "CEO" ui/src cli/src server/src` and verify every remaining occurrence is Zone B/C or non-user-visible. -**Consequences:** Inconsistent vocabulary in the product. Users see "Project Manager" on the dashboard and "CEO" in the invite flow and onboarding wizard. This degrades trust in the product. - -**Prevention:** -1. Before declaring a rename complete, run a case-insensitive `grep -r "CEO" ui/src cli/src server/src` and verify that every remaining occurrence is either: (a) intentionally kept (Zone B/C), or (b) not user-visible (e.g., an internal comment). -2. Maintain a rename checklist in `.planning/` that tracks each term and its known locations. Check off each location as it is addressed. -3. After each phase, do a full-corpus string audit for any target terms that should have been renamed. - -**Detection (warning signs):** -- grep of the target term still returns JSX text nodes after the rename commit -- Onboarding flow or invite page still shows old vocabulary - -**Phase:** Phase 1 (Display Rename). 
Checklist needed before Phase 1 is marked complete. +**Phase to address:** Phase 1 (Display Rename) --- ### Pitfall 8: The `[nexus]` Commit Prefix Not Applied Consistently From the Start -**What goes wrong:** Early commits are made without the `[nexus]` prefix convention. Later, when rebasing, you cannot easily distinguish "these are our changes, apply them on top of new upstream" from "this is an upstream commit we already rebased." You end up with duplicate commits or missing commits. +**What goes wrong:** Without consistent prefixing, rebase archaeology becomes necessary to identify which commits are Nexus vs. upstream. -**Why it happens:** The prefix convention feels optional at the start when there are only a few commits. Once there are 30+ commits, inconsistent prefixing means manual archaeology to reconstruct which commits are yours. +**How to avoid:** Pre-commit hook rejecting messages not starting with `[nexus]` from the first commit. -**Prevention:** -1. Apply `[nexus]` prefix from the very first commit in the fork. -2. Add a pre-commit hook that rejects commits whose message does not start with `[nexus]` or `[upstream]` (or an equivalent marker). -3. Periodically run `git log --oneline HEAD` and verify every Nexus commit has the prefix. - -**Detection (warning signs):** -- Any commit without `[nexus]` prefix in the fork's log -- Difficulty answering "which commits are mine?" during a rebase - -**Phase:** Phase 1. The hook should be in place before the first Nexus commit. +**Phase to address:** Phase 1 (First commit) --- ### Pitfall 9: Onboarding Redesign Coupled to the Corporate Metaphor in Data Layer -**What goes wrong:** The new onboarding flow (root directory picker, auto-create PM + Engineer) is implemented by calling the existing `companiesApi.create()` endpoint. But the wizard's UI variables are all named `companyName`, `companyGoal`, and the new onboarding flow does not pass a "company name" at all (the user picks a directory, not a name). 
If you rename the variables in the wizard without considering what the API expects, the API call sends an empty or undefined `name` field, and the company is created with no name. +**What goes wrong:** New wizard does not pass a company name; `POST /api/companies` requires it. Company created with undefined name. -**Why it happens:** The onboarding redesign changes the *UX flow* (fewer steps, different inputs) but the *API shape* has not changed. The mismatch between "user provides a directory path" and "API requires a company name" must be explicitly resolved — probably by deriving the workspace name from the directory basename. +**How to avoid:** Document API contract before redesigning wizard. Derive workspace name from directory basename (or VOCAB.appName as fallback — which `NexusOnboardingWizard.tsx` already does correctly). -**Prevention:** -1. Document the API contract (`POST /api/companies` body shape) before redesigning the wizard. Identify every required field. -2. For fields no longer collected from the user (company name), define a derivation rule (e.g., `basename(rootDir)`) and implement it explicitly rather than relying on defaults. -3. Test the onboarding flow with a fresh database to verify no required field is silently undefined. - -**Detection (warning signs):** -- Workspace created with an empty name after the new onboarding flow -- API 422 errors in the network tab after submitting the redesigned onboarding form - -**Phase:** Phase 2 (Onboarding Redesign). - ---- - -## Minor Pitfalls +**Phase to address:** Phase 2 (Onboarding Redesign) --- ### Pitfall 10: Forgetting to Update Tests That Assert on Display Strings -**What goes wrong:** `server/src/__tests__/invite-onboarding-text.test.ts` likely asserts that invite text contains "CEO." After renaming "CEO" to "Project Manager" in the display layer, the test fails. 
This is the correct outcome — the test needs to be updated — but if you do not notice it, you either ship a failing test suite or (worse) you revert the display rename to make tests pass. +**What goes wrong:** `invite-onboarding-text.test.ts` asserts invite text contains "CEO." After rename, tests fail. -**Why it happens:** Tests that assert on display strings are fragile to any vocabulary change. There is no way to know from the source that `invite-onboarding-text.test.ts` contains "CEO" assertions without reading it. +**How to avoid:** Before any rename commit, grep all `*.test.ts` files for old vocabulary terms and update in the same commit. -**Prevention:** -1. Before any rename commit, run `grep -r "CEO\|company\|board\|hire\|fire\|paperclip" --include="*.test.ts" cli/src server/src ui/src` to find all test files that will need updating. -2. Update the relevant tests in the same commit as the display string change — not in a follow-up commit. - -**Detection (warning signs):** -- CI fails on a test whose name contains "invite", "onboarding", or "branding" after a string rename - -**Phase:** Phase 1 (Display Rename). Pre-rename test audit is a prerequisite step. +**Phase to address:** Phase 1 (Display Rename) --- ### Pitfall 11: Exporting a `.nexus.yaml` File While Upstream Exports `.paperclip.yaml` -**What goes wrong:** If the export file format is renamed to `.nexus.yaml`, any workspace exported from a Nexus instance cannot be imported into an upstream Paperclip instance and vice versa. This breaks the stated goal of "import upstream company bundles" and creates a permanent portability split. +**What goes wrong:** Breaking import compatibility with upstream Paperclip instances. -**Why it happens:** The export format is an identifiable artifact with a schema header (`schema: paperclip/v1`). Renaming only the file extension while keeping the schema header creates a confusing half-rename. Renaming both breaks import compatibility. 
+**How to avoid:** Keep emitting `.paperclip.yaml`. The filename and schema header are Zone B/C. -**Prevention:** -1. Keep emitting `.paperclip.yaml` and reading `.paperclip.yaml`. The filename and schema header are Zone B/C — they are part of the interchange contract with upstream. -2. If a Nexus-native export format is ever needed, emit `.nexus.yaml` as an *additional* file alongside `.paperclip.yaml`, not as a replacement. - -**Detection (warning signs):** -- Attempting to import a workspace from upstream Paperclip into Nexus returns "unrecognised format" error - -**Phase:** Phase 1 (Display Rename). Decide explicitly: keep `.paperclip.yaml` unchanged. +**Phase to address:** Phase 1 (Display Rename) --- -## Phase-Specific Warnings +## Critical Pitfalls (v1.5 New Features) -| Phase Topic | Likely Pitfall | Mitigation | -|-------------|---------------|------------| -| Display rename — CEO/Board/Company strings | Pitfall 1 (dual-purpose stored values) | Rename label maps only; leave constant values (`"ceo"`, `"hire_agent"`) unchanged | -| Display rename — bulk approach | Pitfall 2 (Zone B contamination) | File-by-file using zone taxonomy; never global find-replace | -| Onboarding asset content rewrite | Pitfall 3 (directory rename breaks git rebase) | Change file content only; leave `ceo/` directory name unchanged | -| CLI binary rename `paperclipai` → `nexus` | Pitfall 6 (partial instructional string update) | Atomic commit covering all instructional copy | -| Onboarding redesign (root dir picker) | Pitfall 9 (API shape mismatch) | Document API contract first; derive workspace name from directory basename | -| `~/.nexus` pointer file mechanism | Pitfall 4 (data path migration) | Read-both-paths fallback; never rename path without migration | -| `[nexus]` commit convention | Pitfall 8 (inconsistent prefix) | Pre-commit hook from first commit | -| Upstream rebase cadence | Pitfall 5 (drift) | Weekly schedule; CI rebase check | -| Test suite after string renames | 
Pitfall 10 (test assertions on display strings) | Pre-rename test audit; update tests in same commit | -| Export file format | Pitfall 11 (`.paperclip.yaml` vs `.nexus.yaml`) | Keep upstream format; no rename | +--- + +### Pitfall 12: Vite Alias Swap Breaking Upstream Rebase on OnboardingWizard + +**What goes wrong:** The current pattern aliases `src/components/OnboardingWizard` → `NexusOnboardingWizard` at build time via `vite.config.ts`. If upstream renames, moves, or splits `OnboardingWizard.tsx` into multiple files, the alias source path no longer exists and the alias silently stops matching. The build still succeeds (the alias target exists), but import resolution breaks at runtime in any code path that imports the upstream file by a new name. + +More critically: when v1.5 replaces the simple wizard with a multi-step hardware-detection wizard, the alias target `NexusOnboardingWizard.tsx` grows significantly. Upstream may add new features to `OnboardingWizard.tsx` (new props, context dependencies) that `NexusOnboardingWizard.tsx` silently misses, since it fully replaces rather than extends the upstream file. + +**Why it happens:** Full file replacement via Vite alias means no inheritance from upstream. Every upstream improvement to the wizard is silently discarded. + +**How to avoid:** +1. After each upstream rebase, diff `OnboardingWizard.tsx` against the previous upstream version: `git diff upstream-prev..upstream-new -- ui/src/components/OnboardingWizard.tsx`. If upstream adds new props or context hooks, integrate them into `NexusOnboardingWizard.tsx`. +2. Keep `NexusOnboardingWizard.tsx` surface API identical to `OnboardingWizard.tsx` (same component name export, same props interface as far as upstream is concerned). +3. Add a CI check that the aliased-away file still exists with its expected export. `test -f ui/src/components/OnboardingWizard.tsx` covers existence only, so pair it with a check for the export. 
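The CI check from step 3 could be sketched roughly as follows. This is a minimal sketch under assumptions, not the project's actual tooling: `hasNamedExport`, `checkAliasSource`, and the injected `readFile` callback are hypothetical names, and the regexes only cover the common export forms.

```typescript
// Hypothetical CI helper: verifies that the aliased-away upstream file still
// exists and still exports the component that the Vite alias replaces.
// `ReadFile` is an assumed abstraction; it returns null when the file is missing.
type ReadFile = (path: string) => string | null;

// `name` is assumed to be a plain identifier (no regex metacharacters).
function hasNamedExport(source: string, name: string): boolean {
  // Matches `export function X`, `export const X`, `export class X`,
  // optionally with `default` and/or `async` in between.
  const declaration = new RegExp(
    `export\\s+(?:default\\s+)?(?:async\\s+)?(?:function|const|class)\\s+${name}\\b`,
  );
  // Matches `export { X }` and `export { Y, X as Z }`.
  const reExport = new RegExp(`export\\s*\\{[^}]*\\b${name}\\b[^}]*\\}`);
  return declaration.test(source) || reExport.test(source);
}

// Returns a list of human-readable problems; an empty list means the check passed.
function checkAliasSource(
  path: string,
  exportName: string,
  readFile: ReadFile,
): string[] {
  const problems: string[] = [];
  const source = readFile(path);
  if (source === null) {
    problems.push(`${path} is missing: upstream may have renamed, moved, or split it`);
  } else if (!hasNamedExport(source, exportName)) {
    problems.push(`${path} no longer exports ${exportName}`);
  }
  return problems;
}
```

In a CI script, `readFile` would typically wrap Node's `fs.existsSync` and `fs.readFileSync`, and the job would exit non-zero whenever the returned list is non-empty.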
+ +**Warning signs:** +- `NexusOnboardingWizard.tsx` not using a `DialogContext` or `CompanyContext` hook that upstream's version uses +- After rebase, `pnpm dev` fails with "cannot find module" for the alias source path +- The multi-step wizard is missing features that upstream added (e.g., invite-based onboarding, workspace templates) + +**Phase to address:** Phase 1 (Hardware Detection Wizard) — before building the multi-step v1.5 wizard, establish a diff-and-integrate protocol for this alias. + +--- + +### Pitfall 13: Hardware Detection Returning Inaccurate or Platform-Specific Values + +**What goes wrong:** The v1.5 hardware detection step must surface GPU/RAM to recommend Ollama models. Two platform-specific traps exist on the Mac Mini M4 deploy target: + +1. **VRAM is not VRAM on Apple Silicon.** The M4 uses unified memory — the same physical RAM serves both CPU and GPU. `os.totalmem()` in Node.js returns total unified memory. Reporting this as "VRAM available for Ollama" misleads: Ollama on Apple Silicon uses a portion of unified memory, but the OS, browser, and other processes also consume it. Treating `totalmem × 0.75` as GPU-available VRAM overestimates for models that also need system RAM headroom. + +2. **`os.totalmem()` reads total installed RAM, not available RAM.** The existing `getRecommendedModel()` in `server/src/services/ollama.ts` already applies a 0.75 multiplier to account for OS overhead, but it uses total RAM, not free RAM. If the system is under load (Paperclip server + Ollama already running), available RAM is far lower than 75% of total. + +**Why it happens:** Node.js `os` module has `totalmem()` and `freemem()` but no VRAM API. Browser `WebGL` UNMASKED_RENDERER gives GPU name but not VRAM size; actual VRAM queries are blocked by browser security sandboxing. Developers reach for the most accessible number. + +**How to avoid:** +1. 
Use `os.freemem()` (not `totalmem()`) as the baseline for available-RAM recommendations when Ollama is already running. +2. On Apple Silicon, explicitly document in UI copy that "available memory" is unified memory shared with OS, not dedicated GPU VRAM. +3. Treat hardware detection values as hints, not guarantees. Add a message: "Recommendation based on system RAM. Actual performance may vary." +4. The pre-built model catalog (`ollama-model-catalog.json`) is the right layer for model-to-RAM requirements; use it as the authoritative source rather than computing from raw hardware numbers. + +**Warning signs:** +- Model recommendation shows "fits in memory" but Ollama OOM-kills it at load time +- M4 Mac Mini reports 16GB available for models but the system has 16GB total (OS needs 4–6GB) +- AMD GPU users see wildly incorrect VRAM numbers (confirmed bug in Ollama's VRAM detection for AMD/Vulkan as of 2025) + +**Phase to address:** Phase 1 (Hardware Detection) — define detection methodology before building the UI layer. + +--- + +### Pitfall 14: The Onboarding Probe Running at the Wrong Authentication Level + +**What goes wrong:** The existing adapter probe endpoint (`GET /adapters/:type/probe`) requires board authentication (`req.actor.type !== "board"`). The v1.5 onboarding wizard runs *during* first-time setup — before the user has authenticated. If the probe is called before board auth is established, every probe returns 403, the wizard always falls back to `claude_local`, and the user never gets the Hermes auto-detection benefit. + +This is the exact scenario the current `NexusOnboardingWizard.tsx` is vulnerable to: it calls `agentsApi.probeAdapter("hermes_local")` on wizard open, but if the user arrives at the onboarding page without board auth (fresh install, incognito session), the probe silently fails and `defaultAdapter` stays `"claude_local"`. + +**Why it happens:** Board auth is the right guard for post-setup adapter operations. 
But hardware detection and provider probing are legitimately pre-auth operations — you want to present the right setup path before any credentials exist. + +**How to avoid:** +1. Create a separate `GET /system/providers` endpoint that does not require board auth. It returns available local providers (Ollama status, Hermes status) based purely on server-side detection (no user credentials needed). +2. Alternatively, make the probe endpoint check auth level: if no board auth exists (fresh install), allow the probe to run unauthenticated for a whitelist of safe probe types (`hermes_local`, `ollama`). +3. Never gate hardware detection on user credentials — hardware is a property of the machine, not the user session. + +**Warning signs:** +- Browser network tab shows 403 on the probe call during onboarding +- `defaultAdapter` in the wizard is always `"claude_local"` even when Ollama/Hermes are running +- Probe works in the settings page (user is auth'd) but not during initial onboarding + +**Phase to address:** Phase 1 (Hardware Detection) — the probe auth story must be designed before the multi-step wizard is built. + +--- + +### Pitfall 15: Puter.js "Zero-Config" Promise Breaking on Paperclip's Server-Side Architecture + +**What goes wrong:** Puter.js is designed for purely browser-side use: load the CDN script, call `puter.ai.chat()`, Puter handles auth via its own popup login flow. Nexus/Paperclip proxies AI calls through the server (`/api/chat`, `/api/agents`). If Puter.js is loaded browser-side and calls Puter's servers directly, it bypasses Paperclip's cost tracking, budget enforcement, session codec, and skill sync entirely. + +This creates a split-brain: the Puter adapter sends messages to Puter's cloud while Paperclip's adapter system thinks the agent is using a different provider. Cost tracking shows $0 for Puter sessions. Heartbeat and session management are not wired up. 
+ +**Why it happens:** Puter.js is documented as a CDN-loaded browser library with client-side auth. The natural integration is to `<script src="https://js.puter.com/v2/">` and call the API directly. But Paperclip's architecture requires all AI calls to go through server-side adapter machinery. + +**How to avoid:** +1. Implement Puter as a server-side adapter that calls Puter's API from Node.js using HTTP (not the browser SDK). The Puter API is callable via standard HTTP — use `fetch()` on the server, not the browser SDK. +2. The server-side Puter adapter must implement the full adapter contract: `spawn`, `heartbeat`, `sessionCodec`, `configFields` (see `packages/adapters/` pattern). +3. If browser-side Puter SDK is needed for auth popup (Puter uses its own account system), implement auth as a UI-only step that retrieves a Puter token, then stores that token in Paperclip's adapter config for server-side use. +4. Confirm Puter's rate limiting behavior for server-side calls. Puter's "free unlimited" claim applies to personal/hobby use; verify terms before treating it as production-grade. + +**Warning signs:** +- Puter.js loaded via `<script>` CDN tag in the app shell +- Cost tracking shows $0 for all Puter-backed agent sessions +- `puter.ai.chat()` calls appearing in browser network tab (not proxied through `/api/`) + +**Phase to address:** Phase 2 (Zero-Config Cloud / Puter.js) + +--- + +### Pitfall 16: OAuth Token Storage in `localStorage` Creating Security and Rebase Risk + +**What goes wrong:** The natural place to store OAuth access tokens in an SPA is `localStorage`. But: +1. `localStorage` is accessible to any JS on the page — XSS vulnerabilities can steal tokens. +2. Paperclip already uses `localStorage` with `"paperclip.*"` prefixed keys. Any Nexus key added with `"nexus.*"` prefix will need a migration if the key name is ever changed, per Pitfall 4. +3. 
OAuth refresh token rotation (required for Google/OpenAI free tiers) must clear-and-rewrite the stored token on every refresh. If this fails mid-write (e.g., browser close), the user is logged out and must re-authenticate. + +**Why it happens:** `localStorage` is the default that every OAuth tutorial reaches for in SPA context. The PKCE security guidance says to use `sessionStorage` for the code verifier but often developers apply `localStorage` for the actual access token. + +**How to avoid:** +1. Store OAuth tokens server-side in Paperclip's existing config/secrets mechanism (`server/src/secrets/`). The server does the OAuth exchange and stores the token; the browser never sees the raw token. +2. Use Paperclip's existing board auth cookie mechanism to gate whether the OAuth integration is enabled — do not create a separate browser-side auth session for each OAuth provider. +3. If browser-side token storage is unavoidable, use `sessionStorage` (not `localStorage`) for OAuth code verifiers; store refresh tokens server-side only. +4. For the state parameter in PKCE flow: generate a cryptographically random state with `crypto.getRandomValues()`, store in `sessionStorage`, verify on redirect. + +**Warning signs:** +- `window.localStorage.getItem("nexus.oauth.google.accessToken")` or similar in browser DevTools +- OAuth token visible in network requests from browser to Google/OpenAI APIs (not proxied through Paperclip server) +- Re-authentication required after browser restart (session not persisting correctly) + +**Phase to address:** Phase 3 (OAuth Cloud Tier) + +--- + +### Pitfall 17: Multi-Provider Onboarding Creating Multiple Competing Default Adapters + +**What goes wrong:** v1.5 adds multiple provider tiers: local Ollama/Hermes, free cloud Puter.js, OAuth Google Gemini/OpenAI, and subscription detection (Claude Code, OpenClaw). 
If a user configures more than one provider during onboarding, the resulting agents get created with the adapter config from the onboarding summary step. But Paperclip's agent model is one-adapter-per-agent. If the wizard creates agents without being explicit about which provider wins, agents may be created with inconsistent adapter types (one with `hermes_local`, another with `puter_cloud`), creating a confusing mixed-provider workspace. + +The deeper trap: the onboarding wizard currently creates exactly 2 agents (PM + Engineer) with identical adapter config. v1.5 may want different agents on different providers (e.g., assistant on Puter, PM on Hermes). This is a valid architecture but requires explicit per-agent provider selection, which the current wizard doesn't support. + +**Why it happens:** Multi-provider selection UX tends to present all providers as equally valid, then requires a tie-breaking decision the wizard may not have asked the user to make. + +**How to avoid:** +1. Make the onboarding wizard select ONE primary provider and create all initial agents on that provider. Secondary provider credentials can be stored for later use (configuring individual agents from the settings page). +2. If the mode selection is "Personal AI Assistant," create the assistant agent on the highest-quality available provider (subscription > OAuth > Puter > local). +3. If the mode selection is "Project Builder," create PM + Engineer on the local/privacy-first provider since these agents run autonomously and should not require cloud API credits per task. +4. Document the provider selection logic explicitly in code comments. 
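The provider-per-mode rule in the list above could be made explicit in code along these lines. This is a sketch under assumptions: the tier names, `selectPrimaryProvider`, and `planInitialAgents` are hypothetical, and the fallback order for Project Builder when no local provider exists is an assumption that the text does not specify.

```typescript
// Hypothetical tier and mode names; only the assistant ordering
// (subscription > OAuth > Puter > local) comes from the rule in the text.
type ProviderTier = "subscription" | "oauth" | "puter" | "local";
type OnboardingMode = "assistant" | "project_builder";

const PRIORITY: Record<OnboardingMode, readonly ProviderTier[]> = {
  assistant: ["subscription", "oauth", "puter", "local"],
  // Local first for autonomous agents. The order after "local" is an assumption.
  project_builder: ["local", "subscription", "oauth", "puter"],
};

// Picks exactly ONE primary provider for the chosen mode, or null if none is available.
function selectPrimaryProvider(
  mode: OnboardingMode,
  available: ReadonlySet<ProviderTier>,
): ProviderTier | null {
  for (const tier of PRIORITY[mode]) {
    if (available.has(tier)) return tier;
  }
  return null;
}

interface PlannedAgent {
  name: string;
  provider: ProviderTier;
}

// Every initial agent is created on the same primary provider, so one
// onboarding run can never produce a mixed-provider workspace.
function planInitialAgents(
  mode: OnboardingMode,
  available: ReadonlySet<ProviderTier>,
): PlannedAgent[] {
  const provider = selectPrimaryProvider(mode, available);
  if (provider === null) return [];
  const names = mode === "assistant" ? ["Assistant"] : ["PM", "Engineer"];
  return names.map((name) => ({ name, provider }));
}
```

Secondary provider credentials stay out of this function on purpose: they can be stored during onboarding and applied per agent later from the settings page.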
+ +**Warning signs:** +- PM agent created with `hermes_local`, Engineer created with `puter_cloud` after the same onboarding flow +- "Recommended provider" badge in wizard applied to multiple providers simultaneously +- Users confused about which API credits are being used for which agents + +**Phase to address:** Phase 1 (Mode Selection) — define the provider-per-mode rule before building the selection UI. + +--- + +### Pitfall 18: Voice TTS (Piper) Cold Start Blocking the First Spoken Response + +**What goes wrong:** Piper TTS (browser WASM implementation) downloads the voice model on the first synthesis call. This means the first time a user activates TTS, they wait 5–30 seconds for the model to download before hearing anything. Without user feedback, this appears as a hang or broken feature. + +A secondary trap: the WASM Piper phonemizer does not always match the phoneme mapping expected by every Piper voice model. Using a voice model that was compiled for a different language variant (e.g., an `en_GB` model on a browser Piper instance expecting `en_US` phoneme tables) produces garbled or silent output. + +**Why it happens:** Browser-based Piper TTS stores models in the Origin Private File System (OPFS). The first call triggers the download. Developers who test Piper locally after the first call never encounter the cold start because the model is already cached. + +**How to avoid:** +1. Pre-warm Piper on a background thread during onboarding (after the voice step is confirmed, not on the first message). Use a silent warmup synthesis ("...") to trigger the model download before the user expects to hear anything. +2. Show a download progress indicator on the TTS toggle: not a spinner (which implies in-progress work) but a "preparing voice model" state with the estimated download size. +3. Limit initial voice model choices to stable Piper models with confirmed browser WASM compatibility. Avoid offering non-English models unless specifically verified. +4. 
Store pre-downloaded voice models in OPFS; on subsequent loads, check `navigator.storage.getDirectory()` before re-downloading. + +**Warning signs:** +- TTS button appears responsive (toggles on) but no audio plays for 15+ seconds +- Voice model download appears in DevTools network tab on the first "speak" action +- Users reporting "the voice feature is broken" on first use but "works fine" on subsequent uses + +**Phase to address:** Phase 4 (Voice TTS) — warmup strategy must be designed before the TTS toggle is wired up. + +--- + +### Pitfall 19: Persistent Memory Injecting Sensitive Data Into System Prompts + +**What goes wrong:** The Personal AI Assistant stores memories (user preferences, past conversation summaries, project context) to inject into future system prompts. Two failure modes: + +1. **Prompt injection via stored memory.** If memory content is retrieved from external sources (web fetch, document import, MCP tools) and stored verbatim, malicious content in those sources gets injected into future system prompts with elevated priority. Palo Alto Unit 42 documented this attack vector in 2025: memory-poisoning allows persistent malicious instructions affecting agent behavior across sessions. + +2. **Sensitive data leaking between sessions.** If the assistant stores a memory like "user's Stripe API key is sk_live_..." (from a pasted credential) and that memory surfaces in a future session with a different context (e.g., a Puter.js provider that logs requests), the credential leaks. + +**Why it happens:** Memory systems treat all content as equal. The distinction between "safe user preference" and "sensitive credential that should never be persisted" is not obvious at write time. + +**How to avoid:** +1. Apply rule-based filters at write time: never store content matching secret patterns (API key regexes, tokens, passwords). Use a blocklist of patterns before persisting any memory fragment. +2. 
Sanitize memory content before injecting it into system prompts: strip tag-like markup (anything between `<` and `>`), backtick blocks, and content that looks like instruction syntax. +3. For MCP tool results that become memory, apply the same sanitization as user-pasted content. +4. Implement memory scoping: memories should only surface in sessions with the same mode (assistant memories should not surface in project builder sessions). + +**Warning signs:** +- Memory fragments containing "api_key", "token", "password", "secret" stored in the memory DB +- A stored memory from a previous session altering agent behavior in unexpected ways +- MCP tool output (e.g., fetched web page content) appearing verbatim in system prompts + +**Phase to address:** Phase 5 (Persistent Memory) — memory schema must include sanitization at write time before any memory is persisted. + +--- + +### Pitfall 20: MCP Integration Conflicting With Paperclip's Existing Tool/Skill System + +**What goes wrong:** Paperclip has its own skill/tool system (`AdapterSkillSnapshot`, `AdapterSkillEntry`, `company-skills.ts`). MCP also defines tools. If an MCP server exposes a tool named `"terminal"` or `"file_read"` and Paperclip's skill system also has these (used in Hermes heartbeat prompt templates), the agent receives duplicate or conflicting tool definitions. The LLM may call the MCP version when the Paperclip version was intended, bypassing Paperclip's permission and cost tracking. + +Additionally, the HTTP+SSE transport that many MCP servers still use is deprecated in the current MCP spec (Streamable HTTP replaced it in revision 2025-03-26 and remains the standard transport in the June 2025 revision). If the MCP server is implemented with SSE transport, it will need migration as MCP clients drop SSE support. + +**Why it happens:** MCP tool names are unscoped — any tool named `"terminal"` is `"terminal"`. The collision with Paperclip's native tools is invisible until an agent calls the wrong one. Developers add MCP without auditing for name collisions. + +**How to avoid:** +1. 
Use Streamable HTTP transport for the MCP server (not SSE, which is deprecated as of MCP spec 2025-06-18). +2. Prefix all Nexus-registered MCP tools with a namespace: `nexus_memory_read`, `nexus_memory_write`, `nexus_context_set`, etc. +3. Before exposing any MCP tool, check it against the list of tool names in `TOOLS.md` (Hermes skill bundle). If there is a collision, rename the MCP tool. +4. TypeScript interface pitfall: when defining `structuredContent` types for MCP tool responses, use `type` aliases not `interface` declarations — interfaces lack implicit index signatures and cause TypeScript assignment errors with `{ [key: string]: unknown }`. + +**Warning signs:** +- Agent calling `terminal` tool but the call is going to MCP server, not Paperclip's exec sandbox +- TypeScript compile errors: "Type 'XInterface' is not assignable to type '{ [key: string]: unknown }'" +- MCP server implemented with `sse` transport (use `streamable-http` instead) + +**Phase to address:** Phase 5 (MCP Integration) + +--- + +### Pitfall 21: `npx buildthis` Conflicting With an Existing Paperclip CLI Entry Point + +**What goes wrong:** Supporting `npx buildthis` requires adding a new `bin` entry to the Nexus package. Paperclip's CLI already has `bin.paperclipai`. If `buildthis` is added to a package that does not yet exist on npm (or is published under a different name), `npx buildthis` will either: (a) fetch the wrong package from npm (there are existing npm packages named `buildthis`), or (b) fail with "package not found" because the Nexus fork is not on npm. + +A secondary trap: `npx` installs packages temporarily in a user's npm cache. If `npx buildthis` is run on a machine whose npm cache already holds the package from a previous run, `npx` may reuse the old version, which lacks the latest onboarding flow. + +**Why it happens:** `npx` resolves package names from the public npm registry first. If the package name collides with an existing npm package, users get the wrong thing. 
If the package is private (Forgejo only), `npx` cannot find it by default. + +**How to avoid:** +1. Before naming the CLI entry `buildthis`, search npm: `npm search buildthis` — verify there is no collision. If there is, choose `nexus-buildthis` or `@yourusername/buildthis` (scoped package). +2. Since Nexus is deployed on a Mac Mini for single-user use, `npx buildthis` likely resolves to a local package reference rather than npm. Document this explicitly: `npx /path/to/nexus/packages/cli buildthis` or publish to a private registry. +3. For first-run detection: check for `~/.paperclip` (or `~/.nexus`) existence before running full onboarding; if config exists, route to the "already configured" path. + +**Warning signs:** +- `npx buildthis` prints output from an unrelated npm package +- CLI help text shows incorrect version (cached from npm, not local build) +- `npm info buildthis` returns a package that is not Nexus + +**Phase to address:** Phase 6 (`npx buildthis` CLI) + +--- + +## Moderate Pitfalls (v1.5) + +--- + +### Pitfall 22: Multi-Step Onboarding Wizard Breaking the "Every Step Skippable" Requirement + +**What goes wrong:** The v1.5 onboarding has many steps: mode selection, hardware detection, local AI setup, voice, Puter.js, OAuth, subscription detection, summary, and straight-into-chat. As the wizard grows, "every step skippable" becomes hard to maintain because steps develop implicit dependencies: +- The summary step shows "selected providers" — if you skip all provider steps, the summary is empty and the wizard has no actionable result. +- The voice step configures Piper — if it's skipped, the voice feature is silently disabled without telling the user. +- OAuth setup creates credentials — if skipped after starting the OAuth popup, the popup tab is orphaned. + +**Why it happens:** Step dependencies are added incrementally as each step is built. By the time all steps exist, the skip logic has edge cases that weren't anticipated. + +**How to avoid:** +1. 
Define the "skip all" state explicitly before building any step: what does a fully-skipped onboarding produce? Answer: one workspace, one agent, Hermes or claude_local as default, no voice, no OAuth, no memory. Make this the minimum valid state. +2. Code the summary step to present a useful state even when every step is skipped. +3. Treat OAuth flows specially: if a user starts an OAuth popup (opens Google auth window) and then closes the wizard, cancel the OAuth state cleanly. Never leave orphaned OAuth state. + +**Warning signs:** +- Summary step shows empty provider list when all steps are skipped +- "Skip" button disabled on certain steps +- Closing the wizard mid-OAuth leaves the OAuth callback URL still active + +**Phase to address:** Phase 1 (Mode Selection) — define the skip-all state as a test case before building any step. + +--- + +### Pitfall 23: Assistant Mode and Project Builder Mode Sharing Conversation History + +**What goes wrong:** The Personal AI Assistant has its own conversation context: user preferences, daily notes, personal projects. The Project Builder has PM + Engineer agents working on specific code issues. If both modes share the same `conversations` table without a mode discriminator, the assistant's personal context bleeds into project sessions and vice versa. + +A user asking the assistant "remind me what I was working on yesterday" should not surface issues from the Project Builder's agent task queue. An agent executing a coding task should not have the user's personal assistant context injected into its system prompt. + +**Why it happens:** The `conversations` table is generic. Adding a `mode` column or `agent_type` discriminator requires a DB schema change, which is out of scope for Nexus (no migrations). Without a schema change, mode separation must be achieved through metadata conventions. + +**How to avoid:** +1. 
Since DB schema changes are out of scope, use the existing conversation metadata/tagging system (if available) to tag conversations as `assistant` vs. `agent`. Filter on this tag when fetching conversation history. +2. If no tagging system exists, use the agent's `role` field as a discriminator: conversations involving a `role: "ceo"` or `role: "engineer"` agent are project builder context; conversations with a dedicated assistant agent are personal assistant context. +3. The personal assistant agent should have a distinct `adapterType` or `name` pattern that makes it queryable as a filter. + +**Warning signs:** +- Assistant surfacing agent task IDs or issue numbers when answering personal questions +- Project Builder agents including personal notes in their task context +- `conversations` table query returns mixed results from both modes + +**Phase to address:** Phase 2 (Mode Selection / Assistant Mode) — define the conversation isolation strategy before creating the assistant agent. + +--- + +### Pitfall 24: Subscription/API Key Auto-Detection Creating False Positives + +**What goes wrong:** The onboarding tries to auto-detect existing Hermes, Claude Code, and OpenClaw subscriptions. Each of these works differently: +- Hermes: probe the local adapter (existing `probeAdapter` endpoint) +- Claude Code: check for `~/.claude/` directory or `claude` binary in PATH +- OpenClaw: check for an OpenClaw-specific config file or env var + +False positives occur when: a Claude Code config exists but the API key is expired; an OpenClaw config file exists but the subscription is cancelled; a `claude` binary exists but is the wrong version for the adapter. + +Showing "Claude Code detected — ready to use" when the subscription is inactive is worse than not detecting it, because the user proceeds with a broken setup. + +**Why it happens:** Presence of config files or binaries does not guarantee valid credentials or active subscriptions. 
The only reliable detection is making an actual API call, which has latency implications for onboarding. + +**How to avoid:** +1. Distinguish between "binary/config present" (detected) and "API call succeeded" (verified). Show "detected" state immediately but show "verified" state only after a lightweight API validation call. +2. For expensive verification calls, do them in parallel with a timeout. If verification times out, show "detected but unverified" rather than "ready to use." +3. Never block onboarding progress on subscription verification. Mark unverified detections prominently and let the user proceed, then verify asynchronously. + +**Warning signs:** +- Onboarding step shows "Claude Code ready" but first agent run fails with auth error +- Detection step takes more than 3 seconds (verification calls blocking UI) +- Config file present but API key revoked 6 months ago + +**Phase to address:** Phase 3 (Subscription/API Key Auto-Detection) + +--- + +## Minor Pitfalls (v1.5) + +--- + +### Pitfall 25: Project Handoff from Assistant Conversation Losing Context + +**What goes wrong:** "Project handoff: assistant conversation → PM with context transfer" is a v1.5 requirement. The naive implementation creates a new issue in the project from the assistant conversation summary. But the handoff loses: branching context (which assistant conversation branch), attachment references (files uploaded in the assistant chat), and the interim decisions the user made during the assistant conversation. + +**How to avoid:** +1. Handoff should carry: (a) conversation ID or branch ID as a reference, (b) a structured summary (not just free text), and (c) attachment IDs from the assistant conversation. +2. The PM agent receiving the handoff should be able to `GET /api/chat/conversations/{id}` to retrieve the full context if needed. +3. Do not flatten the handoff context into the issue title/description alone — preserve the conversation reference. 
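The three handoff items above could be carried in a payload like the one below. This is a minimal sketch, not the project's actual schema: `HandoffPayload`, `HandoffSummary`, `IssueDraft`, and `buildHandoffIssue` are hypothetical names. Only the `GET /api/chat/conversations/{id}` path comes from the text above.

```typescript
// Hypothetical types illustrating the handoff contents listed above.
interface HandoffSummary {
  goal: string;
  decisions: string[];
  openQuestions: string[];
}

interface HandoffPayload {
  conversationId: string; // (a) reference back to the assistant conversation
  branchId?: string; // (a) optional branch within that conversation
  summary: HandoffSummary; // (b) structured summary, not free text
  attachmentIds: string[]; // (c) files uploaded during the assistant chat
}

interface IssueDraft {
  title: string;
  description: string;
  metadata: { handoff: HandoffPayload };
}

// Builds an issue draft that keeps the conversation reference instead of
// flattening everything into the title and description.
function buildHandoffIssue(payload: HandoffPayload): IssueDraft {
  const lines = [
    `Goal: ${payload.summary.goal}`,
    ...payload.summary.decisions.map((d) => `Decision: ${d}`),
    ...payload.summary.openQuestions.map((q) => `Open question: ${q}`),
    `Source conversation: /api/chat/conversations/${payload.conversationId}`,
  ];
  return {
    title: payload.summary.goal,
    description: lines.join("\n"),
    metadata: { handoff: payload },
  };
}
```

Keeping the full payload in `metadata` lets the PM agent fetch the original conversation later, while the description stays readable for humans.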
+ +**Phase to address:** Phase 5 (Persistent Memory + Assistant Mode) + +--- + +### Pitfall 26: `ollama-model-catalog.json` Becoming Stale as New Models Are Released + +**What goes wrong:** The pre-built model catalog (`server/src/data/ollama-model-catalog.json`) hard-codes RAM/VRAM requirements per model name. Ollama releases new model versions and new model families frequently. A user who installs a new model after the catalog was last updated gets no recommendation reason — the model is silently marked `recommended: false` with `recommendationReason: null` because it is not in the catalog. + +The existing code in `getRecommendedModel()` silently skips models not in the catalog (`const entry = catalogMap.get(model.name); if (!entry) continue;`). A model installed as `llama3.3:latest` may not match a catalog entry for `llama3.3:70b-instruct-q4_K_M`. + +**How to avoid:** +1. Implement a fallback heuristic: if a model is not in the catalog, estimate RAM requirements from the model's `parameterSize` and `quantization` fields that Ollama already returns. A 7B Q4_K_M model reliably fits in ~5GB. +2. Normalize model name matching — strip version tags and match on family+quantization pattern, not exact name string. +3. Document the catalog update process: when to update it, who owns it, and how to add new families. 
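The fallback heuristic and name normalization described above might look like the sketch below. The bits-per-weight table, the 1.2 overhead factor, and the function names `estimateModelRamGb` and `modelFamily` are assumptions for illustration, not values or identifiers from the existing `getRecommendedModel()` code. The inputs are assumed to look like Ollama's reported parameter size (for example `"7B"`) and quantization level (for example `"Q4_K_M"`).

```typescript
// Rough bits-per-weight per quantization family (assumed values, not measured).
const BITS_PER_WEIGHT: Record<string, number> = {
  Q2: 2.6,
  Q3: 3.5,
  Q4: 4.5,
  Q5: 5.5,
  Q6: 6.6,
  Q8: 8.5,
  F16: 16,
  F32: 32,
};

// Assumed headroom for KV cache and runtime buffers.
const OVERHEAD_FACTOR = 1.2;

// Estimates RAM in GB for a model missing from the catalog.
// Returns null when the parameter size cannot be parsed.
function estimateModelRamGb(parameterSize: string, quantization: string): number | null {
  const match = /^(\d+(?:\.\d+)?)\s*([BM])$/i.exec(parameterSize.trim());
  if (!match) return null;
  const value = Number(match[1]);
  const unit = (match[2] ?? "").toUpperCase();
  const params = unit === "B" ? value * 1e9 : value * 1e6;
  const quantKey = quantization.trim().toUpperCase().split("_")[0] ?? "";
  const bits = BITS_PER_WEIGHT[quantKey] ?? 16; // unknown quantization: assume F16
  const gigabytes = ((params * bits) / 8) * OVERHEAD_FACTOR / 1e9;
  return Math.round(gigabytes * 10) / 10;
}

// Strips the tag so "llama3.3:latest" and "llama3.3:70b-instruct-q4_K_M"
// map to the same family key.
function modelFamily(name: string): string {
  return (name.split(":")[0] ?? name).trim().toLowerCase();
}
```

With these assumptions a 7B `Q4_K_M` model comes out just under 5 GB, in line with the "~5GB" figure above. An unknown quantization deliberately falls back to the most conservative estimate.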
+ +**Phase to address:** Phase 1 (Hardware Detection / Model Recommendations) + +--- + +## Technical Debt Patterns + +| Shortcut | Immediate Benefit | Long-term Cost | When Acceptable | +|----------|-------------------|----------------|-----------------| +| Browser-side Puter.js SDK instead of server adapter | Faster to ship | Bypasses cost tracking, skill sync, session codec; creates split-brain | Never for production use | +| `localStorage` for OAuth tokens | Easy to implement | XSS exposure; migration required if key renamed; conflicts with upstream Paperclip keys | Never; use server-side secrets storage | +| `os.totalmem()` for RAM recommendations | One-line implementation | Overestimates available RAM on loaded systems; misleads model recommendations | Only as a fallback when `freemem()` is not available | +| Polling for hardware detection status | Avoids SSE complexity | Hammers server during onboarding; creates race conditions with slow detection | Only if SSE is unavailable | +| Inline Piper model download on first TTS call | Zero extra onboarding step | Silent hang on first use; poor UX; perceived as broken feature | Never; always pre-warm | +| Flat memory injection (all memories into every prompt) | Simple implementation | Context window overflow; irrelevant memories degrade response quality | Only for prototyping | +| No mode discriminator on conversations table | No schema change needed | Mode cross-contamination; hard to query assistant vs. 
agent conversations | Acceptable with explicit agent-based filtering | + +--- + +## Integration Gotchas + +| Integration | Common Mistake | Correct Approach | +|-------------|----------------|------------------| +| Puter.js | Load browser SDK, call `puter.ai.chat()` directly | Implement as server-side HTTP adapter; Puter token stored in Paperclip config | +| Piper TTS (WASM) | Call synthesis on first user message | Pre-warm on background thread during onboarding step; show download progress | +| Ollama probe | Probe at onboarding time without board auth | Use a dedicated unauthenticated `/system/providers` endpoint for pre-auth hardware detection | +| MCP tools | Add tools with generic names (`terminal`, `search`) | Namespace all MCP tools: `nexus_memory_*`, `nexus_context_*` | +| Google OAuth | Store access token in `localStorage` | Exchange code server-side; store token in Paperclip secrets; never expose to browser | +| Upstream rebase after v1.5 | Forget to diff `OnboardingWizard.tsx` against upstream | Post-rebase protocol: diff the aliased-away file, integrate any new upstream props | +| Apple Silicon VRAM | Report `os.totalmem()` as available GPU memory | Use `os.freemem()` with explicit copy: "unified memory, shared with OS" | + +--- + +## Performance Traps + +| Trap | Symptoms | Prevention | When It Breaks | +|------|----------|------------|----------------| +| Sequential provider probes in onboarding | Each probe adds 3s+ to wizard load time | Probe all providers in parallel with `Promise.allSettled()` | Any multi-provider step with 3+ probes | +| Memory retrieval on every chat message | 200-500ms added to every response | Cache last N memories; only re-fetch if conversation context changes | Systems with >100 stored memory fragments | +| Piper TTS blocking main thread | UI freezes during synthesis | Run Piper WASM in a Web Worker; stream audio chunks as they generate | Models larger than small/medium quality | +| Ollama model catalog loaded from disk on 
every request | File I/O on every recommendation call | Load and cache catalog at server startup, not per-request | High-frequency polling during onboarding | +| MCP tool calls in the critical path of assistant response | Latency spikes when memory server is slow | Make MCP tool calls non-blocking where possible; set aggressive timeouts | MCP server under load or starting up | + +--- + +## Security Mistakes + +| Mistake | Risk | Prevention | +|---------|------|------------| +| Storing OAuth tokens in `localStorage` | XSS can steal tokens; Paperclip key collision | Server-side token storage in existing secrets mechanism | +| Persisting raw user input in memory without sanitization | Credential leakage; prompt injection across sessions | Regex-based blocklist at write time; strip instruction-like syntax | +| Unauthenticated MCP endpoint exposure | External callers invoking memory read/write | MCP server bound to `localhost` only; board auth required for all tool calls | +| Puter.js API key in browser bundle | Key exposure in DevTools | Server-side Puter adapter; no Puter credentials in browser | +| Recording audio without explicit per-session consent indicator | Privacy violation perception | Show persistent recording indicator; stop all audio tracks immediately on stop | + +--- + +## UX Pitfalls + +| Pitfall | User Impact | Better Approach | +|---------|-------------|-----------------| +| Multi-step wizard with no skip-all option | Users with existing tools feel trapped | "Skip setup" at top of wizard; minimum valid state if skipped | +| Showing all providers as equally valid | Decision paralysis; wrong choice for hardware | Pre-select the best option; others are secondary alternatives | +| TTS toggle with no download state | Appears broken; silent 15-30s wait | Pre-warm voice model; show download progress before toggle is active | +| Hardware detection with false confidence | User loads model that OOMs | Label recommendations as "estimated" not "guaranteed"; add 
safety margin | +| Mode selection before hardware detection | User picks "Personal AI Assistant" but their hardware can't run local models | Show hardware detection first; mode recommendation follows hardware capability | +| Summary screen with no way to change a step | User made wrong choice earlier; stuck | Every summary item links back to the relevant step | + +--- + +## "Looks Done But Isn't" Checklist + +- [ ] **Puter.js adapter:** Is it going through the server-side adapter machinery (cost tracking, heartbeat, session codec) or calling Puter's API directly from the browser? +- [ ] **Adapter probe during onboarding:** Does it work before board auth is established (fresh install) or does it silently return 403? +- [ ] **Piper TTS first use:** Has the warmup been tested on a clean browser profile with no OPFS cache? +- [ ] **Persistent memory:** Are there sanitization filters at write time preventing credential storage? +- [ ] **MCP tool names:** Have all Nexus MCP tools been checked against the Hermes `TOOLS.md` skill bundle for name collisions? +- [ ] **OAuth token storage:** Is the refresh token stored server-side? Is the browser holding only a session indicator, not the raw token? +- [ ] **Mode isolation:** Can assistant conversation history be queried without surfacing project builder agent conversations? +- [ ] **Onboarding skip:** Does skipping every step produce a usable workspace with at least one agent? +- [ ] **Apple Silicon VRAM copy:** Does the hardware detection screen say "unified memory" not "VRAM" for M-series chips? +- [ ] **`npx buildthis` package name:** Has `npm search buildthis` been run to verify no collision? +- [ ] **Upstream OnboardingWizard diff:** After the v1.5 wizard is built, has `OnboardingWizard.tsx` been diffed against upstream to check for new props that `NexusOnboardingWizard.tsx` needs to handle? 
+ +--- + +## Recovery Strategies + +| Pitfall | Recovery Cost | Recovery Steps | +|---------|---------------|----------------| +| Puter.js browser-side integration shipped | HIGH | Rewrite as server-side adapter; migrate conversation history to route through server | +| OAuth tokens in `localStorage` shipped | HIGH | Server-side migration: on next load, detect browser-stored tokens, exchange for server-stored ones, clear localStorage | +| Persistent memory storing credentials | HIGH | Purge memory store; add retroactive scan-and-delete for credential patterns; add blocklist | +| Piper TTS no warmup (silent hang) | LOW | Add warmup call in background; show download progress indicator | +| Model catalog stale | LOW | Add fallback heuristic; document update process | +| Onboarding probe auth-gated on board auth | MEDIUM | Add unauthenticated system/providers endpoint; update wizard to use new endpoint | +| Mode contamination in conversations table | MEDIUM | Add agent-based filter to conversation queries; document the filtering convention | + +--- + +## Pitfall-to-Phase Mapping + +| Pitfall | Prevention Phase | Verification | +|---------|------------------|--------------| +| Vite alias swap breaking upstream rebase (12) | Phase 1 — Hardware Wizard | Post-rebase diff protocol in place and documented | +| Hardware detection inaccuracy on Apple Silicon (13) | Phase 1 — Hardware Detection | Unit test: compare `totalmem()` vs `freemem()` recommendations; verify M4 copy says "unified" | +| Probe endpoint requires board auth (14) | Phase 1 — Hardware Detection | Test: call probe endpoint with no board auth cookie; should succeed | +| Puter.js bypassing adapter system (15) | Phase 2 — Zero-Config Cloud | Verify: Puter sessions appear in cost tracking with correct provider label | +| OAuth tokens in localStorage (16) | Phase 3 — OAuth | Verify: no OAuth tokens visible in browser DevTools localStorage | +| Multi-provider creating competing defaults (17) | Phase 1 — Mode 
Selection | Test: skip-all onboarding produces exactly one adapter type per agent | +| Piper TTS cold start hang (18) | Phase 4 — Voice TTS | Test: fresh browser profile, enable TTS, measure time-to-first-audio | +| Memory prompt injection (19) | Phase 5 — Persistent Memory | Test: paste a credential into chat; verify it is NOT stored in memory DB | +| MCP tool name collision (20) | Phase 5 — MCP Integration | Audit: compare MCP tool names against TOOLS.md before shipping | +| `npx buildthis` package name collision (21) | Phase 6 — CLI | Run `npm search buildthis` before publishing | +| Skip-all onboarding broken (22) | Phase 1 — Mode Selection | Test: skip every step; verify workspace + one agent created | +| Assistant/project builder context bleed (23) | Phase 2 — Mode Selection | Test: assistant query does not surface issue IDs from project builder | +| Subscription detection false positives (24) | Phase 3 — Subscription Detection | Test: revoke an API key; verify wizard shows "unverified" not "ready" | +| Project handoff losing context (25) | Phase 5 — Persistent Memory | Test: handoff includes conversation ID, not just flat text summary | +| Model catalog staleness (26) | Phase 1 — Hardware Detection | Test: install an uncatalogued Ollama model; verify fallback heuristic fires | --- ## Sources -- Codebase analysis: `/Volumes/UsbNvme/agent/.planning/codebase/CONCERNS.md` — direct audit of Paperclip source (HIGH confidence) -- [Stop Forking Around — Fork Drift in Open Source](https://preset.io/blog/stop-forking-around-the-hidden-dangers-of-fork-drift-in-open-source-adoption/) — fork drift patterns (MEDIUM confidence) -- [Lessons Learned from Maintaining a Fork](https://dev.to/bengreenberg/lessons-learned-from-maintaining-a-fork-48i8) — exponential maintenance cost (MEDIUM confidence) -- [Friendly Fork Management — GitHub Blog](https://github.blog/2022-05-02-friend-zone-strategies-friendly-fork-management/) — sync strategies, conflict accumulation (MEDIUM 
confidence) -- [The Dynamic Relationship of Forks with Upstream](https://ropensci.org/blog/2025/02/20/forks-upstream-relationship/) — upstream isolation patterns (MEDIUM confidence) +**Codebase analysis (HIGH confidence):** +- `/opt/nexus/server/src/services/ollama.ts` — RAM detection using `totalmem()`, catalog lookup +- `/opt/nexus/ui/src/components/NexusOnboardingWizard.tsx` — probe auth requirement, adapter detection +- `/opt/nexus/server/src/routes/agents.ts` — board-auth gate on probe endpoint +- `/opt/nexus/ui/vite.config.ts` — OnboardingWizard Vite alias pattern +- `/opt/nexus/ui/src/components/VoiceRecordButton.tsx` — existing Whisper STT implementation +- `/opt/nexus/ui/src/adapters/registry.ts` — adapter registration pattern + +**Research (MEDIUM confidence unless noted):** +- [Puter.js Free Unlimited AI API](https://developer.puter.com/tutorials/free-unlimited-ai-api/) — Puter is browser-SDK-first; server-side HTTP integration requires manual HTTP calls +- [WebGPU/WebGL VRAM Limitations](https://dl.acm.org/doi/10.1145/3730567.3764504) — VRAM not queryable from browser; integrated vs. dedicated GPU reporting issues (HIGH confidence — peer-reviewed) +- [Ollama AMD VRAM Detection Bug](https://github.com/ollama/ollama/issues/13677) — confirmed VRAM misreport on AMD/Vulkan +- [MCP Tips, Tricks and Pitfalls — Nearform](https://nearform.com/digital-community/implementing-model-context-protocol-mcp-tips-tricks-and-pitfalls/) — TypeScript interface vs. 
type alias; SSE deprecated +- [MCP Specification 2025-06-18](https://modelcontextprotocol.io/specification/2025-06-18) — SSE deprecated, Streamable HTTP preferred +- [Memory Poison Attack — Palo Alto Unit 42](https://unit42.paloaltonetworks.com/indirect-prompt-injection-poisons-ai-longterm-memory/) — persistent memory prompt injection attack vector (HIGH confidence) +- [Piper TTS WASM cold start](https://github.com/rhasspy/piper/issues/352) — first-run download, OPFS caching, warmup pattern +- [OAuth PKCE SPA Best Practices — Curity](https://curity.io/resources/learn/spa-best-practices/) — sessionStorage for verifiers, server-side token storage +- [AI Agent Memory — Redis](https://redis.io/blog/ai-agent-memory-stateful-systems/) — context window overflow, hybrid vector+graph architecture + +--- +*Pitfalls research for: Nexus v1.5 — Smart Onboarding + Personal AI Assistant* +*Researched: 2026-04-02* diff --git a/.planning/research/STACK.md b/.planning/research/STACK.md index f063c4e7..b3b44533 100644 --- a/.planning/research/STACK.md +++ b/.planning/research/STACK.md @@ -1,364 +1,395 @@ -# Technology Stack: Fork Maintenance Approach +# Technology Stack: v1.5 Smart Onboarding + Personal AI Assistant -**Project:** Nexus (fork of Paperclip) -**Researched:** 2026-03-30 -**Scope:** Safely maintaining a display-layer fork of a TypeScript monorepo while staying rebassable on upstream +**Project:** Nexus v1.5 — additive to existing fork maintenance stack (see prior milestone research for branding/fork strategy) +**Researched:** 2026-04-02 +**Scope:** NEW libraries only — Puter.js, hardware detection, Whisper STT + Piper TTS, OAuth, `npx buildthis` CLI, persistent memory +**Confidence:** MEDIUM-HIGH (most verified via official docs; a few version numbers from npm search only) --- -## Summary Recommendation +## Existing Stack (Do Not Change) -Use **git rebase with a [nexus] commit prefix convention** for fork maintenance. 
Extract all display strings into **a single `packages/branding/` package** that acts as the exclusive mutation surface. Keep every code identifier, route, schema, and package name unchanged. This combination minimises conflict surface to two file types: branding constants and onboarding assets. +The following are already installed and working. Zero changes needed: + +| Area | What's There | Location | +|------|-------------|----------| +| CLI framework | Commander.js `^13.1.0` + `@clack/prompts ^0.10.0` | `cli/package.json` | +| Hardware/Ollama | Custom detection (`v1.4`) + `systeminformation` likely via existing adapter | `packages/adapters/hermes` | +| Server auth | `better-auth 1.4.18` | `server/package.json` | +| UI | React 19, Vite 6, Tailwind v4, TanStack Query v5 | `ui/package.json` | +| DB | LibSQL/Drizzle ORM | `server/package.json` | --- -## 1. Fork Maintenance Strategy +## New Libraries by Feature Area -### Recommended: Rebase-Over-Upstream with Prefix Convention +### 1. Puter.js — Zero-Config Cloud AI -**Confidence: HIGH** — Used by git-for-windows, microsoft/git, and VSCodium. Standard practice for long-lived forks. +**Package:** `@heyputer/puter.js` +**Version:** latest (no stable semver pinned on npm — use `@latest` and lock in pnpm-lock) +**Where it lives:** `ui/` only — Puter.js is a frontend-first browser SDK -**How it works:** +**Why:** 500+ models (GPT-4o, Claude, Gemini, Grok, DeepSeek) with zero API keys and zero developer billing. Users authenticate with their own Puter account; usage cost falls on the user, not the developer. This is the project's "zero-config cloud" tier — the entire value prop depends on this library. -Every Nexus-specific commit carries a `[nexus]` prefix in the commit message. 
On each upstream release: +**How the API works:** + +```bash -git fetch upstream -git rebase upstream/master +```typescript +// Browser only — import via script tag or bundler +import { puter } from "@heyputer/puter.js"; + +// Chat (streaming) +const stream = await puter.ai.chat("Hello", { + model: "gpt-4o", + stream: true, +}); +for await (const part of stream) { + // Browser context: there is no process.stdout here, so log (or append to the DOM) + console.log(part?.text ?? ""); +} + +// Image generation, TTS, STT also available under puter.ai.* ``` -During rebase, conflicts only appear on commits that touch the same lines as upstream changes. With display-only mutations (string constants, Markdown prose, one config file), the conflict surface is tiny. Non-conflicting commits replay cleanly. +**Integration point:** New `PuterAdapter` in `packages/adapters/` following the existing adapter pattern. The adapter wraps `puter.ai.chat()` and maps to the shared `AdapterMessage` type. Keep it display-layer only — no server-side Puter calls. -**Commit message convention:** -``` -[nexus] Rename CEO→Project Manager in OnboardingWizard -[nexus] Replace AGENT_ROLE_LABELS display value for ceo role -[nexus] Rewrite onboarding-assets/ceo/ SOUL.md and AGENTS.md -``` +**Constraint:** Puter.js runs in browser context only. Do NOT add it to `server/` or `cli/`. The adapter must be a frontend-only workspace package or inlined into the UI. -The prefix does two things: it makes `[nexus]` commits immediately identifiable in `git log`, and it allows `git range-diff` to verify that a rebase correctly replayed all downstream patches. - -**Verification after every upstream sync:** - -```bash -# Compare the old and new version of the downstream patch series -git range-diff upstream/master ORIG_HEAD HEAD -``` - -`git range-diff` shows which `[nexus]` commits changed during rebase (conflict resolutions), which replayed identically, and which were dropped. This is the standard tool used by the Git project itself for patch-series validation. 
**Confidence: HIGH** (official Git tooling, not a third-party tool). - -**Enable rerere to auto-replay recurring resolutions:** - -```bash -git config rerere.enabled true -``` - -`git rerere` records how each conflict was resolved. On the next upstream sync, if the same conflict hunk appears again (common when upstream frequently touches the same area), Git auto-resolves it identically. This eliminates repetitive manual conflict resolution. **Confidence: HIGH** (official Git feature, described in Pro Git book). - -**Atomic commits — most important discipline:** - -Each `[nexus]` commit must touch exactly one logical unit. Never mix a display-string change with a behaviour change in the same commit. Rationale: if upstream changes the same file for a different reason, a mixed commit creates conflicts in code paths you didn't mean to touch. Atomic commits mean a conflict only appears on the exact line you changed. **Confidence: HIGH** (documented in git-for-windows strategy and GitHub's friendly fork guide). +**Confidence: HIGH** — Official docs verified at developer.puter.com. User-pays model confirmed. --- -### Alternative Considered: git-format-patch / Quilt-style Patch Queue +### 2. Hardware Detection — GPU, RAM, Apple Silicon -**What it is:** Maintain Nexus changes as a series of `.patch` files outside the tree, applied on top of a clean upstream checkout. Used by VSCodium for build-time patch application with placeholder substitution. +**Package:** `systeminformation` +**Version:** `^5.31.5` (latest stable; v6 TypeScript rewrite is in progress but not released) +**Where it lives:** `server/` (runs on the Mac Mini; browser APIs cannot access hardware) -**Why not for Nexus:** VSCodium's patch approach works because they rebuild from source on every release. Nexus is a live development fork where engineers commit code daily. Applying patches at build time would break the normal `git commit` / `git push` workflow. 
Rebase-over-upstream is the right model when the fork is being actively developed, not just rebranded at release time. +**Why:** The only comprehensive cross-platform system info library for Node.js with 20M+ monthly downloads. Covers CPU, total RAM, GPU model/VRAM, and Apple Silicon GPU core count — exactly what's needed for model recommendation. Alternatives (`detect-gpu`, `gpu-info`) are browser-only or Windows-only. -**Confidence: MEDIUM** — VSCodium's approach is well-documented but architecturally different from a dev fork. +**Key functions for v1.5:** + +```typescript +import si from "systeminformation"; + +// Total system RAM +const mem = await si.mem(); // mem.total in bytes + +// GPU info — works on macOS, Windows, Linux +const graphics = await si.graphics(); +// graphics.controllers[0].vram — VRAM in MB (dedicated GPU) +// graphics.controllers[0].cores — GPU cores (Apple Silicon only) +// graphics.controllers[0].model — e.g. "Apple M4 Pro" +``` + +**Apple Silicon nuance:** Apple Silicon has unified memory — there is no separate VRAM. `si.graphics()` returns `vram: 0` and populates `cores` with GPU core count instead. The model recommendation logic must handle this: use `mem.total` as effective VRAM for Apple Silicon, scaled by a configurable fraction (typically 0.75 since OS+apps compete for the same pool). + +**Existing usage in v1.4:** Ollama detection and RAM/VRAM recommendations are already implemented. This is an additive enhancement — if `systeminformation` is not yet imported in the server, add it. If it is, extend the existing detection service. + +**Confidence: HIGH** — Verified via systeminformation.io official docs. Apple Silicon behavior confirmed via GPU core detection doc. --- -### Alternative Considered: Merge (not rebase) +### 3. Whisper STT — Speech to Text (CPU-capable) -Merge upstream with `git merge upstream/master` produces a merge commit that interleaves upstream and Nexus history. 
GitHub's friendly fork guide recommends merge for multi-contributor forks. For a solo-developer fork with a small, clearly bounded patch set, rebase produces a cleaner history and makes it obvious exactly which commits are Nexus-specific. Use merge only if the team grows beyond one or two contributors. +**Recommendation:** `smart-whisper` +**Version:** `^0.8.1` (latest as of October 2025) +**Where it lives:** `server/` as an optional service (graceful degradation if model not downloaded) + +**Why over alternatives:** +- `smart-whisper`: Native Node.js addon wrapping whisper.cpp directly. Supports loading one model for parallel inferences. Auto-enables Apple Neural Engine acceleration on macOS. Pre-built binaries for macOS arm64 (Mac Mini M4). +- `nodejs-whisper` (v0.2.9, 10 months old): Older, CPU-focused, spawns a subprocess. Works but slower and less maintained. +- `whisper-node` (v1.1.1, 2 years old): Abandoned. + +**Model recommendation for Mac Mini M4:** +- `base.en` model (~140MB) — good balance of speed/accuracy for English voice input +- `small.en` model (~460MB) — better accuracy if user has RAM to spare +- Models download lazily on first voice use; onboarding should gate voice on model availability + +**Integration pattern:** + +```typescript +// API shape as recalled from the smart-whisper README; verify against the installed version +import { Whisper, manager } from "smart-whisper"; + +await manager.download("base.en"); // explicit one-time model download +const whisper = new Whisper(manager.resolve("base.en")); // constructor takes a model file path +const task = await whisper.transcribe(audioPcm, { language: "en" }); // audioPcm: 16 kHz mono Float32Array +const transcript = await task.result; +``` + +**Server endpoint:** Add `POST /api/voice/transcribe` that accepts audio blob (WAV/WebM from browser MediaRecorder), returns transcript JSON. The existing v1.3 voice input uses browser-side Web Speech API as a fallback — this is the local/offline upgrade path. + +**Confidence: MEDIUM** — Package verified on npm and GitHub. Version from GitHub releases page. Apple Silicon acceleration confirmed in README. No production deployment data for this specific version. --- -## 2. String Extraction Pattern +### 4. 
Piper TTS — Text to Speech (CPU-capable) -### Recommended: Centralised Branding Package with Typed Constants +**Recommendation:** Spawn `piper` binary via `child_process`, do NOT use a Node.js wrapper library +**Why:** No mature, production-ready Node.js binding for Piper TTS exists as of April 2026. The `@mintplex-labs/piper-tts-web` package is browser-only. ONNX-based implementations exist in Python (`piper-onnx`) and partially in JavaScript for Bun, but none are packaged for Node.js production use. -**Confidence: HIGH** — Standard TypeScript monorepo pattern, no third-party risk. +**Approach:** -#### Why NOT i18n (react-i18next, LinguiJS, etc.) +```typescript +import { spawn } from "child_process"; +import os from "os"; +import path from "path"; -i18n libraries are designed for multi-locale text management. They add runtime overhead, require JSON translation files, and introduce a dependency that Paperclip upstream does not have. Importing one into a display-layer fork creates a new package.json entry that will conflict if upstream ever adds i18n itself. The simpler approach is a plain TypeScript constants module. +// piper binary downloaded to ~/.paperclip/voice/piper +// voice model downloaded to ~/.paperclip/voice/models/ +const PIPER_BIN = path.join(os.homedir(), ".paperclip", "voice", "piper"); + +function synthesize(text: string, modelPath: string): Promise<Buffer> { + return new Promise((resolve, reject) => { + const proc = spawn(PIPER_BIN, [ + "--model", modelPath, + "--output-raw", + ]); + const chunks: Buffer[] = []; + proc.stdout.on("data", (chunk) => chunks.push(chunk)); + // Reject if the binary is missing or not executable + proc.on("error", reject); + // Resolve only on a clean exit so a failed run never yields partial audio + proc.on("close", (code) => + code === 0 + ? resolve(Buffer.concat(chunks)) + : reject(new Error(`piper exited with code ${code}`)), + ); + proc.stdin.write(text); + proc.stdin.end(); + }); +} +``` -#### The Pattern: `packages/branding/` +**Alternative for pure-JS TTS (fallback/cloud):** The browser's `window.speechSynthesis` API covers the cloud and basic local cases without any server dependency. Use Web Speech API as the default TTS tier; offer Piper as an optional "high-quality offline voice" that the user must enable explicitly. 
-Create a dedicated workspace package at `packages/branding/` that is the single place all display-layer strings live. Nothing else in the monorepo hardcodes Nexus-facing strings. +**Piper binary distribution:** During onboarding, detect if piper binary exists at `~/.paperclip/voice/piper`. If not, show download prompt. Use `https://github.com/rhasspy/piper/releases` to fetch the macOS arm64 binary. Store in `~/.paperclip/` (Nexus never renames this dir per PROJECT.md constraints). + +**Recommended voice model for Mac Mini M4:** `en_US-lessac-medium` (~63MB) — good quality, fast on Apple Silicon. + +**Confidence: MEDIUM** — Based on official Piper GitHub + community blog posts (Bun runtime example). Subprocess approach is the proven path. ONNX-native Node.js path is theoretically possible but no maintained package exists. + +--- + +### 5. OAuth Flows — Google Gemini + OpenAI Free Tiers + +**Recommendation:** `openid-client` v6 +**Version:** `^6.8.2` (latest stable, complete v6 API rewrite) +**Where it lives:** `server/` — OAuth flows run server-side with PKCE + +**Why openid-client over passport.js:** +- Passport.js adds middleware abstraction that conflicts with Nexus's existing `better-auth` setup (already in `server/package.json`) +- `openid-client` v6 is a certified OAuth 2/OIDC client that handles PKCE natively without middleware +- Works alongside `better-auth` — openid-client handles the provider OAuth dance; better-auth handles the Nexus session + +**What it provides:** +- Authorization Code Flow with PKCE (required by OAuth 2.1) +- Discovery via `.well-known/openid-configuration` — works for both Google and any OpenAI-compatible provider +- Token refresh, revocation, introspection + +**Integration pattern:** + +```typescript +import * as client from "openid-client"; + +// Google discovery +const googleConfig = await client.discovery( + new URL("https://accounts.google.com"), + process.env.GOOGLE_CLIENT_ID!, + process.env.GOOGLE_CLIENT_SECRET! 
+); + +// Generate PKCE challenge +const codeVerifier = client.randomPKCECodeVerifier(); +const codeChallenge = await client.calculatePKCECodeChallenge(codeVerifier); +``` + +**Note on "zero sign-up":** Puter.js handles the zero-API-key tier. OAuth is the tier above that — where users already have Google/OpenAI accounts and want to connect them. Keep these separate in the onboarding UI: Puter tier requires zero setup; OAuth tier shows "Connect your Google account" CTA. + +**Server routes to add:** +- `GET /api/oauth/google/start` — initiate flow, return redirect URL +- `GET /api/oauth/google/callback` — exchange code for tokens, store encrypted +- Same pattern for OpenAI when their OAuth flow is stable + +**Confidence: MEDIUM** — openid-client v6 verified via GitHub and npm. Google OIDC integration confirmed. OpenAI's free tier OAuth specifics are LOW confidence (their free tier structure changes frequently). + +--- + +### 6. `npx buildthis` — CLI Bootstrapper + +**No new library needed.** The package structure is a standard npm pattern. + +**What to build:** A new npm package `buildthis` (or scoped `@nexus/buildthis`) published to npm. When run via `npx buildthis`, it: +1. Detects if Nexus server is running locally (`localhost:4000` or configured port) +2. If yes: opens browser to onboarding URL +3. 
If no: guides user through one-command install (Docker or native) **Package structure:** ``` -packages/branding/ +cli-bootstrapper/ # New top-level directory in the Nexus monorepo + package.json # name: "buildthis", bin: { "buildthis": "./dist/index.js" } src/ - index.ts -- re-exports everything - vocabulary.ts -- entity names (Workspace, Project Manager, Owner) - ui-labels.ts -- button text, page titles, sidebar labels - cli-strings.ts -- CLI output messages, prompts, banner - agent-roles.ts -- display labels for role constants - package.json -- name: "@paperclipai/branding" (keeps @paperclipai namespace) - tsconfig.json + index.ts # #!/usr/bin/env node shebang entry + dist/ # bundled by esbuild (same config as existing CLI) ``` -**`vocabulary.ts` example:** +**`package.json` bin field:** -```typescript -export const VOCAB = { - // The Company entity displayed as: - company: { - singular: "Workspace", - plural: "Workspaces", - possessive: "Workspace's", +```json +{ + "name": "buildthis", + "version": "0.1.0", + "bin": { + "buildthis": "./dist/index.js" }, - // The CEO role displayed as: - ceo: { - singular: "Project Manager", - short: "PM", - }, - // The Board role displayed as: - board: { - singular: "Owner", - }, - // Product name - product: { - name: "Nexus", - cli: "nexus", - tagline: "Your agent workspace", - }, -} as const; -``` - -**`agent-roles.ts` example — overrides `AGENT_ROLE_LABELS` from shared:** - -```typescript -import { AGENT_ROLE_LABELS } from "@paperclipai/shared"; - -// Override display labels only. Underlying keys (ceo, engineer, etc.) are unchanged. -export const DISPLAY_ROLE_LABELS: typeof AGENT_ROLE_LABELS = { - ...AGENT_ROLE_LABELS, - ceo: "Project Manager", -}; -``` - -**Why keep the package name `@paperclipai/branding`:** The `@paperclipai/*` namespace is used by thousands of import statements. Adding a new package under the same namespace costs nothing and avoids the namespace change that would ripple through every file. 
The branding package is net-new; it does not rename any existing package. - -**Usage in UI:** - -Components import from `@paperclipai/branding` instead of hardcoding strings. The existing `AGENT_ROLE_LABELS` from `@paperclipai/shared` stays unchanged; components use `DISPLAY_ROLE_LABELS` from branding instead. - -```tsx -// Before (upstream hardcoded): -<span>Company</span> -<span>{AGENT_ROLE_LABELS[agent.role]}</span> - -// After (Nexus): -import { VOCAB, DISPLAY_ROLE_LABELS } from "@paperclipai/branding"; -<span>{VOCAB.company.singular}</span> -<span>{DISPLAY_ROLE_LABELS[agent.role]}</span> -``` - -**Usage in CLI (`cli/src/commands/onboard.ts`):** - -```typescript -import { VOCAB } from "@paperclipai/branding"; - -p.intro(`${VOCAB.product.name} setup`); -// Replaces: p.intro("Paperclip setup"); -``` - -**Usage in server banner (`server/src/startup-banner.ts`):** - -```typescript -import { VOCAB } from "@paperclipai/branding"; - -// Replace ASCII art "PAPERCLIP" with "NEXUS" -// Replace embedded CLI command text with VOCAB.product.cli references -``` - -#### What Stays in `@paperclipai/shared` — Unchanged - -The following stay exactly as upstream to preserve upstream rebasability: - -- `AGENT_ROLE_LABELS` (with `ceo: "CEO"`) — the authoritative map, untouched -- `AGENT_ROLES` array containing `"ceo"` — these are stored values, not display strings -- `APPROVAL_TYPES`, `INVITE_TYPES` — stored DB enum values, untouched -- `API.companies = "/api/companies"` — route constants, untouched - -The branding package only **overrides at the callsite**, never modifying shared constants. - ---- - -## 3. UI Branding / Theming Layer - -### Recommended: CSS Custom Properties in Tailwind v4 + a Single `branding.css` File - -**Confidence: HIGH** — Tailwind v4's CSS-first config model is designed for this. Official Vite + Tailwind v4 docs confirm CSS custom properties as the standard. - -Paperclip already uses Tailwind CSS 4.0.7. 
In Tailwind v4, theme tokens are defined as CSS custom properties in the CSS file, not in a JavaScript config. This makes branding overrides a single CSS file change. - -**`ui/src/branding.css` (new [nexus] file):** - -```css -/* Nexus brand overrides — Tailwind v4 custom properties */ -:root { - --color-brand-primary: oklch(65% 0.2 270); /* Nexus blue-purple */ - --color-brand-secondary: oklch(75% 0.15 200); + "files": ["dist"] } ``` -Import this file once in `ui/src/main.tsx` after the main Tailwind CSS import. Zero upstream conflict risk: it is a net-new file. +**Key constraint:** Keep `buildthis` dependencies minimal. `npx` downloads and installs the package fresh on each invocation. Heavy dependencies (e.g. Commander.js, Inquirer) add 200-500ms to startup. Use Node.js built-ins (`readline`, `https`, `child_process`) wherever possible. Acceptable: `@clack/prompts` (already a project dependency, ~20KB). -**Vite `define` for build-time constants:** +**Existing CLI packages already use:** Commander.js `^13.1.0`, `@clack/prompts ^0.10.0`, `picocolors`. Reuse these — they're already in the project's lockfile. -For values injected at build time (version strings, product name in `<title>` tag), use Vite's `define` option in `vite.config.ts`: +**Confidence: HIGH** — npx bin-field pattern is official Node.js documentation. No novel library choices required. + +--- + +### 7. Persistent Memory — Personal AI Assistant + +**Recommendation:** Two-layer approach — SQLite for structured memory + local vector search for semantic recall + +**Layer 1 — Structured facts:** Use the existing LibSQL/Drizzle ORM stack. Add a `memories` table with columns: `id`, `user_id`, `content` (text), `embedding` (blob), `created_at`, `source` (`conversation` | `explicit`). No new DB library needed — LibSQL supports this schema. 
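The `embedding` (blob) column in the `memories` table needs a binary encoding for the vector. A minimal sketch of one possible encoding follows, storing each dimension as a 32-bit float. The helper names `embeddingToBlob` and `blobToEmbedding` are hypothetical, and the choice of float32 is an assumption rather than something the schema above prescribes.

```typescript
// Encode an embedding as a compact binary blob (4 bytes per dimension).
// ASSUMPTION: float32 precision is sufficient for similarity search.
function embeddingToBlob(embedding: number[]): Uint8Array {
  const floats = Float32Array.from(embedding);
  return new Uint8Array(floats.buffer);
}

// Decode a blob produced by embeddingToBlob back into a plain number array.
function blobToEmbedding(blob: Uint8Array): number[] {
  if (blob.byteLength % 4 !== 0) {
    throw new Error("blob length is not a multiple of 4 bytes");
  }
  // Copy into a fresh buffer so the Float32Array view starts at offset 0,
  // even when the input is a view into a larger buffer.
  const copy = new Uint8Array(blob);
  return Array.from(new Float32Array(copy.buffer, 0, copy.byteLength / 4));
}
```

Values that are not exactly representable in float32 come back slightly rounded, so compare decoded vectors by similarity rather than strict equality.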
+
+**Layer 2 — Semantic search:** `vectra`
+**Version:** `^0.12.3` (last published ~1 month ago)
+**Where it lives:** `server/` as an optional memory service
+
+**Why vectra:**
+- Zero infrastructure — index is a folder of JSON files on disk. Fits `~/.paperclip/memory/` perfectly.
+- Sub-millisecond lookup for small corpora (<10K items, typical personal assistant use)
+- TypeScript-native, MIT licensed
+- No cloud dependency, no server process
+
+**Embeddings for vectra:** Use Ollama's `nomic-embed-text` model (already in the Ollama ecosystem from v1.4). This avoids any OpenAI API key dependency for the memory layer.
 
 ```typescript
-// vite.config.ts — [nexus] section
-define: {
-  __NEXUS_PRODUCT_NAME__: JSON.stringify("Nexus"),
-  __NEXUS_VERSION__: JSON.stringify(process.env.npm_package_version),
-},
+import path from "path";
+import { LocalIndex } from "vectra";
+import ollama from "ollama"; // already installed via hermes adapter
+
+const index = new LocalIndex(path.join(process.env.PAPERCLIP_HOME!, "memory"));
+// Create the on-disk index on first use
+if (!(await index.isIndexCreated())) await index.createIndex();
+
+// Embed text with Ollama; embed() returns one vector per input
+async function embedText(input: string): Promise<number[]> {
+  const { embeddings } = await ollama.embed({ model: "nomic-embed-text", input });
+  return embeddings[0];
+}
+
+// Store memory
+await index.insertItem({ vector: await embedText(text), metadata: { content: text, date: Date.now() } });
+
+// Recall memories (confirm the queryItems signature against the installed vectra version)
+const results = await index.queryItems(await embedText(query), 5);
 ```
 
-Declare the type in `ui/src/vite-env.d.ts`:
+**Why NOT mem0ai:** `mem0ai` npm package defaults to OpenAI for both the LLM and embedder. Local/offline configuration is not documented in the Node SDK (only the Python SDK supports local providers). Using it would introduce an OpenAI API key hard dependency that conflicts with the "zero-config local-first" goal.
 
-```typescript
-declare const __NEXUS_PRODUCT_NAME__: string;
-```
+**Why NOT LangChain MemoryVectorStore:** LangChain JS is 40MB+ of dependencies and would be the largest single addition to the project. For a personal assistant's memory layer, vectra + Ollama embeddings is 1/20th the footprint.
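To make the recall step concrete, here is a minimal sketch of brute-force top-k retrieval by cosine similarity, which is conceptually what a small local vector index computes. It is not vectra's actual implementation. `MemoryItem`, `cosineSimilarity`, and `recall` are hypothetical names used only for illustration.

```typescript
// A stored memory with its embedding vector (hypothetical shape).
interface MemoryItem {
  content: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector dimensions differ");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every item against the query and keep the topK best matches.
function recall(items: MemoryItem[], query: number[], topK: number): { item: MemoryItem; score: number }[] {
  return items
    .map((item) => ({ item, score: cosineSimilarity(item.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

A linear scan like this is usually adequate at the corpus sizes mentioned above (under 10K items), which is part of why a file-backed index can be a reasonable fit for a single-user assistant.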
-Use this only for values that must appear in static HTML before React hydrates (e.g. `<title>` tag, meta tags). Component-level strings should use the branding package, not `define`. - -**Why not a full Catppuccin Mocha theme in v1:** Full theme overhaul is listed as out-of-scope in PROJECT.md. CSS custom properties allow it to be added later as a single-file change. +**Confidence: MEDIUM** — vectra verified on npm/GitHub. Ollama embeddings confirmed via ollama.com docs. mem0ai limitation confirmed via their Node SDK docs (no local LLM option documented). --- -## 4. Onboarding Assets — Separate Files, Zero Code Conflict +## Installation Summary -### Recommended: Direct File Replacement, No Pattern Needed +```bash +# server/ — add these dependencies +pnpm --filter @paperclipai/server add systeminformation openid-client vectra -**Confidence: HIGH** — This is already how the codebase works. +# server/ — smart-whisper (optional, for local STT) +pnpm --filter @paperclipai/server add smart-whisper -The files in `server/src/onboarding-assets/ceo/` (SOUL.md, AGENTS.md, HEARTBEAT.md, TOOLS.md) are plain Markdown loaded at runtime via `fs.readFile`. They contain the hardcoded "You are the CEO" prose that must change for Nexus. +# ui/ — Puter.js frontend SDK +pnpm --filter @paperclipai/ui add @heyputer/puter.js -**Strategy:** Replace these files entirely as a `[nexus]` commit. The directory name `ceo/` stays unchanged (directory rename would cause upstream conflicts on every change upstream makes to these files). The file content changes. These files are prose with no TypeScript identifiers — conflict risk is purely editorial (if upstream rewrites the CEO instructions, the rebase will conflict on the content, which is a genuine conflict to resolve manually). 
- -**For new Nexus-specific agent templates** (PM and Engineer predefined templates), add new directories: - -``` -server/src/onboarding-assets/ - ceo/ -- upstream directory, content replaced by [nexus] - pm/ -- [nexus] new directory, PM template - engineer/ -- [nexus] new directory, Engineer template +# New package for npx bootstrapper (separate publish) +# cli-bootstrapper/package.json — no new external deps beyond @clack/prompts ``` -New directories are never touched by upstream; they replay through rebase with zero conflicts. +--- + +## Alternatives Considered + +| Feature | Recommended | Alternative | Why Not | +|---------|-------------|-------------|---------| +| Hardware detection | `systeminformation ^5.31.5` | `detect-gpu` | Browser-only; Node.js usage not supported | +| Hardware detection | `systeminformation ^5.31.5` | `gpu-info` | Windows-only; no macOS/Linux support | +| STT | `smart-whisper ^0.8.1` | `nodejs-whisper ^0.2.9` | Subprocess-based, 10 months stale, slower on Apple Silicon | +| STT | `smart-whisper ^0.8.1` | Cloud Whisper API | Requires API key; breaks offline/local-first promise | +| TTS | Piper binary via `child_process` | `@mintplex-labs/piper-tts-web` | Browser-only npm package, cannot run in Node.js server | +| TTS | Piper binary | `sherpa-onnx ^1.12.34` | Supports both STT+TTS but adds 80MB binary; overkill if using smart-whisper for STT | +| OAuth | `openid-client ^6.8.2` | `passport-oauth2` | Adds middleware layer that conflicts with existing `better-auth` session handling | +| Memory | `vectra ^0.12.3` + Ollama embeddings | `mem0ai` | Node SDK requires OpenAI; no local embedding option documented | +| Memory | `vectra ^0.12.3` + Ollama embeddings | LangChain MemoryVectorStore | 40MB+ transitive dependency footprint; overkill for personal use scale | +| Zero-config cloud | `@heyputer/puter.js` | Direct provider SDKs | Would require managing API keys per user; Puter eliminates this entirely | --- -## 5. 
What NOT to Do — Anti-Patterns +## What NOT to Add -### Anti-Pattern 1: Rename any `@paperclipai/*` package - -**What happens:** Every TypeScript file in the monorepo imports from `@paperclipai/shared`, `@paperclipai/db`, etc. Renaming any of these produces thousands of lines of import-statement diffs across every file. On the next upstream rebase, every one of those files conflicts because upstream and Nexus both modified the imports (upstream: added a new function, Nexus: changed the import path). This turns a clean rebase into a multi-hour conflict session on every upstream release. - -**Instead:** Keep all `@paperclipai/*` names. The new branding package is `@paperclipai/branding` — same namespace, no existing files modified. - -### Anti-Pattern 2: Rename TypeScript identifiers (`companyService`, `CompanyContext`, etc.) - -**What happens:** If `companyService` is renamed to `workspaceService` in Nexus, any upstream commit that touches `companies.ts` will produce a conflict at that identifier. The function is the same; only the name differs. This is a pure noise conflict with zero semantic value. - -**Instead:** Leave all identifiers unchanged. `CompanyContext` stays `CompanyContext` internally; only the string it renders in JSX changes. - -### Anti-Pattern 3: Scatter display strings across individual component files - -**What happens:** If each component file hardcodes its own Nexus strings (`<span>Workspace</span>` scattered across 30 files), every upstream change to a component file produces a conflict on the string line. Finding and resolving these becomes the dominant cost of each sync. - -**Instead:** All display strings live in `packages/branding/`. Each component imports one constant. Upstream touches component logic; Nexus touches the branding package. File overlap is minimised. - -### Anti-Pattern 4: Change DB column names, stored enum values, or API routes - -**What happens:** These are breaking changes with migration requirements. 
They also conflict with upstream on every schema or route change. - -**Instead:** These are already out-of-scope per PROJECT.md. The ORM layer stays `companies`, `company_id`, `"ceo"` role. The branding package translates at display time. - -### Anti-Pattern 5: Mix Nexus and upstream changes in one commit - -**What happens:** If a `[nexus]` commit also contains an upstream bug fix, the bug fix becomes entangled with the display change. On rebase, if upstream fixes the same bug, there is a conflict in a commit that was supposed to be a display-only patch. - -**Instead:** If a bug fix is needed, create a separate commit without the `[nexus]` prefix. Consider submitting it upstream. Keep `[nexus]` commits purely display-layer. - -### Anti-Pattern 6: Rename `~/.paperclip` to `~/.nexus` (data directory) - -**What happens:** Requires changing `PAPERCLIP_HOME` environment variable references across server, CLI, Docker files, and documentation. Breaks all existing deployments. Creates conflicts on every upstream change touching home-path logic. - -**Instead:** Use `~/.nexus` as a pointer file only (containing the root directory path), as described in PROJECT.md. The actual data directory stays `~/.paperclip`. The `~/.nexus` pointer file is a net-new file; upstream never touches it. 
+| Avoid | Why | Use Instead | +|-------|-----|-------------| +| `passport.js` | Conflicts with existing `better-auth`; adds middleware overhead | `openid-client v6` (certified, no middleware) | +| `langchain` or `llamaindex` | 40-80MB dep footprint; overkill for single-user personal assistant | `vectra` + direct Ollama calls | +| `mem0ai` Node SDK | OpenAI hard dependency in Node SDK; no local embedding option | Custom memory layer: `vectra` + Ollama `nomic-embed-text` | +| `@mintplex-labs/piper-tts-web` | Browser-only, cannot be used in Node.js server | Piper binary subprocess | +| Any browser extension for auth | Security risk; not applicable to local app | Standard PKCE via `openid-client` | +| `electron` or `tauri` | PROJECT.md target is web app on Mac Mini, not desktop app | Existing Vite/Express architecture | --- -## 6. Tooling Summary +## Version Compatibility Notes -| Tool | Purpose | Confidence | -|------|---------|------------| -| `git rebase upstream/master` | Sync with upstream releases | HIGH | -| `[nexus]` commit prefix | Identify all downstream-only commits | HIGH | -| `git range-diff` | Verify rebase replayed all patches correctly | HIGH | -| `git rerere` | Auto-resolve recurring conflict patterns | HIGH | -| `packages/branding/` package | Single mutation surface for display strings | HIGH | -| `ui/src/branding.css` | CSS custom property overrides for Tailwind v4 | HIGH | -| `vite.config.ts define` | Build-time product name injection for static HTML | HIGH | +| Package | Compatible With | Notes | +|---------|-----------------|-------| +| `systeminformation ^5.31.5` | Node.js >=18 | v6 is being rewritten in TS but not released; stick with v5 | +| `smart-whisper ^0.8.1` | Node.js >=18, macOS arm64 | Prebuilt binaries for Apple Silicon — no compilation needed | +| `openid-client ^6.8.2` | Node.js >=20 | v6 is a full rewrite; do not use v5 patterns (completely different API) | +| `vectra ^0.12.3` | Node.js >=16 | File-based; no native addons, no 
compilation | +| `@heyputer/puter.js` | Browser (Vite/ESM) | Not for Node.js server use | --- -## 7. File Mutation Surface (Complete List) +## Integration Architecture -Files that `[nexus]` commits are permitted to touch, and the rationale: +``` +Browser (UI) Server (Express) +───────────────── ──────────────────────────────── +@heyputer/puter.js ──────────→ No server proxy needed + (Puter calls go direct to puter.com) -| File / Directory | Change Type | Upstream Conflict Risk | -|------------------|------------|----------------------| -| `packages/branding/` (new) | Create entire package | None — net new | -| `ui/src/branding.css` (new) | Create branding CSS | None — net new | -| `server/src/onboarding-assets/ceo/*.md` | Replace prose content | Low — prose-level conflict only if upstream rewrites instructions | -| `server/src/onboarding-assets/pm/` (new) | Create PM template | None — net new | -| `server/src/onboarding-assets/engineer/` (new) | Create Engineer template | None — net new | -| `ui/src/components/OnboardingWizard.tsx` | Replace JSX strings with branding imports | Medium — upstream actively modifies onboarding | -| `ui/src/pages/App.tsx` | Replace CLI command strings | Low — static text, rarely changed | -| `server/src/startup-banner.ts` | Replace ASCII art and startup text | Low — rarely changed | -| `cli/src/commands/onboard.ts` | Replace terminal output strings | Medium — onboarding logic changes | -| `vite.config.ts` | Add `define` block | Low — config changes rarely conflict | -| `ui/index.html` | Update `<title>` tag | Low — rarely touched | +React voice input ──────────→ POST /api/voice/transcribe + └── smart-whisper (local STT) + └── ~140MB model file in ~/.paperclip/voice/ -Files that `[nexus]` commits must NEVER touch: +GET /api/system/hardware ←──── systeminformation + └── GPU cores, total RAM, GPU model -- `packages/db/src/schema/` — DB schema -- `packages/db/src/migrations/` — migration SQL -- `packages/shared/src/constants.ts` — stored 
enum values -- `packages/shared/src/api.ts` — route constants -- `server/src/routes/` — API route handlers -- Any `package.json` `"name"` field other than the new branding package -- `pnpm-workspace.yaml` (except to add `packages/branding`) -- Any TypeScript identifier (function name, variable name, class name) +React onboarding OAuth ────────→ GET /api/oauth/google/start + └── openid-client PKCE flow + └── GET /api/oauth/google/callback + +Personal assistant chat ───────→ POST /api/assistant/chat + └── vectra recall (nomic-embed-text via Ollama) + └── context injection → selected AI provider + +TTS response ──────────────────→ POST /api/voice/synthesize + └── piper binary subprocess + └── returns raw PCM → browser Audio API +``` --- ## Sources -- [History-preserving fork maintenance with git](https://amboar.github.io/notes/2021/09/16/history-preserving-fork-maintenance-with-git.html) -- [GitHub: Strategies for friendly fork management](https://github.blog/developer-skills/github/friend-zone-strategies-friendly-fork-management/) -- [VSCodium Build System — DeepWiki](https://deepwiki.com/VSCodium/vscodium/2-build-system) -- [Git range-diff documentation](https://git-scm.com/docs/git-range-diff) -- [Git rerere — Pro Git book](https://git-scm.com/book/en/v2/Git-Tools-Rerere) -- [Mastering Git Rerere — This Dot Labs](https://www.thisdot.co/blog/mastering-git-rerere-solving-repetitive-merge-conflicts-with-ease) -- [Vite define option](https://vite.dev/config/shared-options#define) -- [Tailwind CSS v4 + Vite — CSS custom properties theming](https://medium.com/render-beyond/build-a-flawless-multi-theme-ui-using-new-tailwind-css-v4-react-dca2b3c95510) -- [A Scalable Text Management Pattern — React Context + TypeScript](https://nicholasgalante1997.medium.com/a-scalable-text-management-pattern-for-web-developers-with-react-context-and-typescript-5b26aacceceb) -- [TypeScript Record pattern for display 
labels](https://dev.to/naserrasouli/mastering-record-in-typescript-the-clean-way-to-map-enums-to-labels-and-colors-46bh) -- [How to Synchronize Your Fork with Upstream Changes](https://nhutduong.com/blog/how-to-synchronize-your-fork-repository-with-upstream-changes/) +- [Puter.js developer docs](https://developer.puter.com/) — API structure, user-pays model confirmed +- [Puter.js npm install](https://docs.puter.com/) — package name `@heyputer/puter.js` verified +- [systeminformation npm](https://www.npmjs.com/package/systeminformation) — v5.31.5 latest, v6 in progress +- [systeminformation GPU docs](https://systeminformation.io/graphics.html) — Apple Silicon GPU cores confirmed +- [smart-whisper GitHub releases](https://github.com/JacobLinCool/smart-whisper/releases) — v0.8.1, October 2025 +- [openid-client npm](https://www.npmjs.com/package/openid-client) — v6.8.2, PKCE confirmed +- [openid-client v6 migration discussion](https://github.com/panva/openid-client/discussions/702) — API changes documented +- [vectra npm](https://www.npmjs.com/package/vectra) — v0.12.3, file-backed vector index +- [Ollama embedding models](https://ollama.com/blog/embedding-models) — nomic-embed-text capability confirmed +- [Piper TTS GitHub](https://github.com/rhasspy/piper) — macOS arm64 binary available +- [Running Piper TTS with JS (Bun)](https://n4ze3m.com/blog/running-piper-tts-with-javascript-in-the-bun-runtime) — ONNX approach documented +- [mem0 Node SDK docs](https://docs.mem0.ai/open-source/node-quickstart) — OpenAI default confirmed, no local option documented +- [clack/prompts npm](https://www.npmjs.com/package/@clack/prompts) — v1.2.0 latest (CLI already uses ^0.10.0) +- [npx bin field pattern](https://docs.npmjs.com/cli/v11/commands/npx/) — official npm docs --- -*Stack research: 2026-03-30* +*Stack research for: Nexus v1.5 Smart Onboarding + Personal AI Assistant* +*Researched: 2026-04-02* +*Prior milestone stack research (fork maintenance): see STACK.md entry dated 
2026-03-30 (this file was overwritten; the fork maintenance content is preserved in git history)*
diff --git a/.planning/research/SUMMARY.md b/.planning/research/SUMMARY.md
index 4feb6c4d..3a0de48f 100644
--- a/.planning/research/SUMMARY.md
+++ b/.planning/research/SUMMARY.md
@@ -1,17 +1,17 @@
 # Project Research Summary
 
-**Project:** Nexus (fork of Paperclip)
-**Domain:** Display-layer fork of a TypeScript AI agent orchestration monorepo
-**Researched:** 2026-03-30
-**Confidence:** HIGH
+**Project:** Nexus v1.5 — Smart Onboarding + Personal AI Assistant
+**Domain:** Forked open-source AI platform (Paperclip) — additive features on existing monorepo
+**Researched:** 2026-04-02
+**Confidence:** MEDIUM-HIGH
 
 ## Executive Summary
 
-Nexus is a personal-use fork of Paperclip, an open-source AI agent orchestration platform. The project scope is strictly display-layer: rename corporate metaphors (Company, CEO, Board) to solo-developer vocabulary (Workspace, Project Manager, Owner), replace the onboarding wizard with a zero-friction root-directory-picker flow, and ship predefined PM and Engineer agent templates. No engine changes, no schema changes, no route changes. Every functional capability is inherited from upstream — the work is entirely in what the product communicates to its operator.
+Nexus v1.5 adds a smart multi-step onboarding flow and Personal AI Assistant mode to an existing, working Paperclip fork. The product's primary value is removing every barrier between "first run" and "first useful AI interaction" — hardware detection drives model recommendations, Puter.js eliminates API key requirements for cloud AI, and persistent memory makes the assistant mode feel meaningfully different from a stateless chat interface. Experts building this type of product treat the onboarding funnel as the highest-risk surface: users who cannot configure a provider in the first two minutes abandon.
The recommended approach is a tiered provider architecture (local Ollama → Puter.js zero-config cloud → Google OAuth → API key) that steers users toward local-first and uses Puter.js as the escape hatch for users who won't install Ollama, not as the default recommendation. -The recommended approach is rebase-over-upstream with a `[nexus]` commit prefix convention, all fork-specific strings isolated in a new `packages/branding/` package and `ui/src/lib/nexus-labels.ts` file, and a Vite alias to redirect the `OnboardingWizard` import to a fully Nexus-owned replacement component. This architecture concentrates the entire mutable surface in new files that upstream will never create, minimising rebase conflict exposure to a small number of well-understood lines in five upstream files. +The architecture is additive by design. Zero new database tables are introduced — all state lives in the existing `instance_settings.general` JSONB column or file-backed JSON in the server data directory. Four new server route sets mount into the existing Express app, and five new onboarding step components extend the existing NexusOnboardingWizard via Vite alias. This approach preserves upstream rebase safety, which remains the single most important constraint for a maintained fork. The most important technical decision is that Puter.js must be proxied through the server-side adapter system rather than called browser-direct, to preserve cost tracking, session management, and memory injection. This is not optional — browser-direct Puter.js is the primary anti-pattern and must be called out in every phase spec. -The principal risk is accidental scope creep into Zone B (code identifiers) or Zone C (dual-purpose stored DB values) during rename work. A single naive find-replace that touches `companyService`, `"ceo"` in `AGENT_ROLES`, or `/api/companies` routes would shatter rebasability and require a recovery that undoes the entire rename. 
The mitigation is a strict three-zone taxonomy applied file-by-file from the first commit, combined with a pre-commit hook enforcing the `[nexus]` prefix and a weekly rebase cadence. +The top risks are: (1) Puter.js bypassing the Paperclip adapter machinery if implemented browser-direct, (2) OAuth token storage in localStorage creating security exposure and upstream key collisions, (3) persistent memory injecting sensitive data (credentials, API keys) into system prompts without sanitization, (4) hardware detection returning misleading values on Apple Silicon where unified memory is shared between CPU and GPU, and (5) the onboarding probe endpoint requiring board auth that does not exist yet on a fresh install. All five risks are avoidable with explicit architectural constraints established before implementation begins. --- @@ -19,122 +19,139 @@ The principal risk is accidental scope creep into Zone B (code identifiers) or Z ### Recommended Stack -Paperclip's existing stack is retained without alteration. The fork adds one new workspace package (`@paperclipai/branding`) and two new UI files (`nexus-labels.ts`, `branding.css`) — all additions, no modifications to `package.json` names, tsconfig paths, or Tailwind config structure. +The existing stack (React 19, Vite 6, Tailwind v4, TanStack Query v5, LibSQL/Drizzle, Commander.js, better-auth) requires zero changes. All v1.5 additions are strictly additive. New server dependencies: `systeminformation ^5.31.5` (hardware detection), `openid-client ^6.8.2` (OAuth PKCE), `vectra ^0.12.3` (file-backed vector memory), `smart-whisper ^0.8.1` (local STT). New UI dependency: `@heyputer/puter.js` (zero-config cloud AI SDK, browser-only). Piper TTS uses no Node.js library — the `piper` binary is spawned via `child_process`. A new standalone package `packages/buildthis/` provides the `npx buildthis` CLI entry point with no additional external dependencies. 
-The only tooling additions are: `git rerere` enabled in the repo config to auto-replay recurring conflict resolutions, and a `vite.config.ts` `define` block for build-time product name injection into static HTML. Both are zero-dependency changes.
+Key version constraints: `openid-client` v6 is a complete API rewrite and v5 patterns do not apply. `systeminformation` v5 is stable; v6 TypeScript rewrite is unreleased. `@heyputer/puter.js` is browser-only and must never be imported in server code. `smart-whisper ^0.8.1` has prebuilt macOS arm64 binaries, avoiding compilation on the Mac Mini M4 target. `vectra ^0.12.3` is file-backed with no native addons — no compilation required anywhere.
-**Core tools:**
-- `git rebase upstream/master` + `[nexus]` prefix convention: fork sync strategy — standard practice used by git-for-windows, VSCodium, microsoft/git
-- `git range-diff` + `git rerere`: rebase verification and auto-resolution — official Git tooling, no third-party risk
-- `packages/branding/` (`@paperclipai/branding`): single string mutation surface — new package in existing namespace, zero import-path disruption
-- `ui/src/lib/nexus-labels.ts`: UI-layer label registry — new file, zero upstream conflict risk
-- `ui/src/branding.css`: Tailwind v4 CSS custom property overrides — new file, zero upstream conflict risk
-- Vite `resolve.alias` for `OnboardingWizard`: build-time component swap — existing vite.config.ts already uses alias syntax
+**Core technologies:**
+- `@heyputer/puter.js`: Zero-config cloud AI (UI auth only, never server) — user-pays model, 500+ models, zero developer billing
+- `systeminformation ^5.31.5`: Server-side GPU/RAM/Apple Silicon detection — only comprehensive cross-platform Node.js option with 20M+ monthly downloads
+- `smart-whisper ^0.8.1`: Local STT via whisper.cpp — native Apple Neural Engine acceleration on macOS arm64; prebuilt binaries available
+- Piper binary via `child_process`: Text-to-speech — no mature Node.js binding exists; subprocess is the proven production path
+- `openid-client ^6.8.2`: OAuth PKCE flows — certified, middleware-free, works alongside existing `better-auth`
+- `vectra ^0.12.3` + Ollama `nomic-embed-text`: File-backed semantic memory — zero infrastructure, MIT licensed, reuses existing Ollama ecosystem, avoids OpenAI dependency
+- `packages/buildthis/`: npx CLI bootstrapper — standard `bin` field pattern, uses only Node.js built-ins and existing `@clack/prompts`
+
+**What NOT to add:** `passport.js` (conflicts with existing `better-auth`), LangChain/LlamaIndex (40-80MB footprint), `mem0ai` Node SDK (OpenAI hard dependency), `@mintplex-labs/piper-tts-web` on the server (browser-only).
 ### Expected Features
-**Must have (table stakes) — all already exist in upstream, display rename only:**
-- Dashboard with live agent status (SSE-backed, Company → Workspace rename)
-- Real-time run logs and heartbeat transcript
-- Cost visibility per agent (`cost_events` table already tracked)
-- Task/issue list with status and sub-task hierarchy
-- Agent status indicators (idle/running/paused)
-- One-command startup (`nexus run` replacing `paperclipai run`)
-- Human approval workflow (approvals table and routes intact)
-- Agent configuration page with config revision history
-- Scheduled task creation (routines with cron)
-- CLI help text using Nexus vocabulary throughout
+**Must have (v1.5 launch — P1):**
+- Mode selection (Personal AI / Project Builder / Both) — gates all assistant-specific features; minimum valid state for skip-all must be defined first
+- Hardware auto-detection + RAM/VRAM-aware model recommendation — primary UX claim; Apple Silicon requires special handling
+- Puter.js zero-config cloud tier — removes Ollama installation barrier; must be server-proxied
+- Personal AI Assistant chat with persistent memory — defines the mode as meaningfully different from stateless chat
+- Summary screen landing straight into chat — closes the onboarding funnel
+- Every step skippable — PROJECT.md requirement; skip-all must produce a working workspace with one agent
+- Piper TTS — completes the voice loop Whisper STT started in v1.3
-**Should have (differentiators):**
-- Zero-question onboarding — root directory picker, auto-create PM + Engineer agents, no "company name" or "CEO" prompts (highest-impact UX change)
-- Predefined agent templates (PM + Engineer) — SOUL.md, AGENTS.md, HEARTBEAT.md, TOOLS.md for each role
-- Workspace-first mental model — systematic string audit across all UI and CLI surfaces
-- Nexus branding — logo, `<title>`, CLI binary name (`nexus`), favicon
-- "Add Agent" dialog with template dropdown replacing the "hire agent" flow
-- Human-readable agent directories under user root (`~/RaglanWork/agents/engineer/`)
+**Should have (v1.5.x — P2 differentiators):**
+- Project handoff ("turn this conversation into a project") — novel UX, no off-the-shelf solution; requires stable assistant mode first
+- MCP server connections (curated list, one-click add) — power user expectation; namespace all tool names to avoid Hermes skill collisions
+- Google OAuth cloud tier (Gemini without API key) — escape hatch when Puter.js limits surface; policy risk with third-party OAuth needs documentation
+- `npx buildthis` CLI entry point — zero-install UX; verify `npm search buildthis` for name collision before publishing
 **Defer (v2+):**
-- Full Catppuccin Mocha theme (high visual risk for v1, CSS custom properties make it addable later as a single-file change)
-- Telegram Channels integration (separate project scope)
-- Recipe Registry plugin (separate project scope)
-- Plugin API event renames (`company.created` etc.) — would break existing plugins silently
-- MCP connector layer abstraction (upstream adapter system already handles this)
+- OpenAI OAuth free tier — aggressive rate limits, unstable UX, LOW confidence on specifics
+- Cloud memory sync — GDPR scope, multi-device auth, enormous complexity for single-user product
+- Multi-MCP orchestration — enterprise complexity for personal tool
+- Streaming TTS word-by-word — browser Audio API complexity; sentence-buffered TTS is the practical optimum
-**Critical path for differentiators:** D2 (Agent Templates) → D1 (Zero-Question Onboarding) → D4 (Human-Readable Directories). D3 (Workspace Mental Model), D5 (Branding), and D6 (Add Agent Dialog) can ship in any order alongside or after.
+**Provider tier ordering (steers users correctly):**
+Tier 0 (existing Hermes/Claude Code/OpenClaw) → Tier 1 (local Ollama, most private) → Tier 2 (Puter.js zero-config) → Tier 3 (Google OAuth) → Tier 4 (API key/subscription). Tiers 0 and 1 are the recommendations; Tier 2 is the fallback, not the default.
 ### Architecture Approach
-The architecture goal is to confine all fork-specific content to new files that upstream will never create. Two strategies cover every change type: (1) add-only new files for net-new content (zero conflict risk), and (2) minimal inline edits with `// [nexus]` markers on lines that must touch existing upstream files (string changes only, never identifier renames). For the one component requiring substantial structural rewriting (OnboardingWizard), a Vite alias redirects the import to a fully Nexus-owned file, leaving the upstream file untouched and allowing upstream to evolve it freely.
+All v1.5 features hook into existing extension points without touching DB schema, API routes, or TypeScript identifiers from upstream. The NexusOnboardingWizard Vite alias continues as the sole onboarding replacement surface. File-backed JSON in `data/memory/<companyId>.json` handles assistant memory with no migration. Puter.js is proxied through a new `puterProxyService` that stores the auth token in `company_secrets` and pipes SSE output in the exact format the existing `useStreamingChat` hook already consumes. The four new server route sets (hardware, puter-proxy, voice, memory) are mounted in a single four-line addition to `app.ts`.
 **Major components:**
-1. `packages/branding/` (NEW) — canonical vocabulary constants (`VOCAB`, `DISPLAY_ROLE_LABELS`); the only place Nexus display strings are defined
-2. `ui/src/lib/nexus-labels.ts` (NEW) — UI-layer label registry imported by components instead of hardcoded strings
-3. `ui/src/nexus/OnboardingWizard.tsx` (NEW) — full Nexus onboarding replacement; upstream `OnboardingWizard.tsx` left untouched
-4. `server/src/onboarding-assets/pm/` and `engineer/` (NEW) — predefined agent template directories; zero conflict risk as net-new paths
-5. `ui/src/branding.css` (NEW) — Tailwind v4 CSS custom property overrides for brand colors
-6. `packages/shared/src/constants.ts` (MODIFIED, 1 line) — `ceo: "Project Manager"` in `AGENT_ROLE_LABELS`; the only upstream constants file touched
-7. `server/src/home-paths.ts` (MODIFIED, 1 line) — default home dir `".nexus"`
-8. `ui/vite.config.ts` (MODIFIED, 1 line) — alias entry redirecting `OnboardingWizard` import
+1. `hardwareService` (NEW, server) — detects GPU/RAM/Apple Silicon via `systeminformation`; 5-min cache; returns `{ unifiedMemory: true, totalBytes }` for M-series chips; unauthenticated endpoint required for pre-auth onboarding
+2. `puterProxyService` (NEW, server) — stores Puter auth token in `company_secrets`, proxies AI calls as SSE matching existing chat pipeline format; Puter auth popup is UI-only
+3. `voiceService` (NEW, server) — manages `smart-whisper` for STT and Piper binary subprocess for TTS; graceful degradation if models not downloaded
+4. `memoryService` (NEW, server) — file-backed JSON memory per `companyId`; sanitization blocklist at write time; injects formatted memory block into system prompt
+5. `NexusOnboardingWizard.tsx` (MODIFIED, UI) — multi-step wizard consuming 5 new step components from `ui/src/components/onboarding/`
+6. `PersonalAssistantPage` (NEW, UI) — full-screen assistant experience; re-uses ChatPanel with `assistantMode` prop; lazy-loaded
+7. `packages/buildthis/` (NEW) — standalone npm package; health-check detects running Nexus; opens browser or guides install
+
+**Build dependency order (from ARCHITECTURE.md):**
+Phase 1 (Hardware) → Phase 2 (Puter Proxy) → Phase 3 (Wizard Assembly) → Phase 4 (Memory + Assistant Mode) → Phase 5 (Voice) → Phase 6 (buildthis CLI)
 ### Critical Pitfalls
-1. **Renaming a code identifier that is also a stored DB value** — `"ceo"`, `"hire_agent"`, `"bootstrap_ceo"`, `"board"`, `"company"` are stored in DB rows, not just TypeScript constants. Renaming the constant value silently breaks existing installations (old rows no longer match). Mitigation: rename only the label map value (`ceo: "Project Manager"`), never the key (`ceo`). Grep for any target string in `packages/db/src/schema/` before renaming.
+1. **Puter.js browser-direct bypasses the adapter system** — cost tracking, session codec, and memory injection all break. `@heyputer/puter.js` in the UI is for the auth popup only; all AI calls go through `POST /api/puter-proxy/chat`. Recovery cost if shipped wrong: HIGH.
-2. **Bulk find-replace contaminating Zone B (code identifiers)** — a naive global replace of "company" touches `companyService`, import paths, and DB schema values alongside JSX strings. Result: hundreds of rebase conflicts in files that should never have been modified. Mitigation: three-zone taxonomy enforced file-by-file; no global find-replace ever.
+2. **OAuth tokens in localStorage** — XSS exposure; key collision with upstream Paperclip `localStorage` keys. All OAuth tokens (Google, Puter) must be stored server-side via existing `secretService`. Browser holds only a session indicator.
-3. **Upstream rebase cadence drift** — fork conflicts accumulate non-linearly. A two-week gap becomes a four-hour archaeology session. Mitigation: weekly rebase on a fixed schedule, `[nexus]` prefix from the first commit, CI rebase check on a test branch.
+3. **Persistent memory storing credentials** — regex-based blocklist (API key patterns, token patterns) must be applied at write time, not retrieval time. MCP tool results and user-pasted content both need the same sanitization. Recovery cost if shipped without: HIGH (requires retroactive purge).
-4. **Renaming `~/.paperclip` config path without a migration** — existing installations lose all agents, projects, and API keys on next startup if the config path is renamed without a read-both-paths fallback. Mitigation: check `~/.nexus` first, fall back to `~/.paperclip`; implement the pointer-file mechanism before shipping the home dir change.
+4. **Apple Silicon VRAM reporting** — M-series has unified memory; `os.totalmem()` is NOT GPU VRAM. Use `os.freemem()` as baseline, apply 0.75 multiplier, label all recommendations as "estimated." UI copy must say "unified memory" not "VRAM" for M-series chips.
-5. **Partial rename — missing occurrences across 12+ files** — "CEO" appears in at least 12 distinct files. Without an i18n layer there is no compile-time verification that a rename is complete. Mitigation: run `grep -ri "CEO\|company\|board\|hire\|paperclip" ui/src cli/src server/src` after each phase and verify every remaining occurrence is intentional (Zone B/C).
+5. **Onboarding probe auth-gated on board auth** — hardware detection runs before board auth exists on a fresh install. A separate unauthenticated `GET /system/providers` endpoint is required. Without it, all provider probing silently returns 403 and auto-detection never works.
+
+6. **Vite alias silent divergence from upstream** — after every upstream rebase, diff `OnboardingWizard.tsx` against the prior upstream version and integrate any new props into `NexusOnboardingWizard.tsx`. Without this protocol, upstream wizard improvements are silently discarded.
+
+7. **Piper TTS cold start hang** — WASM voice model downloads on first synthesis call (5–30 seconds), appearing as a broken feature. Pre-warm the model on a background thread during the onboarding voice step. Show download progress before enabling the toggle.
+
+8. **Multi-provider creating competing defaults** — one primary provider per agent; do not let the wizard create PM and Engineer on different providers silently. Project Builder agents default to local/privacy-first; Personal AI assistant defaults to highest-quality available.
 ---
 ## Implications for Roadmap
-Based on research, suggested phase structure:
+Based on the dependency graph in ARCHITECTURE.md and the pitfall-to-phase mapping in PITFALLS.md, the build order is fixed by component dependencies and upstream-conflict risk sequencing.
-### Phase 1: Foundation and String Infrastructure
-**Rationale:** Establishes all new files with zero upstream file touches. Creates the containment structure before any existing file is modified. Safe to rebase at any point. Pre-commit hook and zone taxonomy documented here — if these are not in place before Phase 2, all subsequent work is at risk.
-**Delivers:** `packages/branding/`, `ui/src/lib/nexus-labels.ts`, `ui/src/nexus/` directory, `server/src/onboarding-assets/pm/` and `engineer/` skeleton directories, `[nexus]` pre-commit hook, zone taxonomy document in `.planning/`.
-**Addresses:** D2 partial (template directories created), D5 partial (branding package scaffold)
-**Avoids:** Pitfall 8 (no-prefix commits), Pitfall 2 (Zone B contamination from lack of taxonomy)
+### Phase 1: Hardware Detection + Mode Selection Foundation
+**Rationale:** All other phases depend on knowing the hardware tier and the user's chosen mode. Mode selection gates which features are surfaced. Hardware detection drives model recommendations. Critically, the unauthenticated probe endpoint (Pitfall 14) and the skip-all minimum valid state (Pitfall 22) must both be defined here as test cases before any provider probing or wizard step is built. This is the riskiest design phase even though it contains no upstream file modifications.
+**Delivers:** `hardwareService`, `GET /api/hardware/info`, unauthenticated `GET /system/providers`, `HardwareSummaryStep` and `ModeSelector` components, model recommendation lookup table with Apple Silicon handling, skip-all minimum valid state definition and test
+**Addresses:** Hardware auto-detection + model recommendation (P1), Mode selection UI (P1)
+**Avoids:** Pitfalls 13 (Apple Silicon VRAM), 14 (probe auth), 17 (competing defaults), 22 (skip-all breakage), 26 (stale model catalog fallback heuristic)
-### Phase 2: Constants, Labels, and Home Directory
-**Rationale:** Touches three upstream files with one-line changes each. These are the lowest-risk upstream file modifications: rarely-changed lines, isolated diffs, immediately verifiable. Completing this phase makes every downstream component able to import correct labels before any component is touched.
-**Delivers:** `AGENT_ROLE_LABELS.ceo = "Project Manager"` live, home dir default changed to `.nexus` with read-both-paths fallback, `DISPLAY_ROLE_LABELS` exported from branding package.
-**Addresses:** D3 (core vocabulary change), D4 partial (home dir pointer)
-**Avoids:** Pitfall 1 (dual-purpose stored values — keys unchanged), Pitfall 4 (config migration — fallback implemented here)
+### Phase 2: Puter.js Zero-Config Cloud Tier
+**Rationale:** Puter.js is the primary escape hatch for users who won't install Ollama. The server-proxy pattern must be established before the UI provider step is built — implementing UI first creates the risk of accidentally wiring browser-direct calls. The Puter auth popup is the one legitimate browser-side use; everything else is server-mediated.
+**Delivers:** `puterProxyService`, `POST /api/puter-proxy/chat` (SSE relay), `POST /api/puter-proxy/auth`, Puter section of `ProviderTierStep` (UI), Puter auth popup, Puter token storage via `secretService`
+**Uses:** `@heyputer/puter.js` (UI popup only), server-side HTTP calls to Puter API
+**Avoids:** Pitfall 15 (Puter.js bypassing adapter system), Pitfall 16 (OAuth tokens in localStorage)
-### Phase 3: UI and CLI String Renames
-**Rationale:** Surface-area is larger (multiple upstream files) but each change is string-only with `// [nexus]` markers. Individual commits per file keep each rebase conflict isolated and mechanically resolvable. The branding infrastructure from Phases 1–2 must exist before this phase to avoid scattering string definitions.
-**Delivers:** All "Company/CEO/Board" display strings replaced with "Workspace/Project Manager/Owner" across `Companies.tsx`, `CompanyRail.tsx`, `CompanySettings.tsx`, `InstanceSidebar.tsx`, `cli/onboard.ts`, `startup-banner.ts`. CLI binary renamed to `nexus` atomically with all instructional copy updated.
-**Addresses:** D3 (complete), D5 (complete), D6 partial (dialog strings updated)
-**Avoids:** Pitfall 6 (atomic CLI rename), Pitfall 7 (post-phase grep audit), Pitfall 10 (test assertions updated in same commits)
+### Phase 3: Multi-Step Onboarding Wizard Assembly
+**Rationale:** After hardware detection and Puter.js are independently built and tested, the wizard is assembled. This is the phase that modifies `NexusOnboardingWizard.tsx` substantially — establish the post-rebase diff protocol before touching this file. The ProviderTierStep covers all provider tiers (local, Puter, OAuth). VoiceSetupStep UI shell is included here; voice service is wired in Phase 5.
+**Delivers:** Refactored `NexusOnboardingWizard.tsx` (multi-step), `OnboardingSummaryStep`, `VoiceSetupStep` (shell only), OAuth PKCE popup pattern for Google Gemini, `instance_settings.general.nexus` config write, navigation routing to PersonalAssistantPage vs Dashboard
+**Implements:** Onboarding Wizard data flow from ARCHITECTURE.md
+**Avoids:** Pitfall 12 (Vite alias divergence — diff protocol in place), Pitfall 22 (skip-all confirmed from Phase 1), Pitfall 17 (one primary provider per mode)
-### Phase 4: Onboarding Redesign
-**Rationale:** Most complex change goes last. Vite alias approach means upstream `OnboardingWizard.tsx` is never touched and can evolve independently. PM and Engineer template content (written in Phase 1) is wired up here. Onboarding API shape mismatch (workspace name derived from directory basename) must be explicitly resolved.
-**Delivers:** `ui/src/nexus/OnboardingWizard.tsx` full replacement (root dir picker, auto-create PM + Engineer agents, one-step flow), Vite alias in `vite.config.ts`, `ceo/` onboarding asset content replaced with PM framing, PM and Engineer template files populated, "Add Agent" dialog updated with template dropdown.
-**Addresses:** D1 (complete), D2 (complete), D4 (complete), D6 (complete)
-**Avoids:** Pitfall 3 (ceo/ directory name kept, only content replaced), Pitfall 9 (API shape documented and workspace name derived before implementation)
+### Phase 4: Persistent Memory + Personal Assistant Mode
+**Rationale:** Memory injection modifies the existing chat route — the highest-risk upstream file modification in the entire milestone. It comes after onboarding is validated so mode context is reliable before memory is scoped to it. Memory sanitization is built at write time into the schema (not patched post-launch). This phase also defines the conversation isolation strategy between assistant and project builder modes.
+**Delivers:** `memoryService` with write-time sanitization blocklist, `GET/POST/DELETE /api/companies/:id/memory`, memory injection in chat route (MODIFIED), `PersonalAssistantPage`, `AssistantMemoryBar`, `useAssistantMemory` hook, conversation isolation via agent-based filter
+**Avoids:** Pitfall 19 (credential injection via memory), Pitfall 23 (assistant/project builder context bleed)
+
+### Phase 5: Voice (Whisper STT + Piper TTS)
+**Rationale:** Independent of Phase 4 but requires Phase 3 (onboarding wizard must exist to surface VoiceSetupStep). Piper pre-warming strategy must be designed before the TTS toggle is wired, not after. This phase is isolated enough to be deprioritized or built in parallel without blocking Phase 4 or 6.
+**Delivers:** `voiceService` (smart-whisper + Piper subprocess), `POST /api/voice/transcribe`, `POST /api/voice/speak`, `GET /api/voice/status`, VoiceSetupStep wired into onboarding wizard, `useVoiceInput` and `useVoiceSpeech` hooks, ChatInput mic button (MODIFIED — upstream file, low risk), Piper pre-warm background thread with download progress indicator
+**Avoids:** Pitfall 18 (Piper TTS cold start hang)
+
+### Phase 6: npx buildthis CLI Bootstrapper
+**Rationale:** Fully independent of all other phases. Can be built in parallel or deferred to v1.5.x. P2 priority — useful for sharing Nexus but not required for core assistant functionality. Primary gate is verifying `npm search buildthis` for package name collision before publishing.
+**Delivers:** `packages/buildthis/` standalone package, `bin.buildthis` entry point, health-check detection of running Nexus (`GET localhost:4000/api/health`), npm publish configuration
+**Avoids:** Pitfall 21 (npx package name collision)
 ### Phase Ordering Rationale
-- New files before upstream file modifications — zero conflict risk for the majority of work
-- Constants before components — components can import correct labels from day one
-- String renames before onboarding redesign — the vocabulary must be stable before the most complex component is written against it
-- Onboarding last — its Vite alias approach is the most architectural change; having it isolated keeps every earlier phase simple and independently rebaseable
-- Each phase produces a rebasing-clean state — can sync upstream between any two phases without compound conflicts
+- Phase 1 must precede all others because mode and hardware are inputs to every subsequent phase's UX decisions. Skip-all state definition is a hard prerequisite for Phase 3.
+- Phase 2 (Puter.js proxy) precedes wizard assembly (Phase 3) because the server proxy pattern must exist before the UI references it — wiring UI first creates the anti-pattern risk.
+- Phase 4 (memory) is separated from Phase 3 (wizard) because the chat route modification is the highest upstream-conflict-risk step and deserves its own isolated phase after onboarding is stable and tested.
+- Phase 5 (voice) and Phase 6 (buildthis) are independent of each other and can be built in either order or in parallel.
+- Each phase delivers a working, rebasing-clean state — upstream sync can occur between any two phases without compound conflicts.
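
Reviewer sketch (not part of the diff): Phase 4 calls for a write-time sanitization blocklist in `memoryService`. A minimal illustration of that idea follows. Every name here (`MemoryEntry`, `CREDENTIAL_PATTERNS`, `sanitizeForMemory`, `toMemoryEntry`) is hypothetical and does not come from the Nexus codebase. The regex patterns are illustrative shapes only and would need review before real use.

```typescript
// Hypothetical sketch of write-time redaction; not the project's actual memoryService.
interface MemoryEntry {
  text: string;
  createdAt: string; // ISO-8601 timestamp
}

// Illustrative credential shapes (assumptions, not an exhaustive blocklist).
const CREDENTIAL_PATTERNS: RegExp[] = [
  /\bsk-[A-Za-z0-9_-]{20,}\b/g, // API-key-like strings with an "sk-" prefix
  /\bAKIA[0-9A-Z]{16}\b/g, // AWS access key ID shape
  /\bBearer\s+[A-Za-z0-9._~+\/-]{20,}=*/g, // bearer tokens
  /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\b/g, // JWT-shaped strings
];

// Replace anything matching a credential pattern before the text is persisted.
function sanitizeForMemory(raw: string): string {
  return CREDENTIAL_PATTERNS.reduce(
    (text, pattern) => text.replace(pattern, "[REDACTED]"),
    raw,
  );
}

// Build the record that would be written to the per-company JSON file.
function toMemoryEntry(raw: string, now: Date = new Date()): MemoryEntry {
  return { text: sanitizeForMemory(raw), createdAt: now.toISOString() };
}
```

Because redaction happens inside `toMemoryEntry`, user-pasted content and MCP tool results would pass through the same path as long as both are stored via this one function, which is the property the pitfall list asks for.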
 ### Research Flags
-Phases with well-documented patterns (skip `/gsd:research-phase`):
-- **Phase 1:** Standard TypeScript monorepo package creation and git hook setup — no research needed
-- **Phase 2:** Single-line constant and config path changes — no research needed
-- **Phase 3:** Mechanical string replacement with documented taxonomy — no research needed
+Phases likely needing deeper research during planning:
+- **Phase 2 (Puter.js):** Puter rate limits and Node.js HTTP API behavior are not publicly documented. Need to verify server-side streaming API surface and token refresh behavior before designing the proxy service. Also: confirm Puter's terms of service allow server-side relaying of requests.
+- **Phase 4 (Memory):** The specific injection hook location in `server/src/services/chat.ts` needs codebase inspection to confirm the right insertion point. Also: decide between linear scan (v1.5) vs vectra vector search (v2) based on expected corpus size — should be explicit in the spec.
+- **Phase 5 (Voice):** `smart-whisper ^0.8.1` Apple Neural Engine acceleration claim needs verification on the actual Mac Mini M4 target before committing to `base.en` as the default model. If acceleration is not confirmed, fall back to `tiny.en`.
-Phases likely benefiting from deeper research during planning:
-- **Phase 4:** The onboarding API shape mismatch (Pitfall 9) needs the `POST /api/companies` contract documented before writing the new wizard. A brief codebase read of `server/src/routes/companies.ts` and the API client should resolve this. Not complex — 30 minutes of reading, not a full research session.
+Phases with standard patterns (skip research-phase):
+- **Phase 1 (Hardware Detection):** `systeminformation` is mature (20M+ monthly downloads), Apple Silicon behavior is officially documented. Pattern is well-established across Ollama, LM Studio, llm-checker.
+- **Phase 3 (Wizard Assembly):** React multi-step wizard patterns are well-documented. NexusOnboardingWizard Vite alias pattern is already live in the codebase.
+- **Phase 6 (buildthis CLI):** Standard npm `bin` field pattern per official Node.js docs. No novel choices.
 ---
@@ -142,44 +159,47 @@ Phases likely benefiting from deeper research during planning:
 | Area | Confidence | Notes |
 |------|------------|-------|
-| Stack | HIGH | Based on direct codebase inspection of live repo + official Git, Vite, Tailwind v4 documentation |
-| Features | HIGH (table stakes), MEDIUM (differentiators) | Table stakes verified from codebase; differentiator prioritization informed by Paperclip product notes and UX research but not validated against actual users |
-| Architecture | HIGH | Patterns derived from direct codebase inspection; Vite alias pattern verified against official docs and existing vite.config.ts in the repo |
-| Pitfalls | HIGH | Primarily from direct audit of CONCERNS.md and codebase; supplemented by fork maintenance community research |
+| Stack | HIGH | Most libraries verified via official docs; `systeminformation` and `openid-client` v6 fully confirmed; `smart-whisper` version from GitHub releases with no production deployment data |
+| Features | MEDIUM | Puter.js rate limits and production reliability unverified at scale; hardware detection patterns confirmed from Ollama/LM Studio ecosystem; UX recommendations inferred from Clerk/Vercel/Postman patterns |
+| Architecture | HIGH | Based on direct codebase inspection of `/opt/nexus/`; all extension points verified to exist; file-backed JSON approach confirmed feasible given single-user M4 Mini target |
+| Pitfalls | HIGH | Based on direct codebase analysis plus targeted research per integration domain; Apple Silicon VRAM behavior confirmed; Puter.js adapter risk confirmed from architecture analysis |
-**Overall confidence:** HIGH
+**Overall confidence:** MEDIUM-HIGH
 ### Gaps to Address
-- **OnboardingWizard API contract:** The `POST /api/companies` required fields are not fully documented in research. Before Phase 4 implementation, read `server/src/routes/companies.ts` to determine exactly what fields are required and derive a rule for the workspace name field (likely `basename(rootDir)`).
-- **Test suite audit scope:** The pre-rename test audit (Pitfall 10) requires running the grep against the actual test files. The exact count of test files asserting on "CEO" / "company" display strings is not known — this should be done as the first step of Phase 3 execution, not planning.
-- **`localStorage` key migration:** Whether to keep `"paperclip.selectedCompanyId"` or migrate it is unresolved. Given it is internal and users never see it, keeping it unchanged is the lowest-risk path and should be the default decision unless there is a specific reason to change it.
-- **Catppuccin Mocha theme scope boundary:** The `branding.css` scaffold is included in Phase 1 but full theme is deferred. The exact CSS custom property overrides needed for even minimal brand differentiation (Nexus blue-purple vs Paperclip defaults) should be defined during Phase 3 execution.
+- **Puter.js Node.js API surface:** Server-side streaming via HTTP (not the browser SDK) needs verification before `puterProxyService` is specced. Architecture assumes `stream: true` works server-side — confirm during Phase 2 planning.
+- **Puter.js rate limits and ToS:** "No restrictions" claim is unverified at scale. Design graceful degradation for rate limit responses. Attribute all costs to user's Puter account in UI copy.
+- **smart-whisper Apple Silicon acceleration:** Performance claim needs on-device verification on the Mac Mini M4 target. If not confirmed, `tiny.en` may be required as default instead of `base.en`.
+- **Google Gemini OAuth policy risk:** Using Gemini CLI OAuth with third-party apps may trigger abuse detection (GitHub issue #21866 confirmed). Gate this tier on users with active Gemini subscriptions; document limitation explicitly.
+- **Memory store performance ceiling:** Linear scan is acceptable for fewer than ~500 entries. Define the upgrade threshold to `vectra` vector search during Phase 4 planning and document it in the code.
+- **OpenAI OAuth free tier:** LOW confidence — OpenAI free tier OAuth specifics change frequently. Do not include in v1.5 scope; defer to v2+.
 ---
 ## Sources
 ### Primary (HIGH confidence)
-- `/Volumes/UsbNvme/agent/.planning/codebase/ARCHITECTURE.md` — direct codebase analysis
-- `/Volumes/UsbNvme/agent/.planning/codebase/CONCERNS.md` — direct audit of dual-purpose stored values
-- `/Volumes/UsbNvme/agent/.planning/PROJECT.md` — project constraints and scope
-- `/Volumes/UsbNvme/repos/nexus/` — live codebase inspection
-- [Git range-diff documentation](https://git-scm.com/docs/git-range-diff)
-- [Git rerere — Pro Git book](https://git-scm.com/book/en/v2/Git-Tools-Rerere)
-- [Vite resolve.alias + define documentation](https://vite.dev/config/shared-options)
+- `/opt/nexus/` direct codebase inspection (ARCHITECTURE.md) — extension points, existing patterns, upstream file risk
+- [Puter.js developer docs](https://developer.puter.com/) — user-pays model, API structure confirmed
+- [systeminformation official docs](https://systeminformation.io/graphics.html) — Apple Silicon GPU core detection confirmed
+- [openid-client npm v6](https://www.npmjs.com/package/openid-client) — PKCE, v6 API confirmed
+- [Piper TTS GitHub](https://github.com/rhasspy/piper) — macOS arm64 binary, CPU-capable, MIT license
+- [npx bin field pattern](https://docs.npmjs.com/cli/v11/commands/npx/) — official npm docs
 ### Secondary (MEDIUM confidence)
-- [GitHub: Strategies for friendly fork management](https://github.blog/developer-skills/github/friend-zone-strategies-friendly-fork-management/)
-- [History-preserving fork maintenance with git](https://amboar.github.io/notes/2021/09/16/history-preserving-fork-maintenance-with-git.html)
-- [VSCodium Build System — DeepWiki](https://deepwiki.com/VSCodium/vscodium/2-build-system)
-- [Stop Forking Around — Fork Drift in Open Source](https://preset.io/blog/stop-forking-around-the-hidden-dangers-of-fork-drift-in-open-source-adoption/)
-- [Designing For Agentic AI: Practical UX Patterns (Smashing Magazine, 2026)](https://www.smashingmagazine.com/2026/02/designing-agentic-ai-practical-ux-patterns/)
-- [Tailwind CSS v4 + Vite — CSS custom properties theming](https://medium.com/render-beyond/build-a-flawless-multi-theme-ui-using-new-tailwind-css-v4-react-dca2b3c95510)
+- [smart-whisper GitHub releases](https://github.com/JacobLinCool/smart-whisper/releases) — v0.8.1, Apple Silicon acceleration claim
+- [vectra npm](https://www.npmjs.com/package/vectra) — file-backed vector index, MIT license
+- [Ollama embedding models](https://ollama.com/blog/embedding-models) — nomic-embed-text capability confirmed
+- [mem0 Node SDK docs](https://docs.mem0.ai/open-source/node-quickstart) — OpenAI default confirmed, no local option documented
+- [Google Gemini free tier 2026](https://www.aifreeapi.com/en/posts/google-gemini-api-free-tier) — Gemini 2.0 Flash free tier
+- [Google Gemini OAuth via Opencode](https://syntackle.com/blog/google-gemini-ai-subscription-with-opencode/) — OAuth pattern confirmed in related tool
 ### Tertiary (LOW confidence)
-- UX claims regarding cognitive load from vocabulary mismatch — reasonable inference, not validated against actual users
+- [Running Piper TTS with JS (Bun)](https://n4ze3m.com/blog/running-piper-tts-with-javascript-in-the-bun-runtime) — subprocess approach validated in Bun; no Node.js production data
+- [Google Gemini OAuth policy risk](https://github.com/google-gemini/gemini-cli/issues/21866) — third-party OAuth may trigger abuse detection; single GitHub issue
+- Puter.js rate limits — "no restrictions" from Puter marketing only; no independent verification
 ---
-*Research completed: 2026-03-30*
+*Research completed: 2026-04-02*
 *Ready for roadmap: yes*