# Domain Pitfalls: Nexus Fork of Paperclip

**Domain:** Forked open-source project with display-layer renames, no i18n layer
**Researched:** 2026-04-02 (updated for v1.5 milestone: smart onboarding, multi-provider, voice TTS, persistent memory, assistant mode, `npx buildthis`)
**Confidence:** HIGH, based on direct codebase analysis of `/opt/nexus/` plus targeted research on each new integration domain

---

## About This Document

This file covers pitfalls for the **v1.5 milestone additions**. The original pitfalls (Pitfalls 1–11) covering fork hygiene, display-layer rename discipline, and upstream sync remain valid and are preserved below. Pitfalls 12–26 are new for v1.5.

---

## Critical Pitfalls (Fork Hygiene: v1.0–1.4, still active)

---

### Pitfall 1: Renaming a Code Identifier That Is Also a Stored DB Value

**What goes wrong:** You rename a TypeScript constant, CLI command, or function to use the new Nexus vocabulary, not realising the same string is also stored as a literal value in database rows. The app breaks for any existing installation because the server now compares `approval.type` against the renamed string while the DB rows still contain `"hire_agent"`.

**Why it happens:** In Paperclip the same string serves double duty: it is both a TypeScript constant/enum and a persisted DB value. The CONCERNS.md audit identifies these dual-purpose strings explicitly: `"ceo"`, `"hire_agent"`, `"approve_ceo_strategy"`, `"bootstrap_ceo"`, `"company"` in goal levels, `"board"` in auth challenges.

**How to avoid:**

1. Treat every string in the Summary Risk Table (CONCERNS.md) marked "Critical" as immutable.
2. For display renaming only: change label maps (`AGENT_ROLE_LABELS`, `ApprovalPayload` display maps) without touching the underlying constant value.
3. Before touching any string, grep for it in `packages/db/src/schema/` and migration files.
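
A minimal sketch of the label-map approach from step 2, assuming a simplified shape for `AGENT_ROLE_LABELS` (the real map in the codebase may be a plain object with more entries). The label text `"Project Manager"` and the helpers `agentRoleLabel` and `isHireApproval` are hypothetical and exist only for illustration; the stored literals `"ceo"` and `"hire_agent"` are the ones named above and stay untouched.

```typescript
// Stored values (Zone C): these literals are persisted in DB rows and must not change.
const STORED_ROLE_CEO = "ceo";
const STORED_APPROVAL_HIRE = "hire_agent";

// Display layer (Zone A): only this map changes during a rename.
// "Project Manager" is an illustrative label, not the confirmed Nexus vocabulary.
const AGENT_ROLE_LABELS = new Map<string, string>([
  [STORED_ROLE_CEO, "Project Manager"],
]);

// Hypothetical helper: resolve a stored value to its display label,
// falling back to the raw stored value when no label is registered.
function agentRoleLabel(storedValue: string): string {
  return AGENT_ROLE_LABELS.get(storedValue) ?? storedValue;
}

// Server-side checks keep comparing against the stored literal,
// so existing rows keep matching after the display rename.
function isHireApproval(approval: { type: string }): boolean {
  return approval.type === STORED_APPROVAL_HIRE;
}
```

With this split, a vocabulary change touches only the map entries, and a grep for `"hire_agent"` in `packages/db/src/schema/` still finds the same literal the server compares against.
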

**Warning signs:**

- Any string appearing in `packages/db/src/schema/` or migration files
- Approval, invite, and goal lists empty on existing install but work on fresh install

**Phase to address:** Phase 1 (Display Rename)

---

### Pitfall 2: Treating "Display-Only Rename" as a Simple Find-Replace

**What goes wrong:** Bulk `sed` or IDE find-replace on "company" → "workspace" across the entire codebase. Touches service files, route files, schema files, and test files indiscriminately. The next `git rebase upstream/master` has conflicts on hundreds of files.

**Why it happens:** "Display-only" is a policy decision, not a property the codebase enforces. Nothing in the TypeScript source distinguishes a user-facing label string from an internal identifier.

**How to avoid:**

1. Establish a strict three-zone taxonomy: Zone A (display strings, safe), Zone B (code identifiers, do not rename), Zone C (dual-purpose stored values, label map only).
2. Never run a global find-replace. Work file-by-file.

**Warning signs:**

- PR diff touching `server/src/services/`, `server/src/routes/`, or `packages/db/` with rename changes
- Diff showing TypeScript identifier name changes (not JSX string literals)

**Phase to address:** Phase 1 (Display Rename)

---

### Pitfall 3: Diverging the Onboarding Assets Directory Name From Upstream

**What goes wrong:** Renaming `server/src/onboarding-assets/ceo/` to `pm/`. Upstream changes a file inside `ceo/` in a future commit. Git cannot reconcile rename-on-one-side with content-edit-on-other.

**How to avoid:** Do not rename the `ceo/` directory. Change file *content* only. The directory path is Zone B.

**Warning signs:** Rebase conflict shows a file as "deleted" that you expected to be "modified."

**Phase to address:** Phase 1 (Onboarding Redesign)

---

### Pitfall 4: Changing the `localStorage` Key or `~/.paperclip` Config Path Without a Migration

**What goes wrong:** Renaming the `"paperclip.selectedCompanyId"` localStorage key or the `~/.paperclip` config path drops all existing state.

**How to avoid:** Keep key names unchanged OR implement a read-both-paths fallback that migrates existing values on boot before deleting the old key.

**Warning signs:** Server logs "no config found, starting fresh" on a machine with existing data.

**Phase to address:** Phase 2 (Directory Restructure)

---

### Pitfall 5: Upstream Rebase Cadence Slipping Below Weekly

**What goes wrong:** Fork drift. Upstream has 120+ commits since fork. Waiting accumulates compound conflicts. A 10-minute weekly rebase becomes 4 hours after a month gap.

**How to avoid:** Rebase at minimum weekly. `[nexus]` commit prefix strictly enforced. CI alert on `git rebase upstream/master` failures in a test branch.

**Warning signs:** Last rebase more than 2 weeks ago; `git log HEAD..upstream/master` shows more than 20 upstream commits unmerged.

**Phase to address:** Ongoing from Phase 1

---

### Pitfall 6: Renaming the CLI Binary Name Without a Shim

**What goes wrong:** Renaming to `nexus` without updating all four locations where `paperclipai` appears as an instructional string.

**How to avoid:** Add `nexus` as an alias; keep `paperclipai` binary working. If renaming, atomic commit covering all instructional copy.

**Phase to address:** Phase 1 (CLI String Updates)

---

### Pitfall 7: Partial Rename: Changing Some Occurrences But Not All

**What goes wrong:** "CEO" renamed in 8 of 12 files. Users see mixed vocabulary.

**How to avoid:** Post-rename `grep -ri "CEO" ui/src cli/src server/src` and verify every remaining occurrence is Zone B/C or non-user-visible.
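
The post-rename check above could be automated along these lines. This is a sketch under stated assumptions: the function `findUnreviewedHits`, the `Hit` shape, and the allowlist `ZONE_BC_PREFIXES` are all hypothetical, and the listed prefixes are only examples of paths this document treats as Zone B/C. It operates on an in-memory map of file path to content so it can run without touching the filesystem.

```typescript
interface Hit {
  file: string;
  line: number; // 1-based line number within the file
  text: string; // trimmed line content
}

// Hypothetical allowlist: path prefixes where the old term is a Zone B/C identifier.
const ZONE_BC_PREFIXES = ["server/src/onboarding-assets/ceo/", "packages/db/"];

// Case-insensitive scan (like `grep -ri`) that skips allowlisted paths and
// returns every remaining occurrence for manual review.
function findUnreviewedHits(
  files: Record<string, string>,
  term: string,
  allowedPrefixes: string[] = ZONE_BC_PREFIXES,
): Hit[] {
  const needle = term.toLowerCase();
  const hits: Hit[] = [];
  for (const file of Object.keys(files)) {
    if (allowedPrefixes.some((prefix) => file.startsWith(prefix))) continue;
    files[file].split("\n").forEach((text, index) => {
      if (text.toLowerCase().includes(needle)) {
        hits.push({ file, line: index + 1, text: text.trim() });
      }
    });
  }
  return hits;
}
```

A CI step could fail when the returned list is non-empty, which forces each leftover occurrence to be either renamed or added to the allowlist with a justification.
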

**Phase to address:** Phase 1 (Display Rename)

---

### Pitfall 8: The `[nexus]` Commit Prefix Not Applied Consistently From the Start

**What goes wrong:** Without consistent prefixing, rebase archaeology becomes necessary to identify which commits are Nexus vs. upstream.

**How to avoid:** Pre-commit hook rejecting messages not starting with `[nexus]` from the first commit.

**Phase to address:** Phase 1 (First commit)

---

### Pitfall 9: Onboarding Redesign Coupled to the Corporate Metaphor in Data Layer

**What goes wrong:** New wizard does not pass a company name; `POST /api/companies` requires it. Company created with undefined name.

**How to avoid:** Document API contract before redesigning wizard. Derive workspace name from directory basename (or VOCAB.appName as fallback, which `NexusOnboardingWizard.tsx` already does correctly).

**Phase to address:** Phase 2 (Onboarding Redesign)

---

### Pitfall 10: Forgetting to Update Tests That Assert on Display Strings

**What goes wrong:** `invite-onboarding-text.test.ts` asserts invite text contains "CEO." After rename, tests fail.

**How to avoid:** Before any rename commit, grep all `*.test.ts` files for old vocabulary terms and update in the same commit.

**Phase to address:** Phase 1 (Display Rename)

---

### Pitfall 11: Exporting a `.nexus.yaml` File While Upstream Exports `.paperclip.yaml`

**What goes wrong:** Breaking import compatibility with upstream Paperclip instances.

**How to avoid:** Keep emitting `.paperclip.yaml`. The filename and schema header are Zone B/C.

**Phase to address:** Phase 1 (Display Rename)

---

## Critical Pitfalls (v1.5 New Features)

---

### Pitfall 12: Vite Alias Swap Breaking Upstream Rebase on OnboardingWizard

**What goes wrong:** The current pattern aliases `src/components/OnboardingWizard` → `NexusOnboardingWizard` at build time via `vite.config.ts`.
If upstream renames, moves, or splits `OnboardingWizard.tsx` into multiple files, the alias silently stops matching. The build still succeeds (the alias target exists), but import resolution breaks at runtime in any code path that imports the upstream file by its new name.

More critically: when v1.5 replaces the simple wizard with a multi-step hardware-detection wizard, the alias target `NexusOnboardingWizard.tsx` grows significantly. Upstream may add new features to `OnboardingWizard.tsx` (new props, context dependencies) that `NexusOnboardingWizard.tsx` silently misses, since it fully replaces rather than extends the upstream file.

**Why it happens:** Full file replacement via Vite alias means no inheritance from upstream. Every upstream improvement to the wizard is silently discarded.

**How to avoid:**

1. After each upstream rebase, diff `OnboardingWizard.tsx` against the previous upstream version: `git diff upstream-prev..upstream-new -- ui/src/components/OnboardingWizard.tsx`. If upstream adds new props or context hooks, integrate them into `NexusOnboardingWizard.tsx`.
2. Keep `NexusOnboardingWizard.tsx` surface API identical to `OnboardingWizard.tsx` (same component name export, same props interface as far as upstream is concerned).
3. Add a CI check such as `test -f ui/src/components/OnboardingWizard.tsx` to verify the aliased-away file still exists, and confirm it still has its expected export.

**Warning signs:**

- `NexusOnboardingWizard.tsx` not using a `DialogContext` or `CompanyContext` hook that upstream's version uses
- After rebase, `pnpm dev` fails with "cannot find module" for the alias source path
- The multi-step wizard is missing features that upstream added (e.g., invite-based onboarding, workspace templates)

**Phase to address:** Phase 1 (Hardware Detection Wizard). Before building the multi-step v1.5 wizard, establish a diff-and-integrate protocol for this alias.
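
The "expected export" half of the CI check for the wizard alias is not covered by `test -f` alone. One possible sketch follows, assuming the upstream component is exported under the name `OnboardingWizard` (an assumption; confirm against the actual file). The helper `hasNamedExport` is hypothetical and uses a regex heuristic rather than a real parser, so it can be fooled by unusual formatting or by `export { OnboardingWizard as Other }`.

```typescript
// Heuristic check: does `source` appear to export a binding called `name`?
// Covers `export function X`, `export const X`, `export class X`,
// `export default function X`, and `export { ..., X, ... }`.
function hasNamedExport(source: string, name: string): boolean {
  // Escape regex metacharacters so the name is matched literally.
  const escaped = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const declaration = new RegExp(
    `export\\s+(?:default\\s+)?(?:async\\s+)?(?:function|const|let|class)\\s+${escaped}\\b`,
  );
  const exportList = new RegExp(`export\\s*\\{[^}]*\\b${escaped}\\b[^}]*\\}`);
  return declaration.test(source) || exportList.test(source);
}
```

In CI, a small script could read `ui/src/components/OnboardingWizard.tsx`, pass its contents to this function, and exit non-zero when it returns `false`, which flags an upstream rename or split right after a rebase.
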

---

### Pitfall 13: Hardware Detection Returning Inaccurate or Platform-Specific Values

**What goes wrong:** The v1.5 hardware detection step must surface GPU/RAM to recommend Ollama models. Two platform-specific traps exist on the Mac Mini M4 deploy target:

1. **VRAM is not VRAM on Apple Silicon.** The M4 uses unified memory: the same physical RAM serves both CPU and GPU. `os.totalmem()` in Node.js returns total unified memory. Reporting this as "VRAM available for Ollama" misleads: Ollama on Apple Silicon uses a portion of unified memory, but the OS, browser, and other processes also consume it. Treating `totalmem × 0.75` as GPU-available VRAM overestimates for models that also need system RAM headroom.
2. **`os.totalmem()` reads total installed RAM, not available RAM.** The existing `getRecommendedModel()` in `server/src/services/ollama.ts` already applies a 0.75 multiplier to account for OS overhead, but it uses total RAM, not free RAM. If the system is under load (Paperclip server + Ollama already running), available RAM is far lower than 75% of total.

**Why it happens:** Node.js `os` module has `totalmem()` and `freemem()` but no VRAM API. Browser `WebGL` UNMASKED_RENDERER gives GPU name but not VRAM size; actual VRAM queries are blocked by browser security sandboxing. Developers reach for the most accessible number.

**How to avoid:**

1. Use `os.freemem()` (not `totalmem()`) as the baseline for available-RAM recommendations when Ollama is already running.
2. On Apple Silicon, explicitly document in UI copy that "available memory" is unified memory shared with OS, not dedicated GPU VRAM.
3. Treat hardware detection values as hints, not guarantees. Add a message: "Recommendation based on system RAM. Actual performance may vary."
4. The pre-built model catalog (`ollama-model-catalog.json`) is the right layer for model-to-RAM requirements; use it as the authoritative source rather than computing from raw hardware numbers.
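
Steps 1 and 4 above can be combined roughly as follows. This is a sketch, not the existing `getRecommendedModel()` implementation: the `CatalogEntry` shape, its field names, and the function `recommendModel` are hypothetical, and the real `ollama-model-catalog.json` schema may differ. The function takes memory figures as arguments so the caller decides where they come from (for example `os.totalmem()` and `os.freemem()` on the server).

```typescript
// Hypothetical catalog entry shape; the real catalog schema may differ.
interface CatalogEntry {
  name: string;
  requiredBytes: number; // memory the model needs to load, taken from the catalog
}

const GIB = 1024 * 1024 * 1024;

// Picks the largest catalog model that fits the memory budget.
// Budget rule: if Ollama is already running, trust free memory;
// otherwise fall back to the 0.75 x total heuristic described above.
function recommendModel(
  totalBytes: number,
  freeBytes: number,
  catalog: CatalogEntry[],
  ollamaRunning: boolean,
): CatalogEntry | undefined {
  const budget = ollamaRunning ? freeBytes : totalBytes * 0.75;
  return catalog
    .filter((entry) => entry.requiredBytes <= budget)
    .sort((a, b) => b.requiredBytes - a.requiredBytes)[0];
}
```

Returning `undefined` when nothing fits lets the UI show a "no local model recommended" state instead of suggesting a model that would be OOM-killed at load time. The result is still a hint, so the "Actual performance may vary" message from step 3 applies.
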

**Warning signs:**

- Model recommendation shows "fits in memory" but Ollama OOM-kills it at load time
- M4 Mac Mini reports 16GB available for models but the system has 16GB total (OS needs 4–6GB)
- AMD GPU users see wildly incorrect VRAM numbers (confirmed bug in Ollama's VRAM detection for AMD/Vulkan as of 2025)

**Phase to address:** Phase 1 (Hardware Detection). Define detection methodology before building the UI layer.

---

### Pitfall 14: The Onboarding Probe Running at the Wrong Authentication Level

**What goes wrong:** The existing adapter probe endpoint (`GET /adapters/:type/probe`) requires board authentication (it rejects requests where `req.actor.type !== "board"`). The v1.5 onboarding wizard runs *during* first-time setup, before the user has authenticated. If the probe is called before board auth is established, every probe returns 403, the wizard always falls back to `claude_local`, and the user never gets the Hermes auto-detection benefit.

This is the exact scenario the current `NexusOnboardingWizard.tsx` is vulnerable to: it calls `agentsApi.probeAdapter("hermes_local")` on wizard open, but if the user arrives at the onboarding page without board auth (fresh install, incognito session), the probe silently fails and `defaultAdapter` stays `"claude_local"`.

**Why it happens:** Board auth is the right guard for post-setup adapter operations. But hardware detection and provider probing are legitimately pre-auth operations: you want to present the right setup path before any credentials exist.

**How to avoid:**

1. Create a separate `GET /system/providers` endpoint that does not require board auth. It returns available local providers (Ollama status, Hermes status) based purely on server-side detection (no user credentials needed).
2. Alternatively, make the probe endpoint check auth level: if no board auth exists (fresh install), allow the probe to run unauthenticated for a whitelist of safe probe types (`hermes_local`, `ollama`).
3. Never gate hardware detection on user credentials. Hardware is a property of the machine, not the user session.

**Warning signs:**

- Browser network tab shows 403 on the probe call during onboarding
- `defaultAdapter` in the wizard is always `"claude_local"` even when Ollama/Hermes are running
- Probe works in the settings page (user is auth'd) but not during initial onboarding

**Phase to address:** Phase 1 (Hardware Detection). The probe auth story must be designed before the multi-step wizard is built.

---

### Pitfall 15: Puter.js "Zero-Config" Promise Breaking on Paperclip's Server-Side Architecture

**What goes wrong:** Puter.js is designed for purely browser-side use: load the CDN script, call `puter.ai.chat()`, Puter handles auth via its own popup login flow. Nexus/Paperclip proxies AI calls through the server (`/api/chat`, `/api/agents`). If Puter.js is loaded browser-side and calls Puter's servers directly, it bypasses Paperclip's cost tracking, budget enforcement, session codec, and skill sync entirely.

This creates a split-brain: the Puter adapter sends messages to Puter's cloud while Paperclip's adapter system thinks the agent is using a different provider. Cost tracking shows $0 for Puter sessions. Heartbeat and session management are not wired up.

**Why it happens:** Puter.js is documented as a CDN-loaded browser library with client-side auth. The natural integration is to `