nexus/.planning/research/SUMMARY.md
2026-04-04 03:55:49 +00:00

Project Research Summary

Project: Nexus v1.5: Smart Onboarding + Personal AI Assistant
Domain: Forked open-source AI platform (Paperclip), additive features on existing monorepo
Researched: 2026-04-02
Confidence: MEDIUM-HIGH

Executive Summary

Nexus v1.5 adds a smart multi-step onboarding flow and Personal AI Assistant mode to an existing, working Paperclip fork. The product's primary value is removing every barrier between "first run" and "first useful AI interaction" — hardware detection drives model recommendations, Puter.js eliminates API key requirements for cloud AI, and persistent memory makes the assistant mode feel meaningfully different from a stateless chat interface. Experts building this type of product treat the onboarding funnel as the highest-risk surface: users who cannot configure a provider in the first two minutes abandon. The recommended approach is a tiered provider architecture (local Ollama → Puter.js zero-config cloud → Google OAuth → API key) that steers users toward local-first and uses Puter.js as the escape hatch for users who won't install Ollama, not as the default recommendation.

The architecture is additive by design. Zero new database tables are introduced — all state lives in the existing instance_settings.general JSONB column or file-backed JSON in the server data directory. Four new server route sets mount into the existing Express app, and five new onboarding step components extend the existing NexusOnboardingWizard via Vite alias. This approach preserves upstream rebase safety, which remains the single most important constraint for a maintained fork. The most important technical decision is that Puter.js must be proxied through the server-side adapter system rather than called browser-direct, to preserve cost tracking, session management, and memory injection. This is not optional — browser-direct Puter.js is the primary anti-pattern and must be called out in every phase spec.

The top risks are: (1) Puter.js bypassing the Paperclip adapter machinery if implemented browser-direct, (2) OAuth token storage in localStorage creating security exposure and upstream key collisions, (3) persistent memory injecting sensitive data (credentials, API keys) into system prompts without sanitization, (4) hardware detection returning misleading values on Apple Silicon where unified memory is shared between CPU and GPU, and (5) the onboarding probe endpoint requiring board auth that does not exist yet on a fresh install. All five risks are avoidable with explicit architectural constraints established before implementation begins.


Key Findings

The existing stack (React 19, Vite 6, Tailwind v4, TanStack Query v5, LibSQL/Drizzle, Commander.js, better-auth) requires zero changes. All v1.5 additions are strictly additive. New server dependencies: systeminformation ^5.31.5 (hardware detection), openid-client ^6.8.2 (OAuth PKCE), vectra ^0.12.3 (file-backed vector memory), smart-whisper ^0.8.1 (local STT). New UI dependency: @heyputer/puter.js (zero-config cloud AI SDK, browser-only). Piper TTS uses no Node.js library — the piper binary is spawned via child_process. A new standalone package packages/buildthis/ provides the npx buildthis CLI entry point with no additional external dependencies.

Key version constraints: openid-client v6 is a complete API rewrite and v5 patterns do not apply. systeminformation v5 is stable; v6 TypeScript rewrite is unreleased. @heyputer/puter.js is browser-only and must never be imported in server code. smart-whisper ^0.8.1 has prebuilt macOS arm64 binaries, avoiding compilation on the Mac Mini M4 target. vectra ^0.12.3 is file-backed with no native addons — no compilation required anywhere.

Core technologies:

  • @heyputer/puter.js: Zero-config cloud AI (UI auth only, never server) — user-pays model, 500+ models, zero developer billing
  • systeminformation ^5.31.5: Server-side GPU/RAM/Apple Silicon detection — only comprehensive cross-platform Node.js option with 20M+ monthly downloads
  • smart-whisper ^0.8.1: Local STT via whisper.cpp — native Apple Neural Engine acceleration on macOS arm64; prebuilt binaries available
  • Piper binary via child_process: Text-to-speech — no mature Node.js binding exists; subprocess is the proven production path
  • openid-client ^6.8.2: OAuth PKCE flows — certified, middleware-free, works alongside existing better-auth
  • vectra ^0.12.3 + Ollama nomic-embed-text: File-backed semantic memory — zero infrastructure, MIT licensed, reuses existing Ollama ecosystem, avoids OpenAI dependency
  • packages/buildthis/: npx CLI bootstrapper — standard bin field pattern, uses only Node.js built-ins and existing @clack/prompts

What NOT to add: passport.js (conflicts with existing better-auth), LangChain/LlamaIndex (40-80MB footprint), mem0ai Node SDK (OpenAI hard dependency), @mintplex-labs/piper-tts-web on the server (browser-only).

Expected Features

Must have (v1.5 launch — P1):

  • Mode selection (Personal AI / Project Builder / Both) — gates all assistant-specific features; minimum valid state for skip-all must be defined first
  • Hardware auto-detection + RAM/VRAM-aware model recommendation — primary UX claim; Apple Silicon requires special handling
  • Puter.js zero-config cloud tier — removes Ollama installation barrier; must be server-proxied
  • Personal AI Assistant chat with persistent memory — defines the mode as meaningfully different from stateless chat
  • Summary screen landing straight into chat — closes the onboarding funnel
  • Every step skippable — PROJECT.md requirement; skip-all must produce a working workspace with one agent
  • Piper TTS — completes the voice loop Whisper STT started in v1.3

Should have (v1.5.x — P2 differentiators):

  • Project handoff ("turn this conversation into a project") — novel UX, no off-the-shelf solution; requires stable assistant mode first
  • MCP server connections (curated list, one-click add) — power user expectation; namespace all tool names to avoid Hermes skill collisions
  • Google OAuth cloud tier (Gemini without API key) — escape hatch when Puter.js limits surface; policy risk with third-party OAuth needs documentation
  • npx buildthis CLI entry point — zero-install UX; verify npm search buildthis for name collision before publishing
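
The namespacing requirement for MCP tool names can be sketched as a small helper. The `mcp__<server>__<tool>` format and every function name here are assumptions made for illustration; the research does not specify a naming scheme.

```typescript
// Hypothetical helper: reduce a name segment to a conservative character set.
// The exact normalization rules are an assumption, not a Nexus or MCP convention.
function normalizeSegment(segment: string): string {
  return segment
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

// Prefix each MCP tool with its server id so that two servers (or a server
// and an existing Hermes skill) can expose a tool with the same short name.
function namespaceToolName(serverId: string, toolName: string): string {
  return `mcp__${normalizeSegment(serverId)}__${normalizeSegment(toolName)}`;
}

// Report which candidate names would still collide with already-registered skills.
function findCollisions(
  toolNames: string[],
  existingSkills: ReadonlySet<string>,
): string[] {
  return toolNames.filter((name) => existingSkills.has(name));
}
```

Registering the namespaced name rather than the raw one means a one-click MCP add cannot silently shadow an existing skill.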

Defer (v2+):

  • OpenAI OAuth free tier — aggressive rate limits, unstable UX, LOW confidence on specifics
  • Cloud memory sync — GDPR scope, multi-device auth, enormous complexity for single-user product
  • Multi-MCP orchestration — enterprise complexity for personal tool
  • Streaming TTS word-by-word — browser Audio API complexity; sentence-buffered TTS is the practical optimum
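
Sentence-buffered TTS, as opposed to the deferred word-by-word streaming, can be illustrated with a small accumulator that only releases text once a sentence boundary is seen. The class name and boundary rule (terminal punctuation followed by whitespace) are assumptions for this sketch; real text would need extra handling for abbreviations and similar cases.

```typescript
// Hypothetical sketch: collect streamed text chunks and emit whole sentences,
// so each synthesis call receives a complete sentence.
class SentenceBuffer {
  private pending = "";

  // Add a streamed chunk; returns any sentences completed by this chunk.
  push(chunk: string): string[] {
    this.pending += chunk;
    const sentences: string[] = [];
    // A boundary is ., ! or ? followed by whitespace. Punctuation at the very
    // end of the buffer is held back because more text may still arrive.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(start, end).trim());
      start = end;
    }
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Call when the stream ends to release whatever text remains.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

Each returned sentence would be handed to the TTS endpoint in order, which keeps playback latency bounded by sentence length rather than by the full response.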

Provider tier ordering (steers users correctly): Tier 0 (existing Hermes/Claude Code/OpenClaw) → Tier 1 (local Ollama, most private) → Tier 2 (Puter.js zero-config) → Tier 3 (Google OAuth) → Tier 4 (API key/subscription). Tiers 0 and 1 are the recommendations; Tier 2 is the fallback, not the default.
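
A minimal sketch of the tier ordering above, assuming the wizard already knows which providers are usable. The interface and function names are hypothetical; only the tier numbers and the "Tiers 0 and 1 are recommendations, Tier 2 onward is fallback" rule come from this summary.

```typescript
type ProviderTier = 0 | 1 | 2 | 3 | 4;

// Hypothetical availability flags gathered by provider probing.
interface ProviderAvailability {
  existingRuntime: boolean; // Tier 0: Hermes / Claude Code / OpenClaw
  localOllama: boolean;     // Tier 1: local Ollama
  puter: boolean;           // Tier 2: Puter.js zero-config
  googleOAuth: boolean;     // Tier 3: Google OAuth
  apiKey: boolean;          // Tier 4: API key / subscription
}

interface TierChoice {
  tier: ProviderTier;
  role: "recommended" | "fallback";
}

// Walk the tiers in order and return the first usable one.
// Returns null when nothing is configured yet, in which case the wizard
// would steer the user toward setting up Tier 1 before offering Tier 2.
function chooseProviderTier(a: ProviderAvailability): TierChoice | null {
  const ordered: Array<[ProviderTier, boolean]> = [
    [0, a.existingRuntime],
    [1, a.localOllama],
    [2, a.puter],
    [3, a.googleOAuth],
    [4, a.apiKey],
  ];
  for (const [tier, available] of ordered) {
    if (available) {
      return { tier, role: tier <= 1 ? "recommended" : "fallback" };
    }
  }
  return null;
}
```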

Architecture Approach

All v1.5 features hook into existing extension points without touching DB schema, API routes, or TypeScript identifiers from upstream. The NexusOnboardingWizard Vite alias continues as the sole onboarding replacement surface. File-backed JSON in data/memory/<companyId>.json handles assistant memory with no migration. Puter.js is proxied through a new puterProxyService that stores the auth token in company_secrets and pipes SSE output in the exact format the existing useStreamingChat hook already consumes. The four new server route sets (hardware, puter-proxy, voice, memory) are mounted in a single four-line addition to app.ts.

Major components:

  1. hardwareService (NEW, server) — detects GPU/RAM/Apple Silicon via systeminformation; 5-min cache; returns { unifiedMemory: true, totalBytes } for M-series chips; unauthenticated endpoint required for pre-auth onboarding
  2. puterProxyService (NEW, server) — stores Puter auth token in company_secrets, proxies AI calls as SSE matching existing chat pipeline format; Puter auth popup is UI-only
  3. voiceService (NEW, server) — manages smart-whisper for STT and Piper binary subprocess for TTS; graceful degradation if models not downloaded
  4. memoryService (NEW, server) — file-backed JSON memory per companyId; sanitization blocklist at write time; injects formatted memory block into system prompt
  5. NexusOnboardingWizard.tsx (MODIFIED, UI) — multi-step wizard consuming 5 new step components from ui/src/components/onboarding/
  6. PersonalAssistantPage (NEW, UI) — full-screen assistant experience; re-uses ChatPanel with assistantMode prop; lazy-loaded
  7. packages/buildthis/ (NEW) — standalone npm package; health-check detects running Nexus; opens browser or guides install
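
The 5-minute cache described for hardwareService could look roughly like the wrapper below. This is a sketch under stated assumptions: `cached`, `FIVE_MINUTES_MS`, and the injectable clock are hypothetical names, and the `HardwareInfo` shape only mirrors the `{ unifiedMemory, totalBytes }` fields mentioned above.

```typescript
// Shape taken from the summary; any further fields are left out on purpose.
interface HardwareInfo {
  unifiedMemory: boolean;
  totalBytes: number;
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Wrap an expensive async probe so repeated calls within ttlMs reuse one result.
// `now` is injectable so the expiry logic can be tested without real timers.
function cached<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let value: Promise<T> | undefined;
  let expiresAt = 0;
  return () => {
    if (value !== undefined && now() < expiresAt) {
      return value;
    }
    const pending = load();
    value = pending;
    expiresAt = now() + ttlMs;
    // Do not keep a failed probe in the cache; the next call retries.
    pending.catch(() => {
      if (value === pending) value = undefined;
    });
    return pending;
  };
}
```

Caching the promise rather than the resolved value also collapses concurrent requests during onboarding into a single probe.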

Build dependency order (from ARCHITECTURE.md): Phase 1 (Hardware) → Phase 2 (Puter Proxy) → Phase 3 (Wizard Assembly) → Phase 4 (Memory + Assistant Mode) → Phase 5 (Voice) → Phase 6 (buildthis CLI)

Critical Pitfalls

  1. Puter.js browser-direct bypasses the adapter system — cost tracking, session codec, and memory injection all break. @heyputer/puter.js in the UI is for the auth popup only; all AI calls go through POST /api/puter-proxy/chat. Recovery cost if shipped wrong: HIGH.

  2. OAuth tokens in localStorage — XSS exposure; key collision with upstream Paperclip localStorage keys. All OAuth tokens (Google, Puter) must be stored server-side via existing secretService. Browser holds only a session indicator.

  3. Persistent memory storing credentials — regex-based blocklist (API key patterns, token patterns) must be applied at write time, not retrieval time. MCP tool results and user-pasted content both need the same sanitization. Recovery cost if shipped without: HIGH (requires retroactive purge).

  4. Apple Silicon VRAM reporting — M-series has unified memory; os.totalmem() is NOT GPU VRAM. Use os.freemem() as baseline, apply 0.75 multiplier, label all recommendations as "estimated." UI copy must say "unified memory" not "VRAM" for M-series chips.

  5. Onboarding probe auth-gated on board auth — hardware detection runs before board auth exists on a fresh install. A separate unauthenticated GET /system/providers endpoint is required. Without it, all provider probing silently returns 403 and auto-detection never works.

  6. Vite alias silent divergence from upstream — after every upstream rebase, diff OnboardingWizard.tsx against the prior upstream version and integrate any new props into NexusOnboardingWizard.tsx. Without this protocol, upstream wizard improvements are silently discarded.

  7. Piper TTS cold start hang: the WASM voice model downloads on the first synthesis call (5-30 seconds), which makes the feature look broken. Pre-warm the model on a background thread during the onboarding voice step. Show download progress before enabling the toggle.

  8. Multi-provider creating competing defaults — one primary provider per agent; do not let the wizard create PM and Engineer on different providers silently. Project Builder agents default to local/privacy-first; Personal AI assistant defaults to highest-quality available.
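
The write-time sanitization from pitfall 3 can be sketched as a regex blocklist applied before anything reaches the memory store. The specific patterns and the `[REDACTED]` marker are illustrative assumptions, not the project's actual blocklist, and a production list would need to be broader.

```typescript
// Illustrative patterns only; the real blocklist is not specified in this summary.
const SECRET_PATTERNS: RegExp[] = [
  /\bBearer\s+[A-Za-z0-9._~+\/-]{16,}=*/g,                // bearer tokens in pasted headers
  /\bsk-[A-Za-z0-9_-]{20,}/g,                             // "sk-" prefixed API keys
  /\bAKIA[0-9A-Z]{16}\b/g,                                // AWS access key ids
  /\bgh[pousr]_[A-Za-z0-9]{36,}/g,                        // GitHub tokens
  /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g, // JWT-shaped strings
];

const REDACTED = "[REDACTED]";

// Apply every pattern before the text is persisted, and count redactions so
// the caller can log or surface that something was stripped.
function sanitizeForMemory(input: string): { text: string; redactions: number } {
  let redactions = 0;
  let text = input;
  for (const pattern of SECRET_PATTERNS) {
    text = text.replace(pattern, () => {
      redactions += 1;
      return REDACTED;
    });
  }
  return { text, redactions };
}
```

Running the same function over MCP tool results and user-pasted content keeps a single sanitization path, which is the point of doing it at write time.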


Implications for Roadmap

Based on the dependency graph in ARCHITECTURE.md and the pitfall-to-phase mapping in PITFALLS.md, the build order is fixed by component dependencies and upstream-conflict risk sequencing.

Phase 1: Hardware Detection + Mode Selection Foundation

Rationale: All other phases depend on knowing the hardware tier and the user's chosen mode. Mode selection gates which features are surfaced. Hardware detection drives model recommendations. Critically, the unauthenticated probe endpoint (Pitfall 14) and the skip-all minimum valid state (Pitfall 22) must both be defined here as test cases before any provider probing or wizard step is built. This is the riskiest design phase even though it contains no upstream file modifications.

Delivers: hardwareService, GET /api/hardware/info, unauthenticated GET /system/providers, HardwareSummaryStep and ModeSelector components, model recommendation lookup table with Apple Silicon handling, skip-all minimum valid state definition and test

Addresses: Hardware auto-detection + model recommendation (P1), Mode selection UI (P1)

Avoids: Pitfalls 13 (Apple Silicon VRAM), 14 (probe auth), 17 (competing defaults), 22 (skip-all breakage), 26 (stale model catalog fallback heuristic)
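
The Apple Silicon handling in the recommendation lookup can be sketched as follows. The 0.75 multiplier, the free-memory baseline, and the "unified memory" label come from the pitfalls list; the size thresholds, type names, and the non-unified fallback branch are assumptions for illustration. The probe values would come from hardwareService (for example `os.freemem()` on the server) and are passed in here so the logic stays a pure function.

```typescript
const GIB = 1024 * 1024 * 1024;
const FREE_MEMORY_FACTOR = 0.75; // multiplier named in the pitfalls list

interface MemoryProbe {
  freeBytes: number;            // free system memory reported by the server
  unifiedMemory: boolean;       // true on M-series chips
  dedicatedVramBytes?: number;  // only meaningful for discrete GPUs
}

interface ModelBudget {
  usableBytes: number;
  label: "unified memory" | "VRAM" | "system RAM"; // wording for UI copy
  estimated: true;                                  // always presented as an estimate
}

function estimateModelBudget(probe: MemoryProbe): ModelBudget {
  if (probe.unifiedMemory) {
    // Unified memory is shared by CPU and GPU, so budget a fraction of free memory.
    return {
      usableBytes: Math.floor(probe.freeBytes * FREE_MEMORY_FACTOR),
      label: "unified memory",
      estimated: true,
    };
  }
  if (probe.dedicatedVramBytes !== undefined && probe.dedicatedVramBytes > 0) {
    return { usableBytes: probe.dedicatedVramBytes, label: "VRAM", estimated: true };
  }
  // Assumption: with no GPU detected, fall back to the same fraction of free RAM.
  return {
    usableBytes: Math.floor(probe.freeBytes * FREE_MEMORY_FACTOR),
    label: "system RAM",
    estimated: true,
  };
}

type ModelSizeClass = "small" | "medium" | "large";

// Hypothetical thresholds; a real table would map to concrete model names.
const SIZE_THRESHOLDS: Array<{ minBytes: number; sizeClass: ModelSizeClass }> = [
  { minBytes: 24 * GIB, sizeClass: "large" },
  { minBytes: 8 * GIB, sizeClass: "medium" },
  { minBytes: 0, sizeClass: "small" },
];

function recommendSizeClass(budget: ModelBudget): ModelSizeClass {
  for (const row of SIZE_THRESHOLDS) {
    if (budget.usableBytes >= row.minBytes) return row.sizeClass;
  }
  return "small";
}
```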

Phase 2: Puter.js Zero-Config Cloud Tier

Rationale: Puter.js is the primary escape hatch for users who won't install Ollama. The server-proxy pattern must be established before the UI provider step is built, because implementing the UI first creates the risk of accidentally wiring browser-direct calls. The Puter auth popup is the one legitimate browser-side use; everything else is server-mediated.

Delivers: puterProxyService, POST /api/puter-proxy/chat (SSE relay), POST /api/puter-proxy/auth, Puter section of ProviderTierStep (UI), Puter auth popup, Puter token storage via secretService

Uses: @heyputer/puter.js (UI popup only), server-side HTTP calls to Puter API

Avoids: Pitfall 15 (Puter.js bypassing adapter system), Pitfall 16 (OAuth tokens in localStorage)

Phase 3: Multi-Step Onboarding Wizard Assembly

Rationale: After hardware detection and Puter.js are independently built and tested, the wizard is assembled. This is the phase that modifies NexusOnboardingWizard.tsx substantially, so establish the post-rebase diff protocol before touching this file. The ProviderTierStep covers all provider tiers (local, Puter, OAuth). The VoiceSetupStep UI shell is included here; the voice service is wired in Phase 5.

Delivers: Refactored NexusOnboardingWizard.tsx (multi-step), OnboardingSummaryStep, VoiceSetupStep (shell only), OAuth PKCE popup pattern for Google Gemini, instance_settings.general.nexus config write, navigation routing to PersonalAssistantPage vs Dashboard

Implements: Onboarding Wizard data flow from ARCHITECTURE.md

Avoids: Pitfall 12 (Vite alias divergence; diff protocol in place), Pitfall 22 (skip-all confirmed from Phase 1), Pitfall 17 (one primary provider per mode)

Phase 4: Persistent Memory + Personal Assistant Mode

Rationale: Memory injection modifies the existing chat route, which is the highest-risk upstream file modification in the entire milestone. It comes after onboarding is validated so mode context is reliable before memory is scoped to it. Memory sanitization is built at write time into the schema (not patched post-launch). This phase also defines the conversation isolation strategy between assistant and project builder modes.

Delivers: memoryService with write-time sanitization blocklist, GET/POST/DELETE /api/companies/:id/memory, memory injection in chat route (MODIFIED), PersonalAssistantPage, AssistantMemoryBar, useAssistantMemory hook, conversation isolation via agent-based filter

Avoids: Pitfall 19 (credential injection via memory), Pitfall 23 (assistant/project builder context bleed)

Phase 5: Voice (Whisper STT + Piper TTS)

Rationale: Independent of Phase 4 but requires Phase 3 (the onboarding wizard must exist to surface VoiceSetupStep). The Piper pre-warming strategy must be designed before the TTS toggle is wired, not after. This phase is isolated enough to be deprioritized or built in parallel without blocking Phase 4 or 6.

Delivers: voiceService (smart-whisper + Piper subprocess), POST /api/voice/transcribe, POST /api/voice/speak, GET /api/voice/status, VoiceSetupStep wired into onboarding wizard, useVoiceInput and useVoiceSpeech hooks, ChatInput mic button (MODIFIED; upstream file, low risk), Piper pre-warm background thread with download progress indicator

Avoids: Pitfall 18 (Piper TTS cold start hang)

Phase 6: npx buildthis CLI Bootstrapper

Rationale: Fully independent of all other phases. Can be built in parallel or deferred to v1.5.x. This is P2 priority: useful for sharing Nexus but not required for core assistant functionality. The primary gate is verifying npm search buildthis for package name collision before publishing.

Delivers: packages/buildthis/ standalone package, bin.buildthis entry point, health-check detection of running Nexus (GET localhost:4000/api/health), npm publish configuration

Avoids: Pitfall 21 (npx package name collision)
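
The health-check detection step of the CLI can be sketched as below. The `/api/health` path and port 4000 come from the deliverables; the function names, the action type, and the injected `fetchFn` parameter are hypothetical. In Node 18+ the global `fetch` could be passed as `fetchFn`; it is injected here so the decision logic can be exercised without a running server.

```typescript
// Minimal structural type so either the global fetch or a test double fits.
type FetchLike = (url: string) => Promise<{ ok: boolean }>;

type BootstrapAction =
  | { kind: "open-browser"; url: string }
  | { kind: "guide-install" };

// Returns true only when the health endpoint answers with a 2xx status.
async function isNexusRunning(
  fetchFn: FetchLike,
  baseUrl: string = "http://localhost:4000",
): Promise<boolean> {
  try {
    const res = await fetchFn(`${baseUrl}/api/health`);
    return res.ok;
  } catch {
    // Connection refused, DNS failure, etc. all mean "not running".
    return false;
  }
}

// Decide what the CLI should do next: open the UI or walk through install.
async function decideBootstrapAction(
  fetchFn: FetchLike,
  baseUrl: string = "http://localhost:4000",
): Promise<BootstrapAction> {
  const running = await isNexusRunning(fetchFn, baseUrl);
  return running ? { kind: "open-browser", url: baseUrl } : { kind: "guide-install" };
}
```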

Phase Ordering Rationale

  • Phase 1 must precede all others because mode and hardware are inputs to every subsequent phase's UX decisions. Skip-all state definition is a hard prerequisite for Phase 3.
  • Phase 2 (Puter.js proxy) precedes wizard assembly (Phase 3) because the server proxy pattern must exist before the UI references it — wiring UI first creates the anti-pattern risk.
  • Phase 4 (memory) is separated from Phase 3 (wizard) because the chat route modification is the highest upstream-conflict-risk step and deserves its own isolated phase after onboarding is stable and tested.
  • Phase 5 (voice) and Phase 6 (buildthis) are independent of each other and can be built in either order or in parallel.
  • Each phase delivers a working, rebase-clean state, so upstream sync can occur between any two phases without compound conflicts.

Research Flags

Phases likely needing deeper research during planning:

  • Phase 2 (Puter.js): Puter rate limits and Node.js HTTP API behavior are not publicly documented. Need to verify server-side streaming API surface and token refresh behavior before designing the proxy service. Also: confirm Puter's terms of service allow server-side relaying of requests.
  • Phase 4 (Memory): The specific injection hook location in server/src/services/chat.ts needs codebase inspection to confirm the right insertion point. Also: decide between linear scan (v1.5) vs vectra vector search (v2) based on expected corpus size — should be explicit in the spec.
  • Phase 5 (Voice): smart-whisper ^0.8.1 Apple Neural Engine acceleration claim needs verification on the actual Mac Mini M4 target before committing to base.en as the default model. If acceleration is not confirmed, fall back to tiny.en.

Phases with standard patterns (skip research-phase):

  • Phase 1 (Hardware Detection): systeminformation is mature (20M+ monthly downloads), Apple Silicon behavior is officially documented. Pattern is well-established across Ollama, LM Studio, llm-checker.
  • Phase 3 (Wizard Assembly): React multi-step wizard patterns are well-documented. NexusOnboardingWizard Vite alias pattern is already live in the codebase.
  • Phase 6 (buildthis CLI): Standard npm bin field pattern per official Node.js docs. No novel choices.

Confidence Assessment

| Area | Confidence | Notes |
| --- | --- | --- |
| Stack | HIGH | Most libraries verified via official docs; systeminformation and openid-client v6 fully confirmed; smart-whisper version from GitHub releases with no production deployment data |
| Features | MEDIUM | Puter.js rate limits and production reliability unverified at scale; hardware detection patterns confirmed from Ollama/LM Studio ecosystem; UX recommendations inferred from Clerk/Vercel/Postman patterns |
| Architecture | HIGH | Based on direct codebase inspection of /opt/nexus/; all extension points verified to exist; file-backed JSON approach confirmed feasible given single-user M4 Mini target |
| Pitfalls | HIGH | Based on direct codebase analysis plus targeted research per integration domain; Apple Silicon VRAM behavior confirmed; Puter.js adapter risk confirmed from architecture analysis |

Overall confidence: MEDIUM-HIGH

Gaps to Address

  • Puter.js Node.js API surface: Server-side streaming via HTTP (not the browser SDK) needs verification before puterProxyService is specced. Architecture assumes stream: true works server-side — confirm during Phase 2 planning.
  • Puter.js rate limits and ToS: "No restrictions" claim is unverified at scale. Design graceful degradation for rate limit responses. Attribute all costs to user's Puter account in UI copy.
  • smart-whisper Apple Silicon acceleration: Performance claim needs on-device verification on the Mac Mini M4 target. If not confirmed, tiny.en may be required as default instead of base.en.
  • Google Gemini OAuth policy risk: Using Gemini CLI OAuth with third-party apps may trigger abuse detection (GitHub issue #21866 confirmed). Gate this tier on users with active Gemini subscriptions; document limitation explicitly.
  • Memory store performance ceiling: Linear scan is acceptable for fewer than ~500 entries. Define the upgrade threshold to vectra vector search during Phase 4 planning and document it in the code.
  • OpenAI OAuth free tier: LOW confidence — OpenAI free tier OAuth specifics change frequently. Do not include in v1.5 scope; defer to v2+.
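
The linear-scan memory lookup and its upgrade threshold could be sketched like this. The ~500-entry ceiling comes from the gap noted above; the entry shape, the term-overlap scoring, and all names are assumptions for illustration rather than the planned memoryService design.

```typescript
// Hypothetical entry shape for the file-backed memory store.
interface MemoryEntry {
  id: string;
  text: string;
  createdAt: number; // epoch milliseconds
}

// Above roughly this many entries, the summary suggests moving to vector search.
const VECTOR_SEARCH_THRESHOLD = 500;

function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Score every entry by how many query terms it contains (O(n) per query),
// break ties by recency, and return the top matches.
function searchMemory(entries: MemoryEntry[], query: string, limit: number = 5): MemoryEntry[] {
  const terms = tokenize(query);
  return entries
    .map((entry) => {
      const tokens = tokenize(entry.text);
      let score = 0;
      terms.forEach((term) => {
        if (tokens.has(term)) score += 1;
      });
      return { entry, score };
    })
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score || b.entry.createdAt - a.entry.createdAt)
    .slice(0, limit)
    .map((scored) => scored.entry);
}

// Makes the upgrade threshold explicit in code, as the gap item asks.
function shouldUpgradeToVectorSearch(entryCount: number): boolean {
  return entryCount >= VECTOR_SEARCH_THRESHOLD;
}
```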


Research completed: 2026-04-02
Ready for roadmap: yes