From ff4f47de3b5e6ea7389807ec4d4355d0c1535ccb Mon Sep 17 00:00:00 2001
From: Nexus Dev <nexus@local>
Date: Fri, 3 Apr 2026 00:11:45 +0000
Subject: [PATCH] =?UTF-8?q?docs(31):=20research=20phase=20=E2=80=94=20Pute?=
 =?UTF-8?q?r.js=20zero-config=20cloud=20provider=20integration?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 .../31-RESEARCH.md                            | 547 ++++++++++++++++++
 1 file changed, 547 insertions(+)
 create mode 100644 .planning/phases/31-puter.js-zero-config-cloud/31-RESEARCH.md

diff --git a/.planning/phases/31-puter.js-zero-config-cloud/31-RESEARCH.md b/.planning/phases/31-puter.js-zero-config-cloud/31-RESEARCH.md
new file mode 100644
index 00000000..bf5c7b46
--- /dev/null
+++ b/.planning/phases/31-puter.js-zero-config-cloud/31-RESEARCH.md
@@ -0,0 +1,547 @@
+# Phase 31: Puter.js Zero-Config Cloud — Research
+
+**Researched:** 2026-04-02
+**Domain:** Puter.js server-proxy AI, Google OAuth PKCE for Gemini, tool auto-detection, cloud provider onboarding
+**Confidence:** MEDIUM-HIGH (Puter OpenAI-compatible endpoint HIGH, Puter.js Node.js streaming MEDIUM, Google OAuth risk HIGH)
+
+---
+
+<user_constraints>
+## User Constraints (from CONTEXT.md)
+
+### Locked Decisions
+All implementation choices are at Claude's discretion — discuss phase was skipped per user setting.
+
+Key constraints established in STATE.md (carryover from roadmap):
+- Puter.js is server-proxied adapter only — `@heyputer/puter.js` browser import is for auth popup only; all AI calls via `POST /api/puter-proxy/chat`
+- OAuth tokens (Google, Puter) stored server-side via `secretService` — never in localStorage
+- Google OAuth cloud tier: include but flag policy risk (Gemini CLI abuse detection issue #21866 and subsequent mass bans)
+- No DB schema changes — all state in existing JSONB fields (`instance_settings.general`) and file-backed JSON
+
+### Claude's Discretion
+All implementation choices are at Claude's discretion.
+
+### Deferred Ideas (OUT OF SCOPE)
+None — discuss phase skipped.
+</user_constraints>
+
+---
+
+<phase_requirements>
+## Phase Requirements
+
+| ID | Description | Research Support |
+|----|-------------|------------------|
+| CLOUD-01 | User gets working AI via Puter.js with zero API keys and no sign-up required | Puter free tier confirmed; `puter.auth.signIn()` browser popup → token; OpenAI-compat endpoint for server-proxy calls |
+| CLOUD-02 | Puter.js integrated as server-proxied adapter (not browser-direct) with full cost tracking | OpenAI-compat `https://api.puter.com/puterai/openai/v1/` endpoint; `node-fetch` + Bearer token; costService.createEvent() pattern available |
+| CLOUD-03 | User can sign in via Google OAuth to access Gemini free tier | PKCE flow documented; Gemini CLI client_id can be reused; policy-risk warning required |
+| CLOUD-04 | System auto-detects installed tools (Hermes, Claude Code, OpenClaw) and pre-fills configuration | `/adapters/:type/probe` endpoint already exists; hermes_local probe already implemented in wizard Step 3 |
+| CLOUD-05 | User can enter API keys for subscription providers during onboarding | secretService.create() pattern available; onboarding wizard Step 3 form pattern available |
+</phase_requirements>
+
+---
+
+## Summary
+
+Phase 31 adds three cloud provider paths to the onboarding wizard: Puter.js (zero-config free tier), Google OAuth (Gemini free tier with policy warning), and manual API key entry for subscription providers. It also auto-detects pre-installed tools like Hermes, Claude Code, and OpenClaw.
+
+The core of the phase is a new **`POST /api/puter-proxy/chat`** route on the server that holds the user's Puter auth token (obtained via browser popup and stored via `secretService`) and relays AI calls to Puter's OpenAI-compatible endpoint (`https://api.puter.com/puterai/openai/v1/`). This proxy approach satisfies the security requirement (token never in localStorage) and enables full cost tracking via the existing `costService.createEvent()` infrastructure.
+
+The existing onboarding wizard (in `NexusOnboardingWizard.tsx`, 3-step Nexus replacement) needs a new Step 4: "Choose a provider." This step is inserted before the root directory step and provides three paths: Puter (button triggers auth popup, stores token, continues), Google OAuth (PKCE flow for Gemini), and manual API key entry. The probe-based auto-detection for Hermes/Claude Code/OpenClaw already exists via `/adapters/:type/probe` — it just needs to be surfaced in the provider step.
+
+**Primary recommendation:** Use Puter's OpenAI-compatible REST endpoint (`https://api.puter.com/puterai/openai/v1/chat/completions`) for all server-proxy AI calls. Store the Puter token via `secretService` under a well-known name (e.g., `puter_auth_token`). For Gemini OAuth, implement PKCE flow that mirrors what `opencode-gemini-auth` does — local redirect server on a random port captures the token; warn the user prominently about the abuse-detection risk before they proceed.
+
+---
+
+## Standard Stack
+
+### Core
+| Library | Version | Purpose | Why Standard |
+|---------|---------|---------|--------------|
+| `@heyputer/puter.js` | 2.2.14 (latest) | Browser auth popup (`puter.auth.signIn()`) — NOT for AI calls | Already on npm; official SDK for Puter popup auth |
+| `node-fetch` / Node 18+ `fetch` | built-in | Server-side HTTP calls to Puter OpenAI-compat endpoint | Already available; no extra dep needed |
+| `secretService` (internal) | existing | Store Puter token and Google OAuth tokens server-side | Established pattern in codebase; prevents localStorage storage |
+| `costService` (internal) | existing | Record token usage and costs per conversation/agent | Established `createEvent()` pattern; `inputTokens`, `outputTokens`, `costCents` fields match what we need |
+
+### Supporting
+| Library | Version | Purpose | When to Use |
+|---------|---------|---------|-------------|
+| PKCE helpers (`node:crypto`) | built-in | Generate code_verifier, code_challenge for Google OAuth | Gemini OAuth flow only |
+| `open` (npm) | optional | Open browser for Puter/Google auth in headless flows | Only needed if CLI path in future; for browser UI, `window.open()` suffices |
+
+### Puter Endpoint Facts (HIGH confidence — verified via official docs and community implementations)
+- **OpenAI-compatible base URL:** `https://api.puter.com/puterai/openai/v1/`
+- **Auth:** `Authorization: Bearer <puter_auth_token>`
+- **Chat endpoint:** `POST /chat/completions` (standard OpenAI format)
+- **Streaming:** `"stream": true` with SSE — same as OpenAI streaming
+- **Models:** 500+ including `claude-3-5-sonnet`, `gpt-4o`, `gemini-2.0-flash`, `mistral-large`, `deepseek-chat`
+- **Default model:** `gpt-5-nano` (per Puter docs; use `claude-3-5-haiku-20241022` or `gpt-4o-mini` for balanced quality/cost)
+- **Response format:** Standard OpenAI `ChatCompletion` object — `usage.prompt_tokens`, `usage.completion_tokens` available
+
+### Puter Auth Flow (HIGH confidence)
+- **Browser popup:** `puter.auth.signIn()` — must be triggered from user gesture (button click)
+- **Token retrieval:** After `signIn()` resolves, `puter.auth.isSignedIn()` is true; the token is expected in `puter.authToken`, but that property name is unverified (see Open Questions)
+- **Node.js alternate:** `init(process.env.PUTER_AUTH_TOKEN)` for scripts; not relevant for this phase
+- **Cost model:** "User-Pays" — usage is billed to the authenticated user's Puter account; developer pays nothing
+- **No API key required:** User only needs a free Puter.com account; the `signIn()` popup creates one if needed
+
+### Google Gemini OAuth (HIGH confidence — risk documented)
+- **PKCE flow:** Standard OAuth2 with PKCE; local redirect server on random port captures callback
+- **Client ID:** Gemini CLI uses a hardcoded installed-app client_id (publicly documented)
+- **Scopes:** `https://www.googleapis.com/auth/cloud-platform` and `https://www.googleapis.com/auth/generative-language.retriever`
+- **CRITICAL POLICY RISK:** Google has mass-banned accounts for using third-party OAuth with Gemini CLI credentials. The risk extends to Google Workspace and Gmail access. The UI must display a prominent warning before the user initiates this flow. This is a known, documented risk (gemini-cli issue #21866 and discussions #22970, #20632; openclaw issue #14203)
+- **Safe alternative:** Users with a `GEMINI_API_KEY` from Google AI Studio should use that instead — plain API key flow, no abuse risk
+
+**Installation:**
+```bash
+# No new deps required for core Puter proxy — uses Node built-in fetch
+# For browser popup integration:
+pnpm --filter @paperclipai/ui add @heyputer/puter.js
+# Already in npm registry at 2.2.14
+```
+
+**Version verification:**
+```bash
+npm view @heyputer/puter.js version  # → 2.2.14 (verified 2026-04-02)
+```
+
+---
+
+## Architecture Patterns
+
+### Recommended Project Structure
+```
+server/src/
+├── routes/
+│   └── puter-proxy.ts         # POST /api/puter-proxy/chat (new)
+├── services/
+│   └── puter-proxy.ts         # puterProxyService — token resolve + OpenAI call + cost record
+ui/src/
+├── components/
+│   └── NexusOnboardingWizard.tsx  # Add Step 4: Provider selection
+├── components/onboarding/
+│   ├── ProviderSelectionStep.tsx  # New: Puter / Google OAuth / API key / Skip
+│   ├── PuterAuthButton.tsx        # New: loads puter.js CDN, calls signIn(), posts token to server
+│   └── GoogleOAuthButton.tsx      # New: PKCE flow, policy-risk warning, posts token to server
+```
+
+### Pattern 1: Puter Server-Proxy Call
+**What:** Server receives chat request → resolves Puter token from secretService → calls `https://api.puter.com/puterai/openai/v1/chat/completions` → streams back → records cost event
+**When to use:** All Puter-powered chat in the onboarding wizard "Continue with Puter" path
+
+```typescript
+// Source: https://developer.puter.com/tutorials/use-openai-sdk-with-puter/ and
+//         https://api.puter.com/puterai/openai/v1/ endpoint docs
+// server/src/services/puter-proxy.ts
+
+import type { Db } from "@paperclipai/db";
+import { secretService } from "./secrets.js";
+import { costService } from "./costs.js";
+
+const PUTER_BASE_URL = "https://api.puter.com/puterai/openai/v1";
+const PUTER_TOKEN_SECRET_NAME = "puter_auth_token";
+const PUTER_DEFAULT_MODEL = "claude-3-5-haiku-20241022";
+
+export function puterProxyService(db: Db) {
+  const secrets = secretService(db);
+  const costs = costService(db);
+
+  async function resolveToken(companyId: string): Promise<string> {
+    const secret = await secrets.getByName(companyId, PUTER_TOKEN_SECRET_NAME);
+    if (!secret) throw new Error("Puter auth token not configured");
+    return secrets.resolveSecretValue(companyId, secret.id, "latest");
+  }
+
+  async function* chatStream(
+    companyId: string,
+    agentId: string,
+    messages: Array<{ role: string; content: string }>,
+    model = PUTER_DEFAULT_MODEL,
+    signal: AbortSignal,
+  ): AsyncGenerator<string> {
+    const token = await resolveToken(companyId);
+
+    const response = await fetch(`${PUTER_BASE_URL}/chat/completions`, {
+      method: "POST",
+      headers: {
+        "Authorization": `Bearer ${token}`,
+        "Content-Type": "application/json",
+      },
+      body: JSON.stringify({ model, messages, stream: true, stream_options: { include_usage: true } }),
+      signal,
+    });
+
+    if (!response.ok || !response.body) {
+      const text = await response.text().catch(() => "");
+      throw new Error(`Puter API error ${response.status}: ${text}`);
+    }
+
+    // Parse SSE stream — same pattern as OpenAI streaming
+    let inputTokens = 0;
+    let outputTokens = 0;
+    const reader = response.body.getReader();
+    const decoder = new TextDecoder();
+    let buf = "";
+
+    try {
+      while (true) {
+        const { done, value } = await reader.read();
+        if (done) break;
+        buf += decoder.decode(value, { stream: true });
+        const lines = buf.split("\n");
+        buf = lines.pop() ?? "";
+        for (const line of lines) {
+          if (!line.startsWith("data: ")) continue;
+          const payload = line.slice(6).trim();
+          if (payload === "[DONE]") continue;
+          const chunk = JSON.parse(payload);
+          // Accumulate usage from last chunk (some providers send it on final chunk)
+          if (chunk.usage) {
+            inputTokens = chunk.usage.prompt_tokens ?? 0;
+            outputTokens = chunk.usage.completion_tokens ?? 0;
+          }
+          const content = chunk.choices?.[0]?.delta?.content;
+          if (content) yield content;
+        }
+      }
+    } finally {
+      reader.releaseLock();
+    }
+
+    // Record usage. Under Puter's user-pays model the Nexus instance is not billed,
+    // so cost is tracked as 0, but token counts are still useful for visibility.
+    if (inputTokens > 0 || outputTokens > 0) {
+      await costs.createEvent(companyId, {
+        agentId,
+        provider: "puter",
+        biller: "puter",
+        billingType: "subscription_included",
+        model,
+        inputTokens,
+        outputTokens,
+        costCents: 0, // user-pays: no cost to the Nexus instance
+        occurredAt: new Date(),
+      }).catch(() => {}); // best-effort: a cost-tracking error must not fail the chat
+    }
+  }
+
+  return { resolveToken, chatStream };
+}
+```
+
+### Pattern 2: Puter Auth Popup (Browser Side)
+**What:** Load Puter.js via CDN (not npm bundle — avoids large bundle) in a React component, call `puter.auth.signIn()`, capture the auth token, POST it to `/api/puter-proxy/token` for server-side storage
+**When to use:** Onboarding wizard "Continue with Puter" button click
+
+```typescript
+// Source: https://docs.puter.com/Auth/ and https://js.puter.com/v2/
+// ui/src/components/onboarding/PuterAuthButton.tsx (simplified)
+
+async function handlePuterSignIn() {
+  // Puter.js must already be loaded here (e.g. loadScript("https://js.puter.com/v2/") on mount):
+  // awaiting the script load inside the click handler can get the popup blocked (see Pitfall 3).
+  const puter = (window as unknown as { puter: PuterInstance }).puter;
+
+  await puter.auth.signIn(); // opens popup; resolves when user completes auth
+
+  // Extract token from Puter SDK internal state
+  const token = puter.authToken; // string
+
+  // POST token to server for storage via secretService
+  await fetch("/api/puter-proxy/token", {
+    method: "POST",
+    headers: { "Content-Type": "application/json", ...authHeaders },
+    body: JSON.stringify({ token }),
+  });
+}
+```
+
+**SPIKE REQUIRED:** The exact property name for the token after `signIn()` is unverified (`puter.authToken` is an assumption from prior knowledge, not from official docs); confirm it against the `@heyputer/puter.js` 2.2.14 source or a live test. See Open Questions.
+
+### Pattern 3: Cost Tracking for Puter
+**What:** POST `/api/companies/:companyId/cost-events` with `provider: "puter"`, `billingType: "subscription_included"`, `costCents: 0`
+**When to use:** After every Puter chat completion, non-blocking
+
+The existing `costService.createEvent()` already accepts this shape. The `inputTokens` / `outputTokens` fields flow to the cost tracking view. Users see token consumption even when costCents is 0.
+
+### Pattern 4: Google OAuth PKCE (Server-Assisted)
+**What:** UI opens a popup to Google OAuth URL with PKCE code_challenge → local server at `GET /api/oauth/google/callback` captures the redirect → exchanges code for access_token → stores in secretService as `google_gemini_oauth_token`
+**When to use:** "Sign in with Google" path only
+
+```typescript
+// PKCE generation (server-side, Node.js built-in crypto)
+import crypto from "node:crypto";
+
+function generatePkce() {
+  const verifier = crypto.randomBytes(32).toString("base64url");
+  const challenge = crypto.createHash("sha256").update(verifier).digest("base64url");
+  return { verifier, challenge };
+}
+```
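+
+A minimal sketch of how the challenge might be attached to the authorization request, assuming Google's standard OAuth 2.0 authorization endpoint and the RFC 7636 parameter names. `buildGoogleAuthUrl` and its option names are hypothetical, and the client ID and redirect URI are placeholders supplied by the caller, not values taken from the codebase.
+
+```typescript
+import { randomBytes } from "node:crypto";
+
+// Hypothetical helper: builds the Google authorization URL for a PKCE flow.
+export function buildGoogleAuthUrl(opts: {
+  clientId: string; // placeholder; which client_id to use is still an open question
+  redirectUri: string; // e.g. the server's OAuth callback route
+  scopes: string[];
+  challenge: string; // the `challenge` value produced by generatePkce()
+}): { url: string; state: string } {
+  // Random state lets the callback reject responses it did not initiate (CSRF protection).
+  const state = randomBytes(16).toString("base64url");
+  const url = new URL("https://accounts.google.com/o/oauth2/v2/auth");
+  url.searchParams.set("response_type", "code");
+  url.searchParams.set("client_id", opts.clientId);
+  url.searchParams.set("redirect_uri", opts.redirectUri);
+  url.searchParams.set("scope", opts.scopes.join(" "));
+  url.searchParams.set("code_challenge", opts.challenge);
+  url.searchParams.set("code_challenge_method", "S256");
+  url.searchParams.set("state", state);
+  return { url: url.toString(), state };
+}
+```
+
+The server would need to keep the `verifier` and `state` until the callback arrives, then send the verifier with the code exchange.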
+
+### Pattern 5: Tool Auto-Detection (CLOUD-04)
+The probe endpoint `/adapters/:type/probe` already exists (line 667 in `server/src/routes/agents.ts`). In `NexusOnboardingWizard.tsx` the hermes_local probe is already called on wizard open. The provider selection step should call probes for `claude_local`, `hermes_local`, `openclaw_gateway` and pre-fill the recommended adapter. This extends the existing `probeAdapter` call pattern.
+
+```typescript
+// Already in agentsApi (ui/src/api/agents.ts)
+probeAdapter: (type: string) => api.get<{ available: boolean; status: string }>(`/adapters/${type}/probe`),
+```
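+
+A minimal sketch of how the provider step might fan out the probes and pick a recommendation. `detectAdapters`, `ProbeFn`, and the preference order are hypothetical, not existing code; in the UI the `probe` argument would presumably be `agentsApi.probeAdapter`.
+
+```typescript
+type ProbeResult = { available: boolean; status: string };
+type ProbeFn = (type: string) => Promise<ProbeResult>;
+
+// Assumed preference order, for illustration only.
+const CANDIDATE_ADAPTERS: string[] = ["claude_local", "hermes_local", "openclaw_gateway"];
+
+export async function detectAdapters(
+  probe: ProbeFn,
+): Promise<{ available: string[]; recommended: string | null }> {
+  // Probe all candidates in parallel; a rejected probe counts as unavailable.
+  const settled = await Promise.allSettled(CANDIDATE_ADAPTERS.map((type) => probe(type)));
+  const available = CANDIDATE_ADAPTERS.filter((_type, i) => {
+    const result = settled[i];
+    return result !== undefined && result.status === "fulfilled" && result.value.available;
+  });
+  // Recommend the first available adapter in preference order, if any.
+  return { available, recommended: available[0] ?? null };
+}
+```
+
+A failed probe is treated as "not installed" rather than as an error, so one unreachable adapter does not block the step.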
+
+### Anti-Patterns to Avoid
+- **Storing Puter/Google tokens in localStorage:** Violates locked constraint. Always POST to server → secretService.
+- **Importing `@heyputer/puter.js` into the UI bundle for AI calls:** Bundle is large; use CDN script load for popup only; all AI calls go via server proxy.
+- **Using Gemini CLI OAuth client_id without warning:** Policy risk is real and documented. Always show policy-risk warning before initiating Google OAuth.
+- **Skipping cost event on Puter calls:** Even at `costCents: 0`, token counts must be recorded so the cost view is populated.
+- **DB schema changes:** Out of scope per locked constraints. Token storage goes via `secretService` (existing `company_secrets` table).
+
+---
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| Puter AI HTTP calls | Custom fetch wrapper | `fetch` + OpenAI-compat endpoint | Simple JSON; no SDK needed server-side |
+| Token encryption at rest | Custom encryption | `secretService` with `local_encrypted` provider | Already handles AES encryption; existing pattern |
+| Cost tracking | Custom cost DB table | `costService.createEvent()` + existing `cost_events` table | Full schema already exists with agent/company attribution |
+| SSE streaming parse | Third-party SSE parser dependency | Node 18+ `ReadableStream` with line-by-line split | OpenAI SSE format is simple; about 10 lines of code, so don't add a dep |
+| PKCE code generation | Manual | `crypto.randomBytes` + `crypto.createHash` (Node built-in) | Node crypto is sufficient; no need for oauth library |
+| Google redirect callback | Full OAuth library | Minimal Express route + one-shot `res.redirect()` | Only one flow; full library is overkill |
+
+**Key insight:** The hardest part of this phase is the Puter browser popup → server token handoff. The server-side AI calls are trivially simple: a POST to an OpenAI-compatible endpoint with a Bearer token.
+
+---
+
+## Common Pitfalls
+
+### Pitfall 1: Puter SDK Token Property Name
+**What goes wrong:** `puter.authToken` property may be named differently in 2.2.14; calling the wrong property returns `undefined`, silently failing token storage.
+**Why it happens:** Puter.js SDK is actively developed; property names are not formally documented.
+**How to avoid:** Include a spike step in the plan that imports `@heyputer/puter.js` 2.2.14, calls `signIn()`, and console.logs the puter object to discover the correct token property. Alternative: use the OpenAI-compat REST endpoint approach and have the user paste a token from puter.com/dashboard during onboarding (a lower-complexity alternative that avoids the popup entirely, at the cost of one manual step).
+**Warning signs:** `token` is undefined or empty string after `signIn()` resolves.
+
+### Pitfall 2: Puter Streaming Usage Field Timing
+**What goes wrong:** The `usage` field (with `prompt_tokens` / `completion_tokens`) in Puter's OpenAI-compat stream may appear only on the final data chunk (the one just before the `[DONE]` sentinel), or may not appear at all during streaming.
+**Why it happens:** Puter proxies multiple backends (OpenAI, Anthropic, etc.) — each has different streaming behavior. OpenAI includes usage in the last chunk only when `stream_options: { include_usage: true }` is set.
+**How to avoid:** Add `stream_options: { include_usage: true }` to the request body. If usage is still zero, fall back to local token count estimation (input: character_count/4; output: same) purely for display purposes.
+**Warning signs:** Cost view shows 0 tokens for all Puter conversations.
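+
+The fallback described above could look roughly like this. It is a sketch under the stated chars/4 heuristic; `estimateTokens` and `resolveUsage` are hypothetical names, and the `estimated` flag is an assumption about how the cost view would label approximate counts.
+
+```typescript
+type ChatMessage = { role: string; content: string };
+type ReportedUsage = { prompt_tokens?: number; completion_tokens?: number };
+
+// Rough heuristic from the text above: about 4 characters per token.
+export function estimateTokens(text: string): number {
+  return Math.ceil(text.length / 4);
+}
+
+// Prefer provider-reported usage; estimate only when it is missing or all zero.
+export function resolveUsage(
+  reported: ReportedUsage | undefined,
+  messages: ChatMessage[],
+  output: string,
+): { inputTokens: number; outputTokens: number; estimated: boolean } {
+  const prompt = reported?.prompt_tokens ?? 0;
+  const completion = reported?.completion_tokens ?? 0;
+  if (prompt + completion > 0) {
+    return { inputTokens: prompt, outputTokens: completion, estimated: false };
+  }
+  const inputChars = messages.reduce((sum, m) => sum + m.content.length, 0);
+  return {
+    inputTokens: Math.ceil(inputChars / 4),
+    outputTokens: estimateTokens(output),
+    estimated: true,
+  };
+}
+```
+
+Estimated counts are for display only and should not feed any billing logic.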
+
+### Pitfall 3: Puter `signIn()` Popup Blocked
+**What goes wrong:** Browser blocks the `puter.auth.signIn()` popup because it was not called from a direct user gesture (e.g., called in a `useEffect` or after an async operation).
+**Why it happens:** Browsers require popups to originate from synchronous click handlers.
+**How to avoid:** Call `signIn()` directly in the button's `onClick` handler — no async operations between the click and the `signIn()` call.
+**Warning signs:** Browser console shows "Popup blocked" or the popup opens briefly then closes.
+
+### Pitfall 4: Google OAuth Abuse Risk
+**What goes wrong:** User's Google account is suspended for using third-party OAuth with Gemini CLI credentials. Risk affects Gmail and Workspace too.
+**Why it happens:** Google treats third-party use of Gemini CLI OAuth credentials as abuse under its policies (specifically the Gemini CLI terms). Bans can be automated and immediate.
+**How to avoid:** (1) Display a prominent warning before the user initiates Google OAuth. (2) Recommend API key from Google AI Studio as the safer alternative. (3) Do not suppress the warning even if it reduces conversion.
+**Warning signs:** User reports "Access blocked" or account suspension after completing the Google OAuth flow.
+
+### Pitfall 5: boardMutationGuard on New Routes
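+**What goes wrong:** Routes inside the `api` Router in `app.ts` are automatically protected by `boardMutationGuard()`. If the Puter proxy route is mounted outside that Router, it runs without board auth; if it is mounted inside, requests without a board session get a 403.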
+**Why it happens:** `api.use(boardMutationGuard())` applies to all routes mounted after it.
+**How to avoid:** Mount `puterProxyRoutes(db)` inside the `api` Router, after `boardMutationGuard()`. This is the correct pattern — the Puter proxy should require board auth.
+**Warning signs:** `403 Forbidden` responses even with valid board session; or route works without any auth (mounted outside `api` Router accidentally).
+
+### Pitfall 6: Secret Name Collision
+**What goes wrong:** `secretService.create()` throws a conflict error if `puter_auth_token` already exists, causing onboarding to fail on re-entry.
+**Why it happens:** `getByName` + `create` is not atomic; user may re-open onboarding wizard.
+**How to avoid:** Use `getByName` first; if exists, call `rotate()` instead of `create()`. Pattern: upsert-via-rotate.
+**Warning signs:** `conflict: Secret already exists: puter_auth_token` error in server logs.
+
+---
+
+## Code Examples
+
+### SSE Streaming in Existing chat.ts (Reference Pattern)
+```typescript
+// Source: /opt/nexus/server/src/routes/chat.ts lines 87-136
+// Pattern: set SSE headers, flushHeaders(), write "data: {...}\n\n" per chunk, res.end()
+res.setHeader("Content-Type", "text/event-stream");
+res.setHeader("Cache-Control", "no-cache");
+res.setHeader("Connection", "keep-alive");
+res.setHeader("X-Accel-Buffering", "no");
+res.flushHeaders();
+res.write(":ok\n\n");
+```
+
+### secretService Upsert Pattern
+```typescript
+// Source: /opt/nexus/server/src/services/secrets.ts
+// Pattern for idempotent token storage:
+const existing = await secrets.getByName(companyId, "puter_auth_token");
+if (existing) {
+  await secrets.rotate(existing.id, { value: newToken });
+} else {
+  await secrets.create(companyId, {
+    name: "puter_auth_token",
+    provider: "local_encrypted",
+    value: newToken,
+    description: "Puter.com auth token for AI proxy",
+  });
+}
+```
+
+### costService.createEvent() Shape for Puter
+```typescript
+// Source: /opt/nexus/packages/db/src/schema/cost_events.ts (schema)
+//         /opt/nexus/server/src/services/costs.ts (service)
+await costs.createEvent(companyId, {
+  agentId,               // required: UUID of agent or a sentinel agent
+  provider: "puter",
+  biller: "puter",
+  billingType: "subscription_included",
+  model: "claude-3-5-haiku-20241022",
+  inputTokens: 432,
+  outputTokens: 87,
+  costCents: 0,          // user-pays model: zero cost to Nexus instance
+  occurredAt: new Date(),
+});
+```
+
+### Probe Adapter Pattern (Existing, Reference)
+```typescript
+// Source: /opt/nexus/server/src/routes/agents.ts lines 667-692
+// Pattern for adapter availability probe (already exists):
+// GET /adapters/hermes_local/probe  → { available: true/false, status, checks }
+// GET /adapters/claude_local/probe
+// GET /adapters/openclaw_gateway/probe
+```
+
+### Mounting a New Route in app.ts
+```typescript
+// Source: /opt/nexus/server/src/app.ts lines 132-163 (pattern)
+// Hardware routes are mounted BEFORE the api Router (unauthenticated):
+app.use("/api", hardwareRoutes());  // unauthenticated
+
+// Most routes are mounted INSIDE the api Router (board-auth + mutation guard):
+api.use(secretRoutes(db));
+api.use(costRoutes(db));
+// ...
+api.use(puterProxyRoutes(db)); // ← new route goes here
+```
+
+---
+
+## State of the Art
+
+| Old Approach | Current Approach | When Changed | Impact |
+|--------------|------------------|--------------|--------|
+| `puter.ai.chat()` browser-direct | Server-proxy via OpenAI-compat REST | This phase | Tokens never exposed to browser; cost tracking possible |
+| Gemini OAuth considered stable | Gemini CLI OAuth leads to account bans | 2025-2026 | Must warn users; recommend API key instead |
+| `@heyputer/puter.js` latest | 2.2.14 | 2026-04-02 | No major version; stable |
+
+**Active risk areas:**
+- Puter token property name in SDK 2.2.14 — requires spike to verify
+- Puter streaming `usage` field presence — requires spike to verify
+- Google OAuth policy enforcement — not changing; bans are active and ongoing
+
+---
+
+## Open Questions
+
+1. **Exact Puter auth token property name after `signIn()`**
+   - What we know: The SDK has a `puter.authToken` property (per training data) and `puter.auth.getUser()` method after sign-in
+   - What's unclear: The exact API surface in 2.2.14 for extracting the bearer token
+   - Recommendation: Include a research spike as Wave 0 or Task 1 — load `https://js.puter.com/v2/`, call `signIn()`, and inspect the resulting puter object. Alternative: build a "paste your token" fallback in the UI (visible at `https://puter.com/dashboard` under profile) so the popup approach is optional.
+
+2. **Puter streaming `stream_options` support**
+   - What we know: The OpenAI API includes usage in stream chunks only when `stream_options: { include_usage: true }` is set
+   - What's unclear: Whether Puter's OpenAI-compat layer passes this through
+   - Recommendation: Include `stream_options: { include_usage: true }` in the request. If usage is 0, implement a local token estimator (char_count / 4) as a fallback — cost view shows "~N tokens (estimated)".
+
+3. **Agent identity for Puter chat cost events**
+   - What we know: `costService.createEvent()` requires a valid `agentId` (FK to agents table)
+   - What's unclear: Which agent gets attributed for costs from the onboarding wizard (no agent created yet)
+   - Recommendation: Use the newly-created PM agent's ID after the wizard completes. If the Puter call happens before agent creation, defer cost recording or use a placeholder until an agent exists. The simplest approach: wire the Puter proxy endpoint to require a `companyId` and `agentId` in the request body.
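+
+For the third question, requiring `companyId` and `agentId` in the request body could be enforced with a small validator. This is a sketch only: `parseProxyChatBody` and `ProxyChatBody` are hypothetical names, and the real route may prefer whatever validation library the server already uses.
+
+```typescript
+type ChatMessage = { role: string; content: string };
+type ProxyChatBody = {
+  companyId: string;
+  agentId: string;
+  messages: ChatMessage[];
+  model?: string;
+};
+
+// Hypothetical validator for the POST /api/puter-proxy/chat request body.
+export function parseProxyChatBody(body: unknown): ProxyChatBody {
+  if (typeof body !== "object" || body === null) throw new Error("body must be an object");
+  const { companyId, agentId, messages, model } = body as Record<string, unknown>;
+  if (typeof companyId !== "string" || companyId.length === 0) throw new Error("companyId is required");
+  if (typeof agentId !== "string" || agentId.length === 0) throw new Error("agentId is required");
+  if (!Array.isArray(messages) || messages.length === 0) {
+    throw new Error("messages must be a non-empty array");
+  }
+  const parsedMessages = messages.map((m: unknown): ChatMessage => {
+    const msg = m as Record<string, unknown> | null;
+    if (!msg || typeof msg.role !== "string" || typeof msg.content !== "string") {
+      throw new Error("each message needs a string role and content");
+    }
+    return { role: msg.role, content: msg.content };
+  });
+  const parsed: ProxyChatBody = { companyId, agentId, messages: parsedMessages };
+  if (typeof model === "string") parsed.model = model; // optional; the service applies its default otherwise
+  return parsed;
+}
+```
+
+Rejecting a request with no `agentId` up front keeps cost events from failing later on the foreign-key constraint.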
+
+---
+
+## Environment Availability
+
+| Dependency | Required By | Available | Version | Fallback |
+|------------|------------|-----------|---------|----------|
+| `@heyputer/puter.js` (npm) | Puter browser popup | Not yet in project | 2.2.14 | CDN load: `https://js.puter.com/v2/` |
+| Node.js `fetch` (built-in) | Puter OpenAI proxy | Yes | Node 18+ | `node-fetch` if Node <18 |
+| `crypto` (Node built-in) | Google PKCE | Yes | built-in | — |
+| Puter.com API endpoint | All Puter AI calls | Verified (external) | — | User sees "Puter service unavailable" |
+| Google OAuth consent screen | Gemini OAuth | External | — | User uses API key path instead |
+
+**Missing dependencies with no fallback:**
+- None that block execution.
+
+**Missing dependencies with fallback:**
+- `@heyputer/puter.js` in UI — CDN load (`https://js.puter.com/v2/`) is acceptable; avoids large bundle. Plan should add npm package for TypeScript types only.
+
+---
+
+## Validation Architecture
+
+### Test Framework
+| Property | Value |
+|----------|-------|
+| Framework | Vitest (server) |
+| Config file | `/opt/nexus/server/vitest.config.ts` |
+| Quick run command | `pnpm --filter @paperclipai/server test run src/__tests__/31-puter-proxy.test.ts` |
+| Full suite command | `pnpm test:run` |
+
+### Phase Requirements → Test Map
+| Req ID | Behavior | Test Type | Automated Command | File Exists? |
+|--------|----------|-----------|-------------------|-------------|
+| CLOUD-01 | Puter auth token storage roundtrip (POST token → resolve → correct value) | unit | `pnpm --filter @paperclipai/server test run src/__tests__/31-puter-proxy.test.ts` | No — Wave 0 |
+| CLOUD-02 | Proxy route calls Puter endpoint with correct Bearer header; records cost event | unit (mocked fetch) | same file | No — Wave 0 |
+| CLOUD-02 | Cost event has provider=puter, billingType=subscription_included, inputTokens>0 | unit | same file | No — Wave 0 |
+| CLOUD-03 | Google OAuth PKCE code_verifier/challenge generation is valid | unit | same file | No — Wave 0 |
+| CLOUD-04 | Probe route returns correct available/unavailable for known adapter types | unit | existing test infra | Partially — extend 30-hardware-detection.test.ts or new file |
+| CLOUD-05 | Secret upsert: second call rotates, not conflicts | unit | same file | No — Wave 0 |
+| CLOUD-01 | (E2E) Onboarding wizard Step 4 renders provider options | manual smoke | n/a | manual |
+
+### Sampling Rate
+- **Per task commit:** `pnpm --filter @paperclipai/server test run src/__tests__/31-puter-proxy.test.ts`
+- **Per wave merge:** `pnpm test:run`
+- **Phase gate:** Full suite green before `/gsd:verify-work`
+
+### Wave 0 Gaps
+- [ ] `server/src/__tests__/31-puter-proxy.test.ts` — covers CLOUD-01, CLOUD-02, CLOUD-03, CLOUD-05
+- [ ] Spike: confirm Puter SDK token property name and streaming usage field — can be a minimal Node script, not a test
+
+*(Existing test infrastructure covers CLOUD-04 partially via adapter probe patterns)*
+
+---
+
+## Sources
+
+### Primary (HIGH confidence)
+- [Puter OpenAI-compat endpoint](https://developer.puter.com/tutorials/use-openai-sdk-with-puter/) — `https://api.puter.com/puterai/openai/v1/` verified
+- [Puter auth docs](https://docs.puter.com/Auth/) — `puter.auth.signIn()` popup flow verified
+- [Puter AI chat docs](https://docs.puter.com/AI/chat/) — `ChatResponse` structure, streaming, model list
+- [Puter free unlimited AI](https://developer.puter.com/tutorials/free-unlimited-ai-api/) — user-pays model confirmed
+- `/opt/nexus/server/src/services/secrets.ts` — secretService API verified
+- `/opt/nexus/packages/db/src/schema/cost_events.ts` — cost event fields verified
+- `/opt/nexus/server/src/routes/agents.ts` lines 667-692 — probe endpoint pattern verified
+- Google Gemini CLI abuse detection: [gemini-cli discussion #22970](https://github.com/google-gemini/gemini-cli/discussions/22970), [openclaw issue #14203](https://github.com/openclaw/openclaw/issues/14203) — risk confirmed
+
+### Secondary (MEDIUM confidence)
+- [Puter Node.js guide](https://developer.puter.com/tutorials/puter-js-node-js/) — `init(token)` pattern, `puter.ai.chat()` usage
+- [Browser-based auth blog](https://developer.puter.com/blog/browser-based-auth-puter-js-node/) — `getAuthToken()` flow
+- [jsputer-proxy](https://github.com/mulkymalikuldhrs/jsputer-proxy) — community server-proxy pattern, confirms approach works
+- [npm: @heyputer/puter.js 2.2.14](https://www.npmjs.com/package/@heyputer/puter.js) — version confirmed 2026-04-02
+- [opencode-gemini-auth](https://github.com/jenslys/opencode-gemini-auth) — PKCE flow for Gemini, mirrors what we need
+
+### Tertiary (LOW confidence — need spike to verify)
+- Puter auth token property name (`puter.authToken`) — training data only, not in official docs
+- Puter streaming `usage` field behavior with `stream_options` — unverified against Puter's specific proxy layer
+- Exact Gemini CLI client_id to reuse for OAuth — publicly accessible but not formally documented for third-party use
+
+---
+
+## Metadata
+
+**Confidence breakdown:**
+- Standard stack: HIGH — Puter OpenAI endpoint is documented and community-verified; secretService/costService are internal and fully understood
+- Architecture: HIGH — proxy pattern is straightforward; wizard extension follows existing 3-step pattern
+- Pitfalls: HIGH — Google OAuth risk is extensively documented with real-world ban evidence; Puter popup pitfall is browser-standard behavior
+- Open questions: LOW — token property name requires live spike; can be resolved in Wave 0
+
+**Research date:** 2026-04-02
+**Valid until:** 2026-05-02 (Puter.js is actively developed; token API surface may change)