From 2efed8797e99af8a254c43e1670a700ca1bfdfa8 Mon Sep 17 00:00:00 2001
From: Nexus Dev <nexus@local>
Date: Fri, 3 Apr 2026 21:44:10 +0000
Subject: [PATCH] docs(33): research phase persistent memory domain

---
 .../33-persistent-memory/33-RESEARCH.md       | 420 ++++++++++++++++++
 1 file changed, 420 insertions(+)
 create mode 100644 .planning/phases/33-persistent-memory/33-RESEARCH.md
diff --git a/.planning/phases/33-persistent-memory/33-RESEARCH.md b/.planning/phases/33-persistent-memory/33-RESEARCH.md
new file mode 100644
index 00000000..bec6e614
--- /dev/null
+++ b/.planning/phases/33-persistent-memory/33-RESEARCH.md
@@ -0,0 +1,420 @@
+# Phase 33: Persistent Memory + Personal Assistant Mode — Research
+
+**Researched:** 2026-04-01
+**Domain:** File-backed memory service, prompt injection, assistant handoff to PM agent
+**Confidence:** HIGH
+
+---
+
+<user_constraints>
+## User Constraints (from CONTEXT.md)
+
+### Locked Decisions
+All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.
+
+### Claude's Discretion
+All implementation choices.
+
+### Deferred Ideas (OUT OF SCOPE)
+None — discuss phase skipped.
+</user_constraints>
+
+---
+
+<phase_requirements>
+## Phase Requirements
+
+| ID | Description | Research Support |
+|----|-------------|------------------|
+| ASST-01 | User has persistent memory across chat sessions (summary-based, injected into system prompts) | File-backed memory service at `data/assistant-memory/<companyId>.json`; inject via new `/conversations/:id/assistant-stream` endpoint or by extending existing `/conversations/:id/stream` |
+| ASST-02 | Memory content sanitized at write time to prevent prompt injection | `server/src/redaction.ts` has `SECRET_PAYLOAD_KEY_RE` + `JWT_VALUE_RE`; extend with a plain-text credential blocklist pattern applied at `memorySummary` write time |
+| ASST-03 | User can hand off an assistant conversation to a PM agent with one click, transferring context | Existing `POST /conversations/:id/handoff` route creates an issue from a spec. Phase 33 needs a parallel route — or mode on the existing route — that creates a PM agent conversation with a system message containing a conversation summary |
+| ASST-04 | Assistant and Project Builder modes work standalone or together | `nexusSettingsService` persists `mode` as `"personal_ai" | "project_builder" | "both"` to `data/nexus-settings.json`. Phase 33 adds a `PersonalAssistantPage` gated on `mode !== "project_builder"` |
+</phase_requirements>
+
+---
+
+## Summary
+
+Phase 33 adds three capabilities to Nexus: (1) a file-backed memory layer that accumulates facts across sessions and injects them as a system prompt prefix before every assistant response, (2) credential scrubbing that prevents API keys and tokens from being stored in memory, and (3) a one-click "hand off to PM" flow that creates a new conversation pre-seeded with a summary of the assistant exchange.
+
+No new DB tables are permitted (milestone constraint). All state lives either in `data/assistant-memory/<companyId>.json` (following the `nexus-settings.json` file-backed pattern) or in existing JSONB columns. The `mode` setting already persists correctly in `data/nexus-settings.json` via `nexusSettingsService` and is accessible via `GET /nexus/settings`. The UI only needs to read this mode to gate the `PersonalAssistantPage` route.
+
+The streaming endpoint at `POST /conversations/:id/stream` currently responds with the echo service. Phase 33 must replace this with a real AI call (most likely through `puterProxyService`) and add memory injection into the message array sent to the model. Memory is accumulated after each completed assistant turn by calling a new `assistantMemoryService.append()`, which sanitizes the assistant text before persisting it.
+
+**Primary recommendation:** Build an `assistantMemoryService` modelled on `nexusSettingsService` (file-backed JSON, no DB changes), extend the `/conversations/:id/stream` endpoint to prepend memory as a system message, and add a `/conversations/:id/assistant-handoff` route that creates a PM-linked conversation with a seeded system message.
+
+---
+
+## Standard Stack
+
+### Core
+| Library | Version | Purpose | Why Standard |
+|---------|---------|---------|--------------|
+| `node:fs` (sync) | Node built-in | Read/write `assistant-memory/<companyId>.json` | Same pattern as `nexus-settings.json` — no extra deps |
+| `zod` | already in codebase | Schema validation for memory JSON on read | All service layer uses zod |
+| `drizzle-orm` | already in codebase | Querying `chatMessages` for summary extraction | Already used in `chatService` |
+| Express Router | already in codebase | New memory CRUD routes | All server routes use Express |
+
+### Supporting
+| Library | Version | Purpose | When to Use |
+|---------|---------|---------|-------------|
+| `puterProxyService` | internal | Real AI completions for memory summarization | Use for the summarize-conversation step; same service already used for chat |
+
+### Alternatives Considered
+| Instead of | Could Use | Tradeoff |
+|------------|-----------|----------|
+| File-backed JSON memory | `instance_settings.general` JSONB | JSONB is per-instance, not per-company; file-backed allows per-company isolation and matches REQUIREMENTS.md out-of-scope note |
+| File-backed JSON memory | Vector DB (Mem0, Chroma) | Explicitly out of scope in REQUIREMENTS.md — "Vector database for memory: Summary-based approach sufficient; no infra overhead" |
+| File-backed JSON memory | `chat_conversations` extra column | Would require a DB schema migration — prohibited by milestone constraint |
+
+**Installation:** No new packages needed.
+
+---
+
+## Architecture Patterns
+
+### Recommended Project Structure
+```
+server/src/services/
+├── assistant-memory.ts     # new — read/write data/assistant-memory/<companyId>.json
+server/src/routes/
+├── assistant-memory.ts     # new — GET/PATCH memory endpoints
+ui/src/
+├── pages/PersonalAssistantPage.tsx  # new — gated on mode !== "project_builder"
+├── api/assistantMemory.ts           # new — API client for memory endpoints
+```
+
+### Pattern 1: File-Backed Service (follows nexusSettingsService)
+
+**What:** Read a JSON file from `resolvePaperclipInstanceRoot()/data/assistant-memory/<companyId>.json`, validate with zod, write back after updates.
+
+**When to use:** All memory reads and writes.
+
+**Example:**
+```typescript
+// Source: server/src/services/nexus-settings.ts (existing pattern)
+import fs from "node:fs";
+import path from "node:path";
+import { z } from "zod";
+import { resolvePaperclipInstanceRoot } from "../home-paths.js";
+
+const assistantMemorySchema = z.object({
+  facts: z.array(z.string()).default([]),
+  updatedAt: z.string().datetime().optional(),
+});
+
+type AssistantMemory = z.infer<typeof assistantMemorySchema>;
+
+function resolveMemoryPath(companyId: string): string {
+  return path.resolve(
+    resolvePaperclipInstanceRoot(),
+    "data",
+    "assistant-memory",
+    `${companyId}.json`,
+  );
+}
+```
+
+### Pattern 2: Memory Injection via System Message Prefix
+
+**What:** Before streaming a response, prepend a system message containing the memory facts to the messages array passed to the model.
+
+**When to use:** Every `/conversations/:id/stream` call when the conversation's company is in `personal_ai` or `both` mode.
+
+**Example:**
+```typescript
+// Inject pattern — server/src/routes/chat.ts stream endpoint extension
+const memory = await assistantMemoryService.get(companyId);
+const systemPrefix = memory.facts.length > 0
+  ? `[Memory from previous sessions]\n${memory.facts.map(f => `- ${f}`).join("\n")}\n\n`
+  : "";
+const messagesWithMemory = systemPrefix
+  ? [{ role: "system", content: systemPrefix }, ...conversationMessages]
+  : conversationMessages;
+```
+
+### Pattern 3: Write-Time Sanitization
+
+**What:** Before appending any fact to memory, run it through a credential-scrubbing function.
+
+**When to use:** Every call to `assistantMemoryService.append()`.
+
+**Example:**
+```typescript
+// Source: server/src/redaction.ts (existing pattern extended for plain text)
+const CREDENTIAL_INLINE_RE = /\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
+const SENSITIVE_KEY_VALUE_RE = /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;
+
+function sanitizeMemoryFact(raw: string): string {
+  return raw
+    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
+    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
+}
+```
+
+### Pattern 4: PM Agent Handoff with Conversation Summary
+
+**What:** `POST /conversations/:id/assistant-handoff` — creates a new conversation pre-seeded with a system message summarising the current exchange, then navigates the user to that new conversation.
+
+**When to use:** When user clicks "Turn this into a project" in PersonalAssistantPage.
+
+The existing `POST /conversations/:id/handoff` creates an issue from a structured spec. The new assistant handoff is different — it creates a conversation context transfer, not necessarily an issue. It should:
+1. Fetch the last N messages from the current conversation.
+2. Produce a brief text summary (simple first pass: concatenate user messages up to a token budget; optionally, use AI summarisation via `puterProxyService` if a token is available).
+3. Create a new conversation (or find the PM agent's default conversation) and insert a `messageType: "handoff_context"` system message.
+4. Return `{ targetConversationId }` so the client can navigate.
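+
+A minimal sketch of the simple first-pass summary from step 2, under stated assumptions: `ChatMessageLike`, `buildHandoffSummary`, the 4-characters-per-token heuristic, and the default budget are all hypothetical and not taken from the codebase. It walks the messages newest-first so the most recent user intent survives truncation.
+
+```typescript
+// Hypothetical message shape; the real chatMessages row type may differ.
+interface ChatMessageLike {
+  role: "system" | "user" | "assistant";
+  content: string;
+}
+
+// Assumption: roughly 4 characters per token, used only as a budget heuristic.
+const CHARS_PER_TOKEN = 4;
+
+function buildHandoffSummary(
+  messages: ChatMessageLike[],
+  tokenBudget = 500,
+): string {
+  const charBudget = tokenBudget * CHARS_PER_TOKEN;
+  const lines: string[] = [];
+  let used = 0;
+  // Walk newest-first; stop at the first user message that does not fit.
+  for (let i = messages.length - 1; i >= 0; i--) {
+    const msg = messages[i];
+    if (!msg || msg.role !== "user") continue;
+    const line = `- ${msg.content.trim()}`;
+    if (used + line.length + 1 > charBudget) break;
+    lines.unshift(line); // keep chronological order in the output
+    used += line.length + 1;
+  }
+  return lines.length > 0
+    ? `Context from assistant conversation:\n${lines.join("\n")}`
+    : "";
+}
+```
+
+Because the summary is built from raw user messages, it would be reasonable to pass the result through the same sanitization used for memory facts before inserting it as the `handoff_context` message.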
+
+### Anti-Patterns to Avoid
+- **Sanitizing memory at retrieval time (not write time):** ASST-02 requires sanitization at write time. Sanitizing only on read means raw credentials are stored to disk.
+- **Storing the full conversation as memory:** Memory should be a summary — not raw message content. Full messages can include credentials, PII, or prompt-injection payloads.
+- **Putting memory in `instance_settings.general`:** That field is per-instance not per-company. Multiple companies (workspaces) need separate memory namespaces.
+- **Reading memory synchronously in the hot path of the SSE stream without caching:** The file read is fast (< 1ms for small JSON) but should be done once before writing SSE headers, not inside the token loop.
+
+---
+
+## Don't Hand-Roll
+
+| Problem | Don't Build | Use Instead | Why |
+|---------|-------------|-------------|-----|
+| Credential scrubbing regex | Custom patterns from scratch | Extend `server/src/redaction.ts` with plain-text patterns | Existing patterns cover `api_key`, `access_token`, `jwt`, etc. — reusing reduces divergence |
+| Secrets storage | New encrypted store | `secretService` already handles encrypted storage — memory does not store secrets, only sanitized facts | Credential data must never reach memory layer at all |
+| AI summarisation service | New inference client | `puterProxyService` for optional summarisation — same endpoint, same cost tracking | Avoids a second AI client implementation |
+
+**Key insight:** The memory layer is deliberately simple — a list of plain-text facts. The complexity is in the sanitization gate, not the storage.
+
+---
+
+## Runtime State Inventory
+
+> SKIPPED — this is a greenfield phase, not a rename/refactor/migration.
+
+---
+
+## Common Pitfalls
+
+### Pitfall 1: Mode Check Location
+**What goes wrong:** UI renders PersonalAssistantPage for all users regardless of mode, or the stream endpoint injects memory for project-builder-only users.
+**Why it happens:** `nexusSettings.mode` is read from a file, not from DB — easy to forget to check it server-side.
+**How to avoid:** The stream endpoint must call `nexusSettingsService().get()` and skip memory injection when `mode === "project_builder"`.
+**Warning signs:** Memory facts appearing in project builder conversations.
+
+### Pitfall 2: companyId Scoping in Memory File Path
+**What goes wrong:** All companies share a single memory file.
+**Why it happens:** Forgetting `companyId` in the path, e.g. `data/assistant-memory.json` instead of `data/assistant-memory/<companyId>.json`.
+**How to avoid:** The path resolver function must include `companyId` in the file name under the `assistant-memory/` directory (as shown in Pattern 1 above).
+**Warning signs:** Facts from workspace A appearing in workspace B.
+
+### Pitfall 3: SSE Stream Already Flushed When Memory Read Fails
+**What goes wrong:** A memory read error causes an unhandled exception after SSE headers are flushed, leaving the client connection open but broken.
+**Why it happens:** The existing stream endpoint flushes headers at line 101 (`res.flushHeaders()`) before any async logic.
+**How to avoid:** Read memory before calling `res.flushHeaders()`. If memory read fails, fall back gracefully (empty memory, log warning) — never throw after flush.
+**Warning signs:** Client sees `:ok` event but then nothing further.
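+
+A minimal sketch of the graceful fallback described above. `readMemoryOrEmpty` is a hypothetical helper, not an existing function; it uses manual shape checks instead of the zod schema from Pattern 1 only to keep the example dependency-free. The point is that it never throws, so it can be called safely before `res.flushHeaders()`.
+
+```typescript
+import fs from "node:fs";
+
+interface AssistantMemory {
+  facts: string[];
+}
+
+// Hypothetical helper: returns empty memory on any read or parse failure.
+function readMemoryOrEmpty(filePath: string): AssistantMemory {
+  try {
+    const parsed: unknown = JSON.parse(fs.readFileSync(filePath, "utf-8"));
+    if (
+      typeof parsed === "object" &&
+      parsed !== null &&
+      Array.isArray((parsed as { facts?: unknown }).facts)
+    ) {
+      // Keep only string entries; anything else is silently dropped.
+      const facts = (parsed as { facts: unknown[] }).facts.filter(
+        (f): f is string => typeof f === "string",
+      );
+      return { facts };
+    }
+    console.warn(`assistant memory at ${filePath} has an unexpected shape`);
+    return { facts: [] };
+  } catch (err) {
+    // A missing file is the normal first-run case; log everything else.
+    if ((err as { code?: string }).code !== "ENOENT") {
+      console.warn(`assistant memory read failed for ${filePath}`, err);
+    }
+    return { facts: [] };
+  }
+}
+```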
+
+### Pitfall 4: Memory Growing Without Bound
+**What goes wrong:** `facts` array accumulates thousands of entries; the injected system prompt exceeds the model's context window.
+**Why it happens:** No fact eviction or cap.
+**How to avoid:** Cap at 50 facts (FIFO — drop oldest when limit is reached). Cap the injected system prefix at 2000 characters max.
+**Warning signs:** Streaming responses truncated; model errors about context length.
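+
+The two caps above could be sketched as follows. `appendFact` and `buildMemoryPrefix` are hypothetical names, and keeping the newest facts when the prefix is trimmed is an assumption made for this example; the header string mirrors Pattern 2.
+
+```typescript
+const MAX_FACTS = 50;
+const MAX_PREFIX_CHARS = 2000;
+
+// FIFO cap: append the new fact, then drop the oldest entries beyond the limit.
+function appendFact(facts: string[], fact: string): string[] {
+  const next = [...facts, fact];
+  return next.length > MAX_FACTS ? next.slice(next.length - MAX_FACTS) : next;
+}
+
+// Build the injected prefix, keeping as many of the newest facts as fit.
+function buildMemoryPrefix(facts: string[]): string {
+  const header = "[Memory from previous sessions]\n";
+  const kept: string[] = [];
+  let length = header.length;
+  for (let i = facts.length - 1; i >= 0; i--) {
+    const line = `- ${facts[i]}\n`;
+    if (length + line.length > MAX_PREFIX_CHARS) break;
+    kept.unshift(line); // preserve oldest-to-newest order in the prompt
+    length += line.length;
+  }
+  return kept.length > 0 ? header + kept.join("") : "";
+}
+```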
+
+### Pitfall 5: Handoff Creates Duplicate PM Conversations
+**What goes wrong:** Every "Turn this into a project" click creates a new PM conversation, leading to many orphaned conversations.
+**Why it happens:** The handoff endpoint creates a new conversation on every call.
+**How to avoid:** Either reuse the PM agent's most recent conversation or create a fresh one per handoff. Creating a fresh one each time is acceptable for v1.5 but should be documented as a known v1.5 limitation. The route should return `targetConversationId` so the UI can detect if a new one was created.
+**Warning signs:** Conversation list growing rapidly.
+
+### Pitfall 6: Chat Route Injection Point
+**What goes wrong:** Memory injection is added to the wrong handler — the echo `streamEcho` helper instead of the real AI call.
+**Why it happens:** `POST /conversations/:id/stream` currently calls `svc.streamEcho()` which is a stub. Phase 33 replaces this with a real AI call. The memory injection must be added here, not in the stub.
+**How to avoid:** During plan-phase, inspect the exact injection point in `server/src/routes/chat.ts` lines 107-137 and confirm the replacement strategy. (`.planning/STATE.md` blocker: "Chat route injection point needs codebase inspection — confirm correct hook location in `server/src/services/chat.ts` during plan-phase".)
+**Warning signs:** Memory injected but responses are still the echo stub.
+
+---
+
+## Code Examples
+
+Verified patterns from existing codebase:
+
+### File-Backed Service Read/Write
+```typescript
+// Source: server/src/services/nexus-settings.ts
+async function get(): Promise<NexusSettings> {
+  const filePath = resolveNexusSettingsPath();
+  try {
+    const raw = fs.readFileSync(filePath, "utf-8");
+    const parsed = nexusSettingsSchema.safeParse(JSON.parse(raw));
+    if (parsed.success) return parsed.data;
+    return { mode: "both" };
+  } catch {
+    return { mode: "both" };
+  }
+}
+
+async function set(patch: Partial<NexusSettings>): Promise<NexusSettings> {
+  const current = await get();
+  const merged = { ...current, ...patch };
+  const validated = nexusSettingsSchema.parse(merged);
+  const filePath = resolveNexusSettingsPath();
+  fs.mkdirSync(path.dirname(filePath), { recursive: true });
+  fs.writeFileSync(filePath, JSON.stringify(validated, null, 2), "utf-8");
+  return validated;
+}
+```
+
+### Existing Redaction Patterns (extend for plain-text memory)
+```typescript
+// Source: server/src/redaction.ts
+const SECRET_PAYLOAD_KEY_RE =
+  /(api[-_]?key|access[-_]?token|auth(?:_?token)?|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)/i;
+const JWT_VALUE_RE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)?$/;
+```
+
+### Existing Handoff Route Pattern
+```typescript
+// Source: server/src/routes/chat.ts lines 163-203
+router.post("/conversations/:id/handoff", async (req, res) => {
+  assertBoard(req);
+  const data = handoffSchema.parse(req.body);
+  const conversation = await svc.getConversation(req.params.id!);
+  const companyId = conversation.companyId;
+  // 1. Insert handoff system message
+  // 2. Create issue from spec
+  // 3. Insert task_created system message
+  res.json({ handoffMessageId: handoffMsg.id, issues: [issue] });
+});
+```
+
+### Existing SSE Streaming Pattern
+```typescript
+// Source: server/src/routes/chat.ts lines 88-137
+res.setHeader("Content-Type", "text/event-stream");
+res.setHeader("Cache-Control", "no-cache");
+res.setHeader("Connection", "keep-alive");
+res.setHeader("X-Accel-Buffering", "no");
+res.flushHeaders();
+res.write(":ok\n\n");
+// ... async generator loop
+res.write(`data: ${JSON.stringify({ token })}\n\n`);
+res.write(`data: ${JSON.stringify({ done: true, messageId, content })}\n\n`);
+```
+
+### PuterProxyService Chat Stream
+```typescript
+// Source: server/src/services/puter-proxy.ts
+async function* chatStream(
+  companyId: string,
+  agentId: string | null | undefined,
+  messages: unknown[],
+  model: string | undefined,
+  signal: AbortSignal | undefined,
+): AsyncGenerator<string>
+// Called with messages array in OpenAI format: [{ role: "system"|"user"|"assistant", content: string }]
+```
+
+---
+
+## State of the Art
+
+| Old Approach | Current Approach | When Changed | Impact |
+|--------------|------------------|--------------|--------|
+| Vector DB for memory | Summary-based file-backed facts | Phase 33 (new) | No infra overhead; no pgvector needed |
+| Echo stub in stream endpoint | Real AI via puterProxyService | Phase 33 (new) | Memory injection only makes sense once real AI is wired |
+
+**Deprecated/outdated:**
+- `streamEcho()` in `chatService`: stub used for testing — Phase 33 replaces it with real AI calls in the stream route (or supplements it with a parallel assistant-stream endpoint).
+
+---
+
+## Open Questions
+
+1. **Does the stream endpoint replace the echo stub or add a parallel endpoint?**
+   - What we know: `/conversations/:id/stream` currently calls `svc.streamEcho()`. The puter proxy has its own endpoint at `/puter-proxy/chat`.
+   - What's unclear: Whether to extend the existing stream endpoint to delegate to puterProxyService (if the conversation is in a personal-assistant company), or create a dedicated `/conversations/:id/assistant-stream`.
+   - Recommendation: Extend the existing endpoint — the UI already calls it via `chatApi.postMessageAndStream()`. The endpoint should check `nexusSettings.mode` and route to puterProxy if a token is available, else fall back to echo stub. This avoids a UI-side change.
+
+2. **What constitutes a "memory fact" — when is it written?**
+   - What we know: ASST-01 says "summary-based". Facts could be extracted from (a) every assistant turn, (b) end of conversation, or (c) on demand.
+   - What's unclear: v1.5 success criterion 1 says "A fact stated in one chat session… is referenced correctly by the assistant in a new session." This implies facts are persisted at the end of a session or after each user-confirmed turn.
+   - Recommendation: Append after each assistant response turn — simpler than session-end detection, and aligns with success criterion 1 (fact from one session visible in next). The `onDone` callback in `chatApi.postMessageAndStream` is the natural trigger; or do it server-side in the stream endpoint after saving the final message.
+
+3. **How does the personal assistant handoff interact with the existing brainstormer handoff?**
+   - What we know: The existing handoff (Phase 23) creates a project issue from a structured spec (`{ what, why, constraints, success }`). The assistant handoff (ASST-03) transfers conversation context to a PM agent.
+   - What's unclear: Whether these are the same button or different flows.
+   - Recommendation: Implement a separate `POST /conversations/:id/assistant-handoff` endpoint. It does not create an issue — it creates a new conversation with a context summary as a seeded system message. The UI button is "Turn this into a project" (distinct from the brainstormer's "Send to PM" which targets a structured spec).
+
+---
+
+## Environment Availability
+
+> Phase is code/config-only. No new external services required.
+> `puterProxyService` is already integrated (Phase 31). File I/O uses Node.js built-ins.
+
+| Dependency | Required By | Available | Version | Fallback |
+|------------|------------|-----------|---------|----------|
+| Node.js `fs` module | Memory file storage | ✓ | built-in | — |
+| `puterProxyService` | AI responses + optional summarisation | ✓ | internal | Fall back to echo stub if no token |
+| `nexusSettingsService` | Mode check (ASST-04) | ✓ | internal | — |
+| `secretService` | Token resolution in puterProxy | ✓ | internal | — |
+
+---
+
+## Validation Architecture
+
+### Test Framework
+| Property | Value |
+|----------|-------|
+| Framework | Vitest 3.x |
+| Config file | `server/vitest.config.ts` |
+| Quick run command | `pnpm --filter @paperclipai/server vitest run --reporter=verbose src/__tests__/33-*.test.ts` |
+| Full suite command | `pnpm test:run` |
+
+### Phase Requirements → Test Map
+| Req ID | Behavior | Test Type | Automated Command | File Exists? |
+|--------|----------|-----------|-------------------|-------------|
+| ASST-01 | Memory persists across sessions — new session includes previously stored fact in system prompt | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-memory.test.ts` | ❌ Wave 0 |
+| ASST-02 | API key pasted into chat is NOT stored in memory file | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-memory-sanitization.test.ts` | ❌ Wave 0 |
+| ASST-03 | Assistant handoff creates target conversation with context summary system message | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-handoff.test.ts` | ❌ Wave 0 |
+| ASST-04 | PersonalAssistantPage visible when mode is `personal_ai` or `both`; hidden when `project_builder` | unit | `pnpm --filter @paperclipai/ui vitest run src/components/PersonalAssistantPage.test.tsx` | ❌ Wave 0 |
+
+### Sampling Rate
+- **Per task commit:** `pnpm --filter @paperclipai/server vitest run src/__tests__/33-*.test.ts`
+- **Per wave merge:** `pnpm test:run`
+- **Phase gate:** Full suite green before `/gsd:verify-work`
+
+### Wave 0 Gaps
+- [ ] `server/src/__tests__/33-assistant-memory.test.ts` — covers ASST-01
+- [ ] `server/src/__tests__/33-memory-sanitization.test.ts` — covers ASST-02
+- [ ] `server/src/__tests__/33-assistant-handoff.test.ts` — covers ASST-03
+- [ ] `ui/src/components/PersonalAssistantPage.test.tsx` — covers ASST-04
+
+---
+
+## Sources
+
+### Primary (HIGH confidence)
+- `server/src/services/nexus-settings.ts` — file-backed JSON service pattern (confirmed by reading source)
+- `server/src/services/puter-proxy.ts` — AI streaming service (confirmed by reading source)
+- `server/src/routes/chat.ts` — SSE streaming pattern and handoff route (confirmed by reading source)
+- `server/src/redaction.ts` — credential scrubbing patterns (confirmed by reading source)
+- `packages/db/src/schema/chat_conversations.ts` — conversation schema (confirmed by reading source)
+- `packages/db/src/schema/chat_messages.ts` — message schema with `messageType` column (confirmed by reading source)
+- `.planning/REQUIREMENTS.md` — "DB schema changes: Out of Scope" (confirmed by reading source)
+
+### Secondary (MEDIUM confidence)
+- `.planning/ROADMAP.md` Phase 33 success criteria — defines exact behavior for ASST-01/ASST-02/ASST-03/ASST-04
+- `.planning/STATE.md` Accumulated Context decisions — confirms "No DB schema changes", "Memory sanitization blocklist applied at write time"
+
+### Tertiary (LOW confidence)
+- None — all findings verified from source files.
+
+---
+
+## Metadata
+
+**Confidence breakdown:**
+- Standard stack: HIGH — verified by reading existing service files
+- Architecture: HIGH — follows established patterns in codebase (nexusSettingsService, redaction.ts, handoff route)
+- Pitfalls: HIGH — derived from reading the actual stream endpoint code and constraint docs
+
+**Research date:** 2026-04-01
+**Valid until:** 2026-05-01 (stable codebase; these patterns won't change without major refactors)