From 2efed8797e99af8a254c43e1670a700ca1bfdfa8 Mon Sep 17 00:00:00 2001 From: Nexus Dev Date: Fri, 3 Apr 2026 21:44:10 +0000 Subject: [PATCH] docs(33): research phase persistent memory domain --- .../33-persistent-memory/33-RESEARCH.md | 420 ++++++++++++++++++ 1 file changed, 420 insertions(+) create mode 100644 .planning/phases/33-persistent-memory/33-RESEARCH.md diff --git a/.planning/phases/33-persistent-memory/33-RESEARCH.md b/.planning/phases/33-persistent-memory/33-RESEARCH.md new file mode 100644 index 00000000..bec6e614 --- /dev/null +++ b/.planning/phases/33-persistent-memory/33-RESEARCH.md @@ -0,0 +1,420 @@ +# Phase 33: Persistent Memory + Personal Assistant Mode — Research + +**Researched:** 2026-04-01 +**Domain:** File-backed memory service, prompt injection, assistant handoff to PM agent +**Confidence:** HIGH + +--- + + +## User Constraints (from CONTEXT.md) + +### Locked Decisions +All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions. + +### Claude's Discretion +All implementation choices. + +### Deferred Ideas (OUT OF SCOPE) +None — discuss phase skipped. 
+ + +--- + + +## Phase Requirements + +| ID | Description | Research Support | +|----|-------------|------------------| +| ASST-01 | User has persistent memory across chat sessions (summary-based, injected into system prompts) | File-backed memory service at `data/assistant-memory/.json`; inject via new `/conversations/:id/assistant-stream` endpoint or by extending existing `/conversations/:id/stream` | +| ASST-02 | Memory content sanitized at write time to prevent prompt injection | `server/src/redaction.ts` has `SECRET_PAYLOAD_KEY_RE` + `JWT_VALUE_RE`; extend with a plain-text credential blocklist pattern applied at `memorySummary` write time | +| ASST-03 | User can hand off an assistant conversation to a PM agent with one click, transferring context | Existing `POST /conversations/:id/handoff` route creates an issue from a spec. Phase 33 needs a parallel route — or mode on the existing route — that creates a PM agent conversation with a system message containing a conversation summary | +| ASST-04 | Assistant and Project Builder modes work standalone or together | `nexusSettingsService` persists `mode` as `"personal_ai" | "project_builder" | "both"` to `data/nexus-settings.json`. Phase 33 adds a `PersonalAssistantPage` gated on `mode !== "project_builder"` | + + +--- + +## Summary + +Phase 33 adds three capabilities to Nexus: (1) a file-backed memory layer that accumulates facts across sessions and injects them as a system prompt prefix before every assistant response, (2) credential scrubbing that prevents API keys and tokens from being stored in memory, and (3) a one-click "hand off to PM" flow that creates a new conversation pre-seeded with a summary of the assistant exchange. + +No new DB tables are permitted (milestone constraint). All state lives either in `data/assistant-memory/.json` (following the `nexus-settings.json` file-backed pattern) or in existing JSONB columns. 
The `mode` setting already persists correctly in `data/nexus-settings.json` via `nexusSettingsService` and is accessible via `GET /nexus/settings`. The UI only needs to read this mode to gate the `PersonalAssistantPage` route. + +The streaming endpoint at `POST /conversations/:id/stream` currently responds via the echo stub (`svc.streamEcho()`). Phase 33 must replace this with a real AI call (most likely through `puterProxyService`) and add memory injection into the message array sent to the model. Memory is accumulated after each completed assistant turn by calling a new `assistantMemoryService.append()`, which sanitizes the assistant text before persisting. + +**Primary recommendation:** Build an `assistantMemoryService` modelled on `nexusSettingsService` (file-backed JSON, no DB changes), extend the `/conversations/:id/stream` endpoint to prepend memory as a system message, and add a `/conversations/:id/assistant-handoff` route that creates a PM-linked conversation with a seeded system message. + +--- + +## Standard Stack + +### Core +| Library | Version | Purpose | Why Standard | +|---------|---------|---------|--------------| +| `node:fs` (sync) | Node built-in | Read/write `assistant-memory/.json` | Same pattern as `nexus-settings.json`; no extra deps | +| `zod` | already in codebase | Schema validation for memory JSON on read | The whole service layer uses zod | +| `drizzle-orm` | already in codebase | Querying `chatMessages` for summary extraction | Already used in `chatService` | +| Express Router | already in codebase | New memory CRUD routes | All server routes use Express | + +### Supporting +| Library | Version | Purpose | When to Use | +|---------|---------|---------|-------------| +| `puterProxyService` | internal | Real AI completions for memory summarization | Use for the summarize-conversation step; same service already used for chat | + +### Alternatives Considered +| Instead of | Could Use | Tradeoff | +|------------|-----------|----------| +| File-backed JSON memory | 
`instance_settings.general` JSONB | JSONB is per-instance, not per-company; file-backed allows per-company isolation and matches REQUIREMENTS.md out-of-scope note | +| File-backed JSON memory | Vector DB (Mem0, Chroma) | Explicitly out of scope in REQUIREMENTS.md — "Vector database for memory: Summary-based approach sufficient; no infra overhead" | +| File-backed JSON memory | `chat_conversations` extra column | Would require a DB schema migration — prohibited by milestone constraint | + +**Installation:** No new packages needed. + +--- + +## Architecture Patterns + +### Recommended Project Structure +``` +server/src/services/ +├── assistant-memory.ts # new — read/write data/assistant-memory/.json +server/src/routes/ +├── assistant-memory.ts # new — GET/PATCH memory endpoints +ui/src/ +├── pages/PersonalAssistantPage.tsx # new — gated on mode !== "project_builder" +├── api/assistantMemory.ts # new — API client for memory endpoints +``` + +### Pattern 1: File-Backed Service (follows nexusSettingsService) + +**What:** Read a JSON file from `resolvePaperclipInstanceRoot()/data/assistant-memory/.json`, validate with zod, write back after updates. + +**When to use:** All memory reads and writes. 
+ +**Example:** +```typescript +// Source: server/src/services/nexus-settings.ts (existing pattern) +import fs from "node:fs"; +import path from "node:path"; +import { z } from "zod"; +import { resolvePaperclipInstanceRoot } from "../home-paths.js"; + +const assistantMemorySchema = z.object({ + facts: z.array(z.string()).default([]), + updatedAt: z.string().datetime().optional(), +}); + +type AssistantMemory = z.infer<typeof assistantMemorySchema>; + +function resolveMemoryPath(companyId: string): string { + return path.resolve( + resolvePaperclipInstanceRoot(), + "data", + "assistant-memory", + `${companyId}.json`, + ); +} +``` + +### Pattern 2: Memory Injection via System Message Prefix + +**What:** Before streaming a response, prepend a system message containing the memory facts to the messages array passed to the model. + +**When to use:** Every `/conversations/:id/stream` call when the conversation's company is in `personal_ai` or `both` mode. + +**Example:** +```typescript +// Inject pattern: server/src/routes/chat.ts stream endpoint extension +const memory = await assistantMemoryService.get(companyId); +const systemPrefix = memory.facts.length > 0 + ? `[Memory from previous sessions]\n${memory.facts.map(f => `- ${f}`).join("\n")}\n\n` + : ""; +const messagesWithMemory = systemPrefix + ? [{ role: "system", content: systemPrefix }, ...conversationMessages] + : conversationMessages; +``` + +### Pattern 3: Write-Time Sanitization + +**What:** Before appending any fact to memory, run it through a credential-scrubbing function. + +**When to use:** Every call to `assistantMemoryService.append()`. 
+ +**Example:** +```typescript +// Source: server/src/redaction.ts (existing pattern extended for plain text) +const CREDENTIAL_INLINE_RE = /\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g; +const SENSITIVE_KEY_VALUE_RE = /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi; + +function sanitizeMemoryFact(raw: string): string { + return raw + .replace(CREDENTIAL_INLINE_RE, "[REDACTED]") + .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]"); +} +``` + +### Pattern 4: PM Agent Handoff with Conversation Summary + +**What:** `POST /conversations/:id/assistant-handoff` creates a new conversation pre-seeded with a system message summarising the current exchange; the client then navigates the user to that new conversation. + +**When to use:** When the user clicks "Turn this into a project" in PersonalAssistantPage. + +The existing `POST /conversations/:id/handoff` creates an issue from a structured spec. The new assistant handoff is different: it transfers conversation context and does not necessarily create an issue. It should: +1. Fetch the last N messages from the current conversation. +2. Produce a brief text summary (simple first pass: concatenate user messages up to a token budget; optionally, use AI summarisation via `puterProxyService` if a token is available). +3. Create a new conversation (or find the PM agent's default conversation) and insert a `messageType: "handoff_context"` system message. +4. Return `{ targetConversationId }` so the client can navigate. + +### Anti-Patterns to Avoid +- **Sanitizing memory at retrieval time (not write time):** ASST-02 requires sanitization at write time. Sanitizing only on read means raw credentials are stored to disk. +- **Storing the full conversation as memory:** Memory should be a summary, not raw message content. Full messages can include credentials, PII, or prompt-injection payloads. 
+- **Putting memory in `instance_settings.general`:** That field is per-instance not per-company. Multiple companies (workspaces) need separate memory namespaces. +- **Reading memory synchronously in the hot path of the SSE stream without caching:** The file read is fast (< 1ms for small JSON) but should be done once before writing SSE headers, not inside the token loop. + +--- + +## Don't Hand-Roll + +| Problem | Don't Build | Use Instead | Why | +|---------|-------------|-------------|-----| +| Credential scrubbing regex | Custom patterns from scratch | Extend `server/src/redaction.ts` with plain-text patterns | Existing patterns cover `api_key`, `access_token`, `jwt`, etc. — reusing reduces divergence | +| Secrets storage | New encrypted store | `secretService` already handles encrypted storage — memory does not store secrets, only sanitized facts | Credential data must never reach memory layer at all | +| AI summarisation service | New inference client | `puterProxyService` for optional summarisation — same endpoint, same cost tracking | Avoids a second AI client implementation | + +**Key insight:** The memory layer is deliberately simple — a list of plain-text facts. The complexity is in the sanitization gate, not the storage. + +--- + +## Runtime State Inventory + +> SKIPPED — this is a greenfield phase, not a rename/refactor/migration. + +--- + +## Common Pitfalls + +### Pitfall 1: Mode Check Location +**What goes wrong:** UI renders PersonalAssistantPage for all users regardless of mode, or the stream endpoint injects memory for project-builder-only users. +**Why it happens:** `nexusSettings.mode` is read from a file, not from DB — easy to forget to check it server-side. +**How to avoid:** The stream endpoint must call `nexusSettingsService().get()` and skip memory injection when `mode === "project_builder"`. +**Warning signs:** Memory facts appearing in project builder conversations. 
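The server-side mode check from Pitfall 1 could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the codebase's implementation: `NexusMode`, `ChatMessage`, `shouldInjectMemory`, and `buildMessages` are hypothetical names introduced here, and only the three mode values and the `[Memory from previous sessions]` prefix are taken from the research above.

```typescript
// Hypothetical types; the three mode values mirror what nexusSettingsService persists.
type NexusMode = "personal_ai" | "project_builder" | "both";

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Memory is injected only when the assistant is active, i.e. not in project-builder-only mode.
function shouldInjectMemory(mode: NexusMode): boolean {
  return mode !== "project_builder";
}

// Returns the message array to send to the model, with memory prepended when allowed.
function buildMessages(
  mode: NexusMode,
  facts: string[],
  conversation: ChatMessage[],
): ChatMessage[] {
  if (!shouldInjectMemory(mode) || facts.length === 0) {
    return conversation;
  }
  const prefix = `[Memory from previous sessions]\n${facts.map((f) => `- ${f}`).join("\n")}`;
  return [{ role: "system", content: prefix }, ...conversation];
}
```

Keeping the decision in one small pure function makes it straightforward to unit-test the gate separately from the SSE route, and the same predicate could back the UI-side route guard.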
+ +### Pitfall 2: companyId Scoping in Memory File Path +**What goes wrong:** All companies share a single memory file. +**Why it happens:** Forgetting `companyId` in the path, e.g. `data/assistant-memory.json` instead of `data/assistant-memory/.json`. +**How to avoid:** The path resolver function must include `companyId` in the file name (as shown in Pattern 1 above). +**Warning signs:** Facts from workspace A appearing in workspace B. + +### Pitfall 3: SSE Stream Already Flushed When Memory Read Fails +**What goes wrong:** A memory read error causes an unhandled exception after SSE headers are flushed, leaving the client connection open but broken. +**Why it happens:** The existing stream endpoint flushes headers at line 101 (`res.flushHeaders()`) before any async logic. +**How to avoid:** Read memory before calling `res.flushHeaders()`. If the memory read fails, fall back gracefully (empty memory, log a warning) and never throw after the flush. +**Warning signs:** Client sees the `:ok` event but then nothing further. + +### Pitfall 4: Memory Growing Without Bound +**What goes wrong:** The `facts` array accumulates thousands of entries; the injected system prompt exceeds the model's context window. +**Why it happens:** No fact eviction or cap. +**How to avoid:** Cap at 50 facts (FIFO: drop the oldest when the limit is reached). Cap the injected system prefix at 2000 characters. +**Warning signs:** Streaming responses truncated; model errors about context length. + +### Pitfall 5: Handoff Creates Duplicate PM Conversations +**What goes wrong:** Every "Turn this into a project" click creates a new PM conversation, leading to many orphaned conversations. +**Why it happens:** The handoff endpoint creates a new conversation on every call. +**How to avoid:** Use the PM agent's existing most recent conversation or create a fresh one per handoff (acceptable for v1.5); document this as a known v1.5 limitation. 
The route should return `targetConversationId` so the UI can detect if a new one was created. +**Warning signs:** Conversation list growing rapidly. + +### Pitfall 6: Chat Route Injection Point +**What goes wrong:** Memory injection is added to the wrong handler — the echo `streamEcho` helper instead of the real AI call. +**Why it happens:** `POST /conversations/:id/stream` currently calls `svc.streamEcho()` which is a stub. Phase 33 replaces this with a real AI call. The memory injection must be added here, not in the stub. +**How to avoid:** During plan-phase, inspect the exact injection point in `server/src/routes/chat.ts` lines 107-137 and confirm the replacement strategy. (State.md blocker: "Chat route injection point needs codebase inspection — confirm correct hook location in `server/src/services/chat.ts` during plan-phase".) +**Warning signs:** Memory injected but responses are still the echo stub. + +--- + +## Code Examples + +Verified patterns from existing codebase: + +### File-Backed Service Read/Write +```typescript +// Source: server/src/services/nexus-settings.ts +async function get(): Promise { + const filePath = resolveNexusSettingsPath(); + try { + const raw = fs.readFileSync(filePath, "utf-8"); + const parsed = nexusSettingsSchema.safeParse(JSON.parse(raw)); + if (parsed.success) return parsed.data; + return { mode: "both" }; + } catch { + return { mode: "both" }; + } +} + +async function set(patch: Partial): Promise { + const current = await get(); + const merged = { ...current, ...patch }; + const validated = nexusSettingsSchema.parse(merged); + const filePath = resolveNexusSettingsPath(); + fs.mkdirSync(path.dirname(filePath), { recursive: true }); + fs.writeFileSync(filePath, JSON.stringify(validated, null, 2), "utf-8"); + return validated; +} +``` + +### Existing Redaction Patterns (extend for plain-text memory) +```typescript +// Source: server/src/redaction.ts +const SECRET_PAYLOAD_KEY_RE = + 
/(api[-_]?key|access[-_]?token|auth(?:_?token)?|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)/i; +const JWT_VALUE_RE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)?$/; +``` + +### Existing Handoff Route Pattern +```typescript +// Source: server/src/routes/chat.ts lines 163-203 +router.post("/conversations/:id/handoff", async (req, res) => { + assertBoard(req); + const data = handoffSchema.parse(req.body); + const conversation = await svc.getConversation(req.params.id!); + const companyId = conversation.companyId; + // 1. Insert handoff system message + // 2. Create issue from spec + // 3. Insert task_created system message + res.json({ handoffMessageId: handoffMsg.id, issues: [issue] }); +}); +``` + +### Existing SSE Streaming Pattern +```typescript +// Source: server/src/routes/chat.ts lines 88-137 +res.setHeader("Content-Type", "text/event-stream"); +res.setHeader("Cache-Control", "no-cache"); +res.setHeader("Connection", "keep-alive"); +res.setHeader("X-Accel-Buffering", "no"); +res.flushHeaders(); +res.write(":ok\n\n"); +// ... 
async generator loop +res.write(`data: ${JSON.stringify({ token })}\n\n`); +res.write(`data: ${JSON.stringify({ done: true, messageId, content })}\n\n`); +``` + +### PuterProxyService Chat Stream +```typescript +// Source: server/src/services/puter-proxy.ts +async function* chatStream( + companyId: string, + agentId: string | null | undefined, + messages: unknown[], + model: string | undefined, + signal: AbortSignal | undefined, +): AsyncGenerator +// Called with messages array in OpenAI format: [{ role: "system"|"user"|"assistant", content: string }] +``` + +--- + +## State of the Art + +| Old Approach | Current Approach | When Changed | Impact | +|--------------|------------------|--------------|--------| +| Vector DB for memory | Summary-based file-backed facts | Phase 33 (new) | No infra overhead; no pgvector needed | +| Echo stub in stream endpoint | Real AI via puterProxyService | Phase 33 (new) | Memory injection only makes sense once real AI is wired | + +**Deprecated/outdated:** +- `streamEcho()` in `chatService`: stub used for testing — Phase 33 replaces it with real AI calls in the stream route (or supplements it with a parallel assistant-stream endpoint). + +--- + +## Open Questions + +1. **Does the stream endpoint replace the echo stub or add a parallel endpoint?** + - What we know: `/conversations/:id/stream` currently calls `svc.streamEcho()`. The puter proxy has its own endpoint at `/puter-proxy/chat`. + - What's unclear: Whether to extend the existing stream endpoint to delegate to puterProxyService (if the conversation is in a personal-assistant company), or create a dedicated `/conversations/:id/assistant-stream`. + - Recommendation: Extend the existing endpoint — the UI already calls it via `chatApi.postMessageAndStream()`. The endpoint should check `nexusSettings.mode` and route to puterProxy if a token is available, else fall back to echo stub. This avoids a UI-side change. + +2. 
**What constitutes a "memory fact" — when is it written?** + - What we know: ASST-01 says "summary-based". Facts could be extracted from (a) every assistant turn, (b) end of conversation, or (c) on demand. + - What's unclear: v1.5 success criterion 1 says "A fact stated in one chat session… is referenced correctly by the assistant in a new session." This implies facts are persisted at the end of a session or after each user-confirmed turn. + - Recommendation: Append after each assistant response turn — simpler than session-end detection, and aligns with success criterion 1 (fact from one session visible in next). The `onDone` callback in `chatApi.postMessageAndStream` is the natural trigger; or do it server-side in the stream endpoint after saving the final message. + +3. **How does the personal assistant handoff interact with the existing brainstormer handoff?** + - What we know: The existing handoff (Phase 23) creates a project issue from a structured spec (`{ what, why, constraints, success }`). The assistant handoff (ASST-03) transfers conversation context to a PM agent. + - What's unclear: Whether these are the same button or different flows. + - Recommendation: Implement a separate `POST /conversations/:id/assistant-handoff` endpoint. It does not create an issue — it creates a new conversation with a context summary as a seeded system message. The UI button is "Turn this into a project" (distinct from the brainstormer's "Send to PM" which targets a structured spec). + +--- + +## Environment Availability + +> Phase is code/config-only. No new external services required. +> `puterProxyService` is already integrated (Phase 31). File I/O uses Node.js built-ins. 
+ +| Dependency | Required By | Available | Version | Fallback | +|------------|------------|-----------|---------|----------| +| Node.js `fs` module | Memory file storage | ✓ | built-in | — | +| `puterProxyService` | AI responses + optional summarisation | ✓ | internal | Fall back to echo stub if no token | +| `nexusSettingsService` | Mode check (ASST-04) | ✓ | internal | — | +| `secretService` | Token resolution in puterProxy | ✓ | internal | — | + +--- + +## Validation Architecture + +### Test Framework +| Property | Value | +|----------|-------| +| Framework | Vitest 3.x | +| Config file | `server/vitest.config.ts` | +| Quick run command | `pnpm --filter @paperclipai/server vitest run --reporter=verbose src/__tests__/33-*.test.ts` | +| Full suite command | `pnpm test:run` | + +### Phase Requirements → Test Map +| Req ID | Behavior | Test Type | Automated Command | File Exists? | +|--------|----------|-----------|-------------------|-------------| +| ASST-01 | Memory persists across sessions — new session includes previously stored fact in system prompt | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-memory.test.ts` | ❌ Wave 0 | +| ASST-02 | API key pasted into chat is NOT stored in memory file | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-memory-sanitization.test.ts` | ❌ Wave 0 | +| ASST-03 | Assistant handoff creates target conversation with context summary system message | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-handoff.test.ts` | ❌ Wave 0 | +| ASST-04 | PersonalAssistantPage visible when mode is `personal_ai` or `both`; hidden when `project_builder` | unit | `pnpm --filter @paperclipai/ui vitest run src/components/PersonalAssistantPage.test.tsx` | ❌ Wave 0 | + +### Sampling Rate +- **Per task commit:** `pnpm --filter @paperclipai/server vitest run src/__tests__/33-*.test.ts` +- **Per wave merge:** `pnpm test:run` +- **Phase gate:** Full suite green before 
`/gsd:verify-work` + +### Wave 0 Gaps +- [ ] `server/src/__tests__/33-assistant-memory.test.ts` — covers ASST-01 +- [ ] `server/src/__tests__/33-memory-sanitization.test.ts` — covers ASST-02 +- [ ] `server/src/__tests__/33-assistant-handoff.test.ts` — covers ASST-03 +- [ ] `ui/src/components/PersonalAssistantPage.test.tsx` — covers ASST-04 + +--- + +## Sources + +### Primary (HIGH confidence) +- `server/src/services/nexus-settings.ts` — file-backed JSON service pattern (confirmed by reading source) +- `server/src/services/puter-proxy.ts` — AI streaming service (confirmed by reading source) +- `server/src/routes/chat.ts` — SSE streaming pattern and handoff route (confirmed by reading source) +- `server/src/redaction.ts` — credential scrubbing patterns (confirmed by reading source) +- `packages/db/src/schema/chat_conversations.ts` — conversation schema (confirmed by reading source) +- `packages/db/src/schema/chat_messages.ts` — message schema with `messageType` column (confirmed by reading source) +- `.planning/REQUIREMENTS.md` — "DB schema changes: Out of Scope" (confirmed by reading source) + +### Secondary (MEDIUM confidence) +- `.planning/ROADMAP.md` Phase 33 success criteria — defines exact behavior for ASST-01/ASST-02/ASST-03/ASST-04 +- `.planning/STATE.md` Accumulated Context decisions — confirms "No DB schema changes", "Memory sanitization blocklist applied at write time" + +### Tertiary (LOW confidence) +- None — all findings verified from source files. + +--- + +## Metadata + +**Confidence breakdown:** +- Standard stack: HIGH — verified by reading existing service files +- Architecture: HIGH — follows established patterns in codebase (nexusSettingsService, redaction.ts, handoff route) +- Pitfalls: HIGH — derived from reading the actual stream endpoint code and constraint docs + +**Research date:** 2026-04-01 +**Valid until:** 2026-05-01 (stable codebase; these patterns won't change without major refactors)
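
Returning to Pitfall 4, the two caps (50 facts, 2000-character prefix) can be sketched as below. This is an illustrative sketch only: `appendFact` and `buildPrefix` are hypothetical helper names, and dropping the oldest facts first when the character budget is tight is an assumption, since the research only specifies the limits themselves.

```typescript
// Limits taken from Pitfall 4; the helper names are hypothetical.
const MAX_FACTS = 50;
const MAX_PREFIX_CHARS = 2000;

// FIFO cap: append the new fact and drop the oldest entries beyond MAX_FACTS.
function appendFact(facts: string[], fact: string): string[] {
  const next = [...facts, fact];
  return next.length > MAX_FACTS ? next.slice(next.length - MAX_FACTS) : next;
}

// Builds the injected prefix without exceeding MAX_PREFIX_CHARS.
// Assumption: when space runs out, the newest facts are kept.
function buildPrefix(facts: string[]): string {
  const header = "[Memory from previous sessions]";
  const lines: string[] = [];
  let length = header.length;
  for (let i = facts.length - 1; i >= 0; i--) {
    const line = `- ${facts[i]}`;
    // +1 accounts for the newline that joins this line to the previous one.
    if (length + 1 + line.length > MAX_PREFIX_CHARS) break;
    lines.unshift(line);
    length += 1 + line.length;
  }
  return lines.length > 0 ? [header, ...lines].join("\n") : "";
}
```

In this sketch, each fact would pass through `sanitizeMemoryFact` before it reaches `appendFact`, so the cap logic never sees unsanitized text.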