nexus/.planning/phases/33-persistent-memory/33-RESEARCH.md

Phase 33: Persistent Memory + Personal Assistant Mode — Research

Researched: 2026-04-01
Domain: File-backed memory service, prompt injection, assistant handoff to PM agent
Confidence: HIGH


<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.

Claude's Discretion

All implementation choices.

Deferred Ideas (OUT OF SCOPE)

None (discuss phase skipped).

</user_constraints>


<phase_requirements>

Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| ASST-01 | User has persistent memory across chat sessions (summary-based, injected into system prompts) | File-backed memory service at data/assistant-memory/<companyId>.json; inject via new /conversations/:id/assistant-stream endpoint or by extending existing /conversations/:id/stream |
| ASST-02 | Memory content sanitized at write time to prevent prompt injection | server/src/redaction.ts has SECRET_PAYLOAD_KEY_RE + JWT_VALUE_RE; extend with a plain-text credential blocklist pattern applied at memorySummary write time |
| ASST-03 | User can hand off an assistant conversation to a PM agent with one click, transferring context | Existing POST /conversations/:id/handoff route creates an issue from a spec. Phase 33 needs a parallel route (or a mode on the existing route) that creates a PM agent conversation with a system message containing a conversation summary |
| ASST-04 | Assistant and Project Builder modes work standalone or together | nexusSettingsService persists mode as "personal_ai" … |
</phase_requirements>

Summary

Phase 33 adds three capabilities to Nexus: (1) a file-backed memory layer that accumulates facts across sessions and injects them as a system prompt prefix before every assistant response, (2) credential scrubbing that prevents API keys and tokens from being stored in memory, and (3) a one-click "hand off to PM" flow that creates a new conversation pre-seeded with a summary of the assistant exchange.

No new DB tables are permitted (milestone constraint). All state lives either in data/assistant-memory/<companyId>.json (following the nexus-settings.json file-backed pattern) or in existing JSONB columns. The mode setting already persists correctly in data/nexus-settings.json via nexusSettingsService and is accessible via GET /nexus/settings. The UI only needs to read this mode to gate the PersonalAssistantPage route.

The streaming endpoint at POST /conversations/:id/stream currently responds with the echo service. Phase 33 must replace this with a real AI call (most likely through puterProxyService) and add memory injection into the message array sent to the model. Memory is accumulated after each completed assistant turn by calling a new assistantMemoryService.append() which sanitizes the assistant text before persisting.

Primary recommendation: Build an assistantMemoryService modelled on nexusSettingsService (file-backed JSON, no DB changes), extend the /conversations/:id/stream endpoint to prepend memory as a system message, and add a /conversations/:id/assistant-handoff route that creates a PM-linked conversation with a seeded system message.


Standard Stack

Core

| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| node:fs (sync) | Node built-in | Read/write assistant-memory/<companyId>.json | Same pattern as nexus-settings.json; no extra deps |
| zod | already in codebase | Schema validation for memory JSON on read | All service layer uses zod |
| drizzle-orm | already in codebase | Querying chatMessages for summary extraction | Already used in chatService |
| Express Router | already in codebase | New memory CRUD routes | All server routes use Express |

Supporting

| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| puterProxyService | internal | Real AI completions for memory summarization | Use for the summarize-conversation step; same service already used for chat |

Alternatives Considered

| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| File-backed JSON memory | instance_settings.general JSONB | JSONB is per-instance, not per-company; file-backed allows per-company isolation and matches REQUIREMENTS.md out-of-scope note |
| File-backed JSON memory | Vector DB (Mem0, Chroma) | Explicitly out of scope in REQUIREMENTS.md: "Vector database for memory: Summary-based approach sufficient; no infra overhead" |
| File-backed JSON memory | chat_conversations extra column | Would require a DB schema migration, which the milestone constraint prohibits |

Installation: No new packages needed.


Architecture Patterns

server/src/services/
├── assistant-memory.ts     # new — read/write data/assistant-memory/<companyId>.json
server/src/routes/
├── assistant-memory.ts     # new — GET/PATCH memory endpoints
ui/src/
├── pages/PersonalAssistantPage.tsx  # new — gated on mode !== "project_builder"
├── api/assistantMemory.ts           # new — API client for memory endpoints

Pattern 1: File-Backed Service (follows nexusSettingsService)

What: Read a JSON file from resolvePaperclipInstanceRoot()/data/assistant-memory/<companyId>.json, validate with zod, write back after updates.

When to use: All memory reads and writes.

Example:

// Source: server/src/services/nexus-settings.ts (existing pattern)
import fs from "node:fs";
import path from "node:path";
import { z } from "zod";
import { resolvePaperclipInstanceRoot } from "../home-paths.js";

const assistantMemorySchema = z.object({
  facts: z.array(z.string()).default([]),
  updatedAt: z.string().datetime().optional(),
});

type AssistantMemory = z.infer<typeof assistantMemorySchema>;

function resolveMemoryPath(companyId: string): string {
  return path.resolve(
    resolvePaperclipInstanceRoot(),
    "data",
    "assistant-memory",
    `${companyId}.json`,
  );
}

Pattern 2: Memory Injection via System Message Prefix

What: Before streaming a response, prepend a system message containing the memory facts to the messages array passed to the model.

When to use: Every /conversations/:id/stream call when the conversation's company is in personal_ai or both mode.

Example:

// Inject pattern — server/src/routes/chat.ts stream endpoint extension
const memory = await assistantMemoryService.get(companyId);
const systemPrefix = memory.facts.length > 0
  ? `[Memory from previous sessions]\n${memory.facts.map(f => `- ${f}`).join("\n")}\n\n`
  : "";
const messagesWithMemory = systemPrefix
  ? [{ role: "system", content: systemPrefix }, ...conversationMessages]
  : conversationMessages;

Pattern 3: Write-Time Sanitization

What: Before appending any fact to memory, run it through a credential-scrubbing function.

When to use: Every call to assistantMemoryService.append().

Example:

// Source: server/src/redaction.ts (existing pattern extended for plain text)
const CREDENTIAL_INLINE_RE = /\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
const SENSITIVE_KEY_VALUE_RE = /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;

function sanitizeMemoryFact(raw: string): string {
  return raw
    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
}
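To see what the two patterns do and do not catch, the sketch below runs the sanitizer on a few invented inputs. The regexes are copied from the example; the sample strings, including the key shape with a hyphenated prefix, are assumptions made up for illustration and not real credentials.

```typescript
// Regexes as in the Pattern 3 example; all sample inputs are invented.
const CREDENTIAL_INLINE_RE =
  /\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
const SENSITIVE_KEY_VALUE_RE =
  /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;

function sanitizeMemoryFact(raw: string): string {
  return raw
    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
}

const fakeKey = "sk-" + "a".repeat(24);
const samples = [
  "User prefers TypeScript over Python", // no credential: left unchanged
  `My OpenAI key is ${fakeKey}`,          // inline key: replaced
  "password: hunter2 is what I use",      // key/value pair: replaced
];
const cleaned = samples.map(sanitizeMemoryFact);

// A hypothetical key shape with a hyphen inside the prefix is NOT matched by
// the first pattern, because the character class after "sk-" excludes "-".
const hyphenated = "sk-proj-" + "a".repeat(24);
const hyphenatedSurvives = sanitizeMemoryFact(hyphenated) === hyphenated;
```

If key formats like the last sample matter in practice, the blocklist would need a wider character class or an additional alternative; that decision is left to the plan phase.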

Pattern 4: PM Agent Handoff with Conversation Summary

What: POST /conversations/:id/assistant-handoff — creates a new conversation pre-seeded with a system message summarising the current exchange, then navigates the user to that new conversation.

When to use: When user clicks "Turn this into a project" in PersonalAssistantPage.

The existing POST /conversations/:id/handoff creates an issue from a structured spec. The new assistant handoff is different — it creates a conversation context transfer, not necessarily an issue. It should:

  1. Fetch the last N messages from the current conversation.
  2. Produce a brief text summary (simple first-pass: concatenate user messages up to a token budget; optional: use AI summarisation via puterProxyService if token available).
  3. Create a new conversation (or find the PM agent's default conversation) and insert a messageType: "handoff_context" system message.
  4. Return { targetConversationId } so the client can navigate.
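Step 2 above (the simple first pass) could look roughly like the following sketch. The `ChatMessage` shape, the function name `buildHandoffSummary`, the header text, and the default limits are all hypothetical; characters stand in for tokens as a rough budget.

```typescript
// Hypothetical message shape; the real chatMessages schema may differ.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Concatenate recent user messages (oldest first in the output) until the
// character budget is used up. Newest messages are kept when space is tight.
function buildHandoffSummary(
  messages: ChatMessage[],
  lastN = 20,
  maxChars = 1200,
): string {
  // slice(-0) would return the whole array, so guard the zero case.
  const recent = lastN > 0 ? messages.slice(-lastN) : [];
  const recentUser = recent.filter((m) => m.role === "user");
  const picked: string[] = [];
  let used = 0;
  for (const m of [...recentUser].reverse()) {
    const line = `- ${m.content.trim()}`;
    if (used + line.length + 1 > maxChars) break; // +1 for the newline
    picked.unshift(line);
    used += line.length + 1;
  }
  return picked.length > 0
    ? `Summary of assistant conversation (user requests):\n${picked.join("\n")}`
    : "";
}
```

The returned string would become the content of the handoff_context system message in step 3; an empty string signals that there is nothing worth transferring.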

Anti-Patterns to Avoid

  • Sanitizing memory at retrieval time (not write time): ASST-02 requires sanitization at write time. Sanitizing only on read means raw credentials are stored to disk.
  • Storing the full conversation as memory: Memory should be a summary — not raw message content. Full messages can include credentials, PII, or prompt-injection payloads.
  • Putting memory in instance_settings.general: That field is per-instance not per-company. Multiple companies (workspaces) need separate memory namespaces.
  • Reading memory synchronously in the hot path of the SSE stream without caching: The file read is fast (< 1ms for small JSON) but should be done once before writing SSE headers, not inside the token loop.

Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Credential scrubbing regex | Custom patterns from scratch | Extend server/src/redaction.ts with plain-text patterns | Existing patterns cover api_key, access_token, jwt, etc.; reusing them reduces divergence |
| Secrets storage | New encrypted store | secretService already handles encrypted storage; memory does not store secrets, only sanitized facts | Credential data must never reach memory layer at all |
| AI summarisation service | New inference client | puterProxyService for optional summarisation (same endpoint, same cost tracking) | Avoids a second AI client implementation |

Key insight: The memory layer is deliberately simple — a list of plain-text facts. The complexity is in the sanitization gate, not the storage.


Runtime State Inventory

SKIPPED — this is a greenfield phase, not a rename/refactor/migration.


Common Pitfalls

Pitfall 1: Mode Check Location

What goes wrong: UI renders PersonalAssistantPage for all users regardless of mode, or the stream endpoint injects memory for project-builder-only users.
Why it happens: nexusSettings.mode is read from a file, not from the DB, so it is easy to forget to check it server-side.
How to avoid: The stream endpoint must call nexusSettingsService().get() and skip memory injection when mode === "project_builder".
Warning signs: Memory facts appearing in project builder conversations.
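A minimal sketch of the server-side gate, assuming the three mode values used in this document. The names `NexusMode`, `shouldInjectMemory`, and `withMemory` are hypothetical helpers, not existing code.

```typescript
// Mode values as used in this document; the real type lives with
// nexusSettingsService and may be named differently.
type NexusMode = "personal_ai" | "project_builder" | "both";

// Memory is injected in every mode except project_builder.
function shouldInjectMemory(mode: NexusMode): boolean {
  return mode !== "project_builder";
}

// Prepend the memory message only when the mode allows it and there is one.
function withMemory<T>(mode: NexusMode, messages: T[], memoryMessage: T | null): T[] {
  if (!shouldInjectMemory(mode) || memoryMessage === null) return messages;
  return [memoryMessage, ...messages];
}
```

Keeping the check in one small function makes it easy to cover all three modes in the ASST-04 tests and to reuse the same predicate for the UI route gate.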

Pitfall 2: companyId Scoping in Memory File Path

What goes wrong: All companies share a single memory file.
Why it happens: Forgetting companyId in the path, e.g. data/assistant-memory.json instead of data/assistant-memory/<companyId>.json.
How to avoid: The path resolver function must include companyId as the file name inside the assistant-memory directory (as shown in Pattern 1 above).
Warning signs: Facts from workspace A appearing in workspace B.
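One way to make the per-company scoping explicit is to derive the file name through a single helper that also rejects IDs that could not be a plain file name. This is a sketch under the assumption that company IDs are UUID-like or slug-like; `SAFE_COMPANY_ID_RE` and `memoryFileName` are hypothetical names, and the allowed character set should be adjusted to the real ID format.

```typescript
// Assumption: company IDs contain only letters, digits, "_" and "-".
const SAFE_COMPANY_ID_RE = /^[A-Za-z0-9_-]{1,64}$/;

// Returns the per-company file name, or throws for IDs containing path
// separators, dots, or other characters outside the allowed set.
function memoryFileName(companyId: string): string {
  if (!SAFE_COMPANY_ID_RE.test(companyId)) {
    throw new Error(
      `Refusing to build a memory path from companyId: ${JSON.stringify(companyId)}`,
    );
  }
  return `${companyId}.json`;
}
```

The result would be passed as the last segment to the path.resolve call in Pattern 1, so two companies can never resolve to the same file.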

Pitfall 3: SSE Stream Already Flushed When Memory Read Fails

What goes wrong: A memory read error causes an unhandled exception after SSE headers are flushed, leaving the client connection open but broken.
Why it happens: The existing stream endpoint flushes headers at line 101 (res.flushHeaders()) before any async logic.
How to avoid: Read memory before calling res.flushHeaders(). If the memory read fails, fall back gracefully (empty memory, log warning) and never throw after flush.
Warning signs: Client sees :ok event but then nothing further.
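The graceful fallback can be sketched as a small wrapper that is called before the headers are flushed. The `AssistantMemory` shape here is reduced to the facts array, and `readMemoryOrEmpty` is a hypothetical helper name; the `read` callback stands in for whatever the memory service exposes.

```typescript
// Reduced memory shape for illustration; the real schema also has updatedAt.
interface AssistantMemory {
  facts: string[];
}

// Run the reader and fall back to empty memory on any error, so that a
// missing or corrupt file never breaks the stream that follows.
function readMemoryOrEmpty(
  read: () => AssistantMemory,
  warn: (msg: string) => void = console.warn,
): AssistantMemory {
  try {
    return read();
  } catch (err) {
    warn(`assistant memory unavailable, continuing without it: ${String(err)}`);
    return { facts: [] };
  }
}

// Intended call order in the handler (sketch):
//   const memory = readMemoryOrEmpty(() => loadMemoryFor(companyId));
//   res.flushHeaders();   // only after memory is resolved
```

`loadMemoryFor` in the comment is a placeholder for the real read function.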

Pitfall 4: Memory Growing Without Bound

What goes wrong: The facts array accumulates thousands of entries; the injected system prompt exceeds the model's context window.
Why it happens: No fact eviction or cap.
How to avoid: Cap at 50 facts (FIFO: drop the oldest when the limit is reached). Cap the injected system prefix at 2000 characters max.
Warning signs: Streaming responses truncated; model errors about context length.
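Both caps can be expressed as pure functions, which keeps them easy to unit-test. The sketch below uses the limits named above (50 facts, 2000 characters); the function names are hypothetical, and the choice to keep the newest facts when the character budget is tight is an assumption, not something the requirements specify.

```typescript
const MAX_FACTS = 50;
const MAX_PREFIX_CHARS = 2000;

// Append a fact and drop the oldest entries once the cap is exceeded (FIFO).
function appendFactCapped(facts: string[], fact: string, maxFacts = MAX_FACTS): string[] {
  const next = [...facts, fact];
  return next.length > maxFacts ? next.slice(next.length - maxFacts) : next;
}

// Build the system prefix, keeping as many of the newest facts as fit in
// the character budget. Returns "" when no fact fits.
function buildMemoryPrefix(facts: string[], maxChars = MAX_PREFIX_CHARS): string {
  const header = "[Memory from previous sessions]\n";
  const lines: string[] = [];
  let used = header.length;
  for (const fact of [...facts].reverse()) {
    const line = `- ${fact}\n`;
    if (used + line.length > maxChars) break;
    lines.unshift(line); // restore oldest-first order in the output
    used += line.length;
  }
  return lines.length > 0 ? header + lines.join("") : "";
}
```

With these two helpers, the facts array on disk never exceeds the fact cap and the injected prefix never exceeds the character cap, regardless of how long individual facts are.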

Pitfall 5: Handoff Creates Duplicate PM Conversations

What goes wrong: Every "Turn this into a project" click creates a new PM conversation, leading to many orphaned conversations.
Why it happens: The handoff endpoint creates a new conversation on every call.
How to avoid: Use the PM agent's existing most-recent-conversation or create a fresh one per handoff (acceptable for v1.5); document this as a known v1.5 limitation. The route should return targetConversationId so the UI can detect if a new one was created.
Warning signs: Conversation list growing rapidly.

Pitfall 6: Chat Route Injection Point

What goes wrong: Memory injection is added to the wrong handler: the echo streamEcho helper instead of the real AI call.
Why it happens: POST /conversations/:id/stream currently calls svc.streamEcho(), which is a stub. Phase 33 replaces this with a real AI call. The memory injection must be added to the real call, not to the stub.
How to avoid: During plan-phase, inspect the exact injection point in server/src/routes/chat.ts lines 107-137 and confirm the replacement strategy. (STATE.md blocker: "Chat route injection point needs codebase inspection — confirm correct hook location in server/src/services/chat.ts during plan-phase".)
Warning signs: Memory injected but responses are still the echo stub.


Code Examples

Verified patterns from existing codebase:

File-Backed Service Read/Write

// Source: server/src/services/nexus-settings.ts
async function get(): Promise<NexusSettings> {
  const filePath = resolveNexusSettingsPath();
  try {
    const raw = fs.readFileSync(filePath, "utf-8");
    const parsed = nexusSettingsSchema.safeParse(JSON.parse(raw));
    if (parsed.success) return parsed.data;
    return { mode: "both" };
  } catch {
    return { mode: "both" };
  }
}

async function set(patch: Partial<NexusSettings>): Promise<NexusSettings> {
  const current = await get();
  const merged = { ...current, ...patch };
  const validated = nexusSettingsSchema.parse(merged);
  const filePath = resolveNexusSettingsPath();
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(validated, null, 2), "utf-8");
  return validated;
}

Existing Redaction Patterns (extend for plain-text memory)

// Source: server/src/redaction.ts
const SECRET_PAYLOAD_KEY_RE =
  /(api[-_]?key|access[-_]?token|auth(?:_?token)?|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)/i;
const JWT_VALUE_RE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)?$/;

Existing Handoff Route Pattern

// Source: server/src/routes/chat.ts lines 163-203
router.post("/conversations/:id/handoff", async (req, res) => {
  assertBoard(req);
  const data = handoffSchema.parse(req.body);
  const conversation = await svc.getConversation(req.params.id!);
  const companyId = conversation.companyId;
  // 1. Insert handoff system message
  // 2. Create issue from spec
  // 3. Insert task_created system message
  res.json({ handoffMessageId: handoffMsg.id, issues: [issue] });
});

Existing SSE Streaming Pattern

// Source: server/src/routes/chat.ts lines 88-137
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
res.setHeader("X-Accel-Buffering", "no");
res.flushHeaders();
res.write(":ok\n\n");
// ... async generator loop
res.write(`data: ${JSON.stringify({ token })}\n\n`);
res.write(`data: ${JSON.stringify({ done: true, messageId, content })}\n\n`);

PuterProxyService Chat Stream

// Source: server/src/services/puter-proxy.ts
async function* chatStream(
  companyId: string,
  agentId: string | null | undefined,
  messages: unknown[],
  model: string | undefined,
  signal: AbortSignal | undefined,
): AsyncGenerator<string>
// Called with messages array in OpenAI format: [{ role: "system"|"user"|"assistant", content: string }]

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Vector DB for memory | Summary-based file-backed facts | Phase 33 (new) | No infra overhead; no pgvector needed |
| Echo stub in stream endpoint | Real AI via puterProxyService | Phase 33 (new) | Memory injection only makes sense once real AI is wired |

Deprecated/outdated:

  • streamEcho() in chatService: stub used for testing — Phase 33 replaces it with real AI calls in the stream route (or supplements it with a parallel assistant-stream endpoint).

Open Questions

  1. Does the stream endpoint replace the echo stub or add a parallel endpoint?

    • What we know: /conversations/:id/stream currently calls svc.streamEcho(). The puter proxy has its own endpoint at /puter-proxy/chat.
    • What's unclear: Whether to extend the existing stream endpoint to delegate to puterProxyService (if the conversation is in a personal-assistant company), or create a dedicated /conversations/:id/assistant-stream.
    • Recommendation: Extend the existing endpoint — the UI already calls it via chatApi.postMessageAndStream(). The endpoint should check nexusSettings.mode and route to puterProxy if a token is available, else fall back to echo stub. This avoids a UI-side change.
  2. What constitutes a "memory fact" — when is it written?

    • What we know: ASST-01 says "summary-based". Facts could be extracted from (a) every assistant turn, (b) end of conversation, or (c) on demand.
    • What's unclear: v1.5 success criterion 1 says "A fact stated in one chat session… is referenced correctly by the assistant in a new session." This implies facts are persisted at the end of a session or after each user-confirmed turn.
    • Recommendation: Append after each assistant response turn — simpler than session-end detection, and aligns with success criterion 1 (fact from one session visible in next). The onDone callback in chatApi.postMessageAndStream is the natural trigger; or do it server-side in the stream endpoint after saving the final message.
  3. How does the personal assistant handoff interact with the existing brainstormer handoff?

    • What we know: The existing handoff (Phase 23) creates a project issue from a structured spec ({ what, why, constraints, success }). The assistant handoff (ASST-03) transfers conversation context to a PM agent.
    • What's unclear: Whether these are the same button or different flows.
    • Recommendation: Implement a separate POST /conversations/:id/assistant-handoff endpoint. It does not create an issue — it creates a new conversation with a context summary as a seeded system message. The UI button is "Turn this into a project" (distinct from the brainstormer's "Send to PM" which targets a structured spec).

Environment Availability

Phase is code/config-only. No new external services required. puterProxyService is already integrated (Phase 31). File I/O uses Node.js built-ins.

| Dependency | Required By | Available | Version | Fallback |
|------------|-------------|-----------|---------|----------|
| Node.js fs module | Memory file storage | | built-in | |
| puterProxyService | AI responses + optional summarisation | | internal | Fall back to echo stub if no token |
| nexusSettingsService | Mode check (ASST-04) | | internal | |
| secretService | Token resolution in puterProxy | | internal | |

Validation Architecture

Test Framework

| Property | Value |
|----------|-------|
| Framework | Vitest 3.x |
| Config file | server/vitest.config.ts |
| Quick run command | pnpm --filter @paperclipai/server vitest run --reporter=verbose src/__tests__/33-*.test.ts |
| Full suite command | pnpm test:run |

Phase Requirements → Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|--------------|
| ASST-01 | Memory persists across sessions: new session includes previously stored fact in system prompt | unit | pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-memory.test.ts | Wave 0 |
| ASST-02 | API key pasted into chat is NOT stored in memory file | unit | pnpm --filter @paperclipai/server vitest run src/__tests__/33-memory-sanitization.test.ts | Wave 0 |
| ASST-03 | Assistant handoff creates target conversation with context summary system message | unit | pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-handoff.test.ts | Wave 0 |
| ASST-04 | PersonalAssistantPage visible when mode is personal_ai or both; hidden when project_builder | unit | pnpm --filter @paperclipai/ui vitest run src/components/PersonalAssistantPage.test.tsx | Wave 0 |

Sampling Rate

  • Per task commit: pnpm --filter @paperclipai/server vitest run src/__tests__/33-*.test.ts
  • Per wave merge: pnpm test:run
  • Phase gate: Full suite green before /gsd:verify-work

Wave 0 Gaps

  • server/src/__tests__/33-assistant-memory.test.ts — covers ASST-01
  • server/src/__tests__/33-memory-sanitization.test.ts — covers ASST-02
  • server/src/__tests__/33-assistant-handoff.test.ts — covers ASST-03
  • ui/src/components/PersonalAssistantPage.test.tsx — covers ASST-04

Sources

Primary (HIGH confidence)

  • server/src/services/nexus-settings.ts — file-backed JSON service pattern (confirmed by reading source)
  • server/src/services/puter-proxy.ts — AI streaming service (confirmed by reading source)
  • server/src/routes/chat.ts — SSE streaming pattern and handoff route (confirmed by reading source)
  • server/src/redaction.ts — credential scrubbing patterns (confirmed by reading source)
  • packages/db/src/schema/chat_conversations.ts — conversation schema (confirmed by reading source)
  • packages/db/src/schema/chat_messages.ts — message schema with messageType column (confirmed by reading source)
  • .planning/REQUIREMENTS.md — "DB schema changes: Out of Scope" (confirmed by reading source)

Secondary (MEDIUM confidence)

  • .planning/ROADMAP.md Phase 33 success criteria — defines exact behavior for ASST-01/ASST-02/ASST-03/ASST-04
  • .planning/STATE.md Accumulated Context decisions — confirms "No DB schema changes", "Memory sanitization blocklist applied at write time"

Tertiary (LOW confidence)

  • None — all findings verified from source files.

Metadata

Confidence breakdown:

  • Standard stack: HIGH — verified by reading existing service files
  • Architecture: HIGH — follows established patterns in codebase (nexusSettingsService, redaction.ts, handoff route)
  • Pitfalls: HIGH — derived from reading the actual stream endpoint code and constraint docs

Research date: 2026-04-01
Valid until: 2026-05-01 (stable codebase; these patterns won't change without major refactors)