Nexus Dev 2efed8797e docs(33): research phase persistent memory domain

2026-04-04 03:55:49 +00:00

Phase 33: Persistent Memory + Personal Assistant Mode — Research

Researched: 2026-04-01
Domain: File-backed memory service, prompt injection, assistant handoff to PM agent
Confidence: HIGH

<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.

Claude's Discretion

All implementation choices.

Deferred Ideas (OUT OF SCOPE)

None — discuss phase skipped. </user_constraints>

<phase_requirements>

Phase Requirements

ID	Description	Research Support
ASST-01	User has persistent memory across chat sessions (summary-based, injected into system prompts)	File-backed memory service at `data/assistant-memory/<companyId>.json`; inject via new `/conversations/:id/assistant-stream` endpoint or by extending existing `/conversations/:id/stream`
ASST-02	Memory content sanitized at write time to prevent prompt injection	`server/src/redaction.ts` has `SECRET_PAYLOAD_KEY_RE` + `JWT_VALUE_RE`; extend with a plain-text credential blocklist pattern applied at `memorySummary` write time
ASST-03	User can hand off an assistant conversation to a PM agent with one click, transferring context	Existing `POST /conversations/:id/handoff` route creates an issue from a spec. Phase 33 needs a parallel route — or mode on the existing route — that creates a PM agent conversation with a system message containing a conversation summary
ASST-04	Assistant and Project Builder modes work standalone or together	`nexusSettingsService` persists `mode` as `"personal_ai"`, `"project_builder"`, or `"both"`
</phase_requirements>

Summary

Phase 33 adds three capabilities to Nexus: (1) a file-backed memory layer that accumulates facts across sessions and injects them as a system prompt prefix before every assistant response, (2) credential scrubbing that prevents API keys and tokens from being stored in memory, and (3) a one-click "hand off to PM" flow that creates a new conversation pre-seeded with a summary of the assistant exchange.

No new DB tables are permitted (milestone constraint). All state lives either in data/assistant-memory/<companyId>.json (following the nexus-settings.json file-backed pattern) or in existing JSONB columns. The mode setting already persists correctly in data/nexus-settings.json via nexusSettingsService and is accessible via GET /nexus/settings. The UI only needs to read this mode to gate the PersonalAssistantPage route.

The streaming endpoint at POST /conversations/:id/stream currently responds with the echo service. Phase 33 must replace this with a real AI call (most likely through puterProxyService) and add memory injection into the message array sent to the model. Memory is accumulated after each completed assistant turn by calling a new assistantMemoryService.append() which sanitizes the assistant text before persisting.

Primary recommendation: Build an assistantMemoryService modelled on nexusSettingsService (file-backed JSON, no DB changes), extend the /conversations/:id/stream endpoint to prepend memory as a system message, and add a /conversations/:id/assistant-handoff route that creates a PM-linked conversation with a seeded system message.

Standard Stack

Core

Library	Version	Purpose	Why Standard
`node:fs` (sync)	Node built-in	Read/write `assistant-memory/<companyId>.json`	Same pattern as `nexus-settings.json` — no extra deps
`zod`	already in codebase	Schema validation for memory JSON on read	All service layer uses zod
`drizzle-orm`	already in codebase	Querying `chatMessages` for summary extraction	Already used in `chatService`
Express Router	already in codebase	New memory CRUD routes	All server routes use Express

Supporting

Library	Version	Purpose	When to Use
`puterProxyService`	internal	Real AI completions for memory summarization	Use for the summarize-conversation step; same service already used for chat

Alternatives Considered

Instead of	Could Use	Tradeoff
File-backed JSON memory	`instance_settings.general` JSONB	JSONB is per-instance, not per-company; file-backed allows per-company isolation and matches REQUIREMENTS.md out-of-scope note
File-backed JSON memory	Vector DB (Mem0, Chroma)	Explicitly out of scope in REQUIREMENTS.md — "Vector database for memory: Summary-based approach sufficient; no infra overhead"
File-backed JSON memory	`chat_conversations` extra column	Would require a DB schema migration — prohibited by milestone constraint

Installation: No new packages needed.

Architecture Patterns

Recommended Project Structure

server/src/services/
├── assistant-memory.ts     # new — read/write data/assistant-memory/<companyId>.json
server/src/routes/
├── assistant-memory.ts     # new — GET/PATCH memory endpoints
ui/src/
├── pages/PersonalAssistantPage.tsx  # new — gated on mode !== "project_builder"
├── api/assistantMemory.ts           # new — API client for memory endpoints

Pattern 1: File-Backed Service (follows nexusSettingsService)

What: Read a JSON file from resolvePaperclipInstanceRoot()/data/assistant-memory/<companyId>.json, validate with zod, write back after updates.

When to use: All memory reads and writes.

Example:

// Source: server/src/services/nexus-settings.ts (existing pattern)
import fs from "node:fs";
import path from "node:path";
import { z } from "zod";
import { resolvePaperclipInstanceRoot } from "../home-paths.js";

const assistantMemorySchema = z.object({
  facts: z.array(z.string()).default([]),
  updatedAt: z.string().datetime().optional(),
});

type AssistantMemory = z.infer<typeof assistantMemorySchema>;

function resolveMemoryPath(companyId: string): string {
  return path.resolve(
    resolvePaperclipInstanceRoot(),
    "data",
    "assistant-memory",
    `${companyId}.json`,
  );
}
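A minimal sketch of how the read and append halves of this pattern could look. The factory name `createAssistantMemoryService` and its `rootDir` parameter are hypothetical (the parameter stands in for `resolvePaperclipInstanceRoot()`), and validation is hand-rolled here only to keep the sketch dependency-free; the real service would presumably validate with the zod schema above and sanitize each fact before writing (Pattern 3).

```typescript
import fs from "node:fs";
import os from "node:os";
import path from "node:path";

interface AssistantMemory {
  facts: string[];
  updatedAt?: string;
}

// Hypothetical factory: rootDir stands in for resolvePaperclipInstanceRoot().
function createAssistantMemoryService(rootDir: string) {
  function resolveMemoryPath(companyId: string): string {
    return path.resolve(rootDir, "data", "assistant-memory", `${companyId}.json`);
  }

  // Returns empty memory when the file is missing, unreadable, or malformed.
  function get(companyId: string): AssistantMemory {
    try {
      const parsed: unknown = JSON.parse(
        fs.readFileSync(resolveMemoryPath(companyId), "utf-8"),
      );
      if (typeof parsed !== "object" || parsed === null) return { facts: [] };
      const { facts, updatedAt } = parsed as { facts?: unknown; updatedAt?: unknown };
      if (!Array.isArray(facts) || !facts.every((f) => typeof f === "string")) {
        return { facts: [] };
      }
      return {
        facts: facts as string[],
        updatedAt: typeof updatedAt === "string" ? updatedAt : undefined,
      };
    } catch {
      return { facts: [] };
    }
  }

  // Appends one fact and rewrites the whole file (no sanitization or cap in this sketch).
  function append(companyId: string, fact: string): AssistantMemory {
    const next: AssistantMemory = {
      facts: [...get(companyId).facts, fact],
      updatedAt: new Date().toISOString(),
    };
    const filePath = resolveMemoryPath(companyId);
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    fs.writeFileSync(filePath, JSON.stringify(next, null, 2), "utf-8");
    return next;
  }

  return { get, append };
}

// Usage against a throwaway directory: each company gets its own file.
const demoRoot = fs.mkdtempSync(path.join(os.tmpdir(), "assistant-memory-"));
const memoryService = createAssistantMemoryService(demoRoot);
memoryService.append("company-a", "Prefers weekly summaries");
console.log(memoryService.get("company-a").facts.length); // 1
console.log(memoryService.get("company-b").facts.length); // 0
```

Falling back to empty memory on any read or parse failure mirrors how the existing settings service falls back to a default value.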

Pattern 2: Memory Injection via System Message Prefix

What: Before streaming a response, prepend a system message containing the memory facts to the messages array passed to the model.

When to use: Every /conversations/:id/stream call when the conversation's company is in personal_ai or both mode.

Example:

// Inject pattern — server/src/routes/chat.ts stream endpoint extension
const memory = await assistantMemoryService.get(companyId);
const systemPrefix = memory.facts.length > 0
  ? `[Memory from previous sessions]\n${memory.facts.map(f => `- ${f}`).join("\n")}\n\n`
  : "";
const messagesWithMemory = systemPrefix
  ? [{ role: "system", content: systemPrefix }, ...conversationMessages]
  : conversationMessages;

Pattern 3: Write-Time Sanitization

What: Before appending any fact to memory, run it through a credential-scrubbing function.

When to use: Every call to assistantMemoryService.append().

Example:

// Source: server/src/redaction.ts (existing pattern extended for plain text)
const CREDENTIAL_INLINE_RE = /\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
const SENSITIVE_KEY_VALUE_RE = /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;

function sanitizeMemoryFact(raw: string): string {
  return raw
    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
}
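To illustrate what these two patterns do and do not catch, here is a small self-contained run of the sanitizer above on sample inputs. The sample strings are invented for illustration, and the fake key is built with `repeat` so no key-shaped literal appears in the source. The patterns are heuristics, so they should be treated as a first line of defense rather than a guarantee.

```typescript
// Same patterns and function as in Pattern 3, repeated so the example runs on its own.
const CREDENTIAL_INLINE_RE =
  /\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
const SENSITIVE_KEY_VALUE_RE =
  /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;

function sanitizeMemoryFact(raw: string): string {
  return raw
    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
}

// Invented sample value: "sk-" followed by 24 letters, matching the first alternative.
const fakeKey = "sk-" + "a".repeat(24);

console.log(sanitizeMemoryFact(`My key is ${fakeKey}`));
// prints: My key is [REDACTED]

console.log(sanitizeMemoryFact("password: hunter2 for staging"));
// prints: [REDACTED] for staging

console.log(sanitizeMemoryFact("User prefers TypeScript"));
// prints: User prefers TypeScript
```

Note that the key-value pattern removes the label together with the value, so a fact like the second one keeps only its trailing context.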

Pattern 4: PM Agent Handoff with Conversation Summary

What: POST /conversations/:id/assistant-handoff — creates a new conversation pre-seeded with a system message summarising the current exchange, then navigates the user to that new conversation.

When to use: When user clicks "Turn this into a project" in PersonalAssistantPage.

The existing POST /conversations/:id/handoff creates an issue from a structured spec. The new assistant handoff is different — it creates a conversation context transfer, not necessarily an issue. It should:

1. Fetch the last N messages from the current conversation.
2. Produce a brief text summary (simple first-pass: concatenate user messages up to a token budget; optional: use AI summarisation via puterProxyService if token available).
3. Create a new conversation (or find the PM agent's default conversation) and insert a messageType: "handoff_context" system message.
4. Return { targetConversationId } so the client can navigate.
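The simple first-pass summary in the steps above could be sketched as a pure function. The name `buildHandoffSummary`, its default limits, and the use of a character budget as a stand-in for a token budget are all assumptions made for illustration; the AI summarisation path is not shown.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: concatenates recent user messages under a character budget.
// A character count is used here as a rough proxy for a token budget.
function buildHandoffSummary(
  messages: ChatMessage[],
  lastN = 20,
  maxChars = 1000,
): string {
  const userLines = messages
    .slice(-lastN)
    .filter((m) => m.role === "user")
    .map((m) => `- ${m.content.trim()}`);

  const kept: string[] = [];
  let used = 0;
  // Walk newest-first so the most recent requests survive a tight budget,
  // then restore chronological order by inserting at the front.
  for (let i = userLines.length - 1; i >= 0; i--) {
    const line = userLines[i]!;
    const cost = line.length + 1; // +1 for the joining newline
    if (used + cost > maxChars) break;
    kept.unshift(line);
    used += cost;
  }
  return kept.join("\n");
}

const exchange: ChatMessage[] = [
  { role: "user", content: "I want a landing page for my bakery" },
  { role: "assistant", content: "Sure, what sections do you need?" },
  { role: "user", content: "Menu, opening hours, and a contact form" },
];

console.log(buildHandoffSummary(exchange));
// prints:
// - I want a landing page for my bakery
// - Menu, opening hours, and a contact form
```

The resulting string would become the content of the seeded `handoff_context` system message; running it through the write-time sanitizer first would keep pasted credentials out of the PM conversation as well.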

Anti-Patterns to Avoid

Sanitizing memory at retrieval time (not write time): ASST-02 requires sanitization at write time. Sanitizing only on read means raw credentials are stored to disk.
Storing the full conversation as memory: Memory should be a summary — not raw message content. Full messages can include credentials, PII, or prompt-injection payloads.
Putting memory in instance_settings.general: That field is per-instance not per-company. Multiple companies (workspaces) need separate memory namespaces.
Reading memory synchronously in the hot path of the SSE stream without caching: The file read is fast (< 1ms for small JSON) but should be done once before writing SSE headers, not inside the token loop.

Don't Hand-Roll

Problem	Don't Build	Use Instead	Why
Credential scrubbing regex	Custom patterns from scratch	Extend `server/src/redaction.ts` with plain-text patterns	Existing patterns cover `api_key`, `access_token`, `jwt`, etc. — reusing reduces divergence
Secrets storage	New encrypted store	`secretService` already handles encrypted storage — memory does not store secrets, only sanitized facts	Credential data must never reach memory layer at all
AI summarisation service	New inference client	`puterProxyService` for optional summarisation — same endpoint, same cost tracking	Avoids a second AI client implementation

Key insight: The memory layer is deliberately simple — a list of plain-text facts. The complexity is in the sanitization gate, not the storage.

Runtime State Inventory

SKIPPED — this is a greenfield phase, not a rename/refactor/migration.

Common Pitfalls

Pitfall 1: Mode Check Location

What goes wrong: UI renders PersonalAssistantPage for all users regardless of mode, or the stream endpoint injects memory for project-builder-only users.
Why it happens: nexusSettings.mode is read from a file, not from the DB, so it is easy to forget to check it server-side.
How to avoid: The stream endpoint must call nexusSettingsService().get() and skip memory injection when mode === "project_builder".
Warning signs: Memory facts appearing in project builder conversations.
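One way to keep the server and UI checks from drifting apart is a single shared predicate. This is a sketch only: `NexusMode` and `isAssistantEnabled` are hypothetical names, and the three mode values are the ones this document mentions.

```typescript
// Mode values as referenced in this document; the type name is an assumption.
type NexusMode = "personal_ai" | "project_builder" | "both";

// Hypothetical shared predicate: true whenever the assistant side is active.
function isAssistantEnabled(mode: NexusMode): boolean {
  return mode !== "project_builder";
}

// Server side: decide whether the stream endpoint should prepend memory.
function factsToInject(mode: NexusMode, facts: string[]): string[] {
  return isAssistantEnabled(mode) ? facts : [];
}

console.log(factsToInject("both", ["Prefers dark mode"]).length); // 1
console.log(factsToInject("project_builder", ["Prefers dark mode"]).length); // 0
```

The same predicate could gate the PersonalAssistantPage route in the UI, so that both layers agree on what "assistant mode" means.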

Pitfall 2: companyId Scoping in Memory File Path

What goes wrong: All companies share a single memory file.
Why it happens: Forgetting companyId in the path, e.g. data/assistant-memory.json instead of data/assistant-memory/<companyId>.json.
How to avoid: The path resolver function must include companyId in the file name under data/assistant-memory/ (as shown in Pattern 1 above).
Warning signs: Facts from workspace A appearing in workspace B.

Pitfall 3: SSE Stream Already Flushed When Memory Read Fails

What goes wrong: A memory read error causes an unhandled exception after SSE headers are flushed, leaving the client connection open but broken.
Why it happens: The existing stream endpoint flushes headers at line 101 (res.flushHeaders()) before any async logic.
How to avoid: Read memory before calling res.flushHeaders(). If the memory read fails, fall back gracefully (empty memory, log a warning); never throw after the flush.
Warning signs: Client sees the :ok event but then nothing further.
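The graceful fallback could be isolated in a small wrapper so the route handler cannot forget it. This is a sketch under assumptions: `loadMemoryOrEmpty` is a hypothetical name, and the reader is passed in as a function so the example does not depend on the real service.

```typescript
interface AssistantMemory {
  facts: string[];
}

// Hypothetical wrapper: a failed memory read degrades to empty memory plus a warning,
// so nothing can throw once the SSE headers have been flushed.
async function loadMemoryOrEmpty(
  read: () => Promise<AssistantMemory>,
  warn: (message: string) => void = console.warn,
): Promise<AssistantMemory> {
  try {
    return await read();
  } catch (err) {
    warn(`assistant memory read failed, continuing without memory: ${String(err)}`);
    return { facts: [] };
  }
}

// Intended order inside the route handler (comments only, not runnable here):
//   const memory = await loadMemoryOrEmpty(() => assistantMemoryService.get(companyId));
//   res.setHeader("Content-Type", "text/event-stream");
//   res.flushHeaders(); // only after memory is already in hand

loadMemoryOrEmpty(async () => {
  throw new Error("disk unavailable");
}).then((memory) => {
  console.log(memory.facts.length); // 0
});
```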

Pitfall 4: Memory Growing Without Bound

What goes wrong: The facts array accumulates thousands of entries; the injected system prompt exceeds the model's context window.
Why it happens: No fact eviction or cap.
How to avoid: Cap at 50 facts (FIFO: drop the oldest when the limit is reached). Cap the injected system prefix at 2000 characters.
Warning signs: Streaming responses truncated; model errors about context length.
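Both caps can be expressed as small pure functions. The sketch below assumes the limits named above (50 facts, 2000 characters); the function names are hypothetical, and keeping the newest facts when the prefix budget is tight is an assumption, since the source only specifies the character limit.

```typescript
const MAX_FACTS = 50;
const MAX_PREFIX_CHARS = 2000;

// FIFO cap: append the new fact, then keep only the newest maxFacts entries.
function appendWithCap(facts: string[], fact: string, maxFacts = MAX_FACTS): string[] {
  if (maxFacts <= 0) return [];
  return [...facts, fact].slice(-maxFacts);
}

// Builds the system prefix without exceeding maxChars.
// Assumption: when space runs out, older facts are dropped first.
function buildMemoryPrefix(facts: string[], maxChars = MAX_PREFIX_CHARS): string {
  const header = "[Memory from previous sessions]\n";
  const lines: string[] = [];
  let used = header.length;
  for (let i = facts.length - 1; i >= 0; i--) {
    const line = `- ${facts[i]}\n`;
    if (used + line.length > maxChars) break;
    lines.unshift(line);
    used += line.length;
  }
  return lines.length > 0 ? header + lines.join("") : "";
}

let facts: string[] = [];
for (let i = 0; i < 60; i++) {
  facts = appendWithCap(facts, `fact ${i}`);
}
console.log(facts.length); // 50
console.log(facts[0]); // fact 10
console.log(buildMemoryPrefix(facts).length <= MAX_PREFIX_CHARS); // true
```

Applying the fact cap at write time and the character cap at injection time keeps the file small and the prompt bounded even if individual facts are long.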

Pitfall 5: Handoff Creates Duplicate PM Conversations

What goes wrong: Every "Turn this into a project" click creates a new PM conversation, leading to many orphaned conversations.
Why it happens: The handoff endpoint creates a new conversation on every call.
How to avoid: Either reuse the PM agent's most recent conversation, or create a fresh one per handoff (acceptable for v1.5) and document the duplication as a known v1.5 limitation. The route should return targetConversationId so the UI can detect if a new one was created.
Warning signs: Conversation list growing rapidly.

Pitfall 6: Chat Route Injection Point

What goes wrong: Memory injection is added to the wrong handler: the echo streamEcho helper instead of the real AI call.
Why it happens: POST /conversations/:id/stream currently calls svc.streamEcho(), which is a stub. Phase 33 replaces this with a real AI call. The memory injection must be added to the real call, not to the stub.
How to avoid: During plan-phase, inspect the exact injection point in server/src/routes/chat.ts lines 107-137 and confirm the replacement strategy. (State.md blocker: "Chat route injection point needs codebase inspection — confirm correct hook location in server/src/services/chat.ts during plan-phase".)
Warning signs: Memory injected but responses are still the echo stub.

Code Examples

Verified patterns from existing codebase:

File-Backed Service Read/Write

// Source: server/src/services/nexus-settings.ts
async function get(): Promise<NexusSettings> {
  const filePath = resolveNexusSettingsPath();
  try {
    const raw = fs.readFileSync(filePath, "utf-8");
    const parsed = nexusSettingsSchema.safeParse(JSON.parse(raw));
    if (parsed.success) return parsed.data;
    return { mode: "both" };
  } catch {
    return { mode: "both" };
  }
}

async function set(patch: Partial<NexusSettings>): Promise<NexusSettings> {
  const current = await get();
  const merged = { ...current, ...patch };
  const validated = nexusSettingsSchema.parse(merged);
  const filePath = resolveNexusSettingsPath();
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(validated, null, 2), "utf-8");
  return validated;
}

Existing Redaction Patterns (extend for plain-text memory)

// Source: server/src/redaction.ts
const SECRET_PAYLOAD_KEY_RE =
  /(api[-_]?key|access[-_]?token|auth(?:_?token)?|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)/i;
const JWT_VALUE_RE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)?$/;

Existing Handoff Route Pattern

// Source: server/src/routes/chat.ts lines 163-203
router.post("/conversations/:id/handoff", async (req, res) => {
  assertBoard(req);
  const data = handoffSchema.parse(req.body);
  const conversation = await svc.getConversation(req.params.id!);
  const companyId = conversation.companyId;
  // 1. Insert handoff system message
  // 2. Create issue from spec
  // 3. Insert task_created system message
  res.json({ handoffMessageId: handoffMsg.id, issues: [issue] });
});

Existing SSE Streaming Pattern

// Source: server/src/routes/chat.ts lines 88-137
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
res.setHeader("X-Accel-Buffering", "no");
res.flushHeaders();
res.write(":ok\n\n");
// ... async generator loop
res.write(`data: ${JSON.stringify({ token })}\n\n`);
res.write(`data: ${JSON.stringify({ done: true, messageId, content })}\n\n`);

PuterProxyService Chat Stream

// Source: server/src/services/puter-proxy.ts
async function* chatStream(
  companyId: string,
  agentId: string | null | undefined,
  messages: unknown[],
  model: string | undefined,
  signal: AbortSignal | undefined,
): AsyncGenerator<string>
// Called with messages array in OpenAI format: [{ role: "system"|"user"|"assistant", content: string }]
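As a sketch of how the stream route might consume a generator of this shape, the helper below forwards each token as an SSE frame and returns the accumulated text for saving as the final message. `fakeChatStream` is a stand-in with the same `AsyncGenerator<string>` return type, and `pumpToSse` is a hypothetical name; neither is part of the existing codebase.

```typescript
// Stand-in for puterProxyService.chatStream: yields tokens one at a time.
async function* fakeChatStream(tokens: string[]): AsyncGenerator<string> {
  for (const token of tokens) {
    yield token;
  }
}

// Hypothetical helper: writes one SSE data frame per token and returns the full content.
// The caller is expected to persist the message and then write the final "done" frame.
async function pumpToSse(
  stream: AsyncGenerator<string>,
  write: (frame: string) => void,
): Promise<string> {
  let content = "";
  for await (const token of stream) {
    content += token;
    write(`data: ${JSON.stringify({ token })}\n\n`);
  }
  return content;
}

const frames: string[] = [];
pumpToSse(fakeChatStream(["Hel", "lo"]), (frame) => {
  frames.push(frame);
}).then((content) => {
  console.log(content); // Hello
  console.log(frames.length); // 2
});
```

In the real route, `write` would be `res.write`, and the returned content is also what would be handed to the memory append step after the assistant turn completes.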

State of the Art

Old Approach	Current Approach	When Changed	Impact
Vector DB for memory	Summary-based file-backed facts	Phase 33 (new)	No infra overhead; no pgvector needed
Echo stub in stream endpoint	Real AI via puterProxyService	Phase 33 (new)	Memory injection only makes sense once real AI is wired

Deprecated/outdated:

streamEcho() in chatService: stub used for testing — Phase 33 replaces it with real AI calls in the stream route (or supplements it with a parallel assistant-stream endpoint).

Open Questions

1. Does the stream endpoint replace the echo stub or add a parallel endpoint?
   - What we know: /conversations/:id/stream currently calls svc.streamEcho(). The puter proxy has its own endpoint at /puter-proxy/chat.
   - What's unclear: Whether to extend the existing stream endpoint to delegate to puterProxyService (if the conversation is in a personal-assistant company), or create a dedicated /conversations/:id/assistant-stream.
   - Recommendation: Extend the existing endpoint, since the UI already calls it via chatApi.postMessageAndStream(). The endpoint should check nexusSettings.mode and route to puterProxy if a token is available, else fall back to echo stub. This avoids a UI-side change.
2. What constitutes a "memory fact", and when is it written?
   - What we know: ASST-01 says "summary-based". Facts could be extracted from (a) every assistant turn, (b) end of conversation, or (c) on demand.
   - What's unclear: v1.5 success criterion 1 says "A fact stated in one chat session… is referenced correctly by the assistant in a new session." This implies facts are persisted at the end of a session or after each user-confirmed turn.
   - Recommendation: Append after each assistant response turn. This is simpler than session-end detection and aligns with success criterion 1 (fact from one session visible in next). The onDone callback in chatApi.postMessageAndStream is the natural trigger; or do it server-side in the stream endpoint after saving the final message.
3. How does the personal assistant handoff interact with the existing brainstormer handoff?
   - What we know: The existing handoff (Phase 23) creates a project issue from a structured spec ({ what, why, constraints, success }). The assistant handoff (ASST-03) transfers conversation context to a PM agent.
   - What's unclear: Whether these are the same button or different flows.
   - Recommendation: Implement a separate POST /conversations/:id/assistant-handoff endpoint. It does not create an issue; it creates a new conversation with a context summary as a seeded system message. The UI button is "Turn this into a project" (distinct from the brainstormer's "Send to PM" which targets a structured spec).

Environment Availability

Phase is code/config-only. No new external services required. puterProxyService is already integrated (Phase 31). File I/O uses Node.js built-ins.

Dependency	Required By	Available	Version	Fallback
Node.js `fs` module	Memory file storage	✓	built-in	—
`puterProxyService`	AI responses + optional summarisation	✓	internal	Fall back to echo stub if no token
`nexusSettingsService`	Mode check (ASST-04)	✓	internal	—
`secretService`	Token resolution in puterProxy	✓	internal	—

Validation Architecture

Test Framework

Property	Value
Framework	Vitest 3.x
Config file	`server/vitest.config.ts`
Quick run command	`pnpm --filter @paperclipai/server vitest run --reporter=verbose src/__tests__/33-*.test.ts`
Full suite command	`pnpm test:run`

Phase Requirements → Test Map

Req ID	Behavior	Test Type	Automated Command	File Exists?
ASST-01	Memory persists across sessions — new session includes previously stored fact in system prompt	unit	`pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-memory.test.ts`	❌ Wave 0
ASST-02	API key pasted into chat is NOT stored in memory file	unit	`pnpm --filter @paperclipai/server vitest run src/__tests__/33-memory-sanitization.test.ts`	❌ Wave 0
ASST-03	Assistant handoff creates target conversation with context summary system message	unit	`pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-handoff.test.ts`	❌ Wave 0
ASST-04	PersonalAssistantPage visible when mode is `personal_ai` or `both`; hidden when `project_builder`	unit	`pnpm --filter @paperclipai/ui vitest run src/components/PersonalAssistantPage.test.tsx`	❌ Wave 0

Sampling Rate

Per task commit: pnpm --filter @paperclipai/server vitest run src/__tests__/33-*.test.ts
Per wave merge: pnpm test:run
Phase gate: Full suite green before /gsd:verify-work

Wave 0 Gaps

server/src/__tests__/33-assistant-memory.test.ts — covers ASST-01
server/src/__tests__/33-memory-sanitization.test.ts — covers ASST-02
server/src/__tests__/33-assistant-handoff.test.ts — covers ASST-03
ui/src/components/PersonalAssistantPage.test.tsx — covers ASST-04

Sources

Primary (HIGH confidence)

server/src/services/nexus-settings.ts — file-backed JSON service pattern (confirmed by reading source)
server/src/services/puter-proxy.ts — AI streaming service (confirmed by reading source)
server/src/routes/chat.ts — SSE streaming pattern and handoff route (confirmed by reading source)
server/src/redaction.ts — credential scrubbing patterns (confirmed by reading source)
packages/db/src/schema/chat_conversations.ts — conversation schema (confirmed by reading source)
packages/db/src/schema/chat_messages.ts — message schema with messageType column (confirmed by reading source)
.planning/REQUIREMENTS.md — "DB schema changes: Out of Scope" (confirmed by reading source)

Secondary (MEDIUM confidence)

.planning/ROADMAP.md Phase 33 success criteria — defines exact behavior for ASST-01/ASST-02/ASST-03/ASST-04
.planning/STATE.md Accumulated Context decisions — confirms "No DB schema changes", "Memory sanitization blocklist applied at write time"

Tertiary (LOW confidence)

None — all findings verified from source files.

Metadata

Confidence breakdown:

Standard stack: HIGH — verified by reading existing service files
Architecture: HIGH — follows established patterns in codebase (nexusSettingsService, redaction.ts, handoff route)
Pitfalls: HIGH — derived from reading the actual stream endpoint code and constraint docs

Research date: 2026-04-01
Valid until: 2026-05-01 (stable codebase; these patterns won't change without major refactors)
