Phase 33: Persistent Memory + Personal Assistant Mode — Research
Researched: 2026-04-01
Domain: File-backed memory service, prompt injection, assistant handoff to PM agent
Confidence: HIGH
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.
Claude's Discretion
All implementation choices.
Deferred Ideas (OUT OF SCOPE)
None; the discuss phase was skipped.
</user_constraints>
<phase_requirements>
Phase Requirements
| ID | Description | Research Support |
|---|---|---|
| ASST-01 | User has persistent memory across chat sessions (summary-based, injected into system prompts) | File-backed memory service at data/assistant-memory/<companyId>.json; inject via new /conversations/:id/assistant-stream endpoint or by extending existing /conversations/:id/stream |
| ASST-02 | Memory content sanitized at write time to prevent prompt injection | server/src/redaction.ts has SECRET_PAYLOAD_KEY_RE + JWT_VALUE_RE; extend with a plain-text credential blocklist pattern applied at memorySummary write time |
| ASST-03 | User can hand off an assistant conversation to a PM agent with one click, transferring context | Existing POST /conversations/:id/handoff route creates an issue from a spec. Phase 33 needs a parallel route — or mode on the existing route — that creates a PM agent conversation with a system message containing a conversation summary |
| ASST-04 | Assistant and Project Builder modes work standalone or together | nexusSettingsService persists mode as `"personal_ai"`, `"project_builder"`, or `"both"` |
</phase_requirements>
Summary
Phase 33 adds three capabilities to Nexus: (1) a file-backed memory layer that accumulates facts across sessions and injects them as a system prompt prefix before every assistant response, (2) credential scrubbing that prevents API keys and tokens from being stored in memory, and (3) a one-click "hand off to PM" flow that creates a new conversation pre-seeded with a summary of the assistant exchange.
No new DB tables are permitted (milestone constraint). All state lives either in data/assistant-memory/<companyId>.json (following the nexus-settings.json file-backed pattern) or in existing JSONB columns. The mode setting already persists correctly in data/nexus-settings.json via nexusSettingsService and is accessible via GET /nexus/settings. The UI only needs to read this mode to gate the PersonalAssistantPage route.
The streaming endpoint at POST /conversations/:id/stream currently responds with the echo service. Phase 33 must replace this with a real AI call (most likely through puterProxyService) and add memory injection into the message array sent to the model. Memory is accumulated after each completed assistant turn by calling a new assistantMemoryService.append() which sanitizes the assistant text before persisting.
Primary recommendation: Build an assistantMemoryService modelled on nexusSettingsService (file-backed JSON, no DB changes), extend the /conversations/:id/stream endpoint to prepend memory as a system message, and add a /conversations/:id/assistant-handoff route that creates a PM-linked conversation with a seeded system message.
Standard Stack
Core
| Library | Version | Purpose | Why Standard |
|---|---|---|---|
| node:fs (sync) | Node built-in | Read/write assistant-memory/<companyId>.json | Same pattern as nexus-settings.json; no extra deps |
| zod | already in codebase | Schema validation for memory JSON on read | All service layer uses zod |
| drizzle-orm | already in codebase | Querying chatMessages for summary extraction | Already used in chatService |
| Express Router | already in codebase | New memory CRUD routes | All server routes use Express |
Supporting
| Library | Version | Purpose | When to Use |
|---|---|---|---|
| puterProxyService | internal | Real AI completions for memory summarization | Use for the summarize-conversation step; same service already used for chat |
Alternatives Considered
| Instead of | Could Use | Tradeoff |
|---|---|---|
| File-backed JSON memory | instance_settings.general JSONB | JSONB is per-instance, not per-company; file-backed allows per-company isolation and matches REQUIREMENTS.md out-of-scope note |
| File-backed JSON memory | Vector DB (Mem0, Chroma) | Explicitly out of scope in REQUIREMENTS.md: "Vector database for memory: Summary-based approach sufficient; no infra overhead" |
| File-backed JSON memory | chat_conversations extra column | Would require a DB schema migration, which the milestone constraint prohibits |
Installation: No new packages needed.
Architecture Patterns
Recommended Project Structure
server/src/services/
├── assistant-memory.ts # new — read/write data/assistant-memory/<companyId>.json
server/src/routes/
├── assistant-memory.ts # new — GET/PATCH memory endpoints
ui/src/
├── pages/PersonalAssistantPage.tsx # new — gated on mode !== "project_builder"
├── api/assistantMemory.ts # new — API client for memory endpoints
Pattern 1: File-Backed Service (follows nexusSettingsService)
What: Read a JSON file from resolvePaperclipInstanceRoot()/data/assistant-memory/<companyId>.json, validate with zod, write back after updates.
When to use: All memory reads and writes.
Example:
// Source: server/src/services/nexus-settings.ts (existing pattern)
import fs from "node:fs";
import path from "node:path";
import { z } from "zod";
import { resolvePaperclipInstanceRoot } from "../home-paths.js";
const assistantMemorySchema = z.object({
  facts: z.array(z.string()).default([]),
  updatedAt: z.string().datetime().optional(),
});
type AssistantMemory = z.infer<typeof assistantMemorySchema>;
function resolveMemoryPath(companyId: string): string {
  return path.resolve(
    resolvePaperclipInstanceRoot(),
    "data",
    "assistant-memory",
    `${companyId}.json`,
  );
}
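A minimal sketch of how read and append operations might sit on top of this path layout. To stay runnable on its own, it takes the root directory as a parameter instead of calling resolvePaperclipInstanceRoot() and validates the file shape by hand instead of using zod. `createAssistantMemoryStore` and its `sanitize` parameter are hypothetical names, not existing code.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

interface AssistantMemory {
  facts: string[];
  updatedAt?: string;
}

// Hypothetical factory. In the real service the root would presumably come
// from resolvePaperclipInstanceRoot() and parsing would go through zod.
function createAssistantMemoryStore(
  rootDir: string,
  sanitize: (raw: string) => string = (raw) => raw,
) {
  const filePathFor = (companyId: string): string =>
    path.resolve(rootDir, "data", "assistant-memory", `${companyId}.json`);

  function get(companyId: string): AssistantMemory {
    try {
      const parsed: unknown = JSON.parse(
        fs.readFileSync(filePathFor(companyId), "utf-8"),
      );
      const facts = (parsed as { facts?: unknown }).facts;
      if (Array.isArray(facts) && facts.every((f) => typeof f === "string")) {
        return { facts: facts as string[] };
      }
    } catch {
      // Missing or corrupt file: fall through to empty memory.
    }
    return { facts: [] };
  }

  function append(companyId: string, rawFact: string): AssistantMemory {
    // Sanitize before anything touches disk (write-time gate).
    const fact = sanitize(rawFact).trim();
    const current = get(companyId);
    if (fact.length === 0) return current;
    const next: AssistantMemory = {
      facts: [...current.facts, fact],
      updatedAt: new Date().toISOString(),
    };
    const filePath = filePathFor(companyId);
    fs.mkdirSync(path.dirname(filePath), { recursive: true });
    fs.writeFileSync(filePath, JSON.stringify(next, null, 2), "utf-8");
    return next;
  }

  return { get, append };
}
```

Each company gets its own file, so a read for one companyId never sees facts appended under another.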
Pattern 2: Memory Injection via System Message Prefix
What: Before streaming a response, prepend a system message containing the memory facts to the messages array passed to the model.
When to use: Every /conversations/:id/stream call when the conversation's company is in personal_ai or both mode.
Example:
// Inject pattern — server/src/routes/chat.ts stream endpoint extension
const memory = await assistantMemoryService.get(companyId);
const systemPrefix = memory.facts.length > 0
  ? `[Memory from previous sessions]\n${memory.facts.map(f => `- ${f}`).join("\n")}\n\n`
  : "";
const messagesWithMemory = systemPrefix
  ? [{ role: "system", content: systemPrefix }, ...conversationMessages]
  : conversationMessages;
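The same injection step can be pulled into a pure function so it is unit-testable without the route. This is a sketch only: `ChatMessage` and `withMemoryPrefix` are hypothetical names, and the message shape assumes the OpenAI-style format that puterProxyService is described as accepting.

```typescript
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Hypothetical helper: returns the conversation unchanged when there is no
// memory, otherwise prepends one system message listing the stored facts.
function withMemoryPrefix(
  facts: string[],
  conversation: ChatMessage[],
): ChatMessage[] {
  if (facts.length === 0) return conversation;
  const content = `[Memory from previous sessions]\n${facts
    .map((f) => `- ${f}`)
    .join("\n")}`;
  return [{ role: "system", content }, ...conversation];
}
```

Returning the original array when memory is empty keeps the no-memory path identical to current behaviour.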
Pattern 3: Write-Time Sanitization
What: Before appending any fact to memory, run it through a credential-scrubbing function.
When to use: Every call to assistantMemoryService.append().
Example:
// Source: server/src/redaction.ts (existing pattern extended for plain text)
// The sk- branch also allows "-" and "_" so prefixed keys such as sk-proj-... are caught
// (assumption about current key formats; verify against the providers in use).
const CREDENTIAL_INLINE_RE = /\b(sk-[A-Za-z0-9_-]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
// Matches "key: value" / "key=value" pairs and redacts the whole pair.
const SENSITIVE_KEY_VALUE_RE = /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;
function sanitizeMemoryFact(raw: string): string {
  return raw
    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
}
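A table of input/expected pairs is one way to pin down what the sanitizer should and should not touch, in the spirit of the ASST-02 test. The sketch below restates the two patterns so it runs on its own; its `sk-` branch accepts hyphens and underscores, which is an assumption about prefixed key formats such as `sk-proj-...`. All sample strings are made up, and `sanitizationCases` and `sanitizationFailures` are hypothetical names.

```typescript
// Patterns restated here so the block is self-contained.
const CREDENTIAL_INLINE_RE =
  /\b(sk-[A-Za-z0-9_-]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
const SENSITIVE_KEY_VALUE_RE =
  /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;

function sanitizeMemoryFact(raw: string): string {
  return raw
    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
}

// [input, expected] pairs. None of these are real credentials.
const sanitizationCases: Array<[string, string]> = [
  ["My key is sk-abcdefghijklmnopqrstuvwxyz123456", "My key is [REDACTED]"],
  ["use sk-proj-AbCdEfGhIjKlMnOpQrStUvWx today", "use [REDACTED] today"],
  ["password: hunter2 for staging", "[REDACTED] for staging"],
  ["User prefers dark mode", "User prefers dark mode"],
];

// Any pair listed here is a regression in the sanitizer.
const sanitizationFailures = sanitizationCases.filter(
  ([input, expected]) => sanitizeMemoryFact(input) !== expected,
);
```

Including at least one harmless sentence in the table guards against patterns that are widened so far they start redacting ordinary facts.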
Pattern 4: PM Agent Handoff with Conversation Summary
What: POST /conversations/:id/assistant-handoff — creates a new conversation pre-seeded with a system message summarising the current exchange, then navigates the user to that new conversation.
When to use: When user clicks "Turn this into a project" in PersonalAssistantPage.
The existing POST /conversations/:id/handoff creates an issue from a structured spec. The new assistant handoff is different — it creates a conversation context transfer, not necessarily an issue. It should:
- Fetch the last N messages from the current conversation.
- Produce a brief text summary (simple first-pass: concatenate user messages up to a token budget; optional: use AI summarisation via puterProxyService if token available).
- Create a new conversation (or find the PM agent's default conversation) and insert a messageType: "handoff_context" system message.
- Return { targetConversationId } so the client can navigate.
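The simple first-pass summary could look roughly like the sketch below. It uses a character budget as a rough stand-in for a token budget, keeps only user messages, and favours the most recent ones when the budget runs out. `buildHandoffSummary`, the default budget, and the header text are all hypothetical.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical helper: walks backwards from the newest message, collecting
// user messages until the character budget would be exceeded.
function buildHandoffSummary(messages: ChatMessage[], maxChars = 1500): string {
  const picked: string[] = [];
  let used = 0;
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m === undefined || m.role !== "user") continue;
    const line = `- ${m.content.trim()}`;
    // +1 accounts for the newline that joins the lines.
    if (used + line.length + 1 > maxChars) break;
    picked.unshift(line); // unshift restores chronological order
    used += line.length + 1;
  }
  return picked.length > 0
    ? `Context handed off from the assistant conversation:\n${picked.join("\n")}`
    : "";
}
```

Because the summary is built from raw user text, it would be reasonable to pass it through the same write-time sanitizer before inserting it as a system message.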
Anti-Patterns to Avoid
- Sanitizing memory at retrieval time (not write time): ASST-02 requires sanitization at write time. Sanitizing only on read means raw credentials are stored to disk.
- Storing the full conversation as memory: Memory should be a summary — not raw message content. Full messages can include credentials, PII, or prompt-injection payloads.
- Putting memory in instance_settings.general: That field is per-instance, not per-company. Multiple companies (workspaces) need separate memory namespaces.
- Reading memory synchronously in the hot path of the SSE stream without caching: The file read is fast (< 1ms for small JSON) but should be done once before writing SSE headers, not inside the token loop.
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| Credential scrubbing regex | Custom patterns from scratch | Extend server/src/redaction.ts with plain-text patterns | Existing patterns cover api_key, access_token, jwt, etc.; reusing them reduces divergence |
| Secrets storage | New encrypted store | secretService already handles encrypted storage; memory does not store secrets, only sanitized facts | Credential data must never reach memory layer at all |
| AI summarisation service | New inference client | puterProxyService for optional summarisation: same endpoint, same cost tracking | Avoids a second AI client implementation |
Key insight: The memory layer is deliberately simple — a list of plain-text facts. The complexity is in the sanitization gate, not the storage.
Runtime State Inventory
SKIPPED — this is a greenfield phase, not a rename/refactor/migration.
Common Pitfalls
Pitfall 1: Mode Check Location
What goes wrong: UI renders PersonalAssistantPage for all users regardless of mode, or the stream endpoint injects memory for project-builder-only users.
Why it happens: nexusSettings.mode is read from a file, not from DB — easy to forget to check it server-side.
How to avoid: The stream endpoint must call nexusSettingsService().get() and skip memory injection when mode === "project_builder".
Warning signs: Memory facts appearing in project builder conversations.
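One way to keep the server-side check and the UI gate from drifting apart is a single predicate over the mode value. The sketch assumes the three mode strings used elsewhere in this document; `NexusMode`, `isAssistantEnabled`, and `factsForRequest` are hypothetical names.

```typescript
type NexusMode = "personal_ai" | "project_builder" | "both";

// Hypothetical predicate shared by the stream endpoint and the page gate.
function isAssistantEnabled(mode: NexusMode): boolean {
  switch (mode) {
    case "personal_ai":
    case "both":
      return true;
    case "project_builder":
      return false;
    default: {
      // Compile-time exhaustiveness check: adding a mode forces an update here.
      const unreachable: never = mode;
      throw new Error(`Unknown mode: ${String(unreachable)}`);
    }
  }
}

// Hypothetical helper: facts are dropped entirely in project-builder-only mode,
// so nothing downstream can inject them by accident.
function factsForRequest(mode: NexusMode, facts: string[]): string[] {
  return isAssistantEnabled(mode) ? facts : [];
}
```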
Pitfall 2: companyId Scoping in Memory File Path
What goes wrong: All companies share a single memory file.
Why it happens: Forgetting companyId in the path, e.g. data/assistant-memory.json instead of data/assistant-memory/<companyId>.json.
How to avoid: The path resolver function must include companyId as a path segment, here the file name inside data/assistant-memory/ (as shown in Pattern 1 above).
Warning signs: Facts from workspace A appearing in workspace B.
Pitfall 3: SSE Stream Already Flushed When Memory Read Fails
What goes wrong: A memory read error causes an unhandled exception after SSE headers are flushed, leaving the client connection open but broken.
Why it happens: The existing stream endpoint flushes headers at line 101 (res.flushHeaders()) before any async logic.
How to avoid: Read memory before calling res.flushHeaders(). If memory read fails, fall back gracefully (empty memory, log warning) — never throw after flush.
Warning signs: Client sees :ok event but then nothing further.
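A small wrapper can make the fallback explicit. This is a sketch under the assumption that the memory service exposes an async read; `loadMemoryOrEmpty` and its parameters are hypothetical names.

```typescript
interface AssistantMemory {
  facts: string[];
}

// Hypothetical wrapper: a failed memory read degrades to empty memory plus a
// warning, so nothing can throw once the SSE headers have been sent.
async function loadMemoryOrEmpty(
  read: () => Promise<AssistantMemory>,
  warn: (message: string) => void = console.warn,
): Promise<AssistantMemory> {
  try {
    return await read();
  } catch (err) {
    warn(`assistant memory unavailable, continuing without it: ${String(err)}`);
    return { facts: [] };
  }
}

// Intended call order inside the route (illustrative only):
//   const memory = await loadMemoryOrEmpty(() => assistantMemoryService.get(companyId));
//   res.flushHeaders();
//   res.write(":ok\n\n");
```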
Pitfall 4: Memory Growing Without Bound
What goes wrong: facts array accumulates thousands of entries; the injected system prompt exceeds the model's context window.
Why it happens: No fact eviction or cap.
How to avoid: Cap at 50 facts (FIFO — drop oldest when limit is reached). Cap the injected system prefix at 2000 characters max.
Warning signs: Streaming responses truncated; model errors about context length.
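Both caps fit in two small pure functions. The sketch below uses the limits suggested above (50 facts, 2000 characters) and, when the character budget is tight, keeps the newest facts; that last choice is an assumption, as are the names `appendFactCapped` and `buildCappedPrefix`.

```typescript
const MAX_FACTS = 50;
const MAX_PREFIX_CHARS = 2000;

// FIFO eviction: append, then keep only the newest maxFacts entries.
// Assumes maxFacts >= 1.
function appendFactCapped(
  facts: string[],
  fact: string,
  maxFacts: number = MAX_FACTS,
): string[] {
  return [...facts, fact].slice(-maxFacts);
}

// Builds the injected prefix from the newest facts backwards and stops before
// the total length (header, newlines, and lines) would exceed maxChars.
function buildCappedPrefix(
  facts: string[],
  maxChars: number = MAX_PREFIX_CHARS,
): string {
  const header = "[Memory from previous sessions]";
  const lines: string[] = [];
  let used = header.length;
  for (let i = facts.length - 1; i >= 0; i--) {
    const line = `- ${facts[i]}`;
    if (used + 1 + line.length > maxChars) break;
    lines.unshift(line); // keep chronological order in the output
    used += 1 + line.length;
  }
  return lines.length > 0 ? [header, ...lines].join("\n") : "";
}
```

Capping at both write time and injection time means an oversized file written by an older build still cannot blow up the prompt.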
Pitfall 5: Handoff Creates Duplicate PM Conversations
What goes wrong: Every "Turn this into a project" click creates a new PM conversation, leading to many orphaned conversations.
Why it happens: The handoff endpoint creates a new conversation on every call.
How to avoid: Either reuse the PM agent's most recent conversation or create a fresh one per handoff (acceptable for v1.5), and document the chosen behaviour as a known v1.5 limitation. The route should return targetConversationId so the UI can detect whether a new conversation was created.
Warning signs: Conversation list growing rapidly.
Pitfall 6: Chat Route Injection Point
What goes wrong: Memory injection is added to the wrong handler — the echo streamEcho helper instead of the real AI call.
Why it happens: POST /conversations/:id/stream currently calls svc.streamEcho() which is a stub. Phase 33 replaces this with a real AI call. The memory injection must be added here, not in the stub.
How to avoid: During plan-phase, inspect the exact injection point in server/src/routes/chat.ts lines 107-137 and confirm the replacement strategy. (STATE.md blocker: "Chat route injection point needs codebase inspection — confirm correct hook location in server/src/services/chat.ts during plan-phase".)
Warning signs: Memory injected but responses are still the echo stub.
Code Examples
Verified patterns from existing codebase:
File-Backed Service Read/Write
// Source: server/src/services/nexus-settings.ts
async function get(): Promise<NexusSettings> {
  const filePath = resolveNexusSettingsPath();
  try {
    const raw = fs.readFileSync(filePath, "utf-8");
    const parsed = nexusSettingsSchema.safeParse(JSON.parse(raw));
    if (parsed.success) return parsed.data;
    return { mode: "both" };
  } catch {
    return { mode: "both" };
  }
}
async function set(patch: Partial<NexusSettings>): Promise<NexusSettings> {
  const current = await get();
  const merged = { ...current, ...patch };
  const validated = nexusSettingsSchema.parse(merged);
  const filePath = resolveNexusSettingsPath();
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(validated, null, 2), "utf-8");
  return validated;
}
Existing Redaction Patterns (extend for plain-text memory)
// Source: server/src/redaction.ts
const SECRET_PAYLOAD_KEY_RE =
/(api[-_]?key|access[-_]?token|auth(?:_?token)?|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)/i;
const JWT_VALUE_RE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)?$/;
Existing Handoff Route Pattern
// Source: server/src/routes/chat.ts lines 163-203
router.post("/conversations/:id/handoff", async (req, res) => {
  assertBoard(req);
  const data = handoffSchema.parse(req.body);
  const conversation = await svc.getConversation(req.params.id!);
  const companyId = conversation.companyId;
  // 1. Insert handoff system message
  // 2. Create issue from spec
  // 3. Insert task_created system message
  res.json({ handoffMessageId: handoffMsg.id, issues: [issue] });
});
Existing SSE Streaming Pattern
// Source: server/src/routes/chat.ts lines 88-137
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
res.setHeader("X-Accel-Buffering", "no");
res.flushHeaders();
res.write(":ok\n\n");
// ... async generator loop
res.write(`data: ${JSON.stringify({ token })}\n\n`);
res.write(`data: ${JSON.stringify({ done: true, messageId, content })}\n\n`);
PuterProxyService Chat Stream
// Source: server/src/services/puter-proxy.ts
async function* chatStream(
  companyId: string,
  agentId: string | null | undefined,
  messages: unknown[],
  model: string | undefined,
  signal: AbortSignal | undefined,
): AsyncGenerator<string>
// Called with messages array in OpenAI format: [{ role: "system"|"user"|"assistant", content: string }]
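Since the stream is an async generator of string tokens, the route can forward each token as it arrives and keep the accumulated text for the final message and the memory append step. A minimal sketch, with `drainTokenStream` as a hypothetical helper and `fakeChatStream` standing in for the real chatStream call:

```typescript
// Hypothetical helper: forwards every token to onToken (for example an SSE
// write) and resolves with the full response text once the stream ends.
async function drainTokenStream(
  tokens: AsyncIterable<string>,
  onToken: (token: string) => void,
): Promise<string> {
  let content = "";
  for await (const token of tokens) {
    content += token;
    onToken(token);
  }
  return content;
}

// Stand-in for puterProxyService.chatStream(...), for illustration only.
async function* fakeChatStream(): AsyncGenerator<string> {
  yield "Hello";
  yield ", ";
  yield "world";
}
```

The resolved string is the natural input for saving the final message and, after sanitization, for the memory append.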
State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|---|---|---|---|
| Vector DB for memory | Summary-based file-backed facts | Phase 33 (new) | No infra overhead; no pgvector needed |
| Echo stub in stream endpoint | Real AI via puterProxyService | Phase 33 (new) | Memory injection only makes sense once real AI is wired |
Deprecated/outdated:
- streamEcho() in chatService: stub used for testing. Phase 33 replaces it with real AI calls in the stream route (or supplements it with a parallel assistant-stream endpoint).
Open Questions
- Does the stream endpoint replace the echo stub or add a parallel endpoint?
  - What we know: /conversations/:id/stream currently calls svc.streamEcho(). The puter proxy has its own endpoint at /puter-proxy/chat.
  - What's unclear: Whether to extend the existing stream endpoint to delegate to puterProxyService (if the conversation is in a personal-assistant company), or create a dedicated /conversations/:id/assistant-stream.
  - Recommendation: Extend the existing endpoint; the UI already calls it via chatApi.postMessageAndStream(). The endpoint should check nexusSettings.mode and route to puterProxy if a token is available, else fall back to echo stub. This avoids a UI-side change.
- What constitutes a "memory fact", and when is it written?
  - What we know: ASST-01 says "summary-based". Facts could be extracted from (a) every assistant turn, (b) end of conversation, or (c) on demand.
  - What's unclear: v1.5 success criterion 1 says "A fact stated in one chat session… is referenced correctly by the assistant in a new session." This implies facts are persisted at the end of a session or after each user-confirmed turn.
  - Recommendation: Append after each assistant response turn. This is simpler than session-end detection and aligns with success criterion 1 (fact from one session visible in next). The onDone callback in chatApi.postMessageAndStream is the natural trigger; alternatively, do it server-side in the stream endpoint after saving the final message.
- How does the personal assistant handoff interact with the existing brainstormer handoff?
  - What we know: The existing handoff (Phase 23) creates a project issue from a structured spec ({ what, why, constraints, success }). The assistant handoff (ASST-03) transfers conversation context to a PM agent.
  - What's unclear: Whether these are the same button or different flows.
  - Recommendation: Implement a separate POST /conversations/:id/assistant-handoff endpoint. It does not create an issue; it creates a new conversation with a context summary as a seeded system message. The UI button is "Turn this into a project" (distinct from the brainstormer's "Send to PM", which targets a structured spec).
Environment Availability
Phase is code/config-only. No new external services required.
puterProxyService is already integrated (Phase 31). File I/O uses Node.js built-ins.
| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| Node.js fs module | Memory file storage | ✓ | built-in | (none) |
| puterProxyService | AI responses + optional summarisation | ✓ | internal | Fall back to echo stub if no token |
| nexusSettingsService | Mode check (ASST-04) | ✓ | internal | (none) |
| secretService | Token resolution in puterProxy | ✓ | internal | (none) |
Validation Architecture
Test Framework
| Property | Value |
|---|---|
| Framework | Vitest 3.x |
| Config file | server/vitest.config.ts |
| Quick run command | pnpm --filter @paperclipai/server vitest run --reporter=verbose src/__tests__/33-*.test.ts |
| Full suite command | pnpm test:run |
Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|---|---|---|---|---|
| ASST-01 | Memory persists across sessions: a new session includes a previously stored fact in the system prompt | unit | pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-memory.test.ts | ❌ Wave 0 |
| ASST-02 | API key pasted into chat is NOT stored in memory file | unit | pnpm --filter @paperclipai/server vitest run src/__tests__/33-memory-sanitization.test.ts | ❌ Wave 0 |
| ASST-03 | Assistant handoff creates target conversation with context summary system message | unit | pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-handoff.test.ts | ❌ Wave 0 |
| ASST-04 | PersonalAssistantPage visible when mode is personal_ai or both; hidden when project_builder | unit | pnpm --filter @paperclipai/ui vitest run src/components/PersonalAssistantPage.test.tsx | ❌ Wave 0 |
Sampling Rate
- Per task commit: pnpm --filter @paperclipai/server vitest run src/__tests__/33-*.test.ts
- Per wave merge: pnpm test:run
- Phase gate: Full suite green before /gsd:verify-work
Wave 0 Gaps
- server/src/__tests__/33-assistant-memory.test.ts: covers ASST-01
- server/src/__tests__/33-memory-sanitization.test.ts: covers ASST-02
- server/src/__tests__/33-assistant-handoff.test.ts: covers ASST-03
- ui/src/components/PersonalAssistantPage.test.tsx: covers ASST-04
Sources
Primary (HIGH confidence)
- server/src/services/nexus-settings.ts: file-backed JSON service pattern (confirmed by reading source)
- server/src/services/puter-proxy.ts: AI streaming service (confirmed by reading source)
- server/src/routes/chat.ts: SSE streaming pattern and handoff route (confirmed by reading source)
- server/src/redaction.ts: credential scrubbing patterns (confirmed by reading source)
- packages/db/src/schema/chat_conversations.ts: conversation schema (confirmed by reading source)
- packages/db/src/schema/chat_messages.ts: message schema with messageType column (confirmed by reading source)
- .planning/REQUIREMENTS.md: "DB schema changes: Out of Scope" (confirmed by reading source)
Secondary (MEDIUM confidence)
- .planning/ROADMAP.md Phase 33 success criteria: defines exact behavior for ASST-01/ASST-02/ASST-03/ASST-04
- .planning/STATE.md Accumulated Context decisions: confirms "No DB schema changes", "Memory sanitization blocklist applied at write time"
Tertiary (LOW confidence)
- None — all findings verified from source files.
Metadata
Confidence breakdown:
- Standard stack: HIGH — verified by reading existing service files
- Architecture: HIGH — follows established patterns in codebase (nexusSettingsService, redaction.ts, handoff route)
- Pitfalls: HIGH — derived from reading the actual stream endpoint code and constraint docs
Research date: 2026-04-01
Valid until: 2026-05-01 (stable codebase; these patterns won't change without major refactors)