docs(33): research phase persistent memory domain

Nexus Dev 2026-04-03 21:44:10 +00:00
parent 9ff01aeda9
commit 2efed8797e

@ -0,0 +1,420 @@
# Phase 33: Persistent Memory + Personal Assistant Mode — Research
**Researched:** 2026-04-01
**Domain:** File-backed memory service, prompt injection, assistant handoff to PM agent
**Confidence:** HIGH
---
<user_constraints>
## User Constraints (from CONTEXT.md)
### Locked Decisions
All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.
### Claude's Discretion
All implementation choices.
### Deferred Ideas (OUT OF SCOPE)
None — discuss phase skipped.
</user_constraints>
---
<phase_requirements>
## Phase Requirements
| ID | Description | Research Support |
|----|-------------|------------------|
| ASST-01 | User has persistent memory across chat sessions (summary-based, injected into system prompts) | File-backed memory service at `data/assistant-memory/<companyId>.json`; inject via new `/conversations/:id/assistant-stream` endpoint or by extending existing `/conversations/:id/stream` |
| ASST-02 | Memory content sanitized at write time to prevent prompt injection | `server/src/redaction.ts` has `SECRET_PAYLOAD_KEY_RE` + `JWT_VALUE_RE`; extend with a plain-text credential blocklist pattern applied at `memorySummary` write time |
| ASST-03 | User can hand off an assistant conversation to a PM agent with one click, transferring context | Existing `POST /conversations/:id/handoff` route creates an issue from a spec. Phase 33 needs a parallel route — or mode on the existing route — that creates a PM agent conversation with a system message containing a conversation summary |
| ASST-04 | Assistant and Project Builder modes work standalone or together | `nexusSettingsService` persists `mode` as `"personal_ai" | "project_builder" | "both"` to `data/nexus-settings.json`. Phase 33 adds a `PersonalAssistantPage` gated on `mode !== "project_builder"` |
</phase_requirements>
---
## Summary
Phase 33 adds three capabilities to Nexus: (1) a file-backed memory layer that accumulates facts across sessions and injects them as a system prompt prefix before every assistant response, (2) credential scrubbing that prevents API keys and tokens from being stored in memory, and (3) a one-click "hand off to PM" flow that creates a new conversation pre-seeded with a summary of the assistant exchange.
No new DB tables are permitted (milestone constraint). All state lives either in `data/assistant-memory/<companyId>.json` (following the `nexus-settings.json` file-backed pattern) or in existing JSONB columns. The `mode` setting already persists correctly in `data/nexus-settings.json` via `nexusSettingsService` and is accessible via `GET /nexus/settings`. The UI only needs to read this mode to gate the `PersonalAssistantPage` route.
The streaming endpoint at `POST /conversations/:id/stream` currently responds with the echo service. Phase 33 must replace this with a real AI call (most likely through `puterProxyService`) and add memory injection into the message array sent to the model. Memory is accumulated after each completed assistant turn by calling a new `assistantMemoryService.append()` which sanitizes the assistant text before persisting.
**Primary recommendation:** Build an `assistantMemoryService` modelled on `nexusSettingsService` (file-backed JSON, no DB changes), extend the `/conversations/:id/stream` endpoint to prepend memory as a system message, and add a `/conversations/:id/assistant-handoff` route that creates a PM-linked conversation with a seeded system message.
---
## Standard Stack
### Core
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `node:fs` (sync) | Node built-in | Read/write `assistant-memory/<companyId>.json` | Same pattern as `nexus-settings.json` — no extra deps |
| `zod` | already in codebase | Schema validation for memory JSON on read | All service layer uses zod |
| `drizzle-orm` | already in codebase | Querying `chatMessages` for summary extraction | Already used in `chatService` |
| Express Router | already in codebase | New memory CRUD routes | All server routes use Express |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| `puterProxyService` | internal | Real AI completions for memory summarization | Use for the summarize-conversation step; same service already used for chat |
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| File-backed JSON memory | `instance_settings.general` JSONB | JSONB is per-instance, not per-company; file-backed allows per-company isolation and matches REQUIREMENTS.md out-of-scope note |
| File-backed JSON memory | Vector DB (Mem0, Chroma) | Explicitly out of scope in REQUIREMENTS.md — "Vector database for memory: Summary-based approach sufficient; no infra overhead" |
| File-backed JSON memory | `chat_conversations` extra column | Would require a DB schema migration — prohibited by milestone constraint |
**Installation:** No new packages needed.
---
## Architecture Patterns
### Recommended Project Structure
```
server/src/services/
├── assistant-memory.ts # new — read/write data/assistant-memory/<companyId>.json
server/src/routes/
├── assistant-memory.ts # new — GET/PATCH memory endpoints
ui/src/
├── pages/PersonalAssistantPage.tsx # new — gated on mode !== "project_builder"
├── api/assistantMemory.ts # new — API client for memory endpoints
```
### Pattern 1: File-Backed Service (follows nexusSettingsService)
**What:** Read a JSON file from `resolvePaperclipInstanceRoot()/data/assistant-memory/<companyId>.json`, validate with zod, write back after updates.
**When to use:** All memory reads and writes.
**Example:**
```typescript
// Source: server/src/services/nexus-settings.ts (existing pattern)
import fs from "node:fs";
import path from "node:path";
import { z } from "zod";
import { resolvePaperclipInstanceRoot } from "../home-paths.js";

const assistantMemorySchema = z.object({
  facts: z.array(z.string()).default([]),
  updatedAt: z.string().datetime().optional(),
});

type AssistantMemory = z.infer<typeof assistantMemorySchema>;

function resolveMemoryPath(companyId: string): string {
  return path.resolve(
    resolvePaperclipInstanceRoot(),
    "data",
    "assistant-memory",
    `${companyId}.json`,
  );
}
```
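A minimal sketch of how the `get` and `append` operations could sit on top of this path resolver. It is not the actual service: `instanceRoot` is a hypothetical temp-directory stand-in for `resolvePaperclipInstanceRoot()`, `getMemory`, `appendFact`, and `parseMemory` are illustrative names, and the hand-written shape check replaces the zod schema only so the sketch has no dependencies. Sanitization (Pattern 3) is deliberately left out here and would run before `appendFact` persists anything.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

interface AssistantMemory {
  facts: string[];
  updatedAt?: string;
}

// Assumption: a throwaway temp directory stands in for resolvePaperclipInstanceRoot().
const instanceRoot = fs.mkdtempSync(path.join(os.tmpdir(), "nexus-memory-"));

function resolveMemoryPath(companyId: string): string {
  return path.resolve(instanceRoot, "data", "assistant-memory", `${companyId}.json`);
}

// Hand-rolled validation; the real service would use assistantMemorySchema.safeParse().
function parseMemory(value: unknown): AssistantMemory | null {
  if (typeof value !== "object" || value === null) return null;
  const { facts, updatedAt } = value as { facts?: unknown; updatedAt?: unknown };
  if (!Array.isArray(facts) || facts.some((f) => typeof f !== "string")) return null;
  return {
    facts: facts as string[],
    ...(typeof updatedAt === "string" ? { updatedAt } : {}),
  };
}

function getMemory(companyId: string): AssistantMemory {
  try {
    const raw = fs.readFileSync(resolveMemoryPath(companyId), "utf-8");
    return parseMemory(JSON.parse(raw)) ?? { facts: [] };
  } catch {
    // Missing or unreadable file: behave as empty memory, like nexusSettingsService's default.
    return { facts: [] };
  }
}

function appendFact(companyId: string, fact: string): AssistantMemory {
  const current = getMemory(companyId);
  const next: AssistantMemory = {
    facts: [...current.facts, fact],
    updatedAt: new Date().toISOString(),
  };
  const filePath = resolveMemoryPath(companyId);
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(next, null, 2), "utf-8");
  return next;
}
```

Because each company resolves to its own file, reading memory for a company that has never written anything falls through to the empty default rather than picking up another company's facts.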
### Pattern 2: Memory Injection via System Message Prefix
**What:** Before streaming a response, prepend a system message containing the memory facts to the messages array passed to the model.
**When to use:** Every `/conversations/:id/stream` call when `nexusSettings.mode` is `personal_ai` or `both`.
**Example:**
```typescript
// Inject pattern: server/src/routes/chat.ts stream endpoint extension
const memory = await assistantMemoryService.get(companyId);
const systemPrefix = memory.facts.length > 0
  ? `[Memory from previous sessions]\n${memory.facts.map(f => `- ${f}`).join("\n")}\n\n`
  : "";
const messagesWithMemory = systemPrefix
  ? [{ role: "system", content: systemPrefix }, ...conversationMessages]
  : conversationMessages;
```
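The same idea as a standalone, typed function, which may be easier to unit test than an inline route fragment. This is a sketch under assumptions: `ChatMessage` and `prependMemory` are illustrative names, and the message shape follows the OpenAI-style `{ role, content }` format that this document says `puterProxyService` accepts.

```typescript
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Returns the conversation untouched when there is nothing to inject,
// otherwise a new array with one system message in front.
function prependMemory(facts: string[], conversation: ChatMessage[]): ChatMessage[] {
  if (facts.length === 0) return conversation;
  const bullets = facts.map((fact) => `- ${fact}`).join("\n");
  const memoryMessage: ChatMessage = {
    role: "system",
    content: `[Memory from previous sessions]\n${bullets}`,
  };
  return [memoryMessage, ...conversation];
}
```

Keeping the function pure (no file access, no settings lookup) leaves the mode check and the memory read to the caller.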
### Pattern 3: Write-Time Sanitization
**What:** Before appending any fact to memory, run it through a credential-scrubbing function.
**When to use:** Every call to `assistantMemoryService.append()`.
**Example:**
```typescript
// Source: server/src/redaction.ts (existing pattern extended for plain text)
// Inline credential shapes: `sk-` keys, GitHub `ghp_` tokens, Google `AIza` keys,
// and three-segment dot-separated tokens (JWT-like).
const CREDENTIAL_INLINE_RE = /\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
// `key: value` or `key=value` pairs whose key names a credential.
const SENSITIVE_KEY_VALUE_RE = /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;

function sanitizeMemoryFact(raw: string): string {
  return raw
    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
}
```
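To make the behaviour of these two patterns concrete, the sketch below runs the same function over a few invented sample strings (none of them come from the codebase). The regexes are copied from the example above so the block runs on its own.

```typescript
const CREDENTIAL_INLINE_RE = /\b(sk-[A-Za-z0-9]{20,}|ghp_[A-Za-z0-9]{36}|AIza[0-9A-Za-z_-]{35}|[A-Za-z0-9_-]{20,}\.[A-Za-z0-9_-]{6,}\.[A-Za-z0-9_-]{20,})/g;
const SENSITIVE_KEY_VALUE_RE = /(?:api[_-]?key|token|secret|password|bearer|auth)\s*[:=]\s*\S+/gi;

function sanitizeMemoryFact(raw: string): string {
  return raw
    .replace(CREDENTIAL_INLINE_RE, "[REDACTED]")
    .replace(SENSITIVE_KEY_VALUE_RE, "[REDACTED]");
}

// Invented inputs for illustration only.
const samples: string[] = [
  "My OpenAI key is sk-abcdefghijklmnopqrstuvwxyz123456",
  "password = hunter2 for staging",
  "User prefers dark mode",
  "Authorization: Bearer abcdefghijklmnop",
];

for (const sample of samples) {
  console.log(sanitizeMemoryFact(sample));
}
```

The first two samples come back as `My OpenAI key is [REDACTED]` and `[REDACTED] for staging`, and the third is unchanged. The fourth is also unchanged: neither pattern matches a `Bearer <token>` pair that has no `:` or `=` between the keyword and the value, so the plan phase may want to add a pattern for that shape and a matching test case under ASST-02.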
### Pattern 4: PM Agent Handoff with Conversation Summary
**What:** `POST /conversations/:id/assistant-handoff` — creates a new conversation pre-seeded with a system message summarising the current exchange, then navigates the user to that new conversation.
**When to use:** When user clicks "Turn this into a project" in PersonalAssistantPage.
The existing `POST /conversations/:id/handoff` creates an issue from a structured spec. The new assistant handoff is different: it transfers conversation context and does not necessarily create an issue. It should:
1. Fetch the last N messages from the current conversation.
2. Produce a brief text summary (simple first pass: concatenate user messages up to a token budget; optionally use AI summarisation via `puterProxyService` if a token is available).
3. Create a new conversation (or find the PM agent's default conversation) and insert a `messageType: "handoff_context"` system message.
4. Return `{ targetConversationId }` so the client can navigate.
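Steps 1 to 3 can be sketched without any database or AI call. Everything named here is hypothetical (`summariseForHandoff`, `buildHandoffMessage`, the `[Handed off from personal assistant]` header), and a character budget stands in for the token budget because no tokenizer is assumed.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface HandoffContextMessage {
  role: "system";
  messageType: "handoff_context";
  content: string;
}

// First-pass summary: bullet the user messages among the last `lastN` messages,
// stopping as soon as the next bullet would exceed `maxChars`.
function summariseForHandoff(messages: ChatMessage[], lastN: number, maxChars: number): string {
  if (lastN <= 0) return "";
  const userLines = messages
    .slice(-lastN)
    .filter((m) => m.role === "user")
    .map((m) => `- ${m.content.trim()}`);
  const kept: string[] = [];
  let used = 0;
  for (const line of userLines) {
    const cost = line.length + 1; // +1 for the joining newline
    if (used + cost > maxChars) break;
    kept.push(line);
    used += cost;
  }
  return kept.join("\n");
}

// The seeded system message inserted into the target conversation.
function buildHandoffMessage(summary: string): HandoffContextMessage {
  return {
    role: "system",
    messageType: "handoff_context",
    content: `[Handed off from personal assistant]\n${summary}`,
  };
}
```

This version keeps the oldest messages in the window when the budget runs out. Preferring the most recent ones instead is an equally defensible choice and only requires iterating `userLines` in reverse.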
### Anti-Patterns to Avoid
- **Sanitizing memory at retrieval time (not write time):** ASST-02 requires sanitization at write time. Sanitizing only on read means raw credentials are stored to disk.
- **Storing the full conversation as memory:** Memory should be a summary — not raw message content. Full messages can include credentials, PII, or prompt-injection payloads.
- **Putting memory in `instance_settings.general`:** That field is per-instance not per-company. Multiple companies (workspaces) need separate memory namespaces.
- **Reading memory inside the SSE token loop:** The file read is cheap for a small JSON file, but it should happen once, before the SSE headers are written, not on every token.
---
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| Credential scrubbing regex | Custom patterns from scratch | Extend `server/src/redaction.ts` with plain-text patterns | Existing patterns cover `api_key`, `access_token`, `jwt`, etc. — reusing reduces divergence |
| Secrets storage | New encrypted store | `secretService` already handles encrypted storage — memory does not store secrets, only sanitized facts | Credential data must never reach memory layer at all |
| AI summarisation service | New inference client | `puterProxyService` for optional summarisation — same endpoint, same cost tracking | Avoids a second AI client implementation |
**Key insight:** The memory layer is deliberately simple — a list of plain-text facts. The complexity is in the sanitization gate, not the storage.
---
## Runtime State Inventory
> SKIPPED — this is a greenfield phase, not a rename/refactor/migration.
---
## Common Pitfalls
### Pitfall 1: Mode Check Location
**What goes wrong:** UI renders PersonalAssistantPage for all users regardless of mode, or the stream endpoint injects memory for project-builder-only users.
**Why it happens:** `nexusSettings.mode` is read from a file, not from DB — easy to forget to check it server-side.
**How to avoid:** The stream endpoint must call `nexusSettingsService().get()` and skip memory injection when `mode === "project_builder"`.
**Warning signs:** Memory facts appearing in project builder conversations.
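One way to keep the server and the UI from drifting apart is a single shared predicate. The sketch below assumes hypothetical helper names (`isAssistantEnabled`, `factsToInject`); only the three mode values come from this document.

```typescript
type NexusMode = "personal_ai" | "project_builder" | "both";

// Shared rule: the assistant is available in every mode except project_builder.
function isAssistantEnabled(mode: NexusMode): boolean {
  return mode !== "project_builder";
}

// Server side: the stream endpoint injects no facts when the assistant is disabled.
function factsToInject(mode: NexusMode, facts: string[]): string[] {
  return isAssistantEnabled(mode) ? facts : [];
}
```

The UI route guard for `PersonalAssistantPage` can call the same `isAssistantEnabled` with the mode returned by `GET /nexus/settings`.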
### Pitfall 2: companyId Scoping in Memory File Path
**What goes wrong:** All companies share a single memory file.
**Why it happens:** Forgetting `companyId` in the path, e.g. `data/assistant-memory.json` instead of `data/assistant-memory/<companyId>.json`.
**How to avoid:** The path resolver must build the file name from `companyId` inside the `assistant-memory` directory (as shown in Pattern 1 above).
**Warning signs:** Facts from workspace A appearing in workspace B.
### Pitfall 3: SSE Stream Already Flushed When Memory Read Fails
**What goes wrong:** A memory read error causes an unhandled exception after SSE headers are flushed, leaving the client connection open but broken.
**Why it happens:** The existing stream endpoint flushes headers at line 101 (`res.flushHeaders()`) before any async logic.
**How to avoid:** Read memory before calling `res.flushHeaders()`. If memory read fails, fall back gracefully (empty memory, log warning) — never throw after flush.
**Warning signs:** Client sees `:ok` event but then nothing further.
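The ordering can be sketched with a minimal response interface so it runs without Express. `SseResponse`, `loadMemoryOrEmpty`, and `startStream` are hypothetical names; the point is only that the one fallible step happens, and is caught, before anything is flushed.

```typescript
interface AssistantMemory {
  facts: string[];
}

// Just the two response methods this sketch needs.
interface SseResponse {
  flushHeaders(): void;
  write(chunk: string): void;
}

// Never throws: a failed read degrades to empty memory plus a warning.
function loadMemoryOrEmpty(
  read: () => AssistantMemory,
  warn: (message: string) => void,
): AssistantMemory {
  try {
    return read();
  } catch (err) {
    warn(`assistant memory unavailable, continuing without it: ${String(err)}`);
    return { facts: [] };
  }
}

function startStream(
  res: SseResponse,
  read: () => AssistantMemory,
  warn: (message: string) => void,
): AssistantMemory {
  const memory = loadMemoryOrEmpty(read, warn); // before any bytes are sent
  res.flushHeaders();
  res.write(":ok\n\n");
  return memory;
}
```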
### Pitfall 4: Memory Growing Without Bound
**What goes wrong:** `facts` array accumulates thousands of entries; the injected system prompt exceeds the model's context window.
**Why it happens:** No fact eviction or cap.
**How to avoid:** Cap at 50 facts (FIFO — drop oldest when limit is reached). Cap the injected system prefix at 2000 characters max.
**Warning signs:** Streaming responses truncated; model errors about context length.
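Both caps are small enough to sketch directly. The constants mirror the numbers above (50 facts, 2000 characters); the function names `appendWithCap` and `capPrefix` are illustrative.

```typescript
const MAX_FACTS = 50;
const MAX_PREFIX_CHARS = 2000;

// FIFO eviction: append, then keep only the newest `maxFacts` entries.
function appendWithCap(facts: string[], fact: string, maxFacts: number = MAX_FACTS): string[] {
  const next = [...facts, fact];
  return next.length > maxFacts ? next.slice(next.length - maxFacts) : next;
}

// Hard character cap on the injected prefix.
function capPrefix(prefix: string, maxChars: number = MAX_PREFIX_CHARS): string {
  return prefix.length > maxChars ? prefix.slice(0, maxChars) : prefix;
}
```

A plain `slice` can cut the last fact mid-sentence. If that matters, an alternative is to drop whole facts from the oldest end until the rendered prefix fits.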
### Pitfall 5: Handoff Creates Duplicate PM Conversations
**What goes wrong:** Every "Turn this into a project" click creates a new PM conversation, leading to many orphaned conversations.
**Why it happens:** The handoff endpoint creates a new conversation on every call.
**How to avoid:** For v1.5, either reuse the PM agent's most recent conversation or accept one fresh conversation per handoff and document that as a known limitation. In both cases the route should return `targetConversationId` so the UI can navigate to it and tell whether a new conversation was created.
**Warning signs:** Conversation list growing rapidly.
### Pitfall 6: Chat Route Injection Point
**What goes wrong:** Memory injection is added to the wrong handler — the echo `streamEcho` helper instead of the real AI call.
**Why it happens:** `POST /conversations/:id/stream` currently calls `svc.streamEcho()` which is a stub. Phase 33 replaces this with a real AI call. The memory injection must be added here, not in the stub.
**How to avoid:** During plan-phase, inspect the exact injection point in `server/src/routes/chat.ts` lines 107-137 and confirm the replacement strategy. (State.md blocker: "Chat route injection point needs codebase inspection — confirm correct hook location in `server/src/services/chat.ts` during plan-phase".)
**Warning signs:** Memory injected but responses are still the echo stub.
---
## Code Examples
Verified patterns from existing codebase:
### File-Backed Service Read/Write
```typescript
// Source: server/src/services/nexus-settings.ts
async function get(): Promise<NexusSettings> {
  const filePath = resolveNexusSettingsPath();
  try {
    const raw = fs.readFileSync(filePath, "utf-8");
    const parsed = nexusSettingsSchema.safeParse(JSON.parse(raw));
    if (parsed.success) return parsed.data;
    return { mode: "both" };
  } catch {
    return { mode: "both" };
  }
}

async function set(patch: Partial<NexusSettings>): Promise<NexusSettings> {
  const current = await get();
  const merged = { ...current, ...patch };
  const validated = nexusSettingsSchema.parse(merged);
  const filePath = resolveNexusSettingsPath();
  fs.mkdirSync(path.dirname(filePath), { recursive: true });
  fs.writeFileSync(filePath, JSON.stringify(validated, null, 2), "utf-8");
  return validated;
}
```
### Existing Redaction Patterns (extend for plain-text memory)
```typescript
// Source: server/src/redaction.ts
const SECRET_PAYLOAD_KEY_RE =
/(api[-_]?key|access[-_]?token|auth(?:_?token)?|authorization|bearer|secret|passwd|password|credential|jwt|private[-_]?key|cookie|connectionstring)/i;
const JWT_VALUE_RE = /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)?$/;
```
### Existing Handoff Route Pattern
```typescript
// Source: server/src/routes/chat.ts lines 163-203
router.post("/conversations/:id/handoff", async (req, res) => {
  assertBoard(req);
  const data = handoffSchema.parse(req.body);
  const conversation = await svc.getConversation(req.params.id!);
  const companyId = conversation.companyId;
  // 1. Insert handoff system message
  // 2. Create issue from spec
  // 3. Insert task_created system message
  res.json({ handoffMessageId: handoffMsg.id, issues: [issue] });
});
```
### Existing SSE Streaming Pattern
```typescript
// Source: server/src/routes/chat.ts lines 88-137
res.setHeader("Content-Type", "text/event-stream");
res.setHeader("Cache-Control", "no-cache");
res.setHeader("Connection", "keep-alive");
res.setHeader("X-Accel-Buffering", "no");
res.flushHeaders();
res.write(":ok\n\n");
// ... async generator loop
res.write(`data: ${JSON.stringify({ token })}\n\n`);
res.write(`data: ${JSON.stringify({ done: true, messageId, content })}\n\n`);
```
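For reference, a sketch of how a client could decode the frames this endpoint writes. It assumes the whole stream is already buffered as one string and that `messageId` is a string; `StreamEvent` and `parseSseFrames` are illustrative names. Lines beginning with `:` (such as the initial `:ok`) are comments in the server-sent events format and carry no data.

```typescript
type StreamEvent =
  | { token: string }
  | { done: true; messageId: string; content: string };

// Splits a buffered SSE payload into frames and decodes each `data:` line as JSON.
function parseSseFrames(buffer: string): StreamEvent[] {
  const events: StreamEvent[] = [];
  for (const frame of buffer.split("\n\n")) {
    for (const line of frame.split("\n")) {
      if (!line.startsWith("data: ")) continue; // skips comments and blank lines
      events.push(JSON.parse(line.slice("data: ".length)) as StreamEvent);
    }
  }
  return events;
}
```

A real client would feed chunks into a buffer incrementally and only parse complete frames; the `done` event is the natural point to treat the assistant turn as finished.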
### PuterProxyService Chat Stream
```typescript
// Source: server/src/services/puter-proxy.ts
async function* chatStream(
  companyId: string,
  agentId: string | null | undefined,
  messages: unknown[],
  model: string | undefined,
  signal: AbortSignal | undefined,
): AsyncGenerator<string>
// Called with messages array in OpenAI format: [{ role: "system"|"user"|"assistant", content: string }]
```
---
## State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Vector DB for memory | Summary-based file-backed facts | Phase 33 (new) | No infra overhead; no pgvector needed |
| Echo stub in stream endpoint | Real AI via puterProxyService | Phase 33 (new) | Memory injection only makes sense once real AI is wired |
**Deprecated/outdated:**
- `streamEcho()` in `chatService`: stub used for testing — Phase 33 replaces it with real AI calls in the stream route (or supplements it with a parallel assistant-stream endpoint).
---
## Open Questions
1. **Does the stream endpoint replace the echo stub or add a parallel endpoint?**
- What we know: `/conversations/:id/stream` currently calls `svc.streamEcho()`. The puter proxy has its own endpoint at `/puter-proxy/chat`.
- What's unclear: Whether to extend the existing stream endpoint to delegate to puterProxyService (if the conversation is in a personal-assistant company), or create a dedicated `/conversations/:id/assistant-stream`.
- Recommendation: Extend the existing endpoint — the UI already calls it via `chatApi.postMessageAndStream()`. The endpoint should check `nexusSettings.mode` and route to puterProxy if a token is available, else fall back to echo stub. This avoids a UI-side change.
2. **What constitutes a "memory fact" — when is it written?**
- What we know: ASST-01 says "summary-based". Facts could be extracted from (a) every assistant turn, (b) end of conversation, or (c) on demand.
- What's unclear: v1.5 success criterion 1 says "A fact stated in one chat session… is referenced correctly by the assistant in a new session." This implies facts are persisted at the end of a session or after each user-confirmed turn.
- Recommendation: Append after each assistant response turn — simpler than session-end detection, and aligns with success criterion 1 (fact from one session visible in next). The `onDone` callback in `chatApi.postMessageAndStream` is the natural trigger; or do it server-side in the stream endpoint after saving the final message.
3. **How does the personal assistant handoff interact with the existing brainstormer handoff?**
- What we know: The existing handoff (Phase 23) creates a project issue from a structured spec (`{ what, why, constraints, success }`). The assistant handoff (ASST-03) transfers conversation context to a PM agent.
- What's unclear: Whether these are the same button or different flows.
- Recommendation: Implement a separate `POST /conversations/:id/assistant-handoff` endpoint. It does not create an issue — it creates a new conversation with a context summary as a seeded system message. The UI button is "Turn this into a project" (distinct from the brainstormer's "Send to PM" which targets a structured spec).
---
## Environment Availability
> Phase is code/config-only. No new external services required.
> `puterProxyService` is already integrated (Phase 31). File I/O uses Node.js built-ins.
| Dependency | Required By | Available | Version | Fallback |
|------------|------------|-----------|---------|----------|
| Node.js `fs` module | Memory file storage | ✓ | built-in | — |
| `puterProxyService` | AI responses + optional summarisation | ✓ | internal | Fall back to echo stub if no token |
| `nexusSettingsService` | Mode check (ASST-04) | ✓ | internal | — |
| `secretService` | Token resolution in puterProxy | ✓ | internal | — |
---
## Validation Architecture
### Test Framework
| Property | Value |
|----------|-------|
| Framework | Vitest 3.x |
| Config file | `server/vitest.config.ts` |
| Quick run command | `pnpm --filter @paperclipai/server vitest run --reporter=verbose src/__tests__/33-*.test.ts` |
| Full suite command | `pnpm test:run` |
### Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| ASST-01 | Memory persists across sessions — new session includes previously stored fact in system prompt | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-memory.test.ts` | ❌ Wave 0 |
| ASST-02 | API key pasted into chat is NOT stored in memory file | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-memory-sanitization.test.ts` | ❌ Wave 0 |
| ASST-03 | Assistant handoff creates target conversation with context summary system message | unit | `pnpm --filter @paperclipai/server vitest run src/__tests__/33-assistant-handoff.test.ts` | ❌ Wave 0 |
| ASST-04 | PersonalAssistantPage visible when mode is `personal_ai` or `both`; hidden when `project_builder` | unit | `pnpm --filter @paperclipai/ui vitest run src/components/PersonalAssistantPage.test.tsx` | ❌ Wave 0 |
### Sampling Rate
- **Per task commit:** `pnpm --filter @paperclipai/server vitest run src/__tests__/33-*.test.ts`
- **Per wave merge:** `pnpm test:run`
- **Phase gate:** Full suite green before `/gsd:verify-work`
### Wave 0 Gaps
- [ ] `server/src/__tests__/33-assistant-memory.test.ts` — covers ASST-01
- [ ] `server/src/__tests__/33-memory-sanitization.test.ts` — covers ASST-02
- [ ] `server/src/__tests__/33-assistant-handoff.test.ts` — covers ASST-03
- [ ] `ui/src/components/PersonalAssistantPage.test.tsx` — covers ASST-04
---
## Sources
### Primary (HIGH confidence)
- `server/src/services/nexus-settings.ts` — file-backed JSON service pattern (confirmed by reading source)
- `server/src/services/puter-proxy.ts` — AI streaming service (confirmed by reading source)
- `server/src/routes/chat.ts` — SSE streaming pattern and handoff route (confirmed by reading source)
- `server/src/redaction.ts` — credential scrubbing patterns (confirmed by reading source)
- `packages/db/src/schema/chat_conversations.ts` — conversation schema (confirmed by reading source)
- `packages/db/src/schema/chat_messages.ts` — message schema with `messageType` column (confirmed by reading source)
- `.planning/REQUIREMENTS.md` — "DB schema changes: Out of Scope" (confirmed by reading source)
### Secondary (MEDIUM confidence)
- `.planning/ROADMAP.md` Phase 33 success criteria — defines exact behavior for ASST-01/ASST-02/ASST-03/ASST-04
- `.planning/STATE.md` Accumulated Context decisions — confirms "No DB schema changes", "Memory sanitization blocklist applied at write time"
### Tertiary (LOW confidence)
- None — all findings verified from source files.
---
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH — verified by reading existing service files
- Architecture: HIGH — follows established patterns in codebase (nexusSettingsService, redaction.ts, handoff route)
- Pitfalls: HIGH — derived from reading the actual stream endpoint code and constraint docs
**Research date:** 2026-04-01
**Valid until:** 2026-05-01 (stable codebase; these patterns won't change without major refactors)