---
phase: 22
slug: agent-streaming
status: draft
nyquist_compliant: true
wave_0_complete: true
created: 2026-04-01
---
# Phase 22 — Validation Strategy
> Per-phase validation contract for feedback sampling during execution.
---
## Test Infrastructure
| Property | Value |
|----------|-------|
| **Framework** | vitest ^3.0.5 |
| **Config file** | `ui/vitest.config.ts` |
| **Quick run command** | `pnpm --filter @paperclipai/ui vitest run --reporter=verbose` |
| **Full suite command** | `pnpm vitest run` (root, all workspaces) |
| **Estimated runtime** | ~20 seconds |

**Environment note:** `ui/vitest.config.ts` sets `environment: "node"`. Tests that need a DOM opt in with a file-level `// @vitest-environment jsdom` annotation.

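
As an illustration of the annotation described in the environment note, a DOM-dependent test file might start as follows. This is a minimal sketch, not a file from this phase: the `describe`/`it` names and the assertion are hypothetical, and it assumes `jsdom` is installed in the `ui` workspace.

```typescript
// @vitest-environment jsdom
// The control comment above must sit at the top of the file; it overrides the
// `environment: "node"` default from ui/vitest.config.ts for this file only.
import { describe, expect, it } from "vitest";

// Hypothetical test, shown only to demonstrate that `document` is available.
describe("DOM-dependent example", () => {
  it("can create elements because jsdom provides `document`", () => {
    const el = document.createElement("div");
    el.textContent = "hello";
    expect(el.textContent).toBe("hello");
  });
});
```

Test files without the annotation keep running in the `node` environment, which avoids paying the jsdom startup cost for pure-logic tests.
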
---
## Sampling Rate
- **After every task commit:** Run the relevant test file(s) using the task's verify command
- **After every plan wave:** Run `pnpm vitest run`
- **Before `/gsd:verify-work`:** Full suite must be green
- **Max feedback latency:** 20 seconds
---
## Per-Task Verification Map
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|-----------|-------------------|-------------|--------|
| 22-00-01 | 00 | 0 | THEME-03 | unit | `pnpm --filter @paperclipai/ui vitest run src/lib/agent-role-colors.test.ts` | Created in W0 | pending |
| 22-00-02 | 00 | 0 | (scaffolds) | stub | `pnpm --filter @paperclipai/ui vitest run` | Created in W0 | pending |
| 22-01-01 | 01 | 1 | PERF-02 | unit+grep | `tsc --noEmit` + flushHeaders position check | N/A | pending |
| 22-01-02 | 01 | 1 | CHAT-01, CHAT-12 | unit | `pnpm --filter @paperclipai/ui vitest run src/hooks/useStreamingChat.test.ts` | Wave 0 (stubs replaced with real tests in 01-02) | pending |
| 22-02-01 | 02 | 1 | AGENT-04, THEME-03 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatMessageIdentityBar.test.tsx` | Wave 0 | pending |
| 22-02-02 | 02 | 1 | CHAT-08 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatAgentSelector.test.tsx` | Wave 0 | pending |
| 22-03-01 | 03 | 2 | CHAT-10, CHAT-11, CHAT-12 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatMessage.test.tsx` | Wave 0 | pending |
| 22-04-01 | 04 | 2 | INPUT-05 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatSlashCommandPopover.test.tsx` | Wave 0 | pending |
| 22-04-02 | 04 | 2 | INPUT-06 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatMentionPopover.test.tsx` | Wave 0 | pending |
| 22-05-01 | 05 | 3 | PERF-03 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatMessageList.test.tsx` | Wave 0 | pending |
| 22-05-02 | 05 | 3 | PERF-02, CHAT-01, CHAT-08, CHAT-10, CHAT-11, CHAT-12, INPUT-05, INPUT-06 | tsc+manual | `tsc --noEmit` + human verify checkpoint | N/A | pending |

*Status: pending / green / red / flaky*

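
The table describes the 22-01-01 check only as a "flushHeaders position check". One possible reading is that `flushHeaders()` must be called before the first `write()` in the handler source. The sketch below encodes that reading as a text-position check; the function name and the sample handler strings are hypothetical, not taken from the codebase.

```typescript
// Hypothetical helper: returns true when `flushHeaders(` appears in the
// source text before the first `.write(` call. This is a plain text
// position check (similar to grep), not a real parse of the code.
function flushHeadersBeforeFirstWrite(source: string): boolean {
  const flushIndex = source.indexOf("flushHeaders(");
  if (flushIndex === -1) return false; // headers are never flushed
  const writeIndex = source.indexOf(".write(");
  // A source with no write at all still counts as "flushed before any write".
  return writeIndex === -1 || flushIndex < writeIndex;
}

// Illustrative handler bodies only.
const goodHandler = `
  res.setHeader("Content-Type", "text/event-stream");
  res.flushHeaders();
  res.write("data: hello");
`;
const badHandler = `
  res.write("data: hello");
  res.flushHeaders();
`;

console.log(flushHeadersBeforeFirstWrite(goodHandler)); // true
console.log(flushHeadersBeforeFirstWrite(badHandler)); // false
```

A text check like this is cheap enough to run after every commit, but it can be fooled by comments or by multiple handlers in one file, so it complements `tsc --noEmit` rather than replacing a behavioral test.
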
---
## Wave 0 Requirements
- [x] `ui/src/lib/agent-role-colors.test.ts` — covers THEME-03 agent colors (real test with uniqueness check)
- [x] `ui/src/hooks/useStreamingChat.test.ts` — covers CHAT-01, CHAT-12 streaming hook (stubs; replaced with real tests in Plan 01)
- [x] `ui/src/components/ChatAgentSelector.test.tsx` — covers CHAT-08 agent selection
- [x] `ui/src/components/ChatMessage.test.tsx` — covers CHAT-10, CHAT-11 edit/retry
- [x] `ui/src/components/ChatSlashCommandPopover.test.tsx` — covers INPUT-05 slash commands
- [x] `ui/src/components/ChatMentionPopover.test.tsx` — covers INPUT-06 @mention
- [x] `ui/src/components/ChatMessageIdentityBar.test.tsx` — covers AGENT-04 identity
- [x] `ui/src/components/ChatMessageList.test.tsx` — covers PERF-03 virtualization
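
The "uniqueness check" mentioned for `agent-role-colors.test.ts` could look roughly like the sketch below. The role names, color values, and the `findDuplicateColors` helper are assumptions made for illustration; the real map lives in the project's agent-role-colors module and is not shown in this document.

```typescript
// Hypothetical role-to-color map, for illustration only.
const AGENT_ROLE_COLORS: Record<string, string> = {
  planner: "#2563eb",
  coder: "#16a34a",
  reviewer: "#d97706",
};

// Returns the color values that are assigned to more than one role.
// Values are trimmed and lowercased so "#FFF" and "#fff " count as equal.
function findDuplicateColors(colors: Record<string, string>): string[] {
  const seen: string[] = [];
  const duplicates: string[] = [];
  for (const role of Object.keys(colors)) {
    const normalized = colors[role].trim().toLowerCase();
    if (seen.indexOf(normalized) === -1) {
      seen.push(normalized);
    } else if (duplicates.indexOf(normalized) === -1) {
      duplicates.push(normalized);
    }
  }
  return duplicates;
}

console.log(findDuplicateColors(AGENT_ROLE_COLORS)); // []
```

In a vitest file the same helper would back a single assertion that the duplicate list is empty. Note that uniqueness of values does not guarantee visual distinguishability, which is why THEME-03 also appears under manual verifications.
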
---
## Manual-Only Verifications
| Behavior | Requirement | Why Manual | Test Instructions |
|----------|-------------|------------|-------------------|
| First token in under 500 ms | PERF-02 | Timing depends on LLM response latency | Open chat, send a message, measure the time until the first token appears |
| Agent colors distinguishable across themes | THEME-03 | Visual distinction | Switch between all 3 themes, verify agent name colors are readable and distinct from each other |
| 1,000+ messages scroll without jank | PERF-03 | Performance testing | Load a conversation with 1,000+ messages, scroll rapidly, verify there is no visible jank |
| Retry uses actual prior user message | CHAT-11 | Interaction flow | Click retry on an assistant message, verify the regenerated response answers the original user message |
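
For the PERF-02 check, time to first token can be captured in code rather than by eye. The sketch below is one way to do it, assuming the chat response is exposed as an `AsyncIterable<string>`; `measureTimeToFirstToken`, its `start` parameter, and `fakeStream` are hypothetical names, not part of the project's API.

```typescript
// Measures the delay between calling `start()` and receiving the first
// chunk from the stream it returns. `start` is a hypothetical stand-in
// for whatever sends the chat message and yields streamed tokens.
// `now` is injectable so the function can be tested with a fake clock.
async function measureTimeToFirstToken(
  start: () => AsyncIterable<string>,
  now: () => number = () => Date.now(),
): Promise<number | null> {
  const startedAt = now();
  for await (const _chunk of start()) {
    return now() - startedAt; // stop at the first chunk
  }
  return null; // stream ended without producing a token
}

// Fake stream for demonstration only: first token after roughly 50 ms.
async function* fakeStream(): AsyncGenerator<string> {
  await new Promise((resolve) => setTimeout(resolve, 50));
  yield "Hello";
  yield " world";
}

measureTimeToFirstToken(fakeStream).then((ms) => {
  console.log(ms !== null && ms < 500 ? "within budget" : "over budget");
});
```

Because the real number depends on LLM latency, a measurement like this is better logged during the manual check than asserted in the automated suite.
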
---
## Validation Sign-Off
- [x] All tasks have `<automated>` verify or Wave 0 dependencies
- [x] Sampling continuity: no 3 consecutive tasks without automated verify
- [x] Wave 0 covers all MISSING references
- [x] No watch-mode flags
- [x] Feedback latency < 20s
- [x] `nyquist_compliant: true` set in frontmatter

**Approval:** approved
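
The sampling continuity rule in the sign-off list ("no 3 consecutive tasks without automated verify") can be checked mechanically. The following is a minimal sketch under the assumption that each task is reduced to an ID and an `automated` flag; the `TaskVerify` type, both function names, and the sample data are hypothetical.

```typescript
interface TaskVerify {
  id: string;
  automated: boolean; // true when the task has an automated verify command
}

// Returns the length of the longest run of consecutive tasks that lack
// an automated verify command.
function longestManualRun(tasks: TaskVerify[]): number {
  let longest = 0;
  let current = 0;
  for (const task of tasks) {
    current = task.automated ? 0 : current + 1;
    if (current > longest) longest = current;
  }
  return longest;
}

// Continuity holds when fewer than `maxGap` consecutive tasks lack
// automated verification (maxGap = 3 matches the rule in the checklist).
function hasSamplingContinuity(tasks: TaskVerify[], maxGap = 3): boolean {
  return longestManualRun(tasks) < maxGap;
}

// Illustrative data only; not the phase's actual task list.
const exampleTasks: TaskVerify[] = [
  { id: "a", automated: true },
  { id: "b", automated: false },
  { id: "c", automated: false },
  { id: "d", automated: true },
  { id: "e", automated: false },
];

console.log(longestManualRun(exampleTasks)); // 2
console.log(hasSamplingContinuity(exampleTasks)); // true
```
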