nexus/.planning/phases/22-agent-streaming/22-VALIDATION.md

| phase | slug | status | nyquist_compliant | wave_0_complete | created |
| --- | --- | --- | --- | --- | --- |
| 22 | agent-streaming | draft | true | true | 2026-04-01 |

Phase 22 — Validation Strategy

Per-phase validation contract for feedback sampling during execution.


Test Infrastructure

| Property | Value |
| --- | --- |
| Framework | vitest ^3.0.5 |
| Config file | `ui/vitest.config.ts` |
| Quick run command | `pnpm --filter @paperclipai/ui vitest run --reporter=verbose` |
| Full suite command | `pnpm vitest run` (root, all workspaces) |
| Estimated runtime | ~20 seconds |

Environment note: `ui/vitest.config.ts` sets `environment: "node"`. Tests that need a DOM opt in with a file-level `// @vitest-environment jsdom` annotation.


Sampling Rate

  • After every task commit: Run relevant test file(s) per task verify command
  • After every plan wave: Run pnpm vitest run
  • Before /gsd:verify-work: Full suite must be green
  • Max feedback latency: 20 seconds
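
The sampling rules above can be read as a mapping from checkpoint to command. The following is a minimal sketch under that reading, not part of the phase tooling: the `Checkpoint` type and the `commandFor` and `withinLatencyBudget` helpers are hypothetical names, and only the two pnpm commands and the 20-second budget come from this document. It covers vitest-based tasks only; tasks verified with `tsc --noEmit` would need their own branch.

```typescript
// Hypothetical types and helpers; only the pnpm commands and the
// 20-second budget are taken from this document.
type Checkpoint =
  | { kind: "task-commit"; testFiles: string[] }
  | { kind: "plan-wave" }
  | { kind: "pre-verify-work" };

const QUICK_RUN = "pnpm --filter @paperclipai/ui vitest run";
const FULL_SUITE = "pnpm vitest run";
const MAX_FEEDBACK_LATENCY_MS = 20000;

// Pick the command to run at a given checkpoint.
function commandFor(checkpoint: Checkpoint): string {
  switch (checkpoint.kind) {
    case "task-commit":
      // Run only the test files named by the task's verify command.
      return [QUICK_RUN, ...checkpoint.testFiles].join(" ");
    case "plan-wave":
    case "pre-verify-work":
      // Both checkpoints require the full suite.
      return FULL_SUITE;
  }
}

// True when a measured run stays within the feedback latency budget.
function withinLatencyBudget(elapsedMs: number): boolean {
  return elapsedMs <= MAX_FEEDBACK_LATENCY_MS;
}
```

For example, `commandFor({ kind: "task-commit", testFiles: ["src/components/ChatMessage.test.tsx"] })` yields the quick-run command followed by that file path, matching the shape of the per-task commands in this document.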

Per-Task Verification Map

| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 22-00-01 | 00 | 0 | THEME-03 | unit | `pnpm --filter @paperclipai/ui vitest run src/lib/agent-role-colors.test.ts` | Created in W0 | pending |
| 22-00-02 | 00 | 0 | (scaffolds) | stub | `pnpm --filter @paperclipai/ui vitest run` | Created in W0 | pending |
| 22-01-01 | 01 | 1 | PERF-02 | unit+grep | `tsc --noEmit` + flushHeaders position check | N/A | pending |
| 22-01-02 | 01 | 1 | CHAT-01, CHAT-12 | unit | `pnpm --filter @paperclipai/ui vitest run src/hooks/useStreamingChat.test.ts` | Wave 0 (stubs replaced with real tests in 01-02) | pending |
| 22-02-01 | 02 | 1 | AGENT-04, THEME-03 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatMessageIdentityBar.test.tsx` | Wave 0 | pending |
| 22-02-02 | 02 | 1 | CHAT-08 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatAgentSelector.test.tsx` | Wave 0 | pending |
| 22-03-01 | 03 | 2 | CHAT-10, CHAT-11, CHAT-12 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatMessage.test.tsx` | Wave 0 | pending |
| 22-04-01 | 04 | 2 | INPUT-05 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatSlashCommandPopover.test.tsx` | Wave 0 | pending |
| 22-04-02 | 04 | 2 | INPUT-06 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatMentionPopover.test.tsx` | Wave 0 | pending |
| 22-05-01 | 05 | 3 | PERF-03 | unit | `pnpm --filter @paperclipai/ui vitest run src/components/ChatMessageList.test.tsx` | Wave 0 | pending |
| 22-05-02 | 05 | 3 | PERF-02, CHAT-01, CHAT-08, CHAT-10, CHAT-11, CHAT-12, INPUT-05, INPUT-06 | tsc+manual | `tsc --noEmit` + human verify checkpoint | N/A | pending |

Status: pending / green / red / flaky


Wave 0 Requirements

  • ui/src/lib/agent-role-colors.test.ts — covers THEME-03 agent colors (real test with uniqueness check)
  • ui/src/hooks/useStreamingChat.test.ts — covers CHAT-01, CHAT-12 streaming hook (stubs; replaced with real tests in Plan 01)
  • ui/src/components/ChatAgentSelector.test.tsx — covers CHAT-08 agent selection
  • ui/src/components/ChatMessage.test.tsx — covers CHAT-10, CHAT-11 edit/retry
  • ui/src/components/ChatSlashCommandPopover.test.tsx — covers INPUT-05 slash commands
  • ui/src/components/ChatMentionPopover.test.tsx — covers INPUT-06 @mention
  • ui/src/components/ChatMessageIdentityBar.test.tsx — covers AGENT-04 identity
  • ui/src/components/ChatMessageList.test.tsx — covers PERF-03 virtualization
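
As an illustration of how the Wave 0 list could be cross-checked against requirement IDs, here is a minimal sketch. The `wave0Coverage` map restates the list above; the `uncoveredRequirements` helper is a hypothetical name, not an existing script in this repository.

```typescript
// Test file -> requirement IDs, restated from the Wave 0 list above.
const wave0Coverage: Record<string, string[]> = {
  "ui/src/lib/agent-role-colors.test.ts": ["THEME-03"],
  "ui/src/hooks/useStreamingChat.test.ts": ["CHAT-01", "CHAT-12"],
  "ui/src/components/ChatAgentSelector.test.tsx": ["CHAT-08"],
  "ui/src/components/ChatMessage.test.tsx": ["CHAT-10", "CHAT-11"],
  "ui/src/components/ChatSlashCommandPopover.test.tsx": ["INPUT-05"],
  "ui/src/components/ChatMentionPopover.test.tsx": ["INPUT-06"],
  "ui/src/components/ChatMessageIdentityBar.test.tsx": ["AGENT-04"],
  "ui/src/components/ChatMessageList.test.tsx": ["PERF-03"],
};

// Hypothetical helper: return the required IDs that no Wave 0 file covers.
function uncoveredRequirements(
  required: string[],
  coverage: Record<string, string[]>
): string[] {
  const covered = new Set<string>();
  for (const file of Object.keys(coverage)) {
    for (const id of coverage[file]) {
      covered.add(id);
    }
  }
  return required.filter((id) => !covered.has(id));
}
```

Called with `["CHAT-01", "CHAT-12", "PERF-02"]`, the helper returns `["PERF-02"]`, since PERF-02 has no Wave 0 test file in the list above.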

Manual-Only Verifications

| Behavior | Requirement | Why Manual | Test Instructions |
| --- | --- | --- | --- |
| First token under 500 ms | PERF-02 | Timing depends on LLM response | Open chat, send a message, and measure the time until the first token appears |
| Agent colors distinguishable across themes | THEME-03 | Visual distinction | Switch between all 3 themes and verify that agent name colors are readable |
| 1,000+ messages scroll without jank | PERF-03 | Performance testing | Load a conversation with 1,000+ messages and scroll rapidly |
| Retry uses actual prior user message | CHAT-11 | Interaction flow | Click retry on an assistant message and verify that the regenerated response is based on the original user message |
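
The PERF-02 check stays manual, but the measurement itself can be sketched in code. This is a rough illustration under stated assumptions: it treats the chat response as an `AsyncIterable<string>`, which this document does not confirm, and `timeToFirstToken`, `meetsFirstTokenBudget`, and `fakeStream` are hypothetical names.

```typescript
const FIRST_TOKEN_BUDGET_MS = 500;

// Hypothetical helper: milliseconds until the stream yields its first chunk.
async function timeToFirstToken(stream: AsyncIterable<string>): Promise<number> {
  const start = Date.now();
  for await (const _chunk of stream) {
    // Returning here stops consuming the stream after the first chunk.
    return Date.now() - start;
  }
  throw new Error("stream ended before producing a token");
}

// "Under 500 ms" is read as a strict bound.
function meetsFirstTokenBudget(elapsedMs: number): boolean {
  return elapsedMs < FIRST_TOKEN_BUDGET_MS;
}

// Stand-in for a streaming response; a real check would wrap the chat stream.
async function* fakeStream(delayMs: number): AsyncGenerator<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  yield "Hello";
  yield " world";
}
```

A measurement like this only times the client side of the stream, so it complements rather than replaces the manual check, which also covers when the token actually appears on screen.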

Validation Sign-Off

  • All tasks have <automated> verify or Wave 0 dependencies
  • Sampling continuity: no 3 consecutive tasks without automated verify
  • Wave 0 covers all MISSING references
  • No watch-mode flags
  • Feedback latency < 20s
  • nyquist_compliant: true set in frontmatter
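
The sampling continuity rule in the list above can be expressed as a small check. This is a sketch under the assumption that each task is reduced to an ID and a flag saying whether it has an automated verify step; the `TaskRow` shape and `hasSamplingContinuity` name are hypothetical.

```typescript
// Hypothetical row shape: one entry per task, in execution order.
interface TaskRow {
  id: string;
  automated: boolean;
}

// True when no run of tasks without automated verify exceeds maxGap.
// The default of 2 encodes "no 3 consecutive tasks without automated verify".
function hasSamplingContinuity(tasks: TaskRow[], maxGap: number = 2): boolean {
  let run = 0;
  for (const task of tasks) {
    run = task.automated ? 0 : run + 1;
    if (run > maxGap) {
      return false;
    }
  }
  return true;
}
```

Two consecutive non-automated tasks pass the check, while a third in a row fails it.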

Approval: approved