---
phase: 37-web-chat-voice-ui
plan: 03
type: execute
wave: 2
depends_on: ["37-01"]
files_modified:
  - ui/src/components/ChatVoicePlayer.tsx
  - ui/src/components/ChatVoiceBadge.tsx
  - ui/src/components/VoiceModeToggle.tsx
autonomous: true
requirements:
  - WCHAT-04
  - WCHAT-05
  - WCHAT-06

must_haves:
  truths:
    - "ChatVoicePlayer renders inline audio player with play/pause controls"
    - "ChatVoicePlayer auto-plays when autoPlay setting is true"
    - "ChatVoiceBadge shows 'Voice' badge on voice messages"
    - "ChatVoiceBadge has collapsible full markdown section for voice_full messages"
    - "VoiceModeToggle renders three pills: Text / Voice In / Full Voice"
    - "VoiceModeToggle persists selection via useVoiceMode hook"
    - "Auto-play preference stored in localStorage under nexus:voice:autoplay"
  artifacts:
    - path: "ui/src/components/ChatVoicePlayer.tsx"
      provides: "Inline audio player for synthesized voice responses"
      exports: ["ChatVoicePlayer"]
    - path: "ui/src/components/ChatVoiceBadge.tsx"
      provides: "Voice badge + collapsible markdown on agent messages"
      exports: ["ChatVoiceBadge"]
    - path: "ui/src/components/VoiceModeToggle.tsx"
      provides: "Three-state pill toggle for voice mode"
      exports: ["VoiceModeToggle"]
  key_links:
    - from: "ui/src/components/ChatVoicePlayer.tsx"
      to: "/api/synthesize"
      via: "fetch POST to get audio blob"
      pattern: "fetch.*api/synthesize"
    - from: "ui/src/components/ChatVoiceBadge.tsx"
      to: "shadcn Collapsible"
      via: "Collapsible/CollapsibleContent/CollapsibleTrigger"
      pattern: "Collapsible"
    - from: "ui/src/components/VoiceModeToggle.tsx"
      to: "ui/src/hooks/useVoiceMode.ts"
      via: "useVoiceMode() hook"
      pattern: "useVoiceMode"
---

<objective>
Build the voice output and mode selection components: ChatVoicePlayer for inline audio playback, ChatVoiceBadge for voice message display, and VoiceModeToggle for switching between text/voice_input/full_voice modes.

Purpose: These components handle the output side of voice I/O (playing synthesized responses, showing voice badges on messages) and provide the mode selector that controls overall voice behavior.

Output: 3 new component files (ChatVoicePlayer, ChatVoiceBadge, VoiceModeToggle)
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/37-web-chat-voice-ui/37-RESEARCH.md

<interfaces>
<!-- From Plan 01: synthesize endpoint -->
```
POST /api/synthesize
Body: { text: string, voiceId?: string }
Response: audio/wav binary buffer
```

<!-- From useVoiceMode hook (Plan 02) -->
```typescript
type VoiceMode = "text" | "voice_input" | "full_voice";
export function useVoiceMode(): {
  mode: VoiceMode;
  setMode: (next: VoiceMode) => Promise<void>;
  isLoading: boolean;
}
```

<!-- ChatMessage messageType values for voice -->
```
messageType: "voice_input" → user sent via voice, agent replied with text
messageType: "voice_full" → user sent via voice, agent replied with SPOKEN + DETAILED format
```

<!-- SPOKEN/DETAILED content format from formatForVoice -->
```
SPOKEN: <concise spoken version of the response>
DETAILED: <full markdown response with code blocks etc>
```

<!-- shadcn components already available -->
```typescript
import { Badge } from "@/components/ui/badge";
import { Collapsible, CollapsibleContent, CollapsibleTrigger } from "@/components/ui/collapsible";
import { Button } from "@/components/ui/button";
```
</interfaces>
</context>

<tasks>

<task type="auto">
<name>Task 1: Create ChatVoicePlayer and ChatVoiceBadge components</name>
<files>
ui/src/components/ChatVoicePlayer.tsx,
ui/src/components/ChatVoiceBadge.tsx
</files>
<read_first>
ui/src/components/ChatMessage.tsx,
ui/src/components/ChatMarkdownMessage.tsx
</read_first>
<action>
1. **ui/src/components/ChatVoicePlayer.tsx** (inline audio player for voice responses):
```typescript
interface ChatVoicePlayerProps {
  text: string; // The spoken text to synthesize
  autoPlay?: boolean; // Whether to auto-play on mount
}
export function ChatVoicePlayer({ text, autoPlay = false }: ChatVoicePlayerProps)
```
Implementation:
- State: `status: "idle" | "loading" | "playing" | "paused"`, `audioUrl: string | null`
- On mount (or when `text` changes), fetch the audio:
  - Set status to "loading"
  - POST /api/synthesize with `{ text }`, credentials: "include"
  - Check `res.ok`; on failure, set status back to "idle" and leave `audioUrl` as null
  - Get response as blob: `const blob = await res.blob()`
  - Create object URL: `const url = URL.createObjectURL(blob)`
  - Store url in state, set status to "idle"
- Create `<audio>` element ref. Set src to audioUrl when available.
- If autoPlay is true AND audioUrl is set, call `audioRef.current.play()`. `play()` returns a promise that rejects when the browser blocks autoplay, so catch the rejection and leave status at "idle"; on success the `onPlay` listener sets status to "playing"
- Audio event listeners:
  - `onEnded`: set status to "idle". Keep the blob URL so the response can be replayed; revoke it only in cleanup
  - `onPause`: set status to "paused"
  - `onPlay`: set status to "playing"
- Render:
  - loading: `<Loader2 className="h-3 w-3 animate-spin" />` with "Loading audio..." text
  - idle/paused: `<Button variant="ghost" size="sm">` with `<Play className="h-3 w-3" />` icon. onClick: `audioRef.current.play()`. Disabled while audioUrl is null
  - playing: `<Button variant="ghost" size="sm">` with `<Pause className="h-3 w-3" />` icon. onClick: `audioRef.current.pause()`
  - Hidden `<audio ref={audioRef} />` element with aria-label="Voice response"
- Import Play, Pause, Loader2 from lucide-react
- Cleanup: revoke the current blob URL via `URL.revokeObjectURL(audioUrl)` on unmount, and before replacing it when `text` changes
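
The request and status handling described above can be sketched as two small pure helpers, which keeps the component body thin and makes the transitions easy to check in isolation. This is only a sketch under stated assumptions: `buildSynthesizeRequest`, `nextStatus`, `PlayerEvent`, and `SynthesizeInit` are hypothetical names, and sending the body as JSON with a `Content-Type` header is an assumption, since the plan lists only the body fields.

```typescript
// Hypothetical helpers; the names are illustrative and not part of the plan.
type PlayerStatus = "idle" | "loading" | "playing" | "paused";
type PlayerEvent = "fetchStart" | "fetchDone" | "play" | "pause" | "ended";

// Subset of the fetch options used here, declared locally to avoid DOM typings.
interface SynthesizeInit {
  method: "POST";
  credentials: "include";
  headers: Record<string, string>;
  body: string;
}

// Assumption: the endpoint accepts a JSON body of { text, voiceId? }.
function buildSynthesizeRequest(
  text: string,
  voiceId?: string,
): { url: string; init: SynthesizeInit } {
  return {
    url: "/api/synthesize",
    init: {
      method: "POST",
      credentials: "include",
      headers: { "Content-Type": "application/json" },
      // JSON.stringify drops voiceId when it is undefined.
      body: JSON.stringify({ text, voiceId }),
    },
  };
}

// Maps a fetch or audio event to the next player status.
function nextStatus(current: PlayerStatus, event: PlayerEvent): PlayerStatus {
  switch (event) {
    case "fetchStart":
      return "loading";
    case "fetchDone":
      return "idle";
    case "play":
      return "playing";
    case "pause":
      // Design choice of this sketch: ignore pause events unless playing.
      return current === "playing" ? "paused" : current;
    case "ended":
      return "idle";
    default:
      return current;
  }
}
```

In the component, the request would be issued as `fetch(req.url, req.init)`, and each audio listener would call `setStatus((s) => nextStatus(s, "play"))` or the matching event.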

2. **ui/src/components/ChatVoiceBadge.tsx** (voice badge + collapsible markdown):
```typescript
interface ChatVoiceBadgeProps {
  content: string;
  messageType: string; // "voice_input" | "voice_full"
  autoPlayVoice?: boolean;
}
export function ChatVoiceBadge({ content, messageType, autoPlayVoice = false }: ChatVoiceBadgeProps)
```
Implementation:
- Parse content for SPOKEN/DETAILED sections:
  ```typescript
  const spokenMatch = content.match(/SPOKEN:\s*([\s\S]*?)(?=\nDETAILED:|$)/);
  const spokenText = spokenMatch?.[1]?.trim() ?? content;
  const detailedMatch = content.match(/DETAILED:\s*([\s\S]*)/);
  ```
- Render:
  a. `<Badge variant="outline" className="text-xs mb-2">Voice</Badge>`
  b. `<p className="text-sm">{spokenText}</p>`
  c. If messageType === "voice_full":
     - `<ChatVoicePlayer text={spokenText} autoPlay={autoPlayVoice} />`
     - If detailedMatch exists, render shadcn Collapsible. Track its state with `const [open, setOpen] = useState(false)` so the trigger label can switch between the two strings:
       ```
       <Collapsible open={open} onOpenChange={setOpen}>
         <CollapsibleTrigger className="text-xs text-muted-foreground hover:text-foreground mt-1">
           {open ? "Hide full response" : "Show full response"}
         </CollapsibleTrigger>
         <CollapsibleContent>
           <ChatMarkdownMessage content={detailedMatch[1].trim()} />
         </CollapsibleContent>
       </Collapsible>
       ```
- For voice_input messageType: just show badge + spoken text, no player, no collapsible
- Import useState from react
- Import ChatVoicePlayer from ./ChatVoicePlayer
- Import ChatMarkdownMessage from ./ChatMarkdownMessage (already exists in codebase)
- Import Badge from @/components/ui/badge
- Import Collapsible, CollapsibleContent, CollapsibleTrigger from @/components/ui/collapsible
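
The SPOKEN/DETAILED parsing above can be pulled into a pure function so it can be checked without rendering anything. The following is a minimal sketch that reuses the two regular expressions from the plan; `parseVoiceContent` and `ParsedVoiceContent` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper; wraps the two regexes given in the plan.
interface ParsedVoiceContent {
  spoken: string; // text shown next to the badge and sent to the player
  detailed: string | null; // markdown for the collapsible, or null if absent
}

function parseVoiceContent(content: string): ParsedVoiceContent {
  // Lazily capture everything after "SPOKEN:" up to a "DETAILED:" line or the end.
  const spokenMatch = content.match(/SPOKEN:\s*([\s\S]*?)(?=\nDETAILED:|$)/);
  // Capture everything after "DETAILED:" to the end of the string.
  const detailedMatch = content.match(/DETAILED:\s*([\s\S]*)/);
  return {
    // Fall back to the raw content when no SPOKEN marker is present.
    spoken: spokenMatch?.[1]?.trim() ?? content,
    detailed: detailedMatch?.[1]?.trim() ?? null,
  };
}

// Example inputs: one in SPOKEN/DETAILED form and one plain text reply.
const full = parseVoiceContent("SPOKEN: Build passed.\nDETAILED: ## Result\nAll tests green.");
const plain = parseVoiceContent("Just a text reply");
```

Here `full.spoken` is `"Build passed."` and `full.detailed` is the markdown after `DETAILED:`, while `plain` falls back to the raw content with `detailed` set to `null`, which matches the voice_input case where no collapsible is rendered.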
</action>
<verify>
<automated>cd /opt/nexus/.claude/worktrees/agent-a009558f && test -f ui/src/components/ChatVoicePlayer.tsx && test -f ui/src/components/ChatVoiceBadge.tsx && grep -q "export function ChatVoicePlayer" ui/src/components/ChatVoicePlayer.tsx && grep -q "export function ChatVoiceBadge" ui/src/components/ChatVoiceBadge.tsx && grep -q "api/synthesize" ui/src/components/ChatVoicePlayer.tsx && grep -q "URL.createObjectURL" ui/src/components/ChatVoicePlayer.tsx && grep -q "URL.revokeObjectURL" ui/src/components/ChatVoicePlayer.tsx && grep -q "Collapsible" ui/src/components/ChatVoiceBadge.tsx && grep -q "Show full response" ui/src/components/ChatVoiceBadge.tsx && grep -q "Badge" ui/src/components/ChatVoiceBadge.tsx && grep -q "SPOKEN:" ui/src/components/ChatVoiceBadge.tsx && echo "PASS" || echo "FAIL"</automated>
</verify>
<acceptance_criteria>
- grep "export function ChatVoicePlayer" ui/src/components/ChatVoicePlayer.tsx returns match
- grep "export function ChatVoiceBadge" ui/src/components/ChatVoiceBadge.tsx returns match
- grep "api/synthesize" ui/src/components/ChatVoicePlayer.tsx returns match
- grep "URL.createObjectURL" ui/src/components/ChatVoicePlayer.tsx returns match
- grep "URL.revokeObjectURL" ui/src/components/ChatVoicePlayer.tsx returns match
- grep "audio" ui/src/components/ChatVoicePlayer.tsx returns match (native audio element)
- grep "aria-label.*Voice response" ui/src/components/ChatVoicePlayer.tsx returns match
- grep "Collapsible" ui/src/components/ChatVoiceBadge.tsx returns match
- grep "Show full response" ui/src/components/ChatVoiceBadge.tsx returns match
- grep "Hide full response" ui/src/components/ChatVoiceBadge.tsx returns match
- grep "Badge.*Voice" ui/src/components/ChatVoiceBadge.tsx returns match
- grep "SPOKEN:" ui/src/components/ChatVoiceBadge.tsx returns match
- grep "ChatVoicePlayer" ui/src/components/ChatVoiceBadge.tsx returns match (imports it)
</acceptance_criteria>
<done>ChatVoicePlayer synthesizes and plays audio with play/pause controls, auto-play support, and proper blob URL cleanup. ChatVoiceBadge shows Voice badge, spoken text, optional audio player, and collapsible full markdown for voice_full messages.</done>
</task>

<task type="auto">
<name>Task 2: Create VoiceModeToggle three-pill component</name>
<files>
ui/src/components/VoiceModeToggle.tsx
</files>
<read_first>
ui/src/hooks/useVoiceMode.ts
</read_first>
<action>
**ui/src/components/VoiceModeToggle.tsx** (three-state pill toggle):
```typescript
export function VoiceModeToggle()
```
Implementation:
- Call `useVoiceMode()` to get `{ mode, setMode, isLoading }`
- Read auto-play preference from localStorage: `localStorage.getItem("nexus:voice:autoplay") === "true"`
- Provide `autoPlay` state + toggle in the component for WCHAT-06 (auto-play configurable)
- Render a `<div role="group" aria-label="Voice mode" className="flex items-center gap-1">`:
  - Three pill buttons, each a `<button>`:
    - "Text" → `setMode("text")`
    - "Voice In" → `setMode("voice_input")`
    - "Full Voice" → `setMode("full_voice")`
  - Active pill: `bg-primary text-primary-foreground` classes
  - Inactive pills: `bg-muted text-muted-foreground` classes
  - All pills: `rounded-full px-3 py-1 text-xs font-medium transition-colors`
  - Disabled when isLoading
- Below the pills (only when mode is "full_voice"), render auto-play toggle:
  ```
  <label className="flex items-center gap-2 text-xs text-muted-foreground mt-1">
    <input
      type="checkbox"
      checked={autoPlay}
      onChange={(e) => {
        setAutoPlay(e.target.checked);
        localStorage.setItem("nexus:voice:autoplay", String(e.target.checked));
      }}
    />
    Auto-play voice responses
  </label>
  ```
- Sharing autoPlay with consumers: do not add a separate export or an `onAutoPlayChange` callback. Keep it simple and read `nexus:voice:autoplay` from localStorage directly in ChatVoiceBadge.
- The auto-play checkbox label text per UI spec: "Auto-play voice responses"
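
The pill styling and the auto-play persistence above can be sketched as plain helpers that the component calls from its render and change handlers. This is a sketch under assumptions: `PILLS`, `pillClassName`, `readAutoPlay`, `writeAutoPlay`, and the `KeyValueStore` interface are hypothetical names; in the browser, `window.localStorage` would be passed as the store.

```typescript
// Hypothetical helpers; only the storage key, labels, and class names come from the plan.
type VoiceMode = "text" | "voice_input" | "full_voice";

const AUTOPLAY_KEY = "nexus:voice:autoplay";

// Minimal storage shape that window.localStorage satisfies.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Pill order and labels as listed in the plan.
const PILLS: ReadonlyArray<{ value: VoiceMode; label: string }> = [
  { value: "text", label: "Text" },
  { value: "voice_input", label: "Voice In" },
  { value: "full_voice", label: "Full Voice" },
];

// Shared classes plus the active or inactive color classes.
function pillClassName(active: boolean): string {
  const base = "rounded-full px-3 py-1 text-xs font-medium transition-colors";
  const color = active ? "bg-primary text-primary-foreground" : "bg-muted text-muted-foreground";
  return `${base} ${color}`;
}

// Only the literal string "true" enables auto-play; a missing key means off.
function readAutoPlay(store: KeyValueStore): boolean {
  return store.getItem(AUTOPLAY_KEY) === "true";
}

function writeAutoPlay(store: KeyValueStore, enabled: boolean): void {
  store.setItem(AUTOPLAY_KEY, String(enabled));
}

// In-memory store standing in for localStorage in this example.
const memory = new Map<string, string>();
const fakeStore: KeyValueStore = {
  getItem: (key) => memory.get(key) ?? null,
  setItem: (key, value) => {
    memory.set(key, value);
  },
};
writeAutoPlay(fakeStore, true);
```

Each pill would then render with `className={pillClassName(mode === pill.value)}`, and the checkbox handler would call `writeAutoPlay(localStorage, e.target.checked)` alongside `setAutoPlay`.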
</action>
<verify>
<automated>cd /opt/nexus/.claude/worktrees/agent-a009558f && test -f ui/src/components/VoiceModeToggle.tsx && grep -q "export function VoiceModeToggle" ui/src/components/VoiceModeToggle.tsx && grep -q "useVoiceMode" ui/src/components/VoiceModeToggle.tsx && grep -q "Voice In" ui/src/components/VoiceModeToggle.tsx && grep -q "Full Voice" ui/src/components/VoiceModeToggle.tsx && grep -q "Text" ui/src/components/VoiceModeToggle.tsx && grep -q "bg-primary" ui/src/components/VoiceModeToggle.tsx && grep -q 'role="group"' ui/src/components/VoiceModeToggle.tsx && grep -q "nexus:voice:autoplay" ui/src/components/VoiceModeToggle.tsx && grep -q "Auto-play voice responses" ui/src/components/VoiceModeToggle.tsx && echo "PASS" || echo "FAIL"</automated>
</verify>
<acceptance_criteria>
- grep "export function VoiceModeToggle" ui/src/components/VoiceModeToggle.tsx returns match
- grep "useVoiceMode" ui/src/components/VoiceModeToggle.tsx returns match
- grep "Text" ui/src/components/VoiceModeToggle.tsx returns match (first pill)
- grep "Voice In" ui/src/components/VoiceModeToggle.tsx returns match (second pill)
- grep "Full Voice" ui/src/components/VoiceModeToggle.tsx returns match (third pill)
- grep "bg-primary text-primary-foreground" ui/src/components/VoiceModeToggle.tsx returns match (active state)
- grep "bg-muted text-muted-foreground" ui/src/components/VoiceModeToggle.tsx returns match (inactive state)
- grep 'role="group"' ui/src/components/VoiceModeToggle.tsx returns match
- grep 'aria-label="Voice mode"' ui/src/components/VoiceModeToggle.tsx returns match
- grep "nexus:voice:autoplay" ui/src/components/VoiceModeToggle.tsx returns match (localStorage key)
- grep "Auto-play voice responses" ui/src/components/VoiceModeToggle.tsx returns match
</acceptance_criteria>
<done>VoiceModeToggle renders three pills with active/inactive styling. Clicking a pill persists voiceMode to nexus-settings. Auto-play checkbox appears in full_voice mode and persists to localStorage.</done>
</task>

</tasks>

<verification>
- ChatVoicePlayer POSTs to /api/synthesize and plays via native audio element
- ChatVoicePlayer revokes blob URLs on cleanup (no memory leaks)
- ChatVoiceBadge parses SPOKEN/DETAILED content format
- ChatVoiceBadge shows collapsible section only for voice_full
- VoiceModeToggle has three pills with correct labels and accessibility
- Auto-play preference persisted in localStorage under nexus:voice:autoplay
</verification>

<success_criteria>
All three output-side voice components complete: ChatVoicePlayer plays synthesized audio with controls, ChatVoiceBadge renders voice badges with collapsible detail, VoiceModeToggle switches between text/voice_input/full_voice with persistence.
</success_criteria>

<output>
After completion, create `.planning/phases/37-web-chat-voice-ui/37-03-SUMMARY.md`
</output>