| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | must_haves |
|---|---|---|---|---|---|---|---|---|
| 37-web-chat-voice-ui | 04 | execute | 3 | | | false | | |
Purpose: This is the integration plan that connects all Phase 37 components to the existing chat UI. Without this wiring, the components exist but aren't used.
Output: 5 modified files connecting voice I/O to the chat system
<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/phases/37-web-chat-voice-ui/37-RESEARCH.md
@.planning/phases/37-web-chat-voice-ui/37-02-SUMMARY.md
@.planning/phases/37-web-chat-voice-ui/37-03-SUMMARY.md

```typescript
interface VoiceMicButtonProps {
  onTranscript: (text: string) => void;
  disabled?: boolean;
}
export function VoiceMicButton({ onTranscript, disabled }: VoiceMicButtonProps)
```

```typescript
export function VoiceModeToggle()
// Uses useVoiceMode() internally; renders three pills + auto-play checkbox
```

```typescript
interface ChatVoiceBadgeProps {
  content: string;
  messageType: string; // "voice_input" | "voice_full"
  autoPlayVoice?: boolean;
}
export function ChatVoiceBadge({ content, messageType, autoPlayVoice }: ChatVoiceBadgeProps)
```

```typescript
type VoiceMode = "text" | "voice_input" | "full_voice";
export function useVoiceMode(): { mode: VoiceMode; setMode: (v: VoiceMode) => Promise<void>; isLoading: boolean }
```

```typescript
interface ChatInputProps {
  onSend: (content: string) => void;
  isSubmitting?: boolean;
  disabled?: boolean;
  placeholder?: string;
  agents?: Agent[];
  agentsLoading?: boolean;
  onFilesPicked?: (files: File[]) => void;
  pendingFiles?: PendingFile[];
  onRemoveFile?: (id: string) => void;
  enableVoiceInput?: boolean; // Controls VoiceRecordButton visibility
}
```

```typescript
export function useStreamingChat(conversationId: string | null) {
  // startStream(userMessage: string, agentId?: string): needs voiceMode param added
  return { streamingContent, isStreaming, startStream, stop };
}
```

```typescript
async postMessageAndStream(
  conversationId: string,
  data: { content: string; agentId?: string }, // needs voiceMode added
  callbacks: { onToken, onDone, onError },
  signal?: AbortSignal,
): Promise<void>
```

```typescript
// handleSend calls startStream(content, resolvedAgentId): needs voiceMode
```
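The `messageType` comment on `ChatVoiceBadgeProps` names two voice values, `"voice_input"` and `"voice_full"`, while the `VoiceMode` type uses `"full_voice"` for the mode. A small type guard can keep the two vocabularies from being mixed up. This is only a sketch: `VOICE_MESSAGE_TYPES`, `VoiceMessageType`, and `isVoiceMessageType` are hypothetical names that do not appear in the plan.

```typescript
// Hypothetical helper, not part of the plan: narrows a raw messageType string
// to the two voice values named in the ChatVoiceBadgeProps comment.
const VOICE_MESSAGE_TYPES = ["voice_input", "voice_full"] as const;
type VoiceMessageType = (typeof VOICE_MESSAGE_TYPES)[number];

function isVoiceMessageType(messageType: string): messageType is VoiceMessageType {
  // True only for an exact match against one of the two message types.
  return VOICE_MESSAGE_TYPES.some((known) => known === messageType);
}
```

With such a guard, the mode value `"full_voice"` is rejected as a message type, which makes the naming difference between modes and message types explicit at the call site.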
Task 1: Thread voiceMode through chatApi and useStreamingChat
Files: ui/src/api/chat.ts, ui/src/hooks/useStreamingChat.ts
Read first: ui/src/api/chat.ts, ui/src/hooks/useStreamingChat.ts
1. **ui/src/api/chat.ts**: extend the `postMessageAndStream` data parameter.
   - Change the `data` parameter type from `{ content: string; agentId?: string }` to `{ content: string; agentId?: string; voiceMode?: string }`.
   - The body is already sent as `JSON.stringify(data)`, so voiceMode is included automatically when present.
   - No other changes are needed: the server's chat.ts stream handler already reads voiceMode from req.body (added in Plan 01).
2. **ui/src/hooks/useStreamingChat.ts**: extend `startStream` to accept voiceMode.
   - Change the `startStream` signature from `(userMessage: string, agentId?: string)` to `(userMessage: string, agentId?: string, voiceMode?: string)`.
   - Pass voiceMode through to `chatApi.postMessageAndStream`: `chatApi.postMessageAndStream(conversationId, { content: userMessage, agentId, voiceMode }, { onToken, onDone, onError }, abort.signal);`
   - Add `voiceMode` to the useCallback dependency array only if needed (it is a parameter, not state, so it should not need to be).

cd /opt/nexus/.claude/worktrees/agent-a009558f && grep -q "voiceMode" ui/src/api/chat.ts && grep -q "voiceMode" ui/src/hooks/useStreamingChat.ts && grep -A 3 "postMessageAndStream" ui/src/api/chat.ts | grep -q "voiceMode" && echo "PASS" || echo "FAIL"

<acceptance_criteria>
- grep "voiceMode" ui/src/api/chat.ts returns match in postMessageAndStream data type
- grep "voiceMode" ui/src/hooks/useStreamingChat.ts returns match in startStream signature
- grep "voiceMode" ui/src/hooks/useStreamingChat.ts returns match in postMessageAndStream call
</acceptance_criteria>

chatApi.postMessageAndStream sends voiceMode in request body. useStreamingChat.startStream accepts and forwards voiceMode parameter.
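The claim that voiceMode is "included automatically when present" rests on how `JSON.stringify` treats optional fields: properties whose value is `undefined` are left out of the output. The sketch below illustrates that behaviour. `buildStreamBody` is a hypothetical stand-in for the serialization step inside `postMessageAndStream`, not the actual implementation.

```typescript
// Hypothetical stand-in for the body serialization in postMessageAndStream.
// Parameter order mirrors the extended startStream signature from the plan.
function buildStreamBody(content: string, agentId?: string, voiceMode?: string): string {
  // JSON.stringify omits object properties whose value is undefined,
  // so callers that pass no voiceMode produce the same body as before.
  return JSON.stringify({ content, agentId, voiceMode });
}

const legacyBody = buildStreamBody("hello");
const voiceBody = buildStreamBody("hello", undefined, "full_voice");

console.log(legacyBody); // {"content":"hello"}
console.log(voiceBody); // {"content":"hello","voiceMode":"full_voice"}
```

Because of this, existing callers that omit the third argument should keep sending an unchanged request body.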
- **ui/src/components/ChatMessage.tsx**: add ChatVoiceBadge for voice messages.
  - Add the import: `import { ChatVoiceBadge } from "./ChatVoiceBadge";`
  - In the messageType dispatch block (after the existing spec_card, handoff, task_created, status_update checks), add:

    ```tsx
    if (messageType === "voice_input" || messageType === "voice_full") {
      const autoPlay =
        typeof window !== "undefined"
          ? localStorage.getItem("nexus:voice:autoplay") === "true"
          : false;
      return (
        <div className="max-w-full group relative">
          {agentName && (
            <ChatMessageIdentityBar
              agentName={agentName}
              agentIcon={agentIcon}
              agentRole={agentRole}
              timestamp={timestamp}
              isStreaming={isStreaming}
            />
          )}
          <ChatVoiceBadge
            content={content}
            messageType={messageType}
            autoPlayVoice={autoPlay}
          />
          {isStreaming && <ChatStreamingCursor />}
          <ChatMessageActions
            role="assistant"
            isStreaming={isAnyStreaming}
            onRetry={id && onRetry ? () => onRetry(id) : undefined}
            onBookmark={id && onBookmark ? () => onBookmark(id) : undefined}
            isBookmarked={isBookmarked}
          />
        </div>
      );
    }
    ```
  - Place this BEFORE the general "fall through to default system message rendering" comment, but AFTER the status_update check.
- **ui/src/components/ChatPanel.tsx**: connect useVoiceMode and pass voiceMode to startStream.
  - Add the import: `import { useVoiceMode } from "../hooks/useVoiceMode";`
  - Inside the ChatPanel component, call the hook: `const { mode: voiceMode } = useVoiceMode();`
  - Find ALL calls to `startStream(content, agentId)` (there are ~5 of them per the read_first scan) and add voiceMode as the third argument: `startStream(content, resolvedAgentId ?? undefined, voiceMode);`
  - The five locations are approximately:
    - In handleSend: `startStream(content, resolvedAgentId ?? undefined)` (two calls, in the online and offline branches)
    - In the handleEdit callback: `startStream(newContent, activeAgentId ?? undefined)`
    - In handleRetry: `startStream(newContent, activeAgentId ?? undefined)`
    - In retry from error: `startStream(lastUserContent, activeAgentId ?? undefined)`
  - Update each to include `voiceMode` as the third argument.
  - Keep `enableVoiceInput={true}` on ChatInput (or however it is currently set) rather than deriving it from voiceMode. The VoiceModeToggle handles mode selection independently, and the mic button should always be visible when voice is available.
  - Check how enableVoiceInput is currently set in ChatPanel. If it is hardcoded or conditional, ensure it stays true so VoiceMicButton renders.

cd /opt/nexus/.claude/worktrees/agent-a009558f && grep -q "VoiceMicButton" ui/src/components/ChatInput.tsx && grep -q "VoiceModeToggle" ui/src/components/ChatInput.tsx && ! grep -q "VoiceRecordButton" ui/src/components/ChatInput.tsx && grep -q "ChatVoiceBadge" ui/src/components/ChatMessage.tsx && grep -qE "voice_input|voice_full" ui/src/components/ChatMessage.tsx && grep -q "useVoiceMode" ui/src/components/ChatPanel.tsx && grep -q "voiceMode" ui/src/components/ChatPanel.tsx && echo "PASS" || echo "FAIL"

<acceptance_criteria>
- grep "VoiceMicButton" ui/src/components/ChatInput.tsx returns match
- grep "VoiceModeToggle" ui/src/components/ChatInput.tsx returns match
- grep "VoiceRecordButton" ui/src/components/ChatInput.tsx returns NO match (replaced)
- grep "ChatVoiceBadge" ui/src/components/ChatMessage.tsx returns match
- grep "voice_input" ui/src/components/ChatMessage.tsx returns match
- grep "voice_full" ui/src/components/ChatMessage.tsx returns match
- grep "nexus:voice:autoplay" ui/src/components/ChatMessage.tsx returns match (reads localStorage)
- grep "useVoiceMode" ui/src/components/ChatPanel.tsx returns match
- grep "voiceMode" ui/src/components/ChatPanel.tsx appears in startStream calls
- grep "startStream.*voiceMode" ui/src/components/ChatPanel.tsx returns match
</acceptance_criteria>

ChatInput uses VoiceMicButton (VAD-powered) instead of VoiceRecordButton. VoiceModeToggle shown above input. ChatMessage renders ChatVoiceBadge for voice messages. ChatPanel passes voiceMode to all startStream calls.
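The ChatMessage step reads the auto-play preference from localStorage under the key `nexus:voice:autoplay` and treats only the exact string `"true"` as enabled. If that read ever needs to be shared or tested outside a browser, it could be pulled into a small helper along these lines. This is a sketch under assumptions: `ReadableStorage`, `AUTOPLAY_KEY`, and `readAutoPlay` are hypothetical names, and the try/catch is an added precaution that the plan does not ask for.

```typescript
// Minimal subset of the Web Storage API used here, so the helper can be
// exercised without a browser.
interface ReadableStorage {
  getItem(key: string): string | null;
}

const AUTOPLAY_KEY = "nexus:voice:autoplay"; // key taken from the plan

// Hypothetical helper: returns true only when the stored value is exactly "true".
function readAutoPlay(storage: ReadableStorage | undefined): boolean {
  if (!storage) return false; // no storage available, e.g. outside a browser
  try {
    return storage.getItem(AUTOPLAY_KEY) === "true";
  } catch {
    // Storage access can throw (for example when blocked by browser settings).
    return false;
  }
}

// In the browser this would be called as:
//   readAutoPlay(typeof window !== "undefined" ? window.localStorage : undefined)
```

Passing the storage object in as a parameter keeps the `typeof window` check at the call site and lets the helper be tested with a plain object.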
What was built across all Phase 37 plans:
- VoiceMicButton with VAD auto-stop replacing VoiceRecordButton
- VoiceWaveform canvas animation during recording
- VoiceModeToggle (Text / Voice In / Full Voice) with nexus-settings persistence
- ChatVoiceBadge with collapsible full markdown for voice_full messages
- ChatVoicePlayer with play/pause and auto-play from localStorage
- voiceMode threaded through ChatPanel -> useStreamingChat -> chatApi -> server chat.ts
cd /opt/nexus/.claude/worktrees/agent-a009558f && grep -q "VoiceMicButton" ui/src/components/ChatInput.tsx && grep -q "ChatVoiceBadge" ui/src/components/ChatMessage.tsx && grep -q "voiceMode" ui/src/components/ChatPanel.tsx && echo "PASS" || echo "FAIL"
<acceptance_criteria>
- VoiceModeToggle visible above chat input with three pills
- Mic button starts recording with waveform animation
- Recording auto-stops on silence detection
- Transcribed text populates input field
- Voice badge appears on agent responses in voice modes
- Audio player works for voice_full messages
- Auto-play toggle persists across page refresh
</acceptance_criteria>
End-to-end voice flow verified by human: recording, VAD auto-stop, transcription, voice mode toggle, voice badge, audio playback, and auto-play setting all working correctly.
<success_criteria> Complete voice I/O working in browser chat: VAD-powered recording with waveform, auto-stop on silence, voice mode toggle with persistence, voice badge on responses, inline audio player with auto-play setting. User can have a full voice conversation with their agent. </success_criteria>
After completion, create `.planning/phases/37-web-chat-voice-ui/37-04-SUMMARY.md`