# Phase 37: Web Chat Voice UI - Research

**Researched:** 2026-04-03
**Domain:** Browser voice I/O: VAD, MediaRecorder, Web Audio API, waveform visualization, audio playback, COOP/COEP headers
**Confidence:** HIGH

---

## User Constraints (from CONTEXT.md)

### Locked Decisions

All implementation choices are at Claude's discretion; the discuss phase was skipped per user setting.

### Claude's Discretion

All implementation details. Use ROADMAP phase goal, success criteria, and codebase conventions.

Key research findings baked into context:

- `@ricky0123/vad-react ^0.0.36` for browser-side silence detection (VAD)
- COOP/COEP headers required on Express server for SharedArrayBuffer
- Waveform via Web Audio API AnalyserNode (Canvas or SVG, 30-50 data points)
- Native `<audio>`

## Phase Requirements

| ID | Description | Research Support |
|----|-------------|------------------|
| WCHAT-01 | Mic button in chat input starts/stops voice recording with visual state (idle/recording/processing) | VoiceMicButton replaces VoiceRecordButton; three-state via recording/userSpeaking/loading from useMicVAD |
| WCHAT-02 | Recording auto-stops on silence detection via VAD | useMicVAD onSpeechEnd callback fires automatically after 1.5s silence; no manual stop needed |
| WCHAT-03 | Real-time waveform/amplitude visualization displays while recording | VoiceWaveform canvas component using Web Audio API AnalyserNode + requestAnimationFrame |
| WCHAT-04 | Voice response audio plays inline in chat message with audio player controls | ChatVoicePlayer with native `<audio>` |

---

## Summary

Phase 37 adds browser-based voice I/O to the existing web chat. Phase 36 delivered the server-side pipeline (VoicePipelineService, POST /api/transcribe, POST /api/synthesize, voiceMode wiring in chat.ts) and the nexus-settings schema extension. Phase 37 is a frontend phase with a single server-side addition: COOP/COEP response headers on the Express static middleware.

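
As a rough illustration of the server-side addition, the sketch below sets the two headers that browsers require before they report a page as cross-origin isolated. The names `crossOriginIsolation`, `ResponseLike`, and `Next` are hypothetical and do not come from the codebase. The structural types stand in for Express's own so the snippet compiles without Express installed. The header values `same-origin` and `require-corp` are the standard pair, assumed here rather than taken from the existing server code.

```typescript
// Hypothetical stand-ins for Express types; an Express Response
// satisfies ResponseLike structurally.
interface ResponseLike {
  setHeader(name: string, value: string): unknown;
}
type Next = () => void;

// Middleware sketch: mark every response as cross-origin isolated so that
// SharedArrayBuffer becomes available to the page.
function crossOriginIsolation(_req: unknown, res: ResponseLike, next: Next): void {
  res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  res.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
  next();
}
```

In an Express app this would presumably be registered with `app.use(crossOriginIsolation)` ahead of the `express.static(...)` call, so the headers also apply to the HTML document itself. With `require-corp` in effect, any cross-origin resource the page loads has to opt in via CORS or a `Cross-Origin-Resource-Policy` header.
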
The central library is `@ricky0123/vad-react ^0.0.36`, which wraps Silero VAD running in an AudioWorklet. It requires the page to be cross-origin isolated (COOP + COEP headers) to use SharedArrayBuffer. The package ships ONNX model files and a worklet bundle that must either be served locally from `public/` or loaded from its default CDN URLs. The CDN default is simpler and acceptable for development; production should serve them locally.

Waveform visualization uses a standard Web Audio API AnalyserNode pattern: connect the microphone stream → AnalyserNode → read a Uint8Array in a requestAnimationFrame loop → render bars on a `<canvas>`. This runs entirely in the browser and needs no extra library.

Audio playback for synthesized responses uses the native `<audio>` element.
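
The waveform pattern described above can be sketched as a small, DOM-free core. This is a minimal sketch under stated assumptions, not the VoiceWaveform component itself: `toBars`, `drawFrame`, `AnalyserLike`, and `Canvas2DLike` are hypothetical names, and the default of 40 bars is simply a value inside the 30-50 range mentioned earlier. The two interfaces list only the members of `AnalyserNode` and `CanvasRenderingContext2D` that the sketch touches, so the real browser objects can be passed in directly.

```typescript
// Hypothetical structural subsets of AnalyserNode and CanvasRenderingContext2D.
interface AnalyserLike {
  getByteTimeDomainData(target: Uint8Array): void;
}
interface Canvas2DLike {
  clearRect(x: number, y: number, w: number, h: number): void;
  fillRect(x: number, y: number, w: number, h: number): void;
}

// Reduce time-domain bytes (0..255, silence at 128) to `barCount`
// peak amplitudes in the range 0..1.
function toBars(samples: Uint8Array, barCount: number): number[] {
  const bars: number[] = [];
  const bucket = samples.length / barCount;
  for (let i = 0; i < barCount; i++) {
    const start = Math.floor(i * bucket);
    const end = Math.max(start + 1, Math.floor((i + 1) * bucket));
    let peak = 0;
    for (let j = start; j < end && j < samples.length; j++) {
      peak = Math.max(peak, Math.abs(samples[j] - 128) / 128);
    }
    bars.push(Math.min(1, peak));
  }
  return bars;
}

// Draw one frame: read the current samples into `buffer`, then render
// vertically centered bars. Returns the bar amplitudes for inspection.
function drawFrame(
  analyser: AnalyserLike,
  ctx: Canvas2DLike,
  buffer: Uint8Array,
  width: number,
  height: number,
  barCount = 40
): number[] {
  analyser.getByteTimeDomainData(buffer);
  const bars = toBars(buffer, barCount);
  const slot = width / barCount;
  ctx.clearRect(0, 0, width, height);
  bars.forEach((amp, i) => {
    const h = Math.max(1, amp * height); // keep a 1px bar visible during silence
    ctx.fillRect(i * slot, (height - h) / 2, slot * 0.6, h);
  });
  return bars;
}
```

In the browser, the wiring would look roughly like this: create an `AudioContext`, call `createMediaStreamSource(stream)` and `createAnalyser()`, connect the source to the analyser, allocate `new Uint8Array(analyser.fftSize)` once, and call `drawFrame` from a `requestAnimationFrame` callback that reschedules itself while recording. Keeping the reduction step in a pure function makes it easy to unit-test without a microphone or canvas.
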