docs(37): UI design contract for web-chat-voice-ui

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Nexus Dev 2026-04-04 01:53:32 +00:00
parent 30708d38e5
commit a010333787

@@ -0,0 +1,207 @@
---
phase: 37
slug: web-chat-voice-ui
status: draft
shadcn_initialized: true
preset: new-york / neutral / tailwind-v4
created: 2026-04-03
---
# Phase 37 — UI Design Contract
> Visual and interaction contract for frontend phases. Generated by gsd-ui-researcher, verified by gsd-ui-checker.
---
## Design System
| Property | Value |
|----------|-------|
| Tool | shadcn (new-york style) |
| Preset | new-york, baseColor: neutral, cssVariables: true |
| Component library | Radix UI (via shadcn) |
| Icon library | lucide-react |
| Font | System default (inherit from body) |

Source: `components.json` detected + `npx shadcn info -c ui` confirmed.

---
## Spacing Scale
Declared values (must be multiples of 4):
| Token | Value | Usage |
|-------|-------|-------|
| xs | 4px | Icon gaps, waveform bar gaps, badge inner padding |
| sm | 8px | Button icon padding, compact element spacing |
| md | 16px | Chat message padding, input field padding |
| lg | 24px | Panel section spacing |
| xl | 32px | Layout gaps between major sections |
| 2xl | 48px | Major section breaks |
| 3xl | 64px | Page-level spacing |

Exceptions:
- Mic button and voice mode toggle touch target: 32px (h-8 w-8) to match existing VoiceRecordButton/TtsButton sizing pattern
- Waveform canvas height: 32px fixed (h-8) to stay within input row
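
The scale and its multiple-of-4 rule can be checked mechanically. The sketch below is one way to encode the table as a typed constant; the names `spacing`, `px`, and `invalidSpacingTokens` are illustrative and do not come from the codebase.

```typescript
// Spacing tokens from the table above, in pixels.
const spacing = {
  xs: 4,
  sm: 8,
  md: 16,
  lg: 24,
  xl: 32,
  "2xl": 48,
  "3xl": 64,
} as const;

type SpacingToken = keyof typeof spacing;

// Formats a token as a CSS length, e.g. px("md") gives "16px".
function px(token: SpacingToken): string {
  return `${spacing[token]}px`;
}

// Returns the names of tokens whose value is not a multiple of 4.
// An empty array means the scale satisfies the contract.
function invalidSpacingTokens(scale: Record<string, number>): string[] {
  return Object.entries(scale)
    .filter(([, value]) => value % 4 !== 0)
    .map(([token]) => token);
}
```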
---
## Typography
| Role | Size | Weight | Line Height |
|------|------|--------|-------------|
| Body | 14px (text-sm) | 400 (regular) | 1.5 |
| Label | 12px (text-xs) | 400 (regular) | 1.4 |
| Heading | 16px (text-base) | 600 (semibold) | 1.2 |
| Display | 20px (text-xl) | 600 (semibold) | 1.2 |

Source: Existing chat components use `text-sm` for message content, `text-xs` for file labels and badges. Matches `ChatMessage.tsx` and `ChatInput.tsx` patterns.

---
## Color
| Role | Value | Usage |
|------|-------|-------|
| Dominant (60%) | `--background` (#eff1f5 light / #1e1e2e dark) | Chat surface, page background |
| Secondary (30%) | `--secondary` (#ccd0da light / #313244 dark) | User message bubbles, sidebar |
| Accent (10%) | `--primary` (#1e66f5 light / #89b4fa dark) | Active recording state, voice mode active indicator, audio player progress bar |
| Destructive | `--destructive` (#d20f39 light / #f38ba8 dark) | Stop-recording button state (Square icon when recording), delete confirmations |

Accent reserved for:
1. VoiceMicButton border/ring when in active recording state
2. Voice mode toggle active-state highlight (currently selected mode pill)
3. Audio player progress fill while audio is playing
4. Waveform bars while recording is live

Source: `src/index.css` CSS variable definitions. Existing `VoiceRecordButton` uses `text-destructive` for stop state; this contract preserves that pattern.

---
## Component Inventory
New components to build for this phase:
| Component | Location | Description |
|-----------|----------|-------------|
| `VoiceMicButton` | `ui/src/components/VoiceMicButton.tsx` | Replaces `VoiceRecordButton` — adds VAD auto-stop, waveform visualization, three visual states (idle / recording+waveform / processing) |
| `VoiceWaveform` | `ui/src/components/VoiceWaveform.tsx` | Canvas-based amplitude bars, 30-50 data points, 32px tall, animated during recording only |
| `VoiceModeToggle` | `ui/src/components/VoiceModeToggle.tsx` | Three-state pill toggle: "Text" / "Voice In" / "Full Voice" — persists via nexus-settings |
| `ChatVoicePlayer` | `ui/src/components/ChatVoicePlayer.tsx` | Inline `<audio>` player with play/pause/stop controls using native element + URL.createObjectURL() |
| `ChatVoiceBadge` | `ui/src/components/ChatVoiceBadge.tsx` | Small badge on agent messages with messageType voice_input or voice_full; includes expand/collapse for full markdown |
| `useVoiceMode` | `ui/src/hooks/useVoiceMode.ts` | Hook reading/writing voiceMode from nexus-settings; returns current mode + setter |
| `useVadRecorder` | `ui/src/hooks/useVadRecorder.ts` | Wraps @ricky0123/vad-react, exposes recording state + Float32Array callback on speech end |
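
The persistence half of `useVoiceMode` can be sketched without React. The contract does not describe the nexus-settings API, so the `SettingsStore` interface, the in-memory store, and the fallback to `text` for missing or unknown values are all assumptions made for illustration; only the `voiceMode` name and the three mode values come from this document.

```typescript
type VoiceMode = "text" | "voice_input" | "full_voice";

const VOICE_MODES: readonly VoiceMode[] = ["text", "voice_input", "full_voice"];

// Hypothetical stand-in for nexus-settings: a plain string key/value store.
interface SettingsStore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Type guard so that corrupted or legacy values never reach the UI.
function isVoiceMode(value: unknown): value is VoiceMode {
  return VOICE_MODES.some((mode) => mode === value);
}

// Reads the persisted mode; falls back to "text" (assumed default).
function readVoiceMode(store: SettingsStore): VoiceMode {
  const raw = store.get("voiceMode");
  return isVoiceMode(raw) ? raw : "text";
}

function writeVoiceMode(store: SettingsStore, mode: VoiceMode): void {
  store.set("voiceMode", mode);
}

// In-memory store, useful for tests and for trying the functions out.
function createMemoryStore(): SettingsStore {
  const data: Record<string, string> = {};
  return {
    get: (key) => data[key],
    set: (key, value) => {
      data[key] = value;
    },
  };
}
```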

Reused/extended components:
| Component | Change |
|-----------|--------|
| `ChatInput.tsx` | Replace `VoiceRecordButton` with `VoiceMicButton`; add `VoiceModeToggle` in toolbar row |
| `ChatMessage.tsx` | Add `ChatVoiceBadge` + `ChatVoicePlayer` branch for messageType voice_input / voice_full |
| `server/src/app.ts` | Add COOP/COEP headers to static middleware for SharedArrayBuffer support |
---
## Interaction States
### VoiceMicButton
| State | Visual |
|-------|--------|
| idle | `Mic` icon (lucide), ghost variant, h-8 w-8 |
| recording | `VoiceWaveform` replaces icon, primary ring/border, stop action on click |
| processing (transcribing) | `Loader2` animate-spin, disabled |
| VAD speech-end | Automatically transitions to processing — no manual stop needed |
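
The three states and the VAD auto-stop can be read as a small state machine. The following is a minimal sketch, not the component's actual implementation: the event names are hypothetical, and it assumes that a manual stop click submits the audio for transcription in the same way a VAD speech-end does.

```typescript
type MicState = "idle" | "recording" | "processing";

// Hypothetical event names; the contract only names the visual states.
type MicEvent = "click" | "vad_speech_end" | "transcript_ready" | "error";

function nextMicState(state: MicState, event: MicEvent): MicState {
  switch (state) {
    case "idle":
      // A click on the idle button starts recording.
      return event === "click" ? "recording" : state;
    case "recording":
      // Manual stop and VAD speech-end both hand the audio to transcription.
      if (event === "click" || event === "vad_speech_end") return "processing";
      if (event === "error") return "idle";
      return state;
    case "processing":
      // The button is disabled here, so clicks are ignored.
      if (event === "transcript_ready" || event === "error") return "idle";
      return state;
  }
}
```

Keeping the transitions in a pure function lets the disabled-while-processing rule be tested without rendering the button.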
### VoiceModeToggle
| State | Visual |
|-------|--------|
| text | "Text" pill, default/muted background |
| voice_input | "Voice In" pill, primary background |
| full_voice | "Full Voice" pill, primary background |

Three pills rendered side-by-side. Active pill uses `bg-primary text-primary-foreground`. Inactive pills use `bg-muted text-muted-foreground`.
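
The active/inactive class rule can be expressed as a pure function. This sketch applies the rule to whichever pill is selected, including "Text"; the table lists a muted background for the `text` state, so treating an active "Text" pill as primary is an assumption. The names `PILLS`, `pillClassName`, and `renderPills` are illustrative.

```typescript
// The three pills in display order, with labels from the copywriting contract.
const PILLS = [
  { mode: "text", label: "Text" },
  { mode: "voice_input", label: "Voice In" },
  { mode: "full_voice", label: "Full Voice" },
] as const;

type PillMode = (typeof PILLS)[number]["mode"];

// Active pill gets the primary classes; every other pill gets the muted ones.
function pillClassName(pill: PillMode, active: PillMode): string {
  return pill === active
    ? "bg-primary text-primary-foreground"
    : "bg-muted text-muted-foreground";
}

// Produces the data a component would need to render each pill.
function renderPills(active: PillMode) {
  return PILLS.map((pill) => ({
    label: pill.label,
    className: pillClassName(pill.mode, active),
    pressed: pill.mode === active,
  }));
}
```

The `pressed` flag is included so that a component could mirror the visual state in `aria-pressed`, which the contract does not require but which pairs naturally with the `role="group"` wrapper.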
### ChatVoiceBadge (on agent message)
| messageType | Badge label | Expandable section |
|-------------|-------------|-------------------|
| voice_input | "Voice" | None — transcript shown in bubble |
| voice_full | "Voice" | Collapsed by default; expand reveals full markdown with code blocks |

Expand/collapse uses shadcn `Collapsible` component. Badge is rendered with shadcn `Badge` variant outline.
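
The badge table reduces to a lookup keyed on `messageType`. The sketch below assumes that any other message type renders no badge; `badgeFor`, `toggleLabel`, and the `BadgeView` shape are hypothetical names.

```typescript
interface BadgeView {
  label: string;
  expandable: boolean; // whether a Collapsible section is rendered
  defaultOpen: boolean; // initial state of that section
}

// Returns the badge configuration for a message, or null when no badge applies.
function badgeFor(messageType: string): BadgeView | null {
  switch (messageType) {
    case "voice_input":
      // Transcript is already shown in the bubble, so nothing to expand.
      return { label: "Voice", expandable: false, defaultOpen: false };
    case "voice_full":
      // Full markdown is available but collapsed by default.
      return { label: "Voice", expandable: true, defaultOpen: false };
    default:
      return null;
  }
}

// Copy for the expand/collapse trigger, from the copywriting contract.
function toggleLabel(open: boolean): string {
  return open ? "Hide full response" : "Show full response";
}
```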
### ChatVoicePlayer
| State | Visual |
|-------|--------|
| auto-play (on) | Plays immediately on mount; `Volume2` icon shows; click to pause |
| auto-play (off) | Shows play button; user-initiated only |
| playing | Progress bar fills with `--primary`; `Pause` icon |
| paused | `Play` icon |
| ended | Returns to play button |
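
The player states above can be modelled independently of the `<audio>` element. This is a sketch under assumptions: the event names are hypothetical, "stop" is taken to reset the player to its initial play-button state, and the `Volume2` icon for auto-play is left out. The progress helper guards against a non-finite duration because `HTMLMediaElement.duration` is `NaN` until metadata has loaded.

```typescript
type PlayerState = "idle" | "playing" | "paused" | "ended";
type PlayerEvent = "play" | "pause" | "stop" | "ended";

// Auto-play on: start playing on mount. Auto-play off: wait for the user.
function initialPlayerState(autoPlay: boolean): PlayerState {
  return autoPlay ? "playing" : "idle";
}

function nextPlayerState(state: PlayerState, event: PlayerEvent): PlayerState {
  switch (event) {
    case "play":
      return "playing";
    case "pause":
      return state === "playing" ? "paused" : state;
    case "stop":
      return "idle";
    case "ended":
      return state === "playing" ? "ended" : state;
  }
}

// Name of the lucide icon to show for a given state.
function playerIcon(state: PlayerState): "Play" | "Pause" {
  return state === "playing" ? "Pause" : "Play";
}

// Fraction of the progress bar to fill, clamped to the range [0, 1].
function progressFraction(currentTime: number, duration: number): number {
  if (!isFinite(duration) || duration <= 0) return 0;
  return Math.min(1, Math.max(0, currentTime / duration));
}
```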
---
## Copywriting Contract
| Element | Copy |
|---------|------|
| Primary CTA (start recording) | "Start voice input" (aria-label on VoiceMicButton idle) |
| Recording active label | "Recording — speak now" (aria-label on VoiceMicButton recording state) |
| Processing label | "Transcribing..." (aria-label on VoiceMicButton processing state) |
| Voice mode toggle label | "Voice mode" (tooltip/title on toggle group) |
| Mode option: text only | "Text" |
| Mode option: voice input | "Voice In" |
| Mode option: full voice | "Full Voice" |
| Voice badge label | "Voice" |
| Expand full response | "Show full response" |
| Collapse full response | "Hide full response" |
| Auto-play setting label | "Auto-play voice responses" |
| Empty audio state | Not applicable; the audio player only appears when the agent sends a `voice_full` response |
| Error: mic permission denied | "Microphone access denied. Allow microphone access in your browser settings." |
| Error: transcription failed | "Transcription failed. Please try again." |
| Error: synthesis failed | "Voice synthesis failed. The text response is still available above." |
| Destructive confirmation | None — no destructive actions in this phase |
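
Keeping the error copy in one typed map makes it harder for a component to drift from the contract. In this sketch the error codes and function name are hypothetical; the one external fact it relies on is that `getUserMedia` rejects with a `NotAllowedError` when the user or browser denies microphone access.

```typescript
// Hypothetical error codes; the strings are the contract's copy.
type VoiceError = "mic_permission_denied" | "transcription_failed" | "synthesis_failed";

const VOICE_ERROR_COPY: Record<VoiceError, string> = {
  mic_permission_denied:
    "Microphone access denied. Allow microphone access in your browser settings.",
  transcription_failed: "Transcription failed. Please try again.",
  synthesis_failed: "Voice synthesis failed. The text response is still available above.",
};

// Maps an error thrown while acquiring the microphone to a contract error code.
// Returns null for errors this contract defines no copy for.
function classifyMicError(error: { name?: string }): VoiceError | null {
  return error.name === "NotAllowedError" ? "mic_permission_denied" : null;
}
```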
---
## Accessibility
- All interactive voice controls have explicit `aria-label` values; no icon-only control ships without one
- VoiceMicButton state changes must be announced through an `aria-live="polite"` region containing the status text
- Audio player must have `aria-label="Voice response"` on the `<audio>` element
- VoiceModeToggle uses `role="group"` with `aria-label="Voice mode"` on the wrapper
- COOP/COEP headers are required on the Express server — without them, SharedArrayBuffer is unavailable and VAD silently fails
- AudioContext unlock must happen inside a user gesture handler (click/tap) to satisfy browser autoplay policy
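
The COOP/COEP requirement comes down to two response headers. The sketch below uses `same-origin` and `require-corp`, a commonly used pair for enabling cross-origin isolation; the contract does not state which values `server/src/app.ts` should send, so treat them as an assumption. The function is written against a minimal `setHeader` interface so it runs without Express, and the names are illustrative.

```typescript
// Minimal shape shared by Node's ServerResponse and Express's Response.
interface HeaderWritable {
  setHeader(name: string, value: string): void;
}

// Sets the two headers that make a page cross-origin isolated,
// which browsers require before exposing SharedArrayBuffer.
function applyCrossOriginIsolation(res: HeaderWritable): void {
  res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  res.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
}

// Express-style middleware signature: (req, res, next).
// It would need to be registered before the static file middleware
// so that the HTML and worker responses carry the headers.
function crossOriginIsolation(_req: unknown, res: HeaderWritable, next: () => void): void {
  applyCrossOriginIsolation(res);
  next();
}
```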
---
## Registry Safety
| Registry | Blocks Used | Safety Gate |
|----------|-------------|-------------|
| shadcn official | button, badge, collapsible, tooltip, select | not required |

Third-party packages (not shadcn registry — installed via npm, not `npx shadcn add`):
| Package | Purpose | Safety note |
|---------|---------|-------------|
| `@ricky0123/vad-react ^0.0.36` | VAD silence detection — delivers Float32Array at 16kHz on speech end | npm package, not registry block — standard npm install, no vetting gate required |
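
A `Float32Array` of 16 kHz samples usually has to be packaged before it can be uploaded for transcription. The contract does not say which format the server expects, so the following is only a sketch of one option: encoding the samples as a mono 16-bit PCM WAV file. The function name `encodeWav16` is hypothetical.

```typescript
// Encodes mono float samples in [-1, 1] as a 16-bit PCM WAV byte array.
function encodeWav16(samples: Float32Array, sampleRate = 16000): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize); // 44-byte header + PCM data
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk length
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channel count: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp, then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * bytesPerSample, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }

  return new Uint8Array(buffer);
}
```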

No third-party shadcn registry blocks declared. Registry vetting gate: not applicable.

---
## Checker Sign-Off
- [ ] Dimension 1 Copywriting: PASS
- [ ] Dimension 2 Visuals: PASS
- [ ] Dimension 3 Color: PASS
- [ ] Dimension 4 Typography: PASS
- [ ] Dimension 5 Spacing: PASS
- [ ] Dimension 6 Registry Safety: PASS
**Approval:** pending