nexus/.planning/phases/39-voice-polish/39-02-PLAN.md
Nexus Dev 2716d822c4 docs(39): create phase plan for voice polish
Two plans in wave 1 (parallel):
- 39-01: Sentence-buffered TTS streaming + multi-language synthesis (VPIPE-07, VPIPE-08)
- 39-02: Onboarding voice hardware capability probe (ONBRD-01, ONBRD-02)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 03:27:36 +00:00

---
phase: 39-voice-polish
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
  - server/src/services/hardware.ts
  - server/src/routes/hardware.ts
  - ui/src/components/onboarding/VoiceStep.tsx
  - ui/src/components/NexusOnboardingWizard.tsx
  - ui/src/hooks/useHardwareInfo.ts
  - server/src/__tests__/39-voice-hardware-probe.test.ts
autonomous: true
requirements:
  - ONBRD-01
  - ONBRD-02
must_haves:
  truths:
    - "Onboarding hardware probe reports whether Whisper STT is runnable on detected hardware"
    - "Onboarding hardware probe reports whether Piper TTS is runnable on detected hardware"
    - "VoiceStep shows enable/skip when hardware is sufficient"
    - "VoiceStep shows capability note and auto-skips or shows skip-only when hardware is insufficient"
  artifacts:
    - path: "server/src/services/hardware.ts"
      provides: "voiceCapability probe in HardwareInfo"
      contains: "whisperAvailable"
    - path: "server/src/routes/hardware.ts"
      provides: "voice capability data in /system/providers response"
      contains: "voiceCapability"
    - path: "ui/src/components/onboarding/VoiceStep.tsx"
      provides: "Hardware-aware voice step with conditional enable/skip"
      contains: "whisperAvailable|piperAvailable|voiceCapability"
    - path: "server/src/__tests__/39-voice-hardware-probe.test.ts"
      provides: "Tests for voice capability detection"
      contains: "describe.*voice.*capability"
  key_links:
    - from: "ui/src/components/onboarding/VoiceStep.tsx"
      to: "ui/src/hooks/useHardwareInfo.ts"
      via: "voiceCapability prop from hardware info"
      pattern: "voiceCapability"
    - from: "ui/src/components/NexusOnboardingWizard.tsx"
      to: "ui/src/components/onboarding/VoiceStep.tsx"
      via: "passes hardware voiceCapability as prop"
      pattern: "voiceCapability"
---

Onboarding voice hardware detection — probe for Whisper STT and Piper TTS capability during onboarding and gate the voice enable step accordingly.

Purpose: New installs detect whether the machine can run STT/TTS before offering voice features, which prevents users from enabling voice on incapable hardware.

Output: Extended hardware probe with voice capability, updated VoiceStep with hardware-aware UI.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/ROADMAP.md @.planning/REQUIREMENTS.md @.planning/phases/39-voice-polish/39-CONTEXT.md

@server/src/services/hardware.ts @server/src/routes/hardware.ts @ui/src/components/onboarding/VoiceStep.tsx @ui/src/components/NexusOnboardingWizard.tsx @ui/src/hooks/useHardwareInfo.ts

From server/src/services/hardware.ts:
```typescript
export type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

export interface HardwareInfo {
  totalGb: number;
  freeGb: number;
  usableGb: number;
  platform: NodeJS.Platform;
  gpuName: string | null;
  gpuVramGb: number | null;
  unifiedMemory: boolean;
  hardwareTier: HardwareTier;
  cpuModel: string | null;
}

export function hardwareService(): { detect(): Promise<HardwareInfo> }
```

From ui/src/components/onboarding/VoiceStep.tsx:
```typescript
interface VoiceStepProps {
  onEnable: () => void;
  onSkip: () => void;
}
// Currently only checks for microphone via navigator.mediaDevices
// Does NOT check if Whisper/Piper binaries are available on server
```

From ui/src/hooks/useHardwareInfo.ts:
```typescript
// Hook that fetches GET /system/providers and returns HardwareInfo
// Used by NexusOnboardingWizard.tsx
```

Task 1: Voice capability probe in hardware service and route

Files: server/src/services/hardware.ts, server/src/routes/hardware.ts, server/src/__tests__/39-voice-hardware-probe.test.ts

Read first:
- server/src/services/hardware.ts (full file; understand detect() and HardwareInfo)
- server/src/routes/hardware.ts (full file; understand route patterns)
- server/src/services/voice-pipeline.ts (lines 76-125; understand whisper/piper detection patterns)

Behavior:
- Test: detectVoiceCapability() returns { whisperAvailable: true, piperAvailable: true } when both binaries resolve via execFile --help
- Test: detectVoiceCapability() returns { whisperAvailable: false, piperAvailable: false } when both binaries throw ENOENT
- Test: detectVoiceCapability() returns { whisperAvailable: true, piperAvailable: false } when only whisper is found
- Test: Hardware tier "cpu_only" with < 4GB RAM sets voiceTierSufficient to false
- Test: Hardware tier "apple_silicon" with >= 8GB RAM sets voiceTierSufficient to true

Action:

1. In hardware.ts, add a `VoiceCapability` interface:
   ```typescript
   export interface VoiceCapability {
     whisperAvailable: boolean;
     piperAvailable: boolean;
     voiceTierSufficient: boolean; // true if hardware tier is apple_silicon or gpu, OR cpu_only with >= 4GB free
   }
   ```

2. Extend HardwareInfo interface with `voiceCapability: VoiceCapability`.

3. Add `async detectVoiceCapability(): Promise<VoiceCapability>` to hardwareService:
   - Probe whisper-cpp: try `execFile("whisper-cpp", ["--help"])` with 2s timeout. If resolves → whisperAvailable=true. If ENOENT → try `execFile("whisper", ["--help"])` as fallback. Both fail → false.
   - Probe piper: try `execFile("piper", ["--help"])` with 2s timeout. If resolves → piperAvailable=true. Catch → false.
   - voiceTierSufficient: true if hardwareTier is "apple_silicon" or "gpu", OR if "cpu_only" with freeGb >= 4
   - Use execFile from node:child_process with promisify pattern (or the existing execFileAsync if extracted)

4. Call detectVoiceCapability() inside detect() AFTER the existing hardware detection, add result to HardwareInfo. Use a separate 3s timeout to avoid slowing down hardware detection if voice probes hang.

5. In hardware.ts route: no changes needed — it already returns the full HardwareInfo object from detect(), so voiceCapability will be included automatically.

6. Write tests in 39-voice-hardware-probe.test.ts:
   - Mock execFile (child_process) to test whisper/piper detection
   - Test voiceTierSufficient logic for each hardware tier
   - Test that detectVoiceCapability timeout does not exceed 3s
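
The probe described in steps 3 and 4 could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the actual service code: the `Probe` parameter, the `anyAvailable` and `withDeadline` helpers, and passing `tier` and `freeGb` as arguments are all hypothetical choices made so the logic can be exercised without real binaries. It also simplifies the fallback rule by trying `whisper` after any `whisper-cpp` failure, not only ENOENT.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

type HardwareTier = "gpu" | "apple_silicon" | "cpu_only";

interface VoiceCapability {
  whisperAvailable: boolean;
  piperAvailable: boolean;
  voiceTierSufficient: boolean;
}

// Hypothetical injection seam (not in the plan): resolves if the binary ran.
type Probe = (binary: string) => Promise<void>;

// Default probe: run `<binary> --help` with the 2s per-probe timeout.
const defaultProbe: Probe = async (binary) => {
  await execFileAsync(binary, ["--help"], { timeout: 2_000 });
};

// Try candidates in order. Simplification: any failure (ENOENT, timeout,
// non-zero exit) moves on to the next candidate.
async function anyAvailable(candidates: string[], probe: Probe): Promise<boolean> {
  for (const binary of candidates) {
    try {
      await probe(binary);
      return true;
    } catch {
      // fall through to the next candidate
    }
  }
  return false;
}

// gpu and apple_silicon always qualify; cpu_only needs >= 4GB free.
function isVoiceTierSufficient(tier: HardwareTier, freeGb: number): boolean {
  return tier === "cpu_only" ? freeGb >= 4 : true;
}

// Resolve with `fallback` if `work` has not settled within `ms`.
function withDeadline<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

async function detectVoiceCapability(
  tier: HardwareTier,
  freeGb: number,
  probe: Probe = defaultProbe,
  overallTimeoutMs = 3_000,
): Promise<VoiceCapability> {
  const voiceTierSufficient = isVoiceTierSufficient(tier, freeGb);
  const probes = Promise.all([
    anyAvailable(["whisper-cpp", "whisper"], probe),
    anyAvailable(["piper"], probe),
  ]).then(([whisperAvailable, piperAvailable]) => ({
    whisperAvailable,
    piperAvailable,
    voiceTierSufficient,
  }));
  // If the probes hang past the overall cap, report both binaries as unavailable.
  return withDeadline(probes, overallTimeoutMs, {
    whisperAvailable: false,
    piperAvailable: false,
    voiceTierSufficient,
  });
}
```

Running the Whisper and Piper probes concurrently keeps the common case well under the 3s cap, and the injectable probe gives the tests in step 6 a place to substitute a fake instead of mocking `child_process` directly; either approach would satisfy the plan.
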

Verify: `cd /opt/nexus && npx vitest run server/src/__tests__/39-voice-hardware-probe.test.ts --reporter=verbose 2>&1 | tail -30`

Acceptance criteria:
- `grep -q "VoiceCapability" server/src/services/hardware.ts`
- `grep -q "whisperAvailable" server/src/services/hardware.ts`
- `grep -q "piperAvailable" server/src/services/hardware.ts`
- `grep -q "voiceTierSufficient" server/src/services/hardware.ts`
- `grep -q "voiceCapability" server/src/services/hardware.ts`
- `test -f server/src/__tests__/39-voice-hardware-probe.test.ts`

Done:
- HardwareInfo includes voiceCapability with whisperAvailable, piperAvailable, voiceTierSufficient
- Binary detection probes whisper-cpp/whisper and piper with 2s timeout each
- voiceTierSufficient is true for apple_silicon/gpu, or cpu_only with >= 4GB free RAM
- GET /system/providers response now includes voiceCapability object
- All tests pass

Task 2: VoiceStep hardware-aware UI with conditional enable/skip

Files: ui/src/components/onboarding/VoiceStep.tsx, ui/src/components/NexusOnboardingWizard.tsx, ui/src/hooks/useHardwareInfo.ts

Read first:
- ui/src/components/onboarding/VoiceStep.tsx (full file)
- ui/src/components/NexusOnboardingWizard.tsx (full file; understand step 4 voice wiring)
- ui/src/hooks/useHardwareInfo.ts (full file; understand HardwareInfo type on client)

Action:

1. Update useHardwareInfo.ts: ensure the TypeScript type for hardware info includes the new voiceCapability field. Add:
   ```typescript
   interface VoiceCapability {
     whisperAvailable: boolean;
     piperAvailable: boolean;
     voiceTierSufficient: boolean;
   }
   ```
   Add `voiceCapability?: VoiceCapability` to the HardwareInfo type used in the hook. The `?` keeps it backward-compatible if the server hasn't been updated yet.

2. Update VoiceStep props to accept voice capability:
   ```typescript
   interface VoiceStepProps {
     onEnable: () => void;
     onSkip: () => void;
     voiceCapability?: {
       whisperAvailable: boolean;
       piperAvailable: boolean;
       voiceTierSufficient: boolean;
     };
   }
   ```

3. Update VoiceStep rendering logic:
   - If voiceCapability is undefined (loading/missing): show current behavior (mic check only)
   - If voiceCapability.voiceTierSufficient === false: show capability note ("Your hardware may not support voice features. Voice requires at least 4GB free RAM."), show Skip button only (no Enable), do NOT auto-skip — let user read the note
   - If voiceCapability.whisperAvailable && voiceCapability.piperAvailable: show green checkmark next to STT and TTS labels ("Whisper detected", "Piper detected"), show Enable + Skip buttons
   - If whisperAvailable but NOT piperAvailable: show checkmark for STT, warning for TTS ("Piper not found — install piper for voice output"), still allow Enable (voice input will work, output won't)
   - If neither available but tier is sufficient: show note "Install whisper-cpp and piper for voice features", show Skip button, dim the Enable button but keep it clickable (user may install later)

4. In NexusOnboardingWizard.tsx, pass voiceCapability to VoiceStep:
   - hardwareInfo already comes from useHardwareInfo hook
   - Pass `voiceCapability={hardwareInfo?.voiceCapability}` to VoiceStep in step 4

5. Keep existing microphone detection in VoiceStep — it checks client-side mic availability which is complementary to server-side binary detection.
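
The branching in step 3 can be kept out of the JSX by reducing it to a pure function that the component then renders from. The sketch below is one hedged way to do that; `resolveVoiceStepView`, `VoiceStepView`, and the note keys are hypothetical names, not part of the existing component. The plan does not say what to show when Piper is present but Whisper is missing, so that branch (a warning on STT with a dimmed Enable) is purely an assumption. Skip is shown in every state and is therefore not modeled.

```typescript
interface VoiceCapability {
  whisperAvailable: boolean;
  piperAvailable: boolean;
  voiceTierSufficient: boolean;
}

type Indicator = "ok" | "warning" | "none";

// Hypothetical keys; the component would map these to the user-facing copy.
type NoteKey =
  | "tier_insufficient"
  | "piper_missing"
  | "whisper_missing"
  | "binaries_missing"
  | null;

interface VoiceStepView {
  note: NoteKey;
  stt: Indicator; // indicator next to the STT label
  tts: Indicator; // indicator next to the TTS label
  showEnable: boolean;
  enableDimmed: boolean; // dimmed but still clickable
}

function resolveVoiceStepView(cap?: VoiceCapability): VoiceStepView {
  // Loading or older server: keep current behavior (mic check only).
  if (cap === undefined) {
    return { note: null, stt: "none", tts: "none", showEnable: true, enableDimmed: false };
  }
  // Insufficient hardware: capability note, Skip only, no auto-skip.
  if (!cap.voiceTierSufficient) {
    return { note: "tier_insufficient", stt: "none", tts: "none", showEnable: false, enableDimmed: false };
  }
  if (cap.whisperAvailable && cap.piperAvailable) {
    return { note: null, stt: "ok", tts: "ok", showEnable: true, enableDimmed: false };
  }
  // Voice input works, output will not; Enable stays available.
  if (cap.whisperAvailable) {
    return { note: "piper_missing", stt: "ok", tts: "warning", showEnable: true, enableDimmed: false };
  }
  // ASSUMPTION: this case is not covered by the plan.
  if (cap.piperAvailable) {
    return { note: "whisper_missing", stt: "warning", tts: "ok", showEnable: true, enableDimmed: true };
  }
  // Neither binary found on sufficient hardware: install note, dimmed Enable.
  return { note: "binaries_missing", stt: "none", tts: "none", showEnable: true, enableDimmed: true };
}
```

In the wizard, the component would call something like `resolveVoiceStepView(hardwareInfo?.voiceCapability)` and render from the result, which also makes each branch testable without mounting React.
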

Verify: `cd /opt/nexus && npx tsc --noEmit --project ui/tsconfig.json 2>&1 | tail -20`

Acceptance criteria:
- `grep -q "voiceCapability" ui/src/components/onboarding/VoiceStep.tsx`
- `grep -q "whisperAvailable" ui/src/components/onboarding/VoiceStep.tsx`
- `grep -q "piperAvailable" ui/src/components/onboarding/VoiceStep.tsx`
- `grep -q "voiceTierSufficient" ui/src/components/onboarding/VoiceStep.tsx`
- `grep -q "voiceCapability" ui/src/components/NexusOnboardingWizard.tsx`
- `grep -q "VoiceCapability" ui/src/hooks/useHardwareInfo.ts`

Done:
- VoiceStep accepts voiceCapability prop and renders conditionally based on hardware detection
- Sufficient hardware + binaries present: shows enable/skip with green checkmarks
- Insufficient hardware: shows capability note and skip-only
- Missing binaries on sufficient hardware: shows install note with dimmed enable
- NexusOnboardingWizard passes voiceCapability from hardware probe to VoiceStep
- TypeScript compiles without errors

Verification:

1. TypeScript compiles: `npx tsc --noEmit` for both server and ui
2. Tests pass: `npx vitest run server/src/__tests__/39-voice-hardware-probe.test.ts`
3. VoiceStep renders hardware-aware UI: grep confirms voiceCapability, whisperAvailable, piperAvailable in VoiceStep.tsx
4. Wizard wiring: grep confirms voiceCapability prop passed in NexusOnboardingWizard.tsx

<success_criteria>

- ONBRD-01: Onboarding hardware probe reports Whisper STT and Piper TTS capability (whisperAvailable, piperAvailable, voiceTierSufficient in HardwareInfo)
- ONBRD-02: VoiceStep activates enable/skip when hardware is capable, shows capability note when below threshold
- Backward compatible: existing hardware endpoint still works, VoiceStep degrades gracefully if voiceCapability is undefined

</success_criteria>

After completion, create `.planning/phases/39-voice-polish/39-02-SUMMARY.md`