| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | must_haves |
|---|---|---|---|---|---|---|---|---|
| 34-voice | 02 | execute | 2 | | | true | | |
Purpose: VOICE-03 requires voice features offered during onboarding based on hardware capability. The PersonalAssistant is the primary chat surface for v1.5 and must have both STT (VoiceRecordButton) and TTS (TtsButton) controls.
Output: VoiceStep component, updated 6-step wizard, PersonalAssistant with voice I/O.
<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/34-voice/34-RESEARCH.md @.planning/phases/34-voice/34-01-SUMMARY.md @ui/src/components/NexusOnboardingWizard.tsx @ui/src/pages/PersonalAssistant.tsx @ui/src/components/VoiceRecordButton.tsx
From ui/src/hooks/usePiperTts.ts (created in Plan 01):
export type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";
export function usePiperTts(): {
  status: TtsStatus;
  progress: number;
  prewarm: () => Promise<void>;
  speak: (text: string) => Promise<void>;
  stop: () => void;
};
From ui/src/components/TtsButton.tsx (created in Plan 01):
interface TtsButtonProps {
  status: TtsStatus;
  progress: number;
  onSpeak: () => void;
  onStop: () => void;
  onPrewarm: () => void;
  disabled?: boolean;
}
export function TtsButton(props: TtsButtonProps): JSX.Element;
From ui/src/api/hardware.ts (updated in Plan 01):
export interface NexusSettings {
  mode: NexusMode;
  voiceEnabled?: boolean;
}
export function updateNexusSettings(settings: Partial<NexusSettings>): Promise<NexusSettings>;
From ui/src/components/VoiceRecordButton.tsx (existing):
interface VoiceRecordButtonProps {
  onTranscription: (text: string) => void;
  disabled?: boolean;
}
export function VoiceRecordButton(props: VoiceRecordButtonProps): JSX.Element;
From ui/src/components/NexusOnboardingWizard.tsx (existing step structure):
- Step 1: hardware detection
- Step 2: mode selection
- Step 3: provider selection
- Step 4: root directory (will become step 5)
- Step 5: summary (will become step 6)
- Step state: `const [step, setStep] = useState(1);`
- Label: ``{step === 5 ? "Summary" : `Step ${step} of 4`}``
1. Create ui/src/components/onboarding/VoiceStep.tsx:
import { useEffect, useState } from "react";
import { Mic, Volume2 } from "lucide-react";
import { Button } from "@/components/ui/button";

interface VoiceStepProps {
  onEnable: () => void;
  onSkip: () => void;
}

export function VoiceStep({ onEnable, onSkip }: VoiceStepProps) {
  // null = still probing, true/false = probe result
  const [micAvailable, setMicAvailable] = useState<boolean | null>(null);

  useEffect(() => {
    // mediaDevices is undefined in insecure contexts; report "no mic" there
    // instead of leaving the UI stuck on "Checking microphone..."
    if (!navigator.mediaDevices?.enumerateDevices) {
      setMicAvailable(false);
      return;
    }
    navigator.mediaDevices
      .enumerateDevices()
      .then((devices) => setMicAvailable(devices.some((d) => d.kind === "audioinput")))
      .catch(() => setMicAvailable(false));
  }, []);

  return (
    <div className="flex flex-col gap-4">
      <div className="flex flex-col gap-3">
        <div className="flex items-center gap-3 rounded-lg border p-3">
          <Mic className="h-5 w-5 text-primary shrink-0" />
          <div>
            <p className="text-sm font-medium">Speech-to-Text (Whisper)</p>
            <p className="text-xs text-muted-foreground">
              {micAvailable === false
                ? "No microphone detected — unavailable"
                : micAvailable === true
                  ? "Microphone detected — speak to your assistant"
                  : "Checking microphone..."}
            </p>
          </div>
        </div>
        <div className="flex items-center gap-3 rounded-lg border p-3">
          <Volume2 className="h-5 w-5 text-primary shrink-0" />
          <div>
            <p className="text-sm font-medium">Text-to-Speech (Piper)</p>
            <p className="text-xs text-muted-foreground">
              Hear responses read aloud. Runs entirely on your device — no server needed.
            </p>
          </div>
        </div>
      </div>
      <div className="flex flex-col gap-2">
        <Button onClick={onEnable} className="w-full">
          Enable voice
        </Button>
        <Button variant="ghost" onClick={onSkip} className="w-full">
          Skip
        </Button>
      </div>
    </div>
  );
}
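The microphone probe can also be pulled out of the component so it is testable outside a browser. The following is a minimal sketch under that assumption, not part of the plan: `DeviceLike`, `MediaDevicesLike`, `hasAudioInput`, and `detectMic` are hypothetical names, and the two interfaces are reduced stand-ins for the DOM's `MediaDeviceInfo` and `MediaDevices`.

```typescript
// Hypothetical minimal shapes standing in for the DOM's MediaDeviceInfo / MediaDevices.
interface DeviceLike {
  kind: string;
}
interface MediaDevicesLike {
  enumerateDevices(): Promise<DeviceLike[]>;
}

// True when at least one audio input (microphone) is listed.
function hasAudioInput(devices: DeviceLike[]): boolean {
  return devices.some((d) => d.kind === "audioinput");
}

// Resolves to false when the API is missing or the probe rejects,
// so a caller always gets a definite answer.
async function detectMic(mediaDevices?: MediaDevicesLike): Promise<boolean> {
  if (!mediaDevices) return false;
  try {
    return hasAudioInput(await mediaDevices.enumerateDevices());
  } catch {
    return false;
  }
}
```

Inside the component, a call such as `detectMic(navigator.mediaDevices).then(setMicAvailable)` would replace the inline promise chain.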
2. Update NexusOnboardingWizard.tsx: insert step 4 (voice) and shift the later steps.
This is a precise step-number shift. Audit every setStep(N) call and update as follows:
a. Add imports at top:
import { VoiceStep } from "./onboarding/VoiceStep";
import { updateNexusSettings } from "../api/hardware"; // already imported
b. Add voiceEnabled state:
const [voiceEnabled, setVoiceEnabled] = useState(false);
c. Step number shift (ALL occurrences):
- Old step 4 (rootDir) becomes step 5
- Old step 5 (summary) becomes step 6
- Every `setStep(4)` that meant "go to rootDir" becomes `setStep(5)`
- Every `setStep(5)` that meant "go to summary" becomes `setStep(6)`
- The Back button on old step 4 (rootDir) that called `setStep(3)` now calls `setStep(4)` (back to voice)
- Old step 3 (provider) onSkip/onContinue keep `setStep(4)`: 4 is now the voice step rather than rootDir, so no change is needed there
d. Step indicator label:
- Change ``{step === 5 ? "Summary" : `Step ${step} of 4`}`` to ``{step === 6 ? "Summary" : `Step ${step} of 5`}``
e. Reset voiceEnabled in the cleanup useEffect:
- Add `setVoiceEnabled(false);` alongside other resets
f. Add step 4 rendering block (voice) — insert between step 3 and the rootDir step:
{/* Step 4 — Voice */}
{step === 4 && (
  <>
    <div className="flex flex-col gap-2 text-center">
      <h1 className="text-2xl font-semibold tracking-tight">
        Voice features
      </h1>
      <p className="text-sm text-muted-foreground">
        Speak to your assistant and hear responses read aloud. Runs entirely on your device.
      </p>
    </div>
    <VoiceStep
      onEnable={() => {
        setVoiceEnabled(true);
        setStep(5);
      }}
      onSkip={() => setStep(5)}
    />
    <Button
      type="button"
      variant="ghost"
      onClick={() => setStep(3)}
      className="w-full"
    >
      Back
    </Button>
  </>
)}
g. Persist voiceEnabled in createWorkspace() — add after the existing mode save:
// Persist voice preference (non-blocking)
if (voiceEnabled) {
  try {
    await updateNexusSettings({ voiceEnabled: true });
  } catch {
    // Non-blocking: a failed save must not abort workspace creation
  }
}
h. Update the step comments:
- Old step 4 comment becomes "Step 5 — Root Directory (was step 4)"
- Old step 5 comment becomes "Step 6 — Summary (was step 5)"
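The renumbering in (c) and the label change in (d) amount to a fixed old-to-new mapping plus one label rule. The sketch below illustrates that under stated assumptions: `shiftStep` and `stepLabel` are hypothetical helpers for reasoning about the audit, not functions the wizard is known to contain.

```typescript
// Hypothetical helper: maps a step number from the old 5-step wizard
// to the new 6-step wizard. Steps 1-3 are unchanged; voice takes slot 4,
// so rootDir (old 4) and summary (old 5) each move up by one.
function shiftStep(oldStep: number): number {
  return oldStep >= 4 ? oldStep + 1 : oldStep;
}

// Hypothetical helper mirroring the updated indicator label:
// the summary (step 6) is not counted among the 5 numbered steps.
function stepLabel(step: number): string {
  return step === 6 ? "Summary" : `Step ${step} of 5`;
}
```

Note that the mapping applies to a call's intended target, not to its literal argument: the provider step's `setStep(4)` calls stay as they are because their target is now the voice step.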
Step number audit checklist (verify each):
- Step 3 provider: onSkip → `setStep(4)` (voice); was already 4, now means voice
- Step 3 provider: onContinue → `setStep(4)` (voice); same
- Step 4 (NEW voice): Enable → `setStep(5)`, Skip → `setStep(5)`, Back → `setStep(3)`
- Step 5 (was 4, rootDir): "Review & finish" → `setStep(6)`, Back → `setStep(4)` (voice), "Skip to summary" → `setStep(6)`
- Step 6 (was 5, summary): onBack → `setStep(5)` (rootDir)
- Step rendering: `{step === 4 && ...}` for rootDir becomes `{step === 5 && ...}`
- Step rendering: `{step === 5 && ...}` for summary becomes `{step === 6 && ...}`
cd /opt/nexus && grep -c "setStep" ui/src/components/NexusOnboardingWizard.tsx && grep -q "VoiceStep" ui/src/components/NexusOnboardingWizard.tsx && grep -q "step === 4" ui/src/components/NexusOnboardingWizard.tsx && grep -q "Step 4" ui/src/components/onboarding/VoiceStep.tsx 2>/dev/null; grep -q "VoiceStep" ui/src/components/onboarding/VoiceStep.tsx && echo "PASS" || echo "FAIL"
<acceptance_criteria>
- grep -q "VoiceStep" ui/src/components/onboarding/VoiceStep.tsx returns 0
- grep -q "VoiceStep" ui/src/components/NexusOnboardingWizard.tsx returns 0
- grep -q "step === 6" ui/src/components/NexusOnboardingWizard.tsx returns 0 (summary is now step 6)
- grep -q "Step.*of 5" ui/src/components/NexusOnboardingWizard.tsx returns 0 (label updated from "of 4")
- grep -q "voiceEnabled" ui/src/components/NexusOnboardingWizard.tsx returns 0
- grep -q "enumerateDevices" ui/src/components/onboarding/VoiceStep.tsx returns 0
</acceptance_criteria>
Onboarding wizard has 6 steps with voice at step 4. VoiceStep probes mic availability and offers enable/skip. voiceEnabled is persisted on workspace creation. All setStep() calls use correct updated numbers.
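The non-blocking persistence of `voiceEnabled` can be illustrated in isolation. This is a sketch under assumptions: `persistVoicePreference` and `VoiceSettings` are hypothetical names, the injected `update` parameter stands in for `updateNexusSettings`, and the boolean return value is added here only to make the outcome observable.

```typescript
interface VoiceSettings {
  voiceEnabled?: boolean;
}

// Hypothetical helper. `update` stands in for updateNexusSettings.
// Returns true when the preference was saved, false when the save
// was skipped or failed. It never throws.
async function persistVoicePreference(
  voiceEnabled: boolean,
  update: (settings: VoiceSettings) => Promise<unknown>,
): Promise<boolean> {
  if (!voiceEnabled) return false; // nothing to persist when the user skipped
  try {
    await update({ voiceEnabled: true });
    return true;
  } catch {
    return false; // non-blocking: the caller continues with workspace creation
  }
}
```

Because the helper swallows the error, `createWorkspace()` could await it without a surrounding try/catch.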
2. Add usePiperTts hook in PersonalAssistant component body:
const { status: ttsStatus, progress: ttsProgress, prewarm, speak, stop } = usePiperTts();
3. Add VoiceRecordButton to the input bar:
In the input bar section ({selectedConvId && ( <div className="px-6 py-4 ..."> ... </div> )}), add VoiceRecordButton inside the <div className="flex gap-3 items-end"> container, between the textarea and Send button:
<VoiceRecordButton
onTranscription={(text) => setInputValue((prev) => prev ? prev + " " + text : text)}
disabled={isSending}
/>
The onTranscription callback appends transcribed text to the input field (does not auto-send). This lets users review before sending.
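The append behavior can be expressed as a pure function, which makes edge cases easy to check. The sketch below uses a hypothetical `appendTranscription` helper; trimming the transcript and ignoring empty results are assumptions added here, not behavior stated in the plan.

```typescript
// Hypothetical pure version of the onTranscription updater.
function appendTranscription(prev: string, text: string): string {
  const cleaned = text.trim();
  if (!cleaned) return prev; // assumption: ignore empty transcripts
  return prev ? prev + " " + cleaned : cleaned;
}
```

It could be wired in as `onTranscription={(text) => setInputValue((prev) => appendTranscription(prev, text))}`.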
4. Add TtsButton next to assistant messages:
Do not modify MessageBubble. Render the TtsButton inline where messages are rendered instead, which avoids prop-drilling the hook's state and callbacks through MessageBubble. In the messages.map section, render a small TTS button after each assistant message:
{messages.map((msg) => (
  <div key={msg.id}>
    <MessageBubble message={msg} />
    {msg.role === "assistant" && msg.content && (
      <div className="flex justify-start pl-10 -mt-1 mb-1">
        <TtsButton
          status={ttsStatus}
          progress={ttsProgress}
          onSpeak={() => speak(msg.content)}
          onStop={stop}
          onPrewarm={prewarm}
        />
      </div>
    )}
  </div>
))}
The pl-10 aligns the button under the message bubble (past the avatar). The -mt-1 mb-1 tucks it close.
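Because every button receives the same `ttsStatus` from a single hook instance, all of them may show the speaking state at once. Whether that matters depends on how TtsButton renders each status, which the plan does not specify. If it does, one possible refinement is to track which message is active and derive a per-message status. This is a sketch only: `statusForMessage` and the `activeId` state are hypothetical and not part of the plan.

```typescript
type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";

// Hypothetical helper: the status one message's button should display
// when a single hook instance is shared by all buttons.
function statusForMessage(
  hookStatus: TtsStatus,
  activeId: string | null,
  msgId: string,
): TtsStatus {
  // Only the message being read aloud shows "speaking"; the rest stay "ready".
  if (hookStatus === "speaking" && activeId !== msgId) return "ready";
  return hookStatus;
}
```

The caller would set `activeId` in `onSpeak` and pass `statusForMessage(ttsStatus, activeId, msg.id)` as the `status` prop.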
5. Auto-prewarm on mount (considered as an optional optimization): do NOT auto-prewarm TTS when PersonalAssistant mounts. Let the user trigger it on the first click of any TtsButton. This avoids unexpected downloads.
cd /opt/nexus && grep -q "VoiceRecordButton" ui/src/pages/PersonalAssistant.tsx && grep -q "TtsButton" ui/src/pages/PersonalAssistant.tsx && grep -q "usePiperTts" ui/src/pages/PersonalAssistant.tsx && echo "PASS" || echo "FAIL"
<acceptance_criteria>
- grep -q "VoiceRecordButton" ui/src/pages/PersonalAssistant.tsx returns 0
- grep -q "onTranscription" ui/src/pages/PersonalAssistant.tsx returns 0
- grep -q "TtsButton" ui/src/pages/PersonalAssistant.tsx returns 0
- grep -q "usePiperTts" ui/src/pages/PersonalAssistant.tsx returns 0
- grep -q "speak" ui/src/pages/PersonalAssistant.tsx returns 0
</acceptance_criteria>
PersonalAssistant has VoiceRecordButton in the input bar (STT via /api/transcribe) and TtsButton next to each assistant message (TTS via Piper WASM). Voice input appends to textarea for review before sending.
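Deferring the download until the first click can be sketched as a small wrapper around the hook's `prewarm` and `speak`. This is an illustration under assumptions: `createLazySpeaker` is a hypothetical name, and TtsButton from Plan 01 may already handle this sequencing internally.

```typescript
// Hypothetical wrapper: prewarms once on the first speak request, then speaks.
function createLazySpeaker(
  prewarm: () => Promise<void>,
  speak: (text: string) => Promise<void>,
): (text: string) => Promise<void> {
  let warm: Promise<void> | null = null;
  return async (text: string) => {
    if (!warm) {
      // Reset on failure so a later click can retry the download.
      warm = prewarm().catch((err) => {
        warm = null;
        throw err;
      });
    }
    await warm;
    await speak(text);
  };
}
```

With this shape nothing is downloaded at mount time, and repeated clicks share the same in-flight prewarm instead of starting a second one.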
- `grep -q "VoiceStep" ui/src/components/NexusOnboardingWizard.tsx`: voice step integrated
- `grep -q "step === 6" ui/src/components/NexusOnboardingWizard.tsx`: summary correctly at step 6
- `grep -q "VoiceRecordButton" ui/src/pages/PersonalAssistant.tsx`: STT wired
- `grep -q "TtsButton" ui/src/pages/PersonalAssistant.tsx`: TTS wired
- `grep -q "enumerateDevices" ui/src/components/onboarding/VoiceStep.tsx`: mic detection
<success_criteria>
- Onboarding wizard has voice at step 4 with mic detection and enable/skip (VOICE-03)
- Steps 5 (rootDir) and 6 (summary) work with correct Back/Continue navigation
- PersonalAssistant has VoiceRecordButton for STT input
- PersonalAssistant has TtsButton for TTS playback on assistant messages
- voiceEnabled preference is persisted when user enables voice during onboarding
</success_criteria>