---
phase: 34-voice
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - server/src/app.ts
  - server/src/services/nexus-settings.ts
  - server/src/routes/nexus-settings.ts
  - ui/src/api/hardware.ts
  - ui/src/hooks/usePiperTts.ts
  - ui/src/components/TtsButton.tsx
autonomous: true
requirements:
  - VOICE-01
  - VOICE-02
must_haves:
  truths:
    - "POST /api/transcribe is reachable and returns 503 with descriptive error when no Whisper CLI is installed"
    - "usePiperTts hook exposes prewarm/speak/status/progress and transitions idle->downloading->ready->speaking"
    - "TtsButton renders a speaker icon that calls speak() and shows download progress during prewarm"
    - "voiceEnabled boolean is persisted in nexus-settings.json and exposed via GET/PATCH /nexus/settings"
  artifacts:
    - path: "ui/src/hooks/usePiperTts.ts"
      provides: "Piper TTS hook with prewarm, speak, status, progress"
      exports: ["usePiperTts"]
    - path: "ui/src/components/TtsButton.tsx"
      provides: "Speaker button component for TTS playback"
      exports: ["TtsButton"]
  key_links:
    - from: "server/src/app.ts"
      to: "server/src/routes/chat-files.ts"
      via: "api.use(chatFileRoutes(db, opts.storageService))"
      pattern: "chatFileRoutes"
    - from: "ui/src/hooks/usePiperTts.ts"
      to: "@mintplex-labs/piper-tts-web"
      via: "import { tts }"
      pattern: "tts\\.download|tts\\.predict"
---

Fix the broken /transcribe route registration, create the Piper TTS browser hook and button component, and add voiceEnabled to nexus-settings persistence.

Purpose: VOICE-01 requires TTS on CPU-only hardware (browser WASM satisfies this). VOICE-02 requires visible download progress before first synthesis. The /transcribe route exists but is never mounted, which is a 1-line fix. voiceEnabled persistence is needed so that the onboarding voice opt-in survives sessions.

Output: Working /api/transcribe endpoint, usePiperTts hook, TtsButton component, voiceEnabled in nexus-settings.

@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/34-voice/34-RESEARCH.md
@server/src/app.ts
@server/src/routes/chat-files.ts
@server/src/services/nexus-settings.ts
@ui/src/api/hardware.ts
@ui/src/components/VoiceRecordButton.tsx

From server/src/routes/chat-files.ts:

```typescript
export function chatFileRoutes(db: Db, storage: StorageService) { ... }
// POST /transcribe: accepts multipart audio, returns { text: string } or 503
```

From server/src/app.ts (line 147 pattern):

```typescript
api.use(assetRoutes(db, opts.storageService));
// chatFileRoutes uses the same (db, opts.storageService) signature
```

From server/src/services/nexus-settings.ts:

```typescript
export const NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const;
export type NexusMode = (typeof NEXUS_MODES)[number];

const nexusSettingsSchema = z.object({
  mode: z.enum(NEXUS_MODES).default("both"),
});

export function nexusSettingsService() { ... } // returns { get(), set(patch) }
```

From ui/src/api/hardware.ts:

```typescript
export type NexusMode = "personal_ai" | "project_builder" | "both";

export interface NexusSettings {
  mode: NexusMode;
}

export function fetchNexusSettings(): Promise<NexusSettings>;
export function updateNexusSettings(settings: Partial<NexusSettings>): Promise<NexusSettings>;
```

Task 1: Register chatFileRoutes in app.ts and add voiceEnabled to nexus-settings

Files: server/src/app.ts, server/src/services/nexus-settings.ts, server/src/routes/nexus-settings.ts, ui/src/api/hardware.ts

Read first:
- server/src/app.ts (full file: find the insertion point after assistantHandoffRoutes)
- server/src/services/nexus-settings.ts (full file: understand the schema)
- server/src/routes/nexus-settings.ts (full file: understand the PATCH handler)
- ui/src/api/hardware.ts (full file: understand the client types)

**1. Register chatFileRoutes in app.ts:**
- Add the import at the top with the other route imports: `import { chatFileRoutes } from "./routes/chat-files.js";`
- Add `api.use(chatFileRoutes(db, opts.storageService));` after the `api.use(assistantHandoffRoutes(db));` line (around line 161). Mirror the `assetRoutes(db, opts.storageService)` pattern exactly.
- Do NOT place it before boardMutationGuard. The /transcribe route calls assertBoard(req) and must sit inside the guarded api sub-router.

**2. Add voiceEnabled to nexusSettingsSchema (server/src/services/nexus-settings.ts):**
- Add `voiceEnabled: z.boolean().default(false)` to the nexusSettingsSchema z.object.
- This is a file-backed JSON field, NOT a DB migration, so it is acceptable under the "no DB schema changes" constraint.

**3. Update NexusSettings type on client (ui/src/api/hardware.ts):**
- Add `voiceEnabled?: boolean` to the `NexusSettings` interface.
- No changes to the API functions are needed: they already accept `Partial<NexusSettings>`.

**4. Check nexus-settings route handler (server/src/routes/nexus-settings.ts):**
- Read the file. The PATCH handler should already forward arbitrary fields to `nexusSettingsService().set(patch)`, since it uses the Zod schema.
- If it manually picks fields, add voiceEnabled to the pick list. If it passes req.body through, no change is needed.

Verify: `cd /opt/nexus && npx vitest run server/src/__tests__/chat-file-routes.test.ts 2>&1 | tail -5`

Acceptance criteria:
- `grep -q "chatFileRoutes" server/src/app.ts` returns 0
- `grep -q "voiceEnabled" server/src/services/nexus-settings.ts` returns 0
- `grep -q "voiceEnabled" ui/src/api/hardware.ts` returns 0

Done: POST /api/transcribe is reachable (returns 503 when no Whisper CLI is installed, not 404). voiceEnabled persists in nexus-settings.json via the existing settings route.

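The persistence behaviour expected from step 2 can be illustrated without Zod. The sketch below is a dependency-free stand-in, not the actual service: `DEFAULTS`, `parseSettings` and `applyPatch` are hypothetical names, and the real code would rely on `nexusSettingsSchema` instead. It shows the two properties the task depends on, assuming the schema behaves as described: a settings file written before this change reads back with `voiceEnabled: false`, and a partial PATCH leaves `mode` untouched.

```typescript
// Hypothetical stand-in for nexusSettingsSchema; none of these helpers exist in the repo.
const NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const;
type NexusMode = (typeof NEXUS_MODES)[number];

interface NexusSettings {
  mode: NexusMode;
  voiceEnabled: boolean;
}

const DEFAULTS: NexusSettings = { mode: "both", voiceEnabled: false };

// Missing fields fall back to their defaults; values of the wrong type are rejected.
function parseSettings(raw: Record<string, unknown>): NexusSettings {
  const mode = raw.mode === undefined ? DEFAULTS.mode : raw.mode;
  const voiceEnabled =
    raw.voiceEnabled === undefined ? DEFAULTS.voiceEnabled : raw.voiceEnabled;

  if (!NEXUS_MODES.some((m) => m === mode)) {
    throw new Error(`invalid mode: ${String(mode)}`);
  }
  if (typeof voiceEnabled !== "boolean") {
    throw new Error("voiceEnabled must be a boolean");
  }
  return { mode: mode as NexusMode, voiceEnabled };
}

// A PATCH merges the partial body over the current settings, then re-validates.
function applyPatch(
  current: NexusSettings,
  patch: Record<string, unknown>,
): NexusSettings {
  return parseSettings({ ...current, ...patch });
}
```

If the real PATCH handler picks fields by hand instead of merging the whole body, the equivalent of `applyPatch` would silently drop `voiceEnabled`, which is the case step 4 asks you to check for.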
Task 2: Create usePiperTts hook and TtsButton component

Files: ui/src/hooks/usePiperTts.ts, ui/src/components/TtsButton.tsx

Read first:
- ui/src/components/VoiceRecordButton.tsx (reference for button style patterns)
- ui/src/components/ui/button.tsx (Button component API)

**0. Install piper-tts-web:**

```bash
pnpm --filter @paperclipai/ui add @mintplex-labs/piper-tts-web
```

**1. Create ui/src/hooks/usePiperTts.ts:**

```typescript
import { useState, useCallback, useRef } from "react";
import { tts } from "@mintplex-labs/piper-tts-web";

const DEFAULT_VOICE = "en_US-hfc_female-medium";

export type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";

export function usePiperTts() {
  const [status, setStatus] = useState<TtsStatus>("idle");
  const [progress, setProgress] = useState(0);
  const audioRef = useRef<HTMLAudioElement | null>(null);

  const prewarm = useCallback(async () => {
    if (status === "ready" || status === "downloading") return;
    setStatus("downloading");
    setProgress(0);
    try {
      // Skip the download when the voice model is already cached.
      const stored = await tts.stored();
      if (!stored.includes(DEFAULT_VOICE)) {
        await tts.download(DEFAULT_VOICE, (p: { loaded: number; total: number }) => {
          // Guard against total === 0 so progress never becomes NaN.
          if (p.total > 0) setProgress(Math.round((p.loaded / p.total) * 100));
        });
      }
      setStatus("ready");
      setProgress(100);
    } catch {
      setStatus("error");
    }
  }, [status]);

  const speak = useCallback(async (text: string) => {
    if (status !== "ready") return;
    // Stop any currently playing audio
    if (audioRef.current) {
      audioRef.current.pause();
      audioRef.current = null;
    }
    setStatus("speaking");
    try {
      const wav = await tts.predict({ text, voiceId: DEFAULT_VOICE });
      // predict() resolves to a WAV Blob; Audio needs a URL, so wrap it.
      const url = URL.createObjectURL(wav);
      const audio = new Audio(url);
      audioRef.current = audio;
      const finish = () => {
        URL.revokeObjectURL(url);
        audioRef.current = null;
        setStatus("ready");
      };
      audio.onended = finish;
      audio.onerror = finish;
      await audio.play();
    } catch {
      setStatus("ready");
    }
  }, [status]);

  const stop = useCallback(() => {
    if (audioRef.current) {
      audioRef.current.pause();
      audioRef.current = null;
    }
    if (status === "speaking") setStatus("ready");
  }, [status]);

  return { status, progress, prewarm, speak, stop };
}
```

Key points:
- `tts.stored()` checks the IndexedDB cache and skips the download if the model is already present (VOICE-02).
- `tts.download()` with a progress callback provides visible download progress (VOICE-02).
- `tts.predict()` resolves to a WAV Blob. Wrap it with `URL.createObjectURL()` and play it with `new Audio(url).play()` (VOICE-01, CPU-safe WASM).
- `stop()` allows interrupting playback.
- Do NOT import this in any server-side file or any test file that runs in Node: it is browser-only.

**2. Create ui/src/components/TtsButton.tsx:**

```typescript
import { Volume2, VolumeX, Loader2 } from "lucide-react";
import { Button } from "./ui/button";
import type { TtsStatus } from "../hooks/usePiperTts";

interface TtsButtonProps {
  status: TtsStatus;
  progress: number;
  onSpeak: () => void;
  onStop: () => void;
  onPrewarm: () => void;
  disabled?: boolean;
}

export function TtsButton({ status, progress, onSpeak, onStop, onPrewarm, disabled }: TtsButtonProps) {
  // Markup is a minimal sketch; match variant/size props to ui/button.tsx and VoiceRecordButton.tsx.
  if (status === "downloading") {
    return (
      <Button type="button" disabled aria-label={`Downloading voice model: ${progress}%`}>
        <Loader2 className="animate-spin" />
        <span>{progress}%</span>
      </Button>
    );
  }

  if (status === "speaking") {
    return (
      <Button type="button" onClick={onStop} disabled={disabled} aria-label="Stop speaking">
        <VolumeX />
      </Button>
    );
  }

  // idle or error: clicking triggers prewarm (the parent decides whether to chain speak())
  // ready: clicking triggers speak directly
  const handleClick = () => {
    if (status === "ready") {
      onSpeak();
    } else {
      onPrewarm();
    }
  };

  return (
    <Button type="button" onClick={handleClick} disabled={disabled} aria-label="Read aloud">
      <Volume2 />
    </Button>
  );
}
```

The TtsButton receives status/progress from the hook and delegates actions. It does NOT import piper-tts-web directly: all TTS logic stays in the hook. The button is reusable: PersonalAssistant (Plan 02) will place it next to assistant messages.

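The status transitions the hook is meant to follow (idle, downloading, ready, speaking) can be written down as a pure function, which makes them checkable in Node without loading the browser-only library. This is only a sketch of the intended behaviour as read from the hook: `TtsEvent`, `nextStatus` and `percent` are hypothetical names that do not appear in the plan, and the hook itself uses `setStatus` calls, not a reducer.

```typescript
// Hypothetical model of the usePiperTts status transitions; not part of the plan's files.
type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";

type TtsEvent =
  | "PREWARM"        // prewarm() was called
  | "DOWNLOAD_DONE"  // model was cached or the download finished
  | "DOWNLOAD_FAIL"  // tts.stored() or tts.download() threw
  | "SPEAK"          // speak() was called
  | "PLAYBACK_END";  // audio ended, errored, or stop() was called

function nextStatus(status: TtsStatus, event: TtsEvent): TtsStatus {
  switch (event) {
    case "PREWARM":
      // Mirrors the hook's guard: no-op while downloading or once ready.
      return status === "ready" || status === "downloading" ? status : "downloading";
    case "DOWNLOAD_DONE":
      return status === "downloading" ? "ready" : status;
    case "DOWNLOAD_FAIL":
      return status === "downloading" ? "error" : status;
    case "SPEAK":
      // speak() is ignored unless the model is ready.
      return status === "ready" ? "speaking" : status;
    case "PLAYBACK_END":
      return status === "speaking" ? "ready" : status;
  }
}

// Progress as an integer percentage, guarded against a zero or unknown total.
function percent(loaded: number, total: number): number {
  if (total <= 0) return 0;
  return Math.min(100, Math.round((loaded / total) * 100));
}
```

Keeping the transitions in one table like this also makes the retry path explicit: an `error` status accepts `PREWARM` again, which matches TtsButton calling `onPrewarm()` for both idle and error.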
Verify: `cd /opt/nexus && grep -q "usePiperTts" ui/src/hooks/usePiperTts.ts && grep -q "TtsButton" ui/src/components/TtsButton.tsx && { grep -q "piper-tts-web" ui/package.json 2>/dev/null || grep -q "piper-tts-web" pnpm-lock.yaml; } && echo "PASS" || echo "FAIL"`

The braces group the two package checks so that a missing hook or component file cannot be masked by a match in pnpm-lock.yaml.

Acceptance criteria:
- `grep -q "tts.download" ui/src/hooks/usePiperTts.ts` returns 0
- `grep -q "tts.predict" ui/src/hooks/usePiperTts.ts` returns 0
- `grep -q "tts.stored" ui/src/hooks/usePiperTts.ts` returns 0
- `grep -q "TtsButton" ui/src/components/TtsButton.tsx` returns 0
- `grep -q "piper-tts-web" pnpm-lock.yaml` returns 0
- `grep -q "Volume2" ui/src/components/TtsButton.tsx` returns 0

Done: usePiperTts hook handles download progress (VOICE-02) and CPU-safe WASM synthesis (VOICE-01). TtsButton shows download progress during prewarm and a speaker icon for playback. piper-tts-web is installed as a UI dependency.

Verification:
- `grep -q "chatFileRoutes" server/src/app.ts`: route is registered
- `grep -q "voiceEnabled" server/src/services/nexus-settings.ts`: settings schema extended
- `ls ui/src/hooks/usePiperTts.ts ui/src/components/TtsButton.tsx`: both files exist
- `npx vitest run server/src/__tests__/chat-file-routes.test.ts`: existing route tests pass

Success criteria:
1. POST /api/transcribe returns 503 (not 404) when no Whisper CLI is installed, which shows the route is mounted
2. usePiperTts hook exports prewarm(), speak(), stop(), status, progress
3. TtsButton renders download progress during prewarm and a speaker icon for playback
4. voiceEnabled persists in nexus-settings.json

Output: After completion, create `.planning/phases/34-voice/34-01-SUMMARY.md`