nexus/.planning/phases/34-voice/34-01-PLAN.md

---
phase: 34-voice
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - server/src/app.ts
  - server/src/services/nexus-settings.ts
  - server/src/routes/nexus-settings.ts
  - ui/src/api/hardware.ts
  - ui/src/hooks/usePiperTts.ts
  - ui/src/components/TtsButton.tsx
autonomous: true
requirements:
  - VOICE-01
  - VOICE-02

must_haves:
  truths:
    - "POST /api/transcribe is reachable and returns 503 with descriptive error when no Whisper CLI is installed"
    - "usePiperTts hook exposes prewarm/speak/status/progress and transitions idle->downloading->ready->speaking"
    - "TtsButton renders a speaker icon that calls speak() and shows download progress during prewarm"
    - "voiceEnabled boolean is persisted in nexus-settings.json and exposed via GET/PATCH /nexus/settings"
  artifacts:
    - path: "ui/src/hooks/usePiperTts.ts"
      provides: "Piper TTS hook with prewarm, speak, status, progress"
      exports: ["usePiperTts"]
    - path: "ui/src/components/TtsButton.tsx"
      provides: "Speaker button component for TTS playback"
      exports: ["TtsButton"]
  key_links:
    - from: "server/src/app.ts"
      to: "server/src/routes/chat-files.ts"
      via: "api.use(chatFileRoutes(db, opts.storageService))"
      pattern: "chatFileRoutes"
    - from: "ui/src/hooks/usePiperTts.ts"
      to: "@mintplex-labs/piper-tts-web"
      via: "import { tts }"
      pattern: "tts\\.download|tts\\.predict"
---

<objective>
Fix the broken /transcribe route registration, create the Piper TTS browser hook and button component, and add voiceEnabled to nexus-settings persistence.

Purpose: VOICE-01 requires TTS on CPU-only hardware, which in-browser WASM synthesis satisfies. VOICE-02 requires visible download progress before first synthesis. The /transcribe route exists but is never mounted; mounting it takes one `api.use(...)` call plus its import. voiceEnabled must be persisted so the onboarding voice opt-in survives across sessions.

Output: Working /api/transcribe endpoint, usePiperTts hook, TtsButton component, voiceEnabled in nexus-settings.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/34-voice/34-RESEARCH.md

@server/src/app.ts
@server/src/routes/chat-files.ts
@server/src/services/nexus-settings.ts
@ui/src/api/hardware.ts
@ui/src/components/VoiceRecordButton.tsx

<interfaces>
<!-- Existing interfaces the executor needs -->

From server/src/routes/chat-files.ts:
```typescript
export function chatFileRoutes(db: Db, storage: StorageService) { ... }
// POST /transcribe — accepts multipart audio, returns { text: string } or 503
```

From server/src/app.ts (line 147 pattern):
```typescript
api.use(assetRoutes(db, opts.storageService));
// chatFileRoutes uses the same (db, opts.storageService) signature
```

From server/src/services/nexus-settings.ts:
```typescript
export const NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const;
export type NexusMode = (typeof NEXUS_MODES)[number];
const nexusSettingsSchema = z.object({
  mode: z.enum(NEXUS_MODES).default("both"),
});
export function nexusSettingsService() { get(), set(patch) }
```

From ui/src/api/hardware.ts:
```typescript
export type NexusMode = "personal_ai" | "project_builder" | "both";
export interface NexusSettings { mode: NexusMode; }
export function fetchNexusSettings(): Promise<NexusSettings>;
export function updateNexusSettings(settings: Partial<NexusSettings>): Promise<NexusSettings>;
```
</interfaces>
</context>

<tasks>

<task type="auto">
  <name>Task 1: Register chatFileRoutes in app.ts and add voiceEnabled to nexus-settings</name>
  <files>server/src/app.ts, server/src/services/nexus-settings.ts, server/src/routes/nexus-settings.ts, ui/src/api/hardware.ts</files>
  <read_first>
    - server/src/app.ts (full file — find insertion point after assistantHandoffRoutes)
    - server/src/services/nexus-settings.ts (full file — understand schema)
    - server/src/routes/nexus-settings.ts (full file — understand PATCH handler)
    - ui/src/api/hardware.ts (full file — understand client types)
  </read_first>
  <action>
**1. Register chatFileRoutes in app.ts:**
- Add import at top with other route imports: `import { chatFileRoutes } from "./routes/chat-files.js";`
- Add `api.use(chatFileRoutes(db, opts.storageService));` after the `api.use(assistantHandoffRoutes(db));` line (around line 161). Mirror the `assetRoutes(db, opts.storageService)` pattern exactly.
- Do NOT place it before boardMutationGuard — the /transcribe route calls assertBoard(req) and needs to be inside the guarded api sub-router.

**2. Add voiceEnabled to nexusSettingsSchema (server/src/services/nexus-settings.ts):**
- Add `voiceEnabled: z.boolean().default(false)` to the nexusSettingsSchema z.object.
- This is a file-backed JSON field, NOT a DB migration — acceptable under the "no DB schema changes" constraint.
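
A minimal sketch of why the default matters, written in plain TypeScript so it runs without Zod. `SETTINGS_DEFAULTS` and `withDefaults` are hypothetical names used only for illustration; the real service relies on the schema's `.default(false)` instead. It shows a settings file written before this plan (no `voiceEnabled` key) resolving to `voiceEnabled: false`.

```typescript
type NexusMode = "personal_ai" | "project_builder" | "both";

interface NexusSettings {
  mode: NexusMode;
  voiceEnabled: boolean;
}

// Hypothetical defaults table standing in for the Zod schema defaults.
const SETTINGS_DEFAULTS: NexusSettings = { mode: "both", voiceEnabled: false };

// Fill in any field missing from an older settings file.
// Unlike Zod, this sketch does no validation; it only applies defaults.
function withDefaults(raw: Partial<NexusSettings>): NexusSettings {
  return {
    mode: raw.mode ?? SETTINGS_DEFAULTS.mode,
    voiceEnabled: raw.voiceEnabled ?? SETTINGS_DEFAULTS.voiceEnabled,
  };
}

// A file persisted before voiceEnabled existed.
const legacyFile = JSON.parse('{"mode":"personal_ai"}') as Partial<NexusSettings>;
const resolved = withDefaults(legacyFile);
```

Existing installs therefore start with voice off until onboarding or a PATCH turns it on.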

**3. Update NexusSettings type on client (ui/src/api/hardware.ts):**
- Add `voiceEnabled?: boolean` to the `NexusSettings` interface.
- No changes to API functions needed — they already handle Partial<NexusSettings>.

**4. Check nexus-settings route handler (server/src/routes/nexus-settings.ts):**
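- Read the file. The PATCH handler should already forward any field the Zod schema accepts to `nexusSettingsService().set(patch)`. If it manually picks fields, add voiceEnabled to the pick list. If it passes req.body through, no change is needed; note that a plain `z.object` strips unknown keys by default, so if the body is parsed with the schema, the pass-through path only carries voiceEnabled once step 2 has added it.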
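
For the "manually picks fields" case, a pick list could look roughly like the sketch below. `pickSettingsPatch` and `isNexusMode` are hypothetical helpers, not code from the existing route; the point is only that `voiceEnabled` must appear in the allow-list or the PATCH silently drops it.

```typescript
const NEXUS_MODES = ["personal_ai", "project_builder", "both"] as const;
type NexusMode = (typeof NEXUS_MODES)[number];

interface NexusSettings {
  mode: NexusMode;
  voiceEnabled: boolean;
}

// Hypothetical type guard for the mode field.
function isNexusMode(value: string): value is NexusMode {
  return NEXUS_MODES.some((m) => m === value);
}

// Hypothetical allow-list: copy only known, well-typed fields from the body.
function pickSettingsPatch(body: Record<string, unknown>): Partial<NexusSettings> {
  const patch: Partial<NexusSettings> = {};
  const mode = body["mode"];
  if (typeof mode === "string" && isNexusMode(mode)) patch.mode = mode;
  const voiceEnabled = body["voiceEnabled"];
  if (typeof voiceEnabled === "boolean") patch.voiceEnabled = voiceEnabled;
  return patch;
}

// Merging the patch over current settings keeps untouched fields intact.
const before: NexusSettings = { mode: "personal_ai", voiceEnabled: false };
const after: NexusSettings = {
  ...before,
  ...pickSettingsPatch({ voiceEnabled: true, unexpected: 1 }),
};
```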
  </action>
  <verify>
    <automated>cd /opt/nexus && npx vitest run server/src/__tests__/chat-file-routes.test.ts 2>&1 | tail -5</automated>
  </verify>
  <acceptance_criteria>
    - grep -q "chatFileRoutes" server/src/app.ts returns 0
    - grep -q "voiceEnabled" server/src/services/nexus-settings.ts returns 0
    - grep -q "voiceEnabled" ui/src/api/hardware.ts returns 0
  </acceptance_criteria>
  <done>POST /api/transcribe is reachable (returns 503 when no Whisper CLI installed, not 404). voiceEnabled persists in nexus-settings.json via the existing settings route.</done>
</task>

<task type="auto">
  <name>Task 2: Create usePiperTts hook and TtsButton component</name>
  <files>ui/src/hooks/usePiperTts.ts, ui/src/components/TtsButton.tsx</files>
  <read_first>
    - ui/src/components/VoiceRecordButton.tsx (reference for button style patterns)
    - ui/src/components/ui/button.tsx (Button component API)
  </read_first>
  <action>
**0. Install piper-tts-web:**
```bash
pnpm --filter @paperclipai/ui add @mintplex-labs/piper-tts-web
```

**1. Create ui/src/hooks/usePiperTts.ts:**
```typescript
import { useState, useCallback, useRef } from "react";
import { tts } from "@mintplex-labs/piper-tts-web";

const DEFAULT_VOICE = "en_US-hfc_female-medium";

export type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";

export function usePiperTts() {
  const [status, setStatus] = useState<TtsStatus>("idle");
  const [progress, setProgress] = useState(0);
  const audioRef = useRef<HTMLAudioElement | null>(null);

  const prewarm = useCallback(async () => {
    // Only start a download from idle or error; never interrupt playback.
    if (status !== "idle" && status !== "error") return;
    setStatus("downloading");
    setProgress(0);
    try {
      const stored = await tts.stored();
      if (!stored.includes(DEFAULT_VOICE)) {
        await tts.download(DEFAULT_VOICE, (p: { loaded: number; total: number }) => {
          // Guard against a zero or missing total so progress never becomes NaN.
          setProgress(p.total > 0 ? Math.round((p.loaded / p.total) * 100) : 0);
        });
      }
      setStatus("ready");
      setProgress(100);
    } catch {
      setStatus("error");
    }
  }, [status]);

  const speak = useCallback(async (text: string) => {
    if (status !== "ready") return;
    // Stop any currently playing audio
    if (audioRef.current) {
      audioRef.current.pause();
      audioRef.current = null;
    }
    setStatus("speaking");
    try {
      // tts.predict() resolves to a WAV Blob; the audio element needs an object URL for it.
      const wav = await tts.predict({ text, voiceId: DEFAULT_VOICE });
      const url = URL.createObjectURL(wav);
      const audio = new Audio(url);
      audioRef.current = audio;
      const finish = () => {
        URL.revokeObjectURL(url); // release the Blob once playback is over
        audioRef.current = null;
        setStatus("ready");
      };
      audio.onended = finish;
      audio.onerror = finish;
      await audio.play();
    } catch {
      setStatus("ready");
    }
  }, [status]);

  const stop = useCallback(() => {
    if (audioRef.current) {
      audioRef.current.pause();
      audioRef.current = null;
    }
    if (status === "speaking") setStatus("ready");
  }, [status]);

  return { status, progress, prewarm, speak, stop };
}
```

Key points:
- `tts.stored()` checks the IndexedDB cache and skips the download if the model is already present (VOICE-02).
- `tts.download()` with a progress callback provides visible download progress (VOICE-02).
- `tts.predict()` resolves to a WAV `Blob`; wrap it with `URL.createObjectURL()` before passing it to `new Audio(url)`, and revoke the URL when playback ends (VOICE-01, CPU-safe WASM).
- `stop()` allows interrupting playback.
- Do NOT import this in any server-side or test file running in Node — browser-only.
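
The status transitions the hook is meant to follow (idle -> downloading -> ready -> speaking) can be sketched as a pure function, which is easy to unit-test in Node because it touches neither React nor piper-tts-web. The event names and the `nextStatus` and `toPercent` helpers are assumptions made for this sketch; the hook itself does not expose them.

```typescript
type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";
// Hypothetical event names, not part of the hook's API.
type TtsEvent = "prewarm" | "downloaded" | "speak" | "ended" | "stop" | "failed";

function nextStatus(status: TtsStatus, event: TtsEvent): TtsStatus {
  switch (status) {
    case "idle":
    case "error":
      // A download only starts from idle or error.
      return event === "prewarm" ? "downloading" : status;
    case "downloading":
      if (event === "downloaded") return "ready";
      if (event === "failed") return "error";
      return status;
    case "ready":
      return event === "speak" ? "speaking" : status;
    case "speaking":
      // Playback end, an explicit stop, and a playback error all fall back to ready.
      return event === "ended" || event === "stop" || event === "failed" ? "ready" : status;
  }
}

// Convert a loaded/total pair to a whole percentage, treating an unknown total as 0%.
function toPercent(loaded: number, total: number): number {
  if (!(total > 0)) return 0;
  return Math.min(100, Math.round((loaded / total) * 100));
}

// Walk the happy path and record every status visited.
const happyPath: TtsEvent[] = ["prewarm", "downloaded", "speak", "ended"];
let current: TtsStatus = "idle";
const visited: TtsStatus[] = [current];
for (const event of happyPath) {
  current = nextStatus(current, event);
  visited.push(current);
}
```

Events that make no sense in the current state (for example `speak` while idle) leave the status unchanged, mirroring the early returns in the hook.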

**2. Create ui/src/components/TtsButton.tsx:**
```typescript
import { Volume2, VolumeX, Loader2 } from "lucide-react";
import { Button } from "./ui/button";
import type { TtsStatus } from "../hooks/usePiperTts";

interface TtsButtonProps {
  status: TtsStatus;
  progress: number;
  onSpeak: () => void;
  onStop: () => void;
  onPrewarm: () => void;
  disabled?: boolean;
}

export function TtsButton({ status, progress, onSpeak, onStop, onPrewarm, disabled }: TtsButtonProps) {
  if (status === "downloading") {
    return (
      <Button variant="ghost" size="icon" className="h-8 w-8 relative" disabled title={`Downloading voice model: ${progress}%`}>
        <Loader2 className="h-4 w-4 animate-spin" />
        <span className="absolute -bottom-1 text-[10px] text-muted-foreground">{progress}%</span>
      </Button>
    );
  }

  if (status === "speaking") {
    return (
      <Button
        variant="ghost"
        size="icon"
        className="h-8 w-8 text-primary"
        onClick={onStop}
        aria-label="Stop speaking"
        title="Stop speaking"
      >
        <VolumeX className="h-4 w-4" />
      </Button>
    );
  }

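  // idle: clicking calls onPrewarm; the caller decides whether to chain speak() once the model is ready
  // error: the button is disabled, so no click handler runs
  // ready: clicking triggers speak directly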
  const handleClick = () => {
    if (status === "ready") {
      onSpeak();
    } else {
      onPrewarm();
    }
  };

  return (
    <Button
      variant="ghost"
      size="icon"
      className="h-8 w-8"
      onClick={handleClick}
      disabled={disabled || status === "error"}
      aria-label="Read aloud"
      title={status === "error" ? "TTS unavailable" : status === "idle" ? "Download voice model and read aloud" : "Read aloud"}
    >
      <Volume2 className="h-4 w-4" />
    </Button>
  );
}
```

The TtsButton receives status/progress from the hook and delegates actions. It does NOT import piper-tts-web directly — all TTS logic stays in the hook. The button is reusable: PersonalAssistant (Plan 02) will place it next to assistant messages.
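
The button's per-status behavior can be summarized as a small lookup, sketched here without JSX so it can be checked outside the browser. `describeTtsButton`, `TtsButtonView`, and `TtsClickAction` are hypothetical names for this illustration only; the titles and disabled rules are copied from the component above.

```typescript
type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";
// Hypothetical summary of which callback a click would invoke.
type TtsClickAction = "prewarm" | "speak" | "stop" | "none";

interface TtsButtonView {
  title: string;
  disabled: boolean;
  action: TtsClickAction;
}

function describeTtsButton(status: TtsStatus, progress: number, disabled = false): TtsButtonView {
  switch (status) {
    case "downloading":
      // Always disabled while the model downloads; the title carries the progress.
      return { title: `Downloading voice model: ${progress}%`, disabled: true, action: "none" };
    case "speaking":
      // The stop button ignores the disabled prop so playback can always be interrupted.
      return { title: "Stop speaking", disabled: false, action: "stop" };
    case "ready":
      return { title: "Read aloud", disabled, action: disabled ? "none" : "speak" };
    case "idle":
      return {
        title: "Download voice model and read aloud",
        disabled,
        action: disabled ? "none" : "prewarm",
      };
    case "error":
      return { title: "TTS unavailable", disabled: true, action: "none" };
  }
}

const whileDownloading = describeTtsButton("downloading", 42);
const whenReady = describeTtsButton("ready", 100);
```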
  </action>
  <verify>
    <automated>cd /opt/nexus && grep -q "usePiperTts" ui/src/hooks/usePiperTts.ts && grep -q "TtsButton" ui/src/components/TtsButton.tsx && { grep -q "piper-tts-web" ui/package.json 2>/dev/null || grep -q "piper-tts-web" pnpm-lock.yaml; } && echo "PASS" || echo "FAIL"</automated>
  </verify>
  <acceptance_criteria>
    - grep -q "tts.download" ui/src/hooks/usePiperTts.ts returns 0
    - grep -q "tts.predict" ui/src/hooks/usePiperTts.ts returns 0
    - grep -q "tts.stored" ui/src/hooks/usePiperTts.ts returns 0
    - grep -q "TtsButton" ui/src/components/TtsButton.tsx returns 0
    - grep -q "piper-tts-web" pnpm-lock.yaml returns 0
    - grep -q "Volume2" ui/src/components/TtsButton.tsx returns 0
  </acceptance_criteria>
  <done>usePiperTts hook handles download progress (VOICE-02) and CPU-safe WASM synthesis (VOICE-01). TtsButton shows download progress during prewarm and speaker icon for playback. piper-tts-web is installed as a UI dependency.</done>
</task>

</tasks>

<verification>
- `grep -q "chatFileRoutes" server/src/app.ts` — route is registered
- `grep -q "voiceEnabled" server/src/services/nexus-settings.ts` — settings schema extended
- `ls ui/src/hooks/usePiperTts.ts ui/src/components/TtsButton.tsx` — both files exist
- `npx vitest run server/src/__tests__/chat-file-routes.test.ts` — existing route tests pass
</verification>

<success_criteria>
1. POST /api/transcribe returns 503 (not 404) when no Whisper CLI is installed — route is mounted
2. usePiperTts hook exports prewarm(), speak(), stop(), status, progress
3. TtsButton renders download progress during prewarm and speaker icon for playback
4. voiceEnabled persists in nexus-settings.json
</success_criteria>

<output>
After completion, create `.planning/phases/34-voice/34-01-SUMMARY.md`
</output>