---
phase: 34-voice
plan: 01
subsystem: api, ui, tts
tags: [piper-tts, whisper, voice, tts, nexus-settings, chat-files]

# Dependency graph
requires:
  - phase: 33-persistent-memory
    provides: assistantHandoffRoutes, nexusSettingsRoutes, chat.ts streaming
  - phase: 25-file-system
    provides: chatFileRoutes, StorageService, chatFileService

provides:
  - POST /api/transcribe route registered (returns 503 when Whisper CLI absent, not 404)
  - GET/PATCH /nexus/settings with voiceEnabled persistence
  - usePiperTts hook with prewarm/speak/stop/status/progress
  - TtsButton component with download progress and speaker icon

affects: [34-02-PLAN, PersonalAssistant, VoiceOnboarding]

# Tech tracking
tech-stack:
  added: ["@mintplex-labs/piper-tts-web (WASM TTS in browser)"]
  patterns:
    - "Browser WASM TTS: tts.stored() → tts.download(voice, progressCb) → tts.predict({text, voiceId})"
    - "TtsButton receives status/progress props from hook; zero TTS logic in component"
    - "nexus-settings: file-backed JSON via Zod schema (no DB migration)"

key-files:
  created:
    - ui/src/hooks/usePiperTts.ts
    - ui/src/components/TtsButton.tsx
  modified:
    - server/src/app.ts
    - server/src/services/nexus-settings.ts
    - ui/src/api/hardware.ts
    - ui/package.json
    - pnpm-lock.yaml

key-decisions:
  - "chatFileRoutes registered inside boardMutationGuard (after assistantHandoffRoutes); assertBoard() requires authenticated api router"
  - "nexusSettingsRoutes also registered here (was missing from app.ts despite file existing)"
  - "voiceEnabled as Zod boolean with default(false); file-backed JSON, no DB migration needed"
  - "Default return values in nexusSettingsService.get() updated to include voiceEnabled: false for safe fallback"

patterns-established:
  - "TTS hook pattern: prewarm triggers download, speak asserts ready status, stop interrupts audio"
  - "Audio playback via new Audio(blobUrl).play(); CPU-safe, no GPU required"

requirements-completed: [VOICE-01, VOICE-02]

# Metrics
duration: 3min
completed: 2026-04-01
---

# Phase 34 Plan 01: Voice Foundation Summary

**chatFileRoutes and nexusSettingsRoutes mounted in app.ts; voiceEnabled added to nexus-settings; usePiperTts hook and TtsButton component created with @mintplex-labs/piper-tts-web WASM synthesis**

## Performance

- **Duration:** ~3 min
- **Started:** 2026-04-01T22:32:52Z
- **Completed:** 2026-04-01T22:35:57Z
- **Tasks:** 2/2
- **Files modified:** 7

## Accomplishments

### Task 1: Register chatFileRoutes in app.ts and add voiceEnabled to nexus-settings

- Imported `chatFileRoutes` and mounted `chatFileRoutes(db, opts.storageService)` with `api.use()` after `assistantHandoffRoutes(db)` in `server/src/app.ts`, so POST /api/transcribe now returns 503 (not 404) when the Whisper CLI is absent
- Imported and registered `nexusSettingsRoutes()`, which was also missing from app.ts
- Extended `nexusSettingsSchema` with `voiceEnabled: z.boolean().default(false)` in `server/src/services/nexus-settings.ts`
- Updated the default fallback returns in `nexusSettingsService.get()` to include `voiceEnabled: false`
- Added `voiceEnabled?: boolean` to the `NexusSettings` client interface in `ui/src/api/hardware.ts`
- All 10 existing chat-file-routes tests pass

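
The 404-versus-503 distinction in Task 1 can be illustrated with a small, dependency-free sketch. The route table, `dispatch`, and `transcribeHandler` below are hypothetical stand-ins and not the project's router setup in `server/src/app.ts`: a path that was never registered falls through to 404, while a registered route whose backing CLI is missing reports 503.

```typescript
// Hypothetical stand-ins for illustration; the real server mounts routers on `api`.
type RouteResult = { status: number; body: string };
type Handler = () => RouteResult;
type RouteTable = Map<string, Handler>;

// Look up "METHOD path"; anything that was never registered is a 404.
function dispatch(routes: RouteTable, method: string, path: string): RouteResult {
  const handler = routes.get(`${method} ${path}`);
  return handler ? handler() : { status: 404, body: "Not Found" };
}

// Assumed handler shape: the route exists, but it answers
// 503 Service Unavailable when the Whisper CLI cannot be found.
function transcribeHandler(whisperAvailable: boolean): Handler {
  return () =>
    whisperAvailable
      ? { status: 200, body: "transcript" }
      : { status: 503, body: "Whisper CLI not installed" };
}

// Before the fix: the router was never mounted, so the path is unknown.
const beforeFix: RouteTable = new Map();

// After the fix: the route is mounted, but the CLI is still absent.
const afterFix: RouteTable = new Map([
  ["POST /api/transcribe", transcribeHandler(false)],
]);

const missingRoute = dispatch(beforeFix, "POST", "/api/transcribe");
const missingCli = dispatch(afterFix, "POST", "/api/transcribe");
```

A 503 tells the client that the endpoint exists and may work once the dependency is installed, which a 404 cannot convey.
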
### Task 2: Create usePiperTts hook and TtsButton component

- Installed `@mintplex-labs/piper-tts-web` as a UI dependency
- Created `ui/src/hooks/usePiperTts.ts`, which exposes `prewarm`, `speak`, `stop`, `status` (idle/downloading/ready/speaking/error), and `progress` (0-100)
- `tts.stored()` checks which voice models are already cached in the browser and skips the download if the model is present (satisfies VOICE-02 caching)
- `tts.download()` with a progress callback shows download progress during prewarm (satisfies VOICE-02 UX)
- `tts.predict()` synthesizes WAV audio that is played through a blob URL; synthesis is CPU-only WASM with no GPU required (satisfies VOICE-01)
- Created `ui/src/components/TtsButton.tsx`, which shows Loader2 plus a progress percentage while downloading, VolumeX (to stop) while speaking, and Volume2 in the idle and ready states
- TtsButton receives all state and callbacks from the hook, so the component contains no TTS logic

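
The hook's flow (check the cache, download with progress if needed, then synthesize) can be sketched independently of React. The `TtsEngine` interface below is an assumption that mirrors only the three call shapes named in this summary; `prewarm`, `speak`, `TtsState`, and `fakeEngine` are hypothetical names and do not reproduce the contents of `usePiperTts.ts`.

```typescript
type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";

// Assumed engine surface, modeled on the calls named in this summary.
interface TtsEngine {
  stored(): Promise<string[]>;
  download(
    voiceId: string,
    onProgress: (p: { loaded: number; total: number }) => void,
  ): Promise<void>;
  predict(input: { text: string; voiceId: string }): Promise<unknown>;
}

interface TtsState {
  status: TtsStatus;
  progress: number; // 0-100
}

// Download the voice only if it is not cached, reporting progress as a percentage.
async function prewarm(
  engine: TtsEngine,
  voiceId: string,
  onChange: (s: TtsState) => void,
): Promise<TtsState> {
  try {
    const cached = await engine.stored();
    if (!cached.includes(voiceId)) {
      onChange({ status: "downloading", progress: 0 });
      await engine.download(voiceId, ({ loaded, total }) => {
        const pct = total > 0 ? Math.round((loaded * 100) / total) : 0;
        onChange({ status: "downloading", progress: pct });
      });
    }
    const ready: TtsState = { status: "ready", progress: 100 };
    onChange(ready);
    return ready;
  } catch (err) {
    const failed: TtsState = { status: "error", progress: 0 };
    onChange(failed);
    return failed;
  }
}

// speak refuses to run unless prewarm has finished.
async function speak(
  engine: TtsEngine,
  voiceId: string,
  text: string,
  state: TtsState,
): Promise<unknown> {
  if (state.status !== "ready") {
    throw new Error("TTS not ready; call prewarm first");
  }
  return engine.predict({ text, voiceId });
}

// Fake engine for demonstration: emits two progress ticks during download.
function fakeEngine(cached: string[]): TtsEngine {
  return {
    stored: async () => cached,
    download: async (_voiceId, onProgress) => {
      onProgress({ loaded: 50, total: 100 });
      onProgress({ loaded: 100, total: 100 });
    },
    predict: async () => new Uint8Array(),
  };
}
```

Injecting the engine keeps the state transitions testable without loading the WASM runtime or a voice model.
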
## Verification

- `grep -q "chatFileRoutes" server/src/app.ts`: PASS
- `grep -q "voiceEnabled" server/src/services/nexus-settings.ts`: PASS
- `ls ui/src/hooks/usePiperTts.ts ui/src/components/TtsButton.tsx`: PASS
- `npx vitest run server/src/__tests__/chat-file-routes.test.ts`: 10/10 tests PASS

## Deviations from Plan

### Auto-fixed Issues

**1. [Rule 2 - Missing Critical Functionality] nexusSettingsRoutes was also missing from app.ts**
- **Found during:** Task 1
- **Issue:** The plan only mentioned registering `chatFileRoutes`, but `nexusSettingsRoutes` was also absent from app.ts despite the route file existing
- **Fix:** Added both `chatFileRoutes` and `nexusSettingsRoutes` imports and `api.use()` registrations
- **Files modified:** `server/src/app.ts`
- **Commit:** 0d318a31

**2. [Rule 2 - Missing Critical Functionality] Default fallback in nexusSettingsService.get() needed voiceEnabled**
- **Found during:** Task 1
- **Issue:** The hardcoded fallback `{ mode: "both" }` would cause a TypeScript type mismatch once `voiceEnabled` was added to the schema
- **Fix:** Updated both the catch-path and the parse-failure-path returns to include `voiceEnabled: false`
- **Files modified:** `server/src/services/nexus-settings.ts`
- **Commit:** 0d318a31

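
Both fallback paths can be sketched without the project's dependencies. The parser below is a hand-rolled stand-in for the Zod schema (it is not `nexusSettingsSchema`), and `loadSettings`, `parseSettings`, and `DEFAULTS` are hypothetical names. Only the `"both"` value of `mode` appears in this summary, so `mode` is typed loosely as a string here.

```typescript
interface NexusSettings {
  mode: string; // only "both" is confirmed by this summary
  voiceEnabled: boolean;
}

const DEFAULTS: NexusSettings = { mode: "both", voiceEnabled: false };

// Stand-in for a schema parse: returns null when the shape is wrong and
// treats a missing voiceEnabled as false, like z.boolean().default(false).
function parseSettings(value: unknown): NexusSettings | null {
  if (typeof value !== "object" || value === null) return null;
  const raw = value as Record<string, unknown>;
  const mode = raw.mode;
  if (typeof mode !== "string") return null;
  const voice = raw.voiceEnabled;
  if (voice !== undefined && typeof voice !== "boolean") return null;
  return { mode, voiceEnabled: voice === true };
}

// readText is injected so the sketch runs without touching the file system.
function loadSettings(readText: () => string): NexusSettings {
  try {
    const parsed = parseSettings(JSON.parse(readText()));
    // Parse-failure path: the file exists but has the wrong shape.
    return parsed ?? { ...DEFAULTS };
  } catch (err) {
    // Catch path: the file is missing or is not valid JSON.
    return { ...DEFAULTS };
  }
}
```

Because every return path yields a complete `NexusSettings` object, callers never see `voiceEnabled` as `undefined`, even for settings files written before the field existed.
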
## Known Stubs

None. All functionality is wired. TtsButton requires the caller to provide status/progress/callbacks from usePiperTts (documented pattern for Plan 02 integration into PersonalAssistant).

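
The caller-provides-state pattern amounts to a pure mapping from hook state to what the button renders. A minimal sketch follows; `buttonView` and `ButtonView` are hypothetical, and the icon names are plain strings standing in for the Loader2, VolumeX, and Volume2 icons listed under Task 2. How the real component treats the `error` status is not stated in this summary, so showing the speaker icon for it is an assumption.

```typescript
type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";

interface ButtonView {
  icon: "Loader2" | "VolumeX" | "Volume2";
  label: string;
  action: "none" | "stop" | "speak";
}

// Pure mapping: all state comes from the caller, none is held by the button.
function buttonView(status: TtsStatus, progress: number): ButtonView {
  switch (status) {
    case "downloading":
      // Spinner plus percentage; clicking does nothing mid-download.
      return { icon: "Loader2", label: `${Math.round(progress)}%`, action: "none" };
    case "speaking":
      // Clicking while audio plays stops playback.
      return { icon: "VolumeX", label: "Stop", action: "stop" };
    default:
      // idle and ready show the speaker icon; "error" is handled the same
      // way here only as an assumption.
      return { icon: "Volume2", label: "Speak", action: "speak" };
  }
}
```

In Plan 02 the parent would pass `status` and `progress` from the hook into the component, which keeps the button free of any TTS imports.
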
## Commits

| Commit | Task | Description |
|--------|------|-------------|
| 0d318a31 | Task 1 | feat(34-01): register chatFileRoutes + nexusSettingsRoutes in app.ts, add voiceEnabled to nexus-settings |
| 8f8257e1 | Task 2 | feat(34-01): create usePiperTts hook and TtsButton component with piper-tts-web |

## Self-Check: PASSED