docs(34-01): complete voice foundation plan — chatFileRoutes, usePiperTts, TtsButton, voiceEnabled

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Nexus Dev 2026-04-03 22:36:11 +00:00
parent 847f316319
commit 2b568a0f5d
4 changed files with 147 additions and 17 deletions

@ -25,8 +25,8 @@
### Voice
- [ ] **VOICE-01**: User gets Piper TTS speech output that works on CPU-only hardware
- [ ] **VOICE-02**: Piper TTS pre-warms on first use with visible download progress (no silent 15-30s hang)
- [x] **VOICE-01**: User gets Piper TTS speech output that works on CPU-only hardware
- [x] **VOICE-02**: Piper TTS pre-warms on first use with visible download progress (no silent 15-30s hang)
- [ ] **VOICE-03**: Voice features (Whisper STT + Piper TTS) offered during onboarding based on hardware capability
### Personal AI Assistant
@ -85,8 +85,8 @@
| ASST-02 | Phase 33 | Complete |
| ASST-03 | Phase 33 | Complete |
| ASST-04 | Phase 33 | Complete |
| VOICE-01 | Phase 34 | Pending |
| VOICE-02 | Phase 34 | Pending |
| VOICE-01 | Phase 34 | Complete |
| VOICE-02 | Phase 34 | Complete |
| VOICE-03 | Phase 34 | Pending |
| CLI-01 | Phase 35 | Pending |
| CLI-02 | Phase 35 | Pending |

@ -175,7 +175,7 @@ Plans:
**Plans**: 2 plans
Plans:
- [ ] 34-01-PLAN.md — Fix /transcribe route registration, Piper TTS hook + TtsButton, voiceEnabled in nexus-settings
- [x] 34-01-PLAN.md — Fix /transcribe route registration, Piper TTS hook + TtsButton, voiceEnabled in nexus-settings
- [ ] 34-02-PLAN.md — VoiceStep onboarding component, wizard step insertion, PersonalAssistant voice wiring
**UI hint**: yes
@ -238,5 +238,5 @@ All 21 v1.5 requirements are mapped to exactly one phase. No orphans.
| 31. Puter.js Zero-Config Cloud | v1.5 | 4/4 | Complete | 2026-04-03 |
| 32. Multi-Step Onboarding Wizard | v1.5 | 1/1 | Complete | 2026-04-03 |
| 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete | 2026-04-03 |
| 34. Voice | v1.5 | 0/2 | Not started | - |
| 34. Voice | v1.5 | 1/2 | In Progress | - |
| 35. npx buildthis CLI | v1.5 | 0/TBD | Not started | - |

@ -2,15 +2,15 @@
gsd_state_version: 1.0
milestone: v1.5
milestone_name: Smart Onboarding + Personal AI Assistant
status: verifying
stopped_at: Completed 33-persistent-memory/33-03
last_updated: "2026-04-03T22:15:46.392Z"
status: executing
stopped_at: Completed 34-voice/34-01
last_updated: "2026-04-03T22:35:58.055Z"
last_activity: 2026-04-03
progress:
total_phases: 6
completed_phases: 4
total_plans: 10
completed_plans: 10
total_plans: 12
completed_plans: 11
percent: 0
---
@ -21,13 +21,13 @@ progress:
See: .planning/PROJECT.md (updated 2026-04-02)
**Core value:** A fresh onboard asks for ONE thing (root directory), auto-creates PM + Engineer agents, and drops you in the dashboard.
**Current focus:** Phase 33 — persistent-memory
**Current focus:** Phase 34 — voice
## Current Position
Phase: 34
Plan: Not started
Status: Phase complete — ready for verification
Phase: 34 (voice) — EXECUTING
Plan: 2 of 2
Status: Ready to execute
Last activity: 2026-04-03
Progress: [__________] 0%
@ -62,6 +62,7 @@ Progress: [__________] 0%
| Phase 33 P01 | 4 | 2 tasks | 6 files |
| Phase 33 P02 | 12 | 2 tasks | 8 files |
| Phase 33-persistent-memory P03 | 20 | 2 tasks | 6 files |
| Phase 34-voice P01 | 3 | 2 tasks | 7 files |
## Accumulated Context
@ -95,6 +96,8 @@ Key constraints for v1.5 (established at roadmap):
- [Phase 33-persistent-memory]: Pre-fetch conversation/settings/memory BEFORE flushHeaders to avoid SSE header race (Pitfall 3 from research)
- [Phase 33-persistent-memory]: puterProxyService.resolveToken wrapped in try/catch — graceful fallback to streamEcho when no puter token configured
- [Phase 33-persistent-memory]: buildHandoffSummary exported as named pure function for direct unit testing without route test harness
- [Phase 34-voice]: chatFileRoutes registered inside boardMutationGuard after assistantHandoffRoutes; nexusSettingsRoutes also added (was missing)
- [Phase 34-voice]: voiceEnabled as Zod boolean with default(false) in nexus-settings — file-backed JSON, no DB migration
### Pending Todos
@ -109,6 +112,6 @@ None yet.
## Session Continuity
Last session: 2026-04-03T22:14:47.420Z
Stopped at: Completed 33-persistent-memory/33-03
Last session: 2026-04-03T22:35:58.052Z
Stopped at: Completed 34-voice/34-01
Resume file: None

@ -0,0 +1,127 @@
---
phase: 34-voice
plan: 01
subsystem: api, ui, tts
tags: [piper-tts, whisper, voice, tts, nexus-settings, chat-files]
# Dependency graph
requires:
- phase: 33-persistent-memory
provides: assistantHandoffRoutes, nexusSettingsRoutes, chat.ts streaming
- phase: 25-file-system
provides: chatFileRoutes, StorageService, chatFileService
provides:
- POST /api/transcribe route registered (returns 503 when Whisper CLI absent, not 404)
- GET/PATCH /nexus/settings with voiceEnabled persistence
- usePiperTts hook with prewarm/speak/stop/status/progress
- TtsButton component with download progress and speaker icon
affects: [34-02-PLAN, PersonalAssistant, VoiceOnboarding]
# Tech tracking
tech-stack:
added: ["@mintplex-labs/piper-tts-web (WASM TTS in browser)"]
patterns:
- "Browser WASM TTS: tts.stored() → tts.download(voice, progressCb) → tts.predict({text, voiceId})"
- "TtsButton receives status/progress props from hook — zero TTS logic in component"
- "nexus-settings: file-backed JSON via Zod schema (no DB migration)"
key-files:
created:
- ui/src/hooks/usePiperTts.ts
- ui/src/components/TtsButton.tsx
modified:
- server/src/app.ts
- server/src/services/nexus-settings.ts
- ui/src/api/hardware.ts
- ui/package.json
- pnpm-lock.yaml
key-decisions:
- "chatFileRoutes registered inside boardMutationGuard (after assistantHandoffRoutes) — assertBoard() requires authenticated api router"
- "nexusSettingsRoutes also registered here (was missing from app.ts despite file existing)"
- "voiceEnabled as Zod boolean with default(false) — file-backed JSON, no DB migration needed"
- "Default return values in nexusSettingsService.get() updated to include voiceEnabled: false for safe fallback"
patterns-established:
- "TTS hook pattern: prewarm triggers download, speak asserts ready status, stop interrupts audio"
- "Audio playback via new Audio(blobUrl).play() — CPU-safe, no GPU required"
requirements-completed: [VOICE-01, VOICE-02]
# Metrics
duration: 3min
completed: 2026-04-01
---
# Phase 34 Plan 01: Voice Foundation Summary
**chatFileRoutes and nexusSettingsRoutes mounted in app.ts; voiceEnabled added to nexus-settings; usePiperTts hook and TtsButton component created with @mintplex-labs/piper-tts-web WASM synthesis**
## Performance
- **Duration:** ~3 min
- **Started:** 2026-04-01T22:32:52Z
- **Completed:** 2026-04-01T22:35:57Z
- **Tasks:** 2/2
- **Files modified:** 7
## Accomplishments
### Task 1: Register chatFileRoutes in app.ts and add voiceEnabled to nexus-settings
- Added `chatFileRoutes(db, opts.storageService)` import and `api.use()` call after `assistantHandoffRoutes(db)` in `server/src/app.ts` — POST /api/transcribe now returns 503 (not 404) when Whisper CLI is absent
- Added `nexusSettingsRoutes()` import and registration (was also missing from app.ts)
- Extended `nexusSettingsSchema` with `voiceEnabled: z.boolean().default(false)` in `server/src/services/nexus-settings.ts`
- Updated default fallback returns in `nexusSettingsService.get()` to include `voiceEnabled: false`
- Added `voiceEnabled?: boolean` to `NexusSettings` client interface in `ui/src/api/hardware.ts`
- All 10 existing chat-file-routes tests pass
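The 404-versus-503 distinction in Task 1 can be illustrated with a minimal, dependency-free sketch. It is not the code in `server/src/app.ts`: the `Handler` type, `makeTranscribeHandler`, `dispatch`, and the response bodies are hypothetical stand-ins for the real router. The point it shows is that an unregistered route cannot answer at all (404), while a registered route can report that its dependency is missing (503).

```typescript
// Hypothetical stand-in for a route table; the real app mounts routers via api.use().
type Handler = () => { status: number; body: string };

// Assumption: the handler knows whether the Whisper CLI was found on this machine.
function makeTranscribeHandler(whisperAvailable: boolean): Handler {
  return () =>
    whisperAvailable
      ? { status: 200, body: "transcript" }
      : { status: 503, body: "Whisper CLI not available" };
}

// A route that was never registered falls through to 404.
function dispatch(routes: Map<string, Handler>, key: string): { status: number; body: string } {
  const handler = routes.get(key);
  return handler ? handler() : { status: 404, body: "Not found" };
}

// Before the fix: nothing is registered for the transcribe endpoint.
const routesBefore = new Map<string, Handler>();

// After the fix: the route exists, but Whisper is absent in this scenario.
const routesAfter = new Map<string, Handler>([
  ["POST /api/transcribe", makeTranscribeHandler(false)],
]);

console.log(dispatch(routesBefore, "POST /api/transcribe").status);
console.log(dispatch(routesAfter, "POST /api/transcribe").status);
```

A 503 lets a client tell "voice input is unavailable on this host" apart from "this server has no such endpoint", which is presumably why the summary calls out the status code.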
### Task 2: Create usePiperTts hook and TtsButton component
- Installed `@mintplex-labs/piper-tts-web` as a UI dependency
- Created `ui/src/hooks/usePiperTts.ts` — exposes `prewarm`, `speak`, `stop`, `status` (idle/downloading/ready/speaking/error), `progress` (0-100)
- `tts.stored()` checks IndexedDB cache — skips download if model already present (satisfies VOICE-02 caching)
- `tts.download()` with progress callback provides visible download progress during prewarm (satisfies VOICE-02 UX)
- `tts.predict()` returns WAV blob URL — CPU-safe WASM synthesis, no GPU required (satisfies VOICE-01)
- Created `ui/src/components/TtsButton.tsx` — shows Loader2 + progress% during downloading, VolumeX to stop during speaking, Volume2 for idle/ready states
- TtsButton receives all state/callbacks from hook — zero TTS logic in component (clean separation)
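The status values listed for the hook (idle/downloading/ready/speaking/error) suggest a small state machine. The sketch below models it as a pure reducer so the transitions can be read and tested without a browser. It is an illustration under assumptions, not the contents of `usePiperTts.ts`: the event names, the `TtsState` shape, and the rule that `speak` is ignored unless the status is `ready` are inferred from the bullet points above.

```typescript
type TtsStatus = "idle" | "downloading" | "ready" | "speaking" | "error";

interface TtsState {
  status: TtsStatus;
  progress: number; // 0-100, as described for the hook
}

// Hypothetical events; the real hook may drive its state differently.
type TtsEvent =
  | { type: "prewarm"; cached: boolean } // cached: the voice model is already stored
  | { type: "progress"; loaded: number; total: number }
  | { type: "downloaded" }
  | { type: "speak" }
  | { type: "stop" }
  | { type: "ended" }
  | { type: "fail" };

const initialTtsState: TtsState = { status: "idle", progress: 0 };

function ttsReducer(state: TtsState, event: TtsEvent): TtsState {
  switch (event.type) {
    case "prewarm":
      // Prewarm only starts from idle or error; a cached model skips the download.
      if (state.status !== "idle" && state.status !== "error") return state;
      return event.cached
        ? { status: "ready", progress: 100 }
        : { status: "downloading", progress: 0 };
    case "progress": {
      if (state.status !== "downloading" || event.total <= 0) return state;
      const pct = Math.round((event.loaded / event.total) * 100);
      return { status: "downloading", progress: Math.min(100, Math.max(0, pct)) };
    }
    case "downloaded":
      return state.status === "downloading" ? { status: "ready", progress: 100 } : state;
    case "speak":
      // Speaking requires a ready model; otherwise the event is ignored.
      return state.status === "ready" ? { ...state, status: "speaking" } : state;
    case "stop":
    case "ended":
      return state.status === "speaking" ? { ...state, status: "ready" } : state;
    case "fail":
      return { status: "error", progress: 0 };
  }
}
```

Keeping the transitions in one pure function matches the separation described above: a component such as TtsButton only needs `status` and `progress` to decide which icon to render.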
## Verification
- `grep -q "chatFileRoutes" server/src/app.ts` — PASS
- `grep -q "voiceEnabled" server/src/services/nexus-settings.ts` — PASS
- `ls ui/src/hooks/usePiperTts.ts ui/src/components/TtsButton.tsx` — PASS
- `npx vitest run server/src/__tests__/chat-file-routes.test.ts` — 10/10 tests PASS
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 2 - Missing Critical Functionality] nexusSettingsRoutes was also missing from app.ts**
- **Found during:** Task 1
- **Issue:** The plan only mentioned registering `chatFileRoutes`, but `nexusSettingsRoutes` was also absent from app.ts despite the route file existing
- **Fix:** Added both `chatFileRoutes` and `nexusSettingsRoutes` imports and `api.use()` registrations
- **Files modified:** `server/src/app.ts`
- **Commit:** 0d318a31
**2. [Rule 2 - Missing Critical Functionality] Default fallback in nexusSettingsService.get() needed voiceEnabled**
- **Found during:** Task 1
- **Issue:** The hardcoded fallback `{ mode: "both" }` would cause a TypeScript type mismatch after adding `voiceEnabled` to the schema
- **Fix:** Updated both catch-path and parse-failure-path returns to include `voiceEnabled: false`
- **Files modified:** `server/src/services/nexus-settings.ts`
- **Commit:** 0d318a31
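The fallback behaviour described in this deviation can be sketched without Zod. The hand-rolled validator below stands in for `z.boolean().default(false)` so the example stays dependency-free; `parseSettings`, `DEFAULT_SETTINGS`, and the `mode: string` typing are assumptions for illustration, not the service's actual code. It shows the two paths the fix touched: input that cannot be read or parsed, and input that parses but fails validation.

```typescript
interface NexusSettings {
  mode: string; // assumption: the real schema may restrict this to specific values
  voiceEnabled: boolean;
}

// Safe fallback, mirroring the summary: mode "both", voice off.
const DEFAULT_SETTINGS: NexusSettings = { mode: "both", voiceEnabled: false };

function parseSettings(raw: string): NexusSettings {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    // Catch path: unreadable or malformed JSON falls back to defaults.
    return { ...DEFAULT_SETTINGS };
  }
  if (typeof data !== "object" || data === null) return { ...DEFAULT_SETTINGS };

  const record = data as Record<string, unknown>;
  const mode = record.mode;
  const voice = record.voiceEnabled;

  // Parse-failure path: wrong types fall back to defaults as well.
  if (typeof mode !== "string") return { ...DEFAULT_SETTINGS };
  // A missing voiceEnabled takes the default, like default(false) would.
  if (voice === undefined) return { mode, voiceEnabled: false };
  if (typeof voice !== "boolean") return { ...DEFAULT_SETTINGS };
  return { mode, voiceEnabled: voice };
}

console.log(parseSettings('{"mode":"both"}'));
console.log(parseSettings("not json"));
```

Because older settings files have no `voiceEnabled` key, defaulting a missing value to `false` keeps them valid, which is consistent with the "no DB migration" decision recorded above.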
## Known Stubs
None — all functionality is wired. TtsButton requires the caller to provide status/progress/callbacks from usePiperTts (documented pattern for Plan 02 integration into PersonalAssistant).
## Commits
| Commit | Task | Description |
|--------|------|-------------|
| 0d318a31 | Task 1 | feat(34-01): register chatFileRoutes + nexusSettingsRoutes in app.ts, add voiceEnabled to nexus-settings |
| 8f8257e1 | Task 2 | feat(34-01): create usePiperTts hook and TtsButton component with piper-tts-web |
## Self-Check: PASSED