diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
index 4896be73..4a8099ba 100644
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -27,7 +27,7 @@
 
 - [x] **VOICE-01**: User gets Piper TTS speech output that works on CPU-only hardware
 - [x] **VOICE-02**: Piper TTS pre-warms on first use with visible download progress (no silent 15-30s hang)
-- [ ] **VOICE-03**: Voice features (Whisper STT + Piper TTS) offered during onboarding based on hardware capability
+- [x] **VOICE-03**: Voice features (Whisper STT + Piper TTS) offered during onboarding based on hardware capability
 
 ### Personal AI Assistant
 
@@ -87,7 +87,7 @@
 | ASST-04 | Phase 33 | Complete |
 | VOICE-01 | Phase 34 | Complete |
 | VOICE-02 | Phase 34 | Complete |
-| VOICE-03 | Phase 34 | Pending |
+| VOICE-03 | Phase 34 | Complete |
 | CLI-01 | Phase 35 | Pending |
 | CLI-02 | Phase 35 | Pending |
 
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index 94002676..62ccd78f 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -92,7 +92,7 @@ Plans:
 - [x] **Phase 31: Puter.js Zero-Config Cloud** — Server-proxied Puter.js adapter with full cost tracking, Google OAuth PKCE tier, and subscription auto-detection; no API keys required for zero-config path (completed 2026-04-03)
 - [x] **Phase 32: Multi-Step Onboarding Wizard** — Assemble all provider tiers and hardware data into a skippable multi-step wizard; summary screen routes directly into chat (completed 2026-04-03)
 - [x] **Phase 33: Persistent Memory + Personal Assistant Mode** — File-backed memory with write-time sanitization, PersonalAssistantPage, conversation handoff to PM agent (completed 2026-04-03)
-- [ ] **Phase 34: Voice** — Piper TTS with pre-warm progress, Whisper STT wired into voice service, onboarding voice step activated
+- [x] **Phase 34: Voice** — Piper TTS with pre-warm progress, Whisper STT wired into voice service, onboarding voice step activated (completed 2026-04-03)
 - [ ] **Phase 35: npx buildthis CLI** — Standalone bootstrapper package with hardware detection and provider tiering parity with web onboarding
 
 ---
@@ -176,7 +176,7 @@ Plans:
 
 Plans:
 - [x] 34-01-PLAN.md — Fix /transcribe route registration, Piper TTS hook + TtsButton, voiceEnabled in nexus-settings
-- [ ] 34-02-PLAN.md — VoiceStep onboarding component, wizard step insertion, PersonalAssistant voice wiring
+- [x] 34-02-PLAN.md — VoiceStep onboarding component, wizard step insertion, PersonalAssistant voice wiring
 **UI hint**: yes
 
 ### Phase 35: npx buildthis CLI
@@ -238,5 +238,5 @@ All 21 v1.5 requirements are mapped to exactly one phase. No orphans.
 | 31. Puter.js Zero-Config Cloud | v1.5 | 4/4 | Complete | 2026-04-03 |
 | 32. Multi-Step Onboarding Wizard | v1.5 | 1/1 | Complete | 2026-04-03 |
 | 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete | 2026-04-03 |
-| 34. Voice | v1.5 | 1/2 | In Progress| |
+| 34. Voice | v1.5 | 2/2 | Complete | 2026-04-03 |
 | 35. npx buildthis CLI | v1.5 | 0/TBD | Not started | - |
diff --git a/.planning/STATE.md b/.planning/STATE.md
index ff5d06bc..622bbd82 100644
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -2,15 +2,15 @@
 gsd_state_version: 1.0
 milestone: v1.5
 milestone_name: Smart Onboarding + Personal AI Assistant
-status: executing
-stopped_at: Completed 34-voice/34-01
-last_updated: "2026-04-03T22:35:58.055Z"
+status: verifying
+stopped_at: Completed 34-voice/34-02
+last_updated: "2026-04-03T22:42:08.349Z"
 last_activity: 2026-04-03
 progress:
   total_phases: 6
-  completed_phases: 4
+  completed_phases: 5
   total_plans: 12
-  completed_plans: 11
+  completed_plans: 12
   percent: 0
 ---
 
@@ -27,7 +27,7 @@ See: .planning/PROJECT.md (updated 2026-04-02)
 
 Phase: 34 (voice) — EXECUTING
 Plan: 2 of 2
-Status: Ready to execute
+Status: Phase complete — ready for verification
 Last activity: 2026-04-03
 
 Progress: [__________] 0%
@@ -63,6 +63,7 @@ Progress: [__________] 0%
 | Phase 33 P02 | 12 | 2 tasks | 8 files |
 | Phase 33-persistent-memory P03 | 20 | 2 tasks | 6 files |
 | Phase 34-voice P01 | 3 | 2 tasks | 7 files |
+| Phase 34-voice P02 | 4 | 2 tasks | 3 files |
 
 ## Accumulated Context
 
@@ -98,6 +99,8 @@ Key constraints for v1.5 (established at roadmap):
 - [Phase 33-persistent-memory]: buildHandoffSummary exported as named pure function for direct unit testing without route test harness
 - [Phase 34-voice]: chatFileRoutes registered inside boardMutationGuard after assistantHandoffRoutes; nexusSettingsRoutes also added (was missing)
 - [Phase 34-voice]: voiceEnabled as Zod boolean with default(false) in nexus-settings — file-backed JSON, no DB migration
+- [Phase 34-voice]: VoiceStep inserted at step 4; rootDir shifts to step 5, summary to step 6 — clean sequential numbering
+- [Phase 34-voice]: TtsButton rendered inline in messages.map rather than inside MessageBubble — avoids prop drilling usePiperTts
 
 ### Pending Todos
 
@@ -112,6 +115,6 @@ None yet.
 
 ## Session Continuity
 
-Last session: 2026-04-03T22:35:58.052Z
-Stopped at: Completed 34-voice/34-01
+Last session: 2026-04-03T22:42:08.346Z
+Stopped at: Completed 34-voice/34-02
 Resume file: None
diff --git a/.planning/phases/34-voice/34-02-SUMMARY.md b/.planning/phases/34-voice/34-02-SUMMARY.md
new file mode 100644
index 00000000..a8140039
--- /dev/null
+++ b/.planning/phases/34-voice/34-02-SUMMARY.md
@@ -0,0 +1,122 @@
+---
+phase: 34-voice
+plan: 02
+subsystem: ui, onboarding
+tags: [voice, onboarding, tts, stt, piper, whisper, personal-assistant]
+
+# Dependency graph
+requires:
+  - phase: 34-voice/34-01
+    provides: usePiperTts, TtsButton, VoiceRecordButton, voiceEnabled nexus-settings
+  - phase: 33-persistent-memory
+    provides: PersonalAssistant page, chatApi streaming
+  - phase: 32-multi-step-onboarding-wizard
+    provides: NexusOnboardingWizard 5-step base
+
+provides:
+  - VoiceStep onboarding component with mic detection (enumerateDevices)
+  - 6-step NexusOnboardingWizard with voice opt-in at step 4
+  - voiceEnabled persisted in nexus-settings on workspace creation
+  - PersonalAssistant with VoiceRecordButton (STT) in input bar
+  - PersonalAssistant with TtsButton next to each assistant message (TTS)
+
+affects: [ui/src/components/NexusOnboardingWizard.tsx, ui/src/pages/PersonalAssistant.tsx]
+
+# Tech tracking
+tech-stack:
+  added: []
+  patterns:
+    - "VoiceStep probes microphone with navigator.mediaDevices.enumerateDevices() — async, graceful fallback to false"
+    - "voiceEnabled captured in wizard state, persisted via updateNexusSettings after createWorkspace()"
+    - "TtsButton rendered inline in messages.map (no prop drilling) — pl-10 aligns under message bubble"
+    - "VoiceRecordButton appends transcription to textarea (not auto-send) — user reviews before sending"
+
+key-files:
+  created:
+    - ui/src/components/onboarding/VoiceStep.tsx
+  modified:
+    - ui/src/components/NexusOnboardingWizard.tsx
+    - ui/src/pages/PersonalAssistant.tsx
+
+key-decisions:
+  - "VoiceStep inserted at step 4; rootDir shifts to step 5, summary to step 6 — clean sequential numbering"
+  - "voiceEnabled persisted after mode save in createWorkspace() — non-blocking try/catch wrapper"
+  - "TtsButton rendered inline in messages.map rather than inside MessageBubble — avoids prop drilling usePiperTts through MessageBubble"
+  - "VoiceRecordButton appends (not replaces) transcription to textarea — user can combine typed + spoken input"
+  - "No TTS auto-prewarm on mount — triggered only on first TtsButton click to avoid unexpected WASM downloads"
+
+requirements-completed: [VOICE-03]
+
+# Metrics
+duration: 4min
+completed: 2026-04-03
+tasks_completed: 2
+files_changed: 3
+---
+
+# Phase 34 Plan 02: Voice Onboarding Step + PersonalAssistant Wire-up Summary
+
+**VoiceStep onboarding component (mic detection, enable/skip) inserted as wizard step 4; VoiceRecordButton (STT) and TtsButton (TTS) wired into PersonalAssistant for full voice I/O**
+
+## Performance
+
+- **Duration:** ~4 min
+- **Started:** 2026-04-03T22:38:32Z
+- **Completed:** 2026-04-03T22:41:13Z
+- **Tasks:** 2/2
+- **Files modified:** 3
+
+## Accomplishments
+
+### Task 1: Create VoiceStep component and insert into NexusOnboardingWizard as step 4
+
+- Created `ui/src/components/onboarding/VoiceStep.tsx` with:
+  - `navigator.mediaDevices.enumerateDevices()` mic probe with loading/available/unavailable states
+  - Two info cards (Mic/Whisper STT + Volume/Piper TTS) with conditional mic availability message
+  - Enable voice button (sets voiceEnabled = true, advances to step 5) and Skip button
+- Updated `NexusOnboardingWizard.tsx`:
+  - Added `VoiceStep` import
+  - Added `voiceEnabled` state (default false, reset on close, persisted in `createWorkspace()`)
+  - Inserted step 4 (Voice) block with Back → step 3, Enable → step 5, Skip → step 5
+  - Shifted old step 4 (rootDir) to step 5 — "Review & finish" → step 6, Back → step 4 (voice), Skip to summary → step 6
+  - Shifted old step 5 (summary) to step 6 — Back → step 5
+  - Updated step indicator from `Step N of 4` to `Step N of 5` (summary shows "Summary" at step 6)
+  - Updated handleStartChat error message to reference step 5
+
+### Task 2: Wire VoiceRecordButton and TtsButton into PersonalAssistant
+
+- Added imports: `VoiceRecordButton`, `TtsButton`, `usePiperTts`
+- Added `usePiperTts` hook call in component body — exposes `ttsStatus`, `ttsProgress`, `prewarm`, `speak`, `stop`
+- Added `VoiceRecordButton` in the input bar between textarea and Send button:
+  - `onTranscription` callback appends transcribed text to textarea (does not auto-send)
+  - `disabled` passes through `isSending` state
+- Added `TtsButton` next to each assistant message in `messages.map`:
+  - Wrapped message rendering in `