From b54130d991da4ef9426ce486d34bd4c8ffd3fd76 Mon Sep 17 00:00:00 2001
From: Nexus Dev
Date: Sat, 4 Apr 2026 03:23:44 +0000
Subject: [PATCH] docs(39): auto-generated context (discuss skipped)

---
 .../phases/39-voice-polish/39-CONTEXT.md | 62 +++++++++++++++++++
 1 file changed, 62 insertions(+)
 create mode 100644 .planning/phases/39-voice-polish/39-CONTEXT.md

diff --git a/.planning/phases/39-voice-polish/39-CONTEXT.md b/.planning/phases/39-voice-polish/39-CONTEXT.md
new file mode 100644
index 00000000..fcd8e2f9
--- /dev/null
+++ b/.planning/phases/39-voice-polish/39-CONTEXT.md
@@ -0,0 +1,62 @@
+# Phase 39: Voice Polish - Context
+
+**Gathered:** 2026-04-04
+**Status:** Ready for planning
+**Mode:** Auto-generated (discuss skipped via workflow.skip_discuss)
+
+
+## Phase Boundary
+
+Voice responses begin playing before synthesis is complete (sentence-buffered), a single response can be synthesized in multiple languages simultaneously, and new installs can detect STT/TTS hardware capability during onboarding and enable voice in one step.
+
+Requirements: VPIPE-07, VPIPE-08, ONBRD-01, ONBRD-02
+
+
+
+
+## Implementation Decisions
+
+### Claude's Discretion
+All implementation choices are at Claude's discretion — discuss phase was skipped per user setting.
+
+Key research findings to incorporate:
+- Sentence-buffered TTS: split response on sentence boundaries (.!?), synthesize first sentence immediately, start playback while subsequent sentences synthesize
+- Multi-language TTS: Piper supports multiple language models; user requests same text as audio in multiple languages (e.g. English + Danish) without a second agent call
+- Onboarding hardware detection: extend existing hardware probe to check for Whisper/Piper binary availability and hardware capability
+- VoiceStep already exists from v1.5 (Phase 34) — enhance it with hardware probe results rather than creating a new step
+- Use existing systeminformation probe pattern from Phase 30
+- Sentence splitting: simple regex on .!? followed by whitespace; no NLP library needed
+
+
+
+
+## Existing Code Insights
+
+### Reusable Assets
+- `server/src/services/voice-pipeline.ts` — VoicePipelineService (synthesize already does sentence chunking)
+- `ui/src/components/ChatVoicePlayer.tsx` — audio playback (needs streaming support)
+- `ui/src/components/onboarding/VoiceStep.tsx` — existing voice enable/skip step
+- `server/src/routes/voice.ts` — POST /api/synthesize
+- Hardware detection from Phase 30 (systeminformation probe)
+
+### Integration Points
+- `server/src/routes/voice.ts` — new endpoint for multi-language synthesis
+- `ui/src/components/ChatVoicePlayer.tsx` — sentence-buffered playback
+- `ui/src/components/onboarding/VoiceStep.tsx` — hardware capability display
+- `server/src/services/nexus-settings.ts` — piperBinaryPath, whisperBinaryPath
+
+
+
+
+## Specific Ideas
+
+No specific requirements — discuss phase skipped.
+
+
+
+
+## Deferred Ideas
+
+None — discuss phase skipped.
+
+
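The sentence-splitting and sentence-buffered synthesis findings listed under "Implementation Decisions" could be sketched roughly as follows. This is a minimal illustration, not the existing `VoicePipelineService` code: `splitSentences`, `synthesizeBuffered`, and the `Synthesize` callback type are hypothetical names, and the callback stands in for whatever actually invokes Piper.

```typescript
// Hypothetical callback type: turns one sentence into an audio buffer.
// In the real service this would wrap the Piper invocation (assumption).
type Synthesize = (sentence: string) => Promise<Uint8Array>;

// Split on '.', '!' or '?' followed by whitespace, as the context doc suggests.
// The lookbehind keeps the punctuation attached to its sentence.
// Known limitation: abbreviations such as "e.g. " also trigger a split.
function splitSentences(text: string): string[] {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Yield audio sentence by sentence, with one sentence of lookahead:
// while the caller plays sentence i, sentence i + 1 is already synthesizing.
async function* synthesizeBuffered(
  text: string,
  synthesize: Synthesize,
): AsyncGenerator<Uint8Array> {
  const sentences = splitSentences(text);
  if (sentences.length === 0) return;

  // Start the first sentence immediately so playback can begin early.
  let pending: Promise<Uint8Array> = synthesize(sentences[0]);

  for (let i = 0; i < sentences.length; i++) {
    const current = pending;
    if (i + 1 < sentences.length) {
      pending = synthesize(sentences[i + 1]);
      // Attach a no-op handler so a failure in the lookahead is not reported
      // as an unhandled rejection; it still surfaces when awaited below.
      pending.catch(() => {});
    }
    yield await current;
  }
}
```

A caller would consume this with `for await (const chunk of synthesizeBuffered(text, synth))` and hand each chunk to the player as it arrives. A single sentence of lookahead keeps memory use small; a deeper queue may be worth it if synthesis is slower than playback on the target hardware.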