docs(36): auto-generated context (discuss skipped)
parent 68aa5ae052
commit b18355bc47
1 changed file with 73 additions and 0 deletions
73
.planning/phases/36-voice-pipeline-foundation/36-CONTEXT.md
Normal file
@@ -0,0 +1,73 @@
# Phase 36: Voice Pipeline Foundation - Context

**Gathered:** 2026-04-04
**Status:** Ready for planning
**Mode:** Auto-generated (discuss skipped via workflow.skip_discuss)

<domain>
## Phase Boundary

The transport-agnostic voice pipeline is live and callable from any consumer (web chat, Telegram, or future integrations), with correct audio transcoding, voice mode flag propagation, and dual output formatting baked in from the start.

Requirements: VPIPE-01, VPIPE-02, VPIPE-03, VPIPE-04, VPIPE-05, VPIPE-06

</domain>

<decisions>
## Implementation Decisions

### Claude's Discretion

All implementation choices are at Claude's discretion; the discuss phase was skipped per user setting. Use the ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.

Key research findings to incorporate:

- VoicePipelineService as server-side service: `transcribe(buffer, format)`, `synthesize(text, voiceId?)`, `formatForVoice(text)`
- Move `/transcribe` from `chat-files.ts` to new `voice.ts` route to reduce rebase conflict surface
- Use `ffmpeg-static ^5.2.0` (NOT archived fluent-ffmpeg) for WebM→WAV and OGG→WAV transcoding
- Use `execFile` (not `exec`) for CLI subprocess calls: `execFile` does not spawn a shell by default, which prevents shell injection
- Wrap CLI calls (`piper`, `ffmpeg`) in `Promise.race([call, timeout(8000)])` for graceful degradation
- Voice mode flag must survive: client → Express → message persistence → agent session codec
- Dual output: prompt engineering requests `SPOKEN: [prose]` + `DETAILED: [markdown]` with post-processing strip as fallback
- nexus-settings schema extension: `voiceMode: "text" | "voice_input" | "full_voice"`, optional `telegramToken`
- No DB migrations: all state lives in existing JSONB fields and file-backed JSON
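
The `execFile` and `Promise.race` findings above could be combined roughly as follows. This is a minimal sketch, not the planned implementation: `withTimeout` and `runCli` are hypothetical helper names, and the 8000 ms default only mirrors the `timeout(8000)` figure in the list.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Hypothetical helper: rejects after `ms` milliseconds unless `work` settles first.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  // Clear the timer either way so it cannot keep the event loop alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical wrapper: runs a binary with an argument array, so no shell
// ever interprets the arguments.
async function runCli(bin: string, args: string[], ms = 8000): Promise<string> {
  const { stdout } = await withTimeout(execFileAsync(bin, args), ms);
  return stdout;
}
```

Note that `Promise.race` only stops waiting; it does not terminate the child process. Passing `execFile`'s own `timeout` option as well would be one way to kill a stalled `piper` or `ffmpeg` process.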

</decisions>

<code_context>
## Existing Code Insights

### Reusable Assets

- `server/src/routes/chat-files.ts`: existing `/transcribe` endpoint with whisper-cpp/openai-whisper cascade
- `server/src/services/nexus-settings.ts`: file-backed JSON with Zod validation
- `packages/shared/src/validators/chat.ts`: Zod chat message validators
- `packages/shared/src/types/chat.ts`: ChatMessage type definitions
- `ui/src/components/VoiceRecordButton.tsx`: MediaRecorder API (client-side)
- `ui/src/components/TtsButton.tsx`: @mintplex-labs/piper-tts-web WASM
- `ui/src/hooks/usePiperTts.ts`: browser TTS hook

### Established Patterns

- Express route files in `server/src/routes/`
- Service files in `server/src/services/`
- Zod validators in `packages/shared/src/validators/`
- Routes mounted in `server/src/app.ts`
- File-backed settings via nexus-settings service

### Integration Points

- `server/src/app.ts`: mount new voice routes
- `packages/shared`: extend chat message types with voiceMode field
- `server/src/services/nexus-settings.ts`: extend schema for voiceMode and telegramToken
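
The `voiceMode` and `telegramToken` extension might look like the sketch below. It is a dependency-free stand-in for illustration only: the existing service validates with Zod, and the `VoiceSettings` interface, the `parseVoiceSettings` helper, and the fallback to `"text"` are assumptions rather than anything taken from the codebase.

```typescript
// The three modes named in this document.
const VOICE_MODES = ["text", "voice_input", "full_voice"] as const;
type VoiceMode = (typeof VOICE_MODES)[number];

// Hypothetical shape of the extended settings slice; only the field names
// come from this document.
interface VoiceSettings {
  voiceMode: VoiceMode;
  telegramToken?: string;
}

// Type guard: true only for one of the three known mode strings.
function isVoiceMode(value: unknown): value is VoiceMode {
  return typeof value === "string" && (VOICE_MODES as readonly string[]).includes(value);
}

// Assumption: a missing or unrecognized voiceMode falls back to "text".
function parseVoiceSettings(raw: unknown): VoiceSettings {
  const obj = typeof raw === "object" && raw !== null ? (raw as Record<string, unknown>) : {};
  const mode = obj["voiceMode"];
  const token = obj["telegramToken"];
  const settings: VoiceSettings = { voiceMode: isVoiceMode(mode) ? mode : "text" };
  // telegramToken is optional, so it is copied only when it is a string.
  if (typeof token === "string") settings.telegramToken = token;
  return settings;
}
```

In the real schema the same constraint would presumably be expressed with a Zod enum and an optional string, keeping it consistent with the other validators.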

</code_context>

<specifics>
## Specific Ideas

No specific requirements (discuss phase skipped). Refer to ROADMAP phase description and success criteria.

</specifics>

<deferred>
## Deferred Ideas

None (discuss phase skipped).

</deferred>