- Create 37-01-SUMMARY.md with task results and deviations
- Update STATE.md: advance to plan 2, add decisions, update progress to 57%
- Update ROADMAP.md: phase 37 in progress (1/4 plans complete)
- Mark WCHAT-01, WCHAT-02, WCHAT-04 complete in REQUIREMENTS.md
---
gsd_state_version: 1.0
milestone: v1.6
milestone_name: Voice Pipeline + Minimal Message Bridge
status: executing
stopped_at: Completed 37-01-PLAN.md - Server Prerequisites + VAD Browser Infrastructure
last_updated: "2026-04-04T02:26:30.188Z"
last_activity: 2026-04-04
progress:
  total_phases: 4
  completed_phases: 1
  total_plans: 7
  completed_plans: 4
  percent: 57
---

# Project State

## Project Reference

See: .planning/PROJECT.md (updated 2026-04-03)

**Core value:** A fresh onboard asks for ONE thing (root directory), auto-creates PM + Engineer agents, and drops you in the dashboard.
**Current focus:** Phase 37 - web-chat-voice-ui

## Current Position

Phase: 37 (web-chat-voice-ui) - EXECUTING
Plan: 2 of 4
Status: Ready to execute
Last activity: 2026-04-04

Progress: [██████░░░░] 57%

## Performance Metrics

**Velocity:**

- Total plans completed: 0 (v1.6)
- Average duration: -
- Total execution time: 0 hours

## Accumulated Context

### Decisions

Decisions are logged in the PROJECT.md Key Decisions table.
Key constraints for v1.6:

- voicePipelineService is the keystone: Phase 37 and Phase 38 both depend on it, so build Phase 36 first
- Telegram bridge uses long polling (grammY `bot.start()`), so no public HTTPS is required on the Mac Mini
- Audio transcoding via ffmpeg-static ^5.2.0, NOT fluent-ffmpeg (archived May 2025)
- Voice mode flag must survive every pipeline layer: client → Express → message persistence → agent codec
- COOP/COEP headers required for @ricky0123/vad-react SharedArrayBuffer (add to Express static middleware)
- Phase 37 and Phase 38 are independent once Phase 36 ships; sequential ordering for single-developer delivery
- Telegram bridge must stay under 500 lines (TGRAM-06 is a hard constraint)
- [Phase 36]: Export nexusSettingsSchema for direct testing, use nexusSettingsSchema.parse({}) for consistent defaults in catch blocks
- [Phase 36]: Used manual execFileAsync wrapper instead of promisify(execFileCb) to avoid util.promisify.custom symbol incompatibility with vitest mocks
- [Phase 36]: Voice routes are a dedicated voice.ts module (not added to chat-files.ts) for clean separation, since the voice pipeline is its own subsystem
- [Phase 36]: voiceMode typed as text|voice_input|full_voice union in stream endpoint, persisted as voice_full/voice_input messageType for downstream rendering
- [Phase 37]: Cherry-picked Phase 36 commits to bring voice pipeline, nexus-settings, and voiceMode wiring to phase-37 branch
- [Phase 37]: COOP/COEP headers placed as first Express middleware, so they apply to all responses including API, static, and Vite dev
- [Phase 37]: VAD ONNX assets served from ui/public/ same-origin to avoid COEP blocking CDN-served binary files
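
The COOP/COEP decision above can be sketched as a small middleware. This is a minimal sketch under stated assumptions, not the project's actual code: the `crossOriginIsolation` name and the pared-down `HeaderResponse` type are hypothetical stand-ins (Express's `Response` should satisfy the same shape), while the two header values are the standard ones that enable cross-origin isolation, which browsers require before exposing `SharedArrayBuffer`.

```typescript
// Minimal structural types so the sketch runs without Express installed.
// Assumption: Express's Response and NextFunction fit these shapes.
interface HeaderResponse {
  setHeader(name: string, value: string): unknown;
}
type Next = () => void;

// Hypothetical middleware name. Registering it before any other middleware,
// e.g. app.use(crossOriginIsolation), means every response carries the headers.
function crossOriginIsolation(_req: unknown, res: HeaderResponse, next: Next): void {
  res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  res.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
  next();
}

// Usage with a fake response object to show the effect.
const headers: Record<string, string> = {};
const calls = { next: 0 };
crossOriginIsolation(
  {},
  { setHeader: (name, value) => { headers[name] = value; } },
  () => { calls.next += 1; },
);
console.log(headers, calls.next);
```

With `require-corp` in effect, cross-origin binaries that lack a `Cross-Origin-Resource-Policy` or CORS header are blocked, which is consistent with the decision to serve the VAD ONNX assets from the same origin.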

### Pending Todos

None yet.

### Blockers/Concerns

- [v1.5 carryover] smart-whisper Apple Silicon acceleration unverified on Mac Mini M4; fall back to `tiny.en` if `base.en` acceleration is not confirmed
- [v1.6] grammY session management approach not yet chosen: lightweight `Map<chatId, sessionId>` vs. grammY conversation plugin; decide at Phase 38 planning
- [v1.6] Dual output prompt reliability on 7B models is ~90%, so the Approach B fallback (post-process markdown strip) must be implemented as a safety net, not treated as optional
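
The Approach B fallback named in the last concern could look roughly like the following. This is a hedged sketch: `stripMarkdown` and its specific rules are assumptions made for illustration, not the project's implementation. It covers only common inline and block markers, so that text handed to speech synthesis does not contain literal Markdown symbols.

```typescript
// Hypothetical helper: removes common Markdown markers from model output
// before it is spoken. It is a regex pass, not a full Markdown parser.
function stripMarkdown(input: string): string {
  return input
    .replace(/`{3}[\s\S]*?`{3}/g, " ")                // drop fenced code blocks entirely
    .replace(/`([^`]*)`/g, "$1")                      // inline code: keep the text
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")         // images: keep the alt text
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")          // links: keep the link text
    .replace(/^[ \t]{0,3}#{1,6}[ \t]+/gm, "")         // heading markers
    .replace(/^[ \t]{0,3}>[ \t]?/gm, "")              // blockquote markers
    .replace(/^[ \t]*(?:[-*+]|\d+\.)[ \t]+/gm, "")    // list bullets and numbers
    .replace(/(\*\*|__)(.+?)\1/g, "$2")               // bold
    .replace(/\*(\S(?:.*?\S)?)\*/g, "$1")             // italic with asterisks
    // italic with underscores, only at word boundaries so snake_case survives
    .replace(/(^|[^A-Za-z0-9])_(\S(?:.*?\S)?)_(?![A-Za-z0-9])/g, "$1$2")
    .replace(/[ \t]+/g, " ")                          // collapse runs of spaces
    .replace(/\n{2,}/g, "\n")                         // collapse blank lines
    .trim();
}

// Prints "Status" and "Build is green, see logs" on separate lines.
console.log(stripMarkdown("## Status\n\n- **Build** is `green`, see [logs](https://example.com)"));
```

Running the strip on every voice response, rather than only when the dual-output prompt visibly fails, keeps the safety net unconditional. The sketch has known limits: asterisks inside inline code, for example, can still be treated as emphasis.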

## Session Continuity

Last session: 2026-04-04T02:26:30.185Z
Stopped at: Completed 37-01-PLAN.md - Server Prerequisites + VAD Browser Infrastructure
Resume file: None