nexus/.planning/STATE.md
Nexus Dev 8883cb6ecb
chore: complete v1.6 Voice Pipeline + Minimal Message Bridge milestone
2026-04-04 03:52:25 +00:00

gsd_state_version: 1.0
milestone: v1.6
milestone_name: Voice Pipeline + Minimal Message Bridge
status: executing
stopped_at: Completed 38-02-PLAN.md - Telegram voice handling + TTS reply
last_updated: 2026-04-04T03:51:24.336Z
last_activity: 2026-04-04
progress:
  total_phases: 4
  completed_phases: 4
  total_plans: 12
  completed_plans: 12
  percent: 0

Project State

Project Reference

See: .planning/PROJECT.md (updated 2026-04-03)

Core value: A fresh onboard asks for ONE thing (root directory), auto-creates PM + Engineer agents, and drops you in the dashboard.

Current focus: Phase 39 - voice-polish

Current Position

Phase: 39
Plan: Not started
Status: Executing Phase 39
Last activity: 2026-04-04

Progress: [░░░░░░░░░░] 0%

Performance Metrics

Velocity:

  • Total plans completed: 0 (v1.6)
  • Average duration: -
  • Total execution time: 0 hours

Accumulated Context

Decisions

Decisions are logged in PROJECT.md Key Decisions table. Key constraints for v1.6:

  • voicePipelineService is the keystone — Phase 37 and Phase 38 both depend on it; build Phase 36 first
  • Telegram bridge uses long polling (grammY bot.start()) — no public HTTPS required on Mac Mini
  • Audio transcoding via ffmpeg-static ^5.2.0, not fluent-ffmpeg (archived May 2025)
  • Voice mode flag must survive every pipeline layer: client → Express → message persistence → agent codec
  • COOP/COEP headers required for @ricky0123/vad-react SharedArrayBuffer (add to Express static middleware)
  • Phase 37 and Phase 38 are independent once Phase 36 ships; sequential ordering for single-developer delivery
  • Telegram bridge must stay under 500 lines (TGRAM-06 is a hard constraint)
  • [Phase 36]: Export nexusSettingsSchema for direct testing, use nexusSettingsSchema.parse({}) for consistent defaults in catch blocks
  • [Phase 36]: Used manual execFileAsync wrapper instead of promisify(execFileCb) to avoid util.promisify.custom symbol incompatibility with vitest mocks
  • [Phase 36]: Voice routes are dedicated voice.ts module (not added to chat-files.ts) for clean separation — voice pipeline is its own subsystem
  • [Phase 36]: voiceMode typed as text|voice_input|full_voice union in stream endpoint, persisted as voice_full/voice_input messageType for downstream rendering
  • [Phase 37]: Cherry-picked Phase 36 commits to bring voice pipeline, nexus-settings, and voiceMode wiring to phase-37 branch
  • [Phase 37]: COOP/COEP headers placed as first Express middleware — applies to all responses including API, static, and Vite dev
  • [Phase 37]: VAD ONNX assets served from ui/public/ same-origin to avoid COEP blocking CDN-served binary files
  • [Phase 37]: useVadRecorder requests separate MediaStream ref for VoiceWaveform AnalyserNode — useMicVAD manages its own stream internally
  • [Phase 37]: AudioContext not closed on cleanup in VoiceWaveform — reused across recording cycles to avoid repeated autoplay unlock prompts
  • [Phase 37]: useVoiceMode hook created in plan 37-03 to unblock VoiceModeToggle during parallel execution
  • [Phase 37]: Auto-play preference stored in localStorage (nexus:voice:autoplay), not nexus-settings — avoids server round-trip for fast UX
  • [Phase 38-telegram-bridge]: TelegramStep uses onNext/onBack props; Continue disabled until token validated; Skip always available
  • [Phase 38-telegram-bridge]: telegramRoutes accepts service instance as second param — enables restart from token route
  • [Phase 38-telegram-bridge]: Long-polling: deleteWebhook first, then bot.start() fire-and-forget with catch logger
  • [Phase 38-telegram-bridge]: processVoiceMessage() extracted as top-level async function — keeps bot handler clean; botToken stored as module-level mutable ref for CDN URL construction
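
The Phase 37 decision to place COOP/COEP headers as the first Express middleware could look roughly like the sketch below. The function name `crossOriginIsolation` and the local `HeaderSink` type are hypothetical, and the header values are the standard ones for enabling cross-origin isolation (which SharedArrayBuffer needs), not values confirmed from the project source. The handler is typed structurally so the sketch runs without importing Express.

```typescript
// Minimal structural type so the sketch needs no Express import.
// An Express response object satisfies this shape.
interface HeaderSink {
  setHeader(name: string, value: string): void;
}

type Next = () => void;

// Hypothetical middleware: sets both isolation headers on every response.
// Header values are assumed, not taken from the project.
function crossOriginIsolation(_req: unknown, res: HeaderSink, next: Next): void {
  res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  res.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
  next();
}

// Assumed wiring: register before any router or static handler so that
// API, static, and dev-server responses all carry the headers.
// app.use(crossOriginIsolation);
```

Registering it first matters because any handler that ends the response before this middleware runs would send that response without the headers.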

Pending Todos

None yet.

Blockers/Concerns

  • [v1.5 carryover] smart-whisper Apple Silicon acceleration unverified on Mac Mini M4 — fall back to tiny.en if base.en acceleration not confirmed
  • [v1.6] grammY session management approach not yet chosen: lightweight Map<chatId, sessionId> vs. grammY conversation plugin — decide at Phase 38 planning
  • [v1.6] Dual output prompt reliability on 7B models is ~90% — Approach B fallback (post-process markdown strip) must be implemented as safety net, not optional
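
The Approach B fallback named in the last concern (post-process markdown strip) might be sketched as below. The function name `stripMarkdownForSpeech` and the specific patterns are assumptions made for illustration. It handles only common constructs and is not a full markdown parser.

```typescript
// Hypothetical fallback: reduce markdown to plain text before text-to-speech.
// Italics are stripped only for *asterisks*, so snake_case identifiers survive.
function stripMarkdownForSpeech(markdown: string): string {
  return markdown
    .replace(/```[\s\S]*?```/g, " ")           // drop fenced code blocks entirely
    .replace(/`([^`]+)`/g, "$1")               // inline code -> bare text
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")  // images -> alt text
    .replace(/\[([^\]]+)\]\([^)]*\)/g, "$1")   // links -> link text
    .replace(/^\s{0,3}#{1,6}\s+/gm, "")        // heading markers
    .replace(/^\s*[-*+]\s+/gm, "")             // bullet markers
    .replace(/\*\*(.+?)\*\*/g, "$1")           // bold (**)
    .replace(/__(.+?)__/g, "$1")               // bold (__)
    .replace(/\*(.+?)\*/g, "$1")               // italic (*)
    .replace(/\s+/g, " ")                      // collapse whitespace and newlines
    .trim();
}

const spoken = stripMarkdownForSpeech("# Title\n\n**Bold** and [link](http://x.y) with `code`.");
console.log(spoken); // prints "Title Bold and link with code."
```

Running the strip unconditionally on the spoken channel, rather than only when the dual-output prompt is detected to have failed, keeps the safety net independent of model reliability.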

Session Continuity

Last session: 2026-04-04T03:18:52.490Z
Stopped at: Completed 38-02-PLAN.md - Telegram voice handling + TTS reply
Resume file: None