docs: create milestone v1.6 roadmap (4 phases)

2026-04-04 00:53:37 +00:00
commit 68aa5ae052
parent 1cdcfac10b
3 changed files with 155 additions and 172 deletions
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -72,35 +72,35 @@

 | Requirement | Phase | Status |
 |-------------|-------|--------|
-| VPIPE-01 | — | Pending |
-| VPIPE-02 | — | Pending |
-| VPIPE-03 | — | Pending |
-| VPIPE-04 | — | Pending |
-| VPIPE-05 | — | Pending |
-| VPIPE-06 | — | Pending |
-| VPIPE-07 | — | Pending |
-| VPIPE-08 | — | Pending |
-| WCHAT-01 | — | Pending |
-| WCHAT-02 | — | Pending |
-| WCHAT-03 | — | Pending |
-| WCHAT-04 | — | Pending |
-| WCHAT-05 | — | Pending |
-| WCHAT-06 | — | Pending |
-| TGRAM-01 | — | Pending |
-| TGRAM-02 | — | Pending |
-| TGRAM-03 | — | Pending |
-| TGRAM-04 | — | Pending |
-| TGRAM-05 | — | Pending |
-| TGRAM-06 | — | Pending |
-| ONBRD-01 | — | Pending |
-| ONBRD-02 | — | Pending |
-| ONBRD-03 | — | Pending |
+| VPIPE-01 | Phase 36 | Pending |
+| VPIPE-02 | Phase 36 | Pending |
+| VPIPE-03 | Phase 36 | Pending |
+| VPIPE-04 | Phase 36 | Pending |
+| VPIPE-05 | Phase 36 | Pending |
+| VPIPE-06 | Phase 36 | Pending |
+| VPIPE-07 | Phase 39 | Pending |
+| VPIPE-08 | Phase 39 | Pending |
+| WCHAT-01 | Phase 37 | Pending |
+| WCHAT-02 | Phase 37 | Pending |
+| WCHAT-03 | Phase 37 | Pending |
+| WCHAT-04 | Phase 37 | Pending |
+| WCHAT-05 | Phase 37 | Pending |
+| WCHAT-06 | Phase 37 | Pending |
+| TGRAM-01 | Phase 38 | Pending |
+| TGRAM-02 | Phase 38 | Pending |
+| TGRAM-03 | Phase 38 | Pending |
+| TGRAM-04 | Phase 38 | Pending |
+| TGRAM-05 | Phase 38 | Pending |
+| TGRAM-06 | Phase 38 | Pending |
+| ONBRD-01 | Phase 39 | Pending |
+| ONBRD-02 | Phase 39 | Pending |
+| ONBRD-03 | Phase 38 | Pending |

 **Coverage:**
 - v1.6 requirements: 23 total
- Mapped to phases: 0
- Unmapped: 23 ⚠️
+- Mapped to phases: 23
+- Unmapped: 0 ✓

 ---
 *Requirements defined: 2026-04-04*
-*Last updated: 2026-04-04 after initial definition*
+*Last updated: 2026-04-03 — traceability populated after roadmap creation*
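
The coverage figures in the traceability table above (23 total, 23 mapped, 0 unmapped) can be re-derived mechanically. The following is a minimal sketch, not part of this commit: `Traceability`, `summarizeCoverage`, and `assign` are hypothetical names, and the requirement-to-phase values are copied from the table in this diff.

```typescript
// Hypothetical helper, not part of the commit: recompute requirement coverage.
// A requirement maps to a phase number, or to null while it is still unmapped.
type Traceability = Record<string, number | null>;

interface Coverage {
  total: number;
  mapped: number;
  unmapped: string[];
  byPhase: Record<number, string[]>;
}

function summarizeCoverage(table: Traceability): Coverage {
  const byPhase: Record<number, string[]> = {};
  const unmapped: string[] = [];
  const ids = Object.keys(table);
  ids.forEach((id) => {
    const phase = table[id];
    if (phase === null) {
      unmapped.push(id);
      return;
    }
    if (!byPhase[phase]) byPhase[phase] = [];
    byPhase[phase].push(id);
  });
  return { total: ids.length, mapped: ids.length - unmapped.length, unmapped, byPhase };
}

// Fill a contiguous ID range such as VPIPE-01..VPIPE-06 with one phase.
function assign(table: Traceability, prefix: string, from: number, to: number, phase: number): void {
  for (let n = from; n <= to; n++) {
    table[prefix + "-" + (n < 10 ? "0" : "") + n] = phase;
  }
}

// Mapping as populated by this commit.
const v16: Traceability = {};
assign(v16, "VPIPE", 1, 6, 36);
assign(v16, "VPIPE", 7, 8, 39);
assign(v16, "WCHAT", 1, 6, 37);
assign(v16, "TGRAM", 1, 6, 38);
assign(v16, "ONBRD", 1, 2, 39);
assign(v16, "ONBRD", 3, 3, 38);

const coverage = summarizeCoverage(v16);
console.log(coverage.total, coverage.mapped, coverage.unmapped.length);
```

Because each requirement ID is an object key, a requirement cannot be assigned to two phases at once, which matches the "exactly one phase" claim in ROADMAP.md.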
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -5,7 +5,8 @@
 - ✅ **v1.2.1 Universal Skill Management** - Phase 1 (shipped 2026-04-01)
 - ✅ **v1.3 Chat & PWA** - Phases 21-26 (shipped 2026-04-02)
 - ✅ **v1.4 Hermes Default Provider** - Phases 27-29 (shipped 2026-04-02)
- 🚧 **v1.5 Smart Onboarding + Personal AI Assistant** - Phases 30-35 (in progress)
+- ✅ **v1.5 Smart Onboarding + Personal AI Assistant** - Phases 30-35 (shipped 2026-04-03)
+- 🚧 **v1.6 Voice Pipeline + Minimal Message Bridge** - Phases 36-39 (in progress)

 ---

@@ -58,168 +59,140 @@ Plans:
 **Goal**: Users can create a Hermes agent in Nexus, configure it, and have it execute heartbeats that spawn `hermes chat -q`, return a result, and persist the session across runs
 **Plans**: 1/1 plans complete

-Plans:
- [x] 27-01-PLAN.md — Close four integration gaps: SESSIONED_LOCAL_ADAPTERS, create-mode toolsets bug, duplicate constant, session codec test
-
 ### Phase 28: Ollama Integration & Agent Surface
 **Goal**: Users can see which Ollama models are available, get a recommendation for their hardware, configure any Hermes agent to use a local model, and see Hermes-specific runtime data in the dashboard and agent config
 **Plans**: 3/3 plans complete

-Plans:
- [x] 28-01-PLAN.md — Ollama service, routes, model catalog, and unit tests
- [x] 28-02-PLAN.md — UI model selector dropdown, install callout, Hermes skill badge
- [x] 28-03-PLAN.md — Hermes stateJson runtime data and dashboard HermesRuntimeCard
-
 ### Phase 29: Default Provider & End-to-End
 **Goal**: A fresh Nexus install with only Hermes and Ollama works end-to-end — onboarding offers Hermes as the default, PM and Engineer templates run correctly on the Hermes runtime, and GSD workflow tasks complete successfully
 **Plans**: 2/2 plans complete

-Plans:
- [x] 29-01-PLAN.md — Adapter probe route, onboarding wizard Hermes fallback, adapter-neutral templates
- [x] 29-02-PLAN.md — Hermes skill injection via promptTemplate, integration tests
+</details>
+
+<details>
+<summary>✅ v1.5 Smart Onboarding + Personal AI Assistant (Phases 30-35) - SHIPPED 2026-04-03</summary>
+
+### Phase 30: Hardware Detection + Mode Selection
+**Goal**: Users see accurate hardware information during onboarding, get a model recommendation matched to their machine, and choose a mode that correctly gates all downstream features
+**Plans**: 2/2 plans complete
+
+### Phase 31: Puter.js Zero-Config Cloud
+**Goal**: Users without Ollama installed can reach working AI in one click via Puter.js
+**Plans**: 4/4 plans complete
+
+### Phase 32: Multi-Step Onboarding Wizard
+**Goal**: Users move through a complete, skippable onboarding flow that assembles hardware data, provider selection, and voice options into a summary screen
+**Plans**: 1/1 plans complete
+
+### Phase 33: Persistent Memory + Personal Assistant Mode
+**Goal**: Users in Personal AI Assistant mode accumulate memory across sessions that shapes future responses
+**Plans**: 3/3 plans complete
+
+### Phase 34: Voice
+**Goal**: Users can speak to the assistant (Whisper STT) and hear responses read aloud (Piper TTS)
+**Plans**: 2/2 plans complete
+
+### Phase 35: npx buildthis CLI
+**Goal**: A developer can run `npx buildthis` on a fresh machine and either open an already-running Nexus or be guided through install
+**Plans**: 1/1 plans complete

 </details>

 ---

-### 🚧 v1.5 Smart Onboarding + Personal AI Assistant (In Progress)
+### 🚧 v1.6 Voice Pipeline + Minimal Message Bridge (In Progress)

-**Milestone Goal:** The definitive onboarding experience — hardware detection, tiered provider setup (local/free cloud/paid), and a Personal AI Assistant mode that coexists with the Project Builder.
+**Milestone Goal:** Transport-agnostic voice pipeline (Whisper STT + Piper TTS) integrated into web chat, plus a minimal Telegram bridge for phone access. Voice infrastructure designed to survive v2.2 Command Center migration.

 ## Phases

- [x] **Phase 30: Hardware Detection + Mode Selection** — Unauthenticated hardware probe, Apple Silicon unified memory handling, model recommendation database, and mode selector that gates all assistant-specific features (completed 2026-04-02)
- [x] **Phase 31: Puter.js Zero-Config Cloud** — Server-proxied Puter.js adapter with full cost tracking, Google OAuth PKCE tier, and subscription auto-detection; no API keys required for zero-config path (completed 2026-04-03)
- [x] **Phase 32: Multi-Step Onboarding Wizard** — Assemble all provider tiers and hardware data into a skippable multi-step wizard; summary screen routes directly into chat (completed 2026-04-03)
- [x] **Phase 33: Persistent Memory + Personal Assistant Mode** — File-backed memory with write-time sanitization, PersonalAssistantPage, conversation handoff to PM agent (completed 2026-04-03)
- [x] **Phase 34: Voice** — Piper TTS with pre-warm progress, Whisper STT wired into voice service, onboarding voice step activated (completed 2026-04-03)
- [x] **Phase 35: npx buildthis CLI** — Standalone bootstrapper package with hardware detection and provider tiering parity with web onboarding (completed 2026-04-03)
-
---
+- [ ] **Phase 36: Voice Pipeline Foundation** — Transport-agnostic VoicePipelineService (transcribe, synthesize, formatForVoice), voice.ts route, ffmpeg audio transcoding, voiceMode flag, dual output pattern
+- [ ] **Phase 37: Web Chat Voice UI** — VAD silence detection, waveform visualization, voice mode toggle, inline audio player, auto-play toggle, COOP/COEP headers
+- [ ] **Phase 38: Telegram Bridge** — grammY long polling relay, text + voice note bidirectional relay, agent identity prefix, BotFather onboarding setup
+- [ ] **Phase 39: Voice Polish** — Sentence-buffered TTS streaming, multi-language TTS output, onboarding STT/TTS hardware detection step

 ## Phase Details

-### Phase 30: Hardware Detection + Mode Selection
-**Goal**: Users see accurate hardware information during onboarding, get a model recommendation matched to their machine, and choose a mode that correctly gates all downstream features — with the probe working before board auth exists
-**Depends on**: Phase 29 (v1.4 shipped)
-**Requirements**: ONBD-01, ONBD-02, ONBD-03, ONBD-07
+### Phase 36: Voice Pipeline Foundation
+**Goal**: The transport-agnostic voice pipeline is live and callable from any consumer — web chat, Telegram, or future integrations — with correct audio transcoding, voice mode flag propagation, and dual output formatting baked in from the start
+**Depends on**: Phase 35 (v1.5 shipped)
+**Requirements**: VPIPE-01, VPIPE-02, VPIPE-03, VPIPE-04, VPIPE-05, VPIPE-06
 **Success Criteria** (what must be TRUE):
-  1. On a fresh install (before any board auth token exists), the hardware probe returns GPU, RAM, and Apple Silicon unified memory data within 5 seconds
-  2. A Mac Mini M4 reports "unified memory" (not VRAM) with the 0.75 multiplier applied and copy that says "runs entirely on your machine"
-  3. The mode selector (Personal AI Assistant / Project Builder / Both) is visible during onboarding and the selected mode is persisted; assistant-specific UI is hidden when Project Builder-only is chosen
-  4. The model recommendation shown to the user matches an entry in the pre-built JSON catalog for the detected hardware tier (GPU / Apple Silicon / CPU-only)
-**Plans**: 2 plans
+  1. Posting a WAV audio file to `POST /api/transcribe` returns a transcription with detected language, regardless of whether the request came from the web UI or a test harness
+  2. Calling `POST /api/synthesize` with a markdown-heavy agent response returns two outputs: a voice-optimized prose version (no markdown) and the original full text with code blocks
+  3. A WebM/Opus browser recording and an OGG/Opus Telegram voice note both produce identical Whisper transcription quality after ffmpeg transcodes each to WAV 16kHz mono
+  4. The `voiceMode` flag on a chat message survives from client request through Express route to message persistence — verifiable in the DB record
+  5. `nexus-settings.json` accepts `voiceMode: "text" | "voice_input" | "full_voice"` and `telegramToken` fields without breaking existing settings reads
+**Plans**: TBD

-Plans:
- [x] 30-01-PLAN.md — Hardware service, nexus-settings service, model catalog extension, routes, and tests
- [x] 30-02-PLAN.md — ModeSelector, HardwareSummaryStep, useHardwareInfo hook, multi-step wizard wiring
-
-### Phase 31: Puter.js Zero-Config Cloud
-**Goal**: Users without Ollama installed can reach working AI in one click via Puter.js — all calls server-proxied, tokens server-stored, cost tracked; Google OAuth and subscription auto-detection round out the provider tier
-**Depends on**: Phase 30
-**Requirements**: CLOUD-01, CLOUD-02, CLOUD-03, CLOUD-04, CLOUD-05
+### Phase 37: Web Chat Voice UI
+**Goal**: Users can speak to any agent in web chat — recording auto-stops on silence, a live waveform confirms the mic is active, responses play back automatically (toggleable), and voice mode is a first-class setting
+**Depends on**: Phase 36
+**Requirements**: WCHAT-01, WCHAT-02, WCHAT-03, WCHAT-04, WCHAT-05, WCHAT-06
 **Success Criteria** (what must be TRUE):
-  1. A user with no Ollama and no API keys clicks "Continue with Puter" in onboarding, completes the Puter auth popup, and immediately gets a working chat response — no API key entry required
-  2. All Puter AI calls flow through `POST /api/puter-proxy/chat` (verifiable in server logs); the Puter auth token is stored server-side via secretService, not in localStorage
-  3. Token cost for Puter responses appears in the cost tracking view, attributed correctly per conversation
-  4. A user with Hermes, Claude Code, or OpenClaw already installed sees those tools pre-filled in the provider configuration step with no manual entry
-  5. A user clicking "Sign in with Google" for Gemini completes PKCE OAuth and gets a Gemini-backed chat response; the UI displays a policy-risk note that Google OAuth may trigger abuse detection
-**Plans**: 4 plans
-
-Plans:
- [x] 31-01-PLAN.md — Puter proxy service, routes, unit tests, and app.ts wiring
- [x] 31-02-PLAN.md — Google OAuth PKCE service, routes, API key storage route
- [x] 31-03-PLAN.md — Provider Selection UI step, PuterAuthButton, GoogleOAuthButton, ApiKeyEntryForm, 4-step wizard wiring
- [x] 31-04-PLAN.md — Google OAuth claim endpoint, human verification of full onboarding flow
+  1. Clicking the mic button starts recording; the waveform animates to show audio levels; speaking and then pausing for 1.5 seconds auto-submits the recording without pressing any button
+  2. The voice mode toggle has three visible states (text only / voice input / full voice) and persists the selected mode across page refreshes
+  3. An agent response delivered in full voice mode plays back automatically in the chat thread; the auto-play can be turned off in settings and stays off after a page reload
+  4. The chat message for a voice interaction shows a voice badge and an expandable section revealing the full markdown response with code blocks intact
+  5. Voice recording and VAD work correctly in Chrome and Firefox on the Mac Mini (COOP/COEP headers satisfy SharedArrayBuffer requirements)
+**Plans**: TBD
 **UI hint**: yes

-### Phase 32: Multi-Step Onboarding Wizard
-**Goal**: Users move through a complete, skippable onboarding flow that assembles hardware data, provider selection, and voice options into a summary screen — and can jump straight into chat from there
-**Depends on**: Phase 31
-**Requirements**: ONBD-04, ONBD-05, ONBD-06
+### Phase 38: Telegram Bridge
+**Goal**: The user can message any Nexus agent from their phone via Telegram — text and voice notes both work, agent identity is visible on every reply, and the bot is set up through guided onboarding with no manual token entry in config files
+**Depends on**: Phase 36
+**Requirements**: TGRAM-01, TGRAM-02, TGRAM-03, TGRAM-04, TGRAM-05, TGRAM-06, ONBRD-03
 **Success Criteria** (what must be TRUE):
-  1. A user can click "Skip" on every onboarding step (hardware, provider, voice) and reach the summary screen; the resulting workspace has at least one working agent with a valid provider
-  2. The summary screen shows the configured providers and agent-model pairings for the selected mode; no corporate language ("company", "CEO", "mission") appears anywhere in the flow
-  3. From the summary screen, one click navigates directly to the Personal Assistant chat or the project dashboard (depending on chosen mode) with no additional prompts
-**Plans**: 1 plan
+  1. Sending a text message to the Nexus Telegram bot from a phone produces an agent reply prefixed with the agent name (e.g. `[PM]: response`) within 10 seconds
+  2. Sending a voice note to the Telegram bot produces a transcription confirmation message followed by the agent's text reply — the bot does not silently fail or miss the update
+  3. Requesting a voice reply from the bot returns an OGG voice note that plays back correctly in the Telegram mobile app
+  4. The Telegram bridge runs via long polling with no public HTTPS endpoint required — verified by running on the Mac Mini behind NAT
+  5. The entire `telegram.ts` service file is under 500 lines
+  6. The onboarding wizard includes a BotFather setup step that walks through creating a bot token and saves it to `nexus-settings.json` without manual file editing
+**Plans**: TBD

-Plans:
- [x] 32-01-PLAN.md — Summary step, skip buttons, chat handoff
-**UI hint**: yes
-
-### Phase 33: Persistent Memory + Personal Assistant Mode
-**Goal**: Users in Personal AI Assistant mode accumulate memory across sessions that shapes future responses — with no risk of credentials leaking into prompts — and can hand off any conversation to a PM agent with context intact
-**Depends on**: Phase 32
-**Requirements**: ASST-01, ASST-02, ASST-03, ASST-04
+### Phase 39: Voice Polish
+**Goal**: Voice responses begin playing before synthesis is complete (sentence-buffered), a single response can be synthesized in multiple languages simultaneously, and new installs can detect STT/TTS hardware capability during onboarding and enable voice in one step
+**Depends on**: Phase 37
+**Requirements**: VPIPE-07, VPIPE-08, ONBRD-01, ONBRD-02
 **Success Criteria** (what must be TRUE):
-  1. A fact stated in one chat session ("I prefer TypeScript") is referenced correctly by the assistant in a new session started after closing and reopening the browser
-  2. Pasting an API key or token into chat and then starting a new session results in the assistant having no knowledge of that credential — the sanitization blocklist prevented it from being stored
-  3. A user clicks "Turn this into a project" in an assistant conversation; a PM agent is created with a system message containing the conversation summary and they land on the project dashboard
-  4. A user with mode set to "Both" can switch between Personal Assistant chat and the project dashboard without losing context or cross-contaminating assistant memory with project agent messages
-**Plans**: 3 plans
-
-Plans:
- [x] 33-01-PLAN.md — Memory sanitizer, assistant memory service, REST routes, and unit tests
- [x] 33-02-PLAN.md — PersonalAssistantPage, useNexusMode hook, sidebar navigation, route wiring
- [x] 33-03-PLAN.md — Real AI streaming with memory injection, assistant-to-PM handoff route and UI
-**UI hint**: yes
-
-### Phase 34: Voice
-**Goal**: Users can speak to the assistant (Whisper STT) and hear responses read aloud (Piper TTS) — Piper pre-warms visibly so the first synthesis call does not appear broken, and voice is offered during onboarding based on hardware capability
-**Depends on**: Phase 32
-**Requirements**: VOICE-01, VOICE-02, VOICE-03
-**Success Criteria** (what must be TRUE):
-  1. On a CPU-only machine (no GPU), enabling Piper TTS in the assistant produces audible speech output within a reasonable time after the first synthesis (not a silent hang)
-  2. When Piper's WASM voice model is downloading for the first time, a visible progress indicator is shown before the TTS toggle is enabled; the download completes and TTS works without a page reload
-  3. The onboarding voice step offers Whisper STT and Piper TTS toggles only when the hardware detection step has confirmed sufficient capability; on hardware below the threshold, the step is skipped or shows a capability warning
-**Plans**: 2 plans
-
-Plans:
- [x] 34-01-PLAN.md — Fix /transcribe route registration, Piper TTS hook + TtsButton, voiceEnabled in nexus-settings
- [x] 34-02-PLAN.md — VoiceStep onboarding component, wizard step insertion, PersonalAssistant voice wiring
-**UI hint**: yes
-
-### Phase 35: npx buildthis CLI
-**Goal**: A developer can run `npx buildthis` on a fresh machine and either open an already-running Nexus or be guided through install — with the same hardware detection and provider tiering as the web onboarding
-**Depends on**: Phase 30 (hardware detection service must exist)
-**Requirements**: CLI-01, CLI-02
-**Success Criteria** (what must be TRUE):
-  1. Running `npx buildthis` on a machine where Nexus is already running opens the Nexus UI in the default browser; running it on a machine with no Nexus guides the user through installation steps
-  2. The CLI bootstrapper detects the same hardware tier (GPU / Apple Silicon / CPU-only) as the web onboarding and presents the matching provider tier recommendations in the terminal prompt
-**Plans**: 1 plan
-
-Plans:
- [x] 35-01-PLAN.md — Package scaffold, hardware detection, two-path bootstrap (probe running vs guide install), provider selection, tests
+  1. For a multi-sentence agent response, the first sentence begins playing in the browser before the second sentence has finished synthesizing — the gap between text completion and first audio is under 1 second
+  2. A user can request the same agent response as audio in both English and Danish; both OGG files are generated and available for playback without a second agent call
+  3. On a fresh install, the onboarding hardware probe reports whether Whisper STT and Piper TTS are runnable on the detected hardware tier
+  4. The onboarding voice step activates (showing enable/skip options) only when the hardware probe confirms sufficient capability; on hardware below threshold it shows a capability note and skips to the next step
+**Plans**: TBD

 ---

 ## Coverage Validation

-All 21 v1.5 requirements are mapped to exactly one phase. No orphans.
+All 23 v1.6 requirements are mapped to exactly one phase. No orphans.

 | Requirement | Phase |
 |-------------|-------|
-| ONBD-01 | 30 |
-| ONBD-02 | 30 |
-| ONBD-03 | 30 |
-| ONBD-07 | 30 |
-| CLOUD-01 | 31 |
-| CLOUD-02 | 31 |
-| CLOUD-03 | 31 |
-| CLOUD-04 | 31 |
-| CLOUD-05 | 31 |
-| ONBD-04 | 32 |
-| ONBD-05 | 32 |
-| ONBD-06 | 32 |
-| ASST-01 | 33 |
-| ASST-02 | 33 |
-| ASST-03 | 33 |
-| ASST-04 | 33 |
-| VOICE-01 | 34 |
-| VOICE-02 | 34 |
-| VOICE-03 | 34 |
-| CLI-01 | 35 |
-| CLI-02 | 35 |
+| VPIPE-01 | 36 |
+| VPIPE-02 | 36 |
+| VPIPE-03 | 36 |
+| VPIPE-04 | 36 |
+| VPIPE-05 | 36 |
+| VPIPE-06 | 36 |
+| WCHAT-01 | 37 |
+| WCHAT-02 | 37 |
+| WCHAT-03 | 37 |
+| WCHAT-04 | 37 |
+| WCHAT-05 | 37 |
+| WCHAT-06 | 37 |
+| TGRAM-01 | 38 |
+| TGRAM-02 | 38 |
+| TGRAM-03 | 38 |
+| TGRAM-04 | 38 |
+| TGRAM-05 | 38 |
+| TGRAM-06 | 38 |
+| ONBRD-03 | 38 |
+| VPIPE-07 | 39 |
+| VPIPE-08 | 39 |
+| ONBRD-01 | 39 |
+| ONBRD-02 | 39 |

 ---

@@ -237,9 +210,13 @@ All 21 v1.5 requirements are mapped to exactly one phase. No orphans.
 | 27. Hermes Adapter | v1.4 | 1/1 | Complete | 2026-04-02 |
 | 28. Ollama Integration & Agent Surface | v1.4 | 3/3 | Complete | 2026-04-02 |
 | 29. Default Provider & End-to-End | v1.4 | 2/2 | Complete | 2026-04-02 |
-| 30. Hardware Detection + Mode Selection | v1.5 | 2/2 | Complete    | 2026-04-03 |
-| 31. Puter.js Zero-Config Cloud | v1.5 | 4/4 | Complete    | 2026-04-03 |
-| 32. Multi-Step Onboarding Wizard | v1.5 | 1/1 | Complete    | 2026-04-03 |
-| 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete    | 2026-04-03 |
-| 34. Voice | v1.5 | 2/2 | Complete    | 2026-04-03 |
-| 35. npx buildthis CLI | v1.5 | 1/1 | Complete    | 2026-04-03 |
+| 30. Hardware Detection + Mode Selection | v1.5 | 2/2 | Complete | 2026-04-03 |
+| 31. Puter.js Zero-Config Cloud | v1.5 | 4/4 | Complete | 2026-04-03 |
+| 32. Multi-Step Onboarding Wizard | v1.5 | 1/1 | Complete | 2026-04-03 |
+| 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete | 2026-04-03 |
+| 34. Voice | v1.5 | 2/2 | Complete | 2026-04-03 |
+| 35. npx buildthis CLI | v1.5 | 1/1 | Complete | 2026-04-03 |
+| 36. Voice Pipeline Foundation | v1.6 | 0/TBD | Not started | - |
+| 37. Web Chat Voice UI | v1.6 | 0/TBD | Not started | - |
+| 38. Telegram Bridge | v1.6 | 0/TBD | Not started | - |
+| 39. Voice Polish | v1.6 | 0/TBD | Not started | - |
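
Phase 36 success criterion 5 asks that `nexus-settings.json` accept `voiceMode` and `telegramToken` without breaking existing settings reads. One way to get that tolerance is a defensive reader, sketched below. This is an illustration only: `readVoiceSettings`, `VoiceSettings`, and the `"text"` default are assumptions and do not appear in the roadmap. Only the three `voiceMode` values and the two field names come from the source.

```typescript
// The three modes named in the roadmap.
type VoiceMode = "text" | "voice_input" | "full_voice";

// Hypothetical shape holding only the two fields added in v1.6.
interface VoiceSettings {
  voiceMode: VoiceMode;
  telegramToken?: string;
}

const VOICE_MODES: string[] = ["text", "voice_input", "full_voice"];

function isVoiceMode(value: unknown): value is VoiceMode {
  return typeof value === "string" && VOICE_MODES.indexOf(value) !== -1;
}

// Reads the new fields from raw settings JSON. Missing, malformed, or unknown
// values fall back to "text" (an assumed default), so a settings file written
// before v1.6 still parses.
function readVoiceSettings(rawJson: string): VoiceSettings {
  let parsed: unknown = {};
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    parsed = {};
  }
  const obj: Record<string, unknown> =
    typeof parsed === "object" && parsed !== null ? (parsed as Record<string, unknown>) : {};

  const mode = obj["voiceMode"];
  const token = obj["telegramToken"];

  const settings: VoiceSettings = { voiceMode: isVoiceMode(mode) ? mode : "text" };
  if (typeof token === "string" && token.length > 0) {
    settings.telegramToken = token;
  }
  return settings;
}

// A pre-v1.6 file that only carries voiceEnabled still reads cleanly.
console.log(readVoiceSettings('{"voiceEnabled": true}'));
console.log(readVoiceSettings('{"voiceMode": "full_voice", "telegramToken": "placeholder"}'));
```

Treating unknown values as the default rather than throwing keeps older settings files readable, at the cost of silently ignoring typos in `voiceMode`.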
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -7,7 +7,7 @@ stopped_at: null
 last_updated: "2026-04-03"
 last_activity: 2026-04-03
 progress:
-  total_phases: 0
+  total_phases: 4
  completed_phases: 0
  total_plans: 0
  completed_plans: 0
@@ -21,14 +21,16 @@ progress:
 See: .planning/PROJECT.md (updated 2026-04-03)

 **Core value:** A fresh onboard asks for ONE thing (root directory), auto-creates PM + Engineer agents, and drops you in the dashboard.
-**Current focus:** Defining requirements for v1.6
+**Current focus:** Phase 36 — Voice Pipeline Foundation (ready to plan)

 ## Current Position

-Phase: Not started (defining requirements)
-Plan: —
-Status: Defining requirements
-Last activity: 2026-04-03 — Milestone v1.6 started
+Phase: 36 of 39 (Voice Pipeline Foundation)
+Plan: — (not started)
+Status: Ready to plan
+Last activity: 2026-04-03 — v1.6 roadmap created (4 phases, 23 requirements mapped)
+
+Progress: [░░░░░░░░░░] 0%

 ## Performance Metrics

@@ -45,11 +47,13 @@ Last activity: 2026-04-03 — Milestone v1.6 started
 Decisions are logged in PROJECT.md Key Decisions table.
 Key constraints for v1.6:

- Voice pipeline is transport-agnostic — no Telegram-specific code in core voice components
- Telegram bridge is intentionally disposable (<500 lines) — will be replaced by v2.2 Command Center
- Dual output always: voice response + full technical details in text
- Voice mode is a per-message flag, not a per-agent setting
- v1.5 already has VoiceRecordButton, TtsButton, usePiperTts hooks in place — build on these
+- voicePipelineService is the keystone — Phase 37 and Phase 38 both depend on it; build Phase 36 first
+- Telegram bridge uses long polling (grammY `bot.start()`) — no public HTTPS required on Mac Mini
+- Audio transcoding via ffmpeg-static ^5.2.0 — NOT archived fluent-ffmpeg (archived May 2025)
+- Voice mode flag must survive every pipeline layer: client → Express → message persistence → agent codec
+- COOP/COEP headers required for @ricky0123/vad-react SharedArrayBuffer (add to Express static middleware)
+- Phase 37 and Phase 38 are independent once Phase 36 ships; sequential ordering for single-developer delivery
+- Telegram bridge must stay under 500 lines (TGRAM-06 is a hard constraint)

 ### Pending Todos

@@ -57,10 +61,12 @@ None yet.

 ### Blockers/Concerns

- [v1.5 carryover] smart-whisper Apple Silicon acceleration claim unverified on Mac Mini M4 — fall back to `tiny.en` if `base.en` acceleration not confirmed on device
+- [v1.5 carryover] smart-whisper Apple Silicon acceleration unverified on Mac Mini M4 — fall back to `tiny.en` if `base.en` acceleration not confirmed
+- [v1.6] grammY session management approach not yet chosen: lightweight `Map<chatId, sessionId>` vs. grammY conversation plugin — decide at Phase 38 planning
+- [v1.6] Dual output prompt reliability on 7B models is ~90% — Approach B fallback (post-process markdown strip) must be implemented as safety net, not optional

 ## Session Continuity

 Last session: 2026-04-03
-Stopped at: Milestone v1.6 initialized
+Stopped at: Roadmap created — 4 phases defined, 23/23 requirements mapped
 Resume file: None
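
The Blockers section names an "Approach B" fallback: stripping markdown in post-processing when a 7B model does not reliably produce the voice-friendly half of the dual output. The commit does not show that code. The sketch below is one possible shape for it: `stripMarkdownForVoice` and `DualOutput` are hypothetical names, and the regular expressions cover only common markdown constructs, not the full syntax.

```typescript
// Hypothetical dual-output shape: prose for TTS plus the untouched original.
interface DualOutput {
  voiceText: string;
  fullText: string;
}

// Regex-based markdown strip (an assumed take on "Approach B").
function stripMarkdownForVoice(markdown: string): DualOutput {
  const voiceText = markdown
    .replace(/```[\s\S]*?```/g, " ")               // drop fenced code blocks entirely
    .replace(/`([^`]*)`/g, "$1")                   // unwrap inline code
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")      // images: keep alt text
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")       // links: keep the label
    .replace(/^[ \t]{0,3}#{1,6}[ \t]+/gm, "")      // heading markers
    .replace(/^[ \t]*(?:[-*+]|\d+\.)[ \t]+/gm, "") // bullet and numbered list markers
    .replace(/\*\*(.*?)\*\*/g, "$1")               // bold
    .replace(/\*(.*?)\*/g, "$1")                   // italic (asterisk form only)
    .replace(/\s+/g, " ")                          // collapse whitespace and newlines
    .trim();
  return { voiceText, fullText: markdown };
}

const reply = [
  "## Fix",
  "",
  "Run **the** `build` script:",
  "",
  "```sh",
  "npm run build",
  "```",
  "",
  "- See [docs](https://example.com).",
].join("\n");

const out = stripMarkdownForVoice(reply);
console.log(out.voiceText);
```

Underscore emphasis is deliberately left alone so identifiers such as `voice_input` are not mangled before they reach TTS. The full markdown, code blocks included, stays available in `fullText` for the text half of the response.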