docs: create milestone v1.6 roadmap (4 phases)

Nexus Dev 2026-04-04 00:53:37 +00:00
parent 1cdcfac10b
commit 68aa5ae052
3 changed files with 155 additions and 172 deletions


@@ -72,35 +72,35 @@
| Requirement | Phase | Status |
|-------------|-------|--------|
| VPIPE-01 | | Pending |
| VPIPE-02 | | Pending |
| VPIPE-03 | | Pending |
| VPIPE-04 | | Pending |
| VPIPE-05 | | Pending |
| VPIPE-06 | | Pending |
| VPIPE-07 | | Pending |
| VPIPE-08 | | Pending |
| WCHAT-01 | | Pending |
| WCHAT-02 | | Pending |
| WCHAT-03 | | Pending |
| WCHAT-04 | | Pending |
| WCHAT-05 | | Pending |
| WCHAT-06 | | Pending |
| TGRAM-01 | | Pending |
| TGRAM-02 | | Pending |
| TGRAM-03 | | Pending |
| TGRAM-04 | | Pending |
| TGRAM-05 | | Pending |
| TGRAM-06 | | Pending |
| ONBRD-01 | | Pending |
| ONBRD-02 | | Pending |
| ONBRD-03 | | Pending |
| VPIPE-01 | Phase 36 | Pending |
| VPIPE-02 | Phase 36 | Pending |
| VPIPE-03 | Phase 36 | Pending |
| VPIPE-04 | Phase 36 | Pending |
| VPIPE-05 | Phase 36 | Pending |
| VPIPE-06 | Phase 36 | Pending |
| VPIPE-07 | Phase 39 | Pending |
| VPIPE-08 | Phase 39 | Pending |
| WCHAT-01 | Phase 37 | Pending |
| WCHAT-02 | Phase 37 | Pending |
| WCHAT-03 | Phase 37 | Pending |
| WCHAT-04 | Phase 37 | Pending |
| WCHAT-05 | Phase 37 | Pending |
| WCHAT-06 | Phase 37 | Pending |
| TGRAM-01 | Phase 38 | Pending |
| TGRAM-02 | Phase 38 | Pending |
| TGRAM-03 | Phase 38 | Pending |
| TGRAM-04 | Phase 38 | Pending |
| TGRAM-05 | Phase 38 | Pending |
| TGRAM-06 | Phase 38 | Pending |
| ONBRD-01 | Phase 39 | Pending |
| ONBRD-02 | Phase 39 | Pending |
| ONBRD-03 | Phase 38 | Pending |
**Coverage:**
- v1.6 requirements: 23 total
- Mapped to phases: 0
- Unmapped: 23 ⚠️
- Mapped to phases: 23
- Unmapped: 0 ✓
---
*Requirements defined: 2026-04-04*
*Last updated: 2026-04-04 after initial definition*
*Last updated: 2026-04-03 — traceability populated after roadmap creation*


@@ -5,7 +5,8 @@
- ✅ **v1.2.1 Universal Skill Management** - Phase 1 (shipped 2026-04-01)
- ✅ **v1.3 Chat & PWA** - Phases 21-26 (shipped 2026-04-02)
- ✅ **v1.4 Hermes Default Provider** - Phases 27-29 (shipped 2026-04-02)
- 🚧 **v1.5 Smart Onboarding + Personal AI Assistant** - Phases 30-35 (in progress)
- ✅ **v1.5 Smart Onboarding + Personal AI Assistant** - Phases 30-35 (shipped 2026-04-03)
- 🚧 **v1.6 Voice Pipeline + Minimal Message Bridge** - Phases 36-39 (in progress)
---
@@ -58,168 +59,140 @@ Plans:
**Goal**: Users can create a Hermes agent in Nexus, configure it, and have it execute heartbeats that spawn `hermes chat -q`, return a result, and persist the session across runs
**Plans**: 1/1 plans complete
Plans:
- [x] 27-01-PLAN.md — Close four integration gaps: SESSIONED_LOCAL_ADAPTERS, create-mode toolsets bug, duplicate constant, session codec test
### Phase 28: Ollama Integration & Agent Surface
**Goal**: Users can see which Ollama models are available, get a recommendation for their hardware, configure any Hermes agent to use a local model, and see Hermes-specific runtime data in the dashboard and agent config
**Plans**: 3/3 plans complete
Plans:
- [x] 28-01-PLAN.md — Ollama service, routes, model catalog, and unit tests
- [x] 28-02-PLAN.md — UI model selector dropdown, install callout, Hermes skill badge
- [x] 28-03-PLAN.md — Hermes stateJson runtime data and dashboard HermesRuntimeCard
### Phase 29: Default Provider & End-to-End
**Goal**: A fresh Nexus install with only Hermes and Ollama works end-to-end — onboarding offers Hermes as the default, PM and Engineer templates run correctly on the Hermes runtime, and GSD workflow tasks complete successfully
**Plans**: 2/2 plans complete
Plans:
- [x] 29-01-PLAN.md — Adapter probe route, onboarding wizard Hermes fallback, adapter-neutral templates
- [x] 29-02-PLAN.md — Hermes skill injection via promptTemplate, integration tests
</details>
<details>
<summary>✅ v1.5 Smart Onboarding + Personal AI Assistant (Phases 30-35) - SHIPPED 2026-04-03</summary>
### Phase 30: Hardware Detection + Mode Selection
**Goal**: Users see accurate hardware information during onboarding, get a model recommendation matched to their machine, and choose a mode that correctly gates all downstream features
**Plans**: 2/2 plans complete
### Phase 31: Puter.js Zero-Config Cloud
**Goal**: Users without Ollama installed can reach working AI in one click via Puter.js
**Plans**: 4/4 plans complete
### Phase 32: Multi-Step Onboarding Wizard
**Goal**: Users move through a complete, skippable onboarding flow that assembles hardware data, provider selection, and voice options into a summary screen
**Plans**: 1/1 plans complete
### Phase 33: Persistent Memory + Personal Assistant Mode
**Goal**: Users in Personal AI Assistant mode accumulate memory across sessions that shapes future responses
**Plans**: 3/3 plans complete
### Phase 34: Voice
**Goal**: Users can speak to the assistant (Whisper STT) and hear responses read aloud (Piper TTS)
**Plans**: 2/2 plans complete
### Phase 35: npx buildthis CLI
**Goal**: A developer can run `npx buildthis` on a fresh machine and either open an already-running Nexus or be guided through install
**Plans**: 1/1 plans complete
</details>
---
### 🚧 v1.5 Smart Onboarding + Personal AI Assistant (In Progress)
### 🚧 v1.6 Voice Pipeline + Minimal Message Bridge (In Progress)
**Milestone Goal:** The definitive onboarding experience — hardware detection, tiered provider setup (local/free cloud/paid), and a Personal AI Assistant mode that coexists with the Project Builder.
**Milestone Goal:** Transport-agnostic voice pipeline (Whisper STT + Piper TTS) integrated into web chat, plus a minimal Telegram bridge for phone access. Voice infrastructure designed to survive v2.2 Command Center migration.
## Phases
- [x] **Phase 30: Hardware Detection + Mode Selection** — Unauthenticated hardware probe, Apple Silicon unified memory handling, model recommendation database, and mode selector that gates all assistant-specific features (completed 2026-04-02)
- [x] **Phase 31: Puter.js Zero-Config Cloud** — Server-proxied Puter.js adapter with full cost tracking, Google OAuth PKCE tier, and subscription auto-detection; no API keys required for zero-config path (completed 2026-04-03)
- [x] **Phase 32: Multi-Step Onboarding Wizard** — Assemble all provider tiers and hardware data into a skippable multi-step wizard; summary screen routes directly into chat (completed 2026-04-03)
- [x] **Phase 33: Persistent Memory + Personal Assistant Mode** — File-backed memory with write-time sanitization, PersonalAssistantPage, conversation handoff to PM agent (completed 2026-04-03)
- [x] **Phase 34: Voice** — Piper TTS with pre-warm progress, Whisper STT wired into voice service, onboarding voice step activated (completed 2026-04-03)
- [x] **Phase 35: npx buildthis CLI** — Standalone bootstrapper package with hardware detection and provider tiering parity with web onboarding (completed 2026-04-03)
---
- [ ] **Phase 36: Voice Pipeline Foundation** — Transport-agnostic VoicePipelineService (transcribe, synthesize, formatForVoice), voice.ts route, ffmpeg audio transcoding, voiceMode flag, dual output pattern
- [ ] **Phase 37: Web Chat Voice UI** — VAD silence detection, waveform visualization, voice mode toggle, inline audio player, auto-play toggle, COOP/COEP headers
- [ ] **Phase 38: Telegram Bridge** — grammY long polling relay, text + voice note bidirectional relay, agent identity prefix, BotFather onboarding setup
- [ ] **Phase 39: Voice Polish** — Sentence-buffered TTS streaming, multi-language TTS output, onboarding STT/TTS hardware detection step
## Phase Details
### Phase 30: Hardware Detection + Mode Selection
**Goal**: Users see accurate hardware information during onboarding, get a model recommendation matched to their machine, and choose a mode that correctly gates all downstream features — with the probe working before board auth exists
**Depends on**: Phase 29 (v1.4 shipped)
**Requirements**: ONBD-01, ONBD-02, ONBD-03, ONBD-07
### Phase 36: Voice Pipeline Foundation
**Goal**: The transport-agnostic voice pipeline is live and callable from any consumer — web chat, Telegram, or future integrations — with correct audio transcoding, voice mode flag propagation, and dual output formatting baked in from the start
**Depends on**: Phase 35 (v1.5 shipped)
**Requirements**: VPIPE-01, VPIPE-02, VPIPE-03, VPIPE-04, VPIPE-05, VPIPE-06
**Success Criteria** (what must be TRUE):
1. On a fresh install (before any board auth token exists), the hardware probe returns GPU, RAM, and Apple Silicon unified memory data within 5 seconds
2. A Mac Mini M4 reports "unified memory" (not VRAM) with the 0.75 multiplier applied and copy that says "runs entirely on your machine"
3. The mode selector (Personal AI Assistant / Project Builder / Both) is visible during onboarding and the selected mode is persisted; assistant-specific UI is hidden when Project Builder-only is chosen
4. The model recommendation shown to the user matches an entry in the pre-built JSON catalog for the detected hardware tier (GPU / Apple Silicon / CPU-only)
**Plans**: 2 plans
1. Posting a WAV audio file to `POST /api/transcribe` returns a transcription with detected language, regardless of whether the request came from the web UI or a test harness
2. Calling `POST /api/synthesize` with a markdown-heavy agent response returns two outputs: a voice-optimized prose version (no markdown) and the original full text with code blocks
3. A WebM/Opus browser recording and an OGG/Opus Telegram voice note both produce identical Whisper transcription quality after ffmpeg transcodes each to WAV 16kHz mono
4. The `voiceMode` flag on a chat message survives from client request through Express route to message persistence — verifiable in the DB record
5. `nexus-settings.json` accepts `voiceMode: "text" | "voice_input" | "full_voice"` and `telegramToken` fields without breaking existing settings reads
**Plans**: TBD
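
Criterion 2 describes a dual output: prose for speech plus the untouched original. A minimal sketch of how a post-processing pass could produce both is shown here. The roadmap names `formatForVoice` but not its signature, so the `DualOutput` shape, the field names, and the regex set are all assumptions, and the patterns cover only common markdown.

```typescript
// Assumed result shape; not taken from the project's code.
interface DualOutput {
  voiceText: string; // prose with markdown removed, intended for TTS
  fullText: string;  // original response, code blocks intact
}

// Hypothetical post-processing pass that strips common markdown for speech.
function formatForVoice(markdown: string): DualOutput {
  const voiceText = markdown
    .replace(/```[\s\S]*?```/g, " ")              // drop fenced code blocks entirely
    .replace(/`([^`]*)`/g, "$1")                  // unwrap inline code
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")     // images: keep alt text
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")      // links: keep link text
    .replace(/^\s{0,3}#{1,6}\s+/gm, "")           // heading markers
    .replace(/^\s*(?:[-*+]|\d+\.)\s+/gm, "")      // list bullets and numbers
    .replace(/(\*\*|\*)(\S(?:.*?\S)?)\1/g, "$2")  // asterisk emphasis; underscores are
                                                  // left alone to protect snake_case
    .replace(/\s+/g, " ")                         // collapse leftover whitespace
    .trim();
  return { voiceText, fullText: markdown };
}
```

Because `fullText` is returned unchanged, the text channel keeps every code block while only `voiceText` is sent to synthesis.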
Plans:
- [x] 30-01-PLAN.md — Hardware service, nexus-settings service, model catalog extension, routes, and tests
- [x] 30-02-PLAN.md — ModeSelector, HardwareSummaryStep, useHardwareInfo hook, multi-step wizard wiring
### Phase 31: Puter.js Zero-Config Cloud
**Goal**: Users without Ollama installed can reach working AI in one click via Puter.js — all calls server-proxied, tokens server-stored, cost tracked; Google OAuth and subscription auto-detection round out the provider tier
**Depends on**: Phase 30
**Requirements**: CLOUD-01, CLOUD-02, CLOUD-03, CLOUD-04, CLOUD-05
### Phase 37: Web Chat Voice UI
**Goal**: Users can speak to any agent in web chat — recording auto-stops on silence, a live waveform confirms the mic is active, responses play back automatically (toggleable), and voice mode is a first-class setting
**Depends on**: Phase 36
**Requirements**: WCHAT-01, WCHAT-02, WCHAT-03, WCHAT-04, WCHAT-05, WCHAT-06
**Success Criteria** (what must be TRUE):
1. A user with no Ollama and no API keys clicks "Continue with Puter" in onboarding, completes the Puter auth popup, and immediately gets a working chat response — no API key entry required
2. All Puter AI calls flow through `POST /api/puter-proxy/chat` (verifiable in server logs); the Puter auth token is stored server-side via secretService, not in localStorage
3. Token cost for Puter responses appears in the cost tracking view, attributed correctly per conversation
4. A user with Hermes, Claude Code, or OpenClaw already installed sees those tools pre-filled in the provider configuration step with no manual entry
5. A user clicking "Sign in with Google" for Gemini completes PKCE OAuth and gets a Gemini-backed chat response; the UI displays a policy-risk note that Google OAuth may trigger abuse detection
**Plans**: 4 plans
Plans:
- [x] 31-01-PLAN.md — Puter proxy service, routes, unit tests, and app.ts wiring
- [x] 31-02-PLAN.md — Google OAuth PKCE service, routes, API key storage route
- [x] 31-03-PLAN.md — Provider Selection UI step, PuterAuthButton, GoogleOAuthButton, ApiKeyEntryForm, 4-step wizard wiring
- [x] 31-04-PLAN.md — Google OAuth claim endpoint, human verification of full onboarding flow
1. Clicking the mic button starts recording; the waveform animates to show audio levels; speaking and then pausing for 1.5 seconds auto-submits the recording without pressing any button
2. The voice mode toggle has three visible states (text only / voice input / full voice) and persists the selected mode across page refreshes
3. An agent response delivered in full voice mode plays back automatically in the chat thread; the auto-play can be turned off in settings and stays off after a page reload
4. The chat message for a voice interaction shows a voice badge and an expandable section revealing the full markdown response with code blocks intact
5. Voice recording and VAD work correctly in Chrome and Firefox on the Mac Mini (COOP/COEP headers satisfy SharedArrayBuffer requirements)
**Plans**: TBD
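
Criterion 1 hinges on a pause of 1.5 seconds ending the recording. The roadmap plans to use a VAD library for this; the sketch below is a deliberately simplified energy-threshold stand-in that only illustrates the timing rule. The class name, the RMS threshold, and the frame-based API are assumptions, not the library's behavior.

```typescript
// Simplified stand-in for VAD: tracks how long the signal has stayed quiet
// after speech was first heard. Threshold and frame timing are assumptions.
class SilenceAutoStop {
  private heardSpeech = false;
  private quietMs = 0;
  private readonly silenceMs: number;
  private readonly threshold: number;

  constructor(silenceMs = 1500, threshold = 0.02) {
    this.silenceMs = silenceMs; // pause length from the success criterion
    this.threshold = threshold; // assumed RMS level separating speech from silence
  }

  // Feed one audio frame; returns true when the recording should be submitted.
  push(frame: Float32Array, frameMs: number): boolean {
    let sumSquares = 0;
    for (let i = 0; i < frame.length; i++) sumSquares += frame[i] * frame[i];
    const rms = frame.length > 0 ? Math.sqrt(sumSquares / frame.length) : 0;

    if (rms >= this.threshold) {
      this.heardSpeech = true;
      this.quietMs = 0; // any speech resets the silence timer
      return false;
    }
    if (!this.heardSpeech) return false; // ignore silence before any speech
    this.quietMs += frameMs;
    return this.quietMs >= this.silenceMs;
  }
}
```

Ignoring silence before the first speech frame keeps an idle microphone from submitting an empty recording.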
**UI hint**: yes
### Phase 32: Multi-Step Onboarding Wizard
**Goal**: Users move through a complete, skippable onboarding flow that assembles hardware data, provider selection, and voice options into a summary screen — and can jump straight into chat from there
**Depends on**: Phase 31
**Requirements**: ONBD-04, ONBD-05, ONBD-06
### Phase 38: Telegram Bridge
**Goal**: The user can message any Nexus agent from their phone via Telegram — text and voice notes both work, agent identity is visible on every reply, and the bot is set up through guided onboarding with no manual token entry in config files
**Depends on**: Phase 36
**Requirements**: TGRAM-01, TGRAM-02, TGRAM-03, TGRAM-04, TGRAM-05, TGRAM-06, ONBRD-03
**Success Criteria** (what must be TRUE):
1. A user can click "Skip" on every onboarding step (hardware, provider, voice) and reach the summary screen; the resulting workspace has at least one working agent with a valid provider
2. The summary screen shows the configured providers and agent-model pairings for the selected mode; no corporate language ("company", "CEO", "mission") appears anywhere in the flow
3. From the summary screen, one click navigates directly to the Personal Assistant chat or the project dashboard (depending on chosen mode) with no additional prompts
**Plans**: 1 plan
1. Sending a text message to the Nexus Telegram bot from a phone produces an agent reply prefixed with the agent name (e.g. `[PM]: response`) within 10 seconds
2. Sending a voice note to the Telegram bot produces a transcription confirmation message followed by the agent's text reply — the bot does not silently fail or miss the update
3. Requesting a voice reply from the bot returns an OGG voice note that plays back correctly in the Telegram mobile app
4. The Telegram bridge runs via long polling with no public HTTPS endpoint required — verified by running on the Mac Mini behind NAT
5. The entire `telegram.ts` service file is under 500 lines
6. The onboarding wizard includes a BotFather setup step that walks through creating a bot token and saves it to `nexus-settings.json` without manual file editing
**Plans**: TBD
Plans:
- [x] 32-01-PLAN.md — Summary step, skip buttons, chat handoff
**UI hint**: yes
### Phase 33: Persistent Memory + Personal Assistant Mode
**Goal**: Users in Personal AI Assistant mode accumulate memory across sessions that shapes future responses — with no risk of credentials leaking into prompts — and can hand off any conversation to a PM agent with context intact
**Depends on**: Phase 32
**Requirements**: ASST-01, ASST-02, ASST-03, ASST-04
### Phase 39: Voice Polish
**Goal**: Voice responses begin playing before synthesis is complete (sentence-buffered), a single response can be synthesized in multiple languages simultaneously, and new installs can detect STT/TTS hardware capability during onboarding and enable voice in one step
**Depends on**: Phase 37
**Requirements**: VPIPE-07, VPIPE-08, ONBRD-01, ONBRD-02
**Success Criteria** (what must be TRUE):
1. A fact stated in one chat session ("I prefer TypeScript") is referenced correctly by the assistant in a new session started after closing and reopening the browser
2. Pasting an API key or token into chat and then starting a new session results in the assistant having no knowledge of that credential — the sanitization blocklist prevented it from being stored
3. A user clicks "Turn this into a project" in an assistant conversation; a PM agent is created with a system message containing the conversation summary and they land on the project dashboard
4. A user with mode set to "Both" can switch between Personal Assistant chat and the project dashboard without losing context or cross-contaminating assistant memory with project agent messages
**Plans**: 3 plans
Plans:
- [x] 33-01-PLAN.md — Memory sanitizer, assistant memory service, REST routes, and unit tests
- [x] 33-02-PLAN.md — PersonalAssistantPage, useNexusMode hook, sidebar navigation, route wiring
- [x] 33-03-PLAN.md — Real AI streaming with memory injection, assistant-to-PM handoff route and UI
**UI hint**: yes
### Phase 34: Voice
**Goal**: Users can speak to the assistant (Whisper STT) and hear responses read aloud (Piper TTS) — Piper pre-warms visibly so the first synthesis call does not appear broken, and voice is offered during onboarding based on hardware capability
**Depends on**: Phase 32
**Requirements**: VOICE-01, VOICE-02, VOICE-03
**Success Criteria** (what must be TRUE):
1. On a CPU-only machine (no GPU), enabling Piper TTS in the assistant produces audible speech output within a reasonable time after the first synthesis (not a silent hang)
2. When Piper's WASM voice model is downloading for the first time, a visible progress indicator is shown before the TTS toggle is enabled; the download completes and TTS works without a page reload
3. The onboarding voice step offers Whisper STT and Piper TTS toggles only when the hardware detection step has confirmed sufficient capability; on hardware below the threshold, the step is skipped or shows a capability warning
**Plans**: 2 plans
Plans:
- [x] 34-01-PLAN.md — Fix /transcribe route registration, Piper TTS hook + TtsButton, voiceEnabled in nexus-settings
- [x] 34-02-PLAN.md — VoiceStep onboarding component, wizard step insertion, PersonalAssistant voice wiring
**UI hint**: yes
### Phase 35: npx buildthis CLI
**Goal**: A developer can run `npx buildthis` on a fresh machine and either open an already-running Nexus or be guided through install — with the same hardware detection and provider tiering as the web onboarding
**Depends on**: Phase 30 (hardware detection service must exist)
**Requirements**: CLI-01, CLI-02
**Success Criteria** (what must be TRUE):
1. Running `npx buildthis` on a machine where Nexus is already running opens the Nexus UI in the default browser; running it on a machine with no Nexus guides the user through installation steps
2. The CLI bootstrapper detects the same hardware tier (GPU / Apple Silicon / CPU-only) as the web onboarding and presents the matching provider tier recommendations in the terminal prompt
**Plans**: 1 plan
Plans:
- [x] 35-01-PLAN.md — Package scaffold, hardware detection, two-path bootstrap (probe running vs guide install), provider selection, tests
1. For a multi-sentence agent response, the first sentence begins playing in the browser before the second sentence has finished synthesizing — the gap between text completion and first audio is under 1 second
2. A user can request the same agent response as audio in both English and Danish; both OGG files are generated and available for playback without a second agent call
3. On a fresh install, the onboarding hardware probe reports whether Whisper STT and Piper TTS are runnable on the detected hardware tier
4. The onboarding voice step activates (showing enable/skip options) only when the hardware probe confirms sufficient capability; on hardware below threshold it shows a capability note and skips to the next step
**Plans**: TBD
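
Criterion 1 implies that text is handed to TTS one sentence at a time while the rest of the response is still arriving. A minimal sketch of such a buffer, assuming the response arrives as string chunks; `SentenceBuffer` and its methods are hypothetical names, and the boundary rule (terminal punctuation followed by whitespace) is a simplification that will split on abbreviations such as "e.g. ".

```typescript
// Hypothetical buffer: accumulates streamed text chunks and releases whole
// sentences so each can be synthesized while later text is still arriving.
class SentenceBuffer {
  private pending = "";

  // Returns every complete sentence made available by this chunk.
  push(chunk: string): string[] {
    this.pending += chunk;
    const sentences: string[] = [];
    // A sentence ends at ., ! or ? followed by whitespace.
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null;
    while ((match = boundary.exec(this.pending)) !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(start, end).trim());
      start = end;
    }
    this.pending = this.pending.slice(start); // keep the unfinished tail
    return sentences;
  }

  // Call once the stream ends to release any trailing text.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest ? [rest] : [];
  }
}
```

Each string returned by `push` could be sent to synthesis immediately, which is what allows the first sentence to start playing before the second has been synthesized.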
---
## Coverage Validation
All 21 v1.5 requirements are mapped to exactly one phase. No orphans.
All 23 v1.6 requirements are mapped to exactly one phase. No orphans.
| Requirement | Phase |
|-------------|-------|
| ONBD-01 | 30 |
| ONBD-02 | 30 |
| ONBD-03 | 30 |
| ONBD-07 | 30 |
| CLOUD-01 | 31 |
| CLOUD-02 | 31 |
| CLOUD-03 | 31 |
| CLOUD-04 | 31 |
| CLOUD-05 | 31 |
| ONBD-04 | 32 |
| ONBD-05 | 32 |
| ONBD-06 | 32 |
| ASST-01 | 33 |
| ASST-02 | 33 |
| ASST-03 | 33 |
| ASST-04 | 33 |
| VOICE-01 | 34 |
| VOICE-02 | 34 |
| VOICE-03 | 34 |
| CLI-01 | 35 |
| CLI-02 | 35 |
| VPIPE-01 | 36 |
| VPIPE-02 | 36 |
| VPIPE-03 | 36 |
| VPIPE-04 | 36 |
| VPIPE-05 | 36 |
| VPIPE-06 | 36 |
| WCHAT-01 | 37 |
| WCHAT-02 | 37 |
| WCHAT-03 | 37 |
| WCHAT-04 | 37 |
| WCHAT-05 | 37 |
| WCHAT-06 | 37 |
| TGRAM-01 | 38 |
| TGRAM-02 | 38 |
| TGRAM-03 | 38 |
| TGRAM-04 | 38 |
| TGRAM-05 | 38 |
| TGRAM-06 | 38 |
| ONBRD-03 | 38 |
| VPIPE-07 | 39 |
| VPIPE-08 | 39 |
| ONBRD-01 | 39 |
| ONBRD-02 | 39 |
---
@@ -237,9 +210,13 @@ All 21 v1.5 requirements are mapped to exactly one phase. No orphans.
| 27. Hermes Adapter | v1.4 | 1/1 | Complete | 2026-04-02 |
| 28. Ollama Integration & Agent Surface | v1.4 | 3/3 | Complete | 2026-04-02 |
| 29. Default Provider & End-to-End | v1.4 | 2/2 | Complete | 2026-04-02 |
| 30. Hardware Detection + Mode Selection | v1.5 | 2/2 | Complete | 2026-04-03 |
| 31. Puter.js Zero-Config Cloud | v1.5 | 4/4 | Complete | 2026-04-03 |
| 32. Multi-Step Onboarding Wizard | v1.5 | 1/1 | Complete | 2026-04-03 |
| 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete | 2026-04-03 |
| 34. Voice | v1.5 | 2/2 | Complete | 2026-04-03 |
| 35. npx buildthis CLI | v1.5 | 1/1 | Complete | 2026-04-03 |
| 30. Hardware Detection + Mode Selection | v1.5 | 2/2 | Complete | 2026-04-03 |
| 31. Puter.js Zero-Config Cloud | v1.5 | 4/4 | Complete | 2026-04-03 |
| 32. Multi-Step Onboarding Wizard | v1.5 | 1/1 | Complete | 2026-04-03 |
| 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete | 2026-04-03 |
| 34. Voice | v1.5 | 2/2 | Complete | 2026-04-03 |
| 35. npx buildthis CLI | v1.5 | 1/1 | Complete | 2026-04-03 |
| 36. Voice Pipeline Foundation | v1.6 | 0/TBD | Not started | - |
| 37. Web Chat Voice UI | v1.6 | 0/TBD | Not started | - |
| 38. Telegram Bridge | v1.6 | 0/TBD | Not started | - |
| 39. Voice Polish | v1.6 | 0/TBD | Not started | - |


@@ -7,7 +7,7 @@ stopped_at: null
last_updated: "2026-04-03"
last_activity: 2026-04-03
progress:
total_phases: 0
total_phases: 4
completed_phases: 0
total_plans: 0
completed_plans: 0
@@ -21,14 +21,16 @@ progress:
See: .planning/PROJECT.md (updated 2026-04-03)
**Core value:** A fresh onboard asks for ONE thing (root directory), auto-creates PM + Engineer agents, and drops you in the dashboard.
**Current focus:** Defining requirements for v1.6
**Current focus:** Phase 36 — Voice Pipeline Foundation (ready to plan)
## Current Position
Phase: Not started (defining requirements)
Plan: —
Status: Defining requirements
Last activity: 2026-04-03 — Milestone v1.6 started
Phase: 36 of 39 (Voice Pipeline Foundation)
Plan: — (not started)
Status: Ready to plan
Last activity: 2026-04-03 — v1.6 roadmap created (4 phases, 23 requirements mapped)
Progress: [░░░░░░░░░░] 0%
## Performance Metrics
@@ -45,11 +47,13 @@ Last activity: 2026-04-03 — Milestone v1.6 started
Decisions are logged in PROJECT.md Key Decisions table.
Key constraints for v1.6:
- Voice pipeline is transport-agnostic — no Telegram-specific code in core voice components
- Telegram bridge is intentionally disposable (<500 lines); it will be replaced by v2.2 Command Center
- Dual output always: voice response + full technical details in text
- Voice mode is a per-message flag, not a per-agent setting
- v1.5 already has VoiceRecordButton, TtsButton, usePiperTts hooks in place — build on these
- voicePipelineService is the keystone — Phase 37 and Phase 38 both depend on it; build Phase 36 first
- Telegram bridge uses long polling (grammY `bot.start()`) — no public HTTPS required on Mac Mini
- Audio transcoding via ffmpeg-static ^5.2.0 — NOT fluent-ffmpeg (archived May 2025)
- Voice mode flag must survive every pipeline layer: client → Express → message persistence → agent codec
- COOP/COEP headers required for @ricky0123/vad-react SharedArrayBuffer (add to Express static middleware)
- Phase 37 and Phase 38 are independent once Phase 36 ships; they are ordered sequentially only because a single developer is delivering them
- Telegram bridge must stay under 500 lines (TGRAM-06 is a hard constraint)
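
The COOP/COEP constraint above comes down to two response headers that browsers require before exposing `SharedArrayBuffer`. A minimal sketch of a middleware that sets them; the function name `crossOriginIsolation` and the structural `HeaderWritable` type are assumptions, chosen so the example type-checks without Express installed.

```typescript
// Minimal structural type standing in for an Express-style response object.
interface HeaderWritable {
  setHeader(name: string, value: string): unknown;
}

// Hypothetical middleware: marks responses as cross-origin isolated.
function crossOriginIsolation(
  _req: unknown,
  res: HeaderWritable,
  next: () => void,
): void {
  res.setHeader("Cross-Origin-Opener-Policy", "same-origin");
  res.setHeader("Cross-Origin-Embedder-Policy", "require-corp");
  next(); // hand control to the next middleware, e.g. static file serving
}
```

It would need to be registered ahead of the static middleware so the headers reach the HTML document. Note that `require-corp` also means cross-origin subresources must opt in via CORS or a `Cross-Origin-Resource-Policy` header, which is worth checking for any third-party assets.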
### Pending Todos
@@ -57,10 +61,12 @@ None yet.
### Blockers/Concerns
- [v1.5 carryover] smart-whisper Apple Silicon acceleration claim unverified on Mac Mini M4 — fall back to `tiny.en` if `base.en` acceleration not confirmed on device
- [v1.5 carryover] smart-whisper Apple Silicon acceleration unverified on Mac Mini M4 — fall back to `tiny.en` if `base.en` acceleration not confirmed
- [v1.6] grammY session management approach not yet chosen: lightweight `Map<chatId, sessionId>` vs. grammY conversation plugin — decide at Phase 38 planning
- [v1.6] Dual output prompt reliability on 7B models is ~90% — Approach B fallback (post-process markdown strip) must be implemented as a safety net; it is not optional
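
For the open session-management question, the lightweight option could look roughly like the sketch below. It is one possible shape for the `Map<chatId, sessionId>` approach, not a decision; `ChatSessionStore`, its methods, and the injected `createSessionId` callback are hypothetical names.

```typescript
// Sketch of the lightweight option: one in-memory map from Telegram chat id
// to Nexus session id. All names here are hypothetical.
class ChatSessionStore {
  private readonly sessions = new Map<number, string>();

  // Returns the existing session for a chat, or creates one on first contact.
  getOrCreate(chatId: number, createSessionId: () => string): string {
    let sessionId = this.sessions.get(chatId);
    if (sessionId === undefined) {
      sessionId = createSessionId();
      this.sessions.set(chatId, sessionId);
    }
    return sessionId;
  }

  // Drops the mapping, e.g. when the user asks to start a fresh conversation.
  reset(chatId: number): boolean {
    return this.sessions.delete(chatId);
  }
}
```

The trade-off is that the map lives only in process memory, so a restart forgets every chat-to-session pairing unless the map is persisted separately.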
## Session Continuity
Last session: 2026-04-03
Stopped at: Milestone v1.6 initialized
Stopped at: Roadmap created — 4 phases defined, 23/23 requirements mapped
Resume file: None