docs: create milestone v1.6 roadmap (4 phases)
parent 1cdcfac10b
commit 68aa5ae052
3 changed files with 155 additions and 172 deletions
|
|
@ -72,35 +72,35 @@
| Requirement | Phase | Status |
|-------------|-------|--------|
| VPIPE-01 | — | Pending |
| VPIPE-02 | — | Pending |
| VPIPE-03 | — | Pending |
| VPIPE-04 | — | Pending |
| VPIPE-05 | — | Pending |
| VPIPE-06 | — | Pending |
| VPIPE-07 | — | Pending |
| VPIPE-08 | — | Pending |
| WCHAT-01 | — | Pending |
| WCHAT-02 | — | Pending |
| WCHAT-03 | — | Pending |
| WCHAT-04 | — | Pending |
| WCHAT-05 | — | Pending |
| WCHAT-06 | — | Pending |
| TGRAM-01 | — | Pending |
| TGRAM-02 | — | Pending |
| TGRAM-03 | — | Pending |
| TGRAM-04 | — | Pending |
| TGRAM-05 | — | Pending |
| TGRAM-06 | — | Pending |
| ONBRD-01 | — | Pending |
| ONBRD-02 | — | Pending |
| ONBRD-03 | — | Pending |
| VPIPE-01 | Phase 36 | Pending |
| VPIPE-02 | Phase 36 | Pending |
| VPIPE-03 | Phase 36 | Pending |
| VPIPE-04 | Phase 36 | Pending |
| VPIPE-05 | Phase 36 | Pending |
| VPIPE-06 | Phase 36 | Pending |
| VPIPE-07 | Phase 39 | Pending |
| VPIPE-08 | Phase 39 | Pending |
| WCHAT-01 | Phase 37 | Pending |
| WCHAT-02 | Phase 37 | Pending |
| WCHAT-03 | Phase 37 | Pending |
| WCHAT-04 | Phase 37 | Pending |
| WCHAT-05 | Phase 37 | Pending |
| WCHAT-06 | Phase 37 | Pending |
| TGRAM-01 | Phase 38 | Pending |
| TGRAM-02 | Phase 38 | Pending |
| TGRAM-03 | Phase 38 | Pending |
| TGRAM-04 | Phase 38 | Pending |
| TGRAM-05 | Phase 38 | Pending |
| TGRAM-06 | Phase 38 | Pending |
| ONBRD-01 | Phase 39 | Pending |
| ONBRD-02 | Phase 39 | Pending |
| ONBRD-03 | Phase 38 | Pending |

**Coverage:**
- v1.6 requirements: 23 total
- Mapped to phases: 0
- Unmapped: 23 ⚠️
- Mapped to phases: 23
- Unmapped: 0 ✓

---
*Requirements defined: 2026-04-04*
*Last updated: 2026-04-04 after initial definition*
*Last updated: 2026-04-03 — traceability populated after roadmap creation*
|
|
|
|||
|
|
@ -5,7 +5,8 @@
- ✅ **v1.2.1 Universal Skill Management** - Phase 1 (shipped 2026-04-01)
- ✅ **v1.3 Chat & PWA** - Phases 21-26 (shipped 2026-04-02)
- ✅ **v1.4 Hermes Default Provider** - Phases 27-29 (shipped 2026-04-02)
- 🚧 **v1.5 Smart Onboarding + Personal AI Assistant** - Phases 30-35 (in progress)
- ✅ **v1.5 Smart Onboarding + Personal AI Assistant** - Phases 30-35 (shipped 2026-04-03)
- 🚧 **v1.6 Voice Pipeline + Minimal Message Bridge** - Phases 36-39 (in progress)

---
|
|
@ -58,168 +59,140 @@ Plans:
|
|||
**Goal**: Users can create a Hermes agent in Nexus, configure it, and have it execute heartbeats that spawn `hermes chat -q`, return a result, and persist the session across runs
|
||||
**Plans**: 1/1 plans complete
|
||||
|
||||
Plans:
|
||||
- [x] 27-01-PLAN.md — Close four integration gaps: SESSIONED_LOCAL_ADAPTERS, create-mode toolsets bug, duplicate constant, session codec test
|
||||
|
||||
### Phase 28: Ollama Integration & Agent Surface
|
||||
**Goal**: Users can see which Ollama models are available, get a recommendation for their hardware, configure any Hermes agent to use a local model, and see Hermes-specific runtime data in the dashboard and agent config
|
||||
**Plans**: 3/3 plans complete
|
||||
|
||||
Plans:
|
||||
- [x] 28-01-PLAN.md — Ollama service, routes, model catalog, and unit tests
|
||||
- [x] 28-02-PLAN.md — UI model selector dropdown, install callout, Hermes skill badge
|
||||
- [x] 28-03-PLAN.md — Hermes stateJson runtime data and dashboard HermesRuntimeCard
|
||||
|
||||
### Phase 29: Default Provider & End-to-End
|
||||
**Goal**: A fresh Nexus install with only Hermes and Ollama works end-to-end — onboarding offers Hermes as the default, PM and Engineer templates run correctly on the Hermes runtime, and GSD workflow tasks complete successfully
|
||||
**Plans**: 2/2 plans complete
|
||||
|
||||
Plans:
|
||||
- [x] 29-01-PLAN.md — Adapter probe route, onboarding wizard Hermes fallback, adapter-neutral templates
|
||||
- [x] 29-02-PLAN.md — Hermes skill injection via promptTemplate, integration tests
|
||||
</details>
|
||||
|
||||
<details>
|
||||
<summary>✅ v1.5 Smart Onboarding + Personal AI Assistant (Phases 30-35) - SHIPPED 2026-04-03</summary>
|
||||
|
||||
### Phase 30: Hardware Detection + Mode Selection
|
||||
**Goal**: Users see accurate hardware information during onboarding, get a model recommendation matched to their machine, and choose a mode that correctly gates all downstream features
|
||||
**Plans**: 2/2 plans complete
|
||||
|
||||
### Phase 31: Puter.js Zero-Config Cloud
|
||||
**Goal**: Users without Ollama installed can reach working AI in one click via Puter.js
|
||||
**Plans**: 4/4 plans complete
|
||||
|
||||
### Phase 32: Multi-Step Onboarding Wizard
|
||||
**Goal**: Users move through a complete, skippable onboarding flow that assembles hardware data, provider selection, and voice options into a summary screen
|
||||
**Plans**: 1/1 plans complete
|
||||
|
||||
### Phase 33: Persistent Memory + Personal Assistant Mode
|
||||
**Goal**: Users in Personal AI Assistant mode accumulate memory across sessions that shapes future responses
|
||||
**Plans**: 3/3 plans complete
|
||||
|
||||
### Phase 34: Voice
|
||||
**Goal**: Users can speak to the assistant (Whisper STT) and hear responses read aloud (Piper TTS)
|
||||
**Plans**: 2/2 plans complete
|
||||
|
||||
### Phase 35: npx buildthis CLI
|
||||
**Goal**: A developer can run `npx buildthis` on a fresh machine and either open an already-running Nexus or be guided through install
|
||||
**Plans**: 1/1 plans complete
|
||||
|
||||
</details>
|
||||
|
||||
---
|
||||
|
||||
### 🚧 v1.5 Smart Onboarding + Personal AI Assistant (In Progress)
|
||||
### 🚧 v1.6 Voice Pipeline + Minimal Message Bridge (In Progress)
|
||||
|
||||
**Milestone Goal:** The definitive onboarding experience — hardware detection, tiered provider setup (local/free cloud/paid), and a Personal AI Assistant mode that coexists with the Project Builder.
|
||||
**Milestone Goal:** Transport-agnostic voice pipeline (Whisper STT + Piper TTS) integrated into web chat, plus a minimal Telegram bridge for phone access. Voice infrastructure designed to survive v2.2 Command Center migration.
|
||||
|
||||
## Phases
|
||||
|
||||
- [x] **Phase 30: Hardware Detection + Mode Selection** — Unauthenticated hardware probe, Apple Silicon unified memory handling, model recommendation database, and mode selector that gates all assistant-specific features (completed 2026-04-02)
|
||||
- [x] **Phase 31: Puter.js Zero-Config Cloud** — Server-proxied Puter.js adapter with full cost tracking, Google OAuth PKCE tier, and subscription auto-detection; no API keys required for zero-config path (completed 2026-04-03)
|
||||
- [x] **Phase 32: Multi-Step Onboarding Wizard** — Assemble all provider tiers and hardware data into a skippable multi-step wizard; summary screen routes directly into chat (completed 2026-04-03)
|
||||
- [x] **Phase 33: Persistent Memory + Personal Assistant Mode** — File-backed memory with write-time sanitization, PersonalAssistantPage, conversation handoff to PM agent (completed 2026-04-03)
|
||||
- [x] **Phase 34: Voice** — Piper TTS with pre-warm progress, Whisper STT wired into voice service, onboarding voice step activated (completed 2026-04-03)
|
||||
- [x] **Phase 35: npx buildthis CLI** — Standalone bootstrapper package with hardware detection and provider tiering parity with web onboarding (completed 2026-04-03)
|
||||
|
||||
---
|
||||
- [ ] **Phase 36: Voice Pipeline Foundation** — Transport-agnostic VoicePipelineService (transcribe, synthesize, formatForVoice), voice.ts route, ffmpeg audio transcoding, voiceMode flag, dual output pattern
|
||||
- [ ] **Phase 37: Web Chat Voice UI** — VAD silence detection, waveform visualization, voice mode toggle, inline audio player, auto-play toggle, COOP/COEP headers
|
||||
- [ ] **Phase 38: Telegram Bridge** — grammY long polling relay, text + voice note bidirectional relay, agent identity prefix, BotFather onboarding setup
|
||||
- [ ] **Phase 39: Voice Polish** — Sentence-buffered TTS streaming, multi-language TTS output, onboarding STT/TTS hardware detection step
|
||||
|
||||
## Phase Details
|
||||
|
||||
### Phase 30: Hardware Detection + Mode Selection
|
||||
**Goal**: Users see accurate hardware information during onboarding, get a model recommendation matched to their machine, and choose a mode that correctly gates all downstream features — with the probe working before board auth exists
|
||||
**Depends on**: Phase 29 (v1.4 shipped)
|
||||
**Requirements**: ONBD-01, ONBD-02, ONBD-03, ONBD-07
|
||||
### Phase 36: Voice Pipeline Foundation
|
||||
**Goal**: The transport-agnostic voice pipeline is live and callable from any consumer — web chat, Telegram, or future integrations — with correct audio transcoding, voice mode flag propagation, and dual output formatting baked in from the start
|
||||
**Depends on**: Phase 35 (v1.5 shipped)
|
||||
**Requirements**: VPIPE-01, VPIPE-02, VPIPE-03, VPIPE-04, VPIPE-05, VPIPE-06
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. On a fresh install (before any board auth token exists), the hardware probe returns GPU, RAM, and Apple Silicon unified memory data within 5 seconds
|
||||
2. A Mac Mini M4 reports "unified memory" (not VRAM) with the 0.75 multiplier applied and copy that says "runs entirely on your machine"
|
||||
3. The mode selector (Personal AI Assistant / Project Builder / Both) is visible during onboarding and the selected mode is persisted; assistant-specific UI is hidden when Project Builder-only is chosen
|
||||
4. The model recommendation shown to the user matches an entry in the pre-built JSON catalog for the detected hardware tier (GPU / Apple Silicon / CPU-only)
|
||||
**Plans**: 2 plans
|
||||
1. Posting a WAV audio file to `POST /api/transcribe` returns a transcription with detected language, regardless of whether the request came from the web UI or a test harness
|
||||
2. Calling `POST /api/synthesize` with a markdown-heavy agent response returns two outputs: a voice-optimized prose version (no markdown) and the original full text with code blocks
|
||||
3. A WebM/Opus browser recording and an OGG/Opus Telegram voice note both produce identical Whisper transcription quality after ffmpeg transcodes each to 16 kHz mono WAV
|
||||
4. The `voiceMode` flag on a chat message survives from client request through Express route to message persistence — verifiable in the DB record
|
||||
5. `nexus-settings.json` accepts `voiceMode: "text" | "voice_input" | "full_voice"` and `telegramToken` fields without breaking existing settings reads
|
||||
**Plans**: TBD
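
Criterion 5 above (new `voiceMode` and `telegramToken` fields that must not break existing settings reads) could be approached with a tolerant reader along these lines. This is a minimal sketch only: `parseVoiceSettings`, the `VoiceSettings` shape, and the `"text"` default for missing or invalid values are assumptions, not part of the roadmap.

```typescript
// Hypothetical sketch: tolerant reader for the two new nexus-settings.json fields.
// The function name and the "text" default are assumptions for illustration.
type VoiceMode = "text" | "voice_input" | "full_voice";

interface VoiceSettings {
  voiceMode: VoiceMode;
  telegramToken?: string;
}

const VOICE_MODES: readonly string[] = ["text", "voice_input", "full_voice"];

function isVoiceMode(value: unknown): value is VoiceMode {
  return typeof value === "string" && VOICE_MODES.indexOf(value) !== -1;
}

function parseVoiceSettings(raw: string): VoiceSettings {
  const parsed: unknown = JSON.parse(raw);
  // Anything that is not a JSON object is treated as "no settings present".
  const obj: Record<string, unknown> =
    typeof parsed === "object" && parsed !== null
      ? (parsed as Record<string, unknown>)
      : {};
  const mode = obj["voiceMode"];
  const token = obj["telegramToken"];
  return {
    // Older files without voiceMode, or with an unknown value, fall back to "text".
    voiceMode: isVoiceMode(mode) ? mode : "text",
    telegramToken: typeof token === "string" ? token : undefined,
  };
}
```

Because unknown keys are ignored and missing keys get defaults, a settings file written before v1.6 would parse without changes under this approach.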
|
||||
|
||||
Plans:
|
||||
- [x] 30-01-PLAN.md — Hardware service, nexus-settings service, model catalog extension, routes, and tests
|
||||
- [x] 30-02-PLAN.md — ModeSelector, HardwareSummaryStep, useHardwareInfo hook, multi-step wizard wiring
|
||||
|
||||
### Phase 31: Puter.js Zero-Config Cloud
|
||||
**Goal**: Users without Ollama installed can reach working AI in one click via Puter.js — all calls server-proxied, tokens server-stored, cost tracked; Google OAuth and subscription auto-detection round out the provider tier
|
||||
**Depends on**: Phase 30
|
||||
**Requirements**: CLOUD-01, CLOUD-02, CLOUD-03, CLOUD-04, CLOUD-05
|
||||
### Phase 37: Web Chat Voice UI
|
||||
**Goal**: Users can speak to any agent in web chat — recording auto-stops on silence, a live waveform confirms the mic is active, responses play back automatically (toggleable), and voice mode is a first-class setting
|
||||
**Depends on**: Phase 36
|
||||
**Requirements**: WCHAT-01, WCHAT-02, WCHAT-03, WCHAT-04, WCHAT-05, WCHAT-06
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. A user with no Ollama and no API keys clicks "Continue with Puter" in onboarding, completes the Puter auth popup, and immediately gets a working chat response — no API key entry required
|
||||
2. All Puter AI calls flow through `POST /api/puter-proxy/chat` (verifiable in server logs); the Puter auth token is stored server-side via secretService, not in localStorage
|
||||
3. Token cost for Puter responses appears in the cost tracking view, attributed correctly per conversation
|
||||
4. A user with Hermes, Claude Code, or OpenClaw already installed sees those tools pre-filled in the provider configuration step with no manual entry
|
||||
5. A user clicking "Sign in with Google" for Gemini completes PKCE OAuth and gets a Gemini-backed chat response; the UI displays a policy-risk note that Google OAuth may trigger abuse detection
|
||||
**Plans**: 4 plans
|
||||
|
||||
Plans:
|
||||
- [x] 31-01-PLAN.md — Puter proxy service, routes, unit tests, and app.ts wiring
|
||||
- [x] 31-02-PLAN.md — Google OAuth PKCE service, routes, API key storage route
|
||||
- [x] 31-03-PLAN.md — Provider Selection UI step, PuterAuthButton, GoogleOAuthButton, ApiKeyEntryForm, 4-step wizard wiring
|
||||
- [x] 31-04-PLAN.md — Google OAuth claim endpoint, human verification of full onboarding flow
|
||||
1. Clicking the mic button starts recording; the waveform animates to show audio levels; speaking and then pausing for 1.5 seconds auto-submits the recording without pressing any button
|
||||
2. The voice mode toggle has three visible states (text only / voice input / full voice) and persists the selected mode across page refreshes
|
||||
3. An agent response delivered in full voice mode plays back automatically in the chat thread; the auto-play can be turned off in settings and stays off after a page reload
|
||||
4. The chat message for a voice interaction shows a voice badge and an expandable section revealing the full markdown response with code blocks intact
|
||||
5. Voice recording and VAD work correctly in Chrome and Firefox on the Mac Mini (COOP/COEP headers satisfy SharedArrayBuffer requirements)
|
||||
**Plans**: TBD
|
||||
**UI hint**: yes
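
The auto-submit rule in criterion 1 (stop after 1.5 seconds of silence following speech) can be illustrated with a simple energy-threshold detector. This is a simplified stand-in, not the `@ricky0123/vad-react` API, which uses a model rather than raw levels; `findAutoStopFrame`, `SilenceConfig`, and the threshold value are hypothetical.

```typescript
// Simplified energy-based sketch of "auto-stop after trailing silence".
// Not the @ricky0123/vad-react API; all names and values are illustrative.
interface SilenceConfig {
  frameMs: number;   // duration of one audio frame in milliseconds
  silenceMs: number; // trailing silence required to auto-submit (1500 per criterion 1)
  threshold: number; // RMS level below which a frame counts as silent (assumed)
}

// Root-mean-square level of one frame of samples in the range [-1, 1].
function rms(frame: readonly number[]): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < frame.length; i++) sum += frame[i] * frame[i];
  return Math.sqrt(sum / frame.length);
}

// Returns the index of the frame at which recording should stop, or -1 if the
// silence window has not elapsed yet. Silence before any speech is ignored so
// the recorder does not submit an empty clip.
function findAutoStopFrame(levels: readonly number[], cfg: SilenceConfig): number {
  const framesNeeded = Math.ceil(cfg.silenceMs / cfg.frameMs);
  let heardSpeech = false;
  let silentRun = 0;
  for (let i = 0; i < levels.length; i++) {
    if (levels[i] >= cfg.threshold) {
      heardSpeech = true;
      silentRun = 0;
    } else if (heardSpeech) {
      silentRun++;
      if (silentRun >= framesNeeded) return i;
    }
  }
  return -1;
}
```

In a browser, the per-frame levels would come from the microphone stream; the same levels could also drive the waveform display.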
|
||||
|
||||
### Phase 32: Multi-Step Onboarding Wizard
|
||||
**Goal**: Users move through a complete, skippable onboarding flow that assembles hardware data, provider selection, and voice options into a summary screen — and can jump straight into chat from there
|
||||
**Depends on**: Phase 31
|
||||
**Requirements**: ONBD-04, ONBD-05, ONBD-06
|
||||
### Phase 38: Telegram Bridge
|
||||
**Goal**: The user can message any Nexus agent from their phone via Telegram — text and voice notes both work, agent identity is visible on every reply, and the bot is set up through guided onboarding with no manual token entry in config files
|
||||
**Depends on**: Phase 36
|
||||
**Requirements**: TGRAM-01, TGRAM-02, TGRAM-03, TGRAM-04, TGRAM-05, TGRAM-06, ONBRD-03
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. A user can click "Skip" on every onboarding step (hardware, provider, voice) and reach the summary screen; the resulting workspace has at least one working agent with a valid provider
|
||||
2. The summary screen shows the configured providers and agent-model pairings for the selected mode; no corporate language ("company", "CEO", "mission") appears anywhere in the flow
|
||||
3. From the summary screen, one click navigates directly to the Personal Assistant chat or the project dashboard (depending on chosen mode) with no additional prompts
|
||||
**Plans**: 1 plan
|
||||
1. Sending a text message to the Nexus Telegram bot from a phone produces an agent reply prefixed with the agent name (e.g. `[PM]: response`) within 10 seconds
|
||||
2. Sending a voice note to the Telegram bot produces a transcription confirmation message followed by the agent's text reply — the bot does not silently fail or miss the update
|
||||
3. Requesting a voice reply from the bot returns an OGG voice note that plays back correctly in the Telegram mobile app
|
||||
4. The Telegram bridge runs via long polling with no public HTTPS endpoint required — verified by running on the Mac Mini behind NAT
|
||||
5. The entire `telegram.ts` service file is under 500 lines
|
||||
6. The onboarding wizard includes a BotFather setup step that walks through creating a bot token and saves it to `nexus-settings.json` without manual file editing
|
||||
**Plans**: TBD
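
Two pieces of the bridge described above are small enough to sketch without the Telegram client itself: the agent identity prefix from criterion 1 and a lightweight chat-to-session lookup. The names `formatAgentReply` and `ChatSessionStore`, the numeric chat id, and the `"Agent"` fallback label are assumptions; no grammY calls are shown.

```typescript
// Hypothetical sketch: reply prefixing and an in-memory chat-to-session map.
// Nothing here is grammY API; it only models the data handling around the bot.
function formatAgentReply(agentName: string, text: string): string {
  // Fall back to a generic label if the agent name is blank (assumed behaviour).
  const name = agentName.trim() || "Agent";
  return `[${name}]: ${text}`;
}

class ChatSessionStore {
  private readonly sessions = new Map<number, string>();
  private readonly createId: (chatId: number) => string;

  constructor(createId: (chatId: number) => string) {
    this.createId = createId;
  }

  // Returns the existing session for a chat, creating one on first contact.
  getOrCreate(chatId: number): string {
    const existing = this.sessions.get(chatId);
    if (existing !== undefined) return existing;
    const created = this.createId(chatId);
    this.sessions.set(chatId, created);
    return created;
  }
}
```

An in-memory map loses its contents on restart, which may be acceptable for a bridge that is intentionally disposable.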
|
||||
|
||||
Plans:
|
||||
- [x] 32-01-PLAN.md — Summary step, skip buttons, chat handoff
|
||||
**UI hint**: yes
|
||||
|
||||
### Phase 33: Persistent Memory + Personal Assistant Mode
|
||||
**Goal**: Users in Personal AI Assistant mode accumulate memory across sessions that shapes future responses — with no risk of credentials leaking into prompts — and can hand off any conversation to a PM agent with context intact
|
||||
**Depends on**: Phase 32
|
||||
**Requirements**: ASST-01, ASST-02, ASST-03, ASST-04
|
||||
### Phase 39: Voice Polish
|
||||
**Goal**: Voice responses begin playing before synthesis is complete (sentence-buffered), a single response can be synthesized in multiple languages simultaneously, and new installs can detect STT/TTS hardware capability during onboarding and enable voice in one step
|
||||
**Depends on**: Phase 37
|
||||
**Requirements**: VPIPE-07, VPIPE-08, ONBRD-01, ONBRD-02
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. A fact stated in one chat session ("I prefer TypeScript") is referenced correctly by the assistant in a new session started after closing and reopening the browser
|
||||
2. Pasting an API key or token into chat and then starting a new session results in the assistant having no knowledge of that credential — the sanitization blocklist prevented it from being stored
|
||||
3. A user clicks "Turn this into a project" in an assistant conversation; a PM agent is created with a system message containing the conversation summary and they land on the project dashboard
|
||||
4. A user with mode set to "Both" can switch between Personal Assistant chat and the project dashboard without losing context or cross-contaminating assistant memory with project agent messages
|
||||
**Plans**: 3 plans
|
||||
|
||||
Plans:
|
||||
- [x] 33-01-PLAN.md — Memory sanitizer, assistant memory service, REST routes, and unit tests
|
||||
- [x] 33-02-PLAN.md — PersonalAssistantPage, useNexusMode hook, sidebar navigation, route wiring
|
||||
- [x] 33-03-PLAN.md — Real AI streaming with memory injection, assistant-to-PM handoff route and UI
|
||||
**UI hint**: yes
|
||||
|
||||
### Phase 34: Voice
|
||||
**Goal**: Users can speak to the assistant (Whisper STT) and hear responses read aloud (Piper TTS) — Piper pre-warms visibly so the first synthesis call does not appear broken, and voice is offered during onboarding based on hardware capability
|
||||
**Depends on**: Phase 32
|
||||
**Requirements**: VOICE-01, VOICE-02, VOICE-03
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. On a CPU-only machine (no GPU), enabling Piper TTS in the assistant produces audible speech output within a reasonable time after the first synthesis (not a silent hang)
|
||||
2. When Piper's WASM voice model is downloading for the first time, a visible progress indicator is shown before the TTS toggle is enabled; the download completes and TTS works without a page reload
|
||||
3. The onboarding voice step offers Whisper STT and Piper TTS toggles only when the hardware detection step has confirmed sufficient capability; on hardware below the threshold, the step is skipped or shows a capability warning
|
||||
**Plans**: 2 plans
|
||||
|
||||
Plans:
|
||||
- [x] 34-01-PLAN.md — Fix /transcribe route registration, Piper TTS hook + TtsButton, voiceEnabled in nexus-settings
|
||||
- [x] 34-02-PLAN.md — VoiceStep onboarding component, wizard step insertion, PersonalAssistant voice wiring
|
||||
**UI hint**: yes
|
||||
|
||||
### Phase 35: npx buildthis CLI
|
||||
**Goal**: A developer can run `npx buildthis` on a fresh machine and either open an already-running Nexus or be guided through install — with the same hardware detection and provider tiering as the web onboarding
|
||||
**Depends on**: Phase 30 (hardware detection service must exist)
|
||||
**Requirements**: CLI-01, CLI-02
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. Running `npx buildthis` on a machine where Nexus is already running opens the Nexus UI in the default browser; running it on a machine with no Nexus guides the user through installation steps
|
||||
2. The CLI bootstrapper detects the same hardware tier (GPU / Apple Silicon / CPU-only) as the web onboarding and presents the matching provider tier recommendations in the terminal prompt
|
||||
**Plans**: 1 plan
|
||||
|
||||
Plans:
|
||||
- [x] 35-01-PLAN.md — Package scaffold, hardware detection, two-path bootstrap (probe running vs guide install), provider selection, tests
|
||||
1. For a multi-sentence agent response, the first sentence begins playing in the browser before the second sentence has finished synthesizing — the gap between text completion and first audio is under 1 second
|
||||
2. A user can request the same agent response as audio in both English and Danish; both OGG files are generated and available for playback without a second agent call
|
||||
3. On a fresh install, the onboarding hardware probe reports whether Whisper STT and Piper TTS are runnable on the detected hardware tier
|
||||
4. The onboarding voice step activates (showing enable/skip options) only when the hardware probe confirms sufficient capability; on hardware below threshold it shows a capability note and skips to the next step
|
||||
**Plans**: TBD
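
Sentence-buffered streaming, as in criterion 1 above, depends on cutting the incoming text stream at sentence boundaries so each sentence can be sent to synthesis as soon as it is complete. A minimal sketch follows, assuming a boundary is a run of `.`, `!`, or `?` followed by whitespace; `SentenceBuffer` is a hypothetical name and the boundary rule is deliberately naive (abbreviations such as "e.g. " would split early).

```typescript
// Hypothetical sketch of a sentence buffer for streamed agent text.
// Boundary rule (terminator followed by whitespace) is an assumption.
class SentenceBuffer {
  private pending = "";

  // Feed one streamed chunk; returns every sentence completed by this chunk.
  push(chunk: string): string[] {
    this.pending += chunk;
    const sentences: string[] = [];
    const boundary = /[.!?]+\s+/g;
    let start = 0;
    let match: RegExpExecArray | null = boundary.exec(this.pending);
    while (match !== null) {
      const end = match.index + match[0].length;
      sentences.push(this.pending.slice(start, end).trim());
      start = end;
      match = boundary.exec(this.pending);
    }
    // Keep the unfinished tail for the next chunk.
    this.pending = this.pending.slice(start);
    return sentences;
  }

  // Call once the stream ends to release any trailing text.
  flush(): string[] {
    const rest = this.pending.trim();
    this.pending = "";
    return rest.length > 0 ? [rest] : [];
  }
}
```

Each returned sentence could be handed to the synthesis step immediately, so playback of the first sentence can begin while later ones are still arriving.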
|
||||
|
||||
---
|
||||
|
||||
## Coverage Validation
|
||||
|
||||
All 21 v1.5 requirements are mapped to exactly one phase. No orphans.
|
||||
All 23 v1.6 requirements are mapped to exactly one phase. No orphans.
|
||||
|
||||
| Requirement | Phase |
|-------------|-------|
| ONBD-01 | 30 |
| ONBD-02 | 30 |
| ONBD-03 | 30 |
| ONBD-07 | 30 |
| CLOUD-01 | 31 |
| CLOUD-02 | 31 |
| CLOUD-03 | 31 |
| CLOUD-04 | 31 |
| CLOUD-05 | 31 |
| ONBD-04 | 32 |
| ONBD-05 | 32 |
| ONBD-06 | 32 |
| ASST-01 | 33 |
| ASST-02 | 33 |
| ASST-03 | 33 |
| ASST-04 | 33 |
| VOICE-01 | 34 |
| VOICE-02 | 34 |
| VOICE-03 | 34 |
| CLI-01 | 35 |
| CLI-02 | 35 |
| VPIPE-01 | 36 |
| VPIPE-02 | 36 |
| VPIPE-03 | 36 |
| VPIPE-04 | 36 |
| VPIPE-05 | 36 |
| VPIPE-06 | 36 |
| WCHAT-01 | 37 |
| WCHAT-02 | 37 |
| WCHAT-03 | 37 |
| WCHAT-04 | 37 |
| WCHAT-05 | 37 |
| WCHAT-06 | 37 |
| TGRAM-01 | 38 |
| TGRAM-02 | 38 |
| TGRAM-03 | 38 |
| TGRAM-04 | 38 |
| TGRAM-05 | 38 |
| TGRAM-06 | 38 |
| ONBRD-03 | 38 |
| VPIPE-07 | 39 |
| VPIPE-08 | 39 |
| ONBRD-01 | 39 |
| ONBRD-02 | 39 |
|
||||
|
||||
---
|
||||
|
||||
|
|
@ -237,9 +210,13 @@ All 21 v1.5 requirements are mapped to exactly one phase. No orphans.
|
|||
| 27. Hermes Adapter | v1.4 | 1/1 | Complete | 2026-04-02 |
|
||||
| 28. Ollama Integration & Agent Surface | v1.4 | 3/3 | Complete | 2026-04-02 |
|
||||
| 29. Default Provider & End-to-End | v1.4 | 2/2 | Complete | 2026-04-02 |
|
||||
| 30. Hardware Detection + Mode Selection | v1.5 | 2/2 | Complete | 2026-04-03 |
|
||||
| 31. Puter.js Zero-Config Cloud | v1.5 | 4/4 | Complete | 2026-04-03 |
|
||||
| 32. Multi-Step Onboarding Wizard | v1.5 | 1/1 | Complete | 2026-04-03 |
|
||||
| 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete | 2026-04-03 |
|
||||
| 34. Voice | v1.5 | 2/2 | Complete | 2026-04-03 |
|
||||
| 35. npx buildthis CLI | v1.5 | 1/1 | Complete | 2026-04-03 |
|
||||
| 30. Hardware Detection + Mode Selection | v1.5 | 2/2 | Complete | 2026-04-03 |
|
||||
| 31. Puter.js Zero-Config Cloud | v1.5 | 4/4 | Complete | 2026-04-03 |
|
||||
| 32. Multi-Step Onboarding Wizard | v1.5 | 1/1 | Complete | 2026-04-03 |
|
||||
| 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete | 2026-04-03 |
|
||||
| 34. Voice | v1.5 | 2/2 | Complete | 2026-04-03 |
|
||||
| 35. npx buildthis CLI | v1.5 | 1/1 | Complete | 2026-04-03 |
|
||||
| 36. Voice Pipeline Foundation | v1.6 | 0/TBD | Not started | - |
|
||||
| 37. Web Chat Voice UI | v1.6 | 0/TBD | Not started | - |
|
||||
| 38. Telegram Bridge | v1.6 | 0/TBD | Not started | - |
|
||||
| 39. Voice Polish | v1.6 | 0/TBD | Not started | - |
|
||||
|
|
|
|||
|
|
@ -7,7 +7,7 @@ stopped_at: null
|
|||
last_updated: "2026-04-03"
|
||||
last_activity: 2026-04-03
|
||||
progress:
|
||||
total_phases: 0
|
||||
total_phases: 4
|
||||
completed_phases: 0
|
||||
total_plans: 0
|
||||
completed_plans: 0
|
||||
|
|
@ -21,14 +21,16 @@ progress:
|
|||
See: .planning/PROJECT.md (updated 2026-04-03)
|
||||
|
||||
**Core value:** A fresh onboard asks for ONE thing (root directory), auto-creates PM + Engineer agents, and drops you in the dashboard.
|
||||
**Current focus:** Defining requirements for v1.6
|
||||
**Current focus:** Phase 36 — Voice Pipeline Foundation (ready to plan)
|
||||
|
||||
## Current Position
|
||||
|
||||
Phase: Not started (defining requirements)
|
||||
Plan: —
|
||||
Status: Defining requirements
|
||||
Last activity: 2026-04-03 — Milestone v1.6 started
|
||||
Phase: 36 of 39 (Voice Pipeline Foundation)
|
||||
Plan: — (not started)
|
||||
Status: Ready to plan
|
||||
Last activity: 2026-04-03 — v1.6 roadmap created (4 phases, 23 requirements mapped)
|
||||
|
||||
Progress: [░░░░░░░░░░] 0%
|
||||
|
||||
## Performance Metrics
|
||||
|
||||
|
|
@ -45,11 +47,13 @@ Last activity: 2026-04-03 — Milestone v1.6 started
|
|||
Decisions are logged in PROJECT.md Key Decisions table.
|
||||
Key constraints for v1.6:
|
||||
|
||||
- Voice pipeline is transport-agnostic — no Telegram-specific code in core voice components
|
||||
- Telegram bridge is intentionally disposable (<500 lines) — will be replaced by v2.2 Command Center
|
||||
- Dual output always: voice response + full technical details in text
|
||||
- Voice mode is a per-message flag, not a per-agent setting
|
||||
- v1.5 already has VoiceRecordButton, TtsButton, usePiperTts hooks in place — build on these
|
||||
- voicePipelineService is the keystone — Phase 37 and Phase 38 both depend on it; build Phase 36 first
|
||||
- Telegram bridge uses long polling (grammY `bot.start()`) — no public HTTPS required on Mac Mini
|
||||
- Audio transcoding via ffmpeg-static ^5.2.0, not fluent-ffmpeg (archived May 2025)
|
||||
- Voice mode flag must survive every pipeline layer: client → Express → message persistence → agent codec
|
||||
- COOP/COEP headers required for @ricky0123/vad-react SharedArrayBuffer (add to Express static middleware)
|
||||
- Phase 37 and Phase 38 are independent once Phase 36 ships; sequential ordering for single-developer delivery
|
||||
- Telegram bridge must stay under 500 lines (TGRAM-06 is a hard constraint)
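
The transcoding constraint above (any input container to 16 kHz mono WAV for Whisper) comes down to one fixed set of ffmpeg arguments. The sketch below only builds the argument list; `buildWhisperTranscodeArgs` is a hypothetical helper, and the choice of 16-bit PCM (`pcm_s16le`) is an assumption about what the Whisper binding expects.

```typescript
// Hypothetical helper: ffmpeg arguments for converting a WebM/Opus or OGG/Opus
// recording into 16 kHz mono WAV. The 16-bit PCM codec is an assumption.
function buildWhisperTranscodeArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-y",               // overwrite the output file if it already exists
    "-i", inputPath,    // input file; ffmpeg detects the container itself
    "-vn",              // ignore any video stream
    "-ac", "1",         // downmix to a single (mono) channel
    "-ar", "16000",     // resample to 16 kHz
    "-c:a", "pcm_s16le", // encode as 16-bit little-endian PCM
    "-f", "wav",        // force the WAV container
    outputPath,
  ];
}
```

The resulting array could be passed to `child_process.spawn` together with the binary path exported by `ffmpeg-static`; using the same arguments for browser and Telegram input keeps both paths on one code route.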
|
||||
|
||||
### Pending Todos
|
||||
|
||||
|
|
@ -57,10 +61,12 @@ None yet.
|
|||
|
||||
### Blockers/Concerns
|
||||
|
||||
- [v1.5 carryover] smart-whisper Apple Silicon acceleration claim unverified on Mac Mini M4 — fall back to `tiny.en` if `base.en` acceleration not confirmed on device
|
||||
- [v1.5 carryover] smart-whisper Apple Silicon acceleration unverified on Mac Mini M4 — fall back to `tiny.en` if `base.en` acceleration not confirmed
|
||||
- [v1.6] grammY session management approach not yet chosen: lightweight `Map<chatId, sessionId>` vs. grammY conversation plugin — decide at Phase 38 planning
|
||||
- [v1.6] Dual output prompt reliability on 7B models is ~90%, so the Approach B fallback (post-process markdown strip) must be implemented as a mandatory safety net
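
A post-processing markdown strip of the kind named in the last concern could look roughly like the following. It is a regex-based sketch under stated assumptions, not a full markdown parser: `stripMarkdownForVoice` and `toDualOutput` are hypothetical names, underscore emphasis is left untouched to avoid mangling identifiers, and backticks are written as `\x60` inside the patterns purely to keep the example readable.

```typescript
// Hypothetical fallback: derive voice-friendly prose from a markdown reply.
// Regex-based and intentionally rough; \x60 is the backtick character.
function stripMarkdownForVoice(markdown: string): string {
  return markdown
    .replace(/\x60{3}[\s\S]*?\x60{3}/g, " ")     // drop fenced code blocks entirely
    .replace(/\x60([^\x60]*)\x60/g, "$1")        // inline code keeps its text
    .replace(/!\[([^\]]*)\]\([^)]*\)/g, "$1")    // images become their alt text
    .replace(/\[([^\]]*)\]\([^)]*\)/g, "$1")     // links become their label
    .replace(/^\s{0,3}#{1,6}\s+/gm, "")          // heading markers
    .replace(/^\s*[-*+]\s+/gm, "")               // bullet markers
    .replace(/(\*\*|\*)(.*?)\1/g, "$2")          // asterisk emphasis
    .replace(/\s+/g, " ")                        // collapse whitespace
    .trim();
}

interface DualOutput {
  voiceText: string; // prose handed to TTS
  fullText: string;  // untouched markdown shown in the chat thread
}

function toDualOutput(markdown: string): DualOutput {
  return { voiceText: stripMarkdownForVoice(markdown), fullText: markdown };
}
```

Running this on every reply, whether or not the model followed the dual-output prompt, would give the spoken path a consistent floor while the full markdown stays available in text.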
|
||||
|
||||
## Session Continuity
|
||||
|
||||
Last session: 2026-04-03
|
||||
Stopped at: Milestone v1.6 initialized
|
||||
Stopped at: Roadmap created — 4 phases defined, 23/23 requirements mapped
|
||||
Resume file: None
|
||||
|
|
|
|||