docs: create milestone v1.7 roadmap (6 phases)
parent
cc569b4cd6
commit
87272b79fc
3 changed files with 247 additions and 83 deletions
|
|
@@ -128,13 +128,66 @@ Which phases cover which requirements. Updated during roadmap creation.
|
|||
|
||||
| Requirement | Phase | Status |
|
||||
|-------------|-------|--------|
|
||||
| — | — | — |
|
||||
| INFRA-01 | Phase 40 | Pending |
|
||||
| INFRA-02 | Phase 40 | Pending |
|
||||
| INFRA-03 | Phase 40 | Pending |
|
||||
| INFRA-04 | Phase 40 | Pending |
|
||||
| DIAG-01 | Phase 41 | Pending |
|
||||
| DIAG-02 | Phase 41 | Pending |
|
||||
| DIAG-03 | Phase 41 | Pending |
|
||||
| DIAG-04 | Phase 41 | Pending |
|
||||
| DIAG-05 | Phase 41 | Pending |
|
||||
| THEME-01 | Phase 41 | Pending |
|
||||
| THEME-02 | Phase 41 | Pending |
|
||||
| THEME-03 | Phase 41 | Pending |
|
||||
| THEME-04 | Phase 41 | Pending |
|
||||
| THEME-05 | Phase 41 | Pending |
|
||||
| THEME-06 | Phase 41 | Pending |
|
||||
| THEME-07 | Phase 41 | Pending |
|
||||
| ICON-01 | Phase 41 | Pending |
|
||||
| ICON-02 | Phase 41 | Pending |
|
||||
| ICON-03 | Phase 41 | Pending |
|
||||
| WALL-01 | Phase 42 | Pending |
|
||||
| WALL-02 | Phase 42 | Pending |
|
||||
| WALL-03 | Phase 42 | Pending |
|
||||
| WALL-04 | Phase 42 | Pending |
|
||||
| SOCIAL-01 | Phase 42 | Pending |
|
||||
| SOCIAL-02 | Phase 42 | Pending |
|
||||
| SOCIAL-03 | Phase 42 | Pending |
|
||||
| CONV-01 | Phase 42 | Pending |
|
||||
| CONV-02 | Phase 42 | Pending |
|
||||
| CONV-03 | Phase 42 | Pending |
|
||||
| CONV-04 | Phase 42 | Pending |
|
||||
| CONV-05 | Phase 42 | Pending |
|
||||
| CONV-06 | Phase 42 | Pending |
|
||||
| CONV-07 | Phase 42 | Pending |
|
||||
| CONV-08 | Phase 42 | Pending |
|
||||
| CONV-09 | Phase 42 | Pending |
|
||||
| VOICE-01 | Phase 42 | Pending |
|
||||
| VOICE-02 | Phase 42 | Pending |
|
||||
| VOICE-03 | Phase 42 | Pending |
|
||||
| DOC-01 | Phase 43 | Pending |
|
||||
| DOC-02 | Phase 43 | Pending |
|
||||
| DOC-03 | Phase 43 | Pending |
|
||||
| BRAND-01 | Phase 43 | Pending |
|
||||
| BRAND-02 | Phase 43 | Pending |
|
||||
| BRAND-03 | Phase 43 | Pending |
|
||||
| BRAND-04 | Phase 43 | Pending |
|
||||
| BRAND-05 | Phase 43 | Pending |
|
||||
| BRAND-06 | Phase 43 | Pending |
|
||||
| PRES-01 | Phase 44 | Pending |
|
||||
| PRES-02 | Phase 44 | Pending |
|
||||
| PRES-03 | Phase 44 | Pending |
|
||||
| PRES-04 | Phase 44 | Pending |
|
||||
| SKILL-01 | Phase 45 | Pending |
|
||||
| SKILL-02 | Phase 45 | Pending |
|
||||
| SKILL-03 | Phase 45 | Pending |
|
||||
|
||||
**Coverage:**
|
||||
- v1.7 requirements: 52 total
|
||||
- Mapped to phases: 0
|
||||
- Unmapped: 52
|
||||
- Mapped to phases: 52
|
||||
- Unmapped: 0
|
||||
|
||||
---
|
||||
*Requirements defined: 2026-04-04*
|
||||
*Last updated: 2026-04-04 after initial definition*
|
||||
*Last updated: 2026-04-04 after roadmap creation (v1.7)*
|
||||
|
|
|
|||
|
|
@@ -6,7 +6,8 @@
|
|||
- ✅ **v1.3 Chat & PWA** - Phases 21-26 (shipped 2026-04-02)
|
||||
- ✅ **v1.4 Hermes Default Provider** - Phases 27-29 (shipped 2026-04-02)
|
||||
- ✅ **v1.5 Smart Onboarding + Personal AI Assistant** - Phases 30-35 (shipped 2026-04-03)
|
||||
- 🚧 **v1.6 Voice Pipeline + Minimal Message Bridge** - Phases 36-39 (in progress)
|
||||
- ✅ **v1.6 Voice Pipeline + Minimal Message Bridge** - Phases 36-39 (shipped 2026-04-04)
|
||||
- 🚧 **v1.7 Content Generation** - Phases 40-45 (in progress)
|
||||
|
||||
---
|
||||
|
||||
|
|
@@ -98,20 +99,8 @@ Plans:
|
|||
|
||||
</details>
|
||||
|
||||
---
|
||||
|
||||
### 🚧 v1.6 Voice Pipeline + Minimal Message Bridge (In Progress)
|
||||
|
||||
**Milestone Goal:** Transport-agnostic voice pipeline (Whisper STT + Piper TTS) integrated into web chat, plus a minimal Telegram bridge for phone access. Voice infrastructure designed to survive v2.2 Command Center migration.
|
||||
|
||||
## Phases
|
||||
|
||||
- [x] **Phase 36: Voice Pipeline Foundation** — Transport-agnostic VoicePipelineService (transcribe, synthesize, formatForVoice), voice.ts route, ffmpeg audio transcoding, voiceMode flag, dual output pattern (completed 2026-04-04)
|
||||
- [x] **Phase 37: Web Chat Voice UI** — VAD silence detection, waveform visualization, voice mode toggle, inline audio player, auto-play toggle, COOP/COEP headers (completed 2026-04-04)
|
||||
- [x] **Phase 38: Telegram Bridge** — grammY long polling relay, text + voice note bidirectional relay, agent identity prefix, BotFather onboarding setup (completed 2026-04-04)
|
||||
- [x] **Phase 39: Voice Polish** — Sentence-buffered TTS streaming, multi-language TTS output, onboarding STT/TTS hardware detection step (completed 2026-04-04)
|
||||
|
||||
## Phase Details
|
||||
<details>
|
||||
<summary>✅ v1.6 Voice Pipeline + Minimal Message Bridge (Phases 36-39) - SHIPPED 2026-04-04</summary>
|
||||
|
||||
### Phase 36: Voice Pipeline Foundation
|
||||
**Goal**: The transport-agnostic voice pipeline is live and callable from any consumer — web chat, Telegram, or future integrations — with correct audio transcoding, voice mode flag propagation, and dual output formatting baked in from the start
|
||||
|
|
@@ -171,37 +160,158 @@ Plans:
|
|||
- [x] 39-01-PLAN.md — Sentence-buffered TTS streaming + multi-language synthesis
|
||||
- [ ] 39-02-PLAN.md — Onboarding voice hardware capability probe
|
||||
|
||||
</details>
|
||||
|
||||
---
|
||||
|
||||
### 🚧 v1.7 Content Generation (In Progress)
|
||||
|
||||
**Milestone Goal:** Agents produce real deliverables — diagrams, themes, PDFs, wallpapers, social assets, icons, and video — entirely on-device. Every content type is an installable skill. Long-running renders are async with SSE progress from the first request.
|
||||
|
||||
## Phases
|
||||
|
||||
- [ ] **Phase 40: Job Infrastructure** — content_jobs table, async render lifecycle, SSE progress events, namespaced storage without size limit (INFRA-01..04)
|
||||
- [ ] **Phase 41: Diagrams, Icons & Theme Engine** — Mermaid diagrams, SVG icon generation, OKLCH theme palette with WCAG AA and live preview (DIAG-01..05, ICON-01..03, THEME-01..07)
|
||||
- [ ] **Phase 42: Wallpapers, Social, Format Conversion & Voice** — Satori image pipeline, social content, format conversion registry with AI fallback, Whisper web chat mic (WALL-01..04, SOCIAL-01..03, CONV-01..09, VOICE-01..03)
|
||||
- [ ] **Phase 43: Documents & Branding** — Playwright PDF reports and invoices, full brand identity kit with zip export (DOC-01..03, BRAND-01..06)
|
||||
- [ ] **Phase 44: Video & Presentations** — Remotion workspace package, pitch decks and demo videos, SSE render progress (PRES-01..04)
|
||||
- [ ] **Phase 45: Content as Skills** — Markdown skill files for all content types, Creative skill group on generalist agent (SKILL-01..03)
|
||||
|
||||
## Phase Details
|
||||
|
||||
### Phase 40: Job Infrastructure
|
||||
**Goal**: Every content generation request returns a job ID immediately, progresses through a tracked lifecycle, and stores its output in namespaced storage — so nothing blocks and nothing is orphaned
|
||||
**Depends on**: Phase 39 (v1.6 shipped)
|
||||
**Requirements**: INFRA-01, INFRA-02, INFRA-03, INFRA-04
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. Submitting a content generation request returns HTTP 202 with a job ID within 200ms, regardless of how long the render takes
|
||||
2. A connected browser receives SSE events as a job progresses through queued → generating → ready (or error), with no polling required
|
||||
3. A generated video file larger than 10MB can be stored and retrieved without a size-limit error — the generated/ storage namespace bypasses the upload route limit
|
||||
4. Every generated asset in the database has a sourceTaskId linking it to the originating conversation task, visible via the asset list API
|
||||
**Plans**: TBD
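
The job lifecycle in criteria 1 and 2 can be sketched as a small state machine plus an SSE frame formatter. This is only an illustration under stated assumptions: `JobStatus`, `ContentJob`, `advance`, and `toSseFrame` are hypothetical names, the `job` event name is made up, and the real `content_jobs` schema is not defined in this roadmap.

```typescript
// Hypothetical types and helpers; names are illustrative, not from the codebase.
type JobStatus = "queued" | "generating" | "ready" | "error";

interface ContentJob {
  id: string;
  status: JobStatus;
  sourceTaskId: string; // required on every asset (success criterion 4)
}

// Legal transitions for queued -> generating -> ready (or error).
const ALLOWED: Record<JobStatus, readonly JobStatus[]> = {
  queued: ["generating", "error"],
  generating: ["ready", "error"],
  ready: [],
  error: [],
};

// Returns a new job in the next state, or throws on an illegal transition.
function advance(job: ContentJob, next: JobStatus): ContentJob {
  if (ALLOWED[job.status].indexOf(next) === -1) {
    throw new Error(`illegal transition ${job.status} -> ${next}`);
  }
  return { ...job, status: next };
}

// Formats one Server-Sent Events frame: an event name, a data line, a blank line.
function toSseFrame(job: ContentJob): string {
  const payload = JSON.stringify({ id: job.id, status: job.status });
  return `event: job\ndata: ${payload}\n\n`;
}
```

A request handler would create the job in `queued`, respond with 202 and the job ID, and then write one frame per transition to any connected SSE stream.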
|
||||
|
||||
### Phase 41: Diagrams, Icons & Theme Engine
|
||||
**Goal**: Users can generate diagrams from natural language, produce SVG icon sets from descriptions, and create a complete OKLCH color theme from a single seed color — all without binary dependencies beyond what is already installed
|
||||
**Depends on**: Phase 40
|
||||
**Requirements**: DIAG-01, DIAG-02, DIAG-03, DIAG-04, DIAG-05, ICON-01, ICON-02, ICON-03, THEME-01, THEME-02, THEME-03, THEME-04, THEME-05, THEME-06, THEME-07
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. Describing an architecture in chat produces a rendered Mermaid diagram (SVG and PNG) attached to the conversation, with the editable Mermaid source visible in a collapsible panel
|
||||
2. Mermaid rendering uses the strict security level: any `click` directive or `%%{init}%%` override is stripped from the diagram source before render, and the SVG output passes through DOMPurify before reaching the DOM
|
||||
3. Requesting an icon set from a description returns a cohesive set of SVG icons downloadable in SVG and PNG formats at multiple sizes
|
||||
4. Picking a seed color produces a full palette (background, surface, overlay, text, accents) in OKLCH with separate dark and light variants, all passing WCAG AA contrast checks
|
||||
5. The generated theme can be previewed live in the Nexus UI via CSS custom property injection and applied permanently in one click; export works for CSS variables, Tailwind config, VS Code theme, and JSON
|
||||
**Plans**: TBD
|
||||
**UI hint**: yes
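
The WCAG AA check in criterion 4 comes down to a contrast ratio between two colors. The sketch below assumes the OKLCH palette entries have already been converted to 8-bit sRGB by whatever color library the phase ends up using; `relativeLuminance`, `contrastRatio`, and `passesAA` are hypothetical helper names, and the thresholds are the WCAG 2.x AA values (4.5:1 for normal text, 3:1 for large text).

```typescript
// Hypothetical helpers; input colors are assumed to be 8-bit sRGB triples.
type Rgb = [number, number, number];

// Converts one 8-bit sRGB channel to linear light.
function channelToLinear(c8: number): number {
  const c = c8 / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance as defined by WCAG 2.x.
function relativeLuminance(rgb: Rgb): number {
  return (
    0.2126 * channelToLinear(rgb[0]) +
    0.7152 * channelToLinear(rgb[1]) +
    0.0722 * channelToLinear(rgb[2])
  );
}

// Contrast ratio (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// AA requires 4.5:1 for normal text and 3:1 for large text.
function passesAA(fg: Rgb, bg: Rgb, largeText: boolean = false): boolean {
  return contrastRatio(fg, bg) >= (largeText ? 3 : 4.5);
}
```

A theme generator could run `passesAA` over every text/background pair in both the dark and light variants and adjust lightness until all pairs pass.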
|
||||
|
||||
### Phase 42: Wallpapers, Social, Format Conversion & Voice
|
||||
**Goal**: Users can generate platform-ready images (wallpapers, OG images, social banners) via the Satori pipeline, convert between any file format pair, and record voice directly in web chat via the Whisper mic button
|
||||
**Depends on**: Phase 40
|
||||
**Requirements**: WALL-01, WALL-02, WALL-03, WALL-04, SOCIAL-01, SOCIAL-02, SOCIAL-03, CONV-01, CONV-02, CONV-03, CONV-04, CONV-05, CONV-06, CONV-07, CONV-08, CONV-09, VOICE-01, VOICE-02, VOICE-03
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. Requesting a desktop wallpaper returns a 2560×1440 PNG; requesting an Instagram banner returns a correctly dimensioned image. Platform dimensions are named constants, not magic numbers
|
||||
2. The format conversion UI allows drag-drop of a source file, selection of a target format, and download of the converted file; direct conversion pairs (image, audio/video, document, data) use native tools; any unsupported pair falls through to AI-bridged conversion rather than showing as unavailable
|
||||
3. Navigating to `/convert/png/svg` deep-links directly to the PNG→SVG conversion flow with source and target pre-selected
|
||||
4. An uploaded file is validated against its magic bytes before processing — a JPEG renamed to `.png` is rejected with a clear error, not silently misprocessed
|
||||
5. Clicking the mic button in web chat records audio, transcribes it via local Whisper, and populates the chat input — works offline with the locally cached model
|
||||
**Plans**: TBD
|
||||
**UI hint**: yes
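
Criterion 4 (magic-byte validation) might look roughly like the sketch below. It covers only PNG and JPEG signatures for illustration; `sniffMime`, `validateUpload`, and the lookup tables are hypothetical names, and a real implementation would need a much larger signature table.

```typescript
// Hypothetical helpers; only two formats are covered for illustration.
type Validation = { ok: true; mime: string } | { ok: false; error: string };

// Leading bytes that identify each format.
const SIGNATURES: ReadonlyArray<{ mime: string; bytes: number[] }> = [
  { mime: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { mime: "image/jpeg", bytes: [0xff, 0xd8, 0xff] },
];

const EXTENSION_TO_MIME: Record<string, string> = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
};

// Returns the MIME type implied by the file's leading bytes, or null if unknown.
function sniffMime(data: Uint8Array): string | null {
  for (const sig of SIGNATURES) {
    if (data.length >= sig.bytes.length && sig.bytes.every((b, i) => data[i] === b)) {
      return sig.mime;
    }
  }
  return null;
}

// Rejects files whose extension disagrees with their content.
function validateUpload(filename: string, data: Uint8Array): Validation {
  const dot = filename.lastIndexOf(".");
  const ext = dot === -1 ? "" : filename.slice(dot + 1).toLowerCase();
  const claimed: string | undefined = EXTENSION_TO_MIME[ext];
  const actual = sniffMime(data);
  if (!claimed || !actual) {
    return { ok: false, error: "unsupported or unrecognized file type" };
  }
  if (claimed !== actual) {
    return { ok: false, error: `extension .${ext} claims ${claimed} but content is ${actual}` };
  }
  return { ok: true, mime: actual };
}
```

With this shape, a JPEG renamed to `photo.png` fails with an error that names both the claimed and the detected type instead of being passed on to a converter.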
|
||||
|
||||
### Phase 43: Documents & Branding
|
||||
**Goal**: Users can generate polished PDF reports and invoices via Playwright, and create a complete brand identity (logo, avatars, social profiles, letterhead, guidelines PDF, zip package) from a single conversation
|
||||
**Depends on**: Phase 41
|
||||
**Requirements**: DOC-01, DOC-02, DOC-03, BRAND-01, BRAND-02, BRAND-03, BRAND-04, BRAND-05, BRAND-06
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. Generating a PDF report from a conversation produces a downloadable PDF with correct layout; generating an invoice from a template produces a filled invoice PDF with correct line items
|
||||
2. Generating a one-pager or API reference document produces a styled PDF with navigable headings
|
||||
3. Starting a brand identity conversation produces a logo mark (SVG), avatar at multiple sizes, platform-specific social images, an email signature, and a brand guidelines PDF — all in a single brand kit
|
||||
4. The complete brand kit can be downloaded as a single zip file with assets organized by type
|
||||
**Plans**: TBD
|
||||
**UI hint**: yes
|
||||
|
||||
### Phase 44: Video & Presentations
|
||||
**Goal**: Agents can produce pitch deck presentations and demo videos rendered by Remotion from a conversation, with SSE progress updates throughout the render — which may take several minutes on the M4
|
||||
**Depends on**: Phase 40
|
||||
**Requirements**: PRES-01, PRES-02, PRES-03, PRES-04
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. Requesting a pitch deck from a conversation description produces a Remotion-rendered interactive web presentation or MP4; the render runs in a separate workspace package and does not block the main server process
|
||||
2. The Remotion bundle is compiled once at server startup and reused for all renders — submitting a second render request does not trigger a second webpack compilation
|
||||
3. A browser connected during a video render receives SSE progress events (percentage complete) throughout the render; the final event delivers the download URL
|
||||
4. Concurrent LLM inference and video rendering do not cause the server to become unresponsive — render concurrency is capped and serialized with LLM workloads
|
||||
**Plans**: TBD
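
Criteria 2 and 4 describe two patterns: doing an expensive setup step once, and running heavy workloads one at a time. The sketch below shows both in plain TypeScript. `once` and `SerialQueue` are hypothetical helpers and nothing here calls Remotion; in practice the factory passed to `once` would wrap the bundling step and the queued tasks would wrap render and inference calls.

```typescript
// Hypothetical helpers; no Remotion or LLM APIs are used here.

// Wraps an async factory so it runs at most once; later calls reuse the same promise.
function once<T>(factory: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = factory();
    }
    return cached;
  };
}

// Runs tasks strictly one after another (concurrency cap of 1).
class SerialQueue {
  private tail: Promise<unknown> = Promise.resolve();

  enqueue<T>(task: () => Promise<T>): Promise<T> {
    // Start this task only after the previous one has settled.
    const result = this.tail.then(() => task());
    // Keep the chain alive even if this task rejects.
    this.tail = result.catch(() => undefined);
    return result;
  }
}
```

Sharing one `SerialQueue` between render jobs and LLM calls is one simple way to keep them from competing for the machine; a cap higher than 1 would need a counting semaphore instead.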
|
||||
|
||||
### Phase 45: Content as Skills
|
||||
**Goal**: Every content type built in Phases 41-44 is accessible to agents as an installable Markdown skill, and the generalist agent ships pre-loaded with the Creative skill group
|
||||
**Depends on**: Phase 44
|
||||
**Requirements**: SKILL-01, SKILL-02, SKILL-03
|
||||
**Success Criteria** (what must be TRUE):
|
||||
1. Each content type (diagram, theme, icon, wallpaper, social post, PDF, brand kit, video) has a corresponding skill file that an agent can load and use to call the correct content job API
|
||||
2. A freshly created generalist agent has the Creative skill group pre-loaded — it can generate diagrams and themes without any manual skill configuration
|
||||
3. A user can add or remove individual content type skills through the Skill Aggregator UI without touching configuration files
|
||||
**Plans**: TBD
|
||||
**UI hint**: yes
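
The "Depends on" fields above form a small dependency graph. As a sanity check, the sketch below groups the phases into waves that could run in parallel once their prerequisites are done. `DEPENDS_ON` copies the edges from the phase details; `buildWaves` is a hypothetical helper, not part of any planning tool.

```typescript
// Edges copied from the "Depends on" fields: phase -> prerequisite phases.
const DEPENDS_ON: Record<number, number[]> = {
  40: [39],
  41: [40],
  42: [40],
  43: [41],
  44: [40],
  45: [44],
};

// Groups phases into waves; every phase in a wave has all prerequisites finished.
function buildWaves(dependsOn: Record<number, number[]>, alreadyDone: number[]): number[][] {
  const remaining = new Set<number>(Object.keys(dependsOn).map(Number));
  const finished = new Set<number>(alreadyDone);
  const waves: number[][] = [];
  while (remaining.size > 0) {
    const ready = Array.from(remaining)
      .filter((phase) => dependsOn[phase].every((dep) => finished.has(dep)))
      .sort((a, b) => a - b);
    if (ready.length === 0) {
      throw new Error("cycle or missing prerequisite in phase graph");
    }
    for (const phase of ready) {
      remaining.delete(phase);
      finished.add(phase);
    }
    waves.push(ready);
  }
  return waves;
}

// Phase 39 shipped with v1.6, so it counts as done.
const waves = buildWaves(DEPENDS_ON, [39]);
```

For a single developer the waves simply flatten into a sequential order; the grouping only matters if phases are worked on in parallel.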
|
||||
|
||||
---
|
||||
|
||||
## Coverage Validation
|
||||
|
||||
All 23 v1.6 requirements are mapped to exactly one phase. No orphans.
|
||||
All 52 v1.7 requirements are mapped to exactly one phase. No orphans.
|
||||
|
||||
| Requirement | Phase |
|
||||
|-------------|-------|
|
||||
| VPIPE-01 | 36 |
|
||||
| VPIPE-02 | 36 |
|
||||
| VPIPE-03 | 36 |
|
||||
| VPIPE-04 | 36 |
|
||||
| VPIPE-05 | 36 |
|
||||
| VPIPE-06 | 36 |
|
||||
| WCHAT-01 | 37 |
|
||||
| WCHAT-02 | 37 |
|
||||
| WCHAT-03 | 37 |
|
||||
| WCHAT-04 | 37 |
|
||||
| WCHAT-05 | 37 |
|
||||
| WCHAT-06 | 37 |
|
||||
| TGRAM-01 | 38 |
|
||||
| TGRAM-02 | 38 |
|
||||
| TGRAM-03 | 38 |
|
||||
| TGRAM-04 | 38 |
|
||||
| TGRAM-05 | 38 |
|
||||
| TGRAM-06 | 38 |
|
||||
| ONBRD-03 | 38 |
|
||||
| VPIPE-07 | 39 |
|
||||
| VPIPE-08 | 39 |
|
||||
| ONBRD-01 | 39 |
|
||||
| ONBRD-02 | 39 |
|
||||
| INFRA-01 | 40 |
|
||||
| INFRA-02 | 40 |
|
||||
| INFRA-03 | 40 |
|
||||
| INFRA-04 | 40 |
|
||||
| DIAG-01 | 41 |
|
||||
| DIAG-02 | 41 |
|
||||
| DIAG-03 | 41 |
|
||||
| DIAG-04 | 41 |
|
||||
| DIAG-05 | 41 |
|
||||
| THEME-01 | 41 |
|
||||
| THEME-02 | 41 |
|
||||
| THEME-03 | 41 |
|
||||
| THEME-04 | 41 |
|
||||
| THEME-05 | 41 |
|
||||
| THEME-06 | 41 |
|
||||
| THEME-07 | 41 |
|
||||
| ICON-01 | 41 |
|
||||
| ICON-02 | 41 |
|
||||
| ICON-03 | 41 |
|
||||
| WALL-01 | 42 |
|
||||
| WALL-02 | 42 |
|
||||
| WALL-03 | 42 |
|
||||
| WALL-04 | 42 |
|
||||
| SOCIAL-01 | 42 |
|
||||
| SOCIAL-02 | 42 |
|
||||
| SOCIAL-03 | 42 |
|
||||
| CONV-01 | 42 |
|
||||
| CONV-02 | 42 |
|
||||
| CONV-03 | 42 |
|
||||
| CONV-04 | 42 |
|
||||
| CONV-05 | 42 |
|
||||
| CONV-06 | 42 |
|
||||
| CONV-07 | 42 |
|
||||
| CONV-08 | 42 |
|
||||
| CONV-09 | 42 |
|
||||
| VOICE-01 | 42 |
|
||||
| VOICE-02 | 42 |
|
||||
| VOICE-03 | 42 |
|
||||
| DOC-01 | 43 |
|
||||
| DOC-02 | 43 |
|
||||
| DOC-03 | 43 |
|
||||
| BRAND-01 | 43 |
|
||||
| BRAND-02 | 43 |
|
||||
| BRAND-03 | 43 |
|
||||
| BRAND-04 | 43 |
|
||||
| BRAND-05 | 43 |
|
||||
| BRAND-06 | 43 |
|
||||
| PRES-01 | 44 |
|
||||
| PRES-02 | 44 |
|
||||
| PRES-03 | 44 |
|
||||
| PRES-04 | 44 |
|
||||
| SKILL-01 | 45 |
|
||||
| SKILL-02 | 45 |
|
||||
| SKILL-03 | 45 |
|
||||
|
||||
---
|
||||
|
||||
|
|
@@ -225,7 +335,13 @@ All 23 v1.6 requirements are mapped to exactly one phase. No orphans.
|
|||
| 33. Persistent Memory + Personal Assistant Mode | v1.5 | 3/3 | Complete | 2026-04-03 |
|
||||
| 34. Voice | v1.5 | 2/2 | Complete | 2026-04-03 |
|
||||
| 35. npx buildthis CLI | v1.5 | 1/1 | Complete | 2026-04-03 |
|
||||
| 36. Voice Pipeline Foundation | v1.6 | 2/3 | Complete | 2026-04-04 |
|
||||
| 37. Web Chat Voice UI | v1.6 | 3/4 | Complete | 2026-04-04 |
|
||||
| 38. Telegram Bridge | v1.6 | 3/3 | Complete | 2026-04-04 |
|
||||
| 39. Voice Polish | v1.6 | 1/2 | Complete | 2026-04-04 |
|
||||
| 36. Voice Pipeline Foundation | v1.6 | 2/3 | Complete | 2026-04-04 |
|
||||
| 37. Web Chat Voice UI | v1.6 | 3/4 | Complete | 2026-04-04 |
|
||||
| 38. Telegram Bridge | v1.6 | 3/3 | Complete | 2026-04-04 |
|
||||
| 39. Voice Polish | v1.6 | 1/2 | Complete | 2026-04-04 |
|
||||
| 40. Job Infrastructure | v1.7 | 0/TBD | Not started | - |
|
||||
| 41. Diagrams, Icons & Theme Engine | v1.7 | 0/TBD | Not started | - |
|
||||
| 42. Wallpapers, Social, Format Conversion & Voice | v1.7 | 0/TBD | Not started | - |
|
||||
| 43. Documents & Branding | v1.7 | 0/TBD | Not started | - |
|
||||
| 44. Video & Presentations | v1.7 | 0/TBD | Not started | - |
|
||||
| 45. Content as Skills | v1.7 | 0/TBD | Not started | - |
|
||||
|
|
|
|||
|
|
@@ -7,7 +7,7 @@ stopped_at: ""
|
|||
last_updated: "2026-04-04"
|
||||
last_activity: 2026-04-04
|
||||
progress:
|
||||
total_phases: 0
|
||||
total_phases: 6
|
||||
completed_phases: 0
|
||||
total_plans: 0
|
||||
completed_plans: 0
|
||||
|
|
@@ -21,20 +21,21 @@ progress:
|
|||
See: .planning/PROJECT.md (updated 2026-04-04)
|
||||
|
||||
**Core value:** A fresh onboard asks for ONE thing (root directory), auto-creates PM + Engineer agents, and drops you in the dashboard.
|
||||
**Current focus:** Defining requirements for v1.7
|
||||
**Current focus:** Phase 40 — Job Infrastructure (v1.7 start)
|
||||
|
||||
## Current Position
|
||||
|
||||
Phase: Not started (defining requirements)
|
||||
Plan: —
|
||||
Status: Defining requirements
|
||||
Last activity: 2026-04-04 — Milestone v1.7 started
|
||||
Phase: 40 of 45 (Job Infrastructure)
|
||||
Plan: — (not yet planned)
|
||||
Status: Ready to plan
|
||||
Last activity: 2026-04-04 — v1.7 roadmap created, 52 requirements mapped to 6 phases
|
||||
|
||||
Progress: [░░░░░░░░░░] 0%
|
||||
|
||||
## Performance Metrics
|
||||
|
||||
**Velocity:**
|
||||
|
||||
- Total plans completed: 0 (v1.6)
|
||||
- Total plans completed: 0 (v1.7)
|
||||
- Average duration: -
|
||||
- Total execution time: 0 hours
|
||||
|
||||
|
|
@@ -43,30 +44,23 @@ Last activity: 2026-04-04 — Milestone v1.7 started
|
|||
### Decisions
|
||||
|
||||
Decisions are logged in PROJECT.md Key Decisions table.
|
||||
Key constraints for v1.6:
|
||||
Key constraints for v1.7:
|
||||
|
||||
- voicePipelineService is the keystone — Phase 37 and Phase 38 both depend on it; build Phase 36 first
|
||||
- Telegram bridge uses long polling (grammY `bot.start()`) — no public HTTPS required on Mac Mini
|
||||
- Audio transcoding via ffmpeg-static ^5.2.0 — NOT archived fluent-ffmpeg (archived May 2025)
|
||||
- Voice mode flag must survive every pipeline layer: client → Express → message persistence → agent codec
|
||||
- COOP/COEP headers required for @ricky0123/vad-react SharedArrayBuffer (add to Express static middleware)
|
||||
- Phase 37 and Phase 38 are independent once Phase 36 ships; sequential ordering for single-developer delivery
|
||||
- Telegram bridge must stay under 500 lines (TGRAM-06 is a hard constraint)
|
||||
- [Phase 36]: Export nexusSettingsSchema for direct testing, use nexusSettingsSchema.parse({}) for consistent defaults in catch blocks
|
||||
- [Phase 36]: Used manual execFileAsync wrapper instead of promisify(execFileCb) to avoid util.promisify.custom symbol incompatibility with vitest mocks
|
||||
- [Phase 36]: Voice routes are dedicated voice.ts module (not added to chat-files.ts) for clean separation — voice pipeline is its own subsystem
|
||||
- [Phase 36]: voiceMode typed as text|voice_input|full_voice union in stream endpoint, persisted as voice_full/voice_input messageType for downstream rendering
|
||||
- [Phase 37]: Cherry-picked Phase 36 commits to bring voice pipeline, nexus-settings, and voiceMode wiring to phase-37 branch
|
||||
- [Phase 37]: COOP/COEP headers placed as first Express middleware — applies to all responses including API, static, and Vite dev
|
||||
- [Phase 37]: VAD ONNX assets served from ui/public/ same-origin to avoid COEP blocking CDN-served binary files
|
||||
- [Phase 37]: useVadRecorder requests separate MediaStream ref for VoiceWaveform AnalyserNode — useMicVAD manages its own stream internally
|
||||
- [Phase 37]: AudioContext not closed on cleanup in VoiceWaveform — reused across recording cycles to avoid repeated autoplay unlock prompts
|
||||
- [Phase 37]: useVoiceMode hook created in plan 37-03 to unblock VoiceModeToggle during parallel execution
|
||||
- [Phase 37]: Auto-play preference stored in localStorage (nexus:voice:autoplay), not nexus-settings — avoids server round-trip for fast UX
|
||||
- [Phase 38-telegram-bridge]: TelegramStep uses onNext/onBack props; Continue disabled until token validated; Skip always available
|
||||
- [Phase 38-telegram-bridge]: telegramRoutes accepts service instance as second param — enables restart from token route
|
||||
- [Phase 38-telegram-bridge]: Long-polling: deleteWebhook first, then bot.start() fire-and-forget with catch logger
|
||||
- [Phase 38-telegram-bridge]: processVoiceMessage() extracted as top-level async function — keeps bot handler clean; botToken stored as module-level mutable ref for CDN URL construction
|
||||
- content_jobs table + renderPipelineService stub must exist before any renderer is built — Phase 40 is the hard dependency for all other phases
|
||||
- Async job pattern is mandatory — all render requests return 202 + job ID immediately; never block HTTP on render
|
||||
- sourceTaskId is required on every generated asset from day one (prevents SSD orphan accumulation)
|
||||
- MAX_GENERATED_ASSET_BYTES constant bypasses the 10MB upload limit for the generated/ namespace; it is separate from the upload route
|
||||
- Mermaid securityLevel must be "strict" — strip %%{init}%% and click directives before render, DOMPurify on SVG output
|
||||
- OKLCH via culori for all theme generation — HSL is forbidden as an intermediate (perceptually non-uniform)
|
||||
- Remotion bundle() called once at startup, not per-render — cached bundle path passed to renderMedia() per request
|
||||
- Remotion isolated in packages/content-renderer/ workspace package — webpack bundler must not enter Vite/tsc server context
|
||||
- Phase 42 and Phase 41 both depend on Phase 40 but are independent of each other (can parallelize if needed)
|
||||
- Phase 43 (PDF/Brand) depends on Phase 41 because PDF templates may reuse Satori/SVG pipeline components
|
||||
- Phase 44 (Remotion) depends only on Phase 40 (job infra) — can start after Phase 40, independent of 41-43
|
||||
- Phase 45 (Skills) is last — skill markdown files reference API contracts finalized in Phases 41-44
|
||||
- AI-bridged conversion (CONV-05) is the fallback for all format pairs — never show a format pair as blocked
|
||||
- CONV-08: converter availability detected at startup via probe; unavailable direct paths fall to AI bridge
|
||||
- CONV-09: magic-byte MIME validation before processing — reject misnamed files with a clear error
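
The Mermaid constraint above (strip `%%{init}%%` and `click` directives before render) could start from a pre-filter like the one sketched here. `stripUnsafeMermaid` is a hypothetical helper, and regex filtering is not a full parser, so this would sit in front of the strict security level and DOMPurify rather than replace them.

```typescript
// Hypothetical pre-filter; a sketch, not a complete sanitizer.
function stripUnsafeMermaid(source: string): string {
  return source
    // Remove %%{ ... }%% directive blocks such as %%{init: {...}}%%, including multi-line ones.
    .replace(/%%\{[\s\S]*?\}%%/g, "")
    .split("\n")
    // Drop interaction lines of the form `click <node> ...`.
    .filter((line) => !/^\s*click\s/.test(line))
    .join("\n");
}

const example = [
  '%%{init: {"securityLevel": "loose"}}%%',
  "graph TD",
  "  A --> B",
  '  click A callback "tooltip"',
].join("\n");

const cleaned = stripUnsafeMermaid(example);
```

After filtering, `cleaned` keeps the `graph TD` header and the `A --> B` edge but contains neither the init directive nor the `click` line.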
|
||||
|
||||
### Pending Todos
|
||||
|
||||
|
|
@@ -74,12 +68,13 @@ None yet.
|
|||
|
||||
### Blockers/Concerns
|
||||
|
||||
- [v1.5 carryover] smart-whisper Apple Silicon acceleration unverified on Mac Mini M4 — fall back to `tiny.en` if `base.en` acceleration not confirmed
|
||||
- [v1.6] grammY session management approach not yet chosen: lightweight `Map<chatId, sessionId>` vs. grammY conversation plugin — decide at Phase 38 planning
|
||||
- [v1.6] Dual output prompt reliability on 7B models is ~90% — Approach B fallback (post-process markdown strip) must be implemented as safety net, not optional
|
||||
- [v1.7 pre-start] Verify correct resvg package name: `@resvg/resvg-js` (v2.6.2) vs `resvg-js` (v0.1.97) — run `npm info @resvg/resvg-js` before pnpm add in Phase 41
|
||||
- [v1.7 pre-start] Check whether playwright-chromium and @mermaid-js/mermaid-cli can share a Chromium binary via PUPPETEER_EXECUTABLE_PATH — could save ~300MB on Mac Mini SSD
|
||||
- [v1.7 pre-start] Run pnpm build after adding packages/content-renderer/ to verify no Vite/webpack conflicts before Phase 44 implementation
|
||||
- [v1.7 pre-start] Confirm pdf-lib scope: Playwright for design-rich PDFs, pdf-lib for data-driven invoices — decide at Phase 43 planning
|
||||
|
||||
## Session Continuity
|
||||
|
||||
Last session: 2026-04-04T03:18:52.490Z
|
||||
Stopped at: Completed 38-02-PLAN.md — Telegram voice handling + TTS reply
|
||||
Last session: 2026-04-04
|
||||
Stopped at: v1.7 roadmap created — 52 requirements mapped, 6 phases defined (40-45), files written
|
||||
Resume file: None
|
||||