Nexus Dev ffc7b130e4 chore: archive v1.3 phase directories to milestones/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-04 03:55:48 +00:00

6.6 KiB


phase: 25-file-system
plan: 08
subsystem: voice
tags: whisper, mediarecorder, transcription, react, express, multer
requires:
  phase: 25-file-system-02
  provides: ChatInput with file upload props, chat-files route with multer pattern
provides:
  - VoiceRecordButton component with MediaRecorder API, idle/recording/transcribing states
  - POST /transcribe server endpoint using execFileAsync for whisper-cpp or openai-whisper
  - ChatInput enableVoiceInput prop that renders VoiceRecordButton conditionally
affects: 25-file-system, chat-input, chat-panel
tech-stack:
  added:
  patterns:
    - MediaRecorder API with 250ms chunk collection and onstop blob assembly
    - execFileAsync (not exec) for shell commands to avoid injection risk
    - Whisper CLI cascade: whisper-cpp first, openai-whisper Python fallback, 503 if neither
key-files:
  created:
    - ui/src/components/VoiceRecordButton.tsx
  modified:
    - ui/src/components/ChatInput.tsx
    - ui/src/components/ChatPanel.tsx
    - server/src/routes/chat-files.ts
    - .planning/REQUIREMENTS.md
key-decisions:
  - Use setValue state updater (functional form) for transcription append; avoids stale closure vs native DOM event approach
  - enableVoiceInput defaults to false for backward-compat; ChatPanel passes true unconditionally, since the server returns 503 gracefully if whisper absent
  - execFileAsync over exec for whisper CLI invocation: no shell injection risk with system-generated tmpPath
  - POST /transcribe uses separate multer instance with field name 'audio' to avoid conflict with 'file' field used by upload routes
patterns-established:
requirements-completed: INPUT-02, INPUT-03, INPUT-04
duration: 8min
completed: 2026-04-01

Phase 25 Plan 08: Voice Input Summary

VoiceRecordButton with MediaRecorder API wired into ChatInput; POST /transcribe endpoint with whisper-cpp/openai-whisper cascade and graceful 503 fallback

Performance

Duration: ~8 min
Started: 2026-04-01T23:58:00Z
Completed: 2026-04-01T23:59:00Z
Tasks: 2
Files modified: 5

Accomplishments

VoiceRecordButton component: idle (Mic), recording (Square/red), transcribing (Loader2 spinner) states using MediaRecorder API with 250ms chunks
POST /transcribe endpoint: writes audio to temp file, tries whisper-cpp CLI first, falls back to openai-whisper Python CLI, returns 503 with helpful install message if neither is present
ChatInput: new enableVoiceInput prop renders VoiceRecordButton; handleTranscription appends text to existing textarea value via functional setState
ChatPanel passes enableVoiceInput={true} unconditionally (server returns 503 if whisper unavailable)
INPUT-02, INPUT-03, INPUT-04 marked Complete in REQUIREMENTS.md
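
The three button states listed above can be modelled as a small transition function. This is only a sketch under stated assumptions: the names VoiceState, VoiceEvent, nextVoiceState and ICON_FOR_STATE are hypothetical and not taken from VoiceRecordButton.tsx, and the exact transitions (for example, returning to idle on error) are a guess at reasonable behaviour rather than a description of the component.

```typescript
// Hypothetical names for illustration; not taken from VoiceRecordButton.tsx.
type VoiceState = "idle" | "recording" | "transcribing";
type VoiceEvent = "start" | "stop" | "done" | "error";

// Icon names as listed in the summary (Mic, Square, Loader2).
const ICON_FOR_STATE: Record<VoiceState, string> = {
  idle: "Mic",
  recording: "Square",
  transcribing: "Loader2",
};

// Assumed transitions: idle -> recording -> transcribing -> idle.
// Any event that does not apply to the current state leaves it unchanged.
function nextVoiceState(state: VoiceState, event: VoiceEvent): VoiceState {
  if (state === "idle") {
    return event === "start" ? "recording" : state;
  }
  if (state === "recording") {
    if (event === "stop") return "transcribing";
    if (event === "error") return "idle";
    return state;
  }
  // state === "transcribing": finish or fail, both return to idle.
  return event === "done" || event === "error" ? "idle" : state;
}
```

Keeping the transitions in one pure function makes it easy to check that, for instance, a second click while transcribing cannot start a new recording.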

Task Commits

Each task was committed atomically:

Task 1: Create VoiceRecordButton and server transcription endpoint - c7c46a02 (feat)
Task 2: Wire VoiceRecordButton into ChatInput and update REQUIREMENTS.md - a1e1b11b (feat)

Files Created/Modified

ui/src/components/VoiceRecordButton.tsx - New voice recording button component with MediaRecorder API
ui/src/components/ChatInput.tsx - Added enableVoiceInput prop, handleTranscription callback, VoiceRecordButton render
ui/src/components/ChatPanel.tsx - Passes enableVoiceInput={true} to ChatInput
server/src/routes/chat-files.ts - Added POST /transcribe endpoint with whisper CLI cascade
.planning/REQUIREMENTS.md - Marked INPUT-02, INPUT-03, INPUT-04 as Complete

Decisions Made

Used functional form of setValue (setValue((current) => ...)) for transcription append to avoid stale closure issues — simpler than the native DOM event approach suggested in the plan
enableVoiceInput defaults to false in ChatInput props for backward compatibility; ChatPanel passes true unconditionally since the server returns a friendly 503 if whisper is not installed
Used a separate audioUpload multer instance with .single("audio") inside the transcribe handler to avoid field name collision with the existing fileUpload instance that uses .single("file")

Deviations from Plan

Auto-fixed Issues

1. [Rule 1 - Bug] Added path import at top of chat-files.ts

Found during: Task 1 (server transcription endpoint)
Issue: The plan's code used path.join(tmpdir(), ...) but path was not imported in the file
Fix: Added import path from "node:path"; at the top of chat-files.ts
Files modified: server/src/routes/chat-files.ts
Verification: TypeScript compiles without errors
Committed in: c7c46a02 (Task 1 commit)

2. [Rule 2 - Missing Critical] Separate multer instance for audio upload

Found during: Task 1 (server transcription endpoint)
Issue: The plan's code called runSingleFileUpload(fileUpload, req, res) which uses .single("file") — but the audio field is named "audio", so no file would be found
Fix: Created separate audioUpload multer instance and runAudioUpload helper using .single("audio")
Files modified: server/src/routes/chat-files.ts
Verification: TypeScript compiles without errors; logic matches field name used by VoiceRecordButton
Committed in: c7c46a02 (Task 1 commit)

3. [Rule 1 - Bug] Functional setState for transcription append

Found during: Task 2 (ChatInput integration)
Issue: Plan suggested using native DOM event dispatch to update the textarea — unnecessarily complex since ChatInput uses controlled value state directly
Fix: Used setValue((current) => current ? `${current} ${text}` : text), which correctly appends without stale closure risk
Files modified: ui/src/components/ChatInput.tsx
Verification: TypeScript compiles without errors
Committed in: a1e1b11b (Task 2 commit)
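
To illustrate why the functional form matters, here is a minimal sketch that models a queued state setter outside React. createValueState is a hypothetical, simplified stand-in for React's update queue (it is not React's implementation), and appendTranscription is an illustrative name for the append expression quoted in the fix.

```typescript
// Hypothetical, simplified model of a batched state setter; not React itself.
type Update = string | ((current: string) => string);

function createValueState(initial: string) {
  let value = initial;
  const pending: Update[] = [];
  return {
    get: () => value,
    // Updates are queued and applied later, in order, like batched setState calls.
    setValue: (update: Update) => {
      pending.push(update);
    },
    flush: () => {
      for (const update of pending.splice(0)) {
        value = typeof update === "function" ? update(value) : update;
      }
    },
  };
}

// Same append rule as in the fix: add a space only when there is existing text.
const appendTranscription = (current: string, text: string): string =>
  current ? `${current} ${text}` : text;

// Direct form: both updates are computed from the same captured value.
const stale = createValueState("hello");
const captured = stale.get();
stale.setValue(appendTranscription(captured, "one"));
stale.setValue(appendTranscription(captured, "two"));
stale.flush();

// Functional form: each update receives the latest value when it is applied.
const fresh = createValueState("hello");
fresh.setValue((current) => appendTranscription(current, "one"));
fresh.setValue((current) => appendTranscription(current, "two"));
fresh.flush();

console.log(stale.get()); // "hello two" (the first transcription is lost)
console.log(fresh.get()); // "hello one two"
```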

Total deviations: 3 auto-fixed (1 missing import, 1 field name mismatch, 1 simpler state approach)
Impact on plan: All fixes were necessary for correctness. Left unfixed, the missing path import and the multer field name mismatch would have caused runtime errors. The functional setState approach is simpler and more idiomatic React.

Issues Encountered

None beyond the auto-fixed deviations above.

User Setup Required

None — server returns 503 with install instructions if whisper is not present. No configuration required by default.
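
The fallback order behind that 503 can be sketched roughly as follows. Everything here is an assumption made for illustration: the names Runner, WhisperCandidate and transcribeWithCascade are hypothetical, the command-line arguments are placeholders that depend on the installed whisper version and model setup, and the real handler in chat-files.ts may detect a missing binary differently. In a server, the run argument would typically wrap Node's promisified execFile.

```typescript
// Hypothetical types and names for this sketch; none come from chat-files.ts.
type Runner = (command: string, args: string[]) => Promise<string>;

interface WhisperCandidate {
  command: string;
  buildArgs: (audioPath: string) => string[];
}

type TranscribeResult =
  | { status: 200; text: string }
  | { status: 503; error: string };

// Assumed invocation shapes; real flags depend on the installed version.
const DEFAULT_CANDIDATES: WhisperCandidate[] = [
  { command: "whisper-cpp", buildArgs: (audioPath) => ["-f", audioPath] },
  { command: "whisper", buildArgs: (audioPath) => [audioPath] },
];

// A spawn failure with code ENOENT means the executable was not found.
function isMissingBinary(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "ENOENT"
  );
}

async function transcribeWithCascade(
  audioPath: string,
  run: Runner,
  candidates: WhisperCandidate[] = DEFAULT_CANDIDATES,
): Promise<TranscribeResult> {
  for (const candidate of candidates) {
    try {
      const stdout = await run(candidate.command, candidate.buildArgs(audioPath));
      return { status: 200, text: stdout.trim() };
    } catch (err) {
      // Only fall through when the binary is absent; other failures propagate.
      if (!isMissingBinary(err)) throw err;
    }
  }
  return {
    status: 503,
    error: "No whisper CLI found. Install whisper-cpp or openai-whisper.",
  };
}
```

Passing the runner in as a parameter keeps the cascade testable without either CLI installed, and passing arguments as an array (rather than a shell string) matches the execFile-over-exec decision recorded above.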

Next Phase Readiness

Voice input complete; INPUT-02/03/04 all marked Complete
Remaining Phase 25 plans can proceed independently
To enable transcription: install whisper-cpp (brew install whisper-cpp) or openai-whisper (pip install openai-whisper)

Phase: 25-file-system
Completed: 2026-04-01
