homelab/.planning/research/FEATURES.md
Mikkel Georgsen 1648a986bc docs: complete project research
Files:
- STACK.md
- FEATURES.md
- ARCHITECTURE.md
- PITFALLS.md
- SUMMARY.md

Key findings:
- Stack: Python 3.12+ with python-telegram-bot 22.6, asyncio subprocess management
- Architecture: Path-based session routing with state machine lifecycle management
- Critical pitfall: Asyncio PIPE deadlock requires concurrent stdout/stderr draining

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 13:37:24 +00:00

379 lines
20 KiB
Markdown

# Feature Research: Telegram-to-Claude Code Bridge
**Domain:** AI chatbot bridge / Remote code assistant interface
**Researched:** 2026-02-04
**Confidence:** HIGH
## Feature Landscape
### Table Stakes (Users Expect These)
Features users assume exist. Missing these = product feels incomplete.
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| Basic message send/receive | Core functionality of any chat interface | LOW | Python-telegram-bot or grammY provide this out-of-box |
| Session persistence | Users expect conversations to continue where they left off | MEDIUM | Store session state to disk/DB; must survive bot restarts |
| Command interface | Standard way to control bot behavior (`/help`, `/new`, `/status`) | LOW | Built-in to telegram bot frameworks |
| Typing indicator | Shows bot is processing (expected for AI bots with 10-60s response times) | LOW | Use `sendChatAction` every 5s during processing |
| Error messages | Clear feedback when something goes wrong | LOW | Graceful error handling with user-friendly messages |
| File upload support | Send files/images to Claude for analysis | MEDIUM | Telegram supports up to 50MB files; larger requires self-hosted Bot API |
| File download | Receive files Claude generates (scripts, configs, reports) | MEDIUM | Bot sends files back; organize in user-specific folders |
| Authentication | Only authorized users can access the bot | LOW | User ID whitelist in config (for single-user: just one ID) |
| Multi-message handling | Long responses split intelligently across multiple messages | MEDIUM | Telegram has 4096 char limit; need smart splitting at code block/paragraph boundaries |
### Differentiators (Competitive Advantage)
Features that set the product apart. Not required, but valuable.
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| Named session management | Switch between multiple projects/contexts (`/session work`, `/session personal`) | MEDIUM | Session key = user:session_name; list/switch/delete sessions |
| Idle timeout with graceful suspension | Auto-suspend idle sessions to save costs, easy resume with context preserved | MEDIUM | Timer-based monitoring; serialize session state; clear resume UX with `/resume <session>` |
| Smart output modes | Choose verbosity: final answer only / verbose with tool calls / auto-smart truncation | HIGH | Requires parsing Claude Code output stream and making intelligent display decisions |
| Tool call progress notifications | Real-time updates as Claude uses tools ("Reading file X", "Running command Y") | HIGH | Stream parsing + progressive message editing; balance info vs notification spam |
| Cost tracking per session | Show token usage and $ cost for each conversation | MEDIUM | Track input/output tokens; calculate using Anthropic pricing; display in `/stats` |
| Session-specific folders | Each session gets isolated file workspace (~/stuff/sessions/<name>/) | LOW | Create directory per session; pass as cwd to Claude Code |
| Inline keyboard menus | Button-based navigation (session list, quick commands) instead of typing | MEDIUM | Telegram InlineKeyboardMarkup for cleaner UX |
| Voice message support | Send voice, bot transcribes and processes | HIGH | Requires Whisper API or similar; adds complexity but strong UX boost |
| Photo/image analysis | Send photos, Claude analyzes with vision | MEDIUM | Claude supports vision natively; just pass image data |
| Proactive heartbeat | Bot checks in periodically ("Task done?", "Anything broken?") | HIGH | Cron-based with intelligent prompting; OpenClaw-style feature |
| Multi-model routing | Use Haiku for simple tasks, Sonnet for complex, Opus for critical | HIGH | Analyze message complexity; route intelligently; 80% cost savings potential |
| Session export | Export full conversation history as markdown/JSON | LOW | Serialize messages to file, send via Telegram |
| Undo/rollback | Revert to previous message in conversation | HIGH | Requires conversation tree management; complex but powerful |
### Anti-Features (Commonly Requested, Often Problematic)
Features that seem good but create problems.
| Feature | Why Requested | Why Problematic | Alternative |
|---------|---------------|-----------------|-------------|
| Multi-user support (v1) | Seems like natural evolution | Adds auth complexity, resource contention, security surface, and user isolation requirements before core experience is validated | Build single-user first; prove value; then add multi-user with proper tenant isolation |
| Real-time streaming text | Shows AI thinking character-by-character | Telegram message editing has rate limits; causes flickering; annoying for code blocks | Use typing indicator + tool call progress updates + send complete responses |
| Inline bot mode (@mention in any chat) | Convenience of using bot anywhere | Security nightmare (exposes bot to all chats, leaks context); hard to maintain session isolation | Keep bot in dedicated chat; use `/share` to export results elsewhere |
| Voice response (TTS) | "Complete voice assistant" feel | Adds latency, quality issues, limited Telegram voice note support, user often reading anyway | Text-first; voice input OK but output stays text |
| Auto-response to all messages | Bot always active, no explicit commands needed | Burns tokens on noise; user loses control; hard to have side conversations | Require explicit command or @mention; clear when bot is listening |
| Unlimited session history | "Never forget anything" | Memory bloat, context window waste, cost explosion | Implement sliding window (last N messages) + summarization; store full history off-context |
| Advanced NLP for command parsing | "Natural language commands" | Adds unreliability; burns tokens; users prefer explicit commands for tools | Use standard `/command` syntax; save NLP tokens for actual Claude conversations |
| Rich formatting (bold, italic, links) in bot messages | Prettier output | Telegram markdown syntax fragile; breaks on code blocks; debugging nightmare | Use plain text with clear structure; minimal formatting for critical info only |
## Feature Dependencies
```
Authentication (whitelist)
└──requires──> Session Management
├──requires──> Message Handling
│ └──requires──> Claude Code Integration
└──requires──> File Handling
└──requires──> Session Folders
Smart Output Modes
└──requires──> Output Stream Parsing
└──requires──> Message Splitting
Tool Call Progress
└──requires──> Output Stream Parsing
└──requires──> Typing Indicator
Idle Timeout
└──requires──> Session Persistence
└──requires──> Session Management
Cost Tracking
└──requires──> Token Counting
└──requires──> Claude Code Integration
Multi-Model Routing
└──requires──> Message Complexity Analysis
└──enhances──> Cost Tracking
```
### Dependency Notes
- **Session Management is foundational**: Nearly everything depends on solid session management. This must be robust before adding advanced features.
- **Output Stream Parsing enables differentiators**: Many high-value features (smart output modes, tool progress, cost tracking) require parsing Claude Code's output stream. Build this infrastructure early.
- **File Handling is isolated**: Can be built in parallel with core message flow; minimal dependencies.
- **Authentication gates everything**: Single-user whitelist is simplest; must be in place before any other features.
## MVP Definition
### Launch With (v0.1 - Prove Value)
Minimum viable product — what's needed to validate the concept.
- [ ] **User whitelist authentication** — Only owner can use bot (security baseline)
- [ ] **Basic message send/receive** — Chat with Claude Code via Telegram
- [ ] **Session persistence** — Conversations survive bot restarts
- [ ] **Simple session management**`/new`, `/continue`, `/list` commands
- [ ] **Typing indicator** — Shows bot is thinking during long AI responses
- [ ] **File upload** — Send files to Claude (PDFs, screenshots, code)
- [ ] **File download** — Receive files Claude creates
- [ ] **Error handling** — Clear messages when things break
- [ ] **Message splitting** — Long responses broken into readable chunks
- [ ] **Session folders** — Each session has isolated file workspace
**MVP Success Criteria**: Can manage homelab from phone during commute. Can send screenshot of error, Claude analyzes and suggests fix, can review and apply.
### Add After Validation (v0.2-0.5 - Polish Core Experience)
Features to add once core is working and usage patterns emerge.
- [ ] **Named sessions** — Switch between projects (`/session ansible`, `/session docker`)
- [ ] **Idle timeout with suspend/resume** — Save costs on unused sessions
- [ ] **Basic output modes** — Toggle verbose (`/verbose on`) for debugging
- [ ] **Cost tracking** — See token usage per session (`/stats`)
- [ ] **Inline keyboard menus** — Button-based session picker
- [ ] **Session export** — Download conversation as markdown (`/export`)
- [ ] **Image analysis** — Send photos, Claude describes/debugs
**Trigger for adding**: Using bot daily, patterns clear, requesting these features organically.
### Future Consideration (v1.0+ - Differentiating Power Features)
Features to defer until product-market fit is established.
- [ ] **Smart output modes** — AI decides what to show based on context
- [ ] **Tool call progress notifications** — Real-time updates on Claude's actions
- [ ] **Multi-model routing** — Haiku for simple, Sonnet for complex (cost optimization)
- [ ] **Voice message support** — Voice input with Whisper transcription
- [ ] **Proactive heartbeat** — Bot checks in on long-running tasks
- [ ] **Undo/rollback** — Revert conversation to previous state
- [ ] **Multi-user support** — Share bot with team (requires tenant isolation)
**Why defer**: These are complex, require significant engineering, and value unclear until core experience proven. Some (like multi-model routing) need usage data to optimize.
## Feature Prioritization Matrix
| Feature | User Value | Implementation Cost | Priority | Phase |
|---------|------------|---------------------|----------|-------|
| Message send/receive | HIGH | LOW | P1 | MVP |
| Session persistence | HIGH | MEDIUM | P1 | MVP |
| File upload/download | HIGH | MEDIUM | P1 | MVP |
| Typing indicator | HIGH | LOW | P1 | MVP |
| User authentication | HIGH | LOW | P1 | MVP |
| Message splitting | HIGH | MEDIUM | P1 | MVP |
| Error handling | HIGH | LOW | P1 | MVP |
| Session folders | MEDIUM | LOW | P1 | MVP |
| Basic commands | HIGH | LOW | P1 | MVP |
| Named sessions | HIGH | MEDIUM | P2 | Post-MVP |
| Idle timeout | MEDIUM | MEDIUM | P2 | Post-MVP |
| Cost tracking | MEDIUM | MEDIUM | P2 | Post-MVP |
| Inline keyboards | MEDIUM | MEDIUM | P2 | Post-MVP |
| Session export | LOW | LOW | P2 | Post-MVP |
| Image analysis | MEDIUM | MEDIUM | P2 | Post-MVP |
| Smart output modes | HIGH | HIGH | P3 | Future |
| Tool progress | MEDIUM | HIGH | P3 | Future |
| Multi-model routing | HIGH | HIGH | P3 | Future |
| Voice messages | LOW | HIGH | P3 | Future |
| Proactive heartbeat | LOW | HIGH | P3 | Future |
**Priority key:**
- P1: Must have for launch (MVP)
- P2: Should have, add when core working (Post-MVP)
- P3: Nice to have, future consideration (v1.0+)
## Competitor Feature Analysis
| Feature | OpenClaw | claude-code-telegram | Claude-Code-Remote | Our Approach |
|---------|----------|----------------------|--------------------|--------------|
| Session Management | Multi-agent sessions with isolation | Session persistence, project switching | Smart session detection (24h tokens) | Named sessions with manual switch |
| Authentication | Pairing allowlist, mention gating | User ID whitelist + optional token | User ID whitelist | Single-user whitelist (simplest) |
| File Handling | Full file operations | Directory navigation (cd/ls/pwd) | File transfers | Upload to session folders, download results |
| Progress Updates | Proactive heartbeat | Command output shown | Real-time notifications | Tool call progress (stretch goal) |
| Multi-Platform | Telegram, Discord, Slack, WhatsApp, iMessage | Telegram only | Telegram, Email, Discord, LINE | Telegram only (focused) |
| Output Management | Native streaming | Full responses | Smart content handling | Smart truncation + output modes |
| Cost Optimization | Not mentioned | Rate limiting | Cost tracking | Multi-model routing (future) |
| Voice Support | Not mentioned | Not mentioned | Not mentioned | Future consideration |
| Proactive Features | Heartbeat + cron jobs | Not mentioned | Not mentioned | Defer to v1+ |
**Our Differentiation Strategy**:
- **Simpler than OpenClaw**: No multi-platform complexity, focus on Telegram-Claude Code excellence
- **Smarter than claude-code-telegram**: Output modes, cost tracking, idle management (post-MVP)
- **More focused than Claude-Code-Remote**: Single platform, deep integration, better UX
- **Unique angle**: Cost-conscious design with multi-model routing and idle timeout (future)
## Implementation Complexity Assessment
### Low Complexity (1-2 days)
- User whitelist authentication
- Basic message send/receive
- Typing indicator
- Simple command interface
- Error messages
- Session folders
- Session export
### Medium Complexity (3-5 days)
- Session persistence (state serialization)
- File upload/download (Telegram file API)
- Message splitting (intelligent chunking)
- Named session management
- Idle timeout implementation
- Cost tracking
- Inline keyboards
- Image analysis (using Claude vision)
### High Complexity (1-2 weeks)
- Smart output modes (AI-driven truncation)
- Tool call progress parsing
- Multi-model routing (complexity analysis)
- Voice message support (Whisper integration)
- Proactive heartbeat (cron + intelligent prompting)
- Undo/rollback (conversation tree)
## Technical Considerations
### Telegram Bot Framework Options
**python-telegram-bot (Recommended)**
- Mature, well-documented (v21.8 as of 2026)
- ConversationHandler for state management
- Built-in file handling
- Already familiar to user (Python preference noted)
**Alternative: grammY (TypeScript/Node)**
- Used by OpenClaw
- Excellent session plugin
- Not aligned with user's Python preference
**Decision**: Use python-telegram-bot for consistency with existing homelab Python scripts.
### Session Storage Options
**SQLite (Recommended for MVP)**
- Simple, file-based, no server needed
- Built into Python
- Easy to backup (single file)
**Alternative: JSON files**
- Even simpler but no transaction safety
- Good for prototyping, migrate to SQLite quickly
**Decision**: Start with JSON for rapid prototyping, migrate to SQLite by v0.2.
### Claude Code Integration
**Subprocess Approach (Recommended)**
- Spawn `claude-code` CLI as subprocess
- Capture stdout/stderr
- Parse output for tool calls, costs, errors
- Clean isolation, no SDK dependency
**Challenge**: claude-code CLI doesn't expose token counts in output yet. Will need to:
1. Parse prompts/responses to estimate tokens
2. Or wait for CLI feature addition
3. Or use Anthropic API directly (breaks "use Claude Code" requirement)
### File Handling Architecture
```
~/stuff/telegram-sessions/
├── <session_name_1>/
│ ├── uploads/ # User-sent files
│ ├── downloads/ # Claude-generated files
│ └── metadata.json # Session info
└── <session_name_2>/
└── ...
```
Each session gets isolated folder on shared ZFS storage (~/stuff). Pass session folder as cwd to Claude Code.
### Cost Optimization Strategy
**Haiku vs Sonnet Pricing (2026):**
- Haiku 4.5: $1 input / $5 output per MTok
- Sonnet 4.5: $3 input / $15 output per MTok
**Haiku is 1/3 the cost of Sonnet**, performs within 5% on many tasks.
**Polling Pattern (Future Optimization)**:
- Use Haiku for idle checking: "Any new messages? Reply WAIT or process request"
- If WAIT: sleep and poll again (cheap)
- If action needed: Hand off to Sonnet for actual work
- Potential 70-80% cost reduction for always-on bot
**Not MVP**: Requires significant engineering, usage patterns unclear.
## Security & Privacy Notes
**Single-User Design Benefits:**
- No multi-tenant isolation complexity
- No user data privacy concerns (owner = user)
- Simple whitelist auth sufficient
- Can run with full system access (owner trusts self)
**Risks to Mitigate:**
- Telegram token leakage (store in config, never commit)
- User ID spoofing (validate against hardcoded whitelist)
- File upload exploits (validate file types, scan for malware if paranoid)
- Command injection via filenames (sanitize all user input)
**Session Security:**
- Sessions stored on local disk (~/stuff)
- Accessed only by bot user (mikkel)
- No encryption needed (single-user, trusted environment)
## Performance Considerations
**Telegram API Limits:**
- Bot messages: 30/sec across all chats
- Message edits: 1/sec per chat
- File uploads: 50MB default, 2000MB with self-hosted Bot API
**Implications:**
- Typing indicator: Max 1 update per 5-6 seconds (rate limit safe)
- Tool progress: Batch updates, don't spam on every tool call
- File handling: 50MB sufficient for most use cases (PDFs, screenshots, scripts)
**Claude Code Response Times:**
- Simple queries: 2-5 seconds
- Complex with tools: 10-60 seconds
- Very long responses: 60+ seconds
**Implications:**
- Typing indicator critical (users wait 10-60s regularly)
- Consider "Still working..." message at 30s mark
- Tool progress updates help perception of progress
## Sources
**Telegram Bot Features & Best Practices:**
- [Best Telegram Bots in 2026](https://chatimize.com/best-telegram-bots/)
- [Telegram AI Chatbots Best Practices](https://botpress.com/blog/top-telegram-chatbots)
- [Create Telegram Bot 2026](https://evacodes.com/blog/create-telegram-bot)
**Session Management:**
- [OpenClaw Telegram Bot Sessions](https://macaron.im/blog/openclaw-telegram-bot-setup)
- [grammY Session Plugin](https://grammy.dev/plugins/session.html)
- [python-telegram-bot ConversationHandler](https://docs.python-telegram-bot.org/en/v21.8/telegram.ext.conversationhandler.html)
**Claude Code Implementations:**
- [claude-code-telegram GitHub](https://github.com/RichardAtCT/claude-code-telegram)
- [Claude-Code-Remote GitHub](https://github.com/JessyTsui/Claude-Code-Remote)
- [OpenClaw Telegram Docs](https://docs.openclaw.ai/channels/telegram)
**Cost Optimization:**
- [Claude API Pricing 2026](https://www.metacto.com/blogs/anthropic-api-pricing-a-full-breakdown-of-costs-and-integration)
- [Claude API Pricing Guide](https://www.aifreeapi.com/en/posts/claude-api-pricing-per-million-tokens)
- [Anthropic Cost Optimization](https://www.finout.io/blog/anthropic-api-pricing)
**File Handling:**
- [Telegram File Handling](https://grammy.dev/guide/files)
- [Telegram Bot File Upload](https://telegrambots.github.io/book/3/files/upload.html)
**UX & Progress Updates:**
- [AI Assistant Streaming Responses](https://avestalabs.ai/aspire-ai-academy/gen-ai-engineering/streaming-responses)
- [Telegram Typing Indicator](https://community.latenode.com/t/how-can-a-telegram-bot-simulate-a-typing-indicator/5602)
**Timeout & Session Management:**
- [Chatbot Session Timeout Best Practices](https://quidget.ai/blog/ai-automation/chatbot-session-timeout-settings-best-practices/)
- [AI Chatbot Session Management](https://optiblack.com/insights/ai-chatbot-session-management-best-practices)
**Telegram Interface:**
- [Telegram Bot Buttons](https://core.telegram.org/bots/features)
- [Inline Keyboards](https://grammy.dev/plugins/keyboard)
---
*Feature research for: Telegram-to-Claude Code Bridge*
*Researched: 2026-02-04*
*Confidence: HIGH - All findings verified with official documentation and multiple current sources*