homelab/.planning/phases/02-telegram-integration/02-02-PLAN.md

---
phase: 02-telegram-integration
plan: 02
type: execute
wave: 2
depends_on: ["02-01"]
files_modified:
  - telegram/bot.py
  - telegram/message_batcher.py
autonomous: false

must_haves:
  truths:
    - "User sends message in Telegram and receives Claude's response formatted in MarkdownV2"
    - "Typing indicator stays visible during entire Claude processing time (10-60s+)"
    - "User sees tool call progress notifications (e.g. 'Reading config.json...')"
    - "Rapid sequential messages are batched into a single Claude prompt"
    - "User attaches photo in Telegram and Claude auto-analyzes it"
    - "User attaches document in Telegram and Claude can reference it in session"
    - "Responses longer than 4096 chars are split across multiple messages without breaking code blocks"
    - "Bot runs as systemd user service and restarts on failure"
  artifacts:
    - path: "telegram/bot.py"
      provides: "Updated message handlers with typing, progress, batching, file handling"
      contains: "typing_indicator_loop"
    - path: "telegram/message_batcher.py"
      provides: "MessageBatcher class for debounce-based message batching"
      exports: ["MessageBatcher"]
    - path: "~/.config/systemd/user/telegram-bot.service"
      provides: "Systemd user service unit for bot"
      contains: "telegram-bot"
  key_links:
    - from: "telegram/bot.py"
      to: "telegram/claude_subprocess.py"
      via: "ClaudeSubprocess.send_message() and callbacks"
      pattern: "send_message"
    - from: "telegram/bot.py"
      to: "telegram/telegram_utils.py"
      via: "split_message_smart, escape_markdown_v2, typing_indicator_loop"
      pattern: "split_message_smart|escape_markdown_v2|typing_indicator_loop"
    - from: "telegram/bot.py"
      to: "telegram/message_batcher.py"
      via: "MessageBatcher.add_message()"
      pattern: "MessageBatcher"
---

<objective>
Wire the persistent subprocess and utility functions into the Telegram bot with typing indicators, progress notifications, message batching, file handling, and systemd service setup.

Purpose: This plan makes the entire system work end-to-end. Messages flow from Telegram through the batcher to the persistent Claude subprocess, responses come back formatted in MarkdownV2 with smart splitting, and the user sees typing indicators and tool call progress throughout. File attachments land in session folders with auto-analysis. The systemd service ensures reliability across container restarts.

Output: Updated `bot.py` with full integration, new `message_batcher.py`, systemd service file, working end-to-end flow.
</objective>

<execution_context>
@/home/mikkel/.claude/get-shit-done/workflows/execute-plan.md
@/home/mikkel/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/02-telegram-integration/02-RESEARCH.md
@.planning/phases/02-telegram-integration/02-CONTEXT.md
@.planning/phases/02-telegram-integration/02-01-SUMMARY.md
@telegram/bot.py
@telegram/claude_subprocess.py
@telegram/telegram_utils.py
@telegram/session_manager.py
</context>

<tasks>

<task type="auto">
  <name>Task 1: Create MessageBatcher and update bot.py with typing, progress, batching, and file handling</name>
  <files>telegram/message_batcher.py, telegram/bot.py</files>
  <action>
**Part A: Create telegram/message_batcher.py**

Implement `MessageBatcher` class for debounce-based message batching:

```python
from typing import Awaitable, Callable


class MessageBatcher:
    def __init__(self, callback: Callable[[str], Awaitable[None]], debounce_seconds: float = 2.0):
        ...

    async def add_message(self, message: str) -> None:
        """Add message, reset debounce timer. When timer expires, flush batch via callback."""
        ...
```

- Uses asyncio.Queue to collect messages
- Cancels previous debounce timer when new message arrives
- After debounce_seconds of silence, joins all queued messages with `\n\n` and calls callback
- Callback is async (receives combined message string)
- Handles CancelledError gracefully during timer cancellation
- Follow research pattern from 02-RESEARCH.md (MessageBatcher section)
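
The bullets above can be sketched roughly as follows. This is a minimal illustration, not the pattern from 02-RESEARCH.md (which is not reproduced here). The private names `_queue`, `_timer`, and `_debounce`, and the public `flush()` method, are assumptions introduced for this sketch.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class MessageBatcher:
    """Debounce sketch: collect messages, flush after a quiet period."""

    def __init__(self, callback: Callable[[str], Awaitable[None]], debounce_seconds: float = 2.0):
        self._callback = callback
        self._debounce_seconds = debounce_seconds
        self._queue: asyncio.Queue = asyncio.Queue()
        self._timer: Optional[asyncio.Task] = None  # hypothetical attribute name

    async def add_message(self, message: str) -> None:
        await self._queue.put(message)
        # A new message restarts the quiet period.
        if self._timer is not None:
            self._timer.cancel()
        self._timer = asyncio.create_task(self._debounce())

    async def _debounce(self) -> None:
        try:
            await asyncio.sleep(self._debounce_seconds)
        except asyncio.CancelledError:
            return  # superseded by a newer message
        # Past the quiet period: detach so a late message cannot cancel the flush.
        self._timer = None
        await self.flush()

    async def flush(self) -> None:
        """Send everything queued so far as one combined message."""
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        parts = []
        while not self._queue.empty():
            parts.append(self._queue.get_nowait())
        if parts:
            await self._callback("\n\n".join(parts))
```

Exposing `flush()` separately is one way to cover a session switch, where queued messages should be sent immediately instead of waiting for the timer.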

**Part B: Update telegram/bot.py — make_callbacks() overhaul**

Replace the current `make_callbacks()` with a new version that uses telegram_utils:

```python
from telegram_utils import split_message_smart, escape_markdown_v2, typing_indicator_loop
from message_batcher import MessageBatcher
```

New `make_callbacks(bot, chat_id, stop_typing_event)` returns a dict of callbacks keyed by name (`on_output`, `on_error`, `on_complete`, `on_status`, `on_tool_use`):

1. **on_output(text):**
   - Split text using `split_message_smart(text)`
   - For each chunk: try sending with `parse_mode='MarkdownV2'` after `escape_markdown_v2()`
   - If MarkdownV2 parse fails (Telegram BadRequest), fall back to plain text send
   - Stop the typing indicator (set stop_event)

2. **on_error(error):**
   - Send error message to chat (plain text, no MarkdownV2)
   - Stop the typing indicator

3. **on_complete():**
   - Stop the typing indicator (set stop_event)
   - Log completion

4. **on_status(status):**
   - Send status as a brief message (e.g., "Claude restarted with context preserved")

5. **on_tool_use(tool_name, tool_input):** (NEW)
   - Format tool call notification: extract meaningful target from tool_input
   - For Bash tool: show command preview (first 50 chars)
   - For Read tool: show file path
   - For Edit tool: show file path
   - For Grep/Glob: show pattern
   - For Write tool: show file path
   - Send as a single editable progress message (edit_message_text on a progress message)
   - OR send as separate short messages (planner's discretion — separate messages are simpler and more reliable)
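   - Format: italic text like `_Reading config.json..._` (if sent with `parse_mode='MarkdownV2'`, escape the text inside the underscores first, since `.` is a reserved character there)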
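
A possible shape for the target extraction in `on_tool_use` is sketched below. The helper name `format_tool_notification`, the verbs, and the `tool_input` keys (`file_path`, `pattern`, `command`) are assumptions and should be checked against what `ClaudeSubprocess` actually passes to the callback. The sketch returns plain text and leaves italic wrapping and escaping to the caller.

```python
from pathlib import PurePosixPath
from typing import Any, Mapping

# Assumed mapping: tool name -> (verb shown to the user, key holding the target).
_TOOL_TARGETS = {
    "Read": ("Reading", "file_path"),
    "Edit": ("Editing", "file_path"),
    "Write": ("Writing", "file_path"),
    "Grep": ("Searching for", "pattern"),
    "Glob": ("Finding", "pattern"),
    "Bash": ("Running", "command"),
}


def format_tool_notification(tool_name: str, tool_input: Mapping[str, Any]) -> str:
    """Build a short progress line such as 'Reading config.json...'."""
    entry = _TOOL_TARGETS.get(tool_name)
    if entry is None:
        return f"Using {tool_name}..."  # unknown tool: name only
    verb, key = entry
    target = str(tool_input.get(key, "")).strip()
    if not target:
        return f"{verb}..."
    if key == "file_path":
        target = PurePosixPath(target).name  # show the file name, not the full path
    elif tool_name == "Bash":
        target = target[:50]  # command preview: first 50 characters
    return f"{verb} {target}..."
```

Showing only the file name keeps notifications short and matches the `Reading config.json...` example. Showing the full path instead would be a one-line change.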

**Part C: Update handle_message()**

Overhaul the message handler to use typing indicators and message batching:

1. On message received:
   - Start typing indicator loop: `stop_typing_event = asyncio.Event()`, `asyncio.create_task(typing_indicator_loop(...))`
   - Pass `stop_typing_event` to `make_callbacks()` so on_output/on_error/on_complete can stop the indicator
   - Get or create subprocess (existing logic, but construct it once and call `start()` so the process persists across turns)

2. Message batching:
   - Create one `MessageBatcher` per session (store in dict alongside subprocesses)
   - Batcher callback = `subprocess.send_message()`
   - On message: `await batcher.add_message(text)` instead of direct `subprocess.send_message()`
   - Typing indicator starts immediately on first message, stops on Claude response

3. Subprocess auto-start (integrate into existing handle_message after session lookup, before batcher):
   ```python
   # In handle_message(), after resolving active session:
   session_id = session_manager.get_active_session(user_id)

   # Get or create subprocess for this session (avoid double-start)
   if session_id not in self.subprocesses or not self.subprocesses[session_id].is_alive:
       callbacks = make_callbacks(bot, chat_id, stop_typing_event)
       subprocess = ClaudeSubprocess(
           session_dir=session_dir,
           on_output=callbacks['on_output'],
           on_error=callbacks['on_error'],
           on_complete=callbacks['on_complete'],
           on_status=callbacks['on_status'],
           on_tool_use=callbacks['on_tool_use'],
       )
       await subprocess.start()
       self.subprocesses[session_id] = subprocess
   else:
       subprocess = self.subprocesses[session_id]
   ```
   - The `is_alive` check prevents a double start: a subprocess is created and started only if none exists for the session or the previous one has died
   - `self.subprocesses` is a dict[str, ClaudeSubprocess] stored on the handler/application context (same pattern as existing subprocess tracking in bot.py)
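
For reference, the stop-event pattern used by the typing indicator might look like the sketch below. The real `typing_indicator_loop` lives in `telegram_utils.py` and its signature is not shown in this plan, so the function name `typing_loop_sketch` and the parameters `send_typing` and `interval` are hypothetical. With python-telegram-bot, `send_typing` could wrap `bot.send_chat_action(chat_id=chat_id, action="typing")`.

```python
import asyncio
from typing import Awaitable, Callable


async def typing_loop_sketch(
    send_typing: Callable[[], Awaitable[None]],  # hypothetical: wraps the chat-action call
    stop_event: asyncio.Event,
    interval: float = 4.0,
) -> None:
    """Re-send the typing action every `interval` seconds until stop_event is set."""
    while not stop_event.is_set():
        await send_typing()
        try:
            # Returns as soon as the event is set, so stopping is immediate
            # instead of waiting out the rest of the interval.
            await asyncio.wait_for(stop_event.wait(), timeout=interval)
        except asyncio.TimeoutError:
            continue  # interval elapsed without a stop: send again
```

Waiting on the event with a timeout, instead of a bare `asyncio.sleep(interval)`, lets `on_output`, `on_error`, or `on_complete` end the loop without a lingering extra indicator.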

**Part D: Update handle_photo() and handle_document()**

Save files to active session folder instead of global images/files directories:

1. **handle_photo():**
   - Get active session directory from session_manager
   - If no active session, prompt user to create one
   - Download highest-quality photo to session directory as `photo_YYYYMMDD_HHMMSS.jpg`
   - Auto-analyze: send message to Claude subprocess: "I've attached a photo: {filename}. {caption or 'Please describe what you see.'}"
   - Start typing indicator while Claude analyzes

2. **handle_document():**
   - Get active session directory from session_manager
   - If no active session, prompt user to create one
   - Download document to session directory with original filename (timestamp prefix for collision avoidance)
   - If caption provided: send caption + "The file {filename} has been saved to your session." to Claude
   - If no caption: send "User uploaded file: {filename}" to Claude (let Claude infer intent from context, per CONTEXT.md decision)
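
The naming and prompt rules above could be factored into small pure helpers, which keeps the handlers thin and makes the rules easy to test. All four function names are hypothetical. The `\n\n` separator between caption and notice, and the `document` fallback name, are assumptions of this sketch.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def photo_filename(now: datetime) -> str:
    """photo_YYYYMMDD_HHMMSS.jpg"""
    return now.strftime("photo_%Y%m%d_%H%M%S.jpg")


def document_filename(original_name: str, now: datetime) -> str:
    """Original name with a timestamp prefix to avoid collisions."""
    # Keep only the final path component so a crafted name cannot
    # escape the session directory.
    safe_name = Path(original_name).name or "document"
    return f"{now.strftime('%Y%m%d_%H%M%S')}_{safe_name}"


def photo_prompt(filename: str, caption: Optional[str]) -> str:
    request = caption or "Please describe what you see."
    return f"I've attached a photo: {filename}. {request}"


def document_prompt(filename: str, caption: Optional[str]) -> str:
    if caption:
        return f"{caption}\n\nThe file {filename} has been saved to your session."
    return f"User uploaded file: {filename}"
```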

**Part E: Update switch_session_cmd() and archive_session_cmd()**

- On session switch: stop typing indicator for current session if running
- On session switch: batcher should flush immediately (don't lose queued messages)
- On archive: terminate subprocess, remove batcher
  </action>
  <verify>
1. `python -c "from message_batcher import MessageBatcher; print('import OK')"` from ~/homelab/telegram/
2. bot.py imports telegram_utils functions and MessageBatcher without errors
3. make_callbacks includes on_tool_use callback
4. handle_message uses typing_indicator_loop
5. handle_photo saves to session directory (not global images/)
6. handle_document saves to session directory (not global files/)
7. MessageBatcher has add_message() method
  </verify>
  <done>
MessageBatcher debounces rapid messages with configurable timer. Bot handlers use typing indicators, progress notifications for tool calls, smart message splitting with MarkdownV2, and file handling saves to session directories with auto-analysis.
  </done>
</task>

<task type="auto">
  <name>Task 2: Create systemd user service for the bot</name>
  <files>~/.config/systemd/user/telegram-bot.service</files>
  <action>
Create or update the systemd user service unit for the Telegram bot.

**Service file at `~/.config/systemd/user/telegram-bot.service`:**

```ini
[Unit]
Description=Homelab Telegram Bot
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
WorkingDirectory=/home/mikkel/homelab/telegram
ExecStart=/home/mikkel/venv/bin/python bot.py
Restart=on-failure
RestartSec=10
KillMode=mixed
KillSignal=SIGTERM
TimeoutStopSec=30

# Environment
Environment=PATH=/home/mikkel/.local/bin:/home/mikkel/bin:/usr/local/bin:/usr/bin:/bin

[Install]
WantedBy=default.target
```

Key settings:
- **KillMode=mixed:** Sends SIGTERM to the main process only, then SIGKILL to any processes still left in the unit's control group (ensures Claude subprocesses are cleaned up)
- **RestartSec=10:** Wait 10s before restart to avoid rapid restart loops
- **TimeoutStopSec=30:** Give bot time to gracefully terminate subprocesses before force kill
- **WorkingDirectory:** Set to telegram/ so sibling imports work

After creating the service file:
```bash
mkdir -p ~/.config/systemd/user
# Write service file
systemctl --user daemon-reload
systemctl --user enable telegram-bot.service
```

Do NOT start the service yet (user will start it after verifying manually).

Also check that lingering is enabled for the `mikkel` user (it allows user services to run without an active login session): `loginctl show-user mikkel -p Linger`. If it reports `Linger=no`, note it as a requirement but do NOT run `loginctl enable-linger` (it may require elevated privileges).
  </action>
  <verify>
1. Service file exists at ~/.config/systemd/user/telegram-bot.service
2. `systemctl --user cat telegram-bot.service` shows the service configuration
3. `systemctl --user is-enabled telegram-bot.service` returns "enabled"
4. Service file has KillMode=mixed and correct WorkingDirectory
5. Check loginctl linger status and report
  </verify>
  <done>
Systemd user service is created and enabled (not started). Bot can be started with `systemctl --user start telegram-bot.service` and survives container restarts (with linger enabled). KillMode=mixed ensures Claude subprocesses are cleaned up on stop.
  </done>
</task>

<task type="checkpoint:human-verify" gate="blocking">
  <what-built>
Complete Telegram-Claude Code bidirectional messaging system:
- Persistent Claude Code subprocess with stream-json I/O (no respawn per turn)
- Typing indicator while Claude processes (re-sent every 4s)
- Tool call progress notifications (e.g., "Reading config.json...")
- Smart message splitting at paragraph/code block boundaries with MarkdownV2
- Message batching for rapid sequential messages (2s debounce)
- Photos/documents saved to session folder with auto-analysis
- Systemd user service for reliability
  </what-built>
  <how-to-verify>
1. Start the bot manually: `cd ~/homelab/telegram && ~/venv/bin/python bot.py`
2. In Telegram, create a session: `/new test-phase2`
3. Send a simple message: "Hello, what can you help me with?"
4. Verify: typing indicator appears while Claude processes
5. Verify: Claude's response arrives formatted properly
6. Send a message that triggers tool use: "Read the file ~/homelab/CLAUDE.md and summarize it"
7. Verify: you see tool call progress notification (e.g., "Reading CLAUDE.md...")
8. Verify: response is natural language summary (not raw code)
9. Send a photo with caption "What is this?"
10. Verify: photo is saved to ~/homelab/telegram/sessions/test-phase2/ and Claude analyzes it
11. Send 3 rapid messages (within 2 seconds): "one", "two", "three"
12. Verify: they are batched into a single Claude prompt
13. Ask a question that produces a response longer than 4096 characters
14. Verify: response splits across multiple messages without broken code blocks
15. Check systemd: `systemctl --user status telegram-bot.service` shows the unit as loaded and enabled
16. Archive test session: `/archive test-phase2`
  </how-to-verify>
  <resume-signal>Type "approved" or describe issues found</resume-signal>
</task>

</tasks>

<verification>
1. End-to-end: Send message in Telegram, receive Claude's response back
2. Typing indicator visible during processing (10-60s range)
3. Tool call notifications appear for Read, Bash, Edit operations
4. Photo attachment saved to session folder and auto-analyzed
5. Document attachment saved to session folder
6. Long response properly split across messages
7. MarkdownV2 formatting renders correctly (bold, code blocks, etc.)
8. Rapid messages batched before sending to Claude
9. Systemd service enabled and configured with KillMode=mixed
10. Session switching stops typing indicator for previous session
</verification>

<success_criteria>
- User sends message in Telegram and receives Claude's response formatted in MarkdownV2
- Typing indicator visible for entire processing duration
- Tool call progress notifications appear
- Photos auto-analyzed, documents saved to session
- Long responses split correctly
- Rapid messages batched
- Systemd service configured and enabled
- Bot survives manual restart test
</success_criteria>

<output>
After completion, create `.planning/phases/02-telegram-integration/02-02-SUMMARY.md`
</output>