docs(03): create phase plan for lifecycle management
Phase 03: Lifecycle Management
- 2 plans in 2 waves
- Plan 01 (wave 1): Idle timer module + session metadata + PID tracking
- Plan 02 (wave 2): Suspend/resume wiring, /timeout, /sessions, startup cleanup, graceful shutdown
- Ready for execution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent 8f7b67a91b
commit 88cd339a54
3 changed files with 449 additions and 4 deletions
|
|
@ -63,10 +63,11 @@ Plans:
|
||||||
3. User can change idle timeout via `/timeout <minutes>` command
|
3. User can change idle timeout via `/timeout <minutes>` command
|
||||||
4. User can list all sessions with last activity timestamp via `/sessions` command
|
4. User can list all sessions with last activity timestamp via `/sessions` command
|
||||||
5. Bot restart leaves no zombie processes (systemd KillMode handles cleanup)
|
5. Bot restart leaves no zombie processes (systemd KillMode handles cleanup)
|
||||||
**Plans**: TBD
|
**Plans:** 2 plans
|
||||||
|
|
||||||
Plans:
|
Plans:
|
||||||
- [ ] TBD
|
- [ ] 03-01-PLAN.md -- Idle timer module + session metadata extensions + PID tracking
|
||||||
|
- [ ] 03-02-PLAN.md -- Suspend/resume wiring, /timeout, /sessions, startup cleanup, graceful shutdown
|
||||||
|
|
||||||
### Phase 4: Output Modes
|
### Phase 4: Output Modes
|
||||||
**Goal**: Users control response verbosity and format based on context
|
**Goal**: Users control response verbosity and format based on context
|
||||||
|
|
@ -84,11 +85,11 @@ Plans:
|
||||||
## Progress
|
## Progress
|
||||||
|
|
||||||
**Execution Order:**
|
**Execution Order:**
|
||||||
Phases execute in numeric order: 1 → 2 → 3 → 4
|
Phases execute in numeric order: 1 -> 2 -> 3 -> 4
|
||||||
|
|
||||||
| Phase | Plans Complete | Status | Completed |
|
| Phase | Plans Complete | Status | Completed |
|
||||||
|-------|----------------|--------|-----------|
|
|-------|----------------|--------|-----------|
|
||||||
| 1. Session & Process Foundation | 3/3 | Complete | 2026-02-04 |
|
| 1. Session & Process Foundation | 3/3 | Complete | 2026-02-04 |
|
||||||
| 2. Telegram Integration | 2/2 | Complete | 2026-02-04 |
|
| 2. Telegram Integration | 2/2 | Complete | 2026-02-04 |
|
||||||
| 3. Lifecycle Management | 0/TBD | Not started | - |
|
| 3. Lifecycle Management | 0/2 | In progress | - |
|
||||||
| 4. Output Modes | 0/TBD | Not started | - |
|
| 4. Output Modes | 0/TBD | Not started | - |
|
||||||
|
|
|
||||||
133
.planning/phases/03-lifecycle-management/03-01-PLAN.md
Normal file
|
|
@ -0,0 +1,133 @@
|
||||||
|
---
|
||||||
|
phase: 03-lifecycle-management
|
||||||
|
plan: 01
|
||||||
|
type: execute
|
||||||
|
wave: 1
|
||||||
|
depends_on: []
|
||||||
|
files_modified:
|
||||||
|
- telegram/idle_timer.py
|
||||||
|
- telegram/session_manager.py
|
||||||
|
- telegram/claude_subprocess.py
|
||||||
|
autonomous: true
|
||||||
|
|
||||||
|
must_haves:
|
||||||
|
truths:
|
||||||
|
- "Per-session idle timer fires callback after configurable timeout seconds"
|
||||||
|
- "Timer resets on activity (cancel + restart)"
|
||||||
|
- "Session metadata includes idle_timeout field (default 600s)"
|
||||||
|
- "ClaudeSubprocess exposes its PID for metadata tracking"
|
||||||
|
artifacts:
|
||||||
|
- path: "telegram/idle_timer.py"
|
||||||
|
provides: "SessionIdleTimer class with asyncio-based per-session idle timers"
|
||||||
|
min_lines: 60
|
||||||
|
- path: "telegram/session_manager.py"
|
||||||
|
provides: "Session metadata with idle_timeout field, PID tracking"
|
||||||
|
contains: "idle_timeout"
|
||||||
|
- path: "telegram/claude_subprocess.py"
|
||||||
|
provides: "PID property for external access"
|
||||||
|
contains: "def pid"
|
||||||
|
key_links:
|
||||||
|
- from: "telegram/idle_timer.py"
|
||||||
|
to: "asyncio.create_task"
|
||||||
|
via: "Background sleep task with cancellation"
|
||||||
|
pattern: "asyncio\\.create_task.*_wait_for_timeout"
|
||||||
|
- from: "telegram/session_manager.py"
|
||||||
|
to: "metadata.json"
|
||||||
|
via: "idle_timeout stored in session metadata"
|
||||||
|
pattern: "idle_timeout"
|
||||||
|
---
|
||||||
|
|
||||||
|
<objective>
|
||||||
|
Create the idle timer module and extend session metadata for lifecycle management.
|
||||||
|
|
||||||
|
Purpose: Foundation components needed before wiring suspend/resume into the bot. The idle timer provides per-session timeout detection, and metadata extensions store timeout configuration and subprocess PIDs.
|
||||||
|
Output: New `idle_timer.py` module, updated `session_manager.py` and `claude_subprocess.py`
|
||||||
|
</objective>
|
||||||
|
|
||||||
|
<execution_context>
|
||||||
|
@/home/mikkel/.claude/get-shit-done/workflows/execute-plan.md
|
||||||
|
@/home/mikkel/.claude/get-shit-done/templates/summary.md
|
||||||
|
</execution_context>
|
||||||
|
|
||||||
|
<context>
|
||||||
|
@.planning/PROJECT.md
|
||||||
|
@.planning/ROADMAP.md
|
||||||
|
@.planning/STATE.md
|
||||||
|
@.planning/phases/03-lifecycle-management/03-CONTEXT.md
|
||||||
|
@.planning/phases/03-lifecycle-management/03-RESEARCH.md
|
||||||
|
@telegram/idle_timer.py (will be created)
|
||||||
|
@telegram/session_manager.py
|
||||||
|
@telegram/claude_subprocess.py
|
||||||
|
</context>
|
||||||
|
|
||||||
|
<tasks>
|
||||||
|
|
||||||
|
<task type="auto">
|
||||||
|
<name>Task 1: Create SessionIdleTimer module</name>
|
||||||
|
<files>telegram/idle_timer.py</files>
|
||||||
|
<action>
|
||||||
|
Create `telegram/idle_timer.py` with a `SessionIdleTimer` class that manages per-session idle timeouts using asyncio.
|
||||||
|
|
||||||
|
Class design:
|
||||||
|
- `__init__(self, session_name: str, timeout_seconds: int, on_timeout: Callable[[str], Awaitable[None]])` -- stores config, initializes _timer_task to None, _last_activity to now (UTC)
|
||||||
|
- `reset(self)` -- updates _last_activity to now, cancels existing _timer_task if running, creates new asyncio.create_task(_wait_for_timeout())
|
||||||
|
- `async _wait_for_timeout(self)` -- awaits asyncio.sleep(self.timeout_seconds), then calls `await self.on_timeout(self.session_name)`. Catches asyncio.CancelledError silently (timer was reset).
|
||||||
|
- `cancel(self)` -- cancels _timer_task if running (used on shutdown/archive)
|
||||||
|
- `@property seconds_since_activity` -- returns float seconds since _last_activity
|
||||||
|
- `@property last_activity` -- returns the datetime of last activity (for /sessions display)
|
||||||
|
|
||||||
|
Use `datetime.now(timezone.utc)` for timestamps. Import typing for Callable, Optional, Awaitable.
|
||||||
|
|
||||||
|
Add module docstring explaining this is the idle timeout manager for session lifecycle. Log timer start/cancel/fire events at DEBUG level, timeout firing at INFO level.
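
A minimal sketch of the class design above, offered as one possible shape rather than the final module. It assumes `reset()` is only called while an event loop is running (`asyncio.create_task` requires one). Clearing `_timer_task` before awaiting the callback is an added assumption, so that a `cancel()` or `reset()` issued from inside `on_timeout` does not cancel the callback itself.

```python
"""Idle timeout manager for session lifecycle (illustrative sketch)."""
import asyncio
import logging
from datetime import datetime, timezone
from typing import Awaitable, Callable, Optional

logger = logging.getLogger(__name__)


class SessionIdleTimer:
    """Awaits on_timeout(session_name) after timeout_seconds without activity."""

    def __init__(
        self,
        session_name: str,
        timeout_seconds: int,
        on_timeout: Callable[[str], Awaitable[None]],
    ) -> None:
        self.session_name = session_name
        self.timeout_seconds = timeout_seconds
        self.on_timeout = on_timeout
        self._timer_task: Optional[asyncio.Task] = None
        self._last_activity = datetime.now(timezone.utc)

    def reset(self) -> None:
        """Record activity and restart the countdown."""
        self._last_activity = datetime.now(timezone.utc)
        self.cancel()
        self._timer_task = asyncio.create_task(self._wait_for_timeout())
        logger.debug("Idle timer started for '%s'", self.session_name)

    async def _wait_for_timeout(self) -> None:
        try:
            await asyncio.sleep(self.timeout_seconds)
        except asyncio.CancelledError:
            return  # Timer was reset or cancelled; nothing to do.
        # Assumption: detach the task first so the callback may call
        # cancel()/reset() without cancelling itself.
        self._timer_task = None
        logger.info("Idle timeout fired for '%s'", self.session_name)
        await self.on_timeout(self.session_name)

    def cancel(self) -> None:
        """Stop the countdown if one is running."""
        if self._timer_task is not None and not self._timer_task.done():
            self._timer_task.cancel()
            logger.debug("Idle timer cancelled for '%s'", self.session_name)
        self._timer_task = None

    @property
    def seconds_since_activity(self) -> float:
        """Seconds elapsed since the last recorded activity."""
        return (datetime.now(timezone.utc) - self._last_activity).total_seconds()

    @property
    def last_activity(self) -> datetime:
        """Timezone-aware UTC datetime of the last recorded activity."""
        return self._last_activity
```

The `reset()` body keeps the `asyncio.create_task(self._wait_for_timeout())` call on one line so the `key_links` pattern in this plan's front matter still matches.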
|
||||||
|
</action>
|
||||||
|
<verify>
|
||||||
|
`python3 -c "from idle_timer import SessionIdleTimer; print('import OK')"` run from telegram/ directory succeeds.
|
||||||
|
</verify>
|
||||||
|
<done>SessionIdleTimer class exists with reset(), cancel(), _wait_for_timeout(), seconds_since_activity, and last_activity. Imports cleanly.</done>
|
||||||
|
</task>
|
||||||
|
|
||||||
|
<task type="auto">
|
||||||
|
<name>Task 2: Extend session metadata and subprocess PID tracking</name>
|
||||||
|
<files>telegram/session_manager.py, telegram/claude_subprocess.py</files>
|
||||||
|
<action>
|
||||||
|
**session_manager.py changes:**
|
||||||
|
|
||||||
|
1. In `create_session()`, add `"idle_timeout": 600` (10 minutes default) to the initial metadata dict (alongside existing fields like name, created, last_active, persona, pid, status).
|
||||||
|
|
||||||
|
2. Add a helper method `get_session_timeout(self, name: str) -> int` that reads metadata and returns `metadata.get('idle_timeout', 600)`. This provides a clean interface for the bot to query timeout values.
|
||||||
|
|
||||||
|
3. No changes to list_sessions() -- it already returns full metadata which will now include idle_timeout.
|
||||||
|
|
||||||
|
**claude_subprocess.py changes:**
|
||||||
|
|
||||||
|
1. Add a `@property pid(self) -> Optional[int]` that returns `self._process.pid if self._process and self._process.returncode is None else None`. This lets the bot store the PID in session metadata for orphan cleanup on restart.
|
||||||
|
|
||||||
|
2. In `start()`, after successful subprocess spawn, store the PID in a `self._pid` attribute as well (for access even after process terminates, useful for logging). Keep the property returning live PID only.
|
||||||
|
|
||||||
|
These are minimal, targeted changes. Do NOT refactor existing code. Do NOT change the terminate() method or any existing logic.
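
As a rough illustration of the two additions above, assuming `ClaudeSubprocess` holds an `asyncio.subprocess.Process` in `self._process`: `ClaudeSubprocessSketch` and the free-standing `get_session_timeout(metadata)` below are hypothetical stand-ins, not the real classes, and the real helper would read the metadata by session name.

```python
import asyncio
from typing import Optional


class ClaudeSubprocessSketch:
    """Hypothetical stand-in showing only the PID-related pieces."""

    def __init__(self) -> None:
        self._process: Optional[asyncio.subprocess.Process] = None
        self._pid: Optional[int] = None  # last known PID, kept for logging

    async def start(self, *argv: str) -> None:
        self._process = await asyncio.create_subprocess_exec(*argv)
        self._pid = self._process.pid  # survives process termination

    @property
    def pid(self) -> Optional[int]:
        """PID of the live process, or None once it has exited."""
        if self._process is not None and self._process.returncode is None:
            return self._process.pid
        return None


def get_session_timeout(metadata: dict) -> int:
    """Mirror of the planned helper, taking the metadata dict directly."""
    return metadata.get("idle_timeout", 600)
```

Returning `None` for an exited process means a stale PID is never written back into session metadata, which matters for the orphan cleanup planned for bot restart.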
|
||||||
|
</action>
|
||||||
|
<verify>
|
||||||
|
`python3 -c "from session_manager import SessionManager; sm = SessionManager(); print('SM OK')"` and `python3 -c "from claude_subprocess import ClaudeSubprocess; print('CS OK')"` both succeed from telegram/ directory.
|
||||||
|
</verify>
|
||||||
|
<done>Session metadata includes idle_timeout (default 600s). SessionManager has get_session_timeout() method. ClaudeSubprocess has pid property returning live process PID.</done>
|
||||||
|
</task>
|
||||||
|
|
||||||
|
</tasks>
|
||||||
|
|
||||||
|
<verification>
|
||||||
|
- `cd ~/homelab/telegram && python3 -c "from idle_timer import SessionIdleTimer; from session_manager import SessionManager; from claude_subprocess import ClaudeSubprocess; print('All imports OK')"`
|
||||||
|
- SessionIdleTimer has reset(), cancel(), seconds_since_activity, last_activity
|
||||||
|
- SessionManager.get_session_timeout() returns int
|
||||||
|
- ClaudeSubprocess.pid returns Optional[int]
|
||||||
|
</verification>
|
||||||
|
|
||||||
|
<success_criteria>
|
||||||
|
- idle_timer.py exists with SessionIdleTimer class implementing asyncio-based per-session idle timeout
|
||||||
|
- session_manager.py creates sessions with idle_timeout=600 in metadata and has get_session_timeout() helper
|
||||||
|
- claude_subprocess.py exposes pid property for PID tracking
|
||||||
|
- All three modules import without errors
|
||||||
|
</success_criteria>
|
||||||
|
|
||||||
|
<output>
|
||||||
|
After completion, create `.planning/phases/03-lifecycle-management/03-01-SUMMARY.md`
|
||||||
|
</output>
|
||||||
311
.planning/phases/03-lifecycle-management/03-02-PLAN.md
Normal file
|
|
@ -0,0 +1,311 @@
|
||||||
|
---
|
||||||
|
phase: 03-lifecycle-management
|
||||||
|
plan: 02
|
||||||
|
type: execute
|
||||||
|
wave: 2
|
||||||
|
depends_on: ["03-01"]
|
||||||
|
files_modified:
|
||||||
|
- telegram/bot.py
|
||||||
|
autonomous: true
|
||||||
|
|
||||||
|
must_haves:
|
||||||
|
truths:
|
||||||
|
- "Session suspends automatically after idle timeout (subprocess terminated, status set to suspended)"
|
||||||
|
- "User message to suspended session resumes it with --continue and shows 'Resuming session...' status"
|
||||||
|
- "Resume failure sends error to user and does not auto-create fresh session"
|
||||||
|
- "Race between timeout-fire and user-message is prevented by asyncio.Lock"
|
||||||
|
- "Bot startup kills orphaned subprocess PIDs and sets all sessions to suspended"
|
||||||
|
- "Bot shutdown terminates all subprocesses gracefully (SIGTERM + 5s timeout + SIGKILL)"
|
||||||
|
- "/timeout <minutes> sets per-session idle timeout (1-120 range)"
|
||||||
|
- "/sessions lists all sessions with status indicator, persona, and last active time"
|
||||||
|
artifacts:
|
||||||
|
- path: "telegram/bot.py"
|
||||||
|
provides: "Suspend/resume wiring, idle timers, /timeout, /sessions, startup cleanup, graceful shutdown"
|
||||||
|
contains: "idle_timers"
|
||||||
|
key_links:
|
||||||
|
- from: "telegram/bot.py"
|
||||||
|
to: "telegram/idle_timer.py"
|
||||||
|
via: "import and instantiate SessionIdleTimer per session"
|
||||||
|
pattern: "from idle_timer import SessionIdleTimer"
|
||||||
|
- from: "telegram/bot.py on_complete callback"
|
||||||
|
to: "idle_timer.reset()"
|
||||||
|
via: "Timer starts after Claude finishes processing"
|
||||||
|
pattern: "idle_timers.*reset"
|
||||||
|
- from: "telegram/bot.py handle_message"
|
||||||
|
to: "resume logic"
|
||||||
|
via: "Detect suspended session, spawn with --continue, send status"
|
||||||
|
pattern: "Resuming session"
|
||||||
|
- from: "telegram/bot.py suspend_session"
|
||||||
|
to: "ClaudeSubprocess.terminate()"
|
||||||
|
via: "Idle timer fires, terminates subprocess"
|
||||||
|
pattern: "await.*terminate"
|
||||||
|
---
|
||||||
|
|
||||||
|
<objective>
|
||||||
|
Wire suspend/resume lifecycle, idle timers, new commands, and cleanup into the bot.
|
||||||
|
|
||||||
|
Purpose: This is the core integration plan that makes sessions automatically suspend after idle timeout, resume transparently on user message, and provides /timeout + /sessions commands. Also adds startup orphan cleanup and graceful shutdown signal handling.
|
||||||
|
Output: Updated `bot.py` with full lifecycle management
|
||||||
|
</objective>
|
||||||
|
|
||||||
|
<execution_context>
|
||||||
|
@/home/mikkel/.claude/get-shit-done/workflows/execute-plan.md
|
||||||
|
@/home/mikkel/.claude/get-shit-done/templates/summary.md
|
||||||
|
</execution_context>
|
||||||
|
|
||||||
|
<context>
|
||||||
|
@.planning/PROJECT.md
|
||||||
|
@.planning/ROADMAP.md
|
||||||
|
@.planning/STATE.md
|
||||||
|
@.planning/phases/03-lifecycle-management/03-CONTEXT.md
|
||||||
|
@.planning/phases/03-lifecycle-management/03-RESEARCH.md
|
||||||
|
@.planning/phases/03-lifecycle-management/03-01-SUMMARY.md
|
||||||
|
@telegram/bot.py
|
||||||
|
@telegram/idle_timer.py
|
||||||
|
@telegram/session_manager.py
|
||||||
|
@telegram/claude_subprocess.py
|
||||||
|
</context>
|
||||||
|
|
||||||
|
<tasks>
|
||||||
|
|
||||||
|
<task type="auto">
|
||||||
|
<name>Task 1: Suspend/resume wiring with race locks, startup cleanup, and graceful shutdown</name>
|
||||||
|
<files>telegram/bot.py</files>
|
||||||
|
<action>
|
||||||
|
This is the core lifecycle wiring in bot.py. Make these changes:
|
||||||
|
|
||||||
|
**New imports and globals:**
|
||||||
|
- `import signal, os` (for shutdown handlers and PID checks)
|
||||||
|
- `from idle_timer import SessionIdleTimer`
|
||||||
|
- Add global dict: `idle_timers: dict[str, SessionIdleTimer] = {}`
|
||||||
|
- Add global dict: `subprocess_locks: dict[str, asyncio.Lock] = {}` (one lock per session, prevents races between timeout-fire and user-message)
|
||||||
|
|
||||||
|
**Helper: get_subprocess_lock(session_name)**
|
||||||
|
- Returns existing lock or creates new one for session. Pattern: `subprocess_locks.setdefault(session_name, asyncio.Lock())`
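
A small sketch of the lock helper, with a hypothetical `demo()` coroutine added only to show that two tasks contending for the same session lock run one after the other. The labels "suspend" and "resume" are illustrative and do not call the real functions.

```python
import asyncio

subprocess_locks: dict[str, asyncio.Lock] = {}


def get_subprocess_lock(session_name: str) -> asyncio.Lock:
    """Return the per-session lock, creating it on first use."""
    return subprocess_locks.setdefault(session_name, asyncio.Lock())


async def demo() -> list[str]:
    """Two coroutines contend for one session lock and are serialized."""
    events: list[str] = []

    async def critical(label: str) -> None:
        async with get_subprocess_lock("demo"):
            events.append(f"{label} start")
            await asyncio.sleep(0.01)  # stands in for terminate() or spawn
            events.append(f"{label} end")

    await asyncio.gather(critical("suspend"), critical("resume"))
    return events
```

Note that `setdefault` constructs a throwaway `asyncio.Lock()` on every call. That is harmless here, but an explicit `if session_name not in subprocess_locks` check avoids it.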
|
||||||
|
|
||||||
|
**Suspend function: `async def suspend_session(session_name: str)`**
|
||||||
|
- This is the idle timer's on_timeout callback.
|
||||||
|
- Acquire the session's subprocess lock.
|
||||||
|
- Check if subprocess exists and is_alive. If not alive, just update metadata and return.
|
||||||
|
- Check `subprocesses[session_name].is_busy` -- if busy, DON'T suspend (Claude is mid-processing). Instead, reset the idle timer to try again later. Log this. Return.
|
||||||
|
- Store the subprocess PID for logging.
|
||||||
|
- Call `await subprocesses[session_name].terminate()` (existing method with SIGTERM + timeout + SIGKILL).
|
||||||
|
- Remove from `subprocesses` dict.
|
||||||
|
- Flush and remove batcher if exists: `if session_name in batchers: await batchers[session_name].flush_immediately(); del batchers[session_name]`
|
||||||
|
- Update session metadata: `session_manager.update_session(session_name, status='suspended', pid=None)`
|
||||||
|
- Cancel and remove idle timer: `if session_name in idle_timers: idle_timers[session_name].cancel(); del idle_timers[session_name]`
|
||||||
|
- Log: `logger.info(f"Session '{session_name}' suspended after idle timeout")`
|
||||||
|
- DECISION (from CONTEXT.md): Silent suspension -- do NOT send any Telegram message.
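
The control flow of the steps above might look roughly like the following. This is a sketch under stated assumptions: `StubSubprocess`, `StubTimer`, and the `metadata` dict are hypothetical stand-ins for `ClaudeSubprocess`, `SessionIdleTimer`, and `session_manager.update_session()`, and the batcher flush step is omitted for brevity.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

# Hypothetical stand-ins for bot.py's module-level state.
subprocesses: dict = {}
idle_timers: dict = {}
subprocess_locks: dict = {}
metadata: dict = {}  # stands in for session_manager.update_session()


class StubSubprocess:
    """Exposes only the attributes the suspend path relies on."""

    def __init__(self, busy: bool = False) -> None:
        self.is_alive = True
        self.is_busy = busy
        self.pid = 4242  # arbitrary illustrative value

    async def terminate(self) -> None:
        self.is_alive = False


class StubTimer:
    """Counts calls instead of scheduling real timers."""

    def __init__(self) -> None:
        self.resets = 0
        self.cancelled = False

    def reset(self) -> None:
        self.resets += 1

    def cancel(self) -> None:
        self.cancelled = True


async def suspend_session(session_name: str) -> None:
    """Idle-timeout callback: terminate the subprocess unless it is busy."""
    lock = subprocess_locks.setdefault(session_name, asyncio.Lock())
    async with lock:
        proc = subprocesses.get(session_name)
        if proc is None or not proc.is_alive:
            metadata[session_name] = {"status": "suspended", "pid": None}
            return
        if proc.is_busy:
            # Claude is mid-processing: retry after another timeout period.
            if session_name in idle_timers:
                idle_timers[session_name].reset()
            logger.info("Session '%s' busy, deferring suspend", session_name)
            return
        pid = proc.pid
        await proc.terminate()
        del subprocesses[session_name]
        metadata[session_name] = {"status": "suspended", "pid": None}
        timer = idle_timers.pop(session_name, None)
        if timer is not None:
            timer.cancel()
        logger.info("Session '%s' (pid %s) suspended after idle timeout",
                    session_name, pid)
```

Because this callback runs inside the timer's own task, a real timer's `cancel()` or `reset()` would target the task that is currently executing. The timer implementation should tolerate being cancelled or reset from within its own callback.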
|
||||||
|
|
||||||
|
**Modify make_callbacks() -- add on_complete idle timer integration:**
|
||||||
|
- The `on_complete` callback already exists. Wrap it: after existing logic (stop typing), add idle timer reset:
|
||||||
|
```python
# Reset idle timer (only start counting AFTER Claude finishes)
if session_name in idle_timers:
    idle_timers[session_name].reset()
```
|
||||||
|
- This ensures timer only starts when Claude is truly idle, never during processing.
|
||||||
|
|
||||||
|
**Modify handle_message() -- add resume logic:**
|
||||||
|
- After checking for active session, BEFORE the subprocess check, add:
|
||||||
|
```python
# Acquire lock to prevent race with suspend_session
lock = get_subprocess_lock(active_session)
async with lock:
    ...  # subprocess get-or-create and message send go here
```
|
||||||
|
Wrap the subprocess get-or-create and message send in this lock.
|
||||||
|
- Inside the lock, when subprocess is not alive:
|
||||||
|
1. Check if session has `.claude/` dir (has history). If yes, this is a resume.
|
||||||
|
2. If resuming: send status message to user: `"Resuming session..."` (include idle duration if >1 min from metadata last_active). Example: `"Resuming session (idle for 15 min)..."`
|
||||||
|
3. Spawn subprocess normally (the existing ClaudeSubprocess constructor + start() already handles --continue when .claude/ exists).
|
||||||
|
4. Store PID in metadata: `session_manager.update_session(active_session, status='active', last_active=now_iso, pid=subprocesses[active_session].pid)`
|
||||||
|
- After sending message (outside lock), create/reset idle timer for the session:
|
||||||
|
```python
timeout_secs = session_manager.get_session_timeout(active_session)
if active_session not in idle_timers:
    idle_timers[active_session] = SessionIdleTimer(active_session, timeout_secs, on_timeout=suspend_session)
# Don't reset here -- timer resets in on_complete when Claude finishes
```
|
||||||
|
- IMPORTANT: Also reset the idle timer when user sends a message (user activity should reset timer too, per CONTEXT.md):
|
||||||
|
```python
if active_session in idle_timers:
    idle_timers[active_session].reset()
```
|
||||||
|
Put this BEFORE sending to subprocess (so timer is reset even if message queues).
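
The resume status text described in the steps above could be built with a small pure function. `resume_status_message` is a hypothetical helper name, and treating naive timestamps as UTC is an assumption about how `last_active` is stored.

```python
from datetime import datetime, timezone
from typing import Optional


def resume_status_message(last_active_iso: Optional[str],
                          now: Optional[datetime] = None) -> str:
    """Build the status text shown before a suspended session is respawned."""
    now = now or datetime.now(timezone.utc)
    if last_active_iso:
        last = datetime.fromisoformat(last_active_iso)
        if last.tzinfo is None:
            # Assumption: naive timestamps in metadata are UTC.
            last = last.replace(tzinfo=timezone.utc)
        idle_seconds = (now - last).total_seconds()
        if idle_seconds > 60:
            return f"Resuming session (idle for {int(idle_seconds // 60)} min)..."
    return "Resuming session..."
```

Keeping the formatting separate from the Telegram call makes the ">1 min" threshold easy to check without a running bot.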
|
||||||
|
|
||||||
|
**Similarly update handle_photo() and handle_document():**
|
||||||
|
- Add the same lock acquisition, resume detection, and idle timer reset as handle_message().
|
||||||
|
- Keep the existing photo/document save and notification logic.
|
||||||
|
|
||||||
|
**Modify new_session() -- initialize idle timer after creation:**
|
||||||
|
- After subprocess creation, add:
|
||||||
|
```python
timeout_secs = session_manager.get_session_timeout(name)
idle_timers[name] = SessionIdleTimer(name, timeout_secs, on_timeout=suspend_session)
```
|
||||||
|
- Store PID in metadata: after subprocess is created/started, `session_manager.update_session(name, pid=subprocesses[name].pid)` (only after start()).
|
||||||
|
Note: The existing code creates ClaudeSubprocess but does NOT call start() -- start happens lazily on first send_message. So PID tracking happens in handle_message when subprocess auto-starts.
|
||||||
|
|
||||||
|
**Modify switch_session_cmd():**
|
||||||
|
- Per CONTEXT.md LOCKED decision: switching sessions leaves previous subprocess running (it suspends on its own timer). Do NOT cancel old session's idle timer.
|
||||||
|
- When auto-spawning subprocess for new session, set up idle timer as above.
|
||||||
|
|
||||||
|
**Modify archive_session_cmd():**
|
||||||
|
- Cancel idle timer if exists: `if name in idle_timers: idle_timers[name].cancel(); del idle_timers[name]`
|
||||||
|
- Remove subprocess lock if exists: `subprocess_locks.pop(name, None)`
|
||||||
|
|
||||||
|
**Modify model_cmd():**
|
||||||
|
- After terminating subprocess for model change, cancel idle timer: `if active_session in idle_timers: idle_timers[active_session].cancel(); del idle_timers[active_session]`
|
||||||
|
|
||||||
|
**Startup cleanup function: `async def cleanup_orphaned_subprocesses()`**
|
||||||
|
- Called once at bot startup (before polling starts).
|
||||||
|
- Iterate all sessions via `session_manager.list_sessions()`.
|
||||||
|
- For each session with a non-None `pid`:
|
||||||
|
1. Check if PID process exists: `os.kill(pid, 0)` wrapped in try/except ProcessLookupError.
|
||||||
|
2. If process exists, verify it's a claude process: read `/proc/{pid}/cmdline`, check if "claude" is in it. If not claude, skip killing.
|
||||||
|
3. If it IS a claude process: `os.kill(pid, signal.SIGTERM)`, sleep 2s, then try `os.kill(pid, signal.SIGKILL)` (catch ProcessLookupError if already dead).
|
||||||
|
4. Update metadata: `session_manager.update_session(session['name'], pid=None, status='suspended')`
|
||||||
|
- For sessions with status != 'suspended' and no pid, also set status to 'suspended'.
|
||||||
|
- Log summary: "Cleaned up N orphaned subprocesses"
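
The PID checks above can be sketched as three small helpers. The function names are hypothetical, the `/proc` lookup is Linux-only, and the `PermissionError` branch is an addition not spelled out in the plan (signal 0 raises it when the PID exists but belongs to another user).

```python
import asyncio
import os
import signal


def pid_exists(pid: int) -> bool:
    """Probe a PID with signal 0, which checks existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but belongs to another user
    return True


def read_cmdline(pid: int) -> str:
    """Return the process command line from /proc (Linux only), or ''."""
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as fh:
            raw = fh.read()
    except OSError:
        return ""
    # Arguments are NUL-separated in /proc/<pid>/cmdline.
    return raw.replace(b"\0", b" ").decode(errors="replace").strip()


async def kill_if_claude(pid: int, grace_seconds: float = 2.0) -> bool:
    """SIGTERM, wait, then SIGKILL a PID, but only if it looks like claude."""
    if not pid_exists(pid) or "claude" not in read_cmdline(pid):
        return False
    try:
        os.kill(pid, signal.SIGTERM)
        await asyncio.sleep(grace_seconds)
        os.kill(pid, signal.SIGKILL)
    except ProcessLookupError:
        pass  # already gone
    return True
```

The command-line check guards against PID reuse: a PID recorded before the restart may now belong to an unrelated process, which must not be killed.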
|
||||||
|
|
||||||
|
**Graceful shutdown:**
|
||||||
|
- python-telegram-bot's `Application.run_polling()` handles signal installation internally. Instead of overriding signal handlers (which conflicts with the library), use the `post_shutdown` callback:
|
||||||
|
```python
async def post_shutdown(application):
    """Clean up subprocesses and timers on bot shutdown."""
    logger.info("Bot shutting down, cleaning up...")

    # Cancel all idle timers
    for name, timer in idle_timers.items():
        timer.cancel()

    # Terminate all subprocesses
    for name, proc in list(subprocesses.items()):
        if proc.is_alive:
            logger.info(f"Terminating subprocess for '{name}'")
            await proc.terminate()

    logger.info("Cleanup complete")
```
|
||||||
|
- Register in main(): `app.post_shutdown = post_shutdown`
|
||||||
|
- Also add a `post_init` callback for startup cleanup:
|
||||||
|
```python
async def post_init(application):
    """Run startup cleanup."""
    await cleanup_orphaned_subprocesses()
```
|
||||||
|
Register: `app = Application.builder().token(TOKEN).post_init(post_init).build()`
|
||||||
|
|
||||||
|
**Update help text:**
|
||||||
|
- Add `/timeout <minutes>` and `/sessions` to the help_command text under "Claude Sessions" section.
|
||||||
|
</action>
|
||||||
|
<verify>
|
||||||
|
`python3 -c "import bot"` from telegram/ directory should not error (syntax check). Look for: idle_timers dict, subprocess_locks dict, suspend_session function, cleanup_orphaned_subprocesses function, post_shutdown callback.
|
||||||
|
</verify>
|
||||||
|
<done>
|
||||||
|
- suspend_session() terminates subprocess on idle timeout, updates metadata to suspended, silent (no Telegram notification)
|
||||||
|
- handle_message() detects suspended session, sends "Resuming session..." status, spawns with --continue
|
||||||
|
- Race lock prevents concurrent suspend + resume on same session
|
||||||
|
- Startup cleanup kills orphaned PIDs verified via /proc/cmdline
|
||||||
|
- Graceful shutdown terminates all subprocesses and cancels all timers
|
||||||
|
- handle_photo/handle_document also support resume from suspended state
|
||||||
|
</done>
|
||||||
|
</task>
|
||||||
|
|
||||||
|
<task type="auto">
|
||||||
|
<name>Task 2: /timeout and /sessions commands</name>
|
||||||
|
<files>telegram/bot.py</files>
|
||||||
|
<action>
|
||||||
|
Add two new command handlers to bot.py:
|
||||||
|
|
||||||
|
**/timeout command: `async def timeout_cmd(update, context)`**
|
||||||
|
- Auth check (same pattern as other commands).
|
||||||
|
- If no active session: reply "No active session. Use /new <name> to start one."
|
||||||
|
- If no args: show current timeout.
|
||||||
|
```python
timeout_secs = session_manager.get_session_timeout(active_session)
minutes = timeout_secs // 60
await update.message.reply_text(f"Idle timeout: {minutes} minutes\n\nUsage: /timeout <minutes> (1-120)")
```
|
||||||
|
- If args: parse first arg as int.
|
||||||
|
- Validate range 1-120. If out of range: `"Timeout must be between 1 and 120 minutes"`
|
||||||
|
- If not a valid int: `"Invalid number. Usage: /timeout <minutes>"`
|
||||||
|
- Convert to seconds: `timeout_seconds = minutes * 60`
|
||||||
|
- Update session metadata: `session_manager.update_session(active_session, idle_timeout=timeout_seconds)`
|
||||||
|
- If idle timer exists for this session, update its timeout_seconds attribute and reset: `idle_timers[active_session].timeout_seconds = timeout_seconds; idle_timers[active_session].reset()`
|
||||||
|
- Reply: `f"Idle timeout set to {minutes} minutes for session '{active_session}'."`
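
The argument handling above can be isolated in a pure function, sketched here. `parse_timeout_arg` is a hypothetical helper name, and returning an `(seconds, error)` pair is one possible convention rather than the plan's.

```python
from typing import Optional, Sequence, Tuple

MIN_MINUTES, MAX_MINUTES = 1, 120


def parse_timeout_arg(args: Sequence[str]) -> Tuple[Optional[int], Optional[str]]:
    """Return (timeout_seconds, None) on success or (None, error_text)."""
    try:
        minutes = int(args[0])
    except (ValueError, IndexError):
        return None, "Invalid number. Usage: /timeout <minutes>"
    if not MIN_MINUTES <= minutes <= MAX_MINUTES:
        return None, "Timeout must be between 1 and 120 minutes"
    return minutes * 60, None
```

The handler would then reply with the error text when one is returned, and otherwise update the metadata and the live timer with the returned seconds.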
|
||||||
|
|
||||||
|
**/sessions command: `async def sessions_cmd(update, context)`**
|
||||||
|
- Auth check.
|
||||||
|
- Get all sessions: `session_manager.list_sessions()` (already sorted by last_active desc).
|
||||||
|
- If empty: reply "No sessions. Use /new <name> to create one."
|
||||||
|
- Build formatted list. For each session:
|
||||||
|
- Status indicator: check the real subprocess state first: `name in subprocesses and subprocesses[name].is_alive` -> "LIVE". Otherwise fall back to metadata: status == "active" -> "ACTIVE", status == "suspended" -> "IDLE", else -> the raw status value
|
||||||
|
- Format last_active as relative time (e.g., "2m ago", "1h ago", "3d ago") using a small helper function:
|
||||||
|
```python
from datetime import datetime, timezone

def format_relative_time(iso_str):
    dt = datetime.fromisoformat(iso_str)
    if dt.tzinfo is None:
        # Assumption: naive timestamps in metadata are UTC
        dt = dt.replace(tzinfo=timezone.utc)
    delta = datetime.now(timezone.utc) - dt
    secs = delta.total_seconds()
    if secs < 60: return "just now"
    if secs < 3600: return f"{int(secs/60)}m ago"
    if secs < 86400: return f"{int(secs/3600)}h ago"
    return f"{int(secs/86400)}d ago"
```
|
||||||
|
- Mark current active session with arrow prefix.
|
||||||
|
- Format line: `"{marker}{status_emoji} {name} ({persona}) - {relative_time}"`
|
||||||
|
- Status emojis: LIVE -> green circle, IDLE/suspended -> white circle
|
||||||
|
- Join lines, reply with parse_mode='Markdown'. Use backticks around session names for monospace.
|
||||||
|
|
||||||
|
**Register handlers in main():**
|
||||||
|
- `app.add_handler(CommandHandler("timeout", timeout_cmd))` -- after the model handler
|
||||||
|
- `app.add_handler(CommandHandler("sessions", sessions_cmd))` -- after the session handler
|
||||||
|
|
||||||
|
**Update help text in help_command():**
|
||||||
|
- Under "Claude Sessions" section, add:
|
||||||
|
- `/sessions` - List all sessions with status
|
||||||
|
- `/timeout <minutes>` - Set idle timeout (1-120)
|
||||||
|
</action>
|
||||||
|
<verify>
|
||||||
|
`python3 -c "import bot; print('OK')"` succeeds. Grep for "timeout_cmd" and "sessions_cmd" in bot.py to confirm both exist. Grep for "CommandHandler.*timeout" and "CommandHandler.*sessions" to confirm registration.
|
||||||
|
</verify>
|
||||||
|
<done>
|
||||||
|
- /timeout shows current timeout when called without args, sets timeout (1-120 min range) when called with arg
|
||||||
|
- /sessions lists all sessions sorted by last active, showing live/idle status, persona, relative time
|
||||||
|
- Both commands registered as handlers in main()
|
||||||
|
- Help text updated with new commands
|
||||||
|
</done>
|
||||||
|
</task>
|
||||||
|
|
||||||
|
</tasks>
|
||||||
|
|
||||||
|
<verification>
|
||||||
|
1. `cd ~/homelab/telegram && python3 -c "import bot; print('All OK')"` -- no import errors
|
||||||
|
2. Grep for key integration points:
|
||||||
|
- `grep -n "suspend_session" telegram/bot.py` -- suspend function exists
|
||||||
|
- `grep -n "idle_timers" telegram/bot.py` -- idle timer dict used
|
||||||
|
- `grep -n "subprocess_locks" telegram/bot.py` -- race locks exist
|
||||||
|
- `grep -n "cleanup_orphaned" telegram/bot.py` -- startup cleanup exists
|
||||||
|
- `grep -n "post_shutdown" telegram/bot.py` -- graceful shutdown exists
|
||||||
|
- `grep -n "Resuming session" telegram/bot.py` -- resume status message exists
|
||||||
|
- `grep -n "timeout_cmd\|sessions_cmd" telegram/bot.py` -- new commands exist
|
||||||
|
3. Restart bot service: `systemctl --user restart telegram-bot.service && sleep 2 && systemctl --user status telegram-bot.service` -- should show active
|
||||||
|
</verification>
|
||||||
|
|
||||||
|
<success_criteria>
|
||||||
|
- Session auto-suspends after idle timeout (subprocess terminated, metadata status=suspended, no Telegram notification)
|
||||||
|
- Message to suspended session shows "Resuming session..." then Claude responds with full history
|
||||||
|
- If resume fails, error message sent (no auto-fresh-start)
|
||||||
|
- asyncio.Lock prevents race between timeout-fire and incoming message
|
||||||
|
- Bot startup kills orphaned subprocess PIDs (verified via /proc/cmdline)
|
||||||
|
- Bot shutdown terminates all subprocesses gracefully
|
||||||
|
- /timeout <minutes> sets per-session idle timeout (1-120 range), shows current value without args
|
||||||
|
- /sessions lists all sessions with LIVE/IDLE status, persona, and relative last-active time
|
||||||
|
- Help text includes new commands
|
||||||
|
- Bot service restarts cleanly
|
||||||
|
</success_criteria>
|
||||||
|
|
||||||
|
<output>
|
||||||
|
After completion, create `.planning/phases/03-lifecycle-management/03-02-SUMMARY.md`
|
||||||
|
</output>
|
||||||