# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Purpose
This is the management container (VMID 102) for Mikkel's homelab infrastructure. Claude Code operates here to assist with homelab management, automation, and maintenance tasks.
## Environment
- **Container:** LXC on Proxmox VE (core.georgsen.dk)
- **Network Access:** vmbr1 (10.5.0.0/24 internal), Tailscale
- **SSH Keys:** Pre-installed for accessing other containers/VMs
- **User:** mikkel (UID 1000, group georgsen GID 1000)
- **Python venv:** ~/venv (activate with `source ~/venv/bin/activate`)
- **Helper scripts:** ~/bin (pve, npm-api, dns, pbs, beszel, kuma, mail, telegram, updates)
- **Git repos:** ~/repos
- **Shared storage:** ~/stuff (ZFS bind mount, shared across containers, SMB accessible)
## Living Documentation
**`homelab-documentation.md`** is the authoritative reference for all infrastructure details. Keep it current:
- Update when infrastructure changes are made
- Update when new services/containers are added
- Update when configurations change
- Update IP addresses, ports, and service mappings as they evolve
## Network Topology
```
Internet ─► vmbr0 (65.108.14.165) ─► NPM (10.5.0.1) ─► Internal services
├─ vmbr1: 10.5.0.0/24
└─ vmbr2: 10.9.1.0/24 (Hetzner vSwitch)
Tailscale mesh connects: PBS, Synology NAS, pve01, pve02, dev containers
```
## Key Infrastructure
| Service | IP | Access |
|---------|-----|--------|
| NPM (reverse proxy) | 10.5.0.1 | Admin :81 |
| DNS (Technitium) | 10.5.0.2 | :5380 or dns.georgsen.dk |
| PBS (backups) | 10.5.0.6 | :8007 or pbs.georgsen.dk |
| Dockge (docker mgmt) | 10.5.0.10 | :5001 |
| Forgejo (git) | 10.5.0.14 | :3000 or git.georgsen.dk |
| Tailscale relay | 10.5.0.x | Routes to 10.9.0.0/16 |
## PVE API Access
The `~/bin/pve` helper script provides API access to Proxmox:
```bash
~/bin/pve list # List all VMs/containers
~/bin/pve status <vmid> # Show status
~/bin/pve start <vmid> # Start VM/container
~/bin/pve stop <vmid> # Stop VM/container
~/bin/pve create-ct <vmid> <hostname> <ip> <disk_gb> # Create container
```
## NPM API Access
The `~/bin/npm-api` script manages Nginx Proxy Manager:
```bash
~/bin/npm-api --host-list # List proxy hosts
~/bin/npm-api --host-search <domain> # Search by domain
~/bin/npm-api --host-create <domain> -i <ip> -p <port> # Create proxy host
~/bin/npm-api --host-delete <id> # Delete proxy host
~/bin/npm-api --cert-list # List SSL certs
```
Note: SSL cert generation requires manual setup via web UI (http://10.5.0.1:81)
## DNS API Access
The `~/bin/dns` script manages Technitium DNS (internal zone: lab.georgsen.dk):
```bash
~/bin/dns list # List all zones
~/bin/dns records [zone] # List records in zone
~/bin/dns add <name> <ip> [zone] # Add A record (e.g., dns add myhost 10.5.0.50)
~/bin/dns delete <name> [zone] # Delete A record
~/bin/dns lookup <name> # Query DNS
```
## PBS (Proxmox Backup Server) Access
The `~/bin/pbs` script shows backup status and statistics:
```bash
~/bin/pbs status # Overview: datastores, storage, dedup, tasks
~/bin/pbs backups [ns] # Last backup per VM/CT per namespace
~/bin/pbs tasks [hours] # Show recent tasks (default: 24h)
~/bin/pbs errors [hours] # Show only failures (default: 72h)
~/bin/pbs gc # Garbage collection status
~/bin/pbs snapshots [ns] # List recent snapshots
~/bin/pbs storage # Detailed storage/dedup stats
```
## Beszel Dashboard Access
The `~/bin/beszel` script manages the server monitoring dashboard:
```bash
~/bin/beszel list # List all systems
~/bin/beszel status # Show system metrics (CPU, RAM, disk)
~/bin/beszel add <name> <host> [port] # Add a system
~/bin/beszel delete <id> # Delete a system
```
Dashboard URL: https://dashboard.georgsen.dk
## Uptime Kuma API Access
The `~/bin/kuma` script manages Uptime Kuma monitors:
```bash
~/bin/kuma list # List all monitors
~/bin/kuma info <id> # Show monitor details
~/bin/kuma add-http <name> <url> # Add HTTP monitor
~/bin/kuma add-port <name> <host> <port> # Add TCP port monitor
~/bin/kuma add-ping <name> <host> # Add ping monitor
~/bin/kuma delete <id> # Delete monitor
~/bin/kuma pause <id> # Pause monitor
~/bin/kuma resume <id> # Resume monitor
```
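The `dns`, `npm-api`, and `kuma` helpers documented above can be chained when publishing a new service. The following is a minimal sketch, not an existing script: the function name `onboarding_commands` and the `https://` monitor URL are assumptions, and it only builds the argument lists so they can be reviewed before anything runs.

```python
from pathlib import Path

# Helper scripts live in ~/bin, as listed under Environment.
BIN = Path.home() / "bin"


def onboarding_commands(name: str, ip: str, port: int, domain: str) -> list:
    """Return the helper invocations for a new service, in execution order.

    Hypothetical helper: the command syntax is copied from the usage
    lines in this file, nothing is executed here.
    """
    # Assumption: the service is monitored over HTTPS on its public domain.
    url = f"https://{domain}"
    return [
        # Internal A record in the default zone
        [str(BIN / "dns"), "add", name, ip],
        # Reverse-proxy host pointing at the container
        [str(BIN / "npm-api"), "--host-create", domain, "-i", ip, "-p", str(port)],
        # HTTP monitor for the public URL
        [str(BIN / "kuma"), "add-http", name, url],
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once reviewed. The SSL certificate still has to be issued manually in the NPM web UI before the HTTPS monitor can succeed.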
## Stalwart Mail Server
The `~/bin/mail` script manages Stalwart Mail Server (VM 200, 65.108.14.164):
```bash
~/bin/mail list # List all mail accounts
~/bin/mail info <email> # Show account details
~/bin/mail create <email> <password> [name] # Create new mail account
~/bin/mail delete <email> # Delete mail account
~/bin/mail passwd <email> <password> # Change account password
~/bin/mail domains # List configured domains
~/bin/mail status # Show server status/version
```
**Active domain:** datalos.dk
**Admin UI:** https://mail.georgsen.dk
**Webmail:** https://webmail.georgsen.dk (Snappymail on Dockge)
**Credentials:** `~/homelab/stalwart/credentials`
## Service Updates
The `~/bin/updates` script checks for and applies updates across all homelab services:
```bash
~/bin/updates check # Check all services for available updates
~/bin/updates update <name|all> [-y] # Update one or more services
```
**Tracked services:** dragonfly, beszel, uptime-kuma, snappymail, stalwart, dockge, npm, forgejo, dns, pbs
The script compares Docker image versions (Dockge + NPM), LXC service binaries (Forgejo, Technitium DNS), and apt packages (PBS) against GitHub/Codeberg releases.
## Telegram Bot
Two-way interactive bot for homelab management and communication with Claude.
**Bot:** @georgsen_homelab_bot
**Commands (in Telegram):**
- `/new <name> [persona]` - Create new Claude session
- `/session <name>` - Switch to a session
- `/sessions` - List all sessions with status
- `/model <name>` - Switch model (sonnet/opus/haiku or full ID). Persisted per session.
- `/timeout <minutes>` - Set idle timeout (1-120 min, default 10)
- `/archive <name>` - Archive and remove a session
- `/status` - Quick service overview (ping check)
- `/pbs` - PBS backup status
- `/backups` - Last backup per VM/CT
- `/beszel` - Server metrics
- `/kuma` - Uptime Kuma status
- `/ping <host>` - Ping a host
**CLI helper (`~/bin/telegram`):**
```bash
telegram send "message" # Send message to admin
telegram inbox # Read messages from admin
telegram clear # Clear inbox
telegram status # Check if bot is running
```
**Features:**
- Text messages saved to inbox for Claude to read
- Photos saved to `~/homelab/telegram/images/`
- Files saved to `~/homelab/telegram/files/`
- Runs as systemd user service (`telegram-bot.service`)
## Shared Storage
ZFS dataset on PVE host, bind-mounted into containers:
```
PVE: rpool/shared/mikkel → /shared/mikkel/stuff
├── mgmt (102): ~/stuff (backup=1)
├── dev (111): ~/stuff
└── general (113): ~/stuff
```
**SMB Access (Windows):** `\\mgmt\stuff` via Tailscale MagicDNS
**Notes:**
- UID mapping: container UID 1000 = host UID 101000 (unprivileged)
- Only mgmt has `backup=1` to avoid duplicate PBS backups
- ZFS dataset survives PVE reinstalls
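The UID mapping in the notes above is a fixed offset. A minimal sketch of the arithmetic, assuming the Proxmox default unprivileged mapping (offset 100000, range 65536); `host_id` is a hypothetical helper name, and a container with a custom `lxc.idmap` would need different values:

```python
# Assumption: default Proxmox unprivileged mapping (root:100000:65536).
ID_OFFSET = 100000
ID_RANGE = 65536


def host_id(container_id: int, offset: int = ID_OFFSET, id_range: int = ID_RANGE) -> int:
    """Translate a UID or GID inside the container to the ID seen on the PVE host."""
    if not 0 <= container_id < id_range:
        raise ValueError(f"ID {container_id} is outside the mapped range 0..{id_range - 1}")
    return offset + container_id
```

For example, `host_id(1000)` returns `101000`, which is the owner to use when fixing permissions on `/shared/mikkel/stuff` from the PVE host.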
## Common SSH Targets
```bash
ssh root@10.5.0.1 # NPM
ssh root@10.5.0.2 # DNS
ssh root@10.5.0.6 # PBS
ssh root@10.5.0.10 # Dockge
ssh root@10.5.0.14 # Forgejo
ssh mikkel@10.5.0.111 # dev container
```
## Important IPs
- **Home IP:** 83.89.248.247 (static, used for NPM access lists)
- **Public IP:** 65.108.14.165 (core.georgsen.dk)
## Security
- **Home IP:** 83.89.248.247 (whitelisted everywhere)
- **NPM Access List "home_only" (ID 1):** Restricts access to home IP only
  - Applied to: dns.georgsen.dk, dockge.georgsen.dk, pbs.georgsen.dk
- **Fail2ban:** Running on PVE host (core) and Forgejo
  - SSH jail on core, forgejo jail on Forgejo
  - Bans after 5 failed attempts for 24 hours
  - Whitelisted: 127.0.0.1, 10.5.0.0/24, 83.89.248.247
- **Firewall (core vmbr0):** Blocked ports: 53, 111, 3128, 8006, 8008 (home IP allowed)
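When debugging an unexpected ban or block, it helps to confirm whether an address falls inside the Fail2ban whitelist listed above. A minimal sketch using only the standard library; `is_whitelisted` is a hypothetical helper, and the entries are copied from this section rather than read from the live Fail2ban configuration:

```python
import ipaddress

# Entries copied from the Security section; update if the jail config changes.
WHITELIST = ("127.0.0.1", "10.5.0.0/24", "83.89.248.247")


def is_whitelisted(address: str, entries=WHITELIST) -> bool:
    """Return True if the address matches any whitelisted host or network."""
    ip = ipaddress.ip_address(address)
    # ip_network() treats a bare address as a single-host (/32) network.
    return any(ip in ipaddress.ip_network(entry) for entry in entries)
```

Addresses on vmbr1 and the home IP match; the public IP of core and addresses on the Hetzner vSwitch do not.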
## Container Management
**Update NPM:**
```bash
ssh root@10.5.0.1 'cd /opt/npm && docker compose pull && docker compose up -d'
```
**Enable ping in unprivileged containers:**
Unprivileged LXC containers drop the `cap_net_raw` capability, so ping fails with "Operation not permitted". Fix this by granting the capability to the ping binary:
```bash
# Run inside the container as root
setcap cap_net_raw+ep /bin/ping
# Or from PVE host
ssh root@10.5.0.254 'pct exec <vmid> -- setcap cap_net_raw+ep /bin/ping'
```
Note: Must be re-applied after `iputils-ping` package upgrades.
**Tailscale on LXC containers:**
When setting up Tailscale with `--ssh` on an unprivileged LXC container:
1. Stop the container and add TUN device access to `/etc/pve/lxc/<vmid>.conf` on the PVE host:
```
lxc.cgroup2.devices.allow: c 10:200 rwm
lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file
```
2. Start the container, install and enable Tailscale:
```bash
curl -fsSL https://tailscale.com/install.sh | sh
systemctl enable --now tailscaled
tailscale up --ssh
```
3. Move local SSH to port 2222 (Tailscale SSH takes port 22):
```bash
# Update sshd_config (matches both the commented default and an active "Port 22" line)
sed -i -E 's/^#?Port 22[[:space:]]*$/Port 2222/' /etc/ssh/sshd_config
# Override ssh.socket (Ubuntu 24.04 uses socket activation)
# ListenStream= clears defaults, then bind explicitly to IPv4
mkdir -p /etc/systemd/system/ssh.socket.d
cat > /etc/systemd/system/ssh.socket.d/override.conf << EOF
[Socket]
ListenStream=
ListenStream=0.0.0.0:2222
EOF
systemctl daemon-reload
systemctl restart ssh.socket ssh.service
```
After setup: local SSH via `ssh -p 2222 user@<ip>`, Tailscale SSH via `ssh user@<hostname>`.
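Step 1 is easy to get wrong, and Tailscale will not start without the TUN device. The sketch below checks the text of a container config for the two required lines before the container is started. `missing_tun_lines` is a hypothetical helper; it assumes the config is read from `/etc/pve/lxc/<vmid>.conf` on the PVE host and passed in as a string.

```python
# The two lines from step 1, exactly as they must appear in the container config.
REQUIRED_TUN_LINES = (
    "lxc.cgroup2.devices.allow: c 10:200 rwm",
    "lxc.mount.entry: /dev/net/tun dev/net/tun none bind,create=file",
)


def missing_tun_lines(conf_text: str) -> list:
    """Return the required TUN lines that are absent from the config text."""
    # Collapse runs of whitespace so spacing differences do not cause false alarms.
    present = {" ".join(line.split()) for line in conf_text.splitlines()}
    return [line for line in REQUIRED_TUN_LINES if line not in present]
```

An empty list means both entries are in place; otherwise the returned lines are the ones still to be added.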
## CRITICAL: Software Versions
**NEVER use version numbers from training data.** Always fetch the latest version dynamically:
```bash
# GitHub releases - get latest tag
curl -s https://api.github.com/repos/OWNER/REPO/releases/latest | jq -r .tag_name
# Or check the project's download page/API
```
Training data is outdated the moment it's created. Hardcoding versions like `v1.27.1` when the latest is `v1.30.0` is unacceptable. Always query the source.
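The same lookup can be done from Python when a script needs the version. This is a minimal sketch using only the standard library and the same GitHub endpoint as the `curl` command above; the function names are illustrative, and unauthenticated requests are subject to GitHub's rate limits.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def latest_release_url(owner: str, repo: str) -> str:
    """Build the GitHub API URL for a repository's latest release."""
    return f"{GITHUB_API}/repos/{owner}/{repo}/releases/latest"


def parse_tag(payload: str) -> str:
    """Extract tag_name from the JSON body returned by the releases endpoint."""
    return json.loads(payload)["tag_name"]


def fetch_latest_tag(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Query GitHub at call time instead of relying on a remembered version."""
    request = urllib.request.Request(
        latest_release_url(owner, repo),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_tag(response.read().decode("utf-8"))
```

Call `fetch_latest_tag("OWNER", "REPO")` with the real owner and repository, and build download URLs from the returned tag rather than from a hardcoded string.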
## User Preferences
- Python and Batch for scripting
- 256-color terminal retro aesthetic for UIs
- Ask clarifying questions rather than making assumptions
- Prefer understanding root causes over workarounds