debate/.planning/phases/01-core-infrastructure-security/01-05-PLAN.md

Commit 262a32673b by Mikkel Georgsen (2026-01-25 19:59:49 +00:00): docs(01): create phase plan

Phase 01: Core Infrastructure & Security
- 5 plans in 3 waves
- 3 parallel (Wave 1-2), 1 sequential (Wave 3)
- Ready for execution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>


---
phase: 01-core-infrastructure-security
plan: 05
type: execute
wave: 3
depends_on:
  - 01-01
  - 01-02
files_modified:
  - backend/app/services/__init__.py
  - backend/app/services/sandbox.py
  - backend/app/services/deterministic.py
  - backend/app/services/build.py
  - scripts/setup-sandbox.sh
  - tests/test_deterministic.py
autonomous: true
must_haves:
  truths:
    - Sandbox creates isolated systemd-nspawn container
    - Build commands execute with no network access
    - Same configuration produces identical hash
    - SOURCE_DATE_EPOCH is set for all builds
  artifacts:
    - path: backend/app/services/sandbox.py
      provides: systemd-nspawn sandbox management
      contains: systemd-nspawn
    - path: backend/app/services/deterministic.py
      provides: Deterministic build configuration
      contains: SOURCE_DATE_EPOCH
    - path: backend/app/services/build.py
      provides: Build orchestration service
      contains: class BuildService
    - path: scripts/setup-sandbox.sh
      provides: Sandbox environment initialization
      contains: pacstrap
  key_links:
    - from: backend/app/services/build.py
      to: backend/app/services/sandbox.py
      via: BuildSandbox import
      pattern: from.*sandbox import
    - from: backend/app/services/build.py
      to: backend/app/services/deterministic.py
      via: DeterministicBuildConfig import
      pattern: from.*deterministic import
---
Implement systemd-nspawn build sandbox with deterministic configuration for reproducible ISO builds.

Purpose: Ensure ISO builds are isolated from host (ISO-04) and produce identical output for same input (determinism for caching). Output: Sandbox service that creates isolated containers, deterministic build configuration with hash generation.
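The caching contract stated above (same configuration, same hash, cache hit) can be sketched in miniature. The function name `config_cache_key` and the normalization shown are illustrative, not the service's final API:

```python
import hashlib
import json

def config_cache_key(config: dict) -> str:
    # Illustrative normalization: sort and deduplicate packages so that
    # semantically identical configs serialize to identical JSON.
    normalized = {
        "packages": sorted(set(config.get("packages", []))),
        "locale": config.get("locale", "en_US.UTF-8"),
    }
    blob = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode()).hexdigest()

a = config_cache_key({"packages": ["vim", "git", "base"]})
b = config_cache_key({"packages": ["base", "git", "vim", "git"]})
assert a == b  # order and duplicates do not change the cache key
```

Because the key depends only on normalized content, two users requesting the same package set reuse one cached ISO.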

<execution_context> @/home/mikkel/.claude/get-shit-done/workflows/execute-plan.md @/home/mikkel/.claude/get-shit-done/templates/summary.md </execution_context>

Context:
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/phases/01-core-infrastructure-security/01-RESEARCH.md (Pattern 4: systemd-nspawn Build Sandbox, Pattern 5: Deterministic Build Configuration)
@.planning/phases/01-core-infrastructure-security/01-CONTEXT.md (Sandbox Strictness, Determinism Approach decisions)
@.planning/phases/01-core-infrastructure-security/01-01-SUMMARY.md
@.planning/phases/01-core-infrastructure-security/01-02-SUMMARY.md

Task 1: Create sandbox setup script and sandbox service

Files: scripts/setup-sandbox.sh, backend/app/services/__init__.py, backend/app/services/sandbox.py

Create scripts/setup-sandbox.sh:
```bash
#!/bin/bash
# Initialize sandbox environment for ISO builds
# Run once to create base container image

set -euo pipefail

SANDBOX_ROOT="${SANDBOX_ROOT:-/var/lib/debate/sandbox}"
SANDBOX_BASE="${SANDBOX_ROOT}/base"
# $repo and $arch are pacman mirrorlist placeholders; escape them so they are
# written literally (unescaped, set -u would abort on the unset variables).
ALLOWED_MIRRORS=(
    "https://geo.mirror.pkgbuild.com/\$repo/os/\$arch"
    "https://mirror.cachyos.org/repo/\$arch/\$repo"
)

log() { echo "[$(date '+%Y-%m-%d %H:%M:%S')] $1"; }

# Check prerequisites

if ! command -v pacstrap &> /dev/null; then
    log "ERROR: pacstrap not found. Install arch-install-scripts package."
    exit 1
fi

if ! command -v systemd-nspawn &> /dev/null; then
    log "ERROR: systemd-nspawn not found. Install systemd-container package."
    exit 1
fi

# Create sandbox directories

log "Creating sandbox directories..."
mkdir -p "$SANDBOX_ROOT"/{base,builds,cache}

# Bootstrap base Arch environment

if [ ! -d "$SANDBOX_BASE/usr" ]; then
    log "Bootstrapping base Arch Linux environment..."
    pacstrap -c -G -M "$SANDBOX_BASE" base archiso

# Configure mirrors (whitelist only)
log "Configuring mirrors..."
MIRRORLIST="$SANDBOX_BASE/etc/pacman.d/mirrorlist"
: > "$MIRRORLIST"
for mirror in "${ALLOWED_MIRRORS[@]}"; do
    echo "Server = $mirror" >> "$MIRRORLIST"
done

# Set fixed locale for determinism
echo "en_US.UTF-8 UTF-8" > "$SANDBOX_BASE/etc/locale.gen"
systemd-nspawn -D "$SANDBOX_BASE" locale-gen

log "Base environment created at $SANDBOX_BASE"

else
    log "Base environment already exists at $SANDBOX_BASE"
fi

log "Sandbox setup complete"
```

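The sandbox service created in this task drives systemd-nspawn with a fixed set of isolation flags. The helper below is a hypothetical standalone sketch of that flag assembly (all names are illustrative, not part of the plan's files), useful for sanity-checking that the security-critical switches are always present:

```python
from pathlib import Path

def nspawn_isolation_flags(container: Path, epoch: int) -> list[str]:
    """Hypothetical sketch of the isolation flag set the sandbox service builds."""
    return [
        "systemd-nspawn",
        f"--directory={container}",
        "--private-network",       # no network inside the build
        "--read-only",             # immutable root filesystem
        "--tmpfs=/tmp:mode=1777",  # writable scratch space only
        f"--setenv=SOURCE_DATE_EPOCH={epoch}",
        "--setenv=LC_ALL=C",
        "--setenv=TZ=UTC",
    ]

cmd = nspawn_isolation_flags(Path("/var/lib/debate/sandbox/builds/example"), 1600000000)
assert "--private-network" in cmd and "--read-only" in cmd
```

A check like this can live in a unit test so a refactor of the command builder cannot silently drop an isolation flag.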
Create backend/app/services/__init__.py:
- Empty or import key services

Create backend/app/services/sandbox.py:
```python
"""
systemd-nspawn sandbox for isolated ISO builds.

Security measures:
- --private-network: No network access (packages pre-cached in base)
- --read-only: Immutable root filesystem
- --tmpfs: Writable temp directories only
- --capability: Minimal capabilities for mkarchiso
- Resource limits: 8GB RAM, 4 cores (from CONTEXT.md)
"""

import asyncio
import shutil
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

from app.core.config import settings


@dataclass
class SandboxConfig:
    """Configuration for sandbox execution."""
    memory_limit: str = "8G"
    cpu_quota: str = "400%"  # 4 cores
    timeout_seconds: int = 1200  # 20 minutes (with 15min warning)
    warning_seconds: int = 900  # 15 minutes


class BuildSandbox:
    """Manages systemd-nspawn sandboxed build environments."""

    def __init__(
        self,
        sandbox_root: Optional[Path] = None,
        config: Optional[SandboxConfig] = None
    ):
        self.sandbox_root = sandbox_root or Path(settings.sandbox_root)
        self.base_path = self.sandbox_root / "base"
        self.builds_path = self.sandbox_root / "builds"
        self.config = config or SandboxConfig()

    async def create_build_container(self, build_id: str) -> Path:
        """
        Create isolated container for a specific build.
        Uses overlay filesystem on base for efficiency.
        """
        container_path = self.builds_path / build_id
        if container_path.exists():
            shutil.rmtree(container_path)
        container_path.mkdir(parents=True)

        # Copy base (in production, use overlayfs for efficiency)
        # For now, simple copy is acceptable
        proc = await asyncio.create_subprocess_exec(
            "cp", "-a", str(self.base_path) + "/.", str(container_path),
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE
        )
        await proc.wait()
        if proc.returncode != 0:
            raise RuntimeError(f"Failed to copy base environment for build {build_id}")

        return container_path

    async def run_build(
        self,
        container_path: Path,
        profile_path: Path,
        output_path: Path,
        source_date_epoch: int
    ) -> tuple[int, str, str]:
        """
        Execute archiso build in sandboxed container.

        Returns:
            Tuple of (return_code, stdout, stderr)
        """
        output_path.mkdir(parents=True, exist_ok=True)

        nspawn_cmd = [
            "systemd-nspawn",
            f"--directory={container_path}",
            "--private-network",  # No network access
            "--read-only",  # Immutable root
            "--tmpfs=/tmp:mode=1777",
            "--tmpfs=/var/tmp:mode=1777",
            f"--bind={profile_path}:/build/profile:ro",
            f"--bind={output_path}:/build/output",
            f"--setenv=SOURCE_DATE_EPOCH={source_date_epoch}",
            "--setenv=LC_ALL=C",
            "--setenv=TZ=UTC",
            "--capability=CAP_SYS_ADMIN",  # Required for mkarchiso
            "--console=pipe",
            "--quiet",
            "--",
            "mkarchiso",
            "-v",
            "-r",  # Remove work directory after build
            "-w", "/tmp/archiso-work",
            "-o", "/build/output",
            "/build/profile"
        ]

        proc = await asyncio.create_subprocess_exec(
            *nspawn_cmd,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE
        )

        try:
            stdout, stderr = await asyncio.wait_for(
                proc.communicate(),
                timeout=self.config.timeout_seconds
            )
            return proc.returncode, stdout.decode(), stderr.decode()
        except asyncio.TimeoutError:
            proc.kill()
            await proc.wait()  # reap the killed process before returning
            return -1, "", f"Build timed out after {self.config.timeout_seconds} seconds"

    async def cleanup_container(self, container_path: Path):
        """Remove container after build."""
        if container_path.exists():
            shutil.rmtree(container_path)
```

Run:
```bash
cd /home/mikkel/repos/debate
ruff check backend/app/services/sandbox.py
python -c "from backend.app.services.sandbox import BuildSandbox, SandboxConfig; print('Import OK')"
```

Expected: No ruff errors, import succeeds. Sandbox service creates isolated containers with network isolation, resource limits, and a deterministic environment.

Task 2: Create deterministic build configuration service

Files: backend/app/services/deterministic.py, tests/test_deterministic.py

Create backend/app/services/deterministic.py:
```python
"""
Deterministic build configuration for reproducible ISOs.

Critical: Same configuration must produce identical ISO hash. This is required for caching to work correctly.

Determinism factors:
- SOURCE_DATE_EPOCH: fixed timestamps in all generated files
- LC_ALL=C: fixed locale for sorting
- TZ=UTC: fixed timezone
- Sorted inputs: packages and files always in consistent order
- Fixed compression: consistent squashfs settings
"""

import hashlib
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any

@dataclass
class OverlayFile:
    """A file to be included in the overlay."""
    path: str  # Absolute path in ISO (e.g., /etc/skel/.bashrc)
    content: str
    mode: str = "0644"

@dataclass
class BuildConfiguration:
    """Normalized build configuration for deterministic hashing."""
    packages: list[str]
    overlays: list[dict[str, Any]]
    locale: str = "en_US.UTF-8"
    timezone: str = "UTC"

class DeterministicBuildConfig:
    """Ensures reproducible ISO builds."""

    @staticmethod
    def compute_config_hash(config: dict[str, Any]) -> str:
        """
        Generate deterministic hash of build configuration.

        Process:
        1. Normalize all inputs (sort lists, normalize paths)
        2. Hash file contents (not file objects)
        3. Use consistent JSON serialization

        Returns:
            SHA-256 hash of normalized configuration
        """
        # Normalize packages (sorted, deduplicated)
        packages = sorted(set(config.get("packages", [])))

        # Normalize overlays
        normalized_overlays = []
        for overlay in sorted(config.get("overlays", []), key=lambda x: x.get("name", "")):
            normalized_files = []
            for f in sorted(overlay.get("files", []), key=lambda x: x.get("path", "")):
                content = f.get("content", "")
                content_hash = hashlib.sha256(content.encode()).hexdigest()
                normalized_files.append({
                    "path": f.get("path", "").strip(),
                    "content_hash": content_hash,
                    "mode": f.get("mode", "0644")
                })
            normalized_overlays.append({
                "name": overlay.get("name", "").strip(),
                "files": normalized_files
            })

        # Build normalized config
        normalized = {
            "packages": packages,
            "overlays": normalized_overlays,
            "locale": config.get("locale", "en_US.UTF-8"),
            "timezone": config.get("timezone", "UTC")
        }

        # JSON with sorted keys for determinism
        config_json = json.dumps(normalized, sort_keys=True, separators=(',', ':'))
        return hashlib.sha256(config_json.encode()).hexdigest()

    @staticmethod
    def get_source_date_epoch(config_hash: str) -> int:
        """
        Generate deterministic timestamp from config hash.

        Using hash-derived timestamp ensures:
        - Same config always gets same timestamp
        - Different configs get different timestamps
        - No dependency on wall clock time

        The timestamp is within a reasonable range (2020-2030).
        """
        # Use first 8 bytes of hash to generate timestamp
        hash_int = int(config_hash[:16], 16)
        # Map to range: Jan 1, 2020 to Dec 31, 2030
        min_epoch = 1577836800  # 2020-01-01
        max_epoch = 1924991999  # 2030-12-31
        return min_epoch + (hash_int % (max_epoch - min_epoch))

    @staticmethod
    def create_archiso_profile(
        config: dict[str, Any],
        profile_path: Path,
        source_date_epoch: int
    ) -> None:
        """
        Generate archiso profile with deterministic settings.

        Creates:
        - packages.x86_64: Sorted package list
        - profiledef.sh: Build configuration
        - pacman.conf: Package manager config
        - airootfs/: Overlay files
        """
        profile_path.mkdir(parents=True, exist_ok=True)

        # packages.x86_64 (sorted for determinism)
        packages = sorted(set(config.get("packages", ["base", "linux"])))
        packages_file = profile_path / "packages.x86_64"
        packages_file.write_text("\n".join(packages) + "\n")

        # profiledef.sh
        profiledef = profile_path / "profiledef.sh"
        iso_date = f"$(date --date=@{source_date_epoch} +%Y%m)"
        iso_version = f"$(date --date=@{source_date_epoch} +%Y.%m.%d)"

        profiledef.write_text(f'''#!/usr/bin/env bash
# Deterministic archiso profile
# Generated for Debate platform

iso_name="debate-custom"
iso_label="DEBATE_{iso_date}"
iso_publisher="Debate Platform https://debate.example.com"
iso_application="Debate Custom Linux"
iso_version="{iso_version}"
install_dir="arch"
bootmodes=('bios.syslinux.mbr' 'bios.syslinux.eltorito' 'uefi-x64.systemd-boot.esp' 'uefi-x64.systemd-boot.eltorito')
arch="x86_64"
pacman_conf="pacman.conf"
airootfs_image_type="squashfs"
airootfs_image_tool_options=('-comp' 'xz' '-Xbcj' 'x86' '-b' '1M' '-Xdict-size' '1M')

file_permissions=(
  ["/etc/shadow"]="0:0:0400"
  ["/root"]="0:0:750"
  ["/etc/gshadow"]="0:0:0400"
)
''')

        # pacman.conf
        pacman_conf = profile_path / "pacman.conf"
        pacman_conf.write_text('''[options]
Architecture = auto
CheckSpace
SigLevel = Required DatabaseOptional
LocalFileSigLevel = Optional

[core]
Include = /etc/pacman.d/mirrorlist

[extra]
Include = /etc/pacman.d/mirrorlist
''')

        # airootfs structure with overlay files
        airootfs = profile_path / "airootfs"
        airootfs.mkdir(exist_ok=True)

        for overlay in config.get("overlays", []):
            for file_config in overlay.get("files", []):
                file_path = airootfs / file_config["path"].lstrip("/")
                file_path.parent.mkdir(parents=True, exist_ok=True)
                file_path.write_text(file_config["content"])
                if "mode" in file_config:
                    file_path.chmod(int(file_config["mode"], 8))
```

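To make the hash-to-timestamp mapping concrete, here is a self-contained sketch that re-implements the same window mapping inline (so it runs without the service module) and checks determinism and bounds:

```python
import hashlib

MIN_EPOCH = 1577836800  # 2020-01-01
MAX_EPOCH = 1924991999  # 2030-12-31

def epoch_from_hash(config_hash: str) -> int:
    # Same mapping as get_source_date_epoch: first 16 hex chars to an int,
    # folded into the fixed [MIN_EPOCH, MAX_EPOCH) window by modulo.
    return MIN_EPOCH + (int(config_hash[:16], 16) % (MAX_EPOCH - MIN_EPOCH))

h = hashlib.sha256(b'{"packages":["base","vim"]}').hexdigest()
assert epoch_from_hash(h) == epoch_from_hash(h)      # deterministic
assert MIN_EPOCH <= epoch_from_hash(h) < MAX_EPOCH   # bounded window
```

Because the timestamp is derived from the config hash rather than the wall clock, rebuilding the same configuration months later still stamps identical times into the ISO.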
Create tests/test_deterministic.py:
```python
"""Tests for deterministic build configuration."""

from backend.app.services.deterministic import DeterministicBuildConfig


class TestDeterministicBuildConfig:
    """Test that same inputs produce same outputs."""

    def test_hash_deterministic(self):
        """Same config produces same hash."""
        config = {
            "packages": ["vim", "git", "base"],
            "overlays": [{
                "name": "test",
                "files": [{"path": "/etc/test", "content": "hello"}]
            }]
        }

        hash1 = DeterministicBuildConfig.compute_config_hash(config)
        hash2 = DeterministicBuildConfig.compute_config_hash(config)

        assert hash1 == hash2

    def test_hash_order_independent(self):
        """Package order doesn't affect hash."""
        config1 = {"packages": ["vim", "git", "base"], "overlays": []}
        config2 = {"packages": ["base", "git", "vim"], "overlays": []}

        hash1 = DeterministicBuildConfig.compute_config_hash(config1)
        hash2 = DeterministicBuildConfig.compute_config_hash(config2)

        assert hash1 == hash2

    def test_hash_different_configs(self):
        """Different configs produce different hashes."""
        config1 = {"packages": ["vim"], "overlays": []}
        config2 = {"packages": ["emacs"], "overlays": []}

        hash1 = DeterministicBuildConfig.compute_config_hash(config1)
        hash2 = DeterministicBuildConfig.compute_config_hash(config2)

        assert hash1 != hash2

    def test_source_date_epoch_deterministic(self):
        """Same hash produces same timestamp."""
        config_hash = "abc123def456"

        epoch1 = DeterministicBuildConfig.get_source_date_epoch(config_hash)
        epoch2 = DeterministicBuildConfig.get_source_date_epoch(config_hash)

        assert epoch1 == epoch2

    def test_source_date_epoch_in_range(self):
        """Timestamp is within reasonable range."""
        config_hash = "abc123def456"

        epoch = DeterministicBuildConfig.get_source_date_epoch(config_hash)

        # Should be between 2020 and 2030
        assert 1577836800 <= epoch <= 1924991999
```

Run:
```bash
cd /home/mikkel/repos/debate
ruff check backend/app/services/deterministic.py tests/test_deterministic.py
pytest tests/test_deterministic.py -v
```

Expected: Ruff passes, all tests pass. Deterministic build config generates consistent hashes, with timestamps derived from the config hash.

Task 3: Create build orchestration service

Files: backend/app/services/build.py

Create backend/app/services/build.py:
```python
"""
Build orchestration service.

Coordinates:
1. Configuration validation
2. Hash computation (for caching)
3. Sandbox creation
4. Build execution
5. Result storage
"""

from datetime import datetime, UTC
from pathlib import Path
from typing import Optional
from uuid import uuid4

from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

from app.core.config import settings
from app.db.models.build import Build, BuildStatus
from app.services.deterministic import DeterministicBuildConfig
from app.services.sandbox import BuildSandbox

class BuildService:
    """Orchestrates ISO build process."""

    def __init__(self, db: AsyncSession):
        self.db = db
        self.sandbox = BuildSandbox()
        self.output_root = Path(settings.iso_output_root)

    async def get_or_create_build(
        self,
        config: dict
    ) -> tuple[Build, bool]:
        """
        Get existing build from cache or create new one.

        Returns:
            Tuple of (Build, is_cached)
        """
        # Compute deterministic hash
        config_hash = DeterministicBuildConfig.compute_config_hash(config)

        # Check cache
        stmt = select(Build).where(
            Build.config_hash == config_hash,
            Build.status == BuildStatus.completed
        )
        result = await self.db.execute(stmt)
        cached_build = result.scalar_one_or_none()

        if cached_build:
            # Return cached build
            return cached_build, True

        # Create new build
        build = Build(
            id=uuid4(),
            config_hash=config_hash,
            status=BuildStatus.pending
        )
        self.db.add(build)
        await self.db.commit()
        await self.db.refresh(build)

        return build, False

    async def execute_build(
        self,
        build: Build,
        config: dict
    ) -> Build:
        """
        Execute the actual ISO build.

        Process:
        1. Update status to building
        2. Create sandbox container
        3. Generate archiso profile
        4. Run build
        5. Update status with result
        """
        build.status = BuildStatus.building
        build.started_at = datetime.now(UTC)
        await self.db.commit()

        container_path = None
        profile_path = self.output_root / str(build.id) / "profile"
        output_path = self.output_root / str(build.id) / "output"

        try:
            # Create sandbox
            container_path = await self.sandbox.create_build_container(str(build.id))

            # Generate deterministic profile
            source_date_epoch = DeterministicBuildConfig.get_source_date_epoch(
                build.config_hash
            )
            DeterministicBuildConfig.create_archiso_profile(
                config, profile_path, source_date_epoch
            )

            # Run build in sandbox
            return_code, stdout, stderr = await self.sandbox.run_build(
                container_path, profile_path, output_path, source_date_epoch
            )

            if return_code == 0:
                # Find generated ISO
                iso_files = list(output_path.glob("*.iso"))
                if iso_files:
                    build.iso_path = str(iso_files[0])
                    build.status = BuildStatus.completed
                else:
                    build.status = BuildStatus.failed
                    build.error_message = "Build completed but no ISO found"
            else:
                build.status = BuildStatus.failed
                build.error_message = stderr or f"Build failed with code {return_code}"

            build.build_log = stdout + "\n" + stderr

        except Exception as e:
            build.status = BuildStatus.failed
            build.error_message = str(e)

        finally:
            # Cleanup sandbox
            if container_path:
                await self.sandbox.cleanup_container(container_path)

            build.completed_at = datetime.now(UTC)
            await self.db.commit()
            await self.db.refresh(build)

        return build

    async def get_build_status(self, build_id: str) -> Optional[Build]:
        """Get build by ID."""
        stmt = select(Build).where(Build.id == build_id)
        result = await self.db.execute(stmt)
        return result.scalar_one_or_none()
```
</action>
<verify>
Run:
```bash
cd /home/mikkel/repos/debate
ruff check backend/app/services/build.py
python -c "from backend.app.services.build import BuildService; print('Import OK')"
```

Expected: No ruff errors, import succeeds. Build service coordinates hash computation, caching, sandbox execution, and status tracking.
</verify>

1. `ruff check backend/app/services/` passes
2. `pytest tests/test_deterministic.py` - all tests pass
3. Sandbox service can be imported without errors
4. Build service can be imported without errors
5. DeterministicBuildConfig.compute_config_hash produces consistent results

<success_criteria>
- Sandbox service creates isolated systemd-nspawn containers (ISO-04)
- Builds run with --private-network (no network access)
- SOURCE_DATE_EPOCH set for deterministic builds
- Same configuration produces identical hash
- Build service coordinates full build lifecycle
- Cache lookup happens before build execution
</success_criteria>
After completion, create `.planning/phases/01-core-infrastructure-security/01-05-SUMMARY.md`