# Architecture Patterns: Linux Distribution Builder Platform

**Domain:** Web-based Linux distribution customization and ISO generation

**Researched:** 2026-01-25

**Confidence:** MEDIUM-HIGH

## Executive Summary

Linux distribution builder platforms combine web interfaces with backend build systems, overlaying configuration layers onto base distributions to create customized bootable ISOs. Modern architectures (2026) leverage container-based immutable systems, asynchronous task queues, and SAT-solver dependency resolution. The Debate platform architecture aligns with established patterns from archiso, Universal Blue/Bazzite, and web-queue-worker patterns.
## Recommended Architecture

The Debate platform should follow a **layered web-queue-worker architecture** with these tiers:
```
┌─────────────────────────────────────────────────────────────────┐
│                      PRESENTATION LAYER                         │
│          React Frontend + Three.js 3D Visualization             │
│     (User configuration interface, visual package builder)      │
└────────────────────┬────────────────────────────────────────────┘
                     │ HTTP/WebSocket
┌────────────────────▼────────────────────────────────────────────┐
│                         API LAYER                               │
│    FastAPI (async endpoints, validation, session management)    │
└────────────────────┬────────────────────────────────────────────┘
                     │
         ┌───────────┼───────────┐
         │           │           │
┌────────▼──────┐ ┌──▼────────┐ ┌▼───────────────┐
│  Dependency   │ │  Overlay  │ │  Build Queue   │
│   Resolver    │ │  Engine   │ │   Manager      │
│  (SAT solver) │ │  (Layers) │ │  (Celery)      │
└────────┬──────┘ └──┬────────┘ └┬───────────────┘
         │           │           │
         └───────────┼───────────┘
                     │
┌────────────────────▼────────────────────────────────────────────┐
│                     PERSISTENCE LAYER                           │
│        PostgreSQL (config, user data, build metadata)           │
│        Object Storage (ISO cache, build artifacts)              │
└────────────────────┬────────────────────────────────────────────┘
                     │
┌────────────────────▼────────────────────────────────────────────┐
│                   BUILD EXECUTION LAYER                         │
│    Worker Nodes (Celery workers running archiso/mkarchiso)      │
│    - Profile generation                                         │
│    - Package installation to airootfs                           │
│    - Overlay application (OverlayFS concepts)                   │
│    - ISO generation with bootloader config                      │
└─────────────────────────────────────────────────────────────────┘
```

## Component Boundaries

### Core Components

| Component | Responsibility | Communicates With | State Management |
|-----------|---------------|-------------------|------------------|
| **React Frontend** | User interaction, 3D visualization, configuration UI | API Layer (REST/WS) | Client-side state (React context/Redux) |
| **Three.js Renderer** | 3D package/layer visualization, visual debugging | React components | Scene state separate from app state |
| **FastAPI Gateway** | Request routing, validation, auth, session mgmt | All backend services | Stateless (session in DB/cache) |
| **Dependency Resolver** | Package conflict detection, SAT solving, suggestions | API Layer, Database | Computation-only (no persistent state) |
| **Overlay Engine** | Layer composition, configuration merging, precedence | Build Queue, Database | Configuration versioning in DB |
| **Build Queue Manager** | Job scheduling, worker coordination, priority mgmt | Celery broker (Redis/RabbitMQ) | Queue state in message broker |
| **Celery Workers** | ISO build execution, archiso orchestration | Build Queue, Object Storage | Job state tracked in result backend |
| **PostgreSQL DB** | User data, build configs, metadata, audit logs | All backend services | ACID transactional storage |
| **Object Storage** | ISO caching, build artifacts, profile storage | Workers, API (download endpoint) | Immutable blob storage |
### Detailed Component Architecture
|
|
|
|
#### 1. Presentation Layer (React + Three.js)
|
|
|
|
**Purpose:** Provide visual interface for distribution customization with 3D representation of layers.
|
|
|
|
**Architecture Pattern:**
|
|
- **State Management:** Application state in React (configuration data) separate from scene state (3D objects). Changes flow from app state → scene rendering.
|
|
- **Performance:** Use React Three Fiber (r3f) for declarative Three.js integration. Target 60 FPS, <100MB memory.
|
|
- **Optimization:** InstancedMesh for repeated elements (packages), frustum culling, lazy loading with Suspense, GPU resource cleanup with dispose().
|
|
- **Model Format:** GLTF/GLB for 3D assets.
|
|
|
|
**Communication:**
|
|
- REST API for CRUD operations (save configuration, list builds)
|
|
- WebSocket for real-time build progress updates
|
|
- Server-Sent Events (SSE) alternative for progress streaming
|
|
|
|
**Sources:**
|
|
- [React Three Fiber vs. Three.js Performance Guide 2026](https://graffersid.com/react-three-fiber-vs-three-js/)
|
|
- [3D Data Visualization with React and Three.js](https://medium.com/cortico/3d-data-visualization-with-react-and-three-js-7272fb6de432)
|
|
|
|
#### 2. API Layer (FastAPI)

**Purpose:** Asynchronous API gateway handling request validation, routing, and coordination.

**Architecture Pattern:**
- **Layered Structure:** Separate routers (by domain), services (business logic), and data access layers.
- **Async I/O:** Use async/await throughout to prevent blocking on database/queue operations.
- **Middleware:** Custom logging, metrics, and error-handling middleware for observability.
- **Validation:** Pydantic models for request/response validation.

**Endpoints:**
- `/api/v1/configurations` - CRUD for user configurations
- `/api/v1/packages` - Package search, metadata, conflicts
- `/api/v1/builds` - Submit build, query status, download ISO
- `/api/v1/layers` - Layer definitions (Opening Statement, Platform, etc.)
- `/ws/builds/{build_id}` - WebSocket for build progress
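
The router/service/data-access layering above can be sketched in plain Python. FastAPI and Pydantic are deliberately omitted so the sketch stays dependency-free; every class and function name here (`ConfigRepository`, `ConfigService`, `post_configuration`) is illustrative, not part of the platform's actual API.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import UUID, uuid4


# Data access layer: a hypothetical in-memory repository standing in for PostgreSQL
class ConfigRepository:
    def __init__(self):
        self._rows: dict = {}

    def save(self, name: str, packages: list) -> UUID:
        config_id = uuid4()
        self._rows[config_id] = {"name": name, "packages": packages}
        return config_id

    def get(self, config_id: UUID) -> Optional[dict]:
        return self._rows.get(config_id)


# Service layer: business logic only, no HTTP concerns
class ConfigService:
    def __init__(self, repo: ConfigRepository):
        self.repo = repo

    def create(self, name: str, packages: list) -> dict:
        if not name:
            return {"error": "name is required"}
        config_id = self.repo.save(name, packages)
        return {"id": str(config_id)}


# Router layer: in FastAPI this would be an APIRouter endpoint with a Pydantic model
def post_configuration(service: ConfigService, payload: dict) -> dict:
    # Validation that Pydantic would normally perform declaratively
    if not isinstance(payload.get("packages"), list):
        return {"error": "packages must be a list"}
    return service.create(payload.get("name", ""), payload["packages"])
```

Keeping each layer ignorant of the one above it is what makes the API instances stateless and horizontally scalable.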

**Performance:** 2026 benchmarks suggest async FastAPI can deliver roughly 3× the throughput of synchronous frameworks for I/O-bound operations.

**Sources:**
- [Modern FastAPI Architecture Patterns 2026](https://medium.com/algomart/modern-fastapi-architecture-patterns-for-scalable-production-systems-41a87b165a8b)
- [FastAPI for Microservices 2025](https://talent500.com/blog/fastapi-microservices-python-api-design-patterns-2025/)

#### 3. Dependency Resolver

**Purpose:** Detect package conflicts, resolve dependencies, and suggest alternatives using SAT solver algorithms.

**Architecture Pattern:**
- **SAT Solver Implementation:** Use libsolv (openSUSE) or a similar SAT-based approach. Translate package dependencies to logic clauses, apply the CDCL algorithm.
- **Algorithm:** Conflict-Driven Clause Learning (CDCL). Dependency resolution is NP-complete in general, but CDCL handles typical workloads in milliseconds.
- **Input:** Package selection across 5 layers (Opening Statement, Platform, Rhetoric, Talking Points, Closing Argument).
- **Output:** Valid package set or conflict report with suggested resolutions.

**Data Structure:**
```
Package Dependency Graph:
- Nodes: Packages (name, version, layer)
- Edges: Dependencies (requires, conflicts, provides, suggests)
- Constraints: Version ranges, mutual exclusions
```
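
The clause translation can be made concrete with a toy brute-force satisfiability check. This is purely illustrative — a real resolver would use libsolv or another CDCL solver, and the package names below are made up — but it shows how `requires` and `conflicts` edges become CNF clauses:

```python
from itertools import product


def build_clauses(packages):
    """Translate requires/conflicts edges into CNF clauses over package names.

    A clause is a list of (name, polarity) literals; it is satisfied when at
    least one literal matches the assignment.
    """
    clauses = []
    for pkg in packages:
        for dep in pkg.get("requires", []):
            # pkg -> dep  ==  (NOT pkg) OR dep
            clauses.append([(pkg["name"], False), (dep, True)])
        for conflict in pkg.get("conflicts", []):
            # NOT (pkg AND conflict)  ==  (NOT pkg) OR (NOT conflict)
            clauses.append([(pkg["name"], False), (conflict, False)])
    return clauses


def satisfiable(clauses, names, required):
    """Brute-force search over all assignments (exponential; toy only)."""
    names = sorted(names)
    for bits in product([False, True], repeat=len(names)):
        assign = dict(zip(names, bits))
        # Every user-requested package must be selected
        if not all(assign[r] for r in required):
            continue
        # Every clause must have at least one satisfied literal
        if all(any(assign[n] == pol for n, pol in clause) for clause in clauses):
            return assign
    return None
```

With packages `kde` (requires `pipewire`), `pipewire` (conflicts `pulseaudio`), and `pulseaudio`, requesting `kde` alone is satisfiable, while requesting `kde` plus `pulseaudio` is not — exactly the conflict report case. CDCL replaces the exhaustive loop with clause learning and backjumping.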

**Integration:**
- Called synchronously from the API during configuration validation
- Pre-compute common dependency sets for base layers (cache results)
- Asynchronous deep resolution for full build validation

**Sources:**
- [Libsolv SAT Solver](https://github.com/openSUSE/libsolv)
- [Version SAT Research](https://research.swtch.com/version-sat)
- [Dependency Resolution Made Simple](https://borretti.me/article/dependency-resolution-made-simple)

#### 4. Overlay Engine

**Purpose:** Manage layered configuration packages, applying merge strategies and precedence rules.

**Architecture Pattern:**
- **Layer Model:** 5 layers with defined precedence (Closing Argument > Talking Points > Rhetoric > Platform > Opening Statement).
- **OverlayFS Inspiration:** Conceptually similar to OverlayFS union mounting, where upper layers override lower layers.
- **Configuration Merging:** Files from higher layers replace/merge with lower layers based on merge strategy (replace, merge-append, merge-deep).

**Layer Structure:**
```
Layer Definition:
- id: unique identifier
- name: user-facing name (e.g., "Platform")
- order: precedence (1=lowest, 5=highest)
- packages: list of package selections
- files: custom files to overlay
- merge_strategy: how to handle conflicts
```

**Merge Strategies:**
- **Replace:** Higher layer file completely replaces lower
- **Merge-Append:** Concatenate files (e.g., package lists)
- **Merge-Deep:** Smart merge (e.g., JSON/YAML key merging)
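
A minimal sketch of the merge-deep strategy for JSON/YAML-like configuration trees — the function name and the sample keys are illustrative, not the platform's actual schema:

```python
def deep_merge(base: dict, overlay: dict) -> dict:
    """Merge-deep strategy: overlay keys win; nested dicts merge recursively.

    Returns a new dict and leaves both inputs untouched, so lower layers can
    be reused across configurations.
    """
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value  # scalars, lists, and new keys are replaced outright
    return merged
```

For example, overlaying `{"boot": {"timeout": 0}}` on `{"boot": {"timeout": 5, "default": "arch"}}` keeps `default` while the higher layer's `timeout` wins; a plain `dict.update` would have discarded `default` along with the rest of the nested `boot` block.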

**Output:** Unified archiso profile with:
- `packages.x86_64` (merged package list)
- `airootfs/` directory (merged filesystem overlay)
- `profiledef.sh` (combined metadata)

**Sources:**
- [OverlayFS Linux Kernel Documentation](https://docs.kernel.org/filesystems/overlayfs.html)
- [OverlayFS ArchWiki](https://wiki.archlinux.org/title/Overlay_filesystem)

#### 5. Build Queue Manager (Celery)

**Purpose:** Distributed task queue for asynchronous ISO build jobs with priority scheduling.

**Architecture Pattern:**
- **Web-Queue-Worker Pattern:** Web frontend → Message queue → Worker pool
- **Message Broker:** Redis (low latency) or RabbitMQ (high reliability) for the job queue
- **Result Backend:** Redis or PostgreSQL for job status/results
- **Worker Pool:** Multiple Celery workers (one per build server core for CPU-bound builds)

**Job Types:**
1. **Quick Validation:** Dependency resolution (seconds) - High priority
2. **Full Build:** ISO generation (minutes) - Normal priority
3. **Cache Warming:** Pre-build common configurations - Low priority

**Scheduling:**
- **Priority Queue:** User-initiated builds > automated cache warming
- **Rate Limiting:** Prevent queue flooding, enforce user quotas
- **Retry Logic:** Automatic retry with exponential backoff for transient failures
- **Timeout:** Per-job timeout (e.g., 30 min max for a build)
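
The priority semantics can be sketched with the stdlib `heapq` — Celery with a Redis or RabbitMQ broker provides this ordering natively, so this is only a model of the behavior; the class and priority constants are invented for illustration:

```python
import heapq
import itertools


class BuildQueue:
    """Toy priority queue: lower number = higher priority, FIFO within a priority."""

    USER_BUILD, CACHE_WARMING = 1, 5

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps equal-priority jobs FIFO

    def enqueue(self, job_id: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._counter), job_id))

    def dequeue(self) -> str:
        _, _, job_id = heapq.heappop(self._heap)
        return job_id


q = BuildQueue()
q.enqueue("warm-kde-base", BuildQueue.CACHE_WARMING)
q.enqueue("user-42-build", BuildQueue.USER_BUILD)
assert q.dequeue() == "user-42-build"  # the user build jumps ahead of cache warming
```

The counter tie-breaker matters: without it, two jobs at the same priority would be compared by `job_id`, breaking submission order.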

**Coordinator Pattern:**
- A single coordinator manages job assignment and worker health
- Leader election for coordinator HA (if scaled beyond a single instance)

**Monitoring:**
- Job state transitions logged to PostgreSQL
- Metrics: queue depth, worker utilization, average build time
- Dead letter queue for failed jobs requiring manual investigation

**Sources:**
- [Celery Distributed Task Queue](https://docs.celeryq.dev/)
- [Design Distributed Job Scheduler](https://www.systemdesignhandbook.com/guides/design-a-distributed-job-scheduler/)
- [Web-Queue-Worker Architecture - Azure](https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/web-queue-worker)

#### 6. Build Execution Workers (archiso-based)

**Purpose:** Execute ISO generation using archiso (mkarchiso) with custom profiles.

**Architecture Pattern:**
- **Profile-Based Build:** Generate a temporary archiso profile per build job
- **Isolation:** Each build runs in an isolated environment (separate working directory)
- **Stages:** Profile generation → Package installation → Customization → ISO creation

**Build Process Flow:**
```
1. Profile Generation (Overlay Engine output)
   ├── Create temp directory
   ├── Write packages.x86_64 (merged package list)
   ├── Write profiledef.sh (metadata, permissions)
   ├── Copy airootfs/ overlay files
   └── Configure bootloaders (syslinux, grub, systemd-boot)

2. Package Installation
   ├── mkarchiso downloads packages (pacman cache)
   ├── Install to work_dir/x86_64/airootfs
   └── Apply package configurations

3. Customization (customize_airootfs.sh)
   ├── Enable systemd services
   ├── Apply user-specific configs
   ├── Run post-install scripts
   └── Set permissions

4. ISO Generation
   ├── Create kernel and initramfs images
   ├── Build squashfs filesystem
   ├── Assemble bootable ISO
   ├── Generate checksums
   └── Move to output directory

5. Post-Processing
   ├── Upload ISO to object storage
   ├── Update database (build status, ISO location)
   ├── Cache metadata for reuse
   └── Clean up working directory
```
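
Stage 1 can be sketched as follows. The file names (`packages.x86_64`, `profiledef.sh`, `airootfs/`) are the ones archiso expects in a profile; the helper itself, its signature, and the `profiledef.sh` fields shown are a simplified illustration — a real profile sets many more fields:

```python
import shutil
import tempfile
from pathlib import Path
from typing import Optional


def generate_profile(job_id: str, packages: list,
                     overlay_dir: Optional[Path] = None) -> Path:
    """Write a throwaway archiso profile directory for one build job."""
    profile = Path(tempfile.gettempdir()) / f"archiso-profile-{job_id}"
    profile.mkdir(parents=True, exist_ok=True)

    # Merged package list: one package per line, deduplicated and sorted
    (profile / "packages.x86_64").write_text("\n".join(sorted(set(packages))) + "\n")

    # Minimal profiledef.sh metadata (illustrative subset of the real fields)
    (profile / "profiledef.sh").write_text(
        'iso_name="custom"\n'
        'iso_label="CUSTOM_ISO"\n'
        'arch="x86_64"\n'
    )

    # Copy the merged filesystem overlay produced by the Overlay Engine
    airootfs = profile / "airootfs"
    if overlay_dir is not None:
        shutil.copytree(overlay_dir, airootfs, dirs_exist_ok=True)
    else:
        airootfs.mkdir(exist_ok=True)

    return profile
```

The worker then points `mkarchiso -w <work_dir> -o <out_dir> <profile>` at the returned directory and deletes it after upload.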

**Worker Configuration:**
- **Resource Limits:** 1 build per worker (CPU/memory intensive)
- **Concurrency:** 6 workers max (6-core build server)
- **Working Directory:** `/tmp/archiso-tmp-{job_id}` (cleaned after completion with the -r flag)
- **Output Directory:** Temporary → Object storage → Local cleanup

**Optimizations:**
- **Package Cache:** Shared pacman cache across workers (prevents redundant downloads)
- **Layer Caching:** Cache common base layers (Opening Statement variations)
- **Incremental Builds:** Detect unchanged layers, reuse previous airootfs where possible

**Sources:**
- [Archiso ArchWiki](https://wiki.archlinux.org/title/Archiso)
- [Custom Archiso Tutorial](https://serverless.industries/2024/12/30/custom-archiso.en.html)

#### 7. Persistence Layer (PostgreSQL + Object Storage)

**Purpose:** Store configuration data, build metadata, and build artifacts.

**PostgreSQL Schema Design:**

```sql
-- User configurations
CREATE SCHEMA configurations;

CREATE TABLE configurations.user_configs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE configurations.layers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    config_id UUID REFERENCES configurations.user_configs(id),
    layer_type VARCHAR(50) NOT NULL, -- opening_statement, platform, rhetoric, etc.
    layer_order INT NOT NULL,
    merge_strategy VARCHAR(50) DEFAULT 'replace'
);

CREATE TABLE configurations.layer_packages (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    layer_id UUID REFERENCES configurations.layers(id),
    package_name VARCHAR(255) NOT NULL,
    package_version VARCHAR(50),
    required BOOLEAN DEFAULT TRUE
);

CREATE TABLE configurations.layer_files (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    layer_id UUID REFERENCES configurations.layers(id),
    file_path VARCHAR(1024) NOT NULL, -- path in airootfs
    file_content TEXT, -- for small configs
    file_storage_url VARCHAR(2048), -- for large files in object storage
    permissions VARCHAR(4) DEFAULT '0644'
);

-- Build management
CREATE SCHEMA builds;

CREATE TABLE builds.build_jobs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    config_id UUID REFERENCES configurations.user_configs(id),
    status VARCHAR(50) NOT NULL, -- queued, running, success, failed
    priority INT DEFAULT 5,
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    iso_url VARCHAR(2048), -- object storage location
    iso_checksum VARCHAR(128),
    error_message TEXT,
    build_log_url VARCHAR(2048)
);

CREATE TABLE builds.build_cache (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    config_hash VARCHAR(64) UNIQUE NOT NULL, -- hash of layer config
    iso_url VARCHAR(2048),
    created_at TIMESTAMP DEFAULT NOW(),
    last_accessed TIMESTAMP DEFAULT NOW(),
    access_count INT DEFAULT 0
);

-- Package metadata
CREATE SCHEMA packages;

CREATE TABLE packages.package_metadata (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) UNIQUE NOT NULL,
    description TEXT,
    repository VARCHAR(100), -- core, extra, aur ([community] was merged into extra in 2023)
    version VARCHAR(50),
    dependencies JSONB, -- {requires: [], conflicts: [], provides: []}
    last_updated TIMESTAMP DEFAULT NOW()
);
```

**Schema Organization Best Practices (2026):**
- Separate schemas for functional areas (configurations, builds, packages)
- Schema-level access control for security isolation
- CI/CD integration with migration tools (Flyway, Alembic)
- Indexes on frequently queried fields (config_id, status, config_hash)

**Object Storage:**
- **Purpose:** Store ISOs (large files, 1-4 GB), build logs, custom overlay files
- **Technology:** S3-compatible (AWS S3, MinIO, Cloudflare R2)
- **Structure:**
  - `/isos/{build_id}.iso` - Generated ISOs
  - `/logs/{build_id}.log` - Build logs
  - `/overlays/{layer_id}/{file_path}` - Custom files too large for the DB
  - `/cache/{config_hash}.iso` - Cached ISOs for reuse

**Sources:**
- [PostgreSQL Schema Design Best Practices 2026](https://wiki.postgresql.org/wiki/Database_Schema_Recommendations_for_an_Application)
- [SQL Database Fundamentals 2026](https://www.nucamp.co/blog/sql-and-database-fundamentals-in-2026-queries-design-and-postgresql-essentials)

## Data Flow

### Configuration Creation Flow

```
User (Frontend)
  ↓ (1) Create/Edit configuration
API Layer (Validation)
  ↓ (2) Validate input
Dependency Resolver
  ↓ (3) Check conflicts
  ↓ (4) Return validation result
API Layer
  ↓ (5) Save configuration
PostgreSQL (configurations schema)
  ↓ (6) Return config_id
Frontend (Display confirmation)
```

### Build Submission Flow

```
User (Frontend)
  ↓ (1) Submit build request
API Layer
  ↓ (2) Check cache (config hash)
PostgreSQL (build_cache)
  ├─→ (3a) Cache hit: return cached ISO URL
  └─→ (3b) Cache miss: create build job
Build Queue Manager (Celery)
  ↓ (4) Enqueue job with priority
Message Broker (Redis/RabbitMQ)
  ↓ (5) Job dispatched to worker
Celery Worker
  ↓ (6a) Fetch configuration from DB
  ↓ (6b) Generate archiso profile (Overlay Engine)
  ↓ (6c) Execute mkarchiso
  ↓ (6d) Upload ISO to object storage
  ↓ (6e) Update build status in DB
PostgreSQL + Object Storage
  ↓ (7) Job complete
API Layer (WebSocket)
  ↓ (8) Notify user
Frontend (Display download link)
```

### Real-Time Progress Updates Flow

```
Celery Worker
  ↓ (1) Emit progress events during build
        (e.g., "downloading packages", "generating ISO")
Celery Result Backend
  ↓ (2) Store progress state
API Layer (WebSocket handler)
  ↓ (3) Poll/subscribe to job progress
  ↓ (4) Push updates to client
Frontend (WebSocket listener)
  ↓ (5) Update UI progress bar
```

## Patterns to Follow

### Pattern 1: Layered Configuration Precedence

**What:** Higher layers override lower layers with defined merge strategies.

**When:** User customizes configuration across multiple layers (Platform, Rhetoric, etc.).

**Implementation:**
```python
class OverlayEngine:
    def merge_layers(self, layers: List[Layer]) -> Profile:
        """Merge layers from lowest to highest precedence."""
        sorted_layers = sorted(layers, key=lambda l: l.order)

        profile = Profile()
        for layer in sorted_layers:
            profile = self.apply_layer(profile, layer)

        return profile

    def apply_layer(self, profile: Profile, layer: Layer) -> Profile:
        """Apply a layer based on its merge strategy."""
        if layer.merge_strategy == "replace":
            profile.files.update(layer.files)  # Overwrite
        elif layer.merge_strategy == "merge-append":
            profile.packages.extend(layer.packages)  # Append
        elif layer.merge_strategy == "merge-deep":
            profile.config = deep_merge(profile.config, layer.config)

        return profile
```

**Source:** OverlayFS union mount concepts applied to configuration management.

### Pattern 2: SAT-Based Dependency Resolution

**What:** Translate package dependencies into a boolean satisfiability problem, solve with the CDCL algorithm.

**When:** User adds a package to the configuration, system detects conflicts.

**Implementation:**
```python
class DependencyResolver:
    def resolve(self, packages: List[Package]) -> Resolution:
        """Resolve dependencies using a SAT solver."""
        clauses = self.build_clauses(packages)

        solver = SATSolver()
        result = solver.solve(clauses)

        if result.satisfiable:
            return Resolution(success=True, packages=result.model)
        else:
            conflicts = self.explain_conflicts(result.unsat_core)
            alternatives = self.suggest_alternatives(conflicts)
            return Resolution(success=False, conflicts=conflicts,
                              alternatives=alternatives)

    def build_clauses(self, packages: List[Package]) -> List[Clause]:
        """Convert the dependency graph to CNF clauses."""
        clauses = []
        for pkg in packages:
            # If a package is selected, its dependencies must be selected:
            # pkg -> dep, i.e. (NOT pkg OR dep)
            for dep in pkg.requires:
                clauses.append(Implies(pkg, dep))
            # If a package is selected, none of its conflicts may be:
            # NOT (pkg AND conflict), i.e. (NOT pkg OR NOT conflict)
            for conflict in pkg.conflicts:
                clauses.append(Not(And(pkg, conflict)))
        return clauses
```

**Source:** [Libsolv implementation patterns](https://github.com/openSUSE/libsolv)

### Pattern 3: Asynchronous Build Queue with Progress Tracking

**What:** Submit long-running build jobs to a queue, track progress, notify on completion.

**When:** User submits a build request (ISO generation takes minutes).

**Implementation:**
```python
import subprocess
import uuid

# API endpoint
@app.post("/api/v1/builds")
async def submit_build(config_id: UUID):
    # Check cache first
    cache_key = compute_config_hash(config_id)
    cached = await check_cache(cache_key)
    if cached:
        return {"status": "cached", "iso_url": cached.iso_url}

    # Enqueue build job
    job = build_iso.apply_async(
        args=[config_id],
        priority=5,
        task_id=str(uuid.uuid4())
    )

    return {"status": "queued", "job_id": job.id}

# Celery task
@celery.task(bind=True)
def build_iso(self, config_id: UUID):
    self.update_state(state='DOWNLOADING', meta={'progress': 10})

    # Generate profile
    profile = overlay_engine.generate_profile(config_id)
    self.update_state(state='BUILDING', meta={'progress': 30})

    # Run mkarchiso; check=True raises on a non-zero exit so the task fails loudly
    subprocess.run([
        'mkarchiso', '-v', '-r',
        '-w', f'/tmp/archiso-{self.request.id}',
        '-o', '/tmp/output',
        profile.path
    ], check=True)
    self.update_state(state='UPLOADING', meta={'progress': 80})

    # Upload to object storage
    iso_url = upload_iso('/tmp/output/archlinux.iso')

    return {"iso_url": iso_url, "progress": 100}
```

**Source:** [Celery best practices](https://docs.celeryq.dev/), [Web-Queue-Worker pattern](https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/web-queue-worker)

### Pattern 4: Cache-First Build Strategy

**What:** Hash the configuration, check the cache before building, reuse identical ISOs.

**When:** User submits a build that may have been built previously.

**Implementation:**
```python
import hashlib
import json

def compute_config_hash(config_id: UUID) -> str:
    """Create a deterministic hash of a configuration."""
    config = db.query(Config).get(config_id)

    # Include all layers, packages, and files in the hash
    hash_input = {
        "layers": sorted([
            {
                "type": layer.type,
                "packages": sorted(layer.packages),
                "files": sorted([
                    # hashlib, not Python's built-in hash(): hash() is salted
                    # per process and not deterministic across runs
                    {"path": f.path,
                     "content_hash": hashlib.sha256(f.content.encode()).hexdigest()}
                    for f in layer.files
                ], key=lambda x: x["path"])
            }
            for layer in config.layers
        ], key=lambda x: x["type"])
    }

    return hashlib.sha256(
        json.dumps(hash_input, sort_keys=True).encode()
    ).hexdigest()

async def check_cache(config_hash: str) -> Optional[CachedBuild]:
    """Check whether an ISO already exists for this configuration."""
    cached = await db.query(BuildCache).filter_by(
        config_hash=config_hash
    ).first()

    if cached and cached.iso_exists():
        # Update access metadata
        cached.last_accessed = datetime.now()
        cached.access_count += 1
        await db.commit()
        return cached

    return None
```

**Benefit:** Reduces build time from minutes to seconds for repeated configurations. Critical for popular base configurations (e.g., "KDE Desktop with development tools").
## Anti-Patterns to Avoid

### Anti-Pattern 1: Blocking API Calls During Build

**What:** Synchronously waiting for an ISO build to complete in an API endpoint.

**Why bad:** Ties up an API worker for minutes, prevents handling other requests, and risks timeouts for a poor user experience.

**Instead:** Use an asynchronous task queue (Celery) with WebSocket/SSE for progress updates. The API returns immediately with a job_id; the frontend polls or subscribes to updates.

**Example:**
```python
# BAD: Blocking build
@app.post("/builds")
def build(config_id):
    iso = generate_iso(config_id)  # Takes 10 minutes!
    return {"iso_url": iso}

# GOOD: Async queue
@app.post("/builds")
async def build(config_id):
    job = build_iso.delay(config_id)
    return {"job_id": job.id, "status": "queued"}
```

### Anti-Pattern 2: Duplicating State Between React and Three.js

**What:** Maintaining separate state trees for application data and the 3D scene, manually syncing them.

**Why bad:** State gets out of sync, inconsistent data causes bugs, and update logic grows complex.

**Instead:** Single source of truth in React state. The scene derives from state: user interactions → dispatch actions → update state → scene re-renders.

**Example:**
```javascript
// BAD: Separate state
const [appState, setAppState] = useState({packages: []});
const [sceneObjects, setSceneObjects] = useState([]);

// GOOD: Scene derives from app state
const [config, setConfig] = useState({packages: []});

function Scene({packages}) {
  return packages.map(pkg => <PackageMesh key={pkg.id} {...pkg} />);
}
```

**Source:** [React Three Fiber state management best practices](https://medium.com/cortico/3d-data-visualization-with-react-and-three-js-7272fb6de432)

### Anti-Pattern 3: Storing Large Files in PostgreSQL

**What:** Storing ISO files (1-4 GB) or build logs (megabytes) as BYTEA in PostgreSQL.

**Why bad:** Database bloat, slow backups, memory pressure, poor performance for large blob operations.

**Instead:** Store large files in object storage (S3/MinIO), keep URLs/metadata in PostgreSQL.

**Example:**
```sql
-- BAD: ISO in database
CREATE TABLE builds (
    id UUID PRIMARY KEY,
    iso_data BYTEA -- 2GB blob!
);

-- GOOD: URL reference
CREATE TABLE builds (
    id UUID PRIMARY KEY,
    iso_url VARCHAR(2048), -- s3://bucket/isos/{id}.iso
    iso_checksum VARCHAR(128),
    iso_size_bytes BIGINT
);
```

### Anti-Pattern 4: Running Multiple Builds Per Worker Concurrently

**What:** Allowing a single Celery worker to process multiple ISO builds in parallel.

**Why bad:** ISO generation is CPU and memory intensive (compressing the filesystem, creating squashfs). Running multiple builds causes resource contention, thrashing, and OOM kills.

**Instead:** Configure Celery workers with concurrency=1 for build tasks. Run one build per worker. Scale horizontally with multiple workers.

**Example:**
```bash
# BAD: Multiple concurrent builds on one worker
celery -A app worker --concurrency=4  # 4 builds at once on a 6-core machine

# GOOD: One build per worker
celery -A app worker --concurrency=1 -Q builds  # Start 6 workers for 6 cores
```

### Anti-Pattern 5: No Dependency Validation Until Build Time

**What:** Allowing users to save configurations without checking package conflicts, so issues surface only during the ISO build.

**Why bad:** Wastes build resources (minutes of CPU time), delays error feedback, and makes it hard to identify which package caused the failure.

**Instead:** Run dependency resolution in the API layer during configuration save/update. Provide immediate feedback with conflict explanations and alternatives.

**Example:**
```python
# BAD: Validate during build
@celery.task
def build_iso(config_id):
    packages = load_packages(config_id)
    result = resolve_dependencies(packages)  # Fails here, after queueing!
    if not result.valid:
        raise BuildError("Conflicts detected")

# GOOD: Validate on save
@app.post("/configs")
async def save_config(config: ConfigInput):
    resolution = dependency_resolver.resolve(config.packages)
    if not resolution.valid:
        return {"error": "conflicts", "details": resolution.conflicts}

    await db.save(config)
    return {"success": True}
```

## Scalability Considerations

| Concern | At 100 users | At 10K users | At 1M users |
|---------|--------------|--------------|-------------|
| **API Layer** | Single FastAPI instance | Multiple instances behind load balancer | Auto-scaling group, CDN for static assets |
| **Build Queue** | Single Redis broker | Redis cluster or RabbitMQ | Kafka for high-throughput messaging |
| **Workers** | 1 build server (6 cores) | 3-5 build servers | Auto-scaling worker pool, spot instances |
| **Database** | Single PostgreSQL instance | Primary + read replicas | Sharded PostgreSQL or distributed SQL (CockroachDB) |
| **Storage** | Local MinIO | S3-compatible with CDN | Multi-region S3 with CloudFront |
| **Caching** | In-memory cache | Redis cache cluster | Multi-tier cache (Redis + CDN) |

### Horizontal Scaling Strategy

**API Layer:**
- Stateless FastAPI instances (session in DB/Redis)
- Load balancer (Nginx, HAProxy, AWS ALB)
- Auto-scaling based on CPU/request latency

**Build Workers:**
- Independent Celery workers connecting to a shared broker
- Each worker runs 1 build at a time
- Scale workers based on queue depth (add workers when >10 jobs are queued)
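
The queue-depth rule can be sketched as a small autoscaling decision function; the function name, thresholds other than the >10 trigger, and the worker bounds are illustrative choices:

```python
def workers_needed(queue_depth: int, current: int,
                   min_workers: int = 1, max_workers: int = 12) -> int:
    """Decide the next worker count from the current queue depth.

    Scale out one worker at a time when more than 10 jobs wait; scale in
    when the queue drains; otherwise hold steady. Clamped to [min, max].
    """
    if queue_depth > 10:
        target = current + 1
    elif queue_depth == 0:
        target = current - 1
    else:
        target = current
    return max(min_workers, min(max_workers, target))
```

Stepping by one worker per evaluation interval avoids oscillation when a burst of submissions briefly inflates the queue.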

**Database:**
- Read replicas for queries (config lookups)
- Write operations to primary (build status updates)
- Connection pooling (PgBouncer)

**Storage:**
- Object storage is inherently scalable
- CDN for ISO downloads (reduces egress costs)
- Lifecycle policies (delete ISOs older than 30 days if not accessed)
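
S3-style lifecycle rules would normally enforce the 30-day policy, but since the `build_cache` table tracks `last_accessed`, it can also be approximated in application code; the helper name and the list-of-dicts row shape here are illustrative stand-ins for real query results:

```python
from datetime import datetime, timedelta


def expired_isos(cache_rows: list, now: datetime, max_age_days: int = 30) -> list:
    """Return object-storage keys for cached ISOs not accessed within max_age_days.

    Each row mirrors the build_cache columns: iso_url and last_accessed.
    """
    cutoff = now - timedelta(days=max_age_days)
    return [row["iso_url"] for row in cache_rows if row["last_accessed"] < cutoff]
```

A periodic task would delete the returned keys from object storage and drop the matching `build_cache` rows in one transaction, so a cache hit never points at a vanished ISO.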
|
|
|
|
## Build Order Implications for Development
|
|
|
|
### Phase 1: Core Infrastructure
|
|
**What to build:** Database schema, basic API scaffolding, object storage setup.
|
|
**Why first:** Foundation for all other components. No dependencies on complex logic.
|
|
**Duration estimate:** 1-2 weeks

### Phase 2: Configuration Management

**What to build:** Layer data models, CRUD endpoints, basic validation.

**Why second:** Enables testing configuration storage before complex dependency resolution.

**Duration estimate:** 1-2 weeks
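
As a minimal sketch of the layer data model, using stdlib dataclasses purely for illustration (the real implementation would presumably use the Pydantic/SQLAlchemy models from the recommended stack, and the field names here are assumptions, not a fixed schema):

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    # Illustrative fields only: a layer adds packages, removes packages,
    # and overlays files (path -> content) onto the layers beneath it.
    name: str
    packages: list[str] = field(default_factory=list)
    removed_packages: list[str] = field(default_factory=list)
    files: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # The kind of basic validation Phase 2 targets
        if not self.name.strip():
            raise ValueError("layer name must be non-empty")
```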

### Phase 3: Dependency Resolver (Simplified)

**What to build:** Basic conflict detection (direct conflicts only, no SAT solver yet).

**Why third:** Provides early validation capability. Full SAT solver can wait.

**Duration estimate:** 1 week
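
Direct conflict detection at this stage can be as simple as checking each selected pair against a known-conflicts table (mirroring the `conflicts=` metadata pacman packages declare); the sample pairs below are illustrative only.

```python
from itertools import combinations

# Illustrative conflict pairs of the kind pacman exposes via conflicts=
KNOWN_CONFLICTS = {
    frozenset({"pulseaudio", "pipewire-pulse"}),
    frozenset({"iptables", "iptables-nft"}),
}


def direct_conflicts(selected: set[str]) -> list[tuple[str, str]]:
    # Pairwise check only: no transitive resolution, no SAT (see Phase 7)
    return [
        tuple(sorted(pair))
        for pair in map(frozenset, combinations(selected, 2))
        if pair in KNOWN_CONFLICTS
    ]
```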

### Phase 4: Overlay Engine

**What to build:** Layer merging logic, profile generation for archiso.

**Why fourth:** Requires configuration data models from Phase 2. Produces profiles for builds.

**Duration estimate:** 2 weeks
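
The merging logic follows the overlay intuition that later layers win: package sets accumulate minus removals, and file maps are overwritten path by path. A sketch over simple dict-shaped layers (the field names are illustrative assumptions):

```python
# Sketch of "later layers win" merging; base layer first, most specific last.
def merge_layers(layers: list[dict]) -> dict:
    packages: set[str] = set()
    files: dict[str, str] = {}
    for layer in layers:
        packages |= set(layer.get("packages", []))
        packages -= set(layer.get("removed_packages", []))
        files.update(layer.get("files", {}))  # later layer overwrites the path
    return {"packages": sorted(packages), "files": files}
```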

### Phase 5: Build Queue + Workers

**What to build:** Celery setup, basic build task, worker orchestration.

**Why fifth:** Depends on Overlay Engine for profile generation. Core value delivery.

**Duration estimate:** 2-3 weeks
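
The heart of the build task is a shell-out to mkarchiso. A sketch of the command construction only (the directory layout is hypothetical; the real Celery task would run this via `subprocess` inside the sandboxed worker):

```python
from pathlib import Path


def mkarchiso_command(profile: Path, work: Path, out: Path) -> list[str]:
    # mkarchiso -w <workdir> -o <outdir> <profile>, per the archiso docs;
    # -v enables verbose output, useful for progress streaming later
    return ["mkarchiso", "-v", "-w", str(work), "-o", str(out), str(profile)]
```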

### Phase 6: Frontend (Basic)

**What to build:** React UI for configuration (forms, no 3D yet), build submission.

**Why sixth:** API must exist first. Provides usable interface for testing builds.

**Duration estimate:** 2-3 weeks

### Phase 7: Advanced Dependency Resolution

**What to build:** Full SAT solver integration, conflict explanations, alternatives.

**Why seventh:** Complex feature. System works with basic validation from Phase 3.

**Duration estimate:** 2-3 weeks
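
To make the SAT encoding concrete: "A requires B" becomes the clause (not-A or B), and "A conflicts B" becomes (not-A or not-B). A toy brute-force solver illustrates the idea; a real implementation would delegate to libsolv or a dedicated SAT library rather than enumerating assignments.

```python
from itertools import product


def solve(packages, requires, conflicts, install):
    # Clauses are lists of (var, polarity); a clause holds if any literal matches.
    clauses = [[(p, True)] for p in install]                     # user selections
    clauses += [[(p, False), (d, True)] for p, d in requires]    # (not p) or d
    clauses += [[(a, False), (b, False)] for a, b in conflicts]  # (not a) or (not b)
    for values in product([True, False], repeat=len(packages)):
        assign = dict(zip(packages, values))
        if all(any(assign[v] == want for v, want in cl) for cl in clauses):
            return {p for p, on in assign.items() if on}
    return None  # unsatisfiable: the selection cannot be completed
```

An unsatisfiable result is exactly the case where Phase 7's conflict explanations kick in.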

### Phase 8: 3D Visualization

**What to build:** Three.js integration, layer visualization, visual debugging.

**Why eighth:** Polish/differentiator feature. Core functionality works without it.

**Duration estimate:** 3-4 weeks

### Phase 9: Caching + Optimization

**What to build:** Build cache, package cache, performance tuning.

**Why ninth:** Optimization after core features work. Requires usage data to tune.

**Duration estimate:** 1-2 weeks
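
A natural build-cache key is a content hash of the canonical configuration, so logically identical configs hit the same cached ISO regardless of field ordering; a minimal sketch:

```python
import hashlib
import json


def build_cache_key(config: dict) -> str:
    # Canonical JSON (sorted keys, no whitespace) makes dict ordering irrelevant
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()
```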

**Total estimated duration:** 17-23 weeks (4-6 months)

## Critical Architectural Decisions

### Decision 1: Message Broker (Redis vs RabbitMQ)

**Recommendation:** Start with Redis, migrate to RabbitMQ if reliability requirements increase.

**Rationale:**
- Redis: Lower latency, simpler setup, sufficient for <10K builds/day
- RabbitMQ: Higher reliability, message persistence, better for >100K builds/day

**When to switch:** If experiencing message loss or need guaranteed delivery.
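
One reason the Redis-first choice is low-risk: for Celery, the broker is a single URL setting, so the later migration is mostly an infrastructure change rather than a code change. Sketch only (the URLs assume default local ports and placeholder credentials; the Celery line is commented out so the snippet stands alone):

```python
# Swapping brokers is a one-line Celery configuration change.
REDIS_BROKER = "redis://localhost:6379/0"
RABBITMQ_BROKER = "amqp://guest:guest@localhost:5672//"

# from celery import Celery
# app = Celery("builds", broker=REDIS_BROKER, backend="redis://localhost:6379/1")
```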

### Decision 2: Container-Based vs. Direct archiso

**Recommendation:** Use direct archiso (mkarchiso) on bare metal workers initially.

**Rationale:**
- Container-based (like Bazzite/Universal Blue) adds complexity (OCI image builds)
- Direct archiso is simpler, well-documented, less abstraction
- Can containerize workers later if isolation/portability becomes critical

**When to reconsider:** Multi-cloud deployment or need for strong isolation between builds.

### Decision 3: Monolithic vs. Microservices API

**Recommendation:** Start monolithic (single FastAPI app), split services if scaling demands.

**Rationale:**
- Monolith: Faster development, easier debugging, sufficient for <100K users
- Microservices: Adds operational complexity (service mesh, inter-service communication)

**When to split:** If specific services (e.g., dependency resolver) need independent scaling.

### Decision 4: Real-Time Updates (WebSocket vs. SSE vs. Polling)

**Recommendation:** Use Server-Sent Events (SSE) for build progress.

**Rationale:**
- WebSocket: Bidirectional, but overkill for one-way progress updates
- SSE: Simpler, built-in reconnection, sufficient for progress streaming
- Polling: Wasteful, higher latency

**Implementation:**

```python
import asyncio
import json

from sse_starlette.sse import EventSourceResponse  # third-party: sse-starlette


@app.get("/api/v1/builds/{job_id}/stream")
async def stream_progress(job_id: str):
    async def event_generator():
        while True:
            status = await get_job_status(job_id)
            # EventSourceResponse adds the "data: ...\n\n" SSE framing itself
            yield {"data": json.dumps(status)}
            if status["state"] in ("SUCCESS", "FAILURE"):
                break
            await asyncio.sleep(1)

    return EventSourceResponse(event_generator())
```

## Sources

**Archiso & Build Systems:**
- [Archiso ArchWiki](https://wiki.archlinux.org/title/Archiso) - MEDIUM confidence
- [Custom Archiso Tutorial 2024](https://serverless.industries/2024/12/30/custom-archiso.en.html) - MEDIUM confidence
- [Bazzite ISO Build Process](https://deepwiki.com/ublue-os/bazzite/2.6-iso-build-process) - MEDIUM confidence
- [Universal Blue](https://universal-blue.org/) - MEDIUM confidence

**Dependency Resolution:**
- [Libsolv SAT Solver](https://github.com/openSUSE/libsolv) - HIGH confidence (official)
- [Version SAT Research](https://research.swtch.com/version-sat) - HIGH confidence
- [Dependency Resolution Made Simple](https://borretti.me/article/dependency-resolution-made-simple) - MEDIUM confidence
- [Package Conflict Resolution](https://distropack.dev/Blog/Post?slug=package-conflict-resolution-handling-conflicting-packages) - LOW confidence

**API & Queue Architecture:**
- [FastAPI Architecture Patterns 2026](https://medium.com/algomart/modern-fastapi-architecture-patterns-for-scalable-production-systems-41a87b165a8b) - MEDIUM confidence
- [Celery Documentation](https://docs.celeryq.dev/) - HIGH confidence (official)
- [Web-Queue-Worker Pattern - Azure](https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/web-queue-worker) - HIGH confidence (official)
- [Design a Distributed Job Scheduler](https://www.systemdesignhandbook.com/guides/design-a-distributed-job-scheduler/) - MEDIUM confidence

**Storage & Database:**
- [PostgreSQL Schema Design Best Practices](https://wiki.postgresql.org/wiki/Database_Schema_Recommendations_for_an_Application) - HIGH confidence (official)
- [OverlayFS Linux Kernel Docs](https://docs.kernel.org/filesystems/overlayfs.html) - HIGH confidence (official)

**Frontend:**
- [React Three Fiber Performance 2026](https://graffersid.com/react-three-fiber-vs-three-js/) - MEDIUM confidence
- [3D Data Visualization with React](https://medium.com/cortico/3d-data-visualization-with-react-and-three-js-7272fb6de432) - MEDIUM confidence

## Confidence Assessment

- **Overall Architecture:** MEDIUM-HIGH - Based on established patterns (web-queue-worker, archiso) with modern 2026 practices
- **Component Boundaries:** HIGH - Clear separation of concerns, well-defined interfaces
- **Build Process:** HIGH - archiso is well-documented, with multiple reference implementations
- **Dependency Resolution:** MEDIUM - The SAT-solver approach is proven, but integration complexity is unknown
- **Scalability:** MEDIUM - Patterns are sound, but specific bottlenecks depend on usage patterns
- **Frontend 3D:** MEDIUM - Three.js + React patterns are established, but performance depends on scene complexity