docs: add project research

Files:
- STACK.md: Technology stack recommendations (Python 3.12+, FastAPI, React 19+, Vite, Celery, PostgreSQL 18+)
- FEATURES.md: Feature landscape analysis (table stakes vs differentiators)
- ARCHITECTURE.md: Layered web-queue-worker architecture with SAT-based dependency resolution
- PITFALLS.md: Critical pitfalls and prevention strategies
- SUMMARY.md: Research synthesis with roadmap implications

Key findings:
- Stack: Modern 2026 async Python (FastAPI/Celery) + React/Three.js 3D frontend
- Architecture: Web-queue-worker pattern with sandboxed archiso builds
- Critical pitfall: Build sandboxing required from day one (CHAOS RAT AUR incident July 2025)

Recommended 9-phase roadmap: Infrastructure → Config → Dependency → Overlay → Build Queue → Frontend → Advanced SAT → 3D Viz → Optimization

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Mikkel Georgsen 2026-01-25 02:07:11 +00:00
parent 87116b1f56
commit c0ff95951e
5 changed files with 2495 additions and 0 deletions


@ -0,0 +1,900 @@
# Architecture Patterns: Linux Distribution Builder Platform
**Domain:** Web-based Linux distribution customization and ISO generation
**Researched:** 2026-01-25
**Confidence:** MEDIUM-HIGH
## Executive Summary
Linux distribution builder platforms combine a web interface with a backend build system, overlaying configuration layers onto a base distribution to produce customized bootable ISOs. Modern architectures (2026) rely on container-based immutable systems, asynchronous task queues, and SAT-solver dependency resolution. The Debate platform architecture follows established practice from archiso, Universal Blue/Bazzite, and the web-queue-worker architecture style.
## Recommended Architecture
The Debate platform should follow a **layered web-queue-worker architecture** with these tiers:
```
┌─────────────────────────────────────────────────────────────────┐
│                       PRESENTATION LAYER                        │
│  React Frontend + Three.js 3D Visualization                     │
│  (User configuration interface, visual package builder)         │
└────────────────────┬────────────────────────────────────────────┘
                     │ HTTP/WebSocket
┌────────────────────▼────────────────────────────────────────────┐
│                            API LAYER                            │
│  FastAPI (async endpoints, validation, session management)      │
└────────────────────┬────────────────────────────────────────────┘
        ┌────────────┼──────────────┐
        │            │              │
┌───────▼───────┐ ┌──▼────────┐ ┌───▼────────────┐
│ Dependency    │ │ Overlay   │ │ Build Queue    │
│ Resolver      │ │ Engine    │ │ Manager        │
│ (SAT solver)  │ │ (Layers)  │ │ (Celery)       │
└───────┬───────┘ └──┬────────┘ └───┬────────────┘
        │            │              │
        └────────────┼──────────────┘
┌────────────────────▼────────────────────────────────────────────┐
│                        PERSISTENCE LAYER                        │
│  PostgreSQL (config, user data, build metadata)                 │
│  Object Storage (ISO cache, build artifacts)                    │
└────────────────────┬────────────────────────────────────────────┘
┌────────────────────▼────────────────────────────────────────────┐
│                      BUILD EXECUTION LAYER                      │
│  Worker Nodes (Celery workers running archiso/mkarchiso)        │
│  - Profile generation                                           │
│  - Package installation to airootfs                             │
│  - Overlay application (OverlayFS concepts)                     │
│  - ISO generation with bootloader config                        │
└─────────────────────────────────────────────────────────────────┘
```
## Component Boundaries
### Core Components
| Component | Responsibility | Communicates With | State Management |
|-----------|---------------|-------------------|------------------|
| **React Frontend** | User interaction, 3D visualization, configuration UI | API Layer (REST/WS) | Client-side state (React context/Redux) |
| **Three.js Renderer** | 3D package/layer visualization, visual debugging | React components | Scene state separate from app state |
| **FastAPI Gateway** | Request routing, validation, auth, session mgmt | All backend services | Stateless (session in DB/cache) |
| **Dependency Resolver** | Package conflict detection, SAT solving, suggestions | API Layer, Database | Computation-only (no persistent state) |
| **Overlay Engine** | Layer composition, configuration merging, precedence | Build Queue, Database | Configuration versioning in DB |
| **Build Queue Manager** | Job scheduling, worker coordination, priority mgmt | Celery broker (Redis/RabbitMQ) | Queue state in message broker |
| **Celery Workers** | ISO build execution, archiso orchestration | Build Queue, Object Storage | Job state tracked in result backend |
| **PostgreSQL DB** | User data, build configs, metadata, audit logs | All backend services | ACID transactional storage |
| **Object Storage** | ISO caching, build artifacts, profile storage | Workers, API (download endpoint) | Immutable blob storage |
### Detailed Component Architecture
#### 1. Presentation Layer (React + Three.js)
**Purpose:** Provide visual interface for distribution customization with 3D representation of layers.
**Architecture Pattern:**
- **State Management:** Application state in React (configuration data) separate from scene state (3D objects). Changes flow from app state → scene rendering.
- **Performance:** Use React Three Fiber (r3f) for declarative Three.js integration. Target 60 FPS, <100MB memory.
- **Optimization:** InstancedMesh for repeated elements (packages), frustum culling, lazy loading with Suspense, GPU resource cleanup with dispose().
- **Model Format:** GLTF/GLB for 3D assets.
**Communication:**
- REST API for CRUD operations (save configuration, list builds)
- WebSocket for real-time build progress updates
- Server-Sent Events (SSE) as a simpler alternative for one-way progress streaming (see Decision 4)
**Sources:**
- [React Three Fiber vs. Three.js Performance Guide 2026](https://graffersid.com/react-three-fiber-vs-three-js/)
- [3D Data Visualization with React and Three.js](https://medium.com/cortico/3d-data-visualization-with-react-and-three-js-7272fb6de432)
#### 2. API Layer (FastAPI)
**Purpose:** Asynchronous API gateway handling request validation, routing, and coordination.
**Architecture Pattern:**
- **Layered Structure:** Separate routers (by domain), services (business logic), and data access layers.
- **Async I/O:** Use async/await throughout to prevent blocking on database/queue operations.
- **Middleware:** Custom logging, metrics, error handling middleware for observability.
- **Validation:** Pydantic models for request/response validation.
**Endpoints:**
- `/api/v1/configurations` - CRUD for user configurations
- `/api/v1/packages` - Package search, metadata, conflicts
- `/api/v1/builds` - Submit build, query status, download ISO
- `/api/v1/layers` - Layer definitions (Opening Statement, Platform, etc.)
- `/ws/builds/{build_id}` - WebSocket for build progress
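**Performance:** For I/O-bound workloads, async frameworks such as FastAPI can serve substantially more concurrent requests than synchronous frameworks. This research puts the gain at roughly 300% (2026 benchmarks); treat that figure as indicative and confirm it with project-specific load tests.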
**Sources:**
- [Modern FastAPI Architecture Patterns 2026](https://medium.com/algomart/modern-fastapi-architecture-patterns-for-scalable-production-systems-41a87b165a8b)
- [FastAPI for Microservices 2025](https://talent500.com/blog/fastapi-microservices-python-api-design-patterns-2025/)
#### 3. Dependency Resolver
**Purpose:** Detect package conflicts, resolve dependencies, suggest alternatives using SAT solver algorithms.
**Architecture Pattern:**
- **SAT Solver Implementation:** Use libsolv (openSUSE) or similar SAT-based approach. Translate package dependencies to logic clauses, apply CDCL algorithm.
- **Algorithm:** Conflict-Driven Clause Learning (CDCL). Dependency resolution is NP-complete in the general case, but CDCL solvers typically resolve real-world package sets in milliseconds.
- **Input:** Package selection across 5 layers (Opening Statement, Platform, Rhetoric, Talking Points, Closing Argument).
- **Output:** Valid package set or conflict report with suggested resolutions.
**Data Structure:**
```
Package Dependency Graph:
- Nodes: Packages (name, version, layer)
- Edges: Dependencies (requires, conflicts, provides, suggests)
- Constraints: Version ranges, mutual exclusions
```
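The graph described above could be held in memory roughly as follows. This is a minimal sketch, not the platform's actual model: the class names `Package` and `DependencyGraph`, the method names, and the package names and versions in the usage lines are all illustrative assumptions. It covers only `requires` and `conflicts` edges and leaves out `provides`, `suggests`, and version ranges.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Package:
    """Graph node: a package pinned to a version and assigned to a layer."""
    name: str
    version: str
    layer: str


@dataclass
class DependencyGraph:
    """Packages as nodes; 'requires' and 'conflicts' relations as edges."""
    packages: dict[str, Package] = field(default_factory=dict)
    requires: dict[str, set[str]] = field(default_factory=dict)
    conflicts: dict[str, set[str]] = field(default_factory=dict)

    def add(self, pkg: Package, requires=(), conflicts=()) -> None:
        self.packages[pkg.name] = pkg
        self.requires.setdefault(pkg.name, set()).update(requires)
        self.conflicts.setdefault(pkg.name, set()).update(conflicts)

    def direct_conflicts(self) -> list[tuple[str, str]]:
        """Pairs of selected packages that declare a conflict with each other."""
        found = set()
        for name, others in self.conflicts.items():
            for other in others:
                if other in self.packages:
                    found.add(tuple(sorted((name, other))))
        return sorted(found)

    def missing_requirements(self) -> dict[str, set[str]]:
        """Required packages that are not part of the current selection."""
        missing = {}
        for name, deps in self.requires.items():
            absent = deps - set(self.packages)
            if absent:
                missing[name] = absent
        return missing


# Illustrative selection spread over three layers (names and versions are made up).
graph = DependencyGraph()
graph.add(Package("pipewire-pulse", "1.0", "platform"), conflicts={"pulseaudio"})
graph.add(Package("pulseaudio", "17.0", "rhetoric"))
graph.add(Package("firefox", "130.0", "talking_points"), requires={"gtk3"})
```

A direct-conflict check like this is the kind of cheap validation that can run synchronously in the API, while the full SAT-based resolution handles transitive cases.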
**Integration:**
- Called synchronously from API during configuration validation
- Pre-compute common dependency sets for base layers (cache results)
- Asynchronous deep resolution for full build validation
**Sources:**
- [Libsolv SAT Solver](https://github.com/openSUSE/libsolv)
- [Version SAT Research](https://research.swtch.com/version-sat)
- [Dependency Resolution Made Simple](https://borretti.me/article/dependency-resolution-made-simple)
#### 4. Overlay Engine
**Purpose:** Manage layered configuration packages, applying merge strategies and precedence rules.
**Architecture Pattern:**
- **Layer Model:** 5 layers with defined precedence (Closing Argument > Talking Points > Rhetoric > Platform > Opening Statement).
- **OverlayFS Inspiration:** Conceptually similar to OverlayFS union mounting, where upper layers override lower layers.
- **Configuration Merging:** Files from higher layers replace/merge with lower layers based on merge strategy (replace, merge-append, merge-deep).
**Layer Structure:**
```
Layer Definition:
- id: unique identifier
- name: user-facing name (e.g., "Platform")
- order: precedence (1=lowest, 5=highest)
- packages: list of package selections
- files: custom files to overlay
- merge_strategy: how to handle conflicts
```
**Merge Strategies:**
- **Replace:** Higher layer file completely replaces lower
- **Merge-Append:** Concatenate files (e.g., package lists)
- **Merge-Deep:** Smart merge (e.g., JSON/YAML key merging)
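The three strategies can be sketched as plain functions. This is one possible reading under stated assumptions: the function names `deep_merge` and `merge_file` are hypothetical, list content stands for line-based files such as package lists, and dict content stands for parsed JSON/YAML. Skipping duplicate lines in `merge-append` is an added assumption, not something the strategy description requires.

```python
from copy import deepcopy


def deep_merge(lower: dict, upper: dict) -> dict:
    """Recursively merge two mappings; values from `upper` win on conflicts."""
    merged = deepcopy(lower)
    for key, value in upper.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def merge_file(lower, upper, strategy: str):
    """Combine one file's content from a lower and a higher layer."""
    if strategy == "replace":
        return upper
    if strategy == "merge-append":
        # Keep lower-layer lines first, then append lines not already present.
        return lower + [line for line in upper if line not in lower]
    if strategy == "merge-deep":
        return deep_merge(lower, upper)
    raise ValueError(f"unknown merge strategy: {strategy}")
```

Raising on an unknown strategy, rather than silently ignoring the layer, keeps a typo in `merge_strategy` from producing an ISO that is missing a layer's content.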
**Output:** Unified archiso profile with:
- `packages.x86_64` (merged package list)
- `airootfs/` directory (merged filesystem overlay)
- `profiledef.sh` (combined metadata)
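Writing that output to disk might look like the sketch below. The function name `write_profile` and the sample hostname are assumptions for illustration; `profiledef.sh` is deliberately left out because its contents depend on the project's metadata. The path check reflects that overlay file paths come from users and should not be able to escape `airootfs/`.

```python
import tempfile
from pathlib import Path


def write_profile(root: Path, packages: list[str], files: dict[str, str]) -> Path:
    """Write the merged package list and overlay files into a profile directory."""
    root.mkdir(parents=True, exist_ok=True)
    # One package name per line, deduplicated and sorted for stable output.
    package_list = "\n".join(sorted(set(packages))) + "\n"
    (root / "packages.x86_64").write_text(package_list)

    airootfs = (root / "airootfs").resolve()
    for relative_path, content in files.items():
        target = (airootfs / relative_path.lstrip("/")).resolve()
        # Reject user-supplied paths such as "../../etc/shadow".
        if airootfs not in target.parents:
            raise ValueError(f"path escapes airootfs: {relative_path}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
    return root


# Usage: build a throwaway profile and read back what was written.
with tempfile.TemporaryDirectory() as tmp:
    profile = write_profile(
        Path(tmp) / "profile",
        packages=["linux", "base", "base"],
        files={"/etc/hostname": "debate\n"},
    )
    written_packages = (profile / "packages.x86_64").read_text()
    written_hostname = (profile / "airootfs/etc/hostname").read_text()
```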
**Sources:**
- [OverlayFS Linux Kernel Documentation](https://docs.kernel.org/filesystems/overlayfs.html)
- [OverlayFS ArchWiki](https://wiki.archlinux.org/title/Overlay_filesystem)
#### 5. Build Queue Manager (Celery)
**Purpose:** Distributed task queue for asynchronous ISO build jobs with priority scheduling.
**Architecture Pattern:**
- **Web-Queue-Worker Pattern:** Web frontend → Message queue → Worker pool
- **Message Broker:** Redis (low latency) or RabbitMQ (high reliability) for job queue
- **Result Backend:** Redis or PostgreSQL for job status/results
- **Worker Pool:** Multiple Celery workers (one per build server core for CPU-bound builds)
**Job Types:**
1. **Quick Validation:** Dependency resolution (seconds) - High priority
2. **Full Build:** ISO generation (minutes) - Normal priority
3. **Cache Warming:** Pre-build common configurations - Low priority
**Scheduling:**
- **Priority Queue:** User-initiated builds > automated cache warming
- **Rate Limiting:** Prevent queue flooding, enforce user quotas
- **Retry Logic:** Automatic retry with exponential backoff for transient failures
- **Timeout:** Per-job timeout (e.g., 30 min max for build)
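The retry rule above can be made concrete with a small delay calculation. This is a sketch only: the function name `retry_delay` and the default base, cap, and jitter values are assumptions. Celery tasks also accept options such as `autoretry_for`, `retry_backoff`, and `retry_jitter`, which may make hand-written logic unnecessary.

```python
import random


def retry_delay(attempt: int, base: float = 2.0, cap: float = 300.0,
                jitter: bool = True) -> float:
    """Seconds to wait before retry number `attempt` (0-based)."""
    # Double the delay on every attempt, but never exceed the cap.
    delay = min(cap, base * (2 ** attempt))
    if jitter:
        # Randomise within [0, delay] so failed jobs do not retry in lockstep.
        return random.uniform(0, delay)
    return delay
```

Without jitter the first four delays are 2, 4, 8, and 16 seconds, and the cap stops later retries from waiting longer than five minutes.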
**Coordinator Pattern:**
- Single coordinator manages job assignment and worker health
- Leader election for coordinator HA (if scaled beyond single instance)
**Monitoring:**
- Job state transitions logged to PostgreSQL
- Metrics: queue depth, worker utilization, average build time
- Dead letter queue for failed jobs requiring manual investigation
**Sources:**
- [Celery Distributed Task Queue](https://docs.celeryq.dev/)
- [Design Distributed Job Scheduler](https://www.systemdesignhandbook.com/guides/design-a-distributed-job-scheduler/)
- [Web-Queue-Worker Architecture - Azure](https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/web-queue-worker)
#### 6. Build Execution Workers (archiso-based)
**Purpose:** Execute ISO generation using archiso (mkarchiso) with custom profiles.
**Architecture Pattern:**
- **Profile-Based Build:** Generate temporary archiso profile per build job
- **Isolation:** Each build runs in an isolated environment (separate working directory)
- **Stages:** Profile generation → Package installation → Customization → ISO creation
**Build Process Flow:**
```
1. Profile Generation (Overlay Engine output)
   ├── Create temp directory
   ├── Write packages.x86_64 (merged package list)
   ├── Write profiledef.sh (metadata, permissions)
   ├── Copy airootfs/ overlay files
   └── Configure bootloaders (syslinux, grub, systemd-boot)
2. Package Installation
   ├── mkarchiso downloads packages (pacman cache)
   ├── Install to work_dir/x86_64/airootfs
   └── Apply package configurations
3. Customization (customize_airootfs.sh; deprecated in current archiso releases)
   ├── Enable systemd services
   ├── Apply user-specific configs
   ├── Run post-install scripts
   └── Set permissions
4. ISO Generation
   ├── Create kernel and initramfs images
   ├── Build squashfs filesystem
   ├── Assemble bootable ISO
   ├── Generate checksums
   └── Move to output directory
5. Post-Processing
   ├── Upload ISO to object storage
   ├── Update database (build status, ISO location)
   ├── Cache metadata for reuse
   └── Clean up working directory
```
**Worker Configuration:**
- **Resource Limits:** 1 build per worker (CPU/memory intensive)
- **Concurrency:** 6 workers max (6-core build server)
- **Working Directory:** `/tmp/archiso-tmp-{job_id}` (removed after the build by mkarchiso's `-r` flag)
- **Output Directory:** Temporary → Object storage → Local cleanup
**Optimizations:**
- **Package Cache:** Shared pacman cache across workers (prevent redundant downloads)
- **Layer Caching:** Cache common base layers (Opening Statement variations)
- **Incremental Builds:** Detect unchanged layers, reuse previous airootfs where possible
**Sources:**
- [Archiso ArchWiki](https://wiki.archlinux.org/title/Archiso)
- [Custom Archiso Tutorial](https://serverless.industries/2024/12/30/custom-archiso.en.html)
#### 7. Persistence Layer (PostgreSQL + Object Storage)
**Purpose:** Store configuration data, build metadata, and build artifacts.
**PostgreSQL Schema Design:**
```sql
-- User configurations
CREATE SCHEMA configurations;

CREATE TABLE configurations.user_configs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    name VARCHAR(255) NOT NULL,
    description TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

CREATE TABLE configurations.layers (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    config_id UUID REFERENCES configurations.user_configs(id),
    layer_type VARCHAR(50) NOT NULL,  -- opening_statement, platform, rhetoric, etc.
    layer_order INT NOT NULL,
    merge_strategy VARCHAR(50) DEFAULT 'replace'
);

CREATE TABLE configurations.layer_packages (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    layer_id UUID REFERENCES configurations.layers(id),
    package_name VARCHAR(255) NOT NULL,
    package_version VARCHAR(50),
    required BOOLEAN DEFAULT TRUE
);

CREATE TABLE configurations.layer_files (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    layer_id UUID REFERENCES configurations.layers(id),
    file_path VARCHAR(1024) NOT NULL,  -- path in airootfs
    file_content TEXT,                 -- for small configs
    file_storage_url VARCHAR(2048),    -- for large files in object storage
    permissions VARCHAR(4) DEFAULT '0644'
);

-- Build management
CREATE SCHEMA builds;

CREATE TABLE builds.build_jobs (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    config_id UUID REFERENCES configurations.user_configs(id),
    status VARCHAR(50) NOT NULL,  -- queued, running, success, failed
    priority INT DEFAULT 5,
    started_at TIMESTAMP,
    completed_at TIMESTAMP,
    iso_url VARCHAR(2048),        -- object storage location
    iso_checksum VARCHAR(128),
    error_message TEXT,
    build_log_url VARCHAR(2048)
);

CREATE TABLE builds.build_cache (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    config_hash VARCHAR(64) UNIQUE NOT NULL,  -- hash of layer config
    iso_url VARCHAR(2048),
    created_at TIMESTAMP DEFAULT NOW(),
    last_accessed TIMESTAMP DEFAULT NOW(),
    access_count INT DEFAULT 0
);

-- Package metadata
CREATE SCHEMA packages;

CREATE TABLE packages.package_metadata (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    name VARCHAR(255) UNIQUE NOT NULL,
    description TEXT,
    repository VARCHAR(100),  -- core, extra, multilib, aur (community was merged into extra in 2023)
    version VARCHAR(50),
    dependencies JSONB,       -- {requires: [], conflicts: [], provides: []}
    last_updated TIMESTAMP DEFAULT NOW()
);
```
**Schema Organization Best Practices (2026):**
- Separate schemas for functional areas (configurations, builds, packages)
- Schema-level access control for security isolation
- CI/CD integration with migration tools (Flyway, Alembic)
- Indexes on frequently queried fields (config_id, status, config_hash)
**Object Storage:**
- **Purpose:** Store ISOs (large files, 1-4GB), build logs, custom overlay files
- **Technology:** S3-compatible (AWS S3, MinIO, Cloudflare R2)
- **Structure:**
  - `/isos/{build_id}.iso` - Generated ISOs
  - `/logs/{build_id}.log` - Build logs
  - `/overlays/{layer_id}/{file_path}` - Custom files too large for DB
  - `/cache/{config_hash}.iso` - Cached ISOs for reuse
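Keeping the key layout in one place avoids workers and the API disagreeing about where an artifact lives. The helper names below are hypothetical, and dropping the leading slash is an assumption (S3-style keys are usually written without one). The overlay helper also rejects `..` segments, since `file_path` comes from user input.

```python
from pathlib import PurePosixPath


def iso_key(build_id: str) -> str:
    return f"isos/{build_id}.iso"


def log_key(build_id: str) -> str:
    return f"logs/{build_id}.log"


def cache_key(config_hash: str) -> str:
    return f"cache/{config_hash}.iso"


def overlay_key(layer_id: str, file_path: str) -> str:
    """Key for a custom overlay file; rejects empty paths and '..' segments."""
    parts = PurePosixPath(file_path.lstrip("/")).parts
    if not parts or ".." in parts:
        raise ValueError(f"invalid overlay path: {file_path!r}")
    return "/".join(("overlays", layer_id, *parts))
```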
**Sources:**
- [PostgreSQL Schema Design Best Practices 2026](https://wiki.postgresql.org/wiki/Database_Schema_Recommendations_for_an_Application)
- [SQL Database Fundamentals 2026](https://www.nucamp.co/blog/sql-and-database-fundamentals-in-2026-queries-design-and-postgresql-essentials)
## Data Flow
### Configuration Creation Flow
```
User (Frontend)
↓ (1) Create/Edit configuration
API Layer (Validation)
↓ (2) Validate input
Dependency Resolver
↓ (3) Check conflicts
↓ (4) Return validation result
API Layer
↓ (5) Save configuration
PostgreSQL (configurations schema)
↓ (6) Return config_id
Frontend (Display confirmation)
```
### Build Submission Flow
```
User (Frontend)
↓ (1) Submit build request
API Layer
↓ (2) Check cache (config hash)
PostgreSQL (build_cache)
├─→ (3a) Cache hit: return cached ISO URL
└─→ (3b) Cache miss: create build job
Build Queue Manager (Celery)
↓ (4) Enqueue job with priority
Message Broker (Redis/RabbitMQ)
↓ (5) Job dispatched to worker
Celery Worker
↓ (6a) Fetch configuration from DB
↓ (6b) Generate archiso profile (Overlay Engine)
↓ (6c) Execute mkarchiso
↓ (6d) Upload ISO to object storage
↓ (6e) Update build status in DB
PostgreSQL + Object Storage
↓ (7) Job complete
API Layer (WebSocket)
↓ (8) Notify user
Frontend (Display download link)
```
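Because several components update a build's status along this flow, it can help to make the legal status changes explicit. The sketch below uses the four status values from the `build_jobs` table; the transition table itself and the function name `advance` are assumptions, for example that a queued job may fail directly if it is rejected before a worker picks it up.

```python
# Allowed status changes for a build job; terminal states have no successors.
ALLOWED_TRANSITIONS = {
    "queued": {"running", "failed"},
    "running": {"success", "failed"},
    "success": set(),
    "failed": set(),
}


def advance(current: str, new: str) -> str:
    """Return `new` if the move is allowed, otherwise raise ValueError."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal status change: {current} -> {new}")
    return new
```

Enforcing this in the code path that writes `build_jobs.status` prevents a late or duplicated worker message from moving a finished job back to `running`.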
### Real-Time Progress Updates Flow
```
Celery Worker
↓ (1) Emit progress events during build
↓ (e.g., "downloading packages", "generating ISO")
Celery Result Backend
↓ (2) Store progress state
API Layer (WebSocket handler)
↓ (3) Poll/subscribe to job progress
↓ (4) Push updates to client
Frontend (WebSocket listener)
↓ (5) Update UI progress bar
```
## Patterns to Follow
### Pattern 1: Layered Configuration Precedence
**What:** Higher layers override lower layers with defined merge strategies.
**When:** User customizes configuration across multiple layers (Platform, Rhetoric, etc.).
**Implementation:**
```python
from __future__ import annotations

# Layer, Profile and deep_merge are assumed to be defined elsewhere in the project.


class OverlayEngine:
    def merge_layers(self, layers: list[Layer]) -> Profile:
        """Merge layers from lowest to highest precedence."""
        sorted_layers = sorted(layers, key=lambda layer: layer.order)
        profile = Profile()
        for layer in sorted_layers:
            profile = self.apply_layer(profile, layer)
        return profile

    def apply_layer(self, profile: Profile, layer: Layer) -> Profile:
        """Apply layer based on merge strategy."""
        if layer.merge_strategy == "replace":
            profile.files.update(layer.files)  # Overwrite files with the same path
        elif layer.merge_strategy == "merge-append":
            profile.packages.extend(layer.packages)  # Append
        elif layer.merge_strategy == "merge-deep":
            profile.config = deep_merge(profile.config, layer.config)
        else:
            raise ValueError(f"Unknown merge strategy: {layer.merge_strategy}")
        return profile
```
**Source:** OverlayFS union mount concepts applied to configuration management.
### Pattern 2: SAT-Based Dependency Resolution
**What:** Translate package dependencies to boolean satisfiability problem, solve with CDCL algorithm.
**When:** User adds package to configuration, system detects conflicts.
**Implementation:**
```python
from __future__ import annotations

# SATSolver, Clause, Resolution, Package and the Implies/Not/And constructors
# are placeholders for whichever solver binding is chosen (e.g. libsolv).


class DependencyResolver:
    def resolve(self, packages: list[Package]) -> Resolution:
        """Resolve dependencies using SAT solver."""
        clauses = self.build_clauses(packages)
        solver = SATSolver()
        result = solver.solve(clauses)
        if result.satisfiable:
            return Resolution(success=True, packages=result.model)
        else:
            conflicts = self.explain_conflicts(result.unsat_core)
            alternatives = self.suggest_alternatives(conflicts)
            return Resolution(success=False, conflicts=conflicts,
                              alternatives=alternatives)

    def build_clauses(self, packages: list[Package]) -> list[Clause]:
        """Convert the dependency graph to logical constraints for the solver."""
        clauses = []
        for pkg in packages:
            # If package selected, all dependencies must be selected
            for dep in pkg.requires:
                clauses.append(Implies(pkg, dep))  # CNF: (NOT pkg) OR dep
            # If package selected, no conflicts can be selected
            for conflict in pkg.conflicts:
                clauses.append(Not(And(pkg, conflict)))  # CNF: (NOT pkg) OR (NOT conflict)
        return clauses
```
**Source:** [Libsolv implementation patterns](https://github.com/openSUSE/libsolv)
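To show the clause encoding end to end, here is a self-contained toy solver. It is a plain DPLL search, not CDCL, and is not how libsolv works internally: it has no clause learning, no unsat-core extraction, and no version handling. The function names `encode` and `solve` and the package names are illustrative assumptions. Literals are DIMACS-style integers: `3` means "package 3 is installed" and `-3` means it is not.

```python
def encode(selected, requires, conflicts):
    """Build CNF clauses over integer variables from package relations."""
    names = set(selected) | set(requires) | set(conflicts)
    names |= {dep for deps in requires.values() for dep in deps}
    names |= {other for others in conflicts.values() for other in others}
    var = {name: index + 1 for index, name in enumerate(sorted(names))}

    clauses = [[var[pkg]] for pkg in selected]               # user picked pkg
    for pkg, deps in requires.items():
        clauses += [[-var[pkg], var[dep]] for dep in deps]   # pkg implies dep
    for pkg, others in conflicts.items():
        clauses += [[-var[pkg], -var[o]] for o in others]    # not (pkg and o)
    return var, clauses


def solve(clauses, assignment=None):
    """Plain DPLL: return a {variable: bool} model, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
            continue                      # clause already satisfied
        rest = [lit for lit in clause if abs(lit) not in assignment]
        if not rest:
            return None                   # every literal is false: dead end
        simplified.append(rest)
    if not simplified:
        return assignment

    # Prefer a unit clause (forced value); otherwise branch on the first literal.
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    lit = unit if unit is not None else simplified[0][0]
    choices = [lit > 0] if unit is not None else [lit > 0, lit <= 0]
    for value in choices:
        model = solve(simplified, {**assignment, abs(lit): value})
        if model is not None:
            return model
    return None


# Satisfiable: firefox pulls in gtk3.
var, clauses = encode(["firefox"], {"firefox": ["gtk3"]}, {})
model = solve(clauses)
installed = sorted(name for name, v in var.items() if model.get(v))

# Unsatisfiable: both sides of a declared conflict were selected.
_, bad_clauses = encode(["pipewire-pulse", "pulseaudio"], {},
                        {"pipewire-pulse": ["pulseaudio"]})
unsolvable = solve(bad_clauses)
```

Variables missing from the returned model were never forced either way and can be treated as "not installed".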
### Pattern 3: Asynchronous Build Queue with Progress Tracking
**What:** Submit long-running build jobs to queue, track progress, notify on completion.
**When:** User submits build request (ISO generation takes minutes).
**Implementation:**
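```python
import subprocess
import uuid
from pathlib import Path
from uuid import UUID

# `app`, `celery`, `overlay_engine`, `check_cache`, `compute_config_hash`
# and `upload_iso` are assumed to be defined elsewhere in the project.


# API endpoint
@app.post("/api/v1/builds")
async def submit_build(config_id: UUID):
    # Check cache first
    cache_key = compute_config_hash(config_id)
    cached = await check_cache(cache_key)
    if cached:
        return {"status": "cached", "iso_url": cached.iso_url}
    # Enqueue build job (pass the ID as a string so it serializes cleanly)
    job = build_iso.apply_async(
        args=[str(config_id)],
        priority=5,
        task_id=str(uuid.uuid4()),
    )
    return {"status": "queued", "job_id": job.id}


# Celery task
@celery.task(bind=True)
def build_iso(self, config_id: str):
    self.update_state(state='PREPARING', meta={'progress': 10})
    # Generate profile
    profile = overlay_engine.generate_profile(config_id)
    self.update_state(state='BUILDING', meta={'progress': 30})
    # Run mkarchiso with per-job work and output directories so that
    # concurrent builds on different workers cannot overwrite each other
    work_dir = Path(f'/tmp/archiso-tmp-{self.request.id}')
    out_dir = Path(f'/tmp/archiso-out-{self.request.id}')
    subprocess.run(
        ['mkarchiso', '-v', '-r', '-w', str(work_dir), '-o', str(out_dir),
         str(profile.path)],
        check=True,  # Raise CalledProcessError so a failed build fails the task
    )
    self.update_state(state='UPLOADING', meta={'progress': 80})
    # Upload to object storage; the ISO file name is determined by the profile
    iso_path = next(out_dir.glob('*.iso'))
    iso_url = upload_iso(str(iso_path))
    return {"iso_url": iso_url, "progress": 100}
```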
**Source:** [Celery best practices](https://docs.celeryq.dev/), [Web-Queue-Worker pattern](https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/web-queue-worker)
### Pattern 4: Cache-First Build Strategy
**What:** Hash configuration, check cache before building, reuse identical ISOs.
**When:** User submits build that may have been built previously.
**Implementation:**
```python
import hashlib
import json
from datetime import datetime
from typing import Optional
from uuid import UUID

# `db`, `Config`, `BuildCache` and `CachedBuild` are assumed project objects.


def compute_config_hash(config_id: UUID) -> str:
    """Create deterministic hash of configuration."""
    config = db.query(Config).get(config_id)
    # Include all layers, packages, files in hash
    hash_input = {
        "layers": sorted([
            {
                "type": layer.type,
                "packages": sorted(layer.packages),
                "files": sorted([
                    {
                        "path": f.path,
                        # Built-in hash() is salted per process, so use SHA-256
                        "content_hash": hashlib.sha256(
                            f.content.encode()).hexdigest(),
                    }
                    for f in layer.files
                ], key=lambda x: x["path"]),  # dicts are not orderable without a key
            }
            for layer in config.layers
        ], key=lambda x: x["type"])
    }
    return hashlib.sha256(
        json.dumps(hash_input, sort_keys=True).encode()
    ).hexdigest()


async def check_cache(config_hash: str) -> Optional[CachedBuild]:
    """Check if ISO exists for this configuration."""
    cached = await db.query(BuildCache).filter_by(
        config_hash=config_hash
    ).first()
    if cached and cached.iso_exists():
        # Update access metadata
        cached.last_accessed = datetime.now()
        cached.access_count += 1
        await db.commit()
        return cached
    return None
```
**Benefit:** Reduces build time from minutes to seconds for repeated configurations. Critical for popular base configurations (e.g., "KDE Desktop with development tools").
## Anti-Patterns to Avoid
### Anti-Pattern 1: Blocking API Calls During Build
**What:** Synchronously waiting for ISO build to complete in API endpoint.
**Why bad:** Ties up API worker for minutes, prevents handling other requests, poor user experience with timeout risks.
**Instead:** Use an asynchronous task queue (Celery) with WebSocket/SSE for progress updates. The API returns immediately with a job_id, and the frontend polls or subscribes for updates.
**Example:**
```python
# BAD: Blocking build
@app.post("/builds")
def build(config_id):
    iso = generate_iso(config_id)  # Takes 10 minutes!
    return {"iso_url": iso}


# GOOD: Async queue
@app.post("/builds")
async def build(config_id):
    job = build_iso.delay(config_id)
    return {"job_id": job.id, "status": "queued"}
```
### Anti-Pattern 2: Duplicating State Between React and Three.js
**What:** Maintaining separate state trees for application data and 3D scene, manually syncing.
**Why bad:** State gets out of sync, bugs from inconsistent data, complexity in update logic.
**Instead:** Single source of truth in React state. Scene derives from state. User interactions → dispatch actions → update state → scene re-renders.
**Example:**
```javascript
// BAD: Separate state
const [appState, setAppState] = useState({packages: []});
const [sceneObjects, setSceneObjects] = useState([]);

// GOOD: Scene derives from app state
const [config, setConfig] = useState({packages: []});
function Scene({packages}) {
  return packages.map(pkg => <PackageMesh key={pkg.id} {...pkg} />);
}
```
**Source:** [React Three Fiber state management best practices](https://medium.com/cortico/3d-data-visualization-with-react-and-three-js-7272fb6de432)
### Anti-Pattern 3: Storing Large Files in PostgreSQL
**What:** Storing ISO files (1-4GB) or build logs (megabytes) as BYTEA in PostgreSQL.
**Why bad:** Database bloat, slow backups, memory pressure, poor performance for large blob operations.
**Instead:** Store large files in object storage (S3/MinIO), keep URLs/metadata in PostgreSQL.
**Example:**
```sql
-- BAD: ISO in database
CREATE TABLE builds (
    id UUID PRIMARY KEY,
    iso_data BYTEA  -- 2GB blob!
);

-- GOOD: URL reference
CREATE TABLE builds (
    id UUID PRIMARY KEY,
    iso_url VARCHAR(2048),  -- s3://bucket/isos/{id}.iso
    iso_checksum VARCHAR(128),
    iso_size_bytes BIGINT
);
```
### Anti-Pattern 4: Running Multiple Builds Per Worker Concurrently
**What:** Allowing a single Celery worker to process multiple ISO builds in parallel.
**Why bad:** ISO generation is CPU and memory intensive (compressing filesystem, creating squashfs). Running multiple builds causes resource contention, thrashing, and OOM kills.
**Instead:** Configure Celery workers with concurrency=1 for build tasks. Run one build per worker. Scale horizontally with multiple workers.
**Example:**
```bash
# BAD: Multiple concurrent builds
celery -A app worker --concurrency=4 # 4 builds at once on 6-core machine
# GOOD: One build per worker
celery -A app worker --concurrency=1 -Q builds # Start 6 workers for 6 cores
```
### Anti-Pattern 5: No Dependency Validation Until Build Time
**What:** Allowing users to save configurations without checking package conflicts, discovering issues during ISO build.
**Why bad:** Wastes build resources (minutes of CPU time), poor user experience (delayed error feedback), difficult to debug which package caused failure.
**Instead:** Run dependency resolution in API layer during configuration save/update. Provide immediate feedback with conflict explanations and alternatives.
**Example:**
```python
# BAD: Validate during build
@celery.task
def build_iso(config_id):
    packages = load_packages(config_id)
    result = resolve_dependencies(packages)  # Fails here after queueing!
    if not result.success:
        raise BuildError("Conflicts detected")


# GOOD: Validate on save
@app.post("/configs")
async def save_config(config: ConfigInput):
    resolution = dependency_resolver.resolve(config.packages)
    if not resolution.success:  # same flag as Resolution in Pattern 2
        return {"error": "conflicts", "details": resolution.conflicts}
    await db.save(config)
    return {"success": True}
```
## Scalability Considerations
| Concern | At 100 users | At 10K users | At 1M users |
|---------|--------------|--------------|-------------|
| **API Layer** | Single FastAPI instance | Multiple instances behind load balancer | Auto-scaling group, CDN for static assets |
| **Build Queue** | Single Redis broker | Redis cluster or RabbitMQ | Kafka for high-throughput messaging |
| **Workers** | 1 build server (6 cores) | 3-5 build servers | Auto-scaling worker pool, spot instances |
| **Database** | Single PostgreSQL instance | Primary + read replicas | Sharded PostgreSQL or distributed SQL (CockroachDB) |
| **Storage** | Local MinIO | S3-compatible with CDN | Multi-region S3 with CloudFront |
| **Caching** | In-memory cache | Redis cache cluster | Multi-tier cache (Redis + CDN) |
### Horizontal Scaling Strategy
**API Layer:**
- Stateless FastAPI instances (session in DB/Redis)
- Load balancer (Nginx, HAProxy, AWS ALB)
- Auto-scaling based on CPU/request latency
**Build Workers:**
- Independent Celery workers connecting to shared broker
- Each worker runs 1 build at a time
- Scale workers based on queue depth (add workers when >10 jobs queued)
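The queue-depth rule can be expressed as a small decision function. This is a sketch under stated assumptions: the name `desired_workers`, the one-step scale-up, and the scale-down-when-idle rule are illustrative. Only the "more than 10 queued jobs" threshold and the 6-worker ceiling come from the text above.

```python
def desired_workers(queue_depth: int, current: int, *, threshold: int = 10,
                    min_workers: int = 1, max_workers: int = 6) -> int:
    """Return the worker count to aim for given the current queue depth."""
    if queue_depth > threshold:
        target = current + 1      # backlog is growing: add one worker
    elif queue_depth == 0:
        target = current - 1      # queue is idle: release one worker (assumed rule)
    else:
        target = current          # within tolerance: leave the pool alone
    # Clamp to the configured pool limits.
    return max(min_workers, min(max_workers, target))
```

Changing the pool by one worker per evaluation keeps it from oscillating when the queue depth hovers around the threshold.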
**Database:**
- Read replicas for queries (config lookups)
- Write operations to primary (build status updates)
- Connection pooling (PgBouncer)
**Storage:**
- Object storage is inherently scalable
- CDN for ISO downloads (reduce egress costs)
- Lifecycle policies (delete ISOs older than 30 days if not accessed)
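The "not accessed in 30 days" rule depends on last-access time, which the `build_cache.last_accessed` column already records. Object-store lifecycle rules are typically based on object age rather than last access, so a periodic job along these lines may be needed. The function name `expired_entries` and the dict-shaped rows are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def expired_entries(entries, now: datetime,
                    max_idle: timedelta = timedelta(days=30)) -> list[str]:
    """Return config hashes whose cached ISO has been idle longer than max_idle."""
    return [
        entry["config_hash"]
        for entry in entries
        if now - entry["last_accessed"] > max_idle
    ]


# Usage with two illustrative cache rows.
now = datetime(2026, 1, 25, tzinfo=timezone.utc)
rows = [
    {"config_hash": "a", "last_accessed": datetime(2025, 12, 1, tzinfo=timezone.utc)},
    {"config_hash": "b", "last_accessed": datetime(2026, 1, 10, tzinfo=timezone.utc)},
]
to_delete = expired_entries(rows, now)
```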
## Build Order Implications for Development
### Phase 1: Core Infrastructure
**What to build:** Database schema, basic API scaffolding, object storage setup.
**Why first:** Foundation for all other components. No dependencies on complex logic.
**Duration estimate:** 1-2 weeks
### Phase 2: Configuration Management
**What to build:** Layer data models, CRUD endpoints, basic validation.
**Why second:** Enables testing configuration storage before complex dependency resolution.
**Duration estimate:** 1-2 weeks
### Phase 3: Dependency Resolver (Simplified)
**What to build:** Basic conflict detection (direct conflicts only, no SAT solver yet).
**Why third:** Provides early validation capability. Full SAT solver can wait.
**Duration estimate:** 1 week
### Phase 4: Overlay Engine
**What to build:** Layer merging logic, profile generation for archiso.
**Why fourth:** Requires configuration data models from Phase 2. Produces profiles for builds.
**Duration estimate:** 2 weeks
### Phase 5: Build Queue + Workers
**What to build:** Celery setup, basic build task, worker orchestration.
**Why fifth:** Depends on Overlay Engine for profile generation. Core value delivery.
**Duration estimate:** 2-3 weeks
### Phase 6: Frontend (Basic)
**What to build:** React UI for configuration (forms, no 3D yet), build submission.
**Why sixth:** API must exist first. Provides usable interface for testing builds.
**Duration estimate:** 2-3 weeks
### Phase 7: Advanced Dependency Resolution
**What to build:** Full SAT solver integration, conflict explanations, alternatives.
**Why seventh:** Complex feature. System works with basic validation from Phase 3.
**Duration estimate:** 2-3 weeks
### Phase 8: 3D Visualization
**What to build:** Three.js integration, layer visualization, visual debugging.
**Why eighth:** Polish/differentiator feature. Core functionality works without it.
**Duration estimate:** 3-4 weeks
### Phase 9: Caching + Optimization
**What to build:** Build cache, package cache, performance tuning.
**Why ninth:** Optimization after core features work. Requires usage data to tune.
**Duration estimate:** 1-2 weeks
**Total estimated duration:** 15-22 weeks (roughly 3.5-5 months), obtained by summing the per-phase estimates and assuming the phases run sequentially.
## Critical Architectural Decisions
### Decision 1: Message Broker (Redis vs RabbitMQ)
**Recommendation:** Start with Redis, migrate to RabbitMQ if reliability requirements increase.
**Rationale:**
- Redis: Lower latency, simpler setup, sufficient for <10K builds/day
- RabbitMQ: Higher reliability, message persistence, better for >100K builds/day
**When to switch:** If experiencing message loss or need guaranteed delivery.
### Decision 2: Container-Based vs. Direct archiso
**Recommendation:** Use direct archiso (mkarchiso) on bare metal workers initially.
**Rationale:**
- Container-based (like Bazzite/Universal Blue) adds complexity (OCI image builds)
- Direct archiso is simpler, well-documented, less abstraction
- Can containerize workers later if isolation/portability becomes critical
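**When to reconsider:** Multi-cloud deployment, or when strong isolation between builds is needed. The project's PITFALLS.md flags build sandboxing as required from day one, so isolation may need to be addressed earlier than this recommendation suggests.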
### Decision 3: Monolithic vs. Microservices API
**Recommendation:** Start monolithic (single FastAPI app), split services if scaling demands.
**Rationale:**
- Monolith: Faster development, easier debugging, sufficient for <100K users
- Microservices: Adds operational complexity (service mesh, inter-service communication)
**When to split:** If specific services (e.g., dependency resolver) need independent scaling.
### Decision 4: Real-Time Updates (WebSocket vs. SSE vs. Polling)
**Recommendation:** Use Server-Sent Events (SSE) for build progress.
**Rationale:**
- WebSocket: Bidirectional, but overkill for one-way progress updates
- SSE: Simpler, built-in reconnection, sufficient for progress streaming
- Polling: Wasteful, higher latency
**Implementation:**
```python
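import asyncio
import json

from sse_starlette.sse import EventSourceResponse  # assumes the sse-starlette package

# `app` (the FastAPI instance) and `get_job_status` (an async status lookup)
# are assumed to be defined elsewhere in the API layer.
@app.get("/api/v1/builds/{job_id}/stream")
async def stream_progress(job_id: str):
    async def event_generator():
        while True:
            status = await get_job_status(job_id)
            # EventSourceResponse adds the "data:" framing itself, so yield the payload only.
            yield {"data": json.dumps(status)}
            # Stop streaming once the job reaches a terminal state.
            if status["state"] in ("SUCCESS", "FAILURE"):
                break
            await asyncio.sleep(1)
    return EventSourceResponse(event_generator())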
```
## Sources
**Archiso & Build Systems:**
- [Archiso ArchWiki](https://wiki.archlinux.org/title/Archiso) - MEDIUM confidence
- [Custom Archiso Tutorial 2024](https://serverless.industries/2024/12/30/custom-archiso.en.html) - MEDIUM confidence
- [Bazzite ISO Build Process](https://deepwiki.com/ublue-os/bazzite/2.6-iso-build-process) - MEDIUM confidence
- [Universal Blue](https://universal-blue.org/) - MEDIUM confidence
**Dependency Resolution:**
- [Libsolv SAT Solver](https://github.com/openSUSE/libsolv) - HIGH confidence (official)
- [Version SAT Research](https://research.swtch.com/version-sat) - HIGH confidence
- [Dependency Resolution Made Simple](https://borretti.me/article/dependency-resolution-made-simple) - MEDIUM confidence
- [Package Conflict Resolution](https://distropack.dev/Blog/Post?slug=package-conflict-resolution-handling-conflicting-packages) - LOW confidence
**API & Queue Architecture:**
- [FastAPI Architecture Patterns 2026](https://medium.com/algomart/modern-fastapi-architecture-patterns-for-scalable-production-systems-41a87b165a8b) - MEDIUM confidence
- [Celery Documentation](https://docs.celeryq.dev/) - HIGH confidence (official)
- [Web-Queue-Worker Pattern - Azure](https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/web-queue-worker) - HIGH confidence (official)
- [Design Distributed Job Scheduler](https://www.systemdesignhandbook.com/guides/design-a-distributed-job-scheduler/) - MEDIUM confidence
**Storage & Database:**
- [PostgreSQL Schema Design Best Practices](https://wiki.postgresql.org/wiki/Database_Schema_Recommendations_for_an_Application) - HIGH confidence (official)
- [OverlayFS Linux Kernel Docs](https://docs.kernel.org/filesystems/overlayfs.html) - HIGH confidence (official)
**Frontend:**
- [React Three Fiber Performance 2026](https://graffersid.com/react-three-fiber-vs-three-js/) - MEDIUM confidence
- [3D Data Visualization with React](https://medium.com/cortico/3d-data-visualization-with-react-and-three-js-7272fb6de432) - MEDIUM confidence
## Confidence Assessment
- **Overall Architecture:** MEDIUM-HIGH - Based on established patterns (web-queue-worker, archiso) with modern 2026 practices
- **Component Boundaries:** HIGH - Clear separation of concerns, well-defined interfaces
- **Build Process:** HIGH - archiso is well-documented, multiple reference implementations
- **Dependency Resolution:** MEDIUM - SAT solver approach is proven, but integration complexity unknown
- **Scalability:** MEDIUM - Patterns are sound, but specific bottlenecks depend on usage patterns
- **Frontend 3D:** MEDIUM - Three.js + React patterns established, but performance depends on complexity

@ -0,0 +1,232 @@
# Feature Landscape
**Domain:** Linux Distribution Builder and Customization Platform
**Researched:** 2026-01-25
## Table Stakes
Features users expect. Missing = product feels incomplete.
| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| **Package Selection** | Core functionality - users need to choose what software gets installed | Medium | All ISO builders (archiso, Cubic, live-build) provide this. Must support searching, categorizing packages. Debate metaphor: "Talking Points" |
| **Base Distribution Selection** | Users need a foundation to build from | Low | Standard in all tools. Debate calls this "Opening Statement" (Arch, Ubuntu, etc.) |
| **ISO Generation** | End product - bootable installation media | High | Essential output format. Tools like archiso, Cubic all produce .iso files. Requires build system integration |
| **Configuration Persistence** | Users expect to save and reload their work | Medium | All modern tools save configurations (archiso profiles, NixOS configs). Debate calls this "Speech" |
| **Bootloader Configuration** | ISOs must boot on target hardware | Medium | Both UEFI and BIOS support expected. archiso supports syslinux, GRUB, systemd-boot |
| **Kernel Selection** | Users may need specific kernel versions | Low | archiso allows multiple kernels. Important for hardware compatibility |
| **User/Password Setup** | Basic system access configuration | Low | Expected in all distribution builders |
| **Locale/Keyboard Configuration** | System must support user's language/region | Low | Standard feature across all tools |
## Differentiators
Features that set product apart. Not expected, but valued.
| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| **Visual Conflict Resolution** | Makes dependency hell visible and solvable for non-experts | High | UNIQUE to Debate. Current tools show cryptic error messages. Visual "Objection" system could be game-changing for accessibility |
| **Live Preview in Browser** | See customizations before building ISO | Very High | Web-based VM preview would be revolutionary. Current tools require local VM testing. Enables instant gratification |
| **Curated Starting Templates** | Pre-configured setups (like Omarchy) as starting points | Medium | Inspired by Hyprland dotfiles community and r/unixporn. Debate's "Opening Statements" as gallery |
| **Visual Theme Customization** | GUI for selecting/previewing window managers, themes, icons | Medium | Tools like HyprRice exist for post-install. Doing it PRE-install is differentiator. Debate's "Rhetoric" metaphor |
| **One-Click Export to Multiple Formats** | ISO, USB image, Ventoy-compatible, VM disk | Medium | Ventoy integration is emerging trend. Multi-format export reduces friction |
| **Conflict Explanation System** | AI-assisted or rule-based explanations for why packages conflict | High | Educational value. Turns errors into learning moments. Could use LLM for natural language explanations |
| **Community Template Gallery** | Browse/fork/share custom configurations | Medium | Inspired by dotfiles.github.io and awesome-dotfiles. Social feature drives engagement |
| **Configuration Comparison** | Visual diff between two "Speeches" | Medium | Helps users understand what changed. Useful for learning from others' configs |
| **Automatic Optimization Suggestions** | "You selected KDE and GNOME - did you mean to?" | Medium | Catches common mistakes. Reduces ISO bloat |
| **Real-time Build Size Calculator** | Show ISO size as user adds packages | Low | Prevents surprise "ISO too large" errors at build time |
| **Secure Boot Support** | Generate signed ISOs for secure boot systems | High | archiso added this recently. Becoming table stakes for 2026+ |
| **Reproducible Builds** | Same config = identical ISO every time | Medium | Security/verification feature. Inspired by NixOS philosophy |
## Anti-Features
Features to explicitly NOT build. Common mistakes in this domain.
| Anti-Feature | Why Avoid | What to Do Instead |
|--------------|-----------|-------------------|
| **Full NixOS-style Declarative Config** | Too complex for target audience. Defeats "accessibility" goal | Provide simple GUI with optional advanced mode. Let users export to NixOS/Ansible later if they want |
| **Build Everything Locally** | Computationally expensive, slow, blocks UX | Use cloud build workers. Users configure, servers build. Stream logs for transparency |
| **Support Every Distro at Launch** | Maintenance nightmare, quality suffers | Start with Arch (Omarchy use case). Add Ubuntu/Fedora based on demand. Deep > wide |
| **Custom Package Repository Hosting** | Infrastructure burden, security liability | Use existing repos (AUR, official). Let users add custom repos via URL, but don't host |
| **Native Desktop App** | Limits accessibility, cross-platform pain | Web-first. Desktop can be Electron wrapper later if needed |
| **Real-time Collaboration** | Complex to build, unclear value | Async sharing via templates is sufficient. Can add later if users demand it |
| **Post-Install Configuration** | Scope creep - becomes a remote management tool | Focus on ISO creation. Link to Ansible/SaltStack/dotfiles managers for post-install |
| **Automated Testing of ISOs** | Resource-intensive, brittle, unclear ROI for MVP | Manual testing, community validation. Automate after product-market fit |
## Feature Dependencies
```
Foundation Layer:
  Base Distro Selection → Package Repository Access

Package Layer:
  Package Selection → Conflict Detection
  Conflict Detection → Conflict Resolution UI

Configuration Layer:
  WM/DE Selection → Theme Selection (themes must match WM)
  Package Selection → Build Size Calculator

Build Layer:
  All Configuration → ISO Generation
  ISO Generation → Export Format Options

Sharing Layer:
  Configuration Persistence → Template Gallery
  Template Gallery → Configuration Comparison
```
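
The dependency arrows above can be turned into a build order mechanically. The following is a minimal sketch, assuming the diagram's edges are transcribed literally (including the "All Configuration" pseudo-node); the `FEATURE_PREREQS` name and `build_order` helper are illustrative, not part of any existing codebase.

```python
from graphlib import TopologicalSorter

# Each feature maps to the set of features it depends on,
# transcribed from the arrows in the diagram above.
FEATURE_PREREQS = {
    "Package Repository Access": {"Base Distro Selection"},
    "Conflict Detection": {"Package Selection"},
    "Conflict Resolution UI": {"Conflict Detection"},
    "Theme Selection": {"WM/DE Selection"},
    "Build Size Calculator": {"Package Selection"},
    "ISO Generation": {"All Configuration"},
    "Export Format Options": {"ISO Generation"},
    "Template Gallery": {"Configuration Persistence"},
    "Configuration Comparison": {"Template Gallery"},
}


def build_order(prereqs: dict[str, set[str]]) -> list[str]:
    """Return the features in an order where prerequisites always come first.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(prereqs).static_order())


if __name__ == "__main__":
    for position, feature in enumerate(build_order(FEATURE_PREREQS), start=1):
        print(position, feature)
```

Any order this produces is only one of several valid ones; the useful property is that a circular dependency between features surfaces as an error instead of a silent planning mistake.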
**Critical Path for MVP:**
1. Base Distro Selection
2. Package Selection
3. Conflict Detection (basic)
4. ISO Generation
5. Configuration Save/Load
**Enhancement Path:**
1. Visual Conflict Resolution (differentiator)
2. Theme Customization
3. Template Gallery
4. Live Preview (if feasible)
## MVP Recommendation
For MVP, prioritize:
### Must Have (Table Stakes):
1. **Base Distro Selection** - Start with Arch only (Omarchy use case)
2. **Package Selection** - Visual interface for browsing/selecting packages
3. **Basic Conflict Detection** - Show when packages conflict, even if resolution is manual
4. **Configuration Save/Load** - Users can save their "Speech"
5. **ISO Generation** - Basic working ISO output
6. **Bootloader Config** - UEFI + BIOS support
### Should Have (Core Differentiators):
7. **Curated Starting Template** - Omarchy as first "Opening Statement"
8. **Visual Conflict Resolution** - The "Objection" system - this is your moat
9. **Build Size Calculator** - Real-time feedback prevents mistakes
### Nice to Have (Polish):
10. **Theme Preview** - Screenshots of WMs/themes
11. **Export to USB Format** - Ventoy-compatible output
Defer to post-MVP:
- **Live Preview**: Very high complexity, requires VM infrastructure. Get manual testing feedback first
- **Template Gallery**: Need user base first. Can launch with 3-5 curated templates
- **Multi-distro Support**: Ubuntu/Fedora after Arch works perfectly
- **Conflict Explanations**: Start with simple error messages, enhance with AI later
- **Secure Boot**: Nice to have but not critical for the target audience (Linux-curious users are likely to disable Secure Boot anyway)
- **Reproducible Builds**: Important for security-conscious users but not core value prop
## Platform-Specific Notes
### Web Platform Advantages (for Debate):
- **Accessibility**: No installation barrier, works on any OS
- **Community**: Easy sharing via URLs
- **Iteration**: Can update without user action
- **Discovery**: SEO/social sharing drives growth
### Web Platform Challenges:
- **Build Performance**: Offload to backend workers, not client-side
- **File Size**: Users downloading multi-GB ISOs - need CDN
- **Preview**: Browser-based VM is hard - consider VNC to backend VM
## Competitive Analysis
### Existing Tool Categories:
**Command-Line Tools** (archiso, live-build):
- Strengths: Powerful, flexible, reproducible
- Weaknesses: Steep learning curve, text-based config
- Debate advantage: Visual UI, guided flow
**Desktop GUI Tools** (Cubic, HyprRice):
- Strengths: Easier than CLI, visual feedback
- Weaknesses: Post-install only (HyprRice) or Ubuntu-only (Cubic), still requires Linux knowledge
- Debate advantage: Web-based (works on any OS), pre-install customization, conflict resolution
**Web Services** (SUSE Studio - discontinued):
- Strengths: Accessible, shareable
- Weaknesses: Vendor-locked, no longer maintained
- Debate advantage: Modern stack, open ecosystem (Arch/AUR), domain-specific UX (debate metaphor)
**Declarative Systems** (NixOS):
- Strengths: Reproducible, programmable, powerful
- Weaknesses: Very steep learning curve, unique syntax
- Debate advantage: Visual-first, approachable for non-programmers
### Feature Gap Analysis:
**What nobody does well:**
1. Visual conflict resolution for non-experts
2. Web-based ISO creation for any OS
3. Social/sharing features for configurations
4. Beginner-friendly theming/ricing PRE-install
**What Debate can own:**
1. "Linux customization for the visual web generation"
2. "GitHub for Linux configurations" (social sharing)
3. "What Canva did for design, Debate does for Linux"
## User Journey Feature Mapping
### Target Persona 1: Linux-Curious Switcher
**Pain Points**: Overwhelmed by options, afraid of breaking system, wants pretty desktop
**Critical Features**:
- Curated starting templates (fewer choices, less decision paralysis)
- Visual theme preview (see before build)
- Conflict resolution with explanations (learning aid)
- One-click export to USB (easy to test)
### Target Persona 2: Enthusiast Ricer
**Pain Points**: Post-install configuration tedious, wants to share setups, iterates frequently
**Critical Features**:
- Granular package selection (power user control)
- Template gallery for inspiration/sharing
- Configuration comparison (learn from others)
- Fast iteration (quick rebuilds)
### Target Persona 3: Content Creator
**Pain Points**: Needs reproducible setups, wants to share with audience, aesthetics matter
**Critical Features**:
- Shareable configuration URLs (easy distribution)
- Reproducible builds (audience gets same result)
- Theme showcase (visual content)
- Export to multiple formats (audience flexibility)
## Sources
### Linux Distribution Builders:
- [Linux Distribution Builder Tools Features 2026](https://thelinuxcode.com/linux-distributions-a-practical-builder-friendly-guide-for-2026/)
- [archiso - ArchWiki](https://wiki.archlinux.org/title/Archiso)
- [Cubic: Custom Ubuntu ISO Creator](https://github.com/PJ-Singh-001/Cubic)
- [Kali Linux Custom ISO Creation](https://www.kali.org/docs/development/live-build-a-custom-kali-iso/)
- [5 Tools to Create Custom Linux Distro](https://www.maketecheasier.com/6-tools-to-easily-create-your-own-custom-linux-distro/)
### Customization Tools:
- [Awesome Linux Ricing Tools](https://github.com/avtzis/awesome-linux-ricing)
- [HyprRice GUI for Hyprland](https://github.com/avtzis/awesome-linux-ricing)
- [NixOS Configuration Editors](https://nixos.wiki/wiki/NixOS_configuration_editors)
- [nix-gui: NixOS Without Coding](https://github.com/nix-gui/nix-gui)
### Package Management:
- [Package Conflict Resolution](https://distropack.dev/Blog/Post?slug=package-conflict-resolution-handling-conflicting-packages)
- [Dependency Hell - Wikipedia](https://en.wikipedia.org/wiki/Dependency_hell)
### Configuration Sharing:
- [Dotfiles Inspiration Gallery](https://dotfiles.github.io/inspiration/)
- [Awesome Dotfiles Resources](https://github.com/webpro/awesome-dotfiles)
- [Hyprland Example Configurations](https://wiki.hypr.land/Configuring/Example-configurations/)
- [Best Hyprland Dotfiles](https://itsfoss.com/best-hyprland-dotfiles/)
### Multi-Boot & Export:
- [Ventoy Multi-Boot USB](https://opensource.com/article/21/5/linux-ventoy)
- [YUMI Multiboot USB Creator](https://pendrivelinux.com/yumi-multiboot-usb-creator/)
### Confidence Levels:
- **Table Stakes Features**: HIGH (verified via archiso wiki, multiple tool documentation)
- **Differentiator Features**: MEDIUM (based on market gap analysis and community tools)
- **Anti-Features**: MEDIUM (based on scope analysis and target audience research)
- **User Journey Mapping**: LOW (requires user interviews to validate)

@ -0,0 +1,577 @@
# Domain Pitfalls: Linux Distribution Builder Platform
**Domain:** Web-based Linux distribution customization and ISO generation
**Researched:** 2026-01-25
**Confidence:** MEDIUM-HIGH
## Critical Pitfalls
Mistakes that cause rewrites, security breaches, or major production issues.
### Pitfall 1: Unsandboxed User-Generated Package Execution
**What goes wrong:** User-submitted overlay packages execute arbitrary code during build with full system privileges, allowing malicious actors to compromise the build server, inject malware into generated ISOs, or exfiltrate sensitive data.
**Why it happens:** The archiso build process and makepkg (used for AUR packages) run without sandboxing by default. Developers assume community review is sufficient, or don't realize PKGBUILD scripts execute during the build phase, not just installation.
**Consequences:**
- In July 2025, CHAOS RAT malware was distributed through AUR packages (librewolf-fix-bin, firefox-patch-bin, zen-browser-patched-bin) that used .install scripts to execute remote code
- Compromised builds can inject backdoors into ISOs downloaded by thousands of users
- Build server compromise can leak user data, API keys, or allow lateral movement to other infrastructure
- Legal liability for distributing malware-infected operating systems
**Prevention:**
- **NEVER run user-submitted PKGBUILDs directly on build servers**
- Use systemd-nspawn, nsjail, or microVMs to isolate each build in a separate sandbox
- Implement static analysis on PKGBUILD files before execution (detect suspicious commands: curl, wget, eval, base64)
- Run builds in ephemeral containers discarded after each build
- Implement network egress filtering for build environments (block outbound connections except to approved package mirrors)
- Require manual security review for any overlay containing .install scripts or custom build steps
**Detection:**
- Monitor build processes for unexpected network connections
- Alert on PKGBUILD files containing: curl/wget with piped execution, base64 encoding, eval statements, /tmp modifications
- Track build duration anomalies (malicious code often adds delays)
- Log all filesystem modifications during builds
- Use integrity checking to detect unauthorized binary modifications
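
The static-analysis and alerting bullets above could start from something as small as a line-by-line pattern scan. This is a minimal sketch: the rule names and regular expressions are illustrative assumptions, not a vetted ruleset, and a scan like this supplements sandboxing but cannot replace it.

```python
import re

# Illustrative rules only; a real ruleset would need review and tuning.
SUSPICIOUS_PATTERNS = {
    "piped download": re.compile(r"\b(curl|wget)\b[^\n|]*\|\s*(ba|z)?sh\b"),
    "eval": re.compile(r"\beval\b"),
    "base64 decode": re.compile(r"\bbase64\b[^\n]*(-d|--decode)\b"),
    "/tmp write": re.compile(r">\s*/tmp/"),
}


def scan_pkgbuild(text: str) -> list[tuple[int, str]]:
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Skip shell comments so documentation does not trigger alerts.
        if line.strip().startswith("#"):
            continue
        for name, pattern in SUSPICIOUS_PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


if __name__ == "__main__":
    sample = "build() {\n  curl -s https://example.invalid/x.sh | bash\n}\n"
    print(scan_pkgbuild(sample))
```

Findings from a scan like this are best treated as a trigger for manual review, since obfuscated payloads will evade simple patterns.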
**Phase to address:** Phase 1 (Core Infrastructure) - Build sandboxing must be architected from the start. Retrofitting security is nearly impossible.
**Sources:**
- [CHAOS RAT in AUR Packages](https://linuxsecurity.com/features/chaos-rat-in-aur)
- [AUR Malware Packages Exploit](https://itsfoss.gitlab.io/blog/aur-malware-packages-exploit-critical-security-flaws-exposed/)
- [Sandboxing untrusted code 2026](https://dev.to/mohameddiallo/4-ways-to-sandbox-untrusted-code-in-2026-1ffb)
### Pitfall 2: Non-Deterministic Build Reproducibility
**What goes wrong:** The same configuration generates different ISO hashes on different builds, making it impossible to verify ISO integrity, debug user issues, or implement proper caching. Cache invalidation becomes unreliable, causing excessive rebuilds or stale builds.
**Why it happens:** Timestamps in build artifacts, non-deterministic file ordering, parallel build race conditions, leaked build environment variables, and external dependency fetches introduce randomness.
**Consequences:**
- Cache invalidation strategies fail (can't detect if upstream changes require rebuild)
- Users report bugs that can't be reproduced
- Security auditing becomes impossible (can't verify ISO hasn't been tampered with)
- Build queue backs up from unnecessary rebuilds
- Wasted compute resources rebuilding identical configurations
**Prevention:**
- Normalize all timestamps using SOURCE_DATE_EPOCH environment variable
- Sort input files deterministically before processing
- Use fixed locales (LC_ALL=C)
- Pin compiler versions and toolchain
- Disable ASLR during builds (affects compiler output)
- Use GNU tar's `--clamp-mtime` (together with `--mtime`) to cap file timestamps in archives
- Implement hermetic builds (no network access, all dependencies pre-fetched)
- Configure archiso with reproducible options:
  - Disable CONFIG_MODULE_SIG_ALL (generates random keys)
  - Pin git commits (don't use HEAD/branch names)
  - Use fixed compression levels and algorithms
**Detection:**
- Automated testing: build same config twice, compare checksums
- Monitor cache hit rate (sudden drops indicate non-determinism)
- Track build output size variance for identical configs
- Diff filesystem trees from duplicate builds
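
The "build twice, compare checksums" check can be sketched with the standard library alone. The example below assumes the build is launched as a subprocess that receives the environment from `reproducible_env`; the function names are illustrative, and the build invocation itself is left out because it depends on the worker setup.

```python
import hashlib
import os


def reproducible_env(source_date_epoch: int) -> dict[str, str]:
    """Copy the current environment with the variables named above pinned."""
    env = dict(os.environ)
    env["SOURCE_DATE_EPOCH"] = str(source_date_epoch)
    env["LC_ALL"] = "C"
    return env


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so multi-GB ISOs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(iso_a: str, iso_b: str) -> bool:
    """True when two build outputs are byte-for-byte identical."""
    return sha256_file(iso_a) == sha256_file(iso_b)
```

Running this comparison in CI on a fixed configuration gives an early signal when a toolchain or profile change reintroduces non-determinism.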
**Phase to address:** Phase 1 (Core Infrastructure) - Reproducibility must be designed into the build pipeline from the start.
**Sources:**
- [Reproducible builds documentation](https://reproducible-builds.org/docs/deterministic-build-systems/)
- [Linux Kernel reproducible builds](https://docs.kernel.org/kbuild/reproducible-builds.html)
- [Three pillars of reproducible builds](https://fossa.com/blog/three-pillars-reproducible-builds/)
### Pitfall 3: Upstream Breaking Changes Without Version Pinning
**What goes wrong:** Omarchy or CachyOS repositories update packages with breaking changes. Suddenly all builds fail with cryptic dependency errors, incompatible kernel modules, or missing packages. No coordination exists to warn of changes.
**Why it happens:** Relying on rolling release repositories (Arch, CachyOS) without pinning versions. Assuming upstream maintainers will preserve compatibility. Not monitoring upstream changelogs.
**Consequences:**
- All user builds fail simultaneously when upstream updates
- Emergency firefighting to identify breaking changes
- User trust erosion ("the platform is unreliable")
- CachyOS experienced frequent kernel stability issues in 2025, requiring LTS fallback
- Dependency mismatches between Arch and CachyOS v3 repositories in October 2025
**Prevention:**
- **Pin package repository snapshots by date** (use https://archive.archlinux.org/ or equivalent)
- Implement a staging environment that tests against latest upstream before promoting to production
- Monitor upstream repositories for breaking changes:
  - Subscribe to CachyOS announcement channels
  - Track Arch Linux security advisories
  - Monitor package version changes daily
- Implement gradual rollout: test builds with 1% of traffic before full deployment
- Provide repository version selection in UI ("stable" = 1 month old, "latest" = current)
- Cache known-good package sets and allow rollback
- Document which Omarchy/CachyOS features are used and monitor their changelog
**Detection:**
- Automated canary builds every 6 hours against latest repos
- Alert when build failure rate exceeds threshold
- Track dependency resolution errors
- Monitor upstream package version drift
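
Date-based pinning against the Arch Linux Archive can be expressed as a generated mirrorlist entry. This is a minimal sketch: it covers only the official Arch repositories served by archive.archlinux.org, the 30-day lag mirrors the "stable = 1 month old" idea above, and the function names are illustrative. CachyOS repositories would need a separate snapshot mechanism that is not shown here.

```python
from datetime import date, timedelta

ARCHIVE_BASE = "https://archive.archlinux.org/repos"


def snapshot_server_line(snapshot: date) -> str:
    """Return a pacman mirrorlist entry pinned to one day's repository snapshot."""
    # $repo and $arch are expanded by pacman, not by Python.
    return f"Server = {ARCHIVE_BASE}/{snapshot:%Y/%m/%d}/$repo/os/$arch"


def stable_snapshot(today: date, lag_days: int = 30) -> date:
    """Pick the snapshot date for a 'stable' channel that trails upstream."""
    return today - timedelta(days=lag_days)


if __name__ == "__main__":
    print(snapshot_server_line(stable_snapshot(date.today())))
```

Storing the chosen snapshot date alongside each build also gives the cache and rollback logic a stable identifier for the package set that was used.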
**Phase to address:** Phase 2 (Build Pipeline) - After basic builds work, implement upstream isolation.
**Sources:**
- [CachyOS FAQ & Troubleshooting](https://wiki.cachyos.org/cachyos_basic/faq/)
- [CachyOS dependency errors](https://discuss.cachyos.org/t/recent-package-system-upgrade-caused-many-dependancy-errors/17017)
- [Archiso fork upstream breakage](https://joaquimrocha.com/2024/09/22/how-to-fork/)
### Pitfall 4: Dependency Hell Across Hundreds of Overlays
**What goes wrong:** User selects multiple overlays that declare conflicting package versions or file ownership. Build fails with "conflicting files" errors. Alternatively, build succeeds but generates a broken ISO where applications crash or won't start.
**Why it happens:** Package managers (pacman, apt) don't automatically resolve conflicts between third-party overlays. Multiple overlays might modify the same config file. No validation of overlay compatibility occurs during selection.
**Consequences:**
- Build fails after 15 minutes of package installation
- User gets cryptic error: "file /etc/foo.conf exists in packages A and B"
- Generated ISO boots but applications don't work
- User blames platform instead of specific overlay combination
- Support burden: every overlay combination creates unique failure modes
**Prevention:**
- Pre-validate overlay compatibility during upload:
  - Extract file lists from packages
  - Check for file conflicts between overlays
  - Tag overlays as mutually exclusive
- Implement dependency solver that detects conflicts **before** build starts:
  - Use SAT solver or constraint solver to validate overlay combinations
  - Show "conflict graph" in UI when incompatible overlays selected
- Provide curated overlay collections known to work together
- Generate warning when user selects overlays with overlapping file ownership
- Implement priority system (if conflict, package from higher-priority overlay wins)
- Test common overlay combinations in CI
**Detection:**
- Parse pacman/apt error messages for "conflicting files"
- Track which overlay combinations fail most frequently
- Monitor user retry patterns (same user rebuilding with fewer overlays)
- Collect telemetry on successful vs failed overlay combinations
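
The file-ownership check described under Prevention can run long before a build starts. The sketch below assumes each overlay's file list has already been extracted into a set of absolute paths; the overlay names and the `find_file_conflicts` helper are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_file_conflicts(
    overlay_files: dict[str, set[str]],
) -> dict[tuple[str, str], set[str]]:
    """Map each conflicting overlay pair to the paths that both of them own."""
    # Invert the input: for every path, collect the overlays that ship it.
    owners = defaultdict(set)
    for overlay, paths in overlay_files.items():
        for path in paths:
            owners[path].add(overlay)

    # Any path with more than one owner yields one conflict per overlay pair.
    conflicts = defaultdict(set)
    for path, names in owners.items():
        for pair in combinations(sorted(names), 2):
            conflicts[pair].add(path)
    return dict(conflicts)


if __name__ == "__main__":
    overlays = {
        "gaming": {"/etc/foo.conf", "/usr/bin/game"},
        "dev": {"/etc/foo.conf", "/usr/bin/cc"},
    }
    print(find_file_conflicts(overlays))
```

The resulting pairs are exactly the edges a "conflict graph" in the UI would draw, and they can seed the mutually-exclusive tags mentioned above.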
**Phase to address:** Phase 3 (Overlay System) - When overlay selection UI is implemented.
**Sources:**
- [Package dependency resolution conflicts](https://distropack.dev/Blog/Post?slug=package-conflict-resolution-handling-conflicting-packages)
- [Dependency hell Wikipedia](https://en.wikipedia.org/wiki/Dependency_hell)
- [Arch Linux conflicting packages](https://bbs.archlinux.org/viewtopic.php?id=297274)
### Pitfall 5: Cache Invalidation False Negatives
**What goes wrong:** Upstream package updates but cached build is still served. Users download ISOs with outdated packages containing known CVEs. Security scanners flag ISOs as vulnerable.
**Why it happens:** Cache invalidation logic doesn't account for transitive dependencies. Package A updates, but cache key only checks direct dependencies. Alternatively, rolling release repos mean "latest" points to different package versions over time.
**Consequences:**
- Users install ISOs with security vulnerabilities
- Platform reputation damage ("distributing outdated software")
- Legal liability if vulnerable software causes data breaches
- Users manually discover their ISO is outdated and distrust platform
**Prevention:**
- Include full dependency tree hash in cache key, not just direct dependencies
- Implement time-based cache expiry (max 7 days for rolling release)
- Track package repository snapshot timestamps in cache metadata
- Invalidate cache when ANY package in the tree updates, not just overlay packages
- Provide "force rebuild with latest packages" option in UI
- Display build timestamp and package versions prominently in ISO metadata
- Run vulnerability scanning (grype, trivy) on generated ISOs before serving
**Detection:**
- Compare package versions in cached ISO vs current repository
- Alert when a cached ISO older than 14 days is served
- Monitor CVE databases for packages in cached ISOs
- Track user reports of "outdated packages"
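
A cache key that covers the full dependency tree could be derived as follows. This is a minimal sketch under the assumption that a resolved dependency map and a version table are available from the resolver; the parameter names and the snapshot string are illustrative.

```python
import hashlib
import json


def dependency_closure(roots, depends_on):
    """Return every package reachable from roots, including the roots."""
    seen = set()
    stack = list(roots)
    while stack:
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(depends_on.get(pkg, ()))
    return seen


def cache_key(roots, depends_on, versions, snapshot):
    """Hash the repository snapshot plus name and version of every package in the tree."""
    closure = dependency_closure(roots, depends_on)
    payload = {
        "snapshot": snapshot,
        # Sorting keeps the key independent of traversal order.
        "packages": sorted((pkg, versions[pkg]) for pkg in closure),
    }
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()
```

Because transitive packages feed into the hash, an update deep in the tree changes the key, while updates to packages outside the tree leave it untouched.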
**Phase to address:** Phase 2 (Build Pipeline) - When caching is implemented.
**Sources:**
- [Linux kernel CVEs 2025](https://ciq.com/blog/linux-kernel-cves-2025-what-security-leaders-need-to-know-to-prepare-for-2026/)
- [Package cache invalidation issues](https://forums.linuxmint.com/viewtopic.php?t=327727)
## Moderate Pitfalls
Mistakes that cause delays, poor UX, or technical debt.
### Pitfall 6: 3D Visualization Performance Degradation
**What goes wrong:** Beautiful 3D package visualizations work perfectly on developer machines (RTX 4090) but run at 5fps on target users' mid-range laptops. Page becomes unusable. Users blame "bloated web apps."
**Why it happens:** Not testing on mid-range hardware. Using unoptimized Three.js scenes with too many draw calls. No progressive enhancement or fallback to 2D views. WebGL issues all draw commands from a single thread, so the CPU becomes the bottleneck and the GPU sits idle.
**Consequences:**
- Target users ("Windows refugees" with 3-year-old laptops) can't use the platform
- High bounce rate from slow page load
- Negative reviews: "looks pretty but unusable"
- Mobile users completely locked out
- Battery drain on laptops
**Prevention:**
- **Test on mid-range hardware from day one** (Intel integrated graphics, GTX 1650)
- Implement Level of Detail (LOD): reduce geometry complexity for distant objects
- Use instancing for repeated elements (package icons)
- Move rendering to Web Worker with OffscreenCanvas to unblock main thread
- Consider WebGPU migration for parallel command encoding (reduces CPU bottleneck)
- Provide 2D fallback UI for low-end devices
- Lazy load 3D view (show 2D list first, load 3D on interaction)
- Set performance budget: 60fps on Intel UHD Graphics 620
- Implement automatic quality adjustment based on frame rate
**Detection:**
- Monitor FPS via Performance API in production
- Track GPU frame time where the browser exposes it (e.g., the `EXT_disjoint_timer_query` WebGL extensions); WebGL does not report GPU utilization directly
- A/B test: measure conversion rate for 3D vs 2D view
- Collect device/GPU telemetry to understand user hardware
**Phase to address:** Phase 4 (3D Visualization) - During 3D UI development, enforce performance requirements.
**Sources:**
- [WebGL vs WebGPU performance](https://medium.com/@sudenurcevik/upgrading-performance-moving-from-webgl-to-webgpu-in-three-js-4356e84e4702)
- [Three.js performance optimization](https://tympanus.net/codrops/2025/02/11/building-efficient-three-js-scenes-optimize-performance-while-maintaining-quality/)
- [OffscreenCanvas for WebGL](https://evilmartians.com/chronicles/faster-webgl-three-js-3d-graphics-with-offscreencanvas-and-web-workers)
### Pitfall 7: Build Queue Starvation and Resource Contention
**What goes wrong:** During peak hours, build queue fills up. New builds wait 2 hours. Meanwhile, 10 builds for the same configuration are queued because different users requested identical overlays. Resources wasted on duplicate work.
**Why it happens:** No build deduplication. FIFO queue without prioritization. Fixed pool of build workers regardless of load. Not leveraging cache hits to avoid builds.
**Consequences:**
- Poor user experience (long wait times)
- Wasted compute resources on duplicate builds
- Scaling costs spike during traffic bursts
- Users retry, adding more duplicate builds to queue
- Platform appears slow and unreliable
**Prevention:**
- Implement build deduplication:
  - Hash configuration (packages + overlays + options)
  - If identical build in queue or recently completed, return same result
  - Show "joining existing build" UI to set expectations
- Add queue priority levels:
  - Cache hit = instant (no build needed)
  - Existing identical build = join queue position
  - Small overlay = higher priority than full rebuild
  - Authenticated users > anonymous
- Autoscale build workers based on queue depth (Kubernetes HPA)
- Show queue position and estimated wait time in UI
- Implement progressive caching (overlay-level caching, not just full ISO)
- Reserve capacity for fast/small builds to prevent queue starvation
**Detection:**
- Monitor queue depth over time
- Track build deduplication hit rate
- Measure p95 wait time
- Alert when wait time exceeds SLA (e.g., >10 minutes)
- Analyze duplicate builds (same config hash queued multiple times)
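
Build deduplication by configuration hash might look like the sketch below. It assumes that the order of packages and overlays does not change the build result (which would not hold if overlay priority depends on order), and it uses an in-memory dictionary as a stand-in for whatever shared store the queue would actually use. `BuildRegistry` and `config_hash` are hypothetical names.

```python
import hashlib
import json
import uuid


def config_hash(packages, overlays, options) -> str:
    """Hash a build request so that equivalent requests collide on purpose."""
    canonical = json.dumps(
        {
            # Assumption: ordering of packages and overlays is irrelevant.
            "packages": sorted(set(packages)),
            "overlays": sorted(set(overlays)),
            "options": options,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class BuildRegistry:
    """Tracks one job per distinct configuration hash."""

    def __init__(self):
        self._job_by_hash = {}

    def submit(self, packages, overlays, options):
        """Return (job_id, joined_existing) for a build request."""
        key = config_hash(packages, overlays, options)
        existing = self._job_by_hash.get(key)
        if existing is not None:
            return existing, True
        job_id = uuid.uuid4().hex
        self._job_by_hash[key] = job_id
        return job_id, False
```

The `joined_existing` flag is what the UI would use to show the "joining existing build" message instead of a fresh queue position.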
**Phase to address:** Phase 5 (Scaling) - After MVP proves demand exists.
**Sources:**
- [Linux package build server scaling](https://linuxsecurity.com/features/navigating-software-scalability)
- [Automation breakpoints 2026](https://codecondo.com/automation-breakpoints-5-critical-failures-2026/)
### Pitfall 8: Archiso Breaking Changes in Updates
**What goes wrong:** Platform uses archiso v85, which has certain boot mode configurations. Archiso updates to v86+ with unified boot modes. Suddenly all builds fail with "invalid boot mode" errors.
**Why it happens:** Relying on latest archiso package without pinning version. Not monitoring archiso changelog. Assuming backward compatibility in tooling.
**Consequences:**
- All builds fail when archiso updates
- Emergency debugging session to identify breaking change
- Must rewrite build configuration for new archiso API
- User builds stuck until fix deployed
**Prevention:**
- Pin archiso version in build environment (don't use rolling latest)
- Monitor archiso changelog: https://github.com/archlinux/archiso/blob/master/CHANGELOG.rst
- Test against new archiso versions in staging before upgrading production
- Notable breaking changes to watch:
- v86 (Sept 2025): Boot mode consolidation (bios.syslinux replaces bios.syslinux.eltorito/mbr)
- v87 (Oct 2025): Bootstrap package config changes
- Boot parameter changes: archisodevice → archisosearchuuid
- Abstract archiso-specific config behind internal API (easier to update)
- Maintain compatibility layer for multiple archiso versions
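One possible shape for such a compatibility layer is sketched below. The version threshold and the mode names are taken from the changelog notes above and should be re-checked against the archiso CHANGELOG; `normalize_bootmodes` is a hypothetical helper, not part of archiso.

```python
# Legacy BIOS modes that, per the notes above, were merged into one mode.
LEGACY_BIOS_MODES = {"bios.syslinux.mbr", "bios.syslinux.eltorito"}
CONSOLIDATED_SINCE = 86  # assumption: first archiso release with the merged mode


def normalize_bootmodes(bootmodes, archiso_major):
    """Translate internal boot mode names for the pinned archiso version."""
    if archiso_major < CONSOLIDATED_SINCE:
        return list(bootmodes)
    result = []
    for mode in bootmodes:
        mapped = "bios.syslinux" if mode in LEGACY_BIOS_MODES else mode
        if mapped not in result:  # both legacy modes collapse to one entry
            result.append(mapped)
    return result
```

Keeping this translation in one function means a future archiso change touches a single place instead of every profile template.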
**Detection:**
- Automated builds against latest archiso in CI
- Alert on archiso package version changes in upstream repos
- Parse archiso error messages for "unknown boot mode" or deprecation warnings
**Phase to address:** Phase 2 (Build Pipeline) - When archiso integration is implemented.
**Sources:**
- [Archiso changelog](https://github.com/archlinux/archiso/blob/master/CHANGELOG.rst)
- [Archiso wiki](https://wiki.archlinux.org/title/Archiso)
### Pitfall 9: Beginner UX Assumes Linux Knowledge
**What goes wrong:** UI uses jargon like "initramfs", "systemd units", "GRUB config". Users see errors like "failed to install linux-firmware" with no explanation. Windows refugees feel overwhelmed and leave.
**Why it happens:** Developers are Linux experts, forgetting target users aren't. Passing raw build errors to UI without translation. No onboarding flow explaining concepts.
**Consequences:**
- High bounce rate from non-technical users
- Support burden: answering basic Linux questions
- Negative word-of-mouth: "too complicated"
- Failed promise of making Linux accessible
- Common beginner mistakes from 2026 research:
- Installing incompatible packages (wrong architecture, conflicting dependencies)
- Not understanding difference between LTS and rolling release
- Customizing too much at once, breaking desktop environment
**Prevention:**
- **Translate technical errors to plain language:**
  - "Failed to install linux-firmware" → "The device-driver package (linux-firmware) could not be installed. Please retry the build; your selections are saved."
- "Conflicting packages" → "Two of your selected packages can't be installed together. Try removing [X] or [Y]."
- Implement guided mode with curated options (vs advanced mode with full control)
- Add tooltips explaining Linux concepts:
- Desktop environment (with screenshots)
- LTS vs rolling release (stability vs latest features)
- Package manager basics
- Provide templates: "Windows-like", "macOS-like", "Developer workstation"
- Show visual previews of desktop environments, not just names
- Implement "test in browser" feature (preview DE without downloading ISO)
- User testing with actual Windows refugees, not Linux users
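The error-translation idea above could start as a simple pattern table. The patterns and messages here are illustrative assumptions, not an inventory of real pacman or archiso output.

```python
import re

# (pattern, plain-language message) pairs; first match wins.
TRANSLATIONS = [
    (
        re.compile(r"conflict", re.IGNORECASE),
        "Two of your selected packages can't be installed together. "
        "Try removing one of them.",
    ),
    (
        re.compile(r"failed to (download|retrieve)", re.IGNORECASE),
        "A package server is temporarily unavailable. "
        "Please try again in a few minutes.",
    ),
]

FALLBACK = "Something went wrong while building your ISO. Our team has the details."


def translate_error(raw_message):
    """Map a raw build error to a beginner-friendly sentence."""
    for pattern, message in TRANSLATIONS:
        if pattern.search(raw_message):
            return message
    return FALLBACK
```

The raw message should still be kept in the build log so that advanced users and support staff can see it.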
**Detection:**
- Track where users abandon the flow (heatmaps, analytics)
- Monitor support tickets for recurring questions
- A/B test simplified vs technical language
- Survey users: "How confusing was this? 1-5"
**Phase to address:** Phase 6 (Polish & Onboarding) - After core features work, focus on UX refinement.
**Sources:**
- [Linux mistakes beginners make](https://dev.to/techrefreshing/10-linux-mistakes-every-beginner-makes-i-made-all-of-them-4och)
- [Choosing Linux distro 2026](https://dev.to/srijan-xi/navigating-the-switch-how-to-choose-the-right-linux-distro-in-2026-448b)
- [UX design mistakes 2026](https://www.wearetenet.com/blog/ux-design-mistakes)
### Pitfall 10: ISO Download Reliability Issues
**What goes wrong:** User customizes an ISO, clicks download, and starts a 2.5 GB file transfer. The browser crashes at 80%, or a network hiccup corrupts the file. The user re-customizes and re-downloads, wasting build resources.
**Why it happens:** Using direct file downloads without resume support. No integrity checking before use. Not leveraging browser download manager capabilities.
**Consequences:**
- User frustration from failed downloads
- Wasted bandwidth (re-downloading)
- Corrupted ISOs that fail to boot (user blames platform)
- Support burden from "ISO won't boot" issues
**Prevention:**
- Implement resumable downloads (HTTP Range requests)
- Provide torrent option for large ISOs
- Display SHA256 checksum prominently with instructions to verify
- Use Content-Disposition header to set filename (debate-custom-2026-01-25.iso)
- Consider chunked download with client-side reassembly
- For PWA approach: Use Background Fetch API for large downloads
- Download continues even if tab closed
- Browser shows persistent UI for download progress
- Better reliability on mobile/flaky connections
- Show download progress (not just "downloading...")
- Provide "test ISO in browser" option (emulator) before download
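Resumable downloads rest on parsing the `Range` request header. The sketch below handles only the single-range forms (`bytes=a-b`, `bytes=a-`, `bytes=-n`); the function names are hypothetical, and a real deployment would normally let the web server or framework do this.

```python
import re

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header, size):
    """Return inclusive (start, end), or None if no partial response applies."""
    match = _RANGE_RE.match(header.strip())
    if not match or size <= 0:
        return None
    first, last = match.groups()
    if first == "" and last == "":
        return None
    if first == "":  # suffix form: the last N bytes of the file
        length = int(last)
        if length == 0:
            return None
        return max(size - length, 0), size - 1
    start = int(first)
    end = int(last) if last else size - 1
    if start >= size or end < start:
        return None
    return start, min(end, size - 1)  # clamp to the end of the file


def content_range(start, end, size):
    """Build the Content-Range value for a 206 Partial Content response."""
    return f"bytes {start}-{end}/{size}"
```

When `parse_range` returns `None`, the caller decides between a full `200` response and `416 Range Not Satisfiable`.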
**Detection:**
- Track download completion rate (started vs finished)
- Monitor download retry patterns
- Analyze user reports of "corrupted ISO"
- Track checksum verification usage
**Phase to address:** Phase 5 (Distribution) - After ISOs are being generated.
**Sources:**
- [PWA offline functionality 2026](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Guides/Offline_and_background_operation)
- [PWA development trends 2026](https://vocal.media/journal/progressive-web-app-development-trends-and-use-cases-for-2026)
## Minor Pitfalls
Mistakes that cause annoyance but are relatively easy to fix.
### Pitfall 11: Insecure Default Configurations
**What goes wrong:** Generated ISOs have default passwords (root/toor), SSH enabled with password auth, or autologin configured. User deploys to production and gets compromised.
**Why it happens:** Copying archiso baseline defaults without hardening. Assuming users will secure their systems post-install. Making convenience the default over security.
**Consequences:**
- Generated ISOs are insecure by default
- Users deploy vulnerable systems
- Platform reputation damage if incidents occur
- ISOs inherit autologin unless it is removed, because the archiso baseline enables it by default
**Prevention:**
- Override insecure archiso defaults:
- Disable autologin (remove autologin.conf)
- Require password setup during ISO customization
- Disable SSH or require key-based auth
- Provide security checklist in UI:
- "Will this ISO be used on the internet?" → Disable password auth
- "Will this be installed on physical hardware?" → Enable disk encryption
- Show security warnings for risky configurations
- Default to secure, allow opting into convenience features
**Detection:**
- Static analysis of generated ISO configs
- Alert on ISOs with default passwords or autologin
- Track which security features are enabled/disabled
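A static check of a generated profile could look like the sketch below. It assumes the profile is a directory tree on disk and only looks for two signals named above (an `autologin.conf` drop-in and SSH password authentication); `find_insecure_defaults` is a hypothetical helper.

```python
from pathlib import Path


def find_insecure_defaults(profile_dir):
    """Return human-readable findings for risky files in a profile tree."""
    root = Path(profile_dir)
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if path.name == "autologin.conf":
            findings.append(f"{rel}: autologin enabled")
        elif path.name == "sshd_config":
            for line in path.read_text(errors="replace").splitlines():
                parts = line.split()
                # Commented-out lines start with '#', so they never match.
                if (
                    len(parts) >= 2
                    and parts[0].lower() == "passwordauthentication"
                    and parts[1].lower() == "yes"
                ):
                    findings.append(f"{rel}: SSH password authentication enabled")
                    break
    return findings
```

An empty list means none of these specific signals were found, not that the ISO is secure.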
**Phase to address:** Phase 3 (Configuration) - When users can customize security settings.
**Sources:**
- [Archiso security considerations](https://wiki.archlinux.org/title/Archiso)
### Pitfall 12: Inadequate Build Logging and Debugging
**What goes wrong:** User reports "my build failed" with no details. Build logs are 10MB of pacman output. Error message buried on line 8,432. Impossible to debug without reproduction.
**Why it happens:** Logging everything without structure. No log aggregation or parsing. Not extracting key errors for display.
**Consequences:**
- Support burden (need full logs to debug)
- Users can't self-service debug
- Repeated builds to add debug logging
- Difficult to identify systematic issues
**Prevention:**
- Structure logs with severity levels (INFO, WARN, ERROR)
- Extract and highlight fatal errors in UI
- Provide "debug mode" that shows full logs
- Store build logs for 30 days with unique build ID
- Implement log search/filter in UI
- Add build context to logs (config hash, overlay versions, timestamp)
- Common errors should have KB articles linked
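Extracting the key errors from a long log can begin with a line scan like the one below. The `error:` / `warning:` prefixes are an assumption about the log format; the real patterns should be derived from actual pacman and mkarchiso output.

```python
import re

_SEVERITY_RE = re.compile(r"\b(error|warning):\s*(.*)", re.IGNORECASE)


def extract_problems(log_text, limit=5):
    """Return up to `limit` problems, errors first, with their line numbers."""
    problems = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        match = _SEVERITY_RE.search(line)
        if match:
            severity = "ERROR" if match.group(1).lower() == "error" else "WARN"
            problems.append(
                {"line": number, "severity": severity, "message": match.group(2).strip()}
            )
    errors = [p for p in problems if p["severity"] == "ERROR"]
    warnings = [p for p in problems if p["severity"] == "WARN"]
    return (errors + warnings)[:limit]
```

The returned line numbers let the UI link each highlighted error back to its position in the full debug log.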
**Detection:**
- Track support tickets requesting logs
- Monitor build failure rate by error type
- Analyze which errors lead to user retry vs abandonment
**Phase to address:** Phase 2 (Build Pipeline) - Implement with build infrastructure.
**Sources:**
- [Build automation best practices](https://codecondo.com/automation-breakpoints-5-critical-failures-2026/)
### Pitfall 13: Package Repository Mirror Failures
**What goes wrong:** Build relies on mirrors.cachyos.org. Mirror goes down during build. Build fails with "failed to download packages". Build queue backs up.
**Why it happens:** Single point of failure for package sources. Not implementing mirror fallback. Assuming mirrors have 100% uptime.
**Consequences:**
- Builds fail during mirror outages
- User sees "server error" with no explanation
- Build queue fills with retries
**Prevention:**
- Configure multiple mirrors in pacman.conf (fallback)
- Cache frequently-used packages on build infrastructure
- Implement retry logic with exponential backoff
- Monitor mirror health and automatically disable unhealthy mirrors
- Provide user feedback: "Package mirror temporarily unavailable, retrying..."
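Mirror fallback combined with exponential backoff might be structured as below. The `fetch` callable and the mirror identifiers are placeholders supplied by the caller, and treating `OSError` as the transient failure type is an assumption.

```python
import time


def fetch_with_fallback(mirrors, fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Try each mirror in order, retrying each with exponential backoff."""
    last_error = None
    for mirror in mirrors:
        for attempt in range(retries):
            try:
                return fetch(mirror)
            except OSError as exc:  # assumption: transient failures raise OSError
                last_error = exc
                if attempt < retries - 1:
                    # Delays grow as base_delay * 1, 2, 4, ...
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all mirrors failed") from last_error
```

Passing `sleep` as a parameter keeps the function testable without real waiting; adding random jitter to the delay is a common refinement.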
**Detection:**
- Monitor mirror response times and availability
- Alert on increased build failures from download errors
- Track which mirrors cause failures
**Phase to address:** Phase 2 (Build Pipeline) - When package downloading is implemented.
**Sources:**
- [CachyOS optimized repositories](https://wiki.cachyos.org/features/optimized_repos/)
## Phase-Specific Warnings
| Phase | Likely Pitfall | Mitigation |
|-------|---------------|------------|
| Phase 1: Core Infrastructure | Unsandboxed build execution (Critical #1) | Design build isolation from day one using systemd-nspawn or microVMs |
| Phase 1: Core Infrastructure | Non-deterministic builds (Critical #2) | Implement reproducible build practices immediately |
| Phase 2: Build Pipeline | Upstream breaking changes (Critical #3) | Pin repository snapshots, test against staging |
| Phase 2: Build Pipeline | Cache invalidation bugs (Critical #5) | Include dependency tree hash in cache key |
| Phase 3: Overlay System | Dependency hell (Critical #4) | Pre-validate overlay compatibility, implement conflict detection |
| Phase 4: 3D Visualization | Performance on mid-range hardware (Moderate #6) | Test on target hardware, implement LOD and fallbacks |
| Phase 5: Scaling | Build queue starvation (Moderate #7) | Implement build deduplication and autoscaling |
| Phase 6: Polish | Beginner UX (Moderate #9) | User test with Windows refugees, translate jargon |
## Validation Checklist
Before launching each phase, verify:
**Phase 1 (Infrastructure):**
- [ ] All builds run in isolated sandboxes (no host system access)
- [ ] Same configuration generates identical checksum 3 times in a row
- [ ] Build logs structured and searchable
- [ ] Failed builds provide actionable error messages
**Phase 2 (Build Pipeline):**
- [ ] Package repository versions pinned/snapshotted
- [ ] Mirror fallback configured and tested
- [ ] Cache invalidation includes transitive dependencies
- [ ] Staging environment tests against latest upstream
**Phase 3 (Overlay System):**
- [ ] File conflict detection runs before build
- [ ] Incompatible overlays show warning in UI
- [ ] Dependency solver validates combinations
**Phase 4 (3D Visualization):**
- [ ] Achieves 60fps on Intel UHD Graphics 620
- [ ] 2D fallback available for low-end devices
- [ ] Frame rate monitoring in production
**Phase 5 (Scaling):**
- [ ] Build deduplication prevents duplicate work
- [ ] Queue autoscaling based on depth
- [ ] p95 wait time under SLA
**Phase 6 (Polish):**
- [ ] User tested with non-technical "Windows refugees"
- [ ] Technical jargon translated to plain language
- [ ] Download resume support implemented
- [ ] Security defaults enabled
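The Phase 1 reproducibility item ("identical checksum 3 times in a row") can be automated with a streaming SHA-256 comparison. This is a sketch with hypothetical function names; the same digest is what the download page would show to users.

```python
import hashlib


def sha256_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so multi-gigabyte ISOs never sit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_are_identical(paths):
    """Return True when every file in `paths` has the same SHA-256 digest."""
    return len({sha256_file(p) for p in paths}) == 1
```

A CI job could build the same configuration three times and fail when `builds_are_identical` returns `False`.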
## Sources
**Security & Malware:**
- [CHAOS RAT Found in Arch Linux AUR Packages](https://linuxsecurity.com/features/chaos-rat-in-aur)
- [AUR Malware Packages Exploit Critical Security Flaws Exposed](https://itsfoss.gitlab.io/blog/aur-malware-packages-exploit-critical-security-flaws-exposed/)
- [Arch Linux Removes Malicious AUR Packages](https://dailysecurityreview.com/security-spotlight/arch-linux-removes-malicious-aur-packages-that-deployed-chaos-rat-malware/)
- [Sandboxing untrusted code in 2026](https://dev.to/mohameddiallo/4-ways-to-sandbox-untrusted-code-in-2026-1ffb)
**Reproducible Builds:**
- [Reproducible builds - deterministic build systems](https://reproducible-builds.org/docs/deterministic-build-systems/)
- [Linux Kernel reproducible builds](https://docs.kernel.org/kbuild/reproducible-builds.html)
- [Three Pillars of Reproducible Builds](https://fossa.com/blog/three-pillars-reproducible-builds/)
**Archiso & Build Systems:**
- [Archiso ArchWiki](https://wiki.archlinux.org/title/Archiso)
- [Archiso CHANGELOG](https://github.com/archlinux/archiso/blob/master/CHANGELOG.rst)
- [How to Create archiso - Arch Forums](https://bbs.archlinux.org/viewtopic.php?id=257187)
**Dependency & Package Management:**
- [Package Conflict Resolution](https://distropack.dev/Blog/Post?slug=package-conflict-resolution-handling-conflicting-packages)
- [Dependency hell - Wikipedia](https://en.wikipedia.org/wiki/Dependency_hell)
- [CachyOS FAQ & Troubleshooting](https://wiki.cachyos.org/cachyos_basic/faq/)
- [CachyOS dependency errors](https://discuss.cachyos.org/t/recent-package-system-upgrade-caused-many-dependancy-errors/17017)
**Performance & Scaling:**
- [WebGL vs WebGPU performance in Three.js](https://medium.com/@sudenurcevik/upgrading-performance-moving-from-webgl-to-webgpu-in-three-js-4356e84e4702)
- [Building Efficient Three.js Scenes](https://tympanus.net/codrops/2025/02/11/building-efficient-three-js-scenes-optimize-performance-while-maintaining-quality/)
- [Faster WebGL with OffscreenCanvas](https://evilmartians.com/chronicles/faster-webgl-three-js-3d-graphics-with-offscreencanvas-and-web-workers)
- [Linux package build server scaling](https://linuxsecurity.com/features/navigating-software-scalability)
**User Experience:**
- [10 Linux Mistakes Every Beginner Makes](https://dev.to/techrefreshing/10-linux-mistakes-every-beginner-makes-i-made-all-of-them-4och)
- [Navigating the Switch: Choosing Linux Distro in 2026](https://dev.to/srijan-xi/navigating-the-switch-how-to-choose-the-right-linux-distro-in-2026-448b)
- [13 UX Design Mistakes to Avoid in 2026](https://www.wearetenet.com/blog/ux-design-mistakes)
**Progressive Web Apps:**
- [PWA Offline and background operation](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Guides/Offline_and_background_operation)
- [Progressive Web App Development Trends 2026](https://vocal.media/journal/progressive-web-app-development-trends-and-use-cases-for-2026)
**Security & CVEs:**
- [Linux kernel CVEs 2025: preparing for 2026](https://ciq.com/blog/linux-kernel-cves-2025-what-security-leaders-need-to-know-to-prepare-for-2026/)
- [OverlayFS vulnerability](https://securitylabs.datadoghq.com/articles/overlayfs-cve-2023-0386/)

.planning/research/STACK.md Normal file

@ -0,0 +1,466 @@
# Technology Stack
**Project:** Debate - Visual Linux Distribution Builder
**Researched:** 2026-01-25
**Overall Confidence:** HIGH
## Recommended Stack
### Core Backend
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| Python | 3.12+ | Runtime environment | FastAPI requires 3.9+; 3.12 is stable and well-supported. Avoid 3.13 until ecosystem catches up. | HIGH |
| FastAPI | 0.128.0+ | Web framework | Industry standard for async Python APIs. Latest version adds Python 3.14 support, mixed Pydantic v1/v2 (though use v2), and ReDoc 2.x. Fast, type-safe, auto-docs. | HIGH |
| Pydantic | 2.12.5+ | Data validation | Required by FastAPI (>=2.7.0). V1 is deprecated and unsupported in Python 3.14+. V2 offers better build-time performance and type safety. No v3 exists. | HIGH |
| Uvicorn | 0.30+ | ASGI server | Production-grade ASGI server. Recent versions include built-in multi-process supervisor, eliminating need for Gunicorn in many cases. Use `--workers N` for multi-core. | HIGH |
### Database Layer
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| PostgreSQL | 18.1+ | Primary database | Latest major version (18.0 released Sept 2025; 18.1 is the Nov 2025 minor release). PG 13 is EOL; PG 18 has the latest security patches and performance improvements. Always run the current minor release. | HIGH |
| SQLAlchemy | 2.0+ | ORM | Industry standard with recent type-hint improvements. Better raw performance than Tortoise ORM in most benchmarks. Async support for both Core and ORM via `sqlalchemy.ext.asyncio`. Avoid 1.x (legacy). | HIGH |
| asyncpg | Latest | PostgreSQL driver | High-performance async Postgres driver. Used by SQLAlchemy async. Significantly faster than psycopg2. | MEDIUM |
| Alembic | Latest | Database migrations | Official SQLAlchemy migration tool. Standard choice, well-integrated ecosystem. | HIGH |
**Alternative considered:** Tortoise ORM - simpler API, async-first, but SQLAlchemy 2.0's type hints and performance make it the safer long-term bet. Use SQLAlchemy unless team strongly prefers Django-style ORM.
### Task Queue
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| Celery | 5.6.2+ | Distributed task queue | Latest stable (Jan 2026). Supports Python 3.9-3.13. Battle-tested for ISO builds. Redis transport is feature-complete. Overkill for simple tasks but essential for long-running ISO generation. | HIGH |
| Redis | 6.2+ | Message broker & cache | Celery backend. Version constraint updated in Kombu. Redis Sentinel ACL auth fixed in Celery 5.6.1. Also serves as cache layer for ISO metadata. | HIGH |
**Alternatives considered:**
- RQ - Too simple for multi-hour ISO builds requiring progress tracking and cancellation
- Dramatiq - Good performance but smaller ecosystem; Celery's maturity wins for production workloads
- APScheduler - Not designed for distributed task execution
**Decision:** Celery despite complexity because ISO builds require:
- Progress tracking (partial results)
- Task cancellation (user aborts build)
- Resource limiting (only N builds concurrent)
- Retry logic (transient failures)
### Core Frontend
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| React | 19.2.3+ | UI framework | Latest stable (Dec 2025). React 19.2 adds Activity API (hide/restore UI state), useEffectEvent, and Performance panel integration. Use 19.x for latest features. | HIGH |
| TypeScript | 5.8+ | Type system | Feb 2025 release. Adds `--erasableSyntaxOnly` for Node.js type-stripping, checked returns for conditional types, and performance optimizations. 5.9 followed in Aug 2025 and is also a safe choice. Avoid the bleeding-edge 7.0 (Go rewrite). | HIGH |
| Vite | 6.x+ | Build tool | Instant HMR, ESM-native. Makimo reported 16.1s build vs 28.4s with CRA; 390ms startup vs 4.5s. Choose Vite over Next.js - no SSR needed (no SEO benefit for logged-in tool), microservice alignment, fast iteration. | HIGH |
**Why not Next.js:** Project is SPA-first (no SEO requirement), needs architectural freedom for 3D integration, and benefits from Vite's dev speed. Next.js SSR optimization is wasted here.
### 3D Visualization
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| React Three Fiber | 9.x+ | React renderer for Three.js | Integrates Three.js with React paradigms. The 9.x line pairs with React 19 (8.x targets React 18). Outperforms vanilla Three.js at scale due to React scheduling. WebGPU shipping in Safari 26 (Sept 2025) makes the WebGPU renderer a future-proof path. Essential for project. | HIGH |
| Three.js | Latest r1xx | 3D engine | Underlying 3D engine. R3F keeps up with Three.js releases. WebGPU renderer available, massive performance gains on modern browsers. | HIGH |
| @react-three/drei | Latest | Helper library | Essential helper abstractions (cameras, controls, loaders, HTML overlays). Industry standard R3F companion. Includes `<Detailed />` for LOD (30-40% frame rate improvement). | HIGH |
| @react-three/postprocessing | Latest | Visual effects | Post-processing effects (bloom, SSAO, etc.) for visual polish. Based on pmndrs/postprocessing. | MEDIUM |
| leva | Latest | Debug controls | GUI controls for rapid 3D prototyping. Invaluable during development for tweaking camera angles, lighting, animation speeds. | MEDIUM |
**Critical 3D Performance Requirements (60fps mandate):**
- Instancing and batching to keep <100 draw calls per frame (90% reduction possible)
- LOD using drei's `<Detailed />` component
- Draco compression for geometry (90-95% file size reduction)
- KTX2 with Basis Universal for textures (10x memory reduction, GPU-compressed)
- Mutations in useFrame, NOT React state (avoid re-render overhead)
**Why React Three Fiber over vanilla Three.js:**
- Team is React-focused (TypeScript/React already chosen)
- Component reusability (layer cards, conflict modals)
- React scheduling prevents frame drops during state updates
- Ecosystem alignment (Vite, dev tools)
### State Management
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| Zustand | 5.x+ | Global state | Sweet spot between Redux complexity and Context limitations. Zero dependencies, minimal boilerplate, excellent DevTools. Recommended for medium-large apps. Single store model fits this project. | HIGH |
**Alternatives considered:**
- Redux Toolkit - Too heavyweight; boilerplate overhead not justified for this project's state complexity
- Jotai - Atom-based model is overkill; Zustand's single store simpler for stack-builder state
- Context API - Insufficient for complex 3D state synchronization and performance requirements
**Decision:** Zustand because:
- Configuration builder has central state (layers, conflicts, user selections)
- Need Redux DevTools support for debugging complex 3D interactions
- Performance matters (3D re-renders expensive)
- Team prefers minimal boilerplate
### UI Components & Styling
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| Tailwind CSS | 4.x+ | Utility-first CSS | Industry standard. V4 released 2025 with @theme directive, OKLCH colors, improved performance. Essential for rapid UI development. | HIGH |
| shadcn/ui | Latest | Component library | Copy-paste React components (Radix UI + Tailwind). Full code ownership, no dependency bloat. Updated for Tailwind v4 and React 19 (forwardRefs removed). Default "new-york" style. | HIGH |
| Radix UI | Via shadcn/ui | Headless components | Accessibility primitives. Integrated via shadcn/ui; don't install separately unless custom components needed. | HIGH |
**Why shadcn/ui:** Own the code, customize freely, no black-box dependencies. Perfect for design system that needs 3D integration (custom layer card components).
### Form Management
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| React Hook Form | 7.x+ | Form library | Zero dependencies, minimal re-renders, smaller bundle than Formik. Active maintenance (Formik hasn't had commits in 1+ year). Native HTML5 validation plus schema validation through resolvers (Zod, Yup). | HIGH |
| Zod | Latest | Schema validation | TypeScript-first validation. Pairs well with React Hook Form. Prefer over Yup for better TypeScript inference. | MEDIUM |
**Why not Formik:** Inactive maintenance, heavier bundle, more re-renders due to controlled components. React Hook Form is 2026 standard.
### Testing
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| Vitest | Latest | Unit/component tests | 10-20x faster than Jest on large codebases. Jest-compatible API, native Vite integration. Browser Mode for real browser component testing. | HIGH |
| React Testing Library | Latest | Component testing | User-focused testing paradigm. Industry standard. Integrates with Vitest via @testing-library/react. | HIGH |
| Playwright | Latest | E2E testing | Browser automation for critical flows (signup, build ISO, resolve conflict). Keep E2E suite small (3-5 flows), rely on Vitest for coverage. | HIGH |
| MSW | Latest | API mocking | Mock Service Worker for intercepting network requests. Essential for testing without backend dependency. | MEDIUM |
**Testing strategy:** Vitest for fast unit/component tests, Playwright for 3-5 critical E2E flows in CI. Avoid testing library churn - these are stable choices.
### Infrastructure
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| Docker | Latest stable | Containerization | Multi-stage builds for lean images. FastAPI best practice: base on python:3.12-slim, separate dependency install stage. One process per container, scale at container level. | HIGH |
| Caddy | 2.x+ | Reverse proxy | REST API on localhost:2019 for programmatic route management (critical for adding/updating routes via Python control plane). Automatic HTTPS, simpler than Nginx for this use case. Atomic updates without reload. | HIGH |
**Why Caddy over Nginx/Traefik:**
- Python control plane needs to dynamically manage routes (user ISO download endpoints)
- Caddy's JSON REST API is perfect for programmatic configuration
- Nginx requires .conf file generation + reload (not atomic)
- Traefik is overkill (designed for K8s label-based discovery)
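To make the "programmatic route management" point concrete, the sketch below builds (but does not send) a request that appends a reverse-proxy route through Caddy's admin API. The server name, path prefix, and upstream address are illustrative assumptions; `build_add_route_request` is a hypothetical helper.

```python
import json
import urllib.request

CADDY_ADMIN = "http://localhost:2019"  # default admin endpoint noted above


def build_add_route_request(server, path_prefix, upstream):
    """Build a POST that appends one route to a server's route list."""
    route = {
        "match": [{"path": [f"{path_prefix}*"]}],
        "handle": [
            {"handler": "reverse_proxy", "upstreams": [{"dial": upstream}]}
        ],
    }
    # POST to an array path in Caddy's config API appends to that array.
    url = f"{CADDY_ADMIN}/config/apps/http/servers/{server}/routes"
    return urllib.request.Request(
        url,
        data=json.dumps(route).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it with `urllib.request.urlopen(req)` requires a running Caddy instance that already has an HTTP server with the given name (for example `srv0`, which is an assumption about the local config).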
**Docker configuration:**
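- Uvicorn with `--workers` matching CPU cores (6 for build server) when a single container serves the host; keep one process per container where an orchestrator handles replication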
- Caddy in front for HTTPS termination and routing
- Multi-stage builds: stage 1 installs deps, stage 2 copies installed packages (lean final image)
- Environment variables via pydantic-settings
### ISO Generation
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| archiso | Latest | ISO builder | Official Arch Linux ISO builder. Use "releng" profile for full package set. Standard tool, well-documented, active maintenance. | HIGH |
| Docker (sandboxed) | Latest | Build isolation | Run archiso in sandboxed container for security. ISO builds from untrusted configs require isolation. Resource limits prevent abuse. | HIGH |
**archiso best practices (2026):**
- Copy releng profile to ext4 partition (NTFS/FAT32 have permission issues)
- Use mksquashfs with: `-b 1048576 -comp xz -Xdict-size 100% -always-use-fragments -noappend` for best compression
- Place working dir on tmpfs if memory allows (speed improvement)
- Build command: `mkarchiso -v -r -w /tmp/archiso-tmp -o /path/to/out_dir /path/to/profile/`
**Integration approach:**
- Celery task receives config, generates archiso profile
- Spins up Docker container with archiso, mounts generated profile
- Monitors build progress, streams logs to frontend via WebSocket
- Caches resulting ISO by config hash
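Because profiles are generated from untrusted user input, the build command is best assembled as an argument list rather than a shell string. The sketch below mirrors the `mkarchiso` flags from the build command above; the cache layout and both function names are hypothetical.

```python
from pathlib import Path


def mkarchiso_command(profile_dir, work_dir, out_dir):
    """Return the mkarchiso argv; a list avoids shell interpolation of paths."""
    return [
        "mkarchiso",
        "-v",
        "-r",
        "-w", str(Path(work_dir)),
        "-o", str(Path(out_dir)),
        str(Path(profile_dir)),
    ]


def iso_cache_path(cache_root, config_hash):
    """Hypothetical cache layout: <root>/<first two hex chars>/<hash>.iso."""
    return Path(cache_root) / config_hash[:2] / f"{config_hash}.iso"
```

The Celery task would pass the list to `subprocess.run(..., check=True)` inside the sandboxed container and move the result to `iso_cache_path(...)` on success.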
### Development Tools
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| uv | Latest | Package manager | 10-100x faster than pip (cold JupyterLab install: 2.6s vs 21.4s). Global cache saves disk space. Drop-in pip/pip-tools replacement. Astral's tool, active development. | HIGH |
| pre-commit | Latest | Git hooks | Auto-format, lint, type-check before commit. Standard Python ecosystem tool. | MEDIUM |
| Ruff | Latest | Linter & formatter | Rust-based Python linter/formatter. Replaces Black, isort, flake8, pylint. Blazing fast, zero config needed. | HIGH |
| mypy | Latest | Type checker | Static type checking for Python. Essential with Pydantic and FastAPI. Strict mode recommended. | MEDIUM |
| ESLint | Latest | JS/TS linter | Standard TypeScript linter. Use with typescript-eslint plugin. | MEDIUM |
| Prettier | Latest | Code formatter | Opinionated JS/TS/CSS formatter. Integrates with ESLint via eslint-config-prettier. | MEDIUM |
**Why uv over Poetry/pip-tools:**
- Speed is critical for developer experience (instant feedback loop)
- uv is drop-in compatible (no workflow change)
- Poetry is slower, more opinionated (own venv logic)
- uv handles Python version management automatically
**Python package installation pattern:**
```bash
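# Core dependencies managed by uv (extras are quoted so the shell does not glob the brackets)
uv pip install "fastapi[all]" "uvicorn[standard]" "sqlalchemy[asyncio]" "celery[redis]" pydantic-settings
# Development tools go into the same virtual environment
uv pip install pytest pytest-asyncio ruff mypy pre-commit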
```
### Monitoring & Observability
| Technology | Version | Purpose | Why | Confidence |
|------------|---------|---------|-----|------------|
| Sentry | Latest SDK | Error tracking & APM | Auto-enabled with FastAPI. Captures stack traces, request context, user info. Flame charts for profiling. Industry standard. Set `traces_sample_rate` for performance monitoring. | HIGH |
| Prometheus | Latest | Metrics | Time-series metrics for tracking build queue depth, ISO generation times, API latency. Standard cloud-native monitoring. | MEDIUM |
| Grafana | Latest | Dashboards | Visualize Prometheus metrics. Standard pairing for observability. | MEDIUM |
**Observability strategy:**
- Sentry for errors and APM (traces)
- Structured logging (JSON) for debugging
- Prometheus for custom metrics (ISO build duration, queue depth)
- Grafana for dashboards
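Structured JSON logging needs nothing beyond the standard library. The formatter below is a minimal sketch; the field names and the optional `build_id` attribute are assumptions chosen for illustration.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Optional context passed via logger.error(..., extra={"build_id": ...}).
        build_id = getattr(record, "build_id", None)
        if build_id is not None:
            entry["build_id"] = build_id
        return json.dumps(entry)
```

Attach it with `handler.setFormatter(JsonFormatter())` on whichever handler writes the build worker's logs.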
## Alternatives Considered
| Category | Recommended | Alternative | Why Not | Confidence |
|----------|-------------|-------------|---------|------------|
| Backend Framework | FastAPI 0.128+ | Django/Flask | FastAPI's async, type safety, auto-docs superior for API-first app. Django is overkill, Flask is outdated. | HIGH |
| ORM | SQLAlchemy 2.0+ | Tortoise ORM | Tortoise simpler but SQLAlchemy 2.0's type hints + performance + ecosystem maturity win. Benchmarks favor SQLAlchemy Core. | MEDIUM |
| Task Queue | Celery 5.6+ | RQ, Dramatiq | RQ too simple for long-running builds. Dramatiq lacks ecosystem maturity. Celery's complexity justified for this use case. | HIGH |
| Build Tool | Vite 6+ | Next.js 15+ | No SSR needed (no SEO), Vite's dev speed critical, architectural freedom for 3D integration. Next.js SSR optimization wasted. | HIGH |
| 3D Library | React Three Fiber 9+ | Vanilla Three.js, Babylon.js | R3F integrates with React paradigm, better scaling. Vanilla Three.js requires manual integration. Babylon.js smaller ecosystem. | HIGH |
| State Mgmt | Zustand 5+ | Redux Toolkit, Jotai | Redux too heavyweight. Jotai's atom model overkill for single-store use case. Zustand is sweet spot. | MEDIUM |
| Form Library | React Hook Form 7+ | Formik | Formik unmaintained (1+ year no commits), heavier bundle, worse performance. RHF is 2026 standard. | HIGH |
| Reverse Proxy | Caddy 2+ | Nginx, Traefik | Caddy's REST API critical for dynamic route management. Nginx requires file generation + reload. Traefik overkill. | HIGH |
| Package Mgr | uv | Poetry, pip-tools | uv's speed (10-100x) improves DX dramatically. Poetry is comprehensive but slow. uv is drop-in replacement. | MEDIUM |
| Component Lib | shadcn/ui | Material-UI, Ant Design | shadcn gives code ownership, zero dependency bloat. MUI/Ant are black boxes, harder to customize for 3D integration. | HIGH |
## Installation
### Backend Setup
```bash
# Install uv (package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Create virtual environment
uv venv
# Activate venv
source .venv/bin/activate
# Core dependencies (specifiers are quoted: unquoted ">=" is a shell redirection)
uv pip install \
  "fastapi[all]==0.128.0" \
  "uvicorn[standard]>=0.30.0" \
  "sqlalchemy[asyncio]>=2.0.0" \
  asyncpg \
  alembic \
  "celery[redis]==5.6.2" \
  "redis>=6.2.0" \
  "pydantic>=2.12.5" \
  pydantic-settings \
  "sentry-sdk[fastapi]"
# Development dependencies (installed into the same venv)
uv pip install \
  pytest \
  pytest-asyncio \
  pytest-cov \
  httpx \
  ruff \
  mypy \
  pre-commit
# Install pre-commit hooks
pre-commit install
```
### Frontend Setup
```bash
# Install Node.js 20+ (LTS)
# Use nvm/fnm for version management
# Create Vite + React + TypeScript project
npm create vite@latest frontend -- --template react-ts
cd frontend
# Core dependencies
npm install \
react@latest \
react-dom@latest \
@react-three/fiber@latest \
@react-three/drei@latest \
@react-three/postprocessing@latest \
three@latest \
zustand@latest \
react-hook-form@latest \
zod@latest
# UI & styling (Tailwind v4 ships a first-party Vite plugin, so autoprefixer/postcss are not needed)
npm install \
tailwindcss@latest \
@tailwindcss/vite
# Initialize shadcn/ui (React 19 + Tailwind v4)
npx shadcn@latest init
# Development dependencies
npm install -D \
@types/react \
@types/react-dom \
@types/three \
vitest \
@testing-library/react \
@testing-library/jest-dom \
playwright \
@playwright/test \
msw \
eslint \
@typescript-eslint/parser \
@typescript-eslint/eslint-plugin \
prettier \
eslint-config-prettier \
leva
# Initialize Playwright
npx playwright install
```
### Infrastructure Setup
```bash
# Install Docker (use system package manager)
# Install Caddy
sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' | sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update
sudo apt install caddy
# Install PostgreSQL 18 (on most Debian/Ubuntu releases this requires adding the PGDG apt repository first)
sudo apt install -y postgresql-18
# Install Redis
sudo apt install -y redis-server
# Install archiso (on Arch-based system or in Docker)
# Run on Arch Linux or in Arch container:
pacman -S archiso
```
## Project Structure
```
debate/
├── backend/ # FastAPI application
│ ├── app/
│ │ ├── api/ # API routes
│ │ ├── core/ # Config, security, dependencies
│ │ ├── crud/ # Database operations
│ │ ├── db/ # Database models, session
│ │ ├── schemas/ # Pydantic models
│ │ ├── services/ # Business logic
│ │ ├── tasks/ # Celery tasks (ISO generation)
│ │ └── main.py
│ ├── tests/
│ ├── alembic/ # Database migrations
│ ├── Dockerfile
│ ├── pyproject.toml
│ └── requirements.txt # Generated by uv
├── frontend/ # React + Vite application
│ ├── src/
│ │ ├── components/ # React components
│ │ │ ├── 3d/ # Three.js/R3F components
│ │ │ ├── ui/ # shadcn/ui components
│ │ │ └── forms/ # Form components
│ │ ├── stores/ # Zustand stores
│ │ ├── hooks/ # Custom React hooks
│ │ ├── lib/ # Utilities
│ │ ├── types/ # TypeScript types
│ │ └── main.tsx
│ ├── tests/
│ ├── e2e/ # Playwright tests
│ ├── public/
│ ├── Dockerfile
│ ├── package.json
│ ├── vite.config.ts
│ ├── tsconfig.json
│ └── tailwind.config.js
├── iso-builder/ # Archiso Docker container
│ ├── Dockerfile
│ └── profiles/ # Custom archiso profiles
├── docker-compose.yml # Local development stack
├── Caddyfile # Caddy configuration
└── .planning/
  └── research/
    └── STACK.md # This file
```
## Version Pinning Strategy
- **Python packages:** Set a minimum version with `>=` (e.g., `fastapi>=0.128.0`) so newer releases with security patches can still be installed
- **Critical dependencies:** Pin exact version for reproducibility (e.g., `celery==5.6.2`)
- **Node packages:** Use `^` for semver compatible updates (e.g., `"react": "^19.2.3"`)
- **Lock files:** Commit `uv.lock` (Python) and `package-lock.json` (Node) for reproducible builds
- **Docker base images:** Pin to specific minor versions (e.g., `python:3.12-slim`) with digest for production
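
As a rough illustration of the policy above, the sketch below classifies Python requirement strings as exact pins, ranges, or unpinned, which could back a simple lint step. The function name `classify_pin` is hypothetical, and the regex only handles simple single-constraint specifiers (no environment markers or comma-separated constraints).

```python
import re

# "name" or "name[extras]", then an optional operator and version.
_SPEC = re.compile(
    r"^(?P<name>[A-Za-z0-9_.-]+(?:\[[^\]]+\])?)\s*"
    r"(?P<op>==|>=|~=)?\s*(?P<version>[\w.]+)?$"
)


def classify_pin(spec: str) -> str:
    """Return "exact", "range", "unpinned", or "unparsed" for one requirement string."""
    match = _SPEC.match(spec.strip())
    if match is None:
        return "unparsed"
    op = match.group("op")
    if op == "==":
        return "exact"      # reproducible, but needs manual bumps
    if op in (">=", "~="):
        return "range"      # picks up newer releases automatically
    return "unpinned"


for requirement in ("celery[redis]==5.6.2", "fastapi>=0.128.0", "asyncpg"):
    print(requirement, "->", classify_pin(requirement))
```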
## Security Considerations
- ISO builds run in sandboxed Docker containers with resource limits (prevent CPU/memory abuse)
- User uploads validated with strict schemas (Pydantic), never executed directly
- Caddy handles HTTPS termination with automatic Let's Encrypt certificates
- Database: Use asyncpg with prepared statements (SQL injection protection via SQLAlchemy)
- API: Rate limiting via FastAPI middleware (protect against abuse)
- Secrets: Environment variables only, never committed (use pydantic-settings)
- Sentry: Scrub sensitive data before sending error reports
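
The Sentry scrubbing item could look roughly like the sketch below. It assumes the hook is wired in through the SDK's `before_send` option; the key list in `SENSITIVE_KEYS` and the helper name `scrub` are illustrative choices, not part of the researched stack.

```python
from typing import Any

# Assumed list of key names to redact; extend to match the real payloads.
SENSITIVE_KEYS = {"password", "authorization", "cookie", "token", "secret", "api_key"}


def scrub(value: Any) -> Any:
    """Recursively replace values stored under sensitive-looking keys."""
    if isinstance(value, dict):
        return {
            key: "[Filtered]" if str(key).lower() in SENSITIVE_KEYS else scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value


def before_send(event: dict, hint: dict) -> dict:
    """Same shape as Sentry's before_send hook: takes an event, returns a scrubbed copy."""
    return scrub(event)
```

It would be passed as `sentry_sdk.init(..., before_send=before_send)`; returning a new dict rather than mutating keeps the original event untouched for local logging.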
## Performance Targets
| Metric | Target | Why |
|--------|--------|-----|
| 3D visualization | 60fps | Core differentiator - must feel fluid on mid-range hardware |
| API response time | <100ms (p95) | User perception of responsiveness |
| ISO generation | <30min | Acceptable for custom distro build (archiso baseline) |
| Frontend bundle | <500KB gzipped | Fast initial load, code-split 3D assets |
| Database queries | <50ms (p95) | Adequate for CRUD operations |
| WebSocket latency | <50ms | Real-time build progress updates |
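
The p95 targets in the table can be checked mechanically. The sketch below uses the nearest-rank percentile definition, which is one of several common conventions; the names `percentile`, `meets_target`, and `TARGETS_MS` are hypothetical.

```python
import math

# p95 targets in milliseconds, taken from the table above.
TARGETS_MS = {"api": 100.0, "db": 50.0}


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def meets_target(metric: str, samples_ms: list[float]) -> bool:
    """True when the measured p95 is strictly below the target for this metric."""
    return percentile(samples_ms, 95) < TARGETS_MS[metric]
```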
**How we achieve 60fps:**
- LOD (Level of Detail) with drei's `<Detailed />` - 30-40% frame rate improvement
- Draw call optimization: <100 per frame via instancing/batching
- Asset compression: Draco (geometry), KTX2 (textures)
- Mutation in `useFrame`, not React state (avoid re-render overhead)
- Web Workers for heavy computation (config validation off main thread)
- WebGPU renderer when available (Safari 26+, Chrome, Firefox)
## Migration Path
This stack is greenfield-ready. No legacy migrations required.
**Future considerations:**
- **TypeScript 7.0 (Go rewrite):** Monitor but don't migrate until 2027+ when ecosystem stable
- **React 20.x:** Adopt when stable (likely 2027), no breaking changes expected based on 19.x pattern
- **PostgreSQL 19:** Upgrade when released (Sept 2026), follow minor release cadence
- **Pydantic v3:** Does not exist; stay on v2.x series
## Sources
**High Confidence (Official Docs / Context7):**
- [FastAPI Release Notes](https://fastapi.tiangolo.com/release-notes/)
- [FastAPI Releases](https://github.com/fastapi/fastapi/releases)
- [React Versions](https://react.dev/versions)
- [React 19.2 Release](https://react.dev/blog/2025/10/01/react-19-2)
- [PostgreSQL Versioning Policy](https://www.postgresql.org/support/versioning/)
- [PostgreSQL 18.1 Release](https://www.postgresql.org/about/news/postgresql-181-177-1611-1515-1420-and-1323-released-3171/)
- [Celery Documentation - Redis](https://docs.celeryq.dev/en/stable/getting-started/backends-and-brokers/redis.html)
- [Celery Releases](https://github.com/celery/celery/releases)
- [TypeScript 5.8 Documentation](https://www.typescriptlang.org/docs/handbook/release-notes/typescript-5-8.html)
- [TypeScript 5.8 Announcement](https://devblogs.microsoft.com/typescript/announcing-typescript-5-8/)
- [archiso - ArchWiki](https://wiki.archlinux.org/title/Archiso)
- [React Hook Form Documentation](https://react-hook-form.com/)
- [shadcn/ui Documentation](https://ui.shadcn.com/)
- [Sentry FastAPI Integration](https://docs.sentry.io/platforms/python/integrations/fastapi/)
**Medium Confidence (Multiple credible sources agree):**
- [React Three Fiber vs Three.js 2026](https://graffersid.com/react-three-fiber-vs-three-js/)
- [Vite vs Next.js 2025 Comparison](https://strapi.io/blog/vite-vs-nextjs-2025-developer-framework-comparison)
- [State Management in 2025: Redux, Zustand, Jotai](https://dev.to/hijazi313/state-management-in-2025-when-to-use-context-redux-zustand-or-jotai-2d2k)
- [Zustand vs Redux vs Jotai Comparison](https://betterstack.com/community/guides/scaling-nodejs/zustand-vs-redux-toolkit-vs-jotai/)
- [Testing in 2026: Jest, RTL, Playwright](https://www.nucamp.co/blog/testing-in-2026-jest-react-testing-library-and-full-stack-testing-strategies)
- [FastAPI Docker Best Practices](https://betterstack.com/community/guides/scaling-python/fastapi-docker-best-practices/)
- [Caddy vs Nginx vs Traefik Comparison](https://www.programonaut.com/reverse-proxies-compared-traefik-vs-caddy-vs-nginx-docker/)
- [Python Package Management: uv vs Poetry](https://medium.com/@hitorunajp/poetry-vs-uv-which-python-package-manager-should-you-use-in-2025-4212cb5e0a14)
- [SQLAlchemy vs Tortoise ORM Comparison](https://betterstack.com/community/guides/scaling-python/tortoiseorm-vs-sqlalchemy/)
- [React Hook Form vs Formik Comparison](https://www.dhiwise.com/post/choosing-the-right-form-library-formik-vs-react-hook-form)
**Low Confidence (Single source, needs validation):**
- 100 Three.js Best Practices (community guide, not official)
- Celery alternatives discussion threads (opinions vary widely)
---
**Summary:** This stack represents the 2026 industry standard for building a high-performance, type-safe, async Python API with a React 3D frontend. All major technologies are on current stable versions with active maintenance. The 60fps 3D visualization requirement is achievable with React Three Fiber and proper optimization techniques. ISO generation via Celery + archiso is proven (similar to distro builder tools). The stack avoids deprecated technologies (Formik, Python 3.8, Pydantic v1, Redux for this use case) and unproven bleeding-edge options (TypeScript 7.0, React Server Components without Next.js).

View file

@ -0,0 +1,320 @@
# Project Research Summary
**Project:** Debate - Visual Linux Distribution Builder
**Domain:** Web-based Linux distribution customization and ISO generation platform
**Researched:** 2026-01-25
**Confidence:** MEDIUM-HIGH
## Executive Summary
Debate is a web-based Linux distribution builder that uses a 3D visual interface to help users customize and build bootable ISOs. Expert research shows that successful distribution builders follow a **layered web-queue-worker architecture**: React frontend with 3D visualization (Three.js/React Three Fiber) communicating with a FastAPI backend that delegates long-running ISO builds to Celery workers using archiso. The recommended approach is to start with Arch Linux support (Omarchy use case), implement robust dependency resolution with SAT solvers, and build sandboxing into the infrastructure from day one.
The platform's core differentiator is **visual conflict resolution** - making dependency hell visible and solvable for non-experts through the "Objection" system in the debate metaphor. This positions Debate as "what Canva did for design, but for Linux customization." The recommended stack is modern 2026 technology: Python 3.12+ with FastAPI/Uvicorn/Celery, React 19+ with Vite/Zustand/React Three Fiber, PostgreSQL 18+, and Redis for task queuing.
**Critical risks:** (1) Unsandboxed build execution allowing malicious code in user overlays - archiso and PKGBUILD files execute arbitrary code, requiring systemd-nspawn/microVM isolation from day one. (2) Non-deterministic builds preventing reliable caching - timestamps and environment variables must be normalized for reproducible builds. (3) Upstream breaking changes from rolling release repos (Arch/CachyOS) - pin repository snapshots and test in staging. (4) Performance degradation of 3D visualization on mid-range hardware - enforce 60fps target on Intel UHD Graphics from the start. These risks are mitigated through early architectural decisions in Phase 1 infrastructure.
## Key Findings
### Recommended Stack
Research shows the 2026 industry standard for high-performance Python APIs with React 3D frontends combines async frameworks (FastAPI), modern build tools (Vite), and distributed task queues (Celery). The stack avoids deprecated technologies (Formik, Pydantic v1) and unproven bleeding-edge options (TypeScript 7.0).
**Core technologies:**
- **FastAPI 0.128+ + Uvicorn 0.30+**: Async Python framework with auto-docs and type safety. 300% better performance than sync frameworks for I/O-bound operations. Industry standard for API-first apps.
- **React 19+ + Vite 6+**: Modern frontend with instant HMR and faster production builds (16.1s vs 28.4s with CRA). Vite chosen over Next.js because no SSR is needed (no SEO benefit for a logged-in tool), fast dev iteration is critical, and it leaves architectural freedom for 3D integration.
- **React Three Fiber 8+ + Three.js**: 3D visualization framework that integrates Three.js with React paradigms. Outperforms vanilla Three.js at scale due to React scheduling. WebGPU support since Safari 26 makes this future-proof.
- **Celery 5.6.2+ + Redis 6.2+**: Distributed task queue for long-running ISO builds requiring progress tracking, cancellation, resource limiting, and retry logic. RQ too simple, Dramatiq lacks ecosystem maturity.
- **PostgreSQL 18.1 + SQLAlchemy 2.0+**: Latest stable database (Nov 2025) with async ORM. Better type hints and performance than Tortoise ORM.
- **archiso (latest)**: Official Arch Linux ISO builder using "releng" profile. Well-documented, active maintenance, proven approach.
- **Zustand 5+**: State management sweet spot between Redux complexity and Context limitations. Single store model fits stack-builder state, minimal boilerplate.
- **Caddy 2+**: Reverse proxy with REST API for programmatic route management (critical for dynamic ISO download endpoints). Simpler than Nginx for this use case.
**Critical version notes:** Python 3.12+ required (3.13 ecosystem immature), PostgreSQL 18.1 current stable (13 is EOL), Pydantic 2.12+ only (v1 deprecated, v3 doesn't exist), Celery 5.6.2+ for Redis Sentinel ACL auth fixes.
### Expected Features
Research into existing distribution builders (archiso, Cubic, live-build, NixOS) reveals clear table stakes vs. competitive differentiators.
**Must have (table stakes):**
- **Package Selection** - Core functionality with search/categorization (Debate's "Talking Points")
- **Base Distribution Selection** - Foundation to build from, start with Arch only (Debate's "Opening Statement")
- **ISO Generation** - Bootable installation media as end product
- **Configuration Persistence** - Save and reload work (Debate's "Speech")
- **Bootloader Configuration** - UEFI + BIOS support via archiso (syslinux, GRUB, systemd-boot)
- **Kernel/Locale/User Setup** - Expected in all distribution builders
**Should have (competitive differentiators):**
- **Visual Conflict Resolution** - UNIQUE to Debate. Makes dependency hell visible through "Objection" system. Current tools show cryptic errors.
- **Curated Starting Templates** - Pre-configured setups (Omarchy) as gallery of "Opening Statements"
- **Build Size Calculator** - Real-time feedback prevents mistakes
- **Visual Theme Customization** - GUI for WM/themes/icons BEFORE install (tools like HyprRice only work post-install)
- **Community Template Gallery** - Browse/fork/share configs (social feature drives engagement)
- **Conflict Explanation System** - AI-assisted or rule-based explanations turning errors into learning moments
**Defer (v2+):**
- **Live Preview in Browser** - Very high complexity, requires VM infrastructure. Get manual testing feedback first.
- **Multi-distro Support** - Ubuntu/Fedora after Arch works perfectly. Deep > wide.
- **Secure Boot** - Nice to have but not critical for target audience (Linux-curious users likely disabling secure boot)
- **Post-Install Configuration** - Scope creep. Link to Ansible/dotfiles managers instead.
**Anti-features (explicitly avoid):**
- Full NixOS-style declarative config (too complex for target audience)
- Build everything locally (slow, blocks UX - use cloud workers)
- Custom package repository hosting (infrastructure burden, security liability)
- Native desktop app (limits accessibility - web-first, Electron wrapper later if needed)
### Architecture Approach
Research shows successful distribution builders use **layered web-queue-worker architecture** with separation between frontend configuration, backend validation, and isolated build execution. The Debate platform should follow OverlayFS-inspired layer precedence (5 layers: Opening Statement → Platform → Rhetoric → Talking Points → Closing Argument), SAT-solver dependency resolution, and cache-first build strategy.
**Major components:**
1. **React Frontend + Three.js Renderer** - 3D visualization of layers/packages with configuration UI. State in React (app data) drives scene rendering. Performance target: 60fps on Intel UHD Graphics.
2. **FastAPI Gateway** - Stateless async API with Pydantic validation, request routing, WebSocket/SSE for real-time build progress. Separate routers by domain (configs, packages, builds).
3. **Dependency Resolver** - SAT solver (libsolv approach) translating package dependencies to logic clauses. Detects conflicts BEFORE build, suggests alternatives. Called during configuration save.
4. **Overlay Engine** - Layer composition with merge strategies (replace/append/deep-merge). Generates archiso profiles from layered configurations. Precedence: higher layers override lower.
5. **Build Queue Manager (Celery)** - Distributed task queue with priority scheduling. Job types: quick validation (seconds), full build (minutes), cache warming (low priority). One build per worker (CPU-intensive).
6. **Build Execution Workers** - archiso runners in sandboxed containers (systemd-nspawn/microVMs). Profile generation → package install → customization → ISO creation → object storage upload.
7. **PostgreSQL + Object Storage** - Configuration data, build metadata, user data in PostgreSQL. ISOs (1-4GB), logs, overlays in S3-compatible storage.
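
A minimal sketch of the Overlay Engine's merge strategies (replace/append/deep-merge) follows, assuming each layer is a plain dict and strategies are chosen per top-level key. The function names and the layer shape are assumptions for illustration; the actual layer schema is not defined in this research.

```python
from copy import deepcopy
from typing import Any, Optional


def merge_layers(lower: dict, upper: dict, strategies: Optional[dict] = None) -> dict:
    """Merge `upper` onto `lower`; the higher layer wins unless a strategy says otherwise."""
    strategies = strategies or {}
    result = deepcopy(lower)
    for key, value in upper.items():
        strategy = strategies.get(key, "replace")
        current = result.get(key)
        if strategy == "append" and isinstance(current, list) and isinstance(value, list):
            # Keep lower-layer order, add new items from the upper layer once.
            result[key] = current + [item for item in value if item not in current]
        elif strategy == "deep-merge" and isinstance(current, dict) and isinstance(value, dict):
            result[key] = merge_layers(current, value, strategies)
        else:
            result[key] = deepcopy(value)
    return result


def compose(layers: list, strategies: Optional[dict] = None) -> dict:
    """Fold layers from lowest to highest precedence into one configuration."""
    merged: dict[str, Any] = {}
    for layer in layers:
        merged = merge_layers(merged, layer, strategies)
    return merged
```

For example, `compose([opening_statement, platform, rhetoric], {"packages": "append", "settings": "deep-merge"})` would accumulate package lists while letting higher layers override individual settings.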
**Critical patterns:**
- **Layered configuration precedence** with OverlayFS-inspired merge strategies
- **SAT-based dependency resolution** using the CDCL algorithm (SAT is NP-complete in the worst case, but CDCL solvers typically resolve real package graphs in milliseconds)
- **Asynchronous build queue** with progress tracking via WebSocket/SSE
- **Cache-first strategy** with config hash to reuse identical ISOs
- **Reproducible builds** with SOURCE_DATE_EPOCH, fixed locales, deterministic file ordering
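
The cache-first pattern depends on a stable hash of the configuration. A minimal sketch, assuming configurations are JSON-serializable dicts and that a toolchain version string is mixed into the key so that toolchain upgrades invalidate old ISOs (the function name and that parameter are assumptions):

```python
import hashlib
import json


def config_cache_key(config: dict, toolchain_version: str) -> str:
    """Return a hex SHA-256 key that is stable across dict key ordering."""
    # Canonical JSON: sorted keys and fixed separators give byte-identical output.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"), ensure_ascii=True)
    digest = hashlib.sha256()
    digest.update(toolchain_version.encode("utf-8"))
    digest.update(b"\0")  # separator so the two fields cannot run together
    digest.update(canonical.encode("utf-8"))
    return digest.hexdigest()
```

Note that list order still affects the key, so package lists would need to be sorted before hashing if their order is not meaningful.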
**Anti-patterns to avoid:**
- Blocking API calls during build (use async queue)
- Duplicating state between React and Three.js (single source of truth)
- Storing large files in PostgreSQL (use object storage)
- Multiple builds per worker (resource contention)
- No dependency validation until build time (validate on save)
### Critical Pitfalls
Research into archiso security incidents, rolling release challenges, and 3D web performance reveals systematic failure modes.
1. **Unsandboxed User-Generated Package Execution** - CHAOS RAT malware distributed through AUR packages in July 2025 using .install scripts. PKGBUILD files execute arbitrary code during build. **Prevention:** Never run user PKGBUILDs directly on build servers. Use systemd-nspawn/microVMs for isolation, static analysis on PKGBUILDs, network egress filtering, ephemeral containers discarded after each build. **Phase 1 critical**.
2. **Non-Deterministic Build Reproducibility** - Same configuration generates different ISO hashes, breaking cache invalidation and security verification. **Prevention:** Normalize timestamps (SOURCE_DATE_EPOCH), sort files deterministically, use fixed locales (LC_ALL=C), pin toolchain versions, disable ASLR during builds. **Phase 1 critical**.
3. **Upstream Breaking Changes Without Version Pinning** - When CachyOS/Arch rolling repos ship breaking changes, all builds fail simultaneously. CachyOS had kernel stability issues in 2025. **Prevention:** Pin package repository snapshots by date (use archive.archlinux.org), staging environment testing, monitor upstream changelogs, gradual rollout (1% traffic). **Phase 5 critical**.
4. **Dependency Hell Across Overlays** - Multiple overlays declare conflicting package versions or file ownership. Build fails after 15 minutes or succeeds with broken ISO. **Prevention:** Pre-validate overlay compatibility during upload (extract file lists, check conflicts), SAT solver detects conflicts BEFORE build, curated overlay collections, priority system for file conflicts. **Phase 3 critical**.
5. **3D Visualization Performance Degradation** - Works on RTX 4090 dev machines, runs at 5fps on target users' mid-range laptops. **Prevention:** Test on Intel UHD Graphics from day one, LOD (Level of Detail), instancing for repeated elements, Web Worker with OffscreenCanvas, 2D fallback UI, 60fps performance budget enforcement. **Phase 8 critical**.
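
Two of the preventions above are easy to sketch: pinning pacman to a dated snapshot on archive.archlinux.org (pitfall 3) and fixing the build environment (pitfall 2). The function names are hypothetical, `TZ=UTC` is an added assumption beyond the variables named above, and the archive URL only covers official Arch repositories, so CachyOS repos would need a separate snapshot mechanism.

```python
from datetime import date, datetime, timezone

ARCHIVE_BASE = "https://archive.archlinux.org/repos"


def snapshot_server_line(snapshot: date) -> str:
    """Return a pacman mirrorlist line pinned to the given archive date."""
    return f"Server = {ARCHIVE_BASE}/{snapshot:%Y/%m/%d}/$repo/os/$arch"


def reproducible_env(snapshot: date) -> dict[str, str]:
    """Environment variables that remove common sources of build nondeterminism."""
    midnight_utc = datetime(snapshot.year, snapshot.month, snapshot.day, tzinfo=timezone.utc)
    return {
        "SOURCE_DATE_EPOCH": str(int(midnight_utc.timestamp())),
        "LC_ALL": "C",
        "TZ": "UTC",  # assumption: not listed above, added to fix the timezone
    }


print(snapshot_server_line(date(2026, 1, 20)))
```

Tying `SOURCE_DATE_EPOCH` to the snapshot date is one possible convention; it means two builds of the same configuration against the same snapshot share identical timestamps.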
## Implications for Roadmap
Based on research findings, suggested 9-phase structure optimized for dependency ordering, risk mitigation, and value delivery:
### Phase 1: Core Infrastructure & Security
**Rationale:** Foundation for all components. Build sandboxing and reproducibility MUST be architected from the start - retrofitting security is nearly impossible. No dependencies on complex logic.
**Delivers:** Database schema, basic API scaffolding, object storage setup, **sandboxed build environment**, deterministic build configuration.
**Addresses:** Basic architecture components, storage layer
**Avoids:** Pitfall #1 (unsandboxed execution), Pitfall #2 (non-deterministic builds)
**Duration:** 1-2 weeks
**Research needed:** Standard patterns, skip `/gsd:research-phase`
### Phase 2: Configuration Management
**Rationale:** Enables testing configuration storage before complex dependency resolution. Data models required for later phases.
**Delivers:** Layer data models (5 debate layers), CRUD endpoints, basic validation, configuration save/load.
**Addresses:** Configuration Persistence (table stakes), layered architecture foundation
**Uses:** FastAPI, PostgreSQL, Pydantic
**Implements:** Database persistence layer
**Duration:** 1-2 weeks
**Research needed:** Standard CRUD patterns, skip `/gsd:research-phase`
### Phase 3: Dependency Resolver (Simplified)
**Rationale:** Provides early validation capability without full SAT solver complexity. Catches obvious conflicts before build.
**Delivers:** Basic conflict detection (direct conflicts only, no SAT solver yet), immediate validation feedback.
**Addresses:** Early error detection, improved UX
**Avoids:** Pitfall #4 (dependency hell) - partial mitigation
**Duration:** 1 week
**Research needed:** Consider `/gsd:research-phase` for SAT solver integration patterns
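
Direct conflict detection for this phase can be as simple as a set lookup. The sketch below assumes a precomputed map from each package to the packages it declares conflicts with; the function name and the sample data are illustrative only.

```python
def find_direct_conflicts(
    selected: set[str], conflicts: dict[str, set[str]]
) -> list[tuple[str, str]]:
    """Return each conflicting pair among the selected packages exactly once, sorted."""
    found: set[tuple[str, str]] = set()
    for pkg in selected:
        for other in conflicts.get(pkg, set()):
            if other in selected and other != pkg:
                # Sort the pair so (a, b) and (b, a) collapse into one entry.
                found.add(tuple(sorted((pkg, other))))
    return sorted(found)


# Illustrative data: two audio servers that cannot be installed together.
declared = {"pipewire-pulse": {"pulseaudio"}, "pulseaudio": {"pipewire-pulse"}}
print(find_direct_conflicts({"pipewire-pulse", "pulseaudio", "firefox"}, declared))
```

This does not follow transitive dependencies, which is the gap the later SAT-based phase is meant to close.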
### Phase 4: Overlay Engine
**Rationale:** Requires configuration data models from Phase 2. Produces archiso profiles for Phase 5 builds. Core business logic.
**Delivers:** Layer merging logic with precedence rules, profile generation for archiso, merge strategies (replace/append/deep-merge).
**Addresses:** Core architecture component
**Uses:** OverlayFS-inspired patterns
**Implements:** Overlay Engine component
**Duration:** 2 weeks
**Research needed:** Standard patterns, skip `/gsd:research-phase`
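
One concrete piece of profile generation is the package list. archiso profiles such as releng list packages one per line in `packages.x86_64`; the sketch below renders that file from per-layer package lists. The helper names are hypothetical, and package removal by higher layers is deliberately not handled here.

```python
from pathlib import Path


def render_package_list(layers: list[list[str]]) -> str:
    """Render packages.x86_64 content: unique package names, sorted, one per line."""
    names = {name.strip() for layer in layers for name in layer if name.strip()}
    return "\n".join(sorted(names)) + "\n"


def write_package_list(profile_dir: Path, layers: list[list[str]]) -> Path:
    """Write the rendered list into an existing profile directory and return its path."""
    target = profile_dir / "packages.x86_64"
    target.write_text(render_package_list(layers), encoding="utf-8")
    return target
```

Sorting the output keeps the generated profile deterministic, which also helps the config-hash cache.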
### Phase 5: Build Queue + Workers
**Rationale:** Depends on Overlay Engine for profile generation. Core value delivery - users can build ISOs. Implements web-queue-worker pattern.
**Delivers:** Celery setup, basic build task, worker orchestration, **sandboxed archiso execution**, progress tracking.
**Addresses:** ISO Generation (table stakes), asynchronous processing
**Uses:** Celery, Redis, archiso, systemd-nspawn
**Avoids:** Pitfall #3 (upstream breaking changes) via pinned repos
**Implements:** Build Queue Manager, Build Execution Workers
**Duration:** 2-3 weeks
**Research needed:** Consider `/gsd:research-phase` for archiso integration specifics
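
The priority scheduling of job types can be illustrated with an in-process stand-in. This is not Celery code: in Celery the same idea would more likely be expressed through separate queues or routing, so treat the class and priority values below as assumptions that only demonstrate the ordering.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional

# Lower number = served first. Values are assumptions, not from the research.
PRIORITY = {"validation": 0, "full_build": 1, "cache_warming": 2}


@dataclass(order=True)
class Job:
    priority: int
    seq: int  # tie-breaker: submission order within one priority level
    job_type: str = field(compare=False)
    config_id: str = field(compare=False)


class BuildQueue:
    """Priority queue that stays first-in, first-out within each job type."""

    def __init__(self) -> None:
        self._heap: list[Job] = []
        self._counter = itertools.count()

    def submit(self, job_type: str, config_id: str) -> None:
        job = Job(PRIORITY[job_type], next(self._counter), job_type, config_id)
        heapq.heappush(self._heap, job)

    def next_job(self) -> Optional[Job]:
        return heapq.heappop(self._heap) if self._heap else None
```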
### Phase 6: Frontend (Basic)
**Rationale:** API must exist first (Phases 1-5). Provides usable interface for testing builds. No 3D yet - focus on functionality.
**Delivers:** React UI for configuration (forms, lists), build submission, status polling.
**Addresses:** User interface for table stakes features
**Uses:** React 19, Vite, Zustand, shadcn/ui
**Duration:** 2-3 weeks
**Research needed:** Standard React patterns, skip `/gsd:research-phase`
### Phase 7: Advanced Dependency Resolution
**Rationale:** Complex feature. System works with basic validation from Phase 3. Enables competitive differentiation.
**Delivers:** Full SAT solver integration (libsolv approach), conflict explanations, alternative suggestions, **visual conflict resolution UI**.
**Addresses:** Visual Conflict Resolution (core differentiator)
**Avoids:** Pitfall #4 (dependency hell) - complete mitigation
**Implements:** SAT-based Dependency Resolver
**Duration:** 2-3 weeks
**Research needed:** **NEEDS `/gsd:research-phase`** - SAT solver integration is complex, requires domain-specific research
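
To make the SAT encoding concrete, here is a toy solver. It is plain backtracking with unit-clause preference, not CDCL and not libsolv, and the package names in the usage note are invented. Dependencies become clauses of the form "not A, or one of its providers", conflicts become "not A or not B", and user requests become single-literal clauses.

```python
from typing import Optional

Literal = tuple[str, bool]  # (package, True = installed / False = not installed)
Clause = list[Literal]


def want(pkg: str) -> Clause:
    return [(pkg, True)]


def requires(pkg: str, *providers: str) -> Clause:
    return [(pkg, False)] + [(p, True) for p in providers]


def conflicts(a: str, b: str) -> Clause:
    return [(a, False), (b, False)]


def solve(clauses: list[Clause], assignment: Optional[dict] = None) -> Optional[dict]:
    """Return a satisfying {package: installed} mapping, or None if none exists."""
    assignment = dict(assignment or {})
    remaining = []
    for clause in clauses:
        if any(assignment.get(pkg) == val for pkg, val in clause):
            continue  # clause already satisfied
        open_literals = [(pkg, val) for pkg, val in clause if pkg not in assignment]
        if not open_literals:
            return None  # every literal is false under this assignment
        remaining.append(open_literals)
    if not remaining:
        return assignment
    unit = next((c for c in remaining if len(c) == 1), None)
    pkg, val = (unit or remaining[0])[0]
    # A unit clause forces its value; otherwise try both values.
    for choice in ([val] if unit else [val, not val]):
        result = solve(remaining, {**assignment, pkg: choice})
        if result is not None:
            return result
    return None
```

For instance, `solve([want("app"), requires("app", "audio-a", "audio-b"), conflicts("audio-a", "audio-b")])` returns an assignment that installs `app` with exactly one of the two providers, while `solve([want("a"), want("b"), conflicts("a", "b")])` returns `None`, which is the case the "Objection" UI would need to explain.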
### Phase 8: 3D Visualization
**Rationale:** Polish/differentiator feature. Core functionality works without it. Requires mature configuration system to visualize.
**Delivers:** Three.js integration, layer visualization in 3D space, visual debugging, performance optimization (LOD, instancing).
**Addresses:** Visual Theme Customization (differentiator), unique UX
**Uses:** React Three Fiber, Three.js, @react-three/drei
**Avoids:** Pitfall #5 (performance degradation) via 60fps enforcement on mid-range hardware
**Implements:** Three.js Renderer component
**Duration:** 3-4 weeks
**Research needed:** **NEEDS `/gsd:research-phase`** - 3D performance optimization patterns, LOD strategies, WebGPU considerations
### Phase 9: Caching + Optimization
**Rationale:** Optimization after core features work. Requires usage data to tune effectively. Improves scalability.
**Delivers:** Build cache with config hash, package cache, performance tuning, build deduplication, autoscaling.
**Addresses:** Scalability, cost optimization
**Avoids:** Pitfall #7 (queue starvation), improves cache invalidation from Pitfall #2
**Duration:** 1-2 weeks
**Research needed:** Standard caching patterns, skip `/gsd:research-phase`
### Phase Ordering Rationale
**Why this order:**
- **Security first (Phase 1):** Build sandboxing and reproducibility cannot be retrofitted - architectural from day one
- **Data before logic (Phase 2 before 3-4):** Configuration models required for dependency resolution and overlay engine
- **Validation before build (Phase 3 before 5):** Catch errors early, prevent wasted build resources
- **Backend before frontend (Phases 1-5 before 6):** API must exist for UI to consume
- **Core features before polish (Phases 1-6 before 7-8):** Prove value delivery before investing in differentiators
- **Optimization last (Phase 9):** Need usage patterns to optimize effectively
**Dependency chain:**
```
Phase 1 (Infrastructure)
    ↓
Phase 2 (Config Models) ──────→ Phase 4 (Overlay Engine)
    ↓                               ↓
Phase 3 (Basic Validation)      Phase 5 (Build Queue)
    ↓                               ↓
Phase 6 (Frontend) ←────────────────┘
    ↓
Phase 7 (Advanced Dependency) & Phase 8 (3D Viz)
    ↓
Phase 9 (Optimization)
```
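
The dependency chain can also be written down as data and checked with the standard library's `graphlib`. The mapping below is one reading of the diagram above (each phase maps to the phases it depends on); the variable and function names are illustrative.

```python
from graphlib import TopologicalSorter

# phase -> phases that must be finished first
PHASE_DEPS = {
    1: set(),
    2: {1},
    3: {2},
    4: {2},
    5: {4},
    6: {3, 5},
    7: {6},
    8: {6},
    9: {7, 8},
}


def build_order(deps: dict) -> list:
    """Return one valid execution order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(deps).static_order())


print(build_order(PHASE_DEPS))
```

Several orders are valid (for example, Phases 3 and 4 can swap), which matches the diagram's parallel branches.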
**How this avoids pitfalls:**
- Sandboxing in Phase 1 prevents Pitfall #1 (malicious code execution)
- Reproducible builds in Phase 1 enable Phase 9 caching
- Validation in Phase 3 reduces build failures from Pitfall #4
- Repository pinning in Phase 5 mitigates Pitfall #3
- Performance requirements in Phase 8 prevent Pitfall #5
- Build deduplication in Phase 9 addresses Pitfall #7
### Research Flags
**Phases needing deeper research during planning:**
- **Phase 7 (Advanced Dependency Resolution):** SAT solver integration patterns are complex and domain-specific. Recommend `/gsd:research-phase` for libsolv/version-SAT integration approaches, CDCL algorithm implementation, conflict explanation strategies.
- **Phase 8 (3D Visualization):** Performance optimization for 60fps on mid-range hardware requires specialized knowledge. Recommend `/gsd:research-phase` for LOD strategies, instancing patterns, WebGPU migration paths, OffscreenCanvas integration.
- **Phase 5 (Build Queue + Workers):** Consider `/gsd:research-phase` for archiso integration specifics (profile generation, customization hooks, boot configuration).
**Phases with standard patterns (skip research-phase):**
- **Phase 1 (Core Infrastructure):** Standard FastAPI/PostgreSQL/Docker setup, well-documented
- **Phase 2 (Configuration Management):** CRUD patterns, standard database schema design
- **Phase 3 (Basic Dependency Resolution):** Simple conflict detection, no SAT solver complexity yet
- **Phase 4 (Overlay Engine):** File merging logic, well-understood patterns
- **Phase 6 (Frontend Basic):** Standard React/Vite setup, CRUD UI patterns
- **Phase 9 (Caching + Optimization):** Standard cache invalidation patterns, autoscaling approaches
## Confidence Assessment
| Area | Confidence | Notes |
|------|------------|-------|
| Stack | HIGH | All technologies verified via official docs, release notes, and Context7 library. Versions confirmed current and stable (Jan 2026). |
| Features | MEDIUM | Table stakes verified via archiso wiki and multiple distribution builder tools. Differentiators based on market gap analysis and community tools research. User journey mapping needs validation. |
| Architecture | MEDIUM-HIGH | Patterns based on established web-queue-worker architecture, archiso documentation, and multiple reference implementations. Component boundaries clear, but integration complexity requires validation. |
| Pitfalls | MEDIUM-HIGH | Security pitfalls verified via documented incidents (CHAOS RAT in AUR July 2025, CachyOS stability issues 2025). Performance pitfalls based on Three.js optimization guides. Dependency issues confirmed in Arch forums. |
**Overall confidence:** MEDIUM-HIGH
Research is grounded in official documentation (FastAPI, archiso, React, PostgreSQL), verified incidents (AUR malware), and established architectural patterns (web-queue-worker, SAT solvers). Lower confidence areas are appropriately flagged for deeper research during planning (Phase 7 SAT solver, Phase 8 3D optimization).
### Gaps to Address
**Technical unknowns requiring validation during implementation:**
- **SAT solver integration complexity (Phase 7):** How to integrate libsolv with Python/FastAPI? Performance characteristics for large dependency graphs? Conflict explanation generation strategies? → Recommend `/gsd:research-phase` before Phase 7 implementation.
- **3D performance on target hardware (Phase 8):** Actual FPS achieved with layer visualization on Intel UHD Graphics? WebGPU adoption timeline? LOD effectiveness for package graphs? → Recommend `/gsd:research-phase` and early prototyping with performance testing.
- **archiso customization limits (Phase 5):** What can/can't be customized via profiles? Boot configuration edge cases? Multi-kernel support? → Validate via archiso ArchWiki and experimentation during Phase 5.
- **Upstream repository stability (Phase 5):** How frequently do CachyOS/Omarchy repos break compatibility? Optimal snapshot cadence? → Monitor during staging deployment, adjust pinning strategy based on data.
**User experience unknowns requiring user testing:**
- **Target audience validation:** Are "Windows refugees" actually the right target? Do they want 3D visualization or prefer simpler UI? → User testing during Phase 6 frontend development.
- **Conflict explanation effectiveness:** Can non-technical users understand dependency conflict explanations? What level of detail is helpful vs overwhelming? → User testing during Phase 7 development.
- **Template gallery adoption:** Will users share configurations? What incentives drive engagement? → Defer to post-MVP, validate demand first.
**Business/operational unknowns:**
- **Build resource costs:** What's the actual cost per build (CPU time, storage, bandwidth)? → Measure during beta deployment, implement quotas if needed.
- **Support burden:** What percentage of users need help debugging build failures? → Track during beta, inform UX improvements.
## Sources
### Primary (HIGH confidence)
**Stack & Technology:**
- [FastAPI Release Notes](https://fastapi.tiangolo.com/release-notes/) - Version verification, features
- [React 19.2 Release](https://react.dev/blog/2025/10/01/react-19-2) - Version confirmation, Activity API
- [PostgreSQL 18.1 Release](https://www.postgresql.org/about/news/postgresql-181-177-1611-1515-1420-and-1323-released-3171/) - Current stable version
- [Celery Documentation - Redis](https://docs.celeryq.dev/en/stable/getting-started/backends-and-brokers/redis.html) - Official patterns
- [TypeScript 5.8 Documentation](https://www.typescriptlang.org/docs/handbook/release-notes/typescript-5-8.html) - Features, compatibility
- [archiso ArchWiki](https://wiki.archlinux.org/title/Archiso) - Build process, configuration
- [React Hook Form Documentation](https://react-hook-form.com/) - Official API reference
- [Sentry FastAPI Integration](https://docs.sentry.io/platforms/python/integrations/fastapi/) - Observability setup
**Architecture:**
- [Libsolv SAT Solver](https://github.com/openSUSE/libsolv) - Official implementation
- [OverlayFS Linux Kernel Docs](https://docs.kernel.org/filesystems/overlayfs.html) - Layer merging concepts
- [PostgreSQL Schema Design Best Practices](https://wiki.postgresql.org/wiki/Database_Schema_Recommendations_for_an_Application) - Official wiki
- [Web-Queue-Worker Pattern - Azure](https://learn.microsoft.com/en-us/azure/architecture/guide/architecture-styles/web-queue-worker) - Microsoft official docs
**Security Incidents:**
- [CHAOS RAT in AUR Packages](https://linuxsecurity.com/features/chaos-rat-in-aur) - July 2025 malware incident
- [Reproducible Builds Documentation](https://reproducible-builds.org/docs/deterministic-build-systems/) - Official guides
- [Linux Kernel Reproducible Builds](https://docs.kernel.org/kbuild/reproducible-builds.html) - Official kernel docs
### Secondary (MEDIUM confidence)
**Comparative Analysis:**
- [React Three Fiber vs Three.js 2026](https://graffersid.com/react-three-fiber-vs-three-js/) - Performance comparison
- [Vite vs Next.js 2025 Comparison](https://strapi.io/blog/vite-vs-nextjs-2025-developer-framework-comparison) - Build tool decision
- [State Management in 2025: Redux, Zustand, Jotai](https://dev.to/hijazi313/state-management-in-2025-when-to-use-context-redux-zustand-or-jotai-2d2k) - Framework comparison
- [FastAPI Architecture Patterns 2026](https://medium.com/algomart/modern-fastapi-architecture-patterns-for-scalable-production-systems-41a87b165a8b) - Architecture guidance
- [SQLAlchemy vs Tortoise ORM Comparison](https://betterstack.com/community/guides/scaling-python/tortoiseorm-vs-sqlalchemy/) - ORM decision
- [Python Package Management: uv vs Poetry](https://medium.com/@hitorunajp/poetry-vs-uv-which-python-package-manager-should-you-use-in-2025-4212cb5e0a14) - Tooling choice
**Domain-Specific:**
- [Custom Archiso Tutorial 2024](https://serverless.industries/2024/12/30/custom-archiso.en.html) - Implementation guide
- [Package Conflict Resolution](https://distropack.dev/Blog/Post?slug=package-conflict-resolution-handling-conflicting-packages) - Dependency issues
- [CachyOS FAQ & Troubleshooting](https://wiki.cachyos.org/cachyos_basic/faq/) - Known issues
- [CachyOS Dependency Errors](https://discuss.cachyos.org/t/recent-package-system-upgrade-caused-many-dependancy-errors/17017) - Upstream breakage example
**Performance & Optimization:**
- [WebGL vs WebGPU Performance](https://medium.com/@sudenurcevik/upgrading-performance-moving-from-webgl-to-webgpu-in-three-js-4356e84e4702) - 3D optimization
- [Building Efficient Three.js Scenes](https://tympanus.net/codrops/2025/02/11/building-efficient-three-js-scenes-optimize-performance-while-maintaining-quality/) - Performance patterns
- [OffscreenCanvas for WebGL](https://evilmartians.com/chronicles/faster-webgl-three-js-3d-graphics-with-offscreencanvas-and-web-workers) - Threading optimization
### Tertiary (LOW confidence)
**User Experience:**
- [10 Linux Mistakes Every Beginner Makes](https://dev.to/techrefreshing/10-linux-mistakes-every-beginner-makes-i-made-all-of-them-4och) - User research, needs validation
- [Choosing Linux Distro 2026](https://dev.to/srijan-xi/navigating-the-switch-how-to-choose-the-right-linux-distro-in-2026-448b) - Target audience insights
- [UX Design Mistakes 2026](https://www.wearetenet.com/blog/ux-design-mistakes) - General UX guidance
---
*Research completed: 2026-01-25*
*Ready for roadmap: yes*