1,190-line research covering all 18 technology areas for PVM: Rust/Axum backend, SvelteKit frontend, Postgres + libSQL databases, NATS + JetStream messaging, DragonflyDB caching, and more. Includes recommended stack summary and open questions. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PVM (Poker Venue Manager) — Tech Stack Research
Generated: 2026-02-08 Status: DRAFT — for discussion and refinement
Table of Contents
- Programming Language
- Backend Framework
- Frontend Framework
- Database Strategy
- Caching Layer
- Message Queue / Event Streaming
- Real-Time Communication
- Auth & Authorization
- API Design
- Local Node Architecture
- Chromecast / Display Streaming
- Mobile Strategy
- Deployment & Infrastructure
- Monitoring & Observability
- Testing Strategy
- Security
- Developer Experience
- CSS / Styling
- Recommended Stack Summary
- Open Questions / Decisions Needed
1. Programming Language
Recommendation: Rust (backend + local node) + TypeScript (frontend + shared types)
Alternatives Considered
| Language | Pros | Cons |
|---|---|---|
| Rust | Memory safety, fearless concurrency, tiny binaries for RPi5, no GC pauses, excellent WebSocket perf | Steeper learning curve, slower compile times |
| Go | Simple, fast compilation, good concurrency | Less expressive type system, GC pauses (minor), larger binaries than Rust |
| TypeScript (full-stack) | One language everywhere, huge ecosystem, fast dev velocity | Node.js memory overhead on RPi5, GC pauses in real-time scenarios, weaker concurrency model |
| Elixir | Built for real-time (Phoenix), fault-tolerant OTP | Small ecosystem, harder to find libs, RPi5 BEAM VM overhead |
Reasoning
Rust is the strongest choice for PVM because of the RPi5 local node constraint. The local node must run reliably on constrained hardware with limited memory, handle real-time tournament clocks, manage offline operations, and sync data. Rust's zero-cost abstractions, lack of garbage collector, and small binary sizes (typically 5-15 MB static binaries) make it ideal for this.
For the cloud backend, Rust's performance means fewer servers and lower hosting costs. A single Rust service can handle thousands of concurrent WebSocket connections with minimal memory overhead — critical for real-time tournament updates across many venues.
The "all code written by Claude Code" constraint actually favors Rust: Claude has excellent Rust fluency, and the compiler's strict type system catches bugs that would otherwise require extensive testing in dynamic languages.
TypeScript remains the right choice for the frontend — the browser ecosystem is TypeScript-native, and sharing type definitions between Rust (via generated OpenAPI types) and TypeScript gives end-to-end type safety.
Gotchas
- Rust compile times can be mitigated with `cargo-watch`, incremental compilation, and `sccache`
- Cross-compilation for RPi5 (ARM64) is well-supported via `cross` or `cargo-zigbuild`
- Shared domain types can be generated from Rust structs to TypeScript via `ts-rs` or OpenAPI codegen
2. Backend Framework
Recommendation: Axum (v0.8+)
Alternatives Considered
| Framework | Pros | Cons |
|---|---|---|
| Axum | Tokio-native, excellent middleware (Tower), lowest memory footprint, growing ecosystem, WebSocket built-in | Younger than Actix |
| Actix Web | Highest raw throughput, most mature | Actor model adds complexity, not Tokio-native (uses own runtime fork) |
| Rocket | Most ergonomic, Rails-like DX | Slower performance, less flexible middleware |
| Loco | Rails-like conventions, batteries-included | Very new (2024), smaller community, opinionated |
Reasoning
Axum is the clear winner for PVM:
- Tokio-native: Axum is built directly on Tokio + Hyper + Tower. Since NATS, database drivers, and WebSocket handling all use Tokio, everything shares one async runtime — no impedance mismatch.
- Tower middleware: The Tower service/layer pattern gives composable middleware for auth, rate limiting, tracing, compression, CORS, etc. Middleware can be shared between HTTP and WebSocket handlers.
- WebSocket support: First-class WebSocket extraction with `axum::extract::ws`, typed WebSocket messages via `axum-typed-websockets`.
- Memory efficiency: Benchmarks consistently place Axum among the lowest in memory footprint per connection — critical when serving thousands of concurrent venue connections.
- OpenAPI integration: The `utoipa` crate provides derive macros for generating OpenAPI 3.1 specs directly from Axum handler types.
- Extractor pattern: Axum's extractor-based request handling maps cleanly to domain operations (extract tenant, extract auth, extract venue context).
Key Libraries
- `axum` — HTTP framework
- `axum-extra` — typed headers, cookie jar, multipart
- `tower` + `tower-http` — middleware stack (CORS, compression, tracing, rate limiting)
- `utoipa` + `utoipa-axum` — OpenAPI spec generation
- `utoipa-swagger-ui` — embedded Swagger UI
- `axum-typed-websockets` — strongly typed WS messages
Gotchas
- Axum's error handling requires careful design — use `thiserror` + a custom error type that implements `IntoResponse`
- Route organization: use `axum::Router::nest()` for modular route trees per domain (tournaments, venues, players)
- State management: use `axum::extract::State` with `Arc<AppState>` — avoid the temptation to put everything in one giant state struct
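To make the first gotcha concrete, here is a std-only sketch of the custom error-type pattern. The enum name, variants, and status mapping are illustrative; in the real service, `thiserror::Error` would replace the hand-written `Display` impl, and the status mapping would live inside an `axum::response::IntoResponse` impl that builds a `(StatusCode, Json<...>)` response.

```rust
use std::fmt;

// Hypothetical app-wide error type. With the `thiserror` crate, the Display
// impl below collapses into `#[derive(Error)]` + `#[error("...")]` attributes.
#[derive(Debug)]
enum AppError {
    NotFound(String),   // e.g. unknown tournament id
    Forbidden,          // caller lacks a permission
    Validation(String), // bad request payload
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::NotFound(what) => write!(f, "{what} not found"),
            AppError::Forbidden => write!(f, "forbidden"),
            AppError::Validation(msg) => write!(f, "validation failed: {msg}"),
        }
    }
}

impl AppError {
    // In Axum this mapping would live in an `IntoResponse` impl; here it is
    // a plain method so the pattern is visible without the framework.
    fn status_code(&self) -> u16 {
        match self {
            AppError::NotFound(_) => 404,
            AppError::Forbidden => 403,
            AppError::Validation(_) => 422,
        }
    }
}

fn main() {
    let err = AppError::NotFound("tournament 456".into());
    println!("{} -> {}", err.status_code(), err);
}
```

The payoff of this design is that every handler can return `Result<_, AppError>` and get a consistent HTTP error shape for free.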
3. Frontend Framework
Recommendation: SvelteKit (Svelte 5 + runes reactivity)
Alternatives Considered
| Framework | Pros | Cons |
|---|---|---|
| SvelteKit | Smallest bundles, true compilation (no virtual DOM), built-in routing/SSR/PWA, Svelte 5 runes are elegant | Smaller ecosystem than React |
| Next.js (React) | Largest ecosystem, most libraries, biggest job market | Vercel lock-in concerns, React hydration overhead, larger bundles, RSC complexity |
| SolidStart | Finest-grained reactivity, near-zero overhead updates | Smallest ecosystem, least mature, fewer component libraries |
| Nuxt (Vue) | Good DX, solid ecosystem | Vue 3 composition API less elegant than Svelte 5 runes |
Reasoning
SvelteKit is the best fit for PVM for several reasons:
- Performance matters for venue displays: Tournament clocks, waiting lists, and seat maps will run on venue TVs via Chromecast. Svelte's compiled output produces minimal JavaScript — the Cast receiver app will load faster and use less memory on Chromecast hardware.
- Real-time UI updates: Svelte 5's fine-grained reactivity (runes: `$state`, `$derived`, `$effect`) means updating a single timer or seat status re-renders only that DOM node, not a virtual DOM diff. This is ideal for dashboards with many independently updating elements.
- PWA support: SvelteKit has first-class service worker support and offline capabilities through `@sveltejs/adapter-static` and `vite-plugin-pwa`.
- Bundle size: SvelteKit produces among the smallest JavaScript bundles of any major framework — important for mobile PWA users on venue WiFi.
- Claude Code compatibility: Svelte's template syntax is straightforward and less boilerplate than React — Claude can generate clean, readable Svelte components efficiently.
- No framework lock-in: Svelte compiles away, so there's no runtime dependency. The output is vanilla JS + DOM manipulation.
UI Component Library
Recommendation: Skeleton UI (Svelte-native) or shadcn-svelte (Tailwind-based, port of shadcn/ui)
shadcn-svelte is particularly compelling because:
- Components are copied into your codebase (not a dependency) — full control
- Built on Tailwind CSS — consistent with the styling recommendation
- Accessible by default (uses Bits UI primitives under the hood)
- Matches the design patterns of the widely-used shadcn/ui ecosystem
Gotchas
- SvelteKit's SSR is useful for the management dashboard, but the Cast receiver and PWA may use `adapter-static` for pure SPA mode
- Svelte's ecosystem is smaller than React's, but for PVM's needs (forms, tables, charts, real-time) it is sufficient
- Svelte 5 (runes) is a significant API change from Svelte 4 — ensure all examples and libraries target Svelte 5
4. Database Strategy
Recommendation: PostgreSQL (cloud primary) + libSQL/SQLite (local node) + Electric SQL or custom sync
Alternatives Considered
| Approach | Pros | Cons |
|---|---|---|
| Postgres cloud + libSQL local + sync | Best of both worlds — Postgres power in cloud, SQLite simplicity on RPi5 | Need sync layer, schema divergence risk |
| Postgres everywhere | One DB engine, simpler mental model | Postgres on RPi5 uses more memory, harder offline |
| libSQL/Turso everywhere | One engine, built-in edge replication | Less powerful for complex cloud queries, multi-tenant partitioning |
| CockroachDB | Distributed, strong consistency | Heavy for RPi5, expensive, overkill |
Detailed Recommendation
Cloud Database: PostgreSQL 16+
- The gold standard for multi-tenant SaaS
- Row-level security (RLS) for tenant isolation
- JSONB for flexible per-venue configuration
- Excellent full-text search for player lookup across venues
- Partitioning by tenant for performance at scale
- Managed options: Neon (serverless, branching for dev), Supabase, or AWS RDS
Local Node Database: libSQL (via Turso's embedded runtime)
- Fork of SQLite with cloud sync capabilities
- Runs embedded in the Rust binary — no separate database process on RPi5
- WAL mode for concurrent reads during tournament operations
- Tiny memory footprint (< 10 MB typical)
- libSQL's Rust driver (`libsql`) is well-maintained
Sync Strategy:
The local node operates on a subset of the cloud data — only data relevant to its venue(s). The sync approach:
- Cloud-to-local: Player profiles, memberships, credit lines pushed to local node via NATS JetStream. Local node maintains a read replica of relevant data in libSQL.
- Local-to-cloud: Tournament results, waitlist changes, transactions pushed to cloud via NATS JetStream with at-least-once delivery. Cloud processes as events.
- Conflict resolution: Last-writer-wins (LWW) with vector clocks for most entities. For financial data (credit lines, buy-ins), use event sourcing — conflicts are impossible because every transaction is an immutable event.
- Offline queue: When disconnected, local node queues mutations in a local WAL-style append-only log. On reconnect, replays in order via NATS.
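The offline queue bullet can be sketched in a few lines of std-only Rust. The `Mutation` shape and subject names are illustrative, not the real PVM schema; the point is the append-only discipline while offline and the ordered drain on reconnect.

```rust
// Sketch of the offline mutation queue: append while disconnected,
// replay in local-timestamp order on reconnect.
#[derive(Debug, Clone, PartialEq)]
struct Mutation {
    local_ts: u64,   // monotonic local timestamp (ms)
    subject: String, // NATS subject it will be published to on replay
    payload: String, // serialized event body
}

#[derive(Default)]
struct OfflineQueue {
    log: Vec<Mutation>, // append-only while the cloud link is down
}

impl OfflineQueue {
    fn append(&mut self, m: Mutation) {
        self.log.push(m);
    }

    // On reconnect: take the whole log and hand it back in timestamp order.
    fn drain_ordered(&mut self) -> Vec<Mutation> {
        let mut out = std::mem::take(&mut self.log);
        out.sort_by_key(|m| m.local_ts);
        out
    }
}

fn main() {
    let mut q = OfflineQueue::default();
    q.append(Mutation { local_ts: 2, subject: "venue.1.waitlist.update".into(), payload: "{}".into() });
    q.append(Mutation { local_ts: 1, subject: "venue.1.seats.3".into(), payload: "{}".into() });
    let replay = q.drain_ordered();
    println!("replaying {} queued mutations", replay.len());
}
```

In the real node the log would live in a libSQL table (surviving restarts), and the drain would publish each mutation to JetStream with a deduplication ID.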
ORM / Query Layer
Recommendation: sqlx (compile-time checked queries)
- `sqlx` checks SQL queries against the actual database schema at compile time
- No ORM abstraction layer — write real SQL, get compile-time safety
- Supports both PostgreSQL and SQLite/libSQL
- Avoids the N+1 query problems that ORMs introduce
- Migrations via `sqlx migrate`
Alternative: sea-orm if you want a full ORM, but for PVM the explicit SQL approach of sqlx gives more control over multi-tenant queries and complex joins.
Migrations
- Use `sqlx migrate` for cloud PostgreSQL migrations
- Maintain parallel migration files for libSQL (SQLite-compatible subset)
- A shared migration test ensures both schemas stay compatible for the sync subset
Gotchas
- PostgreSQL and SQLite have different SQL dialects — the sync subset must use compatible types (no Postgres-specific types in synced tables)
- libSQL's `VECTOR` type is interesting for future player similarity features but not needed initially
- Turso's hosted libSQL replication is an option but adds a dependency — prefer embedded libSQL with custom NATS-based sync for more control
- Schema versioning must be tracked on the local node so the cloud knows what schema version it's talking to
5. Caching Layer
Recommendation: DragonflyDB
Alternatives Considered
| Option | Pros | Cons |
|---|---|---|
| DragonflyDB | Up to 25x Redis throughput (vendor benchmarks), Redis-compatible API, multi-threaded, lower memory usage | Younger project, smaller community |
| Redis 7+ | Most mature, largest ecosystem, Redis Stack modules | Single-threaded core, BSL license concerns since Redis 7.4 |
| Valkey | Redis fork, community-driven, BSD license | Still catching up to Redis feature parity |
| KeyDB | Multi-threaded Redis fork | Development appears stalled (no updates in 1.5+ years) |
| No cache (just Postgres) | Simpler architecture | Higher DB load, slower for session/real-time data |
Reasoning
DragonflyDB is the right choice for PVM:
- Redis API compatibility: Drop-in replacement — all Redis client libraries work unchanged. The `fred` Rust crate (async Redis client) works with DragonflyDB out of the box.
- Multi-threaded architecture: DragonflyDB uses all available CPU cores, unlike Redis's single-threaded model. This matters when caching tournament state for hundreds of concurrent venues.
- Memory efficiency: DragonflyDB uses up to 80% less memory than Redis for the same dataset — important for keeping infrastructure costs low.
- No license concerns: DragonflyDB uses BSL 1.1 (converts to open source after 4 years). Redis switched to a dual-license model that's more restrictive. Valkey is BSD but is playing catch-up.
- Pub/Sub: DragonflyDB supports Redis Pub/Sub — useful as a lightweight complement to NATS for in-process event distribution within the backend cluster.
What to Cache
- Session data: User sessions, JWT refresh tokens
- Tournament state: Current level, blinds, clock, player counts (hot read path)
- Waiting lists: Ordered sets per venue/game type
- Rate limiting: API rate limit counters
- Player lookup cache: Frequently accessed player profiles
- Seat maps: Current table/seat assignments per venue
What NOT to Cache (use Postgres directly)
- Financial transactions (credit lines, buy-ins) — always hit the source of truth
- Audit logs
- Historical tournament data
Local Node: No DragonflyDB
The RPi5 local node should not run DragonflyDB. libSQL is fast enough for local caching needs, and adding another process increases complexity and memory usage on constrained hardware. Use in-memory Rust data structures (e.g., DashMap, moka cache crate) for hot local state.
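To illustrate the role `moka` (or `DashMap`) fills on the local node, here is a std-only TTL cache sketch: values expire after a fixed time-to-live so stale tournament state is never served. The struct and key names are illustrative; `moka` adds size bounds, eviction policies, and lock-free concurrent access on top of this idea.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Minimal single-threaded TTL cache, standing in for what `moka` provides.
struct TtlCache<V> {
    ttl: Duration,
    entries: HashMap<String, (Instant, V)>, // key -> (inserted-at, value)
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    fn insert(&mut self, key: &str, value: V) {
        self.entries.insert(key.to_string(), (Instant::now(), value));
    }

    // Returns the value only while it is still fresh.
    fn get(&self, key: &str) -> Option<V> {
        self.entries
            .get(key)
            .and_then(|(at, v)| (at.elapsed() < self.ttl).then(|| v.clone()))
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_secs(5));
    cache.insert("clock:venue1", 42u32); // e.g. seconds remaining in level
    println!("cached value: {:?}", cache.get("clock:venue1"));
}
```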
Gotchas
- DragonflyDB's replication features are less mature than Redis Sentinel/Cluster — use managed hosting or keep it simple with a single node + persistence initially
- Monitor DragonflyDB's release cycle — it's actively developed but younger than Redis
- Keep the cache layer optional — the system should function (slower) without it
6. Message Queue / Event Streaming
Recommendation: NATS + JetStream
Alternatives Considered
| Option | Pros | Cons |
|---|---|---|
| NATS + JetStream | Lightweight (single binary, ~20MB), sub-ms latency, built-in persistence, embedded mode, perfect for edge | Smaller community than Kafka |
| Apache Kafka | Highest throughput, mature, excellent tooling | Heavy (JVM, ZooKeeper/KRaft), 4GB+ RAM minimum, overkill for PVM's scale |
| RabbitMQ | Mature AMQP, sophisticated routing | Higher latency (5-20ms), more memory, Erlang ops complexity |
| Redis Streams | Simple, already have cache layer | Not designed for reliable message delivery at scale |
Reasoning
NATS + JetStream is purpose-built for PVM's architecture:
- Edge-native: NATS can run as a leaf node on the RPi5, connecting to the cloud NATS cluster. This is the core of the local-to-cloud sync architecture. When the connection drops, JetStream buffers messages locally and replays them on reconnect.
- Lightweight: NATS server is a single ~20 MB binary. On the RPi5 it uses ~50 MB RAM. Compare to Kafka's 4 GB minimum.
- Sub-millisecond latency: Core NATS delivers messages in < 1 ms. JetStream (persistent) adds 1-5 ms. This is critical for real-time tournament updates — when a player busts, every connected display should update within milliseconds.
- Subject-based addressing: NATS subjects map perfectly to PVM's domain:
  - `venue.{venue_id}.tournament.{id}.clock` — tournament clock ticks
  - `venue.{venue_id}.waitlist.update` — waiting list changes
  - `venue.{venue_id}.seats.{table_id}` — seat assignments
  - `player.{player_id}.notifications` — player-specific events
  - `sync.{node_id}.upstream` — local node to cloud sync
  - `sync.{node_id}.downstream` — cloud to local node sync
- Built-in patterns: Request/reply (for RPC between cloud and node), pub/sub (for broadcasts), queue groups (for load-balanced consumers), key-value store (for distributed config), object store (for binary data like player photos).
- JetStream for durability: Tournament results, financial transactions, and sync operations need guaranteed delivery. JetStream provides at-least-once and (with publish deduplication) exactly-once delivery semantics with configurable retention.
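The subject scheme above is just string composition, which makes it easy to test. Here is a std-only sketch of a subject builder plus a simplified NATS-style matcher (in NATS, `*` matches exactly one token and `>` matches one or more trailing tokens). Function names are illustrative.

```rust
// Build the clock subject from the naming scheme above.
fn clock_subject(venue_id: u64, tournament_id: u64) -> String {
    format!("venue.{venue_id}.tournament.{tournament_id}.clock")
}

// Simplified NATS wildcard matching: '*' = one token, '>' = the rest.
fn subject_matches(pattern: &str, subject: &str) -> bool {
    let mut pat = pattern.split('.');
    let mut sub = subject.split('.');
    loop {
        match (pat.next(), sub.next()) {
            (None, None) => return true,           // both exhausted: match
            (Some(">"), Some(_)) => return true,   // '>' swallows the remainder
            (Some("*"), Some(_)) => continue,      // '*' matches any one token
            (Some(p), Some(s)) if p == s => continue,
            _ => return false,                     // length or token mismatch
        }
    }
}

fn main() {
    let subj = clock_subject(123, 456);
    println!("{subj} matched: {}", subject_matches("venue.*.tournament.>", &subj));
}
```

Note the practical consequence: a gateway interested in everything under a tournament should subscribe with `>` (e.g. `venue.123.tournament.>`), since `*` only spans a single token.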
Architecture
RPi5 Local Node Cloud
┌──────────────┐ ┌──────────────────┐
│ NATS Leaf │◄──── TLS ────►│ NATS Cluster │
│ Node │ (auto- │ (3-node) │
│ │ reconnect) │ │
│ JetStream │ │ JetStream │
│ (local buf) │ │ (persistent) │
└──────────────┘ └──────────────────┘
Gotchas
- NATS JetStream's exactly-once semantics require careful consumer design — use idempotent handlers with deduplication IDs
- Subject namespace design is critical — plan it early, changing later is painful
- NATS leaf nodes need TLS configured for secure cloud connection
- Monitor JetStream stream sizes on RPi5 — set max bytes limits to avoid filling the SD card during extended offline periods
- The `async-nats` Rust crate is the official async client — well maintained and Tokio-native
7. Real-Time Communication
Recommendation: WebSockets (via Axum) for interactive clients + NATS for backend fan-out + SSE as fallback
Alternatives Considered
| Option | Pros | Cons |
|---|---|---|
| WebSockets | Full duplex, low latency, wide support | Requires connection management, can't traverse some proxies |
| Server-Sent Events (SSE) | Simpler, auto-reconnect, HTTP-native | Server-to-client only, no binary support |
| WebTransport | HTTP/3, multiplexed streams, unreliable mode | Very new, limited browser support, no Chromecast support |
| Socket.IO | Auto-fallback, rooms, namespaces | Node.js-centric, adds overhead, not Rust-native |
| gRPC streaming | Typed, efficient, bidirectional | Not browser-native (needs grpc-web proxy), overkill |
Architecture
The real-time pipeline has three layers:
- NATS (backend event bus): All state changes publish to NATS subjects. This is the single source of real-time truth. Both cloud services and local nodes publish here.
- WebSocket Gateway (Axum): A dedicated Axum service subscribes to relevant NATS subjects and fans out to connected WebSocket clients. Each client subscribes to the venues/tournaments they care about.
- SSE Fallback: For environments where WebSockets are blocked (some corporate networks), provide an SSE endpoint that delivers the same event stream. SSE's built-in auto-reconnect with `Last-Event-ID` makes resumption simple.
Flow Example: Tournament Clock Update
Tournament Service (Rust)
→ publishes to NATS: venue.123.tournament.456.clock {level: 5, time_remaining: 1200}
  → WebSocket Gateway subscribes to venue.123.tournament.> (NATS `*` spans one token, so `>` is needed to cover the trailing .clock segment)
→ fans out to all connected clients watching tournament 456
→ Chromecast receiver app gets update, renders clock
→ PWA on player's phone gets update, shows current level
Implementation Details
- Use `axum::extract::ws::WebSocket` with `tokio::select!` to multiplex NATS subscription + client messages
- Implement heartbeat/ping-pong to detect stale connections (30 s interval, 10 s timeout)
- Client reconnection with exponential backoff + subscription replay from NATS JetStream
- Binary message format: consider MessagePack (`rmp-serde`) for compact payloads over WebSocket, with JSON as a human-readable fallback
- Connection limits: track per-venue connection count, implement backpressure
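The exponential-backoff bullet can be made concrete with a tiny std-only schedule function. The base/cap parameters are illustrative defaults, not values from the source; a production client would also add random jitter so many displays don't reconnect in lockstep.

```rust
use std::time::Duration;

// Delay before reconnection attempt `attempt` (0-based): exponential growth
// from a base delay, saturating on overflow, capped at `cap_ms`.
fn reconnect_delay(attempt: u32, base_ms: u64, cap_ms: u64) -> Duration {
    let factor = 1u64 << attempt.min(16); // clamp the shift to avoid overflow
    Duration::from_millis(base_ms.saturating_mul(factor).min(cap_ms))
}

fn main() {
    // Hypothetical tuning: 500 ms base, 30 s cap.
    for attempt in 0..6 {
        println!("attempt {attempt}: wait {:?}", reconnect_delay(attempt, 500, 30_000));
    }
}
```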
Gotchas
- WebSocket connections are stateful — need sticky sessions or a connection registry if running multiple gateway instances
- Chromecast receiver apps have limited WebSocket support — test thoroughly on actual hardware
- Mobile PWAs going to background will drop WebSocket connections — design for reconnection and state catch-up
- Rate limit outbound messages to prevent flooding slow clients (tournament clock ticks should be throttled to 1/second for display, even if internal state updates more frequently)
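The last gotcha — throttling clock ticks to one message per second per display — amounts to keeping a "last sent" timestamp per client. A std-only sketch (struct and method names are illustrative):

```rust
use std::time::{Duration, Instant};

// Per-client outbound throttle: internal state may update many times a
// second, but at most one message per `interval` is forwarded.
struct Throttle {
    interval: Duration,
    last_sent: Option<Instant>,
}

impl Throttle {
    fn new(interval: Duration) -> Self {
        Self { interval, last_sent: None }
    }

    // Returns true when a message may be forwarded to the client now.
    fn allow(&mut self, now: Instant) -> bool {
        match self.last_sent {
            Some(t) if now.duration_since(t) < self.interval => false,
            _ => {
                self.last_sent = Some(now);
                true
            }
        }
    }
}

fn main() {
    let mut t = Throttle::new(Duration::from_secs(1));
    let now = Instant::now();
    println!("first tick forwarded: {}", t.allow(now));
    println!("immediate second tick forwarded: {}", t.allow(now));
}
```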
8. Auth & Authorization
Recommendation: Custom JWT auth with Postgres-backed RBAC + optional OAuth2 social login
Alternatives Considered
| Option | Pros | Cons |
|---|---|---|
| Custom JWT + RBAC | Full control, no vendor dependency, works offline on local node | Must implement everything yourself |
| Auth0 / Clerk | Managed, social login, MFA out of box | Vendor lock-in, cost scales with users, doesn't work offline |
| Keycloak | Self-hosted, full-featured, OIDC/SAML | Heavy (Java), complex to operate, overkill |
| Ory (Kratos + Keto) | Open source, cloud-native, API-first | Multiple services to deploy, newer |
| Lucia Auth | Lightweight, framework-agnostic | TypeScript-only, no Rust support |
Architecture
PVM's auth has a unique challenge: cross-venue universal player accounts that must work both online (cloud) and offline (local node). This rules out purely managed auth services.
Token Strategy:
Access Token (JWT, short-lived: 15 min)
├── sub: player_id (universal)
├── tenant_id: current operator
├── venue_id: current venue (if applicable)
├── roles: ["player", "dealer", "floor_manager", "admin"]
├── permissions: ["tournament.manage", "waitlist.view", ...]
└── iat, exp, iss
Refresh Token (opaque, stored in DB/DragonflyDB, long-lived: 30 days)
└── Rotated on each use, old tokens invalidated
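The access-token claims in the tree above map directly to a Rust struct. This std-only sketch shows the shape plus the expiry and role checks; in the real backend the struct would derive `serde::{Serialize, Deserialize}` and be signed/verified with the `jsonwebtoken` crate (Ed25519), as recommended below. Field values are hypothetical.

```rust
// Access-token claims, mirroring the token tree above.
#[allow(dead_code)]
struct Claims {
    sub: String,        // player_id (universal)
    tenant_id: String,  // current operator
    roles: Vec<String>, // e.g. ["floor_manager"]
    iat: u64,           // issued-at (unix seconds)
    exp: u64,           // expiry (unix seconds) — iat + 15 min
}

impl Claims {
    fn is_expired(&self, now_unix: u64) -> bool {
        now_unix >= self.exp
    }

    fn has_role(&self, role: &str) -> bool {
        self.roles.iter().any(|r| r == role)
    }
}

fn main() {
    let claims = Claims {
        sub: "player-1".into(),
        tenant_id: "op-1".into(),
        roles: vec!["floor_manager".into()],
        iat: 1_000,
        exp: 1_000 + 15 * 60,
    };
    println!("expired at t=1500? {}", claims.is_expired(1_500));
}
```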
RBAC Model:
Operator (tenant)
├── Admin — full control over all venues
├── Manager — manage specific venues
├── Floor Manager — tournament/table operations at a venue
├── Dealer — assigned to tables, report results
└── Player — universal account, cross-venue
├── can self-register
├── has memberships per venue
└── has credit lines per venue (managed by admin)
Key Design Decisions:
- Tenant-scoped roles: A user can be an admin in one operator's venues and a player in another. The `(user_id, operator_id, role)` triple is the authorization unit.
- Offline auth on local node: The local node caches valid JWT signing keys and a subset of user credentials (hashed). Players can authenticate locally when the cloud is unreachable. New registrations queue for cloud sync.
- JWT signing: Use Ed25519 (fast, small signatures) via the `jsonwebtoken` crate. The cloud signs tokens; the local node can verify them with the public key. For offline token issuance, the local node has a delegated signing key.
- Password hashing: The `argon2` crate — memory-hard, resistant to GPU attacks. Tune parameters for the RPi5 (lower memory cost than in the cloud).
- Social login (optional, cloud-only): Support Google/Apple sign-in for player accounts via standard OAuth2 flows. Map social identities to the universal player account.
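The tenant-scoped-roles decision is easy to demonstrate in std-only Rust: the grant set is keyed by the full `(user_id, operator_id, role)` triple, so the same user can hold different roles under different operators. Identifiers and role names here are illustrative; the real store would be Postgres tables with RLS.

```rust
use std::collections::HashSet;

// Minimal in-memory model of the (user_id, operator_id, role) grant set.
#[derive(Default)]
struct Rbac {
    grants: HashSet<(String, String, String)>,
}

impl Rbac {
    fn grant(&mut self, user: &str, operator: &str, role: &str) {
        self.grants
            .insert((user.to_string(), operator.to_string(), role.to_string()));
    }

    fn has_role(&self, user: &str, operator: &str, role: &str) -> bool {
        let key = (user.to_string(), operator.to_string(), role.to_string());
        self.grants.contains(&key)
    }
}

fn main() {
    let mut rbac = Rbac::default();
    rbac.grant("u1", "op-a", "admin");  // admin for operator A's venues
    rbac.grant("u1", "op-b", "player"); // plain player under operator B
    println!("u1 admin at op-b? {}", rbac.has_role("u1", "op-b", "admin"));
}
```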
Gotchas
- Token revocation is hard with JWTs — use short expiry (15 min) + refresh token rotation + a lightweight blocklist in DragonflyDB for immediate revocation
- Cross-venue account linking: when a player signs up at venue A and later visits venue B (different operator), they should be recognized. Use email/phone as the universal identifier with verification.
- Local node token issuance must be time-limited and logged — cloud should audit all locally-issued tokens on sync
- Rate limit login attempts both on cloud and local node to prevent brute force
9. API Design
Recommendation: REST + OpenAPI 3.1 with generated TypeScript client
Alternatives Considered
| Approach | Pros | Cons |
|---|---|---|
| REST + OpenAPI | Universal, tooling-rich, generated clients, cacheable | Overfetching possible, multiple round trips |
| GraphQL | Flexible queries, single endpoint, good for complex UIs | Complexity overhead, caching harder, Rust support less mature |
| tRPC | Zero-config type safety | TypeScript-only — cannot use with Rust backend |
| gRPC | Efficient binary protocol, streaming | Needs proxy for browsers, overkill for this use case |
Reasoning
tRPC is ruled out because it requires both client and server to be TypeScript. With a Rust backend, this is not viable.
REST + OpenAPI is the best approach because:
- Generated type safety: Use `utoipa` to generate OpenAPI 3.1 specs from Rust types, then `openapi-typescript` to generate TypeScript types for the frontend. Changes to the Rust API automatically propagate to the frontend types.
- Cacheable: REST's HTTP semantics enable CDN caching, ETag support, and conditional requests — important for player profiles and tournament structures that change infrequently.
- Universal clients: The REST API will also be consumed by the Chromecast receiver app, the local node sync layer, and potentially third-party integrations. OpenAPI makes all of these easy.
- Tooling: Swagger UI for exploration, `openapi-fetch` for the TypeScript client (type-safe fetch wrapper), Postman/Insomnia for testing.
API Conventions
# Resource-based URLs
GET /api/v1/venues/{venue_id}/tournaments
POST /api/v1/venues/{venue_id}/tournaments
GET /api/v1/venues/{venue_id}/tournaments/{id}
PATCH /api/v1/venues/{venue_id}/tournaments/{id}
# Actions as sub-resources
POST /api/v1/venues/{venue_id}/tournaments/{id}/start
POST /api/v1/venues/{venue_id}/tournaments/{id}/pause
POST /api/v1/venues/{venue_id}/waitlists/{id}/join
POST /api/v1/venues/{venue_id}/waitlists/{id}/call/{player_id}
# Cross-venue player operations
GET /api/v1/players/me
GET /api/v1/players/{id}/memberships
POST /api/v1/players/{id}/credit-lines
# Real-time subscriptions
WS /api/v1/ws?venue={id}&subscribe=tournament.clock,waitlist.updates
Type Generation Pipeline
Rust structs (serde + utoipa derive)
→ OpenAPI 3.1 JSON spec (generated at build time)
→ openapi-typescript (CI step)
→ TypeScript types + openapi-fetch client
→ SvelteKit frontend consumes typed API
Gotchas
- Version the API from day one (`/api/v1/`) — breaking changes go in `/api/v2/`
- Use cursor-based pagination for lists (not offset-based) — more efficient and handles concurrent inserts
- Standardize error responses: `{ error: { code: string, message: string, details?: any } }`
- Consider a lightweight BFF (Backend-for-Frontend) pattern in SvelteKit's server routes for aggregating multiple API calls into one page load
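The cursor-pagination gotcha can be sketched without any framework: the cursor is an opaque string encoding the sort key of the last row seen, so the next page is fetched with a keyset predicate like `WHERE (created_at, id) > ($1, $2)`. The `ts:id` cursor format here is an illustrative assumption — production cursors are usually base64-wrapped and may be signed.

```rust
// Encode the (created_at, id) sort key of the last row into an opaque cursor.
fn encode_cursor(created_at: u64, id: u64) -> String {
    format!("{created_at}:{id}")
}

// Decode a cursor back into the keyset values; None on any malformed input,
// which the API layer should translate into a 400-style validation error.
fn decode_cursor(cursor: &str) -> Option<(u64, u64)> {
    let (ts, id) = cursor.split_once(':')?;
    Some((ts.parse().ok()?, id.parse().ok()?))
}

fn main() {
    let cursor = encode_cursor(1_700_000_000, 42);
    println!("next page starts after {:?}", decode_cursor(&cursor));
}
```

Unlike offset pagination, this stays correct when rows are inserted concurrently: the keyset predicate never skips or repeats items as page offsets shift.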
10. Local Node Architecture
Recommendation: Single Rust binary running on RPi5 with embedded libSQL, NATS leaf node, and local HTTP/WS server
What Runs on the RPi5
┌─────────────────────────────────────────────────────┐
│ PVM Local Node (single Rust binary, ~15-20 MB) │
│ │
│ ┌──────────────┐ ┌──────────────┐ │
│ │ HTTP/WS │ │ NATS Leaf │ │
│ │ Server │ │ Node │ │
│ │ (Axum) │ │ (embedded or │ │
│ │ │ │ sidecar) │ │
│ └──────┬───────┘ └──────┬───────┘ │
│ │ │ │
│ ┌──────┴──────────────────┴───────┐ │
│ │ Application Core │ │
│ │ - Tournament engine │ │
│ │ - Clock manager │ │
│ │ - Waitlist manager │ │
│ │ - Seat assignment │ │
│ │ - Sync orchestrator │ │
│ └──────────────┬───────────────────┘ │
│ │ │
│ ┌──────────────┴───────────────────┐ │
│ │ libSQL (embedded) │ │
│ │ - Venue data subset │ │
│ │ - Offline mutation queue │ │
│ │ - Local auth cache │ │
│ └───────────────────────────────────┘ │
│ │
│ ┌───────────────────────────────────┐ │
│ │ moka in-memory cache │ │
│ │ - Hot tournament state │ │
│ │ - Active session tokens │ │
│ └───────────────────────────────────┘ │
└─────────────────────────────────────────────────────┘
Offline Operations
When the cloud connection drops, the local node continues operating:
- Tournament operations: Clock continues, blinds advance, players bust/rebuy — all local state
- Waitlist management: Players can join/leave waitlists — queued for cloud sync
- Seat assignments: Floor managers can move players between tables locally
- Player auth: Cached credentials allow existing players to log in. New registrations queued.
- Financial operations: Buy-ins and credit transactions logged locally with offline flag. Cloud reconciles on reconnect.
Sync Protocol
On reconnect:
1. Local node sends its last-seen cloud sequence number
2. Cloud sends all events since that sequence (via NATS JetStream replay)
3. Local node sends its offline mutation queue (ordered by local timestamp)
4. Cloud processes mutations, detects conflicts, responds with resolution
5. Local node applies cloud resolutions, updates local state
6. Both sides confirm sync complete
Conflict Resolution Strategy
| Data Type | Strategy | Reasoning |
|---|---|---|
| Tournament state | Cloud wins | Only one node runs a tournament at a time |
| Waitlist | Merge (union) | Both sides can add/remove; merge and re-order by timestamp |
| Player profiles | Cloud wins (LWW) | Cloud is the authority for universal accounts |
| Credit transactions | Append-only (event sourcing) | No conflicts — every transaction is immutable |
| Seat assignments | Local wins during offline | Floor manager's local decisions take precedence |
| Dealer schedules | Cloud wins | Schedules are set centrally |
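The strategy table above maps naturally onto a single dispatch function on the sync orchestrator. Entity-kind strings and the enum name are illustrative; the defaulting rule (cloud wins for unknown kinds) is an added assumption chosen as the conservative fallback.

```rust
// Resolution outcomes from the conflict table above.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Resolution {
    CloudWins,  // cloud copy replaces local
    LocalWins,  // local offline decision is kept
    Merge,      // union both sides, re-order by timestamp
    AppendBoth, // immutable events: keep everything, no conflict possible
}

fn resolve(entity_kind: &str) -> Resolution {
    match entity_kind {
        "tournament_state" | "player_profile" | "dealer_schedule" => Resolution::CloudWins,
        "waitlist" => Resolution::Merge,
        "credit_transaction" => Resolution::AppendBoth, // event-sourced
        "seat_assignment" => Resolution::LocalWins,     // floor manager's call
        _ => Resolution::CloudWins,                     // assumed safe default
    }
}

fn main() {
    println!("seat_assignment resolves as {:?}", resolve("seat_assignment"));
}
```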
RPi5 System Setup
- OS: Raspberry Pi OS Lite (64-bit, Debian Bookworm-based) — no desktop environment
- Storage: 32 GB+ microSD or USB SSD (recommended for durability)
- Auto-start: systemd service for the PVM binary
- Updates: OTA binary updates via a self-update mechanism (download new binary, verify signature, swap, restart)
- Watchdog: Hardware watchdog timer to auto-reboot if the process hangs
- Networking: Ethernet preferred (reliable), WiFi as fallback. mDNS for local discovery.
Gotchas
- The RPi5 ships in multiple RAM configurations — target the 8 GB model, and budget ~200 MB for the PVM process + NATS
- SD card wear: use an external USB SSD for the libSQL database if heavy write operations are expected
- Time synchronization: use the `chrony` NTP client — accurate timestamps are critical for conflict resolution and tournament clocks
- Power loss: libSQL in WAL mode is crash-safe, but implement a clean shutdown handler (SIGTERM) that flushes state
- Security: the RPi5 is physically accessible in venues — encrypt the libSQL database at rest, disable SSH password auth, use key-only
11. Chromecast / Display Streaming
Recommendation: Google Cast SDK with a Custom Web Receiver (SvelteKit static app)
Architecture
┌──────────────┐ Cast SDK ┌──────────────────┐
│ Sender App │ ──────────────► │ Custom Web │
│ (PVM Admin │ (discovers & │ Receiver │
│ Dashboard) │ launches) │ (SvelteKit SPA) │
│ │ │ │
│ or │ │ Hosted at: │
│ │ │ cast.pvmapp.com │
│ Local Node │ │ │
│ HTTP Server │ │ Connects to WS │
│ │ │ for live updates │
└──────────────┘ └────────┬───────────┘
│
┌────────▼───────────┐
│ Chromecast Device │
│ (renders receiver) │
└────────────────────┘
Custom Web Receiver
The Cast receiver is a separate SvelteKit static app that:
- Loads on the Chromecast device when cast is initiated
- Connects to the PVM WebSocket endpoint (cloud or local node, depending on network)
- Subscribes to venue-specific events (tournament clock, waitlist, seat map)
- Renders full-screen display layouts:
- Tournament clock: Large timer, current level, blind structure, next break
- Waiting list: Player queue by game type, estimated wait times
- Table status: Open seats, game types, stakes per table
- Custom messages: Announcements, promotions
Display Manager
A venue can have multiple Chromecast devices showing different content:
- TV 1: Tournament clock (main)
- TV 2: Cash game waiting list
- TV 3: Table/seat map
- TV 4: Rotating between tournament clock and waiting list
The Display Manager (part of the admin dashboard) lets floor managers:
- Assign content to each Chromecast device
- Configure rotation/cycling between views
- Send one-time announcements to all screens
- Adjust display themes (dark/light, font size, venue branding)
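The rotation behaviour (e.g. TV 4 above) reduces to pure time arithmetic, which the receiver can evaluate locally without extra messages from the server. A std-only sketch; the view names and 30-second interval are illustrative.

```rust
// A device's rotation config: an ordered list of views and a dwell interval.
struct Rotation {
    views: Vec<String>,
    interval_secs: u64,
}

impl Rotation {
    // Which view should be on screen `elapsed_secs` after rotation start.
    fn current(&self, elapsed_secs: u64) -> Option<&str> {
        if self.views.is_empty() || self.interval_secs == 0 {
            return None;
        }
        let idx = (elapsed_secs / self.interval_secs) as usize % self.views.len();
        Some(self.views[idx].as_str())
    }
}

fn main() {
    let tv4 = Rotation {
        views: vec!["tournament_clock".into(), "waitlist".into()],
        interval_secs: 30,
    };
    println!("at t=35s, TV 4 shows {:?}", tv4.current(35));
}
```

Driving the rotation from elapsed time (rather than a ticking counter) means a receiver that reloads mid-session lands on the correct view immediately.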
Technical Details
- Register the receiver app with Google Cast Developer Console (one-time setup, $5 fee)
- Use Cast Application Framework (CAF) Receiver SDK v3
- The receiver app is a standard web page — can use any web framework (SvelteKit static build)
- Sender integration: use the `cast.framework.CastContext` API in the admin dashboard
- For local network casting (offline mode): the local node serves the receiver app directly, and the Chromecast connects to the local node's IP
- Consider also supporting generic HDMI displays via a simple browser in kiosk mode (Chromium on a secondary RPi or mini PC) as a non-Chromecast fallback
Gotchas
- Chromecast devices have limited memory and CPU — keep the receiver app lightweight (Svelte is ideal here)
- Cast sessions can timeout after inactivity — implement keep-alive messages
- Chromecast requires an internet connection for initial app load (it fetches the receiver URL from Google's servers) — for fully offline venues, the kiosk-mode browser fallback is essential
- Test on actual Chromecast hardware early — the developer emulator doesn't catch all issues
- Cast SDK requires HTTPS for the receiver URL in production (self-signed certs won't work on Chromecast)
12. Mobile Strategy
Recommendation: PWA first (SvelteKit), with Capacitor wrapper for app store presence when needed
Alternatives Considered
| Approach | Pros | Cons |
|---|---|---|
| PWA (SvelteKit) | One codebase, instant updates, no app store, works offline | Limited native API access, no push on iOS (improving), discoverability |
| Capacitor (hybrid) | PWA + native shell, access native APIs, app store distribution | Thin WebView wrapper, some performance overhead |
| Tauri Mobile | Rust backend, small size | Mobile support very early (alpha/beta), limited ecosystem |
| React Native | True native UI, large ecosystem | Separate codebase from web, React dependency, not Svelte |
| Flutter | Excellent cross-platform, single codebase | Dart language, separate from web entirely |
Reasoning
PVM's mobile needs are primarily consumption-oriented — players check tournament schedules, waiting list position, and receive notifications. This is a perfect fit for a PWA:
- PWA first: The SvelteKit app with `vite-plugin-pwa` already provides offline caching, add-to-home-screen, and background sync. For most players, this is sufficient.
- Capacitor wrap when needed: When iOS push notifications, Apple Pay, or app store presence becomes important, wrap the existing SvelteKit PWA in Capacitor. Capacitor runs the same web app in a native WebView and provides JavaScript bridges to native APIs.
- Tauri Mobile is not ready: As of 2026, Tauri 2.0's mobile support exists but is still maturing. It would be a good fit architecturally (Rust backend + web frontend), but the plugin ecosystem and build tooling aren't as polished as Capacitor's. Revisit in 12-18 months.
PWA Features for PVM
- Service Worker: Cache tournament schedules, player profile, venue info for offline access
- Push Notifications: Web Push API for tournament start reminders, waitlist calls (Android + iOS 16.4+)
- Add to Home Screen: App-like experience without app store
- Background Sync: Queue waitlist join/leave actions when offline, sync when back online
- Share Target: Accept shared tournament links
Gotchas
- iOS PWA support is improving but still has limitations (no background fetch, limited push notification payload)
- Capacitor requires maintaining iOS/Android build pipelines — only add this when there's a clear need
- Test PWA on actual mobile devices in venues — WiFi quality varies dramatically
- Deep linking: configure universal links / app links so shared tournament URLs open in the PWA/app
13. Deployment & Infrastructure
Recommendation: Fly.io (primary cloud) + Docker containers + GitHub Actions CI/CD
Alternatives Considered
| Platform | Pros | Cons |
|---|---|---|
| Fly.io | Edge deployment, built-in Postgres, simple scaling, good pricing, Rust-friendly | CLI-first workflow, no built-in CI/CD |
| Railway | Excellent DX, GitHub integration, preview environments | Less edge presence, newer |
| AWS (ECS/Fargate) | Full control, enterprise grade, broadest service catalog | Complex, expensive operations overhead |
| Render | Simple, good free tier | Less flexible networking, no edge |
| Hetzner + manual | Cheapest, full control | Operations burden, no managed services |
Reasoning
Fly.io is the best fit for PVM:
- Edge deployment: Fly.io runs containers close to users. For a poker venue SaaS with venues in multiple cities/countries, edge deployment means lower latency for real-time tournament updates.
- Built-in Postgres: Fly Postgres is managed, with automatic failover and point-in-time recovery.
- Fly Machines: Fine-grained control over machine placement — can run NATS, DragonflyDB, and the API server as separate Fly machines.
- Rust-friendly: Fly.io's multi-stage Docker builds work well for Rust (build on large machine, deploy tiny binary).
- Private networking: Fly's WireGuard mesh enables secure communication between services without exposing ports publicly. The RPi5 local nodes can use Fly's WireGuard to connect to the cloud NATS cluster.
- Reasonable pricing: Pay-as-you-go, no minimum commitment. Scale to zero for staging environments.
Infrastructure Layout
Fly.io Cloud
├── pvm-api (Axum, 2+ instances, auto-scaled)
├── pvm-ws-gateway (Axum WebSocket, 2+ instances)
├── pvm-nats (NATS cluster, 3 nodes)
├── pvm-db (Fly Postgres, primary + replica)
├── pvm-cache (DragonflyDB, single node)
└── pvm-worker (background jobs: sync processing, notifications)
Venue (RPi5)
└── pvm-node (single Rust binary + NATS leaf node)
└── connects to pvm-nats via WireGuard/TLS
CI/CD Pipeline (GitHub Actions)
# Triggered on push to main
1. Lint (clippy, eslint)
2. Test (cargo test, vitest, playwright)
3. Build (multi-stage Docker for cloud, cross-compile for RPi5)
4. Deploy staging (auto-deploy to Fly.io staging)
5. E2E tests against staging
6. Deploy production (manual approval gate)
7. Publish RPi5 binary (signed, to update server)
Gotchas
- Fly.io Postgres is not fully managed — you still need to handle major version upgrades and backup verification
- Use multi-stage Docker builds to keep Rust image sizes small (builder stage with `rust:bookworm`, runtime stage with `debian:bookworm-slim` or `distroless`)
- Pin Fly.io machine regions to match your target markets — don't spread too thin initially
- Set up blue-green deployments for zero-downtime upgrades
- The RPi5 binary update mechanism needs a rollback strategy — keep the previous binary and a fallback boot option
14. Monitoring & Observability
Recommendation: OpenTelemetry (traces + metrics + logs) exported to Grafana Cloud (or self-hosted Grafana + Loki + Tempo + Prometheus)
Alternatives Considered
| Stack | Pros | Cons |
|---|---|---|
| OpenTelemetry + Grafana | Vendor-neutral, excellent Rust support, unified pipeline | Some setup required |
| Datadog | All-in-one, excellent UX | Expensive at scale, vendor lock-in |
| New Relic | Good APM | Cost, Rust support less first-class |
| Sentry | Excellent error tracking | Limited metrics/traces, complementary rather than primary |
Rust Instrumentation Stack
# Key crates
tracing = "0.1" # Structured logging/tracing facade
tracing-subscriber = "0.3" # Log formatting, filtering
tracing-opentelemetry = "0.28" # Bridge tracing → OpenTelemetry
opentelemetry = "0.28" # OTel SDK
opentelemetry-otlp = "0.28" # OTLP exporter
opentelemetry-semantic-conventions # Standard attribute names
What to Monitor
Application Metrics:
- Request rate, latency (p50/p95/p99), error rate per endpoint
- WebSocket connection count per venue
- NATS message throughput and consumer lag
- Tournament clock drift (local node vs cloud time)
- Sync latency (time from local mutation to cloud persistence)
- Cache hit/miss ratios (DragonflyDB)
Business Metrics:
- Active tournaments per venue
- Players on waiting lists
- Concurrent connected users
- Tournament registrations per hour
- Offline duration per local node
Infrastructure Metrics:
- CPU, memory, disk per service
- RPi5 node health: temperature, memory usage, SD card wear level
- NATS cluster health
- Postgres connection pool utilization
Local Node Observability
The RPi5 node should:
- Buffer OpenTelemetry spans/metrics locally when offline
- Flush to cloud collector on reconnect
- Expose a local `/health` endpoint for venue staff to check node status
- Log to both stdout (for `journalctl`) and a rotating file
Alerting
- Use Grafana Alerting for cloud services
- Critical alerts: API error rate > 5%, NATS cluster partition, Postgres replication lag > 30s
- Warning alerts: RPi5 node offline > 5 min, sync backlog > 1000 events, high memory usage
- Notification channels: Slack/Discord for ops team, push notification for venue managers on critical local node issues
Gotchas
- OpenTelemetry's Rust SDK is stable but evolving — pin versions carefully
- The `tracing` crate is the Rust ecosystem standard — everything (Axum, sqlx, async-nats) already emits tracing spans, so you get deep instrumentation for free
- Grafana Cloud's free tier is generous enough for early stages (10k metrics, 50GB logs, 50GB traces)
15. Testing Strategy
Recommendation: Multi-layer testing with cargo test (unit/integration), Playwright (E2E), and Vitest (frontend unit)
Test Pyramid
▲
/ \ E2E Tests (Playwright)
/ \ - Full user flows
/ \ - Cast receiver rendering
/───────\
/ \ Integration Tests (cargo test + testcontainers)
/ \ - API endpoint tests with real DB
/ \ - NATS pub/sub flows
/ \ - Sync protocol tests
/─────────────────\
Unit Tests (cargo test + vitest)
- Domain logic (tournament engine, clock, waitlist)
- Svelte component tests
- Conflict resolution logic
Backend Testing (Rust)
- Unit tests: Inline `#[cfg(test)]` modules for domain logic. The tournament engine, clock manager, waitlist priority algorithm, and conflict resolution are all pure functions that are easy to unit test.
- Integration tests: Use the `testcontainers` crate to spin up ephemeral Postgres + NATS + DragonflyDB instances. Test full API flows including auth, multi-tenancy, and real-time events.
- sqlx compile-time checks: SQL queries are validated against the database schema at compile time — this catches a huge class of bugs before runtime.
- Property-based testing: Use `proptest` for testing conflict resolution and sync protocol with random inputs.
- Test runner: `cargo-nextest` for parallel test execution (significantly faster than default `cargo test`).
Frontend Testing (TypeScript/Svelte)
- Component tests: Vitest + `@testing-library/svelte` for testing Svelte components in isolation.
- Store/state tests: Vitest for testing reactive state logic (tournament clock state, waitlist updates).
- API mocking: `msw` (Mock Service Worker) for intercepting API calls in tests.
End-to-End Testing
- Playwright: Test critical user flows in real browsers:
- Tournament creation and management flow
- Player registration and waitlist join
- Real-time updates (verify clock ticks appear in browser)
- Multi-venue admin dashboard
- Cast receiver display rendering (headless Chromium)
- Local node E2E: Test offline scenarios — start local node, disconnect from cloud, perform operations, reconnect, verify sync.
Specialized Tests
- Sync protocol tests: Simulate network partitions, conflicting writes, replay scenarios
- Load testing: `k6` or `drill` (Rust) for WebSocket connection saturation, API throughput
- Cross-browser: Playwright covers Chromium, Firefox, WebKit — ensure PWA works on all
Gotchas
- Rust integration tests with testcontainers need Docker available in CI — Fly.io's CI runners support this, or use GitHub Actions with Docker
- Playwright tests are slow — run in parallel, and only test critical paths in CI (full suite nightly)
- The local node's offline/reconnect behavior is the hardest thing to test — invest heavily in deterministic sync protocol tests
- Mock the NATS connection in unit tests using a channel-based mock, not an actual NATS server
16. Security
Recommendation: Defense in depth across all layers
Data Security
| Layer | Measure |
|---|---|
| Transport | TLS 1.3 everywhere — API, WebSocket, NATS, Postgres connections |
| Data at rest | Postgres: encrypted volumes (cloud provider). libSQL on RPi5: SQLCipher-compatible encryption via libsql |
| Secrets | Environment variables via Fly.io secrets (cloud), encrypted config file on RPi5 (sealed at provisioning) |
| Passwords | Argon2id hashing, tuned per environment (higher params on cloud, lower on RPi5) |
| JWTs | Ed25519 signing, short expiry (15 min), refresh token rotation |
| API keys | SHA-256 hashed in DB, displayed once at creation, prefix-based identification (pvm_live_, pvm_test_) |
Network Security
- API: Rate limiting (Tower middleware), CORS restricted to known origins, request size limits
- WebSocket: Authenticated connection upgrade (JWT in first message or query param), per-connection rate limiting
- NATS: TLS + token auth between cloud and leaf nodes. Leaf nodes have scoped permissions (can only access their venue's subjects)
- RPi5: Firewall (nftables/ufw) — only allow outbound to cloud NATS + HTTPS, inbound on local network only for venue devices
- DDoS: Fly.io provides basic DDoS protection. Add Cloudflare in front for the API if needed.
Financial Data Security
PVM handles credit lines and buy-in transactions — this requires extra care:
- All financial mutations are event-sourced with immutable audit trail
- Credit line changes require admin approval with logged reason
- Buy-in/cashout transactions include idempotency keys to prevent duplicate charges
- Financial reports are only accessible to operator admins, with access logged
- Consider PCI DSS implications if handling payment card data directly — prefer delegating to a payment processor (Stripe)
Local Node Security
The RPi5 is physically in a venue — assume it can be stolen or tampered with:
- Disk encryption: Full disk encryption (LUKS) or at minimum encrypted database
- Secure boot: Signed binaries, verified at startup
- Remote wipe: Cloud can send a command to reset the node to factory state
- Tamper detection: Log unexpected restarts, hardware changes
- Credential scope: Local node only has access to its venue's data — compromising one node doesn't expose other venues
Gotchas
- DO NOT store payment card numbers — use a payment processor's tokenization
- GDPR/privacy: Player data across venues requires careful consent management. Players must be able to request data deletion.
- The local node's offline auth cache is a security risk — limit cached credentials, expire after configurable period
- Regularly rotate NATS credentials and JWT signing keys — automate this
17. Developer Experience
Recommendation: Cargo workspace (Rust monorepo) + pnpm workspace (TypeScript) managed by Turborepo
Monorepo Structure
pvm/
├── Cargo.toml # Rust workspace root
├── turbo.json # Turborepo config
├── package.json # pnpm workspace root
├── pnpm-workspace.yaml
│
├── crates/ # Rust crates
│ ├── pvm-api/ # Cloud API server (Axum)
│ ├── pvm-node/ # Local node binary
│ ├── pvm-ws-gateway/ # WebSocket gateway
│ ├── pvm-worker/ # Background job processor
│ ├── pvm-core/ # Shared domain logic
│ │ ├── tournament/ # Tournament engine
│ │ ├── waitlist/ # Waitlist management
│ │ ├── clock/ # Tournament clock
│ │ └── sync/ # Sync protocol
│ ├── pvm-db/ # Database layer (sqlx queries, migrations)
│ ├── pvm-auth/ # Auth logic (JWT, RBAC)
│ ├── pvm-nats/ # NATS client wrappers
│ └── pvm-types/ # Shared types (serde, utoipa derives)
│
├── apps/ # TypeScript apps
│ ├── dashboard/ # SvelteKit admin dashboard
│ ├── player/ # SvelteKit player-facing app
│ ├── cast-receiver/ # SvelteKit Cast receiver (static)
│ └── docs/ # Documentation site (optional)
│
├── packages/ # Shared TypeScript packages
│ ├── ui/ # shadcn-svelte components
│ ├── api-client/ # Generated OpenAPI client
│ └── shared/ # Shared types, utilities
│
├── docker/ # Dockerfiles
├── .github/ # GitHub Actions workflows
└── docs/ # Project documentation
Key Tools
| Tool | Purpose |
|---|---|
| Cargo | Rust build system, workspace management |
| pnpm | Fast, disk-efficient Node.js package manager |
| Turborepo | Orchestrates build/test/lint across both Rust and TS workspaces. Caches build outputs. --affected flag for CI optimization. |
| cargo-watch | Auto-rebuild on Rust file changes during development |
| cargo-nextest | Faster test runner with parallel execution |
| sccache | Shared compilation cache (speeds up CI and local builds) |
| cross / cargo-zigbuild | Cross-compile Rust for RPi5 ARM64 |
| Biome | Fast linter + formatter for TypeScript (replaces ESLint + Prettier) |
| clippy | Rust linter (run with --deny warnings in CI) |
| rustfmt | Rust formatter (enforced in CI) |
| lefthook | Git hooks manager (format + lint on pre-commit) |
Development Workflow
# Start everything for local development
turbo dev # Starts SvelteKit dev servers
cargo watch -x run -p pvm-api # Auto-restart API on changes
# Run all tests
turbo test # TypeScript tests
cargo nextest run # Rust tests
# Generate API client after backend changes
cargo run -p pvm-api -- --openapi > apps/dashboard/src/lib/api/schema.json
turbo generate:api-client
# Build for production
turbo build # TypeScript apps
cargo build --release -p pvm-api
cross build --release --target aarch64-unknown-linux-gnu -p pvm-node
Gotchas
- Turborepo's Rust support is task-level (it runs `cargo` as a shell command) — it doesn't understand Cargo's internal dependency graph. Use Cargo workspace for Rust-internal dependencies.
- Keep `pvm-core` as a pure library crate with no async runtime dependency — this lets it be used in both the cloud API and the local node without conflicts.
- Rust compile times are the bottleneck — invest in `sccache` and incremental compilation from day one
- Use `.cargo/config.toml` for cross-compilation targets and linker settings
18. CSS / Styling
Recommendation: Tailwind CSS v4 + shadcn-svelte component system
Alternatives Considered
| Option | Pros | Cons |
|---|---|---|
| Tailwind CSS v4 | Utility-first, fast, excellent Svelte integration, v4 is faster with Rust-based engine | Learning curve for utility classes |
| Vanilla CSS | No dependencies, full control | Slow development, inconsistent patterns |
| UnoCSS | Atomic CSS, fast, flexible presets | Smaller ecosystem than Tailwind |
| Open Props | Design tokens as CSS custom properties | Not utility-first, less adoption |
| Panda CSS | Type-safe styles, zero runtime | Newer, smaller ecosystem |
Reasoning
Tailwind CSS v4 is the clear choice:
- Svelte integration: Tailwind works seamlessly with SvelteKit via the Vite plugin. Svelte's template syntax + Tailwind utilities produce compact, readable component markup.
- Tailwind v4 improvements: The v4 release includes a Rust-based engine (Oxide) that is significantly faster, CSS-first configuration (no more `tailwind.config.js`), automatic content detection, and native CSS cascade layers.
- Cast receiver: Tailwind's utility classes produce small CSS bundles (only used classes are included) — important for the resource-constrained Chromecast receiver.
- Design tokens: Use CSS custom properties (via Tailwind's theme) for venue-specific branding (colors, logos) that can be swapped at runtime.
Design System Structure
packages/ui/
├── components/ # shadcn-svelte generated components
│ ├── button/
│ ├── card/
│ ├── data-table/
│ ├── dialog/
│ ├── form/
│ └── ...
├── styles/
│ ├── app.css # Global styles, Tailwind imports
│ ├── themes/
│ │ ├── default.css # Default PVM theme
│ │ ├── dark.css # Dark mode overrides
│ │ └── cast.css # Optimized for large screens
│ └── tokens.css # Design tokens (colors, spacing, typography)
└── utils.ts # cn() helper, variant utilities
Venue Branding
Venues should be able to customize their displays:
/* Runtime theme switching via CSS custom properties */
:root {
--venue-primary: theme(colors.blue.600);
--venue-secondary: theme(colors.gray.800);
--venue-logo-url: url('/default-logo.svg');
}
/* Applied per-venue at runtime */
[data-venue-theme="vegas-poker"] {
--venue-primary: #c41e3a;
--venue-secondary: #1a1a2e;
--venue-logo-url: url('/venues/vegas-poker/logo.svg');
}
Gotchas
- Tailwind v4's CSS-first config is a paradigm shift from v3 — ensure all team documentation targets v4 syntax
- shadcn-svelte components use Tailwind v4 as of recent updates — verify compatibility
- Large data tables (tournament player lists, waitlists) need careful styling — consider virtualized rendering for 100+ row tables
- Cast receiver displays need large fonts and high contrast — create a dedicated `cast.css` theme
- Dark mode is essential for poker venues (low-light environments) — design dark-first
Recommended Stack Summary
| Area | Recommendation | Key Reasoning |
|---|---|---|
| Backend Language | Rust | Memory efficiency on RPi5, performance, type safety |
| Frontend Language | TypeScript | Browser ecosystem standard, type safety |
| Backend Framework | Axum (v0.8+) | Tokio-native, Tower middleware, WebSocket support |
| Frontend Framework | SvelteKit (Svelte 5) | Smallest bundles, fine-grained reactivity, PWA support |
| UI Components | shadcn-svelte | Accessible, Tailwind-based, full ownership |
| Cloud Database | PostgreSQL 16+ | Multi-tenant gold standard, RLS, JSONB |
| Local Database | libSQL (embedded) | SQLite-compatible, tiny footprint, Rust-native |
| ORM / Queries | sqlx | Compile-time checked SQL, Postgres + SQLite support |
| Caching | DragonflyDB | Redis-compatible, multi-threaded, memory efficient |
| Messaging | NATS + JetStream | Edge-native leaf nodes, sub-ms latency, lightweight |
| Real-Time | WebSockets (Axum) + SSE fallback | Full duplex, NATS-backed fan-out |
| Auth | Custom JWT + RBAC | Offline-capable, cross-venue, full control |
| API Design | REST + OpenAPI 3.1 | Generated TypeScript client, universal compatibility |
| Mobile | PWA first, Capacitor later | One codebase, offline support, app store when needed |
| Cast/Display | Google Cast SDK + Custom Web Receiver | SvelteKit static app on Chromecast |
| Deployment | Fly.io + Docker | Edge deployment, managed Postgres, WireGuard |
| CI/CD | GitHub Actions + Turborepo | Cross-language build orchestration, caching |
| Monitoring | OpenTelemetry + Grafana | Vendor-neutral, excellent Rust support |
| Testing | cargo-nextest + Vitest + Playwright | Full pyramid: unit, integration, E2E |
| Styling | Tailwind CSS v4 | Fast, small bundles, Svelte-native |
| Monorepo | Cargo workspace + pnpm + Turborepo | Unified builds, shared types |
| Linting | clippy + Biome | Rust + TypeScript coverage |
Open Questions / Decisions Needed
High Priority
- Fly.io vs. self-hosted: Fly.io simplifies operations but creates vendor dependency. For a bootstrapped SaaS, the convenience is worth it. For VC-funded with an ops team, self-hosted on Hetzner could be cheaper at scale. Decision: Start with Fly.io, design for portability.
- libSQL sync granularity: Should the local node sync entire tables or individual rows? Row-level sync is more efficient but more complex to implement. Recommendation: Start with table-level sync for the initial version, refine to row-level as data volumes grow.
- NATS embedded vs. sidecar on RPi5: Running NATS as an embedded library (via `nats-server` Rust bindings) vs. a separate process. Embedded is simpler but couples versions tightly. Recommendation: Sidecar (separate process managed by systemd) for operational flexibility.
- Financial data handling: Does PVM handle actual money transactions, or only track buy-ins/credits as records? If handling real money, PCI DSS and financial regulations apply. Recommendation: Track records only. Integrate with Stripe for actual payments.
- Multi-region from day one?: Should the initial architecture support venues in multiple countries/regions? This affects Postgres replication strategy and NATS cluster topology. Recommendation: Single region initially, design NATS subjects and DB schema for eventual multi-region.
Medium Priority
- Player account deduplication: When a player signs up at two venues independently, how do we detect and merge accounts? Email match? Phone match? Manual linking? Needs product decision.
- Chromecast vs. generic display hardware: Should the primary display strategy be Chromecast, or should we target a browser-in-kiosk-mode approach that also works with Chromecast? Recommendation: Build the receiver as a standard web app first (works in kiosk mode), add Cast SDK integration second.
- RPi5 provisioning: How are local nodes set up? Manual image flashing? Automated provisioning? Remote setup? Recommendation: Pre-built OS image with first-boot wizard that connects to cloud and provisions the node.
- Offline duration limits: How long should a local node operate offline before we consider the data stale? 1 hour? 1 day? 1 week? Needs product decision based on venue feedback.
- API versioning strategy: When do we introduce `/api/v2/`? Should we support multiple versions simultaneously? Recommendation: Semantic versioning for the API spec. Maintain backward compatibility as long as possible. Only version on breaking changes.
Low Priority
- GraphQL for player-facing app: The admin dashboard is well-served by REST, but the player app might benefit from GraphQL's flexible querying (e.g., "show me my upcoming tournaments across all venues with waitlist status"). Revisit after v1 launch.
- WebTransport: When browser support matures and Chromecast supports it, WebTransport could replace WebSockets for lower-latency, multiplexed real-time streams. Monitor but do not adopt yet.
- WASM on local node: Could parts of the frontend run on the local node via WASM for ultra-fast local rendering? Interesting but not a priority. Defer.
- AI features: Player behavior analytics, optimal table assignments, tournament structure recommendations. The data model should be designed to support future ML pipelines. Design for it, build later.