Project Research Summary
Project: Felt — Edge-cloud poker venue management platform
Domain: Live venue poker operations (ARM64 SBC + cloud hybrid, offline-first)
Researched: 2026-02-28
Confidence: MEDIUM-HIGH
Executive Summary
Felt is a three-tier edge-cloud platform for managing live poker venues: tournament operations, cash game management, player tracking, and digital display signage. The competitive landscape (TDD, Blind Valet, BravoPokerLive, LetsPoker) reveals a clear gap — no single product combines offline-first reliability, wireless display management, cloud sync, and modern UX. Experts in this domain build tournament state machines as pure functions with an append-only event log backing financial calculations, and treat offline operation as a first-class architectural constraint rather than a fallback. The recommended approach is a Go monorepo with shared domain logic compiled to two targets: an ARM64 Leaf binary for venue hardware and an amd64 Core binary for the cloud tier, connected via NATS JetStream leaf-node mirroring over a WireGuard mesh.
The primary technical risk is the intersection of offline-first requirements with financial correctness. Financial mutations (buy-ins, rebuys, prize pool splits) must be modelled as an immutable append-only event log using integer arithmetic — not mutable rows or floating-point values. Any deviation from this is not recoverable post-production without manual audit and player agreement. The second major risk is CGO cross-compilation complexity introduced by the LibSQL Go driver; this must be validated in CI from day one. A third risk is NATS JetStream's default sync_interval which does not guarantee durability on power loss — requiring an explicit configuration override before any production deployment.
The architecture is well-validated: Go's single-binary embed model (SvelteKit built assets embedded via go:embed) eliminates deployment complexity on ARM hardware; NATS JetStream's leaf-node domain isolation provides clean offline queuing with replay; PostgreSQL RLS provides multi-tenant isolation on Core; Pi Zero 2W display nodes are stateless Chromium kiosk consumers, not managed agents. The main uncertainty is around LibSQL's go-libsql driver (no tagged releases, pinned to commit hash) and the Netbird reverse proxy beta status, both of which require early integration testing to validate before committing to downstream features.
Key Findings
Recommended Stack
The stack is a Go + SvelteKit monorepo targeting two runtimes. Go 1.26 provides single-binary deployment, native ARM64 cross-compilation, and goroutine-based concurrency for real-time tournament clock management. SvelteKit 2 + Svelte 5 runes handle all frontends (operator UI, player PWA, display views, public pages) with Svelte 5's runes reactivity model handling 100ms clock updates without virtual DOM overhead. NATS Server 2.12.4 is embedded in the Leaf binary (~10MB RAM) and runs as a standalone cluster on Core; JetStream provides durable event queuing with offline store-and-forward replay. LibSQL (go-libsql, CGO-required) is the embedded database on Leaf; PostgreSQL 16 with row-level security is the multi-tenant store on Core. Netbird + Authentik provide self-hosted WireGuard mesh networking and OIDC identity.
Core technologies:
- Go 1.26: Backend runtime for both Leaf and Core — single binary, ARM64 native, goroutine concurrency for real-time state
- SvelteKit 2 + Svelte 5: All frontends — operator UI, player PWA, display views, public pages served from embedded Go binary
- NATS JetStream 2.12.4: Embedded message broker on Leaf + hub cluster on Core — durable offline-first event sync with store-and-forward replay
- LibSQL (go-libsql): Embedded SQLite-compatible DB on Leaf — offline-first authoritative state store with WAL mode
- PostgreSQL 16: Multi-tenant relational store on Core — RLS tenant isolation, cross-venue aggregation, league management
- Netbird (WireGuard mesh): Zero-config peer networking — reverse proxy for player PWA external access, encrypted tunnel for Core sync
- Authentik: Self-hosted OIDC identity provider — operator SSO with offline PIN fallback, integrates with Netbird
Critical version notes:
- go-libsql has no tagged releases; pin to commit hash in go.mod
- NATS server v2.12.4 must match nats.go v1.49.0 client
- Tailwind CSS v4 requires the @tailwindcss/vite plugin listed before sveltekit() in vite.config
- CGO_ENABLED=1 required for LibSQL; ARM64 cross-compilation needs aarch64-linux-gnu-gcc
Expected Features
The MVP must replace TDD (The Tournament Director) and a whiteboard for a single-venue operator running a complete tournament. Competitive analysis confirms no product combines offline reliability with wireless displays and cloud sync — this is the primary differentiation window.
Must have (table stakes) — P1:
- Tournament clock engine (countdown, levels, breaks, pause/resume) — the core product job
- Configurable blind structures with presets, antes, chip-up messaging, and break config
- Player registration, bust-out tracking, rebuy/add-on/late registration handling
- Prize pool calculation and payout structure (ICM, fixed %, custom splits)
- Table seating assignment (random + manual) and automated table balancing
- Display system: clock view and seating view on dedicated screens (wireless)
- Player mobile PWA: live clock, blinds, personal rank — QR code access, no install
- Offline-first operation — zero internet dependency during tournament
- Role-based auth: operator PIN offline, OIDC online
- TDD data import: blind structures and player database
- Data export: CSV, JSON, HTML print output
Should have (competitive advantage) — P2:
- Cash game: waitlist management, table status board, session/rake tracking, must-move logic
- Events engine: rule-based automation ("level advance → play sound, change display view")
- League/season management with configurable point formulas
- Hand-for-hand bubble mode per TDA rules
- Dealer tablet module for bust-outs and rebuys at the table
- Seat-available notifications (push/SMS)
- Digital signage content system with playlist scheduling
Defer (v2+) — P3:
- WYSIWYG content editor and AI promo generation
- Dealer management and staff scheduling
- Player loyalty/points system
- Public venue presence page with online event registration
- Analytics dashboards (revenue, retention, game popularity)
- Native iOS/Android apps — PWA covers use case until then
- Cross-venue player leaderboards — requires network effect
Anti-features (do not build):
- Payment processing or chip cashout — PCI/gambling license complexity
- Online poker gameplay — different product entirely
- BYO hardware support — undermines offline guarantees, support burden
- Real-money gambling regulation features — jurisdiction-specific legal maintenance
Architecture Approach
The architecture is a three-tier model: Cloud Core (Hetzner Proxmox LXC, amd64), Edge Leaf (Orange Pi 5 Plus ARM64 SBC, NVMe), and Display Tier (Raspberry Pi Zero 2W Chromium kiosk). The Leaf node is the authoritative operational unit — all tournament writes happen to LibSQL first, domain events are published to a local NATS JetStream stream, and the Hub broadcasts state deltas to all WebSocket clients within ~10ms. NATS mirrors the local stream to Core asynchronously over WireGuard, providing offline store-and-forward sync with per-subject ordering guarantees. Core is an analytics/aggregation/cross-venue target only — never a write-path dependency.
Major components:
- Go Leaf API — tournament engine (pure domain logic, no I/O), financial engine, seating engine, WebSocket hub, REST/WS API; single ARM64 binary with embedded NATS + LibSQL + SvelteKit assets
- Go Core API — multi-tenant aggregation, cross-venue leagues, player platform identity; PostgreSQL with RLS, NATS hub cluster
- NATS JetStream (Leaf → Core) — leaf-node domain isolation, store-and-forward mirroring, append-only event audit log; doubles as sync mechanism and audit trail
- WebSocket Hub — per-client goroutine send channels, central broadcast channel; non-blocking drop for slow clients; in-process pub/sub triggers
- Display Tier — stateless Pi Zero 2W kiosk nodes; view assignment stored on Leaf (not on node); Chromium kiosk subscribes to assigned view URL via WebSocket; operator reassigns view through Hub broadcast
- Netbird Mesh — WireGuard peer-to-peer tunnels; reverse proxy for player PWA HTTPS access; Authentik OIDC for operator auth when online
Key architecture rules:
- Leaf is always the authoritative write target; Core is read/aggregation only
- Financial events are immutable append-only log; prize pool is derived, never stored
- All monetary values stored as int64 cents — never float64
- Display nodes are stateless; view assignment is Leaf state, not node state
- Single Go monorepo with shared internal/ packages; cmd/leaf and cmd/core are the only divergence points
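Two of the rules above — int64 cents everywhere and a derived (never stored) prize pool — combine into the payout calculation. A sketch of the integer multiply-then-divide split, with percentages as basis points (a representational assumption, not the product's schema); truncation remainder goes to first place so payouts always sum exactly to the pool, which is the CI-gate invariant named in the pitfalls:

```go
package main

import "fmt"

// SplitPrizePool divides an int64-cent prize pool across payout
// percentages expressed in basis points (5000 = 50%). Multiplying
// before dividing keeps everything in integers; remainder cents from
// truncation are assigned to first place so the payouts always sum
// exactly to the pool.
func SplitPrizePool(poolCents int64, basisPoints []int64) []int64 {
	payouts := make([]int64, len(basisPoints))
	var paid int64
	for i, bp := range basisPoints {
		payouts[i] = poolCents * bp / 10000 // multiply first, divide last
		paid += payouts[i]
	}
	if len(payouts) > 0 {
		payouts[0] += poolCents - paid // remainder cents to first place
	}
	return payouts
}

func main() {
	// hypothetical 3-way 50/30/20 split of a $1,234.56 pool
	payouts := SplitPrizePool(123456, []int64{5000, 3000, 2000})
	var sum int64
	for _, p := range payouts {
		sum += p
	}
	fmt.Println(payouts, sum == 123456) // → [61729 37036 24691] true
}
```

The same float inputs would drift by fractions of a cent per place; here the invariant `sum(payouts) == pool` holds by construction.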
Critical Pitfalls
- NATS JetStream default fsync causes data loss on power failure — Set sync_interval: always in the embedded NATS server config before the first production deploy. The December 2025 Jepsen analysis confirmed NATS 2.12.1 loses acknowledged writes with default settings. Tournament events are low-frequency, so the throughput tradeoff is irrelevant.
- Float64 arithmetic corrupts prize pool calculations — Store and compute all monetary values as int64 cents. Percentage payouts use integer multiplication-then-division. Write a CI gate test: the sum of individual payouts must equal the prize pool total. This is a zero-compromise constraint — floating-point errors in production require manual audit and player agreement to resolve.
- LibSQL WAL checkpoint stall causes clock drift — Disable autocheckpoint (PRAGMA wal_autocheckpoint = 0) and schedule an explicit PRAGMA wal_checkpoint(TRUNCATE) during level transitions and breaks. Set journal_size_limit to 64MB. Gate LibSQL cloud sync behind a mutex shared with the write path — never overlap sync with an active write transaction.
- Pi Zero 2W memory exhaustion crashes display mid-tournament — Test ALL display views on actual Pi Zero 2W hardware from day one, not a Pi 4. Enable zram (~200MB effective headroom). Cap the V8 heap via Chromium's --js-flags=--max-old-space-size=256. Use systemd MemoryMax=450M with Restart=always. Consider Server-Sent Events instead of WebSocket for display-only views to reduce connection overhead.
- Table rebalancing algorithm produces invalid seating — Implement rebalancing as a pure function returning a proposed move list, never auto-applied. Consult Poker TDA Rules 25-28. Unit test exhaustively: last 2-player table, dealer button position, player in blind position. Require operator tap-to-confirm before executing any move. Never silently apply balance suggestions.
- PostgreSQL RLS tenant isolation bleeds between connections — Always scope the tenant setting to the transaction: SET LOCAL app.tenant_id = '...', or SELECT set_config('app.tenant_id', $1, true) when a bind parameter is needed (plain SET does not accept parameters); never use session-scoped SET. Assert current_setting('app.tenant_id') matches the JWT claim before every query. Write integration tests verifying that a venue A token returns 0 results (not 403) for venue B endpoints.
- Offline sync conflict corrupts financial ledger — Financial events must be immutable with monotonic sequence numbers assigned by the Leaf. Never use wall-clock timestamps for event ordering (SBC clock drift is real). Surface genuine conflicts in the operator UI rather than auto-merging.
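The rebalancing pitfall above prescribes a pure function that returns a proposed move list, never auto-applied. A sketch of that shape, moving players from the largest to the smallest table until sizes differ by at most one (the balance condition); which specific player moves (e.g. the one due for the big blind, per TDA practice) and tie-breaking between equal-sized tables are policy decisions deliberately left out — names here are illustrative:

```go
package main

import "fmt"

// Move is a proposal only — the operator must tap-to-confirm before
// any move executes; nothing here mutates seating state.
type Move struct {
	Player    string
	FromTable int
	ToTable   int
}

// ProposeBalance returns moves until all table sizes differ by at
// most one player. tables maps table number to seated players. Real
// code must break ties between equal-sized tables deterministically;
// this sketch takes the last listed player as a placeholder.
func ProposeBalance(tables map[int][]string) []Move {
	// copy the input so the function stays pure
	sizes := make(map[int][]string, len(tables))
	for t, ps := range tables {
		sizes[t] = append([]string(nil), ps...)
	}
	var moves []Move
	for {
		big, small := -1, -1
		for t := range sizes {
			if big == -1 || len(sizes[t]) > len(sizes[big]) {
				big = t
			}
			if small == -1 || len(sizes[t]) < len(sizes[small]) {
				small = t
			}
		}
		if len(sizes[big])-len(sizes[small]) <= 1 {
			return moves // balanced: sizes within one of each other
		}
		ps := sizes[big]
		p := ps[len(ps)-1]
		sizes[big] = ps[:len(ps)-1]
		sizes[small] = append(sizes[small], p)
		moves = append(moves, Move{Player: p, FromTable: big, ToTable: small})
	}
}

func main() {
	moves := ProposeBalance(map[int][]string{
		1: {"ann", "bob", "cam", "dee", "eve", "fay", "gus", "hal", "ivy"},
		2: {"jon", "kim", "lou", "mia", "ned", "oli"},
	})
	fmt.Println(moves) // one proposed move: table 1 (9 players) → table 2 (6)
}
```

Because the function is pure over a copied input, the edge cases the pitfall lists (last 2-player table, button and blind positions) are plain table-driven unit tests.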
Implications for Roadmap
Research strongly supports the five-phase build order identified in ARCHITECTURE.md, with one critical addition: financial engine correctness and infrastructure hardening must be established before any frontend work begins.
Phase 1: Foundation — Leaf Core Infrastructure
Rationale: The Leaf node is the architectural foundation everything else depends on. Tournament engine, financial engine, and data layer correctness must be established in isolation before adding network layers or frontends. All seven Phase 1 critical pitfalls manifest here: NATS fsync, float arithmetic, WAL configuration, Pi Zero 2W memory, seating algorithm, RLS isolation, and offline sync conflict handling. None of these are retrofittable.
Delivers: A working offline tournament system accessible via API. Operators can run a complete tournament (registration → clock → rebuys → bust-outs → payout) without any frontend UI, verifiable via API calls and automated tests.
Addresses (from FEATURES.md P1): Tournament clock engine, blind structure config, player registration + bust-out, rebuy/add-on, prize pool calculation, table seating + balancing, financial engine, role-based auth, offline operation
Avoids: NATS default fsync data loss, float arithmetic in financials, WAL checkpoint stall, offline sync financial conflict, table rebalancing invalid seating, multi-tenant RLS data leak
Needs deeper research: CGO cross-compilation pipeline (LibSQL ARM64 build); NATS JetStream embedded server wiring with domain isolation; go-libsql commit-pin strategy given no tagged releases
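The Phase 1 financial engine reduces to an append-only event log with Leaf-assigned monotonic sequence numbers and a derived prize pool, per the pitfalls above. A minimal sketch of that shape (type and field names are illustrative, not the actual schema):

```go
package main

import (
	"errors"
	"fmt"
)

// FinancialEvent is immutable once appended. Seq is assigned by the
// Leaf, monotonically — never wall-clock time, which drifts on SBCs.
type FinancialEvent struct {
	Seq         uint64
	Kind        string // "buyin", "rebuy", "addon"
	PlayerID    string
	AmountCents int64
}

// Ledger is append-only: no update or delete operations exist.
type Ledger struct {
	events []FinancialEvent
	next   uint64
}

func (l *Ledger) Append(kind, player string, amountCents int64) (FinancialEvent, error) {
	if amountCents < 0 {
		return FinancialEvent{}, errors.New("negative amount: corrections are compensating events, not edits")
	}
	l.next++
	ev := FinancialEvent{Seq: l.next, Kind: kind, PlayerID: player, AmountCents: amountCents}
	l.events = append(l.events, ev)
	return ev, nil
}

// PrizePoolCents is derived by replaying the log — never stored, per
// the architecture rules.
func (l *Ledger) PrizePoolCents() int64 {
	var total int64
	for _, ev := range l.events {
		total += ev.AmountCents
	}
	return total
}

func main() {
	var l Ledger
	l.Append("buyin", "p1", 10000)
	l.Append("buyin", "p2", 10000)
	l.Append("rebuy", "p1", 5000)
	fmt.Println(l.PrizePoolCents()) // → 25000
}
```

The same sequence numbers double as the sync ordering key when events are mirrored to Core, sidestepping the wall-clock ordering pitfall.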
Phase 2: Operator + Display Frontend
Rationale: The API from Phase 1 is the source of truth; frontend is a view layer. Building frontend after backend eliminates the common mistake of letting UI design drive data model decisions. Display node architecture (stateless Chromium kiosk) must be validated on actual Pi Zero 2W hardware before building more display views.
Delivers: Fully operational venue management UI — operators can run a tournament through the SvelteKit operator interface; display nodes show clock/seating on TV screens; player PWA shows live data via QR code.
Addresses (from FEATURES.md P1): Display system (clock + seating views), player mobile PWA, TDD data import, data export
Implements (from ARCHITECTURE.md): SvelteKit operator UI, display view routing via URL parameters + WebSocket view-change broadcast, player PWA with service worker, Netbird reverse proxy for external player access
Avoids: Pi Zero 2W memory exhaustion (validate on hardware before adding more views), PWA stale service worker (configure skipWaiting from day one), WebSocket state desync on server restart
Standard patterns: SvelteKit + Svelte 5 runes, Tailwind v4 Vite integration, vite-pwa/sveltekit plugin — well-documented, skip deep research here
Phase 3: Cloud Sync + Core Backend
Rationale: Core is a progressive enhancement — Leaf operates completely without it. This deliberate ordering ensures offline-first is proven before adding the cloud dependency. Multi-tenant RLS and NATS hub cluster configuration are complex enough to warrant dedicated implementation phase after Leaf is battle-tested.
Delivers: Leaf events mirror to Core PostgreSQL; multi-venue operator dashboard; player platform identity (player belongs to Felt, not just one venue); cross-venue league standings computable from aggregated data.
Addresses (from FEATURES.md P1/P2): TDD import to cloud, league/season management foundations, analytics data pipeline, multi-tenant operator accounts
Implements (from ARCHITECTURE.md): PostgreSQL schema + RLS, NATS hub cluster, Leaf-to-Core JetStream mirror stream, Go Core API, SvelteKit SSR public pages + admin dashboard
Avoids: Multi-tenant RLS tenant isolation bleed, Netbird management server as single point of failure (design redundancy here), NATS data loss on Leaf reconnect (verify replay correctness)
Needs deeper research: NATS JetStream stream source/mirror configuration across domains; PostgreSQL RLS with pgx connection pool (transaction mode vs session mode); Authentik OIDC integration with Netbird self-hosted
Phase 4: Cash Game + Advanced Tournament Features
Rationale: Cash game operations have different state machine characteristics than tournaments (open-ended sessions, waitlist progression, must-move table logic). Building after tournament proves the event-sourcing and WebSocket broadcast patterns. Events engine automation layer is additive on top of existing state machines.
Delivers: Full cash game venue management: waitlist, table status board, session/rake tracking, must-move logic, seat-available notifications. Events engine enables rule-based automation for both tournament and cash game operations.
Addresses (from FEATURES.md P2): Waitlist management, table status board, session tracking, rake tracking, seat-available notifications, must-move table logic, events engine, hand-for-hand mode, dealer tablet module, digital signage content system
Avoids: Must-move algorithm correctness (similar testing discipline as table rebalancing), GDPR consent capture for player contact details used in push notifications
Needs deeper research: Push notification delivery for seat-available (PWA push vs SMS gateway); must-move table priority queue algorithm per TDA rules; digital signage content scheduling architecture
Phase 5: Platform Maturity + Analytics
Rationale: Platform-level features (public venue pages, cross-venue leaderboards, loyalty system, analytics dashboards) require the player identity foundation from Phase 3 and the full event history from Phases 1-4. Analytics consumers on Core event streams can be added without modifying existing Leaf or Core operational code.
Delivers: Public venue discovery pages, online event registration, player loyalty/points system, revenue analytics dashboards, full GDPR compliance workflow including right-to-erasure API.
Addresses (from FEATURES.md P3): WYSIWYG content editor, AI promo generation, dealer management, loyalty points, public venue presence, analytics dashboards, full GDPR compliance
Avoids: Full GDPR right-to-erasure implementation (PII anonymization without destroying tournament results), cross-tenant leaderboard data isolation
Standard patterns: Analytics dashboards on time-series data — well-documented patterns; skip deep research unless using TimescaleDB/ClickHouse
Phase Ordering Rationale
- Financial engine correctness, NATS durability, and data layer configuration are all Phase 1 because they are architectural constraints that cannot be retrofitted without full data migration or manual audit
- Frontend follows backend (Phase 2 after Phase 1) to prevent UI from driving data model decisions; the API contract is established before the first pixel is rendered
- Core cloud sync (Phase 3) is explicitly deferred until Leaf is proven in offline operation — this validates the most important product constraint before adding complexity
- Cash game (Phase 4) shares infrastructure with tournaments but has distinct operational semantics; building after tournament prevents premature abstraction
- GDPR compliance is split: the anonymization data model must be in place from Phase 1 (player management), but the full workflow (consent capture, deletion API, retention enforcement) is Phase 5
Research Flags
Phases likely needing /gsd:research-phase during planning:
- Phase 1 — CGO cross-compilation pipeline: go-libsql has no tagged releases and CGO ARM64 cross-compilation is a known complexity point. Need to validate the specific commit hash strategy and Docker buildx vs cross-compiler approach before committing to the build pipeline.
- Phase 1 — NATS embedded leaf node domain setup: The exact configuration for running an embedded NATS server as a JetStream leaf node with a named domain, connecting to a Core hub, is documented but has known gotchas (domain naming conflicts, account configuration). Validate with a minimal integration test before building any domain event logic on top.
- Phase 3 — NATS JetStream stream source/mirror across domains: Cross-domain JetStream mirroring has specific configuration requirements. The Core side creating a stream source from a leaf domain is not well-documented outside official NATS docs. Needs validation test.
- Phase 3 — Netbird reverse proxy beta status: Netbird reverse proxy is in beta as of research date. The integration with Traefik as external reverse proxy needs explicit validation. Test before committing the player PWA access pattern to this mechanism.
- Phase 4 — Push notification delivery for seat-available: PWA push requires browser permission grant and a push service (Web Push Protocol, VAPID keys). SMS requires a gateway (Twilio, Vonage). Neither is trivial and the choice has cost and compliance implications.
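For the Phase 1 NATS flags above, the leaf-side server config likely takes roughly this shape — a sketch only: the server name, store path, domain name, URL, and account are placeholders, and option spellings should be verified against the current NATS server docs before relying on them:

```
# Leaf-side NATS config sketch — names, paths, and URL are placeholders;
# verify option spellings against current NATS server documentation.
server_name: felt-leaf-venue1

jetstream {
  store_dir: "/var/lib/felt/jetstream"
  domain: leaf            # leaf-node domain isolation
  sync_interval: always   # fsync every write — survives power loss (see Pitfalls)
}

leafnodes {
  remotes = [
    {
      urls: ["tls://core.example.com:7422"]
      account: "FELT"
    }
  ]
}
```

The same options exist programmatically when embedding the server in the Leaf binary; the minimal integration test flagged above should confirm that the `leaf` domain is visible from the Core hub before any domain event logic is built on top.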
Phases with standard patterns (skip research-phase):
- Phase 2 — SvelteKit + Tailwind v4 + vite-pwa: All well-documented with official guides. The integration patterns are verified and the stack is stable. Implement directly.
- Phase 2 — WebSocket Hub pattern: Canonical Go pattern with multiple reference implementations. Implement directly from the hub pattern documented in ARCHITECTURE.md.
- Phase 5 — Analytics dashboards: Standard time-series query patterns on PostgreSQL/TimescaleDB. Skip research unless introducing a dedicated analytics database.
Confidence Assessment
| Area | Confidence | Notes |
|---|---|---|
| Stack | MEDIUM-HIGH | Core technologies (Go, SvelteKit, NATS, PostgreSQL) verified against official sources and current releases. go-libsql is the uncertainty — no tagged releases, CGO complexity. Netbird reverse proxy is beta. |
| Features | MEDIUM | Competitive analysis covered major products (TDD, Blind Valet, BravoPokerLive, LetsPoker, CasinoWare). Feature importance inferred from analysis and forum discussions, not direct operator interviews. Prioritization reflects reasonable inference, not validated PMF. |
| Architecture | MEDIUM-HIGH | NATS leaf-node patterns verified via official docs. WebSocket Hub is a canonical Go pattern. LibSQL embedded replication model verified. Pi Zero 2W constraints community-verified. Chromium kiosk approach has multiple real-world references. |
| Pitfalls | HIGH | NATS fsync data loss is documented in a December 2025 Jepsen analysis (independent, high confidence). Float arithmetic, RLS bleed, and WAL checkpoint issues are verified against official sources. Pi Zero 2W memory constraints are community-verified. Table rebalancing edge cases are documented in TDD's own changelog. |
Overall confidence: MEDIUM-HIGH
Gaps to Address
- Direct operator validation: Feature priorities are inferred from competitive analysis, not operator interviews. The first beta deployments should include structured feedback collection to validate P1 feature completeness before Phase 2 work begins.
- go-libsql stability and replication: The go-libsql driver has no tagged releases and the LibSQL embedded replication feature is in public beta. The sync-to-Core path may not be needed if NATS JetStream handles all replication. Validate during Phase 1 whether LibSQL sync is used at all or NATS is the exclusive sync mechanism.
- Netbird reverse proxy in production: Beta status means API may change. Validate the full player PWA access flow (QR code → public HTTPS URL → WireGuard → Leaf) in a real venue network environment before Phase 3 depends on it.
- Pi Zero 2W Chromium memory with multi-view display: Memory profiling has been community-validated for basic kiosk use, but not for the specific animation patterns in a tournament clock display. Must be validated on actual hardware in Phase 2 before scaling display views.
- Multi-currency display configuration: Research flagged this as deferred (display-only currency symbol), but the data model choice (storing amounts as cents in a single implicit currency vs. currency-tagged amounts) must be made in Phase 1 even if multi-currency display is deferred.
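One low-cost way to keep that Phase 1 door open is to tag every stored amount with a currency code even while each venue only ever uses one. An illustrative type (not the actual schema) showing the shape of the decision:

```go
package main

import (
	"errors"
	"fmt"
)

// Amount tags int64 cents with an ISO 4217 currency code. Carrying
// the tag from day one keeps the Phase 1 schema multi-currency-ready
// even while multi-currency display remains deferred.
type Amount struct {
	Cents    int64
	Currency string // e.g. "EUR"
}

// Add refuses to mix currencies instead of silently converting —
// conversion would need a rate source, which is out of scope here.
func (a Amount) Add(b Amount) (Amount, error) {
	if a.Currency != b.Currency {
		return Amount{}, errors.New("currency mismatch")
	}
	return Amount{Cents: a.Cents + b.Cents, Currency: a.Currency}, nil
}

func main() {
	buyin := Amount{Cents: 10000, Currency: "EUR"}
	rebuy := Amount{Cents: 5000, Currency: "EUR"}
	total, _ := buyin.Add(rebuy)
	fmt.Println(total) // → {15000 EUR}
}
```

Dropping the tag later is a trivial migration; adding it after production data exists is not — which is why the choice belongs in Phase 1.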
Sources
Primary (HIGH confidence)
- Go 1.26 release — https://go.dev/blog/go1.26
- NATS Server v2.12.4 releases — https://github.com/nats-io/nats-server/releases
- NATS JetStream Core Concepts — https://docs.nats.io/nats-concepts/jetstream
- Jepsen: NATS 2.12.1 analysis — https://jepsen.io/blog/2025-12-08-nats-2.12.1
- NATS JetStream data loss GitHub issue — https://github.com/nats-io/nats-server/issues/7564
- SQLite Write-Ahead Logging — https://sqlite.org/wal.html
- LibSQL Embedded Replicas data corruption — https://github.com/tursodatabase/libsql/discussions/1910
- Multi-tenant Data Isolation with PostgreSQL RLS — AWS — https://aws.amazon.com/blogs/database/multi-tenant-data-isolation-with-postgresql-row-level-security/
- Floats Don't Work for Storing Cents — Modern Treasury
- SvelteKit 2.53.x official docs — https://kit.svelte.dev
- Tailwind v4 Vite integration — https://tailwindcss.com/docs/guides/sveltekit
- The Tournament Director known bugs changelog — https://thetournamentdirector.net/changes.txt
Secondary (MEDIUM confidence)
- NATS Adaptive Edge Deployment — https://docs.nats.io/nats-concepts/service_infrastructure/adaptive_edge_deployment
- JetStream on Leaf Nodes — https://docs.nats.io/running-a-nats-service/configuration/leafnodes/jetstream_leafnodes
- NetBird Reverse Proxy Docs — https://docs.netbird.io/manage/reverse-proxy
- LibSQL Embedded Replicas — https://docs.turso.tech/features/embedded-replicas/introduction
- Authentik Netbird integration — https://docs.netbird.io/selfhosted/identity-providers/authentik
- CGO ARM64 cross-compilation community thread
- Go embed + SvelteKit pattern — https://www.liip.ch/en/blog/embed-sveltekit-into-a-go-binary
- Chromium on Pi Zero 2W memory — Raspberry Pi Forums
- PostgreSQL RLS implementation guide — permit.io / AWS
- Competitor feature analysis: Blind Valet, BravoPokerLive, LetsPoker, CasinoWare, kHold'em, PokerAtlas TableCaptain
- PokerNews: Best Poker Table Management Software comparison
- Poker TDA Rules 2013 (balancing procedures Rules 25-28)
Tertiary (LOW confidence)
- Synadia: AI at the Edge with NATS JetStream — single source for edge AI patterns
- Multi-Tenancy Database Patterns in Go — single source, corroborates general RLS pattern
- Raspberry Pi Kiosk System community project
- NetBird 2025 critical mistakes — third-party blog, verify against official docs
- GDPR compliance for gaming operators — legal advisory blog, not authoritative
Research completed: 2026-02-28
Ready for roadmap: yes