Project Research Summary

Project: HWLab, an AI-powered self-hosted homelab hardware inventory
Domain: USB peripheral control + local AI inference + NetBox ITAM integration
Researched: 2026-04-09
Confidence: MEDIUM-HIGH

Executive Summary

HWLab is a single-operator homelab tool that closes a gap no existing ITAM product addresses: zero-manual-entry physical hardware cataloging via AI vision, with QR label printing and cable testing integrated at the hardware layer. The canonical pattern for this class of tool is a Go single-binary backend (HTTP API + USB serial management + AI orchestration) paired with a React SPA, with NetBox as the sole inventory data store. There is no local inventory database; SQLite is used only for chat history, config, and a write-ahead queue for NetBox operations. The React frontend is embedded into the Go binary for single-binary deployment, which suits a solo-operator homelab context.

The recommended approach centers on a three-tier AI pipeline: Tier 1 (local Gemma 4 via oMLX on Apple Silicon) handles all routine intake offline; Tier 2 (SearXNG research agent via OpenRouter) fills specification gaps; Tier 3 (Claude Opus via OpenRouter) powers the Lab Advisor strategic chat. This tiering is the core architectural differentiator — it delivers offline capability, cost control, and depth of response that no single-model approach achieves. The AI Orchestrator must own all model routing decisions; service-layer code must never reference model names directly.
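A minimal sketch of the routing rule described above, assuming service code asks for a task type and only the orchestrator knows which tier, endpoint, and model serve it. The task names, the local endpoint URL, and the model identifiers are placeholders for illustration and are not taken from the research.

```go
package main

import (
	"errors"
	"fmt"
)

// TaskType is what service code requests; it never names a model.
type TaskType string

const (
	TaskIntake   TaskType = "intake"   // routine photo intake (Tier 1)
	TaskResearch TaskType = "research" // fill specification gaps (Tier 2)
	TaskAdvisor  TaskType = "advisor"  // Lab Advisor chat (Tier 3)
)

// Tier holds per-tier settings that would be loaded from config.
type Tier struct {
	Name    string
	BaseURL string // OpenAI-compatible endpoint for this tier
	Model   string // model identifier, visible only to the orchestrator
}

// ErrNoRoute is returned when no tier is configured for a task.
var ErrNoRoute = errors.New("no tier configured for task")

// Orchestrator owns the mapping from task type to tier.
type Orchestrator struct {
	routes map[TaskType]Tier
}

func NewOrchestrator(routes map[TaskType]Tier) *Orchestrator {
	return &Orchestrator{routes: routes}
}

// Route returns the tier that should serve the given task.
func (o *Orchestrator) Route(t TaskType) (Tier, error) {
	tier, ok := o.routes[t]
	if !ok {
		return Tier{}, fmt.Errorf("%w: %q", ErrNoRoute, t)
	}
	return tier, nil
}

func main() {
	// All URLs and model names below are hypothetical placeholders.
	o := NewOrchestrator(map[TaskType]Tier{
		TaskIntake:   {Name: "tier1", BaseURL: "http://localhost:8000/v1", Model: "local-vision-model"},
		TaskResearch: {Name: "tier2", BaseURL: "https://openrouter.ai/api/v1", Model: "research-model"},
		TaskAdvisor:  {Name: "tier3", BaseURL: "https://openrouter.ai/api/v1", Model: "advisor-model"},
	})
	tier, err := o.Route(TaskIntake)
	fmt.Println(tier.Name, tier.BaseURL, err)
}
```

Keeping the table in one place means a model swap is a config change rather than a code change in the services.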

The two dominant risks are hardware-dependent and must be addressed before any feature work begins. First, the PRT Qutie label printer has no public USB protocol documentation — protocol characterization must happen the day hardware arrives (2026-04-13) before committing to a label printing architecture. Second, Gemma 4 on 16GB unified memory must be profiled under full stack load before the model tier is finalized — Gemma 4 E4B (4-bit, ~5-8GB) is the safe default; Gemma 4 26B A4B needs TurboQuant testing. The USB device layer (goroutine-per-device with VID/PID enumeration) and NetBox custom field write/read asymmetry are well-documented pitfalls that must be handled from the first line of integration code.

Key Findings

The backend is Go 1.22+ using Gin for HTTP routing, go.bug.st/serial for USB serial (not tarm/serial — unmaintained since 2018), go-netbox/v4 for the typed NetBox client, and sashabaranov/go-openai as the single OpenAI-compatible client (base URL is config-driven per tier). SQLite via mattn/go-sqlite3 + sqlx handles the chat history and write-ahead queue. The frontend is React 18 + TypeScript 5 + Vite 5, with TanStack Query for server state (NetBox data), Zustand for client state (chat buffer, USB device status), and shadcn/ui + Tailwind 3 for components. oMLX on macOS 15+ (Apple Silicon M1+) is the local inference server; it is not compatible with Intel Mac or Linux.

Core technologies:

  • Go 1.22+: backend API, USB serial control, AI orchestration — single binary deployment, native concurrency for USB polling
  • React 18 + TypeScript 5 + Vite 5: SPA frontend — type safety catches NetBox API contract drift
  • oMLX (latest): local Gemma 4 inference via MLX backend — OpenAI-compatible, Apple Silicon optimized, required for Tier 1
  • go-netbox/v4: official typed NetBox REST client — generated from OpenAPI spec, only compatible with NetBox 4.x
  • go.bug.st/serial v1.x: USB serial for printer and cable testers — actively maintained, supports port enumeration
  • SQLite (mattn/go-sqlite3 v1.14+): chat history, config, write-ahead queue only — NetBox owns inventory data
  • sashabaranov/go-openai v1.x: single client for all AI tiers — base URL switches between oMLX, OpenRouter

Do not use: GORM (over-engineered for two tables), tarm/serial (unmaintained), Redux (use Zustand + TanStack Query), Next.js (Go serves the SPA as static files), llama.cpp Go bindings (oMLX is 2-4x faster on M-series).

Expected Features

Must have (table stakes):

  • Asset record CRUD (delegated to NetBox via API) — every ITAM tool has this
  • Unique HW-XXXXX identifiers — stable QR-encoded references, assigned at intake
  • Asset search and filter — table search + AI natural language query translation
  • Asset status tracking (Available / In Use / Retired / Unknown) — baseline expectation
  • Category, manufacturer, model, serial number fields — NetBox native; populated by AI intake
  • Location / rack / site tracking — NetBox hierarchy
  • QR label printing via PRT Qutie — closes the physical intake loop
  • Audit trail — NetBox change log, surfaced in record detail
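As an illustration of the HW-XXXXX identifier scheme listed above, the sketch below formats and parses IDs. It assumes the five X positions are zero-padded decimal digits drawn from a sequence counter; the research does not specify the encoding, so treat that as a placeholder choice.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// hwIDPattern matches the assumed form: "HW-" followed by five decimal digits.
var hwIDPattern = regexp.MustCompile(`^HW-(\d{5})$`)

// FormatHWID renders a sequence number as HW-XXXXX.
func FormatHWID(seq int) (string, error) {
	if seq < 0 || seq > 99999 {
		return "", fmt.Errorf("sequence %d out of range for HW-XXXXX", seq)
	}
	return fmt.Sprintf("HW-%05d", seq), nil
}

// ParseHWID extracts the sequence number from a scanned identifier.
func ParseHWID(id string) (int, error) {
	m := hwIDPattern.FindStringSubmatch(id)
	if m == nil {
		return 0, fmt.Errorf("invalid hardware ID %q", id)
	}
	return strconv.Atoi(m[1])
}

func main() {
	id, _ := FormatHWID(42)
	seq, _ := ParseHWID(id)
	fmt.Println(id, seq) // prints "HW-00042 42"
}
```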

Should have (differentiators):

  • AI photo intake (zero-entry) — the core value proposition; photo to structured NetBox record via Gemma 4 vision
  • Three-tier AI escalation — local Gemma 4 then SearXNG research agent then Claude Opus; offline-capable and cost-optimal
  • Lab Advisor chat — strategic homelab Q&A with full inventory context; unique among ITAM tools
  • Cable test workflow (Treedix + FNIRSI FNB58) — USB serial integration; test results attached to cable records
  • Catalog quality gate — draft / indexed / needs_research / researched / complete; AI-driven state transitions
  • Natural language inventory search — LLM query translation to NetBox API filter, running locally
  • SearXNG product research (Tier 2) — auto-fills specs, EAN/part numbers, pricing for incomplete records

Defer (v2+):

  • Barcode scanner HID support — only if photo intake proves slow for high-volume bulk intake
  • Bulk CSV import — bypasses quality gate; NetBox native import is better path for migrations
  • Predictive failure / warranty alerts — disproportionate infrastructure for homelab context
  • Multi-user / RBAC — NetBox already provides this if genuinely needed; adds no solo-operator value

Architecture Approach

The system is a layered modular monolith: React SPA, Go HTTP layer (thin handlers), Service layer (Inventory, Advisor, Label/Testing), Client/Adapter layer (NetBox client, AI client, SQLite), External services (NetBox LXC, oMLX, OpenRouter, SearXNG). The USB Manager runs outside the service layer, managing dedicated goroutines per device with typed command/event channels. HTTP handlers send commands to USB Manager channels and receive results via Server-Sent Events — no polling, no WebSockets. The Go binary embeds the React build via go:embed for single-binary deployment.
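The SSE path described above can be sketched with the standard library alone. This is an illustration under assumptions: the `Event` type, the `/api/events` route, and the `print_status` event name are hypothetical, the project itself plans to use Gin rather than bare `net/http`, and a real USB Manager would fan events out to one channel per connected client.

```go
package main

import (
	"fmt"
	"net/http"
)

// Event is a hypothetical USB status event.
type Event struct {
	Name string // SSE event name
	Data string // single-line payload, typically JSON
}

// formatSSE renders one event in the text/event-stream wire format.
func formatSSE(ev Event) string {
	return fmt.Sprintf("event: %s\ndata: %s\n\n", ev.Name, ev.Data)
}

// sseHandler streams events from ch until the client disconnects or ch closes.
func sseHandler(ch <-chan Event) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for {
			select {
			case <-r.Context().Done():
				return // client went away
			case ev, open := <-ch:
				if !open {
					return
				}
				fmt.Fprint(w, formatSSE(ev))
				flusher.Flush() // push the event immediately
			}
		}
	}
}

func main() {
	events := make(chan Event)
	http.Handle("/api/events", sseHandler(events))
	// http.ListenAndServe(":8080", nil) would start serving; omitted here.
	fmt.Print(formatSSE(Event{Name: "print_status", Data: `{"state":"done"}`}))
}
```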

Major components:

  1. AI Orchestrator (internal/ai/orchestrator.go) — owns all tier routing, escalation logic, prompt construction; services never reference model names
  2. USB Manager (internal/usb/manager.go) — goroutine-per-device, VID/PID enumeration, command channels in / event channels out; owns PRT Qutie, 3x Treedix, FNIRSI FNB58
  3. NetBox Client (internal/netbox/client.go) — sole integration point for all NetBox REST calls; repository pattern; separate Go types for read vs. write custom fields
  4. Inventory Service (internal/inventory/service.go) — photo intake flow, quality gate state machine, record lifecycle
  5. Advisor Service (internal/advisor/service.go) — chat session, NetBox context assembly, response streaming via SSE
  6. SQLite store — chat history (append-only), config, write-ahead queue for pending NetBox operations
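The VID/PID enumeration owned by the USB Manager can be illustrated without any hardware. In the sketch below, `PortInfo` is a hypothetical struct standing in for whatever the serial library's enumerator reports, and the VID/PID/serial values are made up; the point is that the stable identity lives in config and the path is looked up fresh on each connect.

```go
package main

import (
	"fmt"
	"strings"
)

// PortInfo stands in for one entry of a USB-aware port listing.
type PortInfo struct {
	Path   string // OS device path; changes across replugs
	VID    string // vendor ID, hex
	PID    string // product ID, hex
	Serial string // USB serial number
}

// DeviceID is the stable identity stored in config instead of a path.
type DeviceID struct{ VID, PID, Serial string }

// ResolvePath finds the current path for a device. An empty Serial matches
// the first port with the right VID/PID, which is ambiguous when several
// identical devices are attached.
func ResolvePath(ports []PortInfo, want DeviceID) (string, error) {
	for _, p := range ports {
		if strings.EqualFold(p.VID, want.VID) &&
			strings.EqualFold(p.PID, want.PID) &&
			(want.Serial == "" || p.Serial == want.Serial) {
			return p.Path, nil
		}
	}
	return "", fmt.Errorf("device %s:%s (serial %q) not connected", want.VID, want.PID, want.Serial)
}

func main() {
	// Hypothetical listing with two identical testers on different paths.
	ports := []PortInfo{
		{Path: "/dev/cu.usbmodem1101", VID: "1a86", PID: "7523", Serial: "A1"},
		{Path: "/dev/cu.usbmodem1201", VID: "1A86", PID: "7523", Serial: "B2"},
	}
	path, err := ResolvePath(ports, DeviceID{VID: "1a86", PID: "7523", Serial: "B2"})
	fmt.Println(path, err)
}
```

With three testers of the same model attached, the serial number is the only field that distinguishes them, so it should not be left empty for those devices.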

Critical Pitfalls

  1. USB serial path churn on device replug — macOS assigns dynamic /dev/cu.* paths; hard-coding them means wrong devices get targeted after any replug. Enumerate by USB VID/PID + serial number from day one; re-resolve path on every reconnect.
  2. Goroutine leak on USB disconnect: serial.Read() does not reliably unblock when a port is closed from another goroutine. Wrap every read loop with context.Context cancellation; use go.bug.st/serial (has explicit unblock support); write a replug-cycle goroutine count test before any feature work.
  3. NetBox custom field write/read asymmetry — PATCH returns HTTP 200 but the field is silently not written if you send the read format (nested object) instead of the write format (ID array). Write round-trip tests (PATCH then GET, assert value) for every custom field before building any intake flow that depends on them.
  4. AI misidentification with no quality gate enforcement — Gemma 4 returns high-confidence wrong classifications for visually ambiguous hardware. Define hard confidence thresholds in config (e.g., below 0.7 pins at needs_research); never auto-advance past indexed without threshold clearance or explicit operator confirmation.
  5. PRT Qutie protocol unknown: no public USB raw protocol docs exist. Hardware arrives 2026-04-13. First action on arrival is protocol characterization (Wireshark USB capture, check CDC-ACM vs. HID), not feature development. Fallback: Brother QL-820NWBc (documented Brother raster command protocol).
  6. 16GB unified memory budget — Gemma 4 E4B uses 5-8GB; with Go backend, browser, and macOS overhead, memory pressure is a real risk. Run a full-stack memory profiling session before committing to model selection.
  7. No NetBox offline buffer — without a write-ahead queue in SQLite, any NetBox LXC hiccup makes the entire app non-functional during intake. Add a pending_operations table to SQLite in the NetBox integration phase, before any intake workflow is built.
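For pitfall 3, one way to make the read/write asymmetry hard to get wrong is to keep separate Go types for the two shapes and convert explicitly. The sketch below assumes a hypothetical multi-object custom field named `related_devices` and a text field named `quality_gate`; the real field names and types depend on how the custom fields are provisioned.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ObjectRef is the nested form returned when reading an object-type field.
type ObjectRef struct {
	ID      int    `json:"id"`
	Display string `json:"display"`
}

// CustomFieldsRead mirrors the shape of custom_fields on GET.
type CustomFieldsRead struct {
	QualityGate    string      `json:"quality_gate"`
	RelatedDevices []ObjectRef `json:"related_devices"`
}

// CustomFieldsWrite mirrors the shape expected on PATCH: plain IDs.
type CustomFieldsWrite struct {
	QualityGate    string `json:"quality_gate"`
	RelatedDevices []int  `json:"related_devices"`
}

// ToWrite converts the read shape into the write shape.
func (r CustomFieldsRead) ToWrite() CustomFieldsWrite {
	ids := make([]int, 0, len(r.RelatedDevices))
	for _, ref := range r.RelatedDevices {
		ids = append(ids, ref.ID)
	}
	return CustomFieldsWrite{QualityGate: r.QualityGate, RelatedDevices: ids}
}

// PatchBody wraps the write shape in the custom_fields envelope.
func PatchBody(w CustomFieldsWrite) ([]byte, error) {
	return json.Marshal(map[string]any{"custom_fields": w})
}

func main() {
	read := CustomFieldsRead{
		QualityGate:    "indexed",
		RelatedDevices: []ObjectRef{{ID: 3, Display: "switch-01"}, {ID: 7, Display: "nas-01"}},
	}
	body, _ := PatchBody(read.ToWrite())
	fmt.Println(string(body))
	// prints {"custom_fields":{"quality_gate":"indexed","related_devices":[3,7]}}
}
```

Because the PATCH helper only accepts `CustomFieldsWrite`, sending the nested read form becomes a compile error instead of a silent no-op. The round-trip tests are still needed to confirm the server actually stored the value.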

Implications for Roadmap

Based on research, the architecture has a clear dependency DAG. The NetBox client and AI client must be stable before any services are built. USB hardware cannot be finalized until it arrives and protocols are characterized. The suggested 7-phase structure follows these hard dependencies.

Phase 1: Foundation — Infrastructure and NetBox Integration

Rationale: Every feature in the system reads or writes through NetBox. This is the hardest dependency and must be resolved first. The write-ahead queue belongs here too; retrofitting it later is a pitfall.
Delivers: Running Go binary with NetBox API client; all custom fields provisioned in NetBox; round-trip test suite for custom fields; SQLite schema (chat + write-ahead queue); Vite dev proxy configured.
Addresses: Asset CRUD, HW-XXXXX ID scheme, custom fields schema for quality gate and all HWLab-specific fields.
Avoids: NetBox custom field write/read asymmetry (Pitfall 3); NetBox downtime / no offline buffer (Pitfall 7).
Research flag: Low. NetBox REST API is well-documented; go-netbox/v4 is official.
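The write-ahead queue in this phase can be sketched as follows. An in-memory slice stands in for the SQLite pending_operations table, and the `PendingOp` columns are assumptions, not a settled schema. The part worth illustrating is the drain policy: apply oldest-first and stop at the first failure so operations reach NetBox in the order they were recorded.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNetBoxDown is a placeholder error for an unreachable NetBox.
var ErrNetBoxDown = errors.New("netbox unreachable")

// PendingOp is one row of the hypothetical pending_operations table.
type PendingOp struct {
	ID       int
	Method   string // e.g. "PATCH"
	Path     string // NetBox API path
	Body     string // JSON payload
	Attempts int
}

// Queue is an in-memory stand-in for the SQLite-backed queue.
type Queue struct {
	ops    []PendingOp
	nextID int
}

// Enqueue records an operation before it is attempted against NetBox.
func (q *Queue) Enqueue(method, path, body string) int {
	q.nextID++
	q.ops = append(q.ops, PendingOp{ID: q.nextID, Method: method, Path: path, Body: body})
	return q.nextID
}

// Len reports how many operations are still pending.
func (q *Queue) Len() int { return len(q.ops) }

// Drain applies operations oldest-first and stops at the first failure,
// leaving the failed operation and everything after it in the queue.
func (q *Queue) Drain(apply func(PendingOp) error) (applied int, err error) {
	for len(q.ops) > 0 {
		if err := apply(q.ops[0]); err != nil {
			q.ops[0].Attempts++
			return applied, err
		}
		q.ops = q.ops[1:]
		applied++
	}
	return applied, nil
}

func main() {
	q := &Queue{}
	q.Enqueue("PATCH", "/api/dcim/devices/1/", `{"status":"active"}`)
	q.Enqueue("PATCH", "/api/dcim/devices/2/", `{"status":"active"}`)
	n, err := q.Drain(func(op PendingOp) error {
		if op.ID == 2 {
			return ErrNetBoxDown // simulate an outage mid-drain
		}
		return nil
	})
	fmt.Println(n, err, q.Len())
}
```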

Phase 2: AI Pipeline — Local Inference and Photo Intake

Rationale: AI photo intake is the core differentiator. It must be validated before building the UI around it. oMLX memory budget must be verified before model selection is finalized.
Delivers: oMLX + Gemma 4 setup and profiled memory budget; AI Orchestrator with Tier 1 (local) and Tier 2 (OpenRouter) routing; photo intake endpoint (POST /api/intake); quality gate state machine enforcing confidence thresholds; structured output parsing from vision model.
Uses: go-openai client with configurable base URL; oMLX at localhost; openrouter.ai for escalation.
Implements: AI Orchestrator, Inventory Service (intake), quality gate state machine.
Avoids: AI misidentification / quality gate bypass (Pitfall 4); 16GB memory exhaustion (Pitfall 6); mixing AI tiers in service logic (Architecture Anti-Pattern 3).
Research flag: MEDIUM. Three-tier confidence scoring calibration for vision tasks is novel; it needs real hardware photos to tune thresholds.
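A sketch of the confidence-threshold check in the quality gate. The five state names come from the research; the transition rule shown here is an assumption for illustration (a freshly classified record lands at indexed only if it clears the threshold or the operator confirms it, otherwise it is pinned at needs_research), and 0.7 is the uncalibrated estimate mentioned elsewhere in this summary.

```go
package main

import "fmt"

// State is a quality gate state.
type State string

const (
	Draft         State = "draft"
	Indexed       State = "indexed"
	NeedsResearch State = "needs_research"
	Researched    State = "researched"
	Complete      State = "complete"
)

// Gate holds the configured confidence threshold.
type Gate struct {
	Threshold float64
}

// AfterClassification decides where a freshly classified draft lands.
// A NaN confidence fails the comparison and therefore pins the record
// at needs_research, which is the safe direction.
func (g Gate) AfterClassification(confidence float64, operatorConfirmed bool) State {
	if operatorConfirmed || confidence >= g.Threshold {
		return Indexed
	}
	return NeedsResearch
}

func main() {
	gate := Gate{Threshold: 0.7} // placeholder value pending calibration
	fmt.Println(gate.AfterClassification(0.65, false)) // needs_research
	fmt.Println(gate.AfterClassification(0.91, false)) // indexed
	fmt.Println(gate.AfterClassification(0.20, true))  // indexed (operator override)
}
```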

Phase 3: React SPA and Inventory Dashboard

Rationale: Frontend needs a stable backend API (Phases 1-2) to be useful. Building the UI before the backend is stable creates churn.
Delivers: React SPA with Vite (embedded via go:embed), inventory dashboard (TanStack Query via Go API), asset detail view with photo, quality gate status with human-readable labels and action buttons, basic text search, intake flow UI with inline AI classification correction.
Uses: React 18 + TypeScript 5, shadcn/ui + Tailwind 3, TanStack Query, Zustand, react-hot-toast.
Implements: All frontend views; SSE subscription for async intake progress updates.
Avoids: Quality gate status shown as enum code (UX Pitfall); no progress feedback during AI intake (UX Pitfall).
Research flag: Low. React + TanStack Query + shadcn/ui are well-documented, established patterns.

Phase 4: USB Hardware Characterization and Label Printing

Rationale: USB hardware arrives 2026-04-13. Protocol characterization is a spike that must complete before any label printing architecture is committed to. This phase has the highest uncertainty.
Delivers: Protocol characterization for PRT Qutie (and decision on CDC-ACM vs. Bluetooth fallback); USB Manager with goroutine-per-device, VID/PID enumeration, channel fan-out, reconnect handling; QR label generation and print flow; SSE events for print status.
Implements: USB Manager, Label Service, printer sub-package.
Avoids: PRT Qutie protocol unknown (Pitfall 5); USB serial path churn (Pitfall 1); goroutine leak on disconnect (Pitfall 2).
Research flag: HIGH. The PRT Qutie protocol is unknown. A dedicated hardware characterization spike is required on 2026-04-13. Do not plan label printing features before the spike completes.
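The goroutine-per-device read loop with cancellation might look like the sketch below. It is written against `io.ReadCloser` so it runs without hardware, and it assumes that closing the port unblocks a pending `Read`; that holds for the in-memory pipe used in the demo and must be verified against the real serial library with the replug-cycle test. `demoReplug` is a hypothetical helper for illustration.

```go
package main

import (
	"context"
	"fmt"
	"io"
)

// readLoop forwards chunks from port to events until ctx is cancelled or
// the port fails. A watcher goroutine closes the port on cancellation so a
// blocked Read returns; the watcher itself exits when the loop returns.
func readLoop(ctx context.Context, port io.ReadCloser, events chan<- []byte) error {
	stop := make(chan struct{})
	defer close(stop)
	go func() {
		select {
		case <-ctx.Done():
			port.Close() // unblock the pending Read
		case <-stop:
		}
	}()

	buf := make([]byte, 256)
	for {
		n, err := port.Read(buf)
		if n > 0 {
			chunk := append([]byte(nil), buf[:n]...) // copy; buf is reused
			select {
			case events <- chunk:
			case <-ctx.Done():
				return ctx.Err()
			}
		}
		if err != nil {
			if ctx.Err() != nil {
				return ctx.Err() // shutdown requested, not a device fault
			}
			return err
		}
	}
}

// demoReplug pushes one chunk through an in-memory pipe, cancels, and
// returns the chunk together with the error the loop exited with.
func demoReplug() (string, error) {
	pr, pw := io.Pipe()
	ctx, cancel := context.WithCancel(context.Background())
	events := make(chan []byte, 1)
	done := make(chan error, 1)
	go func() { done <- readLoop(ctx, pr, events) }()
	go pw.Write([]byte("OK\n"))
	chunk := <-events
	cancel()
	return string(chunk), <-done
}

func main() {
	chunk, err := demoReplug()
	fmt.Printf("%q %v\n", chunk, err)
}
```

Returning `ctx.Err()` on shutdown lets the USB Manager tell a deliberate stop apart from a device fault that should trigger a reconnect.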

Phase 5: Cable Test Integration

Rationale: Builds directly on the USB Manager foundation from Phase 4. Cannot start until the Phase 4 USB Manager is in place and the Treedix serial protocols are understood.
Delivers: Treedix cable tester protocol implementation (3 testers); FNIRSI FNB58 power meter (continuous 10ms sample integration on host); cable test workflow UI; test results written to NetBox cable record; quality gate advance for tested cables.
Implements: Tester and powermeter sub-packages; Label/Testing Service.
Avoids: Cable test results shown as raw hex (UX Pitfall); FNIRSI raw sample integration assumption (Integration Gotcha).
Research flag: MEDIUM. Treedix serial protocol needs reverse engineering; FNIRSI FNB58 has a reference implementation (baryluk/fnirsi-usb-power-data-logger) but protocol parsing is non-trivial.
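The host-side integration of 10ms samples amounts to summing power and current over time. The sketch below assumes each decoded sample carries volts and amps at a fixed interval; the actual FNB58 wire format and any timing jitter are outside its scope, and the `Sample` and `Totals` types are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// SampleInterval is the sample spacing named in the research.
const SampleInterval = 10 * time.Millisecond

// Sample is one decoded power-meter reading (assumed shape).
type Sample struct {
	Volts float64
	Amps  float64
}

// Totals accumulates energy and charge over a test run.
type Totals struct {
	EnergyWh   float64
	CapacityAh float64
}

// Integrate sums fixed-interval samples with a rectangle rule:
// energy += V * I * dt and capacity += I * dt, with dt in hours.
func Integrate(samples []Sample, interval time.Duration) Totals {
	hours := interval.Hours()
	var t Totals
	for _, s := range samples {
		t.EnergyWh += s.Volts * s.Amps * hours
		t.CapacityAh += s.Amps * hours
	}
	return t
}

func main() {
	// One hour of a steady 5 V, 2 A load is 360,000 samples at 10 ms.
	samples := make([]Sample, 360000)
	for i := range samples {
		samples[i] = Sample{Volts: 5, Amps: 2}
	}
	t := Integrate(samples, SampleInterval)
	fmt.Printf("%.3f Wh, %.3f Ah\n", t.EnergyWh, t.CapacityAh) // 10.000 Wh, 2.000 Ah
}
```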

Phase 6: Lab Advisor Chat

Rationale: No USB dependency; can overlap with Phase 4/5 if velocity allows. Scheduled after Phase 5 to ensure inventory has meaningful data for Q&A.
Delivers: Lab Advisor chat interface; AdvisorService with NetBox context assembly; Claude Opus via OpenRouter (Tier 3); streaming response via SSE; SQLite chat history (append-only); OpenRouter per-call token limit and spend safeguards.
Implements: Advisor Service, chat history persistence.
Avoids: OpenRouter runaway spend; OpenRouter key in frontend bundle.
Research flag: Low. SSE streaming, SQLite append-only, and OpenRouter integration are established patterns.
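One possible shape for the per-call token limit and spend safeguard is a small guard consulted before every Tier 3 request. Everything here is an assumption for illustration: the `SpendGuard` name, tracking cost in micro-dollars, and reserving an estimated cost up front rather than reconciling against the billed amount afterwards.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var (
	ErrTokenLimit     = errors.New("per-call token limit exceeded")
	ErrBudgetExceeded = errors.New("spend budget exceeded")
)

// SpendGuard enforces a per-call token cap and a total spend budget.
type SpendGuard struct {
	mu           sync.Mutex
	maxTokens    int
	budgetMicros int64 // budget in millionths of a dollar
	spentMicros  int64
}

func NewSpendGuard(maxTokensPerCall int, budgetMicroUSD int64) *SpendGuard {
	return &SpendGuard{maxTokens: maxTokensPerCall, budgetMicros: budgetMicroUSD}
}

// Authorize checks a planned call and, if allowed, reserves its estimated cost.
func (g *SpendGuard) Authorize(maxTokens int, estCostMicroUSD int64) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if maxTokens <= 0 || maxTokens > g.maxTokens {
		return fmt.Errorf("%w: requested %d, cap %d", ErrTokenLimit, maxTokens, g.maxTokens)
	}
	if estCostMicroUSD < 0 || g.spentMicros+estCostMicroUSD > g.budgetMicros {
		return ErrBudgetExceeded
	}
	g.spentMicros += estCostMicroUSD
	return nil
}

// Spent reports the total reserved so far.
func (g *SpendGuard) Spent() int64 {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.spentMicros
}

func main() {
	guard := NewSpendGuard(4096, 1_000_000) // hypothetical limits: 4096 tokens, $1.00
	fmt.Println(guard.Authorize(1000, 600_000))
	fmt.Println(guard.Authorize(1000, 600_000)) // would exceed the budget
	fmt.Println(guard.Spent())
}
```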

Phase 7: SearXNG Research Agent and Quality Gate Automation

Rationale: This is an enhancement layer: all primitives (AI Orchestrator, NetBox client, SearXNG) exist by this point. It adds intelligence to the quality gate by automating the needs_research to researched transition.
Delivers: SearXNG research agent (Tier 2) with query sanitization; automated quality gate advancement; full 5-state quality gate lifecycle; natural language inventory search (Gemma 4 query translation to NetBox filter).
Implements: Search client (internal/search/); extends AI Orchestrator with research task type.
Avoids: SearXNG receiving unsanitized AI-generated queries; three-tier escalation never triggering in practice.
Research flag: Low. SearXNG JSON API is straightforward; query sanitization pattern is well-understood.
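Query sanitization before calling SearXNG could be sketched as below. The specific rules (replace control characters, collapse whitespace, cap at 200 runes) and the instance URL are assumptions chosen for illustration; the `q` and `format=json` parameters follow SearXNG's search API, and the target instance must have the JSON format enabled.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"unicode"
)

// maxQueryRunes is an assumed cap on AI-generated query length.
const maxQueryRunes = 200

// SanitizeQuery replaces control characters with spaces, collapses runs of
// whitespace, and truncates overly long queries.
func SanitizeQuery(raw string) string {
	cleaned := strings.Map(func(r rune) rune {
		if unicode.IsControl(r) {
			return ' '
		}
		return r
	}, raw)
	cleaned = strings.Join(strings.Fields(cleaned), " ")
	if runes := []rune(cleaned); len(runes) > maxQueryRunes {
		cleaned = strings.TrimSpace(string(runes[:maxQueryRunes]))
	}
	return cleaned
}

// SearchURL builds the request URL for a SearXNG JSON search.
func SearchURL(base, rawQuery string) (string, error) {
	q := SanitizeQuery(rawQuery)
	if q == "" {
		return "", fmt.Errorf("empty query after sanitization")
	}
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = strings.TrimRight(u.Path, "/") + "/search"
	v := url.Values{}
	v.Set("q", q)
	v.Set("format", "json")
	u.RawQuery = v.Encode() // Encode escapes the query and sorts keys
	return u.String(), nil
}

func main() {
	// The host below is a hypothetical local instance.
	s, err := SearchURL("http://searxng.local:8080", "Netgear  GS108\n specs")
	fmt.Println(s, err)
}
```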

Phase Ordering Rationale

  • NetBox first because it is the dependency of every other component — no feature works without it.
  • AI before UI because the intake endpoint behavior (confidence thresholds, quality gate states) defines what the UI must display.
  • USB hardware characterization as a spike in Phase 4 rather than embedded in a feature phase — isolates the highest-uncertainty work and prevents it from blocking UI or AI development.
  • Cable testing after label printing (Phase 5 after Phase 4) because both share the USB Manager; the Manager must be solid before adding more device types.
  • Lab Advisor after core inventory is populated (Phase 6 after Phase 5) — the advisor is only valuable when it has inventory context to reason about.
  • Research agent last (Phase 7) because it is pure enhancement; all tiers, the quality gate state machine, and the intake flow must exist first.

Research Flags

Phases needing /gsd-research-phase during planning:

  • Phase 2 (AI Pipeline): Confidence threshold calibration for hardware vision tasks is novel; no established benchmarks; needs real-photo experiments to tune.
  • Phase 4 (USB Hardware Characterization): PRT Qutie protocol is completely unknown. Spike required on hardware arrival (2026-04-13) before any architecture decisions for label printing.
  • Phase 5 (Cable Test Integration): Treedix serial protocols require reverse engineering; FNIRSI FNB58 continuous sample integration has a reference impl but is non-trivial.

Phases with standard patterns (skip research-phase):

  • Phase 1 (Foundation): NetBox REST API and go-netbox/v4 are official and well-documented.
  • Phase 3 (React SPA): TanStack Query + shadcn/ui + SSE subscription are standard modern React patterns.
  • Phase 6 (Lab Advisor): OpenRouter streaming + SQLite append-only + SSE are established patterns.
  • Phase 7 (Research Agent): SearXNG JSON API integration is straightforward.

Confidence Assessment

| Area | Confidence | Notes |
|---|---|---|
| Stack | MEDIUM-HIGH | Core stack confirmed via web search and official sources. Library versions come from training data; pin and verify before first install. mattn/go-sqlite3 requires cgo + gcc; modernc.org/sqlite is a pure-Go fallback if cross-compilation matters. |
| Features | MEDIUM-HIGH | Table-stakes features are well-established ITAM patterns (HIGH). The AI photo intake zero-entry claim is novel: no competitor exists to validate against (LOW for that specific claim; validate after v1 intake loop). |
| Architecture | HIGH | Patterns are established Go idioms. NetBox repository pattern, goroutine-per-device, SSE fan-out, and embedded SPA are all well-documented. Three-tier AI orchestrator follows established orchestration patterns. |
| Pitfalls | MEDIUM-HIGH | USB serial goroutine leak and path churn are confirmed community issues with documented fixes. NetBox custom field asymmetry confirmed via multiple GitHub issues. The effort needed to characterize the PRT Qutie protocol is unknown (anywhere from 1 day to 2 weeks). Memory budget needs empirical measurement. |

Overall confidence: MEDIUM-HIGH

Gaps to Address

  • PRT Qutie USB protocol: Complete unknown until hardware arrives 2026-04-13. Block label printing architecture decisions until characterization spike completes.
  • Gemma 4 memory budget under full stack: Must measure empirically on the target Mac Mini M4 before committing to model tier selection.
  • Gemma 4 vision confidence scoring: No established benchmarks for hardware photo classification. The 0.7 threshold is an informed estimate — calibrate against real photos during Phase 2.
  • Treedix cable tester serial protocol: No public documentation. Must reverse-engineer from USB traffic capture. Estimated complexity unknown.
  • go-netbox/v4 custom field lag: The official client may lag NetBox 4.2's custom field API. Plan to hand-roll custom field write/read wrappers regardless.
  • oMLX project longevity: oMLX (jundot/omlx) is a smaller project. If it becomes unmaintained, Ollama with MLX backend is the fallback.

Sources

Primary (HIGH confidence)

Secondary (MEDIUM confidence)

Tertiary (LOW confidence — needs validation)

  • AI photo intake zero-entry claim: no competitor exists to validate against; validate after v1 intake loop
  • Confidence threshold of 0.7 for quality gate escalation: informed estimate, requires empirical calibration with real hardware photos
  • PRT Qutie protocol nature (CDC-ACM vs. HID vs. Bluetooth-only): unknown until hardware arrival 2026-04-13

Research completed: 2026-04-09
Ready for roadmap: yes