---
phase: 01-foundation
plan: "05"
subsystem: queue
tags: [dragonfly, redis, write-ahead-queue, worker, graceful-shutdown]
dependency_graph:
  requires: [01-01-SUMMARY.md]
  provides: [internal/queue WAQ, RunWorker, NoOpHandler]
  affects: [cmd/hwlab/main.go]
tech_stack:
  added: [github.com/redis/go-redis/v9 v9.18.0, github.com/google/uuid v1.6.0]
  patterns: [RPUSH/BLPOP FIFO queue, context cancellation worker loop, non-fatal degraded init]
key_files:
  created:
    - internal/queue/waq.go
    - internal/queue/waq_test.go
    - internal/queue/worker.go
  modified:
    - cmd/hwlab/main.go
    - go.mod
    - go.sum
decisions:
  - "Custom URL parser (regex + redis.Options) required because passwords with forward slashes break url.Parse and redis.ParseURL"
  - "WAQ init is non-fatal; binary starts with WARNING log when DragonFlyDB unreachable"
  - "NoOpHandler placeholder drains Phase 1 queue; Phase 2 will replace with real NetBox retry"
metrics:
  duration: "~10 minutes"
  completed: "2026-04-10T05:22:13Z"
  tasks_completed: 2
  files_modified: 5
requirements: [NB-05]
---

# Phase 01 Plan 05: DragonFlyDB Write-Ahead Queue Summary

DragonFlyDB-backed write-ahead queue with RPUSH/BLPOP FIFO ordering, BLPOP retry worker with context cancellation and exponential backoff, and graceful shutdown wired into main binary.

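
As a rough illustration of the retry worker summarized above, the sketch below runs the dequeue, handle, re-enqueue cycle against an in-memory stand-in rather than DragonFlyDB. It is not the code in `internal/queue/worker.go`: the `Queue` interface, `memQueue`, `runWorker`, `runDemo`, the `PendingOp` field names, the fixed wait between retries, and the choice to drop an op once `Attempts` reaches `maxAttempts` are all assumptions made for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// PendingOp carries only the fields this sketch needs; names are assumptions.
type PendingOp struct {
	ID       string
	Type     string
	Attempts int
}

// Queue is a hypothetical interface covering the two WAQ calls the loop uses.
type Queue interface {
	Enqueue(ctx context.Context, op *PendingOp) error
	// Dequeue returns nil, nil when nothing arrives within timeout.
	Dequeue(ctx context.Context, timeout time.Duration) (*PendingOp, error)
}

// Handler processes one op; a non-nil error asks for a retry.
type Handler func(ctx context.Context, op *PendingOp) error

// memQueue is an in-memory stand-in: append mimics RPUSH, pop-front mimics BLPOP.
type memQueue struct{ ops []*PendingOp }

func (q *memQueue) Enqueue(_ context.Context, op *PendingOp) error {
	q.ops = append(q.ops, op)
	return nil
}

func (q *memQueue) Dequeue(ctx context.Context, timeout time.Duration) (*PendingOp, error) {
	if len(q.ops) > 0 {
		op := q.ops[0]
		q.ops = q.ops[1:]
		return op, nil
	}
	select { // empty queue: block until cancellation or timeout
	case <-ctx.Done():
		return nil, ctx.Err()
	case <-time.After(timeout):
		return nil, nil
	}
}

// runWorker loops until ctx is cancelled and reports how many ops it dropped.
func runWorker(ctx context.Context, q Queue, h Handler, maxAttempts int, retryInterval time.Duration) int {
	dropped := 0
	for ctx.Err() == nil {
		op, err := q.Dequeue(ctx, 5*time.Second)
		if err != nil {
			select { // back off, but leave promptly on shutdown
			case <-ctx.Done():
			case <-time.After(retryInterval):
			}
			continue
		}
		if op == nil {
			continue // timeout with nothing queued
		}
		if err := h(ctx, op); err != nil {
			op.Attempts++
			if op.Attempts >= maxAttempts {
				dropped++ // give up so a poison op cannot loop forever
				continue
			}
			_ = q.Enqueue(ctx, op) // back of the queue for another try
		}
	}
	return dropped
}

// runDemo feeds one op that succeeds and one that always fails.
func runDemo() (handled []string, failures, dropped int) {
	ctx, cancel := context.WithTimeout(context.Background(), 100*time.Millisecond)
	defer cancel()

	q := &memQueue{}
	_ = q.Enqueue(ctx, &PendingOp{ID: "a", Type: "ok"})
	_ = q.Enqueue(ctx, &PendingOp{ID: "b", Type: "bad"})

	h := func(_ context.Context, op *PendingOp) error {
		if op.Type == "bad" {
			failures++
			return errors.New("simulated handler failure")
		}
		handled = append(handled, op.ID)
		return nil
	}
	dropped = runWorker(ctx, q, h, 3, 10*time.Millisecond)
	return handled, failures, dropped
}

func main() {
	handled, failures, dropped := runDemo()
	fmt.Println(handled, failures, dropped)
}
```

With `maxAttempts` set to 3, the failing op is tried three times and then dropped, while the other op is handled once. Re-enqueueing at the tail keeps one bad op from blocking the ops queued behind it.
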
## Tasks Completed

| Task | Name | Commit | Files |
|------|------|--------|-------|
| 1 | Write-ahead queue core (Enqueue, Dequeue, Len) | e07ad92 | internal/queue/waq.go, internal/queue/waq_test.go, go.mod, go.sum |
| 2 | WAQ retry worker + wire into main binary | d1192c3 | internal/queue/worker.go, cmd/hwlab/main.go |

## What Was Built

### internal/queue/waq.go

- `PendingOp` struct: UUID ID, operation type, `json.RawMessage` payload, created_at timestamp, retry attempt counter
- `NewPendingOp(opType, payload)`: constructs op with generated UUID
- `WAQ` type wrapping `*redis.Client`
- `NewWAQ(url)`: connects to DragonFlyDB, pings on init, returns error if unreachable
- `Enqueue(ctx, op)`: RPUSH to `hwlab:netbox:pending_ops`
- `Dequeue(ctx, timeout)`: blocking BLPOP, returns `nil, nil` on timeout
- `Len(ctx)`: LLEN queue depth
- `Close()`: releases connection

### internal/queue/worker.go

- `RunWorker(ctx, handler, maxAttempts, retryInterval)`: BLPOP loop on the WAQ
  - Context cancellation triggers clean exit
  - Connection errors trigger `retryInterval` backoff (T-05-04 mitigation)
  - Handler errors increment `op.Attempts` and re-enqueue
  - Ops exceeding `maxAttempts` are dropped with a warning log (T-05-03 mitigation)
- `NoOpHandler`: Phase 1 placeholder that logs and drains ops

### cmd/hwlab/main.go

- `signal.NotifyContext` for SIGINT/SIGTERM graceful shutdown
- Non-fatal WAQ init: `WARNING` log when DragonFlyDB unavailable, binary continues serving
- `go waq.RunWorker(ctx, ...)` goroutine started after successful WAQ init
- `defer waq.Close()` on clean path
- HTTP server runs in goroutine; `srv.Shutdown(shutdownCtx)` with 10s timeout on signal

## DragonFlyDB Integration Test Results

DragonFlyDB at `10.5.0.10:6379` was reachable during execution.

```
=== RUN   TestWAQEnqueueDequeue
2026/04/10 05:21:23 WAQ connected to DragonFlyDB
--- PASS: TestWAQEnqueueDequeue (0.02s)
PASS
```

The URL `redis://:nUq/IfoIQJf/kouckKHRQOk7vV0NwCuI@10.5.0.10:6379` contains forward slashes in the password, which causes both Go's `url.Parse` and `redis.ParseURL` to fail (they misinterpret the slashes as path separators). See the Deviations from Plan section.

## Final Test Suite Output

```
?    git.georgsen.dk/hwlab                        [no test files]
?    git.georgsen.dk/hwlab/cmd/hwlab              [no test files]
?    git.georgsen.dk/hwlab/internal/api           [no test files]
ok   git.georgsen.dk/hwlab/internal/api/handlers  0.006s
ok   git.georgsen.dk/hwlab/internal/config        0.006s
ok   git.georgsen.dk/hwlab/internal/netbox        0.003s
ok   git.georgsen.dk/hwlab/internal/queue         0.004s
```

All packages green. No regressions.

## Deviations from Plan

### Auto-fixed Issues

**1. [Rule 1 - Bug] Custom URL parser for passwords with forward slashes**

- **Found during:** Task 1, integration test run
- **Issue:** The password in `redis://:nUq/IfoIQJf/kouckKHRQOk7vV0NwCuI@10.5.0.10:6379` contains `/` characters. Go's `url.Parse` treats them as path separators, producing `invalid port ":nUq" after host`. `redis.ParseURL` delegates to `url.Parse` and inherits the same failure. The plan called for `redis.ParseURL`, but that function cannot handle this URL.
- **Fix:** Added `parseRedisURL()` in `waq.go`, which tries `redis.ParseURL` first (fast path for standard passwords), then falls back to a `regexp`-based extractor that captures password, host, port, and db directly, bypassing `url.Parse` entirely. It then builds the `redis.Options` struct from the captured values.
- **Files modified:** internal/queue/waq.go
- **Commit:** e07ad92

## Threat Mitigations Applied

| Threat ID | Mitigation | Location |
|-----------|-----------|----------|
| T-05-03 | `maxAttempts` drop prevents unbounded queue growth | worker.go:44 |
| T-05-04 | `retryInterval` backoff on connection loss prevents tight-loop hammering | worker.go:32-37 |

## Known Stubs

- `NoOpHandler` in `internal/queue/worker.go`: Phase 1 placeholder. Logs ops and returns nil (success), causing queued ops to drain without actual processing. Phase 2 NetBox integration will replace this with a real retry handler that re-drives failed NetBox API calls.

## Self-Check: PASSED

- internal/queue/waq.go: FOUND
- internal/queue/waq_test.go: FOUND
- internal/queue/worker.go: FOUND
- cmd/hwlab/main.go: FOUND (modified)
- Commit e07ad92: FOUND
- Commit d1192c3: FOUND
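
The regex fallback described under Deviations from Plan could look roughly like the sketch below. It is an illustration, not the `parseRedisURL()` committed in `waq.go`: the pattern, the `redisTarget` struct (a stand-in for the `redis.Options` fields the real code fills in), and the function name `parseRedisURLFallback` are assumptions, and the example password is made up.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strconv"
)

// redisTarget is a hypothetical stand-in for the redis.Options fields
// (Password, Addr, DB) that the real fallback would populate.
type redisTarget struct {
	Password string
	Addr     string // host:port
	DB       int
}

// The greedy (.*) runs up to the last '@' that still leaves a valid
// host:port[/db] tail, so '/' characters stay inside the password.
// Groups: 1 = password, 2 = host, 3 = port, 4 = optional db index.
var redisURLPattern = regexp.MustCompile(
	`^rediss?://[^:@/]*:(.*)@([^@/:]+):(\d+)(?:/(\d+))?$`)

// parseRedisURLFallback extracts connection details without url.Parse.
func parseRedisURLFallback(raw string) (redisTarget, error) {
	m := redisURLPattern.FindStringSubmatch(raw)
	if m == nil {
		// Do not echo raw here: it may contain the password.
		return redisTarget{}, errors.New("unrecognized redis URL shape")
	}
	target := redisTarget{Password: m[1], Addr: m[2] + ":" + m[3]}
	if m[4] != "" {
		db, err := strconv.Atoi(m[4])
		if err != nil {
			return redisTarget{}, fmt.Errorf("invalid db index %q: %w", m[4], err)
		}
		target.DB = db
	}
	return target, nil
}

func main() {
	target, err := parseRedisURLFallback("redis://:pa/ss/word@10.0.0.1:6379/2")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Printf("addr=%s db=%d passwordLen=%d\n", target.Addr, target.DB, len(target.Password))
}
```

The error path deliberately leaves the raw URL out of its message so that a malformed URL does not leak the password into logs. This sketch does not handle IPv6 hosts, URLs without a port, or query parameters.
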