From 8f902edcd7cae92d7311e7f42fb8a9eb455ffed3 Mon Sep 17 00:00:00 2001
From: Mikkel Georgsen
Date: Fri, 10 Apr 2026 05:22:54 +0000
Subject: [PATCH] docs(01-05): complete DragonFlyDB write-ahead queue plan summary

- Integration tests ran and passed against DragonFlyDB at 10.5.0.10:6379
- Documents custom URL parser deviation for slash-in-password fix
- Notes NoOpHandler stub for Phase 2 replacement
---
 .../phases/01-foundation/01-05-SUMMARY.md | 132 ++++++++++++++++++
 1 file changed, 132 insertions(+)
 create mode 100644 .planning/phases/01-foundation/01-05-SUMMARY.md

diff --git a/.planning/phases/01-foundation/01-05-SUMMARY.md b/.planning/phases/01-foundation/01-05-SUMMARY.md
new file mode 100644
index 0000000..20e72a3
--- /dev/null
+++ b/.planning/phases/01-foundation/01-05-SUMMARY.md
@@ -0,0 +1,132 @@
+---
+phase: 01-foundation
+plan: "05"
+subsystem: queue
+tags: [dragonfly, redis, write-ahead-queue, worker, graceful-shutdown]
+dependency_graph:
+  requires: [01-01-SUMMARY.md]
+  provides: [internal/queue WAQ, RunWorker, NoOpHandler]
+  affects: [cmd/hwlab/main.go]
+tech_stack:
+  added: [github.com/redis/go-redis/v9 v9.18.0, github.com/google/uuid v1.6.0]
+  patterns: [RPUSH/BLPOP FIFO queue, context cancellation worker loop, non-fatal degraded init]
+key_files:
+  created:
+    - internal/queue/waq.go
+    - internal/queue/waq_test.go
+    - internal/queue/worker.go
+  modified:
+    - cmd/hwlab/main.go
+    - go.mod
+    - go.sum
+decisions:
+  - "Custom URL parser (regex + redis.Options) required because passwords with forward slashes break url.Parse and redis.ParseURL"
+  - "WAQ init is non-fatal — binary starts with WARNING log when DragonFlyDB unreachable"
+  - "NoOpHandler placeholder drains Phase 1 queue; Phase 2 will replace with real NetBox retry"
+metrics:
+  duration: "~10 minutes"
+  completed: "2026-04-10T05:22:13Z"
+  tasks_completed: 2
+  files_modified: 5
+requirements: [NB-05]
+---
+
+# Phase 01 Plan 05: DragonFlyDB Write-Ahead Queue Summary
+
+DragonFlyDB-backed write-ahead queue with RPUSH/BLPOP FIFO ordering, BLPOP retry worker with context cancellation and exponential backoff, and graceful shutdown wired into main binary.
+
+## Tasks Completed
+
+| Task | Name | Commit | Files |
+|------|------|--------|-------|
+| 1 | Write-ahead queue core (Enqueue, Dequeue, Len) | e07ad92 | internal/queue/waq.go, internal/queue/waq_test.go, go.mod, go.sum |
+| 2 | WAQ retry worker + wire into main binary | d1192c3 | internal/queue/worker.go, cmd/hwlab/main.go |
+
+## What Was Built
+
+### internal/queue/waq.go
+
+- `PendingOp` struct: UUID ID, operation type, `json.RawMessage` payload, created_at timestamp, retry attempt counter
+- `NewPendingOp(opType, payload)`: constructs op with generated UUID
+- `WAQ` type wrapping `*redis.Client`
+- `NewWAQ(url)`: connects to DragonFlyDB, pings on init, returns error if unreachable
+- `Enqueue(ctx, op)`: RPUSH to `hwlab:netbox:pending_ops`
+- `Dequeue(ctx, timeout)`: blocking BLPOP, returns `nil, nil` on timeout
+- `Len(ctx)`: LLEN queue depth
+- `Close()`: releases connection
+
+### internal/queue/worker.go
+
+- `RunWorker(ctx, handler, maxAttempts, retryInterval)`: BLPOP loop on the WAQ
+  - Context cancellation triggers clean exit
+  - Connection errors trigger `retryInterval` backoff (T-05-04 mitigation)
+  - Handler errors increment `op.Attempts` and re-enqueue
+  - Ops exceeding `maxAttempts` are dropped with a warning log (T-05-03 mitigation)
+- `NoOpHandler`: Phase 1 placeholder that logs and drains ops
+
+### cmd/hwlab/main.go
+
+- `signal.NotifyContext` for SIGINT/SIGTERM graceful shutdown
+- Non-fatal WAQ init: `WARNING` log when DragonFlyDB unavailable, binary continues serving
+- `go waq.RunWorker(ctx, ...)` goroutine started after successful WAQ init
+- `defer waq.Close()` on clean path
+- HTTP server runs in goroutine; `srv.Shutdown(shutdownCtx)` with 10s timeout on signal
+
+## DragonFlyDB Integration Test Results
+
+DragonFlyDB at `10.5.0.10:6379` was reachable during execution.
+
+```
+=== RUN TestWAQEnqueueDequeue
+2026/04/10 05:21:23 WAQ connected to DragonFlyDB
+--- PASS: TestWAQEnqueueDequeue (0.02s)
+PASS
+```
+
+The URL `redis://:nUq/IfoIQJf/kouckKHRQOk7vV0NwCuI@10.5.0.10:6379` contains forward slashes in the password, which causes both Go's `url.Parse` and `redis.ParseURL` to fail (they misinterpret the slashes as path separators). See Deviations section.
+
+## Final Test Suite Output
+
+```
+? git.georgsen.dk/hwlab [no test files]
+? git.georgsen.dk/hwlab/cmd/hwlab [no test files]
+? git.georgsen.dk/hwlab/internal/api [no test files]
+ok git.georgsen.dk/hwlab/internal/api/handlers 0.006s
+ok git.georgsen.dk/hwlab/internal/config 0.006s
+ok git.georgsen.dk/hwlab/internal/netbox 0.003s
+ok git.georgsen.dk/hwlab/internal/queue 0.004s
+```
+
+All packages green. No regressions.
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 1 - Bug] Custom URL parser for passwords with forward slashes**
+
+- **Found during:** Task 1, integration test run
+- **Issue:** `redis://:nUq/IfoIQJf/kouckKHRQOk7vV0NwCuI@10.5.0.10:6379` — the password contains `/` characters. Go's `url.Parse` treats them as path separators, producing `invalid port ":nUq" after host`. `redis.ParseURL` delegates to `url.Parse` and inherits the same failure. The plan noted to use `redis.ParseURL` but that function cannot handle this URL format.
+- **Fix:** Added `parseRedisURL()` in `waq.go` that tries `redis.ParseURL` first (fast path for standard passwords), then falls back to a `regexp`-based extractor that captures password, host, port, and db directly — bypassing `url.Parse` entirely. Constructs `redis.Options` struct directly.
+- **Files modified:** internal/queue/waq.go
+- **Commit:** e07ad92
+
+## Threat Mitigations Applied
+
+| Threat ID | Mitigation | Location |
+|-----------|-----------|----------|
+| T-05-03 | `maxAttempts` drop prevents unbounded queue growth | worker.go:44 |
+| T-05-04 | `retryInterval` backoff on connection loss prevents tight-loop hammering | worker.go:32-37 |
+
+## Known Stubs
+
+- `NoOpHandler` in `internal/queue/worker.go`: Phase 1 placeholder. Logs ops and returns nil (success), causing queued ops to drain without actual processing. Phase 2 NetBox integration will replace this with a real retry handler that re-drives failed NetBox API calls.
+
+## Self-Check: PASSED
+
+- internal/queue/waq.go: FOUND
+- internal/queue/waq_test.go: FOUND
+- internal/queue/worker.go: FOUND
+- cmd/hwlab/main.go: FOUND (modified)
+- Commit e07ad92: FOUND
+- Commit d1192c3: FOUND