homelabby/.planning/phases/02-ai-pipeline/02-02-PLAN.md at 73eab561cfb84fd8d87c07b976629eaecff97f6e

Mikkel Georgsen 7bebe2ed93 docs(02): create phase 2 AI pipeline plans (4 plans, 4 waves)

Wave 1: go-openai dep, CreateDevice gap, AIClient interface + mock + config
Wave 2: three-tier orchestrator, WAQ real handler, SearXNG stub
Wave 3: POST /api/intake handler, router wiring, quick add mode
Wave 4: oMLX integration test + memory checkpoint

Covers requirements: AI-01 through AI-09 (AI-04 stub only; full impl Phase 7)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-10 05:40:22 +00:00


phase

plan

type

wave

depends_on

files_modified

autonomous

requirements

must_haves

02-ai-pipeline

execute

02-01

internal/ai/orchestrator.go

internal/ai/orchestrator_test.go

internal/ai/research.go

internal/queue/handler.go

internal/queue/handler_test.go

true

AI-04

AI-05

AI-06

truths

artifacts

key_links

Orchestrator calls tier1; if confidence < threshold it escalates to tier2

Orchestrator maps confidence to CatalogStatus: high -> indexed, low -> needs_research

Both tier1 and tier2 use the same AIClient interface — swap by config

WAQ handler processes netbox.create_device and netbox.patch_custom_fields ops and retries on failure

ResearchClient interface exists with NoOpResearchClient stub (AI-04 deferred impl)

path

provides

exports

internal/ai/orchestrator.go

Orchestrator with Analyze(ctx, IntakeRequest) → (*IntakeResult, CatalogStatus, error)

Orchestrator

NewOrchestrator

path	provides
internal/ai/orchestrator_test.go	Unit tests covering tier1-only, tier1-low-confidence-escalates, tier2-fallback

path

provides

exports

internal/ai/research.go

ResearchClient interface + NoOpResearchClient stub

ResearchClient

NoOpResearchClient

path

provides

exports

internal/queue/handler.go

NetBoxOpHandler that processes create_device and patch_custom_fields WAQ ops

NewNetBoxOpHandler

NetBoxOpHandler

path	provides
internal/queue/handler_test.go	Tests for handler routing, JSON decode, and error propagation

from	to	via	pattern
internal/ai/orchestrator.go	internal/inventory/quality_gate.go	Orchestrator.Analyze returns inventory.CatalogStatus — StatusIndexed or StatusNeedsResearch	inventory.CatalogStatus

from	to	via	pattern
internal/queue/handler.go	internal/netbox/client.go	NetBoxOpHandler calls client.CreateDevice or client.PatchCustomFields based on op.Type	netbox.create_device\|netbox.patch_custom_fields

Build the three-tier orchestrator with confidence-based tier escalation and CatalogStatus mapping, the WAQ real handler replacing NoOpHandler, and the SearXNG ResearchClient stub.

Purpose: The orchestrator is the AI decision engine. The WAQ handler closes the Phase 1 gap where NoOpHandler silently dropped all queued NetBox ops. The research stub satisfies the AI-04 interface without Phase 7 scope creep.

Output: internal/ai/orchestrator.go, internal/ai/research.go, internal/queue/handler.go, all tested.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/phases/02-ai-pipeline/02-CONTEXT.md @.planning/phases/02-ai-pipeline/02-RESEARCH.md @.planning/phases/02-ai-pipeline/02-01-SUMMARY.md

From internal/ai/client.go:

type AIClient interface {
    AnalyzePhotos(ctx context.Context, req IntakeRequest) (*IntakeResult, error)
}
type MockAIClient struct {
    FixedResult *IntakeResult
    FixedError  error
    Calls       []IntakeRequest
}
func HighConfidenceResult() *IntakeResult  // confidence: 0.95
func LowConfidenceResult() *IntakeResult   // confidence: 0.40

From internal/ai/types.go:

type IntakeRequest struct { PhotosBase64 []string; JobID string }
type IntakeResult struct {
    SerialNumber, Model, Manufacturer, Category string
    Specs map[string]string; SuggestedTags []string
    AINotes string; Confidence float64; ConfidenceNote string
}
type AIConfig struct {
    Tier1, Tier2        TierConfig
    ConfidenceThreshold float64
    QuickAddEnabled     bool
    QuickAddThreshold   float64
}

From internal/inventory/quality_gate.go:

type CatalogStatus string
const StatusIndexed       CatalogStatus = "indexed"
const StatusNeedsResearch CatalogStatus = "needs_research"

From internal/queue/waq.go:

type PendingOp struct {
    ID        string
    Type      string          // op type string e.g. "netbox.create_device"
    Payload   json.RawMessage
    CreatedAt time.Time
    Attempts  int
}
type OpHandler func(ctx context.Context, op PendingOp) error

From internal/netbox/client.go:

func (c *Client) CreateDevice(ctx context.Context, name, assetTag string, deviceTypeID, roleID, siteID int32) (int64, error)
func (c *Client) PatchCustomFields(ctx context.Context, deviceID int64, patch map[string]interface{}) error

Task 1: Three-tier orchestrator with confidence routing

Files: internal/ai/orchestrator.go, internal/ai/orchestrator_test.go, internal/ai/research.go

Read first:
- internal/ai/types.go (full)
- internal/ai/client.go (full)
- internal/ai/mock.go (full)
- internal/inventory/quality_gate.go (full; CatalogStatus constants)
- .planning/phases/02-ai-pipeline/02-RESEARCH.md lines 324-410 (orchestrator pattern)

Behavior:
- Test: TestOrchestratorHighConfidence: tier1 returns confidence 0.95 (above the 0.75 threshold); tier2 is never called; result status == StatusIndexed
- Test: TestOrchestratorLowConfidenceEscalates: tier1 returns confidence 0.40; tier2 is called once and returns confidence 0.85; final status == StatusIndexed
- Test: TestOrchestratorBothTiersFail: tier1 returns an error; tier2 returns an error; result is a non-nil placeholder IntakeResult with Confidence 0.0, status == StatusNeedsResearch, err == nil (the orchestrator does not propagate tier errors; it degrades gracefully)
- Test: TestOrchestratorTier1NilResult: tier1 returns a nil result with a nil error; the orchestrator escalates to tier2
- Test: TestOrchestratorNeedsResearch: tier1 returns confidence 0.40; tier2 also returns confidence 0.40; final status == StatusNeedsResearch

**internal/ai/orchestrator.go:**

package ai

import (
    "context"
    "log"

    "git.georgsen.dk/hwlab/internal/inventory"
)

// Orchestrator manages the three-tier AI pipeline.
// Tier1 is local oMLX (fast, low cost). Tier2 is OpenRouter (slower, better).
// Tier3 (Lab Advisor) is out of scope for Phase 2.
type Orchestrator struct {
    tier1     AIClient
    tier2     AIClient
    threshold float64 // confidence threshold for escalation; default 0.75
}

// NewOrchestrator creates an Orchestrator. Both tier1 and tier2 must be non-nil.
func NewOrchestrator(tier1, tier2 AIClient, threshold float64) *Orchestrator {
    if threshold <= 0 {
        threshold = 0.75
    }
    return &Orchestrator{tier1: tier1, tier2: tier2, threshold: threshold}
}

// Analyze runs tier1 and escalates to tier2 if confidence is below threshold.
// Never returns an error from individual tier failures — tier errors cause escalation.
// Returns a non-nil IntakeResult in all cases (a placeholder with Confidence 0.0 on total failure).
// The returned CatalogStatus is either StatusIndexed or StatusNeedsResearch.
func (o *Orchestrator) Analyze(ctx context.Context, req IntakeRequest) (*IntakeResult, inventory.CatalogStatus, error) {
    result, err := o.tier1.AnalyzePhotos(ctx, req)
    if err != nil {
        log.Printf("orchestrator: tier1 error (escalating to tier2): %v", err)
        result = nil
    }

    // Escalate if tier1 result is missing, nil, or low confidence
    if result == nil || result.Confidence < o.threshold {
        log.Printf("orchestrator: tier1 confidence=%.2f < threshold=%.2f — escalating to tier2",
            confidenceOf(result), o.threshold)
        result2, err2 := o.tier2.AnalyzePhotos(ctx, req)
        if err2 != nil {
            log.Printf("orchestrator: tier2 error: %v", err2)
        } else if result2 != nil {
            result = result2
        }
    }

    // Map confidence to CatalogStatus
    if result == nil {
        return &IntakeResult{
            AINotes:        "all AI tiers failed",
            Confidence:     0.0,
            ConfidenceNote: "no result from any tier",
        }, inventory.StatusNeedsResearch, nil
    }

    status := inventory.StatusIndexed
    if result.Confidence < o.threshold {
        status = inventory.StatusNeedsResearch
    }
    return result, status, nil
}

// confidenceOf returns 0.0 for nil results, otherwise result.Confidence.
func confidenceOf(r *IntakeResult) float64 {
    if r == nil {
        return 0.0
    }
    return r.Confidence
}

internal/ai/research.go — SearXNG stub (AI-04 Phase 7 interface):

package ai

import "context"

// SearchResult is a single result from a SearXNG research query.
type SearchResult struct {
    Title   string
    URL     string
    Snippet string
}

// ResearchClient abstracts the SearXNG search backend.
// Phase 7 will provide a real implementation.
type ResearchClient interface {
    Search(ctx context.Context, query string) ([]SearchResult, error)
}

// NoOpResearchClient is a Phase 2 stub that returns empty results.
// Replace with SearXNG HTTP client in Phase 7.
type NoOpResearchClient struct{}

func (n *NoOpResearchClient) Search(_ context.Context, _ string) ([]SearchResult, error) {
    return nil, nil
}

internal/ai/orchestrator_test.go — five tests using MockAIClient:

Write all five tests from the behavior block above. Use table-driven style where practical. Key patterns:

Create MockAIClient with FixedResult / FixedError for tier1 and tier2
Call NewOrchestrator(tier1, tier2, 0.75).Analyze(context.Background(), IntakeRequest{PhotosBase64: []string{"data:image/jpeg;base64,/9j/"}})
Assert returned status and confirm tier2 Calls length via mock.Calls

Verify: cd /home/mikkel/homelabby && go build ./... && go test ./internal/ai/... -run TestOrchestrator -v 2>&1

Done: All 5 TestOrchestrator* tests pass. go build ./... is clean. internal/ai/research.go exists with the ResearchClient interface and NoOpResearchClient.
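
The five behaviors can be laid out as one table. The sketch below is a standalone program rather than the real test file: `stubClient` stands in for MockAIClient, `analyze` condenses the escalation logic of Orchestrator.Analyze without logging, and the trimmed types are assumptions made so the example runs on its own.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Trimmed stand-ins for the types in internal/ai and internal/inventory.
type IntakeRequest struct{ JobID string }
type IntakeResult struct{ Confidence float64 }
type CatalogStatus string

const (
	StatusIndexed       CatalogStatus = "indexed"
	StatusNeedsResearch CatalogStatus = "needs_research"
)

// stubClient returns a fixed reply and counts calls, like MockAIClient.
type stubClient struct {
	result *IntakeResult
	err    error
	calls  int
}

func (s *stubClient) AnalyzePhotos(_ context.Context, _ IntakeRequest) (*IntakeResult, error) {
	s.calls++
	return s.result, s.err
}

// analyze condenses the tier1 -> tier2 escalation described above.
func analyze(ctx context.Context, t1, t2 *stubClient, threshold float64, req IntakeRequest) (*IntakeResult, CatalogStatus) {
	result, err := t1.AnalyzePhotos(ctx, req)
	if err != nil {
		result = nil
	}
	if result == nil || result.Confidence < threshold {
		if r2, err2 := t2.AnalyzePhotos(ctx, req); err2 == nil && r2 != nil {
			result = r2
		}
	}
	if result == nil {
		return &IntakeResult{}, StatusNeedsResearch
	}
	if result.Confidence < threshold {
		return result, StatusNeedsResearch
	}
	return result, StatusIndexed
}

func conf(c float64) *IntakeResult { return &IntakeResult{Confidence: c} }

type scenario struct {
	name       string
	t1, t2     stubClient
	wantStatus CatalogStatus
	wantT2     int // expected number of tier2 calls
}

// run executes every scenario and returns the names of those that failed.
func run() []string {
	boom := errors.New("tier unavailable")
	table := []scenario{
		{"high confidence", stubClient{result: conf(0.95)}, stubClient{result: conf(0.85)}, StatusIndexed, 0},
		{"low confidence escalates", stubClient{result: conf(0.40)}, stubClient{result: conf(0.85)}, StatusIndexed, 1},
		{"both tiers fail", stubClient{err: boom}, stubClient{err: boom}, StatusNeedsResearch, 1},
		{"tier1 nil result", stubClient{}, stubClient{result: conf(0.85)}, StatusIndexed, 1},
		{"both low", stubClient{result: conf(0.40)}, stubClient{result: conf(0.40)}, StatusNeedsResearch, 1},
	}
	var failures []string
	for _, sc := range table {
		_, status := analyze(context.Background(), &sc.t1, &sc.t2, 0.75, IntakeRequest{JobID: sc.name})
		if status != sc.wantStatus || sc.t2.calls != sc.wantT2 {
			failures = append(failures, sc.name)
		}
	}
	return failures
}

func main() {
	fmt.Println("failed scenarios:", len(run())) // prints "failed scenarios: 0"
}
```

In the real test file each row would become a `t.Run` subtest against NewOrchestrator, with the tier2 call count read from the mock's Calls slice.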

Task 2: WAQ real NetBox op handler (replaces NoOpHandler)

Files: internal/queue/handler.go, internal/queue/handler_test.go

Read first:
- internal/queue/waq.go (full; PendingOp struct, OpHandler type)
- internal/queue/worker.go (full; NoOpHandler is what we're replacing; understand the RunWorker signature)
- internal/netbox/client.go (full; CreateDevice and PatchCustomFields signatures)
- internal/netbox/types.go (full; understand what types are available)
- .planning/phases/01-foundation/01-05-SUMMARY.md lines 45-70 (WAQ op structure)

Behavior:
- Test: TestNetBoxOpHandlerRouting: handler receives an op with Type="netbox.create_device" and asserts CreateDevice is called (use a mock netbox client interface)
- Test: TestNetBoxOpHandlerPatchCustomFields: handler receives an op with Type="netbox.patch_custom_fields" and asserts PatchCustomFields is called
- Test: TestNetBoxOpHandlerUnknownType: handler receives an op with Type="unknown.op" and returns a non-nil error (unknown ops are re-queued, not silently dropped)
- Test: TestNetBoxOpHandlerBadJSON: handler receives an op with a malformed JSON payload and returns a non-nil error
- Test: TestCreateDevicePayloadDecode: a JSON payload `{"name":"test","asset_tag":"HW-00001","device_type_id":1,"role_id":2,"site_id":3}` decodes correctly

Define the op type constants, two payload types, and the handler in internal/queue/handler.go, in the same package as waq.go. The snippets below need the context, encoding/json, and fmt imports.

Op type string constants:

const (
    OpNetBoxCreateDevice      = "netbox.create_device"
    OpNetBoxPatchCustomFields = "netbox.patch_custom_fields"
)

Payload types:

// CreateDevicePayload is the JSON payload for OpNetBoxCreateDevice ops.
type CreateDevicePayload struct {
    Name         string `json:"name"`
    AssetTag     string `json:"asset_tag"`
    DeviceTypeID int32  `json:"device_type_id"`
    RoleID       int32  `json:"role_id"`
    SiteID       int32  `json:"site_id"`
}

// PatchCustomFieldsPayload is the JSON payload for OpNetBoxPatchCustomFields ops.
type PatchCustomFieldsPayload struct {
    DeviceID int64                  `json:"device_id"`
    Patch    map[string]interface{} `json:"patch"`
}
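
As a rough illustration of what TestCreateDevicePayloadDecode is meant to confirm, the sketch below decodes the sample payload from the behavior list into the typed struct. `decodeCreateDevice` is a hypothetical helper introduced only for this example; the handler itself calls json.Unmarshal inline.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CreateDevicePayload repeats the definition above so the example is self-contained.
type CreateDevicePayload struct {
	Name         string `json:"name"`
	AssetTag     string `json:"asset_tag"`
	DeviceTypeID int32  `json:"device_type_id"`
	RoleID       int32  `json:"role_id"`
	SiteID       int32  `json:"site_id"`
}

// decodeCreateDevice is a hypothetical helper that wraps the decode step
// and its error message.
func decodeCreateDevice(raw []byte) (CreateDevicePayload, error) {
	var p CreateDevicePayload
	if err := json.Unmarshal(raw, &p); err != nil {
		return CreateDevicePayload{}, fmt.Errorf("decode create_device payload: %w", err)
	}
	return p, nil
}

func main() {
	raw := []byte(`{"name":"test","asset_tag":"HW-00001","device_type_id":1,"role_id":2,"site_id":3}`)
	p, err := decodeCreateDevice(raw)
	fmt.Printf("%+v %v\n", p, err)
	// prints "{Name:test AssetTag:HW-00001 DeviceTypeID:1 RoleID:2 SiteID:3} <nil>"

	// Truncated JSON yields a non-nil error, which is what the bad-JSON test relies on.
	_, err = decodeCreateDevice([]byte(`{"name":`))
	fmt.Println(err != nil) // prints "true"
}
```

By default json.Unmarshal ignores object keys that have no matching struct field, so extra keys in a queued payload never reach the NetBox call.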

Handler interface for testability (so tests don't need a real NetBox client):

// NetBoxOpsClient is the subset of netbox.Client that the WAQ handler needs.
type NetBoxOpsClient interface {
    CreateDevice(ctx context.Context, name, assetTag string, deviceTypeID, roleID, siteID int32) (int64, error)
    PatchCustomFields(ctx context.Context, deviceID int64, patch map[string]interface{}) error
}

// NewNetBoxOpHandler returns an OpHandler that processes netbox WAQ operations.
// Pass a *netbox.Client as the client argument.
func NewNetBoxOpHandler(client NetBoxOpsClient) OpHandler {
    return func(ctx context.Context, op PendingOp) error {
        switch op.Type {
        case OpNetBoxCreateDevice:
            var p CreateDevicePayload
            if err := json.Unmarshal(op.Payload, &p); err != nil {
                return fmt.Errorf("decode create_device payload: %w", err)
            }
            _, err := client.CreateDevice(ctx, p.Name, p.AssetTag, p.DeviceTypeID, p.RoleID, p.SiteID)
            return err
        case OpNetBoxPatchCustomFields:
            var p PatchCustomFieldsPayload
            if err := json.Unmarshal(op.Payload, &p); err != nil {
                return fmt.Errorf("decode patch_custom_fields payload: %w", err)
            }
            return client.PatchCustomFields(ctx, p.DeviceID, p.Patch)
        default:
            return fmt.Errorf("unknown op type: %q", op.Type)
        }
    }
}

For tests, define an unexported mockNetBoxOpsClient in handler_test.go:

type mockNetBoxOpsClient struct {
    createCalls []CreateDevicePayload
    patchCalls  []PatchCustomFieldsPayload
    createErr   error
    patchErr    error
}
func (m *mockNetBoxOpsClient) CreateDevice(ctx context.Context, name, assetTag string, dtID, roleID, siteID int32) (int64, error) {
    m.createCalls = append(m.createCalls, CreateDevicePayload{Name: name, AssetTag: assetTag, DeviceTypeID: dtID, RoleID: roleID, SiteID: siteID})
    return 42, m.createErr
}
func (m *mockNetBoxOpsClient) PatchCustomFields(ctx context.Context, deviceID int64, patch map[string]interface{}) error {
    m.patchCalls = append(m.patchCalls, PatchCustomFieldsPayload{DeviceID: deviceID, Patch: patch})
    return m.patchErr
}

NOTE: Do NOT remove NoOpHandler from worker.go; it is used in main.go. The intake handler (Plan 03) will switch main.go to use NewNetBoxOpHandler. For now, NoOpHandler stays; the new handler lives alongside it in handler.go.

Verify: cd /home/mikkel/homelabby && go build ./... && go test ./internal/queue/... -v 2>&1

Done: All 5 handler tests pass (the four TestNetBoxOpHandler* tests plus TestCreateDevicePayloadDecode). go build ./... is clean. internal/queue/handler.go exports NewNetBoxOpHandler and the OpNetBoxCreateDevice and OpNetBoxPatchCustomFields constants. NoOpHandler remains untouched in worker.go.

<threat_model>

Trust Boundaries

Boundary	Description
AI response → IntakeResult	JSON from model is untrusted; parsed into typed struct
WAQ payload → NetBox call	Queued JSON payload decoded before NetBox API call

STRIDE Threat Register

Threat ID	Category	Component	Disposition	Mitigation Plan
T-02-05	Tampering	Orchestrator result	mitigate	A missing confidence field leaves IntakeResult.Confidence at its zero value of 0.0 (json.Unmarshal performs no clamping or range check on values that are present); the orchestrator treats a nil or below-threshold result as needs_research
T-02-06	Denial of Service	Orchestrator both-tiers-timeout	mitigate	Each TierClient.AnalyzePhotos wraps call in context.WithTimeout — maximum total wait is tier1.timeout + tier2.timeout
T-02-07	Tampering	WAQ payload injection	mitigate	PendingOp.Payload decoded via json.Unmarshal into typed structs (CreateDevicePayload / PatchCustomFieldsPayload) — arbitrary fields ignored; unknown op types return error and are re-queued, not executed
T-02-08	Elevation of Privilege	WAQ unknown op type	mitigate	NewNetBoxOpHandler returns error on unknown Type — op re-queued up to maxAttempts then dropped; no code execution from op type
</threat_model>
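
For T-02-05, the sketch below shows how encoding/json treats the confidence field, plus one possible range guard. The JSON field names and the clamp in `parseResult` are assumptions for illustration; neither is specified by this plan.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// IntakeResult is trimmed to two fields; the JSON tag names are assumed.
type IntakeResult struct {
	Model      string  `json:"model"`
	Confidence float64 `json:"confidence"`
}

// parseResult decodes a model reply and clamps Confidence into [0, 1].
// The clamp is a hypothetical extra guard, not part of the plan above.
func parseResult(raw []byte) (*IntakeResult, error) {
	var r IntakeResult
	if err := json.Unmarshal(raw, &r); err != nil {
		return nil, err
	}
	r.Confidence = math.Max(0, math.Min(1, r.Confidence))
	return &r, nil
}

func main() {
	replies := []string{
		`{"model":"example"}`,                  // confidence missing: stays at 0
		`{"model":"example","confidence":7.5}`, // out of range: clamped to 1
		`not json`,                             // undecodable: error
	}
	for _, raw := range replies {
		r, err := parseResult([]byte(raw))
		if err != nil {
			fmt.Println("error")
			continue
		}
		fmt.Println(r.Confidence)
	}
	// prints "0", "1", "error" on separate lines
}
```

Without a guard of this kind, an out-of-range value such as 7.5 would pass the threshold comparison and be mapped to indexed.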

After plan completion:
1. `go build ./...`: zero errors
2. `go test ./internal/ai/... -v`: all orchestrator tests pass
3. `go test ./internal/queue/... -v`: all handler tests pass (including pre-existing WAQ tests)
4. `grep -r "NoOpHandler" internal/`: still present in worker.go; main.go still compiles
5. `grep "ResearchClient" internal/ai/research.go`: interface present

<success_criteria>

Orchestrator escalates from tier1 to tier2 when confidence < threshold
All tier1/tier2 failure combinations handled gracefully (no panic, no error propagation)
CatalogStatus returned from Analyze is always StatusIndexed or StatusNeedsResearch
WAQ handler routes create_device and patch_custom_fields ops to correct NetBox methods
Unknown WAQ op types return error (re-queued, not silently dropped)
ResearchClient interface stub present for Phase 7 </success_criteria>

After completion, create `.planning/phases/02-ai-pipeline/02-02-SUMMARY.md`
