homelabby/.planning/phases/02-ai-pipeline/02-01-PLAN.md at f151b96f888fa9b634cd76fda82333193e5edc2a

Mikkel Georgsen 7bebe2ed93 docs(02): create phase 2 AI pipeline plans (4 plans, 4 waves)

Wave 1: go-openai dep, CreateDevice gap, AIClient interface + mock + config
Wave 2: three-tier orchestrator, WAQ real handler, SearXNG stub
Wave 3: POST /api/intake handler, router wiring, quick add mode
Wave 4: oMLX integration test + memory checkpoint

Covers requirements: AI-01 through AI-09 (AI-04 stub only; full impl Phase 7)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

2026-04-10 05:40:22 +00:00


phase

plan

type

wave

depends_on

files_modified

autonomous

requirements

must_haves

02-ai-pipeline

execute

go.mod

go.sum

internal/netbox/client.go

internal/netbox/client_test.go

internal/ai/types.go

internal/ai/client.go

internal/ai/mock.go

internal/ai/prompts/intake.go

internal/config/config.go

internal/config/config_test.go

ai_config.json

true

AI-01

AI-08

AI-09

truths:
  - go-openai v1.x is in go.mod and the binary compiles
  - AIClient interface has a single AnalyzePhotos method with domain types
  - MockAIClient implements AIClient, returns HighConfidenceResult and LowConfidenceResult fixtures
  - TierClient wraps go-openai with BaseURL override; tier routing is config-driven
  - ai_config.json template exists and config.Load() unmarshals AIConfig fields
  - NetBox client has CreateDevice method returning (int64, error)

artifacts:
  - path: internal/ai/types.go
    provides: IntakeRequest, IntakeResult, TierConfig, AIConfig domain types
    exports: IntakeRequest, IntakeResult, TierConfig, AIConfig
  - path: internal/ai/client.go
    provides: AIClient interface + TierClient production implementation
    exports: AIClient, TierClient, NewTierClient
  - path: internal/ai/mock.go
    provides: MockAIClient test double with fixture constructors
    exports: MockAIClient, HighConfidenceResult, LowConfidenceResult
  - path: internal/ai/prompts/intake.go
    provides: BuildIntakePrompt() returning the JSON-extraction prompt template
    exports: BuildIntakePrompt
  - path: internal/config/config.go
    provides: Config struct extended with AI AIConfig field and viper bindings
  - path: ai_config.json
    provides: Template config file with tier1/tier2/threshold/quick_add settings
  - path: internal/netbox/client.go
    provides: CreateDevice method on *Client
    contains: func.Client.CreateDevice

key_links:
  - from: internal/config/config.go
    to: internal/ai/types.go
    via: AIConfig holds two TierConfig values (Tier1, Tier2); config unmarshals into the AIConfig struct
    pattern: AIConfig
  - from: internal/ai/client.go
    to: github.com/sashabaranov/go-openai
    via: TierClient wraps openai.Client; BaseURL set from TierConfig.BaseURL
    pattern: openai.DefaultConfig

Lay the AI package foundation: install go-openai, define the AIClient interface and domain types, write MockAIClient for tests, add the intake prompt template, extend config for AI tiers, and add CreateDevice to the NetBox client (Phase 1 gap).

Purpose: Every downstream plan (orchestrator, intake handler) builds against these contracts. Nothing else can run without this.

Output: go.mod updated, internal/ai/ package with types/interface/mock/prompts, config extended, NetBox CreateDevice added.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/phases/02-ai-pipeline/02-CONTEXT.md @.planning/phases/02-ai-pipeline/02-RESEARCH.md

From internal/netbox/client.go:

type Client struct { /* wraps *nb.APIClient */ }
func NewClient(url, token string) (*Client, error)
func (c *Client) Ping(ctx context.Context) error
func (c *Client) ListDevices(ctx context.Context, limit int) ([]Device, error)
func (c *Client) GetDevice(ctx context.Context, id int64) (*Device, error)
func (c *Client) PatchCustomFields(ctx context.Context, deviceID int64, patch map[string]interface{}) error

From internal/netbox/types.go:

type Device struct {
    ID           int64
    Name         string
    AssetTag     string
    CustomFields CustomFields
    Created      time.Time
    LastUpdated  time.Time
}
type CustomFields struct {
    HWID            string
    CatalogStatus   string
    ProductURL      string
    FirmwareVersion string
    TestDate        string
    TestData        string
    AINotes         string
    PhotoURLs       []string
}

From internal/inventory/quality_gate.go:

type CatalogStatus string
const (
    StatusDraft         CatalogStatus = "draft"
    StatusIndexed       CatalogStatus = "indexed"
    StatusNeedsResearch CatalogStatus = "needs_research"
    StatusResearched    CatalogStatus = "researched"
    StatusComplete      CatalogStatus = "complete"
)

From internal/config/config.go (current):

type Config struct {
    Host     string
    Port     int
    LogLevel string
    NetBoxURL            string
    NetBoxToken          string
    NetBoxTimeoutSeconds int
    DragonflyURL            string
    WAQRetryIntervalSeconds int
    WAQMaxAttempts          int
    QualityGateConfidenceThreshold float64
}
func Load() (*Config, error)

go-netbox v4 CreateDevice pattern (from go-netbox generated client):

// WritableDeviceWithConfigContextRequest is the request body for POST /dcim/devices/
// Key fields: Name (string), DeviceType (int32), Role (int32), Site (int32), AssetTag (NullableString)
// After creation, returned DeviceWithConfigContext has .GetId() int32

req := nb.NewWritableDeviceWithConfigContextRequest("device-name", roleID, siteID, deviceTypeID)
req.SetAssetTag(nb.NewNullableString(&assetTag))
result, resp, err := c.api.DcimAPI.DcimDevicesCreate(ctx).
    WritableDeviceWithConfigContextRequest(*req).Execute()
// result.GetId() returns int32 — cast to int64

Task 1: Install go-openai and add CreateDevice to NetBox client

Files: go.mod, go.sum, internal/netbox/client.go, internal/netbox/client_test.go

Read first:
- internal/netbox/client.go (full: understand the Client struct and existing methods)
- internal/netbox/client_test.go (full: understand the test patterns to follow)
- internal/netbox/types.go (full: Device struct)

Behavior:
- Test: TestCreateDeviceValidation: calling CreateDevice with an empty name returns an error, and no NetBox call is made
- Test: TestCreateDeviceLive: skipped unless HWLAB_NETBOX_TOKEN is 40 chars AND HWLAB_TEST_SITE_ID is set; when both conditions are met it creates a device, asserts the returned ID > 0, then deletes the device (cleanup)

1. Run: `cd /home/mikkel/homelabby && go get github.com/sashabaranov/go-openai@latest`

Add CreateDevice to internal/netbox/client.go. Follow the existing method style exactly.

// CreateDevice creates a new device in NetBox with the given name and asset tag.
// deviceTypeID, roleID, and siteID must be valid NetBox IDs (pre-existing objects).
// Returns the new device's NetBox ID or error.
func (c *Client) CreateDevice(ctx context.Context, name, assetTag string, deviceTypeID, roleID, siteID int32) (int64, error) {
    if name == "" {
        return 0, fmt.Errorf("device name must not be empty")
    }
    req := nb.NewWritableDeviceWithConfigContextRequest(name, roleID, siteID, deviceTypeID)
    if assetTag != "" {
        req.SetAssetTag(*nb.NewNullableString(&assetTag))
    }
    result, _, err := c.api.DcimAPI.DcimDevicesCreate(ctx).
        WritableDeviceWithConfigContextRequest(*req).Execute()
    if err != nil {
        return 0, fmt.Errorf("CreateDevice %q: %w", name, err)
    }
    return int64(result.GetId()), nil
}

Add to client_test.go:

- TestCreateDeviceValidation: calls CreateDevice with an empty name, asserts err != nil, no NetBox token needed
- TestCreateDeviceLive: skip guard `if len(token) != 40 || os.Getenv("HWLAB_TEST_SITE_ID") == "" { t.Skip(...) }`

Run `go build ./...` and `go test ./internal/netbox/... -run TestCreateDevice`; both must pass.

Verify: `cd /home/mikkel/homelabby && go build ./... && go test ./internal/netbox/... -run TestCreateDevice -v 2>&1 | tail -20`

Acceptance criteria:
- go.mod contains github.com/sashabaranov/go-openai
- go build ./... passes
- TestCreateDeviceValidation PASS
- TestCreateDeviceLive SKIP (no live token)
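The skip condition for TestCreateDeviceLive can be pulled into a small predicate so the rule itself is unit-testable. This is a minimal sketch: `liveTestEnabled` is a hypothetical helper name that the plan does not define, and the 40-character check simply mirrors the token rule stated above.

```go
package main

import (
	"fmt"
	"os"
)

// liveTestEnabled reports whether the live NetBox test should run.
// Hypothetical helper: the plan specifies the condition, not this function.
func liveTestEnabled(token, siteID string) bool {
	return len(token) == 40 && siteID != ""
}

func main() {
	token := os.Getenv("HWLAB_NETBOX_TOKEN")
	siteID := os.Getenv("HWLAB_TEST_SITE_ID")
	if !liveTestEnabled(token, siteID) {
		fmt.Println("skip: live NetBox test disabled")
		return
	}
	fmt.Println("run: live NetBox test enabled")
}
```

Inside the test, the guard would then read `if !liveTestEnabled(token, siteID) { t.Skip("live NetBox not configured") }`.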

Task 2: AI package: types, interface, mock, prompts, config extension

Files: internal/ai/types.go, internal/ai/client.go, internal/ai/mock.go, internal/ai/prompts/intake.go, internal/config/config.go, internal/config/config_test.go, ai_config.json

Read first:
- internal/config/config.go (full: extend this file)
- internal/config/config_test.go (full: extend tests)
- .planning/phases/02-ai-pipeline/02-RESEARCH.md lines 270-455 (type definitions, patterns)

Behavior:
- Test: TestAIConfig: Load() with an ai_config.json (written to a temp dir or the CWD before the test) correctly unmarshals Tier1.BaseURL, Tier1.Model, ConfidenceThreshold, QuickAddEnabled
- Test: TestMockAIClient: MockAIClient.AnalyzePhotos returns the HighConfidenceResult fixture; the Calls slice has length 1 after one call; the FixedError path returns a nil result and the error
- Test: TestTierClientConstruction: NewTierClient(cfg) does not panic; a TierClient created with a bogus URL returns an error when AnalyzePhotos is called (HTTP connection refused, not a panic)

Create these files:

internal/ai/types.go — Domain types only, no go-openai import:

package ai

// IntakeRequest carries 1-3 photos (base64-encoded data URLs) for AI analysis.
type IntakeRequest struct {
    PhotosBase64 []string // "data:image/jpeg;base64,..."
    JobID        string   // UUID for tracing
}

// IntakeResult is the structured output from any AI tier's photo analysis.
// The model is instructed to return this JSON shape verbatim.
type IntakeResult struct {
    SerialNumber   string            `json:"serial_number"`
    Model          string            `json:"model"`
    Manufacturer   string            `json:"manufacturer"`
    Category       string            `json:"category"`        // compute | networking | storage | cable | peripheral | component | unknown
    Specs          map[string]string `json:"specs"`
    SuggestedTags  []string          `json:"suggested_tags"`
    AINotes        string            `json:"ai_notes"`
    Confidence     float64           `json:"confidence"`      // 0.0–1.0 self-reported
    ConfidenceNote string            `json:"confidence_note"` // reason if < threshold
}

// TierConfig holds provider configuration for one AI tier.
type TierConfig struct {
    BaseURL        string `json:"base_url"        mapstructure:"base_url"`
    APIKey         string `json:"api_key"         mapstructure:"api_key"`
    Model          string `json:"model"           mapstructure:"model"`
    TimeoutSeconds int    `json:"timeout_seconds" mapstructure:"timeout_seconds"`
}

// AIConfig holds all AI tier configurations and orchestration settings.
type AIConfig struct {
    Tier1               TierConfig `json:"tier1"                mapstructure:"tier1"`
    Tier2               TierConfig `json:"tier2"                mapstructure:"tier2"`
    ConfidenceThreshold float64    `json:"confidence_threshold" mapstructure:"confidence_threshold"`
    QuickAddEnabled     bool       `json:"quick_add_enabled"    mapstructure:"quick_add_enabled"`
    QuickAddThreshold   float64    `json:"quick_add_threshold"  mapstructure:"quick_add_threshold"`
}

internal/ai/client.go — AIClient interface + TierClient:

package ai

import (
    "context"
    "encoding/json"
    "fmt"
    "time"

    openai "github.com/sashabaranov/go-openai"
)

// AIClient is the single abstraction over any OpenAI-compatible inference backend.
// Production: TierClient wrapping sashabaranov/go-openai.
// Tests: MockAIClient with canned responses.
type AIClient interface {
    AnalyzePhotos(ctx context.Context, req IntakeRequest) (*IntakeResult, error)
}

// TierClient is the production AIClient backed by go-openai.
type TierClient struct {
    client  *openai.Client
    model   string
    timeout time.Duration
}

// NewTierClient creates a TierClient from a TierConfig.
// BaseURL is set directly on the openai.ClientConfig — this is the tier-routing mechanism.
func NewTierClient(cfg TierConfig) *TierClient {
    config := openai.DefaultConfig(cfg.APIKey)
    config.BaseURL = cfg.BaseURL
    timeout := time.Duration(cfg.TimeoutSeconds) * time.Second
    if timeout == 0 {
        timeout = 30 * time.Second
    }
    return &TierClient{
        client:  openai.NewClientWithConfig(config),
        model:   cfg.Model,
        timeout: timeout,
    }
}

// AnalyzePhotos sends 1-3 base64-encoded photos to the configured model and
// parses the structured JSON response into an IntakeResult.
// Falls back gracefully: if the model returns malformed JSON, returns a
// zero-confidence IntakeResult (not an error) so the orchestrator can escalate.
func (c *TierClient) AnalyzePhotos(ctx context.Context, req IntakeRequest) (*IntakeResult, error) {
    // Build vision message parts: text prompt first, then image URLs
    parts := []openai.ChatMessagePart{
        {
            Type: openai.ChatMessagePartTypeText,
            Text: buildIntakePromptWithCount(len(req.PhotosBase64)),
        },
    }
    for _, b64 := range req.PhotosBase64 {
        parts = append(parts, openai.ChatMessagePart{
            Type: openai.ChatMessagePartTypeImageURL,
            ImageURL: &openai.ChatMessageImageURL{
                URL:    b64,
                Detail: openai.ImageURLDetailAuto,
            },
        })
    }

    tctx, cancel := context.WithTimeout(ctx, c.timeout)
    defer cancel()

    resp, err := c.client.CreateChatCompletion(tctx, openai.ChatCompletionRequest{
        Model: c.model,
        Messages: []openai.ChatCompletionMessage{
            {Role: openai.ChatMessageRoleUser, MultiContent: parts},
        },
    })
    if err != nil {
        return nil, fmt.Errorf("chat completion: %w", err)
    }
    if len(resp.Choices) == 0 {
        return nil, fmt.Errorf("no choices in response")
    }

    content := resp.Choices[0].Message.Content
    var result IntakeResult
    if err := json.Unmarshal([]byte(content), &result); err != nil {
        // JSON parse failure — return zero-confidence result so orchestrator escalates
        return &IntakeResult{
            AINotes:        fmt.Sprintf("JSON parse failed: %v | raw: %.200s", err, content),
            Confidence:     0.0,
            ConfidenceNote: "model returned non-JSON response",
        }, nil
    }
    return &result, nil
}

internal/ai/mock.go — deterministic test double:

package ai

import "context"

// MockAIClient is a test double for AIClient.
// Set FixedResult and/or FixedError before use.
type MockAIClient struct {
    FixedResult *IntakeResult
    FixedError  error
    Calls       []IntakeRequest
}

func (m *MockAIClient) AnalyzePhotos(_ context.Context, req IntakeRequest) (*IntakeResult, error) {
    m.Calls = append(m.Calls, req)
    return m.FixedResult, m.FixedError
}

// HighConfidenceResult returns a fixture IntakeResult with confidence 0.95.
func HighConfidenceResult() *IntakeResult {
    return &IntakeResult{
        Model:         "Raspberry Pi 4 Model B",
        Manufacturer:  "Raspberry Pi Foundation",
        Category:      "compute",
        Specs:         map[string]string{"ram": "4GB", "cpu": "BCM2711"},
        SuggestedTags: []string{"raspberry-pi", "compute", "arm"},
        Confidence:    0.95,
    }
}

// LowConfidenceResult returns a fixture with confidence 0.40 (below threshold).
func LowConfidenceResult() *IntakeResult {
    return &IntakeResult{
        Model:          "Unknown Device",
        Category:       "unknown",
        Confidence:     0.40,
        ConfidenceNote: "Cannot identify markings clearly",
    }
}

internal/ai/prompts/intake.go — prompt template:

package prompts

import "fmt"

// BuildIntakePrompt returns the vision prompt instructing the model to return
// structured JSON for hardware analysis. photoCount is 1-3.
func BuildIntakePrompt(photoCount int) string {
    return fmt.Sprintf(`You are a hardware inventory assistant. Analyze the %d hardware photo(s) provided and return ONLY valid JSON matching this exact schema. Do not include markdown, code fences, or explanations — return only the raw JSON object.

{
  "serial_number": "<string — exact serial number visible on label, or empty string if not visible>",
  "model":         "<string — product model name>",
  "manufacturer":  "<string — manufacturer/brand name>",
  "category":      "<one of: compute, networking, storage, cable, peripheral, component, unknown>",
  "specs":         {"<spec_key>": "<spec_value>"},
  "suggested_tags": ["<tag1>", "<tag2>"],
  "ai_notes":      "<free-form observations about condition, notable features, or ambiguities>",
  "confidence":    <float between 0.0 and 1.0 — your confidence in the identification>,
  "confidence_note": "<reason why confidence is below 0.75, or empty string if confidence >= 0.75>"
}`, photoCount)
}

Note: internal/ai/client.go imports internal/ai/prompts — add a helper shim in client.go:

// buildIntakePromptWithCount is a package-internal shim to the prompts package.
func buildIntakePromptWithCount(n int) string {
    return prompts.BuildIntakePrompt(n)
}

Add import "git.georgsen.dk/hwlab/internal/ai/prompts" in client.go.

internal/config/config.go — extend Config struct with AIConfig:

Add to the Config struct:

AI AIConfig `mapstructure:"ai"`

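AIConfig is defined in internal/ai, so config.go must import that package and either write the field type as ai.AIConfig or declare a local alias (type AIConfig = ai.AIConfig) so that the field above compiles. Add defaults in Load():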

v.SetDefault("ai.tier1.base_url", "http://localhost:8000/v1")
v.SetDefault("ai.tier1.api_key", "local")
v.SetDefault("ai.tier1.model", "gemma-4-e4b")
v.SetDefault("ai.tier1.timeout_seconds", 30)
v.SetDefault("ai.tier2.base_url", "https://openrouter.ai/api/v1")
v.SetDefault("ai.tier2.api_key", "")
v.SetDefault("ai.tier2.model", "google/gemma-3-27b-it")
v.SetDefault("ai.tier2.timeout_seconds", 60)
v.SetDefault("ai.confidence_threshold", 0.75)
v.SetDefault("ai.quick_add_enabled", false)
v.SetDefault("ai.quick_add_threshold", 0.90)

Add viper bindings:

_ = v.BindEnv("ai.tier1.base_url", "HWLAB_AI_TIER1_BASE_URL")
_ = v.BindEnv("ai.tier1.api_key", "HWLAB_AI_TIER1_API_KEY")
_ = v.BindEnv("ai.tier1.model", "HWLAB_AI_TIER1_MODEL")
_ = v.BindEnv("ai.tier2.base_url", "HWLAB_AI_TIER2_BASE_URL")
_ = v.BindEnv("ai.tier2.api_key", "HWLAB_AI_TIER2_API_KEY")
_ = v.BindEnv("ai.tier2.model", "HWLAB_AI_TIER2_MODEL")
_ = v.BindEnv("ai.confidence_threshold", "HWLAB_AI_CONFIDENCE_THRESHOLD")
_ = v.BindEnv("ai.quick_add_enabled", "HWLAB_AI_QUICK_ADD_ENABLED")

Also configure viper to optionally load ai_config.json as a merge override:

// Optionally merge ai_config.json if present — overrides defaults
v.SetConfigName("ai_config")
v.SetConfigType("json")
if err := v.MergeInConfig(); err != nil {
    if _, ok := err.(viper.ConfigFileNotFoundError); !ok {
        return nil, fmt.Errorf("ai_config file: %w", err)
    }
}
// Restore config name for the primary config file
v.SetConfigName("config")
_ = v.ReadInConfig()

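This first attempt is fragile and should not be used: it reuses one viper instance for two config names, and the later ReadInConfig call for config.json would replace the file-backed settings that MergeInConfig just loaded. With two config files the options are:

- Create a second viper instance, or
- Call AddConfigPath several times with different config names, which viper does not directly support

Simplest correct approach: read the main config.json first (existing behavior), then merge ai_config.json on top: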

// In Load(), after v.ReadInConfig() for config.json:
// Try to merge ai_config.json as an override
v2 := viper.New()
v2.SetConfigName("ai_config")
v2.SetConfigType("json")
v2.AddConfigPath(".")
v2.AddConfigPath("/etc/hwlab")
if err := v2.ReadInConfig(); err == nil {
    // ai_config.json exists — merge AI section into main viper
    if err := v.MergeConfigMap(v2.AllSettings()); err != nil {
        return nil, fmt.Errorf("merge ai_config: %w", err)
    }
}
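The intended layering (ai_config.json values override what is already loaded, and untouched keys survive) can be illustrated with plain nested maps. This sketch shows the override semantics only; `mergeInto` is a hypothetical function and makes no claim about how viper implements MergeConfigMap.

```go
package main

import "fmt"

// mergeInto copies src into dst. Values from src win, and nested maps are
// merged key by key instead of being replaced wholesale.
func mergeInto(dst, src map[string]any) {
	for k, v := range src {
		srcMap, srcIsMap := v.(map[string]any)
		dstMap, dstIsMap := dst[k].(map[string]any)
		if srcIsMap && dstIsMap {
			mergeInto(dstMap, srcMap)
			continue
		}
		dst[k] = v
	}
}

func main() {
	// Settings already loaded from defaults and config.json.
	base := map[string]any{
		"ai": map[string]any{
			"tier1": map[string]any{"model": "gemma-4-e4b", "timeout_seconds": 30},
		},
	}
	// Settings read from ai_config.json.
	override := map[string]any{
		"ai": map[string]any{
			"tier1": map[string]any{"timeout_seconds": 45},
		},
	}
	mergeInto(base, override)
	fmt.Println(base)
}
```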

ai_config.json — template file committed to repo (safe defaults, no real API keys):

{
  "ai": {
    "tier1": {
      "base_url": "http://localhost:8000/v1",
      "api_key": "local",
      "model": "gemma-4-e4b",
      "timeout_seconds": 30
    },
    "tier2": {
      "base_url": "https://openrouter.ai/api/v1",
      "api_key": "REPLACE_WITH_OPENROUTER_KEY",
      "model": "google/gemma-3-27b-it",
      "timeout_seconds": 60
    },
    "confidence_threshold": 0.75,
    "quick_add_enabled": false,
    "quick_add_threshold": 0.90
  }
}
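A quick sanity check for the template is to decode it with encoding/json into the same struct shape, which confirms that the keys line up with the json tags. In this sketch `fileShape`, `decodeAIConfig`, and `placeholderKey` are hypothetical names, and the mapstructure tags are omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type TierConfig struct {
	BaseURL        string `json:"base_url"`
	APIKey         string `json:"api_key"`
	Model          string `json:"model"`
	TimeoutSeconds int    `json:"timeout_seconds"`
}

type AIConfig struct {
	Tier1               TierConfig `json:"tier1"`
	Tier2               TierConfig `json:"tier2"`
	ConfidenceThreshold float64    `json:"confidence_threshold"`
	QuickAddEnabled     bool       `json:"quick_add_enabled"`
	QuickAddThreshold   float64    `json:"quick_add_threshold"`
}

// fileShape (hypothetical) matches the top-level "ai" key of ai_config.json.
type fileShape struct {
	AI AIConfig `json:"ai"`
}

// placeholderKey is the template value that must be replaced before real use.
const placeholderKey = "REPLACE_WITH_OPENROUTER_KEY"

func decodeAIConfig(data []byte) (AIConfig, error) {
	var f fileShape
	if err := json.Unmarshal(data, &f); err != nil {
		return AIConfig{}, fmt.Errorf("decode ai_config: %w", err)
	}
	return f.AI, nil
}

const sample = `{
  "ai": {
    "tier1": {"base_url": "http://localhost:8000/v1", "api_key": "local", "model": "gemma-4-e4b", "timeout_seconds": 30},
    "tier2": {"base_url": "https://openrouter.ai/api/v1", "api_key": "REPLACE_WITH_OPENROUTER_KEY", "model": "google/gemma-3-27b-it", "timeout_seconds": 60},
    "confidence_threshold": 0.75,
    "quick_add_enabled": false,
    "quick_add_threshold": 0.90
  }
}`

func main() {
	cfg, err := decodeAIConfig([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("tier1 model:", cfg.Tier1.Model)
	fmt.Println("tier2 key still placeholder:", cfg.Tier2.APIKey == placeholderKey)
}
```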

Tests to add (in internal/config/config_test.go and new internal/ai/client_test.go):

config_test.go — TestAIConfigDefaults: call Load() with no config file present (use t.TempDir() and os.Chdir); assert cfg.AI.Tier1.BaseURL == "http://localhost:8000/v1", cfg.AI.ConfidenceThreshold == 0.75.

internal/ai/client_test.go (new file):

- TestMockAIClient: create MockAIClient with HighConfidenceResult(); call AnalyzePhotos; assert result.Confidence == 0.95 and len(mock.Calls) == 1
- TestMockAIClientError: set FixedError = errors.New("timeout"); assert returned error is non-nil
- TestTierClientConstruction: NewTierClient(TierConfig{BaseURL: "http://localhost:9999/v1", APIKey: "x", Model: "m", TimeoutSeconds: 1}); assert client is not nil; AnalyzePhotos returns non-nil error (connection refused)

Verify: `cd /home/mikkel/homelabby && go build ./... && go test ./internal/ai/... ./internal/config/... -v 2>&1 | tail -30`

Acceptance criteria:
- go build ./... passes
- TestMockAIClient PASS, TestMockAIClientError PASS
- TestTierClientConstruction PASS (connection refused error returned, not panic)
- TestAIConfigDefaults PASS (defaults unmarshalled correctly)
- internal/ai/ package has types.go, client.go, mock.go, prompts/intake.go
- ai_config.json exists in project root

<threat_model>

Trust Boundaries

Boundary	Description
config→TierClient	API keys from ai_config.json reach go-openai client; must not be logged
TierClient→oMLX	HTTP to localhost:8000 — trusted network, but response is untrusted JSON from model
TierClient→OpenRouter	HTTPS to openrouter.ai — TLS protects key in transit; response is untrusted

STRIDE Threat Register

Threat ID	Category	Component	Disposition	Mitigation Plan
T-02-01	Information Disclosure	ai_config.json	mitigate	Add ai_config.json to .gitignore (contains OPENROUTER_KEY); commit only template with placeholder value
T-02-02	Tampering	TierClient JSON parse	mitigate	json.Unmarshal into typed struct — unrecognized keys ignored; numeric fields clamped by orchestrator in Plan 02
T-02-03	Denial of Service	TierClient timeout	mitigate	context.WithTimeout(ctx, c.timeout) wraps every CreateChatCompletion call — oMLX hangs cannot block handler indefinitely
T-02-04	Information Disclosure	API key logging	accept	No log.Printf of TierConfig.APIKey anywhere; dev/test uses "local" key (not secret)
</threat_model>

After plan completion:

1. `go build ./...`: zero errors
2. `go test ./internal/ai/... -v`: all unit tests pass
3. `go test ./internal/config/... -v`: including the new AI config defaults test
4. `go test ./internal/netbox/... -run TestCreateDevice -v`: validation test passes, live test skips
5. `grep "sashabaranov/go-openai" go.mod`: confirms the dependency is present
6. `ls internal/ai/`: shows types.go, client.go, mock.go, client_test.go
7. `ls internal/ai/prompts/`: shows intake.go
8. `ls ai_config.json`: file exists

<success_criteria>

- go-openai is in go.mod and the module compiles cleanly
- AIClient interface defined in internal/ai/client.go with AnalyzePhotos(ctx, IntakeRequest) signature
- MockAIClient implements AIClient and records calls
- TierClient wraps go-openai with BaseURL override
- Config.AI.Tier1.BaseURL defaults to "http://localhost:8000/v1" when no config file present
- NetBox client has CreateDevice method that returns (int64, error)
- All new tests pass; go build clean

</success_criteria>

After completion, create `.planning/phases/02-ai-pipeline/02-01-SUMMARY.md`
