Phase 2: AI Pipeline - Research
Researched: 2026-04-10
Domain: Go AI client interface, multipart photo intake, multimodal vision with Gemma 4 via oMLX, three-tier orchestrator, confidence-based quality gate wiring
Confidence: HIGH (core patterns from training knowledge, verified against codebase and stack decisions)
<user_constraints>
User Constraints (from CONTEXT.md)
Locked Decisions
- Single go-openai client with configurable BaseURL per tier
- Tier 1: oMLX at http://localhost:8000/v1 (Gemma 4 E4B default)
- Tier 2: OpenRouter at https://openrouter.ai/api/v1 (research agent)
- Tier 3: OpenRouter (Opus for Lab Advisor — deferred to Phase 6)
- Config JSON drives tier routing — no code changes to swap providers
- POST /api/intake accepts multipart/form-data with 1-3 photo files
- Photos encoded as base64 and sent to Gemma 4 vision endpoint
- AI extracts: serial number, model, manufacturer, specs, category, suggested tags
- Confidence score determines catalog_status: high → indexed, low → needs_research
- Config flag enables skip-review flow for high-confidence items (Quick Add mode)
- oMLX may not be installed on dev machine — use mock AI client for unit tests
- Integration tests skip gracefully when oMLX unreachable
- Expose AIClient interface so production uses oMLX, tests use mock
- AI config lives in ai_config.json (separate from main config.json)
- Intake handler should use write-ahead queue if NetBox unreachable
- SearXNG function calling deferred to Phase 7
Claude's Discretion
All implementation details are at Claude's discretion. Use Phase 1 artifacts (NetBox client, quality gate, HW-ID) as building blocks.
Deferred Ideas (OUT OF SCOPE)
- SearXNG function calling (Phase 7)
- Lab Advisor tier 3 (Phase 6)
- Natural language search (Phase 7)
- Actual Gemma 4 model tuning/fine-tuning
- React UI for intake (Phase 3)
</user_constraints>
<phase_requirements>
Phase Requirements
| ID | Description | Research Support |
|---|---|---|
| AI-01 | oMLX installed on Mac Mini M4 with Gemma 4 model serving OpenAI-compatible API | oMLX setup guide + mock pattern for dev |
| AI-02 | User can upload 1-3 photos and AI extracts serial number, model, manufacturer, specs via multimodal vision | Multipart form handling + base64 vision message pattern |
| AI-03 | AI suggests category, tags, and location for each item | Structured JSON response from vision prompt |
| AI-04 | AI calls SearXNG via function calling to research product specs (STUB only this phase) | Stub interface only; real impl Phase 7 |
| AI-05 | Orchestrator reviews Tier 1 output for completeness and flags gaps as needs_research | Confidence extraction + quality gate transition |
| AI-06 | Tier 2 research agent (OpenRouter) automatically enriches items flagged needs_research | go-openai BaseURL swap pattern |
| AI-07 | Quick add mode skips review screen for items with high AI confidence | Config flag + threshold comparison |
| AI-08 | All AI tiers accessed via single OpenAI-compatible client with configurable base URLs | go-openai ClientConfig.BaseURL |
| AI-09 | Provider routing configured via JSON file — swap any tier without code changes | ai_config.json schema + factory pattern |
</phase_requirements>
Summary
Phase 2 builds the AI backbone of HWLab: a Go interface hierarchy that decouples test-time mocks from production oMLX/OpenRouter calls, a multipart photo intake handler that encodes images as base64 vision messages, a structured-output extractor that parses Gemma 4 JSON responses into typed IntakeResult values, and a three-tier orchestrator that escalates to OpenRouter when Tier 1 confidence falls below threshold.
The key design challenge is keeping the AIClient interface minimal enough to mock cleanly while capturing the full vision + JSON-mode call pattern used by go-openai. The confidence score must be embedded in the model's structured output (not inferred post-hoc) because Gemma 4 / OpenAI-compatible APIs do not expose logprobs for vision tasks reliably.
The orchestrator plugs directly into Phase 1's CatalogUpdater, AllocateNextHWID, PatchCustomFields, and SyncTags — all four are stable and tested. The WAQ from Phase 1 (Plan 05) is already wired into main.go and is the fallback path when NetBox is unreachable during intake.
Primary recommendation: Build the AIClient interface and mock first, then the intake handler, then the orchestrator. Keep confidence scoring self-contained inside the AI package — do not leak float64 confidence values into the service layer; instead expose a typed CatalogStatus decision from the orchestrator.
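The recommendation to hide raw confidence values behind a typed decision can be sketched as follows. This is a minimal illustration, not the project's implementation: CatalogStatus here is a local stand-in for the Phase 1 inventory type, and Thresholds, Decision, and Decide are hypothetical names introduced only for this example.

```go
package main

// CatalogStatus is a local stand-in for the Phase 1 inventory status type (assumption).
type CatalogStatus string

const (
	StatusIndexed       CatalogStatus = "indexed"
	StatusNeedsResearch CatalogStatus = "needs_research"
)

// Thresholds mirrors the confidence settings described for ai_config.json.
type Thresholds struct {
	Confidence      float64 // below this value the item needs research
	QuickAddEnabled bool    // config flag for Quick Add mode
	QuickAdd        float64 // at or above this value the review screen may be skipped
}

// Decision is all the service layer sees; the float64 score stays inside the AI package.
type Decision struct {
	Status     CatalogStatus
	SkipReview bool
}

// Decide converts a self-reported confidence score into a typed decision.
func Decide(confidence float64, t Thresholds) Decision {
	if confidence < t.Confidence {
		return Decision{Status: StatusNeedsResearch}
	}
	return Decision{
		Status:     StatusIndexed,
		SkipReview: t.QuickAddEnabled && confidence >= t.QuickAdd,
	}
}
```

With this shape, callers branch on Decision.Status and Decision.SkipReview and never compare floats themselves, so threshold changes stay local to the AI package.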
Standard Stack
Core (Phase 2 additions)
| Library | Version | Purpose | Why Standard |
|---|---|---|---|
| github.com/sashabaranov/go-openai | v1.x | OpenAI-compatible HTTP client | Single client for oMLX + OpenRouter; BaseURL swap is the tier-routing mechanism; already recommended in STACK.md |
Version verification:
go get github.com/sashabaranov/go-openai@latest
# As of 2026-04 training knowledge: v1.36+ is current — verify before install
[ASSUMED: exact latest version. The Go equivalent of npm view is go list -m github.com/sashabaranov/go-openai@latest; run it to confirm.]
Already in go.mod (no new dependencies needed)
| Package | Current Version | Used By Phase 2 |
|---|---|---|
| github.com/go-chi/chi/v5 | v5.2.5 | POST /api/intake route |
| github.com/spf13/viper | v1.21.0 | ai_config.json loading |
| github.com/google/uuid | v1.6.0 | Intake job ID (already indirect) |
| github.com/redis/go-redis/v9 | v9.18.0 | WAQ fallback on NetBox failure |
Installation
cd /home/mikkel/homelabby
go get github.com/sashabaranov/go-openai@latest
Architecture Patterns
Recommended Package Structure (Phase 2 additions)
internal/
├── ai/
│ ├── client.go # AIClient interface + TierClient concrete type
│ ├── mock.go # MockAIClient for unit tests
│ ├── orchestrator.go # Three-tier routing + escalation logic
│ ├── types.go # IntakeRequest, IntakeResult, ConfidenceLevel
│ └── prompts/
│ └── intake.go # Prompt templates for hardware analysis
├── api/
│ ├── handlers/
│ │ └── intake.go # POST /api/intake multipart handler (new)
│ └── router.go # Add intake route (modify existing)
└── config/
└── config.go # Add AIConfig fields (modify existing)
Pattern 1: AIClient Interface + TierClient
What: A minimal Go interface that captures the one call shape Phase 2 needs. TierClient wraps *openai.Client from go-openai. MockAIClient implements the same interface deterministically.
Why minimal interface: The interface should expose the behavior, not the library. If the interface requires *openai.ChatCompletionRequest, tests must import go-openai. A domain-typed interface (AnalyzePhotos) keeps mocks simple.
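// Source: training knowledge, standard Go interface pattern [ASSUMED]
// internal/ai/client.go
package ai

import (
	"context"
	"net/http"
	"time"

	openai "github.com/sashabaranov/go-openai"
)

// AIClient is the single abstraction over any OpenAI-compatible inference backend.
// Production: TierClient wrapping sashabaranov/go-openai.
// Tests: MockAIClient with canned responses.
type AIClient interface {
	AnalyzePhotos(ctx context.Context, req IntakeRequest) (*IntakeResult, error)
}

// TierConfig holds provider configuration for one AI tier.
// The mapstructure tags are what viper uses when decoding; the json tags cover direct JSON decoding.
type TierConfig struct {
	BaseURL  string `json:"base_url" mapstructure:"base_url"`
	APIKey   string `json:"api_key" mapstructure:"api_key"`
	Model    string `json:"model" mapstructure:"model"`
	TimeoutS int    `json:"timeout_seconds" mapstructure:"timeout_seconds"`
}

// TierClient is the production AIClient backed by go-openai.
type TierClient struct {
	client *openai.Client
	model  string
}

// NewTierClient builds a client bound to one tier's BaseURL, API key, and timeout.
func NewTierClient(cfg TierConfig) *TierClient {
	config := openai.DefaultConfig(cfg.APIKey)
	config.BaseURL = cfg.BaseURL
	if cfg.TimeoutS > 0 {
		// Apply the per-tier timeout from ai_config.json to every request.
		config.HTTPClient = &http.Client{Timeout: time.Duration(cfg.TimeoutS) * time.Second}
	}
	return &TierClient{
		client: openai.NewClientWithConfig(config),
		model:  cfg.Model,
	}
}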
[VERIFIED: go-openai BaseURL override via openai.DefaultConfig + config.BaseURL — confirmed pattern from STACK.md and ARCHITECTURE.md]
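Because TierConfig carries both a timeout and an API key, two small helpers may be worth having next to it. The sketch below restates the struct so it compiles on its own; the Timeout and String methods are hypothetical additions, and the 30-second default is an assumption borrowed from the tier1 example value, not a documented default.

```go
package main

import (
	"fmt"
	"time"
)

// TierConfig is restated here so the example is self-contained.
type TierConfig struct {
	BaseURL  string
	APIKey   string
	Model    string
	TimeoutS int
}

// Timeout returns the configured per-request timeout.
// When TimeoutS is unset it falls back to 30s (an assumed default).
func (c TierConfig) Timeout() time.Duration {
	if c.TimeoutS <= 0 {
		return 30 * time.Second
	}
	return time.Duration(c.TimeoutS) * time.Second
}

// String implements fmt.Stringer so that %v and %s formatting never prints the API key.
func (c TierConfig) String() string {
	key := "(unset)"
	if c.APIKey != "" {
		key = "***"
	}
	return fmt.Sprintf("TierConfig{BaseURL:%s Model:%s APIKey:%s Timeout:%s}",
		c.BaseURL, c.Model, key, c.Timeout())
}
```

Note that %#v bypasses String, so debug output using that verb would still need manual redaction.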
Pattern 2: Multipart Photo Upload → Base64 Vision Message
What: chi handler reads up to 3 files from multipart form, reads each into []byte, encodes to base64 data URL, assembles a ChatCompletionRequest with ImageURL content parts.
go-openai vision message shape: [ASSUMED: standard pattern, consistent with OpenAI API]
// internal/api/handlers/intake.go
// Source: go-openai vision pattern [ASSUMED: matches OpenAI API spec]
func (h *IntakeHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	// Parse multipart: up to 32MB is held in memory, the rest spills to temp files
	if err := r.ParseMultipartForm(32 << 20); err != nil {
		http.Error(w, "bad multipart", http.StatusBadRequest)
		return
	}
	files := r.MultipartForm.File["photos"]
	if len(files) == 0 || len(files) > 3 {
		http.Error(w, "1-3 photos required", http.StatusBadRequest)
		return
	}
	var photosB64 []string
	for _, fh := range files {
		f, err := fh.Open()
		if err != nil {
			http.Error(w, "cannot open photo", http.StatusBadRequest)
			return
		}
		data, err := io.ReadAll(f)
		f.Close() // close now; a defer inside the loop would keep every file open until return
		if err != nil {
			http.Error(w, "cannot read photo", http.StatusBadRequest)
			return
		}
		// DetectContentType inspects at most the first 512 bytes
		mime := http.DetectContentType(data)
		photosB64 = append(photosB64, fmt.Sprintf("data:%s;base64,%s",
			mime, base64.StdEncoding.EncodeToString(data)))
	}
	result, err := h.ai.AnalyzePhotos(r.Context(), ai.IntakeRequest{
		PhotosBase64: photosB64,
	})
	// ...
}
go-openai vision content parts: [ASSUMED]
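// internal/ai/client.go (TierClient.AnalyzePhotos)
func (c *TierClient) AnalyzePhotos(ctx context.Context, req IntakeRequest) (*IntakeResult, error) {
	// Build content parts: the text prompt first, then one image part per photo
	parts := []openai.ChatMessagePart{
		{
			Type: openai.ChatMessagePartTypeText,
			Text: buildIntakePrompt(),
		},
	}
	for _, b64 := range req.PhotosBase64 {
		parts = append(parts, openai.ChatMessagePart{
			Type: openai.ChatMessagePartTypeImageURL,
			ImageURL: &openai.ChatMessageImageURL{
				URL:    b64, // data:image/jpeg;base64,...
				Detail: openai.ImageURLDetailAuto,
			},
		})
	}
	resp, err := c.client.CreateChatCompletion(ctx, openai.ChatCompletionRequest{
		Model: c.model,
		Messages: []openai.ChatCompletionMessage{
			{Role: openai.ChatMessageRoleUser, MultiContent: parts},
		},
		// ResponseFormat for JSON mode: see Pattern 3
	})
	if err != nil {
		return nil, fmt.Errorf("chat completion: %w", err)
	}
	if len(resp.Choices) == 0 {
		return nil, errors.New("chat completion returned no choices")
	}
	// Parse the model reply as JSON; a non-JSON reply becomes a zero-confidence result
	var result IntakeResult
	if err := json.Unmarshal([]byte(resp.Choices[0].Message.Content), &result); err != nil {
		return &IntakeResult{Confidence: 0.0}, nil
	}
	return &result, nil
}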
[ASSUMED: MultiContent field name in go-openai ChatCompletionMessage — verify against actual go-openai source after install. Some versions use Content string OR MultiContent []ChatMessagePart]
CRITICAL NOTE: Verify the exact ChatCompletionMessage field for multi-content vision after go get. The field has been MultiContent in v1.20+ but naming may differ. Check with:
go doc github.com/sashabaranov/go-openai ChatCompletionMessage
Pattern 3: Structured JSON Output from Gemma 4
What: Instruct the model to return a specific JSON schema via prompt engineering. Use ResponseFormat with JSONObject type when the endpoint supports it (oMLX/Gemma 4 may not support strict JSON schema mode — fall back to prompt-only).
IntakeResult schema:
// internal/ai/types.go
package ai
// IntakeResult is the structured output from any AI tier's photo analysis.
// The model is instructed to return this JSON shape verbatim.
type IntakeResult struct {
SerialNumber string `json:"serial_number"` // empty string if not visible
Model string `json:"model"`
Manufacturer string `json:"manufacturer"`
Category string `json:"category"` // e.g. "networking", "cable", "compute"
Specs map[string]string `json:"specs"` // key-value hardware specs
SuggestedTags []string `json:"suggested_tags"`
AINotes string `json:"ai_notes"` // free-form observations
Confidence float64 `json:"confidence"` // 0.0–1.0, self-reported by model
ConfidenceNote string `json:"confidence_note"` // why confidence is low (if < threshold)
}
Prompt pattern for JSON output:
// internal/ai/prompts/intake.go
func buildIntakePrompt() string {
return `Analyze the hardware in the provided photo(s) and return ONLY valid JSON matching this schema:
{
"serial_number": "<string or empty>",
"model": "<string>",
"manufacturer": "<string>",
"category": "<one of: compute, networking, storage, cable, peripheral, component, unknown>",
"specs": {"<key>": "<value>"},
"suggested_tags": ["<tag1>", "<tag2>"],
"ai_notes": "<observations>",
"confidence": <float 0.0-1.0>,
"confidence_note": "<reason if confidence < 0.75>"
}
Return ONLY the JSON object. No markdown, no explanation.`
}
JSON mode ResponseFormat (use if supported by endpoint): [ASSUMED]
// Only set if oMLX / OpenRouter model supports JSON mode
ResponseFormat: &openai.ChatCompletionResponseFormat{
Type: openai.ChatCompletionResponseFormatTypeJSONObject,
},
[ASSUMED: Gemma 4 via oMLX may not support response_format: json_object — implement with prompt-only fallback and parse json.Unmarshal on the raw response string. If JSON parse fails, treat as low-confidence and escalate.]
Pattern 4: Three-Tier Orchestrator
What: Orchestrator holds two AIClient instances (tier1, tier2). For each intake request: call tier1, parse result, check confidence. If confidence < threshold OR parse failed, call tier2 with same request. Map confidence to CatalogStatus for quality gate.
// internal/ai/orchestrator.go
package ai

import "context"

// Also import the project's internal/inventory package for CatalogStatus.

type Orchestrator struct {
	tier1     AIClient
	tier2     AIClient
	threshold float64 // from config, default 0.75
}

func NewOrchestrator(tier1, tier2 AIClient, threshold float64) *Orchestrator {
	return &Orchestrator{tier1: tier1, tier2: tier2, threshold: threshold}
}

// Analyze runs tier1, escalates to tier2 if needed, returns result + catalog decision.
func (o *Orchestrator) Analyze(ctx context.Context, req IntakeRequest) (*IntakeResult, inventory.CatalogStatus, error) {
	result, err := o.tier1.AnalyzePhotos(ctx, req)
	if err != nil || result == nil || result.Confidence < o.threshold {
		// Escalate to tier2
		result2, err2 := o.tier2.AnalyzePhotos(ctx, req)
		if err2 == nil && result2 != nil {
			result = result2
		}
		// If tier2 also fails, keep the tier1 result (or a zero result) with NeedsResearch status
	}
	if result == nil {
		result = &IntakeResult{} // both tiers failed: return a zero result, never nil
	}
	status := inventory.StatusIndexed
	if result.Confidence < o.threshold {
		status = inventory.StatusNeedsResearch
	}
	return result, status, nil
}
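The escalation rule above can be exercised without any network access. The following is a self-contained sketch, assuming trimmed-down IntakeRequest and IntakeResult types and a hypothetical stubClient; unlike the orchestrator above, this variant keeps whichever tier reports the higher confidence, which is a design variation and not the documented behaviour.

```go
package main

import (
	"context"
	"errors"
)

// Trimmed-down versions of the Phase 2 types, restated so the example compiles alone.
type IntakeRequest struct{ PhotosBase64 []string }

type IntakeResult struct {
	Model      string
	Confidence float64
}

type AIClient interface {
	AnalyzePhotos(ctx context.Context, req IntakeRequest) (*IntakeResult, error)
}

// stubClient is a hypothetical test double that counts how often it is called.
type stubClient struct {
	result *IntakeResult
	err    error
	calls  int
}

func (s *stubClient) AnalyzePhotos(_ context.Context, _ IntakeRequest) (*IntakeResult, error) {
	s.calls++
	return s.result, s.err
}

// errUnavailable simulates a tier that cannot be reached.
var errUnavailable = errors.New("tier unavailable")

// analyze calls tier1, escalates to tier2 when tier1 fails or is below threshold,
// and reports whether the final result still needs research.
func analyze(ctx context.Context, tier1, tier2 AIClient, threshold float64, req IntakeRequest) (*IntakeResult, bool) {
	best, err := tier1.AnalyzePhotos(ctx, req)
	if err != nil || best == nil {
		best = &IntakeResult{} // zero result, confidence 0.0
	}
	if best.Confidence < threshold {
		r2, err2 := tier2.AnalyzePhotos(ctx, req)
		if err2 == nil && r2 != nil && r2.Confidence >= best.Confidence {
			best = r2
		}
	}
	return best, best.Confidence < threshold
}
```

A table-driven test can then assert three cases: tier1 low and tier2 high (tier2 result adopted), tier1 high (tier2 never called), and both tiers failing (zero result flagged for research).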
Pattern 5: MockAIClient for Unit Tests
What: A deterministic mock that returns canned IntakeResult values. Implements AIClient interface. Configurable to return high-confidence or low-confidence responses, and optionally errors.
// internal/ai/mock.go
package ai
import "context"
// MockAIClient is a test double for AIClient.
// Configure FixedResult and/or FixedError before use.
type MockAIClient struct {
FixedResult *IntakeResult
FixedError error
Calls []IntakeRequest // record of calls for assertions
}
func (m *MockAIClient) AnalyzePhotos(_ context.Context, req IntakeRequest) (*IntakeResult, error) {
m.Calls = append(m.Calls, req)
return m.FixedResult, m.FixedError
}
// HighConfidenceResult returns a fixture IntakeResult with confidence 0.95.
func HighConfidenceResult() *IntakeResult {
return &IntakeResult{
Model: "Raspberry Pi 4 Model B",
Manufacturer: "Raspberry Pi Foundation",
Category: "compute",
Specs: map[string]string{"ram": "4GB", "cpu": "BCM2711"},
SuggestedTags: []string{"raspberry-pi", "compute", "arm"},
Confidence: 0.95,
}
}
// LowConfidenceResult returns a fixture with confidence 0.40 (below threshold).
func LowConfidenceResult() *IntakeResult {
return &IntakeResult{
Model: "Unknown Device",
Category: "unknown",
Confidence: 0.40,
ConfidenceNote: "Cannot identify markings clearly",
}
}
Pattern 6: AI Config Schema (ai_config.json)
What: Separate JSON config file for AI provider settings. Loaded by viper alongside main config.json. Keeps provider credentials out of the main config.
{
"tier1": {
"base_url": "http://localhost:8000/v1",
"api_key": "local",
"model": "gemma-4-e4b",
"timeout_seconds": 30
},
"tier2": {
"base_url": "https://openrouter.ai/api/v1",
"api_key": "sk-or-...",
"model": "google/gemma-2-27b-it",
"timeout_seconds": 60
},
"confidence_threshold": 0.75,
"quick_add_enabled": false,
"quick_add_threshold": 0.90
}
Config struct extension (extend existing internal/config/config.go):
type AIConfig struct {
Tier1 TierConfig `mapstructure:"tier1"`
Tier2 TierConfig `mapstructure:"tier2"`
ConfidenceThreshold float64 `mapstructure:"confidence_threshold"`
QuickAddEnabled bool `mapstructure:"quick_add_enabled"`
QuickAddThreshold float64 `mapstructure:"quick_add_threshold"`
}
// Add to Config struct:
AI AIConfig `mapstructure:"ai"`
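Viper can load ai_config.json in two ways. Either point the existing instance at the second file with v.SetConfigFile("ai_config.json") and call v.MergeInConfig(), or read the file into its own viper instance and unmarshal it directly into AIConfig. Note that the example file above has tier1 and tier2 at the top level, while the Config struct maps AIConfig under an "ai" key: if the file is merged into the main instance, wrap its contents in an "ai" object (or decode it with a separate instance) so the keys line up. v.MergeConfigMap takes an in-memory map rather than a file, so it only applies once the file has already been decoded. TierConfig also needs mapstructure tags, not only json tags, for viper to match base_url, api_key, and timeout_seconds.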
[ASSUMED: viper MergeInConfig pattern for secondary config file — standard viper v1 capability]
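Independently of how viper loads the file, the decoded values are worth validating before any tier client is built. The sketch below deliberately uses encoding/json instead of viper so that it runs without dependencies; ParseAIConfig is a hypothetical helper, and the 0.90 quick-add default is an assumption taken from the example file rather than a stated default.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

type TierConfig struct {
	BaseURL  string `json:"base_url"`
	APIKey   string `json:"api_key"`
	Model    string `json:"model"`
	TimeoutS int    `json:"timeout_seconds"`
}

type AIConfig struct {
	Tier1               TierConfig `json:"tier1"`
	Tier2               TierConfig `json:"tier2"`
	ConfidenceThreshold float64    `json:"confidence_threshold"`
	QuickAddEnabled     bool       `json:"quick_add_enabled"`
	QuickAddThreshold   float64    `json:"quick_add_threshold"`
}

// ParseAIConfig decodes ai_config.json content, fills defaults for absent keys,
// and rejects values that would make tier routing or thresholds meaningless.
func ParseAIConfig(raw []byte) (AIConfig, error) {
	// Defaults survive json.Unmarshal when the key is missing from the file.
	cfg := AIConfig{ConfidenceThreshold: 0.75, QuickAddThreshold: 0.90}
	if err := json.Unmarshal(raw, &cfg); err != nil {
		return AIConfig{}, fmt.Errorf("decode ai config: %w", err)
	}
	tiers := []struct {
		name string
		tc   TierConfig
	}{{"tier1", cfg.Tier1}, {"tier2", cfg.Tier2}}
	for _, t := range tiers {
		u, err := url.Parse(t.tc.BaseURL)
		if err != nil || u.Scheme == "" || u.Host == "" {
			return AIConfig{}, fmt.Errorf("%s: invalid base_url %q", t.name, t.tc.BaseURL)
		}
		if t.tc.Model == "" {
			return AIConfig{}, fmt.Errorf("%s: model is required", t.name)
		}
	}
	if cfg.ConfidenceThreshold < 0 || cfg.ConfidenceThreshold > 1 {
		return AIConfig{}, fmt.Errorf("confidence_threshold %v outside [0,1]", cfg.ConfidenceThreshold)
	}
	if cfg.QuickAddThreshold < cfg.ConfidenceThreshold || cfg.QuickAddThreshold > 1 {
		return AIConfig{}, fmt.Errorf("quick_add_threshold %v must lie in [confidence_threshold,1]", cfg.QuickAddThreshold)
	}
	return cfg, nil
}
```

The same checks could run after viper's Unmarshal; failing at startup is usually preferable to discovering a malformed base URL on the first intake request.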
Pattern 7: Intake Handler Wiring to Phase 1 Components
What: The intake handler coordinates: orchestrator (AI analysis) → AllocateNextHWID (ID) → BuildFullCustomFieldsPatch (fields) → NetboxClient.CreateDevice or PatchCustomFields → SyncTags → CatalogUpdater.UpdateCatalogStatus → WAQ fallback.
Existing Phase 1 APIs the handler calls:
| Phase 1 Function | Package | Handler Usage |
|---|---|---|
| AllocateNextHWID(ctx) | internal/netbox | Assign HW-XXXXX ID to new record |
| BuildFullCustomFieldsPatch(cf) | internal/netbox | Populate custom fields from IntakeResult |
| PatchCustomFields(ctx, id, patch) | internal/netbox | Write AI data to NetBox device |
| SyncTags(ctx, tags) | internal/netbox | Create and assign AI-suggested tags |
| UpdateCatalogStatus(ctx, id, current, next) | internal/inventory | Set indexed or needs_research |
| waq.Enqueue(ctx, op) | internal/queue | Buffer NetBox write if unreachable |
Note: Phase 1's client.go has ListDevices and GetDevice but no CreateDevice. The intake handler will need CreateDevice — this is a new method on internal/netbox.Client. Plan must include this task.
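The "write to NetBox, fall back to the WAQ" step of this wiring might look like the sketch below. DeviceWriter, Queue, PendingOp, and the payload struct are hypothetical stand-ins for the Phase 1 types, whose real field names are not shown in this document; for brevity the sketch enqueues on any CreateDevice error, whereas a real handler would probably only buffer connectivity failures.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
)

// DeviceWriter stands in for the NetBox client method the intake handler needs (assumption).
type DeviceWriter interface {
	CreateDevice(ctx context.Context, name, assetTag string) (int, error)
}

// PendingOp and Queue stand in for the Phase 1 write-ahead queue types (assumption).
type PendingOp struct {
	OpType  string          `json:"op_type"`
	Payload json.RawMessage `json:"payload"`
}

type Queue interface {
	Enqueue(ctx context.Context, op PendingOp) error
}

type createDevicePayload struct {
	Name     string `json:"name"`
	AssetTag string `json:"asset_tag"`
}

// createOrEnqueue tries NetBox first and buffers the write when that fails.
// It returns the new device ID, or queued=true when the op was deferred.
func createOrEnqueue(ctx context.Context, nb DeviceWriter, q Queue, name, assetTag string) (id int, queued bool, err error) {
	id, err = nb.CreateDevice(ctx, name, assetTag)
	if err == nil {
		return id, false, nil
	}
	payload, mErr := json.Marshal(createDevicePayload{Name: name, AssetTag: assetTag})
	if mErr != nil {
		return 0, false, errors.Join(err, mErr)
	}
	if qErr := q.Enqueue(ctx, PendingOp{OpType: "netbox.create_device", Payload: payload}); qErr != nil {
		return 0, false, errors.Join(err, qErr) // both paths failed: surface both errors
	}
	return 0, true, nil
}

// In-memory fakes used to demonstrate both paths.
type fakeNetBox struct{ err error }

func (f fakeNetBox) CreateDevice(_ context.Context, _, _ string) (int, error) {
	if f.err != nil {
		return 0, f.err
	}
	return 42, nil
}

type memQueue struct{ ops []PendingOp }

func (m *memQueue) Enqueue(_ context.Context, op PendingOp) error {
	m.ops = append(m.ops, op)
	return nil
}

var errNetBoxDown = errors.New("netbox unreachable")
```

Returning a queued flag lets the HTTP handler answer with an "accepted, pending sync" response instead of pretending the record already exists in NetBox.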
Pattern 8: SearXNG Stub (AI-04)
What: AI-04 is listed as "Phase 7" in REQUIREMENTS.md but the CONTEXT.md says "stub only" this phase. Implement a ResearchClient interface with a Search(ctx, query) method, and a NoOpResearchClient that returns empty results. This satisfies the interface requirement without Phase 7 scope creep.
// internal/ai/research.go (stub)
package ai

import "context"

// SearchResult is a placeholder type; Phase 7 will define its fields.
type SearchResult struct{}

type ResearchClient interface {
	Search(ctx context.Context, query string) ([]SearchResult, error)
}

type NoOpResearchClient struct{}

func (n *NoOpResearchClient) Search(_ context.Context, _ string) ([]SearchResult, error) {
	return nil, nil // Phase 7 will provide real implementation
}
Anti-Patterns to Avoid
- Don't extract confidence from logprobs: Gemma 4 vision via oMLX does not expose per-token logprobs reliably. Embed confidence: float in the JSON output schema and instruct the model to self-report it. [ASSUMED: oMLX logprobs availability is uncertain]
- Don't store photos: Per CLAUDE.md stack patterns: "Store the original photo in a local temp directory only until the NetBox record is created; do not persist photos in HWLab itself." Photos are transient.
- Don't call NetBox from the AI package: internal/ai should not import internal/netbox. The intake handler (service layer) orchestrates both. Keep the AI package focused on inference only.
- Don't share a single go-openai client across tiers: Each tier gets its own *openai.Client instance with its own BaseURL and APIKey. Mutating a shared client's config is a race condition.
- Don't block the HTTP response on AI inference: AI calls take 2-30 seconds. The intake handler should return a job ID immediately and push the result via SSE. (Phase 3 will add SSE; for Phase 2, a synchronous response is acceptable since there's no UI yet, but design the handler to support async promotion.)
Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---|---|---|---|
| OpenAI-compatible HTTP client | Custom HTTP calls to oMLX | sashabaranov/go-openai | Handles auth headers, retry, streaming, vision content parts |
| Base64 encoding | Custom encoder | encoding/base64 stdlib | Already in Go stdlib |
| MIME type detection | File extension parsing | net/http.DetectContentType | Magic bytes detection from stdlib |
| JSON structured output parsing | Regex extraction | encoding/json.Unmarshal | Model output is well-formed JSON when prompted correctly |
| Multipart form parsing | Manual --boundary parsing | r.ParseMultipartForm() | stdlib net/http handles multipart |
Common Pitfalls
Pitfall 1: go-openai Vision MultiContent Field Name
What goes wrong: Code compiles but ChatCompletionMessage.MultiContent field doesn't exist or is named differently in the installed version.
Why it happens: go-openai API evolved; older versions used a single Content string, newer versions added MultiContent []ChatMessagePart for vision. The exact field name depends on the version.
How to avoid: After go get github.com/sashabaranov/go-openai@latest, run go doc github.com/sashabaranov/go-openai ChatCompletionMessage and verify the vision field name before writing handler code.
Warning signs: Compiler error "unknown field MultiContent" or images silently not being sent (text-only response from model).
Pitfall 2: oMLX JSON Mode Not Supported
What goes wrong: Setting ResponseFormat: {Type: "json_object"} causes a 400 error from oMLX because Gemma 4 E4B via oMLX may not support the response_format parameter.
Why it happens: The response_format JSON schema enforcement is an OpenAI-specific feature not universally implemented across all OpenAI-compatible servers.
How to avoid: Implement JSON parsing with a fallback: try json.Unmarshal(content) on the raw string. If parse fails, treat result as zero-confidence and escalate to tier2. Do not set ResponseFormat unless tested against live oMLX.
Warning signs: 400 Bad Request from oMLX at inference time with "unsupported parameter" in body.
Pitfall 3: Data URL MIME Type vs go-openai Image URL
What goes wrong: Some OpenAI-compatible servers reject data:image/jpeg;base64,... data URLs in vision requests and require a https:// URL instead.
Why it happens: The OpenAI spec allows data URLs in image_url.url but not all providers implement this.
How to avoid: oMLX (local, Gemma 4) should accept data URLs since it's processing locally. Test with a minimal integration test against live oMLX before building the full intake flow. Keep the base64 path for oMLX (tier1) and note that tier2 (OpenRouter) may require a different approach if it doesn't accept data URLs.
Warning signs: 400 or inference-time error from oMLX with "invalid image_url".
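Since the data URL is the part most likely to be rejected, it helps to build it in one small function that also refuses non-image uploads. A minimal sketch using only the standard library; toDataURL is a hypothetical helper name.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"net/http"
	"strings"
)

// toDataURL sniffs the MIME type from the payload's leading bytes and returns
// a data URL suitable for image_url.url, rejecting anything that is not an image.
func toDataURL(data []byte) (string, error) {
	mime := http.DetectContentType(data) // considers at most the first 512 bytes
	if !strings.HasPrefix(mime, "image/") {
		return "", fmt.Errorf("unsupported content type %q", mime)
	}
	return "data:" + mime + ";base64," + base64.StdEncoding.EncodeToString(data), nil
}
```

For the eight-byte PNG signature this yields data:image/png;base64,iVBORw0KGgo=, while plain text is rejected. Formats the standard sniffer does not recognise are reported as application/octet-stream and would be refused as well, which may or may not be the desired behaviour.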
Pitfall 4: CreateDevice Not in Phase 1 NetBox Client
What goes wrong: Intake handler tries to call netboxClient.CreateDevice(...) but that method was not built in Phase 1 (only ListDevices, GetDevice, PatchCustomFields were built).
Why it happens: Phase 1 was scoped to read/patch existing devices for the quality gate workflow. Intake requires creating new records.
How to avoid: Plan must include a Wave 0 task to add CreateDevice(ctx, name, assetTag) (int, error) to internal/netbox/client.go before the intake handler can be completed.
go-netbox v4 create pattern: [ASSUMED — matches observed PATCH pattern from 01-02-SUMMARY]
req := nb.WritableDeviceWithConfigContextRequest{}
req.SetName(name)
req.SetAssetTag(assetTag)
// DeviceRole and DeviceType are required by NetBox — plan must handle defaults
resp, _, err := c.api.DcimAPI.DcimDevicesCreate(ctx).
WritableDeviceWithConfigContextRequest(req).Execute()
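Note: NetBox device creation requires the device type and role foreign keys to be set (both are non-nullable in NetBox v4; recent NetBox versions name the API field role, older ones device_role), and a site is required as well. The intake handler must either pick sensible defaults or require pre-provisioned "Unknown" role/type records to exist in NetBox. Verify the exact setter names against the installed go-netbox version.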
Pitfall 5: Confidence Self-Reporting Calibration
What goes wrong: Model returns "confidence": 0.95 for every item regardless of actual uncertainty, making the threshold useless.
Why it happens: LLMs tend to be overconfident in self-reporting. Without explicit calibration prompting, models bias toward high confidence.
How to avoid: Add calibration guidance to the intake prompt: "Return confidence < 0.75 if: serial number not visible, item is partially obscured, or manufacturer/model cannot be determined from visual inspection alone." This nudges the model toward honest low-confidence responses for ambiguous photos.
Pitfall 6: WAQ Integration — PendingOp Payload Schema
What goes wrong: Intake handler enqueues a PendingOp with a payload, but Phase 1's NoOpHandler (the WAQ worker) is still installed — it drains the queue silently. Phase 2 must replace NoOpHandler with a real NetBox retry handler.
Why it happens: Phase 1 explicitly left NoOpHandler as a stub: "Phase 2 will replace this with a real retry handler."
How to avoid: Phase 2 plan must include a task to implement the real WAQ handler that retries failed NetBox CreateDevice / PatchCustomFields calls. Define PendingOp.OpType constants (e.g., "netbox.create_device", "netbox.patch_custom_fields") and the payload structs for each.
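On the consuming side, the real WAQ handler needs to turn a queued op back into a typed payload before retrying. The sketch below uses the two OpType strings suggested above; the PendingOp layout and both payload structs are hypothetical, since the Phase 1 definitions are not reproduced in this document.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	OpCreateDevice      = "netbox.create_device"
	OpPatchCustomFields = "netbox.patch_custom_fields"
)

// PendingOp is an assumed shape for a queued write.
type PendingOp struct {
	OpType  string          `json:"op_type"`
	Payload json.RawMessage `json:"payload"`
}

type CreateDevicePayload struct {
	Name     string `json:"name"`
	AssetTag string `json:"asset_tag"`
}

type PatchCustomFieldsPayload struct {
	DeviceID int               `json:"device_id"`
	Fields   map[string]string `json:"fields"`
}

// decodeOp maps an op type to its payload struct. Unknown types are an error,
// so a misspelled OpType is surfaced instead of being drained silently.
func decodeOp(op PendingOp) (any, error) {
	switch op.OpType {
	case OpCreateDevice:
		var p CreateDevicePayload
		if err := json.Unmarshal(op.Payload, &p); err != nil {
			return nil, fmt.Errorf("%s payload: %w", op.OpType, err)
		}
		return p, nil
	case OpPatchCustomFields:
		var p PatchCustomFieldsPayload
		if err := json.Unmarshal(op.Payload, &p); err != nil {
			return nil, fmt.Errorf("%s payload: %w", op.OpType, err)
		}
		return p, nil
	default:
		return nil, fmt.Errorf("unknown op type %q", op.OpType)
	}
}
```

A retry handler would type-switch on the returned value and call the matching NetBox method, re-enqueueing the op if the call fails again.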
Code Examples
go-openai Client Configuration for oMLX
// Source: go-openai README pattern, confirmed in STACK.md [ASSUMED version specifics]
import openai "github.com/sashabaranov/go-openai"
cfg := openai.DefaultConfig("local") // API key "local" for oMLX (no auth)
cfg.BaseURL = "http://localhost:8000/v1"
client := openai.NewClientWithConfig(cfg)
go-openai Client Configuration for OpenRouter
cfg := openai.DefaultConfig("sk-or-your-key-here")
cfg.BaseURL = "https://openrouter.ai/api/v1"
client := openai.NewClientWithConfig(cfg)
Multipart File Reading in chi Handler
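// Source: Go stdlib net/http [VERIFIED: stdlib pattern]
r.ParseMultipartForm(32 << 20) // 32MB max memory; check the returned error in real code
files := r.MultipartForm.File["photos"]
var photos []string
for _, fh := range files {
	f, err := fh.Open()
	if err != nil {
		continue // or fail the request
	}
	data, err := io.ReadAll(f)
	f.Close() // close per iteration instead of deferring inside the loop
	if err != nil {
		continue
	}
	mime := http.DetectContentType(data) // sniffs at most the first 512 bytes
	b64 := base64.StdEncoding.EncodeToString(data)
	photos = append(photos, fmt.Sprintf("data:%s;base64,%s", mime, b64))
}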
JSON Parse with Fallback
// Source: Go stdlib encoding/json [VERIFIED: stdlib pattern]
var result ai.IntakeResult
content := resp.Choices[0].Message.Content
if err := json.Unmarshal([]byte(content), &result); err != nil {
// Model returned non-JSON — treat as low confidence, escalate
return &ai.IntakeResult{Confidence: 0.0}, nil
}
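Models prompted for "JSON only" still sometimes wrap the object in a markdown fence or add a sentence around it. A slightly more tolerant variant of the fallback above, sketched under the assumption that the reply contains at most one top-level object; parseIntake and the trimmed IntakeResult are illustrative names, and treating an out-of-range confidence as zero is a choice made here, not something the source prescribes.

```go
package main

import (
	"encoding/json"
	"strings"
)

// IntakeResult is trimmed to two fields for the example.
type IntakeResult struct {
	Model      string  `json:"model"`
	Confidence float64 `json:"confidence"`
}

// parseIntake extracts the outermost {...} span before unmarshalling, so
// markdown fences or prose around the object are tolerated. Any failure
// yields a zero-confidence result, which the orchestrator escalates.
func parseIntake(content string) IntakeResult {
	start := strings.Index(content, "{")
	end := strings.LastIndex(content, "}")
	if start < 0 || end < start {
		return IntakeResult{}
	}
	var r IntakeResult
	if err := json.Unmarshal([]byte(content[start:end+1]), &r); err != nil {
		return IntakeResult{}
	}
	// A score outside [0,1] (for example 95 meaning percent) is not trusted.
	if r.Confidence < 0 || r.Confidence > 1 {
		r.Confidence = 0
	}
	return r
}
```

This keeps the "parse failure means low confidence" rule intact while avoiding needless tier 2 escalations for replies that are valid JSON inside a wrapper.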
Integration Test Skip Guard (consistent with Phase 1 pattern)
// Source: Phase 1 established pattern (01-02-SUMMARY.md) [VERIFIED: codebase]
func TestAnalyzePhotosLive(t *testing.T) {
endpoint := os.Getenv("HWLAB_OMLX_ENDPOINT")
if endpoint == "" {
t.Skip("HWLAB_OMLX_ENDPOINT not set — skipping live oMLX test")
}
// ...
}
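The env guard above only checks that the variable is set. If the tests should also skip when the endpoint is configured but down, a short probe can be added. This is a sketch: reachable is a hypothetical helper, and it assumes the server answers GET {base}/models, as OpenAI-compatible servers commonly do; whether oMLX implements that route should be confirmed against a live instance.

```go
package main

import (
	"context"
	"net/http"
	"strings"
	"time"
)

// reachable reports whether GET {baseURL}/models returns a 2xx status within timeout.
// Any error (bad URL, refused connection, timeout) counts as unreachable.
func reachable(baseURL string, timeout time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodGet,
		strings.TrimRight(baseURL, "/")+"/models", nil)
	if err != nil {
		return false
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false
	}
	defer resp.Body.Close()
	return resp.StatusCode >= 200 && resp.StatusCode < 300
}
```

Inside the test this would read: if !reachable(endpoint, 2*time.Second) { t.Skip("oMLX not reachable") }, placed directly after the env check.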
Validation Architecture
Test Framework
| Property | Value |
|---|---|
| Framework | Go testing stdlib (go test ./...) |
| Config file | none — test flags via env vars |
| Quick run command | go test ./internal/ai/... -skip Live -timeout 30s (the -skip flag needs Go 1.20+; live tests also skip themselves when HWLAB_OMLX_ENDPOINT is unset) |
| Full suite command | go test ./... |
Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|---|---|---|---|---|
| AI-02 | Photo upload multipart parsing | unit | go test ./internal/api/handlers/... -run TestIntakeHandler | Wave 0 |
| AI-02 | Base64 encoding of JPEG | unit | go test ./internal/ai/... -run TestEncodePhoto | Wave 0 |
| AI-03 | JSON parse of structured output | unit | go test ./internal/ai/... -run TestParseIntakeResult | Wave 0 |
| AI-05 | Confidence below threshold → needs_research | unit | go test ./internal/ai/... -run TestOrchestratorEscalation | Wave 0 |
| AI-05 | Confidence above threshold → indexed | unit | go test ./internal/ai/... -run TestOrchestratorHighConf | Wave 0 |
| AI-06 | Tier 2 called on tier 1 failure | unit | go test ./internal/ai/... -run TestOrchestratorTier2Fallback | Wave 0 |
| AI-07 | Quick add flag honors threshold | unit | go test ./internal/ai/... -run TestQuickAddMode | Wave 0 |
| AI-08 | TierClient uses configured BaseURL | unit | go test ./internal/ai/... -run TestTierClientConfig | Wave 0 |
| AI-09 | ai_config.json loaded via viper | unit | go test ./internal/config/... -run TestAIConfig | Wave 0 |
| AI-01 | oMLX live inference smoke test | integration | go test ./internal/ai/... -run TestAnalyzePhotosLive (skip if env unset) | Wave 0 |
Sampling Rate
- Per task commit: go test ./internal/ai/... ./internal/api/handlers/... -timeout 30s
- Per wave merge: go test ./...
- Phase gate: Full suite green before /gsd-verify-work
Wave 0 Gaps
- internal/ai/client_test.go: covers AI-08, AI-09 (TierClient config)
- internal/ai/orchestrator_test.go: covers AI-05, AI-06, AI-07
- internal/ai/types_test.go: covers AI-03 (JSON parse)
- internal/api/handlers/intake_test.go: covers AI-02
Security Domain
Applicable ASVS Categories
| ASVS Category | Applies | Standard Control |
|---|---|---|
| V2 Authentication | no | No auth in solo homelab tool |
| V3 Session Management | no | Stateless REST |
| V4 Access Control | no | Solo operator, no roles |
| V5 Input Validation | yes | Validate photo count (1-3), file size cap, MIME type check |
| V6 Cryptography | no | API keys in config, not in code |
Known Threat Patterns
| Pattern | STRIDE | Standard Mitigation |
|---|---|---|
| Oversized photo upload (DoS) | Denial of Service | ParseMultipartForm(32 << 20) caps memory; add explicit per-file size check (e.g., 10MB/photo) |
| AI prompt injection via filename | Tampering | Do not include original filename in AI prompt; use only image bytes |
| API key leakage in logs | Info Disclosure | Never log TierConfig.APIKey; use *** redaction in any debug output |
| Malformed JSON from model | Tampering | Always json.Unmarshal into typed struct; ignore extra fields; treat parse failure as low confidence |
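The oversized-upload mitigation deserves a concrete shape, because ParseMultipartForm's argument only limits how much is held in memory; larger parts spill to temporary files rather than being rejected. The sketch below caps the whole body with http.MaxBytesReader and each photo with an explicit read limit. The helper names are hypothetical, 10MB per photo is the example figure from the table, and the extra 1MB of headroom for multipart framing is an assumption.

```go
package main

import (
	"bytes"
	"errors"
	"io"
	"net/http"
)

const maxPhotoBytes = 10 << 20 // 10MB per photo

var errPhotoTooLarge = errors.New("photo exceeds size limit")

// limitBody caps the total request size before multipart parsing starts:
// three photos plus 1MB of headroom for multipart framing.
func limitBody(w http.ResponseWriter, r *http.Request) {
	r.Body = http.MaxBytesReader(w, r.Body, 3*maxPhotoBytes+(1<<20))
}

// readCapped reads at most limit bytes and fails if the source holds more.
// Reading limit+1 bytes distinguishes "exactly at the limit" from "over it".
func readCapped(r io.Reader, limit int64) ([]byte, error) {
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return nil, err
	}
	if int64(len(data)) > limit {
		return nil, errPhotoTooLarge
	}
	return data, nil
}

// exampleReadCapped shows the helper on in-memory data with a 16-byte limit:
// an 8-byte source is accepted, a 32-byte source is rejected.
func exampleReadCapped() (okLen int, tooLarge bool) {
	small, _ := readCapped(bytes.NewReader(make([]byte, 8)), 16)
	_, err := readCapped(bytes.NewReader(make([]byte, 32)), 16)
	return len(small), errors.Is(err, errPhotoTooLarge)
}
```

In the intake handler, limitBody would be called first, and readCapped(f, maxPhotoBytes) would replace the unbounded io.ReadAll(f) for each uploaded file.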
Environment Availability
| Dependency | Required By | Available | Version | Fallback |
|---|---|---|---|---|
| oMLX on localhost:8000 | AI-01, Tier 1 inference | Unknown (dev machine) | — | MockAIClient for unit tests; integration tests skip with env guard |
| OpenRouter API key | AI-06, Tier 2 | Unknown | — | Integration tests skip; tier2 returns error, orchestrator falls back to needs_research |
| DragonFlyDB (10.5.0.10) | WAQ fallback | VERIFIED reachable (from 01-05-SUMMARY) | — | WAQ init is non-fatal; see 01-05 pattern |
| NetBox (10.5.0.130:8000) | CreateDevice, PatchCustomFields | Available (integration tests skip on placeholder token) | — | WAQ enqueues ops; real token needed for integration tests |
Missing dependencies with no fallback:
- None — all dependencies have mock/skip fallbacks for unit tests.
Missing dependencies with fallback:
- oMLX: MockAIClient covers unit tests; integration test skips with HWLAB_OMLX_ENDPOINT guard.
- OpenRouter key: Same skip guard pattern.
Open Questions
1. NetBox device_role and device_type for CreateDevice
   - What we know: NetBox v4 requires both to be non-null FKs on device creation
   - What's unclear: Should intake auto-create "Unknown" role/type records if absent, or require them pre-provisioned?
   - Recommendation: Phase 1 (Plan 03, provision.go) may have already provisioned these. Check internal/netbox/provision.go before planning the CreateDevice task.
2. Gemma 4 E4B model ID string in oMLX
   - What we know: CONTEXT.md says model: "gemma-4-e4b" as default; oMLX uses the model filename/ID
   - What's unclear: The exact model ID string oMLX uses for Gemma 4 E4B (may be mlx-community/gemma-4-e4b or similar)
   - Recommendation: Leave as a config value; user sets the correct model ID once oMLX is installed. Default to "gemma-4-e4b" in ai_config.json and document the override next to the file, since JSON itself has no comment syntax.
3. Synchronous vs async intake response
   - What we know: AI inference takes 2-30 seconds; Phase 3 adds SSE; no UI in Phase 2
   - What's unclear: Should Phase 2 implement async job IDs now (for Phase 3 to build on) or keep synchronous for simplicity?
   - Recommendation: Implement synchronous for Phase 2 (no UI yet); design the handler to accept a ?async=true query param stub that returns "not yet implemented". This reserves the API surface for Phase 3 without blocking Phase 2.
Assumptions Log
| # | Claim | Section | Risk if Wrong |
|---|---|---|---|
| A1 | go-openai vision content uses MultiContent []ChatMessagePart field on ChatCompletionMessage | Pattern 2 | Compile error; verify with go doc after install |
| A2 | oMLX supports data URL base64 images in vision requests | Pattern 2 | 400 error at inference time; may need to write image to temp file and use URL instead |
| A3 | oMLX may not support response_format: json_object | Pattern 3 | Must use prompt-only JSON mode; 400 if ResponseFormat is set |
| A4 | go-openai latest version is v1.36+ | Standard Stack | Run go get to verify; version is only needed to confirm stability |
| A5 | Gemma 4 E4B self-reports honest confidence scores with calibration prompting | Pitfall 5 | Threshold becomes useless if model is always overconfident; may need threshold tuning |
| A6 | viper MergeInConfig can load ai_config.json as secondary config | Pattern 6 | Config loading fails silently; test config loading in Wave 0 |
Sources
Primary (HIGH confidence)
- CONTEXT.md (02-CONTEXT.md): locked decisions for Phase 2 (this session)
- 01-02-SUMMARY.md, 01-04-SUMMARY.md, 01-05-SUMMARY.md: Phase 1 actual implementation (verified codebase state)
- internal/config/config.go: existing config struct to extend
- internal/api/router.go: existing chi router to add route to
- go.mod: confirmed go-openai not yet installed
Secondary (MEDIUM confidence)
- ARCHITECTURE.md, STACK.md: project research documents (verified at research time)
- CLAUDE.md stack patterns section: photo intake pattern, AI tier routing pattern
Tertiary (LOW/ASSUMED)
- go-openai ChatCompletionMessage.MultiContent field name: training knowledge, verify post-install
- oMLX response_format support status: not tested; marked ASSUMED
- go-openai latest version number: marked ASSUMED
Metadata
Confidence breakdown:
- Standard stack: HIGH — go-openai is the decided library; already in STACK.md; pattern for BaseURL swap is verified
- Architecture (interface/mock pattern): HIGH — standard Go interface idiom, consistent with Phase 1 patterns
- go-openai vision API field names: LOW — exact field names require post-install verification
- oMLX JSON mode support: LOW — not tested against live oMLX
Research date: 2026-04-10
Valid until: 2026-05-10 (go-openai API is stable; oMLX is fast-moving, so re-verify JSON mode if the oMLX version changes)