Wave 1: go-openai dep, CreateDevice gap, AIClient interface + mock + config
Wave 2: three-tier orchestrator, WAQ real handler, SearXNG stub
Wave 3: POST /api/intake handler, router wiring, quick add mode
Wave 4: oMLX integration test + memory checkpoint

Covers requirements: AI-01 through AI-09 (AI-04 stub only; full impl Phase 7)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | must_haves |
|---|---|---|---|---|---|---|---|---|
| 02-ai-pipeline | 04 | execute | 4 | | | false | | |
Purpose: AI-01 requires empirical validation that Gemma 4 fits in 16GB on the Mac Mini. This checkpoint collects that measurement.
Output: Passing integration test (when oMLX reachable), memory measurement recorded in docs/, model tier confirmed.
<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/phases/02-ai-pipeline/02-CONTEXT.md
@.planning/phases/02-ai-pipeline/02-03-SUMMARY.md

From internal/ai/client.go:
```go
type TierClient struct{ /* ... */ }

func NewTierClient(cfg TierConfig) *TierClient
func (c *TierClient) AnalyzePhotos(ctx context.Context, req IntakeRequest) (*IntakeResult, error)
```

From internal/ai/types.go:
type TierConfig struct {
	BaseURL        string
	APIKey         string
	Model          string
	TimeoutSeconds int
}

type IntakeRequest struct { PhotosBase64 []string; JobID string }

type IntakeResult struct {
	Model string; Manufacturer string; Confidence float64
	// ... other fields
}
oMLX installation (macOS Apple Silicon):
# Install oMLX (requires macOS 15+, Apple Silicon)
# From https://omlx.ai or brew if available
# Default port: 8000
# Start with: omlx serve --model gemma-4-e4b --port 8000
# Measure memory: Activity Monitor, or `memory_pressure` / `vm_stat` from a terminal
Task 1: oMLX integration test with skip guard
internal/ai/omlx_integration_test.go
- internal/ai/client.go (full — TierClient and AnalyzePhotos)
- internal/ai/types.go (full)
- internal/netbox/client_test.go (skim — skip guard pattern used in this codebase)
Create internal/ai/omlx_integration_test.go:
//go:build integration

package ai_test

import (
	"context"
	"encoding/base64"
	"os"
	"testing"

	"git.georgsen.dk/hwlab/internal/ai"
)

// TestOMLXIntegration tests a real call to oMLX.
// Run with: HWLAB_OMLX_URL=http://localhost:8000/v1 go test ./internal/ai/... -tags integration -v -run TestOMLX
//
// The test is skipped only when HWLAB_OMLX_URL is not set.
// If the URL is set but oMLX is unreachable, the test fails with a connection
// error instead of skipping, so the failure stays visible.
func TestOMLXIntegration(t *testing.T) {
	omlxURL := os.Getenv("HWLAB_OMLX_URL")
	if omlxURL == "" {
		t.Skip("HWLAB_OMLX_URL not set; skipping oMLX integration test")
	}

	model := os.Getenv("HWLAB_OMLX_MODEL")
	if model == "" {
		model = "gemma-4-e4b"
	}

	client := ai.NewTierClient(ai.TierConfig{
		BaseURL:        omlxURL,
		APIKey:         "local",
		Model:          model,
		TimeoutSeconds: 60,
	})

	// Minimal 1x1 JPEG as a data URL. Real photos are not needed for an
	// integration smoke test; the bytes come from minimalJPEGBase64 below.
	minimalJPEG := "data:image/jpeg;base64," + minimalJPEGBase64()

	result, err := client.AnalyzePhotos(context.Background(), ai.IntakeRequest{
		PhotosBase64: []string{minimalJPEG},
		JobID:        "integration-test-001",
	})
	if err != nil {
		t.Fatalf("AnalyzePhotos error: %v", err)
	}
	if result == nil {
		t.Fatal("result is nil")
	}

	// Confidence may be low for a minimal test image; log the fields so the
	// run shows what the model returned.
	t.Logf("IntakeResult: model=%q manufacturer=%q category=%q confidence=%.2f",
		result.Model, result.Manufacturer, result.Category, result.Confidence)
	t.Logf("AINotes: %s", result.AINotes)

	// An empty model string is acceptable for a 1x1 pixel image. The only hard
	// requirements are that the response parsed and confidence lies in [0, 1].
	if result.Confidence < 0 || result.Confidence > 1.0 {
		t.Errorf("confidence %.2f out of [0,1] range", result.Confidence)
	}
}

// minimalJPEGBase64 returns a base64-encoded minimal valid JPEG (1x1 white pixel).
// Source: https://github.com/nicowillis/pngheaders (1x1 JPEG)
func minimalJPEGBase64() string {
	// Static bytes for a reproducible test. The second DHT segment declares a
	// length of 0xb5 (181) bytes, so its value list must hold all 162 entries
	// of the standard AC luminance table, including 0x92.
	data := []byte{
		0xff, 0xd8, 0xff, 0xe0, 0x00, 0x10, 0x4a, 0x46, 0x49, 0x46, 0x00, 0x01,
		0x01, 0x00, 0x00, 0x01, 0x00, 0x01, 0x00, 0x00, 0xff, 0xdb, 0x00, 0x43,
		0x00, 0x08, 0x06, 0x06, 0x07, 0x06, 0x05, 0x08, 0x07, 0x07, 0x07, 0x09,
		0x09, 0x08, 0x0a, 0x0c, 0x14, 0x0d, 0x0c, 0x0b, 0x0b, 0x0c, 0x19, 0x12,
		0x13, 0x0f, 0x14, 0x1d, 0x1a, 0x1f, 0x1e, 0x1d, 0x1a, 0x1c, 0x1c, 0x20,
		0x24, 0x2e, 0x27, 0x20, 0x22, 0x2c, 0x23, 0x1c, 0x1c, 0x28, 0x37, 0x29,
		0x2c, 0x30, 0x31, 0x34, 0x34, 0x34, 0x1f, 0x27, 0x39, 0x3d, 0x38, 0x32,
		0x3c, 0x2e, 0x33, 0x34, 0x32, 0xff, 0xc0, 0x00, 0x0b, 0x08, 0x00, 0x01,
		0x00, 0x01, 0x01, 0x01, 0x11, 0x00, 0xff, 0xc4, 0x00, 0x1f, 0x00, 0x00,
		0x01, 0x05, 0x01, 0x01, 0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00,
		0x00, 0x00, 0x00, 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
		0x09, 0x0a, 0x0b, 0xff, 0xc4, 0x00, 0xb5, 0x10, 0x00, 0x02, 0x01, 0x03,
		0x03, 0x02, 0x04, 0x03, 0x05, 0x05, 0x04, 0x04, 0x00, 0x00, 0x01, 0x7d,
		0x01, 0x02, 0x03, 0x00, 0x04, 0x11, 0x05, 0x12, 0x21, 0x31, 0x41, 0x06,
		0x13, 0x51, 0x61, 0x07, 0x22, 0x71, 0x14, 0x32, 0x81, 0x91, 0xa1, 0x08,
		0x23, 0x42, 0xb1, 0xc1, 0x15, 0x52, 0xd1, 0xf0, 0x24, 0x33, 0x62, 0x72,
		0x82, 0x09, 0x0a, 0x16, 0x17, 0x18, 0x19, 0x1a, 0x25, 0x26, 0x27, 0x28,
		0x29, 0x2a, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x3a, 0x43, 0x44, 0x45,
		0x46, 0x47, 0x48, 0x49, 0x4a, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58, 0x59,
		0x5a, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68, 0x69, 0x6a, 0x73, 0x74, 0x75,
		0x76, 0x77, 0x78, 0x79, 0x7a, 0x83, 0x84, 0x85, 0x86, 0x87, 0x88, 0x89,
		0x8a, 0x92, 0x93, 0x94, 0x95, 0x96, 0x97, 0x98, 0x99, 0x9a, 0xa2, 0xa3,
		0xa4, 0xa5, 0xa6, 0xa7, 0xa8, 0xa9, 0xaa, 0xb2, 0xb3, 0xb4, 0xb5, 0xb6,
		0xb7, 0xb8, 0xb9, 0xba, 0xc2, 0xc3, 0xc4, 0xc5, 0xc6, 0xc7, 0xc8, 0xc9,
		0xca, 0xd2, 0xd3, 0xd4, 0xd5, 0xd6, 0xd7, 0xd8, 0xd9, 0xda, 0xe1, 0xe2,
		0xe3, 0xe4, 0xe5, 0xe6, 0xe7, 0xe8, 0xe9, 0xea, 0xf1, 0xf2, 0xf3, 0xf4,
		0xf5, 0xf6, 0xf7, 0xf8, 0xf9, 0xfa, 0xff, 0xda, 0x00, 0x08, 0x01, 0x01,
		0x00, 0x00, 0x3f, 0x00, 0xfb, 0xd2, 0x8a, 0x28, 0x03, 0xff, 0xd9,
	}
	return base64.StdEncoding.EncodeToString(data)
}
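
The test above builds its photo as a `data:image/jpeg;base64,` URL from static bytes. As a minimal sketch of how such a fixture could be sanity-checked before it is sent, the helper below verifies the JPEG start-of-image (FF D8) and end-of-image (FF D9) markers and then wraps the bytes in a data URL. The name `jpegDataURL` is hypothetical and not part of `internal/ai`, and the marker check is only a cheap guard, not a full JPEG validation.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"errors"
	"fmt"
)

// jpegDataURL is a hypothetical helper (an assumption, not project code).
// It checks that data starts with the SOI marker (FF D8) and ends with the
// EOI marker (FF D9), then returns the bytes as a base64 data URL.
func jpegDataURL(data []byte) (string, error) {
	soi := []byte{0xff, 0xd8}
	eoi := []byte{0xff, 0xd9}
	if len(data) < 4 || !bytes.HasPrefix(data, soi) || !bytes.HasSuffix(data, eoi) {
		return "", errors.New("not a JPEG: missing SOI or EOI marker")
	}
	return "data:image/jpeg;base64," + base64.StdEncoding.EncodeToString(data), nil
}

func main() {
	// Stand-in bytes containing only the two markers; a real fixture would
	// be a complete JPEG such as the one returned by minimalJPEGBase64.
	url, err := jpegDataURL([]byte{0xff, 0xd8, 0xff, 0xd9})
	fmt.Println(url, err)
}
```

A check like this would catch a truncated fixture early, but it would not detect a malformed segment inside the file.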
NOTE: Use build tag //go:build integration so the test is excluded from normal go test ./... runs. Integration tests only run when explicitly tagged: go test -tags integration ./internal/ai/...
This follows the skip-guard pattern established in Phase 1 and adds a build tag on top of the env guard, since oMLX is only available on the Mac Mini production machine.
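
The env-guard half of this pattern can be isolated into a small function, which makes the skip decision and the model default testable without a running oMLX. The following is a sketch under assumptions: `omlxTarget`, `omlxTargetFromEnv`, and `defaultOMLXModel` are hypothetical names, and only the two environment variables and the default model string come from the test above.

```go
package main

import (
	"fmt"
	"os"
)

// defaultOMLXModel mirrors the fallback used in the integration test.
const defaultOMLXModel = "gemma-4-e4b"

// omlxTarget is a hypothetical value type describing where the test would connect.
type omlxTarget struct {
	URL   string
	Model string
}

// omlxTargetFromEnv reads HWLAB_OMLX_URL and HWLAB_OMLX_MODEL through the
// injected getenv function. The boolean is false when the URL is unset, which
// is the case where a test would call t.Skip.
func omlxTargetFromEnv(getenv func(string) string) (omlxTarget, bool) {
	url := getenv("HWLAB_OMLX_URL")
	if url == "" {
		return omlxTarget{}, false
	}
	model := getenv("HWLAB_OMLX_MODEL")
	if model == "" {
		model = defaultOMLXModel
	}
	return omlxTarget{URL: url, Model: model}, true
}

func main() {
	target, ok := omlxTargetFromEnv(os.Getenv)
	if !ok {
		fmt.Println("HWLAB_OMLX_URL not set; the integration test would skip")
		return
	}
	fmt.Printf("would test %s with model %s\n", target.URL, target.Model)
}
```

Passing `getenv` as a parameter is a design choice for testability; production code would pass `os.Getenv` as shown in `main`.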
cd /home/mikkel/homelabby && go build ./... && go test ./internal/ai/... -v 2>&1 | tail -20
- go build ./... passes
- go test ./internal/ai/... -v (without -tags integration) shows integration test NOT included — only unit tests run
- internal/ai/omlx_integration_test.go exists with build tag integration
**Step 2: Test the intake endpoint with mock photos (binary running locally)**
```bash
# Terminal 1: start the server
cd /home/mikkel/homelabby
go run cmd/hwlab/main.go &
# Terminal 2: send a test intake request (any JPEG file will work)
curl -s -X POST http://localhost:8080/api/intake \
-F "photos=@/path/to/any-photo.jpg" | python3 -m json.tool
```
Expected:
- If oMLX is reachable: JSON response with hw_id, model, confidence, catalog_status
- If oMLX unreachable (expected on dev machine): 500 or 202 depending on tier client timeout
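
The same request can be built from Go, which is useful if the manual check is later scripted. This is a minimal sketch that assumes only what the curl command shows: a multipart POST to `/api/intake` with the file in a form field named `photos`. The function name `newIntakeRequest` is hypothetical, and the sketch stops short of sending the request or decoding the response, since the response types are not specified here.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/http"
)

// newIntakeRequest is a hypothetical helper that builds a multipart POST
// carrying one file in the "photos" form field, like the curl -F example.
func newIntakeRequest(url, filename string, photo []byte) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	part, err := w.CreateFormFile("photos", filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(photo); err != nil {
		return nil, err
	}
	// Close writes the trailing boundary; without it the body is incomplete.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, url, &body)
	if err != nil {
		return nil, err
	}
	// The Content-Type header must carry the boundary chosen by the writer.
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

func main() {
	req, err := newIntakeRequest("http://localhost:8080/api/intake", "photo.jpg", []byte{0xff, 0xd8, 0xff, 0xd9})
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

Sending it would be a matter of passing the request to an `http.Client` with a timeout and then branching on the status code, as the expectations above describe.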
**Step 3: On Mac Mini M4 — run the oMLX integration test**
```bash
# On Mac Mini: start oMLX
omlx serve --model gemma-4-e4b --port 8000
# Check memory: Activity Monitor → omlx process, note "Real Memory"
# Expected for E4B: ~8-10GB RAM
# Run integration test
cd /home/mikkel/homelabby
HWLAB_OMLX_URL=http://localhost:8000/v1 go test -tags integration ./internal/ai/... -run TestOMLX -v
```
Expected: PASS with logged IntakeResult fields (model may be empty for test pixel — that's OK).
**Step 4: Document memory measurement**
Record in the summary: "Gemma 4 E4B: X GB real memory on Mac Mini M4 16GB"
If > 12GB: note that 26B A4B is not feasible without TurboQuant KV offload.
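
The recording rule above is simple enough to express as a small function, which keeps the summary wording consistent across runs. This is a sketch only: `memorySummary` and the two constants are hypothetical names, while the 16GB total, the 12GB threshold, and the wording come from this step.

```go
package main

import "fmt"

const (
	totalGB     = 16.0 // Mac Mini M4 RAM, as stated in this plan
	thresholdGB = 12.0 // above this, the plan notes 26B A4B is not feasible
)

// memorySummary is a hypothetical helper that formats the summary line for a
// measured real-memory figure and appends the feasibility note when the
// measurement exceeds the threshold.
func memorySummary(measuredGB float64) string {
	line := fmt.Sprintf("Gemma 4 E4B: %.1f GB real memory on Mac Mini M4 %.0fGB", measuredGB, totalGB)
	if measuredGB > thresholdGB {
		line += "; 26B A4B is not feasible without TurboQuant KV offload"
	}
	return line
}

func main() {
	// 9.2 is the illustrative figure used in the approval example, not a measurement.
	fmt.Println(memorySummary(9.2))
}
```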
Type "approved" after verifying unit tests pass.
If oMLX test was run on Mac Mini, include memory measurement (e.g. "approved — E4B uses 9.2GB").
If Mac Mini not available yet, type "approved — oMLX test deferred, unit tests pass".
<threat_model>
Trust Boundaries
| Boundary | Description |
|---|---|
| integration test → oMLX | Test sends real data to local AI; only runs when explicitly triggered |
STRIDE Threat Register
| Threat ID | Category | Component | Disposition | Mitigation Plan |
|---|---|---|---|---|
| T-02-14 | Information Disclosure | omlx-setup.md | accept | Document contains model names and port numbers — no secrets; oMLX API key is "local" (not a real credential) |
| T-02-15 | Denial of Service | integration test resource usage | mitigate | Build tag integration ensures test never runs in standard CI pipeline; only runs manually with explicit env var |
</threat_model>
<success_criteria>
- All Phase 2 unit tests pass with zero failures
- oMLX integration test exists, skips gracefully when HWLAB_OMLX_URL not set
- Memory budget for Gemma 4 E4B documented (or deferred with note if Mac Mini not available)
- Phase 2 complete: POST /api/intake is end-to-end functional
</success_criteria>