---
phase: 07-research-agent-search
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - internal/config/config.go
  - internal/netbox/client.go
  - internal/research/searxng.go
  - internal/research/agent.go
  - internal/ai/research.go
  - cmd/hwlab/main.go
autonomous: true
requirements:
  - AI-04
must_haves:
  truths:
    - "A SearXNG HTTP GET to http://10.5.0.129:8080/search?q=...&format=json returns parsed results"
    - "Items with catalog_status=needs_research are polled from NetBox every 10 minutes"
    - "Each needs_research item is enriched by SearXNG + Tier 2 LLM and updated to catalog_status=researched in NetBox"
    - "POST /api/research/trigger fires an immediate research cycle (does not wait for the 10-min ticker)"
  artifacts:
    - path: "internal/research/searxng.go"
      provides: "SearXNGClient implementing ai.ResearchClient"
      exports: ["SearXNGClient", "NewSearXNGClient"]
    - path: "internal/research/agent.go"
      provides: "ResearchAgent background goroutine"
      exports: ["Agent", "NewAgent", "RunOnce", "Start"]
  key_links:
    - from: "internal/research/searxng.go"
      to: "http://10.5.0.129:8080/search"
      via: "net/http GET with q and format=json query params"
      pattern: "http\\.Get.*search.*format=json"
    - from: "internal/research/agent.go"
      to: "internal/netbox/client.go"
      via: "ListDevicesWithStatus(ctx, \"needs_research\")"
      pattern: "ListDevicesWithStatus"
    - from: "internal/research/agent.go"
      to: "internal/ai/client.go"
      via: "tier2.AnalyzePhotos (text-only prompt, no photos)"
      pattern: "AnalyzePhotos"
---

Build the real SearXNG research client and the ResearchAgent background worker that closes the AI-04 research loop: items at needs_research are enriched automatically.

Purpose: Replace the Phase 2 NoOpResearchClient stub and deliver the automated enrichment cycle that advances items from needs_research to researched in NetBox.
Output:
- internal/research/searxng.go: real HTTP client implementing ai.ResearchClient
- internal/research/agent.go: background worker with ticker + on-demand trigger
- Config additions for SearXNG URL
- main.go goroutine start + POST /api/research/trigger handler

@/home/mikkel/.claude/get-shit-done/workflows/execute-plan.md
@/home/mikkel/.claude/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@internal/ai/research.go
@internal/ai/client.go
@internal/ai/orchestrator.go
@internal/netbox/client.go
@internal/netbox/custom_fields.go
@internal/netbox/types.go
@internal/inventory/catalog_updater.go
@internal/config/config.go
@cmd/hwlab/main.go
@internal/api/router.go

From internal/ai/research.go:
```go
type SearchResult struct {
	Title   string
	URL     string
	Snippet string
}

type ResearchClient interface {
	Search(ctx context.Context, query string) ([]SearchResult, error)
}

type NoOpResearchClient struct{} // Replace this with SearXNGClient in this plan.
```

From internal/ai/client.go:
```go
type AIClient interface {
	AnalyzePhotos(ctx context.Context, req IntakeRequest) (*IntakeResult, error)
}

// IntakeRequest.PhotosBase64 may be empty: the Tier 2 model accepts text-only
// input if the prompt is placed in a separate system message. Use a text-only
// prompt for research enrichment (no photos).
```

From internal/netbox/client.go (method to ADD):
```go
// ListDevicesWithStatus returns devices whose catalog_status custom field equals status.
// Use status="needs_research" to find items needing enrichment.
func (c *Client) ListDevicesWithStatus(ctx context.Context, status string) ([]Device, error)
```

From internal/inventory/catalog_updater.go:
```go
func (u *CatalogUpdater) UpdateCatalogStatus(ctx context.Context, deviceID int64, current, next CatalogStatus) (CatalogStatus, error)
```

From internal/inventory (quality_gate.go constants):
```go
const StatusNeedsResearch CatalogStatus = "needs_research"
const StatusResearched CatalogStatus = "researched"
```

From internal/config/config.go (field to ADD):
```go
SearXNGURL string `mapstructure:"searxng_url"`
// default: "http://10.5.0.129:8080"
// env: HWLAB_SEARXNG_URL
```

Task 1: SearXNG client + netbox.ListDevicesWithStatus

internal/research/searxng.go, internal/research/searxng_test.go, internal/netbox/client.go, internal/config/config.go

- SearXNGClient.Search(ctx, "Intel NIC i350") sends GET http://10.5.0.129:8080/search?q=Intel+NIC+i350&format=json
- HTTP 200 with JSON body {"results":[{"title":"...","url":"...","content":"..."},...]} parses into []ai.SearchResult (map content->Snippet)
- HTTP non-200 returns an error that includes the status code
- Empty results array returns an empty slice, no error
- Query is URL-encoded (url.QueryEscape or url.Values)
- ListDevicesWithStatus filters via custom_fields cf_catalog_status in the go-netbox list call; falls back to a client-side filter if the API param is unavailable
- ListDevicesWithStatus("needs_research") returns only devices with that catalog_status

Create package internal/research.
internal/research/searxng.go:
- Struct SearXNGClient with baseURL string and httpClient *http.Client (timeout 15s)
- NewSearXNGClient(baseURL string) *SearXNGClient; if baseURL is empty, use "http://10.5.0.129:8080"
- Implements the ai.ResearchClient interface
- Search method: build GET {baseURL}/search?q={url-encoded query}&format=json, execute, decode JSON
- SearXNG JSON response shape: {"results":[{"title":"","url":"","content":""},...]}. Map the content field to SearchResult.Snippet (SearXNG uses "content", not "snippet")
- Return ([]ai.SearchResult, error). Never panic on empty results.

internal/research/searxng_test.go:
- Use httptest.NewServer to mock SearXNG responses
- Test: valid response parses correctly (2 results)
- Test: HTTP 500 returns error
- Test: empty results returns empty slice

internal/netbox/client.go, add ListDevicesWithStatus:
- List all devices (up to 200), filter client-side where CustomFields.CatalogStatus == status
- (go-netbox v4 custom field filtering via query param is schema-dependent; client-side is safer)

internal/config/config.go, add SearXNGURL:
- Field: SearXNGURL string `mapstructure:"searxng_url"`
- Default: v.SetDefault("searxng_url", "http://10.5.0.129:8080")
- Env binding: v.BindEnv("searxng_url", "HWLAB_SEARXNG_URL")

`cd /home/mikkel/homelabby && go test ./internal/research/... ./internal/config/... -v -count=1 -run TestSearXNG 2>&1 | tail -20`

SearXNGClient implements ai.ResearchClient. Tests pass with httptest mock server. ListDevicesWithStatus added to netbox.Client. Config loads SearXNGURL with default.
Task 2: ResearchAgent worker + main.go wiring + trigger endpoint

internal/research/agent.go, internal/research/agent_test.go, internal/api/handlers/research.go, internal/api/router.go, cmd/hwlab/main.go

- Agent.RunOnce(ctx) polls NetBox for needs_research items; for each item it builds a text-only search query from the item Name, calls SearXNGClient.Search, sends the results to the Tier 2 LLM with a research prompt, patches NetBox custom fields (ai_notes, product_url from first result URL), and transitions the status to researched via CatalogUpdater
- Agent.Start(ctx, interval) runs RunOnce on a ticker; logs "research agent: cycle complete, enriched N items"
- If SearXNG returns 0 results for an item, log a warning and skip it (do not change status)
- Tier 2 LLM research prompt: "You are enriching a hardware inventory record. Item: {name}. Search results: {formatted snippets}. Return JSON: {\"ai_notes\": \"...\", \"product_url\": \"...\"}"
- POST /api/research/trigger responds 202 Accepted and fires RunOnce in a goroutine (non-blocking)
- Query sanitization: strip characters outside [a-zA-Z0-9 .\-_] before passing to SearXNG (the hyphen must be escaped, otherwise .-_ is read as a character range)

internal/research/agent.go:
- Struct Agent with fields: nbClient *netbox.Client, researchClient ai.ResearchClient, tier2 ai.AIClient, updater *inventory.CatalogUpdater
- NewAgent(nb *netbox.Client, rc ai.ResearchClient, tier2 ai.AIClient, updater *inventory.CatalogUpdater) *Agent
- sanitizeQuery(s string) string: regexp [^a-zA-Z0-9 .\-_]+ replaced with a space, then strings.TrimSpace
- RunOnce(ctx context.Context) (enriched int, err error):
  1. ListDevicesWithStatus(ctx, "needs_research")
  2. For each device:
     a. query = sanitizeQuery(device.Name)
     b. results = researchClient.Search(ctx, query); skip if 0 results
     c. Build text prompt with top 3 results (title + snippet)
     d. tier2.AnalyzePhotos(ctx, IntakeRequest{PhotosBase64: nil, SystemPrompt: researchPrompt})
        NOTE: IntakeRequest may not have a SystemPrompt field. Check the IntakeRequest fields; if there is no SystemPrompt, set PhotosBase64 to nil and pass the assembled prompt as a single text-only message in whatever form the TierClient accepts.
        ALTERNATIVE if IntakeRequest does not support text-only requests: use go-openai directly via a new TierClient method, TextComplete(ctx, prompt) (string, error), that posts a simple text ChatCompletion (no images). Prefer this approach for clarity.
     e. Parse response for ai_notes and product_url
     f. Patch NetBox: PatchCustomFields with ai_notes + product_url (if non-empty)
     g. UpdateCatalogStatus(ctx, id, StatusNeedsResearch, StatusResearched)
     h. enriched++
  3. Return enriched count
- Start(ctx context.Context, interval time.Duration):
  log.Printf("research agent: starting, interval=%v", interval)
  RunOnce immediately, then ticker loop until ctx.Done()

For the text-only LLM call, add TextComplete to TierClient in internal/ai/client.go:
```go
func (c *TierClient) TextComplete(ctx context.Context, prompt string) (string, error)
```
This does a simple non-vision ChatCompletion with a single user message. Agent uses this.
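
The sanitizeQuery rule described above can be sketched as below. This is an illustrative version assuming the escaped-hyphen pattern given for agent.go; the package-level variable name `disallowed` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// disallowed matches runs of characters outside the allow-list: a-z, A-Z, 0-9,
// space, dot, hyphen, and underscore. The hyphen is escaped because an
// unescaped ".-_" inside a character class is read as the range '.' through
// '_', which also admits characters such as '/', ';', '<', and '?'.
var disallowed = regexp.MustCompile(`[^a-zA-Z0-9 .\-_]+`)

// sanitizeQuery replaces each disallowed run with a single space and trims
// leading and trailing whitespace.
func sanitizeQuery(s string) string {
	return strings.TrimSpace(disallowed.ReplaceAllString(s, " "))
}

func main() {
	inputs := []string{
		"Intel NIC i350-T4",
		"PSU_750W v1.2",
		"cable; rm -rf /",
		"<script>alert(1)</script>",
	}
	for _, in := range inputs {
		fmt.Printf("%q -> %q\n", in, sanitizeQuery(in))
	}
}
```

Interior whitespace is not collapsed, so an input with a stripped character next to a space yields a double space; that should be harmless once the query is URL-encoded, but a unit test with adversarial input (per T-07-01) is the place to pin the exact behavior.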
internal/research/agent_test.go:
- Mock ResearchClient returning 2 fake SearchResults
- Mock AIClient (use existing MockAIClient pattern if available, else minimal struct)
- Mock NetBox (or use a stub struct); test that RunOnce returns enriched=1 for a fake device
- Test that sanitizeQuery strips special chars

internal/api/handlers/research.go:
- ResearchHandler struct with agent *research.Agent
- NewResearchHandler(agent *research.Agent) *ResearchHandler
- TriggerResearch(w http.ResponseWriter, r *http.Request):
    go func() { h.agent.RunOnce(context.Background()) }()
    w.Header().Set("Content-Type", "application/json")
    w.WriteHeader(http.StatusAccepted)
    json.NewEncoder(w).Encode(map[string]string{"status": "accepted"})

internal/api/router.go:
- Add researchHandler *handlers.ResearchHandler parameter to the NewRouter signature
- Add r.Post("/research/trigger", researchHandler.TriggerResearch) inside r.Route("/api", ...)
- If researchHandler is nil, register an unavailable handler (same pattern as advisorHandler)

cmd/hwlab/main.go:
- Import internal/research
- After config load: searxngClient := research.NewSearXNGClient(cfg.SearXNGURL)
- researchAgent := research.NewAgent(nbClient, searxngClient, tier2, catalogUpdater)
- go researchAgent.Start(ctx, 10*time.Minute)
- researchHandler := handlers.NewResearchHandler(researchAgent)
- Pass researchHandler to api.NewRouter(...)

`cd /home/mikkel/homelabby && go build ./... && go test ./internal/research/... -v -count=1 2>&1 | tail -30`

go build passes. Agent tests pass. POST /api/research/trigger wired in router. Research agent goroutine starts on server launch with 10-minute interval.
## Trust Boundaries

| Boundary | Description |
|----------|-------------|
| agent → SearXNG | AI-generated query text leaves the process and reaches the search engine |
| SearXNG → agent | External search results (HTML snippets) enter the process and are forwarded to LLM |
| trigger endpoint → agent | HTTP request from frontend triggers a research cycle |

## STRIDE Threat Register

| Threat ID | Category | Component | Disposition | Mitigation Plan |
|-----------|----------|-----------|-------------|-----------------|
| T-07-01 | Tampering | sanitizeQuery | mitigate | Strip [^a-zA-Z0-9 .\-_]+ before dispatch; test with adversarial input in unit test |
| T-07-02 | Information Disclosure | SearXNG response snippets | accept | SearXNG is self-hosted LAN service; snippets never stored, only passed to LLM |
| T-07-03 | Denial of Service | POST /api/research/trigger | mitigate | Trigger fires goroutine but RunOnce is bounded per item; no queuing needed for MVP rate |
| T-07-04 | Spoofing | SearXNG base URL in config | accept | LAN-only service at fixed IP; no auth required by design |

1. `go build ./...` passes with no errors
2. `go test ./internal/research/...` all pass
3. SearXNG integration (manual): `curl "http://10.5.0.129:8080/search?q=Intel+i350&format=json"` returns JSON
4. Trigger endpoint: `curl -X POST http://localhost:8080/api/research/trigger` returns 202
5. Log line "research agent: starting, interval=10m0s" appears on server start

- SearXNGClient.Search returns parsed []ai.SearchResult from live SearXNG instance
- ResearchAgent.RunOnce enriches needs_research items end-to-end: search → LLM → NetBox patch → status transition
- Research cycle runs every 10 minutes automatically and on demand via POST /api/research/trigger
- All queries sanitized before SearXNG dispatch
- go build clean, all new tests pass

After completion, create `.planning/phases/07-research-agent-search/07-01-SUMMARY.md`