---
phase: 40-job-infrastructure
verified: 2026-04-04T12:48:00Z
status: gaps_found
score: 9/10 must-haves verified
gaps:
  - truth: "GET /api/companies/:companyId/content-jobs/:jobId/events streams SSE events until terminal state then closes"
    status: partial
    reason: "SSE endpoint checks event.payload.status for terminal detection, but contentJobRunner publishes content_job.done with { jobId, assetId } and content_job.failed with { jobId, errorMessage }; neither payload contains a status field. The stream therefore never auto-closes when a running job completes; it stays open until the client disconnects (req.on('close')). The initial-status fast-path (for already-terminal jobs) works correctly."
    artifacts:
      - path: "server/src/services/content-job-runner.ts"
        issue: "publishLiveEvent payloads for content_job.done and content_job.failed omit 'status' field"
      - path: "server/src/routes/content-jobs.ts"
        issue: "Line 108: checks event.payload.status which is always undefined with current runner payloads"
    missing:
      - "Add status: 'done' to content_job.done payload in content-job-runner.ts"
      - "Add status: 'failed' to content_job.failed payload in content-job-runner.ts"
      - "OR change SSE terminal detection to use event.type === 'content_job.done' || event.type === 'content_job.failed'"
---
# Phase 40: Job Infrastructure Verification Report

**Phase Goal:** Every content generation request returns a job ID immediately, progresses through a tracked lifecycle, and stores its output in namespaced storage, so nothing blocks and nothing is orphaned

**Verified:** 2026-04-04T12:48:00Z

**Status:** gaps_found

**Re-verification:** No (initial verification)

## Goal Achievement
### Observable Truths
| # | Truth | Status | Evidence |
|---|-------|--------|----------|
| 1 | content_jobs table exists in DB with queued/running/done/failed lifecycle columns | VERIFIED | packages/db/src/schema/content_jobs.ts has all lifecycle columns; migration 0046_tense_randall.sql applies CREATE TABLE |
| 2 | assets table has a source_task_id column for conversation linkage | VERIFIED | packages/db/src/schema/assets.ts line 18: sourceTaskId: text("source_task_id"); same migration adds column |
| 3 | LIVE_EVENT_TYPES includes content_job.queued, content_job.running, content_job.done, content_job.failed | VERIFIED | packages/shared/src/constants.ts lines 334-337 confirm all four entries |
| 4 | MAX_GENERATED_ASSET_BYTES constant exists and defaults to 500MB | VERIFIED | server/src/attachment-types.ts lines 76-77: exports constant, defaults to 500 * 1024 * 1024 |
| 5 | contentJobStore service can create, get, list, and transition jobs | VERIFIED | server/src/services/content-job-store.ts exports all four methods backed by real Drizzle queries |
| 6 | contentJobRunner dispatches a job asynchronously without blocking, transitions through lifecycle, stores asset, publishes live events | VERIFIED | void runJob() pattern confirmed; transitions running→done/failed; putFile with namespace "generated"; assetService.create with sourceTaskId; publishLiveEvent called |
| 7 | POST /api/companies/:companyId/content-jobs returns 202 with jobId and status within 200ms | VERIFIED | Route handler returns res.status(202).json({ jobId, status, createdAt }); dispatch is fire-and-forget |
| 8 | GET /api/companies/:companyId/content-jobs/:jobId returns the job record with current status | VERIFIED | Route calls contentJobStore(db).getById, returns job or 404 |
| 9 | GET /api/companies/:companyId/content-jobs lists all jobs for a company ordered by createdAt desc | VERIFIED | Route calls listByCompany, Drizzle query uses orderBy(desc(contentJobs.createdAt)) |
| 10 | GET /api/companies/:companyId/content-jobs/:jobId/events streams SSE events until terminal state then closes | FAILED | Initial-status fast-path works for already-terminal jobs. For running jobs, SSE terminal detection checks event.payload.status but runner publishes { jobId, assetId } / { jobId, errorMessage } with no status field — stream will not auto-close on completion |

**Score:** 9/10 truths verified

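
The fire-and-forget dispatch behind truths 6 and 7 can be sketched as follows. This is a minimal in-memory illustration, not the code in `content-job-runner.ts`: the `Job` shape, the `jobs` map, `createJob`, and the `render` callback are all hypothetical stand-ins for the store, route handler, and renderer.

```typescript
type JobStatus = "queued" | "running" | "done" | "failed";

interface Job {
  id: string;
  status: JobStatus;
}

// Hypothetical in-memory stand-in for the content_jobs table.
const jobs = new Map<string, Job>();

// Runs the job to completion; errors are caught so the promise never rejects.
async function runJob(job: Job, render: () => Promise<void>): Promise<void> {
  job.status = "running";
  try {
    await render();
    job.status = "done";
  } catch (_err) {
    job.status = "failed";
  }
}

// Fire-and-forget: `void` discards the promise so the caller does not wait.
function dispatch(job: Job, render: () => Promise<void>): void {
  void runJob(job, render);
}

// Stand-in for the POST handler: snapshot the response, then dispatch.
function createJob(
  id: string,
  render: () => Promise<void>,
): { jobId: string; status: JobStatus } {
  const job: Job = { id, status: "queued" };
  jobs.set(id, job);
  const response = { jobId: job.id, status: job.status };
  dispatch(job, render);
  return response;
}

const response = createJob("job-1", async () => {});
console.log(response.status); // prints "queued"
```

The response is captured before `dispatch` runs, so the caller sees `queued` regardless of how quickly the job progresses afterwards.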
### Required Artifacts
| Artifact | Expected | Status | Details |
|----------|----------|--------|---------|
| `packages/db/src/schema/content_jobs.ts` | content_jobs table with lifecycle | VERIFIED | 33 lines, exports contentJobs pgTable + CONTENT_JOB_STATUSES, two indexes |
| `server/src/services/content-job-store.ts` | CRUD for content_jobs | VERIFIED | 37 lines, all four methods, real Drizzle queries |
| `server/src/services/content-job-runner.ts` | Async dispatcher with live events | VERIFIED | 88 lines, fire-and-forget dispatch, lifecycle transitions, asset storage, live events published |
| `server/src/routes/content-jobs.ts` | HTTP routes for job API | VERIFIED | 122 lines, all four endpoints implemented |
| `server/src/__tests__/content-jobs-routes.test.ts` | Route integration tests (min 50 lines) | VERIFIED | 190 lines, 8 tests covering 202/400/404/sourceTaskId |
| `server/src/__tests__/content-jobs-sse.test.ts` | SSE integration tests (min 30 lines) | VERIFIED | 139 lines, 5 tests covering content-type/initial event/terminal states |
### Key Link Verification
| From | To | Via | Status | Details |
|------|----|----|--------|---------|
| content-job-runner.ts | content-job-store.ts | store.transition() calls | WIRED | Lines 26, 57, 70 call store.transition() after await |
| content-job-runner.ts | live-events.ts | publishLiveEvent for content_job.* events | WIRED | Lines 27-31, 62-65, 74-78 publish running/done/failed events |
| content-job-runner.ts | assets.ts | assetService.create with sourceTaskId | WIRED | Lines 50-55: assetService(db).create with sourceTaskId: job.sourceTaskId |
| content-jobs.ts (route) | content-job-store.ts | contentJobStore(db) calls | WIRED | Lines 28, 51, 60, 82 all call contentJobStore(db) |
| content-jobs.ts (route) | content-job-runner.ts | contentJobRunner.dispatch() in POST handler | WIRED | Line 37: void contentJobRunner.dispatch(db, storage, job!) |
| content-jobs.ts (route) | live-events.ts | subscribeCompanyLiveEvents in SSE endpoint | WIRED | Line 102: subscribeCompanyLiveEvents() with req.on("close") cleanup |
| app.ts | content-jobs.ts | api.use mount | WIRED | Line 190: api.use(contentJobRoutes(db, opts.storageService)) |
### Data-Flow Trace (Level 4)
| Artifact | Data Variable | Source | Produces Real Data | Status |
|----------|---------------|--------|-------------------|--------|
| content-job-store.ts → create | row returned | db.insert(contentJobs).values().returning() | Yes — Drizzle INSERT with RETURNING | FLOWING |
| content-job-store.ts → getById | row or null | db.select().from(contentJobs).where(eq(id)) | Yes — Drizzle SELECT | FLOWING |
| content-job-store.ts → listByCompany | rows array | db.select().from(contentJobs).where().orderBy() | Yes — Drizzle SELECT | FLOWING |
| content-job-runner.ts → renderContent | buffer | Stub returning fixed Buffer.from("placeholder output") | Intentional stub — phases 41-45 add real renderers | STUB (documented, intentional) |
| content-jobs.ts (POST route) | job response | contentJobStore(db).create() | Yes — store backed by real Drizzle | FLOWING |
### Behavioral Spot-Checks
| Behavior | Command | Result | Status |
|----------|---------|--------|--------|
| All 13 integration tests pass | npx vitest run content-jobs-routes.test.ts content-jobs-sse.test.ts | 13/13 passed in 790ms | PASS |
| TypeScript compiles clean (db) | pnpm tsc --noEmit --project packages/db/tsconfig.json | Exit 0 | PASS |
| TypeScript compiles clean (shared) | pnpm tsc --noEmit --project packages/shared/tsconfig.json | Exit 0 | PASS |
| TypeScript compiles clean (server) | pnpm tsc --noEmit --project server/tsconfig.json | Exit 0 | PASS |
| Migration file contains content_jobs DDL | grep content_jobs migrations/*.sql | Found in 0046_tense_randall.sql | PASS |
| SSE stream auto-close for live jobs | (code inspection) | event.payload.status never set by runner | FAIL |
### Requirements Coverage
| Requirement | Source Plan | Description | Status | Evidence |
|-------------|------------|-------------|--------|----------|
| INFRA-01 | 40-01, 40-02 | System processes content generation jobs asynchronously with queued → running → done/failed lifecycle | SATISFIED | contentJobRunner.dispatch fires void runJob; transitions queued→running→done/failed; POST returns 202; tests verify 202/queued |
| INFRA-02 | 40-02 | System pushes job progress updates via SSE to connected clients | PARTIAL | SSE endpoint exists and streams initial status + live events. However, SSE does not auto-close for live jobs because runner payloads lack status field — stream hangs open until client disconnect |
| INFRA-03 | 40-01, 40-02 | Generated content stored in namespaced storage without size restrictions blocking video/images | SATISFIED | storage.putFile called with namespace: "generated"; MAX_GENERATED_ASSET_BYTES = 500MB; no blocking — async dispatch |
| INFRA-04 | 40-01, 40-02 | All generated content tracked in database with source conversation linkage | SATISFIED | assets.sourceTaskId column added; runner passes sourceTaskId: job.sourceTaskId to assetService.create; test verifies sourceTaskId persistence via POST then GET |
### Anti-Patterns Found
| File | Line | Pattern | Severity | Impact |
|------|------|---------|----------|--------|
| server/src/services/content-job-runner.ts | 11-21 | renderContent stub returning hardcoded buffer | Info | Intentional — documented in SUMMARY as placeholder for phases 41-45 |
| server/src/routes/content-jobs.ts | 108 | event.payload.status terminal check against payloads that never include status | Blocker | SSE stream for live jobs never auto-closes on completion; the connection stays open until the client disconnects |
### Human Verification Required
None required — all behavioral checks were resolved programmatically.
## Gaps Summary
One gap blocks complete goal achievement:

**SSE stream does not auto-close when a running job reaches terminal state.** The `contentJobRunner` publishes `content_job.done` with `{ jobId, assetId }` and `content_job.failed` with `{ jobId, errorMessage }`. Neither payload contains a `status` field. The SSE route's terminal detection at line 108 reads `event.payload.status` and compares it to `"done"` or `"failed"`. Because the value is always `undefined`, the subscriber never calls `unsubscribe()` or `res.end()`, and the stream stays open until the client disconnects.

The initial-status fast-path works correctly: if a job is already terminal when the SSE connection opens, the stream closes immediately after sending the first event. Only the live-event-driven close path is broken.

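
The mismatch can be reproduced in isolation. The sketch below assumes an event shape of `{ type, payload }` and a helper named `isTerminalByPayloadStatus`; both are hypothetical and only mirror the check and payloads described in this report.

```typescript
// Hypothetical event shape, reconstructed from the payloads described above.
interface LiveEvent {
  type: string;
  payload: Record<string, unknown>;
}

// Mirrors the described check: terminal state is read from payload.status.
function isTerminalByPayloadStatus(event: LiveEvent): boolean {
  const status = event.payload.status;
  return status === "done" || status === "failed";
}

const doneEvent: LiveEvent = {
  type: "content_job.done",
  payload: { jobId: "job-1", assetId: "asset-1" }, // no status field
};

const failedEvent: LiveEvent = {
  type: "content_job.failed",
  payload: { jobId: "job-1", errorMessage: "render failed" }, // no status field
};

const doneDetected = isTerminalByPayloadStatus(doneEvent);
const failedDetected = isTerminalByPayloadStatus(failedEvent);
console.log(doneDetected, failedDetected); // prints "false false"
```

Both terminal events are classified as non-terminal, which is why the close path never runs.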
**Fix options (either resolves the gap):**
1. In `content-job-runner.ts`, add `status: "done"` to the `content_job.done` payload and `status: "failed"` to the `content_job.failed` payload.
2. In `content-jobs.ts` (SSE route), replace the payload status check with `event.type === "content_job.done" || event.type === "content_job.failed"`.
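
A minimal sketch of option 2, assuming the subscriber receives events shaped as `{ type, payload }`. The names `ContentJobEvent`, `isTerminalEvent`, and `makeSseSubscriber` are hypothetical; in the real route, `close` would correspond to calling `unsubscribe()` and `res.end()`.

```typescript
// Event shape assumed for illustration; the real live-event type may differ.
interface ContentJobEvent {
  type: string;
  payload: Record<string, unknown>;
}

const TERMINAL_EVENT_TYPES: ReadonlySet<string> = new Set([
  "content_job.done",
  "content_job.failed",
]);

// Decide terminality from the event type rather than from the payload.
function isTerminalEvent(event: ContentJobEvent): boolean {
  return TERMINAL_EVENT_TYPES.has(event.type);
}

// Hypothetical subscriber factory: `send` writes one SSE frame and `close`
// tears the stream down after the first terminal event.
function makeSseSubscriber(
  send: (event: ContentJobEvent) => void,
  close: () => void,
): (event: ContentJobEvent) => void {
  let closed = false;
  return (event) => {
    if (closed) return; // ignore anything that arrives after the terminal event
    send(event);
    if (isTerminalEvent(event)) {
      closed = true;
      close();
    }
  };
}

// Usage with in-memory stand-ins for the response stream.
const sent: string[] = [];
let closeCalls = 0;
const subscriber = makeSseSubscriber(
  (event) => sent.push(event.type),
  () => {
    closeCalls += 1;
  },
);
subscriber({ type: "content_job.running", payload: { jobId: "job-1" } });
subscriber({ type: "content_job.done", payload: { jobId: "job-1", assetId: "asset-1" } });
subscriber({ type: "content_job.running", payload: { jobId: "job-1" } }); // dropped
```

Keying on the event type keeps the runner payloads unchanged, whereas option 1 leaves the route untouched and changes what the runner publishes.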

The `renderContent` stub is not a gap: it is intentionally deferred to phases 41-45 and documented as such.

---

_Verified: 2026-04-04T12:48:00Z_
_Verifier: Claude (gsd-verifier)_