# Phase 40: Job Infrastructure - Research
**Researched:** 2026-04-04
**Domain:** Async job lifecycle, SSE streaming, storage namespacing, asset tracking
**Confidence:** HIGH
---
<user_constraints>
## User Constraints (from CONTEXT.md)
### Locked Decisions
All implementation choices are at Claude's discretion — discuss phase was skipped per user setting. Use ROADMAP phase goal, success criteria, and codebase conventions to guide decisions.
### Claude's Discretion
All implementation choices are at Claude's discretion.
### Deferred Ideas (OUT OF SCOPE)
None — discuss phase skipped.
</user_constraints>
---
<phase_requirements>
## Phase Requirements
| ID | Description | Research Support |
|----|-------------|------------------|
| INFRA-01 | System processes content generation jobs asynchronously with queued → running → done/failed lifecycle | New `content_jobs` table + in-process job runner service; pattern mirrors `heartbeat_runs` / `plugin_job_runs` already in codebase |
| INFRA-02 | System pushes job progress updates via SSE to connected clients | Existing `live-events.ts` EventEmitter + `LIVE_EVENT_TYPES` const; add new event types and a new SSE endpoint scoped to jobs |
| INFRA-03 | Generated content stored in namespaced storage without size restrictions blocking video/images | `StorageService.putFile()` already has no server-side byte limit; add `generated/` namespace + `MAX_GENERATED_ASSET_BYTES` constant bypassing the upload-route multer limit |
| INFRA-04 | All generated content tracked in database with source conversation linkage | Extend `assets` table with a nullable `source_task_id` text column (no FK; task IDs are string identifiers); `assetService.create()` gains optional `sourceTaskId` parameter |
</phase_requirements>
---
## Summary
Phase 40 builds the async foundation that every subsequent content generation phase depends on. The codebase already has two well-established job-tracking patterns (`heartbeat_runs`, `plugin_job_runs`) and a working SSE streaming pattern in three routes (`voice.ts`, `puter-proxy.ts`, `plugins.ts`). The work here is additive: a new `content_jobs` DB table, a minimal in-process job runner, new live-event types for SSE progress, a `generated/` storage namespace with its own size constant, and a `source_task_id` column on `assets`.
No external queue infrastructure (Redis, BullMQ) is needed. The project is single-user and local-first. An in-process async runner (fire-and-forget `void Promise`) with EventEmitter fan-out, matching the heartbeat pattern, is the correct approach. If a job crashes, the process restarts clean; orphans are handled by recording `source_task_id` on every job and asset, so consumers can audit what was left behind.
**Primary recommendation:** Mirror the `heartbeat_runs` table/service pattern for the `content_jobs` table. Use the existing `publishLiveEvent` function with new `content_job.*` event types for SSE. Add `MAX_GENERATED_ASSET_BYTES` as a server-only constant and a `generated/` namespace prefix. Add `source_task_id` to `assets` via a migration.
---
## Standard Stack
### Core
| Library | Version (verified) | Purpose | Why Standard |
|---------|-------------------|---------|--------------|
| drizzle-orm | 0.38.4 (pkg) | Schema definition + query builder | Already used throughout; schema in `packages/db/src/schema/` |
| postgres (postgres.js) | project dep | DB connection | Already used via `createDb()` in `packages/db/src/client.ts` |
| express | project dep | HTTP layer for new routes | All routes are Express; follows existing pattern |
| Node.js EventEmitter | built-in | In-process pub/sub for SSE fan-out | Already used in `live-events.ts`; no extra dep |
### Supporting
| Library | Version | Purpose | When to Use |
|---------|---------|---------|-------------|
| drizzle-kit | 0.31.9 (pkg) | Migration generation | Run `pnpm db:generate` after schema change |
| vitest | project dep | Unit + integration tests | All server tests use vitest; pattern in `server/src/__tests__/` |
| supertest | project dep | HTTP route tests | Used in `assets.test.ts`, `chat-routes.test.ts`, etc. |
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| In-process EventEmitter runner | BullMQ / Redis | Redis adds infra complexity; single-user single-process — EventEmitter is correct here |
| In-process EventEmitter runner | Worker threads | Unnecessary isolation; jobs are I/O-bound (renders call child processes) |
| Custom SSE endpoint | WebSocket upgrade | SSE is simpler for one-way server → client push; WebSocket is already used for live events via `live-events-ws.ts`, so keep SSE for job progress per the existing voice/puter patterns |
**Installation:** No new packages required. All tooling already present.
---
## Architecture Patterns
### Recommended Project Structure
```
packages/db/src/schema/
├── content_jobs.ts # NEW: content_jobs table definition
├── assets.ts # MODIFY: add source_task_id column
├── index.ts # MODIFY: export content_jobs
packages/db/src/migrations/
├── 0056_create_content_jobs.sql # NEW: generated via drizzle-kit
├── 0057_assets_source_task_id.sql # NEW: generated via drizzle-kit
packages/shared/src/
├── constants.ts # MODIFY: add content_job.* to LIVE_EVENT_TYPES
# add CONTENT_JOB_STATUSES constant
server/src/
├── services/
│ ├── content-job-store.ts # NEW: DB CRUD for content_jobs
│ ├── content-job-runner.ts # NEW: async executor + live-event publisher
│ └── index.ts # MODIFY: export new services
├── routes/
│ ├── content-jobs.ts # NEW: POST /companies/:id/content-jobs
│ │ # GET /companies/:id/content-jobs/:jobId
│ │ # GET /companies/:id/content-jobs/:jobId/events (SSE)
│ └── index.ts # MODIFY: mount content-job routes
├── app.ts # MODIFY: register routes
```
### Pattern 1: content_jobs Table (mirrors heartbeat_runs / plugin_job_runs)
**What:** A persisted lifecycle table for async content generation requests.
**When to use:** Any content generation work that may take >200ms.
```typescript
// packages/db/src/schema/content_jobs.ts
// Pattern source: packages/db/src/schema/heartbeat_runs.ts + plugin_jobs.ts
import { pgTable, uuid, text, timestamp, jsonb, index } from "drizzle-orm/pg-core";
import { companies } from "./companies.js";

// Status lifecycle: queued → running → done | failed
export const CONTENT_JOB_STATUSES = ["queued", "running", "done", "failed"] as const;
export type ContentJobStatus = (typeof CONTENT_JOB_STATUSES)[number];

export const contentJobs = pgTable(
  "content_jobs",
  {
    id: uuid("id").primaryKey().defaultRandom(),
    companyId: uuid("company_id").notNull().references(() => companies.id),
    jobType: text("job_type").notNull(), // e.g. "diagram", "theme", "video"
    status: text("status").$type<ContentJobStatus>().notNull().default("queued"),
    input: jsonb("input").notNull().default({}), // renderer-specific params
    resultAssetId: uuid("result_asset_id"), // populated on done
    errorMessage: text("error_message"), // populated on failed
    sourceTaskId: text("source_task_id"), // conversation task linkage (INFRA-04)
    startedAt: timestamp("started_at", { withTimezone: true }),
    finishedAt: timestamp("finished_at", { withTimezone: true }),
    createdAt: timestamp("created_at", { withTimezone: true }).notNull().defaultNow(),
    updatedAt: timestamp("updated_at", { withTimezone: true }).notNull().defaultNow(),
  },
  (table) => ({
    companyStatusIdx: index("content_jobs_company_status_idx").on(table.companyId, table.status),
    companyCreatedIdx: index("content_jobs_company_created_idx").on(table.companyId, table.createdAt),
  }),
);
```
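
The `queued → running → done | failed` lifecycle above could also be enforced in application code with a small transition table. This is a minimal sketch, assuming the runner validates a transition before writing it; `ALLOWED_TRANSITIONS`, `canTransition`, and `isTerminal` are hypothetical names that do not exist in the codebase.

```typescript
// Sketch only: ALLOWED_TRANSITIONS, canTransition and isTerminal are hypothetical helpers.
const CONTENT_JOB_STATUSES = ["queued", "running", "done", "failed"] as const;
type ContentJobStatus = (typeof CONTENT_JOB_STATUSES)[number];

// Each status maps to the statuses it may move to; terminal states map to none.
const ALLOWED_TRANSITIONS: Record<ContentJobStatus, readonly ContentJobStatus[]> = {
  queued: ["running"],
  running: ["done", "failed"],
  done: [],
  failed: [],
};

// True when moving from `from` to `to` follows the documented lifecycle.
function canTransition(from: ContentJobStatus, to: ContentJobStatus): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// True for statuses that have no outgoing transition (done, failed).
function isTerminal(status: ContentJobStatus): boolean {
  return ALLOWED_TRANSITIONS[status].length === 0;
}
```

A runner could call `canTransition(job.status, "running")` before its first update and skip jobs that are already terminal.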
### Pattern 2: HTTP 202 + Job ID Response
**What:** POST to submit a job returns immediately with jobId.
**When to use:** All content generation submissions.
```typescript
// server/src/routes/content-jobs.ts
router.post("/companies/:companyId/content-jobs", async (req, res) => {
  const companyId = req.params.companyId;
  assertCompanyAccess(req, companyId);
  const job = await contentJobStore(db).create(companyId, {
    jobType: req.body.jobType,
    input: req.body.input ?? {},
    sourceTaskId: req.body.sourceTaskId ?? null,
  });
  // Fire and forget: never await the runner here
  void contentJobRunner.dispatch(db, storage, job);
  res.status(202).json({ jobId: job.id, status: job.status });
});
```
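
The route above reads `jobType`, `input`, and `sourceTaskId` straight from `req.body`. The following is a minimal sketch of normalizing that body before the job row is created. It assumes no validation layer is already in place, which the source does not state; `parseCreateJobBody` and `CreateJobBody` are hypothetical names.

```typescript
// Sketch only: parseCreateJobBody and CreateJobBody are hypothetical, not part of the codebase.
interface CreateJobBody {
  jobType: string;
  input: Record<string, unknown>;
  sourceTaskId: string | null;
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a normalized body, or null when the request should be rejected (e.g. with 400).
function parseCreateJobBody(body: unknown): CreateJobBody | null {
  if (!isPlainObject(body)) return null;

  const jobType = body.jobType;
  if (typeof jobType !== "string" || jobType.length === 0) return null;

  // A missing input defaults to {}, matching `req.body.input ?? {}` in the route.
  const input = body.input === undefined ? {} : body.input;
  if (!isPlainObject(input)) return null;

  // A missing or null sourceTaskId is stored as null; anything else must be a string.
  const rawTaskId = body.sourceTaskId;
  let sourceTaskId: string | null = null;
  if (typeof rawTaskId === "string") sourceTaskId = rawTaskId;
  else if (rawTaskId !== undefined && rawTaskId !== null) return null;

  return { jobType, input, sourceTaskId };
}
```

The handler would then create the job only when the result is non-null, keeping the 202 path free of malformed rows.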
### Pattern 3: SSE Job Progress (mirrors voice.ts pattern)
**What:** GET endpoint that holds connection open and pushes events until terminal state.
**When to use:** The browser follows job progress over one open connection instead of polling.
```typescript
// server/src/routes/content-jobs.ts
router.get("/companies/:companyId/content-jobs/:jobId/events", async (req, res) => {
  const { companyId, jobId } = req.params;
  assertCompanyAccess(req, companyId);

  // Resolve the job before opening the stream so an unknown job gets a plain 404
  // (error response shape is an assumption)
  const job = await contentJobStore(db).getById(jobId);
  if (!job || job.companyId !== companyId) {
    res.status(404).json({ error: "Content job not found" });
    return;
  }

  res.setHeader("Content-Type", "text/event-stream");
  res.setHeader("Cache-Control", "no-cache");
  res.setHeader("Connection", "keep-alive");
  res.flushHeaders();

  const sendEvent = (type: string, data: unknown) => {
    res.write(`event: ${type}\ndata: ${JSON.stringify(data)}\n\n`);
  };

  // Send current state immediately
  sendEvent("status", { jobId, status: job.status });
  if (job.status === "done" || job.status === "failed") {
    res.end();
    return;
  }

  // Subscribe to live events for this job. The runner publishes
  // content_job.<status> types (Pattern 4), so the status is the type suffix.
  const unsubscribe = subscribeCompanyLiveEvents(companyId, (event) => {
    if (!event.type.startsWith("content_job.") || event.payload.jobId !== jobId) return;
    const status = event.type.slice("content_job.".length);
    sendEvent("status", { ...event.payload, status });
    if (status === "done" || status === "failed") {
      unsubscribe();
      res.end();
    }
  });

  req.on("close", () => {
    unsubscribe();
  });
});
```
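
The wire format written by `sendEvent` above can be isolated as a pure function, which makes it easy to unit-test without opening a connection. This is a minimal sketch; `formatSseFrame` and `parseSseFrame` are hypothetical helper names, and the parser handles only the `event:` and `data:` fields used here, not the full SSE grammar.

```typescript
// Sketch only: formatSseFrame and parseSseFrame are hypothetical helper names.
interface SseFrame {
  event: string;
  data: unknown;
}

// Serialize one event the way sendEvent does: an "event:" line, a "data:" line, a blank line.
function formatSseFrame(event: string, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Parse a single frame back into its parts; returns null for frames without a data line
// (for example comment-only keep-alive frames).
function parseSseFrame(frame: string): SseFrame | null {
  let event = "message"; // SSE default when no "event:" field is present
  const dataLines: string[] = [];
  for (const line of frame.split("\n")) {
    if (line.startsWith("event:")) {
      event = line.slice("event:".length).trim();
    } else if (line.startsWith("data:")) {
      // trim() is sufficient here because the payload is always JSON
      dataLines.push(line.slice("data:".length).trim());
    }
  }
  if (dataLines.length === 0) return null;
  return { event, data: JSON.parse(dataLines.join("\n")) };
}
```

A test for the SSE route could collect the response body, split it on blank lines, and run each chunk through `parseSseFrame` to assert on the sequence of statuses.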
### Pattern 4: Live Event Types for Job Progress
**What:** Extend LIVE_EVENT_TYPES in shared constants.
**When to use:** Publishing job progress from the runner.
```typescript
// packages/shared/src/constants.ts — add to LIVE_EVENT_TYPES array:
"content_job.queued",
"content_job.running",
"content_job.done",
"content_job.failed",
```
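
Because the four event types share the `content_job.` prefix, a consumer can narrow an arbitrary live-event type and recover the job status from it. This is a minimal sketch under that assumption; `CONTENT_JOB_EVENT_TYPES`, `isContentJobEvent`, and `statusFromEventType` are hypothetical names, and in the real codebase the strings would live inside `LIVE_EVENT_TYPES`.

```typescript
// Sketch only: these names are hypothetical; the real strings belong in LIVE_EVENT_TYPES.
const CONTENT_JOB_EVENT_TYPES = [
  "content_job.queued",
  "content_job.running",
  "content_job.done",
  "content_job.failed",
] as const;

type ContentJobEventType = (typeof CONTENT_JOB_EVENT_TYPES)[number];
type ContentJobStatus = "queued" | "running" | "done" | "failed";

// Type guard: narrows any live-event type string to the content-job subset.
function isContentJobEvent(type: string): type is ContentJobEventType {
  return (CONTENT_JOB_EVENT_TYPES as readonly string[]).includes(type);
}

// The status is the part after the "content_job." prefix.
function statusFromEventType(type: ContentJobEventType): ContentJobStatus {
  return type.slice("content_job.".length) as ContentJobStatus;
}
```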
### Pattern 5: Generated Asset Storage Namespace
**What:** `generated/` namespace bypasses upload-route multer limit.
**When to use:** Writing rendered output (video, SVG, PDF, PNG) from job runner.
```typescript
// server/src/attachment-types.ts (add alongside MAX_ATTACHMENT_BYTES):
export const MAX_GENERATED_ASSET_BYTES =
  Number(process.env.PAPERCLIP_GENERATED_ASSET_MAX_BYTES) || 500 * 1024 * 1024; // 500MB default

// Job runner stores via:
const stored = await storage.putFile({
  companyId,
  namespace: "generated", // bypasses upload limit: this is not from multer
  originalFilename: outputFilename,
  contentType,
  body: outputBuffer, // renderer output, no multer involved
});
```
**Key insight:** The upload route (`assets.ts`) enforces limits via multer `limits: { fileSize: MAX_ATTACHMENT_BYTES }`. Job runners write directly to `storage.putFile()` — multer is never involved. The `MAX_GENERATED_ASSET_BYTES` constant exists for the job runner to validate before calling `putFile`, but `putFile` itself has no byte limit.
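
The runner-side check described above can be kept as two small pure functions. This is a minimal sketch, not the project's implementation: `resolveMaxGeneratedAssetBytes` and `assertWithinGeneratedLimit` are hypothetical names, and unlike the `Number(...) || default` expression above, this variant also falls back to the default for negative values.

```typescript
// Sketch only: both helpers are hypothetical and not part of the codebase.
const DEFAULT_MAX_GENERATED_ASSET_BYTES = 500 * 1024 * 1024; // 500MB, the default named above

// Parse the env override; fall back to the default for missing, non-numeric,
// zero, or negative values.
function resolveMaxGeneratedAssetBytes(raw: string | undefined): number {
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0
    ? Math.floor(parsed)
    : DEFAULT_MAX_GENERATED_ASSET_BYTES;
}

// Throw before storage.putFile() is called so an oversized render fails the job
// through the runner's normal error path. Node Buffers are Uint8Arrays, so they are accepted.
function assertWithinGeneratedLimit(body: Uint8Array, maxBytes: number): void {
  if (body.byteLength > maxBytes) {
    throw new Error(`Generated asset is ${body.byteLength} bytes; limit is ${maxBytes} bytes`);
  }
}
```

In the runner this would sit between the render call and `storage.putFile()`, with the limit resolved once from `process.env.PAPERCLIP_GENERATED_ASSET_MAX_BYTES`.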
### Pattern 6: source_task_id on assets (INFRA-04)
**What:** Nullable column added to `assets` table.
**When to use:** Every asset created by a job runner must pass `sourceTaskId`.
```typescript
// packages/db/src/schema/assets.ts — add column:
sourceTaskId: text("source_task_id"), // nullable, no FK — task IDs are string identifiers
// assetService.create() in server/src/services/assets.ts accepts it through
// the existing spread pattern: db.insert(assets).values({ ...data, companyId })
// since the column is nullable, no callers break.
```
### Pattern 7: content-job-store.ts (service layer)
**What:** CRUD service for content_jobs table.
**When to use:** Create and update jobs from routes + runner.
```typescript
// server/src/services/content-job-store.ts
// Pattern source: server/src/services/assets.ts, heartbeat.ts
import { desc, eq } from "drizzle-orm";
// `Db` and `contentJobs` are imported from the db package, as in the other services.

export function contentJobStore(db: Db) {
  return {
    create: (companyId: string, data: { jobType: string; input: Record<string, unknown>; sourceTaskId: string | null }) =>
      db.insert(contentJobs).values({ companyId, ...data }).returning().then((r) => r[0]),
    getById: (id: string) =>
      db.select().from(contentJobs).where(eq(contentJobs.id, id)).then((r) => r[0] ?? null),
    listByCompany: (companyId: string) =>
      db.select().from(contentJobs)
        .where(eq(contentJobs.companyId, companyId))
        .orderBy(desc(contentJobs.createdAt)),
    // Every transition also bumps updatedAt
    transition: (id: string, patch: Partial<typeof contentJobs.$inferInsert>) =>
      db.update(contentJobs).set({ ...patch, updatedAt: new Date() }).where(eq(contentJobs.id, id)),
  };
}
```
### Anti-Patterns to Avoid
- **Awaiting render in HTTP handler:** Never `await renderer.run()` in the route handler. Always `void dispatch(job)` and return 202.
- **Using multer for generated asset storage:** Job runners call `storage.putFile()` directly; multer is only for user uploads.
- **Hardcoding status strings:** Always use the typed `CONTENT_JOB_STATUSES` constant from shared, not raw strings.
- **Blocking SSE on DB polling:** SSE endpoint subscribes to EventEmitter via `subscribeCompanyLiveEvents`, not a polling loop.
- **Missing `source_task_id` in job creation:** Every job submission should pass `sourceTaskId` from the incoming request (even if null for now); the column prevents future orphan accumulation.
---
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| In-process pub/sub for SSE | Custom EventEmitter wrapper | `publishLiveEvent` + `subscribeCompanyLiveEvents` from `live-events.ts` | Already handles multi-company scoping, id sequencing |
| Storage path generation | Custom UUID + date path builder | `StorageService.putFile()` via `service.ts` | Already handles namespace normalization, sha256, objectKey construction |
| Migration execution | Custom SQL runner | `pnpm db:generate` then `pnpm db:migrate` | Existing drizzle-kit + custom migration runner in `client.ts` |
| HTTP route mounting | Ad-hoc app.use() | Follow `app.ts` pattern: create router fn, import in app.ts | Consistent middleware application (auth, actor, etc.) |
| Asset DB record | Custom insert | `assetService(db).create()` | Already handles the full asset shape |
---
## Common Pitfalls
### Pitfall 1: Forgetting the "generated" namespace bypass logic lives in the runner, not the route
**What goes wrong:** Developer adds byte-limit check in the new content-job route handler (treating it like the assets upload route), rejecting large files before they're even rendered.
**Why it happens:** The upload route pattern uses multer limits; the developer copies that pattern.
**How to avoid:** The content-jobs route only creates the job record (tiny JSON). The runner calls `storage.putFile()` directly — no multer anywhere. The size check (`MAX_GENERATED_ASSET_BYTES`) belongs in the job runner, after rendering, before writing.
**Warning signs:** Any import of `multer` in `content-jobs.ts` or `content-job-runner.ts`.
### Pitfall 2: SSE connection leaking if client disconnects mid-job
**What goes wrong:** Client disconnects; the `unsubscribe` callback is never called; the EventEmitter listener accumulates. With many requests this can trigger Node's MaxListeners warning.
**Why it happens:** SSE streams need explicit `req.on("close")` cleanup.
**How to avoid:** Always register `req.on("close", () => { unsubscribe(); })` in every SSE handler. See `live-events-ws.ts` `cleanupByClient` pattern.
**Warning signs:** MaxListeners exceeded warning in server logs.
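
The cleanup rule can be checked in isolation by asserting on the emitter's listener count. This is a minimal sketch using Node's built-in `EventEmitter`; the `emitter` and `subscribe` below are hypothetical stand-ins for `live-events.ts`, whose real signatures are not shown here.

```typescript
import { EventEmitter } from "node:events";

// Sketch only: this emitter and subscribe() stand in for live-events.ts; names are hypothetical.
const emitter = new EventEmitter();

type Listener = (payload: { jobId: string }) => void;

// Returns an idempotent unsubscribe function, so calling it from both the
// terminal-event branch and req.on("close") is safe.
function subscribe(companyId: string, listener: Listener): () => void {
  let active = true;
  emitter.on(companyId, listener);
  return () => {
    if (!active) return;
    active = false;
    emitter.off(companyId, listener);
  };
}
```

A test for the SSE handler can apply the same idea: open a stream, abort the request, then assert that the listener count for the company is back to its starting value.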
### Pitfall 3: Publishing live events before the DB row is committed
**What goes wrong:** Browser receives `content_job.done` event, queries `/content-jobs/:id`, gets stale data (or 404 if the row hasn't flushed).
**Why it happens:** `publishLiveEvent` is synchronous; the DB write is async.
**How to avoid:** Always `await db.update(...)` before calling `publishLiveEvent(...)` in the job runner.
**Warning signs:** Frontend shows "done" but API returns "running".
### Pitfall 4: Migration numbering collision
**What goes wrong:** Two migrations are created with the same prefix (e.g., `0056_`) and drizzle-kit fails to apply them in order.
**Why it happens:** Parallel development or rebasing creates numbering conflicts.
**How to avoid:** Check `ls packages/db/src/migrations/` before running `pnpm db:generate`. The last file is currently `0055_create_push_subscriptions.sql`. New migrations start at `0056_`.
**Warning signs:** `pnpm db:generate` creates a file with a duplicate number.
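
The manual `ls` check can be backed by a small script or test that inspects migration filenames. This is a minimal sketch, assuming the `NNNN_name.sql` naming visible in this document; `nextMigrationPrefix` and `findDuplicatePrefixes` are hypothetical helpers, and reading the directory is left to the caller.

```typescript
// Sketch only: nextMigrationPrefix and findDuplicatePrefixes are hypothetical helpers.
const MIGRATION_FILE = /^(\d{4})_.+\.sql$/;

// Extract the numeric prefix of every filename that looks like a migration.
function prefixesOf(files: readonly string[]): number[] {
  const prefixes: number[] = [];
  for (const file of files) {
    const match = MIGRATION_FILE.exec(file);
    if (match) prefixes.push(Number(match[1]));
  }
  return prefixes;
}

// Next four-digit prefix after the highest existing one ("0000" for an empty directory).
function nextMigrationPrefix(files: readonly string[]): string {
  const prefixes = prefixesOf(files);
  const next = prefixes.length === 0 ? 0 : Math.max(...prefixes) + 1;
  return String(next).padStart(4, "0");
}

// Prefixes used by more than one file, i.e. the collision this pitfall describes.
function findDuplicatePrefixes(files: readonly string[]): string[] {
  const seen = new Set<number>();
  const duplicates = new Set<number>();
  for (const prefix of prefixesOf(files)) {
    if (seen.has(prefix)) duplicates.add(prefix);
    seen.add(prefix);
  }
  return Array.from(duplicates).map((p) => String(p).padStart(4, "0"));
}
```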
### Pitfall 5: Forgetting to export from schema/index.ts and services/index.ts
**What goes wrong:** TypeScript compiles but runtime throws "cannot find module" or imports return undefined.
**Why it happens:** The project uses explicit barrel exports; new modules are not picked up automatically.
**How to avoid:** After adding `content_jobs.ts` schema and `content-job-store.ts` service, immediately add exports to `packages/db/src/schema/index.ts` and `server/src/services/index.ts`.
### Pitfall 6: Adding content_job.* event types in the wrong place
**What goes wrong:** `publishLiveEvent({ type: "content_job.running" })` throws a TypeScript error because the type is not in `LIVE_EVENT_TYPES`.
**Why it happens:** `LiveEventType` is derived from `LIVE_EVENT_TYPES as const` in `packages/shared/src/constants.ts`.
**How to avoid:** Add the four new types (`content_job.queued`, `content_job.running`, `content_job.done`, `content_job.failed`) to the `LIVE_EVENT_TYPES` array in `constants.ts` before writing the runner.
---
## Code Examples
### Submitting a Job and Returning 202
```typescript
// Source: voice.ts (sync return) and assets.ts (storage pattern), combined
router.post("/companies/:companyId/content-jobs", async (req, res) => {
  const companyId = req.params.companyId;
  assertCompanyAccess(req, companyId);

  const { jobType, input, sourceTaskId } = req.body as {
    jobType: string;
    input?: Record<string, unknown>;
    sourceTaskId?: string;
  };

  const store = contentJobStore(db);
  const job = await store.create(companyId, {
    jobType,
    input: input ?? {},
    sourceTaskId: sourceTaskId ?? null,
  });

  void contentJobRunner.dispatch(db, storage, job); // fire and forget

  res.status(202).json({
    jobId: job.id,
    status: job.status,
    createdAt: job.createdAt,
  });
});
```
### Publishing Progress from the Runner
```typescript
// Source: live-events.ts publishLiveEvent pattern
async function runJob(db: Db, storage: StorageService, job: ContentJob) {
  const store = contentJobStore(db);

  // Transition to running (persist first, then publish; see Pitfall 3)
  await store.transition(job.id, { status: "running", startedAt: new Date() });
  publishLiveEvent({
    companyId: job.companyId,
    type: "content_job.running",
    payload: { jobId: job.id },
  });

  try {
    const result = await renderContent(job.jobType, job.input);

    // Size check belongs here: after rendering, before writing (see Pitfall 1)
    if (result.buffer.byteLength > MAX_GENERATED_ASSET_BYTES) {
      throw new Error(`Generated asset exceeds ${MAX_GENERATED_ASSET_BYTES} bytes`);
    }

    // Store asset
    const stored = await storage.putFile({
      companyId: job.companyId,
      namespace: "generated",
      originalFilename: result.filename,
      contentType: result.contentType,
      body: result.buffer,
    });
    const asset = await assetService(db).create(job.companyId, {
      ...stored,
      sourceTaskId: job.sourceTaskId,
      createdByAgentId: null,
      createdByUserId: null,
    });

    // Transition to done
    await store.transition(job.id, { status: "done", resultAssetId: asset.id, finishedAt: new Date() });
    publishLiveEvent({
      companyId: job.companyId,
      type: "content_job.done",
      payload: { jobId: job.id, assetId: asset.id },
    });
  } catch (err) {
    const errorMessage = err instanceof Error ? err.message : "Unknown error";
    await store.transition(job.id, { status: "failed", errorMessage, finishedAt: new Date() });
    publishLiveEvent({
      companyId: job.companyId,
      type: "content_job.failed",
      payload: { jobId: job.id, errorMessage },
    });
  }
}
```
### Migration for content_jobs table
```sql
-- packages/db/src/migrations/0056_create_content_jobs.sql
-- (generated by drizzle-kit; shown for reference)
CREATE TABLE IF NOT EXISTS "content_jobs" (
  "id" uuid PRIMARY KEY DEFAULT gen_random_uuid() NOT NULL,
  "company_id" uuid NOT NULL REFERENCES "companies"("id"),
  "job_type" text NOT NULL,
  "status" text NOT NULL DEFAULT 'queued',
  "input" jsonb NOT NULL DEFAULT '{}',
  "result_asset_id" uuid,
  "error_message" text,
  "source_task_id" text,
  "started_at" timestamp with time zone,
  "finished_at" timestamp with time zone,
  "created_at" timestamp with time zone DEFAULT now() NOT NULL,
  "updated_at" timestamp with time zone DEFAULT now() NOT NULL
);

CREATE INDEX IF NOT EXISTS "content_jobs_company_status_idx" ON "content_jobs" ("company_id", "status");
CREATE INDEX IF NOT EXISTS "content_jobs_company_created_idx" ON "content_jobs" ("company_id", "created_at");
```
### Migration for source_task_id on assets
```sql
-- packages/db/src/migrations/0057_assets_source_task_id.sql
ALTER TABLE "assets" ADD COLUMN IF NOT EXISTS "source_task_id" text;
CREATE INDEX IF NOT EXISTS "assets_source_task_id_idx" ON "assets" ("source_task_id");
```
---
## State of the Art
| Old Approach | Current Approach | When Changed | Impact |
|--------------|------------------|--------------|--------|
| Polling for job status | SSE push | Existing pattern in codebase | No polling loop needed on client |
| Blocking HTTP for render | HTTP 202 + async | Decision from STATE.md | Render time decoupled from response time |
| Flat asset storage | Namespaced storage (`generated/`, `assets/general`) | Existing `service.ts` pattern | No path collision between user uploads and generated output |
**Already established in project:**
- All DB schemas use Drizzle ORM with explicit migrations (not `drizzle.push`)
- Migration files are numbered sequentially starting at `0000_`; next is `0056_`
- Services follow a simple factory function pattern: `export function xService(db: Db) { return { ... } }`
- Routes follow `export function xRoutes(db, deps...) { const router = Router(); ... return router; }` pattern
- All routes are mounted in `server/src/app.ts`
---
## Environment Availability
Step 2.6: SKIPPED — Phase 40 is purely code and database changes. No external CLI tools, external services, or runtime binaries are required beyond Node.js 20, pnpm 9, and PostgreSQL already running.
---
## Validation Architecture
### Test Framework
| Property | Value |
|----------|-------|
| Framework | vitest (project dep) |
| Config file | `server/vitest.config.ts` (`environment: "node"`) |
| Quick run command | `pnpm test:run --project server -- --reporter=verbose src/__tests__/content-jobs*` |
| Full suite command | `pnpm test:run` |
### Phase Requirements → Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| INFRA-01 | POST /content-jobs returns 202 + jobId within 200ms | unit | `pnpm --filter @paperclipai/server test:run -- src/__tests__/content-jobs-routes.test.ts` | ❌ Wave 0 |
| INFRA-01 | Job transitions queued → running → done/failed | unit | same file | ❌ Wave 0 |
| INFRA-02 | SSE endpoint delivers progress events before terminal | unit | `pnpm --filter @paperclipai/server test:run -- src/__tests__/content-jobs-sse.test.ts` | ❌ Wave 0 |
| INFRA-03 | storage.putFile with generated/ namespace stores bytes without size error | unit | `pnpm --filter @paperclipai/server test:run -- src/__tests__/content-jobs-storage.test.ts` | ❌ Wave 0 |
| INFRA-04 | Asset created by job runner includes sourceTaskId | unit | covered in content-jobs-routes.test.ts | ❌ Wave 0 |
### Sampling Rate
- **Per task commit:** `pnpm --filter @paperclipai/server test:run -- src/__tests__/content-jobs*.test.ts`
- **Per wave merge:** `pnpm test:run`
- **Phase gate:** Full suite green before `/gsd:verify-work`
### Wave 0 Gaps
- [ ] `server/src/__tests__/content-jobs-routes.test.ts` — covers INFRA-01, INFRA-04
- [ ] `server/src/__tests__/content-jobs-sse.test.ts` — covers INFRA-02
- [ ] `server/src/__tests__/content-jobs-storage.test.ts` — covers INFRA-03
*(No new test framework needed — vitest already configured for server.)*
---
## Project Constraints (from CLAUDE.md)
CLAUDE.md does not exist at the project root. The constraints below are derived from `STATE.md` key decisions which carry the same authority:
1. **Async job pattern is mandatory** — all render requests return 202 + job ID immediately; never block HTTP on render
2. **`content_jobs` table must exist before any renderer is built**: this phase is the hard dependency for all others (phases 41–45)
3. **`sourceTaskId` required on every generated asset from day one** — prevents SSD orphan accumulation
4. **`MAX_GENERATED_ASSET_BYTES` constant bypasses the 10MB upload limit for `generated/` namespace** — separate from upload route
5. **Async pattern:** `renderPipelineService` stub must exist by end of phase (even as no-op) so phase 41 can extend it
---
## Sources
### Primary (HIGH confidence)
- Codebase: `server/src/services/live-events.ts`: EventEmitter pub/sub pattern for SSE fan-out
- Codebase: `server/src/routes/voice.ts`: SSE headers pattern (`text/event-stream`, `flushHeaders`, `res.write`)
- Codebase: `packages/db/src/schema/plugin_jobs.ts`: job lifecycle table pattern (status, timestamps, logs)
- Codebase: `packages/db/src/schema/assets.ts`: asset table shape for INFRA-04 extension
- Codebase: `server/src/storage/service.ts`: `putFile` has no byte limit; limit is in multer (upload route only)
- Codebase: `packages/shared/src/constants.ts`: `LIVE_EVENT_TYPES` pattern to extend for job events
- Codebase: `server/src/attachment-types.ts`: `MAX_ATTACHMENT_BYTES` = 10MB; `MAX_GENERATED_ASSET_BYTES` to be added here
- Codebase: `packages/db/src/migrations/`: last migration is `0055_`; next is `0056_`
- Project STATE.md: locked architecture decisions
### Secondary (MEDIUM confidence)
- Pattern inference from `heartbeat_runs` / `plugin_job_runs` tables (same repo) for `content_jobs` shape
### Tertiary (LOW confidence)
None — all findings are from direct codebase inspection.
---
## Metadata
**Confidence breakdown:**
- Standard stack: HIGH — all verified from codebase
- Architecture: HIGH — directly modeled on existing patterns in same repo
- Pitfalls: HIGH — identified from direct code review of existing patterns
**Research date:** 2026-04-04
**Valid until:** 2026-05-04 (stable internal patterns)