# Project Research Summary

**Project:** Nexus v1.7 - Content Generation Layer
**Domain:** AI-driven local content generation (presentations, diagrams, PDFs, themes, social assets, icons)
**Researched:** 2026-04-04
**Confidence:** MEDIUM-HIGH

## Executive Summary

Nexus v1.7 adds a local content generation layer to an existing Paperclip fork running on a Mac Mini M4. The scope is narrow but technically deep: agents produce visual and document deliverables (diagrams, PDFs, videos, color themes, social media assets, icons) entirely on-device, with no cloud API calls. The recommended approach is a pipeline of purpose-built libraries (Remotion for video, Playwright for PDFs, satori+resvg-js for social images, culori for OKLCH-based theme generation, and `@mermaid-js/mermaid-cli` for server-side diagrams) routed through a shared async job infrastructure built on top of the existing Paperclip `assets`, `publishLiveEvent`, and `StorageService` systems. Every content type is an installable skill, meaning the content layer is additive and does not touch the upstream Paperclip schema.

The single most important architectural decision is the async job pattern. Every render (Remotion video: 3–10 min, PDF: 1–5 sec, Mermaid: fast) must return a job ID immediately and push progress via the existing SSE live-events bus. Synchronous HTTP for any render is the primary failure path. The second most important decision is Remotion bundle isolation: the webpack bundler must run once at startup in the dedicated `packages/content-renderer/` workspace package, never on each render request, and never inside the main Vite/tsc server build context.

The primary risks cluster around three areas:

- Remotion's CPU/RAM footprint competing with Ollama on the shared M4 machine, mitigated by capping concurrency at 4 and serializing renders with LLM inference.
- Security in the diagram and icon pipeline: Mermaid `securityLevel: "loose"` has documented XSS-to-RCE exploits, and all SVG output, AI-generated or not, must pass DOMPurify before reaching the DOM.
- Storage growth: video renders accumulate fast on a finite Mac Mini SSD, so `sourceTaskId` linking and per-type retention policies are mandatory from day one, not deferred cleanup.

## Key Findings

### Recommended Stack

The v1.7 stack is entirely additive to the v1.6 base (Express, sharp, ffmpeg-static, grammy, mermaid). Seven new library groups cover the new content types. Remotion requires workspace isolation in `packages/content-renderer/` because its webpack bundler conflicts with Vite. Three separate Chromium binaries will be installed (Remotion, mermaid-cli, Playwright), totaling approximately 900MB on the Mac Mini SSD. That is acceptable, but it is worth attempting to share one binary via `PUPPETEER_EXECUTABLE_PATH`.

One package name needs verification before installation: the correct package may be `@resvg/resvg-js` (v2.6.2, Rust napi-rs) rather than `resvg-js` (v0.1.97, older version). Confirm before `pnpm add`.

**Core technologies:**
- `remotion ^4.0.443` + `@remotion/bundler` + `@remotion/renderer`: React-based video/presentation rendering, Mac M4 arm64 confirmed, SSR API works in Node.js without browser UI; isolated in `packages/content-renderer/`
- `playwright-chromium ^1.50.0`: HTML-to-PDF via headless Chromium, 42ms cold start vs Puppeteer's 147ms (2026 macOS arm64 benchmark), TypeScript-native; installed in `server/`
- `@mermaid-js/mermaid-cli ^11.12.0`: Official server-side Mermaid-to-SVG via `run()` API, same version as `mermaid ^11.12.0` already in `ui/`; installed in `server/`
- `satori ^0.26.0` + `@resvg/resvg-js ^2.6.2`: JSX/CSS-to-SVG-to-PNG without a browser; used by `@vercel/og` internally; pipeline for OG images, social cards, wallpapers; installed in `server/`
- `culori ^4.0.2`: OKLCH-native color math, correct WCAG contrast calculation (0.04045 threshold, not the erroneous 0.03928 in the W3C spec), 2026 community consensus over chroma-js for design-system work; installed in `server/` and `ui/`
- `@stable-canvas/comfyui-client ^1.5.9`: Zero-dependency MIT client for ComfyUI REST/WebSocket API; graceful degradation when ComfyUI is not running on `localhost:8188`; optional, installed in `server/`
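- `sharp ^0.34.5` (already installed): image compositing, resizing, format conversion; extended for content use, not re-added
- `ffmpeg-static ^5.3.0` (already installed): no second FFmpeg install is needed. The research notes cite `ensureFfmpeg()` for automatic detection, but that is a Remotion v3-era API and Remotion v4 ships its own FFmpeg build, so verify whether Remotion uses this binary at all before relying on it
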
### Expected Features

The FEATURES.md establishes a clear three-tier priority. The critical insight is that the Content Skill System must come first because every other content type depends on it. Satori+Sharp is the single image pipeline for all 2D raster output; do not introduce per-type image libraries.

**Must have (table stakes, P1):**
- Download produced file with correct MIME type and `Content-Disposition: attachment`
- Preview output before downloading (inline SVG, iframe PDF, Remotion Player, image thumbnail)
- Generation status feedback via SSE progress: `queued → generating → ready → error`
- Structured error recovery with actionable suggestions (e.g., "Run: ollama pull llava")
- Save output to file system with git versioning and PLACEHOLDERS.md manifest integration
- Re-generate with revised prompt (store parameters per job)
- Content type labeled clearly (distinct icon, preview strategy, type registry)

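
The status feedback and download requirements above can be sketched as a small transition table plus a header builder. This is a minimal sketch, not the Nexus implementation: `ContentJobStatus`, `canTransition`, and `downloadHeaders` are hypothetical names, and treating `error` as reachable from both `queued` and `generating` is an assumption about how the `queued → generating → ready → error` sequence is meant to be read.

```typescript
// Hypothetical status type mirroring the SSE progress states listed above.
type ContentJobStatus = "queued" | "generating" | "ready" | "error";

// Assumed transition table: `ready` and `error` are terminal states.
const ALLOWED_TRANSITIONS: Record<ContentJobStatus, ContentJobStatus[]> = {
  queued: ["generating", "error"],
  generating: ["ready", "error"],
  ready: [],
  error: [],
};

function canTransition(from: ContentJobStatus, to: ContentJobStatus): boolean {
  return ALLOWED_TRANSITIONS[from].indexOf(to) !== -1;
}

// Builds the two response headers named in the download requirement.
function downloadHeaders(filename: string, mimeType: string): Record<string, string> {
  // Replace characters that would break the quoted filename parameter.
  const safeName = filename.replace(/["\\\r\n]/g, "_");
  return {
    "Content-Type": mimeType,
    "Content-Disposition": `attachment; filename="${safeName}"`,
  };
}
```

Keeping the transitions in one table means the SSE publisher and the UI badge can share a single definition of which status changes are legal.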
**Should have (differentiators, P1/P2):**
- Agent-driven generation from chat (NL → skill routing → file attachment in chat)
- Content types as installable skills (each generator is a separate skill file, not a monolithic feature)
- PLACEHOLDERS.md manifest integration (draft flag, `prompt_hash`, `generated_at` on every asset)
- Seed-color-to-full-theme pipeline with WCAG AA enforced (not optional) using OKLCH
- Diagram from natural language (LLM → Mermaid syntax → server-side SVG)
- Local-only operation (no data leaves Mac Mini)

**Defer to v2+:**
- Branding media kit (high coordination cost; requires all other generators stable first)
- Batch generation (job queue infrastructure not justified for v1.7)
- Font embedding in PDF/video (licensing audit required)
- Auto-publish to social platforms (OAuth token management, platform API complexity)
- Template marketplace

### Architecture Approach

The architecture builds entirely on existing Nexus/Paperclip patterns: factory functions (not classes), `StorageService` for all blob storage, `publishLiveEvent` for SSE fan-out, and the `assets` table for file metadata. The core addition is a `content_jobs` table tracking async render lifecycle, a `renderPipelineService` routing jobs to typed `RendererAdapter` implementations, and a `themeEngineService` as a pure computation service with no DB dependency. The ARCHITECTURE.md is derived from direct codebase inspection (HIGH confidence); the patterns are proven.

Content types are implemented as Markdown skill files, not code. Agents read the skill instructions and call `POST /api/companies/:id/content-jobs` with the appropriate `type` and `params`. No new schema is needed for the skill layer.

**Major components:**
1. `contentJobService`: Enqueues async render jobs, emits `content.job.started/done/failed` live events, tracks lifecycle in `content_jobs` table; returns `202 Accepted` with job ID immediately
2. `renderPipelineService`: Strategy dispatch that routes `ContentJobType` to the correct `RendererAdapter`; each adapter is independently pluggable behind a shared interface
3. `themeEngineService`: Pure OKLCH computation (seed color → palette → WCAG AA validation → CSS/JSON/Tailwind exports); synchronous HTTP, no DB, client-side preview via CSS custom property injection
4. Renderer adapters (mermaid, svg, pdf, remotion, image): each isolated behind `RendererAdapter` interface; binary-dependent adapters in `server/src/services/renderers/`
5. `packages/content-renderer/` (Remotion workspace package): Compositions bundled once at startup; `renderMedia()` called per request against cached bundle path
6. UI components (`ContentJobViewer`, `DiagramRenderer`, `ThemePreview`, `ContentGallery`): consume SSE events and existing asset APIs

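
The relationship between the first two components can be sketched as follows. This is an illustrative sketch under stated assumptions, not the Nexus code: the interface shapes, the in-memory `Map` standing in for the `content_jobs` table, the `publish` callback standing in for `publishLiveEvent`, and the `done` promise are all hypothetical.

```typescript
// All names and shapes here are illustrative assumptions, not the Nexus API.
type ContentJobType = "mermaid" | "svg" | "pdf" | "remotion" | "image";

interface RendererAdapter {
  render(params: Record<string, unknown>): Promise<Uint8Array>;
}

interface JobEvent {
  type: "content.job.started" | "content.job.done" | "content.job.failed";
  jobId: string;
}

interface ContentJob {
  id: string;
  type: ContentJobType;
  status: "queued" | "generating" | "ready" | "error";
  sourceTaskId: string;
}

// Strategy dispatch: look up the adapter registered for the job type.
function createRenderPipeline(adapters: Partial<Record<ContentJobType, RendererAdapter>>) {
  return {
    async render(type: ContentJobType, params: Record<string, unknown>): Promise<Uint8Array> {
      const adapter = adapters[type];
      if (!adapter) throw new Error(`No renderer adapter registered for "${type}"`);
      return adapter.render(params);
    },
  };
}

// Factory function (not a class), following the pattern the summary describes.
function createContentJobService(
  pipeline: ReturnType<typeof createRenderPipeline>,
  publish: (event: JobEvent) => void, // stand-in for publishLiveEvent
) {
  const jobs = new Map<string, ContentJob>(); // stand-in for the content_jobs table
  let nextId = 1;

  return {
    // Returns without waiting for the render; progress arrives via `publish`.
    enqueue(type: ContentJobType, params: Record<string, unknown>, sourceTaskId: string) {
      const job: ContentJob = { id: `job-${nextId++}`, type, status: "queued", sourceTaskId };
      jobs.set(job.id, job);
      const done = (async () => {
        job.status = "generating";
        publish({ type: "content.job.started", jobId: job.id });
        try {
          await pipeline.render(type, params);
          job.status = "ready";
          publish({ type: "content.job.done", jobId: job.id });
        } catch {
          job.status = "error";
          publish({ type: "content.job.failed", jobId: job.id });
        }
      })();
      // `done` is exposed only so callers and tests can await completion.
      return { job, done };
    },
    get: (id: string) => jobs.get(id),
  };
}
```

An HTTP route built on this would respond `202 Accepted` with `job.id` and never await `done`; a real implementation would also persist the rendered bytes through `StorageService`, which this sketch omits.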
### Critical Pitfalls

The PITFALLS.md has 22 v1.7-specific pitfalls (45–66). The highest-severity items:

1. **Remotion `bundle()` called per render request** (Pitfall 45): Webpack takes 2–5 min; server becomes unresponsive under load. Prevention: call `bundle()` once at startup, cache the bundle path, pass only `inputProps` to `renderMedia()` per request.

2. **Storage 10MB limit blocks video/large image storage** (Pitfall 48): The existing `MAX_ATTACHMENT_BYTES = 10MB` and MIME type allowlist reject generated video files. Prevention: separate `MAX_GENERATED_ASSET_BYTES` constant and `generated/` namespace in `StorageService`; write rendered output directly via `putObject`, bypassing the upload route entirely.

3. **Mermaid `securityLevel: "loose"` enabling XSS to RCE** (Pitfall 49): AI-generated Mermaid syntax with `click` directives executes arbitrary JS. Confirmed exploits in production apps (OneUptime, DeepChat) in 2025–2026. Prevention: always `"strict"`, strip `%%{init}%%` and `click` statements before render, DOMPurify on SVG output.

4. **HSL-based palette generation producing perceptually incoherent themes** (Pitfall 51): Equal HSL lightness steps are not perceptually equal; blue at L=50% appears darker than yellow at L=50%. Prevention: use OKLCH via `culori` for all generation; never HSL as an intermediate.

5. **Agent heartbeat timeout too short for long renders** (Pitfall 60): A 3–10 min video render orphans when the heartbeat exits; task stays `in_progress` indefinitely, or a second render starts. Prevention: fire-and-forget from heartbeat (write job ID to task, exit); a polling routine checks job status and closes the task on completion.

6. **Generated assets not linked to originating task** (Pitfall 66): Orphaned files accumulate on Mac Mini SSD (50–200GB over months). Prevention: `sourceTaskId` is a mandatory field on every generated asset from day one; cleanup job triggers on task deletion.

7. **AI-generated SVG rendered inline without sanitization** (Pitfall 64): XSS via `<script>` tags or event handlers in AI-generated SVG when set directly as innerHTML. Prevention: DOMPurify with SVG profile on all AI-generated SVG; prefer `<img src="data:image/svg+xml;base64,...">` over inline SVG for untrusted content.

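
The pre-render stripping step from Pitfall 49 can be sketched as a plain text filter. This is a minimal sketch under assumptions: `sanitizeMermaidSource` is a hypothetical helper name, and the regular expressions cover only the two constructs the pitfall names (`%%{...}%%` directives and `click` statements). It complements `securityLevel: "strict"` and DOMPurify on the rendered SVG; it does not replace either.

```typescript
// Pre-render hardening for untrusted (AI-generated) Mermaid source.
// Text-level filter only: strict securityLevel and SVG sanitization are still required.
function sanitizeMermaidSource(source: string): string {
  return source
    // Remove directives such as %%{init: {...}}%%, which can override config like securityLevel.
    .replace(/%%\{[\s\S]*?\}%%/g, "")
    .split("\n")
    // Drop interaction statements: `click <node> ...` binds callbacks or links to nodes.
    .filter((line) => !/^\s*click\s/i.test(line))
    .join("\n");
}
```

For example, a flowchart that begins with an `init` directive setting `securityLevel` to `loose` and contains a `click A callback` line comes out with both removed and the node and edge definitions untouched.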
## Implications for Roadmap

Based on the dependency graph in FEATURES.md and the build order in ARCHITECTURE.md, the natural phase structure has seven phases. The critical path runs: storage/job infrastructure → fast no-binary content types → UI pipeline → browser-dependent generators (PDF, video) → optional ML-dependent features.

### Phase 1: Storage and Job Infrastructure
**Rationale:** Everything else depends on this. The `content_jobs` table, `renderPipelineService` stub, storage namespace extension, and the 10MB limit fix (Pitfall 48) must exist before any content type can be built. The `sourceTaskId` field (Pitfall 66) must be present from the first asset stored.
**Delivers:** `content_jobs` DB migration, `contentJobService`, `renderPipelineService` stub, extended storage namespace, `LIVE_EVENT_TYPES` for content jobs, API route scaffolding, `MAX_GENERATED_ASSET_BYTES` constant
**Addresses:** Table stakes (download, status feedback, save to file system, re-generate)
**Avoids:** Pitfall 48 (storage size limit), Pitfall 66 (orphaned assets), Pitfall 45 (bundle-per-render pre-empted by establishing async job model), Pitfall 60 (agent heartbeat; async fire-and-forget designed here)

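
The storage rules for this phase (a separate size cap, a `generated/` namespace, and a mandatory `sourceTaskId`) can be sketched as one validation function. Everything here is an assumption for illustration: the 2 GiB value for `MAX_GENERATED_ASSET_BYTES` is a placeholder, and the key layout and the `generatedAssetKey` helper are hypothetical rather than the `StorageService` contract.

```typescript
// Placeholder cap; the real value is a planning decision, not stated in the research.
const MAX_GENERATED_ASSET_BYTES = 2 * 1024 * 1024 * 1024;

interface GeneratedAssetInput {
  companyId: string;
  sourceTaskId: string; // mandatory: links the asset to its originating task
  jobId: string;
  filename: string;
  byteLength: number;
}

// Validates the input and returns a storage key under the `generated/` namespace.
function generatedAssetKey(input: GeneratedAssetInput): string {
  if (!input.sourceTaskId) {
    throw new Error("sourceTaskId is required for every generated asset");
  }
  if (input.byteLength > MAX_GENERATED_ASSET_BYTES) {
    throw new Error(`Generated asset exceeds ${MAX_GENERATED_ASSET_BYTES} bytes`);
  }
  // Hypothetical layout: grouping by task makes cleanup on task deletion a prefix delete.
  return `generated/${input.companyId}/${input.sourceTaskId}/${input.jobId}/${input.filename}`;
}
```

A renderer adapter would call this before writing through `putObject`, so oversized or unlinked output is rejected before it reaches the SSD.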
### Phase 2: Fast Content Types (No Binary Dependencies)
**Rationale:** SVG generation and theme engine are pure TypeScript with no Chromium, Webpack, or binary deps. They validate the end-to-end pipeline (job → render → asset → SSE → UI) at low risk before heavier renderers are added. WCAG contrast correctness (Pitfall 52) and OKLCH color space (Pitfall 51) must be locked here; retrofitting after the theme exporter is built is costly.
**Delivers:** `svgGeneratorAdapter` (icons, placeholders, banners), `themeEngineService` (OKLCH, WCAG AA enforcement, CSS/JSON/Tailwind export), placeholder asset system with DRAFT watermark, culori integration
**Addresses:** Theme + palette generator (P1), placeholder asset system (P1), icon generation scaffolding, OKLCH export in multiple formats
**Avoids:** Pitfall 51 (HSL perceptual incoherence), Pitfall 52 (WCAG linearization error), Pitfall 62 (HEX-only export losing OKLCH)

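
The WCAG linearization point (Pitfall 52) is easiest to see in code. The sketch below is dependency-free for illustration and is not the `themeEngineService` implementation, which the research recommends building on culori; the function names are hypothetical. It uses the 0.04045 threshold from the sRGB standard and the 4.5:1 AA ratio for normal-size text.

```typescript
type Rgb = [number, number, number]; // 8-bit sRGB channels, 0-255

// Linearize one sRGB channel using the 0.04045 threshold (not 0.03928).
function linearize(channel8bit: number): number {
  const c = channel8bit / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// WCAG relative luminance: weighted sum of the linearized channels.
function relativeLuminance(color: Rgb): number {
  const [r, g, b] = color;
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// WCAG contrast ratio: (lighter + 0.05) / (darker + 0.05), ranging from 1 to 21.
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

const WCAG_AA_NORMAL_TEXT = 4.5;

function meetsAA(foreground: Rgb, background: Rgb): boolean {
  return contrastRatio(foreground, background) >= WCAG_AA_NORMAL_TEXT;
}
```

With 8-bit inputs the two thresholds select the same branch for every channel value, so the difference matters mainly for higher-precision color math; using 0.04045 everywhere keeps the theme engine consistent with the sRGB definition.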
### Phase 3: Diagram Generation and Content Gallery UI
**Rationale:** Mermaid is the highest-value, lowest-complexity content type. The UI pipeline (ContentJobViewer, DiagramRenderer, ContentGallery) validates the SSE progress flow end-to-end. The Mermaid security config (Pitfall 49) and DOMPurify memory pattern (Pitfall 50) must be established before any diagram renders reach the browser.
**Delivers:** `mermaidRendererAdapter` (server-side via `@mermaid-js/mermaid-cli`), `ChatMarkdownMessage` extension for client-side Mermaid fences, `DiagramRenderer` component, `ThemePreview` component, `ContentJobViewer`, `ContentGallery`, `GeneratedAssetCard`, `assetService.list()`
**Addresses:** Diagram generation (P1), content type preview, inline diagram rendering in chat
**Avoids:** Pitfall 49 (Mermaid XSS/RCE), Pitfall 50 (DOMPurify JSDOM memory accumulation), Pitfall 59 (server-side Mermaid DOM requirement)

### Phase 4: Wallpapers and OG Images (Satori Pipeline)
**Rationale:** The satori+resvg-js+sharp pipeline is pure Node.js (no Chromium) and covers OG images, social headers, and wallpapers in a single code path. It establishes the reusable 2D raster pipeline before PDF and video introduce heavier binary deps.
**Delivers:** Platform-sized image outputs (OG 1200x630, Instagram 1080x1080, desktop wallpaper 2560x1440, etc.), `social.ts` service, platform dimension registry constant
**Uses:** satori, @resvg/resvg-js, sharp (already installed)
**Addresses:** Wallpapers + OG images (P1), social media content scaffolding (P2)
**Avoids:** Pitfall 56 (platform MIME type and dimension constraints encoded as explicit data structure, not magic numbers)

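
The platform dimension registry mentioned above might look like the following. This is a sketch under assumptions: the three sizes come from the phase description, while the registry keys, the MIME type per platform, and the `assertMatchesPlatform` helper are hypothetical.

```typescript
// Hypothetical registry keys; the dimensions are the ones listed in this phase.
type PlatformId = "og-image" | "instagram-square" | "desktop-wallpaper";

interface PlatformSpec {
  width: number;
  height: number;
  mimeType: "image/png" | "image/jpeg"; // assumed per-platform output format
}

const PLATFORM_DIMENSIONS: Record<PlatformId, PlatformSpec> = {
  "og-image": { width: 1200, height: 630, mimeType: "image/png" },
  "instagram-square": { width: 1080, height: 1080, mimeType: "image/jpeg" },
  "desktop-wallpaper": { width: 2560, height: 1440, mimeType: "image/png" },
};

// Rejects rendered output whose size does not match the registered spec.
function assertMatchesPlatform(id: PlatformId, width: number, height: number): PlatformSpec {
  const spec = PLATFORM_DIMENSIONS[id];
  if (width !== spec.width || height !== spec.height) {
    throw new Error(`${id} expects ${spec.width}x${spec.height}, got ${width}x${height}`);
  }
  return spec;
}
```

Because the registry is typed by `PlatformId`, adding a platform is a one-line data change and a misspelled key fails at compile time rather than producing a wrongly sized image.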
### Phase 5: PDF Document Generation
**Rationale:** PDF introduces the first persistent Chromium instance via `playwright-chromium`. Browser lifecycle must be established as a persistent instance (Pitfall 54) before any template work begins. Font self-hosting (Pitfall 53) must be designed before the first PDF template is considered complete.
**Delivers:** `pdfRendererAdapter` (Playwright persistent browser instance), HTML template PDF (reports, one-pagers), pdf-lib for data-driven invoices, font self-hosting via Express static server, PDF download flow in UI
**Addresses:** PDF generation (P1)
**Avoids:** Pitfall 53 (headless Chromium font loading), Pitfall 54 (Puppeteer launch-per-request overhead)

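
The persistent-instance pattern behind Pitfall 54 can be sketched without importing a browser library. The sketch is generic and hypothetical: `createPersistentBrowser` and `BrowserLike` are illustrative names, and the `launch` callback is where a real adapter would, under the assumption that Playwright is used as planned, pass something like `() => chromium.launch()` from `playwright-chromium`.

```typescript
// Minimal shape the holder needs from a browser handle.
interface BrowserLike {
  close(): Promise<void>;
}

function createPersistentBrowser<B extends BrowserLike>(launch: () => Promise<B>) {
  let pending: Promise<B> | null = null;

  return {
    // The first call launches; concurrent and later calls share the same promise.
    get(): Promise<B> {
      if (!pending) {
        pending = launch().catch((err) => {
          pending = null; // allow a retry after a failed launch
          throw err;
        });
      }
      return pending;
    },
    // Closes the shared instance, for example on server shutdown.
    async dispose(): Promise<void> {
      if (!pending) return;
      const browser = await pending;
      pending = null;
      await browser.close();
    },
  };
}
```

Each PDF request would then open a fresh page on the shared browser and close only that page when done, so the launch cost is paid once per server process rather than once per document.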
### Phase 6: Video and Presentations (Remotion)
**Rationale:** Remotion is the highest-complexity and highest-risk content type: webpack bundler conflicts, three Chromium binaries total, M4 concurrency limits, and the agent heartbeat timeout problem. It comes last among P1/P2 features so the async job infrastructure (Phase 1) is fully proven before the longest-running render type is added.
**Delivers:** `packages/content-renderer/` workspace package, `remotionRendererAdapter` (CLI subprocess with cached bundle), video playback UI, `onProgress` SSE progress events, render queue with `concurrency: 4` on M4
**Addresses:** Remotion presentations + video (P2)
**Avoids:** Pitfall 45 (bundle-per-render), Pitfall 46 (Chromium concurrency thrashing), Pitfall 47 (bundler inside compiled server context), Pitfall 55 (video not streamable; `onProgress` mandatory), Pitfall 63 (pnpm lockfile conflicts; add Remotion immediately after upstream rebase)

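
The two rules for this phase (bundle once, then serialize renders against the cached bundle) can be sketched with injected functions. This is a hedged sketch, not the adapter: `BundleFn` and `RenderFn` are hypothetical stand-ins for Remotion's `bundle()` and `renderMedia()`, whose real option objects are richer, and `createRemotionRenderer` is an illustrative name.

```typescript
// Hypothetical stand-ins for @remotion/bundler's bundle() and @remotion/renderer's renderMedia().
type BundleFn = () => Promise<string>; // resolves to the bundle location
type RenderFn = (serveUrl: string, inputProps: Record<string, unknown>) => Promise<string>;

function createRemotionRenderer(bundleOnce: BundleFn, render: RenderFn) {
  let serveUrl: Promise<string> | null = null;
  let tail: Promise<unknown> = Promise.resolve();

  // Call at startup so the webpack cost is paid once, not per request.
  function warmUp(): Promise<string> {
    if (!serveUrl) serveUrl = bundleOnce();
    return serveUrl;
  }

  // Renders run one at a time, in arrival order, against the cached bundle.
  function enqueue(inputProps: Record<string, unknown>): Promise<string> {
    const run = async () => render(await warmUp(), inputProps);
    const result = tail.then(run, run);
    tail = result.catch(() => undefined); // keep the chain alive after a failed render
    return result;
  }

  return { warmUp, enqueue };
}
```

Serializing whole renders here is separate from Remotion's own per-render `concurrency` setting; the queue keeps two videos from competing with each other and with Ollama, while the per-render value would still be chosen from the benchmark the research flags call for.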
### Phase 7: Content as Skills
**Rationale:** No new code: this phase writes Markdown skill files for each content type in `company_skills`. It is last because skill instructions reference API contracts finalized in Phases 1–6. Plugin boundary rules (Pitfall 57) must be enforced before any skill implementation.
**Delivers:** Skill markdown files for diagram, theme, PDF, wallpaper, video content types; agent-callable via existing Skill Aggregator
**Addresses:** Content types as installable skills (differentiator)
**Avoids:** Pitfall 57 (plugin workers bypassing JSON-RPC bridge, using direct HTTP to host API)

### Phase Ordering Rationale

- Phases 1 → 2 → 3 follow the build-order diagram in ARCHITECTURE.md exactly: infrastructure unblocks fast types, fast types validate the pipeline, UI comes after the first adapter works end-to-end.
- Phase 4 (Satori) precedes Phase 5 (PDF) because Satori has no Chromium dep; PDF introduces the first persistent browser instance that the diagram renderer (Phase 3) can optionally reuse to avoid a second Chromium binary.
- Phase 6 (Remotion) is last among feature phases because it is CPU/RAM-intensive and its Webpack bundler is a build pipeline risk; isolating it reduces rebase conflict surface.
- Phase 7 (Skills) is last because skill instructions reference finalized API contracts.

### Research Flags

Phases likely needing deeper research during planning:
- **Phase 6 (Remotion):** Chromium binary count on the specific Mac Mini M4 config (16GB vs 32GB RAM variant changes concurrency budget); Remotion bundle vs Vite isolation needs validation in the actual monorepo build pipeline; run `npx remotion benchmark` before finalizing concurrency setting
- **Phase 5 (PDF):** Verify whether `playwright-chromium` and `@mermaid-js/mermaid-cli` can share a Chromium binary via `PUPPETEER_EXECUTABLE_PATH` to reduce the total to two binaries instead of three
- **Phase 4 (Satori):** Verify correct package name: `@resvg/resvg-js` vs `resvg-js`; npm shows different versions, so confirm before `pnpm add`

Phases with standard patterns (can proceed without additional research):
- **Phase 1 (Infrastructure):** Factory function pattern, `content_jobs` table schema, and SSE live events pattern are all directly codebase-confirmed (HIGH confidence, no research needed)
- **Phase 2 (Theme/SVG):** culori OKLCH API is documented and confirmed; WCAG threshold fix is specific and well-understood
- **Phase 3 (Mermaid):** Mermaid CLI Node.js `run()` API confirmed in README; security config is a one-line change with documented correct value
- **Phase 7 (Skills):** Skill markdown format is already established in the codebase

## Confidence Assessment

| Area | Confidence | Notes |
|------|------------|-------|
| Stack | MEDIUM-HIGH | Remotion HIGH (official SSR docs confirmed). Playwright PDF benchmark MEDIUM (single benchmark source, pdf4.dev March 2026). resvg-js package name LOW (npm shows two packages; verify). culori MEDIUM (version and WCAG claim confirmed via npm + pkgpulse comparison). ComfyUI client MEDIUM (npm confirmed, Mac M4 support sourced from offlinecreator.com). |
| Features | MEDIUM-HIGH | Technology capabilities verified via docs. UX expectations inferred from Canva/Pitch/Mermaid Live comparisons. Skill architecture patterns based on existing Nexus skill system. |
| Architecture | HIGH | Derived entirely from direct codebase inspection of `/opt/nexus/` on 2026-04-04. Factory function patterns, StorageService interface, live events bus, placeholder service, and asset service all confirmed by reading source files. |
| Pitfalls | HIGH | Critical pitfalls verified via multiple sources: Mermaid XSS confirmed via production exploit reports (OneUptime, DeepChat 2025–2026); WCAG linearization error confirmed vs W3C spec; HSL perceptual non-uniformity confirmed by Tailwind CSS 4.0 rationale; Remotion bundle timing confirmed via official Remotion SSR docs. |

**Overall confidence:** MEDIUM-HIGH

### Gaps to Address

- **resvg-js package name:** Run `npm info @resvg/resvg-js` before `pnpm add`; npm shows divergent versions between `resvg-js` (v0.1.97) and `@resvg/resvg-js` (v2.6.2). Use the scoped package.
- **Chromium binary sharing:** Whether `PUPPETEER_EXECUTABLE_PATH` pointing to Playwright's Chromium satisfies `@mermaid-js/mermaid-cli`'s bundled-puppeteer binary requirement needs a 10-minute test on the Mac Mini before Phase 3 begins; it could eliminate one ~300MB download.
- **Remotion Vite isolation:** Run `pnpm build` after adding `packages/content-renderer/` to the workspace to verify no Vite/webpack conflicts surface before Phase 6 implementation work begins.
- **ComfyUI availability:** Image generation (optional, Phase 7) assumes ComfyUI is already installed. Confirm whether this is in scope for v1.7 or defer to v2; the install is multi-GB (ComfyUI + Flux.1 model).
- **pdf-lib scope:** FEATURES.md recommends both Playwright (design-rich PDFs) and pdf-lib (invoices). During Phase 5 planning, confirm whether pdf-lib is in scope for v1.7 or whether all PDF generation is initially Playwright-only.

## Sources

### Primary (HIGH confidence)
- Direct codebase inspection of `/opt/nexus/` (2026-04-04): service patterns, StorageService interface, live events bus, asset schema, placeholder service, package.json contents
- [Remotion SSR docs](https://www.remotion.dev/docs/ssr): `@remotion/renderer` Node.js API, bundle caching pattern
- [Remotion Express render-server template](https://github.com/remotion-dev/template-render-server): Express integration confirmed
- [vercel/satori GitHub](https://github.com/vercel/satori): JSX-to-SVG API, font format constraints (TTF/OTF/WOFF, no WOFF2)
- [mermaid-cli GitHub](https://github.com/mermaid-js/mermaid-cli): Node.js `run()` API confirmed
- [playwright-chromium npm](https://www.npmjs.com/package/playwright-chromium): Chromium-only package confirmed
- [culori npm](https://www.npmjs.com/package/culori): version 4.0.2, WCAG functions confirmed

### Secondary (MEDIUM confidence)
- [PDF benchmark 2026](https://pdf4.dev/blog/html-to-pdf-benchmark-2026): Playwright vs Puppeteer macOS arm64 timing (single source)
- [thx/resvg-js GitHub](https://github.com/thx/resvg-js): SVG-to-PNG Rust napi-rs; package name ambiguity noted
- [culori vs chroma-js 2026](https://www.pkgpulse.com/blog/culori-vs-chroma-js-vs-tinycolor2-color-manipulation-javascript-2026): OKLCH accuracy comparison
- [@stable-canvas/comfyui-client npm](https://www.npmjs.com/package/@stable-canvas/comfyui-client): zero deps, MIT confirmed
- [SocialSizes.io 2026](https://socialsizes.io/): platform dimension registry

### Tertiary (LOW confidence; needs validation during implementation)
- [offlinecreator.com: ComfyUI Mac M4 2026](https://offlinecreator.com/blog/best-local-stable-diffusion-setup-2026): ComfyUI Metal/MPS support on M4
- Mermaid XSS via `securityLevel: "loose"`: referenced via exploit reports for OneUptime and DeepChat; the attack vector is documented in the Mermaid changelog and security advisories; specific CVE numbers not cited

---
*Research completed: 2026-04-04*
*Ready for roadmap: yes*