Nexus Dev e4a103cd9b docs: complete project research

2026-04-04 04:25:21 +00:00

Feature Research

Domain: Content Generation Layer (Nexus v1.7); agents produce visual, document, and media deliverables
Researched: 2026-04-04
Confidence: MEDIUM-HIGH. Technology capabilities verified via docs and ecosystem research; UX expectations inferred from comparable tools (Canva, Pitch, Mermaid Live, Figma tokens); skill system patterns based on existing Nexus skill architecture

Milestone Scope

This document covers only NEW features in v1.7. The following are already built and are dependencies, not deliverables:

File system with upload, git versioning, PLACEHOLDERS.md manifest (v1.3)
Skill system with Skill Aggregator and company skills API (Paperclip upstream)
Chat interface with streaming SSE (v1.3)
Agent orchestration, heartbeat lifecycle (Paperclip upstream)
Voice I/O with Whisper STT + Piper TTS (v1.6)
Hermes adapter with native skills, Ollama integration (v1.4)

New features being researched:

Presentations and video generation via Remotion
Placeholder assets with DRAFT styling and manifest tracking
Theme and palette generator (seed color → full theme, WCAG AA, exports)
Wallpapers and visual assets (desktop/mobile, banners, OG images)
Diagram generation (natural language → Mermaid → SVG/PNG)
Document generation (PDF reports, invoices, one-pagers)
Icon generation (SVG from description, consistent sets)
Social media content (platform-formatted posts, carousels, hashtags)
Branding media kit (full brand identity from conversation)
Content types as installable Nexus skills

Feature Landscape

Table Stakes (Users Expect These)

Features that must exist for content generation to feel complete. If any of these is missing, the deliverable is not production-ready.

Feature	Why Expected	Complexity	Notes
Download produced file directly	Every generator (Canva, Pitch, Remotion) lets you download the output; no download = the tool is a preview, not a generator	LOW	Route: `GET /api/content/:jobId/download`; serve the file buffer with correct MIME type and `Content-Disposition: attachment`
Show a preview of the output	Users need to see what was generated before downloading — a blank "file ready" message is not sufficient	MEDIUM	For images/SVG: `<img>` or inline SVG; for PDF: `<iframe src="...">` or pdf.js; for Remotion: Remotion Player component; for diagrams: rendered SVG inline
Generation status feedback	Content generation jobs (PDF, Remotion render, Mermaid) can take 5–60s; a spinner with no progress info creates anxiety	LOW	SSE progress events or polling with a status enum: `queued → generating → ready`, with `error` as the failure branch from `generating`; show estimated time where possible
Error recovery with explanation	If generation fails, the user needs to know why and what to try next	LOW	Return structured error: `{ error: "model_not_installed", message: "Ollama llava model required for image generation", suggestion: "Run: ollama pull llava" }`
Save output to file system	Generated files belong in the existing file system so they participate in git versioning and the PLACEHOLDERS.md manifest	MEDIUM	Integrate with existing file upload pipeline: write generated file to workspace directory, call git versioning hook, update PLACEHOLDERS.md
Re-generate with revised prompt	Users iterate on generated content; "generate and you're done" misses 80% of actual workflow	LOW	Store generation parameters (prompt, theme, options) with the job; "regenerate" re-calls with same params + optional overrides
Content type labeled clearly	A diagram, a PDF, and a video are fundamentally different outputs; conflating them in one UI creates confusion	LOW	Each content type has a distinct icon, label, and preview strategy; use a type registry not ad hoc conditionals
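
The "type registry, not ad hoc conditionals" note and the download route above can be sketched together. This is a minimal illustration, not the Nexus implementation: the `ContentType` names, the `CONTENT_TYPES` table, and `downloadHeaders` are all hypothetical, and only the MIME types and the `Content-Disposition: attachment` header come from standard HTTP practice.

```typescript
// Hypothetical registry: one entry per content type, so UI and API code
// look up label, MIME type, and preview strategy instead of branching.
type ContentType = "diagram" | "pdf" | "image" | "video";
type PreviewStrategy = "inline-svg" | "iframe" | "img" | "player";

interface ContentTypeSpec {
  label: string;
  mimeType: string;
  extension: string;
  preview: PreviewStrategy;
}

const CONTENT_TYPES: Record<ContentType, ContentTypeSpec> = {
  diagram: { label: "Diagram", mimeType: "image/svg+xml", extension: "svg", preview: "inline-svg" },
  pdf: { label: "PDF document", mimeType: "application/pdf", extension: "pdf", preview: "iframe" },
  image: { label: "Image", mimeType: "image/png", extension: "png", preview: "img" },
  video: { label: "Video", mimeType: "video/mp4", extension: "mp4", preview: "player" },
};

// Build response headers for a download from the registry entry.
function downloadHeaders(type: ContentType, baseName: string): Record<string, string> {
  const spec = CONTENT_TYPES[type];
  // Replace characters that would break the quoted filename parameter.
  const safeName = baseName.replace(/[^A-Za-z0-9._-]/g, "_");
  return {
    "Content-Type": spec.mimeType,
    "Content-Disposition": `attachment; filename="${safeName}.${spec.extension}"`,
  };
}
```

Adding a new content type then means adding one registry entry rather than touching every preview and download code path.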

Differentiators (Competitive Advantage)

Features that distinguish Nexus content generation from Canva/Pitch/generic generators.

Feature	Value Proposition	Complexity	Notes
Agent-driven generation from chat	User describes what they want in natural language; the agent selects the right content type, calls the generation skill, and delivers a file — no form-filling	HIGH	Requires agent skill routing: agent detects "make me a diagram of the user flow" → invokes `diagram-generation` skill → returns SVG file attachment in chat; skill selection uses existing Nexus Skill Aggregator
Content types as installable skills	Each generator (diagrams, PDFs, wallpapers, presentations) is a separate installable skill, not a monolithic feature — users install only what they need	HIGH	Follow existing `skills/paperclip/` pattern; each skill has `SKILL.md`, optional `references/`, and is registered via company skills API; agents get assigned the relevant skills
PLACEHOLDERS.md manifest integration	Every generated asset — including draft placeholders — is tracked in the existing PLACEHOLDERS.md manifest with status (draft/final), source prompt, and generator version	MEDIUM	Extend existing manifest format; add `generator`, `prompt_hash`, `generated_at` fields; draft assets get DRAFT watermark applied at generation time
Seed-color-to-full-theme pipeline	User provides one hex color; system generates a complete accessible color system (primary, secondary, accent, neutral, semantic colors) with WCAG AA compliance verified	HIGH	Use OKLCH/LCh color model for perceptually uniform lightness; verify 4.5:1 contrast for text pairs; export to CSS custom properties, Tailwind config, and JSON token format
WCAG AA enforced, not optional	Theme generator refuses to export palettes that fail contrast requirements — accessibility is baked in, not a checkbox	MEDIUM	Run contrast check on every text/background pair; if a combination fails, auto-adjust lightness until compliant; show contrast ratio in preview
Diagram from natural language	User says "flowchart of the auth pipeline" in chat; agent generates Mermaid syntax, renders SVG, attaches to the conversation	MEDIUM	LLM generates valid Mermaid; `@mermaid-js/mermaid-cli` renders server-side; fallback: return raw Mermaid for user to copy into mermaid.live; store `.mmd` source + rendered `.svg`
Local-only operation	All generation (Remotion render, PDF, Mermaid, Satori images) runs on Mac Mini M4 without cloud API calls; no data leaves the machine	MEDIUM	Reject cloud-dependent generators (no DALL-E, no Stable Diffusion API); prefer deterministic generators (Mermaid, Satori, Remotion, Playwright PDF) that need only the local LLM + local tools
Branding kit from conversation	User chats about their project; agent extracts brand DNA (colors, typography, tone) and produces a coherent brand kit — no form, no design background required	HIGH	Multi-step: LLM extracts brand parameters → theme generator → typography selector → icon style picker → assembles ZIP with CSS tokens, font stack, sample SVG logo, OG image template
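
The contrast enforcement described in the table can be illustrated with the WCAG 2.x contrast-ratio formula. The sketch below is an assumption-laden simplification: it adjusts a failing color by mixing it toward black or white in sRGB, which stands in for the OKLCH lightness adjustment the table calls for, and the function names are hypothetical.

```typescript
// Parse "#rrggbb" into 0-255 channels.
function parseHex(hex: string): number[] {
  const m = /^#?([0-9a-f]{6})$/i.exec(hex);
  if (!m) throw new Error(`Expected a 6-digit hex color, got "${hex}"`);
  const n = parseInt(m[1], 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

// WCAG 2.x relative luminance of an sRGB color.
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex).map((v) => {
    const c = v / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// WCAG 2.x contrast ratio: (lighter + 0.05) / (darker + 0.05).
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Mix a color toward black (0) or white (255) by fraction t.
function mix(hex: string, target: number, t: number): string {
  return "#" + parseHex(hex)
    .map((v) => Math.round(v + (target - v) * t))
    .map((v) => ("0" + v.toString(16)).slice(-2))
    .join("");
}

// Simplified stand-in for the OKLCH lightness loop: step the foreground
// toward the extreme that contrasts more with the background until it passes.
function adjustForContrast(fg: string, bg: string, min = 4.5): string {
  const target = relativeLuminance(bg) > 0.179 ? 0 : 255;
  for (let step = 0; step <= 20; step++) {
    const candidate = mix(fg, target, step / 20);
    if (contrastRatio(candidate, bg) >= min) return candidate;
  }
  return target === 0 ? "#000000" : "#ffffff";
}
```

A real implementation would likely keep hue and chroma fixed in OKLCH and change only lightness, so adjusted colors stay closer to the seed than this sRGB mix does.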

Anti-Features (Commonly Requested, Often Problematic)

Feature	Why Requested	Why Problematic	Alternative
Image generation via Stable Diffusion or DALL-E	"Make me an illustration of..." seems like a natural content type	SD requires GPU VRAM (conflicts with LLM VRAM budget on M4); DALL-E is cloud, data leaves machine; output quality is non-deterministic and hard to brand-consistently	Deterministic vector tools: Satori for OG images/banners, icon description → SVG path via LLM (text-to-SVG), wallpaper = CSS gradient/pattern composition; no raster AI images in v1.7
Real-time collaborative editing of generated content	"Let the agent iterate with me live"	Requires a full rich-text or canvas editor (collaborative editing is a product in itself); far outside scope	Chat-and-regenerate loop: show output, accept feedback as text, regenerate — no in-place editing
Font embedding in all output formats	"I want my brand font in the PDF and video"	Font licensing for system-level embed is complex; font subsetting in PDF requires careful handling; Remotion font loading has SSR implications	Use system-safe font stacks for PDF (Helvetica, Times, Courier are embed-safe); Remotion uses `@remotion/google-fonts` for web-safe options; custom font is a v2 concern
Batch generation (50 social posts at once)	"Generate a month of content in one click"	Job queue depth, disk space, and UI feedback for 50 concurrent generation jobs is significant infrastructure work	Single-at-a-time generation with a "generate next variant" button; queue infrastructure is a v2 concern
Auto-publish to social platforms	"Post directly to Twitter/LinkedIn"	OAuth token management per platform, platform API rate limits, legal liability for AI-generated content posted as the user	Download + manual post; provide platform-formatted file with exact recommended dimensions; no publishing API integration in v1.7
Template marketplace / sharing	"Share my Remotion template with others"	Multi-user/multi-workspace concerns; the Nexus model is single-workspace, single-user	Templates stored in workspace file system under `templates/`; user can git-push to share; no marketplace infrastructure
Animated / lottie social content	"Animated post for Instagram stories"	Lottie export from Remotion is possible but adds significant complexity; Instagram animated format requirements are strict	Static images for social in v1.7; Remotion video export covers the animation use case separately
AI logo design (raster output)	"Generate my company logo"	AI raster logos are non-scalable and inconsistent across regenerations; brand identity requires reproducibility	SVG icon generation from description using LLM-as-code (the LLM writes SVG path code); deterministic, scalable, reproducible

Feature Dependencies

Content Skill System (foundation)
    └──required-by──> All content types (diagrams, PDFs, presentations, themes, icons)
    └──requires──> Existing Skill Aggregator + company skills API [already built]
    └──requires──> Agent skill routing (chat → skill invocation → file attachment)

Diagram Generation
    └──requires──> @mermaid-js/mermaid-cli (server-side render to SVG/PNG)
    └──requires──> LLM to generate valid Mermaid syntax
    └──produces──> SVG + raw .mmd source → saved to file system

PDF Generation
    └──requires──> Playwright (headless Chromium) OR pdf-lib (programmatic)
    └──choice: Playwright for HTML-template-based PDFs (reports, one-pagers)
    └──choice: pdf-lib for programmatic PDFs (invoices, receipts with data)
    └──produces──> .pdf file → saved to file system

Theme Generator
    └──requires──> Color math library (chroma-js or culori for OKLCH)
    └──requires──> WCAG contrast calculation (wcag-color-contrast or manual APCA)
    └──produces──> CSS custom properties file + JSON tokens + Tailwind config
    └──consumed-by──> Branding media kit (uses theme as input)

Remotion Presentations + Video
    └──requires──> @remotion/renderer (server-side render to MP4/still frames)
    └──requires──> Node.js >=18 (already met)
    └──requires──> ffmpeg-static (already in stack from v1.6 for audio; reused for video)
    └──produces──> .mp4 or .png stills → saved to file system
    └──optionally-uses──> Theme generator (colors, typography from brand kit)

Wallpapers + Visual Assets (OG images, banners, social headers)
    └──requires──> Satori (HTML/CSS → SVG) + Sharp (SVG → PNG, resize)
    └──requires──> Platform dimension registry (OG: 1200×630, LinkedIn: 1584×396, etc.)
    └──produces──> PNG files at multiple sizes → saved to file system
    └──optionally-uses──> Theme generator (brand colors)

Icon Generation (SVG)
    └──requires──> LLM to generate SVG path code from description
    └──no external rendering lib needed (SVG is text)
    └──produces──> .svg files → saved to file system
    └──consumed-by──> Branding media kit

Social Media Content
    └──requires──> Wallpaper/banner generator (for image posts)
    └──requires──> LLM (for copy: captions, hashtags, platform-appropriate tone)
    └──requires──> Platform spec registry (image sizes, character limits per platform)
    └──produces──> Platform folder: {platform}/{size}.png + caption.txt + hashtags.txt

Branding Media Kit
    └──requires──> Theme generator (colors)
    └──requires──> Icon generator (SVG logo concept)
    └──requires──> Wallpaper generator (OG image, banner)
    └──requires──> LLM (typography pairing, brand voice, tagline)
    └──produces──> ZIP archive: brand-kit.zip containing all assets + CSS tokens

Placeholder Asset System
    └──requires──> File system with PLACEHOLDERS.md [already built]
    └──requires──> Any generator (diagram, wallpaper, PDF) to set draft flag
    └──produces──> Asset file with DRAFT watermark + PLACEHOLDERS.md entry
    └──resolves-via──> "generate final" command removes watermark, updates manifest

Dependency Notes

Content Skill System is the foundation. Every content type is a skill. If the skill routing pattern is not established first, each content type becomes a disconnected one-off endpoint.
Satori + Sharp is the image stack for all 2D raster outputs. Do not introduce a separate image library per content type — Satori handles the JSX/CSS layout, Sharp handles PNG conversion and resizing. One pipeline for wallpapers, OG images, social headers, and banner generation.
Playwright for HTML-template PDFs, pdf-lib for data-driven PDFs. Do not use a single library for both — Playwright is better for design-rich output, pdf-lib is better for invoices and receipts. Use the right tool per use case.
ffmpeg-static already in v1.6 stack. Remotion's video pipeline reuses it — do not add a second FFmpeg dependency.
Branding media kit is a composition of other skills. It is not a standalone generator; it orchestrates theme → icons → wallpapers → copy and zips the outputs.
PLACEHOLDERS.md integration is cross-cutting. Every content generator must write to the manifest on save; this is not optional per the v1.7 milestone requirements.
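
Because the branding kit orchestrates other skills, the run order can be derived from the dependency tree instead of being hard-coded. The following is a rough sketch under assumptions: the generator identifiers and the `REQUIRES` map are hypothetical names, with edges transcribed from the tree above (the wallpaper generator's theme dependency is listed there as optional).

```typescript
// generator -> generators it needs to have run first (hypothetical identifiers).
const REQUIRES: Record<string, string[]> = {
  "theme": [],
  "icons": [],
  "wallpapers": ["theme"],
  "social": ["wallpapers"],
  "brand-kit": ["theme", "icons", "wallpapers"],
};

// Depth-first topological sort: returns the target and everything it
// depends on, each generator listed after its own dependencies.
function buildOrder(target: string, graph: Record<string, string[]>): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (node: string): void => {
    if (state.get(node) === "done") return;
    if (state.get(node) === "visiting") throw new Error(`Dependency cycle at "${node}"`);
    const deps = graph[node];
    if (deps === undefined) throw new Error(`Unknown generator "${node}"`);
    state.set(node, "visiting");
    deps.forEach(visit);
    state.set(node, "done");
    order.push(node);
  };

  visit(target);
  return order;
}
```

For the `brand-kit` target this yields theme, icons, wallpapers, then the kit itself, which matches the theme → icons → wallpapers sequence in the notes.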

MVP Definition

Launch With (v1.7 Milestone — P1)

The minimum set to make content generation genuinely useful as a daily workflow tool.

Content skill system scaffolding — Skill registration pattern, agent routing, file attachment to chat; gates everything else
Diagram generation — NL → Mermaid → SVG/PNG; most requested, lowest complexity, immediate productivity value for a developer
Theme and palette generator — Seed color → full color system with WCAG AA; exports CSS tokens + JSON; standalone value even without other generators
Placeholder asset system — DRAFT watermark on any generated file + PLACEHOLDERS.md entry; prevents generated assets from being accidentally shipped unreviewed
PDF generation — Playwright-based HTML → PDF for reports/one-pagers; pdf-lib for invoices; solves a concrete recurring task
Wallpapers and OG images — Satori + Sharp pipeline; produces desktop wallpaper, OG image, and LinkedIn/Twitter header from a single theme config

Add After Validation (v1.7.x — P2)

Icon generation (SVG) — LLM-as-SVG-coder; trigger: user asks for consistent icon set for a project
Social media content — Platform-formatted posts + captions; trigger: user has a completed project and needs to announce it
Remotion presentations — React-component slides → MP4/stills; trigger: user needs a pitch deck or demo video; requires a render queue because rendering is CPU- and RAM-intensive on M4 (Remotion does not use VRAM)

Future Consideration (v2+ — P3)

Branding media kit — Full brand kit ZIP; requires all other generators to be stable first; high coordination cost
Batch generation — Multiple variants or sizes at once; requires job queue infrastructure
Template library — Reusable Remotion/Satori templates stored in workspace
Font embedding — Custom font in PDF and video; requires font licensing audit and subsetting

Feature Prioritization Matrix

Feature	User Value	Implementation Cost	Priority
Content skill system	HIGH	MEDIUM	P1
Diagram generation	HIGH	LOW	P1
Theme + palette generator	HIGH	MEDIUM	P1
Placeholder asset system	MEDIUM	LOW	P1
PDF generation	HIGH	MEDIUM	P1
Wallpapers + OG images (Satori)	MEDIUM	MEDIUM	P1
Icon generation (SVG)	MEDIUM	LOW	P2
Social media content	MEDIUM	MEDIUM	P2
Remotion presentations + video	HIGH	HIGH	P2
Branding media kit	HIGH	HIGH	P3
Batch generation	LOW	HIGH	P3
Template library	MEDIUM	MEDIUM	P3

Priority key:

P1: Must have for v1.7 launch
P2: Should have; add when P1 is stable
P3: Future milestone

Content Type Profiles

Detailed breakdown of what each content type requires and delivers.

Diagram Generation

User trigger: "Draw me a sequence diagram of the auth flow" in chat
Input: Natural language description
Output: .svg + .mmd (Mermaid source) files
Generator: LLM → Mermaid syntax → @mermaid-js/mermaid-cli run() API
Preview: Inline SVG in chat bubble
Complexity: LOW; Mermaid CLI has a Node.js programmatic API, and LLMs are good at generating valid Mermaid
Risk: The LLM occasionally produces invalid Mermaid syntax; validate and retry, or surface the raw .mmd for the user to fix in mermaid.live
Platform spec: Vector SVG has no size constraint; PNG export at 2x for retina via --scale 2
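
The validate-before-render step named under Risk could start with a cheap pre-check before the Mermaid CLI is invoked. This sketch is only a heuristic and does not replace real parsing: `extractMermaid` and `looksLikeMermaid` are hypothetical helpers, and the keyword list is a partial set of Mermaid diagram types.

```typescript
// Three backticks, built at runtime so this example contains no literal fence.
const FENCE = "`".repeat(3);

// Partial list of Mermaid diagram keywords; extend as needed.
const DIAGRAM_KEYWORDS = [
  "flowchart", "graph", "sequenceDiagram", "classDiagram",
  "stateDiagram", "erDiagram", "gantt", "pie",
];

// LLMs often wrap their answer in a fenced block with surrounding chatter;
// return only the body of the first fenced block, or the trimmed input.
function extractMermaid(llmOutput: string): string {
  const start = llmOutput.indexOf(FENCE);
  if (start === -1) return llmOutput.trim();
  const bodyStart = llmOutput.indexOf("\n", start);
  const end = llmOutput.indexOf(FENCE, start + FENCE.length);
  if (bodyStart === -1 || end === -1 || end < bodyStart) return llmOutput.trim();
  return llmOutput.slice(bodyStart + 1, end).trim();
}

// Heuristic: the first non-empty, non-comment line must open with a known
// diagram keyword (for example "flowchart TD" or "stateDiagram-v2").
function looksLikeMermaid(source: string): boolean {
  const firstLine = source
    .split("\n")
    .map((line) => line.trim())
    .find((line) => line.length > 0 && !line.startsWith("%%"));
  if (firstLine === undefined) return false;
  return DIAGRAM_KEYWORDS.some(
    (k) => firstLine === k || firstLine.startsWith(k + " ") || firstLine.startsWith(k + "-"),
  );
}
```

Output that fails this check can trigger a retry prompt immediately, which avoids spending a render attempt on text that is clearly not a diagram.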

Theme + Palette Generator

User trigger: "Generate a color theme from #2563EB" or "Create a dark theme for my portfolio"
Input: Seed hex color + optional: mode (light/dark/both), style (minimal/vibrant/muted)
Output: theme.css (CSS custom properties), theme.json (design tokens), tailwind.config.ts
Generator: Server-side color math using OKLCH/LCh model; no LLM required for color generation; LLM assists with labeling semantic colors (primary, danger, success)
Preview: Live color swatches with contrast ratio overlay in UI
Complexity: MEDIUM; OKLCH color math is non-trivial, and WCAG AA enforcement requires an iteration loop
WCAG AA rule: Every text/background combination must hit 4.5:1 contrast ratio; auto-adjust lightness until compliant
Exports: Three formats (CSS, JSON, Tailwind) from same data model
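
The "three formats from the same data model" idea might look like the sketch below. It assumes a flat token map and a `--color-` variable prefix, both hypothetical; the Tailwind output is just a colors object that points at the CSS variables and would be placed under `theme.extend.colors` in a config file.

```typescript
// Hypothetical data model: token name -> CSS color value.
type ThemeTokens = Record<string, string>;

// CSS custom properties, sorted by name for stable diffs in git.
function toCssCustomProperties(tokens: ThemeTokens, selector = ":root"): string {
  const lines = Object.keys(tokens)
    .sort()
    .map((name) => `  --color-${name}: ${tokens[name]};`);
  return `${selector} {\n${lines.join("\n")}\n}\n`;
}

// JSON design tokens, grouped under a "color" key.
function toJsonTokens(tokens: ThemeTokens): string {
  return JSON.stringify({ color: tokens }, null, 2);
}

// Tailwind colors object that references the CSS variables, so the
// CSS file stays the single source of the actual values.
function toTailwindColors(tokens: ThemeTokens): Record<string, string> {
  const colors: Record<string, string> = {};
  for (const name of Object.keys(tokens)) {
    colors[name] = `var(--color-${name})`;
  }
  return colors;
}
```

Keeping the exporters as pure functions over one token map means a contrast fix made upstream flows into all three files without per-format logic.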

Placeholder Asset System

User trigger: Any content generator can emit a "draft" asset; user explicitly marks an asset as a draft placeholder
Input: Any generated file + optional placeholder label
Output: File with diagonal DRAFT watermark (SVG overlay for images/PDFs, badge for other types) + PLACEHOLDERS.md entry
Generator: Post-processing step in every content pipeline, not a standalone generator
Manifest fields: path, type, status (draft/final), generator, prompt_hash, generated_at, resolved_at
Complexity: LOW; SVG watermark overlay is a compositing operation, and PLACEHOLDERS.md is an existing format to extend
Resolve flow: "finalize this asset" removes watermark, updates manifest status to final
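
As an illustration of the manifest fields and the resolve flow, the sketch below models one entry and the draft-to-final transition, plus a diagonal watermark overlay as an SVG string. The field names come from the list above; the `ManifestEntry` type, `finalize`, and `draftWatermarkSvg` are hypothetical, and how the entry is serialized into PLACEHOLDERS.md is not shown because that format is defined elsewhere.

```typescript
type AssetStatus = "draft" | "final";

interface ManifestEntry {
  path: string;
  type: string;
  status: AssetStatus;
  generator: string;
  prompt_hash: string;
  generated_at: string;        // ISO 8601 timestamp
  resolved_at: string | null;  // set when the asset is finalized
}

// Resolve flow: return a finalized copy; already-final entries are unchanged.
function finalize(entry: ManifestEntry, now: Date): ManifestEntry {
  if (entry.status === "final") return entry;
  return { ...entry, status: "final", resolved_at: now.toISOString() };
}

// Semi-transparent diagonal DRAFT label sized to the asset, meant to be
// composited over the generated image by the rendering pipeline.
function draftWatermarkSvg(width: number, height: number): string {
  const fontSize = Math.round(Math.min(width, height) / 4);
  return (
    `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">` +
    `<text x="50%" y="50%" text-anchor="middle" dominant-baseline="middle" ` +
    `font-family="Helvetica, Arial, sans-serif" font-size="${fontSize}" ` +
    `fill="#ff0000" fill-opacity="0.35" ` +
    `transform="rotate(-30 ${width / 2} ${height / 2})">DRAFT</text></svg>`
  );
}
```

Since removing a watermark from a flattened PNG is not possible, the "finalize" step in practice would re-render or keep the unwatermarked original alongside the draft.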

PDF Generation

Use case A — Design-rich reports and one-pagers: HTML template rendered with Playwright headless Chromium → PDF

Supports full CSS (flexbox, grid, custom fonts via @font-face)
Startup cost: 1–3s browser init; reuse browser instance across requests
Output: pixel-accurate PDF matching HTML preview

Use case B — Invoices, receipts, data tables: programmatic construction with pdf-lib

No browser dependency; pure Node.js; fast (<200ms for simple documents)
Supports: text positioning, tables, page breaks, image embedding
Output: structured PDF from data objects

User trigger: "Generate a project summary PDF" (→ Playwright) or "Create an invoice for client X" (→ pdf-lib)
Complexity: MEDIUM (Playwright browser lifecycle management; pdf-lib API for data-driven docs)
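
The two-engine split and the "reuse browser instance" advice can be sketched without touching either library. Everything here is hypothetical naming: `PdfRequest`, `choosePdfEngine`, and `lazyOnce` are illustrative, and in a real service the factory passed to `lazyOnce` would presumably be the call that launches headless Chromium.

```typescript
// Discriminated union: the document kind determines which engine applies.
type PdfRequest =
  | { kind: "report" | "one-pager"; html: string }
  | { kind: "invoice" | "receipt"; data: Record<string, unknown> };

type PdfEngine = "playwright" | "pdf-lib";

// Design-rich HTML templates go to the browser; data-driven docs go to pdf-lib.
function choosePdfEngine(req: PdfRequest): PdfEngine {
  switch (req.kind) {
    case "report":
    case "one-pager":
      return "playwright";
    case "invoice":
    case "receipt":
      return "pdf-lib";
  }
}

// Create an expensive resource on first use and reuse it afterwards,
// so the 1-3s browser startup is paid once rather than per request.
function lazyOnce<T>(create: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) cached = { value: create() };
    return cached.value;
  };
}
```

With this shape, an invoice request never triggers a browser launch at all, which keeps the fast path fast.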

Wallpapers + Visual Assets

Scope: Desktop wallpaper (2560×1440, 3840×2160), mobile wallpaper (1080×1920), OG image (1200×630), LinkedIn banner (1584×396), Twitter/X header (1500×500)
User trigger: "Generate a wallpaper for my project" or "Create an OG image with the project name"
Input: Theme colors + project name/tagline + optional layout style
Output: PNG files at each requested size
Generator: Satori (JSX → SVG) → Sharp (SVG → PNG, resize per platform spec)
Preview: Thumbnail grid in UI; click to view full size; download individual or as ZIP
Complexity: MEDIUM; Satori supports only a subset of CSS (display: grid is not available in all versions, so use flexbox); Sharp handles the raster conversion

Icon Generation (SVG)

User trigger: "Create an icon for notifications" or "Generate a 5-icon set for the nav bar"
Input: Icon description + optional style (outline/filled/duotone) + size (24px/32px/48px)
Output: .svg file(s) with clean <svg> structure
Generator: LLM generates SVG path code directly; this is a text-to-code task, not image generation
Preview: Rendered SVG inline in chat; displayed at 24, 48, 96px to show scalability
Complexity: LOW; modern LLMs (Claude, GPT-4) reliably generate clean SVG paths for simple icons, while more complex icons need iteration
Consistency rule: Generate entire sets in one LLM call with style instructions; icons generated separately look inconsistent
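
Because the icon is LLM-written markup that ends up rendered inline in chat, a structural check before saving seems prudent. The sketch below is a heuristic, regex-based screen and not a full SVG parser or sanitizer; `checkIconSvg` and the specific rules are assumptions about what "clean <svg> structure" should mean.

```typescript
interface SvgCheck {
  ok: boolean;
  problems: string[];
}

// Heuristic screen for LLM-generated icon markup.
function checkIconSvg(svg: string): SvgCheck {
  const problems: string[] = [];
  const s = svg.trim();

  if (!/^<svg[\s>]/i.test(s)) problems.push("does not start with <svg>");
  if (!/<\/svg>$/i.test(s)) problems.push("does not end with </svg>");
  // A viewBox is what lets one file scale to 24, 48, and 96px previews.
  if (!/\sviewBox\s*=\s*["'][^"']+["']/.test(s)) problems.push("missing viewBox");
  // Inline rendering makes script content and event handlers unsafe.
  if (/<script[\s>]/i.test(s)) problems.push("contains <script>");
  if (/\son[a-z]+\s*=/i.test(s)) problems.push("contains inline event handler");
  // Embedded bitmaps defeat the purpose of a vector icon.
  if (/<image[\s>]/i.test(s)) problems.push("embeds raster <image>");

  return { ok: problems.length === 0, problems };
}
```

The `problems` list can be fed back into the regeneration prompt, so the retry tells the model exactly what to fix.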

Social Media Content

Scope: Post image + caption + hashtags for LinkedIn, Twitter/X, Instagram (static only)
Platform specs:

LinkedIn: 1200×628px image, 3000 char limit, 3–5 hashtags
Twitter/X: 1200×675px image, 280 char limit (with image), 2–3 hashtags
Instagram: 1080×1080px (square), 2200 char limit, 10–30 hashtags

User trigger: "Create a launch announcement post for LinkedIn and Twitter"
Input: Project description/milestone + platform selection + tone (professional/casual/technical)
Output: Per platform: {platform}/image.png + {platform}/caption.txt + {platform}/hashtags.txt
Generator: Satori for platform image → LLM for caption + hashtags
Complexity: MEDIUM; platform spec registry is straightforward, and caption writing via LLM is reliable; image must be platform-safe (no text too close to edge)
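
The platform specs above translate naturally into a small registry that LLM output is checked against before files are written. This is a sketch under assumptions: `PLATFORM_SPECS` and `validatePost` are hypothetical names, the limits are copied from the list above, and caption length is measured with JavaScript string length, which may differ from how each platform counts characters.

```typescript
interface PlatformSpec {
  image: { width: number; height: number };
  maxChars: number;
  hashtags: { min: number; max: number };
}

type Platform = "linkedin" | "twitter" | "instagram";

const PLATFORM_SPECS: Record<Platform, PlatformSpec> = {
  linkedin: { image: { width: 1200, height: 628 }, maxChars: 3000, hashtags: { min: 3, max: 5 } },
  twitter: { image: { width: 1200, height: 675 }, maxChars: 280, hashtags: { min: 2, max: 3 } },
  instagram: { image: { width: 1080, height: 1080 }, maxChars: 2200, hashtags: { min: 10, max: 30 } },
};

// Returns a list of human-readable problems; an empty list means the post fits.
// Only the caption is counted against maxChars here (an assumption).
function validatePost(platform: Platform, caption: string, hashtags: string[]): string[] {
  const spec = PLATFORM_SPECS[platform];
  const problems: string[] = [];
  if (caption.length > spec.maxChars) {
    problems.push(`caption is ${caption.length} chars; limit is ${spec.maxChars}`);
  }
  if (hashtags.length < spec.hashtags.min || hashtags.length > spec.hashtags.max) {
    problems.push(
      `${hashtags.length} hashtags; expected ${spec.hashtags.min}-${spec.hashtags.max}`,
    );
  }
  return problems;
}
```

A failed check is a natural trigger for the regenerate flow: the problems can be appended to the prompt so the next caption is shortened or re-tagged.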

Remotion Presentations + Video

User trigger: "Create a 2-minute pitch deck video" or "Generate slides for the project demo"
Input: Slide content (title, bullets, code snippets) + theme + duration estimate
Output: .mp4 (for video) or PNG stills per slide (for presentation mode)
Generator: @remotion/renderer renderMedia() API; server-side, no browser UI needed
Preview: Remotion Player component in UI for interactive playback before export
Complexity: HIGH; Remotion rendering is CPU-intensive (it renders on the CPU and needs no GPU on the M4); render time for a 2-min video is ~30–90s on M4; a render queue is required
VRAM note: Remotion does NOT use GPU/VRAM; pure CPU/RAM render; does not compete with LLM VRAM budget
ffmpeg reuse: Remotion uses ffmpeg internally for video encoding; ffmpeg-static already in the v1.6 stack satisfies this
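
Turning slide content plus a duration estimate into a video timeline is mostly frame arithmetic. The sketch below assumes per-slide durations in seconds and a fixed frame rate; `Slide`, `planTimeline`, and the result shape are hypothetical, though the `from` and `durationInFrames` names are chosen to mirror the props Remotion uses for sequences and compositions.

```typescript
interface Slide {
  title: string;
  seconds: number;
}

interface SlideTiming {
  title: string;
  from: number;             // first frame of this slide
  durationInFrames: number; // how many frames the slide stays on screen
}

// Lay slides end to end and return each slide's frame window plus the total.
function planTimeline(
  slides: Slide[],
  fps: number,
): { timings: SlideTiming[]; durationInFrames: number } {
  let cursor = 0;
  const timings = slides.map((slide) => {
    // Every slide gets at least one frame, even with a zero duration.
    const durationInFrames = Math.max(1, Math.round(slide.seconds * fps));
    const timing = { title: slide.title, from: cursor, durationInFrames };
    cursor += durationInFrames;
    return timing;
  });
  return { timings, durationInFrames: cursor };
}
```

At 30 fps a 2-minute deck is 3600 frames, which is a useful number to show in the progress feedback while the render runs.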

Branding Media Kit (v2 — complex coordination)

Output: brand-kit.zip containing: colors/theme.css, colors/theme.json, icons/logo.svg, icons/favicon.svg, images/og-image.png, images/banner-linkedin.png, images/banner-twitter.png, images/wallpaper-desktop.png, typography/font-stack.css, copy/brand-voice.md, copy/tagline.txt
Generator: Orchestrator agent coordinates all sub-generators in sequence
Complexity: HIGH; coordination of 6 generators with shared state (theme colors must flow through all visual assets)

Competitor Feature Analysis

Feature	Canva / Pitch	Mermaid Live / Eraser	Figma Tokens Studio	Nexus v1.7 Approach
Content type	Raster images, slides	Diagrams only	Design tokens only	All types via skills
AI integration	Prompt-to-design (cloud)	None / limited	None	Chat-driven, local LLM
Offline / local	No	No	No	Fully local on M4
Skill installability	Monolithic product	Standalone tool	Figma plugin	Per-type installable skills
File ownership	Cloud-locked	Export only	Figma-locked	Local file system, git-versioned
WCAG enforcement	Optional check	N/A	Via plugin	Enforced at generation
PLACEHOLDERS.md	N/A	N/A	N/A	Native; draft tracking built in
Agent-driven	No	No	No	Core UX: chat → deliverable

Platform Dimension Registry

Used by wallpaper generator, social content, and branding kit.

Asset Type	Width	Height	Notes
OG Image	1200	630	Universal (Facebook, LinkedIn, Twitter)
LinkedIn Banner	1584	396	Center-safe zone; edges cropped on mobile
Twitter/X Header	1500	500	3:1 aspect ratio
YouTube Banner	2560	1440	Safe zone: center 1546×423
Instagram Square	1080	1080	1:1
Desktop Wallpaper	2560	1440	Standard; also offer 3840×2160
Mobile Wallpaper	1080	1920	9:16
Favicon	32	32	SVG preferred; PNG fallback
Apple Touch Icon	180	180	PNG only
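
One way to use this registry is to render a single large master image and derive each platform size by scaling and center-cropping. The sketch below only computes the geometry; `DIMENSIONS` holds a few rows from the table above, `coverCrop` is a hypothetical helper, and the actual resize and crop would be done by the image library.

```typescript
interface Size {
  width: number;
  height: number;
}

// A subset of the table above, keyed by hypothetical asset identifiers.
const DIMENSIONS: Record<string, Size> = {
  "og-image": { width: 1200, height: 630 },
  "linkedin-banner": { width: 1584, height: 396 },
  "twitter-header": { width: 1500, height: 500 },
  "instagram-square": { width: 1080, height: 1080 },
  "desktop-wallpaper": { width: 2560, height: 1440 },
  "mobile-wallpaper": { width: 1080, height: 1920 },
};

// "Cover" fit: scale the source until it fills the target in both
// dimensions, then crop the overflow equally from both sides.
function coverCrop(src: Size, dst: Size): Size & { left: number; top: number } {
  const scale = Math.max(dst.width / src.width, dst.height / src.height);
  const width = Math.round(src.width * scale);
  const height = Math.round(src.height * scale);
  return {
    width,
    height,
    left: Math.floor((width - dst.width) / 2),
    top: Math.floor((height - dst.height) / 2),
  };
}
```

Center-cropping works for wide banners only if the master keeps important content near the middle, which is the same constraint the LinkedIn and YouTube safe-zone notes describe.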

Generation Job Lifecycle

All content generation follows this status machine to enable consistent UI feedback:

queued → generating → ready → (draft → final via placeholder system)
                   ↘ error (with structured reason + suggestion)

queued: Job accepted; worker not yet started
generating: Active work; emit SSE progress events with % or step label
ready: File available; preview URL returned; download URL available
draft: File saved with DRAFT watermark; PLACEHOLDERS.md entry created
final: User confirmed; watermark removed; manifest updated
error: Structured error with reason + actionable suggestion
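
The status machine above can be enforced in code so that no generator skips or invents a state. This is a minimal sketch of one reading of the diagram: it assumes `error` is reachable only from `generating` and that `ready` always passes through `draft` before `final`; the `Job` shape and `transition` function are hypothetical, while the error fields follow the structured error described under Table Stakes.

```typescript
type JobStatus = "queued" | "generating" | "ready" | "draft" | "final" | "error";

// Allowed next states for each state (assumed reading of the diagram).
const TRANSITIONS: Record<JobStatus, JobStatus[]> = {
  queued: ["generating"],
  generating: ["ready", "error"],
  ready: ["draft"],
  draft: ["final"],
  final: [],
  error: [],
};

interface JobError {
  error: string;      // machine-readable reason
  message: string;    // human-readable explanation
  suggestion: string; // what the user can try next
}

interface Job {
  id: string;
  status: JobStatus;
  error?: JobError;
}

// Return a new job in the next state, rejecting transitions the diagram
// does not allow and error states that lack a structured reason.
function transition(job: Job, next: JobStatus, error?: JobError): Job {
  if (TRANSITIONS[job.status].indexOf(next) === -1) {
    throw new Error(`Illegal transition: ${job.status} -> ${next}`);
  }
  if (next === "error" && error === undefined) {
    throw new Error("Transition to error requires a structured error");
  }
  return { id: job.id, status: next, error: next === "error" ? error : undefined };
}
```

Centralizing the table also gives the UI one place to ask which actions are valid for a job, for example showing "finalize" only on drafts.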

Sources

Remotion — Make videos programmatically — server-side rendering, Remotion Player, @remotion/renderer API
Remotion GitHub — Remotion Skills (January 2026), CPU rendering confirmed
Mermaid CLI npm — @mermaid-js/mermaid-cli — programmatic run() API, Node.js >=18, SVG/PNG/PDF output
Satori GitHub — vercel/satori — HTML/CSS to SVG; flexbox subset; use with Sharp for PNG
Social media image sizes 2026 — SocialSizes.io — platform dimension registry
Social media image sizes — Buffer 2026 — OG 1200×630 confirmed universal
Accessible Palette — accessiblepalette.com — OKLCH/LCh for perceptually uniform palette generation
InclusiveColors — WCAG accessible palette creator — WCAG AA enforcement pattern
Generating accessible color palettes — Canonical — APCA/WCAG algorithm approaches
How to Generate PDFs in 2025 — DEV Community — Playwright vs pdf-lib use case guidance
Puppeteer HTML to PDF — RisingStack — HTML-to-PDF pattern; applies to Playwright equivalent
AI SVG Icon Generator — DEV Community — LLM-as-SVG-coder pattern validated
Tracking designs using watermarks — Atlassian — DRAFT watermark status pattern in design workflow
Branding with AI — BrandForge — brand kit component structure (logo, palette, typography, voice, templates)
Mermaid Chart export guide — SVG, PNG, MMD export options confirmed

Feature research for: Nexus v1.7 Content Generation
Researched: 2026-04-04
