nexus/.planning/research/FEATURES.md
2026-04-04 04:25:21 +00:00


Feature Research

Domain: Content Generation Layer (Nexus v1.7), in which agents produce visual, document, and media deliverables
Researched: 2026-04-04
Confidence: MEDIUM-HIGH. Technology capabilities were verified via docs and ecosystem research; UX expectations are inferred from comparable tools (Canva, Pitch, Mermaid Live, Figma tokens); skill system patterns are based on the existing Nexus skill architecture.


Milestone Scope

This document covers only NEW features in v1.7. The following are already built and are dependencies, not deliverables:

  • File system with upload, git versioning, PLACEHOLDERS.md manifest (v1.3)
  • Skill system with Skill Aggregator and company skills API (Paperclip upstream)
  • Chat interface with streaming SSE (v1.3)
  • Agent orchestration, heartbeat lifecycle (Paperclip upstream)
  • Voice I/O with Whisper STT + Piper TTS (v1.6)
  • Hermes adapter with native skills, Ollama integration (v1.4)

New features being researched:

  • Presentations and video generation via Remotion
  • Placeholder assets with DRAFT styling and manifest tracking
  • Theme and palette generator (seed color → full theme, WCAG AA, exports)
  • Wallpapers and visual assets (desktop/mobile, banners, OG images)
  • Diagram generation (natural language → Mermaid → SVG/PNG)
  • Document generation (PDF reports, invoices, one-pagers)
  • Icon generation (SVG from description, consistent sets)
  • Social media content (platform-formatted posts, carousels, hashtags)
  • Branding media kit (full brand identity from conversation)
  • Content types as installable Nexus skills

Feature Landscape

Table Stakes (Users Expect These)

Features that must exist for content generation to feel complete. If any of these is missing, the deliverable is not production-ready.

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| Download produced file directly | Every generator (Canva, Pitch, Remotion) lets you download the output; without a download, the tool is a preview, not a generator | LOW | Route: `GET /api/content/:jobId/download`; serve the file buffer with the correct MIME type and `Content-Disposition: attachment` |
| Show a preview of the output | Users need to see what was generated before downloading; a blank "file ready" message is not sufficient | MEDIUM | For images/SVG: `<img>` or inline SVG; for PDF: `<iframe src="...">` or pdf.js; for Remotion: Remotion Player component; for diagrams: rendered SVG inline |
| Generation status feedback | Content generation jobs (PDF, Remotion render, Mermaid) can take 5-60s; a spinner with no progress info creates anxiety | LOW | SSE progress events or polling with status enum: queued → generating → ready → error; show estimated time where possible |
| Error recovery with explanation | If generation fails, the user needs to know why and what to try next | LOW | Return structured error: `{ error: "model_not_installed", message: "Ollama llava model required for image generation", suggestion: "Run: ollama pull llava" }` |
| Save output to file system | Generated files belong in the existing file system so they participate in git versioning and the PLACEHOLDERS.md manifest | MEDIUM | Integrate with the existing file upload pipeline: write the generated file to the workspace directory, call the git versioning hook, update PLACEHOLDERS.md |
| Re-generate with revised prompt | Users iterate on generated content; "generate and you're done" misses 80% of actual workflow | LOW | Store generation parameters (prompt, theme, options) with the job; "regenerate" re-calls with the same params + optional overrides |
| Content type labeled clearly | A diagram, a PDF, and a video are fundamentally different outputs; conflating them in one UI creates confusion | LOW | Each content type has a distinct icon, label, and preview strategy; use a type registry, not ad hoc conditionals |
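
The type-registry and download notes in the table above can be sketched as one lookup table plus a header helper. This is a minimal sketch: the four type names, the `preview` labels, and the `downloadHeaders` helper are illustrative assumptions, not the Nexus schema.

```typescript
// Hypothetical content-type registry: one entry per kind of generated output.
type PreviewStrategy = "img" | "inline-svg" | "pdf-frame" | "video-player";
type ContentTypeName = "diagram" | "image" | "document" | "video";

interface ContentTypeEntry {
  label: string;            // shown next to the type icon in the UI
  mimeType: string;         // sent as Content-Type on download
  extension: string;        // appended to the download filename
  preview: PreviewStrategy; // which preview component to mount
}

const CONTENT_TYPES: Record<ContentTypeName, ContentTypeEntry> = {
  diagram: { label: "Diagram", mimeType: "image/svg+xml", extension: "svg", preview: "inline-svg" },
  image: { label: "Image", mimeType: "image/png", extension: "png", preview: "img" },
  document: { label: "Document", mimeType: "application/pdf", extension: "pdf", preview: "pdf-frame" },
  video: { label: "Video", mimeType: "video/mp4", extension: "mp4", preview: "video-player" },
};

// Builds the response headers for the download route from the registry entry.
function downloadHeaders(type: ContentTypeName, baseName: string): Record<string, string> {
  const entry = CONTENT_TYPES[type];
  // Replace characters that would break the quoted filename parameter.
  const safeName = baseName.replace(/[^A-Za-z0-9._-]/g, "_");
  return {
    "Content-Type": entry.mimeType,
    "Content-Disposition": `attachment; filename="${safeName}.${entry.extension}"`,
  };
}
```

Adding a new content type then means adding one registry entry; the download route and the preview switch both read from the same record.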

Differentiators (Competitive Advantage)

Features that distinguish Nexus content generation from Canva/Pitch/generic generators.

| Feature | Value Proposition | Complexity | Notes |
|---|---|---|---|
| Agent-driven generation from chat | User describes what they want in natural language; the agent selects the right content type, calls the generation skill, and delivers a file, with no form-filling | HIGH | Requires agent skill routing: agent detects "make me a diagram of the user flow" → invokes diagram-generation skill → returns SVG file attachment in chat; skill selection uses the existing Nexus Skill Aggregator |
| Content types as installable skills | Each generator (diagrams, PDFs, wallpapers, presentations) is a separate installable skill, not a monolithic feature; users install only what they need | HIGH | Follow the existing skills/paperclip/ pattern; each skill has SKILL.md, optional references/, and is registered via the company skills API; agents get assigned the relevant skills |
| PLACEHOLDERS.md manifest integration | Every generated asset, including draft placeholders, is tracked in the existing PLACEHOLDERS.md manifest with status (draft/final), source prompt, and generator version | MEDIUM | Extend the existing manifest format; add generator, prompt_hash, generated_at fields; draft assets get a DRAFT watermark applied at generation time |
| Seed-color-to-full-theme pipeline | User provides one hex color; system generates a complete accessible color system (primary, secondary, accent, neutral, semantic colors) with WCAG AA compliance verified | HIGH | Use the OKLCH/LCh color model for perceptually uniform lightness; verify 4.5:1 contrast for text pairs; export to CSS custom properties, Tailwind config, and JSON token format |
| WCAG AA enforced, not optional | Theme generator refuses to export palettes that fail contrast requirements; accessibility is baked in, not a checkbox | MEDIUM | Run a contrast check on every text/background pair; if a combination fails, auto-adjust lightness until compliant; show the contrast ratio in preview |
| Diagram from natural language | User says "flowchart of the auth pipeline" in chat; agent generates Mermaid syntax, renders SVG, attaches it to the conversation | MEDIUM | LLM generates valid Mermaid; @mermaid-js/mermaid-cli renders server-side; fallback: return raw Mermaid for the user to copy into mermaid.live; store .mmd source + rendered .svg |
| Local-only operation | All generation (Remotion render, PDF, Mermaid, Satori images) runs on a Mac Mini M4 without cloud API calls; no data leaves the machine | MEDIUM | Reject cloud-dependent generators (no DALL-E, no Stable Diffusion API); prefer deterministic generators (Mermaid, Satori, Remotion, Playwright PDF) that need only the local LLM + local tools |
| Branding kit from conversation | User chats about their project; agent extracts brand DNA (colors, typography, tone) and produces a coherent brand kit, with no form and no design background required | HIGH | Multi-step: LLM extracts brand parameters → theme generator → typography selector → icon style picker → assembles ZIP with CSS tokens, font stack, sample SVG logo, OG image template |

Anti-Features (Commonly Requested, Often Problematic)

| Feature | Why Requested | Why Problematic | Alternative |
|---|---|---|---|
| Image generation via Stable Diffusion or DALL-E | "Make me an illustration of..." seems like a natural content type | SD requires GPU VRAM (conflicts with the LLM VRAM budget on M4); DALL-E is cloud, so data leaves the machine; output is non-deterministic and hard to keep brand-consistent | Deterministic vector tools: Satori for OG images/banners, icon description → SVG path via LLM (text-to-SVG), wallpaper = CSS gradient/pattern composition; no raster AI images in v1.7 |
| Real-time collaborative editing of generated content | "Let the agent iterate with me live" | Requires a full rich-text or canvas editor (collaborative editing is a product in itself); far outside scope | Chat-and-regenerate loop: show output, accept feedback as text, regenerate; no in-place editing |
| Font embedding in all output formats | "I want my brand font in the PDF and video" | Font licensing for system-level embedding is complex; font subsetting in PDF requires careful handling; Remotion font loading has SSR implications | Use system-safe font stacks for PDF (Helvetica, Times, Courier are embed-safe); Remotion uses @remotion/google-fonts for web-safe options; custom fonts are a v2 concern |
| Batch generation (50 social posts at once) | "Generate a month of content in one click" | Job queue depth, disk space, and UI feedback for 50 concurrent generation jobs are significant infrastructure work | Single-at-a-time generation with a "generate next variant" button; queue infrastructure is a v2 concern |
| Auto-publish to social platforms | "Post directly to Twitter/LinkedIn" | OAuth token management per platform, platform API rate limits, legal liability for AI-generated content posted as the user | Download + manual post; provide a platform-formatted file with exact recommended dimensions; no publishing API integration in v1.7 |
| Template marketplace / sharing | "Share my Remotion template with others" | Multi-user/multi-workspace concerns; the Nexus model is single-workspace, single-user | Templates stored in the workspace file system under templates/; user can git-push to share; no marketplace infrastructure |
| Animated / Lottie social content | "Animated post for Instagram stories" | Lottie export from Remotion is possible but adds significant complexity; Instagram animated format requirements are strict | Static images for social in v1.7; Remotion video export covers the animation use case separately |
| AI logo design (raster output) | "Generate my company logo" | AI raster logos are non-scalable and inconsistent across regenerations; brand identity requires reproducibility | SVG icon generation from description using LLM-as-code (the LLM writes SVG path code); deterministic, scalable, reproducible |

Feature Dependencies

Content Skill System (foundation)
    └──required-by──> All content types (diagrams, PDFs, presentations, themes, icons)
    └──requires──> Existing Skill Aggregator + company skills API [already built]
    └──requires──> Agent skill routing (chat → skill invocation → file attachment)

Diagram Generation
    └──requires──> @mermaid-js/mermaid-cli (server-side render to SVG/PNG)
    └──requires──> LLM to generate valid Mermaid syntax
    └──produces──> SVG + raw .mmd source → saved to file system

PDF Generation
    └──requires──> Playwright (headless Chromium) OR pdf-lib (programmatic)
    └──choice: Playwright for HTML-template-based PDFs (reports, one-pagers)
    └──choice: pdf-lib for programmatic PDFs (invoices, receipts with data)
    └──produces──> .pdf file → saved to file system

Theme Generator
    └──requires──> Color math library (chroma-js or culori for OKLCH)
    └──requires──> WCAG contrast calculation (wcag-color-contrast or a manual WCAG 2.x relative-luminance calculation)
    └──produces──> CSS custom properties file + JSON tokens + Tailwind config
    └──consumed-by──> Branding media kit (uses theme as input)

Remotion Presentations + Video
    └──requires──> @remotion/renderer (server-side render to MP4/still frames)
    └──requires──> Node.js >=18 (already met)
    └──requires──> ffmpeg-static (already in stack from v1.6 for audio; reused for video)
    └──produces──> .mp4 or .png stills → saved to file system
    └──optionally-uses──> Theme generator (colors, typography from brand kit)

Wallpapers + Visual Assets (OG images, banners, social headers)
    └──requires──> Satori (HTML/CSS → SVG) + Sharp (SVG → PNG, resize)
    └──requires──> Platform dimension registry (OG: 1200×630, LinkedIn: 1584×396, etc.)
    └──produces──> PNG files at multiple sizes → saved to file system
    └──optionally-uses──> Theme generator (brand colors)

Icon Generation (SVG)
    └──requires──> LLM to generate SVG path code from description
    └──no external rendering lib needed (SVG is text)
    └──produces──> .svg files → saved to file system
    └──consumed-by──> Branding media kit

Social Media Content
    └──requires──> Wallpaper/banner generator (for image posts)
    └──requires──> LLM (for copy: captions, hashtags, platform-appropriate tone)
    └──requires──> Platform spec registry (image sizes, character limits per platform)
    └──produces──> Platform folder: {platform}/{size}.png + caption.txt + hashtags.txt

Branding Media Kit
    └──requires──> Theme generator (colors)
    └──requires──> Icon generator (SVG logo concept)
    └──requires──> Wallpaper generator (OG image, banner)
    └──requires──> LLM (typography pairing, brand voice, tagline)
    └──produces──> ZIP archive: brand-kit.zip containing all assets + CSS tokens

Placeholder Asset System
    └──requires──> File system with PLACEHOLDERS.md [already built]
    └──requires──> Any generator (diagram, wallpaper, PDF) to set draft flag
    └──produces──> Asset file with DRAFT watermark + PLACEHOLDERS.md entry
    └──resolves-via──> "generate final" command removes watermark, updates manifest

Dependency Notes

  • Content Skill System is the foundation. Every content type is a skill. If the skill routing pattern is not established first, each content type becomes a disconnected one-off endpoint.
  • Satori + Sharp is the image stack for all 2D raster outputs. Do not introduce a separate image library per content type — Satori handles the JSX/CSS layout, Sharp handles PNG conversion and resizing. One pipeline for wallpapers, OG images, social headers, and banner generation.
  • Playwright for HTML-template PDFs, pdf-lib for data-driven PDFs. Do not use a single library for both — Playwright is better for design-rich output, pdf-lib is better for invoices and receipts. Use the right tool per use case.
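  • ffmpeg-static already in v1.6 stack. Remotion's video pipeline is expected to reuse it, so do not add a second FFmpeg dependency. Confirm this against the pinned Remotion version, since Remotion 4 and later ship their own FFmpeg build.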
  • Branding media kit is a composition of other skills. It is not a standalone generator; it orchestrates theme → icons → wallpapers → copy and zips the outputs.
  • PLACEHOLDERS.md integration is cross-cutting. Every content generator must write to the manifest on save; this is not optional per the v1.7 milestone requirements.
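
The orchestration order for a composite skill such as the branding kit can be derived from the dependency tree rather than hard-coded. The sketch below is one way to do that with a depth-first topological sort. The skill ids are hypothetical, and the map treats the wallpaper generator's optional use of the theme as required, which is an assumption made for the kit flow.

```typescript
// Dependency map transcribed from the tree above; ids are hypothetical skill names.
const REQUIRES: Record<string, string[]> = {
  theme: [],
  icons: [],
  wallpapers: ["theme"], // "optionally-uses" in the tree; treated as required here
  social: ["wallpapers"],
  "brand-kit": ["theme", "icons", "wallpapers"],
};

// Returns the generators needed for `target`, dependencies first.
function buildOrder(requires: Record<string, string[]>, target: string): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (id: string): void => {
    if (state.get(id) === "done") return;
    if (state.get(id) === "visiting") throw new Error(`Dependency cycle at "${id}"`);
    const deps = requires[id];
    if (deps === undefined) throw new Error(`Unknown generator "${id}"`);
    state.set(id, "visiting");
    for (const dep of deps) visit(dep);
    state.set(id, "done");
    order.push(id); // every dependency is already in the list at this point
  };

  visit(target);
  return order;
}
```

Calling `buildOrder(REQUIRES, "brand-kit")` yields the theme, icon, and wallpaper generators before the kit itself, and skips generators the kit does not need.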

MVP Definition

Launch With (v1.7 Milestone — P1)

The minimum set to make content generation genuinely useful as a daily workflow tool.

  • Content skill system scaffolding — Skill registration pattern, agent routing, file attachment to chat; gates everything else
  • Diagram generation — NL → Mermaid → SVG/PNG; most requested, lowest complexity, immediate productivity value for a developer
  • Theme and palette generator — Seed color → full color system with WCAG AA; exports CSS tokens + JSON; standalone value even without other generators
  • Placeholder asset system — DRAFT watermark on any generated file + PLACEHOLDERS.md entry; prevents generated assets from being accidentally shipped unreviewed
  • PDF generation — Playwright-based HTML → PDF for reports/one-pagers; pdf-lib for invoices; solves a concrete recurring task
  • Wallpapers and OG images — Satori + Sharp pipeline; produces desktop wallpaper, OG image, and LinkedIn/Twitter header from a single theme config

Add After Validation (v1.7.x — P2)

  • Icon generation (SVG) — LLM-as-SVG-coder; trigger: user asks for consistent icon set for a project
  • Social media content — Platform-formatted posts + captions; trigger: user has a completed project and needs to announce it
  • Remotion presentations: React-component slides → MP4/stills; trigger: user needs a pitch deck or demo video; rendering is CPU/RAM-heavy on M4 and needs a render queue (per the Remotion profile below, it does not draw on the VRAM budget)

Future Consideration (v2+ — P3)

  • Branding media kit — Full brand kit ZIP; requires all other generators to be stable first; high coordination cost
  • Batch generation — Multiple variants or sizes at once; requires job queue infrastructure
  • Template library — Reusable Remotion/Satori templates stored in workspace
  • Font embedding — Custom font in PDF and video; requires font licensing audit and subsetting

Feature Prioritization Matrix

| Feature | User Value | Implementation Cost | Priority |
|---|---|---|---|
| Content skill system | HIGH | MEDIUM | P1 |
| Diagram generation | HIGH | LOW | P1 |
| Theme + palette generator | HIGH | MEDIUM | P1 |
| Placeholder asset system | MEDIUM | LOW | P1 |
| PDF generation | HIGH | MEDIUM | P1 |
| Wallpapers + OG images (Satori) | MEDIUM | MEDIUM | P1 |
| Icon generation (SVG) | MEDIUM | LOW | P2 |
| Social media content | MEDIUM | MEDIUM | P2 |
| Remotion presentations + video | HIGH | HIGH | P2 |
| Branding media kit | HIGH | HIGH | P3 |
| Batch generation | LOW | HIGH | P3 |
| Template library | MEDIUM | MEDIUM | P3 |

Priority key:

  • P1: Must have for v1.7 launch
  • P2: Should have; add when P1 is stable
  • P3: Future milestone

Content Type Profiles

Detailed breakdown of what each content type requires and delivers.

Diagram Generation

User trigger: "Draw me a sequence diagram of the auth flow" in chat
Input: Natural language description
Output: .svg + .mmd (Mermaid source) files
Generator: LLM → Mermaid syntax → @mermaid-js/mermaid-cli run() API
Preview: Inline SVG in chat bubble
Complexity: LOW. Mermaid CLI has a Node.js programmatic API; LLMs are good at generating valid Mermaid
Risk: The LLM occasionally produces invalid Mermaid syntax; validate and retry, or surface the raw .mmd for the user to fix in mermaid.live
Platform spec: Vector SVG has no size constraint; PNG export at 2x for retina via --scale 2
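
Before handing LLM output to the renderer, a cheap pre-check can catch the most common failures: prose instead of a diagram, or the diagram wrapped in a Markdown fence. The sketch below is a heuristic only, not a Mermaid parser. The keyword list is not exhaustive, and `extractMermaid` is a hypothetical helper name. The real validity check remains the render step itself.

```typescript
// Diagram-type keywords that can open a Mermaid document (not exhaustive).
const DIAGRAM_KEYWORDS = [
  "flowchart", "graph", "sequenceDiagram", "classDiagram", "stateDiagram",
  "erDiagram", "gantt", "pie", "journey", "gitGraph", "mindmap", "timeline",
];

// Built at runtime so this snippet can itself sit inside a fenced block.
const FENCE = "`".repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + "(?:mermaid)?[ \\t]*\\n([\\s\\S]*?)" + FENCE);

type Extracted = { ok: true; source: string } | { ok: false; reason: string };

// Strips an optional Markdown fence and checks that a diagram type is declared.
function extractMermaid(llmOutput: string): Extracted {
  const fenced = FENCED_BLOCK.exec(llmOutput);
  const source = (fenced ? fenced[1] : llmOutput).trim();
  if (source === "") return { ok: false, reason: "empty output" };

  // Skip blank lines and %% comments/directives when looking for the header.
  const firstLine =
    source.split("\n").map((l) => l.trim()).find((l) => l !== "" && !l.startsWith("%%")) ?? "";
  const declaresType = DIAGRAM_KEYWORDS.some(
    (k) => firstLine === k || firstLine.startsWith(k + " ") || firstLine.startsWith(k + "-"),
  );
  return declaresType
    ? { ok: true, source }
    : { ok: false, reason: `no diagram type on first line: "${firstLine}"` };
}
```

A failed pre-check can feed its `reason` back into the retry prompt; a passed one still needs the renderer to confirm the syntax.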

Theme + Palette Generator

User trigger: "Generate a color theme from #2563EB" or "Create a dark theme for my portfolio"
Input: Seed hex color + optional: mode (light/dark/both), style (minimal/vibrant/muted)
Output: theme.css (CSS custom properties), theme.json (design tokens), tailwind.config.ts
Generator: Server-side color math using the OKLCH/LCh model; no LLM required for color generation; the LLM assists with labeling semantic colors (primary, danger, success)
Preview: Live color swatches with contrast ratio overlay in UI
Complexity: MEDIUM. OKLCH color math is non-trivial; WCAG AA enforcement requires an iteration loop
WCAG AA rule: Every text/background combination must hit a 4.5:1 contrast ratio; auto-adjust lightness until compliant
Exports: Three formats (CSS, JSON, Tailwind) from the same data model
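
The WCAG AA rule above reduces to a contrast-ratio function plus an adjustment loop. The sketch below uses the WCAG 2.x relative-luminance formula. For brevity it adjusts the text color by mixing it toward black or white in sRGB, which is a simplification swapped in for the OKLCH lightness adjustment the generator is meant to use. The function names are illustrative.

```typescript
type Rgb = [number, number, number];

function parseHex(hex: string): Rgb {
  const m = /^#?([0-9a-f]{6})$/i.exec(hex.trim());
  if (!m) throw new Error(`Expected a 6-digit hex color, got "${hex}"`);
  const n = parseInt(m[1], 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

function toHex(rgb: Rgb): string {
  return "#" + rgb.map((c) => Math.round(c).toString(16).padStart(2, "0")).join("");
}

// WCAG 2.x relative luminance of an 8-bit sRGB color.
function relativeLuminance(rgb: Rgb): number {
  const [r, g, b] = rgb.map((c) => {
    const s = c / 255;
    return s <= 0.04045 ? s / 12.92 : Math.pow((s + 0.055) / 1.055, 2.4);
  });
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// WCAG contrast ratio, from 1 (identical) to 21 (black on white).
function contrastRatio(a: string, b: string): number {
  const la = relativeLuminance(parseHex(a));
  const lb = relativeLuminance(parseHex(b));
  return (Math.max(la, lb) + 0.05) / (Math.min(la, lb) + 0.05);
}

// Nudges `text` toward black or white in 5% steps until it meets `target`.
function ensureContrast(text: string, background: string, target = 4.5): string {
  if (contrastRatio(text, background) >= target) return text;
  const towardBlack = contrastRatio("#000000", background) >= contrastRatio("#ffffff", background);
  const pole: Rgb = towardBlack ? [0, 0, 0] : [255, 255, 255];
  const start = parseHex(text);
  for (let step = 1; step <= 20; step++) {
    const t = step / 20;
    const candidate = toHex([
      start[0] + (pole[0] - start[0]) * t,
      start[1] + (pole[1] - start[1]) * t,
      start[2] + (pole[2] - start[2]) * t,
    ]);
    if (contrastRatio(candidate, background) >= target) return candidate;
  }
  throw new Error(`No color reaches ${target}:1 against ${background}`);
}
```

Mixing in sRGB shifts saturation along with lightness, so a production version would likely move only the L channel in OKLCH and keep hue and chroma fixed.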

Placeholder Asset System

User trigger: Any content generator can emit a "draft" asset; user explicitly marks an asset as a draft placeholder
Input: Any generated file + optional placeholder label
Output: File with diagonal DRAFT watermark (SVG overlay for images/PDFs, badge for other types) + PLACEHOLDERS.md entry
Generator: Post-processing step in every content pipeline, not a standalone generator
Manifest fields: path, type, status (draft/final), generator, prompt_hash, generated_at, resolved_at
Complexity: LOW. SVG watermark overlay is a compositing operation; PLACEHOLDERS.md is an existing format to extend
Resolve flow: "finalize this asset" removes the watermark and updates the manifest status to final
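
The manifest fields listed above map directly onto a small record type with two operations, create-as-draft and finalize. This is a sketch under assumptions: the field names come from the list above, but the TypeScript shape, the helper names, and the choice to pass in a precomputed `promptHash` (for example a SHA-256 of the prompt) are illustrative. How entries are serialized into PLACEHOLDERS.md is not shown because that format is defined elsewhere.

```typescript
type AssetStatus = "draft" | "final";

interface ManifestEntry {
  path: string;
  type: string;
  status: AssetStatus;
  generator: string;
  prompt_hash: string;
  generated_at: string;       // ISO 8601 timestamp
  resolved_at: string | null; // set when the draft is finalized
}

interface DraftInput {
  path: string;
  type: string;
  generator: string;
  promptHash: string; // assumed to be computed by the caller
}

// Creates the entry written when a generator saves a watermarked draft.
function createDraftEntry(input: DraftInput, now: Date = new Date()): ManifestEntry {
  return {
    path: input.path,
    type: input.type,
    status: "draft",
    generator: input.generator,
    prompt_hash: input.promptHash,
    generated_at: now.toISOString(),
    resolved_at: null,
  };
}

// Returns a finalized copy; finalizing an already-final entry is a no-op.
function finalizeEntry(entry: ManifestEntry, now: Date = new Date()): ManifestEntry {
  if (entry.status === "final") return entry;
  return { ...entry, status: "final", resolved_at: now.toISOString() };
}
```

Keeping both functions pure (they return new objects and take the clock as a parameter) makes the resolve flow easy to test independently of the file system.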

PDF Generation

Use case A — Design-rich reports and one-pagers: HTML template rendered with Playwright headless Chromium → PDF

  • Supports full CSS (flexbox, grid, custom fonts via @font-face)
  • Startup cost: 1-3s browser init; reuse the browser instance across requests
  • Output: pixel-accurate PDF matching HTML preview

Use case B — Invoices, receipts, data tables: programmatic construction with pdf-lib

  • No browser dependency; pure Node.js; fast (<200ms for simple documents)
  • Supports: text positioning, tables, page breaks, image embedding
  • Output: structured PDF from data objects

User trigger: "Generate a project summary PDF" (→ Playwright) or "Create an invoice for client X" (→ pdf-lib)
Complexity: MEDIUM (Playwright browser lifecycle management; pdf-lib API for data-driven docs)
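
Reusing one browser instance across requests, as use case A recommends, comes down to sharing a single launch promise. The helper below is a generic sketch with a hypothetical name (`lazyShared`). It does not import Playwright, so the browser launch appears only in a comment.

```typescript
// Returns a getter that runs `factory` at most once and shares the resulting promise.
// Concurrent callers during startup all await the same launch.
function lazyShared<T>(factory: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | undefined;
  return () => {
    if (pending === undefined) {
      pending = factory().catch((err) => {
        pending = undefined; // allow a retry after a failed launch
        throw err;
      });
    }
    return pending;
  };
}

// Intended use with Playwright (not executed here):
//   const getBrowser = lazyShared(() => chromium.launch());
//   const page = await (await getBrowser()).newPage();
```

Storing the promise rather than the resolved value is what prevents two simultaneous first requests from launching two browsers.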

Wallpapers + Visual Assets

Scope: Desktop wallpaper (2560×1440, 3840×2160), mobile wallpaper (1080×1920), OG image (1200×630), LinkedIn banner (1584×396), Twitter/X header (1500×500)
User trigger: "Generate a wallpaper for my project" or "Create an OG image with the project name"
Input: Theme colors + project name/tagline + optional layout style
Output: PNG files at each requested size
Generator: Satori (JSX → SVG) → Sharp (SVG → PNG, resize per platform spec)
Preview: Thumbnail grid in UI; click to view full size; download individually or as ZIP
Complexity: MEDIUM. Satori supports only a subset of CSS (display: grid is not available in all versions, so lay out with flexbox); Sharp handles the raster conversion

Icon Generation (SVG)

User trigger: "Create an icon for notifications" or "Generate a 5-icon set for the nav bar"
Input: Icon description + optional style (outline/filled/duotone) + size (24px/32px/48px)
Output: .svg file(s) with clean <svg> structure
Generator: LLM generates SVG path code directly; this is a text-to-code task, not image generation
Preview: Rendered SVG inline in chat; displayed at 24, 48, 96px to show scalability
Complexity: LOW. Modern LLMs (Claude, GPT-4) reliably generate clean SVG paths for simple icons; more complex icons need iteration
Consistency rule: Generate entire sets in one LLM call with style instructions; icons generated separately look inconsistent

Social Media Content

Scope: Post image + caption + hashtags for LinkedIn, Twitter/X, Instagram (static only)
Platform specs:

  • LinkedIn: 1200×628px image, 3000 char limit, 3-5 hashtags
  • Twitter/X: 1200×675px image, 280 char limit (with image), 2-3 hashtags
  • Instagram: 1080×1080px (square), 2200 char limit, 10-30 hashtags

User trigger: "Create a launch announcement post for LinkedIn and Twitter"
Input: Project description/milestone + platform selection + tone (professional/casual/technical)
Output: Per platform: {platform}/image.png + {platform}/caption.txt + {platform}/hashtags.txt
Generator: Satori for platform image → LLM for caption + hashtags
Complexity: MEDIUM. The platform spec registry is straightforward; caption writing via LLM is reliable; the image must be platform-safe (no text too close to the edge)
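
The character limits in the platform specs can be enforced with a small check that runs on LLM-written copy before it is saved. In this sketch the limits are taken from the list above. The assumption that hashtags are appended to the caption and count toward the limit, the code-point counting, and the `checkCaption` name are all illustrative, and the platforms' own counting rules may differ.

```typescript
type SocialPlatform = "linkedin" | "twitter" | "instagram";

// Character limits copied from the platform specs above.
const CAPTION_LIMITS: Record<SocialPlatform, number> = {
  linkedin: 3000,
  twitter: 280,
  instagram: 2200,
};

interface CaptionCheck {
  fits: boolean;
  length: number; // code points in caption + hashtags
  overBy: number; // 0 when the text fits
}

// Assumes hashtags are appended to the caption, separated by single spaces.
function checkCaption(platform: SocialPlatform, caption: string, hashtags: string[]): CaptionCheck {
  const fullText = [caption.trim(), ...hashtags].filter((part) => part !== "").join(" ");
  // Count code points rather than UTF-16 units so emoji are not double-counted.
  const length = Array.from(fullText).length;
  const overBy = Math.max(0, length - CAPTION_LIMITS[platform]);
  return { fits: overBy === 0, length, overBy };
}
```

When `fits` is false, `overBy` gives the regenerate prompt a concrete target ("shorten by at least N characters") instead of a generic retry.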

Remotion Presentations + Video

User trigger: "Create a 2-minute pitch deck video" or "Generate slides for the project demo"
Input: Slide content (title, bullets, code snippets) + theme + duration estimate
Output: .mp4 (for video) or PNG stills per slide (for presentation mode)
Generator: @remotion/renderer renderMedia() API; server-side, no browser UI needed
Preview: Remotion Player component in UI for interactive playback before export
Complexity: HIGH. Remotion render is CPU-intensive (no GPU needed on M4; rendering runs on the CPU); render time for a 2-min video is ~30-90s on M4; must manage a render queue
VRAM note: Remotion does NOT use GPU/VRAM; pure CPU/RAM render; does not compete with the LLM VRAM budget
ffmpeg reuse: Remotion uses ffmpeg internally for video encoding; ffmpeg-static already in the v1.6 stack satisfies this
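
Turning slide content plus a duration estimate into a video starts with converting per-slide seconds into frame ranges. The sketch below computes a start frame and a length for each slide; the result names `from` and `durationInFrames` mirror the props of Remotion's `Sequence` component. The `Slide` shape and the `layoutSlides` helper are assumptions for illustration.

```typescript
interface Slide {
  title: string;
  seconds: number; // how long the slide stays on screen
}

interface SlideTiming {
  title: string;
  from: number;             // first frame of the slide
  durationInFrames: number; // number of frames the slide occupies
}

// Lays slides out back to back and returns per-slide timings plus the total length.
function layoutSlides(slides: Slide[], fps: number): { timings: SlideTiming[]; totalFrames: number } {
  let cursor = 0;
  const timings = slides.map((slide) => {
    // Every slide gets at least one frame, even if its duration rounds to zero.
    const durationInFrames = Math.max(1, Math.round(slide.seconds * fps));
    const timing = { title: slide.title, from: cursor, durationInFrames };
    cursor += durationInFrames;
    return timing;
  });
  return { timings, totalFrames: cursor };
}
```

`totalFrames` is the value the composition's overall duration would be set to, so the deck length always follows from the slide list.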

Branding Media Kit (v2 — complex coordination)

Output: brand-kit.zip containing: colors/theme.css, colors/theme.json, icons/logo.svg, icons/favicon.svg, images/og-image.png, images/banner-linkedin.png, images/banner-twitter.png, images/wallpaper-desktop.png, typography/font-stack.css, copy/brand-voice.md, copy/tagline.txt
Generator: Orchestrator agent coordinates all sub-generators in sequence
Complexity: HIGH. Coordination of 6 generators with shared state (theme colors must flow through all visual assets)


Competitor Feature Analysis

| Feature | Canva / Pitch | Mermaid Live / Eraser | Figma Tokens Studio | Nexus v1.7 Approach |
|---|---|---|---|---|
| Content type | Raster images, slides | Diagrams only | Design tokens only | All types via skills |
| AI integration | Prompt-to-design (cloud) | None / limited | None | Chat-driven, local LLM |
| Offline / local | No | No | No | Fully local on M4 |
| Skill installability | Monolithic product | Standalone tool | Figma plugin | Per-type installable skills |
| File ownership | Cloud-locked | Export only | Figma-locked | Local file system, git-versioned |
| WCAG enforcement | Optional check | N/A | Via plugin | Enforced at generation |
| PLACEHOLDERS.md | N/A | N/A | N/A | Native; draft tracking built in |
| Agent-driven | No | No | No | Core UX: chat → deliverable |

Platform Dimension Registry

Used by wallpaper generator, social content, and branding kit.

| Asset Type | Width (px) | Height (px) | Notes |
|---|---|---|---|
| OG Image | 1200 | 630 | Universal (Facebook, LinkedIn, Twitter) |
| LinkedIn Banner | 1584 | 396 | Center-safe zone; edges cropped on mobile |
| Twitter/X Header | 1500 | 500 | 3:1 aspect ratio |
| YouTube Banner | 2560 | 1440 | Safe zone: center 1546×423 |
| Instagram Square | 1080 | 1080 | 1:1 |
| Desktop Wallpaper | 2560 | 1440 | Standard; also offer 3840×2160 |
| Mobile Wallpaper | 1080 | 1920 | 9:16 |
| Favicon | 32 | 32 | SVG preferred; PNG fallback |
| Apple Touch Icon | 180 | 180 | PNG only |
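
In code, the registry above can be a single constant that every generator imports, together with a helper that works out how to crop a larger render to a target aspect ratio. The sketch below is illustrative: the key names and the `centerCrop` helper are assumptions. The returned rectangle uses `left`/`top`/`width`/`height`, the same field names Sharp's `extract` operation takes.

```typescript
interface Dimensions {
  width: number;
  height: number;
}

// Values copied from the registry table above; key names are hypothetical.
const DIMENSIONS = {
  ogImage: { width: 1200, height: 630 },
  linkedinBanner: { width: 1584, height: 396 },
  twitterHeader: { width: 1500, height: 500 },
  youtubeBanner: { width: 2560, height: 1440 },
  instagramSquare: { width: 1080, height: 1080 },
  desktopWallpaper: { width: 2560, height: 1440 },
  mobileWallpaper: { width: 1080, height: 1920 },
  favicon: { width: 32, height: 32 },
  appleTouchIcon: { width: 180, height: 180 },
} as const;

type AssetType = keyof typeof DIMENSIONS;

interface CropRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Largest centered region of `source` that has the target asset's aspect ratio.
function centerCrop(source: Dimensions, target: AssetType): CropRect {
  const { width: tw, height: th } = DIMENSIONS[target];
  const targetRatio = tw / th;
  const sourceIsWider = source.width / source.height > targetRatio;
  const width = sourceIsWider ? Math.round(source.height * targetRatio) : source.width;
  const height = sourceIsWider ? source.height : Math.round(source.width / targetRatio);
  return {
    left: Math.floor((source.width - width) / 2),
    top: Math.floor((source.height - height) / 2),
    width,
    height,
  };
}
```

Cropping to the target ratio first and resizing second avoids stretching when one master render is reused for several platform sizes.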

Generation Job Lifecycle

All content generation follows this status machine to enable consistent UI feedback:

queued → generating → ready → (draft → final via placeholder system)
                   ↘ error (with structured reason + suggestion)
  • queued: Job accepted; worker not yet started
  • generating: Active work; emit SSE progress events with % or step label
  • ready: File available; preview URL returned; download URL available
  • draft: File saved with DRAFT watermark; PLACEHOLDERS.md entry created
  • final: User confirmed; watermark removed; manifest updated
  • error: Structured error with reason + actionable suggestion
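
The status machine above can be enforced with a transition table so that no worker or route moves a job into an illegal state. In this sketch the allowed transitions are read off the diagram. Whether `ready` may skip straight to `final` is not stated there, so that edge is left out as an assumption. The `transition` helper name is illustrative.

```typescript
type JobStatus = "queued" | "generating" | "ready" | "draft" | "final" | "error";

// Allowed next states for each status, transcribed from the lifecycle diagram.
const TRANSITIONS: Record<JobStatus, JobStatus[]> = {
  queued: ["generating"],
  generating: ["ready", "error"],
  ready: ["draft"],
  draft: ["final"],
  final: [],
  error: [],
};

// Returns `next` if the move is legal, otherwise throws.
function transition(current: JobStatus, next: JobStatus): JobStatus {
  if (!TRANSITIONS[current].includes(next)) {
    throw new Error(`Illegal job transition: ${current} → ${next}`);
  }
  return next;
}

// Terminal states are the ones with no outgoing transitions.
function isTerminal(status: JobStatus): boolean {
  return TRANSITIONS[status].length === 0;
}
```

Because the table is data, the UI can also derive which actions to offer for a job (for example, showing "finalize" only while the status is `draft`).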

Feature research for: Nexus v1.7 Content Generation Researched: 2026-04-04