# Phase 43: Documents & Branding - Research
**Researched:** 2026-04-04
**Domain:** PDF generation via Playwright, brand identity kit assembly, ZIP packaging
**Confidence:** HIGH
---
## User Constraints (from CONTEXT.md)
### Locked Decisions
All implementation choices are at Claude's discretion — discuss phase was skipped per user setting.
### Claude's Discretion
All implementation choices are at Claude's discretion. Use ROADMAP phase goal, success criteria, and codebase conventions.
### Deferred Ideas (OUT OF SCOPE)
None — discuss phase skipped.
---
## Phase Requirements
| ID | Description | Research Support |
|----|-------------|------------------|
| DOC-01 | User can generate formatted PDF reports from conversation content | Playwright page.pdf() with HTML-to-PDF approach; follows diagram-renderer.ts Playwright pattern |
| DOC-02 | User can generate invoices and contracts from templates | Template-driven HTML rendered to PDF via Playwright; pdf-lib was considered for data-only invoices but is not used (see "Why NOT pdf-lib") |
| DOC-03 | User can generate one-pagers and API documentation | LLM-generated HTML + CSS styled page → Playwright PDF; same pipeline as DOC-01 |
| BRAND-01 | User can generate a full brand identity from a single conversation | Multi-step LLM extraction: brand name/colors/typography → feeds each sub-renderer |
| BRAND-02 | System produces logo mark (SVG), avatar in multiple sizes | SVG logo via LLM + validateAndCleanSvg; sharp rasterizes to [512, 256, 128, 64, 32]px PNGs |
| BRAND-03 | System produces social media profile images and banners per platform | Re-uses PLATFORM_DIMENSIONS from wallpaper-renderer; logo composited onto brand-colored background |
| BRAND-04 | System produces email signature and letterhead templates | LLM generates HTML; stored as HTML string inside bundle + PNG preview via Playwright screenshot |
| BRAND-05 | System produces a brand guidelines document (PDF) | Playwright PDF of a styled brand guidelines HTML page generated by LLM |
| BRAND-06 | User can download all brand assets as a zip package | `archiver` v7 streams all bundle assets into a ZIP buffer returned as a single RenderResult |
---
## Summary
Phase 43 adds two new job types — `pdf-document` (DOC-01..03) and `brand-kit` (BRAND-01..06) — to the existing content job pipeline. Both use infrastructure from Phase 40 (job store, SSE, asset storage) and follow the renderer pattern established in Phases 41-42.
PDF generation uses the Playwright Chromium browser already installed at `~/.cache/ms-playwright/chromium-1217/`. The `resolveBrowserPath()` function in `diagram-renderer.ts` is already in place and reusable. The pattern is: LLM generates an HTML page → Playwright renders it → `page.pdf()` outputs a PDF buffer. This sidesteps any native binary dependencies beyond the already-present Chromium.
Brand kit generation orchestrates multiple sub-renders — logo SVG, avatar PNGs, social images, email signature HTML, letterhead HTML, brand guidelines PDF — and then packages them with `archiver` into a single ZIP buffer. The ZIP is stored as a single generated asset; the UI fetches and triggers a browser download. `pdf-lib` is NOT needed for Phase 43 — Playwright PDF covers all three document types.
**Primary recommendation:** One `pdf-renderer.ts` for DOC-01..03 and one `brand-renderer.ts` for BRAND-01..06. Both follow the `renderX(input) → RenderResult` contract. Add two new `case` blocks in `content-job-runner.ts`. Add two new tabs to `ContentStudio.tsx`.
---
## Project Constraints (from CLAUDE.md)
No `CLAUDE.md` exists in the project root. Constraints are derived from codebase conventions documented in STATE.md:
- **Async job pattern is mandatory** — all render requests return 202 + job ID immediately; never block HTTP on render
- **sourceTaskId required** on every generated asset from day one
- **MAX_GENERATED_ASSET_BYTES** applies to all generated assets (bypasses 10MB upload limit for "generated" namespace)
- **Playwright Chromium** already decided for design-rich PDFs (confirmed in STATE.md blocker note: "Confirm pdf-lib scope: Playwright for design-rich PDFs, pdf-lib for data-driven invoices — decide at Phase 43 planning")
- **Renderer pattern**: `renderX(input: Record<string, unknown>): Promise<RenderResult>`, loaded via dynamic import in `content-job-runner.ts`
- **Bundle pattern**: rich JSON blob stored as the RenderResult; UI fetches the asset URL, parses JSON, hydrates component
- **puterChatComplete** for all LLM calls; reads `PUTER_AUTH_TOKEN` from env
- **No new binary dependencies** beyond what is already installed (`sharp`, `svgo`, `playwright-core`, `@resvg/resvg-js`, `ffmpeg-static`)
- **TypeScript strict** — all new files need proper types
- **Test mocks**: `playwright-core` and `puter-inference.js` are always vi.mock()ed in tests
---
## Standard Stack
### Core (already installed — no new installs needed for PDF)
| Library | Version | Purpose | Why Standard |
|---------|---------|---------|--------------|
| `playwright-core` | 1.58.2 | HTML → PDF via Chromium headless | Already installed, `resolveBrowserPath()` written |
| `sharp` | ^0.34.5 | SVG → PNG rasterization for avatars and social images | Already in use (wallpaper-renderer.ts) |
| `svgo` | ^4.0.1 | SVG cleanup/validation | Already in use (icon-renderer.ts) |
| `@resvg/resvg-js` | ^2.6.2 | High-fidelity SVG → PNG for logo mark | Already in use (diagram-renderer.ts) |
### New Dependency: ZIP Packaging
| Library | Version | Purpose | Why |
|---------|---------|---------|-----|
| `archiver` | ^7.0.1 | Stream multiple buffers into a ZIP buffer | Well-maintained (MIT), streams-based, works entirely in memory by piping the archive into a collecting `Writable` and calling `archive.finalize()`; no disk I/O |
**Installation (one new package):**
```bash
pnpm --filter @paperclipai/server add archiver
pnpm --filter @paperclipai/server add -D @types/archiver
```
**Version verification (npm registry, 2026-04-04):**
- `archiver`: 7.0.1 (latest) — confirmed
- `@types/archiver`: published alongside, v5.3.4
### Why NOT pdf-lib
STATE.md blocker note says "Confirm pdf-lib scope: Playwright for design-rich PDFs, pdf-lib for data-driven invoices — decide at Phase 43 planning."
**Decision: Use Playwright for all three doc types (DOC-01, DOC-02, DOC-03).**
Rationale:
- DOC-01 (reports), DOC-02 (invoices), DOC-03 (one-pagers) all need styled output — headings, tables, code blocks
- LLM can generate HTML + inline CSS in a single shot; Playwright renders it faithfully
- `pdf-lib` is excellent for programmatic PDF manipulation (merge, fill form fields) but poor for styled layout
- We already have Playwright Chromium; adding pdf-lib adds another package for no benefit
- Invoice "templates" work cleanly as HTML templates: LLM fills the line items, Playwright renders to PDF
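The invoice-as-HTML-template idea can be sketched as follows. This is a minimal illustration, not the project's template: the `InvoiceData` shape, `renderInvoiceHtml`, and `escapeHtml` are hypothetical names, and it assumes the LLM returns line items as structured data that is then interpolated into a self-contained HTML page before Playwright renders it.

```typescript
// Hypothetical invoice shape; field names are assumptions for illustration.
interface InvoiceLineItem {
  description: string;
  quantity: number;
  unitPrice: number; // major currency units
}

interface InvoiceData {
  invoiceNumber: string;
  billTo: string;
  items: InvoiceLineItem[];
}

// Escape text before interpolation so LLM output cannot inject markup.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a self-contained page: inline CSS, system fonts, no external URLs.
function renderInvoiceHtml(data: InvoiceData): string {
  const rows = data.items
    .map((item) => {
      const lineTotal = (item.quantity * item.unitPrice).toFixed(2);
      return (
        `<tr><td>${escapeHtml(item.description)}</td><td>${item.quantity}</td>` +
        `<td>${item.unitPrice.toFixed(2)}</td><td>${lineTotal}</td></tr>`
      );
    })
    .join("\n");
  const total = data.items
    .reduce((sum, item) => sum + item.quantity * item.unitPrice, 0)
    .toFixed(2);
  return `<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
  body { font-family: Arial, sans-serif; margin: 0; }
  table { width: 100%; border-collapse: collapse; }
  th, td { border-bottom: 1px solid #ccc; padding: 6px; text-align: left; }
</style>
</head>
<body>
<h1>Invoice ${escapeHtml(data.invoiceNumber)}</h1>
<p>Bill to: ${escapeHtml(data.billTo)}</p>
<table>
<tr><th>Description</th><th>Qty</th><th>Unit price</th><th>Total</th></tr>
${rows}
</table>
<p><strong>Total due: ${total}</strong></p>
</body>
</html>`;
}
```

The returned string would be passed to `page.setContent()` in place of free-form LLM HTML, which keeps the invoice layout fixed while the LLM only supplies data.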
### Alternatives Considered
| Instead of | Could Use | Tradeoff |
|------------|-----------|----------|
| Playwright HTML→PDF | `pdf-lib` | pdf-lib has no layout engine; styling complex reports requires hand-coding coordinates — far harder than HTML+CSS |
| archiver | `jszip` | jszip is promise-based but slower; archiver's streaming API handles large asset sets better |
| archiver | `adm-zip` | adm-zip v0.5 is synchronous; blocks event loop for large zips |
| LLM-generated SVG logo | Stable Diffusion / DALL-E | Out of scope per REQUIREMENTS.md |
---
## Architecture Patterns
### Recommended Project Structure
```
server/src/services/renderers/
├── pdf-renderer.ts # NEW — DOC-01, DOC-02, DOC-03
├── brand-renderer.ts # NEW — BRAND-01..06
├── diagram-renderer.ts # existing
├── icon-renderer.ts # existing
├── wallpaper-renderer.ts # existing
├── social-renderer.ts # existing
├── convert-renderer.ts # existing
└── types.ts # add PdfDocumentBundle + BrandKitBundle
ui/src/components/
├── DocumentGeneratePanel.tsx # NEW — DOC tab UI
├── BrandKitPanel.tsx # NEW — Brand tab UI
├── BrandKitResult.tsx # NEW — brand kit display + ZIP download trigger
└── ... (existing)
ui/src/pages/
└── ContentStudio.tsx # add "Documents" tab + "Brand" tab
```
### Pattern 1: Playwright PDF Renderer
Same structure as `diagram-renderer.ts` Playwright usage — launch browser from `resolveBrowserPath()`, create page, set HTML content, call `page.pdf()`, close browser.
```typescript
// server/src/services/renderers/pdf-renderer.ts
import { chromium } from "playwright-core";
import { resolveBrowserPath } from "./diagram-renderer.js";
import { puterChatComplete } from "../puter-inference.js";
import type { RenderResult, PdfDocumentBundle } from "./types.js";

// buildPdfSystemPrompt() and stripMarkdownFences() are local helpers, not shown here.
export async function renderPdfDocument(
  input: Record<string, unknown>,
): Promise<RenderResult> {
  const docType = typeof input.docType === "string" ? input.docType : "report";
  const prompt = typeof input.prompt === "string" ? input.prompt : "";
  const title = typeof input.title === "string" ? input.title : "Document";

  // LLM generates a complete, self-contained HTML document
  const html = await puterChatComplete([
    { role: "system", content: buildPdfSystemPrompt(docType) },
    { role: "user", content: prompt },
  ]);
  const cleanHtml = stripMarkdownFences(html);

  const executablePath = resolveBrowserPath();
  const browser = await chromium.launch({
    executablePath,
    headless: true,
    args: ["--no-sandbox", "--disable-setuid-sandbox"],
  });

  let pdfBuffer: Buffer;
  try {
    const page = await browser.newPage();
    await page.setContent(cleanHtml, { waitUntil: "networkidle" });
    const pdfUint8 = await page.pdf({
      format: "A4",
      printBackground: true,
      margin: { top: "20mm", bottom: "20mm", left: "20mm", right: "20mm" },
    });
    // Normalize to a Node Buffer (see Pitfall 1)
    pdfBuffer = Buffer.from(pdfUint8);
  } finally {
    // Always release the browser, even if rendering throws
    await browser.close();
  }

  const bundle: PdfDocumentBundle = {
    type: "pdf-document-bundle",
    docType,
    title,
    pdfBase64: pdfBuffer.toString("base64"),
  };
  return {
    filename: `document-${docType}.json`,
    contentType: "application/json",
    buffer: Buffer.from(JSON.stringify(bundle)),
  };
}
```
**Key Playwright PDF options:**
- `page.pdf({ format: "A4", printBackground: true })` — produces a Buffer (Uint8Array in Playwright v1.58)
- `waitUntil: "networkidle"` — ensures any web fonts / images finish before capture; use `"domcontentloaded"` as fallback for offline-only HTML
- Margin in mm units
- `printBackground: true` — needed for colored headers/footers in styled documents
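Pattern 1 calls `stripMarkdownFences()` without defining it. One possible shape for that helper is sketched below, assuming the only cleanup needed is removing a single surrounding Markdown code fence (with an optional language tag) that the LLM may wrap around its HTML. The body is an assumption, not the project's implementation.

```typescript
// Build the fence marker programmatically to keep this sample easy to embed in Markdown.
const FENCE = "`".repeat(3);

// Hypothetical helper: strip one surrounding code fence if present,
// otherwise return the trimmed text unchanged.
function stripMarkdownFences(text: string): string {
  const trimmed = text.trim();
  if (!trimmed.startsWith(FENCE) || !trimmed.endsWith(FENCE)) {
    return trimmed;
  }
  const firstNewline = trimmed.indexOf("\n");
  if (firstNewline === -1) {
    // A lone fence marker with no body: nothing to unwrap
    return trimmed;
  }
  // Drop the opening fence line (including any language tag) and the closing fence
  return trimmed.slice(firstNewline + 1, trimmed.length - FENCE.length).trim();
}
```

Input that is already plain HTML passes through untouched, so the helper is safe to apply unconditionally to every LLM response.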
### Pattern 2: Brand Kit Orchestration
Brand kit is a multi-step job: one LLM call extracts the brand specification, then sub-renderers produce each asset in sequence, then archiver packages everything.
```typescript
// server/src/services/renderers/brand-renderer.ts
import type { RenderResult, BrandKitBundle } from "./types.js";

interface BrandSpec {
  name: string;
  tagline: string;
  primaryColor: string; // hex
  secondaryColor: string; // hex
  fontStyle: "sans" | "serif" | "mono";
  logoDescription: string;
  industry: string;
}

// The step helpers called below (extractBrandSpec, generateLogoSvg, ...) are not shown here.
export async function renderBrandKit(
  input: Record<string, unknown>,
): Promise<RenderResult> {
  const prompt = typeof input.prompt === "string" ? input.prompt : "";

  // Step 1: Extract brand specification
  const spec = await extractBrandSpec(prompt);
  // Step 2: Generate logo SVG
  const logoSvg = await generateLogoSvg(spec);
  // Step 3: Rasterize logo to avatar sizes [512, 256, 128, 64, 32]
  const avatarPngs = await rasterizeAvatars(logoSvg);
  // Step 4: Generate social platform images (profile + banner per platform)
  const socialImages = await generateSocialImages(spec, logoSvg);
  // Step 5: Generate email signature HTML + letterhead HTML
  const { signature, letterhead } = await generateTemplates(spec, logoSvg);
  // Step 6: Generate brand guidelines PDF via Playwright
  const guidelinesPdf = await generateGuidelinesPdf(spec, logoSvg, signature);
  // Step 7: Package everything into a ZIP
  const zipBuffer = await buildZip({
    logoSvg, avatarPngs, socialImages,
    signature, letterhead, guidelinesPdf,
  });

  const bundle: BrandKitBundle = {
    type: "brand-kit-bundle",
    spec,
    logoSvgBase64: Buffer.from(logoSvg).toString("base64"),
    avatarPngs, // { "512": base64, "256": base64, ... }
    socialImages, // { "twitter-profile": base64, ... }
    signatureHtml: signature,
    letterheadHtml: letterhead,
    guidelinesPdfBase64: guidelinesPdf.toString("base64"),
    zipBase64: zipBuffer.toString("base64"),
  };
  return {
    filename: "brand-kit-bundle.json",
    contentType: "application/json",
    buffer: Buffer.from(JSON.stringify(bundle)),
  };
}
```
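Step 1 depends on turning free-form LLM output into a predictable `BrandSpec`. A minimal sketch of the parsing half of `extractBrandSpec` follows. The function name `parseBrandSpec`, the fallback values, and the six-digit hex rule are all assumptions made for illustration; only the `BrandSpec` fields come from the pattern above.

```typescript
interface BrandSpec {
  name: string;
  tagline: string;
  primaryColor: string; // hex
  secondaryColor: string; // hex
  fontStyle: "sans" | "serif" | "mono";
  logoDescription: string;
  industry: string;
}

const HEX_COLOR = /^#[0-9a-fA-F]{6}$/;

// Hypothetical parser: pull the first JSON object out of the LLM reply and
// coerce every field to a safe value so downstream renderers get a fixed shape.
function parseBrandSpec(raw: string): BrandSpec {
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in LLM response");
  }
  const parsed: unknown = JSON.parse(raw.slice(start, end + 1));
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("Brand spec must be a JSON object");
  }
  const obj = parsed as Record<string, unknown>;

  const text = (key: string, fallback: string): string => {
    const value = obj[key];
    return typeof value === "string" && value.trim() !== "" ? value.trim() : fallback;
  };
  const color = (key: string, fallback: string): string => {
    const value = obj[key];
    return typeof value === "string" && HEX_COLOR.test(value) ? value.toLowerCase() : fallback;
  };
  const font = obj["fontStyle"];
  const fontStyle: BrandSpec["fontStyle"] =
    font === "serif" ? "serif" : font === "mono" ? "mono" : "sans";

  return {
    name: text("name", "Untitled Brand"), // fallback values are assumptions
    tagline: text("tagline", ""),
    primaryColor: color("primaryColor", "#111111"),
    secondaryColor: color("secondaryColor", "#666666"),
    fontStyle,
    logoDescription: text("logoDescription", ""),
    industry: text("industry", ""),
  };
}
```

Coercing invalid colors and font styles to defaults, rather than throwing, lets the remaining steps proceed with a usable spec even when the LLM drifts from the requested schema.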
### Pattern 3: archiver ZIP buffer assembly
```typescript
// Source: archiver v7 official docs
import archiver from "archiver";
import { Writable } from "stream";

async function buildZipBuffer(
  entries: Array<{ name: string; data: Buffer }>,
): Promise<Buffer> {
  return new Promise<Buffer>((resolve, reject) => {
    // Collect the archive output in memory instead of writing to disk
    const chunks: Buffer[] = [];
    const sink = new Writable({
      write(chunk: Buffer, _enc, cb) { chunks.push(chunk); cb(); },
    });
    const archive = archiver("zip", { zlib: { level: 6 } });

    archive.on("error", reject);
    sink.on("error", reject);
    sink.on("finish", () => resolve(Buffer.concat(chunks)));
    archive.pipe(sink);

    for (const entry of entries) {
      archive.append(entry.data, { name: entry.name });
    }
    // finalize() returns a promise; surface its rejection instead of discarding it
    archive.finalize().catch(reject);
  });
}
```
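Pattern 2 passes an object of assets to `buildZip`, while `buildZipBuffer` takes a flat list of named entries. One way to bridge the two is sketched below. The `BrandKitAssets` shape mirrors the variables in Pattern 2, but `toZipEntries` and the folder layout inside the archive are assumptions, not a layout the source specifies.

```typescript
interface ZipEntry {
  name: string;
  data: Buffer;
}

// Mirrors the values produced in Pattern 2; the exact types are assumed.
interface BrandKitAssets {
  logoSvg: string;
  avatarPngs: Record<string, string>; // size -> base64 PNG
  socialImages: Record<string, string>; // platform key -> base64 PNG
  signature: string; // HTML
  letterhead: string; // HTML
  guidelinesPdf: Buffer;
}

// Hypothetical mapping from brand kit assets to archive paths.
function toZipEntries(assets: BrandKitAssets): ZipEntry[] {
  const entries: ZipEntry[] = [
    { name: "logo/logo.svg", data: Buffer.from(assets.logoSvg, "utf8") },
    { name: "templates/email-signature.html", data: Buffer.from(assets.signature, "utf8") },
    { name: "templates/letterhead.html", data: Buffer.from(assets.letterhead, "utf8") },
    { name: "brand-guidelines.pdf", data: assets.guidelinesPdf },
  ];
  for (const size of Object.keys(assets.avatarPngs)) {
    entries.push({
      name: `avatars/avatar-${size}.png`,
      data: Buffer.from(assets.avatarPngs[size], "base64"),
    });
  }
  for (const key of Object.keys(assets.socialImages)) {
    entries.push({
      name: `social/${key}.png`,
      data: Buffer.from(assets.socialImages[key], "base64"),
    });
  }
  return entries;
}
```

With this in place, `buildZip(assets)` could reduce to `buildZipBuffer(toZipEntries(assets))`, keeping the archive layout in one testable function.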
### Pattern 4: content-job-runner.ts additions
```typescript
// Add to renderContent() switch in content-job-runner.ts
case "pdf-document": {
  const { renderPdfDocument } = await import("./renderers/pdf-renderer.js");
  return renderPdfDocument(input);
}
case "brand-kit": {
  const { renderBrandKit } = await import("./renderers/brand-renderer.js");
  return renderBrandKit(input);
}
```
### Pattern 5: UI — useContentJob + bundle fetch (established pattern)
The UI pattern for both new tabs is identical to `SocialPostPanel.tsx`:
1. `useContentJob(companyId)` for submit + SSE progress
2. `if (job.status === "done" && job.resultAssetId && !bundle)` → fetch asset URL → `fetch()` → `JSON.parse()` → set bundle state
3. Display result component
4. Download button triggers `URL.createObjectURL(base64ToBinary(...))` + `.click()`
For BRAND-06 (ZIP download): the `zipBase64` field in the bundle drives a single download button that creates a temporary `<a download>` link pointing at the ZIP blob and clicks it.
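The base64-to-download step can be sketched as a small conversion helper. The name `base64ToBlob` is hypothetical (the existing panels reference a `base64ToBinary` helper whose signature is not shown here), and the sketch assumes an environment where `atob` and `Blob` are available as globals, as they are in browsers.

```typescript
// Hypothetical helper: decode a base64 string into a Blob with the given MIME type.
function base64ToBlob(base64: string, contentType: string): Blob {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    // atob yields one character per byte; copy each char code into the byte array
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type: contentType });
}
```

In the panel, the resulting Blob would be passed to `URL.createObjectURL()`, assigned to the `href` of a temporary anchor with a `download` attribute, clicked, and then released with `URL.revokeObjectURL()`.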
### Anti-Patterns to Avoid
- **Do NOT use `waitUntil: "load"` for Playwright PDF** — network requests for CDN fonts will fail in the sandbox; use self-contained inline CSS with `@import` disabled, or use `waitUntil: "domcontentloaded"` + system fonts only
- **Do NOT open a new Playwright browser per sub-step in brand kit** — open once, generate guidelines PDF in that session, close; reuse same `resolveBrowserPath()` pattern
- **Do NOT pass a file path to archiver from tmpdir** — use `archive.append(buffer, { name })` in-memory to avoid disk temp files
- **Do NOT build the brand kit as a single sequential LLM mega-prompt** — extract spec first (structured JSON), then feed spec fields into individual generators; this gives predictable output shapes
- **Do NOT define BrandKitBundle only in the panel component file** — unlike wallpaper/social bundles (which are panel-local per STATE.md decision), `BrandKitBundle` and `PdfDocumentBundle` must be added to `server/src/services/renderers/types.ts` because the brand renderer references them directly
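As a rough sketch of what the shared definitions in `types.ts` might look like, the interfaces below are derived from the object literals in Patterns 1 and 2. The `GeneratedBundle` union and `parseGeneratedBundle` are hypothetical additions, and `spec` is loosened to a plain record here only to keep the sample self-contained.

```typescript
interface PdfDocumentBundle {
  type: "pdf-document-bundle";
  docType: string;
  title: string;
  pdfBase64: string;
}

interface BrandKitBundle {
  type: "brand-kit-bundle";
  spec: Record<string, unknown>; // BrandSpec in the renderer; loosened for this sketch
  logoSvgBase64: string;
  avatarPngs: Record<string, string>;
  socialImages: Record<string, string>;
  signatureHtml: string;
  letterheadHtml: string;
  guidelinesPdfBase64: string;
  zipBase64: string;
}

// Hypothetical union keyed on the `type` discriminant.
type GeneratedBundle = PdfDocumentBundle | BrandKitBundle;

// Hypothetical parser: checks only the discriminant, not every field.
function parseGeneratedBundle(json: string): GeneratedBundle {
  const value: unknown = JSON.parse(json);
  if (typeof value !== "object" || value === null) {
    throw new Error("Bundle is not an object");
  }
  const type = (value as { type?: unknown }).type;
  if (type === "pdf-document-bundle" || type === "brand-kit-bundle") {
    return value as GeneratedBundle;
  }
  throw new Error(`Unknown bundle type: ${String(type)}`);
}
```

Because both bundles carry a literal `type` field, a `switch (bundle.type)` in the UI narrows to the right interface without casts.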
---
## Don't Hand-Roll
| Problem | Don't Build | Use Instead | Why |
|---------|-------------|-------------|-----|
| PDF from styled HTML | Custom layout engine | `playwright-core` `page.pdf()` | CSS layout is already solved; Playwright has CSS paged media support |
| ZIP archive in memory | Manually writing ZIP file format | `archiver` v7 | The ZIP format involves CRC32 checksums, compression, and directory entries: easy to get wrong when hand-rolled |
| SVG logo cleanup | Custom regex stripper | `svgo` (already installed) `validateAndCleanSvg()` in icon-renderer.ts | Already written and tested |
| SVG → PNG rasterization | Sharp for logo mark | `@resvg/resvg-js` (already installed, used in diagram-renderer.ts) | Handles embedded fonts and complex gradients better than sharp for LLM-generated logos |
| Brand color parsing | Write hex parser | CSS string; pass raw hex from LLM spec directly to SVG fill attributes | No parsing needed |
**Key insight:** Every difficult sub-problem in this phase already has a solved dependency in the codebase. This phase is integration work, not new infrastructure.
---
## Common Pitfalls
### Pitfall 1: Playwright `page.pdf()` returns Uint8Array, not Buffer
**What goes wrong:** TypeScript infers `Uint8Array` for the result of `page.pdf()`. Wrapping it in `Buffer.from()` works, but code that expects a `Buffer`, such as the `byteLength` comparison against MAX_GENERATED_ASSET_BYTES, does not accept the raw value.
**Why it happens:** Playwright v1.58 changed the return type of `page.pdf()` to `Promise<Uint8Array>`.
**How to avoid:** Always wrap: `const pdfBuffer = Buffer.from(await page.pdf({...}))`.
**Warning signs:** TS2345 type error, or `buffer.byteLength` returning wrong value.
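The wrap-then-check step can be folded into one helper, sketched below. `toBoundedBuffer` is a hypothetical name, and the limit is passed in as a parameter because the actual value of MAX_GENERATED_ASSET_BYTES is not given here.

```typescript
// Hypothetical helper: normalize renderer output to a Buffer and enforce a size cap.
function toBoundedBuffer(bytes: Uint8Array, maxBytes: number): Buffer {
  // Buffer.from(Uint8Array) copies the bytes into a new Node Buffer
  const buffer = Buffer.from(bytes);
  if (buffer.byteLength > maxBytes) {
    throw new Error(
      `Generated asset is ${buffer.byteLength} bytes, over the ${maxBytes}-byte limit`,
    );
  }
  return buffer;
}
```

A renderer would call it as `toBoundedBuffer(await page.pdf({ format: "A4" }), MAX_GENERATED_ASSET_BYTES)`, so the type conversion and the limit check cannot be forgotten separately.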
### Pitfall 2: LLM-generated HTML includes external resource references
**What goes wrong:** HTML links to Google Fonts CDN, external images. Playwright in `--no-sandbox` headless mode may not fetch them, producing blank/missing content.
**Why it happens:** LLM follows standard web conventions; doesn't know the context is offline.
**How to avoid:** System prompt must explicitly say: "Use only inline CSS. No external URLs. Use web-safe system fonts (Arial, Georgia, monospace). No `` or `