The Agent Cast In Detail

Chapter 1's introduction to the cast gave you a quick tour. This page goes deep — Script Writer modes, what each prompt-writing agent actually does, the inputs they need, and the artifacts they produce.

You only need this depth when you're creating from scratch (Chapter 12), where you'll see each agent's output as it comes through.

Script Writer — 5 modes

The Script Writer has five distinct modes. The Manager picks the right mode based on what you're asking for:

Mode 1: Adaptation
  When: You have an existing script (yours or a competitor's) and want to base a new workflow on it
  Inputs: Source script + brand + audience
  Output: Adapted script that mirrors the structure with original wording

Mode 2: Translation
  When: An existing approved workflow needs another language
  Inputs: Source workflow's dialogue + target language + dialect
  Output: Translated dialogue, same scene structure

Mode 3: Follower-growth
  When: You want growth content (educational, soft sell, recipe-style)
  Inputs: Topic + audience + product (as context, not pitch)
  Output: Growth script: opens with hook, body is value-add, soft CTA

Mode 4: Variant rewrite
  When: A Lvl 1-2 variant on an existing approved workflow
  Inputs: Source workflow + change scope (dialogue / wardrobe / both)
  Output: New dialogue rows + per-scene wardrobe updates

Mode 5: New script (default)
  When: Fresh from-scratch script
  Inputs: Full brief: product, audience, tone, platform, sales channel, B-roll density
  Output: Complete script with hook / body / CTA structure

The Manager picks the mode automatically. You don't have to specify — the brief tells the Manager which mode applies.
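For example, a brief along the lines of "translate the approved sleep workflow into Mexican Spanish" would route to Mode 2, while "fresh TikTok Shop script for the magnesium product, Problem Aware audience, warm tone" would route to Mode 5. These phrasings are illustrative, not required wording.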

What the Script Writer needs

For Mode 5 (new from-scratch script), the Script Writer expects:

  • Product info — loaded from reference/products/{product}.md. Approved claims, banned claims, mechanism notes.
  • Brand info — loaded from reference/brands/{brand}.md. Voice, banned words, approved angles.
  • Audience description — age range, gender, life stage, pain points
  • Schwartz awareness level — Unaware / Problem Aware / Solution Aware / Product Aware / Most Aware
  • Sales channel — TikTok Shop / Amazon / Meta Shop (drives CTA structure)
  • Platform — TikTok / Reels / Facebook (drives length and pacing constraints)
  • Tone — warm / authoritative / urgent / casual
  • Creative template (optional) — if you've documented one, it constrains hook style and pacing
  • Variant constraints — if there's a parent workflow being varied

The Manager assembles all of this from the brief + reference files before delegating.
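As a rough illustration only, the assembled hand-off could look something like the sketch below. The field names and values are invented for the example; this is not the pipeline's actual brief format.

{
  "note": "illustrative sketch, not the real brief schema",
  "mode": 5,
  "product": "reference/products/magnesium-glycinate.md",
  "brand": "reference/brands/brand-xyz.md",
  "audience": "women 45-55, perimenopause, 3am wake-ups, brain fog",
  "awareness_level": "Problem Aware",
  "sales_channel": "TikTok Shop",
  "platform": "TikTok",
  "tone": "warm",
  "creative_template": null,
  "variant_constraints": null,
  "b_roll_density": "medium"
}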

What the Script Writer returns

A structured script with tagged segments:

[HOOK — 4s]
"If your sleep stopped working when you hit 45, here's something
nobody tells you..."

[BODY — symptom cluster, 6s]
"Brain fog. Mood swings. 3am wake-ups. They feel like separate
problems, but they're not."

[BODY — personal anchor, 8s]
"I'm 47, and 6 months ago I realized..."

[BODY — mechanism reveal, 12s]
"It turns out your body's making less of a key compound..."

[BODY — product introduction, 10s]
"Brand XYZ magnesium glycinate restores it. Two capsules at night..."

[CTA — 5s]
"Comment SLEEP and I'll send you the link."

Each segment is tagged with its purpose and target duration. The Visual Planner uses these tags to decide scene structure.

Visual Planner — what it does

Takes the approved script and produces a scene-by-scene storyboard.

For each script segment, the Visual Planner decides:

  • How many scenes (an 8-second body segment usually becomes one scene; a 12-second segment may split)
  • Camera framing per scene — selfie / medium / close-up; chest-up / waist-up / full body
  • Subject pose per scene — leaning forward / direct gaze / gesturing / etc.
  • Lighting direction per scene — usually inherited from creative template, sometimes varied
  • Background / environment — usually inherited from creative template
  • B-roll placement — where cutaways should go (based on B-roll density)

It returns two outputs:

  • User summary: 1-2 sentences per scene. This is what you approve.
  • Internal detail: full scene specifications used by the Image Prompter and Veo Prompter in Stage 4. You typically don't see this; it goes straight to downstream agents.
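For a sense of what the internal detail carries, one scene's entry might look roughly like the sketch below. The field names are invented for illustration and are not the actual internal format.

{
  "note": "illustrative sketch, not the actual internal format",
  "scene": 3,
  "segment": "personal anchor",
  "duration_s": 8,
  "dialogue": "I'm 47, and 6 months ago I realized...",
  "framing": "selfie, chest-up",
  "pose": "leaning forward, direct gaze",
  "lighting": "inherited from creative template",
  "background": "inherited from creative template",
  "b_roll": false
}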

Image Prompter — what it does

Takes the storyboard's internal detail and writes per-scene image prompts for NanoBanana 2.

For each reference image group (one reference image is often shared across multiple scenes with similar composition), the Image Prompter writes a JSON-structured prompt:

{
  "framing": "chest to shoulder up selfie",
  "subject": {
    "wardrobe": "white linen shirt, soft texture",
    "pose": "subtle forward lean, weight on left arm resting on counter",
    "build": "slim athletic, casual posture"
  },
  "setting": {
    "primary": "bathroom vanity mirror",
    "secondary": "tiled wall behind, vanity counter with minimal items"
  },
  "lighting": {
    "direction": "warm natural window light from camera-left",
    "quality": "golden hour, soft fill"
  },
  "aesthetic": {
    "vibe": "raw UGC, slight smartphone texture",
    "color_palette": "neutral warm tones"
  },
  "no": "third person view, external camera, full body shot, pants visible"
}

Each prompt applies the Image Prompt Rules — framing declaration, in-frame only, POV correctness, etc.

Veo Prompter — what it does

Takes the storyboard's scene details (specifically: dialogue line + motion description) and writes per-scene video prompts for Veo 3.1.

Speaking-scene Veo prompts use the universal talking head template: the same structure for every speaking scene, with only the dialogue line varying. The template handles:

  • Delivery pacing
  • Micro-pacing within a line (described to Veo in natural language)
  • Mouth-sync expectations
  • Natural blinking / breathing (Veo adds these automatically)

The Veo prompt for a speaking scene looks roughly like:

[Universal talking head template — fixed structure]

{dialogue}: "the dialogue line for this specific scene"

[Macro motion description from Visual Planner]
"Subject leans forward subtly on the line 'here's something nobody
tells you', then settles back as the line completes."

B-roll Veo prompts use natural language — no structured template, no dialogue, ambient audio only.
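An illustrative B-roll prompt, with wording invented for this example, might read something like:

Handheld close-up of two capsules set on a bathroom vanity counter next to a
glass of water, warm evening window light from camera-left, soft-focus tiled
wall behind, subtle camera drift. Ambient room tone only, no speech.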

Seedance Prompter — when used

Same shape as Veo Prompter, but adapted for Seedance 2.0's constraints (positive-only prompts, no negative prompts). Used when Seedance is producing a specific stylization that Veo can't match.
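As an illustration of the positive-only constraint, a restriction the image prompt expresses as a "no" list (for example "no: third person view, external camera") would instead be phrased as a positive description for Seedance, roughly like:

Filmed entirely from the subject's own handheld selfie perspective, arm's-length
framing, chest and shoulders in frame throughout.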

In practice this is rare — Veo handles almost every case the pipeline encounters. Seedance is a fallback.

PatchWork Importer — what it does

Takes all the prompt files from the prompt-writing agents and assembles them into a .nbflow.

For each scene:

  1. Creates the image gen node chain (Plain Prompt → NanobananaAPI → Approve)
  2. Creates the video gen node chain (Dynamic Prompt → Template → Veo3 → Approve)
  3. Wires them together (image's approved output feeds Veo's start frame)
  4. Connects reference image Media nodes to the image gen nodes
  5. Connects all approvals to downstream consumers
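A heavily simplified sketch of that wiring for a single scene might look like the block below. The node names come from the list above, but the structure and field names are invented for illustration; the real .nbflow format may differ.

{
  "note": "illustrative sketch, not the actual .nbflow schema",
  "scene_3": {
    "image_chain": ["Plain Prompt", "NanobananaAPI", "Approve"],
    "video_chain": ["Dynamic Prompt", "Template", "Veo3", "Approve"],
    "connections": [
      "reference image Media node -> image gen nodes",
      "image Approve output -> Veo3 start frame",
      "scene Approve outputs -> downstream consumers"
    ]
  }
}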

The importer also has a Patch Mode for variant work — instead of building from scratch, it takes an existing .nbflow plus a list of changes and applies them surgically.

Generation Runner — what it does

You've seen this one — see Chapter 2's Generation Runner page. The agent that runs the workflow and produces the candidates.

How they coordinate

The Manager orchestrates the cast. A typical from-scratch flow:

sequenceDiagram
    User->>Manager: brief
    Manager->>User: confirm brief
    User->>Manager: approve
    Manager->>Script Writer: script (Mode 5)
    Script Writer-->>Manager: structured script
    Manager->>User: approve script
    User->>Manager: approve
    Manager->>Visual Planner: storyboard request
    Visual Planner-->>Manager: user summary + internal detail
    Manager->>User: approve storyboard
    User->>Manager: approve
    Manager->>Image Prompter: per-scene image prompts
    Manager->>Veo Prompter: per-scene video prompts
    Image Prompter-->>Manager: prompt files
    Veo Prompter-->>Manager: prompt files
    Manager->>PatchWork Importer: assemble .nbflow
    PatchWork Importer-->>Manager: complete .nbflow
    Manager->>User: ready to generate

You don't talk to any of the prompt-writing agents directly. The Manager handles all the delegation. You see the script and the storyboard — the actual prompts are surfaced only if you ask to see them.

When you'd see individual agent outputs

Most of the time, you only see the user-facing artifacts: script, storyboard summary, the final .nbflow, and the generated candidates.

You might dig into individual agent outputs when:

  • A scene's image prompt isn't producing what you want (you want to read the prompt)
  • The Visual Planner's storyboard summary leaves out detail you need
  • A surgical edit needs you to know exactly which prompt to change

These are advanced cases. The Manager handles agent coordination transparently in normal use.

When you're ready

Next: The Prehook — what a prehook is, when to use one, how it fits in the workflow.