# The Agent Cast In Detail
Chapter 1's introduction to the cast gave you a quick tour. This page goes deep — Script Writer modes, what each prompt-writing agent actually does, the inputs they need, and the artifacts they produce.
You only need this depth when you're creating from scratch (Chapter 11), where you'll see each agent's output as it comes through.
## Script Writer — 5 modes
The Script Writer has five distinct modes. The Manager picks the right mode based on what you're asking for:
| Mode | When | Inputs | Output |
|---|---|---|---|
| Mode 1: Adaptation | You have an existing script (yours or a competitor's) and want to base a new workflow on it | Source script + brand + audience | Adapted script that mirrors the source's structure with original, non-copied wording |
| Mode 2: Translation | An existing approved workflow needs another language | Source workflow's dialogue + target language + dialect | Translated dialogue, same scene structure |
| Mode 3: Follower-growth | You want growth content (educational, soft sell, recipe-style) | Topic + audience + product (as context, not pitch) | Growth script — opens with hook, body is value-add, soft CTA |
| Mode 4: Variant rewrite | A Lvl 1-2 variant on an existing approved workflow | Source workflow + change scope (dialogue / wardrobe / both) | New dialogue rows + per-scene wardrobe updates |
| Mode 5: New script (default) | Fresh from-scratch script | Full brief: product, audience, tone, platform, sales channel, B-roll density | Complete script with hook / body / CTA structure |
The Manager picks the mode automatically. You don't have to specify — the brief tells the Manager which mode applies.
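As a rough illustration, that routing could look like the following. This is a hypothetical sketch, not the Manager's actual implementation; the brief field names (`source_script`, `target_language`, and so on) are assumptions:

```python
def pick_mode(brief: dict) -> int:
    """Hypothetical sketch of the Manager's mode routing.

    Field names are illustrative assumptions, not the real brief schema.
    """
    if brief.get("source_script"):              # Mode 1: existing script to adapt
        return 1
    if brief.get("target_language"):            # Mode 2: approved workflow, new language
        return 2
    if brief.get("goal") == "follower_growth":  # Mode 3: growth content, soft CTA
        return 3
    if brief.get("parent_workflow"):            # Mode 4: Lvl 1-2 variant
        return 4
    return 5                                    # Mode 5: new from-scratch script (default)
```

The key point the sketch captures: the brief's contents, not an explicit flag, select the mode.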
## What the Script Writer needs
For Mode 5 (new from-scratch script), the Script Writer expects:
- Product info — loaded from `reference/products/{product}.md`. Approved claims, banned claims, mechanism notes.
- Brand info — loaded from `reference/brands/{brand}.md`. Voice, banned words, approved angles.
- Audience description — age range, gender, life stage, pain points
- Schwartz awareness level — Unaware / Problem Aware / Solution Aware / Product Aware / Most Aware
- Sales channel — TikTok Shop / Amazon / Meta Shop (drives CTA structure)
- Platform — TikTok / Reels / Facebook (drives length and pacing constraints)
- Tone — warm / authoritative / urgent / casual
- Creative template (optional) — if you've documented one, it constrains hook style and pacing
- Variant constraints — if there's a parent workflow being varied
The Manager assembles all of this from the brief + reference files before delegating.
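A sketch of that assembly step, with illustrative field names (the real brief schema isn't shown in this chapter); the reference paths mirror the layout described above:

```python
def assemble_brief(brief: dict) -> dict:
    """Hypothetical sketch of the context the Manager hands the Script Writer.

    Field names are illustrative assumptions; only the reference-file
    layout comes from the documented pipeline.
    """
    required = ["product", "brand", "audience", "awareness",
                "sales_channel", "platform", "tone"]
    missing = [k for k in required if k not in brief]
    if missing:
        raise ValueError(f"brief is missing: {missing}")
    return {
        **{k: brief[k] for k in required},
        "product_ref": f"reference/products/{brief['product']}.md",
        "brand_ref": f"reference/brands/{brief['brand']}.md",
        "template": brief.get("template"),       # optional creative template
        "variant_of": brief.get("variant_of"),   # optional parent workflow
    }
```

A brief missing a required field fails fast here, before any agent is delegated to.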
## What the Script Writer returns
A structured script with tagged segments:
```
[HOOK — 4s]
"If your sleep stopped working when you hit 45, here's something
nobody tells you..."

[BODY — symptom cluster, 6s]
"Brain fog. Mood swings. 3am wake-ups. They feel like separate
problems, but they're not."

[BODY — personal anchor, 8s]
"I'm 47, and 6 months ago I realized..."

[BODY — mechanism reveal, 12s]
"It turns out your body's making less of a key compound..."

[BODY — product introduction, 10s]
"Brand XYZ magnesium glycinate restores it. Two capsules at night..."

[CTA — 5s]
"Comment SLEEP and I'll send you the link."
```
Each segment is tagged with its purpose and target duration. The Visual Planner uses these tags to decide scene structure.
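The tagged format is easy to parse mechanically. A sketch, assuming headers always follow the `[KIND — purpose, Ns]` shape shown above (the real pipeline's parser may differ):

```python
import re

def parse_script(script: str) -> list:
    """Sketch of parsing the tagged-segment format shown above.

    Assumes headers like "[BODY — mechanism reveal, 12s]" or "[HOOK — 4s]".
    """
    header = re.compile(r"\[(\w+)(?: — ([^,\]]+))?(?:, | — )?(\d+)s\]")
    segments, current = [], None
    for line in script.splitlines():
        m = header.match(line.strip())
        if m:
            current = {"kind": m.group(1),
                       "purpose": m.group(2),   # None for HOOK / CTA headers
                       "seconds": int(m.group(3)),
                       "text": ""}
            segments.append(current)
        elif current is not None and line.strip():
            current["text"] += line.strip() + " "
    return segments
```

Downstream, a structure like this is what lets the Visual Planner reason about purpose and duration per segment.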
## Visual Planner — what it does
Takes the approved script and produces a scene-by-scene storyboard.
For each script segment, the Visual Planner decides:
- How many scenes (an 8-second body segment usually becomes one scene; a 12-second segment may split)
- Camera framing per scene — selfie / medium / close-up; chest-up / waist-up / full body
- Subject pose per scene — leaning forward / direct gaze / gesturing / etc.
- Lighting direction per scene — usually inherited from creative template, sometimes varied
- Background / environment — usually inherited from creative template
- B-roll placement — where cutaways should go (based on B-roll density)
It returns two outputs:
- User summary — 1-2 sentences per scene. This is what you approve.
- Internal detail — full scene specifications used by the Image Prompter and Veo Prompter in Stage 4. You typically don't see this; it goes straight to downstream agents.
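The scene-splitting heuristic described above (a segment at or under roughly 8 seconds stays one scene; longer segments split) could be sketched as follows. The threshold value is an assumption drawn from the example, not a documented constant:

```python
def split_into_scenes(segments, max_scene_seconds=8):
    """Sketch of a duration-based splitting heuristic.

    Segments at or under the threshold become one scene; longer ones
    split into roughly equal parts. Illustrative only.
    """
    scenes = []
    for seg in segments:
        n = -(-seg["seconds"] // max_scene_seconds)  # ceiling division
        per = seg["seconds"] / n
        for i in range(n):
            scenes.append({"segment": seg["kind"],
                           "part": i + 1,
                           "seconds": per})
    return scenes
```

Under this sketch an 8-second body segment yields one scene and a 12-second one yields two 6-second scenes, matching the behavior described above.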
## Image Prompter — what it does
Takes the storyboard's internal detail and writes per-scene image prompts for NanoBanana 2.
For each reference image group (one image is often shared across multiple scenes with similar composition), the Image Prompter writes a JSON-structured prompt:
```json
{
  "framing": "chest to shoulder up selfie",
  "subject": {
    "wardrobe": "white linen shirt, soft texture",
    "pose": "subtle forward lean, weight on left arm resting on counter",
    "build": "slim athletic, casual posture"
  },
  "setting": {
    "primary": "bathroom vanity mirror",
    "secondary": "tiled wall behind, vanity counter with minimal items"
  },
  "lighting": {
    "direction": "warm natural window light from camera-left",
    "quality": "golden hour, soft fill"
  },
  "aesthetic": {
    "vibe": "raw UGC, slight smartphone texture",
    "color_palette": "neutral warm tones"
  },
  "no": "third person view, external camera, full body shot, pants visible"
}
```
Each prompt applies the Image Prompt Rules — framing declaration, in-frame only, POV correctness, etc.
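A hypothetical lint pass over that prompt shape, checking a couple of the Image Prompt Rules mentioned above. The rule names come from this page; the specific checks are illustrative, not the pipeline's real validator:

```python
def check_image_prompt(prompt: dict) -> list:
    """Illustrative sketch: flag prompts that miss required sections.

    Checks the framing declaration and negative list the rules call for,
    plus the structural sections shown in the JSON example.
    """
    problems = []
    if not prompt.get("framing"):
        problems.append("framing declaration missing")
    if not prompt.get("no"):
        problems.append("negative ('no') list missing")
    for key in ("subject", "setting", "lighting", "aesthetic"):
        if key not in prompt:
            problems.append(f"missing section: {key}")
    return problems
```

An empty result means the prompt at least has the expected shape; POV correctness and in-frame-only phrasing still need human or model review.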
## Veo Prompter — what it does
Takes the storyboard's scene details (specifically: dialogue line + motion description) and writes per-scene video prompts for Veo 3.1.
Speaking-scene Veo prompts use the universal talking head template — same structure for every speaking scene, only the dialogue line varies. The template handles:
- Delivery pacing
- Micro-pacing within a line (Veo's natural language for this)
- Mouth-sync expectations
- Natural blinking / breathing (Veo adds these automatically)
The Veo prompt for a speaking scene looks roughly like:
```
[Universal talking head template — fixed structure]

{dialogue}: "the dialogue line for this specific scene"

[Macro motion description from Visual Planner]
"Subject leans forward subtly on the line 'here's something nobody
tells you', then settles back as the line completes."
```
B-roll Veo prompts use natural language — no structured template, no dialogue, ambient audio only.
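The speaking-scene vs. B-roll split could be sketched like this. The template wording is a stand-in, not the pipeline's actual template text:

```python
def veo_prompt(scene: dict) -> str:
    """Sketch: fixed template for speaking scenes, free-form for B-roll.

    Only the dialogue and motion slots vary per speaking scene, as
    described above. Field names are assumptions.
    """
    if scene.get("dialogue"):
        # Speaking scene: universal talking head template, dialogue slot filled
        return (
            "[Universal talking head template]\n"
            f'dialogue: "{scene["dialogue"]}"\n'
            f'motion: "{scene["motion"]}"'
        )
    # B-roll: natural language only, no dialogue, ambient audio only
    return f'{scene["motion"]} Ambient audio only, no speech.'
```

The design point: every speaking scene shares one structure, so dialogue edits never require rethinking the prompt.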
## Seedance Prompter — when used
Same shape as Veo Prompter, but adapted for Seedance 2.0's constraints (positive-only prompts, no negative prompts). Used when Seedance is producing a specific stylization that Veo can't match.
In practice this is rare — Veo handles almost every case the pipeline encounters. Seedance is a fallback.
## PatchWork Importer — what it does
Takes all the prompt files from the prompt-writing agents and assembles them into a .nbflow.
For each scene:
- Creates the image gen node chain (Plain Prompt → NanobananaAPI → Approve)
- Creates the video gen node chain (Dynamic Prompt → Template → Veo3 → Approve)
- Wires them together (image's approved output feeds Veo's start frame)
- Connects reference image Media nodes to the image gen nodes
- Connects all approvals to downstream consumers
The importer also has a Patch Mode for variant work — instead of building from scratch, it takes an existing .nbflow plus a list of changes and applies them surgically.
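The per-scene wiring can be sketched as a small graph builder. The node type names follow the chains listed above; the ID scheme and edge format are assumptions, not the real `.nbflow` format:

```python
def build_scene_nodes(scene_id: str) -> dict:
    """Illustrative sketch of the per-scene node graph the importer builds.

    Node names mirror the documented chains; IDs and edges are invented
    for illustration.
    """
    nodes = [f"{scene_id}/{t}" for t in
             ("PlainPrompt", "NanobananaAPI", "ImageApprove",
              "DynamicPrompt", "Template", "Veo3", "VideoApprove")]
    edges = [
        # image gen chain
        (f"{scene_id}/PlainPrompt", f"{scene_id}/NanobananaAPI"),
        (f"{scene_id}/NanobananaAPI", f"{scene_id}/ImageApprove"),
        # video gen chain
        (f"{scene_id}/DynamicPrompt", f"{scene_id}/Template"),
        (f"{scene_id}/Template", f"{scene_id}/Veo3"),
        (f"{scene_id}/Veo3", f"{scene_id}/VideoApprove"),
        # approved image feeds Veo's start frame
        (f"{scene_id}/ImageApprove", f"{scene_id}/Veo3"),
    ]
    return {"nodes": nodes, "edges": edges}
```

Patch Mode would operate on a graph like this by applying a change list to existing nodes rather than rebuilding the whole structure.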
## Generation Runner — what it does
You've seen this one already: it's the agent that runs the workflow and produces the candidates. See Chapter 2's Generation Runner page for details.
## How they coordinate
The Manager orchestrates the cast. A typical from-scratch flow:
```mermaid
sequenceDiagram
    participant User
    participant Manager
    participant SW as Script Writer
    participant VP as Visual Planner
    participant IP as Image Prompter
    participant VeoP as Veo Prompter
    participant PWI as PatchWork Importer
    User->>Manager: brief
    Manager->>User: confirm brief
    User->>Manager: approve
    Manager->>SW: script (Mode 5)
    SW-->>Manager: structured script
    Manager->>User: approve script
    User->>Manager: approve
    Manager->>VP: storyboard request
    VP-->>Manager: user summary + internal detail
    Manager->>User: approve storyboard
    User->>Manager: approve
    Manager->>IP: per-scene image prompts
    Manager->>VeoP: per-scene video prompts
    IP-->>Manager: prompt files
    VeoP-->>Manager: prompt files
    Manager->>PWI: assemble .nbflow
    PWI-->>Manager: complete .nbflow
    Manager->>User: ready to generate
```
You don't talk to any of the prompt-writing agents directly. The Manager handles all the delegation. You see the script and the storyboard — the actual prompts are surfaced only if you ask to see them.
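That approval-gated flow can be sketched as a plain function, with each agent as a callable and the user's approvals as a gate. All names are illustrative; the real Manager is an orchestrating agent, not a function:

```python
def orchestrate(agents: dict, approve, brief: dict) -> dict:
    """Sketch of the from-scratch flow: every user gate must pass
    before the next delegation happens.

    `agents` maps role name to a callable; `approve(stage, artifact)`
    stands in for the user's confirmation. Illustrative only.
    """
    if not approve("brief", brief):
        return {"stopped_at": "brief"}
    script = agents["script_writer"](brief)
    if not approve("script", script):
        return {"stopped_at": "script"}
    board = agents["visual_planner"](script)
    if not approve("storyboard", board["user_summary"]):
        return {"stopped_at": "storyboard"}
    # prompt writing fans out; the user does not gate these steps
    image_prompts = agents["image_prompter"](board["internal_detail"])
    video_prompts = agents["veo_prompter"](board["internal_detail"])
    return agents["importer"](image_prompts, video_prompts)
```

Note the shape: three user gates (brief, script, storyboard), then an ungated fan-out to the prompt writers and importer, exactly as the diagram shows.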
## When you'd see individual agent outputs
Most of the time, you only see the user-facing artifacts: script, storyboard summary, the final .nbflow, and the generated candidates.
You might dig into individual agent outputs when:
- A scene's image prompt isn't producing what you want (you want to read the prompt)
- The Visual Planner's storyboard summary leaves out detail you need
- A surgical edit needs you to know exactly which prompt to change
These are advanced cases. The Manager handles agent coordination transparently in normal use.
## When you're ready
→ Next: The Prehook — what a prehook is, when to use one, how it fits in the workflow.