Manual B-roll Generation¶
A workflow's main body (speaking scenes) runs through the .nbflow automatically. B-roll cutaways run separately — they're generated manually outside the workflow file. Here's how that works and why.
Why B-roll runs separately¶
A .nbflow wires speaking scenes together because they depend on each other: the image gen feeds the Veo3 node, which feeds an Approve node. B-roll clips have no such dependencies; each is a standalone clip with its own reference image and prompt. Wiring them into the same .nbflow adds complexity without benefit.
Instead, the Image Prompter and Veo Prompter produce standalone prompt files for B-roll, saved to:
    projects/{month}/{brand}/Assets/{workflow}/broll/
      broll-01-image.json    image prompt for B-roll clip 1
      broll-01-video.txt     Veo prompt for B-roll clip 1
      broll-02-image.json    image prompt for B-roll clip 2
      broll-02-video.txt     Veo prompt for B-roll clip 2
      ...
You run these manually via Claude.
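To check which clips are ready to run, the naming convention above can be scanned with a few lines of Python. This is only a sketch: the helper name `find_broll_prompts` is hypothetical, and the file-name pattern is inferred from the listing rather than taken from any pipeline specification.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from the listing above:
# broll-NN-image.json and broll-NN-video.txt
_PROMPT_RE = re.compile(r"^broll-(\d+)-(image\.json|video\.txt)$")


def find_broll_prompts(broll_dir):
    """Return {clip_number: {"image": Path, "video": Path}} for complete pairs only."""
    found = {}
    for path in sorted(Path(broll_dir).iterdir()):
        match = _PROMPT_RE.match(path.name)
        if match is None:
            continue  # skip generated PNG/MP4 files and anything else
        kind = "image" if match.group(2) == "image.json" else "video"
        found.setdefault(int(match.group(1)), {})[kind] = path
    # A clip is runnable only when both its image prompt and Veo prompt exist.
    return {n: pair for n, pair in sorted(found.items()) if len(pair) == 2}
```

Clips with only one of the two prompt files are left out, so a missing prompt shows up as a missing clip number before any generation starts.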
How to generate a B-roll clip¶
You: generate B-roll clip 1 from Assets/XYZG3/broll/
Claude:
Step 1: generating the reference image using broll-01-image.json
[image generated, saved to broll/broll-01-image.png]
Step 2: generating the Veo clip using broll-01-video.txt
[Veo clip generated, saved to broll/broll-01-clip.mp4]
Both ready in Assets/XYZG3/broll/
The two steps are separate generations, but they run in order: the image gen produces the reference frame, and the Veo gen then takes that frame plus the natural-language video prompt and produces the animated clip.
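The file names in the transcript above follow a predictable pattern, which the sketch below spells out as a two-step plan. The function `plan_broll_clip` and the dictionary keys are hypothetical. The sketch only derives paths and step order, and does not call any image or Veo generation API.

```python
from pathlib import Path


def plan_broll_clip(broll_dir, clip_number):
    """List the two generation steps for one clip, in the order they run."""
    base = Path(broll_dir)
    stem = f"broll-{clip_number:02d}"
    image_out = base / f"{stem}-image.png"
    return [
        {
            "step": "image",
            "prompt": base / f"{stem}-image.json",
            "inputs": [],
            "output": image_out,
        },
        {
            "step": "video",
            "prompt": base / f"{stem}-video.txt",
            # The Veo step animates the reference frame from the image step.
            "inputs": [image_out],
            "output": base / f"{stem}-clip.mp4",
        },
    ]
```

For example, `plan_broll_clip("Assets/XYZG3/broll", 1)` lists the image step first, and the video step names `broll-01-image.png` as its input.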
When to use start-frame-only vs. start + end frames¶
| Situation | Frames to use |
|---|---|
| Static B-roll (ingredient close-up, product hero shot, calm cutaway) | Start frame only (Veo extrapolates motion) |
| Animated transformation (something pouring, opening, changing state) | Start + end frame (Veo interpolates between them) |
| Sequence of steps (recipe with multiple ingredients) | One clip per step, each with its own start frame |
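The table can be read as a small lookup from the kind of B-roll to the frames each clip needs. The sketch below encodes it that way. The `BrollKind` enum and `frames_needed` function are illustrative names, not part of the pipeline.

```python
from enum import Enum


class BrollKind(Enum):
    STATIC = "static"                  # close-up, hero shot, calm cutaway
    TRANSFORMATION = "transformation"  # pouring, opening, changing state
    SEQUENCE = "sequence"              # multi-step recipe and similar


def frames_needed(kind, steps=1):
    """Return one tuple of frame roles per clip that has to be generated."""
    if kind is BrollKind.STATIC:
        return [("start",)]            # Veo extrapolates motion
    if kind is BrollKind.TRANSFORMATION:
        return [("start", "end")]      # Veo interpolates between the two
    if kind is BrollKind.SEQUENCE:
        if steps < 1:
            raise ValueError("a sequence needs at least one step")
        return [("start",)] * steps    # one clip per step, each with its own start frame
    raise ValueError(f"unknown B-roll kind: {kind!r}")
```

A three-ingredient recipe, for instance, comes out as three clips with one start frame each.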
What B-roll generation produces¶
For each B-roll clip:
- 1 reference image PNG (the still frame the Veo gen will animate)
- 1 video MP4 (the animated clip, picked from typically 4 generated candidates)
You pick the best video candidate and use that. The reference image is just an intermediate artifact.
Density-specific notes¶
Recap of what B-roll density means (from Chapter 1 — Pipeline Overview):
| Density | Number of B-roll clips |
|---|---|
| None | 0 — no B-roll at all |
| Low | 2-3 critical-moment clips |
| Medium (default) | One clip per scene that benefits from a cutaway |
| High | One clip per speaking scene |
If density is None, there's nothing in the broll/ folder and nothing to do here. Skip to Exporting Final Assets.
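As a quick sanity check on how many prompt pairs to expect in broll/, the density table can be written as a function. This is a sketch only: `broll_clip_range` and its parameters are hypothetical, and `cutaway_scenes` (the number of scenes judged to benefit from a cutaway) is an input you supply, not something the pipeline is known to expose.

```python
def broll_clip_range(density, speaking_scenes, cutaway_scenes=0):
    """Return the (minimum, maximum) number of B-roll clips for a density level."""
    key = density.strip().lower()
    if key == "none":
        return (0, 0)                    # no B-roll at all
    if key == "low":
        return (2, 3)                    # critical-moment clips only
    if key == "medium":
        return (cutaway_scenes, cutaway_scenes)
    if key == "high":
        return (speaking_scenes, speaking_scenes)
    raise ValueError(f"unknown density: {density!r}")
```

With 6 speaking scenes, `broll_clip_range("High", 6)` gives `(6, 6)`, while `broll_clip_range("None", 6)` gives `(0, 0)`, matching the case where the folder is empty.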
When B-roll is required vs. invented¶
Important: B-roll only exists in a workflow when the source brief / source video actually has it. Don't invent B-roll because a workflow "would benefit from cutaways."
For from-scratch workflows: the user / Manager decides B-roll density at the brief stage based on the platform format.
For video-copy workflows: the source video dictates B-roll. If the source has no B-roll, the copy has no B-roll. Period. See Chapter 11 — Video Copy.
Stitching B-roll into the final video¶
After generation, you'll have:
- The speaking scenes' Veo clips (from the main .nbflow)
- The B-roll clips (from broll/)
Post-production (in your video editor — CapCut, Premiere, etc.) is where they get cut together:
- Lay the speaking scenes on the timeline in order
- At each B-roll insertion point, cut to the B-roll clip while the speaker's audio continues
- Trim each B-roll clip to its target length (typically 2-3 seconds)
- Add captions / on-screen text as needed
The pipeline doesn't do this stitching automatically. The Generation Runner produces the parts; you assemble them.
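Before opening the editor, it can help to sketch the cut list on paper or in code. The example below is one way to do that, under stated assumptions: the function `build_video_track`, the `"speaker"` label, and the 3-second cap are illustrative choices, and nothing here talks to CapCut, Premiere, or PatchWork. It models only the video track, since the speaker's audio runs underneath unchanged.

```python
def build_video_track(scene_durations, inserts, max_broll=3.0):
    """Return (start, end, source) segments for the video track.

    scene_durations: seconds per speaking scene, in timeline order.
    inserts: (start_second, clip_name, clip_duration) tuples that do not overlap.
    """
    total = sum(scene_durations)  # speaking scenes laid end to end
    segments = []
    cursor = 0.0
    for start, name, duration in sorted(inserts):
        # Trim the B-roll to the target length and to the end of the timeline.
        length = min(duration, max_broll, total - start)
        if start < cursor or length <= 0:
            raise ValueError(f"insert {name!r} overlaps or falls outside the timeline")
        if start > cursor:
            segments.append((cursor, start, "speaker"))
        segments.append((start, start + length, name))
        cursor = start + length
    if cursor < total:
        segments.append((cursor, total, "speaker"))
    return segments
```

With scenes of 4 and 6 seconds and one 5-second B-roll clip inserted at second 2, the clip is trimmed to 3 seconds and the speaker footage resumes at second 5.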
When you're ready¶
→ Next: Exporting Final Assets — getting the picked candidates out of PatchWork in a usable format.