The Basics¶
If you've never worked with AI-generated content before, start here. This page introduces the five ideas you need before anything else in the wiki will make sense.
What "AI generation" means¶
You describe what you want in words. An AI model produces it.
For images:
You write: "A woman in her 40s in a sunlit kitchen, holding a coffee mug,
warm window light from her left, candid selfie style"
AI produces: [an image matching that description]
For videos, you do the same thing — usually with a starting still image attached so the AI knows what the scene looks like before it animates.
You don't draw anything. You don't film anything. You write a description, the model handles the rest. The quality of what comes back is mostly a function of how well you described it.
What a "prompt" is¶
A prompt is the text you write to describe what you want. That's it. Every image and every video in this pipeline starts from a prompt.
Prompts can be short ("a red apple on a wooden table") or detailed (a paragraph specifying camera angle, lighting, subject, wardrobe, mood, color palette). For production work, detailed prompts win — the model has more to anchor to.
The pipeline writes prompts for you. You don't usually type the prompt yourself; agents inside the pipeline (the Image Prompter, the Veo Prompter) translate your brief into well-structured prompts. But you'll read them, edit them, and understand them — so knowing what they look like matters.
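To make the "detailed prompts win" point concrete, here is a minimal sketch of how a prompter agent might assemble a structured prompt from a brief's visual details. The function name and fields are illustrative assumptions, not the pipeline's actual prompt format.

```python
# Illustrative only: a stand-in for how an Image Prompter agent might
# join a brief's visual details into one detailed, well-anchored prompt.

def build_image_prompt(subject, setting, lighting, camera, mood):
    """Join the non-empty visual details into a single prompt string."""
    parts = [subject, setting, lighting, camera, mood]
    return ", ".join(p for p in parts if p)

prompt = build_image_prompt(
    subject="a woman in her 40s holding a coffee mug",
    setting="sunlit kitchen",
    lighting="warm window light from her left",
    camera="candid selfie style",
    mood="relaxed morning",
)
print(prompt)
```

The more specific the fields you fill in, the more the model has to anchor to; an empty field is simply dropped rather than padded with a guess.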
What an "AI avatar" is¶
An avatar is a virtual person — a face that looks consistent across many images and videos, but isn't a real person. You give the AI a single reference photo of a face (a 3-panel portrait — front, three-quarter, side profile). After that, every image of "that person" the AI generates uses the same face.
The avatar is not a real human. They have no name, no body, no identity outside what the prompt establishes. The reference photo locks the face — wardrobe, body, setting, pose are all decided per-scene by the prompt.
Each account in the pipeline (Account A, Account B, etc.) has its own avatar. The avatar is the consistent "person" that account's content features.
What a "workflow" is¶
A workflow is the full set of scripts, prompts, settings, and generated media for one piece of content — typically one short-form video destined for TikTok, Reels, or Facebook.
A workflow lives as a single file (a .nbflow file) that holds:
- The dialogue and pacing (per scene)
- The image prompts (per scene)
- The video prompts (per scene)
- The references (avatar photo, product photo)
- The connections between them — what feeds into what
When you "run a workflow," the pipeline reads this file and generates everything described inside it. You end up with a folder of candidate images and videos to review.
Workflows are reusable. Once one performs well, you can produce variants of it (a script tweak, a wardrobe swap, a setting change) without starting over.
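The contents listed above can be pictured as structured data. Here is a toy model of one workflow as a Python dict; the real .nbflow schema is not documented on this page, so every key name below is an assumption made for illustration.

```python
# Illustrative only: the kind of data a workflow file holds, modeled as
# a dict. Key names are invented; this is not the actual .nbflow schema.

workflow = {
    "references": {
        "avatar": "avatar_3panel.png",   # the face-locking reference photo
        "product": "product.png",
    },
    "scenes": [
        {
            "dialogue": "You will not believe what this did for my mornings.",
            "pacing_seconds": 3.5,
            "image_prompt": "avatar in a sunlit kitchen, holding the product",
            "video_prompt": "slow push-in, avatar raises the mug and smiles",
        },
    ],
    # Connections: what feeds into what.
    "links": [("references.avatar", "scenes[0].image_prompt")],
}

# "Running the workflow" means walking each scene and generating the
# media its prompts describe.
for scene in workflow["scenes"]:
    assert scene["image_prompt"] and scene["video_prompt"]
```

Producing a variant is then just a small edit to this data (a new dialogue line, a different setting in the image prompts) rather than a rebuild from scratch.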
What "PatchWork" is¶
PatchWork is the visual tool that holds workflow files. It looks like a graph editor — boxes (called nodes) connected by lines (called links).
Each node does one job:
- A Prompt node holds prompt text
- A Media node holds an image (an upload or a generation result)
- A NanobananaAPI node generates images
- A Veo3 node generates videos
- An Approve node shows you the candidates and lets you pick
You don't usually wire nodes yourself by hand. The pipeline builds the graph for you. But you'll open PatchWork to review the results — see which images came out, pick the best ones, rerun anything that didn't land.
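The node-and-link graph PatchWork displays can be sketched as follows. The class and field names here are invented for illustration; they are not PatchWork's API.

```python
# Illustrative only: a toy model of a PatchWork-style graph, where each
# node does one job and links say which node feeds which.

from dataclasses import dataclass, field

@dataclass
class Node:
    kind: str    # e.g. "Prompt", "Media", "NanobananaAPI", "Veo3", "Approve"
    name: str
    inputs: list = field(default_factory=list)  # names of upstream nodes

# A minimal image chain: prompt -> image generator -> approval step.
nodes = {
    "scene1_prompt": Node("Prompt", "scene1_prompt"),
    "scene1_gen": Node("NanobananaAPI", "scene1_gen",
                       inputs=["scene1_prompt"]),
    "scene1_pick": Node("Approve", "scene1_pick",
                        inputs=["scene1_gen"]),
}

# The Approve node receives the generator's candidates for review.
assert nodes["scene1_pick"].inputs == ["scene1_gen"]
```

Reading a graph like this is the skill you actually need: given any node, its `inputs` tell you what fed it, which is how you trace a bad result back to the prompt that caused it.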
How a workflow goes from idea to finished video¶
The big picture:
```mermaid
flowchart LR
    A[Brief] --> B[Script]
    B --> C[Storyboard]
    C --> D[PatchWork<br/>workflow file]
    D --> E[Generation]
    E --> F[Review<br/>and pick]
    F --> G[Final videos]
```
- Brief — you describe the product, audience, tone
- Script — the Script Writer writes the spoken dialogue
- Storyboard — the Visual Planner breaks the script into scenes with camera and visual direction
- PatchWork workflow file — all the prompts get bundled into one file with everything wired together
- Generation — the Generation Runner produces 4 candidate images per scene and 4 candidate clips per scene
- Review — you open the results in PatchWork, pick the best candidates
- Final videos — export the picks, post them
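The fan-out at the Generation and Review steps can be sketched in a few lines. The `generate` function below is a stand-in, not a real pipeline call; the constant of 4 candidates per generation comes from the description above.

```python
# Illustrative only: the Generation stage fans out to 4 candidate images
# and 4 candidate clips per scene; Review narrows each back to one pick.

CANDIDATES_PER_GENERATION = 4

def generate(scene_id, kind):
    """Stand-in generator: return 4 named candidates for one scene."""
    return [f"{scene_id}_{kind}_{i}" for i in range(CANDIDATES_PER_GENERATION)]

scenes = ["scene1", "scene2", "scene3"]
picks = {}
for scene in scenes:
    images = generate(scene, "img")
    clips = generate(scene, "vid")
    # At Review you keep the best candidate of each; here we just
    # take the first to show the shape of the result.
    picks[scene] = (images[0], clips[0])
```

For a 3-scene video this means reviewing 24 candidates (12 images, 12 clips) to arrive at 3 final image-and-clip pairs.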
Each step has its own page in this wiki. The next chapter — Pipeline Overview — walks through them in detail.
Vocabulary you'll see repeatedly¶
These five terms are the most important to internalize before going further:
- Prompt: The text description that tells the AI what to generate.
- Reference image: A photo you upload that the AI uses as a visual anchor — most often the avatar's face, sometimes a product or setting.
- Generation: The act of running an AI model with a prompt and getting back media (images or video clips). Each generation produces 4 candidate results.
- Candidate: One of the 4 results from a single generation. You pick the best candidate to keep; the rest are discarded.
- Workflow: The full file (a .nbflow) that holds everything for one piece of content.
If a sentence anywhere in this wiki uses one of these and you forget what it means, this page is the place to come back to.
What to read next¶
You're ready for Pipeline Overview. That page walks through the 6 stages from brief to delivery in full detail.
If you want to come back here later, this page is always in Orientation in the left sidebar.