Judging Candidates¶
Every generation produces 4 candidates per scene. Your job is to pick one — the strongest take — and move on. This page is about doing that quickly and confidently, without falling into common traps.
The 80/20 of picking¶
For most scenes, the right candidate is obvious within 5 seconds. Glance at the four; the strongest one usually stands out. Pick it and move on.
You're going through 8-15 scenes per workflow. Spending 30 seconds per scene on a careful comparison adds up to roughly 4 to 7.5 minutes of deliberation per workflow, and the decision fatigue builds with every scene. That fatigue is your enemy. Speed beats deliberation when the candidate set is good.
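The time cost is plain multiplication. A quick sketch of the arithmetic, using only the figures quoted above (8-15 scenes, 30 seconds for a careful comparison, 5 seconds for a glance); the function name is illustrative:

```python
# Rough picking time per workflow, using the figures from the text.
SCENES_MIN, SCENES_MAX = 8, 15   # scenes per workflow
CAREFUL_SECONDS = 30             # careful comparison per scene
GLANCE_SECONDS = 5               # quick pick per scene


def minutes_spent(scenes: int, seconds_per_scene: int) -> float:
    """Total picking time in minutes for one workflow."""
    return scenes * seconds_per_scene / 60


careful_low = minutes_spent(SCENES_MIN, CAREFUL_SECONDS)
careful_high = minutes_spent(SCENES_MAX, CAREFUL_SECONDS)
glance_high = minutes_spent(SCENES_MAX, GLANCE_SECONDS)

print(f"careful: {careful_low:.1f} to {careful_high:.1f} min")
print(f"glance:  at most {glance_high:.2f} min")
```

Even at the top of the range, glancing keeps the whole pass well under two minutes.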
What to actually look at¶
In order of priority (highest to lowest):
flowchart TD
A[1. Face identity matches the avatar]
B[2. Composition fits the scene goal]
C[3. Wardrobe + props correct]
D[4. Body pose natural]
E[5. Setting matches]
A --> B --> C --> D --> E
1. Face identity¶
Does this candidate look like the avatar? If a candidate's face is subtly off (wrong age, drift toward a different person), drop it. Identity beats everything else — a candidate with perfect lighting but the wrong face is unusable.
2. Composition¶
Does the framing serve the scene? Is the subject in the right position, looking the right direction, holding the product (if any) where it should be? A candidate with "good vibes" but the wrong framing won't cut together with the rest of the workflow.
3. Wardrobe and props¶
Check that the clothing matches the storyboard, the product is in frame, and the accessories are right. Skip candidates where the wardrobe drifted.
4. Body pose¶
Natural — not stiff, not contorted, not awkward. Veo will animate from this still, so the pose has to read as something a real person would actually do.
5. Setting¶
Check that the environment is right, but don't over-weight it: backgrounds often have minor variations across the 4 candidates that don't matter at scroll speed.
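The priority order above behaves like a hard filter followed by a lexicographic comparison: identity disqualifies outright, and each later criterion only breaks ties in the earlier ones. A minimal sketch of that ordering, assuming a hypothetical `Candidate` record with hand-assigned 0-2 scores; in practice this judgment happens by eye, and none of these names come from the actual tooling:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    # All fields are hypothetical; scores are a reviewer's own 0-2 ratings.
    name: str
    face_matches: bool   # 1. identity: a hard requirement, not a score
    composition: int     # 2. framing fits the scene goal
    wardrobe_props: int  # 3. clothing, product, accessories
    pose: int            # 4. natural body pose
    setting: int         # 5. environment


def pick(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the strongest candidate, or None if no face matches."""
    usable = [c for c in candidates if c.face_matches]
    if not usable:
        return None
    # Tuples compare left to right, so composition outranks wardrobe,
    # wardrobe outranks pose, and pose outranks setting.
    return max(
        usable,
        key=lambda c: (c.composition, c.wardrobe_props, c.pose, c.setting),
    )


candidates = [
    Candidate("A", False, 2, 2, 2, 2),  # best on paper, wrong face
    Candidate("B", True, 2, 1, 1, 0),
    Candidate("C", True, 1, 2, 2, 2),
    Candidate("D", True, 2, 1, 0, 2),
]
best = pick(candidates)
print(best.name if best else "none usable")
```

Candidate A is dropped despite top scores everywhere else, and B beats D on pose before setting is ever considered.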
What NOT to look at¶
Things that look bad in a still image but don't matter in the final video:
- Hallucinated background text — store signage, price tags, fake brand labels. NB2 is fundamentally bad at text. Garbled background text is invisible at scroll speed. Ignore it.
- Minor facial asymmetry — slight eye misalignment, an unusual ear shape. The viewer won't notice in 8 seconds of motion.
- Background details you only see when zooming in — patterns on a backdrop, distant clutter. Not load-bearing.
- Hand details that won't be in motion — if the hand sits at rest off-screen in Veo, anatomic perfection in the still doesn't matter.
If you're zooming into a 9:16 candidate to find a flaw, you've already gone too deep. Step back and look at it the way a viewer will see it — phone screen, 1.5 seconds of attention before scroll.
Composition over polish¶
A consistently good rule of thumb:
A candidate with good composition and slight facial asymmetry beats a candidate with perfect anatomy and dead-eyed posing.
Composition is what reads at scroll speed. Anatomy is what you notice when you stop and inspect. Most viewers don't stop and inspect.
If two candidates are close and one has more energy, a better lean, or a more committed pose, pick that one, even if the other is technically cleaner.
When none of the 4 are good¶
This is where the decision tree from How Quality Is Handled kicks in. Specifically:
- All 4 have the same issue (lighting off, wardrobe wrong, hand in wrong position) → Prompt tuning
- All 4 have AI tells (extra fingers, melted faces, garbled features) → tell Claude "regen Scene N with 10 attempts"
- All 4 are "fine" but the scene is uninspired → may be a Lvl 3 variant situation, not a re-roll
Don't accept a weak candidate just because you're tired of iterating. But also don't iterate forever chasing a phantom of perfection. The middle ground:
If you've seen 8+ candidates across 2 regens and nothing has been right, the issue is not random variance. Step back and fix the prompt or the reference, not the seed.
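The triage above can be written down as a small decision function. This is a sketch under stated assumptions: the `Action` names and the `triage` signature are hypothetical, and the order in which the checks are applied (the "8+ candidates across 2 regens" stop rule first, then shared issues, then AI tells) is one reasonable reading, not something this page specifies.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical labels for the outcomes described above.
    PROMPT_TUNING = "prompt tuning"
    REGEN_10 = "regen scene with 10 attempts"
    CONSIDER_VARIANT = "consider a Lvl 3 variant"
    FIX_PROMPT_OR_REFERENCE = "fix the prompt or the reference"


def triage(shared_issue: bool, ai_tells: bool,
           candidates_seen: int, regens: int) -> Action:
    """Suggest the next step when none of the candidates is good."""
    # 8+ candidates across 2 regens with nothing right is not random variance.
    if candidates_seen >= 8 and regens >= 2:
        return Action.FIX_PROMPT_OR_REFERENCE
    # Same flaw in every candidate points at the prompt.
    if shared_issue:
        return Action.PROMPT_TUNING
    # Extra fingers, melted faces: more attempts may land a clean take.
    if ai_tells:
        return Action.REGEN_10
    # Everything is "fine" but uninspired.
    return Action.CONSIDER_VARIANT


print(triage(shared_issue=True, ai_tells=False, candidates_seen=4, regens=0).value)
print(triage(shared_issue=False, ai_tells=True, candidates_seen=8, regens=2).value)
```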
Threshold to ship¶
Different work has different bars:
| Content | Threshold |
|---|---|
| Internal test / experiment | 1 of 4 acceptable. Pick the best, ship the test |
| Standard production workflow | 3 of 4 acceptable. Pick the strongest, move on |
| Premium / hero workflow | 4 of 4 acceptable. High polish on every scene |
| Client-facing premium | 4 of 4 + a second pair of eyes before delivery |
Set the bar before you start picking, not after. If you didn't decide ahead of time what "good enough" looks like, you'll either:
- Accept too easily because you're tired
- Iterate forever because you keep finding small things to fix
Set the bar. Hit it. Ship.
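One way to make "set the bar before you start" concrete is to write the table down as a lookup and check each scene against it. A minimal sketch: the dictionary keys, the `meets_bar` function, and the `second_review_done` flag are all hypothetical names; only the numbers come from the table above.

```python
# content type -> (acceptable candidates required out of 4, second reviewer needed)
THRESHOLDS = {
    "internal_test": (1, False),
    "standard_production": (3, False),
    "premium_hero": (4, False),
    "client_facing_premium": (4, True),
}


def meets_bar(content_type: str, acceptable_count: int,
              second_review_done: bool = False) -> bool:
    """True if a scene's candidate set clears the bar chosen up front."""
    required, needs_review = THRESHOLDS[content_type]
    if acceptable_count < required:
        return False
    # Client-facing premium work also needs a second pair of eyes.
    return second_review_done or not needs_review


print(meets_bar("standard_production", 3))
print(meets_bar("client_facing_premium", 4))
print(meets_bar("client_facing_premium", 4, second_review_done=True))
```

The point of fixing the lookup first is that the answer for a given scene no longer depends on how tired you are when you reach it.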
When the workflow has bottleneck scenes¶
Some workflows have one or two scenes that won't land while the rest are fine. The right move is targeted investment:
- Leave the clean scenes at the default iteration count
- Spend extra iterations + prompt-tuning specifically on the stubborn scene
- Don't pull the whole workflow's quality bar up to match the one stubborn scene
Tell Claude: "Scene 04 keeps producing bad hands. Bump attempts to 10 for just that node. Other scenes are fine."
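Targeted investment amounts to a per-scene override on top of a shared default. The sketch below illustrates the idea only: the scene IDs, the `attempts_for` helper, and the `DEFAULT_ATTEMPTS` value are assumptions made for the example (this page does not state the real default), and the actual change is made by asking Claude as shown above.

```python
from typing import Dict

DEFAULT_ATTEMPTS = 4  # placeholder value, assumed for illustration only


def attempts_for(scene_id: str, overrides: Dict[str, int],
                 default: int = DEFAULT_ATTEMPTS) -> int:
    """Attempts for one scene: its override if present, else the default."""
    return overrides.get(scene_id, default)


# Only the stubborn scene gets extra attempts.
overrides = {"scene_04": 10}

# An 8-scene workflow: 7 scenes stay at the default, scene 04 is bumped.
plan = {
    f"scene_{n:02d}": attempts_for(f"scene_{n:02d}", overrides)
    for n in range(1, 9)
}

for scene_id, attempts in plan.items():
    print(scene_id, attempts)
```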
When you're ready¶
You've finished Chapter 5. You can now:
- Trust the automatic checks (sanity check, auto-QA, auto-rerun) to handle the routine quality work
- Know when you need to step in vs. let the system do its job
- Invoke the prompt-tuning skill when a candidate set has a structural issue
- Pick from a candidate set quickly without burning yourself out
- Set explicit quality thresholds before starting a pass
→ Next: Chapter 6 — Lvl 3-4 Variants. Bigger variants: environment, camera, setting, structural changes. These need testing, and the quality skills from this chapter are what let you judge whether a test iteration is converging.