Judging Candidates¶

Every generation produces 4 candidates per scene. Your job is to pick one — the strongest take — and move on. This page is about doing that quickly and confidently, without falling into common traps.

The 80/20 of picking¶

For most scenes, the right candidate is obvious within 5 seconds. Glance at the four; the strongest one usually stands out. Pick it and move on.

You're going through 8-15 scenes per workflow. Spending 30 seconds per scene on a careful comparison adds up to roughly 4 to 7.5 minutes, plus the decision fatigue that comes with it. That fatigue is your enemy. Speed beats deliberation when the candidate set is good.
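The arithmetic behind that estimate is easy to check. This is a minimal sketch; the function name is hypothetical and only illustrates the multiplication.

```python
def comparison_minutes(scenes: int, seconds_per_scene: float) -> float:
    """Total time spent comparing candidates across a workflow, in minutes."""
    return scenes * seconds_per_scene / 60


# Careful comparison (30 s per scene) at both ends of the 8-15 scene range.
print(comparison_minutes(8, 30))   # 4.0
print(comparison_minutes(15, 30))  # 7.5

# A 5-second glance per scene over the longest workflow.
print(comparison_minutes(15, 5))   # 1.25
```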

What to actually look at¶

In order of priority (highest to lowest):

flowchart TD
    A[1. Face identity matches the avatar]
    B[2. Composition fits the scene goal]
    C[3. Wardrobe + props correct]
    D[4. Body pose natural]
    E[5. Setting matches]
    A --> B --> C --> D --> E

1. Face identity¶

Does this candidate look like the avatar? If a candidate's face is subtly off (wrong age, drift toward a different person), drop it. Identity beats everything else — a candidate with perfect lighting but the wrong face is unusable.

2. Composition¶

Does the framing serve the scene? Is the subject in the right position, looking the right direction, holding the product (if any) where it should be? A candidate with "good vibes" but the wrong framing won't cut together with the rest of the workflow.

3. Wardrobe and props¶

The clothing matches the storyboard. The product is in frame. The accessories are right. Skip candidates where wardrobe drifted.

4. Body pose¶

Natural — not stiff, not contorted, not awkward. Veo will animate from this still, so the pose has to read as something a real person would actually do.

5. Setting¶

The environment is right. Don't over-weight this — backgrounds often have minor variances across the 4 candidates that don't matter at scroll speed.
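If it helps to see the ordering spelled out, the five checks above behave like a lexicographic comparison: a higher-priority check always outweighs everything below it. The sketch below is only an illustration of that idea. The `Candidate` fields and the `pick` helper are hypothetical, not part of any tool, and in practice you make these calls by eye.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    # Hypothetical yes/no judgments, one per check, in priority order.
    name: str
    face_matches: bool
    composition_fits: bool
    wardrobe_correct: bool
    pose_natural: bool
    setting_matches: bool


def priority_key(c: Candidate) -> tuple:
    """Tuple compared left to right, so earlier checks dominate later ones."""
    return (
        c.face_matches,
        c.composition_fits,
        c.wardrobe_correct,
        c.pose_natural,
        c.setting_matches,
    )


def pick(candidates: list[Candidate]) -> Optional[Candidate]:
    """Drop wrong-face candidates, then take the best by priority order."""
    usable = [c for c in candidates if c.face_matches]
    if not usable:
        return None  # nothing usable: treat it as a "none of the 4 are good" case
    return max(usable, key=priority_key)


takes = [
    Candidate("A", False, True, True, True, True),   # wrong face: unusable
    Candidate("B", True, True, False, True, False),  # framing right, wardrobe drifted
    Candidate("C", True, False, True, True, True),   # polished but wrong framing
]
print(pick(takes).name)  # B
```

Candidate B wins over C even though C passes more checks, because composition ranks above wardrobe, pose, and setting.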

What NOT to look at¶

Things that look bad in a still image but don't matter in the final video:

Hallucinated background text — store signage, price tags, fake brand labels. NB2 is fundamentally bad at text. Garbled background text is invisible at scroll speed. Ignore it.
Minor facial asymmetry — slight eye misalignment, an unusual ear shape. The viewer won't notice in 8 seconds of motion.
Background details you only see when zooming in — patterns on a backdrop, distant clutter. Not load-bearing.
Hand details that won't be in motion — if the hand sits at rest off-screen in Veo, anatomic perfection in the still doesn't matter.

If you're zooming into a 9:16 candidate to find a flaw, you've already gone too deep. Step back and look at it the way a viewer will see it — phone screen, 1.5 seconds of attention before scroll.

Composition over polish¶

A consistently good rule of thumb:

A candidate with good composition + slight facial asymmetry beats a candidate with perfect anatomy + dead-eyed posing.

Composition is what reads at scroll speed. Anatomy is what you notice when you stop and inspect. Most viewers don't stop and inspect.

If two candidates are close and one has more energy / better lean / more committed pose — pick that one, even if the other is technically cleaner.

When none of the 4 are good¶

This is where the decision tree from How Quality Is Handled kicks in. Specifically:

All 4 have the same issue (lighting off, wardrobe wrong, hand in wrong position) → Prompt tuning
All 4 have AI tells (extra fingers, melted faces, garbled features) → tell Claude "regen Scene N with 10 attempts"
All 4 are "fine" but the scene is uninspired → may be a Lvl 3 variant situation, not a re-roll

Don't accept a weak candidate just because you're tired of iterating. But also don't iterate forever chasing a phantom of perfection. The middle ground:

If you've seen 8+ candidates across 2 regens and nothing has been right, the issue is not random variance. Step back and fix the prompt or the reference, not the seed.
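The three cases and the stop rule above can be written down as a small lookup. This is a sketch for clarity only: the `Diagnosis` names and the `next_step` function are hypothetical, and the real action is still taken by asking Claude in plain language.

```python
from enum import Enum


class Diagnosis(Enum):
    SHARED_ISSUE = "all four share the same issue"
    AI_TELLS = "all four show AI tells"
    UNINSPIRED = "all four are fine but the scene is flat"


NEXT_STEP = {
    Diagnosis.SHARED_ISSUE: "prompt tuning",
    Diagnosis.AI_TELLS: "regen the scene with 10 attempts",
    Diagnosis.UNINSPIRED: "consider a Lvl 3 variant instead of a re-roll",
}

STOP_REROLLING = "stop re-rolling; fix the prompt or the reference"


def next_step(diagnosis: Diagnosis, candidates_seen: int, regens_done: int) -> str:
    """Apply the stop rule first, then fall back to the per-case action."""
    # 8+ candidates across 2 regens with nothing right is not random variance.
    if candidates_seen >= 8 and regens_done >= 2:
        return STOP_REROLLING
    return NEXT_STEP[diagnosis]


print(next_step(Diagnosis.AI_TELLS, candidates_seen=4, regens_done=0))
print(next_step(Diagnosis.AI_TELLS, candidates_seen=12, regens_done=2))
```

The second call returns the stop message even though the diagnosis is the same, which is the point of the rule: after enough failed regens, another roll of the seed is the wrong fix.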

Threshold to ship¶

Different work has different bars:

Content	Threshold
Internal test / experiment	1 of 4 acceptable. Pick the best, ship the test
Standard production workflow	3 of 4 acceptable. Pick the strongest, move on
Premium / hero workflow	4 of 4 acceptable. High polish on every scene
Client-facing premium	4 of 4 + a second pair of eyes before delivery
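One way to make the bar explicit before a pass is to write it down as data. The sketch below assumes "N of 4 acceptable" means at least N of a scene's four candidates pass your checks. The tier keys and the `meets_bar` helper are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class Bar(NamedTuple):
    min_acceptable: int          # how many of the 4 candidates must be acceptable
    needs_second_reviewer: bool  # whether someone else must sign off before delivery


# Values taken from the table above; the keys are illustrative labels.
BARS = {
    "internal_test": Bar(1, False),
    "standard_production": Bar(3, False),
    "premium_hero": Bar(4, False),
    "client_premium": Bar(4, True),
}


def meets_bar(tier: str, acceptable: int, second_review_done: bool = False) -> bool:
    """True if a scene's candidate set clears the bar chosen for this tier."""
    bar = BARS[tier]
    if acceptable < bar.min_acceptable:
        return False
    return second_review_done or not bar.needs_second_reviewer


print(meets_bar("standard_production", acceptable=3))  # True
print(meets_bar("premium_hero", acceptable=3))         # False
print(meets_bar("client_premium", acceptable=4))       # False until reviewed
```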

Set the bar before you start picking, not after. If you didn't decide ahead of time what "good enough" looks like, you'll either:

Accept too easily because you're tired
Iterate forever because you keep finding small things to fix

Set the bar. Hit it. Ship.

When the workflow has bottleneck scenes¶

Some workflows have one or two scenes that won't land while the rest are fine. The right move is targeted investment:

Leave the clean scenes at the default iteration count
Spend extra iterations + prompt-tuning specifically on the stubborn scene
Don't pull the whole workflow's quality bar up to match the one stubborn scene

Tell Claude: "Scene 04 keeps producing bad hands. Bump attempts to 10 for just that node. Other scenes are fine."
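To see why targeting one node is cheaper than raising the count everywhere, here is a rough sketch. Everything in it is an assumption for illustration: the scene IDs, the default of 4 attempts, and the `attempts_plan` helper are hypothetical, and the real change is made by asking Claude as shown above.

```python
def attempts_plan(scene_ids, overrides, default_attempts):
    """Per-scene attempt counts: the default everywhere except overridden scenes."""
    unknown = set(overrides) - set(scene_ids)
    if unknown:
        raise ValueError(f"override for unknown scene(s): {sorted(unknown)}")
    return {sid: overrides.get(sid, default_attempts) for sid in scene_ids}


# Hypothetical 8-scene workflow with an assumed default of 4 attempts per scene.
scenes = [f"scene_{n:02d}" for n in range(1, 9)]

targeted = attempts_plan(scenes, {"scene_04": 10}, default_attempts=4)
everywhere = attempts_plan(scenes, {}, default_attempts=10)

print(targeted["scene_04"], targeted["scene_01"])  # 10 4
print(sum(targeted.values()))                      # 38
print(sum(everywhere.values()))                    # 80
```

Under these assumed numbers, bumping only the stubborn scene costs less than half the attempts of raising the count for the whole workflow.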

When you're ready¶

You've finished Chapter 5. You can now:

Trust the automatic checks (sanity check, auto-QA, auto-rerun) to handle the routine quality work
Know when you need to step in vs. let the system do its job
Invoke the prompt-tuning skill when a candidate set has a structural issue
Pick from a candidate set quickly without burning yourself out
Set explicit quality thresholds before starting a pass

→ Next: Chapter 6 — Lvl 3-4 Variants. Bigger variants (environment, camera, setting, structural changes) need testing, and the quality skills from this chapter are what let you judge whether a test iteration is converging.