Iterating with Prompt Tuning

The prompt-tuning skill is built for the testing phase. When you're at V2-0-3 and Scene 04 just won't land — same problem across attempts, normal regens not helping — prompt-tuning is the right tool.

This page shows how it fits into the testing-phase iteration loop.

When to reach for it during testing

You've tried:

  • Re-rolling the seed (same prompt, fresh candidates) — didn't help
  • One targeted prompt edit — partially helped but not enough
  • Rerunning with up to 10 regen attempts — still not consistent

Now what? Prompt tuning. It uses visual judgment between iterations to converge on the right prompt.

The integrated flow

flowchart TB
    A[Testing iteration N: scene X still wrong] --> B{Tried normal regen?}
    B -->|no| C[Re-roll seed first]
    C --> D{Solved?}
    D -->|yes| E[Continue testing]
    D -->|no| F{Tried prompt edit?}
    B -->|yes| F
    F -->|no| G[Edit prompt, rerun]
    G --> H{Solved?}
    H -->|yes| E
    H -->|no| I[Invoke prompt-tuning skill]
    F -->|yes| I
    I --> J[Iterate the prompt across N rounds]
    J --> K{Converged?}
    K -->|yes| L[Apply final prompt, rerun scene, bump V0-N]
    K -->|no| M[Scene may not be doable — consider reframing]
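
The escalation order in the flowchart can be sketched as a small decision function. This is only an illustration: `SceneHistory` and `next_step` are hypothetical names, not part of the prompt-tuning skill or the Generation Runner.

```python
from dataclasses import dataclass


@dataclass
class SceneHistory:
    """What has already been tried on one stubborn scene (hypothetical record)."""
    rerolled_seed: bool = False
    edited_prompt: bool = False
    solved: bool = False


def next_step(history: SceneHistory) -> str:
    """Return the next action for a scene, cheapest fix first."""
    if history.solved:
        return "continue testing"
    if not history.rerolled_seed:
        return "re-roll seed"
    if not history.edited_prompt:
        return "edit prompt and rerun"
    # Both cheap fixes were tried and the scene is still wrong.
    return "invoke prompt-tuning"
```

The point of the ordering is that prompt-tuning is only reached after the seed re-roll and a targeted prompt edit have both failed.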

What "converged" means

Prompt-tuning iterates until one of two things happens. Only the second counts as converging:

  • When it reaches the iteration cap you set (typically 5-8)
  • When it judges the latest output matches the reference closely enough (early stop)

If it hits the cap without converging, you have a choice:

  • Bump the iteration count and keep going (e.g., another 5 rounds)
  • Accept the closest version and move on
  • Decide the scene's structurally not going to work — reframe / split / drop it
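
The two stop conditions above can be sketched as a loop with a cap and an early exit. Everything here is an assumption made for illustration: `tune_prompt`, the `score` and `revise` callables, and the numeric `threshold` stand in for the skill's generation and visual-judgment steps, whose real interface is not described on this page.

```python
from typing import Callable, NamedTuple


class TuningResult(NamedTuple):
    prompt: str
    iterations_run: int
    converged: bool


def tune_prompt(
    prompt: str,
    revise: Callable[[str, float], str],  # hypothetical: proposes the next prompt
    score: Callable[[str], float],        # hypothetical: generate + judge vs. reference
    max_iterations: int = 5,
    threshold: float = 0.9,
) -> TuningResult:
    """Iterate until the score clears the threshold or the cap is reached."""
    best_prompt, best_score = prompt, float("-inf")
    for i in range(1, max_iterations + 1):
        current = score(prompt)
        if current > best_score:
            best_prompt, best_score = prompt, current
        if current >= threshold:
            return TuningResult(prompt, i, True)  # early stop: converged
        prompt = revise(prompt, current)
    # Cap reached without converging: hand back the closest version seen.
    return TuningResult(best_prompt, max_iterations, False)
```

When `converged` is false, the caller still has the three options listed above: raise `max_iterations` and run again, accept the returned closest prompt, or rework the scene.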

Integrating tuning into the V0-N bump

When prompt-tuning produces a dialed-in prompt:

  1. The final prompt is saved back into the workflow file (or sidecar)
  2. You bump V0-N — the prompt change is a meaningful workflow change
  3. You regenerate just that scene with the new prompt
  4. You evaluate: did the dialed-in prompt produce the candidates you wanted?

If yes — continue testing other scenes. If no — the prompt-tuning loop converged but the output still isn't right. That usually means the reference image is the problem, or the scene structure itself isn't going to work.
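
As a rough sketch of the bump in step 2, the helper below increments the last number of a label such as `XYZS1-V2-0-3`. It assumes the label ends in three dash-separated numbers after `V` and that the final one is the testing iteration, as the V2-0-3 to V2-0-4 example on this page suggests. `bump_test_iteration` is a hypothetical helper, not an existing command.

```python
import re

# Assumed label shape: optional prefix, then V<major>-<minor>-<iteration>.
LABEL = re.compile(r"^(?P<prefix>.*V\d+-\d+-)(?P<n>\d+)$")


def bump_test_iteration(label: str) -> str:
    """Return the label with its trailing testing-iteration number incremented."""
    match = LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized version label: {label!r}")
    return f"{match['prefix']}{int(match['n']) + 1}"


print(bump_test_iteration("XYZS1-V2-0-3"))  # XYZS1-V2-0-4
```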

Cost discipline

Each prompt-tuning iteration is one image generation (4 candidates). 5 iterations = 20 candidates for one scene's tuning.

Across a workflow's testing phase, prompt-tuning on 2-3 stubborn scenes can add up:

| Scenes tuned | Iterations each | Total extra candidates (4 per iteration) |
|---|---|---|
| 1 scene | 5 | 20 candidates |
| 2 scenes | 5 | 40 candidates |
| 3 scenes | 8 | 96 candidates |

This is on top of the normal Generation Runner passes for the workflow. Stay aware of the cost.
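
The arithmetic is simple enough to check before committing to a tuning run. A minimal sketch, using the 4-candidates-per-generation figure stated above (`tuning_cost` is a hypothetical helper name):

```python
CANDIDATES_PER_ITERATION = 4  # from the text: one generation yields 4 candidates


def tuning_cost(scenes: int, iterations_each: int) -> tuple[int, int]:
    """Return (extra generations, extra candidates) for a tuning plan."""
    generations = scenes * iterations_each
    return generations, generations * CANDIDATES_PER_ITERATION


print(tuning_cost(1, 5))  # (5, 20)
print(tuning_cost(3, 8))  # (24, 96)
```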

When prompt-tuning makes sense:

  • Workflow is high-value (top performer being refreshed)
  • One scene is the lynchpin and the rest is locked
  • You've tried cheaper fixes first

When it doesn't:

  • The whole workflow is in early testing — let normal iterations sort most of it
  • Many scenes are off — you have a bigger problem than a prompt tune can fix
  • The cheap fixes (seed re-roll, prompt edit, regen attempts) weren't tried first

Example session — tuning during V2-0-3

Status: XYZS1-V2-0-3 in testing. Scenes 01, 02, 05, 06, 07, 08 are good.
Scenes 03 and 04 still need work.

Scene 03 — avatar's pose feels stiff in all candidates. Tried 3 re-rolls,
no improvement. Edited prompt to specify "casual lean toward camera",
got slight improvement but not enough.

Scene 04 — lighting too cool, doesn't match the new environment.

You: tune Scene 03 and Scene 04 prompts in parallel. 5 iterations each.

Claude:
  Running prompt-tuning on Scene 03 (target: casual forward lean,
  natural body angle) and Scene 04 (target: warm window light matching
  the env reference). Both 5 iterations.

  Scene 03 — converged at iteration 3. Final prompt: added "subtle
  forward lean, weight on left arm resting on counter, casual body
  angle".

  Scene 04 — converged at iteration 4. Final prompt: added "warm
  morning light from camera-left, golden hour quality, soft fill on
  the face".

  Both final prompts saved. Bumping XYZS1-V2-0-3 → V2-0-4.
  Regenerating Scenes 03 and 04 (other scenes use cached output).

  [generation pass]

  Scene 03 — 4/4 candidates clean. Pose is right.
  Scene 04 — 3/4 candidates clean. One has slightly off lighting,
  pick from the other 3.

  V2-0-4 looking ready for graduation. Want me to mark it for user
  approval?

When prompt-tuning isn't the answer

| Situation | Better tool |
|---|---|
| Multiple unrelated scenes all need work | Just iterate normally with prompt edits and regens |
| One scene's anatomy is consistently wrong | Generation Runner with regen attempts bumped to 10 |
| Lighting is wrong in 5+ scenes | Edit the lighting once in the workflow's shared anchor, not per scene |
| Avatar's face drifts across scenes | Check the avatar reference links (each per-node link should match the central reference) |
| Scene structurally won't work | Reframe, split, or drop it. Don't try to tune around a fundamentally bad composition. |

When you're ready

Next: Graduation: When to Bump to V{N+1} — the criteria for ending the testing phase and moving the variant into approved/.