Speed Levers

Specific techniques to make workflow runs faster. Most of these are about avoiding wasted work rather than running individual steps faster.

Lever 1: mode=images first pass

Already covered in Cost Awareness, but worth emphasizing here as a speed lever too:

  • Image-only pass: ~5-10 minutes for an 8-scene workflow
  • Full pass with video: 25-45 minutes for the same workflow
  • The image-only pass surfaces ~90% of prompt issues

An image-only validation pass finishes in 15-25% of the time of a full pass. If issues turn up, you fix and re-run far faster than you would have if both iterations had been full video passes.

You: run XYZG3-V0-1 in images-only mode for validation. We can do full
     video gen after we approve the stills.

Claude: [invokes Generation Runner with mode=images, ~7 min later
         returns image candidates]

Lever 2: Concurrency tuning

The default concurrency is 3. If you have a large workflow and your G-Labs is stable:

You: run with concurrency 5 for this workflow

Claude: [invokes Generation Runner with --concurrency 5]

Speed gain: ~30-40% faster on large workflows. Diminishing returns above 5 (rate-limiting kicks in).

When to NOT bump concurrency:

  • G-Labs has been flaky recently
  • You're debugging and want clean per-node error attribution
  • The workflow is small (concurrency 3 vs 5 makes little difference on a 4-scene workflow)
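Under the hood, a concurrency limit like this is usually just a bound on the number of in-flight generation calls. A minimal sketch of that pattern — the `generate` stub and its latency are assumptions for illustration, not the runner's real code:

```python
import asyncio

async def generate(node, latency=0.05):
    """Stand-in for a real generation call; sleeps to simulate API latency."""
    await asyncio.sleep(latency)
    return f"{node}-done"

async def run_nodes(nodes, concurrency=3):
    """Run nodes with at most `concurrency` generations in flight at once."""
    sem = asyncio.Semaphore(concurrency)

    async def run_one(node):
        async with sem:  # blocks while `concurrency` others are running
            return await generate(node)

    # gather preserves input order regardless of completion order
    return await asyncio.gather(*(run_one(n) for n in nodes))

results = asyncio.run(run_nodes([f"scene-{i:02d}" for i in range(8)], concurrency=5))
```

With 8 nodes and concurrency 5, the first 5 start immediately and the rest fill slots as they free up — which is also why bumping the limit past the provider's rate limit buys nothing.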

Lever 3: Per-node reruns instead of full reruns

When 2-3 scenes need re-rolls, don't rerun the whole workflow:

Slow: full workflow rerun (8 scenes × 30s avg = 4 min)
The runner re-checks every node, gets cache hits for unchanged ones, but still has overhead.
Fast: per-node rerun for just the affected scenes (2 scenes × 30s = 1 min)
The regen-nodes.py mechanism only re-runs the specified nodes. Cache for everything else stays untouched.

Speed gain: 2-4× faster when only a few scenes need redoing.

You: rerun just Scenes 03 and 05 — same prompts, fresh seeds.

Claude: [per-node rerun, ~1 min later returns new candidates]
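A per-node rerun is conceptually a small cache edit: re-seed the named nodes and clear their outputs, leaving every other entry untouched. This is a hypothetical sketch of the idea behind regen-nodes.py, not its actual code:

```python
import random

def rerun_nodes(cache, node_ids, seed_fn=lambda: random.randrange(2**32)):
    """Re-roll only the named nodes: fresh seed, cleared output.

    Every other cache entry is left byte-for-byte intact, so the runner
    regenerates just these nodes on the next pass.
    """
    for node_id in node_ids:
        entry = dict(cache[node_id])  # keep the prompt, swap the seed
        entry["seed"] = seed_fn()
        entry["output"] = None        # cleared output -> runner regenerates it
        cache[node_id] = entry
    return cache
```

Same prompts, fresh seeds — exactly the "Scenes 03 and 05" request above.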

Lever 4: Reference image reuse

Reference images (avatar reference sheets, product photos) are uploaded to R2 once and reused across many workflows.

Speed implications:

  • First time an avatar ref is used in a workflow: ~5 seconds to upload to R2, then it's cached at the URL
  • Subsequent workflows using the same avatar ref: instant — the URL is already there

If you're building multiple workflows for the same account, the first one pays the upload cost; the rest don't.

Don't re-upload an avatar ref that's already on R2 (check reference/avatar-sheets/r2-urls.md first).
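The check-before-upload habit is easy to encode. A sketch, where `registry` stands in for the r2-urls.md lookup and `upload` for the real R2 client — both are assumptions, not the actual tooling:

```python
def ensure_uploaded(ref_path, registry, upload):
    """Return the R2 URL for a reference image, uploading only on first use.

    `registry` maps local ref path -> already-uploaded R2 URL (the role
    r2-urls.md plays); `upload` is whatever client actually pushes to R2.
    """
    if ref_path in registry:
        return registry[ref_path]  # already on R2: reuse, zero upload cost
    url = upload(ref_path)         # first use pays the ~5 s upload
    registry[ref_path] = url
    return url
```

The first workflow for an account pays the upload; every later call is a dictionary lookup.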

Lever 5: Cache discipline

The Generation Runner caches generation outputs in the source .nbflow. The next run uses cached output for unchanged nodes.

To benefit:

  • Don't change the prompt unless you actually want to regenerate that scene
  • Don't wipe the cache unless you need a completely fresh start
  • Don't bump versions for trivial reasons. A version bump orphans the old version's cache, so every node regenerates from scratch.

When the cache is honored, a "rerun" of a clean workflow does effectively no work — it sees everything cached, reports success, and finishes in seconds.
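Why prompt edits and version bumps invalidate the cache: cached output is keyed on the things that define a node's generation. One plausible keying scheme — an assumption about the runner's internals, not its actual implementation:

```python
import hashlib

def cache_key(node_id, prompt, version):
    """Key cached output on node id + version + prompt.

    Editing the prompt or bumping the version changes the key, so that
    node's cached output is no longer found and it regenerates.
    """
    payload = f"{node_id}|{version}|{prompt}".encode()
    return hashlib.sha256(payload).hexdigest()
```

Under any scheme like this, an untouched node hits the cache every time — which is why a clean-workflow "rerun" finishes in seconds.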

Lever 6: Skip aggressive regen attempts when not needed

The Generation Runner defaults to 3 regen attempts per AI-tell-flagged image. Each attempt is a full image gen.

If a scene has been flagged 3 times in a row, the 4th attempt is unlikely to suddenly work. The variance is what it is.

You can cap lower for first iterations:

You: run with max 1 regen attempt per flagged image — I just want to
     see the baseline output, not invest in regens yet.

Claude: [invokes runner with attempts=1]

Speed gain: cuts QA loop time by 50-70% on flagged-heavy workflows.

When you've narrowed down to specific stubborn scenes, you can bump attempts higher for just those:

You: rerun Scene 04 with up to 10 attempts — it's the only one that
     keeps producing bad hands.

Claude: [per-node rerun for Scene 04 with attempts=10]

Targeted high-attempt = good. Universal high-attempt = expensive without benefit.
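The attempts cap is just a bounded retry loop around generation plus the AI-tell check. A sketch, with `gen` and `passes_qa` as stand-ins for the real generation and QA calls:

```python
def generate_with_regen(gen, passes_qa, max_attempts=3):
    """Re-roll a flagged image up to max_attempts times; stop early on a clean pass."""
    for attempt in range(1, max_attempts + 1):
        image = gen(attempt)
        if passes_qa(image):
            return image, attempt  # clean result: no further attempts
    return image, max_attempts     # still flagged: hand back the last roll
```

With `max_attempts=1` you get the baseline roll and nothing more; bumping it to 10 for one stubborn scene keeps the cost targeted.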

Lever 7: Parallel tab generation (multi-account)

When generating a multi-account workflow:

Slow: generate tab-by-tab sequentially
Tab A (10 min) → Tab B (10 min) → ... → Tab E (10 min) = 50 min total.
Fast: generate all tabs in parallel
The Generation Runner can handle parallel tabs naturally. With concurrency 5, multiple tabs' nodes interleave. ~20 min total instead of 50.

This is the default behavior — the runner doesn't artificially sequentialize tabs. But if you've been running tabs one at a time manually, switch to letting the runner parallelize.
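One way to picture why parallel tabs help: the runner can treat all tabs' nodes as a single work queue, so one concurrency limit applies across tabs rather than within a single tab at a time. A round-robin flattening sketch — an assumed model of the scheduling, not the runner's code:

```python
def interleave_tabs(tabs):
    """Flatten tab -> [nodes] into one round-robin work queue.

    Round-robin so no single tab monopolizes the first concurrency slots;
    a bounded pool consuming this queue then interleaves tabs naturally.
    """
    queue = []
    iters = [iter(nodes) for nodes in tabs.values()]
    while iters:
        for it in list(iters):     # snapshot: safe to remove exhausted tabs
            node = next(it, None)
            if node is None:
                iters.remove(it)
            else:
                queue.append(node)
    return queue
```

Feeding this queue to a concurrency-5 pool is how five 10-minute tabs can finish in ~20 minutes instead of 50.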

Lever 8: Pre-uploaded reference images

If you know a workflow will use specific reference images, pre-upload them to R2 before invoking the runner. The runner's auto-upload is fast but adds ~5 seconds per image at run-start.

For workflows with many references (3+ images), pre-upload saves ~15-30 seconds per run. Minor but real.

Lever 9: Local image PNGs cached

The Generation Runner downloads every generated image to disk in Assets/{workflow}/generated-images/v0-N/. This means:

  • You can review locally without round-tripping through PatchWork
  • You can show stills to a client without sharing the live R2 URL
  • You have a backup if R2 has an issue

Most importantly: you don't have to re-download for subsequent uses. The local copy is there.

When NOT to optimize

Speed levers cost something — usually clarity or safety. Skip them when:

  • You're learning — slow and careful trumps fast and wrong
  • The workflow is high-value / production-critical — the cost of an error exceeds the time saved
  • You're working with a new product or account — extra care prevents bigger problems

Optimization is for production scale, not learning. Get comfortable with the slow path first.

When you're ready

Next: Quality Levers — when to invest more time / cost for better quality, and when to accept "good enough."