Speed Levers
Specific techniques to make workflow runs faster. Most of these are about avoiding wasted work rather than running individual steps faster.
Lever 1: mode=images first pass
Already covered in Cost Awareness, but worth emphasizing here as a speed lever too:
- Image-only pass: ~5-10 minutes for an 8-scene workflow
- Full pass with video: 25-45 minutes for the same workflow
- The image-only pass surfaces 90% of prompt issues
An image-only validation pass takes roughly 15-25% as long as a full pass. If it turns up issues, you fix and re-run in minutes instead of paying for a full pass each time.
You: run XYZG3-V0-1 in images-only mode for validation. We can do full
video gen after we approve the stills.
Claude: [invokes Generation Runner with mode=images, ~7 min later
returns image candidates]
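As a rough sanity check on those numbers, the sketch below compares the two strategies for a given number of prompt-fix rounds. It uses the midpoints of the ranges above. The function names and the model itself are illustrative assumptions: every fix round costs one validation pass, one more validation pass confirms the result, and the image-only pass is assumed to catch the issue.

```python
def images_first_total(image_pass: float, full_pass: float, fix_rounds: int) -> float:
    """Minutes spent when every validation round is image-only.

    Each fix round costs one image pass, one more image pass confirms the
    result is clean, and a single full pass produces the final video.
    """
    return (fix_rounds + 1) * image_pass + full_pass


def full_only_total(full_pass: float, fix_rounds: int) -> float:
    """Minutes spent when every validation round is a full pass."""
    return (fix_rounds + 1) * full_pass


IMAGE_PASS_MIN = 7.5  # midpoint of the 5-10 minute range
FULL_PASS_MIN = 35.0  # midpoint of the 25-45 minute range

for rounds in (0, 1, 2):
    a = images_first_total(IMAGE_PASS_MIN, FULL_PASS_MIN, rounds)
    b = full_only_total(FULL_PASS_MIN, rounds)
    print(f"{rounds} fix round(s): images first {a} min, full passes only {b} min")
```

Under these assumptions the image-first route wins as soon as one fix round is needed, and costs one extra image pass when the first attempt is already clean.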
Lever 2: Concurrency tuning
The default concurrency is 3. If the workflow is large and G-Labs has been stable, raise it:
You: run with concurrency 5 for this workflow
Claude: [invokes Generation Runner with --concurrency 5]
Speed gain: ~30-40% faster on large workflows. Diminishing returns above 5 (rate-limiting kicks in).
When NOT to bump concurrency:
- G-Labs has been flaky recently
- You're debugging and want clean per-node error attribution
- The workflow is small (concurrency 3 vs 5 makes little difference on a 4-scene workflow)
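For readers who want a mental model of what a concurrency limit does, here is a minimal sketch using Python's asyncio. It is not the Generation Runner's implementation. `run_nodes` and `fake_generate` are hypothetical names, and the generation call is replaced by a short sleep.

```python
import asyncio


async def run_nodes(node_ids, generate, concurrency=3):
    """Run `generate(node_id)` for every node with at most `concurrency` in flight."""
    semaphore = asyncio.Semaphore(concurrency)

    async def run_one(node_id):
        # The semaphore blocks here once `concurrency` calls are already running.
        async with semaphore:
            return await generate(node_id)

    # gather() returns results in the same order as `node_ids`.
    return await asyncio.gather(*(run_one(n) for n in node_ids))


async def fake_generate(node_id):
    # Stand-in for a real generation call; sleeps instead of contacting any service.
    await asyncio.sleep(0.01)
    return f"{node_id}: done"


if __name__ == "__main__":
    scenes = [f"scene-{i:02d}" for i in range(1, 9)]
    print(asyncio.run(run_nodes(scenes, fake_generate, concurrency=5)))
```

With a limit in place, a failure still belongs to one node, but failures from several in-flight nodes can arrive interleaved, which is one reason to keep the limit low while debugging.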
Lever 3: Per-node reruns instead of full reruns
When 2-3 scenes need re-rolls, don't rerun the whole workflow:
- Slow: full workflow rerun (8 scenes × 30s avg = 4 min). The runner re-checks every node and gets cache hits for unchanged ones, but still has overhead.
- Fast: per-node rerun for just the affected scenes (2 scenes × 30s = 1 min). The regen-nodes.py mechanism only re-runs the specified nodes. Cache for everything else stays untouched.
Speed gain: 2-4× faster when only a few scenes need redoing.
You: rerun just Scenes 03 and 05 — same prompts, fresh seeds.
Claude: [per-node rerun, ~1 min later returns new candidates]
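The idea behind a per-node rerun can be sketched in a few lines. This is an illustration only, not how regen-nodes.py works internally. The in-memory `cache` dict, the node ids, and the `generate` callback are all assumptions.

```python
from typing import Callable


def rerun_nodes(cache: dict[str, str], targets: set[str],
                generate: Callable[[str], str]) -> dict[str, str]:
    """Return a copy of `cache` in which only the `targets` nodes are regenerated."""
    unknown = targets - set(cache)
    if unknown:
        raise KeyError(f"unknown nodes: {sorted(unknown)}")
    updated = dict(cache)  # untouched nodes keep their cached output
    for node_id in sorted(targets):
        updated[node_id] = generate(node_id)
    return updated


if __name__ == "__main__":
    cache = {f"scene-{i:02d}": f"old-{i:02d}" for i in range(1, 9)}
    fresh = rerun_nodes(cache, {"scene-03", "scene-05"}, lambda n: f"new-{n}")
    print(fresh["scene-03"], fresh["scene-04"])
```

Only the two requested scenes trigger a generation call. The other six entries are copied over as they were.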
Lever 4: Reference image reuse
Reference images (avatar reference sheets, product photos) are uploaded to R2 once and reused across many workflows.
Speed implications:
- First time an avatar ref is used in a workflow: ~5 seconds to upload to R2, then it's cached at the URL
- Subsequent workflows using the same avatar ref: instant — the URL is already there
If you're building multiple workflows for the same account, the first one pays the upload cost; the rest don't.
Don't re-upload an avatar ref that's already on R2 (check reference/avatar-sheets/r2-urls.md first).
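The check-before-upload habit can be sketched as a simple lookup. The format of r2-urls.md is not described here, so this example uses a plain in-memory dict instead. `resolve_reference_url`, `fake_upload`, and the example URL are hypothetical.

```python
from typing import Callable


def resolve_reference_url(name: str, known_urls: dict[str, str],
                          upload: Callable[[str], str]) -> str:
    """Return the recorded URL for `name`, uploading only if none is recorded yet."""
    if name not in known_urls:
        # First use: pay the upload cost once and remember the resulting URL.
        known_urls[name] = upload(name)
    return known_urls[name]


def fake_upload(name: str) -> str:
    # Stand-in for a real upload; returns a placeholder URL without any network call.
    return f"https://example.invalid/refs/{name}"


if __name__ == "__main__":
    urls: dict[str, str] = {}
    print(resolve_reference_url("avatar-sheet.png", urls, fake_upload))  # uploads
    print(resolve_reference_url("avatar-sheet.png", urls, fake_upload))  # reuses
```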
Lever 5: Cache discipline
The Generation Runner caches generation outputs in the source .nbflow. The next run uses cached output for unchanged nodes.
To benefit:
- Don't change the prompt unless you actually want to regenerate that scene
- Don't wipe the cache unless you need a completely fresh start
- Don't bump versions for trivial reasons: a new version starts without the old version's cache
When the cache is honored, a "rerun" of a clean workflow does effectively no work — it sees everything cached, reports success, and finishes in seconds.
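One way to picture why these rules matter is a cache keyed on everything that defines a node's output. How the Generation Runner actually keys its cache is not documented here. The sketch below assumes a hash over version, node id, and prompt, and `cache_key` is a hypothetical helper.

```python
import hashlib


def cache_key(version: str, node_id: str, prompt: str) -> str:
    """Derive a stable key; any change to version, node id, or prompt changes it."""
    # The unit-separator character keeps ("ab", "c") distinct from ("a", "bc").
    payload = "\x1f".join((version, node_id, prompt)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    base = cache_key("v0-1", "scene-03", "woman holding the product, soft light")
    same = cache_key("v0-1", "scene-03", "woman holding the product, soft light")
    edited = cache_key("v0-1", "scene-03", "woman holding the product, hard light")
    bumped = cache_key("v0-2", "scene-03", "woman holding the product, soft light")
    print(base == same, base == edited, base == bumped)
```

Under this model an unchanged node produces the same key and is served from cache, while a prompt edit or a version bump produces a new key and forces a fresh generation.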
Lever 6: Skip aggressive regen attempts when not needed
The Generation Runner defaults to 3 regen attempts per AI-tell-flagged image. Each attempt is a full image gen.
If a scene has been flagged 3 times in a row, the 4th attempt is unlikely to suddenly work. The variance is what it is.
You can cap lower for first iterations:
You: run with max 1 regen attempt per flagged image — I just want to
see the baseline output, not invest in regens yet.
Claude: [invokes runner with attempts=1]
Speed gain: cuts QA loop time by 50-70% on flagged-heavy workflows.
When you've narrowed down to specific stubborn scenes, you can bump attempts higher for just those:
You: rerun Scene 04 with up to 10 attempts — it's the only one that
keeps producing bad hands.
Claude: [per-node rerun for Scene 04 with attempts=10]
Targeted high-attempt = good. Universal high-attempt = expensive without benefit.
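The cost of the attempt cap is easy to see in a small loop. This sketch assumes "attempts" means extra regenerations after the initial image, which may not match how the runner counts. `generate_with_regens`, `generate`, and `is_flagged` are hypothetical names.

```python
from typing import Callable, Tuple


def generate_with_regens(generate: Callable[[], str],
                         is_flagged: Callable[[str], bool],
                         max_regens: int = 3) -> Tuple[str, int]:
    """Generate once, then regenerate while flagged, up to `max_regens` extra times.

    Returns the last image and the total number of generations used.
    """
    image = generate()
    used = 1
    while is_flagged(image) and used <= max_regens:
        image = generate()
        used += 1
    return image, used


if __name__ == "__main__":
    # Worst case: a scene that is flagged every time.
    always_flagged = lambda image: True
    for cap in (1, 3, 10):
        _, used = generate_with_regens(lambda: "image", always_flagged, max_regens=cap)
        print(f"cap {cap}: {used} generations")
```

For a scene that never passes, the total is always one more than the cap, so a cap of 1 spends half the generations that the default of 3 does. A scene that passes on its first image costs a single generation regardless of the cap.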
Lever 7: Parallel tab generation (multi-account)
When generating a multi-account workflow:
- Slow: generate tab-by-tab sequentially. Tab A (10 min) → Tab B (10 min) → ... → Tab E (10 min) = 50 min total.
- Fast: generate all tabs in parallel. The Generation Runner can handle parallel tabs naturally. With concurrency 5, multiple tabs' nodes interleave. ~20 min total instead of 50.
This is the default behavior: the runner doesn't artificially serialize tabs. But if you've been running tabs one at a time manually, let the runner parallelize them instead.
Lever 8: Pre-uploaded reference images
If you know a workflow will use specific reference images, pre-upload them to R2 before invoking the runner. The runner's auto-upload is fast but adds ~5 seconds per image at run-start.
For workflows with many references (3+ images), pre-upload saves ~15-30 seconds per run. Minor but real.
Lever 9: Local image PNGs cached
The Generation Runner downloads every generated image to disk in Assets/{workflow}/generated-images/v0-N/. This means:
- You can review locally without round-tripping through PatchWork
- You can show stills to a client without sharing the live R2 URL
- You have a backup if R2 has an issue
Most importantly: you don't have to re-download for subsequent uses. The local copy is there.
When NOT to optimize
Speed levers cost something — usually clarity or safety. Skip them when:
- You're learning — slow and careful trumps fast and wrong
- The workflow is high-value / production-critical — the cost of an error exceeds the time saved
- You're working with a new product or account — extra care prevents bigger problems
Optimization is for production scale, not learning. Get comfortable with the slow path first.
When you're ready
→ Next: Quality Levers — when to invest more time / cost for better quality, and when to accept "good enough."