
Pipeline Limits

The hard constraints the pipeline imposes on you. Worth knowing so you don't accidentally fight them.

Generation concurrency

The Generation Runner has a default concurrency of 3 parallel generations, with a hard cap of 5.

  • --concurrency 3 (default): Standard. Good throughput, low risk of upstream throttling.
  • --concurrency 5 (cap): When you're sure G-Labs can handle it and you want maximum throughput on a large workflow.
  • --concurrency 1: Debugging. Errors attribute cleanly to specific nodes.

Why the cap is 5:

  • Upstream APIs throttle aggressively above ~5 parallel requests
  • G-Labs queues internally — going higher just queues longer
  • Network jitter compounds at higher concurrency — failure rate increases

Bumping to 7-8 produces longer wait times AND more failures. Stay at 3-5.
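The cap amounts to a semaphore around the generation calls. A minimal sketch of that pattern, assuming a hypothetical run_generation coroutine — the real Generation Runner's internals may differ:

```python
import asyncio

MAX_CONCURRENCY = 3  # Generation Runner default; hard cap is 5


async def run_generation(node_id: str) -> str:
    # Stand-in for a real generation call (hypothetical).
    await asyncio.sleep(0.01)
    return f"{node_id}: done"


async def run_all(node_ids):
    sem = asyncio.Semaphore(MAX_CONCURRENCY)

    async def bounded(node_id):
        async with sem:  # at most MAX_CONCURRENCY generations in flight
            return await run_generation(node_id)

    return await asyncio.gather(*(bounded(n) for n in node_ids))


results = asyncio.run(run_all([f"node-{i}" for i in range(8)]))
```

Raising MAX_CONCURRENCY past 5 in a setup like this wouldn't make the upstream faster; the extra tasks would just sit in G-Labs's queue instead of yours.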

Veo clip length

Each Veo 3.1 clip is ~8 seconds. Hard ceiling.

  • A scene's dialogue must fit in 8 seconds of delivery
  • Transformation clips, per-step demos, motion-heavy clips all share this 8-second ceiling
  • If you need longer content, split into multiple clips and stitch in post

The Visual Planner respects this constraint when building the storyboard (see Storyboarding Logic).
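Splitting longer content is just ceiling division against the 8-second limit. A sketch (split_into_clips is a hypothetical helper for illustration, not a pipeline function):

```python
import math

CLIP_CEILING_S = 8  # Veo 3.1 hard ceiling per clip


def split_into_clips(total_seconds: float) -> list[float]:
    """Split a longer beat into equal clip durations, each <= the ceiling."""
    n = math.ceil(total_seconds / CLIP_CEILING_S)
    return [round(total_seconds / n, 2)] * n


print(split_into_clips(20))  # → [6.67, 6.67, 6.67]
```

A 20-second beat becomes three clips of ~6.7 seconds each, stitched in post.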

Word budget per scene

Tied to the 8-second Veo clip ceiling. Conversational English delivery runs 150-170 words per minute, i.e. roughly 2.5-2.8 words per second. So:

  • An 8-second scene fits 20-24 words comfortably
  • 28-30 words is the upper limit at fast pacing
  • Over 30 words → split the scene

If your script needs to say more than fits, split the beat across two scenes.
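These thresholds are easy to check mechanically before generating. A sketch using the budgets above (check_scene is a hypothetical helper, not part of the pipeline):

```python
COMFORTABLE_MAX = 24  # words that fit an 8-second scene at normal pacing
HARD_MAX = 30         # fast-pacing ceiling; beyond this, split the scene


def check_scene(script: str) -> str:
    """Classify a scene's dialogue against the per-scene word budget."""
    words = len(script.split())
    if words <= COMFORTABLE_MAX:
        return "fits comfortably"
    if words <= HARD_MAX:
        return "upper limit: fast pacing required"
    return "over budget: split the scene"
```

Run it over every scene's dialogue during scripting and split any beat that comes back over budget.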

R2 storage

Cloudflare R2 has generous quotas. We're not close to them in normal use.

You could hit limits if:

  • Generated outputs accumulate without cleanup (years of unfiltered output)
  • Source videos for many video-copy workflows pile up
  • Test runs are saved indefinitely

The practical answer: don't worry about R2 quota for normal operation. If you ever hit it, archive old -generated.nbflow outputs to local storage.
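If you ever do need that cleanup, the archiving step can be scripted. A sketch assuming outputs sit in one directory and follow the -generated.nbflow naming; the directory layout and 90-day threshold are assumptions, not project conventions:

```python
import shutil
import time
from pathlib import Path


def archive_old_outputs(src: Path, dest: Path, max_age_days: int = 90) -> int:
    """Move -generated.nbflow files older than max_age_days into dest."""
    dest.mkdir(parents=True, exist_ok=True)
    cutoff = time.time() - max_age_days * 86400
    moved = 0
    for f in src.glob("*-generated.nbflow"):
        if f.stat().st_mtime < cutoff:  # compare file mtime against cutoff
            shutil.move(str(f), str(dest / f.name))
            moved += 1
    return moved
```

Point src at wherever your generated outputs land and dest at local archive storage.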

Bandwidth and download speed

R2 is CDN-backed, so download from PatchWork → your laptop is fast. The slow link is usually:

  • Upload from your laptop to R2 during reference image upload — limited by your home/office bandwidth
  • Generation API response time — Veo gens take 30-90 seconds; that's compute time, not bandwidth

Neither is a "limit" you'll hit; they're just the natural speed of operations.

PatchWork file size

.nbflow files are JSON. They get big quickly:

  • A simple 8-scene single-account workflow: ~50-200 KB
  • An 8-scene 5-account workflow: ~500 KB - 2 MB
  • A workflow with many B-roll clips, many baked-in R2 URLs: ~5 MB+

PatchWork starts to slow down on workflow files over ~5 MB:

  • The graph takes longer to render on the canvas
  • Save operations get slower
  • Importing the file takes a noticeable pause

If you're approaching this:

  • Trim cached _savedImages lists that aren't needed (only keep the picked candidate's URL, not all 4)
  • Split workflows that have grown too large (multi-segment workflows might be better as separate files)
  • Use compression if PatchWork supports it (some installs do)
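The first trim can be scripted against the JSON directly. A sketch, assuming the .nbflow document has a top-level nodes list and that the picked candidate is tracked by a pickedIndex key — _savedImages is from the doc above, but the other key names are assumptions; verify your schema before running anything like this:

```python
import json
from pathlib import Path


def trim_saved_images(path: Path) -> None:
    """Keep only the picked candidate's URL in each node's _savedImages list."""
    data = json.loads(path.read_text())
    for node in data.get("nodes", []):
        saved = node.get("_savedImages")
        if saved and len(saved) > 1:
            picked = node.get("pickedIndex", 0)  # assumed key; default to first
            node["_savedImages"] = [saved[picked]]
    # compact separators also shave some bytes off the file
    path.write_text(json.dumps(data, separators=(",", ":")))
```

Dropping three of four candidate URLs per node adds up quickly on multi-account fan-outs.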

API rate limits

G-Labs proxies the actual API calls. Rate limits at the upstream:

  • NanoBanana 2 — ~60 requests/minute soft cap
  • Veo 3.1 — lower; depends on tier
  • R2 uploads — high; not normally a constraint

The Generation Runner's concurrency-3 default stays comfortably under these soft caps for any reasonable workflow.

If you ever do hit a rate limit, you'll see 429 Too Many Requests in errors.json. The runner backs off automatically.
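The back-off follows the usual exponential-delay-with-jitter pattern. A sketch of that pattern, not the runner's actual code (the RuntimeError stand-in for an HTTP 429 response is an assumption):

```python
import random
import time


def with_backoff(call, max_attempts: int = 5, base_delay: float = 1.0):
    """Retry a callable on rate-limit errors with exponential backoff + jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RuntimeError as exc:  # stand-in for a 429 Too Many Requests error
            if "429" not in str(exc) or attempt == max_attempts - 1:
                raise
            # delay doubles each attempt (1x, 2x, 4x...), capped, plus jitter
            time.sleep(min(base_delay * 2 ** attempt, 30) + random.random() * base_delay)
```

The jitter keeps parallel generations from retrying in lockstep and re-triggering the same throttle.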

Per-session limitations

  • Cloudflared tunnel duration: Quick tunnels (--url mode) can drop after extended use, typically 30+ minutes. Restart cloudflared and update the URL wherever it's referenced.
  • PatchWork browser session: PatchWork's web app stores state in browser local storage. Clearing your browser cache loses unsaved picks (export your workflow to persist).
  • G-Labs uptime: G-Labs runs on your machine — it's only "up" when you've started it. Restart at the beginning of each session.

These aren't "limits" so much as session realities to plan around.

Per-workflow practical limits

Soft suggestions to keep workflows maintainable:

  • Scenes per workflow: ≤ 15 (longer = harder to maintain attention)
  • Accounts per fan-out: ≤ 10 (more = harder to test and audit each tab)
  • Iteration count during testing: ≤ 10 V0-N (more = consider rebuilding instead of iterating)
  • Variant decimal-bumps from one V1: ≤ 5 (more = consider a Lvl 3-4 variant to V2)

These aren't hard caps. They're patterns where exceeding them often means you should rethink, not push through.

When you're ready

Next: Working with Multiple Workflows in Parallel — juggling brands × workflows × variants without dropping context.