Tool & Model Reference¶
Reference card for every AI model in the pipeline — what it does, what it accepts, how to prompt it, where it fails. Read once when onboarding, then return as a lookup.
Image generation¶
NanoBanana 2¶
The default image generation model. Used for every image gen in every workflow.
| Field | Value |
|---|---|
| Model identifier | nano_banana_2 |
| Access | G-Labs API (POST /api/image/generate) |
| PatchWork node | nanobanana/NanobananaAPI with properties.model = "nano_banana_2" |
| Inputs | Prompt text (string), up to 2 reference images (URLs) |
| Output count | 4 candidate images per run (outputCount: 4) |
| Aspect ratio | 9:16 vertical default for short-form |
| Output format | PNG, served from R2 via public URLs |
Always set the model field. If model is missing or unset, G-Labs defaults to NanoBanana 1, which is structurally different and produces worse results. The pre-generation sanity check catches this.
Prompting NanoBanana 2¶
Best practices:
- Open with a framing declaration (e.g. "Eye-level frontal selfie, chest to shoulder up") — this is a hard boundary for everything else
- Reference images do the heavy lifting on identity. When a character ref is wired in, don't repeat gender / age / face / hair / ethnicity in the prompt — those tokens consume attention budget the model should spend on the scene
- Describe only what's in frame. See Image Prompt Rules
- Keep templates minimal. When wiring through a template, the template should be `{scene}` alone, not anchored with camera/lighting/aesthetic boilerplate — anchors dilute reference adherence
- No hyphens. Use spaced or unhyphenated alternatives in prompts
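Two of the rules above (no hyphens, no identity tokens alongside a character ref) are mechanical enough to lint before submitting. A heuristic sketch, not an official validator; the identity token list is illustrative only:

```python
def lint_prompt(prompt: str, has_character_ref: bool = False) -> list[str]:
    """Flag common NanoBanana 2 prompt mistakes. Heuristic sketch only."""
    issues = []
    if "-" in prompt:
        issues.append("contains a hyphen; use spaced or unhyphenated alternatives")
    # Identity tokens waste attention budget when a character ref is wired in
    identity_tokens = ("woman", "man", "hair", "face")  # illustrative, not exhaustive
    if has_character_ref and any(t in prompt.lower() for t in identity_tokens):
        issues.append("repeats identity details the reference image already carries")
    return issues
```

A clean prompt returns an empty list; anything else is worth fixing before the run rather than burning regen attempts.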
Known failure modes¶
| Failure | Cause | Fix |
|---|---|---|
| Generic stock-style output | `model` field unset, fell back to NB1 | Set `model: "nano_banana_2"` |
| Output unrelated to prompt | Dynamic `variableName` doesn't match template placeholder | Set `variableName` to a short lowercase identifier matching the placeholder |
| Avatar's face wrong | Reference image link broken (stale per-node refs or Cached Media bug) | Run `resync_link_refs(tab)` and verify in PatchWork UI |
| Extra fingers / melted face / garbled hands | Normal AI tells | Generation Runner auto-reruns up to 3 attempts; bump to 10 for stubborn cases |
| Hallucinated text (signage, labels) | NanoBanana routinely produces garbled text | Ignore — low priority |
| Cropping wrong | Framing declaration missing or contradicted by out-of-frame mentions | Follow the in-frame rule |
When to consider alternatives¶
You generally won't. NanoBanana 2 is the default and produces the best results across our use cases. The only reasons to switch:
- An explicit comparison run between models (rare — usually run by the model researcher, not in production)
- A specific composition that NB2 keeps failing on after 10 regen attempts — at that point, consider whether the prompt is the problem before swapping models
Video generation¶
Veo 3.1 (fast preview)¶
The default video model. Used for speaking scenes and most B-roll.
| Field | Value |
|---|---|
| Model identifier | veo-3.1-fast-generate-preview |
| Access | G-Labs API |
| PatchWork node | nanobanana/Veo3 with properties.model set to the identifier |
| Inputs | Prompt text, start frame image, optional end frame image |
| Output count | 4 candidate clips per run (outputCount: 4) |
| Clip length | ~8 seconds per clip |
| Frame mode | Frames-to-video — interpolates between start and end frames |
| Resolution | 1080p or 720p (set via properties.resolution) |
| Output format | MP4, served from R2 via public URLs |
Veo3 node requires 3 input slots¶
Veo3 nodes need all 3 input slots present even when the end frame isn't wired:

```json
"inputs": [
  {"name": "prompt", "type": "string", "link": null},
  {"name": "start frame", "type": "*", "link": null},
  {"name": "end frame", "type": "*", "link": null}
]
```

Slot names have spaces, not underscores. A missing third slot causes `r.trim is not a function` errors on import.
Prompting Veo 3.1¶
Best practices:
- Macro motion only — Veo automatically adds natural micro motion (blinking, breathing, micro expressions). Prompting for micro motion adds noise. See Video Prompt Rules
- Never prompt location changes without an explicit end frame — frames-to-video interpolation will hallucinate the transition badly
- Speaking scenes use the universal talking head template — don't override the delivery instructions; the template is tuned
- B-roll is natural language, no dialogue, ambient audio only
- Low-movement scenes use the same image as start AND end frame to prevent drift
- `negativePrompt` must be a string (an empty `""` is fine; `undefined` crashes)
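The two structural requirements (three spaced-name input slots, string `negativePrompt`) can be checked before import. A sketch of such a validator, assuming `negativePrompt` lives under `properties` alongside `model`; nodes failing these checks are the ones that trigger the `r.trim is not a function` import error:

```python
REQUIRED_SLOTS = ["prompt", "start frame", "end frame"]  # spaces, not underscores

def validate_veo3_node(node: dict) -> list[str]:
    """Return a list of problems with a Veo3 node dict; empty means OK."""
    problems = []
    names = [slot.get("name") for slot in node.get("inputs", [])]
    for required in REQUIRED_SLOTS:
        if required not in names:
            problems.append(f"missing input slot: {required!r}")
    # Assumption: negativePrompt sits in properties next to model
    neg = node.get("properties", {}).get("negativePrompt")
    if not isinstance(neg, str):
        problems.append("negativePrompt must be a string (empty '' is fine)")
    return problems
```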
Known failure modes¶
| Failure | Cause | Fix |
|---|---|---|
| Avatar's face drifts mid-clip | Start and end frames too different, OR scene needs more than 8 seconds | Use the same image as start AND end, OR split into shorter motions |
| Clip looks glitchy with hallucinated transitions | Location change without an end frame | Split into two scenes, or add an explicit end frame |
| Distant background animals glitching | Animals in the distant background of the start frame | Regenerate the start frame without animals — see Image Prompt Rules |
| POV clip shows third-person body | Prompt described the subject's body orientation | Rewrite per the POV rule |
| Voice / mouth sync off | Custom delivery override instead of the universal template | Switch to the universal template |
| Import error `r.trim is not a function` | Veo3 missing `negativePrompt` (`undefined`) or missing 3rd input slot | Fill `negativePrompt: ""` and add an end frame slot with `link: null` |
Seedance 2.0¶
Alternative video model. Used selectively when Seedance produces a better result for a specific style (rare). Accessed manually via the Jimeng web UI — no API integration yet.
| Field | Value |
|---|---|
| Access | Jimeng web UI (manual, no API) |
| Inputs | Prompt text, start frame image |
| Negative prompts | Not supported — positive constraints only |
| Output | Single clip per run |
| Format | MP4 download from the web UI |
Prompting Seedance 2.0¶
- Positive constraints only. Restate what you want, don't negate what you don't want
- Same macro-motion rule applies as with Veo
- Ambient audio is set in the UI, not via prompt
- Manual workflow — no `.nbflow` integration; the Seedance Prompter agent produces the prompt text, you paste it into Jimeng yourself
When to use Seedance over Veo¶
In practice, almost never. Use cases:
- A specific stylization that Veo can't match (rare)
- A comparison run for a particularly tricky scene
- When the G-Labs / Veo path is down and you need a stopgap
Default to Veo 3.1 unless there's a concrete reason to switch.
The G-Labs backend¶
Not a model — but it's the layer that proxies every model call. Worth knowing the interface.
| Field | Value |
|---|---|
| Local port | 8765 |
| Public access | via cloudflared tunnel — see G-Labs Setup |
| Image generation endpoint | POST /api/image/generate |
| Video generation endpoint | POST /api/video/generate |
| Health check | GET /health (returns 200 OK when running) |
| Authentication | API key in request header |
Both the Generation Runner (via --server flag) and the PatchWork web app (via settings panel) need the tunnel URL.
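Hitting the API through the tunnel looks roughly like this. The header name `X-API-Key` is an assumption; substitute whatever key header your deployment actually uses:

```python
import urllib.request

def glabs_request(base_url: str, path: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated request against the G-Labs API.

    base_url is either http://localhost:8765 or the cloudflared tunnel URL.
    The API key header name is an assumption, not a confirmed contract.
    """
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        headers={"X-API-Key": api_key},
    )

# e.g. urllib.request.urlopen(glabs_request(tunnel_url, "/health", key))
req = glabs_request("http://localhost:8765", "/health", "test-key")
```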
R2 storage¶
Cloudflare R2 — object storage where all reference images and generation outputs live.
| Field | Value |
|---|---|
| Upload mechanism | Worker proxy (https://patchwork-r2-upload.patchworkstudio.workers.dev) |
| URL format | https://pub-....r2.dev/<path>.png (public CDN URLs) |
| Authentication | None on the operator side — the Worker proxy handles it |
| Used by | Generation Runner (auto-upload of reference images and outputs), avatar reference registry |
The Generation Runner uploads any local reference image paths to R2 automatically before calling G-Labs. You don't need to handle this manually.
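The decision the Runner makes per reference entry is simple: entries that are already public URLs pass through untouched, anything else is treated as a local path and uploaded first. A sketch of that check (helper name is illustrative):

```python
def needs_r2_upload(ref: str) -> bool:
    """Local file paths need uploading via the Worker proxy first;
    public http(s) URLs are passed to G-Labs as-is."""
    return not ref.startswith(("http://", "https://"))
```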
Quick lookup — which tool when¶
| Task | Tool |
|---|---|
| Generate an image | NanoBanana 2 (always) |
| Generate a speaking scene clip | Veo 3.1 |
| Generate a B-roll clip | Veo 3.1 (natural language prompt, no dialogue) |
| Generate a clip in a specific style Veo can't hit | Seedance 2.0 (manual via Jimeng UI) |
| Upload a reference image | R2 via the Worker (handled automatically by Generation Runner) |
| Run a workflow headlessly | Generation Runner (workflow-runner.js) |
| Run a workflow visually in a browser | Playwright + PatchWork web app |
| Validate an `.nbflow` before generating | Pre-generation sanity check |
Costs and rate limits¶
(To be filled in — track per-model costs and concurrency caps here. Current operational defaults: Generation Runner concurrency defaults to 3 with a cap of 5; 4 candidates per generation node; ~3 retries per node before giving up.)