Video Prompt Rules¶
The rules every Veo 3.1 video prompt must follow. These are about how Veo interprets motion and frame-to-frame interpolation — getting them wrong produces glitches, hallucinated transitions, or off-format clips.
Macro motion only — never micro¶
Video prompts describe macro movements, not micro. Veo's 8-second clip compresses everything; subtle motions get lost and the model adds its own natural breathing/blinking anyway. Stick to motion that translates clearly from a start frame to an end frame.
Macro (do prompt these)¶
- Full head turn (clear angle change, 15-30 degrees over the clip)
- Hand sweep, hand lift, finger point, palm-open gesture
- Body lean forward / back (clearly visible)
- Walking / stepping into or out of frame
- Full arm gesture, two-hand product hold
- Single firm nod (one head dip)
- Three deliberate body taps (named beats on regions like shoulder / stomach / temple)
Micro (do NOT prompt these — Veo handles them automatically)¶
- Slight blink, slow blink, brief blink at line start
- Slight brow raise, slight brow tighten, micro expressions
- Slight head tilt (less than ~10 degrees)
- Slight breath rise/fall, subtle inhale
- Micro pauses, micro inflections
- Tiny eye darts, slight gaze drift
- Slight smile, half smile, subtle warmth
- Slight forward lean (too small to register as a visible angle change)
Why¶
Veo 3.1 is trained to add natural human nuance (blinking, breathing, micro facial movement) automatically. Prompting for micro motion produces no improvement and risks the model treating the instruction as noise or rendering it as a glitch. Macro movements give Veo something concrete to interpolate between start and end frames.
The smell test¶
Ask: would a viewer notice this in 8 seconds of phone video at scroll speed? If no → cut it. If yes → keep it. Strip every "slight" / "subtle" / "micro" / "brief" / "slow" qualifier from the motion prompt unless it modifies a macro motion (e.g. "slow grounded sweep" is OK because the sweep is macro).
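The qualifier strip can be partly automated. The following is a minimal sketch, not part of any existing tooling: the function name `find_micro_qualifiers` is hypothetical, and the word list is taken from the smell test above. It only flags candidates; a human still decides whether a flagged word modifies a macro motion and can stay.

```python
import re

# Qualifiers named in the smell test; extend the tuple as needed.
MICRO_QUALIFIERS = ("slight", "subtle", "micro", "brief", "slow")


def find_micro_qualifiers(prompt: str) -> list[str]:
    """Return the qualifier words found in the prompt, in order of appearance."""
    pattern = r"\b(" + "|".join(MICRO_QUALIFIERS) + r")\b"
    return [
        match.group(1).lower()
        for match in re.finditer(pattern, prompt, flags=re.IGNORECASE)
    ]


# "slight" should be cut; "slow" is flagged for review and may stay,
# because here it modifies a macro motion (the sweep).
flagged = find_micro_qualifiers("Slight head tilt, then a slow grounded sweep of the hand")
print(flagged)
```

An empty result means the prompt contains none of the listed qualifiers; it does not prove the motion itself is macro.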
Never prompt a location change without an end frame¶
Veo's frames-to-video mode interpolates start → end frame. If you tell Veo "the avatar walks into the kitchen" but only give it a start frame in the living room, it hallucinates the transition and produces unusable footage.
The rule¶
If the script calls for a location change, that's a new scene with its own image generation and its own Veo clip, not a movement within one clip.
Exception¶
If you have an explicit end frame showing the new location, the location change is fine — Veo interpolates between the two frames cleanly.
| Situation | Approach |
|---|---|
| Avatar stays in one location, talking | Same image as start AND end frame (prevents drift) |
| Transformation (before → after) within one location | Different start and end frames showing the transformation |
| Avatar moves from kitchen to living room | Split into TWO scenes — kitchen clip + living room clip |
| Per-step recipe sequence | One image per step, each becomes its own Veo clip, trimmed in post |
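The kitchen-to-living-room row can be sketched as a grouping step over the script. This is an illustration under assumptions: the `(location, action)` beat shape and the name `split_into_scenes` are hypothetical, and the sketch covers only the default case where no explicit end frame exists for the new location.

```python
from itertools import groupby


def split_into_scenes(beats: list[tuple[str, str]]) -> list[dict]:
    """Group consecutive beats that share a location into one scene each.

    beats: (location, action) pairs in script order (hypothetical shape).
    Each returned scene would get its own image and its own Veo clip.
    """
    scenes = []
    for location, group in groupby(beats, key=lambda beat: beat[0]):
        scenes.append(
            {"location": location, "actions": [action for _, action in group]}
        )
    return scenes


beats = [
    ("kitchen", "holds up the product"),
    ("kitchen", "points at the label"),
    ("living room", "sits down on the sofa"),
]
scenes = split_into_scenes(beats)
print(len(scenes))
```

Because `groupby` only merges consecutive items, a script that returns to the kitchen later produces a third scene rather than folding back into the first one.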
The POV rule (same as image prompts)¶
In a first-person POV shot the camera IS the subject's eye. You cannot describe the subject's body, face, or orientation. See Image Prompt Rules for the full breakdown — the same logic applies to video prompts.
For video specifically:
- Open with vantage language: "First person POV from the subject's own eye view, walking through..."
- Describe what the camera sees moving (background motion, what comes into frame)
- Describe what the hand/arm holding the product does
- Add `--no` exclusions: third person view, external camera, camera facing the subject, selfie of the subject, subject's face visible, body visible
If you describe the subject's body orientation in a POV video prompt, you've written a third-person prompt. The model will resolve the contradiction by flipping to third-person and putting the avatar's body in frame.
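A rough pre-flight check for POV prompts might look like the sketch below. Several things here are assumptions rather than documented behaviour: the `BODY_TERMS` word list is a hypothetical starting point, the helper names are invented for illustration, and the sketch assumes the exclusions are appended to the end of the prompt after a `--no` flag.

```python
import re

# Exclusions listed in the POV rule.
POV_EXCLUSIONS = [
    "third person view",
    "external camera",
    "camera facing the subject",
    "selfie of the subject",
    "subject's face visible",
    "body visible",
]

# Hypothetical word list: terms that usually signal a third-person description.
BODY_TERMS = ("face", "body", "posture", "facing", "stands", "standing")


def body_terms_in(prompt: str) -> list[str]:
    """Return body/orientation words found in the prompt text."""
    pattern = r"\b(" + "|".join(BODY_TERMS) + r")\b"
    return [
        match.group(1).lower()
        for match in re.finditer(pattern, prompt, flags=re.IGNORECASE)
    ]


def with_pov_exclusions(prompt: str) -> str:
    """Append the exclusions (assumed syntax: a trailing --no list)."""
    return f"{prompt} --no {', '.join(POV_EXCLUSIONS)}"


# Run the body check BEFORE appending exclusions, since the exclusion
# list itself contains words such as "face" and "body".
pov_prompt = (
    "First person POV from the subject's own eye view, walking through "
    "a kitchen, right hand lifts the bottle into frame"
)
if not body_terms_in(pov_prompt):
    pov_prompt = with_pov_exclusions(pov_prompt)
```

A word list cannot catch every third-person phrasing, so treat a clean result as a first filter rather than a guarantee.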
Speaking scenes use the universal talking head template¶
All speaking scenes (avatar on camera, delivering dialogue) use the universal talking head template. The template handles:
- Delivery pacing
- Micro-pacing within a line
- Natural speaking rhythm
- Mouth-sync expectations
The dynamic prompt only varies the dialogue line per scene. Don't override the template with custom delivery instructions unless there's a specific reason — the template's been tuned across hundreds of clips.
B-roll prompts are different¶
B-roll clips (cutaways during speaking scenes — ingredient close-ups, demos, lifestyle shots) follow different rules:
- Natural language prompt, not the universal talking head template
- No dialogue
- Ambient audio only (no voiceover, no speech)
- Single reference image (not start + end frame pairs — B-roll is one shot, not a transformation)
When the source video has no B-roll, the workflow has no B-roll. Don't invent B-roll cutaways because "it would benefit from visual reinforcement" — that's a video-copy violation.
Veo3 needs both start and end frame slots present¶
Even when the end frame isn't wired, the Veo3 node needs all 3 input slots (prompt, start frame, end frame). The end frame slot should have `link: null` when unused, but the slot itself must exist.
Missing the slot causes `r.trim is not a function` errors on import. See Pre-generation Sanity Check for the validation snippet.
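To make the slot requirement concrete, here is a small illustrative check. It is not the snippet from Pre-generation Sanity Check, and the node shape is an assumption: the sketch supposes each node carries an `inputs` list whose entries look like `{"name": "end frame", "link": None}`, which may not match the real export format. The slot names and `missing_slots` are likewise hypothetical.

```python
# Hypothetical slot names; adjust to whatever the real node export uses.
REQUIRED_SLOTS = ("prompt", "start frame", "end frame")


def missing_slots(node: dict) -> list[str]:
    """Return the required slot names that are absent from node["inputs"].

    A slot with link None (JSON null) still counts as present: the rule
    is about the slot existing, not about it being wired.
    """
    present = {slot.get("name") for slot in node.get("inputs", [])}
    return [name for name in REQUIRED_SLOTS if name not in present]


ok_node = {
    "inputs": [
        {"name": "prompt", "link": 12},
        {"name": "start frame", "link": 7},
        {"name": "end frame", "link": None},  # unused but present
    ]
}
broken_node = {
    "inputs": [
        {"name": "prompt", "link": 12},
        {"name": "start frame", "link": 7},
    ]
}
print(missing_slots(ok_node), missing_slots(broken_node))
```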
Quick checklist before submitting a video prompt¶
- Motion described is macro, not micro
- No "slight" / "subtle" / "micro" / "brief" / "slow" qualifiers (unless on a macro motion)
- No location changes within a single clip without an end frame
- If POV: no body / face / orientation descriptions; vantage + arm + background only
- If speaking: uses the universal talking head template, only dialogue varies
- If B-roll: natural language, no dialogue, single reference image
- No hyphens in the prompt (use spaced or unhyphenated alternatives)
- `negativePrompt` field is a string (empty `""` is fine, undefined is not)
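The last two checklist items are mechanical enough to script. The sketch below assumes the request is available as a dict with a `prompt` key (a hypothetical field name) alongside `negativePrompt`; `payload_problems` is an invented helper. In Python, a missing key or a JSON `null` stands in for JavaScript's undefined.

```python
import re


def payload_problems(payload: dict) -> list[str]:
    """Return human-readable problems found in a prompt payload."""
    problems = []

    # A missing key or None is rejected; an empty string is accepted.
    if not isinstance(payload.get("negativePrompt"), str):
        problems.append('negativePrompt must be a string (use "" when empty)')

    # Flag hyphens that join two word characters, e.g. "first-person".
    prompt_text = str(payload.get("prompt") or "")
    if re.search(r"\w-\w", prompt_text):
        problems.append("prompt contains a hyphenated word")

    return problems


print(payload_problems({"prompt": "Full head turn toward the window", "negativePrompt": ""}))
print(payload_problems({"prompt": "first-person walk through the kitchen"}))
```

The hyphen pattern deliberately matches only hyphens between word characters, so it reports compound words and leaves other punctuation alone.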