Recovery from Failed Fan-out¶
A fan-out pass can fail in several ways: duplicate node IDs causing schema breaks, STANDARD content drifting across tabs, broken per-node link refs hiding avatar references, a tab's avatar swap missing one prompt. Here's how to diagnose and recover.
The most common failure modes¶
flowchart TD
A[Fan-out done<br/>but something's wrong]
A --> B{Symptom}
B -->|Workflow won't open in PatchWork| C[Schema break — duplicate IDs or<br/>broken edges]
B -->|Some scenes generate wrong avatar| D[Avatar ref not swapped or<br/>per-node link refs stale]
B -->|Different tabs produce slightly<br/>different STANDARD output| E[STANDARD drift]
B -->|Sheet sync mismatches reality| F[Tracker out of sync]
B -->|G-Labs errors only on new tabs| G[Node ID collisions]
Diagnostic flow¶
Step 1: Try to open the workflow in PatchWork¶
If the file won't import, the schema is broken. Check:
- Duplicate node IDs across tabs. After fan-out, len({n.id for tab in tabs for n in tab.nodes}) must equal the total node count
- Broken edges: links referring to non-existent nodes or wrong slot indices
- Stale per-node link refs: output.links / input.link not matching the central links array
The pre-generation sanity check catches all of these. Run it on the fan-out output. If it fails, you have a starting point.
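The first two checks in the list above can be sketched as follows. This is a minimal illustration, not the project's validator: the function name find_schema_problems is hypothetical, and the layout is assumed rather than confirmed by the source (each tab holds graphData.nodes and graphData.links, each node has outputs and inputs lists, and each link is a list of the form [link_id, origin_id, origin_slot, target_id, target_slot, type]).

```python
from collections import Counter


def find_schema_problems(workflow):
    """Return a list of human-readable problems; an empty list means clean.

    ASSUMPTION: links are lists shaped like
    [link_id, origin_id, origin_slot, target_id, target_slot, type].
    """
    problems = []

    # Check 1: node IDs must be unique across ALL tabs, not just within one.
    id_counts = Counter(
        node["id"]
        for tab in workflow["tabs"]
        for node in tab["graphData"]["nodes"]
    )
    for node_id, count in id_counts.items():
        if count > 1:
            problems.append(f"duplicate node id {node_id} appears {count} times")

    # Check 2: every link must point at existing nodes and in-range slots.
    for tab in workflow["tabs"]:
        graph = tab["graphData"]
        nodes = {node["id"]: node for node in graph["nodes"]}
        for link_id, src, src_slot, dst, dst_slot, *_ in graph["links"]:
            for role, node_id, slot, key in (
                ("origin", src, src_slot, "outputs"),
                ("target", dst, dst_slot, "inputs"),
            ):
                node = nodes.get(node_id)
                if node is None:
                    problems.append(
                        f"{tab['name']}: link {link_id} {role} node {node_id} missing"
                    )
                elif not 0 <= slot < len(node.get(key, [])):
                    problems.append(
                        f"{tab['name']}: link {link_id} {role} slot {slot} out of range"
                    )
    return problems
```

Returning a list instead of raising on the first problem makes it easier to see the whole extent of the damage before deciding between repair and restore.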
Step 2: Run the sanity check¶
You: run the pre-gen sanity check on XYZS1-V0-5.nbflow
Claude: [runs the validator]
Found:
- 3 duplicate node IDs between tabs Account B and Account D
- 12 stale per-node link refs in Account E
Recommend: restore from the pre-fanout backup and re-run fan-out
with the renumber step explicit.
Step 3: Restore from backup¶
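Every fan-out step archives the previous version to backups/. The archive name is the original filename plus a .bak suffix carrying the date and the step, as in backups/XYZS1-V0-4.nbflow.bak-2026-05-12-pre-fanout.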
Restoring:
You: restore XYZS1-V0-4 from the pre-fanout backup. The V0-5 fan-out
attempt broke.
Claude:
Restoring XYZS1-V0-5.nbflow from
backups/XYZS1-V0-4.nbflow.bak-2026-05-12-pre-fanout.
Moving the broken V0-5 to backups/XYZS1-V0-5-broken-{timestamp}.bak
for forensics.
Workflow is back at V0-4 — the pre-fanout state.
Ready to retry fan-out? I'll be more careful with renumbering this time.
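If you ever need to do the same restore by hand, the file operations in the exchange above amount to roughly the following. This is a sketch under assumptions: restore_from_backup is a hypothetical helper, not part of the project's scripts, and the timestamp format for the parked file is illustrative.

```python
import shutil
from datetime import datetime
from pathlib import Path


def restore_from_backup(workflow_path, backup_path, backups_dir):
    """Park the broken file in backups_dir, then copy the backup into place.

    Returns the path of the parked broken file, or None if there was none.
    """
    workflow_path = Path(workflow_path)
    backup_path = Path(backup_path)
    backups_dir = Path(backups_dir)

    if not backup_path.is_file():
        raise FileNotFoundError(f"backup not found: {backup_path}")

    backups_dir.mkdir(parents=True, exist_ok=True)
    parked = None
    if workflow_path.exists():
        # Keep the broken file for forensics instead of overwriting it.
        stamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
        parked = backups_dir / f"{workflow_path.stem}-broken-{stamp}.bak"
        shutil.move(str(workflow_path), str(parked))

    # Copy rather than move, so the backup survives for another retry.
    shutil.copy2(backup_path, workflow_path)
    return parked
```

Copying the backup rather than moving it matters here: if the retry also fails, the pre-fanout state is still available.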
Step 4: Retry the fan-out¶
The retry follows the 7-step protocol — same as the first attempt, but with the diagnostic findings in mind.
If the original failure was duplicate IDs, the retry should use the per-tab integer offset scheme (each tab index N adds N * 100000 to every ID) more carefully.
If the original failure was stale link refs, the retry should explicitly call resync_link_refs(tab) and assert_clean(workflow) after every mutation pass (the scripts in manager/scripts/_lib_link_refs.py handle this).
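To make the offset scheme concrete, here is a minimal sketch of renumbering one tab. It is not the project's implementation: renumber_tab is a hypothetical name, the link layout [link_id, origin_id, origin_slot, target_id, target_slot, type] is assumed, and "every ID" is read here as node IDs, link IDs, and every reference to either.

```python
TAB_ID_STRIDE = 100_000  # from the text: tab index N adds N * 100000


def renumber_tab(tab, tab_index, stride=TAB_ID_STRIDE):
    """Shift node IDs, link IDs, and all references to them by one offset.

    ASSUMPTION: source IDs are all below `stride`, so an ID at or above it
    means the tab was already renumbered.
    """
    offset = tab_index * stride
    graph = tab["graphData"]

    # Refuse to touch anything if the tab looks already offset.
    if any(node["id"] >= stride for node in graph["nodes"]):
        raise ValueError("tab already looks renumbered; start from a clean source")

    for node in graph["nodes"]:
        node["id"] += offset
        # Per-node refs name link IDs, so they must move with the links.
        for out in node.get("outputs", []):
            out["links"] = [link_id + offset for link_id in out.get("links") or []]
        for inp in node.get("inputs", []):
            if inp.get("link") is not None:
                inp["link"] += offset

    for link in graph["links"]:
        link[0] += offset  # link ID
        link[1] += offset  # origin node ID
        link[3] += offset  # target node ID
    return tab
```

Shifting the central array and the per-node refs in the same pass keeps the two copies of the wiring in step, which is the failure the second retry note guards against.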
STANDARD drift recovery¶
A subtler failure: the workflow opens fine and generates clean output, but the STANDARD items aren't byte-identical across tabs anymore. You notice when:
- Two tabs have slightly different camera angles for the same scene
- A compliance word ban is enforced on one tab but not another
- Product reference R2 URLs differ between tabs
Diagnosing drift¶
Compare STANDARD items across all tabs:
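import json

# p is the path to the .nbflow file. is_standard() and find_corresponding()
# are helpers you supply; use variant-checklist.md to define is_standard().
with open(p) as f:
    data = json.load(f)

source_tab = data['tabs'][0]  # the first tab is the STANDARD reference
for n in source_tab['graphData']['nodes']:
    if not is_standard(n):
        continue
    source_text = n['properties'].get('text', '')
    for tab in data['tabs'][1:]:
        matching = find_corresponding(tab, n)
        if matching is None:
            print(f"MISSING in {tab['name']}: no counterpart for node {n['id']}")
        elif matching['properties'].get('text', '') != source_text:
            print(f"DRIFT in {tab['name']}: node {matching['id']}")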
If the script reports any drift, fix it:
- Promote the change to STANDARD: edit all tabs to match the new version
- Or move that field to CUSTOMIZED: document why this account needs its own version
Either way, don't leave silent drift.
Why drift happens¶
Common causes:
- Someone did a surgical edit on one tab and forgot to propagate
- A Lvl 1-2 variant applied dialogue rewrites globally but per-account customizations didn't get re-applied
- A fan-out used the wrong source tab as the STANDARD reference
The cure is process discipline — see STANDARD vs CUSTOMIZED.
Per-node link ref recovery¶
If the sanity check reports stale per-node link refs, the workflow's edges live in two places and the two no longer match:
- The central tab.graphData.links array
- Per-node output.links and input.link references
PatchWork's UI renderer reads the per-node refs. The headless runner reads the central array. When they disagree, the runner generates with one version of the wiring and the UI shows a different version. Avatar references can silently disappear this way.
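The idea behind a resync can be sketched as below. This is not the code in manager/scripts/_lib_link_refs.py: rebuild_node_refs is a hypothetical stand-in, it assumes the central links array is the source of truth, and it assumes links are lists shaped like [link_id, origin_id, origin_slot, target_id, target_slot, type] that all point at valid nodes and slots.

```python
def _snapshot(graph):
    """Collect every per-node ref, keyed by (node_id, direction, slot)."""
    snap = {}
    for node in graph["nodes"]:
        for i, out in enumerate(node.get("outputs", [])):
            snap[(node["id"], "out", i)] = sorted(out.get("links") or [])
        for i, inp in enumerate(node.get("inputs", [])):
            snap[(node["id"], "in", i)] = inp.get("link")
    return snap


def rebuild_node_refs(tab):
    """Rewrite per-node refs from the central links array.

    Returns the number of slots whose refs changed (0 means already in sync).
    """
    graph = tab["graphData"]
    nodes = {node["id"]: node for node in graph["nodes"]}
    before = _snapshot(graph)

    # Clear first, so refs to links that no longer exist disappear.
    for node in graph["nodes"]:
        for out in node.get("outputs", []):
            out["links"] = []
        for inp in node.get("inputs", []):
            inp["link"] = None

    # Re-derive every ref from the central array.
    for link_id, src, src_slot, dst, dst_slot, *_ in graph["links"]:
        nodes[src]["outputs"][src_slot]["links"].append(link_id)
        nodes[dst]["inputs"][dst_slot]["link"] = link_id

    after = _snapshot(graph)
    return sum(1 for key in before if before[key] != after[key])
```

A non-zero return value doubles as a staleness report: running it a second time should always return 0.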
The fix¶
You: per-node link refs are stale in XYZS1-V0-5. Sync them.
Claude: [imports manager/scripts/_lib_link_refs.py, calls
resync_link_refs(tab) on every tab, then assert_clean(workflow),
saves the file. Per-node refs are now byte-identical with the
central links array.]
Full detail in PatchWork Import Bugs — stale per-node link refs.
Tracker out of sync¶
If the master Google Sheets tracker shows a different state than reality:
You: tracker says XYZS1 is at V1 approved, but reality is V2-0-3 in
testing. The fan-out updated the file but tracker-sync wasn't run.
Claude: [runs the tracker-sync skill, scans projects/, reconciles
tracker tabs with file system state]
Tracker tabs updated. XYZS1 row now shows V2-0-3 testing.
Tracker drift is annoying but easy to fix — just run tracker-sync.
When recovery isn't enough¶
Sometimes the fan-out is so broken that restoring and retrying is faster than repairing the file in place. Don't be afraid to:
- Restore from the pre-fanout backup
- Confirm the source is clean
- Fan out again, slowly, one tab at a time
- Validate after each tab
Each step is slower, but the whole job needs fewer retries than chasing a bad state.
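The one-tab-at-a-time loop described above might look like this. Both callables are placeholders you would supply (fan_out_one for whatever applies the fan-out to a single tab, validate for whatever sanity check you run); neither name comes from the project.

```python
def fan_out_incrementally(tab_names, fan_out_one, validate):
    """Fan out one tab at a time and stop at the first failed validation.

    fan_out_one(name) applies the fan-out for a single tab.
    validate() returns a list of problems (an empty list means clean).
    Returns (completed_tabs, failed_tab, problems).
    """
    completed = []
    for name in tab_names:
        fan_out_one(name)
        problems = validate()
        if problems:
            # Stop immediately: the last tab touched is the one to inspect.
            return completed, name, problems
        completed.append(name)
    return completed, None, []
```

Stopping at the first failure is the point of the slower approach: the problem is always in the tab you just touched, not somewhere in five tabs changed at once.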
When you're ready¶
→ Next: Syncing the Tracker — keeping the master Google Sheets up to date after fan-out and other meaningful changes.