perf(engine): split Writer into Phase A (plan) + Phase B (beats)

The Writer was the serial long pole: a single LLM call wrote the scene
skeleton AND the full beats[] graph before anything downstream could
start, so variable-length beat generation blew up tail latency.

Split it into two calls:
- Phase A (runWriterPlan): minimal skeleton the image pipeline needs
  (sceneSummary, sceneKey, entryBeatId, cast, entry roster, entry speaker).
  Serial, on the critical path, kept lightweight.
- Phase B (runWriterBeats): full beats[] + storyStatePatch, written to
  honor the plan. Launched immediately, overlaps the ENTIRE image pipeline
  (cards / cinematographer / portraits / painter), awaited last.

Critical path becomes PhaseA + max(imagePipeline, PhaseB), so the long
beat-writing is hidden behind image gen. A Phase B failure degrades to a
single playable beat synthesized from the plan.
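
The scheduling described above can be sketched as follows. This is a minimal illustration, not the engine's code: only the names runWriterPlan and runWriterBeats come from this commit; the signatures, the reduced Plan/Beat shapes, runImagePipeline, fallbackBeats, buildScene, and the sleep-based timings are all assumptions made for the example.

```typescript
// Hypothetical stand-ins: signatures and return shapes are assumptions.
type Plan = { sceneSummary: string; entryBeatId: string };
type Beat = { id: string; text: string };
type Beats = { beats: Beat[] };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function runWriterPlan(): Promise<Plan> {
  await sleep(10); // stands in for the short Phase A LLM call
  return { sceneSummary: "rainy dock at night", entryBeatId: "b1" };
}

async function runWriterBeats(plan: Plan, fail: boolean): Promise<Beats> {
  await sleep(30); // stands in for the longer Phase B LLM call
  if (fail) throw new Error("Phase B failed");
  return {
    beats: [
      { id: plan.entryBeatId, text: "opening line" },
      { id: "b2", text: "follow-up line" },
    ],
  };
}

async function runImagePipeline(plan: Plan): Promise<string> {
  await sleep(40); // cards / cinematographer / portraits / painter
  return `image for: ${plan.sceneSummary}`;
}

// Assumed fallback: one playable beat derived from the plan alone.
function fallbackBeats(plan: Plan): Beats {
  return { beats: [{ id: plan.entryBeatId, text: plan.sceneSummary }] };
}

async function buildScene(failPhaseB = false) {
  const plan = await runWriterPlan(); // Phase A: serial, on the critical path
  // Launch Phase B immediately and attach the fallback right away, so a
  // rejection is never left unhandled while the image pipeline is running.
  const beatsPromise = runWriterBeats(plan, failPhaseB).catch(() => fallbackBeats(plan));
  const image = await runImagePipeline(plan); // overlaps Phase B
  const { beats } = await beatsPromise; // awaited last
  return { plan, image, beats };
}
```

Attaching the catch handler at launch time, rather than wrapping the final await in try/catch, is a design choice of this sketch: it keeps a Phase B failure from surfacing as an unhandled rejection before the image pipeline finishes.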

Paired distinct-payload A/B (6 content-matched stories, baseline vs split):
- median end-to-end 42.6s -> 32.2s (-24%)
- mean 46.4s -> 33.1s (-29%)
- worst case 74.7s -> 37.6s (-50%)
- no content regression: total Writer output tokens 12858 -> 13699 (+7%)
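
The percentages follow directly from the before/after pairs. The short check below recomputes them; pctChange and the reported array are names introduced for this example only, and the numbers are copied from the list above.

```typescript
// Relative change from baseline to split, as a percentage.
function pctChange(before: number, after: number): number {
  return ((after - before) / before) * 100;
}

// [label, baseline, split], values taken from the A/B results above.
const reported: Array<[string, number, number]> = [
  ["median end-to-end (s)", 42.6, 32.2],
  ["mean end-to-end (s)", 46.4, 33.1],
  ["worst case (s)", 74.7, 37.6],
  ["Writer output tokens", 12858, 13699],
];

reported.forEach(([label, before, after]) => {
  console.log(`${label}: ${pctChange(before, after).toFixed(1)}%`);
});
```

To one decimal place this gives -24.4%, -28.7%, -49.7%, and 6.5%, which matches the rounded figures in the list.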

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
yuanzonghao
2026-06-04 11:17:34 +08:00
parent 9f4dcc097b
commit 3bf5c92841
5 changed files with 443 additions and 174 deletions
@@ -92,6 +92,43 @@ export type SceneHistoryEntry = {
  exit?: SceneExit;
};
// ──────────────────────────────────────────────────────────────────────
// Writer two-phase split
//
// The Writer runs as TWO LLM calls so scene-image generation can begin
// before the dialogue is fully written:
// Phase A (WriterPlan) — the minimal skeleton the image pipeline needs:
// sceneSummary + sceneKey + the entry beat's
// on-stage roster + the full cast to design.
// Phase B (beats) — the full beats[] graph + storyStatePatch, written
// to honor the plan, overlapped with image gen.
// The Cinematographer + character design + Painter all run off the Plan, so
// Phase B's (longer) output is hidden behind the image pipeline.
// ──────────────────────────────────────────────────────────────────────
export type WriterPlan = {
  /** Chinese-language scene synopsis (location + time + mood + key event +
   *  opening hook). The sole input the Cinematographer composes the
   *  establishing shot from. */
  sceneSummary: string;
  /** English location+time slug for cross-scene visual continuity. */
  sceneKey?: string;
  /** Beat id the player lands on when entering the scene. Phase B must emit a
   *  beat with this id (reconciled if it doesn't). */
  entryBeatId: string;
  /** Every NPC name that appears anywhere in this scene. Drives character
   *  design (card + portrait + voice) IN PARALLEL with Phase B beat writing, so
   *  the whole cast is provisioned by the time the scene returns. Phase B may
   *  only use names from this list (plus the POV "你"). Never includes the player. */
  cast: string[];
  /** The entry beat's on-stage roster (who's visible + pose when the player
   *  lands). Drives the Cinematographer's framing and the entry-beat portraits
   *  the Painter anchors to. Never includes the POV player. */
  entryActiveCharacters: BeatActiveCharacter[];
  /** The entry beat's speaker: an NPC name, "你" (player speaking), or
   *  undefined for a pure narration/environment entry. Drives shot selection. */
  entrySpeaker?: string;
};
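
The contract in the comments above (Phase B must emit a beat with entryBeatId, and may only use names from cast plus "你") could be enforced with a check along these lines. This is a sketch under stated assumptions, not the reconciliation the engine performs: PlanSketch and BeatSketch are reduced local shapes, the real beat type is not shown in this hunk, and both repair policies (demoting an unknown speaker to narration, renaming the first beat) are invented for illustration.

```typescript
// Reduced local shapes so the sketch is self-contained (assumptions).
type PlanSketch = {
  sceneSummary: string;
  entryBeatId: string;
  cast: string[];
  entrySpeaker?: string | undefined;
};
type BeatSketch = { id: string; speaker?: string | undefined; text: string };

const POV = "你"; // the player, always allowed as a speaker

function reconcileBeats(plan: PlanSketch, beats: BeatSketch[]): BeatSketch[] {
  // Nothing usable from Phase B: synthesize one playable beat from the plan.
  if (beats.length === 0) {
    return [{ id: plan.entryBeatId, speaker: plan.entrySpeaker, text: plan.sceneSummary }];
  }
  const allowed = new Set<string>([...plan.cast, POV]);
  // Assumed policy: a speaker outside the cast is demoted to narration.
  const cleaned = beats.map((beat): BeatSketch =>
    beat.speaker !== undefined && !allowed.has(beat.speaker)
      ? { ...beat, speaker: undefined }
      : beat
  );
  // Assumed policy: if no beat carries entryBeatId, rename the first beat.
  if (!cleaned.some((beat) => beat.id === plan.entryBeatId)) {
    cleaned[0] = { ...cleaned[0], id: plan.entryBeatId };
  }
  return cleaned;
}
```

A real implementation would also have to update any references other beats hold to a renamed id; the sketch leaves that out because the beat graph's link fields are not visible here.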
// ──────────────────────────────────────────────────────────────────────
// Characters & voices (TTS)
// ──────────────────────────────────────────────────────────────────────