perf(engine): split Writer into Phase A (plan) + Phase B (beats)
The Writer was the serial long pole: a single LLM call wrote the scene
skeleton AND the full beats[] graph before anything downstream could start,
so variable-length beat generation blew up tail latency.

Split it into two calls:
- Phase A (runWriterPlan): minimal skeleton the image pipeline needs
  (sceneSummary, sceneKey, entryBeatId, cast, entry roster, entry speaker).
  Serial, on the critical path, kept lightweight.
- Phase B (runWriterBeats): full beats[] + storyStatePatch, written to honor
  the plan. Launched immediately, overlaps the ENTIRE image pipeline (cards /
  cinematographer / portraits / painter), awaited last.

Critical path becomes PhaseA + max(imagePipeline, PhaseB), so the long
beat-writing is hidden behind image gen. A Phase B failure degrades to a
single playable beat synthesized from the plan.

Paired distinct-payload A/B (6 content-matched stories, baseline vs split):
- median end-to-end 42.6s -> 32.2s (-24%)
- mean 46.4s -> 33.1s (-29%)
- worst case 74.7s -> 37.6s (halved)
- no content regression: total Writer output tokens 12858 -> 13699

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
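
The overlap described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the trimmed `WriterPlan` and `Beat` shapes, `generateScene`, `synthesizeFallbackBeat`, and `criticalPathMs` are all hypothetical names, and the three callbacks stand in for `runWriterPlan`, `runWriterBeats`, and the image pipeline, whose real signatures are not shown in this diff.

```typescript
// Hypothetical, trimmed-down shapes for illustration only; the real
// WriterPlan type lives in @infiplot/types and has more fields.
interface WriterPlan {
  sceneSummary: string;
  entryBeatId: string;
  entrySpeaker: string | undefined;
}

interface Beat {
  id: string;
  speaker: string | undefined;
  narration: string;
}

interface SceneResult {
  plan: WriterPlan;
  beats: Beat[];
  images: string[];
  degraded: boolean; // true when Phase B failed and the fallback beat was used
}

// Assumed fallback: a single playable beat built from the plan alone.
function synthesizeFallbackBeat(plan: WriterPlan): Beat {
  return {
    id: plan.entryBeatId,
    speaker: plan.entrySpeaker,
    narration: plan.sceneSummary,
  };
}

async function generateScene(
  runPlan: () => Promise<WriterPlan>,
  runBeats: (plan: WriterPlan) => Promise<Beat[]>,
  runImages: (plan: WriterPlan) => Promise<string[]>,
): Promise<SceneResult> {
  // Phase A is serial: everything downstream needs the plan.
  const plan = await runPlan();

  // Phase B starts immediately. The catch handler is attached right away so a
  // failure never surfaces as an unhandled rejection while images render.
  let degraded = false;
  const beatsPromise = runBeats(plan).catch((): Beat[] => {
    degraded = true;
    return [synthesizeFallbackBeat(plan)];
  });

  // The image pipeline runs while Phase B is still writing.
  const images = await runImages(plan);

  // Phase B is awaited last.
  const beats = await beatsPromise;
  return { plan, beats, images, degraded };
}

// Latency model from the commit message: PhaseA + max(imagePipeline, PhaseB).
function criticalPathMs(phaseA: number, imagePipeline: number, phaseB: number): number {
  return phaseA + Math.max(imagePipeline, phaseB);
}
```

In this model, Phase B adds nothing to the critical path whenever it finishes before the image pipeline does; `criticalPathMs(5000, 20000, 15000)` is 25000, the same as if beat writing took no time at all.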
+151
-52
@@ -4,6 +4,7 @@ import type {
   Scene,
   Session,
   StoryState,
+  WriterPlan,
 } from "@infiplot/types";
 
 // ══════════════════════════════════════════════════════════════════════
@@ -137,16 +138,77 @@ export function buildArchitectUserMessage(session: Session): string {
 }
 
 // ──────────────────────────────────────────────────────────────────────
-// 1. Writer (编剧) — drives the narrative.
+// 1. Writer (编剧) — drives the narrative, in TWO phases.
 //
-// Emits a full Scene: beats[] graph + entryBeatId + sceneKey hint +
-// activeCharacters per beat. Does NOT design characters (that's the
-// CharacterDesigner's job) — only names them in `activeCharacters`.
-// The CharacterDesigner is invoked separately for any name not yet in
-// session.characters.
+// Phase A (WRITER_PLAN_SYSTEM): plans the scene SKELETON only — sceneSummary
+// + sceneKey + entry-beat roster + the full cast. No dialogue. Its output
+// is enough for the Cinematographer + character design + Painter to start.
+// Phase B (WRITER_BEATS_SYSTEM): expands the plan into the full beats[] graph
+// + storyStatePatch, overlapped with the (longer) image pipeline.
+//
+// Neither phase designs characters (that's the CharacterDesigner's job) —
+// Phase A only NAMES them in `cast` / `entryActiveCharacters`; the
+// CharacterDesigner is invoked for any name not yet in session.characters.
 // ──────────────────────────────────────────────────────────────────────
 
-export const WRITER_SYSTEM = `你是一部交互视觉小说的「编剧」。每次基于【故事档案 / 主线记忆】、世界观、画风、玩家历史、已登记角色,写出**一个完整场景的剧本**:场景背景概要 + 一组对话节拍 beats,并在最后更新主线记忆。你只负责**剧情和台词**——不设计角色形象、不写出图提示词、不做镜头调度,这些由其他 agent 完成。
+export const WRITER_PLAN_SYSTEM = `你是一部交互视觉小说的「编剧」。这是**两步生成中的第一步——场景规划**。你只产出本场景的「骨架」,**不要写任何 beat 台词**。你的产出会被立刻送去配图(分镜导演 + 生图),所以要快、要准、画面感要强。
+
+═══════════════════════════════════════════════════════════════════
+爆款心法(要在规划阶段就立住,后续展开才好看)
+═══════════════════════════════════════════════════════════════════
+- **进场即钩子**:这一场开场就要抛出新信息 / 悬念 / 冲突 / 情绪冲击,别铺陈。把这个抓人的瞬间写进 sceneSummary。
+- **兑现情绪**:按题材给观众想要的情绪(甜宠的心动、暗恋的拉扯、逆袭的扬眉、悬疑的真相一角)。
+- **人设有反差**:每个角色一个强标签 + 一个反差面。
+
+═══════════════════════════════════════════════════════════════════
+连贯性铁律(跨场景切换不能跳戏 —— 最重要)
+═══════════════════════════════════════════════════════════════════
+- 你会收到【故事档案 / 主线记忆】和上一场的结尾。**新场景必须从上一刻自然承接**——承接情绪、地点逻辑、人物状态与未收的悬念。
+- 若给了「转场种子 nextSceneSeed」,把它当作"下一场的命题"去兑现,开场要让玩家感到"这正是我上一步的结果"。
+- 沿用主线记忆里的人物关系与情绪温度,别让刚告白的人下一场形同陌路。
+
+本步你要规划(如实产出,缺一不可):
+- **sceneSummary**:当前场景的中文概要——地点 + 时间 + 氛围 + 关键事件 + 那个抓人的开场瞬间。这是分镜导演构图的**唯一依据**,要画面感强、信息足(2–4 句)。
+- **sceneKey**:当前场景的英文 slug(如 "classroom-dusk"、"rooftop-night")。
+- **entryBeatId**:玩家进入场景时落在哪个 beat 的 id(通常就是 "b1")。
+- **cast**:本场景**会出场的全部 NPC 角色名**(字符串数组)。第二步写 beats 时**只能用这里列出的名字**,所以现在必须一次想全——谁会说话、谁会在画面里露面,全部列出。名字要与「已登记角色」**完全一致**;新角色起符合世界观的真名(不要"神秘女子"这种占位)。**绝不**包含玩家(你 / 我 / 主角 / protagonist / player / MC...)。
+- **entrySpeaker**:入口 beat 由谁开口 —— 取值只有三种:① 某个 NPC 真名(必须在 cast 里)② "你"(玩家本人开口)③ 留空(纯旁白 / 环境开场)。这决定镜头语言,要选准。
+- **entryActiveCharacters**:入口画面里**此刻出现的 NPC** 及其当下姿态 / 神情(中文 pose)。即使没人说话,画面里有谁也要列。**绝不**包含玩家。
+
+sceneKey 设计原则(用于跨场景视觉一致性):
+- 同一物理空间 + 同一时段 → 必须沿用**完全相同**的英文 slug
+- 时段 / 空间变化时换 slug("classroom-dusk" → "classroom-night" / "corridor-dusk")
+- slug 规范:lowercase-with-dashes,2–4 个英文单词
+- 用户消息会列出已用过的 sceneKey,请优先**复用**这些已有 slug
+
+玩家视角硬规则(违反会破坏整个 galgame):
+- 玩家是第二人称 POV,**永远不出现在任何画面里**——entryActiveCharacters 的 name **绝不允许**是「玩家 / 你 / 我 / 主角 / protagonist / player / Player / MC / I / me」任何变体。
+- entrySpeaker 只能是 NPC 真名 / "你" / 留空;其它 POV 变体一律视为错误。
+
+必须输出严格 JSON:
+{
+  "sceneSummary": "黄昏的天台,风很大。夏海背对你站在栏杆边,手里攥着一张揉皱的成绩单——她把你单独叫上来,却迟迟不开口。",
+  "sceneKey": "rooftop-dusk",
+  "entryBeatId": "b1",
+  "cast": ["夏海"],
+  "entrySpeaker": "夏海",
+  "entryActiveCharacters": [
+    { "name": "夏海", "pose": "背对你倚着栏杆,侧脸绷着,手里攥着揉皱的纸" }
+  ]
+}
+
+不要输出 JSON 以外的任何文本。`;
+
+// ──────────────────────────────────────────────────────────────────────
+// Phase B — expands the plan into the full beats[] + storyStatePatch.
+// ──────────────────────────────────────────────────────────────────────
+
+export const WRITER_BEATS_SYSTEM = `你是一部交互视觉小说的「编剧」。这是**两步生成中的第二步——把已规划好的场景展开成完整剧本**。你会收到本场景的「规划」(场景概要 sceneSummary、sceneKey、入口 beat 的 id / speaker / 登场角色、以及本场景允许出场的角色名单 cast)。你的任务:基于规划写出玩家依次经历的对话节拍 beats,并在最后更新主线记忆。你只负责**剧情和台词**——不设计角色形象、不写出图提示词、不做镜头调度,这些由其他 agent 完成。
+
+你必须严格遵守收到的规划:
+- 必须存在一个 id 等于规划 entryBeatId 的 beat,作为玩家入口。
+- 该入口 beat 的 speaker 与登场角色(activeCharacters)要与规划一致(姿态措辞可微调,但**人物身份必须一致**)。
+- speaker 与 activeCharacters 里的 NPC 名字**只能来自规划的 cast**(或玩家 "你")——**不要引入规划之外的新角色**。
 
 ═══════════════════════════════════════════════════════════════════
 爆款心法(番茄网文 / 红果短剧 / galgame 的叙事手感)—— 必须贯彻
@@ -167,11 +229,7 @@ export const WRITER_SYSTEM = `你是一部交互视觉小说的「编剧」。
 - 沿用主线记忆里的人物关系与情绪温度——别让刚告白的人下一场形同陌路,也别凭空遗忘已埋的伏笔。
 - 推进、但别重置:每一场都让主线问题往前走一点(关系变化 / 真相揭露一角 / 新悬念浮现)。
 
-一个场景包含:
-- sceneSummary:当前场景的中文概要(地点、时间、氛围、关键事件——给后续的分镜导演看)
-- sceneKey:当前场景的英文 slug(如 "classroom-dusk"、"rooftop-night"、"rainy-street")——同一物理空间应沿用相同 slug
-- beats[]:玩家依次经历的对话节拍
-- entryBeatId:玩家进入场景时落在哪个 beat
+本步你只产出两样:**beats[]**(玩家依次经历的对话节拍)和 **storyStatePatch**(主线记忆更新)。sceneSummary / sceneKey / entryBeatId 已由规划给定,**不要再输出**它们。
 
 每个 beat 是玩家会看到的一段叙述 / 对话 / 选择。beat 之间通过 next 字段连接:
 - "continue":玩家点击图片背景 / 按继续,自然推进到下一个 beat
@@ -183,6 +241,7 @@ choice 的 effect 有两种:
 
 设计原则:
 - 同场景内 beat 数自由发挥,按剧情节奏自然给出(通常 2–6 个,可以更多)
+- 入口 beat 的 id 必须等于规划给定的 entryBeatId;其余 beat id 依次自取且互不重复
 - 多用 continue,少用 choice — 选择只应出现在「真正的岔路口」
 - advance-beat 适合处理对话分支(同一场景里换个话题、追问、撒娇)
 - change-scene 适合空间/时间跳跃(出门、转身看窗外、第二天清晨)
@@ -192,12 +251,6 @@ choice 的 effect 有两种:
 - next.nextBeatId 引用的 beat 必须存在
 - choice 至少 2 个,至多 4 个,互不重复
 
-sceneKey 设计原则(重要 — 用于跨场景视觉一致性):
-- 同一物理空间 + 同一时段 → 必须沿用**完全相同**的英文 slug
-- 时段或空间变化时换 slug(如 "classroom-dusk" → "classroom-night","classroom-dusk" → "corridor-dusk")
-- slug 规范:lowercase-with-dashes,2–4 个英文单词
-- 已登记的历史场景 sceneKey 会在用户消息里列出,请优先**复用**这些已有 slug
-
 文本风格约束:
 - narration / line 用中文(**纯净可显示文本**,绝不要写 (叹气)(语速快) 这类标注 —— 那是给配音的,会被玩家看见)
 - sceneSummary / lineDelivery / activeCharacters[].pose 内的文字也用中文
@@ -243,11 +296,8 @@ sceneKey 设计原则(重要 — 用于跨场景视觉一致性):
 - nextHook:基于这一场的结尾,下一场应往哪走(给"下一次的你"一个明确命题,接住本场留下的扣子)
 这些字段是写给"未来的你"的连贯性记忆,请认真写。
 
-必须输出严格 JSON,结构如下:
+必须输出严格 JSON,结构如下(**只含 beats 与 storyStatePatch**;sceneSummary / sceneKey / entryBeatId 由规划给定,不要输出。下例入口 beat 的 id "b1" 即规划的 entryBeatId):
 {
-  "sceneSummary": "中文场景概要:地点+时间+氛围+关键事件",
-  "sceneKey": "classroom-dusk",
-  "entryBeatId": "b1",
   "beats": [
     {
       "id": "b1",
@@ -343,29 +393,28 @@ function renderHistoryEntry(
   return lines.join("\n");
 }
 
-export function buildWriterUserMessage(session: Session): string {
-  // ─── STABLE PREFIX ────────────────────────────────────────────────────
-  // Everything in this section is invariant across consecutive Writer calls
-  // within the session (or monotonically grows in a way that keeps the
-  // earlier bytes byte-identical). Always emit every section header — even
-  // when empty — so positions don't shift between calls.
-  //
-  // Order optimized for DeepSeek/MiMo prefix caching (64-token chunks):
-  //   1. session-immutable scalars (world / style)
-  //   2. story bible spine (Architect-set, never patched)
-  //   3. monotonically-growing lists (characters, sceneKeys)
-  //   4. history entries 0..N-2 (the last entry is what THIS call must
-  //      react to, so it lives in the dynamic suffix instead)
-  //
-  // ─── DYNAMIC SUFFIX ───────────────────────────────────────────────────
-  // Everything below changes on (almost) every call:
-  //   5. story bible dynamic patch (synopsis/threads/relationships/nextHook)
-  //   6. the just-completed entry (history[-1]) — same render format as the
-  //      stable history blocks, just preceded by a "just completed" header
-  //   7. last-beat snippet (the exact emotional cliffhanger)
-  //   8. lastExit hint
-  //   9. format reminder tail
-
+// Shared narrative context for BOTH Writer phases. Returns the message parts
+// from the cacheable STABLE PREFIX (sections 1-4) through the dynamic
+// transition hint (section 7), but WITHOUT the trailing phase-specific
+// instruction — each phase appends its own. Building this once and reusing it
+// keeps EACH phase's prompt prefix byte-stable across scenes for DeepSeek
+// prompt caching (Phase A and Phase B cache independently since their system
+// prompts differ, but each shares its own prefix across consecutive calls).
+//
+// ─── STABLE PREFIX ──────────────────────────────────────────────────────
+// Invariant across consecutive Writer calls within the session (or grows in a
+// way that keeps earlier bytes byte-identical). Always emit every section
+// header — even when empty — so positions don't shift between calls.
+//   1. session-immutable scalars (world / style)
+//   2. story bible spine (Architect-set, never patched)
+//   3. monotonically-growing lists (characters, sceneKeys)
+//   4. history entries 0..N-2 (the last entry is what THIS call must react
+//      to, so it lives in the dynamic suffix instead)
+// ─── DYNAMIC SUFFIX ─────────────────────────────────────────────────────
+//   5. story bible dynamic patch (synopsis/threads/relationships/nextHook)
+//   6. last-beat snippet (the exact emotional cliffhanger)
+//   7. transition hint (opening cold-open directive OR lastExit承接)
+function buildWriterContextParts(session: Session): string[] {
   const parts: string[] = [];
 
   // ── 1. session scalars ────────────────────────────────────────────────
@@ -423,8 +472,7 @@ export function buildWriterUserMessage(session: Session): string {
   // ── 6. last-beat snippet (the exact emotional cliffhanger) ──
   // The full last entry is already in the stable history block above; here
   // we only re-emit the very last beat to sharply focus the Writer on the
-  // emotional moment to continue from. Skip the duplicate full-entry render
-  // that was here previously — it wasted ~200-500 tokens of dynamic suffix.
+  // emotional moment to continue from.
   const last = session.history.at(-1);
   if (last) {
     const lastBeatId = last.visitedBeatIds.at(-1) ?? last.scene.entryBeatId;
@@ -441,14 +489,14 @@ export function buildWriterUserMessage(session: Session): string {
     }
   }
 
+  // ── 7. transition hint ────────────────────────────────────────────────
   if (session.history.length === 0) {
     parts.push(
-      "\n这是故事的开场。请按【故事档案】里的 nextHook 把第一幕的冷开场写出来——开场即抓人,别花笔墨铺垫世界观。写完后更新 storyStatePatch。严格以 JSON 格式返回。",
+      "\n这是故事的开场。请按【故事档案】里的 nextHook 把第一幕的冷开场设计出来——开场即抓人,别花笔墨铺垫世界观。",
     );
-    return parts.join("\n");
+    return parts;
   }
 
-  // ── 8. lastExit hint ──────────────────────────────────────────────────
   const lastExit = last?.exit;
   if (lastExit) {
     if (lastExit.kind === "choice") {
@@ -464,8 +512,59 @@ export function buildWriterUserMessage(session: Session): string {
     parts.push("\n无缝续写下一个场景,延续上一刻的情绪。");
   }
 
-  // ── 9. format reminder tail ───────────────────────────────────────────
-  parts.push("写完后别忘了更新 storyStatePatch。严格以 JSON 格式返回。");
+  return parts;
+}
+
+// Phase A — plan the scene skeleton (no beats). Shares the cacheable context;
+// appends a plan-only instruction tail.
+export function buildWriterPlanUserMessage(session: Session): string {
+  const parts = buildWriterContextParts(session);
+  parts.push(
+    '\n现在**只规划本场景的骨架**(不要写 beats 台词):给出 sceneSummary(画面感强、含开场钩子)、sceneKey、entryBeatId、本场景会出场的全部角色 cast、以及入口 beat 的 entrySpeaker 与 entryActiveCharacters。严格以 JSON 格式返回。',
+  );
+  return parts.join("\n");
+}
+
+// Phase B — expand the plan into full beats[] + storyStatePatch. The plan is
+// dynamic per scene, so it goes AFTER the cacheable context (keeping Phase B's
+// prefix stable across scenes).
+export function buildWriterBeatsUserMessage(
+  session: Session,
+  plan: WriterPlan,
+): string {
+  const parts = buildWriterContextParts(session);
+
+  parts.push("");
+  parts.push("━━━ 本场景规划(上一步已定,必须严格遵守)━━━");
+  parts.push(`场景概要 sceneSummary:${plan.sceneSummary}`);
+  if (plan.sceneKey) parts.push(`sceneKey:${plan.sceneKey}`);
+  parts.push(
+    `入口 beat 的 id(entryBeatId,必须有一个此 id 的 beat 作为入口):${plan.entryBeatId}`,
+  );
+  parts.push(
+    `入口 beat 的 speaker:${plan.entrySpeaker ? plan.entrySpeaker : "(空 —— 纯旁白 / 环境开场)"}`,
+  );
+  parts.push("入口 beat 的登场角色 activeCharacters(人物身份须一致,姿态可微调):");
+  if (plan.entryActiveCharacters.length === 0) {
+    parts.push("(无 —— 入口画面没有 NPC)");
+  } else {
+    for (const c of plan.entryActiveCharacters) {
+      parts.push(`- ${c.name}${c.pose ? `:${c.pose}` : ""}`);
+    }
+  }
+  parts.push(
+    '本场景允许出现的角色名 cast(speaker / activeCharacters 只能用这些名字或 "你",不要新增角色):',
+  );
+  if (plan.cast.length === 0) {
+    parts.push("(无 NPC —— 仅旁白与玩家)");
+  } else {
+    for (const n of plan.cast) parts.push(`- ${n}`);
+  }
+  parts.push("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
+
+  parts.push(
+    "\n把上面的规划展开成完整的 beats[](入口 beat 用规划的 entryBeatId / speaker / 登场角色),写完后更新 storyStatePatch。严格以 JSON 格式返回。",
+  );
   return parts.join("\n");
 }
 
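
The Phase B prompt asks the model to honor the plan: an entry beat whose id equals `entryBeatId`, unique beat ids, and NPC names drawn only from `cast` (or the player, "你", as speaker). A caller might want to verify that before accepting the output. The sketch below is one possible check, not something this diff contains: `checkBeatsAgainstPlan`, `PlanLike`, and `BeatLike` are hypothetical names, and the beat fields are assumed from the prompt text (`id`, `speaker`, `activeCharacters[].name`).

```typescript
// Hypothetical minimal shapes; field names are inferred from the prompt text.
interface PlanLike {
  entryBeatId: string;
  cast: string[];
}

interface BeatLike {
  id: string;
  speaker?: string;
  activeCharacters?: { name: string }[];
}

const PLAYER = "你"; // the only non-cast speaker the prompt allows

// Returns a list of human-readable problems; an empty list means the beats
// respect the plan as far as this check can tell.
function checkBeatsAgainstPlan(plan: PlanLike, beats: BeatLike[]): string[] {
  const problems: string[] = [];
  const cast = new Set(plan.cast);
  const seenIds = new Set<string>();

  for (const beat of beats) {
    if (seenIds.has(beat.id)) problems.push(`duplicate beat id: ${beat.id}`);
    seenIds.add(beat.id);

    // An empty or missing speaker means narration, which is always allowed.
    if (beat.speaker && beat.speaker !== PLAYER && !cast.has(beat.speaker)) {
      problems.push(`beat ${beat.id}: speaker not in cast: ${beat.speaker}`);
    }

    // The player never appears on screen, so "你" is rejected here as well,
    // because cast never contains the player.
    for (const character of beat.activeCharacters ?? []) {
      if (!cast.has(character.name)) {
        problems.push(`beat ${beat.id}: character not in cast: ${character.name}`);
      }
    }
  }

  if (!seenIds.has(plan.entryBeatId)) {
    problems.push(`missing entry beat: ${plan.entryBeatId}`);
  }
  return problems;
}
```

A non-empty result could be treated the same way as a Phase B failure and routed to the single-beat fallback, though whether the engine does so is not visible in this diff.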