Merge staging into main (#3)

* feat(engine): Architect agent + cross-scene StoryState coherence

Add a dedicated Architect LLM call at session start that expands the terse
world/style prompt into a persistent story bible (logline, genre, second-
person protagonist, cast, engineered opening hook). The bible seeds a
StoryState the Writer reads and patches every scene, carried + merged
across cuts (applyStoryStatePatch) so the story keeps a spine from beat
one instead of jumping between scenes.
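
A minimal sketch of the carry-and-merge step described above. The field names and the merge policy (overwrite scalars, append unseen list entries) are assumptions for illustration. The real StoryState / StoryStatePatch shapes and the body of applyStoryStatePatch are not shown in this commit.

```typescript
// Hypothetical shapes: only the type names come from the commit message.
interface StoryState {
  logline: string;
  genre: string;
  cast: string[];
  openThreads: string[];
}

// Assumption: a patch carries only the fields the Writer wants to change.
type StoryStatePatch = Partial<StoryState>;

// Append entries from `extra` that `base` does not already contain.
function mergeList(base: string[], extra: string[] = []): string[] {
  return base.concat(extra.filter((item) => base.indexOf(item) === -1));
}

// Merge a Writer-emitted patch into the carried state without mutating it.
function applyStoryStatePatch(
  state: StoryState,
  patch: StoryStatePatch | undefined,
): StoryState {
  if (!patch) return state; // no patch this scene: carry the state unchanged
  return {
    logline: patch.logline ?? state.logline,
    genre: patch.genre ?? state.genre,
    cast: mergeList(state.cast, patch.cast),
    openThreads: mergeList(state.openThreads, patch.openThreads),
  };
}

// Usage: seed from the Architect's bible, then patch after a scene.
const seeded: StoryState = {
  logline: "You wake aboard a drifting station.",
  genre: "sci-fi mystery",
  cast: ["You"],
  openThreads: [],
};
const afterScene = applyStoryStatePatch(seeded, {
  cast: ["You", "Mira"],
  openThreads: ["Who cut the power?"],
});
```

Returning a new object rather than mutating in place keeps prefetched scene branches from sharing state, which seems to fit how the play page copies storyState into each carried Session.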

- prompts: inject web-novel / short-drama / galgame craft into Writer +
  Architect; Writer emits storyStatePatch to update the running bible
- director: parallelize voice + non-entry portraits with the Painter
  (only entry-beat portraits block paint) to offset Architect latency
- architect: chat/parse guarded so a malformed response never aborts start
- types: StoryState / StoryStatePatch; required on Start/SceneResponse
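
The "chat/parse guarded" bullet could look roughly like the sketch below. It assumes the Architect returns JSON text and that a failed parse falls back to a bare bible built from the user's prompt. StoryBible, fallbackBible, and parseBible are hypothetical names, and the actual fallback behavior is not stated in this commit.

```typescript
// Hypothetical bible shape, loosely following the fields the commit lists.
interface StoryBible {
  logline: string;
  genre: string;
  cast: string[];
  openingHook: string;
}

// Assumption: on failure, start the session with a minimal bible instead of throwing.
function fallbackBible(prompt: string): StoryBible {
  return { logline: prompt, genre: "unspecified", cast: [], openingHook: "" };
}

// Parse the Architect's raw response; never throw, always return a usable bible.
function parseBible(raw: string, prompt: string): StoryBible {
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return fallbackBible(prompt);
    const d = data as Record<string, unknown>;
    if (typeof d.logline !== "string" || typeof d.genre !== "string") {
      return fallbackBible(prompt);
    }
    // Keep only string cast entries; anything else is dropped silently.
    const cast = Array.isArray(d.cast)
      ? d.cast.filter((c): c is string => typeof c === "string")
      : [];
    const openingHook = typeof d.openingHook === "string" ? d.openingHook : "";
    return { logline: d.logline, genre: d.genre, cast, openingHook };
  } catch {
    // Malformed JSON: fall back rather than abort session start.
    return fallbackBible(prompt);
  }
}

const fromGarbage = parseBible("not json at all", "a rain-soaked noir city");
const fromJson = parseBible(
  '{"logline":"L","genre":"noir","cast":["Ada",3]}',
  "unused prompt",
);
```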

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs: add AGPL-3.0 license, README i18n, and TTS accuracy fix (#2)

* docs: add AGPL-3.0 license, README i18n, and TTS accuracy fix

- LICENSE: add GNU AGPL v3 with InfiPlot copyright notice
- README.md: rewrite for open-source project, fix TTS description
  (TTS uses MiMo's own protocol, not OpenAI-compatible)
- README.zh-CN.md: add Simplified Chinese translation
- README.ja.md: add Japanese translation
- package.json: change license from UNLICENSED to AGPL-3.0-only

* fix: address Copilot review — .env.example TTS comment, zh-CN formatting

- .env.example: clarify TTS uses MiMo's own protocol, not OpenAI-compatible
- README.md: 'land paper after paper' → 'publish paper after paper'
- README.zh-CN.md: add spaces around '5 月', fix code formatting
  for model names (deepseek-v4-flash)

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Author: Zonghao Yuan
Date: 2026-06-02 13:41:37 +08:00
Committed by: GitHub
Parent: 16707cc255
Commit: 588b668d14
16 changed files with 1639 additions and 162 deletions
+3 -1
@@ -3,9 +3,11 @@
 # Recommended setup: Xiaomi MiMo Token Plan for TEXT / VISION / TTS
 # (one API key covers all three) + Runware for IMAGE (FLUX.2 [klein]).
 #
-# TEXT / VISION / TTS use OpenAI-compatible endpoints (any OpenAI-
+# TEXT / VISION use any OpenAI-compatible endpoint (any OpenAI-
 # compatible host works: OpenRouter, OpenAI, Anthropic via proxy,
 # Gemini, DeepSeek, Ollama, ...).
+# TTS uses Xiaomi MiMo's own voice design / clone protocol
+# (not OpenAI-compatible; appends -voicedesign / -voiceclone).
 #
 # IMAGE uses Runware's own task-array protocol (not OpenAI-compatible);
 # the adapter posts an `imageInference` task to IMAGE_BASE_URL.
+3
@@ -187,6 +187,7 @@ function prefetchScenePath(
     const carriedBase: Session = {
       ...baseSession,
       characters: data.characters,
+      storyState: data.storyState,
     };
     prefetchScenePath(pool, carriedBase, [...steps, nextStep], depth + 1);
   }
@@ -539,6 +540,7 @@ function PlayInner() {
         },
       ],
       characters: data.characters,
+      storyState: data.storyState,
     };
     visitedBeatsRef.current = [data.scene.entryBeatId];
     setSession(initial);
@@ -635,6 +637,7 @@ function PlayInner() {
         },
       ],
       characters: result.characters,
+      storyState: result.storyState,
     };
     visitedBeatsRef.current = [result.scene.entryBeatId];
     setSession(newSession);