- New client-side i18n via React Context (useI18n, tArray, I18nProvider)
- Catalog ships 21 locale stubs; only zh-CN/en/ja have reviewed translations
- Header language switcher (globe icon + short label) before settings gear
- All hardcoded Chinese UI text migrated to keys: typewriter, options,
hints (with embedded gear icon via dangerouslySetInnerHTML), settings
panel, footer/about, play page hints
- AI output language follows user-selected locale via trailing one-liner
directive appended to Architect/Writer/CharacterDesigner/InsertBeat
user messages (preserves system-prompt cacheability)
- Per-locale separator rule: zh uses middot between every glyph; en/ja
use plain spaces
- Option value → i18n key suffix maps preserve Chinese as the underlying
identifier so analytics unions and STYLE_MAP keys stay byte-stable
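
A minimal sketch of the trailing-directive idea, for illustration only. The names `LOCALE_DIRECTIVES` and `withLocaleDirective` and the directive wording are assumptions; the actual prompts are not shown in this message. The system prompt is left untouched, so it stays identical across locales and only the last user message varies.

```typescript
// Hypothetical sketch: helper names and directive wording are assumptions.
type Locale = "zh-CN" | "en" | "ja";

const LOCALE_DIRECTIVES: Record<Locale, string> = {
  "zh-CN": "Respond in Simplified Chinese.",
  en: "Respond in English.",
  ja: "Respond in Japanese.",
};

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Returns a new message list with a one-line language directive appended
// to the last user message. System messages are passed through unchanged.
function withLocaleDirective(messages: ChatMessage[], locale: Locale): ChatMessage[] {
  const lastUser = messages.map((m) => m.role).lastIndexOf("user");
  if (lastUser === -1) return messages;
  return messages.map((m, i) =>
    i === lastUser ? { ...m, content: `${m.content}\n${LOCALE_DIRECTIVES[locale]}` } : m,
  );
}
```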
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Address PR-agent review findings:
- resolveVoice fast path: replace ambiguous boolean comparison
(voiceProvider === "stepfun") === serverStepfun with explicit
per-provider equality checks. Prevents an undefined or unknown
provider from matching the non-stepfun (xiaomi) branch by accident.
- /api/beat-audio route: reject requests whose voice.provider is present
but not in the VALID_TTS_PROVIDERS whitelist (e.g. "azure"). Previously
such a request would pass validation when fallback fields were also
present, and resolveVoice might use the invalid voice directly instead
of falling back to re-provisioning, producing a silent beat instead of a
voiced one.
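
The two fixes above could be sketched roughly as follows. Only `VALID_TTS_PROVIDERS` and the provider names come from this message; the whitelist contents, the helper names, and the choice to let an absent provider through are assumptions.

```typescript
// Hypothetical sketch; the whitelist contents are assumed from the two
// providers this message mentions.
const VALID_TTS_PROVIDERS = ["stepfun", "xiaomi"] as const;
type TtsProvider = (typeof VALID_TTS_PROVIDERS)[number];

function isValidProvider(value: unknown): value is TtsProvider {
  return VALID_TTS_PROVIDERS.some((p) => p === value);
}

// The old form, (voiceProvider === "stepfun") === serverStepfun, is true when
// both sides are false, so undefined or "azure" matched a non-stepfun server.
// Explicit per-provider equality removes that accidental match.
function providerMatchesServer(voiceProvider: unknown, serverProvider: TtsProvider): boolean {
  if (serverProvider === "stepfun") return voiceProvider === "stepfun";
  return voiceProvider === "xiaomi";
}

type Validation = { ok: true } | { ok: false; error: string };

// Route-level check: a provider that is present must be whitelisted, even if
// fallback fields are also present. An absent provider is allowed (assumption).
function validateVoiceProvider(voice: { provider?: unknown } | undefined): Validation {
  if (voice === undefined || voice.provider === undefined) return { ok: true };
  if (isValidProvider(voice.provider)) return { ok: true };
  return { ok: false, error: `unsupported TTS provider: ${String(voice.provider)}` };
}
```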
Make homepage cards and live sessions produce sound when the server is
configured for StepFun TTS, instead of silently failing (the prebaked
Xiaomi voice was useless on a StepFun server, and wasted ~220KB/beat in
Fast Origin Transfer).
Three coordinated changes:
1. CharacterDesigner now picks a StepFun preset voice id directly from the
32-entry catalog in the same LLM call that designs the character, so there
is no extra latency and the match is LLM-grade. The Xiaomi prompt path is
byte-identical to its previous version (verified programmatically), so cache
hit rate and voice quality are preserved. pickStepfunVoiceId (keyword scorer)
remains the fallback for orphan speakers / invalid LLM picks.
2. The 32-preset catalog moves to lib/tts-client/stepfun-voices.json as the
single source of truth, shared by the scorer, the CharacterDesigner
prompt, /api/tts-provider, and the offline enrich script.
3. A new GET /api/tts-provider endpoint lets the client probe the server's
TTS provider at /play mount. fetchBeatAudio then shapes its request body:
on a StepFun server it sends the lightweight stepfunVoiceId /
voiceDescription and omits the ~220KB Xiaomi reference audio (FOT saving
~13MB per protagonist per session on prebaked cards). requestBeatAudio
re-provisions on a provider mismatch before synth, so audio never goes
silent on a cross-provider replay or mid-session provider flip.
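
As a rough illustration of the request shaping in point 3, assuming a hypothetical `referenceAudioBase64` field for the Xiaomi reference audio and a hypothetical `shapeBeatAudioBody` helper (the real fetchBeatAudio may be structured differently):

```typescript
// Hypothetical sketch; field names other than stepfunVoiceId,
// voiceDescription, characterName and voice are assumptions.
type TtsProvider = "stepfun" | "xiaomi";

interface CharacterVoice {
  name: string;
  voiceDescription?: string;
  stepfunVoiceId?: string;
  referenceAudioBase64?: string; // stands in for the ~220KB Xiaomi reference audio
}

interface BeatAudioBodySketch {
  text: string;
  characterName?: string;
  voiceDescription?: string;
  stepfunVoiceId?: string;
  voice?: { provider: TtsProvider; referenceAudioBase64: string };
}

// Shapes the request body according to the provider the server reported.
function shapeBeatAudioBody(
  text: string,
  character: CharacterVoice,
  serverProvider: TtsProvider,
): BeatAudioBodySketch {
  const base: BeatAudioBodySketch = {
    text,
    characterName: character.name,
    voiceDescription: character.voiceDescription,
  };
  if (serverProvider === "stepfun") {
    // Lightweight path: send only the preset id and description.
    return { ...base, stepfunVoiceId: character.stepfunVoiceId };
  }
  // Xiaomi path: include the reference audio when the character has one.
  return character.referenceAudioBase64
    ? { ...base, voice: { provider: "xiaomi", referenceAudioBase64: character.referenceAudioBase64 } }
    : base;
}
```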
New type fields are all optional and backward-compatible: Character.stepfunVoiceId
and BeatAudioRequest.voiceDescription/characterName/stepfunVoiceId; the existing
voice field becomes optional. AGENTS.md updated for the new route, type fields, dependency map,
and StepFun voice-selection flow.
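
The voice-selection flow (LLM pick first, keyword scorer as fallback) might look something like this. The catalog entry shape, the two-entry stand-in catalog, and the scoring rule are assumptions; the real catalog has 32 entries and the real pickStepfunVoiceId is not shown here.

```typescript
// Hypothetical sketch; entry shape and scoring rule are assumptions.
interface StepfunVoiceEntry {
  id: string;
  keywords: string[];
}

// Stand-in for lib/tts-client/stepfun-voices.json.
const CATALOG: StepfunVoiceEntry[] = [
  { id: "voice-a", keywords: ["young", "female", "bright"] },
  { id: "voice-b", keywords: ["old", "male", "deep"] },
];

// Naive keyword scorer: counts catalog keywords found in the description.
// Assumes a non-empty catalog; ties go to the earlier entry.
function scoreByKeywords(description: string, catalog: StepfunVoiceEntry[]): string {
  const text = description.toLowerCase();
  let best = catalog[0];
  let bestScore = -1;
  for (const entry of catalog) {
    const score = entry.keywords.filter((k) => text.includes(k)).length;
    if (score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  return best.id;
}

// Accepts the LLM's pick only if it is a real catalog id; otherwise falls
// back to the keyword scorer (orphan speakers pass llmPick = undefined).
function resolveStepfunVoiceId(
  llmPick: string | undefined,
  description: string,
  catalog: StepfunVoiceEntry[] = CATALOG,
): string {
  if (llmPick !== undefined && catalog.some((v) => v.id === llmPick)) return llmPick;
  return scoreByKeywords(description, catalog);
}
```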
- Player name: stored in localStorage, injected into Architect/Writer/InsertBeat
prompts so NPCs address the player by name, displayed in dialogue UI
- Freeform input: compact button at choice nodes expands to text input, LLM
classifier routes to insert-beat (interactive NPC response) or change-scene
- SettingsModal: unified panel combining player name and voice toggle (with
collapsible TTS key section); replaces the old TtsKeyModal
- Insert-beat upgrade: prompt now requires NPC reaction when characters are
present, shared by both freeform and Vision paths
- IME guard: isComposing check on freeform input to prevent CJK mid-composition
submission
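
A minimal sketch of the IME guard, assuming a hypothetical `shouldSubmitFreeform` helper. The event is typed structurally so the sketch runs outside a browser; a DOM KeyboardEvent (or a React event's nativeEvent) satisfies it.

```typescript
// Hypothetical helper; only the isComposing check comes from this message.
interface KeyLikeEvent {
  key: string;
  isComposing?: boolean;
}

// Returns true only for an Enter press that is not part of an IME composition.
function shouldSubmitFreeform(event: KeyLikeEvent, draft: string): boolean {
  if (event.key !== "Enter") return false;
  // During CJK composition, Enter confirms the candidate text, not the message.
  if (event.isComposing) return false;
  return draft.trim().length > 0;
}
```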
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace session.styleGuide with a descriptive placeholder before the
Architect runs, so its prompt reads a natural sentence instead of the
raw "auto" marker. Also wrap selectStyle in a try-catch so a transient
LLM failure falls back to the 吉卜力 (Ghibli) preset instead of crashing
session start.
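
The fallback could be sketched as below. The wrapper name `selectStyleOrFallback` and the injected `selectStyle` signature are assumptions; only the 吉卜力 fallback key comes from this message.

```typescript
// Hypothetical sketch; the real selectStyle signature is not shown here.
const FALLBACK_STYLE = "吉卜力"; // STYLE_MAP key for the Ghibli preset

async function selectStyleOrFallback(
  selectStyle: (worldSetting: string) => Promise<string>,
  worldSetting: string,
): Promise<string> {
  try {
    return await selectStyle(worldSetting);
  } catch (err) {
    // A transient LLM failure should not abort session start.
    console.warn("style selection failed, using fallback preset", err);
    return FALLBACK_STYLE;
  }
}
```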
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the user picks "自动" (Auto), the client sends styleGuide="auto" to the
server. The orchestrator then runs a lightweight style-selector LLM
call in parallel with the Architect — both only depend on worldSetting,
so there is zero added latency. The selector picks the best-matching
preset from STYLE_MAP based on genre, mood, and setting.
Also moves STYLE_MAP from page.tsx to lib/options.ts so it can be
shared between client and server.
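
A rough sketch of the parallel call, with the Architect and the selector injected as stand-in functions (their real signatures are assumptions). Because both take only worldSetting, they can be started together and awaited with Promise.all.

```typescript
// Hypothetical sketch; function names and shapes are assumptions.
async function startSessionSketch<Plan>(
  worldSetting: string,
  styleGuide: string,
  runArchitect: (worldSetting: string) => Promise<Plan>,
  selectStyle: (worldSetting: string) => Promise<string>,
): Promise<{ plan: Plan; styleGuide: string }> {
  // Only call the selector when the client asked for automatic selection.
  const stylePromise =
    styleGuide === "auto" ? selectStyle(worldSetting) : Promise.resolve(styleGuide);
  // Both calls are already in flight here; total wait is the slower of the two.
  const [plan, resolvedStyle] = await Promise.all([runArchitect(worldSetting), stylePromise]);
  return { plan, styleGuide: resolvedStyle };
}
```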
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thread orientation (portrait|landscape) from client through API, engine,
and image gen. Portrait devices render 1024x1792 (roughly 9:16) full-bleed
scenes; desktop/landscape keeps 1792x1024 (roughly 16:9). Adds cover-aware
click→image coordinate mapping, session-locked orientation, a shared
coerceOrientation helper, and a choices overflow cap in portrait.
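
For illustration, the orientation helper and the cover-aware mapping could look roughly like this. Defaulting unknown values to landscape, the helper names other than coerceOrientation, and centered cover positioning (as with CSS object-fit: cover at the default object-position) are all assumptions.

```typescript
// Hypothetical sketch; only coerceOrientation and the pixel sizes come
// from this message.
type Orientation = "portrait" | "landscape";

interface Size {
  width: number;
  height: number;
}

// Anything other than "portrait" is treated as landscape (assumed default).
function coerceOrientation(value: unknown): Orientation {
  return value === "portrait" ? "portrait" : "landscape";
}

function imageSizeFor(orientation: Orientation): Size {
  return orientation === "portrait"
    ? { width: 1024, height: 1792 }
    : { width: 1792, height: 1024 };
}

// Maps a click in container pixels to image pixels for a centered "cover" fit:
// the image is scaled by the larger ratio so it fills the container, and the
// overflow is cropped equally on both sides.
function clickToImageCoords(
  clickX: number,
  clickY: number,
  container: Size,
  image: Size,
): { x: number; y: number } {
  const scale = Math.max(container.width / image.width, container.height / image.height);
  const offsetX = (container.width - image.width * scale) / 2; // <= 0 when cropped
  const offsetY = (container.height - image.height * scale) / 2;
  return { x: (clickX - offsetX) / scale, y: (clickY - offsetY) / scale };
}
```

With a 1000x1000 container and the landscape image, a click at the container center (500, 500) maps to the image center (896, 512), since the horizontal crop is symmetric.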
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Two log lines in startSession: the full worldSetting being fed to the
Architect, and the resulting logline/genreTags/synopsis it produced.
Cheap to keep (they fire once per session), and they make it possible to
tell at a glance whether a "story unrelated to my input" report is a
frontend transport bug, a worldSetting layout problem, or the LLM
ignoring the seed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Flatten the pnpm monorepo (apps/web + packages/*) into a single web package at the repo root.
- Move app/lib/components/scripts/public to root; drop apps/web and packages/* wrappers
- Rewrite tsconfig paths (@infiplot/*) to ./lib/*; turbopack.root = __dirname
- Update Vercel (no root-directory) and Cloudflare (pnpm build:cf at root) deploy paths
- Regenerate pnpm-lock.yaml to drop stale workspace importers
- Bump engines.node to >=22 to match wrangler
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>