When a user has not configured their own model keys in localStorage,
engine calls now automatically route through /api/* server routes
instead of throwing "模型配置未设置" ("model config not set"). This lets Vercel deploys with
server-side environment variables work out of the box.
- Add lib/engineClient.ts as a unified client-side routing layer:
checks localStorage for BYO config, falls back to POST /api/start,
/api/scene, /api/vision, /api/classify-freeform, /api/insert-beat
- Update app/play/page.tsx to use engineClient instead of direct
engine imports; remove buildEngineConfig()
- Update app/page.tsx style-image parsing to also fall back to
/api/parse-style-image when no local model config exists
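The routing decision described above can be sketched roughly as follows. This is a minimal illustration, not the actual lib/engineClient.ts: the `ModelConfig` shape, the `modelConfig` storage key, and the `resolveRoute` name are all assumptions, and the store is passed in so the sketch runs outside a browser.

```typescript
// Hypothetical config shape; the real one is not shown in this commit.
interface ModelConfig { baseUrl: string; apiKey: string; model: string }

type EngineOp = "start" | "scene" | "vision" | "classify-freeform" | "insert-beat";

type Route =
  | { kind: "local"; config: ModelConfig }
  | { kind: "server"; url: string };

// Read-only view of localStorage, injected for testability.
interface KeyValueStore { getItem(key: string): string | null }

const CONFIG_KEY = "modelConfig"; // assumed key name

function resolveRoute(op: EngineOp, store: KeyValueStore): Route {
  const raw = store.getItem(CONFIG_KEY);
  if (raw) {
    try {
      const cfg = JSON.parse(raw) as Partial<ModelConfig>;
      if (cfg.baseUrl && cfg.apiKey && cfg.model) {
        // BYO config is complete: call the engine directly from the client.
        return {
          kind: "local",
          config: { baseUrl: cfg.baseUrl, apiKey: cfg.apiKey, model: cfg.model },
        };
      }
    } catch {
      // Corrupt JSON is treated the same as "no config".
    }
  }
  // No usable local config: fall back to the server route.
  return { kind: "server", url: `/api/${op}` };
}
```

A caller would then POST to `route.url` when `route.kind === "server"`, so a deploy that only has server-side environment variables never reaches the error path.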
Signed-off-by: zhi <zhi@peropero.net>
Walk every speaking beat at export time, reuse the current scene's beatAudioMap,
and synthesize the rest via BYO TTS or /api/beat-audio with a concurrency of 4.
Show a progress toast on the play page while collecting.
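A bounded-concurrency collector of the kind described here could look like the sketch below. The `mapWithConcurrency` helper is hypothetical, not code from this commit; it only illustrates how a limit of 4 keeps at most four synth requests in flight.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the input order regardless of completion order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // shared cursor; safe because JS runs one callback at a time

  async function run(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  // Start `limit` runners; each pulls the next index when it becomes free.
  const runners = Array.from({ length: Math.min(limit, items.length) }, () => run());
  await Promise.all(runners);
  return results;
}
```

Beats already present in beatAudioMap would be filtered out before this call, so only the missing ones are synthesized.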
Gallery export keeps audio in a sidecar localStorage key so the first paint
is not blocked by JSON.parse-ing several MB of base64; the gallery lazy-loads
it after the first scene image, then plays per-beat audio with a mute toggle
persisted to localStorage. .infiplot share files embed audioByBeatId in the
doc itself (v2); on import the data URIs survive scene swaps and feed back
into the per-beat audio map so replayers hear the original voices for free.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the auto-generated kyoani / shinkai style thumbnails with hand-picked
reference frames. Source PNGs were center-cropped to square and re-encoded as
512x512 WEBP (~41KB each) to match the existing thumbnail format. Bumps the
shared cache-buster from v5 to v6 so existing browsers fetch the new files.
The home-page file-import button accepts .infiplot story files. The
tooltip now spells out the file type so users can distinguish it from the
"开始剧情" ("Start story") and "载入预设" ("Load preset") affordances on the same screen.
- Validate voice.provider against a known whitelist (xiaomi|stepfun) in
the beat-audio route to return a clear 400 instead of falling through
- Move single-char pronouns (他/她) to weak-signal fallback in
detectGender to avoid false positives on compounds like 其他
- Update .env.example with StepFun configuration examples
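The strong/weak split in detectGender can be illustrated with the sketch below. The keyword lists are invented for the example and are not the ones in the repository; only the ordering (multi-character signals first, single-character pronouns as a fallback) follows the commit text.

```typescript
type Gender = "male" | "female" | "unknown";

// Hypothetical strong signals: multi-character words that are unambiguous.
const STRONG_MALE = ["男性", "男声", "少年"];
const STRONG_FEMALE = ["女性", "女声", "少女"];

function detectGender(description: string): Gender {
  const hasAny = (words: readonly string[]): boolean =>
    words.some((w) => description.includes(w));

  const male = hasAny(STRONG_MALE);
  const female = hasAny(STRONG_FEMALE);
  if (male !== female) return male ? "male" : "female";
  if (male && female) return "unknown"; // conflicting strong signals

  // Weak fallback: single-character pronouns, consulted only when no
  // strong signal matched, so "其他" cannot override "少女".
  const he = description.includes("他");
  const she = description.includes("她");
  if (he !== she) return he ? "male" : "female";
  return "unknown";
}
```

With this ordering a description such as "少女,其他特征不明" resolves to female even though it contains 他.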
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Hash the lowercased description (matching the case-insensitive scoring)
so the same archetype text picks the same preset regardless of case.
- Thread the character name through provisionVoice -> stepfunProvision as
the hash salt, so two characters that share archetype keywords spread
across the top-N candidate presets instead of collapsing on one voice.
Xiaomi path is unaffected (voicedesign mints a unique clip per call).
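A salted, case-insensitive pick over the top-N candidates might be sketched as below. The hash function is an assumption (an FNV-1a-style hash over UTF-16 code units); the commit does not say which hash stepfunProvision uses, and `pickPreset` is a hypothetical name.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units (assumed, for illustration).
function fnv1a(text: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return h;
}

// Pick one of the top-N candidate presets deterministically.
// The character name salts the hash; the description is lowercased so
// "Calm Female" and "calm female" map to the same preset.
function pickPreset(
  candidates: readonly string[],
  description: string,
  characterName: string,
): string {
  if (candidates.length === 0) throw new Error("no candidate presets");
  const key = `${characterName}\u0000${description.toLowerCase()}`;
  return candidates[fnv1a(key) % candidates.length];
}
```

Because the name is part of the key, two characters with identical archetype text hash independently and are likely, though not guaranteed, to land on different candidates.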
Add StepFun step-tts-mini / step-tts-2 / stepaudio-2.5-tts as an alternate
TTS provider alongside Xiaomi MiMo. Auto-detected from TTS_BASE_URL host
(contains `stepfun.com` → StepFun; otherwise → MiMo), mirroring how the
image client infers Runware from `*.runware.ai`.
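The host-based detection rule stated above is small enough to sketch directly. The function name is hypothetical; the rule itself (host contains `stepfun.com`, otherwise MiMo) is taken from this message.

```typescript
type TtsProvider = "xiaomi" | "stepfun";

// Infer the TTS provider from the configured base URL.
function detectTtsProvider(baseUrl: string): TtsProvider {
  let host: string;
  try {
    host = new URL(baseUrl).hostname.toLowerCase();
  } catch {
    // Not a parseable URL: fall back to a plain substring check.
    host = baseUrl.toLowerCase();
  }
  return host.includes("stepfun.com") ? "stepfun" : "xiaomi";
}
```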
CharacterVoice becomes a discriminated union on `provider`:
- xiaomi: { referenceAudioBase64, mimeType } — unchanged
- stepfun: { voiceId, model, mimeType } — preset voice ID + chosen model
Provision dispatches on the current cfg's base URL; synthesis dispatches
on the voice's own `provider` tag so a session with mixed voices (e.g. a
provider switch mid-development) routes each beat through the correct
protocol. xiaomiSynthesize now guards against being called with a
non-xiaomi voice, surfacing the bug as a clear runtime error instead of a
TypeScript narrowing violation at the access site.
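The union and the two dispatch points can be sketched as follows. The field names come from the list above; the helper names and return values are placeholders, since the real request payloads are not shown here.

```typescript
type CharacterVoice =
  | { provider: "xiaomi"; referenceAudioBase64: string; mimeType: string }
  | { provider: "stepfun"; voiceId: string; model: string; mimeType: string };

// Runtime guard mirroring the one added to xiaomiSynthesize: a mismatched
// voice fails loudly instead of reading an undefined field.
function xiaomiReference(voice: CharacterVoice): string {
  if (voice.provider !== "xiaomi") {
    throw new Error(`xiaomiSynthesize called with a ${voice.provider} voice`);
  }
  return voice.referenceAudioBase64; // narrowed to the xiaomi variant here
}

// Synthesis dispatches on the voice's own tag, not on the current config.
function routeSynthesis(voice: CharacterVoice): string {
  switch (voice.provider) {
    case "xiaomi":
      return `xiaomi:${xiaomiReference(voice).length}`;
    case "stepfun":
      return `stepfun:${voice.voiceId}`;
  }
}
```

Dispatching on the tag is what lets a session hold voices from both providers at once.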
StepFun has no voicedesign equivalent — only preset voices + voice
cloning from a reference audio upload. Cloning would require an extra
asset per character, so v1 maps the LLM's Chinese voiceDescription to one
of the 32 published preset IDs via gender + age + tone keyword scoring,
with a deterministic hash spread across the top-3 candidates so multiple
characters with similar descriptions don't collapse onto the identical
preset. lineDelivery is accepted but not yet propagated to StepFun's
voice_label.emotion / .style fields — left as a follow-up.
beat-audio route validation relaxed from `voice.referenceAudioBase64`
(xiaomi-shaped) to `voice.provider` (shape-agnostic), so stepfun voices
pass the gate; provider-specific shape errors still surface from the
synth function.
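A shape-agnostic gate of this kind might look like the sketch below. The result type and error strings are assumptions; only the whitelist values and the 400 status come from the commit messages.

```typescript
const KNOWN_PROVIDERS = ["xiaomi", "stepfun"] as const;
type Provider = (typeof KNOWN_PROVIDERS)[number];

type Validation =
  | { ok: true; provider: Provider }
  | { ok: false; status: 400; error: string };

// Check only the provider tag; provider-specific fields are validated
// later by the synth function for that provider.
function validateVoice(voice: unknown): Validation {
  if (typeof voice !== "object" || voice === null) {
    return { ok: false, status: 400, error: "voice is required" };
  }
  const provider = (voice as { provider?: unknown }).provider;
  const match = KNOWN_PROVIDERS.find((p) => p === provider);
  if (!match) {
    return { ok: false, status: 400, error: "unknown voice.provider" };
  }
  return { ok: true, provider: match };
}
```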
Observed latency on InfiPlot's dev loop: StepFun step-tts-mini median
~2.3s per beat with 0% timeouts across the test session, vs MiMo's
median ~8s with the long tail tripping the existing 15s synth budget
on roughly 2 of 3 beats. Pricing: step-tts-mini ¥0.9 per 10,000 characters (~¥0.14
per typical 50-beat session) vs MiMo TTS currently free under the
Token Plan creator incentive.
AGENTS.md provider matrix updated to describe both providers and the
discriminated-union dispatch.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(engine): tighten CharacterDesigner prompt to prevent look-alike characters
Expand the visualDescription rules into a 6-element mandatory checklist (hair
quad / eyes triad / face & build / outfit quad / personality-driven vibe /
silhouette tag) and add an explicit anti-collision rule comparing against the
existing cast across cross-color-family and cross-silhouette dimensions.
Also upgrade the user-message "已设定角色" ("established characters") block from a
soft hint to a hard constraint with an explicit pre-write scan step, nudging the
LLM into chain-of-thought differentiation before emitting tags.
All additions land in the session-stable system prefix, so prompt cache
absorbs the extra tokens — per-call billed token delta is ~0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(engine): replace pose examples with aura descriptors in personality vibe
The PERSONALITY-DRIVEN VIBE element listed concrete poses (arms crossed,
chin tilted up, slight slouch) which contradicted the earlier rule
banning transient poses from visualDescription. Switch to pure
atmosphere/aura keywords so the character card stays pose-neutral.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: yuanzonghao <yuanzonghao123@gmail.com>
Adds a ref-based mutex so concurrent /api/story-pack requests and
duplicate file downloads cannot be triggered by rapid clicking.
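A ref-based mutex of this kind can be sketched without React by using an object with the same `{ current }` shape that useRef returns. The `runExclusive` helper is hypothetical and only illustrates the pattern.

```typescript
// Same shape as the object returned by React's useRef.
interface Ref<T> { current: T }

// Run `task` unless another run is already in flight.
// Resolves to true if the task ran, false if the call was skipped.
async function runExclusive(
  busy: Ref<boolean>,
  task: () => Promise<void>,
): Promise<boolean> {
  if (busy.current) return false; // a request is in flight: ignore the click
  busy.current = true; // set synchronously, before the first await
  try {
    await task();
    return true;
  } finally {
    busy.current = false; // release even if the task throws
  }
}
```

A ref is used rather than component state because the flag flips synchronously, so a second click in the same tick already sees it set.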
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Content-Length pre-check to story-pack and story-unpack routes
to reject oversized payloads before buffering the body
- Suppress internal error details in story-unpack catch (was leaking
e.message to the client)
- Strengthen sceneIndex validation: require non-negative integer
- Guard against undefined storyState when replaying shared stories
- Fix prefetch regression: remove currentBeat?.id from useEffect deps
that was re-triggering all change-scene prefetches on every beat
- Fix double detach: use else-if so the second replay detach guard
doesn't fire redundantly after the first has already detached
- Align client file-size limit by format (.json 12MB, .infiplot 13MB)
- Move the "载入剧情" ("Load story") import button next to "开始" ("Start") with a hover tooltip
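Two of the checks in the list above, the Content-Length pre-check and the stricter sceneIndex validation, might be sketched as follows. The function names and the treatment of a missing header are assumptions, not the routes' actual behavior.

```typescript
// Decide from the Content-Length header alone whether to reject a request
// before reading its body. A missing header returns false here (assumed);
// a real route would still need to enforce the limit while reading.
function exceedsLimit(contentLength: string | null, maxBytes: number): boolean {
  if (contentLength === null) return false;
  const n = Number(contentLength);
  return !Number.isFinite(n) || n < 0 || n > maxBytes;
}

// sceneIndex must be a non-negative integer, not merely "a number".
function isValidSceneIndex(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value >= 0;
}
```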
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: simplify Docker deploy — download two files instead of cloning repo
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(docs): use mkdir -p and guard against .env.local overwrite
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Handle downloadImagesAsZip return value and surface errors to user
- Fix inferImageExtension garbage output for data URIs without semicolons
- Scale blob URL revocation delay for large zip files (>5MB → 60s)
- Cap uniqueZipPath dedup loop at 10k iterations with timestamp fallback
- Support relative URLs in inferImageExtension via base URL
- Handle svg+xml MIME subtype correctly
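The extension-inference cases listed above (data URIs without semicolons, relative URLs, svg+xml) can be illustrated with the sketch below. It is not the repository's inferImageExtension: the "png" fallback and the default base URL are assumptions made for the example.

```typescript
function extensionFromMime(mime: string): string {
  if (!mime.startsWith("image/")) return "png"; // assumed fallback
  const subtype = mime.slice("image/".length);
  if (subtype === "svg+xml") return "svg";
  if (subtype === "jpeg") return "jpg";
  return /^[a-z0-9]+$/.test(subtype) ? subtype : "png";
}

function inferImageExtension(src: string, baseUrl = "http://localhost/"): string {
  if (src.startsWith("data:")) {
    // Header is everything before the first comma, e.g. "image/png;base64"
    // or just "image/svg+xml" when there are no parameters.
    const header = src.slice("data:".length).split(",", 1)[0];
    const mime = header.split(";", 1)[0].trim().toLowerCase();
    return extensionFromMime(mime);
  }
  try {
    // Resolving against a base lets relative URLs parse; the query string
    // is excluded because only the pathname is inspected.
    const path = new URL(src, baseUrl).pathname;
    const match = /\.([a-z0-9]+)$/i.exec(path);
    return match ? match[1].toLowerCase() : "png";
  } catch {
    return "png";
  }
}
```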
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborators' hand-written PR titles and descriptions were being
overwritten by the automatic /describe run. Disable auto_describe on the
Claude job and set generate_ai_title = false so human-authored metadata
is preserved. Manual /describe via PR comment still works.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous "*.md" ignore glob hid best_practices.md and AGENTS.md from
the /review diff view (visible in PR #48 where the reviewer hallucinated
"this PR does not add a best_practices.md file"). README-style noise on
docs PRs is preferable to silently dropping changes to the project's
authoritative rule files.
- split per-model banners so two model jobs no longer overwrite each other
- raise reviewer findings cap to 8, broaden /improve to readability/cleanup
- enable dual-publishing for high-score suggestions (inline annotations)
- switch Claude model from opus-4-7 to opus-4-6 (fallback sonnet-4-6)
- raise reasoning_effort to high, set response_language to zh-CN
- drop two dead config keys silently ignored by upstream schema
- add best_practices.md with 6 project-specific invariants for /improve
Restrict PR Agent workflow to trusted collaborators on PR comments only,
fix UTF-8 byte counting in gallery-pack, correct portrait-to-landscape
fallback orientation, track inserted freeform beats in visitedBeatIds,
allow clearing stored TTS key, and guard empty-string fuzzy match in
style selector.
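The UTF-8 byte-counting fix addresses a common mismatch: `String.prototype.length` counts UTF-16 code units, not bytes. A minimal sketch of byte-accurate counting, with a hypothetical helper name, is:

```typescript
// Count the bytes a string occupies once encoded as UTF-8.
function utf8ByteLength(text: string): number {
  return new TextEncoder().encode(text).length;
}
```

For Chinese story text the difference is large: each CJK character is one code unit but three UTF-8 bytes, so a size limit checked against `.length` would admit payloads about three times too big.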
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Merge vision-click toggle into the shared SettingsModal alongside
player name and TTS key configuration. Remove standalone TtsKeyModal.
Add settings gear button to PlayCanvas dialogue card and header.
Fix fullscreen settings modal not rendering in immersive mode.
Voice toggle uses standard CategorySelect dropdown matching other
tab bar options.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add 14 new painting styles sourced from preset story card generation
scripts: Dunhuang fresco, Persian miniature, Byzantine mosaic, stained
glass, vaporwave, vector illustration, low poly, pop art, glitch art,
papercut, steampunk, xianxia fantasy, dark fairytale, and urban fantasy.
Reorder all 36 styles into logical visual categories (anime → cinematic
→ Eastern traditional → Western traditional → genre → digital → handcraft)
for easier browsing. Update "auto" thumbnail to a 3×3 composite grid and
"custom" thumbnail to a paintbrush-on-canvas concept image.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>