Track play_error and play_visibility_lost events via Umami to
distinguish mobile vs desktop failure modes. Each error event
captures orientation, connection type, visibility state, elapsed
time bucket, and error classification — all categorical, no free
text. Includes postJson "HTTP \d+" status parsing for the new
engineClient dual-path architecture.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Introduce user registration/login gated behind optional NEXT_PUBLIC_SUPABASE_*
env vars (leave blank to disable — app behaves exactly as before). Adds
proxy.ts for automatic cookie session refresh, requireUser() API route
guards on all 7 compute-consuming routes, AuthModal (Google/GitHub OAuth +
6-digit email OTP), UserChip header component, and login_success analytics
event. Identity is fully decoupled from Session/engine — no type changes.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Verify imgRef.current === el before firing onImageReady, so a
late-resolving decode from a prior <img> element cannot trigger
the gate prematurely.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keep the "transitioning" overlay visible until the <img> element's
bitmap is fully decoded, so the user never sees progressive paint
or a blank flash between scenes.
- Add onImageReady callback to PlayCanvas (<img onLoad> + decode())
- Delay setPhase("ready") until decode resolves (3s timeout fallback)
- Applied to all 4 scene entry paths: prebaked card, live /api/start,
performSceneTransition, and recorded replay transition
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
IMAGE_TIMEOUT_MS sets a per-attempt hard deadline (AbortSignal.timeout);
IMAGE_HEDGE_MS races a second identical scene-paint request when the
first is still pending past the threshold. Both default to OFF when
unset, preserving historical behavior for self-hosted deploys.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a user has not configured their own model keys in localStorage,
engine calls now automatically route through /api/* server routes
instead of throwing "模型配置未设置". This lets Vercel deploys with
server-side environment variables work out of the box.
- Add lib/engineClient.ts as a unified client-side routing layer:
checks localStorage for BYO config, falls back to POST /api/start,
/api/scene, /api/vision, /api/classify-freeform, /api/insert-beat
- Update app/play/page.tsx to use engineClient instead of direct
engine imports; remove buildEngineConfig()
- Update app/page.tsx style-image parsing to also fall back to
/api/parse-style-image when no local model config exists
Signed-off-by: zhi <zhi@peropero.net>
Walk every speaking beat at export time, reuse current scene's beatAudioMap,
and synth the rest via BYO TTS or /api/beat-audio with concurrency 4. Show a
progress toast on the play page while collecting.
Gallery export keeps audio in a sidecar localStorage key so the first paint
is not blocked by JSON.parse-ing several MB of base64; the gallery lazy-loads
it after the first scene image, then plays per-beat audio with a mute toggle
persisted to localStorage. .infiplot share files embed audioByBeatId in the
doc itself (v2); on import the data URIs survive scene swaps and feed back
into the per-beat audio map so replayers hear the original voices for free.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace the auto-generated kyoani / shinkai style thumbnails with hand-picked
reference frames. Source PNGs were center-cropped to square and re-encoded as
512x512 WEBP (~41KB each) to match the existing thumbnail format. Bumps the
shared cache-buster from v5 to v6 so existing browsers fetch the new files.
The home-page file-import button accepts .infiplot story files. The
tooltip now spells out the file type so users distinguish it from
"开始剧情"/"载入预设" affordances on the same screen.
- Validate voice.provider against known whitelist (xiaomi|stepfun) in
beat-audio route to return a clear 400 instead of falling through
- Move single-char pronouns (他/她) to weak-signal fallback in
detectGender to avoid false positives on compounds like 其他
- Update .env.example with StepFun configuration examples
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Hash the lowercased description (matching the case-insensitive scoring)
so the same archetype text picks the same preset regardless of case.
- Thread the character name through provisionVoice -> stepfunProvision as
the hash salt, so two characters that share archetype keywords spread
across the top-N candidate presets instead of collapsing on one voice.
Xiaomi path is unaffected (voicedesign mints a unique clip per call).
Add StepFun step-tts-mini / step-tts-2 / stepaudio-2.5-tts as an alternate
TTS provider alongside Xiaomi MiMo. Auto-detected from TTS_BASE_URL host
(contains `stepfun.com` → StepFun; otherwise → MiMo), mirroring how the
image client infers Runware from `*.runware.ai`.
CharacterVoice becomes a discriminated union on `provider`:
- xiaomi: { referenceAudioBase64, mimeType } — unchanged
- stepfun: { voiceId, model, mimeType } — preset voice ID + chosen model
Provision dispatches on the current cfg's base URL; synthesis dispatches
on the voice's own `provider` tag so a session with mixed voices (e.g. a
provider switch mid-development) routes each beat through the correct
protocol. xiaomiSynthesize now guards against being called with a non-
xiaomi voice, surfacing the bug as a clear runtime error instead of a
TypeScript narrow violation at the access site.
StepFun has no voicedesign equivalent — only preset voices + voice
cloning from a reference audio upload. Cloning would require an extra
asset per character, so v1 maps the LLM's Chinese voiceDescription to one
of the 32 published preset IDs via gender + age + tone keyword scoring,
with a deterministic hash spread across the top-3 candidates so multiple
characters with similar descriptions don't collapse onto the identical
preset. lineDelivery is accepted but not yet propagated to StepFun's
voice_label.emotion / .style fields — left as a follow-up.
beat-audio route validation relaxed from `voice.referenceAudioBase64`
(xiaomi-shaped) to `voice.provider` (shape-agnostic), so stepfun voices
pass the gate; provider-specific shape errors still surface from the
synth function.
Observed latency on InfiPlot's dev loop: StepFun step-tts-mini median
~2.3s per beat with 0% timeouts across the test session, vs MiMo's
median ~8s with the long tail tripping the existing 15s synth budget
on roughly 2 of 3 beats. Pricing: step-tts-mini ¥0.9/万字符 (~¥0.14
per typical 50-beat session) vs MiMo TTS currently free under the
Token Plan creator incentive.
AGENTS.md provider matrix updated to describe both providers and the
discriminated-union dispatch.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* feat(engine): tighten CharacterDesigner prompt to prevent look-alike characters
Expand the visualDescription rules into a 6-element mandatory checklist (hair
quad / eyes triad / face & build / outfit quad / personality-driven vibe /
silhouette tag) and add an explicit anti-collision rule comparing against the
existing cast across cross-color-family and cross-silhouette dimensions.
Also upgrade the user-message "已设定角色" block from soft hint to hard
constraint with an explicit pre-write scan step, nudging the LLM into chain-
of-thought differentiation before emitting tags.
All additions land in the session-stable system prefix, so prompt cache
absorbs the extra tokens — per-call billed token delta is ~0.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
* fix(engine): replace pose examples with aura descriptors in personality vibe
The PERSONALITY-DRIVEN VIBE element listed concrete poses (arms crossed,
chin tilted up, slight slouch) which contradicted the earlier rule
banning transient poses from visualDescription. Switch to pure
atmosphere/aura keywords so the character card stays pose-neutral.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: yuanzonghao <yuanzonghao123@gmail.com>
Adds a ref-based mutex so concurrent /api/story-pack requests and
duplicate file downloads cannot be triggered by rapid clicking.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Content-Length pre-check to story-pack and story-unpack routes
to reject oversized payloads before buffering the body
- Suppress internal error details in story-unpack catch (was leaking
e.message to the client)
- Strengthen sceneIndex validation: require non-negative integer
- Guard against undefined storyState when replaying shared stories
- Fix prefetch regression: remove currentBeat?.id from useEffect deps
that was re-triggering all change-scene prefetches on every beat
- Fix double detach: use else-if so the second replay detach guard
doesn't fire redundantly after the first already detached
- Align client file-size limit by format (.json 12MB, .infiplot 13MB)
- Move "载入剧情" import button next to "开始" with hover tooltip
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* docs: simplify Docker deploy — download two files instead of cloning repo
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(docs): use mkdir -p and guard against .env.local overwrite
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Handle downloadImagesAsZip return value and surface errors to user
- Fix inferImageExtension garbage output for data URIs without semicolons
- Scale blob URL revocation delay for large zip files (>5MB → 60s)
- Cap uniqueZipPath dedup loop at 10k iterations with timestamp fallback
- Support relative URLs in inferImageExtension via base URL
- Handle svg+xml MIME subtype correctly
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborators' hand-written PR titles and descriptions were being
overwritten by the automatic /describe run. Disable auto_describe on the
Claude job and set generate_ai_title = false so human-authored metadata
is preserved. Manual /describe via PR comment still works.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The previous "*.md" ignore glob hid best_practices.md and AGENTS.md from
the /review diff view (visible in PR #48 where the reviewer hallucinated
"this PR does not add a best_practices.md file"). README-style noise on
docs PRs is preferable to silently dropping changes to the project's
authoritative rule files.
- split per-model banners so two model jobs no longer overwrite each other
- raise reviewer findings cap to 8, broaden /improve to readability/cleanup
- enable dual-publishing for high-score suggestions (inline annotations)
- switch Claude model from opus-4-7 to opus-4-6 (fallback sonnet-4-6)
- raise reasoning_effort to high, response_language to zh-CN
- drop two dead config keys silently ignored by upstream schema
- add best_practices.md with 6 project-specific invariants for /improve