Make homepage cards and live sessions produce sound when the server is
configured for StepFun TTS, instead of silently failing (the prebaked
Xiaomi voice was useless on a StepFun server, and wasted ~220KB/beat in
Fast Origin Transfer).
Three coordinated changes:
1. CharacterDesigner now picks a StepFun preset voice id directly from the
32-entry catalog in the same LLM call that designs the character, so the
pick adds zero extra latency and gets LLM-grade match quality. The Xiaomi
prompt path is byte-identical to the previous version (verified
programmatically), so cache hit rate and voice quality are preserved.
pickStepfunVoiceId (keyword scorer) remains the fallback for orphan
speakers and invalid LLM picks.
2. The 32-preset catalog moves to lib/tts-client/stepfun-voices.json as the
single source of truth, shared by the scorer, the CharacterDesigner
prompt, /api/tts-provider, and the offline enrich script.
3. A new GET /api/tts-provider endpoint lets the client probe the server's
TTS provider at /play mount. fetchBeatAudio then shapes its request body:
on a StepFun server it sends the lightweight stepfunVoiceId /
voiceDescription and omits the ~220KB Xiaomi reference audio (FOT saving
~13MB per protagonist per session on prebaked cards). requestBeatAudio
re-provisions on a provider mismatch before synth, so audio never goes
silent on a cross-provider replay or mid-session provider flip.
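A minimal sketch of the request shaping described in change 3, assuming the provider has already been probed. The field names stepfunVoiceId, voiceDescription, characterName, and voice come from this message; the "xiaomi" provider label, the CharacterVoice shape, and the inner shape of voice are assumptions made for illustration only.

```typescript
// Hypothetical provider labels; only "StepFun" and "Xiaomi" are named in the message.
type TtsProvider = "stepfun" | "xiaomi";

// Assumed character shape for this sketch.
interface CharacterVoice {
  name: string;
  stepfunVoiceId?: string;
  voiceDescription?: string;
  referenceAudioBase64?: string; // the heavy (~220KB) Xiaomi reference audio
}

// Field names follow the message; the shape of `voice` is an assumption.
interface BeatAudioRequest {
  text: string;
  characterName?: string;
  stepfunVoiceId?: string;
  voiceDescription?: string;
  voice?: { referenceAudioBase64: string };
}

function shapeBeatAudioRequest(
  text: string,
  character: CharacterVoice,
  provider: TtsProvider,
): BeatAudioRequest {
  const req: BeatAudioRequest = { text, characterName: character.name };
  if (provider === "stepfun") {
    // Lightweight fields only; the reference audio is never attached.
    if (character.stepfunVoiceId !== undefined) {
      req.stepfunVoiceId = character.stepfunVoiceId;
    }
    if (character.voiceDescription !== undefined) {
      req.voiceDescription = character.voiceDescription;
    }
    return req;
  }
  // Xiaomi path: send the reference audio when the character has one.
  if (character.referenceAudioBase64 !== undefined) {
    req.voice = { referenceAudioBase64: character.referenceAudioBase64 };
  }
  return req;
}
```

Keeping the shaping in a pure function like this makes the payload difference between providers easy to unit-test without a network call.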
All new type fields are optional and backward-compatible:
Character.stepfunVoiceId and
BeatAudioRequest.voiceDescription/characterName/stepfunVoiceId are added, and
BeatAudioRequest.voice becomes optional. AGENTS.md is updated for the new
route, type fields, dependency map, and StepFun voice-selection flow.
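The voice-selection flow (LLM pick first, keyword scorer as fallback) could look roughly like the sketch below. The VoicePreset shape and both function names are hypothetical; the real pickStepfunVoiceId signature and the layout of stepfun-voices.json are not shown in this message.

```typescript
// Assumed catalog entry shape; the real JSON layout may differ.
interface VoicePreset {
  id: string;
  keywords: string[];
}

// Hypothetical stand-in for the keyword scorer: counts keyword hits
// in the description and returns the best-scoring preset id.
function keywordFallback(description: string, catalog: VoicePreset[]): string {
  const text = description.toLowerCase();
  let bestId: string | undefined;
  let bestScore = -1;
  for (const preset of catalog) {
    const score = preset.keywords.filter((k) =>
      text.includes(k.toLowerCase()),
    ).length;
    if (score > bestScore) {
      bestId = preset.id;
      bestScore = score;
    }
  }
  if (bestId === undefined) throw new Error("empty voice catalog");
  return bestId;
}

// Accept the LLM's pick only if it names a real catalog entry;
// otherwise (missing or invalid pick) fall back to the scorer.
function resolveStepfunVoiceId(
  llmPick: string | undefined,
  description: string,
  catalog: VoicePreset[],
): string {
  if (llmPick !== undefined && catalog.some((p) => p.id === llmPick)) {
    return llmPick;
  }
  return keywordFallback(description, catalog);
}
```

Validating the pick against the shared catalog is what makes a single source of truth useful here: the prompt and the validator cannot drift apart.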
Fix the regressions introduced by b63b694 on the server-fallback path:
P0 — fetchBeatAudio non-BYO branch was a bare return; every non-BYO
user got silent playback regardless of server TTS config. Re-connect
to /api/beat-audio with the beatAudioAbortRef signal, count 204/!ok
as silence strikes, create a blob URL on success.
P1 — stripVoicesForTransport + mergeCharactersPreserveVoice were
deleted, so the server-fallback path re-sent ~160KB
referenceAudioBase64 per character on every request AND lost voices
for already-known characters after scene 1. Re-add both, applied
ONLY on the server-fallback branches in engineClient.ts (BYO
client-direct path untouched).
P3 — the aborted-before-store blob URL race had no revoke, leaking
one blob URL per cancelled synth. Re-add the else-if revoke.
P2 — handleSettingsSaved ignored ttsConfigured, so a BYO key entered
mid-session only took effect after a page reload. Re-add the ref/state
refresh + audio re-prefetch. Also restore the silence-nudge UI
(silenceStrikes counter, SILENCE_NUDGE_THRESHOLD, dismissible pill
beside the mute toggle) that surfaces BYO-key guidance when the
shared server key is being rate-limited.
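The P1 pair can be illustrated with the sketch below: one helper drops the heavy voice payload before a request, the other restores known voices on the response. The Character and Voice shapes and both function bodies are assumptions; the real stripVoicesForTransport and mergeCharactersPreserveVoice may differ in signature and behavior.

```typescript
// Assumed shapes for this sketch only.
interface Voice {
  referenceAudioBase64?: string;
}
interface Character {
  id: string;
  name: string;
  voice?: Voice;
}

// Return copies of the characters without their voice payload,
// so the request body does not carry the reference audio.
function stripVoicesSketch(characters: Character[]): Character[] {
  return characters.map((c) => {
    const copy: Character = { ...c };
    delete copy.voice;
    return copy;
  });
}

// For each incoming character that arrives without a voice,
// reuse the voice the client already knows for the same id.
function mergePreserveVoiceSketch(
  known: Character[],
  incoming: Character[],
): Character[] {
  const knownById = new Map(known.map((c) => [c.id, c] as const));
  return incoming.map((c) => {
    const previousVoice = knownById.get(c.id)?.voice;
    return c.voice === undefined && previousVoice !== undefined
      ? { ...c, voice: previousVoice }
      : c;
  });
}
```

Both helpers return new objects rather than mutating their inputs, so the client's stored characters keep their voices while the transport copy stays small.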
Verified live: /api/beat-audio now returns 200 (was 0 calls under
the bug); audio plays after synth completes.
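A hedged sketch of the restored server-fallback fetch (P0) together with the abort-time revoke (P3). The /api/beat-audio route, the 204/!ok silence rule, and the blob URL come from this message; the POST method, the JSON body, the result shape, and the function names are assumptions.

```typescript
// Hypothetical result shape for this sketch.
interface SynthResult {
  url: string | null;
  silenceStrike: boolean;
}

// 204 (no audio) and any non-OK status both count as a silence strike.
function countsAsSilence(res: { status: number; ok: boolean }): boolean {
  return res.status === 204 || !res.ok;
}

async function fetchBeatAudioViaServer(
  body: unknown,
  signal: AbortSignal,
  fetchImpl: typeof fetch = fetch,
): Promise<SynthResult> {
  // Assumed: the route accepts a JSON POST.
  const res = await fetchImpl("/api/beat-audio", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
    signal,
  });
  if (countsAsSilence(res)) {
    return { url: null, silenceStrike: true };
  }
  const url = URL.createObjectURL(await res.blob());
  if (signal.aborted) {
    // Cancelled after the blob URL was created but before it was stored:
    // revoke it so the URL does not leak.
    URL.revokeObjectURL(url);
    return { url: null, silenceStrike: false };
  }
  return { url, silenceStrike: false };
}
```

In the component, the signal would presumably be the one held by beatAudioAbortRef, and a strike would increment the silenceStrikes counter.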
Introduce user registration/login gated behind optional NEXT_PUBLIC_SUPABASE_*
env vars (leave them blank to disable auth; the app then behaves exactly as
before). Add proxy.ts for automatic cookie session refresh, requireUser()
API route guards on all 7 compute-consuming routes, AuthModal (Google/GitHub
OAuth + 6-digit email OTP), a UserChip header component, and a login_success
analytics event. Identity is fully decoupled from Session/engine, so there
are no type changes.
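The env-gated guard could be sketched as below. This is not the real requireUser(): the actual guard reads the session from cookies, which is omitted here, and the 401 status, the result shape, and the "any non-blank NEXT_PUBLIC_SUPABASE_* variable enables auth" rule are all assumptions.

```typescript
type Env = Record<string, string | undefined>;

// Assumed rule: auth is on when any NEXT_PUBLIC_SUPABASE_* variable is non-blank.
function isAuthEnabled(env: Env): boolean {
  return Object.entries(env).some(
    ([key, value]) =>
      key.startsWith("NEXT_PUBLIC_SUPABASE_") &&
      value !== undefined &&
      value.trim() !== "",
  );
}

// Hypothetical guard result; the real guard may return a Response instead.
type GuardResult =
  | { ok: true; userId: string | null }
  | { ok: false; status: 401 };

function requireUserSketch(env: Env, sessionUserId: string | null): GuardResult {
  if (!isAuthEnabled(env)) {
    // Auth disabled: every request passes, as before this change.
    return { ok: true, userId: null };
  }
  if (sessionUserId === null) {
    return { ok: false, status: 401 };
  }
  return { ok: true, userId: sessionUserId };
}
```

A route would call the guard first and return early on a failed result, leaving the engine code unaware of identity.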
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a user has not configured their own model keys in localStorage,
engine calls now automatically route through /api/* server routes
instead of throwing "模型配置未设置" ("model configuration not set"). This
lets Vercel deploys with server-side environment variables work out of the
box.
- Add lib/engineClient.ts as a unified client-side routing layer:
checks localStorage for BYO config, falls back to POST /api/start,
/api/scene, /api/vision, /api/classify-freeform, /api/insert-beat
- Update app/play/page.tsx to use engineClient instead of direct
engine imports; remove buildEngineConfig()
- Update app/page.tsx style-image parsing to also fall back to
/api/parse-style-image when no local model config exists
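The routing decision in lib/engineClient.ts might reduce to something like this sketch. The five /api/* routes are taken from the list above; the storage key, the operation names, and the target shape are hypothetical, and the real module also performs the call itself rather than only choosing a target.

```typescript
// Hypothetical operation names mapped to the routes listed above.
type EngineOp = "start" | "scene" | "vision" | "classifyFreeform" | "insertBeat";

const SERVER_ROUTES: Record<EngineOp, string> = {
  start: "/api/start",
  scene: "/api/scene",
  vision: "/api/vision",
  classifyFreeform: "/api/classify-freeform",
  insertBeat: "/api/insert-beat",
};

// Minimal slice of the Storage interface, so the sketch runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
}

// Assumed key; the real localStorage key is not given in this message.
const MODEL_CONFIG_KEY = "modelConfig";

type EngineTarget =
  | { mode: "byo"; config: unknown }
  | { mode: "server"; url: string };

function resolveEngineTarget(op: EngineOp, store: KeyValueStore): EngineTarget {
  const raw = store.getItem(MODEL_CONFIG_KEY);
  if (raw !== null) {
    try {
      const config: unknown = JSON.parse(raw);
      if (config !== null && typeof config === "object") {
        return { mode: "byo", config };
      }
    } catch {
      // Malformed stored config: fall through to the server route.
    }
  }
  // No usable BYO config: POST to the matching server route instead of throwing.
  return { mode: "server", url: SERVER_ROUTES[op] };
}
```

In the browser, window.localStorage satisfies KeyValueStore directly, so the same function works in the page and in tests.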
Signed-off-by: zhi <zhi@peropero.net>