feat(tts): StepFun voice selection via CharacterDesigner + provider-aware beat-audio
Make homepage cards and live sessions produce sound when the server is configured for StepFun TTS, instead of silently failing (the prebaked Xiaomi voice was useless on a StepFun server, and wasted ~220KB/beat in Fast Origin Transfer). Three coordinated changes: 1. CharacterDesigner now picks a StepFun preset voice id directly from the 32-entry catalog in the SAME LLM call that designs the character — zero extra latency, LLM-grade match quality. The Xiaomi prompt path is byte-identical to history (verified programmatically) so cache hit rate and voice quality are preserved. pickStepfunVoiceId (keyword scorer) remains the fallback for orphan speakers / invalid LLM picks. 2. The 32-preset catalog moves to lib/tts-client/stepfun-voices.json as the single source of truth, shared by the scorer, the CharacterDesigner prompt, /api/tts-provider, and the offline enrich script. 3. A new GET /api/tts-provider endpoint lets the client probe the server's TTS provider at /play mount. fetchBeatAudio then shapes its request body: on a StepFun server it sends the lightweight stepfunVoiceId / voiceDescription and omits the ~220KB Xiaomi reference audio (FOT saving ~13MB per protagonist per session on prebaked cards). requestBeatAudio re-provisions on a provider mismatch before synth, so audio never goes silent on a cross-provider replay or mid-session provider flip. New type fields are all optional and backward-compatible: Character.stepfunVoiceId, BeatAudioRequest.voiceDescription/characterName/stepfunVoiceId, voice made optional. AGENTS.md updated for the new route, type fields, dependency map, and StepFun voice-selection flow.
This commit is contained in:
@@ -305,8 +305,13 @@ export async function directScene(
|
||||
}
|
||||
|
||||
// Kick off voice provisioning for every NEW char (never on the paint path).
|
||||
// On the StepFun path, thread the LLM-selected stepfunVoiceId from the card
|
||||
// into provision — it lets stepfunProvision honor the catalog pick instead
|
||||
// of falling back to the keyword scorer (same network cost: still zero).
|
||||
const voicePromises = cards.map((card) =>
|
||||
provisionCharacterVoice(config, card.voiceDescription, card.name).then(
|
||||
provisionCharacterVoice(config, card.voiceDescription, card.name, {
|
||||
stepfunVoiceId: card.stepfunVoiceId,
|
||||
}).then(
|
||||
(voice): Character => ({
|
||||
name: card.name,
|
||||
voiceDescription: card.voiceDescription,
|
||||
|
||||
Reference in New Issue
Block a user