feat(tts): Xiaomi MiMo per-beat voice + MOCK_IMAGE testing aid (#3)
Adds an optional Xiaomi MiMo TTS layer on top of the scene/beat engine and a MOCK_IMAGE flag for cheap local TTS iteration.

- Per-character voice provisioning via MiMo voice design → clone, with the reference audio persisted in the session
- Per-line free-form delivery direction (the Director writes instructions in the style of "mustering courage yet shy, voice trembling"; these are sent to MiMo's director channel and never read aloud)
- Per-beat audio served with the scene response; the frontend plays it via a hidden <audio> element with the typewriter synced to the audio duration; the mute toggle is persisted via a localStorage lazy initializer
- Graceful degradation: any TTS step failing → silent beat, game continues
- MOCK_IMAGE=true returns a sharp-generated placeholder PNG so local TTS iteration doesn't burn image tokens
- Recommended config in .env.example: MiMo Token Plan covers TEXT/VISION/TTS with one key (mimo-v2.5-pro for text, mimo-v2.5 omni for vision, mimo-v2.5-tts for TTS)

Squashed from #3:
- feat(tts): Xiaomi MiMo per-beat voice-over + per-session character voices + free-text voice direction
- feat(engine): MOCK_IMAGE placeholder image for easier local testing
- fix(tts): address Copilot review on PR #3
- fix(tts): Copilot round-2 review feedback

Known limitation: Session.characters carries the full WAV reference audio (~200-300KB/character base64) and round-trips through every /api/scene, /api/vision, and /api/insert-beat request. This is intrinsic to MiMo's design→clone model (the voice identity IS the audio; there is no server-side voiceId). Fixing it requires server-side storage, which is out of scope; documented for future hardening.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
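To illustrate the "typewriter synced to audio duration" and "silent beat" behavior described above, here is a minimal sketch of how a per-character delay might be derived. The function name `typewriterDelayMs` and the `DEFAULT_MS_PER_CHAR` fallback are hypothetical and do not come from this commit; the actual frontend may compute its timing differently.

```typescript
// Hypothetical helper: names and the fallback value are assumptions, not from the commit.
const DEFAULT_MS_PER_CHAR = 40; // assumed typing speed for beats without audio

/**
 * Returns the delay between typewriter characters, in milliseconds.
 * Pass null (or NaN) for audioDurationSec when the beat is silent,
 * e.g. because a TTS step failed and the game continues without audio.
 */
function typewriterDelayMs(
  text: string,
  audioDurationSec: number | null,
): number {
  // Array.from iterates by Unicode code point, so CJK text and most emoji
  // count as one character each instead of one per UTF-16 unit.
  const charCount = Array.from(text).length;
  if (charCount === 0) return 0;

  // Silent beat or unknown duration: fall back to the fixed default speed.
  if (
    audioDurationSec === null ||
    !Number.isFinite(audioDurationSec) ||
    audioDurationSec <= 0
  ) {
    return DEFAULT_MS_PER_CHAR;
  }

  // Spread the characters evenly across the clip so the text and the
  // audio finish at roughly the same moment.
  return (audioDurationSec * 1000) / charCount;
}
```

In a browser, the duration would typically be read from the `<audio>` element's `duration` property once its `loadedmetadata` event has fired; before that the property is `NaN`, which the sketch treats the same as a silent beat.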
@@ -69,6 +69,9 @@ importers:
       '@yume/ai-client':
         specifier: workspace:*
         version: link:../ai-client
+      '@yume/tts-client':
+        specifier: workspace:*
+        version: link:../tts-client
       '@yume/types':
         specifier: workspace:*
         version: link:../types
@@ -76,6 +79,12 @@ importers:
         specifier: ^0.33.5
         version: 0.33.5
 
+  packages/tts-client:
+    dependencies:
+      '@yume/types':
+        specifier: workspace:*
+        version: link:../types
+
   packages/types: {}
 
 packages: