fix(tts): harden StepFun provider integration

- Validate voice.provider against known whitelist (xiaomi|stepfun) in
  beat-audio route to return a clear 400 instead of falling through
- Move single-char pronouns (他/她) to weak-signal fallback in
  detectGender to avoid false positives on compounds like 其他
- Update .env.example with StepFun configuration examples

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
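
The first two changes in this message can be illustrated with a minimal sketch. The route handler and detectGender are not shown in this commit view, so every name below (`checkProvider`, `detectGenderSketch`, the keyword lists) is hypothetical, and the real scoring logic may differ. The sketch assumes that "weak-signal fallback" means the single-character pronouns are consulted only when no stronger keyword matched.

```typescript
// Hypothetical sketch; not the project's actual code.
const KNOWN_PROVIDERS = ["xiaomi", "stepfun"];

interface CheckResult { ok: boolean; status: number; error?: string }

// Reject anything outside the whitelist with a 400 instead of falling through.
function checkProvider(value: unknown): CheckResult {
  if (typeof value === "string" && KNOWN_PROVIDERS.indexOf(value) !== -1) {
    return { ok: true, status: 200 };
  }
  return { ok: false, status: 400, error: "unknown voice.provider: " + String(value) };
}

type Gender = "male" | "female" | "unknown";

// Assumed multi-character keywords that count as strong signals.
const STRONG_MALE = ["男人", "男性", "先生", "男孩"];
const STRONG_FEMALE = ["女人", "女性", "女士", "女孩"];

function containsAny(text: string, words: string[]): boolean {
  return words.some((w) => text.indexOf(w) !== -1);
}

function detectGenderSketch(text: string): Gender {
  // Strong signals are checked first and win outright.
  if (containsAny(text, STRONG_MALE)) return "male";
  if (containsAny(text, STRONG_FEMALE)) return "female";
  // Weak fallback: single-character pronouns, used only when nothing stronger matched.
  if (text.indexOf("她") !== -1) return "female";
  if (text.indexOf("他") !== -1) return "male";
  return "unknown";
}
```

With this ordering, a description such as 其他女性角色 resolves to "female" through the strong keyword 女性, even though it contains 他 as part of 其他. In this sketch a bare 其他 with no stronger signal would still fall through to the pronoun check.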
2026-06-09 14:24:27 +08:00
parent 04f22249c9
commit 1a6238f8b8
3 changed files with 28 additions and 10 deletions
@@ -66,10 +66,20 @@ VISION_MODEL=mimo-v2.5
 #   google    → VISION_BASE_URL=https://generativelanguage.googleapis.com  VISION_MODEL=gemini-3.5-flash
 # VISION_PROVIDER=openai_compatible

-# ---- 4. TTS · Xiaomi MiMo (optional — leave blank to disable) ------
-# Per-character voice design → clone, with per-line delivery direction.
-# Voice identity = the reference audio kept in the session (no server expiry).
-# The adapter appends -voicedesign / -voiceclone to TTS_SPEECH_MODEL.
+# ---- 4. TTS (optional — leave blank to disable) --------------------
+# Provider is auto-detected from TTS_BASE_URL host:
+#   *stepfun.com  → StepFun (preset voices, keyword-scored selection)
+#   otherwise     → Xiaomi MiMo (voicedesign + voiceclone)
+#
+# Xiaomi MiMo — per-character voice design → clone, with per-line delivery.
+#   TTS_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
+#   TTS_API_KEY=tp-xxx
+#   TTS_SPEECH_MODEL=mimo-v2.5-tts
+#
+# StepFun — 32 preset voices, auto-selected by gender + age + tone scoring.
+#   TTS_BASE_URL=https://api.stepfun.com/v1
+#   TTS_API_KEY=sk-xxx
+#   TTS_SPEECH_MODEL=step-tts-mini          # or step-tts-2 / stepaudio-2.5-tts
 TTS_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
 TTS_API_KEY=tp-xxx
 TTS_SPEECH_MODEL=mimo-v2.5-tts
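
The host-based auto-detection described in the comments above could look roughly like the following. This is a sketch under assumptions: the function name `detectTtsProvider` is hypothetical, and the suffix match simply mirrors the `*stepfun.com` pattern from the comment rather than the adapter's real check. A blank TTS_BASE_URL is treated as "TTS disabled", as the section header states.

```typescript
// Hypothetical sketch of provider auto-detection from TTS_BASE_URL.
type TtsProvider = "stepfun" | "xiaomi";

function detectTtsProvider(baseUrl: string | undefined): TtsProvider | null {
  // Blank value: TTS is disabled.
  if (!baseUrl || baseUrl.trim() === "") return null;
  const host = new URL(baseUrl).hostname.toLowerCase();
  // Mirrors the "*stepfun.com" pattern; every other host falls back to Xiaomi MiMo.
  return /stepfun\.com$/.test(host) ? "stepfun" : "xiaomi";
}
```

Matching on the parsed hostname rather than the raw string keeps a path or query containing "stepfun.com" from flipping the provider.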