feat(web): optional bring-your-own Xiaomi MiMo TTS key (browser-side synthesis)

Public users share one server TTS key, so under concurrent load Xiaomi's
per-key RPM/TPM limits are hit and playback goes silent. This adds an OPTIONAL path: a user
can store their own Xiaomi MiMo key in the browser and synthesize voice
client-side against Xiaomi's CORS-open endpoints. The key lives only in
localStorage and is never sent to or logged by our server; the shared server
key still serves everyone who does not opt in.

- components/TtsKeyModal.tsx: shared key modal (key-family + region picker),
  reused by both the home and play pages
- app/play/page.tsx: silence nudge moved beside the mute toggle; modal opens
  in place instead of redirecting to the home page
- app/page.tsx: home page consumes the shared modal + readStoredTtsConfig
- lib/clientTtsConfig.ts, lib/ttsPresets.ts: browser config + region presets
- app/api/{start,scene,insert-beat}: thread per-request voice; lib/types update
- docs/xiaomi-tts-key.md + README note

Verified with tsc --noEmit (exit 0).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
yuanzonghao
2026-06-04 11:24:16 +08:00
parent 24b674d792
commit b0b2e922d3
13 changed files with 843 additions and 48 deletions
@@ -0,0 +1,77 @@
// Xiaomi MiMo TTS endpoint presets.
//
// Xiaomi issues two independent key types, each with its own base URL:
// - Token Plan (套餐, `tp-` key): per-region endpoints token-plan-{sgp,cn,ams}.
// - Pay-as-you-go (按量, `sk-` key): the single unified endpoint api.xiaomimimo.com.
//
// Used CLIENT-SIDE ONLY: when a user supplies their own key, the browser calls
// one of these endpoints directly (all respond with permissive CORS headers
// that allow the `api-key` request header), so the key never transits our
// server. Every endpoint serves the same `mimo-v2.5-tts` family. Token Plan
// users pick the region matching their subscription (usually also the closest
// hop, so synthesis latency is lower); pay-as-you-go users have no region to
// choose. See docs/xiaomi-tts-key.md.
export type TtsPreset = {
  id: string;
  /** Which key family this endpoint serves; drives the two-step picker UI. */
  kind: "token-plan" | "payg";
  /** Human label shown in the picker (region for Token Plan, type for payg). */
  label: string;
  /** OpenAI-style base; the TTS adapter appends `/chat/completions`. */
  baseUrl: string;
};
/** Base model name; the adapter derives `-voicedesign` / `-voiceclone`. */
export const DEFAULT_TTS_SPEECH_MODEL = "mimo-v2.5-tts";
/**
* In-repo tutorial for getting a free Xiaomi MiMo key + picking a region.
* Points at the default branch so it resolves once this lands on main (which
* is what production serves). Linked from the homepage BYO modal, the play
* page's silence nudge, and the README.
*/
export const TTS_KEY_DOC_URL =
  "https://github.com/zonghaoyuan/infiplot/blob/main/docs/xiaomi-tts-key.md";
export const TTS_PRESETS: TtsPreset[] = [
  {
    id: "sgp",
    kind: "token-plan",
    label: "新加坡 · Singapore",
    baseUrl: "https://token-plan-sgp.xiaomimimo.com/v1",
  },
  {
    id: "cn",
    kind: "token-plan",
    label: "中国大陆 · China",
    baseUrl: "https://token-plan-cn.xiaomimimo.com/v1",
  },
  {
    id: "ams",
    kind: "token-plan",
    label: "欧洲 · Amsterdam",
    baseUrl: "https://token-plan-ams.xiaomimimo.com/v1",
  },
  {
    id: "payg",
    kind: "payg",
    label: "按量付费 · Pay-as-you-go",
    baseUrl: "https://api.xiaomimimo.com/v1",
  },
];
/** Token Plan endpoints only: the region sub-options shown once the user
 * picks the Token Plan ("套餐") key type. */
export const TTS_REGION_PRESETS = TTS_PRESETS.filter(
  (p) => p.kind === "token-plan",
);
/** The single pay-as-you-go preset id (`sk-` keys have no region). */
export const PAYG_PRESET_ID = "payg";
export function findTtsPreset(
  id: string | null | undefined,
): TtsPreset | undefined {
  if (!id) return undefined;
  return TTS_PRESETS.find((p) => p.id === id);
}
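
Reviewer note: a minimal sketch of how a caller might turn a stored key plus a preset into the URL and header the browser would use. It is not part of this commit. `resolveTtsEndpoint` and `ResolvedEndpoint` are hypothetical names, and inferring the key family from the `tp-` / `sk-` prefix is an assumption (the commit's modal uses an explicit picker). The preset table is copied here, with shortened labels, so the block stands alone. The request body is left out because its shape is not shown in this diff.

```typescript
// Hypothetical helper, not part of the commit.
type TtsPreset = {
  id: string;
  kind: "token-plan" | "payg";
  label: string;
  baseUrl: string;
};

// Same ids and base URLs as TTS_PRESETS in lib/ttsPresets.ts; labels shortened.
const PRESETS: TtsPreset[] = [
  { id: "sgp", kind: "token-plan", label: "Singapore", baseUrl: "https://token-plan-sgp.xiaomimimo.com/v1" },
  { id: "cn", kind: "token-plan", label: "China", baseUrl: "https://token-plan-cn.xiaomimimo.com/v1" },
  { id: "ams", kind: "token-plan", label: "Amsterdam", baseUrl: "https://token-plan-ams.xiaomimimo.com/v1" },
  { id: "payg", kind: "payg", label: "Pay-as-you-go", baseUrl: "https://api.xiaomimimo.com/v1" },
];

type ResolvedEndpoint = {
  url: string;
  headers: Record<string, string>;
};

function resolveTtsEndpoint(
  apiKey: string,
  regionId?: string,
): ResolvedEndpoint | undefined {
  const key = apiKey.trim();
  let preset: TtsPreset | undefined;

  if (key.startsWith("sk-")) {
    // Pay-as-you-go keys use the single unified endpoint; any region is ignored.
    preset = PRESETS.find((p) => p.kind === "payg");
  } else if (key.startsWith("tp-")) {
    // Token Plan keys need an explicit region that matches the subscription.
    preset = PRESETS.find(
      (p) => p.kind === "token-plan" && p.id === regionId,
    );
  }

  // Unknown prefix, or a Token Plan key without a valid region.
  if (!preset) return undefined;

  return {
    // The adapter appends `/chat/completions` to the OpenAI-style base.
    url: `${preset.baseUrl}/chat/completions`,
    // The endpoints' CORS policy allows the `api-key` request header.
    headers: { "api-key": key },
  };
}
```

Returning `undefined` for an unknown prefix or a missing region lets the caller fall back to the shared server key rather than sending a request that is likely to be rejected.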