infiplot-web

infiplot/infiplot-web

Fork 0

Commit Graph

Author	SHA1	Message	Date
yuanzonghao	8eda27f241	chore: complete @yume → @infiplot rename (post-PR#9) PR #9 已完成首页和 layout 的视觉品牌迁移，此 commit 补齐剩余的技术性改名 —— workspace 包名、source import、localStorage 键、 CSS keyframe、内部 header logo、.env.example、README。 - @yume/* → @infiplot/* (6 package.json + 17 imports + lockfile) - localStorage/sessionStorage: yume:* → infiplot:* (含 PR #9 新增的 yume:hintClosed) - CSS keyframe yume-ripple → infiplot-ripple - new/play 页面 header logo "云梦" → "InfiPlot" - 代码注释中的「云梦」style 形容词删除（layout.tsx, page.tsx） - 根 package.json name + description（描述跟齐 staging "AI 实时交互剧情游戏"） - README: tagline / Vercel deploy URL / 目录树 / engine 描述保留：prompts.ts 的 LLM 体裁术语「视觉小说/galgame」、CustomForm placeholder 的「视觉小说画风」（图像模型识别的风格名词）。 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-06-02 09:27:00 +08:00
Zonghao Yuan	e261f4a346	feat: Runware FLUX.2 image + lazy per-beat TTS (#5 ) Reduce median scene-load latency from ~30-80s to ~17-25s by switching image generation to Runware FLUX.2 [klein] 9B KV and moving per-beat TTS synthesis off the scene response into a new lazy /api/beat-audio endpoint with hard timeout + abort support. - feat(image): migrate to Runware FLUX.2 [klein] 9B KV — task-array API, $0.001/image, sub-second inference. - feat(tts): split /api/scene into directScene + image + voicedesign-provisioning; lazily synth per beat via /api/beat-audio with 15s hard timeout + AbortSignal threaded to MiMo so timed-out calls don't keep burning sockets/quota; client fans out per-beat fetches on scene-id change with abort + identity-check finally to prevent cross-scene beat-id collisions. - refactor(tts): slim BeatAudioRequest to { beat, voice } — ~800KB per-beat upload dropped to ~160KB by sending only the speaker's voice instead of the full session. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-28 23:43:51 +08:00
Zonghao Yuan	fcd4e6c1ab	feat(tts): Xiaomi MiMo per-beat voice + MOCK_IMAGE testing aid (#3 ) Adds optional Xiaomi MiMo TTS layer on top of the scene/beat engine and a MOCK_IMAGE flag for cheap local TTS iteration. - Per-character voice provisioning via MiMo voice design → clone, reference audio persisted in session - Per-line free-form delivery direction (Director writes "鼓起勇气又害羞，声音发颤" style instructions; sent to MiMo's director channel, never read aloud) - Per-beat audio served with the scene response; frontend plays via hidden <audio> with typewriter synced to audio duration; mute toggle persisted via localStorage lazy initializer - Graceful degradation: any TTS step failing → silent beat, game continues - MOCK_IMAGE=true returns a sharp-generated placeholder PNG so local TTS iteration doesn't burn image tokens - Recommended config in .env.example: MiMo Token Plan covers TEXT/VISION/TTS with one key (mimo-v2.5-pro for text, mimo-v2.5 omni for vision, mimo-v2.5-tts for TTS) Squashed from #3: - feat(tts): 小米 MiMo 逐 beat 配音 + 按 session 角色音色 + 自由文本配音指导 - feat(engine): MOCK_IMAGE 占位图便于本地测试 - fix(tts): address Copilot review on PR #3 - fix(tts): Copilot round-2 review feedback Known limitation: Session.characters carries the full WAV reference audio (~200-300KB/character base64) and round-trips through every /api/scene, /api/vision, /api/insert-beat request. This is intrinsic to MiMo's design→clone model (voice identity IS the audio, no server-side voiceId). Fixing requires server-side storage which is out of scope; documented for future hardening. 🤖 Generated with [Claude Code](https://claude.com/claude-code)	2026-05-28 20:45:21 +08:00

Author

SHA1

Message

Date

yuanzonghao

8eda27f241

chore: complete @yume → @infiplot rename (post-PR#9)

PR #9 已完成首页和 layout 的视觉品牌迁移，此 commit 补齐剩余的
技术性改名 —— workspace 包名、source import、localStorage 键、
CSS keyframe、内部 header logo、.env.example、README。

- @yume/* → @infiplot/* (6 package.json + 17 imports + lockfile)
- localStorage/sessionStorage: yume:* → infiplot:*
  (含 PR #9 新增的 yume:hintClosed)
- CSS keyframe yume-ripple → infiplot-ripple
- new/play 页面 header logo "云梦" → "InfiPlot"
- 代码注释中的「云梦」style 形容词删除（layout.tsx, page.tsx）
- 根 package.json name + description（描述跟齐 staging
  "AI 实时交互剧情游戏"）
- README: tagline / Vercel deploy URL / 目录树 / engine 描述

保留：prompts.ts 的 LLM 体裁术语「视觉小说/galgame」、CustomForm
placeholder 的「视觉小说画风」（图像模型识别的风格名词）。

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-06-02 09:27:00 +08:00

Zonghao Yuan

e261f4a346

feat: Runware FLUX.2 image + lazy per-beat TTS (#5 )

Reduce median scene-load latency from ~30-80s to ~17-25s by switching image generation to Runware FLUX.2 [klein] 9B KV and moving per-beat TTS synthesis off the scene response into a new lazy /api/beat-audio endpoint with hard timeout + abort support.

- feat(image): migrate to Runware FLUX.2 [klein] 9B KV — task-array API, $0.001/image, sub-second inference.
- feat(tts): split /api/scene into directScene + image + voicedesign-provisioning; lazily synth per beat via /api/beat-audio with 15s hard timeout + AbortSignal threaded to MiMo so timed-out calls don't keep burning sockets/quota; client fans out per-beat fetches on scene-id change with abort + identity-check finally to prevent cross-scene beat-id collisions.
- refactor(tts): slim BeatAudioRequest to { beat, voice } — ~800KB per-beat upload dropped to ~160KB by sending only the speaker's voice instead of the full session.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

2026-05-28 23:43:51 +08:00

Zonghao Yuan

fcd4e6c1ab

feat(tts): Xiaomi MiMo per-beat voice + MOCK_IMAGE testing aid (#3 )

Adds optional Xiaomi MiMo TTS layer on top of the scene/beat engine and a MOCK_IMAGE flag for cheap local TTS iteration.

- Per-character voice provisioning via MiMo voice design → clone, reference audio persisted in session
- Per-line free-form delivery direction (Director writes "鼓起勇气又害羞，声音发颤" style instructions; sent to MiMo's director channel, never read aloud)
- Per-beat audio served with the scene response; frontend plays via hidden <audio> with typewriter synced to audio duration; mute toggle persisted via localStorage lazy initializer
- Graceful degradation: any TTS step failing → silent beat, game continues
- MOCK_IMAGE=true returns a sharp-generated placeholder PNG so local TTS iteration doesn't burn image tokens
- Recommended config in .env.example: MiMo Token Plan covers TEXT/VISION/TTS with one key (mimo-v2.5-pro for text, mimo-v2.5 omni for vision, mimo-v2.5-tts for TTS)

Squashed from #3:
- feat(tts): 小米 MiMo 逐 beat 配音 + 按 session 角色音色 + 自由文本配音指导
- feat(engine): MOCK_IMAGE 占位图便于本地测试
- fix(tts): address Copilot review on PR #3
- fix(tts): Copilot round-2 review feedback

Known limitation: Session.characters carries the full WAV reference audio (~200-300KB/character base64) and round-trips through every /api/scene, /api/vision, /api/insert-beat request. This is intrinsic to MiMo's design→clone model (voice identity IS the audio, no server-side voiceId). Fixing requires server-side storage which is out of scope; documented for future hardening.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

2026-05-28 20:45:21 +08:00

3 Commits